Questions and Answers :
LODA Language :
What is the research goal of LODA miner
Message board moderation
Author | Message |
---|---|
Send message Joined: 14 May 22 Posts: 2 Credit: 0 RAC: 0 |
As a researcher who use OEIS whenever meeting an integer sequence, I am quite interested in this project, but there are also some doubt. LODA seems to be a versatile language for computing integer sequences. It seems to encode primitive recursive functions, which is largely sufficient for all computations, but mind that Knuth's up arrow notation is also primitive recursive, so even though LODA programs will eventually halt, it may halt after a long time. Some of the sequences on OEIS are also out of reach for LODA, for instance A079365 of the Chaitin constant, but that is less important. But that's not my point. As a researcher, I am much more interested in the interpretation of these programs. In mathematics, not only do we want a program that computes a sequence, we also want to understand it, ususally by giving a proof. That is the only way that we can be sure that the program really computes the sequence. However, the LODA language seems to be quite difficult to interprete into meaningful mathematical content. Do you have concrete ideas on how to exploit the mined programs? fwjmath. |
Send message Joined: 9 May 22 Posts: 251 Credit: 450,392 RAC: 198 |
The goal is to find new and faster programs / algorithms and conjectures for integer sequences. Sometimes they can be translated into a mathemetical notation using sum and products or recursions. But we don't want to be too restricted. Yes, it can take some effort to understand what the programs are actually doing. But often this also gives some insights into the nature of the sequence. PS: In OEIS, the defining criteria for a sequence are the terms. For us (in LODA), it's different. We identify (infinite) seqences by a program that computes it. |
Send message Joined: 8 May 22 Posts: 5 Credit: 4,969 RAC: 0 |
Disclaimer: I know nothing about proving math and little about AI. Only the OEIS terms data are being used as training data for LODA. The OEIS name and formulas are currently being ignored. If there are few terms (less than 40) registered in OEIS, then the LODA program will output those terms, but may likely diverge after the last term. If there are many terms (more than 10000) registered in OEIS and the LODA program outputs these terms, then the chance of correctness is higher. A program satisfies all the many terms in OEIS and compute things in a novel way. Proving correctness of a such program, that is the challenge. Currently it's being done manually. https://oeis.org/search?q=loda&language=english&go=Search Can this be automated? Approach A. Transformer architecture, that takes LODA programs+terms as input. Given a prompt of 20 terms and see what program is being generated. Is it correct/incorrect and decide on what to do about it. Are there patterns that can be proven. Approach B. Huge AI trained with OEIS terms + OEIS metadata + documents being referenced. Lots of math training data. |
Send message Joined: 14 May 22 Posts: 2 Credit: 0 RAC: 0 |
I think that, at least for me, one of the key is how to know if the way a mined LODA program computes a sequence is new in some sense. It should be different from know formulas, it should be different from the definition, and it should have a proof. All these three parts are difficult and requires human input for the moment. So the best we may have for the moment is that people interested in a given sequence may take time (or not!) to look at such programs. This is not likely to happen for sequences with "nice" tag though, as many of them may already have nice formulas. I think that the best bet may be those with "hard" and/or "more" tags, and preferably without "less" tag and/or with "nice" tag. But not many such sequences have enough term for LODA program mining. Some tests may be needed. fwjmath. |
Send message Joined: 9 May 22 Posts: 251 Credit: 450,392 RAC: 198 |
Sequences with too few terms are in fact an issue. We had the idea to use jOEIS to compute more terms. This was also successfully done by the author of Sequence Machine (https://sequencedb.net/). Yes, you're right: manual validation is a major effort/task. But the good thing is: we have can automated way of determining whether one program is "better" than another. This is mainly decided based on complexity (one program needs less computation steps to compute the same number of terms). In addition: all programs are stored in a repository (including its history). So at any point in time, if someone is interested in a particular sequence or program, it is there for analysis. The manual reviewing process that is done at the OEIS requires major human effort and does not really scale. We take a different approach and try to automated as muss as possible. That being said, we also did manual coding and validation of programs. |
Send message Joined: 19 Jun 22 Posts: 6 Credit: 57,484 RAC: 2 |
... If there are many terms (more than 10000) registered in OEIS and the LODA program outputs these terms, then the chance of correctness is higher... I would expect not... (Unless you have some proof?) All that a coincidence of two sequences that are 10000 integers long shows, is that they are coincident for that part of their sequence. The very next number can be different. Nothing can be said about what that next number is unless you already know what that number is, or there is some consistent pattern that is being followed that continues to hold true for both sequences to remain coincident. ... Just like the games played trying to follow stock markets. The next valuation can have no bearing of what went on before... Now... If there are some stats available for what is actually seen?... Keep searchin', Martin |
Send message Joined: 9 May 22 Posts: 251 Credit: 450,392 RAC: 198 |
I wrote "the chance of correctness is higher", which is true. Of course, in the end it requires a proof. But this is how maths works: come up with a hypothesis, collect enough evidence that supports it, then come up with a proof. In LODA, the hypotheses are the programs. The evidence is the number of correctly terms computed. The proof requires manual inspection of the program. |
©2024 LODA Language