1 view

To be clear I don't mean, provided the last two numbers in the sequence provide the next one:

```(2, 3, -> 5)
```

But rather given any index provide the Fibonacci number:

```(0 -> 1) or (7 -> 21) or (11 -> 144)
```

Adding two numbers is a very simple task for any machine learning structure, and by extension counting by ones, twos or any fixed number is a simple addition rule. Recursive calculations, however...

To my understanding, most learning networks rely on forwards only evaluation, whereas most programming languages have loops, jumps, or circular flow patterns (all of which are usually ASM jumps of some kind), thus allowing recursion.

Sure some networks aren't forwards only; But can processing weights using the hyperbolic tangent or sigmoid function enter any computationally complete state?

i.e. conditional statements, conditional jumps, forced jumps, simple loops, complex loops with multiple conditions, providing sort order, actual reordering of elements, assignments, allocating extra registers, etc?

It would seem that even a non-forwards only network would only find a polynomial of best fit, reducing errors across the expanse of the training set and no further.

Am I missing something obvious, or did most of Machine Learning just look at recursion and pretend like those problems don't exist?

Update

Technically any programming language can be considered the DNA of a genetic algorithm, where the compiler (and possibly console out measurement) would be the fitness function. The issue is that programming (so far) cannot be expressed in a hill-climbing way - literally, the fitness is 0, until the fitness is 1. Things don't half work in programming, and if they do, there is no way of measuring how 'working' a program is for unknown situations. Even an off by one error could appear to be a totally different and chaotic system with no output. This is exactly the reason for learning to code in the first place is so difficult, the learning curve is almost vertical.

Some might argue that you just need to provide stronger foundation rules for the system to exploit - but that just leads to attempting to generalize all programming problems, which circles right back to designing a programming language and loses all notion of some learning machine at all. Following this road brings you to a close variant of LISP with mutate-able code and virtually meaningless fitness functions that brute force the 'nice' and 'simple' looking code-space in attempt to follow human coding best practices.

Others might argue that we simply aren't using enough population or momentum to gain footing on the error surface or make a meaningful step towards a solution. But as your population approaches the number of DNA permutations, you are really just brute-forcing (and very inefficiently at that). Brute forcing code permutations is nothing new, and definitely not machine learning - it's actually quite common in regex golf, I think there's even an xkcd about it...

The real problem isn't finding a solution that works for some specific recursive function but finding a solution space that can encompass the recursive domain in some useful way.

So other than Neural Networks trained using Backpropagation hypothetically finding the closed form of a recursive function (if a closed-form even exists, and they don't in most real cases where recursion is useful), or a non-forwards only network acting like a pseudo-programming language with awful fitness prospects in the best case scenario, plus the virtually impossible task of tuning exit constraints to prevent infinite recursion... That's really it so far for machine learning and recursion?

by (6.8k points)

According to Kolmogorov et al's On the illustration of continuous functions of the many variables by superposition of continuous functions of one variable and addition, a three layer neural network will model arbitrary function with the linear and logistic functions, including f(n) = ((1+sqrt(5))^n - (1-sqrt(5))^n) / (2^n * sqrt(5)), which is the close form solution of Fibonacci sequence.

If you'd prefer to treat the matter as a algorithmic sequence while not a closed-form resolution, I would view it as a special sliding window approach (I called it special because your window size seems fixed as 2). There are additional general studies on the correct window size for your interest. See these two posts:

Time Series Prediction via Neural Networks

Proper way of using recurrent neural network for time series analysis

Also, understanding the Neural Network Basics would pave the way for learning Machine Learning Tutorial and eventually learning Machine Learning Algorithms as well.