
I understand that neural networks with any number of hidden layers can approximate nonlinear functions. However, can they approximate:

f(x) = x^2

I can't think of how they could. It seems like a very obvious limitation of neural networks that could limit what they can do. For example, because of this limitation, neural networks probably can't properly approximate many functions used in statistics, like an exponential moving average, or even variance.

Speaking of moving averages, can recurrent neural networks properly approximate one? I understand how a feedforward neural network or even a single linear neuron can output a moving average using the sliding window technique, but how would a recurrent neural network do it without X hidden layers (X being the moving average size)?

Also, suppose we don't know the original function f, which happens to take the average of the last 500 inputs and then output 1 if that average is higher than 3, and 0 if it's not. For a moment, pretend we don't know that and treat it as a black box.

How would a recurrent neural network approximate that? We would first need to know how many timesteps it should have, which we don't. Perhaps an LSTM network could, but even then, what if it's not a simple moving average but an exponential moving average? I don't think even an LSTM can do it.

Even worse still, what if f(x,x1) that we are trying to learn is simply

f(x,x1) = x * x1

That seems very simple and straightforward. Can a neural network learn it? I don't see how.

Am I missing something huge here or are machine learning algorithms extremely limited? Are there other learning techniques besides neural networks that can actually do any of this?

by (33.1k points)

Neural networks can approximate any continuous function, but only on a compact set.

In other words, the theory states that, given:

A continuous function f(x),

A finite range for the input x, [a,b], and

A desired approximation accuracy ε>0,

then there is a neural network that approximates f(x) with an approximation error less than ε, everywhere within [a,b].
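
Written out for a network with one hidden layer, the statement above reads roughly as follows. The symbols N (number of hidden units), σ (the activation function), and c_i, w_i, b_i (weights and biases) are notation assumed here for illustration; they do not appear in the original statement. Note that N and the weights depend on both ε and [a,b].

```latex
\[
\forall \varepsilon > 0 \;\; \exists\, N,\ \{c_i, w_i, b_i\}_{i=1}^{N} :\quad
\sup_{x \in [a,b]} \left| f(x) - \sum_{i=1}^{N} c_i \,\sigma(w_i x + b_i) \right| < \varepsilon
\]
```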

In your example of f(x) = x^2, you can approximate it with a neural network within any finite range: [-1,1], [0, 1000], etc. To visualize this, imagine that you approximate f(x) within [-1,1] with a step function. You need to make the steps narrow enough so you can achieve any desired accuracy. The way neural networks approximate f(x) is not much different from this.
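
The step-function picture can be checked numerically. The following is a minimal sketch, not a neural network: it builds a piecewise-constant approximation of x^2 on [-1,1] and measures the worst error on a sample grid. The helper names `step_approx`, `max_error`, and `square` are made up for this illustration.

```python
def step_approx(f, a, b, n_steps):
    """Return a piecewise-constant approximation of f on [a, b]."""
    width = (b - a) / n_steps

    def g(x):
        # Index of the step containing x, clamped so x == b lands in the last step.
        i = min(int((x - a) / width), n_steps - 1)
        midpoint = a + (i + 0.5) * width
        return f(midpoint)  # constant value used over the whole step

    return g


def max_error(f, g, a, b, n_samples=10001):
    """Largest |f(x) - g(x)| over an evenly spaced sample grid on [a, b]."""
    xs = [a + (b - a) * k / (n_samples - 1) for k in range(n_samples)]
    return max(abs(f(x) - g(x)) for x in xs)


def square(x):
    return x * x


# Narrower steps give a smaller worst-case error on [-1, 1].
for n in (10, 100, 1000):
    g = step_approx(square, -1.0, 1.0, n)
    print(n, max_error(square, g, -1.0, 1.0))
```

For x^2 on [-1,1] the worst-case error of this construction is bounded by the step width 2/n, so any ε > 0 can be met by choosing n large enough.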

With standard activations such as sigmoid, tanh, or ReLU, there can't be a neural network with a finite number of parameters that approximates f(x) = x^2 to within a fixed error for all x in (-∞, +∞).
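
To see why the range matters, here is a small sketch assuming ReLU activations. It hand-builds (rather than trains) a one-hidden-layer ReLU network that matches x^2 at 21 evenly spaced knots on [-1,1], using a linear term plus one ReLU unit per interior knot. The names `relu`, `make_relu_net`, `knots`, and `net` are hypothetical and chosen for this example only.

```python
def relu(z):
    return max(0.0, z)


def make_relu_net(f, knots):
    """Piecewise-linear interpolant of f at the knots, written as a ReLU network."""
    # Slope of the interpolant on each interval between neighbouring knots.
    slopes = [(f(r) - f(l)) / (r - l) for l, r in zip(knots, knots[1:])]
    # Each hidden unit switches on at an interior knot and adds the change in slope.
    slope_changes = [s1 - s0 for s0, s1 in zip(slopes, slopes[1:])]
    interior = knots[1:-1]

    def net(x):
        out = f(knots[0]) + slopes[0] * (x - knots[0])  # linear term
        for w, k in zip(slope_changes, interior):
            out += w * relu(x - k)
        return out

    return net


def square(x):
    return x * x


knots = [-1.0 + 0.1 * i for i in range(21)]  # 21 knots covering [-1, 1]
net = make_relu_net(square, knots)

# Small error inside the range the network was built for...
inside = max(abs(square(x) - net(x)) for x in [-1.0 + 0.001 * k for k in range(2001)])
print("max error on [-1, 1]:", inside)

# ...but the network is a straight line past the last knot, so the error keeps growing.
for x in (2.0, 10.0, 100.0):
    print(x, abs(square(x) - net(x)))
```

Adding more knots widens the range that is covered, but any finite number of ReLU units still leaves a straight line beyond the outermost knot, while x^2 keeps curving away from it.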