I have a minimal example of a neural network with a back-propagation trainer, and I am testing it on the Iris data set. I started off with 7 hidden nodes and it worked well.

I lowered the number of nodes in the hidden layer to 1 (expecting it to fail), but was surprised to see that the accuracy went up.

I set up the same experiment in Azure ML, just to verify that it wasn't my code. Same result there: 98.3333% accuracy with a single hidden node.

Can anyone explain to me what is happening here?

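Iris is a very predictable data set with only a few features, and those features are highly linearly correlated. These facts point to a less complex, nearly linear function being enough to yield good results. With a single hidden node, the whole network depends on just one linear combination of the inputs, passed through one activation function. So, since you have used a single hidden node, you are nearly using a linear model.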
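To make that concrete, here is a minimal sketch of what a one-hidden-node network computes. The weights are hand-picked for illustration, not trained, and the names (`hidden_activation`, `predict`, `w`, `b`, `v`, `c`) are hypothetical; the sigmoid activation is also an assumption, since the question does not say which one was used. The point is only that every class decision reduces to thresholds on a single linear projection of the input.

```python
import math


def hidden_activation(x, w, b):
    """Single hidden node: a sigmoid of one linear projection of the input."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b, v, c):
    """Return the index of the highest-scoring class.

    Each class k scores v[k] * h + c[k], where h is the one hidden activation,
    so the winning class depends only on the scalar h.
    """
    h = hidden_activation(x, w, b)
    scores = [vk * h + ck for vk, ck in zip(v, c)]
    return scores.index(max(scores))


# Hypothetical, hand-picked weights (assumption: not the result of training).
w = [0.0, 0.0, 1.0, 1.0]  # projection = petal length + petal width
b = -5.0
v = [-4.0, 0.0, 4.0]      # output weights, one per class
c = [2.0, 1.5, -2.0]      # output biases, one per class

# Samples laid out like Iris rows:
# sepal length, sepal width, petal length, petal width.
samples = [
    [5.1, 3.5, 1.4, 0.2],  # small petals
    [7.0, 3.2, 4.7, 1.4],  # medium petals
    [6.3, 3.3, 6.0, 2.5],  # large petals
]

predictions = [predict(x, w, b, v, c) for x in samples]
print(predictions)
```

With these weights the three samples fall into three different classes, even though the network only ever looks at one number (petal length plus petal width, shifted by a bias). A trained network would pick its own projection, but the structure is the same, which is why one hidden node can be enough for Iris.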

