I'm having trouble understanding the backpropagation algorithm. I've read and searched a lot, but I can't figure out why my neural network doesn't work. I want to confirm that I'm doing every part the right way.
Here is my neural network after initialization, with the first row of inputs [1, 1] and the expected output [0] set (as you can see, I'm trying to build an XOR neural network):
I have 3 layers: input, hidden and output. The input layer and the hidden layer each contain 2 neurons, with 2 synapses per neuron. The output layer contains one neuron, also with 2 synapses.
A synapse holds a weight and its previous delta (0 at the beginning). The output connected to a synapse comes from the synapse's source neuron, or from the inputs array when there is no sourceNeuron (as in the input layer).
The class Layer.java contains a list of neurons. In my NeuralNetwork.java, I initialize the network and then loop over my training set. In each iteration, I replace the input and output values and call train on my backpropagation algorithm, which runs a fixed number of times (an epoch of 1000 for now) on the current row.
The activation function I use is sigmoid.
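For completeness, my sigmoid is equivalent to this sketch (the `Sigmoid` and `IActivation` shapes here are illustrative; I'm assuming `derivative` receives the already-activated output, which matches how `getDerivative()` calls it below):

```java
// Sketch of the assumed activation contract: derivative() takes the
// already-activated output o, so for sigmoid it returns o * (1 - o).
interface IActivation {
    double activate(double x);
    double derivative(double output);
}

class Sigmoid implements IActivation {
    public double activate(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    public double derivative(double output) {
        // Valid because output = activate(x): sigma'(x) = sigma(x) * (1 - sigma(x))
        return output * (1.0 - output);
    }
}
```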
The training set AND the validation set are (input1, input2, output):
1,1,0
0,1,1
1,0,1
0,0,0
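In code form, that set is just this (sketch; the class name is illustrative):

```java
// The XOR training/validation set as {input1, input2, expectedOutput} rows.
class XorData {
    static final double[][] TRAINING_SET = {
        {1, 1, 0},
        {0, 1, 1},
        {1, 0, 1},
        {0, 0, 0}
    };
}
```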
Here is my Neuron.java implementation:
```java
public class Neuron {

    private IActivation activation;
    private ArrayList<Synapse> synapses; // Inputs
    private double output; // Output
    private double errorToPropagate;

    public Neuron(IActivation activation) {
        this.activation = activation;
        this.synapses = new ArrayList<Synapse>();
        this.output = 0;
        this.errorToPropagate = 0;
    }

    public void updateOutput(double[] inputs) {
        double sumWeights = this.calculateSumWeights(inputs);

        this.output = this.activation.activate(sumWeights);
    }

    public double calculateSumWeights(double[] inputs) {
        double sumWeights = 0;

        int index = 0;
        for (Synapse synapse : this.getSynapses()) {
            if (inputs != null) {
                sumWeights += synapse.getWeight() * inputs[index];
            } else {
                sumWeights += synapse.getWeight() * synapse.getSourceNeuron().getOutput();
            }

            index++;
        }

        return sumWeights;
    }

    public double getDerivative() {
        return this.activation.derivative(this.output);
    }

    [...]
}
```
The Synapse.java contains:
```java
public Synapse(Neuron sourceNeuron) {
    this.sourceNeuron = sourceNeuron;

    Random r = new Random();
    this.weight = (-0.5) + (0.5 - (-0.5)) * r.nextDouble();
    this.delta = 0;
}

[... getter and setter ...]
```
The train method in my class BackpropagationStrategy.java runs a while loop that stops after 1000 iterations (the epoch count) on one row of the training set. The loop body looks like this:

```java
this.forwardPropagation(neuralNetwork, inputs);
this.backwardPropagation(neuralNetwork, expectedOutput);
this.updateWeights(neuralNetwork);
```
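In other words, the control flow nests like this sketch (class and method names are illustrative; the point is that the inner 1000-iteration loop runs to completion on one row before the next row is ever seen):

```java
// Sketch of the training control flow described above: for each training
// row, run forward/backward/update EPOCHS times, then move to the next row.
class TrainingLoopDemo {
    static final int EPOCHS = 1000;

    // Counts forward/backward/update cycles for a data set of the given
    // size, mirroring the nesting of my training loop.
    static int totalCycles(int trainingRows) {
        int cycles = 0;
        for (int row = 0; row < trainingRows; row++) {     // outer: training set rows
            for (int epoch = 0; epoch < EPOCHS; epoch++) { // inner: 1000 passes per row
                // forwardPropagation(...); backwardPropagation(...); updateWeights(...);
                cycles++;
            }
        }
        return cycles;
    }
}
```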
Here are the implementations of the methods above (learningRate = 0.45 and momentum = 0.9):

```java
public void forwardPropagation(NeuralNetwork neuralNetwork, double[] inputs) {
    for (Layer layer : neuralNetwork.getLayers()) {
        for (Neuron neuron : layer.getNeurons()) {
            if (layer.isInput()) {
                neuron.updateOutput(inputs);
            } else {
                neuron.updateOutput(null);
            }
        }
    }
}

public void backwardPropagation(NeuralNetwork neuralNetwork, double realOutput) {
    Layer lastLayer = null;

    // Loop through the hidden layers and the output layer only
    ArrayList<Layer> layers = neuralNetwork.getLayers();
    for (int i = layers.size() - 1; i > 0; i--) {
        Layer layer = layers.get(i);

        for (Neuron neuron : layer.getNeurons()) {
            double errorToPropagate = neuron.getDerivative();

            // Output layer
            if (layer.isOutput()) {
                errorToPropagate *= (realOutput - neuron.getOutput());
            }
            // Hidden layers
            else {
                double sumFromLastLayer = 0;

                for (Neuron lastLayerNeuron : lastLayer.getNeurons()) {
                    for (Synapse synapse : lastLayerNeuron.getSynapses()) {
                        if (synapse.getSourceNeuron() == neuron) {
                            sumFromLastLayer += (synapse.getWeight() * lastLayerNeuron.getErrorToPropagate());
                            break;
                        }
                    }
                }

                errorToPropagate *= sumFromLastLayer;
            }

            neuron.setErrorToPropagate(errorToPropagate);
        }

        lastLayer = layer;
    }
}

public void updateWeights(NeuralNetwork neuralNetwork) {
    for (int i = neuralNetwork.getLayers().size() - 1; i > 0; i--) {
        Layer layer = neuralNetwork.getLayers().get(i);

        for (Neuron neuron : layer.getNeurons()) {
            for (Synapse synapse : neuron.getSynapses()) {
                double delta = this.learningRate * neuron.getErrorToPropagate() * synapse.getSourceNeuron().getOutput();

                synapse.setWeight(synapse.getWeight() + delta + this.momentum * synapse.getDelta());
                synapse.setDelta(delta);
            }
        }
    }
}
```
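Written out as a standalone sketch (class and method names here are illustrative), the per-synapse update I intend in `updateWeights` is w ← w + learningRate · error · sourceOutput + momentum · previousDelta:

```java
// Standalone sketch of the weight update with momentum that updateWeights()
// is meant to perform. The numeric values used in the test are illustrative.
class MomentumUpdateDemo {
    static final double LEARNING_RATE = 0.45;
    static final double MOMENTUM = 0.9;

    // New weight: w + learningRate * error * sourceOutput + momentum * previousDelta
    static double updateWeight(double weight, double error,
                               double sourceOutput, double previousDelta) {
        double delta = newDelta(error, sourceOutput);
        return weight + delta + MOMENTUM * previousDelta;
    }

    // The delta stored on the synapse for the next step's momentum term
    static double newDelta(double error, double sourceOutput) {
        return LEARNING_RATE * error * sourceOutput;
    }
}
```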
For the validation set, I only run this:
```java
this.forwardPropagation(neuralNetwork, inputs);
```
And then check the output of the neuron in my output layer.
Did I do something wrong? I need some explanations...
Here are my results after 1000 epochs:
Real: 0.0 Current: 0.025012156926937503
Real: 1.0 Current: 0.022566830709341495
Real: 1.0 Current: 0.02768416343491415
Real: 0.0 Current: 0.024903432706154027
Also, why are the synapses in the input layer never updated? Everywhere I read, it says to update only the hidden and output layers.
As you can see, the results are totally wrong! The outputs never approach 1.0; they all converge toward the output of the first training row (0.0).