I have a quick question regarding backpropagation. I am looking at the following:

http://www4.rgu.ac.uk/files/chapter3%20-%20bp.pdf

In this paper, it says to calculate the error of the neuron as:

Error = Output(i) * (1 - Output(i)) * (Target(i) - Output(i))

The part of the equation I don't understand is the Output(i) * (1 - Output(i)) term. The paper says this term is needed because of the sigmoid function, but I still don't understand why it would be necessary.

What would be wrong with using

Error = abs(Output(i) - Target(i))

?

Is the error function the same regardless of the neuron's activation/transfer function?


You are actually calculating the derivative of the error function with respect to the neuron's inputs.

When you take that derivative via the chain rule, you have to multiply by the derivative of the neuron's activation function (which in this paper is a sigmoid).

The derivative term differs according to the activation function you use. In the most trivial case, doutput/dinput is just 1 for a linear activation, and this holds for any cost function. There are, however, desirable properties for a cost function: a smooth derivative, and a single extremum where the gradient crosses zero. This is what would be wrong with abs(Output(i) - Target(i)): its derivative is discontinuous, jumping from -1 to +1 right at the minimum, so the gradient never shrinks as you approach the target.
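To make that point concrete, here is a small sketch (plain Python; the function names are my own) comparing numerical derivatives of the squared error and the absolute error near the minimum. The squared error's derivative passes smoothly through zero, while the absolute error's derivative jumps from -1 to +1:

```python
def squared_error(output, target=0.5):
    # Squared error: smooth, derivative crosses zero at the minimum
    return (target - output) ** 2

def abs_error(output, target=0.5):
    # Absolute error: derivative is discontinuous at the minimum
    return abs(target - output)

def numerical_derivative(f, x, h=1e-6):
    # Central finite difference
    return (f(x + h) - f(x - h)) / (2 * h)

for out in (0.4, 0.4999, 0.5001, 0.6):
    print(out,
          numerical_derivative(squared_error, out),  # ~ -0.2, -0.0002, 0.0002, 0.2
          numerical_derivative(abs_error, out))      # ~ -1, -1, +1, +1
```

Note how the squared error's gradient shrinks toward zero as the output approaches the target, which is exactly what you want for a gradient-based update.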

Calculate the derivative of the error with respect to the neuron's inputs via the chain rule:

E = -(target - output)^2

dE/dinput = dE/doutput * doutput/dinput

Work out doutput/dinput:

output = sigmoid(input)

doutput/dinput = output * (1 - output)    (derivative of the sigmoid function)

therefore:

dE/dinput = 2 * (target - output) * output * (1 - output)

This is the paper's formula up to the constant factor of 2 (defining E with a factor of 1/2 removes it, and constant factors are typically absorbed into the learning rate anyway).
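The whole derivation can be checked numerically. A minimal sketch (plain Python; the variable names and test values are my own), comparing the analytic dE/dinput above against a finite-difference estimate:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def E(inp, target=0.9):
    # E = -(target - output)^2, with output = sigmoid(input)
    return -(target - sigmoid(inp)) ** 2

inp, target = 0.3, 0.9
out = sigmoid(inp)

# Analytic derivative from the chain rule above
analytic = 2 * (target - out) * out * (1 - out)

# Central finite-difference estimate of dE/dinput
h = 1e-6
numeric = (E(inp + h) - E(inp - h)) / (2 * h)

print(analytic, numeric)  # the two agree to many decimal places
```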
