Given x0, x1, w0, w1, w2, and y, where
g = 1 / (1 + math.exp(-((x0 * w0) + (x1 * w1) + w2)))
and the loss function is f = y - g,
use the backpropagation (BP) algorithm to adjust w0, w1, w2 so that f < 0.1.
x0 = -1
x1 = -2
w0 = 2
w1 = -3
w2 = -3
y = 1.73
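
A minimal sketch of the exercise, assuming plain gradient descent with a hypothetical learning rate lr = 0.5 (the exercise does not specify one). Note that g is a sigmoid output and so always stays below 1, while y = 1.73; f = y - g therefore cannot actually fall below 0.73. The loop below simply runs a fixed number of descent steps, stopping early if f < 0.1 were ever reached, to illustrate how the gradients flow back into w0, w1, w2:

import math

# given values from the exercise
x = [-1.0, -2.0]
w = [2.0, -3.0, -3.0]   # w0, w1, w2
y = 1.73

lr = 0.5                 # hypothetical learning rate, not specified in the exercise

for step in range(50):
    # forward pass
    dot = w[0]*x[0] + w[1]*x[1] + w[2]
    g = 1.0 / (1.0 + math.exp(-dot))
    f = y - g                                 # loss as defined above

    # backward pass: df/ddot = -dg/ddot = -(1 - g) * g
    ddot = -(1.0 - g) * g
    dw = [x[0]*ddot, x[1]*ddot, 1.0*ddot]     # df/dw0, df/dw1, df/dw2

    # gradient descent step on f
    w = [wi - lr*dwi for wi, dwi in zip(w, dw)]

    if f < 0.1:
        break

print(step, f, w)

On this setup f should level off near 0.73 as g saturates toward 1, but the update step shows the backpropagated gradients in action.
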
https://cs231n.github.io/optimization-2/
Example code from the original article:
For example, the sigmoid expression receives the input 1.0 and computes the output 0.73 during the forward pass. The derivation above shows that the local gradient would simply be (1 - 0.73) * 0.73 ~= 0.2, as the circuit computed before (see the image above), except this way it would be done with a single, simple and efficient expression (and with less numerical issues). Therefore, in any real practical application it would be very useful to group these operations into a single gate. Let's see the backprop for this neuron in code:
import math

w = [2, -3, -3]  # assume some random weights and data
x = [-1, -2]

# forward pass
dot = w[0]*x[0] + w[1]*x[1] + w[2]
f = 1.0 / (1 + math.exp(-dot))  # sigmoid function

# backward pass through the neuron (backpropagation)
ddot = (1 - f) * f  # gradient on dot variable, using the sigmoid gradient derivation
dx = [w[0] * ddot, w[1] * ddot]  # backprop into x
dw = [x[0] * ddot, x[1] * ddot, 1.0 * ddot]  # backprop into w
# we're done! we have the gradients on the inputs to the circuit
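
Running this at the given values: dot = 1.0, f = sigmoid(1.0) ~= 0.73, and the local gradient is (1 - 0.73) * 0.73 ~= 0.197, i.e. the ~0.2 mentioned in the quote, so dw ~= [-0.197, -0.393, 0.197] and dx ~= [0.393, -0.590]. A quick way to sanity-check these analytic gradients is a centered finite-difference check; this is not part of the quoted example, and forward() below is just an illustrative helper:

import math

def forward(w, x):
    # same neuron as above: sigmoid of the dot product plus bias
    dot = w[0]*x[0] + w[1]*x[1] + w[2]
    return 1.0 / (1.0 + math.exp(-dot))

w = [2.0, -3.0, -3.0]
x = [-1.0, -2.0]

f = forward(w, x)
ddot = (1 - f) * f
dw = [x[0]*ddot, x[1]*ddot, 1.0*ddot]   # analytic gradient from the example above

h = 1e-5
for i in range(3):
    w_plus  = list(w); w_plus[i]  += h
    w_minus = list(w); w_minus[i] -= h
    num = (forward(w_plus, x) - forward(w_minus, x)) / (2*h)   # centered difference
    print(f"dw[{i}]: analytic={dw[i]:.6f} numerical={num:.6f}")

The analytic and numerical values should agree to several decimal places, which confirms the sigmoid-gate shortcut used in the quoted code.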