Machine Learning: Explaining a Neural Network's BP Algorithm with a Simple Example

The BP Algorithm in Neural Networks

  • The BP algorithm is also known as the δ (delta) algorithm.
  • Take a three-layer perceptron as an example:

Output-layer error: $E=\frac{1}{2}(d-O)^2=\frac{1}{2}\sum_{k=1}^{\ell}(d_k-O_k)^2$

Error expanded to the hidden layer: $E=\frac{1}{2}\sum_{k=1}^{\ell}\big(d_k-f(net_k)\big)^2 = \frac{1}{2}\sum_{k=1}^{\ell}\Big(d_k-f\big(\sum_{j=0}^{m}w_{jk}y_j\big)\Big)^2$

Error expanded to the input layer:
$E=\frac{1}{2}\sum_{k=1}^{\ell}\Big\{d_k-f\Big[\sum_{j=0}^{m}w_{jk}f(net_j)\Big]\Big\}^2 = \frac{1}{2}\sum_{k=1}^{\ell}\Big\{d_k-f\Big[\sum_{j=0}^{m}w_{jk}f\Big(\sum_{i=1}^{n}v_{ij}x_i\Big)\Big]\Big\}^2$
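Minimizing this error by gradient descent gives the classic delta-rule updates for the two weight layers (stated here in their standard textbook form, with learning rate $\eta$ and $f'$ the activation's derivative):

$\Delta w_{jk} = \eta\,\delta^o_k\,y_j,\qquad \delta^o_k=(d_k-o_k)\,f'(net_k)$

$\Delta v_{ij} = \eta\,\delta^y_j\,x_i,\qquad \delta^y_j=\Big(\sum_{k=1}^{\ell}\delta^o_k\,w_{jk}\Big)\,f'(net_j)$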


SGD for Neural Networks

With the error $E$ in hand, we can solve for $w$ and $v$ by stochastic gradient descent (SGD), i.e., find the $w$ and $v$ that minimize the error $E$.
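A single SGD step is just a move against the gradient, scaled by the learning rate. A minimal sketch (the example numbers are the $w_7$ update worked out later in the article):

```python
def sgd_step(w, grad, eta=0.5):
    """Move a weight against its error gradient, scaled by the learning rate eta."""
    return w - eta * grad

w7_new = sgd_step(0.4, 0.078064)   # reproduces the article's w7+ ≈ 0.360968
print(w7_new)
```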


A Worked BP Example
(Figure: the example network, its inputs, initial weights $w_1 \dots w_{12}$, and biases.)
$net_{h1} = w_1 l_1 + w_2 l_2 + b_1 \cdot 1 = 0.1\times 5 + 0.15\times 10 + 0.35\times 1 = 2.35$

$out_{h1} = \frac{1}{1+e^{-net_{h1}}} = \frac{1}{1+e^{-2.35}} = 0.912934$

Similarly, $out_{h2} = 0.979164$ and $out_{h3} = 0.995275$.
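The forward pass for the first hidden neuron can be checked with a few lines of Python (the sigmoid is the activation used throughout the article):

```python
import math

def sigmoid(x):
    """Logistic activation used by the article's network."""
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and the first hidden neuron's parameters, as given above
l1, l2 = 5, 10
w1, w2, b1 = 0.10, 0.15, 0.35

net_h1 = w1 * l1 + w2 * l2 + b1 * 1   # 0.5 + 1.5 + 0.35 = 2.35
out_h1 = sigmoid(net_h1)              # ≈ 0.912934
print(net_h1, out_h1)
```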

$net_{o1} = w_7\,out_{h1} + w_9\,out_{h2} + w_{11}\,out_{h3} + b_2 \cdot 1$

$net_{o1} = 0.4\times 0.912934 + 0.5\times 0.979164 + 0.6\times 0.995275 + 0.65\times 1 = 2.1019206$

$out_{o1} = \frac{1}{1+e^{-net_{o1}}} = \frac{1}{1+e^{-2.1019206}} = 0.891090$

Similarly, $out_{o2} = 0.904330$.
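The same computation for the first output neuron, in Python. Note the output bias $b_2 = 0.65$ is an assumption inferred from the article's sum; it is not stated explicitly in the text:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hidden activations from the previous step
out_h1, out_h2, out_h3 = 0.912934, 0.979164, 0.995275
w7, w9, w11 = 0.4, 0.5, 0.6
b2 = 0.65   # assumption: inferred so that net_o1 matches 2.1019206

net_o1 = w7 * out_h1 + w9 * out_h2 + w11 * out_h3 + b2 * 1   # ≈ 2.1019206
out_o1 = sigmoid(net_o1)                                     # ≈ 0.891090
print(net_o1, out_o1)
```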

The output-layer errors are then:
$E_{o1} = \frac{1}{2}(target_{o1}-out_{o1})^2$

$E_{o2} = \frac{1}{2}(target_{o2}-out_{o2})^2$

$E_{total} = E_{o1} + E_{o2} = \frac{1}{2}(0.01-0.891090)^2 + \frac{1}{2}(0.99-0.904330)^2 = 0.391829$
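The total squared error follows directly:

```python
# Targets and outputs from the forward pass above
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.891090, 0.904330

E_o1 = 0.5 * (target_o1 - out_o1) ** 2
E_o2 = 0.5 * (target_o2 - out_o2) ** 2
E_total = E_o1 + E_o2   # ≈ 0.391829
print(E_total)
```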


Take computing $w_7^+$ as an example:

  • $\frac{\partial E_{total}}{\partial w_7} = \frac{\partial E_{total}}{\partial out_{o1}} \cdot \frac{\partial out_{o1}}{\partial net_{o1}} \cdot \frac{\partial net_{o1}}{\partial w_7}$
  1. $\frac{\partial E_{total}}{\partial out_{o1}} = 2 \cdot \frac{1}{2}(target_{o1}-out_{o1})^{2-1} \cdot (-1) + 0 = -(0.01-0.891090) = 0.881090$
  2. $\frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1}(1-out_{o1}) = 0.891090(1-0.891090) = 0.097049$
  3. $\frac{\partial net_{o1}}{\partial w_7} = 1 \cdot out_{h1} \cdot w_7^{1-1} + 0 + 0 + 0 = 0.912934$
  • $\frac{\partial E_{total}}{\partial w_7} = 0.881090 \times 0.097049 \times 0.912934 = 0.078064$
  • $w_7^+ = w_7 + \Delta w_7 = w_7 - \eta\frac{\partial E_{total}}{\partial w_7} = 0.4 - 0.5\times 0.078064 = 0.360968$ (the learning rate $\eta$ is a user-chosen value, here 0.5)
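The three chain-rule factors and the resulting update translate line for line into Python:

```python
# One backprop step for w7, mirroring the three chain-rule factors above
target_o1, out_o1, out_h1 = 0.01, 0.891090, 0.912934
w7, eta = 0.4, 0.5

dE_dout   = -(target_o1 - out_o1)   # step 1: ≈ 0.881090
dout_dnet = out_o1 * (1 - out_o1)   # step 2: ≈ 0.097049
dnet_dw7  = out_h1                  # step 3: ≈ 0.912934

grad_w7 = dE_dout * dout_dnet * dnet_dw7   # ≈ 0.078064
w7_new  = w7 - eta * grad_w7               # ≈ 0.360968
print(grad_w7, w7_new)
```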

Similarly, the remaining updated weights are:

$w_1^+ = 0.094534$

$w_2^+ = 0.139069$

$w_3^+ = 0.198211$

$w_4^+ = 0.246422$

$w_5^+ = 0.299497$

$w_6^+ = 0.348993$

$w_7^+ = 0.360968$

$w_8^+ = 0.453383$

$w_9^+ = 0.458137$

$w_{10}^+ = 0.553629$

$w_{11}^+ = 0.557448$

$w_{12}^+ = 0.653688$


Result after 10 iterations: $O = (0.662866, 0.908195)$
Result after 100 iterations: $O = (0.073889, 0.945864)$
Result after 1000 iterations: $O = (0.022971, 0.977675)$
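The whole procedure, iterated, can be sketched as a short NumPy training loop. The weights not written out in the text ($w_3$–$w_6$, $w_8$, $w_{10}$, $w_{12}$, and $b_2$) are assumptions inferred from the article's numbers; the article's $w_1$–$w_6$ results also imply that the hidden-layer deltas are computed with the already-updated output weights, and the sketch follows that order:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Inputs, targets, and weights; the values not stated in the text
# (w3..w6, w8, w10, w12, b2) are inferred from the article's numbers.
x = np.array([5.0, 10.0])            # inputs l1, l2
t = np.array([0.01, 0.99])           # targets for o1, o2
W1 = np.array([[0.10, 0.20, 0.30],   # input -> hidden; columns h1, h2, h3
               [0.15, 0.25, 0.35]])  # (w1, w3, w5 / w2, w4, w6)
W2 = np.array([[0.40, 0.45],         # hidden -> output; columns o1, o2
               [0.50, 0.55],         # (w7..w12)
               [0.60, 0.65]])
b1, b2, eta = 0.35, 0.65, 0.5

for _ in range(1000):
    h = sigmoid(x @ W1 + b1)                  # hidden activations
    o = sigmoid(h @ W2 + b2)                  # outputs
    delta_o = (o - t) * o * (1 - o)           # output-layer deltas
    W2 = W2 - eta * np.outer(h, delta_o)      # update output weights first...
    delta_h = (delta_o @ W2.T) * h * (1 - h)  # ...then backprop through the
    W1 = W1 - eta * np.outer(x, delta_h)      # already-updated W2

print(o)  # approaches the targets (0.01, 0.99) as iterations grow
```

With this update order, the first iteration reproduces the article's $w_1^+ \dots w_{12}^+$ to six decimal places.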