Machine Learning Notes ---- Neural Networks

Neural Network

1. Model Summary

At a very simple level, neurons are computational units that take electrical inputs (called "spikes") through their dendrites and channel them to outputs through their axons. In our model, the dendrites correspond to the input features $x_1, \dots, x_n$, and the output is the result of our hypothesis function. The input node $x_0 = 1$ is sometimes called the "bias unit"; it is always equal to 1. In neural networks we use the same logistic function as in classification, $\frac{1}{1 + e^{-\theta^T x}}$, though here it is sometimes called a sigmoid (logistic) activation function, and the "theta" parameters are sometimes called "weights".
Visually, a simplistic representation looks like:

$[x_0, x_1, x_2] \rightarrow [\ \ ] \rightarrow h_\theta(x)$

Three layers: input layer / hidden layer / output layer
$a_i^{(j)}$ : activation of unit i in layer j
$\Theta^{(j)}$ : matrix of weights that controls the function mapping from layer j to layer j+1
If layer j has $s_j$ units and layer j+1 has $s_{j+1}$ units, then $\Theta^{(j)}$ has size $s_{j+1} \times (s_j + 1)$
L : number of layers
$s_l$ : number of units in layer l
Number of inputs: the dimension of the feature vector $x^{(i)}$
Binary classification: 1 output unit
K-class classification: K output units
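As a quick check of the size rule above, here is a minimal sketch that lists the shape of every weight matrix from the number of units per layer. The helper name `theta_shapes` and the 3-5-1 network are hypothetical, chosen only for illustration.

```python
def theta_shapes(layer_sizes):
    """Return the (rows, cols) shape of each Theta^(j).

    layer_sizes[j] is the number of units in layer j+1, not counting the
    bias unit. Each matrix gets one extra column for the bias unit.
    """
    return [(s_next, s + 1) for s, s_next in zip(layer_sizes, layer_sizes[1:])]


# Hypothetical network: 3 inputs, 5 hidden units, 1 output unit.
shapes = theta_shapes([3, 5, 1])
print(shapes)  # [(5, 4), (1, 6)]
```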

2. Forward Propagation

1) Add the bias unit $a_0^{(j)} = 1$ first
2) $z^{(j+1)} = \Theta^{(j)} a^{(j)}$
3) $a^{(j+1)} = g(z^{(j+1)})$, where g is the sigmoid function
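The three steps can be sketched in plain Python as follows. This is only an illustration under assumptions: each weight matrix is stored as a list of rows with the bias weight in the first column, and the 2-2-1 network with hand-picked weights is hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Return the activations of every layer for one input vector x.

    thetas[j] is a list of rows; row i holds the weights feeding unit i of
    the next layer, with the bias weight first.
    """
    activations = [list(x)]
    for theta in thetas:
        a_with_bias = [1.0] + activations[-1]  # step 1: add the bias unit
        # step 2: z = Theta * a (one dot product per row of Theta)
        z = [sum(w * a for w, a in zip(row, a_with_bias)) for row in theta]
        # step 3: a = g(z), applied element-wise
        activations.append([sigmoid(v) for v in z])
    return activations


# Hypothetical 2-2-1 network with hand-picked weights (illustration only).
thetas = [
    [[0.0, 1.0, -1.0], [0.5, -0.5, 0.5]],
    [[0.0, 1.0, 1.0]],
]
acts = forward(thetas, [1.0, 1.0])
hypothesis = acts[-1]  # output of the last layer, h_Theta(x)
```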

3. Cost Function

The regularization term excludes the bias weights.
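Assuming these notes use the usual regularized cross-entropy cost for a K-class sigmoid network, it can be written as below. The symbols m (number of training examples) and $\lambda$ (regularization strength) are assumptions, since they are not defined elsewhere in these notes.

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ y_k^{(i)} \log\left( h_\Theta(x^{(i)}) \right)_k
       + \left(1 - y_k^{(i)}\right) \log\left( 1 - \left( h_\Theta(x^{(i)}) \right)_k \right) \right]
  + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
    \left( \Theta_{j,i}^{(l)} \right)^2
```

In the last term the column index i starts at 1 rather than 0, so the bias column of each $\Theta^{(l)}$ is left out of the penalty.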

4. Backpropagation Algorithm

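$\delta_j^{(l)}$ : error of node j in layer l. Then

$\delta^{(L)} = a^{(L)} - y$
$\delta^{(i)} = (\Theta^{(i)})^T \delta^{(i+1)} \;.\!*\; g'(z^{(i)})$, for $i \neq L$ and $i \neq 1$

where $g'(z^{(i)}) = a^{(i)} \;.\!*\; (1 - a^{(i)})$ and $.*$ denotes element-wise multiplication.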
One thing to note: run forward propagation and backpropagation on one training example at a time!
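A minimal sketch of the delta recursion for a single training example, assuming the cross-entropy cost and sigmoid activations throughout. The function name `backprop_single`, the list-of-rows weight layout (bias weight first), and the 2-2-1 network are all hypothetical; regularization and averaging over the training set are left out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def backprop_single(thetas, x, y):
    """Gradient contributions of ONE example (x, y), one matrix per Theta."""
    # Forward pass; hidden and input activations keep the bias unit in front.
    acts = [[1.0] + list(x)]
    for idx, theta in enumerate(thetas):
        a = [sigmoid(sum(w * v for w, v in zip(row, acts[-1]))) for row in theta]
        is_output = idx == len(thetas) - 1
        acts.append(a if is_output else [1.0] + a)

    # Output layer: delta^(L) = a^(L) - y
    delta = [a - t for a, t in zip(acts[-1], y)]
    grads = [None] * len(thetas)
    for l in range(len(thetas) - 1, -1, -1):
        # Contribution for this Theta: outer product of the next layer's
        # delta with this layer's activations (bias unit included).
        grads[l] = [[d * a for a in acts[l]] for d in delta]
        if l > 0:
            # (Theta)^T * delta, then element-wise times a .* (1 - a);
            # the bias entry is dropped because it has no incoming weights.
            back = [sum(thetas[l][i][j] * delta[i] for i in range(len(delta)))
                    for j in range(len(acts[l]))]
            delta = [b * a * (1.0 - a) for b, a in zip(back[1:], acts[l][1:])]
    return grads


# Hypothetical 2-2-1 network and one labeled example (illustration only).
thetas = [
    [[0.0, 1.0, -1.0], [0.5, -0.5, 0.5]],
    [[0.0, 1.0, 1.0]],
]
grads = backprop_single(thetas, [1.0, 1.0], [1.0])
```

For a full training set, these per-example matrices would be summed over all examples and divided by the number of examples.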

5. Unrolling Parameters

Unroll the parameter matrices into a single vector, and reshape the vector to get the matrices back.
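A small sketch of the round trip in plain Python. The helper names `unroll` and `reshape` are hypothetical, and the flattening here is row by row, which is an assumption: Octave/MATLAB's `Theta(:)` flattens column by column instead. Either order works as long as both directions agree.

```python
def unroll(matrices):
    """Flatten a list of matrices (lists of rows) into one flat list."""
    return [value for matrix in matrices for row in matrix for value in row]


def reshape(vector, shapes):
    """Rebuild the matrices from a flat list, given each (rows, cols) shape."""
    matrices, pos = [], 0
    for rows, cols in shapes:
        matrices.append([vector[pos + r * cols: pos + (r + 1) * cols]
                         for r in range(rows)])
        pos += rows * cols
    return matrices


# Hypothetical weights for a 2-2-1 network.
theta1 = [[0.0, 1.0, -1.0], [0.5, -0.5, 0.5]]
theta2 = [[0.0, 1.0, 1.0]]

flat = unroll([theta1, theta2])             # 9 numbers in one list
restored = reshape(flat, [(2, 3), (1, 3)])  # back to two matrices
```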

6. Gradient Checking


When training, turn off gradient checking! The numerical approximation is far too slow to run on every iteration.
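Gradient checking compares the backpropagation gradient with a two-sided finite-difference estimate of each partial derivative. The sketch below assumes the parameters are already unrolled into a flat list; the function name `numerical_gradient`, the step `eps = 1e-4`, and the toy quadratic cost are illustrative assumptions, not taken from these notes.

```python
def numerical_gradient(cost, theta, eps=1e-4):
    """Estimate dJ/dtheta_i as (J(theta + eps) - J(theta - eps)) / (2 * eps)."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        plus[i] += eps
        minus = list(theta)
        minus[i] -= eps
        grad.append((cost(plus) - cost(minus)) / (2 * eps))
    return grad


# Toy cost J(theta) = sum(theta_i^2), whose exact gradient is 2 * theta_i.
theta = [1.0, -2.0, 0.5]
approx = numerical_gradient(lambda t: sum(v * v for v in t), theta)
exact = [2 * v for v in theta]

# The two gradients should agree to several decimal places.
max_diff = max(abs(a - e) for a, e in zip(approx, exact))
```

Each estimate needs two full cost evaluations per parameter, which is why the check is run once to validate the backpropagation code and then disabled.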

7. Random Initialization

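If every weight starts at the same value (for example zero), all hidden units in a layer compute the same function and receive the same update, so the weights are usually drawn at random from a small symmetric interval. A minimal sketch, assuming a uniform draw from [-epsilon, epsilon]; the function name `random_init` and the default `epsilon = 0.12` are illustrative assumptions.

```python
import random


def random_init(n_in, n_out, epsilon=0.12):
    """Return an n_out x (n_in + 1) matrix with entries in [-epsilon, epsilon].

    The extra column holds the weights for the bias unit.
    """
    return [[random.uniform(-epsilon, epsilon) for _ in range(n_in + 1)]
            for _ in range(n_out)]


# Hypothetical layer: 3 inputs feeding 5 hidden units.
theta1 = random_init(3, 5)
```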

8. Network Architecture

Reasonable defaults: one hidden layer, or, if more than one hidden layer is used, the same number of units in every hidden layer.