Understanding Machine Learning in One Day: PPT Notes, Part 2
Tips for Training DNN
minimize total loss
more layers do not imply better results
- so it is hard to realize the power of deep networks
Learning rate
- popular and simple idea: reduce the learning rate by some factor every few epochs
– at the beginning, we are far from the destination, so we use a larger learning rate
– after several epochs, we are close to the destination, so we reduce the learning rate
– a demo rate function: rate = initial rate / sqrt(t + 1)
– the learning rate cannot be one-size-fits-all, so different parameters should be given different learning rates
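The demo rate function above can be sketched in a few lines of Python. The second function is one possible way to give each parameter its own learning rate, in the style of Adagrad; the notes do not name a specific method, so treat it as an assumption. All names (`decayed_rate`, `adagrad_step`, `sq_sums`, `eps`) are illustrative.

```python
import math

def decayed_rate(init_rate, t):
    """Learning rate after t updates: init_rate / sqrt(t + 1)."""
    return init_rate / math.sqrt(t + 1)

def adagrad_step(params, grads, sq_sums, init_rate, eps=1e-8):
    """One Adagrad-style update (assumed method, not named in the notes).

    Each parameter divides the shared init_rate by the root of its own
    accumulated squared gradients, so each gets its own effective rate.
    """
    new_params, new_sq_sums = [], []
    for w, g, s in zip(params, grads, sq_sums):
        s = s + g * g                      # accumulate squared gradient
        new_params.append(w - init_rate / (math.sqrt(s) + eps) * g)
        new_sq_sums.append(s)
    return new_params, new_sq_sums

# The rate shrinks as training proceeds: large early, small later.
rates = [decayed_rate(0.1, t) for t in range(4)]

# Two parameters with very different gradient sizes end up taking
# steps of similar size, because each is scaled by its own history.
params, sq_sums = adagrad_step([1.0, 1.0], [10.0, 0.1], [0.0, 0.0], 0.1)
```

In this sketch the first parameter has a gradient 100 times larger than the second, yet both move by roughly the same amount on the first step.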
hard to find optimal network parameters
- there are many points where the gradient of the loss with respect to the parameters is 0, and plain gradient descent stops at them, so we use Momentum
Momentum
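– each update keeps part of the previous movement, which helps gradient descent pass through zero-gradient points and find better parameters (this is not guaranteed)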
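A minimal sketch of gradient descent with momentum, assuming the common formulation in which each movement is the previous movement scaled by a momentum factor, minus the learning rate times the gradient. The 1-D loss with a flat region is a made-up example, and the names (`gd_with_momentum`, `plateau_grad`) are illustrative.

```python
def gd_with_momentum(grad, w0, rate=0.1, momentum=0.9, steps=300):
    """Gradient descent where movement = momentum * last movement - rate * gradient."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = momentum * v - rate * grad(w)
        w = w + v
    return w

def plateau_grad(w):
    """Gradient of a hypothetical loss: a slope, a flat plateau, then a bowl.

    loss(w) = 2 - w        for w < 1        (gradient -1)
    loss(w) = 1            for 1 <= w < 2   (gradient 0, the plateau)
    loss(w) = (w - 3)**2   for w >= 2       (minimum at w = 3)
    """
    if w < 1.0:
        return -1.0
    if w < 2.0:
        return 0.0
    return 2.0 * (w - 3.0)

# momentum=0.0 is plain gradient descent: it stops on the plateau.
stuck = gd_with_momentum(plateau_grad, w0=0.0, momentum=0.0)

# With momentum, the accumulated movement carries w across the plateau.
found = gd_with_momentum(plateau_grad, w0=0.0, momentum=0.9)
```

Here `stuck` stays inside the flat region, while `found` ends up near the minimum at w = 3. On other loss shapes momentum can still stall, so this illustrates the idea rather than proving it.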
Why Overfitting
- training data and testing data can be different
- the learning target is defined by the training data
- the parameters that achieve the learning target do not necessarily give good results on the testing data
panacea for overfitting
- have more training data
- create more training data, for example:
some other ways to get parameters that work better on the testing data
- early stopping
- weight decay
- dropout
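The three techniques above can each be sketched in a few lines of plain Python. These are simplified illustrations under stated assumptions, not the lecture's exact formulations: early stopping uses a hypothetical `patience` rule on validation loss, weight decay is written as an L2 penalty added to the gradient, and dropout uses the "inverted" variant that rescales the surviving activations during training.

```python
import random

def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch with the best validation loss, scanning until the
    loss has failed to improve for `patience` epochs in a row."""
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

def weight_decay_step(w, grad, rate=0.1, decay=0.01):
    """Gradient step that also shrinks the weight toward zero."""
    return w - rate * (grad + decay * w)

def dropout(activations, drop_prob, rng):
    """Inverted dropout for training: zero each activation with probability
    drop_prob and divide the survivors by the keep probability."""
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# Validation loss improves until epoch 2, then gets worse, so we stop there.
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.9, 0.6])

# Even with zero gradient, the weight shrinks slightly.
decayed = weight_decay_step(1.0, grad=0.0)

# About half of the activations are dropped; the rest are doubled.
rng = random.Random(0)
dropped = dropout([1.0] * 1000, drop_prob=0.5, rng=rng)
```

Note that the early stopping sketch never sees the later, lower loss of 0.6 at epoch 5, because training has already stopped. That is the trade-off a patience setting makes.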
Variants of Neural Networks
- Convolutional Neural Network (CNN, widely used in image processing)
- Recurrent Neural Network (RNN)