[coursera/ImprovingDL/week3] Hyperparameter tuning, Batch Normalization (summary & question)
The videos for this week are easy.
3.1 hyperparameter tuning
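try random sampling instead of a grid
select a zone (coarse to fine), then sample on an appropriate scale: log scale for e.g. the learning rate
babysit one model / train many models in parallel: depends on the available computation power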
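A minimal sketch of random sampling on a log scale, assuming NumPy. The helper name sample_log_uniform and the ranges used here are illustrative assumptions, not values prescribed by the course.

```python
import numpy as np

def sample_log_uniform(low, high, size, rng):
    """Draw values uniformly on a log10 scale between low and high."""
    exponents = rng.uniform(np.log10(low), np.log10(high), size=size)
    return 10.0 ** exponents

rng = np.random.default_rng(0)

# Learning rate: each decade between 1e-4 and 1 is equally likely.
learning_rates = sample_log_uniform(1e-4, 1e0, size=5, rng=rng)

# For a value close to 1 (e.g. a momentum term), sample 1 - beta on a log scale.
betas = 1.0 - sample_log_uniform(1e-3, 1e-1, size=5, rng=rng)
```

Sampling the exponent uniformly, rather than the value itself, avoids spending almost all trials in the top decade of the range.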
3.2 Batch Normalization
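normalize the hidden units: normalize z (before the activation) to mean 0 and variance 1 over the mini-batch, then scale and shift with learnable γ and β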
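A rough sketch of the batch norm forward step for one layer, assuming NumPy and the course layout where z has one column per example. The function name batch_norm_forward, the eps value, and the test data are assumptions for illustration; gamma and beta are the learnable scale and shift parameters.

```python
import numpy as np

def batch_norm_forward(z, gamma, beta, eps=1e-8):
    """Normalize z over the mini-batch, then scale and shift.

    z:     shape (n_units, m), one column per example
    gamma: shape (n_units, 1), learnable scale
    beta:  shape (n_units, 1), learnable shift
    """
    mu = z.mean(axis=1, keepdims=True)        # per-unit mean over the batch
    var = z.var(axis=1, keepdims=True)        # per-unit variance over the batch
    z_norm = (z - mu) / np.sqrt(var + eps)    # mean 0, variance 1
    return gamma * z_norm + beta              # z tilde, fed to the activation

rng = np.random.default_rng(0)
z = rng.normal(loc=3.0, scale=2.0, size=(4, 64))
gamma = np.ones((4, 1))
beta = np.zeros((4, 1))
z_tilde = batch_norm_forward(z, gamma, beta)
```

With gamma = 1 and beta = 0 the output is just the normalized z; other values let the network choose a different mean and variance for each hidden unit. This covers training time only; at test time the course uses running averages of mu and var instead of batch statistics.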
3.3 multi-class classification
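softmax regression: generalizes logistic regression to C classes
softmax classifier: the output layer has C units whose activations are probabilities that sum to 1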
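A small sketch of the softmax activation, assuming NumPy and the same column-per-example layout. The input values are only an example, and subtracting the maximum is a common numerical-stability trick added here rather than something these notes mention.

```python
import numpy as np

def softmax(z):
    """Column-wise softmax for z of shape (n_classes, m)."""
    shifted = z - z.max(axis=0, keepdims=True)   # does not change the result, avoids overflow
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum(axis=0, keepdims=True)

# One example with C = 4 classes.
z = np.array([[5.0], [2.0], [-1.0], [3.0]])
probs = softmax(z)                               # each column sums to 1
predicted_class = int(np.argmax(probs, axis=0)[0])
```

The predicted class is the index of the largest probability, which is also the index of the largest entry of z.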
question:
WA: 8. The Batch norm parameters γ and β can be learned using Adam, gradient descent with momentum, or RMSprop, not just with plain gradient descent.