Paper Discussion: [masked] Disciplined Approach to Hyper-Parameters
Learning Rate, Batch Size, Momentum and Weight Decay
We'll discuss this follow-up paper to Leslie Smith's Super-Convergence paper. Leslie Smith's papers focus on minimizing the time and the number of epochs needed to train a model.
Here is a link to a good web post on Cyclical Learning Rates: https://iconof.com/1cycle-learning-rate-policy. Consider studying the papers beforehand.
The idea for this topic and the links came from Vicor Lu. Thanks!