CosineAnnealingWarmRestartsScheduler#
- class composer.optim.CosineAnnealingWarmRestartsScheduler(t_0, t_mult=1.0, alpha_f=0.0)[source]#
Cyclically decays the learning rate according to the decreasing part of a cosine curve.
See also
This scheduler is based on CosineAnnealingWarmRestarts from PyTorch.

This scheduler resembles a regular cosine annealing curve, as seen in CosineAnnealingScheduler, except that after the curve first completes t_0 time, the curve resets to the start. The durations of subsequent cycles are each multiplied by t_mult.

Specifically, the learning rate multiplier \(\alpha\) can be expressed as:
\[\alpha(t) = \alpha_f + (1 - \alpha_f) \times \frac{1}{2}(1 + \cos(\pi \times \tau_i))\]

Given \(\tau_i\), the fraction of time elapsed through the \(i^\text{th}\) cycle, as:

\[\tau_i = \left(t - \sum_{j=0}^{i-1} t_0 t_{mult}^j\right) / \left(t_0 t_{mult}^i\right)\]

where \(t_0\) represents the period of the first cycle, \(t_{mult}\) represents the multiplier for the duration of successive cycles, and \(\alpha_f\) represents the learning rate multiplier to decay to.
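To illustrate the formulas above, here is a minimal sketch (not Composer's internal implementation) of a standalone function that computes \(\alpha(t)\) for a given elapsed time t, assuming t, t_0, and the cycle durations are all expressed in the same unit (e.g. epochs):

    import math

    def warm_restarts_multiplier(t: float, t_0: float, t_mult: float = 1.0, alpha_f: float = 0.0) -> float:
        """Illustrative sketch of the LR multiplier alpha(t); not Composer's internal code."""
        # Walk forward through cycles until reaching the one that contains time ``t``.
        cycle_start = 0.0
        cycle_len = t_0
        while t >= cycle_start + cycle_len:
            cycle_start += cycle_len
            cycle_len *= t_mult  # each cycle lasts t_mult times as long as the previous one

        # Fraction of the current cycle that has elapsed (tau_i in the formula above).
        tau_i = (t - cycle_start) / cycle_len

        # Decreasing half of a cosine curve, decaying from 1 toward alpha_f.
        return alpha_f + (1 - alpha_f) * 0.5 * (1 + math.cos(math.pi * tau_i))

    # Example: with t_0=10 and t_mult=2, cycles span [0, 10), [10, 30), [30, 70), ...
    print(warm_restarts_multiplier(t=0.0, t_0=10, t_mult=2.0))   # 1.0 (start of first cycle)
    print(warm_restarts_multiplier(t=10.0, t_0=10, t_mult=2.0))  # 1.0 again (curve resets at the restart)
    print(warm_restarts_multiplier(t=20.0, t_0=10, t_mult=2.0))  # 0.5 (halfway through the second cycle)

The example output shows the defining behavior of warm restarts: the multiplier jumps back to 1 at the start of each cycle rather than decaying monotonically, and with t_mult > 1 each restart is followed by a longer decay than the one before.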