CosineAnnealingWarmRestartsScheduler

class composer.optim.CosineAnnealingWarmRestartsScheduler(t_0, t_mult=1.0, alpha_f=0.0)

Cyclically decays the learning rate according to the decreasing part of a cosine curve.

See also

This scheduler is based on CosineAnnealingWarmRestarts from PyTorch.

This scheduler resembles a regular cosine annealing curve, as seen in CosineAnnealingScheduler, except that once the first cycle of duration t_0 completes, the curve resets to its starting value. Each subsequent cycle lasts t_mult times as long as the cycle before it.

Specifically, the learning rate multiplier \(\alpha\) can be expressed as:

\[\alpha(t) = \alpha_f + (1 - \alpha_f) \times \frac{1}{2}(1 + \cos(\pi \times \tau_i)) \]

Here \(\tau_i\), the fraction of time elapsed through the \(i^\text{th}\) cycle (counting from \(i = 0\)), is given by:

\[\tau_i = (t - \sum_{j=0}^{i-1} t_0 t_{mult}^j) / (t_0 t_{mult}^i) \]

Where \(t_0\) represents the period of the first cycle, \(t_{mult}\) represents the multiplier for the duration of successive cycles, and \(\alpha_f\) represents the learning rate multiplier to decay to.
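To make the formulas above concrete, the following is a minimal pure-Python sketch of the multiplier, not Composer's own implementation. The function name `warm_restart_multiplier` is hypothetical, and the sketch assumes that `t` and `t_0` are plain numbers in the same unit (for example, epochs) and that `t_mult >= 1`, so the cycle search always terminates.

```python
import math


def warm_restart_multiplier(t, t_0, t_mult=1.0, alpha_f=0.0):
    """Illustrative alpha(t); t and t_0 are numbers in the same time unit."""
    if t_0 <= 0 or t_mult < 1:
        # Assumption of this sketch: cycles never shrink, so the loop ends.
        raise ValueError("this sketch assumes t_0 > 0 and t_mult >= 1")

    cycle_start = 0.0        # sum of the durations of all completed cycles
    cycle_len = float(t_0)   # duration of the current cycle, t_0 * t_mult**i

    # Advance cycle by cycle until t falls inside the current one.
    while t >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= t_mult

    tau = (t - cycle_start) / cycle_len  # fraction elapsed through the cycle
    return alpha_f + (1.0 - alpha_f) * 0.5 * (1.0 + math.cos(math.pi * tau))


# With t_0 = 10 and t_mult = 2, cycles start at t = 0, 10, 30, 70, ...
for t in (0, 5, 10, 20, 30):
    print(t, round(warm_restart_multiplier(t, t_0=10, t_mult=2.0), 4))
```

In this sketch the multiplier is 1 at the start of every cycle, falls toward \(\alpha_f\) as the cycle ends, and jumps back to 1 at the restart.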

Parameters
  • t_0 (str | Time) – The period of the first cycle.

  • t_mult (float) – The multiplier for the duration of successive cycles. Default = 1.0.

  • alpha_f (float) – Learning rate multiplier to decay to. Default = 0.0.