composer.optim
Optimizers and learning rate schedulers.
Composer is compatible with optimizers based off of PyTorch's native Optimizer API, and with common optimizers such as torch.optim.SGD and torch.optim.Adam. However, where applicable, it is recommended to use the optimizers provided in decoupled_weight_decay, since they improve on their PyTorch equivalents.
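As a minimal sketch (assuming composer and torch are installed; the model and hyperparameter values are illustrative only), the decoupled optimizers are constructed exactly like their torch.optim counterparts:

```python
import torch.nn as nn

from composer.optim import DecoupledAdamW, DecoupledSGDW

model = nn.Linear(16, 4)  # stand-in for any torch.nn.Module

# Adam variant whose weight decay update is applied separately from the adaptive step,
# so tuning the learning rate does not implicitly rescale the regularization strength.
adamw = DecoupledAdamW(model.parameters(), lr=1e-3, weight_decay=1e-5)

# SGD variant with the same decoupling of weight decay from the learning rate.
sgdw = DecoupledSGDW(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=1e-4)
```

Either optimizer can then be passed to the Trainer through its optimizers argument, just as a stock PyTorch optimizer would be.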
PyTorch schedulers can be used with Composer, but this is explicitly discouraged. Instead, it is recommended to use schedulers based off of Composer's ComposerScheduler API, which allows more flexibility and configuration in writing schedulers.
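For example, a minimal sketch of constructing one of the schedulers listed below (the '1ep' warmup length is illustrative):

```python
from composer.optim import CosineAnnealingWithWarmupScheduler

# Warm up over the first epoch, then decay along a cosine curve for the rest of training.
# Durations are Composer Time strings, e.g. '1ep' is one epoch and '0.5dur' is half of training.
scheduler = CosineAnnealingWithWarmupScheduler(t_warmup='1ep')
```

Composer schedulers are passed to the Trainer via its schedulers argument; because they are stateless functions of the training State, the Trainer can step them every batch by default rather than once per epoch.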
Functions
| compile_composer_scheduler | Converts a stateless scheduler into a PyTorch scheduler object. |
Classes
| ComposerScheduler | Specification for a stateless scheduler function. |
| ConstantScheduler | Maintains a fixed learning rate. |
| ConstantWithWarmupScheduler | Maintains a fixed learning rate, with an initial warmup. |
| CosineAnnealingScheduler | Decays the learning rate according to the decreasing part of a cosine curve. |
| CosineAnnealingWarmRestartsScheduler | Cyclically decays the learning rate according to the decreasing part of a cosine curve. |
| CosineAnnealingWithWarmupScheduler | Decays the learning rate according to the decreasing part of a cosine curve, with an initial warmup. |
| DecoupledAdamW | Adam optimizer with the weight decay term decoupled from the learning rate. |
| DecoupledSGDW | SGD optimizer with the weight decay term decoupled from the learning rate. |
| ExponentialScheduler | Decays the learning rate exponentially. |
| LinearScheduler | Adjusts the learning rate linearly. |
| LinearWithWarmupScheduler | Adjusts the learning rate linearly, with an initial warmup. |
| MultiStepScheduler | Decays the learning rate discretely at fixed milestones. |
| MultiStepWithWarmupScheduler | Decays the learning rate discretely at fixed milestones, with an initial warmup. |
| PolynomialScheduler | Sets the learning rate to be proportional to a power of the fraction of training time left. |
| PolynomialWithWarmupScheduler | Decays the learning rate according to a power of the fraction of training time left, with an initial warmup. |
| StepScheduler | Decays the learning rate discretely at fixed intervals. |
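As a sketch of the ComposerScheduler specification listed above, the following hypothetical scheduler halves the learning rate multiplier after the midpoint of training; it assumes the (state, ssr) call signature and State.get_elapsed_duration(), and the schedule itself is purely illustrative:

```python
from composer.core import State


def halve_midway(state: State, ssr: float = 1.0) -> float:
    """Hypothetical stateless scheduler: returns the multiplier applied to the optimizer's base LR."""
    # `ssr` (scale schedule ratio) would normally stretch or shrink any time milestones;
    # this simple rule has none, so it goes unused here.
    frac_done = state.get_elapsed_duration().value  # fraction of max_duration completed, in [0, 1]
    return 1.0 if frac_done < 0.5 else 0.5
```

A stateless scheduler written this way can be handed to the Trainer directly, or converted into a standard PyTorch scheduler object with compile_composer_scheduler when one is required.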