composer.optim

Optimizers and learning rate schedulers.

Composer is compatible with optimizers based on PyTorch's native Optimizer API, including the common optimizers that ship with torch.optim. However, where applicable, it is recommended to use the optimizers provided in decoupled_weight_decay, since they improve on their PyTorch equivalents.
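
As a minimal sketch of this recommendation, the decoupled optimizers are intended as drop-in replacements for their torch.optim counterparts; the model and hyperparameter values below are illustrative placeholders, not recommended settings:

```python
import torch

from composer.optim import DecoupledAdamW

# A toy module standing in for your model's parameters.
model = torch.nn.Linear(16, 4)

# DecoupledAdamW mirrors torch.optim.AdamW's constructor arguments, but applies
# weight decay decoupled from the learning rate. Values here are illustrative.
optimizer = DecoupledAdamW(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.95),
    weight_decay=1e-5,
)
```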

PyTorch schedulers can be used with Composer, but this is explicitly discouraged. Instead, it is recommended to use schedulers based on Composer's ComposerScheduler API, which allows for more flexibility and easier configuration when writing schedulers.
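
To illustrate, here is a minimal sketch of a custom stateless scheduler written against the ComposerScheduler API. The halve_midway function is hypothetical, and the sketch assumes State.get_elapsed_duration() reports the fraction of training completed:

```python
from composer.core import State

def halve_midway(state: State, ssr: float = 1.0) -> float:
    """Hypothetical scheduler: returns a multiplier on the optimizer's base learning rate."""
    elapsed = state.get_elapsed_duration()  # fraction of training completed, in [0, 1]
    if elapsed is None or elapsed.value < 0.5:
        return 1.0  # full learning rate for the first half of training
    return 0.5  # halve the learning rate for the second half
```

Because the schedule is a pure function of the training state, it can be evaluated at any point in training; a callable like this can typically be passed to the Trainer's schedulers argument, or converted with compile_composer_scheduler when a PyTorch scheduler object is required.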

Functions

compile_composer_scheduler

Converts a stateless scheduler into a PyTorch scheduler object.

Classes

ComposerScheduler

Specification for a stateless scheduler function.

ConstantScheduler

Maintains a fixed learning rate.

ConstantWithWarmupScheduler

Maintains a fixed learning rate, with an initial warmup.

CosineAnnealingScheduler

Decays the learning rate according to the decreasing part of a cosine curve.

CosineAnnealingWarmRestartsScheduler

Cyclically decays the learning rate according to the decreasing part of a cosine curve.

CosineAnnealingWithWarmupScheduler

Decays the learning rate according to the decreasing part of a cosine curve, with an initial warmup.

DecoupledAdamW

Adam optimizer with the weight decay term decoupled from the learning rate.

DecoupledSGDW

SGD optimizer with the weight decay term decoupled from the learning rate.

ExponentialScheduler

Decays the learning rate exponentially.

LinearScheduler

Adjusts the learning rate linearly.

LinearWithWarmupScheduler

Adjusts the learning rate linearly, with an initial warmup.

MultiStepScheduler

Decays the learning rate discretely at fixed milestones.

MultiStepWithWarmupScheduler

Decays the learning rate discretely at fixed milestones, with an initial warmup.

PolynomialScheduler

Sets the learning rate to be proportional to a power of the fraction of training time left.

PolynomialWithWarmupScheduler

Decays the learning rate according to a power of the fraction of training time left, with an initial warmup.

StepScheduler

Decays the learning rate discretely at fixed intervals.
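
For reference, a short sketch of constructing two of the schedulers above; the time strings and hyperparameters are illustrative only:

```python
from composer.optim import CosineAnnealingWithWarmupScheduler, MultiStepScheduler

# Time strings use Composer's time units (e.g. "ep" for epochs, "ba" for batches);
# the values below are placeholders, not recommendations.
cosine = CosineAnnealingWithWarmupScheduler(t_warmup="5ep")
multistep = MultiStepScheduler(milestones=["30ep", "60ep", "80ep"], gamma=0.1)
```

Scheduler objects like these are typically passed to the Trainer via its schedulers argument alongside an optimizer.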