# composer.algorithms.agc.agc

Core adaptive gradient clipping classes and functions.

Functions

 apply_agc – Clips all gradients in the model based on the ratio of gradient norms to parameter norms.

Classes

 AGC – Clips all gradients in the model based on the ratio of gradient norms to parameter norms.

class composer.algorithms.agc.agc.AGC(clipping_threshold=0.01)

Clips all gradients in the model based on the ratio of gradient norms to parameter norms.

From <https://arxiv.org/abs/2102.06171>. Computes the norm of the weights and the norm of their corresponding gradients. Any gradient whose norm is greater than weight_norm * clipping_threshold is scaled by (weight_norm / grad_norm) * clipping_threshold, which brings its norm down to that limit; all other gradients are left unchanged. Norms are taken across rows for weight matrices in MLPs, across entire filters/kernels for CNNs (channel and spatial dimensions), and across the whole vector for biases.
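The rule above can be illustrated without any framework. The following is a minimal sketch in plain Python, not Composer's implementation: the helpers l2_norm, clip_unit, and clip_matrix_rows are hypothetical names, weights are nested lists rather than tensors, and only the row-wise (MLP) case is shown.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat sequence of floats."""
    return math.sqrt(sum(v * v for v in values))


def clip_unit(weights, grads, clipping_threshold=0.01):
    """Clip one unit (a matrix row, a flattened filter, or a bias vector).

    Returns a new gradient list; the inputs are left unchanged.
    """
    weight_norm = l2_norm(weights)
    grad_norm = l2_norm(grads)
    max_norm = weight_norm * clipping_threshold
    if grad_norm > max_norm:
        # Same factor as (weight_norm / grad_norm) * clipping_threshold.
        scale = max_norm / grad_norm
        return [g * scale for g in grads]
    return list(grads)


def clip_matrix_rows(weight, grad, clipping_threshold=0.01):
    """Apply the rule row by row, as described for MLP weight matrices."""
    return [
        clip_unit(w_row, g_row, clipping_threshold)
        for w_row, g_row in zip(weight, grad)
    ]


weight = [[3.0, 4.0],      # row norm 5.0, so the gradient limit is about 0.05
          [0.6, 0.8]]      # row norm about 1.0, so the limit is about 0.01
grad = [[0.003, 0.004],    # norm about 0.005: below the limit, left unchanged
        [0.3, 0.4]]        # norm about 0.5: above the limit, rescaled
clipped = clip_matrix_rows(weight, grad)
```

Because each row is rescaled by a single factor, clipping changes the length of a row's gradient but not its direction. Note that in this sketch a row whose weights are all zero has a limit of zero, so any nonzero gradient for it is scaled to zero.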

Runs on Event.AFTER_TRAIN_BATCH.

Example

from composer.algorithms import AGC
from composer.trainer import Trainer

# `model` and `optimizer` are assumed to be defined elsewhere.
agc_algorithm = AGC()
trainer = Trainer(
    model=model,
    max_duration="1ep",
    algorithms=[agc_algorithm],
    optimizers=[optimizer],
)

Parameters

clipping_threshold (float, optional) – The largest ratio of gradient norm to parameter norm allowed before clipping is applied. Default: 0.01.
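To make the effect of clipping_threshold concrete, the sketch below computes the factor a gradient would be multiplied by for a few threshold values. It works on norms only and is an illustration of the rule described above; agc_scale is a hypothetical helper, not part of Composer.

```python
def agc_scale(weight_norm, grad_norm, clipping_threshold=0.01):
    """Factor applied to a gradient with the given norms (1.0 means no clipping)."""
    max_norm = weight_norm * clipping_threshold
    if grad_norm > max_norm:
        return max_norm / grad_norm
    return 1.0


# Gradient-to-weight norm ratio is 0.1 / 2.0 = 0.05.
weight_norm, grad_norm = 2.0, 0.1
scales = {t: agc_scale(weight_norm, grad_norm, t) for t in (0.01, 0.02, 0.1)}
# Thresholds below the 0.05 ratio shrink the gradient (factors near 0.2 and 0.4);
# the 0.1 threshold is above the ratio, so the gradient passes through unscaled.
```

A smaller threshold therefore clips more aggressively, while a threshold above the observed ratio leaves the gradient untouched.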

apply(event, state, logger)

Clips the gradients of the model being trained.

match(event, state)

Runs on Event.AFTER_TRAIN_BATCH.
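The division of labour between match and apply can be sketched as follows. Everything in this block is a stand-in written for illustration: the Event enum models only two members, and ToyClippingAlgorithm and run_event are hypothetical names rather than Composer classes or functions.

```python
from enum import Enum


class Event(Enum):
    """Stand-in for the framework's event enum; only two members are modelled."""
    AFTER_TRAIN_BATCH = "after_train_batch"
    BATCH_END = "batch_end"


class ToyClippingAlgorithm:
    """Hypothetical algorithm that follows the match/apply split."""

    def __init__(self, clipping_threshold=0.01):
        self.clipping_threshold = clipping_threshold
        self.calls = 0

    def match(self, event, state):
        # Decide whether this algorithm should run for the given event.
        return event == Event.AFTER_TRAIN_BATCH

    def apply(self, event, state, logger):
        # A real implementation would clip the model's gradients here;
        # this sketch only counts how often it is invoked.
        self.calls += 1


def run_event(algorithm, event, state=None, logger=None):
    """Hypothetical driver: call apply only when match returns True."""
    if algorithm.match(event, state):
        algorithm.apply(event, state, logger)


algo = ToyClippingAlgorithm()
for event in (Event.AFTER_TRAIN_BATCH, Event.BATCH_END, Event.AFTER_TRAIN_BATCH):
    run_event(algo, event)
```

In this sketch apply runs for the two AFTER_TRAIN_BATCH events and is skipped for BATCH_END.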

composer.algorithms.agc.agc.apply_agc(model, clipping_threshold=0.01)

Clips all gradients in the model based on the ratio of gradient norms to parameter norms.

Example

import composer.functional as cf

cf.apply_agc(model=model)

Parameters
• model (Module) – The model being trained.

• clipping_threshold (float, optional) – The largest ratio of gradient norm to parameter norm allowed before clipping is applied. Default: 0.01.