GradientClipping

class composer.algorithms.GradientClipping(clipping_type, clipping_threshold)[source]

Clips all gradients in the model according to the specified clipping_type.

Runs on Event.AFTER_TRAIN_BATCH.

Example

from composer.algorithms import GradientClipping
from composer.trainer import Trainer
gc = GradientClipping(clipping_type='norm', clipping_threshold=0.1)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[gc],
    optimizers=[optimizer]
)
Parameters
  • clipping_type ('adaptive', 'norm', 'value') – Which type of gradient clipping to perform. 'norm' clips the gradient norm using torch.nn.utils.clip_grad_norm_; 'value' clips each gradient element to a specified value using torch.nn.utils.clip_grad_value_; 'adaptive' clips each parameter's gradients based on the ratio of its gradient norm to its parameter norm, using composer.algorithms.gradient_clipping.gradient_clipping._apply_agc.

  • clipping_threshold (float) – The clipping threshold. For 'value', each gradient element is clipped to the range [-clipping_threshold, clipping_threshold]; for 'norm', the gradient norm is clipped to clipping_threshold; for 'adaptive', any parameter whose ratio grad_norm / weight_norm exceeds clipping_threshold has its gradients rescaled by clipping_threshold * (weight_norm / grad_norm).
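For illustration only, the three modes correspond roughly to the standalone PyTorch sketch below. The 'norm' and 'value' branches call the torch utilities named above; the 'adaptive' branch is a simplified version of adaptive gradient clipping that uses whole-tensor norms, whereas the library's internal _apply_agc operates on finer-grained norms. The function name clip_gradients and the epsilon guard are illustrative, not part of the composer API.

import torch

def clip_gradients(model: torch.nn.Module, clipping_type: str, threshold: float) -> None:
    # Illustrative sketch, not composer's internal implementation.
    if clipping_type == 'norm':
        # Rescale all gradients so their combined L2 norm is at most `threshold`.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=threshold)
    elif clipping_type == 'value':
        # Clamp every gradient element to [-threshold, threshold].
        torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=threshold)
    elif clipping_type == 'adaptive':
        # Rescale each parameter's gradient when grad_norm / weight_norm
        # exceeds `threshold` (simplified: whole-tensor norms, small eps guard).
        for p in model.parameters():
            if p.grad is None:
                continue
            weight_norm = p.detach().norm().clamp(min=1e-6)
            grad_norm = p.grad.detach().norm()
            max_grad_norm = threshold * weight_norm
            if grad_norm > max_grad_norm:
                p.grad.detach().mul_(max_grad_norm / grad_norm)
    else:
        raise ValueError(f"Unknown clipping_type: {clipping_type}")

In a manual training loop, such a function would be called after loss.backward() and before optimizer.step(); the algorithm performs the equivalent clipping automatically on each batch.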

Raises
  • NotImplementedError – if deepspeed is enabled and clipping_type is not 'norm'.

  • ValueError – if deepspeed is enabled and clipping_threshold is not greater than zero.