GradientClipping
- class composer.algorithms.GradientClipping(clipping_type, clipping_threshold)
Clips all gradients in the model based on the specified clipping_type.
Runs on Event.AFTER_TRAIN_BATCH.

Example
from composer.algorithms import GradientClipping
from composer.trainer import Trainer

gc = GradientClipping(clipping_type='norm', clipping_threshold=0.1)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[gc],
    optimizers=[optimizer],
)
- Parameters
clipping_type ('adaptive', 'norm', 'value') – String denoting which type of gradient clipping to do. The options are: 'norm', which clips the gradient norm using torch.nn.utils.clip_grad_norm_; 'value', which clips each gradient element at a specified value using torch.nn.utils.clip_grad_value_; and 'adaptive', which clips gradients based on the ratio of gradient norm to parameter norm using composer.algorithms.gradient_clipping.gradient_clipping._apply_agc. See the sketch after this parameter list.
clipping_threshold (float, optional) – The value to clip each gradient element to (for 'value'), the maximum allowed gradient norm (for 'norm'), or the maximum allowed ratio of gradient norm to parameter norm, above which gradients are rescaled by clipping_threshold * (weight_norm / grad_norm) (for 'adaptive').
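To make the three modes concrete, here is a minimal plain-PyTorch sketch of what each clipping_type does. The clip_gradients helper is hypothetical and not part of Composer's API, and the 'adaptive' branch uses simple per-tensor norms as a simplification; Composer's internal _apply_agc may compute norms differently (e.g., unit-wise).

import torch

def clip_gradients(parameters, clipping_type: str, clipping_threshold: float):
    # Hypothetical helper sketching the three clipping modes.
    params = [p for p in parameters if p.grad is not None]
    if clipping_type == 'norm':
        # Rescale all gradients so their global L2 norm is at most the threshold.
        torch.nn.utils.clip_grad_norm_(params, max_norm=clipping_threshold)
    elif clipping_type == 'value':
        # Clamp each gradient element into [-threshold, +threshold].
        torch.nn.utils.clip_grad_value_(params, clip_value=clipping_threshold)
    elif clipping_type == 'adaptive':
        for p in params:
            grad_norm = p.grad.norm()
            weight_norm = p.detach().norm().clamp(min=1e-6)  # avoid division by zero
            # If the gradient is too large relative to the weight, rescale it.
            if grad_norm / weight_norm > clipping_threshold:
                p.grad.mul_(clipping_threshold * weight_norm / grad_norm)
    else:
        raise ValueError(
            f"clipping_type must be 'adaptive', 'norm', or 'value', not {clipping_type}")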
- Raises
NotImplementedError – if DeepSpeed is enabled and clipping_type is not 'norm'.
ValueError – if clipping_type is not one of 'adaptive', 'norm', or 'value'.
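As a quick usage check of the hypothetical clip_gradients sketch above (again, not Composer's API), the ValueError path can be exercised directly:

import torch

model = torch.nn.Linear(4, 2)
model(torch.randn(8, 4)).sum().backward()

# A valid mode: rescales gradients so their global norm is at most 0.1.
clip_gradients(model.parameters(), clipping_type='norm', clipping_threshold=0.1)

# An unrecognized mode raises ValueError, mirroring the documented behavior.
try:
    clip_gradients(model.parameters(), clipping_type='clamp', clipping_threshold=0.1)
except ValueError as err:
    print(err)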