FusedLayerNorm#

class composer.algorithms.FusedLayerNorm[source]#

Replaces all instances of torch.nn.LayerNorm with apex.normalization.fused_layer_norm.FusedLayerNorm.

By fusing multiple kernel launches into one, this usually improves GPU utilization.

Runs on Event.INIT, so it can replace all instances of torch.nn.LayerNorm before the model is wrapped with DDP. Has no hyperparameters.
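The replacement itself is a form of module surgery: walk the model's module tree and swap each matching layer for its fused counterpart. The sketch below illustrates the idea with plain stand-in classes; the class names and the `replace_layernorms` helper are illustrative only, not Composer's actual implementation (which operates on real torch.nn modules).

```python
# Stand-in classes mimicking a module tree; these are NOT the real
# torch.nn.LayerNorm or apex FusedLayerNorm, just illustrations.
class LayerNorm:
    def __init__(self, normalized_shape):
        self.normalized_shape = normalized_shape

class FusedLayerNorm:
    def __init__(self, normalized_shape):
        self.normalized_shape = normalized_shape

class Block:
    def __init__(self):
        self.norm = LayerNorm(128)

class Model:
    def __init__(self):
        self.blocks = [Block(), Block()]
        self.final_norm = LayerNorm(128)

def replace_layernorms(obj):
    """Recursively swap every LayerNorm attribute for a FusedLayerNorm,
    preserving the original normalized_shape."""
    for name, value in vars(obj).items():
        if isinstance(value, LayerNorm):
            setattr(obj, name, FusedLayerNorm(value.normalized_shape))
        elif isinstance(value, list):
            for item in value:
                replace_layernorms(item)
        elif hasattr(value, "__dict__"):
            replace_layernorms(value)

model = Model()
replace_layernorms(model)
print(type(model.final_norm).__name__)  # FusedLayerNorm
```

Doing this on Event.INIT matters because the swap must happen before DDP records the model's parameters; replacing modules afterward would leave DDP holding references to the old, unfused parameters.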

Example

from composer import Trainer
from composer.algorithms import FusedLayerNorm

algorithm = FusedLayerNorm()
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    max_duration="1ep",
    algorithms=[algorithm],
    optimizers=[optimizer]
)