composer.distributed

Utilities for distributed training (DDP, DeepSpeed, FSDP, and tensor parallelism).

Functions

ddp_sync_context

A context manager for handling the DDPSyncStrategy.

fix_batch_precision_for_deepspeed

Ensures that a batch is properly formatted for the active DeepSpeed precision, if DeepSpeed is enabled.
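The casting rule can be sketched in plain Python: floating-point leaves of a batch are cast to the low-precision type, while integer leaves (labels, token ids) are left untouched. This is a conceptual sketch under assumed behavior, not Composer's implementation; `to_fp16` and `cast_floats` are hypothetical names, and the real helper operates on torch tensors.

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE-754 half precision
    # (struct format 'e') to emulate an fp16 cast.
    return struct.unpack('e', struct.pack('e', x))[0]

def cast_floats(batch, cast=to_fp16):
    # Recursively cast floating-point leaves of a (possibly nested)
    # batch; integer leaves such as labels pass through unchanged.
    if isinstance(batch, dict):
        return {k: cast_floats(v, cast) for k, v in batch.items()}
    if isinstance(batch, (list, tuple)):
        return type(batch)(cast_floats(v, cast) for v in batch)
    if isinstance(batch, float):
        return cast(batch)
    return batch

batch = {"pixels": [0.1, 0.25], "labels": [0, 1]}
out = cast_floats(batch)
```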

parse_deepspeed_config

Parses the provided DeepSpeed config for compatibility with the Mosaic trainer.
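For reference, a minimal DeepSpeed config of the kind this function parses. The keys below are standard DeepSpeed configuration fields; which fields the Composer trainer fills in, overrides, or validates is up to its implementation.

```python
deepspeed_config = {
    "train_batch_size": 256,
    "gradient_accumulation_steps": 4,
    "fp16": {"enabled": True},          # mixed-precision training
    "zero_optimization": {"stage": 2},  # ZeRO stage 2: shard optimizer state and gradients
}
```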

prepare_ddp_module

Wraps the module in a torch.nn.parallel.DistributedDataParallel object if running distributed training.
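What the DDP wrapper buys you is automatic gradient synchronization: after backward, each parameter's gradient is all-reduced (averaged) across ranks so every replica takes an identical optimizer step. A framework-free sketch of that reduction (`allreduce_average` is a hypothetical name; real DDP uses NCCL/Gloo collectives on GPU tensors):

```python
def allreduce_average(per_rank_grads):
    # Average each parameter's gradient across ranks -- the effect of
    # DDP's all-reduce at the end of the backward pass.
    world_size = len(per_rank_grads)
    n_params = len(per_rank_grads[0])
    return [
        sum(rank[i] for rank in per_rank_grads) / world_size
        for i in range(n_params)
    ]

# Two simulated ranks, each holding gradients for two parameters.
grads = [[1.0, 4.0], [3.0, 0.0]]
averaged = allreduce_average(grads)
```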

prepare_fsdp_module

Prepare a module (assumed ComposerModel) and optimizer for use with torch.distributed.fsdp.FullyShardedDataParallel.
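The core idea behind FSDP can be sketched without torch: each rank stores only a shard of the parameters and all-gathers the full set just-in-time for compute, then frees it. A toy sketch under that assumption (hypothetical helper names; real FSDP shards flattened tensors, not Python lists):

```python
def shard_params(params, world_size):
    # Split a flat parameter list into per-rank shards (interleaved);
    # each rank keeps only its own shard in memory.
    return [params[r::world_size] for r in range(world_size)]

def all_gather(shards):
    # Reassemble the full parameter list from the interleaved shards,
    # as FSDP does just-in-time before a layer's forward/backward.
    world_size = len(shards)
    full = [None] * sum(len(s) for s in shards)
    for r, shard in enumerate(shards):
        full[r::world_size] = shard
    return full

shards = shard_params([1, 2, 3, 4, 5], world_size=2)
```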

prepare_tp_module

Prepare a module (assumed ComposerModel) for use with tensor parallelism.
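Tensor parallelism splits individual layers across ranks rather than replicating the whole model. A framework-free sketch of the simplest split, partitioning a linear layer's output dimension so each rank computes a slice of the result (hypothetical names; real tensor parallelism runs all-gather/all-reduce collectives on GPU tensors):

```python
def matvec(W, x):
    # Full (unsharded) matrix-vector product: y_i = sum_j W[i][j] * x[j].
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def row_sharded_matvec(W, x, world_size):
    # Each rank holds a contiguous block of W's rows (output features),
    # computes its partial result, and the slices are concatenated --
    # an all-gather in a real implementation.
    n = len(W)
    chunk = (n + world_size - 1) // world_size
    partials = [matvec(W[r * chunk:(r + 1) * chunk], x) for r in range(world_size)]
    return [y for p in partials for y in p]

W = [[1, 0], [0, 1], [1, 1]]
x = [2, 3]
```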

Classes

DDPSyncStrategy

How and when gradient synchronization should happen.
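With gradient accumulation, synchronizing on every microbatch wastes communication; a common strategy is to sync only on the final microbatch of each batch. A toy sketch of how a sync strategy plus a context manager can gate that decision -- the enum members and logic here are illustrative assumptions, not Composer's exact definitions:

```python
from contextlib import contextmanager
from enum import Enum

class SyncStrategy(Enum):
    SINGLE_AUTO_SYNC = "single_auto_sync"  # sync only on the final microbatch
    MULTI_AUTO_SYNC = "multi_auto_sync"    # sync on every microbatch

@contextmanager
def sync_context(strategy, is_final_microbatch, log):
    # Record whether gradients would synchronize during this backward pass.
    should_sync = (strategy is SyncStrategy.MULTI_AUTO_SYNC
                   or is_final_microbatch)
    log.append(should_sync)
    yield should_sync

log = []
for final in (False, False, True):  # three microbatches per batch
    with sync_context(SyncStrategy.SINGLE_AUTO_SYNC, final, log):
        pass  # forward/backward would run here
```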