composer.distributed.prepare_ddp_module(module, find_unused_parameters)[source]#

Wraps the module in a torch.nn.parallel.DistributedDataParallel object if running distributed training.

Parameters
  • module (Module) – The module to wrap.

  • find_unused_parameters (bool) – Whether to traverse the autograd graph to find parameters that will not receive gradients. This is useful when some parameters in the model are not being trained.
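The conditional-wrap behavior described above can be sketched without torch. This is an illustrative stand-in, not Composer's actual implementation: `DDPWrapper` and the `world_size` parameter are hypothetical placeholders for `torch.nn.parallel.DistributedDataParallel` and the distributed process-group state.

```python
class DDPWrapper:
    """Stand-in for torch.nn.parallel.DistributedDataParallel (illustrative only)."""
    def __init__(self, module, find_unused_parameters=False):
        self.module = module
        self.find_unused_parameters = find_unused_parameters


def prepare_ddp_module(module, find_unused_parameters, world_size=1):
    """Sketch: wrap the module only when distributed training is active.

    `world_size` is a hypothetical argument standing in for the real
    distributed runtime state that Composer consults internally.
    """
    if world_size > 1:
        # Multiple processes: wrap so gradients are synchronized across ranks.
        return DDPWrapper(module, find_unused_parameters=find_unused_parameters)
    # Single process: return the module unchanged.
    return module


model = object()
# Not distributed: the original module comes back untouched.
assert prepare_ddp_module(model, False, world_size=1) is model
# Distributed: the module is wrapped, and the flag is forwarded.
wrapped = prepare_ddp_module(model, True, world_size=2)
assert isinstance(wrapped, DDPWrapper) and wrapped.find_unused_parameters
```

In real usage the wrapping decision comes from the distributed runtime (e.g. whether a process group is initialized), not from an explicit argument; the sketch only shows the shape of the control flow.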