FSDP2Config
- class composer.utils.FSDP2Config(device_mesh=None, reshard_after_forward=True, activation_checkpointing=False, activation_cpu_offload=False, verbose=False)
Configuration for Fully Sharded Data Parallelism (FSDP2).
- Parameters
device_mesh (Optional[DeviceMesh]) – The DeviceMesh for sharding. If None, a default 1D mesh is created. For a 1D mesh, parameters are fully sharded across the mesh (FSDP). For a 2D mesh, parameters are sharded across the 1st dimension and replicated across the 0th dimension (HSDP).
reshard_after_forward (Union[bool, int]) – Controls whether and how parameters are resharded after the forward pass.
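The 1D versus 2D mesh distinction can be illustrated without any distributed setup. The following is a minimal plain-Python sketch, not Composer or PyTorch code: `mesh_groups` is a hypothetical helper, and the row-major rank layout is an assumption made for illustration. It shows which ranks would hold different shards of one model copy (dimension 1) and which ranks would hold replicas of the same shard (dimension 0).

```python
# Illustration only: a plain-Python model of a 2D mesh layout.
# `mesh_groups` is a hypothetical helper; it makes no torch or composer calls.

def mesh_groups(world_size: int, shard_size: int):
    """Return (shard_groups, replica_groups) for a row-major 2D mesh.

    Dimension 0 (rows) is treated as the replication dimension and
    dimension 1 (columns) as the sharding dimension, matching the HSDP
    description above. The row-major rank layout is an assumption.
    """
    if world_size % shard_size != 0:
        raise ValueError("world_size must be divisible by shard_size")
    num_replicas = world_size // shard_size
    # mesh[r][c] is the global rank at replica index r, shard index c.
    mesh = [[r * shard_size + c for c in range(shard_size)] for r in range(num_replicas)]
    # Ranks in the same row each hold a different shard of one model copy.
    shard_groups = [list(row) for row in mesh]
    # Ranks in the same column hold identical copies of the same shard.
    replica_groups = [[mesh[r][c] for r in range(num_replicas)] for c in range(shard_size)]
    return shard_groups, replica_groups


shard_groups, replica_groups = mesh_groups(world_size=8, shard_size=4)
print(shard_groups)    # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(replica_groups)  # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

With `shard_size` equal to `world_size`, the sketch degenerates to a single shard group and one-rank replica groups, which corresponds to the fully sharded 1D case.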
- classmethod from_compatible_attrs(attrs)
Create an FSDP2Config by filtering FSDP2-compatible attributes from the given attrs.
Only attributes that are valid for FSDP2Config are used, and a warning is issued for each attribute that cannot be transferred. The method therefore accepts both FSDP1 and FSDP2 attributes; its main use case is backward compatibility with FSDP1 configurations.
- Parameters
attrs (dict[str, Any]) – Dictionary of FSDP1/2 configuration attributes.
- Returns
FSDP2Config – A new FSDP2Config instance with compatible attributes.
Warning
- UserWarning: If an attribute in the input dictionary is not a settable attribute of FSDP2Config. Such attributes are ignored.
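The filter-and-warn behavior described above can be sketched with a stand-in class. This is not Composer's implementation: `SketchFSDP2Config` is a hypothetical dataclass that only mirrors the documented constructor signature, the warning text is invented for illustration, and `sharding_strategy` is used as an example of a key assumed to exist only in FSDP1-style configurations.

```python
import warnings
from dataclasses import dataclass, fields
from typing import Any, Optional, Union


@dataclass
class SketchFSDP2Config:
    """Hypothetical stand-in that mirrors the documented constructor signature."""

    device_mesh: Optional[Any] = None
    reshard_after_forward: Union[bool, int] = True
    activation_checkpointing: bool = False
    activation_cpu_offload: bool = False
    verbose: bool = False

    @classmethod
    def from_compatible_attrs(cls, attrs: dict[str, Any]) -> "SketchFSDP2Config":
        # Keep only keys that correspond to fields of this config.
        valid = {f.name for f in fields(cls)}
        kept = {k: v for k, v in attrs.items() if k in valid}
        # Warn once per key that cannot be transferred (message text is illustrative).
        for name in sorted(attrs.keys() - valid):
            warnings.warn(
                f"Attribute '{name}' is not a settable attribute and will be ignored.",
                UserWarning,
            )
        return cls(**kept)


# 'sharding_strategy' stands in for an FSDP1-only key (an assumption for this sketch).
legacy = {"activation_checkpointing": True, "verbose": True, "sharding_strategy": "FULL_SHARD"}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config = SketchFSDP2Config.from_compatible_attrs(legacy)

print(config.activation_checkpointing, config.verbose)
print([str(w.message) for w in caught])
```

With the real class, the equivalent call would presumably be `FSDP2Config.from_compatible_attrs(legacy)`; wrapping it in `warnings.catch_warnings` as above is one way to inspect which legacy keys were dropped.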