composer.callbacks.speed_monitor#
Monitor throughput during training.
Classes

SpeedMonitor – Logs the training throughput.
- class composer.callbacks.speed_monitor.SpeedMonitor(window_size=100)[source]#
Bases:
composer.core.callback.Callback
Logs the training throughput.
The training throughput, in samples per second, is logged on the
BATCH_END
event once window_size batches have been seen. The per-epoch average throughput and the wall-clock training time are also logged on the EPOCH_END
event.

Example
>>> from composer.callbacks import SpeedMonitor
>>> # constructing trainer object with this callback
>>> trainer = Trainer(
...     model=model,
...     train_dataloader=train_dataloader,
...     eval_dataloader=eval_dataloader,
...     optimizers=optimizer,
...     max_duration="1ep",
...     callbacks=[SpeedMonitor(window_size=100)],
... )
The training throughput is logged by the
Logger
to the following keys, as described below.

Key | Logged data
throughput/step | Rolling average (over the window_size most recent batches) of the number of samples processed per second
throughput/epoch | Number of samples processed per second (averaged over an entire epoch)
wall_clock_train | Total elapsed training time
- Parameters
  window_size (int, optional) – Number of batches to use for a rolling average of throughput. Defaults to 100.
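To illustrate what a window_size-batch rolling average of throughput means, here is a minimal, hypothetical sketch (not Composer's actual implementation): batches are accumulated in a fixed-size window, and a samples-per-second figure is produced only once the window is full, mirroring the threshold behavior described above.

```python
from collections import deque


class RollingThroughput:
    """Hypothetical sketch of the rolling average that SpeedMonitor
    logs as ``throughput/step`` (names and structure are illustrative)."""

    def __init__(self, window_size: int = 100):
        # (num_samples, seconds) pairs for the most recent batches only.
        self.window = deque(maxlen=window_size)

    def batch_end(self, num_samples: int, batch_seconds: float):
        """Record a finished batch; return samples/sec once the window is full."""
        self.window.append((num_samples, batch_seconds))
        if len(self.window) < self.window.maxlen:
            return None  # below the window_size threshold: nothing logged yet
        total_samples = sum(n for n, _ in self.window)
        total_seconds = sum(t for _, t in self.window)
        return total_samples / total_seconds


monitor = RollingThroughput(window_size=3)
print(monitor.batch_end(32, 0.5))  # None: window not yet full
print(monitor.batch_end(32, 0.5))  # None
print(monitor.batch_end(32, 0.5))  # 64.0 samples/sec over the window
```

Because the deque has a fixed maxlen, the oldest batch falls out as each new one arrives, so the reported figure always reflects only the most recent window_size batches.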
- load_state_dict(state)[source]#
Restores the state of SpeedMonitor object.
- Parameters
state (Dict[str, Any]) – The state of the object, as previously returned by
state_dict()
- state_dict()[source]#
Returns a dictionary representing the internal state of the SpeedMonitor object.
The returned dictionary is pickle-able via
torch.save().

- Returns
  Dict[str, Any] – The state of the SpeedMonitor object
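The state_dict()/load_state_dict(state) pair exists so the monitor's rolling window survives a checkpoint. The stand-in class below is a hypothetical sketch of that round trip, not Composer's real SpeedMonitor (which tracks additional fields); it shows why the state is kept as plain containers so the dictionary stays pickle-able (e.g. via torch.save()).

```python
from collections import deque
from typing import Any, Dict


class MiniSpeedMonitor:
    """Hypothetical stand-in demonstrating the state_dict()/load_state_dict()
    round trip; the real SpeedMonitor stores more than this."""

    def __init__(self, window_size: int = 100):
        self.window_size = window_size
        # Sample counts of the most recent batches (the rolling window).
        self.batch_num_samples = deque(maxlen=window_size)

    def state_dict(self) -> Dict[str, Any]:
        # Plain list/dict only, so the result pickles cleanly.
        return {"batch_num_samples": list(self.batch_num_samples)}

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        # Rebuild the bounded deque from the saved list.
        self.batch_num_samples = deque(state["batch_num_samples"],
                                       maxlen=self.window_size)


monitor = MiniSpeedMonitor(window_size=4)
for n in (32, 32, 16):
    monitor.batch_num_samples.append(n)

restored = MiniSpeedMonitor(window_size=4)
restored.load_state_dict(monitor.state_dict())
print(list(restored.batch_num_samples))  # [32, 32, 16]
```

After restoring, the new instance resumes its rolling average from exactly where the checkpointed one left off.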