SelectiveBackprop

class composer.algorithms.SelectiveBackprop(start=0.5, end=0.9, keep=0.5, scale_factor=1.0, interrupt=2, input_key=0, target_key=1)

Selectively backpropagate gradients from a subset of each batch.

Based on Jiang et al. (2019), Selective Backprop (SB) prunes each minibatch according to the difficulty of its individual training examples and computes weight gradients only over the examples that are kept, which reduces iteration time and speeds up training.

The fraction of the minibatch that is kept for gradient computation is specified by the argument keep, where 0 <= keep <= 1.
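As a rough illustration of how keep sets the size of the retained subset, the sketch below ranks examples by per-example loss and keeps the int(keep * N) hardest ones. The helper name select_hardest and the deterministic top-k rule are assumptions made for illustration; the library's own selection may instead sample examples with a probability that grows with their loss.

```python
from typing import List, Sequence


def select_hardest(losses: Sequence[float], keep: float) -> List[int]:
    """Return the indices of the int(keep * N) highest-loss examples.

    Hypothetical helper: a simplified, deterministic stand-in for the
    selection step, not the library's implementation.
    """
    if not 0.0 <= keep <= 1.0:
        raise ValueError("keep must satisfy 0 <= keep <= 1")
    n_select = int(keep * len(losses))
    # Rank example indices from highest loss (hardest) to lowest loss (easiest).
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    # Return the kept indices in their original batch order.
    return sorted(ranked[:n_select])


# With keep=0.5, two of the four examples are kept for the gradient step.
kept = select_hardest([0.1, 2.0, 0.5, 1.5], keep=0.5)
```

Only the examples at the returned indices would then take part in the backward pass.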

To speed up SB's selection forward pass, the argument scale_factor can be used to spatially downsample input image tensors. The full-sized inputs will still be used for the weight gradient computation.
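To make the effect of scale_factor concrete, the sketch below downsamples a single-channel image stored as nested lists. The helper downsample_nearest is hypothetical, and nearest-neighbour sampling is used only to keep the example dependency-free; the interpolation mode the library actually applies to image tensors is not stated here and may differ.

```python
from typing import List

Image = List[List[float]]


def downsample_nearest(image: Image, scale_factor: float) -> Image:
    """Nearest-neighbour spatial downsampling of a 2D image (hypothetical helper)."""
    if not 0.0 < scale_factor <= 1.0:
        raise ValueError("scale_factor must satisfy 0 < scale_factor <= 1")
    height, width = len(image), len(image[0])
    out_h = max(1, int(height * scale_factor))
    out_w = max(1, int(width * scale_factor))
    # Each output pixel copies the source pixel at the corresponding scaled position.
    return [
        [image[int(r / scale_factor)][int(c / scale_factor)] for c in range(out_w)]
        for r in range(out_h)
    ]


full = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
# The 2x2 result would feed the selection pass only.
small = downsample_nearest(full, scale_factor=0.5)
```

The smaller image is used only to score examples; the selected examples are taken from the full-sized batch for the gradient computation.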

To preserve convergence, SB can be interrupted with vanilla minibatch gradient steps every interrupt steps. When interrupt=0, SB will be used at every step during the SB interval. When interrupt=2, SB will alternate with vanilla minibatch steps.
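The interaction between start, end, and interrupt can be sketched as a small predicate that decides, for each batch, whether SB is applied. The function name use_selective_backprop, the half-open interval [start, end), and the choice of which batch within each group of interrupt batches is the vanilla one are assumptions for illustration.

```python
def use_selective_backprop(
    elapsed: float,
    batch_idx: int,
    start: float = 0.5,
    end: float = 0.9,
    interrupt: int = 2,
) -> bool:
    """Decide whether SB applies to this batch (hypothetical sketch).

    elapsed is the fraction of training completed so far, in [0, 1].
    """
    # SB is only active inside the configured interval (end assumed exclusive).
    in_interval = start <= elapsed < end
    # interrupt=0 disables interruptions; otherwise every interrupt-th batch
    # is assumed to be a vanilla minibatch step.
    is_sb_step = interrupt == 0 or (batch_idx + 1) % interrupt != 0
    return in_interval and is_sb_step


# With interrupt=2, SB and vanilla steps alternate inside the interval.
schedule = [use_selective_backprop(0.6, i) for i in range(4)]
```

With the defaults, no batch uses SB during the first half of training, and batches inside the interval alternate between SB and vanilla steps.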

Parameters
  • start (float, optional) – SB interval start as fraction of training duration. Default: 0.5.

  • end (float, optional) – SB interval end as fraction of training duration. Default: 0.9.

  • keep (float, optional) – fraction of minibatch to select and keep for gradient computation. Default: 0.5.

  • scale_factor (float, optional) – scale for downsampling input for selection forward pass. Default: 1.0.

  • interrupt (int, optional) – interrupt SB with a vanilla minibatch step every interrupt batches. Default: 2.

  • input_key (str | int | Tuple[Callable, Callable] | Any, optional) – A key that indexes to the input from the batch. Can also be a pair of get and set functions, where the getter is assumed to be first in the pair. The default is 0, which corresponds to any sequence, where the first element is the input. Default: 0.

  • target_key (str | int | Tuple[Callable, Callable] | Any, optional) – A key that indexes to the target from the batch. Can also be a pair of get and set functions, where the getter is assumed to be first in the pair. The default is 1, which corresponds to any sequence, where the second element is the target. Default: 1.
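The two forms accepted by input_key and target_key (a plain key, or a getter/setter pair) can be illustrated with a small sketch. The helpers get_from_batch and set_in_batch are hypothetical, and the setter signature setter(batch, value) returning the updated batch is an assumption, not a documented contract.

```python
from typing import Any


def _is_accessor_pair(key: Any) -> bool:
    """True if key is a (getter, setter) pair of callables."""
    return isinstance(key, tuple) and len(key) == 2 and all(callable(k) for k in key)


def get_from_batch(batch: Any, key: Any) -> Any:
    """Read a value from a batch by plain key or getter (hypothetical helper)."""
    if _is_accessor_pair(key):
        getter, _ = key  # the getter comes first in the pair
        return getter(batch)
    return batch[key]


def set_in_batch(batch: Any, key: Any, value: Any) -> Any:
    """Return the batch with the value at key replaced (hypothetical helper)."""
    if _is_accessor_pair(key):
        _, setter = key
        return setter(batch, value)  # assumed signature: setter(batch, value)
    if isinstance(batch, tuple):
        # Tuples are immutable, so build a new one around the replaced element.
        return batch[:key] + (value,) + batch[key + 1:]
    batch[key] = value
    return batch


# Default keys: element 0 is the input, element 1 is the target.
inputs = get_from_batch(("images", "labels"), 0)

# A getter/setter pair for a nested batch layout.
nested_key = (
    lambda b: b["data"]["x"],
    lambda b, v: {**b, "data": {**b["data"], "x": v}},
)
nested_input = get_from_batch({"data": {"x": 1, "y": 2}}, nested_key)
```

A getter/setter pair is useful when the input or target is nested too deeply for a single index or dictionary key to reach.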

Example

from composer import Trainer
from composer.algorithms import SelectiveBackprop

# model, train_dataloader, eval_dataloader, and optimizer are assumed to be defined elsewhere.
algorithm = SelectiveBackprop(start=0.5, end=0.9, keep=0.5)
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration="1ep",
    algorithms=[algorithm],
    optimizers=[optimizer],
)