# composer.algorithms.mixup.mixup

Core MixUp classes and functions.

Functions

• mixup_batch – Create new samples using convex combinations of pairs of samples.

Classes

• MixUp – MixUp trains the network on convex batch combinations.

class composer.algorithms.mixup.mixup.MixUp(alpha=0.2, interpolate_loss=False, input_key=0, target_key=1)

MixUp trains the network on convex batch combinations.

The algorithm forms a convex combination of a given batch X with a randomly permuted copy of X, applied to both the examples and their targets. The mixing coefficient is drawn from a Beta(alpha, alpha) distribution.

Training in this fashion sometimes reduces generalization error.
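
For intuition, a minimal sketch of the core operation (illustrative only; mixup_sketch is a hypothetical helper, not Composer's implementation, which is exposed as mixup_batch below):

```python
import torch

def mixup_sketch(X: torch.Tensor, alpha: float = 0.2):
    # Draw one mixing coefficient for the whole batch from Beta(alpha, alpha).
    mixing = torch.distributions.Beta(alpha, alpha).sample().item()
    # Randomly permute the batch along the sample axis (dim=0).
    perm = torch.randperm(X.shape[0])
    # Convex combination of the batch with its permuted copy.
    return mixing * X + (1 - mixing) * X[perm], perm, mixing
```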

Parameters
• alpha (float, optional) – The pseudocount for the Beta distribution used to sample mixing parameters. As alpha grows, the two samples in each pair tend to be weighted more equally. As alpha approaches 0 from above, the combination approaches using only one element of the pair. Default: 0.2.

• interpolate_loss (bool, optional) – Interpolates the loss rather than the labels. A useful trick when using a cross entropy loss. Produces incorrect behavior if the loss is not a linear function of the targets; see the sketch after this list. Default: False.

• input_key (str | int | Tuple[Callable, Callable] | Any, optional) – A key that indexes to the input from the batch. Can also be a pair of get and set functions, where the getter is assumed to be first in the pair. The default of 0 works for any batch that is a sequence whose first element is the input. Default: 0.

• target_key (str | int | Tuple[Callable, Callable] | Any, optional) – A key that indexes to the target from the batch. Can also be a pair of get and set functions, where the getter is assumed to be first in the pair. The default of 1 works for any batch that is a sequence whose second element is the target. Default: 1.
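
As a rough illustration of the interpolate_loss=True behavior, a minimal sketch (interpolated_loss and its arguments pred, y, y_perm, and mixing are hypothetical names):

```python
import torch.nn.functional as F

def interpolated_loss(pred, y, y_perm, mixing):
    # Interpolate the loss values rather than the labels. This matches
    # label interpolation only when the loss is linear in the targets,
    # as cross entropy is.
    return mixing * F.cross_entropy(pred, y) + (1 - mixing) * F.cross_entropy(pred, y_perm)
```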

Example

```python
from composer import Trainer
from composer.algorithms import MixUp

# Assumes `model` and `optimizer` are defined elsewhere.
algorithm = MixUp(alpha=0.2)
trainer = Trainer(
    model=model,
    max_duration="1ep",
    algorithms=[algorithm],
    optimizers=[optimizer],
)
```
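
The same transformation is also available as the standalone function mixup_batch, documented below and importable from composer.functional.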

composer.algorithms.mixup.mixup.mixup_batch(input, target, mixing=None, alpha=0.2, indices=None)

Create new samples using convex combinations of pairs of samples.

This is done by taking a convex combination of input with a randomly permuted copy of input. The permutation takes place along the sample axis (dim=0).

The relative weight of the original input versus the permuted copy is defined by the mixing parameter. This parameter should be chosen from a Beta(alpha, alpha) distribution for some parameter alpha > 0. Note that the same mixing is used for the whole batch.
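
Restated as a single tensor expression, using the parameter names documented below (a sketch; shapes and values are arbitrary):

```python
import torch

input = torch.randn(4, 3)                 # (minibatch, ...)
indices = torch.randperm(input.shape[0])  # permutation along the sample axis
mixing = 0.7                              # drawn from Beta(alpha, alpha) in practice
# The same mixing coefficient applies to every sample in the batch.
input_mixed = mixing * input + (1 - mixing) * input[indices]
```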

Parameters
• input (Tensor) – input tensor of shape (minibatch, ...), where ... indicates zero or more dimensions.

• target (Tensor) – target tensor of shape (minibatch, ...), where ... indicates zero or more dimensions.

• mixing (float, optional) – Coefficient used to interpolate between the two examples. If provided, must be in [0, 1]. If None, the value is drawn from a Beta(alpha, alpha) distribution; see the usage sketch after this list. Default: None.

• alpha (float, optional) – parameter for the Beta distribution over mixing. Ignored if mixing is provided. Default: 0.2.

• indices (Tensor, optional) – Permutation of the samples to use. Default: None.
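
For instance, supplying mixing and indices explicitly makes the transform deterministic (a usage sketch against the documented signature; shapes are arbitrary):

```python
import torch
from composer.functional import mixup_batch

X = torch.randn(4, 3, 8, 8)
y = torch.randint(10, size=(4,))
indices = torch.randperm(X.shape[0])  # fix the permutation up front
X_mixed, y_perm, mixing = mixup_batch(X, y, mixing=0.3, indices=indices)
```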

Returns
• input_mixed (torch.Tensor) – Batch of inputs after MixUp has been applied.

• target_perm (torch.Tensor) – The labels of the mixed-in examples.

• mixing (torch.Tensor) – The amount of mixing used.

Example

```python
import torch
from composer.functional import mixup_batch

N, C, H, W = 2, 3, 4, 5
num_classes = 10  # arbitrary; any positive label cardinality works
X = torch.randn(N, C, H, W)
y = torch.randint(num_classes, size=(N,))
X_mixed, y_perm, mixing = mixup_batch(
    X,
    y,
    alpha=0.2,
)
```
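
Continuing the example, the returned values are enough to interpolate the loss by hand, mirroring what interpolate_loss=True does in the MixUp class (logits is a hypothetical stand-in for a model's output on X_mixed):

```python
import torch.nn.functional as F

# Interpolate the loss against the original and mixed-in labels.
logits = torch.randn(N, num_classes)
loss = mixing * F.cross_entropy(logits, y) + (1 - mixing) * F.cross_entropy(logits, y_perm)
```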