# GatedLinearUnits

class composer.algorithms.GatedLinearUnits(act_fn=None, gated_layer_bias=False, non_gated_layer_bias=False)

Replaces all instances of Linear layers in the feed-forward subnetwork with a Gated Linear Unit. The Gated Linear Units provide a more expressive form for the same number of parameters, at the cost of a slight degradation in throughput.
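Concretely, where a standard feed-forward block computes W2(act(W1 x)), the gated variant computes W2(act(W1 x) * V x), with * an elementwise product. Below is a minimal PyTorch sketch of that structure; the module names and the GELU default are illustrative, not Composer's internal implementation:

import torch.nn as nn

class GatedFFN(nn.Module):
    """Illustrative gated feed-forward block: out = W2(act(W1 x) * V x)."""

    def __init__(self, d_model, d_ff, act_fn=None):
        super().__init__()
        self.gated = nn.Linear(d_model, d_ff, bias=False)      # gated branch (W1)
        self.non_gated = nn.Linear(d_model, d_ff, bias=False)  # linear branch (V)
        self.out = nn.Linear(d_ff, d_model)                    # output projection (W2)
        self.act_fn = act_fn if act_fn is not None else nn.GELU()

    def forward(self, x):
        # The elementwise product gates the activated branch with the linear branch.
        return self.out(self.act_fn(self.gated(x)) * self.non_gated(x))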

Runs on Event.INIT, so it can swap the Linear layers in the FFN for GLUs before the model is DDP wrapped.
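The same module surgery is also available outside the Trainer through Composer's functional API. The sketch below assumes `model` is already constructed and that it is called before any DistributedDataParallel wrapping, matching the Event.INIT ordering described above:

import composer.functional as cf

cf.apply_gated_linear_units(model)  # swap the FFN Linear layers for GLUs in place
# ... only afterwards wrap `model` with torch's DistributedDataParallel.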

Parameters
• act_fn (Callable[[Tensor], Tensor], optional) – The activation function to use. If None, the algorithm will use the existing activation function in the model. See the construction sketch after this parameter list. Default: None.

• gated_layer_bias (bool, optional) – Whether to use a bias term in the gated linear layer of the GLU. Default: False.

• non_gated_layer_bias (bool, optional) – Whether to use a bias term in the non-gated linear layer of the GLU. Default: False.
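For instance, the following sketch constructs the algorithm with a GELU activation and a bias on the gated branch, using only the documented parameters above:

import torch.nn.functional as F
from composer.algorithms import GatedLinearUnits

algorithm = GatedLinearUnits(
    act_fn=F.gelu,               # override the model's existing activation
    gated_layer_bias=True,       # add a bias to the gated linear layer
    non_gated_layer_bias=False,  # keep the non-gated layer bias-free (default)
)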

Example

from composer.algorithms import GatedLinearUnits
from composer.trainer import Trainer

algorithm = GatedLinearUnits()
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    max_duration="1ep",
    algorithms=[algorithm],
)