🃏 Methods Overview

| Name | Domain | Description |
| --- | --- | --- |
| AGC | CV | Clips gradients based on the ratio of their norms to the weights' norms. |
| Alibi | NLP | Replace attention with ALiBi |
| AugMix | CV | Image-preserving data augmentations |
| BlurPool | CV | Applies blur before pooling or downsampling |
| ChannelsLast | CV | Uses channels last memory format (NHWC) |
| ColOut | CV | Removes columns and rows from the image for augmentation and efficiency. |
| CutMix | CV | Combines pairs of examples in non-overlapping regions and mixes labels |
| CutOut | CV | Randomly erases rectangular blocks from the image. |
| Factorize | CV, NLP | Factorize GEMMs into smaller GEMMs |
| Fused LayerNorm | CV, NLP | Uses Fused LayerNorm kernels for increased GPU utilization. |
| Gated Linear Units | NLP | Swaps linear layers for Gated Linear Units in the feed-forward network. |
| GhostBatchNorm | CV | Use smaller samples to compute batchnorm |
| LabelSmoothing | CV | Smooths the labels with a uniform prior |
| LayerFreezing | CV, NLP | Progressively freezes layers during training. |
| MixUp | CV | Blends pairs of examples and labels |
| ProgressiveResizing | CV | Increases the input image size during training |
| RandAugment | CV | Applies a series of random augmentations |
| SAM | CV | SAM optimizer measures sharpness of optimization space |
| ScaleSchedule | | Scale the learning rate schedule by a factor |
| SelectiveBackprop | CV | Drops examples with small loss contributions. |
| SeqLengthWarmup | NLP | Progressively increase sequence length. |
| SqueezeExcite | CV | Replaces eligible layers with Squeeze-Excite layers |
| StochasticDepth | CV | Replaces a specified layer with a stochastic version that randomly drops the layer or samples during training |
| SWA | CV, NLP | Computes running average of model weights. |
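To make two of the one-line descriptions above more concrete, here is a minimal sketch of what "smooths the labels with a uniform prior" (LabelSmoothing) and "blends pairs of examples and labels" (MixUp) can look like. It is plain Python for illustration only and is not the library's implementation. The function names `smooth_labels` and `mixup_pair`, the `alpha` defaults, and the choice to draw the mixing weight from a Beta distribution are assumptions made for this example.

```python
import random


def smooth_labels(one_hot, alpha=0.1):
    """Blend a one-hot target with a uniform prior over the classes.

    Hypothetical helper: alpha is the weight given to the uniform prior.
    """
    num_classes = len(one_hot)
    return [(1.0 - alpha) * p + alpha / num_classes for p in one_hot]


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Blend two examples and their label vectors with weight lam in [0, 1].

    Hypothetical helper: inputs are flat lists of floats of equal length.
    """
    def mix(a, b):
        return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]

    return mix(x_a, x_b), mix(y_a, y_b)


if __name__ == "__main__":
    # Label smoothing: the true class keeps most of the mass, the rest is
    # spread evenly over all classes.
    print(smooth_labels([0.0, 1.0, 0.0, 0.0], alpha=0.1))

    # MixUp: assumed here to draw the mixing weight from Beta(alpha, alpha).
    lam = random.betavariate(0.2, 0.2)
    x, y = mixup_pair([0.0, 0.0], [1.0, 0.0], [2.0, 4.0], [0.0, 1.0], lam)
    print(lam, x, y)
```

In both cases the targets stay valid probability vectors: the smoothed or mixed labels still sum to 1, which is why they can be used directly with a cross-entropy loss over soft targets.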