# 🏙️ ResNet

Vision / Image Classification

The ResNet model family is a set of convolutional neural networks that can be used as a basis for a variety of vision tasks. Our implementation is a simple wrapper on top of the torchvision ResNet implementation.

## How to Use

from composer.models import ComposerResNet

model = ComposerResNet(
    model_name="resnet50",
    num_classes=1000,
    pretrained=False,
)


## Architecture

The basic architecture defined in the original papers is as follows:

• The first layer is a 7x7 convolution with stride 2 and 64 filters.

• Subsequent layers are grouped into 4 stages with base widths of {64, 128, 256, 512} channels. The number of residual blocks in each stage depends on the family member. Each stage after the first halves the resolution using a convolution with stride 2.

• The final section consists of a global average pooling followed by a linear + softmax layer that outputs values for the specified number of classes.
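
The layer sequence above can be sketched as a trace of the spatial resolution. This is a minimal sketch, not the wrapped implementation. The padding values and the 3x3 max pool after the first convolution are assumptions taken from the torchvision layout, and `conv_out` and `trace_resolution` are hypothetical helper names.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resolution(input_size=224):
    """Return (layer name, spatial size) pairs for an ImageNet-style ResNet."""
    trace = []
    # 7x7 stem convolution with stride 2 (padding 3 is an assumption).
    size = conv_out(input_size, kernel=7, stride=2, padding=3)
    trace.append(("conv1", size))
    # 3x3 max pool with stride 2 (assumed from the torchvision layout).
    size = conv_out(size, kernel=3, stride=2, padding=1)
    trace.append(("maxpool", size))
    for stage in range(1, 5):
        if stage > 1:
            # Assumed: stages 2 to 4 halve the resolution with a stride-2 3x3 convolution.
            size = conv_out(size, kernel=3, stride=2, padding=1)
        trace.append((f"stage{stage}", size))
    # Global average pooling collapses each feature map to a single value.
    trace.append(("global_avg_pool", 1))
    return trace


for name, size in trace_resolution(224):
    print(f"{name:>16}: {size}x{size}")
```

Under these assumptions, a 224x224 input reaches the last stage at 7x7 before pooling.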

The below table from He et al. details some of the building blocks for ResNets of different sizes.

## Family Members

ResNet family members are identified by their number of layers. Parameter count, accuracy, and training time are provided below.

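| Model Family Members | Parameter Count | Our Accuracy | Training Time on 8xA100s |
|----------------------|-----------------|--------------|--------------------------|
| ResNet-18            | 11.5M           | TBA          | TBA                      |
| ResNet-34            | 21.8M           | TBA          | TBA                      |
| ResNet-50            | 25.6M           | 76.5%        | 3.83 hrs                 |
| ResNet-101           | 44.5M           | 78.1%        | 5.50 hrs                 |
| ResNet-152           | 60.2M           | TBA          | TBA                      |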

Note: Please see the CIFAR ResNet model card for the differences between CIFAR and ImageNet ResNets.
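
The layer count in each name can be recovered from the per-stage block counts reported by He et al. The sketch below assumes the usual counting convention: one stem convolution, two convolutions per basic block or three per bottleneck block, and one final linear layer, with shortcut projections not counted. `CONFIGS` and `count_layers` are hypothetical names used only for illustration.

```python
# (block type, residual blocks per stage), following He et al.
CONFIGS = {
    "resnet18": ("basic", [2, 2, 2, 2]),
    "resnet34": ("basic", [3, 4, 6, 3]),
    "resnet50": ("bottleneck", [3, 4, 6, 3]),
    "resnet101": ("bottleneck", [3, 4, 23, 3]),
    "resnet152": ("bottleneck", [3, 8, 36, 3]),
}

# Weighted layers inside one residual block (shortcut projections excluded).
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}


def count_layers(model_name):
    """Count weighted layers: stem conv + block convs + final linear layer."""
    block_type, blocks = CONFIGS[model_name]
    return 1 + CONVS_PER_BLOCK[block_type] * sum(blocks) + 1


for name in CONFIGS:
    print(name, count_layers(name))
```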

## Default Training Hyperparameters

optimizer:
  sgd:
    learning_rate: 2.048
    momentum: 0.875
    weight_decay: 5e-4
lr_schedulers:
  linear_warmup: "8ep"
  cosine_decay:
    T_max: "82ep"
    eta_min: 0
    verbose: false
    interval: step
train_batch_size: 2048
max_duration: 90ep
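
To make the schedule concrete, here is a rough sketch of the learning rate these settings would produce. It assumes that warmup rises linearly from zero to the peak rate over the first 8 epochs and that cosine decay then runs for the remaining 82 epochs (8 + 82 = 90, the full duration). It does not reproduce Composer's scheduler code, and `lr_at` is a hypothetical helper.

```python
import math

PEAK_LR = 2.048      # optimizer.sgd.learning_rate
WARMUP_EPOCHS = 8    # lr_schedulers.linear_warmup
T_MAX_EPOCHS = 82    # lr_schedulers.cosine_decay.T_max
ETA_MIN = 0.0        # lr_schedulers.cosine_decay.eta_min


def lr_at(epoch):
    """Learning rate at a (possibly fractional) epoch under the assumed schedule."""
    if epoch < WARMUP_EPOCHS:
        # Linear warmup, assumed to start from zero.
        return PEAK_LR * epoch / WARMUP_EPOCHS
    # Cosine decay from the peak rate down to ETA_MIN over T_MAX_EPOCHS.
    t = min(epoch - WARMUP_EPOCHS, T_MAX_EPOCHS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / T_MAX_EPOCHS))
    return ETA_MIN + (PEAK_LR - ETA_MIN) * cosine


for epoch in (0, 4, 8, 49, 90):
    print(f"epoch {epoch:>2}: lr = {lr_at(epoch):.4f}")
```

Because `interval` is set to `step`, the real schedule is updated every batch rather than every epoch. Passing a fractional epoch to `lr_at` approximates that.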


Paper: Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Code and hyperparameters: DeepLearningExamples GitHub repository by NVIDIA

## API Reference

class composer.models.resnet.model.ComposerResNet(model_name, num_classes=1000, pretrained=False, groups=1, width_per_group=64, initializers=None, loss_name='soft_cross_entropy')

A ComposerClassifier wrapper around the torchvision implementations of the ResNet model family.

From Deep Residual Learning for Image Recognition (He et al., 2015).

Parameters
• model_name (str) – Name of the ResNet model instance. One of ["resnet18", "resnet34", "resnet50", "resnet101", "resnet152"].

• num_classes (int, optional) – The number of classes. Needed for classification tasks. Default: 1000.

• pretrained (bool, optional) – If True, use ImageNet pretrained weights. Default: False.

• groups (int, optional) – Number of filter groups for the 3x3 convolution layer in bottleneck blocks. Default: 1.

• width_per_group (int, optional) – Initial width for each convolution group. Width doubles after each stage. Default: 64.

• initializers (List[Initializer], optional) – Initializers for the model. None for no initialization. Default: None.

• loss_name (str, optional) – Loss function to use, e.g. 'soft_cross_entropy' or 'binary_cross_entropy_with_logits'. The loss function must be defined in the loss module. Default: 'soft_cross_entropy'.
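
As an illustration of how `groups` and `width_per_group` interact, the sketch below computes the channel count of the 3x3 convolution in each stage's bottleneck blocks. It uses the formula from torchvision's bottleneck block, `int(planes * (base_width / 64.0)) * groups`, and assumes the wrapper passes both arguments through to torchvision unchanged. `bottleneck_widths` is a hypothetical helper.

```python
# Base channel widths of the four stages.
STAGE_PLANES = [64, 128, 256, 512]


def bottleneck_widths(groups=1, width_per_group=64):
    """Channel count of the 3x3 convolution in each stage's bottleneck blocks."""
    return [
        int(planes * (width_per_group / 64.0)) * groups
        for planes in STAGE_PLANES
    ]


print(bottleneck_widths())                              # default ResNet settings
print(bottleneck_widths(groups=32, width_per_group=4))  # ResNeXt-style 32x4d setting
```

With the defaults the widths equal the stage base widths. Increasing either argument widens only the grouped 3x3 convolution, and the width still doubles from one stage to the next.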

Example:

from composer.models import ComposerResNet

model = ComposerResNet(model_name='resnet18')  # creates a torchvision resnet18 for image classification