composer.datasets
Modules
ADE20K Semantic segmentation and scene parsing dataset. |
|
BraTS (Brain Tumor Segmentation) dataset. |
|
C4 (Colossal Cleaned CommonCrawl) dataset. |
|
CIFAR image classification dataset. |
|
COCO (Common Objects in Context) dataset. |
|
Common settings across both the training and eval datasets. |
|
Mapping between dataset names and corresponding HParams classes. |
|
Specifies an instance of an |
|
composer.datasets.ffcv_utils |
|
GLUE (General Language Understanding Evaluation) dataset (Wang et al., 2019). |
|
Dataset Hyperparameter classes. |
|
ImageNet classification dataset. |
|
Generic dataset class for self-supervised training of autoregressive and masked language models. |
|
MNIST image classification dataset. |
|
Synthetic datasets used for testing, profiling, and debugging. |
|
Synthetic language modeling datasets used for testing, profiling, and debugging. |
|
Utility and helper functions for datasets. |
|
composer.datasets.webdataset_utils |
Natively supported datasets.
Modules in the datasets namespace define utilities and mechanisms for creating dataloaders from the given hyperparameters. Two of the most important classes in this module are described below:
All datasets derive from the abstract base class DatasetHparams, which contains common parameters such as shuffle. DatasetHparams returns a dataloader (a torch.utils.data.DataLoader or a DataSpec) for the trainer.
DataLoaderHparams contains the torch.utils.data.DataLoader settings that are common across both training and eval datasets. See the documentation of DataLoaderHparams for more details on these settings.
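The split described above (per-dataset hyperparameters on one side, loader settings shared by training and eval on the other) can be illustrated with a minimal standard-library sketch. It does not use Composer or PyTorch. Every name in it (SketchLoaderSettings, SketchLoader, SketchDatasetHparams, SketchRangeHparams, load_samples, build_loader) is hypothetical and chosen for illustration only; it is not the library's API.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SketchLoaderSettings:
    """Hypothetical stand-in for DataLoaderHparams (shared by train and eval)."""
    num_workers: int = 0
    pin_memory: bool = False


@dataclass
class SketchLoader:
    """Hypothetical stand-in for the dataloader handed to the trainer."""
    batches: List[List[int]]
    settings: SketchLoaderSettings


@dataclass
class SketchDatasetHparams(ABC):
    """Hypothetical stand-in for DatasetHparams; holds common fields like shuffle."""
    shuffle: bool = False

    @abstractmethod
    def load_samples(self) -> Sequence[int]:
        """Each concrete dataset decides where its samples come from."""

    def build_loader(self, batch_size: int, settings: SketchLoaderSettings,
                     seed: int = 0) -> SketchLoader:
        samples = list(self.load_samples())
        if self.shuffle:
            # Seeded shuffle so the sketch is reproducible.
            random.Random(seed).shuffle(samples)
        batches = [samples[i:i + batch_size]
                   for i in range(0, len(samples), batch_size)]
        return SketchLoader(batches=batches, settings=settings)


@dataclass
class SketchRangeHparams(SketchDatasetHparams):
    """Toy dataset whose samples are the integers 0..size-1."""
    size: int = 8

    def load_samples(self) -> Sequence[int]:
        return range(self.size)


# One settings object is reused for both the training and the eval loader.
shared = SketchLoaderSettings(num_workers=2)
train_loader = SketchRangeHparams(shuffle=True, size=10).build_loader(4, shared)
eval_loader = SketchRangeHparams(shuffle=False, size=4).build_loader(4, shared)
print(len(train_loader.batches), eval_loader.batches)
```

In this sketch the dataset class owns what is specific to the data (source, size, shuffle), while the settings object carries only the loader options that should not differ between training and evaluation.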
Functions
Returns a mapping between different supported datasets and their HParams classes that create an instance of the dataset. |
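A name-to-class mapping of this kind can be sketched as follows. This is an illustration of the registry pattern only, under the assumption that dataset names are plain strings; the class names, the registry keys, and the helpers get_sketch_registry and create_hparams are hypothetical and are not taken from the library.

```python
from typing import Dict, Type


class SketchHparams:
    """Hypothetical base class for dataset hyperparameters."""


class SketchMNISTHparams(SketchHparams):
    """Hypothetical hyperparameters for an MNIST-like dataset."""


class SketchCIFAR10Hparams(SketchHparams):
    """Hypothetical hyperparameters for a CIFAR-10-like dataset."""


_SKETCH_REGISTRY: Dict[str, Type[SketchHparams]] = {
    "mnist": SketchMNISTHparams,
    "cifar10": SketchCIFAR10Hparams,
}


def get_sketch_registry() -> Dict[str, Type[SketchHparams]]:
    """Return a copy so callers cannot mutate the module-level mapping."""
    return dict(_SKETCH_REGISTRY)


def create_hparams(name: str) -> SketchHparams:
    """Look up a dataset name and instantiate its hyperparameter class."""
    registry = get_sketch_registry()
    try:
        hparams_cls = registry[name]
    except KeyError:
        raise ValueError(
            f"Unknown dataset {name!r}; choose from {sorted(registry)}"
        ) from None
    return hparams_cls()
```

Keeping the lookup behind a function rather than exposing the dictionary directly lets a configuration layer resolve a dataset by name without importing each dataset module itself.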
Classes
Enum class to represent different memory formats. |
|
Emulates a dataset of provided size and shape. |
|
Defines the class label type of the synthetic data. |
|
Defines the distribution of the synthetic data. |
|
Similar to |
|
A wrapper around dataloader. |
Hparams
These classes are used with yahp for YAML-based configuration.
Defines an instance of the ADE20k dataset for semantic segmentation from a local disk. |
|
Defines an instance of the ADE20k dataset for semantic segmentation from a remote blob store. |
|
Defines an instance of the BraTS dataset for image segmentation. |
|
Builds a |
|
Defines an instance of the CIFAR-100 WebDataset for image classification. |
|
Defines an instance of the CIFAR-10 dataset for image classification from a local disk. |
|
Defines an instance of the CIFAR-10 WebDataset for image classification. |
|
Defines an instance of the CIFAR-20 WebDataset for image classification. |
|
Defines an instance of the COCO Dataset. |
|
Hyperparameters to initialize a |
|
Abstract base class for hyperparameters to initialize a dataset. |
|
Params for the |
|
Sets up a generic GLUE dataset loader. |
|
Defines an instance of the ImageNet-1k WebDataset for image classification. |
|
Defines an instance of the ImageNet dataset for image classification. |
|
Defines a generic dataset class for self-supervised training of autoregressive and masked language models. |
|
Defines an instance of the MNIST dataset for image classification. |
|
Defines an instance of the MNIST WebDataset for image classification. |
|
Synthetic dataset parameter mixin for |
|
Defines an instance of the TinyImagenet-200 WebDataset for image classification. |
|
Abstract base class for hyperparameters to initialize a webdataset. |
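As a rough illustration of what configuration-driven Hparams classes look like, the sketch below builds a dataclass from a plain dictionary of the kind a YAML parser would produce. It does not use yahp, and the class name, its fields (datadir, is_train, shuffle, drop_last), and the helper from_config are assumptions made for this example rather than the library's actual definitions.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


@dataclass
class SketchMNISTHparams:
    """Hypothetical dataset hyperparameters; the field names are illustrative."""
    datadir: str
    is_train: bool = True
    shuffle: bool = True
    drop_last: bool = True


def from_config(cls: Type[T], config: Dict[str, Any]) -> T:
    """Build a dataclass from a parsed config, rejecting unknown keys."""
    known = {f.name for f in fields(cls)}
    unknown = sorted(set(config) - known)
    if unknown:
        raise ValueError(f"Unknown keys for {cls.__name__}: {unknown}")
    return cls(**config)


# The dict stands in for the result of parsing a YAML document such as:
#   datadir: /tmp/mnist
#   shuffle: false
parsed = {"datadir": "/tmp/mnist", "shuffle": False}
hparams = from_config(SketchMNISTHparams, parsed)
print(hparams)
```

Rejecting unknown keys up front means a misspelled option in a configuration file fails loudly instead of being silently ignored.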