StreamingCIFAR10
- class streaming.vision.StreamingCIFAR10(local, remote=None, split=None, shuffle=False, transform=None, target_transform=None, predownload=100000, keep_zip=None, download_retry=2, download_timeout=60, validate_hash=None, shuffle_seed=None, num_canonical_nodes=None, batch_size=None)
- Implementation of the CIFAR-10 dataset using StreamingDataset.
- Parameters
- local (str) – Local dataset directory where shards are cached by split.
- remote (str, optional) – Download shards from this remote path or directory. If None, this rank and worker's partition of the dataset must all exist locally. Defaults to None.
- split (str, optional) – Which dataset split to use, if any. Defaults to None.
- shuffle (bool) – Whether to iterate over the samples in randomized order. Defaults to False.
- transform (callable, optional) – A function/transform that takes in an image and returns a transformed version. Defaults to None.
- target_transform (callable, optional) – A function/transform that takes in the target and transforms it. Defaults to None.
- predownload (int, optional) – Target number of samples ahead to download the shards of while iterating. Defaults to 100_000.
- keep_zip (bool, optional) – Whether to keep or delete the compressed file when decompressing downloaded shards. If set to None, keep iff remote is local. Defaults to None.
- download_retry (int) – Number of download re-attempts before giving up. Defaults to 2.
- download_timeout (float) – Number of seconds to wait for a shard to download before raising an exception. Defaults to 60.
- validate_hash (str, optional) – Optional hash or checksum algorithm to use to validate shards. Defaults to None.
- shuffle_seed (int, optional) – Seed for shuffling, or None for random seed. Defaults to None.
- num_canonical_nodes (int, optional) – Canonical number of nodes for shuffling with resumption. Defaults to None, which is interpreted as the number of nodes of the initial run.
- batch_size (int, optional) – Batch size of its DataLoader, which affects how the dataset is partitioned over the workers. Defaults to None.
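The parameters above might be combined as in the following minimal sketch. It only uses the constructor arguments documented in the signature; the helper names `cifar10_kwargs` and `make_loader`, the seed value, and any paths passed in are hypothetical and chosen for illustration. Pairing the dataset with a PyTorch `DataLoader` is an assumption based on the `batch_size` description, which is why the same value is passed to both.

```python
from typing import Any, Dict, Optional


def cifar10_kwargs(local: str,
                   remote: Optional[str] = None,
                   split: str = "train",
                   batch_size: int = 32) -> Dict[str, Any]:
    """Collect constructor arguments for StreamingCIFAR10 (hypothetical helper)."""
    is_train = split == "train"
    return {
        "local": local,            # directory where shards are cached
        "remote": remote,          # None means all shards must already be local
        "split": split,
        "shuffle": is_train,       # randomized order only for training
        "shuffle_seed": 1234 if is_train else None,  # arbitrary illustrative seed
        "batch_size": batch_size,  # should match the DataLoader's batch size
    }


def make_loader(local: str,
                remote: Optional[str] = None,
                split: str = "train",
                batch_size: int = 32):
    """Build a DataLoader over StreamingCIFAR10 (assumes torch and streaming are installed)."""
    # Imported lazily so cifar10_kwargs stays usable without these packages.
    from torch.utils.data import DataLoader
    from streaming.vision import StreamingCIFAR10

    dataset = StreamingCIFAR10(**cifar10_kwargs(local, remote, split, batch_size))
    return DataLoader(dataset, batch_size=batch_size)
```

Calling `make_loader("/tmp/cifar10", remote="s3://my-bucket/cifar10")` (both paths are placeholders) would return a loader that can be iterated for batches, with shards downloaded into the local directory as needed.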