composer.datasets.build_streaming_imagenet1k_dataloader(global_batch_size, remote, *, local='/tmp/mds-cache/mds-imagenet1k', split='train', drop_last=True, shuffle=True, resize_size=-1, crop_size=224, **dataloader_kwargs)

Builds a dataloader over an ImageNet-1k streaming dataset.

  • global_batch_size (int) – Global batch size.

  • remote (str) – Remote directory (S3 or local filesystem) where the dataset is stored.

  • local (str, optional) – Local filesystem directory where the dataset is cached during operation. Default: '/tmp/mds-cache/mds-imagenet1k'.

  • split (str, optional) – Which split of the dataset to use, either 'train' or 'val'. Default: 'train'.

  • drop_last (bool, optional) – Whether to drop the last samples. Default: True.

  • shuffle (bool, optional) – Whether to shuffle the dataset. Default: True.

  • resize_size (int, optional) – The size to resize images to. Use -1 to skip resizing. Default: -1.

  • crop_size (int, optional) – The crop size to use. Default: 224.

  • **dataloader_kwargs (Dict[str, Any]) – Additional settings for the dataloader (e.g. num_workers).
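
The parameters above can be put together as in the following minimal sketch, which is an illustration and not taken from the Composer sources. The bucket path in `REMOTE`, the helpers `imagenet_loader_args` and `build_loader`, and the validation settings (no shuffling, no dropped samples, `resize_size=256`) are assumptions chosen for the example. Only the keyword names and defaults come from the signature documented here. Composer is imported lazily inside `build_loader`, so the argument helper can be inspected without Composer installed.

```python
from typing import Any, Dict

# Hypothetical remote location; replace with the directory that holds your dataset.
REMOTE = "s3://my-bucket/mds-imagenet1k"


def imagenet_loader_args(split: str, global_batch_size: int) -> Dict[str, Any]:
    """Collect keyword arguments for build_streaming_imagenet1k_dataloader."""
    if split not in ("train", "val"):
        raise ValueError(f"split must be 'train' or 'val', got {split!r}")
    is_train = split == "train"
    return {
        "global_batch_size": global_batch_size,
        "remote": REMOTE,
        "local": "/tmp/mds-cache/mds-imagenet1k",  # documented default
        "split": split,
        # Assumption: shuffle and drop samples only during training.
        "drop_last": is_train,
        "shuffle": is_train,
        # Assumption: resize before cropping for validation only.
        "resize_size": -1 if is_train else 256,
        "crop_size": 224,
        # Extra keys are forwarded through **dataloader_kwargs.
        "num_workers": 8,
    }


def build_loader(split: str, global_batch_size: int):
    """Call the Composer builder with the arguments assembled above."""
    # Lazy import: requires Composer with streaming dataset support installed.
    from composer.datasets import build_streaming_imagenet1k_dataloader

    return build_streaming_imagenet1k_dataloader(
        **imagenet_loader_args(split, global_batch_size)
    )
```

Calling `build_loader("train", 1024)` would then request the training split with a global batch size of 1024. Keeping the arguments in a plain dictionary makes it easy to log or compare the train and validation configurations before any data is downloaded.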