LocalDataset
- class streaming.LocalDataset(local, split=None)
A streaming dataset whose shards reside locally, exposed as a PyTorch Dataset.
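- Parameters
local (str) – Local directory containing the dataset shards.
split (str, optional) – Dataset split to use, if any. Defaults to None.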
- get_item(sample_id)
Get sample by global sample ID.
- Parameters
sample_id (int) – Sample ID.
- Returns
Dict[str, Any] – Mapping of column name to sample data.
- property size
Get the size of the dataset in samples.
- Returns
int – Number of samples.
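
As a minimal sketch of how the two documented members fit together, the helper below walks a dataset by global sample ID using only `size` and `get_item`. The names `iter_samples` and `InMemoryDataset` are hypothetical and not part of the library; `InMemoryDataset` is a stand-in with the same two members so the example runs without any shards on disk.

```python
from typing import Any, Dict, Iterator, List, Optional


class InMemoryDataset:
    """Hypothetical stand-in exposing the same `size` and `get_item` members."""

    def __init__(self, samples: List[Dict[str, Any]]) -> None:
        self._samples = samples

    @property
    def size(self) -> int:
        # Number of samples, mirroring the documented `size` property.
        return len(self._samples)

    def get_item(self, sample_id: int) -> Dict[str, Any]:
        # Look up one sample by its global sample ID.
        return self._samples[sample_id]


def iter_samples(dataset: Any, limit: Optional[int] = None) -> Iterator[Dict[str, Any]]:
    """Yield samples in ID order, stopping after `limit` samples if given."""
    count = dataset.size if limit is None else min(limit, dataset.size)
    for sample_id in range(count):
        yield dataset.get_item(sample_id)


# Assumed usage with the real class (the path is a placeholder):
# from streaming import LocalDataset
# dataset = LocalDataset(local="/path/to/shards")
# for sample in iter_samples(dataset, limit=3):
#     print(sorted(sample))  # column names of each sample
```

Because the helper relies only on `size` and `get_item`, it should work with any object that provides those two members, assuming sample IDs run from 0 to `size - 1`.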