AzureDataLakeUploader
- class streaming.base.storage.AzureDataLakeUploader(out, keep_local=False, progress_bar=False, retry=2, exist_ok=False)
- Upload files from the local machine to Microsoft Azure Data Lake.
- Parameters
- out (str | Tuple[str, str]) – Output dataset directory to save shard files.
  - If out is a local directory, shard files are saved locally.
  - If out is a remote directory, a local temporary directory is created to cache the shard files, which are then uploaded to the remote location. The temporary directory is deleted once the shards are uploaded.
  - If out is a tuple of (local_dir, remote_dir), shard files are saved in local_dir and also uploaded to the remote location.
 
- keep_local (bool) – If the dataset is uploaded, whether to keep the local dataset shard files or remove them after uploading. Defaults to False.
- progress_bar (bool) – Display TQDM progress bars for uploading output dataset files to a remote location. Defaults to False.
- retry (int) – Number of times to retry uploading a file. Defaults to 2.
- exist_ok (bool) – When exist_ok is False, raise an error if the local part of out already exists and has contents. Defaults to False.
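
The three accepted forms of out can be illustrated with a minimal sketch. The helper resolve_out below is hypothetical and is not part of the streaming library; it only mirrors the behavior described in the parameter list. It assumes that a string with a URL scheme (for example azure-dl://, used here purely as an illustration) denotes a remote directory, and that any other string is a local path.

```python
import tempfile
from typing import Optional, Tuple, Union
from urllib.parse import urlparse


def resolve_out(out: Union[str, Tuple[str, str]]) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: split `out` into (local_dir, remote_dir).

    Not the library's implementation; a sketch of the documented rules.
    """
    if isinstance(out, tuple):
        # (local_dir, remote_dir): shards are written locally and also uploaded.
        local_dir, remote_dir = out
        return local_dir, remote_dir
    if urlparse(out).scheme:
        # Assumption: a URL scheme marks a remote directory. Shards are cached
        # in a temporary local directory before being uploaded.
        return tempfile.mkdtemp(), out
    # Plain local path: shards stay on the local machine, nothing is uploaded.
    return out, None


print(resolve_out("/tmp/shards"))
print(resolve_out(("/tmp/shards", "azure-dl://container/dataset")))
```

With the class itself, the same three forms would be passed as the out argument, for example AzureDataLakeUploader(out=("/tmp/shards", "azure-dl://container/dataset"), keep_local=True). The container and dataset names are placeholders, and a real upload would also need Azure credentials to be configured.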