In order to stream data from S3 buckets when training models, MCLI will need access to your AWS S3 credentials.
First, make sure the awscli is installed, and then run aws configure to create the config and credential files:

python -m pip install awscli
aws configure
Note: the requested credentials can be retrieved through your AWS console, typically under “Command line or programmatic access”.
To add S3 credentials to MCLI, use the following command:
mcli create secret s3
which produces the following output:
> mcli create secret s3
? What would you like to name this secret? my-s3-credentials
? Where is your S3 config file located? ~/.aws/config
? Where is your S3 credentials file located? ~/.aws/credentials
✔ Created secret: my-s3-credentials
The values for each of these queries can be passed as arguments using the --name, --config-file, and --credentials-file arguments, respectively.
Your config and credentials files should follow the standard structure output by aws configure:

~/.aws/config:
[default]
region=us-west-2
output=json

~/.aws/credentials:
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
More details on these files can be found in the AWS CLI documentation on configuration and credential file settings.
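As a quick sanity check before creating the secret, the credentials file can be parsed with Python's standard configparser module, since it uses INI syntax. The sketch below is illustrative only: the helper name missing_aws_keys and the choice of which keys count as required are assumptions based on the example files above, not something MCLI itself runs.

```python
import configparser
from pathlib import Path

# Keys shown in the example credentials file; treating them as
# required is an assumption made for this sketch.
REQUIRED_CREDENTIAL_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_aws_keys(credentials_path, profile="default"):
    """Return the required keys that are absent from the given profile.

    Hypothetical helper: if the file or the profile section does not
    exist, every required key is reported as missing.
    """
    # Disable interpolation so secret values are never treated as templates.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read silently skips files that do not exist.
    parser.read(Path(credentials_path).expanduser())
    if not parser.has_section(profile):
        return list(REQUIRED_CREDENTIAL_KEYS)
    return [key for key in REQUIRED_CREDENTIAL_KEYS
            if not parser.has_option(profile, key)]


if __name__ == "__main__":
    missing = missing_aws_keys("~/.aws/credentials")
    print("missing keys:", missing or "none")
```

An empty list means both keys were found under the chosen profile; anything else names the entries that aws configure still needs to write.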
Once you’ve created an S3 secret, we mount it inside all of your runs and export two environment variables:
$AWS_CONFIG_FILE: Path to your config file
$AWS_SHARED_CREDENTIALS_FILE: Path to your credentials file
Libraries like boto3 will use these environment variables by default to discover your S3 credentials.
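To confirm from inside a run that the two variables point at readable files, something like the following standard-library sketch could be used. The function names resolve_aws_files and read_region are hypothetical, and the fallback paths assume the usual AWS defaults of ~/.aws/config and ~/.aws/credentials when a variable is unset.

```python
import configparser
import os
from pathlib import Path


def resolve_aws_files(env=None):
    """Return (config_path, credentials_path) from the environment.

    Falls back to the conventional ~/.aws locations when a variable
    is not set (an assumption about the default lookup paths).
    """
    env = os.environ if env is None else env
    config = env.get("AWS_CONFIG_FILE", "~/.aws/config")
    credentials = env.get("AWS_SHARED_CREDENTIALS_FILE", "~/.aws/credentials")
    return Path(config).expanduser(), Path(credentials).expanduser()


def read_region(config_path, profile="default"):
    """Return the region configured for the profile, or None if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # missing files are skipped without error
    return parser.get(profile, "region", fallback=None)


if __name__ == "__main__":
    config_path, credentials_path = resolve_aws_files()
    print("config file:     ", config_path, "exists:", config_path.exists())
    print("credentials file:", credentials_path, "exists:", credentials_path.exists())
    print("region:          ", read_region(config_path))
```

Because boto3 reads the same two variables, a client created with boto3.client("s3") in the same process should pick up these files without any extra arguments.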