AWS S3

To stream data from S3 buckets while training models, MCLI needs access to your AWS S3 credentials.

First, make sure awscli is installed, then run aws configure to create the config and credentials files:

python -m pip install awscli
aws configure

Note: the requested credentials can be retrieved through your AWS console, typically under “Command line or programmatic access”.

To add S3 credentials to MCLI, use the following command:

mcli create secret s3

This starts an interactive prompt:

> mcli create secret s3
? What would you like to name this secret? my-s3-credentials
? Where is your S3 config file located? ~/.aws/config
? Where is your S3 credentials file located? ~/.aws/credentials
✔  Created secret: my-s3-credentials

The value for each of these prompts can instead be passed on the command line using the --name, --config-file and --credentials-file arguments, respectively. Your config and credentials files should follow the standard structure output by aws configure:

~/.aws/config

[default]
region=us-west-2
output=json

~/.aws/credentials

[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

More details on these files can be found in the AWS CLI documentation on configuration and credential file settings.
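As a rough sketch, the prompts can be skipped from a Python script by passing the three arguments described above. The helper names (has_default_profile, build_create_secret_command) are hypothetical and not part of MCLI. The sketch assumes that mcli is on your PATH and that both files use the [default] profile shown above.

```python
import configparser
import os


def has_default_profile(path, required_keys=()):
    """Return True if the INI file at `path` has a [default] section
    that contains every key listed in `required_keys`."""
    # interpolation=None so '%' characters in secret values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips files that do not exist.
    parser.read(os.path.expanduser(path))
    if not parser.has_section("default"):
        return False
    return all(parser.has_option("default", key) for key in required_keys)


def build_create_secret_command(name, config_file, credentials_file):
    """Build the argument list for a non-interactive `mcli create secret s3`.

    Only the flags named in this page are used. '~' is expanded here
    because no shell is involved when the list is passed to subprocess.
    """
    return [
        "mcli", "create", "secret", "s3",
        "--name", name,
        "--config-file", os.path.expanduser(config_file),
        "--credentials-file", os.path.expanduser(credentials_file),
    ]


# Example (not executed here):
#   import subprocess
#   cmd = build_create_secret_command(
#       "my-s3-credentials", "~/.aws/config", "~/.aws/credentials")
#   subprocess.run(cmd, check=True)
```

Checking the files with has_default_profile before calling mcli is optional. It only catches a missing file or a missing key early, and it never prints the credential values.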

Once you’ve created an S3 secret, it is mounted inside all of your runs, and two environment variables are exported:

  • $AWS_CONFIG_FILE: Path to your config file

  • $AWS_SHARED_CREDENTIALS_FILE: Path to your credentials file

Libraries like boto3 use these environment variables by default to discover your S3 credentials.
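To confirm that a run actually sees the mounted secret, a small check like the one below could be placed at the start of a training script. It is only a sketch using the Python standard library. The function name describe_aws_secret_files is hypothetical, and the sketch assumes the mounted files keep the INI structure shown earlier.

```python
import configparser
import os


def describe_aws_secret_files(environ=os.environ):
    """For each of the two exported variables, report the path it points
    to, whether that file exists, and which profile names it defines.
    Credential values are never read out or printed."""
    report = {}
    for var in ("AWS_CONFIG_FILE", "AWS_SHARED_CREDENTIALS_FILE"):
        path = environ.get(var)
        exists = bool(path) and os.path.isfile(path)
        profiles = []
        if exists:
            parser = configparser.ConfigParser(interpolation=None)
            parser.read(path)
            # Section names only, e.g. ["default"].
            profiles = parser.sections()
        report[var] = {"path": path, "exists": exists, "profiles": profiles}
    return report


# Example (inside a run):
#   for var, info in describe_aws_secret_files().items():
#       print(var, info)
```

If both files are reported as existing, a call such as boto3.client("s3") should pick up the credentials without any keys being passed explicitly.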