AWS S3
To stream data from S3 buckets during training, MCLI needs access to your AWS S3 credentials.
First, make sure the awscli is installed, and then run aws configure to create the config and credentials files:
python -m pip install awscli
aws configure
Note: the requested credentials can be retrieved through your AWS console, typically under “Command line or programmatic access”.
To add S3 credentials to MCLI, use the following command:
mcli create secret s3
which produces the following output:
> mcli create secret s3
? What would you like to name this secret? my-s3-credentials
? Where is your S3 config file located? ~/.aws/config
? Where is your S3 credentials file located? ~/.aws/credentials
✔ Created secret: my-s3-credentials
The value for each of these prompts can instead be passed on the command line using the --name, --config-file, and --credentials-file arguments, respectively.
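When secret creation is scripted rather than typed interactively, the same three flags can be assembled programmatically. The following is a minimal sketch that only builds and prints the command instead of running it. The helper name build_s3_secret_command is hypothetical and not part of MCLI; the flags are the ones named above.

```python
import os
import shlex


def build_s3_secret_command(name,
                            config_file="~/.aws/config",
                            credentials_file="~/.aws/credentials"):
    """Return the argv list for a non-interactive `mcli create secret s3` call.

    Hypothetical helper: it uses only the three flags named in the text.
    """
    return [
        "mcli", "create", "secret", "s3",
        "--name", name,
        # Expand "~" so the paths do not depend on shell expansion.
        "--config-file", os.path.expanduser(config_file),
        "--credentials-file", os.path.expanduser(credentials_file),
    ]


if __name__ == "__main__":
    cmd = build_s3_secret_command("my-s3-credentials")
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps each argument separate, so paths containing spaces need no extra quoting if the list is later handed to subprocess.run.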
Your config and credentials files should follow the standard structure output by aws configure:
~/.aws/config
[default]
region=us-west-2
output=json
~/.aws/credentials
[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
More details on these files can be found in the AWS CLI documentation on configuration and credential file settings.
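Before creating the secret, it can help to confirm that both files have the expected structure. The sketch below uses only the Python standard library and checks for the section and keys shown above. The function name check_aws_files is hypothetical, and the check is deliberately shallow: it looks for the presence of keys and does not validate the credentials against AWS.

```python
import configparser
import os


def check_aws_files(config_path, credentials_path, profile="default"):
    """Return a list of problems found in the two files (empty if none)."""
    problems = []

    # In the config file, non-default profiles use a "profile <name>" section.
    config_section = profile if profile == "default" else f"profile {profile}"
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(os.path.expanduser(config_path)):
        problems.append(f"cannot read config file: {config_path}")
    elif not config.has_section(config_section):
        problems.append(f"config file has no [{config_section}] section")

    creds = configparser.ConfigParser(interpolation=None)
    if not creds.read(os.path.expanduser(credentials_path)):
        problems.append(f"cannot read credentials file: {credentials_path}")
    else:
        for key in ("aws_access_key_id", "aws_secret_access_key"):
            if not creds.has_option(profile, key):
                problems.append(f"credentials file is missing {key} in [{profile}]")

    return problems


if __name__ == "__main__":
    found = check_aws_files("~/.aws/config", "~/.aws/credentials")
    for problem in found:
        print("problem:", problem)
    if not found:
        print("both files look well-formed")
```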
Once you’ve created an S3 secret, we mount these secrets inside all of your runs and export two environment variables:
$AWS_CONFIG_FILE: Path to your config file
$AWS_SHARED_CREDENTIALS_FILE: Path to your credentials file
Libraries like boto3 will use these environment variables by default to discover your S3 credentials.
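To confirm from inside a run that the secret was mounted, the two variables can be inspected directly. This is a minimal sketch using only the standard library. The function name mounted_aws_paths is hypothetical, and the script makes no assumption about where the files are mounted; it simply reports whatever paths the variables contain.

```python
import os


def mounted_aws_paths(environ=None):
    """Return the paths held by the two exported variables.

    Raises RuntimeError naming any variable that is unset or empty.
    """
    environ = os.environ if environ is None else environ
    names = ("AWS_CONFIG_FILE", "AWS_SHARED_CREDENTIALS_FILE")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("not set: " + ", ".join(missing))
    return {name: environ[name] for name in names}


if __name__ == "__main__":
    try:
        for name, path in mounted_aws_paths().items():
            print(f"{name} -> {path} (exists: {os.path.isfile(path)})")
    except RuntimeError as err:
        print(f"S3 secret does not appear to be mounted: {err}")
```

If both variables are set and the files exist, a call such as boto3.client("s3") should need no explicit keys, since boto3 reads the same two variables when resolving credentials.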