Artifact Logging#
Composer supports uploading artifacts, such as checkpoints and profiling traces, directly to third-party experiment trackers (e.g. Weights & Biases) and cloud storage backends (e.g. AWS S3).
What is an artifact?#
An artifact is a file generated during training. Checkpoints, profiling traces, and log files are the most common examples. An artifact must be a single, local file. A collection of files can be combined into a single tarball, and the file itself can live anywhere on local disk, including a temporary folder.
Each artifact must have a name, which is independent of the artifact's local filepath. A remote backend that logs an artifact is responsible for storing and organizing the file by the artifact's name. An artifact logged under an existing name should replace the previous artifact with that name. It is recommended that artifact names include file extensions.
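As a minimal sketch of the "single, local file" rule, the helper below packs a directory into one tarball inside a temporary folder. It uses only the Python standard library; the function name `bundle_as_artifact` and the example artifact name are hypothetical and not part of Composer.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_as_artifact(source_dir: str, artifact_name: str) -> Path:
    """Pack everything under source_dir into one gzip-compressed tarball.

    The tarball is written to a fresh temporary folder, so its local path is
    unrelated to artifact_name (the name a remote backend would store it under).
    """
    staging_dir = Path(tempfile.mkdtemp(prefix="artifact-"))
    # Only the last component of the artifact name is used for the local file.
    tarball_path = staging_dir / Path(artifact_name).name
    with tarfile.open(tarball_path, "w:gz") as tar:
        # arcname="." keeps member paths relative to source_dir.
        tar.add(source_dir, arcname=".")
    return tarball_path
```

The returned path could then be passed as the file path when logging the artifact, with `artifact_name` (for example `run1/logs.tar.gz`, extension included) given as its name.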
How are artifacts generated?#
In Composer, individual classes, such as algorithms, callbacks, loggers, and profiler trace handlers, can generate artifacts.
Once an artifact file has been written to disk, the class should call file_artifact(), and the centralized Logger will then pass the filepath and artifact name to all LoggerDestinations, which are ultimately responsible for uploading and storing artifacts (more on that below).
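The hand-off described above can be pictured with a toy model. This is a sketch only, not Composer's implementation: `ToyLogger` and `RecordingDestination` are hypothetical stand-ins, and the method signatures are simplified (the real `file_artifact()` call also takes a log level).

```python
from pathlib import Path
from typing import List, Tuple


class RecordingDestination:
    """Hypothetical destination that only records what it is asked to store."""

    def __init__(self) -> None:
        self.received: List[Tuple[str, Path]] = []

    def log_file_artifact(self, artifact_name: str, file_path: Path) -> None:
        self.received.append((artifact_name, file_path))


class ToyLogger:
    """Stand-in for the centralized Logger (an assumption, for illustration)."""

    def __init__(self, destinations) -> None:
        self.destinations = list(destinations)

    def file_artifact(self, artifact_name: str, file_path: str) -> None:
        path = Path(file_path)
        # The file must already exist on disk before it is announced.
        if not path.is_file():
            raise FileNotFoundError(f"artifact file not found: {path}")
        # Fan out: every destination gets the same name and path.
        for destination in self.destinations:
            destination.log_file_artifact(artifact_name, path)
```

The generating class never talks to a storage backend directly; it only announces the file, and each destination decides what to do with it.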
Below are some examples of the classes that generate artifacts and the types of artifacts they generate. For each class, see the linked API Reference for additional documentation.
Type | Class Name | Description of Generated Artifacts
---|---|---
Callback | | Training checkpoint files
Callback | | Trained models in inference formats
Callback | | MLPerf submission files
Logger | | Log files
Logger | | Tensorboard TF Event Files
Trace Handler | | Profiler trace files
Logging custom artifacts#
It is also possible to log custom artifacts outside of an algorithm or callback. For example:
from composer import Trainer
from composer.loggers import LogLevel
# Construct the trainer
trainer = Trainer(...)
# Log a custom artifact, such as a configuration YAML
trainer.logger.file_artifact(
    log_level=LogLevel.FIT,
    artifact_name='hparams.yaml',
    file_path='/path/to/hparams.yaml',
)
# Train!
trainer.fit()
How are artifacts uploaded?#
To store artifacts, you must pass a LoggerDestination that implements log_file_artifact() in the loggers argument to the Trainer constructor.
See also
The built-in WandBLogger and ObjectStoreLogger implement this method; see the examples below.
The centralized Composer Logger will invoke this method for all LoggerDestinations. If no LoggerDestination implements this method, then artifacts will not be stored remotely.
Because LoggerDestinations can both generate and store artifacts, there is a potential for a circular dependency. As such, it is important that any logger that generates artifacts (e.g. the Tensorboard Logger) does not also attempt to store artifacts. Otherwise, you could run into an infinite loop!
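To make the storing side concrete, here is a minimal sketch of a destination that "uploads" by copying files into a local directory, keyed by artifact name. It is deliberately standalone: in real use the class would subclass Composer's LoggerDestination, and the simplified `log_file_artifact` signature below is an assumption rather than the library's exact API. The class name `LocalDirectoryDestination` is hypothetical.

```python
import shutil
from pathlib import Path


class LocalDirectoryDestination:
    """Hypothetical destination that stores artifacts under a root directory.

    It only stores files and never generates artifacts of its own, which
    avoids the circular dependency between generating and storing.
    """

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def log_file_artifact(self, artifact_name: str, file_path: str) -> Path:
        if Path(artifact_name).is_absolute():
            raise ValueError("artifact names must be relative")
        # The artifact name, not the local filepath, decides where it lands.
        target = self.root / artifact_name
        target.parent.mkdir(parents=True, exist_ok=True)
        # Copying over an existing file replaces the earlier artifact
        # that was logged under the same name.
        shutil.copy2(file_path, target)
        return target
```

Logging two different local files under the same artifact name leaves only the second one at the target location.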
Where can I store artifacts?#
Composer includes two built-in LoggerDestinations to store artifacts:
The WandBLogger can upload Composer training artifacts as W&B Artifacts, which are associated with the corresponding W&B project.
The ObjectStoreLogger can upload Composer training artifacts to any cloud storage backend or remote filesystem. We include integrations for AWS S3 and SFTP (see the examples below), and you can write your own integration for a custom backend.
Why should I use artifact logging instead of uploading artifacts manually?#
Artifact logging in Composer is optimized for efficiency. File uploads happen in background threads or processes, so the training loop is not blocked on network I/O. In other words, training can continue with the next batch while the previous checkpoint is still being uploaded.
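The general pattern behind non-blocking uploads can be sketched with a queue and a worker thread from the standard library. This illustrates the idea only and is not how Composer's loggers are implemented; `BackgroundUploader` and the `upload_fn` callback are hypothetical names.

```python
import queue
import threading
from typing import Callable, List


class BackgroundUploader:
    """Runs a (possibly slow) upload function on a worker thread."""

    def __init__(self, upload_fn: Callable[[str, str], None]) -> None:
        self._upload_fn = upload_fn
        self._queue: queue.Queue = queue.Queue()
        self.errors: List[Exception] = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, artifact_name: str, file_path: str) -> None:
        # Returns immediately, so the caller can keep training.
        self._queue.put((artifact_name, file_path))

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            try:
                if item is None:  # sentinel: stop the worker
                    return
                self._upload_fn(*item)
            except Exception as exc:
                # Keep the worker alive and report the failure later.
                self.errors.append(exc)
            finally:
                self._queue.task_done()

    def close(self) -> None:
        # Wait for every queued upload to finish, then stop the worker.
        self._queue.put(None)
        self._queue.join()
        self._worker.join()
```

The training loop would call `enqueue()` after writing each checkpoint and `close()` once at the end of training, then inspect `errors` for any failed uploads.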
Examples#
Below are some examples of how to configure Composer to log artifacts to various backends:
Weights & Biases Artifacts#
See also
The WandBLogger API Reference.
from composer.loggers import WandBLogger
from composer import Trainer
# Configure the logger
logger = WandBLogger(
    log_artifacts=True,  # enable artifact logging
)
# Define the trainer
trainer = Trainer(
    ...,
    loggers=logger,
)
# Train!
trainer.fit()
S3 Objects#
To log artifacts to an S3 bucket, we'll need to configure the ObjectStoreLogger with the S3ObjectStore backend.
See also
The ObjectStoreLogger and S3ObjectStore API Reference.
from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import S3ObjectStore
from composer import Trainer
# Configure the logger
logger = ObjectStoreLogger(
    object_store_cls=S3ObjectStore,
    object_store_kwargs={
        # Keyword arguments for the S3ObjectStore constructor.
        # See the API reference for all available arguments
        'bucket': 'my-bucket-name',
    },
)
# Define the trainer
trainer = Trainer(
    ...,
    loggers=logger,
)
# Train!
trainer.fit()
SFTP Filesystem#
As in the S3 example above, we can log artifacts to a remote SFTP filesystem.
See also
The ObjectStoreLogger and SFTPObjectStore API Reference.
from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import SFTPObjectStore
from composer import Trainer
# Configure the logger
logger = ObjectStoreLogger(
    object_store_cls=SFTPObjectStore,
    object_store_kwargs={
        # Keyword arguments for the SFTPObjectStore constructor.
        # See the API reference for all available arguments
        'host': 'sftp_server.example.com',
    },
)
# Define the trainer
trainer = Trainer(
    ...,
    loggers=logger,
)
# Train!
trainer.fit()