0.4

Looking for the latest release notes? See v0.5.x

0.4.17

Fix a bug in first-time mcli set api-key calls

0.4.16

Adds a default max_retries value of 10 when watchdog is enabled
Patches a small bug in the describe display column ordering for a failed run
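
As a rough illustration of what a retry cap means for a watchdog, the sketch below resubmits a failing job until it succeeds or the cap is reached. It is not mcli's implementation: run_with_retries and submit are hypothetical names, and only the default of 10 comes from the note above. Whether mcli counts the first submission toward the limit is not stated here; this sketch assumes it does not.

```python
from typing import Callable


def run_with_retries(submit: Callable[[], bool], max_retries: int = 10) -> int:
    """Call submit() until it returns True or the retry cap is hit.

    Returns the number of retries used (0 means the first attempt succeeded).
    Raises RuntimeError once max_retries resubmissions have all failed.
    """
    retries = 0
    while not submit():
        if retries >= max_retries:
            raise RuntimeError(f"gave up after {max_retries} retries")
        retries += 1  # resubmit on the next loop iteration
    return retries


# Hypothetical job that fails twice and then succeeds.
outcomes = iter([False, False, True])
retries_used = run_with_retries(lambda: next(outcomes))
print(retries_used)
```

With the default cap, a job that never succeeds would be submitted once and then retried ten times before the error is raised.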

0.4.15

Adds 🐕 to runs in CLI display for runs with watchdog turned on
Add max retry checking to watchdog

0.4.14

Patch update_run import bug
Add connect functionality to mcli interactive

0.4.13

New create_interactive_run API to launch interactive runs
Better error handling for inference deployments
Small improvements to documentation

0.4.12

Update run functionality via Run.update(..), update_run('name', ...), and mcli update run
Additional deployment update functionality via mcli update deployment
Improved ping and predict error handling
Run resumption override support, e.g. mcli run -r run-name --priority low
Documentation updates

0.4.11

Revamp of mcli describe run: now shows detailed information about run resumptions
mcli watchdog command to support automatic run resubmission on failure
mcli get deployments command shows inference deployment replica count

0.4.10

Support for a compute field in inference deployments to allow selecting specific GPU instances
Dynamic batching config for inference deployments
Updated docs for inference deployments

0.4.9

Split clusters by submission type (training or inference)
Attempts renamed to resumptions

0.4.8

Schema validation JSON for run yamls
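
To illustrate what validating a run YAML against a JSON schema buys you, here is a minimal sketch using only the standard library. The schema is a hypothetical, heavily trimmed stand-in (not the schema shipped with mcli), validate_run_config is an illustrative helper that checks only required keys and primitive types, and the config is written as an already-parsed dict rather than loaded from YAML.

```python
import json

# Hypothetical, trimmed-down schema for illustration only.
RUN_SCHEMA = json.loads("""
{
  "type": "object",
  "required": ["name", "image", "command"],
  "properties": {
    "name": {"type": "string"},
    "image": {"type": "string"},
    "command": {"type": "string"}
  }
}
""")

# Maps JSON Schema primitive type names to Python types.
JSON_TYPES = {"string": str, "object": dict}


def validate_run_config(config: dict, schema: dict = RUN_SCHEMA) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = [f"missing required field: {key}"
                for key in schema.get("required", []) if key not in config]
    for key, rule in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(rule.get("type"))
        if key in config and expected and not isinstance(config[key], expected):
            problems.append(f"{key} should be of type {rule['type']}")
    return problems


# A config with a wrongly typed image and no command.
problems = validate_run_config({"name": "my-run", "image": 123})
print(problems)
```

A full JSON Schema validator covers far more (nested objects, enums, formats); the point here is only that mistakes surface before a run is submitted.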

0.4.7

Add SSL certificate warning for macOS

0.4.6

Allow inputting just the deployment name for ping and predict commands

0.4.5

Support retrieving logs for all run attempts

0.4.4

Added SDK support for updating an InferenceDeployment

mcli.update_inference_deployment("foo", {"replicas": 2})

0.4.3

mcli logs defaults to the first failed rank if the run has failed
New singleton inference SDK functions: get_inference_deployment, delete_inference_deployment
Adds new methods to InferenceDeployment

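from mcli import get_inference_deployment

d = get_inference_deployment('foo')

# refresh() returns the deployment with up-to-date state
print(f'{d.status} before')
d = d.refresh()
print(f'{d.status} after')

status = d.ping()
# inputs: the request payload for your model, defined elsewhere
output = d.predict(inputs)

d.delete()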

0.4.2

What’s Changed

Small patch to fix breaking change to mcli get deployment logs in 0.4.1

0.4.1

Backwards breaking changes

We are replacing mcli init-kube with mcli kube get-config and mcli kube merge-config

What’s Changed

Ping: Don’t throw error with empty content
--name is not required to delete deployment
Fix ping to actually return status code

0.4.0

Backwards breaking changes

Deprecation of LEGACY mode and all associated code and dependencies (including kubernetes!) - removes 16,500 lines of code from mcli! 🔥🔥
Remove positional arguments for SDK filters

get_runs(["run-name1", "run-name2"], ["cluster1", "cluster2"])
# TypeError: get_runs() takes from 0 to 1 positional arguments but 2 were given

get_runs(["run-name1", "run-name2"], cluster_names=["cluster1", "cluster2"])
# OK! 🙆‍♀️
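
The behaviour shown above matches what Python's keyword-only parameters produce. The stand-in below is a hypothetical function that reuses the name get_runs; it is not mcli's actual signature and only demonstrates how one positional filter is accepted while a second is rejected.

```python
from typing import List, Optional


def get_runs(runs: Optional[List[str]] = None, *,
             cluster_names: Optional[List[str]] = None) -> dict:
    """Stand-in that just echoes its filters back."""
    return {"runs": runs, "cluster_names": cluster_names}


try:
    get_runs(["run-name1"], ["cluster1"])  # second filter passed positionally
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Passing the second filter by keyword works.
filters = get_runs(["run-name1"], cluster_names=["cluster1"])
print(filters)
```

Everything after the bare * in the signature must be passed by name, which is why existing calls that rely on argument order need updating.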

Summary of changes since 0.3.0

MCLI can now be imported directly:

from mcli import get_runs, ...

from mcli.sdk import get_runs # don’t worry, this still works!

MosaicML Inference!
Run resumption and preemption
Run metadata
Shared runs
Better describe run data and cluster specifications