Git

The git repo integration clones a git repo into the working directory of your run’s execution environment. It comes with a number of configurable options (see below) for cloning and setting up the repo.

To include the git repo integration, use integration_type: git_repo. You can include any number of these in your YAML.

Prerequisite: Git SSH Secret Setup

In order to clone from private repositories you will have to set up an SSH key that gives the run clone access to the repo. Follow the steps for creating a git-ssh secret on the SSH Secrets Page to set up a git SSH secret.

Required Parameters

The only required parameter in the Git Repos integration is the git_repo field, which corresponds to the repo name in <organization>/<name> format.

integrations:
  - integration_type: git_repo
    git_repo: mosaicml/composer

Optional Parameters

Optional parameters of the Git Repos Integration configure how the repo is cloned and installed.

Optional parameters include: git_branch, git_commit, path, ssh_clone, pip_install and host.

git_branch (str): Clone the repo with a specific branch checked out. Default: the repo default branch.

git_commit (str): Clone the repo with a specific commit checked out. Default: the latest commit on the checked-out branch.

path (str): Clone the repo to a specific path inside of the image. (Note: by default git clones into a directory named after the repo within the image’s working directory, e.g. composer if git_repo is mosaicml/composer.) Specifying this value is equivalent to running git clone <repo url> <path>. Default: the repo name.

ssh_clone (bool): Use SSH keys to clone the git repo. To clone over HTTPS instead, set ssh_clone: false. Default: true.

host (str): The hostname for the git repo. Default: github.com.

Git SSH Secret

Note that to properly clone private repos with SSH you will need an SSH Secret set.

pip_install (str): Pip install the cloned repo. The value of this field is used in pip install <value>. Default: no pip install.
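
The options above can also be combined to pin a commit and clone over HTTPS from a non-default host. The following is a minimal sketch only: the repo name myorg/myrepo, the host gitlab.com, and the commit hash are hypothetical placeholders, not values from a real project. How git_commit interacts with git_branch is not described here, so the sketch sets only git_commit.

```yaml
integrations:
  - integration_type: git_repo

    # Hypothetical repo, for illustration only
    git_repo: myorg/myrepo

    # Hypothetical host (optional, default = github.com)
    host: gitlab.com

    # Clone over HTTPS instead of SSH (optional, default = true)
    ssh_clone: false

    # Placeholder commit hash to check out; substitute a real one
    git_commit: "0123456789abcdef0123456789abcdef01234567"
```

The hash is quoted so that YAML always reads it as a string, even for an abbreviated hash that happens to consist only of digits.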

Example

integrations:
  - integration_type: git_repo

    # github.com/mosaicml/composer
    git_repo: mosaicml/composer

    # The git branch to checkout (optional, default = the repo default branch)
    git_branch: my-branch

    # Clone to /workspace/my_composer_clone (optional, default = the repo name)
    path: /workspace/my_composer_clone

    # Use SSH Keys to clone (optional, default = True)
    ssh_clone: True

    # pip install command for the repo (optional, default = None)
    pip_install: -e .[all]

    # host for the git repo (defaults to github.com)
    host: github.com

The above settings are equivalent to:

> git clone git@github.com:mosaicml/composer.git -b my-branch /workspace/my_composer_clone
> cd my_composer_clone
> pip install -e .[all]
> cd ..

Example: Multi-Repo Install

One common use case for the Git repo integration is to clone multiple repos in the same environment.

To do this, add multiple git_repo integrations.

integrations:
  - integration_type: git_repo
    git_repo: mosaicml/composer
    path: /workspace/composer
    git_branch: v0.8.1
    pip_install: -e .[all]
  - integration_type: git_repo
    git_repo: facebookresearch/xformers
    path: /workspace/xformers
    pip_install: -e .
  - integration_type: git_repo
    git_repo: myuser/privaterepo
    path: /workspace/myrepo

In the above example, we clone and set up three different repos, each with different instructions. Because we set the file paths explicitly with the Git integration path option, the resulting file structure looks like:

workspace
├── composer
├── myrepo
└── xformers

where workspace is the working directory of the container being used for the run.

Checking Out Branches

Note that the mosaicml/composer repo is checked out at v0.8.1 and installed from that version with pip install -e .[all].
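
For contrast, here is a sketch of a similar multi-repo setup with path omitted. Going by the path default described under Optional Parameters, each repo would then be cloned into a directory named after the repo inside the working directory (composer and xformers). That layout is inferred from the documented default rather than taken from the example above.

```yaml
integrations:
  # No path given: expected to clone into ./composer (the repo name)
  - integration_type: git_repo
    git_repo: mosaicml/composer
    pip_install: -e .[all]
  # No path given: expected to clone into ./xformers (the repo name)
  - integration_type: git_repo
    git_repo: facebookresearch/xformers
    pip_install: -e .
```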