HuggingFaceModel#
- class composer.models.HuggingFaceModel(model, tokenizer=None, use_logits=False, metrics=None, eval_metrics=None, shift_labels=None, allow_embedding_resizing=False)[source]#
A wrapper class that converts 🤗 Transformers models to composer models.
- Parameters
model (PreTrainedModel) – A 🤗 Transformers model.
tokenizer (PreTrainedTokenizer, optional) – The tokenizer used to prepare the dataset. Default: None.
Note
If the tokenizer is provided, its config will be saved in the composer checkpoint, and it can be reloaded using HuggingFaceModel.hf_from_composer_checkpoint(). If the tokenizer is not provided here, it will not be saved in the composer checkpoint.
use_logits (bool, optional) – If True, the model's output logits will be used to calculate validation metrics. Else, metrics will be inferred from the HuggingFaceModel directly. Default: False.
metrics (list[Metric], optional) – List of torchmetrics to apply to the output of eval_forward during training. If eval_metrics is None, these will also be used as eval_metrics. Default: None.
eval_metrics (list[Metric], optional) – List of torchmetrics to compute on the eval_dataloader, or to be accessible to Evaluators. Default: None.
shift_labels (bool, optional) – If True, the batch's labels will be shifted before being used to calculate metrics. This should be set to True for CausalLM models and False otherwise. If not specified, shift_labels will be set automatically based on the model class name. Default: None.
allow_embedding_resizing (bool, optional) – If True, the model's embeddings will be automatically resized when they are smaller than the tokenizer vocab size. Default: False.
Note
To ensure correct behavior, set shift_labels manually if using a custom model (i.e. if model is not an instance of a registered 🤗 Transformers class).
Warning
This wrapper is designed to work with 🤗 datasets that define a labels column.
Example:
import transformers
from composer.models import HuggingFaceModel

hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
hf_tokenizer = transformers.AutoTokenizer.from_pretrained('bert-base-uncased')
model = HuggingFaceModel(hf_model, hf_tokenizer)
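To track metrics during training and evaluation, the wrapper can also be constructed with use_logits and a list of torchmetrics. A minimal sketch, assuming a recent torchmetrics release; the accuracy metric and its settings are illustrative, not prescribed by the API:

import transformers
from torchmetrics.classification import MulticlassAccuracy
from composer.models import HuggingFaceModel

hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
hf_tokenizer = transformers.AutoTokenizer.from_pretrained('bert-base-uncased')

# use_logits=True feeds the model's output logits to the metrics.
# Since eval_metrics is not given, the same metrics are reused for evaluation.
model = HuggingFaceModel(
    hf_model,
    tokenizer=hf_tokenizer,
    use_logits=True,
    metrics=[MulticlassAccuracy(num_classes=2, average='micro')],
)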
- generate(input_ids, **kwargs)[source]#
Generate from the underlying HuggingFace model.
Except for pad_token_id, which is optionally read from self.tokenizer, all args are passed along to the transformers.GenerationMixin.generate() function (see the usage sketch below).
- Parameters
input_ids (Tensor) – Input ids to generate from.
**kwargs – Additional arguments passed to the transformers.GenerationMixin.generate() function. See transformers.GenerationConfig for all available arguments.
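A minimal usage sketch, assuming a causal LM wrapped in HuggingFaceModel; the model name, prompt, and generation settings here are illustrative:

import transformers
from composer.models import HuggingFaceModel

hf_model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
hf_tokenizer = transformers.AutoTokenizer.from_pretrained('gpt2')
model = HuggingFaceModel(hf_model, hf_tokenizer)

# Tokenize a prompt and generate; extra kwargs are forwarded to
# transformers.GenerationMixin.generate().
inputs = hf_tokenizer('Composer is', return_tensors='pt')
output_ids = model.generate(inputs['input_ids'], max_new_tokens=20, do_sample=False)
print(hf_tokenizer.decode(output_ids[0], skip_special_tokens=True))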
- static hf_from_composer_checkpoint(checkpoint_path, model_instantiation_class=None, model_config_kwargs=None, local_checkpoint_save_location=None)[source]#
Loads a HuggingFace model (and tokenizer if present) from a composer checkpoint.
Note
This function does not load the weights from the checkpoint. It just loads the correctly configured model and tokenizer classes.
Example:
hf_model, hf_tokenizer = HuggingFaceModel.hf_from_composer_checkpoint('composer-hf-checkpoint.pt')
# At this point, hf_model is randomly initialized
composer_model = HuggingFaceModel(hf_model, hf_tokenizer)
trainer = Trainer(model=composer_model,
                  train_dataloader=train_dataloader,
                  save_filename='composer-hf-checkpoint-2.pt',
                  max_duration='1ep',
                  save_folder='./',
                  load_path='composer-hf-checkpoint.pt')
# At this point, the weights have been loaded from the composer checkpoint into hf_model
- Parameters
checkpoint_path (str) – Path to the composer checkpoint, can be a local path, or a remote path beginning with s3://, or another backend supported by composer.utils.maybe_create_object_store_from_uri().
model_instantiation_class (Union[Type[transformers.PreTrainedModel], Type[transformers.AutoModel], str], optional) – Class to use to create the HuggingFace model. Defaults to the model class used in the original checkpoint. If this argument is a HuggingFace auto class (e.g. transformers.AutoModel or transformers.AutoModelForSequenceClassification), the from_config method will be used, while if it is of type transformers.PreTrainedModel, the constructor will be called. This argument can also be a string, which will attempt to be imported as the class to use.
model_config_kwargs (Dict[str, Any]) – Extra arguments to pass in for the model config creation (e.g. num_labels for creating a sequence classification model).
local_checkpoint_save_location (Optional[Union[Path, str]], optional) – If specified, where to save the checkpoint file to locally. If the input checkpoint_path is already a local path, this will be a symlink. Defaults to None, which will use a temporary file.
- Raises
ValueError – If the model_instantiation_class, or the model class saved in the checkpoint, is not able to be imported.
- Returns
Tuple[transformers.PreTrainedModel, Optional[transformers.PreTrainedTokenizer]] – The loaded HuggingFace model and (if present) tokenizer.
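As a sketch of the optional arguments (the checkpoint filename and num_labels value are illustrative, not part of the API), the model class and config can be overridden while loading:

import transformers
from composer.models import HuggingFaceModel

# Rebuild the checkpointed model as a sequence classification model with two labels.
# Weights are not loaded here; pass the checkpoint to Trainer(load_path=...) for that.
hf_model, hf_tokenizer = HuggingFaceModel.hf_from_composer_checkpoint(
    'composer-hf-checkpoint.pt',
    model_instantiation_class=transformers.AutoModelForSequenceClassification,
    model_config_kwargs={'num_labels': 2},
)
composer_model = HuggingFaceModel(hf_model, hf_tokenizer)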