HuggingFaceModel#

class composer.models.HuggingFaceModel(model, tokenizer=None, use_logits=False, metrics=None, eval_metrics=None, shift_labels=None, allow_embedding_resizing=False, peft_config=None, should_save_peft_only=True)[source]#

A wrapper class that converts 🤗 Transformers models to composer models.

Parameters
  • model (Union[PreTrainedModel, peft.PeftModel]) – A 🤗 Transformers model or a PEFT model.

  • tokenizer (PreTrainedTokenizer, optional) โ€“

    The tokenizer used to prepare the dataset. Default None.

    Note

    If the tokenizer is provided, its config will be saved in the composer checkpoint, and it can be reloaded using HuggingFaceModel.hf_from_composer_checkpoint(). If the tokenizer is not provided here, it will not be saved in the composer checkpoint.

  • use_logits (bool, optional) – If True, the model's output logits will be used to calculate validation metrics. Else, metrics will be inferred from the HuggingFaceModel directly. Default: False.

  • metrics (list[Metric], optional) – list of torchmetrics to apply to the output of eval_forward during training. If eval_metrics is None, these will also be used as eval_metrics. Default: None.

  • eval_metrics (list[Metric], optional) – list of torchmetrics to compute on the eval_dataloader, or to be made accessible to Evaluators. Default: None.

  • shift_labels (bool, optional) – If True, the batch's labels will be shifted before being used to calculate metrics. This should be set to True for CausalLM models and False otherwise. If not specified, shift_labels will be set automatically based on the model class name. Default: None.

  • allow_embedding_resizing (bool, optional) – If True, the model's embeddings will be automatically resized when they are smaller than the tokenizer vocab size. Default: False.

  • peft_config (PeftConfig, optional) – Optional PEFT config to apply to the model. If provided, the model will be converted to a PEFT model; see the sketch after the example below. Only LoRA is currently supported.

  • should_save_peft_only (bool, optional) – If True and PEFT is active, the state dict will only contain the PEFT weights, not the frozen base model weights. Default: True.

Note

To ensure correct behavior, set shift_labels manually if using a custom model (i.e. if model is not an instance of a registered 🤗 Transformers class).

Warning

This wrapper is designed to work with 🤗 datasets that define a labels column.

Example:

import transformers
from composer.models import HuggingFaceModel

hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
hf_tokenizer = transformers.AutoTokenizer.from_pretrained('bert-base-uncased')
model = HuggingFaceModel(hf_model, hf_tokenizer)
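
A minimal sketch of passing a peft_config, assuming the peft package is installed; the LoRA hyperparameters and target modules below are illustrative choices for bert-base-uncased, not values required by HuggingFaceModel:

import transformers
from peft import LoraConfig
from composer.models import HuggingFaceModel

# Illustrative LoRA settings; r, lora_alpha, and target_modules are assumptions.
peft_config = LoraConfig(r=8, lora_alpha=16, target_modules=['query', 'value'], task_type='SEQ_CLS')

hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
hf_tokenizer = transformers.AutoTokenizer.from_pretrained('bert-base-uncased')
model = HuggingFaceModel(hf_model, hf_tokenizer, peft_config=peft_config, should_save_peft_only=True)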
generate(input_ids, **kwargs)[source]#

Generate from the underlying HuggingFace model.

Except for pad_token_id, which is optionally read from self.tokenizer, all args are passed along to the transformers.GenerationMixin.generate() function.

Parameters
  • input_ids (torch.Tensor) – Input ids to generate from.

  • **kwargs – Additional arguments passed along to the transformers.GenerationMixin.generate() function.
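
A minimal usage sketch, assuming a causal LM is wrapped; the model name and generation arguments are illustrative:

import transformers
from composer.models import HuggingFaceModel

hf_model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
hf_tokenizer = transformers.AutoTokenizer.from_pretrained('gpt2')
model = HuggingFaceModel(hf_model, hf_tokenizer, shift_labels=True)

batch = hf_tokenizer('Composer wraps Transformers models', return_tensors='pt')
# pad_token_id is read from the wrapped tokenizer when not passed explicitly.
output_ids = model.generate(batch['input_ids'], max_new_tokens=20)
print(hf_tokenizer.batch_decode(output_ids, skip_special_tokens=True))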
static hf_from_composer_checkpoint(checkpoint_path, model_instantiation_class=None, model_config_kwargs=None, local_checkpoint_save_location=None, trust_remote_code=False)[source]#

Loads a HuggingFace model (and tokenizer if present) from a composer checkpoint.

Note

This function does not load the weights from the checkpoint. It just loads the correctly configured model and tokenizer classes.

Example:

hf_model, hf_tokenizer = HuggingFaceModel.hf_from_composer_checkpoint('composer-hf-checkpoint.pt')
# At this point, hf_model is randomly initialized
composer_model = HuggingFaceModel(hf_model, hf_tokenizer)
trainer = Trainer(model=composer_model,
                  train_dataloader=train_dataloader,
                  save_filename='composer-hf-checkpoint-2.pt',
                  max_duration='1ep',
                  save_folder='./',
                  load_path='composer-hf-checkpoint.pt')
# At this point, the weights have been loaded from the composer checkpoint into hf_model
Parameters
  • checkpoint_path (str) – Path to the composer checkpoint. Can be a local path, or a remote path beginning with s3:// or another URI scheme supported by composer.utils.maybe_create_object_store_from_uri().

  • model_instantiation_class (Union[Type[transformers.PreTrainedModel], Type[transformers.AutoModel], str], optional) – Class to use to create the HuggingFace model. Defaults to the model class used in the original checkpoint. If this argument is a HuggingFace auto class (e.g. transformers.AutoModel or transformers.AutoModelForSequenceClassification), the from_config method will be used; if it is of type transformers.PreTrainedModel, the constructor will be called. This argument can also be a string, in which case an import of that class will be attempted.

  • model_config_kwargs (Dict[str, Any]) – Extra arguments to pass in for the model config creation (e.g. num_labels for creating a sequence classification model).

  • local_checkpoint_save_location (Optional[Union[Path, str]], optional) – If specified, where to save the checkpoint file to locally. If the input checkpoint_path is already a local path, this will be a symlink. Defaults to None, which will use a temporary file.

  • trust_remote_code (bool, optional) – Whether to trust the remote code when loading the tokenizer. Defaults to False.

Raises

ValueError – If the model_instantiation_class, or the model class saved in the checkpoint, cannot be imported

Returns

Tuple[transformers.PreTrainedModel, Optional[Union[transformers.PreTrainedTokenizer, transformers.PreTrainedTokenizerFast]]] – The loaded HuggingFace model and (if present) tokenizer
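
A short sketch of overriding the saved model class via model_instantiation_class and model_config_kwargs; the checkpoint path and label count are illustrative:

from composer.models import HuggingFaceModel

# Reload the checkpointed config as a sequence classification model.
hf_model, hf_tokenizer = HuggingFaceModel.hf_from_composer_checkpoint(
    'composer-hf-checkpoint.pt',
    model_instantiation_class='transformers.AutoModelForSequenceClassification',
    model_config_kwargs={'num_labels': 2},
)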

static load_huggingface_model_from_saved_state(hf_state, loaded_state_dict, model_instantiation_class, model_config_kwargs)[source]#

A helper function that instantiates a HuggingFace model class from HuggingFace state loaded out of a Composer checkpoint.

Parameters
  • hf_state (Dict[str, Any]) – HF state loaded from a Composer checkpoint.

  • model_instantiation_class (Union[Type[transformers.PreTrainedModel], Type[transformers.AutoModel], str], optional) – Class to use to create the HuggingFace model. Defaults to the model class used in the original checkpoint. If this argument is a HuggingFace auto class (e.g. transformers.AutoModel or transformers.AutoModelForSequenceClassification), the from_config method will be used; if it is of type transformers.PreTrainedModel, the constructor will be called. This argument can also be a string, in which case an import of that class will be attempted.

  • model_config_kwargs (Dict[str, Any]) – Extra arguments to pass in for the model config creation (e.g. num_labels for creating a sequence classification model).

Returns

transformers.PreTrainedModel – The loaded HuggingFace model
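
A hedged sketch of calling this helper directly; the location of the HF state inside the checkpoint dictionary is an assumption here and may differ between Composer versions:

import torch
from composer.models import HuggingFaceModel

loaded_state_dict = torch.load('composer-hf-checkpoint.pt', map_location='cpu')
# Assumed location of the saved HF metadata; adjust to your checkpoint layout.
hf_state = loaded_state_dict['state']['integrations']['huggingface']

hf_model = HuggingFaceModel.load_huggingface_model_from_saved_state(
    hf_state,
    loaded_state_dict,
    model_instantiation_class='transformers.AutoModelForSequenceClassification',
    model_config_kwargs={'num_labels': 2},
)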

static load_huggingface_tokenizer_from_saved_state(hf_state, trust_remote_code=False, tokenizer_save_dir=None)[source]#

A helper function that loads a HuggingFace tokenizer from HuggingFace state loaded out of a Composer checkpoint.

Parameters
  • hf_state (Dict[str, Any]) – HF state loaded from a Composer checkpoint.

  • trust_remote_code (bool, optional) – Whether to trust the remote code when loading the tokenizer. Defaults to False.

  • tokenizer_save_dir (Optional[str], optional) – If specified, where to save the tokenizer files to locally. If not specified, a folder with a unique suffix will be saved in the current working directory. Defaults to None.

Returns

Optional[transformers.PreTrainedTokenizer | transformers.PreTrainedTokenizerFast] – The loaded HuggingFace tokenizer
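
A companion sketch for the tokenizer, under the same assumption about where the HF state lives in the checkpoint dictionary:

import torch
from composer.models import HuggingFaceModel

loaded_state_dict = torch.load('composer-hf-checkpoint.pt', map_location='cpu')
# Assumed location of the saved HF metadata; adjust to your checkpoint layout.
hf_state = loaded_state_dict['state']['integrations']['huggingface']

hf_tokenizer = HuggingFaceModel.load_huggingface_tokenizer_from_saved_state(hf_state, trust_remote_code=False)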

state_dict(*args, **kwargs)[source]#

Returns the state dict of the model.
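
A brief sketch of inspecting the returned state dict; note that with an active peft_config and should_save_peft_only=True, it is expected to contain only the PEFT adapter weights (this follows from the should_save_peft_only parameter described above):

import transformers
from composer.models import HuggingFaceModel

hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
model = HuggingFaceModel(hf_model)

# Inspect a few of the returned state dict keys.
sd = model.state_dict()
print(list(sd)[:3])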