Evaluator#
- class composer.Evaluator(*, label, dataloader, metric_names=None, subset_num_batches=None, eval_interval=None, device_eval_microbatch_size=None)[source]#
A wrapper for a dataloader to include metrics that apply to a specific dataset.
For example, the CrossEntropyLoss metric for NLP models.

>>> eval_evaluator = Evaluator(
...     label='myEvaluator',
...     dataloader=eval_dataloader,
...     metric_names=['MulticlassAccuracy']
... )
>>> trainer = Trainer(
...     model=model,
...     train_dataloader=train_dataloader,
...     eval_dataloader=eval_evaluator,
...     optimizers=optimizer,
...     max_duration='1ep',
... )
- Parameters
label (str) – Name of the Evaluator.
dataloader (DataSpec | Iterable | Dict[str, Any]) – Iterable that yields batches, a DataSpec for evaluation, or a Dict of DataSpec kwargs.
metric_names – The list of metric names to compute. Each value in this list can be a regex string (e.g. 'MulticlassAccuracy', 'f1' for 'BinaryF1Score', 'Top-.' for 'Top-1', 'Top-2', etc.). Each regex string will be matched against the keys of the dictionary returned by model.get_metrics(). All matching metrics will be evaluated. By default, if left blank, all metrics returned by model.get_metrics() will be used.
subset_num_batches (int, optional) – The maximum number of batches to use for each evaluation. Defaults to None, which means that the eval_subset_num_batches parameter from the Trainer will be used. Set to -1 to evaluate the entire dataloader.
eval_interval (Time | int | str | (State, Event) -> bool, optional) – An integer (interpreted as epochs), a str (e.g. 1ep or 10ba), a Time object, or a callable. Defaults to None, which means that the eval_interval parameter from the Trainer will be used. If an integer (in epochs), Time string, or Time instance, the evaluator will be run with this frequency. Time strings or Time instances must have units of TimeUnit.BATCH or TimeUnit.EPOCH. Set to 0 to disable evaluation. If a callable, it should take two arguments (State, Event) and return a bool representing whether the evaluator should be invoked. The event will be either Event.BATCH_END or Event.EPOCH_END. When specifying eval_interval, the evaluator(s) are also run at Event.FIT_END if it doesn't evenly divide the training duration.
device_eval_microbatch_size (int, optional) – The number of samples to use for each microbatch when evaluating. If set to 'auto', dynamically decreases device_eval_microbatch_size if a microbatch is too large for the GPU. If None, sets device_eval_microbatch_size to the per-rank batch size. (default: None)
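The regex matching of metric_names can be sketched as follows. This is an illustration only, not Composer's implementation: the metric keys below are hypothetical stand-ins for model.get_metrics().keys(), and the case-insensitive re.search matching is an assumption chosen so that 'f1' selects 'BinaryF1Score' as the docs describe.

```python
import re

# Hypothetical keys, standing in for model.get_metrics().keys().
available_metrics = ['MulticlassAccuracy', 'BinaryF1Score',
                     'Top-1', 'Top-2', 'CrossEntropyLoss']

# Regex patterns, as they would be passed via metric_names.
metric_names = ['f1', 'Top-.']

# Every pattern is matched against every metric key; all matches are kept.
selected = [name for name in available_metrics
            if any(re.search(pattern, name, re.IGNORECASE)
                   for pattern in metric_names)]
print(selected)  # ['BinaryF1Score', 'Top-1', 'Top-2']
```

'f1' matches 'BinaryF1Score' and 'Top-.' matches both 'Top-1' and 'Top-2', so those three metrics would be evaluated.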
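A minimal sketch of a callable eval_interval, assuming only the (State, Event) -> bool signature documented above; the state.timestamp.batch attribute used here is an assumption about the State interface:

```python
# Run evaluation every 500 batches. Composer invokes an eval_interval
# callable at Event.BATCH_END and Event.EPOCH_END; returning True
# triggers the evaluator.
def eval_every_500_batches(state, event):
    return int(state.timestamp.batch) % 500 == 0
```

Such a callable would then be passed as Evaluator(..., eval_interval=eval_every_500_batches).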