InContextLearningQAAccuracy#

class composer.metrics.InContextLearningQAAccuracy(dist_sync_on_step=False)[source]#

Computes accuracy for In-context learning (ICL) question answering (QA) tasks.

ICL QA tasks consist of some number of example question answering tasks (referred to as the 'context'), followed by a test task where the model must match one of the possible answer aliases (referred to as the 'continuation').

For example, the model may be provided the context below and evaluated on its ability to correctly predict the continuation.

Context: `Question: Who was president of the United States in 2012?\nAnswer: Barack Obama\nQuestion: Is water wet?\nAnswer: `

Continuation: [`yes`, `no`]

Both predictions and answers will be normalized before comparison.

Adds metric state variables:

- correct (float): The number of instances where the prediction was a prefix for any of the answer aliases.
- total (float): The number of total instances that were predicted.
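A minimal sketch of how these state variables could be updated, assuming the prefix-match rule described above; `update_state` and `simple_normalize` are hypothetical helpers, not part of the Composer API, and `simple_normalize` stands in for the metric's fuller `normalize_answer`:

```python
def simple_normalize(text):
    # Stand-in for normalize_answer: lowercase and collapse whitespace.
    # The real method also strips punctuation and articles.
    return ' '.join(text.lower().split())

def update_state(state, prediction, answer_aliases):
    """Update a dict with 'correct' and 'total' counters for one sample."""
    pred = simple_normalize(prediction)
    aliases = [simple_normalize(a) for a in answer_aliases]
    # Count as correct when the (non-empty) normalized prediction is a
    # prefix of any normalized answer alias.
    if pred and any(alias.startswith(pred) for alias in aliases):
        state['correct'] += 1.0
    state['total'] += 1.0
    return state
```

Accuracy would then be `state['correct'] / state['total']` at compute time.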

Parameters

dist_sync_on_step (bool, optional) – Synchronize metric state across processes at each forward() before returning the value at the step. Default: False.

normalize_answer(answer)[source]#

Lowercase text and remove punctuation, articles, and extra whitespace.

Copied from https://github.com/mandarjoshi90/triviaqa/blob/master/evaluation/triviaqa_evaluation.py
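The referenced TriviaQA evaluation script uses the standard SQuAD-style answer normalization; a sketch consistent with the description above (lowercase, strip punctuation, drop the articles "a"/"an"/"the", collapse whitespace):

```python
import re
import string

def normalize_answer(answer):
    """Lowercase text and remove punctuation, articles, and extra whitespace."""
    def remove_articles(text):
        # Drop standalone articles; \b keeps words like "theater" intact.
        return re.sub(r'\b(a|an|the)\b', ' ', text)

    def white_space_fix(text):
        return ' '.join(text.split())

    def remove_punc(text):
        return ''.join(ch for ch in text if ch not in set(string.punctuation))

    return white_space_fix(remove_articles(remove_punc(answer.lower())))
```

Because both predictions and aliases pass through this function, superficial differences in casing, punctuation, or spacing do not affect the accuracy count.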