InContextLearningQATaskDataset#

class composer.datasets.InContextLearningQATaskDataset(cot_delimiter='', early_stopping_criteria=None, do_normalization=True, *args, **kwargs)[source]#

A dataset that constructs batches for in-context learning question answering evaluation. QA tasks evaluate a model's ability to answer questions using a consistent format.

The input format is expected to be a jsonl file with the following fields:

- context: The question
- answer: The preferred answer to the question
- aliases: A list of aliases for the answer
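The schema above can be illustrated with a round trip through the `json` module. The field values below are made up for illustration; only the three field names come from the documented format.

```python
import json

# One record per jsonl line; field names match the documented schema.
# The values here are illustrative, not from a real dataset.
record = {
    "context": "What is the capital of France?",
    "answer": "Paris",
    "aliases": ["Paris", "Paris, France"],
}
line = json.dumps(record)

# Round-trip one line to confirm the required fields are present.
parsed = json.loads(line)
print(sorted(parsed.keys()))
```

A full dataset file would contain one such JSON object per line.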

See InContextLearningDataset for more details.

Additional Args:

cot_delimiter (str): Delimiter to place between the chain of thought and continuations.

get_answer_from_example(example, in_context=False)[source]#

Returns the answer from the example. Applies chain of thought if self.has_cot is marked as true.

Parameters

example (Dict) – The example from which to retrieve the answer

Returns

str – The answer from the example, with chain of thought and delimiter prepended if needed
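The return behavior can be sketched as follows. This is a minimal stand-in, not the library's implementation; the `chain_of_thought` field name and the standalone-function form are assumptions made for illustration, while `cot_delimiter` and the has-cot flag mirror the documented attributes.

```python
def get_answer_with_cot(example, cot_delimiter="", has_cot=False):
    """Sketch of the documented behavior: when chain of thought is
    enabled, join it to the answer with the delimiter; otherwise
    return the answer as-is. The 'chain_of_thought' field name is an
    assumption, not part of the documented schema."""
    answer = example["answer"]
    if has_cot:
        return f"{example['chain_of_thought']}{cot_delimiter}{answer}"
    return answer

# Without chain of thought, the answer is returned unchanged.
plain = get_answer_with_cot({"answer": "Paris"})

# With chain of thought, the delimiter sits between reasoning and answer.
with_cot = get_answer_with_cot(
    {"answer": "Paris", "chain_of_thought": "France's capital is Paris."},
    cot_delimiter=" ### ",
    has_cot=True,
)
```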

tokenize_example(prompt_and_fewshot, ctxt, example)[source]#

Run text through the tokenizer and handle special cases.

Parameters

prompt_and_fewshot (str) – The collection of the prompt and fewshot examples that belongs before the example's context

ctxt (str) – The specific example's derived context

example (Dict) – The example as a dictionary

Returns

Dict – Dictionary with the tokenized data
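The core assembly step can be sketched as below: the prompt and fewshot examples are placed before the example's context, and the combined string is run through the tokenizer. The function name, the returned key, and the whitespace tokenizer are stand-ins chosen for a self-contained illustration, not the library's actual internals.

```python
def tokenize_example_sketch(prompt_and_fewshot, ctxt, tokenize):
    """Sketch: concatenate the prompt/fewshot text with the example's
    context, then tokenize the result. `tokenize` stands in for a real
    tokenizer's encode method; the 'context' key is an assumption."""
    full_context = prompt_and_fewshot + ctxt
    return {"context": tokenize(full_context)}

# A toy whitespace tokenizer as a stand-in for a real subword tokenizer.
def toy_tokenize(text):
    return text.split()

tokenized = tokenize_example_sketch(
    "Answer the question.\nQ: 2+2? A: 4\n",  # prompt and one fewshot example
    "Q: 3+3? A:",                            # the example's own context
    toy_tokenize,
)
```

A real tokenizer would return token IDs rather than strings, but the ordering guarantee (prompt and fewshot text strictly before the context) is the same.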