
Question about the evaluation of OpenBookQA #2610

Open
xumingyu2021 opened this issue Jan 6, 2025 · 0 comments
Labels
asking questions For asking for clarification / support on library usage.

Comments

@xumingyu2021

It seems that the fact is not provided when generating an answer. Since this is an open-book QA task, it may be better to include the fact; otherwise it effectively becomes "closed-book QA".
https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/openbookqa/openbookqa.yaml

task: openbookqa
dataset_path: openbookqa
dataset_name: main #here maybe additional
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
doc_to_text: question_stem #here may need use fact1
doc_to_target: "{{choices.label.index(answerKey.lstrip())}}"
doc_to_choice: "{{choices.text}}"
should_decontaminate: true
doc_to_decontamination_query: question_stem
metric_list:
  - metric: acc
    aggregation: mean
    higher_is_better: true
  - metric: acc_norm
    aggregation: mean
    higher_is_better: true
metadata:
  version: 1.0
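For illustration, a hypothetical variant of the config that prepends the fact to the prompt might look like the sketch below. It assumes the dataset's `additional` configuration exposes a `fact1` field (this is an assumption; it has not been verified against the harness or the Hub dataset):

```yaml
# Hypothetical sketch, not a tested config.
# Assumes the "additional" config of the openbookqa dataset provides fact1.
task: openbookqa_with_fact
dataset_path: openbookqa
dataset_name: additional
output_type: multiple_choice
training_split: train
validation_split: validation
test_split: test
# Prepend the supporting fact to the question stem.
doc_to_text: "{{fact1}}. {{question_stem}}"
doc_to_target: "{{choices.label.index(answerKey.lstrip())}}"
doc_to_choice: "{{choices.text}}"
```

Whether this matches the intended evaluation protocol (open-book with the gold fact, versus closed-book) is exactly the question raised above.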
@baberabb baberabb added the asking questions For asking for clarification / support on library usage. label Jan 7, 2025