Bug report in otb project #77

In UnderThinkingBench, some items have no `source_dataset` key in their metadata. They have a `source` key instead, with the values `aime` and `hmmt`, and appear to belong to the UnderThinkingBench-Math subset.

UnderThinkingBench entry:
https://github.com/facebookresearch/RAM/blob/264c0479aa2df0ea697a0fadbb10356d1e4ba41f/projects/otb/eval.py#L39-L40

The error is encountered here, where `eval_underthink` reads `source_dataset` from the metadata:
https://github.com/facebookresearch/RAM/blob/264c0479aa2df0ea697a0fadbb10356d1e4ba41f/projects/otb/evals/underthink_eval.py#L33-L34
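
A minimal reproduction sketch of the failure, assuming the metadata is stored as a JSON string, as the linked line suggests. The row contents and the helper name `read_source_dataset` are made up for illustration. Only the `source` and `source_dataset` key names come from this report.

```python
import json

# Hypothetical rows: only the key names are taken from the report.
puzzle_row = {"metadata": json.dumps({"source_dataset": "some_puzzle"})}
math_row = {"metadata": json.dumps({"source": "aime"})}


def read_source_dataset(row):
    # Mirrors the unconditional lookup done in eval_underthink.
    return json.loads(row["metadata"])["source_dataset"]


# A row with "source_dataset" is read without problems.
print(read_source_dataset(puzzle_row))

# A row that only carries "source" fails with a KeyError.
try:
    read_source_dataset(math_row)
except KeyError as exc:
    print(f"KeyError: {exc}")
```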

Possible solution in `eval.py`:
```python
import json

# ...

elif subset == "underthinking-bench" or "underthinking" in subset:
    metadata = json.loads(row["metadata"])
    # Math items carry "source" (aime/hmmt) instead of "source_dataset".
    if metadata.get("source") in ("aime", "hmmt"):
        acc = eval_math(row, tokenizer, model_name)
    else:
        acc = eval_underthink(row)
```
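
The same routing idea can be tried outside the repository with a standalone sketch. The names `MATH_SOURCES`, `is_math_item`, and `dispatch` are hypothetical, and the two evaluators are passed in as plain callables so the snippet runs without the project's `eval_math` and `eval_underthink`. It is an illustration of the proposed check, not the project's implementation.

```python
import json

# Assumption: these are the only "source" values that mark math items.
MATH_SOURCES = ("aime", "hmmt")


def is_math_item(row):
    """Return True when the row's metadata marks it as an AIME/HMMT item."""
    metadata = json.loads(row["metadata"])
    return metadata.get("source") in MATH_SOURCES


def dispatch(row, eval_math, eval_underthink):
    # Route math items to the math evaluator, everything else to the
    # puzzle evaluator that expects "source_dataset".
    if is_math_item(row):
        return eval_math(row)
    return eval_underthink(row)


# Stub evaluators standing in for the real ones.
rows = [
    {"metadata": json.dumps({"source": "hmmt"})},
    {"metadata": json.dumps({"source_dataset": "some_puzzle"})},
]
for row in rows:
    print(dispatch(row, lambda r: "math", lambda r: "underthink"))
```

In the real code, `eval_math` takes extra arguments (`tokenizer`, `model_name`), so it would need to be bound first, for example with `functools.partial`.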

Referenced code, `eval.py` L39-L40:

	elif subset == "underthinking-bench" or "underthinking" in subset:
	    acc = eval_underthink(row)

Referenced code, `evals/underthink_eval.py` L33-L34:

	def eval_underthink(row, find_last_box: bool = False) -> float:
	    puzzle = json.loads(row["metadata"])["source_dataset"]
