Reproducing Qwen2.5-3B performance #5

@zhuqiangLu

Hi there,

I just tried running the LV-Eval 128k subset on the Qwen2.5-3B-Instruct model (AHN not enabled). The scores are far lower than the numbers reported in the paper.

A screenshot of the results is attached below. I ran the evaluation on a single A6000 Ada GPU (48 GB VRAM) in single-process mode. Could you please share the model predictions (the JSON files) so I can work out what went wrong?

Image
