--power flag unusually slow in some cases #465

Description

@omnomburp

When using the --power flag I saw extremely slow decode speeds, around 0.11 tok/s.

I tried a very naive fix: adding a warmup period to get a more reliable average speed, plus a guard against outliers. That seemed to solve the issue.

This was on an M5 Max (128 GB), running ./ds4 -c 30000 --power 50:

ds4: metal graph token pos=96 encode=6.150 ms execute=3499.725 ms read=0.010 ms total=3505.885 ms logits=1
ds4: metal graph token pos=97 encode=5.487 ms execute=1018.332 ms read=0.008 ms total=1023.827 ms logits=1
ds4: metal graph token pos=98 encode=5.848 ms execute=1023.379 ms read=0.008 ms total=1029.235 ms logits=1
ds4: prefill: 25.39 t/s, generation: 0.24 t/s
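
For illustration only, here is a rough sketch of the "warmup plus outlier guard" idea for estimating decode speed from per-token timings. It is not the ds4 code or the actual patch: the function name `decode_speed`, the `warmup` and `outlier_factor` parameters, and the median-based cutoff are all assumptions made for this example.

```python
from statistics import median


def decode_speed(token_ms, warmup=2, outlier_factor=3.0):
    """Estimate tokens/s from per-token latencies given in milliseconds.

    Hypothetical helper, not part of ds4:
    - the first `warmup` samples are ignored entirely;
    - remaining samples slower than `outlier_factor` times the median
      are dropped before averaging.
    Returns None when there is nothing usable to average.
    """
    samples = list(token_ms)[warmup:]
    if not samples:
        return None

    # Outlier guard: keep only samples within a multiple of the median.
    cutoff = outlier_factor * median(samples)
    kept = [t for t in samples if t <= cutoff]
    if not kept:
        return None

    mean_ms = sum(kept) / len(kept)
    if mean_ms <= 0:
        return None
    return 1000.0 / mean_ms


# The three "total" values from the log above, in milliseconds.
totals = [3505.885, 1023.827, 1029.235]
print(decode_speed(totals, warmup=1))  # the slow first token is skipped
print(decode_speed(totals, warmup=0))  # the slow first token is dropped as an outlier
```

With the three totals from the log, both calls average only the two ~1 s tokens and report roughly 0.97 t/s, whereas a plain mean over all three would give about 0.54 t/s. The cutoff of 3x the median is an arbitrary choice for the sketch; a real estimator would need tuning against actual runs.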

#464

^ for your reference @antirez
