deps: bump llama.cpp correctness backports #453

Merged: davide221 merged 1 commit into main from deps/llama-cuda-correctness-backports on Jun 25, 2026

@davide221 davide221 commented Jun 25, 2026

Contributor

Follow-up hub bump for Luce-Org/llama.cpp-dflash-ggml#21.

This updates server/deps/llama.cpp from the RDNA3.5 MMQ override merge to the
current luce-dflash head:

  • old: 6fd3d84e2168476d5e199a2fa1221d82ba883c21
  • new: 30c9d7dac8fee3c6c4cb1dddf7b97c0355cab00f

Included llama.cpp changes:

  • CUDA/HIP flash-attention MMA mask offset overflow fix.
  • CUDA GGML_OP_REPEAT support check restricted to implemented F32/F16 paths,
    avoiding runtime asserts for unsupported types.
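
For readers unfamiliar with these two bug classes, here is a minimal host-side C++ sketch of the general patterns. It is not the code from the linked llama.cpp change: every name below (`mask_offset_64`, `offset_overflows_int32`, `TensorType`, `repeat_supported`) is hypothetical, and it assumes the overflow came from an offset product exceeding a 32-bit index, which this PR description does not spell out.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: compute a mask element offset in 64 bits.
// Widening both operands before the multiply keeps the product exact,
// whereas a 32-bit multiply could overflow for large rows and strides.
inline int64_t mask_offset_64(int32_t row, int32_t row_stride) {
    return static_cast<int64_t>(row) * static_cast<int64_t>(row_stride);
}

// Hypothetical helper: true when the exact offset does not fit in int32_t,
// i.e. when 32-bit index arithmetic would have produced a wrong address.
inline bool offset_overflows_int32(int32_t row, int32_t row_stride) {
    const int64_t off = mask_offset_64(row, row_stride);
    return off > std::numeric_limits<int32_t>::max() ||
           off < std::numeric_limits<int32_t>::min();
}

// Hypothetical stand-in for a tensor element type enum (illustrative subset).
enum class TensorType { F32, F16, BF16, Q4_0 };

// Hypothetical support check: report an op as supported only for the types
// that actually have a kernel, so callers can fall back to another backend
// instead of hitting an assert at run time.
inline bool repeat_supported(TensorType t) {
    return t == TensorType::F32 || t == TensorType::F16;
}
```

With these definitions, `offset_overflows_int32(70000, 70000)` is true because the exact product is 4,900,000,000, which is above the 32-bit signed maximum of 2,147,483,647, and `repeat_supported(TensorType::Q4_0)` is false, so a scheduler consulting it would not dispatch that type to this path.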

This PR intentionally only changes the submodule pointer.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

No issues found across 1 file

@davide221 davide221 merged commit 9610083 into main Jun 25, 2026
5 checks passed
@davide221 davide221 deleted the deps/llama-cuda-correctness-backports branch June 26, 2026 11:02