Portable Metal loader rejects Phi-3 (fused QKV) GGUF at model load #402

Description

@shsym

The portable (compiled-in) Metal driver fails to load Phi-3/Phi-3.5 GGUFs (tested bartowski/Phi-3.5-mini-instruct Q4_K_M). Phi-3 fuses Q/K/V into a single qkv_proj tensor; the portable loader's attention-weight mapping expects separate q_proj/k_proj/v_proj, so load aborts post-Metal-init (exit 1).

Qwen2.5 / Qwen3 / Llama-3.1 / Llama-3.2 GGUFs load fine.

Repro: run pie serve on a Phi-3.5-mini Q4_K_M GGUF. Model load fails with:

[pie-driver-portable] fatal: gguf: missing tensor 'model.layers.0.self_attn.qkv_proj.weight'
pie-driver-portable failed during startup (rc=-1): gguf: missing tensor 'model.layers.0.self_attn.qkv_proj.weight'

Ask: handle fused qkv_proj (split into q/k/v) in the portable loader's GGUF attention mapping.
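
A minimal sketch of the split being asked for, in Python for illustration only (it is not the portable driver's code). The function name split_fused_qkv and its parameters are hypothetical. It assumes the fused weight stacks the Q rows first, then K, then V along the output dimension, as in the Hugging Face Phi-3 reference implementation, and that whole rows can be sliced, which for a quantized GGUF tensor holds only if each row is a whole number of quantization blocks.

```python
def split_fused_qkv(rows, n_head, n_kv_head, head_dim):
    """Split a fused QKV weight into separate (q, k, v) row blocks.

    `rows` is the fused weight as a sequence of output rows.
    Assumed layout (hypothetical for the portable loader):
    n_head * head_dim Q rows, then n_kv_head * head_dim K rows,
    then n_kv_head * head_dim V rows.
    """
    q_rows = n_head * head_dim
    kv_rows = n_kv_head * head_dim
    expected = q_rows + 2 * kv_rows
    if len(rows) != expected:
        raise ValueError(
            f"fused qkv has {len(rows)} rows, expected {expected}"
        )
    q = rows[:q_rows]
    k = rows[q_rows:q_rows + kv_rows]
    v = rows[q_rows + kv_rows:]
    return q, k, v


# Toy example: 4 query heads, 2 KV heads, head_dim 2, hidden size 3.
fused = [[float(r)] * 3 for r in range(16)]
q, k, v = split_fused_qkv(fused, n_head=4, n_kv_head=2, head_dim=2)
```

The resulting q, k, and v blocks could then be registered under the q_proj/k_proj/v_proj names the existing attention mapping already expects, so the rest of the load path stays unchanged.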

Labels: enhancement