The portable (compiled-in) Metal driver fails to load Phi-3/Phi-3.5 GGUFs (tested bartowski/Phi-3.5-mini-instruct Q4_K_M). Phi-3 fuses Q/K/V into a single qkv_proj tensor; the portable loader's attention-weight mapping expects separate q_proj/k_proj/v_proj, so load aborts post-Metal-init (exit 1).
Qwen2.5 / Qwen3 / Llama-3.1 / Llama-3.2 GGUFs load fine.
Repro: run pie serve on a Phi-3.5-mini Q4_K_M GGUF → model load fails with:
[pie-driver-portable] fatal: gguf: missing tensor 'model.layers.0.self_attn.qkv_proj.weight'
pie-driver-portable failed during startup (rc=-1): gguf: missing tensor 'model.layers.0.self_attn.qkv_proj.weight'
Ask: handle fused qkv_proj (split into q/k/v) in the portable loader's GGUF attention mapping.
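
A minimal sketch of the requested split, in Python for illustration only. It is not the portable loader's code, and `split_fused_qkv` and its parameters are hypothetical names. It assumes the fused weight is stored row-major with shape [out_features, in_features] and that the rows are laid out as all Q rows, then all K rows, then all V rows, which is how the Hugging Face Phi-3 reference implementation slices the qkv_proj output. For quantized tensors such as Q4_K_M the same row slicing would have to be applied to the raw block data; that should work if each row is quantized independently, but it is worth verifying against the loader's layout.

```python
def split_fused_qkv(rows, num_heads, num_kv_heads, head_dim):
    """Split a fused qkv_proj weight into separate q/k/v weights.

    `rows` is a sequence of weight rows (one per output feature).
    Assumed row layout: [Q rows | K rows | V rows].
    """
    q_rows = num_heads * head_dim
    kv_rows = num_kv_heads * head_dim
    expected = q_rows + 2 * kv_rows
    if len(rows) != expected:
        raise ValueError(
            f"fused qkv has {len(rows)} rows, expected {expected} "
            f"({q_rows} for Q + 2 x {kv_rows} for K and V)"
        )
    q = rows[:q_rows]
    k = rows[q_rows:q_rows + kv_rows]
    v = rows[q_rows + kv_rows:]
    return q, k, v


# Toy example: 2 query heads, 1 KV head, head_dim 2 -> 4 + 2 + 2 = 8 rows.
# Each row holds a single value equal to its row index.
fused = [[float(i)] for i in range(8)]
q_w, k_w, v_w = split_fused_qkv(fused, num_heads=2, num_kv_heads=1, head_dim=2)
```

In a loader, the three slices would then be registered under the q_proj/k_proj/v_proj names the attention mapping already expects, so the rest of the load path stays unchanged.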