LocalAI version:
localai/localai:v4.4.2
Environment, CPU architecture, OS, and Version:
Linux 7.0.10-101.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Wed May 27 14:05:53 UTC 2026 x86_64 GNU/Linux
Describe the bug
Model mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M cannot be loaded:
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'mistral3'
To Reproduce
Download the model via LocalAI's "Install models", open the chat, and select this model.
Expected behavior
Every model offered in LocalAI's "Install models" should be compatible and loadable.
Logs
14:46:00.112 stderr srv load_model: loading model '/models/llama-cpp/models/mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf'
14:46:00.150 stderr llama_model_loader: loaded meta data with 48 key-value pairs and 363 tensors from /models/llama-cpp/models/mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf (version GGUF V3 (latest))
14:46:00.150 stderr llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
14:46:00.150 stderr llama_model_loader: - kv 0: general.architecture str = mistral3
14:46:00.150 stderr llama_model_loader: - kv 1: general.type str = model
14:46:00.150 stderr llama_model_loader: - kv 2: general.name str = Ministral-3-14B-Reasoning-2512
14:46:00.150 stderr llama_model_loader: - kv 3: general.version str = 2512
14:46:00.150 stderr llama_model_loader: - kv 4: general.finetune str = Reasoning
14:46:00.150 stderr llama_model_loader: - kv 5: general.basename str = Ministral-3-14B-Reasoning-2512
14:46:00.150 stderr llama_model_loader: - kv 6: general.quantized_by str = Unsloth
14:46:00.150 stderr llama_model_loader: - kv 7: general.size_label str = 14B
14:46:00.150 stderr llama_model_loader: - kv 8: general.repo_url str = https://huggingface.co/unsloth
14:46:00.150 stderr llama_model_loader: - kv 9: mistral3.block_count u32 = 40
14:46:00.150 stderr llama_model_loader: - kv 10: mistral3.context_length u32 = 262144
14:46:00.150 stderr llama_model_loader: - kv 11: mistral3.embedding_length u32 = 5120
14:46:00.150 stderr llama_model_loader: - kv 12: mistral3.feed_forward_length u32 = 16384
14:46:00.150 stderr llama_model_loader: - kv 13: mistral3.attention.head_count u32 = 32
14:46:00.150 stderr llama_model_loader: - kv 14: mistral3.attention.head_count_kv u32 = 8
14:46:00.150 stderr llama_model_loader: - kv 15: mistral3.attention.layer_norm_rms_epsilon f32 = 0.000010
14:46:00.150 stderr llama_model_loader: - kv 16: mistral3.attention.key_length u32 = 128
14:46:00.150 stderr llama_model_loader: - kv 17: mistral3.attention.value_length u32 = 128
14:46:00.150 stderr llama_model_loader: - kv 18: mistral3.vocab_size u32 = 131072
14:46:00.150 stderr llama_model_loader: - kv 19: mistral3.rope.dimension_count u32 = 128
14:46:00.150 stderr llama_model_loader: - kv 20: mistral3.rope.scaling.type str = yarn
14:46:00.150 stderr llama_model_loader: - kv 21: mistral3.rope.scaling.factor f32 = 16.000000
14:46:00.150 stderr llama_model_loader: - kv 22: mistral3.rope.scaling.yarn_beta_fast f32 = 32.000000
14:46:00.150 stderr llama_model_loader: - kv 23: mistral3.rope.scaling.yarn_beta_slow f32 = 1.000000
14:46:00.150 stderr llama_model_loader: - kv 24: mistral3.rope.scaling.yarn_log_multiplier f32 = 1.000000
14:46:00.150 stderr llama_model_loader: - kv 25: mistral3.rope.scaling.original_context_length u32 = 16384
14:46:00.150 stderr llama_model_loader: - kv 26: mistral3.rope.freq_base f32 = 1000000000.000000
14:46:00.150 stderr llama_model_loader: - kv 27: mistral3.attention.temperature_scale f32 = 0.100000
14:46:00.150 stderr llama_model_loader: - kv 28: tokenizer.ggml.model str = gpt2
14:46:00.150 stderr llama_model_loader: - kv 29: tokenizer.ggml.pre str = tekken
14:46:00.163 stderr llama_model_loader: - kv 30: tokenizer.ggml.tokens arr[str,131072] = ["<unk>", "<s>", "</s>", "[INST]", "[...
14:46:00.165 stderr llama_model_loader: - kv 31: tokenizer.ggml.token_type arr[i32,131072] = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
14:46:00.189 stderr llama_model_loader: - kv 32: tokenizer.ggml.merges arr[str,269443] = ["Ġ Ġ", "Ġ t", "e r", "i n", "Ġ �...
14:46:00.189 stderr llama_model_loader: - kv 33: tokenizer.ggml.bos_token_id u32 = 1
14:46:00.189 stderr llama_model_loader: - kv 34: tokenizer.ggml.eos_token_id u32 = 2
14:46:00.189 stderr llama_model_loader: - kv 35: tokenizer.ggml.unknown_token_id u32 = 0
14:46:00.189 stderr llama_model_loader: - kv 36: tokenizer.ggml.padding_token_id u32 = 11
14:46:00.189 stderr llama_model_loader: - kv 37: tokenizer.ggml.add_bos_token bool = true
14:46:00.189 stderr llama_model_loader: - kv 38: tokenizer.ggml.add_sep_token bool = false
14:46:00.189 stderr llama_model_loader: - kv 39: tokenizer.ggml.add_eos_token bool = false
14:46:00.189 stderr llama_model_loader: - kv 40: tokenizer.chat_template str = {#- Unsloth template fixes #}\n{#- Def...
14:46:00.189 stderr llama_model_loader: - kv 41: tokenizer.ggml.add_space_prefix bool = false
14:46:00.189 stderr llama_model_loader: - kv 42: general.quantization_version u32 = 2
14:46:00.189 stderr llama_model_loader: - kv 43: general.file_type u32 = 15
14:46:00.189 stderr llama_model_loader: - kv 44: quantize.imatrix.file str = Ministral-3-14B-Reasoning-2512-GGUF/i...
14:46:00.189 stderr llama_model_loader: - kv 45: quantize.imatrix.dataset str = unsloth_calibration_Ministral-3-14B-R...
14:46:00.189 stderr llama_model_loader: - kv 46: quantize.imatrix.entries_count u32 = 280
14:46:00.189 stderr llama_model_loader: - kv 47: quantize.imatrix.chunks_count u32 = 139
14:46:00.189 stderr llama_model_loader: - type f32: 81 tensors
14:46:00.189 stderr llama_model_loader: - type q4_K: 241 tensors
14:46:00.189 stderr llama_model_loader: - type q6_K: 41 tensors
14:46:00.189 stderr print_info: file format = GGUF V3 (latest)
14:46:00.189 stderr print_info: file type = Q4_K - Medium
14:46:00.189 stderr print_info: file size = 7.67 GiB (4.88 BPW)
14:46:00.191 stderr llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'mistral3'
14:46:00.191 stderr llama_model_load_from_file_impl: failed to load model
14:46:00.191 stderr common_init_from_params: failed to load model '/models/llama-cpp/models/mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf'
14:46:00.191 stderr srv load_model: failed to load model, '/models/llama-cpp/models/mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf'
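For anyone triaging this, the architecture string can be checked independently of LocalAI by reading the GGUF header of the downloaded file directly. The following is a minimal sketch, assuming the GGUF v2/v3 little-endian header layout (the log reports GGUF V3). The helper names, including `read_architecture`, are hypothetical and are not part of LocalAI or llama.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Byte sizes of the fixed-width GGUF metadata value types (type id -> size).
SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
TYPE_STRING = 8
TYPE_ARRAY = 9


def _read(f, fmt):
    """Read one little-endian value described by a struct format string."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """Read a GGUF string: uint64 byte length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    data = f.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8", errors="replace")


def _skip_value(f, vtype):
    """Advance past one metadata value without decoding it."""
    if vtype in SCALAR_SIZES:
        f.seek(SCALAR_SIZES[vtype], 1)
    elif vtype == TYPE_STRING:
        f.seek(_read(f, "<Q"), 1)
    elif vtype == TYPE_ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        for _ in range(count):
            _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type: {vtype}")


def read_architecture(f):
    """Return the general.architecture string, or None if the key is absent.

    Assumes GGUF v2/v3, where the tensor and key-value counts are uint64.
    """
    if f.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    _version = _read(f, "<I")
    _tensor_count = _read(f, "<Q")
    kv_count = _read(f, "<Q")
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _read(f, "<I")
        if key == "general.architecture" and vtype == TYPE_STRING:
            return _read_string(f)
        _skip_value(f, vtype)
    return None
```

Opening the .gguf file in binary mode and passing the file object to `read_architecture` should return the same value the loader prints as kv 0, assuming the file follows that layout. If it returns `mistral3`, the file itself is consistent and the failure comes from the bundled llama.cpp backend not recognizing that architecture name.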
Additional context
None