How to test models with AIWG and submit your results to help other users.
Every model validation expands the tested compatibility matrix. Your setup — hardware, model, quantization, backend — may be the exact combination another user needs confirmed before committing to a local deployment.
Run AIWG with your model and report what works:
- Basic CLI — Does
aiwg use sdlcdeploy without errors? - Agent delegation — Can agents be spawned and return results?
- Tool use — Does the model produce valid tool calls (Read, Edit, Grep)?
- Structured output — Can it generate valid JSON, YAML, markdown tables?
- Commit messages — Does
/commit-and-pushproduce correctly formatted output?
For more thorough validation:
- SDLC workflows — Phase transitions, artifact generation
- Multi-step reasoning — Agent loops, implementation planning
- Context handling — Quality at different session lengths
- Code generation — TypeScript/JavaScript quality, test writing
# 1. Install and start your model
ollama pull <model-id>
ollama serve
# 2. Point AIWG at the local endpoint
export OPENAI_BASE_URL="http://localhost:11434/v1"
export OPENAI_API_KEY="ollama"
# 3. Deploy AIWG
aiwg use sdlc --provider codexTry each capability and note what works, what fails, and any quality observations:
# Test tool use — ask agent to read and edit a file
# Test structured output — ask for JSON or YAML generation
# Test coding — ask to write a function with tests
# Test reasoning — ask for a phased implementation planFill out the submission template below with your findings.
Copy this template and submit as a PR to docs/models/supported-models.md, or post it as a GitHub/Gitea issue.
## Model Test Report
**Model**: <!-- e.g., hermes-3-llama-3.1:8b -->
**Quantization**: <!-- e.g., Q5_K_M, default, F16 -->
**Backend**: <!-- e.g., Ollama 0.5.4, llama.cpp b4567, vLLM 0.6.2 -->
**Hardware**: <!-- e.g., RTX 4070 Ti 12GB VRAM, M2 Pro 16GB unified -->
**OS**: <!-- e.g., Ubuntu 24.04, macOS 15.3 -->
**AIWG Version**: <!-- e.g., 2026.3.2 -->
**Date**: <!-- e.g., 2026-03-20 -->
### Configuration
<!-- Paste your models.json (redact any real API keys) -->
```json
{
"shorthand": {
"opus": "your-model-id",
"sonnet": "your-model-id",
"haiku": "your-model-id"
}
}| Capability | Status | Notes |
|---|---|---|
| Basic CLI | ✅/ |
|
| Agent delegation | ✅/ |
|
| Tool use | ✅/ |
|
| Structured output | ✅/ |
|
| Commit messages | ✅/ |
|
| SDLC workflows | ✅/ |
|
| Multi-step reasoning | ✅/ |
|
| Context handling | ✅/ |
|
| Code generation | ✅/ |
- Recommended tier: haiku / sonnet / opus
- Reasoning:
---
## Submission Process
### Option 1: Pull Request (Preferred)
1. Fork the repository
2. Add your model to the "Local Models — Validated" table in `docs/models/supported-models.md`
3. Include your full test report in the PR description
4. Submit the PR
### Option 2: Issue
1. Open an issue titled `model-test: <model-name> on <hardware>`
2. Paste the filled-out submission template
3. A maintainer will add the results to the supported models list
### Option 3: Eval Suite (When Available)
Once the AIWG eval suite is available (see roadmap):
```bash
aiwg eval <model-id> --backend ollama --submit
This will run standardized tests and optionally submit results directly.
Every submission must include:
- Model ID and version (exact Ollama/HuggingFace identifier)
- Inference backend and version
- Hardware specs (GPU model and VRAM, or CPU if CPU-only)
- Quantization level (if applicable)
- AIWG version tested against
- At least 3 capability results from the test matrix
- Any issues or limitations encountered
Submissions missing required fields will be asked for clarification before being added to the supported models list.
- Supported Models — Where results are published
- Local Model Deployment — Setup instructions
- Contributor Quickstart — General contribution guide