Benchmark candidate local models for smdu AI tasks #67

@ScottMorris

Goal

Benchmark local model candidates for smdu AI tasks and produce a shortlist.

Questions to answer

  • Which model sizes are viable on common developer machines?
  • What latency/memory profile is acceptable for interactive TUI usage?
  • Which tasks require embeddings vs generation vs classification?

Candidate families

  • Qwen2.5 instruct variants (0.5B, 1.5B, 3B)
  • Qwen2.5 Coder instruct variants (for path/action reasoning)
  • compact embedding models compatible with transformers.js

Evaluation matrix

  • cold start time
  • warm inference latency
  • memory footprint (idle and active)
  • quality on task-specific prompts
  • download size and first-run impact
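As a rough illustration of how the cold start and warm latency rows could be measured, here is a minimal harness sketch. It assumes the model is exposed through two caller-supplied async functions, `load` and `infer`; those names, the `BenchResult` shape, and the injectable `now` clock are all hypothetical and not part of smdu or transformers.js. Memory footprint and download size are left out and would need their own probes.

```typescript
// Hypothetical result shape for one benchmark run; not taken from smdu.
interface BenchResult {
  coldStartMs: number;       // time for a single model load
  warmLatenciesMs: number[]; // per-prompt latency after a warm-up call
  warmMedianMs: number;      // median of warmLatenciesMs
}

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// `load` and `infer` are supplied by the caller (assumption: any runtime
// can be wrapped this way). `now` is injectable so the harness can be
// tested with a fake clock.
async function benchmark<M>(
  load: () => Promise<M>,
  infer: (model: M, prompt: string) => Promise<unknown>,
  prompts: string[],
  now: () => number = () => Date.now(),
): Promise<BenchResult> {
  if (prompts.length === 0) throw new Error("need at least one prompt");

  const t0 = now();
  const model = await load();
  const coldStartMs = now() - t0;

  // One untimed call so first-inference setup does not skew warm numbers.
  await infer(model, prompts[0]);

  const warmLatenciesMs: number[] = [];
  for (const prompt of prompts) {
    const start = now();
    await infer(model, prompt);
    warmLatenciesMs.push(now() - start);
  }

  return { coldStartMs, warmLatenciesMs, warmMedianMs: median(warmLatenciesMs) };
}
```

Reporting the median rather than the mean keeps a single slow call (for example, a garbage-collection pause) from dominating the warm latency figure; whether that is the right statistic for interactive TUI usage is a judgment call for the benchmark itself.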

Acceptance Criteria

  • at least 3 viable candidates benchmarked end-to-end
  • one recommended default model per task type (summary/query/tagging)
  • explicit fallback strategy for low-memory environments
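One possible shape for the fallback strategy, sketched under assumptions: once each candidate has a measured memory footprint and a task-specific quality score, a selector picks the best candidate that fits a memory budget and returns `null` when nothing fits. The `Candidate` fields, the `pickModel` function, and the idea of a single numeric quality score are hypothetical and only illustrate the criteria above.

```typescript
type TaskType = "summary" | "query" | "tagging";

// Hypothetical record produced by the benchmark for one model and task.
interface Candidate {
  id: string;             // placeholder label, not a real model identifier
  task: TaskType;
  activeMemoryMb: number; // measured memory while inferring
  quality: number;        // task-specific score, higher is better
}

// Return the highest-quality candidate for `task` that fits `budgetMb`.
// Returns null when nothing fits, so the caller can turn the feature off
// instead of loading a model the machine cannot hold.
function pickModel(
  candidates: Candidate[],
  task: TaskType,
  budgetMb: number,
): Candidate | null {
  const fitting = candidates.filter(
    (c) => c.task === task && c.activeMemoryMb <= budgetMb,
  );
  if (fitting.length === 0) return null;
  return fitting.reduce((best, c) => (c.quality > best.quality ? c : best));
}
```

In a Node.js process the budget could be derived from `os.freemem()`, though how much of the free memory smdu should claim is an open question for this issue.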

Labels: enhancement (Improvements or feature refinements)
