saif-shines/doraval

doraval

Context engineering toolkit for AI coding agents. Scale your AI context and make your next piece of context (skills, plugins, and more) work for your team, your community, or yourself.

doraval (dor-uh-val) blends Doraemon and eval — gadget-pulling context tools plus session evaluation. The dora alias is the same CLI.

Documentation · Quickstart

You scale AI context (skills, plugins, decisions) so agents (and humans) succeed on the first try, whether for yourself, your team, or your community. What gets in the way: broken manifests, skills that never trigger, decisions that drift from session to session, and "works on my machine with Claude" setups that fail for Cursor, Codex, or the tenth engineer you onboard.

The problem: context you cannot trust until someone has already wasted a day debugging it.

The win: validate, scaffold, journal, and eval so the first attempt succeeds across Claude, Cursor, Codex, Copilot, Grok, and whatever comes next.

doraval is the toolkit for context engineering — authoring, validating, and evolving reliable context that works the first time.

Magic is free. First win in under 2 minutes:

# macOS (Homebrew, recommended):
brew tap saif-shines/tap
brew trust saif-shines/tap
brew install doraval
doraval validate .

# Everyone else: no install required
npx @hacksmith/doraval validate .

validate auto-detects what you built and tells you what's broken before anyone burns a session on it.

Typical first run:

dora validate .
  dora validate — 1 validator(s)
  Path:  .
  1 validators • 0 errors • 0 warnings
  Claude Skill (claude:skill)
  Status  Check
  ✓  YAML frontmatter present and parseable
  ✓  name: "my-skill"
  ✓  description field present
  ✓  uses dynamic context injection
  ✓ All checks passed.
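
For orientation, here is a minimal sketch of the kind of file the checks above look at: a SKILL.md with YAML frontmatter carrying name and description. The helper name make_sample_skill, the description text, and the body steps are all hypothetical, and this sketch does not guarantee that doraval reports every check as passing for this exact file.

```shell
# Hypothetical helper: write a minimal SKILL.md into the given directory.
# The frontmatter fields (name, description) mirror the checks in the
# sample output; the description and body are made up for illustration.
make_sample_skill() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/SKILL.md" <<'EOF'
---
name: my-skill
description: Summarize the staged git diff. Use when the user asks for a commit summary.
---

# My skill

1. Run `git diff --staged`.
2. Summarize the changes in three bullet points.
EOF
}

# Usage (assumes doraval is installed):
#   make_sample_skill ./my-skill
#   doraval validate ./my-skill
```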

Who it's for

Anyone scaling AI context for coding agents — yourself, your team, or your community:

  Scale AI context (skills, plugins, decisions)
                        │
                        ▼
                 ┌─────────────┐
                 │   doraval   │  validate · scaffold · journal · eval
                 └─────────────┘
                        │
                        ▼
        yourself · your team · your community
                        │
                        ▼
              first attempt succeeds

Same loop everywhere: scaffold → validate → journal → eval. Quickstart →

Give 10 engineers (or agents) a skill, and perhaps only 3 succeed on the first try. doraval shifts success left: it catches breakage before the first session, not after the third debugging thread.

Quickstart

One path whether you are tuning your own agent, onboarding a team, or shipping to a community:

# 1. Scaffold
doraval claude new --yes --intent self my-context      # personal / team
doraval claude new --yes --intent distribute my-plugin # ship to others

# 2. Validate before anyone relies on it
doraval validate .

# 3. One-time setup (decision memory + agent integration for scaling AI context)
doraval init

# 4. Record decisions that persist across sessions
doraval journal add "Validate before shipping skill changes"
doraval journal sync
doraval journal hook enable

# 5. Measure adherence in real agent sessions
doraval eval

Full walkthrough: Quickstart

Install

macOS (Homebrew, recommended)

brew tap saif-shines/tap
brew trust saif-shines/tap
brew install doraval

No runtime required. The binary is self-contained.

On some systems, run brew trust saif-shines/tap (or brew trust --formula saif-shines/tap/doraval) before the install step for the tap to work smoothly.

npm / npx

npx @hacksmith/doraval validate .        # run without installing
npm install -g @hacksmith/doraval        # or install globally

Requires Node.js. The wrapper will automatically use Bun when available (faster). If Bun is missing, it guides you to install it — and on macOS it recommends Homebrew first.

Bun

Don't have Bun? Install it first:

curl -fsSL https://bun.sh/install | bash   # macOS / Linux

After the installer finishes, restart your terminal (or run source ~/.zshrc / source ~/.bashrc), then:

bunx @hacksmith/doraval validate .
bun add -g @hacksmith/doraval
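
The fallback described above (prefer Bun when it is available, otherwise use Node.js) can be sketched as a small shell function. This only illustrates the idea and is not the wrapper's actual code; pick_runner is a hypothetical name.

```shell
# Hypothetical helper: print the package runner to use for doraval.
# Prefers bunx when Bun is on PATH, falls back to npx, fails otherwise.
pick_runner() {
  if command -v bunx >/dev/null 2>&1; then
    echo "bunx"
  elif command -v npx >/dev/null 2>&1; then
    echo "npx"
  else
    echo "install Bun or Node.js first" >&2
    return 1
  fi
}

# Usage:
#   runner="$(pick_runner)" && "$runner" @hacksmith/doraval validate .
```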

doraval and dora are the same CLI.

Commands

validate: catch breakage before anyone uses it

Point doraval at a directory or GitHub URL. It finds what's there and checks it.

doraval validate .                                          # local directory
doraval validate https://github.com/obra/superpowers        # remote repo
doraval validate . --for claude           # all Claude validators
doraval validate . --for claude:plugin    # one validator

Validators (Claude)

| Validator | Detects | Checks |
| --- | --- | --- |
| claude:skill | SKILL.md | Frontmatter, body, supporting files, dynamic injection (!, $ARGUMENTS, ${CLAUDE_*}) |
| claude:plugin | .claude-plugin/plugin.json | Full manifest schema, path rules, .claude-plugin/ purity, version pinning |
| claude:marketplace | plugins/ | Plugin directory structure, README, LICENSE |
| claude:hooks | hooks/hooks.json | 30+ lifecycle events, hook groups, command/http/mcp_tool/prompt/agent types |
| claude:mcp | .mcp.json | Server entries (stdio or url), env/cwd, substitution detection |
| claude:lsp | .lsp.json | Per-language command + extensionToLanguage map |
| claude:monitors | monitors/monitors.json | Array entries, unique names, substitution support |
| claude:subagent | agents/*.md | Frontmatter, disallowed security fields, non-empty body |
| claude:command | commands/*.md | Frontmatter, body, advanced fields |
| claude:memory | CLAUDE.md | Non-empty, length limit, @path import resolution |

claude new / cursor new / …: scaffold by construction

Interactive wizard for skills and plugins. Targets the coding agent you use — or the one your team and community run.

doraval claude new                              # interactive
doraval claude new --yes --intent distribute my-plugin   # ship to others
doraval claude new --yes --intent self my-context        # personal / team
doraval cursor new                              # same wizard, other agents
doraval codex new
doraval copilot new

skill validate / skill drift: one skill, two lenses

Structural check vs. rubric deviation:

doraval skill validate ./skills/my-skill/
doraval skill drift ./skills/my-skill/

drift measures six rubric areas: trigger phrases, step-by-step structure, imperative voice, examples, guardrails (MUST / MUST NOT), and clarity.
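
To run both lenses over every skill in a repository, a loop along these lines may help. It is a sketch under assumptions: the ./skills/<name>/SKILL.md layout and the check_all_skills name are hypothetical, and it relies on the --ci flag (see Options) for a non-zero exit when a check reports issues.

```shell
# Hypothetical helper: run skill validate and skill drift on each skill
# directory under a root. Assumes one directory per skill, each holding
# a SKILL.md. Returns non-zero if any check fails.
check_all_skills() {
  root="${1:-./skills}"
  failed=0
  for skill_md in "$root"/*/SKILL.md; do
    [ -f "$skill_md" ] || continue            # glob matched nothing
    skill_dir="$(dirname "$skill_md")/"
    doraval skill validate "$skill_dir" --ci || failed=1
    doraval skill drift "$skill_dir" --ci || failed=1
  done
  return "$failed"
}

# Usage:
#   check_all_skills ./skills
```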

eval: did the agent actually follow the skill?

validate and drift check the document. eval checks reality: it reads a real session transcript, finds which skills were invoked, and runs an LLM judge for a per-skill PASS / FAIL with a dynamic checklist, familiarity score, and closure info.

Example judgment:

[FAIL] improve
  familiarity: 2/10  (prompt was very vague)
  ✓ Invoke the improve skill before responding
  ✗ Phase 1: Run git log for churn signal
  ✗ Phase 2: Fan out parallel subagents
  Result: 3/9 checks — stopped after initial recon.

doraval eval                    # pick from recent sessions interactively
doraval eval --verbose
doraval judge ./skills/improve/ # alias: eval for one skill
doraval eval history            # past verdicts

Requires doraval init first. See eval docs.

journal: decision memory that survives sessions

Record project principles so future you (and agents) don't contradict past choices. SessionStart hooks inject the journal before the first message.

doraval init                  # journal repo + agent config (Claude or Grok)
doraval journal list          # view active principles
doraval journal add "..."     # propose a decision
doraval journal sync          # publish pending entries
doraval journal update        # pull latest from remote
doraval journal hook enable   # inject journal on every SessionStart

Requires the GitHub CLI (gh). Journal lives in a private GitHub repo you control.
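
If a team already keeps its principles in a text file, the add and sync steps above could be batched roughly as follows. This is a sketch: journal_add_from_file and the one-decision-per-line file format are hypothetical, and it assumes doraval init has already been run.

```shell
# Hypothetical helper: propose one journal entry per non-empty line of a
# file, then publish the pending entries with a single sync.
journal_add_from_file() {
  file="$1"
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue                 # skip blank lines
    # stdin comes from /dev/null so the CLI cannot consume the input file
    doraval journal add "$line" </dev/null || return 1
  done < "$file"
  doraval journal sync
}

# Usage:
#   echo "Validate before shipping skill changes" > decisions.txt
#   journal_add_from_file decisions.txt
```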

ui: local dashboard

doraval ui                 # start dashboard (opens browser)
doraval ui --port 4921
doraval ui --status        # check if running
doraval ui --force         # force restart

Re-running doraval ui is idempotent (PID tracking). Sidebar navigation, loading states, and open-dir support ship in recent releases.
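
For readers unfamiliar with PID tracking, the general technique looks roughly like the sketch below. It is a generic illustration, not doraval's actual implementation; is_running, start_once, and the PID file path are hypothetical.

```shell
# Generic PID-file sketch (not doraval's code): start a command only if
# the process recorded in the PID file is no longer alive.
is_running() {
  pid_file="$1"
  [ -f "$pid_file" ] || return 1
  pid="$(cat "$pid_file")"
  [ -n "$pid" ] || return 1
  kill -0 "$pid" 2>/dev/null                   # signal 0 only tests existence
}

start_once() {
  pid_file="$1"
  shift
  if is_running "$pid_file"; then
    echo "already running (pid $(cat "$pid_file"))"
    return 0
  fi
  "$@" &                                       # launch in the background
  echo "$!" > "$pid_file"
  echo "started (pid $!)"
}

# Usage:
#   start_once /tmp/my-server.pid my-server --port 4921
```

A real implementation would also need to guard against PID reuse, for example by checking the process name.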

config: dot-notation settings

doraval config get
doraval config set eval.model claude-sonnet-4-20250514

Other

doraval update              # self-update to latest
doraval providers           # which agents understand which keywords

Options

| Flag | Short | Description |
| --- | --- | --- |
| --format <type> | -f | table (default) or json |
| --for <spec> | | Target a provider or specific validator |
| --verbose | -v | Show detailed diagnostics |
| --ci | | Machine-friendly output, non-zero exit on issues |

CI/CD

doraval validate . --for claude --format json --ci
doraval skill validate ./my-skill/ --format json --ci
doraval skill drift ./my-skill/ --format json --ci
doraval eval --ci --format json

Exits with code 1 when errors are found. Pipe --format json to jq or consume programmatically.
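
A CI job might wrap the commands above in a gate like the following sketch, which keeps each JSON report as a file and fails if any check exits non-zero. The ci_gate name and the doraval-reports directory are hypothetical, and nothing here assumes a particular JSON schema.

```shell
# Hypothetical CI gate: run each documented check, save its JSON report,
# and return non-zero if any check reported errors.
ci_gate() {
  skill_dir="$1"
  out_dir="${2:-doraval-reports}"
  mkdir -p "$out_dir"
  status=0
  doraval validate . --for claude --format json --ci \
    > "$out_dir/validate.json" || status=1
  doraval skill validate "$skill_dir" --format json --ci \
    > "$out_dir/skill-validate.json" || status=1
  doraval skill drift "$skill_dir" --format json --ci \
    > "$out_dir/skill-drift.json" || status=1
  return "$status"
}

# Usage:
#   ci_gate ./my-skill/ || exit 1
```

Running all three checks before failing, rather than stopping at the first error, means one CI run surfaces every report.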

Providers

Validators and scaffolding for Claude Code, Cursor, Codex, Copilot CLI, and Grok are built in. OpenCode support is experimental.

Links