GitHub - gtskevin/ai-native-review: 🔍 Claude Code Skill: Catch the #1 AI anti-pattern before you build. Reviews tool designs with a 4-point decision tree — AI-decorated vs AI-native.

Stop building AI-decorated tools. Start building AI-native ones.

🎯 What It Does

When AI builds tools, it defaults to writing deterministic rules (regex, if/else, templates) instead of using its own reasoning. This skill catches that — before you waste time building it.

You: "I'm building a tool that reviews survey scales"

AI (without this skill):               AI (with this skill):
┌──────────────────────┐               ┌──────────────────────┐
│ if (score > 0.8) ✅  │               │ LLM analyzes each    │
│ regex.test(content)  │               │ item with rubric,    │
│ template.replace()   │               │ cites evidence,      │
│ if/else chains       │               │ scores confidence    │
└──────────────────────┘               └──────────────────────┘
   ❌ AI-decorated                        ✅ AI-native

⚡ Quick Start

⏱️ 30 seconds from install to first review

Option A — One-line install (recommended):

curl -fsSL https://raw.githubusercontent.com/gtskevin/ai-native-review/main/install.sh | bash

Auto-installs the skill + companion design rule for Codex and/or Claude Code when detected.

Option B — Manual install for Codex:

git clone https://github.com/gtskevin/ai-native-review.git
mkdir -p ~/.codex/skills/ai-native-review
cp ai-native-review/SKILL.md ~/.codex/skills/ai-native-review/
cp -R ai-native-review/rules ~/.codex/skills/ai-native-review/
cp -R ai-native-review/examples ~/.codex/skills/ai-native-review/
rm -rf ai-native-review

For Claude Code, replace ~/.codex with ~/.claude.

Then invoke the skill:

Claude Code: /ai-native-review
Codex: Review this design for AI-native quality

Describe the tool you're building, or point the AI to an existing app, Skill, or codebase.

Expected output:

✅ Task decomposition: 8 subtasks tagged (🧠5 ⚙️2 👤1)
✅ Decision tree: 5 judgment tasks tested
✅ Anti-pattern scan: 3 findings (P0: 1, P1: 2)
✅ Report generated with provenance + priority matrix

🔍 The Core Test

For every subtask that requires judgment, the skill runs a 4-point decision tree:

Test	Question	If Yes	If No
🔴 Delete Test	Remove the AI call. Still works?	AI-decorated ❌	Continue
🔴 Upgrade Test	Better model = better output?	Continue	Not AI-driven ❌
🟡 Context Test	AI gets enough context?	Continue	Context-starved ⚠️
🔵 Soft Rule Test	Prompt beats hardcoded rule?	AI-native ✅	Disguised rule ❌

🧭 AI-Native Interaction: Goal → Plan → Act → Confirm

AI-native design has two complementary dimensions. A tool can have a strong AI reasoning core while still forcing users through a rigid click maze. It can also offer a chat box while its core decisions remain hardcoded. Review both.

Dimension	Core Question	Healthy Pattern
AI-native intelligence	Does the model make the judgment calls that require understanding?	LLM judgment + evidence + confidence
AI-native interaction	Can users express the outcome they want while the system plans and executes the workflow under appropriate human control?	Goal-oriented input + tool use + visible progress + confirmation

Reference interaction architecture:

User goal in natural language
        ↓
AI clarifies missing constraints
        ↓
AI proposes or silently forms a bounded plan
        ↓
AI uses tools and reports progress
        ↓
Low-risk, reversible actions → execute
High-risk or irreversible actions → ask for confirmation
        ↓
User reviews, corrects, overrides, or continues the conversation

Interaction review questions:

Area	Review Question	Good Evidence
Intent	Can users describe the result they want without knowing internal steps, APIs, or menus?	Natural-language goal, contextual defaults
Orchestration	Can the model choose the next action or tool based on the current state?	Dynamic planning and tool selection
Visibility	Can users see what the system is doing and what changed?	Progress, evidence, action log, explicit outcomes
Control	Can users correct, refine, interrupt, undo, or take over?	Follow-up conversation, edit, retry, cancel, rollback
Safety	Are consequential actions gated appropriately?	Confirmation, scoped permissions, human handoff
Interface fit	Is conversation used where it reduces friction, while structured UI remains available where precision helps?	Chat + previews, forms, tables, or approval widgets

Evaluation: Should Every AI-Native App Be Chat-First?

No. Natural language is valuable when the user's desired outcome is easier to describe than the steps needed to achieve it. It should not replace every button, form, or workspace.

Scenario	Recommended Interaction
Ambiguous, multi-step goal	Natural-language goal + AI planning
Repetitive low-risk workflow	One command or event trigger + visible execution
Precise structured input	Form, table, or direct manipulation, optionally assisted by AI
High-stakes action	AI proposes → human confirms
Creative or preference-heavy work	Conversational iteration + editable artifacts

The goal is not "chat everywhere." The goal is to reduce unnecessary procedural work while preserving user understanding and control.

📋 What You Get

Section	What It Tells You
Task Decomposition	Every subtask tagged: 🧠 judgment / ⚙️ code / 👤 human
Findings + Provenance	Which test found what, with code-level evidence
Interaction Audit	Whether users can state goals, supervise execution, and recover from mistakes
Priority Matrix	P0 (fix now) → P3 (someday)
"Is AI-Native Worth It?"	Cost-benefit per subtask
Drift Risk	Where future devs will be tempted to "add a quick rule"
Migration Path	If AI-decorated: phased plan with rollback

📊 Real Example

Reviewing a questionnaire-checking Skill (full report):

Subtask	Nature	AI-native?	Context?
Excel structure extraction	⚙️ Precision	N/A — correct use of code	—
Scale classification	🧠 Judgment	✅	✅
Translation equivalence	🧠 Judgment	✅	⚠️ Missing rubric
Item-writing error detection	🧠 Judgment	✅	✅
Respondent experience	🧠 Judgment	✅	✅

Result: 🟢 5/5 capability allocation correct. 🟡 2 improvements added to report.

🚫 Anti-Patterns Detected

Anti-Pattern	Example	Verdict
AI orchestration + rule execution	AI designs flow, regex does checking	❌
Regex replacing judgment	`regex.test(content)` for quality	❌
Template replacing generation	`template.replace('{x}', val)`	❌
if-else for business decisions	`if (score > 0.8) good`	❌
Keyword matching for semantics	`text.includes('keyword')`	❌
Unstructured AI output	No JSON schema, no confidence	❌
Click maze for a goal-oriented task	User manually executes steps the agent could plan	⚠️ Review interaction fit
Unbounded autonomy	Agent takes consequential actions without confirmation	❌

Domain exemptions: Compliance, security, real-time, data masking, financial calculation — rules are correct here.

🏆 Why This Skill?

Capability	AI-Native Review	Manual Review	Lint Rules
Catches "AI-decorated" anti-pattern	✅ 4-point test	⚠️ If you know to look	❌
Structured findings with evidence	✅ Provenance tagged	⚠️ Varies	❌
Priority matrix (P0→P3)	✅ Auto-generated	❌ Manual	❌
Cost-benefit "worth it?" analysis	✅ Per subtask	❌ Rarely done	❌
Drift risk markers	✅ Future-proofing	❌	❌
Domain exemption awareness	✅ Built-in	⚠️ Experience-dependent	❌
Goal-oriented interaction review	✅ Control-aware	⚠️ Varies	❌

🧠 Grounded In

Source	Contribution
OpenAI — A practical guide to building agents	Goal-oriented agents, tools, guardrails, human intervention
Anthropic — Building effective agents	Workflows vs. agents, feedback loops, meaningful oversight
Microsoft — Copilot agent UI guidelines	Conversation as the source of intent and control, progressive UI
Microsoft — Guidelines for Human-AI Interaction	Efficient invocation, correction, explanation, global controls
Google PAIR — Feedback + Control	Balance automation and user control

❓ FAQ

🤔 Does this work with Codex or only Claude Code?

Yes. The installer detects Codex and Claude Code and installs the same SKILL.md bundle for either environment. In Claude Code, use /ai-native-review. In Codex, ask to review a design for AI-native quality.

🛠️ What kind of tools can I review?

Any tool that involves AI making decisions: apps, Skills, agents, pipelines, evaluators, content processors, recommendation systems. Use it before building, during implementation, or after a working product exists.

📊 How is this different from just asking "is my design good?"

That gets you opinions. This gives you a structured 4-point test with evidence, priority ranking, cost-benefit analysis, and migration paths. Repeatable and auditable.

⏱️ How long does a review take?

2-5 minutes depending on project complexity. The skill decomposes, tests, and generates a full report in one pass.

🔍 Is this only for design-time review?

No. Design-time review is the cheapest moment to catch architecture mistakes, but you can also use the skill during implementation or to audit an existing app, Skill, or tool. Fully automated CI/CD enforcement is not included yet because semantic findings still need contextual judgment.

💬 Does every AI-native app need a chat interface?

No. Natural language is useful when users can state the desired result more easily than they can navigate the procedure. Structured UI remains better for precise input, review, comparison, and approval. A strong AI-native product often combines conversation with contextual widgets or editable artifacts.

🤝 Contributing

Found a new anti-pattern? Want to add domain-specific rules? Contributions welcome!

📁 Repo Structure

ai-native-review/
├── SKILL.md                              # The review skill (main)
├── rules/
│   └── ai-native-design.md              # Companion design rule
├── examples/
│   └── ob-scale-review-report.md        # Real review report
├── tests/
│   ├── test_dual_dimension_framework.sh
│   ├── test_install_compatibility.sh
│   └── test_install_smoke.sh
├── install.sh
├── README.md
├── README.zh-CN.md
├── .github/
│   ├── assets/                          # Visual assets
│   ├── ISSUE_TEMPLATE/                  # Issue templates
│   ├── PULL_REQUEST_TEMPLATE.md
│   └── social-preview.svg               # og:image
├── CONTRIBUTING.md
├── CODE_OF_CONDUCT.md
└── SECURITY.md

Built with ❤️ by @gtskevin · Build AI-native, not AI-decorated 🧠

⬆ Back to top

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎯 What It Does

⚡ Quick Start

🔍 The Core Test

🧭 AI-Native Interaction: Goal → Plan → Act → Confirm

Evaluation: Should Every AI-Native App Be Chat-First?

📋 What You Get

📊 Real Example

🚫 Anti-Patterns Detected

🏆 Why This Skill?

🧠 Grounded In

❓ FAQ

🤝 Contributing

📁 Repo Structure

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.github		.github
examples		examples
rules		rules
tests		tests
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
README.zh-CN.md		README.zh-CN.md
SECURITY.md		SECURITY.md
SKILL.md		SKILL.md
install.sh		install.sh

Folders and files

Latest commit

History

Repository files navigation

🎯 What It Does

⚡ Quick Start

🔍 The Core Test

🧭 AI-Native Interaction: Goal → Plan → Act → Confirm

Evaluation: Should Every AI-Native App Be Chat-First?

📋 What You Get

📊 Real Example

🚫 Anti-Patterns Detected

🏆 Why This Skill?

🧠 Grounded In

❓ FAQ

🤝 Contributing

📁 Repo Structure

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages