A comprehensive, opinionated mega-guide for the engineer who gets dropped into a company and needs to add AI capabilities to products, teams, and infrastructure — organized as a fractal spiral curriculum that revisits every concept at increasing depth.
63 chapters · 13 parts · 2 phases · TypeScript + Python
100x-ai-engineer-guide/
═══ PHASE 1: GET DANGEROUS ═══ (TypeScript-first, API-level, ship in a week)
├── part-0-fundamentals/ ← Just enough theory: tokens, embeddings, models, providers
├── part-1-building-with-llms/ ← Your first AI features: APIs, chat, structured output, streaming
├── part-2-agent-engineering/ ← Tool calling, agent loops, memory, approvals, context management
├── part-3-rag-knowledge/ ← Embeddings, vector search, document QA, advanced retrieval
├── part-4-evals-quality/ ← Single-turn, multi-turn, eval-driven dev, telemetry
├── part-5-harness-engineering/ ← Claude Code, MCP, skills, plugins, automation, coding agents
═══ PHASE 2: BECOME AN EXPERT ═══ (Python + TS, ML-deep, build platforms)
├── part-6-anatomy-of-ai-tools/ ← How Claude Code, Cursor work inside: memory, context, tools, search
├── part-7-hard-parts/ ← Neural nets, transformers, attention from scratch
├── part-8-open-source-ai/ ← Hugging Face, inference, pipelines, model selection
├── part-9-fine-tuning-training/ ← LoRA, datasets, quantization, RAG vs fine-tuning
├── part-10-production-ai/ ← Security, cost, context mgmt, observability, guardrails
├── part-11-deployment-infrastructure/ ← Sandboxing, CI/CD for AI, scaling, provider management
├── part-12-ai-platform/ ← Multi-agent, internal tools, skills marketplaces, org enablement
├── appendices/ ← Glossary, resources, cheat sheet
└── README.md ← You are here
You're an engineer who just got told "we need to add AI to our product." Maybe you're the senior engineer who needs to ship an AI feature by next sprint. Maybe you're the tech lead evaluating whether to build or buy. Maybe you're the platform engineer who needs to make the whole company productive with AI tools. Maybe you're the curious developer who sees what Ramp, Stripe, and Anthropic are doing and wants to understand how it all works.
This guide is for you if:
- You need to ship AI features into production, not just prototype in a notebook
- You want to understand agents, RAG, evals, and tool calling — not just call an API
- You need to know when to use an API vs. fine-tune vs. run open-source
- You want to build the AI infrastructure that makes your whole team dangerous
- You believe the harness matters more than the model
- You want to understand how tools like Claude Code and Cursor actually work inside
Phase 1 gets you dangerous in a week. Phase 2 makes you the person who builds the platform.
This guide uses a fractal spiral curriculum. Every core concept appears multiple times at increasing depth:
- Within each Part — each chapter deepens the previous chapter
- Between Parts — later Parts spiral back to earlier Parts
- Between Phases — Phase 2 revisits everything from Phase 1 at expert depth
Example — how embeddings spiral through the guide:
Ch 1 (concept) → "Embeddings are vectors that capture semantic meaning"
Ch 14 (use for search) → Build semantic search with vector stores
Ch 19 (eval retrieval) → Score how well your retrieval works
Ch 36 (understand math) → Neural nets produce embeddings via learned weights
Ch 42 (generate own) → Run Sentence Transformers, choose embedding models
Ch 44 (vs fine-tuning) → When to retrieve knowledge vs bake it into weights
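To make the Ch 1 idea concrete, here is a minimal sketch of cosine similarity, the usual way to compare two embeddings. The three-dimensional vectors are made-up toy values chosen for illustration; real embedding models return hundreds or thousands of dimensions.

```typescript
// Cosine similarity: the dot product divided by the product of the vector lengths.
// Values near 1 mean "pointing the same way"; values near 0 mean unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical toy "embeddings" (not produced by any real model).
const cat = [0.9, 0.1, 0.0];
const kitten = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.9];

const catVsKitten = cosineSimilarity(cat, kitten);
const catVsInvoice = cosineSimilarity(cat, invoice);
```

With these toy values, `catVsKitten` comes out much higher than `catVsInvoice`, which is the property semantic search relies on.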
Example — how cost awareness spirals through the guide:
Ch 2 (pricing) → "GPT-4o costs $2.50/M input tokens, Claude Sonnet costs $3/M"
Ch 12 (context as cost) → Every token in context costs money — manage it
Ch 22 (tracking) → Instrument your app to track cost per feature
Ch 29 (compaction) → How Claude Code saves money with smart summarization
Ch 49 (engineering) → Model routing, caching, budgets — the full cost toolkit
Ch 57 (infrastructure) → Self-host vs API break-even math, GPU cost optimization
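As a rough sketch of the arithmetic behind the Ch 2 pricing line, the snippet below estimates input-token spend. It assumes both quoted prices are per million input tokens, ignores output tokens entirely, and uses a hypothetical traffic figure; provider prices change, so treat the numbers as placeholders.

```typescript
// Input prices in USD per million tokens, copied from the example above.
// Assumption: both figures refer to input tokens; output pricing is not modelled.
const INPUT_PRICE_PER_MILLION_USD = {
  "gpt-4o": 2.5,
  "claude-sonnet": 3.0,
} as const;

type ModelKey = keyof typeof INPUT_PRICE_PER_MILLION_USD;

// Cost of sending `inputTokens` tokens to the given model.
function inputCostUSD(model: ModelKey, inputTokens: number): number {
  return (inputTokens / 1_000_000) * INPUT_PRICE_PER_MILLION_USD[model];
}

// Hypothetical workload: 10,000 requests a month, 2,000 input tokens each.
const monthlyInputTokens = 10_000 * 2_000;

const gpt4oMonthly = inputCostUSD("gpt-4o", monthlyInputTokens);
const sonnetMonthly = inputCostUSD("claude-sonnet", monthlyInputTokens);
```

The same function applied per feature is the starting point for the cost tracking described in Ch 22.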
| Your Goal | Start Here | Then |
|---|---|---|
| Ship an AI feature NOW | Part 1: Ch 3 → 4 → 5 | Part 2: Ch 8 → 9 |
| Build an AI agent | Part 2: Ch 8 → 9 → 10 → 11 | Part 4: Ch 18 → 19 |
| Add RAG to your product | Part 3: Ch 14 → 15 → 16 | Part 4: Ch 19 |
| Set up evals | Part 4: Ch 18 → 19 → 20 → 21 | Part 10: Ch 51 |
| Make your team productive with AI | Part 5: Ch 23 → 24 → 25 | Part 12: Ch 59 → 60 |
| Understand how Claude Code works | Part 6: Ch 27 → 28 → 29 | Part 6: Ch 30 → 32 |
| Learn ML fundamentals | Part 7: Ch 34 → 35 → 36 | Part 8: Ch 40 |
| Fine-tune a model | Part 9: Ch 44 → 45 → 46 | Part 8: Ch 43 |
| Deploy AI safely | Part 11: Ch 53 → 55 → 56 | Part 10: Ch 48 |
| Build an AI platform for your company | Part 12: Ch 58 → 59 → 60 | Part 12: Ch 61 → 62 |
| Quick lookup | Glossary | Resources |
TypeScript-first. API-level. Ship in a week.
Just enough theory to not be confused. Every concept here gets revisited deeper later.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 0 | How LLMs Actually Work | Beginner | Tokens, context windows, temperature, top-p, why LLMs hallucinate, training vs inference |
| 1 | Embeddings & Similarity | Beginner | Vectors, cosine similarity, semantic meaning in numbers |
| 2 | The AI Engineer's Landscape | Beginner | Providers, models, pricing, SDKs, build vs buy |
Your first AI features. Spirals Part 0 from theory into code.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 3 | Your First LLM Call | Beginner | SDK setup, chat completions, system prompts, streaming, error handling |
| 4 | Prompt Engineering That Works | Beg→Inter | System prompts, few-shot, chain-of-thought, templates, versioning, prompt management at scale |
| 5 | Structured Output | Intermediate | JSON mode, Zod schemas, response_format, type-safe LLM responses |
| 6 | Building a Chat Interface | Intermediate | History management, streaming to UI, Vercel AI SDK, AI UX patterns |
| 7 | Multimodal AI | Intermediate | Vision, image generation, audio, multi-input |
From API calls to autonomous agents. Spirals Part 1: your LLM calls become tools inside a loop.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 8 | Tool Calling | Intermediate | Tool definitions, Zod schemas, execute functions, descriptions |
| 9 | The Agent Loop | Intermediate | Prompt→LLM→tool→execute→append→repeat, stop conditions, streaming |
| 10 | Agent Memory & State | Inter→Adv | Short-term, working, long-term memory architectures |
| 11 | Human-in-the-Loop | Intermediate | Sync/async approvals, trust spectrum, approval architectures |
| 12 | Context Window Management | Inter→Adv | Token counting, compaction, summarization, sliding windows |
| 13 | Agent Patterns & Frameworks | Inter→Adv | ReAct, Plan-and-Execute, LangChain, Mastra, Vercel AI SDK |
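
The loop named in Ch 9 (prompt, LLM, tool, execute, append, repeat) can be sketched in a few lines. This is an illustrative skeleton, not the guide's implementation: `fakeModel` is a hypothetical stand-in for a real LLM call, the `add` tool is invented for the example, and everything is synchronous to keep it short, whereas real provider calls are asynchronous.

```typescript
type Message =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; name: string; content: string };

type ModelReply =
  | { type: "tool_call"; name: string; args: Record<string, unknown> }
  | { type: "final"; text: string };

type Tool = (args: Record<string, unknown>) => string;

// Hypothetical tool registry with a single toy tool.
const tools: Record<string, Tool> = {
  add: (args) => String(Number(args["a"]) + Number(args["b"])),
};

// Stand-in for an LLM: asks for the `add` tool once, then answers
// using the tool result it finds at the end of the history.
function fakeModel(messages: Message[]): ModelReply {
  const last = messages[messages.length - 1];
  if (last && last.role === "tool") {
    return { type: "final", text: `The answer is ${last.content}.` };
  }
  return { type: "tool_call", name: "add", args: { a: 2, b: 3 } };
}

// The agent loop: call the model, run any requested tool, append the result,
// and repeat until the model gives a final answer or the step limit is hit.
function runAgent(
  prompt: string,
  model: (messages: Message[]) => ModelReply,
  maxSteps = 5,
): string {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(messages);
    if (reply.type === "final") return reply.text; // stop condition 1
    const tool = tools[reply.name];
    const result = tool ? tool(reply.args) : `Unknown tool: ${reply.name}`;
    messages.push({ role: "tool", name: reply.name, content: result });
  }
  return "Stopped: step limit reached"; // stop condition 2
}

const answer = runAgent("What is 2 + 3?", fakeModel);
```

The step limit matters as much as the happy path: without it, a model that keeps requesting tools would loop indefinitely.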
How to give your AI access to your company's knowledge. Spirals Part 0 embeddings into implementation.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 14 | Semantic Search | Intermediate | Embeddings in code, vector stores (Pinecone, Upstash, pgvector), similarity search, NL-to-SQL, scoring |
| 15 | The RAG Pipeline | Intermediate | Document loading, chunking, embedding, storage, retrieval, reranking |
| 16 | Document QA Systems | Intermediate | PDF/YouTube/web loaders, source attribution, retrieval + generation |
| 17 | Advanced Retrieval | Inter→Adv | Hybrid search, reranking, HyDE, query expansion, multi-index |
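
Chunking is the step of the RAG pipeline that is easiest to show in isolation. The sketch below uses fixed-size character windows with overlap, which is only one of several strategies and an assumption on our part; the function name and parameters are illustrative.

```typescript
// Split text into fixed-size character chunks. Consecutive chunks share
// `overlap` characters so a sentence cut at a boundary still appears whole
// in at least one chunk.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const exampleChunks = chunkText("abcdefghij", 4, 1);
// exampleChunks is ["abcd", "defg", "ghij"]
```

Production pipelines usually split on tokens, sentences, or document structure instead of raw characters, but the overlap idea carries over.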
How to know if your AI actually works. Spirals back into Parts 1-3.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 18 | Why Evals Matter | Intermediate | Non-determinism, what to measure, offline/online, datasets |
| 19 | Single-Turn Evals | Intermediate | Tool selection scoring, output format checking, scorer functions |
| 20 | Multi-Turn Evals | Inter→Adv | Conversation evals, LLM-as-judge, structured judge prompts |
| 21 | Eval-Driven Development | Inter→Adv | Write eval → run → analyze → improve → repeat, the Ralph loop |
| 22 | Telemetry & Tracing | Intermediate | OpenTelemetry, Laminar, Datadog, token tracking, cost per feature |
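
A scorer function, as mentioned for Ch 19, is just a function from an eval case to a number. The following sketch scores tool selection by exact match and averages over a dataset; the `EvalCase` shape and the sample cases are hypothetical, made up for this example.

```typescript
// One recorded eval case: what the user asked, which tool we expected,
// and which tool the model actually picked (null if it picked none).
interface EvalCase {
  input: string;
  expectedTool: string;
  actualTool: string | null;
}

// Single-turn scorer: 1 if the model chose the expected tool, otherwise 0.
function toolSelectionScore(c: EvalCase): number {
  return c.actualTool === c.expectedTool ? 1 : 0;
}

// Average score across a dataset, in the range 0 to 1.
function meanScore(cases: EvalCase[]): number {
  if (cases.length === 0) throw new Error("Need at least one eval case");
  const total = cases.reduce((sum, c) => sum + toolSelectionScore(c), 0);
  return total / cases.length;
}

// Hypothetical dataset with one deliberate miss.
const sampleCases: EvalCase[] = [
  { input: "What's the weather in Paris?", expectedTool: "getWeather", actualTool: "getWeather" },
  { input: "Email Sam the report", expectedTool: "sendEmail", actualTool: "sendEmail" },
  { input: "Book a room for Friday", expectedTool: "createBooking", actualTool: null },
];

const accuracy = meanScore(sampleCases);
```

Here `accuracy` is two out of three. Tracking that number across prompt or model changes is the core of the eval-driven loop in Ch 21.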
The meta-skill: making yourself and your team 10x more productive with AI tools.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 23 | Claude Code Mastery | Beg→Adv | Skills, plugins, hooks, CLAUDE.md, plan mode, worktrees, permissions |
| 24 | MCP Servers & Integrations | Intermediate | MCP protocol, building servers, connecting tools |
| 25 | Skills, Plugins & Automation | Inter→Adv | Writing skills, building plugins, cron jobs, scheduled agents, event-driven workflows |
| 26 | AI-Augmented Development | Inter→Adv | Coding agents, multi-agent delegation, Stripe Minions, self-reinforcing loops |
Python + TypeScript. ML-deep. Build platforms.
How Claude Code, Cursor, and coding agents actually work inside. Spirals Part 5: from using the harness to understanding it.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 27 | The Agent Harness Architecture | Inter→Adv | CLI→loop→tools→permissions→UI. Claude Code vs Cursor vs Codex internals |
| 28 | Memory Systems: KAIROS & Beyond | Advanced | 3-layer memory, CLAUDE.md injection, AutoDream, append-only logs, semantic merging |
| 29 | Context Window Internals | Advanced | Compaction service, token budgets, cache-break vectors, session limits |
| 30 | Tool Execution & Permissions | Advanced | Permission models, approval flows, risk tiers, sandboxing |
| 31 | Web Search & Knowledge Pipelines | Advanced | Search APIs, content extraction, Turndown, paraphrase limits, pre-approved domains |
| 32 | Multi-Agent Coordination | Advanced | Coordinator mode, tick loops, background agents, worker delegation |
| 33 | Skills, Plugins & Distribution | Advanced | Skill architecture, Skillify, hook system, anti-distillation, org distribution |
Build a neural network by hand. Spirals Part 0 all the way down.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 34 | ML Decision Making | Intermediate | Prediction, features, decision boundaries, the DoorDash refund example |
| 35 | Data & Preprocessing | Intermediate | Sample populations, normalization, train/test splits, the pixel grid |
| 36 | Neural Networks from Scratch | Inter→Adv | Weights, sigmoid, gradient descent, backpropagation, smile detector |
| 37 | Tokenization Deep Dive | Inter→Adv | BPE, WordPiece, encoding/decoding, batching, attention masks |
| 38 | Transformers & Attention | Advanced | Self-attention, multi-head, positional encoding, encoder/decoder |
| 39 | Decoding & Generation | Advanced | Greedy, beam search, top-k, top-p, temperature as math |
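
"Temperature as math" in Ch 39 comes down to dividing the logits by the temperature before the softmax. A minimal sketch, with hypothetical logits and TypeScript used here for consistency with the other examples (the chapters in this part may well use Python):

```typescript
// Softmax over logits scaled by temperature.
// Lower temperature sharpens the distribution; higher temperature flattens it.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (logits.length === 0) throw new Error("Need at least one logit");
  if (temperature <= 0) throw new Error("Temperature must be positive");
  const scaled = logits.map((l) => l / temperature);
  // Subtract the max before exponentiating for numerical stability.
  const max = Math.max(...scaled);
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical logits for three candidate tokens.
const logits = [2, 1, 0];
const sharp = softmaxWithTemperature(logits, 0.5);
const neutral = softmaxWithTemperature(logits, 1);
const flat = softmaxWithTemperature(logits, 2);
```

The top token's probability is highest in `sharp` and lowest in `flat`, which is why low temperatures make output more deterministic.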
Run your own models. Spirals Part 1: from calling APIs to local inference.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 40 | Hugging Face & Pipelines | Intermediate | Pipeline API, model hub, tasks, running inference locally |
| 41 | Image Generation | Intermediate | Stable Diffusion, text-to-image, image-to-image, DreamBooth |
| 42 | Embeddings & Sentence Transformers | Intermediate | Generate your own embeddings, MTEB benchmarks, model selection |
| 43 | Model Selection & Architecture | Inter→Adv | BERT vs GPT vs T5 vs Llama, sizes, quantization, local vs cloud |
Bake knowledge into weights. Spirals Part 3: from retrieving knowledge to training it in.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 44 | RAG vs Fine-Tuning | Inter→Adv | When to retrieve vs train, cost/quality/latency trade-offs |
| 45 | Fine-Tuning with LoRA | Advanced | Low-rank adaptation, PEFT, fine-tune GPT-2 on custom data |
| 46 | Dataset Engineering | Advanced | Curating data, quality, formats, synthetic data, augmentation |
| 47 | Quantization & Deployment | Advanced | GGUF, GPTQ, AWQ, inference optimization, when to quantize |
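
As a rough illustration of why low-rank adaptation (Ch 45) is cheap, assume the standard LoRA formulation, in which a frozen weight matrix $W \in \mathbb{R}^{d \times k}$ receives a trainable low-rank update. The dimensions in the worked example are illustrative values, not taken from the guide.

$$
W' = W + BA, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
$$

$$
\text{trainable parameters} = r(d + k) \quad \text{instead of} \quad dk
$$

For $d = k = 4096$ and $r = 8$, that is $8 \times 8192 = 65{,}536$ trainable parameters against $4096^2 = 16{,}777{,}216$ for the full matrix, or about 0.39% of the original.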
Ship AI that doesn't break. Spirals Part 4 from "does it work?" to full production hardening.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 48 | AI Security & Guardrails | Inter→Adv | Prompt injection, jailbreaks, lethal trifecta, PII detection |
| 49 | Cost Engineering | Inter→Adv | Token optimization, model routing, caching, usage budgets |
| 50 | Advanced Context Strategies | Advanced | Recursive compaction, sub-agent delegation, tiered memory |
| 51 | Production Eval Pipelines | Advanced | Continuous evaluation, A/B testing, regression detection, eval in CI |
| 52 | AI Observability & Incidents | Advanced | Datadog LLM monitoring, hallucination detection, degradation alerts |
Deploy and run AI safely at scale. Spirals Part 10: from what to worry about to how to actually deploy it.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 53 | Deploying LLM Applications | Intermediate | Serverless vs dedicated, Vercel Functions, Docker, environment management |
| 54 | API Gateway & Provider Management | Inter→Adv | Rate limiting, failover, Vercel AI Gateway, caching, secrets |
| 55 | Sandboxing & Isolating Agents | Advanced | The OpenClaw pattern, VPC isolation, proxy restrictions, file system isolation, code execution sandboxing |
| 56 | CI/CD for AI Applications | Inter→Adv | Evals in CI, canary deployments, A/B testing, feature flags for AI |
| 57 | Scaling & Cost at the Infra Level | Advanced | GPU vs CPU, serverless inference, batching, model caching, break-even math |
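
The break-even math mentioned for Ch 57 can be sketched as a single division: fixed self-hosting cost over the per-token saving. Every number below is a hypothetical placeholder, and the model ignores engineering time, utilization, and output-token pricing.

```typescript
// Monthly token volume at which self-hosting costs the same as the API.
// Below this volume the API is cheaper; above it, self-hosting wins.
function breakEvenTokensPerMonth(
  fixedMonthlyCostUSD: number,       // e.g. GPU rental, assumed flat per month
  apiPricePerMillionUSD: number,     // what the API charges per million tokens
  selfHostPricePerMillionUSD = 0,    // marginal self-hosting cost per million tokens
): number {
  const savingPerMillion = apiPricePerMillionUSD - selfHostPricePerMillionUSD;
  if (savingPerMillion <= 0) {
    throw new Error("Self-hosting never breaks even if it costs more per token");
  }
  return (fixedMonthlyCostUSD / savingPerMillion) * 1_000_000;
}

// Hypothetical inputs: a $1,500/month GPU against an API priced at $3 per million tokens.
const breakEven = breakEvenTokensPerMonth(1_500, 3);
```

With these placeholder inputs the break-even point is 500 million tokens a month, which shows why low-volume products rarely justify their own GPUs.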
The capstone. From your setup to everyone's setup. Spirals everything.
| Ch | Title | Difficulty | What You'll Learn |
|---|---|---|---|
| 58 | Multi-Agent Orchestration | Advanced | Agents spawning agents, council patterns, parallel execution |
| 59 | Building Internal AI Tools | Advanced | The Glass pattern: SSO, pre-configured, low-barrier internal platforms |
| 60 | Skills Marketplaces & Knowledge Sharing | Advanced | The Dojo pattern: Git-backed, versioned, discovery, Sensei recommendations |
| 61 | Self-Reinforcing AI Systems | Advanced | Feedback loops, eval-driven iteration, agents that improve themselves |
| 62 | AI Adoption & Enablement | All levels | The Ramp playbook: L0-L3, leaderboards, hub-and-spoke, removing constraints |
| Item | What's Inside |
|---|---|
| Glossary | 200+ AI/ML terms from attention to zero-shot |
| Resources | Essential papers, books, courses, blogs |
| Cheat Sheet | Quick reference: model comparison, pricing, SDK patterns |
Every core concept's journey through the guide:
EMBEDDINGS: Ch 1 (concept) → Ch 14 (search) → Ch 19 (eval)
→ Ch 36 (math) → Ch 42 (generate) → Ch 44 (vs fine-tuning)
AGENT LOOP: Ch 9 (build) → Ch 13 (frameworks) → Ch 23 (Claude Code)
→ Ch 27 (production architecture) → Ch 32 (multi-agent)
→ Ch 58 (orchestration)
EVALS: Ch 18-22 (learn) → Ch 25 (skill evals) → Ch 45 (fine-tuning)
→ Ch 46 (datasets) → Ch 51 (production) → Ch 61 (self-reinforcing)
TOOLS/MCP: Ch 8 (calling) → Ch 24 (MCP) → Ch 25 (skills)
→ Ch 30 (internals) → Ch 48 (security) → Ch 59 (platforms)
CONTEXT: Ch 0 (concept) → Ch 6 (chat) → Ch 12 (management)
→ Ch 29 (internals) → Ch 49 (cost) → Ch 50 (advanced)
MEMORY: Ch 10 (concepts) → Ch 23 (CLAUDE.md) → Ch 28 (KAIROS)
→ Ch 50 (production) → Ch 59 (platform tools)
SECURITY: Ch 11 (approvals) → Ch 30 (permissions) → Ch 48 (guardrails)
→ Ch 55 (sandboxing) → Ch 62 (adoption governance)
DEPLOYMENT: Ch 3 (API call) → Ch 6 (chat UI) → Ch 27 (harness architecture)
→ Ch 53 (deploy apps) → Ch 55 (isolate agents) → Ch 56 (CI/CD)
STREAMING: Ch 3 (basic streamText) → Ch 6 (streaming UI with useChat)
→ Ch 9 (streaming in agent loop) → Ch 27 (production streaming architecture)
→ Ch 53 (deploying streaming endpoints) → Ch 59 (streaming in platform tools)
COST: Ch 2 (pricing landscape) → Ch 12 (context as cost)
→ Ch 22 (token tracking) → Ch 29 (compaction saves money)
→ Ch 49 (cost engineering) → Ch 57 (infrastructure cost optimization)
Every chapter follows a consistent, AI-scannable format:
<!-- HTML metadata: CHAPTER, TITLE, PART, PHASE, PREREQS, KEY_TOPICS, DIFFICULTY, LANGUAGE, UPDATED -->
# Chapter N: Title
> Part · Phase · Prerequisites · Difficulty · Language
Summary paragraph.
### In This Chapter ← section index
### Related Chapters ← cross-references (spiral connections)
---
## 1. MAJOR SECTION
### 1.1 Subsection
**What it is:** ...
**When to use:** ...
**Trade-offs:** ...
**Real-world example:** ...
**Code:** ... (working examples in the chapter's language)
- The harness matters more than the model — a well-configured system around a good model beats a great model with no system
- Get dangerous first, understand later — ship something, then learn why it works
- Every concept spirals — you'll see each idea multiple times at increasing depth
- Use the right language for the job — TypeScript for applications, Python for ML
- Evals are not optional — if you can't measure it, you can't improve it
- Build for your team, not just yourself — the platform engineer's job is to raise the floor
Built with Claude Code. Contributions welcome.