Trim AGENTS.md and drop tailnet handback note #1229
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,131 +1,83 @@ | ||
| # AGENTS.md | ||
|
|
||
| This file is principles — the contracts that stay true while implementations | ||
| churn. For how to actually run, boot, share, or navigate things today | ||
| (fresh-worktree setup, dev servers, ports, environment gotchas), see | ||
| [RUNNING.md](RUNNING.md) (which may lag reality; it says so itself). For | ||
| writing e2e scenarios, see [e2e/AGENTS.md](e2e/AGENTS.md). Run | ||
| `bun run bootstrap` first in any fresh checkout or worktree. | ||
|
|
||
| ## Task Completion Requirements | ||
|
|
||
| - Use Effect Vitest for tests. | ||
| - Run targeted tests with `vitest run ...` when working on a scoped area. | ||
| - The root/package `bun run test` scripts are allowed because they delegate to | ||
| Vitest. | ||
| - NEVER run `bun test`. | ||
| - For code changes, run the narrowest useful verification before handing back. | ||
| - For broad or merge-ready changes, the full gates are `bun run format:check`, | ||
| `bun run lint`, `bun run typecheck`, and `bun run test`. | ||
| - Always run `bun run format` before opening a PR so the diff lands | ||
| already-formatted. Only commit formatting changes to files your branch | ||
| actually touches: if `format` rewrites unrelated pre-existing files, leave | ||
| those out of the PR (stage just your files). | ||
| Principles: the contracts that stay true while implementations churn. For how to | ||
| run, boot, share, or navigate things today, see [RUNNING.md](RUNNING.md); for | ||
| e2e scenarios, [e2e/AGENTS.md](e2e/AGENTS.md). Run `bun run bootstrap` first in | ||
| any fresh checkout or worktree. | ||
|
|
||
| ## Task Completion | ||
|
|
||
| - Tests use Effect Vitest. Run scoped tests with `vitest run ...`. `bun run test` | ||
| is fine (it delegates to Vitest); NEVER run `bun test`. | ||
| - Run the narrowest useful verification for a change. For broad or merge-ready | ||
| work the full gates are `bun run format:check`, `lint`, `typecheck`, `test`. | ||
| - Run `bun run format` before opening a PR, staging only files your branch | ||
| touched (leave unrelated files `format` rewrites out of the PR). | ||
|
|
||
| ## Handing Back Work: Evidence, Not Assertions | ||
|
|
||
| "Done" is something the user can open, not a claim. When work changes what a | ||
| user sees or touches, the handoff has three parts, delivered unprompted: | ||
|
|
||
| 1. **Watch it** — an e2e scenario covers the change, and the handoff links | ||
| directly to the specific run(s) that prove it, with one line each on what | ||
| to look at. Never hand back a bare wall of green results: the user's | ||
| question is "show me the new thing working," not "is everything healthy?" | ||
| 2. **Touch it** — leave the session's dev server running and reachable over | ||
| the user's tailnet, with credentials, so they can take over and poke at | ||
| it. The instance you already booted for e2e IS this — leave it up rather | ||
| than standing up something separate. | ||
| 3. **What to try** — name the paths worth exercising by hand, especially | ||
| ones no scenario pins yet. Honesty about coverage gaps is part of the | ||
| handoff. A human driving a real browser from another device reaches | ||
| states the test harness structurally cannot; invite that. | ||
|
|
||
| The same machinery runs in reverse: you can seed an environment INTO a | ||
| state — reproduce a bug live, stage data for the user to take over, set up | ||
| a walkthrough — and hand across the link. "Here's the broken state, live" | ||
| beats a paragraph describing it. | ||
|
|
||
| If no scenario covers the change yet, that is the cue to write one. When a | ||
| change is user-visible, embed the run's recording in the PR description — | ||
| reviewers should see the change, not just read about it. | ||
|
|
||
| Don't memorize the mechanics (ports, viewer, sharing commands) — discover | ||
| them from RUNNING.md and the code; they change. | ||
| "Done" is something the user can open, not a claim. When work changes what a user | ||
| sees, hand back: an e2e run that proves it (link the specific run with one line | ||
| on what to look at, not a wall of green), the dev server left running so they can | ||
| poke at it, and the paths worth trying by hand including ones no scenario covers | ||
| yet. If no scenario covers the change, write one, and embed its recording in the | ||
| PR. The machinery runs in reverse too: seed an environment into a state | ||
| (reproduce a bug live, stage data) and hand across the link. | ||
|
|
||
| ## Service Emulators | ||
|
|
||
| When a test or demo needs an upstream API, OAuth/OIDC provider, or webhook | ||
| source, use the `@executor-js/emulate` emulators (GitHub, Google, Stripe, | ||
| Resend, WorkOS, and a dozen more) instead of writing a stub. They are | ||
| wire-level and stateful — real SDKs run against them unmodified — and each | ||
| serves a full OpenAPI spec ready for addSpec, mints real-shaped credentials, | ||
| runs working OAuth flows, and records every call in a request ledger you can | ||
| assert against. Hosted instances exist at `https://<service>.emulators.dev` | ||
| with zero setup. See the `emulate` skill | ||
| (`.claude/skills/emulate/SKILL.md`) for the control-plane reference and | ||
| recipes. | ||
|
|
||
| The emulators are a standalone project (`github.com/UsefulSoftwareCo/emulate`), | ||
| not vendored here — this repo only consumes the published `@executor-js/emulate` | ||
| package. You have full autonomy to change, publish, and deploy the emulators, | ||
| working directly on their `main`; the skill covers the loop. Don't re-introduce | ||
| a `vendor/` submodule for them. | ||
| For any test or demo needing an upstream API, OAuth/OIDC provider, or webhook | ||
| source, use the `@executor-js/emulate` emulators instead of writing a stub: | ||
| wire-level and stateful, real SDKs run unmodified, each serves a full OpenAPI | ||
| spec, mints real-shaped credentials, and records every call in a request ledger | ||
| to assert against. Hosted at `https://<service>.emulators.dev`. See the `emulate` | ||
| skill (`.claude/skills/emulate/SKILL.md`). They are a standalone project | ||
| (`github.com/UsefulSoftwareCo/emulate`) consumed here as the published package: | ||
| full autonomy to change, publish, and deploy them on their `main`; don't re-vendor. | ||
|
|
||
| ## Attribution | ||
|
|
||
| Do not add any AI assistant, Claude, Anthropic, or Co-Authored-By | ||
| attribution/trailers to commits, commit messages, PRs, or generated files. | ||
|
|
||
| Pull request titles and descriptions are going to a public GitHub repo, so | ||
| avoid using specific names or internal info unless explicitly stated to. | ||
| No AI/Claude/Anthropic/Co-Authored-By attribution in commits, messages, PRs, or | ||
| generated files. PR titles and descriptions go to a public repo: no internal info | ||
| or specific names unless explicitly stated. | ||
|
|
||
| ## Collaboration Notes | ||
|
|
||
| The user uses speech to text occasionally, so if sentences are weird or words | ||
| are not right, infer the likely intent and ask only when needed. | ||
|
|
||
| Code is very cheap to write. Do not give time estimates; with agents, code is | ||
| practically instant to generate. Unless stated otherwise, time to implement is | ||
| not a blocker. | ||
|
|
||
| Never use em-dashes (the `—` character) anywhere: prose, docs, code comments, | ||
| commit messages, or PRs. Use commas, colons, parentheses, or separate sentences | ||
| instead. | ||
| - The user uses speech-to-text; infer likely intent from odd wording, ask only | ||
| when needed. | ||
| - Code is cheap to write: no time estimates, implementation time isn't a blocker. | ||
| - Never use em-dashes anywhere. Use commas, colons, parentheses, or separate | ||
| sentences. | ||
|
|
||
| ## Reference Repos | ||
|
|
||
| Repos in `.reference`, such as Effect and effect-atom, are available for | ||
| patterns. If given a Git URL for reference, clone it into `.reference` and | ||
| inspect it there. Make sure to pull the latest changes from the reference repo | ||
| before using it. | ||
| Repos in `.reference` (Effect, effect-atom, …) are available for patterns. Clone | ||
| a given Git URL into `.reference` and pull latest before using it. | ||
|
|
||
| ## Engineering Priorities | ||
|
|
||
| - Prefer correctness and predictable behavior over short-term convenience. | ||
| - Preserve runtime behavior when changing lint, typing, or test structure. | ||
| - Keep package boundaries clear; use public package exports instead of relative | ||
| imports across package roots. | ||
| - Extract shared logic only when the shared behavior is real and local patterns | ||
| support it. Avoid broad generic abstractions for one-off duplication. | ||
| - Keep package boundaries clear; use public package exports, not cross-package | ||
| relative imports. | ||
| - Extract shared logic only when the shared behavior is real; avoid broad generic | ||
| abstractions for one-off duplication. | ||
|
|
||
| ## Package Roles | ||
|
|
||
| - `packages/core/sdk`: executor core contracts, plugin wiring, scopes, sources, | ||
| secrets, policies, and test fixtures. The `@executor-js/sdk/http-auth` | ||
| subpath carries the shared placements-based auth-method vocabulary the HTTP | ||
| protocol plugins compose (core itself never imports it — composition, not | ||
| location, keeps core carrier-agnostic). | ||
| - `packages/core/storage-*`: storage adapters and storage test support. | ||
| - `packages/plugins/*`: protocol and provider plugins. Plugin-specific | ||
| runtime, React, API, and testing helpers should live with the owning plugin. | ||
| - `packages/core/sdk`: core contracts, plugin wiring, scopes, sources, secrets, | ||
| policies, fixtures. `@executor-js/sdk/http-auth` carries the shared auth-method | ||
| vocabulary the HTTP protocol plugins compose (core never imports it, keeping it | ||
| carrier-agnostic). | ||
| - `packages/core/storage-*`: storage adapters and test support. | ||
| - `packages/plugins/*`: protocol and provider plugins; their runtime, React, API, | ||
| and testing helpers live with the owning plugin. | ||
| - `packages/react`: shared React UI and atom/client integration. | ||
| - `packages/hosts/mcp`: MCP host surface for exposing Executor through MCP. | ||
| - `packages/hosts/mcp`: MCP host surface. | ||
| - `packages/kernel/*`: execution runtimes and code execution substrate. | ||
| - `apps/local`, `apps/cloud`, `apps/cli`, and `apps/desktop`: product entry | ||
| points that compose the packages. | ||
| - `apps/{local,cloud,cli,desktop}`: product entry points composing the packages. | ||
|
|
||
| ## Other | ||
|
|
||
| Please make note of mistakes you make in MISTAKES.md. If you find you wish you | ||
| had more context or tools, write that down in DESIRES.md. If you learn anything | ||
| about your env write that down in LEARNINGS.md. | ||
| Note mistakes in MISTAKES.md, missing context or tools in DESIRES.md, and env | ||
| learnings in LEARNINGS.md. | ||
`bun run` prefix

The new line lists the full gates as `bun run format:check`, `lint`, `typecheck`, `test`: only the first command keeps its `bun run` prefix. Since this file is parsed by AI agents as executable instructions, an agent following this literally may run `lint`, `typecheck`, or `test` as bare shell commands instead of `bun run lint`, `bun run typecheck`, `bun run test`. The original spelled each one out in full to prevent exactly this ambiguity.

Context Used: AGENTS.md (source)
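As a minimal sketch of the unambiguous form this comment asks for, the snippet below expands the four gate names from the diff into fully prefixed commands. It only prints them, so it does not need `bun` installed. The names `GATES` and `full_gates` are hypothetical and chosen for illustration; they are not part of the repo.

```shell
#!/bin/sh
# Gate names as listed in AGENTS.md. Each is a package script, not a bare command.
GATES="format:check lint typecheck test"

# Hypothetical helper (not part of the repo): print the fully spelled-out
# command for every gate, one per line, without executing anything.
full_gates() {
  for gate in $GATES; do
    printf 'bun run %s\n' "$gate"
  done
}

full_gates
```

Each printed line carries its own `bun run` prefix, which is the spelling the original AGENTS.md used for the full gates.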