feat: reusable AI agent steps with hybrid linking and evals by rubenfiszel · Pull Request #9825 · windmill-labs/windmill

rubenfiszel · 2026-06-27T19:34:14Z

Summary

Makes an AI agent a savable, shareable, evaluable artifact instead of inline-only config. An agent is stored as a resource of a new built-in ai_agent resource type bundling its brain (provider/model, system prompt, temperature, output schema, memory…), its tools, and an eval suite. A flow step can link to a saved agent — the brain + tools resolve at runtime from the resource (so edits propagate), while the flow keeps only the local user_message/user_attachments wiring.
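The hybrid link described above can be pictured as a merge in which the saved resource supplies everything except the flow-local wiring. The following is a minimal sketch under stated assumptions, not the worker's actual code: `SavedAgent`, `resolveLinkedStep`, and the brain field names are hypothetical, and only `user_message` and `user_attachments` come from the PR text.

```typescript
// Hypothetical shapes for illustration; only the two flow-local keys are named in the PR.
type InputValues = Record<string, unknown>;

interface SavedAgent {
  brain: InputValues; // e.g. provider, system prompt, temperature, output schema
  tools: unknown[];
}

// Inputs that stay wired in the flow even when the step is linked.
const FLOW_LOCAL_KEYS = ["user_message", "user_attachments"] as const;

// Start from the saved brain, then overlay only the flow-local inputs.
// Assumption: brain values set on a linked step are ignored in favor of the resource.
function resolveLinkedStep(
  saved: SavedAgent,
  flowInputs: InputValues
): { inputs: InputValues; tools: unknown[] } {
  const inputs: InputValues = { ...saved.brain };
  for (const key of FLOW_LOCAL_KEYS) {
    if (key in flowInputs) {
      inputs[key] = flowInputs[key];
    }
  }
  return { inputs, tools: saved.tools };
}
```

Because the brain is read from the resource on every resolution, editing the saved agent changes the behavior of every linked step without touching the flows themselves.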

Changes

Backend

- FlowModuleValue::AIAgent gains an optional agent field (a resource path) for hybrid linking, threaded through the custom deserializer and the lockfile resolver. Existing flows stay byte-compatible because the field is skipped on serialization when unset (skip_serializing_if).
- Runtime resolution in windmill-worker/src/ai_executor.rs: a linked step loads its brain config and tools from the resource via get_resource_value_interpolated, so a nested provider $res: reference resolves with no extra code, then overlays the flow-local inputs.
- Migration 20260627124035_ai_agent_resource_type seeds the ai_agent resource type into the admins workspace, which makes it visible everywhere via the list_resource_types admins-union.
- New windmill-api/src/ai_agents.rs adds POST /ai_agents/run and POST /ai_agents/eval_case. An agent runs by pushing a single-step AIAgent flow-preview job, reusing push and run_wait_result_internal and permissioned as the caller, so there is no new job payload and no privilege escalation. Grading combines deterministic assertions (contains/regex/json_path_equals/…) with an LLM judge that itself runs as an inline AIAgent step with a structured {score,pass,summary} schema. The grading logic is unit-tested.
- Eval-suite types live in windmill-ai/src/types.rs; OpenAPI is updated.
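To illustrate the grading split described above, here is a rough sketch of deterministic assertions combined with a judge verdict. It is not the Rust implementation in windmill-api: the `Assertion` shapes, the dotted-path syntax, and the rule that a case passes only when every assertion and the judge pass are all assumptions made for this example. Only the assertion names and the {score, pass, summary} verdict fields come from the PR.

```typescript
// Hypothetical assertion shapes; the PR names contains/regex/json_path_equals among others.
type Assertion =
  | { kind: "contains"; value: string }
  | { kind: "regex"; pattern: string }
  | { kind: "json_path_equals"; path: string; expected: unknown };

// Structured verdict returned by the LLM judge.
interface JudgeVerdict {
  score: number;
  pass: boolean;
  summary: string;
}

// Walk a dotted path such as "order.status" through a parsed JSON value (assumed syntax).
function getPath(root: unknown, path: string): unknown {
  let cur: unknown = root;
  for (const seg of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[seg];
  }
  return cur;
}

// Evaluate one deterministic assertion; invalid JSON or an invalid regex counts as a failure.
function checkAssertion(output: string, a: Assertion): boolean {
  try {
    if (a.kind === "contains") return output.indexOf(a.value) !== -1;
    if (a.kind === "regex") return new RegExp(a.pattern).test(output);
    const actual = getPath(JSON.parse(output), a.path);
    return JSON.stringify(actual) === JSON.stringify(a.expected);
  } catch {
    return false;
  }
}

// Assumed combination rule: every assertion must pass, and the judge (if present) must pass.
function gradeCase(output: string, assertions: Assertion[], judge?: JudgeVerdict) {
  const results = assertions.map((a) => ({ assertion: a, passed: checkAssertion(output, a) }));
  const pass = results.every((r) => r.passed) && (judge ? judge.pass : true);
  return { pass, results, judge };
}
```

Keeping per-assertion results alongside the overall verdict matches the per-assertion reporting mentioned in the test plan.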

Frontend

- flows/agentResourceUtils.ts: config↔input_transforms conversion and the flow-local schema filter.
- flows/content/AgentResourceBar.svelte: Save as agent, pick a saved agent, and Unlink, in the Step Input tab. When linked, the brain config collapses to the flow-local inputs. Warns if a non-static brain transform would be dropped on save.
- flows/content/AgentEvalsPanel.svelte: Evals tab with case CRUD (input message, judge checklist, assertions) and Run via the typed AiAgentService.
- Wired into FlowModuleComponent.svelte; documented in docs/reusable-ai-agents.md.
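The config↔input_transforms conversion and the dropped-transform warning can be sketched roughly as follows. This is an illustration rather than the contents of agentResourceUtils.ts: the function names are hypothetical, and the sketch assumes the usual Windmill transform shapes of a static value or a JavaScript expression.

```typescript
// Assumed transform shapes: a literal value or a JavaScript expression.
type InputTransform =
  | { type: "static"; value: unknown }
  | { type: "javascript"; expr: string };

// Inputs that stay in the flow and are never written to the saved agent.
const FLOW_LOCAL = ["user_message", "user_attachments"];

// Saved config -> transforms: every brain value becomes a static transform.
function configToTransforms(config: Record<string, unknown>): Record<string, InputTransform> {
  const out: Record<string, InputTransform> = {};
  for (const key of Object.keys(config)) {
    out[key] = { type: "static", value: config[key] };
  }
  return out;
}

// Transforms -> saved config: keep static brain values and report the ones that
// cannot be saved, so the UI can warn before they are dropped.
function transformsToConfig(transforms: Record<string, InputTransform>) {
  const config: Record<string, unknown> = {};
  const dropped: string[] = [];
  for (const key of Object.keys(transforms)) {
    if (FLOW_LOCAL.indexOf(key) !== -1) continue; // flow-local wiring is not part of the agent
    const t = transforms[key];
    if (t.type === "static") config[key] = t.value;
    else dropped.push(key);
  }
  return { config, dropped };
}
```

A non-empty `dropped` list is the condition under which a "would be dropped on save" warning makes sense, since a JavaScript expression depends on flow context that a standalone resource does not have.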

Screenshots

AgentResourceBar (resource picker + "Save as agent") and the new Evals tab in the AI agent step.

Linked state: banner + Unlink, brain config hidden, only flow-local inputs remain.

Evals tab: a case with input message, judge checklist, and assertions editor.

Test plan

- cargo check (full backend) is green; the windmill-api unit tests for the grading logic pass (6 tests).
- Frontend check:fast is clean.
- Playwright: add an AI agent step → Save as agent creates an ai_agent resource (shape verified in Postgres) → linked banner and Unlink shown, brain hidden → Evals tab case editor renders.
- Not exercised (needs an API key and tokens): run a live eval/agent with a configured AI provider and confirm the pass/fail badge, judge score, latency, and per-assertion results; also confirm that a linked agent step in a real flow run resolves the provider's nested $res: under the caller's permissions.
- Pick an existing saved agent via the resource picker; Unlink and confirm that brain and tools copy back into the step while preserving flow-local inputs.

🤖 Generated with Claude Code

Summary by cubic

Make AI agents reusable. Flow steps can link to a saved ai_agent so updates propagate, and you can run evals with assertions and an LLM judge.

New Features
- New ai_agent resource type (brain config, tools, eval suite). AI Agent steps accept an agent link.
- At runtime, provider/model/tools come from the resource; flows keep only user_message and user_attachments.
- API: POST /w/{workspace}/ai_agents/run and /w/{workspace}/ai_agents/eval_case to run an agent or grade a case (assertions + LLM judge returning {score, pass, summary}), executed as the caller.
- Frontend: “Save as agent”, “Use saved agent”, and “Unlink” in Step Input; when linked, only flow‑local inputs are shown. New Evals tab to author cases and run them.
- Worker resolves linked agent config and overlays flow inputs; OpenAPI/docs updated; grading logic unit‑tested.

Migration
- Seeds built‑in ai_agent resource type in admins (visible to all workspaces).
- Backward compatible: existing flows unchanged; tools default to empty; linking is optional.
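The backward-compatibility points can be illustrated with a small sketch of how a reader of the step value might treat the optional link and the defaulted tools. The `AiAgentStepValue` shape, the `normalizeStep` and `serializeStep` helpers, and the example resource path are hypothetical. The real types live in the Rust backend, where skip_serializing_if plays the role that dropping `undefined` plays here.

```typescript
// Hypothetical step shape: `agent` and `tools` may both be absent on older flows.
interface AiAgentStepValue {
  type: "aiagent";
  agent?: string; // resource path of the saved agent, if linked
  tools?: unknown[];
  input_transforms: Record<string, unknown>;
}

// Treat a missing link as "inline" and missing tools as an empty list.
function normalizeStep(raw: AiAgentStepValue) {
  return {
    linked: typeof raw.agent === "string",
    agentPath: raw.agent,
    tools: raw.tools ?? [],
    inputTransforms: raw.input_transforms,
  };
}

// JSON.stringify omits undefined properties, so an unlinked step serializes
// without any `agent` key, which is why older flows are unaffected.
function serializeStep(step: AiAgentStepValue): string {
  return JSON.stringify(step);
}
```

For example, `normalizeStep` applied to a step saved before this change reports `linked: false` and an empty `tools` array, while a step carrying a path such as the made-up `f/agents/support_bot` reports `linked: true`.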

Written for commit fb25ea8. Summary will update on new commits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-06-27T19:37:11Z

Deploying windmill with Cloudflare Pages

Latest commit:	`fb25ea8`
Status:	✅ Deploy successful!
Preview URL:	https://45beb17d.windmill.pages.dev
Branch Preview URL:	https://feat-reusable-ai-agents.windmill.pages.dev


feat: reusable AI agent steps with hybrid linking and evals

fb25ea8

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rubenfiszel wants to merge 1 commit into main from feat/reusable-ai-agents
