
feat: reusable AI agent steps with hybrid linking and evals #9825

Draft
rubenfiszel wants to merge 1 commit into main from feat/reusable-ai-agents


rubenfiszel (Contributor) commented Jun 27, 2026

Summary

Makes an AI agent a savable, shareable, evaluable artifact instead of inline-only config. An agent is stored as a resource of a new built-in ai_agent resource type bundling its brain (provider/model, system prompt, temperature, output schema, memory…), its tools, and an eval suite. A flow step can link to a saved agent — the brain + tools resolve at runtime from the resource (so edits propagate), while the flow keeps only the local user_message/user_attachments wiring.
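The hybrid linking described above can be pictured roughly as follows. This is a minimal sketch, not the PR's implementation: the interfaces, the `inline` field, the `resolveStep` helper, and the in-memory `Map` standing in for the resource store are all hypothetical. Only `agent`, `user_message`, and `user_attachments` are names taken from the description.

```typescript
// Hypothetical shapes; the real ai_agent resource schema may differ.
interface AgentBrain {
  provider: string; // e.g. a "$res:" reference to a provider resource
  model: string;
  system_prompt: string;
  temperature?: number;
}

interface AgentResource {
  brain: AgentBrain;
  tools: string[];
}

interface FlowLocalInputs {
  user_message: string;
  user_attachments?: string[];
}

interface AgentStep {
  agent?: string; // resource path when the step is linked
  inline?: AgentResource; // inline config when it is not (assumed field)
  inputs: FlowLocalInputs;
}

type ResolvedAgent = AgentBrain & { tools: string[] } & FlowLocalInputs;

// Load brain + tools from the linked resource (or the inline config),
// then overlay the flow-local inputs on top.
function resolveStep(
  step: AgentStep,
  resources: Map<string, AgentResource>
): ResolvedAgent {
  const source =
    step.agent !== undefined ? resources.get(step.agent) : step.inline;
  if (source === undefined) {
    throw new Error("agent step has no resolvable configuration");
  }
  return { ...source.brain, tools: [...source.tools], ...step.inputs };
}
```

Because the lookup happens on every call, editing the stored resource changes what the next run of a linked step resolves to, which is the propagation behavior the summary describes.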

Changes

Backend

  • FlowModuleValue::AIAgent gains an optional agent field (resource path) for hybrid linking; threaded through the custom deserializer + lockfile resolver. Existing flows are byte-compatible (skip_serializing_if).
  • Runtime resolution in windmill-worker/src/ai_executor.rs: a linked step loads brain config + tools from the resource via get_resource_value_interpolated (nested provider $res: resolves for free) and overlays the flow-local inputs.
  • Migration 20260627124035_ai_agent_resource_type seeds the ai_agent resource type into the admins workspace (visible everywhere via the list_resource_types admins-union).
  • New windmill-api/src/ai_agents.rs: POST /ai_agents/run and /ai_agents/eval_case. Runs an agent by pushing a single-step AIAgent flow-preview job (reuses push + run_wait_result_internal, permissioned as the caller — no new job payload, no privilege escalation). Grading = deterministic assertions (contains/regex/json_path_equals/…) + an LLM judge that itself runs as an inline AIAgent step with a structured {score,pass,summary} schema. Unit-tested.
  • Eval-suite types in windmill-ai/src/types.rs; OpenAPI updated.
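For illustration, the deterministic part of grading could look something like the sketch below. It covers only the three assertion kinds named above (`contains`, `regex`, `json_path_equals`). The type names, the `gradeCase` helper, the simplified dot-separated path lookup, and the rule that a case passes only when every assertion and the judge pass are assumptions, not the PR's actual logic.

```typescript
// Hypothetical assertion shapes; field names are illustrative.
type Assertion =
  | { kind: "contains"; value: string }
  | { kind: "regex"; pattern: string }
  | { kind: "json_path_equals"; path: string; expected: unknown };

// Mirrors the structured {score, pass, summary} judge schema.
interface JudgeVerdict {
  score: number;
  pass: boolean;
  summary: string;
}

// Simplified path lookup: dot-separated keys only, e.g. "items.1".
function lookup(root: unknown, path: string): unknown {
  let cur: unknown = root;
  for (const key of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

function checkAssertion(output: string, a: Assertion): boolean {
  switch (a.kind) {
    case "contains":
      return output.includes(a.value);
    case "regex":
      return new RegExp(a.pattern).test(output);
    case "json_path_equals": {
      let parsed: unknown;
      try {
        parsed = JSON.parse(output);
      } catch {
        return false; // non-JSON output cannot satisfy a path assertion
      }
      // Compare by serialized form so nested values are compared by content.
      return JSON.stringify(lookup(parsed, a.path)) === JSON.stringify(a.expected);
    }
  }
}

// Assumed combination rule: every assertion AND the judge must pass.
function gradeCase(output: string, assertions: Assertion[], judge: JudgeVerdict) {
  const results = assertions.map((a) => ({ assertion: a, pass: checkAssertion(output, a) }));
  return { pass: judge.pass && results.every((r) => r.pass), results, judge };
}
```

Keeping the deterministic checks separate from the judge verdict means per-assertion results can be reported individually, as the test plan below expects.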

Frontend

  • flows/agentResourceUtils.ts — config↔input_transforms conversion + flow-local schema filter.
  • flows/content/AgentResourceBar.svelte — Save as agent / pick a saved agent / Unlink, in the Step Input tab; brain config collapses to the flow inputs when linked. Warns if a non-static brain transform would be dropped on save.
  • flows/content/AgentEvalsPanel.svelte — Evals tab: case CRUD (input message, judge checklist, assertions) + Run via the typed AiAgentService.
  • Wired into FlowModuleComponent.svelte; docs/reusable-ai-agents.md.
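The config↔input_transforms conversion and the flow-local filter might be sketched as below. This is an assumption-laden illustration rather than the contents of `agentResourceUtils.ts`: the function names and the `dropped` list are hypothetical, and the transform shape is reduced to a static and a JavaScript variant.

```typescript
// Simplified transform shape: a static value or a JavaScript expression.
type InputTransform =
  | { type: "static"; value: unknown }
  | { type: "javascript"; expr: string };

// Inputs that stay in the flow when a step is linked to a saved agent.
const FLOW_LOCAL_KEYS = new Set(["user_message", "user_attachments"]);

// Split a step's transforms into saved brain config and flow-local wiring.
// Non-static brain transforms cannot be stored as plain config, so they are
// reported in `dropped` for the UI to warn about before saving.
function transformsToConfig(transforms: Record<string, InputTransform>) {
  const config: Record<string, unknown> = {};
  const flowLocal: Record<string, InputTransform> = {};
  const dropped: string[] = [];
  for (const [key, t] of Object.entries(transforms)) {
    if (FLOW_LOCAL_KEYS.has(key)) flowLocal[key] = t;
    else if (t.type === "static") config[key] = t.value;
    else dropped.push(key);
  }
  return { config, flowLocal, dropped };
}

// Reverse direction, e.g. on Unlink: copy saved config back as static transforms.
function configToTransforms(
  config: Record<string, unknown>
): Record<string, InputTransform> {
  const out: Record<string, InputTransform> = {};
  for (const [key, value] of Object.entries(config)) {
    out[key] = { type: "static", value };
  }
  return out;
}
```

On Unlink, merging `configToTransforms(config)` with the untouched `flowLocal` entries would restore a full inline step while preserving the flow-local inputs.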

Screenshots

Screenshot (agent-step-added): AgentResourceBar (resource picker + "Save as agent") and the new Evals tab in the AI agent step.

Screenshot (linked-banner): Linked state: banner + Unlink, brain config hidden, only flow-local inputs remain.

Screenshot (evals-panel): Evals tab: a case with input message, judge checklist, and assertions editor.

Test plan

  • cargo check (full backend) green; windmill-api unit tests for the grading logic pass (6 tests)
  • Frontend check:fast clean
  • Playwright: add AI agent step → Save as agent creates ai_agent resource (verified shape in Postgres) → linked banner + Unlink, brain hidden → Evals tab case editor renders
  • Run a live eval/agent with a configured AI provider (not exercised — needs an API key + tokens): confirm pass/fail badge, judge score, latency, per-assertion results; and that a linked agent step in a real flow run resolves the provider's nested $res: under the caller's permissions
  • Pick an existing saved agent via the resource picker; Unlink and confirm brain+tools copy back into the step preserving flow-local inputs

🤖 Generated with Claude Code


Summary by cubic

Make AI agents reusable. Flow steps can link to a saved ai_agent so updates propagate, and you can run evals with assertions and an LLM judge.

  • New Features

    • New ai_agent resource type (brain config, tools, eval suite). AI Agent steps accept an agent link.
    • At runtime, provider/model/tools come from the resource; flows keep only user_message and user_attachments.
    • API: POST /w/{workspace}/ai_agents/run and /w/{workspace}/ai_agents/eval_case to run an agent or grade a case (assertions + LLM judge returning {score, pass, summary}), executed as the caller.
    • Frontend: “Save as agent”, “Use saved agent”, and “Unlink” in Step Input; when linked, only flow‑local inputs are shown. New Evals tab to author cases and run them.
    • Worker resolves linked agent config and overlays flow inputs; OpenAPI/docs updated; grading logic unit‑tested.
  • Migration

    • Seeds built‑in ai_agent resource type in admins (visible to all workspaces).
    • Backward compatible: existing flows unchanged; tools default to empty; linking is optional.

Written for commit fb25ea8. Summary will update on new commits.


Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cloudflare-workers-and-pages


Deploying windmill with Cloudflare Pages

Latest commit: fb25ea8
Status: ✅  Deploy successful!
Preview URL: https://45beb17d.windmill.pages.dev
Branch Preview URL: https://feat-reusable-ai-agents.windmill.pages.dev

