Add LLM proxy skills and API spec by jagadeshsid · Pull Request #31 · mulesoft/mulesoft-dx

jagadeshsid · 2026-04-27T12:44:26Z

Summary

Adds four JTBD skills covering the LLM Gateway workflows plus the private llm-proxy API spec (9 operations on /apimanager/xapi/v1).

Skills (4)

| Skill | Steps | Purpose |
| --- | --- | --- |
| `create-llm-proxy-model-based-routing` | 8 | Create an LLM proxy that routes by `model` field in the request body |
| `create-llm-proxy-semantic-routing` | 10 | Create an LLM proxy that routes by prompt semantics (Semantic Service Config + Global Prompt Topics) |
| `apply-token-rate-limiting-to-llm-proxy` | 5 | Apply the `llm-token-rate-limit` policy (`groupId 68ef9520-…`, `assetId llm-token-rate-limit`, `version 1.0.0`) to an existing LLM proxy |
| `request-llm-proxy-access` | 5 | Mint a `client_id` / `client_secret` and create a contract for a consumer |

API spec

- apis/llm-proxy/api.yaml: 9 operations (listLlmRouteConfigurations, listEnvironmentLlmProxies, createEnvironmentLlmProxy, getGatewayTargetApisByPortAndPath, listSemanticServiceConfigs, createSemanticServiceConfig, getSemanticServiceConfig, listGlobalPromptTopicsBySsc, createGlobalPromptTopic)
- apis/llm-proxy/exchange.json: groupId: f1e97bc6-…anypoint-platform, assetId: llm-proxy-api, version: 1.0.0, visibility: private
- apis/llm-proxy/examples/responses/: 11 captured example responses (secrets redacted)

Validation

- `make validate-api API=apis/llm-proxy`: basic OAS 0/0, governance 0/0
- `python3 scripts/build/validate_jtbd.py`: all 4 skills PASS individually
- `python3 scripts/build/validate_xorigin.py apis/llm-proxy`: no violations
- `make validate-descriptions`: no violations
- `make generate-portal`: portal renders cleanly; [oas] LLM Proxy API + 4 [agent-skill] entries in registry

Verification approach

Payload shapes, URLs, and response shapes are grounded in:

- Anypoint UI source (api-manager-ui-node-lib repo). Every URL, payload field, and enum value traces back to a specific file/line (e.g., AddLlmProxyWizard.js:596-736 for the LLM proxy POST shape, routeConfigs.js:21-276 for the per-provider field catalog, constants.js:703-706 for the approvalMethod enum).
- Live HAR captures from stgx.anypoint.mulesoft.com: one model-based proxy create flow and one semantic proxy create flow (with both basic and advanced SSCs), captured end-to-end and diffed against the skill claims.
- Live 200 round-trip through the gateway: invoked a deployed LLM proxy with valid credentials; confirmed every x-llm-proxy-* response header documented in the skills, plus the 401 / fallback / model-override behaviors.

Test plan

- Reviewer can run `make validate-api API=apis/llm-proxy` locally and confirm 0 violations
- Reviewer can run `python3 scripts/build/validate_jtbd.py skills/<skill>/SKILL.md .` for each of the four new skills
- Reviewer can run `make generate-portal` and verify the LLM Proxy API spec page + four skill HTML pages render without errors
- CLA: I will sign the Salesforce CLA on first PR submission as prompted

Notes for review

- The repo's pre-existing run-agent-scan-and-view-results skill currently fails validate-jtbd (missing YAML frontmatter). That is unrelated to this PR and predates these changes.
- The LLM proxy API is marked visibility: private, consistent with api-portal-xapi (the only other private API in the repo). Private APIs render the spec into portal/apis/<name>/api.yaml but skip the standalone HTML detail page by design.
- A standalone review document with full source citations and a "Verified by HAR" appendix is available on request.

Made with Cursor

Adds four JTBD skills covering the LLM Gateway workflows (model-based routing, semantic routing, token rate limiting, consumer access) and the private llm-proxy API spec (9 operations on /apimanager/xapi/v1).

Skills:
- create-llm-proxy-model-based-routing (8 steps)
- create-llm-proxy-semantic-routing (10 steps)
- apply-token-rate-limiting-to-llm-proxy (5 steps)
- request-llm-proxy-access (5 steps)

API: apis/llm-proxy/ (api.yaml, exchange.json, 11 example responses)

Payload shapes and URLs are grounded in the Anypoint UI source (api-manager-ui-node-lib) and live HAR captures from stgx, including a successful 200 invocation through the gateway.

All four validators pass: validate-jtbd, validate-api, validate-xorigin, validate-descriptions; make generate-portal renders the spec and all four skill pages cleanly.

Made-with: Cursor

Three alignment fixes after a deep validator + cross-reference pass:

1. Convert input refs to canonical `from: { variable: X }` form. The first cut used `from: { step: <name>, output|input: X }`, an evolved schema documented in the (now-archived) anypoint-public-api-specs repo. mulesoft-dx's validator + job-template + every existing skill use the simpler `from: { variable: X }`. Reverted 61 references across all four LLM skills; validators now run cleanly with zero warnings (was 19+26+8+8 warnings on the prior commit).
2. Wire up the missing example response in api.yaml. `gatewayTargetApisOccupied.json` was committed but not referenced from any operation. Switched the corresponding `getGatewayTargetApisByPortAndPath` 200 response to OAS 3.0 `examples:` (plural) with named entries: `Available` (the existing case) and `Occupied` (the conflict case). Both cases now show up in generated portal docs.
3. Remove broken cross-skill reference in `request-llm-proxy-access`. The Related Jobs section linked `manage-consumer-contracts`, which exists in older internal repos but was never migrated to mulesoft-dx. Dropped the bullet rather than introduce a dead link.

Also adds `apis/llm-proxy/examples/responses/README.md` cataloguing every example file: which `api.yaml` operation references it, and the two supplementary references that document responses from other API specs (used as concrete shapes by the LLM Gateway skills).

Validators all green: validate-jtbd (per skill, 0 warnings + 0 errors), validate-api (basic OAS 0/0, governance 0/0), validate-xorigin (clean), validate-descriptions (clean).

Made-with: Cursor
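
As an illustration of fix 1 in the commit above, the ref conversion might be sketched as follows. This is not the script used in the PR; the function name `to_variable_ref` is hypothetical, and it assumes the variable name is the same `X` that appeared under `output` or `input`.

```python
def to_variable_ref(ref: dict) -> dict:
    """Collapse a step-scoped ref into the canonical `{variable: X}` form.

    Assumption: the variable name is whatever was stored under
    `output` or `input` in the step-scoped form.
    """
    if "variable" in ref:
        return ref  # already canonical, leave untouched
    for key in ("output", "input"):
        if key in ref:
            return {"variable": ref[key]}
    raise ValueError(f"unrecognized ref shape: {ref!r}")


# Example: a step-scoped ref becomes a plain variable ref.
converted = to_variable_ref({"step": "create-proxy", "output": "apiInstanceId"})
```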

Walked the four skills end-to-end as if I were the agent and identified specific spots where the agent would have to guess or where it could miss a step. Eight surgical edits across the four skills:

Skill 1 (model-based routing):
- Step 6: when the user hasn't specified an inbound `properties.platform`, default to `openai` (universal client format, supports all five providers via transcoding). Pick `gemini` only if the user explicitly wants Gemini-format consumer bodies (single-route only).
- Step 7: the `fallbackRoute` value MUST match a route's `label` exactly. The server silently accepts mismatched values, but at runtime fallback never triggers. An explicit warning makes the agent pick from the labels it just constructed instead of asking the user for free text.
- Step 7: capture `routing[*].rules.headers.x-routing-header` as a `modelPrefixes` output, and surface those values to the user in "What happens next" so they know which `<prefix>/<model>` strings are valid in the request body.
- Tips: add a mixed-mode credentials note. Modes are per-upstream, so a proxy can have OpenAI on a static key AND Gemini on DataWeave at the same time. Avoids the "pick one mode for the whole proxy" assumption.

Skill 2 (semantic routing):
- Step 4: utterance-elicitation guidance. When the user hasn't supplied utterances, the agent should prompt per-topic for 5–10 diverse phrasings, refuse fewer than 5, and challenge utterances that overlap multiple topics.

Skill 3 (token rate limiting):
- Step 5: post-apply verification. Make a test call after ~30s and confirm the `x-llm-proxy-ratelimit` header is present in the 200 response. Header absence after 90s suggests the policy didn't apply.

Skill 4 (request access):
- Step 5: SLA-tier discovery. Before posting the contract, list tiers via `listOrganizationsEnvironmentsApisTiers` to detect tiered proxies proactively rather than recovering from a 400.
- Invocation Next Steps: lead with `/chat/completions` as the universal subpath and document that `/responses` is OpenAI-only (the transcoding policies for non-OpenAI providers don't currently translate `/responses` semantics).
- Invocation Next Steps: add explicit guidance on how to find a proxy's valid `model` prefixes: read `routing[*].rules.headers.x-routing-header` on the existing proxy and construct `<prefix>/<model>`.

All four skills still pass `validate-jtbd` per-skill with zero warnings. `make validate-api` clean. `make validate-descriptions` clean.

Made-with: Cursor
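
The two Step 7 checks in the commit above (fallback label matching and prefix capture) could be sketched client-side roughly as below. The payload layout is an assumption: it takes `routing` to be a list of route objects that each carry a `label` and a `rules.headers.x-routing-header` value, which matches the JSONPath quoted in the commit but is not confirmed against the API spec. Both function names are hypothetical.

```python
def model_prefixes(proxy: dict) -> list[str]:
    """Collect routing[*].rules.headers.x-routing-header values, in order."""
    prefixes = []
    for route in proxy.get("routing", []):
        value = route.get("rules", {}).get("headers", {}).get("x-routing-header")
        if value is not None:
            prefixes.append(value)
    return prefixes


def check_fallback_route(proxy: dict, fallback_route: str) -> None:
    """Fail fast when fallbackRoute is not an exact match for a route label.

    The commit notes the server accepts a mismatch silently, so this
    check has to happen before the POST.
    """
    labels = [route.get("label") for route in proxy.get("routing", [])]
    if fallback_route not in labels:
        raise ValueError(
            f"fallbackRoute {fallback_route!r} matches no route label; "
            f"choose one of {labels}"
        )


# Assumed payload shape, for illustration only.
proxy = {
    "routing": [
        {"label": "openai", "rules": {"headers": {"x-routing-header": "openai"}}},
        {"label": "gemini", "rules": {"headers": {"x-routing-header": "gemini"}}},
    ]
}
check_fallback_route(proxy, "openai")      # exact match, passes
valid_prefixes = model_prefixes(proxy)     # values to surface to the user
```

A consumer would then build `model` strings as `<prefix>/<model>` from the returned prefixes.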

Adds descriptions to the few inner-array `items:` and the route-rule `headers` map that previously inherited context from their parents. Surfaced by an end-to-end audit pass (every property under every request body now has a description). Made-with: Cursor

Addresses lead review feedback that the combined create-llm-proxy-semantic-routing skill conflates the basic and advanced flows. The two flows differ enough at the topic-create endpoint, the SSC payload, the deployment story, and the post-create policy attach that one skill was forcing branchy logic on the agent.

Changes:
- Add skills/create-llm-proxy-semantic-routing-basic/SKILL.md (10 steps). Uses /prompt-topics, no vector DB, documents the platform's auto-rebind behavior and the agent's need to read back the resolved SSC, no manual policy attach (basic variants auto-attach).
- Add skills/create-llm-proxy-semantic-routing-advanced/SKILL.md (12 steps). Uses /global-prompt-topics, requires a vector DB, includes a new step for fetching and running the GET /semantic-setup-script that hydrates the vector DB, plus the manual semantic-routing-policy-* apply step (advanced variants are not auto-attached).
- Remove skills/create-llm-proxy-semantic-routing/.
- Add createPromptTopic and getSemanticSetupScript operations to apis/llm-proxy/api.yaml plus a captured promptTopic.json example (live-tested against stgx).
- Both new skills:
  - Restructure topic + utterance gathering as inline-first, with a CSV/JSON file-path fallback for larger sets (the agent parses the file itself).
  - Consolidate ask-the-user inputs into the Prerequisites section so the agent reads it as a checklist and prompts up front.
- Skill 1 (model-based) gets the same Prerequisites consolidation.
- Update cross-references in Skill 1, Skill 3, Skill 4 to point at the two new skill names.

All five validators pass clean. Cross-ref audit: 29 output JSONPaths across the 5 skills resolve in their captured example responses.

Made-with: Cursor

Addresses lead review feedback that the create skills should be config-driven, with explicit verification curls per creation step and a final end-to-end test step.

Common changes (all three create skills):
- New "Read llmproxy.yaml first" section at the top of each skill describing a recommended input file alongside .env. The skill text tells the agent to be flexible about field names and structure since customer YAMLs may vary; per-step lookup tables specify which YAML fields short-circuit which steps.
- Conditional skip notices on Steps 1 + 2 (org/env), Step 4 (gateway target), Step 5 (port + base path defaults), based on llmproxy.yaml contents.
- Asset publish broken into three explicit phases (POST, poll the publicationStatusLink, verify attributes), each with its own curl block, per the lead's "polling and verifying are two separate steps".
- New verification curl after the proxy create (Step 7 in model-based, Step 9 in semantic-basic, Step 10 in semantic-advanced) so customers can check the API instance metadata before the deployment poll.
- New "Test the Proxy with a real request" section at the end of each create skill, with concrete curls for happy-path + fallback cases and triage guidance per response shape. References request-llm-proxy-access for credential creation.
- Soften apiVersionStatus first-poll claim: agent E2E test on stgx showed the field is null on the very first poll (~3s in), then unregistered, then active. Gate is unchanged ("== active") but the explanatory wording is corrected.

Skill 2 (semantic-basic) specific:
- Replaced the "Trade-offs vs advanced" comparison table with a smaller "Basic-mode limits" callout, which keeps the limits visible without confusing readers who came here for basic only.
- Added embedding service provider name to Prerequisites alongside key + model.
- Step 3 (list SSCs), Step 4 (create SSC) marked conditional on routing.ssc.id being set in YAML.
- Topic + utterance source priority now lists llmproxy.yaml inline, llmproxy.yaml topics_csv/topics_json paths, then in-chat inline, then file path the user names in chat.

Skill 3 (semantic-advanced) specific:
- Steps 3, 4, 5 all conditional on routing.ssc.id (reuse existing SSC scenario).
- Step 6 (getSemanticSetupScript) kept and emphasized: verified live on stgx that this is a separate API call from the SSC create response, and that without running the script the vector DB stays empty and every prompt routes to fallback.

Validation:
- All 5 JTBD validators PASS (0 warnings, 0 errors each)
- validate-api / validate-xorigin / validate-descriptions clean
- Cross-ref: 29/29 output JSONPaths resolve in their captured examples
- Live agent E2E test against the revised model-based skill succeeded end-to-end on stgx in ~30s with zero outside lookups, zero hard gaps, and zero ambiguities (only one wording fix applied, included above).

Made-with: Cursor
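
The `apiVersionStatus` gate described in the commit above (null, then unregistered, then active) might be polled along these lines. This is a sketch, not the skill's own procedure: `fetch_status` is a hypothetical callable standing in for whatever request returns the field, and the interval and poll budget are illustrative defaults.

```python
import time
from typing import Callable, Optional


def wait_until_active(
    fetch_status: Callable[[], Optional[str]],
    interval_s: float = 3.0,
    max_polls: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the status equals "active"; return the number of polls used.

    None (field not populated yet) and "unregistered" are both treated as
    "keep waiting", since the commit reports both appear before "active".
    """
    for attempt in range(1, max_polls + 1):
        if fetch_status() == "active":
            return attempt
        if attempt < max_polls:
            sleep(interval_s)
    raise TimeoutError(f"status not active after {max_polls} polls")


# Simulated sequence matching the behavior reported on stgx.
_observed = iter([None, "unregistered", "active"])
polls_used = wait_until_active(lambda: next(_observed), sleep=lambda _s: None)
```

Injecting `sleep` keeps the loop testable without real delays.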

Five comments surfaced when an agent ran the semantic-basic skill interactively (without a pre-baked llmproxy.yaml). All five are addressed; one of them required removing a previously-documented manual step that the platform has since started auto-attaching.

Comment 1: Routes elicitation should be explicit and unbounded.
- All three create skills: the Step that generates the proxy POST (or, for semantic skills, the Step that creates topics) now opens with a "Routes: elicit these from the user" sub-section. The prose explicitly says don't assume a fixed number of routes (not two, not any specific count), don't assume which providers, and don't assume any particular credential mode. Per route the agent asks: provider (any subset of the catalog), target model, key mode (static or DataWeave), label.
- For semantic skills, also calls out that multiple topics CAN map to the same route (many-to-one is normal, e.g. Finance + Investing both routing to the same OpenAI route). The validation checklist now includes "every topic's route_label matches one of the route labels you defined above".

Comment 2: Routing policy attach is automatic for advanced too (verified live on stgx 2026-04-30: proxy 4696396 has semantic-routing-policy-openai-qdrant auto-attached).
- Advanced skill: removed the manual policy-attach Step 11 entirely. Renumbered the deployment poll from Step 12 to Step 11. Updated the overview, the API endpoints table, the URL gotchas list, the completion checklist, and the troubleshooting prose to reflect that the platform now auto-attaches the semantic-routing-policy-<provider>-<vectordb> variant. Net effect: 12 steps → 11 steps. Skill is shorter and matches current platform behavior.

Comment 3: Step 3 should filter by serviceType + ASK the user.
- semantic-basic Step 3: explicit "filter to serviceType: basic client-side" instruction + an explicit "ASK the user whether to use one of these or create a new one" prompt. Don't silently default. The auto-rebind quirk caveat remains.
- semantic-advanced Step 3: same pattern, filter to advanced.

Comment 4: Don't auto-select the Flex Gateway target.
- All three create skills' Step 4/6/7 (gateway target listing) now end with "Surface the eligible targets to the user and ASK them to pick one. Do NOT auto-select even if there's only one eligible candidate or one with a 'right-looking' name."

Comment 5: Client app create body shape. Be strict and surface the common failure modes.
- request-llm-proxy-access Step 4: new "Critical: request body shape" callout. Lists exactly the two-field body shape that works (verified live 2026-04-30: `{name, description}` returns 201 with clientId + clientSecret), and explicitly enumerates fields the agent should NOT include (applicationId, clientId, clientSecret as null inputs; targetApiInstanceId; redirectUri, homepageUrl) including the specific 400 message that confused agents in a previous run.

Validation:
- All 5 JTBD validators PASS (0 warnings, 0 errors each)
- validate-api / validate-xorigin / validate-descriptions clean
- Portal generates clean

Made-with: Cursor
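
For Comment 5 in the commit above, a pre-flight check on the client app body might look like the sketch below. The field names are taken from the commit text; the helper names are hypothetical, and the check is client-side only (it does not reproduce the server's 400 message).

```python
# Fields the commit says the agent should NOT send on client app create.
FORBIDDEN_FIELDS = {
    "applicationId",
    "clientId",
    "clientSecret",
    "targetApiInstanceId",
    "redirectUri",
    "homepageUrl",
}


def build_client_app_body(name: str, description: str) -> dict:
    """Return the two-field body the commit reports as returning 201."""
    return {"name": name, "description": description}


def find_forbidden_fields(body: dict) -> list[str]:
    """List any disallowed keys present in a candidate body, sorted by name."""
    return sorted(FORBIDDEN_FIELDS.intersection(body))


body = build_client_app_body("llm-consumer", "Consumer app for the LLM proxy")
problems = find_forbidden_fields({**body, "clientId": None, "redirectUri": "x"})
```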

Five subagents tested model-based×{YAML,inline}, semantic-basic×{YAML,inline}, and apply-token-rate-limiting on stgx in parallel. All five completed end-to-end (cleanup included). Aggregate findings drove the changes below.

Hard skill errors corrected:
- **Model-based routing supports at most ONE route per provider.** The agent that elicited 3 routes (testing the "you can have 1, 3, 5+ routes" claim) had its third route silently deduped by the server: upstreams are merged on (provider, uri, label), and the routing policy keys off the resolved provider, not the route's x-routing-header value. Skill 1's Step 7 elicitation now spells out the constraint: max 5 routes (one per supported provider), with an explicit "reject duplicate provider" guard during elicitation.
- **Upstream `label` MUST equal the provider identifier.** Skill 1 never said this; the agent hit `400 BadRequestError: "Upstream in route 'X' has provider 'openai' which requires upstream label to be 'openai', but found label 'openai-full'"` on first try. Now stated explicitly with the exact 400 message included.
- **`endpointUri` is often comma-separated** (`<cloudhub-url>,<custom>`). All three create skills' "Test the Proxy" sections now warn about this and document the recovery (split on `,` and pick first, OR fetch canonical URL from `gateway-targets/{id}.configuration.ingress.publicUrl`).
- **Deploy-failed-with-no-error recovery** (target-side flake, even with `RUNNING+ready=true`). All three create skills' deployment-poll Common Issues now document the full-instance PATCH-to-different-target recovery lever, plus a tip that long polls should tolerate transient curl errors (`--max-time 20 --retry 3 --retry-delay 2`).
- **apply-token-rate-limiting was missing Cleanup + verification curl.** Added a Cleanup section (the policy-DELETE pattern), and a copy-pasteable verification curl with `jq` filter showing the policy with its applied configuration. Also: corrected the catalog-expansion magnitudes (~101→629 verified live, was claimed ~130→1200), and softened the `assetVersion: "1.0.0"` "Latest stable" wording since `1.0.1` is also published.

Ambiguities clarified:
- **`approvalMethod` on Skill 1 Step 7**: was hard-coded `value: null` in the YAML block but the lookup table at the top says it's YAML-driven. Now consistently YAML-driven with `null` as the default.
- **Auto-rebind, send `null` on every call**: Skill 2 Step 5 now explicitly says "do NOT propagate topic-1's resolved SSC into topic-2's request body; send `null` on every prompt-topic POST and capture the response value each time."
- **`embedding.*` block in YAML when Step 4 is skipped**: Skill 2 Step 5 now flags that the YAML's embedding config is silently disregarded if the env already has basic SSCs (auto-rebind picks the env default regardless), so the agent surfaces this to the user.
- **Threshold inclusivity**: both semantic skills now say the comparison is "strictly greater than" the threshold (a score of exactly the threshold falls back). Verified: score 0.50 with threshold 0.50 falls back.
- **`.env` orgId reconciliation**: Skill 1 Step 1 now notes that `.env` orgIds are commonly stale; trust `listMe` over `.env` and surface mismatches.

Validation:
- All 5 JTBD validators PASS (0 warnings, 0 errors)
- validate-api / validate-xorigin / validate-descriptions clean
- Portal regenerates clean
- 5 parallel agent E2E runs on stgx all completed (resources cleaned up)

Made-with: Cursor
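
Three of the rules in the commit above are mechanical enough to sketch in code: the `endpointUri` split, the strict threshold comparison, and the one-route-per-provider / label-equals-provider guard. The route dictionaries below use hypothetical `provider` and `label` keys for illustration; the real payload nesting is not shown in this PR text, and none of these helpers come from the skills themselves.

```python
def first_endpoint(endpoint_uri: str) -> str:
    """Return the first URL from a possibly comma-separated endpointUri."""
    return endpoint_uri.split(",")[0].strip()


def passes_threshold(score: float, threshold: float) -> bool:
    """Strictly greater than: a score equal to the threshold falls back."""
    return score > threshold


def check_routes(routes: list[dict]) -> None:
    """Reject duplicate providers and labels that differ from the provider."""
    seen = set()
    for route in routes:
        provider, label = route["provider"], route["label"]
        if provider in seen:
            raise ValueError(f"duplicate provider: {provider}")
        if label != provider:
            raise ValueError(f"label {label!r} must equal provider {provider!r}")
        seen.add(provider)


# Placeholder URLs, not real gateway addresses.
url = first_endpoint("https://gw.example.com/llm,https://custom.example.com/llm")
at_threshold = passes_threshold(0.50, 0.50)   # False: falls back
check_routes([
    {"provider": "openai", "label": "openai"},
    {"provider": "gemini", "label": "gemini"},
])
```

Running the guard during elicitation surfaces the problem before the POST, rather than relying on the server's 400 or its silent dedupe.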

jagadeshsid requested a review from a team as a code owner April 27, 2026 12:44

Merge branch 'master' into siddhartha/llm-gateway-skills

48c6f58

jagadeshsid changed the title ~~Add LLM Gateway skills and llm-proxy API spec~~ Add LLM proxy skills and API spec Apr 27, 2026

jagadeshsid marked this pull request as draft April 27, 2026 13:15

jagadeshsid added 8 commits April 28, 2026 16:50

Adding learnings from live debugging

31b1c21

Minor improvements to semantic flow

637a1d2

jagadeshsid wants to merge 11 commits into mulesoft:master from jagadeshsid:siddhartha/llm-gateway-skills
