Add LLM proxy skills and API spec #31
Draft
jagadeshsid wants to merge 11 commits into
Adds four JTBD skills covering the LLM Gateway workflows (model-based routing, semantic routing, token rate limiting, consumer access) and the private llm-proxy API spec (9 operations on /apimanager/xapi/v1).
Skills:
- create-llm-proxy-model-based-routing (8 steps)
- create-llm-proxy-semantic-routing (10 steps)
- apply-token-rate-limiting-to-llm-proxy (5 steps)
- request-llm-proxy-access (5 steps)
API: apis/llm-proxy/ (api.yaml, exchange.json, 11 example responses)
Payload shapes and URLs are grounded in the Anypoint UI source (api-manager-ui-node-lib) and live HAR captures from stgx, including a successful 200 invocation through the gateway.
All four validators pass: validate-jtbd, validate-api, validate-xorigin, validate-descriptions; make generate-portal renders the spec and all four skill pages cleanly.
Made-with: Cursor
Three alignment fixes after a deep validator + cross-reference pass:
1. Convert input refs to canonical `from: { variable: X }` form
The first cut used `from: { step: <name>, output|input: X }`, an
evolved schema documented in the (now-archived) anypoint-public-api-specs
repo. mulesoft-dx's validator + job-template + every existing skill use
the simpler `from: { variable: X }`. Reverted 61 references across all
four LLM skills; validators now run cleanly with zero warnings (was
19+26+8+8 warnings on the prior commit).
2. Wire up the missing example response in api.yaml
`gatewayTargetApisOccupied.json` was committed but not referenced from
any operation. Switched the corresponding `getGatewayTargetApisByPortAndPath`
200 response to OAS 3.0 `examples:` (plural) with named entries:
`Available` (the existing case) and `Occupied` (the conflict case).
Both cases now show up in generated portal docs.
3. Remove broken cross-skill reference in `request-llm-proxy-access`
The Related Jobs section linked `manage-consumer-contracts`, which
exists in older internal repos but was never migrated to mulesoft-dx.
Dropped the bullet rather than introduce a dead link.
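The reference conversion in fix 1 can be sketched as a small rewrite over parsed skill data. This is only an illustration: it assumes the skill's inputs are already loaded into plain dicts and lists, and the helper names `to_variable_ref` and `normalize` are hypothetical, not part of the validator.

```python
from typing import Any


def to_variable_ref(ref: dict[str, Any]) -> dict[str, Any]:
    """Collapse a step-scoped ref to the canonical `{variable: X}` form."""
    if "variable" in ref:
        return {"variable": ref["variable"]}  # already canonical
    if "step" in ref:
        # The evolved form carries the name under `output` or `input`.
        name = ref.get("output", ref.get("input"))
        if name is None:
            raise ValueError(f"step ref has no output/input: {ref!r}")
        return {"variable": name}
    raise ValueError(f"unrecognized ref: {ref!r}")


def normalize(node: Any) -> Any:
    """Return a copy of the parsed data with every `from:` mapping rewritten."""
    if isinstance(node, dict):
        return {
            key: (
                to_variable_ref(value)
                if key == "from" and isinstance(value, dict)
                else normalize(value)
            )
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize(item) for item in node]
    return node
```

The sketch keeps the variable name `X` unchanged and drops the step name, which matches the before/after forms quoted above.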
Also adds `apis/llm-proxy/examples/responses/README.md` cataloguing every
example file: which `api.yaml` operation references it, and the two
supplementary references that document responses from other API specs
(used as concrete shapes by the LLM Gateway skills).
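An orphaned example like `gatewayTargetApisOccupied.json` could be caught mechanically. The sketch below is an assumption about how such a check might look, not an existing validator: it does a plain substring search for each example's file name in the spec text, and `unreferenced_examples` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def unreferenced_examples(spec_text: str, example_paths: list[str]) -> list[str]:
    """Return example files whose file name never appears in the spec text."""
    return sorted(
        path
        for path in example_paths
        if PurePosixPath(path).name not in spec_text
    )
```

A substring match is deliberately crude; it would also count a mention inside a comment as a reference, so it only flags files that are clearly unused.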
Validators all green: validate-jtbd (per skill, 0 warnings + 0 errors),
validate-api (basic OAS 0/0, governance 0/0), validate-xorigin (clean),
validate-descriptions (clean).
Made-with: Cursor
Walked the four skills end-to-end as if I were the agent and identified specific spots where the agent would have to guess or where it could miss a step. Eight surgical edits across the four skills:
Skill 1 (model-based routing):
- Step 6: when the user hasn't specified an inbound `properties.platform`, default to `openai` (universal client format, supports all five providers via transcoding). Pick `gemini` only if the user explicitly wants Gemini-format consumer bodies (single-route only).
- Step 7: the `fallbackRoute` value MUST match a route's `label` exactly. The server silently accepts mismatched values, but at runtime fallback never triggers. An explicit warning now has the agent pick from the labels it just constructed instead of asking the user for free text.
- Step 7: capture `routing[*].rules.headers.x-routing-header` as a `modelPrefixes` output, and surface those values to the user in "What happens next" so they know which `<prefix>/<model>` strings are valid in the request body.
- Tips: add a mixed-mode credentials note. Modes are per-upstream, so a proxy can have OpenAI on a static key AND Gemini on DataWeave at the same time. Avoids the "pick one mode for the whole proxy" assumption.
Skill 2 (semantic routing):
- Step 4: utterance-elicitation guidance. When the user hasn't supplied utterances, the agent should prompt per-topic for 5–10 diverse phrasings, refuse fewer than 5, and challenge utterances that overlap multiple topics.
Skill 3 (token rate limiting):
- Step 5: post-apply verification. Make a test call after ~30s and confirm the `x-llm-proxy-ratelimit` header is present in the 200 response. Header absence after 90s suggests the policy didn't apply.
Skill 4 (request access):
- Step 5: SLA-tier discovery. Before posting the contract, list tiers via `listOrganizationsEnvironmentsApisTiers` to detect tiered proxies proactively rather than recovering from a 400.
- Invocation Next Steps: lead with `/chat/completions` as the universal subpath and document that `/responses` is OpenAI-only (the transcoding policies for non-OpenAI providers don't currently translate `/responses` semantics).
- Invocation Next Steps: add explicit guidance on how to find a proxy's valid `model` prefixes: read `routing[*].rules.headers.x-routing-header` on the existing proxy and construct `<prefix>/<model>`.
All four skills still pass `validate-jtbd` per-skill with zero warnings. `make validate-api` clean. `make validate-descriptions` clean.
Made-with: Cursor
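The two Step 7 rules for Skill 1 can be illustrated with a short sketch. It assumes the proxy payload is a parsed dict in which `routing[*].rules.headers.x-routing-header` resolves as written above; the function names and the flat `route_labels` list are hypothetical, since the exact location of `label` in the payload is not shown here.

```python
from typing import Any


def check_fallback_route(fallback_route: str, route_labels: list[str]) -> None:
    """Fail fast: the server accepts a mismatch, but fallback then never fires."""
    if fallback_route not in route_labels:  # exact, case-sensitive match
        raise ValueError(
            f"fallbackRoute {fallback_route!r} must be one of {route_labels}"
        )


def model_prefixes(proxy: dict[str, Any]) -> list[str]:
    """Collect routing[*].rules.headers.x-routing-header, skipping gaps."""
    prefixes = []
    for route in proxy.get("routing", []):
        value = route.get("rules", {}).get("headers", {}).get("x-routing-header")
        if value is not None:
            prefixes.append(value)
    return prefixes
```

The valid `model` strings to show the user would then be built as `f"{prefix}/<model>"` for each collected prefix.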
Adds descriptions to the few inner-array `items:` and the route-rule `headers` map that previously inherited context from their parents. Surfaced by an end-to-end audit pass (every property under every request body now has a description).
Made-with: Cursor
Addresses lead review feedback that the combined
create-llm-proxy-semantic-routing skill conflates the basic and advanced
flows. The two flows differ enough at the topic-create endpoint, the SSC
payload, the deployment story, and the post-create policy attach that one
skill was forcing branchy logic on the agent.
Changes:
- Add skills/create-llm-proxy-semantic-routing-basic/SKILL.md (10 steps).
Uses /prompt-topics, no vector DB, documents the platform's auto-rebind
behavior and the agent's need to read back the resolved SSC, no manual
policy attach (basic variants auto-attach).
- Add skills/create-llm-proxy-semantic-routing-advanced/SKILL.md (12 steps).
Uses /global-prompt-topics, requires a vector DB, includes a new step
for fetching and running the GET /semantic-setup-script that hydrates
the vector DB, plus the manual semantic-routing-policy-* apply step
(advanced variants are not auto-attached).
- Remove skills/create-llm-proxy-semantic-routing/.
- Add createPromptTopic and getSemanticSetupScript operations to
apis/llm-proxy/api.yaml plus a captured promptTopic.json example
(live-tested against stgx).
- Both new skills:
- Restructure topic + utterance gathering as inline-first, with a
CSV/JSON file-path fallback for larger sets (the agent parses the
file itself).
- Consolidate ask-the-user inputs into the Prerequisites section so
the agent reads it as a checklist and prompts up front.
- Skill 1 (model-based) gets the same Prerequisites consolidation.
- Update cross-references in Skill 1, Skill 3, Skill 4 to point at the
two new skill names.
All five validators pass clean. Cross-ref audit: 29 output JSONPaths
across the 5 skills resolve in their captured example responses.
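The CSV file-path fallback for topics and utterances might be parsed along these lines. This is a sketch under assumptions: the column names `topic` and `utterance` are hypothetical (the skills do not fix a file layout here), and the minimum of five utterances per topic is taken from the earlier utterance-elicitation guidance.

```python
import csv
import io
from collections import defaultdict

MIN_UTTERANCES = 5  # the elicitation guidance refuses fewer than five per topic


def load_topics(csv_text: str) -> dict[str, list[str]]:
    """Group utterances by topic from CSV text with `topic,utterance` columns."""
    topics: dict[str, list[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        topic, utterance = row["topic"].strip(), row["utterance"].strip()
        if topic and utterance:  # skip blank rows
            topics[topic].append(utterance)
    return dict(topics)


def thin_topics(topics: dict[str, list[str]]) -> list[str]:
    """Names of topics that still need more utterances."""
    return sorted(
        name for name, items in topics.items() if len(items) < MIN_UTTERANCES
    )
```

Anything returned by `thin_topics` is a prompt for the agent to go back to the user rather than a reason to drop the topic silently.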
Made-with: Cursor
Addresses lead review feedback that the create skills should be
config-driven, with explicit verification curls per creation step and
a final end-to-end test step.
Common changes (all three create skills):
- New "Read llmproxy.yaml first" section at the top of each skill
describing a recommended input file alongside .env. The skill text
tells the agent to be flexible about field names and structure since
customer YAMLs may vary; per-step lookup tables specify which YAML
fields short-circuit which steps.
- Conditional skip notices on Steps 1 + 2 (org/env), Step 4 (gateway
target), Step 5 (port + base path defaults), based on llmproxy.yaml
contents.
- Asset publish broken into three explicit phases — POST, poll the
publicationStatusLink, verify attributes — each with its own curl
block, per the lead's "polling and verifying are two separate steps".
- New verification curl after the proxy create (Step 7 in model-based,
Step 9 in semantic-basic, Step 10 in semantic-advanced) so customers
can check the API instance metadata before the deployment poll.
- New "Test the Proxy with a real request" section at the end of each
create skill, with concrete curls for happy-path + fallback cases
and triage guidance per response shape. References
request-llm-proxy-access for credential creation.
- Soften apiVersionStatus first-poll claim — agent E2E test on stgx
showed the field is null on the very first poll (~3s in), then
unregistered, then active. Gate is unchanged ("== active") but the
explanatory wording is corrected.
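The `apiVersionStatus` gate described in the last bullet can be sketched as a polling loop that treats `null` and `unregistered` as "not ready yet". The `get_status` callable, the attempt count, and the 3-second delay are assumptions for illustration; the real skills use curl.

```python
import time
from typing import Callable, Optional


def wait_until_active(
    get_status: Callable[[], Optional[str]],
    attempts: int = 30,
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the status is exactly "active"; return how many polls it took.

    None and "unregistered" are expected early values, not failures.
    """
    for attempt in range(1, attempts + 1):
        if get_status() == "active":
            return attempt
        sleep(delay_s)
    raise TimeoutError(f"status never reached 'active' after {attempts} polls")
```

Injecting `sleep` keeps the loop testable without waiting; a caller would normally leave it at the default.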
Skill 2 (semantic-basic) specific:
- Replaced the "Trade-offs vs advanced" comparison table with a
smaller "Basic-mode limits" callout — keeps the limits visible
without confusing readers who came here for basic only.
- Added embedding service provider name to Prerequisites alongside
key + model.
- Step 3 (list SSCs), Step 4 (create SSC) marked conditional on
routing.ssc.id being set in YAML.
- Topic + utterance source priority now lists llmproxy.yaml inline,
llmproxy.yaml topics_csv/topics_json paths, then in-chat inline,
then file path the user names in chat.
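The source priority for topics and utterances can be written down as an ordered lookup. In this sketch the `topics` key for inline YAML content and the flat placement of `topics_csv` / `topics_json` are assumptions (customer YAMLs may nest them differently), and `pick_topic_source` is a hypothetical helper.

```python
from typing import Any, Optional


def pick_topic_source(
    config: dict[str, Any],
    chat_inline: Optional[list] = None,
    chat_file_path: Optional[str] = None,
) -> tuple[str, Any]:
    """Return (source_kind, value) for the first source that is present."""
    candidates = [
        ("yaml_inline", config.get("topics")),
        ("yaml_file", config.get("topics_csv") or config.get("topics_json")),
        ("chat_inline", chat_inline),
        ("chat_file", chat_file_path),
    ]
    for kind, value in candidates:
        if value:
            return kind, value
    raise LookupError("no topic source found; ask the user for topics")
```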
Skill 3 (semantic-advanced) specific:
- Steps 3, 4, 5 all conditional on routing.ssc.id (reuse existing SSC
scenario).
- Step 6 (getSemanticSetupScript) kept and emphasized — verified live
on stgx that this is a separate API call from the SSC create
response, and that without running the script the vector DB stays
empty and every prompt routes to fallback.
Validation:
- All 5 JTBD validators PASS (0 warnings, 0 errors each)
- validate-api / validate-xorigin / validate-descriptions clean
- Cross-ref: 29/29 output JSONPaths resolve in their captured examples
- Live agent E2E test against the revised model-based skill succeeded
end-to-end on stgx in ~30s with zero outside lookups, zero hard gaps,
and zero ambiguities (only one wording fix applied, included above).
Made-with: Cursor
Five comments surfaced when an agent ran the semantic-basic skill
interactively (without a pre-baked llmproxy.yaml). All five are
addressed; one of them required removing a previously-documented
manual step that the platform has since started auto-attaching.
Comment 1 — Routes elicitation should be explicit and unbounded.
- All three create skills: the Step that generates the proxy POST
(or, for semantic skills, the Step that creates topics) now opens
with a "Routes — elicit these from the user" sub-section. The
prose explicitly says don't assume a fixed number of routes (not
two, not any specific count), don't assume which providers, and
don't assume any particular credential mode. Per route the agent
asks: provider (any subset of the catalog), target model, key mode
(static or DataWeave), label.
- For semantic skills, also calls out that multiple topics CAN map
to the same route (many-to-one is normal — Finance + Investing
both routing to the same OpenAI route, etc.). The validation
checklist now includes "every topic's route_label matches one of
the route labels you defined above".
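The new checklist item for semantic skills amounts to a set-membership check. The sketch below assumes each topic is a dict with a `route_label` field as named above; the `name` field and the helper `unmatched_topics` are hypothetical.

```python
def unmatched_topics(topics: list[dict], route_labels: set[str]) -> list[str]:
    """Names of topics whose route_label is not one of the defined route labels."""
    return [t["name"] for t in topics if t["route_label"] not in route_labels]


# Many-to-one is fine: two topics may share a route.
topics = [
    {"name": "Finance", "route_label": "openai"},
    {"name": "Investing", "route_label": "openai"},
    {"name": "Travel", "route_label": "missing-route"},
]
problems = unmatched_topics(topics, {"openai", "gemini"})
```

Here only `Travel` would be reported, because its label does not match any defined route.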
Comment 2 — Routing policy attach is automatic for advanced too
(verified live on stgx 2026-04-30 — proxy 4696396 has
semantic-routing-policy-openai-qdrant auto-attached).
- Advanced skill: removed the manual policy-attach Step 11
entirely. Renumbered the deployment poll from Step 12 to Step 11.
Updated the overview, the API endpoints table, the URL gotchas
list, the completion checklist, and the troubleshooting prose
to reflect that the platform now auto-attaches the
semantic-routing-policy-<provider>-<vectordb> variant. Net
effect: 12 steps → 11 steps. Skill is shorter and matches
current platform behavior.
Comment 3 — Step 3 should filter by serviceType + ASK the user.
- semantic-basic Step 3: explicit "filter to serviceType: basic
client-side" instruction + an explicit "ASK the user whether to
use one of these or create a new one" prompt. Don't silently
default. The auto-rebind quirk caveat remains.
- semantic-advanced Step 3: same pattern, filter to advanced.
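The filter-then-ask pattern from Comment 3 might look like the following. It assumes the list response has been parsed into dicts carrying a `serviceType` field as named above; the `id` field and the `ssc_choices` helper are hypothetical.

```python
def ssc_choices(sscs: list[dict], service_type: str) -> list[str]:
    """Build the options to show the user; never pick one silently."""
    matching = [s for s in sscs if s.get("serviceType") == service_type]
    options = [f"use existing SSC {s['id']}" for s in matching]
    options.append("create a new SSC")  # always offered, even with matches
    return options
```

The function only prepares the question; the choice itself stays with the user.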
Comment 4 — Don't auto-select the Flex Gateway target.
- All three create skills' Step 4/6/7 (gateway target listing) now
end with "Surface the eligible targets to the user and ASK them
to pick one. Do NOT auto-select even if there's only one
eligible candidate or one with a 'right-looking' name."
Comment 5 — Client app create body shape — be strict, surface the
common failure modes.
- request-llm-proxy-access Step 4: new "Critical — request body
shape" callout. Lists exactly the two-field body shape that
works (verified live 2026-04-30: `{name, description}` returns
201 with clientId + clientSecret), and explicitly enumerates
fields the agent should NOT include (applicationId, clientId,
clientSecret as null inputs; targetApiInstanceId; redirectUri,
homepageUrl) including the specific 400 message that confused
agents in a previous run.
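A strict builder for the client app body could enforce the two-field shape before any request is sent. This is a sketch: treating both `name` and `description` as required is an assumption drawn from the verified `{name, description}` body, and `client_app_body` is a hypothetical helper.

```python
ALLOWED_FIELDS = ("name", "description")


def client_app_body(draft: dict) -> dict:
    """Return the two-field body, raising if the draft carries anything else."""
    extras = sorted(set(draft) - set(ALLOWED_FIELDS))
    if extras:
        # e.g. applicationId, clientId, clientSecret, redirectUri, homepageUrl
        raise ValueError(f"remove fields before POSTing: {extras}")
    missing = [field for field in ALLOWED_FIELDS if not draft.get(field)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return {field: draft[field] for field in ALLOWED_FIELDS}
```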
Validation:
- All 5 JTBD validators PASS (0 warnings, 0 errors each)
- validate-api / validate-xorigin / validate-descriptions clean
- Portal generates clean
Made-with: Cursor
Five subagents tested model-based×{YAML,inline}, semantic-basic×{YAML,inline},
and apply-token-rate-limiting on stgx in parallel. All five completed
end-to-end (cleanup included). Aggregate findings drove the changes below.
Hard skill errors corrected:
- **Model-based routing supports at most ONE route per provider.** The
agent that elicited 3 routes (testing the "you can have 1, 3, 5+
routes" claim) had its third route silently deduped by the server —
upstreams are merged on (provider, uri, label), and the routing
policy keys off the resolved provider, not the route's
x-routing-header value. Skill 1's Step 7 elicitation now spells out
the constraint: max 5 routes (one per supported provider), with an
explicit "reject duplicate provider" guard during elicitation.
- **Upstream `label` MUST equal the provider identifier.** Skill 1
never said this; the agent hit `400 BadRequestError: "Upstream in
route 'X' has provider 'openai' which requires upstream label to be
'openai', but found label 'openai-full'"` on first try. Now stated
explicitly with the exact 400 message included.
- **`endpointUri` is often comma-separated** (`<cloudhub-url>,<custom>`).
All three create skills' "Test the Proxy" sections now warn about
this and document the recovery (split on `,` and pick first, OR fetch
canonical URL from `gateway-targets/{id}.configuration.ingress.publicUrl`).
- **Deploy-failed-with-no-error recovery** (target-side flake — even
with `RUNNING+ready=true`). All three create skills' deployment-poll
Common Issues now document the full-instance PATCH-to-different-target
recovery lever, plus a tip that long polls should tolerate transient
curl errors (`--max-time 20 --retry 3 --retry-delay 2`).
- **apply-token-rate-limiting was missing Cleanup + verification curl.**
Added a Cleanup section (the policy-DELETE pattern), and a
copy-pasteable verification curl with `jq` filter showing the policy
with its applied configuration. Also: corrected the catalog-expansion
magnitudes (~101→629 verified live, was claimed ~130→1200), and
softened the `assetVersion: "1.0.0"` "Latest stable" wording since
`1.0.1` is also published.
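Two of the corrections above lend themselves to a short sketch: the route guard (at most five routes, one per provider, label equal to provider) and the `endpointUri` split. The flat `provider` / `label` route shape and both helper names are assumptions for illustration, not the actual proxy payload.

```python
MAX_ROUTES = 5  # one per supported provider


def route_problems(routes: list[dict]) -> list[str]:
    """List every violation so the agent can report them all at once."""
    problems = []
    if len(routes) > MAX_ROUTES:
        problems.append(f"{len(routes)} routes exceeds the maximum of {MAX_ROUTES}")
    seen = set()
    for route in routes:
        provider, label = route["provider"], route["label"]
        if provider in seen:
            problems.append(f"duplicate provider {provider!r}")
        seen.add(provider)
        if label != provider:
            problems.append(f"label {label!r} must equal provider {provider!r}")
    return problems


def first_endpoint(endpoint_uri: str) -> str:
    """Pick the first URL from a possibly comma-separated endpointUri."""
    return endpoint_uri.split(",")[0].strip()
```

Running the guard during elicitation surfaces the duplicate-provider case before the server silently dedupes it.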
Ambiguities clarified:
- **`approvalMethod` on Skill 1 Step 7** — was hard-coded `value: null`
in the YAML block but the lookup table at the top says it's
YAML-driven. Now consistently YAML-driven with `null` as the default.
- **Auto-rebind, send `null` on every call** — Skill 2 Step 5 now
explicitly says "do NOT propagate topic-1's resolved SSC into
topic-2's request body; send `null` on every prompt-topic POST and
capture the response value each time."
- **`embedding.*` block in YAML when Step 4 is skipped** — Skill 2
Step 5 now flags that the YAML's embedding config is silently
disregarded if the env already has basic SSCs (auto-rebind picks
the env default regardless), so the agent surfaces this to the user.
- **Threshold inclusivity** — both semantic skills now say the
comparison is "strictly greater than" the threshold (a score of
exactly the threshold falls back). Verified: score 0.50 with
threshold 0.50 falls back.
- **`.env` orgId reconciliation** — Skill 1 Step 1 now notes that
`.env` orgIds are commonly stale; trust `listMe` over `.env` and
surface mismatches.
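The threshold rule is easy to get wrong by one comparison operator, so a minimal sketch may help. The function and parameter names are hypothetical; only the strict `>` comparison comes from the verified behavior above.

```python
def pick_route(
    score: float, threshold: float, matched_route: str, fallback_route: str
) -> str:
    """Route to the match only when the score is strictly above the threshold."""
    return matched_route if score > threshold else fallback_route
```

With this rule, `pick_route(0.50, 0.50, "finance", "fallback")` returns `"fallback"`, matching the verified case of score 0.50 against threshold 0.50.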
Validation:
- All 5 JTBD validators PASS (0 warnings, 0 errors)
- validate-api / validate-xorigin / validate-descriptions clean
- Portal regenerates clean
- 5 parallel agent E2E runs on stgx all completed (resources cleaned up)
Made-with: Cursor
Summary
Adds four JTBD skills covering the LLM Gateway workflows plus the private `llm-proxy` API spec (9 operations on `/apimanager/xapi/v1`).
Skills (4)
- `create-llm-proxy-model-based-routing`: route on the `model` field in the request body
- `create-llm-proxy-semantic-routing`
- `apply-token-rate-limiting-to-llm-proxy`: apply the `llm-token-rate-limit` policy (groupId `68ef9520-…`, assetId `llm-token-rate-limit`, version `1.0.0`) to an existing LLM proxy
- `request-llm-proxy-access`: obtain `client_id`/`client_secret` and create a contract for a consumer
API spec
- `apis/llm-proxy/api.yaml`: 9 operations: `listLlmRouteConfigurations`, `listEnvironmentLlmProxies`, `createEnvironmentLlmProxy`, `getGatewayTargetApisByPortAndPath`, `listSemanticServiceConfigs`, `createSemanticServiceConfig`, `getSemanticServiceConfig`, `listGlobalPromptTopicsBySsc`, `createGlobalPromptTopic`
- `apis/llm-proxy/exchange.json`: `groupId: f1e97bc6-…anypoint-platform`, `assetId: llm-proxy-api`, `version: 1.0.0`, `visibility: private`
- `apis/llm-proxy/examples/responses/`: 11 captured example responses (secrets redacted)
Validation
- `make validate-api API=apis/llm-proxy`: basic OAS 0/0, governance 0/0
- `python3 scripts/build/validate_jtbd.py`: all 4 skills PASS individually
- `python3 scripts/build/validate_xorigin.py apis/llm-proxy`: no violations
- `make validate-descriptions`: no violations
- `make generate-portal`: portal renders cleanly; `[oas] LLM Proxy API` + 4 `[agent-skill]` entries in registry
Verification approach
Payload shapes, URLs, and response shapes are grounded in:
- The `api-manager-ui-node-lib` repo. Every URL, payload field, and enum value traces back to a specific file/line (e.g., `AddLlmProxyWizard.js:596-736` for the LLM proxy POST shape, `routeConfigs.js:21-276` for the per-provider field catalog, `constants.js:703-706` for the `approvalMethod` enum).
- A `200` round-trip through the gateway: invoked a deployed LLM proxy with valid credentials; confirmed every `x-llm-proxy-*` response header documented in the skills, plus the 401 / fallback / model-override behaviors.
Test plan
- Run `make validate-api API=apis/llm-proxy` locally and confirm 0 violations
- Run `python3 scripts/build/validate_jtbd.py skills/<skill>/SKILL.md .` for each of the four new skills
- Run `make generate-portal` and verify the LLM Proxy API spec page + four skill HTML pages render without errors
Notes for review
- The `run-agent-scan-and-view-results` skill currently fails `validate-jtbd` (missing YAML frontmatter). That is unrelated to this PR and predates these changes.
- `visibility: private` is consistent with `api-portal-xapi` (the only other private API in the repo). Private APIs render the spec into `portal/apis/<name>/api.yaml` but skip the standalone HTML detail page by design.
Made with Cursor