Added VertexAI support, including GCP ADC Resolution#12
Conversation
rng1995
left a comment
There was a problem hiding this comment.
Request changes — the provider is cleanly structured and conforms to the ModelMetadataProvider / CredentialsProvider protocols (it mirrors OpenAIProvider and reuses registry.lookup_* correctly), and the ADC-based credential flow is sensible (fail-closed when env is unset, no secret logging, no eval/exec/subprocess). However, there's a functional blocker that means Gemini calls won't actually succeed as written, plus a merge conflict and a couple of smaller items.
Blocking
-
Model identifier is missing the required
google/prefix. The Vertex AI OpenAI-compatibility endpoint (.../endpoints/openapi/chat/completions) requires themodelfield to be prefixed withgoogle/for Gemini models, per Google's OpenAI-compatibility docs (e.g.model="google/gemini-2.5-flash"). Hereresolve_model()returns the bare label (gemini-2.5-flash), which flows unchanged intoMODEL_CONFIG→get_chat_model(model=...)→ChatOpenAI(model=...), so requests go out asmodel: "gemini-2.5-flash"and the endpoint will reject them. The README example (SKILLSPECTOR_MODEL=gemini-2.5-pro) has the same gap.- Heads-up on the fix: the resolved model string is also used for token-budget lookups (
get_max_output_tokens(model)→registry.lookup_*), so simply prependinggoogle/insideresolve_modelwould break the registry key match (keys are bare) and silently drop token budgeting. Please decide where the prefix belongs — e.g. apply it only at theChatOpenAIboundary for this provider, or key the registry on the prefixed names and update the README/.env.exampleto match — and ideally add a test asserting the outgoingmodelvalue.
- Heads-up on the fix: the resolved model string is also used for token-budget lookups (
-
Merge conflict with
main. GitHub reports this branch as conflicting (mergeable_state: dirty) — the shared registry YAMLs /pyproject.tomlappear to have diverged. Please rebase and resolve before merge.
Should fix
- Lint failures.
pyproject.tomlselects theWruleset (onlyE501is ignored).provider.pyhas trailing whitespace on blank lines (W293) and no newline at end of file (W292), soruff checkwill fail. - No tests for the new provider. The credential/URL logic is non-trivial (ADC resolution, token refresh, base-URL construction). A unit test mocking
google.auth.default()— asserting the constructedbase_url, the returned token, theNonefall-through when env is unset, and the model id sent to the client — would match the coverage the other providers have intests/unit/test_providers.py.
Minor / optional
- Dead branch: after the
if not project_id or not location: return Noneguard,project_idis always truthy, soproject = project_id or default_projectand the followingif not project: raise ValueError(...)are unreachable (anddefault_projectis then unused). Either drop the dead branch, or restructure soGOOGLE_CLOUD_PROJECTcan be optional and fall back to the ADC-resolved project. - Registry duplication: the same Gemini block is added to the root,
openai, andnv_buildregistries in addition to the newvertexaione. Theopenaientry is defensible (Gemini via an OpenAI-compatible base URL), but thenv_buildprovider doesn't serve Gemini, so that entry reads as misleading. - Stray double blank line after
SLOT_DEFAULTS.
Thanks — the structure is solid and protocol conformance is spot on; mainly the google/ model prefix needs sorting out for this to work end to end.
|
@rng1995 Thank you for the feedback. Fixes should be applied. Let me know if you want anything further. |
Enables VertexAI Integration for the series of Gemini models. Automatically resolves API Key via google-auth package.
Follows same pattern as every other provider, only difference is resolving from ADC json file, rather than an API key.