Skip to content

Added VertexAI support, including GCP ADC Resolution#12

Open
logans-a11y wants to merge 6 commits into
NVIDIA:mainfrom
logans-a11y:vertexai-provider
Open

Added VertexAI support, including GCP ADC Resolution#12
logans-a11y wants to merge 6 commits into
NVIDIA:mainfrom
logans-a11y:vertexai-provider

Conversation

@logans-a11y

Copy link
Copy Markdown

Enables VertexAI Integration for the series of Gemini models. Automatically resolves API Key via google-auth package.

Follows same pattern as every other provider, only difference is resolving from ADC json file, rather than an API key.

@rng1995 rng1995 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Request changes — the provider is cleanly structured and conforms to the ModelMetadataProvider / CredentialsProvider protocols (it mirrors OpenAIProvider and reuses registry.lookup_* correctly), and the ADC-based credential flow is sensible (fail-closed when env is unset, no secret logging, no eval/exec/subprocess). However, there's a functional blocker that means Gemini calls won't actually succeed as written, plus a merge conflict and a couple of smaller items.

Blocking

  1. Model identifier is missing the required google/ prefix. The Vertex AI OpenAI-compatibility endpoint (.../endpoints/openapi/chat/completions) requires the model field to be prefixed with google/ for Gemini models, per Google's OpenAI-compatibility docs (e.g. model="google/gemini-2.5-flash"). Here resolve_model() returns the bare label (gemini-2.5-flash), which flows unchanged into MODEL_CONFIGget_chat_model(model=...)ChatOpenAI(model=...), so requests go out as model: "gemini-2.5-flash" and the endpoint will reject them. The README example (SKILLSPECTOR_MODEL=gemini-2.5-pro) has the same gap.

    • Heads-up on the fix: the resolved model string is also used for token-budget lookups (get_max_output_tokens(model)registry.lookup_*), so simply prepending google/ inside resolve_model would break the registry key match (keys are bare) and silently drop token budgeting. Please decide where the prefix belongs — e.g. apply it only at the ChatOpenAI boundary for this provider, or key the registry on the prefixed names and update the README/.env.example to match — and ideally add a test asserting the outgoing model value.
  2. Merge conflict with main. GitHub reports this branch as conflicting (mergeable_state: dirty) — the shared registry YAMLs / pyproject.toml appear to have diverged. Please rebase and resolve before merge.

Should fix

  1. Lint failures. pyproject.toml selects the W ruleset (only E501 is ignored). provider.py has trailing whitespace on blank lines (W293) and no newline at end of file (W292), so ruff check will fail.
  2. No tests for the new provider. The credential/URL logic is non-trivial (ADC resolution, token refresh, base-URL construction). A unit test mocking google.auth.default() — asserting the constructed base_url, the returned token, the None fall-through when env is unset, and the model id sent to the client — would match the coverage the other providers have in tests/unit/test_providers.py.

Minor / optional

  • Dead branch: after the if not project_id or not location: return None guard, project_id is always truthy, so project = project_id or default_project and the following if not project: raise ValueError(...) are unreachable (and default_project is then unused). Either drop the dead branch, or restructure so GOOGLE_CLOUD_PROJECT can be optional and fall back to the ADC-resolved project.
  • Registry duplication: the same Gemini block is added to the root, openai, and nv_build registries in addition to the new vertexai one. The openai entry is defensible (Gemini via an OpenAI-compatible base URL), but the nv_build provider doesn't serve Gemini, so that entry reads as misleading.
  • Stray double blank line after SLOT_DEFAULTS.

Thanks — the structure is solid and protocol conformance is spot on; mainly the google/ model prefix needs sorting out for this to work end to end.

@logans-a11y

Copy link
Copy Markdown
Author

@rng1995 Thank you for the feedback. Fixes should be applied. Let me know if you want anything further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants