Skip to content

feat(database): add databricks oauth support#41421

Open
rusackas wants to merge 10 commits into
masterfrom
adopt/databricks-oauth
Open

feat(database): add databricks oauth support#41421
rusackas wants to merge 10 commits into
masterfrom
adopt/databricks-oauth

Conversation

@rusackas

Copy link
Copy Markdown
Member

SUMMARY

OAuth 2.0 support for Databricks, across AWS, Azure, and GCP.

Adopts #34619 (original work by @drummerwolli, who stepped away). I re-applied their commits on current master and fixed the failing unit tests. Authorship is preserved via co-author trailers.

The only real fix on top of the original work: get_oauth2_token was overwriting the config's already-resolved token_request_uri with the AWS template that still had an unsubstituted account-id placeholder, so the token exchange POSTed to .../accounts/{}/v1/token. It now only falls back to the AWS endpoint when no URI is configured. No behavioral change to the feature otherwise.

TESTING INSTRUCTIONS

  • set up oauth config in superset_config.py for databricks as described in the newly added docs
  • install databricks dependencies
  • run superset
  • set up a DB connection to Databricks
  • run a query with that connection and notice the OAuth login screen

Unit tests: pytest tests/unit_tests/db_engine_specs/test_databricks.py tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py

ADDITIONAL INFORMATION

  • Introduces new feature or API

drummerwolli and others added 5 commits June 25, 2026 02:13
get_oauth2_token clobbered the config's already-resolved token_request_uri
with the AWS template that still contained an unsubstituted account-id
placeholder, so the token exchange POSTed to .../accounts/{}/v1/token. Only
fall back to the AWS endpoint when no token_request_uri is configured.

Co-authored-by: fabian_zse <fabian@zalando.de>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dosubot dosubot Bot added authentication:sso Single Sign On data:connect:databricks Databricks DB Experience enhancement:db Suggest new DB connections labels Jun 25, 2026
@github-actions github-actions Bot added the doc Namespace | Anything related to documentation label Jun 25, 2026
@netlify

netlify Bot commented Jun 25, 2026

Copy link
Copy Markdown

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 2561312
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a3d76f30bfec90008bcf0bf
😎 Deploy Preview https://deploy-preview-41421--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

Comment thread tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py Outdated
Comment thread tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py Outdated
Comment thread tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py Outdated
Comment thread tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py Outdated
Comment thread tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py Outdated
@codecov

codecov Bot commented Jun 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 34.66667% with 49 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.38%. Comparing base (fe2424e) to head (2561312).
⚠️ Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
superset/db_engine_specs/databricks.py 34.66% 49 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #41421      +/-   ##
==========================================
- Coverage   64.44%   64.38%   -0.06%     
==========================================
  Files        2655     2656       +1     
  Lines      145491   145741     +250     
  Branches    33585    33626      +41     
==========================================
+ Hits        93762    93842      +80     
- Misses      50028    50195     +167     
- Partials     1701     1704       +3     
Flag Coverage Δ
hive 39.22% <34.66%> (-0.03%) ⬇️
mysql 57.91% <34.66%> (-0.09%) ⬇️
postgres 57.97% <34.66%> (-0.10%) ⬇️
presto 40.79% <34.66%> (-0.03%) ⬇️
python 59.40% <34.66%> (-0.10%) ⬇️
sqlite 57.63% <34.66%> (-0.09%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread superset/db_engine_specs/databricks.py
Comment thread superset/db_engine_specs/databricks.py Outdated
Comment thread superset/db_engine_specs/databricks.py
Comment thread superset/db_engine_specs/databricks.py Outdated
Comment thread superset/db_engine_specs/databricks.py

@bito-code-review bito-code-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #0cb4e7

Actionable Suggestions - 1
  • docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx - 1
    • Missing account_id placeholder substitution · Line 545-548
Additional Suggestions - 5
  • docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx - 1
    • Azure token_request_uri placeholder not substituted · Line 547-548
      The Azure entry in `_oauth2_endpoints` (`databricks.py:293-294`) has a `{}` placeholder in `token_request_uri` (`https://login.microsoftonline.com/{}/oauth2/v2.0/token`). Unlike `authorization_request_uri` where the code in `get_oauth2_authorization_uri` can override it, the `token_request_uri` is set in `get_oauth2_config` (base.py:745-747) from the class attribute which is empty (""), then never updated. Additionally, `get_oauth2_token` (databricks.py:605-614) only provides a fallback for missing token_request_uri using the AWS endpoint — it never extracts a per-provider tenant_id. The documentation at line 547 only flags `authorization_request_uri` as needing override, omitting `token_request_uri`, so users will configure and still fail silently.
  • tests/unit_tests/db_engine_specs/test_databricks.py - 2
    • Duplicated fixture definitions · Line 418-432
      The `oauth2_config_python` fixture (lines 418-432) is functionally identical to `oauth2_config_native` (lines 402-416) — both return the same dictionary. This duplication adds maintenance burden and divergence risk if one needs to change.
    • Semantic test duplication · Line 667-701
      This test is semantically duplicated from `test_get_oauth2_fresh_token_native` (lines 630-664). Both tests use identical fixtures (the `request_content_type: "json"` configs are structurally identical), call the same inherited method, and assert identical expectations. Since neither `DatabricksNativeEngineSpec` nor `DatabricksPythonConnectorEngineSpec` overrides `get_oauth2_fresh_token()`, testing them separately provides no additional coverage. The duplication creates maintenance risk — any future change must be applied twice, risking divergence.
  • superset/db_engine_specs/databricks.py - 2
    • Duplicate import across diff · Line 575-575
      The `from typing import cast` import appears 4 times (lines 575, 603, 856, 884) — twice in each child class. Consolidating the methods into the parent class will eliminate 3 of these. The remaining one should be moved to module-level imports.
    • Docstring Missing Accepted Values · Line 310-326
      The `_detect_cloud_provider` method accepts 'cloud_provider' from extra configuration but only validates against known providers after lowercasing. Consider documenting accepted values and what happens with invalid values.
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/db_engine_specs/databricks.py - 2
  • tests/unit_tests/db_engine_specs/test_databricks.py - 1
Review Details
  • Files reviewed - 4 · Commit Range: dafe036..47b7e01
    • docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx
    • superset/db_engine_specs/databricks.py
    • tests/unit_tests/db_engine_specs/test_databricks.py
    • tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx Outdated
…uth fixtures

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bito-code-review

bito-code-review Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #71d2ac

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 47b7e01..d84b5db
    • tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

…figured URIs

The per-cloud OAuth2 endpoint templates carry a `{}` placeholder for the
Databricks account id (or Azure tenant id) that was never substituted, so
auto-detected authorize/token URLs were emitted as `.../accounts/{}/v1/...`.
The authorization-URI methods also unconditionally overwrote a fully-resolved
`authorization_request_uri` supplied via DATABASE_OAUTH2_CLIENTS.

- Add `_resolve_oauth2_endpoint`: substitutes `account_id`/`tenant_id` from the
  database extra into the template, or raises OAuth2Error when absent instead of
  issuing a request to an unresolved endpoint.
- Preserve a configured `authorization_request_uri`; only auto-detect/resolve
  when none is set.
- `get_oauth2_token` has no database context to auto-detect, so fail fast on a
  missing `token_request_uri` rather than POST to `.../{}/v1/token`.
- Cover auto-detect/resolve, preserve-configured, and fail-fast paths for both
  the native and Python-connector specs; document `account_id`/`tenant_id`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread tests/unit_tests/db_engine_specs/test_databricks.py
Comment thread superset/db_engine_specs/databricks.py
Comment thread superset/db_engine_specs/databricks.py
rusackas and others added 3 commits June 25, 2026 11:39
The multi-cloud OAuth2 URI tests passed a config with a fully-resolved
authorization_request_uri, which the engine spec now preserves. Drop the
URI for the Azure/GCP detection cases (and give those mock databases an
account_id/tenant_id) so the per-provider endpoint is actually resolved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The authorization endpoint auto-resolves from the hostname, but the token
exchange has no database context, so token_request_uri must be supplied for
the auto-detected flow. Docs implied both endpoints auto-detect.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@bito-code-review

bito-code-review Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #909191

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/db_engine_specs/databricks.py - 1
Review Details
  • Files reviewed - 4 · Commit Range: d84b5db..2561312
    • docs/user_docs_versioned_docs/version-6.0.0/configuration/databases.mdx
    • superset/db_engine_specs/databricks.py
    • tests/unit_tests/db_engine_specs/test_databricks.py
    • tests/unit_tests/db_engine_specs/test_databricks_multi_cloud.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authentication:sso Single Sign On data:connect:databricks Databricks DB Experience doc Namespace | Anything related to documentation enhancement:db Suggest new DB connections size/XXL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants