
ref(options): migrate runtime config to sentry-options (Python RPC/query + Rust killswitches) #8096

Draft
phacops wants to merge 35 commits into master from claude/upbeat-newton-4njhnb


@phacops phacops commented Jun 23, 2026

Contributor

What

Migrates a batch of Redis-backed runtime config keys to sentry-options, and wires up the Python sentry-options client so Python code can read the same snuba namespace the Rust consumers already use (via the sentry-options crate, e.g. blq_router.rs). Values stay managed centrally in sentry-options-automator; Snuba reads them read-only.

Three commits:

  1. Introduce the Python client + migrate enable_any_attribute_filter.
  2. Migrate 12 more read-only Python RPC/query keys.
  3. Migrate 2 static Rust consumer killswitches.

Infrastructure

  • Dependency: sentry-options>=1.1.1 (pyproject.toml, uv.lock) — the Python binding of the same client the Rust consumers use.
  • Wrapper snuba/state/sentry_options.py:
    • init_options() — idempotent, never raises (a missing/misconfigured options mount must not break startup); called from setup_sentry(), the chokepoint every entrypoint and pytest_configure already hits.
    • get_option(key, default) and typed get_bool_option / get_int_option / get_float_option / get_str_option(key, default) — return the configured value (or schema default), falling back to the caller's default on any OptionsError. Mirrors the Rust .ok()...unwrap_or(default) semantics so call sites behave exactly as before.
  • conftest.py points SENTRY_OPTIONS_DIR at the in-repo schema so init() is cwd-independent.

Keys migrated

Python (RPC / query path) — schema type / default match the prior get_*_config default, so behavior is unchanged until a value is set in automator:

| key | type | default |
| --- | --- | --- |
| enable_any_attribute_filter | boolean | true |
| aggregation_deprecation_enabled | boolean | true |
| enable_trace_pagination | boolean | true |
| use.low.cardinality.processor | boolean | true |
| cross_item_queries_no_sample_outer | boolean | true |
| default_tier | integer | 1 |
| export_trace_items_default_page_size | integer | 10000 |
| use_sampling_factor_timestamp_seconds | integer | 1744131600 |
| ExecutionStage.max_query_size_bytes | integer | 131072 |
| EndpointGetTrace.apply_final_rollout_percentage | number | 0.0 |
| rpc_logging_sample_rate | number | 0.0 |
| rpc_logging_flush_logs | number | 0.0 |
| ExecutionStage.disable_max_query_size_check_for_clusters | string | "" |

Rust consumers (previously read via runtime_config::get_str_config, i.e. a PyO3 round-trip into Python/Redis — now read natively, no PyO3):

| key | type | default |
| --- | --- | --- |
| eap_items_drop_invalid_timestamps | boolean | false |
| experimental_healthcheck | boolean | false |

Tests that toggled these via state.set_config(...) / patch_str_config_for_test(...) now use sentry_options.testing.override_options(...) (Python) / sentry_options::testing::override_options(...) (Rust).

Deliberately NOT migrated

sentry-options requires every key to be declared in a static schema, so dynamically-named / parameterized keys can't move, and the runtime-config management plane must stay:

  • Dynamic / parameterized keys (Python & Rust): per-storage / per-policy / per-topic / per-consumer-group keys — e.g. clickhouse_load_balancing:<storage>, clickhouse_max_insert_block_size:<storage>, eap_items_dlq_grace_period_min:<storage>, quantized_rebalance_consumer_group_delay_secs__<group>, validate_schema_<topic>, allocation-policy configs, rate-limiter buckets.
  • The runtime-config management plane: the admin UI / CLI / snuba.state CRUD and audit log — sentry-options has no in-Snuba write path, and it manages the keys that can't migrate.
  • List/JSON-valued and string-parsed keys (e.g. CSV allowlists/denylists, generic_metrics_use_case_killswitch) — these don't map cleanly to a scalar option and are left for a follow-up.
  • enable_long_term_retention_downsampling — migratable, but its test refactor needs care; deferred to a follow-up.

Operational note

After cutover these keys are read-only from Snuba and edited in sentry-options-automator (not the admin UI). For each, set the value in automator to match any current production override before this lands; effective defaults are otherwise unchanged. Several are operational killswitches — flagging so reviewers can veto moving any specific one off the live admin toggle. Deploy-infra detail to confirm: the options ConfigMap is mounted for the Python web/RPC pods (the Rust consumer pods already have it).

Verification

  • pytest tests/state/test_sentry_options.py → 6 passed; mypy (strict) clean on changed source; ruff check + ruff format --check clean.
  • Rust: cargo check clean; cargo test healthcheck (2) + utils (4) pass. The runtime_config PyO3 tests fail only in this sandbox (no Python bootstrap) — they're unrelated to these changes and run in CI.
  • The RPC/endpoint integration tests need ClickHouse and were not run locally; CI covers them.

🤖 Generated with Claude Code



Adds the sentry-options Python client and uses it for the
`enable_any_attribute_filter` flag, which previously lived in
Redis-backed runtime config (`state.get_int_config`). This mirrors how
the Rust consumers already read the `snuba` options namespace.

- Add `sentry-options>=1.1.1` dependency (uv.lock updated)
- Declare `enable_any_attribute_filter` (boolean, default true) in the
  snuba sentry-options schema
- Add `snuba/state/sentry_options.py` wrapping `init()` /
  `options("snuba").get()` with a safe fallback to each call site's
  default; initialized from `setup_sentry()`
- Swap the RPC call site to `get_option(...)`, preserving the
  default-on kill-switch semantics
- Add unit + integration tests; point conftest at the in-repo schema
  via SENTRY_OPTIONS_DIR so init() is cwd-independent

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
@phacops phacops requested review from a team as code owners June 23, 2026 22:52
claude added 2 commits June 23, 2026 23:40
…ptions

Continues the migration of Redis-backed runtime config to sentry-options
(the Python counterpart to how the Rust consumers already read the `snuba`
namespace). Migrates 12 read-only feature flags / tuning knobs that have a
single source of truth and are safe to manage centrally:

  boolean: aggregation_deprecation_enabled, enable_trace_pagination,
           use.low.cardinality.processor, cross_item_queries_no_sample_outer
  integer: default_tier, export_trace_items_default_page_size,
           use_sampling_factor_timestamp_seconds,
           ExecutionStage.max_query_size_bytes
  number:  EndpointGetTrace.apply_final_rollout_percentage,
           rpc_logging_sample_rate, rpc_logging_flush_logs
  string:  ExecutionStage.disable_max_query_size_check_for_clusters

- Add typed accessors get_bool_option/get_int_option/get_float_option/
  get_str_option to snuba/state/sentry_options.py (mirroring get_int_config
  & friends) so call sites stay typed under strict mypy. Each falls back to
  the call site default if sentry-options is unavailable, matching the Rust
  `.ok()...unwrap_or(default)` semantics.
- Declare each key in the snuba sentry-options schema with the type and
  default matching the previous runtime-config default (behavior-preserving).
- Swap each call site from state.get_*_config(...) to the typed accessor.
- Update tests that toggled these via state.set_config(...) to use
  sentry_options.testing.override_options(...) instead.

Schema defaults match prior get_*_config defaults, so behavior is unchanged
until a value is set in sentry-options-automator.
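
A minimal sketch of what typed accessors of this shape might look like. The real helpers live in snuba/state/sentry_options.py and may differ; here the raw read is replaced by a hypothetical in-memory `_VALUES` dict, and the coercion rules (in particular the strict boolean check) are assumptions rather than the PR's implementation.

```python
from typing import Any, Callable, Dict, TypeVar

T = TypeVar("T")

# Hypothetical stand-in for values served by the options client.
_VALUES: Dict[str, Any] = {
    "default_tier": 1,
    "rpc_logging_sample_rate": 0.25,
    "enable_trace_pagination": False,
    "export_trace_items_default_page_size": "not-a-number",  # deliberately malformed
}


def get_option(key: str, default: Any) -> Any:
    """Stand-in raw read: configured value if present, else the caller's default."""
    return _VALUES.get(key, default)


def _as_bool(value: Any) -> bool:
    # Assumption: only real booleans/ints are accepted, so "false" is never truthy.
    if isinstance(value, (bool, int)):
        return bool(value)
    raise TypeError(f"not a boolean: {value!r}")


def _typed(key: str, default: T, coerce: Callable[[Any], T]) -> T:
    """Read a key and coerce it; fall back to the call-site default if coercion fails."""
    try:
        return coerce(get_option(key, default))
    except (TypeError, ValueError):
        return default


def get_bool_option(key: str, default: bool) -> bool:
    return _typed(key, default, _as_bool)


def get_int_option(key: str, default: int) -> int:
    return _typed(key, default, int)


def get_float_option(key: str, default: float) -> float:
    return _typed(key, default, float)


def get_str_option(key: str, default: str) -> str:
    return _typed(key, default, str)
```

Because each accessor has a concrete return type, a call site such as `get_int_option("default_tier", 1)` type-checks under strict mypy without a cast.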

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…options

The Rust consumers read some config via `runtime_config::get_str_config`,
which calls back into Python `snuba.state` (Redis) over PyO3. Migrate the
two static, non-parameterized boolean killswitches to sentry-options
instead, matching how the Rust consumers already read the `snuba` namespace
(see blq_router.rs). This also removes their PyO3/Redis round-trip.

- eap_items_drop_invalid_timestamps (utils.rs): drop messages with event
  timestamps >1 week future / >30 days past.
- experimental_healthcheck (healthcheck.rs): treat commit-request progress
  as healthy.

Both are declared in the snuba sentry-options schema (boolean, default
false) and read via `options("snuba").get(key).as_bool()` with a
fallback to false, identical to the existing BLQ pattern. Healthcheck
tests now use `sentry_options::testing::override_options` instead of
`runtime_config::patch_str_config_for_test`.

Not migrated (left on runtime config): the per-storage / per-consumer-group
parameterized keys (clickhouse_load_balancing:<storage>,
clickhouse_max_insert_block_size:<storage>,
eap_items_dlq_grace_period_min:<storage>,
quantized_rebalance_consumer_group_delay_secs__<group>) and the
string-valued generic_metrics_use_case_killswitch — dynamic keys cannot be
declared in a static sentry-options schema.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
@phacops phacops changed the title from "ref(options): migrate enable_any_attribute_filter to sentry-options" to "ref(options): migrate runtime config to sentry-options (Python RPC/query + Rust killswitches)" Jun 23, 2026
Migrates the read-only query-cache feature flags read in web/db_query.py:
enable_cache_partitioning (bool, default true), randomize_query_id (bool,
default false), retry_duplicate_query_id (bool, default false), and
enable_bypass_cache_referrers (bool, default false). Swaps the call sites to
get_bool_option and converts the one test toggle to override_options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Comment thread snuba/state/sentry_options.py
claude added 3 commits June 24, 2026 00:38
Aligns get_option with its docstring's "any reason" fallback contract.
NotInitializedError/SchemaError/UnknownNamespaceError/UnknownOptionError all
subclass OptionsError and were already handled, but a non-OptionsError escaping
the client would have propagated into hot query paths. Catch and log those,
returning the call-site default, and add a regression test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Migrates seven read-only operational knobs to sentry-options:
- debug_buffer_size_bytes, http_batch_join_timeout (clickhouse/http.py)
- project_quota_time_percentage, counter_window_size_minutes,
  allows_skipping_single_project_replacements (utils/bucket_timer.py)
- use_sentry_metrics (utils/metrics/backends/dualwrite.py)
- ondemand_profiler_hostnames (utils/profiler.py)

None has a test toggle. debug_buffer_size_bytes maps to integer default 0
because the downstream check is `size < (value or 0)`, so None and 0 were
already equivalent; the redundant isinstance assert is dropped.

simultaneous_queries_sleep_seconds (read at two sites with different defaults)
and optimize_parallel_threads (caller-supplied default) are intentionally left
on runtime config: a single schema default cannot preserve their semantics.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Migrates seven read-only flags and converts their test toggles to
override_options (using the context-manager-as-decorator form):
- throw_on_uniq_select_and_having (uniq_in_select_and_having)
- function-validator.enabled (query/validation/functions)
- mandatory_condition_enforce (conditions_enforcer)
- eap.reject_string_timestamp_filters (time_series_request_visitor)
- trace_ids_cross_item_query_limit (cross_item_queries)
- storage_routing.enable_get_cluster_loadinfo (storage_routing)
- max_spans_per_transaction (transactions_processor)

The max_spans_per_transaction try/except + isinstance assert is dropped since
get_int_option already coerces and falls back; mandatory_condition_enforce and
eap.reject_string_timestamp_filters become real booleans.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Comment thread snuba/clickhouse/http.py Outdated
claude added 6 commits June 24, 2026 01:06
…ptions

Migrates six read-only flags and converts their test toggles to
override_options (autouse fixtures become override_options yield-fixtures;
function-scoped toggles use the decorator form):
- admin.querylog_threads (admin/clickhouse/querylog.py)
- enable_eap_readonly_table (storage_selectors/eap_items.py)
- enable_events_readonly_table (storage_selectors/errors.py)
- use_cross_item_path_for_single_item_queries (endpoint_get_traces.py)
- executor_queue_size_factor (subscriptions/executor_consumer.py)
- snuba_api_cogs_probability (querylog/__init__.py)

admin.querylog_threads now reads via get_int_option, which always returns a
valid int, so the BadThreadsValue path (and its now-unreachable test) is
removed. Also fixes two pre-existing E712 lint errors in touched files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Revert the http_batch_join_timeout migration flagged by review. Its default is
settings.BATCH_JOIN_TIMEOUT = int(os.environ.get("BATCH_JOIN_TIMEOUT", 10)),
an env-var-derived value, not a constant. sentry-options returns the schema
default when an option is unset, so deployments that raised the timeout only
via the BATCH_JOIN_TIMEOUT env var would have silently dropped back to 10.
Same class of issue as optimize_parallel_threads/simultaneous_queries_sleep_seconds,
which were never migrated for the same reason. debug_buffer_size_bytes (constant
default) stays on sentry-options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Migrates cache_expiry_sec (int) and read_through_cache.short_circuit (bool) in
state/cache/redis/backend.py. Converts the short_circuit test toggles to
override_options across four test files: decorator form for function/method
toggles, and a class-level autouse override_options yield-fixture in
test_max_rows_enforcer where the flag was set in a shared _insert_event helper
and had to persist for the whole test. Also fixes two pre-existing E712 lint
errors and one latent mypy attr-defined error surfaced by touching these files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Migrates nine read-only deletion knobs and converts their test toggles to
override_options (decorators, a context-manager helper for the off-peak window
tests, and with-blocks where a flag flips mid-test):
- lw_deletions_offpeak_enabled/start/end (lw_deletions/off_peak.py)
- org_ids_delete_allowlist, max_parts_mutating_for_delete (lw_deletions/strategy.py)
- permit_delete_by_attribute (web/bulk_delete_query.py)
- MAX_ONGOING_MUTATIONS_FOR_DELETE, storage_deletes_enabled,
  enforce_max_rows_to_delete (web/delete_query.py)

settings.MAX_ONGOING_MUTATIONS_FOR_DELETE (5) and MAX_PARTS_MUTATING_FOR_DELETE
(20) are constants, so the schema defaults match. lightweight_deletes_sync is
intentionally left on runtime config: it uses `is not None` to decide whether to
set the ClickHouse setting at all, which a typed scalar option cannot express.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…tions

Migrates six more read-only keys and converts their test toggles to
override_options (decorators, with-blocks for parametrized values, and a context
helper):
- ignore_clickhouse_settings_override (clickhouse_settings_override.py)
- enable_long_term_retention_downsampling (routing_strategies/outcomes_based.py)
- storage_routing_config_override, default_storage_routing_config
  (routing_strategy_selector.py) — JSON-blob configs kept as string options
- subscription_primary_task_builder (subscriptions/scheduler.py) — stored as the
  TaskBuilderMode value string, schema default "jittered"
- consistent_override (request/validation.py) — the None/str tri-state becomes a
  string option where empty means "no override"

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Migrates the remaining replacement read-only knobs and converts their test
toggles to override_options (decorators for per-test values, with-blocks where a
value flips mid-test or has a pre-read):
- skip_final_subscriptions_projects, post_replacement_consistency_projects_denylist,
  max_group_ids_exclude (query/processors/physical/replaced_groups.py)
- max_group_ids_exclude (replacers/projects_query_flags.py) — same key, both sites
- skip_seen_offsets, consumer_groups_to_reset_offset_check (replacer.py)
- write_node_replacements_global, replacements_bypass_projects
  (replacers/errors_replacer.py)

settings.REPLACER_MAX_GROUP_IDS_TO_EXCLUDE (256) is a constant so the schema
default matches. Bracketed-list string configs keep their "[]" string form.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Comment thread snuba/replacer.py
…ptions (rust)

Reads the generic-metrics use-case killswitch from sentry-options instead of
Redis runtime config, matching the other Rust consumer killswitches. The string
is substring-matched against the message use_case_id. should_use_killswitch now
takes Option<String> (sentry-options reads yield an Option, no Result wrapper);
its unit tests are updated accordingly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Comment thread snuba/web/rpc/common/common.py Outdated
Switches the enable_any_attribute_filter read from the raw get_option to the
typed get_bool_option, matching every other boolean key in the migration. The
schema already enforces a boolean so behavior is unchanged, but this keeps the
call sites consistent and adds the same defensive coercion as the rest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 7659d7a.

Comment thread rust_snuba/src/processors/generic_metrics.rs
claude added 7 commits June 24, 2026 23:32
…n-4njhnb

# Conflicts:
#	pyproject.toml
#	uv.lock
Some runtime-config keys were named dynamically — one Redis key per storage,
topic, dataset, or bucket (f"{prefix}_{name}") — which a static sentry-options
schema cannot enumerate. Collapse each family into a single object option (a
dict declared with additionalProperties, defaulting to {}) keyed by the dynamic
name, and read one entry via new get_mapped_{int,float,str}_option helpers.

Migrates the five remaining dynamic-name keys:
- lw_deletes_killswitch_<storage>            -> lw_deletes_killswitch (dict[str,str])
- lw_deletes_split_by_partition_<storage>    -> lw_deletes_split_by_partition (dict[str,int])
- validate_schema_<topic>                    -> validate_schema_sample_rate (dict[str,number])
- <dataset>_ignore_consistent_queries_..rate -> ignore_consistent_queries_sample_rate (dict[str,number])
- mem_rate_limit_per_sec_<bucket>            -> mem_rate_limit_per_sec (dict[str,number])

An absent entry falls back to the call-site default, preserving the previous
per-key default. Test toggles converted from set_config to override_options
with the dict value.
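
The dict-option pattern can be illustrated with a small sketch. The `(option, entry, default)` signature is an assumption about the new helpers, `_OPTIONS` is a hypothetical in-memory stand-in for the client, and the storage and topic names are purely illustrative.

```python
from typing import Any, Dict, Mapping

# Hypothetical stand-in for the options store: one object option per key family.
_OPTIONS: Dict[str, Any] = {
    "lw_deletes_split_by_partition": {"eap_items": 1},
    "validate_schema_sample_rate": {"events": 0.5},
}


def _get_mapping(option: str) -> Mapping[str, Any]:
    """Return the dict stored under option, or {} (the schema default) if absent."""
    value = _OPTIONS.get(option, {})
    return value if isinstance(value, Mapping) else {}


def get_mapped_int_option(option: str, entry: str, default: int) -> int:
    """Read one entry of a dict option; an absent or malformed entry yields the default."""
    raw = _get_mapping(option).get(entry)
    if raw is None:
        return default
    try:
        return int(raw)
    except (TypeError, ValueError):
        return default


def get_mapped_float_option(option: str, entry: str, default: float) -> float:
    raw = _get_mapping(option).get(entry)
    if raw is None:
        return default
    try:
        return float(raw)
    except (TypeError, ValueError):
        return default
```

A read that used to build the key `lw_deletes_split_by_partition_<storage>` becomes `get_mapped_int_option("lw_deletes_split_by_partition", storage, 0)`, so the schema only has to declare the family name.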

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…options migration

This migration touches files whose latent type errors were never surfaced
because mypy.ini excludes tests/datasets/ and tests/query/, while pre-commit
passes changed files to mypy explicitly (which bypasses that exclude). Fix the
errors properly rather than masking them:

- test_errors_replacer.py: narrow process_message() results (assert non-None),
  narrow Replacement to the errors_replacer subclass that defines
  get_query_time_flags/get_project_id, route re.sub through a helper that
  asserts the query string is non-None, annotate args/parametrize.
- test_transaction_processor.py: correct serialize()/build_result() return
  types to the concrete dicts they return; isinstance-narrow processed messages.
- test_replaced_groups.py: pass ReplacerState.<X>.value (the str the
  constructor expects) instead of the enum member.
- test_db_query.py: narrow excinfo.value to QueryException before .extra/__cause__.
- test_uniq_in_select_and_having.py: alias param is Optional[str].
- consumer.py / query_execution.py: type-only casts for confluent_kafka
  produce() args and a QueryExtraData TypedDict field.

All changes are type-only / behavior-preserving. mypy is clean on the full
changed set and ruff passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
The test conftest used os.environ.setdefault to point sentry-options at the
in-repo schemas, but the Docker test image sets ENV
SENTRY_OPTIONS_DIR=/etc/sentry-options (the production values mount, which
ships no schemas). setdefault is therefore a no-op in the container, so
sentry_options.init() raised SchemaError ("Failed to read file"), init_options()
swallowed it, and every override_options-based test failed with
NotInitializedError. Force-assign the in-repo path so init() reads the
committed schema.
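
The difference between the two assignments is easy to reproduce. The sketch below works on a plain dict instead of `os.environ` so it has no side effects; `point_at_schema` and the `/repo/sentry-options` path are hypothetical.

```python
from typing import Dict


def point_at_schema(environ: Dict[str, str], schema_dir: str, force: bool) -> str:
    """Set SENTRY_OPTIONS_DIR and return the value that ends up in effect."""
    if force:
        # Plain assignment always wins, even if the image pre-set the variable.
        environ["SENTRY_OPTIONS_DIR"] = schema_dir
    else:
        # setdefault only writes when the key is absent: a no-op in the container.
        environ.setdefault("SENTRY_OPTIONS_DIR", schema_dir)
    return environ["SENTRY_OPTIONS_DIR"]


# Environment as the Docker test image provides it.
container_env = {"SENTRY_OPTIONS_DIR": "/etc/sentry-options"}

with_setdefault = point_at_schema(dict(container_env), "/repo/sentry-options", force=False)
with_assignment = point_at_schema(dict(container_env), "/repo/sentry-options", force=True)
```

Here `with_setdefault` still points at the production mount, while `with_assignment` points at the in-repo schema directory.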

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…acer test

test_query_time_flags_bounded_size patched
settings.REPLACER_MAX_GROUP_IDS_TO_EXCLUDE to bound the excluded-group set, but
the production read was migrated to
get_int_option("max_group_ids_exclude", settings.REPLACER_MAX_GROUP_IDS_TO_EXCLUDE).
Once sentry-options initializes, that returns the schema default (256), ignoring
the patched settings fallback, so no bounding occurred and the test saw all 10
group ids instead of the most-recent 5. Override the sentry-option instead.

(This surfaced only after the conftest SENTRY_OPTIONS_DIR fix let init() succeed;
previously every override_options test died at NotInitializedError first.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…sentry-options

This dynamic-name key (slicing_mega_cluster_partitions_<storage_set>) was missed
in the first dynamic-options pass because its key is built into a local variable
(key = f"{PREFIX}_{storage_set.value}") rather than passed as an f-string
literal directly to get_config. Migrate it to a dict option keyed by storage-set
name (value is the bracketed logical-partition list), via get_mapped_str_option;
also drops a redundant get_config call. Test toggles converted to
override_options (which also removes a latent bug in the old delete_config key).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
claude added 3 commits June 25, 2026 01:21
…ector tests

test_strategy_selector.py set default_storage_routing_config and
storage_routing_config_override through runtime state.set_config (via the
imported key constants), but RoutingStrategySelector reads them with
get_str_option. Once sentry-options initializes, those reads return the schema
default ("{}") and ignore the runtime config, so the configured routing was
never exercised (test_valid_config_is_parsed_correctly failed in CI; two
"expects default" tests passed only by coincidence). Convert all 11 set_config
sites to override_options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…-options

The MappingOptimizer reads a per-storage killswitch whose key name comes from
the storage YAML (self.__killswitch). The distinct names are a fixed set of four
static keys, so migrate them as boolean options (default true, matching the old
get_config(..., 1) "enabled unless explicitly disabled" behavior):
  - tags_hash_map_enabled
  - generic_metrics/tags_hash_map_enabled
  - events_tags_hash_map_enabled
  - events_flags_hash_map_enabled
Switch the read to get_bool_option(self.__killswitch, True); convert the one test
toggle to override_options. (Confirmed sentry-options accepts the '/' in the
generic_metrics key.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…ions

Migrate the dynamic-by-storage runtime_config reads in the Rust consumers to
sentry-options dict options (object, additionalProperties keyed by storage
name), read via options("snuba") instead of the Redis runtime-config bridge:

- clickhouse_load_balancing            (string dict, default "in_order")
- clickhouse_load_balancing_first_offset (string dict; read in the same
  get_load_balancing_config(), migrated together to avoid splitting its reads)
- clickhouse_max_insert_block_size     (integer dict; <1048449 still ignored)
- eap_items_dlq_grace_period_min       (integer dict)

Test toggles in runtime_config and writer_v2 converted from
patch_str_config_for_test to sentry_options::testing::override_options + a
once-init of the embedded SNUBA_SCHEMA (matching the blq_router/healthcheck
pattern). cargo check/test/fmt all pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Comment thread rust_snuba/src/processors/eap_items.rs
claude added 6 commits June 25, 2026 02:17
_get_query_settings_from_config read ClickHouse query settings from runtime
config via get_all_configs() and prefix filtering. Migrate to four sentry-options
dicts (object, additionalProperties string):
- query_settings / async_query_settings: {setting: value}
- query_settings_by_prefix / query_settings_by_referrer: keyed by prefix/referrer,
  each value a JSON-object string {setting: value} (sentry-options can't express
  dict-of-dict, so the second level is JSON-in-string), preserving the
  referrer > prefix > base precedence.

Values are string-typed (ClickHouse HTTP settings are strings on the wire). The
parametrized test now applies config via override_options (no longer needs redis)
and compares against stringified expected values.
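
A rough sketch of how the referrer > prefix > base precedence could be resolved from these dicts. This is not `_get_query_settings_from_config` itself: the function name, the `startswith` prefix match, and the rule that a longer prefix beats a shorter one are assumptions made for illustration.

```python
import json
from typing import Dict, Mapping


def resolve_query_settings(
    referrer: str,
    base: Mapping[str, str],
    by_prefix: Mapping[str, str],
    by_referrer: Mapping[str, str],
) -> Dict[str, str]:
    """Merge settings so that referrer entries beat prefix entries, which beat base."""
    settings: Dict[str, str] = dict(base)

    # Second level is JSON-in-string; apply shorter prefixes first so that
    # a more specific prefix overrides a broader one (assumed tie-break).
    for prefix in sorted(by_prefix, key=len):
        if referrer.startswith(prefix):
            blob = json.loads(by_prefix[prefix])
            settings.update({name: str(value) for name, value in blob.items()})

    if referrer in by_referrer:
        blob = json.loads(by_referrer[referrer])
        settings.update({name: str(value) for name, value in blob.items()})

    return settings


base = {"max_threads": "8", "max_execution_time": "30"}
by_prefix = {"api.": '{"max_threads": "4"}'}
by_referrer = {"api.trace": '{"max_threads": "2"}'}

exact = resolve_query_settings("api.trace", base, by_prefix, by_referrer)
prefixed = resolve_query_settings("api.other", base, by_prefix, by_referrer)
```

Values are stringified on the way in, in line with the note above that ClickHouse HTTP settings are strings on the wire.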

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
Add a test that overrides the eap_items_dlq_grace_period_min dict option and
asserts get_dlq_grace_period_min returns the per-storage value (Some(45)), plus
absent-key (None) and negative-value (rejected) cases. This exercises the nested
serde_json::Value get on the value returned by options("snuba").get(...), which
the other migrated reads (load balancing, max insert block size) already rely on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…ptions

A full audit of `state.get_config`/`get_configs` read sites surfaced seven
read-only runtime configs that were still backed by Redis. Migrate each to the
`snuba` sentry-options namespace:

- optimize_parallel_threads (clickhouse/optimize/util.py)
- http_batch_join_timeout (clickhouse/http.py)
- simultaneous_queries_sleep_seconds (clickhouse/native.py, two sites)
- max_days / date_align_seconds (query/snql/parser.py)
- snql_disabled_dataset__<dataset> -> snql_disabled_dataset dict (request/validation.py)
- quantized_rebalance_consumer_group_delay_secs__<group> -> dict (rust_snuba rebalancing)
- bypass_rate_limit / rate_history_sec / rate_limit_shard_factor (state/rate_limit.py)

The two configs whose fallback came from a caller arg / env setting
(optimize_parallel_threads, http_batch_join_timeout) use a sentinel-0 schema
default and fall back to the original value, preserving the prior
"option is only an override" behavior. The two per-suffix keys collapse into
single dict options keyed by the dynamic part, matching the pattern used for
the earlier dynamic-name migrations; a new get_mapped_bool_option helper backs
the boolean dict.
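
The sentinel-0 approach for an env-derived fallback might be sketched as below. The function name is hypothetical and the environment is passed in as a mapping to keep the example self-contained; the `BATCH_JOIN_TIMEOUT` default of 10 comes from the earlier revert commit.

```python
from typing import Mapping


def batch_join_timeout(option_value: int, environ: Mapping[str, str]) -> int:
    """Use the option only as an override; 0 (the schema default) means 'unset'."""
    # Mirrors settings.BATCH_JOIN_TIMEOUT = int(os.environ.get("BATCH_JOIN_TIMEOUT", 10)).
    env_default = int(environ.get("BATCH_JOIN_TIMEOUT", 10))
    return option_value if option_value != 0 else env_default
```

With a schema default of 10 instead of the sentinel, a deployment that set `BATCH_JOIN_TIMEOUT=30` only through the environment would read 10 from the option and never reach the env-derived value, which is the regression the earlier revert avoided.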

Migrating the rebalancing consumer was the last Rust caller of the Python-bridge
runtime_config::get_str_config, so that reader (plus its cache and test-patch
helper) is removed.

Tests that previously set these via state.set_config now use
sentry_options.testing.override_options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…n-4njhnb

# Conflicts:
#	pyproject.toml
#	uv.lock
…c to sentry-options

The master merge combined #8106's new get_str_config("lightweight_delete_mode")
read with our branch's import block (which no longer imported get_str_config),
producing an undefined-name failure in pre-commit. Migrate both lightweight-delete
ClickHouse-setting reads in lw_deletions/strategy.py to sentry-options instead of
re-adding the legacy import:

- lightweight_deletes_sync -> integer option, schema default -1 ("unset", leave
  ClickHouse's own default in place)
- lightweight_delete_mode -> string option, schema default "" (unset)

test_clickhouse_settings now drives the two flushes via override_options.
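
One way to keep the old "only set it when configured" behavior with typed scalar options is to treat the schema defaults as "unset" markers. A minimal sketch, assuming a hypothetical helper name and an illustrative mode value:

```python
from typing import Dict, Union


def build_delete_settings(sync: int, mode: str) -> Dict[str, Union[int, str]]:
    """Include each ClickHouse setting only when its option is actually set."""
    settings: Dict[str, Union[int, str]] = {}
    if sync != -1:  # -1 is the schema default: leave ClickHouse's own default in place
        settings["lightweight_deletes_sync"] = sync
    if mode:  # "" is the schema default: unset
        settings["lightweight_delete_mode"] = mode
    return settings
```

With both options at their schema defaults the function returns an empty dict, so the query carries no lightweight-delete settings at all.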

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…tions

Standalone read-only runtime config in the replacer's auto-replacements
bypass-expiry path; not part of the ConfigurableComponent system.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
@phacops phacops marked this pull request as draft June 25, 2026 23:19
claude added 4 commits June 25, 2026 23:34
…s to sentry-options

Migrate the remaining runtime-config reads in the allocation-policy / storage-
routing-strategy (ConfigurableComponent) subsystem to sentry-options:

- ConfigurableComponent.get_config_value now consults a single sentry-options
  dict, `configurable_component_overrides`, keyed by the same fully-qualified
  config key these configs have always used ({resource}.{ClassName}.{config}
  [.{param}:{value},...]). Values are stored as strings and coerced to each
  config's declared value_type. This is a single chokepoint, so it covers every
  allocation policy and routing strategy at once, and it is the authoritative
  source: a key absent from the option falls back to the legacy Redis runtime
  config and then the code default. With the option defaulting to {}, behavior is
  unchanged until the automator is populated.
- The storage-routing strategies' direct state.get_int_config reads
  (time_budget_ms, sampled_too_low_threshold, max_items_before_downsampling,
  min_timerange_to_query_outcomes) now read per-strategy dict options keyed by
  class name, preserving the per-strategy -> global ("StorageRouting") -> default
  fallback chain.

The legacy set_config_value / admin write path is left intact as the transitional
fallback; editing now happens centrally via the sentry-options-automator.

Tests that set these via state.set_config now use override_options; added
coverage for the new override precedence (incl. coercion and parameterized keys).
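
The two lookup chains described in this commit can be sketched as follows. Function names, the example keys, and the choice to fall through to the legacy value when a string cannot be coerced are all assumptions for illustration; the actual logic lives in `ConfigurableComponent.get_config_value` and the routing strategies.

```python
from typing import Any, Callable, Mapping, Optional


def _coerce(raw: str, value_type: type) -> Any:
    """Coerce a string override to the config's declared value_type."""
    if value_type is bool:
        lowered = raw.strip().lower()
        if lowered in ("1", "true"):
            return True
        if lowered in ("0", "false"):
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    return value_type(raw)


def resolve_config_value(
    key: str,
    value_type: type,
    overrides: Mapping[str, str],
    legacy_read: Callable[[str], Optional[Any]],
    code_default: Any,
) -> Any:
    """Lookup order: overrides dict option, then legacy runtime config, then code default."""
    raw = overrides.get(key)
    if raw is not None:
        try:
            return _coerce(raw, value_type)
        except ValueError:
            pass  # assumed behavior: an uncoercible override is ignored
    legacy = legacy_read(key)
    return legacy if legacy is not None else code_default


def resolve_strategy_value(option: Mapping[str, int], strategy: str, default: int) -> int:
    """Per-strategy entry first, then the global "StorageRouting" entry, then the default."""
    for name in (strategy, "StorageRouting"):
        if name in option:
            return option[name]
    return default


overrides = {
    "storage.ExamplePolicy.max_threads": "4",
    "storage.ExamplePolicy.is_enforced": "false",
    "storage.ExamplePolicy.limit.org_id:1": "oops",
}
legacy = {"storage.ExamplePolicy.limit.org_id:1": 20}.get
```

Since `configurable_component_overrides` defaults to `{}`, every lookup falls through to the legacy read until the automator is populated, which matches the behavior-preserving claim above.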

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…aster merge

Master #8101 added use_array_map_columns() in web/rpc/common/common.py reading
state.get_int_config("use_array_map_columns_timestamp_seconds", ...) with a
typing.cast. This branch had already removed the state/cast imports from that
file when migrating the neighbouring use_sampling_factor read, so the merge
produced undefined-name failures (a semantic conflict with no textual clash).

Migrate the new read to get_int_option (mirroring use_sampling_factor), which
also removes the need for the dropped imports. Add the use_array_map_columns_
timestamp_seconds integer option (default 1782172800) and convert its new test
from snuba_set_config to override_options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E
…gration

test_expiry_window_changes mock.patched
snuba.replacers.replacements_and_expiry.get_int_config, but that read was
migrated to get_int_option, so the patch target no longer existed
(AttributeError at collection of the patched test). Patch get_int_option
instead (preserving the side_effect=[5, 10] per-call semantics) and read the
class-level baseline via get_int_option too.
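
The failure mode and the fix can be reproduced in isolation. In this sketch a `SimpleNamespace` stands in for the `replacements_and_expiry` module, and the option key and default are hypothetical.

```python
from types import SimpleNamespace
from typing import List
from unittest import mock


def _get_int_option(key: str, default: int) -> int:
    """Stand-in accessor that simply returns the default."""
    return default


# Stand-in for the module under test: after the migration it only exposes get_int_option.
module = SimpleNamespace(get_int_option=_get_int_option)


def expiry_window_minutes() -> int:
    # Looks the accessor up on the module at call time, so patching takes effect.
    return module.get_int_option("expiry_window_minutes", 5)


def patched_reads() -> List[int]:
    """Patch the name that is actually read; side_effect yields one value per call."""
    with mock.patch.object(module, "get_int_option", side_effect=[5, 10]):
        return [expiry_window_minutes(), expiry_window_minutes()]


def stale_patch_fails() -> bool:
    """Patching the removed name raises AttributeError, as the old test did."""
    try:
        with mock.patch.object(module, "get_int_config", return_value=5):
            return False
    except AttributeError:
        return True
```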

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01U2Cu68uGZRcCVS14jcyd3E

2 participants