DENG-10914: Added managed backfill depends on past override for cfs v3#9582
DENG-10914: Added managed backfill depends on past override for cfs v3#9582wwyc wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
This PR adds an override_depends_on_past backfill option that lets a custom_query_path backfill run per-partition against a depends_on_past table with a null date_partition_parameter, instead of requiring --reinitialize-table. It threads the flag through the CLI, parser, validation, and the staging→prod copy step, and adds a custom backfill query plus backfill.yaml for clients_first_seen_v3 to populate the new attribution_msstoresignedin column.
The plumbing, the validate_override_depends_on_past checks (custom-query-path requirement, mutual exclusivity with reinitialize), and the per-partition copy resolution in _copy_backfill_staging_to_prod are coherent and well tested. My one concern is that _initiate_backfill itself was not adjusted for the override case, so the depends_on_past ref-rewrite and previous-partition seeding still run — which I believe makes the backfill fail (or silently overwrite prod with empty partitions) for the actual target table. Details inline; worth confirming an end-to-end initiate against clients_first_seen_v3 rather than relying on the mocked unit test.
| # Not reinitialized; runs per-partition and seeds the previous partition. | ||
| assert mock_reinitialize.call_count == 0 | ||
| assert mock_initialize_partition.call_count == 1 | ||
| assert mock_context.invoke.call_count == 1 | ||
| # The depends_on_past ref-rewrite produced replaced_ref.sql from the custom query, | ||
| # and that rewritten query (not query.sql) is what query_backfill is invoked with. | ||
| replaced_ref = Path(QUERY_DIR) / "replaced_ref.sql" | ||
| assert replaced_ref.exists() | ||
| assert mock_context.invoke.call_args.kwargs["custom_query_path"] == replaced_ref |
There was a problem hiding this comment.
issue: This test mocks _initialize_previous_partition and uses metadata without bigquery.time_partitioning, which hides that the override_depends_on_past path is broken for the real target table.
_initiate_backfill was not changed for the override case, so an override_depends_on_past entry still runs the full depends_on_past block at bigquery_etl/cli/backfill.py:1022 — both the ref-rewrite and _initialize_previous_partition. For clients_first_seen_v3 (depends_on_past, date_partition_parameter: null, day-partitioned on first_seen_date, no date_partition_offset):
_initialize_previous_partitioncallsget_backfill_partition(prev_date, None, 0, DAY), which returnsNonebecausedate_partition_parameteris null, so it raisesValueError: Unable to get initial partition id for depends_on_past table(backfill.py:1186-1189). The backfill cannot even be initiated. This path was previously unreachable for null-partition tables because validation blocked them; the override now opens it.- Even if that were bypassed, the ref-rewrite asserted here (lines 2722-2726) repoints the custom query's
cfssourcemoz-fx-data-shared-prod.telemetry_derived.clients_first_seen_v3to the empty staging table.WHERE cfs.first_seen_date = @submission_datethen returns zero rows, and_copy_backfill_staging_to_prodcopies the empty staging partition over the live prod partition — wiping data.
The override query reads its own prod partition to carry columns forward via * REPLACE; it must read prod (not staging) and does not need the previous-partition seed. Gate the depends_on_past block in _initiate_backfill on not entry.override_depends_on_past, and cover it with a test that uses metadata with time_partitioning matching the real table and does not mock _initialize_previous_partition.
Integration report
|
BenWu
left a comment
There was a problem hiding this comment.
Did you test a backfill in a dev project, e.g. the steps I listed in #9535 (comment) for org_mozilla_ios_firefoxbeta_derived.baseline_clients_first_seen_v1? The initiate command still has paths for metadata.scheduling.get("depends_on_past") that weren't changed which I wouldn't expect
|
|
||
| # rewrite query to query the staging table instead of the prod table | ||
| # and copy previous partition if table depends on past | ||
| if metadata.scheduling.get("depends_on_past") and not is_python_script: |
There was a problem hiding this comment.
I think this path still gets run, which is surprising, but I can't tell at-a-glance if it needs to be changed. The custom query will have the prod table rewritten as the staging table which would be wrong
| "--override-depends-on-past", | ||
| "--override_depends_on_past", |
There was a problem hiding this comment.
I recommend a more specific name even if it's wordy because this is for a very specific case and the parameter overlaps with override-depends-on-past-end-date which might be confusing
| "--override-depends-on-past", | |
| "--override_depends_on_past", | |
| "--override-depends-on-past-null-partition", | |
| "--override_depends_on_past_null_partition", |
Description
Extended managed backfill to allow override for depends on past tables.
Related Tickets & Documents
Reviewer, please follow this checklist