Additional debug logging on onestop_id resolution#662
Open
irees wants to merge 1 commit into
Open
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds opt-in, temporary instrumentation to quantify how often allow_previous_onestop_ids stop resolution provides value beyond exact Onestop ID lookup, and whether a geohash+name search would have reproduced the same resolutions (decision-support for potentially replacing the history table).
Changes:
- Adds a new diagnostic query + structured logging probe gated by
TL_LOG_ALLOWPREV_PROBE. - Invokes the probe from
FindStopsonly for the “bare Onestop ID + allow_previous_onestop_ids + active feed” request shape.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| server/finders/dbfinder/stop.go | Adds a guarded call to the allow-prev probe for qualifying requests. |
| server/finders/dbfinder/stop_onestop_probe.go | Implements the diagnostic query and structured logging for allow-prev resolution effectiveness. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+72
to
+77
| Column(sq.Expr( | ||
| "(ST_DWithin(gs.geometry, ST_PointFromGeoHash(split_part(req.requested_osid,'-',2))::geography, ?) "+ | ||
| "and (split_part(coalesce(cur.onestop_id,''),'-',3) = split_part(req.requested_osid,'-',3) "+ | ||
| "or word_similarity(split_part(req.requested_osid,'-',3), split_part(coalesce(cur.onestop_id,''),'-',3)) > ?)) as search_would_match", | ||
| allowPrevProbeRadiusM, allowPrevProbeSimilarity, | ||
| )). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds temporary, opt-in instrumentation to measure how often the
allow_previous_onestop_idsstop resolution actually does work beyond an exact Onestop ID lookup, and how much of that work a content-based (geohash point + name) search would reproduce. This is decision-support for interline-io/tlv2#354, which is evaluating whether thefeed_version_stop_onestop_idshistory table (~640 GB in production, the bulk of it per-feed-version repetition) can be replaced by a search-key resolver without losing real resolutions. The instrumentation is gated behind an environment variable, off by default with zero overhead, read-only, and intended to be removed once the measurement window is complete.What it measures
For each resolved current stop, it classifies two things:
differs— the stop's current Onestop ID is not the one that was requested, meaning an exact lookup would have missed it and the previous-id behavior did real work; andsearch_would_match— whether a geohash-point + name-similarity search (using the same 100 m radius and 0.4word_similaritythreshold a candidate search-key resolver would use) would also have found the stop. The decision metric isn_meaningful=differs AND NOT search_would_match: resolutions that only stored history can recover. If that count stays near zero over real traffic, a search-key resolver is safe; if not, it quantifies the recall a search-only design would lose.How it works
A probe in
FindStopsfires only whenTL_LOG_ALLOWPREV_PROBEis set and the request is a bare-osid previous-id lookup —allow_previous_onestop_idstrue, nofeed_version_sha1, no explicit ids — which isolates the durable-key use case (e.g. a stored Onestop ID powering a departure board) from frontend pinned-version browsing. It runs one additional read-only query that reproduces the same(feed, stop_id)continuity the production resolver uses, then computes both classifications entirely in SQL, usingST_PointFromGeoHashto decode the requested geohash so no Go-side geohash or name-normalization logic is needed. Results are emitted as structured logs: oneallowprev_probesummary line per request (n_requested,n_resolved,n_differs,n_meaningful,requested_osids) and oneallowprev_probe_meaningfuldetail line per resolution that only history could recover.Scope and caveats
Off by default; when the env var is unset there is no added query or overhead. When enabled it adds one synchronous read-only query to qualifying requests, so it is a measurement-window tool rather than a permanent fixture. It measures raw continuity resolution without permission or license filtering, so counts reflect resolution capability rather than what any specific caller would be authorized to see. Pinned-feed-version previous-id requests are intentionally excluded as a different semantic. The whole change is one new file plus an eight-line guard in
FindStops, so it is straightforward to delete afterward.Test plan
Set
TL_LOG_ALLOWPREV_PROBE=1and start the server, then issue a GraphQL stops query with a bare previous Onestop ID, e.g.stops(where:{onestop_id:"s-9q9nfswzpg-fruitvale", allow_previous_onestop_ids:true}), and confirm anallowprev_probelog line is emitted with sensible counts (and anallowprev_probe_meaningfulline only when a resolution would not be reproduced by search). Confirm that a request withfeed_version_sha1also set produces no probe line. Unset the env var and confirm no probe log lines appear and no extra query is issued.