Additional debug logging on onestop_id resolution #662

Open
irees wants to merge 1 commit into main from osid-instrumentation

irees commented Jun 13, 2026

Summary

Adds temporary, opt-in instrumentation to measure how often the allow_previous_onestop_ids stop resolution actually does work beyond an exact Onestop ID lookup, and how much of that work a content-based (geohash point + name) search would reproduce. This is decision-support for interline-io/tlv2#354, which is evaluating whether the feed_version_stop_onestop_ids history table (~640 GB in production, the bulk of it per-feed-version repetition) can be replaced by a search-key resolver without losing real resolutions. The instrumentation is gated behind an environment variable, off by default with zero overhead, read-only, and intended to be removed once the measurement window is complete.

What it measures

For each resolved current stop, it classifies two things:

  • differs: the stop's current Onestop ID is not the one that was requested, meaning an exact lookup would have missed it and the previous-id behavior did real work.
  • search_would_match: whether a geohash-point + name-similarity search (using the same 100 m radius and 0.4 word_similarity threshold a candidate search-key resolver would use) would also have found the stop.

The decision metric is n_meaningful = differs AND NOT search_would_match, that is, resolutions that only stored history can recover. If that count stays near zero over real traffic, a search-key resolver is safe; if not, it quantifies the recall a search-only design would lose.
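As a rough illustration of the two classifications and the decision metric, here is a minimal Go sketch. The PR computes these in SQL; the `probeRow` and `probeCounts` types, the `countRows` helper, and the sample IDs other than the Fruitvale one are hypothetical and exist only to show how the counts relate.

```go
package main

import "fmt"

// probeRow is a hypothetical stand-in for one resolved stop in the probe output.
// It is not a type from the PR.
type probeRow struct {
	RequestedOSID    string // Onestop ID the caller asked for
	CurrentOSID      string // Onestop ID the stop has now
	SearchWouldMatch bool   // geohash-point + name search would also find it
}

// differs is true when an exact lookup on the requested ID would have missed.
func (r probeRow) differs() bool { return r.CurrentOSID != r.RequestedOSID }

// meaningful is true when only stored history can recover the resolution.
func (r probeRow) meaningful() bool { return r.differs() && !r.SearchWouldMatch }

// probeCounts mirrors the n_resolved, n_differs and n_meaningful counters.
type probeCounts struct {
	NResolved   int
	NDiffers    int
	NMeaningful int
}

// countRows tallies the counters over a set of resolved rows.
func countRows(rows []probeRow) probeCounts {
	var c probeCounts
	for _, r := range rows {
		c.NResolved++
		if r.differs() {
			c.NDiffers++
		}
		if r.meaningful() {
			c.NMeaningful++
		}
	}
	return c
}

func main() {
	// Illustrative rows; the second and third current IDs are made up.
	rows := []probeRow{
		{RequestedOSID: "s-9q9nfswzpg-fruitvale", CurrentOSID: "s-9q9nfswzpg-fruitvale", SearchWouldMatch: true},
		{RequestedOSID: "s-9q9nfswzpg-fruitvale", CurrentOSID: "s-9q9nfswzph-fruitvale", SearchWouldMatch: true},
		{RequestedOSID: "s-9q9nfswzpg-fruitvale", CurrentOSID: "s-9q9p1d0000-renamed", SearchWouldMatch: false},
	}
	fmt.Printf("%+v\n", countRows(rows))
}
```

Only the third row counts toward n_meaningful: its ID changed and a content-based search would not have found it.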

How it works

A probe in FindStops fires only when TL_LOG_ALLOWPREV_PROBE is set and the request is a bare-osid previous-id lookup — allow_previous_onestop_ids true, no feed_version_sha1, no explicit ids — which isolates the durable-key use case (e.g. a stored Onestop ID powering a departure board) from frontend pinned-version browsing. It runs one additional read-only query that reproduces the same (feed, stop_id) continuity the production resolver uses, then computes both classifications entirely in SQL, using ST_PointFromGeoHash to decode the requested geohash so no Go-side geohash or name-normalization logic is needed. Results are emitted as structured logs: one allowprev_probe summary line per request (n_requested, n_resolved, n_differs, n_meaningful, requested_osids) and one allowprev_probe_meaningful detail line per resolution that only history could recover.
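The gating condition described above can be sketched roughly as follows. This is not the guard from the PR: the `stopRequest` struct, its field names, and the `shouldProbe` helper are hypothetical, and treating a non-empty value as "set" is an assumption about how the environment variable is read.

```go
package main

import (
	"fmt"
	"os"
)

// probeEnvVar is the environment variable named in the PR description.
const probeEnvVar = "TL_LOG_ALLOWPREV_PROBE"

// stopRequest is a hypothetical, simplified view of a FindStops request.
// Field names are illustrative and do not come from the PR.
type stopRequest struct {
	OnestopIDs              []string
	AllowPreviousOnestopIDs bool
	FeedVersionSHA1         string
	IDs                     []int
}

// probeEnabled assumes any non-empty value turns the probe on.
func probeEnabled() bool { return os.Getenv(probeEnvVar) != "" }

// shouldProbe reports whether a request is a bare-osid previous-id lookup:
// allow_previous_onestop_ids true, no feed_version_sha1, no explicit ids.
func shouldProbe(enabled bool, req stopRequest) bool {
	return enabled &&
		req.AllowPreviousOnestopIDs &&
		len(req.OnestopIDs) > 0 &&
		req.FeedVersionSHA1 == "" &&
		len(req.IDs) == 0
}

func main() {
	req := stopRequest{
		OnestopIDs:              []string{"s-9q9nfswzpg-fruitvale"},
		AllowPreviousOnestopIDs: true,
	}
	fmt.Println("probe would run:", shouldProbe(probeEnabled(), req))
}
```

Checking the flag first keeps the disabled path to a single comparison, which is consistent with the "no added query or overhead" claim.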

Scope and caveats

Off by default; when the env var is unset there is no added query or overhead. When enabled it adds one synchronous read-only query to qualifying requests, so it is a measurement-window tool rather than a permanent fixture. It measures raw continuity resolution without permission or license filtering, so counts reflect resolution capability rather than what any specific caller would be authorized to see. Pinned-feed-version previous-id requests are intentionally excluded as a different semantic. The whole change is one new file plus an eight-line guard in FindStops, so it is straightforward to delete afterward.

Test plan

Set TL_LOG_ALLOWPREV_PROBE=1 and start the server, then issue a GraphQL stops query with a bare previous Onestop ID, e.g. stops(where:{onestop_id:"s-9q9nfswzpg-fruitvale", allow_previous_onestop_ids:true}), and confirm an allowprev_probe log line is emitted with sensible counts (and an allowprev_probe_meaningful line only when a resolution would not be reproduced by search). Confirm that a request with feed_version_sha1 also set produces no probe line. Unset the env var and confirm no probe log lines appear and no extra query is issued.
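For scripting the check, the request body could be built along these lines. This is only a sketch: `stopsQuery` and `requestBody` are hypothetical helpers, the `{ onestop_id }` selection set is an assumption about the schema, and the server URL to post the body to is left out because it depends on the local setup.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stopsQuery builds the GraphQL query from the test plan for one Onestop ID.
// %q is used for quoting, which matches GraphQL string syntax for plain
// ASCII IDs such as the example below.
func stopsQuery(onestopID string) string {
	return fmt.Sprintf(
		"{ stops(where:{onestop_id:%q, allow_previous_onestop_ids:true}) { onestop_id } }",
		onestopID,
	)
}

// requestBody wraps the query in the usual {"query": "..."} JSON envelope.
func requestBody(onestopID string) ([]byte, error) {
	return json.Marshal(map[string]string{"query": stopsQuery(onestopID)})
}

func main() {
	body, err := requestBody("s-9q9nfswzpg-fruitvale")
	if err != nil {
		panic(err)
	}
	// POST this body to the server's GraphQL endpoint, then check the logs
	// for an allowprev_probe line.
	fmt.Println(string(body))
}
```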

@irees irees marked this pull request as ready for review June 13, 2026 11:06
Copilot AI review requested due to automatic review settings June 13, 2026 11:06

Copilot AI left a comment

Pull request overview

This PR adds opt-in, temporary instrumentation to quantify how often allow_previous_onestop_ids stop resolution provides value beyond exact Onestop ID lookup, and whether a geohash+name search would have reproduced the same resolutions (decision-support for potentially replacing the history table).

Changes:

  • Adds a new diagnostic query + structured logging probe gated by TL_LOG_ALLOWPREV_PROBE.
  • Invokes the probe from FindStops only for the “bare Onestop ID + allow_previous_onestop_ids + active feed” request shape.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| server/finders/dbfinder/stop.go | Adds a guarded call to the allow-prev probe for qualifying requests. |
| server/finders/dbfinder/stop_onestop_probe.go | Implements the diagnostic query and structured logging for allow-prev resolution effectiveness. |

Comment on lines +72 to +77
    Column(sq.Expr(
        "(ST_DWithin(gs.geometry, ST_PointFromGeoHash(split_part(req.requested_osid,'-',2))::geography, ?) "+
            "and (split_part(coalesce(cur.onestop_id,''),'-',3) = split_part(req.requested_osid,'-',3) "+
            "or word_similarity(split_part(req.requested_osid,'-',3), split_part(coalesce(cur.onestop_id,''),'-',3)) > ?)) as search_would_match",
        allowPrevProbeRadiusM, allowPrevProbeSimilarity,
    )).
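To make the `split_part` calls in the expression above easier to follow, here is a small Go sketch of how an Onestop ID breaks into its hyphen-separated parts. `splitPart` and `exactNameMatch` are hypothetical helpers written for illustration. They cover only positive indexes and the exact-name branch of the predicate. The `ST_DWithin` distance check and the pg_trgm `word_similarity` fallback are not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPart mimics PostgreSQL split_part(s, delim, n) for positive n:
// fields are 1-based and an out-of-range index yields the empty string.
func splitPart(s, delim string, n int) string {
	parts := strings.Split(s, delim)
	if n < 1 || n > len(parts) {
		return ""
	}
	return parts[n-1]
}

// exactNameMatch mirrors the first branch of the name comparison: the third
// field of the current ID equals the third field of the requested ID.
// An empty current ID behaves like coalesce(cur.onestop_id, '').
func exactNameMatch(requestedOSID, currentOSID string) bool {
	return splitPart(currentOSID, "-", 3) == splitPart(requestedOSID, "-", 3)
}

func main() {
	osid := "s-9q9nfswzpg-fruitvale"
	fmt.Println("geohash:", splitPart(osid, "-", 2)) // decoded by ST_PointFromGeoHash in the SQL
	fmt.Println("name:", splitPart(osid, "-", 3))    // compared by equality or word_similarity
	fmt.Println("exact name match:", exactNameMatch(osid, osid))
}
```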