fix(logging): emit valid JSON for messages containing quotes or newlines#597
Conversation
The string-template formatter in dashboard/logging.ini interpolated %(message)s raw into a JSON-shaped string, producing invalid JSON whenever a message contained a quote or newline. In practice this breaks every exception log, since dashboard/app.py joins tracebacks into the message with newlines. Replace the template with a JsonFormatter class that uses json.dumps to serialize. Output shape is preserved.
bheesham
left a comment
There was a problem hiding this comment.
I like that this removes a hack. There are some cases it doesn't handle, but we don't make use of any of those.
Calls to logger.exception(...) or logger.error(..., exc_info=True) attach a traceback to record.exc_info. The first JsonFormatter commit dropped that field, so although the JSON was valid the traceback was lost from the output. Affected callers in this repo: yaml_loader, vanity, tile, user, and app's TokenError handler. Surface the traceback as a separate exc_info field (and stack_info when present), serialized via json.dumps so newlines and quotes remain escaped.
|
Bhee — you were right. Caught it: 7 callsites in this repo use Pushed a follow-up commit (c23d791) that surfaces the traceback as a separate Updated test plan:
Affected callsites in this repo (all now keep their tracebacks in the output):
Cross-org search: no other |
Two asymmetries flagged in review: - The implementation handled record.stack_info but no caller in this repo produces it (and nothing in the runtime deps or other mozilla-iam / mozilla-it Python code does either). Drop the branch; if a stack_info caller ever appears, a test can land alongside it. - Test 1 passed args=() while asserting against a literal message string, so the args parameter was dead weight. Use a %s-formatted msg with real args so the test actually covers record.getMessage()'s % interpolation alongside the JSON escaping.
|
Both asymmetries addressed in 1644ab7:
|
What changed
Replaces the brittle string-template
[formatter_json]indashboard/logging.iniwith a smallJsonFormatterclass (dashboard/jsonlog.py) that usesjson.dumpsto escape field values. Output shape is unchanged: every record still emitstime,level,process_id,message, and a compoundnameoffilename:logger:funcName:lineno.Also adds a regression test (
tests/test_jsonlog.py) asserting that a message containing both quotes and newlines round-trips throughjson.loadscleanly.Why
The previous formatter interpolated
%(message)sdirectly into a JSON-shaped template string. Any log message containing a"or a\nproduced invalid JSON. This is not hypothetical:dashboard/app.py(the global exception handler aroundhandle_exception) joins tracebacks into the message field with\n, so under the old formatter every exception log was emitted as malformed JSON. Downstream log consumers (Cloud Logging on Cloud Run, anything else parsing these as JSON) silently drop or text-fallback those records.Using
json.dumpsmakes the bug class disappear: every field is escaped at serialization time.Why not python-json-logger
I considered pulling in
python-json-logger, but adding a runtime dependency to a critical production system isn't worth it for ~14 lines of stdlib code. The historical JSON shape (custom keys likeprocess_id, the compoundnamefield) would have required a subclass to preserve regardless, so the dep would not have eliminated the code.Blast radius
Output shape is preserved, so any consumer that worked before continues to work. The one behavioral change worth flagging: parsers that were silently dropping malformed records (every exception log) will now successfully parse them. Expect the visible volume of structured-log records to increase after this deploys, particularly in the error/warning tiers. Anyone watching log-volume dashboards or alerts on this service should be aware that the bump is the bug being fixed, not a regression.
No public-facing routes, no auth flow, no apps.yml interaction.
Cross-repo coordination
None.
Test plan
tests/test_jsonlog.py— new test asserts aLogRecordwhose message contains both"and\nproduces output that round-trips throughjson.loads. Regression guard for the specific bug.toxpasses locally (lint + types + tests + style).JsonFormatter, fed it aLogRecordshaped like whatapp.py's exception handler produces (a string concatenated with a traceback containing newlines), confirmed valid JSON output and the historical field shape.Caveats
messagefield as a\n-joined string, same as before. Splitting them into a separateexc_infofield would be a behavior change worth doing later but is out of scope here.filename,funcName,lineno, loggername) are still collapsed into the single compoundnamefield. Cloud Logging would prefer them as separate top-level keys for easier filtering, but again, scope creep — a separate PR if anyone wants it.