Service needs structured logging, Prometheus metrics, robust graceful shutdown, request-ID tracing, and a soak test to confirm no FD leaks or unbounded memory growth.
Metrics to expose (`/metrics`)

| Metric | Labels |
|---|---|
| `feedgen_requests_total` | `endpoint`, `status` |
| `feedgen_request_duration_seconds` | `endpoint` |
| `feedgen_subscriptions_open` | `kind` (`"service"` or `"actor"`) |
| `feedgen_subscription_reconnects_total` | `kind`, `did` (optional) |
| `feedgen_index_posts_total` | none |
| `feedgen_blob_cache_hits_total` / `feedgen_blob_cache_misses_total` | none |
| `feedgen_boundary_cache_hits_total` / `feedgen_boundary_cache_misses_total` | none |
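
A dependency-free counter helper that renders the Prometheus text exposition format might look like the sketch below. The `Counter` class, its methods, and the help strings are hypothetical names chosen for illustration; only the metric and label names come from the table. A histogram such as `feedgen_request_duration_seconds` would additionally need `_bucket`, `_sum`, and `_count` series, which this sketch omits.

```typescript
// Hypothetical helper, not existing project code: a labelled counter that
// renders itself in the Prometheus text exposition format.
type Labels = Record<string, string>;

// Escape backslash, newline, and double quote, as required for label values.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/\n/g, "\\n").replace(/"/g, '\\"');
}

// Produce a stable `{a="1",b="2"}` suffix so equal label sets share one series.
function serializeLabels(labels: Labels): string {
  const entries = Object.entries(labels).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  if (entries.length === 0) return "";
  const pairs = entries.map(([name, value]) => `${name}="${escapeLabelValue(value)}"`);
  return `{${pairs.join(",")}}`;
}

class Counter {
  readonly name: string;
  readonly help: string;
  private readonly values = new Map<string, number>();

  constructor(name: string, help: string) {
    this.name = name;
    this.help = help;
  }

  // Increment the series identified by `labels` (default: the unlabelled series).
  inc(labels: Labels = {}, by = 1): void {
    const key = serializeLabels(labels);
    this.values.set(key, (this.values.get(key) ?? 0) + by);
  }

  // Render HELP/TYPE metadata followed by one sample line per label set.
  render(): string {
    const lines = [`# HELP ${this.name} ${this.help}`, `# TYPE ${this.name} counter`];
    for (const [key, value] of this.values) {
      lines.push(`${this.name}${key} ${value}`);
    }
    return lines.join("\n") + "\n";
  }
}

// Usage: count two successful /health requests and render the result.
const requestsTotal = new Counter("feedgen_requests_total", "Total HTTP requests handled.");
requestsTotal.inc({ endpoint: "/health", status: "200" });
requestsTotal.inc({ endpoint: "/health", status: "200" });
const output = requestsTotal.render();
console.log(output);
```

The rendered output ends with the single sample line `feedgen_requests_total{endpoint="/health",status="200"} 2`. Sorting label names before serializing keeps the series key independent of the order in which callers list their labels.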
To-do
- `src/logger.ts`: `pino` logger with Stratos service conventions
- Request-ID middleware (`src/middleware/request-id.ts`): accept `X-Request-Id` or generate a UUID; attach to `AsyncLocalStorage`
- Per-request log fields: `{requestId, viewerDid?, endpoint, durationMs, status}`
- `src/metrics.ts`: Prometheus-format counter/histogram helpers; register all metrics above
- `/metrics` endpoint
- `src/lifecycle/shutdown.ts`: SIGTERM/SIGINT handler that stops accepting connections → drains in-flight requests (15 s timeout) → flushes actor cursors → closes the DB
- `/health` to return `{ok, version, serviceStreamConnected, actorPoolSize}`
- `tests/shutdown.test.ts` and `tests/metrics.test.ts`

Acceptance criteria
- Request logs include `requestId`, `viewerDid` (where available), and `durationMs`
- `/metrics` output passes `promtool check metrics` or equivalent programmatic validation (test)
- `/health` reflects actual subscription state (`serviceStreamConnected`, `actorPoolSize`)
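
The shutdown ordering described in the to-do list (stop accepting connections, drain with a 15 s timeout, flush actor cursors, close the DB) could be sketched roughly as follows. The `ShutdownDeps` shape, `gracefulShutdown`, and `installSignalHandlers` are hypothetical names. The sketch assumes `closeServer` wraps something like Node's `http.Server.close`, which stops accepting new connections and completes once existing ones have ended. The exit codes are likewise an assumption.

```typescript
// Hypothetical dependency shape; the real service would pass wrappers around
// its HTTP server, actor pool, and DB handle.
interface ShutdownDeps {
  closeServer: () => Promise<void>; // stop accepting; resolves once in-flight requests finish
  flushActorCursors: () => Promise<void>;
  closeDb: () => Promise<void>;
  drainTimeoutMs?: number; // defaults to the 15 s named in the to-do list
}

type DrainResult = "drained" | "timed_out";

async function gracefulShutdown(deps: ShutdownDeps): Promise<DrainResult> {
  const timeoutMs = deps.drainTimeoutMs ?? 15_000;
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Race the drain against a timer so a stuck connection cannot block exit.
  const timedOut = new Promise<DrainResult>((resolve) => {
    timer = setTimeout(() => resolve("timed_out"), timeoutMs);
  });
  const drained = deps.closeServer().then((): DrainResult => "drained");
  const result = await Promise.race([drained, timedOut]);
  if (timer !== undefined) clearTimeout(timer);

  // Persist cursors before the DB goes away, even if the drain timed out.
  await deps.flushActorCursors();
  await deps.closeDb();
  return result;
}

// Wire the sequence to SIGTERM/SIGINT; a second signal during shutdown is ignored.
function installSignalHandlers(deps: ShutdownDeps): void {
  let started = false;
  for (const signal of ["SIGTERM", "SIGINT"] as const) {
    process.once(signal, () => {
      if (started) return;
      started = true;
      gracefulShutdown(deps).then(
        (result) => process.exit(result === "drained" ? 0 : 1), // assumed exit codes
        () => process.exit(1),
      );
    });
  }
}
```

Passing the steps in as functions keeps the ordering testable without real sockets: a test such as `tests/shutdown.test.ts` could supply stubs that record the call order, plus a `closeServer` that never resolves to exercise the timeout path.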