feat(telemetry): anonymous opt-out usage telemetry (stack, cost, deploy-shape, CLI usage, call funnel)#160

Open
nicolotognoni wants to merge 4 commits into main from worktree-analytics-axiom-internal

@nicolotognoni

Collaborator

Summary

Adds anonymous, opt-out usage telemetry to both SDKs (Python getpatter + TypeScript getpatter) so the maintainers can see, in aggregate, how the SDK is actually used — which engines, providers, models, and carriers people choose, on which platforms, at which versions — and prioritise accordingly. It follows the open-source norm (Next.js, Astro, Homebrew): on by default, completely anonymous, trivially disabled, and fail-safe so it can never block, slow, or break a live phone call.

No PII or call content is ever collected. Every field is a coarse enum, bucket, or bool, enforced by a two-layer allowlist (key and value) in the SDK and re-validated server-side by the collector.
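
As a rough illustration of the two-layer allowlist described above, the sketch below drops unknown keys first and then coerces any off-list value to `other`. The key names and value sets are hypothetical placeholders, not the SDK's actual schema, and `sanitize` is an illustrative name rather than the module's real function.

```python
from typing import Any

# Hypothetical allowlists for illustration; the real schema is defined by the SDK.
ALLOWED_VALUES = {
    "engine": {"pipeline", "realtime", "other"},
    "provider": {"openai", "anthropic", "deepgram", "other"},
    "direction": {"inbound", "outbound", "other"},
}
BOOL_KEYS = {"container"}
ALLOWED_KEYS = set(ALLOWED_VALUES) | BOOL_KEYS


def sanitize(props: dict[str, Any]) -> dict[str, Any]:
    """Layer 1 drops unknown keys; layer 2 coerces off-list values to 'other'."""
    clean: dict[str, Any] = {}
    for key, value in props.items():
        if key not in ALLOWED_KEYS:
            continue  # layer 1: an unknown key never leaves the process
        if key in BOOL_KEYS:
            clean[key] = bool(value)
        elif isinstance(value, str) and value in ALLOWED_VALUES[key]:
            clean[key] = value
        else:
            clean[key] = "other"  # layer 2: free text collapses to a bucket
    return clean
```

With this shape, a payload such as `{"provider": "acme-internal", "phone": "+1555"}` would leave the process as `{"provider": "other"}`.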

Events

| Event | When | Carries (all coarse / PII-safe) |
| --- | --- | --- |
| sdk_initialized | SDK construction | version, language, OS/arch, runtime, install id, upgrade funnel, carrier, tunnel, and presence-only deploy-shape probes (AI-agent invoker, container, serverless, cloud, package manager) |
| first_run | once per install | activation marker; same anonymous deploy-shape dims; never sent when opted out |
| cli_command | every getpatter CLI run | the command name only (dashboard / eval / telemetry / other); never arguments or flags |
| feature_used | first use of a stack | engine + provider family, and for pipeline the composed stack: vendor + sanitized model token per layer (e.g. anthropic-claude-haiku-4-5, deepgram-nova-3) |
| agent_configured | per agent shape | bucketed tool counts, integration category, and which coarse features are enabled |
| call_started | a call connects | engine / provider / carrier / direction; pairs with call_completed for a connect→complete funnel |
| call_completed | a call ends | outcome, direction (inbound/outbound), error code (the code, never the message), duration, latency, cost, bucketed turn count |

Privacy posture

Never collected: phone numbers, transcripts, audio, prompts, tool arguments, API keys, customer identifiers, IPs (dropped at the collector), hostnames, file paths, or any free text. Custom / self-hosted model names and custom tool names are structurally impossible to send — they collapse to a vendor bucket or other before anything leaves the process. The two identifiers (run_id, install_id) are random UUIDs, never derived from hardware.
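
One way the model-name collapse could work is sketched below, assuming a small table of known public models per vendor. The table contents and the `model_token` name are illustrative assumptions; the vendor bucket is written here as `<vendor>-other`.

```python
# Hypothetical table of known public models; the real list is maintained by the SDK.
KNOWN_MODELS = {
    "anthropic": {"claude-haiku-4-5"},
    "deepgram": {"nova-3"},
}


def model_token(vendor: str, model: str) -> str:
    """Return a wire-safe token; custom or self-hosted names never pass through."""
    vendor = vendor.strip().lower()
    if vendor not in KNOWN_MODELS:
        return "other"  # unknown vendor: nothing identifying is kept
    name = model.strip().lower()
    if name in KNOWN_MODELS[vendor]:
        return f"{vendor}-{name}"
    return f"{vendor}-other"  # custom / fine-tuned name collapses to the vendor bucket
```

Because the output is always either a table entry or a fixed bucket string, the caller's input can select a token but can never contribute characters to it.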

Opt out — any one of

  • Patter(telemetry=False) / new Patter({ telemetry: false })
  • getpatter telemetry disable (persisted, machine-level; re-enable with getpatter telemetry enable, inspect with getpatter telemetry status)
  • PATTER_TELEMETRY_DISABLED=1, or the cross-tool standard DO_NOT_TRACK=1

Auto-disabled in CI/test. Inspect-without-send: PATTER_TELEMETRY_DEBUG=1 prints each event to stderr and sends nothing. Redirect to your own collector with PATTER_TELEMETRY_ENDPOINT.
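
The inspect-without-send and redirect behaviour could be sketched as follows. The default endpoint is a placeholder, `dispatch` and its `send` callback are hypothetical names, and treating only the value `1` as "debug on" is an assumption based on the example above.

```python
import json
import os
import sys
from typing import Callable, Mapping, Optional, TextIO

DEFAULT_ENDPOINT = "https://collector.example.invalid/v1/events"  # placeholder, not the real URL


def dispatch(
    event: dict,
    send: Callable[[str, dict], None],
    env: Optional[Mapping[str, str]] = None,
    stream: TextIO = sys.stderr,
) -> str:
    """Print the event in debug mode; otherwise hand it to `send` with the chosen endpoint."""
    env = os.environ if env is None else env
    if env.get("PATTER_TELEMETRY_DEBUG") == "1":
        print(json.dumps(event, sort_keys=True), file=stream)
        return "printed"  # debug mode: nothing is sent
    endpoint = env.get("PATTER_TELEMETRY_ENDPOINT") or DEFAULT_ENDPOINT
    send(endpoint, event)
    return "sent"
```

Checking the debug variable before resolving the endpoint keeps the "sends nothing" guarantee independent of whatever collector is configured.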

Implementation notes

  • New telemetry/ module in each SDK: events (schema + allowlist), consent, client (bounded buffer, short timeouts, swallow-all, best-effort exit flush), stack (vendor + PII-safe model sanitizer), environment (presence-only probes), install_id, call_metrics.
  • Schema is at v5; Python ↔ TypeScript verified byte-for-byte (7 events, 23 dimensions).
  • The README gains an OSS-standard Anonymous Telemetry disclosure (what we collect / what we never collect / not PII / opt-out incl. CLI + debug), and docs/telemetry.mdx documents every field.
  • Fail-isolation is load-bearing: every emit is wrapped so a telemetry bug can never break SDK construction, a call, or the CLI.
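
A minimal sketch of the fail-isolation and bounded-buffer ideas from the notes above, assuming a drop-oldest policy and a buffer size of 64 (neither is stated in this PR). `fail_safe` and `EventBuffer` are illustrative names, not the module's real API.

```python
import functools
from collections import deque
from typing import Any, Callable


def fail_safe(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a telemetry call so that any exception is swallowed, never raised."""
    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return fn(*args, **kwargs)
        except Exception:
            return None  # swallow-all: telemetry must not break the caller
    return wrapper


class EventBuffer:
    def __init__(self, max_events: int = 64) -> None:
        # Bounded: once full, the oldest event is dropped (assumed policy).
        self._events: deque = deque(maxlen=max_events)

    @fail_safe
    def emit(self, name: str, props: dict) -> None:
        self._events.append({"event": name, **props})

    def drain(self) -> list:
        """Return and clear buffered events, e.g. for a best-effort exit flush."""
        items = list(self._events)
        self._events.clear()
        return items
```

Passing malformed input such as `emit("bad", None)` raises inside the wrapped body and is absorbed, so the surrounding call path continues unaffected.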

Test plan

  • Python suite green (telemetry: dedicated authentic tests with a real local HTTP collector)
  • TypeScript suite green (telemetry: dedicated tests)
  • tsc --noEmit clean
  • Python ↔ TypeScript schema parity verified byte-for-byte
  • Off-list values (custom tool/integration/model names, unknown commands/directions) verified to coerce to other and never reach the wire raw
  • Opt-out paths (env vars, DO_NOT_TRACK, in-code flag, persisted CLI marker, CI/test auto-disable, debug inspect-only) verified to produce zero egress

…all id)

Add anonymous, opt-out, fail-safe usage telemetry so maintainers can see how the
SDK is used. No PII or call content is ever collected.

Events (schema v3):
- sdk_initialized: carrier, tunnel
- feature_used: engine + provider, and for pipeline agents the composed stack —
  stt/tts/llm provider and a sanitized model token (e.g. deepgram-nova-3,
  anthropic-claude-opus-4-8)
- agent_configured: built-in vs custom tool counts (never names) + integration
  category (openclaw/mcp/hermes/other/none)
- call_completed: outcome, error_code (closed code, never the message),
  engine/provider/carrier, raw latency_ms + duration_seconds, and total cost_usd

Identifiers are random and PII-free: a per-process run_id and a persistent
anonymous install_id (random UUID stored locally to count active installs, never
a hardware fingerprint, never created when opted out). A two-layer allowlist
coerces any off-list value to "other"; custom/fine-tuned model names collapse to
{vendor}-other, so customer brands can never reach the wire.
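
The install-id behaviour described above might look roughly like this. The state file name `install-id` and the function name are assumptions; the points being illustrated are that the id is a random UUID and that no state is created when the user has opted out.

```python
import uuid
from pathlib import Path
from typing import Optional


def load_install_id(state_dir: Path, opted_out: bool) -> Optional[str]:
    """Return a persistent random id, or None when opted out or storage fails."""
    if opted_out:
        return None  # never create local state for an opted-out user
    path = state_dir / "install-id"  # hypothetical file name
    try:
        existing = path.read_text(encoding="utf-8").strip()
        uuid.UUID(existing)  # reject corrupted contents
        return existing
    except (OSError, ValueError):
        pass
    new_id = str(uuid.uuid4())  # random, not derived from hardware
    try:
        state_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(new_id, encoding="utf-8")
    except OSError:
        return None  # fail-safe: an unwritable disk just means no install id
    return new_id
```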

Opt out with telemetry=False / { telemetry: false }, PATTER_TELEMETRY_DISABLED=1,
or DO_NOT_TRACK=1; auto-disabled in CI/test. Inspect locally with
PATTER_TELEMETRY_DEBUG=1; redirect with PATTER_TELEMETRY_ENDPOINT.

Tests: Python 2103 passed, TypeScript 1688 passed; Python/TS schema parity
verified byte-for-byte.
…als (schema v4)

Extend anonymous usage telemetry with high-value, privacy-safe signals (all
enums/bools/buckets, no PII; presence-only env probes never read a var's value):

- sdk_initialized: invoked_by_agent (claude/cursor/copilot/...), container,
  serverless, cloud, package_manager, days_since_install_bucket, and
  previous_sdk_version (upgrade funnel via a local state file).
- agent_configured: noise_reduction, turn_detection, preambles_used,
  per_tool_timeouts_set, llm_fallback_configured.
- call_completed: turn_count_bucket.
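
For illustration, a bucketed dimension such as turn_count_bucket could be derived as below. The bucket edges and labels are invented for this sketch; the commit does not state the real boundaries.

```python
from bisect import bisect_right

# Hypothetical bucket edges and labels; the real boundaries are not given here.
TURN_EDGES = [1, 3, 6, 11, 21]
TURN_LABELS = ["0", "1-2", "3-5", "6-10", "11-20", "21+"]


def turn_count_bucket(turns: int) -> str:
    """Map an exact turn count to a coarse label so the raw number is never sent."""
    return TURN_LABELS[bisect_right(TURN_EDGES, max(turns, 0))]
```

The same pattern would apply to days_since_install_bucket with a different edge list.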

New telemetry/environment.{py,ts} (deploy-shape probes); install-id gains the
version funnel + days-since-install bucket. events.* gains a boolean-dimension
category; schema bumped v3 -> v4; the Vercel/Cloudflare relay allowlists updated
to match. README now carries an honest anonymous/opt-out telemetry note.
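
A presence-only probe, as described above, checks whether a marker variable exists and never reads its value. The sketch below assumes two well-known platform variables; which variables the SDK actually inspects is not stated here, and `probe_serverless` is an illustrative name.

```python
import os
from typing import Mapping, Optional

# Assumed marker variables for illustration; the SDK's actual list may differ.
SERVERLESS_MARKERS = {
    "AWS_LAMBDA_FUNCTION_NAME": "aws-lambda",
    "VERCEL": "vercel",
}


def probe_serverless(env: Optional[Mapping[str, str]] = None) -> str:
    """Return a coarse platform label based only on which variables are present."""
    env = os.environ if env is None else env
    for var, label in SERVERLESS_MARKERS.items():
        if var in env:  # presence only: the value is never read
            return label
    return "none"
```

Even if the variable holds something sensitive, such as a customer-named function, only the fixed label can reach the event.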

Telemetry never breaks construction: feature-adoption derivation is fault-isolated
(array-guarded) so invalid input still raises the correct validation error.

Tests: Python 2106, TypeScript 1691 passed; Python/TS schema parity byte-for-byte;
live relay -> Axiom verified for every new field.
…(schema v5)

Round out anonymous usage telemetry to OSS-CLI parity (Next.js / Astro /
Homebrew) so the maintainers get a clear picture of how the SDK is used — still
no PII, every field a coarse enum / bucket / bool.

New events:
- cli_command: which `getpatter` CLI command ran (the command NAME only, never
  arguments or flag values). The `getpatter telemetry` control command never
  emits telemetry itself.
- first_run: sent once per install (the run that creates the local state) to
  mark activation; carries the same anonymous deploy-shape dims as
  sdk_initialized and is never sent when opted out.
- call_started: emitted when a call connects, pairing with call_completed for a
  connect->complete funnel and a failure-rate denominator.

New dimension: direction (inbound / outbound) on call_started and call_completed
— a core usage split that was previously invisible.
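
To show why call_started matters as a denominator, here is a sketch of how a failure rate per direction could be computed on the analysis side. The outcome value `error` is a hypothetical stand-in, since the actual outcome enum is not listed in this message.

```python
from collections import Counter
from typing import Iterable, Mapping

FAILED_OUTCOME = "error"  # hypothetical outcome value


def failure_rate_by_direction(events: Iterable[Mapping[str, str]]) -> dict[str, float]:
    """Failed call_completed events divided by call_started events, per direction."""
    started: Counter = Counter()
    failed: Counter = Counter()
    for event in events:
        direction = event.get("direction", "other")
        if event.get("event") == "call_started":
            started[direction] += 1
        elif event.get("event") == "call_completed" and event.get("outcome") == FAILED_OUTCOME:
            failed[direction] += 1
    # Only directions with at least one started call appear, so no division by zero.
    return {d: failed[d] / n for d, n in started.items()}
```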

User control: `getpatter telemetry status | disable | enable` (parity with
`next telemetry disable`) writes a persisted, machine-level opt-out marker
(~/.getpatter/telemetry-disabled) honoured by the consent resolver at
precedence #3 (after the env kill-switches, before the in-code flag).
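
The precedence described above could be sketched as follows: env kill-switches first, then the persisted marker, then the in-code flag. Placing the CI check last and detecting CI through a `CI` variable are assumptions, as is the `resolve_consent` name; the function returns the deciding source so the ordering is visible.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_consent(
    flag: bool = True,
    env: Optional[Mapping[str, str]] = None,
    marker: Optional[Path] = None,
) -> Tuple[bool, str]:
    """Return (enabled, deciding_source), checking sources in precedence order."""
    env = os.environ if env is None else env
    if marker is None:
        marker = Path.home() / ".getpatter" / "telemetry-disabled"
    if env.get("PATTER_TELEMETRY_DISABLED") == "1" or env.get("DO_NOT_TRACK") == "1":
        return False, "env"  # kill-switches win over everything else
    if marker.exists():
        return False, "marker"  # persisted, machine-level opt-out
    if not flag:
        return False, "flag"  # in-code telemetry=False
    if "CI" in env:
        return False, "ci"  # assumed position and detection of the CI auto-disable
    return True, "default"
```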

events.* gains the direction + cli_command enums and three event names; schema
bumped v4 -> v5; the Vercel/Cloudflare relay allowlists updated to match.
README restructured to the OSS-standard "Anonymous Telemetry" disclosure
(what we collect / what we never collect / not PII / opt-out incl. CLI + debug);
docs/telemetry.mdx + CHANGELOG updated.

Fail-safe throughout: every new emit is wrapped so it can never block or break
a call, SDK construction, or the CLI. Python<->TS verified byte-for-byte
(7 events, 23 dimensions, schema 5).

Tests: Python 2113, TypeScript 1696 passed (telemetry: Python 48 / TS 37).