diff --git a/ruby/rails-api/.env.example b/ruby/rails-api/.env.example
new file mode 100644
index 0000000..8b96679
--- /dev/null
+++ b/ruby/rails-api/.env.example
@@ -0,0 +1,41 @@
+# OpenTelemetry Configuration for Last9
+# Get these values from your Last9 dashboard
+
+# OTLP Endpoint — send directly to Last9 (or set to http://localhost:4318 to use OTel Collector)
+OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.last9.io:443
+
+# Authentication header for Last9
+OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic
+
+# Service name
+OTEL_SERVICE_NAME=ruby-on-rails-api-service
+
+# Traces exporter
+OTEL_TRACES_EXPORTER=otlp
+
+# Optional: probabilistic sampling (0.0–1.0). Unset = sample everything.
+# OTEL_SAMPLE_RATE=0.1
+
+# Optional: drop additional URL paths (comma-separated)
+# OTEL_FILTER_PATHS=/admin,/internal,/metrics
+
+# Optional: drop spans by peer hostname
+# OTEL_FILTER_HOSTS=internal.svc,cache.local
+
+# Optional: drop spans whose name contains any substring
+# OTEL_FILTER_SPAN_NAMES=render_partial,render_template
+
+# Optional: override which Redis commands to drop (default: HGET,HSET,HMGET,HMSET,...)
+# OTEL_FILTER_REDIS_COMMANDS=GET,SET,DEL,EXPIRE
+
+# Optional: drop all spans from specific Sidekiq queues
+# OTEL_FILTER_SIDEKIQ_QUEUES=mailers,low
+
+# Optional: drop all spans from specific Sidekiq job classes
+# OTEL_FILTER_SIDEKIQ_JOBS=HeartbeatJob,MetricsSyncJob
+
+# --- OTel Collector mode (docker-compose.yaml) ---
+# When using the collector, the app sends to it and the collector forwards to Last9.
+# Set these in your shell or a .env file — the collector reads them at startup.
+# LAST9_OTLP_ENDPOINT=https://otlp.last9.io:443
+# LAST9_OTLP_AUTH_HEADER=Basic
diff --git a/ruby/rails-api/.gitignore b/ruby/rails-api/.gitignore
index 30b4baf..0904bfe 100644
--- a/ruby/rails-api/.gitignore
+++ b/ruby/rails-api/.gitignore
@@ -10,6 +10,7 @@
 # Ignore all environment files (except templates).
 /.env*
 !/.env*.erb
+!/.env.example
 
 # Ignore all logfiles and tempfiles.
 /log/*
diff --git a/ruby/rails-api/README.md b/ruby/rails-api/README.md
index 7dc46c5..b774e76 100644
--- a/ruby/rails-api/README.md
+++ b/ruby/rails-api/README.md
@@ -1,48 +1,141 @@
-# Auto instrumentating Ruby on rails application using OpenTelemetry
+# Rails API OpenTelemetry Example
 
-This example demonstrates how to instrument a simple Ruby on rails application
-with OpenTelemetry.
+OpenTelemetry instrumentation for a Rails API application with built-in span noise reduction, sending traces to [Last9](https://last9.io).
 
-1. Install the packages using following command:
+## Prerequisites
 
-```bash
-bundle install
+- Ruby 3.x
+- Bundler
+
+## Quick Start
+
+1. **Install dependencies:**
+   ```bash
+   bundle install
+   ```
+
+2. **Configure environment:**
+   ```bash
+   cp .env.example .env
+   # Fill in your Last9 OTLP endpoint and credentials
+   ```
+
+3. **Start the server:**
+   ```bash
+   bin/rails server
+   ```
+
+4. **Send test requests:**
+   ```bash
+   curl http://localhost:3000/api/v1/users
+   curl -X POST http://localhost:3000/api/v1/users -H 'Content-Type: application/json' -d '{"name":"Alice"}'
+   ```
+
+## Configuration
+
+| Variable | Description |
+|---|---|
+| `OTEL_SERVICE_NAME` | Service name shown in traces |
+| `OTEL_EXPORTER_OTLP_ENDPOINT` | Last9 OTLP endpoint |
+| `OTEL_EXPORTER_OTLP_HEADERS` | `Authorization=Basic ` |
+| `OTEL_TRACES_EXPORTER` | Set to `otlp` |
+
+## Reducing Trace Volume
+
+Ruby's `opentelemetry-instrumentation-all` + `use_all()` generates a large number of spans by default. This example includes several mechanisms to reduce noise.
+
+### What's disabled
+
+`ActionView` instrumentation is disabled — it creates a span per template and partial render, which is very high volume in full-stack apps and irrelevant for JSON APIs:
+
+```ruby
+c.use_all('OpenTelemetry::Instrumentation::ActionView' => { enabled: false })
 ```
 
-2. Obtain the OTLP Auth Header from the [Last9 dashboard](https://app.last9.io).
-   The Auth header is required in the next step.
+### FilterSpanProcessor
+
+A custom `OtelFilterSpanProcessor` wraps the `BatchSpanProcessor` and drops spans before export. The following are dropped by default:
 
-3. Next, run the commands below to set the environment variables.
+| Category | Examples | Reason |
+|---|---|---|
+| DB transaction boundaries | `BEGIN`, `COMMIT`, `ROLLBACK` | 2 extra spans per transaction, no debug value |
+| Health check paths | `/health`, `/healthz`, `/ping`, `/readyz`, `/livez` | Load balancer polling noise |
+| OTLP exporter calls | Calls to your Last9 endpoint | Prevents Net::HTTP meta-tracing feedback loop |
+| Noisy Redis commands | `HGET`, `HSET`, `HMGET`, `PIPELINED`, `EXPIRE`, `TTL`, etc. | High-frequency cache ops with no actionable signal |
+
+### Tuning via environment variables
 
 ```bash
-touch .env
-cp .env.example.erb .env
+# Drop additional URL paths (comma-separated)
+OTEL_FILTER_PATHS=/admin,/metrics,/internal
+
+# Drop spans by peer hostname
+OTEL_FILTER_HOSTS=internal.svc,cache.local
+
+# Drop spans whose name contains any substring
+OTEL_FILTER_SPAN_NAMES=render_partial,render_template
+
+# Override which Redis commands to drop
+OTEL_FILTER_REDIS_COMMANDS=GET,SET,DEL,EXPIRE
+
+# Drop all spans from specific Sidekiq queues
+OTEL_FILTER_SIDEKIQ_QUEUES=mailers,low
+
+# Drop all spans from specific Sidekiq job classes
+OTEL_FILTER_SIDEKIQ_JOBS=HeartbeatJob,MetricsSyncJob
 ```
 
-4. In the `.env` file, set the value of `OTEL_EXPORTER_OTLP_HEADERS` to the OTLP
-   Authorization Header obtained from the Last9 dashboard and make sure the
-   value of the header is URL encoded.
+### Probabilistic sampling
+
+Sample a percentage of traces instead of sending everything:
 
 ```bash
-OTEL_EXPORTER_OTLP_HEADERS="Authorization="
+OTEL_SAMPLE_RATE=0.1   # 10% of traces
+OTEL_SAMPLE_RATE=0.25  # 25% of traces
 ```
 
-5. Run the Ruby on Rails application:
+Uses `parentbased_traceidratio` — downstream services respect the parent's sampling decision, so traces are never split mid-way.
+
+### Sidekiq
+
+`opentelemetry-instrumentation-sidekiq` (included via `opentelemetry-instrumentation-all`) auto-instruments Sidekiq at Rails boot — no extra setup needed for basic tracing.
+
+`config/initializers/sidekiq.rb` adds one critical hook: it calls `OpenTelemetry.tracer_provider.shutdown` on Sidekiq stop. Without this, spans buffered in the `BatchSpanProcessor` are lost when the process receives a stop signal.
 
+## OTel Collector Mode
+
+Instead of sending traces directly to Last9, you can route them through an OTel Collector. The collector handles filtering and forwarding, keeping credentials out of the app container.
+
+**Architecture:**
+```
+Rails app → OTel Collector (filter noise) → Last9
+```
+
+**Start with Docker Compose:**
 ```bash
-bin/rails server
+export LAST9_OTLP_ENDPOINT=https://otlp.last9.io:443
+export LAST9_OTLP_AUTH_HEADER="Basic "
+docker compose up
 ```
 
-6. Once the server is running, you can access the application at
-   `http://localhost:3000` by default. The API endpoints are:
+The collector config (`otel-collector/config.yaml`) drops the same noisy spans as the in-app filter:
+- `BEGIN` / `COMMIT` / `ROLLBACK` spans
+- Health check paths (`/health`, `/healthz`, `/ping`, etc.)
+- Noisy Redis commands (`HGET`, `HSET`, `PIPELINED`, etc.)
+
+This is complementary to the in-app `OtelFilterSpanProcessor` — you can use either or both.
+
+## Available Endpoints
 
-- GET `/api/v1/users` - Get all users
-- GET `/api/v1/users/:id` - Get a user by ID
-- POST `/api/v1/users` - Create a new user
-- PUT `/api/v1/users/:id` - Update a user
-- DELETE `/api/v1/users/:id` - Delete a user
+| Endpoint | Description |
+|---|---|
+| `GET /api/v1/users` | List users |
+| `GET /api/v1/users/:id` | Get a user |
+| `POST /api/v1/users` | Create a user |
+| `PUT /api/v1/users/:id` | Update a user |
+| `DELETE /api/v1/users/:id` | Delete a user |
 
-7. Sign in to [Last9 Dashboard](https://app.last9.io) and visit the APM
-   dashboard to see the traces and metrics in action.
+## References
 
-![Traces](./traces.png)
+- [OpenTelemetry Ruby docs](https://opentelemetry.io/docs/languages/ruby/)
+- [Last9 documentation](https://last9.io/docs)
diff --git a/ruby/rails-api/config/initializers/opentelemetry.rb b/ruby/rails-api/config/initializers/opentelemetry.rb
index 83772ec..19e5bc5 100644
--- a/ruby/rails-api/config/initializers/opentelemetry.rb
+++ b/ruby/rails-api/config/initializers/opentelemetry.rb
@@ -2,32 +2,165 @@
 require 'opentelemetry/exporter/otlp'
 require 'opentelemetry/instrumentation/all'
 
-# Custom SpanProcessor that adds service.namespace from request-scoped storage
+# Adds service.namespace from request-scoped storage to every span.
+# CurrentRequest resets automatically between requests — no cross-request leakage.
 class NamespaceSpanProcessor < OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor
   def on_start(span, parent_context)
-    # Get namespace from request-scoped CurrentAttributes (not baggage)
-    # CurrentRequest resets automatically between requests - no leakage
     namespace = CurrentRequest.service_namespace rescue nil
     span.set_attribute("service.namespace", namespace) if namespace
   end
 end
 
-# Exporter and Processor configuration
-otel_exporter = OpenTelemetry::Exporter::OTLP::Exporter.new
-batch_processor = OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(otel_exporter)
+# Reduces trace volume by dropping high-cardinality, low-value spans before export.
+#
+# Drops by default:
+#   - DB transaction boundary spans (BEGIN / COMMIT / ROLLBACK) — high volume, no debug value
+#   - HTTP health check paths (/health, /healthz, /ping, /readyz, /livez)
+#   - OTLP exporter's own HTTP calls (prevents meta-tracing feedback loop)
+#   - Noisy Redis commands (HGET, HSET, HMGET, HMSET, EXPIRE, TTL, EXISTS, PIPELINED)
+#
+# Configurable via env vars:
+#   OTEL_FILTER_PATHS          — comma-separated URL paths to drop (e.g. /admin,/metrics)
+#   OTEL_FILTER_HOSTS          — comma-separated hostnames to drop (e.g. internal.svc)
+#   OTEL_FILTER_SPAN_NAMES     — comma-separated span name substrings to drop
+#   OTEL_FILTER_REDIS_COMMANDS — override Redis commands to drop (e.g. GET,SET,DEL)
+#   OTEL_FILTER_SIDEKIQ_QUEUES — drop all spans from these Sidekiq queues (e.g. mailers,low)
+#   OTEL_FILTER_SIDEKIQ_JOBS   — drop all spans from these Sidekiq job classes (e.g. HeartbeatJob)
+class OtelFilterSpanProcessor
+  DB_TRANSACTION_PATTERN = /\A(BEGIN|COMMIT|ROLLBACK)/i
+
+  DEFAULT_DROP_PATHS = %w[
+    /health /healthz /ping /readyz /livez /metrics /favicon.ico
+  ].freeze
+
+  DEFAULT_REDIS_NOISE_COMMANDS = %w[
+    HGET HSET HMGET HMSET HGETALL HDEL
+    GET SET SETEX SETNX GETEX
+    EXPIRE TTL PEXPIRE PTTL EXISTS DEL
+    PIPELINED MULTI EXEC
+  ].freeze
+
+  def initialize(delegate_processor)
+    @delegate = delegate_processor
+    @drop_paths = build_drop_paths
+    @drop_hosts = build_drop_hosts
+    @drop_names = build_drop_names
+    @redis_commands = build_redis_commands
+    @sidekiq_queues = build_sidekiq_queues
+    @sidekiq_jobs = build_sidekiq_jobs
+  end
+
+  def on_start(span, parent_context)
+    @delegate.on_start(span, parent_context)
+  end
+
+  def on_finish(span)
+    @delegate.on_finish(span) unless drop?(span)
+  end
+
+  def force_flush(timeout: nil)
+    @delegate.force_flush(timeout: timeout)
+  end
+
+  def shutdown(timeout: nil)
+    @delegate.shutdown(timeout: timeout)
+  end
+
+  private
+
+  def drop?(span)
+    drop_by_span_name?(span.name) ||
+      drop_by_http_path?(span.attributes) ||
+      drop_by_peer_host?(span.attributes) ||
+      drop_redis_noise?(span) ||
+      drop_sidekiq_noise?(span)
+  end
+
+  def drop_by_span_name?(name)
+    return true if name.match?(DB_TRANSACTION_PATTERN)
+    @drop_names.any? { |pattern| name.include?(pattern) }
+  end
+
+  def drop_by_http_path?(attrs)
+    target = attrs['http.target'] || attrs['url.path'] || ''
+    return false if target.empty?
+    @drop_paths.any? { |p| target == p || target.start_with?("#{p}/") }
+  end
+
+  def drop_by_peer_host?(attrs)
+    host = attrs['net.peer.name'] || attrs['server.address'] || ''
+    return false if host.empty?
+    @drop_hosts.any? { |h| host == h || host.end_with?(".#{h}") }
+  end
+
+  def drop_redis_noise?(span)
+    return false unless span.attributes['db.system'] == 'redis'
+    @redis_commands.include?(span.name.upcase)
+  end
+
+  def drop_sidekiq_noise?(span)
+    return false unless span.attributes['messaging.system'] == 'sidekiq'
+    queue = span.attributes['messaging.destination'] || ''
+    job = span.attributes['messaging.sidekiq.job_class'] || ''
+    @sidekiq_queues.include?(queue) || @sidekiq_jobs.include?(job)
+  end
+
+  def build_drop_paths
+    env = ENV.fetch('OTEL_FILTER_PATHS', '').split(',').map(&:strip).reject(&:empty?)
+    (DEFAULT_DROP_PATHS + env).uniq
+  end
+
+  def build_drop_hosts
+    hosts = []
+    if (endpoint = ENV['OTEL_EXPORTER_OTLP_ENDPOINT'])
+      uri = URI.parse(endpoint) rescue nil
+      hosts << uri.host if uri&.host
+    end
+    env = ENV.fetch('OTEL_FILTER_HOSTS', '').split(',').map(&:strip).reject(&:empty?)
+    (hosts + env).uniq
+  end
+
+  def build_drop_names
+    ENV.fetch('OTEL_FILTER_SPAN_NAMES', '').split(',').map(&:strip).reject(&:empty?)
+  end
+
+  def build_redis_commands
+    env = ENV.fetch('OTEL_FILTER_REDIS_COMMANDS', '')
+    return DEFAULT_REDIS_NOISE_COMMANDS.to_set if env.empty?
+    env.split(',').map { |c| c.strip.upcase }.reject(&:empty?).to_set
+  end
+
+  def build_sidekiq_queues
+    ENV.fetch('OTEL_FILTER_SIDEKIQ_QUEUES', '').split(',').map(&:strip).reject(&:empty?).to_set
+  end
+
+  def build_sidekiq_jobs
+    ENV.fetch('OTEL_FILTER_SIDEKIQ_JOBS', '').split(',').map(&:strip).reject(&:empty?).to_set
+  end
+end
+
+otel_exporter = OpenTelemetry::Exporter::OTLP::Exporter.new
+batch_processor = OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(otel_exporter)
+filter_processor = OtelFilterSpanProcessor.new(batch_processor)
 namespace_processor = NamespaceSpanProcessor.new(otel_exporter)
 
 OpenTelemetry::SDK.configure do |c|
-  # Add processors - namespace processor adds attributes, batch processor exports
   c.add_span_processor(namespace_processor)
-  c.add_span_processor(batch_processor)
+  c.add_span_processor(filter_processor)
+
+  # Probabilistic sampling via OTEL_SAMPLE_RATE (0.0–1.0).
+  # Uses parentbased_traceidratio so downstream services respect the parent's decision.
+  if (rate = ENV['OTEL_SAMPLE_RATE']&.to_f) && rate < 1.0
+    c.sampler = OpenTelemetry::SDK::Trace::Samplers.parent_based(
+      root: OpenTelemetry::SDK::Trace::Samplers.trace_id_ratio_based(rate.clamp(0.0, 1.0))
+    )
+  end
 
-  # Resource configuration
   c.resource = OpenTelemetry::SDK::Resources::Resource.create({
-    OpenTelemetry::SemanticConventions::Resource::SERVICE_NAME => 'ruby-on-rails-api-service',
+    OpenTelemetry::SemanticConventions::Resource::SERVICE_NAME => ENV['OTEL_SERVICE_NAME'] || 'ruby-on-rails-api-service',
     OpenTelemetry::SemanticConventions::Resource::SERVICE_VERSION => "0.0.0",
     OpenTelemetry::SemanticConventions::Resource::DEPLOYMENT_ENVIRONMENT => Rails.env.to_s
   })
 
-  c.use_all() # enables all instrumentation!
+  c.use_all('OpenTelemetry::Instrumentation::ActionView' => { enabled: false })
 end
diff --git a/ruby/rails-api/config/initializers/sidekiq.rb b/ruby/rails-api/config/initializers/sidekiq.rb
new file mode 100644
index 0000000..f159729
--- /dev/null
+++ b/ruby/rails-api/config/initializers/sidekiq.rb
@@ -0,0 +1,21 @@
+# Sidekiq + OpenTelemetry initialization
+#
+# opentelemetry-instrumentation-sidekiq (included via opentelemetry-instrumentation-all)
+# auto-instruments Sidekiq via middleware when use_all() runs at Rails boot.
+#
+# This file handles two concerns:
+#   1. Ensuring the OTel SDK shuts down cleanly when Sidekiq stops (flushes pending spans)
+#   2. Providing a place to configure queue/job-level filtering via env vars:
+#
+#   OTEL_FILTER_SIDEKIQ_QUEUES — drop all spans from these queues (e.g. mailers,low)
+#   OTEL_FILTER_SIDEKIQ_JOBS   — drop all spans from these job classes (e.g. HeartbeatJob)
+
+if defined?(Sidekiq)
+  Sidekiq.configure_server do |config|
+    # Flush and shut down the OTel SDK when Sidekiq receives a stop signal.
+    # Without this, spans buffered in the BatchSpanProcessor may be lost on shutdown.
+    config.on(:shutdown) do
+      OpenTelemetry.tracer_provider.shutdown
+    end
+  end
+end
diff --git a/ruby/rails-api/docker-compose.yaml b/ruby/rails-api/docker-compose.yaml
new file mode 100644
index 0000000..93c7a6e
--- /dev/null
+++ b/ruby/rails-api/docker-compose.yaml
@@ -0,0 +1,39 @@
+services:
+  app:
+    build: .
+    ports:
+      - "3000:3000"
+    environment:
+      RAILS_ENV: production
+      OTEL_SERVICE_NAME: ruby-on-rails-api-service
+      OTEL_TRACES_EXPORTER: otlp
+      # Send to the local collector, not directly to Last9
+      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
+      OTEL_LOG_LEVEL: error
+      # Optional: probabilistic sampling (e.g. 0.1 = 10%)
+      # OTEL_SAMPLE_RATE: "0.1"
+      # Optional: Sidekiq queue/job filters
+      # OTEL_FILTER_SIDEKIQ_QUEUES: mailers,low
+      # OTEL_FILTER_SIDEKIQ_JOBS: HeartbeatJob
+    depends_on:
+      otel-collector:
+        condition: service_healthy
+
+  otel-collector:
+    image: otel/opentelemetry-collector-contrib:0.144.0
+    command: ["--config=/etc/otel/config.yaml"]
+    volumes:
+      - ./otel-collector/config.yaml:/etc/otel/config.yaml:ro
+    ports:
+      - "4317:4317"    # OTLP gRPC
+      - "4318:4318"    # OTLP HTTP
+      - "13133:13133"  # health_check
+    environment:
+      # Last9 credentials — set these in your shell or a .env file
+      LAST9_OTLP_ENDPOINT: ${LAST9_OTLP_ENDPOINT}
+      LAST9_OTLP_AUTH_HEADER: ${LAST9_OTLP_AUTH_HEADER}
+    healthcheck:
+      test: ["CMD", "wget", "--spider", "-q", "http://localhost:13133"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
diff --git a/ruby/rails-api/otel-collector/config.yaml b/ruby/rails-api/otel-collector/config.yaml
new file mode 100644
index 0000000..0aa8cc2
--- /dev/null
+++ b/ruby/rails-api/otel-collector/config.yaml
@@ -0,0 +1,78 @@
+# OpenTelemetry Collector configuration for ruby/rails-api
+#
+# This collector receives spans from the Rails app, drops noisy spans,
+# then exports clean data to Last9.
+#
+# Flow: Rails app → OTel Collector (filter) → Last9
+#
+# Usage:
+#   Set OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 in the Rails app
+#   so it sends to the collector instead of directly to Last9.
+#   The collector holds the Last9 credentials.
+#
+# Start: docker compose up
+#    or: docker run -v $(pwd)/otel-collector/config.yaml:/etc/otel/config.yaml \
+#          -p 4317:4317 -p 4318:4318 \
+#          otel/opentelemetry-collector-contrib:0.144.0 \
+#          --config /etc/otel/config.yaml
+
+receivers:
+  otlp:
+    protocols:
+      grpc:
+        endpoint: 0.0.0.0:4317
+      http:
+        endpoint: 0.0.0.0:4318
+
+processors:
+  batch:
+    timeout: 5s
+    send_batch_size: 512
+
+  # Optional: probabilistic sampling at the collector level.
+  # sampling_percentage: 10 means keep 10% of traces.
+  # Uses hash of TraceID — consistent across collector replicas, same trace always
+  # gets the same decision. Uncomment and add to the pipeline to enable.
+  # probabilistic_sampler:
+  #   sampling_percentage: 10
+
+  filter/drop_noise:
+    error_mode: ignore
+    traces:
+      span:
+        # DB transaction boundary spans — 2 extra spans per transaction, no debug value
+        - 'IsMatch(name, "^(BEGIN|COMMIT|ROLLBACK)")'
+
+        # Health check paths polled by load balancers every few seconds
+        - 'IsMatch(attributes["http.target"], "^(/health|/healthz|/ping|/readyz|/livez|/metrics|/favicon\\.ico)(/.*)?$")'
+
+        # Noisy Redis commands — high-frequency cache ops with no actionable signal
+        - |
+          attributes["db.system"] == "redis" and
+          IsMatch(name, "^(HGET|HSET|HMGET|HMSET|HGETALL|HDEL|GET|SET|SETEX|SETNX|GETEX|EXPIRE|TTL|PEXPIRE|PTTL|EXISTS|DEL|PIPELINED|MULTI|EXEC)$")
+
+exporters:
+  otlp:
+    endpoint: ${env:LAST9_OTLP_ENDPOINT}
+    headers:
+      Authorization: ${env:LAST9_OTLP_AUTH_HEADER}
+    tls:
+      insecure: false
+
+  # Uncomment to debug: logs all spans that pass the filter to stdout
+  # debug:
+  #   verbosity: detailed
+
+extensions:
+  health_check:
+    endpoint: 0.0.0.0:13133
+
+service:
+  extensions: [health_check]
+  pipelines:
+    traces:
+      receivers: [otlp]
+      # To enable probabilistic sampling, add probabilistic_sampler before batch:
+      # processors: [filter/drop_noise, probabilistic_sampler, batch]
+      processors: [filter/drop_noise, batch]
+      exporters: [otlp]
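
Reviewer sketch (not part of the patch): the path and host rules in `OtelFilterSpanProcessor` are easy to check in isolation. The snippet below copies the two matching expressions from `drop_by_http_path?` and `drop_by_peer_host?` into standalone methods, so it runs without the OpenTelemetry gems. The names `SKETCH_DROP_PATHS`, `sketch_drop_path?`, and `sketch_drop_host?` are hypothetical and exist only for this illustration.

```ruby
# Standalone sketch of the matching rules used by OtelFilterSpanProcessor.
# All names here are hypothetical; only the two boolean expressions are
# taken from the patch.
SKETCH_DROP_PATHS = %w[/health /healthz /ping /readyz /livez /metrics /favicon.ico].freeze

# A path is dropped on an exact match or when it sits below a listed path.
def sketch_drop_path?(target, drop_paths = SKETCH_DROP_PATHS)
  return false if target.nil? || target.empty?
  drop_paths.any? { |p| target == p || target.start_with?("#{p}/") }
end

# A host is dropped on an exact match or when it is a subdomain of a listed host.
def sketch_drop_host?(host, drop_hosts)
  return false if host.nil? || host.empty?
  drop_hosts.any? { |h| host == h || host.end_with?(".#{h}") }
end

puts sketch_drop_path?('/health')          # exact match
puts sketch_drop_path?('/health/db')       # nested under a listed path
puts sketch_drop_path?('/healthcheck')     # shares a prefix only, so it is kept
puts sketch_drop_path?('/health?probe=1')  # query string is not stripped, so it is kept
puts sketch_drop_host?('a.internal.svc', %w[internal.svc])
```

One thing the sketch makes visible: a target that carries a query string, such as `/health?probe=1`, does not match under the rule as written. If `http.target` includes the query string in a given setup, stripping it before comparison may be worth considering.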