Releases: pambrose/prometheus-proxy

v3.2.0

14 Jun 04:51

Prometheus Proxy 3.2.0 focuses on security hardening, fail-fast validation, a full end-to-end Testcontainers suite, richer observability, and a major documentation expansion (including a Kubernetes deployment guide).

🚀 New Features

  • Pre-shared agent token for authenticating agent gRPC connections — --agent_token / AGENT_TOKEN (proxy.agentToken / agent.agentToken). The proxy rejects any RPC with a missing or mismatched token (UNAUTHENTICATED, constant-time digest comparison). Empty (the default) preserves the existing open behavior and logs a startup warning unless mutual TLS is configured. The value is never logged.
  • Per-CA HTTPS trust store for the agent's scrape client — --https_truststore / --https_truststore_password (HTTPS_TRUST_STORE_PATH / HTTPS_TRUST_STORE_PASSWORD) verify HTTPS targets against a custom/private CA without disabling validation.
  • Full Testcontainers end-to-end suite (io.prometheus.containers) — a smoke test plus seven specs over real Netty/Docker (proxy HTTP surfaces, agent-token auth, consolidated merge, chunked + gzipped large payloads, agent reconnect, gRPC TLS, HTTPS targets), plus a parameter-driven ContainersScalingTest. All gated on RUN_CONTAINER_TESTS=true.
  • make help, a make container-tests target with Docker-context auto-detection, and a container-tests GitHub workflow.
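
The constant-time digest comparison mentioned for the agent token can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names are hypothetical, and hashing with SHA-256 before comparing is an assumption about what "digest comparison" means here.

```kotlin
import java.security.MessageDigest

// Hypothetical helper: hash a token so both sides of the comparison have equal length.
private fun sha256(text: String): ByteArray =
    MessageDigest.getInstance("SHA-256").digest(text.toByteArray(Charsets.UTF_8))

// Hypothetical check: a missing token never matches; otherwise the digests are
// compared with MessageDigest.isEqual, which does not stop at the first differing byte.
fun tokensMatch(expected: String, presented: String?): Boolean {
    if (presented == null) return false
    return MessageDigest.isEqual(sha256(expected), sha256(presented))
}
```

A caller would reject the RPC (for example with UNAUTHENTICATED) whenever tokensMatch returns false.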

🔐 Security

  • Mitigates the unauthenticated agent-registration / path-hijacking finding via the optional pre-shared agent token above.
  • Prevents agent-supplied service-discovery labels from overriding proxy-computed reserved keys (__metrics_path__, agentName, hostName).
  • Redacts query-parameter values (not just user:pass@ userinfo) wherever a scrape URL is logged or echoed to Prometheus, so secrets in ?token=… no longer leak.
  • Derives the agent HttpClientCache key from a salted HMAC-SHA256 digest instead of plaintext username:password.
  • Bounds the agent's scrape response-body read (reads at most maxContentLength + 1 bytes), closing an OOM path for targets with no/understated Content-Length.
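
Two of the items above, query-value redaction and the bounded body read, lend themselves to a short sketch. The helper names and the REDACTED placeholder are assumptions made for illustration; only the "read at most maxContentLength + 1 bytes" idea comes from the notes.

```kotlin
import java.io.IOException
import java.io.InputStream

// Hypothetical helper: keep parameter names, replace every value with a placeholder.
fun redactQueryValues(url: String): String {
    val queryStart = url.indexOf('?')
    if (queryStart < 0) return url
    val fragmentStart = url.indexOf('#', queryStart).let { if (it < 0) url.length else it }
    val redacted = url.substring(queryStart + 1, fragmentStart)
        .split('&')
        .joinToString("&") { param ->
            val eq = param.indexOf('=')
            if (eq < 0) param else param.substring(0, eq) + "=REDACTED"
        }
    return url.substring(0, queryStart + 1) + redacted + url.substring(fragmentStart)
}

// Hypothetical helper: reading one byte past the limit is enough to detect an
// oversized body without trusting (or needing) a Content-Length header.
fun readBounded(input: InputStream, maxContentLength: Int): ByteArray {
    require(maxContentLength in 0 until Int.MAX_VALUE) { "maxContentLength out of range" }
    val bytes = input.readNBytes(maxContentLength + 1)
    if (bytes.size > maxContentLength) {
        throw IOException("Response body exceeds $maxContentLength bytes")
    }
    return bytes
}
```

Because readNBytes stops after the requested count, memory use stays bounded by the limit regardless of how much the target sends.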

🔁 Behavior Changes

  • Embedded agents (startAsyncAgent, exitOnMissingConfig=false) now throw the new public io.prometheus.common.ConfigLoadException on a config-load failure instead of calling exitProcess; standalone agents and the proxy still exit.
  • ProxyOptions, BaseOptions, and AgentOptions now validate ports, gRPC timeouts, and scrape/inactivity timeouts at startup, failing fast with clear messages.
  • Removed the deprecated all log level — use trace for the most verbose output.
  • Per-request call logging emits at DEBUG instead of INFO when enabled.
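
Fail-fast option validation of the kind described above might look like the sketch below. ExampleOptions is a hypothetical stand-in, not the real ProxyOptions or AgentOptions, and the accepted ranges are assumptions.

```kotlin
// Hypothetical option holder used only for this illustration.
data class ExampleOptions(val port: Int, val scrapeTimeoutSecs: Int)

// Validate once at startup so a bad value fails immediately with a clear message
// instead of surfacing later as an obscure runtime error.
fun ExampleOptions.validate() {
    require(port in 1..65535) { "port must be in 1..65535, was $port" }
    require(scrapeTimeoutSecs > 0) { "scrapeTimeoutSecs must be positive, was $scrapeTimeoutSecs" }
}
```

An embedding application could catch the resulting exception and decide how to recover, which is the same motivation behind throwing ConfigLoadException instead of exiting the process.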

📊 Observability

  • Added an outcome label to proxy_scrape_request_latency_seconds; latency is now also recorded for the timeout and agent-disconnected paths.
  • Labeled proxy_start_time_seconds with a per-process launch_id.
  • Scrape results dropped on connection-close are now counted as agent_scrape_result_count{type="dropped"}.

🐛 Bug Fixes

  • Fixed DnsNameResolverProvider / PickFirstLoadBalancerProvider missing from the shaded agentJar/proxyJar (Shadow 9.4.2 dropped same-named META-INF/services entries), which had made gRPC default to the unix scheme on non-IP hostnames.
  • Fixed embedded Agent.stop() / EmbeddedAgentInfo.shutdown() spawning a zombie reconnect thread; shutdown now routes through the Guava lifecycle and blocks until terminated.
  • Fixed registration of multi-segment paths (e.g. app/metrics): they are now rejected, where previously they appeared in service discovery but 404'd at scrape time.
  • Fixed appendQueryParams URL-decoding the encoded query blob before concatenation.
  • Fixed a per-response processing error blocking the HTTP handler until scrapeRequestTimeoutSecs; the scrape now fails immediately.
  • Fixed flaky ProxyHttpRoutesTest connection-reset by replacing the TCP probe with an HTTP-level readiness check.
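
To illustrate why decoding an already-encoded query before concatenation is a bug, here is a small sketch. appendEncodedQuery is a hypothetical helper, not the project's appendQueryParams; it only shows the general idea of appending the encoded text verbatim.

```kotlin
import java.net.URLDecoder

// Hypothetical helper: the query string is already percent-encoded, so it is
// appended as-is. Decoding it first would turn "%26" into a literal "&" and
// silently split one parameter value into two parameters.
fun appendEncodedQuery(url: String, encodedQuery: String): String =
    when {
        encodedQuery.isEmpty() -> url
        '?' in url -> "${url}&${encodedQuery}"
        else -> "${url}?${encodedQuery}"
    }

// What the buggy path effectively did to the query before concatenation.
fun decodedForComparison(encodedQuery: String): String =
    URLDecoder.decode(encodedQuery, Charsets.UTF_8)
```

For the input q=a%26b, the verbatim append keeps a single parameter q, while the decoded form q=a&b would be read as two parameters.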

🐳 Docker Images

  • The proxy/agent images now run on Java 25 (LTS) via the eclipse-temurin:25-jre base (Ubuntu-based, pinned by manifest-list digest), replacing the previous alpine + apk add openjdk17-jre build. The published fat JARs remain Java 17 bytecode and run unchanged on the newer JRE, so self-run and embedded-agent usage still requires only Java 17.
  • Multi-arch coverage is amd64, arm64, s390x, and ppc64le (Temurin is one of the few JDK distributions that publishes the s390x/ppc64le ports).
  • The container ENTRYPOINT adds --enable-native-access=ALL-UNNAMED and --sun-misc-unsafe-memory-access=allow to silence the JDK 25 startup warnings from jansi and netty.

📚 Documentation

  • New Kubernetes deployment guide with ready-to-use proxy/agent manifests, standalone and sidecar patterns, gRPC exposure, and Prometheus Operator (ServiceMonitor) integration.
  • New glossary, troubleshooting, production, Grafana & alerting, and example-config pages; README now links into the docs site per section.

🧰 Build, Tooling & Code Quality

  • Removed dead config keys (proxy.http.maxThreads/minThreads, proxy.internal.scrapeRequestCheckMillis) and annotated unimplemented knobs.
  • BuildConfig timestamps now read fresh each build via ValueSource.
  • Moved detekt config to config/detekt/ and wired detekt into make lint; tests run in CI with Kover coverage uploaded to Codecov.
  • Extensive no-behavior-change refactors across the agent, proxy, and common modules, plus broad new test coverage.

📦 Artifacts

Docker:

docker pull pambrose/prometheus-proxy:3.2.0
docker pull pambrose/prometheus-agent:3.2.0

Maven Central:

implementation("com.pambrose:prometheus-proxy:3.2.0")

See CHANGELOG.md for the complete, itemized list of changes.

Full Changelog: 3.1.1...3.2.0

v3.1.1

30 Apr 18:05

Maintenance release focused on public-API documentation, reproducible builds, flaky test fixes, and dependency updates.

Highlights

  • Documented public API — Full KDoc on every @Parameter field of BaseOptions / AgentOptions / ProxyOptions, every EnvVars value, the Agent and Proxy companion entry points, and EmbeddedAgentInfo, covering resolution precedence (CLI → env → config → default), sentinel values, and validation rules.
  • Reproducible builds: BuildConfig.APP_RELEASE_DATE and BuildConfig.BUILD_TIME accept -PoverrideReleaseDate / -PoverrideBuildTime Gradle properties so CI can produce bit-identical artifacts.
  • Flaky test fixes — Replaced timing-based probes in AgentTest and AgentHttpServiceTest with deterministic readiness gates, eliminating two long-standing CI flakes.
  • Cleaner build script — Centralized repositories in settings.gradle.kts, dropped the redundant fat-jar rewrap, removed the redundant java plugin alias, and aligned dependsOn calls on tasks.named().

Bug Fixes

  • Fix flaky AgentTest "Bug #1" coroutine backpressure test — sample point could land between batches and observe 0 active coroutines. Replaced with a deterministic CompletableDeferred gate plus Kotest eventually() for scheduler jitter.
  • Fix flaky AgentHttpServiceTest — fixed 100 ms post-server.start delay was insufficient on busy machines. Replaced with an active TCP-connect probe (20 ms poll, 5 s deadline).
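
An active TCP-connect probe with a 20 ms poll and a 5 s deadline could be sketched like this. The function name, signature, and the 250 ms per-attempt connect timeout are assumptions; the project's actual test helper is not shown in these notes.

```kotlin
import java.io.IOException
import java.net.InetSocketAddress
import java.net.Socket

// Hypothetical probe: retry a TCP connect until it succeeds or the deadline passes.
fun awaitTcpPort(
    host: String,
    port: Int,
    pollMillis: Long = 20,
    deadlineMillis: Long = 5_000,
): Boolean {
    val deadline = System.nanoTime() + deadlineMillis * 1_000_000
    while (System.nanoTime() < deadline) {
        try {
            // The per-attempt connect timeout (250 ms) is an arbitrary choice.
            Socket().use { it.connect(InetSocketAddress(host, port), 250) }
            return true
        } catch (e: IOException) {
            // Not listening yet: wait one poll interval and try again.
            Thread.sleep(pollMillis)
        }
    }
    return false
}
```

Unlike a fixed delay, the probe returns as soon as the server accepts connections and only waits the full deadline when it never comes up.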

Build & Tooling

  • Centralize repository declarations in settings.gradle.kts via dependencyResolutionManagement(FAIL_ON_PROJECT_REPOS); mavenLocal() is opt-in with -PuseMavenLocal=true.
  • Replace the agentJar/proxyJar zipTree-rewrap with two ShadowJar tasks (configuration-cache safe; one fewer redundant fat jar on disk).
  • Drop the redundant java plugin (applied transitively by kotlin.jvm).
  • Switch compileKotlin.dependsOn(":generateProto") to tasks.named("generateProto") for type-safe task references.
  • Mark the internal Utils object as internal.
  • Hoist formatter, releaseDate, and buildTime out of the buildConfig {} block to top-level vals.
  • Centralize test server readiness in a shared startServerAndGetPort helper.
  • Add check-gpg-env Makefile target for GPG signing validation.
  • Fix the date format passed by the build and local-build Makefile targets to match the MM/dd/yyyy pattern parsed by build.gradle.kts.
  • Add Claude Code GitHub workflow.

Dependency Updates

Dependency       Old            New
Kotlin           2.3.20         2.3.21
Gradle wrapper   9.4.1          9.5.0
Ktor             3.4.2          3.4.3
serialization    1.10.0         1.11.0
tcnative         2.0.74.Final   2.0.77.Final
utils            2.7.1          2.8.1
gradle-plugins   1.0.12         1.0.14
protobuf         0.9.6          0.10.0
taskinfo         3.0.1          3.0.2

Full Changelog: 3.1.0...3.1.1

v3.1.0

06 Apr 22:19
a922570

Breaking Changes

  • Maven coordinates changed: Published to Maven Central as com.pambrose:prometheus-proxy
  • JitPack is no longer used; all dependencies resolve from Maven Central

New Features

  • Add Zensical documentation site with comprehensive guides, code examples, and architecture diagrams
  • Publish documentation to GitHub Pages via CI

Build & Tooling

  • Migrate publishing from JitPack to Maven Central using vanniktech maven-publish plugin
  • Replace manual maven-publish + sources/javadoc JAR tasks with mavenPublishing DSL
  • Remove JitPack plugin resolution strategy from settings.gradle.kts
  • Remove jitpack.yml
  • Add GPG signing for Maven Central (skipped when no key is provided)
  • Add google() repository to build script
  • Add overrideVersion property support for snapshot publishing
  • Import VisibilityModifier directly instead of using fully qualified name in Dokka config

Documentation

  • Add full documentation site in website/prometheus-proxy/ with 13 pages covering architecture, getting started, configuration, security/TLS, Docker, embedded agent, service discovery, monitoring, CLI reference, and advanced topics
  • Add code example snippets imported via pymdownx.snippets
  • Extract Java/Kotlin code examples into compilable source files so API changes are caught by the compiler
  • Add mkdocs-material dependency and grid card layouts for Next Steps sections
  • Add markdown extensions: admonition, details, attr_list, md_in_html, pymdownx.emoji with material icon support
  • Add KDocs nav entry with API Reference section
  • Update README.md with Maven Central badge, documentation site link, and dependency coordinates

Dependencies

  • Bump utils to 2.7.1
  • Bump Kotest to 6.1.10, Ktor to 3.4.2, Logback to 1.5.32
  • Bump gradle-plugins to 1.0.12, Protoc to 4.34.1, Dropwizard to 4.2.38
  • Bump Dokka to 2.2.0, maven-publish plugin to 0.36.0, Kover to 0.9.8

Metrics & Observability

  • Add new proxy metrics: proxy_chunk_validation_failures_total, proxy_chunked_transfers_abandoned_total, proxy_agent_displacement_total, proxy_scrape_response_bytes
  • Convert proxy and agent latency metrics from summaries to histograms
  • Add new agent metrics: agent_client_cache_size, agent_scrape_backlog_size
  • Add path and encoding labels to proxy response metrics
  • Rebuild Grafana dashboards for new metric schema

Bug Fixes

  • Fix flaky HttpClientCacheTest by ensuring deterministic LRU eviction order
  • Fix scrape response bytes metric to observe correct unzipped size

Misc

  • Use portable bash shebang (#!/usr/bin/env bash) in bin/ scripts
  • Extract Docker image version from build.gradle.kts in bin/ scripts
  • Remove .superset config files
  • Remove legacy files and clean up .gitignore

Full Changelog: 3.0.3...3.1.0

v3.0.3

19 Mar 06:00
042cc1d

Dependency Updates

Dependency       Old      New
Kotlin           2.3.10   2.3.20
Gradle wrapper   9.2.0    9.4.0
gRPC             1.79.0   1.80.0
Kotest           6.1.3    6.1.7
Protoc           4.33.5   4.34.0
utils            2.5.3    2.6.3
gradle-plugins   1.0.8    1.0.10
config plugin    6.0.7    6.0.9

Build & Tooling

  • Extract JitPack URLs into reusable Makefile variables (JITPACK_BUILD_URL, JITPACK_API_URL)
  • Enable Gradle configuration caching and daemon for faster builds
  • Add homepage link to plugins configuration in build.gradle.kts
  • Update .gitignore to include test configuration files
  • Use forEach instead of map in coroutine launches for clarity in AgentConnectionContextTest

Documentation & Cleanup

  • Add GitHub workflow commands and API documentation section to README
  • Remove outdated GEMINI.md, AGENTS.md, and OpenSpec instructions
  • Remove legacy documentation and workflows
  • Clean up CLAUDE.md

See Release Notes for full details.

v3.0.0

16 Feb 00:19
5cc70c4

Prometheus Proxy 3.0.0 (AKA Claude Code massive cleanup)


Bug Fixes

Data Integrity & Correctness

  • Fix integer overflow in ChunkedContext.totalByteCount (Int → Long) that could silently bypass size limits on large
    payloads
  • Fix chunk checksum calculation to use actual byte count instead of full buffer size
  • Fix toScrapeResponseHeader to propagate the actual srZipped value (was hardcoded to true)
  • Fix applySummary to propagate the headerZipped value from chunked response headers
  • Fix IOException error code from NotFound (404) to ServiceUnavailable (503) — semantically correct for
    unreachable targets
  • Fix catch-all HTTP exception handler from NotFound (404) to InternalServerError (500)
  • Fix errorCode() to walk the exception cause chain for wrapped timeout exceptions
  • Fix OpenMetrics # EOF marker handling in consolidated responses — intermediate # EOF markers are now stripped
  • Fix parseHostPort to strip brackets from IPv6 addresses in HostPort: [::1]:50051 now yields host ::1 instead
    of [::1]
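
The IPv6 bracket handling can be illustrated with a small parser. This is a sketch under stated assumptions: ParsedHostPort, parseHostPortSketch, and parsePortSketch are hypothetical names, and the real parseHostPort signature and defaults may differ. Unbracketed IPv6 literals are not handled here.

```kotlin
// Hypothetical result type for this illustration.
data class ParsedHostPort(val host: String, val port: Int)

// Hypothetical port parser: rejects non-numeric and out-of-range values.
fun parsePortSketch(text: String): Int {
    val port = text.toIntOrNull() ?: throw IllegalArgumentException("Invalid port: $text")
    require(port in 1..65535) { "Port out of range: $port" }
    return port
}

// Hypothetical host:port parser that strips the brackets around an IPv6 literal.
fun parseHostPortSketch(value: String, defaultPort: Int): ParsedHostPort {
    require(value.isNotBlank()) { "host:port must not be blank" }
    val text = value.trim()
    if (text.startsWith("[")) {
        val close = text.indexOf(']')
        require(close > 0) { "Missing ']' in IPv6 address: $text" }
        val rest = text.substring(close + 1)
        require(rest.isEmpty() || rest.startsWith(":")) { "Unexpected text after ']': $rest" }
        val port = if (rest.isEmpty()) defaultPort else parsePortSketch(rest.substring(1))
        return ParsedHostPort(text.substring(1, close), port)
    }
    val colon = text.lastIndexOf(':')
    if (colon < 0) return ParsedHostPort(text, defaultPort)
    return ParsedHostPort(text.substring(0, colon), parsePortSketch(text.substring(colon + 1)))
}
```

With this sketch, [::1]:50051 yields host ::1 and port 50051, matching the behavior described in the fix.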

Concurrency & Resource Management

  • Fix TOCTOU race in AgentContextCleanupService — agents are now re-checked for staleness before eviction
  • Fix negative scrapeRequestBacklogSize with atomic CAS-loop decrement clamped at zero
  • Fix ConcurrentModificationException in ProxyPathManager.removePathsForAgentId and recentReqs access
  • Fix HttpClientCache.close() deadlock — coroutine scope cancelled before acquiring mutex
  • Fix HTTP client close calls moved outside mutex to avoid blocking cache operations during slow I/O
  • Fix idle HTTP clients now closed on eviction (previously only marked for close)
  • Fix AgentHttpService now properly closed during agent shutdown (resource leak)
  • Fix path registration concurrency by moving gRPC calls outside the mutex
  • Fix AgentClientInterceptor to use the next channel parameter instead of bypassing the interceptor chain
  • Fix synchronized agentId assignment in AgentClientInterceptor to prevent race condition
  • Fix ScrapeRequestWrapper.markComplete() is now idempotent via AtomicBoolean.compareAndSet
  • Fix runCatching replaced with runCatchingCancellable throughout to avoid swallowing CancellationException
  • Fix agent context added after ID validation to prevent orphaned contexts
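
Two of the concurrency fixes above, the CAS-loop decrement clamped at zero and the idempotent markComplete(), follow well-known atomic patterns. The sketch below uses hypothetical names and is not the project's ScrapeRequestWrapper or backlog counter.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean
import java.util.concurrent.atomic.AtomicInteger

// Decrement with compare-and-set, retrying on contention, and never go below zero.
// Returns the value after the decrement (0 if the counter was already at zero).
fun decrementClampedAtZero(counter: AtomicInteger): Int {
    var current = counter.get()
    while (current > 0 && !counter.compareAndSet(current, current - 1)) {
        current = counter.get() // another thread won the race: re-read and retry
    }
    return if (current > 0) current - 1 else 0
}

// Hypothetical completion flag: only the first caller flips false -> true,
// so repeated or concurrent calls have no further effect.
class CompletionFlag {
    private val completed = AtomicBoolean(false)

    fun markComplete(): Boolean = completed.compareAndSet(false, true)
}
```

The return value of markComplete() lets a caller run completion side effects exactly once.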

Error Handling & Cleanup

  • Fix orphaned ChunkedContext cleanup on stream failure — associated scrape requests are now explicitly failed
  • Fix chunk validation errors now throw ChunkValidationException instead of crashing the gRPC stream
  • Fix readRequestsFromProxy throws StatusException(NOT_FOUND) when agent context is missing (was silently no-op)
  • Fix connectAgent/connectAgentWithTransportFilterDisabled throw StatusException(FAILED_PRECONDITION) instead of
    RequestFailureException
  • Fix sendHeartBeat re-throws NOT_FOUND status to trigger agent reconnection (was zombie state)
  • Fix agent invalidation now drains pending scrape requests and unblocks HTTP handlers immediately
  • Fix handleConnectionFailure re-throws JVM Error subclasses instead of retrying in a corrupted state
  • Fix stream cleanup for transportFilterDisabled mode in readRequestsFromProxy finally block

Security

  • Fix credential leak in HttpClientCache logs — ClientKey.toString() now masks credentials
  • Fix password CharArray zeroed after use in SslSettings.getKeyStore
  • Fix FileInputStream resource leak in SslSettings — now uses try-with-resources
  • Fix URL sanitization in agent logs to strip credentials before logging
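
The password-zeroing and stream-closing fixes can be combined in one pattern. The function below is a hypothetical sketch, not SslSettings.getKeyStore itself; the supplier parameter and default keystore type are assumptions.

```kotlin
import java.io.InputStream
import java.security.KeyStore

// Hypothetical loader: the stream is closed by `use` (Kotlin's counterpart to
// try-with-resources) and the password array is wiped even if loading fails.
fun loadKeyStore(
    open: () -> InputStream,
    password: CharArray,
    type: String = KeyStore.getDefaultType(),
): KeyStore {
    try {
        return open().use { stream ->
            KeyStore.getInstance(type).apply { load(stream, password) }
        }
    } finally {
        // Overwrite the secret so it does not linger in memory.
        password.fill('\u0000')
    }
}
```

Callers should treat the password array as consumed after the call, since its contents are overwritten.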

Misc

  • Fix gzip compression for small responses — enforced minimumSize(1024) in ProxyHttpConfig
  • Fix redundant response.status() call in ProxyUtils.respondWith
  • Fix service discovery and metrics paths now ensure leading /
  • Fix dynamic parameter handling to correctly set system properties
  • Fix registerPath/registerAgent/sendHeartBeat responses only set reason field when valid is false
  • Fix typo: "Overide" → "Override" in config and ConfigVals

New Features

  • Content size limits — New configurable limits to prevent zip bombs and unbounded memory:
    • proxy.internal.maxZippedContentSizeMBytes (default 5 MB)
    • proxy.internal.maxUnzippedContentSizeMBytes (default 10 MB)
    • agent.http.maxContentLengthMBytes / AGENT_MAX_CONTENT_LENGTH_MBYTES (default 10 MB)
  • Unary RPC deadline: agent.grpc.unaryDeadlineSecs / UNARY_DEADLINE_SECS (default 30s) prevents unary gRPC
    calls from hanging indefinitely
  • Graceful scrape request failure — Orphaned scrape requests are failed with proper status on agent disconnect,
    stream termination, chunk validation failure, and proxy shutdown
  • Consolidated/non-consolidated mismatch rejection: addPath now rejects mismatched agent types on the same path
    with a descriptive error
  • Authorization header TLS warning — One-time warning logged when auth headers are sent over non-TLS connections
  • HTTP request lifecycle: cancelCallOnClose = true cancels HTTP requests when clients disconnect
  • Bounded scrape request channel — Agent-side channel now has configurable backpressure instead of unlimited
    capacity
  • Outer scrape timeout: withTimeout wrapper in fetchContent() as safety net beyond Ktor client timeout
  • Strict env var parsing — Boolean env vars only accept "true"/"false"; integer/long env vars throw descriptive
    errors on invalid values
  • "all" log level: setLogLevel now accepts "all" as a valid level
  • Input validation: parseHostPort validates blank strings; parsePort validates port ranges
  • TLS config validation — Requires both certificate and key for TLS; warns on disabled X.509 verification
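
Strict environment-variable parsing, as described in the list above, might be sketched as follows. The function names are hypothetical, and treating the comparison as case-sensitive is an assumption; the notes only state that "true" and "false" are accepted.

```kotlin
// Hypothetical parser: anything other than the exact strings "true"/"false" is an error,
// so a typo such as "yes" or "ture" is reported instead of silently becoming false.
fun parseStrictBoolean(name: String, raw: String): Boolean =
    when (raw) {
        "true" -> true
        "false" -> false
        else -> throw IllegalArgumentException(
            "Environment variable $name must be \"true\" or \"false\", was \"$raw\"",
        )
    }

// Hypothetical parser: report the variable name and offending value on bad input.
fun parseStrictInt(name: String, raw: String): Int =
    raw.toIntOrNull()
        ?: throw IllegalArgumentException("Environment variable $name must be an integer, was \"$raw\"")
```

Including the variable name in the message is what makes the startup failure actionable.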

Refactoring

  • ScrapeResults fields changed from var to val (fully immutable construction)
  • ResponseResults and ScrapeRequestResponse converted to immutable data classes
  • updateMsg: String → updateMsgs: List<String> in ResponseResults
  • ProxyUtils response functions now return values instead of mutating a passed-in object
  • AgentContextManager maps made private with accessor methods and read-only views
  • ScrapeRequestManager.scrapeRequestMap made private with read-only view
  • ProxyPathManager changed from ConcurrentMap to HashMap with explicit synchronized blocks
  • AgentPathManager uses ConcurrentHashMap and Mutex for thread-safe registration
  • AgentGrpcService uses ReentrantLock for thread-safe shutdown and stub creation
  • gRPC metadata constants consolidated into GrpcConstants
  • Config file moved: etc/config/config.conf → config/config.conf
  • Detekt config moved: config/detekt/ → etc/detekt/
  • SslSettings return types changed from nullable to non-nullable
  • Scrape request queue changed from Channel to ConcurrentLinkedQueue with notifier
  • Scrape request polling loop replaced with event-driven awaitCompleted() suspension
  • Proto: reserved field 5 in RegisterAgentRequest; added header_zipped field 8 to HeaderData

Dependency Updates

Dependency       Old      New
Kotlin           2.2.20   2.3.10
Gradle wrapper   8.x      9.2.0
Ktor             3.2.3    3.4.0
gRPC             1.75.0   1.79.0
Protoc           4.32.0   4.33.5
JCommander       2.0      3.0
Kotest           6.0.3    6.1.3
Logback          1.5.18   1.5.31
MockK            (new)    1.14.9
tcnative         2.0.73   2.0.74
utils            2.4.5    2.5.3
config plugin    5.6.8    6.0.7
kotlinter        5.2.0    5.4.2
kover            0.9.1    0.9.7
dropwizard       4.2.36   4.2.38
gengrpc          1.4.3    1.5.0
serialization    1.9.0    1.10.0
slf4j            2.0.13   2.0.17
typesafe         1.4.4    1.4.5

CI/CD

  • Added GitHub Actions CI workflow for building the project on push/PR to master
  • Added GitHub Actions workflow for deploying Dokka API documentation to GitHub Pages
  • Removed Travis CI configuration (.travis.yml)

Documentation

  • Integrated Dokka for HTML API documentation generation (./gradlew dokkaHtml)
  • Added KDoc documentation across agent, proxy, and common packages
  • Added module and package documentation (docs/packages.md)
  • Added improvements roadmap document (docs/improvements.md)

Testing

  • ~26,000+ lines of new unit tests added
  • Tests reorganized into io.prometheus.agent/, io.prometheus.proxy/, io.prometheus.common/, io.prometheus.misc/
  • Added MockK for mocking support
  • Compiler option -Xreturn-value-checker=check enabled

Breaking Changes

High Impact — Will affect most users monitoring scrape responses

#  Change                                     Detail
1  Default scrape failure status: 404 → 503   ScrapeResults.srStatusCode default changed from NotFound (404) to ServiceUnavailable (503). Any monitoring/alerting keyed on status codes from failed scrapes will see different codes.
2  IOException scrape error: 404 → 503        When the agent can't reach the scrape target (connection refused, DNS failure, etc.), the status returned to Prometheus changed from 404 to...
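
The status-code changes listed in this release (IOException to 503, catch-all to 500, timeouts detected through the cause chain) can be summarized in a sketch. statusFor is a hypothetical function, the set of timeout exception types is an assumption, and the 408 code for timeouts is also an assumption since the notes do not state it.

```kotlin
import java.io.IOException
import java.net.SocketTimeoutException
import java.util.concurrent.TimeoutException

// Hypothetical mapping from a scrape failure to an HTTP status code.
fun statusFor(error: Throwable): Int {
    // Walk the cause chain so a timeout wrapped in another exception is still found.
    // The take(16) bound guards against a cyclic cause chain.
    val chain = generateSequence(error) { it.cause }.take(16).toList()
    return when {
        // Checked first because SocketTimeoutException is itself an IOException.
        chain.any { it is TimeoutException || it is SocketTimeoutException } -> 408 // assumed code
        error is IOException -> 503 // unreachable target: Service Unavailable
        else -> 500 // catch-all: Internal Server Error
    }
}
```

Alerting rules that previously matched 404 for unreachable targets would need to match 503 after this change.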

v2.4.0

11 Sep 06:59
a9e4d21

  • Refactor dependency management in build.gradle.kts and libs.versions.toml
  • Fix dependency declaration for Kotlin BOM in build.gradle.kts
  • Update dependencies in libs.versions.toml, including gRPC, Jetty, and Kotest, and add protobuf-kotlin entry
  • Update dependencies: Dropwizard to 4.2.36, Kotest to 6.0.2, and tcnative to 2.0.73 in libs.versions.toml
  • Update plugin and dependency versions in libs.versions.toml
  • Update Kotlin to 2.2.20

v2.3.0

14 Aug 05:48

  • Add support for tuning concurrent endpoint HTTP clients
  • Add support for tuning caching endpoint HTTP clients
  • Rename concurrent scrapes configuration to maxConcurrentScrapes and update related documentation
  • Expose httpClientCache in AgentHttpService and add a healthcheck for cache size
  • Introduce clientTimeoutSecs to configure the HTTP client timeout in seconds, replacing agent.configVals.agent.internal.cioTimeoutSecs
  • Update Ktor version to 3.2.2

v2.2.0

26 Jun 01:28
ba6cd0e

  • Update to Kotlin 2.2.0
  • Update jars
  • Convert to using libs.versions.toml
  • Add log level configuration and update logging in various components

v2.1.0

23 Mar 06:37
53f10c7

  • Refactor coroutine dispatchers to use IO for better performance
  • Refactor coroutine exception handling for improved safety
  • Update Kotlin to 2.1.20 and update jars; refactor atomic operations for improved clarity

v2.0.0

14 Feb 22:53
a7cf130
