Enforce last-value-wins semantics in AttributesMap without performance regression#8548
Open
EvgeniiR wants to merge 4 commits into
Open
Enforce last-value-wins semantics in AttributesMap without performance regression#8548EvgeniiR wants to merge 4 commits into
EvgeniiR wants to merge 4 commits into
Conversation
…ixes open-telemetry#7897) AttributesMap previously extended HashMap<AttributeKey<?>, Object>, where AttributeKey.equals() includes the AttributeType. This caused attributes with the same string name but different types to coexist as separate entries, violating the OTel spec last-value-wins rule. Replace the HashMap backing with LinkedHashMap<String, AttributeEntry> keyed by raw attribute name. Overwrites with a different type now update the existing entry in place, so size() stays correct and capacity limits are not consumed. Also eliminates the double hash-probe in put() (containsKey + get → single get).
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8548 +/- ##
============================================
+ Coverage 90.96% 90.99% +0.02%
- Complexity 10206 10227 +21
============================================
Files 1013 1013
Lines 27166 27228 +62
Branches 3182 3191 +9
============================================
+ Hits 24712 24776 +64
+ Misses 1730 1729 -1
+ Partials 724 723 -1 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Contributor
Author
This PR also includes benchmarks for individual read/write operations (uniqueKeys, getHit, getTypeMiss, forEachAll). The project already has More benchmarksAttributesMapBenchmark — write (avg ns/op, lower is better)
AttributesMapBenchmark — read (avg ns/op, lower is better)
FillSpanBenchmark (ops/ms, higher is better)
SpanRecordBenchmark (ops/s, higher is better)
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #7897
Problem
AttributesMapextendedHashMap<AttributeKey<?>, Object>, usingAttributeKeyas the mapkey. Because
AttributeKey.equals()includes the attribute type, two attributes with the samename but different types (e.g.
stringKey("http.method")andlongKey("http.method")) werestored as separate entries.
This violated the OpenTelemetry specification, which requires that
attribute name alone determines identity — last write wins regardless of type.
Solution
Correctness fix
Replace the
HashMap<AttributeKey<?>, Object>backing store with a string-keyed map so thatput("http.method", String)followed byput("http.method", Long)results in exactly one entry(the Long value), consuming only one capacity slot.
The first implementations were based on HashMap/LinkedHashMap (2nd performed better once forEach was included in the benchmarks), but they introduced another issue — the fixed
AttributesMapperformed ~40-80% worse than baseline.Therefore, I started looking for a better solution that would preserve the required last-value-wins semantics without introducing a performance regression.
LinkedHashMap implementation can be observed in the previous commit — https://github.com/EvgeniiR/opentelemetry-java/blob/d7df58af76e693aa1fe897d2757e2bdb50ab9798/sdk/common/src/main/java/io/opentelemetry/sdk/common/internal/AttributesMap.java
Final solution is described below.
Performance optimization
Instead of
LinkedHashMap<String, AttributeEntry>, use parallel arrays with an open-addressingint[]hash table (linear probing, load factor ≤ 0.5):forEachbecomes a tight sequential array loop with no pointer chasing, directly benefiting the export.Benchmark results (avgt ns/op, lower is better)
putThenForEach — N unique puts + 1 forEach (dominant production path)
* This is a worst-case for the new implementation. Performance for spans with 16–32 attributes could be improved by raising the initial array size from 16 to 32, at the cost of extra memory per map. I'm not sure where the right tradeoff sits. Currently our implementation uses less memory then the baseline:
AttributesMapBenchmark — putThenForEach memory allocation (gc.alloc.rate.norm, B/op, lower is better)
PA matches the baseline at n=4 and n=16 while being spec-correct. Given that, I think that further performance optimizations(besides may be changing init size) are out of scope of this PR.