Conversation
This comment has been minimized.
This comment has been minimized.
Regression Detector (Agent Data Plane)Run ID: Optimization Goals: ✅ No significant changes detectedFine details of change detection per experiment (5)Experiments configured
Bounds Checks: ✅ Passed (5)
ExplanationA change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression ( |
Binary Size Analysis (Agent Data Plane)Baseline: 2dd3eab · Comparison: 9bb344d · diff ✅ Binary size difference within thresholdChanges by Module
Detailed Symbol Changes |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2c38265a5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
… resolution when config key disabled
webern
left a comment
There was a problem hiding this comment.
AI aided, seems like maybe there's a bug with evicting from the cache. Otherwise looks good.
| inner.order.push_back(origin.to_string()); | ||
|
|
||
| while inner.order.len() > MAX_ORIGIN_COUNTERS { | ||
| if let Some(evicted_origin) = inner.order.pop_front() { |
There was a problem hiding this comment.
Below is what sonnet considers a bug. I have to admit I don't know this domain at all, so I hope it's not just wrong or something, but it's saying that the deletion method you added misses the key and the cache grows unbounded. This probably wouldn't be an issue in most environments but if there are 10's of thousands of container IDs churning, perhaps it could be:
origin_processed_metric_key builds a key with only 3 tags (message_type, listener_type, origin), but the counters were registered through MetricsBuilder, which prepends component_id and component_type as default tags. The keys don't match, so delete_counter always misses and the registry grows unbounded.
Fix: store the correct key at registration time rather than reconstructing it at eviction. Add a make_key method to MetricsBuilder that mirrors register_counter_with_tags but returns the Key:
// in saluki-metrics MetricsBuilder
pub fn make_key<I, T>(&self, metric_name: &'static str, additional_tags: I) -> Key
where I: IntoIterator<Item = T>, T: MetricTag,
{
let mut tags = (*self.default_tags).clone();
tags.extend(additional_tags.into_iter().map(MetricTag::into_label));
Key::from_parts(metric_name, tags)
}Then extend OriginMetricCounters to carry ok_key: Key and error_key: Key, populate them alongside the counters in get_or_create, and replace these two delete_counter calls with delete_counter(&evicted.ok_key) / delete_counter(&evicted.error_key). The two free functions (origin_processed_metric_key, origin_processed_error_key) can then be removed.
There was a problem hiding this comment.
Good catch. Seems like Sonnet was right. I just put up the fix in my newest commit.
| [ | ||
| ("message_type", "service_checks"), | ||
| ("listener_type", listener_type), | ||
| ("origin", ""), |
There was a problem hiding this comment.
Sonnet: Verified parity with Core Agent: dogstatsd.processed for events and service checks is declared with three label dimensions (message_type, state, origin) in server.go:283-284, and both paths always call tlmProcessed.Inc("events"/"service_checks", "ok"/"error", "") with a hardcoded empty string for origin. So Core Agent also always emits origin:"" on these series — origin:"" here is correct and consistent. 👍
Summary
telemetry.dogstatsd.*config keys are not applicable to ADP, and thus, won't be implemented. These config keys define static bucket bounds for histograms intended to capture latency; however, ADP employs dynamic, logarithmically calculated bounds.telemetry.dogstatsd_originwas implemented by threading the resolved origin as thedogstatsd.processedmetric. If there is no origin, it is represented as"", like the Core Agent, but never omitted.Change Type
How did you test this PR?
Unit testing.
References
telemetry.dogstatsd.*andtelemetry.dogstatsd_origin. #1679