Skip to content

[WIP] feat: add common subnode aggregation#46

Draft
jspaezp wants to merge 1 commit into
binius-zk:mainfrom
jspaezp:feat/aggregate_common
Draft

[WIP] feat: add common subnode aggregation#46
jspaezp wants to merge 1 commit into
binius-zk:mainfrom
jspaezp:feat/aggregate_common

Conversation

@jspaezp

@jspaezp jspaezp commented Oct 20, 2025

Copy link
Copy Markdown

[copying here the text from https://github.com//issues/47]

Hello!

First of all thanks a lot for the crate, it has been incredibly handy.
I was wondering if you would be interested in a feature that I am experimenting with and found useful.

The main issue is that for calls that are executed in loops, I generally require knowing how much the aggregate time was instead of the per-call timing, so aggregating equivalent sub-graphs greatly simplifies this process.

Example (That I should have made more minimal ... but shows the point):

use std::thread;
use std::time::Duration;

use tracing::{debug_span, event, instrument::WithSubscriber, level_filters::LevelFilter, Level};
use tracing_profile::{PrintTreeConfig, PrintTreeLayer};
use tracing_subscriber::prelude::*;
use tracing_subscriber::EnvFilter;

#[tracing::instrument]
fn sleep_a_bit() {
    thread::sleep(Duration::from_millis(10));
}

#[tracing::instrument]
fn sleep_a_bit_and_yell() {
    thread::sleep(Duration::from_millis(10));
    event!(name: "yelling", Level::DEBUG, { value = 5 });
}

#[tracing::instrument]
fn sleep_a_bit_with_args(foo: &str, bar: u32) {
    thread::sleep(Duration::from_millis(10));
    sleep_a_bit();
}

#[tracing::instrument(skip(foo, bar))]
fn sleep_a_bit_with_skipped(foo: &str, bar: u32) {
    thread::sleep(Duration::from_millis(10));
    sleep_a_bit();
}

#[tracing::instrument]
fn all_the_sleeps() {
    sleep_a_bit();
    sleep_a_bit_and_yell();
    sleep_a_bit_with_args(&"hi", 42);
}

#[tracing::instrument]
fn root_fn() {
    for u in 0..5 {
        all_the_sleeps();
    }
}

fn make_spans() {
    event!(name: "event outside of span", Level::DEBUG, {value = 10});
    event!(name: "test_instant_event", Level::DEBUG, test_key = "test_value");

    {
        root_fn();
    }
    event!(name: "event after last span", Level::DEBUG, {value = 20});
}

fn main() {
    use std::env;

    let my_var = env::var("BUNDLE");
    let aggregate_similar_siblings = match my_var {
        Ok(val) if val == "1" || val.to_lowercase() == "true" => true,
        _ => false,
    };
    let (tree_layer, _guard) = PrintTreeLayer::new(PrintTreeConfig {
        attention_above_percent: 25.0,
        relevant_above_percent: 2.5,
        hide_below_percent: 1.0,
        display_unaccounted: false,
        no_color: true,
        accumulate_spans_count: false,
        accumulate_events: false,
        aggregate_similar_siblings, //// <------ This is the new guy
    });
    let perf_filter = EnvFilter::builder()
        .with_default_directive("trace".parse().unwrap())
        .with_env_var("RUST_PERF_LOG")
        .from_env_lossy();

    tracing_subscriber::registry()
        .with(tree_layer.with_filter(perf_filter))
        .init();

    make_spans();
}

Right now outputs the very verbose:

> event outside of span { value: 10 }: 1
> test_instant_event { test_key: test_value }: 1

root_fn [ 237.43ms | 100.00% ]
├── all_the_sleeps [ 48.13ms | 20.27% ] { index = 1 }
│  ├── sleep_a_bit [ 12.26ms | 5.16% ]
│  ├── sleep_a_bit_and_yell [ 12.57ms | 5.30% ]
│  │  ├>yelling { value: 5 }: 1
│  └── sleep_a_bit_with_args [ 23.20ms | 9.77% ] { foo = hi, bar = 42 }
│     └── sleep_a_bit [ 10.89ms | 4.59% ]
├── all_the_sleeps [ 49.33ms | 20.78% ] { index = 2 }
│  ├── sleep_a_bit [ 12.54ms | 5.28% ]
│  ├── sleep_a_bit_and_yell [ 12.59ms | 5.30% ]
│  │  ├>yelling { value: 5 }: 1
│  └── sleep_a_bit_with_args [ 24.08ms | 10.14% ] { foo = hi, bar = 42 }
│     └── sleep_a_bit [ 11.47ms | 4.83% ]
├── all_the_sleeps [ 47.18ms | 19.87% ] { index = 3 }
│  ├── sleep_a_bit [ 11.64ms | 4.90% ]
│  ├── sleep_a_bit_and_yell [ 11.07ms | 4.66% ]
│  │  ├>yelling { value: 5 }: 1
│  └── sleep_a_bit_with_args [ 24.33ms | 10.25% ] { foo = hi, bar = 42 }
│     └── sleep_a_bit [ 11.82ms | 4.98% ]
├── all_the_sleeps [ 45.87ms | 19.32% ] { index = 4 }
│  ├── sleep_a_bit [ 10.76ms | 4.53% ]
│  ├── sleep_a_bit_and_yell [ 10.98ms | 4.62% ]
│  │  ├>yelling { value: 5 }: 1
│  └── sleep_a_bit_with_args [ 23.98ms | 10.10% ] { foo = hi, bar = 42 }
│     └── sleep_a_bit [ 12.17ms | 5.13% ]
└── all_the_sleeps [ 46.60ms | 19.63% ]
   ├── sleep_a_bit [ 11.46ms | 4.83% ]
   ├── sleep_a_bit_and_yell [ 11.79ms | 4.97% ]
   │  ├>yelling { value: 5 }: 1
   └── sleep_a_bit_with_args [ 23.18ms | 9.76% ] { foo = hi, bar = 42 }
      └── sleep_a_bit [ 10.53ms | 4.43% ]

> event after last span { value: 20 }: 1

But if we aggregate all equivalent sub-graphs, we get the more concise:

> event outside of span { value: 10 }: 1
> test_instant_event { test_key: test_value }: 1

root_fn [ 231.50ms | 100.00% ]
└── all_the_sleeps [ 230.95ms | 99.76% ] (5 calls)
   ├── sleep_a_bit [ 59.48ms | 25.69% ] (5 calls)
   ├── sleep_a_bit_and_yell [ 60.02ms | 25.93% ] (5 calls)
   │  ├>yelling { value: 5 }: 5
   └── sleep_a_bit_with_args [ 110.86ms | 47.89% ] (5 calls)
      └── sleep_a_bit [ 54.41ms | 23.50% ] (5 calls)

> event after last span { value: 20 }: 1

LMK if you would be interested in the PR, I can polish my branch and make it a thing. (https://github.com/jspaezp/tracing-profile/tree/feat/aggregate_common right now ... there are a couple of TODOs I would like to handle before it "goes into the wild" but has worked well for a couple of my projects)

Best!

@jspaezp jspaezp marked this pull request as draft October 22, 2025 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant