Title: Add step-level total communication time sampler

### Description:
TraceML currently captures compute, memory, and input pipeline timing, but it does not yet measure total distributed communication time. Add a new step-level sampler that estimates the total communication cost for each training step, starting from the first collective before the split / communication phase and aggregating all communication activity during the step.

This should be implemented with lightweight hooks or patches similar to the existing PyTorch instrumentation paths. The feature must be best-effort: if patch installation fails, user training code must continue to run without interruption. Users should also be able to disable communication timing completely when they do not want the overhead.
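The best-effort and opt-out requirements could look roughly like the sketch below. It is illustrative only and does not reflect existing TraceML code: the environment variable `TRACEML_DISABLE_COMM_TIMING`, the helper `install_comm_patch`, and the `on_elapsed` callback are all hypothetical names. A `SimpleNamespace` stands in for the real collective namespace (for example `torch.distributed`) so that the example runs without PyTorch.

```python
import functools
import os
import time
from types import SimpleNamespace

# Hypothetical opt-out switch; the issue does not specify a name.
DISABLE_ENV = "TRACEML_DISABLE_COMM_TIMING"


def comm_timing_enabled():
    """Communication timing is on unless the (assumed) env var is set to "1"."""
    return os.environ.get(DISABLE_ENV) != "1"


def install_comm_patch(namespace, name, on_elapsed, enabled=None):
    """Wrap namespace.<name> so each call reports its host-side duration.

    Returns True if the patch was installed, False otherwise. Never raises,
    so a failed install leaves the user's training code untouched.
    """
    if enabled is None:
        enabled = comm_timing_enabled()
    if not enabled:
        return False
    try:
        original = getattr(namespace, name)

        @functools.wraps(original)
        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return original(*args, **kwargs)
            finally:
                try:
                    on_elapsed(name, time.perf_counter() - start)
                except Exception:
                    pass  # a failing callback must not break training

        setattr(namespace, name, timed)
        return True
    except Exception:
        return False


# Stand-in for a collective namespace; not the real torch.distributed.
fake_dist = SimpleNamespace(all_reduce=lambda x: x)
records = []
install_comm_patch(fake_dist, "all_reduce",
                   lambda op, dt: records.append((op, dt)), enabled=True)
fake_dist.all_reduce(3)  # behaves as before, and appends one timing record
```

Note that host-side wall-clock timing like this only measures how long the call blocks the Python thread. For collectives that complete asynchronously on the device, a real implementation would likely need a device-side timing mechanism instead.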

### Acceptance Criteria:

- A new sampler records step-level communication time
- Communication timing is aggregated per step and per rank
- Instrumentation is best-effort and never breaks user training if patching fails
- Users can disable communication timing entirely
- No rendering/UI changes are required in this issue
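
As a rough illustration of the "aggregated per step and per rank" criterion, a sampler might accumulate durations during a step and emit one record at the step boundary. The class name `CommTimeSampler`, its methods, and the record keys are assumptions made for this sketch, not an existing TraceML interface.

```python
from dataclasses import dataclass


@dataclass
class CommTimeSampler:
    """Accumulates communication seconds for the current step on one rank."""

    rank: int
    step: int = 0
    _total_s: float = 0.0

    def record(self, seconds: float) -> None:
        # Called once per timed communication event within the step.
        self._total_s += seconds

    def end_step(self) -> dict:
        # Emit one sample for the finished step, then reset for the next one.
        sample = {"rank": self.rank, "step": self.step, "comm_time_s": self._total_s}
        self.step += 1
        self._total_s = 0.0
        return sample


sampler = CommTimeSampler(rank=0)
sampler.record(0.25)  # e.g. one collective
sampler.record(0.5)   # e.g. another collective in the same step
print(sampler.end_step())  # {'rank': 0, 'step': 0, 'comm_time_s': 0.75}
```

Keeping a single running total per step keeps the per-event cost to one float addition, which fits the lightweight intent of the issue. A step with no communication simply yields `comm_time_s` of `0.0`.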

