Standalone aggregator service + launcher support

### Description:

Add a dedicated aggregator service command, `traceml serve`, so the aggregator can run independently of training. Also extend `traceml run`, `traceml watch`, and `traceml deep` with a `--serve` flag that defaults to `False`.

When `--serve=False`, TraceML should keep the current behavior and launch the embedded aggregator as it does today. When `--serve=True`, TraceML should not launch the embedded aggregator and should instead connect to an existing standalone aggregator using `serve_ip` and `serve_port`, each with a sensible default value.
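The flag handling described above could be sketched roughly as follows. This is only an illustration: the `--serve-ip` / `--serve-port` flag spellings, the default address and port, and the `AggregatorPlan` and `plan_aggregator` names are all hypothetical and not taken from the TraceML codebase.

```python
import argparse
from dataclasses import dataclass
from typing import Optional, Tuple

# Placeholder defaults (assumption): the issue only asks for "sensible" values.
DEFAULT_SERVE_IP = "127.0.0.1"
DEFAULT_SERVE_PORT = 29765


def parse_bool(value: str) -> bool:
    """Accept --serve=True / --serve=False style values."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for one launcher command; watch/deep would share it.
    parser = argparse.ArgumentParser(prog="traceml run")
    parser.add_argument("--serve", type=parse_bool, nargs="?",
                        const=True, default=False)
    parser.add_argument("--serve-ip", dest="serve_ip",
                        default=DEFAULT_SERVE_IP)
    parser.add_argument("--serve-port", dest="serve_port", type=int,
                        default=DEFAULT_SERVE_PORT)
    return parser


@dataclass(frozen=True)
class AggregatorPlan:
    launch_embedded: bool                 # True keeps today's behavior
    address: Optional[Tuple[str, int]]    # set only when using a standalone aggregator


def plan_aggregator(args: argparse.Namespace) -> AggregatorPlan:
    """Decide between the embedded aggregator and a standalone one."""
    if args.serve:
        return AggregatorPlan(False, (args.serve_ip, args.serve_port))
    return AggregatorPlan(True, None)
```

Keeping the decision in one small function would let `run`, `watch`, and `deep` share the same logic, and leaves the default path (no `--serve`) identical to the current embedded behavior.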

The goal is to separate the long-running aggregator lifecycle from training execution while preserving backward compatibility. This should use the existing TCP transport and keep the rank-0 fallback available for simple runs.



### Acceptance Criteria:

- `traceml serve` starts the aggregator as a standalone process
- `traceml run`, `traceml watch`, and `traceml deep` support `--serve`
  - `--serve=False` keeps current embedded aggregator behavior
  - `--serve=True` skips embedded aggregator startup and connects via `serve_ip` and `serve_port`
- Default `serve_ip` and `serve_port` values are provided
- Rank-0 fallback remains available for compatibility
- No changes to rendering behavior in this issue
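
When verifying the `--serve=True` path, it may help to check that a standalone aggregator is actually listening before training starts. The helper below is a minimal sketch of such a probe over plain TCP; the function name `aggregator_reachable` and the one-second timeout are assumptions, and it only tests that the port accepts connections, not that the peer speaks TraceML's protocol.

```python
import socket


def aggregator_reachable(serve_ip: str, serve_port: int,
                         timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (serve_ip, serve_port) succeeds."""
    try:
        # create_connection resolves the host and connects with a timeout.
        with socket.create_connection((serve_ip, serve_port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A launcher could call this once at startup and report a clear error if the standalone aggregator is not reachable.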

Standalone aggregator service + launcher support #83
