PyDart is a lightweight experimental framework for studying and optimizing multi-model inference execution across shared compute resources. It is designed to make it easy to compare simple baseline execution against PyDart’s pipeline-parallel partitioning and scheduling approach to execution, while keeping the workflow easy to follow.
Note: This repository is currently intended as an experimental research framework.
PyDart currently supports:
- Built-in model-registry-based experiments
- Manual Python workflows for custom models and custom task construction
- Baseline execution through run_baseline_execution(mode=...)
- PyDart’s inference engine for partitioned and scheduled parallel execution, along with comparison against baseline execution
You can use the CLI for simple built-in experiments, while more flexible or advanced workflows are better handled through Python scripts or notebooks.
At present, the most stable and recommended baseline mode for testing is sequential. An async baseline path is also supported for stronger fully parallel comparison, but it may place more stress on the host system at higher workloads.
PyDart is mainly built to give you a simple way to explore how multiple DNN inference tasks behave when they run together on your own system.
With it, you can study things like:
- Task mix: how different mixes of inference tasks behave when run together
- Batch size: how the number of tasks in a batch affects performance
- Baseline comparison: how concurrent execution compares with a simple baseline
- Partitioning & Scheduling behavior: how PyDart can split and schedule tasks using custom metrics
At the moment, custom metrics are not directly exposed through the minimal CLI workflow. To explore or extend that part of the framework, refer to metrics.py.
In the current demos, notebooks, and custom examples, PyDart uses a default cost metric based on a custom Arithmetic Intensity metric.
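For orientation, arithmetic intensity is conventionally the ratio of floating-point operations to bytes of memory traffic. The sketch below computes that textbook ratio for a single dense layer. It is an illustration only: the function name, the FP32 element size, and the simple "each tensor is touched once" traffic model are assumptions, and the custom metric in metrics.py may be defined differently.

```python
def dense_layer_arithmetic_intensity(
    batch: int,
    in_features: int,
    out_features: int,
    bytes_per_element: int = 4,  # assumption: FP32 tensors
) -> float:
    """Textbook arithmetic intensity (FLOPs per byte) of y = x @ W + b.

    Hypothetical helper for illustration; not PyDart's metric.
    """
    # One multiply and one add per weight per sample, plus one bias add per output.
    flops = 2 * batch * in_features * out_features + batch * out_features
    # Input, weights, and bias are read once; the output is written once.
    elements = (
        batch * in_features
        + in_features * out_features
        + out_features
        + batch * out_features
    )
    return flops / (elements * bytes_per_element)


small = dense_layer_arithmetic_intensity(batch=1, in_features=1024, out_features=1024)
large = dense_layer_arithmetic_intensity(batch=256, in_features=1024, out_features=1024)
print(f"batch=1:   {small:.2f} FLOPs/byte")
print(f"batch=256: {large:.2f} FLOPs/byte")
```

Under this model, larger batches reuse the same weights for more work, so the ratio rises with batch size. That is one reason a cost metric of this kind can distinguish heavy from light tasks.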
See Installation to set up the environment, or jump to CLI and Python for Custom Workflows to start running experiments.
PyDart_temp/
├── configs/ # Experiment configuration files
├── examples/ # Example Python scripts
├── notebooks/ # Notebook demos and experiments
├── outputs/ # Multi-experiment / sweep outputs
├── outputs_built_in_run/ # CLI-based single built-in run outputs
├── outputs_custom/ # Custom Python run outputs
├── src/pydart/ # Core PyDart package
├── README.md
├── pyproject.toml
└── System_Diagram.png
- src/pydart/ contains the main framework code
- examples/ is the best place to look for custom Python usage
- notebooks/ is useful for demo-style exploration and experimentation
- outputs_built_in_run/ stores CLI-based single built-in run artifacts
- outputs_custom/ stores outputs from custom Python-based runs
- outputs/ stores sweep or run_multiple_experiments style outputs, including traces, plots, summaries, and related experiment artifacts
This structure is intended to make the repo easy to navigate: core code in src/pydart/, runnable examples in examples/, interactive exploration in notebooks/, and generated results in outputs/, outputs_built_in_run/, and outputs_custom/.
At the moment, PyDart compares a baseline execution path against PyDart's scheduled parallel execution.
This is handled through run_baseline_execution(mode=...) in Evaluator.
Sequential mode is the current primary and most stable baseline mode.
In this mode:
- Tasks are executed one by one
- Each task is run in a simple sequential loop
- Execution time, completion time, outputs, and makespan are recorded
- This acts as the default baseline for comparison
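The behaviour described in the list above can be sketched in a few lines of standard-library Python. This is a minimal illustration of a sequential timing loop, not the code inside run_baseline_execution; the function name run_sequential, the record layout, and the stand-in workloads are all assumptions.

```python
import time
from typing import Callable, Dict, List


def run_sequential(tasks: Dict[str, Callable[[], object]]) -> dict:
    """Run tasks one by one, recording per-task timing and the overall makespan."""
    start = time.perf_counter()
    records: List[dict] = []
    for name, fn in tasks.items():
        t0 = time.perf_counter()
        output = fn()
        t1 = time.perf_counter()
        records.append({
            "task": name,
            "output": output,
            "execution_time": t1 - t0,      # time spent inside this task
            "completion_time": t1 - start,  # time since the batch started
        })
    makespan = time.perf_counter() - start  # wall time for the whole batch
    return {"records": records, "makespan": makespan}


# Stand-in workloads; real tasks would be model inference calls.
demo_tasks = {
    "light": lambda: sum(range(10_000)),
    "heavy": lambda: sum(range(200_000)),
}
result = run_sequential(demo_tasks)
for rec in result["records"]:
    print(rec["task"], f"{rec['execution_time']:.6f}s", f"{rec['completion_time']:.6f}s")
print("makespan:", f"{result['makespan']:.6f}s")
```

Because nothing overlaps, the makespan is roughly the sum of the individual execution times, which is what makes this a simple reference point.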
An async baseline is also supported.
The intended purpose of this mode is to represent a more aggressive host-side parallel launch strategy, where tasks are submitted asynchronously as a baseline against PyDart's structured scheduling.
In practice:
- It can provide an additional comparison point beyond sequential execution
- It may stress the system more heavily than sequential mode, especially at higher workloads
- It is best used when the machine has enough available CPU / GPU resources and comparable workloads
- Sequential mode remains the safest and most recommended option for routine testing
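As a rough picture of what "submit everything at once" looks like on the host side, the sketch below uses a standard-library thread pool. It is not PyDart's async baseline; the function name run_async, the max_workers default, and the sleep-based stand-in tasks are assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def run_async(tasks: Dict[str, Callable[[], object]], max_workers: int = 4) -> dict:
    """Submit all tasks up front and record when each one finishes."""
    start = time.perf_counter()
    records = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # No ordering or placement is imposed: every task is launched immediately.
        futures = {pool.submit(fn): name for name, fn in tasks.items()}
        for future in as_completed(futures):
            records[futures[future]] = {
                "output": future.result(),
                "completion_time": time.perf_counter() - start,
            }
    return {"records": records, "makespan": time.perf_counter() - start}


def make_sleep_task(seconds: float, label: str) -> Callable[[], str]:
    """Stand-in for an inference call: wait, then return a label."""
    def task() -> str:
        time.sleep(seconds)
        return label
    return task


demo_tasks = {name: make_sleep_task(0.05, name) for name in ("a", "b", "c")}
result = run_async(demo_tasks)
for name, rec in sorted(result["records"].items()):
    print(name, f"{rec['completion_time']:.3f}s")
print("makespan:", f"{result['makespan']:.3f}s")
```

Threads are used here for brevity. Pure-Python CPU-bound work would be serialized by the global interpreter lock, so how much overlap a real workload sees depends on where the time is actually spent.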
PyDart's scheduled parallel execution is the current run_parallel_execution() path in Evaluator.
In this mode:
- Tasks are assigned through the PyDart execution framework
- The Taskset executes tasks across workers / nodes
- Outputs, per-task execution times, completion times, and makespan are collected
- This is the main framework-driven execution path
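To make the idea of cost-driven assignment across workers concrete, here is a generic greedy heuristic (longest task first, always onto the least-loaded worker). It is a named stand-in, not PyDart's partitioning or scheduling algorithm; the function name greedy_assign and the example costs are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def greedy_assign(
    costs: Dict[str, float], num_workers: int
) -> Tuple[Dict[int, List[str]], float]:
    """Assign tasks to workers with the longest-processing-time-first heuristic."""
    # Min-heap of (current load, worker id); the least-loaded worker is on top.
    heap = [(0.0, worker) for worker in range(num_workers)]
    heapq.heapify(heap)
    assignment: Dict[int, List[str]] = {worker: [] for worker in range(num_workers)}

    # Place expensive tasks first so cheap ones can fill the remaining gaps.
    for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(name)
        heapq.heappush(heap, (load + cost, worker))

    estimated_makespan = max(load for load, _ in heap)
    return assignment, estimated_makespan


# Hypothetical per-task costs, e.g. values produced by a cost metric.
task_costs = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
assignment, estimated_makespan = greedy_assign(task_costs, num_workers=2)
print(assignment)
print("estimated makespan:", estimated_makespan)
```

With these costs, both workers end up with a load of 5.0. A real scheduler would also have to account for device placement and contention, which this sketch ignores.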
Figure: PyDart System Design Updated — Offline Task Partitioning and Runtime Scheduling
Figure: PyDart System Design Old
- Python 3.9+
- pip
- A supported PyTorch installation for your platform and device
Important
- It is strongly recommended to use a virtual environment before installing PyDart.
- PyTorch should be installed separately first, using the correct build for your platform (CPU or CUDA), before installing or running PyDart.
- For PyTorch installation instructions, use the official guide: https://pytorch.org/get-started/locally/
- PyDart generally gives the best and most representative results on CPU+GPU systems, since the framework is intended to study shared-resource execution across heterogeneous compute.
- Clone the repository
  git clone https://github.com/parthshinde1221/PyDart_temp.git
  cd PyDart_temp
- Create and activate a virtual environment
  Using venv:
  python -m venv .venv
  source .venv/bin/activate
  On Windows:
  .venv\Scripts\activate
- Install PyTorch first
  Follow the official instructions for your platform: https://pytorch.org/get-started/locally/
- Upgrade/Install packaging tools (Important)
  python -m pip install --upgrade pip setuptools wheel
- Install PyDart
  Recommended:
  python -m pip install -e .
  If needed, you can also try:
  python -m pip install --no-build-isolation -e .
- Verify the installation
  pydart --help
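If the verification step fails, a quick standard-library check can narrow down which prerequisite is missing. The sketch below only inspects the current environment; the function name check_environment and the keys it reports are illustrative and not part of PyDart.

```python
import importlib.util
import shutil
import sys


def check_environment() -> dict:
    """Report whether the documented prerequisites appear to be in place."""
    return {
        # The requirements list Python 3.9+.
        "python_ok": sys.version_info >= (3, 9),
        # PyTorch should be installed before PyDart.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # The pydart command should be on PATH after installation.
        "pydart_cli_on_path": shutil.which("pydart") is not None,
    }


status = check_environment()
for name, ok in status.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```

Run it inside the activated virtual environment so that it sees the same interpreter and PATH as the pydart command.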
PyDart provides a minimal CLI for built-in experiments.
The CLI is intentionally small and only exposes the simplest built-in execution paths using the default model registry.
- Show CLI help
  pydart --help
- Run a single built-in experiment
  pydart run --workers 2 --ratio 1:1 --tasks 10 --baseline-mode sequential
- Run multiple built-in experiments
  pydart sweep --workers 2 --tasks 10 --baseline-mode sequential
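When several single runs need to be queued, the command lines can be assembled in Python. The sketch below only builds and prints pydart run commands using the flags shown in this README; the helper name build_run_command is hypothetical, and nothing is executed unless you pass a command to subprocess.run yourself.

```python
import shlex
from typing import List

BASELINE_MODES = ("sequential", "async")


def build_run_command(
    workers: int, ratio: str, tasks: int, baseline_mode: str = "sequential"
) -> List[str]:
    """Build the argument list for one `pydart run` invocation."""
    if baseline_mode not in BASELINE_MODES:
        raise ValueError(f"unknown baseline mode: {baseline_mode!r}")
    return [
        "pydart", "run",
        "--workers", str(workers),
        "--ratio", ratio,
        "--tasks", str(tasks),
        "--baseline-mode", baseline_mode,
    ]


commands = [build_run_command(2, ratio, 10) for ratio in ("1:1", "1:2", "2:1")]
for cmd in commands:
    # Printed as a dry run; to execute, use subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```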
The number of workers can be greater than 2. In practice, a useful range to explore is up to cpu_core_count - 1, while cpu_core_count // 2 is often a good starting point on systems with a larger number of CPU cores. The best value depends on workload characteristics and system-level contention.
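The two rules of thumb above are easy to compute for the current machine. The helper below is a small sketch; its name and the returned keys are illustrative, and the values are starting points to explore rather than tuned recommendations.

```python
import os
from typing import Optional


def suggest_worker_counts(cpu_core_count: Optional[int] = None) -> dict:
    """Return the starting point (cores // 2) and upper bound (cores - 1)."""
    # os.cpu_count() can return None, so fall back to a single core.
    cores = cpu_core_count or os.cpu_count() or 1
    return {
        "cores": cores,
        "starting_point": max(1, cores // 2),
        "upper_bound": max(1, cores - 1),
    }


print(suggest_worker_counts())
```

For example, suggest_worker_counts(8) returns a starting point of 4 and an upper bound of 7, which can then be passed to --workers.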
--ratio defines the heavy-to-light task mix for built-in experiments and is written in H:L form.
Examples:
- 1:1 = equal heavy and light tasks
- 1:2 = more light tasks
- 2:1 = more heavy tasks
- 9:1 = mostly heavy tasks
This is useful for exploring how workload composition affects execution and scheduling.
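To see what an H:L ratio means for a concrete task count, the sketch below splits a total into heavy and light counts. The helper name split_by_ratio and the round-half-up rule are assumptions; the built-in experiments may allocate tasks differently when the total does not divide evenly.

```python
from typing import Tuple


def split_by_ratio(ratio: str, total_tasks: int) -> Tuple[int, int]:
    """Split total_tasks into (heavy, light) counts for an 'H:L' ratio string."""
    heavy_part, light_part = (int(part) for part in ratio.split(":"))
    denominator = heavy_part + light_part
    if heavy_part < 0 or light_part < 0 or denominator == 0:
        raise ValueError(f"invalid ratio: {ratio!r}")
    # Integer arithmetic, rounding the heavy share to the nearest task (halves up).
    heavy = (total_tasks * heavy_part + denominator // 2) // denominator
    return heavy, total_tasks - heavy


for ratio in ("1:1", "1:2", "2:1", "9:1"):
    heavy, light = split_by_ratio(ratio, 10)
    print(f"{ratio}: {heavy} heavy, {light} light")
```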
PyDart supports configurable baseline execution through the CLI using --baseline-mode.
Expected baseline modes include:
- sequential: recommended default for testing; simpler, safer, and more stable for most systems
- async: also supported and useful for a stronger fully parallel comparison, but it may stress the host system more at higher workloads
Recommended usage:
- Use sequential for routine testing and the most stable baseline comparison
- Use async when you want a more aggressive parallel baseline and your system has enough available resources
Examples:
pydart run --workers 2 --ratio 1:1 --tasks 10 --baseline-mode sequential
pydart run --workers 2 --ratio 1:1 --tasks 10 --baseline-mode async
Use Python scripts or notebooks when you want to:
- Define custom nn.Module models
- Define custom ModelSpec objects
- Create custom dataloaders
- Control tracing manually
- Build tasks explicitly
- Profile tasks manually
- Experiment with evaluator logic directly
This separation keeps the CLI minimal while still letting PyDart act like a flexible library.
For custom experiments, run one of the provided example files directly:
python examples/custom_model_run_sequential.py
python examples/custom_model_run_async.py
You can also add your own custom run file under examples/ and execute it in the same way:
python examples/<custom_run_file>.py
Refer to examples/custom_model_run_sequential.py and examples/custom_model_run_async.py for example custom workflows.
PyDart organizes generated artifacts by workflow type.
- outputs_built_in_run/ is used for CLI-based single built-in runs
- outputs_custom/ is used for custom Python-based runs
- outputs/ is used for sweep or run_multiple_experiments style runs, including artifacts such as traces, plots, summaries, and related experiment outputs
Depending on the workflow, generated artifacts may include:
- traces
- logs
- profiling CSVs
- plots
- experiment results
This structure helps separate built-in, custom, and multi-experiment outputs more clearly.
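After a few runs, a short script can show which of these directories actually contain artifacts. The sketch below counts files under each documented output directory; the helper name summarize_outputs is illustrative, and it assumes the script is run from the repository root unless another path is given.

```python
from pathlib import Path
from typing import Dict, Union

# Directory names taken from the layout described above.
OUTPUT_DIRS = ("outputs_built_in_run", "outputs_custom", "outputs")


def summarize_outputs(repo_root: Union[str, Path] = ".") -> Dict[str, int]:
    """Count the files under each output directory (0 if it does not exist)."""
    summary = {}
    for name in OUTPUT_DIRS:
        directory = Path(repo_root) / name
        if directory.is_dir():
            summary[name] = sum(1 for path in directory.rglob("*") if path.is_file())
        else:
            summary[name] = 0
    return summary


for name, count in summarize_outputs().items():
    print(f"{name}/: {count} file(s)")
```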
- The repository is currently experimental.
- The CLI is intentionally minimal.
- sequential is the recommended baseline mode for most users and for routine testing.
- async is also supported and can provide a stronger comparison point, but it may stress the system more at higher workloads.
- For custom models and more advanced workflows, prefer Python scripts or notebooks.