Health AI Research Portfolio

Three production-style health-AI systems built around the engineering that makes clinical ML trustworthy: calibrated ranking, conformal alerting, drift monitoring, and a privacy release gate. Together they comprise about 11,500 lines of Python and a 396-test suite. The demos run on real public data where it exists (ClinicalTrials.gov, the open MIMIC-IV demo, Synthea); the learned models and their labels use synthetic stand-ins only where there is no public ground truth.

The Three Projects

#	Project	Domain	Core contribution
1	OncoBoard-MM	Precision oncology	Multimodal patient–trial matching with calibrated ranking and evidence traces
2	AcuteWatch-FM	Hospital deterioration	Real-time multi-horizon risk prediction with conformal alerting and drift monitoring
3	SynGuard-EHR	Synthetic data governance	Privacy-preserving synthetic EHR generation with attack-suite evaluation and release gating

How They Connect

The three projects form a deliberate reuse chain:

OncoBoard-MM                    AcuteWatch-FM                    SynGuard-EHR
─────────────                   ─────────────                    ────────────
Modality masks ──────────────►  Modality masks (5 modalities)
Gated fusion ────────────────►  Gated multimodal fusion
Conformal prediction ────────►  Conformal alerting ──────────►   Release gate thresholds
Synthetic oncology cohorts ──►  ─────────────────────────────►   Generator + privacy screening
Calibration audit ───────────►  Drift monitoring ────────────►   Fairness auditing
Sparse retrieval (evidence) ─►  Guideline retrieval              ─────────────────────────

OncoBoard-MM establishes the core patterns: modality-mask conventions, gated late fusion, conformal uncertainty, and calibration auditing.
AcuteWatch-FM extends these to longitudinal hospital data with irregular time-series, multi-horizon prediction, and operational governance (drift detection, alert budgets).
SynGuard-EHR inherits the privacy screening and calibration machinery to build a governed synthetic-data workbench where the evaluator and release gate, not just the generator, are the primary deliverables.
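
The modality-mask and gated-fusion pattern shared by the first two projects can be sketched roughly as follows. This is a minimal NumPy illustration, not the projects' code: the function name `gated_late_fusion`, the array shapes, and the masked-softmax gate are all assumptions made for the example, and in a trained model the gate scores would be learned rather than passed in.

```python
import numpy as np

def gated_late_fusion(embeddings, mask, gate_logits):
    """Fuse per-modality embeddings, giving absent modalities zero weight.

    embeddings:  (n_modalities, dim), one embedding per modality
    mask:        (n_modalities,), 1 if the modality is present, else 0
    gate_logits: (n_modalities,), unnormalized gate scores (hypothetical input)
    """
    embeddings = np.asarray(embeddings, dtype=float)
    mask = np.asarray(mask, dtype=float)
    gate_logits = np.asarray(gate_logits, dtype=float)
    present = mask > 0
    if not present.any():
        raise ValueError("at least one modality must be present")
    # Masked softmax: absent modalities are sent to -inf before exponentiation,
    # so they contribute exactly zero weight regardless of their gate score.
    shifted = gate_logits - gate_logits[present].max()
    weights = np.exp(np.where(present, shifted, -np.inf))
    weights /= weights.sum()
    return weights @ embeddings, weights

# Three modalities, the third one missing for this patient.
fused, weights = gated_late_fusion(
    embeddings=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    mask=[1, 1, 0],
    gate_logits=[0.0, 0.0, 10.0],
)
```

Because the mask is applied inside the softmax rather than after it, the remaining weights still sum to one when a modality is missing, which is one common way to keep the fused representation on the same scale across patients with different available data.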

Demonstrated capabilities (real public data, synthetic models)

Each sub-project includes a demo notebook that runs end-to-end on real public data, with synthetic stand-ins for the learned models and labels where no public ground truth exists. Headline numbers from the most recent run:

Project	What runs end-to-end	One headline number
OncoBoard-MM	profile build → biomarker normalization → ranker training → calibration audit	Brier 0.0185, ECE 0.0824 on 500 synthetic predictions
AcuteWatch-FM	event ingest → leakage-safe windows → 9-head multi-task training → conformal alerts → drift	All 9 outcome-horizon heads train to val_acc 0.976; drift monitor flags a 0.82-std heart-rate shift
SynGuard-EHR	generate → privacy attacks → fidelity → fairness → release gate → FHIR export	Release gate correctly BLOCKS a deliberately leaky generator against a real Synthea cohort (PRIV-01 distinguishability 0.61 vs 0.05 threshold)
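
A drift figure expressed in standard deviations, like the heart-rate shift in the table, can be read as a standardized mean shift between a reference window and a current window. The sketch below shows one simple way to compute such a statistic; the function names, the toy values, and the 0.5 flagging threshold are assumptions for illustration, and the drift monitor in AcuteWatch-FM may use a different statistic.

```python
import numpy as np

def standardized_shift(reference, current):
    """Shift of the current mean, in units of the reference standard deviation."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    ref_std = reference.std(ddof=1)  # sample standard deviation
    if ref_std == 0:
        raise ValueError("reference window has zero variance")
    return float((current.mean() - reference.mean()) / ref_std)

def drift_flag(reference, current, threshold=0.5):
    """Flag drift when the absolute shift exceeds a (hypothetical) threshold."""
    shift = standardized_shift(reference, current)
    return abs(shift) > threshold, shift

# Toy heart-rate windows: reference mean 80, sample std 10; current mean 88.2.
flagged, shift = drift_flag([70.0, 80.0, 90.0], [85.2, 91.2])
```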

The data backbones are real (ClinicalTrials.gov trials, the open MIMIC-IV demo, a Synthea cohort); the learned rankers/models, their labels, and the calibration/conformal figures are synthetic stand-ins where no public match/outcome labels exist. These are pipeline demonstrations, not predictive benchmarks. See each sub-project's Demonstration and Status sections for exactly what is real versus synthetic.
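
For readers unfamiliar with the two calibration metrics quoted for OncoBoard-MM, the sketch below computes a Brier score and an expected calibration error (ECE) with NumPy. It is illustrative only: the ten equal-width bins and the function names are assumptions, and the project's calibration audit may bin or weight differently, so it should not be expected to reproduce the headline figures.

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted average gap between confidence and observed rate per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against the interior edges yields bin ids 0 .. n_bins - 1,
    # with a probability of exactly 1.0 landing in the last bin.
    bin_ids = np.digitize(y_prob, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            gap = abs(y_prob[in_bin].mean() - y_true[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the share of samples in the bin
    return float(ece)

# Toy predictions: confident and correct, but slightly under-confident.
labels = [0, 0, 1, 1]
probs = [0.15, 0.15, 0.85, 0.85]
brier = brier_score(labels, probs)
ece = expected_calibration_error(labels, probs)
```

The two numbers answer different questions: the Brier score mixes calibration with discrimination, while ECE isolates how far predicted probabilities sit from observed frequencies, which is why a calibration audit typically reports both.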

Quickstart

Each project is a self-contained Python package. To run any one of them:

cd project_01_oncoboard_mm   # or project_02_acutewatch_fm or project_03_synguard_ehr
pip install -e .
PYTHONPATH=src pytest tests/ -v

Each project includes a demo notebook under notebooks/ that runs a worked example with actual metrics:

cd project_01_oncoboard_mm/notebooks
jupyter notebook demo_trial_matching.ipynb

References

See references.md for the full bibliography cited in each project's README.

License

MIT - see LICENSE.
