A curated, citation-backed, variability-aware dataset of cardiac ion-channel drug-block parameters and the in-silico ventricular action-potential models that turn them into a torsade-de-pointes (proarrhythmia) risk distribution — not a verdict.
Code: MIT · Data: CC-BY-4.0 · Python ≥3.9
Harmonia reports a torsade-risk-metric distribution and a classification-flip frequency that make the dependence of a safety call on input-data variability visible by default. It is NOT a clinical tool and NOT a regulatory safety determination. It never issues a bare "this drug is safe/unsafe" verdict. See Safety & scope.
Harmonia is the fourth sibling of Nidus (gestational physiology), Hypnos (anesthetic PK/PD), and Onkos (oncology efficacy) — completing a physiology → dosing → efficacy → safety arc — built on one principle: a model is only as trustworthy as its weakest, least-validated input, so make that a first-class, machine-readable field. Harmonia's load-bearing idea is the propagation of input variability to the safety classification.
Drug-induced QT prolongation and torsade de pointes (TdP) are a leading cause of late-stage drug attrition. The FDA-initiated CiPA paradigm assesses risk in silico: measure a drug's block of the major cardiac currents (an IC50 + Hill per current), feed them into a human ventricular myocyte model, simulate the action potential, and stratify TdP risk.
The model machinery is published. The inputs are the problem. IC50 values for the same drug–channel pair routinely differ several-fold across labs and platforms, and for a large fraction of published pharmacology the maximum block observed was below ~60% — which makes the IC50 unidentifiable, yet such values still get used as point estimates. The accuracy of the final risk call depends as much on input variability and quality as on the model.
Harmonia operationalizes exactly that. Pick a drug; it pulls the spread of published IC50s per channel, propagates that spread through the chosen AP model by Monte-Carlo, and shows how often the high/intermediate/low classification flips depending on which sources you believe:
A near-pure hERG blocker like dofetilide lands tightly in HIGH (2% flip under qNet). A balanced multichannel blocker like verapamil straddles the low/intermediate boundary — its classification flips on ~36% of draws (Wilson 95% CI 30–43% at 200 Monte-Carlo draws; the flip frequency is itself an estimate, and v0.7 reports its sampling interval, §flip-frequency CI). The same drug, different believed sources, different safety call. That is the finding the uncertainty-quantification literature (Chang et al. 2017) demonstrated and that no dataset operationalized — until this one. (The metric is qNet, the CiPA net-charge biomarker; lower qNet means higher risk.)
git clone https://github.com/clay-good/harmonia
cd harmonia
pip install -e ".[dev]" # numpy, scipy, jsonschema
harmonia validate # JSON-Schema- + semantically validate the dataset
harmonia info # counts by subsystem / tier / review status
harmonia simulate dofetilide --mc 200 # qNet metric (default); --metric apd90 to switch
harmonia simulate dofetilide --uq bayes # v0.2 hierarchical-Bayesian uncertainty propagation
harmonia infer verapamil # v0.2 per-channel IC50/Hill posteriors + sampler diagnostics
harmonia priors # the prior registry (declared, citable, non-predictive)
harmonia flip verapamil # classification stability across AP-model variants
harmonia sensitivity verapamil # which channel's IC50 spread drives the flip
harmonia sensitivity verapamil --sobol # variance-based indices WITH interactions (v0.2)
harmonia combo terfenadine ondansetron # drug-combination (polypharmacy) assessment
harmonia population sotalol # population-of-models spread (HYPOTHESIS-TIER, not for prediction)
harmonia performance # score qNet vs CiPA expert labels (train/val/all); --metric apd90
harmonia export --all --output exports/import harmonia
ds = harmonia.load()
b = ds["channel_block.dofetilide.ikr"]
b.tier # "A"
b.assay_context.max_block_observed_percent # 95 (>60 => identifiable)
b.variability.fold_range # 1.65 -> inter-source spread is first-class
b.source_values # the individual lab measurements
# Simulate an action potential + risk-metric DISTRIBUTION (never a bare verdict)
res = harmonia.assess(ds, "dofetilide", ap_model="cipaordv1.0", n_mc=200) # metric="qnet" (default)
res.qnet_distribution # distribution, not a point value
res.classification_flip_frequency # how often the class flips
res.tier, res.warnings, res.excluded_channels # propagated tier + unidentifiable-channel flags
harmonia.assess(ds, "dofetilide", metric="apd90") # the classic QT/APD surrogate instead
# Headline comparison across AP-model variants
cmp = harmonia.flip_view(ds, "verapamil", ap_models=["ord", "cipaordv1.0", "tor_ord"])
cmp.flip_by_model # {'ord': 'low', 'cipaordv1.0': 'intermediate', ...}
cmp.stable_across_models # False
# Which input drives the flip? — per-channel uncertainty attribution
sens = harmonia.flip_sensitivity(ds, "verapamil")
sens.dominant_channel # 'ICaL' — the IC50 to pin down first
sens.channels[0].solo_flip_frequency # flip freq when ONLY ICaL's IC50 varies
# v0.2 — Bayesian dose-response UQ: infer the IC50/Hill posterior, don't transcribe it
res = harmonia.assess(ds, "dofetilide", uq="bayes") # posterior-predictive propagation
res.classification_flip_frequency # samples the TRUE-value posterior
res.reproducibility_flip_frequency # samples the NEW-LAB predictive (a fresh replication)
res.censored_channels, res.prior_dominated_channels # one-sided / prior-shaped flags
post = harmonia.posterior(ds, "dofetilide", "IKr") # the inferred object, not a point
post.mean_log10, post.sd_log10, post.identifiability_score, post.prior_sensitivity
sob = harmonia.flip_sensitivity(ds, "verapamil", method="sobol") # interaction-aware
sob.channels[0].total_effect, sob.channels[0].interaction_load # S_Ti and S_Ti - S_i
# Dynamic (CiPA-style) hERG binding with trapping, where kinetics are recorded
res = harmonia.assess(ds, "dofetilide", herg_dynamic=True) # trapped blocker -> extra prolongation
# Exposure layer: drive block from a TOTAL plasma concentration via protein binding
res = harmonia.assess(ds, "verapamil", exposure_nM=3200, exposure_kind="total") # free = fu * total
harmonia.free_from_total(3200, 0.10) # 320.0 nM free
# Drug combination (polypharmacy): joint variability, the interaction, the flip
combo = harmonia.assess_combination(ds, ["terfenadine", "ondansetron"], n_mc=200)
combo.classification # "high" (two intermediates -> high together)
combo.interaction_dapd90_pct # extra prolongation beyond the worst single agent
combo.classification_flip_frequency # joint-uncertainty flip frequency
# Population-of-models (HYPOTHESIS-TIER, Tier D, not for prediction)
pop = harmonia.assess_population(ds, "sotalol", n_models=100)
pop.susceptible_fraction # fraction of the population classified "high"
pop.tier # "D" (always; non-predictive)| Layer | Status |
|---|---|
| Dataset — 68 channel-block records across the 28 CiPA compounds (12 training + 16 validation), 28 drug-reference records (expert risk label + free Cmax + protein binding), 3 AP-model records, 5 population specs (1 variability + 3 LQTS disease backgrounds + 1 experimentally-calibrated), 15 Crossref-checked citations | ✅ |
| Variability is first-class — multi-source IC50s with computed fold-range / IQR; the reliability gate (max block < 60% ⇒ Tier D, unidentifiable) machine-enforced | ✅ |
Bayesian dose-response UQ (Phase C, v0.2) — the IC50/Hill spread is inferred under a declared prior, not transcribed: hierarchical posterior with dataset-learned between-lab pooling, propagated Hill uncertainty, a one-sided censored posterior for sub-60%-block channels, a raw dose-response regime (fit from concentration-block points), a prior registry with per-channel prior_sensitivity, variance-based (Sobol) sensitivity, and a calibrated inference (simulation-based calibration + posterior coverage). Opt-in via uq="bayes"; uq="moments" (default) reproduces every v0.1 number |
✅ |
| Reference kernel — a SciPy reduced O'Hara-Rudy-lineage ventricular AP (7 currents + Na-Ca exchanger) with Hill block per current; APD90 / qNet / triangulation / cqInward / EAD biomarkers | ✅ |
| Discriminating qNet (Phase C) — adding a shape-dependent Na-Ca exchanger (excluded from the qNet sum) makes qNet sensitive; qNet is now the default metric (10/12 training, zero two-category errors over all 28 compounds); APD90 selectable | ✅ |
Dynamic hERG binding (Phase B) — Langmuir kon/koff with trapping; reduces to the static Hill block at steady state, captures use-dependent block (assess(..., herg_dynamic=True)) |
✅ |
CiPA dynamic-hERG kinetics (v0.6) — the real published Li-2017 IKr-Markov binding parameters (Kmax/Ku/halfmax/n/Vhalf) for the 12 CiPA dynamic-fit compounds, sourced from the FDA/CiPA repository as a citation-backed cipa_binding field; a faithful binding model reproducing the trapping phenotype (assess(..., herg_dynamic="cipa")). Data authoritative (shipped unverified); model an opt-in Tier-C reduction that touches no default number |
✅ |
Exposure layer (Phase D) — free ↔ total plasma conversion via protein binding (fraction_unbound); assess from a free or total concentration (composable with a Hypnos PK trajectory) |
✅ |
Drug combinations (Phase D) — assess_combination propagates joint IC50 variability; independent block multiplies per channel; reports the interaction and how often the combined class flips |
✅ |
Population-of-models (Phase E) — assess_population samples a population of virtual myocytes (per-conductance variability) to spread a drug's risk across individuals. Hypothesis-tier, Tier D, NOT FOR PREDICTION |
✅ |
| Disease / genetic backgrounds (v0.3) — the three congenital long-QT channelopathies (LQT1 IKs↓, LQT2 IKr↓, LQT3 INaL↑) as population records: a per-current mean conductance shift recenters the variability cloud on a reduced-repolarization-reserve background, so a drug's risk distribution can be re-evaluated against a susceptible subpopulation. Hypothesis-tier, Tier D, NOT FOR PREDICTION | ✅ |
| Experimentally-calibrated populations (v0.5) — the Britton-2013 calibration-by-rejection: a virtual myocyte enters the population only if its drug-free AP biomarkers (APD90, rest/peak potential, triangulation) are physiologically plausible, removing the abnormal-repolarization tail of the raw prior. Acceptance ranges are kernel-plausibility bounds (not patient-fit), so it stays Tier D, NOT FOR PREDICTION | ✅ |
| Risk distribution + flip frequency — Monte-Carlo over source variability; classification-flip frequency; worst-tier propagation | ✅ |
Flip-frequency confidence interval (v0.7) — the headline flip frequency is a Monte-Carlo binomial proportion, so it now carries a Wilson 95% CI (flip_ci) derived from n_mc; same for the combination flip, the all-vary flip sensitivity, the new-lab reproducibility flip, and the population susceptible fraction. The project's own "never a number without its uncertainty" rule, applied to its own headline number. Purely additive — no previously-reported value moves |
✅ |
Flip sensitivity — flip_sensitivity attributes the flip to each channel's IC50 spread (solo / frozen effects), surfacing the dominant uncertain input to pin down first |
✅ |
Recorded classification performance (Phase B) — harmonia performance scores either metric vs CiPA expert labels on training / validation / all, with the full confusion matrix |
✅ |
Exports — CellML 2.0, Myokit .mmt, SBML L3v2, SED-ML, CiPA inputs (CSV/JSON), CSV, BibTeX, COMBINE .omex — all carrying clinicalUse = PROHIBITED, tier, and DOI RDF |
✅ |
| CLI · Streamlit dashboard · CI | ✅ |
Release hardening (Phase F) — declaration-level CellML unit-conformance check, three executable nbmake notebooks, .zenodo.json, CHANGELOG.md |
✅ |
| Full CiPA Markov hERG + published optimized kinetics, ToR-ORd reformulation, broader multi-source aggregation, full dimensional/OpenCOR cross-check | Phase C/F (roadmap below) |
The dataset is the single source of truth; everything else is a deterministic projection.
flowchart TD
DS["<b>dataset/</b> — source of truth<br/>JSON records + JSON Schema + JSON-LD<br/>channels · block params (multi-source) · AP models · tiers · citations"]
DS -->|"build_records.py (provenance log)"| REC["records/*.json"]
REC --> LOAD["harmonia.load → Dataset"]
LOAD --> VAL["validate.py<br/>schema + reliability-gate + citation rules"]
LOAD --> SIM["simulate.py<br/>Monte-Carlo variability → risk distribution + flip freq<br/>flip sensitivity (OAT + Sobol) · uq=moments|bayes<br/>dynamic hERG · exposure scaling · drug combinations<br/>populations.py: population-of-models (Tier D)"]
LOAD --> INFER["infer.py (v0.2)<br/>hierarchical Bayesian IC50/Hill posterior<br/>learned between-lab pooling · censored likelihood<br/>prior_sensitivity · identifiability_score"]
PRI["dataset/priors/<br/>declared, citable, non-predictive priors"] --> INFER
INFER --> SIM
LOAD --> EXP["harmonia.export<br/>format builders"]
SIM --> PERF["performance.py<br/>score vs CiPA expert labels"]
SIM --> REF["reference.py<br/>SciPy ORd-lineage AP kernel (+ INaCa)<br/>qNet (default) / APD90 / triangulation / cqInward biomarkers"]
EXP --> SPEC["model_spec.py<br/>one AST → Myokit / CellML / SBML"]
SPEC --> CELLML["CellML 2.0"]
SPEC --> MYOKIT["Myokit .mmt"]
SPEC --> SBML["SBML L3v2"]
EXP --> CIPA["CiPA inputs (CSV/JSON)"]
EXP --> SEDML["SED-ML"] --> OMEX["COMBINE .omex"]
CELLML --> OMEX
LOAD --> CLI["harmonia CLI"]
LOAD --> DASH["Streamlit dashboard<br/>flip view · combinations · population-of-models · browse"]
Every model export is generated from one renderer-agnostic
model_spec (a tiny expression AST), so
the Myokit, CellML, and SBML artifacts cannot drift from each other or from the
reference kernel — the numeric oracle.
"Cannot drift" is enforced, not asserted: the AST carries a pure-Python
evaluator, and registry.roundtrip_ode re-integrates it with the kernel's own
solver settings and confirms the resulting action potential matches to ≈1e-7
relative — the "round-trip validates ~1e-4 ODE" guarantee from spec.md §6, now
enforced in CI.
The two load-bearing fields:
source_values+variability— multiple labs'/assays' measurements of the same IC50, with the inter-source spread computed and stored. Input variability is not hidden behind a single number.assay_context.max_block_observed_percent— below ~60% block the IC50 is unidentifiable and any point estimate is fiction. This is what lets a Tier-D "we don't actually know this IC50" be stated honestly. (ranolazine.ical, max block 35%, is the worked example: it is excluded from simulation and caps any assessment that touches it at Tier D.)
| Tier | Channel block | AP model |
|---|---|---|
| A | Multiple labs agree (low fold-range), block ≳60% so IC50 identifiable | Validated on CiPA validation set |
| B | One good measurement with adequate block; a single well-curated source | Published, internally validated |
| C | Single measurement; low/borderline block; unresolved manual-vs-automated discrepancy | Reduced / reference kernel |
| D | Max block < ~60% (IC50 unidentifiable), population extrapolation, or hypothesis-tier — not predictive | — |
Two hard, machine-checked rules (harmonia validate enforces both):
- The reliability gate —
max_block < 60%⟺ Tier D ⟺ aknown_failure_modeis present. No point IC50 is recorded as if reliable. - Worst-input-wins — a composed assessment inherits the worst tier among its channel-block records + AP model. One unidentifiable IC50 caps the whole assessment at D — and the input variability is propagated by Monte-Carlo, producing a distribution of outcomes and a flip frequency, never a bare class.
verifiedvs the tier. The tier is data quality (do labs agree? is the IC50 identifiable?).review_statusis whether a human opened the source PDF. They are orthogonal. Every v0.1 record shipsunverified— the values are literature-derived but no PDF has been confirmed inside Harmonia. LLMs assist extraction but never promote to verified.harmonia inforeports the verified count honestly (currently 0/104). Promoting records by reading the source is the single highest-leverage contribution — see CONTRIBUTING.
The bundled kernel is a reduced O'Hara-Rudy-lineage human ventricular AP: seven named currents (INa, INaL, Ito, ICaL, IKr, IKs, IK1) plus a phenomenological Na-Ca exchanger (INaCa), with Hodgkin-Huxley gating, an algebraic inward rectifier, and fixed ionic concentrations, paced at 0.5 Hz. It is structurally faithful, numerically stable (steady state in ~3 beats), and reproduces the qualitative pharmacology CiPA rests on:
It is not bit-exact to the published ORd CellML, so AP-model records ship at Tier C. Two design facts are worth stating plainly:
- qNet is the default metric, and it works (Phase C). CiPA replaced APD with
qNet — the integral of the six currents INaL + ICaL + IKr + IKs + IK1 + Ito
over the beat (lower qNet = higher risk). In a pump-free kernel that sum is
charge-conserved and so insensitive to block. Adding the Na-Ca exchanger and
excluding it from the qNet sum breaks that conservation and makes qNet
genuinely discriminate. ΔAPD90% remains selectable (
metric="apd90"). - The classifier is a methodology demonstrator, not a qualified classifier. Calibrated on the 12 CiPA training drugs under the default model (qNet thresholds: high < 0.220, low > 0.285), the reduced kernel recovers 10/12 training labels — and across the full 28-compound set it makes zero two-category errors (it never calls a high-risk drug low, or vice versa):
High-risk drugs (red) sit left of the red line, low-risk (green) right of the green line; the asterisked drugs are the 16-compound validation set. The kernel also captures the protective multichannel mechanism: diltiazem and verapamil's ICaL block raises their qNet (lowers risk), via an L-type window current.
Alongside the classifier, every assessment reports two more CiPA biomarkers as honest diagnostic readouts, never as a second verdict; the high/intermediate/low call stays with qNet (or ΔAPD90):
- triangulation (APD90 − APD50) — the T in the classic TRIaD proarrhythmia profile. hERG block prolongs late repolarization more than early, so a torsadogenic drug widens triangulation above the drug-free baseline (dofetilide ≈71 ms vs ≈36 ms).
- cqInward (v0.4) — the CiPA inward-charge biomarker
(Dutta et al. 2017): the
control-normalized average of the late-sodium (INaL) and L-type-calcium (ICaL)
charge ratios,
½·(qNaL_drug/qNaL_ctrl + qCaL_drug/qCaL_ctrl). It isolates the inward side of CiPA's inward/outward balance hypothesis — torsade is driven by sustained inward plateau current. It is dimensionless and self-normalizing (1 at no drug, no kernel threshold needed), and it is propagated through the same Monte-Carlo as qNet, so the assessment reports acqinward_distribution. The mechanism validates it: a pure ICaL/INaL blocker reduces inward charge (cqInward < 1 — protective, the multichannel mechanism that spares verapamil, measured ≈0.81), while a pure IKr blocker prolongs the AP and raises it (cqInward > 1 — proarrhythmic; dofetilide ≈1.18). It completes the computable half of the spec §3 biomarker list (spec v0.4); EAD occurrence (structurally unreachable in the reduced kernel) and the electromechanical window (needs a mechanical model) remain honestly out of reach.
harmonia performance scores either metric against the CiPA expert labels and
prints the full confusion matrix. Honest numbers under the default qNet metric:
| Set | Exact accuracy | Within-one-category |
|---|---|---|
| Training (12) | 10/12 (83%) | 12/12 (100%) |
| Validation (16) | 7/16 (44%) | 16/16 (100%) |
| All 28 | 17/28 (61%) | 28/28 (100%) |
qNet beats the APD90 surrogate (8/12 training, ~82% within-one) on both counts. The validation set is honestly harder on exact 3-way accuracy: many validation drugs have very low free Cmax, so block at 4× EFTPC is sub-IC50 and both metrics underread them. But qNet never makes a catastrophic (two-category) error on any of the 28 compounds — the property that matters most for a safety screen. The durable contribution remains the flip-frequency-under-variability machinery, correct regardless of absolute accuracy.
hERG records can carry dynamic binding kinetics (kon, koff, trapping).
With assess(..., herg_dynamic=True) the kernel integrates a Langmuir binding ODE
instead of applying a static Hill factor. It reduces to the static block at steady
state (verified in tests) but captures use-dependent trapping: dofetilide, the
prototypical trapped blocker, accumulates more block over successive beats and
prolongs the AP further than the static estimate.
The Phase-B Langmuir above was a placeholder. v0.6 sources the actual published CiPA
dynamic-hERG binding kinetics — the Li et al. 2017
IKr-Markov drug-binding model, optimized per drug against Milnes-protocol data and
validated in Li et al. 2019 — for the
12 CiPA compounds that have them, straight from the
FDA/CiPA repository optimal fits. They live on the hERG
records as a first-class, citation-backed cipa_binding block
(spec v0.6):
| Parameter | Meaning |
|---|---|
Kmax |
maximum binding-rate scale (dimensionless) |
Ku |
unbinding rate (ms⁻¹) |
n |
Hill coefficient of the concentration-dependent binding rate |
halfmax |
nᵗʰ power of the half-maximal concentration (nMⁿ) |
Vhalf |
trapping half-voltage (mV) — near 0 ⇒ trapped; very negative ⇒ washes out |
Kt |
shared fixed closed-channel trapping rate, 3.5×10⁻⁵ ms⁻¹ |
The binding model is on = Kmax·Ku·Dⁿ/(Dⁿ + halfmax) to the open channel, unbinding at
Ku, with voltage-dependent trapping Kt/(1+exp(−(V−Vhalf)/6.789)) governing whether
drug bound at depolarization stays trapped through diastole. Opt in with
assess(..., herg_dynamic="cipa"); it is coupled to the reduced kernel's IKr gate via
open-bound/closed-bound sub-states.
The mechanism reproduces the experimentally-reported trapping phenotype without a
tuned number: at a matched concentration a near-zero-Vhalf blocker (dofetilide, −1 mV)
accumulates and retains block beat-over-beat, while a strongly-negative-Vhalf blocker
(verapamil, −97 mV) washes out of the closed channel — the published ordering
dofetilide ≫ terfenadine > verapamil falls straight out of the fitted Vhalf.
What this is, and is not. The data are authoritative published values (shipped
unverifiedper §9 — promotion is a contributor confirming them against the source). The model is an honest Tier-C reduction: it applies the exact CiPA binding kinetics to the reduced kernel's IKr gate, not the full 9-state CiPA Markov IKr (that — and re-validating the AP against it — is declared future work). Because the kinetics equilibrate slowly (the official protocol paces ~1000 beats), the CiPA path is a research/demonstration surface; it is opt-in and changes no default qNet/ΔAPD90 number, threshold, or recorded performance result. Only the 12 compounds with published dynamic data getcipa_binding— Harmonia never fabricates kinetics for the other 16.
Block is driven by the free (unbound) drug concentration, but clinical PK
usually reports the total plasma Cmax; the two differ by the fraction unbound
(fu), often by one to two orders of magnitude. Drug-reference records carry
protein_binding.fraction_unbound, and assess(..., exposure_kind="total")
converts a total concentration to free (free = fu × total) before scaling block
— so a total-concentration PK trajectory, including a Hypnos output, can drive a
Harmonia assessment.
assess_combination extends the thesis to polypharmacy. Block from
independent agents multiplies per channel (the fraction of a current remaining is
the product of each drug's remaining fraction), and the IC50 variability of
every drug is propagated jointly by Monte-Carlo. The result: two drugs that are
each "intermediate" alone can combine into a high classification, and the
combined call carries its own flip frequency.
terfenadine + ondansetron at therapeutic exposures cross into HIGH (qNet 0.21),
with ~+22% extra APD prolongation beyond the worst single agent and a ~34%
classification-flip frequency under joint input variability. The combined safety
call is only as trustworthy as its least-identifiable input — the single-drug
principle, extended to the combination.
Where assess propagates input (IC50) variability, assess_population
propagates physiological variability. Instead of one myocyte it builds a
population of virtual hearts by sampling the kernel conductances (lognormal,
per-channel CVs from a population record), then runs the drug at a fixed
exposure across the population. Individuals with intrinsically reduced
repolarization reserve (low IKr/IKs) cross into high risk while robust individuals
do not, so one drug yields a spread of classifications and a susceptible
fraction.
Two things this view makes visible: dofetilide is high in ~60% of the population (and lower in the resilient tail), while verapamil is high in only ~8%; and a drug the single-cell estimate "misses" — sotalol, weak-hERG at 4× EFTPC — still has a substantial susceptible subpopulation (~30% high), which is exactly the kind of sensitivity gain population approaches are known for (Passini et al. 2017).
This subsystem ships hypothesis-tier (spec.md §3, §10). The conductance CVs are illustrative, not calibrated to human data, so every population assessment is capped at Tier D and stamped NOT FOR PREDICTION. It is a hypothesis-generating methodology view, never a per-patient or population safety claim. (v0.5 refines which myocytes enter the population — see Experimentally-calibrated populations below.)
v0.1's population layer spread a cloud of conductance variability around the healthy mean. v0.3 adds the other half the spec §3 names — a disease or genetic background, a systematic shift of a current. The clinical fact behind it is repolarization reserve (Moss & Kass 2005): the healthy ventricle repolarizes through redundant currents, so blocking one is usually tolerated; a patient with a latent channelopathy has already spent that reserve, and the same drug block that is benign in a healthy myocyte can be torsadogenic against the disease background.
A population record gains an optional conductance_scale — a per-current mean
multiplier applied under the existing variability cloud (g = s_c · λ · g_healthy).
Three congenital long-QT channelopathies ship as records:
| Population | Gene (mechanism) | Shift | conductance_scale |
|---|---|---|---|
lqt1 |
KCNQ1 loss of function | IKs ↓ ~50% | {"IKs": 0.5} |
lqt2 |
KCNH2 / hERG loss of function | IKr ↓ ~50% | {"IKr": 0.5} |
lqt3 |
SCN5A gain of function | late INa (INaL) ↑ ~100% | {"INaL": 2.0} |
The same drug, re-assessed against a reduced-reserve background, crosses into the high-risk class far more often. ranolazine — a low-risk, mostly-INaL blocker — is high in ~5% of the healthy population but ~24–39% against the LQTS backgrounds (most under LQT3, whose enhanced INaL compounds ranolazine's own inward effect); dofetilide on the IKr-deficient LQT2 background is high in ~88%. The reduced kernel reproduces the textbook ordering — LQT2 prolongs the AP most (direct IKr loss), LQT3 lowers qNet most (the inward-current shift) — which is the validation that the mechanism, not a tuned number, is doing the work.
harmonia population dofetilide --population lqt2 # re-assess against an LQT2 backgroundStill strictly hypothesis-tier and never predictive. The magnitudes are illustrative heterozygous-scale shifts, not genotype-calibrated parameters; the qNet/APD thresholds remain the healthy reference; every disease-population assessment is capped at Tier D and stamped NOT FOR PREDICTION. It makes the gene–drug repolarization-reserve interaction visible and quantitative — a mechanism demonstration, never a per-patient or per-genotype safety claim (spec v0.3).
The v0.1/v0.3 population samples the prior conductance cloud and accepts every draw — including the implausible tail where an extreme conductance combination yields a drug-free action potential no real myocyte would show. v0.5 adds the landmark experimentally-calibrated populations-of-models discipline (Britton et al. 2013): a candidate myocyte is admitted only if its drug-free AP biomarkers are physiologically plausible, so the population's members could actually exist before any drug is applied.
A population record gains an optional calibration block — accepted ranges for
the drug-free apd90_ms, vrest_mv, vpeak_mv, and triangulation_ms — and the
calibrated_v0 record ships the illustrative_v0 cloud admitted through that
filter. Empirically ≈92% of drug-free myocytes pass; the triangulation bound is
the dominant filter, which is mechanistically right — high drug-free triangulation
is itself a repolarization-instability marker, exactly the abnormality calibration
removes (here the kernel's drug-free triangulation tail reaches ~250–300 ms against
a ~42 ms baseline).
harmonia population dofetilide --population calibrated_v0 # drug-free-plausible populationStill strictly hypothesis-tier and never predictive. The acceptance ranges are kernel-plausibility bounds — bounds on this reduced kernel's own biomarkers, bracketing its physiological bulk — not a fit to human population electrophysiology. The methodology is Britton 2013; the numbers are not. Calibration buys physiological plausibility, not predictiveness: every calibrated assessment is still capped at Tier D, stamped NOT FOR PREDICTION, and assessed against the healthy qNet/APD thresholds, while the assessment additionally reports the acceptance rate and per-biomarker rejection counts (spec v0.5).
All three population layers — the variability cloud, the LQTS disease backgrounds,
and the calibrated population — are reproduced (and asserted Tier-D / NOT-FOR-PREDICTION)
in the executable notebook
notebooks/03_populations.ipynb, run in CI under
nbmake.
The flip frequency says whether a safety call is unstable; the obvious next
question is which channel's IC50 uncertainty drives it — i.e. which lab
measurement, if pinned down, would most stabilize the call. flip_sensitivity
answers it by re-running the Monte-Carlo with one channel varying at a time
(main effect, "solo-flip") and with one channel frozen while the rest vary
(total effect, "frozen-flip"), using common random numbers so the scenarios are
comparable.
For verapamil (point class LOW, but ~35% flip), the dominant driver is its ICaL block:
channel sources fold solo-flip frozen-flip
ICaL 1* 1.0 38% 30%
IKr 3 3.1 33% 35%
INaL 1* 1.0 0% 37%
dominant uncertainty driver: ICaL — pin this IC50 down first
The insight is sharper than the flip frequency alone: ICaL drives the call yet is
single-source (*), so its spread is a prior, not a measurement — the honest
recommendation is to characterize verapamil's ICaL block across more labs before
trusting the call. This is the load-bearing thesis (input variability governs the
safety call) made actionable, and like every Harmonia output it is an
uncertainty attribution, never a verdict.
Figures regenerate from the dataset with python docs/make_figures.py, and the
headline analysis is reproduced (and asserted) in the executable notebook
notebooks/01_flip_frequency.ipynb, run in
CI under nbmake.
The flip frequency is computed by counting how often the classification changes
over n_mc Monte-Carlo draws — so it is a binomial proportion k / n_mc with
its own sampling error, which shrinks only as you add draws. Reporting it as a
bare number invites over-trust in a noisy estimate, which is exactly the failure
mode the whole project exists to prevent. So v0.7 turns the project's own rule on
its own headline number: every reported flip frequency now carries a Wilson
score 95% confidence interval.
res = harmonia.assess(ds, "verapamil", n_mc=200)
res.classification_flip_frequency # 0.365
res.flip_ci # (0.30, 0.43) Wilson 95% CI at n_mc=200
# the helpers are public and pure
harmonia.wilson_interval(0, 200) # (0.0, 0.019) — NOT (0, 0): an honest upper bound
harmonia.flip_ci(0.365, 200) # (0.30, 0.43)classification-flip frequency: 36% (95% CI 30%–43%, 200 MC draws)
The Wilson interval (not the normal/Wald approximation) is used deliberately:
it stays inside [0, 1] and is non-degenerate at the extremes k = 0 and
k = n — exactly where flip frequencies live. A tight HIGH blocker that flips
0 / 200 times has a Wilson CI of [0, 1.9%], an honest upper bound; the Wald
interval would report a misleading [0, 0]. The same interval is attached to the
combination flip (CombinationAssessment.flip_ci), the all-channels-vary flip
sensitivity (FlipSensitivity.all_vary_flip_ci), the new-lab reproducibility flip
under uq="bayes" (reproducibility_flip_ci), and the population susceptible
fraction (PopulationAssessment.susceptible_fraction_ci). It is purely
additive: every previously-reported value — flip frequencies, class
probabilities, qNet/ΔAPD90, the calibrated thresholds — is byte-identical, and the
n_mc = 0 point-estimate path reports (nan, nan) (no sampling, no interval). See
spec v0.7.
streamlit run dashboard/app.py opens the spec-§6 headline view. It is a pure
presentation layer over the dataset and never shows a bare verdict — every
panel is a distribution, a flip frequency, or a population spread, with
unidentifiable channels flagged and hypothesis-tier surfaces banner-stamped. Five
tabs cover the full analysis surface:
| Tab | What it shows |
|---|---|
| Risk-uncertainty (flip) view | the qNet/ΔAPD90 distribution under IC50 variability, the classification-flip frequency, triangulation + cqInward diagnostics, stability across the three AP-model variants, the per-channel sensitivity attribution, and a dose–response curve. A Bayesian dose-response UQ toggle (v0.2) swaps the moments sampler for the hierarchical posterior and adds the true-value vs new-lab (reproducibility) flip split and the censored / prior-dominated channel flags. |
| Drug combinations | polypharmacy: joint IC50 variability, the interaction term, the combined-class flip frequency, single agents vs the combination. |
| Population-of-models | physiological (between-heart) variability: the susceptible fraction and class spread across a population of virtual myocytes — including the LQTS disease backgrounds (v0.3 mean shift) and the experimentally-calibrated population (v0.5 drug-free-plausibility acceptance, with its acceptance rate and per-biomarker rejection counts). Banner-stamped Tier D / NOT FOR PREDICTION. |
| Browse dataset | every channel-block record with its IC50, tier, max-block, identifiability, source count, fold-range, and review status. |
| About / safety | the guardrails, restated. |
Its data contract (every field, function, and dict key the UI reads) is asserted
in tests/test_dashboard.py, so a drift in the simulate / populations API
fails in CI rather than in front of a user.
Where v0.1 made input variability a first-class field, v0.2 makes it a
first-class inference: the spread of an IC50 stops being a number transcribed
from a table and becomes a posterior, derived under a declared prior from the
underlying data. It is opt-in — uq="moments" (the default) reproduces every
v0.1 number exactly; uq="bayes" swaps the per-channel draw for a
posterior-predictive sample. Full design rationale:
docs/specs/v0.2-bayesian-dose-response-uq.md.
The v0.1 Monte-Carlo built, per channel, a lognormal IC50 draw centered on the
log-geomean with sigma = std(log10(source_ic50s)) (or a hard-coded constant for
a single source) and a fixed Hill coefficient. That is correct but discards
information in three places. v0.2 closes all three:
| Gap in the v0.1 sampler | v0.2 fix |
|---|---|
Spread is a point estimate from 2–3 numbers, and a single-source channel carries a magic DEFAULT_SINGLE_SOURCE_SIGMA. |
A hierarchical posterior whose between-lab SD tau_pop is learned across every multi-source channel in the dataset (learn_tau_pop) and borrowed by sparse channels — a constant becomes an inferred, citable quantity that sharpens as the dataset grows. |
| The Hill coefficient is fixed, even when sources disagree on block steepness. | A joint (IC50, Hill) posterior; Hill uncertainty now propagates into the block factor. |
| Sub-60%-block is a binary exclusion — a "45% block at the top dose" measurement is thrown away entirely. | A one-sided censored posterior: the max-block observation becomes a probit likelihood that bounds the IC50 from below near the recovered top tested dose, with the Hill marginalized over its prior — a proper but wide, heavy-right-tailed posterior. The Tier-D gate is preserved; v0.2 only stops discarding the information. |
| The summary IC50 is transcribed as a point with an assumed spread, even when raw concentration-block points exist. | Raw regime (v0.2.1): a source carrying raw (concentration, fractional_block) points has its (IC50, Hill) and the genuine fit uncertainty inferred from the curve (harmonia.fit_dose_response), weighted into the hierarchical pooling by its actual precision. Reduces exactly to the summary path when no raw data is present. |
The non-drift guarantee. In the well-identified, multi-source, agreed-Hill
limit the posterior mean of log10(IC50) converges to the log-geomean and the
predictive SD to the sample SD — so uq="bayes" and uq="moments" agree where
v0.1 was already right (left panel above; asserted in tests/test_infer.py). A
reviewer can trust v0.2 precisely because it does not move the numbers v0.1 had
right.
Two distinct flip frequencies — and v0.2 never conflates them. The headline
classification_flip_frequency samples the true-value posterior (μ_c, "what is
the drug's IC50?"). A separate, labeled reproducibility_flip_frequency samples the
new-lab predictive (adds the between-lab spread τ_c, "how much would a fresh
replication move the call?"). Both are distributions; neither is a verdict.
Priors are declared inputs, not hidden choices. Every prior lives in
dataset/priors/ as a version-pinned, citable, non-predictive
object (harmonia validate enforces predictive == false — no prior may carry a
risk conclusion). Each posterior reports its prior_sensitivity — the fraction of
posterior variance attributable to the genuinely-subjective priors, probed by
re-inference under a widened prior while the empirical-Bayes tau_pop is held fixed.
A high value is not an error; it is the honest statement "this number is mostly
prior," and it drives the prior-dominance flag. A censored channel is prior-dominated
by construction (centre panel above), and the honest recommendation is unchanged from
v0.1: go measure it at higher doses.
$ harmonia infer verapamil
posteriors drug=verapamil prior=harmonia-ic50-prior-v1 learned between-lab SD tau_pop=0.249 log10
channel n IC50 (nM) q05/med/q95 hill ident priorS rhat ess flags
ICaL 1 81.6/ 203.1/ 526.3 1.09+/-0.10 0.39 0.11 1.000 3723
IKr 3 136.2/ 252.8/ 475.1 1.00+/-0.06 0.69 0.07 1.000 3722
INaL 1 2529.5/ 6900.4/ 16480.0 1.00+/-0.09 0.45 0.11 1.000 3713
Global, interaction-aware sensitivity (Sobol). The v0.1 one-at-a-time
attribution reads main effects but cannot see channel interactions (an IKr–ICaL
trade-off where neither IC50 alone flips the call but their joint variation does).
flip_sensitivity(method="sobol") reports first-order S_i (Janon estimator),
total-effect S_Ti (Jansen estimator), and the interaction load S_Ti − S_i,
each with a bootstrap Monte-Carlo standard error. The dominant-driver recommendation
becomes interaction-aware: a channel can be the dominant total-effect driver while
having a small solo effect — exactly the case OAT misses. The cheap OAT readout
remains the default.
The inference is calibrated, not just plausible. Two synthetic-by-construction §9
gates prove the implementation is correct: simulation-based calibration — data drawn
from the prior and re-inferred yields rank-uniform posteriors (chi-square uniformity
p ≈ 0.6) — and posterior coverage — the 90% credible interval covers the truth ≈ 90%
of the time (measured 0.905). Run them with python dataset/tools/build_posteriors.py --validate or harmonia.simulation_based_calibration / harmonia.posterior_coverage.
Implementation choices (and why), in the family tradition:
| Decision | Rationale |
|---|---|
| Exact direct (grid + conjugate) sampler in pure NumPy, not a PPL/MCMC | The summary regime has 1–3 data points per channel; the between-lab SD is drawn from its collapsed 1-D marginal (the channel mean integrated out in closed form, so no Neal funnel) and the true log-IC50 from a conjugate Normal. Draws are i.i.d. and exactly from the posterior, deterministic, dependency-free, and millisecond-fast; rhat/ess are still reported and trivially satisfied. |
| Posteriors recomputed on demand, not cached into records | Keeps the source of truth (source data + prior) and harmonia.load() a zero-extra-dependency read; reproducibility is asserted by a "run twice → byte-identical" test rather than a brittle cross-platform git diff of 68 regenerated files. |
uq="moments" stays the default |
A non-flag-day rollout (spec v0.2 §11): every v0.2 number is diffable against its v0.1 counterpart, and the backward-compatibility suite guards the moments path byte-for-byte. |
| The 60%-block gate is preserved; censoring is additive | The gate governs the claim (Tier cap); the continuous identifiability_score governs the diagnostic. A continuous score must never launder a sub-threshold channel into a higher tier. |
The v0.2 analysis is reproduced and asserted in
notebooks/02_bayesian_uq.ipynb, run in CI under
nbmake.
| Format | Role | File |
|---|---|---|
| CellML 2.0 | Native language of cardiac EP / the Physiome repo; shared with Nidus | exports/cellml/ |
Myokit .mmt |
The most directly runnable artifact (model + pacing protocol) | exports/myokit/ |
| SBML L3v2 | ODE system → COPASI / Tellurium / BioModels | exports/sbml/ |
| CiPA inputs | The IC50/Hill table the CiPA in-silico tool ingests (+ variability/tier columns) | exports/tables/cipa_inputs.csv |
| SED-ML | Reproducible pacing/simulation protocol, paired with CellML | exports/sedml/ |
COMBINE .omex |
Bundles CellML + SBML + SED-ML + provenance | exports/omex/ |
| CSV / BibTeX | Flat parameter + citation export | exports/tables/ |
Every exported model carries a universal, machine-readable
harmonia:clinicalUse = "PROHIBITED — research / safety-methodology / education only; not a regulatory determination" annotation, plus the propagated tier and
bqbiol:isDescribedBy DOI links as MIRIAM-style RDF. Exports are generated,
never hand-edited — and that is enforced: CI regenerates exports/ on every
push and git diff --exit-codes the committed text artifacts (CellML, SBML,
Myokit, SED-ML, the CiPA/parameter tables, BibTeX), so a committed export that has
drifted from the dataset fails the build, exactly as the dataset itself is guarded
against build_records.py. (The .omex zips are regenerated and manifest-checked
but excluded from the byte diff — zlib's compressed output is not guaranteed
identical across platforms; their text members are covered by the directories
above.) harmonia export --all additionally fails if any round trip drifts. Three
round trips guard
the exports: the CiPA-input export has a true numeric round trip (parse back ⇒
dataset values); the kernel constants are verified to survive the
CellML/SBML/Myokit text; and — the strongest — the ODE round trip
re-integrates the model AST that every CellML/SBML/Myokit export is rendered from
and confirms it reproduces the reference-kernel action potential (≈1e-7 relative
on the V trace, far inside the 1e-4 target), so the exported equations, not
merely the constants, provably match the numeric oracle. Both primary model
formats are then validated against their canonical libraries in CI: every
exported CellML model passes libCellML's Parser + Validator
(cellml.validity_violations — MathML, units, and interface consistency, beyond
the declaration-level cellml.conformance_violations), and every exported
SBML model passes libSBML's checkConsistency
(sbml.consistency_violations) — so "CellML → Physiome/OpenCOR" and
"SBML → COPASI/Tellurium/BioModels" are verified, not asserted. Both formats
declare identical unit metadata. The spec-§7 clinicalUse/tier/DOI RDF is
embedded in every export; in the standalone CellML it is a foreign-namespace
metadata island (CellML 2.0 has no blessed annotation wrapper, unlike SBML's
<annotation>), so libCellML validates the model proper while the identical RDF
also travels in the COMBINE metadata.rdf. Full dimensional validation and the
Myokit/OpenCOR numeric cross-check against the canonical ORd CellML remain an
optional local step (they need a heavy simulation engine, so are not run in CI).
Everything downstream of the dataset is a deterministic projection, so the test suite (211 tests, all run in CI on Python 3.9 / 3.11 / 3.12) is mostly about provable non-drift rather than fixtures:
| Guard | What it proves | Where |
|---|---|---|
| Lint | no dead imports, undefined names, or unused variables (Pyflakes + pycodestyle) | ruff check |
| Type-check | the package's py.typed contract holds — every public load / simulate / infer / export signature checks under mypy (no implicit Optional, no unused ignores), so the typed views downstream tools rely on cannot silently drift |
mypy |
| Dataset reproducibility | dataset/records regenerates byte-identically from build_records.py (the provenance log) — the curated table is the dataset |
CI git diff --exit-code |
| Export reproducibility | the committed exports/ text artifacts (CellML/SBML/Myokit/SED-ML/tables/BibTeX) regenerate byte-identically from the dataset — a hand-edit or a stale sample fails the build |
CI git diff --exit-code |
| Schema + semantic validation | every record satisfies the JSON Schema; the reliability gate (block < 60% ⟺ Tier D ⟺ failure-mode) and variability bookkeeping hold | harmonia validate |
| Prior registry validity (v0.2) | every prior schema-validates, its id matches its filename, every cited key resolves, and predictive == false (no prior carries a risk conclusion) |
harmonia validate |
| Bayesian reduction / non-drift (v0.2) | the posterior mean → log-geomean for a multi-source channel; the moments path is byte-identical, so uq="moments" reproduces v0.1 exactly |
tests/test_infer.py, tests/test_uq_assess.py |
| Sampler convergence + censoring (v0.2) | every channel posterior meets rhat < 1.01 / an ess floor; a sub-60%-block channel yields a wide, prior-dominated one-sided posterior and still produces a Tier-D assessment |
tests/test_infer.py |
| Inference calibration (v0.2.1, §9) | simulation-based calibration → rank-uniform posteriors; 90% credible interval covers the truth ≈90%; raw dose-response fit recovers a synthetic IC50/Hill | tests/test_infer_raw.py |
| Disease populations (v0.3) | the conductance_scale mean shift is applied correctly; LQTS backgrounds raise the susceptible fraction in the textbook order; every disease assessment is Tier D / NOT FOR PREDICTION; the healthy population is byte-identical |
tests/test_disease_populations.py |
| Calibrated populations (v0.5) | every accepted myocyte's drug-free biomarkers are in range; the abnormal-repolarization tail is rejected (triangulation-dominated); the calibrated assessment is Tier D / NOT FOR PREDICTION; the uncalibrated path is byte-identical (shared RNG draw) | tests/test_calibrated_populations.py |
| cqInward biomarker (v0.4) | control identity (=1 at no drug); ICaL/INaL block lowers it (<1), IKr block raises it (>1); propagated as a distribution; adding it changes no qNet/flip number | tests/test_cqinward.py |
| CiPA dynamic-hERG kinetics (v0.6) | the 12 CiPA dynamic-fit compounds carry cipa_binding (the rest don't); zero-drug ⇒ no block; block rises with concentration; the trapping phenotype (dofetilide retains ≫ verapamil washes out); the opt-in path leaves the default classification unchanged |
tests/test_cipa_binding.py |
| Sobol consistency (v0.2) | total-effect ≥ first-order per channel (within MC error); indices deterministic; standard errors reported | tests/test_sobol.py |
| Flip-frequency CI (v0.7) | the Wilson interval matches the textbook value, stays in [0,1], is non-degenerate at k=0/k=n, narrows ∝ 1/√n, and brackets the reported flip frequency; adding it changes no flip/susceptible/metric value (n_mc=0 ⇒ (nan, nan)) |
tests/test_flip_ci.py |
| CiPA numeric round trip | export the CiPA CSV, parse it back, every IC50/Hill equals the dataset value | registry.roundtrip_cipa |
| Parameter round trip | the kernel conductances appear verbatim in the CellML/SBML/Myokit text | registry.roundtrip_parameters |
| ODE round trip | the model AST re-integrates to the reference-kernel AP within ≈1e-7 — the exported equations match the oracle, not just the constants | registry.roundtrip_ode |
| CellML unit conformance | every variable and <cn> carries a defined/built-in unit, no dangling references |
cellml.conformance_violations |
| CellML validity (canonical) | the exported model passes libCellML's Parser + Validator (MathML, units, interfaces) — the canonical CellML 2.0 library | cellml.validity_violations |
| SBML validity (canonical) | the exported SBML passes libSBML checkConsistency with zero errors; every parameter declares units |
sbml.consistency_violations |
| SED-ML reference resolution | every task→model/sim, variable→task, curve→dataGenerator reference resolves; the model source points at a file that exists |
sedml.reference_violations |
| OMEX manifest consistency | the COMBINE manifest lists exactly the archive's files, with one master | combine.manifest_violations |
| Executable notebooks | the three analyses (flip-frequency, Bayesian UQ, population-of-models) run clean and every inline assertion holds | nbmake notebooks/ |
| Dashboard data contract | the headline UI's data API — every field, function, and dict key each of the five tabs reads (flip + Bayesian-UQ toggle, combinations, population-of-models incl. disease/calibrated, browse) still exists | tests/test_dashboard.py |
| Recorded performance | the honest, live confusion matrix vs CiPA expert labels — never hidden behind one accuracy number | harmonia performance |
harmonia export --all runs all of these checks (the three round trips, CellML
unit conformance + libCellML validity, SBML validity, SED-ML reference
resolution, and OMEX manifest consistency) and exits non-zero on any drift, so a
regenerated artifact that disagrees with the dataset or kernel fails the build. The OpenCOR/libcellml dimensional cross-check
against the canonical ORd CellML is the one validation deliberately left out of
CI (it needs a heavy engine); it remains an optional local step.
| Decision | Rationale |
|---|---|
| Pure-Python reference kernel (SciPy); Myokit/OpenCOR optional | Validation must not depend on a heavy engine at load time; exports are artifacts. |
| Dataset is the centerpiece; everything else is a projection | The durable contribution is the curated, multi-source, tier-annotated block parameters. |
source_values + assay_context first-class; variability propagated |
Input variability — not the model — is the dominant uncertainty; making it machine-enforced is the load-bearing idea. |
| Output is a risk distribution + flip frequency, never a bare class | A single high/intermediate/low label hides exactly the uncertainty that matters. |
| Worst input wins; unidentifiable IC50 caps at D | A safety call is only as trustworthy as its least-identifiable channel. |
| One AST → all model exports | Myokit/CellML/SBML provably cannot drift from each other or the kernel. |
| qNet default (via a shape-dependent INaCa), APD90 selectable; kernel honestly Tier C | qNet is the CiPA-canonical metric and now discriminates; the reduced kernel is not the qualified ORd, so it stays Tier C and over-claims nothing. |
| Methodology only; never a regulatory or clinical determination | The line is making (or appearing to make) a safety verdict. |
harmonia/
├── spec.md # the design spec (v0.1)
├── dataset/ # SOURCE OF TRUTH
│ ├── schema/ # JSON Schema + JSON-LD context
│ ├── records/ # one JSON per channel-block / AP-model / drug-reference record
│ ├── citations/ # Crossref/PubMed-checked citation records
│ └── tools/build_records.py # provenance log: the curated table → records (CI checks reproducibility)
├── python/harmonia/
│ ├── load.py · validate.py · filter.py · records.py
│ ├── simulate.py # Monte-Carlo variability → risk distribution + flip view; dynamic hERG; combinations
│ ├── exposure.py # free ↔ total plasma concentration (protein binding)
│ ├── populations.py # population-of-models risk spread + Britton-2013 calibration (hypothesis-tier, Tier D)
│ ├── performance.py # score the kernel vs CiPA expert labels (confusion matrix)
│ ├── cli.py
│ └── export/
│ ├── reference.py # SciPy ORd-lineage AP kernel + risk metrics (the oracle)
│ ├── model_spec.py # one expression AST → Myokit / CellML / SBML
│ ├── cellml.py · myokit.py · sbml.py · sedml.py · cipa_inputs.py
│ ├── csv_bibtex.py · annotate.py · combine.py · registry.py
├── dashboard/app.py # Streamlit: flip view (+ Bayesian UQ) · combinations · population-of-models · browse
├── notebooks/ # executable analyses, run in CI under nbmake
├── tests/ # 211 tests: dataset, kernel, qNet, simulate, dynamic binding, CiPA hERG kinetics, flip-frequency CIs, exposure, combinations, populations (incl. calibrated), performance, Bayesian UQ, exports (round trips + unit conformance + SED-ML/OMEX integrity), CLI, dashboard contract
├── docs/ # essay, figures, make_figures.py
├── CHANGELOG.md · .zenodo.json # release metadata
└── exports/ # sample generated artifacts (regenerated in CI)
Non-negotiable (spec.md §10). Harmonia is NOT a clinical tool, NOT a regulatory safety determination, and NOT a verdict that a drug is safe or unsafe. It is research, method development, education, and support for — never replacement of — the CiPA paradigm.
- No bare safety classification as an authoritative output. Harmonia reports a risk-metric distribution with its full input uncertainty and a classification-flip frequency.
- Unidentifiable inputs are stated, not imputed (max block < 60% ⇒ Tier D).
- Every export carries
clinicalUse = PROHIBITED. - The
populationssubsystem (Phase E) ships hypothesis-tier, non-predictive.
The tell that the project has crossed its line: any feature that emits a single, authoritative "this drug is safe/unsafe" verdict without its uncertainty. That feature does not get built.
| Phase | Content | Status |
|---|---|---|
| A — CiPA spine | Channel block + ORd kernel + risk metric for the 12 training drugs, end to end, with exports, validation, and the flip view | ✅ |
| B — Dynamic hERG + validation | Dynamic (Langmuir + trapping) hERG binding; the 16 validation drugs (28 CiPA compounds total); recorded classification performance | ✅ |
| C — Variability layer | Discriminating qNet via a shape-dependent Na-Ca exchanger ✅; the published CiPA dynamic-hERG optimized kinetics sourced + a binding model shipped (v0.6) ✅. Remaining: the full 9-state CiPA Markov IKr (the kinetics are coupled to the reduced gate, a Tier-C reduction); broader multi-source aggregation; ToR-ORd reformulation | ◧ |
| D — Exposure layer | Free ↔ total plasma conc + protein binding (composable with Hypnos); drug-combination assessment | ✅ |
| E — Populations | Population-of-models risk spread ✅; disease & genetic backgrounds (LQTS, v0.3) ✅; experimentally-calibrated populations (Britton 2013, v0.5) ✅ — all shipped non-predictive (Tier D). Remaining: real-data-calibrated (not kernel-plausibility) populations | ✅ |
| F — Hardening | Declaration-level CellML unit-conformance check (in CI) ✅; executable nbmake notebooks (3) ✅; .zenodo.json + CHANGELOG.md ✅; Wilson 95% CIs on every Monte-Carlo flip frequency (v0.7) ✅. Remaining: full dimensional validation + the Myokit/OpenCOR cross-check against the canonical ORd CellML (optional local step); minted Zenodo DOI on first tagged release |
◧ |
This release (v0.7): the headline flip frequency — the number the whole dataset exists to compute — now carries its own Wilson 95% confidence interval, because it is a Monte-Carlo estimate and the project's rule is never a number without its uncertainty. Purely additive; no previously-reported value moves. See spec v0.7.
- Code: MIT (LICENSE). Dataset: CC-BY-4.0 (LICENSE-DATASET).
- When you use a record, cite Harmonia and the primary source(s) named in
that record (
record.primary_citation.doi). Machine-readable metadata in CITATION.cff and.zenodo.json; a concept DOI is minted on the first tagged release. Changes are tracked in CHANGELOG.md.
Harmonia shares CellML with Nidus (the Physiome lineage) and composes with Hypnos: a drug's free-plasma-concentration trajectory (Hypnos PK) can scale Harmonia's channel block, giving an open, tier-annotated PK → proarrhythmia chain — the dosing and safety ends of the same molecule, in one toolchain.











{ "id": "channel_block.dofetilide.ikr", "kind": "channel_block", "drug": { "name": "dofetilide", "unii": "R4Z9X1N42Q" }, "channel": "IKr", "parameters": [ { "symbol": "IC50", "value": { "central": 5.06, "low": 4.0, "high": 6.6, "units": "nM" } } ], "assay_context": { "max_block_observed_percent": 95 }, // <60 => unidentifiable => Tier D "source_values": [ // the variability, made first-class { "ic50_nm": 4.9, "platform": "manual", "citation": "crumb-2016" }, { "ic50_nm": 6.6, "platform": "automated", "citation": "kramer-2013" }, { "ic50_nm": 4.0, "platform": "manual", "citation": "li-2017" } ], "variability": { "fold_range": 1.65, "n_sources": 3, "iqr_nm": [4.45, 5.75] }, "tier": "A", "extraction": { "review_status": "unverified" } // honest by default — see below }