Open-source infrastructure for reproducible quantitative research.
A local-first research validation control plane for crypto: it enforces deterministic dataset identity, run identity, fold causality, and fail-closed promotion so that only attested, reproducible results can be promoted. Governance and auditability are enforced at the DB and gatekeeper layers. No API keys, no trading — validation and reports only.
Who this is for
- Researchers who need reproducible validation and promotion gates
- Research platform / quant infra engineers who need auditability and determinism
What this is (in six bullets):
- Validation control plane — Governs whether a research result is eligible for promotion (candidate/accepted). Does not execute or trade.
- Content-addressed datasets — dataset_id_v2 hashes logical content of allowlisted SQLite tables; one row change changes the id. STRICT mode required for promotion.
- Deterministic run identity — run_key (semantic) and run_instance_id (execution); seeds derived from run_key + salt + version; same config + dataset → same run_key.
- Fold-causality enforcement — Purge/embargo in walk-forward splits; train-only fit; attestation artifact required for candidate/accepted when walk-forward is used.
- Fail-closed promotion — Candidate and accepted require a passing eligibility report; DB triggers block promotion without it; referenced eligibility reports are immutable.
- Append-only governance and lineage — governance_events, artifact_lineage, and artifact_edges are append-only; audit trace from accepted → inputs/configs/artifacts.
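The content-addressing idea in the bullets above can be sketched as follows. This is a minimal illustration, not the project's dataset_id_v2 code: the helper name dataset_id_sketch, the example table bars_1h, and the canonicalization choices (sorted table names, rows ordered by every column, JSON serialization) are all assumptions.

```python
import hashlib
import json
import sqlite3


def dataset_id_sketch(conn: sqlite3.Connection, allowlist: list) -> str:
    """Hash the logical content of allowlisted tables in a canonical order.

    Hypothetical helper. Table names must come from a trusted allowlist,
    because they are interpolated into SQL.
    """
    digest = hashlib.sha256()
    for table in sorted(allowlist):
        cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        col_list = ", ".join(f'"{c}"' for c in cols)
        # Order by every column so physical row order does not affect the hash.
        query = f'SELECT {col_list} FROM "{table}" ORDER BY {col_list}'
        digest.update(json.dumps([table, cols]).encode("utf-8"))
        for row in conn.execute(query):
            digest.update(json.dumps(list(row)).encode("utf-8"))
            digest.update(b"\n")
    return digest.hexdigest()


def _demo_db(rows):
    """Build a throwaway in-memory DB with one example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bars_1h (ts TEXT, symbol TEXT, close REAL)")
    conn.executemany("INSERT INTO bars_1h VALUES (?, ?, ?)", rows)
    return conn


rows = [("2024-01-01T00:00", "SOL", 100.0), ("2024-01-01T01:00", "SOL", 101.5)]
changed = [rows[0], ("2024-01-01T01:00", "SOL", 101.6)]

id_a = dataset_id_sketch(_demo_db(rows), ["bars_1h"])
id_b = dataset_id_sketch(_demo_db(list(reversed(rows))), ["bars_1h"])  # same content, other insert order
id_c = dataset_id_sketch(_demo_db(changed), ["bars_1h"])  # one value changed
```

Here id_a and id_b agree because only logical content is hashed, while id_c differs because a single value changed.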
| If you want to… | Read this |
|---|---|
| Quickstart (5 minutes) | Quickstart → venv, install, minimal report path. |
| For researchers | Trust model in practice, Key guarantees, Core workflows, Determinism & reproducibility. |
| For engineers | Architecture at a glance, CLI cheatsheet, Development / Verification. |
| For reviewers | Trust model in practice, Promotion model, Auditability, Methods & limits, References. |
Prerequisites: Python 3.10+. No API keys (public endpoints only). Run all commands from the repo root after cloning.
SQLite: The default database file is dex_data.sqlite at the repo root (config.yaml → db.path). Poll, materialize, and most flows use this path (also resolved when relative). Override with CRYPTO_DB_PATH or per-command --db.
uv sync --frozen
uv run crypto-analyzer --help
uv run crypto-analyzer doctor
Exact verification commands (same as CI): see CONTRIBUTING.
Minimal path to a research report (after install):
uv run crypto-analyzer doctor
uv run crypto-analyzer universe-poll --universe --universe-chain solana --interval 60
uv run crypto-analyzer materialize --freq 1h
uv run crypto-analyzer reportv2 --freq 1h --out-dir reports --hypothesis "baseline momentum"
One-command demo: uv run crypto-analyzer demo
Offline path (no network): Install, then run init, demo-lite, and check-dataset. No config or live data required. CI smoke is for internal stability; demo-lite is for developer onboarding.
uv run crypto-analyzer init
uv run crypto-analyzer demo-lite
uv run crypto-analyzer check-dataset --db dex_data.sqlite
If you prefer pip or uv is not available:
python -m venv .venv
.\.venv\Scripts\activate
python -m pip install -U pip setuptools wheel
python -m pip install -e ".[dev]"
python -m crypto_analyzer --help
crypto-analyzer --help
Then run commands as crypto-analyzer <command> or python -m crypto_analyzer <command>.
On Windows you can use .\scripts\run.ps1 <command> as a convenience wrapper. It uses VIRTUAL_ENV if set, otherwise .venv at repo root, and invokes python -m crypto_analyzer <command> (no reliance on PATH). See README Quickstart if the script reports venv not found.
- crypto-analyzer not found — Use python -m crypto_analyzer <command>; it always works when the package is installed.
- uv sync fails — Check uv --version. Install uv with python -m pip install -U uv. Run from repo root.
- Doctor reports "Not running inside a virtual environment" — Activate the venv (e.g. .\.venv\Scripts\activate) or use uv run crypto-analyzer doctor so uv runs inside its environment.
- run.ps1 fails — Ensure .venv exists at repo root and you are in the repo root when running the script.
After install, confirm the CLI works (from repo root):
uv run crypto-analyzer doctor
Fallback: python -m crypto_analyzer doctor (with venv activated).
- Boundaries — CI enforces import/boundary rules so that execution, broker, and CLI layers cannot be part of the research control plane (core/governance). See Validation control plane audit.
- Reproducibility — Content-addressed datasets, deterministic run identity, and seeded RNG; promotion requires STRICT dataset hash and provenance. See Determinism & reproducibility and Key guarantees below.
- Fold causality — Purge/embargo and train-only fit are enforced; attestation is required for candidate/accepted when walk-forward is used. See Key guarantees below and Methods & Limits.
- dataset_id_v2 — Content-addressed hashing of allowlisted tables (canonical ordering); STRICT for promotion.
- run_key + run_instance_id — Semantic run identity and execution instance; run_key excludes timestamps/paths.
- Deterministic RNG — seed_root(run_key, salt, version); salted, reproducible across processes; seed_version in artifacts.
- Calibration harness — BH/BY, RC, RW, CSCV PBO, bootstrap, HAC: CI-safe Type I (and FDR/RC/RW) checks; guards, not full statistical certification.
- Fold-causality + attestation — Purge/embargo, train-only fit; attestation artifact with schema version; gatekeeper requires valid attestation when walk-forward used.
- Fail-closed promotion — Eligibility reports + DB triggers; no candidate/accepted without linked passing report at same level; evidence immutable when referenced.
- Append-only governance_events — All evaluate/promote actions logged; no updates or deletes.
- artifact_lineage + artifact_edges — Audit graph from accepted → run → configs/versions/artifacts.
- SQLite authoritative — Single source of truth for governance and lineage; optional DuckDB analytics backend (read-only for governance).
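The append-only guarantee listed above can be illustrated with SQLite triggers. This is a sketch under assumptions: the column set and trigger names are hypothetical, and the project's real Phase 3 migrations may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed minimal shape; the real table has more columns.
    CREATE TABLE governance_events (
        event_id INTEGER PRIMARY KEY AUTOINCREMENT,
        candidate_id TEXT NOT NULL,
        action TEXT NOT NULL
    );
    CREATE TRIGGER governance_events_no_update
    BEFORE UPDATE ON governance_events
    BEGIN
        SELECT RAISE(ABORT, 'governance_events is append-only');
    END;
    CREATE TRIGGER governance_events_no_delete
    BEFORE DELETE ON governance_events
    BEGIN
        SELECT RAISE(ABORT, 'governance_events is append-only');
    END;
    """
)
conn.execute(
    "INSERT INTO governance_events (candidate_id, action) VALUES ('c1', 'evaluate')"
)


def is_blocked(sql: str) -> bool:
    """Return True when the statement is rejected by a trigger."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


update_blocked = is_blocked(
    "UPDATE governance_events SET action = 'promote' WHERE candidate_id = 'c1'"
)
delete_blocked = is_blocked("DELETE FROM governance_events WHERE candidate_id = 'c1'")
row_count = conn.execute("SELECT COUNT(*) FROM governance_events").fetchone()[0]
```

Inserts still succeed, so the log can only grow; both the update and the delete are rejected and the original row remains.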
| Risk | Control | Enforcement mechanism | Verified by |
|---|---|---|---|
| Data drift | dataset_id_v2 | Content-addressed hash + STRICT requirement for promotion | test_dataset_v2.py |
| Seed drift | seed_root + versioned salts | SEED_ROOT_VERSION; deterministic RNG across processes | Deterministic tests (e.g. test_reportv2_deterministic_rerun, test_statistics_research) |
| Promotion bypass | DB triggers | candidate/accepted require linked passing eligibility_report_id; trigger blocks UPDATE/DELETE without it | test_migrations_phase3.py |
| Leakage (fold) | fold_causality_attestation | Purge/embargo, train-only fit; gatekeeper requires valid attestation when walk-forward used | test_fold_causality_attestation.py, test_promotion_requires_fold_causality_attestation.py, test_transform_fit_called_only_on_train.py |
| RC provenance ambiguity | rc_summary schema version + seed_root | rc_summary_schema_version; seed_root/component_salt in RC summary; gatekeeper version check | test_calibration_rc_smoke.py, test_promotion_gating.py |
| Artifact mutability | sha256 + artifact_lineage | compute_file_sha256; artifact_lineage rows with sha256; append-only lineage triggers | test_artifact_lineage_*.py, test_lineage_reproducibility_same_run_key_same_hashes.py |
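As an illustration of the seed-drift control in the table above, a seed can be derived from stable inputs only. Only the signature seed_root(run_key, salt, version) comes from this README; the hashing scheme, the 64-bit width, and the value of SEED_ROOT_VERSION below are assumptions.

```python
import hashlib
import random

SEED_ROOT_VERSION = 1  # assumed value; the real constant lives in the project


def seed_root(run_key: str, salt: str, version: int = SEED_ROOT_VERSION) -> int:
    """Derive a 64-bit seed from stable inputs (sketch, not the project's code)."""
    payload = f"v{version}|{run_key}|{salt}".encode("utf-8")
    # hashlib is stable across processes, unlike the built-in hash() on str.
    return int.from_bytes(hashlib.sha256(payload).digest()[:8], "big")


bootstrap_seed = seed_root("run_key_abc", salt="bootstrap")
rc_seed = seed_root("run_key_abc", salt="reality_check")

# Two generators built from the same seed produce the same stream.
rng_a = random.Random(bootstrap_seed)
rng_b = random.Random(bootstrap_seed)
draws_a = [rng_a.random() for _ in range(3)]
draws_b = [rng_b.random() for _ in range(3)]
```

Different salts give each component (bootstrap, Reality Check) its own stream, and bumping the version changes every derived seed at once, which is why recording seed_version in artifacts matters.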
Why deterministic IDs? So every run is traceable and repeatable: same inputs and config produce the same dataset_id_v2, run_key, and artifact hashes. That lets you compare runs, invalidate caches when data changes, and prove reproducibility in audits. Why opt-in migrations? Phase 3 (regimes, promotion, lineage) adds schema and behavior that not every user needs. Why governance modeling? Research that moves toward production needs a path from “exploratory” to “accepted” with clear gates (eligibility reports, fold attestation, RC/RW when enabled) and an append-only audit log.
If you only read one thing: run the Golden acceptance run and inspect the DB-only audit trace.
- Golden acceptance run — Copy-paste PowerShell steps for a minimal and full proof: deterministic run, promotion to accepted, trigger check, and DB-only audit trace. One-command-ish proof of accepted promotion and provenance.
- Methods & implementation alignment — Mapping from method (dataset_id_v2, RC/RW, seed_root, schema versions, etc.) to code and artifact keys.
Signal discovery is treated as a multiple-testing problem under dependence. Key controls: walk-forward (purge/embargo, train-only fit, fold-causality attestation); deflated Sharpe with Neff; PBO-style/CSCV; BH/BY; optional Reality Check and Romano–Wolf; HAC mean inference. Details: Methods & Limits, Statistical Methods Appendix, implementation-aligned formulae.
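To make the multiple-testing framing concrete, here is a small sketch of the Benjamini–Hochberg step-up rule in plain Python. It illustrates the published procedure, not the project's own implementation, and the p-values are made up for the example.

```python
def benjamini_hochberg(p_values: list, alpha: float = 0.05) -> list:
    """Return one reject flag per hypothesis using the BH step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


p_values = [0.001, 0.008, 0.039, 0.041, 0.20]  # illustrative only
flags = benjamini_hochberg(p_values, alpha=0.05)
```

With these inputs only the first two hypotheses are rejected. The Benjamini–Yekutieli variant, which stays valid under arbitrary dependence, uses the same rule with alpha divided by the harmonic sum 1 + 1/2 + ... + 1/m.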
flowchart TB
subgraph Data["Data layer"]
SQLiteInputs[("SQLite research tables\n(bars_*, snapshots, universe)")]
DatasetHash["dataset_id_v2\ncontent-addressed hash\n(STRICT/FAST_DEV)"]
SQLiteInputs --> DatasetHash
end
subgraph RunIdentity["Run identity"]
RunKey["run_key\n(semantic)"]
RunInstanceId["run_instance_id\n(execution)"]
RunContext["RunContext\n(seed_version, versions)"]
RunKey --> RunContext
RunInstanceId --> RunContext
DatasetHash --> RunKey
end
subgraph Eval["Evaluation"]
FoldSplits["Fold splits\n(purge/embargo)"]
Scoring["Scoring / metrics"]
FoldSplits --> Scoring
end
subgraph Validation["Validation"]
BHBY["BH/BY, RC, RW\nCSCV PBO, HAC"]
CalHarness["Calibration harness"]
BHBY --> CalHarness
end
subgraph Artifacts["Artifacts"]
ValBundle["validation_bundle\nfold_causality_attestation\nrc_summary"]
end
subgraph Governance["Governance"]
EvalElig["evaluate_eligibility\n→ eligibility_reports"]
PromCand["promotion_candidates\n(exploratory→candidate→accepted)"]
DBTriggers["DB triggers\nfail-closed"]
GovEvents["governance_events\n(append-only)"]
EvalElig --> PromCand
PromCand --> DBTriggers
EvalElig --> GovEvents
PromCand --> GovEvents
end
subgraph Lineage["Lineage"]
ArtLineage["artifact_lineage\nartifact_edges"]
end
RunContext --> Eval
Eval --> Validation
Validation --> Artifacts
Artifacts --> EvalElig
Artifacts --> ArtLineage
Governance --> Lineage
Full diagram source: docs/architecture/validation_control_plane.mmd.
- Ingest — Poll writes to spot_price_snapshots, sol_monitor_snapshots, universe tables. run_migrations applies core + v2 factor tables.
- Bars — Raw snapshots → deterministic OHLCV bars (5min, 15min, 1h, 1D). Idempotent.
- Factors — Rolling OLS (or optional Kalman) vs BTC/ETH → residual returns. Materialized to factor_model_runs, factor_betas, residual_returns; identified by dataset_id and factor_run_id.
- Signals — Cross-sectional factors; winsorized z-scores; signal panels.
- Validation — IC, IC decay; per-signal ValidationBundle (paths, metrics). Fold causality: purge/embargo, attestation when walk-forward used.
- Corrections — Deflated Sharpe, PBO proxy, block bootstrap, BH/BY; optional Reality Check (reportv2 --reality-check, family_id).
- Reporting — reportv2; optional regime-conditioned IC with --regimes REGIME_RUN_ID; Streamlit dashboard; experiment registry; manifests.
- Promotion — Create candidate; evaluate eligibility; promote to candidate/accepted via governance entrypoint; all actions logged to governance_events.
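The purge/embargo step in the Validation stage can be sketched with index arithmetic. The function names and the expanding-window layout are assumptions for illustration; the project's fold_causality module may split differently.

```python
def purged_train_indices(n, test_start, test_end, purge, embargo, past_only=True):
    """Indices allowed for training given a test block [test_start, test_end).

    purge drops observations just before the test block (their labels may
    overlap it); embargo drops observations just after it. With
    past_only=True (walk-forward) nothing after the test block is used,
    so only the purge gap has an effect.
    """
    before = list(range(0, max(0, test_start - purge)))
    if past_only:
        return before
    after = list(range(min(n, test_end + embargo), n))
    return before + after


def walk_forward_folds(n, n_folds, min_train, purge):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    test_size = (n - min_train) // n_folds
    for k in range(n_folds):
        start = min_train + k * test_size
        end = start + test_size
        train = purged_train_indices(n, start, end, purge, embargo=0)
        yield train, list(range(start, end))


folds = list(walk_forward_folds(n=20, n_folds=2, min_train=10, purge=2))

# Non-walk-forward case, where the embargo after the test block matters.
cv_train = purged_train_indices(20, 5, 10, purge=2, embargo=3, past_only=False)
```

In every walk-forward fold the last training index sits at least purge observations before the first test index, which is the property a fold-causality attestation would record.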
Run any command as crypto-analyzer <command> [args...] or python -m crypto_analyzer <command> [args...] (cross-platform). On Windows, .\scripts\run.ps1 <command> [args...] is a convenience wrapper that invokes the same CLI.
| Command | Description |
|---|---|
| doctor | Preflight: environment, DB schema, pipeline smoke test |
| doctor --ci | CI-safe preflight (no network, temp DB, migrations + tables) |
| smoke --ci | Synthetic-data, no-network smoke (migrations, dataset_id_v2, run identity) |
| init | Create local SQLite DB and run migrations (default dex_data.sqlite; optional --phase3) |
| demo-lite | Synthetic dataset, no network; run after init (same default DB: dex_data.sqlite) |
| poll | Single-pair data poll (provider fallback) |
| universe-poll --universe ... | Multi-asset universe discovery (e.g. --universe-chain solana) |
| materialize | Build OHLCV bars (e.g. --freq 1h) |
| reportv2 | Research report: IC, PBO, QP; optional --regimes, --reality-check, --execution-evidence when Phase 3 enabled |
| walkforward | Walk-forward backtest, out-of-sample fold stitching |
| promotion | Promotion subcommands: list, create, evaluate |
| verify | Full gate: doctor → pytest → ruff → research-only boundary → diagram export |
| test | Run pytest |
| streamlit | Interactive dashboard |
| demo | One-command demo: doctor → poll → materialize → report |
| check-dataset | Inspect dataset fingerprints and row counts |
- exploratory — No gate; warnings only.
- candidate — Requires passing evaluate_eligibility(..., level="candidate"): STRICT dataset_id_v2, run_key, engine_version, config_version, seed_version; fold attestation when walk-forward used; RC/RW contract when enabled. Result stored in eligibility_reports; DB trigger blocks status without linked passing report.
- accepted — Same fail-closed requirement at level accepted; eligibility_report_id and report level must match status.
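The fail-closed rule for candidate and accepted can be sketched as a SQLite trigger. The table shapes and the trigger name are assumptions reduced to the columns named in this README, and the sketch covers UPDATE only; the real migrations may guard more paths.

```python
import sqlite3

SCHEMA = """
CREATE TABLE eligibility_reports (
    eligibility_report_id TEXT PRIMARY KEY,
    level TEXT NOT NULL,
    passed INTEGER NOT NULL
);
CREATE TABLE promotion_candidates (
    candidate_id TEXT PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'exploratory',
    eligibility_report_id TEXT
);
-- Fail closed: abort unless a passing report at the same level is linked.
CREATE TRIGGER block_promotion_without_report
BEFORE UPDATE OF status ON promotion_candidates
WHEN NEW.status IN ('candidate', 'accepted')
BEGIN
    SELECT RAISE(ABORT, 'promotion requires a linked passing eligibility report')
    WHERE NOT EXISTS (
        SELECT 1 FROM eligibility_reports AS r
        WHERE r.eligibility_report_id = NEW.eligibility_report_id
          AND r.passed = 1
          AND r.level = NEW.status
    );
END;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO promotion_candidates (candidate_id) VALUES ('c1')")

# Without a linked report the status change is rejected.
try:
    conn.execute(
        "UPDATE promotion_candidates SET status = 'candidate' WHERE candidate_id = 'c1'"
    )
    blocked = False
except sqlite3.DatabaseError:
    blocked = True

# With a passing report at the matching level it goes through.
conn.execute("INSERT INTO eligibility_reports VALUES ('r1', 'candidate', 1)")
conn.execute(
    "UPDATE promotion_candidates SET status = 'candidate', eligibility_report_id = 'r1' "
    "WHERE candidate_id = 'c1'"
)
status = conn.execute("SELECT status FROM promotion_candidates").fetchone()[0]
```

Putting the check in the database means a caller that bypasses the Python gatekeeper still cannot set the status without evidence.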
Walk-forward runs require a valid fold-causality attestation (schema version, purge_applied, embargo_applied, train_only_fit_enforced) for candidate/accepted.
Promotion gating is policy-only and does not perform I/O; filesystem and evidence loading live in the service and execution_evidence layer.
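A policy-only check of this kind can be written as a pure function over already-loaded data. In the sketch below the function name, the schema_version key, and the supported version set are assumptions; the three boolean field names come from the attestation description above.

```python
from typing import Optional

REQUIRED_FLAGS = ("purge_applied", "embargo_applied", "train_only_fit_enforced")
SUPPORTED_SCHEMA_VERSIONS = {1}  # assumed value for illustration


def attestation_blockers(attestation: Optional[dict], walk_forward_used: bool) -> list:
    """Return a list of blockers; an empty list means the check passes.

    Pure policy: takes an already-parsed attestation and does no I/O.
    """
    if not walk_forward_used:
        return []
    if attestation is None:
        return ["missing fold-causality attestation"]
    blockers = []
    if attestation.get("schema_version") not in SUPPORTED_SCHEMA_VERSIONS:
        blockers.append("unsupported attestation schema version")
    for flag in REQUIRED_FLAGS:
        # Fail closed: anything other than an explicit True is a blocker.
        if attestation.get(flag) is not True:
            blockers.append(f"{flag} is not true")
    return blockers


good = {
    "schema_version": 1,
    "purge_applied": True,
    "embargo_applied": True,
    "train_only_fit_enforced": True,
}
bad = dict(good, embargo_applied=False)

good_result = attestation_blockers(good, walk_forward_used=True)
bad_result = attestation_blockers(bad, walk_forward_used=True)
```

Keeping the function free of filesystem access makes it easy to unit-test and leaves evidence loading to the service layer, as described above.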
How to trace an accepted result (without reading report files):
- promotion_candidates — Filter status = 'accepted'; get candidate_id, eligibility_report_id.
- eligibility_reports — Join on eligibility_report_id; get run_key, run_instance_id, dataset_id_v2, passed, level, blockers_json, computed_at_utc.
- governance_events — Filter by candidate_id; order by event_id; see sequence of evaluate/promote and actors.
- artifact_lineage — Filter by run_key or run_instance_id; get artifact_id, artifact_type, sha256, created_utc for that run.
- artifact_edges — Join on child_artifact_id/parent_artifact_id to walk graph (e.g. validation_bundle → fold_causality_attestation, rc_summary).
- Versions — From eligibility report or artifact_lineage: engine_version, config_version, dataset_id_v2; from bundle meta or attestation: seed_version, schema versions.
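The trace steps above can be sketched as queries against an in-memory database. Table and column names follow the list above where it names them; the action column, the sample rows, and the helper trace_accepted are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE promotion_candidates (candidate_id TEXT, status TEXT, eligibility_report_id TEXT);
    CREATE TABLE eligibility_reports (
        eligibility_report_id TEXT, run_key TEXT, run_instance_id TEXT, dataset_id_v2 TEXT,
        passed INTEGER, level TEXT, blockers_json TEXT, computed_at_utc TEXT);
    CREATE TABLE governance_events (event_id INTEGER PRIMARY KEY, candidate_id TEXT, action TEXT);
    CREATE TABLE artifact_lineage (
        artifact_id TEXT, run_key TEXT, run_instance_id TEXT,
        artifact_type TEXT, sha256 TEXT, created_utc TEXT);
    -- Made-up sample rows.
    INSERT INTO promotion_candidates VALUES ('c1', 'accepted', 'r2');
    INSERT INTO eligibility_reports VALUES
        ('r2', 'rk1', 'ri1', 'ds1', 1, 'accepted', '[]', '2024-01-02T00:00:00Z');
    INSERT INTO governance_events (candidate_id, action) VALUES
        ('c1', 'evaluate'), ('c1', 'promote');
    INSERT INTO artifact_lineage VALUES
        ('a1', 'rk1', 'ri1', 'validation_bundle', 'abc123', '2024-01-02T00:00:00Z');
    """
)


def trace_accepted(conn: sqlite3.Connection) -> list:
    """Walk accepted candidates back to their report, events, and artifacts."""
    traces = []
    accepted = conn.execute(
        """
        SELECT pc.candidate_id, er.run_key, er.dataset_id_v2
        FROM promotion_candidates AS pc
        JOIN eligibility_reports AS er
          ON er.eligibility_report_id = pc.eligibility_report_id
        WHERE pc.status = 'accepted'
        """
    ).fetchall()
    for row in accepted:
        events = conn.execute(
            "SELECT action FROM governance_events WHERE candidate_id = ? ORDER BY event_id",
            (row["candidate_id"],),
        ).fetchall()
        artifacts = conn.execute(
            "SELECT artifact_type, sha256 FROM artifact_lineage WHERE run_key = ?",
            (row["run_key"],),
        ).fetchall()
        traces.append(
            {
                "candidate_id": row["candidate_id"],
                "dataset_id_v2": row["dataset_id_v2"],
                "events": [e["action"] for e in events],
                "artifacts": [(a["artifact_type"], a["sha256"]) for a in artifacts],
            }
        )
    return traces


traces = trace_accepted(conn)
```

Against a real database the same queries would run on the project's SQLite file, after checking the actual column names in the Phase 3 migrations.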
| ID or mechanism | What it keys |
|---|---|
| dataset_id_v2 | Content-addressed hash of allowlisted tables (logical content, canonical ordering). STRICT for promotion. |
| run_key | Deterministic hash of semantic payload (dataset_id_v2, config, versions); excludes timestamps/paths. |
| run_instance_id | Execution instance (e.g. manifest run_id); same run_key can have many instances. |
| factor_run_id | Hash of dataset_id + factor config (freq, window, estimator). |
| family_id | Reality Check family (signal×horizon); used in RC cache and promotion gating. |
| Artifact SHA256 | File hashes for validation bundles and outputs; deterministic rerun test compares bundle and manifest bytes. |
| CRYPTO_ANALYZER_DETERMINISTIC_TIME | Fixes timestamps so materialize and reportv2 produce identical outputs on rerun. Intended for deterministic rerun testing; does not change promotion eligibility gates (STRICT dataset hash and provenance still required). |
| Bootstrap / RC seed | Derived via seed_root(run_key, salt, version); seed_version in artifacts; reproducible null distributions and CIs. |
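The run_key row above describes a hash over a semantic payload. A minimal sketch of that idea follows; the function name, the excluded key names, and the canonical JSON encoding are assumptions, not the project's run_identity code.

```python
import hashlib
import json

# Assumed names of non-semantic fields (timestamps and paths).
EXCLUDED_KEYS = {"timestamp", "created_at", "out_dir", "db_path"}


def run_key_sketch(dataset_id_v2: str, config: dict, versions: dict) -> str:
    """Hash only the semantic payload so reruns map to the same key."""
    semantic_config = {k: v for k, v in config.items() if k not in EXCLUDED_KEYS}
    payload = {
        "dataset_id_v2": dataset_id_v2,
        "config": semantic_config,
        "versions": versions,
    }
    # Sorted keys and fixed separators give one canonical byte string.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


versions = {"engine_version": "1", "config_version": "1"}
key_a = run_key_sketch("ds1", {"freq": "1h", "out_dir": "reports"}, versions)
key_b = run_key_sketch("ds1", {"out_dir": "/tmp/other", "freq": "1h"}, versions)
key_c = run_key_sketch("ds1", {"freq": "5min", "out_dir": "reports"}, versions)
```

key_a and key_b match because the output path is excluded and key order is canonicalized; key_c differs because the bar frequency is part of the semantic config.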
Exact commands (PowerShell). Run from repo root with venv activated (e.g. .venv\Scripts\activate).
Doctor: crypto-analyzer doctor = full local preflight (env, DB, pipeline). crypto-analyzer doctor --ci = CI-safe: no network, temp DB only; validates migrations and expected tables (core ingestion, phase3 promotion/governance, lineage).
CI smoke (no network):
crypto-analyzer smoke --ci
Architecture refactor plan (no behavior change): Package boundaries and compatibility shims are documented in Refactor move map. That doc describes the target layout (core, data, artifacts, stats, pipeline, governance, execution, compute), shims (e.g. crypto_analyzer.rng → core.seeding), and verification commands.
Public API contract / refactor policy: public_api_contract.md defines stable facades, compatibility shims policy, import boundaries, and how to add new exports. The public API surface is frozen for release; see the contract doc for the exact __all__ and versioning.
uv run ruff check crypto_analyzer cli tests tools
uv run ruff format --check crypto_analyzer cli tests tools
uv run python -m pytest -m "not slow and not network" -q --tb=short
- ruff: All checks passed.
- pytest -m "not slow": Skips tests marked @pytest.mark.slow (full report pipeline). Typical runtime under a few minutes. See pyproject.toml for the slow marker.
Pre-release checklist (venv only, no uv): If uv is not in PATH, run from repo root with venv activated (e.g. .\.venv\Scripts\activate). Use these in order:
python -m ruff check .
python -m ruff format --check .
python tools/check_version_changelog.py --expected-version 0.3.0
python -m crypto_analyzer --help
crypto-analyzer --help
crypto-analyzer doctor --ci
$env:CRYPTO_ANALYZER_NO_NETWORK="1"; crypto-analyzer smoke --ci
python -m pytest -m "not slow" -q --tb=short
(Skip uv lock --check if uv is not installed; CI runs it.)
Tier 2 lists common debug targets. The canonical gate is Tier 1 + .\scripts\run.ps1 verify. If filenames change, use ls tests/test_* (or equivalent) to locate the current modules.
Dataset v2 and run identity:
python -m pytest tests/test_dataset_v2.py tests/test_run_identity.py tests/test_backfill_dataset_v2.py -v --tb=short
RNG and bootstrap:
python -m pytest tests/test_statistics_research.py -v --tb=short
Calibration (BH/BY, RC, RW, CSCV, Type I) — smoke:
python -m pytest tests/test_calibration_fdr_smoke.py tests/test_calibration_cscv_smoke.py tests/test_calibration_rc_smoke.py tests/test_calibration_rw_smoke.py tests/test_calibration_harness_type1.py -v --tb=short
Fold causality and attestation:
python -m pytest tests/test_fold_causality_attestation.py tests/test_promotion_requires_fold_causality_attestation.py tests/test_transform_fit_called_only_on_train.py -v --tb=short
Promotion gating and eligibility:
python -m pytest tests/test_promotion_gating.py tests/test_gatekeeper_requires_versions_and_seed_version.py tests/test_promotion_service.py tests/test_audit_invariants_fail_closed.py -v --tb=short
Phase 3 migrations and governance:
python -m pytest tests/test_migrations_phase3.py tests/test_governance_event_log_append_only.py tests/test_artifact_lineage_append_only.py tests/test_artifact_lineage_written.py tests/test_acceptance_audit_trace.py -v --tb=short
Determinism and Reality Check:
python -m pytest tests/test_reportv2_deterministic_rerun.py tests/test_research_pipeline_smoke.py tests/test_reality_check_null_sanity.py -v --tb=short
(Optional: DuckDB backend tests require DuckDB; skip if not installed.)
python -m pytest -q --tb=short
Full verification script (doctor → pytest → ruff → research-only boundary → diagram export):
.\scripts\run.ps1 verify
Some docs include Mermaid diagrams and math. To normalize for GitHub (fenced Mermaid, $...$ / $$...$$ math):
python scripts/normalize_markdown_math.py
To check only: python scripts/normalize_markdown_math.py --check
- Vulnerability scanning: CI runs pip-audit (weekly schedule and on push to main). The job requires network for advisories and is separate from smoke/demo-lite so those remain guaranteed offline.
- SBOM: CycloneDX SBOM is generated and uploaded as a workflow artifact (sbom-cyclonedx).
- Offline guarantees: smoke --ci and demo-lite (with CRYPTO_ANALYZER_NO_NETWORK=1) are enforced network-free in CI.
See SECURITY.md for supported versions and reporting.
Canonical references for the statistical and econometric methods used in the validation stack:
- Deflated Sharpe Ratio (DSR) / effective trials (Neff): Bailey, D., & López de Prado, M. (2014). The Deflated Sharpe Ratio. Journal of Portfolio Management, 40(5), 94–107.
- Benjamini–Hochberg (BH): Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society B, 57(1), 289–300.
- Benjamini–Yekutieli (BY): Benjamini, Y., & Yekutieli, D. (2001). The Control of the False Discovery Rate in Multiple Testing Under Dependency. Annals of Statistics, 29(4), 1165–1188.
- White's Reality Check: White, H. (2000). A Reality Check for Data Snooping. Econometrica, 68(5), 1097–1126.
- Romano–Wolf stepdown: Romano, J. P., & Wolf, M. (2005). Stepwise Multiple Testing as Formalized Data Snooping. Econometrica, 73(4), 1237–1282.
- CSCV / PBO (Bailey et al.): Bailey, D. H., Borwein, J., López de Prado, M., & Zhu, Q. J. (2014). Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance. Notices of the AMS, 61(5), 458–471.
- Stationary bootstrap (Politis & Romano): Politis, D. N., & Romano, J. P. (1994). The Stationary Bootstrap. Journal of the American Statistical Association, 89(428), 1303–1313.
- Newey–West / HAC: Newey, W. K., & West, K. D. (1987). A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica, 55(3), 703–708.
Short form and artifact keys: Methods & limits (§19). Formal definitions: Statistical Methods Appendix. Repo formulae: implementation-aligned.
- Single-node, local-first — Not distributed; one SQLite DB per environment.
- Not a multi-user concurrent platform — No built-in concurrency control for concurrent promotion or lineage writes.
- Research validation only — No execution, order routing, or live trading.
- Calibration harness — CI-safe guards (Type I, FDR, RC, RW, CSCV, bootstrap); not full statistical certification under all data-generating processes.
- Scalability — SQLite is the single store; suitable for research and moderate history. Optional DuckDB for read-heavy analytics; governance and lineage remain in SQLite.
- Data scope: Ingestion is public CEX/DEX only; no authenticated feeds. No real-time execution or order routing.
| Directory / module | Purpose |
|---|---|
| crypto_analyzer/core/ | RunContext, run identity (run_identity.py), context |
| crypto_analyzer/dataset_v2.py | dataset_id_v2 hashing, backfill |
| crypto_analyzer/fold_causality/ | Folds, purge/embargo, attestation, runner |
| crypto_analyzer/governance/ | promote, audit, audit_invariants |
| crypto_analyzer/promotion/ | gating (evaluate_eligibility), service, store_sqlite |
| crypto_analyzer/db/ | migrations_phase3, lineage, governance_events |
| crypto_analyzer/store/ | sqlite_backend, duckdb_backend (lineage → SQLite) |
| crypto_analyzer/rng.py | seed_root, salts, SEED_ROOT_VERSION |
| crypto_analyzer/stats/ | reality_check, calibration_*, multiple_testing |
| crypto_analyzer/contracts/ | validation_bundle_contract, schema_versions |
| cli/ | research_report_v2, poll, materialize, promotion |
| tests/ | test_dataset_v2, test_run_identity, test_promotion_*, test_artifact_lineage_*, etc. |
| docs/audit/ | validation_control_plane.md, phase1_verification.md |
| docs/architecture/ | validation_control_plane.mmd |
| docs/spec/ | system_overview, stats_stack_upgrade_acceptance |
| scripts/ | run.ps1, export_diagrams.ps1 |
| Document | Contents |
|---|---|
| Methods & limits | Statistical methods, assumptions, artifact keys, limitations (DSR, PBO, BH/BY, RC, RW, HAC, breaks, capacity). See: Statistical Methods Appendix. |
| Stats stack acceptance | Definition of done for upgrades #1–#6; exact artifact keys; minimum data thresholds; golden run command. |
| Statistical Methods Appendix | Formal definitions, assumptions, derivations/proof sketches for DSR, PBO, BH/BY, bootstrap, Reality Check, HAC (Appendices A & B). |
| Methods & Limits — implementation-aligned | Exact repo formulae: DSR, Neff, PBO proxy + CSCV PBO, BH/BY, bootstrap, RC, Romano–Wolf, HAC, break diagnostics, capacity curve. |
| Validation control plane audit | Threat model, design, governance, reproducibility, gaps. |
| Phase 3 summary | Phase 3 migrations, governance, lineage, store. |
| Phase 1 verification | Phase 1 verification checklist (dataset_id_v2, run_key, backfill). |
| Research validation workflow | Exploratory vs full-scale runs, run_id, snapshot semantics, validation readiness criteria. |
| Case study: Coinbase majors benchmark (Phase 1.1) | Public-facing benchmark artifact for the expanded Coinbase majors workflow with explicit scope and caveats. |
| Case study: Coinbase majors strict validation (Phase 1.2) | Follow-on case study documenting stricter validation evidence and a rigor-focused upgrade with explicit scope and caveats. |
| Case study: Coinbase majors live + signal upgrade (Phase 1.3) | Public-facing methodology upgrade covering websocket live-majors ingestion, second active majors signal path, and stricter validation on the same benchmark surface. |
| Case study: Coinbase majors methodology correction (Phase 1.4) | Public-facing methodology-correction upgrade covering lagged portfolio application, more robust neutralization, cleaner majors-native factor preference, and a strict rerun on the same benchmark surface. |
| Spec index (canonical) | Master spec, system overview, implementation ledger, component specs. |
| System overview | Pipeline lifecycle, determinism, statistical stack, feature flags, promotion. |
| Implementation ledger | Requirement → status, PRs, evidence. |
| Design | Data flow, provider contracts, failure modes. |
| Architecture | Module responsibility matrix. |
| Contributing | Dev setup, testing, style, adding providers, verify. |
| Diagrams | PlantUML index and export. |
| Audit notes | Architecture audits and alignment reports. |
| Problem | Solution |
|---|---|
| No data in dashboard | Run poll (or universe-poll) then materialize. |
| Bars table not found | Run .\scripts\run.ps1 materialize --freq 1h. |
| Provider DOWN | Circuit breaker; auto-recovers after cooldown. |
| reportv2 --regimes fails | Set CRYPTO_ANALYZER_ENABLE_REGIMES=1, run Phase 3 migrations, then regime materialize. |
| Verify fails | Run doctor; ensure venv active; fix ruff/pytest as indicated. |
MIT License. See LICENSE.
Research-only. This tool analyzes data and produces reports. It does not execute trades, hold API keys, or connect to any broker. Opt-in features (regimes, Phase 3 migrations, Reality Check, promotion) do not change default behavior.
The project is cross-platform: Python 3.10+, uv, and the CLI work the same on Linux and macOS as on Windows. The Quickstart commands that use uv run crypto-analyzer … are already copy-pasteable in bash or zsh from the repo root.
Virtual environment (pip fallback): After python -m venv .venv, activate with:
source .venv/bin/activate
Then install and run as elsewhere in this README (python -m pip install -e ".[dev]", python -m crypto_analyzer doctor, or crypto-analyzer doctor if the console script is on your PATH).
scripts/run.ps1: That wrapper is Windows-only. On Linux or macOS, call the CLI directly, for example:
uv run crypto-analyzer <command> [args…]
python -m crypto_analyzer <command> [args…] (with venv activated)
Full local verification without PowerShell: use the same Tier 1 commands as in Development / Verification, but run them in your shell (e.g. uv run ruff check …, uv run python -m pytest …). For the bundled doctor → pytest → ruff → boundary → diagram flow, use uv run crypto-analyzer verify or python -m crypto_analyzer verify instead of .\scripts\run.ps1 verify.
Environment variables: Where this README uses PowerShell ($env:NAME="value"), use:
export CRYPTO_ANALYZER_NO_NETWORK=1 # example: no-network smoke
Unset with unset CRYPTO_ANALYZER_NO_NETWORK when you are done.
Optional tools: GrapeRoot and link checks may expect rg (ripgrep) on your PATH. On macOS: brew install ripgrep. On Debian/Ubuntu: sudo apt install ripgrep (or your distro’s package manager).
Paths: Default SQLite DB path (dex_data.sqlite at repo root) and CRYPTO_DB_PATH / --db behavior are the same on all platforms.