Crypto-Analyzer — Deterministic Research Validation Control Plane

Open-source infrastructure for reproducible quantitative research.

A local-first research validation control plane for crypto: it enforces deterministic dataset identity, run identity, fold causality, and fail-closed promotion so that only attested, reproducible results can be promoted. Governance and auditability are enforced at the DB and gatekeeper layers. No API keys, no trading — validation and reports only.

Who this is for

Researchers who need reproducible validation and promotion gates
Research platform / quant infra engineers who need auditability and determinism


What this is (in six bullets):

Validation control plane — Governs whether a research result is eligible for promotion (candidate/accepted). Does not execute or trade.
Content-addressed datasets — dataset_id_v2 hashes logical content of allowlisted SQLite tables; one row change changes the id. STRICT mode required for promotion.
Deterministic run identity — run_key (semantic) and run_instance_id (execution); seeds derived from run_key + salt + version; same config + dataset → same run_key.
Fold-causality enforcement — Purge/embargo in walk-forward splits; train-only fit; attestation artifact required for candidate/accepted when walk-forward is used.
Fail-closed promotion — Candidate and accepted require a passing eligibility report; DB triggers block promotion without it; referenced eligibility reports are immutable.
Append-only governance and lineage — governance_events, artifact_lineage, and artifact_edges are append-only; audit trace from accepted → inputs/configs/artifacts.
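The content-addressing idea behind dataset_id_v2 can be illustrated with a minimal sketch. It is not the project's implementation: the allowlist, the table layout, and the function name dataset_content_hash are assumptions made for this example. It shows that a hash over canonically ordered rows ignores insertion order but changes when any single value changes.

```python
import hashlib
import sqlite3

# Hypothetical allowlist; the project's real allowlist is not shown in this README.
ALLOWLIST = ("bars_1h", "universe")


def dataset_content_hash(conn, tables=ALLOWLIST):
    """Hash the logical content of allowlisted tables in a canonical order."""
    digest = hashlib.sha256()
    for table in sorted(tables):  # canonical table order
        # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
        cols = sorted(row[1] for row in conn.execute(f"PRAGMA table_info({table})"))
        col_list = ", ".join(cols)
        digest.update(f"{table}|{col_list}\n".encode("utf-8"))
        # Ordering by every column makes the hash independent of insertion order.
        for row in conn.execute(f"SELECT {col_list} FROM {table} ORDER BY {col_list}"):
            digest.update(repr(row).encode("utf-8"))
            digest.update(b"\n")
    return digest.hexdigest()


def make_db(rows):
    """Build a tiny in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bars_1h (ts TEXT, asset TEXT, close REAL)")
    conn.execute("CREATE TABLE universe (asset TEXT)")
    conn.executemany("INSERT INTO bars_1h VALUES (?, ?, ?)", rows)
    conn.execute("INSERT INTO universe VALUES ('SOL')")
    return conn


rows = [("2024-01-01T00:00", "SOL", 100.0), ("2024-01-01T01:00", "SOL", 101.0)]
same_a = dataset_content_hash(make_db(rows))
same_b = dataset_content_hash(make_db(list(reversed(rows))))  # other insertion order
changed = dataset_content_hash(make_db([rows[0], ("2024-01-01T01:00", "SOL", 101.5)]))
print(same_a == same_b, same_a != changed)
```

In this sketch the two insertion orders give the same digest, while changing one close value gives a different one.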

Reading paths

If you want to…	Read this
Quickstart (5 minutes)	Quickstart → venv, install, minimal report path.
For researchers	Trust model in practice, Key guarantees, Core workflows, Determinism & reproducibility.
For engineers	Architecture at a glance, CLI cheatsheet, Development / Verification.
For reviewers	Trust model in practice, Promotion model, Auditability, Methods & limits, References.

Quickstart

Prerequisites: Python 3.10+. No API keys (public endpoints only). Run all commands from the repo root after cloning.

SQLite: The default database file is dex_data.sqlite at the repo root (config.yaml → db.path). Poll, materialize, and most flows use this path (also resolved when relative). Override with CRYPTO_DB_PATH or per-command --db.

Canonical install (uv)

uv sync --frozen
uv run crypto-analyzer --help
uv run crypto-analyzer doctor

Exact verification commands (same as CI): see CONTRIBUTING.

Minimal path to a research report (after install):

uv run crypto-analyzer doctor
uv run crypto-analyzer universe-poll --universe --universe-chain solana --interval 60
uv run crypto-analyzer materialize --freq 1h
uv run crypto-analyzer reportv2 --freq 1h --out-dir reports --hypothesis "baseline momentum"

One-command demo: uv run crypto-analyzer demo

Offline path (no network): Install, then run init, demo-lite, and check-dataset. No config or live data required. CI smoke is for internal stability; demo-lite is for developer onboarding.

uv run crypto-analyzer init
uv run crypto-analyzer demo-lite
uv run crypto-analyzer check-dataset --db dex_data.sqlite

Pip fallback

If you prefer pip or uv is not available:

python -m venv .venv
.\.venv\Scripts\activate
python -m pip install -U pip setuptools wheel
python -m pip install -e ".[dev]"
python -m crypto_analyzer --help
crypto-analyzer --help

Then run commands as crypto-analyzer <command> or python -m crypto_analyzer <command>.

Windows: run.ps1 wrapper

On Windows you can use .\scripts\run.ps1 <command> as a convenience wrapper. It uses VIRTUAL_ENV if set, otherwise .venv at repo root, and invokes python -m crypto_analyzer <command> (no reliance on PATH). If the script reports that the venv was not found, see Quickstart above.

Troubleshooting

crypto-analyzer not found — Use python -m crypto_analyzer <command>; it always works when the package is installed.
uv sync fails — Check uv --version. Install uv with python -m pip install -U uv. Run from repo root.
Doctor reports "Not running inside a virtual environment" — Activate the venv (e.g. .\.venv\Scripts\activate) or use uv run crypto-analyzer doctor so uv runs inside its environment.
run.ps1 fails — Ensure .venv exists at repo root and you are in the repo root when running the script.

Verification

After install, confirm the CLI works (from repo root):

uv run crypto-analyzer doctor

Fallback: python -m crypto_analyzer doctor (with venv activated).

Trust model in practice

Boundaries — CI enforces import/boundary rules so that execution, broker, and CLI layers cannot be part of the research control plane (core/governance). See Validation control plane audit.
Reproducibility — Content-addressed datasets, deterministic run identity, and seeded RNG; promotion requires STRICT dataset hash and provenance. See Determinism & reproducibility and Key guarantees below.
Fold causality — Purge/embargo and train-only fit are enforced; attestation is required for candidate/accepted when walk-forward is used. See Key guarantees below and Methods & Limits.

Key guarantees (Phase 1–3.5)

dataset_id_v2 — Content-addressed hashing of allowlisted tables (canonical ordering); STRICT for promotion.
run_key + run_instance_id — Semantic run identity and execution instance; run_key excludes timestamps/paths.
Deterministic RNG — seed_root(run_key, salt, version); salted, reproducible across processes; seed_version in artifacts.
Calibration harness — BH/BY, RC, RW, CSCV PBO, bootstrap, HAC: CI-safe Type I (and FDR/RC/RW) checks; guards, not full statistical certification.
Fold-causality + attestation — Purge/embargo, train-only fit; attestation artifact with schema version; gatekeeper requires valid attestation when walk-forward used.
Fail-closed promotion — Eligibility reports + DB triggers; no candidate/accepted without linked passing report at same level; evidence immutable when referenced.
Append-only governance_events — All evaluate/promote actions logged; no updates or deletes.
artifact_lineage + artifact_edges — Audit graph from accepted → run → configs/versions/artifacts.
SQLite authoritative — Single source of truth for governance and lineage; optional DuckDB analytics backend (read-only for governance).
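To make the deterministic RNG guarantee concrete, here is a minimal sketch of deriving a seed from run_key, a salt, and a version. Only the name seed_root and its three arguments come from this README. The hashing scheme, the 64-bit truncation, and the version value are assumptions for illustration.

```python
import hashlib
import random

SEED_ROOT_VERSION = 1  # illustrative value; the real constant lives in the project


def seed_root(run_key, salt, version=SEED_ROOT_VERSION):
    """Derive a 64-bit seed from (run_key, salt, version) using SHA-256.

    hashlib is used instead of the built-in hash(), whose output for strings
    is randomized per process and would break cross-process reproducibility.
    """
    payload = f"{version}:{salt}:{run_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest()[:8], "big")


boot_seed = seed_root("run-abc", "bootstrap")
rc_seed = seed_root("run-abc", "reality_check")  # a different salt gives a different stream

rng_a = random.Random(boot_seed)
draws_a = [rng_a.random() for _ in range(3)]

rng_b = random.Random(seed_root("run-abc", "bootstrap"))
draws_b = [rng_b.random() for _ in range(3)]

print(draws_a == draws_b, boot_seed != rc_seed)
```

Salting per component keeps, for example, bootstrap and Reality Check draws independent of each other while both remain a pure function of the run identity.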

System guarantees

Risk	Control	Enforcement mechanism	Verified by
Data drift	dataset_id_v2	Content-addressed hash + STRICT requirement for promotion	test_dataset_v2.py
Seed drift	seed_root + versioned salts	SEED_ROOT_VERSION; deterministic RNG across processes	Deterministic tests (e.g. test_reportv2_deterministic_rerun, test_statistics_research)
Promotion bypass	DB triggers	candidate/accepted require linked passing eligibility_report_id; trigger blocks UPDATE/DELETE without it	test_migrations_phase3.py
Leakage (fold)	fold_causality_attestation	Purge/embargo, train-only fit; gatekeeper requires valid attestation when walk-forward used	test_fold_causality_attestation.py, test_promotion_requires_fold_causality_attestation.py, test_transform_fit_called_only_on_train.py
RC provenance ambiguity	rc_summary schema version + seed_root	rc_summary_schema_version; seed_root/component_salt in RC summary; gatekeeper version check	test_calibration_rc_smoke.py, test_promotion_gating.py
Artifact mutability	sha256 + artifact_lineage	compute_file_sha256; artifact_lineage rows with sha256; append-only lineage triggers	test_artifact_lineage_*.py, test_lineage_reproducibility_same_run_key_same_hashes.py
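The append-only controls in the table above can be pictured with SQLite triggers. The following is a minimal sketch, assuming a simplified governance_events table; the trigger names and the action column are hypothetical, and the project's actual migrations may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE governance_events (
        event_id INTEGER PRIMARY KEY AUTOINCREMENT,
        candidate_id TEXT NOT NULL,
        action TEXT NOT NULL
    );
    -- Any UPDATE or DELETE aborts, so rows can only ever be inserted.
    CREATE TRIGGER governance_events_no_update
    BEFORE UPDATE ON governance_events
    BEGIN
        SELECT RAISE(ABORT, 'governance_events is append-only');
    END;
    CREATE TRIGGER governance_events_no_delete
    BEFORE DELETE ON governance_events
    BEGIN
        SELECT RAISE(ABORT, 'governance_events is append-only');
    END;
    """
)
conn.execute(
    "INSERT INTO governance_events (candidate_id, action) VALUES ('c1', 'evaluate')"
)


def is_blocked(sql):
    """Return True if the statement is rejected by the database."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


update_blocked = is_blocked(
    "UPDATE governance_events SET action = 'promote' WHERE event_id = 1"
)
delete_blocked = is_blocked("DELETE FROM governance_events WHERE event_id = 1")
row_count = conn.execute("SELECT COUNT(*) FROM governance_events").fetchone()[0]
print(update_blocked, delete_blocked, row_count)
```

Enforcing this in the database rather than in application code means the rule holds for every writer, including ad hoc SQL sessions.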

Design rationale

Why deterministic IDs? So every run is traceable and repeatable: same inputs and config produce the same dataset_id_v2, run_key, and artifact hashes. That lets you compare runs, invalidate caches when data changes, and prove reproducibility in audits.
Why opt-in migrations? Phase 3 (regimes, promotion, lineage) adds schema and behavior that not every user needs.
Why governance modeling? Research that moves toward production needs a path from “exploratory” to “accepted” with clear gates (eligibility reports, fold attestation, RC/RW when enabled) and an append-only audit log.

Auditability and proof bundle

If you only read one thing: run the Golden acceptance run and inspect the DB-only audit trace.

Golden acceptance run — Copy-paste PowerShell steps for a minimal and a full proof: deterministic run, promotion to accepted, trigger check, and DB-only audit trace. Together they give a near one-command proof of accepted promotion and provenance.
Methods & implementation alignment — Mapping from method (dataset_id_v2, RC/RW, seed_root, schema versions, etc.) to code and artifact keys.

Research rigor & overfitting defenses

Signal discovery is treated as a multiple-testing problem under dependence. Key controls: walk-forward (purge/embargo, train-only fit, fold-causality attestation); deflated Sharpe with Neff; PBO-style/CSCV; BH/BY; optional Reality Check and Romano–Wolf; HAC mean inference. Details: Methods & Limits, Statistical Methods Appendix, implementation-aligned formulae.
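As an illustration of one of the multiple-testing controls named above, here is a minimal sketch of the standard Benjamini–Hochberg step-up procedure. The function name and the example p-values are made up for this sketch. It is the textbook procedure, not necessarily how the repository implements it.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose p-value rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# The two smallest p-values miss their own thresholds (0.0125 and 0.025),
# but the third meets its threshold (0.0375), so all three are rejected.
decisions = benjamini_hochberg([0.02, 0.03, 0.035, 0.9], alpha=0.05)
print(decisions)
```

The example shows why the procedure is called step-up: a later rank that passes pulls in all smaller p-values, even those that failed their own thresholds.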

Architecture at a glance

flowchart TB
    subgraph Data["Data layer"]
        SQLiteInputs[("SQLite research tables\n(bars_*, snapshots, universe)")]
        DatasetHash["dataset_id_v2\ncontent-addressed hash\n(STRICT/FAST_DEV)"]
        SQLiteInputs --> DatasetHash
    end

    subgraph RunIdentity["Run identity"]
        RunKey["run_key\n(semantic)"]
        RunInstanceId["run_instance_id\n(execution)"]
        RunContext["RunContext\n(seed_version, versions)"]
        RunKey --> RunContext
        RunInstanceId --> RunContext
        DatasetHash --> RunKey
    end

    subgraph Eval["Evaluation"]
        FoldSplits["Fold splits\n(purge/embargo)"]
        Scoring["Scoring / metrics"]
        FoldSplits --> Scoring
    end

    subgraph Validation["Validation"]
        BHBY["BH/BY, RC, RW\nCSCV PBO, HAC"]
        CalHarness["Calibration harness"]
        BHBY --> CalHarness
    end

    subgraph Artifacts["Artifacts"]
        ValBundle["validation_bundle\nfold_causality_attestation\nrc_summary"]
    end

    subgraph Governance["Governance"]
        EvalElig["evaluate_eligibility\n→ eligibility_reports"]
        PromCand["promotion_candidates\n(exploratory→candidate→accepted)"]
        DBTriggers["DB triggers\nfail-closed"]
        GovEvents["governance_events\n(append-only)"]
        EvalElig --> PromCand
        PromCand --> DBTriggers
        EvalElig --> GovEvents
        PromCand --> GovEvents
    end

    subgraph Lineage["Lineage"]
        ArtLineage["artifact_lineage\nartifact_edges"]
    end

    RunContext --> Eval
    Eval --> Validation
    Validation --> Artifacts
    Artifacts --> EvalElig
    Artifacts --> ArtLineage
    Governance --> Lineage

Full diagram source: docs/architecture/validation_control_plane.mmd.

Core workflows

Ingest — Poll writes to spot_price_snapshots, sol_monitor_snapshots, universe tables. run_migrations applies core + v2 factor tables.
Bars — Raw snapshots → deterministic OHLCV bars (5min, 15min, 1h, 1D). Idempotent.
Factors — Rolling OLS (or optional Kalman) vs BTC/ETH → residual returns. Materialized to factor_model_runs, factor_betas, residual_returns; identified by dataset_id and factor_run_id.
Signals — Cross-sectional factors; winsorized z-scores; signal panels.
Validation — IC, IC decay; per-signal ValidationBundle (paths, metrics). Fold causality: purge/embargo, attestation when walk-forward used.
Corrections — Deflated Sharpe, PBO proxy, block bootstrap, BH/BY; optional Reality Check (reportv2 --reality-check, family_id).
Reporting — reportv2; optional regime-conditioned IC with --regimes REGIME_RUN_ID; Streamlit dashboard; experiment registry; manifests.
Promotion — Create candidate; evaluate eligibility; promote to candidate/accepted via governance entrypoint; all actions logged to governance_events.
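The purge/embargo step in the Validation workflow can be sketched as follows. This README does not define the exact split rules, so the sketch assumes an expanding-window walk-forward where purge drops the last training observations before each test block and embargo drops the observations right after each earlier test block from later training sets. The function name and parameters are hypothetical.

```python
def walk_forward_splits(n_obs, n_folds, min_train, purge=0, embargo=0):
    """Return a list of (train_idx, test_idx) pairs over time-ordered observations."""
    test_size = (n_obs - min_train) // n_folds
    splits = []
    embargoed = set()  # indices excluded from all later training sets
    for k in range(n_folds):
        test_start = min_train + k * test_size
        test_end = test_start + test_size
        # Purge: training stops `purge` observations before the test block.
        train = [i for i in range(test_start - purge) if i not in embargoed]
        test = list(range(test_start, test_end))
        splits.append((train, test))
        # Embargo: observations right after this test block never enter training.
        embargoed.update(range(test_end, min(test_end + embargo, n_obs)))
    return splits


splits = walk_forward_splits(n_obs=25, n_folds=3, min_train=10, purge=2, embargo=1)
for train, test in splits:
    print(len(train), train[-1], test[0], test[-1])
```

In every fold the last training index sits at least purge + 1 positions before the first test index, which is the causality property an attestation would record.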

CLI cheatsheet

Run any command as crypto-analyzer <command> [args...] or python -m crypto_analyzer <command> [args...] (cross-platform). On Windows, .\scripts\run.ps1 <command> [args...] is a convenience wrapper that invokes the same CLI.

Command	Description
`doctor`	Preflight: environment, DB schema, pipeline smoke test
`doctor --ci`	CI-safe preflight (no network, temp DB, migrations + tables)
`smoke --ci`	Synthetic-data, no-network smoke (migrations, dataset_id_v2, run identity)
`init`	Create local SQLite DB and run migrations (default `dex_data.sqlite`; optional `--phase3`)
`demo-lite`	Synthetic dataset, no network; run after `init` (same default DB: `dex_data.sqlite`)
`poll`	Single-pair data poll (provider fallback)
`universe-poll --universe ...`	Multi-asset universe discovery (e.g. `--universe-chain solana`)
`materialize`	Build OHLCV bars (e.g. `--freq 1h`)
`reportv2`	Research report: IC, PBO, QP; optional `--regimes`, `--reality-check`, `--execution-evidence` when Phase 3 enabled
`walkforward`	Walk-forward backtest, out-of-sample fold stitching
`promotion`	Promotion subcommands: list, create, evaluate
`verify`	Full gate: doctor → pytest → ruff → research-only boundary → diagram export
`test`	Run pytest
`streamlit`	Interactive dashboard
`demo`	One-command demo: doctor → poll → materialize → report
`check-dataset`	Inspect dataset fingerprints and row counts

Promotion model

exploratory — No gate; warnings only.
candidate — Requires passing evaluate_eligibility(..., level="candidate"): STRICT dataset_id_v2, run_key, engine_version, config_version, seed_version; fold attestation when walk-forward used; RC/RW contract when enabled. Result stored in eligibility_reports; DB trigger blocks status without linked passing report.
accepted — Same fail-closed requirement at level accepted; eligibility_report_id and report level must match status.
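The fail-closed rule for candidate and accepted can be sketched with a SQLite trigger. The table and column names below follow the ones mentioned in this README, but the schema is reduced and the trigger name and message are hypothetical. It is a sketch under those assumptions, not the project's migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE eligibility_reports (
        eligibility_report_id TEXT PRIMARY KEY,
        level TEXT NOT NULL,
        passed INTEGER NOT NULL
    );
    CREATE TABLE promotion_candidates (
        candidate_id TEXT PRIMARY KEY,
        status TEXT NOT NULL DEFAULT 'exploratory',
        eligibility_report_id TEXT
    );
    -- Block candidate/accepted unless a passing report at the same level is linked.
    -- A full schema would need a matching BEFORE INSERT trigger as well.
    CREATE TRIGGER promotion_requires_passing_report
    BEFORE UPDATE ON promotion_candidates
    WHEN NEW.status IN ('candidate', 'accepted')
     AND NOT EXISTS (
         SELECT 1 FROM eligibility_reports AS r
         WHERE r.eligibility_report_id = NEW.eligibility_report_id
           AND r.passed = 1
           AND r.level = NEW.status
     )
    BEGIN
        SELECT RAISE(ABORT, 'promotion blocked: no passing report at this level');
    END;
    """
)
conn.execute("INSERT INTO promotion_candidates (candidate_id) VALUES ('c1')")
conn.execute("INSERT INTO eligibility_reports VALUES ('r1', 'candidate', 1)")


def try_promote(candidate_id, status, report_id):
    """Attempt a status change; return True if the database accepted it."""
    try:
        conn.execute(
            "UPDATE promotion_candidates SET status = ?, eligibility_report_id = ? "
            "WHERE candidate_id = ?",
            (status, report_id, candidate_id),
        )
    except sqlite3.DatabaseError:
        return False
    return True


without_report = try_promote("c1", "candidate", None)  # no report linked
wrong_level = try_promote("c1", "accepted", "r1")      # report is candidate-level only
with_report = try_promote("c1", "candidate", "r1")     # passing report, same level
final_status = conn.execute(
    "SELECT status FROM promotion_candidates WHERE candidate_id = 'c1'"
).fetchone()[0]
print(without_report, wrong_level, with_report, final_status)
```

Because a missing report makes the NOT EXISTS check true, the default outcome is rejection, which is what fail-closed means here.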

Walk-forward runs require a valid fold-causality attestation (schema version, purge_applied, embargo_applied, train_only_fit_enforced) for candidate/accepted.

Promotion gating is policy-only and does not perform I/O; filesystem and evidence loading live in the service and execution_evidence layer.
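A policy-only gate of this kind can be written as a pure function over already-loaded data. The sketch below checks the attestation fields listed above. The function name, the schema_version key, the expected version value, and the blocker wording are assumptions for illustration.

```python
REQUIRED_ATTESTATION_FLAGS = (
    "purge_applied",
    "embargo_applied",
    "train_only_fit_enforced",
)


def attestation_blockers(attestation, walk_forward_used, expected_schema_version=1):
    """Return a list of blocker strings; an empty list means this check passes.

    The function does no I/O: the caller loads the attestation (a dict or None)
    and passes it in, so the policy can be tested without files or a database.
    """
    if not walk_forward_used:
        return []
    if attestation is None:
        return ["missing fold_causality_attestation"]
    blockers = []
    if attestation.get("schema_version") != expected_schema_version:
        blockers.append("unexpected attestation schema_version")
    for flag in REQUIRED_ATTESTATION_FLAGS:
        # Fail closed: anything other than an explicit True is a blocker.
        if attestation.get(flag) is not True:
            blockers.append(f"{flag} is not true")
    return blockers


good = {
    "schema_version": 1,
    "purge_applied": True,
    "embargo_applied": True,
    "train_only_fit_enforced": True,
}
bad = dict(good, embargo_applied=False)

print(attestation_blockers(good, walk_forward_used=True))
print(attestation_blockers(bad, walk_forward_used=True))
print(attestation_blockers(None, walk_forward_used=True))
```

Returning blockers rather than a bare boolean mirrors the blockers_json field mentioned in the auditability section, so a failed evaluation can explain itself.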

Auditability story

How to trace an accepted result (without reading report files):

promotion_candidates — Filter status = 'accepted'; get candidate_id, eligibility_report_id.
eligibility_reports — Join on eligibility_report_id; get run_key, run_instance_id, dataset_id_v2, passed, level, blockers_json, computed_at_utc.
governance_events — Filter by candidate_id; order by event_id; see sequence of evaluate/promote and actors.
artifact_lineage — Filter by run_key or run_instance_id; get artifact_id, artifact_type, sha256, created_utc for that run.
artifact_edges — Join on child_artifact_id / parent_artifact_id to walk graph (e.g. validation_bundle → fold_causality_attestation, rc_summary).
Versions — From eligibility report or artifact_lineage: engine_version, config_version, dataset_id_v2; from bundle meta or attestation: seed_version, schema versions.
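The trace steps above can be sketched as a few queries against a toy database. The table and column names are the ones listed in the steps, except for the action column on governance_events, which is an assumption. The schema is reduced and the rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE promotion_candidates (
        candidate_id TEXT PRIMARY KEY, status TEXT, eligibility_report_id TEXT);
    CREATE TABLE eligibility_reports (
        eligibility_report_id TEXT PRIMARY KEY, run_key TEXT, dataset_id_v2 TEXT,
        passed INTEGER, level TEXT);
    CREATE TABLE governance_events (
        event_id INTEGER PRIMARY KEY, candidate_id TEXT, action TEXT);
    CREATE TABLE artifact_lineage (
        artifact_id TEXT PRIMARY KEY, run_key TEXT, artifact_type TEXT, sha256 TEXT);

    INSERT INTO promotion_candidates VALUES
        ('c1', 'accepted', 'r2'), ('c2', 'exploratory', NULL);
    INSERT INTO eligibility_reports VALUES ('r2', 'rk-1', 'ds-1', 1, 'accepted');
    INSERT INTO governance_events (candidate_id, action) VALUES
        ('c1', 'evaluate'), ('c1', 'promote');
    INSERT INTO artifact_lineage VALUES
        ('a1', 'rk-1', 'validation_bundle', 'aa11'),
        ('a2', 'rk-1', 'fold_causality_attestation', 'bb22');
    """
)


def trace_accepted(db):
    """Walk accepted candidates to their report, event history, and artifacts."""
    traces = []
    accepted = db.execute(
        """
        SELECT pc.candidate_id, er.run_key, er.dataset_id_v2
        FROM promotion_candidates AS pc
        JOIN eligibility_reports AS er
          ON er.eligibility_report_id = pc.eligibility_report_id
        WHERE pc.status = 'accepted'
        """
    ).fetchall()
    for row in accepted:
        events = db.execute(
            "SELECT action FROM governance_events "
            "WHERE candidate_id = ? ORDER BY event_id",
            (row["candidate_id"],),
        ).fetchall()
        artifacts = db.execute(
            "SELECT artifact_type, sha256 FROM artifact_lineage "
            "WHERE run_key = ? ORDER BY artifact_id",
            (row["run_key"],),
        ).fetchall()
        traces.append(
            {
                "candidate_id": row["candidate_id"],
                "run_key": row["run_key"],
                "dataset_id_v2": row["dataset_id_v2"],
                "events": [e["action"] for e in events],
                "artifacts": {a["artifact_type"]: a["sha256"] for a in artifacts},
            }
        )
    return traces


traces = trace_accepted(conn)
print(traces)
```

Everything in the returned trace comes from database rows, which is the point of a DB-only audit: no report file needs to be opened to see what was accepted and why.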

Determinism & reproducibility

ID or mechanism	What it keys
dataset_id_v2	Content-addressed hash of allowlisted tables (logical content, canonical ordering). STRICT for promotion.
run_key	Deterministic hash of semantic payload (dataset_id_v2, config, versions); excludes timestamps/paths.
run_instance_id	Execution instance (e.g. manifest run_id); same run_key can have many instances.
factor_run_id	Hash of dataset_id + factor config (freq, window, estimator).
family_id	Reality Check family (signal×horizon); used in RC cache and promotion gating.
Artifact SHA256	File hashes for validation bundles and outputs; deterministic rerun test compares bundle and manifest bytes.
CRYPTO_ANALYZER_DETERMINISTIC_TIME	Fixes timestamps so materialize and reportv2 produce identical outputs on rerun. Intended for deterministic rerun testing; does not change promotion eligibility gates (STRICT dataset hash and provenance still required).
Bootstrap / RC seed	Derived via `seed_root(run_key, salt, version)`; seed_version in artifacts; reproducible null distributions and CIs.
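The run_key row can be illustrated with a minimal sketch of hashing a canonical JSON payload. The function name compute_run_key and the VOLATILE_KEYS field names are hypothetical, and only top-level config keys are filtered here. The project's actual payload layout is not shown in this README.

```python
import hashlib
import json

# Hypothetical names for timestamp/path fields that must not affect identity.
VOLATILE_KEYS = {"created_at", "out_dir", "db_path"}


def compute_run_key(dataset_id_v2, config, versions):
    """Hash the semantic payload; key order, timestamps, and paths do not matter."""
    semantic_config = {k: v for k, v in config.items() if k not in VOLATILE_KEYS}
    payload = {
        "dataset_id_v2": dataset_id_v2,
        "config": semantic_config,
        "versions": versions,
    }
    # sort_keys plus fixed separators gives one canonical byte string per payload.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


versions = {"engine_version": "1", "config_version": "1", "seed_version": 1}
key_a = compute_run_key("ds-1", {"freq": "1h", "window": 72, "out_dir": "reports"}, versions)
key_b = compute_run_key("ds-1", {"out_dir": "/tmp/x", "window": 72, "freq": "1h"}, versions)
key_c = compute_run_key("ds-2", {"freq": "1h", "window": 72}, versions)
print(key_a == key_b, key_a != key_c)
```

Keeping run_instance_id separate from this hash lets the same semantic run be executed many times while still mapping to one run_key.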

Development / Verification

Exact commands (PowerShell). Run from repo root with venv activated (e.g. .\.venv\Scripts\activate).

Doctor: crypto-analyzer doctor = full local preflight (env, DB, pipeline). crypto-analyzer doctor --ci = CI-safe: no network, temp DB only; validates migrations and expected tables (core ingestion, phase3 promotion/governance, lineage).

CI smoke (no network):

crypto-analyzer smoke --ci

Architecture refactor plan (no behavior change): Package boundaries and compatibility shims are documented in Refactor move map. That doc describes the target layout (core, data, artifacts, stats, pipeline, governance, execution, compute), shims (e.g. crypto_analyzer.rng → core.seeding), and verification commands. Public API contract / refactor policy: public_api_contract.md defines stable facades, compatibility shims policy, import boundaries, and how to add new exports. The public API surface is frozen for release; see the contract doc for the exact __all__ and versioning.

Tier 1: Fast checks (canonical; matches CI)

uv run ruff check crypto_analyzer cli tests tools
uv run ruff format --check crypto_analyzer cli tests tools
uv run python -m pytest -m "not slow and not network" -q --tb=short

ruff check: Expected output is "All checks passed!".
pytest -m "not slow": Skips tests marked @pytest.mark.slow (full report pipeline). Typical runtime under a few minutes. See pyproject.toml for the slow marker.

Pre-release checklist (venv only, no uv): If uv is not in PATH, run from repo root with venv activated (e.g. .\.venv\Scripts\activate). Use these in order:

python -m ruff check .
python -m ruff format --check .
python tools/check_version_changelog.py --expected-version 0.3.0
python -m crypto_analyzer --help
crypto-analyzer --help
crypto-analyzer doctor --ci
$env:CRYPTO_ANALYZER_NO_NETWORK="1"; crypto-analyzer smoke --ci
python -m pytest -m "not slow" -q --tb=short

(Skip uv lock --check if uv is not installed; CI runs it.)

Tier 2: Phase-specific targeted suites

Tier 2 lists common debug targets. The canonical gate is Tier 1 + .\scripts\run.ps1 verify. If filenames change, use ls tests/test_* (or equivalent) to locate the current modules.

Dataset v2 and run identity:

python -m pytest tests/test_dataset_v2.py tests/test_run_identity.py tests/test_backfill_dataset_v2.py -v --tb=short

RNG and bootstrap:

python -m pytest tests/test_statistics_research.py -v --tb=short

Calibration (BH/BY, RC, RW, CSCV, Type I) — smoke:

python -m pytest tests/test_calibration_fdr_smoke.py tests/test_calibration_cscv_smoke.py tests/test_calibration_rc_smoke.py tests/test_calibration_rw_smoke.py tests/test_calibration_harness_type1.py -v --tb=short

Fold causality and attestation:

python -m pytest tests/test_fold_causality_attestation.py tests/test_promotion_requires_fold_causality_attestation.py tests/test_transform_fit_called_only_on_train.py -v --tb=short

Promotion gating and eligibility:

python -m pytest tests/test_promotion_gating.py tests/test_gatekeeper_requires_versions_and_seed_version.py tests/test_promotion_service.py tests/test_audit_invariants_fail_closed.py -v --tb=short

Phase 3 migrations and governance:

python -m pytest tests/test_migrations_phase3.py tests/test_governance_event_log_append_only.py tests/test_artifact_lineage_append_only.py tests/test_artifact_lineage_written.py tests/test_acceptance_audit_trace.py -v --tb=short

Determinism and Reality Check:

python -m pytest tests/test_reportv2_deterministic_rerun.py tests/test_research_pipeline_smoke.py tests/test_reality_check_null_sanity.py -v --tb=short

(Optional: DuckDB backend tests require DuckDB; skip if not installed.)

Tier 3: Full test suite

python -m pytest -q --tb=short

Full verification script (doctor → pytest → ruff → research-only boundary → diagram export):

.\scripts\run.ps1 verify

Docs formatting

Some docs include Mermaid diagrams and math. To normalize for GitHub (fenced Mermaid, $...$ / $$...$$ math):

python scripts/normalize_markdown_math.py

To check only: python scripts/normalize_markdown_math.py --check

Security

Vulnerability scanning: CI runs pip-audit (weekly schedule and on push to main). The job requires network for advisories and is separate from smoke/demo-lite so those remain guaranteed offline.
SBOM: CycloneDX SBOM is generated and uploaded as a workflow artifact (sbom-cyclonedx).
Offline guarantees: smoke --ci and demo-lite (with CRYPTO_ANALYZER_NO_NETWORK=1) are enforced network-free in CI. See SECURITY.md for supported versions and reporting.

References

Canonical references for the statistical and econometric methods used in the validation stack:

Deflated Sharpe Ratio (DSR) / effective trials (Neff): Bailey, D., & López de Prado, M. (2014). The Deflated Sharpe Ratio. Journal of Portfolio Management, 40(5), 94–107.
Benjamini–Hochberg (BH): Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society B, 57(1), 289–300.
Benjamini–Yekutieli (BY): Benjamini, Y., & Yekutieli, D. (2001). The Control of the False Discovery Rate in Multiple Testing Under Dependency. Annals of Statistics, 29(4), 1165–1188.
White's Reality Check: White, H. (2000). A Reality Check for Data Snooping. Econometrica, 68(5), 1097–1126.
Romano–Wolf stepdown: Romano, J. P., & Wolf, M. (2005). Stepwise Multiple Testing as Formalized Data Snooping. Econometrica, 73(4), 1237–1282.
CSCV / PBO (Bailey et al.): Bailey, D. H., Borwein, J., López de Prado, M., & Zhu, Q. J. (2014). Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance. Notices of the AMS, 61(5), 458–471.
Stationary bootstrap (Politis & Romano): Politis, D. N., & Romano, J. P. (1994). The Stationary Bootstrap. Journal of the American Statistical Association, 89(428), 1303–1313.
Newey–West / HAC: Newey, W. K., & West, K. D. (1987). A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica, 55(3), 703–708.

Short form and artifact keys: Methods & limits (§19). Formal definitions: Statistical Methods Appendix. Repo formulae: implementation-aligned.

Limitations

Single-node, local-first — Not distributed; one SQLite DB per environment.
Not a multi-user concurrent platform — No built-in concurrency control for concurrent promotion or lineage writes.
Research validation only — No execution, order routing, or live trading.
Calibration harness — CI-safe guards (Type I, FDR, RC, RW, CSCV, bootstrap); not full statistical certification under all data-generating processes.
Scalability — SQLite is the single store; suitable for research and moderate history. Optional DuckDB for read-heavy analytics; governance and lineage remain in SQLite.
Data scope: Ingestion is public CEX/DEX only; no authenticated feeds. No real-time execution or order routing.

Repository map

Directory / module	Purpose
crypto_analyzer/core/	RunContext, run identity (`run_identity.py`), context
crypto_analyzer/dataset_v2.py	dataset_id_v2 hashing, backfill
crypto_analyzer/fold_causality/	Folds, purge/embargo, attestation, runner
crypto_analyzer/governance/	promote, audit, audit_invariants
crypto_analyzer/promotion/	gating (evaluate_eligibility), service, store_sqlite
crypto_analyzer/db/	migrations_phase3, lineage, governance_events
crypto_analyzer/store/	sqlite_backend, duckdb_backend (lineage → SQLite)
crypto_analyzer/rng.py	seed_root, salts, SEED_ROOT_VERSION
crypto_analyzer/stats/	reality_check, calibration_*, multiple_testing
crypto_analyzer/contracts/	validation_bundle_contract, schema_versions
cli/	research_report_v2, poll, materialize, promotion
tests/	test_dataset_v2, test_run_identity, test_promotion_*, test_artifact_lineage_*, etc.
docs/audit/	validation_control_plane.md, phase1_verification.md
docs/architecture/	validation_control_plane.mmd
docs/spec/	system_overview, stats_stack_upgrade_acceptance
scripts/	run.ps1, export_diagrams.ps1

Documentation index

Document	Contents
Methods & limits	Statistical methods, assumptions, artifact keys, limitations (DSR, PBO, BH/BY, RC, RW, HAC, breaks, capacity). See: Statistical Methods Appendix.
Stats stack acceptance	Definition of done for upgrades #1–#6; exact artifact keys; minimum data thresholds; golden run command.
Statistical Methods Appendix	Formal definitions, assumptions, derivations/proof sketches for DSR, PBO, BH/BY, bootstrap, Reality Check, HAC (Appendices A & B).
Methods & Limits — implementation-aligned	Exact repo formulae: DSR, Neff, PBO proxy + CSCV PBO, BH/BY, bootstrap, RC, Romano–Wolf, HAC, break diagnostics, capacity curve.
Validation control plane audit	Threat model, design, governance, reproducibility, gaps.
Phase 3 summary	Phase 3 migrations, governance, lineage, store.
Phase 1 verification	Phase 1 verification checklist (dataset_id_v2, run_key, backfill).
Research validation workflow	Exploratory vs full-scale runs, run_id, snapshot semantics, validation readiness criteria.
Case study: Coinbase majors benchmark (Phase 1.1)	Public-facing benchmark artifact for the expanded Coinbase majors workflow with explicit scope and caveats.
Case study: Coinbase majors strict validation (Phase 1.2)	Follow-on case study documenting stricter validation evidence and a rigor-focused upgrade with explicit scope and caveats.
Case study: Coinbase majors live + signal upgrade (Phase 1.3)	Public-facing methodology upgrade covering websocket live-majors ingestion, second active majors signal path, and stricter validation on the same benchmark surface.
Case study: Coinbase majors methodology correction (Phase 1.4)	Public-facing methodology-correction upgrade covering lagged portfolio application, more robust neutralization, cleaner majors-native factor preference, and a strict rerun on the same benchmark surface.
Spec index (canonical)	Master spec, system overview, implementation ledger, component specs.
System overview	Pipeline lifecycle, determinism, statistical stack, feature flags, promotion.
Implementation ledger	Requirement → status, PRs, evidence.
Design	Data flow, provider contracts, failure modes.
Architecture	Module responsibility matrix.
Contributing	Dev setup, testing, style, adding providers, verify.
Diagrams	PlantUML index and export.
Audit notes	Architecture audits and alignment reports.

Release / Verification status

Troubleshooting

Problem	Solution
No data in dashboard	Run `poll` (or universe-poll) then `materialize`.
Bars table not found	Run `.\scripts\run.ps1 materialize --freq 1h`.
Provider DOWN	Circuit breaker; auto-recovers after cooldown.
reportv2 --regimes fails	Set `CRYPTO_ANALYZER_ENABLE_REGIMES=1`, run Phase 3 migrations, then regime materialize.
Verify fails	Run `doctor`; ensure venv active; fix ruff/pytest as indicated.

License and disclaimer

MIT License. See LICENSE.

Research-only. This tool analyzes data and produces reports. It does not execute trades, hold API keys, or connect to any broker. Opt-in features (regimes, Phase 3 migrations, Reality Check, promotion) do not change default behavior.

Linux and macOS (Unix)

The project is cross-platform: Python 3.10+, uv, and the CLI work the same on Linux and macOS as on Windows. The Quickstart commands that use uv run crypto-analyzer … are already copy-pasteable in bash or zsh from the repo root.

Virtual environment (pip fallback): After python -m venv .venv, activate with:

source .venv/bin/activate

Then install and run as elsewhere in this README (python -m pip install -e ".[dev]", python -m crypto_analyzer doctor, or crypto-analyzer doctor if the console script is on your PATH).

scripts/run.ps1: That wrapper is Windows-only. On Linux or macOS, call the CLI directly, for example:

uv run crypto-analyzer <command> [args…]
python -m crypto_analyzer <command> [args…] (with venv activated)

Full local verification without PowerShell: use the same Tier 1 commands as in Development / Verification, but run them in your shell (e.g. uv run ruff check …, uv run python -m pytest …). For the bundled doctor → pytest → ruff → boundary → diagram flow, use uv run crypto-analyzer verify or python -m crypto_analyzer verify instead of .\scripts\run.ps1 verify.

Environment variables: Where this README uses PowerShell ($env:NAME="value"), use:

export CRYPTO_ANALYZER_NO_NETWORK=1   # example: no-network smoke

Unset with unset CRYPTO_ANALYZER_NO_NETWORK when you are done.

Optional tools: GrapeRoot and link checks may expect rg (ripgrep) on your PATH. On macOS: brew install ripgrep. On Debian/Ubuntu: sudo apt install ripgrep (or your distro’s package manager).

Paths: Default SQLite DB path (dex_data.sqlite at repo root) and CRYPTO_DB_PATH / --db behavior are the same on all platforms.
