statarb

PCA-based market-neutral statistical arbitrage, implemented as a clean, tested, reproducible research framework. It recovers statistical risk factors from the cross-section of S&P 500 returns using Principal Component Analysis, projects those factors out of each stock to isolate the idiosyncratic residual, models the residual as a mean-reverting process, and trades the resulting signal in a book that is dollar- and beta-neutral by construction.

This is a faithful re-implementation and extension of Avellaneda and Lee (2010), built entirely on free data so anyone can clone, run, and audit it end to end. The methodology is specified in SPEC.md, which is the single source of truth for scope; the modules and functions map one-to-one onto the equations there, so the repository doubles as a teaching reference.

This is not investment advice and not a live trading system. It runs on free data with known survivorship bias (see Data and the survivorship caveat). Results are methodology demonstrations, not return forecasts. Published statistical-arbitrage edges have decayed substantially since the 2000s.

What it does

At each rebalance date the pipeline runs nine steps on a strictly historical window: resolve the universe, compute returns, extract PCA factors (with correlation-matrix cleaning), regress out the factors to get residuals, fit the Ornstein-Uhlenbeck model and compute s-scores, turn s-scores into signed positions, size them into a neutral and constrained book, simulate execution with costs, and mark the ledger. The design avoids look-ahead in code rather than by convention, fills trades with a one-day lag by default, and is deterministic given a config and a seed.
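
The OU-fit and s-score step can be sketched with the AR(1) estimation that Avellaneda and Lee (2010) describe. This is a minimal illustration, not the repository's code: `ou_s_score` is a hypothetical name, and it assumes the factor regression has already produced one stock's daily residuals over the estimation window.

```python
import numpy as np


def ou_s_score(residuals):
    """Return (s_score, b) for one stock's daily regression residuals."""
    # Cumulative residual: the discrete analogue of the OU process X(t).
    x = np.cumsum(np.asarray(residuals, dtype=float))
    x_lag, x_next = x[:-1], x[1:]

    # AR(1) fit: x_next = a + b * x_lag + zeta. polyfit returns [slope, intercept].
    b, a = np.polyfit(x_lag, x_next, 1)
    if not 0.0 < b < 1.0:
        # No mean reversion detected, so no signal for this name.
        return float("nan"), float(b)

    zeta = x_next - (a + b * x_lag)
    m = a / (1.0 - b)                                      # equilibrium level
    sigma_eq = np.sqrt(zeta.var(ddof=2) / (1.0 - b ** 2))  # equilibrium std dev
    return float((x[-1] - m) / sigma_eq), float(b)
```

A large positive s-score means the cumulative residual sits above its estimated equilibrium (a short candidate); a large negative one means the opposite. The entry and exit thresholds are configuration choices and are not shown here.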

Install

git clone https://github.com/slye-us/statarb.git
cd statarb
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,data]"

The core library needs only numpy, pandas, scipy, PyYAML, pyarrow, and matplotlib. The data extra adds the optional free-data providers (yfinance, pandas-datareader); the dev extra adds the test and lint toolchain.

Quickstart

Run the network-free synthetic backtest, which exercises the whole pipeline and writes every figure and table:

statarb run --config configs/smoke.yaml --output artifacts

Run the headline configuration on real S&P 500 data (requires network and the data extra; first run downloads and caches prices):

statarb data  --config configs/baseline.yaml --refresh-universe
statarb run   --config configs/baseline.yaml --output artifacts
statarb verify --config configs/baseline.yaml

Every default in configs/baseline.yaml maps to the parameter table in SPEC.md Section 4. Experiments are declarative: change the YAML, rerun, diff the artifacts.

Example run

The figures below come from configs/smoke.yaml, a deterministic synthetic market with a planted factor structure and genuinely mean-reverting residuals. It is an illustration of the machinery, not a performance claim. On this data the strategy recovers the planted edge (gross Sharpe well above one) and the cost model takes a realistic bite out of it, while the book stays neutral to machine precision.

The net market beta and net dollar exposure sit at roughly 1e-17 throughout, so the book is demonstrably market-neutral rather than merely claimed to be. Full numbers are in docs/example_run/summary.md.
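
One way to reach neutrality at floating-point precision is to project raw weights orthogonally onto the set of portfolios with zero net dollars and zero net beta. The sketch below assumes exactly that and nothing more: `neutralize` is a hypothetical name, the position and risk constraints that the portfolio module applies are omitted, and the betas must not all be equal (otherwise the two constraints coincide and the linear system is singular).

```python
import numpy as np


def neutralize(raw_weights, betas):
    """Project weights so that sum(w) = 0 and betas @ w = 0."""
    w = np.asarray(raw_weights, dtype=float)
    betas = np.asarray(betas, dtype=float)
    # Rows of A are the two linear constraints: dollar and beta exposure.
    A = np.vstack([np.ones_like(betas), betas])
    # Remove the component of w that lies in the row space of A.
    correction = A.T @ np.linalg.solve(A @ A.T, A @ w)
    return w - correction
```

Because this is an orthogonal projection, it is the smallest change (in the least-squares sense) that makes the raw weights satisfy both constraints.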

How the code maps to the method

statarb/
  data/        point-in-time universe, price ingestion + cache, pluggable providers
  factors/     correlation matrix, cleaning (Ledoit-Wolf / MP), selection, eigenportfolios
  signals/     factor-regression residuals, OU fit + s-score, trading rules
  portfolio/   dollar/beta-neutral construction with constraints, ex-ante risk
  backtest/    no-look-ahead event loop, cost models, ledger
  evaluation/  metrics, deflated/probabilistic Sharpe, bootstrap, report builder
  cli.py       one-command reproduction

The factor engine cleans the correlation matrix before eigendecomposition to control noise in the T < N regime, where a 252-day window and a near-500-name universe leave the sample matrix rank-deficient. Cleaning is configurable (ledoit_wolf, mp_clip, or none) and is on by default.
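
As an illustration of the mp_clip idea, the sketch below shows one common form of Marchenko-Pastur eigenvalue clipping: eigenvalues below the upper edge of the noise band are replaced by their average, which preserves the trace, and the result is rescaled to a unit diagonal. The function name `mp_clip_correlation` and these details are assumptions for illustration; the repository's implementation may differ.

```python
import numpy as np


def mp_clip_correlation(returns):
    """Clean the sample correlation of a (T x N) return matrix."""
    T, N = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)

    # Upper edge of the Marchenko-Pastur band for unit-variance noise.
    lambda_plus = (1.0 + np.sqrt(N / T)) ** 2
    noise = eigvals < lambda_plus

    cleaned = eigvals.copy()
    if noise.any():
        # Flatten the noise band to its mean, keeping the trace unchanged.
        cleaned[noise] = eigvals[noise].mean()

    c = (eigvecs * cleaned) @ eigvecs.T
    d = np.sqrt(np.diag(c))
    return c / np.outer(d, d)  # rescale so the diagonal is exactly 1
```

With T < N the sample matrix has zero eigenvalues, while the clipped matrix is positive definite as long as the noise band carries some variance.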

Evaluation

Performance is never summarized by a single Sharpe ratio. The report includes return and risk metrics, drawdowns, turnover and cost diagnostics with a breakeven-cost figure, realized neutrality over time, and overfitting controls: the probabilistic and deflated Sharpe ratios (Bailey and Lopez de Prado) and a stationary-bootstrap confidence interval on the Sharpe. A walk-forward split with frozen hyperparameters is supported in config.
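
For reference, the probabilistic Sharpe ratio of Bailey and Lopez de Prado adjusts the observed Sharpe for sample length, skewness, and kurtosis. The sketch below is a minimal version of that formula, not the evaluation module's code: `probabilistic_sharpe` is a hypothetical name, and both Sharpe ratios are per period rather than annualized.

```python
import math

import numpy as np


def probabilistic_sharpe(returns, benchmark_sr=0.0):
    """Probability that the true per-period Sharpe exceeds benchmark_sr."""
    r = np.asarray(returns, dtype=float)
    n = r.size
    sr = r.mean() / r.std(ddof=1)

    # Sample skewness and (non-excess) kurtosis from central moments.
    centered = r - r.mean()
    sd = r.std(ddof=0)
    g3 = np.mean(centered ** 3) / sd ** 3
    g4 = np.mean(centered ** 4) / sd ** 4

    denom = math.sqrt(1.0 - g3 * sr + 0.25 * (g4 - 1.0) * sr ** 2)
    z = (sr - benchmark_sr) * math.sqrt(n - 1) / denom
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
```

The deflated Sharpe ratio uses the same expression with the benchmark raised to the Sharpe expected from the best of many unskilled trials, which is why the number of configurations tried matters.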

Data and the survivorship caveat

Free data is the project's central validity caveat, and it is treated as a first-class concern rather than a footnote. Two facts shape what the framework can honestly claim:

Point-in-time S&P 500 membership reconstructed from public change logs is accurate for recent years and progressively incomplete further back. It is best-effort, not a clean historical index.
Free sources generally do not serve prices for delisted, acquired, or bankrupt tickers, which are exactly the names a survivorship correction needs.

Because of this, the realistic operating mode is a current-membership ("surviving names") backtest, and its results carry an upward survivorship bias by construction. The honest product of the data layer is the survivorship-sensitivity analysis, not a single headline number. See SPEC.md Section 6.2 and Section 11.

Development

make lint        # ruff
make format      # black
make typecheck   # mypy
make test        # pytest
make cov         # pytest with coverage
make run         # statarb run --config configs/baseline.yaml

Continuous integration runs lint, format check, type check, and the test suite on Python 3.10 through 3.12, plus a nightly synthetic smoke backtest. New math modules require unit tests; the no-look-ahead guarantee and the neutrality constraints have dedicated tests and should stay green.

References

Avellaneda, M. and Lee, J. (2010). Statistical Arbitrage in the US Equities Market. Quantitative Finance, 10(7), 761-782.
Bailey, D. and Lopez de Prado, M. (2014). The Deflated Sharpe Ratio. Journal of Portfolio Management.
Laloux, L., Cizeau, P., Bouchaud, J.-P. and Potters, M. (1999). Noise Dressing of Financial Correlation Matrices. Physical Review Letters.

License

MIT. See LICENSE.
