AutoForge — Classical Tabular AutoML

Train, optimize, evaluate, ensemble, and deploy machine learning pipelines automatically — with a clean sklearn-style API.

Canonical imports:

from autoforge import AutoForgeRegressor, AutoForgeClassifier

What is AutoForge?

AutoForge is an open-source classical tabular AutoML framework. You give it a CSV or pandas DataFrame with a target column; it figures out how to preprocess the data, which models to try, how to tune them, and which one to keep — then saves everything you need to predict on new rows or explain the decision later.

Think of it as sklearn with automation: instead of writing separate pipelines for imputation, scaling, encoding, model selection, and hyperparameter tuning, you call fit() once and AutoForge runs that workflow for you.

Who is it for?

Data scientists & analysts who want strong baselines quickly on tabular data
Engineers who need a saved pipeline + report they can hand to stakeholders
Students & researchers learning AutoML without wiring every sklearn step by hand
Teams that want reproducible training artifacts (REPORT.md, leaderboard, JSON decisions)

What problem does it solve?

Building a good ML pipeline by hand usually means:

Cleaning and encoding columns differently for linear vs tree models
Trying several algorithms and tuning each one
Checking that validation is honest (no data leakage)
Documenting why you picked the final model

AutoForge automates all of that on the classical tabular path. Advanced paths (deep learning, NAS, multimodal) are opt-in and experimental; the default path is classical ML (Random Forest and similar models) on spreadsheet-style data.

AutoForge vs manual sklearn

| You do manually | AutoForge does automatically |
| --- | --- |
| Pick imputer + scaler + encoder | Searches among 8 preprocessing recipes |
| Try one model at a time | Runs a model search across the classical canon |
| GridSearch / Optuna per model | HPO per algorithm (Optuna when `balanced`/`deep`) |
| Hope your CV setup is leak-free | Holdout split first; feature selection on train only |
| Explain choice in a slide deck | Writes `REPORT.md` with leaderboard + rationale |
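
For comparison, here is a rough sketch of what the manual column can look like in plain scikit-learn. It is not AutoForge code: the toy table, column names, and parameter grid are invented for illustration, and a real project would usually need more candidates than the two shown.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy table invented for illustration: two numeric columns, one categorical.
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "sqft": rng.normal(1500, 300, n),
    "age": rng.integers(0, 50, n).astype(float),
    "city": rng.choice(["a", "b", "c"], n),
})
df.loc[df.index % 17 == 0, "sqft"] = np.nan  # inject some missing values
price = 100 * df["sqft"].fillna(1500) - 500 * df["age"] + rng.normal(0, 5000, n)

# Hold out 20% before any fitting so the final score is out-of-sample.
X_train, X_hold, y_train, y_hold = train_test_split(
    df, price, test_size=0.2, random_state=42
)

# Hand-written preprocessing: impute + scale numerics, impute + encode categoricals.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["sqft", "age"]),
    ("cat", categorical, ["city"]),
])

# One grid per candidate model; GridSearchCV swaps the "model" step.
pipe = Pipeline([("prep", preprocess), ("model", Ridge())])
param_grid = [
    {"model": [Ridge()], "model__alpha": [0.1, 1.0, 10.0]},
    {"model": [RandomForestRegressor(n_estimators=50, random_state=42)],
     "model__max_depth": [3, None]},
]
search = GridSearchCV(pipe, param_grid, cv=3, scoring="r2")
search.fit(X_train, y_train)

holdout_r2 = search.score(X_hold, y_hold)
best_model_name = type(search.best_estimator_.named_steps["model"]).__name__
print(best_model_name, round(holdout_r2, 3))
```

Every piece here (the column lists, the imputation strategies, the grid) is a decision you make and maintain by hand, which is the work the right-hand column describes AutoForge taking over.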

How training works (step by step)

When you call fit(), AutoForge runs roughly this pipeline:

Validate — checks columns, types, missing values, and detects classification vs regression
Split holdout — reserves 20% of rows for final evaluation (never used for selection)
Profile & preprocess — analyzes the train split and picks a preprocessing recipe (balanced/deep)
Select features — on train data only, reduces noise columns if helpful
Search models — cross-validates several algorithms and tunes hyperparameters
Try ensemble — if top models score similarly, tests a voting ensemble
Evaluate on holdout — reports honest out-of-sample metrics
Save artifacts — pipeline + REPORT.md when you use --save-model

You stay in control via search_depth (fast / balanced / deep) — trade training time for search thoroughness.
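
The ordering in the steps above (split the holdout first, select on the train split only, touch the holdout once at the end) can be sketched with the standard library alone. This is a conceptual illustration, not AutoForge's implementation: the helper names, the toy data, and the two stand-in "models" (predict the mean or the median) are all assumptions.

```python
import random
import statistics


def split_holdout(rows, holdout_fraction=0.2, seed=42):
    """Shuffle row indices once and reserve a fraction as the holdout set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_holdout = int(round(len(rows) * holdout_fraction))
    holdout = [rows[i] for i in indices[:n_holdout]]
    train = [rows[i] for i in indices[n_holdout:]]
    return train, holdout


def cv_mae(fit, train, k=5):
    """Mean absolute error of a candidate, averaged over k folds of the train split."""
    errors = []
    for fold in range(k):
        validation = train[fold::k]
        fit_rows = [row for i, row in enumerate(train) if i % k != fold]
        prediction = fit([y for _, y in fit_rows])
        errors.append(statistics.mean(abs(y - prediction) for _, y in validation))
    return statistics.mean(errors)


# Toy data: (feature, target) pairs with a few large outliers in the target.
rows = [(i, 10.0 + (i % 7)) for i in range(95)] + [(i, 500.0) for i in range(95, 100)]

train, holdout = split_holdout(rows)

# Two stand-in "models": predict the train mean or the train median.
candidates = {"mean": statistics.mean, "median": statistics.median}

# Selection uses cross-validation on the train split only.
cv_scores = {name: cv_mae(fit, train) for name, fit in candidates.items()}
best_name = min(cv_scores, key=cv_scores.get)

# The holdout is evaluated once, after the winner is fixed.
final_prediction = candidates[best_name]([y for _, y in train])
holdout_mae = statistics.mean(abs(y - final_prediction) for _, y in holdout)
print(best_name, round(holdout_mae, 2))
```

Because the holdout rows never enter `cv_mae`, the final number is not biased by the selection step.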

First run in 2 minutes

git clone https://github.com/AjayRajan05/AutoForge---AutoML-Engine.git
cd AutoForge---AutoML-Engine

python -m venv .venv
# Windows:  .venv\Scripts\activate
# macOS/Linux:  source .venv/bin/activate

pip install -e ".[dev]"

# Built-in smoke test (no CSV required)
python main.py test --type small

The smoke test trains on a tiny built-in dataset so you can confirm the install works before using your own files.

Train on your CSV, save artifacts, and predict:

python main.py train --data data.csv --target price --save-model my_model --report
# Artifacts: models/my_model/REPORT.md, leaderboard.csv, selection_decision.json

python main.py predict --model my_model --data test.csv --output predictions.csv
python main.py info --model my_model

| Step | What happens |
| --- | --- |
| `train` | Loads `data.csv`, runs the full AutoML pipeline, prints best score and strategy |
| `--save-model my_model` | Writes the trained pipeline to `models/my_model/` |
| `--report` | Prints a short selection summary and points to `REPORT.md` |
| `predict` | Loads the saved model, predicts for the rows in `test.csv`, and writes the result with a `predictions` column to the `--output` file |
| `info` | Shows model type, save time, and file size |

After pip install -e ., the autoforge console script is equivalent to python main.py.

Installation

pip install -e .

Optional extras:

pip install -e ".[lightgbm]"   # LightGBM models
pip install -e ".[dl]"         # TensorFlow + Keras Tuner
pip install -e ".[serve]"      # FastAPI REST server
pip install -e ".[dev]"        # pytest, psutil
pip install -e ".[all]"        # everything above

Pinned versions for CI are in requirements-lock.txt.

Python API

Choose the API that fits your style:

AutoForgeRegressor / AutoForgeClassifier — simplest; feels like sklearn (fit, predict, score)
UnifiedAutoML — full control; same engine the CLI uses; best for custom config and saving

Regression (DataFrame + target column)

Best when your data is one table and the target is a column name (typical CSV workflow):

import pandas as pd
from autoforge import AutoForgeRegressor

df = pd.read_csv("data.csv")
test_df = pd.read_csv("test.csv")

model = AutoForgeRegressor()  # search_depth='balanced' by default
model.fit(df, target="price")
model.print_model_comparison()   # see all models tried
preds = model.predict(test_df)

Classification (sklearn-style split)

Best when you already have separate X_train, y_train arrays or DataFrames:

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from autoforge import AutoForgeClassifier

X, y = load_iris(return_X_y=True)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])
y = pd.Series(y, name="target")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = AutoForgeClassifier(search_depth="balanced", model_family="ml")
model.fit(X_train, y_train)
print(model.score(X_test, y_test))

Full orchestrator

Use this when you need the same object for training, saving, loading, and serving:

import pandas as pd

from core.unified_automl import UnifiedAutoML
from input_output.input_types import AutoMLInput

# Same input files as the regression example above
df = pd.read_csv("data.csv")
test_df = pd.read_csv("test.csv")

automl = UnifiedAutoML({"model_family": "ml", "random_state": 42})
automl.fit(AutoMLInput(data=df, target_column="price"))
automl.save_model("my_model")
preds = automl.predict(test_df)

CLI reference

Prefer the terminal? The CLI wraps the same UnifiedAutoML engine as the Python API.

| Command | Example |
| --- | --- |
| Train | `python main.py train --data data.csv --target price --save-model my_model --report` |
| Predict | `python main.py predict --model my_model --data test.csv --output predictions.csv` |
| Smoke test | `python main.py test --type small` |
| Model info | `python main.py info --model my_model` |
| List models | `python main.py info` |

Useful train flags: --search-depth fast|balanced|deep, --model-family ml, --preference auto, --output results.txt

Saved models and reports go to models/<name>/ (gitignored except .gitkeep).
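
If you drive the CLI from a script, for example in a scheduled job, a thin wrapper can assemble the command from the flags documented above. This is a sketch: the `build_train_command` and `run` helpers are hypothetical, not part of AutoForge, and the wrapper assumes it is launched from the repository root, where `main.py` lives.

```python
import subprocess
import sys

SEARCH_DEPTHS = ("fast", "balanced", "deep")


def build_train_command(data, target, save_model=None,
                        search_depth="balanced", report=False):
    """Assemble the argv list for `python main.py train ...` (hypothetical helper)."""
    if search_depth not in SEARCH_DEPTHS:
        raise ValueError(f"search_depth must be one of {SEARCH_DEPTHS}")
    command = [
        sys.executable, "main.py", "train",
        "--data", data,
        "--target", target,
        "--search-depth", search_depth,
    ]
    if save_model:
        command += ["--save-model", save_model]
    if report:
        command.append("--report")
    return command


def run(command):
    """Execute the command and raise if the CLI exits with a non-zero status."""
    return subprocess.run(command, check=True)


command = build_train_command("data.csv", "price", save_model="my_model", report=True)
print(" ".join(command[1:]))
# Call run(command) from the repository root to start training.
```

Passing the arguments as a list rather than one shell string avoids quoting problems with paths that contain spaces.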

Search modes

Pick how much time AutoForge spends searching. All modes use the same leakage-aware holdout evaluation at the end.

| Mode | Behavior | When to use |
| --- | --- | --- |
| `fast` | CV on default hyperparameters; no preprocessing recipe search | Quick baseline, CI, or very large tables |
| `balanced` (default) | Preprocessing recipe search + per-model HPO; voting ensemble when top models are close | Most real workloads; best time/quality tradeoff |
| `deep` | Preprocessing recipe search + exhaustive grid per algorithm (slowest) | When you need maximum accuracy and can wait |

Preprocessing recipes (balanced / deep): standard, robust, minimal, poly, onehot_low_card, quantile, no_scale_tree, target_encode.

AutoForge screens these recipes on the train split only, then runs model search on the winner.
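
The idea of screening recipes on the train split can be illustrated with scikit-learn. The sketch below is an assumption-heavy stand-in, not AutoForge's code: the recipe names are borrowed from the list above, but what each name maps to, the logistic-regression probe model, and the 3-fold setting are guesses made for the example, and a single model replaces the full model search.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer, RobustScaler, StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Recipe names mirror the README list; their contents here are guesses.
recipes = {
    "standard": StandardScaler(),
    "robust": RobustScaler(),
    "quantile": QuantileTransformer(
        n_quantiles=50, output_distribution="normal", random_state=42
    ),
}


def screen(recipe):
    """Score one recipe with a cheap probe model, using train rows only."""
    probe = make_pipeline(recipe, LogisticRegression(max_iter=1000))
    return cross_val_score(probe, X_train, y_train, cv=3).mean()


scores = {name: screen(recipe) for name, recipe in recipes.items()}
winner = max(scores, key=scores.get)

# Only the winning recipe goes forward; the holdout is scored once at the end.
final = make_pipeline(recipes[winner], LogisticRegression(max_iter=1000))
final.fit(X_train, y_train)
holdout_accuracy = final.score(X_hold, y_hold)
print(winner, round(holdout_accuracy, 3))
```

Because `cross_val_score` refits the scaler inside each fold, the validation rows of a fold do not leak into the scaling statistics either.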

Key features

Dataset validation — catches bad columns, types, and empty targets early
Preprocessing recipe search — picks among 8 pipelines instead of one fixed preprocessor
Feature selection on train only — holdout rows never influence which features are kept
Multi-model search with Optuna HPO — tries classical algorithms with tuned hyperparameters
Adaptive voting ensemble — combines top models when their CV scores are within ensemble_epsilon
Holdout evaluation — 20% split held back until final metrics (reduces overfitting to CV)
Training report bundle — human-readable and machine-readable artifacts for audits and demos
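
The adaptive-ensemble rule above ("within ensemble_epsilon") can be pictured as a simple filter over CV scores. This is a sketch of the decision only: the function name, the default of `0.01`, the `max_members` cap, and the scores are invented for illustration and are not taken from AutoForge.

```python
def ensemble_candidates(cv_scores, ensemble_epsilon=0.01, max_members=3):
    """Return the models whose CV score is within epsilon of the best one.

    Assumes higher scores are better. The default epsilon is a placeholder,
    not AutoForge's actual default.
    """
    ranked = sorted(cv_scores.items(), key=lambda item: item[1], reverse=True)
    best_score = ranked[0][1]
    close = [name for name, score in ranked if best_score - score <= ensemble_epsilon]
    return close[:max_members]


# Invented scores for illustration only, not AutoForge benchmark results.
scores = {"random_forest": 0.912, "xgboost": 0.915, "logistic_regression": 0.871}

members = ensemble_candidates(scores)
use_ensemble = len(members) >= 2  # a single close model means no ensemble to build
print(members, use_ensemble)
```

With a clear winner (no other model within epsilon) the filter returns one name, and the single best model is kept as is.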

Supported models (classical canon)

AutoForge searches a fixed set of well-understood tabular models, covering interpretable linear models as well as less transparent ensembles:

Classification: Logistic Regression, Random Forest, Extra Trees, XGBoost, Gradient Boosting, SVM, KNN, Naive Bayes

Regression: Ridge, Lasso, ElasticNet, Random Forest, Extra Trees, XGBoost, Gradient Boosting, SVR

Run pip install -e ".[lightgbm]" to add LightGBM to the registry when needed.

Architecture

High-level view of how modules connect:

flowchart TD
    A[CSV / DataFrame] --> B[InputValidator]
    B --> C[ProcessorIntegrator]
    C --> D[IntelligenceEngine]
    D --> E[FeatureSelector train only]
    E --> F[ExecutionEngine CV + HPO]
    F --> G{Ensemble?}
    G --> H[ModelEvaluator holdout]
    H --> I[ModelSaver + REPORT.md]
    I --> J[predict / CLI / REST]

core/ — orchestrates the full workflow (UnifiedAutoML)
execution/ — model search, CV, preprocessing pipelines
persistence/ — save/load so predict works after restart
serving/ — optional HTTP API on top of a saved model

Configuration (common keys)

Pass these keys to UnifiedAutoML(config=...) or set them as estimator keyword arguments.

| Key | Default | Description |
| --- | --- | --- |
| `search_depth` | `"balanced"` | `"fast"`, `"balanced"`, or `"deep"` |
| `model_family` | `"ml"` | `"ml"`, `"dl"`, or `"both"` |
| `random_state` | `42` | Reproducibility seed |
| `enable_ensemble` | `True` | Try voting ensemble when top models are close |
| `use_processors` | `True` | Run data-type processors before training |
| `auto_save_model` | `False` | Save automatically at end of `fit()` |
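
One way to keep configuration explicit is to start from the documented defaults and validate overrides before handing the dict to the engine. The `make_config` helper below is hypothetical and lives outside AutoForge; it only checks the two keys whose allowed values are listed in the table and passes any other key through untouched.

```python
# Defaults and allowed values copied from the table above.
DEFAULT_CONFIG = {
    "search_depth": "balanced",
    "model_family": "ml",
    "random_state": 42,
    "enable_ensemble": True,
    "use_processors": True,
    "auto_save_model": False,
}
ALLOWED_VALUES = {
    "search_depth": {"fast", "balanced", "deep"},
    "model_family": {"ml", "dl", "both"},
}


def make_config(**overrides):
    """Merge overrides into the documented defaults and validate enumerated keys."""
    config = {**DEFAULT_CONFIG, **overrides}
    for key, allowed in ALLOWED_VALUES.items():
        if config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {config[key]!r}")
    return config


config = make_config(search_depth="fast", auto_save_model=True)
print(config)
# The resulting dict can then be passed on, e.g. UnifiedAutoML(config).
```

Failing early on a typo such as `search_depth="quick"` is cheaper than discovering it after a long training run.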

See CHANGELOG.md and docs/SEMVER.md for version policy.

Outputs after training

When you use --save-model my_model, inspect models/my_model/. These files are what you show stakeholders to prove how the model was chosen — not just what it predicts.

| File | Contents |
| --- | --- |
| `REPORT.md` | Dataset profile, preprocessing choice, model leaderboard (CV + train time + holdout), selection rationale |
| `leaderboard.csv` | Sortable comparison of all models tried |
| `selection_decision.json` | Winner, runner-up, ensemble decision (machine-readable) |
| `preprocessing_report.json` | Selected recipe and pipeline details |

Demo tip: open REPORT.md after training — it answers “why this model?” better than the terminal summary alone.
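
For audits or CI checks you may want to inspect the bundle programmatically. The sketch below makes no assumption about the real schema: it only reports which of the documented files exist, plus the CSV header and top-level JSON keys it finds. The `summarize_artifacts` helper is hypothetical, and the demo runs against placeholder files whose columns and keys are invented.

```python
import csv
import json
import tempfile
from pathlib import Path


def summarize_artifacts(model_dir):
    """Report which documented artifacts exist and their top-level structure."""
    model_dir = Path(model_dir)
    summary = {"has_report": (model_dir / "REPORT.md").exists()}

    leaderboard = model_dir / "leaderboard.csv"
    if leaderboard.exists():
        with leaderboard.open(newline="") as handle:
            reader = csv.DictReader(handle)
            rows = list(reader)
        summary["leaderboard_columns"] = reader.fieldnames
        summary["leaderboard_rows"] = len(rows)

    decision = model_dir / "selection_decision.json"
    if decision.exists():
        payload = json.loads(decision.read_text())
        summary["decision_keys"] = sorted(payload) if isinstance(payload, dict) else []

    return summary


# Demo against placeholder files; real AutoForge output will have other contents.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    (demo / "leaderboard.csv").write_text("model,cv_score\nmodel_a,0.9\nmodel_b,0.8\n")
    (demo / "selection_decision.json").write_text(json.dumps({"winner": "model_a"}))
    summary = summarize_artifacts(demo)

print(summary)
```

After a real training run you would call `summarize_artifacts("models/my_model")` instead of the temporary directory.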

REST API (optional)

Deploy a saved model as an HTTP service (requires the [serve] extra):

pip install -e ".[serve]"

# Serve a saved model
python -m serving.run --model my_model --host 0.0.0.0 --port 8000

# Or Docker (mount ./models, set AUTOFORGE_MODEL_PATH)
docker compose up

Server: http://localhost:8000 — see /docs for OpenAPI. Useful when another app needs predictions over HTTP instead of importing Python.
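
A client in another application could look roughly like the sketch below. Note the assumptions: the `/predict` path and the `{"rows": [...]}` payload shape are hypothetical placeholders, since this README does not document the endpoints. Check /docs on the running server for the actual routes and request schema before using it.

```python
import json
import urllib.request


def build_predict_request(rows, base_url="http://localhost:8000", path="/predict"):
    """Build a JSON POST request (path and payload shape are assumptions)."""
    body = json.dumps({"rows": rows}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Feature names below are invented for the example.
request = build_predict_request([{"sqft": 1200, "city": "a"}])
print(request.get_method(), request.full_url)

# With a server running, the request could be sent like this:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.load(response))
```

Since the server has no authentication, keep it on a trusted network or behind a gateway that adds it.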

Examples

Ready-made scripts in examples/:

python examples/train_and_report.py --data data.csv --target price --save-model my_model
python examples/load_and_predict.py
python -m benchmarking.classical_suite   # writes benchmarks/results/BENCHMARKS.md

Testing

pip install -e ".[dev]"
pytest tests/ -q -m "not slow"    # fast CI suite (~73 tests)
pytest tests/ -q                  # full suite including slow tests

Coverage includes preprocessing recipes, save/load round-trips, leakage guards, serving E2E, unseen categories, and integration tests.

Benchmarking

AutoForge ships a classical benchmark suite so you can compare fast vs balanced modes and a sklearn baseline on standard datasets.

Datasets: iris, wine, california_housing, regression_synthetic, imbalanced_classification, high_cardinality_cats, housing_csv.

Latest results (see benchmarks/results/BENCHMARKS.md):

Balanced beats fast on holdout: 7/7
Balanced within 5% of sklearn baseline: 6/7
Predict round-trip (save → load → predict): OK

Re-run: python -m benchmarking.classical_suite

Stable vs experimental

| Area | Status |
| --- | --- |
| Tabular ML (`UnifiedAutoML`, `AutoForgeRegressor`/`Classifier`) | Stable (primary path) |
| Ensembling, Optuna HPO, model persistence, training reports | Stable when deps installed |
| Text / time-series processors | Beta |
| Deep learning, NAS, multimodal, meta-learning | Experimental (opt-in) |
| REST serving | Beta; no auth; requires `[serve]` |
| Legacy `api/` wrappers | Deprecated; use `autoforge` or `core.estimator` |

Known limitations (v1.0)

Classical tabular focus; DL/NAS/multimodal are not production-grade
Target encoding uses train-split mean encoding (not full nested CV encoding)
Intelligence engine may analyze the full dataset before the holdout split (documented tradeoff)
No ONNX export; serving has no authentication
Some edge-case datasets (e.g. malformed CSV columns) may need manual cleaning
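
To make the target-encoding limitation concrete, here is a minimal sketch of train-split mean encoding. It is not AutoForge's implementation: the function names are made up, and falling back to the global train mean for unseen categories is an assumption chosen for the example.

```python
from collections import defaultdict


def fit_mean_encoding(categories, targets):
    """Learn per-category target means from the train split only."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    mapping = {category: sums[category] / counts[category] for category in sums}
    global_mean = sum(targets) / len(targets)
    return mapping, global_mean


def apply_mean_encoding(categories, mapping, global_mean):
    """Encode categories; unseen ones fall back to the global train mean (assumption)."""
    return [mapping.get(category, global_mean) for category in categories]


train_city = ["a", "a", "b", "b", "b"]
train_price = [100.0, 200.0, 300.0, 300.0, 600.0]

mapping, global_mean = fit_mean_encoding(train_city, train_price)
encoded_holdout = apply_mean_encoding(["a", "b", "c"], mapping, global_mean)
print(encoded_holdout)  # [150.0, 400.0, 300.0]
```

The holdout rows are encoded with statistics learned on train rows, so the holdout stays clean. Within the train split, however, each row's encoding includes its own target value, which is the optimism that a nested (out-of-fold) encoding would remove.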

Project layout

core/              UnifiedAutoML, estimators, training reports
execution/         Model search, CV, preprocessing pipeline
processors/        Tabular, text, time-series preprocessing
intelligence/      Dataset analysis and strategy
ensemble/          Voting/stacking integration
optimizer/         Hyperparameter search (Optuna)
persistence/       Save/load pipelines
registry/          Model and feature registries
benchmarking/      Classical benchmark suite
serving/           Optional FastAPI server
tests/             Pytest suite
examples/          Train, predict, benchmark scripts
main.py            CLI entry point
pyproject.toml     Package metadata and extras

Contributing

git checkout -b feature/my-feature
pytest tests/ -q -m "not slow"

Open a PR with a clear description, tests, and documentation updates.

License

MIT — see LICENSE.
