An end-to-end ML pipeline that predicts telecom customer churn with cost-sensitive threshold tuning, feature explainability, and an interactive Streamlit dashboard.
churn-prediction/
├── data/
│   ├── raw/
│   │   ├── generate_dataset.py
│   │   └── telco_churn.csv
│   └── processed/
│       └── features.csv
│
├── src/
│   ├── features.py
│   ├── train.py
│   ├── evaluate.py
│   └── explain.py
│
├── models/
│   ├── best_model.pkl
│   ├── metadata.json
│   ├── evaluation_report.json
│   └── threshold_sweep.csv
│
├── tests/
│   └── test_pipeline.py
│
├── app.py
├── requirements.txt
└── README.md
- Python 3.10+
- pip
Install dependencies:

    pip install -r requirements.txt

Step 1: Generate dataset

    python data/raw/generate_dataset.py

This generates data/raw/telco_churn.csv (7,043 customers, 25.3% churn rate).

Alternatively, download the real IBM Telco dataset from Kaggle (blastchar/telco-customer-churn) and place it at data/raw/telco_churn.csv. The column names are identical, so no code changes are required.

Step 2: Build features

    python src/features.py

Step 3: Train models

    python src/train.py

Step 4: Evaluate

    python src/evaluate.py

Step 5: Launch dashboard

    streamlit run app.py

The dashboard opens at http://localhost:8501

Run tests

    pytest tests/ -v

| Property | Value |
|---|---|
| Total customers | 7,043 |
| Features after engineering | 28 |
| Churn rate | 25.30% |
| Train / test split | 80% / 20% (stratified) |
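The stratified 80% / 20% split listed above can be sketched with scikit-learn's train_test_split. This is a minimal illustration on synthetic data, not the code in src/train.py; the toy arrays and the random_state value are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the processed feature matrix: 1,000 rows, 25% positives.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.array([1] * 250 + [0] * 750)

# stratify=y keeps the churn rate (almost) identical in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

print(len(y_train), len(y_test))
print(y_train.mean(), y_test.mean())
```

Without stratification, a random split of an imbalanced dataset can leave the test set with a noticeably different churn rate, which makes held-out metrics harder to compare with cross-validation scores.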
| Model | Mean AUC | Std |
|---|---|---|
| Logistic Regression | 0.7604 | 0.0109 |
| Voting Ensemble | 0.7555 | 0.0111 |
| Random Forest | 0.7516 | 0.0123 |
| Gradient Boosting | 0.7308 | 0.0112 |
| Model | AUC |
|---|---|
| Logistic Regression | 0.7461 |
| Voting Ensemble | 0.7375 |
| Random Forest | 0.7310 |
| Gradient Boosting | 0.7134 |
Best model selected: Logistic Regression (AUC = 0.7461)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| No Churn | 0.87 | 0.63 | 0.73 |
| Churn | 0.40 | 0.73 | 0.52 |
| Weighted Avg | 0.75 | 0.65 | 0.68 |
Overall accuracy: 65% at default threshold.
| | Pred: Stay | Pred: Churn |
|---|---|---|
| Actual: Stay | 661 | 391 |
| Actual: Churn | 96 | 261 |
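As a quick consistency check, the churn-class metrics and the overall accuracy can be recomputed directly from the four confusion-matrix counts. The variable names below are illustrative only.

```python
# Counts taken from the confusion matrix above (default threshold).
tn, fp = 661, 391   # actual "stay" row
fn, tp = 96, 261    # actual "churn" row

precision_churn = tp / (tp + fp)
recall_churn = tp / (tp + fn)
f1_churn = 2 * precision_churn * recall_churn / (precision_churn + recall_churn)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"precision={precision_churn:.2f} recall={recall_churn:.2f} "
      f"f1={f1_churn:.2f} accuracy={accuracy:.2f}")
# precision=0.40 recall=0.73 f1=0.52 accuracy=0.65
```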
| Parameter | Value |
|---|---|
| FN cost per missed churner | $500 |
| FP cost per false retention offer | $50 |
| Optimal threshold | 0.23 |
| Precision at optimal | 0.323 |
| Recall at optimal | 0.958 |
| Estimated revenue saved | $135,100 |
The pipeline sweeps thresholds from 0.10 to 0.90 and selects the one that maximises expected revenue saved. At the default 0.50 threshold the model is more conservative and recalls 73% of actual churners. Lowering the threshold to 0.23 captures 95.8% of them at the cost of more false positives (precision falls from 0.40 to 0.323). The net revenue outcome is still better, because a missed churner costs $500 while a wasted retention offer costs only $50.
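A threshold sweep of this kind might look like the sketch below. The objective (COST_FN × true positives minus COST_FP × false positives), the 0.01 step size, and the synthetic scores are all assumptions made for illustration; the actual definition lives in evaluate.py and may differ.

```python
import numpy as np

COST_FN = 500  # cost of a missed churner (from the table above)
COST_FP = 50   # cost of an unnecessary retention offer


def revenue_saved(y_true, y_prob, threshold):
    """Assumed objective: value of caught churners minus wasted offers."""
    y_pred = y_prob >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    return COST_FN * tp - COST_FP * fp


def best_threshold(y_true, y_prob, lo=0.10, hi=0.90, step=0.01):
    """Sweep thresholds in [lo, hi]; return (threshold, revenue) of the best one."""
    thresholds = np.round(np.arange(lo, hi + step / 2, step), 2)
    revenues = [revenue_saved(y_true, y_prob, t) for t in thresholds]
    i = int(np.argmax(revenues))
    return float(thresholds[i]), int(revenues[i])


# Synthetic scores: churners tend to score higher than non-churners.
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.25, size=2000)
y_prob = np.clip(rng.normal(0.35 + 0.25 * y_true, 0.15), 0, 1)

t, rev = best_threshold(y_true, y_prob)
print(t, rev)
```

Under this assumed objective, 342 true positives and 718 false positives would give exactly $135,100. Those counts are back-calculated here from the reported recall and precision on the 357 test-set churners; they are not reported in the source.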
| Feature | Description |
|---|---|
| tenure | Months as a customer |
| MonthlyCharges | Current monthly bill |
| Contract | Ordinal: 0 = monthly, 1 = annual, 2 = two-year |
| service_count | Total add-on services subscribed |
| is_new_customer | 1 if tenure < 6 months |
| has_no_support | No tech support and no online security |
| charges_per_month | TotalCharges / tenure |
| monthly_to_total | MonthlyCharges / TotalCharges ratio |
| contract_tenure | Contract × tenure interaction term |
| OHE columns | Gender, InternetService, PaymentMethod |
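Several of the engineered features in the table can be derived in a few lines of pandas. The sketch below is illustrative rather than a copy of src/features.py: the three toy customers are made up, and the guard against division by zero is an assumption about how zero-tenure customers should be handled.

```python
import numpy as np
import pandas as pd

# Three toy customers using the raw Telco column names.
raw = pd.DataFrame({
    "tenure": [1, 24, 60],
    "MonthlyCharges": [70.0, 55.5, 99.0],
    "TotalCharges": [70.0, 1300.0, 5900.0],
    "Contract": ["Month-to-month", "One year", "Two year"],
})

feats = raw.copy()
# Ordinal encoding of contract length: 0 = monthly, 1 = annual, 2 = two-year.
feats["Contract"] = raw["Contract"].map(
    {"Month-to-month": 0, "One year": 1, "Two year": 2}
)
feats["is_new_customer"] = (raw["tenure"] < 6).astype(int)
# Assumption: replace zeros with NaN so brand-new customers do not divide by zero.
feats["charges_per_month"] = raw["TotalCharges"] / raw["tenure"].replace(0, np.nan)
feats["monthly_to_total"] = raw["MonthlyCharges"] / raw["TotalCharges"].replace(0, np.nan)
# Interaction term: encoded contract level times tenure.
feats["contract_tenure"] = feats["Contract"] * raw["tenure"]

print(feats)
```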
| Model | Configuration |
|---|---|
| Random Forest | 300 trees, max_depth=12, class_weight=balanced |
| Gradient Boosting | HistGradientBoosting, 300 iterations, lr=0.05 |
| Logistic Regression | L2, C=0.5, class_weight=balanced |
| Voting Ensemble | Soft vote across all three |
All models use a StandardScaler preprocessing step inside a scikit-learn Pipeline. The best model is selected by ROC-AUC on the held-out validation set and saved to models/best_model.pkl.
The dashboard provides feature-level explanations for every prediction using the model's built-in feature importances. For each customer, it shows the top 15 features driving the prediction, coloured by whether the customer is above or below the dataset average on that feature. A radar chart compares the customer profile against the average churner and average non-churner across six key dimensions.
Rule-based counterfactual suggestions are generated based on the customer's current feature values — for example, recommending a contract upgrade if the customer is on a monthly plan.
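A rule-based suggestion step of this kind could be as simple as the function below. Only the contract-upgrade rule comes from the description above; the function name suggest_actions and the other two rules are hypothetical examples built on the engineered feature names.

```python
def suggest_actions(customer: dict) -> list[str]:
    """Return retention suggestions from simple rules on engineered features."""
    actions = []
    if customer.get("Contract") == 0:  # 0 = monthly plan
        actions.append("Offer an upgrade to an annual or two-year contract")
    if customer.get("has_no_support") == 1:  # hypothetical rule
        actions.append("Offer tech support or online security add-ons")
    if customer.get("is_new_customer") == 1:  # hypothetical rule
        actions.append("Enrol in a new-customer onboarding programme")
    return actions


monthly_customer = {"Contract": 0, "has_no_support": 1, "is_new_customer": 0}
for action in suggest_actions(monthly_customer):
    print("-", action)
```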
If shap is installed, the dashboard upgrades to full SHAP TreeExplainer values automatically.
| Page | Content |
|---|---|
| Predict | Customer profile form, churn probability gauge, feature importance chart, radar comparison, recommended actions |
| Evaluate | ROC-AUC, PR-AUC, F1, recall, precision, confusion matrix, CV scores by model, per-class metrics, threshold sweep with revenue curve, global feature importance |
| Explore | Churn distribution, tenure histogram, churn rate by contract / internet / payment method, monthly charges boxplot, tenure vs charges scatter, senior vs non-senior breakdown |
- Swap in the real IBM Telco dataset: same column names, zero code changes required
- Adjust COST_FN and COST_FP in evaluate.py to reflect actual business unit economics
- Add MLflow experiment tracking by wrapping train.py with mlflow.start_run()
- Deploy to Streamlit Cloud for free: push to GitHub, connect at share.streamlit.io