SentimentEdge is a full-stack ML web application that reads financial tweets and predicts whether a stock will move UP, DOWN, or stay NEUTRAL the next day.
It uses a dual-model pipeline:
- FinBERT: a financial NLP model that extracts sentiment scores from the tweet
- Random Forest: a trained classifier that maps those scores to stock movement predictions
No downloads. No local setup. Paste a tweet and get a prediction instantly via the live demo.
Tweet Input
                 │
                 ▼
┌─────────────────────────────────┐
│  FinBERT (ProsusAI/finbert)     │  ← Pre-trained financial NLP
│  Extracts sentiment scores:     │
│    positive: 0.92               │
│    negative: 0.04               │
│    neutral:  0.04               │
│    polarity: +0.92              │
└────────────────┬────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│  Feature Engineering            │  ← Sentiment + metadata features
│  [polarity, pos_score,          │
│   neg_score, neu_score,         │
│   tweet_volume, pumper, ...]    │
└────────────────┬────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│  Random Forest Classifier       │  ← Trained on 1 year of real data
│  100 trees vote:                │
│    UP:      72 votes (majority) │
│    NEUTRAL: 17 votes            │
│    DOWN:    11 votes            │
└────────────────┬────────────────┘
                 │
                 ▼
Prediction: UP
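
The flow in the diagram can be sketched in plain Python. This is a minimal illustration, not the code in `model.py` or `predict.py`. The polarity rule (signed confidence of the top label) is an assumption inferred from the example numbers above, the function names are hypothetical, and the `tweet_volume` value is made up. The remaining metadata features (`pumper, ...`) are left out because the diagram does not define them.

```python
from collections import Counter


def polarity_from_scores(scores):
    """Assumed rule: signed confidence of the top label (+ positive, - negative, 0 neutral)."""
    label = max(scores, key=scores.get)
    sign = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}[label]
    return sign * scores[label]


def build_features(scores, tweet_volume):
    """Assemble the first five features in the order shown in the diagram."""
    return [
        polarity_from_scores(scores),
        scores["positive"],
        scores["negative"],
        scores["neutral"],
        float(tweet_volume),  # hypothetical metadata value
    ]


def majority_vote(tree_votes):
    """Return the most common label and the share of trees that voted for it."""
    counts = Counter(tree_votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(tree_votes)


# Numbers taken from the diagram above; tweet_volume is invented for the example.
scores = {"positive": 0.92, "negative": 0.04, "neutral": 0.04}
features = build_features(scores, tweet_volume=120)
votes = ["UP"] * 72 + ["NEUTRAL"] * 17 + ["DOWN"] * 11
movement, share = majority_vote(votes)

print(features)
print(movement, share)
```

Note that scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than counting hard votes, so the vote counts in the diagram are best read as an illustration of the ensemble idea.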
| Model | Metric | Score |
|---|---|---|
| Random Forest | Accuracy | 67.4% |
| Random Forest | AUC-ROC | 0.712 |
| FinBERT | Accuracy | 87.3% |
| FinBERT | Classification | Positive / Negative / Neutral |
| Feature | Description |
|---|---|
| Single Tweet Analysis | Paste any financial tweet to get FinBERT scores and an RF movement prediction instantly |
| Dual Model Pipeline | FinBERT sentiment scores feed directly into the Random Forest as features |
| FinBERT Explorer | Inspect raw positive / negative / neutral confidence scores per tweet |
| Evaluation Dashboard | View model accuracy, F1 score, AUC-ROC, and performance breakdown |
| Zero Setup | Fully hosted on Hugging Face Spaces: no installation, no downloads |
Trained on a curated dataset of financial tweets covering the top 25 most-watched stock tickers on Yahoo Finance, spanning 30 Sept 2021 to 30 Sept 2022.
| Column | Description |
|---|---|
| Date | Date and time of the tweet |
| Tweet | Full text of the tweet |
| Stock Name | Ticker symbol (e.g. AAPL, TSLA) |
| Company Name | Full company name |
| Price / Volume | Yahoo Finance market data for corresponding dates |
The dataset enables direct correlation between public tweet sentiment and actual next-day price movement, so the Random Forest labels come from real market outcomes.
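
One way to derive next-day labels from closing prices is sketched below. The source does not state the labeling rule used in `train.py`, so the close-to-close return, the `threshold` dead band of 0.5%, and the function name are all assumptions, and the prices are toy values.

```python
def label_next_day_moves(closes, threshold=0.005):
    """Label each day by the next day's close-to-close return.

    `threshold` is a hypothetical dead band: returns within +/- threshold
    are treated as NEUTRAL. The last day gets no label (no next day).
    """
    labels = []
    for today, tomorrow in zip(closes, closes[1:]):
        ret = (tomorrow - today) / today
        if ret > threshold:
            labels.append("UP")
        elif ret < -threshold:
            labels.append("DOWN")
        else:
            labels.append("NEUTRAL")
    return labels


# Toy closing prices, not taken from the dataset.
closes = [100.0, 102.0, 101.9, 99.0]
labels = label_next_day_moves(closes)
print(labels)
```

Each tweet would then inherit the label of the trading day it was posted on, joined on ticker and date.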
Inspired by Stock Market Tweet Sentiment Analysis and Stock-Market Sentiment Dataset.
| Layer | Technology | Purpose |
|---|---|---|
| NLP Model | ProsusAI/FinBERT (HuggingFace Transformers) | Financial sentiment extraction |
| ML Model | Scikit-learn Random Forest | Stock movement classification |
| Serialization | Joblib | Save/load trained RF model |
| Backend | Flask + Flask-CORS | REST API with 6+ endpoints |
| Data Processing | Pandas, NumPy | Feature engineering pipeline |
| Frontend | HTML, CSS, Vanilla JS | Multi-page interactive dashboard |
| Containerization | Docker | Reproducible deployment |
| Hosting | Hugging Face Spaces | Zero-setup public access |
SentimentEdge/
├── backend/
│   ├── app.py                    # Flask app: routes and API endpoints
│   ├── model.py                  # FinBERT loader and inference (classify_text, classify_batch)
│   ├── predict.py                # Random Forest prediction (predict_stock, predict_batch)
│   ├── train.py                  # Training pipeline: feature engineering + RF training
│   ├── templates/
│   │   ├── index.html            # Overview dashboard
│   │   ├── analyzer.html         # Tweet analyzer page
│   │   ├── finbert.html          # FinBERT explorer page
│   │   └── evaluation.html       # Model evaluation page
│   ├── static/
│   │   ├── style.css             # Global styles
│   │   └── app.js                # Frontend logic and API calls
│   └── models/
│       ├── finbert/              # Local FinBERT weights (offline mode)
│       └── random_forest.joblib  # Trained RF model
├── requirements.txt
└── Dockerfile
| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | Overview dashboard |
| GET | `/analyzer` | Tweet analyzer UI |
| GET | `/finbert` | FinBERT explorer UI |
| GET | `/evaluation` | Model evaluation UI |
| POST | `/api/analyze` | Analyze a single tweet and return the FinBERT + RF result |
| POST | `/api/predict` | Predict stock movement from features |
| POST | `/api/predict/batch` | Batch tweet prediction |
| GET | `/api/metrics` | Model performance metrics |
| GET | `/api/status` | API health check + model status |
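
The `POST /api/analyze` endpoint can also be called from Python using only the standard library. The sketch below assumes the base URL and the response shape shown in this README's curl example; the helper names (`build_analyze_request`, `summarize`, `analyze`) are hypothetical and not part of the project.

```python
import json
import urllib.request

# Base URL taken from the curl example in this README.
BASE_URL = "https://vectorxx-sentiment.hf.space"


def build_analyze_request(tweet, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for /api/analyze."""
    body = json.dumps({"tweet": tweet}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(payload):
    """Condense a response into one line, assuming the documented response shape."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    fin = payload["finbert"]
    pred = payload["prediction"]
    return f"{fin['label']} ({fin['score']:.2f}) -> {pred['movement']} ({pred['confidence']:.2f})"


def analyze(tweet, timeout=60):
    """Send the request and decode the JSON body. Requires network access."""
    with urllib.request.urlopen(build_analyze_request(tweet), timeout=timeout) as resp:
        return json.load(resp)


# Offline demonstration using the sample response from this README.
sample = {
    "status": "success",
    "finbert": {"label": "positive", "score": 0.92, "score_pct": 92.0, "polarity": 0.92},
    "prediction": {"movement": "UP", "confidence": 0.87, "signal": "BULLISH"},
}
print(summarize(sample))
```

Against the live Space, `summarize(analyze("..."))` would do the same with a real response. A generous timeout is useful because the first request after the Space wakes up can be slow.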
curl -X POST https://vectorxx-sentiment.hf.space/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"tweet": "Tesla just announced record deliveries! $TSLA to the moon!"}'

{
  "status": "success",
  "finbert": {
    "label": "positive",
    "score": 0.92,
    "score_pct": 92.0,
    "polarity": 0.92
  },
  "prediction": {
    "movement": "UP",
    "confidence": 0.87,
    "signal": "BULLISH"
  }
}

# 1. Clone the repo
git clone https://github.com/YOUR_USERNAME/SentimentEdge.git
cd SentimentEdge/backend
# 2. Install dependencies
pip install -r requirements.txt
# 3. (Optional) Download FinBERT locally for offline mode
python download_models.py
# 4. Start the server
python app.py

Visit http://localhost:7860
If FinBERT is not downloaded locally, it will automatically load from HuggingFace on first run (requires internet).
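
The local-first loading behavior described above could look roughly like this. It is a sketch, not the project's `model.py`: the function names are hypothetical, and checking for `config.json` assumes the local weights were written with `save_pretrained`. The `from_pretrained` calls are the standard Transformers API and accept either a local directory or a Hub model ID.

```python
from pathlib import Path

HUB_MODEL_ID = "ProsusAI/finbert"


def resolve_finbert_source(local_dir):
    """Return the local weights directory if it looks populated, else the Hub model ID."""
    path = Path(local_dir)
    # Assumption: a directory saved with save_pretrained contains config.json.
    if path.is_dir() and (path / "config.json").is_file():
        return str(path)
    return HUB_MODEL_ID


def load_finbert(local_dir="models/finbert"):
    """Load tokenizer and model from disk when available, otherwise from the Hub."""
    # Imported lazily so resolve_finbert_source works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    source = resolve_finbert_source(local_dir)
    tokenizer = AutoTokenizer.from_pretrained(source)
    model = AutoModelForSequenceClassification.from_pretrained(source)
    return tokenizer, model
```

When the source resolves to the Hub ID, the first call downloads and caches the weights, which is why the first run needs internet access.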
# Build
docker build -t sentimentedge .
# Run
docker run -p 7860:7860 sentimentedge

Try SentimentEdge on Hugging Face Spaces
First load may take 2-3 minutes if the Space was inactive (the free tier auto-sleeps after 48 hours of no traffic).
- Built a dual-model NLP pipeline combining FinBERT (87.3% acc.) and Random Forest to predict next-day stock price movement from financial tweets
- Developed and deployed a production-ready full-stack ML web app using Flask REST API with 6+ endpoints, containerized via Docker and hosted on Hugging Face Spaces
- Engineered a hybrid feature pipeline where FinBERT sentiment confidence scores are extracted and passed directly as input features to the Random Forest classifier
- Trained on a one-year real-world Twitter dataset covering the top 25 Yahoo Finance tickers, combining tweet text with actual stock price and volume data
- Designed a multi-page interactive dashboard with batch and single-tweet prediction, live API integration, and a model evaluation page