Illuminoxx/QuantMirror


SentimentEdge 📈

Dual-model financial tweet sentiment analysis for next-day stock movement prediction

Live Demo Python Flask HuggingFace Docker License


What is SentimentEdge?

SentimentEdge is a full-stack ML web application that reads financial tweets and predicts whether a stock will move UP, DOWN, or stay NEUTRAL the next day.

It uses a dual-model pipeline:

  • FinBERT: a financial NLP model that extracts sentiment scores from the tweet
  • Random Forest: a trained classifier that maps those scores to stock movement predictions

No downloads. No local setup. Paste a tweet → get a prediction instantly via the live demo.


How It Works

Tweet Input
    │
    ▼
┌─────────────────────────────────┐
│  FinBERT (ProsusAI/finbert)     │  ← Pre-trained Financial NLP
│  Extracts sentiment scores:     │
│  positive: 0.92                 │
│  negative: 0.04                 │
│  neutral:  0.04                 │
│  polarity: +0.92                │
└──────────────┬──────────────────┘
               │
               ▼
┌─────────────────────────────────┐
│  Feature Engineering            │  ← Sentiment + metadata features
│  [polarity, pos_score,          │
│   neg_score, neu_score,         │
│   tweet_volume, pumper, ...]    │
└──────────────┬──────────────────┘
               │
               ▼
┌─────────────────────────────────┐
│  Random Forest Classifier       │  ← Trained on 1 year of real data
│  100 trees vote →               │
│  UP: 72 votes ✓                 │
│  NEUTRAL: 17 votes              │
│  DOWN: 11 votes                 │
└──────────────┬──────────────────┘
               │
               ▼
    Prediction: UP
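
The flow in the diagram can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: `signed_polarity`, `build_features`, and `majority_vote` are hypothetical helpers, the polarity rule (signed score of the top label) is an assumption inferred from the example values, and `tweet_volume` and `pumper` are passed in as already-computed numbers.

```python
from collections import Counter


def signed_polarity(scores):
    """Assumed rule: +score for positive, -score for negative, 0.0 for neutral."""
    label = max(scores, key=scores.get)
    if label == "positive":
        return scores[label]
    if label == "negative":
        return -scores[label]
    return 0.0


def build_features(scores, tweet_volume, pumper):
    """Order follows the diagram: polarity, pos, neg, neu, then metadata."""
    return [
        signed_polarity(scores),
        scores["positive"],
        scores["negative"],
        scores["neutral"],
        tweet_volume,
        pumper,
    ]


def majority_vote(tree_votes):
    """Return the winning class and its share of the votes."""
    label, count = Counter(tree_votes).most_common(1)[0]
    return label, count / len(tree_votes)


# Values copied from the diagram; tweet_volume and pumper are made up.
scores = {"positive": 0.92, "negative": 0.04, "neutral": 0.04}
features = build_features(scores, tweet_volume=120, pumper=0)
votes = ["UP"] * 72 + ["NEUTRAL"] * 17 + ["DOWN"] * 11
prediction, vote_share = majority_vote(votes)
print(features, prediction, vote_share)
```

The hard-vote tally mirrors the diagram. scikit-learn's `RandomForestClassifier` combines trees by averaging their class probabilities, so the library's reported confidence need not equal a raw vote share.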

Model Performance

| Model | Metric | Score |
| --- | --- | --- |
| Random Forest | Accuracy | 67.4% |
| Random Forest | AUC-ROC | 0.712 |
| FinBERT | Accuracy | 87.3% |
| FinBERT | Classification | Positive / Negative / Neutral |
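
The README does not show how these scores were computed. As a reference point, the standard definitions of accuracy and macro-averaged F1 for a three-class problem can be sketched as follows; the function names and the toy labels are illustrative assumptions, not the project's evaluation code or data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only; these are not the project's results.
y_true = ["UP", "UP", "DOWN", "NEUTRAL", "DOWN", "UP"]
y_pred = ["UP", "DOWN", "DOWN", "NEUTRAL", "DOWN", "UP"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```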

Features

| Feature | Description |
| --- | --- |
| 🔍 Single Tweet Analysis | Paste any financial tweet, get FinBERT scores + RF movement prediction instantly |
| 🤖 Dual Model Pipeline | FinBERT sentiment scores feed directly as features into Random Forest |
| 📊 FinBERT Explorer | Inspect raw positive / negative / neutral confidence scores per tweet |
| 📈 Evaluation Dashboard | View model accuracy, F1 score, AUC-ROC, and performance breakdown |
| 🌐 Zero Setup | Fully hosted on Hugging Face Spaces: no installation, no downloads |

Dataset

Trained on a curated dataset of financial tweets covering the top 25 most-watched stock tickers on Yahoo Finance, spanning 30 Sept 2021 – 30 Sept 2022.

| Column | Description |
| --- | --- |
| Date | Date and time of the tweet |
| Tweet | Full text of the tweet |
| Stock Name | Ticker symbol (e.g. AAPL, TSLA) |
| Company Name | Full company name |
| Price / Volume | Yahoo Finance market data for corresponding dates |

The dataset pairs public tweet sentiment with the actual next-day price movement, so the Random Forest labels are grounded in real market data.
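
One way such labels could be derived from daily closing prices is sketched below. The source does not state the labeling rule, so `label_next_day`, the close-to-close return, and the 0.5% neutral band are all assumptions made for illustration.

```python
def label_next_day(closes, threshold=0.005):
    """Label each day by the next day's close-to-close return.

    closes: list of (date, close) tuples sorted by date.
    threshold: hypothetical neutral band (0.5%); the source does not state one.
    """
    labels = {}
    for (day, close), (_, next_close) in zip(closes, closes[1:]):
        change = (next_close - close) / close
        if change > threshold:
            labels[day] = "UP"
        elif change < -threshold:
            labels[day] = "DOWN"
        else:
            labels[day] = "NEUTRAL"
    return labels


# Made-up prices for illustration only.
closes = [
    ("2021-09-30", 100.0),
    ("2021-10-01", 102.0),
    ("2021-10-04", 101.9),
    ("2021-10-05", 99.0),
]
print(label_next_day(closes))
```

The last day in the series gets no label because its next-day close is not yet known.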

Inspired by Stock Market Tweet Sentiment Analysis and Stock-Market Sentiment Dataset.


Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| NLP Model | ProsusAI/FinBERT (HuggingFace Transformers) | Financial sentiment extraction |
| ML Model | Scikit-learn Random Forest | Stock movement classification |
| Serialization | Joblib | Save/load trained RF model |
| Backend | Flask + Flask-CORS | REST API with 6+ endpoints |
| Data Processing | Pandas, NumPy | Feature engineering pipeline |
| Frontend | HTML, CSS, Vanilla JS | Multi-page interactive dashboard |
| Containerization | Docker | Reproducible deployment |
| Hosting | Hugging Face Spaces | Zero-setup public access |

Project Structure

SentimentEdge/
├── backend/
│   ├── app.py              # Flask app: routes and API endpoints
│   ├── model.py            # FinBERT loader and inference (classify_text, classify_batch)
│   ├── predict.py          # Random Forest prediction (predict_stock, predict_batch)
│   ├── train.py            # Training pipeline: feature engineering + RF training
│   ├── templates/
│   │   ├── index.html      # Overview dashboard
│   │   ├── analyzer.html   # Tweet analyzer page
│   │   ├── finbert.html    # FinBERT explorer page
│   │   └── evaluation.html # Model evaluation page
│   ├── static/
│   │   ├── style.css       # Global styles
│   │   └── app.js          # Frontend logic and API calls
│   └── models/
│       ├── finbert/        # Local FinBERT weights (offline mode)
│       └── random_forest.joblib  # Trained RF model
├── requirements.txt
└── Dockerfile

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | / | Overview dashboard |
| GET | /analyzer | Tweet analyzer UI |
| GET | /finbert | FinBERT explorer UI |
| GET | /evaluation | Model evaluation UI |
| POST | /api/analyze | Analyze a single tweet → FinBERT + RF result |
| POST | /api/predict | Predict stock movement from features |
| POST | /api/predict/batch | Batch tweet prediction |
| GET | /api/metrics | Model performance metrics |
| GET | /api/status | API health check + model status |

Example Request

curl -X POST https://vectorxx-sentiment.hf.space/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"tweet": "Tesla just announced record deliveries! $TSLA to the moon!"}'

Example Response

{
  "status": "success",
  "finbert": {
    "label": "positive",
    "score": 0.92,
    "score_pct": 92.0,
    "polarity": 0.92
  },
  "prediction": {
    "movement": "UP",
    "confidence": 0.87,
    "signal": "BULLISH"
  }
}
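
The same call can be made from Python with only the standard library. This is a minimal client sketch: `build_request`, `summarize`, and `analyze` are hypothetical helper names, and the code assumes the response always has the shape shown in the example above.

```python
import json
import urllib.request

API_URL = "https://vectorxx-sentiment.hf.space/api/analyze"


def build_request(tweet, url=API_URL):
    """Build the same POST request as the curl example."""
    body = json.dumps({"tweet": tweet}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(payload):
    """Pull the headline fields out of a response shaped like the example."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    finbert = payload["finbert"]
    prediction = payload["prediction"]
    return (
        f"{prediction['movement']} ({prediction['confidence']:.0%} confidence), "
        f"FinBERT {finbert['label']} {finbert['score']:.2f}"
    )


def analyze(tweet, timeout=180):
    """Send the tweet to the hosted API; the long timeout allows for a cold start."""
    with urllib.request.urlopen(build_request(tweet), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `print(summarize(analyze("Tesla just announced record deliveries!")))` would print a one-line summary, provided the Space is awake and reachable.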

Run Locally

# 1. Clone the repo
git clone https://github.com/YOUR_USERNAME/SentimentEdge.git
cd SentimentEdge/backend

# 2. Install dependencies (requirements.txt sits at the repo root in the project structure)
pip install -r ../requirements.txt

# 3. (Optional) Download FinBERT locally for offline mode
python download_models.py

# 4. Start the server
python app.py

Visit http://localhost:7860

If FinBERT is not downloaded locally, it will automatically load from HuggingFace on first run (requires internet).
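
The fallback described above could look roughly like this. It is a sketch under assumptions: `resolve_finbert_source` is a hypothetical helper, and the `models/finbert` path is taken from the project structure rather than from the actual loader in `model.py`.

```python
from pathlib import Path

HUB_MODEL_ID = "ProsusAI/finbert"


def resolve_finbert_source(local_dir="models/finbert"):
    """Return the local weights directory if it has files, else the Hub model ID."""
    path = Path(local_dir)
    if path.is_dir() and any(path.iterdir()):
        return str(path)
    return HUB_MODEL_ID


print(resolve_finbert_source())
```

The returned string can be passed to the Transformers `from_pretrained` methods, which accept either a local directory or a Hub model ID.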


Docker

# Build
docker build -t sentimentedge .

# Run
docker run -p 7860:7860 sentimentedge

Live Demo

🚀 Try SentimentEdge on Hugging Face Spaces

First load may take 2–3 minutes if the Space was inactive (the free tier auto-sleeps after 48 hours without traffic).


Resume Highlights

  • Built a dual-model NLP pipeline combining FinBERT (87.3% acc.) and Random Forest to predict next-day stock price movement from financial tweets
  • Developed and deployed a production-ready full-stack ML web app using Flask REST API with 6+ endpoints, containerized via Docker and hosted on Hugging Face Spaces
  • Engineered a hybrid feature pipeline where FinBERT sentiment confidence scores are extracted and passed directly as input features to the Random Forest classifier
  • Trained on a 1-year real-world Twitter dataset covering the top 25 Yahoo Finance tickers, combining tweet text with actual stock price and volume data
  • Designed a multi-page interactive dashboard with batch and single-tweet prediction, live API integration, and a model evaluation page

About

stock prediction through tweet sentiment analysis
