KX-ai/SocialMediaTrendPredictor

📈 Social Media Trend Predictor

A multi-country social media analytics system that scrapes TikTok comments, performs sentiment analysis, predicts trending topics, and visualises insights through an interactive dashboard — covering Malaysia, Japan, Philippines, Singapore, and South Korea.


Overview

This project analyses TikTok comment data to understand public sentiment and predict emerging social media trends across Southeast and East Asia. It combines NLP-based sentiment analysis, machine learning-based trend prediction (One-Class SVM), geographic information system (GIS) visualisation, and an AI-powered chatbot, all surfaced through a web dashboard.


Features

  • 💬 Comments Scraper — Scrapes and collects TikTok comments for analysis.
  • 🌏 Multi-Country Sentiment Analysis — Analyses sentiment (positive/negative/neutral) of TikTok comments for MY, JP, PH, SG, and SK using transformer-based models and TextBlob.
  • 📊 Trend Prediction — Uses trained One-Class SVM models to predict whether content is trending, per country.
  • 🗺️ GIS Visualisation — Maps sentiment and trend data geographically across the covered regions.
  • 🤖 AI Chatbot — A conversational chatbot (powered by Flask) to query and explore the analysed data.
  • 🖥️ Interactive Dashboard — A unified HTML dashboard to view insights from all modules.
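
As a rough illustration of the positive/negative/neutral labelling mentioned above, the sketch below maps a TextBlob polarity score to one of the three labels. The 0.05 threshold and both function names are assumptions made for this example, not values taken from the notebooks. TextBlob's default analyser targets English, so non-English comments would need translation or a transformer model first.

```python
def polarity_to_label(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def textblob_label(text, threshold=0.05):
    """Label one comment using TextBlob's polarity score."""
    # Imported lazily so polarity_to_label works without TextBlob installed.
    from textblob import TextBlob

    return polarity_to_label(TextBlob(text).sentiment.polarity, threshold)
```

Calling `textblob_label("I love this video")` returns one of the three labels. The exact polarity depends on TextBlob's lexicon, so no specific output is claimed here.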

Covered Countries

| Country | Code | Notebooks |
| --- | --- | --- |
| Malaysia | MY | MY Sentiment, Malaysia Sentiment Analysis, Malaysia Prediction, One-Class SVM MY |
| Japan | JP | JP Sentiment, Japan Sentiment Analysis, JP Prediction, One-Class SVM Japan |
| Philippines | PH | PH Sentiment, PH Sentiment Analysis, PH Prediction, One-Class SVM PH |
| Singapore | SG | SG Sentiment, Singapore Sentiment Analysis, SG Prediction, One-Class SVM SG |
| South Korea | SK | SK Sentiment, SK Sentiment Analysis, SK Prediction, One-Class SVM SK |

Tech Stack

  • Data Processing — pandas, numpy, tqdm
  • NLP & Sentiment — Hugging Face Transformers, TextBlob, NLTK, langdetect, deep-translator, emoji, fugashi (Japanese tokenizer), sentencepiece
  • Machine Learning — scikit-learn (One-Class SVM), joblib
  • Deep Learning — PyTorch, torchvision, torchaudio, accelerate, timm
  • Backend / API — Flask, flask-cors
  • Frontend — HTML/CSS/JS (dashboard.html)
  • Notebooks — Jupyter Notebook
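
To show how Flask and flask-cors typically fit together in a stack like this, here is a minimal sketch of a JSON endpoint that a dashboard served from a different origin (for example via Live Server) could call. The `/chat` route, the `message` field, and the placeholder reply logic are all hypothetical. The actual routes live in Chatbot.ipynb and may differ.

```python
def make_reply(payload):
    """Build a (body, status) pair from a request payload.

    Placeholder logic for illustration only; it just echoes the message.
    """
    if not isinstance(payload, dict):
        return {"error": "expected a JSON object"}, 400
    message = str(payload.get("message", "")).strip()
    if not message:
        return {"error": "empty message"}, 400
    return {"reply": f"You asked: {message}"}, 200


def create_app():
    """Create a Flask app with CORS enabled for cross-origin dashboard calls."""
    # Imported lazily so make_reply can be used without Flask installed.
    from flask import Flask, jsonify, request
    from flask_cors import CORS

    app = Flask(__name__)
    CORS(app)  # let a page served from another origin call this API

    @app.route("/chat", methods=["POST"])  # hypothetical route name
    def chat():
        body, status = make_reply(request.get_json(silent=True))
        return jsonify(body), status

    return app
```

Running `create_app().run(port=5000)` would start the server locally. The port is also an assumption.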

Project Structure

SocialMediaTrendPredictor/
├── Datasets/                          # Raw and processed datasets
│
├── Comments Scraper.ipynb             # TikTok comment scraping
├── GIS.ipynb                          # Geographic visualisation
├── Chatbot.ipynb                      # AI chatbot backend
│
├── Malaysia Sentiment Analysis.ipynb  # Full sentiment pipeline (MY)
├── Japan Sentiment Analysis.ipynb     # Full sentiment pipeline (JP)
├── PH Sentiment Analysis.ipynb        # Full sentiment pipeline (PH)
├── Singapore Sentiment Analysis.ipynb # Full sentiment pipeline (SG)
├── SK Sentiment Analysis.ipynb        # Full sentiment pipeline (SK)
│
├── MY Sentiment.ipynb                 # Sentiment model (MY)
├── JP Sentiment.ipynb                 # Sentiment model (JP)
├── PH Sentiment.ipynb                 # Sentiment model (PH)
├── SG Sentiment.ipynb                 # Sentiment model (SG)
├── SK Sentiment.ipynb                 # Sentiment model (SK)
│
├── Malaysia Prediction.ipynb          # Trend prediction (MY)
├── JP Prediction.ipynb                # Trend prediction (JP)
├── PH Prediction.ipynb                # Trend prediction (PH)
├── SG Prediction.ipynb                # Trend prediction (SG)
├── SK Prediction.ipynb                # Trend prediction (SK)
│
├── One-Class SVM MY.ipynb             # SVM model training (MY)
├── One-Class SVM Japan.ipynb          # SVM model training (JP)
├── One-Class SVM PH.ipynb             # SVM model training (PH)
├── One-Class SVM SG.ipynb             # SVM model training (SG)
├── One-Class SVM SK.ipynb             # SVM model training (SK)
│
├── *_cleaned_annotated_tiktok_comments.csv  # Cleaned datasets per country
├── *_one_class_svm_tiktok.pkl               # Trained SVM models per country
├── *_scaler_tiktok.pkl                      # Fitted scalers per country
│
├── dashboard.html                     # Web dashboard
└── requirements.txt                   # Python dependencies

Getting Started

Prerequisites

  • Python 3.9+
  • Jupyter Notebook
  • A modern web browser
  • Visual Studio Code (for running the dashboard)

Note: Each .ipynb file installs the libraries it needs inline, so requirements.txt is provided for reference only and the pip install step under Installation is optional.

Installation

git clone https://github.com/KX-ai/SocialMediaTrendPredictor.git
cd SocialMediaTrendPredictor
pip install -r requirements.txt

How to Run

The system needs two applications running at the same time: Jupyter Notebook (for the backend notebooks) and VS Code (for the dashboard).

Step 1 — Jupyter Notebook

Open and run the notebooks in the following order:

  1. GIS.ipynb — Starts the GIS/mapping backend
  2. Chatbot.ipynb — Starts the chatbot Flask server
  3. Trend Prediction notebooks (run all five):
    • Malaysia Prediction.ipynb
    • JP Prediction.ipynb
    • PH Prediction.ipynb
    • SG Prediction.ipynb
    • SK Prediction.ipynb
  4. Sentiment Analysis notebooks (run all five):
    • MY Sentiment.ipynb
    • JP Sentiment.ipynb
    • PH Sentiment.ipynb
    • SG Sentiment.ipynb
    • SK Sentiment.ipynb
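
Before starting, it can help to confirm that every notebook in the run order above is present in the cloned folder. The sketch below is an optional pre-flight check and not part of the repository; the `RUN_ORDER` list simply mirrors the steps above, and `missing_notebooks` is a hypothetical helper name.

```python
from pathlib import Path

# Notebook names in the run order listed above.
RUN_ORDER = [
    "GIS.ipynb",
    "Chatbot.ipynb",
    "Malaysia Prediction.ipynb",
    "JP Prediction.ipynb",
    "PH Prediction.ipynb",
    "SG Prediction.ipynb",
    "SK Prediction.ipynb",
    "MY Sentiment.ipynb",
    "JP Sentiment.ipynb",
    "PH Sentiment.ipynb",
    "SG Sentiment.ipynb",
    "SK Sentiment.ipynb",
]


def missing_notebooks(repo_dir="."):
    """Return the notebooks from RUN_ORDER that are not found in repo_dir."""
    root = Path(repo_dir)
    return [name for name in RUN_ORDER if not (root / name).is_file()]
```

An empty list from `missing_notebooks(".")` means all twelve notebooks are in place.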

Step 2 — Visual Studio Code

  1. Open dashboard.html in VS Code
  2. Launch it with the Live Server extension (or open directly in a browser)

You should now be able to use the full system through the dashboard! 🎉


Datasets

Pre-cleaned and annotated TikTok comment datasets are included for all five countries:

  • my_cleaned_annotated_tiktok_comments.csv
  • jp_cleaned_annotated_tiktok_comments.csv
  • ph_cleaned_annotated_tiktok_comments.csv
  • sg_cleaned_annotated_tiktok_comments.csv
  • sk_cleaned_annotated_tiktok_comments.csv

Additional raw datasets can be found in the Datasets/ folder.
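
A minimal sketch for loading one of the cleaned datasets with pandas, assuming the CSV files sit in the repository root as listed above. The helper names are hypothetical, and the column layout is not documented here, so inspect it after loading instead of relying on specific column names.

```python
from pathlib import Path

COUNTRY_CODES = ("my", "jp", "ph", "sg", "sk")


def dataset_filename(code):
    """Return the cleaned dataset filename for a two-letter country code."""
    code = code.lower()
    if code not in COUNTRY_CODES:
        raise ValueError(f"unknown country code: {code!r}")
    return f"{code}_cleaned_annotated_tiktok_comments.csv"


def load_comments(code, repo_dir="."):
    """Load a country's cleaned comments into a pandas DataFrame."""
    # Imported lazily so dataset_filename works without pandas installed.
    import pandas as pd

    return pd.read_csv(Path(repo_dir) / dataset_filename(code))
```

For example, `load_comments("my").columns` lists the available columns for the Malaysia dataset.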


Pre-trained Models

Trained One-Class SVM models and their corresponding scalers are included for each country, ready for inference:

| Country | Model | Scaler |
| --- | --- | --- |
| Malaysia | my_one_class_svm_tiktok.pkl | my_scaler_tiktok.pkl |
| Japan | jp_one_class_svm_tiktok.pkl | jp_scaler_tiktok.pkl |
| Philippines | ph_one_class_svm_tiktok.pkl | ph_scaler_tiktok.pkl |
| Singapore | sg_one_class_svm_tiktok.pkl | sg_scaler_tiktok.pkl |
| South Korea | sk_one_class_svm_tiktok.pkl | sk_scaler_tiktok.pkl |
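
The following sketch shows one way inference with a model and scaler pair might look. It assumes the .pkl files were saved with joblib and that the scaler is a scikit-learn transformer. The feature columns and their order must match whatever the training notebooks used, which is not specified here. scikit-learn's One-Class SVM `predict` returns +1 for inliers and -1 for outliers; how those map to "trending" is defined in the prediction notebooks, so this example only reports inlier status. Both helper names are hypothetical.

```python
def artifact_paths(code):
    """Return (model_path, scaler_path) for a two-letter country code."""
    code = code.lower()
    return f"{code}_one_class_svm_tiktok.pkl", f"{code}_scaler_tiktok.pkl"


def predict_inliers(code, features):
    """Scale the features and return a boolean array, True where the model predicts +1.

    `features` is a 2D array-like whose columns match the training data
    (the column layout is an assumption left to the caller).
    """
    # Imported lazily so artifact_paths works without joblib installed.
    import joblib

    model_path, scaler_path = artifact_paths(code)
    scaler = joblib.load(scaler_path)  # only load pickles from a trusted source
    model = joblib.load(model_path)
    scaled = scaler.transform(features)
    return model.predict(scaled) == 1
```

Keeping the scaler next to the model matters because the SVM was fitted on scaled inputs, and passing unscaled features would give meaningless predictions.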

License

This project is open source. Feel free to fork and build on it.
