Jackylwl/nlp-emotion-classifier

COMP4332 2026 Spring Project 1 — Emotion Classification

Task

Given a short text (e.g. a social-media post), classify it into one of 7 emotion categories:

| Label | Emotion  |
|-------|----------|
| 0     | anger    |
| 1     | disgust  |
| 2     | fear     |
| 3     | joy      |
| 4     | neutral  |
| 5     | sadness  |
| 6     | surprise |

The primary evaluation metric is Macro-F1.
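
Macro-F1 is the unweighted mean of the per-class F1 scores, so a rare emotion counts as much as a frequent one. The sketch below computes it from scratch for illustration only. It is not the code in src/evaluate.py, and the function name macro_f1 is made up here.

```python
def macro_f1(y_true, y_pred, num_classes=7):
    """Unweighted mean of per-class F1 over labels 0..num_classes-1."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # In this sketch, a class with no true and no predicted examples scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Two-class toy example: class 0 has F1 = 2/3, class 1 has F1 = 0.8.
score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Because every class has equal weight, a model that ignores a minority class such as disgust is penalized even if its overall accuracy is high.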

Data

All data files are located in data/:

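| File              | Description                                          |
|-------------------|------------------------------------------------------|
| train.csv         | Training set (columns: id, text, label)              |
| valid.csv         | Validation set (same format as train.csv)            |
| test_no_label.csv | Test set; labels are withheld (columns: id, text)    |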
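
As a rough illustration of the file format, the following sketch reads rows with the standard csv module and counts labels. The helper name load_rows and the sample rows are invented; the repository scripts may load the data differently.

```python
import csv
import io
from collections import Counter


def load_rows(handle, has_label=True):
    """Read (id, text[, label]) rows from an open CSV file handle."""
    rows = []
    for row in csv.DictReader(handle):
        item = {"id": row["id"], "text": row["text"]}
        if has_label:
            item["label"] = int(row["label"])
        rows.append(item)
    return rows


# Inline sample standing in for data/train.csv (rows are invented).
sample = io.StringIO(
    'id,text,label\n'
    'a1,"well, that was unexpected",6\n'
    'a2,ok then,4\n'
    'a3,fine,4\n'
)
rows = load_rows(sample)
label_counts = Counter(r["label"] for r in rows)

# With the real files (run from the project root):
# with open("data/train.csv", newline="", encoding="utf-8") as f:
#     rows = load_rows(f)
```

For test_no_label.csv, pass has_label=False, since that file has no label column.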

Dependencies

pip install -r requirements.txt

This covers the retained classical and transformer workflows in the repo, including the plotting dependency used by notebooks/model_analysis_plots.ipynb. The commands below use python3; if python already points to Python 3 on your machine, either form works.

Main Files

The current project keeps reusable scripts in src/, notebooks in notebooks/, generated experiment artifacts in artifacts/, and submission-style prediction CSVs in submissions/.

  • src/train_tfidf_logistic_regression.py: best retained classical model.
  • src/search_tfidf_models.py: validation search utility for TF-IDF + linear classifiers.
  • src/train_transformer.py: main Hugging Face fine-tuning script for distilroberta-base.
  • notebooks/train_transformer_colab.ipynb: Colab notebook for running the retained transformer workflow on Google Drive.
  • notebooks/model_analysis_plots.ipynb: matplotlib-based analysis notebook for report figures and model comparison.

Project Structure

.
├── Project1_emotion_classification_Spring2026.pdf
├── data/
│   ├── train.csv
│   ├── valid.csv
│   └── test_no_label.csv
├── src/
│   ├── baselines/
│   ├── evaluate.py
│   ├── run_experiments.py
│   ├── search_tfidf_models.py
│   ├── train_tfidf_logistic_regression.py
│   └── train_transformer.py
├── notebooks/
├── artifacts/
│   ├── outputs/
│   └── results/
│       ├── baselines/
│       ├── search/
│       └── tfidf/
└── submissions/

Use artifacts/results/ for validation predictions, metrics, and search output used during model selection. Use artifacts/outputs/ for full transformer experiment artifacts. Use submissions/ only for final test-set prediction CSVs.

Baselines

All baseline scripts must be run from the project_1/ directory.

To run the MLP baseline:

python3 src/baselines/mlp.py

This script writes validation predictions to artifacts/results/baselines/emb_mlp_valid_predictions.csv and test predictions to submissions/emb_mlp_pred.csv by default.

To run the Bi-RNN baseline:

python3 src/baselines/rnn.py

This script writes validation predictions to artifacts/results/baselines/rnn_valid_predictions.csv and test predictions to submissions/rnn_pred.csv by default.

Evaluation

To evaluate predictions on the validation set, run:

python3 src/evaluate.py --pred <path_to_pred.csv>

Example:

python3 src/evaluate.py --pred artifacts/results/tfidf/tfidf_logreg_valid_predictions.csv

The script prints accuracy, macro-precision, macro-recall, and macro-F1, along with a per-class breakdown.

Note: The src/evaluate.py script evaluates against data/valid.csv. For the final test set, submit submissions/prediction.csv (columns: id, label) following the course submission instructions.
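
The per-class breakdown can be pictured with the sketch below. It assumes predictions are matched to validation labels by id, which this README does not confirm, and per_class_report is a hypothetical helper rather than a function from src/evaluate.py.

```python
def per_class_report(gold, pred, num_classes=7):
    """gold, pred: dicts mapping id -> integer label.

    Returns (accuracy, {label: {"precision": ..., "recall": ...}}).
    """
    missing = sorted(set(gold) - set(pred))
    if missing:
        raise ValueError(
            f"{len(missing)} validation ids have no prediction, e.g. {missing[0]}"
        )
    ids = list(gold)
    accuracy = sum(gold[i] == pred[i] for i in ids) / len(ids)
    report = {}
    for c in range(num_classes):
        tp = sum(1 for i in ids if gold[i] == c and pred[i] == c)
        predicted = sum(1 for i in ids if pred[i] == c)
        actual = sum(1 for i in ids if gold[i] == c)
        report[c] = {
            # Classes that are never predicted (or never occur) score 0 here.
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return accuracy, report
```

Failing loudly on missing ids is a design choice in this sketch: a prediction file that silently skips rows would otherwise look better than it is.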

Retained Workflows

TF-IDF + Logistic Regression

Train on train.csv, evaluate on valid.csv, and export both validation and test predictions:

python3 src/train_tfidf_logistic_regression.py

This single command produces two prediction files:

  • artifacts/results/tfidf/tfidf_logreg_valid_predictions.csv: validation predictions for src/evaluate.py
  • submissions/prediction.csv: test-set predictions for submission formatting checks

It also saves summary metrics to:

  • artifacts/results/tfidf/tfidf_train_metrics.json
  • artifacts/results/tfidf/tfidf_valid_metrics.json

To train the final classical model on train.csv + valid.csv before generating the test submission:

python3 src/train_tfidf_logistic_regression.py --train-on-all

With --train-on-all, the script still writes the validation prediction file, then retrains the classical model on the full labeled data (train.csv + valid.csv) before overwriting submissions/prediction.csv with the final test-set predictions.
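
The two-stage flow (fit on train, predict validation, then refit on train + valid for the test set) might look roughly like this in scikit-learn, assuming it is among the packages in requirements.txt. The hyperparameters, the toy texts, and the build_model helper are invented for illustration and are not taken from src/train_tfidf_logistic_regression.py.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_model():
    # Illustrative placeholder settings, not the repo's tuned configuration.
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    )


# Toy stand-ins for train.csv, valid.csv, and test_no_label.csv.
train_texts = ["i am so happy", "this is great", "i am furious", "this makes me angry"]
train_labels = [3, 3, 0, 0]
valid_texts = ["so happy today", "angry and furious"]
valid_labels = [3, 0]
test_texts = ["what a great day"]

# Stage 1: fit on train only and predict validation (used for model selection).
model = build_model().fit(train_texts, train_labels)
valid_pred = model.predict(valid_texts)

# Stage 2: refit a fresh model on train + valid, then predict the test set.
final_model = build_model().fit(train_texts + valid_texts, train_labels + valid_labels)
test_pred = final_model.predict(test_texts)
```

Building a fresh pipeline for stage 2 means the TF-IDF vocabulary is also refit on the combined text, so validation-only words become usable features in the final model.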

To search TF-IDF feature and classifier combinations on validation:

python3 src/search_tfidf_models.py

This writes the search table to artifacts/results/search/tfidf_search_results.csv and the best configuration summary to artifacts/results/search/tfidf_search_best.json.
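
A validation search of this kind usually boils down to scoring every combination in a grid and keeping the best one. The skeleton below uses only the standard library. The grid keys, the valid_macro_f1 column name, and the fake_score stand-in are all hypothetical; the real search space and output columns of src/search_tfidf_models.py are not documented here.

```python
import csv
import io
import itertools
import json

# Hypothetical search space, for illustration only.
GRID = {
    "ngram_max": [1, 2],
    "min_df": [1, 2],
    "C": [0.5, 1.0, 4.0],
}


def run_search(grid, score_fn):
    """Score every combination in `grid`; return (all rows, best row)."""
    keys = list(grid)
    rows = []
    for values in itertools.product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        rows.append({**config, "valid_macro_f1": score_fn(config)})
    best = max(rows, key=lambda r: r["valid_macro_f1"])
    return rows, best


def fake_score(config):
    # Stand-in for "train with this config, return validation Macro-F1".
    return round(0.5 + 0.05 * config["ngram_max"] - 0.01 * abs(config["C"] - 1.0), 4)


rows, best = run_search(GRID, fake_score)

# Serialize the full table as CSV text and the winner as JSON.
table = io.StringIO()
writer = csv.DictWriter(table, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
best_json = json.dumps(best, indent=2)
```

In a real run, fake_score would be replaced by a function that fits a model with the given configuration and returns its Macro-F1 on valid.csv.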

DistilRoBERTa Transformer

Train one transformer experiment with the recommended setup:

python3 src/train_transformer.py \
  --model-name distilroberta-base \
  --experiment-name distilroberta_weighted_seed42 \
  --loss-type weighted \
  --seed 42 \
  --train-batch-size 16 \
  --eval-batch-size 32 \
  --learning-rate 2e-5 \
  --num-train-epochs 4 \
  --max-length 128

The retained Colab notebook runs this same script-based workflow and is useful when you want GPU-backed training on Google Colab:

notebooks/train_transformer_colab.ipynb

After a run, the notebook reads both train/metrics.json and valid/metrics.json so you can compare train-vs-validation performance directly.

Key options:

  • --loss-type plain|weighted|focal
  • --sampler-strategy none|weighted
  • --merge-train-valid to retrain the final model on the full labeled set before generating test predictions
  • --max-train-examples, --max-valid-examples, --max-test-examples for quick smoke tests
  • --fp16 or --bf16 if your hardware supports mixed precision
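
For intuition about the weighted and focal loss options, here is a small sketch using common textbook definitions: inverse-frequency class weights and the focal loss of Lin et al. Whether src/train_transformer.py uses exactly these formulas or a gamma of 2.0 is an assumption, and both function names are invented.

```python
import math
from collections import Counter


def balanced_class_weights(labels, num_classes=7):
    """Inverse-frequency weights: n_samples / (num_classes * count_c)."""
    counts = Counter(labels)
    n = len(labels)
    return [
        n / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def focal_loss(p_true, gamma=2.0, weight=1.0):
    """Focal loss for one example, given the probability of its true class."""
    return -weight * (1.0 - p_true) ** gamma * math.log(p_true)


# Three examples of class 0 and one of class 1: the rare class gets the larger weight.
weights = balanced_class_weights([0, 0, 0, 1], num_classes=2)

# With gamma = 0 the focal loss reduces to plain cross-entropy.
easy_plain = focal_loss(0.9, gamma=0.0)
easy_focal = focal_loss(0.9, gamma=2.0)
```

Class weighting raises the cost of mistakes on rare emotions, while the focal term shrinks the loss on examples the model already classifies confidently.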

Quick smoke test

Use a tiny checkpoint and small subsets to validate the pipeline before launching a long run:

python3 src/train_transformer.py \
  --model-name hf-internal-testing/tiny-random-roberta \
  --experiment-name smoke_tiny_roberta \
  --output-dir artifacts/outputs/smoke \
  --loss-type plain \
  --num-train-epochs 1 \
  --max-length 32 \
  --max-train-examples 64 \
  --max-valid-examples 64 \
  --max-test-examples 64

Artifacts are written under <output-dir>/<experiment-name>/. For the smoke-test command above, that means:

artifacts/outputs/smoke/smoke_tiny_roberta/

Typical artifact files include:

  • config.json
  • class_weights.json
  • train/metrics.json
  • valid_predictions.csv
  • test_predictions.csv
  • valid/metrics.json
  • valid/confusion_matrix.csv
  • valid/probabilities.npz
  • test/probabilities.npz
  • model/

An experiment summary row is also appended to <output-dir>/experiment_summary.csv. The summary CSV includes both train and validation summary metrics, which makes it easier to inspect train-vs-valid gaps for overfitting analysis.
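
One way to inspect train-vs-valid gaps from the summary CSV is sketched below. The column names experiment_name, train_macro_f1, and valid_macro_f1 are guesses, as is the helper name; check the header of your own experiment_summary.csv and adjust accordingly.

```python
import csv
import io


def train_valid_gaps(handle, train_col="train_macro_f1", valid_col="valid_macro_f1"):
    """Return (experiment_name, gap) pairs, largest train-minus-valid gap first."""
    gaps = []
    for row in csv.DictReader(handle):
        gap = float(row[train_col]) - float(row[valid_col])
        gaps.append((row["experiment_name"], round(gap, 4)))
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Invented rows standing in for <output-dir>/experiment_summary.csv.
sample = io.StringIO(
    "experiment_name,train_macro_f1,valid_macro_f1\n"
    "run_a,0.95,0.60\n"
    "run_b,0.70,0.62\n"
)
gaps = train_valid_gaps(sample)
```

A large positive gap, as in the invented run_a row, suggests the model fits the training set much better than it generalizes.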

Final model training

After choosing the best settings on validation:

python3 src/train_transformer.py \
  --model-name distilroberta-base \
  --experiment-name distilroberta_weighted_full_seed42 \
  --output-dir artifacts/outputs/transformer \
  --loss-type weighted \
  --seed 42 \
  --merge-train-valid

This retrains on train.csv + valid.csv and writes the final test predictions to:

artifacts/outputs/transformer/distilroberta_weighted_full_seed42/test_predictions.csv

Analysis Notebook

Use the analysis notebook to generate matplotlib figures for:

  • final model comparison
  • TF-IDF train vs validation Macro-F1
  • TF-IDF search trends
  • confusion matrices
  • per-class F1 comparison
  • transformer train vs validation Macro-F1
  • transformer validation curves when retained epoch outputs are available

Notebook:

notebooks/model_analysis_plots.ipynb

The notebook reads the saved transformer train/metrics.json and valid/metrics.json files directly when those local outputs are available.

Prediction Format

Your prediction file must be a CSV with exactly two columns: id and label.

id,label
eebbqej,4
ed00q6i,4
...
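
Before submitting, a quick format check along these lines can catch header or label mistakes. This is a sketch, not an official course validator: the function name is invented, and the duplicate-id check is an extra assumption beyond the two-column rule stated above.

```python
import csv
import io

VALID_LABELS = {str(i) for i in range(7)}  # labels 0..6 from the Task table


def check_prediction_csv(handle, expected_ids=None):
    """Return a list of problems found in a prediction CSV (empty list = OK)."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header != ["id", "label"]:
        return [f"header must be id,label but got {header}"]
    problems, seen = [], set()
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 2:
            problems.append(f"line {line_no}: expected 2 columns, got {len(row)}")
            continue
        row_id, label = row
        if label not in VALID_LABELS:
            problems.append(f"line {line_no}: label {label!r} is not in 0..6")
        if row_id in seen:
            problems.append(f"line {line_no}: duplicate id {row_id}")
        seen.add(row_id)
    if expected_ids is not None and seen != set(expected_ids):
        problems.append("ids do not match the expected test ids")
    return problems


good_problems = check_prediction_csv(io.StringIO("id,label\neebbqej,4\ned00q6i,4\n"))
bad_problems = check_prediction_csv(io.StringIO("id,label\neebbqej,7\neebbqej,4\n"))
```

To check a real file, open it with newline="" and pass the handle, optionally with the ids read from data/test_no_label.csv as expected_ids.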
