COMP4332 2026 Spring Project 1 — Emotion Classification

Task

Given a short text (e.g. a social-media post), classify it into one of 7 emotion categories:

Label	Emotion
0	anger
1	disgust
2	fear
3	joy
4	neutral
5	sadness
6	surprise

The primary evaluation metric is Macro-F1.
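
Macro-F1 is the unweighted mean of the per-class F1 scores, so each of the 7 emotions counts equally no matter how many examples it has. A minimal sketch in plain Python is shown here; the function name macro_f1 is illustrative and not part of the repo, and scoring a class with no true and no predicted examples as 0 is an assumption about edge-case handling.

```python
def macro_f1(y_true, y_pred, num_classes=7):
    """Unweighted mean of per-class F1 over labels 0..num_classes-1."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true and no predicted examples scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes
```

Because every class contributes equally, a model that ignores a rare emotion loses a full 1/7 of the achievable score, which accuracy alone would hide.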

Data

All data files are located in data/:

File	Description
`train.csv`	Training set (columns: `id`, `text`, `label`)
`valid.csv`	Validation set (same format as train)
`test_no_label.csv`	Test set — labels are withheld (columns: `id`, `text`)
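
As a rough sketch of reading these files, the snippet below parses a CSV handle with the standard library and checks for the columns listed above. The helper name load_split and the in-memory sample rows are made up for illustration; the repo's scripts may load data differently.

```python
import csv
import io

def load_split(handle, has_labels=True):
    """Read rows from an open CSV handle and check the expected columns."""
    reader = csv.DictReader(handle)
    expected = ["id", "text", "label"] if has_labels else ["id", "text"]
    missing = [col for col in expected if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        item = {"id": row["id"], "text": row["text"]}
        if has_labels:
            item["label"] = int(row["label"])  # labels are integers 0-6
        rows.append(item)
    return rows

# Tiny in-memory stand-in for data/train.csv (made-up rows).
sample = io.StringIO("id,text,label\na1,I am thrilled,3\na2,Whatever,4\n")
rows = load_split(sample)
```

For the real files, pass a handle opened with open("data/train.csv", newline="", encoding="utf-8"), and use has_labels=False for test_no_label.csv. The UTF-8 encoding is an assumption.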

Dependencies

pip install -r requirements.txt

This covers the retained classical and transformer workflows in the repo, plus the plotting dependency used by notebooks/model_analysis_plots.ipynb. The commands below use python3; if python already points to Python 3 on your machine, either form works.

Main Files

The current project keeps reusable scripts in src/, notebooks in notebooks/, generated experiment artifacts in artifacts/, and submission-style prediction CSVs in submissions/.

src/train_tfidf_logistic_regression.py: best retained classical model.
src/search_tfidf_models.py: validation search utility for TF-IDF + linear classifiers.
src/train_transformer.py: main Hugging Face fine-tuning script for distilroberta-base.
notebooks/train_transformer_colab.ipynb: Colab notebook for running the retained transformer workflow on Google Drive.
notebooks/model_analysis_plots.ipynb: matplotlib-based analysis notebook for report figures and model comparison.

Project Structure

.
├── Project1_emotion_classification_Spring2026.pdf
├── data/
│   ├── train.csv
│   ├── valid.csv
│   └── test_no_label.csv
├── src/
│   ├── baselines/
│   ├── evaluate.py
│   ├── run_experiments.py
│   ├── search_tfidf_models.py
│   ├── train_tfidf_logistic_regression.py
│   └── train_transformer.py
├── notebooks/
├── artifacts/
│   ├── outputs/
│   └── results/
│       ├── baselines/
│       ├── search/
│       └── tfidf/
└── submissions/

Use artifacts/results/ for validation predictions, metrics, and search output used during model selection. Use artifacts/outputs/ for full transformer experiment artifacts. Use submissions/ only for final test-set prediction CSVs.

Baselines

All baseline scripts must be run from the project_1/ directory.

To run the MLP baseline:

python3 src/baselines/mlp.py

This script writes validation predictions to artifacts/results/baselines/emb_mlp_valid_predictions.csv and test predictions to submissions/emb_mlp_pred.csv by default.

To run the Bi-RNN baseline:

python3 src/baselines/rnn.py

This script writes validation predictions to artifacts/results/baselines/rnn_valid_predictions.csv and test predictions to submissions/rnn_pred.csv by default.

Evaluation

To evaluate predictions on the validation set, run:

python3 src/evaluate.py --pred <path_to_pred.csv>

Example:

python3 src/evaluate.py --pred artifacts/results/tfidf/tfidf_logreg_valid_predictions.csv

The script prints accuracy, macro-precision, macro-recall, and macro-F1, along with a per-class breakdown.

Note: The src/evaluate.py script evaluates against data/valid.csv. For the final test set, submit submissions/prediction.csv (columns: id, label) following the course submission instructions.
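
This README does not say whether src/evaluate.py matches predictions to data/valid.csv by id or by row order. As a cautious sanity check before evaluating, the sketch below joins on id and reports accuracy; the function name accuracy_by_id is hypothetical and the id-based join is an assumption, not the script's documented behavior.

```python
def accuracy_by_id(gold, pred):
    """gold and pred map example id -> integer label."""
    missing = sorted(set(gold) - set(pred))
    if missing:
        raise ValueError(f"{len(missing)} validation ids have no prediction")
    # Compare labels per id so row order in the prediction file does not matter.
    correct = sum(1 for ex_id, label in gold.items() if pred[ex_id] == label)
    return correct / len(gold)
```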

Retained Workflows

TF-IDF + Logistic Regression

Train on train.csv, evaluate on valid.csv, and export both validation and test predictions:

python3 src/train_tfidf_logistic_regression.py

This single command produces two output files:

artifacts/results/tfidf/tfidf_logreg_valid_predictions.csv: validation predictions for src/evaluate.py
submissions/prediction.csv: test-set predictions for submission formatting checks

It also saves summary metrics to:

artifacts/results/tfidf/tfidf_train_metrics.json
artifacts/results/tfidf/tfidf_valid_metrics.json

To train the final classical model on train.csv + valid.csv before generating the test submission:

python3 src/train_tfidf_logistic_regression.py --train-on-all

With --train-on-all, the script still writes the validation prediction file, then retrains the classical model on the full labeled data (train.csv + valid.csv) before overwriting submissions/prediction.csv with the final test-set predictions.
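
The two-stage flow described above (fit on train and predict validation, then refit on train + valid and predict test) can be sketched with scikit-learn as follows. The helper name fit_tfidf_logreg, the toy data, and every hyperparameter here are assumptions for illustration and are not taken from src/train_tfidf_logistic_regression.py.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def fit_tfidf_logreg(texts, labels):
    """Fit a TF-IDF + logistic regression pipeline (illustrative settings)."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    )
    return model.fit(texts, labels)

# Made-up toy data standing in for train.csv, valid.csv, and test_no_label.csv.
train_texts = ["so happy today", "this is awful", "great news", "i hate this"]
train_labels = [3, 0, 3, 0]
valid_texts = ["happy news", "awful, hate it"]
valid_labels = [3, 0]
test_texts = ["what a great day"]

# Stage 1: fit on train only and predict validation (used for model selection).
valid_pred = fit_tfidf_logreg(train_texts, train_labels).predict(valid_texts)

# Stage 2: refit on train + valid, then predict the unlabeled test set.
final_model = fit_tfidf_logreg(train_texts + valid_texts, train_labels + valid_labels)
test_pred = final_model.predict(test_texts)
```

Validation predictions from stage 1 stay comparable across experiments because the model that produced them never saw valid.csv; only the stage 2 model, which has no held-out score of its own, is used for the test file.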

To search TF-IDF feature and classifier combinations on validation:

python3 src/search_tfidf_models.py

This writes the search table to artifacts/results/search/tfidf_search_results.csv and the best configuration summary to artifacts/results/search/tfidf_search_best.json.

DistilRoBERTa Transformer

Train one transformer experiment with the recommended setup:

python3 src/train_transformer.py \
  --model-name distilroberta-base \
  --experiment-name distilroberta_weighted_seed42 \
  --loss-type weighted \
  --seed 42 \
  --train-batch-size 16 \
  --eval-batch-size 32 \
  --learning-rate 2e-5 \
  --num-train-epochs 4 \
  --max-length 128

The retained Colab notebook runs this same script-based workflow and is useful when you want GPU-backed training on Google Colab:

notebooks/train_transformer_colab.ipynb

After a run, the notebook reads both train/metrics.json and valid/metrics.json so you can compare train and validation performance directly.

Key options:

--loss-type plain|weighted|focal
--sampler-strategy none|weighted
--merge-train-valid to retrain the final model on the full labeled set before generating test predictions
--max-train-examples, --max-valid-examples, --max-test-examples for quick smoke tests
--fp16 or --bf16 if your hardware supports mixed precision
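
To make the --loss-type choices concrete, here is a plain-Python sketch of inverse-frequency class weights, a weighted cross-entropy term, and a focal loss term for a single example. These are common textbook definitions; the exact weighting formula and focal gamma used by src/train_transformer.py are not stated in this README, so the function names and defaults below are assumptions.

```python
import math
from collections import Counter

def balanced_class_weights(labels, num_classes=7):
    """Inverse-frequency weights: n_samples / (num_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    # Assumption: a class absent from the labels gets weight 0.
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]

def weighted_cross_entropy(probs, target, weights):
    """Per-example cross-entropy scaled by the weight of the true class."""
    return -weights[target] * math.log(probs[target])

def focal_loss(probs, target, gamma=2.0):
    """Per-example focal loss: shrinks the loss on confidently correct examples."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

With gamma set to 0 the focal term reduces to ordinary cross-entropy, and rare classes receive weights above 1 under the balanced scheme, so their errors cost more.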

Quick smoke test

Use a tiny checkpoint and small subsets to validate the pipeline before launching a long run:

python3 src/train_transformer.py \
  --model-name hf-internal-testing/tiny-random-roberta \
  --experiment-name smoke_tiny_roberta \
  --output-dir artifacts/outputs/smoke \
  --loss-type plain \
  --num-train-epochs 1 \
  --max-length 32 \
  --max-train-examples 64 \
  --max-valid-examples 64 \
  --max-test-examples 64

Artifacts are written under <output-dir>/<experiment-name>/. For the smoke-test command above, that means:

artifacts/outputs/smoke/smoke_tiny_roberta/

Typical artifact files include:

config.json
class_weights.json
train/metrics.json
valid_predictions.csv
test_predictions.csv
valid/metrics.json
valid/confusion_matrix.csv
valid/probabilities.npz
test/probabilities.npz
model/

An experiment summary row is also appended to <output-dir>/experiment_summary.csv. The summary CSV includes train and validation summary metrics, which makes train-vs-validation gaps easier to inspect when checking for overfitting.
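
The append-a-row pattern can be sketched with the standard library as below. The helper name append_summary_row is hypothetical, and the actual column names in experiment_summary.csv are not listed in this README, so any keys you pass are your own choice.

```python
import csv
import os

def append_summary_row(path, row):
    """Append one experiment row, writing the header only for a new file."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        # Assumption: every call passes the same keys in the same order.
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Appending rather than overwriting keeps one row per experiment, so runs with different seeds or loss types can be compared side by side in a single table.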

Final model training

After choosing the best settings on validation:

python3 src/train_transformer.py \
  --model-name distilroberta-base \
  --experiment-name distilroberta_weighted_full_seed42 \
  --output-dir artifacts/outputs/transformer \
  --loss-type weighted \
  --seed 42 \
  --merge-train-valid

This retrains on train.csv + valid.csv and writes the final test predictions to:

artifacts/outputs/transformer/distilroberta_weighted_full_seed42/test_predictions.csv

Analysis Notebook

Use the analysis notebook to generate matplotlib figures for:

final model comparison
TF-IDF train vs validation Macro-F1
TF-IDF search trends
confusion matrices
per-class F1 comparison
transformer train vs validation Macro-F1
transformer validation curves when retained epoch outputs are available

Notebook:

notebooks/model_analysis_plots.ipynb

The notebook reads the saved transformer train/metrics.json and valid/metrics.json files directly when those local outputs are available.

Prediction Format

Your prediction file must be a CSV with exactly two columns: id and label.

id,label
eebbqej,4
ed00q6i,4
...
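
Before submitting, a quick format check can catch header or label mistakes. The sketch below is one way to do it with the standard library; the function name validate_prediction_csv is illustrative, and the duplicate-id check is an extra assumption that the course instructions may or may not require.

```python
import csv
import io

def validate_prediction_csv(handle, num_classes=7):
    """Return a list of problems found in a prediction CSV (empty if none)."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header != ["id", "label"]:
        return [f"header must be id,label but got {header}"]
    valid_labels = {str(c) for c in range(num_classes)}
    problems, seen = [], set()
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 2:
            problems.append(f"line {line_no}: expected 2 fields, got {len(row)}")
            continue
        ex_id, label = row
        if ex_id in seen:  # assumption: each id should appear only once
            problems.append(f"line {line_no}: duplicate id {ex_id}")
        seen.add(ex_id)
        if label not in valid_labels:
            problems.append(f"line {line_no}: label {label!r} not in 0..{num_classes - 1}")
    return problems
```

For a real file, pass a handle opened with open("submissions/prediction.csv", newline="") and confirm that the returned list is empty.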
