deepmoji

Tensorflow implementation of the paper Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm.

Usage

cd src
python main.py --config ../config_{bert|deepmoji}.yaml

Explanation

Model

Deepmoji model is a pre-trained base which is able to generalize well on text classification tasks. The model is as on fig xx. The authors use emoji classification for model's pre-training and fine-tune it more on downstream tasks' small scale datasets. It is a nice example of representation learning for text which is nowadays best performed by transformer based language models such as BERT.

As the 1.2 billion tweet emoji dataset is not publicly available, and I see this project as a proof-of-concept, I used amazon-polarity's test set as pre-training dataset. Since it has positive-negative sentiment classification objective, it is similar to chosen final task, insult-detection.

Furthermore, for comparing model performance with state-of-the-art language models, I also used distilBERT.

Evaluation

Learning rate decay and early stopping are both set on validation loss. Plain tokenization, gensim simple_preprocess() and full preprocessing (stop word, punctuation, numerics removal, lemmatization) are implemented and only the best among these per model is reported below. Spacy embeddings are used for initialization in deepmoji model.

Model	Test F1 score	Best Validation F1 score
Deepmoji base	0.76	0.75
Deepmoji fine-tuning (freeze base)	0.71	0.74
Deepmoji fine-tuning (train base)	0.74	0.76
DistilBERT fine-tuning (freeze base)	0.70	0.72
DistilBERT fine-tuning (train base)	0.76	0.76

Deepmoji base: trained directly on insult-detection Deepmoji fine-tuning: pretrained on amazon-polarity, fine-tuned on insult-detection Distilbert fine-tuning: fine-tuned on insult-detection

Insight

In most of the experiments, plain tokenization performed best, while full preprocessing being the worst. It is sensible as the model also cares functional words for understanding.
Validation set for insult-detection has only 495 samples and I observed label-sentence mismatch for some samples. This makes the validation loss not informative, and training unstable.
Freezing pre-trained base model degrades the performance on down-stream task as only the classification head is updated.
DistilBERT fine-tuning leads to best performance. Compared to it, Deepmoji model is low capacity because of 10x less parameter size and lack of attention mechanism, and its pre-training is very limited. Thus I admits its lower performance.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
src		src
.gitignore		.gitignore
README.md		README.md
config_bert.yaml		config_bert.yaml
config_deepmoji.yaml		config_deepmoji.yaml
model.png		model.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

deepmoji

Usage

Explanation

Evaluation

Insight

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

deepmoji

Usage

Explanation

Evaluation

Insight

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages