GitHub - RusticHaze634/Text-Analysis: Sentiment classification using Naive Bayes

Text Analytics

( Sentiment Classification )

💭

ABSTRACT:

This project is a text classification task: sentiment classification.
Every document (one line in the data file) is a sentence extracted from social media (blogs).
The goal is to classify the sentiment of each sentence as "positive" or "negative".
Accuracy of this model on the testing data: 97.91%

DATASET:

The training data contains 7086 sentences, already labeled with 1 (positive sentiment) or 0 (negative sentiment).
The test data contains 33052 sentences that are unlabeled.
The submission should be a .txt file with 33052 lines.
In each line, there should be exactly one integer, 0 or 1, according to the classification results.
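
As a minimal sketch of the submission format described above, the snippet below writes one integer per line. The function name `write_submission` and the default file name `submission.txt` are illustrative assumptions, not taken from the repository.

```python
from pathlib import Path
from typing import Iterable


def write_submission(predictions: Iterable[int], path: str = "submission.txt") -> int:
    """Write one label (0 or 1) per line and return the number of lines written."""
    labels = list(predictions)
    for label in labels:
        # Reject anything other than the two allowed class labels.
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
    Path(path).write_text("".join(f"{label}\n" for label in labels), encoding="utf-8")
    return len(labels)
```

For the test data described here, the returned line count should equal 33052.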

SOLUTION APPROACH:

 Text Analytics
     NLP + ML (Naive Bayes Theorem)
     Ensemble Learning
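
For reference, the textbook Naive Bayes decision rule for a sentence with tokens $w_1, \dots, w_n$ and class $c \in \{0, 1\}$ is given below. The README does not state which variant is used, so the multinomial form with add-one (Laplace) smoothing is an assumption; $V$ denotes the vocabulary and $\mathrm{count}(w, c)$ the number of times token $w$ appears in training sentences of class $c$.

```latex
\hat{c} = \arg\max_{c \in \{0, 1\}} \; \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c),
\qquad
P(w \mid c) = \frac{\mathrm{count}(w, c) + 1}{\sum_{w' \in V} \mathrm{count}(w', c) + |V|}
```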

STEPS:

1. EDA and Pre-processing

Bag-of-Words method
Feature Extraction
Removing the low frequency words
Count the features
Remove Stop Words
Stemming
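
The pre-processing steps listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the repository's code: the stop-word list is a tiny placeholder, `crude_stem` is a simplistic stand-in for a real stemmer such as Porter's, the `min_count` threshold is hypothetical, and stop-word removal and stemming are applied before counting as a design choice.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "i", "it", "this", "that", "and", "of", "to"}


def crude_stem(word: str) -> str:
    """Strip a few common suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(sentence: str) -> list[str]:
    """Lowercase, keep alphabetic runs, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]


def build_vocabulary(sentences: list[str], min_count: int = 2) -> list[str]:
    """Keep only tokens that occur at least `min_count` times in the corpus."""
    totals = Counter(tok for s in sentences for tok in tokenize(s))
    return sorted(tok for tok, n in totals.items() if n >= min_count)


def bag_of_words(sentence: str, vocabulary: list[str]) -> list[int]:
    """Return the count of each vocabulary token in the sentence."""
    counts = Counter(tokenize(sentence))
    return [counts[tok] for tok in vocabulary]
```

Each sentence becomes a fixed-length count vector whose columns follow the sorted vocabulary, which is the shape a count-based Naive Bayes classifier expects.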

2. Balancing the unbalanced data classes
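
The README does not say how the classes were balanced. One common option is random oversampling of the minority class, sketched below as an assumption; the function name `oversample` and the `seed` parameter are illustrative.

```python
import random
from collections import defaultdict


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class has equal size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((s, label) for s in items + extra)
    rng.shuffle(balanced)
    out_samples = [s for s, _ in balanced]
    out_labels = [lab for _, lab in balanced]
    return out_samples, out_labels
```

If oversampling is applied before the data is split, copies of the same sentence can end up in both the training and testing portions, which tends to inflate the measured accuracy; applying it to the training portion only avoids that.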

3. Naive Bayes Model for Sentiment Classification

Dataset segmentation
Model building
Prediction test
Classification report
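
The segmentation, model building, and prediction steps above can be sketched without external libraries as shown below. This is a minimal illustration, not the repository's implementation: it assumes a multinomial Naive Bayes with add-one smoothing over pre-tokenized sentences, and the 80/20 split ratio is a hypothetical default.

```python
import math
import random
from collections import Counter


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        class_totals = Counter(labels)
        self.log_prior = {c: math.log(class_totals[c] / len(labels)) for c in self.classes}
        self.total_words = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def _log_likelihood(self, token, c):
        numerator = self.word_counts[c][token] + 1
        denominator = self.total_words[c] + len(self.vocab)
        return math.log(numerator / denominator)

    def predict_one(self, doc):
        # Tokens never seen during training are skipped.
        scores = {
            c: self.log_prior[c]
            + sum(self._log_likelihood(t, c) for t in doc if t in self.vocab)
            for c in self.classes
        }
        return max(scores, key=scores.get)

    def predict(self, docs):
        return [self.predict_one(d) for d in docs]


def train_test_split(docs, labels, test_fraction=0.2, seed=0):
    """Shuffle indices and hold out `test_fraction` of the data for testing."""
    indices = list(range(len(docs)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    cut = len(indices) - n_test
    train, test = indices[:cut], indices[cut:]
    return (
        [docs[i] for i in train],
        [labels[i] for i in train],
        [docs[i] for i in test],
        [labels[i] for i in test],
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.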

DATA VISUALIZATION:

Sample Data:

Positive and Negative Class-wise Data Distribution Bar Plot :

The Occurrences of Features (Column-wise):

Positive and Negative Class-wise Data Distribution After Balancing Data-Classes:

RESULTS:

The Table with Results:

Confusion Matrix

The Prediction Accuracy Score for Training Data : 99.37%
The Prediction Accuracy Score for Testing Data : 97.91%
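
A confusion matrix and the metrics of a classification report can be computed from true and predicted labels as sketched below. The function names are illustrative, and label 1 is treated as the positive class, matching the dataset's labeling.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, treating 1 as positive."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def classification_report(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Accuracy is the sum of the diagonal of the confusion matrix (`tn + tp`) divided by the number of samples.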

