- A logistic-regression-based ML model trained on a dataset with 20k+ entries.
- Predicts whether a news article is real or fake based on its stemmed text.
Logistic regression is used because the model's final output is a binary value: the news/article is either real or fake. Refer here for further details.
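As a rough illustration of why logistic regression suits a binary output, the sketch below squashes a raw score into a probability with the sigmoid function and thresholds it into a 0/1 label. The function names, the example scores, and the 0.5 threshold are assumptions made for illustration; which class (real or fake) maps to 1 depends on how the dataset is labelled.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(score: float, threshold: float = 0.5) -> int:
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(score) >= threshold else 0


# Hypothetical linear scores (w . x + b) for three articles
for score in (-2.0, 0.0, 3.0):
    print(score, round(sigmoid(score), 3), predict_label(score))
```

sklearn's `LogisticRegression` applies the same idea internally when `predict` is called on a two-class problem.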
- Python
- sklearn
- numpy
- PorterStemmer
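The text stemming step could look roughly like the sketch below, assuming `PorterStemmer` is the one shipped with NLTK. The `stem_text` helper and its cleaning rules (keep letters only, lowercase) are hypothetical and may differ from what main.ipynb actually does.

```python
import re

from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()


def stem_text(text: str) -> str:
    """Lowercase the text, drop non-letters, and stem each remaining word."""
    # Replace anything that is not a letter with a space, then tokenize
    words = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(word) for word in words)


# Hypothetical headline, not taken from the dataset
print(stem_text("Officials confirmed the reports yesterday"))
```

Stemming collapses inflected forms of a word onto one token, which keeps the vocabulary the model sees smaller.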
A single train.csv dataset obtained from Kaggle is split 80%/20% into training and testing sets for the model.
The model is trained on 80% of the train.csv dataset and tested on the remaining 20%. Its accuracy was 96% on the training data and 94% on the test data.
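The split, training, and evaluation described above can be sketched as follows. This is a minimal example under stated assumptions: the ten toy texts and labels stand in for train.csv, `TfidfVectorizer` is an assumed feature extractor (the notebook may use a different one), and `random_state=2` is arbitrary. The accuracies it prints come from the toy data and say nothing about the real results.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for train.csv (not the real dataset)
texts = [
    "government publishes annual budget report",
    "city council approves new transit plan",
    "scientists release peer reviewed climate study",
    "court issues ruling on election dispute",
    "central bank announces interest rate decision",
    "aliens secretly control world banks claims insider",
    "miracle pill cures every disease overnight",
    "celebrity clone replaces president says blog",
    "moon landing staged in basement says anonymous post",
    "secret chip hidden in tap water exposed",
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # label meaning is an assumption

# 80%/20% split, stratified so both classes appear in each part
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=2
)

# Fit the vectorizer on the training texts only, then reuse it for the test texts
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

model = LogisticRegression()
model.fit(X_train, y_train)

# Accuracy on the data the model saw versus the held-out data
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Fitting the vectorizer on the training portion only keeps test-set vocabulary out of training. A gap between the two accuracies, like the reported 96% versus 94%, shows how well the model generalizes to unseen articles.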
A live version of main.ipynb is available on Google Drive.