Review-Insight is a powerful tool designed to analyze customer reviews and extract meaningful insights through natural language processing and sentiment analysis. This application helps businesses understand customer feedback at scale by automatically identifying key topics (tags) mentioned in reviews and analyzing the sentiment associated with each topic.
Using Review-Insight, you can:
- Automatically extract common themes and topics from large volumes of customer reviews
- Analyze sentiment around specific product features or aspects
- Identify areas of strength and opportunities for improvement
- Track customer sentiment trends over time
In the provided example dataset, Review-Insight analyzes reviews for headphones, identifying key aspects such as sound quality, comfort, battery life, and design, along with whether customers feel positively or negatively about each aspect.
- Automated Tag Extraction: Identifies the most relevant topics and features mentioned in reviews
- Sentiment Analysis: Determines whether reviews express positive or negative sentiments about specific features
- Interactive Visualization: Displays tag frequency and associated sentiment in an easy-to-understand format
- Scalable Processing: Handles large datasets with thousands of reviews
- Customizable Analysis: Allows adjustment of tag extraction parameters and sentiment thresholds
After processing reviews, Review-Insight provides:
- A list of the most commonly mentioned product features/aspects
- The sentiment distribution for each feature (positive vs. negative)
- Representative review excerpts for each feature
- Overall sentiment metrics across all reviews
To start the application, run `streamlit run code/app.py`. It will take around 2 to 3 minutes to run completely (depending on data size).
The application processes customer reviews to extract and analyze tags, and then performs sentiment analysis on these tags. Below is a detailed explanation of the approach used:
- Extract Tags:
  - The `extract_tags_with_nlp` function in `extract_tags.py` uses the spaCy NLP library to extract tags from reviews. It identifies noun chunks and individual nouns, cleans them, and removes generic words and stop words.
  - The `assign_tags` function associates the extracted tags with the corresponding reviews.
- Process Tags:
  - The `process_tags` function in `tag_processing.py` processes all extracted tags to filter out common words and sorts them by frequency. It then limits the tags to the most relevant ones based on a specified threshold and maximum number of tags.
- Sentiment Analysis:
  - The `get_sentiment_indicator` function in `sentiment_analysis.py` uses the TextBlob library to perform sentiment analysis on each review. It assigns a sentiment indicator ('✔️' for positive and '❌' for negative) based on the polarity of the review.
- Aggregate Sentiments:
  - The `aggregate_sentiments` function in `tag_processing.py` aggregates sentiment indicators for each tag to determine the overall sentiment (positive or negative) for that tag.
- Streamlit Application:
  - The `app.py` file integrates all the above functionalities into a Streamlit application. It loads and processes the data, extracts and processes tags, performs sentiment analysis, and displays the results in an interactive web interface.
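The tag-processing and aggregation steps described above can be sketched in a few lines of standard-library Python. This is only an illustration, not the code in `tag_processing.py`: the signatures, the `min_count` and `max_tags` defaults, the once-per-review counting, and the majority-vote rule (with ties counted as positive) are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Indicators as described above; everything else here is a hypothetical sketch.
POSITIVE, NEGATIVE = "✔️", "❌"


def process_tags(tagged_reviews, min_count=2, max_tags=10, common_words=frozenset()):
    """Count tags across reviews, drop common and rare ones, keep the top few."""
    counts = Counter()
    for tags in tagged_reviews:
        # dict.fromkeys de-duplicates while keeping order, so a tag that is
        # repeated inside one review is counted once (assumed behavior).
        for tag in dict.fromkeys(tags):
            if tag not in common_words:
                counts[tag] += 1
    # most_common() sorts by frequency, highest first.
    frequent = [(tag, n) for tag, n in counts.most_common() if n >= min_count]
    return frequent[:max_tags]


def aggregate_sentiments(tagged_reviews, indicators, kept_tags):
    """Majority vote over per-review indicators for each kept tag."""
    votes = defaultdict(Counter)
    for tags, indicator in zip(tagged_reviews, indicators):
        for tag in dict.fromkeys(tags):
            if tag in kept_tags:
                votes[tag][indicator] += 1
    # Ties are resolved as positive (an assumption of this sketch).
    return {
        tag: POSITIVE if c[POSITIVE] >= c[NEGATIVE] else NEGATIVE
        for tag, c in votes.items()
    }


# Example: four reviews, each with its tags and a per-review indicator.
reviews = [
    ["sound quality", "battery"],
    ["battery"],
    ["sound quality", "comfort"],
    ["battery", "comfort"],
]
indicators = [POSITIVE, NEGATIVE, POSITIVE, NEGATIVE]

top = process_tags(reviews, min_count=2, max_tags=2)
overall = aggregate_sentiments(reviews, indicators, {tag for tag, _ in top})
```

Because each review carries a single indicator, a tag inherits the sentiment of every review it appears in; a review that praises one feature and criticizes another contributes the same vote to both tags.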
- `code/extract_tags.py`: Contains functions to extract and clean tags from reviews using spaCy.
- `code/tag_processing.py`: Contains functions to process tags and aggregate sentiments.
- `code/sentiment_analysis.py`: Contains functions to perform sentiment analysis using TextBlob.
- `code/app.py`: The main Streamlit application file that integrates all functionalities and provides an interactive interface.
The application expects a CSV file (data/data.csv) with customer reviews and ratings. The file should have the following columns:
- `review`: The text of the customer review (mandatory column).
- `rating`: The rating given by the customer (optional column).
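As a rough illustration of this input contract, the sketch below reads such a file with Python's standard `csv` module and enforces that `review` is present while treating `rating` as optional. The `load_reviews` helper is hypothetical and is not taken from `app.py`; skipping rows with an empty review is likewise an assumption.

```python
import csv
import io


def load_reviews(csv_file):
    """Read reviews from an open CSV file object.

    Hypothetical helper: requires a `review` column, accepts an optional
    `rating` column, and skips rows whose review text is empty.
    """
    reader = csv.DictReader(csv_file)
    fieldnames = reader.fieldnames or []
    if "review" not in fieldnames:
        raise ValueError("CSV must contain a 'review' column")
    has_rating = "rating" in fieldnames

    rows = []
    for row in reader:
        text = (row["review"] or "").strip()
        if not text:
            continue  # nothing to analyze in this row
        # Ratings are kept as raw strings; None marks a missing column.
        rows.append({"review": text, "rating": row["rating"] if has_rating else None})
    return rows


# In-memory sample standing in for data/data.csv.
sample = io.StringIO("review,rating\nGreat sound quality,5\nBattery died fast,2\n")
rows = load_reviews(sample)
```

With a real file, the same function could be called as `load_reviews(f)` inside `with open("data/data.csv", newline="", encoding="utf-8") as f:`.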