User Feedback Sentiment (BigQuery + GPT-4o)

This project analyzes structured user feedback (thumbs up/down, flags, and comments) using a fine-grained LLM-based sentiment scoring system. It processes data from a BigQuery table, generates numerical sentiment scores and aspect labels via OpenAI’s GPT-4o, and stores the results back in BigQuery for visualization and monitoring.

Project Goal

Convert unstructured user feedback (thumbs, flags, comments) into structured sentiment metrics using an LLM so product teams can quantify user satisfaction and identify UX issues in analytics dashboards.

Features

Classifies feedback sentiment with numerical intensity from -2 to +2
Extracts structured labels for sentiment type and feedback aspect
Uses GPT-4o with a constrained JSON output schema
Supports two execution modes:
- Local CSV mode for testing and development
- BigQuery mode for production pipelines
Includes:
- retry logic for OpenAI API calls
- strict JSON parsing and validation
- optional Slack notifications for pipeline runs

Architecture

The pipeline processes user feedback using an LLM to produce structured sentiment labels that can be analyzed in analytics dashboards.

flowchart TD

A["User Feedback Source
BigQuery table or local CSV"]

B["LLM Sentiment Pipeline
Python + OpenAI GPT-4o"]

C["Structured Sentiment Output
BigQuery table"]

D["Analytics Dashboard
Looker Studio"]

E["Monitoring / Alerts
Slack"]

A --> B
B --> C
C --> D
C --> E

Stack

Python 3.11
OpenAI GPT-4o API (temperature=0)
Google BigQuery
.env for secret management (for local testing only--prod version incorporates Google Secret manager)
Optional: Looker Studio (for dashboards), Slack (for alerts)

Project Structure

user-feedback-sentiment-bq/
├── src/                                   # Main Python code
│   ├── llm_feedback_pipeline.py           # Core LLM + BigQuery sentiment pipeline
│   ├── test_llm_mini_pipeline.py          # Quick local test on small hardcoded/sample data
│   └── test_llm_on_full_dataset.py        # End-to-end test against real BigQuery (use with caution)
├── sql/
│   └── user_feedback_and_flags_model.sql  # Source table schema / view definition
├── docs/
│   └── llm_feedback_pipeline_plan.md      # Early design notes & prompt thinking trace
├── requirements.txt
├── .gitignore
├── LICENSE
└── README.md

How It Works

The pipeline can run in two modes.

Local Development Mode

Load feedback rows from sample_feedback.csv
Send each comment to GPT-4o
Parse structured JSON output
Save results to a local CSV file

Production Mode

Query new feedback rows from BigQuery
Send each comment to GPT-4o
Parse structured JSON output
Append results to a BigQuery output table
Optionally send a Slack notification when processing completes

Example LLM Classification

Example input row:

System message:

"You can improve your resume by adding measurable achievements."

User comment:

"This advice was helpful but the example link was broken."

LLM output:

{
  "sentiment_score": 1,
  "sentiment_type": "suggestion",
  "aspect": "completeness"
}

Running the Pipeline Locally

Install dependencies:

pip install -r requirements.txt

Set up your .env:

Local development requires:

OPENAI_API_KEY=your-key-here
BIGQUERY_PROJECT=your-gcp-project
BIGQUERY_DATASET=your-dataset

In production, secrets should be retrieved from Google Secret Manager instead of .env.

Run the test script (20-row sample):

python src/test_llm_mini_pipeline.py

Run full BigQuery pipeline:

python src/llm_feedback_pipeline.py
# or for validation:
python src/test_llm_on_full_dataset.py

Pipeline Modes

The main pipeline (llm_feedback_pipeline.py) supports two execution paths.

Local CSV Mode (default)

Used for development and testing.

Input: sample_feedback.csv Output: simulated_full_pipeline_local_test.csv

BigQuery Mode

Enable by uncommenting the BigQuery section in the script.

Input table: {project}.{dataset}.user_feedback_and_flags Output table: {project}.{dataset}.feedback_sentiment_output

Example Output Schema

column	description
user_id	user identifier
chat_id	chat session id
message_id	message identifier
timestamp	original feedback timestamp
user_comment	free-text user comment
system_message	AI message the user reacted to
source_type	thumbs / flag source
user_feedback_type	thumbs_up / thumbs_down / flag
sentiment_score	integer from -2 to +2
sentiment_type	complaint / suggestion / compliment / neutral
aspect	feedback topic classification
llm_timestamp	time sentiment analysis was generated

Future Plans

Add support for multilingual feedback
Hook into Slack or email alerts on extreme negative feedback
Compare LLM model performance (Claude vs GPT-4o)
Export labeled data for model fine-tuning

Safety & Reliability

The pipeline includes several safeguards:

retry logic for transient OpenAI API failures
JSON schema validation before writing results
duplicate prevention when writing local output
optional Slack alerts for monitoring scheduled runs

License

MIT License. Feel free to use and adapt this pipeline for your own feedback analysis workflows.

See the LICENSE file for full details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

User Feedback Sentiment (BigQuery + GPT-4o)

Project Goal

Features

Architecture

Stack

Project Structure

How It Works

Local Development Mode

Production Mode

Example LLM Classification

Running the Pipeline Locally

Pipeline Modes

Local CSV Mode (default)

BigQuery Mode

Example Output Schema

Future Plans

Safety & Reliability

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
assets/images		assets/images
docs		docs
sql		sql
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
sample_feedback.csv		sample_feedback.csv

Folders and files

Latest commit

History

Repository files navigation

User Feedback Sentiment (BigQuery + GPT-4o)

Project Goal

Features

Architecture

Stack

Project Structure

How It Works

Local Development Mode

Production Mode

Example LLM Classification

Running the Pipeline Locally

Pipeline Modes

Local CSV Mode (default)

BigQuery Mode

Example Output Schema

Future Plans

Safety & Reliability

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages