AI reviews summary #1219
Open: phongo1 wants to merge 25 commits into dev from AI-reviews-summary
Commits
56eabac AI reviews summary feature (phongo1)
e52c17f lint (phongo1)
a4524bc remove fallback model (phongo1)
2ba7411 pylint (phongo1)
9c56b02 AI block invisible if no summary and styling updates (phongo1)
c70166c pylint (phongo1)
29db012 var cleanup (phongo1)
24d57cf address claude code pr comment (phongo1)
1a942b1 increase max tokens (phongo1)
1bfe5ae additional flag for generating summaries for top n courses missing s… (phongo1)
a83fb19 dry run prints id's (phongo1)
77e1a4f more verbose error propagation (phongo1)
ae2450b script to print summaries we have already (phongo1)
20c4960 UI "New" tag and script timeout (phongo1)
e6268a2 remove unnecessary model params (phongo1)
c27cf77 fix(lint): linting / coderabbit warnings (artiehumphreys)
2df1522 fix(lint): pylint (artiehumphreys)
b5dd4f1 Merge branch 'dev' into AI-reviews-summary (phongo1)
1de94bc more accurate source_review_count when generating summaries (phongo1)
ccc2049 Remove list summaries script from PR (phongo1)
1fdfbc3 migrations patch (phongo1)
dfa319a lint (phongo1)
8b60885 cleaner file structuring (phongo1)
2b40326 Merge branch 'dev' into AI-reviews-summary (phongo1)
5274635 small batch generate summaries on container startup (phongo1)
229 changes: 229 additions & 0 deletions
tcf_website/management/commands/generate_ai_summaries.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,229 @@ | ||
| """Manual generator for AI review summaries.""" | ||
|
|
||
| import time | ||
| from typing import Any | ||
|
|
||
| from django.conf import settings | ||
| from django.core.management.base import BaseCommand, CommandError | ||
| from django.db.models import Count, Exists, Max, OuterRef | ||
|
|
||
| from ...models import Course, Instructor, Review, ReviewLLMSummary | ||
| from .generate_ai_summaries_helpers import generate_review_summary | ||
|
|
||
|
|
||
| # pylint: disable=too-many-branches,too-many-locals,no-member | ||
|
|
||
|
|
||
| class Command(BaseCommand): | ||
| """Command to manually generate AI summaries for course/instructor pairs. | ||
|
|
||
| Usage examples: | ||
| # Help command: | ||
| docker compose exec web python manage.py generate_ai_summaries --help | ||
|
|
||
| # Dry run: show the top 15 pairs (default --limit) by written-review count (no LLM calls) | ||
| python manage.py generate_ai_summaries --dry-run | ||
| OR | ||
| docker compose exec web python manage.py generate_ai_summaries --dry-run | ||
|
|
||
| # Generate for top 30 course / instructor pairs with at least 3 written reviews | ||
| python manage.py generate_ai_summaries --limit 30 --min-reviews 3 | ||
| OR | ||
| docker compose exec web python manage.py generate_ai_summaries --limit 30 --min-reviews 3 | ||
|
|
||
| # Generate for top 20 course / instructor pairs missing a summary | ||
| python manage.py generate_ai_summaries --limit 20 --missing-only | ||
| OR | ||
| docker compose exec web python manage.py generate_ai_summaries --limit 20 --missing-only | ||
|
|
||
| # Generate for a specific course/instructor pair | ||
| python manage.py generate_ai_summaries --course-id 1 --instructor-id 4019 | ||
| OR | ||
| docker compose exec web python manage.py generate_ai_summaries | ||
| --course-id 1 --instructor-id 4019 | ||
| """ | ||
|
|
||
| help = "Generate AI summaries manually for course/instructor pairs." | ||
|
|
||
| def add_arguments(self, parser): | ||
| parser.add_argument( | ||
| "--limit", | ||
| type=int, | ||
| default=15, | ||
| help="Number of top course/instructor pairs by review count to summarize.", | ||
| ) | ||
| parser.add_argument( | ||
| "--course-id", | ||
| type=int, | ||
| help="Specific course id to summarize (must be used with --instructor-id).", | ||
| ) | ||
| parser.add_argument( | ||
| "--instructor-id", | ||
| type=int, | ||
| help="Specific instructor id to summarize (must be used with --course-id).", | ||
| ) | ||
| parser.add_argument( | ||
| "--min-reviews", | ||
| type=int, | ||
| default=5, | ||
| help="Minimum number of written reviews required to generate a summary.", | ||
| ) | ||
| parser.add_argument( | ||
| "--max-reviews", | ||
| type=int, | ||
| default=25, | ||
| help="Use only the N most recent written reviews in the prompt (default: 25).", | ||
| ) | ||
| parser.add_argument( | ||
| "--missing-only", | ||
| action="store_true", | ||
| help="Only include pairs without an existing summary.", | ||
| ) | ||
| parser.add_argument( | ||
| "--dry-run", | ||
| action="store_true", | ||
| help="Show which pairs would be processed without calling the model.", | ||
| ) | ||
|
|
||
| def handle( | ||
| self, *args: Any, **options: Any | ||
| ): # pylint: disable=unused-argument,too-many-locals | ||
| if not getattr(settings, "OPENROUTER_API_KEY", ""): | ||
| raise CommandError("OPENROUTER_API_KEY is not configured.") | ||
|
|
||
| course_id = options.get("course_id") | ||
| instructor_id = options.get("instructor_id") | ||
| limit = options.get("limit") | ||
| min_reviews = options.get("min_reviews") | ||
| max_reviews = options.get("max_reviews") | ||
| missing_only = options.get("missing_only") | ||
| dry_run = options.get("dry_run") | ||
|
|
||
| if bool(course_id) ^ bool(instructor_id): | ||
| raise CommandError( | ||
| "Provide both --course-id and --instructor-id together, or neither." | ||
| ) | ||
|
|
||
| # Base queryset: visible, non-toxic, with text | ||
| base_reviews = Review.objects.filter( | ||
| hidden=False, toxicity_rating__lt=settings.TOXICITY_THRESHOLD | ||
| ).exclude(text="") | ||
|
|
||
| pairs: list[tuple[int, int, int, int]] = [] | ||
|
|
||
| if course_id and instructor_id: | ||
| if ( | ||
| missing_only | ||
| and ReviewLLMSummary.objects.filter( | ||
| course_id=course_id, instructor_id=instructor_id, club__isnull=True | ||
| ).exists() | ||
| ): | ||
| self.stdout.write( | ||
| self.style.WARNING( | ||
| f"Skipping course {course_id} / instructor {instructor_id}: " | ||
| "summary already exists." | ||
| ) | ||
| ) | ||
| return | ||
| review_count = base_reviews.filter( | ||
| course_id=course_id, instructor_id=instructor_id | ||
| ).count() | ||
| if review_count < min_reviews: | ||
| msg = ( | ||
| f"Skipping course {course_id} / instructor {instructor_id}: " | ||
| f"only {review_count} written reviews (minimum {min_reviews})." | ||
| ) | ||
| self.stdout.write(self.style.WARNING(msg)) | ||
| return | ||
| latest_id = ( | ||
| base_reviews.filter(course_id=course_id, instructor_id=instructor_id) | ||
| .order_by("-id") | ||
| .values_list("id", flat=True) | ||
| .first() | ||
| or 0 | ||
| ) | ||
| pairs = [(course_id, instructor_id, review_count, latest_id)] | ||
| else: | ||
| agg = ( | ||
| base_reviews.values("course_id", "instructor_id") | ||
| .annotate(review_count=Count("id"), last_id=Max("id")) | ||
| .filter(review_count__gte=min_reviews) | ||
| .order_by("-review_count") | ||
| ) | ||
| if missing_only: | ||
| existing_summaries = ReviewLLMSummary.objects.filter( | ||
| course_id=OuterRef("course_id"), | ||
| instructor_id=OuterRef("instructor_id"), | ||
| club__isnull=True, | ||
| ) | ||
| agg = agg.annotate(has_summary=Exists(existing_summaries)).filter( | ||
| has_summary=False | ||
| ) | ||
| agg = agg[:limit] | ||
| pairs = [ | ||
| ( | ||
| row["course_id"], | ||
| row["instructor_id"], | ||
| row["review_count"], | ||
| row["last_id"], | ||
| ) | ||
| for row in agg | ||
| ] | ||
|
|
||
| if not pairs: | ||
| self.stdout.write(self.style.WARNING("No pairs matched criteria.")) | ||
| return | ||
|
|
||
| self.stdout.write(f"Processing {len(pairs)} pair(s)...") | ||
|
|
||
| for idx, (course_id, instructor_id, review_count, latest_id) in enumerate( | ||
| pairs | ||
| ): | ||
| course = Course.objects.get(id=course_id) | ||
| instructor = Instructor.objects.get(id=instructor_id) | ||
| qs = base_reviews.filter(course=course, instructor=instructor).order_by( | ||
| "-created" | ||
| ) | ||
|
|
||
| if dry_run: | ||
| self.stdout.write( | ||
| f"[DRY RUN] Would summarize {course.code()} / {instructor.full_name} " | ||
| f"(course_id={course_id}, instructor_id={instructor_id}, " | ||
| f"{review_count} reviews)" | ||
| ) | ||
| continue | ||
|
|
||
| summary_text, error, model_used = generate_review_summary( | ||
| course, instructor, qs, max_reviews=max_reviews | ||
| ) | ||
| if not summary_text: | ||
| self.stdout.write( | ||
| self.style.ERROR( | ||
| f"Failed for {course.code()} / {instructor.full_name}: {error}" | ||
| ) | ||
| ) | ||
| continue | ||
|
|
||
| ReviewLLMSummary.objects.update_or_create( | ||
| course=course, | ||
| instructor=instructor, | ||
| club=None, | ||
| defaults={ | ||
| "summary_text": summary_text, | ||
| "source_review_count": min(review_count, max_reviews) | ||
| if max_reviews is not None | ||
| else review_count, | ||
| "last_review_id": latest_id, | ||
| "source_metadata": {}, | ||
| "model_id": model_used or getattr(settings, "OPENROUTER_MODEL", ""), | ||
| }, | ||
| ) | ||
|
|
||
| self.stdout.write( | ||
| self.style.SUCCESS( | ||
| f"Saved summary for {course.code()} / {instructor.full_name} " | ||
| f"({review_count} reviews) using {model_used or 'unknown model'}" | ||
| ) | ||
| ) | ||
|
|
||
| # Dry runs `continue` above, so only real model calls reach this point. | ||
| if idx < len(pairs) - 1: | ||
| time.sleep(4.0) # pause between model calls to stay under rate limits | ||
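
For readers without the Django models at hand, the bulk branch of `handle()` selects pairs in five steps: group visible written reviews by course and instructor, keep groups with at least `--min-reviews` reviews, optionally drop pairs that already have a summary, order by review count, and truncate to `--limit`. The sketch below mirrors those steps in plain Python. `ReviewRow`, `select_pairs`, and the `existing` set are hypothetical stand-ins introduced for illustration and are not part of this PR; the actual command does this work in the database through `Count`, `Max`, and `Exists`.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class ReviewRow(NamedTuple):
    """Hypothetical stand-in for a visible, non-toxic Review with text."""

    id: int
    course_id: int
    instructor_id: int


def select_pairs(
    reviews: Iterable[ReviewRow],
    min_reviews: int = 5,
    limit: int = 15,
    existing: frozenset[tuple[int, int]] = frozenset(),
    missing_only: bool = False,
) -> list[tuple[int, int, int, int]]:
    """Return (course_id, instructor_id, review_count, last_id) tuples."""
    # Per pair: [review_count, highest review id seen], like Count("id") / Max("id").
    stats: dict[tuple[int, int], list[int]] = defaultdict(lambda: [0, 0])
    for review in reviews:
        entry = stats[(review.course_id, review.instructor_id)]
        entry[0] += 1
        entry[1] = max(entry[1], review.id)

    rows = [
        (course_id, instructor_id, count, last_id)
        for (course_id, instructor_id), (count, last_id) in stats.items()
        # Mirrors .filter(review_count__gte=min_reviews) and the --missing-only filter.
        if count >= min_reviews
        and not (missing_only and (course_id, instructor_id) in existing)
    ]
    # Mirrors .order_by("-review_count"); ties keep first-seen order here,
    # whereas the database may return tied rows in any order.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

Calling `select_pairs(rows, min_reviews=3, limit=30)` corresponds roughly to `--limit 30 --min-reviews 3`, and passing `missing_only=True` with the set of already-summarized pairs corresponds to `--missing-only`. Filtering before truncating matters: as in the command, pairs that already have a summary are dropped before `limit` is applied, so `--missing-only --limit 20` yields up to 20 new pairs rather than the unsummarized subset of the top 20.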