feat: gameplay companion — Claude vision, Qdrant RAG, Supabase memory, proactive vision loop #2
…ude LLM

Pre-work: Full CHN Strip
- Delete: bilibili_live.py, fun_asr.py, cosyvoice*.py, minimax_tts.py, spark_tts.py, siliconflow_tts.py, conf.ZH.default.yaml, zh_*.yaml, README.CN/JP/KR.md, scripts/run_bilibili_live.py
- Fix language.py to always return 'en' (no OS locale detection)
- Fix constants.py: ZH_DEFAULT_CONF = EN_DEFAULT_CONF
- Remove bilibili optional dep group and CI workflows for ZH release
- Remove fun_asr, cosyvoice*, minimax, spark, siliconflow from all factories, config classes, Literals, and validators
- Fix PiperTTSConfig/piper_tts.py defaults to en_US-lessac-medium
- Fix azure_asr/AzureASRConfig defaults to ['en-US']
- Convert compare_yaml.py print strings to English
- Rewrite conf.default.yaml: wsl_claude_llm as sole LLM backend
- Update CLAUDE.md with new architecture docs; create PLAN.md roadmap

Phase 1: WSL Claude LLM Core
- Create wsl_claude.py: WSLClaudeLLM runs claude -p subprocess
- Force wsl_claude_llm in LLMFactory.create_llm() regardless of config
- Add WSLClaudeConfig to config_manager/stateless_llm.py
- Add wsl_claude_llm to BasicMemoryAgentConfig.llm_provider Literal
- Handle markdown code fences in claude -p JSON output
- Tested: subprocess call, JSON parsing, config validation all pass

Docs: docs/PHASE_1.md with timestamps and test results

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rking

Live WebSocket conversation confirmed on port 19393:
- claude -p subprocess fires for every user message
- JSON parsed + markdown fence stripping working
- Live2D emotion tags extracted ([joy], [smirk])
- TTS produces audio chunks (5–11 per response)
- Character persona (Mili) maintained across turns
- Server isolated on custom port 19393 (no conflicts)
- Ruff lint: zero warnings

Phase 1 pass criteria all met. Updated PHASE_1.md with real test evidence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
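The commits above mention stripping markdown code fences from `claude -p` output before parsing it as JSON. A minimal sketch of that idea follows; the function names `strip_code_fence` and `parse_cli_json` are hypothetical and are not taken from `wsl_claude.py`, and the real implementation may handle more cases.

```python
import json
from typing import Any

FENCE = "`" * 3  # three backticks, built at runtime to keep this example fence-free


def strip_code_fence(text: str) -> str:
    """Return the text inside a surrounding markdown fence, if there is one."""
    text = text.strip()
    wrapped = (
        text.startswith(FENCE)
        and text.endswith(FENCE)
        and len(text) >= 2 * len(FENCE)
    )
    if not wrapped:
        return text
    body = text[len(FENCE):-len(FENCE)]
    # Drop the first line when it is empty or a language tag such as "json".
    first_line, newline, rest = body.partition("\n")
    tag = first_line.strip()
    if newline and (not tag or tag.isalnum()):
        body = rest
    return body.strip()


def parse_cli_json(raw: str) -> Any:
    """Parse CLI output that may or may not be wrapped in a code fence."""
    return json.loads(strip_code_fence(raw))
```

Unfenced output passes through untouched, so the same parser can be used whether or not the model decided to wrap its answer.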
Phase 2 — Qdrant RAG:
- knowledge_base.py: get_game_context() → Qdrant search (score≥0.8) or claude-p research subprocess → ingest tactic+best_practices → return tactic
- service_context.py: KB initialized at startup, game context injected into system prompt before every conversation turn
- websocket_handler.py: inject-game-context WS endpoint for dev/test, game context refresh on every conversation trigger
- pyproject.toml: qdrant-client, python-dotenv added

Phase 3 — Proactive Vision Loop:
- vision_loop.py: ScreenWatcher daemon thread, PowerShell screenshot every 15-25s, pHash Hamming-distance AFK detection (< 5 = skip), 60s cooldown between proactive triggers
- websocket_handler.py: ScreenWatcher started/stopped per client session

Conversation quality fixes:
- ai-speak-signal blocked while bot conversation task is active (no interrupt)
- Response length capped to 1-2 sentences via system prompt rule
- CRITICAL RULE added: bot never reveals tech implementation details
- system_prompt copied to session context so game injection always has base

All Phase 2 live tests pass (KB standalone PASS, WebSocket PASS). Bot confirmed responding with Elden Ring tactics from Qdrant context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
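The Phase 3 gating rule (skip when the Hamming distance between consecutive perceptual hashes is below 5, and wait at least 60 s between proactive triggers) can be sketched as below. This assumes each screenshot has already been reduced to an integer hash by some perceptual-hash routine; the `TriggerGate` class and its behavior on the first frame are illustrative assumptions, not the code in `vision_loop.py`.

```python
import time
from typing import Optional

HASH_DISTANCE_THRESHOLD = 5   # commit message: distance < 5 means "skip"
MIN_TRIGGER_INTERVAL = 60.0   # commit message: 60 s cooldown between triggers


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


class TriggerGate:
    """Hypothetical helper deciding whether a new capture should trigger commentary."""

    def __init__(self) -> None:
        self._last_hash: Optional[int] = None
        self._last_trigger: Optional[float] = None

    def should_trigger(self, new_hash: int, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        previous = self._last_hash
        self._last_hash = new_hash
        if previous is None:
            return False  # assumption: the first frame only seeds the comparison
        if hamming_distance(previous, new_hash) < HASH_DISTANCE_THRESHOLD:
            return False  # screen barely changed, treat as AFK or static
        if (
            self._last_trigger is not None
            and now - self._last_trigger < MIN_TRIGGER_INTERVAL
        ):
            return False  # still inside the cooldown window
        self._last_trigger = now
        return True
```

Passing `now` explicitly keeps the gate deterministic in tests; in a live loop it falls back to `time.monotonic()`.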
…nd-to-end PASS
## What changed
### knowledge_base.py
- `_research_with_claude`: uses `@{screenshot_path}` prefix so `claude -p` visually
analyzes the screenshot (was: text path reference — Claude couldn't see the image)
- Removed `timeout=120` entirely — research completes however long it takes
- Returns 4-tuple `(tactic, best_practices, identified_game, identified_state)`
- `get_game_context`: uses Claude's returned game_name/current_state for Qdrant ingest
and `current_game_name` attribute (not the caller's "unknown")
- `get_game_context`: skips Qdrant search when `game_name == "unknown"` to avoid
false hits on stale "unknown" entries
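The lookup flow these bullets describe can be sketched as follows. This is an illustration under stated assumptions, not the code in `knowledge_base.py`: the `GameContextCache` class and the injected `search`, `research`, and `ingest` callables are hypothetical stand-ins for the Qdrant query, the `claude -p` research subprocess, and the Qdrant ingest step. The 0.8 threshold comes from the Phase 2 commit message.

```python
from typing import Callable, Optional, Tuple

SCORE_THRESHOLD = 0.8  # Phase 2 commit message: "score≥0.8"

# (tactic, best_practices, identified_game, identified_state)
ResearchResult = Tuple[str, str, str, str]
SearchHit = Optional[Tuple[str, float]]  # (tactic, score) or None


class GameContextCache:
    """Hypothetical wrapper around vector search, research, and ingest."""

    def __init__(
        self,
        search: Callable[[str, str], SearchHit],
        research: Callable[[str], ResearchResult],
        ingest: Callable[[str, str, str, str], None],
    ) -> None:
        self._search = search
        self._research = research
        self._ingest = ingest
        self.current_game_name = "unknown"

    def get_game_context(
        self, game_name: str, current_state: str, screenshot_path: str
    ) -> str:
        # Skip the vector search for "unknown" to avoid stale false hits.
        if game_name != "unknown":
            hit = self._search(game_name, current_state)
            if hit is not None and hit[1] >= SCORE_THRESHOLD:
                return hit[0]
        # Cache miss: research from the screenshot, then store the result
        # under the names the research step identified, not the caller's.
        tactic, best_practices, found_game, found_state = self._research(
            screenshot_path
        )
        self._ingest(found_game, found_state, tactic, best_practices)
        self.current_game_name = found_game
        return tactic
```

A caller can pass `cache.current_game_name` on later captures so that follow-up lookups go through the search branch instead of repeating the research step.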
### vision_loop.py
- `_analyze_screen`: reuses `kb.current_game_name` after first identification so
subsequent captures hit Qdrant cache (was: always passed "unknown")
### .gitignore
- Added `sessions/` (runtime screenshots) and `.claude/` to ignore list
## Live test (2026-02-10)
- elden1.jpg → Claude identified "Elden Ring / Boss fight: Malenia Phase 2 (Scarlet Aeonia attack)" — PASS
- Qdrant ingest: 2 points (tactic + best_practices) — PASS
- Supabase user_history row saved for game='Elden Ring' — PASS
- docs/PHASE_5.md: full e2e pipeline test report added
- README.md: rewritten to describe gameplay companion architecture
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Caution: Review failed. The pull request is closed.

Walkthrough

This PR transforms the project from a multi-language, platform-agnostic VTuber into an English-focused, self-learning gameplay companion. It removes language-specific resources and bilibili integration, introduces Qdrant-based knowledge caching, screen-watching vision loops, and Supabase memory persistence, all coordinated by a new WSLClaudeLLM backend and integrated via WebSocket handlers.

Estimated code review effort: 4 (Complex), ~60 minutes
Summary of Changes

Hello @diannt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request transforms the Open-LLM-VTuber into a self-learning gameplay coaching companion. It establishes a robust, local-first AI pipeline by integrating WSL Claude for all language model inference, a Qdrant vector database for contextual game knowledge, and Supabase for long-term user memory. The system proactively monitors gameplay through a vision loop, offering timely and personalized advice via the Live2D avatar, while also refining conversation quality for a more focused user experience.
Code Review
This is a substantial pull request that transforms the project into a self-learning gameplay coaching companion. It introduces several major new components, including a WSL-based Claude integration for local LLM inference, Qdrant for RAG, Supabase for long-term memory, and a proactive screen-watching loop for gameplay analysis. The PR also performs a significant cleanup by removing multi-language support and consolidating to a single LLM backend, which simplifies the architecture. The changes are well-documented with detailed planning and phase-report files. While the new features are impressive, I've identified a few high-severity issues related to hardcoded paths and blocking I/O that impact portability and performance. Addressing these will be crucial for making the new functionality robust and usable for the community.
from .stateless_llm_interface import StatelessLLMInterface

# Resolve claude CLI path at import time; fall back to known install location
_CLAUDE_CLI = shutil.which("claude") or "/home/maatru/.local/bin/claude"
The fallback path for the claude CLI is hardcoded to a user-specific directory (/home/maatru/.local/bin/claude). This makes the application not portable and will fail for any other user if claude is not in the system's PATH. This same issue is present in src/open_llm_vtuber/modules/knowledge_base.py on line 33.
To improve portability and avoid code duplication, this path should be configurable or determined in a single, shared location.
# Resolve claude CLI path at import time. Fails if not found.
_CLAUDE_CLI = shutil.which("claude")
if not _CLAUDE_CLI:
    raise FileNotFoundError(
        "`claude` CLI not found in PATH. Please install it or configure its path."
    )

if not url or not key:
    raise ValueError("SUPABASE_URL and SUPABASE_SECRET_KEY must be set in .env")

self._db: Client = create_client(url, key)
The MemoryManager uses supabase.create_client, which provides a synchronous client. Since this application is built on asyncio, all database operations performed by this manager (e.g., save_advice, save_chat_log) are blocking network calls that will freeze the event loop, severely impacting performance and responsiveness.
-self._db: Client = create_client(url, key)
+self._db: AsyncClient = create_async_client(url, key)
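Independently of which client is used, one general way to keep a blocking call off the event loop is to run it in a worker thread with `asyncio.to_thread`. The sketch below is an illustration, not the reviewer's suggestion or the project's `MemoryManager`: `save_advice_blocking` is a hypothetical stand-in for a synchronous database write, with `time.sleep` simulating network latency.

```python
import asyncio
import time


def save_advice_blocking(row: dict) -> dict:
    """Hypothetical stand-in for a synchronous database call."""
    time.sleep(0.05)  # simulate network latency
    return {"saved": True, **row}


async def save_advice(row: dict) -> dict:
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop stays free to schedule other tasks meanwhile.
    return await asyncio.to_thread(save_advice_blocking, row)


async def main() -> list:
    ticks = []

    async def heartbeat() -> None:
        # Keeps running while the "database call" is in flight.
        for _ in range(3):
            ticks.append("tick")
            await asyncio.sleep(0.01)

    result, _ = await asyncio.gather(
        save_advice({"game": "Elden Ring"}), heartbeat()
    )
    return [result, ticks]
```

Running `asyncio.run(main())` shows the heartbeat completing alongside the save, which would not happen if the blocking function were called directly inside a coroutine.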
_PS_TIMEOUT = 10  # seconds for PowerShell call
_MIN_TRIGGER_INTERVAL = 60  # minimum seconds between proactive AI speaks

_POWERSHELL = "/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe"
The path to powershell.exe is hardcoded to /mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe. This makes the screen capture functionality strictly dependent on a specific WSL2 environment and will not work on native Windows, macOS, or other Linux distributions. The pyautogui.screenshot() function is cross-platform, so this PowerShell-based implementation seems to be a workaround for WSL.
-_POWERSHELL = "/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe"
+# Path to PowerShell, should be configurable or detected.
+_POWERSHELL = os.getenv("POWERSHELL_PATH", "/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe")
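Both hardcoded-path comments point toward the same pattern: an explicit override first, then a `PATH` lookup, then an optional fallback. A minimal sketch of one shared resolver is shown below; `resolve_executable` is a hypothetical helper name and is not part of the PR, while `POWERSHELL_PATH` is the environment variable named in the suggestion above.

```python
import os
import shutil
from typing import Optional


def resolve_executable(
    name: str, env_var: str, fallback: Optional[str] = None
) -> Optional[str]:
    """Resolve an executable path: env override, then PATH, then fallback."""
    override = os.getenv(env_var)
    if override:
        return override  # explicit configuration wins
    # shutil.which returns None when the program is not on PATH.
    return shutil.which(name) or fallback
```

A caller might use `resolve_executable("powershell.exe", "POWERSHELL_PATH", fallback=...)` and raise a clear error when the result is `None`, so a missing tool fails at startup instead of at the first screenshot.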
Summary
Transform Open-LLM-VTuber into a self-learning gameplay coaching companion.
- (`claude -p`) replaces all LLM backends: no API keys, no remote calls
- `game_knowledge` collection; tactics cached and reused across sessions
- `@/path/to/screenshot.png` prefix lets Claude analyze the screenshot directly and return `game_name`, `current_state`, `tactic`, `best_practices` as JSON
- `ScreenWatcher` daemon takes a screenshot every 15–25 s, pHash Hamming distance detects screen changes, triggers AI commentary when gameplay changes
- `user_history`, `chat_logs`, `long_term_profile` tables; profile injected into system prompt at session start; advice saved after each response
- `ai-speak-signal` suppressed while bot is talking; response length capped to 1–2 sentences

Commits
- bbc9927 feat: gameplay companion transformation — CHN strip + Phase 1 WSL Claude LLM
- 31a8065 test(phase1): confirm live end-to-end test — WSLClaudeLLM verified working
- 6013052 feat: Phase 2+3 — Qdrant RAG game knowledge + proactive vision loop
- ab0f5a5 feat: Claude vision game identification — @path prompt, no timeout, end-to-end PASS

Live test results
All phases tested end-to-end against real services (Qdrant cloud, Supabase):
- `elden1.jpg` → Claude identified "Elden Ring / Malenia Phase 2 Scarlet Aeonia attack" ✓
- `user_history` row saved ✓

Test plan
- `uv sync` installs all dependencies
- `.env` with `QDRANT_API_KEY`, `QDRANT_CLUSTER_ENDPOINT`, `SUPABASE_URL`, `SUPABASE_SECRET_KEY`
- `migrations/001_create_tables.sql` in Supabase SQL editor
- `uv run run_server.py` starts on port 12393
- `game_knowledge` gets populated
- `user_history` rows appear in Supabase

🤖 Generated with Claude Code