feat: gameplay companion — Claude vision, Qdrant RAG, Supabase memory, proactive vision loop #2

Merged: diannt merged 5 commits into main from gameplay-companion on Feb 10, 2026

Conversation

@diannt diannt commented Feb 10, 2026

Owner

Summary

Transform Open-LLM-VTuber into a self-learning gameplay coaching companion.

  • WSL Claude subprocess (claude -p) replaces all LLM backends — no API keys, no remote calls
  • Qdrant cloud RAG: game_knowledge collection; tactics cached and reused across sessions
  • Claude visual game identification: an @/path/to/screenshot.png prefix lets Claude analyze the screenshot directly and return game_name, current_state, tactic, best_practices as JSON
  • Proactive vision loop: the ScreenWatcher daemon takes a screenshot every 15–25 s, uses pHash Hamming distance to detect screen changes, and triggers AI commentary when gameplay changes
  • Supabase memory: user_history, chat_logs, long_term_profile tables; profile injected into system prompt at session start; advice saved after each response
  • Conversation quality fixes: ai-speak-signal suppressed while bot is talking; response length capped to 1–2 sentences

Commits

  • bbc9927 feat: gameplay companion transformation — CHN strip + Phase 1 WSL Claude LLM
  • 31a8065 test(phase1): confirm live end-to-end test — WSLClaudeLLM verified working
  • 6013052 feat: Phase 2+3 — Qdrant RAG game knowledge + proactive vision loop
  • ab0f5a5 feat: Claude vision game identification — @path prompt, no timeout, end-to-end PASS

Live test results

All phases tested end-to-end against real services (Qdrant cloud, Supabase):

  • elden1.jpg → Claude identified "Elden Ring / Malenia Phase 2 Scarlet Aeonia attack" ✓
  • Qdrant ingest: 2 points (tactic + best_practices) ✓
  • Supabase user_history row saved ✓
  • Bot responds with specific tactical advice via Live2D avatar ✓

Test plan

  • uv sync installs all dependencies
  • Set .env with QDRANT_API_KEY, QDRANT_CLUSTER_ENDPOINT, SUPABASE_URL, SUPABASE_SECRET_KEY
  • Run migrations/001_create_tables.sql in Supabase SQL editor
  • uv run run_server.py starts on port 12393
  • Open a game, wait 15–25 s, confirm avatar comments on gameplay
  • Confirm Qdrant collection game_knowledge gets populated
  • Confirm user_history rows appear in Supabase
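
Before running the test plan, it can help to confirm that the four variables it names are actually set. The following is a minimal sketch, not part of the PR: the helper name `missing_env_vars` is hypothetical, and it only inspects a mapping (by default `os.environ`), so a `.env` file must already have been loaded by whatever mechanism the application uses.

```python
import os

# Variable names are taken from the test plan above.
REQUIRED_VARS = (
    "QDRANT_API_KEY",
    "QDRANT_CLUSTER_ENDPOINT",
    "SUPABASE_URL",
    "SUPABASE_SECRET_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` is any mapping; it defaults to the process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` at startup and failing fast when the list is non-empty gives a clearer error than a connection failure deep inside the Qdrant or Supabase client.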

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added proactive screen watching daemon that continuously monitors gameplay and identifies game context
    • Integrated vector-based knowledge caching for intelligent game tactic retrieval
    • Added persistent memory layer to track long-term player preferences and advice history
  • Removals

    • Removed multi-language documentation support (Chinese, Japanese, Korean)
    • Removed BiliBili live streaming integration
    • Removed multiple LLM, ASR, and TTS provider options; consolidated to Claude AI
  • Configuration Changes

    • Unified to English-only language support
    • Simplified package architecture and deployment workflow
  • Rebranding

    • Project repositioned as "Gameplay Companion — Self-Learning AI Coach"

Nikita A and others added 5 commits February 10, 2026 02:16
…ude LLM

Pre-work: Full CHN Strip
- Delete: bilibili_live.py, fun_asr.py, cosyvoice*.py, minimax_tts.py,
  spark_tts.py, siliconflow_tts.py, conf.ZH.default.yaml, zh_*.yaml,
  README.CN/JP/KR.md, scripts/run_bilibili_live.py
- Fix language.py to always return 'en' (no OS locale detection)
- Fix constants.py: ZH_DEFAULT_CONF = EN_DEFAULT_CONF
- Remove bilibili optional dep group and CI workflows for ZH release
- Remove fun_asr, cosyvoice*, minimax, spark, siliconflow from all
  factories, config classes, Literals, and validators
- Fix PiperTTSConfig/piper_tts.py defaults to en_US-lessac-medium
- Fix azure_asr/AzureASRConfig defaults to ['en-US']
- Convert compare_yaml.py print strings to English
- Rewrite conf.default.yaml: wsl_claude_llm as sole LLM backend
- Update CLAUDE.md with new architecture docs; create PLAN.md roadmap

Phase 1: WSL Claude LLM Core
- Create wsl_claude.py: WSLClaudeLLM runs claude -p subprocess
- Force wsl_claude_llm in LLMFactory.create_llm() regardless of config
- Add WSLClaudeConfig to config_manager/stateless_llm.py
- Add wsl_claude_llm to BasicMemoryAgentConfig.llm_provider Literal
- Handle markdown code fences in claude -p JSON output
- Tested: subprocess call, JSON parsing, config validation all pass
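
The fence handling mentioned in the Phase 1 list can be sketched as follows. This is an illustration under assumptions, not the code in wsl_claude.py: the function name `parse_cli_json` is hypothetical, and the sketch only handles output that is either bare JSON or a single fenced block with nothing around it.

```python
import json
import re

# Built at runtime so this example does not contain a literal fence marker.
FENCE = "`" * 3

# Matches a whole string of the form: fence + optional language tag,
# newline, body, optional newline, closing fence.
_FENCE_RE = re.compile(
    r"^" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?" + FENCE + r"\s*$",
    re.DOTALL,
)


def parse_cli_json(raw: str) -> dict:
    """Parse JSON from CLI output, unwrapping one markdown code fence if present."""
    text = raw.strip()
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1)
    # json.loads raises json.JSONDecodeError on anything that is not valid JSON.
    return json.loads(text)
```

Output that mixes prose with a fenced block would still raise `json.JSONDecodeError` here; a caller could catch that and fall back to treating the reply as plain text.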

Docs: docs/PHASE_1.md with timestamps and test results

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rking

Live WebSocket conversation confirmed on port 19393:
- claude -p subprocess fires for every user message
- JSON parsed + markdown fence stripping working
- Live2D emotion tags extracted ([joy], [smirk])
- TTS produces audio chunks (5–11 per response)
- Character persona (Mili) maintained across turns
- Server isolated on custom port 19393 (no conflicts)
- Ruff lint: zero warnings

Phase 1 pass criteria all met. Updated PHASE_1.md with real test evidence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 2 — Qdrant RAG:
- knowledge_base.py: get_game_context() → Qdrant search (score≥0.8) or
  claude-p research subprocess → ingest tactic+best_practices → return tactic
- service_context.py: KB initialized at startup, game context injected into
  system prompt before every conversation turn
- websocket_handler.py: inject-game-context WS endpoint for dev/test, game
  context refresh on every conversation trigger
- pyproject.toml: qdrant-client, python-dotenv added
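
The cache-or-research flow described for `get_game_context()` can be sketched without a Qdrant connection. Everything below is a simplified stand-in: `InMemoryStore`, `cosine`, and the `research` callable are hypothetical names, the store keeps vectors in a Python list instead of a Qdrant collection, and only the tactic is ingested (the PR ingests tactic and best_practices as two points). The 0.8 threshold is the one quoted in the commit message.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

SCORE_THRESHOLD = 0.8  # from the commit message: cached hits need score >= 0.8


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Hypothetical stand-in for the vector collection."""

    def __init__(self) -> None:
        self.points: List[Tuple[List[float], str]] = []

    def search(self, vector: Sequence[float]) -> Optional[Tuple[float, str]]:
        """Return (score, text) for the best match, or None if the store is empty."""
        best: Optional[Tuple[float, str]] = None
        for stored_vector, text in self.points:
            score = cosine(vector, stored_vector)
            if best is None or score > best[0]:
                best = (score, text)
        return best

    def ingest(self, vector: Sequence[float], text: str) -> None:
        self.points.append((list(vector), text))


def get_game_context(
    store: InMemoryStore,
    query_vector: Sequence[float],
    research: Callable[[], str],
) -> str:
    """Return a cached tactic on a confident hit, else research and cache it."""
    hit = store.search(query_vector)
    if hit is not None and hit[0] >= SCORE_THRESHOLD:
        return hit[1]
    tactic = research()  # in the PR this is the claude -p research subprocess
    store.ingest(query_vector, tactic)
    return tactic
```

The design point is that the expensive research call only runs on a cache miss, so repeated queries about the same game state are served from the store.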

Phase 3 — Proactive Vision Loop:
- vision_loop.py: ScreenWatcher daemon thread, PowerShell screenshot every
  15-25s, pHash Hamming-distance AFK detection (< 5 = skip), 60s cooldown
  between proactive triggers
- websocket_handler.py: ScreenWatcher started/stopped per client session
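
The change-detection and cooldown rules in the Phase 3 list can be expressed as a small gate. This is a sketch under assumptions: it takes perceptual hashes as plain integers (computing a pHash from an image needs an imaging library and is out of scope here), the class name `ChangeGate` is hypothetical, and treating the very first frame as "nothing to compare against" is my assumption rather than documented behaviour. The thresholds (distance below 5 is skipped, 60 s between triggers) come from the commit message.

```python
import time
from typing import Optional

HAMMING_SKIP_BELOW = 5     # distance < 5 means "screen unchanged" (from the commit message)
MIN_TRIGGER_INTERVAL = 60  # seconds between proactive triggers (from the commit message)


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


class ChangeGate:
    """Decides whether a new screenshot hash should trigger commentary."""

    def __init__(self) -> None:
        self._last_hash: Optional[int] = None
        self._last_trigger: Optional[float] = None

    def should_trigger(self, current_hash: int, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        previous = self._last_hash
        self._last_hash = current_hash
        if previous is None:
            return False  # assumption: the first frame only seeds the comparison
        if hamming_distance(previous, current_hash) < HAMMING_SKIP_BELOW:
            return False  # screen looks the same, likely AFK or a static menu
        if (
            self._last_trigger is not None
            and now - self._last_trigger < MIN_TRIGGER_INTERVAL
        ):
            return False  # still inside the cooldown window
        self._last_trigger = now
        return True
```

Passing `now` explicitly keeps the gate deterministic in tests; in a watcher loop it can be omitted so the monotonic clock is used.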

Conversation quality fixes:
- ai-speak-signal blocked while bot conversation task is active (no interrupt)
- Response length capped to 1-2 sentences via system prompt rule
- CRITICAL RULE added: bot never reveals tech implementation details
- system_prompt copied to session context so game injection always has base
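
The first fix in this list, ignoring ai-speak-signal while a conversation task is active, reduces to a check on the task's state. A minimal sketch, assuming the handler keeps the current conversation as an `asyncio.Task` (the function name `accept_proactive_signal` is hypothetical):

```python
import asyncio
from typing import Optional


def accept_proactive_signal(conversation_task: Optional[asyncio.Task]) -> bool:
    """Return True only when no conversation task is currently running."""
    return conversation_task is None or conversation_task.done()
```

Dropping the signal, rather than cancelling the running task, is what prevents the proactive loop from interrupting a reply that is still being spoken.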

All Phase 2 live tests pass (KB standalone PASS, WebSocket PASS).
Bot confirmed responding with Elden Ring tactics from Qdrant context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nd-to-end PASS

## What changed

### knowledge_base.py
- `_research_with_claude`: uses `@{screenshot_path}` prefix so `claude -p` visually
  analyzes the screenshot (was: text path reference — Claude couldn't see the image)
- Removed `timeout=120` entirely — research completes however long it takes
- Returns 4-tuple `(tactic, best_practices, identified_game, identified_state)`
- `get_game_context`: uses Claude's returned game_name/current_state for Qdrant ingest
  and `current_game_name` attribute (not the caller's "unknown")
- `get_game_context`: skips Qdrant search when `game_name == "unknown"` to avoid
  false hits on stale "unknown" entries
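
The knowledge_base.py changes above can be sketched as three small helpers. This is illustrative only: the function names are hypothetical, the instruction text after the `@path` prefix is an invented example rather than the prompt used in the PR, and the sketch assumes `tactic` and `best_practices` arrive as strings. The JSON keys and the order of the 4-tuple follow the PR description and this commit message.

```python
import json
from typing import Tuple


def build_research_prompt(screenshot_path: str) -> str:
    """Prefix the prompt with @<path> so the CLI attaches the image itself."""
    # Only the "@<path>" prefix is described in the commit; the wording that
    # follows is a hypothetical example.
    return (
        f"@{screenshot_path} Identify the game and reply with JSON containing "
        "game_name, current_state, tactic, best_practices."
    )


def parse_research(payload: str) -> Tuple[str, str, str, str]:
    """Return (tactic, best_practices, identified_game, identified_state)."""
    data = json.loads(payload)
    return (
        data.get("tactic", ""),
        data.get("best_practices", ""),
        data.get("game_name", "unknown"),
        data.get("current_state", "unknown"),
    )


def should_search_cache(game_name: str) -> bool:
    """Skip the vector search while the game has not been identified yet."""
    return game_name.strip().lower() != "unknown"
```

Skipping the search for "unknown" avoids matching against earlier entries that were themselves stored before any game was identified.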

### vision_loop.py
- `_analyze_screen`: reuses `kb.current_game_name` after first identification so
  subsequent captures hit Qdrant cache (was: always passed "unknown")

### .gitignore
- Added `sessions/` (runtime screenshots) and `.claude/` to ignore list

## Live test (2026-02-10)
- elden1.jpg → Claude identified "Elden Ring / Boss fight: Malenia Phase 2 (Scarlet Aeonia attack)" — PASS
- Qdrant ingest: 2 points (tactic + best_practices) — PASS
- Supabase user_history row saved for game='Elden Ring' — PASS
- docs/PHASE_5.md: full e2e pipeline test report added
- README.md: rewritten to describe gameplay companion architecture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot commented Feb 10, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

This PR transforms the project from a multi-language, platform-agnostic VTuber into an English-focused, self-learning gameplay companion. It removes language-specific resources and bilibili integration, introduces Qdrant-based knowledge caching, screen-watching vision loops, and Supabase memory persistence, all coordinated by a new WSLClaudeLLM backend and integrated via WebSocket handlers.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Workflow automation**: `.github/workflows/create_release.yml`, `.github/workflows/update-requirements.yml` | Consolidated single-language package creation and removed bilibili-specific build steps. |
| **Language-specific documentation removal**: `README.CN.md`, `README.JP.md`, `README.KR.md` | Deleted entire Chinese, Japanese, and Korean README files; project now English-only. |
| **Project documentation overhaul**: `README.md`, `CLAUDE.md`, `PLAN.md`, `docs/PHASE_*.md` | Rewrote README to emphasize gameplay-companion architecture with screen watching, game identification, and memory; added comprehensive phased implementation plan and phase-specific documentation. |
| **Configuration simplification**: `config_templates/conf.ZH.default.yaml`, `config_templates/conf.default.yaml`, `pyproject.toml`, `requirements.txt` | Removed Chinese config template, consolidated to single default template with WSLClaudeLLM, simplified dependencies by replacing bilibili packages with Qdrant/Supabase/vision tools. |
| **LLM backend consolidation**: `src/open_llm_vtuber/agent/stateless_llm/wsl_claude.py`, `src/open_llm_vtuber/agent/stateless_llm_factory.py`, `src/open_llm_vtuber/config_manager/stateless_llm.py` | Introduced WSLClaudeLLM as sole backend (subprocess-based Claude CLI execution); removed Zhipu and Deepseek provider support. |
| **ASR provider cleanup**: `src/open_llm_vtuber/asr/asr_factory.py`, `src/open_llm_vtuber/asr/azure_asr.py`, `src/open_llm_vtuber/asr/fun_asr.py`, `src/open_llm_vtuber/config_manager/asr.py` | Removed FunASR entirely; narrowed Azure ASR languages from bilingual to English-only; updated factory and config accordingly. |
| **TTS provider rationalization**: `src/open_llm_vtuber/tts/cosyvoice_tts.py`, `src/open_llm_vtuber/tts/cosyvoice2_tts.py`, `src/open_llm_vtuber/tts/minimax_tts.py`, `src/open_llm_vtuber/tts/siliconflow_tts.py`, `src/open_llm_vtuber/tts/spark_tts.py`, `src/open_llm_vtuber/tts/piper_tts.py`, `src/open_llm_vtuber/tts/tts_factory.py`, `src/open_llm_vtuber/config_manager/tts.py` | Removed five TTS backends (Cosyvoice, Cosyvoice2, Minimax, SiliconFlow, Spark); updated Piper default model from Chinese to English; cleaned factory and config. |
| **Live platform integration removal**: `src/open_llm_vtuber/live/bilibili_live.py`, `scripts/run_bilibili_live.py`, `src/open_llm_vtuber/config_manager/live.py`, `characters/zh_*.yaml` | Deleted BiliBili live streaming module, script, and character configuration files; replaced with empty live config placeholder. |
| **Knowledge base and RAG integration**: `src/open_llm_vtuber/modules/knowledge_base.py`, `src/open_llm_vtuber/modules/__init__.py`, `src/open_llm_vtuber/service_context.py` | Added new KnowledgeBase class leveraging Qdrant for vector search with Claude fallback on cache misses; integrated into ServiceContext for dynamic context injection. |
| **Vision loop and screen watching**: `src/open_llm_vtuber/modules/vision_loop.py` | Introduced ScreenWatcher daemon that captures periodic screenshots, computes perceptual hashes to detect game state changes, triggers KB context lookup and async AI responses. |
| **Persistent memory layer**: `src/open_llm_vtuber/integrations/memory_manager.py`, `src/open_llm_vtuber/integrations/__init__.py`, `migrations/001_create_tables.sql` | Added MemoryManager for Supabase-backed persistence of game advice, chat logs, and user profiles; created database schema with three tables (user_history, chat_logs, long_term_profile). |
| **WebSocket integration expansion**: `src/open_llm_vtuber/websocket_handler.py`, `src/open_llm_vtuber/conversations/single_conversation.py` | Extended WebSocket handler to initialize and manage per-client ScreenWatcher instances; added game context injection message handler; wired advice persistence into conversation flow; enhanced system prompt with dynamic context and memory profiles. |
| **Configuration management updates**: `src/open_llm_vtuber/config_manager/__init__.py`, `run_server.py`, `upgrade_codes/...` | Removed BiliBiliLiveConfig, FunASRConfig, CosyvoiceTTSConfig exports; simplified language detection to hardcoded English; updated comparison scripts and constants to use single default config. |
| **.gitignore and miscellaneous**: `.gitignore` | Added ignore rules for sessions/ (runtime data) and .claude/ (Claude Code project directory). |
| **Live testing suite**: `tests/phase2_live_test.py` | Added comprehensive Phase 2 test script validating KB standalone performance, WebSocket context injection, and Qdrant cache hits with game scenario validation. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


🐰 A rabbit's ode to this transformation:

The languages fade, but Claude stays bright,
Qdrant caches wisdom in the night,
Vision loops watch screens with care,
While memories persist everywhere—
From chaos came a focused companion true,
Ready to guide your gameplay through! 🎮✨

@diannt diannt merged commit 3eef951 into main Feb 10, 2026
2 of 3 checks passed
@gemini-code-assist

Summary of Changes

This pull request transforms the Open-LLM-VTuber into a self-learning gameplay coaching companion. It establishes a robust, local-first AI pipeline by integrating WSL Claude for all language model inference, a Qdrant vector database for contextual game knowledge, and Supabase for long-term user memory. The system proactively monitors gameplay through a vision loop, offering timely and personalized advice via the Live2D avatar, while also refining conversation quality for a more focused user experience.

Highlights

  • Core LLM Backend Transition: The project now exclusively uses a WSL Claude subprocess (claude -p) as its LLM backend, eliminating the need for API keys and remote LLM calls. This change is reflected across the LLM factory and configuration.
  • Qdrant RAG for Game Knowledge: A new Qdrant-backed knowledge base (game_knowledge collection) has been integrated to store and retrieve game-specific tactics and best practices. It leverages Claude's vision capabilities for initial research and caches findings for future use.
  • Proactive Vision Loop: A ScreenWatcher daemon thread has been introduced to proactively monitor the user's screen. It takes screenshots, uses pHash to detect significant screen changes, and triggers AI commentary when gameplay shifts, providing real-time coaching.
  • Supabase Memory Integration: Supabase is now used for persistent memory, storing user_history (advice given), chat_logs (session history), and long_term_profile (user preferences). This allows the AI companion to remember past interactions and tailor advice.
  • Internationalization and Redundancy Streamlining: All Chinese (ZH), Japanese (JP), and Korean (KR) specific READMEs, character configurations, and default config templates have been removed, streamlining the codebase to focus on English. Several redundant ASR and TTS engine options have also been removed.
  • Conversation Quality Enhancements: Improvements have been made to conversation flow, including suppressing proactive AI signals when the bot is already speaking and capping response lengths to 1-2 sentences for more concise advice.

Changelog
  • .gitignore
    • Added entries to ignore runtime session data in sessions/ and Claude Code project directories in .claude/.
  • CLAUDE.md
    • Rewrote the document to reflect the new architecture, commands, configuration, and key files for the gameplay companion.
  • PLAN.md
    • Added a new file outlining the phased development plan for the gameplay companion transformation.
  • README.CN.md
    • Removed the Chinese README file.
  • README.JP.md
    • Removed the Japanese README file.
  • README.KR.md
    • Removed the Korean README file.
  • README.md
    • Rewrote the main README to describe the new gameplay companion features, architecture, setup, and key files.
  • characters/zh_米粒.yaml
    • Removed the Chinese character configuration file for '米粒'.
  • characters/zh_翻译腔.yaml
    • Removed the Chinese character configuration file for '翻译腔-神经大人'.
  • config_templates/conf.ZH.default.yaml
    • Removed the Chinese default configuration template.
  • config_templates/conf.default.yaml
    • Updated system settings, agent configurations to use wsl_claude_llm, removed various LLM providers, simplified ASR language settings, removed multiple TTS engine configurations, and set live_config to an empty object.
  • migrations/001_create_tables.sql
    • Added a new SQL migration file to create user_history, chat_logs, and long_term_profile tables for Supabase memory.
  • pyproject.toml
    • Added new dependencies including qdrant-client, python-dotenv, supabase, pyautogui, pillow, opencv-contrib-python-headless, and psycopg2-binary. Removed the bilibili optional dependency group.
  • requirements.txt
    • Updated the list of Python dependencies to include new packages required for Qdrant, Supabase, and vision loop functionalities, and removed dependencies for deprecated features.
  • run_server.py
    • Simplified the check_frontend_submodule function by removing language-specific messages and error handling.
  • scripts/run_bilibili_live.py
    • Removed the script for running BiliBili live integration.
  • src/open_llm_vtuber/agent/stateless_llm/wsl_claude.py
    • Added a new file implementing WSLClaudeLLM to run claude -p as a subprocess for LLM inference.
  • src/open_llm_vtuber/agent/stateless_llm_factory.py
    • Modified the LLMFactory to always return WSLClaudeLLM, effectively making it the sole LLM provider.
  • src/open_llm_vtuber/asr/asr_factory.py
    • Removed the fun_asr engine from the ASR factory and updated azure_asr to default to English only.
  • src/open_llm_vtuber/asr/azure_asr.py
    • Updated the default languages parameter to ['en-US'].
  • src/open_llm_vtuber/asr/fun_asr.py
    • Removed the FunASR implementation file.
  • src/open_llm_vtuber/config_manager/__init__.py
    • Updated imports to reflect the removal of BiliBiliLiveConfig, FunASRConfig, CosyvoiceTTSConfig, and the addition of WSLClaudeConfig.
  • src/open_llm_vtuber/config_manager/agent.py
    • Added wsl_claude_llm to the llm_provider literal for BasicMemoryAgentConfig and removed zhipu_llm and deepseek_llm.
  • src/open_llm_vtuber/config_manager/live.py
    • Simplified LiveConfig to an empty placeholder class, removing BiliBiliLiveConfig.
  • src/open_llm_vtuber/config_manager/stateless_llm.py
    • Removed ZhipuConfig and DeepseekConfig, and added WSLClaudeConfig for the WSL Claude LLM.
  • src/open_llm_vtuber/config_manager/tts.py
    • Removed CosyvoiceTTSConfig, Cosyvoice2TTSConfig, SiliconFlowTTSConfig, SparkTTSConfig, and MinimaxTTSConfig from TTS configurations.
  • src/open_llm_vtuber/conversations/single_conversation.py
    • Added logic to save advice to Supabase's user_history table after each bot response, integrating with the new memory manager and knowledge base.
  • src/open_llm_vtuber/integrations/__init__.py
    • Added a new __init__.py file to export MemoryManager.
  • src/open_llm_vtuber/integrations/memory_manager.py
    • Added a new file implementing MemoryManager for Supabase integration, handling user_history, chat_logs, and long_term_profile.
  • src/open_llm_vtuber/live/bilibili_live.py
    • Removed the BiliBili live platform integration file.
  • src/open_llm_vtuber/modules/__init__.py
    • Added a new __init__.py file to export KnowledgeBase and ScreenWatcher.
  • src/open_llm_vtuber/modules/knowledge_base.py
    • Added a new file implementing KnowledgeBase for Qdrant RAG, including Claude research fallback and collection management.
  • src/open_llm_vtuber/modules/vision_loop.py
    • Added a new file implementing ScreenWatcher for proactive screen monitoring, pHash comparison, and triggering AI commentary.
  • src/open_llm_vtuber/service_context.py
    • Added knowledge_base and memory_manager attributes, updated load_cache and load_from_config to initialize these, and modified construct_system_prompt to inject profile and game context, along with critical rules for AI responses.
  • src/open_llm_vtuber/tts/cosyvoice2_tts.py
    • Removed the Cosyvoice2 TTS implementation file.
  • src/open_llm_vtuber/tts/cosyvoice_tts.py
    • Removed the Cosyvoice TTS implementation file.
  • src/open_llm_vtuber/tts/minimax_tts.py
    • Removed the Minimax TTS implementation file.
  • src/open_llm_vtuber/tts/piper_tts.py
    • Updated the default model path for Piper TTS to an English model (en_US-lessac-medium.onnx).
  • src/open_llm_vtuber/tts/siliconflow_tts.py
    • Removed the SiliconFlow TTS implementation file.
  • src/open_llm_vtuber/tts/spark_tts.py
    • Removed the Spark TTS implementation file.
  • src/open_llm_vtuber/tts/tts_factory.py
    • Removed factories for cosyvoice_tts, cosyvoice2_tts, siliconflow_tts, minimax_tts, and spark_tts.
  • src/open_llm_vtuber/websocket_handler.py
    • Added screen_watchers dictionary, updated _store_client_data to start ScreenWatcher, handle_disconnect to stop it and save chat logs, and _handle_conversation_trigger to suppress proactive signals and inject game context. Also added _handle_inject_game_context for testing game context injection.
  • tests/phase2_live_test.py
    • Added a new test file for live testing of Qdrant RAG and WebSocket injection, covering standalone KB tests and WebSocket game context injection.
  • upgrade_codes/compare_yaml.py
    • Updated the comparison logic to use conf.default.yaml instead of conf.ZH.default.yaml and adjusted output messages.
  • upgrade_codes/upgrade_core/constants.py
    • Modified ZH_DEFAULT_CONF to point to EN_DEFAULT_CONF, indicating the removal of a separate Chinese default configuration.
  • upgrade_codes/upgrade_core/language.py
    • Simplified language detection to always return 'en', removing complex OS-specific language checks.
Ignored Files
  • Ignored by pattern: .github/workflows/** (2)
    • .github/workflows/create_release.yml
    • .github/workflows/update-requirements.yml
Activity
  • The pull request introduces a significant architectural shift, transforming the VTuber into a gameplay coaching companion.
  • Initial commits focused on stripping Chinese-specific components and integrating the WSL Claude LLM.
  • Subsequent commits added Qdrant RAG for game knowledge and the proactive vision loop.
  • The final commits integrated Claude's vision for game identification and Supabase for memory management.
  • Live tests were conducted for each phase, confirming end-to-end functionality for Claude vision, Qdrant ingest, Supabase memory, and bot responsiveness.
@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This is a substantial pull request that transforms the project into a self-learning gameplay coaching companion. It introduces several major new components, including a WSL-based Claude integration for local LLM inference, Qdrant for RAG, Supabase for long-term memory, and a proactive screen-watching loop for gameplay analysis. The PR also performs a significant cleanup by removing multi-language support and consolidating to a single LLM backend, which simplifies the architecture. The changes are well-documented with detailed planning and phase-report files. While the new features are impressive, I've identified a few high-severity issues related to hardcoded paths and blocking I/O that impact portability and performance. Addressing these will be crucial for making the new functionality robust and usable for the community.

from .stateless_llm_interface import StatelessLLMInterface

# Resolve claude CLI path at import time; fall back to known install location
_CLAUDE_CLI = shutil.which("claude") or "/home/maatru/.local/bin/claude"

high

The fallback path for the claude CLI is hardcoded to a user-specific directory (/home/maatru/.local/bin/claude). This makes the application not portable and will fail for any other user if claude is not in the system's PATH. This same issue is present in src/open_llm_vtuber/modules/knowledge_base.py on line 33.

To improve portability and avoid code duplication, this path should be configurable or determined in a single, shared location.

# Resolve claude CLI path at import time. Fails if not found.
_CLAUDE_CLI = shutil.which("claude")
if not _CLAUDE_CLI:
    raise FileNotFoundError(
        "`claude` CLI not found in PATH. Please install it or configure its path."
    )
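
One way to make the path "configurable or determined in a single, shared location", as the comment asks, is a lazy resolver with an environment override. This is a sketch, not the project's code: the function name `resolve_claude_cli` and the `CLAUDE_CLI_PATH` variable are hypothetical, and the override is trusted as given rather than checked for existence.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def resolve_claude_cli(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
    env_var: str = "CLAUDE_CLI_PATH",  # hypothetical override variable
) -> str:
    """Return the claude CLI path: explicit override first, then PATH lookup."""
    env = os.environ if env is None else env
    override = env.get(env_var)
    if override:
        return override
    found = which("claude")
    if found:
        return found
    raise FileNotFoundError(
        f"`claude` CLI not found in PATH. Install it or set {env_var}."
    )
```

Calling this from both wsl_claude.py and knowledge_base.py at first use, instead of at import time, would keep a missing CLI from breaking imports and removes the duplicated fallback path.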

if not url or not key:
    raise ValueError("SUPABASE_URL and SUPABASE_SECRET_KEY must be set in .env")

self._db: Client = create_client(url, key)

high

The MemoryManager uses supabase.create_client, which provides a synchronous client. Since this application is built on asyncio, all database operations performed by this manager (e.g., save_advice, save_chat_log) are blocking network calls that will freeze the event loop, severely impacting performance and responsiveness.

Suggested change
- self._db: Client = create_client(url, key)
+ self._db: AsyncClient = create_async_client(url, key)
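
An alternative that keeps the synchronous client is to push each blocking call onto a worker thread so the event loop stays responsive. The sketch below is an assumption-laden illustration, not the MemoryManager from the PR: `BlockingStore` is a hypothetical stand-in for the database client, `AsyncMemory` is a hypothetical wrapper, and `asyncio.to_thread` requires Python 3.9 or newer.

```python
import asyncio
import time
from typing import Dict, List


class BlockingStore:
    """Hypothetical stand-in for a synchronous database client."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, str]] = []

    def insert(self, row: Dict[str, str]) -> Dict[str, str]:
        time.sleep(0.05)  # simulates a blocking network round trip
        self.rows.append(row)
        return row


class AsyncMemory:
    """Wraps blocking calls so they run in a worker thread."""

    def __init__(self, store: BlockingStore) -> None:
        self._store = store

    async def save_advice(self, game: str, advice: str) -> Dict[str, str]:
        row = {"game": game, "advice": advice}
        # to_thread runs the blocking insert off the event loop thread.
        return await asyncio.to_thread(self._store.insert, row)
```

Whichever route is taken, callers need to `await` the save methods; switching to an async client has the same consequence for every call site.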

_PS_TIMEOUT = 10 # seconds for PowerShell call
_MIN_TRIGGER_INTERVAL = 60 # minimum seconds between proactive AI speaks

_POWERSHELL = "/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe"

high

The path to powershell.exe is hardcoded to /mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe. This makes the screen capture functionality strictly dependent on a specific WSL2 environment and will not work on native Windows, macOS, or other Linux distributions. The pyautogui.screenshot() function is cross-platform, so this PowerShell-based implementation seems to be a workaround for WSL.

Suggested change
- _POWERSHELL = "/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe"
+ # Path to PowerShell, should be configurable or detected.
+ _POWERSHELL = os.getenv("POWERSHELL_PATH", "/mnt/c/Windows/System32/WindowsPowerShell/v1.0/powershell.exe")
