gowrishkar/GowrishXGemini_Bot

GowrishXGemini_Bot

A voice-enabled Tanglish AI agent for Telegram, powered by Google Gemini 3.1 models and designed for serverless deployment on Google Cloud Run.

🌟 Features

  • Two-Model Architecture:
    • Uses gemini-3.1-flash-lite-preview for high-speed reasoning, text generation, and multi-turn state management.
    • Uses gemini-3.1-flash-tts-preview for natural, high-quality voice note generation.
  • Native Voice Understanding: Send voice notes directly to the bot. The model processes the audio input itself, so no separate speech-to-text transcription step is needed.
  • Real-Time Web Browsing: Integrated with Google Search Grounding. If you ask about the weather, news, or current events, it will search the live internet before responding.
  • Tanglish Persona: Ingests soul.md and skills.md at startup to adopt a dedicated "Tanglish" speaking AI identity.
  • Response Modes: Use the /mode command to toggle between text-only, voice-only, or both. You can also customize the AI's voice actor.
  • Optimized for Cloud Run: Uses a custom FastAPI webhook server designed to avoid stalls from Cloud Run CPU throttling, so the bot stays responsive without "Always On" CPU allocation. An instance that has scaled to zero still incurs a cold start on its first request.
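The response modes described above could be tracked per chat along these lines. This is a minimal sketch, not the bot's actual implementation: the names `ResponseMode`, `set_mode`, `get_mode`, `wants_text`, and `wants_voice`, the in-memory dictionary, and the default of replying with both text and voice are all assumptions made for illustration.

```python
from enum import Enum


class ResponseMode(Enum):
    TEXT = "text"
    VOICE = "voice"
    BOTH = "both"


# Hypothetical per-chat store; the real bot may keep this state elsewhere.
_modes: dict[int, ResponseMode] = {}


def set_mode(chat_id: int, arg: str) -> ResponseMode:
    """Parse the argument of `/mode <arg>` and remember it for this chat."""
    # Enum lookup by value raises ValueError for an unknown mode name.
    mode = ResponseMode(arg.strip().lower())
    _modes[chat_id] = mode
    return mode


def get_mode(chat_id: int) -> ResponseMode:
    # Assumed default: reply with both text and a voice note.
    return _modes.get(chat_id, ResponseMode.BOTH)


def wants_text(chat_id: int) -> bool:
    return get_mode(chat_id) in (ResponseMode.TEXT, ResponseMode.BOTH)


def wants_voice(chat_id: int) -> bool:
    return get_mode(chat_id) in (ResponseMode.VOICE, ResponseMode.BOTH)
```

A handler would then call `wants_text(chat_id)` and `wants_voice(chat_id)` before sending each kind of reply. Because the dictionary lives in process memory, it resets whenever the container restarts.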

🛠️ Technology Stack

  • Python 3.11+
  • python-telegram-bot: Core Telegram API framework.
  • google-genai: The latest official Google Generative AI SDK.
  • FastAPI & Uvicorn: Custom ASGI webhook server to keep the event loop alive on serverless platforms.
  • pydub & ffmpeg: Raw PCM to OGG Opus audio conversion for native Telegram Voice Notes.
  • Google Cloud Run: Serverless Docker deployment.
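The PCM to OGG Opus step can be illustrated with the standard library alone. The sketch below is not the project's pydub-based code: it wraps raw PCM in a WAV container with the `wave` module and builds an ffmpeg command line for the Opus encode. The function names are hypothetical, and the defaults of 24 kHz, mono, 16-bit samples are an assumption about the TTS output format that should be checked against the actual response.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 24000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap headerless PCM samples in a WAV container so ffmpeg can read them."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def ffmpeg_opus_command(wav_path: str, ogg_path: str) -> list[str]:
    """Build the argument list for encoding a WAV file to OGG Opus."""
    # -y overwrites the output file, -c:a libopus selects the Opus encoder.
    return ["ffmpeg", "-y", "-i", wav_path, "-c:a", "libopus", ogg_path]
```

Writing the WAV bytes to a temporary file and passing the command to `subprocess.run(..., check=True)` would produce a file that Telegram accepts as a voice note, provided ffmpeg was built with libopus.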

🚀 Local Development

  1. Clone the repository and navigate to the directory.
  2. Install dependencies:
    pip install -r requirements.txt
  3. Set up Environment Variables: Create a .env file in the root directory and add your keys:
    TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
    GEMINI_API_KEY=your_gemini_api_key_here
    # Leave WEBHOOK_URL empty for local development
    WEBHOOK_URL=
    PORT=8080
  4. Run the bot:
    python bot.py
    Note: Ensure ffmpeg is installed on your local machine for audio conversion to work properly.
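The environment variables above could be read and validated as follows. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical names, and treating an empty `WEBHOOK_URL` as "run without a webhook" is inferred from the instruction to leave it empty locally, not confirmed from bot.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    telegram_token: str
    gemini_key: str
    webhook_url: str
    port: int

    @property
    def use_webhook(self) -> bool:
        # Assumption: an empty WEBHOOK_URL means local (non-webhook) mode.
        return bool(self.webhook_url)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the bot's configuration and fail early if a required key is missing."""
    required = ("TELEGRAM_BOT_TOKEN", "GEMINI_API_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        telegram_token=env["TELEGRAM_BOT_TOKEN"],
        gemini_key=env["GEMINI_API_KEY"],
        webhook_url=env.get("WEBHOOK_URL", "").strip(),
        port=int(env.get("PORT", "8080")),
    )
```

Failing at startup with a clear message is usually easier to debug than an authentication error on the first incoming message.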

☁️ Google Cloud Run Deployment

The project includes a Dockerfile, so it can be deployed to Google Cloud Run directly from source.

  1. Install the Google Cloud CLI (gcloud).
  2. Authenticate and initialize:
    gcloud auth login
  3. Deploy to Cloud Run (initial deployment using a temporary dummy webhook):
    gcloud run deploy gowrish-bot \
      --source . \
      --region us-central1 \
      --allow-unauthenticated \
      --set-env-vars TELEGRAM_BOT_TOKEN="YOUR_TELEGRAM_TOKEN",GEMINI_API_KEY="YOUR_GEMINI_KEY"
  4. Link the Webhook: Once the deployment finishes, copy the generated Service URL and set it as the WEBHOOK_URL environment variable. This deploys a new revision that registers the webhook with Telegram:
    gcloud run services update gowrish-bot --region us-central1 --update-env-vars WEBHOOK_URL="https://YOUR_NEW_SERVICE_URL.run.app"
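After linking, the registration can be checked with the Telegram Bot API's getWebhookInfo method, which reports the currently registered webhook URL. The helpers below are an illustrative sketch with hypothetical names. They only build the request URL and interpret the JSON response; the prefix comparison assumes the bot registers the service URL, possibly with a path appended.

```python
import json


def webhook_info_url(token: str) -> str:
    """URL of the Bot API method that reports the registered webhook."""
    return f"https://api.telegram.org/bot{token}/getWebhookInfo"


def is_webhook_linked(response_body: str, service_url: str) -> bool:
    """Return True if the getWebhookInfo response points at the given service."""
    payload = json.loads(response_body)
    if not payload.get("ok"):
        return False
    # Telegram returns an empty "url" when no webhook is set.
    registered = payload.get("result", {}).get("url", "")
    return bool(registered) and registered.startswith(service_url.rstrip("/"))
```

Fetching `webhook_info_url(token)` with any HTTP client and passing the body to `is_webhook_linked` confirms whether Telegram is sending updates to the Cloud Run service.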

📂 Project Structure

  • bot.py: Application entry point and FastAPI webhook server.
  • handlers.py: Telegram event handlers (commands, text, voice callbacks). Uses asyncio.to_thread for non-blocking API calls.
  • gemini_handler.py: Wrapper for the Google GenAI SDK, handling chat sessions, TTS generation, and search grounding.
  • audio_utils.py: Converts raw PCM audio output from Gemini TTS into OGG Opus format for Telegram voice notes.
  • soul.md & skills.md: Markdown documents defining the bot's core identity and capabilities.
  • Dockerfile: Production container specification.
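The `asyncio.to_thread` pattern mentioned for handlers.py can be sketched as follows. `blocking_generate` is a hypothetical stand-in for a synchronous SDK call, and `handle_message` and `handle_many` are illustrative names, not the project's actual functions.

```python
import asyncio
import time


def blocking_generate(prompt: str) -> str:
    """Stand-in for a synchronous, slow API call (hypothetical)."""
    time.sleep(0.1)
    return f"reply to: {prompt}"


async def handle_message(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free
    # to accept other updates while this one waits.
    return await asyncio.to_thread(blocking_generate, prompt)


async def handle_many(prompts: list[str]) -> list[str]:
    # The calls overlap because each one waits in its own thread.
    return await asyncio.gather(*(handle_message(p) for p in prompts))


if __name__ == "__main__":
    print(asyncio.run(handle_many(["vanakkam", "weather today?"])))
```

Calling `blocking_generate` directly inside an async handler would instead freeze the event loop for the duration of each call.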

📝 Notes

Because Google Cloud Run scales to zero, the in-memory chat history (Interactions API) is lost whenever the container instance is shut down or replaced. For persistent long-term memory, an external store (such as Firebase or Redis) would need to be integrated into gemini_handler.py.
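One possible shape for such an integration is a small history store that serializes each chat's turns to a key-value backend. This is a sketch under assumptions, not part of the project: `HistoryStore`, its methods, the key format, and the turn limit are all hypothetical, and a plain dictionary stands in for the external store so the example runs without any service.

```python
import json
from typing import MutableMapping


class HistoryStore:
    """Keeps per-chat turns as JSON strings in a key-value backend."""

    def __init__(self, backend: MutableMapping[str, str], max_turns: int = 20):
        # In production the backend would wrap an external store;
        # a dict is used here purely as a stand-in.
        self._backend = backend
        self._max_turns = max_turns

    @staticmethod
    def _key(chat_id: int) -> str:
        return f"history:{chat_id}"

    def load(self, chat_id: int) -> list[dict[str, str]]:
        raw = self._backend.get(self._key(chat_id))
        return json.loads(raw) if raw else []

    def append(self, chat_id: int, role: str, text: str) -> None:
        turns = self.load(chat_id)
        turns.append({"role": role, "text": text})
        # Keep only the most recent turns to bound storage and prompt size.
        self._backend[self._key(chat_id)] = json.dumps(turns[-self._max_turns:])
```

Since all state lives in the backend, a freshly started container that builds a new `HistoryStore` over the same backend sees the earlier conversation.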
