A realistic AI companion with full 3D animations, voice interaction, and emotional intelligence, powered by the Google Gemini API.
- Fully Animated 3D Avatar - Realistic female character with smooth animations
- Voice Interaction - Speak and listen using Web Speech API
- Intelligent Conversations - Powered by Google Gemini AI with girlfriend personality
- Dynamic Emotions - Facial expressions matching conversation sentiment
- Natural Gestures - Wave, nod, jump, dance, and more based on context
- Warm Personality - Caring, supportive, and emotionally intelligent companion
- Conversation Memory - Remembers context throughout your chat
- Beautiful UI - Modern, gradient-based design with smooth animations
- Node.js 16+ and npm
- A Google Gemini API key (free at Google AI Studio)
- A modern web browser with Web Speech API support (Chrome or Edge recommended)
- Clone or download this repository
    git clone <repository-url>
    cd voice-assistant
- Install dependencies
    npm install
- Start the development server
    npm run dev
- Open your browser
Navigate to http://localhost:3000
- Configure your API key
On first launch, you'll see a configuration screen:
- Enter your Gemini API key
- (Optional) Enter a custom Ready Player Me avatar URL
- Click "Start Chatting"
- Go to Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the key and paste it in the configuration screen
Note: The Gemini API offers a free tier that is well suited to personal projects.
- Visit Ready Player Me
- Create your custom avatar
- Click "Download" and choose ".glb" format
- You'll get a URL like:
    https://models.readyplayer.me/[ID].glb
- Paste this URL in the configuration screen
Place your .glb file in the public folder and reference it as /your-avatar.glb
- Type your message in the chat input at the bottom right
- Press Enter or click the send button
- Aria will respond with text, emotions, and gestures
- Click the microphone button at the bottom center
- Speak your message (you'll see a "Listening..." indicator)
- Your speech will be transcribed and sent automatically
- Aria will respond with voice and animations
Aria automatically displays emotions and gestures based on conversation:
Emotions:
- Happy - Joyful, positive responses
- Sad - Empathetic, comforting responses
- Excited - Enthusiastic reactions
- Thoughtful - Considering responses
- Loving - Affectionate expressions
- Neutral - Calm, relaxed state
Gestures:
- Wave - Greetings
- Nod - Agreement, affirmation
- Shake - Disagreement, negation
- Jump - Excitement, celebration
- Dance - Joy, celebration
- Think - Pondering, reflecting
- Heart - Love, affection
voice-assistant/
├── src/
│   ├── components/
│   │   ├── Avatar.tsx              # 3D avatar component with animations
│   │   ├── Scene.tsx               # Three.js scene setup
│   │   ├── ChatInterface.tsx       # Chat UI component
│   │   ├── VoiceControls.tsx       # Voice input/output controls
│   │   └── ConfigPanel.tsx         # API key configuration
│   ├── services/
│   │   ├── speechService.ts        # Web Speech API wrapper
│   │   ├── lipSyncService.ts       # Lip-sync engine
│   │   └── animationController.ts  # Animation state management
│   ├── store/
│   │   └── conversationStore.ts    # Zustand store with Gemini integration
│   ├── App.tsx                     # Main app component
│   ├── main.tsx                    # Entry point
│   └── index.css                   # Global styles
├── public/                         # Static assets
├── package.json                    # Dependencies
└── README.md                       # This file
- Model: gemini-pro
- Personality: A custom system prompt creates a warm, caring companion
- Memory: Conversation history maintained in Zustand store
- Emotion Detection: Responses tagged with [EMOTION] and [GESTURE] markers
- Context: Full conversation history sent with each request
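The marker-based emotion detection listed above could work roughly as follows. This is a minimal sketch, not the project's actual parser: the tag syntax (`[EMOTION:happy]`, `[GESTURE:wave]`), the `parseReply` name, and the fallback values are assumptions made for illustration. The emotion and gesture names come from the lists earlier in this README.

```typescript
// Hypothetical sketch: the real marker syntax in conversationStore.ts may differ.
type Emotion = "happy" | "sad" | "excited" | "thoughtful" | "loving" | "neutral";
type Gesture = "wave" | "nod" | "shake" | "jump" | "dance" | "think" | "heart";

const EMOTIONS: readonly string[] = ["happy", "sad", "excited", "thoughtful", "loving", "neutral"];
const GESTURES: readonly string[] = ["wave", "nod", "shake", "jump", "dance", "think", "heart"];

interface ParsedReply {
  text: string;            // reply with all markers stripped
  emotion: Emotion;        // falls back to "neutral" (assumed default)
  gesture: Gesture | null; // null when no known gesture marker is present
}

function parseReply(raw: string): ParsedReply {
  const result: ParsedReply = { text: "", emotion: "neutral", gesture: null };

  // Remove every [EMOTION:x] / [GESTURE:x] marker, recording known values.
  const stripped = raw.replace(
    /\[(EMOTION|GESTURE):\s*([a-z]+)\]/gi,
    (_match: string, kind: string, value: string): string => {
      const v = value.toLowerCase();
      if (kind.toUpperCase() === "EMOTION" && EMOTIONS.indexOf(v) !== -1) {
        result.emotion = v as Emotion;
      } else if (kind.toUpperCase() === "GESTURE" && GESTURES.indexOf(v) !== -1) {
        result.gesture = v as Gesture;
      }
      return "";
    }
  );

  // Collapse the whitespace left behind by removed markers.
  result.text = stripped.replace(/\s+/g, " ").trim();
  return result;
}
```

Unknown marker values are dropped from the text but leave the defaults in place, so a malformed model reply still renders as plain speech with a neutral face.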
- Input: Web Speech API (browser-based, free)
- Output: Web Speech Synthesis API (browser-based, free)
- Voice: Automatically selects female voice when available
- Lip Sync: Amplitude-based mouth movement from audio analysis
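Voice selection and amplitude-based lip sync can be sketched as two small pure functions. Everything here is illustrative: `pickVoice`, `rms`, `nextMouthOpen`, the name hints, and the `gain` and `smoothing` defaults are assumptions, not the project's code. Browsers do not report a voice's gender, so the sketch guesses from voice names. How the audio samples are obtained is left out.

```typescript
// Structural stand-in for SpeechSynthesisVoice so the sketch runs outside a browser.
interface VoiceLike {
  name: string;
  lang: string;
}

// Assumed heuristic: match substrings of commonly shipped female voice names.
const FEMALE_HINTS = ["female", "samantha", "zira"];

function pickVoice<T extends VoiceLike>(voices: T[], langPrefix: string = "en"): T | undefined {
  const inLang = voices.filter((v) => v.lang.toLowerCase().indexOf(langPrefix) === 0);
  const preferred = inLang.filter((v) =>
    FEMALE_HINTS.some((hint) => v.name.toLowerCase().indexOf(hint) !== -1)
  );
  // Fall back to any voice in the language, then to any voice at all.
  return preferred[0] || inLang[0] || voices[0];
}

// Root-mean-square amplitude of one audio frame with samples in [-1, 1].
function rms(samples: ArrayLike<number>): number {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}

// Map amplitude to a 0..1 mouth-open value, easing toward the target to avoid jitter.
function nextMouthOpen(
  previous: number,
  samples: ArrayLike<number>,
  gain: number = 4,
  smoothing: number = 0.5
): number {
  const target = Math.min(1, rms(samples) * gain);
  return previous + (target - previous) * smoothing;
}
```

In a browser, the list passed to `pickVoice` would come from `speechSynthesis.getVoices()`, and the value returned by `nextMouthOpen` would drive a mouth morph target on the avatar once per frame.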
- Engine: Three.js with React Three Fiber
- Avatar: Ready Player Me .glb models
- Animations: Procedural animations for gestures and idle
- Lighting: Ambient + directional + point lights with shadows
- Environment: HDR environment mapping for realistic lighting
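Procedural animation here means computing poses from time rather than playing baked clips. The sketch below shows one way an idle sway and a nod gesture might be expressed as pure functions of time. The `HeadPose` shape, the function names, and all amplitudes and frequencies are assumptions chosen for illustration, not values from the project.

```typescript
// Hypothetical pose description: rotations in radians, offset in scene units.
interface HeadPose {
  pitch: number;   // nodding axis
  yaw: number;     // side-to-side turn
  offsetY: number; // vertical bob, e.g. from breathing
}

// Idle: slow head sway plus a small breathing bob, driven by elapsed seconds.
function idlePose(t: number): HeadPose {
  return {
    pitch: 0,
    yaw: 0.05 * Math.sin(t * 0.5),
    offsetY: 0.01 * Math.sin(t * 1.5),
  };
}

// Nod: two pitch cycles whose amplitude fades to zero by the end of the gesture.
function nodPose(elapsed: number, duration: number = 1.0): HeadPose {
  const progress = Math.min(Math.max(elapsed / duration, 0), 1);
  const envelope = 1 - progress; // linear fade-out
  return {
    pitch: 0.3 * envelope * Math.sin(progress * Math.PI * 4),
    yaw: 0,
    offsetY: 0,
  };
}
```

With React Three Fiber, functions like these would typically be called inside a `useFrame` callback using the clock's elapsed time, and the result applied to the avatar's head bone rotation.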
- Store: Zustand for global state
- Real-time Updates: React hooks for reactive UI
- Persistence: LocalStorage for API key and avatar URL
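The LocalStorage persistence mentioned above might look like the following. This is a sketch under stated assumptions: the storage key `voice-assistant-config`, the `AppConfig` shape, and the function names are hypothetical, and the project may store the API key and avatar URL under separate keys instead.

```typescript
// Subset of the browser Storage interface, so the sketch also works with an in-memory stub.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface AppConfig {
  apiKey: string;
  avatarUrl: string | null; // null means "use the default avatar"
}

// Hypothetical storage key; the project may use a different name.
const CONFIG_KEY = "voice-assistant-config";

function saveConfig(store: KeyValueStore, config: AppConfig): void {
  store.setItem(CONFIG_KEY, JSON.stringify(config));
}

function loadConfig(store: KeyValueStore): AppConfig | null {
  const raw = store.getItem(CONFIG_KEY);
  if (raw === null) return null; // first launch: nothing saved yet
  try {
    const parsed = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null) return null;
    if (typeof parsed.apiKey !== "string" || parsed.apiKey === "") return null;
    return {
      apiKey: parsed.apiKey,
      avatarUrl: typeof parsed.avatarUrl === "string" ? parsed.avatarUrl : null,
    };
  } catch {
    return null; // corrupted entry: treat as not configured
  }
}
```

In the browser, `window.localStorage` satisfies `KeyValueStore`, so `loadConfig(window.localStorage)` returning `null` would be the signal to show the configuration screen.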
Edit src/store/conversationStore.ts to adjust:
    generationConfig: {
      temperature: 0.9,       // Creativity (0-1)
      topK: 40,               // Token sampling
      topP: 0.95,             // Nucleus sampling
      maxOutputTokens: 1024,  // Response length
    }
A default avatar is provided, but you can:
- Use Ready Player Me for custom avatars
- Create your own .glb models with Blender
- Use VRoid Studio for anime-style characters
Edit the SYSTEM_PROMPT in src/store/conversationStore.ts to change:
- Personality traits
- Communication style
- Relationship dynamic
- Response format
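To show where the system prompt, the conversation history, and the generation settings meet, here is a sketch of assembling a request body. The body shape (`contents`, `role`, `parts`, `generationConfig`) follows the Gemini REST `generateContent` format. The rest is assumed: `ChatTurn`, `buildRequestBody`, the placeholder prompt text, and the choice to inject the prompt as an opening user turn followed by a short model acknowledgement. The project's store may build its requests differently, for example through an SDK.

```typescript
interface ChatTurn {
  role: "user" | "model";
  text: string;
}

interface GenerationConfig {
  temperature: number;
  topK: number;
  topP: number;
  maxOutputTokens: number;
}

// Values copied from the configuration shown in this README.
const generationConfig: GenerationConfig = {
  temperature: 0.9,
  topK: 40,
  topP: 0.95,
  maxOutputTokens: 1024,
};

// Placeholder only; this is not the project's actual SYSTEM_PROMPT.
const SYSTEM_PROMPT = "You are Aria, a warm, caring, and supportive companion.";

function buildRequestBody(systemPrompt: string, history: ChatTurn[], userMessage: string) {
  // Assumed approach: send the prompt as the first exchange so roles keep alternating.
  const opening: ChatTurn[] = [
    { role: "user", text: systemPrompt },
    { role: "model", text: "Understood." },
  ];
  const latest: ChatTurn = { role: "user", text: userMessage };
  const turns = opening.concat(history, [latest]);

  return {
    contents: turns.map((turn) => ({ role: turn.role, parts: [{ text: turn.text }] })),
    generationConfig,
  };
}
```

Because the full history is resent on every request, long chats grow the payload steadily. Trimming `history` to the most recent turns before calling `buildRequestBody` is one way to bound it.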
- Use Chrome or Edge browser (best support)
- Ensure you're using HTTPS or localhost
- Check browser permissions for microphone
- Check internet connection (Ready Player Me avatars load from CDN)
- Verify avatar URL is a valid .glb file
- Try the default avatar first
- Verify your Gemini API key is correct
- Check API quota at Google AI Studio
- Ensure you have internet connection
- Check browser console for detailed errors
- Check browser audio permissions
- Ensure speakers/headphones are connected
- Try adjusting system volume
- Verify Web Speech API is supported (check console)
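The console check for Web Speech API support could be done with a small feature-detection helper like the sketch below. The `detectSpeechSupport` name and its return shape are illustrative assumptions. The property names it looks for (`SpeechRecognition`, the prefixed `webkitSpeechRecognition`, and `speechSynthesis`) are the standard browser globals.

```typescript
interface SpeechSupport {
  recognition: boolean; // speech-to-text (voice input)
  synthesis: boolean;   // text-to-speech (voice output)
}

// Accepts any global-like object so the check can also run outside a browser.
function detectSpeechSupport(globalObj: object): SpeechSupport {
  return {
    // Chromium-based browsers expose recognition under the webkit prefix.
    recognition: "SpeechRecognition" in globalObj || "webkitSpeechRecognition" in globalObj,
    synthesis: "speechSynthesis" in globalObj,
  };
}
```

In the browser, logging `detectSpeechSupport(window)` shows at a glance whether voice input, voice output, or both are unavailable.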
    npm run build
This creates an optimized build in the dist folder.
- Push code to GitHub
- Import repository in Netlify or Vercel
- Build command: npm run build
- Publish directory: dist
- Deploy!
Note: The API key is stored in the browser's localStorage and is not included in the build output.
Potential improvements you can add:
- Google Cloud Text-to-Speech for higher quality voices
- Advanced lip-sync using phoneme analysis
- More complex animations (walking, sitting, etc.)
- Background environment options
- Voice customization (pitch, rate, volume)
- Conversation history export
- Multiple personality presets
- Mobile app version (React Native)
- VR support
- Google Gemini - Conversational AI
- Ready Player Me - Avatar system
- Three.js - 3D rendering
- React Three Fiber - React + Three.js integration
- Zustand - State management
MIT License - Feel free to use this for personal or commercial projects!
Contributions welcome! Feel free to:
- Report bugs
- Suggest features
- Submit pull requests
- Share your customizations
- Use headphones to prevent echo during voice conversations
- Speak clearly for better speech recognition
- Good lighting if using camera-based features
- Stable internet for smooth Gemini API responses
- Chrome browser for best compatibility
Made with ❤️ using Google Gemini AI
Enjoy your AI companion!