Vinu is an AI song generator that blends generative audio, lyric intelligence, and automated deployment.
- AI-assisted song authoring that crafts prompts, lyrics, and cover art from a single description.
- Flexible generation modes: instrumentals and custom lyrics.
- S3 delivery for both audio and artwork.
- Modern UI interactions (motion design, dark mode, responsive layouts) paired with automated Vercel-ready builds.
- Generative Music Stack: Uses the ACE-Step pipeline running on GPUs to compose full-length tracks, with configurable duration, guidance scale, and seeding for repeatability.
- Lyric Intelligence: Qwen 2 7B Instruct powers both prompt shaping and lyric writing, giving coherent verses and choruses based on user intent.
- Cover Art Rendering: SDXL Turbo generates album visuals that are automatically uploaded alongside audio in the same workflow.
- Smart Categorization: Lightweight LLM prompts classify songs into mood, genre, and tempo buckets to assist discovery views.
- Event-Driven Orchestration: Inngest drives queued song generations so the frontend can submit work, poll status, and react without blocking users.
- Content Delivery: Boto3 utilities push generated assets to S3, returning signed metadata the frontend consumes instantly.
- Frontend (Next.js 15 + React 19): Modular components, Zustand state management, and motion-powered transitions. Prisma with PostgreSQL (Neon) stores user data, while AWS S3 stores generated media. The Inngest client triggers background jobs directly from the UI.
- Backend (Modal): Modal handles the container build (Debian slim + ACE-Step + Hugging Face assets), GPU scheduling, and secret management. FastAPI endpoints are defined per use case.
- Services Layer: Reusable helpers in `backend/services.py` encapsulate S3 interactions and temp-storage hygiene, keeping the Modal class focused on inference.
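
To make the services-layer idea concrete, here is a minimal sketch of a helper that uploads a rendered file and then removes the local copy. It is not the project's actual code: the function name `upload_and_cleanup`, the `songs/` key prefix, and the random key layout are all assumptions made for illustration. The only library call relied on is the boto3 S3 client method `upload_file(Filename, Bucket, Key)`.

```python
import os
import uuid


def upload_and_cleanup(s3_client, local_path, bucket, prefix="songs"):
    """Upload a local file to S3, delete the local copy, and return the S3 key.

    `s3_client` is expected to behave like a boto3 S3 client. The key layout
    (<prefix>/<random id><extension>) is a hypothetical choice for this sketch.
    """
    extension = os.path.splitext(local_path)[1]
    key = f"{prefix}/{uuid.uuid4().hex}{extension}"
    try:
        # boto3 S3 clients expose upload_file(Filename, Bucket, Key)
        s3_client.upload_file(local_path, bucket, key)
    finally:
        # Temp-storage hygiene: remove the file whether or not the upload succeeded
        if os.path.exists(local_path):
            os.remove(local_path)
    return key
```

In real use the client would come from `boto3.client("s3")`. Passing the client in as an argument keeps the helper easy to test with a stub.
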
- Async Generation Pipeline: The frontend writes an Inngest event (`generate-song-event`) that kicks off Modal inference. While audio renders, the UI surfaces progress indicators fed by Inngest webhooks, improving perceived responsiveness.
- Stateless API Nodes: Modal FastAPI handlers accept Pydantic payloads, perform inference, and reply with S3 keys. Stateless design lets Modal scale horizontally as requests spike.
- Observability Hooks: Modal run logs, Inngest step logs, and structured responses provide traceability from user action to GPU job completion.
- Security & Cost Controls: Secrets (S3, Modal) remain vault-managed, while model volumes share cached weights to avoid repeated downloads, keeping spend predictable.
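
The stateless handler pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation. It uses a standard-library dataclass in place of the Pydantic model the backend actually uses, so the example runs without extra dependencies. The field names, default values, and the `infer` and `upload` callables are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class GenerateRequest:
    # Hypothetical fields mirroring the knobs named above (duration, guidance, seed)
    prompt: str
    duration_seconds: float = 60.0
    guidance_scale: float = 15.0
    seed: Optional[int] = None
    instrumental: bool = False

    def __post_init__(self):
        # Reject obviously invalid payloads before any GPU time is spent
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
        if self.guidance_scale <= 0:
            raise ValueError("guidance_scale must be positive")


def handle_generate(
    request: GenerateRequest,
    infer: Callable[[GenerateRequest, int], Tuple[str, str]],
    upload: Callable[[str], str],
) -> Dict[str, object]:
    """Run one generation and return S3 keys; nothing is kept between calls."""
    # Resolve the seed up front so the response records what was actually used
    seed = request.seed if request.seed is not None else random.randrange(2**32)
    audio_path, cover_path = infer(request, seed)
    return {
        "audio_key": upload(audio_path),
        "cover_key": upload(cover_path),
        "seed": seed,
    }
```

Because the handler takes everything it needs from the request and its injected callables, any replica can serve any request. Returning the resolved seed lets a caller reproduce a track later by sending it back.
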
- Frontend: `cd frontend && npm install && npm run dev`
- Modal backend (from `backend/`): `modal run -m main`
- Configure environment: S3 credentials, Modal secrets, and database connection strings are read from environment variables.
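
As a rough illustration of reading configuration from environment variables, the sketch below fails fast when something is missing. The variable names are assumptions and are not taken from the project. `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the names boto3 conventionally reads, while `S3_BUCKET_NAME` and `DATABASE_URL` are placeholders that may differ in the actual setup.

```python
import os

# Assumed variable names; adjust to match the real deployment
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "DATABASE_URL",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` once at startup surfaces a misconfigured environment immediately, rather than partway through a generation job.
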
Because GPU inference costs money, I have not deployed the app publicly, even though each user is limited to 5 credits. Please contact me if you would like to see a demo.




