Dd1235/Euphonia

Vinu

Vinu is an AI song generator that blends generative audio, lyric intelligence, and automated deployment.

Product Highlights

  • AI-assisted song authoring that crafts prompts, lyrics, and cover art from a single description.
  • Flexible generation modes: instrumentals and custom lyrics.
  • S3 delivery for both audio and artwork.
  • Modern UI interactions (motion design, dark mode, responsive layouts) paired with automated Vercel-ready builds.

Screenshots

  • Landing
  • Create Session
  • Discover Library
  • Backend Processing Snapshot
  • ACE Step Pipeline

Feature Deep Dive

  • Generative Music Stack: Uses the ACE-Step pipeline running on GPUs to compose full-length tracks, with configurable duration, guidance scale, and seeding for repeatability.
  • Lyric Intelligence: Qwen 2 7B Instruct powers both prompt shaping and lyric writing, giving coherent verses and choruses based on user intent.
  • Cover Art Rendering: SDXL Turbo generates album visuals that are automatically uploaded alongside audio in the same workflow.
  • Smart Categorization: Lightweight LLM prompts classify songs into mood, genre, and tempo buckets to assist discovery views.
  • Event-Driven Orchestration: Inngest drives queued song generations so the frontend can submit work, poll status, and react without blocking users.
  • Content Delivery: Boto3 utilities push generated assets to S3, returning signed metadata the frontend consumes instantly.
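The configurable duration, guidance scale, and seed mentioned above could be modeled roughly as follows. This is a minimal sketch, not the project's actual request schema: the class name `GenerationParams`, the field names, the default values, and the helper `resolve_seed` are all assumptions made for illustration, and it uses a plain dataclass rather than whatever validation layer the backend really uses.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenerationParams:
    """Hypothetical request parameters; names and defaults are assumptions."""

    prompt: str
    audio_duration: float = 180.0   # seconds (illustrative default)
    guidance_scale: float = 15.0    # illustrative default
    seed: Optional[int] = None      # None means "pick one for me"
    instrumental: bool = False

    def __post_init__(self) -> None:
        # Reject obviously invalid input before any GPU time is spent.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.audio_duration <= 0:
            raise ValueError("audio_duration must be positive")
        if self.guidance_scale <= 0:
            raise ValueError("guidance_scale must be positive")


def resolve_seed(params: GenerationParams) -> int:
    """Return the caller's seed, or draw one so it can be stored and reused."""
    if params.seed is not None:
        return params.seed
    return random.SystemRandom().randrange(2**32)
```

Storing the resolved seed next to the generated track is one way to make a generation repeatable later: resubmitting the same parameters with that seed should ask the model for the same result.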

Architecture Overview

  • Frontend (Next.js 15 + React 19): Modular components, Zustand state management, and motion-powered transitions. Prisma + PostgreSQL (Neon) hold user data, while AWS S3 stores generated media. The Inngest client triggers background jobs directly from the UI.
  • Backend (Modal): Modal handles the container build (Debian slim + ACE-Step + Hugging Face assets), GPU scheduling, and secret management. FastAPI endpoints are defined per use case.
  • Services Layer: Reusable helpers in backend/services.py encapsulate S3 interactions and temp-storage hygiene, keeping the Modal class focused on inference.
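The "temp-storage hygiene" idea in the services layer can be sketched as a context manager that always deletes the scratch file, whether or not the upload succeeds. This is an illustration under stated assumptions, not the contents of backend/services.py: the names `temp_output_path` and `upload_and_cleanup` are hypothetical, and the uploader is passed in as a plain callable so the example runs without AWS credentials.

```python
import os
import tempfile
import uuid
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temp_output_path(suffix: str) -> Iterator[str]:
    """Yield a scratch file path and remove the file afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the path is needed; the renderer reopens the file
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


def upload_and_cleanup(
    render: Callable[[str], None],
    upload: Callable[[str, str], None],
    suffix: str = ".wav",
) -> str:
    """Render to a temp file, upload it under a fresh key, return the key."""
    key = f"{uuid.uuid4()}{suffix}"
    with temp_output_path(suffix) as path:
        render(path)       # e.g. the model writes audio to `path`
        upload(path, key)  # e.g. wraps boto3's s3_client.upload_file
    return key
```

With Boto3, the `upload` argument could be something like `lambda path, key: s3_client.upload_file(path, bucket_name, key)`, where `s3_client` and `bucket_name` are supplied by the caller.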

System Design Choices

  • Async Generation Pipeline: The frontend writes an Inngest event (generate-song-event) that kicks off Modal inference. While audio renders, the UI surfaces progress indicators fed by Inngest webhooks, improving perceived responsiveness.
  • Stateless API Nodes: Modal FastAPI handlers accept Pydantic payloads, perform inference, and reply with S3 keys. Because no handler keeps per-request state, Modal can scale horizontally as requests spike.
  • Observability Hooks: Modal run logs, Inngest step logs, and structured responses provide traceability from user action to GPU job completion.
  • Security & Cost Controls: Secrets (S3, Modal) remain vault-managed, while model volumes share cached weights to avoid repeated downloads, keeping spend predictable.
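One way to picture the asynchronous pipeline is as a small status lifecycle that the background job advances and the UI polls. The sketch below is an assumption for illustration only: the status names, the `SongStatus` enum, and the `advance` and `is_terminal` helpers are hypothetical and may not match what the project stores in its database.

```python
from enum import Enum


class SongStatus(str, Enum):
    """Hypothetical lifecycle of one generation job."""

    QUEUED = "queued"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"


# Allowed forward transitions; terminal states map to an empty set.
_ALLOWED = {
    SongStatus.QUEUED: {SongStatus.PROCESSING, SongStatus.FAILED},
    SongStatus.PROCESSING: {SongStatus.PROCESSED, SongStatus.FAILED},
    SongStatus.PROCESSED: set(),
    SongStatus.FAILED: set(),
}


def advance(current: SongStatus, new: SongStatus) -> SongStatus:
    """Return the new status, rejecting transitions that skip or go backwards."""
    if new not in _ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new


def is_terminal(status: SongStatus) -> bool:
    """True once polling can stop (job finished or failed)."""
    return not _ALLOWED[status]
```

Under this model the frontend keeps polling until `is_terminal` returns true, then either plays the track or shows an error.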

Running the Stack

  • Frontend: cd frontend && npm install && npm run dev
  • Modal backend (from backend/): modal run -m main
  • Configure environment: S3 credentials, Modal secrets, and database connection strings are read from environment variables.
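Reading configuration from environment variables can be sketched as a loader that fails fast when something is missing. The variable names below (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_BUCKET_NAME`, `DATABASE_URL`) and the `Settings` and `load_settings` names are assumptions for illustration; check the repository for the names it actually expects.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names; the project's real names may differ.
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "DATABASE_URL",
)


@dataclass(frozen=True)
class Settings:
    aws_access_key_id: str
    aws_secret_access_key: str
    s3_bucket_name: str
    database_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, listing every missing variable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        aws_access_key_id=env["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        s3_bucket_name=env["S3_BUCKET_NAME"],
        database_url=env["DATABASE_URL"],
    )
```

Reporting all missing variables at once, rather than stopping at the first, tends to shorten the setup loop when configuring a fresh machine.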

Note

Because GPU inference costs money, I have not deployed the app publicly, even with a limit of 5 credits per user. Please contact me if you would like to see a demo.
