DrapeNet - Project Documentation

1. Overview

DrapeNet is a multi-modal virtual try-on platform designed to reduce online fashion returns by creating a user-specific digital twin and enabling three types of visualization:

  • Photorealistic 2D try-on (texture-focused)
  • Volumetric 3D reconstruction (fit/size-focused)
  • Real-time AR mirror (movement-focused)

The system uses a multi-agent architecture where specialized AI services collaborate across ingestion, 3D reconstruction, styling, and rendering.

2. Problem Statement

Online returns are often caused by poor fit and mismatched expectations from static product images.

3. Solution Summary

DrapeNet provides a "Digital Fitting Room" workflow:

  1. User uploads front/side photos and garment references.
  2. System validates input quality and routes data.
  3. 3D reconstruction builds a personalized body mesh.
  4. Stylist agent matches user intent + body type to product suggestions.
  5. Visualization agent renders 2D, 3D, and AR try-on experiences.

4. Agent Workflow

Ingestion Agent (Gatekeeper)

  • Monitors user uploads.
  • Validates lighting/pose/occlusion quality.
  • Routes:
    • Front/side body photos -> 3D Reconstruction Agent
    • Cloth/product images -> Catalog/Stylist pipeline
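The gatekeeper routing above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Upload` type, the `kind` labels, the three quality scores, the `MIN_QUALITY` threshold of 0.6, and the destination names are all assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical destination labels; real service names are not specified in this document.
RECONSTRUCTION = "3d_reconstruction_agent"
STYLIST = "catalog_stylist_pipeline"
REJECTED = "rejected"

# Assumed minimum score (0.0 to 1.0) that each quality check must reach.
MIN_QUALITY = 0.6


@dataclass
class Upload:
    kind: str         # assumed labels: "body_front", "body_side", "garment"
    lighting: float   # assumed quality scores in [0, 1], higher is better
    pose: float
    occlusion: float  # here 1.0 means fully unoccluded


def route_upload(upload: Upload) -> str:
    """Return the destination for an upload, or REJECTED if quality is too low."""
    if upload.kind in ("body_front", "body_side"):
        # Body photos must pass every quality check before reconstruction.
        if min(upload.lighting, upload.pose, upload.occlusion) < MIN_QUALITY:
            return REJECTED
        return RECONSTRUCTION
    if upload.kind == "garment":
        # Pose does not apply to product shots; check lighting and occlusion only.
        if min(upload.lighting, upload.occlusion) < MIN_QUALITY:
            return REJECTED
        return STYLIST
    # Unknown upload kinds are rejected rather than guessed at.
    return REJECTED
```

Returning an explicit rejection instead of raising keeps the decision easy to surface to the user as a "retake photo" prompt, though that UX choice is also an assumption.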

3D Reconstruction Agent (Architect)

  • Runs a human mesh recovery (HMR) pipeline (e.g., ROMP or HMR 2.0).
  • Converts 2D image cues into body measurements (chest, waist, inseam, etc.).
  • Outputs and stores .obj/.glb digital twin mesh.
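To illustrate the export step, here is a minimal sketch of serializing a triangle mesh to Wavefront OBJ text. The function name `mesh_to_obj` and the tuple-based mesh representation are assumptions for illustration; the actual pipeline may rely on a mesh library instead, and GLB export is not shown.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]  # zero-based vertex indices


def mesh_to_obj(vertices: Sequence[Vertex], faces: Sequence[Face]) -> str:
    """Serialize a triangle mesh to Wavefront OBJ text (geometry only)."""
    lines: List[str] = []
    for x, y, z in vertices:
        lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
    for a, b, c in faces:
        # OBJ face indices are 1-based, so shift each zero-based index by one.
        lines.append(f"f {a + 1} {b + 1} {c + 1}")
    return "\n".join(lines) + "\n"


# Usage: a single triangle.
triangle_obj = mesh_to_obj(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0, 1, 2)],
)
```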

Stylist Agent (Brain)

  • Handles natural-language style queries.
  • Uses RAG + vector search for body-aware recommendations.
  • Maps current trend/rulebook knowledge to user context.
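The retrieval half of the RAG step can be sketched as a top-k cosine-similarity search. This is an in-memory stand-in under stated assumptions: the product IDs and two-dimensional vectors are toy data, and `cosine_similarity` and `top_k` are hypothetical helpers. A deployed system would delegate this search to the vector database.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (product_id, score) pairs most similar to the query."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumed); real ones would come from an embedding model.
toy_index = {
    "slim_jeans": [1.0, 0.0],
    "wide_trousers": [0.0, 1.0],
    "chinos": [0.7, 0.7],
}
matches = top_k([1.0, 0.1], toy_index, k=2)
```

In a body-aware setup, the query vector would presumably encode both the user's text request and their measurements, but how those are combined is not specified here.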

Visualization Agent (Artist)

  • 2D mode: IDM-VTON warping/generation for photorealistic overlays.
  • 3D mode: Garment fitting on user-specific SMPL/SMPL-X body in browser.
  • AR mode: MediaPipe landmarks + frontend canvas/video overlay.
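For the AR mode, one plausible first step is turning pose landmarks into a pixel box where a top garment is drawn. The sketch below assumes landmarks arrive as normalized (x, y) pairs in MediaPipe Pose order, where indices 11/12 are the shoulders and 23/24 are the hips. The `torso_box` helper and the `pad` parameter are hypothetical, and the document's actual overlay runs in the frontend, so this Python version only illustrates the geometry.

```python
from typing import Sequence, Tuple

# MediaPipe Pose landmark indices for the torso corners.
LEFT_SHOULDER, RIGHT_SHOULDER = 11, 12
LEFT_HIP, RIGHT_HIP = 23, 24

Point = Tuple[float, float]  # normalized (x, y), each in [0, 1]


def torso_box(
    landmarks: Sequence[Point],
    width: int,
    height: int,
    pad: float = 0.15,
) -> Tuple[int, int, int, int]:
    """Return a (left, top, right, bottom) pixel box for a top-garment overlay."""
    corners = [
        landmarks[i]
        for i in (LEFT_SHOULDER, RIGHT_SHOULDER, LEFT_HIP, RIGHT_HIP)
    ]
    xs = [x * width for x, _ in corners]
    ys = [y * height for _, y in corners]
    # Widen the box horizontally because garments extend past the shoulder joints.
    margin = (max(xs) - min(xs)) * pad
    left = max(0, round(min(xs) - margin))
    right = min(width, round(max(xs) + margin))
    top = max(0, round(min(ys)))
    bottom = min(height, round(max(ys)))
    return left, top, right, bottom
```

Recomputing this box every frame lets the overlay follow the user's movement; smoothing across frames would likely be needed in practice to avoid jitter.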

5. High-Level Architecture (HLD)

flowchart LR
  U[User: photos + query + product link] --> FE[Next.js 14 Frontend]
  FE --> I[Ingestion Agent]

  I -->|body photos| R[3D Reconstruction Agent]
  I -->|cloth/product refs| S[Stylist Agent]

  R --> M[(Mesh Store: OBJ/GLB + measurements)]
  S --> VDB[(Vector DB: embeddings + fashion knowledge)]
  S --> CAT[(Product Catalog / Trends)]

  R --> V[Visualization Agent]
  S --> V
  M --> V

  V -->|2D output| FE
  V -->|3D output| FE
  V -->|AR output| FE

  FE --> C[(Cloudinary)]
  FE --> Z[(Zustand App State)]

  subgraph Backend
    I
    R
    S
    V
    L[LangGraph Orchestration]
    Q[Celery + Redis Queue]
  end

  L --- I
  L --- R
  L --- S
  L --- V
  Q --- R

6. Request-to-Response Flow

sequenceDiagram
  participant User
  participant FE as Frontend (Next.js)
  participant ING as Ingestion Agent
  participant ORCH as LangGraph
  participant R3D as 3D Agent
  participant STY as Stylist Agent
  participant VIS as Visualization Agent

  User->>FE: Upload photos + ask style query
  FE->>ING: Submit assets
  ING->>ORCH: Validation + routing metadata
  ORCH->>R3D: Trigger mesh reconstruction task
  ORCH->>STY: Trigger style recommendation retrieval
  R3D-->>ORCH: Mesh + body measurements
  STY-->>ORCH: Ranked products + style plan
  ORCH->>VIS: Render plan for 2D / 3D / AR
  VIS-->>FE: Rendered assets + session metadata
  FE-->>User: Interactive try-on experience
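The ordering in this flow can be sketched with plain `asyncio`: reconstruction and styling do not depend on each other, so they may run concurrently, while rendering waits for both. This is a stand-in for the orchestration layer, not LangGraph code. All function names, return shapes, and placeholder values below are assumptions.

```python
import asyncio
from typing import Any, Dict, List, Sequence


async def reconstruct_mesh(photos: List[str]) -> Dict[str, Any]:
    # Placeholder for the 3D agent; a real call would run the HMR pipeline.
    await asyncio.sleep(0)
    return {"mesh_uri": "mesh-placeholder.glb", "measurements": {}}


async def recommend_styles(query: str) -> Dict[str, Any]:
    # Placeholder for the Stylist agent; a real call would run retrieval + LLM.
    await asyncio.sleep(0)
    return {"products": [], "style_plan": query}


async def render(
    mesh: Dict[str, Any], style: Dict[str, Any], modes: Sequence[str]
) -> Dict[str, str]:
    # Placeholder for the Visualization agent: one asset per requested mode.
    await asyncio.sleep(0)
    return {mode: f"{mode}-render-of-{mesh['mesh_uri']}" for mode in modes}


async def handle_request(photos: List[str], query: str) -> Dict[str, Any]:
    # The two upstream agents are independent, so run them concurrently.
    mesh, style = await asyncio.gather(
        reconstruct_mesh(photos), recommend_styles(query)
    )
    # Rendering needs both results, so it starts only after both have finished.
    assets = await render(mesh, style, ("2d", "3d", "ar"))
    return {"assets": assets, "products": style["products"]}


result = asyncio.run(handle_request(["front.jpg", "side.jpg"], "casual summer"))
```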

7. Proposed Tech Stack

Frontend

  • Framework: Next.js 14 (App Router)
  • 3D Engine: React Three Fiber (R3F)
  • AR Library: MediaPipe Pose + TensorFlow.js
  • State: Zustand

Backend and Orchestration

  • Language/API: Python + FastAPI
  • Agent Orchestration: LangGraph
  • Task Queue: Celery + Redis

AI and Data

  • GenAI Models: IDM-VTON (2D try-on), Stable Diffusion (texture generation)
  • Body Model: SMPL / SMPL-X
  • LLM Layer: GPT-4o or Gemini 1.5 Pro (Stylist Agent)
  • Vector Search: MongoDB Atlas Vector Search
  • Storage/CDN: Cloudinary

8. Non-Functional Requirements

  • Low perceived latency across the full try-on session, from upload to rendered result
  • Async processing for 3D jobs to avoid frontend blocking
  • Scalable media processing and caching
  • Privacy-safe image handling and retention controls
  • Monitoring for model latency, rendering quality, and failed reconstructions

9. MVP Scope

  1. Input validation + upload flow
  2. Single-user digital twin generation
  3. 2D try-on and basic 3D preview
  4. Prompt-based stylist recommendations
  5. Session persistence + downloadable output

10. Future Extensions

  • Multi-garment layering and cloth simulation physics
  • Personal wardrobe memory + sizing profile history
  • Brand onboarding APIs for catalog ingestion
  • Real-time collaborative shopping sessions