drona-gyawali/Quark

Notice: This application is designed with a microservice architecture consisting of three distinct services: Ingestion, Retrieval, and Email. Due to the high cost of cloud-hosting a multi-service stack, a live web demo is not currently available. However, you can easily self-host the application by following the commands below. If you encounter any issues during the setup, please feel free to open an issue.

A high-performance RAG (Retrieval-Augmented Generation) system designed for deep document analysis and persistent context awareness. This project implements a sophisticated architecture that synchronizes unstructured text and image data, utilizing a dual-memory layer for a truly personalized chat experience.


Demo Video: Processing ingestion and retrieval layers in one app.

Note: This demo was recorded on a low-specification system. While the ingestion and retrieval layers are highly optimized for multi-threading and asynchronous processing, the hardware limitations may impact perceived speed. On high-performance systems, these operations are significantly faster.

how-quark.mov

Architecture Overview

Quark distinguishes itself through a multi-stage pipeline:

Multimodal Ingestion

  • Partitioning: Leveraging Unstructured.io for semantic text decomposition and layout analysis.
  • Extraction: Utilizing pdfplumber for precise image and table coordinate extraction.
  • Sync Layer: A custom orchestration layer that aligns text and visual modalities for comprehensive multimodal embeddings.
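The alignment step can be pictured with a small sketch. This is not Quark's actual sync layer: the `TextChunk`, `VisualAsset`, and `syncModalities` names are hypothetical, and the rule used here (attach each image or table to the vertically closest text chunk on the same page) is an assumed heuristic chosen only to illustrate the idea.

```typescript
// Hypothetical shapes; the real Quark types are not shown in this README.
interface TextChunk {
  id: string;
  page: number;
  top: number; // vertical position on the page
  text: string;
}

interface VisualAsset {
  id: string;
  page: number;
  top: number;
  kind: "image" | "table";
}

interface SyncedChunk extends TextChunk {
  assetIds: string[];
}

// Attach each visual asset to the vertically closest text chunk on the same page.
function syncModalities(chunks: TextChunk[], assets: VisualAsset[]): SyncedChunk[] {
  const synced: SyncedChunk[] = chunks.map((c) => ({ ...c, assetIds: [] }));
  for (const asset of assets) {
    let best: SyncedChunk | undefined;
    let bestDistance = Infinity;
    for (const chunk of synced) {
      if (chunk.page !== asset.page) continue;
      const distance = Math.abs(chunk.top - asset.top);
      if (distance < bestDistance) {
        best = chunk;
        bestDistance = distance;
      }
    }
    // Assets on pages with no text stay unattached in this sketch.
    if (best) best.assetIds.push(asset.id);
  }
  return synced;
}

const syncedExample = syncModalities(
  [
    { id: "t1", page: 1, top: 100, text: "Revenue grew in Q3." },
    { id: "t2", page: 1, top: 400, text: "Costs are summarized below." },
    { id: "t3", page: 2, top: 80, text: "Appendix." },
  ],
  [
    { id: "img1", page: 1, top: 130, kind: "image" },
    { id: "tbl1", page: 1, top: 450, kind: "table" },
  ],
);
console.log(syncedExample.map((c) => `${c.id}: ${c.assetIds.join(",")}`));
```

Each synced chunk then carries the ids of its nearby visuals, which is one way the metadata could later be used to embed or retrieve text and images together.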

Dual-Stream Memory

  • STM (Short-Term Memory): Powered by Redis. Provides sub-millisecond access to rapid session-based context and transient state.
  • LTM (Long-Term Memory): Powered by Mem0. Acts as a persistent intelligence layer that retains user history, evolving preferences, and long-form knowledge over time.
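To make the split between the two memory layers concrete, here is a minimal sketch that uses in-memory stand-ins rather than Redis or Mem0. The class names, the TTL and turn-limit parameters, and the keyword-based `recall` are all assumptions made for illustration; the real system would delegate these operations to the respective services.

```typescript
// In-memory stand-ins for illustration only: Redis would back the short-term
// store and Mem0 the long-term store in the real system.
interface Turn {
  role: "user" | "assistant";
  content: string;
}

class ShortTermMemory {
  private readonly sessions = new Map<string, { turns: Turn[]; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxTurns: number;

  constructor(ttlMs: number, maxTurns: number) {
    this.ttlMs = ttlMs;
    this.maxTurns = maxTurns;
  }

  append(sessionId: string, turn: Turn, now: number = Date.now()): void {
    const existing = this.sessions.get(sessionId);
    const turns: Turn[] = existing && existing.expiresAt > now ? existing.turns : [];
    turns.push(turn);
    // Keep only the most recent turns and refresh the expiry on every write.
    this.sessions.set(sessionId, {
      turns: turns.slice(-this.maxTurns),
      expiresAt: now + this.ttlMs,
    });
  }

  recent(sessionId: string, now: number = Date.now()): Turn[] {
    const entry = this.sessions.get(sessionId);
    return entry && entry.expiresAt > now ? entry.turns.slice() : [];
  }
}

class LongTermMemory {
  private readonly facts = new Map<string, string[]>();

  remember(userId: string, fact: string): void {
    const list = this.facts.get(userId) ?? [];
    list.push(fact);
    this.facts.set(userId, list);
  }

  // Naive keyword match; a real memory layer would use semantic search.
  recall(userId: string, query: string): string[] {
    const words = query.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
    return (this.facts.get(userId) ?? []).filter((fact) =>
      words.some((w) => fact.toLowerCase().includes(w)),
    );
  }
}

// Combine durable facts and recent turns into one prompt context.
function buildContext(
  stm: ShortTermMemory,
  ltm: LongTermMemory,
  userId: string,
  sessionId: string,
  query: string,
  now: number = Date.now(),
): string {
  const facts = ltm.recall(userId, query).map((f) => `- ${f}`);
  const turns = stm.recent(sessionId, now).map((t) => `${t.role}: ${t.content}`);
  return ["Known about user:", ...facts, "Recent turns:", ...turns, `user: ${query}`].join("\n");
}
```

The short-term store expires with the session, while the long-term store survives it, so a returning user would start with an empty turn history but the same remembered facts.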

Core Intelligence & Retrieval

  • Embedding & Reranking: Powered by Voyage AI, utilizing advanced rerankers and metadata filtering to maximize retrieval precision.
  • Vector Infrastructure: Qdrant handles high-dimensional vector storage alongside robust relational metadata.
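The retrieval flow described above (metadata filter, vector search, then reranking) can be sketched without any external service. Everything below is an in-memory approximation under stated assumptions: `StoredChunk`, `retrieve`, and `overlapReranker` are hypothetical names, cosine similarity over a plain array stands in for the vector database, and a word-overlap score stands in for a hosted reranker.

```typescript
interface StoredChunk {
  id: string;
  vector: number[];
  text: string;
  metadata: Record<string, string>;
}

type Reranker = (query: string, text: string) => number;

// Cosine similarity; assumes both vectors have the same length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function retrieve(
  store: StoredChunk[],
  queryVector: number[],
  queryText: string,
  filter: Record<string, string>,
  topK: number,
  rerank: Reranker,
): StoredChunk[] {
  // Stage 1: metadata filter, then nearest neighbours by cosine similarity.
  const candidates = store
    .filter((c) => Object.entries(filter).every(([k, v]) => c.metadata[k] === v))
    .map((c) => ({ chunk: c, score: cosine(queryVector, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
  // Stage 2: reorder the shortlist with the (stand-in) reranker.
  return candidates
    .map(({ chunk }) => ({ chunk, score: rerank(queryText, chunk.text) }))
    .sort((x, y) => y.score - x.score)
    .map(({ chunk }) => chunk);
}

// Stand-in reranker: fraction of query words that appear in the text.
const overlapReranker: Reranker = (query, text) => {
  const words = query.toLowerCase().split(/\W+/).filter(Boolean);
  const haystack = text.toLowerCase();
  return words.length === 0 ? 0 : words.filter((w) => haystack.includes(w)).length / words.length;
};
```

The point of the two stages is that the cheap vector search narrows the field, and the more expensive reranker only scores the shortlist.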

Technical Stack

  • Web Framework: ElysiaJS — The high-performance, Bun-native framework for the backend.
  • Identity & DB: Supabase — Unified Auth and PostgreSQL backend.
  • Frontend: React — A minimalist, streaming-responsive interface optimized for real-time AI interactions.
  • Workers: BullMQ + Redis. Persistent background workers take heavy I/O and compute off the request path, so ingestion and retrieval can scale independently.

Getting Started

Follow these steps to initialize and run the Quark-RAG system on your local machine.

1. Initialization

We provide a setup script to handle dependency installation and environment checks.

  1. Grant execution permissions:
    chmod +x setup.sh
  2. Run the initializer:
    ./setup.sh

2. Configure Environment Variables

Before launching the web app, you must set up your credentials. Create a .env file in the root directory and populate it with your API keys.
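A quick way to catch a missing credential before starting the workers is to check the .env contents up front. The sketch below is not part of the repository: `parseDotEnv` and `missingKeys` are hypothetical helpers, and the key names in the example are placeholders, since this README does not list the exact variables Quark expects.

```typescript
// Parse simple KEY=value lines; blank lines and # comments are skipped.
function parseDotEnv(source: string): Map<string, string> {
  const env = new Map<string, string>();
  for (const rawLine of source.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    env.set(key, value);
  }
  return env;
}

// Return the required keys that are absent or empty.
function missingKeys(required: string[], env: Map<string, string>): string[] {
  return required.filter((key) => (env.get(key) ?? "") === "");
}

// Placeholder key names, assumed for illustration only.
const exampleEnv = parseDotEnv(`
# credentials
VOYAGE_API_KEY="abc"
REDIS_URL=
QDRANT_URL = http://localhost:6333
`);
console.log(missingKeys(["VOYAGE_API_KEY", "REDIS_URL", "QDRANT_URL", "SUPABASE_URL"], exampleEnv));
```

Running a check like this at startup turns a confusing runtime failure into a clear list of what still needs to be filled in.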

3. Launching the ingestion worker in a separate process

npm run worker:ingestion

4. Launching the retrieval engine for query preprocessing and database actions

npm run worker:chat

5. Launching the Frontend

Once the setup is complete and both workers are running, start the web interface to begin chatting with your documents.

cd frontend
npm install
npm run dev

Summary

This implementation follows a Modular RAG pattern. By decoupling the ingestion of images and text and re-syncing them at the metadata level, the system maintains higher contextual integrity than standard "text-only" pipelines. The integration of Redis and Mem0 mimics human cognitive functions by separating immediate recall from historical knowledge.
