feat: add opt-in TwelveLabs video modality (Pegasus + Marengo) by mohit-twelvelabs · Pull Request #301 · HKUDS/RAG-Anything

mohit-twelvelabs · 2026-06-25T07:47:12Z

Hi! I'm Mohit, and I work at TwelveLabs (@mohit-twelvelabs).

Description

This PR adds an opt-in video modality to RAG-Anything, backed by the TwelveLabs platform. It lets RAG-Anything ingest and retrieve video the same way it already handles images, tables and equations.

Pegasus (analyze) generates a detailed transcript/description of a video. That text flows through the existing knowledge-graph + chunk pipeline via BaseModalProcessor._create_entity_and_chunk — no new retrieval path, no special-casing downstream.
Marengo (embed) produces a 512-dim multimodal embedding, returned on the item's entity_info (tl_video_embedding) for semantic video retrieval. A companion embed_text() lets you score a text query in the same 512-dim space.
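To make the "same 512-dim space" point concrete, here is a minimal sketch of how a text-query embedding might be scored against stored video embeddings. Cosine similarity is an assumption (the PR does not state which metric is intended), and `rank_videos` plus the shape of the item dicts are hypothetical; only the `tl_video_embedding` key and the 512 dimension come from the description above.

```python
import math

EMBED_DIM = 512  # dimension stated in the PR description


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def rank_videos(query_vec, items):
    """Hypothetical helper: sort items by similarity to query_vec, best first.

    Each item is assumed to be a dict carrying a 'tl_video_embedding' list.
    """
    scored = [
        (cosine_similarity(query_vec, item["tl_video_embedding"]), item)
        for item in items
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

In practice `query_vec` would come from `embed_text()` and each item from a processed video's `entity_info`; the vectors here are plain Python lists so the sketch has no dependencies.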

Why it helps this project

RAG-Anything is explicitly multimodal, but video isn't yet a first-class modality. Pegasus + Marengo give a turnkey way to bring video understanding and video↔text semantic search into the existing graph/RAG pipeline, with no extra infra on the user's side.

Changes Made

raganything/twelvelabs.py: new TwelveLabsModalProcessor (mirrors GenericModalProcessor). Accepts video_url, video_path (uploaded as a TwelveLabs asset), or video_id.
config.py: enable_video_processing flag (env ENABLE_VIDEO_PROCESSING), default False.
raganything.py: registers the video processor only when the flag is enabled, the twelvelabs package is installed, and an API key is present; otherwise it logs a warning and skips registration, so initialization never breaks.
utils.py: dispatch routes "video" to the processor (falls back to the generic processor when no video processor is registered, so it never raises KeyError).
__init__.py: optional TwelveLabsModalProcessor export (same try/except pattern as other optional features).
pyproject.toml / setup.py: new [video] extra (twelvelabs>=1.2.8), also folded into [all].
env.example, requirements.txt, README.md: docs + config entries.
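The gating and fallback behaviour listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `env_flag`, `should_register_video` and `pick_processor` are hypothetical names, the accepted truthy strings are a guess, and the key is assumed to be read from `TWELVELABS_API_KEY` (the variable named in the testing section).

```python
import importlib.util
import logging
import os

logger = logging.getLogger(__name__)


def env_flag(name, default=False):
    """Hypothetical helper: read a boolean flag from the environment."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def should_register_video(package_installed=None):
    """Return True only if the flag, the package and an API key are all present."""
    if not env_flag("ENABLE_VIDEO_PROCESSING"):
        return False  # disabled by default: no warning, no behaviour change
    if package_installed is None:
        package_installed = importlib.util.find_spec("twelvelabs") is not None
    if not package_installed:
        logger.warning("twelvelabs is not installed; skipping video processor")
        return False
    if not os.environ.get("TWELVELABS_API_KEY"):  # assumed variable name
        logger.warning("TWELVELABS_API_KEY is not set; skipping video processor")
        return False
    return True


def pick_processor(processors, content_type):
    """Return the processor for content_type, falling back to 'generic'."""
    return processors.get(content_type, processors["generic"])
```

Using `dict.get` with the generic processor as the default is what keeps an unregistered `"video"` type from raising `KeyError`.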

Opt-in / non-breaking

Disabled by default. With ENABLE_VIDEO_PROCESSING unset, there are zero behavioural changes and no new hard dependency (twelvelabs only installs via the [video]/[all] extra).

How it was tested

tests/test_twelvelabs_integration.py:

No-network unit tests: config default, dispatch routing/fallback, and Pegasus/Marengo request wiring against a mocked TwelveLabs client.
A live smoke test (gated on TWELVELABS_API_KEY, skipped in CI without it) asserting a real Marengo text embedding is 512-dim — verified locally (9 passed).
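For readers unfamiliar with key-gated tests, the skip pattern might look like the sketch below. It uses the standard library's `unittest` rather than whatever runner the repo uses, and `fetch_text_embedding` is a hypothetical placeholder for the real call (for example the processor's `embed_text()`); it is deliberately left unimplemented here.

```python
import io
import os
import unittest

EXPECTED_DIM = 512  # dimension stated in the PR description


def fetch_text_embedding(query):
    """Hypothetical placeholder: wire this to the real embedding call."""
    raise NotImplementedError("connect to the real client before running live")


class LiveMarengoSmokeTest(unittest.TestCase):
    def setUp(self):
        # Skip at run time when no key is configured, e.g. in CI.
        if not os.environ.get("TWELVELABS_API_KEY"):
            self.skipTest("TWELVELABS_API_KEY not set; skipping live test")

    def test_text_embedding_is_512_dim(self):
        vector = fetch_text_embedding("a dog catching a frisbee")
        self.assertEqual(len(vector), EXPECTED_DIM)


def run_suite():
    """Run the smoke test quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LiveMarengoSmokeTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

Without the key the test is reported as skipped rather than failed, so CI stays green while a local run with a key still exercises the live path.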

Ran the repo's ruff format --check and ruff check --ignore=E402 on all changed files — clean.

Note: full Pegasus/Marengo video runs are server-side and can be slow (Marengo video embedding is an async, polled task). The video paths are wiring-verified against the SDK contract; the synchronous Marengo text-embedding path is verified end-to-end against the live API.
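As an illustration of what "async, polled task" implies for callers, here is a generic polling loop with a timeout. It is a sketch only: `get_status` is a hypothetical callable standing in for whatever status lookup the SDK provides, and the `"ready"` / `"failed"` status strings are assumptions, not confirmed SDK values.

```python
import time


def wait_for_task(get_status, timeout_s=600.0, interval_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it reports completion, failure, or timeout.

    get_status is a hypothetical zero-argument callable returning a status
    string; sleep and clock are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "ready":  # assumed terminal success value
            return status
        if status == "failed":  # assumed terminal failure value
            raise RuntimeError("embedding task failed")
        if clock() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

Injecting `sleep` and `clock` is one way to keep no-network unit tests fast while the live path still waits between polls.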

Checklist

Changes tested locally
Documentation updated
Unit tests added

Additional Notes

You can grab a free API key at https://twelvelabs.io — there's a generous free tier.

Add a video modality backed by the TwelveLabs platform:

- Pegasus (analyze) generates a transcript/description that flows through the existing knowledge-graph + chunk pipeline like every other modality.
- Marengo (embed) produces a 512-dim multimodal embedding returned on the item's entity_info for semantic video retrieval (with embed_text() for text queries in the same space).

TwelveLabsModalProcessor mirrors GenericModalProcessor and reuses BaseModalProcessor._create_entity_and_chunk. Registered as 'video' only when ENABLE_VIDEO_PROCESSING is set and the 'twelvelabs' package is installed (new [video] extra); disabled by default, so existing behaviour is unchanged.

Adds config flag, dispatch routing, optional package export, env.example entries, README docs, and tests (no-network wiring unit tests + a live Marengo 512-dim smoke test gated on TWELVELABS_API_KEY).

mohit-twelvelabs wants to merge 1 commit into HKUDS:main from mohit-twelvelabs:feat/twelvelabs-integration
