Autonomous Qwen3-VL training-code research on the official DocVQA benchmark. main: NVIDIA multi-GPU, mlx: Apple Silicon/MPS.
Updated May 11, 2026 - Python
99.156% Accuracy from Agentic Document Extraction DPT-2 model on DocVQA val split
Quasar PoC, Multitenant PoC.
Config-driven long-context benchmark toolkit for vision-language models
This repository hosts the code, along with links to the datasets and models, for our research paper. In the paper we develop a novel approach that can perform DocVQA, RCVQA, and MathVQA tasks.
Document XAI Model for DocVQA. Official implementation of "Towards Self-Explainable DocVQA with Chain-of-Explanation Predictions". Submitted to NeurIPS 2026