SpaceVista: All-Scale Visual Spatial Reasoning from $mm$ to $km$

🤗 Hugging Face | 📑 Paper | ⚙️ Github | 🖥️ Home Page

Peiwen Sun $^{*}$, Shiqiang Lang $^{*}$, Dongming Wu, Yi Ding, Kaituo Feng, Huadai Liu, Zhen Ye, Rui Liu, Yun-Hui Liu, Jianan Wang, Xiangyu Yue

Keywords:

The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from $mm$ to $km$.

Outlines

💥 News 💥

[2026.5.28] 📦 Our SpaceVista-1M is released at .

[2026.5.28] 🎯 Our SpaceVista-Bench is released at .

[2026.5.10] 🏆 The Guinness World Records data used in our paper is released at .

[2026.5.2] 🎉 Our paper is accepted by ICML 2026. See you in Seoul.

[2025.10.10] Our preview SFT code base is released for preview. .

[2025.10.10] Our preview 100K subset of SpaceVista-1M is now available at .

[2025.10.10] Our initial paper is now accessible at .

Overall Structure

Training Dataset: SpaceVista-1M .
Evaluation Dataset: SpaceVista-Bench .
SFT training: SFT code for SpaceVista .
Evaluating: Evaluation code for SpaceVista .

SpaceVista

Spatial reasoning is the ability to perceive, interpret, and act across spatial scales, from millimeter-sized components to distant aerial scenes. All-scale spatial reasoning is fundamental to next-generation intelligent systems and supports diverse applications: mm sensing for advanced manufacturing, cm and m perception for embodied agents, 10m operation for autonomous driving, and 100m for drone-based sensing. Despite progress, existing work shows clear limitations in both model design and dataset coverage. Current scene perception research mostly targets indoor scenes, narrow object classes, and limited spatial ranges, and lacks training paradigms engineered for end to end, cross scale reasoning. SpaceVista addresses this gap by presenting the first systematic optimization across both data and model dimensions to enable robust, full-scene spatial reasoning.

Training Data

SpaceVista-1M: Available at .

# Download SpaceVista-1M
huggingface-cli download SpaceVista/SpaceVista-Full --repo-type dataset --local-dir ./SpaceVista-1M

Evaluation Data

SpaceVista-Bench: Available at .

# Download SpaceVista-Bench
huggingface-cli download SpaceVista/SpaceVista-Bench --repo-type dataset --local-dir ./SpaceVista-Bench

Evaluation

We provide API-based evaluation scripts that work with any OpenAI-compatible API (OpenRouter, OpenAI, POE, etc.). No GPU required.

cd eval
pip install openai pillow numpy tqdm pandas
export API_KEY="your-api-key-here"

For APIs that support frame (image) input:

python evaluate_api.py --model qwen/qwen2.5-vl-72b-instruct

For APIs that support video input (requires ffmpeg):

python evaluate_api_video.py --model qwen/qwen2.5-vl-72b-instruct

To use a different API provider, override --base_url:

# OpenAI
python evaluate_api.py --model gpt-4o --base_url https://api.openai.com/v1

# POE
python evaluate_api.py --model gpt-4o --base_url https://api.poe.com/v1

See the eval/README.md for full argument reference, resume support, and output format.

Usage

In case you want to train the Qwen2.5-VL-7B model with SpaceVista, please refer to the sft folder for detailed instructions.

Reference

If you find this repo useful, please cite our papers:

@article{sun2025spacevista,
  title={SpaceVista: All-Scale Visual Spatial Reasoning from mm to km}, 
  author={Sun, Peiwen and Lang, Shiqiang and Wu, Dongming and Ding, Yi and Feng, Kaituo and Liu, Huadai and Ye, Zhen and Liu, Rui and Liu, Yun-Hui and Wang, Jianan and Yue, Xiangyu},
  journal={arXiv preprint arXiv:2510.09606},
  year={2025}
}

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.asset		.asset
dependency/transformers		dependency/transformers
eval		eval
sft		sft
.gitignore		.gitignore
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SpaceVista: All-Scale Visual Spatial Reasoning from $mm$ to $km$

Outlines

💥 News 💥

Overall Structure

SpaceVista

Training Data

Evaluation Data

Evaluation

Usage

Reference

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

SpaceVista: All-Scale Visual Spatial Reasoning from $mm$ to $km$

Outlines

💥 News 💥

Overall Structure

SpaceVista

Training Data

Evaluation Data

Evaluation

Usage

Reference

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages