[Model] RoboMamba #33

Description

@skt0725

ID (slug)

robomamba

Name

RoboMamba

Organization

Peking University / AI2Robotics

Year

2024

Description (English)

RoboMamba is an efficient end-to-end Vision-Language-Action model that leverages the Mamba state space model for robotic reasoning and manipulation with linear inference complexity. It integrates a vision encoder with Mamba, aligning visual tokens with language embeddings, and uses a lightweight policy head for SE(3) pose prediction. RoboMamba achieves 3x faster inference than existing VLA models while maintaining competitive reasoning and manipulation performance.
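As a rough illustration of the "linear inference complexity" mentioned above, the sketch below runs a generic scalar state space recurrence in a single pass over the input sequence. It is not RoboMamba's or Mamba's actual implementation: the function name `ssm_scan` and the fixed scalar parameters `a`, `b`, and `c` are assumptions made for this example, whereas Mamba uses input-dependent (selective) parameters and vector-valued states.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    Hypothetical illustration only: a, b, c are fixed scalars here,
    not the learned, input-dependent parameters a Mamba layer uses.
    """
    h = 0.0          # hidden state, carried from one step to the next
    ys = []
    for x in xs:     # one constant-cost update per token: O(len(xs)) total
        h = a * h + b * x   # state update
        ys.append(c * h)    # readout
    return ys


if __name__ == "__main__":
    # An impulse followed by zeros decays geometrically with factor a.
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5))
```

Because each step depends only on the previous state, the cost grows linearly with sequence length, in contrast to self-attention, where every token attends to every other token.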

Description (Korean)

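RoboMamba is an efficient end-to-end VLA model that uses the Mamba state space model to perform robotic reasoning and manipulation with linear inference complexity. It integrates a vision encoder with Mamba to align visual tokens with language embeddings, and performs SE(3) pose prediction with a lightweight policy head. It achieves 3x faster inference than existing VLA models while showing competitive reasoning and manipulation performance.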

GitHub URL

https://github.com/lmzpai/roboMamba

Paper URL (arXiv)

https://arxiv.org/abs/2406.04339

HuggingFace URL

No response

Project Page URL

https://sites.google.com/view/robomamba-web

Categories

  • manipulation
  • locomotion
  • navigation
  • dexterous
  • whole-body
  • aerial

Hardware Targets

  • manipulator
  • humanoid
  • quadruped
  • biped
  • mobile
  • drone
  • hand

Learning Methods

  • VLA
  • IL
  • RL
  • diffusion
  • world_model
  • sim2real

Framework

  • pytorch
  • jax
  • tensorflow
  • other

Communication

  • ros2
  • grpc
  • lcm
  • zenoh

Tags (optional)

VLA, mamba, state-space-model

Checklist

  • The model is open-source (code or weights publicly available)
  • At least one URL (GitHub, paper, or HuggingFace) is provided
  • I have read the contribution guidelines
