- 2025-07-31: The preprint of LIDAR has been posted on 📤️arXiv!
- 2025-07-30: The code for LIDAR is publicly available in this repository! 📦
- 2025-07-06: 🎉🎉🎉We are delighted to announce that LIDAR has been accepted to ACM MM 2025! 🖐😭🤚
Achieving pixel-level segmentation with low computational cost using multimodal data remains a key challenge in crack segmentation tasks. Existing methods lack the capability for adaptive perception and efficient interactive fusion of cross-modal features. To address these challenges, we propose a Lightweight Adaptive Cue-Aware vision Mamba network (LIDAR), which efficiently perceives and integrates morphological and textural cues from different modalities under multimodal crack scenarios, generating clear pixel-level crack segmentation maps. Specifically, LIDAR is composed of a Lightweight Adaptive Cue-Aware Visual State Space module (LacaVSS) and a Lightweight Dual Domain Dynamic Collaborative Fusion module (LD3CF). LacaVSS adaptively models crack cues through the proposed mask-guided Efficient Dynamic Guided Scanning Strategy (EDG-SS), while LD3CF leverages an Adaptive Frequency Domain Perceptron (AFDP) and a dual-pooling fusion strategy to effectively capture spatial and frequency-domain cues across modalities. Moreover, we design a Lightweight Dynamically Modulated Multi-Kernel convolution (LDMK) to perceive complex morphological structures with minimal computational overhead, replacing most convolutional operations in LIDAR. Experiments on three datasets demonstrate that our method outperforms other state-of-the-art (SOTA) methods. On the light-field depth dataset, our method achieves 0.8204 in F1 and 0.8465 in mIoU with only 5.35M parameters.
The CrackPolar, CrackDepth, and IRTCrack datasets used in this work can be downloaded from Multimodal_Crack_Dataset.
You can create a conda environment for LIDAR with the following commands:
conda create -n LIDAR python=3.9 -y
conda activate LIDAR
pip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 torchaudio==2.1.2+cu121
pip install -U openmim
mim install mmcv-full
pip install mamba-ssm==1.2.0

Before formal training, 10 rounds of pre-training are required, and the mask is generated from the pre-training weights file. First, change the value of the scan_list_json_path parameter in main.py to pretrain and run:
python main.py

After pre-training is complete, set the weight file path in inference_mask.py to the location of the pre-training weights file and run the following command to generate the mask:
python inference_mask.py

Once the mask has been generated, change the dataset path in the ./pre_scan/scan.py file to the location of the dataset to be pre-scanned, and run the following command to generate the JSON file that holds the pre-scan path:
python scan.py

Run the following command to check if the scan sequence was generated correctly:
python test_scan_json.py

Next, change the value of the scan_list_json_path parameter in main.py to the location of the pre-scanned JSON file, and run the following command for formal training:
python main.py

✍️✍️✍️Note:
- For pre-training, set the scan_list_json_path parameter in main.py to pretrain, and set inference_mask to True.
- For formal training, set scan_list_json_path to the path of the pre-scanned JSON file, and set inference_mask to False.
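The two configurations in the note can be summarized in a small sketch. This is only an illustration: main.py is not shown here, so the helper name stage_settings and the placeholder path below are assumptions, and only the parameter names scan_list_json_path and inference_mask and their values come from the note.

```python
from typing import Optional


def stage_settings(stage: str, scan_json: Optional[str] = None) -> dict:
    """Return the two settings named in the note for a training stage.

    Hypothetical helper: it mirrors the values stated in the note and does
    not reflect how main.py actually reads its parameters.
    """
    if stage == "pretrain":
        # Pre-training: no pre-scanned JSON exists yet.
        return {"scan_list_json_path": "pretrain", "inference_mask": True}
    if stage == "formal":
        if not scan_json:
            raise ValueError("formal training needs the path of the pre-scanned JSON file")
        # Formal training: point to the JSON file produced by scan.py.
        return {"scan_list_json_path": scan_json, "inference_mask": False}
    raise ValueError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    print(stage_settings("pretrain"))
    # "path/to/scan.json" is a placeholder, not a path from this repository.
    print(stage_settings("formal", scan_json="path/to/scan.json"))
```

Keeping both settings together makes it harder to start formal training while scan_list_json_path is still set to pretrain.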
After training, the weights file can be used for inference:
python test.py

Run the following commands to calculate the ODS, OIS, F1, and mIoU metrics:
cd eval
python evaluate.py

Run the following commands to calculate the Params and FLOPs metrics:
cd ..
python eval_compute.py

Visual comparison under dual-modal input:
Visual comparison under multimodal input:
Visual comparison under RGB single-modal input:
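For readers comparing predictions numerically as well as visually, the sketch below shows one common way to compute F1 and mIoU for a binary crack mask. It is not the code in eval/evaluate.py, whose exact definitions are not shown here. The function names are hypothetical, and the two-class mIoU (mean of crack and background IoU) is an assumed convention.

```python
from typing import Sequence, Tuple


def confusion(pred: Sequence[int], gt: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for flattened binary masks (1 = crack, 0 = background)."""
    if len(pred) != len(gt):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    tn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 0)
    return tp, fp, fn, tn


def f1_score(pred: Sequence[int], gt: Sequence[int]) -> float:
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when the denominator is zero."""
    tp, fp, fn, _ = confusion(pred, gt)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_iou(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Mean of crack IoU and background IoU (assumed two-class convention)."""
    tp, fp, fn, tn = confusion(pred, gt)
    iou_crack = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    iou_bg = tn / (tn + fp + fn) if (tn + fp + fn) else 0.0
    return (iou_crack + iou_bg) / 2


if __name__ == "__main__":
    # Toy 6-pixel masks, made up for illustration only.
    pred = [1, 1, 0, 0, 1, 0]
    gt = [1, 0, 0, 0, 1, 1]
    print(f1_score(pred, gt), mean_iou(pred, gt))
```

ODS and OIS additionally depend on how the binarization threshold is chosen (one threshold for the whole dataset versus one per image), which this sketch does not cover.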
This project is released under the Apache 2.0 license.
This work stands on the shoulders of the following open-source projects:
If you have any other questions, feel free to contact me at liuhui1109@stud.tjut.edu.cn or liuhui@ieee.org.





