A monorepo of Python packages for CAD document processing, neural networks for 3D geometry, geometric tokenization, optical CAD recognition, and generative CAD modeling.
📚 Documentation: https://latticelabsai.github.io/ll_toolkit/ — per-package guides, tutorials, concepts, and a generated API reference. Source lives in site/.
| Package | Path | Description |
|---|---|---|
| cadling | cadling/ | CAD document processing toolkit (docling-inspired). Multi-format parsing (STEP, STL, BRep, IGES), topology analysis, RAG-ready chunking, and synthetic data generation. |
| ll-stepnet | ll_stepnet/ | Neural network package for STEP/B-Rep CAD files. Tokenization, feature extraction, topology encoding, and task-specific models. |
| geotoken | geotoken/ | Geometric tokenizer with adaptive quantization for CAD and mesh data. Mesh, parametric, and topology-level tokenization. |
| ll-ocadr | ll_ocadr/ | Optical CAD Recognition. DeepSeek-OCR-inspired 3D geometry processing for LLMs (tiled chunks + global context). HF-native inference (run_ll_ocadr_hf.py); vLLM serving is experimental/future. |
| ll-gen | ll_gen/ | Generation orchestration: neural propose, deterministic dispose in a kernel sandbox. Ships DeepCAD-trained, program-based generators (native MLX) that produce measured-valid CAD: an autoregressive command generator (validity 0.914) and a latent diffusion (sampled-z 0.934), measured through the real kernel. Includes the propose/dispose sandbox + REINFORCE alignment loop (python -m ll_gen.training.run); the legacy VAE/VQ-VAE orchestrator generators ship untrained. |
| ll-clouds | ll_clouds/ | Point-cloud processing & analysis (NumPy/SciPy): PLY/PCD/XYZ I/O + mesh sampling, normalize/voxel/FPS/outlier preprocessing, normals & curvature, ICP registration, RANSAC/Euclidean segmentation, lazy cadling/ll_ocadr bridges. |
| ll-brepnet | ll_brepnet/ | B-Rep face-segmentation network (UV-Net / BRepNet lineage) over coedge topology + UV-grid geometry. Trained on the Fusion 360 Segmentation split (test mIoU 0.828) with a parity-verified native-MLX port. MIT-licensed, built on cadling's B-Rep machinery (no BRepNet/UV-Net code; see ATTRIBUTION.md). |
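
The ll-clouds row above lists FPS (farthest point sampling) among its preprocessing steps. As a rough illustration of the technique only, here is a minimal pure-Python sketch; it is not the ll_clouds implementation, and the function name `farthest_point_sampling` and its `seed` parameter are assumptions made for this example. A NumPy version would vectorize the distance update.

```python
import math
import random


def farthest_point_sampling(points, k, seed=0):
    """Return indices of k points chosen greedily to be mutually far apart.

    Hypothetical helper for illustration; not part of the ll_clouds API.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = random.Random(seed)
    first = rng.randrange(len(points))  # arbitrary starting point
    selected = [first]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # Pick the point that is farthest from everything selected so far.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

Calling `farthest_point_sampling(points, 1024)` on a list of `(x, y, z)` tuples returns 1024 indices that cover the cloud more evenly than uniform random sampling, at O(n·k) cost.
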
- Python 3.9 - 3.12
- Conda (Miniconda or Miniforge recommended)
PyTorch must be installed via conda-forge (not pip) to avoid OpenMP library conflicts on macOS.
# Clone the repository
git clone https://github.com/LatticeLabsAI/ll_toolkit.git
cd ll_toolkit
# Create the conda environment (installs PyTorch, pythonocc, and all packages)
conda env create -f environment.yml
conda activate cadling

The environment installs cadling, ll_stepnet, and geotoken as editable packages. To install individual packages manually:
pip install -e ./cadling # CAD document processing
pip install -e ./ll_stepnet # STEP/BRep neural networks
pip install -e ./geotoken # Geometric tokenizer
pip install -e ./ll_ocadr # Optical CAD recognition
pip install -e ./ll_gen # Generation orchestration

pip install -e ".[dev]" # Testing, linting, docs
pip install -e ".[cad]" # CAD processing (trimesh, networkx, numpy-stl)
pip install -e ".[ml]" # ML (transformers, accelerate, einops)
pip install -e ".[vision]" # Vision (opencv, easyocr, matplotlib)
pip install -e ".[hub]" # HuggingFace Hub integration
pip install -e ".[drawings]" # 2D drawings (DXF, PDF)
pip install -e ".[all]" # Everything

# Convert a CAD file to JSON or Markdown
cadling convert model.step --format json
# Chunk a CAD file for RAG
cadling chunk model.step
# Generate synthetic Q&A pairs
cadling generate-qa model.step

from cadling.backend.document_converter import DocumentConverter
converter = DocumentConverter()
result = converter.convert("model.step")

from geotoken import GeoTokenizer
tokenizer = GeoTokenizer()
tokens = tokenizer.tokenize(mesh)

from stepnet.encoder import StepNetEncoder
encoder = StepNetEncoder()
embeddings = encoder.encode(step_data)

Input CAD File (STEP / STL / BRep / IGES)
|
v
cadling: DocumentConverter
-> Format Detection -> Backend Selection
-> Backend (format-specific parsing)
-> Pipeline (Build -> Assemble -> Enrich)
-> CADlingDocument
|
+-> Chunking (RAG) # cadling.chunker
+-> SDG (Q&A pairs) # cadling.sdg
+-> Export (JSON / Markdown)
|
+-> geotoken: tokenize geometry for neural models
+-> ll_stepnet: neural STEP/BRep processing
+-> ll_ocadr: optical CAD recognition
+-> ll_gen: generative CAD modeling
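
The first stages of the flow above (format detection, backend selection, then an ordered pipeline) can be sketched in a few lines. This is an illustrative sketch only: the extension table, `detect_format`, `convert`, and the stub backend and stages are all hypothetical names invented here, and the real cadling `DocumentConverter` may detect formats differently (for example by inspecting file contents).

```python
from pathlib import Path

# Hypothetical extension table; not taken from cadling.
EXTENSION_TO_FORMAT = {
    ".step": "STEP",
    ".stp": "STEP",
    ".stl": "STL",
    ".brep": "BRep",
    ".iges": "IGES",
    ".igs": "IGES",
}


def detect_format(path):
    """Map a file path to a format label using its (case-insensitive) suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        raise ValueError(f"unsupported CAD format: {suffix!r}")
    return EXTENSION_TO_FORMAT[suffix]


def convert(path, backends, stages):
    """Detect the format, pick a backend, parse, then run the stages in order."""
    fmt = detect_format(path)  # Format Detection
    if fmt not in backends:  # Backend Selection
        raise ValueError(f"no backend registered for {fmt}")
    document = backends[fmt](path)  # Backend (format-specific parsing)
    for stage in stages:  # Pipeline (Build -> Assemble -> Enrich)
        document = stage(document)
    return document


def parse_step_stub(path):
    """Stand-in backend: returns a plain dict instead of a real document."""
    return {"source": path, "format": "STEP", "stages": []}


def stage(name):
    """Build a stub stage that records its name on the document."""
    return lambda doc: {**doc, "stages": doc["stages"] + [name]}


doc = convert(
    "model.step",
    backends={"STEP": parse_step_stub},
    stages=[stage("build"), stage("assemble"), stage("enrich")],
)
print(doc["stages"])
```

Keeping the backends in a dictionary keyed by format means a new format needs only a new entry, and the pipeline stages stay independent of how the file was parsed.
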
# All packages (from repo root)
pytest
# Individual packages (run each command from the repo root)
cd cadling && pytest tests/unit/ -v
cd ll_stepnet && pytest tests/ -v
cd geotoken && pytest tests/ -v

ruff check .
black .
mypy cadling/cadling ll_stepnet/stepnet geotoken/geotoken

pytest -m "not slow" # Skip slow tests
pytest -m "not requires_gpu" # Skip GPU tests
pytest -m "not requires_pythonocc" # Skip pythonocc tests
pytest -n auto # Parallel execution (requires pytest-xdist)

ll_toolkit/
cadling/ # CAD document processing toolkit
cadling/ # Python package
tests/ # Tests
ll_stepnet/ # Neural networks for STEP/BRep
stepnet/ # Python package
tests/
geotoken/ # Geometric tokenizer
geotoken/ # Python package
tests/
ll_ocadr/ # Optical CAD recognition
tests/
ll_gen/ # Generation orchestration
ll_gen/ # Python package
mlx/ # native-MLX trained generators
tests/
ll_clouds/ # Point-cloud processing & analysis
ll_clouds/ # Python package
tests/
ll_brepnet/ # B-Rep face-segmentation network (UV-Net/BRepNet lineage)
ll_brepnet/ # Python package
mlx/ # native-MLX trainer
tests/
docs/ # Research docs and plans
pyproject.toml # Root config (tooling, shared deps)
environment.yml # Conda environment definition
License: MIT