Building in the LLM serving & agent space — inference, quantization, and a bit of agent RL.
vLLM ecosystem · now heading deeper into infra 🛠️
FP8 / INT8 quantization · efficient inference & serving · multi-agent systems · RLVR for small models

| Project | What it is |
|---|---|
| langextract-vllm | A vLLM provider plugin for LangExtract: run structured extraction on a local vLLM backend |
| claude-code-architecture | Deep reverse-engineering of the Claude Code CLI (v2.1.88) internals from sourcemaps |
| mobileground-r1 | A small-VLM phone-GUI grounding agent, trained with RLVR (GRPO) |
| vantage | AI Job Decision Copilot: scan, score, advise, decide |

Where I contribute upstream:
- Step-Audio2 integration into the vLLM-Omni serving stack
- Stable Audio 3 integration into the vLLM-Omni serving stack
📫 421774554@qq.com


