I build speech AI for Indian languages because voice reaches people across the literacy and language barriers that text never will. My focus is on-device, low-resource, and efficient: models that run on the phone in someone's hand, not just in a datacenter.

Background in production speech systems (TTS, ASR, speaker recognition) at scale, now spending my best hours on open source at the edge of audio + small models.
- On-device speech: TTS / ASR for Hindi, Bhojpuri, and the broader Indic family
- Apple Silicon + MLX: fine-tuning and inference where latency and cost actually matter
- Efficient architectures: SSM / Mamba, LoRA adaptation, KV-cache & inference optimization
- Low-resource adaptation: getting good speech models out of languages with little data

| Project | What it is |
|---|---|
| mlx-audio-train | LoRA fine-tuning for audio models on Apple Silicon; ships a Hindi LoRA adapter |
| prefill-decode-bench | MLX / CUDA profiler for prefill-vs-decode throughput |
| synthetic-dialog-gen | Synthetic dialogue data generation for speech & language models |
| spoken-number-disambiguation | Disambiguating spoken-form numbers: text normalization for speech |
| Hindi LoRA adapter | Fine-tuned Hindi adapter, up on Hugging Face |

When I'm not shipping models, I'm usually:
- hiking & treks: long days on the trail, the higher the better
- cycling & long motorbike rides
- swimming and lifting weights
- sports: football, badminton & cricket
- reading: currently working through Long Way Round, Into Thin Air, The Snow Leopard, and Our Mathematical Universe


