notwitcheer/README.md

AI Practitioner & Data-Driven Growth Specialist

building local LLM infrastructure, benchmarking models, publishing results

Twitter HuggingFace Telegram


What I Do

I build local LLM inference stacks from source on consumer hardware, benchmark models systematically, and publish datasets on HuggingFace. I also build analytics dashboards and have scaled a tech community to 20,000+ members.

Current Focus:

  • local inference optimisation (llama.cpp, CUDA, ...)
  • systematic benchmarks across dense, MoE, and hybrid architectures
  • quantisation testing (GGUF Q4_K_M, IQ4_XS, turboquant turbo2/turbo3)
  • context window scaling analysis and VRAM profiling
  • publishing benchmark datasets on HuggingFace
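
As a rough illustration of the context-scaling and VRAM items above: in a standard attention transformer, the KV cache grows linearly with context length. The sketch below is a back-of-the-envelope estimate only. The function name and the model dimensions (32 layers, 8 KV heads, head size 128, 16-bit cache) are hypothetical and are not taken from any benchmark listed here. It ignores weights, activations, and runtime overhead.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Approximate KV cache size in bytes for a standard attention transformer.

    Assumes one K and one V tensor per layer, each of shape
    (n_ctx, n_kv_heads, head_dim), stored at bytes_per_elem per value.
    """
    # Factor 2 accounts for the separate key and value tensors.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    for n_ctx in (4096, 16384, 65536):
        gib = kv_cache_bytes(32, 8, 128, n_ctx) / 2**30
        print(f"n_ctx={n_ctx:>6}: {gib:.2f} GiB")
```

Measured VRAM will usually differ from this estimate, since quantised caches, hybrid architectures, and allocator overhead all change the picture, which is why profiling on the actual hardware matters.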

🛠️ Tech Stack

AI / ML

CUDA llama.cpp HuggingFace Python

data & analytics

Dune Analytics SQL

frontend

React Next.js TypeScript

infra & tools

Linux Bash Git


📊 Background

  • AI / ML Practitioner - local LLM inference, model evaluation, HuggingFace contributor
  • Growth Lead @ Yari Finance - DeFi protocol growth, partnerships, on-chain analytics
  • Founder @ BeraLand - built a 20K+ member blockchain community from zero
  • 15+ Dune dashboards tracking $1B+ in trading volume
  • Master's in Corporate & Market Finance - KPMG background

🎓 Learning Journey

Boot.dev Profile


I write about AI infrastructure, local inference, and model evaluation on 𝕏

Pinned

  1. llm-bench-rig (Public)

    Dual-engine (llama.cpp + vLLM) LLM benchmarking pipeline for GGUF & safetensors on NVIDIA GPUs: speed, quality, live dashboard, publishable cards.

    Python, 9 stars, 2 forks