devhms/README.md


Ibrahim Salman · [devhms](https://github.com/devhms)

CS @ UET Taxila. I build production systems: obfuscation engines,
OCR pipelines, WhatsApp automation, RAG chatbots, offline STT apps.
Always shipping.


🔥 Flagship

☽ Nightshade: LLM Training-Data Poisoning Engine

Java, Python, JavaScript · SLSA L3 · Sigstore Cosign · OpenSSF Scorecard

The only open-source LLM anti-scraping tool for Java. Applies 8 adversarial strategies (variable scrambling, dead code injection, control flow flattening, watermark embedding, comment poisoning, Unicode homoglyphs, opaque predicates, string splitting) to source code before public release. The code compiles and runs identically but degrades LLM training quality if scraped.

  • Evades MinHash/LSH deduplication via low-entropy perturbation
  • SLSA Level 3 supply chain with Sigstore Cosign signatures, CodeQL, Dependabot
  • CLI + GitHub Action + pre-commit hook + JavaFX GUI
  • Full strategy pipeline: tokenizer → poisoner → verifier
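
To make the deduplication point concrete, here is a minimal sketch of the similarity measure that MinHash approximates: exact Jaccard similarity over token shingles. It is not Nightshade's code. The helper names (`shingles`, `jaccard`), the tokenizer regex, and the sample snippets are all assumptions chosen for illustration.

```python
import re


def shingles(source: str, k: int = 3) -> set:
    """Split source into crude tokens and return the set of k-token shingles."""
    # Hypothetical tokenizer: identifiers, integers, or any single symbol.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity; MinHash estimates this value from hash signatures."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Illustrative snippets: the second only renames two identifiers.
original = "int total = 0; for (int i = 0; i < n; i++) { total += values[i]; }"
renamed = "int q7 = 0; for (int z = 0; z < n; z++) { q7 += values[z]; }"

similarity = jaccard(shingles(original), shingles(renamed))
print(f"shingle similarity after renaming: {similarity:.2f}")
```

Renaming two identifiers changes most shingles that touch them, so the similarity drops well below typical near-duplicate thresholds even though the program behaves the same.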

2 stars · nightshade · mvn package

🚀 B.L.A.S.T. OCR: Production OCR Engine

Python, EasyOCR, Streamlit · 100% test coverage · Forensic audit

High-performance deterministic OCR with 3-Layer A.N.T. Architecture (Architect → Navigator → Tools). Extracts text from PDFs (Poppler), PPTX slides, and images with self-healing retries and exponential backoff.

  • 160+ tests, 100% branch coverage (CI-enforced, verified by coverage report)
  • Parallel threaded processing for multi-page documents
  • CLI + Streamlit GUI dashboard with job history, analytics, JSON output
  • Forensic audit: 17 critical vulnerabilities remediated (XXE, SQLi, thread isolation)
  • Configurable via JSON/YAML, designed for unattended batch runs
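
Retries with exponential backoff could look roughly like the sketch below. The function name `retry_with_backoff`, its parameters, and the jitter factor are assumptions for illustration, not the engine's actual API.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    task: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception with a doubling, capped delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Small random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.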

3 stars · OCR

🏛️ UET GPT: Campus RAG Chatbot for UET Taxila

TypeScript, Python, Next.js 15, Convex, Clerk, Vector Search

Authoritative AI assistant for UET Taxila: crawls campus info, syllabi, and official docs, chunks them into Convex vector search, and answers via an LLM with citations.

  • Custom async BFS crawler (curl_cffi, trafilatura) → chunkMarkdown → embeddings
  • Convex vector search + RRF reranking + anti-hallucination guardrails
  • Tiered rate limiting (anon/user/admin) via Clerk auth
  • 50-pair golden eval dataset with automated evaluation harness
  • Freshness cron job, Sentry error tracking, Neo Kinpaku design
  • 13 key Python + TypeScript crawlers for comprehensive campus data
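
As a rough illustration of RRF (reciprocal rank fusion), the sketch below merges several ranked lists of chunk IDs into one ordering. The function name, the constant `k = 60` (a commonly used default), and the idea of fusing a vector ranking with a keyword ranking are assumptions; the project's actual reranking inputs are not described here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse ranked lists of document IDs; each hit contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk IDs from two retrievers.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Chunks that rank well in both lists rise to the top, which is why RRF is often used when the retrievers' raw scores are not comparable.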

4 stars · uet_gpt


⚙️ Automation & Tools

📰 DailyNewsBot: AI News Intelligence

Python, Gemini AI, WhatsApp, Circuit Breaker, Smart Cache

Aggregates news from Google RSS, NewsAPI, GNews → deduplicates via cosine similarity clustering → Gemini AI summarizes → delivers formatted WhatsApp reports.

  • Thread-safe circuit breaker pattern (CLOSED → OPEN → HALF-OPEN)
  • Smart file-based cache reduces API calls by ~60%
  • Dashboard (Chart.js) with health reports, analytics, delivery tracking
  • Optional TTS video briefings (pyttsx3 + moviepy)
  • Exponential backoff retry, rate limiter, content scraper for full-article extraction
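
The CLOSED → OPEN → HALF-OPEN cycle can be sketched as a small lock-guarded state machine. This is a simplified illustration, not the bot's implementation: the class and parameter names are assumptions, and it does not limit HALF-OPEN to a single probing thread.

```python
import threading
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is OPEN."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self._failures = 0
        self._opened_at = 0.0
        self._lock = threading.Lock()

    def call(self, func, *args, **kwargs):
        with self._lock:
            if self.state == "OPEN":
                if self.clock() - self._opened_at < self.reset_timeout:
                    raise CircuitOpenError("circuit is OPEN; call skipped")
                # Timeout elapsed: allow a trial call.
                self.state = "HALF-OPEN"
        try:
            result = func(*args, **kwargs)
        except Exception:
            with self._lock:
                self._failures += 1
                # A failed trial call, or too many failures, opens the circuit.
                if self.state == "HALF-OPEN" or self._failures >= self.failure_threshold:
                    self.state = "OPEN"
                    self._opened_at = self.clock()
            raise
        with self._lock:
            # Any success resets the breaker.
            self._failures = 0
            self.state = "CLOSED"
        return result
```

Wrapping each upstream API call in `breaker.call(...)` lets the bot skip a failing provider quickly instead of waiting on repeated timeouts.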

DailyNewsBot · DailyAutomationSystem

📱 Sheet → WhatsApp Automation

Python, Selenium, Google Sheets API · Production, 4 bots

Production-grade automation bridging Google Sheets to WhatsApp. Runs daily for jamiat group operations:

  • Submission Bot: atomic JSON delivery journal, single-instance lock, dual persistent profiles
  • Ghost Hunter: cross-references sheet submissions against the member list and flags missing reports
  • Red Flag Scanner: anomaly detection on Salah records and discipline entries
  • Reminder Bot: scheduled reminders with phone-first delivery + search fallback

Idempotent runs, self-healing selectors, heartbeat telemetry, and failure screenshots; designed for unattended 24/7 operation.
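
One way to get an atomic JSON delivery journal with idempotent sends is to write to a temporary file and swap it into place with `os.replace`. The sketch below assumes a flat `{message_id: "sent"}` journal layout and hypothetical helper names; the bots' real journal format is not shown here.

```python
import json
import os
import tempfile
from pathlib import Path


def load_journal(path: Path) -> dict:
    """Return the journal contents, or an empty journal if none exists yet."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_journal_atomic(path: Path, journal: dict) -> None:
    """Write to a temp file in the same directory, then atomically swap it in."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(journal, tmp, indent=2, sort_keys=True)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)  # Readers see the old or new file, never a partial one.
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def deliver_once(path: Path, message_id: str, send) -> bool:
    """Send message_id unless the journal already records it; return True if sent."""
    journal = load_journal(path)
    if journal.get(message_id) == "sent":
        return False
    send(message_id)
    journal[message_id] = "sent"
    save_journal_atomic(path, journal)
    return True
```

A crash between `send` and the journal write can still cause one duplicate on the next run, so this sketch gives at-least-once delivery with duplicate suppression in the normal case.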

Sheet_to_whatsapp_automation

📡 News_Scrapper

Python, BeautifulSoup

Lightweight Dawn News scraper: fetches headlines, filters by topic, exports CSV.

News_Scrapper


🧪 Research & ML

✍️ Typed→Handwritten: PakE OCR Corpus Pipeline

Python, Bezier Splines, Augraphy, OCR Benchmarking

Academic-grade synthetic handwriting pipeline generating high-fidelity handwritten documents from digital text:

| Phase | What | Tech |
|---|---|---|
| 1 | PakE dialect modeling + stochastic typo injection | NLP augmentation, sociolinguistic corpus |
| 2 | Bezier spline motor-path synthesis + ink rendering | 15-feature enhancement (fatigue, rotation, bleed, pressure, snake drift, etc.) |
| 3 | Document aging + 3D photo-realism | Augraphy + ISO noise + sub-pixel jitter |
| 4 | Telemetry + OCR benchmark sweep | CER/WER with bootstrap confidence intervals |
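
For phase 4, character error rate with a percentile bootstrap interval could be computed along these lines. This is a minimal sketch: the function names, the resample count, and the percentile method are assumptions, not the pipeline's actual benchmark code.

```python
import random


def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(pairs) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    chars = sum(len(ref) for ref, _ in pairs)
    return errors / chars


def bootstrap_ci(pairs, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for CER, resampling (ref, hyp) pairs."""
    rng = random.Random(seed)
    stats = sorted(
        cer([rng.choice(pairs) for _ in pairs]) for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

WER follows the same pattern if each reference and hypothesis is split into words before calling `edit_distance` and the denominator counts reference words. The sketch assumes every reference is non-empty.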

Referenced research: DiffInk ICLR 2026, InkSpire ICLR 2026, DiffusionPen ECCV 2024.

Typed_to_handwritten


🖥️ Desktop & Creative

🎙️ Zuban / Handy: Offline Speech-to-Text

Rust, TypeScript, Tauri v2 · Whisper · Parakeet · Silero VAD

Cross-platform desktop app for fully offline speech transcription. Press a hotkey → speak → text appears in any application, with zero cloud dependency.

  • Whisper (Small/Medium/Turbo/Large) + GPU acceleration via CTranslate2
  • Parakeet V3: CPU-optimized with automatic language detection
  • Silero VAD for voice activity/silence filtering
  • Global hotkeys + system tray (Windows/macOS/Linux)
  • Managed fork of Handy with 10+ CI workflows, Playwright E2E, WER benchmarking

5 stars · Zuban

💻 Portfolio Sites

| Version | Stack | Highlights |
|---|---|---|
| v1 Portfolio | Next.js 14, R3F, GSAP, React Spring, Zustand, Lenis | 3D matcap scene, magnetic cursor physics, terminal CLI simulator, smooth scroll |
| v2 portfolio_final | Next.js 15, Framer Motion, Vercel AI SDK + Groq | AI assistant chat, next-themes, strict TypeScript, security headers, Turbopack |

🎬 Remotion Video: IJT Cinematic Promo

TypeScript, React, Remotion, FFmpeg

Programmatic video production for Islami Jamiat-e-Talaba: a cinematic promo with 14+ animated scenes, virtual camera, film grain, color grading, Pakistan map visualization, audio mixer, and sub-organization hierarchy.

video-project

🤖 AI SDK Starter

TypeScript, Next.js, Groq, shadcn/ui

Open-source AI chatbot template with streaming responses + tool integration (weather, components). One-click Vercel deploy. Forked from Vercel AI SDK.

ai-sdk-starter-groq


📊 GitHub

GitHub contribution grid snake animation

🛠️ Stack

| Domain | Tools |
|---|---|
| Languages | Java (Maven), Python, TypeScript, Rust, JavaScript, C++ |
| Security | SLSA, Sigstore Cosign, CodeQL, OpenSSF Scorecard, OWASP |
| AI/ML | Gemini, EasyOCR, Whisper, Parakeet, PyTorch, DiffInk |
| Infrastructure | Docker, GitHub Actions, Vercel, Convex, Clerk, Supabase |
| Desktop | Tauri v2, JavaFX, Streamlit, Selenium |
| Creative | Three.js, R3F, GSAP, Framer Motion, Remotion |

🎯 Currently

  • 🔭 Building Nightshade v2: expanding strategy count + WASM frontend for browser-side obfuscation
  • 📚 Publishing the PakE OCR Corpus paper: synthetic handwriting for Urdu/English low-resource OCR
  • 🌱 Deep-diving Rust systems programming + vector search internals
  • 👯 Open to collaboration on LLM security research and open-source OCR tooling
  • 💬 Ask me about: code obfuscation, RAG pipelines, WhatsApp automation, offline STT

"A computer can do it; I just have to tell it how."

Pinned

  1. nightshade (Public)

    ☽ LLM Training-Data Poisoning Engine: 8 adversarial strategies for Java/Python/JS source code. SLSA L3, Sigstore Cosign, OpenSSF Scorecard, CLI + GitHub Action + JavaFX GUI. Evades MinHash dedup.

    Java 2

  2. Zuban (Public)

    🎙️ Handy: offline speech-to-text desktop app (Tauri v2). Press hotkey → speak → text anywhere. Whisper + Parakeet V3 models, Silero VAD. Zero cloud. Cross-platform.

    TypeScript

  3. OCR (Public)

    B.L.A.S.T. OCR Engine: 100% test coverage, 3-Layer A.N.T. Architecture, self-healing retries. Extracts text from PDF/PPTX/images via Python + EasyOCR with Streamlit GUI dashboard.

    Python

  4. portfolio_final (Public)

    💻 AI-powered portfolio v2: Next.js 15 + Framer Motion + Vercel AI SDK (Groq). AI assistant chat, next-themes dark mode, Turbopack, strict TypeScript, security headers.

    TypeScript

  5. News_Scrapper (Public)

    📡 Lightweight Dawn News scraper: BeautifulSoup headline extraction with topic filtering, CSV export.

  6. uet_gpt (Public)

    📚 UET Taxila Campus RAG Chatbot: Custom Convex vector search pipeline crawling syllabi + campus data. Next.js 15 + Clerk auth + Gemini AI. 50-pair eval dataset, Neo Kinpaku design.

    TypeScript 1