⚡ Harsha G M

Systems Engineer | Edge AI · Agentic RAG · Distributed Autonomous Infrastructure


I design and build deterministic software architectures for resource-constrained systems. My work focuses on pulling heavy AI workloads out of the cloud and optimizing them to run efficiently on bare-metal and ARM edge infrastructure.


🧠 About Me

With a background rooted in mechanical engineering and a career built shipping production robotics R&D in Bengaluru, I treat software as an extension of physical constraints. I write production-grade Python and systems automation that accounts for thermal throttling, memory leaks, I/O bottlenecks, and network failures before a single line of code goes live.


πŸ›οΈ Core Systems & Projects

1️⃣ NEXUS v3 β€” Stateful Code Generation Framework

A deterministic, graph-based compiler that converts arbitrary system requirements into fully audited, deployable software assets.

  • Fractal Decomposition: Avoided sequential prompting pitfalls by building a stateful, cyclic execution graph that recursively splits monolithic code requirements down to 4 isolation layers.
  • Context Optimization: Engineered a 3-tier token budget manager (Manifest → Contract Surface → Git Diffs), resulting in a 7x reduction in token overhead.
  • Static & Semantic Verification: Built pie.py (Production Intuition Engine) to run deterministic AST parsing alongside an LLM-as-a-judge node, enforcing over 10,000 structural engineering constraints.
  • Crash-Safe Persistence: Wrote an asynchronous checkpoint engine that commits runtime state to local disk, surviving mid-run API failures without data loss.

Stack: Python LangGraph FastAPI Pydantic v2 SQLite
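
The crash-safe persistence idea above can be illustrated with a minimal sketch. This is not the NEXUS checkpoint engine itself: the function names (`save_checkpoint`, `load_checkpoint`), the JSON file format, and the write-then-rename strategy are all assumptions chosen for illustration. The state is written to a temporary file, flushed to disk, and then renamed over the target, so a crash mid-write leaves the previous checkpoint intact.

```python
import asyncio
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def _write_atomic(path: Path, state: dict) -> None:
    """Write state to a temp file in the same directory, fsync, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the bytes to disk before renaming
        os.replace(tmp_name, path)  # atomic swap: readers see old or new, never partial
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


async def save_checkpoint(path: Path, state: dict) -> None:
    """Hypothetical async entry point: runs the blocking write off the event loop."""
    await asyncio.to_thread(_write_atomic, path, state)


def load_checkpoint(path: Path) -> Optional[dict]:
    """Return the last committed state, or None if no checkpoint exists yet."""
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

On restart, a runner would call `load_checkpoint` first and resume from the returned state if it is not `None`.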

2️⃣ OmniRAG: Layout-Aware Document Intelligence Pipeline

A production-grade document synthesis engine designed to process non-textual, high-density corporate data structures natively.

  • Visual-Spatial Ingestion: Utilized ColPali vision-language embeddings to parse complex structural assets (tables, schematics, charts) without lossy OCR steps.
  • Hybrid Indexing: Configured custom Qdrant vector spaces optimized for hybrid keyword and dense-spatial searches.
  • Deterministic Citations: Enforced strict, metadata-validated tracing layers ([Source: filename, Page: N]) to prevent hallucinated data injection.
  • Fault-Tolerant Networking: Designed a KeyRotator abstraction layer to absorb aggressive provider rate-limiting and handle horizontal network failovers seamlessly.

Stack: ColPali Qdrant Groq API FastAPI Docker
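
A key-rotation layer like the one described above could look roughly like the sketch below. Only the name `KeyRotator` comes from this README; the method names (`acquire`, `report_rate_limited`), the round-robin order, and the fixed cooldown are assumptions for illustration, not the OmniRAG implementation.

```python
import time
from typing import Callable, Dict, List


class KeyRotator:
    """Round-robin over API keys, skipping any key that is cooling down."""

    def __init__(
        self,
        keys: List[str],
        cooldown_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._cooldown_s = cooldown_s
        self._clock = clock  # injectable so the rotation can be tested without sleeping
        self._blocked_until: Dict[str, float] = {}
        self._next = 0

    def acquire(self) -> str:
        """Return the next usable key, or raise if every key is rate-limited."""
        now = self._clock()
        for _ in range(len(self._keys)):
            key = self._keys[self._next]
            self._next = (self._next + 1) % len(self._keys)
            if self._blocked_until.get(key, 0.0) <= now:
                return key
        raise RuntimeError("all API keys are cooling down")

    def report_rate_limited(self, key: str) -> None:
        """Call this when the provider rejects a request made with this key."""
        self._blocked_until[key] = self._clock() + self._cooldown_s
```

A caller would wrap each provider request: `acquire()` a key, send the request, and call `report_rate_limited(key)` before retrying if the provider signals a rate limit.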

3️⃣ Monk OS: Offline Edge Perception Runtime

A cloud-independent, low-latency execution runtime built explicitly for ARM-based embedded compute platforms.

  • Inference Latency Hardening: Quantized vision models to INT8 and compiled them for on-device execution providers, dropping perception-to-action overhead to <100 ms on a Raspberry Pi 5.
  • Asynchronous Pipelines: Deployed local audio capture, Whisper processing, local LLM evaluation, and TTS into independent, zero-copy memory pipelines.
  • Low-Latency Telemetry: Re-engineered spatial transmission using WebRTC data channels with sub-15ms system command latency.
  • Enterprise Hardening: Secured systems via RSA-signed OTA updates, mutual TLS tunnels, and a local SQLCipher storage layer.

Stack: Python TFLite YOLOv8 WebRTC ROS2 OpenCV

4️⃣ DARWIN: Low-Overhead Kernel Telemetry Pipeline

High-throughput system monitoring built at the kernel layer for minimal performance degradation.

  • Kernel Instrumentation: Deployed custom eBPF probes directly into the kernel data path to log system events with minimal overhead.
  • Distributed Logging: Directed asynchronous event packets through an Apache Kafka streaming pipeline into an optimized InfluxDB time-series backend.
  • State Tracking: Designed dynamic Grafana dashboards paired with automated webhook alerting to identify anomalous system load spikes instantly.

Stack: eBPF Apache Kafka InfluxDB Grafana


πŸ› οΈ Production Tech Stack

Operational Architecture

Python Shell Linux Docker Kubernetes

AI, Agents & Orchestration

LangGraph LangChain LlamaIndex MLflow

Edge Vision & Compute

TFLite YOLO ONNX OpenCV ARM

Data & Vector Systems

Qdrant ChromaDB Kafka SQL


📊 Proven Production Impact

  • ⚡ Zero-Cloud Compute Dependency: Moved heavy vision pipelines from expensive x86 cloud clusters down to on-device ARM edge chips, hitting a hard sub-100ms processing threshold.
  • 📉 30% Reduction in Telemetry Latency: Replaced legacy, synchronous polling protocols with event-driven WebSocket and WebRTC pipelines to minimize transport-layer overhead.
  • 🔄 40% Faster Prototyping Lifecycles: Bridged the communication gap between mechanical CAD pipelines and ML lifecycles by designing software abstractions that account for real-world mechanical constraints.
  • 📈 25% Uptime Improvement: Designed robust, failure-tolerant Python scripts with strict process isolation, preventing cascading hardware crashes in real-world deployments.

πŸ“ Pragmatic Engineering Philosophies

"I don't look at machine learning as a magical black box; I treat an LLM or neural network as a highly non-linear, stochastic software module that requires the same strict testing, validation, and cost constraints as any legacy database or compiler."

  • Compute is Never Free: An elegant architecture is defined by how small a model it needs to solve a problem deterministically, not how many parameters it can throw at it.
  • Fail Gracefully at the Boundary: When a sensor fails, a camera drops a frame, or an external API times out, the system should degrade safely, not throw an unhandled exception and crash the entire machine.
  • Hardware and Software are Married: Building software without understanding memory maps, CPU architectures, cache lines, or thermal thresholds is how bad code makes great hardware look slow.
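
As a small illustration of the "fail gracefully at the boundary" principle, here is a minimal sketch. The class name `GuardedSource`, the `max_stale_reads` parameter, and the "reuse the last good value, then give up" policy are assumptions for illustration, not code from any project listed here. The wrapper catches failures at the I/O edge, serves the last good reading for a bounded number of attempts, and then returns `None` so the caller can enter a safe state.

```python
import logging
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger("boundary")


class GuardedSource(Generic[T]):
    """Wraps a flaky read (sensor, camera, remote API) and degrades instead of crashing."""

    def __init__(self, read: Callable[[], T], max_stale_reads: int = 3) -> None:
        self._read = read
        self._max_stale = max_stale_reads
        self._last_good: Optional[T] = None
        self._stale = 0  # consecutive failed reads

    def get(self) -> Optional[T]:
        try:
            value = self._read()
        except Exception as exc:  # the boundary: nothing propagates past this point
            self._stale += 1
            log.warning("read failed (%d in a row): %s", self._stale, exc)
            if self._stale > self._max_stale:
                return None  # data too old to trust; caller should enter a safe state
            return self._last_good
        self._last_good = value
        self._stale = 0
        return value
```

The bound on stale reads matters: reusing an old frame for a few cycles is usually harmless, but serving it indefinitely would hide a dead sensor.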

📌 Availability & Networking

  • 🔭 Current R&D: Building asynchronous, stateful agent networks and local Edge AI perception graphs.
  • 🌱 Current Deep-Dive: Advanced weight distillation, structural quantization-aware training, and model fine-tuning with LoRA/QLoRA.
  • 🤝 Open To: Principal, Lead, or Senior Systems Engineering roles within Edge AI, Agentic Systems Design, or Intelligent Robotics Infrastructure.

Equipped for global asynchronous remote engineering.

Pinned

  1. AWS_locally Public

    The definitive guide to local AWS development. Features a lightning-fast Floci emulator setup, IaC with CloudFormation, Lambda functions, and comprehensive bash scripts for 13 AWS services.

    Shell

  2. darwin-system Public

    🛡️ DARWIN: A high-performance, eBPF-powered security telemetry pipeline. Captures kernel-level execve syscalls, streams via Apache Kafka, stores in InfluxDB, and visualizes real-time threats in Gra…

    C

  3. Multi_agent_workflow Public

    Fractal multi-agent code generation pipeline: decomposes problems via LLM, verifies output through LLM-as-judge and a production-readiness rules engine.

    Python

  4. omnirag-multimodal-rag Public

    Production-grade Multimodal RAG pipeline: ingests PDFs, images & tables with ColPali vision embeddings, Qdrant vector search, and citation-tracked FastAPI serving.

    Python

  5. microsoft/markitdown Public

    Python tool for converting files and office documents to Markdown.

    Python 159k 11.1k

  6. facebook/prophet Public

    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

    Python 20.3k 4.6k