symonbaikov/ai-super-system

Super Parser AI

Real-time Solana intelligence platform — discover, analyze, and act on token opportunities before the market does.

Super Parser AI is a production-grade trading intelligence system that fuses social signals, on-chain whale activity, exchange listings, and an autonomous AI reasoning pipeline into a single live operations dashboard. The platform monitors thousands of social streams and on-chain events per minute, scores each candidate token through a triple-LLM decision engine, and surfaces actionable signals — with full audit trails, risk assessment, and optional automated execution.


What It Does

Super Parser AI watches the Solana ecosystem in real time and answers a single question continuously:

Which tokens are about to move, why, and how risky are they?

It does this by ingesting heterogeneous signals — Twitter/X mentions, Telegram chatter, Apify-scraped narratives, Helius on-chain webhooks, PumpFun launches, CEX listing announcements, and whale wallet movements — and routing them through a multi-stage AI pipeline that produces ranked, risk-scored, explainable trading recommendations. Operators see everything live on a streaming dashboard; downstream automation can execute trades when confidence and risk thresholds align.

Key Capabilities

| Domain | Capability |
| --- | --- |
| Signal Ingestion | Multi-source intake: Apify webhooks, Helius webhooks, custom HTTP/Browser parsers, PumpFun monitor, Twitter hype detector, CEX radar |
| AI Reasoning | Triple-stage pipeline (Scout → Analyst → Judge) over Groq + Gemini with configurable rules, risk policies, and influencer tier weighting |
| Whale Tracking | Three pluggable whale-scanner generations combining Nansen, Birdeye, Solscan, DexScreener, Chainbase, and Helius/QuickNode RPC |
| Risk Engine | HMAC-verified webhooks, Rugcheck/SolSniffer security adapters, configurable thresholds, severity tiers (info / warn / critical) |
| Real-time UX | Server-Sent Events stream every signal, alert, trade simulation, and whale event to the React control center with sub-second latency |
| Automation | Auto-trade simulator triggered when risk = info; manual confirmation flow for live execution |
| Operations | Prometheus metrics, structured logs, BullMQ queue introspection, alert acknowledgement, audit-friendly Postgres schema |
| Extensibility | Pluggable adapters for whales, security scoring, parsers; JSON-driven AI config (platforms, accounts, rules, risk policy) |

System Architecture

                                ┌──────────────────────────────────────┐
                                │            React Control Center       │
                                │     (Vite • Context API • EventSource) │
                                └──────────────┬───────────────────────┘
                                               │ HTTPS + SSE
                                               ▼
                            ┌───────────────────────────────────────┐
                            │            Caddy Reverse Proxy         │
                            │  app.localhost • api.localhost • ai.* │
                            └───────┬───────────────┬───────────────┘
                                    │               │
              ┌─────────────────────┘               └────────────────────┐
              ▼                                                          ▼
   ┌────────────────────┐    enqueue jobs    ┌──────────────────────────────┐
   │  FastAPI Gateway   │ ─────────────────▶ │      Node.js Worker Mesh      │
   │   (Python 3.11)    │                    │   BullMQ • 12+ Job Handlers   │
   │                    │ ◀───── results ─── │   SSE bus • Metrics • Adapters│
   │  • REST API        │                    │                              │
   │  • Webhook intake  │                    │  parser-run / apify-dataset  │
   │  • HMAC verify     │                    │  helius-events / pumpfun     │
   │  • SSE relay       │                    │  whales-scan v1/v2/v3        │
   │  • Auth + Config   │                    │  twitter-hype / cex-radar    │
   └─────┬────────┬─────┘                    │  security-validation         │
         │        │                          │  trading-executor            │
         │        │                          └──────┬──────────────┬────────┘
         │        │                                 │              │
         │        ▼                                 ▼              ▼
         │  ┌──────────────┐                ┌──────────────┐  ┌────────────┐
         │  │  AI Core     │ ◀── advice ─── │   Redis +    │  │  External  │
         │  │  (FastAPI)   │                │   BullMQ     │  │  Providers │
         │  │              │                └──────────────┘  │            │
         │  │ Scout        │                                  │  Groq      │
         │  │   ↓          │                                  │  Gemini    │
         │  │ Analyst      │                                  │  Helius    │
         │  │   ↓          │                                  │  QuickNode │
         │  │ Judge        │                                  │  Apify     │
         │  │              │                                  │  Nansen    │
         │  │ Groq+Gemini  │                                  │  Birdeye   │
         │  └──────────────┘                                  │  Solscan   │
         │                                                    │  Rugcheck  │
         ▼                                                    └────────────┘
   ┌──────────────┐
   │  PostgreSQL  │   candidates • alerts • pumpfun_tokens • settings • logs
   └──────────────┘

Architectural Principles

  • Decoupled ingestion and processing. The FastAPI gateway accepts and verifies traffic; all heavy work runs asynchronously on a horizontally scalable Node worker fleet.
  • Streaming-first UX. Every state change is emitted on a single SSE bus, so the dashboard, automation, and external integrations share one source of truth.
  • AI as a service. The AI Core is an isolated microservice with its own configuration, models, and lifecycle — replaceable without touching ingestion code.
  • Provider redundancy. Critical capabilities (whale tracking, AI inference, RPC) have multiple adapter implementations selectable via environment flags.
  • Auditable by default. Postgres stores every candidate, alert, and decision with structured payloads for forensic review.

Data Flows

1. Parser Job (manual or scheduled)

Operator / Scheduler
  → POST /api/parser/run
  → FastAPI persists Candidate (status=queued)
  → BullMQ enqueue (sp-worker:parser-run)
  → Worker: symbol resolution → security check → AI advice (Scout/Analyst/Judge)
  → SSE emit `parser_job`
  → If risk=info → POST /api/trade/confirm → SSE emit `trade_simulated`
  → Candidate updated (status=completed)
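
The auto-trade branch of this flow can be sketched as a small gate. This is a minimal Python illustration, not the project's code: the helper names `trading_gate_open` and `should_auto_simulate` are hypothetical, and treating an unset variable as "blocked" is an assumption. Only the `risk=info` condition and the `TRADING_ENABLED` / `TRADE_KILL_SWITCH` variable names come from this README.

```python
from typing import Mapping


def trading_gate_open(env: Mapping[str, str]) -> bool:
    """Fail closed: open only on explicit TRADING_ENABLED=true and TRADE_KILL_SWITCH=false."""
    enabled = env.get("TRADING_ENABLED", "").strip().lower() == "true"
    # Assumption: a missing or malformed kill-switch value counts as "engaged".
    kill_switch_off = env.get("TRADE_KILL_SWITCH", "").strip().lower() == "false"
    return enabled and kill_switch_off


def should_auto_simulate(risk: str, env: Mapping[str, str]) -> bool:
    """Only the lowest severity tier ("info") may trigger an automatic simulation."""
    return risk == "info" and trading_gate_open(env)
```

Calling `should_auto_simulate("warn", env)` returns `False` regardless of the environment, which mirrors the rule that warn and critical results wait for a human.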

2. Apify Social Ingestion

Apify Actor finishes scrape
  → POST /api/apify/callback (HMAC-SHA256 signed)
  → FastAPI verifies signature, persists Candidate, enqueues `apify-dataset`
  → Worker: parse dataset → score hype → AI analysis → create Alert
  → SSE emit `social_signal`
  → Frontend updates AlertsPanel + EventsTable in real time

3. Helius On-Chain Webhook

Helius detects transfer/swap
  → POST /api/helius/webhook (HMAC signed)
  → FastAPI verifies, enqueues `helius-events`
  → Worker: classify transfer → compare against HELIUS_HIGH_VALUE_SOL
  → If whale → enrich via Nansen/Birdeye/Solscan → create critical Alert
  → SSE emit `helius_event` + `whale_alert`
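
The whale check in this flow amounts to a threshold comparison. The sketch below is a Python illustration only (the actual worker is Node.js). The `classify_transfer` helper, the lamports input, and the inclusive `>=` comparison are assumptions; the `HELIUS_HIGH_VALUE_SOL` variable and its sample value of 500 come from the configuration section.

```python
import os
from typing import Optional

LAMPORTS_PER_SOL = 1_000_000_000  # 1 SOL = 10^9 lamports


def classify_transfer(lamports: int, threshold_sol: Optional[float] = None) -> str:
    """Label a transfer "whale" when it meets the high-value threshold (hypothetical helper)."""
    if threshold_sol is None:
        # Fall back to the environment, then to the sample value shown in this README.
        threshold_sol = float(os.environ.get("HELIUS_HIGH_VALUE_SOL", "500"))
    amount_sol = lamports / LAMPORTS_PER_SOL
    # Assumption: the threshold is inclusive.
    return "whale" if amount_sol >= threshold_sol else "normal"
```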

4. Continuous Monitors

Background BullMQ workers run on intervals:

  • PumpFun monitor → new mints → security validation → watchlist match → alert
  • Twitter hype → keyword/influencer scans → candidate creation
  • Whales scan v3 → wallet snapshots → diff detection → enriched alerts
  • CEX radar → exchange listing pages → new listing detection

Component Reference

FastAPI Gateway (backend/api/)

Async Python 3.11 service. Lifespan-managed startup wires PostgreSQL (asyncpg + SQLAlchemy 2.x async), Redis/BullMQ producer, and HTTP clients. Routes are organized by domain (parser, apify, helius, signals, whales, pumpfun, trade, advice, health). Pydantic Settings drives configuration. Webhook routes verify HMAC-SHA256 signatures against raw bytes before dispatch.

Worker Mesh (backend/src/)

Node.js 20 + BullMQ. Each handler is an idempotent async function with retry/backoff configured per queue. The worker exposes:

  • HTTP control plane on port 8811 (status, queue introspection, manual triggers)
  • SSE bus on /stream for the frontend
  • Prometheus metrics on port 9110

Adapters live under src/adapters/{whales,security,...} and are selected at runtime via env flags (WHALE_SCANNER_VERSION, etc.). The signals/engine.js module computes RSI, EMA, and Bollinger Bands for technical confirmation alongside AI verdicts.
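
For readers unfamiliar with the three indicators, here is a minimal Python sketch of textbook definitions. It is not a port of signals/engine.js: the function names, default periods, the EMA seeding, and the simple-average RSI variant are all assumptions chosen for brevity.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def ema(prices: Sequence[float], period: int) -> float:
    """Exponential moving average, seeded with the first price."""
    alpha = 2 / (period + 1)
    value = prices[0]
    for price in prices[1:]:
        value = alpha * price + (1 - alpha) * value
    return value


def rsi(prices: Sequence[float], period: int = 14) -> float:
    """Relative Strength Index using simple averages over the last `period` changes."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    window = prices[-(period + 1):]
    changes = [window[i + 1] - window[i] for i in range(period)]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100 - 100 / (1 + avg_gain / avg_loss)


def bollinger(prices: Sequence[float], period: int = 20, width: float = 2.0) -> Tuple[float, float, float]:
    """Return (lower, middle, upper) bands from the last `period` prices."""
    window = prices[-period:]
    middle = mean(window)
    spread = width * pstdev(window)  # population standard deviation
    return middle - spread, middle, middle + spread
```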

AI Core (ai_core/)

Standalone FastAPI service implementing the Triple-AI pipeline:

  1. Scout — gathers context, normalizes metadata, identifies platform/influencer tiers
  2. Analyst — produces structured findings (momentum, holder distribution, narrative strength, security flags)
  3. Judge — applies rules.json + risk_policy.json to issue a final BUY / WATCH / AVOID decision with rationale

Primary inference uses Groq (llama-3.1-8b-instant, optional mixtral-8x7b); Gemini 2.5 Flash is the fallback. All prompts and decision thresholds live in versioned JSON under ai_core/configs/, so trading logic can be tuned without code changes.
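
The primary-then-fallback behavior described above can be sketched as an ordered provider chain. This is a hedged Python illustration: `infer_with_fallback` is a hypothetical helper, and the providers are passed in as plain callables so that no real Groq or Gemini SDK call is assumed.

```python
from typing import Callable, List, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]  # (name, function taking a prompt)


def infer_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real service would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In use, the list would hold the Groq-backed callable first and the Gemini-backed one second, so Gemini is only reached when the Groq call raises.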

React Control Center (web/)

Vite + React 18, Context API + useReducer for state. A single EventSource connection to /stream drives live updates across:

  • Helius Monitor — streaming mint and transfer activity
  • Whales Monitor — wallet flows, ranked by net volume
  • Alerts Panel — severity-filtered, ack-able feed
  • Trading Dashboard — simulation history with PnL
  • Alert Rules Manager — declarative rule editor (persisted via API)
  • Providers Status — health pings for every external integration

Data Layer

PostgreSQL 16 with Alembic migrations. Core tables:

| Table | Purpose |
| --- | --- |
| candidates | Every token discovery with status, score, source attribution, and queue correlation |
| alerts | Severity-tiered events (info/warn/critical), ack state, structured JSON payload |
| pumpfun_tokens | New PumpFun mints with security scoring and watchlist match flags |
| settings | Runtime configuration overrides (alert rules, thresholds) |
| logs | Structured audit trail for every job and decision |

Redis 7 backs BullMQ queues (sp-worker:*), pub/sub for cross-process events, and rate limiters for upstream APIs.

Technology Stack

  • Backend: Python 3.11, FastAPI, SQLAlchemy 2.x (async), Pydantic v2, asyncpg, httpx
  • Worker: Node.js 20, BullMQ, ioredis, axios, Express (control plane)
  • AI: Groq SDK, Google Generative AI SDK, custom multi-stage orchestrator
  • Frontend: React 18, Vite 5, Vitest, Context API, native EventSource
  • Data: PostgreSQL 16, Redis 7, Alembic
  • Infrastructure: Docker Compose v2, Caddy 2 (automatic TLS), multi-stage Dockerfiles
  • Observability: Prometheus exporter, structured JSON logs, health endpoints per service

Repository Layout

ai-super-system/
├── ai_core/                  # Triple-AI FastAPI service + JSON configs
│   ├── server/               # app.py, pipeline.py
│   └── configs/              # platforms, accounts, rules, risk_policy
├── backend/
│   ├── api/                  # FastAPI gateway (routes, services, models)
│   ├── src/                  # Node worker (handlers, adapters, signals)
│   ├── tests/                # pytest + node test suites
│   ├── Dockerfile            # multi-stage: api + worker targets
│   └── .env.sample
├── web/                      # React control center (Vite)
├── deploy/
│   └── Caddyfile             # TLS + virtual hosts
├── docs/                     # integrations, demo scripts, requirements
├── docker-compose.yml        # production stack
├── docker-compose.dev.yml    # live-reload dev stack
└── CLAUDE.md                 # contributor guide

Quick Start

Prerequisites

  • Docker Engine 24+ with Compose v2
  • Node.js 20+ and Python 3.11+ (only required for running tests outside Docker)
  • API credentials for: Groq, Apify, Helius (production); Gemini, Nansen, Birdeye, Solscan, QuickNode are optional and gracefully skipped when absent

Bring up the stack

git clone <repo-url> ai-super-system
cd ai-super-system

cp .env.example .env                    # FQDNs and admin email for Caddy
cp backend/.env.sample backend/.env     # API and integration secrets

# Edit backend/.env — at minimum set GROQ_API_KEY, APIFY_TOKEN, HELIUS_API_KEY,
# ALERTS_SIGNATURE_SECRET, HELIUS_WEBHOOK_SECRET

docker compose up --build -d
docker compose ps                       # all services should be Up (healthy)

Access points

| Surface | URL |
| --- | --- |
| Control Center | https://app.localhost |
| API + Swagger | https://api.localhost/docs |
| SSE stream | https://api.localhost/stream |
| AI Core health | https://ai.localhost/health |
| Worker metrics | http://localhost:9110/metrics |
| Worker control | http://localhost:8812/status |

Caddy issues certificates for *.localhost from its own local CA, so they are not publicly trusted; pass -k to curl or trust the local CA.

Development mode

docker compose -f docker-compose.dev.yml up --build -d

This variant mounts source directories for live reload and exposes raw service ports without TLS.

Configuration

All configuration is environment-driven. The most relevant variables are documented inline in backend/.env.sample. Highlights:

Runtime

DATABASE_URL=postgresql+asyncpg://parser:parser@postgres:5432/super_parser
REDIS_URL=redis://redis:6379/0
HTTP_TIMEOUT_SECONDS=30

AI providers

GROQ_API_KEY=...
GROQ_MODEL=llama-3.1-8b-instant
GEMINI_API_KEY=...
GOOGLE_GEMINI_MODEL=gemini-2.5-flash

Ingestion

APIFY_TOKEN=...
APIFY_ACTOR_ID=...
HELIUS_API_KEY=...
QUICKNODE_URL=...

Whale stack

WHALE_SCANNER_VERSION=v3                # v1 | v2 | v3
NANSEN_API_KEY=...
BIRDEYE_API_KEY=...
SOLSCAN_API_KEY=...
HELIUS_HIGH_VALUE_SOL=500

Webhook security

ALERTS_SIGNATURE_SECRET=...             # HMAC for Apify
HELIUS_WEBHOOK_SECRET=...               # HMAC for Helius

AI behavior is tuned via JSON in ai_core/configs/ (platform definitions, influencer tiers, decision rules, risk policy). These files are reloaded whenever the AI Core restarts, so no rebuild is needed after an edit.

Operating the Platform

Restart a single service after a code change

docker compose restart api      # FastAPI
docker compose restart worker   # Node worker
docker compose restart ai-core  # AI service

Inspect queues

docker compose exec redis redis-cli LLEN sp-worker:parser-run:wait
docker compose exec redis redis-cli KEYS 'sp-worker:*'

Query recent activity

docker compose exec postgres psql -U parser -d super_parser \
  -c "SELECT symbol, status, score, created_at FROM candidates ORDER BY created_at DESC LIMIT 10;"

docker compose exec postgres psql -U parser -d super_parser \
  -c "SELECT title, severity, created_at FROM alerts WHERE acked=false ORDER BY created_at DESC;"

Database migrations

cd backend
alembic revision --autogenerate -m "describe change"
alembic upgrade head
docker compose restart api

Verifying End-to-End Flows

Trigger a parser job

curl -k -X POST https://api.localhost/api/parser/run \
  -H 'Content-Type: application/json' \
  -d '{"symbol":"WIF","sources":["twitter"],"filters":["hot"],"priority":9}'

Expect 202 Accepted with a job_id, a new candidates row, and a parser_job SSE event followed by trade_simulated when risk resolves to info.

Request AI advice directly

curl -k -X POST https://api.localhost/api/advice \
  -H 'Content-Type: application/json' \
  -d '{
        "prompt": "Evaluate meme coin momentum",
        "metadata": {"profile":"pump","metrics":{"SOCIAL_BURST":6,"FREQ_5M":7,"MENTIONS":200}}
      }'

Response includes a decision block and a chain field (scout → analyst → judge).

Replay a signed Apify webhook

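# export is required so the python3 subprocess below can read SECRET from os.environ;
# -f2- keeps the whole value even if the secret itself contains '='
export SECRET=$(grep ^ALERTS_SIGNATURE_SECRET backend/.env | cut -d= -f2-)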

cat > tmp_apify.json <<'JSON'
{"actorRun":{"id":"run-1","status":"SUCCEEDED","succeeded":true},
 "datasetItems":[{"symbol":"TEST","score":87,"tweet":"demo"}],
 "meta":{"priority":7,"sources":["twitter"],"filters":["hot"]}}
JSON

SIG=$(python3 -c "import hmac,hashlib,os;print(hmac.new(os.environ['SECRET'].encode(),open('tmp_apify.json','rb').read(),hashlib.sha256).hexdigest())")

curl -k -X POST https://api.localhost/api/apify/callback \
  -H 'Content-Type: application/json' \
  -H "x-apify-signature: $SIG" \
  --data-binary @tmp_apify.json

Replay a signed Helius webhook

Identical pattern with HELIUS_WEBHOOK_SECRET and the x-helius-signature header against /api/helius/webhook.

Testing

# Backend (FastAPI + AI orchestration)
python3 -m pytest backend/tests
python3 -m pytest backend/tests/e2e            # full pipeline e2e

# Worker (Node + BullMQ)
cd backend && npm install && npm test

# Frontend (Vitest)
cd web && npm install && npm test

E2E coverage in backend/tests/e2e/test_full_cycle.py exercises a complete parser → worker → AI → trade cycle against the running stack.

Observability

  • Health probes: GET /api/health (gateway), GET /health (AI Core), GET /status (worker)
  • Prometheus metrics: port 9110 exposes job throughput, queue depth, handler latency, provider errors
  • Structured logs: docker compose logs -f <service> returns JSON-friendly lines for ingestion into Loki/ELK
  • SSE inspection: curl -kN https://api.localhost/stream shows the live event firehose
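
When inspecting the stream programmatically rather than with curl, the raw text/event-stream lines need to be grouped into events. The following is a minimal Python sketch of that parsing step; `parse_sse` is a hypothetical helper, and whether this platform sets the `event:` field to names such as `parser_job` (as opposed to carrying the type inside the JSON payload) is an assumption.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from text/event-stream lines."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # strip the single optional space after the colon
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

The function accepts any iterable of lines, so it can be fed from a file, a test fixture, or a streaming HTTP response.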

Security Model

  • Webhook authenticity + replay protection: every external POST is HMAC-SHA256 verified over f"{timestamp}.{body}"; the timestamp must be within WEBHOOK_SIGNATURE_TOLERANCE seconds (default 300). Legacy signatures that cover only the body, with no timestamp, are rejected by default and accepted only when ALLOW_LEGACY_WEBHOOK_SIGNATURES is enabled.
  • Brute-force defense: auth endpoints are rate-limited (20 req / 5 min per IP) with per-username lockout (5 failures → 30 min lock). Webhook endpoints are throttled (600 req / min per IP) and additionally gated by an optional IP allow-list (APIFY_IP_ALLOWLIST, HELIUS_IP_ALLOWLIST).
  • Strong password hashing: Argon2id (OWASP-recommended) with m=64MiB, t=3, p=4. Legacy bcrypt hashes are accepted on login and transparently re-hashed on first successful sign-in.
  • Two-factor authentication: TOTP (RFC 6238) with QR-code provisioning, single-use hashed recovery codes, and a short-lived pre-auth ticket between the password step and the TOTP step. Failed 2FA attempts feed the same lockout counter as password failures.
  • Server-side session revocation: every session JWT carries a unique jti recorded in auth_sessions; logout, logout-all, and admin actions revoke individual or all of a user's sessions instantly. Tokens without a jti are rejected once AUTH_REQUIRE_SESSION_RECORD=true.
  • Session security: session JWTs are httpOnly + Secure + SameSite=Strict cookies; a paired CSRF cookie is required on state-changing requests. SSE streams use one-time, short-TTL Redis-backed tickets.
  • RBAC: User.role (user / admin) is enforced via the require_admin dependency; admin-only endpoints (e.g. runtime trade kill-switch) reject non-admins with 403.
  • Secret hygiene: production docker-compose.yml refuses to start if any required secret is missing (POSTGRES_PASSWORD, AUTH_JWT_SECRET, WORKER_SHARED_SECRET, both webhook secrets). .env.sample documents every required key.
  • CORS + security headers: strict origin allow-list (FRONTEND_ORIGIN); responses include Content-Security-Policy (strict default-src 'self'), X-Content-Type-Options, X-Frame-Options: DENY, Referrer-Policy: no-referrer, Permissions-Policy, Cross-Origin-Opener-Policy, Cross-Origin-Resource-Policy, and HSTS preload (max-age=63072000) when running over HTTPS. Caddy mirrors these at the edge.
  • Container hardening: the api and worker services run with read_only: true, cap_drop: [ALL], no-new-privileges, and a small writable tmpfs for scratch.
  • Trade kill-switch: runtime kill-switch is admin-controllable via POST /api/admin/trade/kill-switch (Redis-backed, takes effect instantly) in addition to the env-level TRADE_KILL_SWITCH.
  • Trade execution gate: fail-closed — confirmations are blocked unless TRADING_ENABLED=true and TRADE_KILL_SWITCH=false. Per-order, per-day count, and per-day USD caps are enforced via Redis counters. Flipping TRADE_KILL_SWITCH=true halts all trading instantly.
  • Network isolation: services communicate over the internal Docker network; only Caddy is publicly bound and terminates TLS.
  • Least privilege: the worker container runs as a non-root user; database access is scoped to a dedicated role; every /api/* route requires authentication except health and the explicitly public auth/webhook endpoints.
  • Audit trail: every authentication attempt, webhook signature failure, and trade-gate decision is persisted to logs with source="security" and full context (IP, actor, reason).
  • Supply-chain scanning: CI runs pip-audit, bandit, npm audit (worker + web), trivy (HIGH/CRITICAL fail the build), gitleaks for committed secrets, and produces a CycloneDX SBOM artifact on every push.
  • Risk gating: automated trade simulation only fires when the Judge stage returns info severity; warn and critical paths require explicit human acknowledgement.
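
The timestamped webhook check in the first bullet can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `sign` and `verify` helper names, the hex digest encoding, and the way the timestamp reaches the server are not specified in this README. Only the `f"{timestamp}.{body}"` message layout, HMAC-SHA256, and the 300-second default tolerance come from the text.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: str, timestamp: int, body: bytes) -> str:
    """HMAC-SHA256 over "<timestamp>.<raw body>", hex-encoded (encoding is an assumption)."""
    message = f"{timestamp}.".encode() + body
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify(secret: str, timestamp: int, body: bytes, signature: str,
           tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Reject stale timestamps first, then compare signatures in constant time."""
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance:
        return False  # outside the replay window
    expected = sign(secret, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

Because the timestamp is part of the signed message, a captured request cannot be replayed later with a fresh timestamp unless the attacker also knows the secret.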

Production Deployment

The stack is designed for single-VM or Kubernetes deployment.

Single-host (Docker Compose):

  1. Provision a host with Docker Engine 24+ and a public DNS record pointing to it.
  2. Set the FQDNs in root .env (e.g., app.example.com, api.example.com, ai.example.com) and update the admin email — Caddy will obtain Let's Encrypt certificates automatically.
  3. Populate backend/.env with production credentials.
  4. docker compose up -d --build and monitor docker compose ps until all services are healthy.
  5. Configure Apify and Helius webhooks to point at https://api.example.com/... with the matching HMAC secrets.

Kubernetes: the multi-stage backend/Dockerfile builds api and worker targets independently; deploy each as its own Deployment with shared Postgres/Redis services and a managed ingress in place of Caddy.

Scaling levers:

  • Worker concurrency: increase replicas or BULLMQ_CONCURRENCY
  • Ingest throughput: scale the FastAPI gateway horizontally behind the proxy
  • AI throughput: deploy multiple AI Core replicas; the gateway round-robins via the service name

Roadmap

  • Multi-tenant workspace model with per-team rules and watchlists
  • Live execution adapter (Jupiter / Jito bundles) gated behind explicit risk policy
  • Backtesting harness replaying historical SSE streams against rule changes
  • gRPC streaming API for downstream automation
  • Native mobile companion for alert acknowledgement

License: proprietary — contact the maintainer for commercial use. Contributors: see CLAUDE.md for engineering conventions, docs/ for integration playbooks.
