A Chrome extension + FastAPI backend that verifies every link, image, and video you hover over, protecting Coinbase users from phishing attacks before they click.
Features · Architecture · Quick Start · How It Works · API Reference
Phishing attacks targeting crypto users cost billions annually. Attackers register lookalike domains (e.g., coinbase-support.xyz, wallet-connect-base.org) and serve pages that look nearly identical to the real thing. Users click, enter credentials, and lose funds.
Phish-Guard solves this by verifying every link before you click it.
| Feature | Description |
|---|---|
| Link Scanning | Every hyperlink on every page is scanned in real time |
| Image & Ad Scanning | Images and videos wrapped in links are scanned too (Media Verification) |
| iframe/Ad Detection | Runs inside iframes to catch banner ads and embedded content |
| Hover Badges | A security status badge appears on hover: Green/Blue/Amber/Red |
| Right-Click Verify | Right-click any link, image, or video to manually trigger verification |
| Live Dashboard | Real-time stats page showing engine status and domain counts |
| Report Threats | One-click reporting from the popup flags suspicious sites for review |
| LRU Cache | 50,000-entry cache ensures repeated URLs are verified in microseconds |
Phish-Guard uses a Hybrid Async Architecture that keeps verification responses fast for the user while heavy ML tasks run in the background.
graph TD
subgraph "Chrome Extension (MV3)"
C[content.js<br>Media & Link Scanner]
B[background.js<br>Context Menu Handler]
P[popup.js<br>UI & Reporting]
end
subgraph "FastAPI Backend"
API[API Gateway<br>FastAPI / Uvicorn]
subgraph "Real-Time Detection Engine"
Cache[LRU Cache<br>50k Entries]
L1[Layer 1: Allowlist]
L2[Layer 2: Blocklist]
L3[Layer 3: Tranco 1M]
L4[Layer 4: ML Model]
end
subgraph "Async Background Workers"
T_Loader[Tranco Loader<br>Background Thread]
ML_Trainer[ML Trainer<br>Background Thread]
end
end
C -->|POST /verify| API
B -->|POST /verify| API
P -->|POST /report| API
API --> Cache
Cache --> L1
L1 --> L2
L2 --> L3
L3 --> L4
T_Loader -.->|Async Update| L3
ML_Trainer -.->|Async Update| L4
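The cache step in front of the detection layers can be illustrated with a small LRU structure. This is a minimal sketch, not the project's actual code: the class name `VerdictCache` and its methods are hypothetical, and only the 50,000-entry capacity comes from this README.

```python
from collections import OrderedDict
from typing import Optional


class VerdictCache:
    """Hypothetical LRU cache mapping a URL to its verdict dict."""

    def __init__(self, max_entries: int = 50_000) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, dict]" = OrderedDict()

    def get(self, url: str) -> Optional[dict]:
        verdict = self._entries.get(url)
        if verdict is not None:
            # A hit makes this URL the most recently used entry.
            self._entries.move_to_end(url)
        return verdict

    def put(self, url: str, verdict: dict) -> None:
        self._entries[url] = verdict
        self._entries.move_to_end(url)
        if len(self._entries) > self.max_entries:
            # Evict the least recently used entry (front of the dict).
            self._entries.popitem(last=False)


# Tiny capacity so the eviction is easy to see.
cache = VerdictCache(max_entries=2)
cache.put("https://a.example", {"status": "trusted"})
cache.put("https://b.example", {"status": "neutral"})
cache.get("https://a.example")  # touch "a", so "b" becomes the oldest
cache.put("https://c.example", {"status": "phish"})  # evicts "b"
```

A cache hit returns before any of the four layers run, which is why repeated URLs are cheap to verify.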
- Async ML Training: The RandomForest model trains on 235K URLs in a background daemon thread. The server starts instantly, and the ML layer activates seamlessly once training completes.
- Lazy Scanning: `IntersectionObserver` is used to scan links and media elements only when they enter the viewport, which keeps the performance impact low on heavy web pages.
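The background-training idea above could look roughly like the following. It is a sketch under assumptions: `ModelHolder`, `start_training`, and the `"training"` status value are hypothetical names, and the fake trainer stands in for the real RandomForest fit. Only the daemon thread and the `"active"` status appear in this README.

```python
import threading


class ModelHolder:
    """Hypothetical holder that serves predictions once training finishes."""

    def __init__(self) -> None:
        self.status = "training"  # assumed value; the README only shows "active"
        self.model = None
        self._ready = threading.Event()

    def start_training(self, train_fn) -> threading.Thread:
        def worker() -> None:
            self.model = train_fn()  # slow step runs off the request path
            self.status = "active"
            self._ready.set()

        thread = threading.Thread(target=worker, daemon=True, name="ml-trainer")
        thread.start()
        return thread

    def predict(self, url: str):
        """Return a phishing probability, or None while the model is not ready."""
        if not self._ready.is_set():
            return None
        return self.model(url)

    def wait_ready(self, timeout=None) -> bool:
        return self._ready.wait(timeout)


release = threading.Event()


def fake_train():
    release.wait()  # stands in for the long training run
    return lambda url: 0.9 if "login" in url else 0.1


holder = ModelHolder()
holder.start_training(fake_train)
early = holder.predict("https://example.test/login")  # model not ready yet
release.set()
holder.wait_ready(timeout=5)
late = holder.predict("https://example.test/login")
```

Requests that arrive before training completes simply skip the ML layer in this sketch; how the real engine handles that window is not stated here.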
- Python 3.9+
- Google Chrome or Brave browser
git clone https://github.com/shri-915/phish-guard.git
cd phish-guard
./start_backend.sh
This script will:
- Create a Python virtual environment
- Install all dependencies automatically
- Start the FastAPI server at `http://localhost:8000`
First run note: The ML model trains on 235K URLs in the background. The server is immediately available, and the ML layer kicks in after ~30-60 seconds.
- Open `chrome://extensions` in Chrome
- Enable Developer Mode (top-right toggle)
- Click Load Unpacked
- Select the `extension/` folder
Hover over any link, image, or video on any website. The Phish-Guard badge will appear showing the security status.
Every URL passes through a high-performance filtering pipeline:
flowchart TD
Start([URL Input]) --> L1{Layer 1<br>Coinbase Allowlist}
L1 -->|Match| Safe[✅ Safe<br>Official Domain]
L1 -->|No Match| L2{Layer 2<br>Blocklist}
L2 -->|Match| Phish[🚨 Phish<br>Known Threat]
L2 -->|No Match| L3{Layer 3<br>Tranco 1M}
L3 -->|Match| Trusted[🛡️ Trusted<br>Popular Site]
L3 -->|No Match| L4{Layer 4<br>ML Model}
L4 -->|Prob > 65%| Warning[⚠️ Warning<br>Suspected Phishing]
L4 -->|Prob > 40%| Neutral[⚪ Neutral<br>Unverified]
L4 -->|Prob <= 40%| SafeML[🛡️ Safe<br>Low Risk]
style Safe fill:#d1fae5,stroke:#059669
style Phish fill:#fee2e2,stroke:#dc2626
style Trusted fill:#dbeafe,stroke:#2563eb
style Warning fill:#fef3c7,stroke:#d97706
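The flowchart can be read as a short chain of early returns. The sketch below is illustrative only: the function names, the tiny in-memory sets, and the parent-domain matching are assumptions, and the status string returned for the low-risk branch is a guess. The layer order and the 65% / 40% thresholds are taken from the diagram.

```python
from urllib.parse import urlsplit

# Stand-in data; the real engine loads much larger lists.
ALLOWLIST = {"coinbase.com", "base.org"}
BLOCKLIST = {"coinbase-support.xyz"}
TRANCO = {"google.com", "github.com"}


def candidate_domains(url: str) -> list:
    """Return the hostname and its parent domains (a.b.com -> a.b.com, b.com)."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    return [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]


def classify(url: str, phishing_probability) -> str:
    """Run the four layers in order; the first match decides the status."""
    domains = candidate_domains(url)
    if any(d in ALLOWLIST for d in domains):  # Layer 1
        return "safe"
    if any(d in BLOCKLIST for d in domains):  # Layer 2
        return "phish"
    if any(d in TRANCO for d in domains):     # Layer 3
        return "trusted"
    prob = phishing_probability(url)          # Layer 4 (ML score in [0, 1])
    if prob > 0.65:
        return "warning"
    if prob > 0.40:
        return "neutral"
    return "safe"  # assumed status string for the "Low Risk" branch
```

Because the cheap set lookups run first, the ML model is only consulted for domains that none of the lists recognize.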
| Badge | Status | Meaning |
|---|---|---|
| ✅ Blue | safe | Official Coinbase domain (coinbase.com, base.org, etc.) |
| 🛡️ Green | trusted | Tranco Top 1M verified popular site (google.com, github.com) |
| ⚠️ Amber | warning | ML model flagged as likely phishing |
| 🚨 Red | phish | Known phishing domain, blocked |
| ⚪ Gray | neutral | Unverified, use caution |
Verify a URL against all 4 detection layers (POST /verify).
Request:
{ "url": "https://coinbase-support.xyz" }
Response:
{
"status": "phish",
"message": "Known Phishing Domain",
"confidence": 1.0,
"badge": "red_alert"
}
Submit a user-reported threat (POST /report).
Request:
{ "url": "https://suspicious-site.com", "reason": "user_report" }
Live engine statistics.
Response:
{
"trusted_domains": 1000044,
"cached_predictions": 127,
"blocklist_size": 8,
"model_status": "active"
}
Health check endpoint.
{ "status": "ok", "message": "Phish-Guard is running" }
Interactive Swagger API documentation (auto-generated by FastAPI).
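For scripting against the verification endpoint, a standard-library client might look like this. It is a sketch: the helper names are hypothetical, and it assumes the backend is reachable at the `http://localhost:8000` address from the Quick Start and accepts the JSON body shown in the request example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address from the Quick Start section


def build_verify_request(url: str) -> urllib.request.Request:
    """Build the POST /verify request without sending it."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/verify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def verify(url: str, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON verdict (needs a running backend)."""
    with urllib.request.urlopen(build_verify_request(url), timeout=timeout) as resp:
        return json.load(resp)
```

With the backend running, `verify("https://coinbase-support.xyz")` should return an object shaped like the response example above.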
phish-guard/
├── backend/
│   ├── main.py                 # FastAPI app, async entry point
│   ├── phishing_detector.py    # 4-layer engine logic
│   ├── ml_model.py             # Async ML training logic
│   ├── requirements.txt        # Python dependencies
│   ├── static/                 # Landing page assets
│   └── datasets/
│       ├── PhiUSIIL_Phishing_URL_Dataset.csv  # 235K URLs
│       └── top-1m.csv          # Tranco Top 1M list
│
├── extension/
│   ├── manifest.json           # Chrome MV3 manifest (v1.1)
│   ├── content.js              # Media/Link scanner (IntersectionObserver)
│   ├── background.js           # Service worker
│   ├── popup.html              # UI
│   └── icons/                  # Generated logos
│
├── docs/                       # GitHub Pages static site
├── start_backend.sh            # Robust startup script
└── README.md
- No data stored: URLs are checked in-memory and never logged to disk.
- Local backend: All processing happens on your machine; nothing is sent to third-party servers.
- CORS restricted: The backend only accepts connections from the local extension (configurable).
- Open source: Full source available for audit.
Production note: For deployment, restrict CORS origins and consider URL hashing for additional privacy.
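One way the URL hashing mentioned in the production note could work is to send only a digest of the normalized hostname. This is a sketch of one possible approach, not a description of the current code; the function name `hashed_host` and the choice of SHA-256 are assumptions.

```python
import hashlib
from urllib.parse import urlsplit


def hashed_host(url: str) -> str:
    """Return the SHA-256 hex digest of the lowercased hostname only."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return hashlib.sha256(host.encode("utf-8")).hexdigest()


# Path, query, scheme, and letter case do not affect the digest.
digest_a = hashed_host("https://Example.com/account?id=42")
digest_b = hashed_host("http://example.com./other")
```

Hashing keeps paths and query strings off the wire, but hostnames are guessable, so a plain digest can still be matched against a list of known domains by whoever receives it.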
- Phase 4: Real-time blocklist sync from threat intelligence feeds (PhishTank, OpenPhish)
- Phase 5: URL hashing for privacy-preserving cloud verification
- Phase 6: Firefox extension support
- Phase 7: Community-contributed blocklist with upvoting
| Component | Technology |
|---|---|
| Backend | Python, FastAPI, Uvicorn |
| ML Model | scikit-learn (TF-IDF + RandomForest) |
| Dataset | PhiUSIIL (235K URLs), Tranco Top 1M |
| Extension | JavaScript, Chrome Manifest V3 |
| Frontend | HTML5, CSS3 (vanilla, no frameworks) |
Shrimun Agarwal
Role: Aspiring Data/ML Engineer @ Coinbase
Built as a security tool showcase for Coinbase Trust & Safety. Not affiliated with Coinbase Inc.
⭐ Star this repo if Phish-Guard protected you! ⭐
Made with ❤️ for the crypto community