This document describes the security architecture and protective measures implemented in AI Article Summarizer. It is intended for users who want to understand how their data and browsing activity are protected, and for developers evaluating the codebase.
- Architecture Principles
- XSS Prevention
- Input Sanitization
- Prompt Injection Defense
- Content Security Policy
- Extension Permissions
- Message Passing Security
- API Key Management
- Network Security
- Iframe Sandboxing
- Content Extraction Safety
- Error Handling and Information Disclosure
- Storage Security
- Export Security
- Backup Import Validation
- Dependency Management
- CI/CD Pipeline
- Reporting a Vulnerability
The extension follows a defense-in-depth strategy with multiple independent layers of protection. Compromising one layer does not automatically compromise the others.
| Principle | Implementation |
|---|---|
| Least Privilege | Minimal Chrome permissions; no <all_urls> |
| Input Validation at Boundaries | All external data sanitized before use |
| Separation of Concerns | Content script has no API key access; all LLM calls go through the service worker |
| Fail Secure | Errors produce safe user-friendly messages; no sensitive data is exposed |
| No Secrets in Code | API keys stored in sandboxed chrome.storage.local, never in source code |
Module: src/utils/security/html-sanitizer.js
All dynamic content inserted into the DOM passes through `HtmlSanitizer.escape()`, which uses the browser's own escaping mechanism:

    static escape(text) {
      const div = document.createElement('div');
      div.textContent = text; // browser auto-escapes
      return div.innerHTML; // returns safe HTML entities
    }

This pattern is applied consistently to:
- Article titles, authors, URLs
- AI-generated summaries and key points
- Citation data and Q&A answers
- History entries and metadata badges
- Translation output
- `data-id` attributes in innerHTML templates
Dedicated rendering methods (renderText(), renderList()) build HTML from pre-escaped fragments, preventing injection even when composing complex structures.
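
For contexts without a DOM (unit tests, for example), the same idea can be sketched as a pure string function. This is an illustrative alternative, not the extension's code: `escapeHtml` and `HTML_ENTITIES` are hypothetical names. Unlike the `textContent`/`innerHTML` round trip, which serializes only `&`, `<`, and `>` as entities in text content, this sketch also escapes quote characters, which matters when a value is interpolated into an attribute such as `data-id`.

```javascript
// Hypothetical DOM-free counterpart to HtmlSanitizer.escape().
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(text) {
  // Coerce to string so null, undefined, and numbers never throw.
  return String(text ?? '').replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// Usage: both the attribute value and the text content are escaped.
const title = '<img src=x onerror="alert(1)">';
const html = `<li data-id="${escapeHtml('a"b')}">${escapeHtml(title)}</li>`;
console.log(html);
```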
Module: src/utils/security/input-sanitizer.js
A multi-stage pipeline sanitizes all user-supplied and web-extracted text before it reaches the AI provider:
| Stage | Purpose |
|---|---|
| HTML Tag Stripping | Removes <script>, <style>, <noscript> and all remaining tags |
| URL Removal | Optionally strips URLs to reduce noise |
| Control Character Removal | Strips ASCII control characters (0x00-0x1F, 0x7F) |
| Prompt Injection Escaping | Detects and neutralizes injection patterns (see below) |
| Length Validation | Enforces min/max bounds to prevent empty or oversized inputs |
A non-destructive validate() method is also available for checking inputs without modifying them.
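
The stages in the table can be sketched as small composed functions. This is a minimal sketch under assumptions: `sanitizeForModel`, its option names, and the default length bounds are hypothetical, regex-based tag stripping is a simplification, and the prompt-injection stage is left out here.

```javascript
// Illustrative pipeline; function names, options, and limits are assumptions.
function stripHtml(text) {
  return text
    // Drop script/style/noscript elements together with their contents.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // Drop any remaining tags but keep the text between them.
    .replace(/<[^>]+>/g, ' ');
}

function removeUrls(text) {
  return text.replace(/https?:\/\/\S+/gi, ' ');
}

function removeControlChars(text) {
  // 0x00-0x1F and 0x7F; replaced with a space so adjacent words do not merge.
  return text.replace(/[\x00-\x1F\x7F]/g, ' ');
}

function sanitizeForModel(input, { stripUrls = false, minLength = 1, maxLength = 50000 } = {}) {
  let text = stripHtml(String(input ?? ''));
  if (stripUrls) text = removeUrls(text);
  text = removeControlChars(text).replace(/\s+/g, ' ').trim();
  if (text.length < minLength) {
    throw new RangeError('Input is empty or too short after sanitization');
  }
  // Oversized input is truncated in this sketch; rejecting it is equally valid.
  return text.slice(0, maxLength);
}
```

Running the length check last means the bounds apply to the text that is actually sent, not to markup that was discarded along the way.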
Module: src/utils/security/input-sanitizer.js — escapePromptInjection()
The extension defends against prompt injection using a three-step approach:
First, text is normalized to NFKC form before pattern matching, which blocks bypasses based on compatibility characters such as full-width letters (cross-script homoglyphs, e.g. Cyrillic look-alikes, are not folded by NFKC).
Second, the following invisible characters are stripped:
- Zero-width space (U+200B)
- Zero-width non-joiner/joiner (U+200C-U+200D)
- Left-to-right and right-to-left marks (U+200E-U+200F)
- Line/paragraph separators (U+2028-U+2029)
- Bidirectional embedding and override controls, plus the narrow no-break space (U+202A-U+202F)
- Byte order mark (U+FEFF)
- Soft hyphen (U+00AD)
Third, injection patterns are detected in five languages:
| Language | Example Patterns |
|---|---|
| English | "ignore previous instructions", "disregard all prior context" |
| Italian | "ignora istruzioni precedenti", "dimentica istruzioni" |
| French | "ignorer instructions precedentes", "oublie instructions" |
| Spanish | "ignora instrucciones anteriores", "olvida instrucciones" |
| German | "ignoriere vorherige anweisungen", "vergiss anweisungen" |
Additionally, special tokens used by LLM systems are detected and removed: system:, assistant:, user:, <|...|>, [INST], [/INST].
User Q&A input is further limited to 2,000 characters via sanitizeUserPrompt().
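
The three steps above can be sketched as follows. This is not the body of `escapePromptInjection()`: the function name `neutralizeInjection`, the short pattern list, and the `[filtered]` placeholder are assumptions made for illustration.

```javascript
// Illustrative only: a small subset of patterns, with a hypothetical placeholder.
const INVISIBLE_CHARS = /[\u00AD\u200B-\u200F\u2028-\u202F\uFEFF]/g;

const INJECTION_PATTERNS = [
  /ignore\s+(all\s+)?previous\s+instructions/gi,
  /disregard\s+all\s+prior\s+context/gi,
  /ignora\s+istruzioni\s+precedenti/gi,
  /ignoriere\s+vorherige\s+anweisungen/gi,
];

// Chat-template markers: <|...|>, [INST], [/INST], and role prefixes.
const SPECIAL_TOKENS = /<\|[^|]*\|>|\[\/?INST\]|\b(system|assistant|user):/gi;

function neutralizeInjection(input) {
  let text = String(input ?? '')
    .normalize('NFKC')             // step 1: fold compatibility forms
    .replace(INVISIBLE_CHARS, '')  // step 2: drop invisible characters
    .replace(SPECIAL_TOKENS, ' '); // step 3a: remove special tokens
  for (const pattern of INJECTION_PATTERNS) {
    text = text.replace(pattern, '[filtered]'); // step 3b: neutralize phrases
  }
  return text.replace(/\s{2,}/g, ' ').trim();
}
```

The order matters: normalizing and stripping invisible characters first means a phrase split by a zero-width space or written in full-width letters is matched like its plain form.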
File: manifest.json
The extension enforces a strict CSP on all extension pages:

    default-src 'none';
    script-src 'self' 'wasm-unsafe-eval';
    style-src 'self';
    font-src 'self';
    object-src 'none';
    img-src 'self' data:;
    media-src 'none';
    connect-src https://api.groq.com https://api.openai.com
                https://api.anthropic.com https://generativelanguage.googleapis.com;
    frame-src https:;

| Directive | Purpose |
|---|---|
| `default-src 'none'` | Whitelist-only approach; everything blocked unless explicitly allowed |
| `script-src 'self'` | Only bundled extension scripts can execute; no inline scripts, no eval() |
| `'wasm-unsafe-eval'` | Required by PDF.js for WebAssembly-based parsing |
| `connect-src` | Network requests limited to the four LLM provider APIs |
| `object-src 'none'` | Blocks Flash, Java applets, and other plugin content |
| `frame-src https:` | Iframes restricted to HTTPS (used for article original view) |
File: manifest.json
The extension requests only the permissions strictly necessary for its functionality:
| Permission | Purpose | Scope |
|---|---|---|
| `activeTab` | Access the current tab to extract article content | Only the active tab, only when clicked |
| `storage` | Save settings, history, and cache locally | Extension-sandboxed storage |
| `tts` | Text-to-speech for reading summaries aloud | Local synthesis only |
| `alarms` | Schedule automatic cache maintenance | Internal timing only |
Network access is restricted to the four LLM provider endpoints:

    https://api.groq.com/*
    https://api.openai.com/*
    https://api.anthropic.com/*
    https://generativelanguage.googleapis.com/*

The extension does not request <all_urls>, tabs, webRequest, cookies, clipboard, downloads, debugger, or any other broad permission.
The content script is explicitly excluded from sensitive pages:

    "exclude_matches": [
      "https://accounts.google.com/*",
      "*://*.bank*/*"
    ]

Module: src/background/service-worker.js
All inter-component communication (popup, content script, service worker) is validated:
Every incoming message is checked against the extension's own ID:

    if (sender.id !== chrome.runtime.id) {
      return false;
    }

This prevents other extensions or web pages from sending messages to the service worker.
A strict Set-based whitelist validates provider names before any API operation:

    const VALID_PROVIDERS = new Set(['groq', 'openai', 'anthropic', 'gemini']);

Invalid providers are rejected immediately with an error response, before any async operation begins.
All error responses sent back through the message channel pass through ErrorHandler.getErrorMessage(), which maps technical errors to user-friendly messages — preventing information leakage through error responses.
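
The sender and provider checks can be combined into one pure function that is easy to unit-test outside the browser. This is a sketch under assumptions: `validateRequest`, the `{ provider }` message shape, and the error strings are hypothetical, not the service worker's actual code.

```javascript
// Sketch of the validation described above; names and message shape are assumptions.
const VALID_PROVIDERS = new Set(['groq', 'openai', 'anthropic', 'gemini']);

function validateRequest(message, sender, ownExtensionId) {
  // Reject anything not sent by this extension's own contexts.
  if (!sender || sender.id !== ownExtensionId) {
    return { ok: false, error: 'Unauthorized sender' };
  }
  if (!message || typeof message !== 'object') {
    return { ok: false, error: 'Malformed message' };
  }
  // Set lookup avoids prototype keys such as "constructor" passing the check.
  if (!VALID_PROVIDERS.has(message.provider)) {
    return { ok: false, error: 'Unknown provider' };
  }
  return { ok: true };
}

// In the service worker this could be wired up roughly as:
// chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
//   const result = validateRequest(message, sender, chrome.runtime.id);
//   if (!result.ok) {
//     sendResponse({ success: false, error: result.error });
//     return false;
//   }
//   // ...handle the request...
// });
```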
Module: src/utils/storage/storage-manager.js
API keys are stored in chrome.storage.local, which is automatically sandboxed by Chrome — each extension has its own isolated storage area that other extensions and web pages cannot access.
The extension deliberately does not apply custom encryption to stored keys. The reasoning:
In a Chrome Extension, the source code is always readable (extensions are distributed as plain JavaScript). Any encryption key or algorithm embedded in the code provides no meaningful protection — an attacker with access to the extension's storage also has access to the decryption logic.
`chrome.storage.local` sandboxing is the real security boundary.
- API keys are never accessible from the content script
- All LLM calls are routed through the service worker, which retrieves keys directly from storage
- Keys are never transmitted in Chrome messages (except during the initial test flow, which reads from a just-entered input field)
- Keys are masked in the Options UI (showing only the last 4 characters)
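
Masking a key so that only its last 4 characters are visible might look like the following. `maskApiKey` is a hypothetical helper, and the fixed-width mask is an assumption chosen so the displayed value does not reveal the key's length.

```javascript
// Hypothetical helper; the Options UI may implement masking differently.
function maskApiKey(key, visible = 4) {
  if (typeof key !== 'string' || key.length === 0) return '';
  const mask = '*'.repeat(8);
  // Keys no longer than the visible window are hidden entirely.
  if (key.length <= visible) return mask;
  return mask + key.slice(-visible);
}

console.log(maskApiKey('sk-abcdef123456')); // ********3456
```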
Modules: src/utils/ai/provider-caller.js, retry-strategy.js, rate-limiter.js
Every API call has a 60-second timeout enforced via AbortController:

    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), 60000);

The timeout is always cleared in a finally block to prevent resource leaks.
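
Put together, the timeout and the `finally` cleanup could be wrapped as below. `fetchWithTimeout` is a hypothetical name, and the injectable `fetchImpl` parameter is an assumption added so the wrapper can be exercised without a network.

```javascript
// Sketch: a hypothetical wrapper, not the provider-caller module's API.
async function fetchWithTimeout(url, options = {}, timeoutMs = 60000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // The signal lets the underlying request be cancelled when the timer fires.
    return await fetchImpl(url, { ...options, signal: controller.signal });
  } finally {
    // Runs on success, failure, and abort alike, so the timer never lingers.
    clearTimeout(timeoutId);
  }
}
```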
Transient failures are retried with exponential backoff:
| Attempt | Delay |
|---|---|
| 1st retry | 1 second |
| 2nd retry | 2 seconds |
| 3rd retry | 4 seconds |
| Max | 8 seconds (capped) |
Only temporary errors (429, 500, 502, 503, 504) trigger retries. Permanent errors (400, 401, 403, 404) fail immediately.
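
The delay schedule and the retry decision follow directly from the table and status lists above. The sketch below assumes a `request` callback that resolves to an object with `ok` and `status`; `backoffDelay`, `isRetryable`, and `callWithRetry` are hypothetical names, and `maxRetries = 3` is inferred from the three retries listed.

```javascript
// Sketch under assumptions: names and the retry loop shape are illustrative.
const RETRYABLE_STATUS = new Set([429, 500, 502, 503, 504]);
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 8000;

function isRetryable(status) {
  return RETRYABLE_STATUS.has(status);
}

// retryNumber is 1 for the first retry, 2 for the second, and so on.
function backoffDelay(retryNumber) {
  return Math.min(BASE_DELAY_MS * 2 ** (retryNumber - 1), MAX_DELAY_MS);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function callWithRetry(request, maxRetries = 3, sleep = defaultSleep) {
  for (let retry = 0; ; retry++) {
    const response = await request();
    // Stop on success, on permanent errors, or when retries are exhausted.
    if (response.ok || !isRetryable(response.status) || retry === maxRetries) {
      return response;
    }
    await sleep(backoffDelay(retry + 1));
  }
}
```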
A token bucket rate limiter respects each provider's limits:
| Provider | Limit |
|---|---|
| Groq | 30 requests/minute |
| OpenAI | 60 requests/minute |
| Anthropic | 50 requests/minute |
| Gemini | 60 requests/minute |
Requests exceeding the limit are queued, not dropped.
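
A token bucket that queues excess requests can be sketched as follows. The class shape, the `submit`/`drain` method names, and the injectable clock are assumptions for illustration; only the per-provider limits come from the table above.

```javascript
// Illustrative token bucket with a FIFO queue; names are assumptions.
const REQUESTS_PER_MINUTE = { groq: 30, openai: 60, anthropic: 50, gemini: 60 };

class TokenBucket {
  constructor(requestsPerMinute, now = () => Date.now()) {
    this.capacity = requestsPerMinute;
    this.tokens = requestsPerMinute; // start with a full bucket
    this.now = now;                  // injectable clock, useful in tests
    this.lastRefill = now();
    this.queue = [];
  }

  refill() {
    const current = this.now();
    const elapsedMs = current - this.lastRefill;
    // Tokens accrue continuously at capacity-per-minute, up to the capacity.
    this.tokens = Math.min(this.capacity, this.tokens + (elapsedMs * this.capacity) / 60000);
    this.lastRefill = current;
  }

  // Queue the task, then run as many queued tasks as tokens allow.
  submit(task) {
    this.queue.push(task);
    this.drain();
  }

  // Intended to be called again later (e.g. from a timer) to release the queue.
  drain() {
    this.refill();
    while (this.queue.length > 0 && this.tokens >= 1) {
      this.tokens -= 1;
      this.queue.shift()();
    }
  }
}
```

Because every task passes through the queue, ordering is preserved: a request that arrives while the bucket is empty runs before any later request once tokens return.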
Files: src/pages/reading-mode/reading-mode.html, display.js
The reading mode iframe (used to display the original article) is maximally sandboxed:

    <iframe sandbox="" referrerpolicy="no-referrer"></iframe>

`sandbox=""` (empty) blocks:
- JavaScript execution
- Form submission
- Pop-ups and modals
- Same-origin access to the extension
- Downloads and plugins
Before loading any URL into the iframe, the protocol is validated against a whitelist (https:, http:). URLs with javascript:, data:, or file: protocols are rejected, and the UI falls back to the safe text view.
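
The protocol check can be sketched with the standard `URL` parser. `isSafeIframeUrl` is a hypothetical name; the actual check in `display.js` may be structured differently.

```javascript
// Sketch; the function name is an assumption.
const ALLOWED_PROTOCOLS = new Set(['https:', 'http:']);

function isSafeIframeUrl(candidate) {
  try {
    // The parser lowercases the scheme and trims surrounding whitespace,
    // so "JaVaScRiPt:..." is judged on its normalized protocol.
    return ALLOWED_PROTOCOLS.has(new URL(candidate).protocol);
  } catch {
    // Relative or malformed input cannot be parsed and is rejected.
    return false;
  }
}
```

Checking the parsed `protocol` rather than a string prefix avoids bypasses that rely on mixed case or leading whitespace.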
Module: src/utils/core/content-extractor.js
Before extracting article text, the following elements are removed from the DOM clone:
- Scripts, styles, iframes, forms
- Navigation, headers, footers, sidebars
- Ads, paywalls, subscription blocks, cookie banners
- Social sharing widgets, newsletters, popups
- Related/recommended content blocks
Paragraphs matching known noise patterns are filtered out:
- Pricing and subscription text (multi-currency, multi-language)
- Call-to-action and promotional content
- Cookie consent and login prompts
- Free trial and cancellation offers
Near-identical paragraphs (common in paywall-repeated content) are deduplicated via normalized text comparison before being sent to the AI.
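
Deduplication via normalized comparison can be sketched as below. The normalization rules (NFKC, lowercase, punctuation removed, whitespace collapsed) and the names `normalizeParagraph` and `dedupeParagraphs` are assumptions; the extractor's actual rules are not specified here.

```javascript
// Sketch; normalization rules and names are assumptions.
function normalizeParagraph(text) {
  return text
    .normalize('NFKC')
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, '') // keep letters, digits, whitespace
    .replace(/\s+/g, ' ')
    .trim();
}

function dedupeParagraphs(paragraphs) {
  const seen = new Set();
  return paragraphs.filter((paragraph) => {
    const key = normalizeParagraph(paragraph);
    // Drop empty paragraphs and any whose normalized form was already seen.
    if (key === '' || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

The first occurrence is kept in its original form, so only repeated copies are lost.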
Module: src/utils/core/error-handler.js
Technical error messages are never shown directly to users. ErrorHandler.getErrorMessage() maps every error category to a safe, actionable message:
| Technical Error | User Message |
|---|---|
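| HTTP 401/Unauthorized | "API key non valida. Verifica la configurazione." (Invalid API key. Check your configuration.) |
| HTTP 429/Too Many Requests | "Rate limit raggiunto. Cambia provider o attendi." (Rate limit reached. Switch provider or wait.) |
| Network/fetch errors | "Errore di connessione. Verifica la tua connessione." (Connection error. Check your connection.) |
| `chrome://` URLs | "Impossibile analizzare pagine interne di Chrome." (Cannot analyze internal Chrome pages.) |
| QUOTA_BYTES | "Spazio di archiviazione esaurito." (Storage space exhausted.) |
| Unrecognized errors | "Si e verificato un errore imprevisto. Riprova." (An unexpected error occurred. Try again.) |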
Error logs stored for diagnostics follow strict privacy rules:
- Original messages are truncated to 200 characters
- Stack traces retain only the top 5 frames
- URLs are stripped of query parameters and fragments (only `origin + pathname`)
- No API keys or user PII are ever logged
- Logs are rotated at 50 entries (FIFO)
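
The truncation, URL stripping, and rotation rules above can be sketched as follows. `buildLogEntry`, `appendLog`, and the entry field names are hypothetical, and frame detection assumes V8-style stack traces where each frame line starts with `at`.

```javascript
// Sketch; function and field names are assumptions.
const MAX_MESSAGE_LENGTH = 200;
const MAX_STACK_FRAMES = 5;
const MAX_LOG_ENTRIES = 50;

function stripUrl(url) {
  try {
    const { origin, pathname } = new URL(url);
    return origin + pathname; // query string and fragment are discarded
  } catch {
    return '';
  }
}

function buildLogEntry(error, pageUrl) {
  // V8 (Chrome, Node) prefixes each stack frame line with "at".
  const frames = String(error.stack ?? '')
    .split('\n')
    .filter((line) => line.trim().startsWith('at '));
  return {
    message: String(error.message ?? '').slice(0, MAX_MESSAGE_LENGTH),
    stack: frames.slice(0, MAX_STACK_FRAMES).join('\n'),
    url: stripUrl(pageUrl),
  };
}

function appendLog(logs, entry) {
  // FIFO rotation: the oldest entries fall off once the cap is exceeded.
  return [...logs, entry].slice(-MAX_LOG_ENTRIES);
}
```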
Modules: src/utils/storage/compression-manager.js, cache-store.js, base-history-repository.js
Article data is compressed using LZ-string before storage, reducing quota usage and minimizing the data surface stored in chrome.storage.local.
- `_safeStorageSet()` catches `QUOTA_BYTES` errors and provides an actionable user message
- Automatic cache maintenance runs periodically via Chrome Alarms
- Cache entries have configurable TTL (time-to-live) with automatic eviction
All data resides in chrome.storage.local, which is:
- Sandboxed per-extension (no cross-extension access)
- Inaccessible from web pages
- Cleared when the extension is uninstalled
Module: src/utils/export/email-manager.js
Before constructing mailto: links:
- Recipient email is stripped of `\r`, `\n`, `\t`, and `%` characters
- Email format is validated via regex
- Article titles have newlines replaced with spaces
- Subject and body are encoded with `encodeURIComponent()`
PDF generation via jsPDF operates entirely client-side with no network calls. Content is escaped before being written to the PDF document.
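
The `mailto:` steps listed earlier in this section can be sketched as one function. `buildMailtoLink` is a hypothetical name and the email pattern is a deliberately strict assumption; the regex used by `email-manager.js` is not shown in this document.

```javascript
// Sketch; the function name and the email pattern are assumptions.
const EMAIL_PATTERN = /^[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

function buildMailtoLink(recipient, title, body) {
  // Strip characters that enable header injection or percent-encoded smuggling.
  const cleanRecipient = String(recipient).replace(/[\r\n\t%]/g, '').trim();
  if (!EMAIL_PATTERN.test(cleanRecipient)) {
    throw new Error('Invalid recipient address');
  }
  // Newlines in the subject are replaced with spaces.
  const subject = String(title).replace(/[\r\n]+/g, ' ').trim();
  return `mailto:${cleanRecipient}?subject=${encodeURIComponent(subject)}&body=${encodeURIComponent(body)}`;
}

console.log(buildMailtoLink('user@example.com', 'Line1\nLine2', 'a&b'));
// mailto:user@example.com?subject=Line1%20Line2&body=a%26b
```

Encoding the subject and body keeps characters such as `&` and `?` from being read as extra `mailto:` parameters.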
Module: src/pages/history/io-backup.js
Importing backup data from JSON files goes through a comprehensive validation pipeline:
| Check | Detail |
|---|---|
| File type | Must be JSON (MIME type or .json extension) |
| File size | Maximum 10 MB to prevent denial-of-service |
| JSON parsing | Wrapped in try-catch with descriptive error |
| Structure validation | Must contain version (string) and data (object) |
| Content validation | Must include singleArticles or multiAnalysis arrays |
| Metadata sanitization | Provider and content type whitelisted; language validated via /^[a-z]{2}(-[A-Z]{2})?$/ |
| Field sanitization | Title (500 chars), content (100 KB), excerpt (1 KB) max lengths enforced |
| UUID regeneration | All imported entries receive new crypto.randomUUID() IDs |
| Duplicate prevention | Existing IDs are checked before import |
| User confirmation | Modal dialog shows backup metadata before proceeding |
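
A subset of these checks (size, parsing, structure, and field sanitization) can be sketched as below. `parseBackup` and `sanitizeEntry` are hypothetical names, string length is used as a rough stand-in for byte size, and the caller is assumed to supply the new ID (for example from `crypto.randomUUID()`).

```javascript
// Sketch of part of the validation table; names and entry shape are assumptions.
const MAX_BACKUP_CHARS = 10 * 1024 * 1024; // rough proxy for the 10 MB limit
const LANGUAGE_PATTERN = /^[a-z]{2}(-[A-Z]{2})?$/;
const VALID_PROVIDERS = new Set(['groq', 'openai', 'anthropic', 'gemini']);

function parseBackup(jsonText) {
  if (jsonText.length > MAX_BACKUP_CHARS) {
    throw new Error('Backup file exceeds 10 MB');
  }
  let backup;
  try {
    backup = JSON.parse(jsonText);
  } catch (err) {
    throw new Error(`Backup is not valid JSON: ${err.message}`);
  }
  const data = backup && typeof backup === 'object' ? backup.data : null;
  if (typeof backup?.version !== 'string' || data === null || typeof data !== 'object') {
    throw new Error('Backup must contain a version string and a data object');
  }
  if (!Array.isArray(data.singleArticles) && !Array.isArray(data.multiAnalysis)) {
    throw new Error('Backup contains no singleArticles or multiAnalysis array');
  }
  return backup;
}

function sanitizeEntry(entry, newId) {
  return {
    id: newId, // imported IDs are never trusted; the caller passes a fresh one
    title: String(entry.title ?? '').slice(0, 500),
    content: String(entry.content ?? '').slice(0, 100 * 1024),
    excerpt: String(entry.excerpt ?? '').slice(0, 1024),
    provider: VALID_PROVIDERS.has(entry.provider) ? entry.provider : null,
    language: LANGUAGE_PATTERN.test(entry.language) ? entry.language : null,
  };
}
```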
File: package.json
The extension uses a minimal set of well-known, actively maintained dependencies:
| Dependency | Purpose | Security Profile |
|---|---|---|
| `@mozilla/readability` | Article content extraction | Mozilla-maintained, battle-tested |
| `jspdf` | PDF generation | Client-side only, no network calls |
| `lz-string` | Data compression | Pure JavaScript, no dependencies |
| `pdfjs-dist` | PDF parsing | Mozilla-maintained, worker-sandboxed |
Known transitive vulnerabilities are addressed via npm overrides:

    "overrides": {
      "rollup": ">=2.80.0"
    }

File: .github/workflows/ci.yml
Every push and pull request triggers an automated pipeline:
| Step | Purpose |
|---|---|
| `npm audit --audit-level=high` | Fails the build if high-severity CVEs are found in dependencies |
| `npm run lint` | ESLint with security-focused rules (no-eval, eqeqeq) |
| `npm test` | 641 unit tests covering security-critical modules |
| `npm run build` | Ensures the production bundle compiles without errors |
If you discover a security vulnerability, please report it responsibly:
- Do not open a public GitHub issue
- Email the maintainer at the address listed in the GitHub profile: @AndreaBonn
- Include a description of the vulnerability, steps to reproduce, and potential impact
- Allow reasonable time for a fix before public disclosure
We aim to acknowledge reports within 48 hours and provide a fix within 7 days for critical issues.
This document reflects the security posture as of version 2.2.0 (April 2026).