Summary
Conduct a systematic audit of g0's detection capabilities against all known MCP server and agent skill attack patterns to ensure comprehensive coverage and identify any gaps.
Motivation
The MCP security scanning space is maturing rapidly, and standardized attack pattern taxonomies are emerging. g0 should verify it detects all known attack types and achieves low false positive rates on legitimate skills/servers.
Known Attack Patterns to Verify
MCP Server Attacks
- Prompt injection in tool descriptions
- Tool poisoning — malicious behavior hidden in tool implementations
- Tool shadowing — overriding legitimate tools with malicious versions
- Cross-origin tool confusion — tools that impersonate other servers
- Capability inflation — tools claiming broader access than needed
- Rug-pull attacks — tool descriptions changing after initial approval
- Exfiltration via tool output — embedding sensitive data in responses
- Parameter injection — malicious content in tool parameters
- Server instruction injection — malicious instructions in server metadata
- Toxic flows — chains of tools that create security risks when combined
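As one illustration of what a synthetic fixture could exercise, a rug-pull can be caught by comparing each tool definition against a snapshot taken at approval time. The sketch below is not g0's implementation; the `ToolDefinition` shape and the function names are assumptions made for this example.

```typescript
// Hypothetical shape of a tool definition as advertised by an MCP server.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

// Serialize with sorted object keys so that key order alone never counts as a change.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Returns the names of tools that differ from the approved snapshot,
// including tools that did not exist when the snapshot was taken.
function findChangedTools(
  approved: ToolDefinition[],
  current: ToolDefinition[],
): string[] {
  const pinned = new Map<string, string>(
    approved.map((tool): [string, string] => [tool.name, canonicalize(tool)]),
  );
  return current
    .filter((tool) => pinned.get(tool.name) !== canonicalize(tool))
    .map((tool) => tool.name);
}
```

A fixture pair for this pattern would hold an approved definition and a later one whose description has gained hidden instructions; the audit would expect the second to be flagged.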
Agent Skill Attacks
- Malicious skill packages — skills that execute harmful operations
- Dependency confusion — skills that load malicious dependencies
- Typosquatting — skills with names similar to legitimate ones
- Credential harvesting — skills that capture and exfiltrate credentials
- Backdoor installation — skills that establish persistent access
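Typosquatting is commonly approximated with an edit-distance check against a list of trusted names. The following is a minimal sketch under that assumption, not a description of how g0 detects it; the function names and the default threshold of 2 are chosen for illustration only.

```typescript
// Levenshtein distance using two rolling rows of the dynamic-programming table.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the trusted names that the candidate closely resembles without
// matching exactly. An exact match (distance 0) is not a lookalike.
function findLookalikes(
  candidate: string,
  trustedNames: string[],
  maxDistance = 2,
): string[] {
  const name = candidate.toLowerCase();
  return trustedNames.filter((trusted) => {
    const distance = editDistance(name, trusted.toLowerCase());
    return distance > 0 && distance <= maxDistance;
  });
}
```

Short names produce many near matches at a fixed threshold, so the legitimate corpus is a natural place to measure how often a check like this misfires.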
Detection Quality Metrics
- Recall — what percentage of known malicious patterns does g0 detect?
- Precision — what fraction of g0's findings are true positives, and what is the false positive rate on legitimate skills/servers?
- Severity accuracy — are findings classified at the right severity level?
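Recall and precision can be computed from per-fixture outcomes once each fixture carries a ground-truth label. The sketch below assumes a hypothetical `ScanOutcome` record with one entry per scanned fixture; it does not reflect g0's actual output format. It reports the false positive rate separately, since that is measured over legitimate fixtures only, whereas precision is measured over everything the scanner flagged.

```typescript
// Hypothetical per-fixture result: ground truth plus whether the scanner flagged it.
interface ScanOutcome {
  fixture: string;
  malicious: boolean; // ground-truth label from the corpus
  flagged: boolean; // true if the scanner reported at least one finding
}

interface DetectionMetrics {
  truePositives: number;
  falsePositives: number;
  falseNegatives: number;
  trueNegatives: number;
  recall: number; // TP / (TP + FN)
  precision: number; // TP / (TP + FP)
  falsePositiveRate: number; // FP / (FP + TN)
}

function computeMetrics(outcomes: ScanOutcome[]): DetectionMetrics {
  let tp = 0;
  let fp = 0;
  let fn = 0;
  let tn = 0;
  for (const outcome of outcomes) {
    if (outcome.malicious && outcome.flagged) tp++;
    else if (outcome.malicious && !outcome.flagged) fn++;
    else if (!outcome.malicious && outcome.flagged) fp++;
    else tn++;
  }
  // Report 0 instead of NaN when a denominator is empty (e.g. an empty corpus).
  const ratio = (numerator: number, denominator: number): number =>
    denominator === 0 ? 0 : numerator / denominator;
  return {
    truePositives: tp,
    falsePositives: fp,
    falseNegatives: fn,
    trueNegatives: tn,
    recall: ratio(tp, tp + fn),
    precision: ratio(tp, tp + fp),
    falsePositiveRate: ratio(fp, fp + tn),
  };
}
```

Counting a fixture as flagged on any finding is the simplest choice; a stricter audit could require that the finding match the attack pattern the fixture was written for.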
Proposed Work
- Build a test corpus of known malicious MCP tools/skills (synthetic, not real malware)
- Build a legitimate corpus of 100+ popular, trusted MCP tools/skills
- Run g0 scan against both and measure recall/precision
- Identify gaps — attack patterns that g0 misses
- Add missing rules for any undetected patterns
- Tune severities based on real-world impact
- Document coverage matrix showing which rules detect which attack patterns
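The gap-identification and coverage-matrix steps above could be driven from the same scan results. The sketch below assumes each detection is recorded as a pair of attack pattern and rule ID; the pattern keys, the rule IDs, and the `GAP` marker are all hypothetical and not taken from g0.

```typescript
// Maps each attack pattern to the IDs of the rules that detected it.
type CoverageMatrix = Record<string, string[]>;

// Hypothetical record: a rule fired on a fixture written for a given pattern.
interface Detection {
  pattern: string;
  ruleId: string;
}

function buildCoverage(patterns: string[], detections: Detection[]): CoverageMatrix {
  const matrix: CoverageMatrix = {};
  // Start every known pattern with an empty list so that gaps stay visible.
  for (const pattern of patterns) {
    matrix[pattern] = [];
  }
  for (const detection of detections) {
    const rules = matrix[detection.pattern];
    // Ignore detections for unknown patterns and skip duplicate rule IDs.
    if (rules && !rules.includes(detection.ruleId)) {
      rules.push(detection.ruleId);
    }
  }
  return matrix;
}

// Patterns for which no rule fired at all.
function findGaps(matrix: CoverageMatrix): string[] {
  return Object.keys(matrix).filter((pattern) => matrix[pattern].length === 0);
}

// Renders the matrix as a Markdown table, e.g. for a coverage document.
function toMarkdownTable(matrix: CoverageMatrix): string {
  const rows = Object.entries(matrix).map(
    ([pattern, rules]) =>
      `| ${pattern} | ${rules.length > 0 ? rules.join(", ") : "GAP"} |`,
  );
  return ["| Attack pattern | Detecting rules |", "| --- | --- |", ...rows].join("\n");
}
```

Seeding the matrix from the full pattern list, rather than from the detections, is what makes an undetected pattern show up as an explicit row instead of silently disappearing.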
Files to Create/Modify
- tests/fixtures/malicious-mcp/ — synthetic malicious MCP tools
- tests/fixtures/legitimate-mcp/ — legitimate MCP tool corpus
- tests/integration/detection-audit.test.ts — automated coverage test
- New rules in src/rules/builtin/ for any gaps found
- docs/detection-coverage.md — coverage matrix documentation
Acceptance Criteria