fix: reduce false positives from real-world repo analysis by JBAhire · Pull Request #142 · guard0-ai/g0

JBAhire · 2026-04-08T21:21:53Z

Summary

Ran g0 against 14 real-world OSS repos and cut findings by 89% (4,578 → 482) by eliminating false positives across two rounds of fixes.

Repos tested

langchain, langchainjs, langgraph, crewAI-examples, MCP servers, MCP python-sdk, openai-agents-python, swarm, autogen, vercel ai-chatbot, langchain4j, langchaingo, spring-ai, open-interpreter

Final Results (default `--min-confidence medium`)

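| Repo | Before | After | Reduction | Score (before→after) |
| --- | --- | --- | --- | --- |
| servers | 373 | 17 | 95% | 67→85 |
| crewAI-examples | 238 | 158 | 34% | 45→47 |
| langchain | 317 | 3 | 99% | 47→48 |
| langchainjs | 299 | 22 | 93% | 51→58 |
| langgraph | 110 | 39 | 65% | 66→67 |
| autogen | 695 | 89 | 87% | 22→24 |
| ai-chatbot | 100 | 29 | 71% | 60→60 |
| python-sdk | 26 | 10 | 62% | 92→82 |
| openai-agents-python | 106 | 26 | 75% | 68→67 |
| langchain4j | 50 | 24 | 52% | 81→88 |
| langchaingo | 126 | 33 | 74% | 79→86 |
| spring-ai | 281 | 32 | 89% | 80→89 |
| swarm | 377 | 0 | 100% | 69→100 |
| open-interpreter | 1480 | 0 | 100% | 17→100 |
| Total | 4,578 | 482 | 89% | |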

Round 1 Fixes (commit 1)

- Test file filtering: expanded isTestFile() patterns; all test-file findings are now removed by default
- CVE version placeholders: reject ${project.version} and workspace:* as non-numeric
- Utility-code suppression: suppress when >80% of findings are utility-code
- Server endpoint detection: server.(py|ts|js) pattern for MCP servers
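
The utility-code suppression rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in g0: the `Finding` shape, the `reachability` field, and the function name `suppressUtilityNoise` are hypothetical, and only the 80% threshold comes from the description.

```typescript
// Hypothetical finding shape; the real g0 types may differ.
interface Finding {
  ruleId: string;
  reachability: "utility-code" | "endpoint-reachable" | "other";
}

// Threshold from the PR description: suppress when more than 80% of findings are utility-code.
const UTILITY_RATIO_THRESHOLD = 0.8;

// Returns the findings to keep. When utility-code findings dominate the scan,
// they are treated as noise and dropped; otherwise everything is kept.
function suppressUtilityNoise(findings: Finding[]): Finding[] {
  if (findings.length === 0) return findings;
  const utilityCount = findings.filter(
    (f) => f.reachability === "utility-code",
  ).length;
  const ratio = utilityCount / findings.length;
  if (ratio > UTILITY_RATIO_THRESHOLD) {
    return findings.filter((f) => f.reachability !== "utility-code");
  }
  return findings;
}
```

The comparison is strict, so a scan where exactly 80% of findings are utility-code is left untouched in this sketch. Whether the real check is strict or inclusive is not stated in the PR.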

Round 2 Fixes (commit 2)

- CVE package matching: affectedPackages field so CVE-2026-28363 only matches openclaw (eliminated 430 false critical CVEs)
- Framework library detection: isFrameworkLibFile() downgrades confidence for langchain_core/, crewai/src/, autogen_agentchat/ (hidden by default)
- Blanket rule cleanup: AA-GI-059, AA-IA-096, AA-MP-086, AA-RA-068, AA-DL-102 downgraded to info/low
- AA-CF-056 cap: max 3 findings per scan, severity medium
- MCP tool server exclusion: file_not_matches context for AA-RA-041/AA-HO-075 (new yaml-compiler feature)
- AA-TS-142 scoped to Python: no longer fires on React .tsx setTimeout/setInterval
- Framework filter on agent_property: AA-TS-184 now respects frameworks: [mcp]
- Intelligence test file filter: IOC findings in test files now filtered
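
As a rough illustration of the AA-CF-056 cap, a per-rule limit could be applied as a post-processing filter. The `Finding` shape, the `MAX_FINDINGS_PER_RULE` table, and `capFindings` are hypothetical names for this sketch; only the rule ID and the limit of 3 come from the list above.

```typescript
// Hypothetical finding shape; the real g0 types may differ.
interface Finding {
  ruleId: string;
  file: string;
}

// Per-rule caps. Only the AA-CF-056 limit of 3 is taken from the PR description.
const MAX_FINDINGS_PER_RULE: Record<string, number> = { "AA-CF-056": 3 };

// Keeps findings in their original order, dropping any beyond a rule's cap.
// Rules without an entry in the table are never capped.
function capFindings(findings: Finding[]): Finding[] {
  const seen = new Map<string, number>();
  return findings.filter((f) => {
    const cap = MAX_FINDINGS_PER_RULE[f.ruleId];
    if (cap === undefined) return true;
    const count = (seen.get(f.ruleId) ?? 0) + 1;
    seen.set(f.ruleId, count);
    return count <= cap;
  });
}
```

With 101 AA-CF-056 findings in the input, as reported for the CrewAI case, this sketch would keep the first three and leave findings from other rules alone.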

Test plan

- All 1,420 tests pass
- 14 real-world repos scanned, 89% FP reduction
- No regressions on repos with good agent detection
- langchain: 317→3, langchainjs: 299→22, spring-ai: 281→32

Tested g0 against 16 OSS repos (langchain, crewAI, AutoGPT, MCP servers, spring-ai, etc.) and found 70% of 7,983 findings were false positives. Root causes and fixes:

1. Test file filtering too lenient (engine.ts)
   - Expanded isTestFile() patterns: .github/, __mocks__/, _test_/, testing/, testutils/, testdata/, _tests.py, Tests.java
   - Changed default from "keep downgraded critical/high in test files" to "remove all test-file findings" (--include-tests still works)
2. CVE version comparison bug (cve-feed.ts)
   - Maven ${project.version}, pnpm workspace:*, workspace:^ were parsed as NaN, which compared as "less than" any version
   - Added early return for non-numeric version placeholders
   - Eliminated 430 false critical CVE findings across 7 repos
3. Utility-code suppression too conservative (pipeline.ts)
   - Previously only suppressed when agents/tools were detected
   - Now also suppresses when >80% of findings are utility-code
   - Prevents noise floods in large repos where parsers miss entry points
4. Server file endpoint detection (reachability.ts)
   - Added server.(py|ts|js) pattern to endpoint detection
   - MCP server files now classified as endpoint-reachable instead of utility-code, preserving real findings in server repos
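
The early return for version placeholders (item 2) might look something like the sketch below. It is not the code in cve-feed.ts: `isConcreteVersion`, `compareVersions`, and `isAffected` are hypothetical names, and the sketch only accepts plain dotted numeric versions, so pre-release tags and range operators are out of scope here.

```typescript
// Matches plain dotted numeric versions such as "1.2.3" (optionally prefixed by "v").
const NUMERIC_VERSION = /^v?\d+(\.\d+)*$/;

// True only for versions that can be compared numerically.
// Placeholders like "${project.version}" or "workspace:*" fail this check.
function isConcreteVersion(version: string): boolean {
  return NUMERIC_VERSION.test(version.trim());
}

// Compares two dotted numeric versions; returns negative, zero, or positive.
// Missing segments are treated as 0, so "2.1" equals "2.1.0".
function compareVersions(a: string, b: string): number {
  const pa = a.trim().replace(/^v/, "").split(".").map(Number);
  const pb = b.trim().replace(/^v/, "").split(".").map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical check: is `installed` older than the first fixed release?
function isAffected(installed: string, fixedIn: string): boolean {
  // Early return: a placeholder cannot be compared, so it is not reported.
  if (!isConcreteVersion(installed)) return false;
  return compareVersions(installed, fixedIn) < 0;
}
```

Guarding before the comparison keeps `NaN` out of the arithmetic entirely, which avoids depending on how a particular comparator happens to treat it.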

Round 2 of FP analysis against 14 OSS repos (4,578 -> 482 findings). Fixes:

1. CVE matching: add affectedPackages to CVEEntry so CVE-2026-28363 only matches openclaw, not langchain/ollama/google-generativeai. Also filter intelligence findings from test files (SSRF test IOCs).
2. Blanket agent_property rules: downgrade AA-GI-059, AA-IA-096, AA-MP-086, AA-RA-068, AA-DL-102 to info/low. These check for properties no parser ever populates, making them 100% FP.
3. Framework library detection: new isFrameworkLibFile() downgrades confidence to low for findings in langchain_core/, crewai/src/, autogen_agentchat/, etc. Hidden by default --min-confidence medium.
4. AA-CF-056 capped at 3 findings and downgraded to medium. It was firing 101 times (once per CrewAI agent) for the same systemic issue.
5. AA-RA-041/AA-HO-075: add file_not_matches context for MCP fetch servers. Also add file_not_matches support to yaml-compiler for both ast_matches and code_matches check types.
6. AA-TS-142: scope to Python only. It was firing on React .tsx files matching setTimeout/setInterval.
7. AA-TS-184: enforce frameworks filter on agent_property checks. The rule declared frameworks: [mcp] but fired on langchain agents.
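
A minimal sketch of the package-scoped CVE matching from item 1 follows. Only the `affectedPackages` field name and the CVE/package pairing come from the commit message; the rest of the `CVEEntry` shape, the `Dependency` type, and `cveAppliesTo` are hypothetical, and the fallback for entries without a package list is an assumption.

```typescript
// Hypothetical shapes; only the `affectedPackages` field name comes from the PR.
interface CVEEntry {
  id: string;
  affectedPackages?: string[];
}

interface Dependency {
  name: string;
  version: string;
}

// Decides whether a CVE entry is relevant to a dependency by package name.
function cveAppliesTo(cve: CVEEntry, dep: Dependency): boolean {
  // Assumption: an entry without a package list places no restriction on the
  // package name, so other criteria (not shown) would decide the match.
  if (!cve.affectedPackages || cve.affectedPackages.length === 0) return true;
  const name = dep.name.toLowerCase();
  return cve.affectedPackages.some((pkg) => pkg.toLowerCase() === name);
}

const entry: CVEEntry = { id: "CVE-2026-28363", affectedPackages: ["openclaw"] };
const deps: Dependency[] = [
  { name: "openclaw", version: "1.0.0" },
  { name: "langchain", version: "0.3.0" },
  { name: "ollama", version: "0.4.0" },
];
// Only the openclaw dependency survives the package filter.
const matched = deps.filter((d) => cveAppliesTo(entry, d)).map((d) => d.name);
console.log(matched);
```

The version numbers in the usage example are placeholders for illustration. A version check would still run after this filter in any real pipeline.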

JBAhire added 2 commits April 8, 2026 14:21

JBAhire wants to merge 2 commits into main from fix/fp-analysis-from-real-repos
