fix: scoring recalibration, FP reduction, enriched JSON, Zod schemas by JBAhire · Pull Request #132 · guard0-ai/g0

JBAhire · 2026-03-27T00:36:38Z

Summary

Major quality overhaul — fixes scoring inversion bug, reduces false positives, enriches JSON API, adds platform security hardening, and establishes shared schema contract.

Scoring fix: A secure agent scored worse (B:85) than a vulnerable agent with shell injection (B:87). Now: vulnerable=D:69, secure=C:75.

CLI Changes (9 commits)

Scoring Recalibration

Severity deductions increased, up to 2x (critical 20→40, high 10→18, medium 4→6)
Critical-floor clamp: 1 vuln critical→max C, 2→max D, 3+→max F
Unknown reachability 0.6→0.85, not-assessed exploitability 0.7→0.85
FindingCategory type: vulnerability | hardening | informational
Absence-check rules (no kill switch, no RBAC) downgraded to medium
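
The critical-floor clamp above can be sketched as a grade ceiling applied after the numeric score is graded. This is a minimal illustration, not the code in this PR: `clampGrade`, `maxGradeForCriticals`, and `GRADE_ORDER` are hypothetical names, and it assumes only vulnerability-category criticals are counted.

```typescript
// Hypothetical sketch of the critical-floor clamp; names are assumptions.
type Grade = "A" | "B" | "C" | "D" | "F";

// Ordered best to worst, so a higher index means a worse grade.
const GRADE_ORDER: Grade[] = ["A", "B", "C", "D", "F"];

// Best grade still allowed for a given number of critical vulnerabilities:
// 1 critical -> max C, 2 -> max D, 3 or more -> max F.
function maxGradeForCriticals(criticalVulns: number): Grade {
  if (criticalVulns >= 3) return "F";
  if (criticalVulns === 2) return "D";
  if (criticalVulns === 1) return "C";
  return "A";
}

// Returns the worse of the raw grade and the ceiling implied by the criticals.
function clampGrade(raw: Grade, criticalVulns: number): Grade {
  const ceiling = maxGradeForCriticals(criticalVulns);
  const idx = Math.max(GRADE_ORDER.indexOf(raw), GRADE_ORDER.indexOf(ceiling));
  return GRADE_ORDER[idx];
}
```

Under these assumptions, a raw B with one critical vulnerability is reported as C, while a raw D with one critical stays D because the clamp only ever lowers a grade.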

False Positive Reduction

AA-DL-046: skip imports, expand memory isolation patterns
AA-GI-002: expanded instruction guarding detection
AA-TS-184: framework-scoped YAML agent_property rules
AA-TS-021: narrow network access check to tool scope
Non-agent guard: drop hardening findings when no agents/tools detected
Self-scan: exclude .claude/worktrees/, fixtures/, CVE docs

JSON API

metadata (frameworks, agentCount, toolCount, filesScanned)
score.securityScore / score.hardeningScore split
finding.category, finding.message, finding.fix aliases
graph.nodes[] and graph.edges[] populated
Banner suppressed for --cyclonedx and --output

Architecture

Zod schemas for all 8 upload payload types (shared contract)
Code splitting (SDK consumers get smaller bundles)
BaseFinding interface, Rule type safety (removed unsafe casts)
AI provider throws on --ai without API key
Structured logger (TTY-aware, JSON in CI)
--strict flag (exit 2 on critical findings)
guard0 npm wrapper (g0 command only, no guard0 command conflict)

Results

Test Case	Before	After
Vulnerable agent (shell+SQL injection)	B (87)	D (69)
Secure agent (proper controls)	B (85)	C (75)
Non-agent code (Flask)	A (99)	A (100)

Test plan

Vulnerable agent scores D, secure agent scores C, non-agent scores A
JSON output has metadata, categories, graph nodes, message/fix
Zod schemas validate correct payloads, reject malformed
--strict exits with code 2 on critical findings
102/103 test files pass (1 pre-existing daemon failure)
Build succeeds with code splitting

Fixes the #1 trust-destroying bug: a secure agent scored worse (B:85) than a vulnerable agent with shell injection (B:87).

Scoring changes:
- Severity deductions increased (critical 20→40, high 10→18, medium 4→6)
- Critical-floor clamp: 1 crit→max C, 2→max D, 3+ criticals→max F
- Unknown reachability 0.6→0.85 (assume reachable until proven safe)
- Not-assessed exploitability 0.7→0.85 (same principle)
- Correlation bonus capped at 50% of remaining domain score

Finding categorization:
- Added FindingCategory type: vulnerability | hardening | informational
- Improved isAbsenceBased() with title-pattern fallback detection

Rule severity downgrades (absence → hardening):
- AA-HO-005 "No emergency stop": critical → medium
- AA-IA-030 "No RBAC enforcement": high → medium
- AA-RA-011 "No kill switch": critical → medium

Results: Vulnerable agent B:87→D:69, Secure agent B:85→C:72

Also includes:
- guard0 CLI alias (guard0 + g0 both work)
- guard0 npm wrapper package for `npm install guard0`
- Version sync script (scripts/version.mjs)
- Updated release workflow to publish guard0 wrapper
- Updated banner to show GUARD0 branding

…mport skipping

AA-DL-046 (shared memory):
- Skip import statements (only flag actual instantiation)
- Expand isolation patterns: thread_id, config.*thread, configurable, memory_key=, chat_history

checkInstructionGuarding (shared parser):
- Add broader deny patterns: MUST NOT, you can only, do not disclose, outside these boundaries, politely decline, refuse any requests
- Fixes FP where well-written system prompts weren't recognized as guarded

Results: Secure agent 0 criticals (was 1), score C:75 (was C:72)

…ge/fix aliases

JSON reporter:
- Added metadata: frameworks, agentCount, toolCount, promptCount, modelCount, filesScanned
- Added score.securityScore and score.hardeningScore split
- Added finding.category (vulnerability | hardening | informational)
- Added finding.message (alias for title) and finding.fix (alias for remediation)
- Added graph.nodes[] (agents, tools, models with file/line)
- Added graph.edges[] (typed edges between nodes)

Analysis engine:
- Derives finding category from checkType and title patterns
- Absence-based rules (no X, missing Y, lacks Z) → hardening
- Code-pattern rules → vulnerability
- Info severity → informational

CLI:
- Banner now suppressed for --cyclonedx and --output flags
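
The category derivation described above might look roughly like the following. This is a sketch under stated assumptions: `deriveCategory`, `FindingLike`, the `"absence"` value for `checkType`, and the exact title regex are all hypothetical, chosen only to mirror the "no X, missing Y, lacks Z" wording.

```typescript
// Hypothetical sketch of category derivation; not the analysis engine's code.
type FindingCategory = "vulnerability" | "hardening" | "informational";
type Severity = "critical" | "high" | "medium" | "low" | "info";

interface FindingLike {
  severity: Severity;
  title: string;
  checkType?: string; // assumed: "absence" marks absence-based rules
}

// Assumed title fallback: titles that start with "No", "Missing", or "Lacks".
const ABSENCE_TITLE = /^(no|missing|lacks)\b/i;

function deriveCategory(f: FindingLike): FindingCategory {
  // Info severity is always informational.
  if (f.severity === "info") return "informational";
  // Absence-based rules, detected by checkType or by title pattern, are hardening.
  if (f.checkType === "absence" || ABSENCE_TITLE.test(f.title.trim())) {
    return "hardening";
  }
  // Everything else is treated as a code-pattern vulnerability.
  return "vulnerability";
}
```

The word boundary in the regex keeps a title such as "Notification leak" from being misread as an absence-based finding.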

New file: src/platform/schemas/upload.ts
- 8 upload payload schemas (scan, inventory, mcp, test, flows, endpoint, host-hardening, openclaw-audit) as a discriminated union on 'type'
- Endpoint register + heartbeat schemas
- Shared sub-schemas: Finding, ScanScore, ProjectMeta, MachineMeta, CIMeta
- Validates: severity enums, score ranges (0-100), grade enums, required fields

All schemas exported from @guard0/g0 package index so the platform can:

    import { UploadPayloadSchema } from '@guard0/g0'
    const result = UploadPayloadSchema.safeParse(payload)

This is the shared schema contract between CLI and platform: single source of truth, runtime validation, no more silent schema drift.

FP fixes:
- agent_property YAML rules now skip when framework filter doesn't match (fixes AA-TS-184 MCP rule firing on LangChain agents)
- Non-agent project guard: drop hardening findings when no agents/tools detected (Flask app: 5 findings → 2, score 99 → 100)

Architecture fixes:
- AI provider throws explicit error when --ai set but no API key configured (was silent console.error, now fails loudly)
- Rule interface: added suppressedBy/requiresControl fields directly, removed unsafe `as Rule & Record<string, unknown>` type assertions
- ModelNode: added maxTokens/temperature/topP optional fields
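
The non-agent project guard can be illustrated with a small filter. This is a sketch only: `applyNonAgentGuard`, `ScanSummary`, and `CategorizedFinding` are hypothetical names, and it assumes "no agents/tools detected" means both counts are zero.

```typescript
// Hypothetical sketch of the non-agent project guard; names are assumptions.
interface ScanSummary {
  agentCount: number;
  toolCount: number;
}

interface CategorizedFinding {
  id: string;
  category: "vulnerability" | "hardening" | "informational";
}

// Drops hardening findings when the scan found neither agents nor tools,
// since missing agent controls are not meaningful for a plain web app.
function applyNonAgentGuard<T extends CategorizedFinding>(
  findings: T[],
  summary: ScanSummary,
): T[] {
  const isAgentProject = summary.agentCount > 0 || summary.toolCount > 0;
  if (isAgentProject) return findings;
  return findings.filter((f) => f.category !== "hardening");
}
```

Vulnerability and informational findings pass through untouched, so a non-agent project can still report real code-level issues.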

Build:
- Split tsup config into CLI (single bundle + shebang) and SDK/daemon (code-split for smaller imports)
- SDK consumers importing { runScan } no longer pull in CLI code

Types:
- Added BaseFinding interface (severity, title, description, category)
- Finding now extends BaseFinding
- Exported BaseFinding and FindingCategory from package index

New: src/utils/logger.ts
- Human-readable colored output in TTY mode
- JSON lines on stderr in non-TTY/CI mode
- Log levels via G0_LOG_LEVEL env var (error/warn/info/debug)
- Replaced console.error in platform/upload.ts with logger.error
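
A TTY-aware logger of this kind can be reduced to two pure decisions: whether a level passes the configured threshold, and how to format the line. The sketch below is illustrative, not the contents of src/utils/logger.ts: `shouldLog` and `formatLine` are hypothetical names, the default level of "info" is an assumption, and ANSI coloring is left out for brevity.

```typescript
// Hypothetical sketch of level filtering and TTY-aware formatting.
type LogLevel = "error" | "warn" | "info" | "debug";

// Ordered from least to most verbose.
const LEVELS: LogLevel[] = ["error", "warn", "info", "debug"];

// True when `level` is at or below the configured verbosity.
// Missing or unrecognized values fall back to "info" (an assumption).
function shouldLog(level: LogLevel, configured: string | undefined): boolean {
  const requested = LEVELS.indexOf((configured ?? "info") as LogLevel);
  const threshold = requested === -1 ? LEVELS.indexOf("info") : requested;
  return LEVELS.indexOf(level) <= threshold;
}

// Human-readable text for terminals, one JSON object per line otherwise.
function formatLine(level: LogLevel, message: string, isTTY: boolean): string {
  if (isTTY) return `[${level.toUpperCase()}] ${message}`;
  return JSON.stringify({ level, message });
}
```

In a Node process, a caller would typically pass `process.env.G0_LOG_LEVEL` as `configured` and `Boolean(process.stderr.isTTY)` as `isTTY`, then write the result to stderr.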

- Removed guard0 bin entry from main package (conflicts with enterprise pip)
- Removed guard0 bin entry from wrapper package
- Removed cliName() detection — always 'g0'
- guard0 npm package still exists for discoverability (npm install guard0) but only installs the g0 command

- AA-TS-021: narrow network access check to within 2000 chars of tool definition (was file-wide, caused FPs on non-tool network calls)
- Self-scan: exclude .claude/worktrees/, advisories/, CVE docs from analysis
- Added --strict flag: exit code 2 if any critical finding exists
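
The AA-TS-021 change, scoping the network check to a window around each tool definition, could be sketched as below. Everything here is an assumption made for illustration: the function name, the `@tool` and network-call regexes, and the choice to measure the 2000 characters forward from the start of the tool definition rather than in both directions.

```typescript
// Hypothetical sketch of a tool-scoped network access check.
const WINDOW_CHARS = 2000;

// Assumed patterns; the real rule's patterns are not shown in this PR text.
const TOOL_DEF = /@tool\b/g;
const NETWORK_CALL = /\brequests\.(get|post)\(|\bfetch\(/;

// True only if a network call appears within WINDOW_CHARS after a tool definition.
function hasNetworkCallNearTool(source: string): boolean {
  TOOL_DEF.lastIndex = 0; // reset state carried by the global regex
  let match: RegExpExecArray | null;
  while ((match = TOOL_DEF.exec(source)) !== null) {
    const windowText = source.slice(match.index, match.index + WINDOW_CHARS);
    if (windowText.search(NETWORK_CALL) !== -1) return true;
  }
  return false;
}
```

A file-wide check would flag any module that both defines a tool and makes a network call somewhere; the window keeps unrelated calls far from the tool definition from triggering the rule.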

JBAhire added 9 commits March 26, 2026 16:40

JBAhire wants to merge 9 commits into main from fix/scoring-recalibration
