fix: scoring recalibration, FP reduction, enriched JSON, Zod schemas #132
Open
JBAhire wants to merge 9 commits into
Fixes the #1 trust-destroying bug: a secure agent scored worse (B:85) than a vulnerable agent with shell injection (B:87).
Scoring changes:
- Severity deductions increased (critical 20→40, high 10→18, medium 4→6)
- Critical-floor clamp: 1 critical→max C, 2→max D, 3+ criticals→max F
- Unknown reachability 0.6→0.85 (assume reachable until proven safe)
- Not-assessed exploitability 0.7→0.85 (same principle)
- Correlation bonus capped at 50% of remaining domain score
Finding categorization:
- Added FindingCategory type: vulnerability | hardening | informational
- Improved isAbsenceBased() with title-pattern fallback detection
Rule severity downgrades (absence → hardening):
- AA-HO-005 "No emergency stop": critical → medium
- AA-IA-030 "No RBAC enforcement": high → medium
- AA-RA-011 "No kill switch": critical → medium
Results: Vulnerable agent B:87→D:69, Secure agent B:85→C:72
Also includes:
- guard0 CLI alias (guard0 + g0 both work)
- guard0 npm wrapper package for `npm install guard0`
- Version sync script (scripts/version.mjs)
- Updated release workflow to publish guard0 wrapper
- Updated banner to show GUARD0 branding
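The recalibrated scoring can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: the deductions and default weights come from the commit message, while the 90/80/70/60 grade bands, the multiplicative weighting, and the names `scoreFindings`, `criticalCeiling`, and `gradeFor` are hypothetical. The correlation bonus is left out.

```typescript
type Severity = "critical" | "high" | "medium";
type Grade = "A" | "B" | "C" | "D" | "F";

interface ScoredFinding {
  severity: Severity;
  reachability?: number;   // 0..1; undefined means "unknown"
  exploitability?: number; // 0..1; undefined means "not assessed"
}

// Deductions and default weights quoted in the commit message.
const DEDUCTION: Record<Severity, number> = { critical: 40, high: 18, medium: 6 };
const UNKNOWN_REACHABILITY = 0.85;
const NOT_ASSESSED_EXPLOITABILITY = 0.85;

// ASSUMPTION: conventional 90/80/70/60 grade bands.
function gradeFor(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}

// Highest score allowed for a given number of criticals
// (1 → max C, 2 → max D, 3+ → max F), under the assumed bands.
function criticalCeiling(criticals: number): number {
  if (criticals >= 3) return 59;
  if (criticals === 2) return 69;
  if (criticals === 1) return 79;
  return 100;
}

function scoreFindings(findings: ScoredFinding[]): { score: number; grade: Grade } {
  let penalty = 0;
  let criticals = 0;
  for (const f of findings) {
    const reach = f.reachability ?? UNKNOWN_REACHABILITY;
    const exploit = f.exploitability ?? NOT_ASSESSED_EXPLOITABILITY;
    // ASSUMPTION: both weights multiply the base deduction.
    penalty += DEDUCTION[f.severity] * reach * exploit;
    if (f.severity === "critical") criticals += 1;
  }
  const raw = Math.max(0, 100 - penalty);
  const score = Math.round(Math.min(raw, criticalCeiling(criticals)));
  return { score, grade: gradeFor(score) };
}
```

Under these assumptions, a single critical finding caps the grade at C even when its reachability and exploitability weights are tiny, which is the behavior the clamp is meant to guarantee.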
…mport skipping
AA-DL-046 (shared memory):
- Skip import statements (only flag actual instantiation)
- Expand isolation patterns: thread_id, config.*thread, configurable, memory_key=, chat_history
checkInstructionGuarding (shared parser):
- Add broader deny patterns: MUST NOT, you can only, do not disclose, outside these boundaries, politely decline, refuse any requests
- Fixes FP where well-written system prompts weren't recognized as guarded
Results: Secure agent 0 criticals (was 1), score C:75 (was C:72)
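A minimal sketch of the two false-positive fixes described above, assuming simple line-based and regex-based matching. The pattern lists are taken from the commit message; the function names, the file-wide scope of the isolation check, and the case-insensitive matching are assumptions, not the PR's implementation.

```typescript
// Isolation hints listed in the commit message.
const ISOLATION_PATTERNS: RegExp[] = [
  /thread_id/, /config.*thread/, /configurable/, /memory_key=/, /chat_history/,
];

// Matches JS/TS `import ...` and Python `import x` / `from x import y` lines.
const IMPORT_LINE = /^\s*(import\b|from\s+\S+\s+import\b)/;

// Returns 1-based line numbers where memory is instantiated without isolation.
// `instantiation` is a caller-supplied, non-global regex (hypothetical parameter).
function findUnisolatedMemory(source: string, instantiation: RegExp): number[] {
  // ASSUMPTION: an isolation hint anywhere in the file suppresses the rule.
  if (ISOLATION_PATTERNS.some((p) => p.test(source))) return [];
  const hits: number[] = [];
  source.split("\n").forEach((line, i) => {
    if (IMPORT_LINE.test(line)) return; // an import alone is not instantiation
    if (instantiation.test(line)) hits.push(i + 1);
  });
  return hits;
}

// Deny phrases listed in the commit message, matched case-insensitively (assumption).
const DENY_PATTERNS: RegExp[] = [
  /must not/i, /you can only/i, /do not disclose/i,
  /outside these boundaries/i, /politely decline/i, /refuse any requests/i,
];

function looksGuarded(systemPrompt: string): boolean {
  return DENY_PATTERNS.some((p) => p.test(systemPrompt));
}
```

For example, a file that only imports a memory class produces no hit, while a bare instantiation on a later line is reported by line number.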
…ge/fix aliases
JSON reporter:
- Added metadata: frameworks, agentCount, toolCount, promptCount, modelCount, filesScanned
- Added score.securityScore and score.hardeningScore split
- Added finding.category (vulnerability | hardening | informational)
- Added finding.message (alias for title) and finding.fix (alias for remediation)
- Added graph.nodes[] (agents, tools, models with file/line)
- Added graph.edges[] (typed edges between nodes)
Analysis engine:
- Derives finding category from checkType and title patterns
- Absence-based rules (no X, missing Y, lacks Z) → hardening
- Code-pattern rules → vulnerability
- Info severity → informational
CLI:
- Banner now suppressed for --cyclonedx and --output flags
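The category derivation and the new JSON aliases might look something like the sketch below. The `RawFinding` shape, the `"absence"` value for `checkType`, the exact title regex, and the precedence of the info check are assumptions made for illustration.

```typescript
type FindingCategory = "vulnerability" | "hardening" | "informational";

// Hypothetical internal finding shape; only title/remediation are named in the PR.
interface RawFinding {
  severity: string;
  title: string;
  remediation: string;
  checkType?: string;
}

// Title-pattern fallback: "No X", "Missing Y", or "... lacks Z".
const ABSENCE_TITLE = /^\s*(no|missing)\b|\blacks\b/i;

function deriveCategory(f: RawFinding): FindingCategory {
  // ASSUMPTION: info severity wins over the absence check.
  if (f.severity === "info") return "informational";
  if (f.checkType === "absence" || ABSENCE_TITLE.test(f.title)) return "hardening";
  return "vulnerability"; // code-pattern rules
}

// Adds the category plus the message/fix aliases to the reported finding.
function toJsonFinding(f: RawFinding) {
  return { ...f, category: deriveCategory(f), message: f.title, fix: f.remediation };
}
```

Keeping `title` and `remediation` alongside the aliases means existing consumers of the JSON output keep working while new consumers can read `message` and `fix`.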
New file: src/platform/schemas/upload.ts
- 8 upload payload schemas (scan, inventory, mcp, test, flows, endpoint,
host-hardening, openclaw-audit) as a discriminated union on 'type'
- Endpoint register + heartbeat schemas
- Shared sub-schemas: Finding, ScanScore, ProjectMeta, MachineMeta, CIMeta
- Validates: severity enums, score ranges (0-100), grade enums, required fields
All schemas exported from @guard0/g0 package index so the platform can:
import { UploadPayloadSchema } from '@guard0/g0'
const result = UploadPayloadSchema.safeParse(payload)
This is the shared schema contract between CLI and platform — single
source of truth, runtime validation, no more silent schema drift.
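To illustrate what a discriminated union on 'type' with runtime validation buys the platform, here is a dependency-free sketch. It deliberately uses plain TypeScript type guards rather than Zod so that it runs without packages; the real schemas live in src/platform/schemas/upload.ts and are consumed via `UploadPayloadSchema.safeParse`. All field names, the severity and grade lists, and `parseUpload` itself are hypothetical, and only two of the eight payload types are shown.

```typescript
const SEVERITIES = ["critical", "high", "medium", "low", "info"] as const; // assumed enum
const GRADES = ["A", "B", "C", "D", "F"] as const;                         // assumed enum

interface ScanUpload {
  type: "scan";
  score: { overall: number; grade: (typeof GRADES)[number] };
  findings: { severity: (typeof SEVERITIES)[number]; title: string }[];
}
interface InventoryUpload {
  type: "inventory";
  agentCount: number;
}
type UploadPayload = ScanUpload | InventoryUpload;

// Mirrors the success/failure split of a safeParse-style result.
type ParseResult =
  | { success: true; data: UploadPayload }
  | { success: false; error: string };

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null;
const oneOf = (allowed: readonly string[], v: unknown): boolean =>
  typeof v === "string" && allowed.includes(v);
const fail = (error: string): ParseResult => ({ success: false, error });

function parseUpload(input: unknown): ParseResult {
  if (!isRecord(input)) return fail("payload must be an object");
  switch (input.type) { // the discriminant selects which shape to validate
    case "scan": {
      const score = input.score;
      const overall = isRecord(score) ? score.overall : undefined;
      if (typeof overall !== "number" || !(overall >= 0 && overall <= 100)) {
        return fail("score.overall must be within 0-100");
      }
      if (!isRecord(score) || !oneOf(GRADES, score.grade)) {
        return fail("score.grade must be one of A, B, C, D, F");
      }
      const findings = input.findings;
      if (!Array.isArray(findings)) return fail("findings must be an array");
      for (const f of findings) {
        if (!isRecord(f) || typeof f.title !== "string" || !oneOf(SEVERITIES, f.severity)) {
          return fail("finding has an invalid severity or title");
        }
      }
      return { success: true, data: input as unknown as ScanUpload };
    }
    case "inventory":
      if (typeof input.agentCount !== "number") return fail("agentCount must be a number");
      return { success: true, data: input as unknown as InventoryUpload };
    default:
      return fail(`unknown upload type: ${String(input.type)}`);
  }
}
```

The point of the pattern is that an out-of-range score or an unknown payload type is rejected at the boundary with an explicit error instead of drifting silently between CLI and platform.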
FP fixes:
- agent_property YAML rules now skip when framework filter doesn't match (fixes AA-TS-184 MCP rule firing on LangChain agents)
- Non-agent project guard: drop hardening findings when no agents/tools detected (Flask app: 5 findings → 2, score 99 → 100)
Architecture fixes:
- AI provider throws explicit error when --ai set but no API key configured (was silent console.error, now fails loudly)
- Rule interface: added suppressedBy/requiresControl fields directly, removed unsafe `as Rule & Record<string, unknown>` type assertions
- ModelNode: added maxTokens/temperature/topP optional fields
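Two of the fixes above are small enough to sketch directly. The helper names `applyNonAgentGuard` and `resolveAiKey`, the `ProjectGraph` shape, and the error text are hypothetical; the sketch also assumes "no agents/tools detected" means both lists are empty.

```typescript
interface Finding {
  id: string;
  category: "vulnerability" | "hardening" | "informational";
}

interface ProjectGraph {
  agents: unknown[];
  tools: unknown[];
}

// Non-agent project guard: hardening advice is dropped when nothing agentic was found.
function applyNonAgentGuard(findings: Finding[], graph: ProjectGraph): Finding[] {
  const isAgentProject = graph.agents.length > 0 || graph.tools.length > 0;
  return isAgentProject ? findings : findings.filter((f) => f.category !== "hardening");
}

// Fail loudly when AI analysis is requested but cannot run.
function resolveAiKey(aiRequested: boolean, apiKey: string | undefined): string | undefined {
  if (!aiRequested) return undefined;
  if (!apiKey) throw new Error("--ai was set but no API key is configured");
  return apiKey;
}
```

Throwing instead of logging means a CI job that passes `--ai` without credentials stops immediately rather than reporting results that silently skipped the AI pass.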
Build:
- Split tsup config into CLI (single bundle + shebang) and SDK/daemon
(code-split for smaller imports)
- SDK consumers importing { runScan } no longer pull in CLI code
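A split configuration along these lines could be written as below. This is a sketch, not the PR's tsup.config.ts: the entry paths are hypothetical, and the local `BuildOptions` interface stands in for tsup's own option types so the fragment has no imports. tsup accepts an array of option objects as the default export and runs one build per element.

```typescript
// tsup.config.ts (sketch)
interface BuildOptions {
  entry: string[];
  format: ("esm" | "cjs")[];
  splitting: boolean;
  banner?: { js: string };
  dts?: boolean;
}

const configs: BuildOptions[] = [
  {
    // CLI: one self-contained bundle with a shebang so it can run as a binary.
    entry: ["src/cli.ts"], // hypothetical path
    format: ["esm"],
    splitting: false,
    banner: { js: "#!/usr/bin/env node" },
  },
  {
    // SDK/daemon: code-split so shared chunks are emitted once and
    // importing the SDK does not pull in the CLI bundle.
    entry: ["src/index.ts", "src/daemon.ts"], // hypothetical paths
    format: ["esm"],
    splitting: true,
    dts: true,
  },
];

export default configs;
```

Keeping the CLI entry out of the second build is what prevents SDK consumers from pulling in CLI code.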
Types:
- Added BaseFinding interface (severity, title, description, category)
- Finding now extends BaseFinding
- Exported BaseFinding and FindingCategory from package index
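The type relationship can be sketched as follows. The four `BaseFinding` fields are the ones listed above; the extra fields on `Finding` (`ruleId`, `remediation`) and the helper `countByCategory` are illustrative assumptions.

```typescript
type FindingCategory = "vulnerability" | "hardening" | "informational";

interface BaseFinding {
  severity: string;
  title: string;
  description: string;
  category: FindingCategory;
}

// ASSUMPTION: the full Finding adds rule-specific fields on top of the base.
interface Finding extends BaseFinding {
  ruleId: string;
  remediation: string;
}

// Code written against BaseFinding accepts any richer finding type.
function countByCategory(findings: BaseFinding[]): Record<FindingCategory, number> {
  const counts: Record<FindingCategory, number> = {
    vulnerability: 0,
    hardening: 0,
    informational: 0,
  };
  for (const f of findings) counts[f.category] += 1;
  return counts;
}
```

A shared base lets reporters and platform code depend on the four common fields without importing the scanner's full finding type.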
New: src/utils/logger.ts
- Human-readable colored output in TTY mode
- JSON lines on stderr in non-TTY/CI mode
- Log levels via G0_LOG_LEVEL env var (error/warn/info/debug)
- Replaced console.error in platform/upload.ts with logger.error
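A logger with this behavior could be structured as in the sketch below. It is not the contents of src/utils/logger.ts: the JSON field names, the default level of `info`, the ANSI colors, and the injected `write`/`isTTY` options are assumptions chosen to keep the example free of Node-specific globals.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";

// Ordered from least to most verbose.
const LEVEL_ORDER: LogLevel[] = ["error", "warn", "info", "debug"];
const COLORS: Record<LogLevel, string> = {
  error: "\x1b[31m", warn: "\x1b[33m", info: "\x1b[36m", debug: "\x1b[90m",
};
const RESET = "\x1b[0m";

// Parses a G0_LOG_LEVEL-style value; ASSUMPTION: falls back to "info".
function parseLevel(raw: string | undefined): LogLevel {
  const value = (raw ?? "").toLowerCase();
  return (LEVEL_ORDER as string[]).includes(value) ? (value as LogLevel) : "info";
}

// Colored text for terminals, one JSON object per line otherwise.
function formatLine(level: LogLevel, message: string, isTTY: boolean): string {
  return isTTY
    ? `${COLORS[level]}${level.toUpperCase()}${RESET} ${message}`
    : JSON.stringify({ level, message });
}

interface LoggerOptions {
  level: LogLevel;
  isTTY: boolean;
  write: (line: string) => void;
}

function createLogger(opts: LoggerOptions): Record<LogLevel, (message: string) => void> {
  const threshold = LEVEL_ORDER.indexOf(opts.level);
  const emit = (level: LogLevel) => (message: string) => {
    // Only emit messages at or above the configured severity.
    if (LEVEL_ORDER.indexOf(level) <= threshold) {
      opts.write(formatLine(level, message, opts.isTTY) + "\n");
    }
  };
  return { error: emit("error"), warn: emit("warn"), info: emit("info"), debug: emit("debug") };
}
```

In Node this would presumably be wired with `parseLevel(process.env.G0_LOG_LEVEL)`, `Boolean(process.stderr.isTTY)`, and a `write` that calls `process.stderr.write`, so that stdout stays reserved for report output.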
- Removed guard0 bin entry from main package (conflicts with enterprise pip)
- Removed guard0 bin entry from wrapper package
- Removed cliName() detection; the CLI name is always 'g0'
- guard0 npm package still exists for discoverability (npm install guard0) but only installs the g0 command
- AA-TS-021: narrow network access check to within 2000 chars of tool definition (was file-wide, caused FPs on non-tool network calls)
- Self-scan: exclude .claude/worktrees/, advisories/, CVE docs from analysis
- Added --strict flag: exit code 2 if any critical finding exists
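The narrowed AA-TS-021 check and the --strict exit code can be sketched as below. The function names are hypothetical, the window is assumed to cover the 2000 characters following the tool definition (the commit does not say whether it also looks backwards), and the non-strict exit code of 0 is an assumption.

```typescript
const TOOL_WINDOW_CHARS = 2000;

// Looks for a network call only near the tool definition, not file-wide.
// `networkPattern` should be a non-global regex supplied by the rule.
function hasNetworkCallNearTool(
  source: string,
  toolDefIndex: number,
  networkPattern: RegExp,
): boolean {
  const windowText = source.slice(toolDefIndex, toolDefIndex + TOOL_WINDOW_CHARS);
  return networkPattern.test(windowText);
}

// --strict: exit code 2 when any critical finding exists.
function exitCodeFor(findings: { severity: string }[], strict: boolean): number {
  const hasCritical = findings.some((f) => f.severity === "critical");
  return strict && hasCritical ? 2 : 0;
}
```

With the window in place, a network call far from the tool definition in the same file no longer triggers the rule.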
Summary
Major quality overhaul — fixes scoring inversion bug, reduces false positives, enriches JSON API, adds platform security hardening, and establishes shared schema contract.
Scoring fix: A secure agent scored worse (B:85) than a vulnerable agent with shell injection (B:87). Now: vulnerable=D:69, secure=C:75.
CLI Changes (9 commits)
Scoring Recalibration
False Positive Reduction
JSON API
Architecture
Results
Test plan