g0 produces a 0-100 security score with a letter grade (A-F) for each scan. Scores are calculated per-domain and then combined via weighted average.
Overall Score = ROUND( SUM(domain_score * domain_weight) / SUM(domain_weight) )
Domain Score = MAX(0, ROUND(100 - total_deduction))
total_deduction = SUM( severity_base * reachability_mult * exploitability_mult )
for each finding in the domain
| Grade | Score Range | Meaning |
|---|---|---|
| A | 90-100 | Excellent — minimal security issues |
| B | 80-89 | Good — low risk, minor improvements needed |
| C | 70-79 | Acceptable — moderate risk, action recommended |
| D | 60-69 | Poor — significant risk, remediation required |
| F | 0-59 | Failing — critical risks, immediate action needed |
Each finding deducts points from its domain score based on severity:
| Severity | Base Deduction |
|---|---|
| Critical | 20 points |
| High | 10 points |
| Medium | 5 points |
| Low | 2.5 points |
| Info | 0 points |
Findings are weighted by how accessible the vulnerable code is:
| Reachability | Multiplier | Description |
|---|---|---|
agent-reachable |
1.0x | In agent definition or directly called by agent |
tool-reachable |
1.0x | In tool implementation called by agent |
endpoint-reachable |
0.8x | In API route handler or HTTP endpoint |
utility-code |
0.3x | Helper/library code not on agent path |
unknown |
0.6x | Reachability not determined |
Utility code gets a 70% reduction because vulnerabilities in code not reachable from agent entry points pose significantly lower risk.
Findings are further weighted by likelihood of exploitation:
| Exploitability | Multiplier | Description |
|---|---|---|
confirmed |
1.2x | Taint flow confirmed from input to vulnerable sink |
likely |
1.0x | High confidence the vulnerability is exploitable |
unlikely |
0.4x | Constraints block typical exploitation paths |
not-assessed |
0.7x | Default when taint analysis not available |
Utility-code findings are automatically tagged unlikely (0.4x).
Domains are weighted by their relative importance to agent security:
| Domain | Weight | Relative % |
|---|---|---|
| Goal Integrity | 1.5 | 10.4% |
| Tool Safety | 1.5 | 10.4% |
| Rogue Agent | 1.4 | 9.7% |
| Code Execution | 1.3 | 9.0% |
| Data Leakage | 1.3 | 9.0% |
| Identity & Access | 1.2 | 8.3% |
| Cascading Failures | 1.2 | 8.3% |
| Reliability Bounds | 1.2 | 8.3% |
| Memory & Context | 1.1 | 7.6% |
| Inter-Agent | 1.1 | 7.6% |
| Supply Chain | 1.0 | 6.9% |
| Human Oversight | 1.0 | 6.9% |
Total weight sum: 14.8
Severity: critical (20 pts)
Reachability: agent-reachable (1.0x)
Exploitability: confirmed (1.2x)
Deduction: 20 * 1.0 * 1.2 = 24 points from domain score
Severity: high (10 pts)
Reachability: utility-code (0.3x)
Exploitability: unlikely (0.4x, auto-set)
Deduction: 10 * 0.3 * 0.4 = 1.2 points from domain score
88% reduction from full impact due to reachability + exploitability context.
Severity: medium (5 pts) -> low (2.5 pts) after control detection
Reachability: endpoint-reachable (0.8x)
Exploitability: likely (1.0x)
Deduction: 2.5 * 0.8 * 1.0 = 2.0 points from domain score
When g0 test --adaptive confirms a vulnerability, the finding is scored with a CVSS 3.1 base vector in addition to g0's standard scoring. CVSS provides an industry-standard severity rating (0.0-10.0) that maps to severity labels:
| CVSS Score | Severity |
|---|---|
| 9.0-10.0 | Critical |
| 7.0-8.9 | High |
| 4.0-6.9 | Medium |
| 0.1-3.9 | Low |
| 0.0 | None |
The CVSS vector is derived from:
- Attack Vector (AV): Network — all dynamic tests target network-accessible agents
- Attack Complexity (AC): Low if ≤3 turns succeeded, High if >10 turns required
- Privileges Required (PR): None (unauthenticated) or Low (if auth headers provided)
- User Interaction (UI): None — attacks are fully automated
- Scope (S): Changed if the attack crosses trust boundaries (e.g., data exfiltration)
- Impact (C/I/A): Derived from the attack category and observed behavior
CVSS scores appear in terminal output, JSON reports, and SARIF results.
The full scoring pipeline applies these steps in order:
- Raw findings — Rules produce initial finding set
- Deduplication — Exact line, cross-rule, function scope, prompt cap
- Suppression —
g0-ignoreinline comments - Compensating controls — ±5 line proximity check downgrades severity
- Control registry — Project-wide security control detection adjusts confidence/severity
- Test file handling — Downgrade or filter findings in test/fixture files
- Reachability tagging — Categorize code accessibility
- Exploitability assessment — AST-based taint analysis
- Score calculation — Apply formula with multipliers
- Grade assignment — Map score to letter grade