Problem
Issue #177 (merged as PR #178) improves CI analysis by instructing the agent to download build logs and artifacts. However, for Prow-based projects (hypershift, kubernetes-nmstate, openshift/release), CI artifacts stored in GCS can be massive:
| Prow artifact type | Typical size |
| --- | --- |
| Build logs | 1-10 MB |
| JUnit XML | 100 KB - 1 MB |
| Must-gather (management cluster) | 100-500 MB |
| Must-gather (hosted cluster) | 100-300 MB |
| Full Prow job artifacts | 200 MB - 1 GB+ |
Oompa runs on a VM where `/tmp` is a 16 GB tmpfs (RAM-backed). Oompa worktrees already consume 3.6 GB. If the agent downloads Prow artifacts for multiple failing checks across multiple projects, it could consume 4+ GB in a single poll cycle. Because tmpfs is backed by RAM, filling it can exhaust memory and crash the system.
Risk scenario
3 Prow projects x 3 failing checks x 500 MB must-gather = 4.5 GB per cycle. Without cleanup, artifacts accumulate.
Proposed Safeguards
1. Only download logs, not full artifacts by default
The prompt should instruct the agent to:
- Use `gh run view --log-failed` for GitHub Actions (tiny, ~8 KB)
- Use `gh api .../jobs/ID/logs` for specific job logs (< 2 MB)
- NOT download full Prow artifacts (must-gather, dump archives, screenshots)
- Only download JUnit XML and build-log.txt for Prow jobs (< 10 MB total)
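The Prow rules above could also be enforced in code rather than left to the prompt alone. The following is a minimal sketch, assuming a hypothetical helper `shouldDownload` that receives the object path of a candidate artifact; the pattern lists are illustrative and not taken from the Oompa codebase.

```go
package main

import (
	"path"
	"strings"
)

// allowedPatterns are the only file names fetched by default (assumed list,
// derived from the "build-log.txt and JUnit XML" rule above).
var allowedPatterns = []string{"build-log.txt", "junit*.xml"}

// deniedSubstrings block anything that lives under a large archive or dump,
// even if the file name itself looks harmless.
var deniedSubstrings = []string{"must-gather", "dump"}

// shouldDownload reports whether a Prow artifact path is small enough by
// convention to download. It is a hypothetical helper, not an existing API.
func shouldDownload(objectPath string) bool {
	lower := strings.ToLower(objectPath)
	for _, denied := range deniedSubstrings {
		if strings.Contains(lower, denied) {
			return false
		}
	}
	base := path.Base(objectPath)
	for _, pattern := range allowedPatterns {
		if ok, _ := path.Match(pattern, base); ok {
			return true
		}
	}
	return false
}
```

An allowlist fails closed: a new artifact type that nobody anticipated is skipped until someone adds it explicitly.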
2. Download budget per investigation
Cap total downloads per check investigation at 100 MB. If downloading an artifact would push the total over the budget, skip it and log a warning.
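One way to track such a budget is a small counter that is consulted before each download. This is a sketch under assumptions: the type `downloadBudget` and its methods are hypothetical, the 100 MB cap is interpreted as 100 MiB, and the caller is assumed to know each artifact's size in advance (for example from an object listing).

```go
package main

// defaultBudgetBytes is the per-investigation cap (100 MB read as 100 MiB).
const defaultBudgetBytes int64 = 100 * 1024 * 1024

// downloadBudget tracks how many bytes one check investigation has used.
// Hypothetical type for illustration only.
type downloadBudget struct {
	limit int64
	used  int64
}

func newDownloadBudget(limit int64) *downloadBudget {
	return &downloadBudget{limit: limit}
}

// reserve records size against the budget and returns true if it fits.
// When it returns false nothing is recorded, and the caller should skip
// the artifact and log a warning.
func (b *downloadBudget) reserve(size int64) bool {
	if size < 0 || b.used+size > b.limit {
		return false
	}
	b.used += size
	return true
}

// remaining returns the number of bytes still available.
func (b *downloadBudget) remaining() int64 {
	return b.limit - b.used
}
```

A fresh budget would be created for each check investigation, so one oversized job cannot starve the next.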
3. Cleanup after each investigation
Delete all downloaded artifacts immediately after the agent completes its analysis. Don't accumulate across cycles.
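Cleanup is easiest to guarantee when the scratch directory's lifetime is tied to the investigation itself. A minimal sketch, assuming a hypothetical wrapper `withScratchDir` that the investigation code would run inside:

```go
package main

import "os"

// withScratchDir creates a fresh directory under parent (the system temp
// directory if parent is empty), runs fn with its path, and removes the
// directory and everything in it afterwards, even if fn returns an error.
// Hypothetical helper, not part of the existing code.
func withScratchDir(parent string, fn func(dir string) error) error {
	dir, err := os.MkdirTemp(parent, "ci-artifacts-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return fn(dir)
}
```

Because removal happens in a `defer`, artifacts are deleted on both the success and the error path, so nothing accumulates across poll cycles.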
4. Disk space check before downloading
Before any artifact download, check available disk space. If below 2 GB available, skip downloads entirely and fall back to analyzing the `CheckRun.Output` text only (current behavior).
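The check can be done without shelling out to `df`. The sketch below assumes a Linux host and uses `syscall.Statfs`; the function names and the reading of 2 GB as 2 GiB are assumptions made for illustration.

```go
package main

import "syscall"

// minFreeBytes is the threshold below which downloads are skipped
// (2 GB read as 2 GiB).
const minFreeBytes uint64 = 2 << 30

// freeBytes returns the space available to unprivileged users on the
// filesystem that contains path. Unix-only: it relies on statfs(2).
func freeBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	return st.Bavail * uint64(st.Bsize), nil
}

// canDownload reports whether free space is at or above the threshold.
// When it returns false, the caller falls back to CheckRun.Output only.
func canDownload(free, threshold uint64) bool {
	return free >= threshold
}
```

A caller would pass the download directory (for example `/tmp`) to `freeBytes` and treat an error the same as insufficient space.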
5. Configurable artifact download toggle
Add a config option to control artifact downloading:
```yaml
prs:
  - watch: [6252]
    download-ci-artifacts: false # default: false for safety
```
Only enable for projects where you trust the artifact sizes.
6. GitHub Actions vs Prow distinction
| CI system | Download approach | Risk |
| --- | --- | --- |
| GitHub Actions | `gh run view --log-failed` | Very low (KB) |
| GitHub Actions artifacts | `gh run download` | Low (usually none; a few MB in some repos) |
| Prow build-log.txt | `gcloud storage cp .../build-log.txt` | Low (1-10 MB) |
| Prow JUnit XML | `gcloud storage cp .../junit*.xml` | Low (< 1 MB) |
| Prow must-gather | `gcloud storage cp .../must-gather.tar` | HIGH (100-500 MB) -- never download |
| Prow full artifacts | `gcloud storage cp -r .../artifacts/` | VERY HIGH (1 GB+) -- never download |
Current State
The merged PR #178 updated the CI analysis prompt to mention `gh run download` and artifact inspection. The prompt doesn't currently have safeguards against downloading massive artifacts. The agent uses its own judgment about what to download, which may lead to downloading must-gather archives on Prow-based projects.
Implementation
Prompt changes (`pkg/agent/prompt.go`)
Add explicit size constraints and prohibitions to the CI analysis prompt:
```
IMPORTANT: Disk space is limited. When downloading CI artifacts:
- NEVER download must-gather archives, dump tarballs, or full artifact directories
- Only download: build-log.txt, junit*.xml, and specific small log files
- If a single file is > 50 MB, skip it
- Clean up all downloaded files when your analysis is complete
- Prefer gh run view --log-failed over gh run download (much smaller)
```
Runtime safeguards (`pkg/agent/ci.go`)
- Check available disk space (`df`) before any CI investigation that may download artifacts
- Set a download timeout
- Clean up `/tmp/ci-artifacts` after each investigation
- Log a warning if available disk space drops below the 2 GB threshold
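The download timeout in the list above could be implemented with a context deadline on the download command. This is a sketch only: `runWithTimeout`, `errDownloadTimeout`, and the two-minute default are hypothetical, since the issue does not specify a timeout value.

```go
package main

import (
	"context"
	"errors"
	"os/exec"
	"time"
)

// downloadTimeout is an assumed default; the issue does not name a value.
const downloadTimeout = 2 * time.Minute

// errDownloadTimeout is returned when the command is killed at the deadline.
var errDownloadTimeout = errors.New("ci artifact download timed out")

// runWithTimeout runs an external command (for example a gh or gcloud
// invocation) and kills it if it has not finished within timeout.
func runWithTimeout(timeout time.Duration, name string, args ...string) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	err := exec.CommandContext(ctx, name, args...).Run()
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return errDownloadTimeout
	}
	return err
}
```

A timeout bounds how long a stalled transfer can hold disk space, but it does not bound size, so it complements the budget and disk space checks rather than replacing them.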
Config changes
- Add `download-ci-artifacts` boolean to PRsRoleConfig (default: false)
- When false, the prompt tells the agent not to download artifacts at all
- When true, the prompt allows downloads with the size constraints above
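The toggle and the prompt text can be tied together as sketched below. The struct here shows only the new field; the real `PRsRoleConfig` has other fields not shown, and `ciArtifactPrompt` plus the wording of the "no downloads" instruction are assumptions.

```go
package main

// PRsRoleConfig is reduced to the single field proposed in this issue.
// The zero value (false) matches the safe default.
type PRsRoleConfig struct {
	DownloadCIArtifacts bool `yaml:"download-ci-artifacts"`
}

// artifactConstraints mirrors the prompt text proposed above.
const artifactConstraints = `IMPORTANT: Disk space is limited. When downloading CI artifacts:
- NEVER download must-gather archives, dump tarballs, or full artifact directories
- Only download: build-log.txt, junit*.xml, and specific small log files
- If a single file is > 50 MB, skip it
- Clean up all downloaded files when your analysis is complete
- Prefer gh run view --log-failed over gh run download (much smaller)`

// noDownloadInstruction is assumed wording for the disabled case.
const noDownloadInstruction = "Do not download CI artifacts. Analyze the check run output text only."

// ciArtifactPrompt returns the prompt fragment matching the config toggle.
func ciArtifactPrompt(cfg PRsRoleConfig) string {
	if cfg.DownloadCIArtifacts {
		return artifactConstraints
	}
	return noDownloadInstruction
}
```

Using a plain `bool` means a config file that omits the key behaves the same as one that sets it to `false`.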