Advanced compaction extension and skill for Pi with automatic threshold-based compaction that follows Pi's active model metadata.
- `/ultracompact` command for manual compaction
- Auto-adapts threshold to Pi's active model context window
- Works with any Pi model that exposes context window metadata
- Graduated Eviction (4 levels) — strips reasoning, bulk outputs, artifacts, then messages
- Generational Compaction — micro (fast, no LLM) at 60-90%, full at 90%+
- Preemptive Trigger — fires before next turn, never pays latency during user turns
- Cache-Aware Compaction — immutable summary blocks keep prompt cache warm
- Circuit Breaker — 3 strikes → lossy truncation fallback, session never dies
- Hierarchical summarization with entropy-based information extraction
- Critical context preservation - goals, decisions, errors, file paths
- Extension + Skill - works as both a Pi extension and a skill
- Smart model switching - follows Pi model metadata and preserves custom settings
- Conversation structure detection - identifies turns, phases, and progress
- Multi-pass summarization — progressive compression with quality scoring
- LLM-based summarization — optional AI-powered compression (useLLM config)
- Content-aware token counting — dynamic ratios for code, prose, and whitespace
- Compact section templates — shorter headers, condensed formatting, saves 10-15% more tokens
`pi install npm:pi-ultra-compact`
After installing the package and restarting Pi, use:
/ultracompact
This triggers manual ultra-compact compaction.
Auto-compaction triggers on its own, using a threshold derived from the context window of Pi's active model.
The extension uses Pi's active model metadata as the source of truth for context window size. This avoids maintaining a separate model table in the extension and keeps thresholds aligned with Pi when new providers or models are added.
If Pi does not expose model metadata, the extension uses a conservative 128K-token fallback.
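The threshold rule described above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the `ModelMetadata` interface and the `resolveThresholdTokens` name are hypothetical, and the 80% ratio is taken from the `thresholdTokens` default listed in the configuration table.

```typescript
// Hypothetical shape of the model metadata; only `contextWindow` is named in this README.
interface ModelMetadata {
  contextWindow?: number;
}

const FALLBACK_CONTEXT_WINDOW = 128000; // conservative fallback when Pi exposes no metadata
const THRESHOLD_RATIO = 0.8;            // assumed from the documented default (80% of the window)

// Returns the token count at which auto-compaction would trigger.
function resolveThresholdTokens(model?: ModelMetadata): number {
  const reported = model?.contextWindow;
  const contextWindow =
    typeof reported === "number" && Number.isFinite(reported) && reported > 0
      ? reported
      : FALLBACK_CONTEXT_WINDOW;
  return Math.round(contextWindow * THRESHOLD_RATIO);
}

console.log(resolveThresholdTokens({ contextWindow: 200000 })); // 160000
console.log(resolveThresholdTokens(undefined));                 // 102400
```

Deriving the value on every call, rather than caching a per-model table, mirrors the stated design goal of letting Pi remain the source of truth.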
- Preemptive check (every turn): Projects next turn's token usage. If projected > 60% of context, triggers micro-compaction.
- Micro-compaction (60-90% usage): Strips reasoning blocks + bulk tool outputs. No LLM call. Runs in microseconds.
- Full compaction (90%+ usage): Graduated eviction preconditions the input, then structured summarization produces the final compacted context.
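The tier selection described in the list above might look roughly like this. The function name, the `CompactionTier` type, and the exact boundary handling are assumptions for illustration. The 60% and 90% boundaries come from the list, while the configuration table gives a default `preemptiveWatermark` of 0.70, so the real lower boundary may differ.

```typescript
type CompactionTier = "none" | "micro" | "full";

// Assumed boundaries taken from the list above: micro at 60-90%, full at 90%+.
const MICRO_WATERMARK = 0.6;
const FULL_WATERMARK = 0.9;

// projectedTokens: estimated token usage for the next turn (hypothetical input).
function selectTier(projectedTokens: number, contextWindow: number): CompactionTier {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  const usage = projectedTokens / contextWindow;
  if (usage >= FULL_WATERMARK) return "full";  // summarization path
  if (usage > MICRO_WATERMARK) return "micro"; // strip-only path, no LLM call
  return "none";
}

console.log(selectTier(70000, 100000)); // "micro"
```

Because the check runs on a projection of the next turn, compaction can be scheduled before the user's next message rather than during it.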
| Level | What it strips | When |
|---|---|---|
| 1 | Assistant thinking/reasoning blocks | Always (harmless removal) |
| 2 | Bulk tool outputs (>100 lines, >5K chars) | Most sessions |
| 3 | All non-error tool results | Heavy sessions |
| 4 | Oldest non-protected messages | Only when necessary |
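As a rough sketch of how the four levels in the table could compose, consider the following. Everything here is hypothetical: the `Message` shape, the placeholder strings, and the `evict` function are illustrative names, the bulk test treats the two limits as alternatives (more than 100 lines or more than 5,000 characters), and level 4 is simplified to drop every non-protected message outside a recent window instead of removing the oldest ones incrementally.

```typescript
// Hypothetical message shape for illustration only.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  reasoning?: string; // assistant thinking block, if any
  isError?: boolean;  // tool result that reports an error
}

const BULK_LINES = 100;
const BULK_CHARS = 5000;

function isBulk(text: string): boolean {
  return text.split("\n").length > BULK_LINES || text.length > BULK_CHARS;
}

// Applies levels 1..maxLevel in order and returns a new array; the input is not mutated.
function evict(messages: Message[], maxLevel: 1 | 2 | 3 | 4, keepLast: number = 2): Message[] {
  // Level 1: drop assistant reasoning blocks.
  let out: Message[] = messages.map((m) => {
    const copy: Message = { ...m };
    delete copy.reasoning;
    return copy;
  });
  // Level 2: replace bulk tool outputs with a short placeholder.
  if (maxLevel >= 2) {
    out = out.map((m) =>
      m.role === "tool" && isBulk(m.content) ? { ...m, content: "[bulk output removed]" } : m
    );
  }
  // Level 3: replace all non-error tool results.
  if (maxLevel >= 3) {
    out = out.map((m) =>
      m.role === "tool" && !m.isError ? { ...m, content: "[tool result removed]" } : m
    );
  }
  // Level 4: drop older messages unless protected (system, user, or within the last keepLast).
  if (maxLevel >= 4) {
    const cutoff = out.length - keepLast;
    out = out.filter((m, i) => m.role === "system" || m.role === "user" || i >= cutoff);
  }
  return out;
}
```

Calling `evict(messages, 2)` would correspond to the strip-only behavior described for micro-compaction, while higher levels trade more context for more headroom.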
- Snapshot-rollback: Messages are deep-copied before compaction. If anything fails, the original is preserved.
- Circuit breaker: After 3 consecutive failures, falls back to lossy truncation (keep system + last 10 turns).
- User messages inviolable: Never stripped regardless of token pressure.
- Cache-aware mode: Previous summaries stay immutable — only new content pays prefill cost.
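The snapshot-rollback and circuit-breaker guarantees above can be illustrated with a small wrapper. This is a sketch under stated assumptions, not the extension's implementation: `Turn`, `Compactor`, `SafeCompactor`, and `lossyTruncate` are hypothetical names, and one message is treated as one turn when keeping the last 10.

```typescript
interface Turn {
  role: string;
  content: string;
}

type Compactor = (messages: Turn[]) => Turn[];

const MAX_FAILURES = 3;   // matches the documented circuitBreakerMaxFailures default
const FALLBACK_KEEP = 10; // "last 10 turns"; one message counts as one turn here (simplification)

// Lossy fallback: keep system messages plus the most recent FALLBACK_KEEP others.
function lossyTruncate(messages: Turn[]): Turn[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(-FALLBACK_KEEP)];
}

class SafeCompactor {
  private failures = 0;
  private readonly compact: Compactor;

  constructor(compact: Compactor) {
    this.compact = compact;
  }

  run(messages: Turn[]): Turn[] {
    // Snapshot: copy before compaction so a failing compactor cannot corrupt the original.
    const snapshot: Turn[] = messages.map((m) => ({ ...m }));
    if (this.failures >= MAX_FAILURES) return lossyTruncate(snapshot);
    try {
      const result = this.compact(messages.map((m) => ({ ...m })));
      this.failures = 0; // only consecutive failures count
      return result;
    } catch {
      this.failures += 1;
      // Roll back to the snapshot, or truncate once the breaker trips.
      return this.failures >= MAX_FAILURES ? lossyTruncate(snapshot) : snapshot;
    }
  }
}
```

With a compactor that always throws, the first two calls to `run` return the untouched snapshot and the third returns the truncated history, so the session continues either way.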
Default settings work out of the box. The extension reads Pi's active model metadata and sets thresholds from ctx.model.contextWindow when available.
| Setting | Default | Description |
|---|---|---|
| `thresholdTokens` | Auto (80% of Pi context window, or 102,400 without metadata) | When to trigger compaction |
| `keepPercentage` | 30% | Percentage of context to keep |
| `maxKeepTokens` | 30,000 | Maximum tokens to keep |
| `autoCompact` | true | Enable automatic compaction |
| `cacheAware` | false | Immutable summary blocks (saves API costs) |
| `maxEvictionLevel` | FULL_REMOVAL | Max eviction aggressiveness |
| `outputHeadroom` | 4,096 | Tokens reserved for LLM response |
| `circuitBreakerMaxFailures` | 3 | Failures before lossy truncation |
| `preemptiveWatermark` | 0.70 | Preemptive trigger level |
| `hardWatermark` | 0.95 | Reactive fallback level |
| Command | Description |
|---|---|
| `/ultracompact` | Trigger manual ultra-compact compaction |
The extension does not ship its own model context-window table. Pi remains responsible for provider and model metadata; this extension uses the active model's contextWindow value for threshold calculation.
- Works with any Pi-compatible model
- Compatible with gentle-engram (Engram memory backup)
- Compatible with gentle-pi (SDD/OpenSpec)
- No conflicts with Pi's default compaction
See CHANGELOG.md for full version history.
- Fixed 27 test failures from jest/vitest API mismatch
- Replaced jest.fn() with vi.fn() across all test files
- Automated senior-dev-agent pipeline deployed (review, fix, publish, label)
- Graduated Eviction — 4-level content stripping (reasoning → bulk → artifacts → full)
- Generational Compaction — micro (60-90%, no LLM) + full (90%+) tiers
- Preemptive Trigger — fires at 70% watermark by projecting next turn
- Cache-Aware Mode — immutable summary blocks preserve prompt cache
- Snapshot-Rollback + Circuit Breaker — session never dies from bad compaction
- Vitest suite passing — zero regressions
- Compact section templates - shorter headers save 10-15% tokens across all conversations
- LLM-based summarization - optional LLM-powered semantic compression
- Content-aware token estimation - dynamic ratios for code/prose/whitespace
- 66 tests, 100% pass rate - including 13 new effectiveness benchmarks
- generateSummary is now async - supports LLM callback integration
Major improvements to compaction quality and performance:
- Smart model switching - follows Pi model metadata and preserves custom settings
- Conversation structure detection - identifies turns, phases, progress
- Enhanced critical extraction - progress indicators, questions, user preferences
- Multi-pass summarization - 3-pass compression with quality scoring
- Token estimation cache - LRU cache for 3x faster performance
- 100% test pass rate - 43 unit tests + 17 performance benchmarks
This release fixes 18 issues found via comprehensive 5-agent audit:
- 3 Critical regex bugs fixed - `\b` word boundaries on all patterns, no more false matches
- Startup model detection fixed - correct threshold from boot
- Custom thresholds preserved - across model switches
- Null safety - guards on all message-consuming methods
- 53-test Jest suite - comprehensive coverage
- Dead code removed - 329-line `.disabled` file deleted, unused `typebox` dep removed
- Restart Pi after installation
- Check that `pi install npm:pi-ultra-compact` completed successfully
- The extension reads Pi's active model metadata at session start and model switch time
- Ensure Pi reports a `contextWindow` for your selected model
- If Pi does not expose model metadata, the extension falls back to 128K tokens
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Pi - The AI coding agent