The Agent Context Compression Protocol (ACCP) is a proposed open, lightweight communication protocol designed to drastically reduce token consumption and API costs in multi-agent AI systems. Existing protocols such as MCP (agent-to-tool), A2A (agent-to-agent), and ACP (agent messaging) solve interoperability and routing, but none is optimized for token efficiency, even though token consumption is the single largest cost driver in production agentic AI.
ACCP introduces a semantic encoding layer that replaces verbose natural language and bloated JSON payloads with compact, structured, machine-interpretable messages. It is designed to be harness-agnostic and to complement MCP and A2A rather than replace them.
Target outcome: 60–90% reduction in token consumption for inter-agent communication within agentic harnesses, while preserving full semantic fidelity.
Modern agentic harnesses (CrewAI, LangGraph, AutoGen, Semantic Kernel, custom orchestrators) suffer from compounding token costs:
- Context re-transmission: Every reasoning step re-sends the full accumulated context (system prompt + tool definitions + conversation history + RAG docs + agent state).
- Verbose inter-agent messaging: Agents communicate in natural language, consuming 5–50x more tokens than the semantic payload requires.
- Redundant tool definitions: Tool schemas are re-injected into every call, even when the agent has already demonstrated competence with that tool.
- State serialization bloat: Agent state is captured as pretty-printed JSON or raw text, wasting tokens on whitespace, redundant keys, and structural overhead.
A single agent performing 10 reasoning steps can easily consume 50K–100K tokens per task. A multi-agent system with 4–6 agents on a complex workflow can burn 500K–1M+ tokens per execution.
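The compounding effect of context re-transmission can be checked with simple arithmetic. The sketch below assumes a fixed base context and a constant amount of new history per step; the function name and all figures are hypothetical and chosen only to illustrate the growth pattern.

```python
def cumulative_input_tokens(base_context: int, added_per_step: int, steps: int) -> int:
    """Total input tokens when every step re-sends the full accumulated context.

    Step k (0-based) sends the base context plus everything added by the
    k earlier steps, so the total grows quadratically with the step count.
    """
    return sum(base_context + added_per_step * k for k in range(steps))


# Hypothetical figures: 3,000 tokens of system prompt and tool definitions,
# 800 tokens of new history per step, 10 reasoning steps.
total = cumulative_input_tokens(base_context=3_000, added_per_step=800, steps=10)
print(total)  # 66000
```

With these assumed inputs, a single 10-step agent lands inside the 50K–100K range quoted above, and more than half of the total (36,000 tokens) is history that was already sent at least once.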
| Protocol | Purpose | Token-Aware? | Compression? |
|---|---|---|---|
| MCP | Agent ↔ Tool connectivity | No | No (JSON-RPC) |
| A2A | Agent ↔ Agent coordination | No | No (JSON over HTTPS) |
| ACP | Agent messaging (RESTful) | No | No (MIME multipart) |
| ANP | Agent network discovery | No | No |
| AG-UI | Agent ↔ User interface | Partial | No |
None of these protocols treat token efficiency as a first-class design concern.
If agents could communicate through a shared, compact semantic encoding — where a 500-token natural language message compresses to 30–50 tokens of structured intent — the economics of agentic AI shift dramatically:
- 10x cost reduction on inter-agent messaging
- Deeper reasoning chains within the same context budget
- More agents per workflow without hitting context ceilings
- Faster inference due to smaller payloads
ACCP is a semantic encoding and context compression protocol for AI agent communication. It defines:
- A compact message format — structured, schema-driven, and optimized for minimal token footprint when injected into LLM context.
- A shared ontology of intents and operations — standardized codes that agents understand without verbose natural language descriptions.
- A context management strategy — rules for state accumulation, compression checkpoints, and memory hierarchy.
- A codec layer — encode/decode functions that sit between the harness and the LLM API, transparently compressing and decompressing communication.
┌─────────────────────────────────────────────┐
│               Agentic Harness               │
│   (CrewAI / LangGraph / AutoGen / Custom)   │
├─────────────────────────────────────────────┤
│              ACCP Codec Layer               │ ◄── THIS IS WHAT WE BUILD
│  ┌─────────┐ ┌──────────┐ ┌──────────────┐  │
│  │ Encoder │ │ State Mgr│ │ Intent Index │  │
│  └─────────┘ └──────────┘ └──────────────┘  │
├─────────────────────────────────────────────┤
│        Transport (MCP / A2A / HTTP)         │
├─────────────────────────────────────────────┤
│                LLM API Layer                │
│         (OpenAI / Anthropic / etc.)         │
└─────────────────────────────────────────────┘
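One way to picture the codec layer in the stack above is as a wrapper around the LLM call. The following is a minimal sketch, not the ACCP codec itself: `Codec`, `PhraseTableCodec`, and `with_codec` are hypothetical names, and the phrase-table substitution is a toy stand-in for real semantic encoding.

```python
from typing import Callable, Protocol


class Codec(Protocol):
    """Hypothetical codec interface; the names are illustrative, not from a spec."""

    def encode(self, message: str) -> str: ...
    def decode(self, message: str) -> str: ...


class PhraseTableCodec:
    """Toy codec: swaps known verbose phrases for short codes and back."""

    def __init__(self, table: dict[str, str]) -> None:
        self._table = table
        self._reverse = {code: phrase for phrase, code in table.items()}

    def encode(self, message: str) -> str:
        for phrase, code in self._table.items():
            message = message.replace(phrase, code)
        return message

    def decode(self, message: str) -> str:
        # Naive: a code that also occurs in ordinary text would be expanded too.
        for code, phrase in self._reverse.items():
            message = message.replace(code, phrase)
        return message


def with_codec(codec: Codec, call_llm: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an LLM call so the harness sends and receives uncompressed text."""

    def wrapped(message: str) -> str:
        return codec.decode(call_llm(codec.encode(message)))

    return wrapped


codec = PhraseTableCodec({"task completed": "DONE", "request action": "REQ"})
fake_llm = lambda prompt: prompt  # stand-in for a real API call; echoes its input
send = with_codec(codec, fake_llm)

print(codec.encode("research agent: task completed"))  # research agent: DONE
print(send("research agent: task completed"))          # research agent: task completed
```

The harness only ever calls `send`, which is what makes the layer transparent: swapping in a different codec does not change harness code.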
- Token-first design — Every format decision is benchmarked against token count (tiktoken/BPE).
- Semantic fidelity — Compression must be lossless in meaning, even if lossy in verbosity.
- LLM-readable — The compressed format must be natively parseable by modern LLMs without special fine-tuning. Human readability is secondary but desirable.
- Harness-agnostic — Works with any orchestration framework via a thin adapter.
- Protocol-complementary — Sits alongside MCP/A2A, not in competition.
- Progressive adoption — Teams can adopt incrementally: start with message compression, add state management later.
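The token-first principle implies a small benchmarking harness that compares candidate formats by token count. The sketch below is one possible shape for it. The tokenizer is passed in as a function so that a real BPE tokenizer can be substituted; `approx_token_count` is a crude, hypothetical stand-in and does not reproduce real BPE counts.

```python
import re
from typing import Callable


def approx_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts word runs and single symbols.

    This is NOT a BPE tokenizer; real counts will differ.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def token_reduction(before: str, after: str,
                    count: Callable[[str], int] = approx_token_count) -> float:
    """Fraction of tokens saved by replacing `before` with `after`."""
    baseline = count(before)
    if baseline == 0:
        raise ValueError("baseline message has no tokens")
    return 1.0 - count(after) / baseline


before = "The research agent has completed its analysis."
after = "@RSA>DONE:analyze"
print(f"{token_reduction(before, after):.0%}")  # 25%
```

Even with this rough counter, the example shows why every format decision needs measuring: the encoded string is far shorter in characters, but each punctuation mark counts as its own unit, so the saving is smaller than the character count suggests.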
Replace verbose agent messages with structured semantic encodings:
Before (natural language — ~85 tokens):
The research agent has completed its analysis of the quarterly
sales data. Key findings include: revenue is down 12%
quarter-over-quarter, the enterprise segment is the primary
driver of the decline, and customer churn increased by 3.2%.
The research agent recommends escalating to the strategy agent
for remediation planning.
After (ACCP-M, estimated at ~22 tokens; the actual count depends on the tokenizer):
@RSA>done:analyze{d:q_sales|f:[rev:-12%QoQ,seg:enterprise=primary_decline,
churn:+3.2%]|nx:@STA:plan_remediation}
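To show that the encoded message is mechanically recoverable, here is a minimal parser sketch. The grammar is inferred from the single example above and is an assumption: `@SENDER>intent:operation{field|field|...}`, where each field is `key:value` and a bracketed value is a comma-separated list of `key:value` pairs. The function name `parse_accp_m` and the output layout are hypothetical.

```python
import re

# Assumed header shape: @SENDER>intent:operation{...body...}
HEADER = re.compile(r"^@(?P<sender>\w+)>(?P<intent>\w+):(?P<op>\w+)\{(?P<body>.*)\}$")


def parse_accp_m(raw: str) -> dict:
    """Parse one ACCP-M message under the assumed grammar described above."""
    # The example wraps across two lines, so drop line breaks before matching.
    text = "".join(line.strip() for line in raw.splitlines())
    match = HEADER.match(text)
    if match is None:
        raise ValueError(f"not an ACCP-M message: {text!r}")

    fields = {}
    # Assumes '|' never appears inside a value.
    for part in match["body"].split("|"):
        if not part:
            continue
        key, _, value = part.partition(":")
        if value.startswith("[") and value.endswith("]"):
            # Bracketed values are comma-separated key:value pairs.
            value = dict(item.split(":", 1) for item in value[1:-1].split(","))
        fields[key] = value

    return {
        "sender": match["sender"],
        "intent": match["intent"].upper(),  # assumption: intent codes are case-insensitive
        "op": match["op"],
        "fields": fields,
    }


example = """@RSA>done:analyze{d:q_sales|f:[rev:-12%QoQ,seg:enterprise=primary_decline,
churn:+3.2%]|nx:@STA:plan_remediation}"""
msg = parse_accp_m(example)
print(msg["intent"], msg["fields"]["nx"])  # DONE @STA:plan_remediation
```

A production grammar would also need escaping rules for `|`, `,`, and `:` inside values, which this sketch deliberately leaves out.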
A shared vocabulary of agent intents, operations, and status codes:
INTENTS:
REQ — Request action
DONE — Task completed
FAIL — Task failed
WAIT — Awaiting input
ESC — Escalate to another agent
COMP — Context compression checkpoint
SYNC — State synchronization
QRY — Query for information
ACK — Acknowledgment
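The intent list above maps naturally onto an enumeration, which lets a codec reject codes that are not part of the shared vocabulary. This is a sketch under assumptions: the `Intent` class and `lookup_intent` helper are hypothetical names, and treating codes as case-insensitive is a choice made here, not something the list specifies.

```python
from enum import Enum


class Intent(Enum):
    """Intent codes from the list above; member values are the descriptions."""

    REQ = "Request action"
    DONE = "Task completed"
    FAIL = "Task failed"
    WAIT = "Awaiting input"
    ESC = "Escalate to another agent"
    COMP = "Context compression checkpoint"
    SYNC = "State synchronization"
    QRY = "Query for information"
    ACK = "Acknowledgment"


def lookup_intent(code: str) -> Intent:
    """Resolve a wire code to an Intent, rejecting codes outside the ontology."""
    try:
        return Intent[code.strip().upper()]
    except KeyError:
        raise ValueError(f"unknown ACCP intent code: {code!r}") from None


print(lookup_intent("done").value)  # Task completed
```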
Rules for when and how to compress accumulated context:
- Checkpoint triggers: After N tool calls, at phase boundaries, before handoffs.
- Compression tiers: Full → Summary → Key-facts → Hashes.
- Shared state references: Agents reference a shared state store by key rather than re-transmitting full context.
- Delta encoding: Only transmit what changed since last checkpoint.
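The checkpoint and delta rules above can be sketched in a few functions. Everything here is illustrative: the function names, the `{"set": ..., "unset": ...}` delta layout, and the default threshold of five tool calls are assumptions, since the rules do not fix a value for N or a wire format.

```python
from typing import Any


def encode_delta(previous: dict[str, Any], current: dict[str, Any]) -> dict[str, Any]:
    """Return only what changed since the last checkpoint."""
    changed = {k: v for k, v in current.items()
               if k not in previous or previous[k] != v}
    removed = sorted(k for k in previous if k not in current)
    return {"set": changed, "unset": removed}


def apply_delta(previous: dict[str, Any], delta: dict[str, Any]) -> dict[str, Any]:
    """Rebuild the current state from the last checkpoint plus a delta."""
    state = {k: v for k, v in previous.items() if k not in delta["unset"]}
    state.update(delta["set"])
    return state


def should_checkpoint(tool_calls_since_last: int, phase_changed: bool,
                      handoff_pending: bool, max_tool_calls: int = 5) -> bool:
    """Mirror the three triggers: N tool calls, a phase boundary, or a handoff."""
    return tool_calls_since_last >= max_tool_calls or phase_changed or handoff_pending


prev = {"phase": "research", "rev_qoq": "-12%", "draft": "v1"}
curr = {"phase": "planning", "rev_qoq": "-12%", "churn": "+3.2%"}
delta = encode_delta(prev, curr)
print(delta)  # {'set': {'phase': 'planning', 'churn': '+3.2%'}, 'unset': ['draft']}
print(apply_delta(prev, delta) == curr)  # True
```

The unchanged `rev_qoq` entry never appears in the delta, which is the point of the technique: the receiver already holds it from the previous checkpoint.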
Pre-defined schemas for common agent operations, indexed by compact codes rather than re-described in each prompt.
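A schema index of this kind could be as simple as a registry that keeps full schemas outside the prompt and hands the model only a short reference. The class below is a hypothetical sketch; the code `T1`, the `web_search` tool, and the stub format are invented for illustration.

```python
class SchemaRegistry:
    """Maps compact codes to full operation schemas held outside the prompt."""

    def __init__(self) -> None:
        self._schemas: dict[str, dict] = {}

    def register(self, code: str, schema: dict) -> None:
        if code in self._schemas:
            raise ValueError(f"schema code already registered: {code}")
        self._schemas[code] = schema

    def resolve(self, code: str) -> dict:
        """Expand a code back to its full schema, e.g. when validating a call."""
        return self._schemas[code]

    def prompt_stub(self, code: str) -> str:
        """Short reference injected into the prompt in place of the full schema."""
        return f"{code}:{self._schemas[code]['name']}"


registry = SchemaRegistry()
registry.register("T1", {
    "name": "web_search",
    "description": "Search the web and return the top results.",
    "parameters": {"query": "string", "max_results": "integer"},
})
print(registry.prompt_stub("T1"))  # T1:web_search
```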
- AI/ML Engineers building multi-agent systems in production
- Platform teams managing agentic infrastructure at scale
- Indie developers running agents on personal budgets
- Enterprise architects designing cost-efficient AI workflows
- Multi-agent research pipelines — 5+ agents collaborating on analysis tasks
- Coding agent harnesses — Agents with deep tool integration and long sessions
- Customer service agent networks — High-volume, cost-sensitive deployments
- Personal assistant agent chains — Budget-constrained local/API hybrid setups
| Metric | Target |
|---|---|
| Token reduction on inter-agent messages | ≥ 70% |
| Token reduction on full pipeline execution | ≥ 50% |
| Semantic fidelity (meaning preservation) | ≥ 95% |
| Adoption friction (integration time) | < 1 hour for supported harnesses |
| Latency overhead of encode/decode | < 5 ms per message |
| LLM comprehension accuracy of ACCP format | ≥ 90% without fine-tuning |
| Project | Relationship to ACCP |
|---|---|
| JSPLIT (Janea Systems) | Solves MCP tool pruning, not message compression |
| TOON (Token-Oriented Object Notation) | Overlapping — data serialization format for LLMs; ACCP can incorporate TOON-like ideas for data payloads |
| Claw Compactor | Workspace-level compression, rule-based; ACCP is protocol-level with semantic awareness |
| LLMLingua | Prompt compression via smaller LM; could be used as one ACCP compression backend |
| MCP | Complementary — ACCP compresses what MCP transmits |
| A2A | Complementary — ACCP compresses what A2A coordinates |
Among the projects listed above, ACCP is the only proposed protocol that treats token efficiency as the primary design constraint for agent-to-agent communication. It does not replace transport or routing protocols; it makes them cheaper to use.
- LLM comprehension ceiling: Can LLMs reliably parse heavily compressed formats without fine-tuning? (Requires empirical validation.)
- Format standardization: Who governs the intent ontology? Open governance model needed.
- Versioning: How do agents negotiate ACCP versions?
- Security: Compact formats may be harder to audit/inspect.
- Adoption: Requires harness-level integration — chicken-and-egg problem.
- Model-specificity: Different LLMs may tokenize the same format differently, affecting compression ratios.
- Release v1.0 of the formal protocol specification (Completed).
- Finalize codec implementation in C# (.NET 8+) and proceed to production hardening.
- Expand harness adapters to include LangChain and AutoGen alongside Semantic Kernel.
- Develop comprehensive threat model and best practices guides for protocol implementers.
- Publish benchmarks and open-source the protocol repository.
Document Owner: Project Lead | Status: APPROVED | Created: April 2026