Official LangBot AgentRunner plugin for the local, LangBot-hosted agent path.
This runner is a consumer of LangBot AgentRunner Protocol v1. LangBot provides the host infrastructure, authorization, facts, and pull APIs; the runner owns the model-facing agent behavior such as prompt assembly, history selection, tool loop, RAG orchestration, and optional context compaction.
This plugin tracks the LangBot 4.11.x AgentRunner integration work. Test it with:
langbot-app/LangBotbranchdev/4.11.xlangbot-app/langbot-plugin-sdkbranchdev/4.11.xlangbot-app/langbot-agent-runnerbranchmainlangbot-app/langbot-agent-control-planebranchmain
This repository does not define the LangBot host protocol. It consumes the
Protocol v1 run context produced by LangBot. The canonical protocol source is
LangBot/docs/agent-runner-pluginization/PROTOCOL_V1.md; this README only
documents how Local Agent consumes that contract:
ctx.event: event-first metadata for the current trigger.ctx.conversation,ctx.actor,ctx.subject: current run scope metadata.ctx.input: current structured input, including text, multimodal contents, and lightweight attachment/file references.ctx.context: context handles, inline policy, and available pull APIs. Local Agent uses the Host history API for conversation history instead of adapter bootstrap.ctx.resources: run-scoped authorized models, tools, knowledge bases, skills, and storage capabilities.ctx.state: small Host-projected state for the current run.ctx.runtime: runtime metadata such as deadline, trace id, query id from migration adapter paths, and Host metadata.ctx.delivery: host delivery surface and streaming/edit capabilities.ctx.config: runner binding config.ctx.adapter: migration adapter fields; not part of Protocol v1 core and not a place for prompt, history, RAG results, tool schemas, or authorized resources.
LangBot does not inline full conversation history by default. When the runner
needs more context, it should use authorized Host APIs through
AgentRunAPIProxy, for example model, prompt, history, tool,
knowledge-base, state, and storage APIs.
AgentRunner components should obtain that proxy with self.get_run_api(ctx).
They should not use the legacy self.plugin proxy that regular non-runner
plugin components use.
The SDK proxy import path is
from langbot_plugin.api.proxies.agent_run import AgentRunAPIProxy.
plugin:langbot/local-agent/default
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| model | model-fallback-selector | yes | primary: '', fallbacks: [] | LLM model with fallbacks |
| timeout | integer | no | 300 | Total runner execution timeout in seconds. Set to 0 or null to disable the host deadline. |
| prompt | prompt-editor | yes | system: "You are a helpful assistant." | Default system prompt edited in LangBot UI |
| remove-think | boolean | no | false | Ask Host model APIs to remove provider thinking output when supported |
| knowledge-bases | knowledge-base-multi-selector | no | [] | Knowledge bases for RAG |
| retrieval-top-k | integer | no | 5 | Retrieval results requested per knowledge base |
| rerank-model | rerank-model-selector | no | '' | Rerank model for improved retrieval |
| rerank-top-k | integer | no | 5 | Top-K results after reranking |
| max-tool-iterations | integer | no | 100 | Maximum tool-call follow-up iterations |
| tool-execution-mode | select | no | parallel | Same-batch tool execution: parallel or serial |
| max-tool-result-chars | integer | no | 20000 | Maximum serialized tool result characters injected into the next model request |
| context-history-fetch-limit | integer | no | 50 | Transcript messages pulled from the Host history API |
| context-window-tokens | integer | no | 200000 | Fallback context window, and an upper cap when Host model metadata is available |
| context-reserve-tokens | integer | no | 16384 | Tokens reserved for the model response and provider overhead, clamped to at most 25% of the effective window |
| context-keep-recent-tokens | integer | no | 20000 | Approximate recent history tokens to retain when compaction triggers |
| context-summary-tokens | integer | no | 8000 | Maximum deterministic summary tokens inserted for compacted older history |
prompt is the static binding default. When LangBot exposes
ctx.context.available_apis.prompt_get, Local Agent pulls the
post-preprocessing effective prompt through AgentRunAPIProxy.get_prompt() and
uses it instead of the static default so PromptPreProcessing changes are
preserved. If the prompt API is unavailable, Local Agent falls back to
ctx.config.prompt.
remove-think is the first supported thinking-output control for Local Agent.
When enabled, the runner passes remove_think=True to Host model APIs for both
streaming and non-streaming calls. It is not Pi-style thinking-level control; it
only requests provider thinking output removal when the active Host model
adapter supports that flag.
Skill support is Host-mediated. When Local Agent advertises
skill_authoring, LangBot lists the current pipeline-visible skill facts in
ctx.resources.skills and exposes the Host-owned activate and
register_skill tools according to the same visibility policy. Calling
activate returns the full SKILL.md instructions as a tool result and
registers the skill package for Box mount resolution under
/workspace/.skills/<skill-name>. Local Agent consumes skill facts and tools
through Host APIs; it decides how tool schemas, tool results, prompt context,
or MCP surfaces are presented to the model.
Legacy singular knowledge-base values must be normalized by LangBot
configuration migration before runner execution. Local Agent only reads the
manifest-defined knowledge-bases binding config.
max-tool-result-chars is a runner-level safety fallback for model-facing tool
messages. String results, serialized JSON results, and error results are bounded
before they are appended as role="tool" messages for the next model request.
Oversized non-error tool results are reported as bounded previews; Local Agent
does not persist runner-owned large-result assets or expose an internal read
tool for them.
Large files and generated assets should be returned by sandbox or Host tools as
sandbox paths, URLs, or other explicit external references, not inline file
content.
tool-execution-mode controls tool calls emitted in the same model turn.
parallel runs the batch concurrently and still writes tool-result messages
back in source order. serial executes them one by one.
Tools can return a top-level terminate: true runtime hint when the tool action
already completes the user-visible work and the automatic follow-up model call
should be skipped. Local Agent stops early only when every finalized tool result
in that batch sets terminate: true; mixed batches continue normally. The hint
is stripped from the model-facing role="tool" message so it does not become
business data for the next provider request.
When a sandbox or Host tool already returns explicit path fields, Local Agent
treats those as authoritative sandbox references. If the surrounding result is
large, the model and Host events receive those references plus a bounded preview
only.
The local agent should be treated as a runner-owned or hybrid-context runner:
- LangBot inlines the current event/input and context handles.
- The runner pulls transcript history through the authorized Host history API.
- The runner decides whether to page history, summarize, compact, or construct a model request from scratch.
- Large files, images, audio, and tool outputs should be consumed as sandbox paths, URLs, or other explicit references instead of large inline payloads.
Local Agent currently uses a runner-owned context pipeline:
- Assemble effective prompt, host transcript history, RAG context, and current structured input.
- Use the Host-provided model context window from
ctx.runtime.metadatawhen available, capped by the runner binding'scontext-window-tokens. If Host metadata is unavailable,context-window-tokensis the fallback window and defaults to 200k tokens. - Estimate message tokens with a conservative local heuristic until LangBot exposes tokenizer/model usage metadata to runner plugins.
- When the assembled context exceeds the effective input budget
(
window - reserve, with reserve clamped for small windows), use the authorized Host model API to generate a structured checkpoint summary, wrap it in asystemmessage containing<conversation_summary>...</conversation_summary>, and keep a recent history tail bounded bycontext-keep-recent-tokens. If Host exposes the state API, Local Agent persists that summary as a conversation-scoped compaction checkpoint atrunner.compaction.checkpoint, anchored bycovers_until, and later runs reuse it before pulling transcript entries after that cursor. If model summarization fails or returns empty content, Local Agent falls back to a deterministic bounded summary. - Re-run the context transform before every model turn, including tool-call follow-up turns, so tool results and assistant tool calls are budgeted before the next provider request.
- If a provider fails before producing any streamed content with a
context-overflow style error, compact the current loop context with a more
aggressive retry budget and retry the model turn once before surfacing
run.failed.
This is not max-round behavior. History is not selected by number of rounds;
the runner budgets prompt, current input, summary, and recent history together,
following the Pi-style context threshold and per-turn transform shape. When the
Host does not expose the state API or a checkpoint cannot be parsed, Local Agent
falls back to the previous tail-history behavior. Future iterations can replace
local estimates with tokenizer/model usage metadata from the LiteLLM model-info
work.
Pipeline adapter data is intentionally narrow. Local Agent does not consume
ctx.adapter.extra.prompt; prompt handoff goes through the run-scoped Host
prompt API when available. New runner logic should prefer event-first context
and Host APIs over adapter fields.
Model, prompt, history, state, storage, tool, knowledge-base, rerank, and
steering access go through AgentRunAPIProxy. LangBot validates these calls
with the current run_id, run-scoped resource policy / available APIs, and
caller plugin identity.
Local Agent must not expose the runner process filesystem as an agent
capability. In sandboxed deployments, file access is mediated by Host/sandbox
tools registered in ctx.resources.tools; the model can request those tools,
and the runner invokes them through AgentRunAPIProxy.call_tool(). "Local" here
means the agent loop runs locally as a LangBot plugin, not that the model can
read or write arbitrary files on the runner machine.
Skill activation uses the same tool path. If Host exposes activate in the
run's allowed tools, the model calls activate like any other function tool and
Local Agent forwards it through AgentRunAPIProxy.call_tool(); no separate
runner action is required for skill activation.
Typical local-agent usage:
- Invoke authorized LangBot-hosted models.
- Call authorized tools.
- Retrieve authorized knowledge bases and rerank results.
- Page transcript history for the model request.
- Pull authorized steering inputs at turn boundaries.
The runner must not bypass ctx.resources or call host-private managers to
access unauthorized models, tools, knowledge bases, storage, or platform APIs.
streaming: yestool_calling: yesknowledge_retrieval: yesmultimodal_input: yesskill_authoring: yesinterrupt: yessteering: yes
interrupt is cooperative. When Host exposes the run ledger API, Local Agent
polls the current run through AgentRunAPIProxy.run_get() at run boundaries and
streaming event boundaries. If Host has recorded cancel_requested_at, the
runner stops and emits run.failed with code="cancelled".
skill_authoring means Local Agent can receive LangBot's Host-owned
ctx.resources.skills facts plus activate/register_skill tools when skills
are available. Skills remain owned by LangBot/Box; the runner owns how
model-facing prompts, tool schemas, tool results, or MCP adapters are assembled
from those Host capabilities.
Local Agent is reentrant and does not keep mutable per-conversation state in
the plugin instance. It can pull Host history each run. When Host state APIs
are available, it persists compacted summary checkpoints through
AgentRunAPIProxy so later runs can resume from
runner.compaction.checkpoint. It does not persist external session IDs or
runner-owned memory outside Host-managed state/storage.
This plugin does not implement LangBot EventGateway, event subscription, event notification, scheduler, or event fanout. Those systems belong to LangBot host or separate event-focused branches. This runner only consumes the run context that LangBot delivers through AgentRunner Protocol v1.
This plugin is the target external implementation of LangBot's local agent runner. LangBot's internal runner code can be used as reference material, but its host-private structures must not become plugin API.
We welcome contributions. Useful areas include:
- Protocol v1 adapter fixes
- history/state/storage API consumption
- tool loop and RAG behavior
- multimodal input handling
- focused tests and documentation improvements