Skip to content

#74, #75, #76#87

Merged
LinseCed merged 19 commits into
devfrom
75-create-slim-pipeline-orchestrator
Jun 13, 2026
Merged

#74, #75, #76#87
LinseCed merged 19 commits into
devfrom
75-create-slim-pipeline-orchestrator

Conversation

@LinseCed

@LinseCed LinseCed commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

This went a little bit out of scope : )

What was Done:
Replace the flat "retrieve → prompt → stream" chat path with an agentic pipeline, adds native LLM tool-calling, and ships a CLI test client.

How it fits together

ChatOrchestrator (SSE wrapper)
└─ OrchestratorAgent ← routes the question
└─ AgentTool("synthesis") ← exposes a sub-agent as a callable tool
└─ SynthesisAgent ← answers from the knowledge base
├─ RetrieveTool ├─ GrepTool └─ FetchFileTool
Every Agent runs the same two phases: gather (call tools until it has enough) then answer (synthesize a reply). Tools
are the leaves that touch data; an AgentTool lets one agent call another as just another tool.

Agents (src/agents/)

  • Agent (base) — shared engine. gather_stream loops up to max_steps, asking the LLM with the tool catalogue and
    running returned tool calls until it stops; answer_stream synthesizes the final reply. Emits an Invocation per
    tool/sub-agent used (drives tool_use events). User query fenced with a per-request random marker (injection guard).
  • OrchestratorAgent — top-level router. Delegates to the most relevant sub-agent; answers greetings/meta directly.
    Streams a single delegation's answer straight through, or synthesizes once from multiple delegations' summaries.
  • SynthesisAgent — knowledge-base specialist. Picks among retrieve/grep/fetch_file to gather context, then answers
    strictly from gathered sources.
  • ChatOrchestrator — not an Agent; wraps OrchestratorAgent and serializes its output as the SSE stream (tool_use →
    token → citation → done).

Tools (src/agents/tools/)

  • Tool (base) / ToolRegistry — Pydantic-validated args, execute never raises, registry dispatches by name and exposes
    JSON-schema specs to the LLM.
  • RetrieveTool — semantic + keyword search over the vector store; for conceptual/open-ended questions.
  • GrepTool — case-insensitive substring search; for exact identifiers or phrases.
  • FetchFileTool — returns all chunks of a named file; explicit extension matches exactly, bare name matches by stem.
  • AgentTool — adapts a sub-agent to the tool interface so agents can be composed; returns a deferred Delegation
    (sub-agent gathers now, synthesizes only if needed).

LLM clients (src/llm/)

  • LLMClient protocol gains chat(messages, tools) plus ToolCall/ToolSpec/ChatResult; implemented natively for OpenAI
    and Ollama (replaces the JSON workaround). Ollama tool-call IDs are uuid4-unique.

API & CLI

  • chat route runs through a new get_orchestrator dependency; SSE adds tool_use events (new ToolUseEvent schema).
    ChatRequest no longer takes top_k/min_score (the pipeline owns retrieval depth). New scripts/chat_cli.py terminal
    client (ask / ingest / history).

Tests

  • New tests/agents/ suite (agent loop, orchestrator, tools) and a ScriptedLLMClient stub that drives the tool loop
    deterministically; added OpenAI/Ollama tool-calling coverage.

Type of PR — pick one:

  • Functional — adds or changes user-visible behavior
  • Non-functional — improves a measurable property (perf, security, a11y …)
  • Mixed — both
  • Internal — refactor / docs / tests only

1 Process baseline — every box must be ticked, on every PR

  • All acceptance criteria in the linked issue are met
  • 1 review approval from a non-author
  • CI green: lint, type-check, unit tests, build, secret-scan
  • No secrets / tokens / credentials in the diff
  • No new TODO / FIXME without a follow-up issue
  • Docs updated if behavior, API, config, or architecture changed
  • No regression of other NFR baselines (a11y / perf / security)

2 Outcome proof — fill the bullets that match your PR type

Functional / Mixed PR:

  • ≥ 1 black-box test that exercises the acceptance criteria
    e.g. upload a markdown file, then assert the chat answer cites it

Non-functional / Mixed PR:

  • Measurement that proves the acceptance criteria, attached to the PR
    e.g. benchmark output for chat p95 < 2 s, axe audit for a11y ≥ 90, scan report for 0 critical CVEs

3 Cross-cutting impact — tick the areas this PR touches

For each area below, ask: "does my PR touch this?"

-> If yes → tick the area and complete its sub-checks (they become mandatory).
-> If no → skip it.
-> If nothing applies → tick "None of the above".

  • UI touched

    • Responsive on desktop and mobile
    • Contrast + visible focus state, no color-only signals
      e.g. new button has a focus ring and is readable on a 360 px screen
  • Backend endpoint added or changed

    • Behind authentication — not accidentally public
    • Input validation (size / type / allowed values)
    • OpenAPI spec updated
      e.g. /upload rejects files > 10 MB with a clear error
  • Stores user data, ingested content, or LLM output

    • Logged with the standard fields (request id, source, latency)
    • No PII / credentials in log output
  • New env var or config

    • Added to .env.example and mentioned in README.md
    • No real value committed
      e.g. LLM_API_KEY=... — only the key name lives in .env.example
  • Deployment / build changed

    • Deployable via the standard pipeline (no undocumented manual steps)
    • Local dev setup still works
  • None of the above — purely internal change


@LinseCed LinseCed requested review from Afif-del and DaniloTatti June 12, 2026 16:57
@LinseCed LinseCed merged commit 9418e8c into dev Jun 13, 2026
4 checks passed
@LinseCed LinseCed deleted the 75-create-slim-pipeline-orchestrator branch June 14, 2026 22:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants