Releases: Mingye-Lu/AgenticCrawler

v0.10.0

11 Jun 13:55

Added

  • HTML Diff Mode (optimization.html_diff_mode) — on repeated visits to the same URL, only changed content sections are returned, with [unchanged: N sections] markers standing in for the rest, reducing token usage by 50–70% in multi-turn sessions. Also active in MCP direct-tool mode (the server now maintains a persistent CrawlState across calls).
  • Action Loop Detection (optimization.loop_detection) — rolling-window action hash detects repeated identical actions with escalating nudges (soft at 5, medium at 8, strong at 12 repeats); page stagnation detection after 5 consecutive identical page fingerprints.
  • Page Fingerprinting (optimization.page_fingerprinting) — lightweight FNV-1a fingerprint (url + element_count + first-1000-char text hash) stored in CrawlState; used by loop detection and action caching for cache invalidation.
  • Planning Interval (optimization.planning_interval) — every N steps injects planning-checkpoint or execution-mode guidance into the dynamic prompt; disabled by default (interval=0).
  • Failure Classification (optimization.failure_classification) — 16-category keyword-based error taxonomy (zero LLM cost); classify() maps error messages to SelectorNotFound, CaptchaDetected, RateLimited, etc.; retry_strategy() returns RetryWithHealing, RetryWithDelay, NoRetry, or ResetAndRetry per category.
  • Self-Healing Selectors (optimization.self_healing) — on SelectorNotFound/SelectorAmbiguous, fetches a fresh page_map and text-matches to the correct element ref; logs [healed: @eOLD → @eNEW]; zero LLM calls; max retries configurable (default 2).
  • Action Caching (optimization.action_caching) — in-memory SHA-256 keyed cache for read-only tools (page_map, read_content, list_resources); invalidated on page fingerprint change; TTL-based expiry (default 30s). execute_js is intentionally excluded as it may have side effects.
  • Confidence Tracking (optimization.confidence_tracking) — parses [confidence: HIGH/MEDIUM/LOW] from assistant responses; two or more consecutive LOWs trigger a stagnation alert via DynamicPromptContext; advisory only, never blocks.
  • Compound Component Enrichment (optimization.compound_enrichment) — extends interactive element JSON with an enrichment field for complex form controls: date format hints, range min/max/step/value, number bounds, select option lists (max 20 + overflow count), file accept types, textarea maxlength. Max 200 bytes/element.
  • Content-Aware Cleaning Profiles (optimization.content_aware_profiles) — CleaningProfile enum (Default/Minimal/Aggressive/ReadingMode) auto-selected by task keyword and content size; select_profile() picks ReadingMode for extraction tasks, Minimal for interaction tasks, Aggressive for content > 50KB.
  • Budget Enforcement (optimization.budget_max_session_cost_usd, optimization.budget_enforcement) — BudgetEnforcer with Warn/Block modes; Warn injects budget warning into the dynamic prompt at configurable threshold (default 80%); Block terminates the agent loop cleanly when the cost limit is reached.
  • Per-Agent Cost Attribution (optimization.per_agent_cost_tracking) — build_cost_breakdown() walks flat child sessions and reconstructs per-child cost via UsageTracker; /cost command shows per-agent breakdown when flag is ON.
  • Dynamic System Prompt Infrastructure — DynamicPromptContext struct with four optional fields (stagnation_alert, planning_guidance, budget_warning, loop_nudge); injected as section 9 of the system prompt via a shared Arc<Mutex<>> slot; all optimizations write to this slot, and the runtime picks it up on the next iteration.
  • Optimization Settings Schema — nested OptimizationSettings struct in Settings with 18 fields, all Option<T> and defaulting to OFF for backward compatibility; 18 settings_get_* getter functions.
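
The page fingerprint and the loop-detection thresholds listed above can be sketched as follows. This is a minimal illustration and not the project's code: the function names, the `Nudge` enum, and the way the three fingerprint inputs are combined are assumptions. Only the FNV-1a constants (the standard 64-bit offset basis and prime) and the 5/8/12 repeat thresholds come from established sources or from the notes above.

```rust
/// Standard 64-bit FNV-1a over a byte slice.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical fingerprint: URL + element count + hash of the first
/// 1000 characters of page text. The "|" separator is an assumption.
fn page_fingerprint(url: &str, element_count: usize, text: &str) -> u64 {
    let prefix: String = text.chars().take(1000).collect();
    let text_hash = fnv1a64(prefix.as_bytes());
    let combined = format!("{}|{}|{:016x}", url, element_count, text_hash);
    fnv1a64(combined.as_bytes())
}

/// Escalation levels for repeated identical actions (names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Nudge {
    Off,
    Soft,
    Medium,
    Strong,
}

/// Maps a repeat count to a nudge level: soft at 5, medium at 8, strong at 12.
fn nudge_for(repeats: u32) -> Nudge {
    match repeats {
        0..=4 => Nudge::Off,
        5..=7 => Nudge::Soft,
        8..=11 => Nudge::Medium,
        _ => Nudge::Strong,
    }
}
```

Because only the first 1000 characters of text are hashed, two pages that differ solely further down share a fingerprint. That is the trade-off that keeps the fingerprint cheap.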

v0.9.1

10 Jun 06:26

Changed

  • navigate defaults to fit_markdown — the format parameter now defaults to fit_markdown instead of markdown, saving 30–40% of tokens on typical pages. Pass format: "markdown" explicitly to restore full output.
  • wait returns page_state — both the selector-based and fixed-duration wait branches now return a page_state diff (URL, title, added/removed/modified elements) after the condition resolves, consistent with all other action tools (click, fill_form, press_key, scroll, etc.). Eliminates the extra page_map call previously needed to observe what changed.

Fixed

  • Script tools missing from ToolRegistry — run_script, wait_for_scripts, script_status, cancel_script, save_script, list_scripts, and read_script were parsed and validated correctly but not dispatched by the agent loop. All 7 script tools are now registered.

Improved

  • MCP tool descriptions — all 28 tool descriptions and parameter schemas enriched with concrete examples, edge-case guidance, and clearer return-value documentation for better LLM tool selection and Glama TDQS scoring.

v0.9.0

09 Jun 08:38

Added

  • Autonomous Script Protocol — a new deterministic execution layer that lets the LLM run multi-step browser automation without per-step LLM round-trips. Write scripts once, execute them in tight loops — dramatically faster and cheaper for repetitive page patterns (pagination, form filling, bulk extraction).

  • New crates/script/ crate — standalone grammar, parser, and persistence layer:

    • AST types: ScriptDefinition, ScriptNode (10 node kinds), Expression (5 expression kinds)
    • parse_script + validate_script with comprehensive error reporting (unknown tools, undefined variables, excessive nesting, oversized scripts)
    • Disk persistence: save_script_to_disk, load_script_from_disk, list_scripts_on_disk
  • 7 new script tools (available in the agent loop, MCP server, and via run_goal):

    • run_script — execute an inline script or load a saved one by name; returns script_id immediately (non-blocking). Accepts save_as to persist the script after execution and limits to override defaults.
    • wait_for_scripts — block until script(s) complete and collect full ScriptResult (extracted_data, yielded_data, steps_executed, elapsed_secs, error)
    • script_status — non-blocking poll returning live state (step, items_collected, current_url, elapsed_secs, errors_caught)
    • cancel_script — abort a running script via cooperative cancellation token
    • save_script — persist a script definition to ~/.acrawl/scripts/<name>.json
    • list_scripts — list all saved scripts with ISO 8601 UTC timestamps (modified_at) and file sizes
    • read_script — read back a full script definition from disk
  • Script execution engine (crates/agent/src/script_executor/):

    • Supported nodes: tool_call, assign, collect, yield, for_loop, for_each, while_loop, if_else, try_catch (with catch/finally/error_var), parallel
    • Limits enforced at runtime: max_steps, max_timeout_secs, per_step_timeout_secs, max_output_bytes, max_parallel_branches, max_nesting_depth
    • parallel branches each get their own browser page; share a global step counter and cancellation token; errors_caught and output_bytes are propagated back to the parent executor on completion
    • collect accumulates to extracted_data; yield writes to a shared Arc<RwLock<Vec<Value>>> readable via script_status mid-execution
    • Variable substitution: $varname strings in tool inputs are replaced with their current values
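
The variable substitution step can be pictured with the sketch below. It is a simplified assumption rather than the executor's real code. Values are plain strings here (the real engine presumably works on JSON values), only a string that is entirely `$name` is substituted, and the `substitute` and `is_identifier` names are hypothetical.

```rust
use std::collections::HashMap;

/// True if `s` looks like an identifier: a letter or underscore first,
/// then letters, digits, or underscores.
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Replaces an input of the form `$name` with the variable's current value.
/// Anything else passes through unchanged; an unknown variable is an error.
fn substitute(input: &str, vars: &HashMap<String, String>) -> Result<String, String> {
    match input.strip_prefix('$') {
        Some(name) if is_identifier(name) => vars
            .get(name)
            .cloned()
            .ok_or_else(|| format!("undefined variable: {}", name)),
        _ => Ok(input.to_string()),
    }
}
```

A literal such as "$5.00" is left alone because "5.00" is not an identifier. This keeps ordinary text containing a dollar sign from being misread as a variable.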

Fixed

  • Expression serde deserialization — changed from internally-tagged (#[serde(tag="kind")]) to adjacently-tagged (#[serde(tag="kind", content="value")]). The old tag caused Literal, Variable, and JsEval newtype variants to fail deserialization from JSON — meaning no user-submitted script with variables or literals would parse.
  • MCP server run_script panic — spawn_script calls tokio::task::spawn internally but was invoked outside any block_on context, causing an immediate "no reactor running" panic that killed the server process. Wrapped in rt.block_on(async { … }).
  • cleanup_completed result race — spawn_script called cleanup_completed() before checking concurrency limits, silently evicting just-completed scripts from the map. wait_for_scripts would then return NotFound for fast-completing scripts. Removed the premature cleanup.
  • max_output_bytes not enforced — the limit was stored and validated but never checked during execution. push_extracted and push_yielded now track accumulated byte count and return ScriptExecutionError on overflow.
  • validate_script_name duplicated with inconsistent rules — three separate implementations in save_script.rs, read_script.rs, and persistence.rs. Consolidated into persistence::validate_script_name with the strictest ruleset (rejects leading dash, dots, path separators, non-normal path components).
  • list_scripts timestamp format — modified_at previously returned a raw Unix epoch integer (e.g. "1780991949"). Now returns ISO 8601 UTC (e.g. "2026-06-09T13:39:09Z") via time::OffsetDateTime + Rfc3339.
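
A consolidated name check along the lines of the validate_script_name fix might look like the sketch below. The rule set is read off the bullet above (leading dash, dots, path separators, non-normal path components). The error messages, the rejection of empty names, and the choice to reject any dot anywhere in the name are assumptions, and this is not the actual persistence::validate_script_name.

```rust
use std::path::{Component, Path};

/// Hypothetical strict validator for script names used as file stems.
fn validate_script_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("script name is empty".to_string()); // assumed rule
    }
    if name.starts_with('-') {
        return Err("script name must not start with '-'".to_string());
    }
    if name.contains('.') {
        return Err("script name must not contain '.'".to_string());
    }
    if name.contains('/') || name.contains('\\') {
        return Err("script name must not contain path separators".to_string());
    }
    // Defensive final check: the name must be exactly one normal component
    // (this rejects, for example, drive prefixes on Windows).
    let mut components = Path::new(name).components();
    match (components.next(), components.next()) {
        (Some(Component::Normal(_)), None) => Ok(()),
        _ => Err("script name must be a single normal path component".to_string()),
    }
}
```

Keeping one validator with the strictest rules means save, read, and list cannot disagree about which names are legal.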

v0.8.7

08 Jun 12:52

Added

  • page_map_depth parameter for navigate — new page_map_depth option (full, slim, none, default: slim) controls how much structural data is returned inline with navigation responses. slim strips CSS selectors from links/headings/landmarks and caps link text at 60 chars, reducing token usage while preserving @eN refs for interaction. none omits the page_map entirely. Full page_map is still cached internally for differential feedback.
  • MCP server unit tests — 18 tests covering parse_run_goal_request, validate_tool_names, normalize_tool_name, filtered_tool_specs, execute_run_goal, and framed/line-delimited protocol detection.
  • Render crate unit tests — tests for MarkdownStreamState push/flush, incremental streaming, partial content boundaries, and long-line handling.

Changed

  • mvp_tool_specs() refactored — the 376-line monolithic tool specification function is now split into navigation_tools(), interaction_tools(), extraction_tools(), and agent_control_tools() helpers. Public API unchanged.

Fixed

  • CloakBrowser-dependent tests skip gracefully — tests requiring PlaywrightBridge now detect PlaywrightNotInstalled and return early instead of panicking, eliminating 3 false failures on machines without Node.js/CloakBrowser.

v0.8.6

08 Jun 06:19

Added

  • fit_markdown format for navigate — new format="fit_markdown" option that prunes boilerplate DOM nodes (ads, navs, sidebars, footers) before markdown conversion, dramatically reducing token consumption on noisy pages. Scores elements by text density, descendant link density, semantic tag weight, and class/id signals. Falls back to plain text when pruning removes all content. Tool instructions now recommend fit_markdown as the preferred default format.

v0.8.5

07 Jun 14:42

Added

  • Stable @eN element references — page_map now assigns short, stable handles (@e1, @e2, …) to each interactive element. Interaction tools (click, hover, fill_form, press_key, select_option) accept @eN in their selector fields, resolving them to the underlying CSS selector. This eliminates the need to copy long, fragile CSS paths — the LLM can just say click @e3.
  • RefMap data structure (crates/browser/src/ref_map.rs) — maps integer IDs to CSS selectors with stable reuse (same selector always gets the same ref) and lifecycle management (clear on navigation).
  • Ref resolution module (crates/agent/src/tools/ref_resolve.rs) — centralized @eN → CSS selector resolution shared across all interaction tools. Plain CSS selectors pass through unchanged for full backward compatibility.
  • Navigate embedded refs — the page_map returned inline with navigate responses now includes @eN annotations, so the first page view the LLM sees already has stable handles (no extra page_map call needed).
  • Scoped page_map refs — page_map with a scope parameter (e.g. modals/dialogs) now also annotates interactive elements with refs.
  • NopBridge test utility (crates/browser/src/testing.rs) — no-op BrowserBackend implementation for unit testing BrowserContext without launching a real browser.
  • Glama MCP registry verification — added glama.json for Glama marketplace discovery.
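
The RefMap and ref-resolution behaviour described above (stable reuse, pass-through of plain CSS, clearing on navigation) can be sketched as follows. Field and method names are hypothetical, and whether the real RefMap restarts numbering after a clear is an assumption made here for simplicity.

```rust
use std::collections::HashMap;

/// Hypothetical ref map: integer IDs <-> CSS selectors.
#[derive(Default)]
struct RefMap {
    by_selector: HashMap<String, u32>,
    by_id: HashMap<u32, String>,
    next_id: u32,
}

impl RefMap {
    /// Returns the existing ref for a selector, or assigns the next one.
    fn assign(&mut self, selector: &str) -> String {
        if let Some(id) = self.by_selector.get(selector) {
            return format!("@e{}", id);
        }
        self.next_id += 1;
        let id = self.next_id;
        self.by_selector.insert(selector.to_string(), id);
        self.by_id.insert(id, selector.to_string());
        format!("@e{}", id)
    }

    /// Resolves `@eN` to its CSS selector; plain selectors pass through.
    fn resolve(&self, input: &str) -> Result<String, String> {
        let id = input
            .strip_prefix("@e")
            .filter(|n| !n.is_empty() && n.bytes().all(|b| b.is_ascii_digit()))
            .and_then(|n| n.parse::<u32>().ok());
        match id {
            Some(id) => self
                .by_id
                .get(&id)
                .cloned()
                .ok_or_else(|| format!("unknown or stale ref: {}", input)),
            None => Ok(input.to_string()),
        }
    }

    /// Called on navigation so old refs cannot resolve against a new page.
    fn clear(&mut self) {
        self.by_selector.clear();
        self.by_id.clear();
        self.next_id = 0; // assumption: numbering restarts per page
    }
}
```

Returning an error for a ref that is no longer in the map, instead of treating it as a literal selector, is what prevents a stale handle from silently clicking the wrong element.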

Fixed

  • Ref invalidation on navigation — navigate, go_back, and switch_tab now clear the ref map immediately, preventing stale refs from resolving against a different page and clicking wrong elements.
  • Bridge script launch on Windows — the CloakBrowser bridge script is now written to ~/.acrawl/bridge.cjs and executed via node <path> instead of node -e <script>, fixing the Windows command-line length limit (OS error 206) that prevented all browser features from working.
  • URL normalization deduplication — consolidated duplicate URL-normalization helpers into a single shared normalize_url function used by both page_map and feedback.

v0.8.4

04 Jun 15:05

Added

  • Differential page_map feedback — interaction tools (click, fill_form, select_option, hover, press_key) now return a differential page state showing exactly what changed instead of a full page dump. Includes added/removed headings, links, landmarks, and interactive elements, plus state changes (disabled, checked, value, aria-expanded, aria-pressed, aria-selected). Falls back to full page_map when changes exceed the previous element count.
  • Interactive element value tracking — page_map now captures the current value of select, input, and textarea elements (truncated to 60 chars). For selects, reports the selected option's display text.
  • Smithery MCP marketplace listing — added smithery.yaml for Smithery discovery.
  • Dockerfile for MCP server introspection — enables Glama verification and container-based deployment.
  • Smithery MCPB publish step — release workflow now publishes to Smithery marketplace.
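
The added/removed part of a differential page state can be computed with a count-based diff, sketched below for a list of headings or link texts. This is an illustrative assumption, not the project's diff code: `multiset_diff` and `counts` are hypothetical names, and items are plain strings instead of the richer element records page_map returns.

```rust
use std::collections::BTreeMap;

/// Counts how many times each item occurs.
fn counts<'a>(items: &[&'a str]) -> BTreeMap<&'a str, usize> {
    let mut map = BTreeMap::new();
    for &item in items {
        *map.entry(item).or_insert(0) += 1;
    }
    map
}

/// Returns (added, removed), counting duplicates: an item that goes from
/// two occurrences to one is reported as removed once.
fn multiset_diff<'a>(before: &[&'a str], after: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    let old = counts(before);
    let new = counts(after);
    let mut added = Vec::new();
    let mut removed = Vec::new();
    for (&item, &n) in &new {
        let o = old.get(item).copied().unwrap_or(0);
        for _ in o..n {
            added.push(item); // empty range when o >= n
        }
    }
    for (&item, &o) in &old {
        let n = new.get(item).copied().unwrap_or(0);
        for _ in n..o {
            removed.push(item); // empty range when n >= o
        }
    }
    (added, removed)
}
```

A set-based comparison would report no change when one of two identical "Home" links disappears. Counting occurrences catches that case.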

Fixed

  • Navigate seeds page snapshot cache — the first interaction after navigate now produces a differential response instead of falling back to a full page_map.
  • Hash-route fragment preservation — cache keys now preserve #/path and #!/path fragments (hash-routed SPAs) while still stripping simple in-page anchors like #section.
  • Multiset-aware structural diff — duplicate headings/links/landmarks are now correctly counted (previously collapsed by set-based comparison).
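
The hash-route rule above comes down to a small decision on the fragment. The sketch below shows only that decision, under the assumption that a fragment starting with "/" or "!/" marks a hash route. The `cache_key` name is hypothetical, and the real cache key most likely goes through the shared normalize_url helper as well.

```rust
/// Keeps hash-route fragments (`#/path`, `#!/path`) in the cache key and
/// strips simple in-page anchors such as `#section`.
fn cache_key(url: &str) -> &str {
    match url.split_once('#') {
        // Hash-routed SPA: the fragment identifies a distinct view.
        Some((_, frag)) if frag.starts_with('/') || frag.starts_with("!/") => url,
        // Plain anchor: same page, so drop the fragment.
        Some((base, _)) => base,
        None => url,
    }
}
```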

v0.8.3

04 Jun 07:24

Added

  • MCPB bundles in releases — each release now includes platform-specific .mcpb archives (ZIP of manifest.json + binary) for single-click installation in Claude Desktop and other MCP hosts. Five bundles: linux-x64, linux-arm64, macos-x64, macos-arm64, windows-x64.
  • Automated MCP Registry publishing — the release workflow now automatically publishes acrawl to registry.modelcontextprotocol.io via GitHub OIDC after each release, making it discoverable in the MCP ecosystem.

v0.8.2

03 Jun 11:50

Added

  • page_map interactive elements — the interactive section now returns up to 30 actual elements with text, selector, tag, type, and ARIA state (aria-pressed, aria-expanded, aria-selected, disabled, checked, role). Covers buttons, inputs, selects, textareas, and ARIA widgets (role=button/tab/menuitem/option/switch/checkbox). Flat count keys (buttons, inputs, selects, textareas) preserved at root level for backward compatibility.
  • page_map scope parameter — optional scope CSS selector restricts all queries to a container element (e.g. scope: "[role='dialog']" for modal-only content). Returns scope_not_found: true with empty sections if the selector doesn't match.
  • wait state parameter — optional state field accepts visible, hidden, attached, or detached. Enables waiting for elements to become visible (not just exist in DOM) or disappear (e.g. loading spinners). Errors if state is provided without a selector.
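
The validation rule for the wait state parameter (four accepted values, and an error when state is given without a selector) can be sketched as below. The `WaitState` enum, the `parse_wait_state` function, and the error strings are hypothetical. Only the accepted values and the selector requirement come from the notes above.

```rust
/// The four states the notes list for `wait`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WaitState {
    Visible,
    Hidden,
    Attached,
    Detached,
}

/// Validates the optional `state` argument against the optional `selector`.
/// Returns Ok(None) when no state was requested.
fn parse_wait_state(
    selector: Option<&str>,
    state: Option<&str>,
) -> Result<Option<WaitState>, String> {
    let raw = match state {
        Some(s) => s,
        None => return Ok(None),
    };
    if selector.is_none() {
        return Err("`state` requires a `selector`".to_string());
    }
    let parsed = match raw {
        "visible" => WaitState::Visible,
        "hidden" => WaitState::Hidden,
        "attached" => WaitState::Attached,
        "detached" => WaitState::Detached,
        other => return Err(format!("unknown wait state: {}", other)),
    };
    Ok(Some(parsed))
}
```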

Changed

  • BrowserBackend::page_map() trait method now accepts scope: Option<&str>.
  • BrowserBackend::wait_for_selector() trait method now accepts state: Option<&str>.
  • Extension backend visibility checks use getComputedStyle + getBoundingClientRect to match Playwright's stricter semantics.

v0.8.1

03 Jun 08:06

Added

  • click_at tool — new tool (#21) that dispatches real mouse clicks at specific viewport coordinates via Playwright's page.mouse.click(x, y). Enables interaction with canvas elements, maps, SVGs, and UI components that lack stable CSS selectors. Schema is OpenAI strict-mode compatible (all properties required, no nullable optionals). Both CloakBrowser and Chrome extension backends supported.
  • screenshot element & format options — the screenshot tool now accepts:
    • selector — screenshot a specific element (auto-scrolls into view, crops to element bounds)
    • format — png, jpeg, or webp output (JPEG/WebP produce 5-10x smaller files)
    • quality — compression level 0-100 for lossy formats
    • full_page — capture the entire scrollable page, not just the viewport
    • Saved filenames now use the correct extension (.jpg, .webp, .png) based on format
    • MCP server returns the correct media_type (image/jpeg, image/webp) instead of hardcoded image/png
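
The filename-extension and media_type fixes both amount to deriving two strings from the requested format. A minimal sketch, assuming a hypothetical `ImageFormat` enum (the real option lives in ScreenshotOptions, whose fields are not shown in these notes):

```rust
/// Hypothetical screenshot format, parsed from the `format` argument.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImageFormat {
    Png,
    Jpeg,
    Webp,
}

impl ImageFormat {
    /// Parses the three format names listed in the notes.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "png" => Some(Self::Png),
            "jpeg" => Some(Self::Jpeg),
            "webp" => Some(Self::Webp),
            _ => None,
        }
    }

    /// File extension for saved screenshots (.png, .jpg, .webp).
    fn extension(self) -> &'static str {
        match self {
            Self::Png => "png",
            Self::Jpeg => "jpg",
            Self::Webp => "webp",
        }
    }

    /// Media type reported by the MCP server.
    fn media_type(self) -> &'static str {
        match self {
            Self::Png => "image/png",
            Self::Jpeg => "image/jpeg",
            Self::Webp => "image/webp",
        }
    }
}
```

Deriving both values from one enum keeps the saved file's extension and the reported media type from drifting apart.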

Changed

  • Tool count is now 21 (17 browser + 4 agent-control). MCP server exposes 18 tools (17 browser + run_goal).
  • BrowserBackend::screenshot() trait method now accepts a ScreenshotOptions struct instead of no arguments.