Releases: Mingye-Lu/AgenticCrawler
v0.10.0
Added
- HTML Diff Mode (`optimization.html_diff_mode`): on repeated visits to the same URL, only changed content sections are returned, with `[unchanged: N sections]` markers, reducing token usage by 50–70% on multi-turn sessions. Also active in MCP direct-tool mode (the server now maintains a persistent `CrawlState` across calls).
- Action Loop Detection (`optimization.loop_detection`): a rolling-window action hash detects repeated identical actions and issues escalating nudges (soft at 5, medium at 8, strong at 12 repeats); page stagnation is detected after 5 consecutive identical page fingerprints.
- Page Fingerprinting (`optimization.page_fingerprinting`): lightweight FNV-1a fingerprint (url + element_count + first-1000-char text hash) stored in CrawlState; used by loop detection and by action caching for cache invalidation.
- Planning Interval (`optimization.planning_interval`): every N steps, injects planning-checkpoint or execution-mode guidance into the dynamic prompt; disabled by default (interval=0).
- Failure Classification (`optimization.failure_classification`): 16-category keyword-based error taxonomy (zero LLM cost); `classify()` maps error messages to SelectorNotFound, CaptchaDetected, RateLimited, etc.; `retry_strategy()` returns RetryWithHealing, RetryWithDelay, NoRetry, or ResetAndRetry per category.
- Self-Healing Selectors (`optimization.self_healing`): on SelectorNotFound/SelectorAmbiguous, fetches a fresh page_map and text-matches to the correct element ref; logs `[healed: @eOLD → @eNEW]`; zero LLM calls; max retries configurable (default 2).
- Action Caching (`optimization.action_caching`): in-memory SHA-256 keyed cache for read-only tools (`page_map`, `read_content`, `list_resources`); invalidated on page fingerprint change; TTL-based expiry (default 30s). `execute_js` is intentionally excluded because it may have side effects.
- Confidence Tracking (`optimization.confidence_tracking`): parses `[confidence: HIGH/MEDIUM/LOW]` from assistant responses; 2+ consecutive LOWs trigger a stagnation alert via DynamicPromptContext; advisory only, never blocks.
- Compound Component Enrichment (`optimization.compound_enrichment`): extends interactive element JSON with an `enrichment` field for complex form controls: date format hints, range min/max/step/value, number bounds, select option lists (max 20 + overflow count), file accept types, textarea maxlength. Max 200 bytes/element.
- Content-Aware Cleaning Profiles (`optimization.content_aware_profiles`): `CleaningProfile` enum (Default/Minimal/Aggressive/ReadingMode) auto-selected by task keyword and content size; `select_profile()` picks ReadingMode for extraction tasks, Minimal for interaction tasks, Aggressive for content > 50KB.
- Budget Enforcement (`optimization.budget_max_session_cost_usd`, `optimization.budget_enforcement`): `BudgetEnforcer` with Warn/Block modes; Warn injects a budget warning into the dynamic prompt at a configurable threshold (default 80%); Block terminates the agent loop cleanly when the cost limit is reached.
- Per-Agent Cost Attribution (`optimization.per_agent_cost_tracking`): `build_cost_breakdown()` walks flat child sessions and reconstructs per-child cost via UsageTracker; the `/cost` command shows a per-agent breakdown when the flag is ON.
- Dynamic System Prompt Infrastructure: `DynamicPromptContext` struct with four optional fields (stagnation_alert, planning_guidance, budget_warning, loop_nudge); injected as section 9 of the system prompt via a shared `Arc<Mutex<>>` slot; all optimizations write to this slot, and the runtime picks it up on the next iteration.
- Optimization Settings Schema: nested `OptimizationSettings` struct in `Settings` with 18 fields, all `Option<T>` and defaulting to OFF for backward compatibility; 18 `settings_get_*` getter functions.
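As a rough illustration of the page fingerprinting and loop-nudge thresholds described above, here is a minimal sketch. It assumes a 64-bit FNV-1a variant and a `|`-joined input layout; the release notes specify neither, and the names `fnv1a64`, `page_fingerprint`, `Nudge`, and `nudge_for_repeats` are hypothetical, not the crate's actual API.

```rust
/// Standard 64-bit FNV-1a. The hash width is an assumption; the notes only say "FNV-1a".
fn fnv1a64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b); // XOR first, then multiply: this ordering is what makes it FNV-1a
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical fingerprint over url + element_count + hash of the first 1000 chars of text.
/// The `|` separator and hex encoding are illustrative choices.
fn page_fingerprint(url: &str, element_count: usize, text: &str) -> u64 {
    let prefix: String = text.chars().take(1000).collect();
    let text_hash = fnv1a64(prefix.as_bytes());
    let combined = format!("{url}|{element_count}|{text_hash:016x}");
    fnv1a64(combined.as_bytes())
}

/// Escalation levels matching the thresholds in the notes (5, 8, 12 repeats).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Nudge {
    None,
    Soft,
    Medium,
    Strong,
}

/// Map a count of repeated identical actions to a nudge level.
fn nudge_for_repeats(repeats: u32) -> Nudge {
    match repeats {
        0..=4 => Nudge::None,
        5..=7 => Nudge::Soft,
        8..=11 => Nudge::Medium,
        _ => Nudge::Strong,
    }
}
```

Because only the first 1000 characters of text feed the hash, changes further down a long page would not alter a fingerprint built this way; that trade-off is what keeps it lightweight.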
v0.9.1
Changed
- `navigate` defaults to `fit_markdown`: the `format` parameter now defaults to `fit_markdown` instead of `markdown`, saving 30–40% tokens on typical pages. Pass `format: "markdown"` explicitly to restore full output.
- `wait` returns `page_state`: both the selector-based and fixed-duration wait branches now return a `page_state` diff (URL, title, added/removed/modified elements) after the condition resolves, consistent with all other action tools (`click`, `fill_form`, `press_key`, `scroll`, etc.). This eliminates the extra `page_map` call previously needed to observe what changed.
Fixed
- Script tools missing from `ToolRegistry`: `run_script`, `wait_for_scripts`, `script_status`, `cancel_script`, `save_script`, `list_scripts`, and `read_script` were parsed and validated correctly but not dispatched by the agent loop. All 7 script tools are now registered.
Improved
- MCP tool descriptions — all 28 tool descriptions and parameter schemas enriched with concrete examples, edge-case guidance, and clearer return-value documentation for better LLM tool selection and Glama TDQS scoring.
v0.9.0
Added
- Autonomous Script Protocol: a new deterministic execution layer that lets the LLM run multi-step browser automation without per-step LLM round-trips. Write scripts once and execute them in tight loops, which is faster and cheaper for repetitive page patterns (pagination, form filling, bulk extraction).
- New `crates/script/` crate: standalone grammar, parser, and persistence layer:
  - AST types: `ScriptDefinition`, `ScriptNode` (10 node kinds), `Expression` (5 expression kinds)
  - `parse_script` + `validate_script` with comprehensive error reporting (unknown tools, undefined variables, excessive nesting, oversized scripts)
  - Disk persistence: `save_script_to_disk`, `load_script_from_disk`, `list_scripts_on_disk`
- 7 new script tools (available in the agent loop, MCP server, and via `run_goal`):
  - `run_script`: execute an inline script or load a saved one by `name`; returns `script_id` immediately (non-blocking). Accepts `save_as` to persist the script after execution and `limits` to override defaults.
  - `wait_for_scripts`: block until script(s) complete and collect the full `ScriptResult` (extracted_data, yielded_data, steps_executed, elapsed_secs, error)
  - `script_status`: non-blocking poll returning live state (step, items_collected, current_url, elapsed_secs, errors_caught)
  - `cancel_script`: abort a running script via a cooperative cancellation token
  - `save_script`: persist a script definition to `~/.acrawl/scripts/<name>.json`
  - `list_scripts`: list all saved scripts with ISO 8601 UTC timestamps (`modified_at`) and file sizes
  - `read_script`: read back a full script definition from disk
- Script execution engine (`crates/agent/src/script_executor/`):
  - Supported nodes: `tool_call`, `assign`, `collect`, `yield`, `for_loop`, `for_each`, `while_loop`, `if_else`, `try_catch` (with `catch`/`finally`/`error_var`), `parallel`
  - Limits enforced at runtime: `max_steps`, `max_timeout_secs`, `per_step_timeout_secs`, `max_output_bytes`, `max_parallel_branches`, `max_nesting_depth`
  - `parallel` branches each get their own browser page and share a global step counter and cancellation token; `errors_caught` and `output_bytes` are propagated back to the parent executor on completion
  - `collect` accumulates to `extracted_data`; `yield` writes to a shared `Arc<RwLock<Vec<Value>>>` readable via `script_status` mid-execution
  - Variable substitution: `$varname` strings in tool inputs are replaced with their current values
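The variable substitution rule listed above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: tool inputs are modeled as flat string maps rather than JSON values, only whole-string `$varname` inputs are substituted, and an undefined variable is reported as an error. The function name `substitute_vars` is hypothetical; the real executor may differ on all of these points.

```rust
use std::collections::HashMap;

/// Replace every input whose value is exactly `$name` with the current value of `name`.
/// Inputs that do not start with `$` pass through unchanged.
fn substitute_vars(
    inputs: &HashMap<String, String>,
    vars: &HashMap<String, String>,
) -> Result<HashMap<String, String>, String> {
    let mut resolved = HashMap::with_capacity(inputs.len());
    for (key, value) in inputs {
        let new_value = match value.strip_prefix('$') {
            // Assumption: referencing an unknown variable is an error, not an empty string.
            Some(name) => vars
                .get(name)
                .cloned()
                .ok_or_else(|| format!("undefined variable: {name}"))?,
            None => value.clone(),
        };
        resolved.insert(key.clone(), new_value);
    }
    Ok(resolved)
}
```

For instance, with a variable `next_url` bound inside a loop body, a `navigate` call whose `url` input is `$next_url` would receive the current URL on each iteration.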
Fixed
- `Expression` serde deserialization: changed from internally tagged (`#[serde(tag="kind")]`) to adjacently tagged (`#[serde(tag="kind", content="value")]`). The old representation caused the `Literal`, `Variable`, and `JsEval` newtype variants to fail deserialization from JSON, meaning no user-submitted script with variables or literals would parse.
- MCP server `run_script` panic: `spawn_script` calls `tokio::task::spawn` internally but was invoked outside any `block_on` context, causing an immediate `"no reactor running"` panic that killed the server process. Wrapped in `rt.block_on(async { … })`.
- `cleanup_completed` result race: `spawn_script` called `cleanup_completed()` before checking concurrency limits, silently evicting just-completed scripts from the map. `wait_for_scripts` would then return `NotFound` for fast-completing scripts. Removed the premature cleanup.
- `max_output_bytes` not enforced: the limit was stored and validated but never checked during execution. `push_extracted` and `push_yielded` now track the accumulated byte count and return `ScriptExecutionError` on overflow.
- `validate_script_name` duplicated with inconsistent rules: there were three separate implementations in `save_script.rs`, `read_script.rs`, and `persistence.rs`. Consolidated into `persistence::validate_script_name` with the strictest ruleset (rejects leading dash, dots, path separators, non-normal path components).
- `list_scripts` timestamp format: `modified_at` previously returned a raw Unix epoch integer (e.g. `"1780991949"`). Now returns ISO 8601 UTC (e.g. `"2026-06-09T13:39:09Z"`) via `time::OffsetDateTime` + `Rfc3339`.
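To make the consolidated name rules concrete, here is a minimal sketch of a validator that rejects a leading dash, dots, path separators, and non-normal path components. It is not the actual `persistence::validate_script_name`; the error type, the messages, and the empty-name check are assumptions added for illustration.

```rust
use std::path::{Component, Path};

/// Sketch of a strict script-name check, assuming the name becomes `<name>.json` on disk.
fn validate_script_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        // Assumption: empty names are rejected (not stated in the release notes).
        return Err("script name is empty".to_string());
    }
    if name.starts_with('-') {
        return Err("script name must not start with '-'".to_string());
    }
    if name.contains('.') {
        return Err("script name must not contain '.'".to_string());
    }
    if name.contains('/') || name.contains('\\') {
        return Err("script name must not contain path separators".to_string());
    }
    // Anything that is not exactly one normal component (root, prefix, `..`) is refused.
    let mut components = Path::new(name).components();
    match (components.next(), components.next()) {
        (Some(Component::Normal(_)), None) => Ok(()),
        _ => Err("script name must be a single normal path component".to_string()),
    }
}
```

Rejecting dots outright also rules out `..` traversal and names that carry their own extension, which is likely why a single strict ruleset is simpler than three slightly different ones.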
v0.8.7
Added
- `page_map_depth` parameter for navigate: new `page_map_depth` option (`full`, `slim`, `none`, default: `slim`) controls how much structural data is returned inline with navigation responses. `slim` strips CSS selectors from links/headings/landmarks and caps link text at 60 chars, reducing token usage while preserving `@eN` refs for interaction. `none` omits the page_map entirely. The full page_map is still cached internally for differential feedback.
- MCP server unit tests: 18 tests covering `parse_run_goal_request`, `validate_tool_names`, `normalize_tool_name`, `filtered_tool_specs`, `execute_run_goal`, and framed/line-delimited protocol detection.
- Render crate unit tests: tests for `MarkdownStreamState` push/flush, incremental streaming, partial content boundaries, and long-line handling.
Changed
- `mvp_tool_specs()` refactored: the 376-line monolithic tool specification function is now split into `navigation_tools()`, `interaction_tools()`, `extraction_tools()`, and `agent_control_tools()` helpers. Public API unchanged.
Fixed
- CloakBrowser-dependent tests skip gracefully: tests requiring PlaywrightBridge now detect `PlaywrightNotInstalled` and return early instead of panicking, eliminating 3 false failures on machines without Node.js/CloakBrowser.
v0.8.6
Added
- `fit_markdown` format for navigate: new `format="fit_markdown"` option that prunes boilerplate DOM nodes (ads, navs, sidebars, footers) before markdown conversion, substantially reducing token consumption on noisy pages. Elements are scored by text density, descendant link density, semantic tag weight, and class/id signals. Falls back to plain text when pruning removes all content. Tool instructions now recommend `fit_markdown` as the preferred default format.
v0.8.5
Added
- Stable `@eN` element references: `page_map` now assigns short, stable handles (`@e1`, `@e2`, …) to each interactive element. Interaction tools (`click`, `hover`, `fill_form`, `press_key`, `select_option`) accept `@eN` in their selector fields, resolving them to the underlying CSS selector. This eliminates the need to copy long, fragile CSS paths; the LLM can just say `click @e3`.
- RefMap data structure (`crates/browser/src/ref_map.rs`): maps integer IDs to CSS selectors with stable reuse (same selector always gets the same ref) and lifecycle management (clear on navigation).
- Ref resolution module (`crates/agent/src/tools/ref_resolve.rs`): centralized `@eN` → CSS selector resolution shared across all interaction tools. Plain CSS selectors pass through unchanged for full backward compatibility.
- Navigate embedded refs: the `page_map` returned inline with `navigate` responses now includes `@eN` annotations, so the first page view the LLM sees already has stable handles (no extra `page_map` call needed).
- Scoped `page_map` refs: `page_map` with a `scope` parameter (e.g. modals/dialogs) now also annotates interactive elements with refs.
- `NopBridge` test utility (`crates/browser/src/testing.rs`): no-op `BrowserBackend` implementation for unit testing `BrowserContext` without launching a real browser.
- Glama MCP registry verification: added `glama.json` for Glama marketplace discovery.
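The ref map and ref resolution behavior described above might look roughly like the sketch below. This is not the code in `ref_map.rs` or `ref_resolve.rs`; the field layout, method names, error strings, and the choice to restart numbering after `clear` are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical ref map: stable `@eN` handles for CSS selectors.
#[derive(Default)]
struct RefMap {
    by_selector: HashMap<String, u32>,
    by_id: HashMap<u32, String>,
    next_id: u32,
}

impl RefMap {
    /// Return the existing ref for a selector, or allocate the next one.
    fn assign(&mut self, selector: &str) -> String {
        if let Some(id) = self.by_selector.get(selector) {
            return format!("@e{id}");
        }
        self.next_id += 1;
        let id = self.next_id;
        self.by_selector.insert(selector.to_string(), id);
        self.by_id.insert(id, selector.to_string());
        format!("@e{id}")
    }

    /// Resolve `@eN` to its CSS selector; plain CSS selectors pass through unchanged.
    fn resolve(&self, input: &str) -> Result<String, String> {
        match input.strip_prefix("@e") {
            Some(digits) => {
                let id: u32 = digits
                    .parse()
                    .map_err(|_| format!("malformed ref: {input}"))?;
                self.by_id
                    .get(&id)
                    .cloned()
                    .ok_or_else(|| format!("unknown or stale ref: {input}"))
            }
            None => Ok(input.to_string()),
        }
    }

    /// Drop all refs, e.g. on navigation. Restarting at @e1 is an assumption.
    fn clear(&mut self) {
        self.by_selector.clear();
        self.by_id.clear();
        self.next_id = 0;
    }
}
```

Keeping both directions of the mapping makes `assign` idempotent for a given selector, which is the "stable reuse" property, while `resolve` failing on unknown IDs is what stops a stale ref from silently matching a different page.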
Fixed
- Ref invalidation on navigation: `navigate`, `go_back`, and `switch_tab` now clear the ref map immediately, preventing stale refs from resolving against a different page and clicking the wrong elements.
- Bridge script launch on Windows: the CloakBrowser bridge script is now written to `~/.acrawl/bridge.cjs` and executed via `node <path>` instead of `node -e <script>`, fixing the Windows command-line length limit (OS error 206) that prevented all browser features from working.
- URL normalization deduplication: consolidated duplicate URL-normalization helpers into a single shared `normalize_url` function used by both `page_map` and `feedback`.
v0.8.4
Added
- Differential page_map feedback — interaction tools (click, fill_form, select_option, hover, press_key) now return a differential page state showing exactly what changed instead of a full page dump. Includes added/removed headings, links, landmarks, and interactive elements, plus state changes (disabled, checked, value, aria-expanded, aria-pressed, aria-selected). Falls back to full page_map when changes exceed the previous element count.
- Interactive element value tracking: page_map now captures the current `value` of select, input, and textarea elements (truncated to 60 chars). For selects, reports the selected option's display text.
- Smithery MCP marketplace listing: added `smithery.yaml` for Smithery discovery.
- Dockerfile for MCP server introspection: enables Glama verification and container-based deployment.
- Smithery MCPB publish step: release workflow now publishes to Smithery marketplace.
Fixed
- Navigate seeds page snapshot cache: the first interaction after `navigate` now produces a differential response instead of falling back to a full page_map.
- Hash-route fragment preservation: cache keys now preserve `#/path` and `#!/path` fragments (hash-routed SPAs) while still stripping simple in-page anchors like `#section`.
- Multiset-aware structural diff: duplicate headings/links/landmarks are now correctly counted (previously collapsed by set-based comparison).
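The last two fixes above can be illustrated with a small sketch. Both functions are hypothetical stand-ins: `cache_key` shows only the fragment rule (the real key presumably applies further normalization), and `multiset_diff` shows why counting occurrences matters when a page repeats the same heading or link text.

```rust
use std::collections::HashMap;

/// Keep hash-route fragments (`#/…`, `#!/…`); strip plain in-page anchors (`#section`).
fn cache_key(url: &str) -> &str {
    match url.split_once('#') {
        Some((_, fragment)) if fragment.starts_with('/') || fragment.starts_with("!/") => url,
        Some((base, _)) => base,
        None => url,
    }
}

/// Diff two lists as multisets: an item appearing 3 times after and 2 times before
/// is reported as added once. Returns (added, removed), each sorted for stable output.
fn multiset_diff<'a>(before: &[&'a str], after: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut delta: HashMap<&'a str, i64> = HashMap::new();
    for &item in after {
        *delta.entry(item).or_insert(0) += 1;
    }
    for &item in before {
        *delta.entry(item).or_insert(0) -= 1;
    }
    let mut added = Vec::new();
    let mut removed = Vec::new();
    for (item, count) in delta {
        for _ in 0..count.unsigned_abs() {
            if count > 0 {
                added.push(item);
            } else {
                removed.push(item);
            }
        }
    }
    added.sort_unstable();
    removed.sort_unstable();
    (added, removed)
}
```

A set-based comparison would report no change when a third "Read more" link appears next to two existing ones; the counted version reports exactly one addition.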
v0.8.3
Added
- MCPB bundles in releases: each release now includes platform-specific `.mcpb` archives (ZIP of manifest.json + binary) for single-click installation in Claude Desktop and other MCP hosts. Five bundles: linux-x64, linux-arm64, macos-x64, macos-arm64, windows-x64.
- Automated MCP Registry publishing: the release workflow now automatically publishes acrawl to `registry.modelcontextprotocol.io` via GitHub OIDC after each release, making it discoverable in the MCP ecosystem.
v0.8.2
Added
- `page_map` interactive elements: the `interactive` section now returns up to 30 actual elements with `text`, `selector`, `tag`, `type`, and ARIA state (`aria-pressed`, `aria-expanded`, `aria-selected`, `disabled`, `checked`, `role`). Covers buttons, inputs, selects, textareas, and ARIA widgets (role=button/tab/menuitem/option/switch/checkbox). Flat count keys (`buttons`, `inputs`, `selects`, `textareas`) are preserved at root level for backward compatibility.
- `page_map` scope parameter: optional `scope` CSS selector restricts all queries to a container element (e.g. `scope: "[role='dialog']"` for modal-only content). Returns `scope_not_found: true` with empty sections if the selector doesn't match.
- `wait` state parameter: optional `state` field accepts `visible`, `hidden`, `attached`, or `detached`. Enables waiting for elements to become visible (not just exist in the DOM) or to disappear (e.g. loading spinners). Errors if `state` is provided without a `selector`.
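The validation rule for the `wait` tool's `state` field could be sketched as below. The enum, function name, and error messages are hypothetical; only the four accepted values and the "state requires selector" rule come from the notes above.

```rust
/// The four states accepted by the `state` field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WaitState {
    Visible,
    Hidden,
    Attached,
    Detached,
}

/// Validate the optional `selector`/`state` pair of a wait request.
/// Returns Ok(None) when no state was given.
fn parse_wait_state(
    selector: Option<&str>,
    state: Option<&str>,
) -> Result<Option<WaitState>, String> {
    let raw = match state {
        Some(raw) => raw,
        None => return Ok(None),
    };
    if selector.is_none() {
        return Err("`state` requires a `selector`".to_string());
    }
    let parsed = match raw {
        "visible" => WaitState::Visible,
        "hidden" => WaitState::Hidden,
        "attached" => WaitState::Attached,
        "detached" => WaitState::Detached,
        other => return Err(format!("unknown state: {other}")),
    };
    Ok(Some(parsed))
}
```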
Changed
- `BrowserBackend::page_map()` trait method now accepts `scope: Option<&str>`.
- `BrowserBackend::wait_for_selector()` trait method now accepts `state: Option<&str>`.
- Extension backend visibility checks use `getComputedStyle` + `getBoundingClientRect` to match Playwright's stricter semantics.
v0.8.1
Added
- `click_at` tool: new tool (#21) that dispatches real mouse clicks at specific viewport coordinates via Playwright's `page.mouse.click(x, y)`. Enables interaction with canvas elements, maps, SVGs, and UI components that lack stable CSS selectors. Schema is OpenAI strict-mode compatible (all properties required, no nullable optionals). Both CloakBrowser and Chrome extension backends are supported.
- `screenshot` element & format options: the screenshot tool now accepts:
  - `selector`: screenshot a specific element (auto-scrolls into view, crops to element bounds)
  - `format`: `png`, `jpeg`, or `webp` output (JPEG/WebP produce 5-10x smaller files)
  - `quality`: compression level 0-100 for lossy formats
  - `full_page`: capture the entire scrollable page, not just the viewport
  - Saved filenames now use the correct extension (`.jpg`, `.webp`, `.png`) based on format
  - MCP server returns the correct `media_type` (`image/jpeg`, `image/webp`) instead of hardcoded `image/png`
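The format-to-extension and format-to-media-type mapping listed above is small enough to sketch directly. The `ScreenshotFormat` enum and its methods are hypothetical names, not fields of the real `ScreenshotOptions` struct; only the string values are taken from the notes.

```rust
/// Hypothetical enum for the three supported `format` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScreenshotFormat {
    Png,
    Jpeg,
    Webp,
}

impl ScreenshotFormat {
    /// Parse the tool's `format` string; unknown values yield None.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "png" => Some(Self::Png),
            "jpeg" => Some(Self::Jpeg),
            "webp" => Some(Self::Webp),
            _ => None,
        }
    }

    /// File extension for saved screenshots (note `jpeg` saves as `.jpg`).
    fn extension(self) -> &'static str {
        match self {
            Self::Png => "png",
            Self::Jpeg => "jpg",
            Self::Webp => "webp",
        }
    }

    /// MIME type reported back to MCP clients.
    fn media_type(self) -> &'static str {
        match self {
            Self::Png => "image/png",
            Self::Jpeg => "image/jpeg",
            Self::Webp => "image/webp",
        }
    }
}
```

Deriving both values from one enum avoids the class of bug this release fixed, where the media type was hardcoded independently of the chosen format.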
Changed
- Tool count is now 21 (17 browser + 4 agent-control). MCP server exposes 18 tools (17 browser + `run_goal`).
- `BrowserBackend::screenshot()` trait method now accepts a `ScreenshotOptions` struct instead of no arguments.