feat(mcp): navigate + set_field_text tools, scroll/OCR robustness, platform fixes & CI at parity with grob #3
Merged
Load a URL in the attached browser window, always forcing a fresh load. Unlike open_url_and_attach_tab, it never re-selects a stale existing tab that merely matches the URL terms, which is the failure mode that lands on the wrong tab. Navigation does not go through the address bar: background keystrokes can't reach browser chrome, so a typed URL falls through to the page and arrives as in-page shortcuts (e.g. on GitHub, g+i jumps to Issues). Reuses launch_app plus the existing attach/verify machinery.
Clear the focused field and set its text in one step by writing the app's AXFocusedUIElement value directly (AX select-all-replace, with a keyboard fallback through the existing type_text path). This avoids the erratic cursor of raw End/Backspace/double-click clears on backgrounded web forms, which garble the result. Fetching AXFocusedUIElement from the app handles sparse-AX web inputs that never reach the scene graph. Gated as high-risk mutating input, mirroring type_keys.
scroll_direction_policy now charges one event per scroll rather than the page count, and grants a batch of eight same-direction scrolls. Approving "scroll <dir>" once therefore covers repeated scroll_at calls at successive coordinates within the TTL, instead of re-prompting on every point-scroll.
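The accounting described above can be sketched as a small grant object. This is only an illustration under assumptions: `ScrollGrant`, `Direction`, and `try_charge` are hypothetical names, not the identifiers used in this PR, and the TTL value is left to the caller.

```rust
use std::time::{Duration, Instant};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Direction {
    Up,
    Down,
    Left,
    Right,
}

/// Hypothetical approval grant: one "scroll <dir>" approval covers a batch
/// of same-direction scroll calls until the TTL expires.
#[derive(Debug)]
pub struct ScrollGrant {
    direction: Direction,
    remaining: u32,
    expires_at: Instant,
}

impl ScrollGrant {
    /// Batch size taken from the commit message (eight scrolls).
    pub const BATCH: u32 = 8;

    pub fn new(direction: Direction, now: Instant, ttl: Duration) -> Self {
        Self { direction, remaining: Self::BATCH, expires_at: now + ttl }
    }

    /// Charges exactly one event per scroll call, ignoring the page count.
    /// Returns false when the caller has to prompt for approval again.
    pub fn try_charge(&mut self, direction: Direction, _pages: u32, now: Instant) -> bool {
        if direction != self.direction || now >= self.expires_at || self.remaining == 0 {
            return false;
        }
        self.remaining -= 1;
        true
    }
}
```

With this shape, eight scroll_at calls in the approved direction pass without a prompt whatever their page count; the ninth call, a call in another direction, or a call after the TTL would be refused.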
Vision returns one observation per recognized line, but a line can span
visually separate UI runs that merely sit on the same band — e.g. a
dropdown label overlapping background page text ("Choose the minimal…
Contents"). An OCR-bound click on either run then lands between them.
observation_to_boxes now uses per-character-range boxes
(boundingBoxForRange) to break a line into tight runs at large
horizontal gaps, with a safe whole-line fallback on any uncertainty.
The pure gap clustering is unit-tested.
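
A minimal sketch of the pure gap clustering, assuming the per-character boxes have already been reduced to horizontal extents. `Span`, `cluster_runs`, and the `max_gap` threshold are hypothetical names for illustration; the actual observation_to_boxes code may differ.

```rust
/// Horizontal extent of one character box, in the same units as the line
/// (for example normalized 0..1 coordinates).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Span {
    pub min_x: f64,
    pub max_x: f64,
}

fn is_valid(s: &Span) -> bool {
    s.min_x.is_finite() && s.max_x.is_finite() && s.min_x <= s.max_x
}

/// Splits left-to-right character spans into runs wherever the horizontal
/// gap exceeds `max_gap`. Returns None on any uncertainty (empty input,
/// invalid or out-of-order boxes) so the caller can keep the whole-line box.
pub fn cluster_runs(chars: &[Span], max_gap: f64) -> Option<Vec<Span>> {
    let (first, rest) = chars.split_first()?;
    if !max_gap.is_finite() || max_gap < 0.0 || !is_valid(first) {
        return None;
    }
    let mut runs = Vec::new();
    let mut current = *first;
    let mut prev_min_x = first.min_x;
    for c in rest {
        if !is_valid(c) || c.min_x < prev_min_x {
            return None;
        }
        prev_min_x = c.min_x;
        if c.min_x - current.max_x > max_gap {
            // Large gap: close the current run and start a new one.
            runs.push(current);
            current = *c;
        } else {
            // Same run: extend its right edge.
            current.max_x = current.max_x.max(c.max_x);
        }
    }
    runs.push(current);
    Some(runs)
}
```

Returning None rather than a partial result keeps the fallback decision in one place: the caller uses the tight runs only when every box was usable.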
…t-safe (foreground + Cmd+A/Cmd+V via AppleScript keystroke, never a letter keycode)
… reliable pencil/OCR/scroll method
- guard: CGEventSource switches to HIDSystemState so that synthetic input posted by dunst is ignored (dunst no longer blocks on its own events)
- guard: retry budget configurable via DUNST_MCP_USER_IDLE_RETRY_MS (default = current behavior), plus a frozen message format so that age_ms/guard_ms can be parsed
- find_element: OCR/vision fallback (get_hit_targets) when AX returns no match, hits tagged by source; AX-first behavior preserved
- open_url_and_attach_tab: reuses an already-open tab via its title instead of re-navigating; public signature unchanged (TODO: reuse param)
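
The AX-first lookup with a source-tagged OCR/vision fallback might look roughly like the sketch below. `Hit`, `HitSource`, and `find_with_fallback` are hypothetical names, and the two lookups are passed as closures because the real find_element and get_hit_targets signatures are not shown in this PR page.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HitSource {
    Accessibility,
    Vision,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Hit {
    pub label: String,
    pub source: HitSource,
}

/// Queries the accessibility tree first and only falls back to the
/// OCR/vision lookup when AX returns no match. Every hit carries the
/// source it came from.
pub fn find_with_fallback<A, V>(query: &str, ax_lookup: A, vision_lookup: V) -> Vec<Hit>
where
    A: FnOnce(&str) -> Vec<String>,
    V: FnOnce(&str) -> Vec<String>,
{
    let ax_matches = ax_lookup(query);
    if !ax_matches.is_empty() {
        // AX-first: the vision lookup is never invoked in this branch.
        return ax_matches
            .into_iter()
            .map(|label| Hit { label, source: HitSource::Accessibility })
            .collect();
    }
    vision_lookup(query)
        .into_iter()
        .map(|label| Hit { label, source: HitSource::Vision })
        .collect()
}
```

Tagging by source lets a caller treat vision hits with more caution than AX hits without changing the return type.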
… a stale paste
The paste (Cmd+V) is consumed asynchronously by the target app; restoring the clipboard too early makes it paste the old content. Wait 300 ms (aligned with the proven foreground path) before restoring, and only when previous content has to be put back.
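
The ordering matters more than the exact delay, so here is a hedged sketch of it. The `Clipboard` trait, `MemoryClipboard`, and `paste_then_restore` are hypothetical stand-ins for the real pasteboard code; the settle delay is a parameter, with 300 ms being the value stated in the commit.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical clipboard abstraction, used only for this illustration.
pub trait Clipboard {
    fn read(&self) -> Option<String>;
    fn write(&mut self, text: &str);
}

/// In-memory stand-in so the sketch can run without a real pasteboard.
pub struct MemoryClipboard(pub Option<String>);

impl Clipboard for MemoryClipboard {
    fn read(&self) -> Option<String> {
        self.0.clone()
    }
    fn write(&mut self, text: &str) {
        self.0 = Some(text.to_string());
    }
}

/// Puts `text` on the clipboard, triggers the paste, then restores the
/// previous content after `settle`. The wait happens only when there is
/// previous content to put back.
pub fn paste_then_restore<C: Clipboard>(
    clipboard: &mut C,
    text: &str,
    send_paste: impl FnOnce(),
    settle: Duration,
) {
    let previous = clipboard.read();
    clipboard.write(text);
    send_paste(); // e.g. Cmd+V; the target app consumes it asynchronously
    if let Some(old) = previous {
        thread::sleep(settle); // give the app time to read the new content
        clipboard.write(&old);
    }
}
```

A call in the spirit of the commit would pass `Duration::from_millis(300)` as `settle`.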
- ci.yml: merge queue (merge_group) + push.paths-ignore (docs/md/license...)
- new workflows adapted to the workspace (crates/**): codeql, semgrep, shellcheck, nightly, audit-cron, auto-update-branch, auto-merge-release
- release-plz.yml: fixes the path filter src/** -> crates/** (the workflow never triggered on the workspace sources)
- cla.yml deliberately not ported (external-contributor CLA, dedicated secret/storage); cleanup-branches.yml still to be ported
- steps that depend on an optional secret made resilient (no red CI when the secret is absent)
…weekly)
Ports the last missing grob workflow: deletes the head branch when a PR is merged, and every week cleans up the release-plz-/claude/codex branches already merged into main. Protected branches (main/release/hotfix/dependabot) and forks are excluded.
On Enter=Send apps (ChatGPT, LinkedIn, Slack...), each typed \n was posted as a raw Enter and sent the message in pieces.
- multi-line text is routed via the clipboard (paste_text_background, restore clipboard): \n is inserted as text, never a submit
- character-typing path: \n -> Shift+Enter (newline) instead of a raw Enter
- explicit press_key(Return/Enter) is unchanged = submit (flags 0), covered by a test
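
The routing rule can be sketched as a pure planning function. `InputStep` and `plan_text_input` are hypothetical names, and the `clipboard_available` flag is an assumption about when the per-character path is taken; the PR page does not say how the real code chooses between the two paths.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum InputStep {
    /// Insert the whole text through the clipboard; newlines arrive as text.
    Paste(String),
    /// Type a single character.
    Char(char),
    /// Shift+Enter: inserts a newline without submitting.
    SoftNewline,
}

/// Plans how to enter `text` so that no newline is ever sent as a raw
/// Enter. Multi-line text goes through the clipboard when possible;
/// otherwise each '\n' becomes Shift+Enter on the typing path.
pub fn plan_text_input(text: &str, clipboard_available: bool) -> Vec<InputStep> {
    if text.contains('\n') && clipboard_available {
        return vec![InputStep::Paste(text.to_string())];
    }
    text.chars()
        .map(|c| if c == '\n' { InputStep::SoftNewline } else { InputStep::Char(c) })
        .collect()
}
```

An explicit submit stays a separate action outside this plan, which matches the unchanged press_key(Return/Enter) behavior.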
…batch choice)
Detailed design of the 2 MCP primitives for enumerating an entire scrollable choice surface in one pass, then applying all the selections in one batch (1 approval, re-scan on epoch change). Reuses the existing hit_targets/ui_epoch/expected_epoch/approve.
This was referenced Jun 29, 2026
Merged
Summary
Feature branch (17 commits) that adds MCP tools and makes dunst's macOS automation more reliable. 40 files, +1930 / −340.
New MCP tools
- navigate: loads a URL in the attached browser and re-verifies the target
- set_field_text: layout-safe field replacement (foreground + Cmd+A/Cmd+V via AppleScript keystroke), with approval gating
- unstick_cursor: recovery from a stuck I-beam cursor (OS bug)
Reliability / fixes
- …HIDSystemState): dunst no longer blocks on its own synthetic events; configurable retry budget
- find_element: OCR/vision fallback when the accessibility tree returns no match (AX-poor pages)
- open_url_and_attach_tab: reuses an already-open tab (title match) instead of re-navigating
Infra
- …release-plz.toml, script adjustments
CI/CD automation (aligned with grob)
- ci.yml: adds the merge queue (merge_group) + push.paths-ignore
- …crates/**): CodeQL, semgrep, shellcheck, nightly, audit-cron, auto-update-branch, auto-merge-release
- release-plz.yml filtered on src/** (never triggered on this workspace) → fixed to crates/**
- cla.yml deliberately not ported; cleanup-branches.yml still to be added
- …main. Secret: RELEASE_PLZ_TOKEN (release-plz / auto-merge / tags / Homebrew tap).
Local validation
cargo build, cargo clippy -D warnings, cargo test, cargo audit, cargo machete (pre-push hooks green) · actionlint + YAML parse OK on the 10 workflows.