Accessibility: agent-readiness tooling, audit, and fixes (3.50.0.0)#189
Merged
Add src/Tools/A11yPeerDump.Headless: boots the real MainWindow under Avalonia.Headless and walks the automation-peer tree (the source the AT-SPI/UIA backends project from), so the app's accessibility surface can be verified with no display server, a11y bus, or Accerciser.

Reports: the peer tree, a full menu-command audit (incl. closed submenus, via the logical tree), explicitly-named chrome, the DocumentViewportAutomationPeer spotlight (page/zoom/rail-line/outline), and a summary of gaps. Mirrors RenderHarness.Headless's proven boot (real Skia, no capture).

First findings on a real run: all 70 menu commands carry accessible names; toolbar/nav buttons expose Invoke/Toggle; the viewport peer reports "Document viewport"/#DocumentViewport + a live page/zoom/mode description.

Open question it surfaces: Avalonia's MenuItemAutomationPeer exposes no UIA Invoke pattern. Confirm whether the AT-SPI backend still maps an Action via a live pyatspi/dogtail dump (recipe in the README).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ail run

Make --rail reliably engage rail (zoom above the threshold, mirroring the screenshot harness, since the background-analysis timer is unreliable under headless) and report both the viewport peer's cached Name/HelpText and the LIVE Core reading position (role + line text).

The live line is the authoritative proof the rail channel produces the spoken line text; the cached peer snapshot can lag it under the harness's rapid drive because PDFium text extraction settles after the snap-start accessibility notify (the peer keys its cache on (page,rail,block,line)).

Verified end-to-end with the ONNX model present: rail engaged, peer in rail mode, live line "reproduce the tables and figures in this paper solely for use in ...".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dump script
The live AT-SPI tree showed the always-visible rail toolbar buttons surfaced
as bare single letters ("P"/"J"/"F"/"H") and the sliders as "Slider" — a
screen reader / agent couldn't tell what they do. Set AutomationProperties.Name
so they read "Auto-scroll" / "Jump mode" / "Line focus blur" / "Line highlight"
and "Scroll speed" / "Motion blur intensity" (the speed slider's name tracks
its Spd/Jmp mode). These buttons already expose a click action, so this makes
the signature rail commands properly named AND agent-actionable — the reliable
path on Linux, where Avalonia menu items expose no AT-SPI Action.
Also add scripts/atspi-dump.py (the live pyatspi cross-check used to find this)
and document the conda-vs-/usr/bin/python3 and "Avalonia Application" gotchas.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
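The naming gap described in this commit (bare single letters, or a name that only repeats the role, such as "Slider") is the kind of thing a small tree walk can flag automatically. The following is a minimal sketch of that check, not the contents of scripts/atspi-dump.py: `Node` is a hypothetical stand-in for an accessible object, and the role strings are assumed to match what a live dump would report.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an accessible object (not a pyatspi type)."""
    role: str
    name: str = ""
    children: list[Node] = field(default_factory=list)


# Assumed role strings for controls where a weak name hurts agents most.
INTERACTIVE_ROLES = {"push button", "toggle button", "slider", "menu item"}


def is_weak_name(role: str, name: str) -> bool:
    """True when the name is empty, a single character, or just repeats the role."""
    cleaned = name.strip()
    if len(cleaned) <= 1:
        return True
    return cleaned.lower() == role.lower()


def find_weak_names(root: Node, path: str = "") -> list[tuple[str, str, str]]:
    """Depth-first walk returning (path, role, name) for weakly named controls."""
    here = f"{path}/{root.role}"
    found = []
    if root.role in INTERACTIVE_ROLES and is_weak_name(root.role, root.name):
        found.append((here, root.role, root.name))
    for child in root.children:
        found.extend(find_weak_names(child, here))
    return found


# Illustrative tree only; the real toolbar layout may differ.
toolbar = Node("tool bar", "Rail", [
    Node("toggle button", "P"),
    Node("toggle button", "Jump mode"),
    Node("slider", "Slider"),
])
for node_path, role, name in find_weak_names(toolbar):
    print(f"{node_path}: {role!r} has weak name {name!r}")
```

Against a live tree, the same two functions could be fed from whatever objects the dump script already visits, as long as each one is adapted to expose a role string, a name, and its children.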
docs/agent-playbook.md — how to drive RailReader2's GUI via an AT-SPI automation agent. Grounded in the live AT-SPI verification: act via buttons + keyboard accelerators (Avalonia menus expose no AT-SPI Action on Linux), an audit-sourced element/AutomationId reference, the viewport + status-bar read-channels, the shortcut map, and task recipes. Also flags the CLI as the right tool for content extraction (no GUI needed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Application.Name = "RailReader2" so the app registers on the a11y bus under its real name instead of the generic "Avalonia Application" (frame is already named railreader2). Needs a live atspi re-check to confirm.
- Known-state startup flags for agents / scripted launches: --page <n> (1-based), --zoom <pct> (e.g. 300), --rail. Parsed in App startup (ParseStartupFlags) and applied after the document opens via new MainWindowViewModel.ApplyStartupView (GoToPage + SetZoomPercent + an analysis-poll that forces rail once the page is analysed).
- Document both in docs/agent-playbook.md.

Build clean (6 projects, 0 warnings).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
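For a scripted launch, the three flags above can be assembled into an argument list before handing it to a process launcher. This is a rough sketch under stated assumptions: the binary name `railreader2` and the document path being passed as a positional argument are guesses for illustration, not something this commit confirms; only `--page`, `--zoom`, and `--rail` come from the change itself.

```python
from __future__ import annotations


def build_launch_args(
    binary: str,
    document: str,
    page: int | None = None,
    zoom_pct: int | None = None,
    rail: bool = False,
) -> list[str]:
    """Build an argv list for a known-state launch.

    Assumption: the document is a positional argument placed before the flags.
    """
    args = [binary, document]
    if page is not None:
        if page < 1:
            # --page is documented as 1-based.
            raise ValueError("page is 1-based and must be >= 1")
        args += ["--page", str(page)]
    if zoom_pct is not None:
        if zoom_pct <= 0:
            raise ValueError("zoom percentage must be positive")
        args += ["--zoom", str(zoom_pct)]
    if rail:
        args.append("--rail")
    return args


# Hypothetical binary and file names, for illustration only.
print(build_launch_args("railreader2", "paper.pdf", page=3, zoom_pct=300, rail=True))
```

The resulting list can be passed directly to `subprocess.Popen`, which avoids shell quoting issues for document paths that contain spaces.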
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Agent-readiness / accessibility work, verified against the live AT-SPI tree (KDE/X11).
What's in here
- src/Tools/A11yPeerDump.Headless: headless Avalonia automation-peer audit (runs anywhere, no a11y bus/Accerciser). Reports the peer tree, a full menu-command audit, named chrome, and the DocumentViewportAutomationPeer spotlight (incl. a --rail run that exercises the live rail-line channel).
- scripts/atspi-dump.py: companion live pyatspi dump (run with /usr/bin/python3).
- Auto-scroll / Jump mode / Line focus blur / Line highlight (were bare letters "P/J/F/H") + the two sliders, via AutomationProperties.Name. Now properly named and agent-actionable.
- RailReader2 on the a11y bus (was the generic "Avalonia Application"), via Application.Name. Live-verified.
- --page <n> / --zoom <pct> / --rail for known-state startup.
- docs/agent-playbook.md: the computer-use-linux driving guide.

Key finding (documented in the playbook)
Avalonia menu items expose no AT-SPI Action on Linux — agents must drive via buttons + keyboard accelerators, which the playbook is built around. The viewport peer's accessible name is the live rail line (confirmed).
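One way an agent might encode this finding is a small dispatch rule: use the control's own action when the accessibility layer exposes one, and fall back to a keyboard accelerator otherwise. The sketch below assumes a hypothetical `Control` record and a placeholder shortcut table; the real shortcut map lives in the playbook and is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Control:
    """Hypothetical summary of one accessible element."""
    name: str
    role: str
    action_count: int  # how many actions the accessibility layer exposes


def plan_activation(control: Control, shortcuts: dict[str, str]) -> tuple[str, str]:
    """Return ("action", name) if the control is directly actionable,
    else ("key", accelerator) when a shortcut is known for it."""
    if control.action_count > 0:
        return ("action", control.name)
    key = shortcuts.get(control.name)
    if key is not None:
        return ("key", key)
    raise LookupError(f"no action or shortcut known for {control.name!r}")


# Placeholder mapping for illustration; not taken from the playbook.
example_shortcuts = {"Open": "Ctrl+O"}

print(plan_activation(Control("Auto-scroll", "toggle button", 1), example_shortcuts))
print(plan_activation(Control("Open", "menu item", 0), example_shortcuts))
```

Raising on an unknown control, rather than guessing, keeps an agent from silently doing nothing when a menu item has neither an exposed action nor a documented accelerator.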
Verification
🤖 Generated with Claude Code