Fix multi-monitor snapshot display selection #283

Open
sposer wants to merge 4 commits into CursorTouch:main from sposer:main

sposer commented Jun 14, 2026

Summary

This PR includes three focused fixes around desktop snapshot correctness:

  1. Skip unnamed semantic children in the UI tree

    • Prevents unnamed structural/control children from leaking into semantic output when they do not provide useful interaction context.
  2. Match DXCAM outputs by desktop geometry

    • Resolves screenshot capture against DXGI output rectangles instead of relying on DXCAM output order.
    • Avoids selecting the wrong physical display when DXGI output ordering differs from virtual desktop layout.
  3. Align Snapshot/Screenshot display selection with active Windows displays

    • Resolves display=[...] through active Windows display metadata.
    • Keeps Snapshot and Screenshot display filtering consistent with the actual capture region.
    • Adds available display metadata to Snapshot output so callers can see display indices, bounds, device names, and primary status.
    • Keeps the screenshot flash overlay aligned with the selected capture region.
    • Updates README, manifest descriptions, and tests for the zero-based active display index semantics.
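
The geometry-matching idea in fix 2 can be sketched roughly as follows. This is an illustration only, not the PR's code: `Rect` and `match_output_by_geometry` are hypothetical names, and the output rectangles are assumed to have been read already from DXGI output metadata as virtual-desktop coordinates.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rect:
    """Rectangle in virtual-desktop coordinates (right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other: "Rect") -> bool:
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def match_output_by_geometry(region: Rect, outputs: Sequence[Rect]) -> Optional[int]:
    """Return the index of the output whose rectangle fully contains `region`.

    Returns None when no single output contains it, so the caller can fall
    back to another capture path instead of guessing from output order.
    """
    for index, output in enumerate(outputs):
        if output.contains(region):
            return index
    return None


# Hypothetical layout: output 0 is the primary display and output 1 sits to
# its left, so enumeration order differs from left-to-right desktop order.
outputs = [Rect(0, 0, 1920, 1080), Rect(-2560, 0, 0, 1440)]
matched = match_output_by_geometry(Rect(-2560, 0, 0, 1440), outputs)
spanning = match_output_by_geometry(Rect(-100, 0, 100, 100), outputs)
```

Here `matched` is `1` even though the left display would come first in a left-to-right ordering, and `spanning` is `None` because that region crosses two outputs.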

Validation

  • uv run pytest tests/test_snapshot_display_filter.py tests/test_flash_overlay.py tests/test_screenshot_capture.py
    • 87 passed
  • git diff --check
  • uv run python -m py_compile src/windows_mcp/uia/core.py

Notes

  • Full ruff check currently reports pre-existing lint issues in legacy UIA files; this PR does not broaden into lint cleanup.

ackinul added 3 commits June 13, 2026 20:45
Why: unnamed interactive controls could leave tree_node undefined during semantic tree construction.

Impact: preserves existing unnamed-control behavior while avoiding traversal failures.

Why: Win32 monitor indices do not reliably match DXGI output indices.

Impact: display selection keeps the existing monitor-index API while falling back when dxcam cannot service a region.

Why: display indices could diverge from Windows active monitor identity and DXGI output order on multi-monitor setups.

Impact: Snapshot and Screenshot now resolve display filters through active display metadata, expose available displays in output, and keep the capture flash on the selected region.
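
A minimal sketch of how a zero-based `display=[...]` filter might be resolved through display metadata into one capture region. The function name `resolve_display_filter` and the dictionary keys (`index`, `bounds`, `device_name`, `primary`) are assumptions for illustration, chosen to mirror the fields listed in the summary. The PR's actual structures may differ.

```python
from typing import Dict, List, Sequence, Tuple

Bounds = Tuple[int, int, int, int]  # (left, top, right, bottom)


def resolve_display_filter(displays: Sequence[Dict], selected: Sequence[int]) -> Bounds:
    """Map zero-based display indices to a single bounding capture region."""
    if not selected:
        raise ValueError("display filter is empty")
    chosen: List[Bounds] = []
    for index in selected:
        if not 0 <= index < len(displays):
            raise ValueError(
                f"display index {index} out of range (0..{len(displays) - 1})"
            )
        chosen.append(displays[index]["bounds"])
    # Union of the selected rectangles in virtual-desktop coordinates.
    return (
        min(b[0] for b in chosen),
        min(b[1] for b in chosen),
        max(b[2] for b in chosen),
        max(b[3] for b in chosen),
    )


# Hypothetical metadata for two active displays.
displays = [
    {"index": 0, "bounds": (0, 0, 1920, 1080), "device_name": r"\\.\DISPLAY1", "primary": True},
    {"index": 1, "bounds": (-2560, 0, 0, 1440), "device_name": r"\\.\DISPLAY2", "primary": False},
]
region_one = resolve_display_filter(displays, [1])
region_both = resolve_display_filter(displays, [0, 1])
```

Using the same resolved region for the capture, the UI tree filter, and the flash overlay is what would keep the three consistent.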

Jeomon commented Jun 18, 2026

Member

Can you show a demo of it?

sposer commented Jun 19, 2026

Author

can you show a demo of it

Screens

image

McpServers

{
  "mcpServers": {
    "windows-mcp": {
      "command": "uvx",
      "args": [
        "windows-mcp",
        "serve"
      ]
    },
    "windows-mcp-new": {
      "command": "uv",
      "args": [
        "--directory",
        "E:\\Private\\Windows-MCP",
        "run",
        "windows-mcp",
        "serve"
      ]
    }
  }
}

Show

[8 images]

Jeomon commented Jun 19, 2026

Member

Can you also show a screenshot of one screen with use_ui_tree and use_annotation as true to check if any grounding is broken after that? I will merge the PR

sposer commented Jun 19, 2026

Author

Can you also show a screenshot of one screen with use_ui_tree and use_annotation as true to check if any grounding is broken after that? I will merge the PR

It appears that the UI tree reports a 'No elements' error. Annotations are present on Screen 1, where they seem to be drawn on the desktop icons, whereas the items in the taskbar at the bottom of Screen 2 are not annotated at all.

Screenshots

[4 images]

Jeomon commented Jun 19, 2026

Member

Is it possible to show me the inspect.exe window in a multi-screen setup?
Specifically the first and second levels of the tree, to see whether there are any attributes for a multi-screen desktop.

sposer commented Jun 19, 2026

Author

Is it possible to show me the inspect.exe window in a multi-screen setup? Specifically the first and second levels of the tree, to see whether there are any attributes for a multi-screen desktop.

Is this what you meant?
image
image

How found:	Selected from tree...
Name:	"Settings"
ControlType:	UIA_WindowControlTypeId (0xC370)
LocalizedControlType:	"window"
BoundingRectangle:	{l:-2568 t:-8 r:8 b:1408}
IsEnabled:	true
IsOffscreen:	false
IsKeyboardFocusable:	true
HasKeyboardFocus:	false
AccessKey:	""
ProcessId:	11476
RuntimeId:	[2A.1F0B06]
FrameworkId:	"Win32"
ClassName:	"ApplicationFrameWindow"
NativeWindowHandle:	0x1F0B06
ProviderDescription:	"[pid:3236,providerId:0x1F0B06 Main:Nested [pid:11476,providerId:0x1F0B06 Annotation(parent link):Microsoft: Annotation Proxy (unmanaged:UIAutomationCore.DLL); Main:Microsoft: MSAA Proxy (unmanaged:UIAutomationCore.DLL)]; Hwnd(parent link):Microsoft: HWND Proxy (unmanaged:uiautomationcore.dll)]"
IsPassword:	false
HelpText:	""
IsDialog:	false
LegacyIAccessible.ChildId:	0
LegacyIAccessible.DefaultAction:	""
LegacyIAccessible.Description:	""
LegacyIAccessible.Help:	""
LegacyIAccessible.KeyboardShortcut:	""
LegacyIAccessible.Name:	"Settings"
LegacyIAccessible.Role:	client (0xA)
LegacyIAccessible.State:	focusable (0x100000)
LegacyIAccessible.Value:	""
Transform.CanMove:	false
Transform.CanResize:	false
Transform.CanRotate:	false
Window.CanMaximize:	true
Window.CanMinimize:	true
Window.IsModal:	false
Window.IsTopmost:	false
Window.WindowInteractionState:	ReadyForUserInteraction (2)
Window.WindowVisualState:	Maximized (1)
IsAnnotationPatternAvailable:	false
IsDragPatternAvailable:	false
IsDockPatternAvailable:	false
IsDropTargetPatternAvailable:	false
IsExpandCollapsePatternAvailable:	false
IsGridItemPatternAvailable:	false
IsGridPatternAvailable:	false
IsInvokePatternAvailable:	false
IsItemContainerPatternAvailable:	false
IsLegacyIAccessiblePatternAvailable:	true
IsMultipleViewPatternAvailable:	false
IsObjectModelPatternAvailable:	false
IsRangeValuePatternAvailable:	false
IsScrollItemPatternAvailable:	false
IsScrollPatternAvailable:	false
IsSelectionItemPatternAvailable:	false
IsSelectionPatternAvailable:	false
IsSpreadsheetItemPatternAvailable:	false
IsSpreadsheetPatternAvailable:	false
IsStylesPatternAvailable:	false
IsSynchronizedInputPatternAvailable:	false
IsTableItemPatternAvailable:	false
IsTablePatternAvailable:	false
IsTextChildPatternAvailable:	false
IsTextEditPatternAvailable:	false
IsTextPatternAvailable:	false
IsTextPattern2Available:	false
IsTogglePatternAvailable:	false
IsTransformPatternAvailable:	true
IsTransform2PatternAvailable:	false
IsValuePatternAvailable:	false
IsVirtualizedItemPatternAvailable:	false
IsWindowPatternAvailable:	true
IsCustomNavigationPatternAvailable:	false
IsSelectionPattern2Available:	false
FirstChild:	"Settings"
LastChild:	"" pane
Next:	"bin" window
Previous:	"WeChat" window
Other Props:	Object has no additional properties
Children:	"Settings"
	"Settings" window
	"" pane
Ancestors:	"Desktop 1" pane
	[ No Parent ]

How found:	Selected from tree...
Name:	"Settings"
ControlType:	UIA_WindowControlTypeId (0xC370)
LocalizedControlType:	"window"
BoundingRectangle:	{l:-2560 t:0 r:0 b:1400}
IsEnabled:	true
HasKeyboardFocus:	false
ProcessId:	31832
RuntimeId:	[2A.80B16]
FrameworkId:	"XAML"
ClassName:	"Windows.UI.Core.CoreWindow"
NativeWindowHandle:	0x80B16
ProviderDescription:	"[pid:3236,providerId:0x80B16 Main:Nested [pid:31832,providerId:0x80B16 Main(parent link):Unidentified Provider (unmanaged:Windows.UI.Xaml.dll)]; Hwnd(parent link):Microsoft: HWND Proxy (unmanaged:uiautomationcore.dll); View:Microsoft: ViewID Proxy (unmanaged:uiautomationcore.dll)]"
ClickablePoint:	{x:-1280 y:700}
IsDialog:	false
LegacyIAccessible.ChildId:	0
LegacyIAccessible.DefaultAction:	""
LegacyIAccessible.Description:	""
LegacyIAccessible.Help:	""
LegacyIAccessible.KeyboardShortcut:	""
LegacyIAccessible.Name:	"Settings"
LegacyIAccessible.Role:	window (0x9)
LegacyIAccessible.State:	focusable (0x100000)
LegacyIAccessible.Value:	""
IsAnnotationPatternAvailable:	false
IsDragPatternAvailable:	false
IsDockPatternAvailable:	false
IsDropTargetPatternAvailable:	false
IsExpandCollapsePatternAvailable:	false
IsGridItemPatternAvailable:	false
IsGridPatternAvailable:	false
IsInvokePatternAvailable:	false
IsItemContainerPatternAvailable:	false
IsLegacyIAccessiblePatternAvailable:	true
IsMultipleViewPatternAvailable:	false
IsObjectModelPatternAvailable:	false
IsRangeValuePatternAvailable:	false
IsScrollItemPatternAvailable:	false
IsScrollPatternAvailable:	false
IsSelectionItemPatternAvailable:	false
IsSelectionPatternAvailable:	false
IsSpreadsheetItemPatternAvailable:	false
IsSpreadsheetPatternAvailable:	false
IsStylesPatternAvailable:	false
IsSynchronizedInputPatternAvailable:	false
IsTableItemPatternAvailable:	false
IsTablePatternAvailable:	false
IsTextChildPatternAvailable:	false
IsTextEditPatternAvailable:	false
IsTextPatternAvailable:	false
IsTextPattern2Available:	false
IsTogglePatternAvailable:	false
IsTransformPatternAvailable:	false
IsTransform2PatternAvailable:	false
IsValuePatternAvailable:	false
IsVirtualizedItemPatternAvailable:	false
IsWindowPatternAvailable:	false
IsCustomNavigationPatternAvailable:	false
IsSelectionPattern2Available:	false
FirstChild:	"Settings" text
LastChild:	"" group
Next:	"" pane
Previous:	"Settings"
Other Props:	Object has no additional properties
Children:	"Settings" text
	"Home" button
	"" group
	"System" text
	"" list
	"" group
Ancestors:	"Settings" window
	"Desktop 1" pane
	[ No Parent ]
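
The negative `BoundingRectangle` and `ClickablePoint` values in these dumps are virtual-desktop coordinates for a display placed to the left of the primary one. As a rough sketch of why annotation grounding depends on the selected capture region, the snippet below converts such a point into pixel coordinates inside a captured image. The helper `to_image_point` and the capture bounds `(-2560, 0, 0, 1440)` are assumptions for illustration; the real display bounds are not stated in the thread, and DPI scaling is ignored.

```python
from typing import Optional, Tuple

Bounds = Tuple[int, int, int, int]  # (left, top, right, bottom), virtual-desktop pixels


def to_image_point(x: int, y: int, capture: Bounds) -> Optional[Tuple[int, int]]:
    """Translate a virtual-desktop point into capture-image pixel coordinates.

    Returns None when the point lies outside the captured region, in which
    case no annotation should be drawn for it.
    """
    left, top, right, bottom = capture
    if not (left <= x < right and top <= y < bottom):
        return None
    return (x - left, y - top)


# Assumed bounds of the left-hand display (not confirmed by the thread).
capture_bounds = (-2560, 0, 0, 1440)

# ClickablePoint {x:-1280 y:700} from the second dump.
clickable = to_image_point(-1280, 700, capture_bounds)

# A point on the primary display falls outside this capture region.
off_region = to_image_point(100, 100, capture_bounds)
```

If element rectangles were compared against a region whose origin is assumed to be `(0, 0)`, elements on the left-hand display would be treated as off-screen, which is one possible cause of missing annotations worth ruling out.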

3 participants