macos-control

English · 中文

A lightweight macOS desktop automation toolkit — control the real mouse, keyboard, and screen from the shell. It is packaged as a Cursor Agent Skill (SKILL.md), but the scripts are plain Bash and work standalone in any terminal.

No MCP server, no Node, no Python. Just system tools (screencapture, osascript, sips) plus cliclick for precise mouse/keyboard events.

⚠️ Safety warning. These scripts move your real cursor, press real keys, and capture your screen. Run only code you understand. Anything driving the GUI can click the wrong thing if a window isn't focused — review each action. Avoid destructive operations and never let it type secrets.

What it does

📸 Screenshots (downscaled JPEG, cheap to read) and high-detail region "zoom"
🖱️ Mouse: move, single / double / right click at logical coordinates
⌨️ Keyboard: type text and press shortcuts (cmd c, return, arrows, …)
🪟 App control: activate apps, read frontmost app / window / mouse / screen size

It is designed to be driven by an AI agent in an observe → act → verify loop, but every script is usable by hand.

Requirements

macOS (Apple Silicon or Intel)
cliclick: brew install cliclick
Grant the controlling app (e.g. Cursor, Terminal, iTerm) these permissions in System Settings → Privacy & Security:
- Screen Recording (for screenshots)
- Accessibility (for clicks / keystrokes)
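Before running anything, it can help to confirm that the tools named above are on your PATH. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and it only checks that the commands exist. It cannot tell whether Screen Recording or Accessibility permission has been granted.

```shell
# Hypothetical helper (not shipped with macos-control).
# Prints the name of every requested tool that is not on PATH, one per line,
# and returns non-zero if at least one is missing.
missing_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      status=1
    fi
  done
  return "$status"
}

# Usage (tool names taken from this README):
#   missing_tools cliclick screencapture osascript sips \
#     || echo "install the tools listed above first"
```

If the permissions are missing, the scripts themselves will typically fail or macOS will prompt on first use, so this check only covers the installation side.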

Install

As a Cursor skill

git clone https://github.com/ZcwDev/macos-control.git
mkdir -p ~/.cursor/skills                            # create the skills directory if it does not exist
cp -R macos-control ~/.cursor/skills/macos-control   # SKILL.md is auto-discovered

Standalone

git clone https://github.com/ZcwDev/macos-control.git
cd macos-control && chmod +x scripts/*.sh

Scripts

| Script | Purpose |
| --- | --- |
| `context.sh` | Frontmost app, window title, mouse point, logical screen size |
| `screenshot.sh [path]` | Downscaled JPEG of the screen; prints path + `logical_size` |
| `zoom.sh X Y [W H]` | High-detail capture around a point (cursor shown) to verify a target |
| `move.sh X Y` | Move mouse, no click |
| `click.sh X Y [single\|double\|right]` | Click at logical points |
| `type.sh "text"` | Type literal ASCII (use the clipboard for non-ASCII) |
| `key.sh [cmd ctrl alt shift] KEY` | Keys / shortcuts (`return`, `cmd c`, `cmd shift 4`, …) |
| `app.sh "Name"` | Activate / focus an app |

Coordinates

Coordinates are logical points (what cliclick uses), not screenshot pixels. screenshot.sh prints logical_size=WxH. Because the image you view is downscaled to a variable size, convert a target by its fractional position in the image rather than by raw pixel offsets:

click_x = (target_x_in_image / image_width)  * logical_width
click_y = (target_y_in_image / image_height) * logical_height

For small/adjacent targets, refine with the move.sh → zoom.sh → click.sh loop.
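The two formulas above can be wrapped in a small shell function. This is a sketch under stated assumptions: `to_logical` is a hypothetical helper (not one of the repository scripts), and the image and logical sizes in the usage line are made-up example values. It rounds to the nearest whole point.

```shell
# Hypothetical helper (not shipped with macos-control).
# Convert a point measured in the downscaled screenshot to logical points.
# Arguments: target_x target_y image_width image_height logical_width logical_height
# Prints "click_x click_y", rounded to the nearest integer.
to_logical() {
  awk -v tx="$1" -v ty="$2" -v iw="$3" -v ih="$4" -v lw="$5" -v lh="$6" \
    'BEGIN { printf "%d %d\n", tx / iw * lw + 0.5, ty / ih * lh + 0.5 }'
}

# Usage with example values: a target at (500, 325) in a 1000x650 image,
# on a screen whose logical_size is 1470x956:
#   to_logical 500 325 1000 650 1470 956    # prints "735 478"
```

The result is only as accurate as the estimate of the target's position in the image, which is why the zoom loop is still worth using for small targets.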

Example

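scripts/app.sh Safari                 # bring Safari to the front
scripts/screenshot.sh                 # read it, estimate a target's fraction
scripts/click.sh 735 480              # click the target, in logical points
scripts/type.sh "hello"               # type into the focused field
scripts/key.sh return                 # press Return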
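When scripting the example rather than running it by hand, the logical size has to be read from the output of screenshot.sh. The sketch below assumes only what this README states, namely that the output contains a `logical_size=WxH` token; the exact layout of the rest of the output is not assumed. `parse_logical_size` is a hypothetical helper name.

```shell
# Hypothetical helper (not shipped with macos-control).
# Reads text on stdin and prints "W H" for the first line that contains
# a logical_size=WxH token. Prints nothing if no such token is found.
parse_logical_size() {
  sed -n 's/.*logical_size=\([0-9][0-9]*\)x\([0-9][0-9]*\).*/\1 \2/p' | head -n 1
}

# Usage:
#   size="$(scripts/screenshot.sh | parse_logical_size)"
#   logical_width="${size% *}"     # text before the space
#   logical_height="${size#* }"    # text after the space
```

The two values can then be fed into the fraction-based conversion described under Coordinates.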

Notes & limitations

Works best with native macOS apps. Electron-based apps (some chat and editor apps) often don't expose useful accessibility data, so locate targets visually with screenshots + the zoom loop.
Typing goes through the active input source. Switch to a plain Latin/ABC input source before typing ASCII, or use the clipboard for other text.
On multi-monitor setups the scripts capture and control the main display.
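The typing note above suggests the clipboard for anything that is not plain ASCII. One way to automate that choice is sketched here. Both function names are hypothetical; the sketch assumes it is run from the repository root, that the shell uses a UTF-8 locale, and that overwriting the current clipboard contents is acceptable. `pbcopy` is the standard macOS clipboard command.

```shell
# Hypothetical helpers (not shipped with macos-control).

# Return 0 when the text contains only printable ASCII (space through ~).
# Trailing newlines are ignored.
is_printable_ascii() {
  [ -z "$(printf '%s' "$1" | LC_ALL=C tr -d ' -~')" ]
}

# Type ASCII directly; send everything else through the clipboard.
send_text() {
  if is_printable_ascii "$1"; then
    scripts/type.sh "$1"
  else
    printf '%s' "$1" | pbcopy   # replaces the current clipboard contents
    scripts/key.sh cmd v        # paste into the focused field
  fi
}

# Usage:
#   send_text "hello"    # typed with type.sh
#   send_text "中文"      # pasted via the clipboard
```

Pasting avoids the input-source problem entirely, at the cost of clobbering whatever was on the clipboard.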

Credits & license

This project: MIT (see LICENSE).
Depends at runtime on cliclick by Carsten Blüm (BSD-3-Clause), installed separately — not bundled here.
