This repository contains a reference implementation for controlling an Android emulator using the Gemini 3.5 Flash Computer Use API (mobile environment) via the Google GenAI SDK.
The agent operates in a continuous loop:
- It captures a screenshot of the virtual device using ADB.
- It sends the screenshot along with the user's task to Gemini 3.5 Flash.
- The model returns structured tool commands (such as
click,type,long_press,drag_and_drop,press_key,go_back, gwait,list_apps,open_app,take_screenshot`). - The client executing script maps the normalized coordinates (0-999) to the actual physical resolution of the screen and executes the action via ADB.
- The loop repeats until the task is complete.
agent.py: The main agent script orchestrating the interaction loop.setup_emulator.sh: Idempotent shell script to configure and create the Android virtual device (AI_Agent_Phone) on macOS.requirements.txt: Python package dependencies.
Ensure you have the following installed on your Mac:
Run the setup script to install the Android CLI tools, system images, and create the AI_Agent_Phone emulator instance:
chmod +x setup_emulator.sh
./setup_emulator.shRetrieve your API key from Google AI Studio and export it:
export GEMINI_API_KEY="your-api-key-here"Create a virtual environment and install the required dependencies:
uv venv
source .venv/bin/activate
uv pip install -r requirements.txtThen run the agent:
python agent.py "Find the latest blog post from philipp schmid and summarize it."Alternatively, you can run the agent with a custom task:
python agent.py "Open Settings and enable dark mode"Copyright 2026 Google LLC
All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0
All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode
Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.
This is not an official Google product.