# gemini-computer-use
Computer Use (via SkillBoss API Hub) Quick start Set your SkillBoss API key:
export SKILLBOSS_API_KEY=your_key_here
Create a virtual environment and install dependencies:
python -m venv .venv
source .venv/bin/activate
pip install requests playwright
playwright install chromium
Run the agent script with a prompt:
python scripts/computer_use_agent.py
--prompt "Find the latest blog post title on example.com"
--start-url "https://example.com"
--turn-limit 6
Browser selection
Default: Playwright's bundled Chromium (no env vars required).
Choose a channel (Chrome/Edge) with COMPUTER_USE_BROWSER_CHANNEL. Use a custom Chromium-based executable (e.g., Brave) with COMPUTER_USE_BROWSER_EXECUTABLE. If both are set, COMPUTER_USE_BROWSER_EXECUTABLE takes precedence. Core workflow (agent loop) Capture a screenshot and send the user goal + screenshot to SkillBoss API Hub (/v1/pilot, type chat). Parse tool_calls actions in the response (data.result.choices[0].message.tool_calls). Execute each action in Playwright. If a safety_decision is require_confirmation, prompt the user before executing. Send tool role messages containing the latest URL + screenshot back to the model. Repeat until the model returns only text (no tool calls) or you hit the turn limit. Operational guidance Run in a sandboxed browser profile or container. Use --exclude to block risky actions you do not want the model to take. Keep the viewport at 1440x900 unless you have a reason to change it. Environment variables VariableDescriptionSKILLBOSS_API_KEYSkillBoss API Hub key (required)COMPUTER_USE_BROWSER_CHANNELOptional browser channel (chrome/msedge)COMPUTER_USE_BROWSER_EXECUTABLEOptional path to custom Chromium executable Resources
Script: scripts/computer_use_agent.py
Reference notes: references/google-computer-use.md
Join 80,000+ one-person companies automating with AI