Patents pending / now open source

Screenshots are
obsolete.

UAB gives AI agents structured spatial perception of any desktop app. Buttons, inputs, and menus become a 2D spatial grid of data, not pixels. 61 action types across 9+ frameworks. 17 native MCP tools. 5-tier cascade control. 50-100x faster than screenshots. 100-200x cheaper.

BSL 1.1 / Free for personal use. Runs locally. Never phones home. v1.2.0

Windows may show a SmartScreen warning. Click "More info" then "Run anyway". The installer is unsigned (open source). Verify the source on GitHub.

100% local. Nothing leaves your machine
Zero telemetry. No phone-home, no tracking
API key auth. Your apps, your control
Open source. BSL 1.1
61 Action Types
9+ Frameworks
17 MCP Tools
32 Element Types
155+ Tests
Excel — COM automation
ChatGPT — Electron
Blender — OpenGL keyboard
Chrome — DevTools Protocol
VS Code — Electron CDP
Slack — Electron CDP
FreeCAD — Qt deep query
Grok — Electron
Discord — Electron CDP
PowerPoint — COM
Word — COM
Teams — Electron CDP
Notepad — Win32
Any Windows app — UIA

Not automation.
Understanding.

UAB doesn't guess at coordinates or read pixels. It understands each app's interface the same way you do, and can act on any element by name.

New in v1.2
🔬
X-ray Vision

See every button, input, link, and menu in any application in a single query. Deep Query uses FindAll and RAWCrawler to scan the entire UI tree. ChatGPT exposes 123 elements. FreeCAD exposes 191. Click any of them by name.

POST /deep-query {"pid": 28968}
→ 123 elements: buttons, inputs, sidebar, model selector...
POST /invoke {"pid": 28968, "name": "Copy", "occurrence": "last"}
→ Clicks the last Copy button, returns clipboard text
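The `"occurrence": "last"` selector above disambiguates repeated element names. A minimal sketch of that selection logic, with an illustrative element shape (the field names are assumptions, not the exact UAB response schema):

```typescript
// Hypothetical shape of one element in a deep-query response.
interface UiElement {
  name: string;
  type: string; // e.g. "Button", "Edit", "Hyperlink"
}

// Pick one occurrence of an element by name, mirroring the
// {"occurrence": "last"} selector in the /invoke example above.
function pickOccurrence(
  elements: UiElement[],
  name: string,
  occurrence: number | "first" | "last" = "first",
): UiElement | undefined {
  const matches = elements.filter((e) => e.name === name);
  if (matches.length === 0) return undefined;
  if (occurrence === "first") return matches[0];
  if (occurrence === "last") return matches[matches.length - 1];
  return matches[occurrence]; // numeric index into the matches
}

const tree: UiElement[] = [
  { name: "Copy", type: "Button" },
  { name: "Send", type: "Button" },
  { name: "Copy", type: "Button" },
];

const lastCopy = pickOccurrence(tree, "Copy", "last"); // → the second Copy button
```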
New in v1.2
🧠
Recursive Learning

Every app interaction gets stored in the Flow Library. The first time is exploration. Every subsequent time is instant. The system gets smarter with every use.

GET /flow/grok
→ 13-step sequence: focus, tab, activate, paste, send...
// Agent follows it mechanically. Zero guesswork.
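The record-once, replay-forever pattern can be sketched like this. The class and step shape are illustrative, not UAB's internal storage format:

```typescript
// Illustrative flow store: a named step sequence is recorded on first
// discovery, then replayed mechanically on every later run.
type FlowStep = { action: string; target?: string; text?: string };

class FlowLibrary {
  private flows = new Map<string, FlowStep[]>();

  has(app: string): boolean {
    return this.flows.has(app);
  }

  record(app: string, steps: FlowStep[]): void {
    this.flows.set(app, steps);
  }

  // Returns the stored steps in order; a real agent would execute
  // each one against the target app instead of re-exploring.
  replay(app: string): FlowStep[] {
    const steps = this.flows.get(app);
    if (!steps) throw new Error(`no flow recorded for ${app}`);
    return steps;
  }
}

const library = new FlowLibrary();
library.record("grok", [
  { action: "focus" },
  { action: "key", text: "Tab" },
  { action: "activate", target: "Ask Grok" },
  { action: "paste", text: "Hello from UAB" },
  { action: "key", text: "Enter" },
]);
```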
🔒
Zero-Accessibility Control

Apps like Blender render entirely in OpenGL with zero accessibility. No other automation tool can touch them. UAB controls them through keyboard shortcuts, the same way a power user would. It created a 3D snowman in Blender. No plugin. No API. No accessibility tree.

🌉
VM Bridge

AI agents running in VMs (like Claude Co-work) can't reach localhost. UAB's Chrome extension bridges the gap. Agents talk through Chrome to reach the host machine. Zero network config.

New in v1.2
🔌
MCP Native Tools

17 desktop control tools discovered automatically by any MCP-compatible agent. No skill files, no curl commands. desktop_scan, desktop_smart_click, desktop_spatial_map, desktop_chain, desktop_flow, desktop_invoke, and more.

New in v1.2
Atomic Chains

Execute multi-step sequences (open menu, arrow down, select) in a single process session. No focus stealing between steps. Solves the menu timing problem that plagues every other automation tool.
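The key idea is that focus is acquired once for the whole chain, not per step. A minimal sketch under assumed names (the session and step types are illustrative, not the UAB API):

```typescript
// Sketch of an atomic chain: acquire the target process once, run
// every step against that single session, release at the end.
type ChainStep = { action: "menu" | "key" | "select"; value: string };

interface Session {
  pid: number;
  log: string[];
}

function runChain(pid: number, steps: ChainStep[]): Session {
  const session: Session = { pid, log: [] };
  session.log.push("focus-acquired"); // focus is taken exactly once
  for (const step of steps) {
    // No re-focus between steps, so no other window can steal it
    // mid-sequence and break menu timing.
    session.log.push(`${step.action}:${step.value}`);
  }
  session.log.push("focus-released");
  return session;
}

const result = runChain(28968, [
  { action: "menu", value: "File" },
  { action: "key", value: "ArrowDown" },
  { action: "select", value: "Save As" },
]);
```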

🤝
AI-to-AI Communication. Proven.

Claude typed messages into ChatGPT's desktop app, read the responses, and carried a multi-turn conversation. Then did the same with Grok. Two rival AIs talking through pure desktop automation. No API integration. No webhooks. Just UAB.

🗺️
Spatial Map Engine

Converts flat UI Automation elements into a 2D rows/columns grid with spatial indexing. Agents understand where elements are relative to each other without screenshots. "What's to the right of the Save button?" becomes a data query, not a vision task.

POST /spatial-map {"pid": 28968}
→ Grid: 8 rows × 4 columns, 47 elements mapped
→ Row 3: [File] [Edit] [View] [Help]
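The row/column conversion can be sketched as y-coordinate clustering followed by an x-sort, after which "what's to the right of Save?" is an array lookup. The tolerance value and shapes are illustrative assumptions:

```typescript
// Illustrative spatial-map sketch: cluster elements into rows by
// y-coordinate, sort each row by x, then answer relational queries
// ("what is to the right of X?") as plain data lookups.
interface Placed {
  name: string;
  x: number;
  y: number;
}

function buildRows(elements: Placed[], rowTolerance = 10): Placed[][] {
  const rows: Placed[][] = [];
  const sorted = [...elements].sort((a, b) => a.y - b.y || a.x - b.x);
  for (const el of sorted) {
    // Join an existing row if this element is vertically close to it.
    const row = rows.find((r) => Math.abs(r[0].y - el.y) <= rowTolerance);
    if (row) row.push(el);
    else rows.push([el]);
  }
  rows.forEach((r) => r.sort((a, b) => a.x - b.x)); // left-to-right order
  return rows;
}

function rightOf(rows: Placed[][], name: string): string | undefined {
  for (const row of rows) {
    const i = row.findIndex((e) => e.name === name);
    if (i >= 0 && i + 1 < row.length) return row[i + 1].name;
  }
  return undefined;
}

const rows = buildRows([
  { name: "File", x: 0, y: 0 },
  { name: "Edit", x: 40, y: 2 },
  { name: "Save", x: 10, y: 50 },
  { name: "Cancel", x: 80, y: 52 },
]);
```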
⚙️
5-Tier Cascade Control

Every app gets the best control method automatically. If one tier fails, it falls to the next transparently. 61 action types across click, text, toggle, keyboard, window, Office, and browser categories. 32 normalized UI element types across all frameworks.

P1: Chrome Extension Bridge (zero disruption)
P2: Browser CDP (remote debugging)
P3: Framework hooks (Electron, Office, Qt)
P4: Windows UI Automation (universal)
P5: Vision fallback (LLM Vision API)
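The transparent fall-through can be sketched as a priority-ordered list of tier handlers, where the first tier that succeeds wins. The tier names follow the list above; the handler signature is an assumption for illustration:

```typescript
// Sketch of tiered fallback: try each control tier in priority order
// and return the first that succeeds.
type Tier = { name: string; tryControl: () => boolean };

function cascade(tiers: Tier[]): string {
  for (const tier of tiers) {
    if (tier.tryControl()) return tier.name; // first success wins
  }
  throw new Error("all tiers failed");
}

const chosen = cascade([
  { name: "chrome-extension", tryControl: () => false }, // no extension attached
  { name: "cdp", tryControl: () => false },              // remote debugging off
  { name: "framework-hooks", tryControl: () => false },  // not an Electron/Office/Qt app
  { name: "uia", tryControl: () => true },               // universal tier succeeds
  { name: "vision", tryControl: () => true },            // never reached here
]);
```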
🖥️
Three Access Modes

Import as a TypeScript library. Use 20+ CLI commands with JSON output. Or run as an HTTP server on port 3100. Same 61 action types, three interfaces. Pick what fits your workflow.

🏗️
Multi-Mode Architecture

Desktop mode (Session 1+, full UIA/CDP, 100 actions/min). Server mode (Session 0, Session Bridge via Task Scheduler, 60/min). Container mode (minimal, 30/min). Works transparently in SSH and Windows Service contexts.

🛡️
Production Hardening

Three-tier smart caching with auto-invalidation on mutating actions. Exponential backoff retry with jitter. Rate limiting per PID. 30-second health checks with auto-reconnect. 1000-entry audit log. 155+ tests across the full surface.
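Exponential backoff with jitter is the standard way to retry without synchronized thundering herds. A minimal sketch of the "full jitter" variant; the constants are illustrative, not UAB's actual tuning:

```typescript
// Sketch of exponential backoff with full jitter: the delay ceiling
// grows as base * 2^attempt up to a cap, and the actual delay is a
// uniform random fraction of that ceiling.
function backoffDelay(
  attempt: number,
  baseMs = 100,
  capMs = 5_000,
  random: () => number = Math.random, // injectable for testing
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * ceiling; // uniform in [0, ceiling)
}
```

Injecting the random source keeps the function deterministic under test while production code uses `Math.random`.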

🔍
Smart Invoke

Find and activate any UI element by name using a 6-method activation cascade. No coordinates, no element IDs, no fragile selectors. Just say "click Copy" and UAB finds and triggers it.

Install once.
Every agent benefits.

One installer. Zero configuration. Works with any MCP-compatible agent, any REST client, or any TypeScript project. Three access modes: Library, CLI, HTTP Server.

01
Download and install

Run the installer. UAB starts as a background service, installs the Chrome extension, and writes skill files for every Claude product. Starts on boot. No terminal required.

uab-bridge install
02
Open any AI agent

Claude Co-work, Claude Code, Claude Desktop. The installer deploys skills to all of them automatically. UAB is just there. No setup. No configuration.

03
Tell your agent to do things

"Open Excel and build a pivot table." "Type a message in ChatGPT." "Create a 3D model in Blender." Your agent does it natively.

Controls apps that
nothing else can.

UAB detects each app's framework and selects the best control method automatically. Including OpenGL apps with zero accessibility that no other tool can touch.

📊
Excel
COM + UIA
💬
ChatGPT
Electron + Deep Query
🤖
Grok
Electron + Flow Library
🧊
Blender
OpenGL — Keyboard Native
🔧
FreeCAD
Qt — 191 elements
📝
VS Code
Electron CDP
💬
Slack
Electron CDP
🌐
Chrome
DevTools Protocol
📄
Word
COM + UIA
🗒
Obsidian
Electron CDP
🎮
Discord
Electron CDP
📋
Notion
Electron CDP
🎞
PowerPoint
COM + UIA
📹
Teams
Electron CDP
📺
VLC
Qt + UIA
⚙️
Any app
UI Automation fallback

Not a workaround.
An architecture.

Every other approach approximates desktop control. UAB talks directly to the application's own interface layer.

UAB Bridge
  • Structured spatial maps replace screenshots (50-100x faster)
  • 17 native MCP tools. No skill files needed
  • Learns and remembers app interaction patterns
  • Controls OpenGL apps with zero accessibility
  • Localhost only. Nothing leaves your machine.
  • Works with any agent: Claude, Cursor, custom
  • No VM. No startup delay. No 10GB image.
Screenshot-based approaches
  • Reads pixels and guesses what they mean
  • Clicks coordinates. Fragile to UI changes
  • Starts from zero every session
  • Cannot control OpenGL or custom-rendered UIs
  • Cloud execution. Your screen data leaves the machine
  • Locked to one agent platform
  • 10GB+ image, 30 second startup, high overhead

UAB writes skill files automatically. Any agent that can read markdown or call a REST endpoint gets native desktop control.

Claude Co-work
Claude Code CLI
Claude Code Desktop
Cursor
Windsurf
Any MCP agent
Custom agents

Your machine. Your data. Period.

UAB runs 100% locally. Zero telemetry. Zero phone-home. Zero cloud. Your desktop activity never leaves your machine. The server binds to localhost with API key authentication. Open source so you can verify every line.

Request support
for any application.

We're building the Flow Library with curated interaction sequences for every app. Tell us what you need and we'll add it.

Ready to give your agents real vision?

One install. Works immediately. Free for personal use.

Free for individuals.
Licensed for business.

UAB Bridge uses the Business Source License 1.1, the same license trusted by MariaDB, CockroachDB, and Sentry. Here's what that means for you:

Free / No License Needed
  • Personal use on your own machines
  • Academic and research use
  • Evaluation and prototyping
  • Open source projects (OSI-approved)
  • Up to 25 users / devices
  • Lancelot ecosystem integration
📋
Commercial License Required
  • Embedding in commercial products
  • Offering UAB as a service (SaaS)
  • Deployment to more than 25 users / devices
  • Enterprise internal automation
  • Building competing products
🔓
Converts to Apache 2.0

Four years after each version release, the license automatically converts to Apache 2.0, fully permissive, no restrictions. BSL is a time-limited protection, not a permanent lock.

v1.2.0 released March 2026 → Apache 2.0 in March 2030

Not sure if your use case requires a license?
Reach out. We're happy to clarify.

Part of the Lancelot Ecosystem

UAB Bridge is built by Lancelot Governance Systems, the governed autonomy platform for AI agents. UAB provides the hands. Lancelot provides the governance, memory, and orchestration.

Learn about Lancelot →