Patents pending / source-available

Screenshots are
obsolete.

UAB gives AI agents structured spatial perception of any desktop app. Buttons, inputs, and menus are organized into a 2D spatial grid as data, not pixels. 62 action types across 9+ frameworks. 17 native MCP tools. 6-tier cascade control. 50-100x faster than screenshots. 100-200x cheaper. We call it the concerto: every micro-operation uses the most efficient method.

BSL 1.1 / Free for personal use. Runs locally. Never phones home. v1.3.0

Windows may show a SmartScreen warning. Click "More info" then "Run anyway". The installer is unsigned. Verify the source on GitHub.

100% local. Nothing leaves your machine
Zero telemetry. No phone-home, no tracking
API key auth. Your apps, your control
Source-available. BSL 1.1
62 Action Types
9+ Frameworks
17 MCP Tools
32 Element Types
20 Tests Passing
Excel — COM automation
ChatGPT — Electron
Blender — OpenGL concerto
Chrome — DevTools Protocol
VS Code — Electron CDP
Slack — Electron CDP
FreeCAD — Qt deep query
Grok — Electron
Discord — Electron CDP
PowerPoint — COM
Word — COM
Teams — Electron CDP
Notepad — Win32
Any Windows app — UIA

Not automation.
Understanding.

UAB doesn't guess at coordinates or read pixels. It understands each app's interface the same way you do, and can act on any element by name.

New in v1.2
🔬
X-ray Vision

See every button, input, link, and menu in any application in a single query. Deep Query uses FindAll and RAWCrawler to scan the entire UI tree. ChatGPT exposes 123 elements. FreeCAD exposes 191. Click any of them by name.

POST /deep-query {"pid": 28968}
→ 123 elements: buttons, inputs, sidebar, model selector...
POST /invoke {"pid": 28968, "name": "Copy", "occurrence": "last"}
→ Clicks the last Copy button, returns clipboard text
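For agents calling the HTTP server directly, the two calls above could be built roughly like this. This is a minimal sketch, not UAB's client library: the endpoint paths, port 3100, and the `pid`, `name`, and `occurrence` fields come from this page, while the `X-API-Key` header name, the `"first"` occurrence value, and every function name are assumptions made for illustration.

```typescript
// Hypothetical request builders for the /deep-query and /invoke calls shown above.
// Nothing here sends a request; it only assembles what would be sent.
interface UabRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Port 3100 is from the page; the exact base URL form is an assumption.
const BASE_URL = "http://localhost:3100";

function buildRequest(path: string, payload: Record<string, unknown>, apiKey: string): UabRequest {
  return {
    url: `${BASE_URL}${path}`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": apiKey, // assumed header name for API key auth
    },
    body: JSON.stringify(payload),
  };
}

// Ask for every element of the app running under the given process ID.
function deepQuery(pid: number, apiKey: string): UabRequest {
  return buildRequest("/deep-query", { pid }, apiKey);
}

// Activate an element by name. Only "last" appears on this page; "first" is assumed.
function invoke(pid: number, name: string, occurrence: "first" | "last", apiKey: string): UabRequest {
  return buildRequest("/invoke", { pid, name, occurrence }, apiKey);
}
```

A caller would pass the result to its HTTP client of choice, for example `fetch(req.url, req)`. The response shape is not documented here, so the sketch stops at the request.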
New in v1.2
🧠
Recursive Learning

Every app interaction gets stored in the Flow Library. The first time is exploration. Every subsequent time is instant. The system gets smarter with every use.

GET /flow/grok
→ 13-step sequence: focus, tab, activate, paste, send...
// Agent follows it mechanically. Zero guesswork.
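The explore-once, replay-after idea can be pictured as a small cache keyed by app. This is a sketch under stated assumptions, not how the Flow Library is implemented: the class, method, and type names are hypothetical, and a real library would persist flows rather than hold them in memory.

```typescript
// Hypothetical in-memory stand-in for the Flow Library.
type FlowStep = string;

class FlowLibrary {
  private flows = new Map<string, FlowStep[]>();

  // Returns the stored sequence if one exists. Otherwise runs `explore` once,
  // stores what it found, and reports that this call was not served from the cache.
  getOrExplore(app: string, explore: () => FlowStep[]): { steps: FlowStep[]; cached: boolean } {
    const known = this.flows.get(app);
    if (known !== undefined) {
      return { steps: known, cached: true };
    }
    const steps = explore();
    this.flows.set(app, steps);
    return { steps, cached: false };
  }
}
```

The first call for an app pays the exploration cost. Every later call returns the stored steps without running the exploration callback again.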
🔒
Zero-Accessibility Control

Apps like Blender render entirely in OpenGL with zero accessibility tree. No other automation tool can touch them. UAB controls them through the concerto — keyboard shortcuts for commands, OS input injection for sculpt brush strokes, scroll for zooming, screenshots for verification. First AI agent to sculpt 3D geometry. No plugin. No API. No accessibility tree. Full spatial control.

🌉
VM Bridge

AI agents running in VMs (like Claude Co-work) can't reach localhost. UAB's Chrome extension bridges the gap. Agents talk through Chrome to reach the host machine. Zero network config.

New in v1.2
🔌
MCP Native Tools

17 desktop control tools discovered automatically by any MCP-compatible agent. No skill files, no curl commands. desktop_scan, desktop_smart_click, desktop_spatial_map, desktop_chain, desktop_flow, desktop_invoke, and more.

New in v1.2
Atomic Chains

Execute multi-step sequences (open menu, arrow down, select) in a single process session. No focus stealing between steps. Solves the menu timing problem that plagues every other automation tool.
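As a rough picture of what "atomic" means here, the sketch below runs a list of steps back to back through one callback and stops at the first failure. It assumes a step is an action plus a target; the type names, action names, and the `perform` callback are all hypothetical and stand in for whatever actually drives the app.

```typescript
// Hypothetical chain runner: all steps go through one `perform` callback,
// standing in for a single process session that keeps focus between steps.
interface ChainStep {
  action: string; // illustrative, e.g. "click" or "key"
  target: string; // illustrative, e.g. "File" or "ArrowDown"
}

interface ChainResult {
  ok: boolean;
  completed: number; // how many steps succeeded before the chain ended
  log: string[];
}

function runChain(steps: ChainStep[], perform: (step: ChainStep) => boolean): ChainResult {
  const log: string[] = [];
  let completed = 0;
  for (const step of steps) {
    if (!perform(step)) {
      // Stop immediately so later steps never act on a half-open menu.
      log.push(`failed: ${step.action} ${step.target}`);
      return { ok: false, completed, log };
    }
    log.push(`ok: ${step.action} ${step.target}`);
    completed += 1;
  }
  return { ok: true, completed, log };
}
```

Stopping at the first failed step is a design choice in this sketch, not a documented UAB behavior.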

🤝
AI-to-AI Communication. Proven.

Claude typed messages into ChatGPT's desktop app, read the responses, and carried on a multi-turn conversation. Then it did the same with Grok. Two rival AIs talking through pure desktop automation. No API integration. No webhooks. Just UAB.

🗺️
Spatial Map Engine

Converts flat UI Automation elements into a 2D rows/columns grid with spatial indexing. Agents understand where elements are relative to each other without screenshots. "What's to the right of the Save button?" becomes a data query, not a vision task.

POST /spatial-map {"pid": 28968}
→ Grid: 8 rows × 4 columns, 47 elements mapped
→ Row 3: [File] [Edit] [View] [Help]
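To make the grid idea concrete, here is one simple way flat elements with bounding boxes could be grouped into rows and queried. It is a sketch, assuming each element carries a name and a rectangle; the row-tolerance heuristic, the field names, and both function names are illustrative and not taken from UAB.

```typescript
// Hypothetical element shape: a name plus a bounding rectangle in pixels.
interface UiElement {
  name: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

type Grid = UiElement[][];

function centerY(el: UiElement): number {
  return el.y + el.height / 2;
}

// Group elements into rows: an element joins the current row when its vertical
// center is within `rowTolerance` pixels of the row's first element.
function buildGrid(elements: UiElement[], rowTolerance = 10): Grid {
  const sorted = elements.slice().sort((a, b) => centerY(a) - centerY(b));
  const rows: Grid = [];
  let rowY = Number.NEGATIVE_INFINITY;
  for (const el of sorted) {
    const cy = centerY(el);
    const current = rows[rows.length - 1];
    if (current !== undefined && cy - rowY <= rowTolerance) {
      current.push(el);
    } else {
      rows.push([el]);
      rowY = cy;
    }
  }
  // Within each row, order left to right so column index follows x position.
  for (const row of rows) {
    row.sort((a, b) => a.x - b.x);
  }
  return rows;
}

// "What's to the right of X?" becomes an index lookup in X's row.
function rightOf(grid: Grid, name: string): string | undefined {
  for (const row of grid) {
    const i = row.findIndex((el) => el.name === name);
    if (i !== -1) {
      const next = row[i + 1];
      return next === undefined ? undefined : next.name;
    }
  }
  return undefined;
}
```

With a grid like this, a relative-position question needs no image at all, only a lookup in the row that contains the named element.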
⚙️
6-Tier Cascade — The Concerto

Never one method per app. For each micro-operation, UAB picks the most efficient method — optimizing for speed, outcome, control, and cost. Keyboard for commands. UI Automation for element lookup. Drag injection for sculpting and painting. Screenshots for verification. A single task blends them all. That's the concerto.

P1: Chrome Extension Bridge (zero disruption)
P2: Browser CDP (remote debugging)
P3: Framework hooks (Electron, Office, Qt)
P4: Windows UI Automation (universal)
P5: Keyboard native (fastest for commands)
P6: OS raw input injection (drag, scroll, gestures)
+ Vision: screenshot verification loop
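One plausible reading of the tier list is a priority-ordered fallback: try each method in order and use the first one that works for the operation at hand. The sketch below shows only that reading. How UAB actually weighs speed, outcome, control, and cost is not described on this page, and the type and function names are hypothetical.

```typescript
// Hypothetical tier: a priority (1 is tried first), a label, and an attempt
// function that reports whether it managed to perform the operation.
interface Tier {
  priority: number;
  name: string;
  attempt: (operation: string) => boolean;
}

// Try tiers from lowest priority number to highest and return the name of the
// first one that succeeds, or undefined when every tier fails.
function runCascade(tiers: Tier[], operation: string): string | undefined {
  const ordered = tiers.slice().sort((a, b) => a.priority - b.priority);
  for (const tier of ordered) {
    if (tier.attempt(operation)) {
      return tier.name;
    }
  }
  return undefined;
}
```

Because the choice is made per operation, a single task can end up using several tiers, which matches the blended behavior described above.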
🖥️
Three Access Modes

Import as a TypeScript library. Use 20+ CLI commands with JSON output. Or run as an HTTP server on port 3100. Same 62 action types, three interfaces. Pick what fits your workflow.

🏗️
Dual-Mode Architecture

Desktop mode runs in Session 1+ with full UIA/CDP access at 100 actions/min. Server mode runs in Session 0 through a Session Bridge via Task Scheduler at 60 actions/min. Container mode is minimal at 30 actions/min. Works transparently in SSH and Windows Service contexts.

🛡️
Production Hardening

Three-tier smart caching with auto-invalidation on mutating actions. Exponential backoff retry with jitter. Rate limiting per PID. 30-second health checks with auto-reconnect. 1000-entry audit log. 20 tests passing across the standalone UAB surface.
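For readers unfamiliar with the retry strategy named above, this is the general shape of exponential backoff with jitter. It is a generic sketch of the technique (the "full jitter" variant), not UAB's code: the base delay, the cap, and the function names are assumptions.

```typescript
// Delay before retry number `attempt` (0-based). The ceiling doubles with each
// attempt up to `capMs`, and the actual delay is drawn uniformly from
// [0, ceiling) so that many clients do not retry in lockstep.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 100, // assumed base delay
  capMs = 5000, // assumed upper bound
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Delays for a whole run of retries, useful for logging or dry runs.
function retrySchedule(maxRetries: number, random: () => number = Math.random): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(backoffDelayMs(attempt, 100, 5000, random));
  }
  return delays;
}
```

The cap keeps a long failure streak from producing unbounded waits, and the jitter spreads retries out when several operations fail at the same moment.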

🔍
Smart Invoke

Find and activate any UI element by name using a 6-method activation cascade. No coordinates, no element IDs, no fragile selectors. Just say "click Copy" and UAB finds and triggers it.
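The name-based lookup step can be sketched as a filter over scanned elements. This assumes elements carry a display name and that matching ignores case and surrounding whitespace, which is a guess for illustration. The 6-method activation cascade itself is not shown, and all names below are hypothetical.

```typescript
// Hypothetical element record returned by a scan.
interface NamedElement {
  id: number;
  name: string;
}

// Resolve a spoken name such as "Copy" to one element. When several elements
// share the name, `occurrence` picks the first or the last match.
function findByName(
  elements: NamedElement[],
  name: string,
  occurrence: "first" | "last" = "first",
): NamedElement | undefined {
  const wanted = name.trim().toLowerCase();
  const matches = elements.filter((el) => el.name.trim().toLowerCase() === wanted);
  if (matches.length === 0) {
    return undefined;
  }
  return occurrence === "last" ? matches[matches.length - 1] : matches[0];
}
```

In a chat app with one Copy button per message, asking for the last occurrence targets the most recent reply, which is the case the `/invoke` example with `"occurrence": "last"` illustrates.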

Install once.
Every agent benefits.

One installer. Zero configuration. Works with any MCP-compatible agent, any REST client, or any TypeScript project. Three access modes: Library, CLI, HTTP Server.

01
Download and install

Run the installer. UAB starts as a background service, installs the Chrome extension, and writes skill files for supported agents. Starts on boot. No terminal required.

uab-bridge install
02
Open any AI agent

Desktop agents, CLI agents, MCP clients. The installer deploys skills to supported tools automatically. UAB is just there. No setup. No configuration.

03
Tell your agent to do things

"Open Excel and build a pivot table." "Type a message in ChatGPT." "Create a 3D model in Blender." Your agent does it natively.

Controls apps that
nothing else can.

UAB detects each app's framework and selects the best control method automatically. Including OpenGL apps with zero accessibility that no other tool can touch.

📊
Excel
COM + UIA
💬
ChatGPT
Electron + Deep Query
🤖
Grok
Electron + Flow Library
🧊
Blender
OpenGL — Concerto (KB + Drag + Vision)
🔧
FreeCAD
Qt — 191 elements
📝
VS Code
Electron CDP
💬
Slack
Electron CDP
🌐
Chrome
DevTools Protocol
📄
Word
COM + UIA
🗒
Obsidian
Electron CDP
🎮
Discord
Electron CDP
📋
Notion
Electron CDP
🎞
PowerPoint
COM + UIA
📹
Teams
Electron CDP
📺
VLC
Qt + UIA
⚙️
Any app
UI Automation fallback

Not a workaround.
An architecture.

Every other approach approximates desktop control. UAB talks directly to the application's own interface layer.

UAB Bridge
  • Structured spatial maps replace screenshots (50-100x faster)
  • 17 native MCP tools. No skill files needed
  • Learns and remembers app interaction patterns
  • Controls OpenGL apps — sculpting, painting, full spatial gestures
  • Localhost only. Nothing leaves your machine.
  • Works with any agent: Claude, Cursor, custom
  • No VM. No startup delay. No 10GB image.
Screenshot-based approaches
  • Reads pixels and guesses what they mean
  • Clicks coordinates. Fragile to UI changes
  • Starts from zero every session
  • Cannot control OpenGL apps or do spatial gestures (drag, sculpt)
  • Cloud execution. Your screen data leaves the machine
  • Locked to one agent platform
  • 10GB+ image, 30 second startup, high overhead

UAB writes skill files automatically. Any agent that can read markdown or call a REST endpoint gets native desktop control.

Claude Co-work
CLI agents
Desktop agents
Cursor
Windsurf
Any MCP agent
Custom agents

Your machine. Your data. Period.

UAB runs 100% locally. Zero telemetry. Zero phone-home. Zero cloud. Your desktop activity never leaves your machine. The server binds to localhost with API key authentication. Source-available so you can verify every line.

Request support
for any application.

We're building the Flow Library with curated interaction sequences for every app. Tell us what you need and we'll add it.

Ready to give your agents real vision?

One install. Works immediately. Free for personal use.

Free for individuals.
Licensed for business.

UAB Bridge uses the Business Source License 1.1, the license created by MariaDB and adopted by projects such as CockroachDB and Sentry. Here's what that means for you:

Free / No License Needed
  • Personal use on your own machines
  • Academic and research use
  • Evaluation and prototyping
  • Non-production public projects
  • Up to 25 users / devices
  • Lancelot ecosystem integration
📋
Commercial License Required
  • Embedding in commercial products
  • Offering UAB as a service (SaaS)
  • Deployment to more than 25 users / devices
  • Enterprise internal automation
  • Building competing products
🔓
Converts to Apache 2.0

Four years after each version's release, the license automatically converts to Apache 2.0, a permissive open-source license. BSL is a time-limited protection, not a permanent lock.

v1.2.0 released March 2026 → Apache 2.0 in March 2030

Not sure if your use case requires a license?
Reach out. We're happy to clarify.

Part of the Lancelot Ecosystem

UAB Bridge is built by Lancelot Governance Systems, the governed autonomy platform for AI agents. UAB provides the hands. Lancelot provides the governance, memory, and orchestration.

Learn about Lancelot →