hacker news

I kept running into the same annoying friction with Claude Desktop and Cursor: I’d snip a Python traceback, the AI would tell me to "run pip install pandas," and then I’d sit there and type it out myself. If the AI clearly knows the fix, why am I doing the typing?

So I built OmniGlass.

The UX is simple: You draw a box on your screen, local OCR extracts the text, and an LLM classifies what you're looking at. But instead of generating a chat response, it gives you an action menu.

The core difference from Claude Desktop isn't the AI—it’s what happens after the AI thinks. Claude reads your screen and writes you a paragraph. OmniGlass reads your screen and runs the command.

What it does today:

Snip a traceback → Generates the fix command, you confirm, it runs.

Snip a data table → Opens a native save dialog and spits out a clean CSV.

Snip a Slack bug report → Drafts a GitHub issue with all the context filled in.

Menu bar input → Type plain English, and it triggers the appropriate command.

The security elephant in the room (Why I built this): Nobody is really talking about the security risks of MCP plugins yet. Claude Desktop runs them with your full user permissions. A rogue plugin—or a clever prompt injection—can just read your SSH keys, scrape your .env files, and ship them off.

To fix this, OmniGlass sandboxes every plugin at the macOS kernel level using sandbox-exec. Your /Users/ directory is completely walled off. Environment variables are aggressively filtered. Shell commands strictly require your manual confirmation before executing. I wanted to be able to run community plugins without sweating about what they can access.

The Stack:

Frontend/Backend: Tauri (Rust + TypeScript)

Vision: Apple Vision OCR (local)

Plugin System: MCP over stdio

Models: Works with Claude Haiku, Gemini Flash, or fully local via llama.cpp using Qwen-2.5 (takes ~6s end-to-end, nothing leaves your machine).

Current Status: I just shipped our second working plugin (a Slack Webhook) to run alongside the GitHub Issues plugin. It's two real-world plugins proving the architecture actually works, not just a boilerplate template and a promise. Both are under 250 lines of code.

Where I'd love your help:

Break the sandbox. Seriously. If you can figure out a way to read ~/.ssh/id_rsa from a plugin, that is a critical bug and I want to know about it.

Build a plugin. There are 8 open issues in the repo right now with full MCP schemas, manifests, and implementation hints. Most take less than 100 lines.

Port to Windows/Linux. The Windows build compiles in CI but hasn't been tested on real metal. Linux needs Tesseract + Bubblewrap to replace the Apple OCR and sandbox.

Requires macOS 12+ right now. Fully open source (MIT).

Would love to hear your thoughts or answer any questions about the sandboxing setup!

Show HN: OmniGlass – Executable AI screen snips with kernel-level sandboxing