What's the difference between Skilly and Gemini Live?

Gemini Live is Google's voice + screen-sharing assistant inside the Gemini app and Chrome browser. It runs in a browser tab or the Gemini iOS/Android app and uses Google's Gemini multimodal model. Skilly is a native macOS menu-bar app — no browser tab, no Google account required — that watches your active Mac app, speaks back in your language, and physically moves your cursor to the exact UI element. Skilly uses the OpenAI Realtime API and runs natively in Swift on macOS 14+. Gemini Live is free with a Google account; Skilly is $19/month after a 15-minute free trial. Skilly is open source under Apache-2.0 (github.com/tryskilly/skilly); Gemini Live is closed source.

Is Skilly better than Gemini Live for Mac users?

For learning unfamiliar Mac apps (Blender, Figma, Xcode, Photoshop, etc.), Skilly is built specifically for that job and Gemini Live is not. Gemini Live can see what's on your screen if you screen-share, but it doesn't move your cursor, doesn't ship with per-app skill curricula, and runs inside Google's UI rather than as a system-wide menu-bar tool. For general Q&A or browser-based tasks, Gemini Live is fine and free. For active tutoring inside a specific desktop app, Skilly is the better fit.

Honest comparison

Skilly vs Gemini Live

Google's Gemini Live is a genuinely impressive multimodal assistant running in the browser. Skilly wraps a similar loop into a native Mac app with a cursor that actually moves. If you are learning a Mac app, that last part is the difference.

Pick Skilly if

You are inside a Mac app and need help without switching windows to a browser. You want the cursor to physically move to the button you need to click, rather than only hearing instructions read out.

Pick Gemini Live if

You want a general-purpose multimodal AI you can use on any device (Android, iOS, web). You are comfortable working in a browser tab and do not need a Mac-native UI.

Feature by feature

Feature | Skilly | Gemini Live
Where it runs | Native macOS menu-bar app, always one shortcut away | Browser tab in Google AI Studio (aistudio.google.com)
Trigger | Voice-gated: push-to-talk or Live Tutor mode. Nothing captured between voice events. | Start a session; screen and mic are captured continuously during the session
Cursor guidance | Cursor physically moves to the exact UI element on your screen | Text and voice answers only, no cursor control
Skills / curriculum | Per-app Markdown curriculum (Blender, Figma, Xcode, Photoshop…) that stays current independent of the model | General-purpose, no per-app context preloaded
Model | Single-call OpenAI Realtime API (voice-to-voice) | Google Gemini 2.x (Flash / Pro depending on tier)
Tab-switching | Stays out of the way in the menu bar, so you stay in the app you are learning | You are in the browser, not the app, so you switch windows to act on answers
Language support | 20+ languages, auto-detected | 40+ languages (broader reach)
Privacy posture | Capture gated by voice activity. Screen Recording permission revocable anytime. | Screen + audio streamed to Google during the session. Bound by Google Workspace terms.
Offline | No: requires OpenAI API access, though the app works without a persistent browser tab | No: requires a browser session and a Google account
Open source | Yes: Apache-2.0 (fork of farzaa/clicky, MIT) | No: Google proprietary
Pricing | 15 minutes free, then $19/month for 3 hours of tutoring | Free tier with rate limits, paid tiers via Google AI Studio / API
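
To put Skilly's pricing in per-hour terms, here is a minimal sketch of the arithmetic. It uses only the figures quoted on this page ($19 per month for 3 hours of tutoring); the function and constant names are illustrative, and it assumes the full 3 hours are used, since the page does not say what happens beyond that allowance.

```python
# Figures quoted on this page; all names here are illustrative.
MONTHLY_PRICE_USD = 19.0
INCLUDED_HOURS = 3.0


def cost_per_hour(price_usd: float, hours: float) -> float:
    """Effective hourly cost if the whole monthly allowance is used."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return price_usd / hours


def cost_per_minute(price_usd: float, hours: float) -> float:
    """Effective per-minute cost under the same full-usage assumption."""
    return cost_per_hour(price_usd, hours) / 60


hourly = cost_per_hour(MONTHLY_PRICE_USD, INCLUDED_HOURS)
per_minute = cost_per_minute(MONTHLY_PRICE_USD, INCLUDED_HOURS)
print(f"${hourly:.2f} per hour, ${per_minute:.3f} per minute")
# prints "$6.33 per hour, $0.106 per minute"
```

Used in full, the plan works out to roughly $6.33 per hour of tutoring; lighter use raises the effective rate.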

Common questions

Is Gemini Live free and Skilly paid?

Yes. Gemini Live is free with any Google account. Skilly is $19 per month after a 15-minute free trial (no credit card to start). Skilly is also open source under Apache-2.0, so you can run it without a subscription using your own OpenAI API key (BYOK option, in beta), although OpenAI then bills you for API usage. The hosted convenience tier is paid.

Why pay for Skilly when Gemini Live is free?

Three reasons users pay: (1) Skilly physically moves your cursor to the exact UI element you need, while Gemini Live only talks. (2) Skilly is a system-wide menu-bar app: no Chrome tab, no Google account, and it works in any Mac app, including offline-first tools. (3) Skilly ships with 5 free skill curricula (Blender, Figma, After Effects, DaVinci Resolve, Premiere Pro) for guided learning. If you don't need any of those, Gemini Live is genuinely fine.

Can Gemini Live see my Mac apps the way Skilly does?

Yes, if you screen-share to it via the browser, but the experience is different. Skilly uses macOS ScreenCaptureKit natively: instant capture, multi-monitor aware, no browser permission dance. Gemini Live works through Chrome's screen-sharing API, which is fine for casual use but adds friction every session and lacks the cursor-moving guidance Skilly provides.

Does Skilly use Gemini under the hood?

No. Skilly uses the OpenAI Realtime API for voice + vision in a single round-trip call (sub-second latency). Gemini Live uses Google's Gemini multimodal model. Different model providers, different latency profiles, different language coverage. Skilly's source code is open at github.com/tryskilly/skilly if you want to verify.

Want a Mac-native tutor?

15 minutes free, no card. Apache-2.0 open source, fork of Farza's Clicky (MIT).

Download Skilly