Don't type, just speak

The open-source voice-to-text app that turns speech into clean, polished text in every macOS app. Fully offline. No cloud. No subscriptions.

macOS 14+ · Apple Silicon · Apache 2.0

🎙 Capture → 📝 Transcribe → ✨ Polish → 📋 Paste

VS Code · Slack · Notion · Terminal · Safari · Xcode

Why VibeFlow

Voice that respects your privacy

Unlike cloud-based dictation tools, VibeFlow runs entirely on your Mac. Your voice never leaves your device.

VibeFlow (Open Source)	Cloud Dictation (Proprietary)
✓ 100% local processing	✗ Voice sent to servers
✓ No account required	✗ Account + login required
✓ No subscription fees	✗ Monthly subscription
✓ Works without internet	✗ Requires internet
✓ Apache 2.0 licensed	✗ Closed source
✓ Extend with your own engines	✗ Limited customization

🎯 Universal Dictation

Works across any macOS app — Slack, VS Code, Notion, browsers, terminals. Hold a hotkey, speak, release. Done.

🧠 Dual Speech Engines

Apple Speech Recognition for speed, or WhisperKit for offline Whisper-quality transcription via Neural Engine.

✨ AI Text Cleanup

Local Qwen 0.5B via MLX polishes your speech — removes filler words, fixes grammar, formats text. All on-device.

✈️ Fully Offline

WhisperKit + Local SLM = zero network dependency. Dictate on a plane, in a tunnel, anywhere.

📖 Custom Dictionary

Add technical terms — Kubernetes, Terraform, gRPC — for accurate recognition of your domain vocabulary.

🎨 Writing Styles

Casual, Professional, Creative, or Technical. VibeFlow adapts its text cleanup to your preferred tone.

How It Works

Three seconds from thought to text

Hold the hotkey

Press and hold Fn (or your custom key). A Dynamic Island-style HUD appears with a live waveform.

Speak naturally

Talk normally. Say "um" and "like" all you want — the filler removal pipeline strips them out.

Release & paste

Let go. VibeFlow transcribes, cleans, and pastes polished text into whatever app you're using.

Hotkey → Speech Engine (Apple Speech / WhisperKit) → Filler Remover (Regex patterns) → Text Processor (Qwen 0.5B / Remote LLM) → Paste
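The Filler Remover stage is described only as "Regex patterns", so here is a minimal sketch of what such a stage could look like. The function name `removeFillers(from:)` and the pattern list are assumptions for illustration, not VibeFlow's actual patterns.

```swift
import Foundation

/// Strips a small, hypothetical set of filler words from a transcript.
/// The real pattern set used by VibeFlow is not shown on this page.
func removeFillers(from text: String) -> String {
    let patterns = [
        // Standalone hesitation sounds, plus an optional trailing comma.
        #"\b(?:um+|uh+|erm?)\b,?"#,
        // "like" only when set off by commas, so "I like it" stays intact.
        #",\s*like\s*,"#
    ]
    var result = text
    for pattern in patterns {
        result = result.replacingOccurrences(
            of: pattern,
            with: "",
            options: [.regularExpression, .caseInsensitive]
        )
    }
    // Collapse the whitespace runs left behind by the removals.
    result = result.replacingOccurrences(
        of: #"\s{2,}"#, with: " ", options: .regularExpression)
    // Drop any space that now sits directly before punctuation.
    result = result.replacingOccurrences(
        of: #"\s+([,.!?])"#, with: "$1", options: .regularExpression)
    return result.trimmingCharacters(in: .whitespaces)
}

let cleaned = removeFillers(from: "Um, I think we should, like, ship it uh tomorrow.")
print(cleaned)
```

A regex pass like this is cheap and deterministic, but it cannot restore capitalization or judge ambiguous words, which is presumably why the text processor runs after it.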

Engine Options

Mix and match your pipeline

Two stages, two choices each. Pick what fits your workflow — switch anytime in Settings.

Stage 1: Speech-to-Text

Apple Speech (Built-in)

Uses macOS SFSpeechRecognizer. Fast, zero setup, supports contextualStrings for custom dictionary terms.

0 MB download · Instant start

WhisperKit (Offline)

OpenAI Whisper running on Apple Neural Engine via CoreML. Best accuracy, and fully offline after the automatic model download.

~80 MB model · ~100 MB RAM

Stage 2: Text Cleanup

Local SLM (Offline)

Qwen 0.5B quantized to 4-bit, running on Apple GPU via MLX Swift. Polishes text without any network call.

~350 MB model · ~500 MB RAM

Remote LLM (Flexible)

Any OpenAI-compatible endpoint: GPT-4, Claude, local Ollama. Use the LiteLLM client for maximum flexibility.

0 MB local · Requires network
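"OpenAI-compatible" generally means the endpoint accepts a chat-completions style JSON body: a `model` name plus a list of `messages` with `role` and `content`. The sketch below only builds and encodes such a body. The type names, the prompt wording, and the model name `example-model` are assumptions, and VibeFlow's actual prompt and client code are not shown on this page.

```swift
import Foundation

// Minimal chat-completions request shape shared by OpenAI-compatible servers.
struct ChatMessage: Codable, Equatable {
    let role: String
    let content: String
}

struct ChatRequest: Codable, Equatable {
    let model: String
    let messages: [ChatMessage]
}

/// Builds a cleanup request for a dictated transcript.
/// The instruction text is hypothetical, written for this example only.
func makeCleanupRequest(transcript: String, style: String, model: String) -> ChatRequest {
    let instructions = "Rewrite the dictated text in a \(style) style. "
        + "Remove filler words, fix grammar, and return only the rewritten text."
    return ChatRequest(
        model: model,
        messages: [
            ChatMessage(role: "system", content: instructions),
            ChatMessage(role: "user", content: transcript)
        ]
    )
}

let request = makeCleanupRequest(
    transcript: "so um the deploy failed again",
    style: "Professional",
    model: "example-model"  // placeholder; use whatever your endpoint serves
)

let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
let body = try encoder.encode(request)
print(String(decoding: body, as: UTF8.self))
```

The encoded body would typically be sent as a POST to the server's chat-completions route with a JSON content type. Keeping the payload this generic is what lets one processor target a hosted model or a local Ollama instance by changing only the base URL and model name.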

Resource Usage

Component	Disk	RAM
App binary	~20 MB	~50 MB
WhisperKit (base)	~80 MB	~100 MB
Qwen 0.5B (4-bit)	~350 MB	~500 MB
Total (fully offline)	~450 MB	~650 MB

Runs comfortably on a base MacBook Air M1 with 8 GB RAM.

Architecture

Built for extensibility

A protocol-based engine system lets you hot-swap speech and text-processing engines at runtime. Add your own engine by conforming to the matching protocol.

protocol SpeechRecognitionService
AppleSpeechEngine · WhisperEngine · + your own

protocol TextProcessingService
LocalSLMProcessor · RemoteLLMProcessor · + your own
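To make the "conform to a protocol" idea concrete, here is a rough sketch of adding custom engines and swapping one at runtime. Only the two protocol names come from this page. The method requirements, the example engines, and the `DictationPipeline` class are all hypothetical, and the real protocols are likely asynchronous and richer than this.

```swift
import Foundation

// Hypothetical protocol shapes: the page names these protocols but does not
// show their requirements, so the methods below are assumptions.
protocol SpeechRecognitionService {
    func transcribe(samples: [Float]) throws -> String
}

protocol TextProcessingService {
    func process(_ text: String) throws -> String
}

// A custom speech engine that returns a fixed transcript (useful in tests).
struct CannedSpeechEngine: SpeechRecognitionService {
    let transcript: String
    func transcribe(samples: [Float]) throws -> String { transcript }
}

// A custom text processor: trims whitespace and ensures end punctuation.
struct SentenceProcessor: TextProcessingService {
    func process(_ text: String) throws -> String {
        let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
        guard let last = trimmed.last else { return trimmed }
        return ".!?".contains(last) ? trimmed : trimmed + "."
    }
}

// A pass-through processor, used below to demonstrate swapping.
struct PassthroughProcessor: TextProcessingService {
    func process(_ text: String) throws -> String { text }
}

// Holds the active engines as existentials so either can be replaced
// while the app is running.
final class DictationPipeline {
    var speech: any SpeechRecognitionService
    var processor: any TextProcessingService

    init(speech: any SpeechRecognitionService, processor: any TextProcessingService) {
        self.speech = speech
        self.processor = processor
    }

    func run(samples: [Float]) throws -> String {
        let transcript = try speech.transcribe(samples: samples)
        return try processor.process(transcript)
    }
}

let pipeline = DictationPipeline(
    speech: CannedSpeechEngine(transcript: " ship it tomorrow "),
    processor: SentenceProcessor()
)
let polished = try pipeline.run(samples: [])

// Swap the text processor without rebuilding the pipeline.
pipeline.processor = PassthroughProcessor()
let unprocessed = try pipeline.run(samples: [])
```

Because the pipeline depends only on the protocols, a new engine needs no changes to the calling code, which is the property the architecture section is describing.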

Core Services

ConversationController · FillerRemover · ModelDownloadManager · PermissionsHelper

Technologies

SwiftUI · SwiftData · AVFoundation · Speech.framework · WhisperKit · MLX Swift · AppKit · CoreAudio

Changelog

What's new

VibeFlow ships fast. Here's what landed recently.

v1.2 (Latest) · Mar 28, 2026

Dark Mode & UX Polish

Dark Mode — Appearance toggle (Light / Dark / System) in Settings, saved across launches. All views render correctly in both modes.
Richer History — Tap any transcription to see the full text, WPM speed, and duration. Average WPM now shown on Dashboard. WPM shown inline on each history row.
Recording Quality — 300 ms grace delay before stopping so the last word is never cut off.
Settings Cleanup — Microphone permissions surfaced at the top, redundant Formality setting removed.
Bug Fixes — Window remembers its size between launches. AI model used is now accurately logged per transcription.
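For reference, words per minute (WPM) is conventionally the word count scaled to a 60-second window. The sketch below shows one plausible way to compute the per-transcription and average figures mentioned above. The function names are made up for this example, and whether VibeFlow averages per row or over total time is not stated, so the time-weighted average here is an assumption.

```swift
import Foundation

/// Counts whitespace-separated words in a transcript.
func countWords(in text: String) -> Int {
    text.split(whereSeparator: { $0.isWhitespace }).count
}

/// Words per minute for a single transcription.
func wordsPerMinute(text: String, durationSeconds: Double) -> Double {
    guard durationSeconds > 0 else { return 0 }
    return Double(countWords(in: text)) * 60.0 / durationSeconds
}

/// Time-weighted average: total words over total recording time.
func averageWordsPerMinute(_ entries: [(text: String, durationSeconds: Double)]) -> Double {
    let totalSeconds = entries.reduce(0.0) { $0 + $1.durationSeconds }
    guard totalSeconds > 0 else { return 0 }
    let totalWords = entries.reduce(0) { $0 + countWords(in: $1.text) }
    return Double(totalWords) * 60.0 / totalSeconds
}

let single = wordsPerMinute(text: "one two three four five six", durationSeconds: 3)
let average = averageWordsPerMinute([
    (text: "one two three four five six", durationSeconds: 3),
    (text: "seven eight nine", durationSeconds: 6)
])
print(single, average)
```

Weighting by time keeps a few very short clips from skewing the dashboard figure, at the cost of hiding per-clip variation.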

v1.1.0 · Mar 28, 2026

Memory & Reliability

Critical Memory Fix — Eliminated a staircase memory leak (400 MB → 4+ GB) caused by 7 reactive observers rebuilding models on every keystroke. Models now only rebuild on Save Settings.
Model Load Error Surfacing — If WhisperKit or the local SLM fails to load, a red banner with Retry / Change Model appears on the Dashboard. Recording is blocked until the model is healthy.
HUD Processing State — Spinner + "Processing…" shown after release while transcription + AI cleanup runs. A red error pill appears for 3 s on failure.
Settings Save UX — All changes are held locally until you press Save Settings. Navigating away with unsaved changes shows a discard confirmation.
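The save-on-demand behaviour described in this release can be pictured as a draft copy of the settings that is only committed, and only triggers a model rebuild, on an explicit save. This is an illustrative sketch with invented names (`Settings`, `SettingsEditor`, `modelRebuilds`), not the app's actual code.

```swift
import Foundation

struct Settings: Equatable {
    var speechEngine: String
    var writingStyle: String
}

/// Keeps edits in a draft until the user saves; hypothetical types.
struct SettingsEditor {
    private(set) var saved: Settings
    var draft: Settings
    // Stand-in for the expensive work (reloading models) done on save.
    private(set) var modelRebuilds = 0

    init(saved: Settings) {
        self.saved = saved
        self.draft = saved
    }

    /// Drives the "discard unsaved changes?" confirmation.
    var hasUnsavedChanges: Bool { draft != saved }

    /// Commits the draft and rebuilds once, regardless of how many edits were made.
    mutating func save() {
        guard hasUnsavedChanges else { return }
        saved = draft
        modelRebuilds += 1
    }

    /// Throws away pending edits.
    mutating func discard() { draft = saved }
}

var editor = SettingsEditor(
    saved: Settings(speechEngine: "Apple Speech", writingStyle: "Casual")
)
// Several edits in a row: nothing is rebuilt yet.
editor.draft.speechEngine = "WhisperKit"
editor.draft.writingStyle = "Technical"
let pendingBeforeSave = editor.hasUnsavedChanges
editor.save()
```

Editing the draft any number of times costs nothing; the rebuild happens once per save, which is the difference from rebuilding on every keystroke.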

v1.0.0 · Mar 24, 2026

Initial Release

Universal dictation across any macOS app via global hotkey
Dual speech engines: Apple Speech & WhisperKit
Dual text cleanup: Local SLM (Qwen 0.5B) & Remote LLM
Dynamic Island HUD with live waveform
Custom dictionary, writing styles, transcription history
