Don't type,
just speak

The open-source voice-to-text app that turns speech into clean, polished text in every macOS app. Fully offline. No cloud. No subscriptions.

macOS 14+ · Apple Silicon · MIT License

VibeFlow: Listening...
🎙 Capture → 📝 Transcribe → Polish → 📋 Paste
Works in VS Code, Slack, Notion, Terminal, Safari, Xcode

Voice that respects
your privacy

Unlike cloud-based dictation tools, VibeFlow runs entirely on your Mac. Your voice never leaves your device.

VibeFlow

Open Source
  • 100% local processing
  • No account required
  • No subscription fees
  • Works without internet
  • MIT licensed
  • Extend with your own engines

Cloud Dictation

Proprietary
  • Voice sent to servers
  • Account + login required
  • Monthly subscription
  • Requires internet
  • Closed source
  • Limited customization
🎯

Universal Dictation

Works across any macOS app — Slack, VS Code, Notion, browsers, terminals. Hold a hotkey, speak, release. Done.

🧠

Dual Speech Engines

Apple Speech Recognition for speed, or WhisperKit for offline, Whisper-quality transcription on the Neural Engine.

AI Text Cleanup

Local Qwen 0.5B via MLX polishes your speech — removes filler words, fixes grammar, formats text. All on-device.

✈️

Fully Offline

WhisperKit + Local SLM = zero network dependency. Dictate on a plane, in a tunnel, anywhere.

📖

Custom Dictionary

Add technical terms — Kubernetes, Terraform, gRPC — for accurate recognition of your domain vocabulary.

🎨

Writing Styles

Casual, Professional, Creative, or Technical. VibeFlow adapts its text cleanup to your preferred tone.

Three seconds from
thought to text

1

Hold the hotkey

Press and hold Fn (or your custom key). A Dynamic Island-style HUD appears with a live waveform.

2

Speak naturally

Talk normally. Say "um" and "like" all you want — the filler removal pipeline strips them out.

3

Release & paste

Let go. VibeFlow transcribes, cleans, and pastes polished text into whatever app you're using.
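The three steps above amount to a small push-to-talk state machine. Here is a minimal sketch of that idea in Swift. All type and case names are hypothetical and are not taken from the VibeFlow source; the real app may structure this differently.

```swift
// Hypothetical names throughout; this only illustrates the hold / speak / release flow.
enum DictationState: Equatable {
    case idle        // waiting for the hotkey
    case listening   // hotkey held, audio being captured
    case processing  // hotkey released, transcribing and cleaning
}

enum DictationEvent {
    case hotkeyDown
    case hotkeyUp
    case textPasted
}

struct DictationStateMachine {
    private(set) var state: DictationState = .idle

    /// Applies an event and returns true if the state changed.
    @discardableResult
    mutating func handle(_ event: DictationEvent) -> Bool {
        switch (state, event) {
        case (.idle, .hotkeyDown):
            state = .listening    // step 1: show the HUD and start capturing
        case (.listening, .hotkeyUp):
            state = .processing   // step 3: transcribe and clean up
        case (.processing, .textPasted):
            state = .idle         // text delivered, ready for the next dictation
        default:
            return false          // ignore events that do not apply, such as key repeat
        }
        return true
    }
}
```

Rejecting out-of-order events in the `default` branch keeps a repeated key-down from restarting a capture that is already running.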

Hotkey → Speech Engine (Apple Speech / WhisperKit) → Filler Remover (regex patterns) → Text Processor (Qwen 0.5B / Remote LLM) → Paste
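To give a feel for the Filler Remover stage, here is a minimal regex-based sketch. The type name, the word list and the pattern are assumptions for illustration, not VibeFlow's actual patterns. A word like "like" is deliberately left out: a regex cannot tell filler from a real verb, so that case presumably needs the text-processing stage.

```swift
import Foundation

/// Hypothetical filler remover; the word list below is an assumption.
struct FillerRemoverSketch {
    // Whole-word fillers, optionally followed by a comma, matched case-insensitively.
    private let pattern = "\\b(?:um+|uh+|erm+|hmm+)\\b,?\\s*"

    func clean(_ text: String) -> String {
        guard let regex = try? NSRegularExpression(pattern: pattern,
                                                   options: [.caseInsensitive]) else {
            return text  // fall back to the raw transcript if the pattern is invalid
        }
        let range = NSRange(text.startIndex..., in: text)
        let stripped = regex.stringByReplacingMatches(in: text, options: [],
                                                      range: range, withTemplate: "")
        // Collapse doubled whitespace left behind and trim the ends.
        return stripped
            .replacingOccurrences(of: "\\s{2,}", with: " ", options: .regularExpression)
            .trimmingCharacters(in: .whitespaces)
    }
}

let cleaned = FillerRemoverSketch().clean("Um, so I think uh we should ship it")
// cleaned == "so I think we should ship it"
```

The word boundaries keep ordinary words such as "umbrella" intact. Capitalization and grammar are not touched here; in the pipeline above that is the Text Processor's job.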

Mix and match
your pipeline

Two stages, two choices each. Pick what fits your workflow — switch anytime in Settings.

Stage 1: Speech-to-Text

Built-in

Apple Speech

Uses macOS SFSpeechRecognizer. Fast, zero setup, supports contextualStrings for custom dictionary terms.

0 MB download · Instant start
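As a rough illustration of how custom dictionary terms can reach Apple Speech, the sketch below sets `contextualStrings` on an `SFSpeechAudioBufferRecognitionRequest`. Those two names are real Speech framework API; the helper `normalizedTerms(from:)` and `makeRequest(customTerms:)` are hypothetical and only show one way to wire it up.

```swift
import Foundation
#if canImport(Speech)
import Speech
#endif

/// Hypothetical helper: trims entries, drops blanks and duplicates, keeps order.
func normalizedTerms(from dictionary: [String]) -> [String] {
    var seen = Set<String>()
    var result: [String] = []
    for term in dictionary {
        let trimmed = term.trimmingCharacters(in: .whitespacesAndNewlines)
        if !trimmed.isEmpty && seen.insert(trimmed).inserted {
            result.append(trimmed)
        }
    }
    return result
}

#if canImport(Speech)
/// Builds a recognition request biased toward the user's custom terms.
func makeRequest(customTerms: [String]) -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.contextualStrings = normalizedTerms(from: customTerms)
    request.shouldReportPartialResults = true  // lets a HUD show text while speaking
    return request
}
#endif
```

Calling `makeRequest(customTerms: ["Kubernetes", "Terraform", "gRPC"])` would hint the recognizer toward those spellings; it biases recognition rather than guaranteeing it.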

Stage 2: Text Cleanup

Flexible

Remote LLM

Any OpenAI-compatible endpoint — GPT-4, Claude, local Ollama. Use the LiteLLM client for maximum flexibility.

0 MB local · Requires network
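For a sense of what "OpenAI-compatible" means in practice, here is a minimal sketch that builds (but does not send) a chat-completions request for text cleanup. The `chat/completions` path and the `model` / `messages` body follow the OpenAI chat API; the type names, the function name and the system prompt are assumptions, not VibeFlow's actual client code.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

struct ChatMessage: Codable {
    let role: String
    let content: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

/// Hypothetical helper: prepares a cleanup request for any OpenAI-compatible server.
func makeCleanupRequest(baseURL: URL, apiKey: String?, model: String,
                        transcript: String) throws -> URLRequest {
    let url = baseURL.appendingPathComponent("chat").appendingPathComponent("completions")
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    if let apiKey = apiKey {
        // Local servers often need no key, so the header is optional.
        request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    }
    let body = ChatRequest(model: model, messages: [
        // Assumed prompt wording, for illustration only.
        ChatMessage(role: "system",
                    content: "Remove filler words and fix grammar. Return only the cleaned text."),
        ChatMessage(role: "user", content: transcript)
    ])
    request.httpBody = try JSONEncoder().encode(body)
    return request
}
```

Pointing `baseURL` at a local Ollama server (typically `http://localhost:11434/v1`) or at a hosted provider changes only the URL, key and model name; the request shape stays the same.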

Resource Usage

Component               Disk       RAM
App binary              ~20 MB     ~50 MB
WhisperKit (base)       ~80 MB     ~100 MB
Qwen 0.5B (4-bit)       ~350 MB    ~500 MB
Total (fully offline)   ~450 MB    ~650 MB

Runs comfortably on a base MacBook Air M1 with 8 GB RAM.

Built for
extensibility

Protocol-based engine system lets you hot-swap speech and text processing at runtime. Add your own engines by conforming to a protocol.

SpeechRecognitionService

protocol SpeechRecognitionService
  • AppleSpeechEngine
  • WhisperEngine
  • + your own

TextProcessingService

protocol TextProcessingService
  • LocalSLMProcessor
  • RemoteLLMProcessor
  • + your own

Core Services

ConversationController
  • FillerRemover
  • ModelDownloadManager
  • PermissionsHelper
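Adding your own engine could look roughly like the sketch below. Only the protocol name `TextProcessingService` comes from this page; the method signature, the two example processors and `TextPipeline` are assumptions, and the real protocol may well be asynchronous or carry extra requirements.

```swift
// Protocol name is from the page; its requirements here are assumed for illustration.
protocol TextProcessingService {
    var name: String { get }
    func process(_ text: String) -> String
}

/// Hypothetical engine that returns the transcript unchanged.
struct PassthroughProcessor: TextProcessingService {
    let name = "Passthrough"
    func process(_ text: String) -> String { text }
}

/// Hypothetical custom engine: capitalizes the first letter and ends with a period.
struct SentenceCaseProcessor: TextProcessingService {
    let name = "SentenceCase"
    func process(_ text: String) -> String {
        guard let first = text.first else { return text }
        var result = first.uppercased() + String(text.dropFirst())
        if let last = result.last, !".!?".contains(last) {
            result.append(".")
        }
        return result
    }
}

/// Holds the active engine behind the protocol so it can be swapped at runtime.
final class TextPipeline {
    var processor: any TextProcessingService
    init(processor: any TextProcessingService) { self.processor = processor }
    func run(_ transcript: String) -> String { processor.process(transcript) }
}

let pipeline = TextPipeline(processor: PassthroughProcessor())
pipeline.processor = SentenceCaseProcessor()  // hot-swap without rebuilding the pipeline
```

Because callers only see the protocol, swapping engines is a single assignment and nothing downstream needs to know which engine is active.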

Technologies

SwiftUI SwiftData AVFoundation Speech.framework WhisperKit MLX Swift AppKit CoreAudio

Start flowing

Open-source voice dictation for engineers. 100% local. 100% free.

git clone https://github.com/agarwalvivek29/VibeFlow.git && cd VibeFlow && open VibeFlow.xcodeproj