Platform Compatibility

Supported platforms, requirements, and feature compatibility for macOS, Windows, and Linux.

Supported Platforms

| Platform | Status | Notes |
| --- | --- | --- |
| macOS Apple Silicon (M1, M2, M3, M4) | Available | Best experience, native GPU acceleration |
| macOS Intel (x86_64) | Available | Fully functional, CPU-based transcription |
| Windows 10+ (x64) | In Development | Core components compatible, adaptation in progress |
| Linux | Not Supported | No immediate plans |
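
The platform list above can be expressed as a small lookup. This is a minimal sketch for illustration, not part of Vocoding itself: the function name `vocoding_platform_status` is hypothetical, the keys are the values Python's `platform` module typically reports, and the Windows 10+ version requirement is not checked.

```python
import platform

# Status values copied from the Supported Platforms table.
# The (system, machine) keys are what Python's platform module typically
# reports; this mapping is an illustrative assumption, not Vocoding's own code.
_STATUS = {
    ("Darwin", "arm64"): "Available",        # macOS Apple Silicon
    ("Darwin", "x86_64"): "Available",       # macOS Intel
    ("Windows", "AMD64"): "In Development",  # Windows x64
}


def vocoding_platform_status(system=None, machine=None):
    """Return the documented support status for a platform.

    Defaults to the machine running the script. Anything not listed in the
    table (including Linux) is reported as "Not Supported".
    """
    system = system if system is not None else platform.system()
    machine = machine if machine is not None else platform.machine()
    return _STATUS.get((system, machine), "Not Supported")


if __name__ == "__main__":
    print(vocoding_platform_status())
```

Note that a Python interpreter running under Rosetta on an Apple Silicon Mac reports `x86_64`, so the two macOS rows cannot always be told apart this way.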

macOS Apple Silicon (M1, M2, M3, M4)

Overview

Apple Silicon Macs provide the best Vocoding experience. Native GPU acceleration through Metal keeps transcription fast and efficient, even with large Whisper models.

Requirements

| Requirement | Minimum | Recommended |
| --- | --- | --- |
| macOS version | 10.15 (Catalina) | 13+ (Ventura) |
| RAM | 4 GB | 8 GB+ |
| Disk | 500 MB + models | 2 GB+ |
| Permissions | Microphone | Microphone + Accessibility |

Feature Compatibility

| Feature | Status | Details |
| --- | --- | --- |
| Audio Recording | Full | Built-in and external microphones |
| Whisper Transcription | Optimal | GPU-accelerated (Metal), all models supported |
| Ollama (Local + Cloud) | Full | Native Apple Silicon builds. Local mode (free, offline) and cloud mode (Free $0, Pro $20/month, Max $100/month) |
| Cloud LLM (Groq/OpenRouter) | Full | Requires internet + API key |
| Global Shortcuts | Full | Option+Space, Cmd+Shift+R, etc. |
| Overlay Window | Full | Always-on-top without stealing focus |
| Auto Updates (OTA) | Full | Built-in update system |
| Secure Key Storage | Full | macOS Keychain |

Recommended Models

| Priority | Model | Why |
| --- | --- | --- |
| Best overall | large-v3-turbo | Fast + accurate with Metal GPU |
| Best accuracy | large-v3 | Highest accuracy, reasonable speed on M-series |
| Fast drafts | base | Very fast, good for quick notes |

macOS Intel (x86_64)

Overview

Intel Macs are fully supported. The main difference from Apple Silicon is that Whisper transcription runs on CPU instead of GPU, which makes larger models slower.

Requirements

| Requirement | Minimum | Recommended |
| --- | --- | --- |
| macOS version | 10.15 (Catalina) | 12+ (Monterey) |
| RAM | 4 GB | 8 GB+ |
| Disk | 500 MB + models | 2 GB+ |
| Permissions | Microphone | Microphone + Accessibility |
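
Two of these requirements, the macOS version and free disk space, can be checked with the Python standard library. The following is a rough sketch under stated assumptions: the helper names are hypothetical, 500 MB is treated as 500 × 1024² bytes, and the space needed for models is not counted because it depends on which models are downloaded.

```python
import platform
import shutil

MIN_MACOS = (10, 15)            # Catalina, from the Requirements table
MIN_FREE_BYTES = 500 * 1024**2  # 500 MB for the app; models need extra space


def parse_version(text):
    """Turn a string like '12.6.1' into a tuple of ints: (12, 6, 1)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_minimum_macos(version_text, minimum=MIN_MACOS):
    """True when version_text is at least the documented minimum."""
    version = parse_version(version_text)
    return bool(version) and version >= minimum


def has_enough_disk(path="/", required=MIN_FREE_BYTES):
    """True when the volume holding `path` has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    mac_version = platform.mac_ver()[0]  # empty string when not on macOS
    print("macOS version OK:", meets_minimum_macos(mac_version))
    print("Disk space OK:", has_enough_disk())
```

RAM and permission checks are left out because the standard library has no portable way to query them.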

Feature Compatibility

| Feature | Status | Details |
| --- | --- | --- |
| Audio Recording | Full | Built-in and external microphones |
| Whisper Transcription | Full (CPU) | No GPU acceleration; large models will be slow |
| Ollama (Local + Cloud) | Full | Intel Mac builds. Local mode (free, offline) and cloud mode (no local hardware requirements) |
| Cloud LLM (Groq/OpenRouter) | Full | Requires internet + API key |
| Global Shortcuts | Full | Same shortcuts as Apple Silicon |
| Overlay Window | Full | Always-on-top without stealing focus |
| Auto Updates (OTA) | Full | Built-in update system |
| Secure Key Storage | Full | macOS Keychain |

Recommended Models

| Priority | Model | Why |
| --- | --- | --- |
| Best balance | small | Good accuracy, reasonable CPU speed |
| Higher accuracy | medium | Better accuracy, slower on CPU |
| Fast drafts | tiny or base | Very fast even on CPU |

Performance Notes

  • Models large-v3 and large-v3-turbo will work but may be significantly slower without GPU acceleration
  • For real-time or near-real-time use, stick with small or lighter models
  • Transcription speed depends on your specific Intel processor generation
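
The model recommendations for both Mac types can be captured in one lookup. This sketch only restates the documented recommendations as data; the keys `"apple_silicon"` and `"intel"` and the function name `recommend_models` are hypothetical and do not correspond to any Vocoding setting.

```python
# Recommendations copied from the Apple Silicon and Intel sections.
# The outer keys are illustrative labels, not Vocoding configuration names.
RECOMMENDED_MODELS = {
    "apple_silicon": {
        "Best overall": ["large-v3-turbo"],
        "Best accuracy": ["large-v3"],
        "Fast drafts": ["base"],
    },
    "intel": {
        "Best balance": ["small"],
        "Higher accuracy": ["medium"],
        "Fast drafts": ["tiny", "base"],
    },
}


def recommend_models(mac_type, priority):
    """Return the documented Whisper model names for a Mac type and priority."""
    try:
        return RECOMMENDED_MODELS[mac_type][priority]
    except KeyError:
        raise ValueError(
            f"no recommendation for {mac_type!r} / {priority!r}"
        ) from None


if __name__ == "__main__":
    print(recommend_models("intel", "Fast drafts"))
```

Keeping the priorities as the exact labels from the documentation makes it easy to see that the two Mac types do not share the same set of priorities.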

Windows 10+ (x64)

Overview

The Windows version is in active development. Many core components are technically compatible with Windows, but integration and testing are ongoing.

Requirements

| Requirement | Minimum | Recommended |
| --- | --- | --- |
| Windows version | Windows 10 (x64) | Windows 11 |
| Processor | x64 with AVX2 | Modern x64 processor |
| RAM | 4 GB | 8 GB+ |
| Disk | 500 MB + models | 2 GB+ |
| Runtime | WebView2 | Pre-installed on Windows 11 |

Feature Compatibility

| Feature | Status | Details |
| --- | --- | --- |
| Audio Recording | Planned | Audio capture layer is cross-platform compatible |
| Whisper Transcription | In Development | GPU acceleration requires Windows adaptation; CPU works |
| Ollama (Local + Cloud) | Compatible | Windows installer available. Cloud mode works without local hardware requirements |
| Cloud LLM (Groq/OpenRouter) | Compatible | Requires internet + API key |
| Global Shortcuts | Planned | Alt+Space, Ctrl+Shift+R, etc. |
| Overlay Window | In Development | Window management requires Windows-specific adjustments |
| Auto Updates (OTA) | Planned | Requires Windows installer format (MSI/NSIS) |
| Secure Key Storage | Compatible | Windows Credential Manager |

Known Limitations

  1. GPU acceleration: Whisper transcription currently uses Apple Metal for GPU acceleration. On Windows, transcription will use CPU until CUDA/DirectML support is implemented.
  2. Installer format: The Windows installer (.msi or .exe) is in preparation.
  3. Overlay window: Some window behaviors may differ from macOS.

WebView2 Runtime

Vocoding on Windows requires the WebView2 runtime:

  • Windows 11: Pre-installed, no action needed.
  • Windows 10: You may need to install the WebView2 Runtime from Microsoft.
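
Whether the runtime is already present can be checked from the registry. The sketch below follows the detection approach described in Microsoft's WebView2 distribution documentation (a `pv` version value under the EdgeUpdate client key); the key paths are reproduced from memory of that documentation and should be verified against it. The function names are hypothetical, and this is not how Vocoding itself performs the check.

```python
import sys

# Registry locations described by Microsoft for detecting the Evergreen
# WebView2 Runtime; the "pv" value holds the installed version string.
# Assumption: verify these paths against Microsoft's documentation.
_CLIENT_KEY = r"Microsoft\EdgeUpdate\Clients\{F3017226-FE2A-4295-8BDF-00C3A9A7E4C5}"
_LOCATIONS = [
    ("HKEY_LOCAL_MACHINE", "SOFTWARE\\WOW6432Node\\" + _CLIENT_KEY),
    ("HKEY_LOCAL_MACHINE", "SOFTWARE\\" + _CLIENT_KEY),
    ("HKEY_CURRENT_USER", "Software\\" + _CLIENT_KEY),
]


def is_installed_version(pv):
    """A missing, empty, or all-zero version means no runtime is installed."""
    return bool(pv) and pv.strip() not in ("", "0.0.0.0")


def webview2_version():
    """Return the installed WebView2 Runtime version, or None if not found.

    Always returns None on non-Windows systems.
    """
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    for hive_name, subkey in _LOCATIONS:
        try:
            with winreg.OpenKey(getattr(winreg, hive_name), subkey) as key:
                pv, _value_type = winreg.QueryValueEx(key, "pv")
        except OSError:
            continue  # key or value absent at this location
        if is_installed_version(pv):
            return pv
    return None


if __name__ == "__main__":
    print("WebView2 Runtime:", webview2_version() or "not found")
```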

What Is AVX2?

AVX2 (Advanced Vector Extensions 2) is a CPU instruction set required for efficient Whisper transcription. Most x64 processors since 2014 support AVX2.

How to check if your processor supports AVX2:

  • Check your processor model at the manufacturer's website (Intel ARK or AMD product pages).
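  • Common AVX2 processors: Intel Haswell (4th gen Core, 2013) and newer, AMD Excavator (2015) and newer. Some budget Pentium and Celeron models from these generations lack AVX2, so check the exact model.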
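
AVX2 support can also be checked programmatically. The following is a sketch, not Vocoding's own detection logic. On Windows it calls the Win32 function `IsProcessorFeaturePresent` with the feature ID `PF_AVX2_INSTRUCTIONS_AVAILABLE` (40 in the Windows SDK headers), which older Windows builds may not recognize and may therefore report as absent. On Intel Macs it reads the `machdep.cpu.leaf7_features` sysctl key. The function names are hypothetical.

```python
import subprocess
import sys

# Feature ID for IsProcessorFeaturePresent, from the Windows SDK headers.
PF_AVX2_INSTRUCTIONS_AVAILABLE = 40


def flags_include_avx2(flags_text):
    """True when a whitespace-separated CPU feature list contains AVX2."""
    return "avx2" in flags_text.lower().split()


def detect_avx2():
    """Return True/False for AVX2 support, or None when it cannot be determined."""
    if sys.platform == "win32":
        import ctypes  # windll is only available on Windows

        # A zero result means "not present" or "feature ID unknown to this
        # Windows build", so False here is not fully conclusive.
        return bool(
            ctypes.windll.kernel32.IsProcessorFeaturePresent(
                PF_AVX2_INSTRUCTIONS_AVAILABLE
            )
        )
    if sys.platform == "darwin":
        # Intel Macs list AVX2 under this key; the key is absent on
        # Apple Silicon, where sysctl exits with a non-zero status.
        result = subprocess.run(
            ["sysctl", "-n", "machdep.cpu.leaf7_features"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return flags_include_avx2(result.stdout)
    return None


if __name__ == "__main__":
    print("AVX2 supported:", detect_avx2())
```

A result of `None` means the check could not be made on that system, in which case the manufacturer's specification page remains the reliable source.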

Linux

Linux is not currently supported. There are no immediate plans for a Linux version.