# First-Time Setup

Configure Whisper models and LLM providers for optimal performance.
## Download a Whisper Model

Vocoding uses Whisper for 100% local, offline transcription. On first use:

- Go to Settings, then Transcription
- Click Download next to your preferred model
### Model Comparison

| Model | Size | Speed | Accuracy | Recommended For |
|---|---|---|---|---|
| tiny | 75 MB | Fastest | Basic | Quick notes, testing |
| base | 142 MB | Fast | Good | Daily use (default) |
| small | 466 MB | Medium | Better | When accuracy matters |
| medium | 1.5 GB | Slow | High | Professional use |
| large-v3 | 2.9 GB | Slowest | Best | Maximum accuracy |
| large-v3-turbo | 1.5 GB | Fast | High | Best balance (Apple Silicon) |
### Recommendations by Platform

- Apple Silicon Mac: Start with `large-v3-turbo` (fast + accurate with GPU acceleration)
- Intel Mac: Start with `base` or `small` (CPU-based processing)
- Windows: Start with `base` or `small`
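The platform recommendations above can be sketched as a small helper that maps a `uname -sm` string to a starting model from the comparison table (the function name `pick_model` is just for illustration; Vocoding itself does not ship this script):

```shell
# Sketch: choose a starting Whisper model based on the detected platform.
# Model names match the Model Comparison table above.
pick_model() {
  case "$1" in
    "Darwin arm64")  echo "large-v3-turbo" ;;  # Apple Silicon: GPU acceleration
    "Darwin x86_64") echo "base" ;;            # Intel Mac: CPU-based processing
    *)               echo "base" ;;            # Windows / other: safe default
  esac
}

pick_model "$(uname -sm)"
```

Users who need higher accuracy can then step up to `small` or `medium` from this starting point.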
## Configure an LLM Provider (Optional)

For prompt optimization (turning raw transcription into structured prompts), configure an LLM provider. You have four options:
### Option A: Ollama Local (Free, Offline)

Run AI models directly on your machine. 100% private, no data leaves your device.
- Install Ollama on your system
- Run: `ollama pull llama3.1:8b`
- In Vocoding: Settings, then Local LLM
- Verify connection status shows green
- Select your model
Best for: Privacy-first users with capable hardware.
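Before expecting the connection status to show green, you can check from a terminal that the Ollama server is actually up. This is a minimal sketch assuming Ollama's default API port, 11434 (`ollama_up` is a hypothetical helper name):

```shell
# Sketch: verify a local Ollama server is reachable before selecting
# a model in Vocoding. 11434 is Ollama's default API port.
ollama_up() {
  curl -sf "${1:-http://localhost:11434}/api/tags" >/dev/null
}

if ollama_up; then
  echo "Ollama is running"
else
  echo "Ollama not reachable; start it with: ollama serve"
fi
```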
### Option B: Ollama Cloud (Recommended for Most Users)

Access state-of-the-art models without consuming local resources.
- Create an account at ollama.com
- Choose a plan:
| Plan | Price | Description |
|---|---|---|
| Free | $0/month | Basic cloud access, limited usage |
| Pro | $20/month | Multiple cloud models simultaneously, 3 private models |
| Max | $100/month | 5+ simultaneous cloud models, 5x Pro usage, 5 private models |
- Configure in Vocoding: Settings, then LLM
Why Ollama Cloud? Running state-of-the-art models locally requires very expensive hardware (40+ GB RAM for top models). Cloud plans give you access to the same quality at a fraction of the cost.
### Option C: Groq (Ultra-Fast Cloud)

Groq runs inference on custom LPU chips at over 500 tokens/second.
- Visit console.groq.com
- Create an account (free tier available, no credit card required)
- Generate an API key
- In Vocoding: Settings, then LLM, then Groq
- Paste your API key
- Click Test Connection
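If Test Connection fails, you can smoke-test the key outside Vocoding. Groq exposes an OpenAI-compatible API, and listing models is a cheap way to validate a key; this sketch assumes your key is in a `GROQ_API_KEY` environment variable (`groq_check` is an illustrative name, not part of Vocoding):

```shell
# Sketch: validate a Groq API key by listing models via Groq's
# OpenAI-compatible endpoint. Exits nonzero on a rejected key.
groq_check() {
  curl -sf https://api.groq.com/openai/v1/models \
    -H "Authorization: Bearer $1" >/dev/null
}

if groq_check "$GROQ_API_KEY"; then
  echo "key OK"
else
  echo "key rejected or network error"
fi
```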
### Option D: OpenRouter (300+ Models)

Access to 300+ models from multiple AI providers through a single API key.
- Visit openrouter.ai
- Create an account
- Generate an API key
- In Vocoding: Settings, then LLM, then OpenRouter
- Paste your API key
- Click Test Connection
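To see which free models are currently available, you can query OpenRouter's public model listing, which needs no API key. This sketch assumes free variants use model ids ending in `:free` (worth re-checking against OpenRouter's docs), and `free_ids` is just an illustrative helper:

```shell
# Sketch: print a few of OpenRouter's free models by filtering the
# public /models listing for ids ending in ":free".
free_ids() {
  grep -o '"id": *"[^"]*:free"'
}

curl -s https://openrouter.ai/api/v1/models | free_ids | head -5
```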
OpenRouter offers 25+ free models (no credit required) including DeepSeek R1, Llama 3.3 70B, Gemma, and more.
## No API Key?
You can still use Vocoding in Transcribe mode (voice to text only) or with Ollama (local or cloud). An API key is only required for Groq and OpenRouter.