Getting Started

First-Time Setup

Configure Whisper models and LLM providers for optimal performance.

Download a Whisper Model

Vocoding uses Whisper for 100% local, offline transcription. On first use:

  1. Go to Settings, then Transcription
  2. Click Download next to your preferred model

Model Comparison

| Model | Size | Speed | Accuracy | Recommended For |
|---|---|---|---|---|
| tiny | 75 MB | Fastest | Basic | Quick notes, testing |
| base | 142 MB | Fast | Good | Daily use (default) |
| small | 466 MB | Medium | Better | When accuracy matters |
| medium | 1.5 GB | Slow | High | Professional use |
| large-v3 | 2.9 GB | Slowest | Best | Maximum accuracy |
| large-v3-turbo | 1.5 GB | Fast | High | Best balance (Apple Silicon) |

Recommendations by Platform

  • Apple Silicon Mac: Start with large-v3-turbo (fast + accurate with GPU acceleration)
  • Intel Mac: Start with base or small (CPU-based processing)
  • Windows: Start with base or small

Configure an LLM Provider (Optional)

For prompt optimization (turning raw transcription into structured prompts), configure an LLM provider. You have four options:

Option A: Ollama Local (Free, Offline)

Run AI models directly on your machine. 100% private, no data leaves your device.

  1. Install Ollama on your system
  2. Run: ollama pull llama3.1:8b
  3. In Vocoding: Settings, then Local LLM
  4. Verify connection status shows green
  5. Select your model

Best for: Privacy-first users with capable hardware.
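To confirm the local server is up before step 4, you can query Ollama's HTTP API directly (by default it listens on port 11434, and `/api/tags` lists installed models). A minimal sketch of such a check, roughly what Vocoding's connection indicator does:

```python
import json
import urllib.request
import urllib.error

def ollama_models(base_url="http://127.0.0.1:11434", timeout=2.0):
    """Return the list of installed model names if the Ollama server
    responds, or None if it is not reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

models = ollama_models()
if models is None:
    print("Ollama is not running - start it and retry")
else:
    print("Installed models:", models)
```

If the list is empty or missing `llama3.1:8b`, rerun the `ollama pull` step before selecting the model in Vocoding.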

Option B: Ollama Cloud (Hosted Models)

Access state-of-the-art models without consuming local resources.

  1. Create an account at ollama.com
  2. Choose a plan:
| Plan | Price | Description |
|---|---|---|
| Free | $0/month | Basic cloud access, limited usage |
| Pro | $20/month | Multiple cloud models simultaneously, 3 private models |
| Max | $100/month | 5+ simultaneous cloud models, 5x Pro usage, 5 private models |
  3. Configure in Vocoding: Settings, then LLM

Why Ollama Cloud? Running state-of-the-art models locally requires very expensive hardware (40+ GB RAM for top models). Cloud plans give you access to the same quality at a fraction of the cost.

Option C: Groq (Ultra-Fast Cloud)

Ultra-fast inference powered by custom LPU chips (>500 tokens/second).

  1. Visit console.groq.com
  2. Create an account (free tier available, no credit card required)
  3. Generate an API key
  4. In Vocoding: Settings, then LLM, then Groq
  5. Paste your API key
  6. Click Test Connection

Option D: OpenRouter (300+ Models)

Access to 300+ models from multiple AI providers through a single API key.

  1. Visit openrouter.ai
  2. Create an account
  3. Generate an API key
  4. In Vocoding: Settings, then LLM, then OpenRouter
  5. Paste your API key
  6. Click Test Connection

OpenRouter offers 25+ free models (no credit required) including DeepSeek R1, Llama 3.3 70B, Gemma, and more.

No API Key?

You can still use Vocoding in Transcribe mode (voice to text only) or with Ollama (local or cloud). An API key is only required for Groq and OpenRouter.