
LLM Providers

Configure Ollama (local and cloud), Groq, and OpenRouter for prompt optimization.

Vocoding supports multiple LLM providers for prompt optimization. You can use local models via Ollama, ultra-fast cloud inference via Groq, or access 300+ models via OpenRouter.


Ollama (Local LLM)

Configure Ollama for local AI model execution. All processing stays on your machine.

Connection

Setting            Description                    Default
Base URL           Ollama server address          http://127.0.0.1:11434
Connection status  Shows if Ollama is reachable   Auto-detected

Prerequisites (local mode only):

  1. Install Ollama from ollama.com
  2. Ensure Ollama is running (or start it with ollama serve)
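You can confirm the server is reachable from a terminal. /api/tags is Ollama's endpoint for listing installed models, so any JSON response means the server is up at the default base URL:

```shell
# Check that Ollama is answering at the default base URL.
# /api/tags lists installed models; a JSON reply means the server is running.
curl http://127.0.0.1:11434/api/tags
```

If this fails with a connection error, start the server with ollama serve or update the Base URL setting to match where Ollama is listening.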

Model Manager

Action        Description
Model list    Shows all locally available models with sizes
Pull input    Enter a model name to download (e.g., llama3.1:8b)
Delete        Remove a downloaded model
Progress bar  Shows download progress for new models
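The Model Manager mirrors the Ollama CLI, so the same actions can be performed from a terminal (the model name is an example):

```shell
# CLI equivalents of the Model Manager actions.
ollama pull llama3.1:8b   # download a model (shows a progress bar)
ollama list               # show locally available models with sizes
ollama rm llama3.1:8b     # delete a downloaded model
```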

Advanced Settings

Setting            Description                                      Default
Keep Alive         How long the model stays loaded in memory        10 minutes
Reasoning Auto     Enable extended reasoning for supported models   ON
Persist Reasoning  Save reasoning history across sessions           OFF

Keep Alive Options:

  • 0 -- Unload immediately after each request
  • 5m, 10m, 30m -- Unload after idle time
  • -1 -- Keep loaded permanently (uses more RAM)
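If you call Ollama's HTTP API directly, the same options can be passed per request through the keep_alive field of /api/generate (the model name and prompt here are examples):

```shell
# Keep the model loaded for 10 minutes after this request.
# keep_alive accepts the same values as the setting: 0, durations like "5m", or -1.
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Hello",
  "keep_alive": "10m"
}'
```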

Model Capabilities

Vocoding shows model capabilities with badges:

  • Tools -- Can use web search and other tools
  • Vision -- Can process images
  • Thinking -- Supports extended reasoning mode

Groq

Ultra-fast cloud inference powered by custom LPU chips.

Setting          Description
API Key          Your Groq API key
Model            Selected Groq model
Test Connection  Verify API key works

Getting Started

  1. Visit console.groq.com
  2. Create an account (free tier, no credit card required)
  3. Generate an API key and paste it into Vocoding
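You can also verify a key outside Vocoding. Groq exposes an OpenAI-compatible API, so listing the available models is a minimal authenticated request (this assumes your key is in the GROQ_API_KEY environment variable):

```shell
# List Groq models; a JSON model list confirms the API key is valid.
curl https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY"
```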

Tiers

  • Free: No credit card, rate-limited (requests/min and requests/day)
  • Developer: Pay-as-you-go, higher rate limits
  • Enterprise: Custom volume pricing

OpenRouter

Access 300+ models from multiple AI providers.

Setting          Description
API Key          Your OpenRouter API key
Model            Selected model (searchable list)
Test Connection  Verify API key works

Getting Started

  1. Visit openrouter.ai
  2. Create an account
  3. Generate an API key and paste it into Vocoding
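An OpenRouter key can likewise be checked from a terminal. This sketch assumes your key is in the OPENROUTER_API_KEY environment variable and uses OpenRouter's key-info endpoint; adjust the path if the API has changed:

```shell
# Fetch metadata about the current API key (usage, limits);
# a JSON reply rather than a 401 confirms the key works.
curl https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"
```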

Pricing

  • Free: 25+ free models (no credit card), 50 requests/day
  • Pay-as-you-go: Buy credits, 300+ models, no minimum spend
  • Enterprise: Custom volume pricing with SLA

Ollama Cloud

Access state-of-the-art models without local hardware requirements.

Plan  Price       Key Features
Free  $0/month    Basic cloud access, limited usage
Pro   $20/month   Multiple simultaneous cloud models, 3 private models
Max   $100/month  5+ simultaneous models, 5x Pro usage, 5 private models

Configure at ollama.com and add your account in Settings > LLM.


Fallback Behavior

When the provider is set to Auto, Vocoding tries providers in order:

  1. Groq (if configured)
  2. OpenRouter (if configured)
  3. Ollama (local or cloud, if available)
  4. Raw fallback (no optimization -- returns cleaned transcription)

If a provider fails or times out, Vocoding automatically falls back to the next available provider. The raw fallback never fails, so you always get a result even when no provider is reachable.
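The chain above can be sketched as a shell pipeline of fallbacks. This is illustrative only; the try_* functions are hypothetical stand-ins for the real provider calls, and here they all fail so the raw transcription passes through unchanged:

```shell
#!/bin/sh
# Sketch of the Auto fallback order. Each try_* is a hypothetical
# stand-in for a provider call; returning 1 simulates "unconfigured
# or unreachable", so control falls through to the next provider.
try_groq()       { return 1; }
try_openrouter() { return 1; }
try_ollama()     { return 1; }

optimize() {
  # Try each provider in order; the final printf is the raw fallback.
  try_groq "$1" || try_openrouter "$1" || try_ollama "$1" || printf '%s\n' "$1"
}

optimize "hello world"   # all providers fail -> prints: hello world
```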