LLM Providers
Configure Ollama (local and cloud), Groq, and OpenRouter for prompt optimization.
Vocoding supports multiple LLM providers for prompt optimization. You can use local models via Ollama, ultra-fast cloud inference via Groq, or access 300+ models via OpenRouter.
Ollama (Local LLM)
Configure Ollama for local AI model execution. All processing stays on your machine.
Connection
| Setting | Description | Default |
|---|---|---|
| Base URL | Ollama server address | http://127.0.0.1:11434 |
| Connection status | Shows if Ollama is reachable | Auto-detected |
Prerequisites (local mode only):
- Install Ollama from ollama.com
- Ensure Ollama is running (or start it with ollama serve)
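If the connection status shows Ollama as unreachable, you can verify the server outside Vocoding. A minimal sketch using Ollama's /api/tags endpoint (the function name and timeout are ours, not Vocoding's internals):

```python
import json
import urllib.error
import urllib.request

def list_local_models(base_url="http://127.0.0.1:11434"):
    """Return names of locally available models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None
```

A None result means the server is down or the Base URL is wrong; an empty list means Ollama is running but no models have been pulled yet.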
Model Manager
| Action | Description |
|---|---|
| Model list | Shows all locally available models with sizes |
| Pull input | Enter a model name to download (e.g., llama3.1:8b) |
| Delete | Remove a downloaded model |
| Progress bar | Shows download progress for new models |
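The same pull operation is available over Ollama's REST API, which is what a progress bar like the one above can stream from. A sketch (we assume /api/pull accepting a name field, which matches Ollama's documented schema at the time of writing; the helper names are ours):

```python
import json
import urllib.request

def build_pull_request(name, base_url="http://127.0.0.1:11434"):
    """Build the POST request that asks Ollama to download a model."""
    return urllib.request.Request(
        f"{base_url}/api/pull",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def pull_model(name, base_url="http://127.0.0.1:11434"):
    """Stream download progress; Ollama replies with one JSON object per line."""
    with urllib.request.urlopen(build_pull_request(name, base_url)) as resp:
        for line in resp:
            status = json.loads(line)
            # 'completed' and 'total' byte counts, when present, drive a progress bar
            print(status.get("status"), status.get("completed"), status.get("total"))
```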
Advanced Settings
| Setting | Description | Default |
|---|---|---|
| Keep Alive | How long model stays loaded in memory | 10 minutes |
| Reasoning Auto | Enable extended reasoning for supported models | ON |
| Persist Reasoning | Save reasoning history across sessions | OFF |
Keep Alive Options:
- 0 -- Unload immediately after each request
- 5m, 10m, 30m -- Unload after the chosen idle time
- -1 -- Keep loaded permanently (uses more RAM)
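These choices map onto the keep_alive field that Ollama accepts on each generate request. A sketch of a request body carrying the setting (keep_alive and the other fields follow Ollama's /api/generate schema; the choice table and helper are ours):

```python
import json

# UI choice -> value Ollama understands ("0" unloads now, "-1" pins the model in RAM)
KEEP_ALIVE_CHOICES = {
    "immediately": "0",
    "5 minutes": "5m",
    "10 minutes": "10m",
    "30 minutes": "30m",
    "forever": "-1",
}

def generate_payload(model, prompt, keep_alive="10m"):
    """JSON body for POST /api/generate with an explicit keep_alive."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    })

body = generate_payload("llama3.1:8b", "Hello", KEEP_ALIVE_CHOICES["forever"])
```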
Model Capabilities
Vocoding shows model capabilities with badges:
- Tools -- Can use web search and other tools
- Vision -- Can process images
- Thinking -- Supports extended reasoning mode
Groq
Ultra-fast cloud inference powered by custom LPU chips.
| Setting | Description |
|---|---|
| API Key | Your Groq API key |
| Model | Selected Groq model |
| Test Connection | Verify API key works |
Getting Started
- Visit console.groq.com
- Create account (free tier, no credit card)
- Generate an API key and paste it into Vocoding
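Once you have a key, you can sanity-check it outside the app. Groq exposes an OpenAI-compatible chat-completions endpoint; a minimal sketch (the model name and helper functions are illustrative assumptions, not part of Vocoding):

```python
import json
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_groq_request(prompt, model, api_key):
    """Build a chat-completion request for Groq's OpenAI-compatible API."""
    return urllib.request.Request(
        GROQ_URL,
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def groq_chat(prompt, model, api_key):
    """Send the request and return the model's reply text."""
    req = build_groq_request(prompt, model, api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

A 401 response here means the key itself is bad, which is the same thing Test Connection checks.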
Tiers
- Free: No credit card, rate-limited (requests/min and requests/day)
- Developer: Pay-as-you-go, higher rate limits
- Enterprise: Custom volume pricing
OpenRouter
Access 300+ models from multiple AI providers.
| Setting | Description |
|---|---|
| API Key | Your OpenRouter API key |
| Model | Selected model (searchable list) |
| Test Connection | Verify API key works |
Getting Started
- Visit openrouter.ai
- Create account
- Generate an API key and paste it into Vocoding
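OpenRouter uses the same OpenAI-compatible request shape as Groq, just with a different base URL, so a key check looks nearly identical (the helper name is ours; the model string is whatever you select in the searchable list):

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_openrouter_request(prompt, model, api_key):
    """Build a chat-completion request for OpenRouter's OpenAI-compatible API."""
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```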
Pricing
- Free: 25+ free models (no credit card), 50 requests/day
- Pay-as-you-go: Buy credits, 300+ models, no minimum spend
- Enterprise: Custom volume pricing with SLA
Ollama Cloud
Access state-of-the-art models without local hardware requirements.
| Plan | Price | Key Features |
|---|---|---|
| Free | $0/month | Basic cloud access, limited usage |
| Pro | $20/month | Multiple simultaneous cloud models, 3 private models |
| Max | $100/month | 5+ simultaneous models, 5x Pro usage, 5 private models |
Configure at ollama.com and add your account in Settings > LLM.
Fallback Behavior
When the provider is set to Auto, Vocoding tries providers in order:
- Groq (if configured)
- OpenRouter (if configured)
- Ollama (local or cloud, if available)
- Raw fallback (no optimization -- returns cleaned transcription)
If a provider fails or times out, Vocoding automatically falls back to the next available provider. The raw fallback always works as a last resort, ensuring you always get a result.
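The Auto chain above can be sketched as a loop over providers, with the raw transcription as the guaranteed last step (the provider callables here are illustrative stand-ins, not Vocoding's internals):

```python
def optimize(transcription, providers):
    """Try each configured provider in order; fall back to the raw transcription."""
    for provider in providers:
        try:
            result = provider(transcription)
            if result:
                return result
        except Exception:
            continue  # provider failed or timed out: try the next one
    return transcription.strip()  # raw fallback always succeeds

# Illustrative providers: one that always fails, one that answers.
def flaky(_text):
    raise TimeoutError("provider unavailable")

def echo_upper(text):
    return text.upper()
```

With providers=[flaky, echo_upper], the first call raises, the loop moves on, and echo_upper supplies the result; with only flaky configured, the cleaned raw text is returned instead.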