LLM Providers
Configure Ollama (local and cloud), Groq, and OpenRouter for prompt optimization.
Vocoding supports multiple LLM providers for prompt optimization. You can use local models via Ollama, ultra-fast cloud inference via Groq, or access 300+ models via OpenRouter.
Ollama (Local LLM)
Configure Ollama for local AI model execution. All processing stays on your machine.
Connection
| Setting | Description | Default |
|---|---|---|
| Base URL | Ollama server address | http://127.0.0.1:11434 |
| Connection status | Shows if Ollama is reachable | Auto-detected |
Prerequisites (local mode only):
- Install Ollama from ollama.com
- Ensure Ollama is running (or start it with ollama serve)
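If the connection status shows Ollama as unreachable, you can verify the server outside Vocoding. A minimal sketch using Ollama's /api/tags endpoint (the function name and timeout are ours, not Vocoding's internals):

```python
import json
import urllib.error
import urllib.request

def list_local_models(base_url="http://127.0.0.1:11434"):
    """Return names of locally available models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None
```

A None result means the server is down or the Base URL is wrong; an empty list means Ollama is running but no models have been pulled yet.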
Model Manager
| Action | Description |
|---|---|
| Model list | Shows all locally available models with sizes |
| Pull input | Enter a model name to download (e.g., llama3.1:8b) |
| Delete | Remove a downloaded model |
| Progress bar | Shows download progress for new models |
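The same pull operation is available over Ollama's REST API, which is what a progress bar like the one above can stream from. A sketch (we assume /api/pull accepting a name field, which matches Ollama's documented schema at the time of writing; the helper names are ours):

```python
import json
import urllib.request

def build_pull_request(name, base_url="http://127.0.0.1:11434"):
    """Build the POST request that asks Ollama to download a model."""
    return urllib.request.Request(
        f"{base_url}/api/pull",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def pull_model(name, base_url="http://127.0.0.1:11434"):
    """Stream download progress; Ollama replies with one JSON object per line."""
    with urllib.request.urlopen(build_pull_request(name, base_url)) as resp:
        for line in resp:
            status = json.loads(line)
            # 'completed' and 'total' byte counts, when present, drive a progress bar
            print(status.get("status"), status.get("completed"), status.get("total"))
```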
Advanced Settings
| Setting | Description | Default |
|---|---|---|
| Keep Alive | How long model stays loaded in memory | 10 minutes |
| Reasoning Auto | Enable extended reasoning for supported models | ON |
| Persist Reasoning | Save reasoning history across sessions | OFF |
Keep Alive Options:
- 0 -- Unload immediately after each request
- 5m, 10m, 30m -- Unload after the chosen idle time
- -1 -- Keep loaded permanently (uses more RAM)
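These choices map onto the keep_alive field that Ollama accepts on each generate request. A sketch of a request body carrying the setting (keep_alive and the other fields follow Ollama's /api/generate schema; the choice table and helper are ours):

```python
import json

# UI choice -> value Ollama understands ("0" unloads now, "-1" pins the model in RAM)
KEEP_ALIVE_CHOICES = {
    "immediately": "0",
    "5 minutes": "5m",
    "10 minutes": "10m",
    "30 minutes": "30m",
    "forever": "-1",
}

def generate_payload(model, prompt, keep_alive="10m"):
    """JSON body for POST /api/generate with an explicit keep_alive."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    })

body = generate_payload("llama3.1:8b", "Hello", KEEP_ALIVE_CHOICES["forever"])
```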
Model Capabilities
Vocoding shows model capabilities with badges:
- Tools -- Can use web search and other tools
- Vision -- Can process images
- Thinking -- Supports extended reasoning mode
Groq
Ultra-fast cloud inference powered by custom LPU chips.
| Setting | Description |
|---|---|
| API Key | Your Groq API key |
| Model | Selected Groq model |
| Test Connection | Verify API key works |
Getting Started
- Visit console.groq.com
- Create account (free tier, no credit card)
- Generate an API key and paste it into Vocoding
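Once you have a key, you can sanity-check it outside the app. Groq exposes an OpenAI-compatible chat-completions endpoint; a minimal sketch (the model name and helper functions are illustrative assumptions, not part of Vocoding):

```python
import json
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_groq_request(prompt, model, api_key):
    """Build a chat-completion request for Groq's OpenAI-compatible API."""
    return urllib.request.Request(
        GROQ_URL,
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def groq_chat(prompt, model, api_key):
    """Send the request and return the model's reply text."""
    req = build_groq_request(prompt, model, api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

A 401 response here means the key itself is bad, which is the same thing Test Connection checks.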
Tiers
- Free: No credit card, rate-limited (requests/min and requests/day)
- Developer: Pay-as-you-go, higher rate limits
- Enterprise: Custom volume pricing
OpenRouter
Access 300+ models from multiple AI providers.
| Setting | Description |
|---|---|
| API Key | Your OpenRouter API key |
| Model | Selected model (searchable list) |
| Test Connection | Verify API key works |
Getting Started
- Visit openrouter.ai
- Create account
- Generate an API key and paste it into Vocoding
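OpenRouter uses the same OpenAI-compatible request shape as Groq, just with a different base URL, so a key check looks nearly identical (the helper name is ours; the model string is whatever you select in the searchable list):

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_openrouter_request(prompt, model, api_key):
    """Build a chat-completion request for OpenRouter's OpenAI-compatible API."""
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```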
Pricing
- Free: 25+ free models (no credit card), 50 requests/day
- Pay-as-you-go: Buy credits, 300+ models, no minimum spend
- Enterprise: Custom volume pricing with SLA
Ollama Cloud
Access state-of-the-art models without local hardware requirements.
| Plan | Price | Key Features |
|---|---|---|
| Free | $0/month | Basic cloud access, limited usage |
| Pro | $20/month | Multiple simultaneous cloud models, 3 private models |
| Max | $100/month | 5+ simultaneous models, 5x Pro usage, 5 private models |
Configure at ollama.com and add your account in Settings > LLM.
Fallback Behavior
When the provider is set to Auto, Vocoding tries providers in order:
- Groq (if configured)
- OpenRouter (if configured)
- Ollama (local or cloud, if available)
- Raw fallback (no optimization -- returns cleaned transcription)
If a provider fails or times out, Vocoding automatically falls back to the next available provider. The raw fallback always works as a last resort, ensuring you always get a result.
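The Auto chain above can be sketched as a loop over providers, with the raw transcription as the guaranteed last step (the provider callables here are illustrative stand-ins, not Vocoding's internals):

```python
def optimize(transcription, providers):
    """Try each configured provider in order; fall back to the raw transcription."""
    for provider in providers:
        try:
            result = provider(transcription)
            if result:
                return result
        except Exception:
            continue  # provider failed or timed out: try the next one
    return transcription.strip()  # raw fallback always succeeds

# Illustrative providers: one that always fails, one that answers.
def flaky(_text):
    raise TimeoutError("provider unavailable")

def echo_upper(text):
    return text.upper()
```

With providers=[flaky, echo_upper], the first call raises, the loop moves on, and echo_upper supplies the result; with only flaky configured, the cleaned raw text is returned instead.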