Complete guide to setting up Ollama with Continue for local AI development. Learn installation, configuration, model selection, performance optimization, and troubleshooting for privacy-focused, offline coding assistance.
Use `ollama pull` instead of `ollama run` to download models. The `run` command starts an interactive session, which isn't needed for Continue.

Ollama model tags work as follows:

- `:latest` - Default version (used if no tag is specified)
- `:32b`, `:7b`, `:1.5b` - Parameter-count versions
- `:instruct`, `:base` - Model variants

If a model is listed as `deepseek-r1:32b` on Ollama's website, you must pull it with that exact tag. Using just `deepseek-r1` will pull `:latest`, which may be a different size.

The hub block `ollama/deepseek-r1-32b` configures Continue to use `model: deepseek-r1:32b`, but the actual model must be installed locally; otherwise requests fail with `404 model "deepseek-r1:32b" not found, try pulling it first`.
With `AUTODETECT`, Continue will dynamically populate the model list based on what's installed locally via `ollama list`. This is useful for quickly switching between models without manual configuration. For any roles not covered by the detected models, you may need to configure them manually.

You can also point Continue at another machine by setting `apiBase` to the IP address of the remote machine serving Ollama.
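Assuming Continue's YAML config format, both setups might be sketched like this (the IP address and model names are placeholders for your own):

```yaml
models:
  # Autodetect everything installed locally
  - name: Autodetect
    provider: ollama
    model: AUTODETECT
  # Ollama served from another machine on the LAN
  - name: Remote Ollama
    provider: ollama
    model: qwen2.5-coder:7b
    apiBase: http://192.168.1.50:11434
```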
To enable tool use, add `capabilities: [tool_use]` to your model config.

Recommended models:

- `qwen2.5-coder:7b` - Excellent for code completion
- `codellama:13b` - Strong general coding support
- `deepseek-coder:6.7b` - Fast and efficient
- `llama3.1:8b` - Latest Llama with tool support
- `mistral:7b` - Fast and versatile
- `deepseek-r1:32b` - Advanced reasoning capabilities
- `qwen2.5-coder:1.5b` - Lightweight and fast
- `starcoder2:3b` - Optimized for code completion

Performance tips:

- Use `ollama ps` to see memory usage
- Check the Ollama server logs to debug performance issues

A common pitfall: `ollama pull deepseek-r1` installs `:latest`, but the hub block expects `:32b`. Solution: always pull with the exact tag, e.g. `ollama pull deepseek-r1:32b`.
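The failure comes from default tag resolution. A minimal shell sketch of the rule (an illustration of the documented `:latest` fallback, not Ollama's actual source):

```shell
#!/bin/sh
# Sketch: an untagged model reference falls back to ":latest".
resolve_tag() {
  case "$1" in
    *:*) printf '%s\n' "$1" ;;         # explicit tag: keep as-is
    *)   printf '%s:latest\n' "$1" ;;  # no tag: default to :latest
  esac
}

resolve_tag deepseek-r1       # prints deepseek-r1:latest
resolve_tag deepseek-r1:32b   # prints deepseek-r1:32b
```

So a hub block pinned to `:32b` will never match a model pulled without a tag.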
If tool calls aren't working, add `capabilities: [tool_use]` to your model config. For connection problems:

- Check that the server is reachable: `curl http://localhost:11434`
- Check the service status: `systemctl status ollama` (Linux)
- To serve other machines, start Ollama with `OLLAMA_HOST=0.0.0.0:11434`

For performance problems, run `ollama ps` to see active models and memory usage, and adjust the `num_gpu` layers in the model configuration if GPU memory is the bottleneck.
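One way to adjust GPU offload is through a Modelfile. A sketch, assuming `num_gpu` sets the number of layers offloaded to the GPU and that 20 layers fit your VRAM (the layer count and derived model name are placeholders):

```
# Derive a lower-VRAM variant of an installed model
FROM qwen2.5-coder:7b
PARAMETER num_gpu 20
```

Build it with `ollama create qwen2.5-coder-lowvram -f Modelfile`, then reference `qwen2.5-coder-lowvram` in your Continue config.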