Ollama is an open-source tool that allows you to run large language models (LLMs) locally on your own computer. To use Ollama, install it here, then download the model you want to run with the ollama run command.
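For example, assuming Ollama is already installed, the following command downloads the Llama 3.1 8B model (if it isn't present locally yet) and starts an interactive session with it:

ollama run llama3.1:8b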

How to Set Up Ollama Chat Models

We recommend configuring Llama3.1 8B as your chat model.
config.yaml
models:
  - name: Llama3.1 8B
    provider: ollama
    model: llama3.1:8b

How to Configure Ollama Autocomplete Models

We recommend configuring Qwen2.5-Coder 1.5B as your autocomplete model.
config.yaml
models:
  - name: Qwen2.5-Coder 1.5B
    provider: ollama
    model: qwen2.5-coder:1.5b-base
    roles:
      - autocomplete
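Note that the base model tag has to be available locally before autocomplete can use it. Assuming a standard Ollama installation, it can be downloaded with:

ollama pull qwen2.5-coder:1.5b-base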

How to Set Up Ollama Embeddings Models

We recommend configuring Nomic Embed Text as your embeddings model.
config.yaml
models:
  - name: Nomic Embed Text
    provider: ollama
    model: nomic-embed-text
    roles:
      - embed
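Likewise, the embeddings model needs to be pulled before it can be used, for example:

ollama pull nomic-embed-text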

Ollama Reranking Model Availability

Ollama currently does not offer any reranking models. Click here to see a list of reranking model providers.

How to Configure Remote Ollama Instance

To configure a remote instance of Ollama, add the "apiBase" property to your model in config.yaml:
config.yaml
models:
  - name: Llama3.1 8B
    provider: ollama
    model: llama3.1:8b
    apiBase: http://<my endpoint>:11434
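For this to work, the remote machine must be running Ollama and listening on an address you can reach. By default Ollama binds to 127.0.0.1:11434, so on a typical Linux host you would start the server with something like:

OLLAMA_HOST=0.0.0.0 ollama serve

OLLAMA_HOST controls the interface (and optionally the port) the Ollama server listens on; 11434 is the default port, which is why it appears in the apiBase example above.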

How to Configure Model Capabilities in Ollama

Ollama models usually have their capabilities auto-detected correctly. However, if you're using custom model names, or tool calls and image inputs aren't working as expected, you can set capabilities explicitly:
config.yaml
models:
  - name: Custom Vision Model
    provider: ollama
    model: my-custom-llava
    capabilities:
      - tool_use      # Enable if your model supports function calling
      - image_input   # Enable for vision models like llava
Most standard Ollama models (like llama3.1, mistral, etc.) support tool use by default. Vision models (like llava) also support image input.
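If you're unsure what a particular model supports, one way to check is to inspect it locally, for example:

ollama show llama3.1:8b

This prints the model's metadata, and recent Ollama versions include a capabilities listing (such as completion and tools) that should line up with what you configure here.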