IPEX-LLM is a PyTorch library for running LLMs on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc A-Series, Flex and Max) with very low latency.
"ollama"
provider as follows:
config.yaml
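A minimal sketch of such an entry, assuming Continue's `models` list format (the `name` and `model` values are placeholders; use whichever model you have pulled with Ollama):

```yaml
models:
  - name: IPEX-LLM
    provider: ollama
    model: AUTODETECT
```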
To use an Ollama service hosted on a remote machine, set the environment variable `OLLAMA_HOST=0.0.0.0` before executing the command `ollama serve`, so that the service listens on all network interfaces rather than only on localhost.
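For example, on the remote machine (a Linux-style shell is assumed; Ollama listens on port 11434 by default):

```bash
# Make the Ollama service reachable from other machines on the network
export OLLAMA_HOST=0.0.0.0
ollama serve
```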
Then, in the Continue configuration, set `apiBase` to the IP address and port of the remote machine. That is, the `config.yaml` entry could look like:
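A sketch building on the entry above (the host in `apiBase` is a placeholder for the remote machine's address; 11434 is Ollama's default port):

```yaml
models:
  - name: IPEX-LLM
    provider: ollama
    model: AUTODETECT
    apiBase: http://your-remote-machine-ip:11434
```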
If you would like to preload the model before your first conversation with
that model in Continue, you could refer to
here
for more information.
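One common way to preload a model, assuming the standard Ollama REST API on its default port, is to send an empty generate request for that model on the machine running `ollama serve` (the model name below is a placeholder):

```bash
# Load the model into memory without generating any text
curl http://localhost:11434/api/generate -d '{"model": "llama3"}'
```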