Local LLMs - Ollama and VSCode
Introduction
I had some time to play around with running LLMs locally and hooking those models up to VSCode so I could easily use them while coding: explaining code, creating unit tests, and code completion.
This short blog post explains an easy way to get up and running fast using Ollama and the CodeGPT extension.
I am assuming you are running Ollama on a Linux host.
Ollama
Ollama is a tool that you can use to run open-source large language models, such as Llama 2, locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage.
The easiest way to get started is to download the single binary and run it.
Download the Ollama binary and make it executable (writing to /usr/bin requires root):
sudo curl -L https://ollama.ai/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
Run Ollama. It will listen on localhost port 11434 by default.
ollama serve
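To check that the server is actually up, you can hit its root endpoint, which answers with a short status message. A minimal sketch in Python, assuming the default address of localhost:11434:

import urllib.request

# The root endpoint of a running Ollama server replies with "Ollama is running".
with urllib.request.urlopen("http://localhost:11434/") as resp:
    print(resp.read().decode())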
Ollama models
Ollama needs to download a model before it can do anything useful. You can browse the models available for download in the Ollama model library (https://ollama.ai/library).
With the Ollama server running, download a model. I am pulling the Python variant of the WizardCoder model as an example.
ollama pull wizardcoder:python
Other Ollama commands
List all Ollama models available on your server.
ollama list
NAME                    ID              SIZE    MODIFIED
codellama:7b-instruct   8fdf8f752f6e    3.8 GB  2 days ago
codellama:7b-python     120ca3419eae    3.8 GB  2 days ago
codellama:latest        8fdf8f752f6e    3.8 GB  2 days ago
mistral:instruct        1ab49bc0b6a8    4.1 GB  2 days ago
mistral:latest          1ab49bc0b6a8    4.1 GB  2 days ago
wizardcoder:7b-python   de9d848c1323    3.8 GB  2 days ago
wizardcoder:latest      de9d848c1323    3.8 GB  2 days ago
wizardcoder:python      de9d848c1323    3.8 GB  2 days ago
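The same list is also available over Ollama's REST API. A small sketch, assuming the default endpoint; GET /api/tags returns the installed models as JSON:

import json
import urllib.request

# /api/tags lists the locally available models.
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    tags = json.load(resp)

for model in tags["models"]:
    print(model["name"])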
Play around with some prompts from the CLI
ollama run wizardcoder:python
>>> Can you help me with some Python code?
Sure, what kind of Python code would you like me to help you with?
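You can send the same kind of prompt over the REST API instead of the CLI. A minimal sketch using the /api/generate endpoint; with "stream" set to false the server returns a single JSON object whose "response" field holds the answer:

import json
import urllib.request

payload = json.dumps({
    "model": "wizardcoder:python",
    "prompt": "Can you help me with some Python code?",
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

# Send the prompt and print the model's answer.
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])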
VSCode and CodeGPT
Now that we have Ollama up and running, let's install the VSCode extension called CodeGPT and connect it to our model. If Ollama is running on a remote server, make sure to make it available locally first.
SSH and make Ollama available locally
As CodeGPT cannot be configured to talk to a remote Ollama server, we need to make the Ollama server available locally. SSH port forwarding will do.
ssh -L 11434:localhost:11434 user@your_server
CodeGPT
CodeGPT is an extension for VSCode that can connect to different LLMs. We will use it to connect to our local LLM.
- Install the CodeGPT extension in VSCode
  - View -> Extensions, search for 'CodeGPT' and click Install
- Configure the CodeGPT extension
  - Click the configuration icon in the Extensions view and choose Extension Settings
  - Set Code GPT: Api Key to Ollama
  - Set Code GPT: Model to wizardcoder:python
Now you can try it out on some Python code. Open a Python file, select some code, right-click and choose 'Code GPT: Code Explanation'.
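If you need something to test on, a throwaway function like the one below works well; the function itself is just an arbitrary example. Paste it into a file, select it, and ask for an explanation:

# A small function to try "Code GPT: Code Explanation" on.
def fizzbuzz(n):
    # Return the FizzBuzz sequence for the numbers 1..n.
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out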