Running Local LLMs with Ollama

Large Language Models (LLMs) have revolutionized AI, but cloud-based solutions often come with privacy concerns and usage limitations. Ollama provides an elegant solution by enabling you to run LLMs locally on your machine. This guide will walk you through the entire process from installation to interaction.

What is Ollama?

Ollama is a lightweight, extensible framework for running open-source LLMs on your local hardware. It manages model weights and configurations and exposes a simple API that is compatible with OpenAI's.

Installation Process

Download Ollama

Open your terminal and download the latest Linux (x86-64) release of Ollama:

wget https://ollama.com/download/ollama-linux-amd64.tgz

Install Ollama

Extract the downloaded archive into /usr so that the ollama binary ends up on your PATH:

sudo tar -C /usr -xzvf ollama-linux-amd64.tgz

Starting the Ollama Service

Launch the Ollama service with this simple command:

ollama serve

Keep this terminal session active to maintain the service.
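
If you would rather not keep a terminal window open, one simple alternative (a sketch, not the only option) is to run the service in the background and capture its output in a log file:

nohup ollama serve > ollama.log 2>&1 &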

Verifying Your Installation

Open a new terminal window and verify that Ollama is properly installed and running:

ollama -v

This command will display the installed version of Ollama, confirming successful installation.
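
You can also check that the service itself is reachable. By default, Ollama listens on port 11434, so a plain HTTP request to the root path should return a short "Ollama is running" message:

curl http://localhost:11434/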

Finding the Right Model

Ollama supports numerous models with different capabilities and sizes. Browse the Ollama Model Library (https://ollama.com/library) to explore the available options.

Downloading a Model

For this example, we'll download the gemma3:4b model:

ollama pull gemma3:4b

The download process may take considerable time depending on your Internet connection speed and the model size.
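
Once the pull finishes, you can confirm that the model is available locally:

ollama list

This lists every downloaded model along with its size and when it was last modified.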

Terminal-Based Conversation

Once your model is downloaded, you can start interacting with it directly through the terminal:

ollama run gemma3:4b

This command launches an interactive chat session with your model. Press Ctrl-D to exit the conversation when finished.
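
For one-off prompts or shell scripting, you can also pass the prompt directly on the command line instead of opening an interactive session:

ollama run gemma3:4b "Explain what a context window is in one sentence."

The model prints its answer and the command exits.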

Programmatic Access via Python API

Ollama provides an OpenAI-compatible API, making it easy to integrate with existing applications and scripts.
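
Before reaching for a client library, you can exercise the API directly with curl. The OpenAI-compatible endpoint is served at http://localhost:11434/v1 by default:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gemma3:4b",
        "messages": [{"role": "user", "content": "Hello, how are you?"}]
      }'

The response is a JSON chat-completion object in the same shape that OpenAI's API returns.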

Install a client library:

pip install chat-completions-conversation

Use the client library:

1
2
3
4
5
6
7
8
9
10
from chat_completions_conversation import ChatCompletionsConversation

conversation = ChatCompletionsConversation(
api_key='',
base_url='http://localhost:11434/v1',
model='gemma3:4b'
)

conversation.send_and_receive_response('Hello, how are you?')
# Model returns a string
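
Because Ollama follows the OpenAI chat-completions convention, any OpenAI-compatible client can be pointed at the same base_url and model name in the same way; the client library used here is just one example.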
