Running Local LLMs with Ollama
Large Language Models (LLMs) have revolutionized AI, but cloud-based solutions often come with privacy concerns and usage limitations. Ollama provides an elegant solution by enabling you to run LLMs locally on your machine. This guide will walk you through the entire process from installation to interaction.
What is Ollama?
Ollama is a lightweight, extensible framework that lets you run a wide range of open-source LLMs on your local hardware. It manages model weights and configurations, and exposes a simple API similar to OpenAI's.
Installation Process
Download Ollama
Open your terminal and execute the following command to download the latest version of Ollama:
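    # Linux x86-64 build; adjust the archive name for other platforms
    curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
This example assumes the Linux x86-64 tarball; on macOS or Windows you can instead grab the installer from the official download page.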
Install Ollama
Extract the downloaded archive into a system directory:
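    # Unpack the Ollama binary and its libraries into /usr (the prefix used in Ollama's Linux docs)
    sudo tar -C /usr -xzf ollama-linux-amd64.tgz
If you prefer another prefix such as /usr/local, pass it to the -C option instead.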
Starting the Ollama Service
Launch the Ollama service with this simple command:
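    ollama serve
By default the server listens on localhost at port 11434.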
Keep this terminal session active to maintain the service.
Verifying Your Installation
Open a new terminal window and verify that Ollama is properly installed and running:
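    ollama --version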
This command will display the installed version of Ollama, confirming successful installation.
Finding the Right Model
Ollama supports numerous models with different capabilities and sizes. Browse the Ollama Model Library to explore available options.
Downloading a Model
For this example, we'll download the gemma3:4b model:
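    ollama pull gemma3:4b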
The download process may take considerable time depending on your Internet connection speed and the model size.
Terminal-Based Conversation
Once your model is downloaded, you can start interacting with it directly through the terminal:
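    ollama run gemma3:4b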
This command launches an interactive chat session with your model. Press Ctrl-D (or type /bye) to exit the conversation when finished.
Programmatic Access via Python API
Ollama provides an OpenAI-compatible API, making it easy to integrate with existing applications and scripts.
Install a client library:
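    # One option is the official OpenAI Python client; the ollama package is an alternative
    pip install openai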
Use the client library:
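The sketch below points the OpenAI client at the local Ollama server; it assumes the service started earlier is still running on its default port and that gemma3:4b has been pulled. The API key can be any placeholder, since Ollama ignores it.

    from openai import OpenAI

    # Point the OpenAI-compatible client at the local Ollama server.
    # The api_key is required by the client but not checked by Ollama.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    response = client.chat.completions.create(
        model="gemma3:4b",
        messages=[
            {"role": "user", "content": "Explain in one sentence what Ollama does."},
        ],
    )

    print(response.choices[0].message.content)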