Part 3 - Choosing and Managing Your AI Models in Ollama
Now that Ollama is up and running on your system, it’s time for the exciting part: choosing which AI models will power your personal lab. Think of this as assembling your team of AI assistants, each with different specialties and capabilities. Let’s dive into the world of Llama models and how to manage them effectively.
The Llama Family: Your AI Workhorses
Llama models, developed by Meta, represent some of the most capable open-source AI models available today. Within the Llama family, there are several variations to choose from, each with its own strengths:
- Llama 3.3: A state-of-the-art 70B model that offers performance comparable to the much larger Llama 3.1 405B model.
- Llama 3.2: Meta's small-model family, available in 1B and 3B sizes for resource-constrained machines.
- Llama 3: An earlier generation, billed by Meta as the most capable openly available LLM at the time of its release.
- CodeLlama: A large language model that can use text prompts to generate and discuss code.
Each model also comes in different sizes, with larger models offering more capabilities but requiring more system resources.
Pulling Your First Model
Getting a model is surprisingly simple with Ollama. Open your terminal and type:
ollama pull llama3.1
This single command downloads the standard Llama 3.1 model (8B parameters). You'll see a progress bar as Ollama fetches the model files and prepares them for use on your system.
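Once the download completes, you can confirm the model is ready and inspect its details with the show command:
ollama show llama3.1
This prints the model's architecture, parameter count, quantization level, and context length.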
Want to be more specific about which variant you want? Use tags:
ollama pull llama3.2 # Standard 3B parameter model
ollama pull llama3.2:1b # Smaller 1B parameter model
ollama pull llama3.1:405b # Massive 405B parameter model (serious hardware required)
ollama pull llama3.2-vision:90b # 90B parameter vision model
For machines with limited memory, a small quantized model is the way to go:
ollama pull llama3.2:1b # 1B parameter model, 4-bit quantized by default
This model uses roughly 1.3GB of RAM, compared to the 16GB+ a full-precision 8B model would require, making it accessible even on machines with modest specifications.
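If you want finer-grained control, most models in the Ollama Library also publish explicit quantization tags. The tags below are a pattern taken from the llama3.1 listing; exact names and sizes may change over time, so verify them in the library first:
ollama pull llama3.1:8b-instruct-q4_K_M # 4-bit quantization, ~4.9GB
ollama pull llama3.1:8b-instruct-q8_0 # 8-bit quantization, ~8.5GB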
⚠️ This list will probably go stale quickly, so check the Ollama repo for the latest versions and models ⚠️
Managing Your Model Library
As you experiment with different models, you’ll want to know what’s available in your local collection:
ollama list
This displays all your downloaded models along with their file sizes and when they were last modified.
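The output looks something like this (illustrative only; your names, IDs, and dates will differ):
NAME              ID              SIZE      MODIFIED
llama3.2:1b       baf6a787fdff    1.3 GB    2 days ago
llama3.1:latest   46e0c10c039e    4.9 GB    3 weeks ago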
Need to free up space? Remove models you’re not using:
ollama rm llama3.3:latest
For a comprehensive browsable catalog of all models available through Ollama, visit the Ollama Library. This excellent resource lets you explore different model families, view their capabilities, and discover community-contributed models beyond just the Llama family.
Model Storage Considerations
Models can be quite large, especially if you’re experimenting with several variants:
- 7B/8B parameter models: ~4GB (quantized) to ~16GB (full precision)
- 13B parameter models: ~8GB (quantized) to ~26GB (full precision)
- 70B parameter models: ~40GB (quantized) to ~140GB (full precision)
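If you're wondering where those numbers come from, there's a simple rule of thumb: parameter count (in billions) times bits per weight, divided by 8, gives the approximate weight size in gigabytes, before runtime overhead. A quick shell sanity check:
# 70B parameters at 4 bits per weight ≈ 35GB of weights (plus overhead)
echo $((70 * 4 / 8))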
By default, Ollama stores models in:
- macOS: ~/.ollama/models/
- Linux: ~/.ollama/models/ (or /usr/share/ollama/.ollama/models/ when Ollama runs as the systemd service)
You can check your current storage usage with:
du -sh ~/.ollama/models/
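And compare that against the free space on the same volume:
df -h ~ # free space on the volume holding your home directory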
If you’re running low on disk space, focus on quantized versions of smaller models, or consider using an external drive for model storage by creating a symbolic link:
# First, move your existing models
mv ~/.ollama/models /path/to/external/drive/ollama-models
# Then create a symbolic link
ln -s /path/to/external/drive/ollama-models ~/.ollama/models
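Alternatively, Ollama supports an OLLAMA_MODELS environment variable that points it at any directory you like, with no symlink required. A minimal sketch, assuming you restart Ollama afterwards so it picks up the change:
# Add to your shell profile so it persists across sessions
export OLLAMA_MODELS=/path/to/external/drive/ollama-models
On Linux, if Ollama runs as a systemd service, set the variable in the service's environment (for example via systemctl edit ollama) rather than in your shell profile.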
Choosing the Right Model for Knowledge Base Applications
For building a knowledge base Q&A system, which we’ll explore in later posts, here are some recommendations:
- For machines with 8GB RAM: llama3.2:1b - Good balance of capability and efficiency
- For machines with 16GB RAM: llama3.2 or llama3.1 (8B) - Better quality while still fitting comfortably in memory
- For machines with 32GB+ RAM: llama3.3 - Best quality responses for complex questions (keep in mind the quantized 70B weighs in around 40GB, so more headroom helps)
The larger models typically provide more accurate information and better reasoning for complex queries, but the 8B variants are still remarkably capable for most knowledge base applications.
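Before wiring anything into a knowledge base, give your chosen model a quick smoke test. ollama run accepts a one-shot prompt straight from the command line:
ollama run llama3.2:1b "Explain what a symbolic link is in one sentence."
If you get a sensible answer back in a reasonable time, the model is a good fit for your hardware.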
With your models selected and downloaded, you’re now ready to start running them and building your knowledge base system. In our next post, we’ll explore how to interact with your models and optimize their performance for your specific needs.