Getting Started with Ollama

Ollama makes it easy to run open-weight AI models locally on your machine. Follow this guide to get up and running in minutes.

1. Install Ollama

macOS

Download from the official website or install via Homebrew:

brew install ollama

Linux

Use the official install script:

curl -fsSL https://ollama.com/install.sh | sh

Windows

Download the installer from ollama.com/download and run the setup wizard.
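
Whichever installer you used, you can confirm that Ollama is on your PATH by checking the version from a terminal:

ollama --version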

2. Run Your First Model

Once installed, you can run a model with a single command. Let's start with Llama 3.2 3B — a small, fast model that runs on virtually any modern hardware:

ollama run llama3.2:3b

Ollama will automatically download the model (about 2 GB) and start an interactive chat session. Type your message and press Enter to chat.
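
You can also pass a prompt directly on the command line for a one-off answer instead of an interactive session (inside the chat itself, type /bye to exit). For example:

ollama run llama3.2:3b "Explain what a context window is in one sentence."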

3. Try Other Models

Here are some popular models to try, roughly ordered by VRAM requirements:

Command | Model | Min VRAM | Best For
ollama run llama3.2:1b | Llama 3.2 1B | ~2 GB | Quick tasks, low-end hardware
ollama run llama3.2:3b | Llama 3.2 3B | ~3 GB | Good starting point
ollama run llama3.1:8b | Llama 3.1 8B | ~6 GB | Great all-rounder
ollama run qwen2.5-coder:7b | Qwen 2.5 Coder 7B | ~6 GB | Code generation
ollama run deepseek-r1:14b | DeepSeek R1 14B | ~10 GB | Reasoning, math
ollama run llama3.3:70b | Llama 3.3 70B | ~43 GB | Best open-weight quality
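
The VRAM figures are rough estimates for the default quantized builds. Once a model is downloaded, you can inspect its parameter count, context length, and quantization with ollama show, for example:

ollama show llama3.1:8b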

4. Check Your Hardware

Not sure what your hardware can run? Use our compatibility checker:

Check Compatibility →

Useful Commands

ollama list

List all downloaded models

ollama pull llama3.1:8b

Download a model without starting a chat

ollama rm llama3.1:8b

Delete a downloaded model to free space

ollama serve

Start the Ollama API server (runs on port 11434)
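
With the server running, any HTTP client can talk to it. As a quick smoke test, you can call the generate endpoint with curl (this assumes llama3.2:3b has already been pulled):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

The reply comes back as JSON, with the generated text in the response field.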