How To Use Free Cloud-Based LLMs via Ollama on Ubuntu

Large Language Models (LLMs) have become essential tools for developers, researchers, and hobbyists. Ollama is a free, open-source framework that simplifies running LLMs locally or connecting to cloud-hosted models. Its cloud service offers a free tier with usage limits, which suits learning, testing, and small projects. This guide walks through installing Ollama on Ubuntu Linux: downloading the package manually (to avoid interrupted GitHub downloads), launching the service, using free cloud models (such as Kimi K2.5), troubleshooting a common error, and exposing the service to other machines on your network.

1. Understanding Ollama and Preparing Your Environment

Before diving into commands, let’s clarify what Ollama is and why we’re using a manual installation method.

  1. What is Ollama? Ollama is a lightweight LLM runner that supports both local models (e.g., Llama, Mistral) and cloud-hosted models. It provides a simple CLI and API interface.
  2. Why manual installation? The official one-line installer (curl … | sh) downloads from GitHub while it runs. Where access to GitHub is unstable, that download often fails partway through. Fetching the .tar.zst archive from GitHub Releases ahead of time, on a machine with a reliable connection, avoids these interruptions.
  3. Preparation checklist:
    – Ubuntu Linux (physical or virtual machine)
    – Internet access to download files
    – USB drive (if transferring files between host and guest OS)
    – Basic terminal familiarity
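
Before downloading anything, it can help to confirm that the Ubuntu machine matches what the later steps rely on. The sketch below is an illustration rather than part of the original walkthrough: `missing_tools` is a hypothetical helper name, and the tool list (`tar`, `zstd`, `curl`) is an assumption based on the archive format and the checks used later in this guide.

```shell
# Hypothetical helper: print each named command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# The amd64 package targets 64-bit x86; uname -m reports x86_64 on such machines.
echo "Architecture: $(uname -m)"

# Assumed tool list: tar and zstd to unpack the .tar.zst, curl to test the service.
missing="$(missing_tools tar zstd curl)"
if [ -n "$missing" ]; then
  echo "Not installed:" $missing
else
  echo "All assumed tools are present."
fi
```

Anything reported as not installed can usually be added with `sudo apt install`, for example `sudo apt install zstd curl`.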

2. Downloading the Ollama Linux Package from GitHub

We’ll fetch the pre-compiled Linux binary from GitHub’s release page.

Steps:

  1. On your host machine (e.g., Windows), open a browser and go to: `https://github.com/ollama/ollama/releases`
  2. Locate the latest release (in the video: v0.20.6). Under “Assets,” click to download: `ollama-linux-amd64.tar.zst`
  3. Wait for the download to finish (~1.9 GB).
  4. Copy the file to a USB drive.
  5. Plug the USB drive into your Ubuntu machine and copy the file into Ubuntu’s file system.
  6. Tip: If Ubuntu doesn’t detect the USB drive in VMware, go to VM Settings → USB Controller and change the USB compatibility to USB 3.2.
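
A 1.9 GB file that travels over a flaky download and a USB stick is worth checking before you extract it. The following is a minimal sketch, assuming you have an expected SHA-256 digest for the archive (for example, one you computed on the host machine right after downloading). `verify_sha256` is a hypothetical helper name, and the digest in the usage line is a placeholder, not a real value.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
verify_sha256() {
  local file="$1" expected="$2" actual
  if [ ! -f "$file" ]; then
    echo "No such file: $file" >&2
    return 1
  fi
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}

# Usage (replace the placeholder with the digest you recorded on the host):
# verify_sha256 ~/bin/ollama-linux-amd64.tar.zst "<expected-sha256-digest>"
```

On Windows, `certutil -hashfile ollama-linux-amd64.tar.zst SHA256` prints a digest you can compare against.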

3. Extracting and Organizing Ollama in Ubuntu

Now we’ll extract the archive and prepare the executable.

Steps:

  1. In Ubuntu’s home folder, create a new directory called `bin` (to keep user-installed programs).
  2. Enter the `bin` folder and paste the `ollama-linux-amd64.tar.zst` file.
  3. Right-click the archive and select “Extract here” (or use `tar -xf` in a terminal, which needs the `zstd` package installed to read a .tar.zst file).
  4. After extraction, you’ll see a folder containing `bin` and `lib` subfolders.
  5. Go into the `bin` folder. You’ll find the `ollama` executable – this is the main program.
  6. No system-wide installation is required; you can run Ollama directly from this location.
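
If you prefer the terminal to the file manager, the same steps can be sketched as below. `extract_archive` is a hypothetical helper, and the destination `~/bin/ollama` in the usage lines is an assumed folder name chosen for illustration; the file manager may name the extracted folder differently.

```shell
# Hypothetical helper: unpack an archive into a destination directory.
# GNU tar detects the compression on its own when reading; for a .tar.zst
# archive it also needs the zstd program (sudo apt install zstd).
extract_archive() {
  local archive="$1" dest="$2"
  mkdir -p "$dest" || return 1
  tar -xf "$archive" -C "$dest"
}

# Usage, assuming the archive was copied into ~/bin:
# extract_archive ~/bin/ollama-linux-amd64.tar.zst ~/bin/ollama
# ls ~/bin/ollama                     # the guide expects bin and lib here
# ~/bin/ollama/bin/ollama --version   # confirms the binary runs
```

Extracting into a dedicated directory keeps the archive's own `bin` and `lib` folders from mixing with anything else you store in `~/bin`.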

4. Starting the Ollama Service and Verifying It Works

The Ollama service must be running before you can interact with any model.

Steps:

  1. Open a terminal in Ubuntu.
  2. Navigate to the directory containing the `ollama` binary, e.g.:
    cd /home/yourusername/bin/ollama.../bin
  3. Start the service:
    ./ollama serve
  4. Open Firefox and go to `http://127.0.0.1:11434`. You should see the message: `Ollama is running`
  5. Open another terminal and check the process:
    ps -ef | grep ollama
  6. You’ll see `./ollama serve` listed.
  7. Congratulations – Ollama is now running on your Ubuntu system.
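
The browser check can also be done from the terminal, which is handy on a machine without a desktop. This is a minimal sketch, assuming `curl` is installed; `wait_for_ollama` is a hypothetical helper name, and the log file name in the usage lines is an arbitrary choice.

```shell
# Hypothetical helper: poll a URL until the response contains "Ollama is running".
# Arguments: base URL, maximum number of attempts (about one second apart).
wait_for_ollama() {
  local url="$1" tries="${2:-10}" i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS --max-time 2 "$url" 2>/dev/null | grep -q "Ollama is running"; then
      echo "Ollama is running at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "No response from $url after $tries attempts" >&2
  return 1
}

# Usage, from the directory that contains the ollama binary:
# ./ollama serve > ollama.log 2>&1 &
# wait_for_ollama http://127.0.0.1:11434 15
```

Starting the service in the background with `&` frees the terminal for the next commands; the trade-off is that its output goes to the log file instead of the screen.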

5. Using Free Cloud-Based LLMs via Ollama

One of Ollama’s best features is the ability to use free cloud models (marked with `:cloud`) without needing powerful local hardware.

Steps:

  1. In the terminal (still inside the Ollama binary folder), run:
    ./ollama launch
  2. From the menu, select `chat with a model`.
  3. You’ll see a list of available models. Choose one with the `:cloud` suffix, e.g., `kimi-k2.5:cloud`.
  4. The terminal will prompt you to log in to your Ollama account.
    – A browser tab should open automatically for login. If not, copy the provided URL manually.
    – Remove any spaces from the URL before pasting.
    – Log in with your email and password (or sign up if you don’t have an account).
  5. After successful login, you can start chatting.

Example interaction:

  1. You ask: “What LLM are you using?”
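  2. Model replies: “I am Kimi, developed by Dark Side of the Moon Technology Co., Ltd., part of the Kimi K2.5 series.” (“Dark Side of the Moon” is a literal translation of the Chinese name of Moonshot AI, the company behind Kimi.)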
  3. You can continue asking questions like “Where are you running?” and the model will explain it runs on distributed cloud servers.

6. Fixing a Common Error – Switching from `launch` to `run`

The video demonstrates an error when using `launch` with some cloud models:

`error running model flag accessed but not defined verbose`

Why this happens:

  1. The `launch` command has parameter compatibility issues with certain cloud models.
  2. Solution: Use the `run` command instead, which directly specifies the model name.

Steps to fix:

  1. In the terminal, execute:
    ./ollama run kimi-k2.5:cloud
  2. Once connected, ask a question (e.g., “Hello, are you running in the cloud? Whose servers do you use?”).
  3. The model will answer normally without errors.
  4. Check processes with `ps -ef | grep ollama` – you’ll now see an additional `./ollama run …` process.
  5. Recommendation: Prefer `./ollama run` with an explicit model name whenever `launch` misbehaves or you want a repeatable command.
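
Because `run` takes the model name directly, it also lends itself to one-shot use: `ollama run` accepts a prompt as an extra argument and prints the answer instead of opening an interactive chat. The wrapper below is a sketch under assumptions: `ask_model` and the `OLLAMA_BIN` variable are hypothetical names introduced here, and the call only works while `./ollama serve` is running and, for `:cloud` models, after you have signed in.

```shell
# Hypothetical wrapper: send one prompt to a model through `ollama run`.
# OLLAMA_BIN is an illustrative variable; it defaults to ./ollama, matching
# the guide's habit of working inside the binary's folder.
ask_model() {
  local model="$1" prompt="$2"
  "${OLLAMA_BIN:-./ollama}" run "$model" "$prompt"
}

# Usage:
# ask_model kimi-k2.5:cloud "Hello, are you running in the cloud?"
```

Keeping the binary path in a variable means the same wrapper works whether you call it from the binary's folder or from elsewhere.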

7. Enabling Remote Access for Other Machines (e.g., OpenClaw)

If you want to use this Ollama instance from another computer or VM (like a Windows VM running OpenClaw), the service must listen on a network-reachable address. By default it only answers on 127.0.0.1, which other machines cannot reach.

Steps:

  1. Find your Ubuntu machine’s IP address:
    sudo apt install net-tools # if ifconfig is missing
    ifconfig
  2. Example output: `192.168.204.129`
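  3. In the terminal where you will start the service, set the environment variable so Ollama listens on all network interfaces:
    export OLLAMA_HOST=0.0.0.0:11434
  4. Restart the Ollama service from that same terminal, so the variable is picked up:
    – Stop the current `./ollama serve` (press Ctrl+C in its terminal, or run `pkill ollama`; fall back to `pkill -9 ollama` only if it refuses to exit).
    – Run `./ollama serve` again.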
  5. On the same Ubuntu machine, test access through the external address in a browser: `http://192.168.204.129:11434`. You should see `Ollama is running`.
  6. Now, on your Windows VM (or any other machine in the same network), configure your OpenClaw client to use `http://192.168.204.129:11434` as the Ollama API endpoint.
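  7. Note: Listening on 0.0.0.0 makes the service reachable from any machine that can reach this host, and the Ollama API has no built-in authentication. Keep it on a trusted local network. For internet access, add firewall rules and authentication.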
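
The remote check can be scripted from the client side as well. The sketch below assumes `curl` is available on the client and reuses the guide's example address; `ollama_base_url` is a hypothetical helper name, and `/api/tags` is the Ollama endpoint that lists the models the server knows about.

```shell
# Hypothetical helper: build the base URL a remote client should use.
# The port defaults to 11434, the one used throughout this guide.
ollama_base_url() {
  local host="$1" port="${2:-11434}"
  printf 'http://%s:%s\n' "$host" "$port"
}

# On the Ubuntu machine, the variable and the service share one command line:
# OLLAMA_HOST=0.0.0.0:11434 ./ollama serve
#
# To look up the machine's address without installing net-tools:
# hostname -I
#
# From another machine on the network (address taken from the example above):
# curl "$(ollama_base_url 192.168.204.129)"            # expect: Ollama is running
# curl "$(ollama_base_url 192.168.204.129)/api/tags"   # JSON list of models
```

Setting `OLLAMA_HOST` on the same line as `./ollama serve` applies it to that one process only, which avoids the common mistake of exporting the variable in one terminal and starting the service in another.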

8. Summary and Next Steps

You’ve successfully installed Ollama on Ubuntu Linux using a manual, network-resilient method. You’ve also learned how to:

  1. Start the Ollama service
  2. Use free cloud-based LLMs (Kimi K2.5 as an example)
  3. Fix the `launch` vs `run` error
  4. Expose Ollama to other machines on your network

What to explore next:

  1. Download local models (e.g., `ollama run llama3`)
  2. Integrate Ollama with Python using the `requests` library
  3. Use Ollama as a backend for AI assistants like Continue or OpenClaw

Thank you for following along. If you run into any issues, feel free to leave a comment. See you in the next video!

9. Demo Video

Watch the demo video below. You can switch the subtitles to your preferred language.
