Build an AI inference server on Ubuntu

Open source tools like Ollama and Open WebUI make it convenient to build a local LLM inference stack, giving you a ChatGPT-like experience on your own infrastructure. Whether you are a hobbyist, someone concerned about privacy, or a business looking to deploy LLMs on-premises, these tools can help you achieve that.

Prerequisites

We assume here that you are running an LTS version of Ubuntu (NVIDIA and AMD tooling is best supported on LTS releases) and that you have an NVIDIA or AMD GPU installed in your machine. If you don’t have a GPU, you can still follow this guide, but inference will be much slower, as it will run on the CPU. ...
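As a quick sanity check before starting, you can verify which GPU vendor the machine exposes. The `detect_gpu` helper below is a hypothetical sketch (not part of Ollama or Ubuntu tooling) that classifies `lspci` output:

```shell
#!/bin/sh
# detect_gpu: classify a PCI device listing as nvidia, amd, or none.
# Hypothetical helper for a pre-flight check; pass it "$(lspci)".
detect_gpu() {
  case "$1" in
    *[Nn][Vv][Ii][Dd][Ii][Aa]*) echo nvidia ;;
    *AMD*|*[Aa]dvanced\ [Mm]icro\ [Dd]evices*) echo amd ;;
    *) echo none ;;
  esac
}

# Usage: report whether GPU-accelerated inference is possible
if [ "$(detect_gpu "$(lspci 2>/dev/null)")" = none ]; then
  echo "No NVIDIA/AMD GPU found: inference will fall back to CPU"
fi
```

If the helper prints `none`, everything in the guide still works, just slower on CPU.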

December 13, 2025 · 6 min · Gauthier Jolly

How to install NVIDIA drivers on Ubuntu

Make sure the system is up-to-date

This section is important to avoid pulling DKMS NVIDIA drivers during the installation. First, make sure your server is up-to-date:

sudo apt update
sudo apt full-upgrade -y

If your system needs a reboot, reboot it before running:

sudo apt autoremove -y

Note: you can check whether your system needs a reboot by checking if this file exists: /var/run/reboot-required. ...
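The reboot check above can be wrapped in a small helper. `needs_reboot` is a hypothetical name for illustration; it simply tests for the flag file Ubuntu creates when an upgrade requires a restart:

```shell
#!/bin/sh
# needs_reboot: return 0 (true) if Ubuntu's reboot-required flag file exists.
# The optional argument overrides the path, which makes the helper testable.
needs_reboot() {
  [ -f "${1:-/var/run/reboot-required}" ]
}

# Usage: reboot only when the upgrade actually requires it
if needs_reboot; then
  echo "Reboot required before continuing"
fi
```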

February 12, 2025 · 3 min · Gauthier Jolly