The race toward Confidential AI inference

For almost half a decade now, I have been working on Confidential Computing at Canonical. This position has given me a front-row seat to the evolution of Confidential Computing technologies and their applications. One of the most exciting applications is Confidential AI inference, which allows AI models to be hosted and executed in a way that keeps the user’s input data confidential, even from the service provider itself. While Apple has announced a partnership with Google to base its own models on Google Gemini, and while some might see this as a failure, it is worth noting that Apple Intelligence already has a meaningful legacy. ...

January 18, 2026 · 2 min · Gauthier Jolly

Build an AI inference server on Ubuntu

Open source tools like Ollama and Open WebUI are convenient for building local LLM inference stacks that let you create a ChatGPT-like experience on your own infrastructure. Whether you are a hobbyist, someone concerned about privacy, or a business looking to deploy LLMs on-premises, these tools can help you achieve that. Prerequisites: we assume here that you are running an LTS version of Ubuntu (NVIDIA and AMD tooling is best supported on LTS releases) and that you have a GPU installed on your machine (either NVIDIA or AMD). If you don’t have a GPU, you can still follow this guide, but inference will be much slower as it will run on the CPU. ...
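The kind of local stack the post describes can be sketched in a few commands. This is a minimal, hedged example assuming network access and Ollama's official install script; the model name `llama3` is illustrative, not prescribed by the post:

```shell
# Install Ollama via its official install script (assumes curl is available)
curl -fsSL https://ollama.com/install.sh | sh

# Pull an example model (illustrative choice; any supported model works)
ollama pull llama3

# Run a one-off prompt against the local model
ollama run llama3 "Summarize what an LLM inference server does."
```

Open WebUI can then be pointed at the local Ollama endpoint (by default `http://localhost:11434`) to provide the ChatGPT-like browser interface the post mentions.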

December 13, 2025 · 6 min · Gauthier Jolly