Blog
Rob Gibbon

28 posts

Rob has 20+ years' industry experience building, scaling, managing and serving the teams, technology and environments behind more than 50 commercial web properties and data hubs across all major industries, in a variety of roles. Rob brings both deep commercial and technical expertise to the product leadership team at Canonical. When he's not busy making great product, Rob is out running, reading and thinking, or just being.

Rob Gibbon
28 May 2026

Migrating from Apache Spark 3 to Spark 4

Data Platform Ubuntu tech blog

The purpose of this guide is to highlight the key differences between Apache Spark 3 and Spark 4, and provide advice on how to plan a migration. Let’s get started. The biggest changes: Let’s talk about the biggest changes between Apache Spark 3.x and Spark 4. Scala 2.12 no more: First up, there’s no support ...

Rob Gibbon
27 April 2026

Understanding disaggregated GenAI model serving with llm-d

AI Ubuntu tech blog

What is llm-d? llm-d is an open source solution for managing high-scale, high-performance Large Language Model (LLM) deployments. LLMs are at the heart of generative AI – so when you chat with ChatGPT or Gemini, you’re talking to an LLM. Simple LLM deployments – where an LLM is deployed to a single server – can ...

Rob Gibbon
20 April 2026

Hybrid search and reranking: a deeper look at RAG

AI Ubuntu tech blog

Many of us are familiar with the retrieval augmented generative AI (RAG) pattern for building agentic AI applications – like digital concierges, frontline support chatbots and agents that can help with basic self-service troubleshooting. At a high level, the flow for RAG is fairly clear – the user’s prompt is augmented with some relevant ...

Rob Gibbon
15 October 2024

Apache Spark 4.0 beta release – try it now

Data Platform Ubuntu tech blog

Apache Spark is a popular framework for developing distributed, parallel data processing applications. Our solution for Apache Spark on Kubernetes has made significant progress in the past year since we launched, adding support for Apache Iceberg, a new GPU accelerated image using the NVIDIA Spark-RAPIDS plugin, and support for the Volcan ...

Rob Gibbon
15 July 2024

Deploying and scaling Apache Spark on Amazon AWS EKS

Data Platform Ubuntu tech blog

Move over Hadoop, it’s time for Spark on Kubernetes. Apache Spark, a framework for parallel distributed data processing, has become a popular choice for building streaming applications, data lake houses and big data extract-transform-load data processing (ETL). It is horizontally scalable, fault-tolerant, and performs well at high scale. H ...

Rob Gibbon
23 May 2024

Can it play Doom? Running an AI LAN party on a Spark cluster with ViZDoom

AI Ubuntu tech blog

It’s all about AI these days, so I decided to try and answer the important question: can you make a Spark cluster run AI agents that play a game of Doom, in a multiplayer LAN party? Although I’m no data scientist, I was able to get this to work and I’ll show you how so ...

Rob Gibbon
14 May 2024

Deploy an on-premise data hub with Canonical MAAS, Spark, Kubernetes and Ceph

AI Ubuntu tech blog

Download the Spark reference architecture guide. In this post we’ll explore deploying a fully operational, on-premise data hub using Canonical’s data centre and cloud automation solutions MAAS (Metal as a Service) and Juju. MAAS is the industry standard open source solution for provisioning and managing physical servers in the data centre. ...

Rob Gibbon
22 February 2024

Migrating from Cloudera to a modern data hub architecture

Data Platform Ubuntu tech blog

In the early 2010s, Apache Hadoop captured the imagination of the tech community. A free and powerful open source platform, it gave users a way to process unimaginably large quantities of data, and offered a dazzling variety of tooling to suit nearly every use case – MapReduce for odd jobs like processing of text, audio ...

Rob Gibbon
12 December 2023

Announcing the Charmed Kafka beta

Data Platform Ubuntu tech blog

Charmed Kafka is a complete solution to manage the full lifecycle of Apache Kafka. The Canonical Data Fabric team is pleased to announce the first beta release of Charmed Kafka, our solution for Apache Kafka®. Apache Kafka® is a free, open source message broker for event processing at massive scale. Kafka is ideal for building ...

Rob Gibbon
17 October 2023

Why we built a Spark solution for Kubernetes

Data Platform Ubuntu tech blog

We’re super excited to announce that we have shipped the first release of our solution for big data – Charmed Spark. Charmed Spark packages a supported distribution of Apache Spark and optimises it for deployment to Kubernetes, which is where most of the industry is moving these days. Reimagining how to work with big data ...

Rob Gibbon
10 August 2023

Write a Spark big data job with ChatGPT

AI Ubuntu tech blog

I’ve read and watched more than a few articles about ChatGPT in the last couple of months. It seems the large language model AI hype machine just can’t stop. As somebody with a passion for music production, some of the more interesting things I’ve seen included a guy using ChatGPT to build a virtual effect ...
