A couple of weeks after KubeCon + CloudNativeCon in London, all the recordings have been uploaded and can now be watched in retrospect. It’s a great time to browse through the talks and highlight some notable ones.
The CNCF also published a report at the start of the conference, outlining general shifts within the industry.
[A] new study identifies a shift from security concerns to collaboration and efficiency as the top priority in cloud native adoption, emphasizing the need for seamless teamwork and automation:
- Kubernetes adoption continues to grow, with 80% of organizations running it in production, up from 66% in 2023.
- CI/CD adoption is fueling faster releases, with 60% of organizations leveraging CI/CD for most or all applications.
- Security measures are improving, with 60% of organizations vetting open source projects for active communities, and 57% using automated tools to detect vulnerabilities.
- GitOps is gaining traction, with 77% of organizations adopting its principles for deployment.
- Serverless adoption remains split, with some expanding use while others step back due to cost and complexity.
- Service mesh adoption is declining, dropping from 50% in 2023 to 42% in 2024 due to operational overhead concerns.
All KubeCon EU talks are recorded and can be watched in the keynotes playlist and the talks playlist on YouTube.

1. The Next 10 Years of Cloud Native

Like last year’s event, which celebrated 10 years of Kubernetes, this year’s edition marked a similar anniversary: 10 years of the Cloud Native Computing Foundation, which has grown from 32 projects in 2019 to over 196 projects in early 2025. The maintainer summit on the Monday before the main conference had over 300 attendees. One of the themes was what lies ahead: what could the next ten years of cloud native and Kubernetes look like?
1.1. User Experience
Andrew Randall from Microsoft mentioned that in order to grow a larger user base for Kubernetes, it's key to improve the user experience. Kubernetes is not easy to use, and to reach a broader audience, projects need to become more approachable and intuitive. Maintaining the CLI and terminal-based workflows while also offering UIs that achieve the same functionality is essential for the next ten years of Kubernetes.
As an example, Windows only truly broke through once a GUI was added on top of the terminal. In the Linux world, there are distributions like Arch that work great but don’t reach as wide an audience as Fedora or Ubuntu ("Kubernetes is like Arch"). For Kubernetes, the project Headlamp could be the UI that elevates the user experience. Headlamp is also extensible and can be integrated into a self-service platform together with a company's CI.


Windows' breakthrough came with the UI & Kubernetes needs a UI.
1.2. Reliability, Hardware and Framework Orchestration
Jago Macleod from Google shared how they rationalize their open source engagement with Kubernetes. The talk reviewed Google’s history with Kubernetes up to the present day. Kubernetes’ declarative nature (e.g. reconciliation loops), its extensibility (e.g. CRDs), and its modularity (e.g. custom schedulers) were mentioned as the core reasons the project became so popular. Jago also shared Google’s vision for the future of Kubernetes and distilled it into three key areas.

- Improve reliability: focus on upgrades to make it easier to adopt new versions.
- Redefine Kubernetes' relationship with hardware: there is a lot of innovation happening in hardware.
- Break free from container-centric views toward framework orchestration: for example, enabling Ray for AI workloads or Slurm for HPC.

Jago compared this vision for Kubernetes in infrastructure to the hourglass model that describes the internet stack, where IP sits in the narrow middle between applications and protocols like HTTP above and Ethernet and the physical fibers below. Kubernetes could be a similar focal point for orchestrating any kind of infrastructure resource.
1.3. CNCF Organizational Restructuring

At the end of 2024 and the beginning of 2025, the CNCF TOC and the CNCF worked on organizational changes to the community to better reflect how the project landscape has changed over the last ten years. The foundation now hosts over 196 projects and with that a broader scope than before. New Technical Advisory Groups (TAGs) will be formed over the coming months in 2025 (slides). There was no dedicated talk on the subject; it was discussed on the hallway track, during community meetings prior to the event, and in GitHub issues.
1.4. NeoNephos Foundation

In the opening keynote, Vasu Chandrasekhara introduced the NeoNephos Foundation, launched under the Linux Foundation Europe to drive digital sovereignty in European cloud computing. Built on top of CNCF projects like Kubernetes, NeoNephos aims to empower organizations to run cloud-native workloads on their own hardware—outside the control of hyperscalers.
Initial contributions come from the ApeiroRA reference architecture, which focuses on projects close to the hardware layer. The goal: foster bare-metal tooling that leverages open cloud standards (OCI, CNI, KVM, etc.) to avoid vendor lock-in and preserve the freedom to make independent technical decisions.
There will be a side event during the Open Source Summit Europe this summer focused on NeoNephos projects.
2. Hardware, Resources and Self-Hosting
Another set of topics at KubeCon was hardware enablement and awareness. Talks focused on Dynamic Resource Allocation (DRA), a Kubernetes feature for better utilization of GPUs and TPUs that enables more efficient and user-friendly scheduling.
2.1. Dynamic Resource Allocation


Dynamic Resource Allocation activities broken down & allocation occupation on multi-host
Dynamic Resource Allocation (DRA) has been a focus area within the Kubernetes community for some time (since v1.26). The topic has previously appeared on the KubeCon keynote stage and continues to see active development. DRA aims to enable the use of new hardware resources within Kubernetes while improving overall usability and resource efficiency.
The primary driver is the increasing demand to host and orchestrate AI workloads, which rely on accelerators such as GPUs and TPUs. DRA brings new APIs: the ResourceSlice API to describe available devices and the ResourceClaim API to request them and assign them to pods.
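As a rough, hypothetical sketch of what this looks like in practice (assuming the v1beta1 DRA types from around Kubernetes 1.32 under k8s.io/api/resource/v1beta1; the device class name gpu.example.com is made up and would normally be published by a vendor's DRA driver), a ResourceClaim requesting a single device could be built like this:

```go
// Hypothetical sketch: building a ResourceClaim with the typed Go API,
// assuming the v1beta1 DRA types (resource.k8s.io/v1beta1, ~Kubernetes 1.32).
// Field names may differ in newer API versions.
package main

import (
	"fmt"

	resourcev1beta1 "k8s.io/api/resource/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	claim := &resourcev1beta1.ResourceClaim{
		TypeMeta: metav1.TypeMeta{
			APIVersion: "resource.k8s.io/v1beta1",
			Kind:       "ResourceClaim",
		},
		ObjectMeta: metav1.ObjectMeta{Name: "single-gpu"},
		Spec: resourcev1beta1.ResourceClaimSpec{
			Devices: resourcev1beta1.DeviceClaim{
				Requests: []resourcev1beta1.DeviceRequest{{
					// One device from a (hypothetical) DeviceClass whose
					// available devices are advertised via ResourceSlices.
					Name:            "gpu",
					DeviceClassName: "gpu.example.com",
				}},
			},
		},
	}

	out, err := yaml.Marshal(claim)
	if err != nil {
		panic(err)
	}
	// A Pod would reference this claim via spec.resourceClaims and
	// containers[].resources.claims to get the device assigned.
	fmt.Println(string(out))
}
```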
There are several other talks about DRA:
- GPU Sharing at CERN: Cutting the Cake Without Losing a Slice
- Taming the Beast: Advanced Resource Management With Kubernetes
Since DRA is primarily driven by AI, there is an interesting related discussion about AI in the industry by theCUBE. The discussion also addresses the sustainability challenges associated with AI, particularly its high demand for GPU/TPU resources, energy, and infrastructure in general.
2.2. Resource Awareness and Efficiency



Servers should be like "light switches" & Backstage can be used to communicate cost and resource use to the dev teams.
Another great talk, by Holly Cummins, focused on resource awareness. While the cloud might give the impression of improved resource efficiency, there's a common unresolved issue: the tendency to forget about servers and workloads. Servers end up "comatose", doing virtually nothing (no traffic), or simply underutilized. Holly connected this issue both to (financial) costs that could be better managed (FinOps) and to the environmental cost of wasted resources such as electricity, e-waste, water, and data center infrastructure (GreenOps).
Forgetting about workloads is a significant problem. Even though virtualization in the cloud should be more efficient, it also makes it easier to overlook the underlying infrastructure or unnecessary applications still running in the background. To address this, Holly recommended designing applications with elasticity in mind and managing infrastructure declaratively with practices like GitOps, approaches the cloud native community is already familiar with. She also suggested using chaos testing (see Chaos Mesh and Litmus), not just to assess service resiliency, but to evaluate whether services are actually in use and whether their impact is even noticeable.
2.3. Hardware part of the Software Testing Loop
Hardware is fun again. With cloud deployments pushing to the edge, into specialized environments, onto new hardware devices, and in general onto self-hosted servers with varying specs, Miguel suggested making hardware part of the software engineering culture: keeping hardware in the loop by embedding it in the development workflow and integrating hardware testing into the CI/CD pipeline.
Miguel shared Jumpstarter, an open-source framework for automating tests across both hardware and virtual environments. In his demo, he showed a test setup running on a machine with a camera that detects red and green light signals. The results were automatically captured and pushed back into the pipeline. Jumpstarter is built around cloud-native principles—it separates the test logic from the physical setup, so the same workflows can run locally, in CI, or in distributed environments.


Using Jumpstarter to test infrastructure on bare metal hardware in CI/CD & Jumpstarter Drivers snapshot overview
2.4. Self-Hosted Cloud Environments - End User Stories by LinkedIn and Saxo Bank
Every KubeCon features a couple of talks by companies about their journey adopting cloud native projects. In the past, Mercedes-Benz, among others, shared its story of hosting Kubernetes on bare metal. This time there were similar presentations by LinkedIn and Saxo Bank about self-hosting bare-metal infrastructure.
2.4.1. LinkedIn
LinkedIn shared their story of hosting bare-metal Kubernetes clusters. A lot of components are built in-house, including their own autoscaler, scheduler, and others, to run over 3,000 services on more than 500,000 servers (all bare metal). The Kubernetes deployment runs without further virtualization beyond containerization (no KVM or Xen hypervisor). They don't use kubeadm or CAPI and no Kubernetes distribution, just the "bare" upstream source components.


LinkedIn scale and stack.
LinkedIn has an interesting concept of "maintenance zones" (MZs) within their infrastructure. They operate 20 MZs and rotate through them, temporarily taking control of the machines away from customers (they call this a "disruption") to carry out maintenance work until the MZ is cleared and workloads are scheduled on it once again.


LinkedIn's concept of Maintenance Zones
2.4.2. Saxo Bank
Saxo Bank shared their story of looking into self-hosting their infrastructure: first, to have an exit plan away from public cloud providers (for compliance reasons); second, to improve the reliability of services; and lastly, to reduce the cloud bill. The migration is not complete yet, but for the services they were able to move, they cut their cloud infrastructure bill by 85% (a 1U rack server at ~20k EUR over 5 years ≈ 333 EUR/month). They also improved cluster creation time from ~30 minutes to ~2 minutes and raised their CIS Benchmark security score from 35% to 75%.

3. More Talks
With 379 talks during the main conference and 667 overall, including pre-events, there are plenty of other great talks to cover. This is a subset of some that stood out to me.
3.1. DNS (CoreDNS)
CoreDNS is the default DNS server in Kubernetes. It is written in Go and is easily extendable using plugins; this extensibility is one of the main reasons it is so popular. The talk showed an example of how that extensibility can be put to use.

John and Yong shared a common feature request for CoreDNS: using a different upstream DNS server depending on the source subnet of the request. If the traffic comes from 172.0.0.0/8 it should be routed to 1.1.1.1, otherwise to 8.8.8.8. This is not a built-in feature, but it can be achieved with plugins (see the demo repository).
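As a standalone illustration of the routing logic only (this is not the plugin code from the talk), a minimal Go sketch that picks an upstream resolver based on the client's source subnet could look like this:

```go
// Minimal sketch of subnet-based upstream selection; the subnets and
// upstreams mirror the example from the talk.
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"time"
)

// upstreamFor returns the upstream DNS server for a given client address.
func upstreamFor(client netip.Addr) string {
	internal := netip.MustParsePrefix("172.0.0.0/8")
	if internal.Contains(client) {
		return "1.1.1.1:53"
	}
	return "8.8.8.8:53"
}

func main() {
	client := netip.MustParseAddr("172.16.4.2")
	upstream := upstreamFor(client)

	// Resolve a name against the chosen upstream instead of the system resolver.
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, upstream)
		},
	}
	addrs, err := r.LookupHost(context.Background(), "kubernetes.io")
	if err != nil {
		panic(err)
	}
	fmt.Println("client", client, "-> upstream", upstream, "->", addrs)
}
```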
3.2. Quantum Computing
Emerging technologies like quantum computing are also discussed at every KubeCon. There was a good panel discussion on the topic, "Quantum-Ready Kubernetes: How Do We Get There?", and an approachable talk about the fundamentals of quantum computing with a focus on encryption.
3.3. WebAssembly (WASM)
WASM is a technology that compiles application code into a binary format that is universal across programming languages and can be executed at near-native speed in a sandboxed environment (deny by default). It’s one of those technologies that has been in active development for years and offers many benefits, prompting large companies and contributors to invest in its expansion. In some ways, WASM also suggests a shift in the cloud space away from purely container-centric packaging. There were several talks at KubeCon about WASM once again, but this one by Taylor and David provided a great introduction. WASM, and specifically the development around WASI, is a technology to look out for 🚀!
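As a small example of how approachable the toolchain has become (my own minimal sketch, not from the talk): Go can target WASI directly since Go 1.21 via the wasip1 port, and the resulting module runs in any standalone WASI runtime such as wasmtime:

```go
// Build and run (assuming wasmtime is installed):
//
//   GOOS=wasip1 GOARCH=wasm go build -o hello.wasm main.go
//   wasmtime hello.wasm
//
// The module runs sandboxed: it only sees the files, directories and
// environment the runtime explicitly grants it (deny by default).
package main

import (
	"fmt"
	"os"
)

func main() {
	fmt.Println("hello from a WASI module, args:", os.Args)
}
```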

3.4. Noisy Neighbors
In this talk, Jonathan gave an academic-style presentation referencing various studies in the field of noisy neighbors and how the space has evolved over the last decade to address the issue. Noisy neighbors are a significant problem, as even a small dip in performance can have a larger impact on user retention and sales than it might initially seem (he included a helpful list of studies on this topic).

One common method to address this is pinning processes to CPUs and adjusting their frequency, something containers don’t typically handle. Isolating memory bandwidth and cache also requires management outside of container runtimes; the Linux resctrl subsystem can help with this.
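For context, resctrl is exposed as a filesystem under /sys/fs/resctrl. The following is a hedged Go sketch of the typical flow (create a control group, restrict its L3 cache allocation, move a process into it); it assumes a CPU and kernel with resource control support (Intel RDT / AMD QoS), the filesystem already mounted, and root privileges, and the concrete bitmask is platform specific:

```go
// Hedged sketch of the resctrl workflow: create a group directory, write a
// cache allocation to "schemata", and move a PID into "tasks".
// The mask "L3:0=00ff" is only an example and depends on the platform.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	group := filepath.Join("/sys/fs/resctrl", "noisy-neighbor-demo")

	// Creating a directory under /sys/fs/resctrl creates a new control group.
	if err := os.Mkdir(group, 0755); err != nil && !os.IsExist(err) {
		panic(err)
	}
	// Restrict the group to a subset of L3 cache ways on cache domain 0.
	if err := os.WriteFile(filepath.Join(group, "schemata"), []byte("L3:0=00ff\n"), 0644); err != nil {
		panic(err)
	}
	// Move the current process into the group so the limit applies to it.
	pid := fmt.Sprintf("%d\n", os.Getpid())
	if err := os.WriteFile(filepath.Join(group, "tasks"), []byte(pid), 0644); err != nil {
		panic(err)
	}
	fmt.Println("process", os.Getpid(), "now runs in resctrl group", group)
}
```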
He also shared the unvariance collector project, which captures P95 and P99 latency at millisecond intervals to surface noisy neighbor issues.


Control over your machine increases your performance and can be more economical & measured noise in hyperscalers' VMs vs. hyperscalers' bare-metal machines.
3.5. High Performance Computing (HPC)
HPC operates on fundamentally different premises than the cloud. While the cloud assumes infinite tasks and finite resources, HPC is based on the opposite: finite tasks running on (nearly) infinite resources. However, this distinction is becoming less clear, especially with GPUs and other compute resources becoming scarce. Dealing with that kind of scarcity is not inherently natural to the cloud, but it is central to HPC.
Tim shared a project called Slinky that enables Slurm to be used with Kubernetes and integrates it properly into the ecosystem (monitoring, APIs, SDKs). Slurm is used by the majority of supercomputers and is one of the leading projects in the HPC space.
Tim’s goal with Slinky is to bridge the gap between these two paradigms and help move both forward. Slurm and Kubernetes are often run alongside each other, so integrating them is a compelling and logical step.


Slurm overview and Reasons to converge HPC and Cloud technologies.


