Matt Pryor

Senior Cloud Native Platform Engineer @ Nscale

Session

The Fast and the Flexible: bare-metal performance for HPC and AI with the flexibility of cloud

Tuesday, October 28, 2025, 15:10 - 15:50, Hovedsalen (1st floor)

Presentation (40 min), Intermediate, English

Users running HPC simulations and AI models need the performance of bare metal. However, with the explosion in the diversity and scale of these workloads over recent years, the only way to serve all of the different use cases is with the flexibility of cloud. Historically, that flexibility has come with a performance penalty introduced by virtualization.

At Nscale, we give our customers bare-metal performance with the flexibility of cloud by taking advantage of open-source technologies. In this session, we will show how Nscale uses a Kubernetes underlay to provide high-performance virtual Kubernetes clusters that are available within a few minutes and provide direct access to hardware. We will show how we layer Slinky from SchedMD on top of this to provide a familiar Slurm interface for users. Finally, we will show how we benchmark from top to bottom to verify that our customers are getting the performance they expect, and how we build that benchmarking into the lifecycle of the hardware that underpins our platform.