Data processing units (DPUs) are having a moment.
As artificial intelligence (AI), high-performance computing (HPC), increasingly complex cloud architectures and security demands put more pressure on central processing units (CPUs) and graphics processing units (GPUs), DPUs have stepped up their game.
In baseball parlance, DPUs can be seen as a “utility player,” in that they can be deployed in different ways to support the workload-focused players – the CPUs and GPUs. DPUs trace their history to smart network interface cards (SmartNICs), which were initially designed to offload some network functions from the CPU to improve network traffic flow.
But traditional SmartNICs aren’t smart enough to process the high volumes and varied patterns of data that DPUs can. The democratization of AI through ChatGPT put a lot of pressure on the back end as front-end users consumed it faster and faster – once it hit a tipping point, everyone was using it. Not all DPUs are created equal, however, and AI is just one use case that’s driving their adoption.
One position DPUs can play is security. In September 2022, Nvidia introduced a DPU designed to implement zero-trust security in distributed computing environments, both in the data center and at the edge. Nvidia’s BlueField-2 DPUs were specifically tailored to be used in Dell PowerEdge systems with the aim of improving the performance of virtualized workloads based on VMware vSphere 8.
Nvidia’s BlueField-3, meanwhile, is the equivalent of 300 CPU cores, and it employs purpose-built accelerators to handle storage, security, networking and traffic steering.
The next generation of AI is AI agents talking to other AI agents, and DPUs play a key role in automation and scalability as humans increasingly interact with AI. Nvidia is building out a microservices platform, NIM, that leverages DPUs to orchestrate AI interactions, providing containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models.
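To make the NIM piece concrete: each NIM container typically exposes an OpenAI-compatible HTTP API, so talking to a self-hosted inferencing microservice looks like an ordinary REST call. The sketch below is a minimal illustration, assuming a NIM container is already running locally and listening on port 8000 – the port, endpoint path and model name (“meta/llama3-8b-instruct”) are illustrative placeholders, not a definitive recipe.

```python
# Minimal sketch: querying a locally hosted NIM inferencing microservice
# through its OpenAI-compatible chat endpoint. Assumes a NIM container is
# already running on localhost:8000; the model name is a placeholder.
import requests

response = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed local NIM endpoint
    json={
        "model": "meta/llama3-8b-instruct",  # illustrative model identifier
        "messages": [
            {"role": "user", "content": "In one sentence, what does a DPU do?"}
        ],
        "max_tokens": 64,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

The DPU sits below this application layer, handling the east-west network traffic, tenant isolation and storage I/O that fleets of such microservices generate, so the GPU can stay busy with the inference itself.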
The utility of DPUs in modern AI data centers has meant that players beyond semiconductor vendors are seeing the value of owning their own DPU technology. Microsoft acquired Fungible to integrate the startup’s DPU technology into its data center infrastructure – Microsoft was already using DPUs in Azure, and there are plenty of options to choose from. Aside from Nvidia, Intel, Broadcom and AMD all have DPU offerings. Amazon Web Services, meanwhile, has its own DPUs, dubbed “Nitro.”
AMD bulked up its DPU capabilities in 2022 with the acquisition of Pensando and its distributed services platform, which already had a footing in Azure, IBM Cloud and Oracle Cloud – the deal put AMD’s DPU in several hyperscale cloud providers. Pensando’s technology has also been incorporated into smart switches from Aruba and Cisco.
Taking on front-end networking functions to securely connect to the GPU continues to be a key role for DPUs, as well as enabling secure isolation in multi-tenant hyperscale cloud data centers. The DPU can also accelerate storage access and speed up data ingestion – the increased volumes of training data are driving speed and performance requirements, and hence more offloading to the DPU.
With the explosive growth of generative AI, networking is currently the bottleneck as the number of GPUs in data centers grows, which is why DPUs are essential for managing data traffic and taking on functions so the GPUs can focus on application workloads.
Read my full story on Fierce Electronics.
Gary Hilson is a freelance writer with a focus on B2B technology, including information technology, cybersecurity, and semiconductors.