Dubbed VMware Private AI Foundation with Nvidia, the offering is a single-stack product that provides enterprises with everything they need — from software to computing capacity — to fine-tune large language models and run private and highly performant generative AI applications on their proprietary data in VMware’s hybrid cloud infrastructure.
“Customer data is everywhere — in their data centers, at the edge, and in their clouds. Together with Nvidia, we’ll empower enterprises to run their generative AI workloads adjacent to their data with confidence while addressing their corporate data privacy, security and control concerns,” Raghu Raghuram, CEO of VMware, said in a statement.
However, the offering is still being developed and will launch sometime in early 2024, the companies said.
What will the fully integrated solution have on offer?
Today, enterprises are racing to build custom applications and services (like intelligent chatbots and summarization tools) driven by large language models. The momentum is such that McKinsey estimates gen AI could add up to $4.4 trillion annually to the global economy. However, in this race, many teams are working in fragmented environments and struggling to maintain high standards for the security of their data and the performance of the gen AI applications it powers.
With the new fully integrated suite, VMware and Nvidia are tackling this challenge by giving enterprises running VMware’s cloud infrastructure a one-stop shop to take any open model of their choice, whether it’s Llama 2, MPT or Falcon, and iterate on it to streamline the development, testing and deployment of their gen AI apps.
“It takes those models and provides all the power of Nvidia NeMo framework, which lets you take those models and helps you pre-tune and prompt-tune as well as optimize the runtime and results from gen AI workloads. It’s all built on VMware Cloud Foundation on our virtualized platform,” Paul Turner, VP of product management at VMware, said in a press briefing.
The NeMo framework is an end-to-end, cloud-native offering that combines customization frameworks, guardrail toolkits, data curation tools and pre-trained models to help enterprises deploy generative AI to production. Meanwhile, VMware Cloud Foundation is the company’s hybrid cloud platform, which enables enterprises to pull in their data and provides a complete set of software-defined services to run the developed applications.
The new offering preserves data privacy and lets enterprises run AI services adjacent to wherever their data resides. Nvidia’s infrastructure, meanwhile, supplies the compute, delivering performance equal to or even exceeding bare metal in some use cases. Multiple ecosystem OEMs will support this by launching Nvidia AI Enterprise Systems with Nvidia L40S GPUs (which enable up to 1.2 times more inference performance and up to 1.7 times more training performance than the Nvidia A100 Tensor Core GPU), BlueField-3 DPUs and ConnectX-7 SmartNICs to run VMware Private AI Foundation with Nvidia.
Turner noted that the solution can scale workloads up to 16 vGPUs/GPUs in a single virtual machine and across multiple nodes to speed fine-tuning and deployment of generative AI models.
“These models don’t just fit in a single GPU. They can need two GPUs, sometimes even four or eight, to get the performance that you need. But [with] our work together, we actually can scale that even up to 16. GPUs are all interconnected via direct-to-direct paths, GPU to GPU, using NVLink and NVSwitch and tying it in with VMware,” he said.
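To illustrate why a model may not fit in a single GPU, here is a rough back-of-the-envelope sketch. It is not VMware’s or Nvidia’s sizing method. It assumes 16-bit weights (2 bytes per parameter), the 48 GB of memory on an L40S, and a hypothetical helper name, and it rounds up to a power of two on the assumption that GPUs are usually grouped that way. It counts only model weights and ignores activations, caches and optimizer state, so real requirements would be higher, especially for fine-tuning.

```python
import math

def gpus_needed_for_weights(params_billions, bytes_per_param=2, gpu_memory_gb=48):
    """Hypothetical helper: GPUs needed just to hold the model weights.

    Assumptions (not from the article): 2 bytes per parameter (16-bit),
    48 GB per GPU (L40S), and rounding up to a power of two.
    """
    # 1 billion parameters at N bytes each is roughly N gigabytes.
    weight_gb = params_billions * bytes_per_param
    raw = max(1, math.ceil(weight_gb / gpu_memory_gb))
    # Round up to the next power of two (1, 2, 4, 8, 16, ...).
    count = 1
    while count < raw:
        count *= 2
    return count

# Llama 2 is published in 7B, 13B and 70B parameter sizes.
estimates = {size: gpus_needed_for_weights(size) for size in (7, 13, 70)}
print(estimates)
```

Under these assumptions, the two smaller Llama 2 models fit on one GPU, while the 70B model needs about 140 GB for weights alone and so spans four.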
In addition to this, VMware is building differentiated capabilities for the joint offering, including deep learning VMs that can fast-track the work of enterprises looking to build generative AI apps.
“We believe many customers will see the benefits of just being able to pop up and start VMs that are actually pre-prescribed with the right content. We’re also including a vector database, a Postgres with pgvector, that’s going to be built into this. The vector database is very useful as people build these models: you sometimes have fast-moving and changing information that you want to put into a vector database; think of it as a ‘lookaside buffer,’” Turner noted.
Work on VMware Private AI Foundation with Nvidia is ongoing, with the first AI-ready systems set to launch by the end of 2023 and the full-stack suite becoming available in early 2024.
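The “lookaside buffer” idea can be sketched in a few lines of Python. This is a toy in-memory stand-in, not the product’s implementation: the class name, the three-number embeddings and the sample text are all made up for illustration. A real deployment would store embeddings produced by a model in Postgres and let pgvector do the similarity search (its `<=>` operator orders rows by cosine distance).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class LookasideStore:
    """Hypothetical in-memory stand-in for a vector database table."""

    def __init__(self):
        self._rows = {}  # doc_id -> (embedding, text)

    def upsert(self, doc_id, embedding, text):
        # Fast-changing facts are overwritten in place; the model is not retrained.
        self._rows[doc_id] = (list(embedding), text)

    def nearest(self, query, k=1):
        # Rank every stored row by similarity to the query embedding.
        scored = [
            (cosine_similarity(query, emb), doc_id, text)
            for doc_id, (emb, text) in self._rows.items()
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return [(doc_id, text) for _, doc_id, text in scored[:k]]

store = LookasideStore()
store.upsert("price", [0.9, 0.1, 0.0], "Current list price: $120")
store.upsert("policy", [0.0, 0.2, 0.9], "Returns accepted within 30 days")
# The price changes; only the stored row is updated.
store.upsert("price", [0.9, 0.1, 0.0], "Current list price: $135")

top = store.nearest([1.0, 0.0, 0.0], k=1)
print(top)
```

The retrieved text would then be placed into the model’s prompt, so the answer reflects the latest stored fact even though the model itself is unchanged.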
Nvidia expects more than 100 servers that support VMware Private AI Foundation to be in the market from over 20 global OEMs, including Dell Technologies, Hewlett Packard Enterprise and Lenovo.