HPE has become the latest high-performance computing vendor to step into the foundation model artificial intelligence space with a new cloud-based offering. The service, dubbed GreenLake for LLMs, is built on HPE Cray XD supercomputers. It comes as Cisco also looks to capitalise on generative AI with the launch of new networking chips pitched at the growing AI supercomputer market.
Announced at the company’s Discover 2023 conference, taking place in Las Vegas this week, HPE GreenLake for LLMs is an attempt to go up against cloud hyperscalers Amazon Web Services, Microsoft Azure and Google Cloud, all of which offer AI services to clients. HPE says the service will roll out to customers in North America this year and in Europe in 2024.
HPE says its approach is to focus specifically on AI rather than shoehorn AI into existing cloud infrastructure. In a typical cloud data centre, physical servers are sliced into virtual machines, but this model works poorly for AI, which needs multiple GPUs and machines running together to meet the compute requirements of serving a large model. The company says its supercomputer systems are built for exactly that kind of workload.
GreenLake will run large language models on what the company describes as an AI-native architecture. This architecture and the related software have been “uniquely designed to run a single large-scale AI training and simulation workload, and at full computing capacity”. The system effectively functions as one large distributed computer, spreading a single workload across hundreds or thousands of CPUs and GPUs at once, similar to the approach major AI labs such as OpenAI and Anthropic take when training foundation models.
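For readers unfamiliar with the technique, the sketch below shows what spreading one training job across many GPUs looks like in practice. It is a minimal, generic example using PyTorch’s DistributedDataParallel with a placeholder model and random data, not HPE’s actual software stack: frameworks like this run one process per GPU and average gradients across all of them after every step, so the fleet of devices behaves as a single machine.

```python
# Minimal sketch of data-parallel training across many GPUs.
# Generic PyTorch DistributedDataParallel (DDP) example; the model,
# data and hyperparameters are placeholders, not HPE's stack.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE; one process per GPU.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model standing in for an LLM.
    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    # DDP all-reduces gradients across every rank after each backward
    # pass, so all GPUs finish each step with identical weights.
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        # Each rank trains on its own shard of the data (random here).
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()  # gradient all-reduce happens inside DDP
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Launched with, say, `torchrun --nproc_per_node=8 train.py` on one node (and `--nnodes` to span more), the same script scales from eight GPUs to thousands, which is the property that makes purpose-built interconnects and supercomputer topologies matter for this class of workload.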
The platform runs on HPE’s Cray XD supercomputers, which use Nvidia H100 GPU accelerators. Initially hosted in a Canadian data centre, it will be offered “as-a-service” to enterprise clients and will eventually include a library of third-party and open-source AI models.
Executive vice-president Justin Hotard told reporters: “We see it as complementary and very different than what our fellow cloud partners provide. It’s not trivial or freely accessible.” The service will come with a number of large language models that can be trained, tuned and deployed using a “multi-tenant instance” of the Cray supercomputing platform.
The first LLM to be deployed on the GreenLake platform comes from German AI start-up Aleph Alpha, which provides text and image processing and analysis. HPE says it will deploy domain-specific AI applications in the future, including for climate modelling, healthcare and financial services.
Aleph Alpha CEO Jonas Andrulis said the model running on GreenLake had been built specifically for high-risk industries such as financial services, healthcare and the legal profession, allowing them to deploy digital assistants that speed up decision-making and save time in a secure way. He added: “We are proud to be a launch partner on HPE GreenLake for Large Language Models, and we look forward to expanding our collaboration with HPE to extend Luminous to the cloud and offer it as-a-service to our end customers to fuel new applications for business and research initiatives.”
‘Generational market shift’ towards AI use
The rise of generative AI, and of artificial intelligence more broadly, was described as a “generational market shift” by HPE CEO Antonio Neri, who said it would be as transformative as the web, mobile phones and cloud computing. “HPE is making AI, once the domain of well-funded government labs and the global cloud giants, accessible to all by delivering a range of AI applications, starting with large language models, that run on HPE’s proven, sustainable supercomputers.”
Neri said the move will allow organisations to deploy AI in ways that could disrupt markets and achieve new breakthroughs via an “on-demand cloud service” at scale. The company claims the supercomputing approach is “significantly more effective, reliable, and efficient to train AI and create more accurate models” than classical cloud platforms, in turn allowing “enterprises to speed up their journey from POC to production to solve problems faster”.
Cisco has also joined the AI supercomputing race, aiming to compete with the likes of Broadcom and Marvell Technology with a range of new networking switch chips for AI supercomputers. The Silicon One series is already being tested by the major hyperscalers and has been designed to speed up communication between GPUs.
The new generation of Ethernet switch chips, known as the G200 and G202, is said to double the performance of the previous generation and can network together up to 32,000 GPUs. Cisco fellow Rakesh Chopra told Reuters they were “the most powerful networking chips in the market fuelling AI/ML workloads”.