May 11, 2022

Google Cloud claims ‘most powerful’ publicly available machine learning cluster

New machine learning cluster will help customers utilise advanced AI systems and models, Google says.

By Matthew Gooding

Google Cloud has unveiled what it describes as the “world’s largest publicly available machine learning mega cluster”, which will deliver nine exaflops of compute power to users of Google Cloud Platform. Artificial intelligence workloads are increasingly important for many cloud users, and Google Cloud will be hoping its new cluster will prove appealing to customers.

Google Cloud is launching what it says is the world’s most powerful machine learning cluster. (Photo by Sean Gallup/Getty Images)

Revealed at the company’s Google I/O developer conference, the machine learning cluster will be powered by v4, the latest version of Google’s in-house Tensor Processing Units (TPUs), which are designed to run its cloud services as well as other platforms such as YouTube.

Google Cloud’s machine learning cluster

“Google Cloud’s ML cluster enables researchers and developers to make breakthroughs at the forefront of AI, allowing them to train increasingly sophisticated models to power workloads such as large-scale natural language processing (NLP), recommendation systems, and computer vision algorithms,” Sachin Gupta, vice president and general manager for infrastructure, and Max Sapozhnikov, product manager for Cloud TPU, said in a joint statement. “At nine exaflops of peak aggregate performance, we believe our cluster of Cloud TPU v4 Pods is the world’s largest publicly available ML hub in terms of cumulative computing power.”

Based at one of the company’s data centres in Oklahoma, the ML cluster is powered by a series of Cloud TPU v4 pods, each of which consists of 4,096 chips connected in an ultra-fast interconnected network. Google says each pod has “industry leading” bandwidth of six terabits per second, enabling it to rapidly digest information and train large AI models. The Oklahoma data centre operates on 90% renewable energy.
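As a rough sanity check on these figures, the arithmetic below estimates how many pods the nine-exaflop total implies. The chips-per-pod and cluster totals come from the article; the per-chip peak of 275 teraflops is an assumption taken from Google's published TPU v4 specifications, not from this article, so treat the result as a back-of-envelope sketch rather than a confirmed pod count.

```python
# Back-of-envelope estimate only. The per-chip figure is an assumption
# (not stated in the article); the other two numbers are from the article.
CHIPS_PER_POD = 4096                    # chips per Cloud TPU v4 pod (article)
CLUSTER_PEAK_FLOPS = 9e18               # nine exaflops aggregate peak (article)
ASSUMED_PEAK_FLOPS_PER_CHIP = 275e12    # assumption: ~275 teraflops peak per chip


def pod_peak_flops(chips_per_pod: int, flops_per_chip: float) -> float:
    """Peak compute of one pod, assuming every chip reaches its peak."""
    return chips_per_pod * flops_per_chip


def implied_pod_count(cluster_flops: float, chips_per_pod: int,
                      flops_per_chip: float) -> float:
    """Number of pods needed to reach the quoted cluster-wide peak."""
    return cluster_flops / pod_peak_flops(chips_per_pod, flops_per_chip)


if __name__ == "__main__":
    pod = pod_peak_flops(CHIPS_PER_POD, ASSUMED_PEAK_FLOPS_PER_CHIP)
    pods = implied_pod_count(CLUSTER_PEAK_FLOPS, CHIPS_PER_POD,
                             ASSUMED_PEAK_FLOPS_PER_CHIP)
    print(f"Estimated peak per pod: {pod / 1e18:.2f} exaflops")
    print(f"Implied number of pods: {pods:.1f}")
```

Under that assumption, each pod peaks at a little over one exaflop, so the nine-exaflop figure corresponds to roughly eight pods. These are peak theoretical numbers; sustained training throughput would be lower.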

The machine learning cluster is available in preview from today.

Why AI and machine learning are important to cloud providers

Increasing numbers of businesses are turning to AI and ML to help improve efficiency and digitise their operations. According to a McKinsey study, ‘The State of AI in 2021’, published last year, 56% of more than 1,800 organisations polled around the world said they had adopted AI in at least one function of their business, up from 50% in 2020.

The role of cloud computing in successful AI adoption was also highlighted in the McKinsey study, which identifies companies it describes as “AI high performers” – businesses which attribute at least 20% of their earnings to their AI implementation. This high-performing group runs an average of 64% of its AI workloads in the cloud, compared with 44% for other respondents, suggesting cloud-based AI can offer better returns than on-premises systems.

The high-performing group “is also accessing a wider range of AI capabilities and techniques on a public cloud,” the McKinsey study says. “For example, they are twice as likely as the rest to say they tap the cloud for natural-language-speech understanding and facial-recognition capabilities.”

AR, VR and database innovations at Google I/O

Google Cloud made several other announcements at I/O, including the launch of AlloyDB, a new PostgreSQL-compatible database service for highly demanding workloads. Google claims AlloyDB is twice as fast for transactional workloads as the comparable service from Amazon’s AWS, the world’s leading cloud platform.

Also available to Google Cloud Platform users from today is Immersive Stream, a new service that renders immersive 3D and augmented reality experiences and allows them to be streamed to mobile devices.

On the security front, Google Cloud announced Network Analyzer, a new module in the platform’s Network Intelligence Center that helps developers detect network failures and prevent downtime by pinpointing potential problems such as accidental misconfigurations and over-utilisation of services.

