Google Cloud says it has made NVIDIA T4 GPU cloud instances available in eight regions, improving customers' ability to run demanding AI inference workloads.
The NVIDIA T4, first released in October last year, is a high-end, single-slot, 6.6-inch PCI Express Gen3 deep learning accelerator based on NVIDIA's TU104 GPU.
It ships with 16 GB of GDDR6 memory and a 70W maximum power limit, and is offered as a passively cooled board that relies on system airflow for cooling.
Google’s announcement is good news for NVIDIA, which had noted a “dramatic” pause in hardware spending by hyperscale cloud providers in its last earnings report.
Chris Kleban, Google Cloud’s GPU product manager, wrote late Tuesday: “NVIDIA’s T4 GPU… accelerates a variety of cloud workloads, including high performance computing (HPC), machine learning training and inference, data analytics, and graphics.”
Prices for T4 instances start at $0.29 per hour per GPU on preemptible VM instances. On-demand instances start at $0.95 per hour per GPU, with sustained use discounts of up to 30 percent, he noted.
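For readers who want to sanity-check a budget against those figures, the arithmetic can be sketched in a few lines of Python. This is a rough illustration rather than an official pricing calculator: the function name `estimate_gpu_cost` is hypothetical, the rates are the starting prices quoted above and cover the GPU only (not the VM it attaches to), and the sketch assumes a single flat discount is applied to the whole bill, whereas actual sustained use discounts depend on how long an instance runs in a given month.

```python
# Starting rates quoted in the article, in USD per GPU per hour.
PREEMPTIBLE_RATE = 0.29
ON_DEMAND_RATE = 0.95
# "Up to 30 percent" sustained use discount; treated here as a flat cap (assumption).
MAX_SUSTAINED_USE_DISCOUNT = 0.30


def estimate_gpu_cost(gpus, hours, rate, discount=0.0):
    """Estimate the GPU-only cost in USD for `gpus` T4s running for `hours`.

    `discount` is a fraction between 0 and MAX_SUSTAINED_USE_DISCOUNT.
    Applying it uniformly to the whole bill is a simplification.
    """
    if gpus < 0 or hours < 0:
        raise ValueError("gpus and hours must be non-negative")
    if not 0.0 <= discount <= MAX_SUSTAINED_USE_DISCOUNT:
        raise ValueError("discount must be between 0 and 0.30")
    return gpus * hours * rate * (1.0 - discount)


if __name__ == "__main__":
    # Hypothetical workload: 4 GPUs for 100 hours.
    preemptible = estimate_gpu_cost(4, 100, PREEMPTIBLE_RATE)
    on_demand = estimate_gpu_cost(4, 100, ON_DEMAND_RATE)
    best_case = estimate_gpu_cost(4, 100, ON_DEMAND_RATE, MAX_SUSTAINED_USE_DISCOUNT)
    print(f"Preemptible:            ${preemptible:.2f}")
    print(f"On-demand:              ${on_demand:.2f}")
    print(f"On-demand, max discount: ${best_case:.2f}")
```

Even with the maximum discount applied, the on-demand rate works out to roughly $0.67 per GPU per hour, still more than double the preemptible starting price. Preemptible instances can be reclaimed by the provider, so the cheaper rate suits workloads that tolerate interruption.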
The T4 offers the acceleration benefits of Tensor Cores, but at a lower price than the V100 GPU, he noted: “This is great for large training workloads, especially as you scale up more resources to train faster, or to train larger models.”
Google Cloud rolled out Snap Inc. as a case study, saying the company is using NVIDIA T4s to create more effective algorithms for its global user base while keeping costs low.
A Princeton University neuroscience researcher, meanwhile, is using the offering to trace neural “wiring” as part of a team effort to reconstruct the connectome, or neural map, of a discrete segment of the brain.
Princeton University’s Sebastian Seung said: “We are excited to partner with Google Cloud on a landmark achievement for neuroscience: reconstructing the connectome of a cubic millimeter of neocortex. It’s thrilling to wield thousands of T4 GPUs powered by Kubernetes Engine. These computational resources are allowing us to trace 5 km of neuronal wiring, and identify a billion synapses inside the tiny volume.”