Meta has open-sourced a new set of AI tools called AITemplate that makes it easier for developers to work with different GPUs without sacrificing speed or performance. It is the latest in a round of open-source AI projects from the Facebook parent, which include the framework PyTorch.
The new tools are built on the PyTorch framework, and code using them can run up to 12 times faster on Nvidia's A100 AI GPU, or four times faster on AMD's Instinct MI250, than with existing PyTorch methods, according to Meta engineers.
Its biggest benefit to developers is the ability to switch between processors when running machine learning calculations. Currently, to get the most out of an AI-tailored GPU, developers need to write their code for that specific hardware, making it difficult to then run the same code on another graphics card.
Meta says AITemplate will act as a layer above the chip that doesn't hamper performance but does allow hardware to be swapped easily, without being locked to a specific chip.
It is built specifically for inference, the stage at which a machine learning model trained on a large dataset must make a quick judgement in response to an incoming request, such as labelling an image.
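As an illustrative sketch (not Meta's code), inference boils down to applying fixed, already-learned parameters to a new request. The toy classifier below uses hypothetical hand-set weights; in practice they would come from training on a large dataset:

```python
# Illustrative sketch of inference: a toy, pre-trained linear classifier
# labelling an incoming request. The weights are hypothetical placeholders
# standing in for parameters learned during training.

WEIGHTS = [0.8, -0.5, 0.3]   # learned parameters (fixed at inference time)
BIAS = 0.1
LABELS = ("negative", "positive")

def infer(features):
    """Score a single request and return a label; no learning happens here."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return LABELS[score > 0]

print(infer([1.0, 0.2, 0.5]))
```

The point of an inference-only system such as AITemplate is to make this forward pass as fast as possible, since the weights never change at serving time.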
“Currently, AI practitioners have very limited flexibility when choosing a high-performance GPU inference solution because these are concentrated in platform-specific, closed black-box runtimes,” Meta engineers explained in a blog post.
“A machine learning system designed for one technology provider’s GPU must be completely reimplemented in order to work on a different provider’s hardware. This lack of flexibility also makes it difficult to iterate and maintain the code that makes up these solutions, due to the hardware dependencies in the complex runtime environments.”
Solutions that accelerate AI development are in high demand among developers keen to try new modelling techniques and businesses looking to use greater degrees of automation. According to a new report by Forrester, the AI sector is set to outpace the overall software market by about 50% over the next two years.
The report found that AI software revenues will grow at an 18% compound annual growth rate through 2025, with spending on off-the-shelf and platform AI software increasing from $33bn in 2021 to $64bn in 2025.
Forrester analyst Michael O'Grady said: "As AI becomes mainstream, enterprises will need to manage its complexity across its tech infrastructure, AI practices and processes, business models, and across talent management, which includes the democratisation of tools for the citizen data scientist."
Meta and its open-source AI toolkits
To meet this growing demand, developers are looking for faster turnarounds on new concepts as well as ways to reduce costs, including by turning to open-source toolkits.
“Although proprietary software toolkits such as TensorRT provide ways of customisation, they are often not enough to satisfy this need,” Meta's team says. “Furthermore, the closed, proprietary solution may make it harder to quickly debug the code, reducing development agility.”
Meta says it created AITemplate to tackle that problem, and made it open source to allow for continued development to meet the needs of the community. It is a unified inference system with separate acceleration back ends for AMD and Nvidia GPUs, with plans to support other hardware in the future.
“We also plan to extend AITemplate to additional hardware systems, such as Apple M-series GPUs, as well as CPUs from other technology providers," engineers from Meta revealed. "Beyond this, we are working on the automatic lowering of PyTorch models to provide an additional turnkey inference solution for PyTorch."
Benchmark tests found AITemplate was able to deliver close to hardware-native Tensor Core and Matrix Core performance on a range of widely used AI models, including convolutional neural networks, transformers, and diffusers.
“We’ve used AIT to achieve performance improvements up to 12x on Nvidia GPUs and 4x on AMD GPUs compared with eager mode within PyTorch,” the Meta blog post says.
“Our project offers many performance innovations, including advanced kernel fusion, an optimisation method that merges multiple kernels into a single kernel to run them more efficiently, and advanced optimisations for transformer blocks. These optimisations deliver state-of-the-art performance by significantly increasing utilisation of Nvidia's Tensor Cores and AMD's Matrix Cores.”
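The kernel fusion idea Meta describes can be illustrated with a toy example. The sketch below is pure Python rather than GPU code, but the principle is the same: running two operations as separate kernels means two passes over the data and an intermediate buffer, while a fused kernel does both in a single pass:

```python
# Toy illustration of kernel fusion (pure Python, not real GPU kernels).

def scale_kernel(xs, a):
    return [a * x for x in xs]          # pass 1: writes an intermediate buffer

def shift_kernel(xs, b):
    return [x + b for x in xs]          # pass 2: reads that buffer back again

def fused_scale_shift(xs, a, b):
    return [a * x + b for x in xs]      # one pass, no intermediate buffer

data = [1.0, 2.0, 3.0]
# Both routes compute the same result; the fused version touches memory once.
assert shift_kernel(scale_kernel(data, 2.0), 1.0) == fused_scale_shift(data, 2.0, 1.0)
```

On a GPU the saving comes from fewer kernel launches and fewer round trips through memory, which is where much of the speed-up Meta reports originates.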