Google has developed a new artificial intelligence tool that can produce high-definition video from a simple text prompt, joining another Big Tech company, Meta, which revealed its own text-to-video system last week. While the current applications for generative AI may seem trivial, experts believe the technology could dramatically improve AI efficiency and reduce bias in future.
Meta’s Make-A-Video generator is an upgrade to its Make-A-Scene text-to-image tool, while Google AI has updated an existing function, Imagen, to support video. Both are forms of generative AI and are in the research and development phase, not ready for public use. Neither company has said when the tools will be generally available.
The announcement of AI video took some experts by surprise. The authors of the State of AI report, investors Nathan Benaich and Ian Hogarth, said research in the area only started in April.
“In late September new research from Meta and Google came with a jump in quality, announcing a sooner-than-expected DALL-E moment for text-to-video generation,” the pair write in the report, which was released today.
While generative video might be in the early stages, AI-produced images are becoming mainstream with the likes of OpenAI’s DALL-E and the open-source tool Stable Diffusion in wide use. Away from images, the technique can also be used to generate text through chatbots and automated articles, as well as speech.
All the Big Tech companies and many start-ups are investing heavily in generative artificial intelligence. It is a form of AI in which the machine produces something new rather than simply analysing something that already exists.
“Generative artificial intelligence, in essence, allows computers to benefit from self-learning made possible from multiple data sets,” explains Shelly Kramer, principal analyst at Futurum Research.
According to tech investor Sequoia Capital, “Generative AI is well on the way to becoming not just faster and cheaper, but better in some cases than what humans create by hand”.
Kramer adds: “This means that the AI sees and learns patterns over time, which can then be used in what are some incredibly cool ways, perhaps the most exciting of which is that it can be used to create data that doesn’t even exist yet.”
Several of the more advanced generative models can create complex concept art in seconds, and in future, this technology could be adapted to, for example, allow an architect to describe a building and have the AI model produce a walkthrough.
Generative AI could help reduce bias in machine learning
The technique could also be used to create a full-length television commercial simply by feeding in a script and having the AI produce the visuals from the descriptions in the text. Future versions that use OpenAI’s GPT-3 natural language model could also write the script from a prompt.
But are there uses outside of the creative arts and marketing? Yes, says Kramer. “Generative AI is absolutely exciting, it’s already in use in various instances, and I see every reason to expect that the early promise we’re seeing will continue to grow.”
This goes beyond pretty pictures and videos, says Kramer, as generative AI can also do some of the analysis, particularly of things that are conceptual or abstract. “Generative AI can help reduce bias in machine learning models, deliver higher quality outputs, and help make data analysts’ jobs easier by doing some of the heavy lifting.”
These advanced functions will require significant compute power. Generative AI is one of the most demanding forms of artificial intelligence in terms of the processing required to achieve the desired result. In its research paper on Imagen Video, Google says it is using a technique called “progressive distillation”, which trains the model to produce results in fewer sampling steps.
“Given the tremendous recent progress in generative modelling, we believe there is ample scope for further improvements in video generation capabilities in future work,” Google’s engineers say.
A sea change for the creative industries?
The most basic use cases for generative AI are relatively straightforward but could bring about fundamental changes for the creative industries. There is already a plugin for Photoshop that allows an artist to generate a whole image, or an element for use within one. A number of stock libraries have had to ban AI-generated images after artists and photographers raised concerns.
In future, these techniques could be used to create massive game worlds that are different for every user who plays in them – even down to what is said by the non-player characters.
“It could also be used to create things like product descriptions, synopses of content, or even creating whole articles,” said Kramer. “Staying on the creative track, generative AI can be used to create music, then keep improving it.”
Kramer said these generative tools will also aid companies in their digital transformation journey, with automation often being a key driver for transformation projects.
The global AI market is predicted to grow significantly over the next few years, with some forecasts suggesting its value will hit $190bn by 2025, according to Kramer. “I have no doubt that generative AI is absolutely playing, and will continue to play, a role there and that as organizations begin to understand the value it can deliver, we’ll see even more uptake.”