It was, for many months, the internet’s favourite toy. Released in November 2022 by OpenAI to little fanfare, ChatGPT soon exploded in popularity among netizens eager to test its facility for writing code, songs, essays and limericks. Then, like every boy who wonders how long his Action Man will last if he puts him in the microwave, the internet began to push the generative AI agent’s boundaries to breaking point. Despite OpenAI’s best efforts, ChatGPT was soon accused of racism, facilitating plagiarism, engineering malware and, with a little jiggery-pokery, giving away the recipe for napalm.
None of this stopped the market from thinking that generative AI was about to change the world. Sure, ChatGPT could be manipulated into saying more or less anything, but it was also the harbinger of a whole new wave of workplace automation. The ability of the newest generation of foundation models to ingest vast amounts of data and spit out imagery, code, legal advice and philosophical digressions as good as a journeyman in any one of these fields could, investors theorised, (ideally) augment or (sadly) eliminate jobs in any one of these areas. Microsoft soon invested some $10bn in OpenAI, sparking a race with seemingly every other tech brand, large and small, to launch something, anything that included some measure of generative AI.
Almost a year after ChatGPT was launched, however, many of these scenarios have yet to materialise. In fact, many of the flaws initially identified in the OpenAI agent have persisted in its enterprise competitors. The use of platforms like Microsoft’s Copilot coding assistant has increased among developers, for example, but many are troubled by these applications’ tendency to regurgitate old and flawed code. Elsewhere, publications have been forced to correct reams of AI-generated articles, supermarkets have apologised for recipe apps recommending “poison bread sandwiches”, and Microsoft and AI developer Midjourney are fighting class action lawsuits from coders and artists complaining of mass copyright infringement.
Even the English language seems to dislike generative AI: saying that something sounds like it was “written by ChatGPT” is, after all, now just another way of saying the statement is formulaic or unreliable. Have billions of dollars, then, been invested in a technology that has little hope of living up to the vaunted expectations of big tech executives? Gary Marcus thinks so. In a series of Substack posts last month, the New York University psychology professor and artificial intelligence guru argued that generative AI, for all its initial potential, was actually turning out to be a bit of a “dud”. While Marcus glimpsed solid use cases for the technology in code generation and marketing, it wasn’t enough, in his view, to sustain the high valuations netted by hundreds of AI start-ups over the past year. These, too, might eventually collapse, he wrote, “if year after year they manage only tens or hundreds of millions in revenue”.
Marcus isn’t alone in raising a sceptical eyebrow. Last month, research agency Gartner released its own report, concerning the “Hype Cycle for Emerging Technologies”, which argued that the market was about to dive headfirst into a “trough of disillusionment” when it came to all things generative AI. Right now, business leaders risk getting distracted by the pitches being made by the myriad start-ups that have emerged in the market in recent months. “While all eyes are on AI right now,” wrote one of the agency’s analysts, “CIOs and CTOs must also turn their attention to other emerging technologies with transformational potential.”
Great expectations for generative AI
Arun Chandrasekaran knows from first-hand experience just how much enthusiasm is still bubbling in the marketplace for ChatGPT-like solutions. “In the last eight months or so I’ve probably taken, conservatively, 700 inquiry calls from clients, banks and public sector agencies on gen AI,” says the Gartner analyst, and co-author of its recent Hype Cycles report. “I’ve never seen anything that’s even close to gen AI in terms of the sheer interest and sense of urgency [among] enterprise IT leaders today.”
That hype has created an imbalance in the marketplace, argues Chandrasekaran. “There are too many start-ups chasing too few problems,” he says, with Gartner tracking 200 new companies offering content-writing services alone. The vast majority of these businesses, Chandrasekaran adds, are basing their products on OpenAI’s GPT models. “We all know that we don’t need 200 start-ups essentially doing the same thing, particularly when they have very limited differentiation,” he says.
Other phenomena eroding market confidence in generative AI include what Chandrasekaran calls “pre-announcements”, wherein a vendor will release details about LLM-powered products they might be releasing in a few months, or just simple “AI washing”, where services promoted as containing generative AI are actually just derived from simpler machine learning algorithms. Much of this is just marketing fluff, says the analyst, an inevitable by-product of the fight among technology companies to distinguish themselves in an increasingly crowded marketplace. Nevertheless, he concedes, the frenzied rush to market implied by some of this activity suggests a heightened risk of shoddy products being released and adopted by businesses eager to embrace generative AI wherever they can.
But vendors aren’t the only ones at fault. Businesses themselves, drip-fed stories about an impending era of AI-fuelled economic disruption, not to mention a rash of long reads about a coming onslaught from malign superintelligences, are more inclined than ever before to believe that this or that generative service is as good as its marketing claims. Many have been disappointed. According to reporting by Axios (cited by Marcus in a later Substack post), while 70% of companies recently surveyed by S&P Global had an AI project on the go, half of them were still in the “pilot or proof-of-concept stage, outnumbering those who have reached enterprise scale with an AI project”.
Most of this, the article continued, was down to the siloing of potential training data for enterprise AI products across many of the companies surveyed. But such reporting confirms what Chandrasekaran has suspected for a while about the first wave of gen AI products to hit the market. “What we are seeing – and this is very early evidence – is that the productivity numbers that the vendors are claiming seem to be far higher than the actual productivity savings that users are experiencing today,” he says.
That is especially true when it comes to coding applications, one of the areas where even sceptics like Marcus concede that there’s a solid business case for generative AI. Indeed, the technology seems to have captured the imaginations of those CIOs who believe that their coders are spending most of their working days bashing out HTML or fixing COBOL. In reality, explains Chandrasekaran, they’re probably only doing that for up to two hours a day, with the rest of their time devoted to testing, attending meetings or liaising with QA teams. As such, tools like Copilot are inevitably going to boost the productivity of software development teams, but only at the margins. Even then, adds Chandrasekaran, “they’re good at generating boilerplate code – but they’re not exactly good at generating anything game-changing”.
Brass tacks on generative AI
Not everyone agrees, however, that generative AI hasn’t lived up to expectations. For his part, synthetic media expert Henry Ajder has witnessed its wholehearted embrace by a motley crew of film studios, marketing outfits, and digital artists. Every day, we’re seeing breakthroughs in the use of generative AI to produce weird and wonderful art, revive ageing actors (and resurrect the dead ones), and use lip synchronisation and voice cloning to abolish ropey, Spaghetti Western-esque dubbing.
“These aren’t theoretical use cases,” says Ajder. “These are use cases which have tangibly impacted and saved studios money, or made them more money.”
LLMs are a different story, he concedes. Their habit of hallucinating responses to questions that they’ve not been trained to answer is likely to dampen CIO enthusiasm for gen AI in the near term, explains Ajder. “Key questions emerge about liability,” he says. Is it the responsibility of the user for any system malfunctions, or “the tool creator for building something which isn’t robust or reliable?”
That won’t be a problem forever, says AI21 Labs’ platform vice-president Dan Padnos, who claims that the firm’s Contextual Answers service is already letting companies adopt generative AI services capable of fact-checking their own outputs. Future engineering challenges should be approached in a similar way, argues Padnos, who compares the technology’s ongoing challenges with the steady evolution of the automobile.
“A couple of years ago, we were surprised that this thing even drives,” he says. “Now, we want more range, we want more comfort… And that’s fair. The goalposts are going to keep moving as the technology improves and the next challenge becomes progressively more inspiring and bigger.”
So, what happens next? Some consolidation over the next year is inevitable, argues Chandrasekaran, as the market begins to realise that generative AI is more limited than initially thought. As that happens, many of the more nebulous-seeming start-ups will be starved of funding. Incoming regulations, like the EU’s AI Act, will also impose limitations on what generative AI can be used for, further constraining its raw potential.
Even so, Chandrasekaran predicts, this imminent market correction will not be permanent. Though disdainful of the market fluffery that has abounded in recent months, the analyst firmly believes in the transformative potential of generative AI. “I think, in the short term, the usage of these technologies [will] mostly be for augmentation,” says Chandrasekaran. What’s more, “we’re going to see augmentation happening at different speeds, in different companies, in different industries, or even in different business functions.”
Customer service will prove to be one area where generative AI will, eventually, make a major impact. Though such systems are not likely to be put directly in front of customers, explains Chandrasekaran, given the current hallucination risk, they could prove an invaluable assistant to those human customer service agents already in conversations with customers (indeed, IBM claims to have offered a version of this application to several UK banks).
For the moment, though, Chandrasekaran advises CIOs and IT leaders to be cautious about generative AI. “You have to be realistic in terms of the business value gains that you should expect,” he says. “You should be a lot more methodical in terms of how you choose the use cases [for generative AI] that are technically feasible.”
In the meantime, new LLM-powered products continue to be announced with every passing day, from OpenAI’s release of a new enterprise version of ChatGPT to LexisNexis adding generative AI functionality to its legal software. These announcements are hardly likely to stop, says Connor Leahy, CEO of the alignment start-up Conjecture. “There might be market overexuberance at this point – this is totally possible,” says Leahy, agreeing with Marcus and Chandrasekaran that many AI pretenders have emerged onto the scene in the past few months. “But from my perspective, the idea that gen AI is a ‘dud’ is just absurd.”
An expert in the finer points of foundation models, Leahy argues that the pace of innovation across AI has been blisteringly fast over the past 16 months. In such an environment, he adds, scammy startups will multiply. “But that doesn’t matter from my perspective,” says Leahy. “I don’t really care about the median or the bottom 20%. I care about the top 1%; I care about the top, cutting-edge technology. I care about GPT-4, I care about Midjourney, systems that can do literal miracles. It’s insane that we’re in a world where you can talk to your computer – and it works.”