In his blockbuster speech to the AI faithful at this year’s GTC, NVIDIA founder Jensen Huang pronounced that humanity is on the cusp of an era of agentic and reasoning AI – one which, he hoped, would eventually lead to ten billion digital workers working alongside humans. In other words, rather than sitting and waiting for people to come to them with queries and data to parse, models would go out into the world and solve problems themselves, one step at a time.

How would that work in practice? Currently, marketing spiel from the biggest players makes it difficult to identify what is and what isn’t an agentic AI service (Anthropic’s Claude, for example, is simultaneously a generative AI chatbot and a service you can use to create agents). In large part, an agentic AI model can be defined as “an orchestrating AI master agent that is in charge of a fleet of AI agents that are designed to carry out specific tasks,” says Michael Azoff, a senior analyst at Omdia. That master agent may be tasked with “completing a specific complex task requiring completion of sub-tasks or play a role that performs such complex tasks on an ongoing basis. The master agent collects all the information gathered by the AI agents to complete any specific complex task.”
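Azoff’s description maps onto a simple orchestration pattern: a master agent decomposes a complex task, dispatches sub-tasks to specialist agents, and collects the results. Below is a minimal illustrative sketch in Python; the agent names and the plain-function “specialists” are invented for illustration – in a real system each specialist would wrap an LLM or tool call.

```python
# Minimal sketch of the master-agent / worker-agent pattern Azoff describes.
# The specialists here are plain functions; in practice each would wrap an
# LLM call, a retrieval step or an external tool.

from typing import Callable

def search_agent(task: str) -> str:
    # Stand-in for an agent that queries external data sources.
    return f"search results for: {task}"

def summarise_agent(task: str) -> str:
    # Stand-in for an agent that condenses gathered material.
    return f"summary of: {task}"

class MasterAgent:
    """Orchestrates a fleet of specialist agents and collects their output."""

    def __init__(self) -> None:
        self.agents: dict[str, Callable[[str], str]] = {
            "search": search_agent,
            "summarise": summarise_agent,
        }

    def run(self, subtasks: list[tuple[str, str]]) -> list[str]:
        # Dispatch each sub-task to the named specialist and gather results.
        return [self.agents[name](task) for name, task in subtasks]

master = MasterAgent()
results = master.run([("search", "competitor pricing"), ("summarise", "Q3 calls")])
print(results)
```

The point of the pattern is that the master agent holds the overall goal and the specialists stay narrow – exactly the division of labour Azoff outlines.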

On a practical level, ChatGPT, built on GPT-3.5, alerted the world to the potential of generative AI, but it is the evolution of “reasoning” and chain-of-thought models such as Claude 3.7 Sonnet, OpenAI’s o1 or DeepSeek-R1 that has set the stage for agentic systems. These are designed to work through and analyse more complex problems using step-by-step reasoning. But that’s just part of the story.

Gen AI can be thought of as the equivalent of humans’ “right brain”, with the ability to create things, whether recipes or summaries of customer service calls, explains Peter van der Putten, director of Pegasystems’ AI lab and assistant professor in AI and creative research at the Leiden Institute of Advanced Computer Science.

But, he adds, other forms of non-generative AI, such as machine learning and predictive analytics, together with automation and process mining, could be thought of as more akin to the left side of the brain, the AI “that we use to make so-called intelligent decisions”. It’s the combination of the two that can get us towards the sort of autonomous systems Huang waxed lyrical about.

Multi-agent systems

Van der Putten says multi-agent systems have been around since the start of AI. Even so, “they were always very much constrained to very well-defined, narrow use cases, [such as] matching supply and demand on electricity markets. At the moment, if you would try something broader, they would fail.”

Research by Cloudera suggests agentic AI has certainly caught the imagination of enterprises, with 57% of respondents saying they had started “implementing AI agents” within the last two years. Almost all plan to expand their use in the year ahead.

It’s less clear how many projects are in full production. Cloudera reported that 37% of organisations it polled had found integrating agents into their workflows to be “extremely challenging”. Problems included data privacy concerns, integration challenges and, inevitably, high implementation costs.

What about those companies that have successfully deployed agentic AI? Omdia’s Azoff says many “successful” agentic AI systems will be in the realm of commercial IP secrets. “But my impression is that a lot of agentic AI is still in R&D.”

Ian Makgill of global tender search site Open Opportunities has been working with language models since the debut of BERT back in 2018. More recently, he’s experimented with various models to automate processes, effectively putting agentic AI workflows in place using a combination of Python and either AWS Bedrock or the company’s own language models.

The technology shows “promise and risk,” says Makgill, but the results so far have often been underwhelming. “We moderately managed to get Claude to output D3 charts,” he says, but when a JavaScript expert (human) looked over the LLM’s output, “he just laughed [and said] ‘I couldn’t put that code into production.’”

Keep it constrained

George Dunning, chief operating officer at financial intelligence platform Bud Financial, has had a happier experience with agentic AI. Dunning says that most of the AI he sees at work in the financial sector is not actually working on financial data but on documentation.

His colleagues, by comparison, specialise in applying AI to transaction data.

Gen AI gets us some way towards being able to quiz and analyse data more easily, says Dunning, whether it’s asking “which of my customers are using competitive credit cards, which of my customers are using buy now pay later products.”

However, it’s the ability to automate data discovery and layer automated suggestions and actions on top which gets us into the realm of agentic AI, he says.

On the consumer side, Dunning explains, this could mean developing systems that can analyse and optimise customers’ stagnant cash by moving it in and out of a savings account to earn more interest. “Rather than trying to change your financial behaviours,” he says, “it’s just creating movement of money for you so you can get better financial outcomes.”

But Dunning is “crystal clear” that when it comes to agentic AI in finance, at least, “You want to create very specific locked down things that utilise LLMs, but those LLMs are constrained”. And the data, too, is locked down to a dedicated environment, “where it’s not going anywhere.”
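The “locked down” approach Dunning describes amounts to constraining what an LLM-driven agent is allowed to do. One common pattern is a tool allow-list: the model may only request actions from a fixed set, and anything else is rejected before execution. A hypothetical Python sketch follows; the tool names and return values are invented for illustration, not drawn from Bud Financial’s actual system.

```python
# Sketch of constraining an agent to an allow-list of tools, in the spirit
# of Dunning's "very specific locked down things that utilise LLMs".
# Tool names and payloads are hypothetical.

ALLOWED_TOOLS = {
    "get_balance": lambda account: {"account": account, "balance": 1200.0},
    "move_to_savings": lambda amount: {"moved": amount, "to": "savings"},
}

def execute(tool_name: str, **kwargs):
    # Reject any action the agent requests that is not explicitly allowed.
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool '{tool_name}' is not on the allow-list")
    return ALLOWED_TOOLS[tool_name](**kwargs)

print(execute("get_balance", account="acc-1"))
# A request such as execute("send_email", to="client@example.com") would
# raise PermissionError rather than silently reaching the outside world.
```

The same idea underpins Pegasystems’ decision, described below, to limit its internal agent’s email capabilities: the constraint sits outside the model, so the LLM cannot talk its way past it.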

And at Pegasystems, says van der Putten, an internal “research agent” called Iris handles up to 1,500 emails a day. “She’ll do research and come back with a response, researching all kinds of different data sources, knowledge sources, internal, external,” he says, with the model explaining how it reached its conclusion. But, he adds, the firm has limited Iris’ email capabilities to head off the possibility of the agent sending proposals to clients.

Agentic AI is not going to solve world hunger, says van der Putten. But for specific use cases, such as inspecting insurance claims or resolving service issues, such models “can work really, really well.”

That’s a long way away for most organisations. McKinsey research earlier this year showed that 80% of respondents had yet to see “a tangible impact on enterprise level EBIT” from adopting Gen AI, never mind agentic AI.

Dom Couldwell, head of field engineering EMEA at DataStax, suggests that those companies who are serious about starting that journey should take a SWAT team approach. “Get a cross-functional team of folks together, give them the mandate to actually solve a problem,” says Couldwell. “This is not just some paper exercise. Make it a meaningful problem for the organisation and make it a somewhat difficult problem for the organisation.”

[Image caption] Agentic AI systems differ from their Gen AI contemporaries by going out into the world and doing things, rather than sitting idly by and waiting for people to come to them for answers. That doesn’t make them any easier to fine-tune and deploy, as first-movers are discovering. (Image: Shutterstock)

Be ready to move. Quickly

Ultimately, Couldwell suggests, the companies that will do well with agentic AI are those that have a deep legacy of data. “A neo bank is going to move a million miles an hour,” he says, but won’t be able to call upon the data reserves of the lumbering legacy dinosaurs of the financial services sector.

The choice of models and hardware could be critical to how fast and agile an organisation can be, he adds. “Some models are tied to certain hardware,” says Couldwell, and some of this can be worked around using APIs. But, he says, “if it’s literally a complete black box and you have no control of it, you’re locked in.”

Bud Financial, for example, does everything on Google Cloud Platform, which allows a more flexible approach. “We can swap out Gemini for Claude or OpenAI,” says Dunning. “Google Cloud has built the infrastructure so that you can just easily swap in the most relevant firms in the market.” This also means it can quickly test different models against one another and make calls on which is the most performant or cost-effective.
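The swap-ability Dunning describes usually rests on a thin abstraction layer: every model is called through the same interface, so the backend can be exchanged without touching the calling code. A simplified Python sketch follows, with stubbed backends; the class and method names are invented for illustration, and real implementations would call the Gemini, Claude or OpenAI APIs behind the same interface.

```python
# Sketch of a provider-agnostic model interface, illustrating the kind of
# swap-in/swap-out flexibility Dunning describes. Backends are stubbed;
# real ones would call the respective provider APIs.

from abc import ABC, abstractmethod

class ChatModel(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class GeminiModel(ChatModel):
    def complete(self, prompt: str) -> str:
        return f"[gemini] {prompt}"

class ClaudeModel(ChatModel):
    def complete(self, prompt: str) -> str:
        return f"[claude] {prompt}"

def classify_transaction(model: ChatModel, description: str) -> str:
    # Calling code depends only on the ChatModel interface, so swapping
    # providers is a one-line change at the call site.
    return model.complete(f"Categorise this transaction: {description}")

for backend in (GeminiModel(), ClaudeModel()):
    print(classify_transaction(backend, "COFFEE SHOP 4.50"))
```

Running both backends through the same function also makes the side-by-side performance and cost comparisons Dunning mentions straightforward to automate.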

It helps, he says, to ensure that the underlying data is as clean as possible. While the models Bud uses can handle less-than-ideal data, that will drive up costs.

NVIDIA’s Huang might be certain that 10bn digital workers are going to march over the horizon at some point in the future. But CIOs will have to navigate many experiments and a sea of hype first.

Will it be worth it? “The point of software is to manipulate data, isn’t it?” Kubernetes pioneer and Civo board member, Kelsey Hightower, said at a conference back in February. “So, an AI agent is software. Its only purpose is to deal with the data. So we’re back to square one.”

DataStax’s Dom Couldwell is less cynical, saying there’s certainly a need for agentic AI, or at least what it promises. But, he adds, that’s in part because a lot of CEOs have signed big checks for AI and Gen AI – and now they’re demanding payback.

Read more: AI data centres are the new power play – and most CIOs will never build one