OpenAI released its next-generation large language model GPT-4 to much fanfare on Tuesday, stating that it scored in the top 10% of test takers on some of the toughest professional exams, including the bar exam and the LSAT. As part of the announcement, OpenAI revealed a dozen commercial partners, including Duolingo, Stripe and Morgan Stanley, that are incorporating GPT-4 into their workflows, products or systems. One analyst told Tech Monitor it includes features that “could be transformational for many businesses”.
The exact size and parameter count of GPT-4 have not been revealed, although the model is thought to have more than the 175 billion parameters of its predecessor, GPT-3. Despite being trained on the same data, it has been fine-tuned to reason at a higher level, make fewer mistakes and accept much longer inputs, including thousands of words from a single document.
Efforts to improve safety features and limit the generation of dangerous or toxic content have made the tool more viable for enterprise use. Combined with inputs of up to 25,000 words and better fine-tuning on company data, this opens new avenues for businesses.
Morgan Stanley is using GPT-4 to organise its knowledge base of hundreds of thousands of articles covering investment strategies, market research and analyst commentary. It has been using embeddings to train and fine-tune the foundation AI model on this information, allowing employees to retrieve what they need through a chat interface.
“You essentially have the knowledge of the most knowledgeable person in wealth management—instantly”, said Jeff McMillan, head of analytics, data and innovation at the investment bank. “Think of it as having our chief investment strategist, chief global economist, global equities strategist, and every other analyst around the globe on call for every advisor, every day. We believe that is a transformative capability for our company.”
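The article does not describe Morgan Stanley’s implementation in detail, but the general embeddings-and-retrieval pattern it mentions can be sketched with the OpenAI Python client. The document snippets, prompt and model choices below are illustrative assumptions, not the bank’s actual system.

```python
# Minimal sketch of embeddings-based retrieval over an internal knowledge base,
# followed by a GPT-4 answer grounded in the best-matching document.
# All documents, prompts and model names here are illustrative assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

documents = [
    "Q1 equity strategy note: analysts favour quality over growth ...",
    "Global economics briefing: inflation expected to ease in the second half ...",
    "Wealth management research: rebalancing guidance for 60/40 portfolios ...",
]

def embed(texts):
    # Embed a batch of texts with a generally available embedding model
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(documents)

def ask(question: str) -> str:
    q = embed([question])[0]
    # Cosine similarity picks the most relevant internal document
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = documents[int(sims.argmax())]
    chat = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer only from the supplied context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return chat.choices[0].message.content

print(ask("What is the current guidance on portfolio rebalancing?"))
```

In practice a deployment of this kind would use a vector database rather than an in-memory array, but the retrieve-then-answer shape is the same.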
Delving deeper into the announcement, GPT-4 also includes improved document analysis, with a better understanding of context than GPT-3. This can be used for more accurate analysis of business documents such as contracts, reports or legal paperwork.
GPT-4: More input and output flexibility
Manish Sinha, chief marketing officer at fibre optics manufacturer STL, said the 40% boost in accuracy and 82% reduction in the likelihood of generating an offensive response cited by OpenAI make it a much more viable option for the enterprise. “GPT-4 also provides enterprises with much more input and output flexibility thanks to the new multi-modal capabilities,” he said. This is because the model can now take an image as input and analyse it, providing a text report in response. It can go further still: OpenAI demonstrated GPT-4 turning a rough sketch of a website into working code.
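OpenAI did not publish code alongside these demonstrations, but a request combining an image and a text instruction through the chat API might look roughly like the sketch below. The model identifier and image URL are placeholders, and image input was initially limited to selected partners.

```python
# Sketch of a multi-modal request: an image plus a text instruction in one message.
# The model name and image URL are placeholders; API access to image input
# depends on what OpenAI has made available to a given account.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed vision-capable model identifier
    max_tokens=500,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "This is a hand-drawn website sketch. Produce the HTML and CSS for it."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/website-sketch.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```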
“At the very least, we’re likely to see more advanced and personalized virtual assistants, chatbots, and customer service interactions in the very near future,” said Sinha. “This could change the way enterprises interact and engage with customers, laying the groundwork for seamless self-service and more timely responses to queries with more multimedia content and immersive experiences that deliver value and context.”
Dr Andrew Rogoyski, of the University of Surrey Institute of People-Centred Artificial Intelligence, said its ability to respond to queries that include text, images, drawings and diagrams “opens it up to industries where the visual element of information is important, from image search to architecture.”
This extends to an understanding of the ways images and text relate to each other, which brings AI much closer to the way humans organise memories and ideas. “Imagine sketching something, then being able to ask the computer to find existing examples that resemble your sketch or provide a photorealistic rendering of your drawing. The possibilities are very exciting,” said Dr Rogoyski.
The problem that may arise is the size and training requirements of such a model. “These LLMs require substantial and increasing amounts of computing infrastructure which means we’re becoming dependent on the Silicon Valley hyper-scale companies like Microsoft, Amazon and Google. This raises interesting questions about concepts of sovereign control of AI and dependencies that may prove uncomfortable for some organisations,” he said.
This feeds into calls for the UK to develop its own sovereign large language model. The government recently announced it was forming a taskforce to investigate how foundation models could benefit and impact society, as well as how they should be regulated.
GPT-4 is ‘living up to the hype’
Nikolaj Buhl, founder associate at computer vision company Encord, outlined several use cases for GPT-4 that would not have been viable in the previous version. “GPT-3 didn’t live up to the hype of AI and large language models, but it looks like GPT-4 does. GPT-4 represents a significant leap in AI capabilities compared to its predecessors, GPT-3 and GPT-3.5.”
He suggested that it could enhance customer support, particularly once the ability to process visual input is available through the API. This, explained Buhl, would allow for more comprehensive support, including allowing customers to submit images of their issues. “GPT-4 can also analyse images and charts to provide valuable strategic insights to businesses. Similarly, GPT-4 could also help generate data visualisations, potentially helping businesses make informed decisions based on complex data sets,” he added.
“GPT-4 is set to bring numerous innovations to the enterprise and business landscape, with capabilities that surpass those of GPT-3.5 and other models,” Buhl added. “Businesses that adopt GPT-4 will likely gain a competitive edge by leveraging its advanced multi-modal capabilities, natural language understanding, multi-tasking abilities, and enhanced personalization features, among other benefits.”
Aaron Kalb, co-founder of enterprise data company Alation, said GPT-4 cannot be trusted to advise on important decisions when it relies purely on training from publicly available data, with no specific proprietary information. “That’s because it’s designed to generate content that simply looks correct with great flexibility and fluency, which creates a false sense of credibility and can result in so-called AI ‘hallucinations’,” he explained. “While the authenticity and ease of use is what makes GPT so alluring, it’s also its most glaring limitation.”
“If, and only when, a GPT model is fed knowledge with metadata context – so essentially contextual data about the data like where it’s located, how trustworthy it is, and whether it is of high quality – can these hallucinations or inaccurate responses be fixed, and GPT trusted as an AI advisor,” he added. In Kalb’s view, while GPT-4 is incredibly impressive in its ability to sound smart, it still has no idea what it is saying and does not have the knowledge it “tries to put into words”.
“It’s just really good at knowing which words ‘feel right’ to come after the words before, since it has effectively read and memorized the whole internet. It often gets the right answer since, for many questions, humanity collectively has posted the answer repeatedly online. The Shakespearean sonnets about tuna salad are not actually original works of tremendous creativity but rather excellent pastiches of other content, like someone making a ransom note by cutting letters out of different magazines.”
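Kalb’s point about grounding the model in data that carries metadata context can be illustrated with a simple prompt-construction sketch. The snippets, metadata fields and instructions below are hypothetical, intended only to show how source and quality information might be passed alongside the content itself.

```python
# Sketch of grounding a GPT-4 prompt in retrieved snippets that carry metadata
# (source, freshness, quality rating), so the model can cite and qualify claims
# rather than invent an answer. All fields and values are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

snippets = [
    {"text": "FY2023 revenue grew 12% year on year.",
     "source": "finance/quarterly_report.pdf", "updated": "2023-02-28", "quality": "verified"},
    {"text": "Churn in the SMB segment rose to 4.1%.",
     "source": "analytics/churn_dashboard", "updated": "2023-03-10", "quality": "draft"},
]

context = "\n".join(
    f"- {s['text']} (source: {s['source']}, updated: {s['updated']}, quality: {s['quality']})"
    for s in snippets
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Answer only from the context. Cite the source of every claim "
                    "and flag anything marked as draft or unverified."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: How did revenue and churn trend last year?"},
    ],
)

print(response.choices[0].message.content)
```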