ChatGPT update will improve chatbot's factual accuracy

OpenAI has released a new update for its hugely popular chatbot ChatGPT to ensure it produces more factually accurate responses and to improve its basic mathematics skills. The update comes as the company also published its first “detection tool” to help spot when AI is being used in a piece of text, though this apparently has a low success rate.

OpenAI launched ChatGPT in November 2022 and has been gradually improving it ever since. (Photo: Ascannio/Shutterstock)

Though its creators did not expect it to prove popular, ChatGPT passed the million-user mark within a few days of launching at the end of November last year, and has since become a viral sensation. Since the launch, OpenAI has been slowly improving the system, adding new functionality and cleaning up responses to make the chatbot more accurate.

Earlier this month it gave users the ability to stop it from generating a response halfway through if it wasn’t churning out what they expected. It also received its first accuracy boost. Accuracy has been one of the biggest problems facing the chatbot since its launch, with coding site Stack Overflow blocking ChatGPT-generated responses because they are often accurate-looking but wrong.

The first round of updates saw technical improvements that reduced the number of times ChatGPT would simply refuse to answer or cut out mid-response. They also placed limits on the number of concurrent users to reduce the load on servers. There are still extended periods when users can’t access the system because it is at capacity.

This latest update is intended to improve its “factuality and mathematical capabilities”. That was the full extent of the most recent release notes. The team didn’t go into detail on how it has improved those features, although ChatGPT has been known to be thrown by some mathematical problems.

Improved maths skills will likely allow it to handle complex calculations and provide more precise answers, which would improve its value for professionals using it to generate reports or look for patterns in data. It is also much harder to trick it into giving a “wrong answer” in response to a simple query.

Is ChatGPT preparing for an API?

These gradual updates to the chatbot are likely designed to test and improve its functionality, removing its ability to make damaging or harmful responses, before the ChatGPT API is released by OpenAI. This API will join others from the start-up including image generation through DALL-E 2 and code production through Codex.

When launched, the API will be available through OpenAI directly but also on the Microsoft Azure cloud platform. This was announced on the same day Microsoft confirmed a multibillion-dollar investment in the company that will also see ChatGPT integrated into its search engine Bing and other consumer products.

Mike Krause, data science director at AI software company Beyond Limits, told Tech Monitor the problem with false information stems from the source material ChatGPT was trained on back in 2021, and as such it “isn’t bound by the structures of factuality, reality or social morality”.

A major source of training data for ChatGPT was Wikipedia, which is written by everyday people who “can edit the written corpus of encyclopedia knowledge for all humanity and while there are content moderators, they are few and far between, leaving us mostly free to write wildly exaggerated accounts of basically anything we want until it gets flagged,” says Krause.

Despite this problem, OpenAI is improving its chatbot, says Krause. But he adds that “at its heart, it still learns patterns from data it’s fed without any intelligence or knowledge of content, and without any abstraction of data and information into concepts, which is how humans learn and extrapolate”. He says a machine learning model has to be explicitly trained not to discriminate against each group, assuming there are enough unbiased training data sets to make that possible. If it is left wild, without restriction or retraining, “there are real consequences for real people in the real world”.

“OpenAI knew this and limited access from the start,” Krause adds. “ChatGPT is super-cool but it’s also capable of creating a high volume of false content automatically and feeding false information campaigns of governments that could influence public opinion, elections, even being used as a reference or source of truth when it is anything but.”

Sanjeev Kumar, VP EMEA at Boost.ai, welcomed the most recent update. “However, businesses are still far from being able to use this technology as-is in customer-facing applications,” he warns. “If we expect ChatGPT’s full potential to be useful in an enterprise setting, it’s not enough to even have 99% accuracy as any slip-up could lead to possible liability concerns. It will be necessary to regularly curate and verify the sources of information that the model is connected to, in order to ensure it is both reliable and accurate.”

OpenAI launches AI content detection tool

As the factual accuracy of ChatGPT improves, so will its use as a tool for purposes both good and bad. There is evidence of hackers using it to generate malware and better-targeted phishing emails, as well as students making liberal use of the chatbot to write essays for them. These examples have inspired efforts to create detection tools.

There are some independent tools, such as the open-source GPTZero, which are designed to spot content generated by the chatbot, and OpenAI itself is experimenting with ways to watermark text generated by GPT-3 to make detection easier in the future.

In the meantime, the company has trained a text classifier to distinguish between text written by a human and text written by an AI. It works regardless of which provider’s AI produced the text, but its accuracy is poor at the moment: it correctly identifies only about 26% of AI-written text as “likely AI-written”.

“We’re making this classifier publicly available to get feedback on whether imperfect tools like this one are useful. Our work on the detection of AI-generated text will continue, and we hope to share improved methods in the future,” OpenAI wrote.

“While it is impossible to reliably detect all AI-written text, we believe good classifiers can inform mitigations for false claims that AI-generated text was written by a human: for example, running automated misinformation campaigns, using AI tools for academic dishonesty, and positioning an AI chatbot as a human.”