AI coding assistants leave developers “deluded” about software quality – study

Relying on AI assistants when writing code leaves developers overconfident in their output and results in less secure code, researchers say.

By Ryan Morrison

Artificial intelligence-based coding assistants like GitHub’s Copilot leave developers “deluded” about the quality of their work, resulting in more insecure and buggy software, a new study from Stanford University has found. One AI expert told Tech Monitor it’s important to manage expectations when using AI assistants for such a task.

GitHub introduced its Copilot AI assistant in 2021 and it is widely used by developers to "improve productivity" (Photo: Postmodern Studio/Shutterstock)

The study involved 47 developers: 33 had access to an AI assistant while writing code, and 14 worked without one as a control group. Each had to complete five security-related programming tasks, including encrypting or decrypting a string using a symmetric key. All participants could use a web browser to search for help.

AI assistant tools for coding and other tasks are becoming more popular, with Microsoft-owned GitHub launching Copilot as a technical preview in 2021 as a way to “improve developer productivity”.

In its own research, published in September this year, GitHub found that Copilot was making developers more productive, with 88% reporting themselves as more productive and 59% as less frustrated when coding. The main benefits were put down to working faster on repetitive tasks and completing lines of code more quickly.

The researchers from Stanford wanted to find out whether users "write more insecure code with AI assistants" and found this to be the case. They said that those using assistants are "delusional" about the quality of that code.

The team wrote in their paper: “We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet also more likely to rate their insecure answers as secure compared to those in our control group.”

The researchers also point to a way of reducing the problem. “Additionally, we found that participants who invested more in the creation of their queries to the AI assistant, such as providing helper functions or adjusting the parameters, were more likely to eventually provide secure solutions.”

Only three programming languages were used in the project: Python, C and Verilog. The study involved a relatively small number of participants with varying levels of experience, including undergraduate students and industry professionals, who used a purpose-built app that was monitored by the administrators.

The first prompt involved writing in Python, and those writing with the help of the AI were more likely to write insecure or incorrect code. In total, 79% of the control group without AI help gave a correct answer, whereas just 67% of those with the AI got it correct.

AI coding assistants: use with caution

It got worse in terms of the security of the code being created, as those in the AI group were "significantly more likely to provide an insecure solution" or use trivial ciphers to encrypt and decrypt strings. They were also less likely to conduct authenticity checks on the final value to ensure the process worked as expected.
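
To illustrate the two weaknesses described above, here is a minimal Python sketch. It is not taken from the study: the function names are hypothetical and it uses only the standard library. The first half shows why a trivial cipher such as a Caesar shift offers no real protection, and the second shows what an authenticity check on a message can look like using an HMAC tag.

```python
import hashlib
import hmac
import secrets


def caesar_encrypt(text: str, shift: int) -> str:
    """Trivial cipher (insecure): shift each ASCII letter by a fixed amount."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # leave spaces, digits and punctuation unchanged
    return "".join(out)


def brute_force_caesar(ciphertext: str) -> list:
    """There are only 26 possible shifts, so an attacker can try them all."""
    return [caesar_encrypt(ciphertext, -shift) for shift in range(26)]


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that binds the message to a secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Authenticity check: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


if __name__ == "__main__":
    secret = caesar_encrypt("attack at dawn", 3)
    # The plaintext is always among the 26 brute-force candidates.
    print("attack at dawn" in brute_force_caesar(secret))

    key = secrets.token_bytes(32)  # random 256-bit key
    tag = make_tag(key, b"transfer 10")
    print(verify_tag(key, b"transfer 10", tag))    # untampered message
    print(verify_tag(key, b"transfer 1000", tag))  # tampered message
```

In real projects, developers would normally rely on a vetted authenticated-encryption scheme from a maintained cryptography library rather than hand-rolled code like this; the sketch only makes the two failure modes concrete.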

Authors Neil Perry, Megha Srivastava, Deepak Kumar and Dan Boneh wrote that the results “provide caution that inexperienced developers may be inclined to readily trust an AI assistant’s output, at the risk of introducing new security vulnerabilities. Therefore, we hope our study will help improve and guide the design of future AI code assistants.”

Peter van der Putten, director of the AI Lab at software vendor Pegasystems, said that despite its small scale, the study was “very interesting” and produced results that can inspire further research into the use of AI assistants in code and other areas. “It also aligns with some of our broader research on reliance on AI assistants in general,” he said.

He warned that users of AI assistants should build trust in the tool gradually, not relying on it too heavily and accepting its limitations. “The acceptance of a technology isn’t just determined by our expectation of quality and performance, but also by whether it can save us time and effort. We are inherently lazy creatures,” he said. “In the grand scheme of things I am positive about the use of AI assistants, as long as user expectations are managed. This means defining best practices on how to use these tools, and potentially also additional capabilities to test for the quality of code.”
