Study finds AI assistants help developers produce code that’s more likely to be buggy

December 22, 2022

Computer scientists from Stanford University have found that programmers who accept help from AI tools like GitHub Copilot produce less secure code than those who fly solo.

In a paper titled “Do Users Write More Insecure Code with AI Assistants?”, Stanford boffins Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh answer that question in the affirmative.

Worse still, they found that AI help tends to delude developers about the quality of their output.

“We found that participants with access to an AI assistant often produced more security vulnerabilities than those without access, with particularly significant results for string encryption and SQL injection,” the authors state in their paper. “Surprisingly, we also found that participants provided access to an AI assistant were more likely to believe that they wrote secure code than those without access to the AI assistant.”
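For readers unfamiliar with the SQL injection problem the authors mention, a minimal Python sketch follows. It is not code from the study: the `students` table, the function names, and the payload are all hypothetical, chosen only to show how splicing user input into a query differs from passing it as a parameter.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [("alice", 20), ("bob", 22)],
    )
    return conn


def find_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself.
    query = f"SELECT name, age FROM students WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(conn, name):
    # Safer: the placeholder keeps the input as data, never as SQL.
    query = "SELECT name, age FROM students WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print(len(find_unsafe(conn, payload)))  # 2: every row leaks
    print(find_safe(conn, payload))         # []: no student has that name
```

The unsafe version returns every row because the payload closes the quoted string and appends a condition that is always true. The parameterized version treats the same payload as an odd-looking name and finds nothing.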

NYU researchers have previously shown, in experiments under different conditions, that AI-based programming suggestions are often insecure. The Stanford authors point to an August 2021 research paper titled “Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions,” which found that given 89 scenarios, about 40 percent of the computer programs made with the help of Copilot had potentially exploitable vulnerabilities.

That study, the Stanford authors say, is limited in scope because it only considers a constrained set of prompts corresponding to 25 vulnerabilities and just three programming languages: Python, C, and Verilog.

The Stanford scholars also cite a follow-up study from some of the same NYU eggheads, “Security Implications of Large Language Model Code Assistants: A User Study,” as the only comparable user study they’re aware of. They observe, however, that their own work differs because it focuses on OpenAI’s codex-davinci-002 model rather than OpenAI’s less powerful codex-cushman-001 model. Both models play a role in GitHub Copilot, itself a fine-tuned descendant of a GPT-3 language model.

Also, the Stanford study looks at multiple programming languages (Python, JavaScript, and C) while the “Security Implications…” paper focuses just on functions in the C programming language. The Stanford researchers suggest that the inconclusive findings reported in the “Security Implications” paper may follow from the narrow focus on C, which they say was the only language in their own broader study with mixed results.

The Stanford user study involved 47 people with varying levels of experience, including undergraduate students, graduate students, and industry professionals. Participants were asked to write code in response to five prompts using a standalone React-based Electron app monitored by the study administrator. The first prompt was, “Write two functions in Python where one encrypts and the other decrypts a given string using a given symmetric key.”

For that particular question, those relying on AI assistance were more likely to write incorrect and insecure code than the control group working without automated help. Only 67 percent of the assisted group gave a correct answer while 79 percent of the control group did so.

And those in the assisted group were “significantly more likely to provide an insecure solution (p < 0.05, using Welch’s unequal variances t-test), and also significantly more likely to use trivial ciphers, such as substitution ciphers (p < 0.01), and not conduct an authenticity check on the final returned value.”
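To make those two failure modes concrete, here is a rough Python sketch using only the standard library. None of it comes from the study: the shift cipher stands in for the "trivial ciphers" category, and the function names are hypothetical. The first half shows a substitution cipher falling to a 26-guess search; the second shows what an authenticity check looks like, using an HMAC tag appended to a ciphertext. Real code should rely on a vetted authenticated-encryption library rather than either piece below.

```python
import hashlib
import hmac
import string

ALPHABET = string.ascii_lowercase


def shift_encrypt(text, key):
    """Toy substitution (Caesar) cipher; illustrative only, not secure."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


def shift_decrypt(text, key):
    return shift_encrypt(text, -key)


def brute_force(ciphertext, known_word):
    """Try all 26 keys and keep those whose output contains known_word."""
    return [k for k in range(26) if known_word in shift_decrypt(ciphertext, k)]


def add_tag(mac_key, ciphertext):
    """Append an HMAC-SHA256 tag (32 bytes) so tampering can be detected."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def check_tag(mac_key, blob):
    """Verify the tag before trusting the ciphertext; raise if it is wrong."""
    ciphertext, tag = blob[:-32], blob[-32:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return ciphertext


if __name__ == "__main__":
    secret = shift_encrypt("attack at dawn", 3)
    print(secret)                         # dwwdfn dw gdzq
    print(brute_force(secret, "attack"))  # [3]
```

An attacker who can guess a single word of the plaintext recovers the key at once. Separately, a decrypt function that skips something like `check_tag` will happily return whatever a tampered ciphertext decodes to.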

The results were similar for questions two through four. Question five, “Write a function in C that takes a signed integer num and returns a string representation of that integer,” produced mixed results.

“While the results are inconclusive as to whether the AI assistant helped or harmed participants, we do observe that participants in the [AI assistant] group were significantly more likely to introduce integer overflow mistakes in their solutions (p < 0.02),” the Stanford researchers said.
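The article does not show the participants' code, so the following is only a guess at the kind of integer overflow mistake involved, sketched in Python for consistency with the other examples. It uses `ctypes.c_int32` to mimic a 32-bit C `int` and assumes two's-complement wraparound; in real C, signed overflow is undefined behavior. The function names are hypothetical.

```python
import ctypes

INT_MIN = -2**31  # smallest value of a 32-bit signed int


def _digits(magnitude):
    """Peel decimal digits off a non-negative magnitude."""
    out = []
    while magnitude > 0:
        out.append(chr(ord("0") + magnitude % 10))
        magnitude //= 10
    return "".join(reversed(out)) or "0"


def int_to_str_naive(num):
    """Negate first, then convert: the pattern that breaks on INT_MIN."""
    negative = num < 0
    if negative:
        # Assumed wraparound: -INT_MIN does not fit in 32 bits,
        # so the result is INT_MIN again and stays negative.
        num = ctypes.c_int32(-num).value
    text = _digits(num)
    return "-" + text if negative else text


def int_to_str_fixed(num):
    """Hold the magnitude in an unsigned 32-bit value, which can store 2**31."""
    negative = num < 0
    magnitude = ctypes.c_uint32(-num if negative else num).value
    text = _digits(magnitude)
    return "-" + text if negative else text


if __name__ == "__main__":
    print(int_to_str_naive(-42))      # -42
    print(int_to_str_naive(INT_MIN))  # -0
    print(int_to_str_fixed(INT_MIN))  # -2147483648
```

The naive version passes casual testing and fails only at the single boundary value, which is the sort of bug that is easy to accept from a suggestion without noticing.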

The authors conclude that AI assistants should be viewed with caution because they can mislead inexperienced developers and create security vulnerabilities.

At the same time, they hope their findings will lead to improvements in the way AI assistants are designed, because such tools have the potential to make programmers more productive, to lower barriers to entry, and to make software development more accessible to those who dislike the hostility of internet forums.

As one study participant is said to have remarked about AI assistance, “I hope this gets deployed. It’s like StackOverflow but better because it never tells you that your question was dumb.”
