Researchers Discover Security Threats in ChatGPT and Other AI Tools

According to research by the University of Sheffield’s Department of Computer Science, Artificial Intelligence tools such as ChatGPT can be fooled into creating malicious code that can be used to launch cyber attacks.


This study provides a novel demonstration showing that Text-to-SQL systems can be used to attack computer systems. These are AI tools that allows people to search databases by asking questions in plain language. The study’s findings show that the AI can be tricked into destroying databases, stealing sensitive personal information, and bringing down services through DOS attacks.


The Sheffield research team found security loopholes in about six commercial AI tools-BAIDU-UNIT, ChatGPT, AI2SQL, AIHELPERBOT, Text2SQL, and ToolSKE- and successfully countered each one. They found that if asked specific questions, the AI could produce malicious code. If executed, the code would destroy or tamper with the information in the database. Using BAIDU-UNIT, the researchers obtained confidential BAIDU server configurations.


‘In reality, many companies are simply not aware of these types of threats and due to the complexity of chatbots, even within the community, there are things that are not fully understood,’ said Xutan Peng, doctoral student and co-author of the research.


‘At the moment, ChatGPT is receiving a lot of attention. It’s a standalone system, so the risks to the service itself are minimal, but what we found is that is can be tricked into producing malicious code that can do serious harm to other services.’


The study’s findings highlighted the dangerous ways people are learning programming languages using AI, so they can interact with databases.


‘The risk with AIs like ChatGPT is that more and more people are using them as productivity tools, rather than a conversational bot, and this is where our research shows the vulnerabilities are. For example, a nurse could ask ChatGPT to write an SQL command so that they can interact with a database, such as one that stores clinical records. As shown in our study, the SQL code produced by ChatGPT in many cases can be harmful to a database, so the nurse in this scenario may cause serious data management faults without even receiving a warning,’ added Peng.


During the study, the research team discovered they could launch simple backdoor attacks in Text-to-SQL models by poisoning the training data. Although this backdoor attack doesn’t really affect the general model performance, it can be activated at any time to cause real damage.


‘Users of Text-to-SQL systems should be aware of the potential risks highlighted in this work. Large language models, like those used in Text-to-SQL systems, are extremely powerful but their behaviour is complex and can be difficult to predict. At the University of Sheffield, we are currently working to better understand these models and allow their full potential to be safely realized,’ said Mark Stevenson, senior lecturer in Natural Language Processing at the University.


The work of the team has been recognized by Baidu, and in response, the company has fixed all the reported vulnerabilities. The researchers were also financially rewarded. OpenAI has also fixed the reported issues found in ChatGPT in February.


‘Our efforts are being recognized by industry and they are following our advice to fix these security flaws. However, we are opening a door on an endless road- what we now need to see are large groups of researchers creating and testing patches to minimize security risks through open source communities,’ continued Peng.


‘There will always be more advanced strategies being developed by attackers, which means security strategies must keep pace. To do so we need a new community to fight these next generation attacks.’


By Marvellous Iwendi,


Source: University of Sheffield