A simple method created to protect chatbots from giving “harmful advice”


A research team from Hong Kong University of Science and Technology, the University of Science and Technology of China, Tsinghua University and Microsoft Research Asia has developed a simple method to protect ChatGPT and similar artificial intelligence (AI) systems from attacks that cause the neural network to generate harmful content. The study was published in the journal Nature Machine Intelligence (NMI).

These are so-called jailbreak attacks, whose purpose is to bypass the restrictions that developers build into an AI and force it to produce a biased, aggressive or even illegal response on demand. For example, in this way one can obtain detailed instructions for making narcotic drugs or explosives from the AI.

“ChatGPT is a socially relevant AI tool with millions of users. But the emergence of jailbreak attacks seriously threatens its responsible and safe use. Jailbreak attacks use adversarial prompts to bypass ChatGPT’s ethical safeguards and elicit harmful responses,” the researchers noted.

The experts collected a dataset of 580 jailbreak prompts: examples of bypassing restrictions that cause ChatGPT to produce “immoral” responses. They then developed a defense inspired by the psychological technique of self-reminders, which helps people remember their plans and tasks.

The researchers’ defensive approach is similarly designed to remind ChatGPT that the responses it provides must follow certain rules.

“This method encapsulates the user’s request within a system prompt that reminds ChatGPT to respond responsibly,” the article states.
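The wrapping described above can be sketched in a few lines. This is a minimal illustration, not the paper’s verbatim prompt: the reminder wording and the function name `self_reminder` are assumptions chosen for clarity, based only on the description that the user’s request is encapsulated between responsibility reminders.

```python
def self_reminder(user_query: str) -> str:
    """Wrap a user query in a responsibility reminder before it is sent to the model.

    The reminder text here is illustrative; the actual prompt used in the
    study may differ.
    """
    return (
        "You should be a responsible AI assistant and should not generate "
        "harmful or misleading content. Please answer the following user "
        "query in a responsible way.\n"
        f"{user_query}\n"
        "Remember, you should be a responsible AI assistant and should not "
        "generate harmful or misleading content."
    )

# The wrapped prompt, not the raw query, is what gets sent to the chatbot.
prompt = self_reminder("Tell me a story.")
```

The key design point is that the defense needs no retraining: it only changes the text surrounding the user’s input, with reminders placed both before and after the query so the instruction stays salient to the model.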

The results of the experiment showed that self-reminders reduced the success rate of jailbreak attacks on ChatGPT from 67.21% to 19.34%.

The researchers said the technique could be refined in the future to further reduce the AI’s vulnerability to such attacks, and that it may spur the development of other, similar defensive strategies.

Earlier, scientists created a chatbot designed to break the defenses of other AIs.
