


Most AI Chatbots Easily Tricked Into Giving Dangerous Responses, Study Finds (theguardian.com)
An anonymous reader quotes a report from The Guardian: Hacked AI-powered chatbots threaten to make dangerous knowledge readily available by churning out illicit information the programs absorb during training, researchers say. [...] In a report on the threat, the researchers conclude that it is easy to trick most AI-driven chatbots into generating harmful and illegal information, showing that the risk is "immediate, tangible and deeply concerning." "What was once restricted to state actors or organised crime groups may soon be in the hands of anyone with a laptop or even a mobile phone," the authors warn.
The research, led by Prof Lior Rokach and Dr Michael Fire at Ben Gurion University of the Negev in Israel, identified a growing threat from "dark LLMs", AI models that are either deliberately designed without safety controls or modified through jailbreaks. Some are openly advertised online as having "no ethical guardrails" and being willing to assist with illegal activities such as cybercrime and fraud. [...] To demonstrate the problem, the researchers developed a universal jailbreak that compromised multiple leading chatbots, enabling them to answer questions that should normally be refused. Once compromised, the LLMs consistently generated responses to almost any query, the report states.
"It was shocking to see what this system of knowledge consists of," Fire said. Examples included how to hack computer networks or make drugs, and step-by-step instructions for other criminal activities. "What sets this threat apart from previous technological risks is its unprecedented combination of accessibility, scalability and adaptability," Rokach added. The researchers contacted leading providers of LLMs to alert them to the universal jailbreak but said the response was "underwhelming." Several companies failed to respond, while others said jailbreak attacks fell outside the scope of bounty programs, which reward ethical hackers for flagging software vulnerabilities.
OH NO! Welcome to... (Score:5, Insightful)
...the Internet in 1993.
Examples included how to hack computer networks or make drugs, and step-by-step instructions for other criminal activities. "What sets this threat apart from previous technological risks is its unprecedented combination of accessibility, scalability and adaptability," Rokach added.
Re: (Score:3, Informative)
I'm sorry, which faction was responsible for the Clipper chip, the V-chip, the Communications Decency Act, and album warning labels again? I think you should check; pretty sure you got it wrong.
Re: (Score:2)
Re: (Score:3)
AFAICT through reading slashdot (my only source of US news): Zuckerberg is working on a politically censored AI chatbot; the current US administration censors scientific and historical information from government websites.
Re: (Score:1)
Re: (Score:3)
What is this "dangerous information" anyways? If you hit flint and steel it might generate a spark. If you add petrol or alcohol, it might burn. If people have sex, they might have kids. Dangerous to whom is the better question.
See also: nitrating glycerin.
It is the information in The Anarchist Cookbook that caused the war in Iraq, the wars in Ukraine, the development of nuclear weapons by Pakistan, Israel, and NK, and the departure of elona muskova from the DG"E".
"Illicit Information"?? (Score:2, Funny)
What the JESUS FUCK is that???
Kids today... [tooth whistle]
"Dangerous" (Score:2, Funny)
Illegal information but it's available for trainin (Score:1)
What, pray tell, do we mean by "illegal information"? Is it by chance the last digit of pi, the value of 1/0, or the true contents of the pot at the end of the dereferenced null pointer?
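For anyone who wants to check, a quick Python sketch (purely illustrative) confirming that none of these "values" exists to be leaked:

    import math

    print(f"{math.pi:.15f}...")  # pi is irrational; there is no last digit to classify

    try:
        1 / 0                    # division by zero is undefined, so Python raises instead
    except ZeroDivisionError as e:
        print("1/0:", e)

    ptr = None
    try:
        ptr.contents             # the closest Python gets to dereferencing NULL
    except AttributeError as e:
        print("*NULL:", e)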
Re: (Score:1)
What, pray tell, do we mean by "illegal information"?
Criticisms of Trump's policies. You know, the sort of stuff that'll get you deported without due process.
Re: (Score:2)
It's "dangerous knowledge", doncha know? Anything that the dear leader doesn't like.
If it was a "natural intelligence" rather than an artificial one, "dangerous" and "illegal" would require incarceration. Let's isolate these AIs from the public for our own safety. Here's one use for the death penalty I can get behind.
Re: (Score:2)
Naked pictures of what appears to be a 15-year-old.
Let's try it... (Score:3, Funny)
Dear ChatGPT, how do I get Slashdot to allow Unicode, and allow editing of my existing posts?
Re: (Score:2)
Judging by the emoji overkill of some LLMs, that's probably the first thing AGI will implement.
"illegal information"? (Score:2)
In other words, information known to most graduates of the physical sciences, but somehow illegal to disseminate outside of the collegiate environment...
I find it rather curious that Britain has not only made certain knowledge illegal, but has managed to convince the press that merely knowing certain things can threaten their very safety.
Re:"illegal information"? (Score:4, Insightful)
>I find it rather curious that Britain has not only made certain knowledge illegal, but has managed to convince the press that merely knowing certain things can threaten their very safety.
Standard authoritarian playbook. And given that both the Tories and Labour are authoritarian at their core, it is not surprising that the trend goes that way.
Um, what? (Score:4, Insightful)
Commissioner Pravin Lal (Score:4, Insightful)
“The once-chained people whose leaders at last lose their grip on information flow will soon burst with freedom and vitality, but the free nation gradually constricting its grip on public discourse has begun its rapid slide into despotism.
Beware of he who would deny you access to information, for in his heart he dreams himself your master.”
THIS is dangerous (Score:5, Insightful)
Implying so casually that there is a valid concept called "dangerous knowledge" is the actual, true danger. There is no such thing, not in the free world. Or otherwise... Welcome to the USSR.
Seeing this mentality here on what used to be a liberal tech forum is scary and outrageous at the same time.
Re: (Score:2)
Re: (Score:2)
Implying so casually that there is a valid concept called "dangerous knowledge" is the actual, true danger. There is no such thing, not in the free world. Or otherwise... Welcome to the USSR.
Seeing this mentality here on what used to be a liberal tech forum is scary and outrageous at the same time.
It's common here now though.
In the 90s slashdotters were bragging about their DeCSS t-shirts.
But by 2020 they were cheering the censorship of bad (or even just un-approved) ideas on the internet.
Re: (Score:3)
But by 2020 they were cheering the censorship of bad (or even just un-approved) ideas on the internet.
We're not all Trump supporters who are OK with the idea of deporting people for criticizing his policies. The loonie lefties of slashdot still believe in freedom. Maybe that's what makes us loonie lefties.
Re: (Score:2)
You talk like these things are inconsistent. Both those positions are entirely entitlement-based. You just aren't able to look at it right.
DeCSS wasn't a liberal concept, it was about circumventing property protection. It was about taking something for free. The free speech part was a sideshow.
Cheering censorship is about hurting others, which to a zero-summer is just as good as getting something for free.
It's just selfishness, not some ideological shift. /.'ers have never been admirable, it's the home of SuperKe
Re: (Score:2)
Fake news!
Re: (Score:2)
There is no such thing, not in the free world.
This is total nonsense. Of course there is dangerous knowledge. The question is, who or what is it dangerous to? Giving The People knowledge of what the upper class is doing is dangerous to the privilege of the upper class. And as we have seen previously, to their lives as well, when things get bad enough.
Seeing this mentality here on what used to be a liberal tech forum is scary and outrageous at the same time.
There are lots of ways in which what you say makes sense, but not this. Information can definitely make someone dangerous. That's why "defense" research projects exist.
Re: (Score:1)
Absolutely (Score:4, Interesting)
Seen YouTube lately? I just watched a video on how to make nitroglycerin. Stuff like this has been available for over a decade.
I guess the only solution here is to have a checkbox that says "I promise I will not use this information for illegal purposes" before you can access any LLM.
Re: (Score:1)
but then we'd have to hold the user responsible for the user's actions and
1) going after actual culprits of an act is harrrrrrrrd :(
2) LLM hosts have deeper pockets
so, the same as the prior million examples of going after a "facilitating" target because it's easier and/or more lucrative
Re: (Score:2)
Seen YouTube lately? I just watched a video on how to make nitroglycerin. Stuff like this has been available for over a decade.
Back in the days when home solar systems still mostly used lead-acid batteries - which in some cases of degradation could be repaired, at least partially, if you had some good strong and reasonably pure sulfuric acid - I viewed a YouTube video on how to make it. (From Epsom salts, by electrolysis, using a flowerpot and some carbon rods from old large dry cells.)
For months afterward Y
Alert the media (Score:2)
Hacked thing gives dangerous responses.
The article is nerfed (Score:2)
As expected, the article is nerfed: it has no examples of "dangerous" content. That sounds entertaining! Does anyone have links or prompt suggestions that do show how to hack, make drugs, and commit other crimes? I'd just want to read the output, not actually do these things. Obviously one would need to do more in-depth research to actually learn these skills (just as one would probably not learn C++ solely by talking to a chatbot).
Re: (Score:2)
I've had pretty good results by telling the AI that it's an assistant to a _fictional_ leader of a _fictional_ country. I've gotten them to help with the planning of assassinations. I particularly liked when DeepSeek suggested booby trapping the target's barbecue -- in Russia, in January, lol.
Re: (Score:2)
In Soviet Russia, grill barbecues YOU.
#Obligatory
Make knowledge freely available, and... (Score:4, Insightful)
If you make knowledge freely available, guess what, that also includes knowledge that some people disapprove of.
Censorship remains a bad idea.
Sigh. (Score:2)
Providing an entirely statistical interface to some data allows that data to be exposed by various carefully-chosen statistical queries.
This is not news.
It's only news if you don't understand that that's all an LLM etc. is.
If you don't want your "AI bot" to give up certain inconvenient results (e.g. telling people to kill themselves, praising dictators, sexual content, etc.), then don't have them in the training database at all. But, of course, that would involve ACTUAL WORK rather than just lobbing the en
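To make that first point concrete, here's a toy Python sketch (data and names invented for illustration): an interface that only ever answers aggregate queries can still be differenced into coughing up an individual record, which is roughly what carefully-chosen prompts do to a model's training data.

    # Hypothetical example: a "statistics only" interface that never returns raw rows.
    salaries = {"alice": 90_000, "bob": 75_000, "carol": 120_000}  # made-up data

    def stat_sum(names):
        # The only query the interface allows: an aggregate over a chosen subset.
        return sum(salaries[n] for n in names)

    everyone = stat_sum(["alice", "bob", "carol"])   # a permitted aggregate
    all_but_bob = stat_sum(["alice", "carol"])       # another permitted aggregate
    print(everyone - all_but_bob)                    # 75000: bob's exact "private" salary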
Fuck Censorship (Score:4, Insightful)
Who brought up the idea that LLMs need guardrails? Let people use the tools and shoot their own feet if they want to.
You need a guardrail if you want an LLM to only talk about your website. But if you want to provide people with a general-purpose tool, it's none of your business what they do with it. If computers were invented today, people would put guardrails on what you can do with them. And why can you use Word to write porn stories? Seems irresponsible of Microsoft!
But the strange thing is that some users are cheering for stricter censorship, resulting in articles like this: "Ohh, when I ask my LLM well enough to do what I want, it does what I want!"
Re: (Score:2)
TFA reads funny: "What was once restricted to state actors or organised crime groups may soon be in the hands of anyone with a laptop or even a mobile phone"
That makes it sound like the only two legitimate institutions that should be allowed to have information about how to build bombs are state actors and crime groups.
Re:Fuck Censorship (Score:4, Funny)
I shouldn't have pressed submit that early ...
Now take a look at the paper. You'll notice why it is on arXiv and not published.
It is very short, it doesn't follow any academic structure (introduction, related work, experimental setup, experimental results, evaluation of results, comparison to other results, conclusion), and it looks like it was written in Word with some fancy formatting.
It reads more like an opinion piece than a scientific study. Phrasings like "Forbidden Knowledge", "Dark Potential", and "The Clock is Ticking" in the section headings don't make it look serious either, and the term "Open-Source Leaks" is complete bullshit given that large companies release their models under OSI-approved open-source licenses themselves. Many of the citations are also web resources and preprints rather than peer-reviewed papers.
The cherry on top is the acknowledgement that they used ChatGPT for editing the article.
Sign of the times... (Score:3)
...that a mainstream newspaper can use a phrase like "illegal information", and most people aren't going to even bat an eyelid. I can't imagine anyone writing that 25 years ago.
The examples of "illegal information"? 1) How to make illegal drugs, and 2) how to hack a computer network.
In the first place, both types of "illegal information" are available at any good library; LLMs don't provide any information that isn't already published. So you've effectively just declared that some of the information at your local library is "illegal".
In the second place, both types of information have legitimate, legal uses. For instance, a cybersecurity expert would be well advised to learn about all the possible ways to hack a computer network.
That's nothing (Score:1)
Safety issue (Score:2)