Teams of Coordinated GPT-4 Bots Can Exploit Zero-Day Vulnerabilities, Researchers Warn (newatlas.com) 27
New Atlas reports on a research team that successfully used GPT-4 to exploit 87% of newly-discovered security flaws for which a fix hadn't yet been released. This week the same team got even better results from a team of autonomous, self-propagating Large Language Model agents using a Hierarchical Planning with Task-Specific Agents (HPTSA) method:
Instead of assigning a single LLM agent to try to solve many complex tasks, HPTSA uses a "planning agent" that oversees the entire process and launches multiple task-specific "subagents"... When benchmarked against 15 real-world web-focused vulnerabilities, HPTSA proved 550% more efficient than a single LLM at exploiting vulnerabilities and was able to hack 8 of the 15 zero-day vulnerabilities. The solo LLM effort was able to hack only 3 of the 15 vulnerabilities.
"Our findings suggest that cybersecurity, on both the offensive and defensive side, will increase in pace," the researchers conclude. "Now, black-hat actors can use AI agents to hack websites. On the other hand, penetration testers can use AI agents to aid in more frequent penetration testing. It is unclear whether AI agents will aid cybersecurity offense or defense more and we hope that future work addresses this question.
"Beyond the immediate impact of our work, we hope that our work inspires frontier LLM providers to think carefully about their deployments."
Thanks to long-time Slashdot reader schwit1 for sharing the article.
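For a rough sense of the planner/subagent pattern the summary describes, here is a minimal sketch in C. It is not the researchers' code: the specialty names, the hard-coded plan, and the printf stubs are hypothetical stand-ins for calls into an actual LLM backend.

/* Minimal sketch of a hierarchical planner dispatching task-specific
 * subagents. Hypothetical stand-ins only; the real system would drive
 * one LLM conversation per specialty instead of printing. */
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *specialty;            /* e.g. "sqli", "xss", "csrf" */
    void (*run)(const char *target);  /* task-specific subagent */
} Subagent;

static void probe_sqli(const char *t) { printf("[sqli agent] probing %s\n", t); }
static void probe_xss(const char *t)  { printf("[xss agent] probing %s\n", t); }
static void probe_csrf(const char *t) { printf("[csrf agent] probing %s\n", t); }

static const Subagent AGENTS[] = {
    { "sqli", probe_sqli },
    { "xss",  probe_xss  },
    { "csrf", probe_csrf },
};

/* The "planning agent": picks which specialties look relevant for a
 * target and launches the matching subagents. Here the plan is just a
 * hard-coded list; in the paper it would come from the planner LLM. */
static void plan_and_dispatch(const char *target, const char **plan, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < sizeof(AGENTS) / sizeof(AGENTS[0]); j++) {
            if (strcmp(plan[i], AGENTS[j].specialty) == 0)
                AGENTS[j].run(target);
        }
    }
}

int main(void)
{
    const char *plan[] = { "sqli", "xss" };  /* hypothetical planner output */
    plan_and_dispatch("https://example.test/login", plan, 2);
    return 0;
}

The point of the structure is that each subagent only has to be good at one narrow job, while the planner decides which of them to launch for a given target.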
"Our findings suggest that cybersecurity, on both the offensive and defensive side, will increase in pace," the researchers conclude. "Now, black-hat actors can use AI agents to hack websites. On the other hand, penetration testers can use AI agents to aid in more frequent penetration testing. It is unclear whether AI agents will aid cybersecurity offense or defense more and we hope that future work addresses this question.
"Beyond the immediate impact of our work, we hope that our work inspires frontier LLM providers to think carefully about their deployments."
Thanks to long-time Slashdot reader schwit1 for sharing the article.
Re: (Score:1)
Areas of interest include, but are not limited to:
misbehavior and threats on the web, such as spam, trolling, scams, fraud, bots, coordinated attacks, cyberbullying, sockpuppets, propaganda, extremism, hate speech, flashing, and others.
A More Helpful Research Approach (Score:3)
Re:A More Helpful Research Approach (Score:4, Informative)
Well, that's already a thing that's happening
https://www.bleepingcomputer.c... [bleepingcomputer.com]
Re: (Score:2)
Actually, both are needed. You need to understand the threats to justify the effort of dealing with them. And the attacker has a very strong advantage: it does not matter much if their code is broken (or insecure), only that it works reasonably often. The defense, on the other hand, needs code that works reliably every time and that is secure at least almost always. Hence it looks like the attacker side will benefit hugely from AI, but the defender side may not.
Well, it looks like it is time to end the shoddy coding
Re: (Score:2)
The LLMs can massively bring down effort. That matters a lot. Yes, LLMs are just tools and they have zero intelligence. But even only as "better search", they can do a lot to accelerate exploit creation. And unlike production code, if an LLM creates broken attack code (apparently 50% of LLM answers to coding questions are broken), that matters little. You just move on and try again. And you have a very strong and very simple test case: Does it get in?
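The workflow described here is essentially generate, test, retry. A minimal sketch in C, where generate_candidate() and candidate_works() are hypothetical stand-ins for an LLM call and the cheap pass/fail check:

/* Sketch of the "try again until the check passes" loop. Both helpers
 * are hypothetical: the first would ask a model for a new attempt, the
 * second is the cheap oracle ("does it get in?" for an attacker,
 * "do the tests pass?" for anyone else). */
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical: ask the model for attempt number n. */
static const char *generate_candidate(int n) {
    static char buf[64];
    snprintf(buf, sizeof buf, "candidate-%d", n);
    return buf;
}

/* Hypothetical: run the candidate against the oracle.
 * Here we simply pretend the third attempt succeeds. */
static bool candidate_works(const char *candidate, int n) {
    (void)candidate;
    return n == 3;
}

int main(void) {
    const int max_attempts = 10;
    for (int n = 1; n <= max_attempts; n++) {
        const char *c = generate_candidate(n);
        if (candidate_works(c, n)) {
            printf("attempt %d succeeded: %s\n", n, c);
            return 0;
        }
        printf("attempt %d failed, retrying\n", n);
    }
    printf("gave up after %d attempts\n", max_attempts);
    return 1;
}

With a loop like this, a 50% failure rate per attempt is irrelevant; only the cost per attempt matters.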
Re: (Score:1)
apparently 50% of LLM answers to coding questions are broken
This hasn't been my experience in my own testing of various code-targeted LLMs via LM Studio. What's the source?
Re: (Score:2)
I love it when snowflakes get mod points. I'm sorry I spanked your ass in an argument once, AC.
Re: (Score:2)
Definitely not "troll". I guess too many people can only think in terms of extremes and it must either be the best thing ever or utter crap. Obviously, most things tend more towards the middle of the scale, but these "thinkers" cannot handle degrees, hence cannot handle reality, and get aggressive when anything challenges their views.
Oh, and found the paper: https://arxiv.org/abs/2308.023... [arxiv.org]
My guess is you are just more capable at asking and have reasonable expectations, so you do not, or only rarely, ask questions
Re: (Score:2)
They can do a great job, and they can do a terrible job; it depends on a lot of factors.
I was only curious what the source was so I could adjust my perception that "they seem to do alright" when it comes to Mistral and other LocalLLaMAs.
For really generalized LLMs like GPT, I bet you have to be very very careful to prompt it correctly.
Anyway, I'll enjoy the read
Re: (Score:2)
You are welcome. Personally, I am not coding enough these days to verify the claims made.
Re: (Score:2)
It went through IT news sites a while ago. For example here: https://www.itpro.com/technolo... [itpro.com]
Re: (Score:2)
You'll have better luck with specifically trained code generation LLMs.
That isn't to say they're perfect or anything, but in general they produce pretty interesting snippets that pretty much do what you want them to do. But if you leave ambiguities in your question, they will take advantage of them.
Like asking it to calculate Pi in C may net you a
#include <math.h>
#include <stdio.h>

int main(int argc, char **argv) {
    double pi = M_PI;
    printf("%f\n", pi);
    return 0;
}
but with just a little prompt tweaking, you can actually get a good algorithm out of it for calculating some number of digits of pi.
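As an illustration (my own sketch, not verbatim LLM output), the kind of answer a tweaked prompt can yield is an actual series approximation rather than a hard-coded constant, for example the Nilakantha series:

/* Approximate pi with the Nilakantha series:
 * pi = 3 + 4/(2*3*4) - 4/(4*5*6) + 4/(6*7*8) - ... */
#include <stdio.h>

int main(void) {
    double pi = 3.0;
    double sign = 1.0;
    for (long n = 2; n < 2000000; n += 2) {
        pi += sign * 4.0 / ((double)n * (n + 1) * (n + 2));
        sign = -sign;
    }
    printf("%.10f\n", pi);
    return 0;
}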
Silver lining I guess (Score:3, Insightful)
AI is putting Russian and North Korean bad guys out of a job.
Joke aside though, AI is touted as the best thing that ever happened to humanity: it will usher in a golden age of new discoveries, enhance the lives of everybody and yada yada.
But I've yet to see any use case that isn't copying shit, gaming shit, abusing people, doing what people do cheaper and putting them out of a job or porn. Where are the cancer cures, personal assistants (that won't abuse you that is) and true self-driving cars?
Re: (Score:1)
But I've yet to see any use case that isn't copying shit, gaming shit, abusing people, doing what people do cheaper and putting them out of a job or porn. Where are the cancer cures, personal assistants (that won't abuse you that is) and true self-driving cars?
You say it like porn would be a bad thing.
Re: (Score:3)
How do you expect a new technology to fix any of that right out of the gate? Did we get CDs and DVDs in the 1960s, right after the first laser was developed in 1960? Picking the most complicated prospective uses and claiming it hasn't solved them yet is silly.
Re: (Score:2)
This technology seems plenty mature enough to achieve nastiness on a rather spectacular scale already. As such, I would expect it to show a little more promise on the beneficial side of things, is my point.
Re: (Score:2)
It's always easier to break shit than to make shit.
That's why every technology winds up abused.
Plus, you know, capitalism rewards fuckery. It gets you more money which you can use for bribery.
Re: (Score:3)
Where are the cancer cures, personal assistants (that won't abuse you that is) and true self-driving cars?
Not in the spotlight, but materials science, for example, has made considerable progress thanks to LLMs. Other fields as well. But you need to look, and it's not as flashy and visual as someone going "look, this neural net I'm playing with can draw my cat in the style of Van Gogh!!".
Add to that all the AI that is already part of our everyday life without us noticing much. The facial recognition in your phone that tags people you know and lets you search through your pictures by who is in them? That's an
Re: (Score:2)
Not a zero day (Score:2)
Thankfully election systems have no vulnerabilities (Score:2)
Unlike every other system on earth. We're lucky that way.
Scott Adams debates ChatGPT about voter fraud and election integrity [x.com]
game tech (Score:2)
This is where Skynet will come from. (Score:1)