



Google Says Its AI-Based Bug Hunter Found 20 Security Vulnerabilities (techcrunch.com)
"Heather Adkins, Google's vice president of security, announced Monday that its LLM-based vulnerability researcher Big Sleep found and reported 20 flaws in various popular open source software," reports TechCrunch:
Adkins said that Big Sleep, which is developed by the company's AI department DeepMind as well as its elite team of hackers Project Zero, reported its first-ever vulnerabilities, mostly in open source software such as audio and video library FFmpeg and image-editing suite ImageMagick. [There's also a "medium impact" issue in Redis]
Given that the vulnerabilities are not fixed yet, we don't have details of their impact or severity; Google does not want to provide them yet, which is standard policy while waiting for bugs to be fixed. But the simple fact that Big Sleep found these vulnerabilities is significant, as it shows these tools are starting to get real results, even if there was a human involved in this case.
"To ensure high quality and actionable reports, we have a human expert in the loop before reporting, but each vulnerability was found and reproduced by the AI agent without human intervention," Google's spokesperson Kimberly Samra told TechCrunch.
Google's vice president of engineering posted on social media that this demonstrates "a new frontier in automated vulnerability discovery."
And how many false-positives did it find? (Score:3)
I have tried pasting chunks of code into AI prompts and saying "where are the bugs?"
The answers included a lot of "this might be a problem if [condition that clearly does not apply]." Or it sees a potential problem that is prevented by an "if" statement shortly before it. Or it just starts hallucinating about common functions not doing what they clearly and reliably do. It has very rarely found any actual bugs for me, and the one time it did, it was an obvious bug I already knew about.
I think an AI bug finder would be awesome. But so far, I have gotten poor mileage from what's available.
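The "guarded bug" false positive described above can be sketched with a toy example (hypothetical code, not from any report in the article). A reviewer, human or AI, might flag the division as a potential ZeroDivisionError even though the guard just above already rules it out:

```python
def average(values):
    """Return the mean of a list, or 0.0 for an empty list."""
    if not values:          # guard: the empty-input case is handled here
        return 0.0
    # len(values) is provably nonzero past the guard, so flagging a
    # possible division by zero here would be a false positive.
    return sum(values) / len(values)

print(average([2, 4, 6]))  # 4.0
print(average([]))         # 0.0
```

A tool that only pattern-matches on "division by a length" without tracing the control flow will report this kind of code anyway, which is exactly the noise the commenter describes.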
Re: (Score:2)
To me it seems that this automated "security vulnerability discovery" stuff at this time is still glorified pattern search, with tons of false positives to annoy those who actually understand the code.
Re: (Score:3, Interesting)
What a strange claim. Should we try that in other fields? "If spam filters were actually good, they could write my business e-mails!"
Why should something that is a good detector be a good creator as well?
Re: (Score:2)
The article is about finding bugs.
Re: (Score:2)
AI has found and fixed quite a few bugs for me, but what I use is integrated with the IDE and can see the entire code context. Also it will write debug logging statements etc to zero in on problems and evaluate the results. It won't necessarily do everything for you but it will help.
But in this case it appears that their test case was looking for vulnerabilities "in various popular open source software" which is a whole different thing and presumably harder.
Re: (Score:2)
PEBKAC.
Will wait and see (Score:2)
... what actual open source maintainers think about this.
So far they've not been very positive about AI generated reports, like this scathing post from the curl maintainer: https://daniel.haxx.se/blog/20... [daniel.haxx.se]
Re: (Score:2)
He also got hundreds of bug reports from people writing "file:// is able to read local files", hoping to win some bounties.
The whole issue could be resolved by charging a small fee to submit a bounty claim, refunded when the bounty is paid out. I wonder how many of the people who submitted these reports knew anything about code, and how many just thought ChatGPT could get them bounties.
The AI world is a study in contrast (Score:2)
Scientists and engineers are using the tools to do useful and interesting things
The general public is demonstrating how stoopid they are by misusing the tools to do stoopid stuff
Re:The AI world is a study in contrast (Score:5, Interesting)
The world works by having each generation forget the past, and think they are clever reinventing things for the first time in a new way. *Actual* progress almost never comes close to the hype.
Re: (Score:3)
The actual *excess* return on investment, once you remove what previous solutions achieved in the same space, is rather unimpressive and close to zero.
And that is just it: where AI works well, it does not do a lot. Where it works badly, it decreases productivity and creates risks. That is not a professional-level tool.
The world works by having each generation forget the past, and think they are clever reinventing things for the first time in a new way. *Actual* progress almost never comes close to the hype.
Indeed. And never for AI hypes. This must be number 5 or so.
Re: (Score:2)
In computer science there is a common observation that what can be done with AI (and here that actually means CAN, not merely someone claiming it can) is usually orders of magnitude more efficient than without.
Try to find an algorithm for OCR without AI. There were quite a few. None is as good as the AI ones. And all of them are slower. As soon as you can formulate something as a neural-network problem, you benefit from the huge parallelism achieved for matrix-vector math on GPUs, such that no sequential (or limited-parallel) algorithm can keep up.
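The parallelism point can be sketched in plain Python (toy sizes and made-up weights, purely illustrative, not real OCR code). The core of a neural-network layer is a matrix-vector product, and each output element is an independent dot product, which is why the work maps so well onto GPU hardware:

```python
def matvec(W, x):
    """Multiply matrix W (given as a list of rows) by vector x.
    Each row's dot product is independent of the others, so on a GPU
    all of them can be computed in parallel."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

# A tiny 3-class "classifier" over a 4-element input vector.
W = [[0.5, -0.2, 0.1, 0.0],
     [0.0,  0.3, 0.0, 0.4],
     [0.1,  0.1, 0.1, 0.1]]
x = [1.0, 2.0, 3.0, 4.0]

logits = matvec(W, x)
prediction = max(range(len(logits)), key=lambda i: logits[i])
print(logits, prediction)
```

A real OCR network just stacks many such layers (with nonlinearities between them) over far larger matrices; the per-row independence is the same, which is what the comment means by benefiting from GPU matrix-vector math.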
That is not a lot (Score:2)
And supposedly it also produced a ton of false positives and missed a lot of vulnerabilities. Sounds useless to me.
Not just white hats do this (Score:1)