Future Hack: New Cybersecurity Tool Predicts Breaches Before They Happen
An anonymous reader writes: A new research paper (PDF) outlines security software that scans and scrapes web sites (past and present) to identify patterns leading up to a security breach. It then accurately predicts what websites will be hacked in the future. The tool has an accuracy of up to 66%. Quoting: "The algorithm is designed to automatically detect whether a Web server is likely to become malicious in the future by analyzing a wide array of the site's characteristics: For example, what software does the server run? What keywords are present? How are the Web pages structured? If your website has a whole lot in common with another website that ended up hacked, the classifier will predict a gloomy future. The classifier itself always updates and evolves, the researchers wrote. It can 'quickly adapt to emerging threats.'"
Re: (Score:2)
True - and how is it that they say they're not counting vulns when that is precisely what they're doing (albeit counting past vulns and extrapolating...)
Nothing New Here (Score:1)
Precrime Division has had it for years.
Isn't the correct answer: (Score:2)
Given enough time all of the sites on the Internet will eventually be hacked?
Re: (Score:2)
A large percentage of attacks are performed by automated tools searching for targets. They don't give a shit whether the site is of huge interest or your Granny's blog talking about how cute her poodle is. Check your logs: even your home computers will be receiving regular port scans and knocks on various ports/protocols to see if there is anything to attack.
Re: (Score:2)
Exception: ;)
My ancient and long-dead first domain/site ever never got hacked, and it never will: I shuttered it in 2001 (-ish) when I sold the domain name (spark.org).
Re:Isn't the correct answer: (Score:4, Insightful)
The premise was "given enough time...".
By taking the site down, you limited the time.
That's not an "exception", that's violating the premise.
Mostly Wordpress, then. 50% accurate: all sites (Score:5, Informative)
I see that of the top "features" they identified, most are just various tags that mean Wordpress is in use. So they learned that Wordpress sites tend to get hacked. Duh. The Wordpress team isn't interested in security. I demonstrated an exploit for a serious vulnerability in Wordpress and submitted it to their bug tracker. For two years it sat, with one WP developer saying "it can't be exploited" - even though I attached an exploit directly to the tracker issue. Two years later, the vulnerability was added to a 'sploit kit and thousands of sites were compromised over the course of just a few days. That's when WP finally got around to patching the clear and significant vulnerability.
I see TFA claims "66% accuracy". "All sites will be hacked at some point" is about 50% accurate. I bet we could have 66% accuracy simply by saying "sites running PHP 5.2 or below will be hacked."
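To make the baseline point concrete, here is a minimal sketch, assuming "accuracy" simply means the fraction of sites whose outcome was called correctly (the paper may define it differently). The class name, method name, and outcomes are all made up for illustration; nothing here comes from TFA.

```java
// Hypothetical sketch: the outcomes below are invented, not data from the paper.
final class BaselineSketch {
    // Accuracy of a predictor that answers "will be hacked" for every site.
    // It is right exactly as often as sites actually get hacked.
    static double alwaysTrueAccuracy(boolean[] hacked) {
        int correct = 0;
        for (boolean h : hacked) {
            if (h) {
                correct++;
            }
        }
        return (double) correct / hacked.length;
    }

    public static void main(String[] args) {
        boolean[] hacked = {true, false, true, false}; // made-up outcomes
        System.out.println(alwaysTrueAccuracy(hacked)); // prints 0.5
    }
}
```

So a constant guess scores whatever the base rate is, and any reported accuracy only means something relative to that number.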
It's a confidence score. Normal for binary decisio (Score:2)
The "inferred third value" is almost certainly the probability/score/confidence level, and it's normally included for machine-learning or any classifier algorithm, such as one that makes a yes/no decision based on a numeric value within a range. You'll see it a lot with spam filters. It's required because the USER chooses at which threshold they wish to take certain actions.
I'm going to use the spam filter example because that's one many people are familiar with, specifically Spamassassin. It will score a m
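The score-plus-threshold idea can be sketched roughly like this. It is only an illustration under my own assumptions: the scores are invented, the 0-to-1 range is a guess, and `ThresholdSketch`, `flag`, and `countFlagged` are hypothetical names, not anything from the paper or from Spamassassin.

```java
import java.util.List;

// Hypothetical sketch: names and scores are made up for illustration.
final class ThresholdSketch {
    // Turn a classifier's confidence score into a yes/no call.
    // The threshold is the user's choice, not the classifier's.
    static boolean flag(double score, double threshold) {
        return score >= threshold;
    }

    // Count how many scores a given threshold would flag.
    static long countFlagged(List<Double> scores, double threshold) {
        return scores.stream().filter(s -> flag(s, threshold)).count();
    }

    public static void main(String[] args) {
        List<Double> scores = List.of(0.15, 0.40, 0.62, 0.91); // made-up scores
        System.out.println(countFlagged(scores, 0.5)); // prints 2
        System.out.println(countFlagged(scores, 0.9)); // prints 1
    }
}
```

Same scores, different threshold, different number of sites flagged. That trade-off is why the score is handed to the user instead of a bare yes/no.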
16% Improvement! (Score:3)
That's like a 16% improvement over the quarter I flip...
66%? Worthless trash... (Score:4, Interesting)
I can predict for most sites that they will be hacked eventually, because they do not have anything resembling a secure set-up. But predicting when? That is impossible. Likely this tool gets even its pathetic 66% only due to cherry-picked test data (also known as "lying" in scientific circles).
Re: (Score:2)
My algorithm does better than 66% and I'm open sourcing it right here...
(Predicts whether site will be hacked between now and the destruction of earth)
public boolean willSiteBeHacked(Vector whateverYouFeelLike) {
    return true;
}
You can't disprove my claim.
Re: (Score:2)
I'm pretty sure your algorithm would be worse than 50%. It basically amounts to "which event comes first: A) the site gets hacked, or B) the site gets taken down?"
I think more sites get taken down every day than get hacked.
... accurately predicts .. (Score:2)
66% = "could happen."
Runs PHP? (Score:2)
Results? (Score:2)
What a coincidence. (Score:2)
Re: (Score:2)
I'm glad one of my side jobs is setting up IPS / IDP and similar security on firewalls. I'll never be thirsting for work.
In totally unrelated news (Score:2)
The tool has an accuracy of up to 66% (Score:2)