Security Social Networks

Machine Learning Susses Out Social-Network Fraud

CowboyRobot writes "Machine learning techniques can be used to detect fraud and spies on social networks based on certain features, such as the number of followers and the number of devices used to access the network. Certain characteristics of social-network accounts have a high correlation with fraud and can be used to differentiate between real and fake accounts, a researcher presenting at the SOURCE Boston Conference said this week. Using machine learning techniques, Vicente Diaz, a senior security analyst with security software firm Kaspersky Lab, found that seven characteristics of Twitter profiles could identify fraudulent accounts 91% of the time. The number of devices from which a user accesses the service, the ratio of followers to people following an account, the average number of tweets to each person, and the number of tweets to an unknown receiver are all features that correlate strongly to fraudulent accounts, he says."
This discussion has been archived. No new comments can be posted.

  • by s1d3track3D ( 1504503 ) on Monday April 22, 2013 @11:49AM (#43515991)
    In related news, social network machine learning fraud bots get algorithm update based on current fraud detection algorithms.
  • ... If I had a facebook account. Using my Orkut account as an example, the software would find that I only use a single device to access (desktop pc), have few friends (but genuine) and post few reviews and comments (only what I consider important).

    In conclusion, since I do not access Facebook from my watch, do not comment on every single thing I do in my day, and do not have "thousands of followers", I can only be a fraud :-)
    • by Jane Q. Public ( 1010737 ) on Monday April 22, 2013 @01:54PM (#43517181)

      "So I would be a fraud if I had a facebook account."

      Precisely. There are several things wrong with trying to actually use this in the real world.

      (1) 91% is not nearly good enough. Period.

      (2) Even if it were 99.9% accurate, it would still not be good enough, because it runs into the base rate fallacy.

      (3) Similar to, but distinct from, the base rate fallacy: a statistical correlation across datasets of millions says nothing about an individual account.
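
To make the base rate point in (2) concrete, here is a minimal sketch using Bayes' rule. The prevalence values are assumptions chosen for illustration, and it also assumes the quoted accuracy applies equally to fraudulent and genuine accounts; the article gives only the single 91% figure. The function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a flagged account really is fraudulent (Bayes' rule)."""
    # Fraction of all accounts that are fraudulent and correctly flagged.
    true_pos = prevalence * sensitivity
    # Fraction of all accounts that are genuine but wrongly flagged.
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Assumed figures, for illustration only: not taken from the talk.
for prevalence in (0.05, 0.001):
    for accuracy in (0.91, 0.999):
        ppv = positive_predictive_value(prevalence, accuracy, accuracy)
        print(f"prevalence={prevalence:.1%} accuracy={accuracy:.1%} "
              f"-> flagged accounts that are really fraudulent: {ppv:.1%}")
```

The rarer fraud is among all accounts, the larger the share of flagged accounts that turn out to be genuine, even when the detector itself is very accurate.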

  • by sjbe ( 173966 ) on Monday April 22, 2013 @12:22PM (#43516309)

    found that seven characteristics of Twitter profiles could identify fraudulent accounts 91% of the time.

    Taking the 91% number as accurate for argument's sake, what are the false positive and false negative rates? Even a 1% false positive or false negative rate would be quite a lot of accounts when you consider how many millions of twitter accounts there are out there.
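
A rough sketch of the arithmetic behind that question. The total account count and the fraud rate below are assumptions for illustration, not figures from the article, and `expected_errors` is a hypothetical helper.

```python
def expected_errors(total_accounts, fraud_rate, false_positive_rate, false_negative_rate):
    """Expected number of misclassified accounts for the given error rates."""
    fraudulent = total_accounts * fraud_rate
    genuine = total_accounts - fraudulent
    return {
        # Genuine accounts wrongly flagged as fraudulent.
        "false_positives": round(genuine * false_positive_rate),
        # Fraudulent accounts that slip through unflagged.
        "false_negatives": round(fraudulent * false_negative_rate),
    }


# Assumed: 200 million accounts, 5% fraudulent, 1% error rate each way.
errors = expected_errors(200_000_000, 0.05, 0.01, 0.01)
print(errors)
```

Under these assumed numbers, a 1% false positive rate alone would mislabel well over a million genuine accounts.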

  • by number17 ( 952777 ) on Monday April 22, 2013 @12:57PM (#43516659)
    Most of the information I put on my Facebook account is noise. I didn't really attend 10 different universities, I don't speak 15 different languages, and I wasn't born in that other country.

    The only people who care about this are marketers. But even then, does it matter if the account is real or not? I haven't seen any good evidence that social marketing can directly relate to in-store or online purchases. It's all a scam.
    • Yep, you're not in marketing clearly.

      I think you'll find that it matters for businesses that rely on strong ties to their customers. For many businesses, particularly small ones, one-off sales don't cut it, so social networking is an essential tool. It may shock you to hear that social networking is merely the new phrase for "word of mouth" with some extra bells and whistles to help along repeat business (the whole "following" mechanic).

      Not far from where I live is a pie shop called "Piefection" - I thought i

  • The number of devices from which a user accesses the service.

    So does Twitter just publicly disclose a simple device count, or the detailed information on all devices? If the latter, isn't that a whopping security hole to be exploited by people looking for targets with known vulnerable devices?
