Over 40% of New Mechanical Turk Jobs Involve Spam

Posted by Soulskill
from the lower-than-expected dept.
An anonymous reader writes "An NYU study reveals that over 40% of the jobs posted by new employers on MTurk are some sort of spam request, such as fake account creation, fraudulent ad clicks, or fake comments, tweets, likes and votes. The study also shows that the bad jobs could be automatically filtered with 95% accuracy, but Amazon is not interested."

  • I guess you really can't build a robot shill.

    • by icebike (68054) on Friday December 17, 2010 @08:01PM (#34595002)

      The surprise is that anyone noticed all these HIT requests.

      Who, other than the utterly unemployable, has time to take on meaningless tasks dished out by machine for pennies? You can find more money lying on the ground in a parking lot.

      A casual perusal didn't find one task I would do for fun or profit.

      • by Sparr0 (451780)

        Consider an HIT that is worth one cent every ten seconds. To an American, $3.60/hr sounds appalling. To someone in a third world country where $3.60 will buy a week's worth of food and $20 is rent for a month, that's a hell of a good job.
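
        As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `hourly_rate_dollars` and its parameters are illustrative only and not part of any MTurk API.

```python
def hourly_rate_dollars(reward_cents, seconds_per_hit):
    """Convert a per-HIT reward into an effective hourly wage in dollars.

    Assumes the worker completes HITs back to back with no idle time
    and that every HIT is approved and paid.
    """
    hits_per_hour = 3600 / seconds_per_hit
    return reward_cents * hits_per_hour / 100


# The comment's example: one cent every ten seconds.
rate = hourly_rate_dollars(reward_cents=1, seconds_per_hit=10)
print(f"${rate:.2f}/hr")
```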

        • by jago25_98 (566531)

          We can use this to our benefit: Write captcha questions that only your target audience can pass.

          But (and this is crucial), don't call it a captcha. Call it an admissions proficiency test.

            India & Nigeria are English-speaking, but many other places will have difficulty answering questions involving spelling.

      • Welcome to the global economy. The $3.60/hour that you can make from these jobs is a bit more than a lot of outsourced workers are getting and is for unskilled work.
  • by whathappenedtomonday (581634) on Friday December 17, 2010 @05:37PM (#34593046) Journal
    Because Amazon only cares about ToS [amazon.com], and about nothing else.

    "We look forward to continuing to serve our AWS customers and are excited about several new things we have coming your way in the next few months."

    Well, I'm looking forward to you confirming the deletion of my account I requested a week ago. And that 2nd part sounds like a threat.

  • Hmmm (Score:4, Interesting)

    by sexconker (1179573) on Friday December 17, 2010 @05:37PM (#34593054)

    So when 40% of their MT service usage is contrary to the ToS, everything's fine and dandy.

    But when Wikileaks is in full compliance with the ToS of their EC2 service, they get the boot?

    • Precisely, because that's what's bringing the money in I would presume.
    • Re:Hmmm (Score:4, Funny)

      by forkfail (228161) on Friday December 17, 2010 @06:25PM (#34593684)

      So, obviously, Wikileaks should have hired people at 0.0001 cents per word to type in the leaked documents.

    • by Anonymous Coward

      They explained that Wikileaks does not own the content that they were distributing and as such were clearly violating ToS. Call it what you want, but don't call it "full compliance".

    • Seriously? Do you actually think that anyone plays by their own ToS? Hence the reason pretty much all ToS have a "We have the ultimate say and there is nothing you can do about it but sulk and try again" clause.

      ToS are jokes and exist to only limit what you, the user, can do (as opposed to limit what they, the provider, can do.)

  • MTurk (Score:5, Informative)

    by fiannaFailMan (702447) on Friday December 17, 2010 @05:38PM (#34593074) Journal

    I had to look this up.

    Amazon Mechanical Turk (beta)

    Amazon Mechanical Turk is a marketplace for work that requires human intelligence. The Mechanical Turk web service enables companies to programmatically access this marketplace and a diverse, on-demand workforce. Developers can leverage this service to build human intelligence directly into their applications.

    While computing technology continues to improve, there are still many things that human beings can do much more effectively than computers, such as identifying objects in a photo or video, performing data de-duplication, transcribing audio recordings or researching data details. Traditionally, tasks like this have been accomplished by hiring a large temporary workforce (which is time consuming, expensive and difficult to scale) or have gone undone.

    Mechanical Turk aims to make accessing human intelligence simple, scalable, and cost-effective. Businesses or developers needing tasks done (called Human Intelligence Tasks or “HITs”) can use the robust Mechanical Turk APIs to access thousands of high quality, low cost, global, on-demand workers—and then programmatically integrate the results of that work directly into their business processes and systems. Mechanical Turk enables developers and businesses to achieve their goals more quickly and at a lower cost than was previously possible.

    • by pudding7 (584715)
      Ditto that. My first thought on reading the headline was "What the fuck is Mechanical Turk?"
      • Re: (Score:2, Funny)

        by Anonymous Coward

        A little more informative.

        http://en.wikipedia.org/wiki/The_Turk

        The Turk, the Mechanical Turk or Automaton Chess Player was a fake chess-playing machine constructed in the late 18th century. From 1770 until its destruction by fire in 1854, it was exhibited by various owners as an automaton, though it was exposed in the early 1820s as an elaborate hoax.[1] Constructed and unveiled in 1770 by Wolfgang von Kempelen (1734–1804) to impress the Empress Maria Theresa, the mechanism appeared to be able to play a strong game of chess against a human opponent, as well as perform the knight's tour, a puzzle that requires the player to move a knight to occupy every square of a chessboard exactly once.

        The Turk was in fact a mechanical illusion that allowed a human chess master hiding inside to operate the machine. With a skilled operator, the Turk won most of the games played during its demonstrations around Europe and the Americas for nearly 84 years, playing and defeating many challengers including statesmen such as Napoleon Bonaparte and Benjamin Franklin. Although many had suspected the hidden human operator, the hoax was initially revealed only in the 1820s by the Londoner Robert Willis.[2] The operator(s) within the mechanism during Kempelen's original tour remains a mystery. When the device was later purchased and exhibited by Johann Nepomuk Mälzel, the chess masters who secretly operated it included Johann Allgaier, Boncourt, Aaron Alexandre, William Lewis, Jacques Mouret, and William Schlumberger.

        Figures the original was a scam.

      • by Pharmboy (216950)

        "What the fuck is Mechanical Turk?"

        Isn't that like an Islamic version of the "Six Million Dollar Man"? [imdb.com] Some things get lost in the translation.

        • *rolls eyes and sighs*

          I wish my fellow Americans would just learn to shut up when they have nothing intelligent to say in an international forum. Terribly sorry about that. In hopes of redeeming some measure of dignity to the US...

          Turkey has historically been a secular country. Sadly, they're falling for the same nonsense the US is and are sliding backwards towards the establishment of a theocratic government.

          • by Pharmboy (216950)

            No apology needed, unless you were assuming I was saying something bad about either Turkey or Islam, in which case an apology would be appropriate, but from you instead, for putting words in my mouth. More likely, you simply need to grow up. There was absolutely nothing in my comment that was insulting to either Turks or Muslims, and quite frankly, as an open "international" forum, anyone is allowed to express any opinion without your apologies or input. This isn't your house, it isn't your place to apol

            • Calling Turkey "islamic" is in fact offensive, since Islam does not define that country, just as "Anglican" does not define the UK.

              That you do not know this and make awful jokes reflects poorly on me, thus the need to apologize for my failing to ensure you don't run around the world being a jackass. It doesn't have to be my house; an idiot child brought to a restaurant demands that apologies be made to the other guests there as much as to the proprietor.

              When you can comprehend these points, you may apologi

              • by Legion303 (97901)

                Or you could roll your eyes, groan at the bad joke and move on to the next comment, like the rest of us without sticks in our asses.

      • by nickb64 (1885128)
        I take it you haven't read Doctorow's For The Win, then? If not, then I definitely suggest it.
    • So, given that there's apparently no oversight and no interest by amazon in ensuring a quality service, why would anyone want to enter that site? It sounds like schmuckbait to me.
  • Filtering (Score:5, Funny)

    by AndrewNeo (979708) on Friday December 17, 2010 @05:45PM (#34593168) Homepage

    So, would the filtering of bad services from MTurk be performed using MTurk?

    • by YesIAmAScript (886271) on Friday December 17, 2010 @06:22PM (#34593652)

      I find it interesting that the people placing the HITs have to decide whether the work done is good quality and then decide to pay or not. So that means for each tiny job you farm out, you have to do your own tiny bit of make-work to decide whether to pay or not. Can you farm this out on the turk too? If not, maybe there's a market for a service that lets you do so...

      • I did the turk thing for a couple days, and you nailed it. Sometimes a job would pop up reviewing someone else's work. I quit, though: you're offered a pittance for tasks that take time, and then the requester can take the work and not pay (you have no control over that) or sit on it for a few weeks. It's crap.
        • by Lehk228 (705449)
          seems like the solution to someone not paying you would be to track down all their listings, accept them and fill them with garbage constantly
      • There are lots of tasks where verifying the work is easier than doing it. This is the entire basis of public key cryptography and encompasses entire classes of algorithm beyond it. Often, it's easy to automate the verification, but not the work. For example, it takes a human to enter a captcha, but a machine can verify whether the value that the human gave is accepted by the system.
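
        The asymmetry described above can be sketched with a toy example: finding the factors of a number takes a search, while checking a proposed answer is a single multiplication. This is only an illustration of "verifying is easier than doing", not real cryptography; the function names are made up for this sketch.

```python
def factor_by_trial_division(n):
    """Do the work: search for a nontrivial factor pair of n.

    Runs in time proportional to sqrt(n), which is hopeless for the
    number sizes used in real cryptography.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None  # no nontrivial factors found


def verify_factors(n, p, q):
    """Check the work: a single multiplication and two comparisons."""
    return p > 1 and q > 1 and p * q == n


n = 10007 * 10009
p, q = factor_by_trial_division(n)  # thousands of trial divisions
print(verify_factors(n, p, q))      # one multiplication
```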
  • Profit (Score:5, Insightful)

    by The Raven (30575) on Friday December 17, 2010 @05:47PM (#34593184) Homepage

    Same reason the USPS likes bulk mailers... they keep the operation afloat. Especially as more and more people turn to email.

  • I know a few research scientists who use the Turk for some awesome ideas (it's a LOT cheaper than in-person human subjects and the people you get aren't homeless, drunks, or freshman psych students fulfilling requirements). However, there is little money in (non-military) basic research at the moment, and only a fraction of that even requires human subjects.

    The rest is merely a new breed of on-demand advertising and promotion. Amazon is still getting paid, so they likely don't care. I'd argue that if t

  • by Anonymous Coward
    Did anyone else notice that the summary says 95% accuracy but doesn't break it down to False Accept and False Reject?

    Not to mention, spammers adapt. That's the main problem with them.
    • by Lehk228 (705449)
      it's also inaccurate, more like 90% spam and garbage on mturk, it's a real shit hole.
  • by gringer (252588) on Friday December 17, 2010 @06:05PM (#34593402)

    "Accuracy" is a difficult measure to quantify. I see from reading the article that the accuracy has been estimated at 95% due to a 95% true positive rate and a 95% true negative rate. Given that the current spam rate is 40%, these rates aren't particularly bad, but Amazon would still have quite a few problems with angry customers. Assuming 1500 HITs per day, with 60% of those being non-spam submissions, 45 per day (5% of the 900 legitimate ones) would be falsely flagged as spam.

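    The breakdown in the comment above can be reproduced with a short Python sketch. The 1500 HITs per day figure is the commenter's assumption, the 40% spam rate and 95% rates come from the summary and comment, and the function name is hypothetical.

```python
def daily_filter_outcomes(hits_per_day, spam_rate,
                          true_positive_rate, true_negative_rate):
    """Split a day's HITs into the four confusion-matrix cells."""
    spam = hits_per_day * spam_rate
    legit = hits_per_day - spam
    return {
        "spam_caught": spam * true_positive_rate,
        "spam_missed": spam * (1 - true_positive_rate),
        "legit_passed": legit * true_negative_rate,
        "legit_flagged": legit * (1 - true_negative_rate),
    }


outcomes = daily_filter_outcomes(1500, 0.40, 0.95, 0.95)
for name, count in outcomes.items():
    print(f"{name}: {count:.0f}")

# Of everything the filter flags, what share is really spam?
flagged = outcomes["spam_caught"] + outcomes["legit_flagged"]
print(f"precision: {outcomes['spam_caught'] / flagged:.3f}")
```

    Under these assumptions the filter would wrongly flag 45 legitimate HITs a day, matching the figure above, while still letting about 30 spam HITs through.
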
  • This just got me thinking. Could the service be used to game the App Store? Currently, there are several companies offering to get any free App into the top 25 list for $5000. It's widely believed that they use bots to do it, but it could just as easily be mechanical turks.

  • by Animats (122034) on Friday December 17, 2010 @06:15PM (#34593556) Homepage

    That data is from two months back, before Google Places appeared in web search. Now, it's worse. There's a whole mini-industry in the "black hat" search engine "optimization" community creating phony Google Places entries. Here's an ad on Mechanical Turk today [mturk.com]:

    Reno Gym - Google Maps Promotion (Client QMDHKOB)
    Requester: Smartsheet.com Clients
    HIT Expiration Date: Dec 18, 2010 (10 hours 52 minutes) Time Allotted: 60 minutes
    Reward: $0.25 HITs Available: 2
    Description:

    • Follow Instructions on PDF attached for BUSINESS ADDRESS (1)
    • Repeat Instructions on page 5 to 14 for BUSINESS ADDRESS (2) and (3) below.
      GMAIL ADDRESS: [Create a new Gmail Account] PASSWORD:
      BUSINESS ADDRESSES:
      • (1) 6370 Mae Anne Avenue, Reno, NV 89523
      • (2) 4784 Caughlin Parkway, Reno, NV 89519
      • (3) 18603 Wedge Parkway, Reno, NV 89511

      BUSINESS TITLE AND FULL ADDRESSES:

      • (1) Anytime Fitness 6370 Mae Anne Avenue, Reno, NV 89523 (775) 746-8400
      • (2) Anytime Fitness 4784 Caughlin Parkway, Reno, NV 89519 (775) 622-8034
      • (3) Anytime Fitness 18603 Wedge Parkway, Reno, NV 89511 (775) 852-7007

      WEBSITE URL: http://renogyms.org/ [renogyms.org]
      GOOGLE PLACES URLs:

    Keywords: Smartsheet, Reno, Gym, Google, Maps, Promotion, QMDHKOB

    Google Places spamming hasn't been fully automated yet, so we get to watch spammers outsource their manual spamming. Spamming Google Places is incredibly easy, much easier than creating the link farms required to spam Google's old web search. See the instructions in "Dominating Google Maps- The Most Effective Spam Ever And What You Can Learn From It" [convertoffline.com].

    Google Places has been 0wned.

    • by vlueboy (1799360)

      I'm surprised you weren't modded up.
      I'm also surprised at how low the wages are at this turk thing, when a 15+ page script needs to be read just to get started. In the US, centuries of constant demolition and rebuilding mean that house numbers easily jump from "1" to "21" when maybe 10 houses on the same block no longer need individual numbers after the block turns into a single vacant lot.

      Marketing fake numerical addresses in between legit ones ensures that Google Pagerank rates your "unique" business as

      • by Animats (122034) on Friday December 17, 2010 @08:10PM (#34595106) Homepage

        I'm also surprised at how low the wages are at this Turk thing. ... I thought spammers had to at least sweat through that manual task by themselves.

        It's like $0.25 per human-generated spam. Automation seems to be coming. I'm seeing mentions on black hat SEO forums that an automated tool for doing this in bulk will be released early next month.

        Marketing fake numerical addresses in between legit ones ensures that Google Pagerank rates your "unique" business as #1...

        Sometimes. That technique is mostly used to give real businesses extra bogus locations. Check out "New York City locksmith", for example. Other heavily spammed terms are "carpet cleaning" and "divorce lawyer".

        This week's new technique is described at "How To Spam Google Maps For Top Google Place Listings" [seroundtable.com]. This is like SQL injection for mailing addresses. The trick depends on Google's parsing of mailing addresses from the top, while USPS standards say they should be parsed from the bottom line upward. So a mailing address with two street addresses is parsed differently by the USPS and Google, allowing the spammer to redirect Google's confirmation postcard to some mail drop.
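
        The parsing disagreement described above can be illustrated with a toy Python sketch. It does not reflect how Google or the USPS actually parse addresses; the regex heuristic, function names, and the placeholder address are all assumptions made for illustration.

```python
import re

# Hypothetical heuristic: a street line starts with a house number.
STREET_LINE = re.compile(r"^\d+\s+\S+")


def street_line_top_down(lines):
    """Return the first line that looks like a street address."""
    for line in lines:
        if STREET_LINE.match(line):
            return line
    return None


def street_line_bottom_up(lines):
    """Return the last line that looks like a street address,
    i.e. the one closest to the city/state/ZIP line."""
    for line in reversed(lines):
        if STREET_LINE.match(line):
            return line
    return None


# Placeholder address with two street lines.
address = [
    "Example Business",
    "100 First Street",
    "200 Second Avenue",
    "Anytown, NV 00000",
]

print("top-down: ", street_line_top_down(address))
print("bottom-up:", street_line_bottom_up(address))
```

        The two readers pick different street lines from the same address block, which is the kind of mismatch the comment describes.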

        Google seems to be out to lunch in this area. The same exploits have been working for months. Yet Google doesn't list any such issues under "Known Issues" [google.com]. Over on Matt Cutts' blog [mattcutts.com], where you'd expect to see some discussion of this, he reports that he's writing a novel.

        It's even worse at Bing. Bing emulated Google's October 27th merger of Places into web search within a few days. But they weren't ready. Look up "New York City locksmith" in Bing, and the five "Places" entries are all the same business.

        • by Lehk228 (705449)
          well it would certainly be interesting if google sits on this then drops the hammer on everyone that did it,
    • I hope the folks at Google start trolling the same MTurk job listings to mark down location spam for what it is...

  • Only 40%? (Score:3, Interesting)

    by D J Horn (1561451) on Friday December 17, 2010 @06:22PM (#34593656)

    From my time exploring mturk I would have guessed it to be much higher than that; non-spam jobs were definitely the minority of what I saw.

    The creepiest (and highest paying) job I saw though involved watching surveillance footage from airports, making sure the automated face tracker stayed on target...

  • This is a fake comment. A real one would have looked different.
