Security Advertising Google Privacy The Internet

hCaptcha Runs On 15% Of the Internet (hcaptcha.com) 66

In a blog post, hCaptcha announced that its bot detector is running on about 15% of the internet, adding that they "took most of this market share directly from Google reCAPTCHA." From the post: Competing with Google and other Big Tech companies seems like a tall order: their monopolistic market power, platform effects and army of highly paid developers are generally considered too powerful to tackle for anyone but other tech giants such as Facebook or Amazon. Our story shows that it doesn't have to be that way -- you can beat Big Tech by focussing on privacy. Consider Google reCAPTCHA, which consumes enormous amounts of behavioral data to determine whether web users are legitimate humans or bots. At hCaptcha, we have deliberately taken a very different approach, using privacy-preserving machine learning techniques to identify typical bot behaviors at high accuracy, all while consuming and storing as little data as possible.

Google is an ad company, and their security products look very much like their ad products: they track user behavior on every page of a website and across the web. We designed hCaptcha to be as privacy-friendly as possible from day one. This led to a completely different approach to the problem. As it turns out, tracking users across the web and tying their web history to their identity is completely unnecessary for achieving good security. The many companies that have switched over to hCaptcha often report equal or better performance in bot detection and mitigation despite our privacy focus.

A growing number of critics have pointed out that Google's disregard for user privacy should concern customers looking to protect their websites and apps. At the same time, stopping bots from accessing publisher sites can reveal ad fraud, pitting Google's reCAPTCHA product directly against their ad business, which produces over 80% of their revenue. Every bot Google detects should be earning zero ad dollars. Google's company incentives are thus poorly aligned with the users of their security services, and this may be one explanation for the poor performance of their reCAPTCHA security offering.

This discussion has been archived. No new comments can be posted.

  • How much of that 15% is from Cloudflare's usage of hCaptcha?
    • by AmiMoJo ( 196126 )

      Cloudflare needs to sort their shit out. Half the sites using their "protection" systems are inaccessible with privacy enhancements enabled, like blocking canvas fingerprinting.

      • Re:Cloudflare (Score:5, Insightful)

        by LenKagetsu ( 6196102 ) on Thursday November 26, 2020 @06:43AM (#60767570)

        NoScript used to be a fantastic defense that caused minimal fuss, now every cunt wants to have fancy dynamic PHP Javascript memetic bullshit to do things as simple as rendering an image or drop-down menus that are done faster and easier in HTML, and that's not even getting into the nonsensical bullshit that is spreading your site across several domains.

        Modern internet fucking sucks.

      • by fred911 ( 83970 )

        ''Half the sites using their "protection" systems ''

        The primary reason Cloudflare is used is to deliver the clients' content in a balanced and distributed manner so the clients' server will never be directly hammered, and to provide resources to the clients' clients [sorry couldn't figure out better verbiage] so that all resource requests are of an equally high QOS.

        The fuckery comes when Cloudflare sees an IP that was wasting resources, sent questionable packets or just wasn't cost effective to serve.

      • CloudFlare lives on tracking the shit out of you just as much as Google does. They have zero interest in letting you through if you protect your privacy. In fact, I'd argue that their entire DDOS protection offering is just an excuse to force tracking on their customers' visitors.

        • Re:Cloudflare (Score:4, Informative)

          by ftobin ( 48814 ) on Thursday November 26, 2020 @02:01PM (#60768340) Homepage

          Cloudflare developed Privacy Pass, which is a darn well-designed, privacy-protecting means of having pre-prepared tokens for getting through captchas. If you read the design document it's pretty amazing.

          The overview:

          When an internet challenge is solved correctly by a user, Privacy Pass will generate a number of random nonces that will be used as tokens. These tokens will be cryptographically blinded and then sent to the challenge provider. If the solution is valid, the provider will sign the blinded tokens and return them to the client. Privacy Pass will unblind the tokens and store them for future use.

          Privacy Pass will detect when an internet challenge is required in the future for the same provider. In these cases, an unblinded, signed token will be embedded into a privacy pass that will be sent to the challenge provider. The provider will verify the signature on the unblinded token, if this check passes the challenge will not be invoked.

          This protocol allows a client to bypass a number of internet challenges proportional to the number of tokens that are signed. The blinding feature used in the signing process preserves the anonymity of the user involved by randomising the tokens that are signed, rendering them unlinkable from the tokens that are redeemed.

          Cryptographically speaking, every time the Privacy Pass plugin needs a new set of privacy passes, it creates a set of thirty random numbers t1 to t30, hashes them into an elliptic curve (P-256 in our case), blinds them with a value b and sends them along with a challenge solution. The server returns the set of points multiplied by its private key and a batch discrete logarithm equivalence proof. Each pair (ti, HMACi(M)) constitutes a Privacy Pass and can be redeemed to solve a subsequent challenge. Voila!
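For anyone who wants to see the blind/sign/unblind/redeem flow from the overview above in something runnable, here is a minimal sketch. It is not the Privacy Pass implementation: the quoted description uses points on the P-256 elliptic curve, while this sketch swaps in modular exponentiation over a tiny, insecure toy prime so that it fits in the standard library. It also leaves out the batch discrete logarithm equivalence proof and any double-spend tracking. All names and parameters (`P`, `Q`, `hash_to_group`, `client_blind`, and so on) are assumptions made up for illustration.

```python
import hashlib
import hmac
import secrets

# ASSUMPTION: toy parameters, far too small to be secure.
# P is a safe prime (P = 2*Q + 1); the real protocol uses the P-256 curve.
P = 2039
Q = 1019
POINT_BYTES = (P.bit_length() + 7) // 8


def hash_to_group(token: bytes) -> int:
    """Map a random token to an element of the order-Q subgroup (the squares mod P)."""
    h = int.from_bytes(hashlib.sha256(token).digest(), "big")
    x = h % (P - 3) + 2          # x in [2, P-2], so x*x is never 0 or 1 mod P
    return pow(x, 2, P)


def client_blind(token: bytes):
    """Client: blind the hashed token with a random exponent b."""
    b = secrets.randbelow(Q - 1) + 1          # b in [1, Q-1]
    return b, pow(hash_to_group(token), b, P)


def server_sign(k: int, blinded: int) -> int:
    """Server: apply its private key k without learning the underlying token."""
    return pow(blinded, k, P)


def client_unblind(b: int, signed: int) -> int:
    """Client: strip the blinding factor, leaving hash_to_group(token)**k."""
    return pow(signed, pow(b, -1, Q), P)


def redemption_tag(shared: int, request: bytes) -> bytes:
    """HMAC over the request, keyed by the unblinded signed value."""
    return hmac.new(shared.to_bytes(POINT_BYTES, "big"), request, hashlib.sha256).digest()


def server_redeem(k: int, token: bytes, request: bytes, tag: bytes) -> bool:
    """Server: recompute the key from the revealed token and check the HMAC."""
    expected = pow(hash_to_group(token), k, P)
    return hmac.compare_digest(redemption_tag(expected, request), tag)
```

The point of the blinding step is that the server only ever sees `hash_to_group(token)**b` at signing time and the bare `token` at redemption time, and without `b` it has no way to tie the two together. A real deployment would also need the server to remember which tokens have already been spent.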

  • I'm pretty sure that's because google doesn't want that traffic anymore and is charging for it / discouraging it now. hCaptcha is still interested in the dirty, dirty, shameful deeds that go along with captcha analytics so they keep that shit free (not so free).
  • IMHO *captcha is better at keeping humans out than bots. I often have problems with the *captcha while bots seem to easily pass them.

  • by Gabest ( 852807 ) on Thursday November 26, 2020 @06:17AM (#60767544)

    Buy Captcha Premium and we select all the crosswalks for you.

    • Better yet: Buy Captcha Premium Adult and you get to select tits!

    • by AmiMoJo ( 196126 )

      There is an extension called PrivacyPass which allows you to skip most of these captchas. You do one and get 30 tokens which mean you can skip the next 30.

      Unfortunately the extension isn't so good. I've seen it cause performance issues and when combined with privacy enhancements you tend to just get a failure loop that uses all your passes up in a few seconds.

  • by Arnonyrnous Covvard ( 7286638 ) on Thursday November 26, 2020 @06:38AM (#60767560)
    With captchas and cookie-acknowledgements, the web has become a place to avoid. It is one big clusterfuck of adversarial design.
    • What did you expect when the web stopped being html and started being actual programming?

      Things were better with flash, because you didn't need flash to "log in" to these otherwise very much non-stateful websites, therefore disabling flash had little consequence.

      Now there isn't anything worthwhile to disable without frequently experiencing side-effects, aside from advertisements.
  • by thegarbz ( 1787294 ) on Thursday November 26, 2020 @07:02AM (#60767584)

    If I'm going to fill out a Captcha I greatly prefer to fill one out for Google. Why? Because it's positive work and not simply a busywork hindrance. Most other Captchas just ask you to fill out something they already know. Google on the other hand uses it to train its AI for stuff that will ultimately benefit us, be that identifying what real-world items look like or scanning books.

    So remember this when you get hit by that self-driving car which ignored a traffic light because it didn't understand what it looked like. I applaud hCaptcha for taking a privacy high ground, but at this point that's closing the barn door after all the animals have bolted, the tractors have been sold, and the wife left you for the milkman.

    • by fred911 ( 83970 )

      ''Google on the other hand uses it to train its AI for stuff that will ultimately benefit us''

      I'm sorry but are you fucking high?

      Google provides high quality services in exchange for permissive use of data in order to provide even higher quality ads to sell to buyers. How many of their services are no longer available, or have been ''migrated'' to services that are better monetized? Google music RIP, Goo.gl RIP and soon to come your free POTS via IP... RIP.

      Whereas I do applaud them for never breaking a TOS

      • by thegarbz ( 1787294 ) on Thursday November 26, 2020 @11:25AM (#60768102)

        When accusing someone of being high it pays to post a coherent reply. What part about Google's reCaptcha being used to train self driving car algorithms or supporting their book scanning services did you not understand? I just checked: yes, their library is still available online. So literally precisely nothing they have done with reCaptcha has been made unavailable.

        You win this week's irrelevant comment on the internet award.

    • That was some heroic Google shilling right there. I applaud your efforts.

      Google and their captchas can fuck right off. All captchas in fact, not just Google's. Captchas have been the bane of the internet since they were invented.

      • by AmiMoJo ( 196126 )

        The real problem is that nobody has come up with a better way of stopping bots. Figure out how to detect bots with equal accuracy and less annoyance.

      • That was some heroic Google shilling right there. I applaud your efforts.

        I'll happily shill for you instead. All you need to do is beat Google's price. Deposit a single cent in my paypal account and you've already beaten them.

        So what part of my post is shilling? The fact that you didn't know reCaptcha was used to improve the quality of Google's book scanning service, or the fact you didn't know reCaptcha is used to train their self driving car image analysis?

        • Tell me how identifying fire hydrants, bus stops or stop lights in photographs helps the quality of book scanning services.

          I'll tell you what it is: it's unpaid labor forced on people who want to visit a website. It doesn't take a genius to figure out that all those human recognition microtasks benefit Google at zero cost to them.

          • by tlhIngan ( 30335 )

            Tell me how identifying fire hydrants, bus stops or stop lights in photographs helps the quality of book scanning services.

            I'll tell you what it is: it's unpaid labor forced on people who want to visit a website. It doesn't take a genius to figure out that all those human recognition microtasks benefit Google at zero cost to them.

            reCAPTCHA ran out of books, actually. So the fire hydrants and such aren't to digitize books, but to train the object recognition system for image processing.

            Of course, it helps tr

          • Tell me how identifying fire hydrants, bus stops or stop lights in photographs helps the quality of book scanning services.

            Not sure if you're going for a funny, have no idea of history or what reCaptcha did before showing you pictures of hydrants, or if you're actually retarded.

            Given your language I'm going to go with three.

      • somehow, I have trouble putting "improving google's ability to recognize and collate information" into the column of "light" rather than "dark" . . .

        next we can have diabolicCAPTCHA--click on the box that you find most tempting . . .

        hawk

    • by jmccue ( 834797 )

      If I'm going to fill out a Captcha I greatly prefer to fill one out for Google

      Google is the only place I have run into these, and it takes me something like 10 tries if I decide to try and get by it. Once in a great while maybe I can deal with it, but it now has become a common occurrence.

      More than anything else, this forced me to duckduckgo, if I start getting them there, off to another search engine. But so far duck has been great

      Google are you listening ?

      • Re:Both good and bad (Score:4, Informative)

        by jenningsthecat ( 1525947 ) on Thursday November 26, 2020 @09:51AM (#60767866)

        Google is the only place I have run into these, and it takes me something like 10 tries if I decide to try and get by it.

        Unfortunately, I run into Google's broken, fucked-up, ass-sniffing captchas when I order parts from Digi-Key. I've tried to get past them, but after 5 minutes or so I give up. I'll be double-goddamned to hell if I'll sign into the Google account that I never use just to make the captchas less painful, because that just further rewards Google's bullshit privacy-raping efforts.

        So instead I enter my order online, stop before I get involved in Google's clusterfuck, place a phone call, and complete my order with a live operator. It takes no more of my time than doing a stint in captcha hell, it's a fuck-you to Google, and it costs Digi-Key money, which I'm happy about because of their choice to allow Google captchas on their site.

        Really, all of Google needs to DIAF, and if I held the button that would make them disappear from the universe I'd keep pressing it until one hand failed and then I'd switch hands and repeat to make damn sure the job got done. We'd all live through the subsequent economic dislocation, and hopefully we'd replace Google with something that would serve the best interests of all of us.

        • Unfortunately, I run into Google's broken, fucked-up, ass-sniffing captchas when I order parts from Digi-Key. I've tried to get past them, but after 5 minutes or so I give up.

          Either you hit some kind of a bug or you need to return your drivers license. I've literally never failed a Google reCaptcha without misclicking due to lack of attention. How hard is it to click on a traffic light? On the other side hCaptcha is a piece of shit.

          Google:
          Please select the pictures with bicycles: Proceeds to show 9 pictures, 6 of which are of scenery and cars and 3 of which are bicycles.
          hCaptcha
          Please select the pictures with bicycles: Proceeds to show 9 pictures, 2-4 of which may be bicycles, 2

      • Google is the only place I have run into these, and it takes me something like 10 tries if I decide to try and get by it.

        That's quite impressive. Google is about the only place I've never run into these. They used to be quite regular on Cloudflare before they rolled their own. Also I'm kinda worried that you don't know what a traffic light or a level crossing looks like, since just clicking on those is all you need to do to pass them.

    • I'm the exact opposite. I'd rather not be charged with labor to train Google's AI to read a website. Let them pay people for that shit. I just don't want to do intern work for a trillion dollar company and get paid in bits..

      • You believe the entire world is a zero sum game. Luckily not everyone believes like you do or we'd still be in the frigging stone age.

        You are doing the work regardless. Why not do work that achieves a secondary purpose with zero additional effort on your behalf?

        • Of course the world isn't zero sum. Win-win situations are great.

          But Google needs that work done. If the choices are "work gets done by people for free on the Internet" or "Google pays people to get it done", the second option leads to some poorer people having a better job and therefore a better life. I can help bring about the second option by declining the first.

          • If the choices are "work gets done by people for free on the Internet" or "Google pays people to get it done", the second option leads to some poorer people having a better job and therefore a better life.

            And you just jumped right back to a zero sum game. Why don't you give the poor people money directly? It's a true win-win: You part with money which you would have parted with anyway (since we all know that corporations are lovely and love eating up costs and not passing them off to consumers), you get the efficiency of solving a Captcha that does something more than just prove you're human, and the poor people are rich.

            Everyone wins. Get started: https://www.foodforthepoor.org... [foodforthepoor.org]

            • Why don't you give the poor people money directly?

              Because that doesn't solve the problem. I want Google to pay the person because they have $10 billion sitting around or whatever. Let's make up numbers, say they have that, and I have $10 thousand sitting around. I want it to be Google: $9,999,999,999, Me: $10,000, Poor Person: $1. You're suggesting it should be Google: $10,000,000,000, Me: $9,999, Poor Person: $1. You don't see the dramatic difference?

              you get the efficiency of solving a Captcha that d

              • Ahhh self entitled wealth redistribution. I like it. The hilarious part is that you think somehow you'll end up with that extra dollar. Boy have you a lot to learn about business.

  • by MrL0G1C ( 867445 ) on Thursday November 26, 2020 @07:06AM (#60767588) Journal

    Google only cares if you're human as much as they have to, but if you block them then they will punish you. It seems that now they have competition they are being less nasty with their captchas.

    Hcaptcha says they can recognise bots with their machine learning, so why is it I get the 2 pages of boats / planes to recognise nearly every time even though I don't block cookies. Seems their machines didn't actually learn much.

    • Hcaptcha says they can recognise bots with their machine learning

      They're lying.

      so why is it I get the 2 pages of boats / planes to recognise nearly every time even though I don't block cookies.

      Because they are lying.

      Seems their machines didn't actually learn much.

      If you believe that "learning" has anything to do with this, you are retarded.

  • Proxy (Score:5, Interesting)

    by markdavis ( 642305 ) on Thursday November 26, 2020 @07:19AM (#60767622)

    >"Consider Google reCAPTCHA, "

    The biggest problem I have with Google reCAPTCHA (other than the data mining and handing ever more power to Google) is that they don't use a unique domain for it. It is just google.com. This makes it incompatible with lots of methods of restricting users to whitelists. When you can't allow a user to go to google.com, you can't allow google.com/recaptcha (squid/squidguard has this issue, for example). So this means users can visit any approved site, if it uses google.com/recaptcha, they are blocked from the captcha and can't log in or continue. It should be something like recaptcha.google.com.

    Why Google does it this way is a mystery to me, since almost all of their other services do it the "right" way- mail.google.com, maps.google.com, news.google.com, play.google.com, contacts.google.com, drive.google.com, photos.google.com, duo.google.com, myaccount.google.com, etc.
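To make the whitelist problem above concrete, here is a rough sketch of hostname-based filtering. It is not how squid or squidGuard is implemented; it only illustrates the general constraint that a filter which decides on hostname alone (as a proxy typically must for HTTPS, unless it intercepts TLS) cannot tell google.com/recaptcha apart from the rest of google.com. The names `host_allowed`, `request_allowed` and `ALLOWLIST`, the host `www.google.com`, and the hypothetical `recaptcha.google.com` subdomain are assumptions for illustration.

```python
def host_allowed(host: str, allowlist: set[str]) -> bool:
    """True if host is an allowlisted domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowlist)


# Hypothetical policy: Gmail is fine, general Google browsing is not.
ALLOWLIST = {"mail.google.com", "example.org"}


def request_allowed(url_host: str, url_path: str, allowlist: set[str] = ALLOWLIST) -> bool:
    # The path is deliberately ignored: the decision is made on the
    # hostname only, which is the situation described in the comment above.
    return host_allowed(url_host, allowlist)
```

With this policy, `request_allowed("www.google.com", "/recaptcha/api.js")` is refused along with every other www.google.com URL, so the captcha on an approved site breaks. If the captcha lived on its own subdomain, adding that one name to the allowlist would be enough, and search would stay blocked.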

    • Re: (Score:2, Insightful)

      What's the mystery? They know you can't block their captcha if you want the web to keep working for your users, so you can't block their domain. They use their captcha users as shields for their domain.
    • Even if they used recaptcha.google.com, what good would it do? Why would you want to block people from accessing websites that are using reCAPTCHA?

      • He doesn't want to block people from accessing websites that are using reCAPTCHA, that's the point. He wants to block them from google.com, while NOT blocking websites just because they use reCAPTCHA.

        • Oh, right. /facepalm

        • >"He doesn't want to block people from accessing websites that are using reCAPTCHA, that's the point. He wants to block them from google.com, while NOT blocking websites just because they use reCAPTCHA."

          Correct. There are situations where I (we) need to block people from general/uncontrolled browsing and restrict to a generous whitelist. We even allow mail.google.com and certain other google services, but not generalized search. Because it is not a separate domain or sub-domain, we cannot whitelist Goo

  • When it goes into 3x3 images mode the thumbnail images are so darn small that I cannot correctly identify what is in half the images, let alone whether they are a bike, boat, etc. I get told verification failed a lot. Why? I suspect it's because I don't have a large high resolution screen but rather a 35.6 cm (14.0 in) laptop LCD.
    • My screen is 22" and they look uncomfortably small, too. So no, it's not your screen. (And anyway, building a responsive/scalable website is not rocket science, so that would not be an excuse for them.)
  • Yeah, right. I don't believe it's 15% of the web, either. Though there is the use of a lower case 'i', so maybe they mean 15% of whatever network they are calling "the internet". Even then, I don't believe it's that much.

    • You're more than four years late. [nytimes.com]

      • "Others are doing it, so we think we should, too."

        The world is littered with inaccuracies that are accepted as true. Hence the appeal of QI in the UK, having 18 seasons of shows doing so so far.

        What's even better, they've had to correct themselves as they have subsequently discovered their own inaccuracies.

        A style guide is rarely a good source for fact. Plus, this is NYT we're talking about...

  • by DontBeAMoran ( 4843879 ) on Thursday November 26, 2020 @10:34AM (#60767964)

    They used to have a button to report a problem with a reCAPTCHA but they removed it for some insane reason.

    And now, thanks to idiots who don't know the differences, the system has accepted wrong answers. What fucking idiots think a big-ass RV is a bus?

    What scares me is that Google is probably using reCAPTCHA answers to train some fucking A.I. to recognize objects in the real world.

    • by pt73 ( 2506856 )
      Actually reCAPTCHAs are presented internationally and not every country will call an RV an "RV" (try motorhome). Some places don't use the word sidewalk (footpath) or crosswalk (pedestrian crossing). Traffic lights don't always hang from wires and fire hydrants can be underground with a cover on them. And now I know google is trying to learn from our responses, next time they present geographically inappropriate pictures, I'm going to stop pretending I'm american and answer for my part of the world. Bring
  • by Anonymous Coward on Thursday November 26, 2020 @11:51AM (#60768146)

    Ever since an update earlier this year, Google's captcha has been a real bitch. You know how the original point was to identify that you weren't a bot, but were a real human? Well the people who understood that fundamental concept apparently don't work at Google any more. They've clearly been replaced by the age 27-ish know-it-alls who arbitrarily change things because they're "too old", while they sip away on lattes. Here's what I've learned in the past few months, and you're going to wonder what they're smoking, because it seems to actively punish humans!

    The main thing I noticed right away is that now there's some kind of "rate limiting", but it's absurdly sensitive. You know how it should take an AI longer to solve these than an actual human? Well someone at Google forgot this, and the faster you solve them, the harder it punishes you! If it sees more than a click every 5 seconds or so, it'll start punishing you. And the punishment makes you have to do more clicks, resulting in a feedback loop to hell.

    Note that this seems to be worse if you're not using one of the four major browsers that are constantly updated. It's probably trying to do busy crap in the background in such a way that browsers which don't put each tab into its own process are unfairly punished. A lot of "alternative" browsers don't support process-per-tab. Punishment includes being made to do up to three "rounds" of captchas, slower fades, and noisier pictures. Some of us don't like the Chrome layout with no window title bar, which all the other main browsers try to copy. But the easiest way for you to get out of captcha hell may be to switch to Firefox.

    First, some definitions of the various tasks:
    3x3 select three - you pick three items matching a category. Rarely there's a fourth, but it only needs you to click on three.
    4x4 select a category - usually this is traffic lights or crosswalks. Even though it says "select all", it really only needs three or four matching squares selected, even if half of the squares are an enormous traffic light! Fewer clicks are better, to avoid the rate-limiting. Sometimes there is no matching item.
    3x3 fade - you have to click on matching items, but they slowly fade to another picture that may be another matching item. There will always be at least a fourth item to match. This is a punishment for clicking "too fast", aka 5+ seconds per click. The fade speed gets slower the more you are being punished.
    "session" - the total of all the captchas that you have to solve before you get a pass. This may be as many as three rounds.
    "round" - a set of captchas that you have to solve. If captcha hates you, you will usually have to do all three rounds in a session.
    2-per-round - you have to solve two captchas for a round. Often the second one is a 4x4 with no matching item.
    5-per-round - you have to solve five captchas for a round. The timing is such that if you have to go slow, the entire session will time out. This comes up so rarely for me that I don't actually know what triggers this particular punishment, perhaps too many incorrect captchas, perhaps excessive speed.

    So basically under the current broken reCAPTCHA, you have to poke along. Click an item, do something else for 5-10 seconds, click another item, etc. Be sure to pause both before and after using the VERIFY button! Above all, do not go clicking like you're in a race to click them as fast as possible. You wouldn't want to be mistaken for a human, now would you?

    • by Avidiax ( 827422 )

      I think they are trying to go beyond just detecting humans, and are now looking to detect click farms. But sophisticated privacy-conscious users click fast and aren't signed in or have workflows that require lots of captchas, and it looks the same.

  • by Paxtez ( 948813 ) on Thursday November 26, 2020 @02:21PM (#60768372)

    From the summary:
    "companies that have switched over to hCaptcha often report equal or better performance in bot detection and mitigation"

    "often" says that the majority of the time it is worse...

  • All the people complaining about how bad CAPTCHA is are right.

    Letting bad actors into the system then squashing them later if/when they act bad is what should happen. But it'll never happen because engineers don't want to spend time designing systems that have that ability. Easier for them to treat humans like a raw material, like something to mine and refine. And we let them get away with it. Tragic really.

"The great question... which I have not been able to answer... is, `What does woman want?'" -- Sigmund Freud
