Bug Chrome OS X Apple

OS X Users: 13 Characters of Assyrian Can Crash Your Chrome Tab

abhishekmdb writes No browsers are safe, as proved yesterday at Pwn2Own, but crashing one of them with just one line of special code is slightly different. A developer has discovered a hack in Google Chrome which can crash the Chrome tab on a Mac PC. The code is a 13-character special string which appears to be written in Assyrian script. Matt C has reported the bug to Google, who have marked the report as duplicate. This means that Google are aware of the problem and are reportedly working on it.
This discussion has been archived. No new comments can be posted.

  • The Assyrian came down like the wolf on the fold,

    And his cohorts were gleaming in purple and gold;

    And the sheen of their spears was like stars on the sea,

    When the blue wave rolls nightly on deep Galilee.

    Byron [poetryfoundation.org]

  • by Applehu Akbar ( 2968043 ) on Saturday March 21, 2015 @03:01PM (#49309299)

    Let us henceforth dub it the Snow Crash exploit.

    • Weren't the Snow-Crash-related fertile crescent dwellers Sumerians, the Xerox-PARC of Mesopotamian civilization, who invented more or less everything and then got massacred by their imitators?
      • by Whiteox ( 919863 )

        It's the imitator language derivative that is still being used today in Old Persia. Those Iranians are fun guys!
        It's the script to use when you don't want to write in Arabic.

    • That was my first thought as well.
  • by Anonymous Coward

    Stop the presses a bug found in a large complex program.

    • by gnupun ( 752725 )

      ... which millions of people use to connect to the internet... and there are dozens (thousands) of bugs still hidden where that bug came from. Do you still think browsers should be allowed for serious stuff like online banking, home automation and online elections?

    • Stop the presses a bug found in a large complex program.

      No Browser is safe : Chrome, Firefox, Internet Explorer, Safari all hacked at Pwn2Own contest [techworm.net]

      It's not "a bug" in "a program". It's every major browser. And it's pretty much like this every time they do pwn2own. If a group of hackers are able to bring down every major browser with previously unknown* exploits every year just for a chance to win a laptop, what can better motivated (financed) groups do?

      * unknown to the browser developers anyway... 17 seconds to pwn IE, yeah right... like they say on the cooki

  • by Max Hyre ( 1974 ) * <mh-slash AT hyre DOT net> on Saturday March 21, 2015 @03:19PM (#49309355)
    This exploit rang a bell, so I searched Bruce Schneier's website. And, sure enough, on July 15, 2000, he observed ``Unicode is just too complex to ever be secure.'' [schneier.com] Doesn't exactly warm the cockles of the paranoid's heart.
    • by gweihir ( 88907 )

      At that time, Schneier was just one of many who held this opinion. None of us is surprised by what is happening. If you want to be secure, stay away from Unicode or process UTF-8 as ASCII. As soon as you try to render, parse or even only compare anything besides standard ASCII, you are screwed.

      • Unfortunately, Unicode is now woven into various Java string handling and database interactions, and it is far too complex to test all the possible input and storage scenarios. I've also noticed a strong tendency among current QA engineers to test only the new feature, and to avoid testing old components interacting with new features without _amazing_ pushback from their managers who want to keep testing costs very small. The result is a fairly predictable string of failure modes, and of production failures

        • by gweihir ( 88907 )

          Indeed. That is why I usually add to stay away from Java if you want/need security. Testing is pretty much a non-starter to get secure code though, unless the person doing the tests really understands the code, security and has a generous testing budget. In usual industrial practice, none of the three are the case.

          • It's also aggravated by the "install the latest software, and build components, from arbitrary 3rd party repositories" approach. I'm afraid that I just had a long discussion with some Java developers who were accustomed to building their software on their desktops, pulling in arbitrary, unknown versions of components and their dependencies, and using the resulting components to build the next round. I'm afraid it's reminding me, forcibly, of Perl developers saying "just use cpan build!", and ruby developers saying

        • by spitzak ( 4019 )

          Yes, Java and Python (3) and Qt all are causing enormous difficulties as they followed Microsoft down the fantasy road and thought you had to convert strings on input to "unicode" or somehow it was impossible to use them. Since not all 8-bit strings can be converted, there must be either a lossy conversion or an error, neither of which is expected, especially if the software is intended to copy data from one point to another without change.

          The original poster is correct in saying "stay away from U

      • by lgw ( 121541 )

        UTF8 has nothing to do with it.

        The problem commonly is: people try to "clean" input with some stupid regex, rather than treating all user-provided strings as permanently dirty. You can do anything you need to, risk-free, with this attitude. You have to understand the encoding you use for storage/transmission (if your framework doesn't provide a way to safely, blindly store/transmit any string, then just encode the string in some way first), but that's a much, much smaller world than the universe of possib

        • I don't know what your definition of "dirty" is, but there are going to be scenarios where you need your data cleaned.

        • Well, Assyrian Unicode characters are in the range around U+12000. They require four bytes in UTF-8 and two 16-bit words in UTF-16.

          In UTF-8 I'd be surprised if someone handled this wrong, because three-byte characters are common, and there is no good reason to be able to process three-byte but not four-byte UTF-8.

          If they are using UTF-16 on the other hand, I wouldn't be surprised if someone assumes that characters are a single UTF-16 word.
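
The size claims above are easy to check. A minimal sketch, assuming Python 3; `encoded_sizes` is a hypothetical helper name, not something from Chrome or any library. It reports how many UTF-8 bytes and how many 16-bit UTF-16 code units a single character needs. Note that U+12000 is in the Cuneiform block, while a Syriac letter such as U+0710 (the script identified elsewhere in this thread) fits in one UTF-16 unit.

```python
def encoded_sizes(ch: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 code unit count) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    # Encode without a BOM; every UTF-16 code unit is exactly two bytes.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


if __name__ == "__main__":
    for label, ch in [
        ("ASCII 'A' (U+0041)", "A"),
        ("Syriac letter (U+0710)", "\u0710"),
        ("Euro sign (U+20AC)", "\u20ac"),
        ("Cuneiform sign (U+12000)", "\U00012000"),
    ]:
        print(label, encoded_sizes(ch))
```

Code that assumes one UTF-16 unit per character would only be caught out by the last of these, which is the failure mode the comment describes.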
          • by lgw ( 121541 )

            You might be right, but it's such an old problem - it was a big deal 10 years ago in the Windows world as UCS2 didn't handle it. C# was actually UTF from the start, like Java, of course.

            Still, crashing because of, what, a null in the input? I could certainly understand truncation (just like other incorrect display problems), but a crash?

        • by gweihir ( 88907 )

          You miss my point: I basically said that as soon as you are interpreting the data as Unicode, you are screwed. As to treating input as permanently dirty, that would be effective if possible, but it is not. For much security-critical functionality, you just have to reject anything that is not 7-bit ASCII, because quite often you need to sanitize input and use it afterwards.

          • by lgw ( 121541 )

            Maybe I'm still not getting your point. Sure, if you need to understand the details of Unicode character composition and such because you're the one rendering the output glyphs, or you want to sort or search across different encodings of the same word, that's rough, but there's no excuse for a security failure while doing those tasks.

            On your other point: the notion of "sanitizing input" is fundamentally flawed to begin with. You can never know what future framework that user data will be interacting with,

    • I'd say the things that Schneier mentions in this article are not actual problems. The first step is avoiding UTF-16 because it is much too tempting to assume that one 16-bit word = one character; nobody will make that assumption with UTF-8. The next step is cleaning UTF-8 and accepting only valid UTF-8; simply removing anything that isn't valid will do fine. What _must_ happen is that after this cleaning step nobody ever again accesses the original data, only the cleaned data. At that point handling the ch
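
The cleaning step described above might look roughly like this. It is only a sketch, assuming Python 3 and its built-in UTF-8 decoder; `clean_utf8` and its `strict` flag are hypothetical names introduced for illustration. By default it silently drops invalid byte sequences, as the comment suggests; with `strict=True` it rejects the input instead, which some would consider the safer policy.

```python
def clean_utf8(raw: bytes, strict: bool = False) -> str:
    """Decode raw bytes as UTF-8, keeping only valid sequences.

    strict=False drops invalid bytes; strict=True raises UnicodeDecodeError.
    """
    return raw.decode("utf-8", errors="strict" if strict else "ignore")


if __name__ == "__main__":
    untrusted = b"abc\xffdef"          # 0xFF can never appear in valid UTF-8
    cleaned = clean_utf8(untrusted)
    # From here on, only `cleaned` should be used; `untrusted` is discarded.
    print(cleaned)
```

The point made in the comment still applies: once the cleaned string exists, nothing downstream should touch the original bytes again.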
      • Unicode is sort of complicated, or at least it's more complicated than might be expected. But the problem with Schneier saying "Unicode is too complex to ever be secure" is that he might as well just say "programming is too complex to ever be secure." Sure, Unicode is a little complicated. But it's hardly the most complicated thing you'll ever have to deal with as a programmer. If we can't even get that right, we might as well just quit.

        • by AmiMoJo ( 196126 ) *

          If they had just stuck with 24 or 32 bits per character, instead of going with multiple variable length character encodings, you might be right. When you can't be sure how many bytes any given character needs you can't use simple maths to work out how big buffers need to be, or even be sure that you won't end up with odd spare bytes at the end.

          It looks like this is what has happened here. Even supposedly well debugged library code still has issues with it.
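
To illustrate the buffer-size problem with variable-length encodings: cutting a UTF-8 byte string at a fixed length can leave "odd spare bytes" from a split character at the end. A minimal sketch, assuming Python 3 and input that is already valid UTF-8; `truncate_utf8` is a hypothetical helper, and nothing here is claimed to be what Chrome does. It backs up from the cut point until it is no longer in the middle of a character.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut valid UTF-8 data to at most max_bytes without splitting a character."""
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # Continuation bytes look like 0b10xxxxxx. If the first dropped byte is
    # one, the cut falls inside a character, so back up to its lead byte.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


if __name__ == "__main__":
    text = "a\u00e9\u20ac".encode("utf-8")   # 1 + 2 + 3 = 6 bytes
    for limit in range(7):
        print(limit, truncate_utf8(text, limit))
```

With a fixed-width encoding the loop would be unnecessary, which is the trade-off the comment is pointing at.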

  • by NotInHere ( 3654617 ) on Saturday March 21, 2015 @03:19PM (#49309357)

    to ditch Unicode support. They recognized that experimental technology like this shouldn't be rolled out to this many users. Thank you, Dice, for keeping Slashdot safe!

    • by cdrudge ( 68377 )

      Did Dice ditch unicode support? I thought the slash code always had issues/didn't support it, long before Dice acquired them.

      • by Anonymous Coward

        perhaps i can draw the situation in pictures
                              joke
                 
                              0
                              \/ you /\

      • by tlhIngan ( 30335 )

        Did Dice ditch unicode support? I thought the slash code always had issues/didn't support it, long before Dice acquired them.

        Slashcode always supported Unicode.

        The reason it appears it doesn't is that thanks to a bunch of wankers who decided to abuse Unicode to no end, it ended up screwing the site layout up thanks to abuse of control codes.

        So what was added was an input filter that limited what Unicode could come in - pretty much just ASCII at this point.

        Unicode IS complex, and you really cannot blindly ha

    • Yeah. It's not like Slashdot.jp patched slashcode to support Unicode 10+ years ago.

    • by AmiMoJo ( 196126 ) *

      Actually we are probably going to have to ditch Unicode at some point, at least in its current form. East Asian language support is badly broken. It could be fixed, but not in a non-breaking way.

      CJK unification is one of the biggest screw-ups in the history of computing.

      • by Anonymous Coward

        From what I understand, Unicode abandoned CJK unification a long time ago; there are now separate planes for each language.
        Of course the old planes still exist, so you need to transpose those when you find them in a string.

    • by Megane ( 129182 )
      The support is in there, it's just that it uses a whitelist, which happens to be very small, probably only to U+00FF if that much. There are also likely problems on the client side where the user's browser posts in the wrong encoding.
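
A whitelist of the kind described above could be sketched as follows. This is an illustration only, assuming Python 3; `whitelist_filter`, its `max_codepoint` parameter, and the U+00FF cutoff are assumptions taken from the guess in the comment, not Slashcode's actual filter. It keeps printable characters up to the cutoff (plus newline and tab) and drops everything else, including layout-breaking control codes such as the right-to-left override.

```python
def whitelist_filter(text: str, max_codepoint: int = 0xFF) -> str:
    """Keep only printable characters at or below max_codepoint, plus newline/tab."""
    return "".join(
        ch
        for ch in text
        if ord(ch) <= max_codepoint and (ch.isprintable() or ch in "\n\t")
    )


if __name__ == "__main__":
    # Contains U+202E (right-to-left override) and a Syriac letter (U+0710).
    comment = "caf\u00e9 \u202eevil\u0710"
    print(whitelist_filter(comment))
```

A whitelist this small is crude, but it fails closed: anything the filter does not recognise never reaches the page.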
  • If I were looking for a language to scare a program into submission with, Assyrian would be a pretty plausible choice. Even by the rather high standards of the rough neighborhood that is the near and middle east, they cut quite a swath of blood-soaked mayhem through their neighbors; and put out lots of cuneiform inscriptions and rather morbid art gloating about their efficiency at this.
    • Wrong Assyrians. The ones you're thinking of spoke Akkadian and wrote cuneiform.

      Eventually their (Christian) descendants ended up speaking Aramaic like practically everyone else in the Near East at the time (it was the official language in the Western part of the Persian Empire); the modern Assyrian language is one of the many forms of modern Aramaic (now split into several different languages, much as Latin evolved into several different languages over much the same period) and this script is properly c
      • Re:In fairness... (Score:4, Informative)

        by bargainsale ( 1038112 ) on Saturday March 21, 2015 @04:46PM (#49309767)
        (They spoke Aramaic long before they became Christian, of course.)

        The people in question call themselves Assyrians at the present day; there are some Akkadian words preserved in their Aramaic language even now, although Akkadian itself probably died out in the earlier part of the first millennium BC.

        The name "Syriac" is itself from a worn-down version of the same name; it was once used pretty much as the equivalent of "Aramaic" but is now generally used to describe only one particular version of Aramaic which was a major literary language of Western Asia in early Christian times, and is still used as a liturgical language by Nestorian Christians as far afield as India. The script is used to write several modern Aramaic languages spoken by Christians.

        These ancient communities have suffered greatly in the Middle East wars of recent times, and a huge proportion have left as refugees.
  • Syriac not Assyrian (Score:5, Informative)

    by seyyah ( 986027 ) on Saturday March 21, 2015 @03:26PM (#49309401)

    That script is the Syriac script not the Assyrian one: https://en.wikipedia.org/wiki/... [wikipedia.org].

  • this report is a dupe: https://code.google.com/p/chro... [google.com]

  • I once had a small Notes web thing running for a bunch of people in Scandinavia. The thing crashed every time someone from Iceland worked with it. Turned out that the Icelandic character is not in some middle European character set (this was before UTF-8) and wasted Notes every time. That was a total bastard of a problem to find.

  • by Anubis IV ( 1279820 ) on Saturday March 21, 2015 @04:03PM (#49309573)

    In related news, we don't need to worry about this bug being used by unscrupulous sorts of folks in the comments here. The one and only time a lack of unicode support has come in useful...

  • mtbf - 15 mins.

  • hmm, ancient and dead language from the time of reported magic. Just typing the words will crash your Mac. Imagine if one spoke them!
  • ... we know that Assyrian or more precisely Sumerian is tricky.

  • I know, Syrian, but still. I always knew he was going to be the death of Apple.
