Security

Unicode Encoding Flaw Widespread 184

LordNikon writes "According to this CERT advisory: 'Full-width and half-width encoding is a technique for encoding Unicode characters. Various HTTP content scanning systems fail to properly scan full-width/half-width Unicode encoded HTTP traffic. By sending specially-crafted HTTP traffic to a vulnerable content scanning system, an attacker may be able to bypass that content scanning system.' A proof of concept affecting IIS is already being posted to security mailing lists. Cisco IPS and other IDS products are also affected." The CERT advisory lists 93 systems, with 6 reported as vulnerable (including 3com, Cisco, and Snort), 5 known not vulnerable (including Apple and HP), and the rest unknown.
This discussion has been archived. No new comments can be posted.

  • Limited impact. (Score:3, Informative)

    by shird ( 566377 ) on Tuesday May 22, 2007 @01:38AM (#19217829) Homepage Journal
    This appears to be limited to content scanning, and isn't really a vulnerability in itself. Relying on content scanning to prevent an exploit from reaching an exploitable system is a pretty bad idea; it is much better to fix the system than the extra layer of defense on the outside.

    Content scanning is mostly useful for filtering known exploits, and is hardly meant to be your primary defense. Being able to bypass this scanning won't buy you much. If the content scanner is aware of an exploit it scans for, chances are the systems being targeted are too, and have been patched to protect against it.
    • by KevMar ( 471257 )
      So this is another case of don't trust user input.

      I don't see anything new here, just another trick to look for. Most well tested systems should not be affected.

      Unless I'm overlooking something? I'm not, am I?
    • $-$

      They've been trying to sell this kind of kit to us for years.
    • by jrumney ( 197329 )
      There have been many vulnerabilities in the past that were based on encoding a URL in some broken (or even non-broken) way to get past the first level of URL checking to a lower level where directory traversal is possible. On Unix based servers, the risk of this is mitigated by running your webserver in a chroot jail. On IIS, you just have to hope that IIS 6.0 is actually fundamentally secure down to its lowest levels, not just an insecure product with a thin veneer of security layered over it like previous
      • On Unix based servers, the risk of this is mitigated by running your webserver in a chroot jail.
        chroot jail doesn't protect your application against XSS.
  • Incident response (Score:4, Interesting)

    by Anonymous Coward on Tuesday May 22, 2007 @02:11AM (#19217961)
    I work incident response in a large web company (hence anonymous posting, natch) and currently we're treating this as "interesting, but case not proven". We test that our web apps filter all input, so I'm adding double-width unicode to our security regression test cases; however I'm happy to let the FD posters lab it out between them in the short term. These alleged IIS exploits don't work for us - which is not to say that we don't have some system, somewhere, for which this is an issue. At the end of the day it's just a clear restatement of something that's obvious to anyone - you need to filter input carefully, and you need to be aware of issues around alternative encodings. But it's not a "BRB" (big-red-button, i.e. emergency stop and all hands to the pumps to fix a vulnerability) issue for us - yet. The last time we had one of those, it was the Microsoft DNS server remote root... because most of our internal domain controllers were also running DNS servers.
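
A regression case of the kind described above might look like the following minimal sketch. `looks_dangerous`, `BLOCKED_SUBSTRINGS`, and `REGRESSION_CASES` are hypothetical names invented for illustration, not anything from the poster's systems, and the sketch assumes that Unicode NFKC normalization is an acceptable stand-in for whatever folding the backend performs.

```python
import unicodedata

# Hypothetical blacklist; a real filter would be far more thorough.
BLOCKED_SUBSTRINGS = ("<script", "../")


def looks_dangerous(text: str) -> bool:
    """Fold compatibility forms (including fullwidth ASCII) before matching."""
    folded = unicodedata.normalize("NFKC", text).lower()
    return any(bad in folded for bad in BLOCKED_SUBSTRINGS)


# Regression cases: each plain attack string is paired with a fullwidth variant.
REGRESSION_CASES = [
    "<script>alert(1)</script>",
    "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e",  # fullwidth < and >
    "../etc/passwd",
    "\uff0e\uff0e\uff0fetc\uff0fpasswd",              # fullwidth . and /
]


def run_regression() -> list[str]:
    """Return the cases the filter failed to flag (empty means all were caught)."""
    return [case for case in REGRESSION_CASES if not looks_dangerous(case)]
```

Calling `run_regression()` and asserting that it returns an empty list would slot into most test runners; new encodings can be covered by appending further variants to the case list.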
  • "Not vulnerable" (Score:3, Informative)

    by iamacat ( 583406 ) on Tuesday May 22, 2007 @02:20AM (#19218001)
    According to the advisory, Apple products do not provide HTTP content filtering and are therefore not vulnerable. This will do nothing to help someone build a functioning protection system.
    • Yeah, no "content filtering" is needed, why would it be? Any text is either the request (and thus not "content") or mere data, in the second case it shouldn't be filtered unless something is terribly broken.

      Trying to parse encapsulated data is a bad idea generally, as is trying to detect the same attack twice. Unless, of course, you're a snakeoil^Wsecurity software salesman.
  • I'm wondering if the great firewalls (Cisco product?) are also vulnerable to this. At least it'll force them to do longer string matching.
  • by udippel ( 562132 ) on Tuesday May 22, 2007 @05:24AM (#19218849)
    It is a vulnerability, in the strict sense.
    It is a self-inflicted misbehaviour as in common sense.
    It is like those silly Cisco content inspectors on port 25, that try to avoid attacks on flimsy MTAs.
    It is like someone dying from a jab against measles: the jab protected that person from contracting measles, actually.
    It is like those stupid anti-virus programs that are more vulnerable than the daemons they profess to protect.

    When the attacker uses a codepage different from the one that you think she ought to use, she can circumvent your content filter. Which ought not be an attack vector, in any case.

    As I said: nothing to see, move along ...
  • What kind of a flawed design is it where character encoding can impact security? The concept of scanning for unsafe strings is also flawed, as in the case of virus scanning, as it only knows about the stuff it knows about. This is another example of Ranum's enumerating badness [ranum.com]. If the SQL engine used only stored procedures then you wouldn't have to run a content scanner as the only thing coming over HTTP is DATA.
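
The "only DATA comes over HTTP" idea can be illustrated with a small sketch. SQLite has no stored procedures, so this uses a parameterized query instead, which is a swapped-in technique that gives a similar separation of code from data. The table, the `find_user` helper, and the hostile string are all made up for illustration.

```python
import sqlite3

# In-memory database with a single hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")


def find_user(name: str) -> list[tuple]:
    """Look up a user; the ? placeholder binds name as a value, never as SQL."""
    cursor = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


# A classic injection attempt is treated as an (unmatched) literal name.
hostile = "alice' OR '1'='1"
legit_rows = find_user("alice")
hostile_rows = find_user(hostile)
```

Because the statement text is fixed and the input is only ever bound as a value, the encoding of the input cannot change which SQL gets executed.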
    • If the SQL engine used only stored procedures then you wouldn't have to run a content scanner as the only thing coming over HTTP is DATA.
      Do the popular free software implementations of SQL (MySQL, PostgreSQL, Firebird SQL, etc.) implement stored procedures in any sort of standard manner?
      • by rs232 ( 849320 )
        'Do the popular free software implementations of SQL (MySQL, PostgreSQL, Firebird SQL, etc.) implement stored procedures in any sort of standard manner?'

        I don't know what you mean by standard manner. According to this [postgresql.org] PostgreSQL uses something called procedural languages. But then again, since when was SQL ever implemented to a common standard? Remember when Microsoft 'extended' SQL so as to allow spaces in table names: you only have to wrap the name in square brackets [] or back-ticks ``.

        But my point
  • Back in the Win95 days, I recall a stupid little exploit that would lock up a Win95 machine. The root of the problem, however, was in the TCP/IP code from BSD's source. Microsoft had used BSD's TCP/IP stack code in building one for Win95. I'm not here to complain that big bad commercial vendors are "stealing" from the open source community. I'm just suggesting that perhaps this is yet another example of how OSS has made yet another important, though silent, contribution.

    It's annoying to me when people
    • Just because you use "free" code doesn't mean you don't have to check it for correctness.
      If X works in system Y doesn't imply it works in system Z. Heck, the reason it works in Y could be because of a bug in Y.
    • 'Back in the Win95 days, I recall a stupid little exploit that would lock up a Win95 machine. The root of the problem, however, was in the TCP/IP code from BSD's source'

      I assume you are referring to the ping of death [archive.org]. The root cause was a bug in the TCP protocol, and it occurred on other platforms not using the BSD code.

      was Another likely example of OSS?
  • After reading through this carefully, it seems the fault is really with the webserver software (in this case, IIS). The problem is that normally a full-width character (such as FF1C in the example) and the regular character "<" are not equivalent, but IIS is translating the full-width form of a character into the regular character, so although the two forms were distinct before reaching the frontline filters, they are no longer distinct by the time it reaches application code running under IIS.
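
The mismatch described above can be sketched in a few lines. This is not how IIS actually performs its translation; NFKC normalization is used here only as an assumed stand-in for a server that folds fullwidth forms to ASCII, and both function names are hypothetical.

```python
import unicodedata


def frontline_filter_allows(payload: str) -> bool:
    """Naive scanner: rejects only the literal ASCII angle brackets."""
    return "<" not in payload and ">" not in payload


def backend_fold(payload: str) -> str:
    """Stand-in for the server-side translation (NFKC is an assumption)."""
    return unicodedata.normalize("NFKC", payload)


payload = "\uff1cscript\uff1e"  # U+FF1C and U+FF1E around the word "script"
passed_filter = frontline_filter_allows(payload)
seen_by_app = backend_fold(payload)
print(passed_filter, seen_by_app)  # True <script>
```

The filter and the application disagree about what the bytes mean, which is the whole problem: the check happens before the translation rather than after it.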

    I guess
    • "Full width" vs. "Half width" (or, as I prefer, "half-wit") characters exist for typographical convenience in rendering Japanese characters. (Take a look at the Unicode spec, section 10.3 for example http://www.unicode.org/book/ch10.pdf/ [unicode.org]). This does not, however, explain why certain symbols that are already defined in other parts of the Unicode standard, such as the less-than symbol (or left angle bracket) are duplicated there. I suspect that it has something to do with possible confusions that might arise

      • Re: (Score:3, Insightful)

        by phasm42 ( 588479 )

        here are 2 ways of producing the < glyph: you can use character code x8B or xFF1C.

        Shouldn't that be x3C?

        I'm not sure if that's right or wrong, if there is a right and wrong way to handle this issue (I suppose that means it's excellent grounds for a religious war)--it's just important that it be handled consistently.

        I thought about this a little more, and I think the difference will be in what it is used for. In HTML, the "<" glyph has a special meaning, so it makes sense that a different version (in

        • Shouldn't that be x3C?

          Er...yes, of course. Apparently x8B is one of those European-style single quotes (at least that's what I think the purpose of that character is) that looks like a small left angle bracket. (There's a double version as well.)

          That's what I get for posting from work, where I have to keep looking over my shoulder watching for my boss, who doesn't understand that posting to /. is research.

    • Re: (Score:3, Informative)

      by spitzak ( 4019 )
      They are there for compatibility with some Japanese and Chinese character sets, which contained most of the ascii characters in both "half" and "full width" forms. The full-width ones were twice as wide to match the square characters, which was useful for lining up columns.
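
As a rough illustration of how closely the fullwidth forms mirror ASCII: the block U+FF01 to U+FF5E lines up with ASCII 0x21 to 0x7E at a constant offset of 0xFEE0, and the fullwidth space is U+3000. The `to_halfwidth` helper below is a hypothetical name, and the sketch covers only the fullwidth ASCII variants, not halfwidth katakana or the other compatibility forms in that block.

```python
FULLWIDTH_START = 0xFF01  # FULLWIDTH EXCLAMATION MARK
FULLWIDTH_END = 0xFF5E    # FULLWIDTH TILDE
OFFSET = 0xFEE0           # distance from the fullwidth block down to ASCII
IDEOGRAPHIC_SPACE = 0x3000


def to_halfwidth(text: str) -> str:
    """Map fullwidth ASCII variants to plain ASCII; leave everything else alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if FULLWIDTH_START <= cp <= FULLWIDTH_END:
            out.append(chr(cp - OFFSET))
        elif cp == IDEOGRAPHIC_SPACE:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)
```

For example, `to_halfwidth("\uff21\uff22\uff23\uff1c")` gives `"ABC<"`, since U+FF1C minus 0xFEE0 is 0x3C.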

      This is all pointless now with proportionally-spaced fonts (and multiple fonts, you could easily select the "wide" font to print those characters instead). However Unicode had as a design requirement that translating from any common encod
  • So how long til we find out that there has been exploitation of this vulnerability for X number of months for the sole purpose of stealing our WoW accounts!!!

    Why steal someone's real identity when you can steal their uber virtual Undead Priest identity and sell it for 16 bucks.
  • Am I the only one who has noticed that since CERT partnered with the US Government, the response time on advisories has been much slower, and the details and depth of reports are less comprehensive? CERT advisories used to be a critical part of our security strategy. Now by the time they hit the mailing list (if at all), they're more of an afterthought.

    Is there a better alternative to CERT now? Because it just isn't cutting it. I am familiar with Bugtraq and Security Focus. By the time CERT mentions somet
  • Full-width and half-width encoding is a technique for encoding Unicode characters.

    That comes as a complete surprise to me, and I thought I knew at least a little about Unicode and other character encoding schemes. The usual methods of encoding Unicode character points are UTF-8 (variable-length scheme where characters may be represented by anything from one to six bytes), UTF-16 (fixed-width double byte encoding), UTF-32 (fixed-length 4 byte encoding), and well there's UTF-7 and other oddballs. But the cl

    • Re: (Score:2, Insightful)

      by HeroreV ( 869368 )

      UTF-16 (fixed-width double byte encoding)

      UTF-16 is a variable-width encoding. Code points from plane 0 are encoded in 16 bits and code points from planes 1 through 16 are encoded as two 16 bit surrogates. Many developers, like you, aren't aware of this, so it's very common for software to choke on UTF-16 with surrogate pairs.
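
The surrogate arithmetic is short enough to write out. The following is a minimal sketch with hypothetical helper names (`utf16_units`, `encode_units_be`); it does not validate that the input is a legal scalar value.

```python
def utf16_units(code_point: int) -> list[int]:
    """Return the UTF-16 code units (one or two 16-bit values) for a code point."""
    if code_point < 0x10000:
        return [code_point]            # plane 0: a single 16-bit unit
    v = code_point - 0x10000           # 20 bits left to split across two units
    high = 0xD800 + (v >> 10)          # top 10 bits go in the high surrogate
    low = 0xDC00 + (v & 0x3FF)         # bottom 10 bits go in the low surrogate
    return [high, low]


def encode_units_be(units: list[int]) -> bytes:
    """Serialize code units as big-endian bytes, without a byte order mark."""
    return b"".join(unit.to_bytes(2, "big") for unit in units)


# U+0041 fits in one unit; U+1F600 (plane 1) needs a surrogate pair.
one_unit = utf16_units(0x41)
two_units = utf16_units(0x1F600)
```

Here `two_units` is `[0xD83D, 0xDE00]`, and serializing it matches what Python's own `"utf-16-be"` codec produces for the same character. Code that assumes one 16-bit unit per character will miscount or split such pairs.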

      I don't understand how mistaking one character for another is going to break anything

      scenario:
      1) You escape a Unicode string that contains fullwidth characters. The fullwidth characters
