Security PHP Programming

New PHP Interpreter Finds XSS, Injection Holes

Posted by kdawson
from the double-edged-sword dept.
rkrishardy writes "A group of researchers from MIT, Stanford, and Syracuse has developed a new program, named 'Ardilla,' which can analyze PHP code for cross-site scripting (XSS) and SQL injection attack vulnerabilities. (Here is the paper, in PDF, and a table of results from scanning six PHP applications.) Ardilla uses a modified Zend interpreter to analyze the code, trace the data, and determine whether the threat is real or not, significantly decreasing false positives." Unfortunately, license issues prevent the tool in its current form from being released as open source.

  • Fixed it for you (Score:4, Informative)

    by techprophet (1281752) <emallsonNO@SPAMarchlinux.us> on Friday June 19, 2009 @10:24AM (#28390519) Journal

    New PHP Interpreter Finds XSS, Injection Holes

    Fixed it for you.

    • Find X? (Score:4, Funny)

      by eldavojohn (898314) * <eldavojohn AT gmail DOT com> on Friday June 19, 2009 @10:27AM (#28390549) Journal

      New PHP Interpreter Findx XSS, Injection Holes

      New PHP Interpreter Finds XSS, Injection Holes

      Fixed it for you.

      Clearly the title was trying to illustrate the PHP interpreter's ability to solve the pythagorean theorem [mit.edu].

      • Clearly the title was trying to illustrate the PHP interpreter's ability to solve the pythagorean theorem [mit.edu].

        I don't need PHP for that! Besides, the pythagorean theorem doesn't have X, just a, b, and c.

        a^2 + b^2 = c^2

        • Re:Find X? (Score:5, Funny)

          by eldavojohn (898314) * <eldavojohn AT gmail DOT com> on Friday June 19, 2009 @10:36AM (#28390677) Journal

          Clearly the title was trying to illustrate the PHP interpreter's ability to solve the pythagorean theorem [mit.edu].

          I don't need PHP for that! Besides, the pythagorean theorem doesn't have X, just a, b, and c.

          a^2 + b^2 = c^2

          I see you prefer short, nondescript variable names for your algorithms. I pity the person who has to maintain that bit of code. What is a? What is b? What is c?

          I subscribe to a more Knuth-y, self-descriptive style of code and prefer the Pythagorean theorem to look more like:

          sideAdjacentToRightAngle^2 + otherSideAdjacentToRightAngle^2 = sideOppositeRightAngle^2

          Or maybe I'm just being a smartass? It's so hard to tell with developers these days ...

          • Or maybe I'm just being a smartass? It's so hard to tell with developers these days ...

            You mean there's a difference?

            [disclaimer]I am a developer[/disclaimer]

          • Re:Find X? (Score:4, Funny)

            by MillionthMonkey (240664) on Friday June 19, 2009 @10:56AM (#28390943)

            I subscribe to a more Knuth-y, self-descriptive style of code and prefer the Pythagorean theorem to look more like:

            sideAdjacentToRightAngle^2 + otherSideAdjacentToRightAngle^2 = sideOppositeRightAngle^2

            Or maybe I'm just being a smartass? It's so hard to tell with developers these days ...

            Would you want to stare at a wall of code with otherSideAdjacentToRightAngles and sideOppositeRightAngles and sideAdjacentToRightAngles all over the place?

            You could just go all the way and call them II11011I, I1IIOI1I, and II110I1I. At least call one of them "hypotenuse", christ.

          • Re: (Score:3, Funny)

            by Haeleth (414428)

            I subscribe to a more Knuth-y, self-descriptive style of code and prefer the Pythagorean theorem to look more like:

            sideAdjacentToRightAngle^2 + otherSideAdjacentToRightAngle^2 = sideOppositeRightAngle^2

            Magic constants?! That's dreadful! How am I supposed to know what 2 is for in that code? And, worse, what if you need to change it to something other than 2? You'd have to change it in three places. You might easily forget one and break everything.

          • by Spaham (634471)

            This reminds me of when I was in calculus class in high school. We had all copied some homework from each other, and of course the teacher found out. Everyone got an F but I got an A. Why? Because I changed the vector names (OK, it was trigonometry, but in calc class). I used names like Mike, Joe, Jay instead of AB, AC, CD, DE like everybody else :)

      • by zoward (188110)

        Thanks - I needed that!

    • by EkriirkE (1075937)
      You do realize it's a replacement for the Zend engine - the "Findx XSS" engine? With script kiddie tools to perform injections (SQL, I'm assuming)
  • by Anonymous Coward

    it probably hasn't been open sourced because it's full of security holes

  • holy smokes batman (Score:3, Interesting)

    by sublimino (1425913) * on Friday June 19, 2009 @10:29AM (#28390583)

    From the results paper: "Part of Ardilla's implementation depends on modifications to the open-source Zend interpreter...made (for a different purpose) by a student while he was an intern at IBM. We have since made many more modifications, but since the original small diffs are owned by IBM, we cannot release either those original modifications or our later work that builds on them...It would be valuable for someone to re-implement the original changes, so that we could release our entire system as we would prefer. "

    How would these changes be "re-implemented" - would the code have to be re-engineered, or would a trawl through the original code (patching in changes verbatim) be acceptable? Otherwise, would somebody have to find alternative syntax for implementing the same functionality? Barrel of worms methinks.

    • Yeah, makes me wonder if open-sourcing this project was a primary goal at the beginning of the project. If so, they should have known about this wrinkle and had the intern re-write what he did for IBM. Seems like an oversight to build so much functionality only to, at the end, go "oh crap"...
    • It's only copyright and nobody would get harmed from sharing it. Let's get Jammie Thomas to release the source.

    • by Tanktalus (794810)

      Um, why not just ask the former-intern's IBM manager for permission? Or is it that IBM doesn't open-source anything [opensource.org]?

  • by JNSL (1472357) on Friday June 19, 2009 @10:29AM (#28390585)
    Although it would be nice to be able to use this, I'd imagine there'd be lots of damage following from widespread release of this program without a quick turnaround on fixing vulnerable sites.
    • by tirerim (1108567)
      Not really, unless those sites already have other serious security problems. The PHP code only runs on the server, and is thus invisible to the end user: all they see is the generated HTML. If your PHP code is exposed to the outside world, you're doing something wrong.
  • by Norsefire (1494323) * on Friday June 19, 2009 @10:29AM (#28390591) Journal
    And mine is open source:

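    use strict;
    use warnings;

    # Flag every line that mentions PHP as an "exploit".
    open( my $code, '<', $ARGV[0] ) or die "Cannot open file: $!";
    while ( <$code> ) {
        if ( /php/i ) {
            print "Exploit found\n";
        }
    }
    close $code;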

    • by Anonymous Coward
      Same program, just in one line, hence easier to understand: perl -nE'say q(Exploit found) if /php/i' *
      • easier to understand: perl

        This particular grouping of words should not ever be used outside the privacy of your own home...

        • by psyclone (187154)

          easier to understand: perl

          This particular grouping of words should not ever be used outside the privacy of your own home...

          Unless you are wanting to do some Practical Extracting and Reporting (with a programming language)

    • Re: (Score:2, Funny)

      by BabyDave (575083)
      /me turns on short_open_tag in php.ini, then cackles maniacally ...
  • This somehow ... (Score:3, Insightful)

    by xmff (1489321) on Friday June 19, 2009 @10:32AM (#28390633)
    ... reminds me of Perl's taint mode, where all external input data is traced until it is explicitly checked through a regular expression or similar.
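
As a rough illustration of the taint-mode idea mentioned above, here is a minimal sketch in Python rather than Perl or PHP. The `Tainted` class and the `untaint` and `run_query` helpers are hypothetical names invented for this example; this is not how Perl's taint mode or Ardilla is implemented.

```python
import re


class Tainted(str):
    """Marks a string as coming from an untrusted source (hypothetical helper)."""


def untaint(value: str, pattern: str) -> str:
    """Return a plain str only if the whole value matches the allowed pattern."""
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError("input rejected: does not match the allowed pattern")
    # Build a fresh plain str so the Tainted marker is dropped.
    return str(match.group(0))


def run_query(user_id: str) -> str:
    """A sensitive sink: refuses data that is still marked as tainted."""
    if isinstance(user_id, Tainted):
        raise TypeError("tainted data reached a sensitive sink")
    return "SELECT * FROM users WHERE id = " + user_id
```

Unlike a real taint mode, this sketch does not propagate the marker through string operations such as concatenation, so it only shows the check-before-use principle.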
  • I can't even find where to download a closed-source version of it. Is it available at all?
  • The basic issue here is that most PHP code does not currently use frameworks, and many PHP developers aren't experienced enough to know what XSS or SQL injection are.

    The problem will never really be fixed in PHP until some framework or at least methodology wins out as the PHP framework of choice.

    It'd be nice if the PHP guys picked one and put their backing behind it, maybe even included it by default like they did APC for caching.

    • The problem will never really be fixed in PHP until the average PHP programmer at least cares about security.

      Sorry to everyone who uses PHP for a living; there are actually very good PHP programmers. Unfortunately, though, they are the exception. Because PHP has an easy syntax and is the server-side language of choice for many cheap webspace providers, every other PHP-based page you stumble upon has glaring security holes, written by someone who barely knows enough PHP to make it work at all, and as soon as it "

    • by Ash Vince (602485)

      The problem will never really be fixed in PHP until some framework or at least methodology wins out as the PHP framework of choice.

      It'd be nice if the PHP guys picked one and put their backing behind it, maybe even included it by default like they did APC for caching.

      Does the Zend Framework count as a framework? If so, they have picked one; it has just not been universally accepted yet.

      There is, however, another issue. Languages like PHP and ASP were originally designed to make creating a server-side, code-driven web site fairly easy. They succeeded, so people who were not well grounded in writing code started dabbling in projects that were over their heads; they just did not know it. These people had never heard of things like buffer overruns so they tended to trust

      • by ukyoCE (106879)

        As an example, I saw some lovely code recently where the developer had used prepared statements all through his code, but still left it wide open to SQL injection by not using variables in the prepared statements. He just prepared entire strings already containing the relevant form variables concatenated with the SQL. Genius.

        I almost said in my post that they should require prepared statements - but then I thought of that scenario and decided against saying that =D
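
The mistake described above is easy to reproduce. The following is a minimal sketch using Python's standard `sqlite3` module instead of PHP and MySQL; the `users` table, its contents, and the function names are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

form_name = "nobody' OR '1'='1"  # hostile form input


def lookup_concatenated(name: str) -> list:
    # Broken: the input is pasted into the SQL text before it is prepared,
    # so the database parses the attacker's quote characters as SQL.
    sql = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def lookup_bound(name: str) -> list:
    # Safer: the SQL text is fixed and the input is bound to the ? placeholder,
    # so it is only ever treated as a value.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the hostile input, `lookup_concatenated` returns every row in the table, while `lookup_bound` returns nothing because no user has that literal name.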

  • by loufoque (1400831) on Friday June 19, 2009 @10:53AM (#28390875)

    Just teach people how to code. When a function or subsystem expects a certain format as a precondition on its input, you actually have to enforce that precondition. In the case of PHP applications, you simply need to apply trivial conversions such as htmlspecialchars() or mysql_real_escape_string(), depending on whether you want to use that input to generate HTML or XML or to include it in a MySQL query; this is enough to get rid of XSS and SQL injections completely.

    There would be no need for such tools if PHP developers actually were software engineers rather than kiddies surfing on the web hype who barely understand the tools they're manipulating.
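
For readers who want to see what escaping at the point of output looks like, here is a minimal sketch in Python, where the standard library's `html.escape` plays roughly the role that htmlspecialchars() plays in PHP. The `render_comment` function and its markup are assumptions for the example only.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build an HTML fragment, escaping untrusted text at the point of output."""
    # html.escape converts &, <, > and (by default) both quote characters.
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))
```

The escaping happens where the text is written into HTML, not where it is received, so the same stored value can still be used safely in other contexts that need different escaping.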

    • by slummy (887268)
      Fuck that. Teaching people how to code the correct way creates equals.

      Messy spaghetti code is always a pain in the ass to fix, but does help us consultants rack up the hours.

      Keep the crappy PHP code coming boys!
    • Re: (Score:2, Interesting)

      by strimpster (1074645)
      Unfortunately you are incorrect about how easy it is to prevent these issues. In some cases, you want the input to come through as HTML that is allowed to be displayed back to the end users. An example of this is MySpace.com (or even the commenting system here). Do you remember the Samy worm [wikipedia.org] that crawled through their system? The techniques you have given would not have worked. An advanced parser that validates the input is necessary to prevent that (by stripping out the bad portions of the data). I was tas
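
The "parser that strips out the bad portions" approach can be sketched with an allowlist. This is a simplified illustration in Python using the standard `html.parser` module, not a production sanitizer; the `ALLOWED_TAGS` set and the `AllowlistSanitizer` and `sanitize` names are assumptions chosen for the example.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # assumed allowlist


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowlisted tags, without attributes; escapes all text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append("<" + tag + ">")  # attributes are dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append("</" + tag + ">")

    def handle_data(self, data):
        self.parts.append(html.escape(data))


def sanitize(fragment: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)
```

The output is rebuilt from scratch rather than edited in place, so anything the sketch does not explicitly recognize (event handler attributes, script tags, unknown elements) never reaches the page as markup. It does not repair unbalanced tags.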
      • by loufoque (1400831)

        Unfortunately you are incorrect about how easy it is to prevent these issues

        Sure it is easy to prevent XSS; I just gave a way that always works. I never said that way covered every use you may want to make of your input, however.
        Indeed, if you want to treat your input as an HTML fragment to include verbatim in your document (which, in my opinion, is a terrible idea; just look at how annoying that is on Slashdot, this messed up my message elsewhere in this thread because I naively wrote & instead of &am

        • Saying that a user should not be able to put in HTML is a cop-out. As a versed software engineer, you should be completely proficient at parsing data and validating it. In fact, if you have a degree from a university (which I'm assuming that you do), you should have had to deal with grammars in one of your classes. It sounds like you don't recognize the need for this, as you are most likely not what one would classify as a "web developer". That is fine, but some applications require the use of this. One very
          • by loufoque (1400831)

            Saying that a user should not be able to put in HTML is a cop-out. As a versed software engineer, you should be completely proficient at parsing data and validating it

            I never said it was problematic to implement; I said it was a terrible idea from a usability point of view, and this was between parentheses, which shows it was nothing more than a side note.
            Can't you read at all? I said that if you wanted to allow this, you should parse, which you should do anyway if you used a different input format than HTM

    • in the case of PHP applications, you simply need to apply trivial conversions such as htmlspecialchars() or mysql_real_escape_string()

      Let's see. You have to

      • Know to do it.
      • Remember to do it.
      • Be careful to only do it once.
      • Actually type the characters.

      One of them is incredibly easy.

      The rest could be made a lot easier with a static type system where you can create a type HtmlString and offer htmlspecialchars() as the only conversion from String to HtmlString, and only allow instances of HtmlString to be output. Similarly for SQL.

      Doing things the hard way instead of the easy way (and insisting others also do it the hard way) for no good reason
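
The HtmlString idea above can be approximated even in a dynamically typed language. Below is a minimal sketch in Python; `HtmlString` follows the name used in the comment, while `emit` is a hypothetical output function. Python only enforces this at run time, although a static checker such as mypy could flag a plain `str` passed to `emit` before the code runs.

```python
import html


class HtmlString:
    """Text that is already safe to emit as HTML (hypothetical type)."""

    def __init__(self, raw: str) -> None:
        # The only way to build an HtmlString is through escaping.
        self._escaped = html.escape(raw)

    def __str__(self) -> str:
        return self._escaped


def emit(fragment: HtmlString) -> str:
    """Output sink: accepts HtmlString only, so unescaped str is rejected."""
    if not isinstance(fragment, HtmlString):
        raise TypeError("emit() requires HtmlString, got " + type(fragment).__name__)
    return str(fragment)
```

With this shape, "know to do it" and "remember to do it" stop being the programmer's job: forgetting the conversion produces a type error instead of an XSS hole.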

      • by loufoque (1400831)

        The rest could be made a lot easier with a static type system where you can create a type HtmlString and offer htmlspecialchars() as the only conversion from String to HtmlString, and only allow instances of HtmlString to be output. Similarly for SQL.

        Could be interesting.
        I guess you could implement that approach in any language with support for user-defined implicit conversions (C++ comes to mind, albeit I've heard Scala does it too).

        Now, don't get me wrong. I don't like typing type names all the time.

        Typin

    • If people aren't using escaping functions like that at all, then this tool isn't really needed; a simple parser could see the functions aren't being called. This tool seems like it may be useful for catching occasional cases where something has mistakenly been omitted, i.e. because people are imperfect, not because they are clueless.

      That said I don't think it's really something that developers should have to care about. PHP is primarily a language for interacting with databases and web browsers and as such
    • by Waccoon (1186667)

      It would also help if PHP had a decent built-in template engine. PHP is supposed to be a template language, but (supposedly) up to PHP 6, it can't even handle UTF-8 encoding.

      Anything in PEAR isn't much use, either, because my scripts are designed to be redistributed and run on shared servers. These servers usually don't have any PEAR modules installed.

  • DarkReading! (Score:4, Informative)

    by jginspace (678908) <jginspace@yahoo. c o m> on Friday June 19, 2009 @10:58AM (#28390973) Homepage Journal
    TFA is just blog spam. See source [darkreading.com].

    And I wonder, are the maintainers of schoolmate and webchess now frantically patching their code? None of the articles gives dates - although the PDF is more than 18 months old.

  • There is a similar tool available under a BSD license called PHPAudit, but it does seem to generate a few more false positives than the one linked in this article... Its site is http://phpaudit.precor-incorporated.com/ [precor-incorporated.com]
  • not possible (Score:3, Interesting)

    by Lord Ender (156273) on Friday June 19, 2009 @11:43AM (#28391587) Homepage

    I agree that it is possible (but difficult) to identify sql injection vulnerabilities with automated code inspection. I do not think XSS can be identified so easily. In a web app, user-submitted text is added to a database. Then who-knows-what happens to it. Eventually, something based on that text is submitted as output, at which time special characters must be escaped.

    The only way to accurately identify XSS in such a scenario is to track the input from the user, into the database, and back out, so that you know the special characters are escaped. That's not something software could accurately do for a general case, without tons of false positives.
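
To make the round trip described above concrete, here is one conceivable way to carry a taint flag into the database and back out, sketched in Python with `sqlite3`. It is an illustration of the idea only, not a description of how Ardilla or any other tool works; the `comments` table, the `body_tainted` shadow column, and the function names are all assumptions.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a shadow column records whether the value came from a user.
conn.execute("CREATE TABLE comments (body TEXT, body_tainted INTEGER)")


def store(body: str, from_user: bool) -> None:
    """Save the value together with its taint flag."""
    conn.execute("INSERT INTO comments VALUES (?, ?)", (body, int(from_user)))


def render_all() -> list:
    """Escape on output exactly those values whose taint flag is set."""
    rows = conn.execute(
        "SELECT body, body_tainted FROM comments ORDER BY rowid"
    ).fetchall()
    return [html.escape(body) if tainted else body for body, tainted in rows]
```

Whether this kind of bookkeeping can be automated for arbitrary applications without many false positives is exactly the open question the comment raises.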

  • Unfortunately, license issues prevent the tool in its current form from being released as open source.

    The existence of a tool (even if it's pricey) is invaluable, especially when compared to inferior tools. If we want a FOSS solution, all that's stopping us is ourselves.

"Never ascribe to malice that which is caused by greed and ignorance." -- Cal Keegan
