A prominent PHP developer (whose name I didn't get permission to drop, so I won't, but many of you know who I mean) has been doing a bunch of research related to Cross Site Scripting (XSS) lately. It's really opened my eyes to how much I take user input for granted.
Don't get me wrong. I write by the "never trust users" mantra. The issue, in this case, is something abusable that completely slipped under my radar.
The problem that was demonstrated to me exploited something I considered to be safe: the filename portion of the request URI. Now I know just how wrong I was.
Consider this: you build a simple script; let's call it simple.php but that doesn't really matter. simple.php looks something like this:
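The listing itself isn't reproduced here, so this is only a minimal sketch of the kind of script I mean, assuming a form that submits back to itself by echoing $_SERVER['PHP_SELF'] into its action attribute. The helper name render_form() and the form markup are hypothetical, made up for illustration.

```php
<?php
// Hypothetical helper: builds the page from whatever path it is handed.
// Assumption: the script uses PHP_SELF as the form's action attribute.
function render_form(string $self): string
{
    // Vulnerable on purpose: $self is written into the markup unescaped.
    return "<html>\n"
         . " <body>\n"
         . "  <form action=\"" . $self . "\">\n"
         . "   <input type=\"submit\" value=\"Submit!\" />\n"
         . "  </form>\n"
         . " </body>\n"
         . "</html>\n";
}

echo render_form($_SERVER['PHP_SELF']);
```

The function wrapper is only there so the path can be swapped out when experimenting; a real page would more likely echo the value inline.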
Alright. Let's put this script at: http://example.com/tests/simple.php. On a properly-configured web server, you would expect the script to always render to this, on request:
What I forgot about, as I suspect some of you have, too (or maybe I'm the only loser who didn't think of this (-; ), is that $_SERVER['PHP_SELF'] can be manipulated by the user.
How's that? If I put a script at /tests/simple.php, $_SERVER['PHP_SELF'] should always be "/tests/simple.php", right?
See, there's a feature of Apache (I think it's Apache, anyway) that you may have used for things like short URLs, or to make your query-string-heavy website search-engine friendly: $_SERVER['PATH_INFO']-based URLs.
Quickly, this is when scripts are able to receive data in the URL path, after the file name but before the question mark that separates it from the query-string parameters. In a URL like http://www.example.com/download.php/path/to/file, download.php would be executed, and /path/to/file would (usually, depending on config) be available to the script via $_SERVER['PATH_INFO'].
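If you want to see this on your own server, a small script along these lines should do it. It's a sketch, not gospel: path_parts() is a hypothetical helper, and which of these entries your server actually populates depends on its configuration.

```php
<?php
// Hypothetical helper: picks the path-related entries out of a
// $_SERVER-style array, using null for anything the server did not set.
function path_parts(array $server): array
{
    return [
        'SCRIPT_NAME' => $server['SCRIPT_NAME'] ?? null,
        'PATH_INFO'   => $server['PATH_INFO'] ?? null,
        'PHP_SELF'    => $server['PHP_SELF'] ?? null,
    ];
}

// Escape the dump so that this test page can't be abused itself.
echo '<pre>'
   . htmlspecialchars(print_r(path_parts($_SERVER), true), ENT_QUOTES, 'UTF-8')
   . '</pre>';
```

Saved as download.php and requested as /download.php/path/to/file, I'd expect a typical Apache setup to report /download.php for SCRIPT_NAME and /path/to/file for PATH_INFO, but check what your own configuration does.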
The quirk is that $_SERVER['PHP_SELF'] contains this extra data, opening the door to potential attack. Even something as simple as the code above is vulnerable to such exploits.
Let's look at our simple.php script, again, but requested in a slightly different manner: http://example.com/tests/simple.php/extra_data_here
It would still "work"--the output, in this case, would be:
I hope that the problem is now obvious. Consider: http://example.com/tests/simple.php/%22%3E%3Cscript%3Ealert('xss')%3C/script%3E%3Cfoo
The output suddenly becomes very alarming:
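To make that concrete, here's a small sketch that replays the URL above without needing a web server. It assumes the value is echoed, unescaped, into a form's action attribute (the form tag is hypothetical), and it uses rawurldecode() to imitate the decoding the server does before PHP sees the path.

```php
<?php
// The request path from the example URL, still percent-encoded.
$requestPath = "/tests/simple.php/%22%3E%3Cscript%3Ealert('xss')%3C/script%3E%3Cfoo";

// Assumption: PHP_SELF arrives already decoded; rawurldecode() imitates that.
$decodedSelf = rawurldecode($requestPath);

// An unescaped echo into an attribute (hypothetical form tag).
$tag = '<form action="' . $decodedSelf . '">';
echo $tag, "\n";
// prints: <form action="/tests/simple.php/"><script>alert('xss')</script><foo">
```

The %22%3E closes the attribute and the tag, the script element runs, and the trailing <foo swallows the leftover "> so the page doesn't look obviously broken.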
The solution should also be obvious. Convert the user-supplied data to entities. The code becomes:
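A minimal sketch of the fix, assuming the value ends up in a form's action attribute: run it through htmlentities() before it touches the markup. The helper name render_form_safely() and the form markup are hypothetical; ENT_QUOTES is my choice here so that single quotes get converted too.

```php
<?php
// Hypothetical helper: builds a self-submitting form, escaping the path first.
function render_form_safely(string $self): string
{
    // Convert quotes and angle brackets to entities before they reach the markup.
    $safe = htmlentities($self, ENT_QUOTES, 'UTF-8');

    return '<form action="' . $safe . '">' . "\n"
         . ' <input type="submit" value="Submit!" />' . "\n"
         . '</form>' . "\n";
}

echo render_form_safely($_SERVER['PHP_SELF']);
```

htmlspecialchars() would do the job equally well for this purpose; the point is simply that nothing from $_SERVER['PHP_SELF'] reaches the page raw.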
And an attack, as above, would be rendered:
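As a quick check, this sketch feeds the decoded attack string through htmlentities() and prints the resulting tag. The form tag is a hypothetical stand-in for wherever the value is echoed, and the flags (ENT_QUOTES, UTF-8) are my assumption.

```php
<?php
// The attack string as PHP would see it in PHP_SELF (already decoded).
$attackSelf = '/tests/simple.php/"><script>alert(\'xss\')</script><foo';

// Escape before echoing into the attribute.
$escapedTag = '<form action="' . htmlentities($attackSelf, ENT_QUOTES, 'UTF-8') . '">';
echo $escapedTag, "\n";
// prints: <form action="/tests/simple.php/&quot;&gt;&lt;script&gt;alert(&#039;xss&#039;)&lt;/script&gt;&lt;foo">
```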
This still violates the assumption that the script name and path are the only data in $_SERVER['PHP_SELF'], but the payload has been neutralized.
Needless to say, I felt silly for not thinking of such a simple exploit earlier. As the aforementioned PHP developer said at the time (to paraphrase): if guys who consider themselves experts in PHP development don't notice these things, there's little hope for the unwashed masses who have just written their first 'echo "hello world!\n";'. He's working on a generic user-input filtering mechanism that can be applied globally to all user input. Hopefully we'll see it in PECL soon. Don't forget about the other data in $_SERVER, either.
Upon experimenting with this exploit on my own server (and watching the raw data in my superglobals, conveniently, via phpinfo()), I noticed something very interesting that reminded me that even though trusting this data was a stupid mistake on my part, I'm not the only one to do so. A fun (and by fun, I mean nauseating) little game to play: create a file called "info.php" (or whatever name you like). In it, place only "<?php phpinfo(); ?>". Now request it like this: http://your-server/path/to/info.php/%22%3E%3Cimg%20src=http://www.perl.com/images/75-logo.jpg%3E%3Cblah
Nice, huh? A little less nauseating: it's fixed in CVS.