1. Security and... Driving? (and Hiring)

    There's been a blip on the PHP blogosphere (think what you will of that word, it's accurate) regarding PHP's "inherent security flaws."

    I guess it's time to toss in my 2c (even though I was one of the first to reply to Chris' post on this). Since I like similes, I propose the following: coding is like driving.

    What? It's pretty simple, if you think about it.

    If you drive, you'll follow. If you don't, but have tried, you'll also follow. If you've never tried it, you should. (-:

    Coding is like driving. When you start driving, you're really bad at it. Everyone is horrible, even if they aren't aware.

    As time passes, and you gain more experience behind the wheel, you're subjected to different driving conditions and new hazardous situations. These eventually make most of us better drivers.

    Take me, for example. I grew up in a relatively small city in New Brunswick. I learned to drive there. At the time, there was very little street parking, and as a result, very little parallel parking. I was really bad at parallel parking for a long time. I first started driving when I was 16. It wasn't until I was 20 that some friends and I took my car to the first (and only?) Geek Pride Festival. Closing in on Boston, the roads got wider and wider. Suddenly, I found myself driving on a road that was 4 lanes in each direction. You laugh, but this is daunting for a guy who'd never driven on anything wider than 2 lanes (in each direction), before. I knew to cruise on the right, and pass on the left, but... how do I use those other two lanes? I now live in Montreal, and feel confined when there are only two lanes. (-:

    Another parallel is when I learned to drive stick (manual transmission). My first few weeks were quite jumpy... then, my clutch foot smoothed out, and my passengers were relieved.

    More food for thought lies in the insurance industry. Now, I'll keep my feelings towards these racketeering slimeballs (mostly) to myself for the purposes of this entry, but they DO do something right: reward experienced drivers (often at the cost of young males, but I digress).

    I have a motorcycle license. I had to pass both written and driven tests to be able to ride. Even then, I only qualified for the lower class of bike ( 550cc).

    Alright, so what's my point? Simple: new coders are bad at their jobs. I thought I was good at the time, but I was horrible. I'm better now, but in 2 years, I know I'll look back at this and think about how bad I was 2 years ago. New drivers are also bad.

    So, the people who control the roads have put a few safeguards into effect to keep these people from hurting others. First, there's graduated licensing in many parts of the world. When I was 16, I had a 12 month waiting period before I could drive by myself, and even then, I had to maintain a 0.00% blood alcohol level whenever driving.

    Insurance companies penalize (or, if you're fluent in marketing, "don't reward") new drivers. My insurance payments are now an order of magnitude lower than when I first started driving.

    Trucking companies are likely to hire newgrad drivers, but this is because their workforce is scarce. They put their better, and more experienced drivers on the most complicated routes. And most taxi drivers I see are well over 30.

    Getting offtopic again: New coders are bad. They learn. Some quickly, some not so much. They make mistakes.

    So, how do you get around this? Two ways. If you run a small shop, you should ONLY have experienced developers on staff. If your shop is a little bigger, then you can afford (ironically) to pay less to inexperienced devs that can do some grunt work, and get a bit of experience under their belts. Make sure that your good devs are reviewing their work, though.

    You're effectively enforcing "graduated licensing" on your devs. If they have little experience, give them little power.

    That said, I firmly believe (and agree with Marco) that it's not PHP's job to enforce this. Just as I would not expect Plymouth to limit my ability to drive my old Reliant K car. There are rules in place at a higher level, and that's GOOD in my opinion.

    PHP is easy, or at least it starts out that way, and then, after a certain threshold, gets more and more complicated, but that's OK. Everything works this way. "Windows" is easy.. but when your registry pukes, it takes guru skills to clean it up (or novice skills to find your XP CD to reinstall). Driving is "easy"... just don't put new drivers in a situation they haven't seen before (whiteout/blizzard, collision, black ice, blinding sun, etc).

    The money you save by hiring new grads (without proper mentors/filtering/etc) is often trumped by your exposure to security flaws, bad design, and failure.

    A little aside: development shops and otherwise-hiring companies seem to be catching on to this. In the past 3 months, I've had 4 colleagues (former) come to me asking if I know any advanced PHP devs in Montreal who are looking for work... I've made a few suggestions, but most of the GOOD locals I know are already happily employed. If you live here (or are planning on moving here), and you've got LOTS of PHP experience (more than 3 years), have diverse experience, and are genuinely a good coder, let me know, and I'll try to hook you up.

  2. ($var == TRUE) or (TRUE == $var)?

    Interesting little trick I picked up a while back, been meaning to blog about it. If you're already in the loop, run along.

    Prior to enlightenment, I used to write conditionals something like this:

    if ($var == SOME_CONSTANT_CONDITION) { // do something }

    ... more specifically:

    if ($var == TRUE) { // do the true thing }

    That's how I'd "say" it, so that's how I wrote it. But is it the best way? I now don't think so. When reviewing other peoples' code (often from C programmers), I've seen "backwards" conditionals.. something like:

    if (TRUE == $var) { // ... }

    Which just sounds weird. Why would you compare a constant to a variable (you'd normally compare a variable to a constant)?

    So, what's the big deal?

    Well, a few months back, I stumbled on an old article about a backdoor almost sneaking into Linux.

    Here's the almost-break:

    if ((options == (__WCLONE|__WALL)) && (current->uid = 0)) retval = -EINVAL;

    Ignore the constants, I don't know what they mean either. The interesting part is current->uid = 0

    See, unless you had your eyes peeled, here, it might look like you're trying to ensure that current->uid is equal to 0 (uid 0 = root on Linux). So, if options blah blah, AND the user is root, then do something.

    But wait. There's only a single equals sign. The comparison is "==". "=" is for assignment!

    Fortunately, someone with good eyes noticed, and Linux is safe (if this had made it into a release, it would've been trivial to escalate your privileges to the root level).. but how many times have you had this happen to you? I'm guilty of accidentally using "=" when I mean "==". And it's hard to track down this bug.. it doesn't LOOK wrong, and the syntax is right, so...

    This is nothing new. Everyone knows the = vs == problem. Everyone is over it (most of the time). But how can we reduce this problem?

    A simple coding style adjustment can help enormously here.

    Consider changing "$var == TRUE" to "TRUE == $var".

    Why? Simple:

    sean@iconoclast:~$ php -r '$a = 0; if (FALSE = $a) $b = TRUE;'
    Parse error: parse error in Command line code on line 1

    Of course, you can't ASSIGN $a to the constant FALSE. The same style applied above would've caused a similar error in the Linux kernel's C code:

    if ((options == (__WCLONE|__WALL)) && (0 = current->uid ))

    Obviously, "0" is a constant value--you cannot assign a value to it. The missing "=" would've popped up right away.

    Cool. Seems a little awkward at first, but in practice, it makes sense.
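
    To make the failure mode concrete, here is a minimal sketch in PHP. Both function names are made up for illustration, and the uid check is only a stand-in for the kernel code quoted above, not a real privilege check.

```php
<?php
// Illustration only: both function names are hypothetical.

// Variable-first style, with the typo: "=" instead of "==".
function is_root_with_typo($uid)
{
    // Assigns 0 to $uid, then tests the assigned value (0, which is falsy).
    // PHP accepts this without complaint, so the branch is never taken.
    if ($uid = 0) {
        return true;
    }
    return false;
}

// Constant-first style, written correctly.
function is_root_constant_first($uid)
{
    // Typing "=" by mistake here would give "0 = $uid",
    // which PHP refuses to parse.
    if (0 == $uid) {
        return true;
    }
    return false;
}

var_dump(is_root_with_typo(0));         // bool(false): the typo hides the bug
var_dump(is_root_constant_first(0));    // bool(true)
var_dump(is_root_constant_first(1000)); // bool(false)
```

    The first function is syntactically fine and simply gives the wrong answer. With the constant on the left, the same slip would stop the script at parse time.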

    HTH.

  3. APC Docs

    Just a quick note to say that I finished up the first round of APC Documentation over the weekend.

    (and Rasmus linked to my livedocs on the Internals list)

    There've been more changes since that was built, but at least the basic docs are there, now.

    Enjoy.

    ((update: seems I suck at making links in HTML; fixed))

  4. mail() replacement -- a better hack

    This morning, I read Davey's post about how to compile PHP in a way that allows you to specify your own mail() function. This is kind of a cool hack, but I've been using a different approach for a while, now, that allows much better control. Read on if you're interested.

    Davey's hack, if you didn't read his post, yet, centers around defining your OWN mail function, after you have instructed PHP not to build the default one.

    My hack doesn't require editing of the PHP source, or even a recompile. It doesn't require an auto-prepend, either, but it does require a small change to php.ini.

    So, where's the magic? It lies in the sendmail_path directive.

    When it comes to mail() (as well as many other things), PHP prefers to delegate the heavy lifting to another piece of software: sendmail (or a sendmail compatible command-line mail transport agent). By default, PHP will call your sendmail binary, and pass it the entire message, after composing it from the headers and body supplied by the developer.

    One of the side-benefits to this system is the ability to override PHP's default, and seamlessly hook in your own sendmailesque binary or script.

    Here's an example from one of my development environments:

    sendmail_path=/usr/local/bin/logmail

    sean@sarcosm:~$ cat /usr/local/bin/logmail
    cat >> /tmp/logmail.log

    This little bit of config & code is extremely useful in a non-production environment. How many of us have accidentally sent emails to actual customers from the development server? This little bit of trickery avoids this, and instead of sending the email (as PHP normally would), mail is instead logged to the /tmp/logmail.log file. Disaster avoided.

    But, that file gets pretty big over time... it becomes unmanageable very quickly. So, in a different environment, I have an alternative:

    sendmail_path=/usr/local/bin/trapmail

    sean@sarcosm:~$ cat /usr/local/bin/trapmail
    formail -R cc X-original-cc \
      -R to X-original-to \
      -R bcc X-original-bcc \
      -f -A"To: devteam@example.com" \
    | /usr/sbin/sendmail -t -i

    And what does this do? It traps all mail that would normally go OUT (say, to a customer), and instead, delivers it to devteam@example.com (with the original fields renamed for debugging purposes).
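
    For anyone without formail handy, the header renaming can be roughly sketched in PHP. This is an approximation, not a drop-in replacement: trap_headers() is a hypothetical helper, it keeps the original capitalization of the field name (formail would write the new name exactly as given), and it does nothing clever with unusual messages.

```php
<?php
// Sketch only: trap_headers() is a hypothetical helper, not part of
// formail or PHP. It mimics the renaming done by the trapmail script.
function trap_headers($rawMessage, $trapAddress)
{
    // Split the headers from the body at the first blank line.
    $parts = preg_split("/\r?\n\r?\n/", $rawMessage, 2);
    $headerLines = preg_split("/\r?\n/", $parts[0]);
    $body = isset($parts[1]) ? $parts[1] : '';

    $out = array();
    foreach ($headerLines as $line) {
        // Rename To/Cc/Bcc to X-original-To/Cc/Bcc; other lines pass through.
        $out[] = preg_replace('/^(to|cc|bcc):/i', 'X-original-$1:', $line);
    }
    // Add the one recipient that should really receive the message.
    $out[] = 'To: ' . $trapAddress;

    return implode("\n", $out) . "\n\n" . $body;
}

$message = "From: www@example.net\n"
         . "To: customer@example.org\n"
         . "Subject: Your order\n"
         . "\n"
         . "Thanks for your order.\n";

echo trap_headers($message, 'devteam@example.com');
```

    The rewritten message could then be piped to a real sendmail binary, as the shell version does.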

    So, how does all of this solve Davey's problem?

    This is something I whipped up after work, today, so it's pretty new code that likely has a few bugs lurking in it, but it's a good start:

    sendmail_path=/usr/local/bin/mail_proxy.php

    <?php
    //---CONFIG
    $config = array(
        'host' => 'localhost',
        'port' => 25,
        'auth' => FALSE,
    );
    $logDir      = '/www/logs/mail';
    $logFile     = 'mail_proxy.log';
    $failPrefix  = 'fail_';
    $EOL         = "\n"; // change to \r\n if you send broken mail
    $defaultFrom = '"example.net Webserver" <www@example.net>';
    //---END CONFIG

    if (!$log = fopen("{$logDir}/{$logFile}", 'a')) {
        die("ERROR: cannot open log file!\n");
    }

    require('Mail.php'); // PEAR::Mail
    if (PEAR::isError($Mailer = Mail::factory('SMTP', $config))) {
        fwrite($log, ts() . "Failed to create PEAR::Mail object\n");
        fclose($log);
        die();
    }

    // get headers/body
    $stdin = fopen('php://stdin', 'r');
    $in = '';
    while (!feof($stdin)) {
        $in .= fread($stdin, 1024); // read 1kB at a time
    }

    list ($headers, $body) = explode("$EOL$EOL", $in, 2);

    $recipients = array();
    $headers = explode($EOL, $headers);
    $mailHdrs = array();
    $lastHdr = false;
    $recipFields = array('to','cc','bcc');
    foreach ($headers AS $h) {
        if (!preg_match('/^[a-z]/i', $h)) {
            // $lastHdr is never assigned, so folded lines are simply skipped
            if ($lastHdr) {
                $lastHdr .= "\n$h";
            }
            // skip this line, doesn't start with a letter
            continue;
        }
        list($field, $val) = explode(': ', $h, 2);
        if (isset($mailHdrs[$field])) {
            // repeated header: collect the values in an array
            $mailHdrs[$field] = (array) $mailHdrs[$field];
            $mailHdrs[$field][] = $val;
        } else {
            $mailHdrs[$field] = $val;
        }
        if (in_array(strtolower($field), $recipFields)) {
            if (preg_match_all('/[^ ;,]+@[^ ;,]+/', $val, $m)) {
                $recipients = array_merge($recipients, $m[0]);
            }
        }
    }
    if (!isset($mailHdrs['From'])) {
        $mailHdrs['From'] = $defaultFrom;
    }

    $recipients = array_unique($recipients); // remove dupes

    // send
    if (PEAR::isError($send = $Mailer->send($recipients, $mailHdrs, $body))) {
        $fn = uniqid($failPrefix);
        file_put_contents("{$logDir}/{$fn}", $in);
        fwrite($log, ts() ."Error sending mail: $fn (". $send->getMessage() .")\n");
        $ret = 1; // fail
    } else {
        fwrite($log, ts() ."Mail sent ". count($recipients) ." recipients.\n");
        $ret = 0; // success
    }
    fclose($log);
    exit($ret); // a top-level "return" would not set the exit status

    //////////////////////////////

    function ts()
    {
        return '['. date('y.m.d H:i:s') .'] ';
    }
    ?>

    Voila. SMTP mail from a unix box that may or may not have a MTA (like sendmail) installed.

    Don't forget to change the CONFIG block.

  5. XSS Woes

    A prominent PHP developer (whose name I didn't get permission to drop, so I won't, but many of you know who I mean) has been doing a bunch of research related to Cross Site Scripting (XSS), lately. It's really opened my eyes to how much I take user input for granted.

    Don't get me wrong. I write by the "never trust users" mantra. The issue, in this case, is something abusable that completely slipped under my radar.

    Most developers worth their paycheque, I'm sure, know the common rules of "never trust the user", such as "escape all user-supplied data on output," "always validate user input," and "don't rely on something not in your control to do so (ie. Javascript cannot be trusted)." "Don't output unescaped input" goes without saying, in most cases. Only a fool would "echo $_GET['param'];" (and we're all foolish sometimes, aren't we?).

    The problem that was demonstrated to me exploited something I considered to be safe. The filename portion of request URI. Now I know just how wrong I was.

    Consider this: you build a simple script; let's call it simple.php but that doesn't really matter. simple.php looks something like this:

    <html>
    <body>
    <?php
    if (isset($_REQUEST['submitted']) && $_REQUEST['submitted'] == '1') {
        echo "Form submitted!";
    }
    ?>
    <form action="<?php echo $_SERVER['PHP_SELF']; ?>">
    <input type="hidden" name="submitted" value="1" />
    <input type="submit" value="Submit!" />
    </form>
    </body>
    </html>

    Alright. Let's put this script at: http://example.com/tests/simple.php. On a properly-configured web server, you would expect the script to always render to this, on request:

    <html>
    <body>
    <form action="/tests/simple.php">
    <input type="hidden" name="submitted" value="1" />
    <input type="submit" value="Submit!" />
    </form>
    </body>
    </html>

    Right? No.

    What I forgot about, as I suspect some of you have, too (or maybe I'm the only loser who didn't think of this (-; ), is that $_SERVER['PHP_SELF'] can be manipulated by the user.

    How's that? If I put a script at /simple/test.php, $_SERVER['PHP_SELF'] should always be "/simple/test.php", right?

    Wrong, again.

    See, there's a feature of Apache (I think it's Apache, anyway) that you may have used for things like short URLs, or to optimize your query-string-heavy website to make it search-engine friendly. $_SERVER['PATH_INFO']-based URLs.

    Quickly, this is when scripts are able to receive data in the GET string, but before the question mark that separates the file name from the parameters. In a URL like http://www.example.com/download.php/path/to/file, download.php would be executed, and /path/to/file would (usually, depending on config) be available to the script via $_SERVER['PATH_INFO'].

    The quirk is that $_SERVER['PHP_SELF'] contains this extra data, opening up the door to potential attack. Even something as simple as the code above is vulnerable to such exploits.

    Let's look at our simple.php script, again, but requested in a slightly different manner: http://example.com/tests/simple.php/extra_data_here

    It would still "work"--the output, in this case, would be:

    <html>
    <body>
    <form action="/tests/simple.php/extra_data_here">
    <input type="hidden" name="submitted" value="1" />
    <input type="submit" value="Submit!" />
    </form>
    </body>
    </html>

    I hope that the problem is now obvious. Consider: http://example.com/tests/simple.php/%22%3E%3Cscript%3Ealert('xss')%3C/script%3E%3Cfoo

    The output suddenly becomes very alarming:

    <html>
    <body>
    <form action="/tests/simple.php/"><script>alert('xss')</script><foo">
    <input type="hidden" name="submitted" value="1" />
    <input type="submit" value="Submit!" />
    </form>
    </body>
    </html>

    If you ignore the obviously-incorrect <foo"> tag, you'll see what's happening. The would-be attacker has successfully exploited a critical (if you consider XSS critical) flaw in your logic, and, by getting a user to click the link (even through a redirect script), he has executed the Javascript of his choice on your user's client (obviously, this requires the user to have Javascript enabled). My alert() example is non-malicious, but it's trivial to write similarly-invoked Javascript that changes the action of a form, or usurps cookies (and submits them in a hidden iframe, or through an image tag's URL, to a server that records this personal data).

    The solution should also be obvious. Convert the user-supplied data to entities. The code becomes:

    <html>
    <body>
    <?php
    if (isset($_REQUEST['submitted']) && $_REQUEST['submitted'] == '1') {
        echo "Form submitted!";
    }
    ?>
    <form action="<?php echo htmlentities($_SERVER['PHP_SELF']); ?>">
    <input type="hidden" name="submitted" value="1" />
    <input type="submit" value="Submit!" />
    </form>
    </body>
    </html>

    And an attack, as above, would be rendered:

    <html>
    <body>
    <form action="/tests/simple.php/&quot;&gt;&lt;script&gt;alert('xss')&lt;/script&gt;&lt;foo">
    <input type="hidden" name="submitted" value="1" />
    <input type="submit" value="Submit!" />
    </form>
    </body>
    </html>

    This still violates the assumption that the script name and path are the only data in $_SERVER['PHP_SELF'], but the payload has been neutralized.
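
    The same idea can be pulled into a small helper, so the escaping is not forgotten the next time a form posts back to itself. This is only a sketch: escaped_self() is a hypothetical name, and passing ENT_QUOTES and 'UTF-8' is my assumption (it also escapes single quotes, which the bare htmlentities() call above did not do at the time).

```php
<?php
// Sketch only: escaped_self() is a hypothetical helper.
// Pass it $_SERVER['PHP_SELF'] before echoing into an HTML attribute.
function escaped_self($phpSelf)
{
    // ENT_QUOTES escapes both double and single quotes.
    return htmlentities($phpSelf, ENT_QUOTES, 'UTF-8');
}

// The decoded form of the attack URL's path, as PHP_SELF would hold it.
$attack = '/tests/simple.php/"><script>alert(\'xss\')</script><foo';

echo '<form action="' . escaped_self($attack) . '">' . "\n";
```

    In a real page the call would be escaped_self($_SERVER['PHP_SELF']); the literal string is used here only so the sketch runs from the command line.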

    Needless to say, I felt silly for not thinking of such a simple exploit, earlier. As the aforementioned PHP developer said, at the time (to paraphrase): if guys who consider themselves experts in PHP development don't notice these things, there's little hope for the unwashed masses who have just written their first 'echo "hello world!\n";'. He's working on a generic user-input filtering mechanism that can be applied globally to all user input. Hopefully we'll see it in PECL, soon. Don't forget about the other data in $_SERVER, either..

    ... ...

    Upon experimenting with this exploit on my own server (and watching the raw data in my _SUPERGLOBALS, conveniently, via phpinfo()), I noticed something very interesting that reminded me that even though trusting this data was a stupid mistake on my part, I'm not the only one to do so. A fun (and by fun, I mean nauseating) little game to play: create a file called "info.php" (or whatever name you like). In it, place only "<?php phpinfo(); ?>". Now request it like this: http://your-server/path/to/info.php/%22%3E%3Cimg%20src=http://www.perl.com/images/75-logo.jpg%3E%3Cblah

    Nice huh? A little less nauseating: it's fixed in CVS.

  6. Over-zealous Ebay ads

    Well, definitely not PHP related, but...

    Anyone else noticed that Ebay's affiliates have been a little over-zealous in their ads, lately?

    Babies for sale... [img]http://blog.phpdoc.info/uploads/adsgonewrong.png[/img]

    Google for "bombs"... [img]http://blog.phpdoc.info/uploads/adsgonewrong2.png[/img]

    (my wife pointed out the "babies" one)

    Weird stuff.. (-:

  7. Fun with the tokenizer...

    I was reminded, this past week, of how cool the tokenizer is.

    One of the guys who works in the same office as I do had what seemed to be a simple problem: he had a php file that contained ~50 functions, and wanted to summarize the API without parsing through the file, manually, and cutting out the function declarations.

    We introduced him to in-line phpdoc blocks (he works (as a Jr.-level PHP developer) in the same office, but for a different company, so he doesn't have to follow our coding standards, but I digress..), but the 50-function library in question didn't have docblocks.

    Sure, he could (and did) pull up a list of function NAMES with get_defined_functions (I assume by using array_diff against a before-and-after capture), but this didn't give him the argument names, or even the number of arguments for a given function, so I broke out some old tokenizer code I'd written.

    In case you aren't familiar with the tokenizer, the PHP manual defines it as:

    “[an interface to let you write] your own PHP source analyzing or modification tools without having to deal with the language specification at the lexical level.”

    The extension (which has been part of the PHP core distribution since 4.3.0) consists only of two functions: token_get_all and token_name, and a boatload of constants.

    Enough babble, though, let's get to the meat. I pulled out this code I'd written for PEARClops (on EFNet #PEAR) that parses PHP source files and figures out what classes, functions/methods and associated parameters are included.

    [php]
                            $tokens[$i][1],
                            'class' => $currClass,
                        );
                    } else {
                        $thisFunc['params'][] = array(
                            'byRef' => $nextByRef,
                            'name'  => $tokens[$i][1],
                        );
                        $nextByRef = FALSE;
                    }
                } elseif ($tokens[$i] == '&') {
                    $nextByRef = TRUE;
                } elseif ($tokens[$i] == '=') {
                    while (!in_array($tokens[++$i], array(')',','))) {
                        if ($tokens[$i][0] != T_WHITESPACE) {
                            break;
                        }
                    }
                    $thisFunc['params'][count($thisFunc['params']) - 1]['default'] = $tokens[$i][1];
                }
            }
            $funcs[] = $thisFunc;
        } elseif ($tokens[$i] == '{') {
            ++$classDepth;
        } elseif ($tokens[$i] == '}') {
            --$classDepth;
        }

        if ($classDepth == 0) {
            $currClass = '';
        }
    }

    return $funcs;
    }

    function parse_protos($funcs)
    {
        $protos = array();
        foreach ($funcs AS $funcData) {
            $proto = '';
            if ($funcData['class']) {
                $proto .= $funcData['class'];
                $proto .= '::';
            }
            $proto .= $funcData['name'];
            $proto .= '(';
            if ($funcData['params']) {
                $isFirst = TRUE;
                foreach ($funcData['params'] AS $param) {
                    if ($isFirst) {
                        $isFirst = FALSE;
                    } else {
                        $proto .= ', ';
                    }
                    if ($param['byRef']) {
                        $proto .= '&';
                    }
                    $proto .= $param['name'];
                }
            }
            $proto .= ")";
            $protos[] = $proto;
        }
        return $protos;
    }

    echo "Functions in {$_SERVER['argv'][1]}:\n";
    foreach (parse_protos(get_protos($_SERVER['argv'][1])) AS $proto) {
        echo " $proto\n";
    }
    ?>
    [/php]

    Save it as "parse_funcs.php" (or whatever you like) and call it like so: php parse_funcs.php /path/to/php_file

    For instance:

    [code]
    sean@iconoclast:~/php/scripts$ php token_funcs_cli.php ~/php/cvs/Mail_Mime/mime.php
    Functions in /home/sean/php/cvs/Mail_Mime/mime.php:
     Mail_mime::Mail_mime($crlf)
     Mail_mime::__wakeup()
     Mail_mime::setTXTBody($data, $isfile, $append)
     Mail_mime::setHTMLBody($data, $isfile)
     Mail_mime::addHTMLImage($file, $c_type, $name, $isfilename)
     Mail_mime::addAttachment($file, $c_type, $name, $isfilename, $encoding)
     Mail_mime::_file2str(&$file_name)
     Mail_mime::_addTextPart(&$obj, $text)
     Mail_mime::_addHtmlPart(&$obj)
     Mail_mime::_addMixedPart()
     Mail_mime::_addAlternativePart(&$obj)
     Mail_mime::_addRelatedPart(&$obj)
     Mail_mime::_addHtmlImagePart(&$obj, $value)
     Mail_mime::_addAttachmentPart(&$obj, $value)
     Mail_mime::get(&$build_params)
     Mail_mime::headers(&$xtra_headers)
     Mail_mime::txtHeaders($xtra_headers)
     Mail_mime::setSubject($subject)
     Mail_mime::setFrom($email)
     Mail_mime::addCc($email)
     Mail_mime::addBcc($email)
     Mail_mime::_encodeHeaders($input)
     Mail_mime::_setEOL($eol)
    [/code]

    Not bad, huh?

    There are some not-so-obvious bugs (inheritance, mostly), but for a relatively short script, it does a pretty good job.
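
    For readers new to the extension, here is a much smaller sketch of the same idea: walk the output of token_get_all() and pick out the name that follows each T_FUNCTION token. It is not the PEARClops code. list_function_names() is a hypothetical helper, and it ignores classes, parameters and defaults entirely.

```php
<?php
// Sketch only: list_function_names() is a hypothetical helper that
// returns the names of all named functions/methods in a source string.
function list_function_names($source)
{
    $names  = array();
    $tokens = token_get_all($source);
    $count  = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        // Multi-character tokens are arrays: array(token id, text, line).
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }
        // Scan forward past whitespace (and a possible "&") to the name.
        for ($j = $i + 1; $j < $count; $j++) {
            if (is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $names[] = $tokens[$j][1];
                break;
            }
            if ($tokens[$j] === '(') {
                break; // reached the parameter list first: anonymous function
            }
        }
    }
    return $names;
}

print_r(list_function_names('<?php function foo($a) {} function &bar($b) {}'));
```

    To run it against a file, pass file_get_contents('/path/to/php_file') as the argument.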

  8. PDO::firstImpressions()

    My pet project du jour has to do with PHP, commits and RSS. I'll talk about it here when it's ready, but in the mean time, here's a teaser.

    Part of what I'm doing is putting commit data into a database. Since I'm re-architecting my database schema and access methods, I figured that this would be the perfect opportunity to try PDO.

    (UPDATED below)

    Here are my first impressions.

    Let me say, first, that I think PDO has a LOT of potential. Even the work that has already been completed is wonderful, and it will push PHP forward into even heavier usage. I used to write CFML (please don't hold it against me), and one of the key benefits of Coldfusion was uniform database connectivity.

    Additionally, I've never been able to do cool things like use prepared queries in MySQL before, and while it's not technically possible to actually do this in MySQL (at least in the version I run -- see below), PDO's emulation layer is an incredible idea that I'm going to have a hard time living without at work (where we're still on PHP 4.. at least for the next couple months).

    In short: PDO is great. It will become an indispensable part of the PHP core distribution. Kudos to all involved.

    I'm running PHP 5.0.3 with PDO checked out from PECL CVS. The database I maintain on that box is MySQL, so I grabbed both PDO and PDO_MYSQL. I used the phpize method, and then stuck the extension=*.so directives in my php.ini.

    So far so good.

    The version of MySQL I have installed is that which is part of Debian stable: MySQL v3.23.

    Here's where the going got rough. I was (apparently) the first to try PDO on this version of MySQL. The first time I constructed a PDO object, an exception was thrown, which led to PECL bug #3470.

    Wez has been not only instrumental in creating PDO, but also extremely helpful in diagnosing/confirming (or bogussing (-: ) the problems I've had.

    So, I commented out the offending line that caused the constructor to fail (and I see that this has been fixed in CVS, but I'm unable to test, currently (again, see below)), and continued working on my project.

    The prepared statements are REALLY nice. I can't say I've ever had such an enjoyable experience with MySQL access (except for DBDO).
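
    For a taste of what that looks like, here is a minimal sketch of a prepared query using the PDO API as it exists in current PHP releases (the CVS version described here may have differed in details). The commits table, its columns and the function name are assumptions made up for the example, not my actual schema.

```php
<?php
// Sketch only: find_commit_messages() and the "commits" table
// (columns: id, author, message) are hypothetical.
function find_commit_messages(PDO $db, $author)
{
    // The :author placeholder is bound at execute time, so the value
    // is never pasted into the SQL string by hand.
    $stmt = $db->prepare(
        'SELECT message FROM commits WHERE author = :author ORDER BY id'
    );
    $stmt->execute(array(':author' => $author));

    // FETCH_COLUMN returns a flat array of the first selected column.
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

    It would be called with an open connection, along the lines of find_commit_messages(new PDO($dsn, $user, $password), 'sean'), where the DSN and credentials depend on your setup.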

    I ran into three more issues. The last of which leaves me unable to use PDO in its current state (in CVS).

    All in all, PDO is going to be great. Hopefully in time for PHP 5.1 (tentatively March 1st). I'd even suggest postponing 5.1 if PDO still has major bugs.

    PDO_MYSQL isn't what I'd consider production/release quality, currently, but these bugs seem like they'll be easily overcome. I think it needs more testers -- what I've found broken, I've reported, but some of this stuff I was quite surprised to find that I'm the first to discover the flaw.

    It seems that with PDO, MySQL was largely overlooked (the absence of a PDO_MYSQLI attests to this), and while I'm certainly not one to extol the virtues of MySQL, many PHP users know nothing else, and we'd be well-advised to make PDO_MYSQL as smooth as possible.

    ... anxiously awaiting a stable PDO and 5.1. Again, great work, guys.

    UPDATE: Following Wez' suggestion in bug #3549 (and also actually removing the previously-installed pdo*.so files), I was able to get PDO working in its current state (from CVS).

  9. Schizophrenic Methods

    Occasionally, it is useful for a developer to determine if a method is being called statically (not in an object context -- Class::method() ), or "not statically" (in an object context -- $object->method()).

    This is normally (but incorrectly) done by checking $this:

    [php]
    class Foo
    {
        function bar()
        {
            echo "bar() called: " . (isset($this) ? 'non-statically' : 'statically');
        }
    }
    [/php]

    Why the "but incorrectly", you might ask?

    A few weeks ago, I started maintaining PEAR::Mail_Mime -- it had a lot of reported bugs, and nobody was really taking care of the package. (I'm going to release 1.3.0RC1 within the next couple weeks)

    Anyway, without getting too far off-topic, one of the bugs was "Fatal error: Using $this when not in object context."

    Basically, the code was checking for $this->mailMimeDecode, but when called statically, $this was unset.

    My fix was to check if $this was set, but once committed, Jan Schneider sent me mail telling me that my patch would not work if the method was called statically from within another object.

    This hadn't even occurred to me, so I did some testing (and eventually updated the manual).

    Here's the scenario:

    [php]
    class A
    {
        function foo()
        {
            if (isset($this)) {
                echo '$this is defined (';
                echo get_class($this);
                echo ")\n";
            } else {
                echo "\$this is not defined.\n";
            }
        }
    }

    class B
    {
        function bar()
        {
            A::foo();
        }
    }

    $a = new A();
    $a->foo();
    A::foo();
    $b = new B();
    $b->bar();
    B::bar();
    [/php]

    The output:

    [code]
    $this is defined (a)
    $this is not defined.
    $this is defined (b)
    $this is not defined.
    [/code]

    As you can see (if you have the human parser module installed (-: ), $this is defined, and is the calling object, even when a method is called statically (but from the context of another object).

    So (and here's my point), how does a developer determine if a given method is called statically? Here's what I came up with (and is in Mail_Mime - CVS):

    [php] $isStatic = !(isset($this) && get_class($this) == __CLASS__); [/php]

    It seems a little hackish to be using __CLASS__, but nothing else came to mind, and it works in every test I came up with.

    Side note: When I stuck this stuff in the manual, its place in the oop4 docs is pretty good, but in the oop5 docs, I don't like that it's in The Basics but I don't know where else to put it. So, if anyone has a good suggestion, let me know.

  10. PHP Fun - Variable Arguments By Reference?

    Earlier, this week, one of my co-workers was working on a personal project in which he wanted to use a function to set a variable number of parameters to zero.

    [php] [/php]

    His first impulse was to use func_get_args(). But this wasn't working for him. Turns out func_get_args() returns a COPY of the arguments, and not a reference. The manual didn't say either way, so I updated it.
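
    A minimal sketch of that first attempt shows the copy behaviour. The function name zero_copies() is made up for the example and is not my co-worker's original code.

```php
<?php
// Sketch only: zero_copies() is a hypothetical name for the
// func_get_args() approach described above.
function zero_copies()
{
    $args = func_get_args(); // copies of the arguments, not references
    foreach ($args as $k => $v) {
        $args[$k] = 0;       // only changes the local copies
    }
    return $args;
}

$a = 1;
$b = 2;
$zeroed = zero_copies($a, $b);

echo implode(' ', $zeroed), "\n"; // prints "0 0"
echo "$a $b\n";                   // prints "1 2": the caller's variables are untouched
```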

    A few mails bounced around our internal developer's list. It seems that there's no non-hack way to do this. Here's what we came up with:

    Non-Solution #1

    [php]
    <?php
    function set_array_to_zero($array)
    {
        foreach ($array AS $k => $v) {
            $array[$k] = 0;
        }
    }

    $a = 1;
    $b = 2;
    $c = 3;
    set_array_to_zero(array(&$a, &$b, &$c));
    echo "$a $b $c"; // prints "0 0 0";
    ?>
    [/php]

    This WORKS, and is probably the most "proper" way to do this, but the semantics violate his original requirements.

    A couple other non-semantic-respecting solutions were proposed (one using $obj = new StdClass;), but nothing that really worked the way he intended.

    Here's one that seems pretty close, on the surface: Non-Solution #4

    [php] [/php]

    Sure, it breaks semantics, again, but there's another major problem -- scope: Non-Solution #5

    [php] [/php]

    Since there's no way to operate on an intermediate scope (only local symbols and global symbols), at least in a functional context, this approach is a dead end.

    Finally, after sleeping on it, I came up with this:

    Non-Solution #16

    [php] [/php]

    This ALMOST works--I mean, it's REALLY close. The semantics are still a LITTLE off -- the $ I can live with. The @error_suppressor is also necessary, because it seems that it's not possible to pass default values to create_function(...) (try removing the @ -- you'll get (1024 - actual_number_of_arguments) error messages). There's also the issue of limiting the number of arguments to an arbitrary number. The way I see it, no developer should be working with > 1000 arguments (actually, even that is entirely too many), anyway.

    So, without resorting to eval / writing to a file + include, this is the best solution we could come up with.

    Personally, I'd alter my requirements to go with the array-passing approach.