  1. A Case of Mistaken Iterator

    Earlier this week, I spent most of a day tracing through code in search of the source of a bug that was causing part of our application to fail in strange ways.

    In the back end, we have models that connect to CouchDB. These models implement the Iterator pattern to allow easy traversal of a record’s keys.

    When I wrote the code to implement Iterator several months ago, I dutifully checked the PHP Manual and adapted the reference example that I found there:

    <?php
    class Record implements Iterator
    {
        // (partial class, showing the iterator implementation only)
     
    	public $_data = array();
     
    	public function rewind()
    	{
    		reset($this->_data);
    	}
     
    	public function current()
    	{
    		return current($this->_data);
    	}
     
    	public function key()
    	{
    		return key($this->_data);
    	}
     
    	public function next()
    	{
    		return next($this->_data);
    	}
     
    	public function valid()
    	{
    		return (current($this->_data) !== false);
    	}
     
    }

    Little did I realize that this implementation is very broken. I’ll explain why, below.

    Over the past few years, I’ve implemented many iterators in this way, using PHP’s implicit array manipulation functions (reset(), current(), key(), next()). These functions are very convenient because PHP arrays are so powerful — arrays in PHP work like ordered hash tables in other languages.

    PHP’s implicit management of an array’s iteration index (the value that is incremented by next() and referenced by key()) is indeed convenient, but the convenience can sometimes be offset by its very implicitness — the value is hidden from you, the PHP programmer.

    In PHP, generic array iteration (without the implicit iterator) isn’t actually as simple as it sounds. Remember that arrays aren’t arrays in the traditional sense, but ordered hash tables. Consider this:

    $data = array('zero','one','two','three');
    for ($i=0; $i<count($data); $i++) {
        // yeah, don't calculate count() on every iteration
        echo "{$data[$i]}\n";
    }

    Output:

    zero
    one
    two
    three

    This first example is easy to iterate: the array contains sequential, numeric, zero-based keys. It gets more complicated when using non-sequential and non-numeric keys:

    $data = array(
        'apple',
        'cow' => 'moo',
        'pig' => 'oink',
        'orange'
    );
    for ($i=0; $i<count($data); $i++) {
        echo "{$data[$i]}\n";
    }

    Output:

    apple
    orange
    Notice: Undefined offset: 2 in - on line 10
    Notice: Undefined offset: 3 in - on line 10
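
    The failure is easier to see by listing the keys PHP actually assigned. This is a small sketch rather than part of the original code: string keys are kept as written, and the unkeyed values get the next free integer keys, so 'orange' ends up at index 1 and nothing lives at 2 or 3.

    ```php
    <?php
    $data = array(
        'apple',
        'cow' => 'moo',
        'pig' => 'oink',
        'orange'
    );

    // Unkeyed values receive sequential integer keys; string keys stay as-is.
    $keys = array_keys($data);
    var_export($keys);
    // $keys is array(0, 'cow', 'pig', 1)
    ```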

    I could use foreach, but because a numeric loop illustrates the point more clearly, here’s how I might implement the above code so that it works:

    $data = array(
        'apple',
        'cow' => 'moo',
        'pig' => 'oink',
        'orange'
    );
    $k = array_keys($data);
    for ($i=0; $i<count($data); $i++) {
        echo "{$data[$k[$i]]}\n";
    }

    Output:

    apple
    moo
    oink
    orange

    This brings us back to the Iterator implementation. Why isn’t the code above correct? Take a closer look at this:

    public function valid()
    {
        return (current($this->_data) !== false);
    }

    A value of false in the array is indistinguishable from a false value returned by current(). Using the above implementation with the following array would cause it to bail after orange (and subsequently might cause you to waste a day tracking down the cause):

    array(
        'apple',
        'orange',
        false,
        'banana',
    );
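
    To see the ambiguity in isolation, the following sketch walks that array twice using only the built-in pointer functions. The second loop is one possible alternative check, not the fix described in this post: it relies on key() returning null once the internal pointer has moved past the last element, which is unambiguous because null can never be an array key.

    ```php
    <?php
    $data = array('apple', 'orange', false, 'banana');

    // Broken check: stops at the stored false, not at the end of the array.
    $seen = array();
    for (reset($data); current($data) !== false; next($data)) {
        $seen[] = current($data);
    }
    // $seen now holds only 'apple' and 'orange'.

    // Alternative check: key() is null only when the pointer is past the end.
    $seenAll = array();
    for (reset($data); key($data) !== null; next($data)) {
        $seenAll[] = current($data);
    }
    // $seenAll holds all four values, including the false.
    ```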

    On Tuesday night, I updated the manual to use an improved Iterator implementation. It’s probably a bit slower (so you can use the internal-indexing implementation if you’re sure your arrays will never contain false), but my implementation is more robust.

    <?php
    /**
     * A mixed-key iterator implementation
     *
     * Note: these array_keys() calls are slow. The array keys could be cached
     * as long as the cache value is invalidated when $_data is changed.
     */
    class It implements Iterator
    {
    	public $_data = array();
    	protected $_index = 0;
     
    	public function rewind()
    	{
    		$this->_index = 0;
    	}
     
    	public function current()
    	{
    		$k = array_keys($this->_data);
    		return $this->_data[$k[$this->_index]];
    	}
     
    	public function key()
    	{
    		$k = array_keys($this->_data);
    		return $k[$this->_index];
    	}
     
    	public function next()
    	{
    		$k = array_keys($this->_data);
    		if (isset($k[++$this->_index])) {
    			return $this->_data[$k[$this->_index]];
    		} else {
    			return false;
    		}
    	}
     
    	public function valid()
    	{
    		$k = array_keys($this->_data);
    		return isset($k[$this->_index]);
    	}
     
    }

    To use it:

    $it = new It;
    $it->_data = array(
        'one' => 1,
        'two' => 2,
        false,
        'four' => 4
    );
    foreach ($it as $k => $v) {
        echo "{$k}: {$v}\n";
    }

    Output:

    one: 1
    two: 2
    0: 
    four: 4
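
    The docblock in the listing above notes that the repeated array_keys() calls could be avoided by caching the keys. A minimal sketch of that idea follows; the class name CachedKeyIt and the setData() method are hypothetical, and it assumes the data is only ever changed through setData() so the cached keys cannot go stale.

    ```php
    <?php
    /**
     * Sketch of a mixed-key iterator that caches its keys.
     * CachedKeyIt and setData() are illustrative names, not part of the manual.
     */
    class CachedKeyIt implements Iterator
    {
        protected $_data = array();
        protected $_keys = array();
        protected $_index = 0;

        // The only way to change the data, so the key cache is always refreshed.
        public function setData(array $data)
        {
            $this->_data = $data;
            $this->_keys = array_keys($data);
            $this->_index = 0;
        }

        // The #[\ReturnTypeWillChange] lines silence PHP 8.1+ return-type
        // deprecation notices; older versions read them as comments.
        #[\ReturnTypeWillChange]
        public function rewind()
        {
            $this->_index = 0;
        }

        #[\ReturnTypeWillChange]
        public function current()
        {
            return $this->_data[$this->_keys[$this->_index]];
        }

        #[\ReturnTypeWillChange]
        public function key()
        {
            return $this->_keys[$this->_index];
        }

        #[\ReturnTypeWillChange]
        public function next()
        {
            $this->_index++;
        }

        #[\ReturnTypeWillChange]
        public function valid()
        {
            return isset($this->_keys[$this->_index]);
        }
    }

    $it = new CachedKeyIt;
    $it->setData(array('one' => 1, 'two' => 2, false, 'four' => 4));
    $pairs = array();
    foreach ($it as $k => $v) {
        $pairs[] = array($k, $v);
    }
    // $pairs contains four entries; the third is array(0, false).
    ```

    Keeping $_data protected is the design choice that makes the cache safe: with a public property, as in the listing above, nothing would force the keys to be rebuilt after a change.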
  2. Beer Alchemy Integration

    As I mentioned in my previous post, my beer recipes are now online.

    I've had several people ask me how this is done, so I think a post is in order.

    While it's entirely possible to brew beer at home without any fancy gadgets, there are several tools I use (such as my refractometer) that make the process easier, more controlled, or both. Brewing software is one of the few instruments that I'm not sure I'd want to brew without. I use a Mac, primarily, so Beer Alchemy (BA) is the obvious choice for recipe formulation, calculation, and logging.

    BA has its own HTML export mechanism for recipes, and I used this for quite a long time, but I was never really satisfied with the results. The markup was hard to style, contained a lot of clutter (occasionally useful, but often redundant information such as style parameters), and simply didn't fit well with the rest of my site.

    [Image: Beer Alchemy HTML Output]

    You can also export from BA in PDF (not suitable for web publishing), ProMash's binary recipe format (a pain to convert, although there do seem to be some tools to help with this), BeerXML (normally the most accessible, but in my opinion, a poorly-designed XML format), or in BA's native .bar ("Beer Alchemy Recipe") format, which is what I chose.

    [Image: Beer Alchemy Recipe Export Dialog]

    The .bar format contains a property list, similar to those found throughout Apple systems. Property lists are either binary or XML (but the XML is very difficult to work with using traditional tools because of the way it employs element peering instead of a hierarchy to manage relationships). Luckily, I found a project called CFPropertyList that allows for easy plist handling in PHP. (I even contributed a minor change to this project, a while ago.)

    Once you've run the .bar file's contents through CFPropertyList, layout is very simple. Here's most of the code I use to generate my recipes:

    <?php
    $beerPath = __DIR__ . '/../resources/beer/';
     
    $recipes = apc_fetch('seancoates_recipes');
    $fromCache = true;
    if ($recipes === false) {
    	$fromCache = false;
    	$recipes = array(); // cache miss: start from an empty array rather than false
    	foreach (new DirectoryIterator($beerPath) as $f) {
    		if ($f->isDot()) {
    			continue;
    		}
    		if (substr($f->getFilename(), -4) != '.bar') {
    			continue;
    		}
    		$cfpl = new CFPropertyList($beerPath . '/' . $f->getFilename());
    		$recipe = $cfpl->toArray();
    		$title = $recipe['RecipeTitle'];
    		$recipes[self::slugify($title)] = array(
    			'title' => $title,
    			'content' => $recipe,
    		);
    	}
    	asort($recipes);
    	if ($recipes) {
    		apc_store('seancoates_recipes', $recipes, 3600); // 1h
    	}
    }
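
    The listing calls a slugify() helper that isn't shown. As a rough idea of what such a helper might do (this is an assumed stand-in, not the site's actual code), the sketch below lowercases the title, collapses anything that isn't an ASCII letter or digit into a hyphen, and trims stray hyphens from the ends.

    ```php
    <?php
    /**
     * Hypothetical stand-in for the slugify() helper used above.
     *
     * @param string $title a recipe title
     * @return string a URL-friendly slug
     */
    function slugify($title)
    {
        $slug = strtolower($title);
        // Replace each run of characters outside a-z and 0-9 with one hyphen
        $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
        return trim($slug, '-');
    }

    echo slugify('Coatesmeal Stout'), "\n"; // coatesmeal-stout
    ```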

    In addition to displaying the recipe's data, I also wanted to show the approximate (calculated) beer colour. Normally, beer recipes declare their colour in "SRM" (Standard Reference Method). There's no obvious, simple, and direct way to get from SRM (which is a number from 0 to 40—and higher, but above the mid 30s is basically "black") to an HTML colour.

    I found a few tables online, but I wasn't terribly happy with any of them, and keeping a dictionary for lookups was big and ugly. I like the way Beer Alchemy previews its colours, and since it has HTML output, I emailed the author to see if he'd be willing to share his algorithm. Steve from Kent Place Software graciously sent me an excerpt from his Objective-C code, and I translated it to PHP. This might be useful for someone, and since Steve also granted me permission to publish my version of the algorithm, here it is:

    <?php
    /**
     * Calculate HTML colour from SRM
     * Thanks to Steve from Kent Place Software (Beer Alchemy)
     *
     * @param float $srm the SRM value to turn into HTML
     * @return string HTML colour (without leading #)
     */
    public function srm2html($srm)
    {
    	if ($srm <= 0.1) { // It's water
    		$r = 197;
    		$g = 232;
    		$b = 248;
    	} elseif ($srm <= 2) {
    		$r = 250;
    		$g = 250;
    		$b = 60;
    	} elseif ($srm <= 12) {
    		$r = (250 - (6 * ($srm - 2)));
    		$g = (250 - (13.5 * ($srm - 2)));
    		$b = (60 - (0.3 * ($srm - 2)));
    	} elseif ($srm <= 22) {
    		$r = (192 - (12 * ($srm - 12)));
    		$g = (114 - (7.5 * ($srm - 12)));
    		$b = (57 - (1.8 * ($srm - 12)));
    	} else { // $srm > 22
    		$r = (70 - (5.6 * ($srm - 22)));
    		$g = (40 - (3.1 * ($srm - 22)));
    		$b = (40 - (3.2 * ($srm - 22)));
    	}
    	$r = max($r, 0);
    	$g = max($g, 0);
    	$b = max($b, 0);
    	return sprintf("%02X%02X%02X", $r, $g, $b);
    }
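
    As a worked example of the arithmetic, here is the branch for SRM values above 2 and up to 12, evaluated for an SRM of exactly 12. This sketch copies only that one branch from the function above; the variable names and the inline style at the end are illustrative.

    ```php
    <?php
    $srm = 12;

    // Same formulas as the "elseif ($srm <= 12)" branch
    $r = 250 - (6 * ($srm - 2));    // 250 - 60  = 190
    $g = 250 - (13.5 * ($srm - 2)); // 250 - 135 = 115
    $b = 60 - (0.3 * ($srm - 2));   // 60 - 3    = 57

    $hex = sprintf("%02X%02X%02X", $r, $g, $b); // BE7339

    // The hex string has no leading #, so add one when building CSS
    echo '<span style="background-color: #' . $hex . '">SRM ' . $srm . '</span>', "\n";
    ```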

  3. A new seancoates.com

    Over the past few weeks, my business partner Cameron and I have spent evenings, late nights, and weekends (at least partially) working on a much-improved seancoates.com.

    If you’re reading this via my feed, or through a syndication outlet, you probably hadn’t noticed.

    The primary goal of this change was to reduce (hopefully even remove) the ugliness of my main presence on the Web, and I’m very happy with the results.

    In addition to making things look nicer, we also wanted to improve the actual functionality of the site. Formerly, seancoates.com was a blog, with a couple haphazard pages thrown in. The new version serves to highlight my blog (which I fully intend to pick up with more frequency), but also contains a little bit of info about me, a place to highlight my code and speaking/writing contributions, and a good place for me to keep my beer recipes.

    Cameron came up with the simple visual design and great interaction design, so a public “Thank You” is in order for his many hours of thought and contribution. Clearly, the ugliness reduction was his doing (due to my poorly-functioning right brain).

    I’m very happy with how the site turned out as a whole, and thought I’d outline a few of my favourite bits (that might otherwise be missed at first glance).

    URL Sentences

    The technique of turning URLs into sentences was pioneered by my friend and colleague Chris Shiflett. Cameron (who shares studio space (and significant amounts of beer) with Chris) and I both like this technique, so we decided to implement it for my site.

    The main sections of the site are verbs, so this was pretty easy (once we decided on proper nomenclature). Here are a few examples:

    • seancoates.com/blogs – Sean Coates blogs…
    • seancoates.com/blogs/about-php – Sean Coates blogs about PHP (my “PHP” blog tag)
    • seancoates.com/brews – an index of my published recipes
    • seancoates.com/brews/coatesmeal-stout – the recipe page for Coatesmeal Stout

    To complement the URLs, the page title spells out the page you’re viewing in plain language, and the visual site header indicates where you are (while hopefully enticing you to click through to the other sections).

    Moving my blog from the root “directory” on seancoates.com to /blogs caused my URLs to break, so I had to whip up yet another bit of transition code to keep old links functioning. Even links on my original blog (which was hosted on blog.phpdoc.info) should still work. If you find broken links, please let me know.

    Vertical Content Integration

    My “/is” page contains feeds from Twitter and Flickr.

    The Twitter integration was pretty simple; I use the JSON version of my user feed, but I didn’t want to include @replies, so they’ve been filtered out by my code. If the fetch was successful, the filtered data is cached in APC for a short period of time so that I’m not constantly hammering Twitter’s API.
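
    The reply filtering could look something like the sketch below. It assumes the feed has already been decoded into an array of statuses that each carry a 'text' field, and it treats any status whose text starts with @ as a reply; the function name and sample data are made up for illustration, and the APC caching step is left out.

    ```php
    <?php
    /**
     * Drop statuses that look like @replies.
     *
     * @param array $statuses decoded feed entries, each with a 'text' key (assumed shape)
     * @return array the remaining statuses, re-indexed from zero
     */
    function withoutReplies(array $statuses)
    {
        $kept = array();
        foreach ($statuses as $status) {
            // A reply is assumed to be any status whose text begins with "@"
            if (substr(ltrim($status['text']), 0, 1) !== '@') {
                $kept[] = $status;
            }
        }
        return $kept;
    }

    // Made-up sample data
    $statuses = array(
        array('text' => 'Brewing a stout this weekend.'),
        array('text' => '@example thanks for the tip!'),
        array('text' => 'New blog post about iterators.'),
    );
    $filtered = withoutReplies($statuses);
    // $filtered keeps the first and third statuses only
    ```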

    Flickr’s integration was also very simple. After a run-in with some malformed JSON in their API, I decided to integrate through their Serialized PHP Response Format. The resulting data is also cached in APC, but for a longer period of time, as my beer tasting log changes much less frequently.

    Code Listings

    Displaying code listings on a blog isn’t quite as easy as it sounds. I recently had a discussion with a friend about redesigning his site, and he was considering using Gist, GitHub’s pastebin-like service. Doing so would have given him easy highlighting, but one thing he hadn’t considered was that his blog’s feed would be missing the embedded listings (they come from a third party, and wouldn’t actually appear in his feed’s data stream).

    Another problem we faced was one of space. While I often try to keep code to a maximum of 80 (or slightly fewer) characters wide, this isn’t always possible. Injecting a line break into the middle of a line of code is risky, especially for things like SSH keys and URLs. This problem is usually solved by setting the content’s CSS to overflow: scroll, but that littered Cameron’s beautiful design with ugly platform-specific scroll bars. “Clever” designers and developers sometimes overcome this by implementing “prettier” scroll bars, but I’m strongly against this behaviour, so I wouldn’t have it on my site.

    I’m quite happy with our eventual solution to this problem. Now, when a blog post contains code that extends beyond the normal width of the blog’s text, the right-most part of the text fades to white, and the listing is clickable. Clicking expands all listings on the page to the minimum width that will accommodate all embedded code.

    Here's some example code that stretches much wider than this column would normally allow.
    Injecting line breaks is dangerous. Here's why: http://example.com/obviously/not/a/sentence/url
    Breaking that in the middle is far from ideal.

    jQuery saved me hours of development work here, and I couldn’t recommend it more highly. Highlighting is provided by a plugin that I wrote a couple years ago. It uses GeSHi to highlight many languages. I’ve never been very happy with GeSHi’s output, but it’s Good Enough™ until I can find time to implement a better solution that uses the Tokenizer for PHP.

    Software

    In addition to PHP, this site integrates a custom version of Habari, with our own theme and plugins. One of those plugins allows me to keep my blog posts in HTML files in my Git repository, to make for much easier editing, grepping, etc.

    Everything except /blogs was built within the Lithium framework. It handles all of the boring stuff like controllers, routing, and templates, so I didn’t have to write that code myself (which I find incredibly boring these days).

    Hashgrid was invaluable in ensuring that the site aligns with a visual grid (again, thanks to Cameron’s meticulous expertise). Pressing your g key will show the grid he used. I even made a few improvements to how Hashgrid works, which I hope to eventually see in the master branch.

  4. PHP 5.3 on Snow Leopard

    My old post on compiling PHP for Mac OS X 10.5 (Leopard) continues to top my most-viewed page statistics. Sadly, that article is old and doesn't apply very well to Snow Leopard (10.6).

    I've been meaning to post instructions on how to compile PHP for Snow Leopard since last summer when I picked up the DVD, but hadn't found the time or opportunity to build PHP from a completely fresh start, until a few weeks ago.

    This time, I took notes on how to reliably compile PHP and Apache from scratch on this system.

    1. Download and install Xcode. You're on your own for the details of this one, but frankly, if you can't figure it out, you'll find the next steps too difficult. Think of it as a prerequisite.

    2. Create a working directory. I use ~/src, but you can use whatever you like.

      
      $ mkdir ~/src
      $ cd ~/src
        

    3. Install Homebrew. Homebrew is a truly great software packager for OS X. Think MacPorts, but not as ugly; Fink, but not as broken (and not as binary). Designed for Mac. It's Ruby, but we don't have to hold that against them. (-:

      
      $ curl http://gist.github.com/raw/323731/572b315c4f7ee78244de70e7ad703c8ae324da7a/install_homebrew.rb > install_homebrew.rb
      $ ruby install_homebrew.rb
         

    4. Install your own iconv. I don't know what Apple did to theirs, but it's a huge headache. You're best installing your own, in my experience.

      
      $ curl http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.13.1.tar.gz | tar -zxf -
      $ cd libiconv-1.13.1
      $ ./configure --prefix=/opt && make && make install
      $ cd ..
        

    5. Install Apache-HTTPD from source. This isn't absolutely necessary, but Apple seems to have done some weird stuff with their Apache, and in my experience, it's best to build your own. If you skip over this step, you'll need to change the apxs in the PHP configure command, below.

      First, find your closest mirror.

      
      $ curl http://apache.mirror.iweb.ca/httpd/httpd-2.2.15.tar.bz2 | tar -jxf -
      $ cd httpd-2.2.15/
      $ ./configure --enable-rewrite --enable-ssl && make && make install
      $ cd ..
         

    6. Install PHP dependencies using Homebrew. Easy, huh?

      
      $ echo "gd jpeg libpng libxml2 libzzip mcrypt mysql" | xargs brew install
      $ echo "libpng libxml2 readline" | xargs brew link
        

    7. Install PHP from source by first selecting a mirror.

      Note: you will need to use a really nasty patch to get this to build properly. See the note on iconv above. Even Apple's own iconv patch for PHP doesn't work (at least not for me).

      
      $ curl -L http://ca2.php.net/get/php-5.3.2.tar.bz2/from/this/mirror | tar -jxf -
      $ cd php-5.3.2
      $ curl http://www.php.net/~scoates/patches/php-5.3.1-Makefile.global-iconv.patch | patch -p0
      $ ./configure --prefix=/usr/local --with-xsl --with-gd --with-zlib-dir \
        --enable-sockets --enable-exif --with-mcrypt --enable-soap \
        --enable-embedded-mysqli --with-mysql --with-pdo-mysql --with-curl \
        --with-libedit --with-apxs2=/usr/local/apache2/bin/apxs --enable-mbstring \
        --with-openssl --with-iconv=/opt && make && make install
      $ cd ..
        

    8. Configure Apache. If you've done this on other platforms, this step should look familiar.

      1. In /usr/local/apache2/conf/httpd.conf, in the <IfModule mime_module> block, add the following:
        
        AddType application/x-httpd-php .php
        AddType application/x-httpd-php-source .phps
            
      2. Optionally, add PHP to DirectoryIndex by changing
        
        DirectoryIndex index.html
            
        to
        
        DirectoryIndex index.php index.html
            

      You can now test Apache + PHP by creating a phpinfo() page, and restarting Apache:

      
      $ echo "<?php phpinfo(); ?>" > /usr/local/apache2/htdocs/info.php
      $ ln -s /usr/local/apache2/bin/apachectl /usr/local/bin/apachectl
      $ sudo /usr/local/bin/apachectl restart
        

      Now, visit localhost/info.php, and you should have an independent, custom-compiled Apache-PHP stack.
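
      As an extra sanity check, a short script can confirm that the extensions requested in the configure command actually made it into the build. This is a sketch: the list of extension names is inferred from the flags above, and the function name is hypothetical.

      ```php
      <?php
      /**
       * Return the names from $wanted that are not loaded in this PHP build.
       * (Hypothetical helper, for illustration only.)
       */
      function missingExtensions(array $wanted)
      {
          $missing = array();
          foreach ($wanted as $name) {
              if (!extension_loaded($name)) {
                  $missing[] = $name;
              }
          }
          return $missing;
      }

      // Names inferred from the configure flags; adjust to match your own build
      $wanted = array(
          'xsl', 'gd', 'zlib', 'sockets', 'exif', 'mcrypt', 'soap',
          'mysql', 'pdo_mysql', 'curl', 'mbstring', 'openssl', 'iconv',
      );

      $missing = missingExtensions($wanted);
      echo $missing ? 'Missing: ' . implode(', ', $missing) : 'All extensions loaded', "\n";
      ```

      Saved next to info.php, the script reports on the Apache module; run from the command line, it reports on the CLI binary installed under the same prefix.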

    I hope this has been helpful. If I've given bad instructions, or if something doesn't work for some reason, please let me know in the comments.

    Belorussian translation provided by Patricia.

  5. Post-ConFoo

    Today I am back in the (home) office after speaking at ConFoo last week.

    I gave two talks:

    Despite some timing issues (as is always the case for me with new talks), I think both sessions went well. I got some good, constructive feedback from attendees on how the talks could be made better, and if I get the opportunity to give them again, I'll definitely take it into consideration.

    Because PSAV (not the conference organizers' fault, other than trusting PSAV, a lesson all conference organizers eventually learn) was its normal bucket of fail, conference Internets weren't exactly usable. I was lucky enough to be in my home country (for a change), so I was able to tether on 3G for most of the week, and that kept me and a few others online. However, I think that most attendees didn't have the same opportunity, so if you attended one of my talks, I'd greatly appreciate it if you could head over to joind.in and post a comment or two about what you thought (talk-specific links above).

    After the conference ended, Andrei stayed at my place.

    On Saturday, we brewed a second annual batch of Белый, which (if all goes well) will end up at 11.5% ABV. Maybe it will be ready for next year's ConFoo. (-:

    We decided to take it easy on Sunday… sort of. We spent most of the afternoon planning out a PHP extension we'd been talking about for several months: a highly-customizable and user-hookable PHP preprocessor. Afternoon turned to evening, turned to night, and by the time we turned in, we had a working—but not yet ready for release—extension. More details to come, but here's a teaser.