1. A new seancoates.com

    Over the past few weeks, my business partner Cameron and I have spent evenings, late nights, and weekends (at least partially) working on a much-improved seancoates.com.

    If you’re reading this via my feed, or through a syndication outlet, you probably haven’t noticed.

    The primary goal of this change was to reduce (hopefully even remove) the ugliness of my main presence on the Web, and I’m very happy with the results.

    In addition to making things look nicer, we also wanted to improve the actual functionality of the site. Formerly, seancoates.com was a blog, with a couple haphazard pages thrown in. The new version serves to highlight my blog (which I fully intend to pick up with more frequency), but also contains a little bit of info about me, a place to highlight my code and speaking/writing contributions, and a good place for me to keep my beer recipes.

    Cameron came up with the simple visual design and great interaction design, so a public “Thank You” is in order for his many hours of thought and contribution. Clearly, the ugliness reduction was his doing (due to my poorly-functioning right brain).

    I’m very happy with how the site turned out as a whole, and thought I’d outline a few of my favourite bits (that might otherwise be missed at first glance).

    URL Sentences

    The technique of turning URLs into sentences was pioneered by my friend and colleague Chris Shiflett. Cameron, who shares studio space (and significant amounts of beer) with Chris, and I both like this technique, so we decided to implement it for my site.

    The main sections of the site are verbs, so this was pretty easy (once we decided on proper nomenclature). Here are a few examples:

    • seancoates.com/blogs – Sean Coates blogs…
    • seancoates.com/blogs/about-php – Sean Coates blogs about PHP (my “PHP” blog tag)
    • seancoates.com/brews – an index of my published recipes
    • seancoates.com/brews/coatesmeal-stout – the recipe page for Coatesmeal Stout

    To complement the URLs, the page title spells out the page you’re viewing in plain language, and the visual site header indicates where you are (while hopefully enticing you to click through to the other sections).
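    As a rough illustration of how a verb-first URL can be spelled out as a page title, here is a minimal sketch in Python (not the site's actual PHP). The `TAG_LABELS` table and the `path_to_sentence` name are hypothetical, invented for this example.

```python
# Hypothetical lookup of display labels for tag slugs; a real site would
# presumably read these from the blog's own tag data.
TAG_LABELS = {"php": "PHP"}

SUBJECT = "Sean Coates"


def path_to_sentence(path: str) -> str:
    """Turn a verb-first URL path into a plain-language page title."""
    segments = [s for s in path.strip("/").split("/") if s]
    if not segments:
        return SUBJECT
    words = [segments[0]]  # the section name is the verb, e.g. "blogs"
    for segment in segments[1:]:
        for word in segment.split("-"):
            # Use a nicer label when one is known, else the raw slug word.
            words.append(TAG_LABELS.get(word, word))
    return f"{SUBJECT} {' '.join(words)}"


print(path_to_sentence("/blogs/about-php"))
```

    With the table above, `/blogs/about-php` reads as "Sean Coates blogs about PHP". Slugs without a known label are passed through as lowercase words.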

    Moving my blog from the root “directory” on seancoates.com to /blogs caused my URLs to break, so I had to whip up yet another bit of transition code to keep old links functioning. Even links on my original blog (which was hosted on blog.phpdoc.info) should still work. If you find broken links, please let me know.
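    The transition code itself isn't shown here, but the general idea of keeping old root-level links alive can be sketched like this in Python. The section list is incomplete and the old URL layout is an assumption; `legacy_redirect` is a hypothetical name.

```python
from typing import Optional

# Sections known from this post; the real site likely has more (assumption).
CURRENT_SECTIONS = {"blogs", "brews", "is"}


def legacy_redirect(path: str) -> Optional[str]:
    """Return the new location for an old root-level blog URL, or None.

    None means the path already points at a current section (or the home
    page) and should be served normally.
    """
    first_segment = path.strip("/").split("/")[0]
    if first_segment == "" or first_segment in CURRENT_SECTIONS:
        return None
    # Assumption: old blog URLs keep their path, just under /blogs.
    return "/blogs/" + path.lstrip("/")
```

    A caller would answer any non-`None` result with a permanent (301) redirect so that feed readers and search engines pick up the new address.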

    Vertical Content Integration

    My “/is” page contains feeds from Twitter and Flickr.

    The Twitter integration was pretty simple; I use the JSON version of my user feed, but I didn’t want to include @replies, so they’ve been filtered out by my code. If the fetch was successful, the filtered data is cached in APC for a short period of time so that I’m not constantly hammering Twitter’s API.

    Flickr’s integration was also very simple. After a run-in with some malformed JSON in their API, I decided to integrate through their Serialized PHP Response Format. The resulting data is also cached in APC, but for a longer period of time, as my beer tasting log changes much less frequently.

    Code Listings

    Displaying code listings on a blog isn’t quite as easy as it sounds. I recently had a discussion with a friend about redesigning his site, and he was considering using Gist, GitHub’s pastebin-like service. Doing so would have given him easy highlighting, but one thing he hadn’t considered was that his blog’s feed would be missing the embedded listings (they come from a third party, and wouldn’t actually appear in his feed’s data stream).

    Another problem we faced was one of space. While I often try to keep code to a maximum of 80 (or slightly fewer) characters wide, this isn’t always possible. Injecting a line break into the middle of a line of code is risky, especially for things like SSH keys and URLs. This problem is usually solved by setting the content’s CSS to overflow: scroll, but that littered Cameron’s beautiful design with ugly platform-specific scroll bars. “Clever” designers and developers sometimes overcome this by implementing “prettier” scroll bars, but I’m strongly against this behaviour, so I wouldn’t have it on my site.

    I’m quite happy with our eventual solution to this problem. Now, when a blog post contains code that extends beyond the normal width of the blog’s text, the right-most part of the text fades to white, and the listing is clickable. Clicking expands all listings on the page to the minimum width that will accommodate all embedded code.

    Here's some example code that stretches much wider than this column would normally allow.
    Injecting line breaks is dangerous. Here's why: http://example.com/obviously/not/a/sentence/url
    Breaking that in the middle is far from ideal.
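    The real behaviour runs in the browser and measures rendered widths, but the underlying arithmetic can be approximated by counting characters in a monospaced font. This Python sketch makes that assumption; the function names and the 80-character column are illustrative only.

```python
from typing import List


def longest_line(code: str) -> int:
    """Length in characters of the longest line in one listing."""
    return max((len(line) for line in code.splitlines()), default=0)


def overflowing(listings: List[str], column_chars: int) -> List[bool]:
    """For each listing, report whether it is wider than the text column."""
    return [longest_line(code) > column_chars for code in listings]


def expanded_width(listings: List[str], column_chars: int) -> int:
    """Smallest width that fits every listing, never below the column width."""
    widest = max((longest_line(code) for code in listings), default=0)
    return max(column_chars, widest)
```

    Listings flagged by `overflowing` are the ones that would get the fade and become clickable; `expanded_width` is the single width applied to all listings once one is clicked.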

    jQuery saved me hours of development work here, and I couldn’t recommend it more highly. Highlighting is provided by a plugin that I wrote a couple years ago. It uses GeSHi to highlight many languages. I’ve never been very happy with GeSHi’s output, but it’s Good Enough™ until I can find time to implement a better solution that uses the Tokenizer for PHP.

    Software

    In addition to PHP, this site integrates a custom version of Habari, with our own theme and plugins. One of those plugins allows me to keep my blog posts in HTML files in my Git repository, to make for much easier editing, grepping, etc.

    Everything except /blogs was built within the Lithium framework. It handles all of the boring stuff like controllers, routing, and templates, so I didn’t have to write that code myself (which I find incredibly boring these days).

    Hashgrid was invaluable in ensuring that the site aligns with a visual grid (again, thanks to Cameron’s meticulous expertise). Pressing your g key will show the grid he used. I even made a few improvements to how Hashgrid works, which I hope to eventually see in the master branch.

  2. Goodbye, OmniTI

    Today is my last day at OmniTI.

    From an email I just sent out to my soon-to-be-past colleagues:

    “I sincerely wish you continued success as a company, and also as individuals who truly make up a significant portion of the best people in this industry. There are many things that OmniTI does very well, and I won't hesitate to refer business your way when the situation arises.

    This past year and a half (or so) has been a bumpy road, but I'm absolutely sure I will look back on my time with OmniTI as a net-positive. Thank you all for supporting me and my team with our sometimes-(absurd | stupid | obvious | amateur | tough) questions and requests.”

    The road ahead for OmniTI doesn't look nearly as bumpy, but after a very long period of thought, I finally decided to pursue other options around 6 weeks ago, and will now join the ranks of the funemployed.

    Thanks for the opportunities, experience, insight, and tough problems, OmniTI.

    2010 will be a great year. I'm already excited about some of the prospects that are in my future.

    Bonus points if the title of this post seems familiar. (-:

  3. Twitter's Chronic Anti-Pattern Problem

    This morning, via a colleague, John, I stumbled on a service called gdzlla that allows you to use Flickr as an alternative to the other de facto Twitter media posting services (twitpic, yfrog, etc.), from Tweetie on the iPhone. The idea is great, but unfortunately, the implementation is dangerous.

    Intrigued by an integrated media-posting solution, I started browsing the gdzlla site, and one of the first pages I saw grabbed my attention... in the wrong way.

    Screen shot of gdzlla login page

    The idea of random web sites asking for credentials is hardly a new concept—especially when it comes to Twitter. Almost a year ago, news broke about a now-defunct site called Twitterank that was created by @brianoberkirch to illustrate the danger of carelessly sharing Twitter credentials with third parties. Since then, Twitter has implemented OAuth to avoid this exact scenario, but uptake has been slow: many third parties who provide a Twitter-related service still require users to submit their Twitter credentials to authenticate.

    What struck me about gdzlla's login page was the text at the bottom of the form: "(Your password gets hashed, we won't ever know it)." Thinking about ways to implement this (the password could be hashed in JavaScript, before the form is submitted, for example), I turned on Firebug, and discovered that the value is actually submitted with the form, in plaintext:

    Screen shot of Firebug showing plaintext submission to gdzlla

    I suspected that the gdzlla guys were not actually being malicious here, and would actually hash the value prior to storage on their side, but the text was misleading at best, so I tweeted about it:

    John noticed that I linked to the form processor page, which didn't work properly, so he brought that part to gdzlla's attention:

    This kicked off a conversation with @gdzl_la:

    Their reply shed some light on exactly how they're integrating with Tweetie. The iPhone app allows users to supply their own custom image service URL. When submitting media, if this value is filled in, Tweetie sends the raw image data (and other information, see below) to the third-party URL and expects to receive a URL where the media is hosted, in return.

    This type of integration is actually a really great idea. More apps should allow customization of third-party services. It's exactly how web services should be used.

    Unfortunately, as @gdzl_la pointed out in our conversation, Tweetie's actual implementation of this feature is horribly insecure, and prevents gdzlla from using OAuth—gdzlla doesn't even use your Twitter credentials to post to Twitter, that's Tweetie's job (as indicated in their instructions).

    So, why does gdzlla require users to submit their Twitter credentials if they're immediately transforming your password into a hashed form that would prevent them from actually using it to access the Twitter API? The simple answer is that this is the only way for them to integrate with Tweetie's poor implementation of a great feature.

    gdzlla presumably collects your Twitter credentials and then has you authenticate against the Flickr API. It then links the accounts to associate your Twitter and Flickr accounts, on the gdzlla side.

    The tragic flaw in all of this is that Tweetie uniformly sends the user's Twitter credentials to the custom image URL as part of the image hosting request. There's no other way for gdzlla to associate the incoming data with a particular Flickr account.

    Tweetie's instruction page says that it will send the following as POST data:

    • username - Twitter username
    • password - Twitter password (plain text, thus HTTPS is strongly recommended, and may be required by future versions of Tweetie)
    • (other information such as the data for the media)

    There's really no good reason for Tweetie to do this. They could just as easily ask the user to supply credentials for the third-party media hosting service. In fact, they absolutely should ask the user to supply this information on the setup page. Providing a user's Twitter credentials to third-parties is irresponsible at the very least, and leaves legitimate third parties in a pinch because there's currently no good way to implement authentication in this system—not even OAuth will save the day. (This also leads to non-security usability problems with services like gdzlla—handling password changes must be a huge headache for them.)

    Hopefully the Tweetie developers will recognize this problem and fix it. In the meantime, my suggestion is to avoid using any service that implements the Password Anti-Pattern, even if you trust them.

  4. Recent Happenings

    I've got a bunch of stuff that I haven't found/made time to blog about, so just dropping some quick notes here:

    • I've been invited to speak at PHP Quebec 2009. I've been to this conference a few times (but not for a couple of years, now), and I'm really looking forward to getting back into the conference circuit (as a speaker, not an organizer... think of all the free time I'll have! (-;). Anyway, I'll be giving a talk entitled "Stupid Browser Tricks" in which I'll talk (at a high level) about Firebug and Selenium IDE, and possibly a few other things like granular browser security, Komodo macros/extensions (like a browser!), and maybe Greasemonkey.
    • This year, I was once again invited back to the Microsoft Web Developers Summit (couldn't think of a better URL). This is a yearly event where Microsoft invites selected members of the PHP community to Redmond to have a discussion on PHP and Microsoft's offerings. This year was definitely the best one yet, as it was better organized, and it felt much less like they were trying to sell us things. Their candor was especially appreciated this year, as I think many of the attendees felt like Microsoft was asking us for our opinions instead of trying to give them to us. I wrote about this last year, and I think what I wrote still rings true, today. Thanks to the organizers... we got some great information, made our opinions clear, and had a LOT of fun (great people!).
    • I tweeted about this, but never posted it on my blog. My colleague Luke Welling is a funny guy.
    • Over the holiday weekend (I got days off, but in Canadia, we celebrate Thanksgiving in October), I found some time to work on a bunch of pet projects, including fale.ca, which is nothing special, but kind of fun. See?
    • Today, I was extended an invitation to join the Habari Cabal, which I quickly accepted. So, if you use Habari and your blog breaks in the future, it's probably my fault.
    • ... and last, but not least, Chris and I—with the help of many other people—managed to almost get the 2008 PHP Advent calendar launched in time. Word on the street is that Jon Tan is going to show the design some love, and we have a feed. The 2007 edition was a success, but was a lot of work, so I offered to pitch in this year. Thanks to everyone who's already submitted... and the rest of you slackers: get to it! (-;
  5. UTF: WTF?

    Note: This article first ran in php|architect in March 2008, while I still worked at MTA. Marco (the publisher, and my former colleague) has graciously agreed to allow me to republish this in a more public forum. I've wanted to link a few people to it in the past few months and until now that was only possible if they were php|architect subscribers. That said, if you're into PHP, you really should subscribe to php|a.

    As you might know, one of my roles at php|architect is to organize and manage speakers (and their talks) for our PHP conferences.

    A while back, PHP 6's main proponent, Andrei Zmievski, submitted a talk that we accepted, entitled "I ♥ Unicode, You ♥ Unicode." When we selected the talk and invited Andrei to attend the conference, he accepted and humorously suggested that we pay special attention to the talk's heart characters when publishing details on the conference website and in other promotional materials. I took his suggestion as wise advice, and double checked the site before releasing it to the public—it worked perfectly.

    Within a few hours of publication, Andrei dropped me a note indicating that I hadn't heeded his warning, and that the ♥s weren't showing up properly. The problem turned out to be a bug in a specific version of Firefox, and I believe we resolved it by employing the &hearts; entity. This ordeal, while minor, was my first taste of how bad things would become.

    If I had to guess, I would estimate that I've spent somewhere in the range of 40 hours wrangling UTF-8 in the past 3 months, which is not only expensive for my employer, but also disheartening as a developer who's got real work to do. Admittedly, this number is inflated, due to the heavy development cycle we completed with the launch of our new site. As time goes on, though, I don't see this situation improving in the short term (though, if we were to glimpse much further into the future, I'm sure we'll eventually consider this a solved problem).

    The main problem with using Unicode, today, is that it's partially supported by some parts of any given tool chain. Sometimes it works great, and other times—due to a given piece of software's lack of implementation (or worse, a partial implementation), human error, or full-on bugs—the chain's weakest link shatters in a non-spectacular way.

    As any experienced developer knows, having the weak point of a process collapse is a normal part of building complex systems. We're used to it, and we usually manage this by making the systems less complex, by eliminating the parts that are prone to collapse, or by fixing the broken parts. When implementing a system that may contain Unicode data, today, we're challenged with many potential points of failure that are often difficult to identify, and nearly impossible to replace.

    To illustrate, consider an overly simplified web development work—and content delivery—flow: developer creates a file, developer edits file, developer uploads the files to the web server, httpd receives a request from a browser, httpd passes the request to PHP, PHP delivers content back to httpd, httpd delivers content to the visitor's browser. If a single part of this flow fails to handle Unicode properly, a snowball effect causes the rest of the chain to fail.

    A more typical flow for me (and our code) goes something like this: create file, edit file, commit file to svn, other developers edit file, others commit to svn, release is rolled from svn, visitor browser requests page, httpd parses request, httpd delivers request to PHP, PHP processes request, PHP (client) calls service to fulfill back-end portions of request (encodes the request in an envelope—we use JSON most of the time), PHP (service) receives request, service retrieves and/or stores data in database, service returns data to PHP client, PHP client processes returned data and in turn delivers it to httpd, httpd returns data to browser.

    If you'll bear with me for one last list in this article, that means that any (one or more!) of the following could fail when handling Unicode: developers' editors, developers' transport (either upload or version control), user's browser, user's http proxy, client-side httpd, client-side PHP, client-side encoder (JSON), service-side httpd (especially HTTP headers), service-side decoder, service-side PHP, service-side database client, database protocol character set imbalance, database table charset, database server, service-side encoder, client-side decoder, client-side PHP (again), client-side httpd (including HTTP headers, again), user's proxy (again), and user's browser (again). I've probably even left some out.

    As you can see, there are so many points of failure here, that determining the source of an invalid UTF-8 character is torturous, at best.

    Recently, I had to wrestle UTF-8 monsters. In my case, it was a combination of user (me) error and an actual bug in PHP, but it was so non-obvious that it caused most of my day to melt away, trying to resolve the issue. I had decided to split a file that contained UTF-8 characters into two files. By default, my editor of choice creates new files using my system character encoding, which happened to be Mac-Roman because I hadn't changed it from Leopard's default. The original file was UTF-8, and the characters displayed normally in the new Mac-Roman file. However, when the data was passed to PHP's json_encode function, the string was arbitrarily truncated, due to a PHP bug.

    Because the script that triggered the bug pulled the data from a database, and the data was inserted by another script—the one with the broken encoding/characters—it took me entirely too long to trace it back to the change I'd made to that now-split file. For a while, I even thought that MySQL was storing the data poorly because we'd had problems with that before, and also because the database client I was using that day was reporting the characters improperly, due to its own encoding issues. I believe my blood pressure skyrocketed to dangerous levels, that afternoon.
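    One way to shorten this kind of hunt is to check bytes for UTF-8 validity at each hand-off, so the first stage that introduces bad data is the one that complains. The sketch below is in Python rather than PHP, and the function name is made up for illustration.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset where UTF-8 decoding first fails, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start  # offset of the first offending byte
    return None


# The same text saved under two encodings, as an editor might do.
as_utf8 = "café".encode("utf-8")
as_mac_roman = "café".encode("mac_roman")

print(first_invalid_utf8_offset(as_utf8))
print(first_invalid_utf8_offset(as_mac_roman))
```

    The UTF-8 bytes pass cleanly, while the Mac-Roman bytes are rejected at the accented character, which is exactly the sort of mismatch that slipped through unnoticed in the story above.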

    Universal Unicode support is going to be a long uphill battle. I'm not sure I'm ready for it, but I hope it's worth it, nonetheless.