UTF: WTF?

Note: This article first ran in php|architect in March 2008, while I still worked at MTA. Marco (the publisher, and my former colleague) has graciously agreed to allow me to republish this in a more public forum. I've wanted to link a few people to it in the past few months and until now that was only possible if they were php|architect subscribers. That said, if you're into PHP, you really should subscribe to php|a.

As you might know, one of my roles at php|architect is to organize and manage speakers (and their talks) for our PHP conferences.

A while back, PHP 6's main proponent, Andrei Zmievski, submitted a talk that we accepted, entitled "I ♥ Unicode, You ♥ Unicode." When we selected the talk and invited Andrei to attend the conference, he accepted and humorously suggested that we pay special attention to the talk's heart characters when publishing details on the conference website and in other promotional materials. I took his suggestion as wise advice, and double checked the site before releasing it to the public—it worked perfectly.

Within a few hours of publication, Andrei dropped me a note indicating that I hadn't heeded his warning, and that the ♥s weren't showing up properly. The problem turned out to be a bug in a specific version of Firefox, and I believe we resolved it by employing the &hearts; entity. This ordeal, while minor, was my first taste of how bad things would become.

If I had to guess, I would estimate that I've spent somewhere in the range of 40 hours wrangling UTF-8 in the past 3 months, which is not only expensive for my employer, but also disheartening as a developer who's got real work to do. Admittedly, this number is inflated, due to the heavy development cycle we completed with the launch of our new site. As time goes on, though, I don't see this situation improving in the short term (though, if we were to glimpse much further into the future, I'm sure we'll eventually consider this a solved problem).

The main problem with using Unicode, today, is that it's only partially supported across any given tool chain. Sometimes it works great, and other times—due to a given piece of software's lack of implementation (or worse, a partial implementation), human error, or full-on bugs—the chain's weakest link shatters in a non-spectacular way.

As any experienced developer knows, having the weak point of a process collapse is a normal part of building complex systems. We're used to it, and we usually manage this by making the systems less complex, by eliminating the parts that are prone to collapse, or by fixing the broken parts. When implementing a system that may contain Unicode data, today, we're challenged with many potential points of failure that are often difficult to identify, and nearly impossible to replace.

To illustrate, consider an overly simplified web development work—and content delivery—flow: developer creates a file, developer edits file, developer uploads the files to the web server, httpd receives a request from a browser, httpd passes the request to PHP, PHP delivers content back to httpd, httpd delivers content to the visitor's browser. If a single part of this flow fails to handle Unicode properly, a snowball effect causes the rest of the chain to fail.
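
If nothing else, the PHP link in that chain can at least be told, explicitly, which encoding we intend to speak. A minimal sketch (these are standard PHP functions, but treat the snippet as illustrative; every other link in the chain still has to cooperate):

<?php
// Declare UTF-8 explicitly, rather than trusting each layer's defaults.
mb_internal_encoding('UTF-8');                     // for PHP's mb_* string functions
header('Content-Type: text/html; charset=utf-8');  // what httpd hands the browser

$text = "I ♥ Unicode";                              // stand-in for real content
echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8');  // escape with an explicit charset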

A more typical flow for me (and our code) goes something like this: create file, edit file, commit file to svn, other developers edit file, others commit to svn, release is rolled from svn, visitor browser requests page, httpd parses request, httpd delivers request to PHP, PHP processes request, PHP (client) calls service to fulfill back-end portions of request (encodes the request in an envelope—we use JSON most of the time), PHP (service) receives request, service retrieves and/or stores data in database, service returns data to PHP client, PHP client processes returned data and in turn delivers it to httpd, httpd returns data to browser.

If you'll bear with me for one last list in this article, that means that any (one or more!) of the following could fail when handling Unicode: developers' editors, developers' transport (either upload or version control), user's browser, user's http proxy, client-side httpd, client-side PHP, client-side encoder (JSON), service-side httpd (especially HTTP headers), service-side decoder, service-side PHP, service-side database client, database protocol character set imbalance, database table charset, database server, service-side encoder, client-side decoder, client-side PHP (again), client-side httpd (including HTTP headers, again), user's proxy (again), and user's browser (again). I've probably even left some out.

As you can see, there are so many points of failure here, that determining the source of an invalid UTF-8 character is torturous, at best.

Recently, I had to wrestle UTF-8 monsters. In my case, it was a combination of user (me) error and an actual bug in PHP, but it was so non-obvious that it caused most of my day to melt away, trying to resolve the issue. I had decided to split a file that contained UTF-8 characters into two files. By default, my editor of choice creates new files using my system character encoding—which happened to be Mac-Roman because I hadn't changed it from Leopard's default. The original file was UTF-8, and the characters displayed normally in the new Mac-Roman file. However, when the data was passed to PHP's json_encode function, the string was arbitrarily truncated, due to a PHP bug.

Because the script that triggered the bug pulled the data from a database, and the data was inserted by another script—the one with the broken encoding/characters—it took me entirely too long to trace it back to the change I'd made to that now-split file. For a while, I even thought that MySQL was storing the data poorly because we'd had problems with that before, and also because the database client I was using that day was reporting the characters improperly, due to its own encoding issues. I believe my blood pressure skyrocketed to dangerous levels, that afternoon.
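
In hindsight, a cheap guard in front of json_encode would have saved me most of that afternoon. Something along these lines (a sketch that assumes the mbstring and iconv extensions are available; it is not what our code actually did at the time):

<?php
// Refuse to hand json_encode anything that isn't valid UTF-8.
// Strings only; an array would need a recursive walk.
function safe_json_encode($value)
{
    if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
        // Not valid UTF-8. Here I assume the stray data is Mac-Roman,
        // as mine was; adjust the source charset (or just throw) to taste.
        $value = iconv('MACINTOSH', 'UTF-8//TRANSLIT', $value);
    }
    return json_encode($value);
}

echo safe_json_encode("caf\x8E"); // "caf\u00e9" instead of a mangled string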

Universal Unicode support is going to be a long uphill battle. I'm not sure I'm ready for it, but I hope it's worth it, nonetheless.

More Web of Trust Thoughts

A while back, I blogged about trust on the web, and how there are a lot of assumptions made by content providers that simply don't carry over to end users, or are just a small (but important) step from being good practices.

Yesterday, at $work, we were talking about something that led to a discussion on SSL, and how I think that most sites, even if they don't contain sensitive information, should be available over https—even if the certificate is self-signed.

Chris respectfully (I think (-; ) disagreed with me, saying that certificates that are not trusted by a user's browser are as bad as, or even worse than, not allowing SSL at all. His theory—and I'm sure he'll correct me if I'm misrepresenting him—is that offering this type of unverifiable certificate is not only useless, but harmful to users, because it creates a false sense of security. My retort, though not well received, is that users of modern browsers (Firefox at least) will be notified when a self-signed certificate that they've accepted has changed. This at least allows the user to verify when something is amiss. His rebuttal was that there's no way for the user to tell which certificate is the "good" one and which is the "bad" one, and I can see his point.

We had a discussion on DNS and how we trust it for a lot of things that we shouldn't, even though we don't want to... especially given the recent problems with DNS. In the end, we all agreed that putting something like http://omniti.com/ on self-signed https serves no practical value, as users will a) never use it, b) not know how to verify the certificate, and c) get confused by their browser warning them about security problems.

This led to a few other branches of thinking about SSL. The first was a question Chris asked us: "how do you access your online banking?", clarifying with "how do you get to the login page?" A few of us (myself included) answered "bookmark" while others said they hit their bank's main domain either from URL history or manually, and clicked through from there. Chris's point was that most users visit http://bank.example.com/ and are somehow directed to their https login page. I checked my bank, and bad things happen:

  • visit http://www.royalbank.ca/
  • click "online banking", which links to http://www.rbcroyalbank.com/STRINGHERE/redirect-bank-hp-pagelink-olb.html
  • which redirects, via META tag to: https://www1.royalbank.com/cgi-bin/rbaccess/RESTOFURLHERE
  • user is presented the login form (in https)

My bookmark is the https://www1.royalbank.com/... page, so I feel relatively safe, but let's look at the bad things that happen here:

  • User visits one domain (HTTP, not secure)
  • User is _silently_ redirected to another domain on HTTPS

Why are these bad? Well, aside from the possible confusion of getting bumped from royalbank.ca to rbcroyalbank.com to royalbank.com, the user's chain of trust breaks down when they visit http://royalbank.ca/. http—no "s". If this site were compromised, the user would never know (without careful URL confirmation at the https destination) that s/he was not maliciously redirected to https://www1.roya1bank.com/ (note "L" is "1" (one) in my bad-guy example). Phishers could easily get an SSL certificate for roya1bank.com.

That got me thinking a bit about the SSL certificate acquisition process. I'm sure some of the really high-end SSL certificates still come with human validation (a real person looks at the application and makes a real decision about granting the certificate; in the case above, hopefully this would have been caught). Most certificate signing I've seen recently is based on proven ownership of the domain in question. So, as I say, it's trivial for me to go register a domain that LOOKS like a bank. Sure, I'd still have to compromise either the http server or DNS that points at the server, but Kaminsky demonstrated that this isn't so hard (or wasn't until just a few weeks ago).

Let's take it a step further back. If bad guys can compromise DNS, which is inherently insecure (no SSL, no trust model other than IP address, and it runs on UDP(!)), then surely they can trick the certificate authority's SMTP server into delivering mail to another mail exchanger, right?

  • bad guy targets example.com and poisons the certificate authority's DNS for example.com, pointing its MX at an IP controlled by bad guy
  • bad guy generates a certificate signing request (CSR) and sends it to the certificate authority (CA), "From" bob@example.com
  • CA receives the CSR and verifies with whois that the contact for the domain is bob@example.com
  • CA signs the CSR and returns the certificate to bob@example.com (either by mail or through a web interface)
  • bad guy is now in possession of a perfectly valid and trusted SSL certificate for example.com

Scary. You must be thinking that CAs probably have a more secure DNS setup and wouldn't get poisoned (as easily). I believe that to be somewhat true. Let's say it's absolutely true: the CA has 100% perfectly secure DNS. Ok, we'll need to go one step further back:

  • bad guy poisons the DNS at the target's less secure $20/month ISP (which hosts example.com), redirecting the MX for example.com to a different server
  • bad guy visits example.com's registrar's web interface and indicates that he has forgotten his password
  • registrar generates password reset URL/instructions and emails it to bob@example.com
  • bad guy receives the hijacked email, logs into the domain and changes the contacts to badguy@example.net, an email account that he controls
  • bad guy generates a CSR and sends it to the CA from badguy@example.net, and continues the process outlined above to receive a legitimate, valid and trusted certificate

In any of these scenarios, hundreds or thousands of account credentials could be acquired—especially with creative use of proxies at the bad guy's malicious server.

We're led to believe that SSL is truly safe, and it's true that the encryption part lives up to the expectation, but the modern practice of certificate generation and signing certainly leaves something to be desired, I think.

Yeah, it might be a long shot that an attacker could easily poison specific DNS servers on the internet, but again, as Kaminsky showed the world just a few weeks ago, (nearly?) every DNS server on the planet was vulnerable to exactly this type of attack before summer 2008.

Pardon me if I don the tinfoil hat until we all forget about this mess.

Personal Password Policies

As you may have already heard, I've recently taken a position at OmniTI. Big changes in my life and career usually cause me to review other parts of the same. Recently, I've been considering my personal password policies, and I thought it might be interesting to both share my conclusions, as well as to hear from my 3 remaining readers (after months of an untouched blog) what you think and if you have any of your own policies that I should adopt.

Here's the short version for the short-attention-spanned among us:

(There's also some (IMO) cool Keychain command line code at the end...)

  • unique password for each site/service
  • passwords should be changed every 90 days
  • My Vidoop for web (exported to keychain daily (once Vidoop allows this))
  • delegated OpenID whenever possible
  • keychain for non-web (+time machine backups regularly)
  • 8+ glyphs whenever possible
  • glyph = upper + lower + nums + symbols
  • ssh via RSA keypair when possible
  • ssh priv escalation via user password (re-auth)
  • re-gen RSA keypair annually
  • mail: GPG w/1-year key expiry
  • publish ssh-RSA and GPG public keys

Up until a few weeks ago, I had what I'd considered a "medium" password footprint. I've done some things right, but a lot of things wrong. I wouldn't consider it a weak footprint because I don't (e.g.) use my birthdate as my PIN, but I also wouldn't consider it a strong footprint because I was prone to using the same password on different (lower security/risk) sites. The repeated password is also composed of lowercase letters only, which means that it's relatively easy to crack, if one of my "low security" password hashes were ever to be compromised.

This realization has led me to review some of my personal policies, and has helped me identify a few things that I need to stop doing immediately, and other things that I should start doing as soon as possible.
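
As a concrete illustration of the "8+ glyphs, all four classes" rule from the list above, here's the sort of throwaway generator I have in mind. It's a hypothetical sketch, not part of my actual toolchain, and mt_rand is only good enough for illustration; anything real should lean on a properly strong source of randomness:

<?php
// Hypothetical password generator: every glyph class represented,
// then padded out to the requested length. Illustrative only.
function generate_password($length = 12)
{
    $classes = array(
        'ABCDEFGHJKLMNPQRSTUVWXYZ',   // upper (skips easily-confused I and O)
        'abcdefghijkmnopqrstuvwxyz',  // lower (skips l)
        '23456789',                   // nums (skips 0 and 1)
        '!@#$%^&*()-_=+',             // symbols
    );

    $glyphs = array();
    foreach ($classes as $class) {
        // at least one glyph from each class...
        $glyphs[] = $class[mt_rand(0, strlen($class) - 1)];
    }

    // ...then fill the rest from the combined pool and shuffle
    $pool = implode('', $classes);
    while (count($glyphs) < $length) {
        $glyphs[] = $pool[mt_rand(0, strlen($pool) - 1)];
    }
    shuffle($glyphs);

    return implode('', $glyphs);
}

echo generate_password(), "\n";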

Keychain

Once upon a time, it might have been reasonable to expect users to create and remember passwords for their accounts, but if you ask me, that era has long passed. As technology has thrived, and systems have become more pervasive, users have had to create an impossible number of accounts on dozens or hundreds (or—for power users—maybe even in the thousands) of independent services: on web sites, email accounts, personal computers, in-home routers, printers, bank accounts, phone authentication systems (think cable/phone support) and company networks.

Everyone needs a little help, and thankfully, many of the applications we use in our daily lives will remember our passwords for us. Firefox, Safari and (I believe) IE will all remember usernames and passwords, and will each try to fill them in semi-intelligently. Our mail applications (if they're not our browsers) remember our IMAP credentials, and on the Mac, we have Keychain built into the OS as one of its core components.

I intended to write a long piece on this, but I've been intending to do so for weeks to no avail, so, simply put: I'd like to know your password policies, and I'll see how I can improve mine. One of the key elements in my new strategy is a script I wrote for the Mac Keychain, called "getpw":

#!/bin/bash

# no parameters: spit out usage, then exit
if [ -z "$1" ]; then
    echo "Usage: $0 name [account] (or:" `basename $0` "account@name)"
    exit 1
fi

if [ -z "$2" ]; then
    # account not provided
    # check for account@name:
    ACCOUNT=`echo -n "$1" | sed -e 's/@.*//'`
    if [ "$1" != "$ACCOUNT" ]; then
        # found account@name
        ACCT="-a $ACCOUNT"
        NAME=`echo -n "$1" | sed -e 's/.*@//'`
    else
        # not found; ignore account
        ACCT=''
        NAME="$1"
    fi
else
    ACCT="-a $2"
    NAME="$1"
fi

# security's -g flag prints the password (on stderr) as: password: "..."
# strip the label, the surrounding quotes, and the trailing newline
PW=`security -q find-generic-password $ACCT -gs "$NAME" 2>&1 | egrep '^password: ' | sed -e 's/^password: "//' -e 's/"$//' | tr -d '\012'`

if [ -z "$PW" ]; then
    echo "password $1 not found"
else
    echo -n "$PW" | pbcopy
    if [ -z "$2" ]; then
        echo "password $1 copied to pasteboard"
    else
        echo "password $2@$1 copied to pasteboard"
    fi
fi

Basically, I do something like:

sarcasm:~ sean$ getpw sean@iconoclast
password sean@iconoclast copied to pasteboard

Keychain politely asks me to unlock the keychain if necessary (via a nice GUI dialog), and voila, I've got my password in my pasteboard, ready for use. No need to remember complex passwords, and no need to ever see them (bypasses keyloggers, too).

Hope that's helpful to someone; I use it dozens of times per day.

A Weak Web of Trust

Every time I'm forced to waste small fractions of my life navigating (and re-navigating) the Air Canada web site, I run into new points of frustration. For example, this week, I couldn't check pricing on a trip because of a JavaScript error that prevented the multi-city page from allowing me to submit the form.

Errors (which have since been fixed) aside, I was finally able to complete my reservation, today, and was reminded of an issue of cross-site trust that I suspect will become more and more of a problem, as sites and businesses continue to deepen their level of cooperation. This type of collaboration can be good or bad for end users, and in this case, what seems beneficial is actually extremely problematic.

The fundamental source of this problem is two-fold: the end-user's inability to know who is receiving trusted information, and the same user's obligation to determine if the identified party should receive this information in the first place.

I've seen it happen in a few places in the past few weeks (my colleague Paul pointed out the Google tie-in that I mention below). I'll comment on these from least- to most-severe/dangerous.

Google

Let's first look at Google. Five years ago (2003), Google acquired Blogger, a blogging service site. Today, if you visit Blogger, you'll be invited to conveniently sign in using your Google Account.

So, what's the problem? It's simple: there's no easy way to tell that Google actually owns Blogger, and that blogger.com should be trusted with your Google credentials. Sure, I know that Blogger is part of the GOOG, and—being up-to-date on things-Web—you probably know... but does your mother? your friends? My wife didn't know.

Indeed, Blogger's main page does say "Copyright © 1999 – 2008 Google" but there's no real, hard link between the two. I could falsely put a similar notice on any of my domains, and it would allow me to steal accounts of anyone who thinks that this is a reasonable practice.

Fortunately for Blogger users, your Gmail account is a relatively low-risk target (we do use Google Docs to plan certain business things that would be considered "confidential" but not necessarily "critically secret").

Paypal

To step up to what I consider a much more problematic example of "convenient business relationship gone bad," our attention turns to eBay's purchase of Paypal (2002).

I like to browse eBay from time to time, especially to find reasonable prices on brewing stuff. I've won a couple auctions in the past couple months, and I've noticed a very peculiar and dangerous tie-in like the Blogger-Google connection above.

eBay's relationship with Paypal is certainly no secret. I would guess that most eBay regulars generally use Paypal to complete transactions, and many of those are aware that they are, in fact, the same people. Admittedly, this problem might be more or less serious than I'm about to explain, but the fundamental issue is the same—one of trust.

I can't grab a screen shot of this one because I'm unwilling to complete a transaction just for the sake of this blog entry, so you'll have to trust me for this example (or you may have already noticed for yourself). It used to be that when paying a seller via Paypal, you'd be shuttled off to the Paypal site, and returned to eBay upon transaction completion. This is how nearly all Paypal transactions work: merchant passes user off to Paypal to pay, and user is redirected to merchant.

Over the past few weeks (perhaps months, now), there has been a new branding scheme applied to eBay-specific Paypal transactions. When paying, buyers are still (re)directed to paypal.com, but instead of standard Paypal greetings, text, images and colours, users are asked to log into a page that is decorated with eBay's brand (logo, colours, language).

Business-conglomerate aside, this is a very dangerous precedent for Paypal to set. Paypal is understandably one of the biggest targets for phishing scams, and I think it would be in their best interest to keep their site very clearly labeled "Paypal" even if it is "just" eBay. They are quick to attempt to educate their users on the dangers of phishing, and their tips even indicate such now-ambiguous suggestions as "Don't use the same password for PayPal and other online services such as AOL, eBay, MSN, or Yahoo." (Emphasis mine.)

What about sites that LOOK like eBay, but are actually Paypal? Again, I bet that would easily confuse someone who's less Web-savvy.

Visa

Getting back to the problems I had with Air Canada, today, let's discuss the most idiotic and dangerous idea of them all: Verified by Visa.

Verified by Visa is a programme introduced by Visa, in 2001, to help reduce fraudulent credit transactions online by shifting part of the responsibility of preventing fraud from the merchant to the card's issuing bank. The idea is to insert a verification step into an online merchant's purchase process to have a bank essentially vouch for a given card. In this case, Air Canada is the merchant, and Royal Bank of Canada is my issuing bank.

Once again, on the surface, this sounds like a mild inconvenience to end users in exchange for a significant increase in security. In most cases, I believe it actually does this. Here's my problem: the verification step is inserted into the merchant's page via an iframe. The user is asked for his/her online banking password within this frame, which is actually served from the issuing bank's web site. I can verify this by loading and inspecting the source, and determining that the iframe (probably(!!)) is actually coming from my bank's site (I say "probably" because there COULD be some hard-to-find, obfuscated JavaScript hiding somewhere that changes this URL and/or loads a different frame/source).

One cannot reasonably expect casual users to have the necessary HTML-parsing abilities to determine that it's safe to give this page (which appears to be the merchant's site, according to the address bar of my browser, by the way) their online banking password. Again, I'm unwilling to purchase a multi-hundred dollar plane ticket to grab a screen shot to illustrate this point. Sorry (-:

Wrap-up

This whole idea of third-party verification without somehow allowing the user to easily intercept/inspect the process is dangerous and sounds like a ripe venue for increased phishing/social engineering exploits. "Reliably check and/or type the URL yourself (to ensure that it matches the site's content and your intent)" is probably the number-one rule for avoiding phishing scams, and the implementations above make it impossible for casual users to take even the most basic of precautions.

Some tips/rules (in my opinion):

  • You have a URL. It's secured by SSL. Use it. Don't split users off onto different sites. Don't allow login from third-party domains (instead direct the user to your main domain, and securely redirect them back to the main content).
  • Optionally use a system like OpenID (I'm looking at you, Blogger).
  • Don't embed critical information forms into a page hosted on a different domain than the one that should be trusted with said information; instead, redirect as above.
  • It's bad form to brand your trusted domain with a different site's scheme—it's confusing and dangerous.
  • Make your intentions clear to users. Make the recipient of trusted information painfully obvious to the end user, and do so through a mechanism that the user is prone to actually trust—read: use the URL/Address Bar, and not text like "don't worry, this form on thanksforthecreditcard.example.com actually submits to paypal.com; you're safe!"
  • NEVER expect casual users to know how to figure out where an iframe is sourcing from, or where a form submits.

Google, Paypal, Visa: shame on you. You're violating some of the most fundamental social Web security rules.

How to record a podcast on OSX 10.5.2

I'm so frustrated. It seems that every time we sit down to record the podcast, lately, it all goes to crap, and I'm sick of recording the same thing over and over again only to have it fail (audio gets garbly; drops samples; garageband crashes; kernel panics; all around nasty stuff).

It all seems to stem from Apple seriously screwing up their USB drivers on 10.5.2. This is definitely the first time I've felt seriously let down by my operating system since switching from Linux (which has its own issues) last May.

So, to help all other would-be podcasters out there, I've come up with a chart that helps you choose the proper combination of hardware and software when recording podcasts on 10.5.2.

Seriously, though, if anyone has a real solution to this problem that doesn't involve an OS reinstall (and then not upgrading past 10.5.1), please PLEASE let me know. And no, switching from the left USB port to the right isn't a real solution.

*sob*

PHP Advent Calendar

A few days back, Chris Shiflett sent out an email asking a bunch of members of the PHP community to submit to a project he wants to run this year, the PHP Advent Calendar. I have the honour of providing the first entry.

Thanks to Chris; I think this is a great idea. I'm so happy to be included on the list of potential writers.

I'd write more, but I'm currently in the middle of nowhere (again) and can't bear to wait on incessantly latent internet, so I'll leave it at that. Enjoy!

We all grow up

A friend of mine sent me a message on Facebook, yesterday. "Never thought that I'd see Microsoft and your name together at the same place," she said, referring to posted (and tagged) photos of me at the Microsoft Web Developers Summit 2007.

"Me either. Long story..." I replied. The story dates back 5 years, when I had conversations with my aforementioned friend's husband–who happened to be a Microsoft guy, professionally–about the things MS was doing that were hurtful to not only open source, but the software community in general, and ultimately Microsoft's bottom line (if they were willing to look past "next quarter's" earnings projections).

In the past 5 years, a number of things have happened to change my solid and negative opinion of Microsoft to one that is more fluid and better reflects reality. The first of those is that I've grown to accept that things often look better on paper than "in the field." Another notable change to my behaviour is a realization that herd mentality had set in, and that part of my dislike for all things Redmond was due to that being the socially correct thing to do (I've since mostly stopped reading Slashdot).

Without a doubt, though, if asked to identify a single factor that has most significantly changed my frigid opinion of Microsoft, I would immediately identify the Web Developer Summits.

Last year (in 2006), I was invited to attend the first of these summits, partially due to a logistical problem that left open seats that needed to be filled. I was excited to visit Seattle for the first time, and MS was footing the entire bill, so who was I to say "no"?

Joe Stagner, Microsoft's Opinionated Misfit Geek (and yes, his business cards DO say that, I've seen it), who has worked with us at php|architect, speaking at and sponsoring conferences, was my "sponsor" for last year's summit (and again this year). Joe's contributions to our conferences have been honest and forthcoming. He does a good job of balancing Microsoft's agenda with a fair dose of self-deprecation that tends to engage our attendees, and (discarding the troll comments) I hear overwhelmingly positive comments after each time Joe speaks.

Coming away from last year's summit, it dawned on me that Microsoft simply isn't the same company that it was five years ago. Based on the candid information that Microsoft has shared in the past two Web Developers' summits, it's obvious to me that not only has MS' business strategy toward open source changed dramatically in the past few years, but there is a seemingly fundamental change in their actual philosophy toward software they haven't written, themselves.

Their corporate attitude–that is, at least from the sector that's focused on Web development–has swayed from a nearly-violent and extremely arrogant position of dominance, to one that is more open and dare I say even humble? Their recent offerings seem to be standards compliant (or at least standards-savvy) and more open than ever. Their past position of embrace and extinguish seems to have died with a past generation of middle management.

After seeing demos of some of their upcoming web-centric technologies such as IIS 7, Silverlight, and Expressions, I'm left re-evaluating my current preferred platforms.

Don't get me wrong, I'm unlikely to place a Windows box into production when not absolutely necessary (thank you Flash Media Server), but one of the things that I keep catching myself saying to colleagues when discussing the summit is "Doubtful I'll be using Expressions, but it does seem like the perfect Frontpage replacement for my Father-in-Law."

Even after being shown IIS 7, and having in-depth technical discussions with core developers, such as Rick James, the developer behind the IIS 7 FastCGI implementation, when asked "What do we need to add to IIS to make you use it?" my half-serious reply is "Make it run on Linux!" I say half-serious because I'm almost certainly not going to switch my production boxes from Linux, but if IIS 7 did, in fact, run on Linux, I'd be giving it some serious thought (that is, if it didn't end up having a high per-CPU cost, as an anonymous colleague pointed out).

IIS can pull some sweet integration tricks that more loosely coupled stacks like LAMP struggle with, such as deep kernel/filesystem hooks to determine when the IIS equivalent of .htaccess files have actually changed, giving them a serious performance advantage. There's also an integration point with Silverlight (the "Flash killer"), where the httpd can analyze, in realtime, the bitrate of the served video file and scale allocated bandwidth appropriately to maximize user experience, while saving on bandwidth (think: user watches 2 minutes of a 2 hour video file, and only actually downloads 3 minutes of the file, instead of up to the full 2 hours).

Maybe I'm just drinking the kool-aid. I'm usually more paranoid than that, but I guess it's possible. Or perhaps, if you put the Microsoft-hating tendencies aside for just a moment, you might agree with me that they're up to something different. They've certainly got an uphill battle, but at least they're trying, and I really do think that's what counts.

Thanks, Eric, Joe, Drew, Sanjoy, Tanya and everyone else who was involved in bringing us out to Redmond. I hope, no matter how hard to correlate to actual sales, it was worth it for you. It was definitely worth my time.

Short Date Formats Suck

When I'm traveling, I often like to sample beer that's unavailable here in beer-wasteland-Quebec (local microbreweries notwithstanding).

For some reason, I often get asked for ID... especially in near-airport bars and restaurants. I noticed that in Orlando, last year, everyone in every group was carded each time anyone ordered any sort of alcohol. I guess they have a low tolerance for underage drinking there, or perhaps their waiters are just well-trained to ask everyone for ID.

Anyway, the first piece of ID I usually have on-hand is my Quebec driver's license. Quebec is messed up in many ways, but one that they're particularly oblivious about is that our driver's licenses don't explicitly show the holder's birth date. It's abstracted into the license number, and isn't obvious to anyone who's never seen one before.

(note: yes, there is a PHP (or at least code) related component to this piece, if you feel like reading on. It has to do with idiotic short date formats.)

I get a kick out of handing my driver's license to bouncers/bartenders/waitresses and watching their faces as they try to find the birth date. It must be there... right?

My license has a number similar to the following, at the top (changed in case there's any "private" information in there, but the format is the same): C6401-090280-01.

Now that I've pointed it out, you can probably pick out my birth date: September 2nd, 1980. Or perhaps it's February 9th, 1980... and I'm only sure about the year because there's no month for "80". The practice of displaying an abstracted birth date on a piece of identification that is normally used much more often as... identification... than as an actual license to drive baffles me, but I digress.

Let's get to the real issue: ambiguous date formats. Can we stop this, please? I realize that each country is different, but it's really annoying. Take my example above: you have no idea which month I was born in.

On top of that, why should the YEAR–read: the most significant digits–ever be last? Today's date should be denoted 2007.09.27. Choose your own punctuation for all I care, but please, PLEASE, use a little bit of sense, here.
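
For the curious, producing a sane, sortable date in PHP is a one-liner, and the same tools make the ambiguity of my license's "090280" painfully obvious (a quick sketch; DateTime::createFromFormat needs PHP 5.3 or later):

<?php
// The sane way: year first, most significant digits leading.
echo date('Y.m.d'), "\n";                       // e.g. 2007.09.27

// The ambiguous way: nothing in "090280" says which reading is right.
$raw = '090280';
$a = DateTime::createFromFormat('mdy', $raw);   // September 2nd, 1980?
$b = DateTime::createFromFormat('dmy', $raw);   // or February 9th, 1980?
echo $a->format('F jS, Y'), ' vs ', $b->format('F jS, Y'), "\n";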

Don't even get me started on 24h time formatting.

Rant over. Sorry about that.

Anyway, IF you must represent dates in a messed-up format like 090507 (which is what, exactly?), then at LEAST let your users choose their own format.

How to [not] get fired

Marco already posted on this, but I thought I'd pitch in our side of the story.

At php|works, this year (a couple weeks ago), our /fear(some|ful)/ leader was absent. He had some personal stuff that conflicted with the conference's schedule, so he left it in our (Paul, Arbi and myself) mostly-capable hands.

I think we did a good job, even without him, but to deter him from deserting us at our [bigger!] spring conference, we came up with an idea... a good idea (-:

Tradition states that Marco should give the closing keynote at our conferences. This time around, we had excellent internet connectivity–thanks to the nice folks at OneRing Networks and a little experience (read: "don't trust the hotel's AV company for networking needs") on our part–so we (Marco and I) decided that he could reliably give his keynote via iChat. The idea had not yet been conceived...

We try very hard to make our conferences professional but not uptight. Some ideas work, some don't. This one did (-:

We devised a way that we could harness the audience to "caption" Marco's closing keynote. It involved 15 minutes of coding, 2 laptops, 2 projectors and an unsuspecting boss.

The first projector displayed the output from my laptop's iChat window, so everyone could see Marco's lovely mug. The 2nd projector ran a web browser that displayed the audience-sourced caption (updated every 5 seconds).
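
For the curious, the "15 minutes of coding" amounts to little more than a form that appends lines to a flat file, plus a display page that re-reads the newest line and refreshes itself every 5 seconds. The sketch below is a plausible reconstruction in PHP, not the actual script we ran that day:

<?php
// caption.php -- hypothetical reconstruction of the audience-caption page.
$file = dirname(__FILE__) . '/captions.txt';

// audience members submit captions here
if ($_SERVER['REQUEST_METHOD'] === 'POST' && !empty($_POST['caption'])) {
    file_put_contents($file, trim($_POST['caption']) . "\n", FILE_APPEND);
    header('Location: ' . $_SERVER['PHP_SELF']);
    exit;
}

// the projector's browser just sits on this page and refreshes
$lines   = file_exists($file) ? file($file, FILE_IGNORE_NEW_LINES) : array();
$caption = $lines ? end($lines) : '(awaiting caption...)';
?>
<html>
<head><meta http-equiv="refresh" content="5"></head>
<body>
  <h1><?php echo htmlspecialchars($caption, ENT_QUOTES, 'UTF-8'); ?></h1>
  <form method="post">
    <input name="caption"> <input type="submit" value="Caption Marco">
  </form>
</body>
</html>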

Oh, and as I alluded, he had absolutely no idea we'd done it until a few days later, when I let the cat out of the bag–I figured it better that he hear it from one of us than read about it on someone else's blog.

We had fun, and I distinctly remember hearing "Best Closing Keynote EVER" after we logged off. Hope you had fun, too (-:

Handling Downtime: Job Well Done

Yesterday, if you tried to call me (either at work, or at home, or even my mobile), you were probably unable to reach me.

The telephone (VOIP) wholesaler I use, Unlimitel, was down for around 9 hours. Normally, 9h of downtime would really bother me, but I can honestly say I've never been happier with their service.

Here are a few reasons why:

  • This is the first significant downtime we've had in 2 years (since I started using Unlimitel)
  • We pay 1.1¢/min for on-net service (local calls in a number of Canadian cities) and 2¢ for the rest of North America. Low rates for most of the rest of the world, too. $2.50/month for our DIDs. I don't expect completely flawless service for this price.
  • Stephan, the president, emailed customers to explain the problem. The first mail was before we even noticed that calls weren't working.
  • The actual cause of the outage was a screwup at Rogers (NOT Unlimitel's fault):
    • A truck accident somehow caused a bundle of fiber at Rogers to be cut. (Initial reports were that the lines were cut by a backhoe.)
    • Rogers' redundancy somehow failed. The actual cut was 25KM away from Unlimitel's datacenter.
  • Throughout the downtime, Stephan kept us well-informed of the situation, relaying ETA data from Rogers whenever possible.
  • Unlimitel routed all possible traffic to non-Rogers circuits as soon as possible. (Outbound calls started working, but inbound lines are on Rogers, and Rogers' redundancy failed, as mentioned.)
  • The few times that Unlimitel has actually made mistakes, they've kept us well-informed, and owned up to these mistakes quickly (they implemented CallerID poorly, a while back, and quickly fixed it, for example).

All told, I'm very happy with them. I can understand why some people would be upset over ~9h of unplanned downtime, but all things considered, I think Unlimitel did an excellent job of handling the crisis.

For what we pay, I couldn't expect better. Kudos to the team over at Unlimitel.