1. Deploy on push (from GitHub)

    Continuous deployment is all the rage right now, and I applaud the use of systems that automate a task that seems way easier than it is.

    That said, sometimes you need something simple and straightforward: a hook that easily deploys a few pages, or a small application, all without often-complicated set up (come on, this is a PHP-focused site, mostly).

    Sometimes, you just need to deploy code when it’s ready. You don’t need a build; you don’t need to run tests — you just need to push code to a server. If you use git and GitHub (and I think you should be using GitHub), you can easily deploy on push. We use a setup like this for PHP Advent, for example (it’s a very simple app), and we also used this approach to allow easy deployment of the PHP Community Conference site (on Server Grove), last year.

    There are really only three things that you need, in most cases, to make this work: a listener script, a deploy key and associated SSH configuration, and a post-receive hook. This is admittedly a pretty long post, but once you’ve done this once or twice, it gets really easy to set up deploy hooks this way. If your code is in a public repository, you don’t even need to do the SSH configuration or deploy key parts.

    The listener is just a simple script that runs when a request is made (more on the request bit below). This can sometimes be a bit tricky, because servers can be configured in different ways, and are sometimes locked down for shared hosting. At the most basic level, all your listener really needs to do is git pull. The complicated part is that git might not be in your web user’s path, or the user’s environment might otherwise be set up in a way that is unexpected. The most robust way I’ve found to do this just requires you to be as explicit as possible when defining the parameters around the call to git pull.

    To do this with PHP (and this method would port to other platforms, too), make a script in your application’s web root (which is not necessarily the same thing as the git root), and give it a name that is difficult to guess, such as githubpull_198273102983712371.php. The abstracted name isn’t much security, but much security isn’t needed here for the simple cases we’re talking about, in my opinion. In this file, I have something like the following.

    $gitpath = '/usr/bin/git';
    header("Content-type: text/plain"); // be explicit to avoid accidental XSS
    // example: git root is three levels above the directory that contains this file
    chdir(__DIR__ . '/../../../'); // rarely actually an acceptable thing to do
    system("/usr/bin/env -i {$gitpath} pull 2>&1"); // main repo (current branch)
    system("/usr/bin/env -i {$gitpath} submodule init 2>&1"); // libs
    system("/usr/bin/env -i {$gitpath} submodule update 2>&1"); // libs
    echo "\nDone.\n";

    The header prevents accidental browsers to this page from having their clients cross-site-scripted (XSS). The submodule lines are only necessary if you’re using submodules, but it’s easy to forget to re-add these if they’re removed, so I just tend to use them every time. 2>&1 causes stderr to get redirected to stdout so errors don’t get lost in the call to system(), and env -i causes your system() call to be executed without inheriting your web user’s normal environment (which, in my experience, reduces confusion when your web host has git-specific environment variables configured).

    Before we can test this script, we need to generate a deploy key, register it with GitHub, and configure SSH to use it. To generate a key, run ssh-keygen on your workstation and give it a better path than the default (such as ./deploy-projectname), and use a blank password (which isn’t the most secure thing in the world, but we’re going for convenience, here). Once ssh-keygen has done its thing, you’ll have two files: ./deploy-projectname (the private key), and ./deploy-projectname.pub (the matched public key).

    Copy the private key to your web server, to a place that is secure (not served by your web server, for example), but is readable by your web user. We’ll call this /path/to/deploy-projectname. SSH is (correctly) picky about file permissions, so make sure this file is owned by your web user and not world-writable:

    chown www-data:www-data /path/to/deploy-projectname
    chmod 600 /path/to/deploy-projectname

    Now that we have the key in place, we need to configure SSH to use this key. For this part, I’m going to assume that projectname is the only repository that you’ll be deploying with this method, but if you have multiple projects on the same server (with the same web user, really), you’ll need to use a more complicated setup.

    You’ll need to determine the home directory of the web user on this server. One way to do this is just to check the value at $_ENV['HOME'] from PHP; alternately, on Linux (and Linux-compatible su), you can sudo su -s /bin/bash -u www-data; cd ; pwd (assuming the web user is www-data). (Aside: you could specify the value of the HOME environment variable in your call to env and avoid some of this, but for some reason this hasn’t always worked properly for me.)

    Once you know the home directory of the web user (let’s call it /var/www for the sake of simplicity (this is the default on Debian type systems)), you’ll need to mkdir /var/www/.ssh if it doesn’t already exist, and make sure this directory is owned by the right user, and not world-writable. As I mentioned, SSH is (rightly) picky about file permissions here. You should ensure that your web server won’t serve this .ssh directory, but I’ll leave the details of this as an exercise to the reader.

    On your server, in /var/www/.ssh/config (which, incidentally, also needs to be owned by your web user and should be non-world-readable), add the following stanza:

    Host github.com
      User git
      IdentityFile /path/to/deploy-projectname

    Those are the server-side requirements. Luckily, GitHub has made registering deploy keys very easy: visit https://github.com/yourusername/projectname/admin/keys. “Add deploy key”, give it a title of your liking (this is just for your reference), and paste the contents of the previously-generated deploy-projectname.pub file.

    At this point, your web user and GitHub should know how to speak to each other securely. You can test your progress with something like sudo su -u www-data -s /bin/bash ; cd /path/to/projectname ; git pull, and you should get a proper pull of your previously-cloned GitHub-hosted project.

    You should also test your pull script by visiting http://projectname.example.com/githubpull_198273102983712371.php (or whatever you named it). If everything went right, you’ll see the regular output from git pull (and the submodule commands), and Done. If not, you’ll need to read the error and figure out what went wrong, and make the appropriate changes (another exercise to the reader, but hopefully this is something you can handle pretty easily).

    The last step is to set up a post-receive POST on GitHub. Visit https://github.com/yourusername/projectname/admin/hooks, and add a WebHook URL that points to http://projectname.example.com/githubpull_198273102983712371.php. Now, whenever someone does a git push to this repository, GitHub should send a POST to your githubpull script, and your server should pull the changes.

    In order for this to work properly (and avoid conflicts), you should never change code directly on the server. This is a pretty good rule to follow, even if you don’t take this pull-on-push approach, for what it’s worth.

    Note that other than the bits about registering a deploy key, and setting up the post-receive POST, most of this can be ported to a system that uses git without a GitHub-hosted repository.

    Additionally, you should prevent the serving of your .git directory. One easy way to do this is to keep your web root and your git root at different hierarchical levels. This can also be done at the server configuration level, such as in .htaccess if you’re on Apache.

    I hope this helps. I’m afraid I’ve missed some bits, or got some of the steps wrong, despite testing as I wrote, but if I have, please leave a comment and I’ll update this post as necessary.

  2. Lexentity

    A very long time ago (three and a half livers ago), I wrote a little utility to help us with the 2008 edition of PHP Advent. The utility is called Lexentity, and my recent blogging uptake made me realize that I’ve never actually written about it on here, so here it is (mostly borrowed from the README).

    Let's face it--this sentence is much "uglier" than the one below it.
    Let’s face it–this sentence is much “prettier” than the one above it.

    Lexentity is a simple piece of software that takes HTML as input and outputs a context-aware, medium-neutral representation of that HTML, with apostrophes, quotes, emdashes, ellipses, accents, etc., replaced with their respective numeric XML/Unicode entities.

    Context is important. It is especially important when considering a piece of HTML like this:

    <p>…and here's the example code:</p>
    <pre><code>echo "watermelon!\n";</pre></code>

    Contextually, you’d want here's to become here’s (note the apostrophe), but you certainly don’t want the code to read echo “watermelon!\n”;.

    A fancy/smart/curly quotes apostrophe is appropriate, but curly quotes in the code are likely to cause a parse error.

    Lexentity understands its context, and acts appropriately, by means of lexical analysis, and turning tokens into text, not through a mostly-naive and overly-complicated regular expression.

    Regarding context, my friend and former colleague Jon Gibbins said it best in this piece on his blog: In modern systems, you can’t count on your HTML to always be represented as HTML. It’s often (poorly) embedded in RSS or other HTML-like media, as XML.

    Therefore, it is important to avoid HTML-specific entities like &rdquo; and &hellip;, and instead use their Unicode code point to form numeric entities such as &#8230;. This ensures proper display on any (for small values of “any”) terminal that can properly render Unicode XML, and avoids missing entity errors.

    You can try a demo at http://files.seancoates.com/lexentity, and the (PHP) code is available on GitHub.

    We still use it for PHP Advent, and I ran this post through it. (-: