mod_rewrite

mod_rewrite is cool.

As I mentioned in an earlier entry, I recently upgraded my blogging software to WordPress. WordPress uses mod_rewrite to support its pretty URL system. This works by defining a set of rewrite rules based on regular expressions. For example, every time you request http://pezra.barelyenough.org/blog/2005/04/the-importance-of-a-rare-name/, Apache internally rewrites the URL to /blog/index.php?year=2005&monthnum=04&name=the-importance-of-a-rare-name. This is not a redirect: the browser thinks it is pointed at the pretty URL, but Apache actually serves index.php instead. This behavior is nice, but it led me to something even better.

I realized today that the perma-links generated by my previous blogging software were now 404s. mod_rewrite to the rescue: I just added a new rule so that the URLs blogger.com generated are now rewritten to call WordPress instead. That means that if anyone linked to the existing posts, those links are no longer broken. And it only cost one line in a config file. Now, that is sweet.

2 thoughts on “mod_rewrite”

  1. I don’t understand how you’re doing this redirect from the old URLs. Do you mean the blog software is configured to recognize the patterns and make a substitution when it generates the pages? Explain it in more detail.

  2. It is pattern matching. Basically you write a regular expression that matches the original URI and then constructs the new, rewritten URI based on parts of the original URI. The URI rewriting happens in Apache, before the request gets to the blog software. But the blog software builds the rewrite rule for you based on a simple specification.

    The basic rewrite rule is

    RewriteBase /blog/
    RewriteRule ^([0-9]{4})/([0-9]{1,2})/([^/]+)/?([0-9]+)?/?$ /blog/index.php?year=$1&monthnum=$2&name=$3&page=$4 [QSA,L]

    The $n bits are back references to the sub-matches (the bits in parentheses) of the regular expression.
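
To see the back references at work, here is a rough Python sketch that applies the same regular expression to a path and fills in the substitution by hand. It is only an illustration, not how Apache is implemented: the `rewrite` function is a made-up name, the sketch assumes the rule is evaluated in a per-directory context (such as an .htaccess file in the blog directory) so that the /blog/ prefix is already stripped before matching, and it ignores the QSA and L flags.

```python
import re

# Pattern copied from the RewriteRule above. The path is assumed to arrive
# with the /blog/ prefix already removed (an assumption of this sketch).
PATTERN = re.compile(r"^([0-9]{4})/([0-9]{1,2})/([^/]+)/?([0-9]+)?/?$")


def rewrite(path):
    """Return the rewritten internal URL, or None if the rule does not match.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.match(path)
    if match is None:
        return None
    # Groups 1..4 play the role of $1..$4. The optional page group may not
    # take part in the match; this sketch treats it as an empty string.
    year, month, name, page = (group or "" for group in match.groups())
    return f"/blog/index.php?year={year}&monthnum={month}&name={name}&page={page}"


if __name__ == "__main__":
    print(rewrite("2005/04/the-importance-of-a-rare-name/"))
    print(rewrite("about/"))
```

For the post URL used earlier, the year, month and name land in $1, $2 and $3, and $4 stays empty because there is no page number. A path that does not fit the pattern, such as about/, is left alone by this rule.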
