Using RewriteMap and DBM files for mass Apache redirects

It’s not unusual to need a large number of redirects in Apache. Even if you’ve been following best practice and keeping sites in separate vhost files, a couple of rounds of redirects can leave those files bloated, messy and just that little bit more difficult to maintain.

The Apache RewriteMap directive allows you to create a ‘function’ which can then be called in RewriteCond and RewriteRule rules. Basically it means you don’t need to explicitly state each rule in your vhost files.

It works by allowing you to specify an external map once in your file using the following syntax:
RewriteMap mapname maptype:maplocation
mapname – the name you’ve given to your external mapfile
maptype – the type of mapfile used. I’ll be showing you txt and dbm, but you can find all the accepted types in the Apache RewriteMap documentation
maplocation – the location of your file
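
As a rough illustration, the two map types covered in this post would be declared along these lines. The map name redirects and the file paths are made up for the example, and you would declare one or the other, not both under the same name:

```apache
# Hypothetical map name and paths, for illustration only.

# Plain-text map: one "key value" pair per line.
RewriteMap redirects txt:/etc/apache2/redirects.txt

# DBM map: the same data stored as an indexed hash file.
# RewriteMap redirects dbm:/etc/apache2/redirects.dbm
```

Note that RewriteMap is declared in the server or vhost config rather than in a .htaccess file.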

Both txt and dbm maptypes contain one-to-one mappings of keys to values. Where dbm files win hands down is that they’re indexed.

So what? Well, consider a txt file of 400 redirects. When Apache references an unindexed file it needs to process each line in turn until it finds a match, so a hit which matches the rule can mean stepping through the file. (Apache does cache keys it has already looked up, but every new key still means a scan.) Depending on the scope of your rules, the number of items in your file and the position of the correct entry, that could be some serious lost performance right there.

The indexed dbm files allow Apache to go straight to the entry it needs; the more redirects you have, the more noticeable the performance increase.

As already stated you can use your mapfile in any valid RewriteCond/RewriteRule, using the syntax below:
RewriteRule (.*) ${mapname:$1|default}
Special note should be made of the default section. Whilst optional, it’s pretty important: it’s the value that’s returned if no match is found in the mapfile. If you leave it out, a failed lookup returns an empty string.
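
If you would rather not rely on a default at all, one common pattern is to test the lookup in a RewriteCond and only redirect when the map returns something. This is a sketch rather than a drop-in config: the map name redirects, its path, and the assumption that keys are stored without a leading slash (for example old-page /new-page) are all mine.

```apache
RewriteEngine On

# Hypothetical map; keys are old paths without the leading slash,
# values are the new locations.
RewriteMap redirects dbm:/etc/apache2/redirects.dbm

# Look up the requested path. With no default given, a miss returns an
# empty string, so the condition fails and the request passes through.
RewriteCond ${redirects:$1} ^(.+)$

# %1 is the value captured by the RewriteCond above.
RewriteRule ^/(.*)$ %1 [R=301,L]
```

Capturing the value in the condition means the map only needs to be consulted once per request, and URLs that aren’t in the map are left alone.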

Hopefully it’s all made sense so far. Right, onto the example. To save work I’ve shamelessly ripped this from the Apache docs:

For example, we might use a mapfile to translate product names to product IDs for easier-to-remember URLs, using the following recipe:

Product to ID configuration
RewriteMap product2id dbm:/etc/apache2/productmap.dbm
RewriteRule ^/product/(.*) /prods.php?id=${product2id:$1|NOTFOUND} [PT]

We assume here that the prods.php script knows what to do when it receives an argument of id=NOTFOUND, which is what it gets when a product is not found in the lookup map.

To create the /etc/apache2/productmap.dbm file, we first create the txt file as below:

##
## productmap.txt - Product to ID map file
##
television 993
stereo 198
fishingrod 043
basketball 418
telephone 328

Then we convert this to a dbm file, using Apache’s httxt2dbm utility:
httxt2dbm -i productmap.txt -o productmap.dbm
Thus, when http://example.com/product/television is requested, the RewriteRule is applied and the request is internally mapped to /prods.php?id=993. And as previously discussed, because this is an indexed dbm file, we can do hundreds of redirects without them noticeably impacting performance :-)
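
One thing worth knowing: DBM comes in several flavours, and the Apache docs describe a way to name the type explicitly on both sides so that httxt2dbm and Apache agree on the format. A sketch, assuming SDBM is available in your build:

```apache
# Build the map with an explicit type (run in a shell, not in the vhost):
#   httxt2dbm -f SDBM -i productmap.txt -o productmap.dbm

# Tell RewriteMap which DBM type to expect.
RewriteMap product2id dbm=sdbm:/etc/apache2/productmap.dbm
```

Some DBM types write more than one file with a common base name (a .dir and a .pag file, for instance). As far as I’m aware you still reference only the base name in RewriteMap.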

Redirects shouldn’t be used as an alternative to good site maintenance. If content is no longer needed it really should be removed, archived or whatever; and crazy redirects which have no relation to the original destination are a quick way to tick users off.

3 Comments

  1. Have you run into any sort of a limit on size, with respect to DBM files? I have a group that’s looking to switch to a new CMS and in doing so implement about 60,000 1:1 redirects. Before trying, I’m trying to see if that’s too big even for DBM, before it starts getting totally out of hand.

    • Ian

      Hi Tad,

      I honestly haven’t pushed the number of rewrites that far. I think I’ve only ever done about 9k. However I’ve done some research and can’t find anything (other than the inbuilt size limits of dbm hashes) to stop you.

      The only thing I’d say is that the DBM files are read in at service startup so just keep an eye on how long Apache is taking to restart.

      Otherwise I think you should be good!
