Updating Drupal Pathauto module aliases using Cron

by Hugh on April 29, 2009

Drupal, as we all know, is great out of the box when it comes to search-engine-friendly URLs and the like.  However, the built-in Path module doesn’t always handle URLs as you’d like, especially on large sites.  Pathauto is the module of choice for those of us who are fussy about how our URLs are structured, but it does have some shortcomings.  One of these affects owners of really, really large sites.  If you have a site with tens (or hundreds) of thousands of nodes, updating the URL aliases can be a tedious process: it is only achievable via the admin interface, and you can only create a few (500 or so) at a time.  So, how does one automate this?

After a long search, I tracked down various solutions (view post here), but I’ll outline the one that worked for me.

For background on my Drupal install, I’m running Drupal 6.10 and Pathauto 6.x-1.1.  All sites are installed using the multi-site method, which is a fantastic feature of Drupal, allowing for great power across all your sites on a small footprint.

Ok, here goes.

1. Create a file in /var/www/drupal (or wherever your core Drupal files are located) called cron-update-pathauto.php.  Set its permissions so the web server can read it; chmod 500 should suffice, provided the web server user owns the file.

2. In the file, add the following code and save:

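<?php
// Load Drupal's bootstrap code and the Pathauto include files.
// Paths are relative to the Drupal root, where this script lives.
include_once './includes/bootstrap.inc';
include_once './sites/all/modules/pathauto/pathauto.inc';
include_once './sites/all/modules/pathauto/pathauto_node.inc';

// Bring up the full Drupal environment (database, modules, settings).
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// Generate aliases for unaliased nodes, up to Pathauto's bulk-update limit.
node_pathauto_bulkupdate();
?>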

3. Test by visiting http://yoursite.tld/cron-update-pathauto.php.  Take note of any unaliased URLs beforehand so you can see whether the script has worked.  Remember, each run only creates as many aliases as the bulk-update maximum set in Pathauto’s admin settings, so don’t expect Rome to be built in a day!

4. Set up a cron job to execute the script at whatever interval you’d like.  If you have lots and lots of content created daily, and you’d like to be totally fussy about it all having the right URL aliases immediately, set your cron to run maybe once an hour.  Otherwise once a day should suffice.
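
As a rough sketch of step 4, here is one way the hourly cron entry might look.  It assumes wget is installed and that the script is reachable at the placeholder address from step 3 (yoursite.tld is not a real host, so substitute your own).  Requesting the script over HTTP rather than running it with the PHP command line is a choice I'm assuming here because, on a multi-site install, the hostname in the URL is what tells Drupal which site to bootstrap.

```shell
# Assumption: yoursite.tld is the placeholder host from step 3; replace it with your own.
URL='http://yoursite.tld/cron-update-pathauto.php'

# Crontab entry that requests the script at minute 0 of every hour.
# wget flags: -q quiet, -O /dev/null discards the response body, -t 1 tries only once.
CRON_LINE="0 * * * * wget -q -O /dev/null -t 1 $URL"

# Print the entry so it can be reviewed before it is installed.
echo "$CRON_LINE"
```

To install it, run crontab -e and paste in the printed line.  For a once-a-day run, change the schedule fields to something like 0 3 * * * (03:00 is just an example time).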

That’s it!  Enjoy Drupalling!
