Games with redirects
Warning! This page and the related Python scripts are under construction
You have a mind-numbing set of Redirect and RedirectMatch directives in
your .htaccess and are having trouble keeping them all working. You generally
find out from 404 errors that a rule is not doing what you thought it did.
Why not test the redirects locally first and deal with the problems all at once?
Assumptions of these instructions
- You'll test with Apache httpd 2.2.x since that is what is running your site.
- You'll test on Windows (or figure out yourself how to modify the instructions as appropriate for your platform).
- You're willing to install Apache httpd and Python on your computer.
- Most of your redirects are implemented with Redirect and RedirectMatch, and they're of relatively simple form (or at least tell me what forms of those directives caused your testing to fail).
- You know how to use a plain text editor to edit text files.
In the instructions I'll use C:\Users\Jeff\test_redirects for the test
area; it will have a subdirectory called public_html which in turn includes
the .htaccess file being tested. (Create a suitable directory with
those contents now.)
Download test_one_redirect.py and test_htaccess_redirects.py to the test area.
Install, configure, and start Apache httpd
Head over to Apache Lounge and
download the latest Apache httpd 2.2.x for Windows. Example file: httpd-2.2.24-win32.zip
Note: Apache Lounge is not affiliated with the Apache Software Foundation. It is supported by members of the user community.
Extract it to C:\, such that C:\Apache2 is created.
(Or extract it elsewhere, but fix several references to C:/Apache2
in httpd.conf as mentioned in the readme file.)
This server is only for testing your redirects locally, so edit C:\Apache2\conf\httpd.conf and change the line Listen 80 to Listen 127.0.0.1:80
so that Apache httpd won't listen for requests arriving over the network.
That section of httpd.conf will look like this when done:
#
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, instead of the default. See also the <VirtualHost>
# directive.
#
# Change this to Listen on specific IP addresses as shown below to
# prevent Apache from glomming onto all bound IP addresses.
#
#Listen 12.34.56.78:80
Listen 127.0.0.1:80
Additionally, you need to include another .conf file from
your work area by adding Include C:/Users/Jeff/test_redirects/test.conf
at the bottom of httpd.conf.
That section of httpd.conf will look like this when done:
<IfModule ssl_module>
SSLRandomSeed startup builtin
SSLRandomSeed connect builtin
</IfModule>
Include C:/Users/Jeff/test_redirects/test.conf
Yes, you need to use forward slashes in file and directory names when
they appear in .conf files.
Now create a new .conf file in your work area,
C:\Users\Jeff\test_redirects\test.conf, with these contents:
Listen 127.0.0.1:9999
<VirtualHost *:9999>
DocumentRoot C:/Users/Jeff/test_redirects/public_html
<Directory C:/Users/Jeff/test_redirects/public_html>
AllowOverride All
Order Deny,Allow
Allow from All
</Directory>
</VirtualHost>
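If your test area lives somewhere else, you can generate test.conf instead of editing it by hand. The following is only a sketch: the function names render_test_conf and to_conf_path are made up for this example, it assumes the work area contains a public_html subdirectory as described above, and it does not quote paths, so it will not work for directories with spaces in their names.

```python
def to_conf_path(windows_path):
    """Swap backslashes for the forward slashes that .conf files need."""
    return windows_path.replace("\\", "/")


def render_test_conf(work_area, port=9999):
    """Return the text of a test.conf like the one shown above.

    work_area is the test directory (hypothetical parameter name);
    public_html under it is used as the DocumentRoot.
    """
    root = to_conf_path(work_area).rstrip("/") + "/public_html"
    lines = [
        "Listen 127.0.0.1:%d" % port,
        "<VirtualHost *:%d>" % port,
        "DocumentRoot %s" % root,
        "<Directory %s>" % root,
        "AllowOverride All",
        "Order Deny,Allow",
        "Allow from All",
        "</Directory>",
        "</VirtualHost>",
    ]
    return "\n".join(lines) + "\n"
```

Write the returned text to test.conf in the work area, and remember that the Include line in httpd.conf has to point at the same file.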
Now start Apache httpd in a command prompt:
Microsoft Windows [Version 6.2.9200]
(c) 2012 Microsoft Corporation. All rights reserved.
C:\Users\Jeff>cd \apache2
C:\Apache2>bin\httpd
It will keep running until you press Control-C and wait a few seconds.
Install Python
What, Python isn't installed yet? Go to the Python download site
and download and install the latest Python 2.7.x Windows Installer. Then, add
the Python directory to PATH in the control panel. Normally that would be C:\Python27.
Try this command and check that it displays the version of Python you installed instead of a nasty error:
C:\Users\Jeff\test_redirects>python --version
Python 2.7.3
Testing
Now when you go to host 127.0.0.1:9999 your .htaccess file will be processed. But don't go there with a browser. You don't really have a site there; instead, you have just enough to see if your redirects are working.
How can you test a redirect to see if Apache httpd returns the desired target URL? Use my little Python script called test_one_redirect.py.
C:\Users\Jeff\test_redirects>python test_one_redirect.py 127.0.0.1 9999 /projects.html
Testing /projects.html... -> http://www.example.com/projects/index.html
(But different tests and results will make sense for your redirects.)
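In case you wonder what such a check involves, here is a minimal sketch. It is not the source of test_one_redirect.py; the names fetch_redirect and format_result are invented for this example, and the output format is only modeled on the sample above. The idea is to send one request, not follow the redirect, and look at the status code and the Location header that Apache httpd sends back.

```python
try:
    from http.client import HTTPConnection  # Python 3
except ImportError:
    from httplib import HTTPConnection      # Python 2.7


def fetch_redirect(host, port, path, timeout=10):
    """Request path once and return (status, Location header or None).

    HTTPConnection never follows redirects on its own, so the values
    returned are exactly what the server answered for this one request.
    """
    conn = HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def format_result(path, status, location):
    """Build a one-line report for a single tested path."""
    if location is None:
        return "Testing %s... no redirect (HTTP %d)" % (path, status)
    return "Testing %s... -> %s" % (path, location)
```

With the test server running, fetch_redirect("127.0.0.1", 9999, "/projects.html") should return the status and target that your .htaccess produces for that path.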
You can test your redirects one by one like that, but it is more useful to test the whole slew of redirects in your .htaccess file.
Try this:
C:\Users\Jeff\test_redirects>python test_htaccess_redirects.py --help
Usage: test_htaccess_redirects.py [options]

Options:
  -h, --help            show this help message and exit
  -a HTACCESS, --htaccess=HTACCESS
                        override default .htaccess filename (.htaccess)
  -v VALIDURLS, --valid-urls=VALIDURLS
                        override default list of valid urls (valid_urls.txt)
  -p PORT, --port=PORT  override default port (80)
  -o HOST, --host=HOST  override default host (127.0.0.1)
Your .htaccess file should be in the public_html subdirectory
and it is processed on port 9999, so invoke the script like this:
C:\Users\Jeff\test_redirects>python test_htaccess_redirects.py --htaccess=public_html/.htaccess --port=9999
Verifying that targets of redirects are in the sitemap...
Testing /START-OF-URI/projects.html... -> http://www.example.com/projects/index.html
Testing /projects.html... -> http://www.example.com/projects/index.html
ERROR: expected to be redirected to [http://www.example.com/projects/projects.html] instead of to [http://www.example.com/projects/index.html] from [/projects.html]
Directive: [Redirect 301 /projects.html http://www.example.com/projects/projects.html] (Redirect/301//projects.html/http://www.example.com/projects/projects.html)
Testing /START-OF-URI/projects.html... -> http://www.example.com/projects/index.html
ERROR: expected to be redirected to [http://www.example.com/projects/projects.html] instead of to [http://www.example.com/projects/index.html] from [(?i)/projects.html]
Directive: [RedirectMatch 301 (?i)/projects.html http://www.example.com/projects/projects.html] (RedirectMatch/301/(?i)/projects.html/http://www.example.com/projects/projects.html)
You can see that my .htaccess rules aren't working very well.
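The comparison behind those ERROR lines can be pictured roughly like this. This is a sketch and not the code of test_htaccess_redirects.py: parse_redirects and check_redirect are hypothetical names, only plain Redirect directives are handled (RedirectMatch lines are skipped, and so are quoted arguments), and the error text just imitates the output above. It relies on two facts about Redirect: the status argument is optional and defaults to 302, and it may be a number or one of the keywords permanent, temp, seeother, or gone.

```python
STATUS_KEYWORDS = {"permanent": 301, "temp": 302, "seeother": 303, "gone": 410}


def parse_redirects(htaccess_text):
    """Return (status, source_path, target_url) for each simple Redirect line."""
    rules = []
    for line in htaccess_text.splitlines():
        parts = line.split()
        # Skip comments, blank lines, RedirectMatch, and everything else.
        if len(parts) < 3 or parts[0].lower() != "redirect":
            continue
        args = parts[1:]
        status = 302  # Apache's default when no status is given
        first = args[0].lower()
        if first in STATUS_KEYWORDS:
            status, args = STATUS_KEYWORDS[first], args[1:]
        elif first.isdigit():
            status, args = int(first), args[1:]
        if len(args) != 2:
            continue  # e.g. "Redirect gone /path" has no target to compare
        rules.append((status, args[0], args[1]))
    return rules


def check_redirect(rule, actual_status, actual_location):
    """Return None if the server's answer matches the rule, else an error string."""
    status, source, target = rule
    if actual_location != target:
        return ("ERROR: expected to be redirected to [%s] instead of to [%s] "
                "from [%s]" % (target, actual_location, source))
    if actual_status != status:
        return ("ERROR: expected status %d instead of %d from [%s]"
                % (status, actual_status, source))
    return None
```

Request each source path from the test server, pass the status and Location header you get back to check_redirect, and any rule that is shadowed by an earlier one shows up as an error, just like /projects.html does in my output.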
Todos
- explain here all the many limitations of test_htaccess_redirects.py, such as the fact that it only tests Redirect and relatively simple RedirectMatch directives, and doesn't even handle all variations of those
- set user agent and implement throttling so that it can be directed against your real site to verify your redirects once .htaccess has been uploaded
- explain how to use the valid_urls.txt file to confirm that the targets of the redirects actually exist on your site
- if anybody ever cares, remove some of the limitations
- what about generating .htaccess redirects from a higher level construct?
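Until the valid_urls.txt item above is written up, here is the general idea as a sketch. The file format is an assumption on my part for this example (one full URL per line, blank lines ignored), and load_valid_urls and missing_targets are made-up names, not functions from the scripts.

```python
def load_valid_urls(text):
    """Return the set of URLs listed in text, assuming one URL per line."""
    return set(line.strip() for line in text.splitlines() if line.strip())


def missing_targets(targets, valid_urls):
    """Return the redirect targets that are not in the set of valid URLs."""
    return [target for target in targets if target not in valid_urls]
```

Any target that comes back from missing_targets is a redirect that would send visitors to a page your site doesn't have.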