Games with redirects
Warning! This page and the related Python scripts are under construction
You have a mind-numbing set of Redirect and RedirectMatch directives in your .htaccess and are having trouble keeping them all working. You generally find out from 404 errors that a rule is not doing what you thought it did.
Why not test the redirects locally first and deal with the problems all at once?
Assumptions of these instructions
- You'll test with Apache httpd 2.2.x since that is what is running your site.
- You'll test on Windows (or figure out yourself how to modify the instructions as appropriate for your platform).
- You're willing to install Apache httpd and Python on your computer.
- Most of your redirects are implemented with Redirect and RedirectMatch, and they're of relatively simple form (or at least tell me what forms of those directives caused your testing to fail).
- You know how to use a plain text editor to edit text files.
In the instructions I'll use C:\Users\Jeff\test_redirects for the test area; it will have a subdirectory called public_html, which in turn includes the .htaccess file being tested. (Create a suitable directory with those contents now.)
Download test_one_redirect.py and test_htaccess_redirects.py to the test area.
Install, configure, and start Apache httpd
Head over to Apache Lounge and download the latest Apache httpd 2.2.x for Windows. Example file: httpd-2.2.24-win32.zip
Note: Apache Lounge is not affiliated with the Apache Software Foundation. It is supported by members of the user community.
Extract it to C:\, such that C:\Apache2 is created. (Or extract it elsewhere, but fix several references to C:/Apache2 in httpd.conf as mentioned in the readme file.)
This server is only for testing your redirects locally, so edit C:\Apache2\conf\httpd.conf and change the line Listen 80 to Listen 127.0.0.1:80 so that Apache httpd won't listen for requests arriving over the network.
That section of httpd.conf will look like this when done:
    #
    # Listen: Allows you to bind Apache to specific IP addresses and/or
    # ports, instead of the default. See also the <VirtualHost>
    # directive.
    #
    # Change this to Listen on specific IP addresses as shown below to
    # prevent Apache from glomming onto all bound IP addresses.
    #
    #Listen 12.34.56.78:80
    Listen 127.0.0.1:80
Additionally, you need to include another .conf file from your work area by adding Include C:/Users/Jeff/test_redirects/test.conf at the bottom of httpd.conf.
That section of httpd.conf will look like this when done:
    <IfModule ssl_module>
    SSLRandomSeed startup builtin
    SSLRandomSeed connect builtin
    </IfModule>
    Include C:/Users/Jeff/test_redirects/test.conf
Yes, you need to use forward slashes in file and directory names when they appear in .conf files.
Now create a new .conf file in your work area, C:\Users\Jeff\test_redirects\test.conf, with these contents:
    Listen 127.0.0.1:9999
    <VirtualHost *:9999>
        DocumentRoot C:/Users/Jeff/test_redirects/public_html
        <Directory C:/Users/Jeff/test_redirects/public_html>
            AllowOverride All
            Order Deny,Allow
            Allow from All
        </Directory>
    </VirtualHost>
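If your test area lives somewhere other than C:\Users\Jeff\test_redirects, the file can be generated instead of edited by hand. The following is only a sketch and is not one of the downloadable scripts; the function name render_test_conf and its parameters are made up for illustration. It converts a Windows path to the forward-slash form and fills in the same directives as the test.conf contents given here. Paths containing spaces would need quoting, which this sketch does not attempt.

```python
def render_test_conf(work_area, port=9999):
    """Return the text of test.conf for the given work area (hypothetical helper)."""
    # Apache wants forward slashes in .conf files, even on Windows.
    doc_root = work_area.replace("\\", "/").rstrip("/") + "/public_html"
    lines = [
        "Listen 127.0.0.1:%d" % port,
        "<VirtualHost *:%d>" % port,
        "    DocumentRoot %s" % doc_root,
        "    <Directory %s>" % doc_root,
        "        AllowOverride All",
        "        Order Deny,Allow",
        "        Allow from All",
        "    </Directory>",
        "</VirtualHost>",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file for the work area used in these instructions.
    print(render_test_conf(r"C:\Users\Jeff\test_redirects"))
```

Redirect the printed output to test.conf in your work area, or paste it in with your text editor.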
Now start Apache httpd in a command prompt:
    Microsoft Windows [Version 6.2.9200]
    (c) 2012 Microsoft Corporation. All rights reserved.

    C:\Users\Jeff>cd \apache2

    C:\Apache2>bin\httpd
It will keep running until you press Control-C and wait a few seconds.
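Before moving on, you may want to confirm that the test port is actually accepting connections. One way, sketched here with a made-up helper named is_listening, is to try opening a TCP connection to 127.0.0.1:9999. It only shows that something is listening there, not that it is Apache httpd or that test.conf was read correctly.

```python
import socket


def is_listening(host="127.0.0.1", port=9999, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused or timed out: nothing is listening there.
        return False
    conn.close()
    return True
```

Called from a second command prompt while httpd is running, is_listening() should return True; after you stop httpd with Control-C it should return False.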
Install Python
What, Python isn't installed yet? Go to the Python download site and download and install the latest Python 2.7.x Windows Installer. Then, add the Python directory to PATH in the control panel. Normally that would be C:\Python27.
Try this command and see whether it displays the version of Python you installed or gives you a nasty error:
    C:\Users\Jeff\test_redirects>python --version
    Python 2.7.3
Testing
Now when you go to host 127.0.0.1:9999 your .htaccess file will be processed. But don't go there with a browser. You don't really have a site there; instead, you have just enough to see if your redirects are working.
How can you test a redirect to see if Apache httpd returns the desired target URL? Use my little Python script called test_one_redirect.py.
    C:\Users\Jeff\test_redirects>python test_one_redirect.py 127.0.0.1 9999 /projects.html
    Testing /projects.html... -> http://www.example.com/projects/index.html
(But different tests and results will make sense for your redirects.)
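If you're curious what a check like that involves, here is a minimal sketch of the idea. It is not the contents of test_one_redirect.py; the function name get_redirect_target and its connection_factory parameter are made up for this illustration. It relies on the fact that Python's standard HTTP client does not follow redirects on its own, so the Location header of the redirect response can be read directly.

```python
try:
    import http.client as httplib  # Python 3
except ImportError:
    import httplib  # Python 2.7


def get_redirect_target(host, port, path,
                        connection_factory=httplib.HTTPConnection):
    """Request path and return (status, Location header or None).

    The client does not follow redirects, so the response seen here is
    the redirect itself rather than the page it points to.
    connection_factory exists only so a stand-in can be used in tests.
    """
    conn = connection_factory(host, port, timeout=10)
    try:
        # HEAD is enough: only the status line and headers matter.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

With the test server running, get_redirect_target("127.0.0.1", 9999, "/projects.html") should return the redirect status together with the target URL. A Location of None means the server did not answer with a redirect, most likely because no rule matched that path.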
You can test your redirects one by one like that, but it is more useful to test the whole slew of redirects in your .htaccess file.
Try this:
    C:\Users\Jeff\test_redirects>python test_htaccess_redirects.py --help
    Usage: test_htaccess_redirects.py [options]

    Options:
      -h, --help            show this help message and exit
      -a HTACCESS, --htaccess=HTACCESS
                            override default .htaccess filename (.htaccess)
      -v VALIDURLS, --valid-urls=VALIDURLS
                            override default list of valid urls (valid_urls.txt)
      -p PORT, --port=PORT  override default port (80)
      -o HOST, --host=HOST  override default host (127.0.0.1)
Your .htaccess file should be in the public_html subdirectory and it is processed on port 9999, so invoke the script like this:
    C:\Users\Jeff\test_redirects>python test_htaccess_redirects.py --htaccess=public_html/.htaccess --port=9999
    Verifying that targets of redirects are in the sitemap...
    Testing /START-OF-URI/projects.html... -> http://www.example.com/projects/index.html
    Testing /projects.html... -> http://www.example.com/projects/index.html
    ERROR: expected to be redirected to [http://www.example.com/projects/projects.html] instead of to [http://www.example.com/projects/index.html] from [/projects.html]
    Directive: [Redirect 301 /projects.html http://www.example.com/projects/projects.html] (Redirect/301//projects.html/http://www.example.com/projects/projects.html)
    Testing /START-OF-URI/projects.html... -> http://www.example.com/projects/index.html
    ERROR: expected to be redirected to [http://www.example.com/projects/projects.html] instead of to [http://www.example.com/projects/index.html] from [(?i)/projects.html]
    Directive: [RedirectMatch 301 (?i)/projects.html http://www.example.com/projects/projects.html] (RedirectMatch/301/(?i)/projects.html/http://www.example.com/projects/projects.html)
You can see that my .htaccess rules aren't working very well.
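To give a feel for what a whole-file check involves, here is a rough sketch. It is not the code of test_htaccess_redirects.py. The names parse_redirects, find_mismatches and fetch_location are invented for this example, and the parsing is deliberately naive (whitespace splitting, no quoted arguments). The one Apache fact it relies on is that the status argument is optional and defaults to 302. Only plain Redirect directives are compared, because a RedirectMatch source is a regular expression and a matching test path would have to be invented first.

```python
STATUS_KEYWORDS = {"permanent": 301, "temp": 302, "seeother": 303, "gone": 410}


def parse_redirects(htaccess_text):
    """Return a list of (directive, status, source, target) tuples."""
    found = []
    for line in htaccess_text.splitlines():
        parts = line.split()  # quoted arguments with spaces are not handled
        if not parts or parts[0].lower() not in ("redirect", "redirectmatch"):
            continue  # blank lines, comments and other directives
        directive, args = parts[0], parts[1:]
        status = 302  # Apache's default when the status argument is omitted
        if args and (args[0].lower() in STATUS_KEYWORDS or args[0].isdigit()):
            word = args.pop(0).lower()
            status = STATUS_KEYWORDS.get(word) or int(word)
        if len(args) == 2:  # skips forms without a target, e.g. "Redirect gone /x"
            found.append((directive, status, args[0], args[1]))
    return found


def find_mismatches(htaccess_text, fetch_location):
    """Compare each plain Redirect's target with what the server returns.

    fetch_location(path) must return the Location header for a request
    to path. Returns a list of (source, expected, actual) tuples.
    """
    problems = []
    for directive, status, source, target in parse_redirects(htaccess_text):
        if directive.lower() != "redirect":
            continue
        actual = fetch_location(source)
        if actual != target:
            problems.append((source, target, actual))
    return problems
```

Given a fetch_location that asks the test server on port 9999, the Redirect directive in the output above would come back as one mismatch: the source /projects.html, the expected projects/projects.html target, and the actual projects/index.html target.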
Todos
- explain here all the many limitations of test_htaccess_redirects.py, such as the fact that it only tests Redirect and relatively simple RedirectMatch directives, and doesn't even handle all variations of those
- set user agent and implement throttling so that it can be directed against your real site to verify your redirects once .htaccess has been uploaded
- explain how to use the valid_urls.txt file to confirm that the targets of the redirects actually exist on your site
- if anybody ever cares, remove some of the limitations
- what about generating .htaccess redirects from a higher level construct?
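Until the valid_urls.txt explanation is written, the general idea of checking redirect targets against a list of known-good URLs can be sketched as follows. This is a guess at the concept only. It assumes a file with one URL per line, which may not match the format the script actually expects, and the helper names load_valid_urls and missing_targets are made up.

```python
def load_valid_urls(lines):
    """Return the set of non-blank lines, ASSUMING one URL per line."""
    return set(line.strip() for line in lines if line.strip())


def missing_targets(targets, valid_urls):
    """Return the redirect targets not found in valid_urls, in order."""
    return [target for target in targets if target not in valid_urls]
```

For example, open valid_urls.txt, pass the file object to load_valid_urls, and then pass your redirect targets and the resulting set to missing_targets; anything it returns is a redirect pointing at a page the list does not know about.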