Introduction to mod_rewrite

Embed Size (px)

Citation preview

  • 8/14/2019 Introduction to mod_rewrite

    1/68

    mod_rewriteIntroduction to mod_rewrite

    Rich Bowen, Web Guy, Asbury [email protected]

    http://people.apache.org/~rbowen/

    ApacheCon US, 2006

    1

    http://people.apache.org/~rbowen/http://people.apache.org/~rbowen/mailto:[email protected]:[email protected]
  • 8/14/2019 Introduction to mod_rewrite

    2/68

    OutlineRegex basics

    RewriteRule

    RewriteCondRewriteMap

    The evils of .htaccess les

    Assorted odds and ends

    2

  • 8/14/2019 Introduction to mod_rewrite

    3/68

    mod_rewriteis not magic

    Fear, more thancomplexity, makesmod_rewrite difcult

    3

  • 8/14/2019 Introduction to mod_rewrite

    4/68

    Although, it is complex

    4

    ``The great thing aboutmod_rewrite

    is it gives you all the congurabilityand exibility of Sendmail. The downside to

    mod_rewrite is that it gives you allthe congurability and exibility of

    Sendmail.''-- Brian Behlendorf

  • 8/14/2019 Introduction to mod_rewrite

    5/68

    And lets not forget

    voodoo!

    5

    `` Despite the tons of

    examples and docs,mod_rewrite is voodoo.Damned cool voodoo,

    but still voodoo. ''

    -- Brian Moore

  • 8/14/2019 Introduction to mod_rewrite

    6/68

    Line noise

    6

    "Regular expressionsare just line noise.

    I hate them!"

    (Heard 20 times perday on IRC)

    When you hear it oftenenough, you start tobelieve it

  • 8/14/2019 Introduction to mod_rewrite

    7/68

    Now that thats out of the way

    Regular expressions are not magic

    They are an algebraic expression of text

    patterns

    Once you get over the mysticism, it can still be

    hard, but it's no longer mysterious

    7

  • 8/14/2019 Introduction to mod_rewrite

    8/68

    Vocabulary

    Were going to start with a very small

    vocabulary, and work up from thereMost of the time, this vocabulary is all thatyoull need

    8

  • 8/14/2019 Introduction to mod_rewrite

    9/68

    .

    . matches any character

    a.b matches acb, axb, a@b, and so on

    It also matches Decalb and Marbelized

    9

  • 8/14/2019 Introduction to mod_rewrite

    10/68

    +

    + means that something needs to appear oneor more times

    a+ matches a, aa, aaa, andStellaaaaaaaaaa!

    The thing that is repeating isnt necessarily

    just a single character

    10

  • 8/14/2019 Introduction to mod_rewrite

    11/68

    *

    * means that the previous thingy needs tomatch zero or more times

    This is subtly different from + and somefolks miss the distinction

    giraf*e matches giraffe and girafe

    It also matches girae

    11

  • 8/14/2019 Introduction to mod_rewrite

    12/68

    ?

    ? means that the previous thingy needs to

    match zero or one timesIn other words, it makes it optional

    colou?r matches color and colour

    12

  • 8/14/2019 Introduction to mod_rewrite

    13/68

    ^^ is called an anchor

    It requires that the string start with thepattern

    ^A matches ANDY but it does not matchCANDY

    Pronounced hat or caret or circumexor pointy-up thingy

    13

  • 8/14/2019 Introduction to mod_rewrite

    14/68

    $

    $ is the other anchor

    It requires that the string end with thepattern

    a$ matches canada but not afghanistan

    14

  • 8/14/2019 Introduction to mod_rewrite

    15/68

    ( )

    ( ) allows you to group several charactersinto one thingy

    This allows you to apply repetitioncharacters (*, +, and ?) to a larger group of characters.

    (ab)+ matches ababababababab

    15

  • 8/14/2019 Introduction to mod_rewrite

    16/68

    ( ), continued( ) allows you to capture a match so that youcan use it later.

    The value of the matched bit is stored in avariable called a backreferenceIt might be called $1 or %1 depending on thecontext

    The second match is called $2 (or %2) and soon

    16

  • 8/14/2019 Introduction to mod_rewrite

    17/68

    [ ]

    [ ] denes a character class

    [abc] matches a or or b or c

    c[uoa]t matches cut, cot, or cat

    It also matches cote

    It does not match coat

    17

  • 8/14/2019 Introduction to mod_rewrite

    18/68

    NOT

    In mod_rewrite regular expressions, ! negatesany match

    In a character class, ^ negates the characterclass

    [^ab] matches any character except for a or

    b.

    18

  • 8/14/2019 Introduction to mod_rewrite

    19/68

    So, what does this have to do withApache?

    mod_rewrite lets you match URLs (or otherthings) and transform the target of the URLbased on that match.

    19

    RewriteEngine On# Burninate ColdFusion!RewriteRule (.*)\.cfm$ $1.php [PT]# And there was much rejoicing. Yaaaay.

  • 8/14/2019 Introduction to mod_rewrite

    20/68

    RewriteEngineRewriteEngine On enablesthe mod_rewrite rewritingengine

    No rewrite rules will beperformed unless this isenabled in the active scope

    It never hurts to say it again

    20

  • 8/14/2019 Introduction to mod_rewrite

    21/68

    RewriteLog

    21

    RewriteLog /www/logs/rewrite_logRewriteLogLevel 9

    You should turn on the RewriteLog before youdo any troubleshooting.

  • 8/14/2019 Introduction to mod_rewrite

    22/68

    RewriteRule pattern target [ags]

    The pattern part is the regular expressionthat you want to look for in the URL

    If they try to go HERE send them HEREinstead.

    The behavior can be further modied by theuse of one or more ags

    RewriteRule

    22

  • 8/14/2019 Introduction to mod_rewrite

    23/68

    Example 1

    SEO - Search Engine Optimization

    Frequently based on misconceptions abouthow search engines work

    Typical strategy is to make clean URLs -Avoid?argument=value&xyz=123

    23

  • 8/14/2019 Introduction to mod_rewrite

    24/68

    URL beauticationA URL looks like:

    24

    We would prefer that it looked like

    Its easier to type, and easier toremember

    http://example.com/book/bowen/apachehttp://example.com/book/bowen/apachehttp://example/http://example/
  • 8/14/2019 Introduction to mod_rewrite

    25/68

    Example 1, contd

    User does not notice that the transformationhas been made

    Used $1 and $2 to capture what was

    requestedSlight oversimplication. Should probably use([^/]+) instead.

    25

    RewriteRule ^/book/(.*)/(.*) \/cgi-bin/book.cgi?topic=$1&author=$2 [PT]

  • 8/14/2019 Introduction to mod_rewrite

    26/68

    Flags

    Flags can modify the behavior of aRewriteRule

    I used a ag in the example, and didnt tellyou what it meant

    So, here are the ags

    26

  • 8/14/2019 Introduction to mod_rewrite

    27/68

  • 8/14/2019 Introduction to mod_rewrite

    28/68

    By the way ...Default is to treat the rewrite target as ale path

    If the target starts in http:// or https://then it is treated as a URL, and a [R] isassumed (Redirect)

    In a .htaccess le, or in scope,the le path is assumed to be relative tothat scope

    28

  • 8/14/2019 Introduction to mod_rewrite

    29/68

    RewriteRule ags[Flag] appears at end of RewriteRule

    More than one ag separated by commas

    I recommend using ags even when thedefault is what you want - it makes it easierto read later

    Each ag has a longer form, which you can

    use for greater readability.Theres *lots* of ags

    29

  • 8/14/2019 Introduction to mod_rewrite

    30/68

    Chain

    [C] or [Chain]

    Rules are considered as a whole. If one fails,the entire chain is abandoned

    30

  • 8/14/2019 Introduction to mod_rewrite

    31/68

    Cookie[CO=NAME:Value:Domain[:lifetime[:path]]

    Long form [cookie=...]

    Sets a cookie

    31

    RewriteRule ^/index.html - [CO=frontdoor:yes:.example.com]

    In this case, the default values forpath (/) and lifetime (session)are assumed.

  • 8/14/2019 Introduction to mod_rewrite

    32/68

    Env[E=var:val]

    Long form [env=...]

    Sets environment variable

    Note that most of the time, SetEnvIf worksjust ne

    32

    RewriteRule \.jpg$ - [env=dontlog:1]

  • 8/14/2019 Introduction to mod_rewrite

    33/68

    Forbidden[F] or [Forbidden] forces a 403 Forbidden

    response

    Consider mod_security instead for pattern-based URL blocking

    33

    RewriteEngine OnRewriteRule (cmd|root)\.exe - [F]

    You could use this in conjunction with [E]

    to avoid logging that stuff RewriteRule (cmd|root)\.exe - [F,E=dontlog:1]CustomLog /var/log/apache/access_log combined \

    env=!dontlog

  • 8/14/2019 Introduction to mod_rewrite

    34/68

    Handler[H=application/x-httpd-php]

    Forces the use of a particular handler tohandle the resulting URL

    Can often be replaced with using [PT] but isquite a bit faster

    Available in Apache 2.2

    34

  • 8/14/2019 Introduction to mod_rewrite

    35/68

    Last[L] indicates that youve reached the end of the current ruleset

    Any rules following this will be considered asa completely new ruleset

    Its a good idea to use it, even when it would

    otherwise be default behavior. It helps makerulesets more readable.

    35

  • 8/14/2019 Introduction to mod_rewrite

    36/68

    NextThe [N] or [Next] ag is a good way to getyourself into an innite loop

    It tells mod_rewrite to run the entireruleset again from the beginning

    Can be useful for doing global search andreplace stuff

    I nd RewriteMap much more useful in thosesituations

    36

  • 8/14/2019 Introduction to mod_rewrite

    37/68

    NoCase

    [NC] or [nocase] makes the RewriteRule caseinsensitive

    Regular expressions are case-sensitive bydefault

    37

  • 8/14/2019 Introduction to mod_rewrite

    38/68

    NoEscape

    [NE] or [noescape] disables the defaultbehavior of escaping (url-encoding) specialcharacters like #, ?, and so on

    Useful for redirecting to a page #anchor

    38

  • 8/14/2019 Introduction to mod_rewrite

    39/68

    NoSubreq

    [NS] or [nosubreq] ensures that the rulewont run on subrequests

    Subrequests are things like SSI evaluations

    Image and css requests are NOT subrequests

    39

  • 8/14/2019 Introduction to mod_rewrite

    40/68

    Proxy[P] rules are served through a proxysubrequest

    mod_proxy must be installed for this ag to

    work

    40

    RewriteEngine OnRewriteRule (.*)\.(jpg|gif|png) \

    http://images.example.com/http://images.example.com/http://images.example.com/
  • 8/14/2019 Introduction to mod_rewrite

    41/68

    Passthrough

    [PT] or [passthrough]

    Hands it back to the URL mapping phase

    Treat this as though this was the originalrequest

    41

  • 8/14/2019 Introduction to mod_rewrite

    42/68

    QSAppend

    [QSA] or [qsappend] appends to the querystring, rather than replacing it.

    42

  • 8/14/2019 Introduction to mod_rewrite

    43/68

    Redirect

    [R] or [redirect] forces a 302 Redirect

    Note that in this case, the user will see thenew URL in their browser

    This is the default behavior when the target

    starts with http:// or https://

    43

    k

  • 8/14/2019 Introduction to mod_rewrite

    44/68

    Skip

    [S=n] or [skip=n] skips the next nRewriteRules

    This is very weird

    Ive never used this in the real world. Couldbe used as a sort of inverse RewriteCond(viz WordPress)

    44

    RewriteRule %{REQUEST_FILENAME} -f [S=15]

  • 8/14/2019 Introduction to mod_rewrite

    45/68

    Type[T=text/html]

    Forces the Mime type on the resulting URL

    Used to do this instead of [H] in somecontexts

    Good to ensure that le-path redirects are

    handled correctly

    45

    RewriteRule ^(.+\.php)s$ $1 [T=application/x-httpd-php-source]

    i C d

  • 8/14/2019 Introduction to mod_rewrite

    46/68

    RewriteCondCauses a rewrite to be conditional

    Can check the value of any variable andmake the rewrite conditional on that.

    46

    RewriteCond TestString Pattern [Flags]

    R i C d

  • 8/14/2019 Introduction to mod_rewrite

    47/68

    RewriteCond

    The test string can be just about anything

    Env vars, headers, or a literal string

    Backreferences become %1, %2, etc

    47

    L i

  • 8/14/2019 Introduction to mod_rewrite

    48/68

    Looping

    Looping occurs when the target of a rewriterule matches the pattern

    This results in an innite loop of rewrites

    48

    RewriteCond %{REQUEST_URI} \!^/example.html

    RewriteRule ^/example /example.html [PT]

    C di i l i

  • 8/14/2019 Introduction to mod_rewrite

    49/68

    Conditional rewritesRewrites conditional on some arbitrarythingy

    Only rst Rule is dependent

    49

    RewriteEngine onRewriteCond %{TIME_HOUR}%{TIME_MIN} >0700

    RewriteCond %{TIME_HOUR}%{TIME_MIN}

  • 8/14/2019 Introduction to mod_rewrite

    50/68

    SSL RewritesRedirect requests to https:// if the requestwas for http

    (In a .htaccess le)

    50

    RewriteCond %{HTTPS} !ONRewriteRule (.*) https://%{HTTP_HOST}/$1 [R]

    R i M

  • 8/14/2019 Introduction to mod_rewrite

    51/68

    RewriteMap

    Call an external program, or map le, toperform the rewrite

    Useful for very complex rewrites, or perhapsones that rely on something outside of Apache

    51

    R i M l

  • 8/14/2019 Introduction to mod_rewrite

    52/68

    RewriteMap - leFile of one-to-one relationships

    52

    RewriteMap docsmap txt:/www/conf/docsmap.txtRewriteRule /docs/(.*) ${docsmap:$1} [R,NE]

    Where docsmap.txt contains:

    ... etc

    Requests for http://example.com/docs/somethingnow get redirected to the Apache docs site forsomething. [NE] makes the #anchor bit work.

    P l d b l i

    http://example.com/http://example.com/http://httpd.apache.org/docs/mod_alias.html#http://httpd.apache.org/docs/mod_alias.html#http://httpd.apache.org/docs/mod_alias.html#http://httpd.apache.org/docs/mod_alias.html#
  • 8/14/2019 Introduction to mod_rewrite

    53/68

    Poor-mans load balancingRandom selection of server for load

    balancing

    53

    RewriteMap servers rnd:/www/conf/servers.txtRewriteRule (.*) http://${servers:loadbalance}$1 [P,NS]

    servers.txt contains:

    loadbalance mars|jupiter|saturn|neptune

    Requests are now randomly distributed between the fourservers. The NS ensures that the proxied URL doesntget re-rewritten.

    db

  • 8/14/2019 Introduction to mod_rewrite

    54/68

    dbm

    Convert a one-to-one text mapping to a dbm

    le

    httxt2dbm utility (2.0)

    54

    RewriteMap asbury \dbm:/usr/local/apache/conf/aliases.map

    R it M

  • 8/14/2019 Introduction to mod_rewrite

    55/68

    RewriteMap - programCall an external program to do the rewrite

    Perl is a common choice here, due to its skillat handling text.

    55

    RewriteMap dash2score \prg:/usr/local/apache/conf/dash2score.pl

    RewriteEngine OnRewriteRule (.*-.*) ${dash2score:$1} [PT]

    d h2 l

  • 8/14/2019 Introduction to mod_rewrite

    56/68

    dash2score.pl#!/usr/bin/perl

    $| = 1; # Turn off bufferingwhile () {s/-/_/g; # Replace - with _ globallyprint $_;

    }

    Turning off buffering is necessarybecause we need the outputimmediately for each line we feed it.

    Apache starts the script on serverstartup, and keeps it running for the

    life of the server process

    SQL (i 2 3 HEAD)

  • 8/14/2019 Introduction to mod_rewrite

    57/68

    SQL (in 2.3-HEAD)

    Just committed on Monday

    Have a SQL statement in the RewriteMapdirective which returns the mapping

    57

    ht l

  • 8/14/2019 Introduction to mod_rewrite

    58/68

    .htaccess les

    .htaccess les are evil

    However, a lot of people have no choice

    So ...

    58

    ht l

  • 8/14/2019 Introduction to mod_rewrite

    59/68

    .htaccess les

    In .htaccess les, or scope,everything is assumed to be relative to thatcurrent scope

    So, that scope is removed from theRewriteRule

    ^/index.html in httpd.conf becomes^index.html in a .htaccess le or scope

    59

    ht l

  • 8/14/2019 Introduction to mod_rewrite

    60/68

    .htaccess lesRewriteLog is particularly useful when tryingto get .htaccess le RewriteRules working.

    However, you cant turn on RewriteLog ina .htaccess le, and presumably youreusing .htaccess les because you dont haveaccess to the main server cong.

    Its a good idea to set up a test server onyour home PC and test there withRewriteLog enabled

    60

    ht l

  • 8/14/2019 Introduction to mod_rewrite

    61/68

    .htaccess les

    The rewrite pattern is relative to thecurrent directory

    The rewrite target is also relative to thecurrent directory

    In httpd.conf, the rewrite target is assumedto be a le path. In .htaccess les, that lepath is relative to the current directory, so itseems to be a URI redirect.

    61

    F rther reso rces

  • 8/14/2019 Introduction to mod_rewrite

    62/68

    Further resources

    http://rewrite.drbacchus.com/

    Denitive Guide to mod_rewrite by RichBowen, from APress

    http:/ /httpd.apache.org/docs-2.2/r ewrite/

    62

    Questions?

    http://rewrite.drbacchus.com/http://httpd.apache.org/docs-2.1/rewrite/http://httpd.apache.org/docs-2.1/rewrite/http://rewrite.drbacchus.com/http://rewrite.drbacchus.com/
  • 8/14/2019 Introduction to mod_rewrite

    63/68

    Questions?

    63

    Bonus slides Recipes

  • 8/14/2019 Introduction to mod_rewrite

    64/68

    Bonus slides - Recipes

    Redirect everything to a central handler

    64

  • 8/14/2019 Introduction to mod_rewrite

    65/68

    65

    RewriteEngine OnRewriteCond %{REQUEST_URI} !handler.phpRewriteRule (.*) /handler.php?$1 [PT,L,NE]

    All requests are sent to handler.phpThe request is passed as a QUERY_STRING

    argument to handler.php so that it knows whatwas requested.

    Virtual Hosts

  • 8/14/2019 Introduction to mod_rewrite

    66/68

    Virtual Hosts

    Rewrite a request to a directory based onthe requested hostname.

    66

  • 8/14/2019 Introduction to mod_rewrite

    67/68

    The hostname ends up in %1

    The requested path is in $1 - includes leadingslash

    Will probably have to do special things forhandlers (like .php les)

    67

    RewriteEngine On

    RewriteCond %{HTTP_HOST} (.*)\.example\.com [NC]RewriteRule (.*) /home/%1/www$1

    phps source handler

  • 8/14/2019 Introduction to mod_rewrite

    68/68

    .phps source handler

    Syntax-highlighted coderendering of any .phple

    RewriteRule (.*)\.phps \$1.php [H=application/x-httpd-php-source]