Filtering Data With PHP

  • 8/3/2019 Filtering Data With PHP

    Filtering Data with PHP

    Contents

    1. What is a PHP Filter
    2. Getting Started
    3. Filtering Variables
    4. Validating INTEGERS
    5. Validate BOOLEAN
    6. Validate FLOAT
    7. Validate REGEX
    8. Validate a URL
    9. Validate an IP Address
    10. Validate an Email Address
    11. Sanitizing Variables
    12. Sanitize a String
    13. URL Encode
    14. Sanitize Special Chars
    15. Filter Unsafe RAW
    16. Sanitize an Email Address
    17. Sanitize a URL
    18. Sanitize an Integer
    19. Sanitize a Float
    20. Magic Quotes
    21. Callback Filter
    22. The INPUT Filter
    23. Filter an array
    24. Filter an array with callback
    25. A Real World Example
    26. Credits

    What is a PHP Filter

    One of the greatest strengths of PHP is its ease of use. Unfortunately this same benefit has worked against PHP, as many new coders have forgotten any security measures or lack the expertise to create a class to validate the variables they receive from end users. PHP provides an extension to help with this process. There are many validation classes out there, some better than others, with an equal number of methods for doing the same task. The PHP filter extension has many of the functions needed for checking many types of user input. Handled within PHP itself, this provides a standard method of filtering data. This makes for easier to read code, as we will all be using the same functions rather than having to create our own. This will bring PHP security to the fore, with programmers able to easily implement simple, yet robust, filtering of data. Never again do we need to see code like this below.

    Or, even more commonly, database queries like this one...

    The above code will produce a table like the one below.

    Filter Name        Filter ID
    int                257
    boolean            258
    float              259
    validate_regexp    272
    validate_url       273
    validate_email     274
    validate_ip        275
    string             513
    stripped           513
    encoded            514
    special_chars      515
    unsafe_raw         516
    email              517
    url                518
    number_int         519
    number_float       520
    magic_quotes       521
    callback           1024

    This is quite an impressive list and more will be added in time. Note also that each filter has its own Filter ID; this will become useful as we progress through this tutorial. Each of these filters can be used with the filter_var() function, and here we will step through each one and show how it works. Note that string and stripped have the same ID. This is because they are the same filter.
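    A table like the one above can be regenerated with the extension's own introspection functions. The listing that originally produced it is not preserved here, so the following is only a minimal sketch using filter_list() and filter_id(); the $filters variable name is an assumption, and the rows printed depend on the PHP version (for example, magic_quotes is no longer available in PHP 8).

```php
<?php
// Build a name => ID map for every filter the extension reports.
$filters = array();
foreach (filter_list() as $name) {
    $filters[$name] = filter_id($name);
}

// Print one row per filter, for example "int 257".
foreach ($filters as $name => $id) {
    echo $name, ' ', $id, "\n";
}
```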

    Filtering Variables

    The actual filtering of variables is done with the filter_var() function. Let's start with a simple integer filter to see how it works.

    The above code will echo 1234. The filter_var() function will return int(1234) because the input has validated as an integer. Let's change the $int variable to a non-int value and see the results.

    Now we see a different result. No display is made because the variable $int has failed validation and the filter_var() function has returned bool(false). Also note that if the variable is set to $int='' then it will again return bool(false). Consider this code..

    As you can see if you run the above snippet of code, the result is once again a blank page. From here we can continue with INTEGER validation, as the filter_var() function has some amazing properties that allow us to do many of the validation tasks we once had to write ourselves.
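    The three cases described in this section can be sketched as follows. The input values are illustrative rather than the tutorial's own; the last lines show why the result is best compared with ===, since a valid 0 would otherwise look like a failure.

```php
<?php
// A numeric string validates and comes back as a real integer.
$valid = filter_var('1234', FILTER_VALIDATE_INT);

// Non-numeric text and the empty string both fail validation.
$text  = filter_var('abc', FILTER_VALIDATE_INT);
$empty = filter_var('', FILTER_VALIDATE_INT);

var_dump($valid, $text, $empty);

// Compare with === so that a valid 0 is not mistaken for failure.
$zero = filter_var('0', FILTER_VALIDATE_INT);
echo ($zero === false) ? "invalid\n" : "valid\n";
```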

    Validating INTEGERS

    As we saw in the previous section, validating a variable as type INT is simple with the filter_var() function. But wait, there's more. FILTER_VALIDATE_INT also allows us to specify a range for our integer variable. This is most excellent when we need to check if a variable is both of type INT and between 1 and 100. Let's see some code.

    WTF!... Above we have tried to validate an integer that must be in the range of 50 to 100. The number 42 clearly is not within this range, yet it passes the validation. What has happened here is we have incorrectly specified the options for min_range and max_range. Although this looks correct, and no error is generated, the filter simply falls back to being a plain FILTER_VALIDATE_INT and the number 42 passes. Below we show how to correctly specify the options array.
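    The difference between the two forms can be sketched like this. The variable names are assumptions; the point is that min_range and max_range only take effect when nested under an 'options' key.

```php
<?php
$int = 42;

// Range keys passed at the top level are ignored, so only the INT check runs.
$ignored = filter_var($int, FILTER_VALIDATE_INT,
    array('min_range' => 50, 'max_range' => 100));

// Range keys nested under 'options' are enforced.
$enforced = filter_var($int, FILTER_VALIDATE_INT,
    array('options' => array('min_range' => 50, 'max_range' => 100)));

var_dump($ignored, $enforced);
```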

    Now we see a different behaviour when the options array is correctly specified as above. We have tried to validate an INT(42), checking that it is both of the type INT and within the range of 50 to 100. The above code will return boolean FALSE as the integer 42 is not within the range specified. Note also that this will work for negative values. Consider this next block of code.

    The above code will echo -2 as it is both of type INT and within the range of -10 to 100. We could also validate our integer by specifying only a min_range or a max_range. This can prove helpful if we need a number to be no less than 10, for example. Let's see it in action.

    /*** validate the integer ***/
    echo filter_var($int, FILTER_VALIDATE_INT, array('options' => array('min_range' => $min)));
    ?>

    The above code now correctly validates an integer that must be no less than 10. Now we can move on a little. We have seen how we can validate a single INT, but we can go further and validate an array of values. To validate an array of values, in this case integers, we use the filter_var_array() function. This function alone makes the whole filter extension worth the effort. The filter_var_array() function allows us to filter or validate many different data types. But for now, let's simply stick with INTs. Let's code it up and see.

    The above code will print out the array. It should look like this

    0 -- 10
    1 -- 109
    2 --
    3 -- -1234
    4 --
    5 --
    6 -- Array

    As you can see, it has printed out the values that are of type INT. We note that 2, 4 and 5 in the list are blank, as the filter has returned bool(false) for them. Finally we see the last entry is merely an array. This type of flexibility makes the filter extension a great improvement over having to code up all this validation yourself. There is no longer any excuse for sloppy security practices.
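    A minimal sketch of validating a whole array of integers follows. The sample values are illustrative, not the tutorial's original input; passing a single filter constant to filter_var_array() applies it to every element.

```php
<?php
// Sample input mixing integers, numeric strings and junk (illustrative values).
$values = array(10, '109', '', '-1234', 'some text');

// A single filter constant is applied to every element of the array.
$result = filter_var_array($values, FILTER_VALIDATE_INT);

foreach ($result as $key => $value) {
    // Failed entries hold bool(false), which echoes as an empty string.
    echo $key, ' -- ', $value, "\n";
}
```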

    Moving on from validating an array of values, the FILTER_VALIDATE_INT filter also allows us to validate INTEGERS of the type OCTAL and HEX. Clever stuff, and it will save a bundle of time as we no longer need to code up endless hex and octal arrays. Two flags are currently available for this type of checking. They are:

    FILTER_FLAG_ALLOW_HEX

    FILTER_FLAG_ALLOW_OCTAL

    Once again the filter uses an array to hold the flags; be sure to correctly specify them as shown here with the flags array. Let's see how we go with a hex value.

    The above code correctly echoes 255. The filter has successfully validated the HEX value and returned that value. If validation fails with a non-hex value it will return bool(false). Of course, hex values are not case sensitive, so 0Xff or 0xFF would also validate. As we mentioned earlier, we can also use an OCTAL value in the same way.
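    The hex and octal flags can be sketched as below. The input strings are illustrative; note that the flags are passed under a 'flags' key of the third argument.

```php
<?php
// Hex strings need FILTER_FLAG_ALLOW_HEX, passed under the 'flags' key.
$hex = filter_var('0xff', FILTER_VALIDATE_INT,
    array('flags' => FILTER_FLAG_ALLOW_HEX));

// The prefix and digits are not case sensitive.
$upper = filter_var('0XFF', FILTER_VALIDATE_INT,
    array('flags' => FILTER_FLAG_ALLOW_HEX));

// Octal strings (leading zero) need FILTER_FLAG_ALLOW_OCTAL.
$octal = filter_var('0755', FILTER_VALIDATE_INT,
    array('flags' => FILTER_FLAG_ALLOW_OCTAL));

// Without the flag, the hex string is not a valid integer.
$plain = filter_var('0xff', FILTER_VALIDATE_INT);

var_dump($hex, $upper, $octal, $plain);
```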

    Validate Boolean

    Along with the amazing INTEGER validation, PHP also provides a similar method of validating BOOLEAN values. Here we see how easy it is to check for a BOOLEAN value.

    The above code will echo 1. This is because the BOOLEAN filter has found a valid boolean value. Other acceptable boolean values that return TRUE are listed here.

    1

    "1"

    "yes"

    "true"

    "on"

    TRUE

    These values are not case sensitive, so true and TrUe are the same. Values that return false are listed below.

    0

    "0"

    "no"

    "false"

    "off"

    ""

    NULL

    FALSE

    The BOOLEAN filter will also allow us to evaluate code within the filter itself to test for values. Consider the code below, which uses the ternary operator.

    The above code will echo TRUE as it has tested the result of the in_array() function and has found the return value to be bool(true). Not to be forgotten here is that we can also use an array of values to test for boolean values. Let's see how.

    /*** dump the values ***/
    var_dump($values);

    ?>

    The above code will produce something like this..

    array(6) {
      [0]=> bool(false)
      [1]=> bool(true)
      [2]=> bool(false)
      [3]=> bool(false)
      [4]=> bool(false)
      [5]=> array(5) {
        [0]=> bool(false)
        [1]=> bool(true)
        [2]=> bool(false)
        [3]=> bool(false)
        [4]=> bool(false)
      }
    }

    As you can see above, the filter has recursively iterated over the array and returned a boolean value for each member, depending on that member's value.
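    The boolean checks in this section can be sketched as below. The inputs are illustrative, and using FILTER_REQUIRE_ARRAY for the array case is an assumption, since the original listing is not preserved; it is the flag that makes filter_var() walk an array, nested members included.

```php
<?php
// Strings from the "true" list validate to bool(true), regardless of case.
$yes = filter_var('yes', FILTER_VALIDATE_BOOLEAN);
$on  = filter_var('On', FILTER_VALIDATE_BOOLEAN);

// Strings from the "false" list validate to bool(false).
$off = filter_var('off', FILTER_VALIDATE_BOOLEAN);

// FILTER_REQUIRE_ARRAY applies the filter to every member, nested arrays included.
$values = filter_var(
    array('0', '1', 'no', array('true', 'false')),
    FILTER_VALIDATE_BOOLEAN,
    FILTER_REQUIRE_ARRAY
);

var_dump($yes, $on, $off, $values);
```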

    Validate FLOAT

    There are of course times we need to validate a floating point number, and once again this is done as simply as the rest.

        echo "$float is not valid!";
    }
    else
    {
        echo "$float is a valid floating point number";
    }
    ?>

    Like other validation options we can still validate an array of floats. Similar to how we validated an array of boolean values, we apply the same principles, and flags, to the float filter. We can also use negative values.

    The above code will return a list something like this

    array(7) {

    [0]=> float(1.2)

    [1]=> float(1.7)

    [2]=> bool(false)

    [3]=> float(-23234.123)

    [4]=> bool(false)

    [5]=> bool(false)

    [6]=> array(0) { }

    }

    The filter will also allow us to filter an array with a user specified decimal separator, so that a float like 1,234 can be accepted.

    "1.2e3" => ",");

    /*** validate the floats against the user defined decimal separators ***/
    foreach ($floats as $float => $dec_sep)
    {
        $out = filter_var($float, FILTER_VALIDATE_FLOAT, array("options" => array("decimal" => $dec_sep)));
        /*** dump the results ***/
        var_dump($out);
    }
    ?>

    From the code above we get the output as follows..

    float(1.234)

    Warning: filter_var() [function.filter-var]: decimal separator must be one char in /www/filter.php on line 13

    bool(false)

    bool(false)

    Let's step through this for a minute. The first value has used a comma as a decimal separator. Normally this would be invalid and be regarded as a thousands separator. However, we have specified the comma as the decimal separator, so it remains valid; the filter replaces the decimal separator with a dot and returns 1.234. Next we have a Warning due to the use of a double dot as a decimal separator. This is invalid, as a decimal separator must be one character only. Following this, the filter returns bool(false) for that value. The final value also returns bool(false) as the decimal separators do not match.
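    A minimal sketch of the decimal option, using illustrative inputs. Only a one-character separator is shown, because the two-character case is rejected by the filter.

```php
<?php
// The default decimal separator is the dot.
$plain = filter_var('1.234', FILTER_VALIDATE_FLOAT);

// A single-character custom separator goes under options => decimal.
$comma = filter_var('1,234', FILTER_VALIDATE_FLOAT,
    array('options' => array('decimal' => ',')));

// Text that is not a number fails validation.
$bad = filter_var('abc', FILTER_VALIDATE_FLOAT);

var_dump($plain, $comma, $bad);
```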

    Validate REGEX

    If you are unfamiliar with regex then it is highly recommended you begin with the phPRO Introduction to PHP Regular Expressions. This should put you in good condition for the following. For those adept at regex, read on. Validating by regex is something we commonly do for information that should fit a pattern. The filter extension allows us to validate variables in the same manner as we have seen previously. First we will attempt a simple pattern match to see if a string begins with the letter T, and then we will try to match an email address.

        /*** if there is no match ***/
        echo "Sorry, no match";
    }
    else
    {
        /*** if we match the pattern ***/
        echo "The string begins with T";
    }
    ?>

    From the above code we see the string does match the pattern and a bool(false) is not returned. If the pattern does match, the string value is returned, just as with all the previous validation types. Let's see a slightly more complex regex that tries to match an email address.

    So, the above code will match the email address and print the line

    The email address is valid

    Normally you would not use this method for checking email addresses, as the filter extension provides a filter for this purpose, but more on that later. Should you accidentally omit the regex in your checking, an E_WARNING is generated and the filter will return bool FALSE.
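    The simple pattern match described above can be sketched as follows. The test strings are illustrative; the pattern is supplied under options => regexp.

```php
<?php
// The pattern goes under options => regexp; /^T/ matches strings that start with T.
$options = array('options' => array('regexp' => '/^T/'));

// On a match the original string is returned, otherwise bool(false).
$match   = filter_var('This is a string', FILTER_VALIDATE_REGEXP, $options);
$nomatch = filter_var('Not a match', FILTER_VALIDATE_REGEXP, $options);

var_dump($match, $nomatch);
```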

    Validate a URL

    URLs can be tricky to deal with. There is no maximum length defined in the RFC and you would be totally amazed at just how many different variations and formats a URL can be in. Recommended reading is RFC 1738, which explains all you need to know about URL validation. You could then write up your own class with which to validate all your ipv4 and ipv6 URLs etc. Or, you can simply use FILTER_VALIDATE_URL in filter_var(). Let's see how it works in its simplest form.

    Above we see, as in the previous examples, that a simple check of the variable with an if statement tells us the url value is valid. But not all URLs are in this form. According to the RFC, a URL can take many forms. The URL may be in the form of an IP address, or have a QUERY string attached to it. This is why URLs have been so difficult to validate in the past. Several flags have been provided with the URL validation filter to check for some of these occurrences and to validate against them. They are listed here.

    FILTER_FLAG_SCHEME_REQUIRED

    FILTER_FLAG_HOST_REQUIRED

    FILTER_FLAG_PATH_REQUIRED

    FILTER_FLAG_QUERY_REQUIRED

    We will begin at the top here and check the scheme flag.

    We see from above that the URL supplied does not validate against the FILTER_FLAG_SCHEME_REQUIRED flag. The URL needs to be properly formatted to fit the required scheme. Let's try again with a different URL.

    Now we see that the URL validates against the scheme. The return value is bool(false) if the validation fails, or the URL string if it passes validation. Let's continue with the other flags and FILTER_FLAG_HOST_REQUIRED. If we were to alter the above filter flag from FILTER_FLAG_SCHEME_REQUIRED to FILTER_FLAG_HOST_REQUIRED it would still validate, as the URL http://foo has the host name of "foo". Let's try with an invalid host name.

    Now we see that the missing hostname does not validate against the required flag and returns bool(false). What would happen if you tried a URL like http:// one wonders. It is left as an exercise for the reader to try. To enable correct validation, the FILTER_FLAG_HOST_REQUIRED flag must be satisfied, as in the example below.

    /*** try to validate the URL ***/
    if(filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_HOST_REQUIRED) === FALSE)
    {
        /*** if there is no match ***/
        echo "Sorry, $url is not valid!";
    }
    else
    {
        /*** if the URL is valid ***/
        echo "The URL, $url is valid!";
    }
    ?>

    Of course, the code above now validates the URL, as we have supplied a value that satisfies the FILTER_FLAG_HOST_REQUIRED flag. Next we see the FILTER_FLAG_PATH_REQUIRED flag. This flag, as the name suggests, tells the URL validator that a path is required within the url for it to validate. Let's see it in action.

    We see from the above code that the URL is valid because it contains a path after the host name. Should the path be missing, so that the URL was simply http://www.phpro.org, it would fail validation and return bool(false). Next we see the FILTER_FLAG_QUERY_REQUIRED flag. You should now have no problems implementing this as it follows the same convention as the previous flags. As the name suggests, this flag requires the URL to have a query string of the form file.php?var=foo&var2=bar. The use is exactly the same as before.

        echo "Sorry, $url is not valid!";
    }
    else
    {
        /*** if the URL is valid ***/
        echo "The URL, $url is valid!";
    }
    ?>
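    The path and query flags can be sketched together as below; the URLs and variable names are illustrative. The sketch leaves out FILTER_FLAG_SCHEME_REQUIRED and FILTER_FLAG_HOST_REQUIRED because they were deprecated in PHP 7.3 and removed in PHP 8.0, where the scheme and host are required anyway.

```php
<?php
$bare  = 'http://www.phpro.org';
$path  = 'http://www.phpro.org/index.php';
$query = 'http://www.phpro.org/file.php?var=foo&var2=bar';

// A well-formed URL is returned unchanged; anything else gives bool(false).
$basic = filter_var($bare, FILTER_VALIDATE_URL);
$junk  = filter_var('not a url', FILTER_VALIDATE_URL);

// FILTER_FLAG_PATH_REQUIRED rejects a URL with nothing after the host name.
$noPath  = filter_var($bare, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
$hasPath = filter_var($path, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);

// FILTER_FLAG_QUERY_REQUIRED rejects a URL without a query string.
$noQuery  = filter_var($path, FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED);
$hasQuery = filter_var($query, FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED);

var_dump($basic, $junk, $noPath, $hasPath, $noQuery, $hasQuery);
```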

    Filter IP Address

    Following on from validation of URLs, we often find we need to validate an IP address. Of course, an IP address may be of different formats for ipv4 and ipv6. An IP address may also need to be checked against private or reserved ranges. The filter extension makes it possible to discern these differences and to validate an IP address to fit most needs. In its simplest form the validation of an IP address will look like this.

    As we have supplied the above with a valid IP address, it validates and all is well. But now we may wish to validate an IPV6 address or an address within a private range. The IP filter has several flags with which to validate an IP address. They are listed here.

    FILTER_FLAG_IPV4

    FILTER_FLAG_IPV6

    FILTER_FLAG_NO_PRIV_RANGE

    FILTER_FLAG_NO_RES_RANGE

    Starting at the top we will check to see if an IP is a valid IPV4 address.

    /*** try to validate as IPV4 address ***/
    if(filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === FALSE)
    {
        echo "$ip is not a valid IP";
    }
    else
    {
        echo "$ip is valid";
    }
    ?>

    In the above example the IP address has failed to validate as it is not a complete IPV4 address. To validate, it would need to be of the form of the example that preceded it, 192.168.0.1. This is fine, but growth of the net has seen us run out of IPV4 addresses and so we need to validate against IPV6 addresses also. Here we show the use of the FILTER_FLAG_IPV6 flag.

    Above we see that the IPV6 address is a valid form of IP and passes validation. Next we see if an IP falls within a private range.

    The above code will print the line

    192.168.0.1 is within a private range

    because the IP address is within the range of private IP addresses as specified in the RFC. Try changing the IP to a public IP address, that is one used over the internet, and see the results.

    The FILTER_FLAG_NO_RES_RANGE flag, as the name suggests, will allow no reserved range IP addresses. Only valid non-reserved addresses will pass validation here.

    Of course, the IP address 255.255.255.255 is within a reserved range, so it fails to validate against the FILTER_FLAG_NO_RES_RANGE flag and returns bool FALSE. The filter will take an IPV4 or IPV6 IP address. Try this simple script with some other IP addresses such as:

    66.163.161.117
    2001:0db8:85a3:08d3:1319:8a2e:0370:7334

    and see this in action.
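    The four IP flags can be sketched side by side as below. The incomplete address 192.168.1 is an illustrative stand-in for the failing example, and combining the two range flags with | is a convenience assumed here rather than something the tutorial shows.

```php
<?php
// FILTER_FLAG_IPV4 accepts only complete dotted-quad addresses.
$v4    = filter_var('192.168.0.1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);
$short = filter_var('192.168.1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);

// FILTER_FLAG_IPV6 accepts only IPV6 notation.
$v6 = filter_var('2001:0db8:85a3:08d3:1319:8a2e:0370:7334',
    FILTER_VALIDATE_IP, FILTER_FLAG_IPV6);

// Private and reserved addresses fail when the matching flag is set.
$private  = filter_var('192.168.0.1', FILTER_VALIDATE_IP, FILTER_FLAG_NO_PRIV_RANGE);
$reserved = filter_var('255.255.255.255', FILTER_VALIDATE_IP, FILTER_FLAG_NO_RES_RANGE);

// A public address passes both range checks at once.
$public = filter_var('66.163.161.117', FILTER_VALIDATE_IP,
    FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE);

var_dump($v4, $short, $v6, $private, $reserved, $public);
```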

    Validate Email Address

    What validation suite would be complete without the ability to validate an email address? Almost everybody has their own function and regex for this and many holy wars have been fought over the "correct" method of doing it. We saw earlier how we could do this with the FILTER_VALIDATE_REGEXP filter by supplying a REGEX to the filter. Of course, everybody has their own REGEX that is better than everybody else's. Now we have a standard way that everybody can use. Let's put it to work.

    else
    {
        /*** if the address passes validation ***/
        echo "$email is valid";
    }
    ?>

    So, there you have the email validated and the world is a better place, having saved many kittens. This filter takes no additional flags or options; it simply validates the email.
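    A minimal sketch of the email check, with illustrative addresses. As with the other validators, the input string is returned on success and bool(false) on failure.

```php
<?php
// A syntactically valid address is returned unchanged.
$good = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);

// A malformed address fails validation.
$bad = filter_var('user@@example', FILTER_VALIDATE_EMAIL);

var_dump($good, $bad);
```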

    Sanitizing Variables

    Whilst it is well to be able to validate the data we use, it is equally important to be able to clean up any data that may come to our scripts, especially data from user land. The filter_var() function also contains filters for many data types that will clean up data for use in our scripts. Here we will show their uses in a simple context.

    Sanitize A String

    The FILTER_SANITIZE_STRING filter allows us to filter various information from a string so that we can safely use the data within our applications. You can use it to strip tags or unwanted characters, or even to encode them. Let's look at the possible flags for FILTER_SANITIZE_STRING.

    FILTER_FLAG_NO_ENCODE_QUOTES

    FILTER_FLAG_STRIP_LOW

    FILTER_FLAG_STRIP_HIGH

    FILTER_FLAG_ENCODE_LOW

    FILTER_FLAG_ENCODE_HIGH

    FILTER_FLAG_ENCODE_AMP

    These flags perform various sanitizing functions on a string. Without them the FILTER_SANITIZE_STRING filter works in this way.

    We see from the above code that the tags have been removed, leaving only the "foo" text. Using the optional flags we can gain a little more control of this behaviour. Let's modify our string variable a little to include some quotes.

    /*** sanitize the string ***/
    echo filter_var($string, FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_HIGH);
    ?>

    This is all well for stripping out unwanted characters, but does not address the issue of keeping them. To this end we use the encoding flags in the same way.

    By now you should be getting into the swing of the filter functions and flags. It should be quite easy to see the companion flag to the code above and how FILTER_FLAG_ENCODE_HIGH works. It will encode characters with an ASCII value above 127.

    Lastly in this section is the FILTER_FLAG_ENCODE_AMP flag. No prizes for guessing this one. It encodes the & character as the HTML entity &#38;

    The resulting string will look the same in the browser; however, if you view the source you will see the ampersand has been correctly encoded to look like

    http://phpro.org/file.php?foo=1&#38;bar=2

    The FILTER_SANITIZE_STRING (string) filter has an alias called FILTER_SANITIZE_STRIPPED (stripped). I have no idea why it is there and suggest it dies a slow painful death as it can only lead to confusion.

    URL Encode

    The ability to url encode data has previously fallen to the urlencode() function. This functionality has been embraced by the filter extension, which provides more features than before. We can now URL encode a string and optionally strip or encode special characters. Let's see it in action.

    The above code will produce a string like this

    http%3A%2F%2Fphpro.org%2Fa%20dir%21%2Ffile.php%3Ffoo%3D1%26bar%3D2

    As you can see, the spaces and special chars have been encoded for use in urls. Like the string filter, the encoded filter has HIGH and LOW filtering options with flags that can be set. The LOW flags deal with special chars below ASCII 32 and the HIGH flags deal with characters above ASCII 127. The flags allow the filtering or sanitizing of the input string. The flags should look familiar by now and are listed below.

    FILTER_FLAG_STRIP_LOW

    FILTER_FLAG_STRIP_HIGH

    FILTER_FLAG_ENCODE_LOW

    FILTER_FLAG_ENCODE_HIGH

    Let's put them each through their paces to see how each works.

    Any characters below ASCII 32 will be stripped from the URL variable above. Of course, we can strip all characters above ASCII 127 with FILTER_FLAG_STRIP_HIGH as shown below.

    Rather than strip any ASCII chars from a url we could encode them instead, using FILTER_FLAG_ENCODE_HIGH or FILTER_FLAG_ENCODE_LOW.

    The same again but encoding the HIGH characters.
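    The encoded filter can be sketched as below. The input URL is reconstructed by decoding the output string quoted earlier in this section, and the tab example is an illustrative stand-in for a character below ASCII 32.

```php
<?php
// URL-encode everything except letters, digits and the characters - . _
$url     = 'http://phpro.org/a dir!/file.php?foo=1&bar=2';
$encoded = filter_var($url, FILTER_SANITIZE_ENCODED);
echo $encoded, "\n";

// A tab (ASCII 9) is percent-encoded by default...
$kept = filter_var("a\tb", FILTER_SANITIZE_ENCODED);

// ...and removed before encoding when FILTER_FLAG_STRIP_LOW is set.
$stripped = filter_var("a\tb", FILTER_SANITIZE_ENCODED, FILTER_FLAG_STRIP_LOW);

var_dump($kept, $stripped);
```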

    Sanitize Special Chars

    This filter option allows us to HTML-escape '"<>& and characters with an ASCII value of less than 32. Because of this there is no FILTER_FLAG_ENCODE_LOW flag. The available flags are listed here

    FILTER_FLAG_STRIP_LOW

    FILTER_FLAG_STRIP_HIGH

    FILTER_FLAG_ENCODE_HIGH

    If we view the result in our browser and view the source, the result from the above code will look like this:

    ?>

    The sanitized string from above should look a little like this: funky....

    We can of course choose to strip the HIGH ASCII values like this:

    Finally, we can encode the high chars

    ?>

    In the above code, the foo and the bar remain, the characters between foo and bar have been sanitized, and the ampersand has been encoded.

    Filter Unsafe RAW

    This is perhaps the oddest of the filters. The PHP manual describes it as "Do nothing, optionally strip or encode special characters." Well, doing nothing is what I do best. Not to be out-performed, it still keeps all the flags as listed below.

    FILTER_FLAG_STRIP_LOW

    FILTER_FLAG_STRIP_HIGH

    FILTER_FLAG_ENCODE_LOW

    FILTER_FLAG_ENCODE_HIGH

    FILTER_FLAG_ENCODE_AMP

    You should by now know how to use the STRIP and ENCODE flags. If you do not, return to the top of this document and start from there. Here we will show only the FILTER_FLAG_ENCODE_AMP flag in use.

    The code above will print the following line

    Bed &#38; Breakfast

    As you can see, the ampersand is now encoded as the flag dictates.
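    A minimal sketch of FILTER_UNSAFE_RAW with and without flags. The tab example is illustrative; the ampersand comes back as a numeric HTML entity, which a browser renders as a plain &.

```php
<?php
// With no flags the filter returns the string untouched.
$untouched = filter_var('Bed & Breakfast', FILTER_UNSAFE_RAW);

// FILTER_FLAG_ENCODE_AMP turns & into a numeric HTML entity.
$encoded = filter_var('Bed & Breakfast', FILTER_UNSAFE_RAW, FILTER_FLAG_ENCODE_AMP);

// FILTER_FLAG_STRIP_LOW removes characters below ASCII 32, such as a tab.
$stripped = filter_var("foo\tbar", FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

var_dump($untouched, $encoded, $stripped);
```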

    Sanitize an Email Address

    Earlier we saw how to filter and validate an email address. Here we can take the emailaddress and sanitize it. That is, remove illegal or unwanted characters from it. It issurprising the amount of characters that are allowed in a valid email address. They are:All letters, digits and $-_.+!*'(),{}|\\^~[]`#%";/?:@&=.Lets put it to the test.

  • 8/3/2019 Filtering Data With PHP

    26/37

    /*** sanitize the email address ***/echo filter_var($email, FILTER_SANITIZE_EMAIL);

    ?>

    The above code will produce the output:

    [email protected]

    The ampersand has remained but the () and the three backslashes have been removed, leaving us with a usable, or sanitized, email address.
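    As a rough illustration of the same behaviour, the sketch below sanitizes a hypothetical address (not the one from the original listing) that contains an ampersand, parentheses and a backslash.

```php
<?php
// Hypothetical address containing an ampersand, parentheses and a backslash.
// In a single-quoted string, \\ is one literal backslash.
$email = 'kevin&friends@(foo).ex\\ample.com';

$clean = filter_var($email, FILTER_SANITIZE_EMAIL);

echo $clean, PHP_EOL; // kevin&friends@foo.example.com
```

    Sanitizing only removes characters. It does not guarantee that the result is a valid address, so FILTER_VALIDATE_EMAIL is still worth running afterwards.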

    Sanitize a URL

    Unlike the FILTER_SANITIZE_ENCODED filter, which encodes URL strings, or FILTER_VALIDATE_URL, which checks whether a URL is valid, the FILTER_SANITIZE_URL filter strips out illegal characters. The characters that are not removed are letters and digits and the following:

    $ - _ . + ! * ' ( ) , { } | \ \ ^ ~ [ ] ` > < # % " ; / ? : @ & = .

    To see it in action is simple as this filter takes no flags.

    The URL string above contains 2 UTF-8 characters (your browser may not show these, check the source) which are illegal in URLs. The FILTER_SANITIZE_URL filter has stripped these characters from the string, leaving us with a valid URL.
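    A minimal sketch of this filter, assuming a hypothetical URL with a UTF-8 "ä" in both the host and the path. The bytes are written as escape sequences so the example does not depend on the file encoding.

```php
<?php
// Hypothetical URL with a UTF-8 "ä" (bytes C3 A4) in the host and the path.
$url = "http://www.ex\xC3\xA4mple.com/p\xC3\xA4ge.html";

// The filter takes no flags; bytes outside the allowed set are removed.
$clean = filter_var($url, FILTER_SANITIZE_URL);

echo $clean, PHP_EOL; // http://www.exmple.com/pge.html
```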

    Sanitize an Integer

    Sanitizing an integer is simple with the FILTER_SANITIZE_NUMBER_INT filter. This filter strips out all characters except for digits and the + and - characters. It is simple to use and we no longer need to boggle our minds with regular expressions.

    The above code produces an output of 40+2, as the non-INT values, as specified by the filter, have been removed.
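    A short sketch that yields the same 40+2 result, assuming a hypothetical input string that mixes letters, punctuation and digits:

```php
<?php
// Hypothetical input mixing letters, punctuation and digits.
$number = "abc40def+;2";

// Everything except digits, + and - is removed.
$clean = filter_var($number, FILTER_SANITIZE_NUMBER_INT);

echo $clean, PHP_EOL; // 40+2
```

    The result is still a string, and not necessarily a valid integer, so it can be passed on to FILTER_VALIDATE_INT if a real integer is required.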

    Sanitize a Float

    Sanitizing a float is a little more playful, as the FILTER_SANITIZE_NUMBER_FLOAT filter takes 3 flags.

    FILTER_FLAG_ALLOW_FRACTION

    FILTER_FLAG_ALLOW_THOUSAND

    FILTER_FLAG_ALLOW_SCIENTIFIC

    These flags are self explanatory and are easy to use as shown below. First we will put the FILTER_SANITIZE_NUMBER_FLOAT filter to the test without any flags.

    Like the previous example, this filter has stripped out all characters except for digits and the + and - characters. The dot character is removed with this filter. Optionally, we could keep the . (dot) character with the FILTER_FLAG_ALLOW_FRACTION flag.

    Above we now see that the script returns 40.4+2 as we have now allowed fractional notation. We could also allow thousand separators with the FILTER_FLAG_ALLOW_THOUSAND flag as demonstrated below.

    Now the script above returns 40,000+2 because we have allowed the use of the thousand separator in the string. If we wished to use scientific notation we can also allow the use of the characters e and E as we see below.

    This should now be rather self explanatory, as we see the flag allowing scientific notation and the script returns 40+42e. Note that the decimal separator has also been removed.
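    The effect of each flag can be sketched as follows. The input strings are hypothetical and chosen only to make the differences visible; they are not the strings from the original listings.

```php
<?php
// Hypothetical inputs, one per case.
$value = "abc40.4+;2";

// No flags: only digits, + and - survive, so the dot is lost.
$noFlags = filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT);

// Keep the decimal point.
$fraction = filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);

// Keep the comma used as a thousand separator.
$thousand = filter_var("abc40,000+;2", FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_THOUSAND);

// Keep e and E; the dot is still removed because ALLOW_FRACTION is not set.
$scientific = filter_var("x1.5E3!", FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_SCIENTIFIC);

echo $noFlags, PHP_EOL;    // 404+2
echo $fraction, PHP_EOL;   // 40.4+2
echo $thousand, PHP_EOL;   // 40,000+2
echo $scientific, PHP_EOL; // 15E3
```

    Flags can be combined with the bitwise OR operator, for example FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_SCIENTIFIC, when both the dot and the exponent should be kept.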

    Magic Quotes

    The use of magic quotes in PHP has been a contentious issue from its inception. The removal of magic quotes in favour of user defined escaping of characters makes a lot of sense. Here we can implement the functionality of magic quotes without having them forced upon us by php.ini settings which may vary from server to server. The FILTER_SANITIZE_MAGIC_QUOTES filter applies the addslashes function to a string, thus providing us with an escaped string for use in our applications.
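    A minimal sketch of the idea. One assumption is made about the PHP version: the FILTER_SANITIZE_MAGIC_QUOTES constant was removed in PHP 8, so the sketch uses FILTER_SANITIZE_ADD_SLASHES (available from PHP 7.3), which applies addslashes() in the same way. On older installations the constant named in the text takes its place. The input string is hypothetical.

```php
<?php
// Hypothetical string containing both kinds of quote.
$string = "It's a \"quoted\" string";

// Assumption: PHP 7.3 or later. On older versions the equivalent
// constant is FILTER_SANITIZE_MAGIC_QUOTES.
$escaped = filter_var($string, FILTER_SANITIZE_ADD_SLASHES);

echo $escaped, PHP_EOL; // It\'s a \"quoted\" string
```

    For database work, parameterised queries are generally a safer choice than escaping by hand.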

    <?php

    /**
     * Callback that replaces spaces with underscores
     *
     * @param string $string
     * @return string
     */
    function space2underscore($string) {
        return str_replace(" ", "_", $string);
    }

    $string = "This is not a love song";

    echo filter_var($string, FILTER_CALLBACK, array("options"=>"space2underscore"));

    ?>

    We see the filter has used our space2underscore() function as a callback and converted the spaces in the string so that it now returns This_is_not_a_love_song. We could use a PHP function rather than our own as demonstrated here.
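    A minimal sketch of a built-in function used as the callback, assuming the same string as before and the standard ucwords() function:

```php
<?php
$string = "This is not a love song";

// Pass the name of a built-in PHP function as the callback.
$result = filter_var($string, FILTER_CALLBACK, array("options" => "ucwords"));

echo $result, PHP_EOL; // This Is Not A Love Song
```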

    Now we see that using a simple PHP function as the callback on our string is simple, and the return value is This Is Not A Love Song. If a non-existent function is used an error is generated, as shown below.

    The above script will produce an error like

    Warning: filter_var() [function.filter-var]: First argument is expected to be a valid callback in /www/fil.php on line 6

    We have seen above several functions used as a callback. But what if we wanted to pass several functions in one callback array? Let's put it to the test.

    /*** use a PHP defined function as callback ***/
    echo filter_var($string, FILTER_CALLBACK, array("options"=>array("ucwords", "strtolower")));

    ?>

    GRIM... We see above that the use of multiple callbacks is not permitted and will generate an error as follows:

    Warning: filter_var() [function.filter-var]: First argument is expected to be a valid callback in /home/kevin/html/fil.php on line 7.

    We can, however, call a class method as shown below.
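    A minimal sketch of a class method used as the callback. The class name StringCleaner is hypothetical; the callback is given as an array holding the object and the method name, which is the usual PHP callable form.

```php
<?php
// Hypothetical class wrapping the space2underscore logic in a method.
class StringCleaner
{
    public function space2underscore($string)
    {
        return str_replace(" ", "_", $string);
    }
}

$cleaner = new StringCleaner();
$string  = "This is not a love song";

// array(object, method name) is a single callable, not two callbacks.
$result = filter_var($string, FILTER_CALLBACK, array("options" => array($cleaner, "space2underscore")));

echo $result, PHP_EOL; // This_is_not_a_love_song
```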

    Moving on from the above class, we can also use a callback with a thrown exception like this:

    catch (Exception $e)
    {
        echo $e->getMessage();
    }

    ?>

    From the above code we see that the $num variable does not match the value in the check_num() method and so an exception is thrown. The exception is then caught as usual in the first catch block and the error message is displayed.
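    The pattern can be sketched as follows. Only the check_num() method name and the $num variable come from the text; the class name NumberChecker, the expected value 7 and the error message are assumptions made for illustration.

```php
<?php
// Hypothetical class; only the check_num() name comes from the text.
class NumberChecker
{
    public function check_num($num)
    {
        // The callback receives the value as a string, so cast before comparing.
        if ((int) $num !== 7) {
            throw new Exception("Number must be 7");
        }
        return $num;
    }
}

$num     = 6;
$message = null;

try {
    $result = filter_var($num, FILTER_CALLBACK, array("options" => array(new NumberChecker(), "check_num")));
    echo $result, PHP_EOL;
} catch (Exception $e) {
    // The exception thrown inside the callback surfaces here.
    $message = $e->getMessage();
    echo $message, PHP_EOL;
}
```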

    The INPUT Filter

    As the name suggests, the input filter gets input from outside our script and can then filter it. The function used for this is the filter_input() function. With this we can validate our variables as they come in from userland and be sure they are dealt with before we start using them. This ensures we have some semblance of a security model in place. If you are not moving to this architecture then you are letting yourself, and your customers, down. The input filter can gather data from several sources.

    INPUT_GET

    INPUT_POST

    INPUT_COOKIE

    INPUT_ENV

    INPUT_SERVER

    INPUT_SESSION (Not yet implemented)

    INPUT_REQUEST (Not yet implemented)

    Here follows a simple example of using the filter_input() function to deal with GET variables. Let's assume you have a URL of the type http://www.example.com?num=7. Let's see how we can validate this using our input filter.

    As seen with earlier use of the FILTER_VALIDATE_INT filter, we are able to validate that the supplied value is an integer and that it is within the range of 1 to 10. Should an invalid value be supplied, filter_input() will return bool(false). The INPUT_GET parameter tells filter_input() that the value is coming from GET.
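    A minimal sketch of the validation described above, assuming the GET variable is called num as in the example URL. Because filter_input() reads the request itself, it returns NULL when the script is run without that variable (for instance from the command line), so the sketch also applies the same filter to plain strings with filter_var().

```php
<?php
// Accept only integers from 1 to 10.
$options = array("options" => array("min_range" => 1, "max_range" => 10));

// Reads "num" from the query string, e.g. http://www.example.com?num=7
$num = filter_input(INPUT_GET, "num", FILTER_VALIDATE_INT, $options);

if ($num === null) {
    echo "num was not supplied", PHP_EOL;
} elseif ($num === false) {
    echo "num is not an integer between 1 and 10", PHP_EOL;
} else {
    echo "num is valid: ", $num, PHP_EOL;
}

// The same filter and options applied to plain strings.
$valid   = filter_var("7", FILTER_VALIDATE_INT, $options);  // int(7)
$invalid = filter_var("11", FILTER_VALIDATE_INT, $options); // bool(false)
```

    The strict comparisons matter here: a missing variable gives NULL and a failed validation gives false, and the two are easy to confuse with a loose check.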

    In the same manner as we validated the integer above, we can do the same for any value. Here we show filter_input() in use with a different filter.

    The above code uses the FILTER_SANITIZE_SPECIAL_CHARS filter to sanitize the value of the GET variable named text. When it finds special chars it converts them for use like this: #62;kevin