PHP with Regular Expressions Web Technologies Computing Science Thompson Rivers University


PHP with Regular Expressions

Web Technologies

Computing Science, Thompson Rivers University

Regular Expressions 2

1. How to Use Regular Expressions

[Q] When a user registers in our application or changes a password, how can the application require a good password?

[Q] What is a good password? Minimum 8 characters. At least one special character or number.

[Q] How do we verify the format of an email address?

[Q] How do we find <title>…</title> in a web document? Link, title, comment, tags. No title.

[Q] Can we read the title from google.com?

[Q] What is a regular expression? Regular expressions are coded patterns that you can use to search for matching patterns in text strings. They are commonly used to validate the data that users enter in web forms.

[Q] Examples?

1. How to create and use regular expressions

2. How to match characters

3. How to use the character class

4. How to create complex patterns

5. How to use look-ahead assertions

6. How to use a multiline regular expression

7. How to replace a regular expression with a string

8. How to split a string on a regular expression

9. Regular expressions for data validation


How to create and use regular expressions

preg_match($pattern, $str) It returns 1 if the pattern is found and 0 if it’s not found. If there is an error in the pattern, it returns FALSE.

$pattern = '/Harris/'; // case sensitive?

$author_match = preg_match($pattern, 'Ray Harris'); // ?

$editor_match = preg_match($pattern, 'Joel Murach'); // ?

$pattern = '/murach/i'; // case insensitive?

$editor_match = preg_match($pattern, 'Joel Murach'); // ?
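The three return values of preg_match() can be checked with a short script. This is only a sketch: the variable names $pattern_i, $editor_match_i, and $broken are made up for illustration, and the @ operator is used just to hide the warning that PHP raises for the deliberately broken pattern.

```php
<?php
$pattern = '/Harris/';                                    // case sensitive
$author_match = preg_match($pattern, 'Ray Harris');       // 1: 'Harris' is found
$editor_match = preg_match($pattern, 'Joel Murach');      // 0: not found

$pattern_i = '/murach/i';                                 // i = ignore case
$editor_match_i = preg_match($pattern_i, 'Joel Murach');  // 1

// A pattern without its ending delimiter is an error, so FALSE is returned.
$broken = @preg_match('/Harris', 'Ray Harris');           // false

// Compare with === so that 0 (no match) is not confused with FALSE (error).
if ($author_match === 1) {
    echo "Author matched\n";
}
```

Because 0 and FALSE are both falsy, a test such as if (!preg_match(...)) cannot tell "no match" from "bad pattern"; the strict comparison can.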


How to match characters

\\ \/ \t \n \r \f Backslash, forward slash, tab, new line, carriage return, form feed

\xhh The character whose hexadecimal code is hh

$string = "© 2010 Mike's Music. \ All rights reserved (5/2010).";

preg_match('/\xA9/', $string); // ?

preg_match('///', $string); // something wrong? FALSE?

preg_match('/\//', $string); // ?

preg_match('/\\\\/', $string); // ?

// Be careful with '\'

// '\\' or "\\" is interpreted as a single backslash.
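The escaping rules above can be tried out as follows. This is a minimal sketch with a shortened test string; the variable names and the use of # as an alternative delimiter are illustrative choices, not part of the slides.

```php
<?php
// In a double-quoted PHP string, \\ is one backslash character.
$string = "Mike's Music. \\ All rights reserved (5/2010).";

$slash = preg_match('/\//', $string);        // 1: \/ matches the / in 5/2010

// The PHP string '/\\\\/' holds the regex /\\/, which matches one backslash.
$backslash = preg_match('/\\\\/', $string);  // 1

// '///' is read as an empty pattern followed by an unknown modifier '/'.
$bad = @preg_match('///', $string);          // false (a warning is suppressed by @)

// Another delimiter, such as #, avoids having to escape the slash.
$alt = preg_match('#/#', $string);           // 1

$one_char = strlen('\\');                    // 1: '\\' is a single backslash
```

A backslash therefore has to survive two layers: first the PHP string literal, then the regular expression itself.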


How to match types of characters

. Any single character except a new line

\w Any letter, number, or the underscore

\W Any character that is not a letter, number or the underscore

\d Any digit

\D Any character that is not a digit

\s Any whitespace character (space, tab, new line, carriage return, form feed, or vertical tab)

\S Any character that is not whitespace

$string = 'The product code is MBT-3461';

preg_match('/MB./', $string); // ?

preg_match('/MB\d/', $string); // ?

preg_match('/MB-\d/', $string); // ?
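The three questions above can be answered by running the patterns. In this sketch the variable names and the fourth pattern (/MBT-\d/) are added for illustration only.

```php
<?php
$string = 'The product code is MBT-3461';

$any   = preg_match('/MB./', $string, $m);  // 1: . matches the T, so $m[0] is 'MBT'
$digit = preg_match('/MB\d/', $string);     // 0: MB is followed by T, not by a digit
$dash  = preg_match('/MB-\d/', $string);    // 0: the string contains MBT-, not MB-
$code  = preg_match('/MBT-\d/', $string);   // 1: MBT- is followed by the digit 3
```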


How to use the character class

[chars] a single character that is listed inside the brackets

$string = 'The product code is MBT-3461';

preg_match('/MB[TF]/', $string); // ?

preg_match('/[.]/', $string); // ?


preg_match('/[12345]/', $string); // ?

^ [^aeiou] Negates the list of characters

- [a-z] Creates a range of characters

preg_match('/MB[^TF]/', $string); // ?

preg_match('/MBT[^^]/', $string); // ?

preg_match('/MBT-[1-5]/', $string); // ?

preg_match('/MBT[_*-]/', $string); // ?
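The character class examples above can be checked in one pass. This is a sketch; collecting the results in an array named $results is just a convenience for the example.

```php
<?php
$string = 'The product code is MBT-3461';

$results = [
    'MB[TF]'     => preg_match('/MB[TF]/', $string),     // 1: T is in the list
    '[.]'        => preg_match('/[.]/', $string),        // 0: inside [], . is a literal period
    '[12345]'    => preg_match('/[12345]/', $string),    // 1: the digit 3 is in the list
    'MB[^TF]'    => preg_match('/MB[^TF]/', $string),    // 0: MB is followed by T
    'MBT[^^]'    => preg_match('/MBT[^^]/', $string),    // 1: - is a character other than ^
    'MBT-[1-5]'  => preg_match('/MBT-[1-5]/', $string),  // 1: 3 is in the range 1-5
    'MBT[_*-]'   => preg_match('/MBT[_*-]/', $string),   // 1: a trailing - is a literal hyphen
];

foreach ($results as $label => $result) {
    echo $label . ' => ' . $result . "\n";
}
```

Most metacharacters lose their special meaning inside brackets; only ^ (at the start), - (between two characters), ] and \ still need care.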


[:digit:] Digits (same as \d ?)

[:lower:] Lower case letters

[:upper:] Upper case letters

[:alpha:] Upper and lower case letters

[:alnum:] Upper and lower case letters and digits

[:word:] Upper and lower case letters, digits, and the underscore (same as \w ?)

[:print:] All printable characters including the space

[:graph:] All printable characters excluding the space

[:punct:] All printable characters excluding letters and digits, i.e., special characters?

$string = "The product code is MBT-3461";

preg_match('/MBT[[:punct:]]/', $string); // ?

preg_match('/MBT[[:digit:]]/', $string); // ?

preg_match('/MB[[:upper:]]/', $string); // ?
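A bracket expression such as [:digit:] only works inside a character class, which is why the patterns above use double brackets. A small sketch (variable names are illustrative):

```php
<?php
$string = "The product code is MBT-3461";

$punct = preg_match('/MBT[[:punct:]]/', $string);  // 1: MBT is followed by -
$digit = preg_match('/MBT[[:digit:]]/', $string);  // 0: MBT is followed by -, not a digit
$upper = preg_match('/MB[[:upper:]]/', $string);   // 1: MB is followed by T

// Bracket expressions can be combined with ordinary characters in one class.
$mixed = preg_match('/MBT[[:digit:]_-]/', $string); // 1: - is in the list
```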


How to create complex patterns

^ The beginning of the string

$ The end of the string

\b The beginning or end of a word

\B A position other than the beginning or end of a word

$author = 'Ray Harris';

preg_match('/^Ray/', $author); // ?

preg_match('/Harris$/', $author); // ?

preg_match('/^Harris/', $author); // ???

$editor = 'Anne Bohme';

preg_match('/Ann/', $editor); // ?

preg_match('/Boh\b/', $editor); // ??? ***
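The anchor and word-boundary questions above, worked through. The last pattern (/Bohme\b/) is an extra case added for contrast; the variable names are illustrative.

```php
<?php
$author = 'Ray Harris';
$starts = preg_match('/^Ray/', $author);      // 1: the string begins with Ray
$ends   = preg_match('/Harris$/', $author);   // 1: the string ends with Harris
$wrong  = preg_match('/^Harris/', $author);   // 0: Harris is not at the beginning

$editor = 'Anne Bohme';
$part   = preg_match('/Ann/', $editor);       // 1: Ann is found inside Anne
$bound  = preg_match('/Boh\b/', $editor);     // 0: Boh is followed by m, so no word ends there
$whole  = preg_match('/Bohme\b/', $editor);   // 1: the end of the string ends the word
```

\b matches a position, not a character: it succeeds only where a word character (\w) sits next to a non-word character or the edge of the string.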

(subpattern) Creates a numbered subpattern group

(?:subpattern) Creates an unnumbered subpattern group

| Matches either the left or the right subpattern

\n Matches the text captured by numbered subpattern group n (a back reference, e.g. \1)

$name = 'Rob Robertson';

preg_match('/^(Rob)\b/', $name); // ?

preg_match('/^(Rob)|(Bob)\b/', $name); // ?

preg_match('/^(\w\w\w) \1/', $name);

// \1 stands for the three exact characters

// that are matched with \w\w\w ?

// [Q] The above regular expression searches for \w\w\w

// or \w\w\w \w\w\w ???
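The questions above can be explored with the sketch below. The test strings 'Rob Bobertson' and 'Jim Bob' and the grouped pattern /^(Rob|Bob)\b/ are additions for illustration. Two points are worth noticing: \1 repeats the exact text captured by group 1, and | has the lowest precedence, so ^ and \b each apply to only one side of an ungrouped alternation.

```php
<?php
$name = 'Rob Robertson';

$r1 = preg_match('/^(Rob)\b/', $name);                // 1
$r2 = preg_match('/^(\w\w\w) \1/', $name, $m);        // 1: matches 'Rob Rob'
$full  = $m[0];                                       // 'Rob Rob' (the whole match)
$group = $m[1];                                       // 'Rob' (what \1 refers to)
$r3 = preg_match('/^(\w\w\w) \1/', 'Rob Bobertson');  // 0: 'Bob' is not the same text as 'Rob'

// Ungrouped: read as "^(Rob)" OR "(Bob)\b", so Bob may appear anywhere.
$loose  = preg_match('/^(Rob)|(Bob)\b/', 'Jim Bob');  // 1

// Grouped: the string must start with Rob or Bob, followed by a word boundary.
$strict = preg_match('/^(Rob|Bob)\b/', 'Jim Bob');    // 0
$named  = preg_match('/^(Rob|Bob)\b/', $name);        // 1
```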

{n} Subpattern must repeat exactly n times

{n,} Subpattern must repeat n or more times

{n,m} Subpattern must repeat from n to m times

? Zero or one of the previous subpattern

+ One or more of the previous subpattern

* Zero or more of the previous subpattern

$phone = '559-555-6627'; // pattern for phone numbers?

preg_match('/^\d{3}-\d{3}-\d{4}$/', $phone); // ?

$fax = '(559) 555-6627';

preg_match('/^\(\d{3}\) *\d{3}-\d{4}$/', $fax); // ?

// pattern for both?

// the two prefixes share one group, so ^ and the rest of the pattern apply to both

$phone_pattern = '/^(\d{3}-|\(\d{3}\) +)\d{3}-\d{4}$/';

preg_match($phone_pattern, $phone); // ?

preg_match($phone_pattern, $fax); // ?
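A sketch of how the quantifiers behave on the phone and fax examples. The pattern $both is one possible way to accept either format and is an assumption of this example, as are the extra test strings.

```php
<?php
$phone = '559-555-6627';
$fax   = '(559) 555-6627';

$exact    = preg_match('/^\d{3}-\d{3}-\d{4}$/', $phone);           // 1
$too_long = preg_match('/^\d{3}-\d{3}-\d{4}$/', '559-555-66270');  // 0: {4} means exactly 4 digits

$fax_ok      = preg_match('/^\(\d{3}\) *\d{3}-\d{4}$/', $fax);             // 1
$fax_nospace = preg_match('/^\(\d{3}\) *\d{3}-\d{4}$/', '(559)555-6627');  // 1: * allows zero spaces

// One pattern for both formats: the alternatives are wrapped in a single group.
$both = '/^(\d{3}-|\(\d{3}\) *)\d{3}-\d{4}$/';
$both_phone = preg_match($both, $phone);       // 1
$both_fax   = preg_match($both, $fax);         // 1
$both_bad   = preg_match($both, '559-oops');   // 0
```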

How to use look-ahead assertions

[Q] How can we require users to include a special character in their passwords?

An assertion is a fact about the pattern that must be true.

(?=assertion) Creates a look-ahead assertion

(?!assertion) Creates a negative look-ahead assertion

(?=[[:digit:]]) The next character in the pattern must be a digit

(?=.*[[:digit:]]) The pattern must contain at least one digit

$pattern = '/^(?=.*[[:digit:]])[[:alnum:]]{6}$/';

preg_match($pattern, 'Harris'); // ?

preg_match($pattern, 'Harri5'); // ?

preg_match($pattern, 'Harrison5'); // ?
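Working through the three calls above (the variable names are illustrative). The look-ahead checks a condition without consuming any characters, so the six-character test that follows still starts at the beginning of the string.

```php
<?php
// Exactly six letters or digits, at least one of which is a digit.
$pattern = '/^(?=.*[[:digit:]])[[:alnum:]]{6}$/';

$no_digit = preg_match($pattern, 'Harris');     // 0: six characters, but no digit
$ok       = preg_match($pattern, 'Harri5');     // 1: six characters including a digit
$too_long = preg_match($pattern, 'Harrison5');  // 0: has a digit, but nine characters
```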

$pattern = '/^(?!3[2-9])[0-3][[:digit:]]$/';

// useful for checking the day of a month

preg_match($pattern, '32'); // ?

preg_match($pattern, '31'); // ?

preg_match($pattern, '21'); // ?

$pw_pattern = // [Q] password pattern?

// Should include a digit, a special

// character and minimum 8 characters.

'/^(?=.*[[:digit:]])(?=.*[[:punct:]])[[:print:]]{8,}$/';

preg_match($pw_pattern, 'sup3rsecret'); // ?

preg_match($pw_pattern, 'sup3rse(ret'); // ?

preg_match($pw_pattern, 'su3)rt'); // ?
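The two patterns above can be tested like this. The helper function is_strong_password() is a hypothetical wrapper written for this sketch; it is not a PHP built-in.

```php
<?php
// Negative look-ahead: two digits starting with 0-3, but not 32 through 39.
$day_pattern = '/^(?!3[2-9])[0-3][[:digit:]]$/';
$d32 = preg_match($day_pattern, '32');  // 0: rejected by (?!3[2-9])
$d31 = preg_match($day_pattern, '31');  // 1
$d21 = preg_match($day_pattern, '21');  // 1

// Hypothetical helper: at least one digit, one punctuation mark, and 8+ characters.
function is_strong_password(string $password): bool
{
    $pw_pattern = '/^(?=.*[[:digit:]])(?=.*[[:punct:]])[[:print:]]{8,}$/';
    return preg_match($pw_pattern, $password) === 1;
}

$p1 = is_strong_password('sup3rsecret');  // false: no special character
$p2 = is_strong_password('sup3rse(ret');  // true
$p3 = is_strong_password('su3)rt');       // false: only six characters
```

Note that the day pattern still accepts 00, so a real form would need one more check for that case.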


How to use a multiline regular expression

$string = "Ray Harris\nAuthor";

$pattern1 = '/Harris$/';

preg_match($pattern1, $string); // no?

$pattern2 = '/Harris$/m'; // multiline

preg_match($pattern2, $string); // ?
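A sketch of what the m flag changes. The ^Author patterns are added here to show that the flag affects ^ in the same way as $.

```php
<?php
$string = "Ray Harris\nAuthor";

// Without m, $ matches only at the very end of the string
// (or just before a final new line).
$single = preg_match('/Harris$/', $string);    // 0

// With m, $ also matches at the end of every line.
$multi  = preg_match('/Harris$/m', $string);   // 1

// The same applies to ^ and the start of each line.
$start_single = preg_match('/^Author/', $string);   // 0
$start_multi  = preg_match('/^Author/m', $string);  // 1
```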


A function to find/get multiple matches in a string

preg_match_all($pattern, $string, $matches)

Returns a count of the number of matches or FALSE.

Also stores all matched substrings as a multi-dimensional array in the $matches. $matches[0] is an array of full pattern matches. If the pattern contains numbered parenthesized subpatterns, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.

preg_match($pattern, $string, $matches) // the 1st match

$string = 'MBT-6745 MBT-5712';

$pattern = '/MBT-[[:digit:]]{4}/';

$count = preg_match_all($pattern, $string, $matches); // ?

foreach($matches[0] as $match)

echo '<div>' . $match . '</div>'; // ?
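The layout of $matches is easier to see with a capturing group in the pattern. In this sketch the parentheses around the digits and the variable names $codes and $numbers are additions for illustration.

```php
<?php
$string  = 'MBT-6745 MBT-5712';
$pattern = '/MBT-([[:digit:]]{4})/';   // group 1 captures the four digits

$count = preg_match_all($pattern, $string, $matches);  // 2

$codes   = $matches[0];  // full matches:  ['MBT-6745', 'MBT-5712']
$numbers = $matches[1];  // group 1 only:  ['6745', '5712']

foreach ($codes as $i => $code) {
    echo '<div>' . $code . ' has number ' . $numbers[$i] . '</div>';
}

// preg_match() stops after the first match and stores a flat array instead.
preg_match($pattern, $string, $first);  // $first is ['MBT-6745', '6745']
```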

How to replace a regular expression with a string

preg_replace($pattern, $new, $string) Returns a string in which each match of the pattern in $string is replaced with $new.

How to split a string on a regular expression

preg_split($pattern, $string) Returns an array of strings that's created by splitting the string on the specified pattern.

$items = 'MBT-6745 MBS-5729';

$items = preg_replace('/MB[TS]/', 'ITEM', $items);

$items = 'MBT-6745 MBS-5729, MBT-6824, and MBS-5214';

$pattern = '/[, ]+(and[ ]*)?/'; // ?

$items = preg_split($pattern, $items);

foreach($items as $item)

echo '<li>' . $item . '</li>'; // ?
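The results of the two calls above, spelled out. The variable names $replaced and $list are illustrative; the patterns and strings come from the slide.

```php
<?php
$items = 'MBT-6745 MBS-5729';
$replaced = preg_replace('/MB[TS]/', 'ITEM', $items);
// 'ITEM-6745 ITEM-5729': every match is replaced, not just the first one

$items = 'MBT-6745 MBS-5729, MBT-6824, and MBS-5214';
// One or more commas or spaces, optionally followed by the word "and".
$pattern = '/[, ]+(and[ ]*)?/';
$list = preg_split($pattern, $items);
// ['MBT-6745', 'MBS-5729', 'MBT-6824', 'MBS-5214']

foreach ($list as $item) {
    echo '<li>' . $item . '</li>';
}
```

The separators themselves, including the optional "and", are discarded, so only the four product codes remain.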

Regular expressions for data validation

[Q] Phone numbers: 999-999-9999

'/^[[:digit:]]{3}-[[:digit:]]{3}-[[:digit:]]{4}$/'

[Q] Credit card numbers: 9999-9999-9999-9999

'/^[[:digit:]]{4}-[[:digit:]]{4}-[[:digit:]]{4}-[[:digit:]]{4}$/'

[Q] Zip code: 99999 or 99999-9999

'/^[[:digit:]]{5}(-[[:digit:]]{4})?$/'

[Q] Dates: mm/dd/yyyy

'/^(0?[1-9]|1[0-2])\/(0?[1-9]|[12][[:digit:]]|3[01])\/[[:digit:]]{4}$/'
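A pattern can only check the format of the input, not whether the value makes sense. The sketch below tests the zip code pattern and then combines the date pattern with PHP's checkdate() function. The helper is_valid_date() and the extra parentheses around the year are assumptions of this example.

```php
<?php
$zip = '/^[[:digit:]]{5}(-[[:digit:]]{4})?$/';
$zip5  = preg_match($zip, '93711');       // 1
$zip9  = preg_match($zip, '93711-1234');  // 1: the optional group is present
$zip4  = preg_match($zip, '9371');        // 0: only four digits

// Hypothetical helper: format check first, then a calendar check.
function is_valid_date(string $date): bool
{
    // Groups: 1 = month, 2 = day, 3 = year
    $pattern = '/^(0?[1-9]|1[0-2])\/(0?[1-9]|[12][[:digit:]]|3[01])\/([[:digit:]]{4})$/';
    if (preg_match($pattern, $date, $m) !== 1) {
        return false;
    }
    return checkdate((int) $m[1], (int) $m[2], (int) $m[3]);
}

$good  = is_valid_date('12/31/2010');  // true
$month = is_valid_date('13/01/2010');  // false: the pattern rejects month 13
$feb30 = is_valid_date('2/30/2010');   // false: the format is fine, but checkdate() rejects it
```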


[Q] Email address?

function valid_email($email) {
    // Split into the local part and the domain part
    $parts = explode("@", $email);
    if (count($parts) != 2) return false;
    if (strlen($parts[0]) > 64) return false;
    if (strlen($parts[1]) > 255) return false;

    // Local part: either dot-separated atoms or a quoted string
    $atom = '[[:alnum:]_!#$%&\'*+\/=?^`{|}~-]+';
    $dotatom = '(\.' . $atom . ')*';
    $address = '(^' . $atom . $dotatom . '$)';
    $char = '([^\\\\"])';
    $esc = '(\\\\[\\\\"])';
    $text = '(' . $char . '|' . $esc . ')+';
    $quoted = '(^"' . $text . '"$)';
    $local_part = '/' . $address . '|' . $quoted . '/';
    $local_match = preg_match($local_part, $parts[0]);
    if ($local_match !== 1) return false;   // 0 (no match) or false (error)

    // Domain part: one or more host names followed by a top-level domain
    $hostname = '([[:alnum:]]([-[:alnum:]]{0,62}[[:alnum:]])?)';
    $hostnames = '(' . $hostname . '(\.' . $hostname . ')*)';
    $top = '\.[[:alnum:]]{2,6}';
    $domain_part = '/^' . $hostnames . $top . '$/';
    $domain_match = preg_match($domain_part, $parts[1]);
    if ($domain_match !== 1) return false;

    return true;
}

[Q] How to get <title>…</title> from a URL?

function getTitle($url)
{
    $str = file_get_contents($url);   // $url should be http://...
    if ($str === false || strlen($str) == 0) {
        return '';
    }
    // (.*?) is non-greedy: it stops at the first </title>.
    // preg_match(), unlike preg_match_all(), stops after the first match.
    // The i flag ignores case; the s flag lets . match new lines.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $str, $title) === 1) {
        return $title[1];   // not $title[0]: [1] is the first captured
                            // parenthesized subpattern
    }
    return '';
}
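The difference between greedy and non-greedy matching can be seen without fetching a page. This sketch runs two patterns on a made-up HTML string; the string and the variable names are assumptions of the example.

```php
<?php
$html = '<title>First</title><p>text</p><title>Second</title>';

// Greedy: .* takes as much as it can, so it runs on to the LAST </title>.
preg_match('/<title>(.*)<\/title>/', $html, $greedy);
$greedy_title = $greedy[1];   // 'First</title><p>text</p><title>Second'

// Non-greedy: .*? takes as little as it can, so it stops at the FIRST </title>.
preg_match('/<title>(.*?)<\/title>/', $html, $lazy);
$lazy_title = $lazy[1];       // 'First'

// When nothing matches, $matches stays empty, so check the return value first.
$found = preg_match('/<title>(.*?)<\/title>/', '<p>No title here</p>', $none);
$safe_title = ($found === 1) ? $none[1] : '';   // ''
```

Regular expressions are adequate for pulling one simple element out of a page; for anything more involved, an HTML parser is usually the safer tool.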
