And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

Preview:

DESCRIPTION

A wise hacker said: Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. Regular expressions are a powerful tool in our hands and a first class citizen in ruby so it is tempting to overuse them. But knowing them and using them properly is a fundamental asset of every developer. We’ll see hands-on examples of proper Reg Exps usage in ruby code, we’ll also look at bad and ugly cases and learn how to approach writing, testing and debugging regular expressions.

Citation preview

And now you have two problems

Ruby regular expressions for fun and profit

Luca Mearelli @lmeaCodemotion Rome - 2013

@lmea

Regular expressions

•cat catch indicate ...

•2013-03-22, YYYY-MM-DD, ...

•$ 12,500.80

patterns to describe the contents of a text

@lmea

Regexps: good for...

Pattern matching

Search and replace

@lmea

Regexp in ruby

Regexp object: Regexp.new("cat")literal notation #1: %r{cat}literal notation #2: /cat/

@lmea

Regexp syntax

literals: /cat/ matches any ‘cat’ substring

the dot: /./ matches any character

character classes: /[aeiou]/ /[a-z]/ /[01]/

negated character classes: /[^abc]/

@lmea

Regexp syntax

case insensitive: /./i

only interpolate #{} blocks once: /./o

multiline mode - '.' will match newline: /./m

extended mode - whitespace is ignored: /./x

Modifiers

@lmea

Regexp syntax

/\d/ digit /\D/ non digit

/\s/ whitespace /\S/ non whitespace

/\w/ word character /\W/ non word character

/\h/ hexdigit /\H/ non hexdigit

Shorthand classes

@lmea

Regexp syntax

/^/ beginning of line /$/ end of line

/\b/ word boundary /\B/ non word boundary

/\A/ beginning of string /\z/ end of string

/\Z/end of string. If string ends with a newline,

it matches just before newline

Anchors

@lmea

Regexp syntax

alternation: /cat|dog/ matches ‘cats and dogs’

0-or-more: /ab*/ matches ‘a’ ‘ab’ ‘abb’...

1-or-more: /ab+/ matches ‘ab’ ‘abb’ ...

given-number: /ab{2}/ matches ‘abb’ but not ‘ab’ or the whole ‘abbb’ string

@lmea

Regexp syntax

greedy matches: /.+cat/ matches ‘the cat is catching a mouse’

lazy matches: /.+?\scat/ matches ‘the cat is catching a mouse’

@lmea

Regexp syntax

grouping: /(\d{3}\.){3}\d{3}/ matches IP-like strings

capturing: /a (cat|dog)/ the match is captured in $1 to be used later

non capturing: /a (?:cat|dog)/ no content captured

atomic grouping: /(?>a+)/ doesn’t backtrack

@lmea

String substitution

"My cat eats catfood".sub(/cat/, "dog") # => My dog eats catfood

"My cat eats catfood".gsub(/cat/, "dog") # => My dog eats dogfood

"My cat eats catfood".gsub(/\bcat(\w+)/, "dog\\1") # => My cat eats dogfood

"My cat eats catfood".gsub(/\bcat(\w+)/){|m| $1.reverse} # => My cat eats doof

@lmea

String parsing

"Codemotion Rome: Mar 20 to Mar 23".scan(/\w{3} \d{1,2}/)# => ["Mar 20", "Mar 23"]

"Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/) # => [["Mar", "20"], ["Mar", "23"]]

"Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/){|a,b| puts b+"/"+a}# 20/Mar# 23/Mar# => "Codemotion Rome: Mar 20 to Mar 23"

@lmea

Regexp methods

if "what a wonderful world" =~ /(world)/

puts "hello #{$1.upcase}" end# hello WORLD

if /(world)/.match("The world") puts "hello #{$1.upcase}" end# hello WORLD

match_data = /(world)/.match("The world")puts "hello #{match_data[1].upcase}" # hello WORLD

@lmea

Rails app examples

# in routing

match 'path/:id', :constraints => { :id => /[A-Z]\d{5}/ }

# in validations

validates :phone, :format => /\A\d{2,4}\s*\d+\z/

validates :phone, :format => { :with=> /\A\d{2,4}\s*\d+\z/ }

validates :phone, :format => { :without=> /\A02\s*\d+\z/ }

@lmea

Rails examples

# in ActiveModel::Validations::NumericalityValidatordef parse_raw_value_as_an_integer(raw_value) raw_value.to_i if raw_value.to_s =~ /\A[+-]?\d+\Z/end

# in ActionDispatch::RemoteIp::IpSpoofAttackError# IP addresses that are "trusted proxies" that can be stripped from# the comma-delimited list in the X-Forwarded-For header. See also:# http://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spacesTRUSTED_PROXIES = %r{ ^127\.0\.0\.1$ | # localhost ^(10 | # private IP 10.x.x.x 172\.(1[6-9]|2[0-9]|3[0-1]) | # private IP in the range 172.16.0.0 .. 172.31.255.255 192\.168 # private IP 192.168.x.x )\.}x

WILDCARD_PATH = %r{\*([^/\)]+)\)?$}

@lmea

Regexps are dangerous

"If I was going to place a bet on something about Rails security, it'd be that there are more regex vulnerabilities in the tree. I am uncomfortable with how much Rails leans on regex for policy decisions."

Thomas H. Ptacek (Founder @ Matasano, Feb 2013)

@lmea

Tip #1Beware of nested quantifiers

/(x+x+)+y/ =~ 'xxxxxxxxxy'

/(xx+)+y/ =~ 'xxxxxxxxxx'

/(?>x+x+)+y/ =~ 'xxxxxxxxx'

@lmea

Tip #2Don’t make everything optional

/[-+]?[0-9]*\.?[0-9]*/ =~ '.'

/[-+]?([0-9]*\.?[0-9]+|[0-9]+)/

/[-+]?[0-9]*\.?[0-9]+/

@lmea

Tip #3Evaluate tradeoffs

/\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b/

/(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"

.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*))*)?;\s*)/

@lmea

Tip #4Capture repeated groups and don’t repeat a captured group/!(abc|123)+!/ =~ '!abc123!'# $1 == '123'

/!((abc|123)+)!/ =~ '!abc123!'# $1 == 'abc123'

@lmea

Tip #5use interpolation with care

str = "cat"

/#{str}/ =~ "My cat eats catfood"

/#{Regexp.quote(str)}/ =~ "My cat eats catfood"

@lmea

Tip #6Don’t use ^ and $ to match the strings beginning and end

validates :url, :format => /^https?/

"http://example.com" =~ /^https?/

"javascript:alert('hello!');%0Ahttp://example.com"

"javascript:alert('hello!');\nhttp://example.com" =~ /^https?/

"javascript:alert('hello!');\nhttp://example.com" =~ /\Ahttps?/

@lmea

From 060bb7250b963609a0d8a5d0559e36b99d2402c6 Mon Sep 17 00:00:00 2001

From: joernchen of Phenoelit <joernchen@phenoelit.de>Date: Sat, 9 Feb 2013 15:46:44 -0800Subject: [PATCH] Fix issue with attr_protected where malformed input could circumvent protection

Fixes: CVE-2013-0276--- activemodel/lib/active_model/attribute_methods.rb | 2 +- activemodel/lib/active_model/mass_assignment_security/permission_set.rb | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/activemodel/lib/active_model/attribute_methods.rb b/activemodel/lib/active_model/attribute_methods.rbindex f033a94..96f2c82 100644--- a/activemodel/lib/active_model/attribute_methods.rb+++ b/activemodel/lib/active_model/attribute_methods.rb@@ -365,7 +365,7 @@ module ActiveModel end @prefix, @suffix = options[:prefix] || '', options[:suffix] || ''- @regex = /^(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})$/+ @regex = /\A(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})\z/ @method_missing_target = "#{@prefix}attribute#{@suffix}" @method_name = "#{prefix}%s#{suffix}" enddiff --git a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb b/activemodel/lib/active_model/mass_assignment_security/permission_set.rbindex a1fcdf1..10faa29 100644--- a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb+++ b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb@@ -19,7 +19,7 @@ module ActiveModel protected def remove_multiparameter_id(key)- key.to_s.gsub(/\(.+/, '')+ key.to_s.gsub(/\(.+/m, '') end end -- 1.8.1.1

@lmea

From 99123ad12f71ce3e7fe70656810e53133665527c Mon Sep 17 00:00:00 2001From: Aaron Patterson <aaron.patterson@gmail.com>Date: Fri, 15 Mar 2013 15:04:00 -0700Subject: [PATCH] fix protocol checking in sanitization [CVE-2013-1857]

Conflicts: actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb--- .../action_controller/vendor/html-scanner/html/sanitizer.rb | 4 ++-- actionpack/test/template/html-scanner/sanitizer_test.rb | 10 ++++++++++ 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rbindex 02eea58..994e115 100644--- a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb+++ b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb@@ -66,7 +66,7 @@ module HTML # A regular expression of the valid characters used to separate protocols like # the ':' in 'http://foo.com'- self.protocol_separator = /:|(&#0*58)|(&#x70)|(%|&#37;)3A/+ self.protocol_separator = /:|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i # Specifies a Set of HTML attributes that can have URIs. self.uri_attributes = Set.new(%w(href src cite action longdesc xlink:href lowsrc))@@ -171,7 +171,7 @@ module HTML def contains_bad_protocols?(attr_name, value) uri_attributes.include?(attr_name) &&- (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(%|&#37;)3A/ && !allowed_protocols.include?(value.split(protocol_separator).first.downcase))+ (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i && !allowed_protocols.include?(value.split(protocol_separator).first.downcase.strip)) end end enddiff --git a/actionpack/test/template/html-scanner/sanitizer_test.rb b/actionpack/test/template/html-scanner/sanitizer_test.rbindex 4e2ad4e..dee60c9 100644--- a/actionpack/test/template/html-scanner/sanitizer_test.rb+++ b/actionpack/test/template/html-scanner/sanitizer_test.rb@@ -176,6 +176,7 @@ class SanitizerTest < ActionController::TestCase %(<IMG SRC="jav&#x0A;ascript:alert('XSS');">),

@lmea

ToolsPrint a cheatsheet!

Info:

http://www.regular-expressions.info

Debug:

http://rubular.com

http://rubyxp.com

Visualize:

http://www.regexper.com/

Thank you!

Recommended