27
And now you have two problems Ruby regular expressions for fun and profit Luca Mearelli @lmea Codemotion Rome - 2013

And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

Embed Size (px)

DESCRIPTION

A wise hacker said: Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. Regular expressions are a powerful tool in our hands and a first class citizen in ruby so it is tempting to overuse them. But knowing them and using them properly is a fundamental asset of every developer. We’ll see hands-on examples of proper Reg Exps usage in ruby code, we’ll also look at bad and ugly cases and learn how to approach writing, testing and debugging regular expressions.

Citation preview

Page 1: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

And now you have two problems

Ruby regular expressions for fun and profit

Luca Mearelli @lmeaCodemotion Rome - 2013

Page 2: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regular expressions

•cat catch indicate ...

•2013-03-22, YYYY-MM-DD, ...

•$ 12,500.80

patterns to describe the contents of a text

Page 3: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexps: good for...

Pattern matching

Search and replace

Page 4: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp in ruby

Regexp object: Regexp.new("cat")literal notation #1: %r{cat}literal notation #2: /cat/

Page 5: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

literals: /cat/ matches any ‘cat’ substring

the dot: /./ matches any character

character classes: /[aeiou]/ /[a-z]/ /[01]/

negated character classes: /[^abc]/

Page 6: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

case insensitive: /./i

only interpolate #{} blocks once: /./o

multiline mode - '.' will match newline: /./m

extended mode - whitespace is ignored: /./x

Modifiers

Page 7: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

/\d/ digit /\D/ non digit

/\s/ whitespace /\S/ non whitespace

/\w/ word character /\W/ non word character

/\h/ hexdigit /\H/ non hexdigit

Shorthand classes

Page 8: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

/^/ beginning of line /$/ end of line

/\b/ word boundary /\B/ non word boundary

/\A/ beginning of string /\z/ end of string

/\Z/end of string. If string ends with a newline,

it matches just before newline

Anchors

Page 9: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

alternation: /cat|dog/ matches ‘cats and dogs’

0-or-more: /ab*/ matches ‘a’ ‘ab’ ‘abb’...

1-or-more: /ab+/ matches ‘ab’ ‘abb’ ...

given-number: /ab{2}/ matches ‘abb’ but not ‘ab’ or the whole ‘abbb’ string

Page 10: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

greedy matches: /.+cat/ matches ‘the cat is catching a mouse’

lazy matches: /.+?\scat/ matches ‘the cat is catching a mouse’

Page 11: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp syntax

grouping: /(\d{3}\.){3}\d{3}/ matches IP-like strings

capturing: /a (cat|dog)/ the match is captured in $1 to be used later

non capturing: /a (?:cat|dog)/ no content captured

atomic grouping: /(?>a+)/ doesn’t backtrack

Page 12: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

String substitution

"My cat eats catfood".sub(/cat/, "dog") # => My dog eats catfood

"My cat eats catfood".gsub(/cat/, "dog") # => My dog eats dogfood

"My cat eats catfood".gsub(/\bcat(\w+)/, "dog\\1") # => My cat eats dogfood

"My cat eats catfood".gsub(/\bcat(\w+)/){|m| $1.reverse} # => My cat eats doof

Page 13: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

String parsing

"Codemotion Rome: Mar 20 to Mar 23".scan(/\w{3} \d{1,2}/)# => ["Mar 20", "Mar 23"]

"Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/) # => [["Mar", "20"], ["Mar", "23"]]

"Codemotion Rome: Mar 20 to Mar 23".scan(/(\w{3}) (\d{1,2})/){|a,b| puts b+"/"+a}# 20/Mar# 23/Mar# => "Codemotion Rome: Mar 20 to Mar 23"

Page 14: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexp methods

if "what a wonderful world" =~ /(world)/

puts "hello #{$1.upcase}" end# hello WORLD

if /(world)/.match("The world") puts "hello #{$1.upcase}" end# hello WORLD

match_data = /(world)/.match("The world")puts "hello #{match_data[1].upcase}" # hello WORLD

Page 15: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Rails app examples

# in routing

match 'path/:id', :constraints => { :id => /[A-Z]\d{5}/ }

# in validations

validates :phone, :format => /\A\d{2,4}\s*\d+\z/

validates :phone, :format => { :with=> /\A\d{2,4}\s*\d+\z/ }

validates :phone, :format => { :without=> /\A02\s*\d+\z/ }

Page 16: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Rails examples

# in ActiveModel::Validations::NumericalityValidatordef parse_raw_value_as_an_integer(raw_value) raw_value.to_i if raw_value.to_s =~ /\A[+-]?\d+\Z/end

# in ActionDispatch::RemoteIp::IpSpoofAttackError# IP addresses that are "trusted proxies" that can be stripped from# the comma-delimited list in the X-Forwarded-For header. See also:# http://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spacesTRUSTED_PROXIES = %r{ ^127\.0\.0\.1$ | # localhost ^(10 | # private IP 10.x.x.x 172\.(1[6-9]|2[0-9]|3[0-1]) | # private IP in the range 172.16.0.0 .. 172.31.255.255 192\.168 # private IP 192.168.x.x )\.}x

WILDCARD_PATH = %r{\*([^/\)]+)\)?$}

Page 17: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Regexps are dangerous

"If I was going to place a bet on something about Rails security, it'd be that there are more regex vulnerabilities in the tree. I am uncomfortable with how much Rails leans on regex for policy decisions."

Thomas H. Ptacek (Founder @ Matasano, Feb 2013)

Page 18: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Tip #1Beware of nested quantifiers

/(x+x+)+y/ =~ 'xxxxxxxxxy'

/(xx+)+y/ =~ 'xxxxxxxxxx'

/(?>x+x+)+y/ =~ 'xxxxxxxxx'

Page 19: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Tip #2Don’t make everything optional

/[-+]?[0-9]*\.?[0-9]*/ =~ '.'

/[-+]?([0-9]*\.?[0-9]+|[0-9]+)/

/[-+]?[0-9]*\.?[0-9]+/

Page 20: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Tip #3Evaluate tradeoffs

/\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b/

/(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"

.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*))*)?;\s*)/

Page 21: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Tip #4Capture repeated groups and don’t repeat a captured group/!(abc|123)+!/ =~ '!abc123!'# $1 == '123'

/!((abc|123)+)!/ =~ '!abc123!'# $1 == 'abc123'

Page 22: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Tip #5use interpolation with care

str = "cat"

/#{str}/ =~ "My cat eats catfood"

/#{Regexp.quote(str)}/ =~ "My cat eats catfood"

Page 23: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

Tip #6Don’t use ^ and $ to match the strings beginning and end

validates :url, :format => /^https?/

"http://example.com" =~ /^https?/

"javascript:alert('hello!');%0Ahttp://example.com"

"javascript:alert('hello!');\nhttp://example.com" =~ /^https?/

"javascript:alert('hello!');\nhttp://example.com" =~ /\Ahttps?/

Page 24: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

From 060bb7250b963609a0d8a5d0559e36b99d2402c6 Mon Sep 17 00:00:00 2001

From: joernchen of Phenoelit <[email protected]>Date: Sat, 9 Feb 2013 15:46:44 -0800Subject: [PATCH] Fix issue with attr_protected where malformed input could circumvent protection

Fixes: CVE-2013-0276--- activemodel/lib/active_model/attribute_methods.rb | 2 +- activemodel/lib/active_model/mass_assignment_security/permission_set.rb | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/activemodel/lib/active_model/attribute_methods.rb b/activemodel/lib/active_model/attribute_methods.rbindex f033a94..96f2c82 100644--- a/activemodel/lib/active_model/attribute_methods.rb+++ b/activemodel/lib/active_model/attribute_methods.rb@@ -365,7 +365,7 @@ module ActiveModel end @prefix, @suffix = options[:prefix] || '', options[:suffix] || ''- @regex = /^(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})$/+ @regex = /\A(#{Regexp.escape(@prefix)})(.+?)(#{Regexp.escape(@suffix)})\z/ @method_missing_target = "#{@prefix}attribute#{@suffix}" @method_name = "#{prefix}%s#{suffix}" enddiff --git a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb b/activemodel/lib/active_model/mass_assignment_security/permission_set.rbindex a1fcdf1..10faa29 100644--- a/activemodel/lib/active_model/mass_assignment_security/permission_set.rb+++ b/activemodel/lib/active_model/mass_assignment_security/permission_set.rb@@ -19,7 +19,7 @@ module ActiveModel protected def remove_multiparameter_id(key)- key.to_s.gsub(/\(.+/, '')+ key.to_s.gsub(/\(.+/m, '') end end -- 1.8.1.1

Page 25: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

From 99123ad12f71ce3e7fe70656810e53133665527c Mon Sep 17 00:00:00 2001From: Aaron Patterson <[email protected]>Date: Fri, 15 Mar 2013 15:04:00 -0700Subject: [PATCH] fix protocol checking in sanitization [CVE-2013-1857]

Conflicts: actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb--- .../action_controller/vendor/html-scanner/html/sanitizer.rb | 4 ++-- actionpack/test/template/html-scanner/sanitizer_test.rb | 10 ++++++++++ 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rbindex 02eea58..994e115 100644--- a/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb+++ b/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb@@ -66,7 +66,7 @@ module HTML # A regular expression of the valid characters used to separate protocols like # the ':' in 'http://foo.com'- self.protocol_separator = /:|(&#0*58)|(&#x70)|(%|&#37;)3A/+ self.protocol_separator = /:|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i # Specifies a Set of HTML attributes that can have URIs. self.uri_attributes = Set.new(%w(href src cite action longdesc xlink:href lowsrc))@@ -171,7 +171,7 @@ module HTML def contains_bad_protocols?(attr_name, value) uri_attributes.include?(attr_name) &&- (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(%|&#37;)3A/ && !allowed_protocols.include?(value.split(protocol_separator).first.downcase))+ (value =~ /(^[^\/:]*):|(&#0*58)|(&#x70)|(&#x0*3a)|(%|&#37;)3A/i && !allowed_protocols.include?(value.split(protocol_separator).first.downcase.strip)) end end enddiff --git a/actionpack/test/template/html-scanner/sanitizer_test.rb b/actionpack/test/template/html-scanner/sanitizer_test.rbindex 4e2ad4e..dee60c9 100644--- a/actionpack/test/template/html-scanner/sanitizer_test.rb+++ b/actionpack/test/template/html-scanner/sanitizer_test.rb@@ -176,6 +176,7 @@ class SanitizerTest < ActionController::TestCase %(<IMG SRC="jav&#x0A;ascript:alert('XSS');">),

Page 26: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

@lmea

ToolsPrint a cheatsheet!

Info:

http://www.regular-expressions.info

Debug:

http://rubular.com

http://rubyxp.com

Visualize:

http://www.regexper.com/

Page 27: And now you have two problems. Ruby regular expressions for fun and profit by Luca Mearelli

Thank you!