22
Rabin-Karp algorithm Robin Visser

Rabin-Karp algorithm

  • Upload
    menora

  • View
    113

  • Download
    5

Embed Size (px)

DESCRIPTION

Rabin-Karp algorithm. Robin Visser. What is Rabin-Karp?. What is Rabin-Karp?. String searching algorithm. What is Rabin-Karp?. String searching algorithm Uses hashing. What is Rabin-Karp?. String searching algorithm Uses hashing (e.g. hash(“answer ” ) = 42 ). Algorithm. - PowerPoint PPT Presentation

Citation preview

Page 1: Rabin-Karp algorithm

Rabin-Karp algorithmRobin Visser

Page 2: Rabin-Karp algorithm

What is Rabin-Karp?

Page 3: Rabin-Karp algorithm

What is Rabin-Karp?

String searching algorithm

Page 4: Rabin-Karp algorithm

What is Rabin-Karp?

String searching algorithm

Uses hashing

Page 5: Rabin-Karp algorithm

What is Rabin-Karp?

String searching algorithm

Uses hashing

(e.g. hash(“answer”) = 42 )

Page 6: Rabin-Karp algorithm

Algorithm

Page 7: Rabin-Karp algorithm

Algorithmfunction RabinKarp(string s[1..n], string sub[1..m])

hsub := hash(sub[1..m]); hs := hash(s[1..m])

for i from 1 to n-m+1

if hs = hsub

if s[i..i+m-1] = sub

return i

hs := hash(s[i+1..i+m])

return not found

Page 8: Rabin-Karp algorithm

Algorithmfunction RabinKarp(string s[1..n], string sub[1..m])

hsub := hash(sub[1..m]); hs := hash(s[1..m])

for i from 1 to n-m+1

if hs = hsub

if s[i..i+m-1] = sub

return i

hs := hash(s[i+1..i+m])

return not found

Naïve implementation: Runs in O(nm)

Page 9: Rabin-Karp algorithm

Hash function

Page 10: Rabin-Karp algorithm

Hash functionUse rolling hash to compute the

next hash value in constant time

Page 11: Rabin-Karp algorithm

Hash functionUse rolling hash to compute the

next hash value in constant timeExample: If we add the values of each character in the substring as our hash, we get:

hash(s[i+1..i+m]) = hash(s[i..i+m-1]) – hash(s[i]) + hash(s[i+m])

Page 12: Rabin-Karp algorithm

Hash functionA popular and effective hash function

treats every substring as a number in some base, usually a large prime.

Page 13: Rabin-Karp algorithm

Hash functionA popular and effective hash function

treats every substring as a number in some base, usually a large prime.

(e.g. if the substring is “IOI" and the base is 101, the hash value would be 73 × 1012 + 79 × 1011  + 73 × 1010 = 752725)

Page 14: Rabin-Karp algorithm

Hash functionA popular and effective hash function

treats every substring as a number in some base, usually a large prime.

(e.g. if the substring is “IOI" and the base is 101, the hash value would be 73 × 1012 + 79 × 1011  + 73 × 1010 = 752725)

Due to the limited size of the integer data type, modular arithmetic must be used to scale down the hash result.

Page 15: Rabin-Karp algorithm

How is this useful

Page 16: Rabin-Karp algorithm

How is this useful Inferior to KMP algorithm and Boyer-Moore

algorithm for single pattern searching, however can be used effectively for multiple pattern searching.

Page 17: Rabin-Karp algorithm

How is this useful Inferior to KMP algorithm and Boyer-Moore

algorithm for single pattern searching, however can be used effectively for multiple pattern searching.

We can create a variant, using a Bloom filter or a set data structure to check whether the hash of a given string belongs to a set of hash values of patterns we are looking for.

Page 18: Rabin-Karp algorithm

How is this usefulConsider the following variant:

Page 19: Rabin-Karp algorithm

How is this usefulConsider the following variant:

function RabinKarpSet(string s[1..n], set of string

subs, m):

set hsubs := emptySet

for each sub in subs

insert hash(sub[1..m]) into hsubs

hs := hash(s[1..m])

for i from 1 to n-m+1

if hs hsubs and s∈ [i..i+m-1] subs ∈

return i

hs := hash(s[i+1..i+m])

return not found

Page 20: Rabin-Karp algorithm

How is this usefulRuns in O(n + k) time, compared to

O(nk) time when searching each string individually.

Page 21: Rabin-Karp algorithm

How is this usefulRuns in O(n + k) time, compared to

O(nk) time when searching each string individually.

Note that a hash table checks whether a substring hash equals any of the pattern hashes in O(1) time on average.

Page 22: Rabin-Karp algorithm

Questions?