A Regular Expression Matching Algorithm Using Transition Merging

Authors: Jiekun Zhang, Dafang Zhang, Kun Huang

Publisher: IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2009)

Presenter: Sih-An Pan

Date: 2014/6/18

Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.

Computer & Internet Architecture Lab, CSIE, NCKU

Introduction

The authors in [1] propose State Merging DFA (SM-DFA), a novel method that reduces the DFA memory requirement while still providing worst-case speed guarantees.

SM-DFA yields large memory reductions, but it only reduces the number of states, and it adds auxiliary information to the transitions in the process.

The transitions themselves are not reduced, which increases the memory requirement.

STATE MERGING DFA

The transition [g-i]0 j1 indicates that the same next state, in this case state 5, is reached from state 3_4 upon receiving input character g, h, or i with label 0, or input character j with label 1.

The transition a001 from state 3_4 to state 1_2 means:

• The transition carries a label 0, which tells its destination state 1_2 that the transition is meant for the underlying original state 1.

• The transition is taken when its source state 3_4 receives label 0 or label 1.
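
The label rules described for the transitions out of merged state 3_4 can be sketched as a small lookup. This is a minimal illustration, not the authors' data structure: the names `LabeledTransition`, `TRANSITIONS`, and `step` are hypothetical, and the outgoing label on the transitions into state 5 is not stated in the text, so it is left as `None` here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class LabeledTransition:
    chars: FrozenSet[str]          # input characters that trigger the transition
    accept_labels: FrozenSet[int]  # labels the source state must currently hold
    next_state: str                # destination (possibly merged) state
    out_label: Optional[int]       # label handed to the destination state


# Only the transitions out of merged state 3_4 that the text describes.
# out_label=None for state 5 is an assumption (the text does not give it).
TRANSITIONS = {
    "3_4": [
        LabeledTransition(frozenset("ghi"), frozenset({0}), "5", None),
        LabeledTransition(frozenset("j"), frozenset({1}), "5", None),
        LabeledTransition(frozenset("a"), frozenset({0, 1}), "1_2", 0),
    ],
}


def step(state: str, label: int, ch: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (next_state, out_label), or None if no transition applies."""
    for t in TRANSITIONS.get(state, []):
        if ch in t.chars and label in t.accept_labels:
            return t.next_state, t.out_label
    return None


print(step("3_4", 0, "h"))  # ('5', None)
print(step("3_4", 1, "a"))  # ('1_2', 0)
print(step("3_4", 1, "g"))  # None: g, h, i require label 0
```

The incoming label selects which underlying original state (3 or 4) the merged state is standing in for, which is why the same input character can be valid under one label and rejected under another.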

TRANSITION MERGING DFA

String: "acgacik"

0 -> 1-2 -> 3-4 -> 5 -> 1-2 -> 3-4 -> 5 -> 6
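
The trace above can be replayed with a table-driven walk. This is only a sketch: the transition table below is hypothetical, reconstructed solely from the seven steps of the trace, and it leaves out the labels that the real automaton attaches to transitions.

```python
# Hypothetical transition table, reconstructed only from the trace for
# "acgacik"; label handling is deliberately omitted.
TRANSITIONS = {
    ("0", "a"): "1-2",
    ("1-2", "c"): "3-4",
    ("3-4", "g"): "5",
    ("3-4", "i"): "5",
    ("5", "a"): "1-2",
    ("5", "k"): "6",
}


def trace(text, start="0"):
    """Return the list of states visited while consuming text."""
    state = start
    path = [state]
    for ch in text:
        key = (state, ch)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from state {state} on {ch!r}")
        state = TRANSITIONS[key]
        path.append(state)
    return path


print(" -> ".join(trace("acgacik")))
# 0 -> 1-2 -> 3-4 -> 5 -> 1-2 -> 3-4 -> 5 -> 6
```

Each input character costs exactly one table lookup, which is the property that gives a DFA-style matcher its worst-case speed guarantee.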

EXPERIMENTAL RESULTS

The TM-DFA matching algorithm preserves the speed of pattern matching and reduces memory consumption by 30% compared to SM-DFA and by 42% compared to the original DFA.
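
Reading the figures 30 and 42 as percentages measured on the same rule set (an assumption; the text does not say so explicitly), the two numbers together imply how much SM-DFA alone saves over the original DFA. The helper name `implied_reduction` is hypothetical, and the result is a back-of-the-envelope figure rather than a number reported by the authors.

```python
def implied_reduction(tm_vs_dfa: float, tm_vs_sm: float) -> float:
    """Reduction of SM-DFA relative to the original DFA implied by the
    two TM-DFA reductions (all values are fractions, e.g. 0.42)."""
    tm_size = 1.0 - tm_vs_dfa             # TM-DFA size, original DFA = 1
    sm_size = tm_size / (1.0 - tm_vs_sm)  # TM-DFA = (1 - tm_vs_sm) * SM-DFA
    return 1.0 - sm_size


print(f"{implied_reduction(0.42, 0.30):.1%}")  # 17.1%
```

Under that assumption, SM-DFA would sit at roughly 83% of the original DFA's memory, with transition merging accounting for the rest of the saving.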

TM-DFA continues to reduce memory consumption, and Fig. 8 and Fig. 9 show that the proposed TM-DFA scheme is outperformed by the SM-DFA scheme as the rule length becomes larger.