
String Searching In Parallel By Sowmya Padmanabhan Final Term Project Presentation for Parallel Processing Dr. Charles Fulton



One way to parallelize is:

Consider a huge text document (something like an encyclopedia available electronically) that you want to search for several words, phrases, or sentences at the same time.

We call what we are searching for the “search_string”.

Rather than having one processor look for all the search_strings in the given huge document, we could take advantage of parallel processing and have 10 different processors look for 10 different search_strings simultaneously, thereby doing the searching really quickly and efficiently.


My first program basically accomplishes this objective.

The document in which I am searching for search_strings is an actual document, a collection of William Shakespeare’s works, downloaded from an online resource, and consists of approximately 400 Million characters.

My program is capable of handling up to 450 Million characters.
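The division of work described above can be sketched in plain C. This is a minimal sketch, not the author's program: the function names `count_occurrences` and `search_assigned_strings` are hypothetical, and `rank` and `n_procs` are assumed to come from `MPI_Comm_rank` and `MPI_Comm_size` in a real MPI program. Search strings are dealt out round-robin, so 10 strings on 10 processors gives each processor exactly one.

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of pat in text (overlapping matches included). */
static long count_occurrences(const char *text, const char *pat) {
    size_t n = strlen(text), m = strlen(pat);
    long hits = 0;
    if (m == 0 || m > n) return 0;
    for (size_t i = 0; i + m <= n; i++) {
        if (memcmp(text + i, pat, m) == 0) hits++;
    }
    return hits;
}

/*
 * Work done by ONE process: search the whole document for every
 * search_string assigned to this rank (strings rank, rank + n_procs, ...).
 * counts[s] is written only for the strings this rank owns.
 * rank and n_procs are assumed to be supplied by the MPI runtime.
 */
static void search_assigned_strings(const char *document,
                                    const char *const search_strings[],
                                    int n_strings, int rank, int n_procs,
                                    long counts[]) {
    for (int s = rank; s < n_strings; s += n_procs) {
        counts[s] = count_occurrences(document, search_strings[s]);
    }
}
```

Every processor reads the full document here, so the speed-up comes only from splitting the list of search_strings, not from splitting the text.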


Second Way to Parallelize

Think of this scenario: I have to look up the available huge electronic document (again, imagine an encyclopedia) for just one word or phrase or sentence at a time.

How do I take advantage of parallel processing? Simple! Divide the whole document into as many equal parts as there are processors. Let’s call these “sub-documents” and allot each sub-document to one processor.

Now, what do we do with these sub-documents?


Yes, you are right! Have each of the processors search for the search_string in only the sub-document that it has been allotted.

Sounds great! So, how do I code it? Using MPI_Scatter, of course!

Note: This program works when the number of processors is 10 or more; with fewer processors, the buffer for the MPI_Scatter command gets exceeded.


Comparison of Times

See Table of Comparisons.


Algorithm for String Searching

    #include <string.h>

    /* Counts the occurrences of search_string in string.
       Overlapping occurrences are counted separately. */
    int string_searching_algo (char *string, char *search_string) {
        int i, j, k;
        int count = 0, occurrences = 0;

        const int len_search_string = strlen ( search_string );
        const int len_given_string = strlen ( string );

        /* Try every position where a full match could still fit. */
        for (i = 0; i <= (len_given_string - len_search_string); i++ ) {
            count = 0;

            /* Compare character by character; stop at the first mismatch. */
            for (j = i, k = 0; k < (len_search_string); j++, k++) {
                if ( *(string + j) != *(search_string + k) ) {
                    break;
                } else {
                    count++;
                }

                if ( count == len_search_string ) {
                    occurrences++;
                }
            }
        }

        return occurrences;
    }


Conclusion

String searching done in parallel saves a lot of time, especially when the search has to run over an extremely large document, and is more efficient than single-processor searching.

One way to parallelize is to have several processors search for different strings in one document in parallel; the second way is to have several processors search for the same string in different portions (sub-documents) of the same document in parallel.


One Problem however…

The second program, which uses MPI_Scatter, has one drawback: when a search_string overlaps two sub-documents (one portion of it lies at the end of one sub-document and the other portion at the beginning of the next sub-document, held by some other processor), neither processor sees the whole occurrence, so the program will not give proper results.
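One common remedy, which the presentation itself does not describe, is to let each processor read a few characters past the end of its sub-document: at most the length of the search_string minus one. The plain C sketch below simulates the sub-documents inside a single process to show the counting rule. The names `count_starting_in` and `count_with_overlap` are hypothetical, and how the extra characters would reach each processor in an MPI program is left open.

```c
#include <stddef.h>
#include <string.h>

/* Count matches of pat that START in text[begin, end).
   A match may run past end, but never past text_len. */
static long count_starting_in(const char *text, size_t text_len,
                              size_t begin, size_t end, const char *pat) {
    size_t m = strlen(pat);
    long hits = 0;
    if (m == 0) return 0;
    for (size_t i = begin; i < end && i + m <= text_len; i++) {
        if (memcmp(text + i, pat, m) == 0) hits++;
    }
    return hits;
}

/* Sum the counts over n_parts equal sub-documents. Each part may read up
   to strlen(pat) - 1 characters into its right-hand neighbour, so every
   match is counted exactly once, by the part in which it starts. */
static long count_with_overlap(const char *text, const char *pat,
                               size_t n_parts) {
    size_t n = strlen(text);
    long total = 0;
    if (n_parts == 0) return 0;
    size_t chunk = n / n_parts;
    for (size_t p = 0; p < n_parts; p++) {
        size_t begin = p * chunk;
        /* The last part also takes any leftover characters. */
        size_t end = (p + 1 == n_parts) ? n : begin + chunk;
        total += count_starting_in(text, n, begin, end, pat);
    }
    return total;
}
```

For example, splitting "abcdefgh" into the two sub-documents "abcd" and "efgh" hides the occurrence of "de" from a search confined to each part, while `count_with_overlap("abcdefgh", "de", 2)` finds it.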