DESIGN AND ANALYSIS OF O ALGORITHMS HM

Associate ProfessorDepartment of Information Science and Technology

College of Engineering, Guindy CampusAnna University, Chennai

DESIGN ANDANALYSIS OF ALGORITHMS

Oxford

e ofofAnnaAnnaU

niver

sity

sociate Profciate PInformation formatioff EngineeEngi

Un

vers

Pres

sOOHMHM

Oxford University Press is a department of the University of Oxford.It furthers the University’s objective of excellence in research, scholarship,

and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

Published in India by Oxford University Press

YMCA Library Building, 1 Jai Singh Road, New Delhi 110001, India

© Oxford University Press 2014

The moral rights of the author/s have been asserted.

First published in 2014

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the

prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence, or under terms agreed with the appropriate reprographics

rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the

address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

ISBN-13: 978-0-19-809369-5ISBN-10: 0-19-809369-1

Typeset in Times New Romanby Ideal Publishing Solutions, Delhi

Printed in India by Radha Press, New Delhi 110031

Third-party website addresses mentioned in this book are providedby Oxford University Press in good faith and for information only.

Oxford University Press disclaims any responsibility for the material contained therein.

Oxford

ebsite aite aUniversity Pversity P

Press disclaimdisclaim

Univer

sity

o

k in any otherin any othcondition on anndition on an

-0-19-809369-519-80936: 0-19-809369-1-19-80936

et in Times Newin Timesal Publishing Sl Publishin

ndia by Radha ndia by Rad

dressesdresss

Pres

sed, storestorens, withoutwithou

expressly pepressly pepriate reproiate repro

outside theoutside thed Universid Universi

11.8.1 M S TThe history of an MST is as interesting as its concept. In 1926, Otaker Boruvka formulated the MST problem. A Polish mathematician, Vojtech Jarnik, described the problem in 1929 in a letter to Otaker Boruvka. The same problem was conceived independently by Kruskal in 1956. ence, Kruskal rediscovered the problem. ater it was de ned independently by Robert Prim in 1957 and by Edsger Dijkstra in 1958

C H A P T E R

1Introduc�onto

Algorithms C H A P T E R

18 BasicsofComputa�onal

Complexity

Features of the Book

Topical Coverage The book provides

extensive coverage of

design techniques, followed by discussions

Treatment of Concepts

are provided using

Algorit m Presenta on

in two ways, that is, step-wise approach

pseudocode approach

understanding of the logic behind solving a

Example 11.12 Consider the graph G shown in Fig. 11.11. Construct an MST for the given graph G using Kruskal’s algorithm.

The rst step in Kruskal’s algorithm is to sort all the edges and form an edge list, say E. The edges of graph G shown in Fig. 11.21 are sorted and shown in Table 11.13.

Example 11.12

Since the disjoint set data structure is used, initialization takes at most (|V|) time. The time complexity of the algorithm depends on the number of edges. As there are |E| edges, O(| E |log| E |) time is required to sort these edges. The disjoint set takes at most 2| E | nd opera-tions and |V | 1 operations. Therefore, the total complexity of Kruskal’s algorithm is at most O(| E |log| E |) time.

Step 1: Create a node x by allocating memory for it.Step 2: Assign the required value to the item part of node x. item(x) = valueStep 3: Set in the pointer to null. next(x) = nullStep 4: Return the node x.Algorithm create(L, x, value)

%% Input: List L and element x with 'value'%% Output: Node xBegin allot(x) %% Allot memory for node x with two elds item and next item(x) = value

next(x) = null

C H A P T E R

5DataStructures—I

“Bad programmers worry about the code. Good programmers worry about data structures C H A P T E R

11 GreedyAlgorithms

“Greed is all right, by the way... I think greed is healthy. You can be greedy d till f l d b t lf ”

C H A P T E R

13DynamicProgramming

Oxfor

ed, initialiends on the num

t these edgerefore, th

ord

sh

lgorithm is in Fig. 11.2

Ooror

O

Univer

sity

er Boruvka formulaer Boruvka formbed the problem ind the probl

ed independently byndependenwas de ned indepwas de ned i

ers

UnUnwn in Fig

as de ned

Un

yyy Pres

sssT E R

13ssss

Glossary and Summary

given at the end of each chapter to help readers

Revie ues ons E ercises and

Addi onal Pro lems

the end of every chapter to test the readers’

conceptual knowledge and also enhance their

Crossword PuzzlesCrossword puzzles,

exercise at the end of each

readers to self-check their

CROSSWORD

1

2

3

4 5

7

8

9 10

11

Historical Notes Historical notes are

provided throughout the The word algorithm is derived from the name of a Persian mathematician, Abu Ja’fer Mohammed Ibn Musa al Khowarizmi, who lived sometime around 780–850 AD. He was from the town of Khowarazm, now in Uzbekistan. He was a teacher of mathematics in Baghdad. He wrote a book

Europe. He also introduced the simple step-bfor addition, subtraction, multiplication, andhis book. The word algebra has also been dthe title of this book. When his book was tLatin, his name was quoted as Algorismus

Box 1.1 Origin of the word ‘algorithm’

eorge Bernard Dantzig was born in 1914 at Portland, Oregon, United States. His father became a professor of mathematics at University of Maryland after World War II. Dantzig’s biggest contribution is that he designed the simplex method for solving LPPs. Apart from the simplex

duality theory. He worked with Fulkerson anin formulating the travelling salesperson prolinear programming and solved the TSP proble49 cities at that time. In 197 , he was awardNational Medal of Science, the highest honou

Box 17.2 George Bernard Dantzig

GLOSSARY

Agent A performer of an algorithmAlgorist A person who is skilled in algorithm developmentAlgorithm A step-by-step procedure for solving a given

problemAlgorithm gap The difference between lower and upper

The process of providcal proof that the algorithm works correctly f

Algorithm validation The process of checness of an algorithm, this is done by givinit and checking its results with expected SUMMARY

An algorithm is a step-by-step procedure for solving a given problem.

A computational problem is characterized by two factors speci cation of valid input and output param-

Algorithm veri cation is a process of provematical proof that the given algorithm wofor all instances of data.

A proof of an algorithm is said to exist if t

REVIEW QUESTIONS

1.1 De ne an algorithm.1.2 What are the characteristics of an algorithm1.3 Survey the Internet and list out at least ve algorithms

that have huge impact on our daily lives.1.4 What are the stages of problem solving

1.8 What is the difference between algoand algorithm validation

1.9 How is an algorithm validated and with an example.

1.10 State algorithm classi cations. Wha EXERCISES

1.1 Assume that there are two algorithms A and B for a given problem P. The time complexity functions of algo-rithms A and B are, respectively, 5n and log2n. Which algorithm should be selected assuming that all other conditions remain the same for both the problems

1 2 L h f l h di h

complexities of A, B, and C are 3nrespectively. Assume that the input iAssume that the machine executesper second. How much time will algoC take Which algorithm will be the ADDITIONAL PROBLEM

1.1 John MacCormick had written a book titled Nine Algorithms That Changed the Future: The Ingenious Ideas that drive Today’s Computers, Princeton University Press, Princeton, that had listed the nine wonderful algorithms, namely, search engine indexing, page rank,

recognition, data compression, databsignatures, that changed the world.

(a) What are these algorithms Seand nd what these algorithms

(b) Identify one more algorithm that

Oxford

fOxpter

erge

rdSthat

4 What a

rdrdive

rsityes

n algorithorit

m is said to exist if tm is said to exis

ityt

Univn algorithm.at are the char

y the Inteh

iviv

Preseses

sgiv

PprovprovwowoPPvvoo

Computers have become an integral part of our daily life in recent times. They have enormously impacted our personal, professional, as well as social lives. Computers help us in tasks such as document editing, Internet browsing, sending emails, making presentations, performing complex scienti c computations, social networking, and playing games. Industries and government of ces use computers effectively in production, e-governance, and e-commerce. Considering the increasing demand of computers in society, schools, colleges, and universities have included computer education in their curriculum, to help students become skilled in programming and developing applications which can be used to solve various business, scienti c, and social problems.

Programming is a process of converting a given problem into an executable code for the computer. It involves understanding, analysis, and solving problems to create an algorithm. Veri cation of the algorithm, coding of the algorithm in a speci c programming language, testing, debugging, and maintaining the source code are also part of the programming process. Therefore, in order to construct ef cient programs, a ne understanding of algorithms is essential. An algorithm is a set of logical instructions for solving a given problem. It is expected to give correct results for valid inputs and should be ef cient, consuming less computer resources. An algorithm is implemented as a program using a programming language. A well-designed algorithm runs faster and consumes lesser computer resources, namely time and space. Therefore, expertise in programming is more related to ef ciency in problem-solving and effective de-signing of algorithms, rather than developing codes with the help of programming languages. Though programming languages are important, their role is just limited to the implementation of a well-designed algorithm. For this reason, algorithms are a central theme of computer study.

In fact, history of algorithms is much older than that of computers, dating back to 3000 BC. The ancient people of the Sumerian civilization were aware of basic numeric computation like addition. A Sumerian tablet found in the Euphrates river showed how to partition a given quantity of wheat in a way that each person receives the speci ed quantity. Such tablets were also used by ancient Babylonians (2000 to 1650 BC). Mathematicians of this period such as Euclid, Al-Khwarizmi, Leonardo Pisano (also known as Fibonacci), and others produced procedures that provided the foundations of the concepts of algorithms. Later developments in the eld of algorithms were due to the contributions of Gottfried Leibnitz, David Hilbert, and Alan Turing. The history of modern computers, however, only starts from the 1940s. Thus, algorithms have played a very important role in the development of modern computing.

The study of algorithms, called algorithmics, includes three aspects—algorithm design, analysis, and computational complexity of problems. Algorithm design is a creative activity. It includes various techniques (such as divide-and-conquer, greedy approach, dynamic programming, backtracking) that help in producing outputs at a faster pace by consuming lesser computer resources. Algorithm analysis is the estimation of how much resource is required by the algorithm. Computational complexity deals with the analysis and solvability of problems itself. A compulsory course on algorithms design and analysis is generally offered to computer science and information technology students in most universities. The course aims to help students create ef cient algorithms in common engineering design situations and analyse the asymptotic performance of algorithms by using the important algorithmic design paradigms and methods of analysis.

ABOUT THE BOOK

Design and Analysis of Algorithms is designed to serve as a textbook for the rst level course in algorithms that discusses all the fundamental and necessary information related to the three important aspects of

Oxford

vilizatlizatuphrates rivrates riv

he speci ed he speci eematicians oaticians

and others pnd othersvelopmentelopmen

rt, andrt, and

Univer

sityis a sa s

alid inputs aalid inputas a programs a program

lesser compusser comated to ef cied to ef c

codes with thcodes withr role is just liole is ju

e a central tha centralch older thach older

on weron w

Pres

sutable coable corithm. Verihm. Veri

ing, debuging, debuin order ton order

et of let o

Preface vii

algorithm study using minimal mathematics and lucid language. This book is suitable for undergradu-ate students of computer science and engineering (CSE) and information technology (IT), as well as for postgraduate students of computer applications. It is also useful for diploma courses, competitive examinations (like GATE), and recruitment interviews for this subject.

The book begins with an introduction to algorithms and problem-solving concepts followed by an introduction to algorithm writing and analysis of iterative and recursive algorithms. In-depth explana-tions and designing techniques of various types of algorithms used for problem-solving such as the brute force technique, divide-and conquer-technique, decrease-and-conquer strategy, greedy approach, transform-and-conquer strategy, dynamic programming, branch-and-bound approach, and backtracking, are provided in the book. Subsequent chapters of the book delve into the discussion of string algorithms, iterative improvement, linear programming, computability theory, NP-hard problems, NP- completeness, probability analysis, randomized algorithms, approximation algorithms, and parallel algorithms, with the appendices throwing light on basic mathematics and proof techniques.

The various design techniques have been elucidated with the help of numerous problems, solved examples, and illustrations (including schematics, tables, and cartoons). The algorithms are presented in plain English (informal algorithm presentation) and pseudocode approach (formal algorithm presenta-tion) to make the book programming language-independent and easy-to-comprehend. The book includes a variety of chapter-end pedagogical features such as point-wise summary, glossary, review questions, exercises, and additional problems to help readers assimilate and implement the concepts learnt.

KEY FEATURES

Provides simple and coherent explanations without using excessive theorems, proofs, and lemmas Detailed coverage for topics such as greedy approach, dynamic programming, transform-and-conquer

technique, decrease-and-conquer technique, linear programming, and randomized and approximation algorithms

Dedicated chapters on backtracking and branch-and-bound techniques, string matching algorithms, and parallel algorithms

Simple and judicious presentation of algorithms throughout the text in both informal and formal forms, followed by the discussion of their complexity analysis

Numerous review questions, exercises, and additional problems given at the end of each chapter to help readers apply and practise the concepts learnt

Includes glossary and point-wise summary at the end of each chapter to help readers quickly recapitulate the important concepts

Provides historical notes on various topics and crossword puzzles at the end of each chapter to elicit learning interest in students

ORGANIZATION OF THE BOOK

The book consists of twenty chapters. A chapter-wise scheme of the book is presented here.Chapter 1 provides an overview of algorithms. It introduces all the basic concepts of algorithms and

the fundamental stages of problem-solving. The chapter ends with the classi cation of algorithms.Chapter 2 starts with the basic tools used for problem-solving. All the guidelines required for present-

ing the pseudocode and ow charts are provided along with many examples. The focus of this chapter is to provide some practice on writing algorithms. The basics of recursion and algorithm correctness are also covered in this chapter.

Chapter 3 covers the basics of algorithm complexity and analysis of iterative algorithms. Step count and operation count methods used for analysing iterative algorithms are discussed in detail. Asymptotic

Oxford

ing ng

entation of ation of discussion oscussion o

estions, ns, exeexeand practisand practisand poiand poi

rtan

Univer

sity

ise se te and impe and im

ithout using eout usiny approach, dyapproach,

que, linear prque, linea

nd brancnd bra

Pres

sf numeronumer. The algorThe algor

proach (foroach (forsy-to-comsy-to-comummaumm

ll

viii Preface

analysis is also discussed in detail. Finally, the chapter ends with the concept of analysing the ef ciency of algorithms.

Chapter 4 discusses the analysis of recursive algorithms. It explains the basics of recurrence equations along with the methods for solving them. Generating functions are also brie y discussed as part of this chapter.

Data structures are an important component of algorithms. Chapter 5 deals with the fundamentals concepts related to data structures such as stacks, queues, linear lists, and linked lists. Trees and graphs are also explained in this chapter. Chapter 6 covers advanced data structures used for organizing large amounts of dynamic data. Binary search trees and AVL trees used for organizing dictionaries are discussed in this chapter, in addition to priority queues and heaps. Finally, a discussion on disjoint sets and amortized analysis is provided.

Chapter 7 deals with the brute force techniques, which use no special logic but instead follow an intuitive way of solving problems using the problem statement. Various problems such as sequential search, bubble sort, and selection sort that can be solved using this approach are explained. Some basic computational geometry problems such as closest-pair and convex hull are also covered. The chapter ends with the discussions on exhaustive searching problems such as 15-puzzle problem, 8-queen prob-lem, magic square problem, knapsack problem, container loading problem, and assignment problem.

Chapter 8 discusses the divide-and-conquer design paradigm. Important problems such as quicksort, merge sort, nding maximum and minimum, multiplication of long integers, Strassen matrix multiplica-tion, tiling problem, closest-pair problem, and convex hull are solved using this technique. The chapter ends with a discussion of the Fourier transform problem.

Chapter 9 explains the decrease-and-conquer technique, which is also known as the incremental or inductive approach. Examples problems such as insertion sort, topological sort, generating permutations and subsets, binary search, fake coin detection, and Russian peasant multiplication problem are used to illustrate the decrease by constant and constant factor methods. Finally, discussions on interpolation search, selection, and nding the median problems using the decrease by variable factor method are provided.

Chapter 10 deals with time space tradeoffs. The selection of one type of ef ciency over the other and problems related to linear sorting and Hashing are discussed. The chapter ends with a discussion on B-trees and their operations.

Chapter 11 explains the greedy approach concept. This chapter discusses important problems such as coin change problem, scheduling, knapsack problem, optimal storage of tapes, Huffman code, minimum spanning tree algorithms, and Dijkstra’s shortest path algorithm to illustrate the greedy approach.

Chapter 12 discusses the transform-and-conquer approach and its three basic techniques—instance simpli cation, representation change, and problem reduction. Problems such as Gaussian elimination, decomposition methods, nding determinant and matrix inverses are discussed. Heap sort, Horner’s method, binary exponentiation algorithm, and reduction problems are also covered in this chapter.

Chapter 13 deals with dynamic programming. Important example problems such as Fibonacci prob-lem, binomial coef cient multistage graph, Graph algorithms, Floyd-Warshall Algorithm, Bellman Ford algorithms, travelling salesman problem, chain matrix multiplication, knapsack problem, and optimal binary search tree problem are discussed to illustrate the dynamic programming concept. Finally, the chapter concludes with the ow-shop scheduling algorithms.

Chapter 14 discusses backtracking algorithms. This chapter covers important problems such as N-queen problem, Hamiltonian circuit problem, sum of subsets, vertex colouring problem, graph colouring prob-lems, Graham scan, and generating permutations.

Chapter 15 explains branch-and-bound techniques. Search techniques using this concept are discussed. Important problems such as assignment problem and 15-puzzle are covered and the chapter ends with a discussion on traveling salesperson and knapsack problems.

Chapter 16 deals with the string algorithms. Some basic string algorithms such as nding the length of strings, nding substrings, concatenation of two strings, longest common sequence, and pattern

Oxford

ng ag a

eedy approady approaduling, knauling, kna

ms, and Dijkand Dijksses the transes the traentationentatio

Univer

sity

f lononare solveare solv

m.hnique, whichique, w

nsertion sort, trtion sorand Russian d Russi

nt factor methofactor mms using the dms using th

radeoffs. Thradeoffs. Td Hashind Has

Pres

sh a

are alsoe also5-puzzle ppuzzle p

problem, anblem, anImportantImportantg integg int

dd

Preface ix

recognition algorithms such as Rabin Karp, Harspool, Knuth Morris Pratt and Boyer Moore algorithms are discussed in this chapter. Finally, approximate string matching algorithm is discussed.

Chapter 17 discusses the iterative approach and basics of linear programming. The linear programming formulation of a problem and simplex method is discussed in this chapter. Minimization problem, prin-ciple of duality, and max- ow problems are also explained. Finally, matching algorithms are considered for better understanding of computational complexity.

Chapter 18 explains the basics of computational complexity and the upper and lower bound theory. Decision problems, complexity classes, and reduction concepts are also discussed. This chapter also covers theory of NP-complete problems and examples for proving NP-completeness.

Chapter 19 covers the basic concepts and types of both randomized and approximation algorithms. Randomized algorithms are illustrated through examples such as hiring problem, primality testing, com-parison of strings, and randomized quicksort. Approximation algorithms are illustrated through examples based on heuristic, greedy, linear, and dynamic programming approaches.

Chapter 20 begins with an introduction to parallel processing and classi cation of parallel sys-tems. It then discusses the fundamentals of parallel algorithms and parallel random access machine (PRAM) model. The concept of parallelism is illustrated through examples related to parallel search-ing, parallel sorting, and graph and matrix multiplication problems.

There are two appendices in this book. Appendix A explains the basics of mathematics such as sets, series and sequences, relations, functions, matrix algebra, and probability that are necessary for algorithm study. Appendix B deals with mathematical logic and proof techniques.

ONLINE RESOURCES

To aid teachers and students, the book is accompanied by online resources that are available at http://oupinheonline.com/book/sridhar-Design-Analysis-Algorithms/9780198093695. The content for the online resources are as follows:

For Instructors PowerPoint slides Solutions manual

For Students Answers to the crossword puzzles

ACKNOWLEDGEMENTS

This book would not have been possible without the help and encouragement of many friends, colleagues, and well-wishers. I thank all my students who motivated me by asking interesting questions related to the subject. I acknowledge the assistance of my friend, P. Kanniappan, in typing the manuscripts and for all his advice. I also express my gratitude to all my colleagues at the departments of Computer Science and Engineering and Information Science and Technology, Anna niversity and National Institute of Technology, Tiruchirapalli for reviewing my manuscripts and providing constructive suggestions for improvement. I acknowledge all my reviewers whose feedback has helped making this book better. I am also very thankful to the editorial team at Oxford niversity Press, India for providing valuable as-sistance. My sincere thanks are also due to my family members, especially my wife, Dr N. Vasanthy, my mother, Mrs Parameswari, my mother-in-law, Mrs Renuga, and my children, Shobika and Shreevarshika, for their constant support and encouragement during the development of this book.

S. Sridhar

Oxford

ssword psword

Univer

sity

ns ththd probabild probab

oof techniquef techniqu

ompanied bympanied nalysis-Algorinalysis-Alg

Pres

sclassi cssi cparallel ranrallel ran

xamples remples relems.lems.e basice ba

ii

Features of the Book iv

Preface vi

Detailed Contents xi

1 Introduction to Algorithms 1

2 Basics of Algorithm Writing 22

3 Basics of Algorithm Analysis 58

4 Mathematical Analysis of Recursive Algorithms 98

5 Data Structures—I 141

6 Data Structures—II 194

7 Brute Force Approaches 231

8 Divide-and-conquer Approach 262

9 Decrease-and-conquer Approach 309

10 Time–Space Tradeoffs 342

11 Greedy Algorithms 372

12 Transform-and-conquer Approach 417

13 Dynamic Programming 455

14 Backtracking 517

15 Branch-and-bound Technique 543

16 String Algorithms 569

17 Iterative Improvement and Linear Programming 601

18 Basics of Computational Complexity 638

19 Randomized and Approximation Algorithms 663

20 Parallel Algorithms 705

Appendix A—Mathematical Basics 735

Appendix B—Proof Techniques 752

Bibliography 763

Index 766

Brief Contents

Oxford

Techniquechniqu

mss

Univer

sity Pr

ess

Features of the Book ivPreface viBrief Contents x

DetailedContents

1 Introduc on to Algorithms 11.1 Introduction 11.2 Need for Algorithmic Thinking 11.3 Overview of Algorithms 3

1.3.1 Computational Problems, Instance, and Size 5

1.4 Need for Algorithm Ef ciency 71.5 Fundamental Stages of Problem

Solving 91.5.1 Understanding the Problem 91.5.2 Planning an Algorithm 91.5.3 Designing an Algorithm 111.5.4 Validating and Verifying an

Algorithm 111.5.5 Analysing an Algorithm 121.5.6 Implementing an Algorithm

and Performing Empirical Analysis 14

1.5.7 Post (or Postmortem) Analysis 14

1.6 Classi cation of Algorithms 151.6.1 Based on Implementation 151.6.2 Based on Design 161.6.3 Based on Area of

Specialization 171.6.4 Based on Tractability 18

2 Basics of Algorithm Wri ng 222.1 Tools for Problem-solving 22

2.1.1 Stepwise e nement or Top-down design 23

2.1.2 Bottom-up Approach 242.1.3 Structured Programming 25

2.2 Algorithm Speci cations 262.2.1 Guidelines for Writing

Algorithms 27

2.3 Non-recursive Algorithms 312.3.1 Flowcharts 32

2.4 Basics of Recursion 422.5 Recursive Algorithms 432.6 Algorithm Correctness 52

3 Basics of Algorithm Analysis 583.1 Basics of Algorithm Complexity 583.2 Introduction to Time complexity 60

3.2.1 Random Access Machine 603.3 Analysis of Iterative Algorithms 62

3.3.1 Measuring Input Size 633.3.2 Measuring Running Time 633.3.3 Best-, Worst-, and

Average-case Complexity 713.4 Rate of Growth 73

3.4.1 Measuring Larger Inputs 733.4.2 Comparison Framework 74

3.5 Asymptotic Analysis 763.5.1 Asymptotic Notations 783.5.2 Asymptotic Rules 853.5.3 Asymptotic Complexity Classes 87

3.6 Space Complexity Analysis 883.7 Empirical Analysis and Algorithm

Visualization 883.7.1 Experimental Purpose 893.7.2 Statistical Tests 90

Mathema cal Analysis of Recursive Algorithms 984.1 Introduction to Recurrence

Equations 984.1.1 Linear Recurrences 994.1.2 Non-linear Recurrences 101

4.2 Formulation of Recurrence Equations 103

Oxfordem)

Algorithmsgorithmsn ImplementImplemen

on Designn DesignAreaArea

Univer

sity

1212

l l 14

3.2.23.33 AnalysAnal

3.3.13.3.13.3

Pres

srre

gorithm Anrithm Acs of Algorof Algor

roduction roductioRanR

xii Detailed Contents

4.3 Techniques for Solving Recurrence Equations 1064.3.1 Guess-and-verify Method 1064.3.2 Substitution Method 1094.3.3 Recurrence-tree Method 1124.3.4 Difference Method 117

4.4 Solving Recurrence Equations sing Polynomial Reduction 118

4.4.1 Solving Homogeneous Equations 118

4.4.2 Solving Non-homogeneous Equations 122

4.5 Generating Functions 1234.5.1 Properties of Generating

Functions 1244.6 Divide-and-conquer Recurrences 127

4.6.1 Master Theorem 1274.6.2 Transformations 1334.6.3 Conditional Asymptotics 135

5 Data Structures—I 1415.1 Data Structures and Algorithms 1415.2 Lists 142

5.2.1 Linear Lists and Arrays 1435.2.2 Linked Lists 146

5.3 Stacks 1525.3.1 Representation of and

Operations on Stacks 1525.4 Queues 154

5.4.1 Queue Representation 1545.5 Trees 157

5.5.1 Tree Terminologies 1575.5.2 Classi cation of Trees 1595.5.3 Binary Tree Representation 1615.5.4 Binary Tree Operations 163

5.6 Graphs 1675.6.1 Terminologies and

Types of Graphs 1685.6.2 Graph Representation 1725.6.3 Graph Traversal 1745.6.4 Elementary Graph Algorithms 1785.6.5 Spanning Tree and Minimum-cost

Spanning Tree 186

6 Data Structures—II 1946.1 Introduction to Dictionary 194

6.1.1 Introduction to Binary Search Tree 194

6.1.2 AVL Trees 2006.2 Priority Queues and Heaps 206

6.2.1 Binary Heaps 2076.2.2 Binomial Heaps 2126.2.3 Fibonacci heap 218

6.3 Disjoint Sets 2216.3.1 Representation and

Operations 2216.4 Amortized Analysis 224

6.4.1 Aggregate Method 2256.4.2 Accounting Method 2266.4.3 Potential Method 226

7 Brute Force Approaches 2317.1 Introduction 231

7.1.1 Advantages and Disadvantages of Brute Force Method 232

7.2 Sequential Search 2327.2.1 Analysis of Recursion

Programs 2347.2.2 Recursive Form of Linear

Search Algorithm 2357.3 Sorting Problem 236

7.3.1 Classi cation of Sorting Algorithms 237

7.3.2 Properties of Sorting Algorithms 237

7.3.3 Bubble Sort 2387.3.4 Selection Sort 242

7.4 Computational Geometry Problems 2457.4.1 Closest-pair Problem 2457.4.2 Convex Hull Problem 247

7.5 Exhaustive Searching 2497.5.1 15-puzzle Problem 2507.5.2 8-queen Problem 2517.5.3 Magic Squares 2527.5.4 Container Loading Problem 2537.5.6 Assignment Problem 256

n of and n of and s on Stacksn Stacks

e RepresenReprese

Univer

sity

11411411421421431414

Brute ForcBrute Fo7.17.1 IntroIntro

7

Pres

sion

d AnalysAnalysAggregate Mgregate M

2 AccountiAccount4.34.3 PotenPote

Detailed Contents xiii

8 Divide-and-conquer Approach 2628.1 Introduction 262

8.1.1 Recurrence Equation for Divide and Conquer 263

8.1.2 Advantages and Disadvantages of Divide-and-conquer Paradigm 264

8.2 Merge Sort 2648.3 Quicksort 269

8.3.1 Partitioning Algorithms 2708.3.2 Variants of Quicksort 275

8.4 Finding Maximum and Minimum Elements 277

8.5 Multiplication of Long Integers 2808.6 Strassen Matrix Multiplication 2848.7 Tiling Problem 2888.8 Closest-pair Problem 291

8.8.1 Using Divide-and-conquer Method 291

8.9 Convex Hull 2938.9.1 Quickhull 2938.9.2 Merge Hull 295

8.10 Fourier Transform 2968.10.1 Polynomial Multiplication 2968.10.2 Application of Fourier

Transform 2988.10.3 Fast Fourier Transform 302

9 Decrease-and-conquer Approach 3099.1 Introduction 3099.2 Decrease by Constant Method 311

9.2.1 Insertion Sort 3119.2.2 Topological Sort 3159.2.3 Generating Permutations 3219.2.4 Generating Subsets 323

9.3 Decrease by Constant Factor Method 3259.3.1 Binary Search 3269.3.2 Fake Coin Detection 3309.3.3 Russian Peasant

Multiplication Problem 3319.4 Decrease by Variable

Factor Method 332

9.4.1 Interpolation Search 3339.4.2 Selection and Ordered

Statistics 3349.4.3 Finding Median 336

10 Time Space Tradeo s 34210.1 Introduction to Time–Space

Tradeoffs 34210.2 Linear Sorting 342

10.2.1 Counting Sort 34210.2.2 Bucket Sort 34710.2.3 Radix Sort 349

10.3 Hashing and Hash Tables 35110.3.1 Properties of Hash

Functions 35310.3.2 Hash Table Operations 35410.3.3 Collision 356

10.4 B-trees 35910.4.1 B-tree Balancing

Operations 36110.4.2 B-tree Operations 363

11 Greedy Algorithms 37211.1 Introduction to Greedy Approach 372

11.1.1 Components of Greedy Algorithms 373

11.2 Suitability of Greedy Approach 37411.3 Coin Change Problem 37511.4 Scheduling Problems 376

11.4.1 Scheduling without Deadline 377

11.4.2 Scheduling with Deadline 37911.4.3 Activity Selection Problem 382

11.5 Knapsack Problem 38411.6 Optimal Storage of Tapes 38811.7 Optimal Tree Problems 390

11.7.1 Optimal Merge 39011.7.2 Huffman Coding 39311.7.3 Tree Vertex Splitting Problem 398

11.8 Optimal Graph Problems 40111.8.1 Minimum Spanning Trees 40111.8.2 Single-source Shortest-path

Problems 407

Oxford

rier er 8

r Transformr Transfor

nquer Appnquer App

onstonst

Univer

sity

329595296296

ionon 29629

10.3010.410.4 B-treeB-tr

10.410 4

1

Pres

sSo

and Hashd HashProperties roperties FunctioFuncti

0.3.20.3.2 HashHa33 C

xiv Detailed Contents

12 Transform-and-conquer Approach 41712.1 Introduction to Transform and

Conquer 41712.2 Introduction to Instance

Simpli cation 41812.3 Matrix Operations 419

12.3.1 Gaussian Elimination Method 420

12.3.2 LU Decomposition 42712.3.3 Crout’s Method of

Decomposition 43412.3.4 Finding Matrix Inverse 43612.3.5 Finding Matrix

Determinant 43912.4 Change of Representation 441

12.4.1 Heap Sort 44112.4.2 Polynomial Evaluation

Using Horner’s Method 44512.4.3 Binary Exponentiation 447

12.5 Problem Reduction 450

13 Dynamic Programming 45513.1 Basics of Dynamic Programming 455

13.1.1 Components of Dynamic Programming 456

13.1.2 Characteristics of Dynamic Programming 458

13.2 Fibonacci Problem 46013.3 Computing Binomial Coef cients 46313.4 Multistage Graph Problem 466

13.4.1 Forward Computation Procedure 467

13.4.2 Backward Computation Procedure 471

13.5 Transitive Closure and Warshall Algorithm 47213.5.1 Finding Transitive Closure Using

Brute Force Approach 47313.5.2 Finding Transitive Closure Using

Warshall Algorithm 47313.5.3 Alternative Method to

Warshall Algorithm for Finding Transitive Closure 476

13.6 Floyd–Warshall All Pairs Shortest-path Algorithm 47813.6.1 Shortest-path

Reconstruction 48113.7 Bellman–Ford Algorithm 48113.8 Travelling Salesperson Problem 48613.9 Chain Matrix Multiplication 488

13.9.1 Dynamic Programming Approach for Solving Chain Matrix Multiplication Problem 490

13.10 Knapsack Problem 49613.11 Optimal Binary Search Trees 500

13.11.1 Brute Force Approach for Constructing Optimal BSTs 500

13.11.2 Dynamic Programming Approach for Constructing Optimal BSTs 502

13.12 Flow-shop Scheduling Problem 50713.12.1 Single-machine

Sequencing Problem 50813.12.2 Two-machine Sequencing

Problem 509

14 Backtracking 51714.1 Introduction 51714.2 Basics of Backtracking 51814.3 N-queen Problem 522

14.3.1 State Space of 4-queen Problem 523

14.4 Sum of Subsets 52514.5 Vertex Colouring Problem 52714.6 Hamiltonian Circuit Problem 531

14.6.1 Promising (or Bounding Function) for Hamiltonian Problem 531

14.7 Generating Permutation 53414.8 Graham Scan 536

15 Branch-and- ound Technique 54315.1 Introduction 54315.2 Search Techniques for Branch-

and-bound Technique 545

Oxford

of Dynamynamg

emmBinomial Coomial Co

Graph ProGraph Proard Coard Co

Univer

sity

455455g 455455

c 456

1

13.1213.1

Pres

sPr

l BinaryBinary1 Brute FBrute F

for Cfor CB

3.11.23.11

Detailed Contents xv

15.2.1 BFS using Branch-and- bound algorithm— FIFOBB 545

15.2.2 LIFO with Branch and Bound 547

15.2.3 Least Cost with Branch and Bound 547

15.3 15-puzzle Game 54815.4 Assignment Problem 55215.5 Traveling Salesperson Problem 55915.6 Knapsack Problem 562

16 String Algorithms 56916.1 Introduction to String

Processing 56916.2 Basic String Algorithms 571

16.2.1 Length of Strings 57116.2.2 Concatenation of

Two Strings 57116.2.3 Finding Substrings 572

16.3 Longest Common Subsequences 57316.4 naïve String Matching

Algorithm 57616.5 Pattern Matching sing Finite

Automata 57816.6 Rabin–Karp Algorithm 58016.7 Knuth–Morris–Pratt Algorithm 58416.8 Harspool Algorithm 58816.9 Boyer–Moore String Matching

Algorithm 59016.10 Approximate String Matching 594

17 Itera ve Improvement and Linear Programming 60117.1 Introduction to Iterative

Improvement 60117.2 Linear Programming 60117.3 Formulation of LPPs 60317.4 Graphical Method for

Solving LPPs 60617.5 Simplex Method 61117.6 Minimization Problems 61517.7 Principle of Duality 619

17.8 Max- ow Problem 623 17.9 Bipartite Matching Problem 62817.10 Stable Marriage Problem 630

18 Basics of Computa onal Complexity 63818.1 Introduction to Computational

Complexity 63818.2 Algorithm Complexity,

pper and Lower Bound Theory 63918.2.1 Upper and Lower Bounds 640

18.3 Decision Problems and Turing Machine 644

18.4 Complexity Classes 64718.4.1 Class P 64818.4.2 NP Class 648

18.5 Theory of NP-complete Problems 650

18.6 Reductions 65118.6.1 Turing Reduction 65118.6.2 Karp Reduction 652

18.7 Satis ability Problem and Cook’s Theorem 654

18.8 Example Problems for Proving NP-completeness 65518.8.1 SAT is NP-complete 65518.8.2 Problem 3-CNF-SAT is

NP-complete 65618.8.3 Clique Decision Problem

is NP-complete 65718.8.4 Sum of Subsets

(from 3-CNF-SAT) 658

19 Randomized and Approxima on Algorithms 66319.1 Dealing with NP-hard

Problems 66319.2 Introduction to Randomized

Algorithms 66419.2.1 Generation of Random

Numbers 66619.2.2 Types of Randomized

Algorithms 669

Oxfordm

tt Algorithmtt Algorithmthmm

e String Mae String M

e Strie Stri

Univer

sity

576576e e

57855

1.5 TheoTh

ProbleProb18.618.6 Red

Pres

sobleble

plexity Claxity Cla4.14.1 Class PClass P

.4.2.4.2 NP NPryry

xvi Detailed Contents

19.3 Examples of Randomized Algorithms 66919.3.1 Hiring Problem 66919.3.2 Primality Testing Algorithm 67219.3.3 Comparing Strings Using

Randomization Algorithm 67419.3.4 Randomized Quicksort 675

19.4 Introduction to Approximation Algorithms 678

19.5 Types of Approximation Algorithms 679

19.6 Examples of Approximation Algorithms 68019.6.1 Heuristic-based

Approximation Algorithms 68119.6.2 Greedy Approximation

Algorithms 68819.6.3 Approximation Algorithm

Design Using Linear Programming 693

19.6.4 Designing Approximation Algorithms Using Dynamic Programming 696

20 Parallel Algorithms 70520.1 Introduction to Parallel Processing 705

20.2 Classi cation of Parallel Systems 70520.2.1 Flynn Classi cation 70620.2.2 Address-space (or Memory

Mechanism-based) Classi cation 708

20.2.3 Classi cation Based on Interconnection Networks 710

20.3 Introduction to PRAM Model 71120.4 Parallel Algorithm Speci cations

and Analysis 71220.4.1 Parallel Algorithm

Analysis 71420.5 Simple Parallel Algorithms 716

20.5.1 Pre x Computation 71620.5.2 List Ranking 71820.5.3 Euler Tour 719

20.6 Parallel Searching and Parallel Sorting 72020.6.1 Parallel Searching 72020.6.2 Odd–Even Swap Sort 72220.6.3 Parallel Merge–Split

Algorithm 72420.7 Additional Parallel Algorithms 726

20.7.1 Parallel Matrix Multiplication 726

20.7.2 Parallel Graph Algorithms 729

Appendix A—Mathematical Basics 735Appendix B—Proof Techniques 752Bibliography 763Index 766 Oxfo

rdel ProcessProcess

ical BasicsBasicsTechniquesechniques

Univer

sity

696696

705ng

6 PaPSortiSor20.620 62

Pres

ssi

arallel Aallel APre xre x ComCom

5.2 List RaList Ra0.5.30.5.3 EuleEu

rallelrall

1.1 INTRODUCTION

Computers are powerful tools of computing. One cannot ignore the impact of computers on our modern life. We use computers for personal needs such as typing documents, browsing the Internet, sending emails, playing computer games, performing numeric calculations, and so on. Industries and govern-ments use computers much more effectively to perform complicated tasks to improve productivity and ef ciency. Applications of computer systems in airline reservation, video surveillance, biometric recognition, e-governance, and e-commerce are all examples of their usefulness in improving ef ciency and productivity. The increasing importance of computers in our lives has prompted schools and universities to introduce computer science as an integral part of our modern education. Informally, everyone is expected to handle computers to accomplish certain basic tasks. This knowledge of using computers to perform our day-to-day activities is often called computational thinking. Computational thinking is a necessity to survive in this modern world. However, computer science professionals are expected to accomplish much more than acquiring this basic skill of computer usage. They are required to write speci c computer programs to provide computer-automated solutions for problems. Writing a program is slightly more complicated than merely learning to use computers, as it requires ‘algorithmic thinking’. Algorithmic thinking is an important analytical skill that is required for writing effective programs in order to solve given problems. Algorithmic thinking is not con ned to computer science only, but it spreads through all disciplines of study. Computer science is a domain where the skills of algorithmic thinking are taught to the aspiring computer professionals for solving computational problems.

1.2 NEED FOR ALGORITHMIC THINKING

Norman E. Gibbs and Allen B. Tucker1 proposed a de nition for computer science that captures the core truth of computer science study. According to

Learning O ec ves

This chapter introduces the basics of algorithms. All important de nitions and concepts related to the study of algorithms are the focus of this chapter. The reader would be familiar with the following concepts by the end of this chapter: Basic terminologies of algorithm study

Need for algorithms Characteristics of algorithms

Stages of a problem-solving process

Need for ef cient algorithms

Classi cation of algorithms

C H A P T E R

1Introduc�onto

Algorithms

1 Gibbs, Norman E., Allen B. Tucker, ‘A Model Curriculum for a Liberal Arts Degree in Computer Science’, Communications of ACM, Vol. 29, Issue 3, 1986.

“Ideas are the beginning point of all fortunes.”—Napolean Hill

Oxford

dernernmplish ceplish ce

ur day-to-daday-to-dathinking ithinking isciencescienthisth

Univer

sity

nternernalculations, alculation

h more effectimore effecand ef ciencyd ef cie

deo surveillanco surveilll examples ofl examples

he increasing iincreasind universities td universitieeducation.educat

ain

Pres

sing. One cag. One ca

e computere computeet, senet, s

2 Design and Analysis of Algorithms

them, ‘Computer science is the systematic study of algorithms and data structures, speci cally their formal properties, their mechanical and linguistic realizations, and their applications.’ Hence, there cannot be any dispute regarding the fact that study of algorithms is the central theme of computer science.

The word algorithm is derived from the name of a Persian mathematician, Abu Ja’fer Mohammed Ibn Musa al Khowarizmi, who lived sometime around 780–850 AD. He was from the town of Khowarazm, now in Uzbekistan. He was a teacher of mathematics in Baghdad. He wrote a book titled Kitab al Jabr w’al Muqabala (Rules of Restoration and Reduction) and Algoritmi Numero Indorum where he introduced the old Arabic–Indian number systems to

Europe. He also introduced the simple step-by-step rules for addition, subtraction, multiplication, and division in his book. The word algebra has also been derived from the title of this book. When his book was translated to Latin, his name was quoted as Algorismus from which the word ‘algorithm’ emerged. Algorithm as a word thus became famous for referring to procedures that are used by computers for solving problems.

Box 1.1 Origin of the word ‘algorithm’

Let us elaborate this de nition further. The formal and mathematical properties of algorithms include the study of algorithm correctness, algorithm design, and algorithm analysis for understanding the behaviour of algorithms. Hardware realizations include the study of computer hardware, which is necessary to run the algorithms in the form of programs. Linguistic realizations include the study of programming languages and their design, translators such as interpreters and compilers, and system software tools such as linkers and loaders, so that the algorithms can be executed by hardware in the form of programs. Applications of algorithms include the study of design and development of ef cient software packages and software tools so that these algorithms can be used to solve speci c problems.

Thus, computer scientists consider the study of algorithms as the core theme of computer science. The art of designing, implementing, and analysing algorithms is called algorithmics. Algorithmics is a general word that comprises all aspects of the study of algorithms. Let us now attempt to de ne algorithms formally. An algorithm is a set of unambiguous instructions or procedures used for solving a given problem to provide correct and expected outputs for all valid and legal input data (refer to Box 1.1).

An algorithm can also be referred by other terminologies such as recipe, prescription, process, or computational procedure. Box 1.2 provides a brief history of algorithms.

Algorithms thus serve as prescriptions of how a computer should carry out instructions to solve problems. Hence, algorithms can be visualized as strategies for solving a problem. One cannot write a program for a given problem without the necessary analytical skills or a strategy for solving the problem. Thus, the knowledge of how to solve a problem is called algorithmic thinking.

One can compare a program construction with the construction of a house. The blueprint of the house incorporating all the planning necessary for the construction of the house can be visualized as an algorithm. A computer program is only an algorithm expressed using a programming language such as C or C++. In addition to possessing the essential skills of using a programming language, the ability to conceive a strategy or apply analytical skills are important for solving a problem or constructing a program.

Oxford

gramramthe probleproble

kingg..mpare a propare a pr

se incorporae incorporized as an zed as an

ng lang la

Univer

sityl aspep

algorithmalgorithm iblem to proviem to prov

ox 1.1).1.1).d by other teby othe

dure.re. Box 1.2 Box 1rescriptions orescription

algorithms caalgorithms for a givfor a

m T

Pres

ses

se algoralgor

rithms as thhms as thysing algoysing alg

cts ofcts

Introduction to Algorithms 3

1.3 OVERVIEW OF ALGORITHMS

Algorithms are generic and not limited to computers alone (refer to Box 1.2). We perform many algorithms in our daily life unknowingly. Consider a few of our daily activities, some of which are listed as follows:

1. Searching for a speci c book2. Arranging books based on titles3. Using a recipe to cook a new dish4. Packing items in a suitcase5. Scheduling daily activities6. Finding the shortest path to a friend’s house7. Searching for a document on the Internet8. Preparing a CD of compressed personal data9. Sending messages via email or SMS

We perform many tasks without being aware of their inherent algorithms. For example, consider the activity of searching a word in a dictionary. How do we search? It can be noted that indexing the words in a dictionary reduces the effort of searching signi cantly. We just open the book, compare the word with the index given, and accordingly decide on the por-tions of the books to be searched for nding the meaning of that particular word. This kind

The history of algorithms dates back to ancient civilizations that used numbers. Ancient civilizations such as Egyptian, Indian, Greek, and Chinese were known to have used algorithms or procedures to carry out tasks like simple calculations. This is in contrast to the history of modern computers that were designed in the 1940s only. Thus, the history of algorithms is much older and more exciting. Euclid, who lived in ancient Greece around 400–300 BC, is credited with writing the rst algorithm in history for com-puting the greatest common divisor (GCD). Archimedes created an algorithm for the approximation of the number pi. Eratosthenes introduced to the world the algorithm for nding prime numbers. Averroes (112 –1198) also used

algorithmic methods for calculations.A number of in uences led to the formalization of the

theory of algorithms. Some of the noteworthy developments include the introduction of Boolean algebra by George Boole (1847) and set theory by Gottlob Frege (1879). Set theory has become a foundation of modern mathemat-ics. The concept of recursion formulated by Kurt Gödel, and contributions of Giuseppe Peano (1888), Alfred North Whitehead, and Bertrand Russell in mathematical logic led to the formalization of algorithm as an independent eld of study.

Alan Turing’s Turing machines developed in 193 play a very important role in computational complexity theory. This Turing machine and Alonzo Church’s lambda calcu-lus formed the basis for formalization of the complexity theory. Thus, the history of algorithms is much richer than that of computers.

The history of computing is another interesting story. Abacus and other mechanical devices used for computing have been utilized at various stages of human history. A real attempt was made in the 17th century by Leibniz, a German mathematician and philosopher. He invented in nitesimal calculus, generating the idea of computers. In 1822, Charles Babbage introduced the difference engine, which required changing of gears manually to perform calculations. These ideas nally led to the invention of computers in the 1940s. In 1942, the US government designed a machine called ENIAC (electronic numerical integrator and computer). This was succeeded by EDVAC (electronic discrete variable automatic computer) in 1951. In 1952, IBM introduced its mainframe computer. Developments in the hardware domain and declining costs have shifted computer usage from big research and military environments to homes and educational institutions. In 1981, IBM personal computers were introduced for home and of ce use.

Box 1.2 Short history of compu ng and algorithms

Oxford

MSS

neric and nneric and nms in our dain our d

listed as foted as

ing for a ng for a bob

Undent

ctroceeded beded

matic computeatic compumainframe comainframe com

domain and demain andfrom big resfrom big educatioeducwerew

of codifference ference

nually to perfally to perfthe inventioninvention

S governmenS governmenic numeric numer

y Ey E


of a plan or an idea is called a strategy, which is more formally known as an algorithm.

The environment of a typical algorithm is shown in Fig. 1.1. Here, agent, which can be a human or a computer, is the performer.

To illustrate further, let us consider the example of tea preparation. The hardware used here includes cooking

utensils a heater, and a person. To make tea, one needs to have the following ingredients—water, tea powder, and sugar. These ingredients are analogous to the inputs of an algorithm. Here, preparing a cup of tea is called a process. The output of this process is, in this case, tea. This corresponds to the output of an algorithm. The procedure for preparing tea is as follows:

Fig. 1.1 Typical algorithm

Valid input Output

Agent

Algorithm

1. Put tea powder in a cup.2. Boil the water and pour it into the cup.3. Filter it.

4. Pour milk.5. Add sugar if necessary.6. Pour the tea into a cup.

This kind of a procedure can be called an algorithm. It can be noted that an algorithm consists of step-by-step instructions that are required to accomplish a certain task.

Humans often perform such procedures intuitively or even mechanically without spending much conscious thought, and hence, they label such actions as habitual activity. Many of our day-to-day activities are not very ef cient. However, algorithms that are meant for computers represent a different case altogether. Computer procedures should be ef cient as computer resources are scarce. Hence, much thought is given for writing computer procedures that can solve problems. Problems can be classi ed into two types: computational problems and non-computational problems.

Computational problems can be solved by a computer system. A computational problem is characterized by two factors: (a) the formalization of all legal inputs and expected outputs of a given problem and (b) the characterization of the relationship between problem output and input. Thus, an algorithm is expected to give the expected output for all legal inputs. If an algorithm yields the correct output for a legal input, then it is called an algorithmic solution.

Non-computational problems cannot be solved by a computer system. This classi cation shows the fundamental differences between computers and humans in solving problems. Computers are more effective than humans in performing calculations and can crunch num-bers in a fraction of seconds with more precision and ef ciency. However, humans outscore computers in recognition. The ability of recognizing an object by humans is much better than that by machines. Recognition of an object by computer systems requires lots of programming involving images and concepts of image processing. In addition, some tasks are plainly not possible to be carried out by computer systems. Can a computer offer its opinion or show emotion like humans? To put it simply, problems involving more intellectual complexity are much more dif cult to solve by computers as computer systems lack intelligence.

Thus, developing algorithms to make computers more intelligent and make them perform tasks like humans becomes much more crucial and challenging. Developing algorithms for computers is both an art and a science. Algorithm design is an art that involves a lot of creative ideas, novelty, and even adventurous strategies and knowledge. Algorithm design is also a science because its construction usually involves the application of some set of principles. The

Oxford

o factoractorm and (b) thed (b) the

s, an algorit an algorihm yields tyields t

omputatimputatiun

Univer

sity

edively or evely or

el such actionl such acti. However, alHowever

Computer promputer ph thought is gought

an be classi ebe clas

ss can be solv can be s: (a) t: (a

Pres

secessary.essarya into a cupnto a cup

hmhm. It can . It can to accomto accom


review and knowledge of these principles can facilitate the development of better algorithms, based on sound mathematical and scienti c principles.

1.3.1 Computa onal Pro lems Instance and SizeOne encounters many types of computational problems in the domain of computer science. Some of the problems are as follows:

Structuring problems In structuring problems, the input is restructured based on certain conditions or properties. For example, sorting a list in an ascending or a descending order is a structuring problem.

Search problems A search problem involves searching for a target in a list of all possibili-ties. All potential solutions may be generated and represented in the form of a list or graph. Based on a property or condition, the best solution for the given problem is searched. Puzzles are good examples of search problems where the target or goal is searched in a huge list of possible solutions.

Construction problems These problems involve the construction of solutions based on the constraints associated with the problem.

Decision problems Decision problems are yes/no type of problems where the algorithm output is restricted to answering yes or no. Let us assume that a road network map of a city is given. The problem, say, ‘Is there any road connectivity between two cities, say Hyderabad and Chennai?’, can be called as a decision problem as the output of this algorithm is restricted to either yes or no.

Optimization problems Optimization problems constitute a very important set of problems that are often encountered in computer science domain. The decision problem about road con-nectivity between Chennai and Hyderabad can also be posed as an optimization problem as follows: What is the shortest distance between Chennai and Hyderabad? This is an optimiza-tion problem, as the problem involves nding the shortest path. Thus, optimization problems involve a certain objective function that is typically of the following form: maximize (say pro t) or minimize (say effort) based on a set of constraints that are associated with the problem.

Once the problem is recognized, the input and output of an algorithm should be identi ed. Consider, for example, a problem of nding the factorial of a number. The factorial of a positive number N can be written as follows: N! = N × (N 1) × (N 2) × . . . × 1. A valid input can be called an instance of a problem. For example, factorial of a negative number is not possible. Therefore, all valid positive integers {0, 1, 2, . . .} can serve as inputs and every legal input is called an instance. All possible inputs of a problem are often called a domain of the input data. The input should be encoded in a suitable form so that computers can process it. The number of binary bits used to represent the given input, say N, is called the input size. Input size is important, as a larger input size consumes more computer time and space.

The core question still remains to be answered: how to solve a given problem? To illustrate problem solving, let us consider a simple problem of counting the number of students in a tuition centre who have passed or failed a test, assuming that the pass mark is 50. Details of the students such as their registration number, name, and more importantly, marks obtained, which are necessary for the given problem, are shown in Table 1.1.

red in coin coChennai andnnai and

the shortestthe shortesas the problhe prob

ertain objectrtain objecze (say ee (say e

ro

Univer

sity

e th

re yes/no typeyes/no typno. Let us assuLet us a

ny road conneroad conecision problecision pro

Optimizationptimizatimputempu

Pres

sth

n problemroblemgoal is seaoal is sea

he consthe const


The class strength of the tuition centre is 6. The pass mark of the course is given as 50. How do we manually solve this problem? First, we will read the student marks. Thus, the inputs for this problem are a set of student marks. The goal of this problem is to print the pass and fail counts of the students, which is also the output of this algorithm. The process of reading a student’s mark is done manually. Compare the student mark with the pass mark, that is, 50. If the student mark is greater than or equal to 50, then pass count should be added by one. Otherwise, the fail count should be added by one. This process is repeated for all the students.

The procedure that is done manually can be given as an algorithm. Therefore, informally, the algorithm for this problem can be given as follows:

Step 1: Let counter = 1, number of students = 6Step 2: While (counter ≤ number of students) 2.1: Read the marks of the students 2.2: Compare the marks of the student with 50 2.3: If student mark is greater than or equal to 50 Then increment the pass count Else increment the fail count 2.4: Increment the counterStep 3: Print pass count and fail countStep 4: Exit

Thus, algorithmic solving can be observed to be much similar to how we solve problems manually.

Some of the important characteristics are listed as follows:

Input An algorithm can have zero or more inputs.

Output An algorithm should produce at least one or more outputs.

e niteness An algorithm is characterized by de niteness. Its instructions should be clear and unambiguous without any confusion. All operations should be well de ned. For example, opera-tions involving the division of zero or taking a square root of a negative number are unacceptable.

ni ueness An algorithm should be a well-de ned and ordered procedure that consists of a set of instructions in a speci c order. The order of the instructions is important as a change in the order of execution leads to a wrong result or uncertainty.

Registration number Student name Course marks

1 Abraham 80

2 Beena 30

3 Chander 83

4 David 23

5 Elizabeth 90

Fauzia 78

Table 1.1 Students’ course marks

Registration number Student name Course marks

mamacrementment

increment trement tcrement thecrement th

nt pass counss counxitxit

gorithgorith

Univer

sity

hopeated foated

be given as ane given as en as followsas follow

tudents = 6dents =of students)of student

f the studentshe studemarks of the sarks of thrk is greaterrk is grea

the pth

Pres

smarks

s to print to print The procehe proc

the pass mathe pass muld be aduld be ad

r ar a


orrectness The algorithm should be correct.

ecti eness An algorithm should be effective, which implies that it should be traceable manually.

Finiteness An algorithm should have a nite number of steps and should terminate after executing the nite set of instructions. Therefore, niteness is an important characteristic of an algorithm.

Some of the additional characteristics that an algorithm is supposed to possess are the following:

i p icit Ease of implementation is another important characteristic of an algorithm.

enera it An algorithm should be generic, independent of any programming language or operating systems, and able to handle all ranges of inputs; it should not be written for a speci c instance.

An algorithm that is de nite and effective is also called a computational procedure. In addition, an algorithm should be correct and ef cient, that is, should work correctly for all valid inputs and yield correct outputs. An algorithm that executes fast but gives a wrong result is useless. Thus, ef ciency of an algorithm is secondary compared to program correctness. However, if many correct solutions are available for a given problem, then one has to select the best solution (or the optimal solution) based on factors such as speed, memory usage, and ease of implementation.

Algorithms can be contrasted with programs. Algorithms, like blueprints that are used for constructing a house, help solve a given problem logically. A program is an expres-sion of that idea, consisting of a set of instructions written in any programming language. The development of algorithms can also be contrasted with software development. At the industrial level, software development is usually undertaken by a team of programmers who develop software, often for third-party customers, on a commercial basis. Software engineering is a domain that deals with large-scale software development. Project manage-ment issues such as team management, extensive planning, cost estimation, and project scheduling are the main focus of software engineering. On the other hand, algorithm design and analysis as a eld of study take a micro view of program development. Its focus is to develop ef cient algorithms for basic tasks that can serve as a building block for various applications. Often project management issues of software development are not relevant to algorithms.

1.4 NEED FOR ALGORITHM EFFICIENCY

Computer resources are limited. Hence, many problems that require a large amount of resources cannot be solved. One good example is the travelling salesperson problem (TSP). Its brief history is given in Box 1.3.

A TSP is illustrated in Fig. 1.2. It can be modelled as a graph. A graph consists of a set of nodes and edges that interconnect the nodes. In this case, cities are the nodes and the paths that connect the cities are the edges. Edges are undirected in this case. Alternatively, it is

Oxford

e, he, hensisting ofting of

of algorithof algorithl, software softwar

p softwarep softwareng is a domg is a do

suchsuch

Univer

sity

entm that exthat

s secondary csecondaryailable for a gable for

on) based onn) based

ted with proted with plp solve lp sol

se

Pres

sany py

; it shouldt shoul

called a called a comco that is, that is,

xecxec


A travelling salesperson problem (TSP) is one of the most important optimization problems studied in history. It is very popular among computer scientists. It was studied and developed in the 19th century by Sir William Rowan Hamilton and Thomas Pennington Kirkman. In 1857, Hamilton created an Icosian game, a pegboard with 20

holes called vertices. The objective is to nd a tour starting from a city, visiting all other cities only once, and nally returning to the city where the tour has started. This is called a Hamiltonian cycle. The TSP is to nd a Hamiltonian cycle in a graph. The problem was then studied extensively by Karl Menger in Vienna and by Harvard later in 1920.

Box 1.3 History of travelling salesperson pro lem

possible to move from a particular city to any other city. The complete details of graphs are provided in Chapter 5. A TSP involves a travelling person who starts from a city, visits all other cities only once, and returns to the city from where he started.

A brute force technique can be used for solving this problem. Let us enu-merate all the possible routes. For a TSP involving only one city, there is no path. For two cities, there is only

one path (A–B). For three cities, there are two paths. In Fig. 1.2, these two paths are A–B–C and A–C–B, assuming that A is the origin from where the travelling salesperson started. For four cities, the paths are {A–D–B–C–A, A–D–C–B–A, A–B–C–D–A, A–B–D–C–A, A–C–D–B–A, and A–C–D–B–A}.

Thus, every addition of a city can be noted to increase the path exponentially. Table 1.2 shows the number of possible routes.

Therefore, it can be observed that, for N cities, the number of routes would be (N 1)! for N 2. The availability of an algorithm for a problem does not mean that the problem is solvable by a computer. There are many problems that cannot be solved by a computer and for many problems algorithms require a huge amount of resources that cannot be provided practically. For example, a TSP cannot be solved in reality. Why? Let us assume that there

are 100 cities. As N = 100, the possible routes are then (100 1)! = 99!.

The value of the number 50! is 30414093201713378043612608166064768844377641568960512000000000000. Therefore, 99! is a very large number, and even if a computer takes one second for exploring a route, the algorithm will run for years, which is plainly not acceptable. Just imagine how much time this algorithm will require, if the TSP is tried out for all cities of India or USA. This shows that the development of ef cient algorithms is very important as computer resources are often limited.

Number of cities Number of routes

1 0 (as there is no route)

2 1

3 2

4

5 24

120

Table 1.2 Complexity of TSP

Number of cities Number of routes

Fig. 1.2 Travelling salesperson problem (a) One city—no path (b) Two cities (c) Three cities (d) Four cities

A A B

(d)

(a) (b)

(c)

A B

C

A B

DC

Oxford

hs as aA–C–D–B–C–D–B

ddition of a ion of a mber of posser of poss

e, it can beit can b. The availThe availy a comy a co

Univer

sityf

s, there are twthere areA is the origA is the o

re {A–D–re {A–A}

tyy—no path —no path

itiess

tytityD

itytytityPr

esshs are pare p

P involves anvolves starts from rts from only onconly oncwherwh


1.5 FUNDAMENTAL STAGES OF PROBLEM SOLVING

Problem solving is both an art and a science. The problem-solving process starts with the understanding of a given problem and ends with the programming code of the given problem. The stages of problem solving are shown in Fig. 1.3.

The following are the stages of problem solving:

1.5.1 Understanding the Pro lemThe study of an algorithm starts with the computability theory. Theoretically, the primary question of computability theory is the following: Is the given problem solvable? Therefore, a guideline for problem solving is necessary to understand the problem fully. Hence, a proper problem statement is required. Any confusion regarding or misunderstanding of a problem statement will ultimately lead to a wrong result. Often, solving the numerical instances of a given problem can give an insight to the problem. Similarly, algorithmic solutions of a related problem will provide more knowledge for solving a given algorithm.

Generally, computer systems cannot solve a problem if it is not properly de ned. Puzzles often fall under this category, which needs supreme level of intelligence. These problems illustrate the limitations of computing power. Humans are better than computers in solving these problems.

1.5.2 Planning an AlgorithmThe second important stage of problem solving is planning. Some of the important decisions are detailed in the following subsections.

A computation model or computational model is an abstraction of a real-world computer. A model of computation is a theoretical math-ematical model and does not exist in physical sense. It is thus a virtual machine, and all programs can be executed on this virtual machine theoretically. What is the need for a computing model? It is meaning-less to talk about the speed of an algorithm as it varies from machine to machine. An algorithm may run faster in machine A compared to in machine B. This kind of an analysis is not correct as algorithm analysis should be independent of machines. A computing model thus facilitates such machine-independent analysis.

First, all the valid operations of the model of computation should be speci ed. These valid operations help specify the input, process, and

1. Understanding the problem2. Planning an algorithm3. Designing an algorithm4. Validating and verifying an algorithm

5. Analysing an algorithm6. Implementing an algorithm7. Performing empirical analysis (if necessary)

Fig. 1.3 Stages of problem solving

Planning

Algorithm design

Algorithm correctness

Algorithm analysis

Implementation and

empirical analysis

Post analysis

Is

OK

Exit

Yes

No

Problem

Oxford

f intintpower. Hower. H

1.5.21.5.T

OxfOxfn and

alysis

Univer

sity

t is rrproblem staproblem

olving the numving the nut to the probleo the pro

em will provm will p

ally, computery, compued. Puzzles ofted. Puzzles

lligence. Tlligencuma

Pres

sh the the of compuf compu

olvable? Tvable? T understanunderstanequireequi


output. In addition, the computing models provide a notion of the necessary steps to compute time and space complexity of a given algorithm.

For algorithm study, a computation model such as a random access machine (RAM) or a Turing machine is used to perform complexity analysis of algorithms. RAM is discussed in Chapter 3 and Turing machines are discussed in Chapter 17. The contribution of Alan Turing is vital for the study of computability and computing models (refer to Box 1.4).

OData structure concerns the way data and its relationships are stored. Algorithms require data for solving a given problem. The nature of data and their organization can have impacts on the ef ciency of algorithms. Therefore, algorithms and data structures together often constitute an important aspect of problem solving.

Data organization is something we are familiar with in our daily life. Figure 1.4 shows an example of data organization called a queue. A gas station with one servicing point should have a queue, as shown in the gure, to avoid chaos. A queue (or rst come rst serve—FCFS) is an organization where the processing ( lling of gas) is done in one end and a vehicle is added at the other end. All vehicles have same priority in this case. Often a problem dictates the choice of structures. However, this structure may not be valid in cases of, for example, handling medical emergencies, where highest priority should be given to urgent cases.

Thus, data organization is a very important issue that in uences the effectiveness of algorithms. At some point of problem solving, careful consideration should be given to storing the data effectively. A popular statement in computer science is ‘algorithm + data structure = program’. A wrong selection of a data structure often proves fatal in the problem-solving process.

Alan Turing (1912–1954) is well known for his contribution towards computing. He is considered by many as the ‘father of modern computing’. His contribution towards computability theory and arti cial intelligence is monumental. He designed

a theoretical machine, called the Turing machine that is used widely in computability theory and complexity analysis. Alan Turing is also credited with the designing of the ‘Turing Test’ as a measure of testing the intelligence of a machine.

Box 1.4 Alan Turing

Fig. 1.4 Example of a queue

Queue of trucks

Fuel station

Oxford

e choichoiple, handl, handl

cases.cases.organizationanization

point of proboint of proy. A popu A popu

ectiti

Univer

sitynd their thei

ms and data sms and data

are familiar wre familicalled a ed a queuq

in the guren the gation where thtion wher

he other endhe other ee of ste of

Pres

she contconls (refer torefer to

hips are ships are orgorg


1.5.3 Designing an AlgorithmAlgorithm design is the next stage of problem solving. The following are the two primary issues of this stage:

1. How to design an algorithm? 2. How to express an algorithm?

Algorithm design is a way of developing algorithmic solutions using an appropriate design strategy. Design strategies differ from problem to problem. Just imagine searching a name in a telephone directory. If anyone starts searching from page 1, it is termed as a brute force strategy. Since a telephone directory is indexed, it makes sense to open the book and use the index at the top to locate the name. This is a better strategy. One may come across many design paradigms in the algorithm study. Some of the important design paradigms are divide and conquer, and dynamic programming. Divide and conquer, for example, divides the problem into sub-problems and combines the results of the sub-problems to get the nal solution of the given problem. Dynamic programming visualizes the problem as a sequence of decisions. Then it combines the optimal solutions of the sub-problems to get an optimal global solution. Many design vari-ants such as greedy approach, backtracking, and branch and bound techniques have been dealt with in this book. The role of algorithm design is important in developing ef cient algorithms. Thus, one important skill required for problem solving is the selection and application of suit-able design paradigms. A skilled algorithm designer is called an ‘algorist’.

After the algorithm is designed, it should be communicated to a programmer so that the algorithm can be coded as a program. This stage is called algorithm speci cation. Only three possibilities exist for communicating the idea of algorithms. One is to use natural languages such as English to communicate the algorithm. This is preferable. However, the natural lan-guage has some disadvantages such as ambiguity and lack of precision. Hence, an algorithm is often written and communicated through pseudocode. Pseudocode is a mix of the natural language and mathematics. Another way is to use a programming language. The problem with programming language notation is that readers often get bogged down by the programming code details. Therefore, the pseudocode approach is preferable. Even though there are no speci c rules for algorithm writing, certain guidelines can be followed to make algorithms more understandable. Some of these guidelines are presented in Chapter 2.

1.5.4 Valida ng and Verifying an AlgorithmAlgorithm validation and veri cation are the methods of checking algorithm correctness. An algorithm is expected to give correct outputs for all valid inputs. Sometimes, algorithms may not give correct outputs due to logical errors. Hence, validation of algorithms becomes a necessity. Algorithm validation is a process of checking whether the given algorithm gives correct outputs for valid inputs or not. This is done by comparing the output of the algorithm with the expected output.

Once validation is over, algorithm veri cation or algorithm proving begins. Algorithm veri-cation is a process of providing a mathematical proof that the given algorithm works correctly

for all instances of data. One way of doing it is by creating a set of assertions that are expressed using mathematical logic. Assertions are statements that indicate the conditions of algorithm

Oxford

municunicdvantages antages

nd communnd commumathematicshematic

ng languagng languagils. Theres. There

s fof

Univer

sity

chmportanporta

solving is thsolving is designer is casigner is ca

it should be shouldogram. This sogram. Th

unicating theunicatingte the ate th

Pres

se prp

solution olution ce of decisof decis

mal global smal global sand bouand bou

t int i


variables at various points of the algorithm. Preconditions indicate the conditions or variables before the execution of an algorithm, and postconditions indicate the status of the variables at the end of the execution of an algorithm. Remember, still the algorithm has not been translated or converted into a code of any programming language. Hence, all these executions are rst carried out theoretically on a paper and proof for the algorithm is determined. A proof of an algorithm is said to exist if the preconditions can be shown to imply the postconditions logically. A complete proof is to write each statement clearly for proving that the algorithm is right. In addition to assertions, mathematical proof techniques such as mathematical induction can be used for proving that algorithms do work correctly. Mathematical induction is one such useful proof technique that is often used to prove that the algorithm works for all possible instances. Mathematical proofs are rigorous and always better than algorithm validation. Program cor-rectness itself is a major study and extensive research is done in this eld.

1.5.5 Analysing an AlgorithmIn algorithm study, complexity analysis is important as we are mainly interested in nding optimal algorithms that use fewer computer resources. Complexity theory is a eld of algorithms that deals with the analysis of a solution in terms of computational resources and their optimization. Humans are more complex than, say, amoeba. So what does the word ‘complexity’ refer to here? Complexity is the degree of dif culty associated with a problem and the algorithm. The complex-ity of an algorithm can be observed to be related to its input size. For example, an algorithm for sorting an array of 10 numbers is easy, but the same algorithm becomes dif cult for 1 million inputs. This shows the connection between complexity and the size of the input.

Thus, complexity analysis is useful for the following two purposes:

1. To decide the ef ciency of the given algorithms2. To compare algorithms for deciding the effective solutions for a given problem

Consider the following scenario: for a problem , two algorithms A and B exist. Find the optimal algorithm. To solve this, there must be some measures based on which comparisons can be made. In general, two types of measures are available for comparing algorithms—subjective and objective. Subjective measures are factors such as ease of implementation, algorithm style, and readability of the algorithm. However, the problem with these subjective measures is that these factors cannot be quanti ed. In addition, a measure such as the ease of implementation, style of the algorithm, or understandability of algorithms is a subjective measure that varies from person to person. Therefore, in algorithm study, comparisons are limited to some objective measures. Objective measures, unlike subjective measures, can be quanti ed. The advantages of objective measures are that they are measurable and quanti able, and can be used for predic-tions. Often time and space are used as objective measures for analysing algorithms.

Time complexity means the time taken by an algorithm to execute for different increasing inputs (i.e., differently-scaled inputs). In algorithm study, two time factors are considered—execution time and run time. Execution time (or compile time) does not depend on problem instances. Additionally, a program may be compiled many times. Therefore, time in an algo-rithm context always refers to the run time as only this is characterized by instances. Another complexity is space complexity, which is the measurement of memory space requirements of an algorithm. Technically, an algorithm is ef cient if lesser resources are used. Chapter 3 focuses on the analysis of algorithms.

Oxfordy

hms fos fo

ollowing sceowing scehm. To solvem. To solve

general, twoneral, twctive. ive. SubjeSubje

ility oility o

Univer

sitymputatuta

o what does o what doted with a probd with a pro

related to its iated to iy, but the sambut the s

between competween cos useful for ths useful fo

of the givenof the givdecididec

Pres

sh

are mainly ie mainly iexity theorexity the

ionalion


In algorithm study, time complexity is not measured in absolute terms. For example, one cannot say the algorithm takes 3.67 seconds. It is wrong as time complexity is always de-noted as a complexity function t(n) or T(N), where t(n) is a function of input size ‘n’ and is not an absolute value. The variables n and N are used interchangeably and always reserved to represent the input size of the algorithm. Recollect that input size is the number of binary bits used to represent the input instance. The logic here is that for larger inputs the algorithm would take more time. For example, sorting a list of 10 elements is easy, but sorting a list of 1 billion elements is dif cult. Generally, algorithms whose time complexity function is a polynomial, for example, say N or log N, can be solved easily, and but problems whose algorithms have exponential functions, say 2N, are dif cult to solve.

Example 1.1 Assume that there are two algorithms A and B for a given problem P. The time complexity functions of algorithms A and B are, respectively, 3n and 2n. Which algorithm should be selected assuming that all other conditions remain the same for both algorithms?

o ution Assuming that all conditions remain the same for both algorithms, the best algorithm takes less time when the input is changed to a larger value. Results obtained employing different values of n are shown in Table 1.3.

Example 1.1

Input size (n) Algorithm A T(n) = 3n Algorithm B T(n) = 2n

1 3 2

5 15 32

10 30 1024

100 300 2100

Table 1.3 Time complexities of algorithms A and B

Input size (n) Algorithm 3n Algorithm B T(TT n) = 2n

It can be observed that algorithm A performs better as n is increased; time complexity increases linearly and gives lesser values for larger values of n. The second algorithm instead grows exponentially, and 2100 is a very large number.

Therefore, algorithm A is better than algorithm B.

Example 1.2 Let us assume that, for a problem P, two algorithms are possible—algorithm A and algorithm B. Let us also assume that the time complexities of algorithms A and B are, respectively, 3n and 10n2 instructions. In other words, the instructions or steps of algorithms A and B are 3n and 10n2 respectively. Here, n is the input size of the problem. Let the input size n be 105 instructions. If the computer system executes 109 instructions per second, how much time do algorithms A and B take?

o ution Here, n = 105, and the computer can execute 109 instructions per second.

Therefore, algorithm A (time complexity 3n) would take 3 × 105

109 = 3104 = 0.0003 seconds.

Algorithm B (time complexity 10n2) would take 10 × (105)2

109 = 10 × 1010

109 = 100 seconds.

Example 1.2

Oxfod

fordd

for

ford

e observed observed linearly annearly a

entienti

rsity

gedble 1.3. 1.3

Unive

Unier

sve

UUU15

30

plexities of algexities of

hm A m A TT((TTTT nn) = 3) = n

Pres

sfor

ly, 3, 3nn an ann the same he same

same for same for d to a lad to a la


It can be observed that algorithm A would take very less time compared to algorithm B. Hence, it is natural that algorithm A is the preferred one.

One may wonder whether this has got anything to do ef ciency. Imagine that we are now executing algorithm A on a slower machine—a machine that executes only 105instructions. What would be the scenario in this case?

Algorithm A would take 3 × 105

105 = 3 seconds.

We can see that algorithm A is still better compared to algorithm B that takes 10 × (105)2

105 = 106 seconds. Therefore, the important point to be noted here is that speed of the machine does not affect selection of algorithm A as a better algorithm. Hence, while computer speed is crucial, the role of a better designed algorithm is still signi cant.

1.5.6 Implemen ng an Algorithm and Performing Empirical AnalysisAfter the algorithm is designed, it is expressed as a program and a suitable programming language is chosen to implement the algorithm as a code. After the program is written, it must be tested. Even if the algorithm is correct, sometimes the program may not give expected results. This may be due to syntax errors in coding the algorithm in a programming language or hardware fault. The error due to syntax problems of a language is called a logical error. The process of removing logical errors is called debugging. If the program leads to an error even in the absence of syntax errors, then one has to go back to the design stage to correct the algorithmic errors.

Now, complexity analysis can be performed for the developed program. The analysis that is carried out after the program is developed is called empirical analysis (or a priori analysis or theoretical analysis). This analysis is popular for larger programs; a new area, called experimental algorithmics, where analysis is performed for larger programs using a dataset, is emerging. A dataset is a huge collection of valid input data of an algorithm; the standard dataset that is often used for testing of algorithms is called a benchmark dataset. Interpretation of the results of a program on a benchmark dataset provides vital statistical information about the behaviour of the algorithm. This kind of analysis is called empirical analysis (also called a posteriori analysis). In addition, the term pro ling is often used to denote the process of running a program on a dataset and measuring the time/space require-ment of the program empirically.

1.5.7 Post or Postmortem AnalysisA problem-solving process ends with postmortem analysis. Any analysis should end with a valu-able insight. The following are the questions that are often raised in this problem-solving process:

1. Is the problem solvable?2. Are there any limits for this algorithm? Is there any theoretical limit for the ef ciency of

this problem?3. Is the algorithm ef cient, and are there any other algorithms that are faster than the current

algorithm?

Answers to these questions give a lot of insight into the algorithms that are necessary for re ning the design of algorithms to make them more ef cient. The best possible or optimal solution that is theoretically possible for a given problem speci es a lower bound of a given problem. The worst-case estimate of resources required by the algorithm is called an upper

Oxford

A datasatashat is often s often

f the resultf the resulabout the but the b

(also calledalso callede procesproce

pro

Univer

sity

ograithm in a thm in

anguage is canguage is g. If the prograf the pro

ck to the desigto the debe performedbe perform

gram is deveam is dsis). This anas). This a

hmicshmics, whe, wt is a it

Pres

scal Analyl Analand a suitabd a suitab

he programhe programm may m may


bound. Theoretically, the best solution of a problem should be closer to its lower bound. The difference between the lower and upper bounds should be minimal for a better solution. The difference between the upper and the lower bounds is called an algorithmic gap. Technically, this should be zero. However, in practice, there may be vast difference between an upper and a lower bound. A problem-solving process tries to reduce this difference by focusing on better planning, design, and implementation on a continuous basis.

1.6 CLASSIFICATION OF ALGORITHMS

As algorithms are important for us, let us classify them for further elaboration. No single criterion exists for classifying algorithms. Figure 1.5 shows some of the criteria used for classi cation of algorithms.

Fig. 1.5 Classi cation of algorithms

Algorithms

Classi cation based on

implementation

Classi cation based on

design

Classi cation based on area of

specialization

Classi cation based on tractability

1.6.1 Based on Implementa onBased on implementation, one can classify the algorithms as follows

Problem reduction is a scienti c principle for problem solving. Take a problem, reduce it, and repeat the reduction process till the problem is reduced to a level where it can be solved directly. Using recursion, the problem is reduced to another problem with a decrease in input instance. The transformed problem is the same as the original one but with a different input, which is less than that of the original problem. The problem reduction process is continued till the given problem is reduced to a smaller problem that can be solved directly. Then the results of the sub-problems are combined to get the result of the given problem. This strategy of problem solving is called recursion.

Recursive algorithms use recursive functions for creating repetitions required for solving a given problem.

A good example of recursion is computing the factorial of a number n. Factorial of a number can be computed using the recursive function as follows:

n! = 0 if n = 0n × (n 1)! for n 1

Thus, a recursive function has a base case and an inductive case. A base case is the simplest problem that can be solved directly. In this factorial example, the base case is nding 0!, as 0! can be solved directly. An inductive case is a recursive de nition of the problem that captures the essence of a problem reduction process. It can be observed from Fig. 1.6, which shows the computation of 4!, that recursion works by the principle of work postponement or delaying the

Oxfordis a scientia scient

eduction production prng recursionrecursion

The transfohe transfoss thanss than

Univer

sity

Classi cation ossi catiosityCla

based specsity

sity

onne can classine can cl

Pres

sPrPPPPPP


work. For a given factorial program, the facto-rial function calls itself by reducing its input by 1 till the program reaches 0! This is the base case and the algorithm stops here; results of the sub-problems are collected for computing the

nal answer of the given problem.The correctness of recursive algorithms is

due to its close relationship with the concept of mathematical induction. In other words, re-cursion is a mirror of mathematical induction.

Non-recursive (or iterative) algorithms, on the other hand, are deductive in nature. Non-recursive algorithms do not use the recursion

concept but, instead, rely on looping constructs, such as for or while statement, to create repetition of tasks. Non-recursive (or iterative) algorithms and their analyses are discussed in Chapter 2, and recursive algorithms and their analyses in Chapter 3.

An algorithm that is designed for a single processor is called a sequential algorithm.A parallel algorithm is designed for systems that use a set of processors. The concept of

parallel processing and distributed processing are interrelated. Parallel systems have mul-tiple processors that are located closely. Distributed systems, on the contrary, have multiple processors that are located at different places separated by a vast distance geographically. Hence, distributed systems are called loosely coupled systems, while parallel systems are called tightly coupled systems. Distributed algorithms are implemented in a distributed system environment. Parallel algorithms are discussed in Chapter 20.

An exact algorithm nds the exact solutions for a given problem. Some problems are so complex that nding their exact solutions is dif cult. Approximation algorithms (discussed in Chapter 19 of this book) nd equivalent or approximate solutions for a given problem.

Deterministic algorithms always provide xed predictable results for a given input. In con-trast, non-deterministic algorithms or randomized algorithms take a different approach. For deterministic algorithms, the output should always be true. On the other hand, randomized algorithms relax this condition. It is argued that outputs based on random decisions may not often result in correct answers or the algorithm may not terminate at all. Thus, the accuracy of an output is associated with a probability. In daily life, we often use random decisions, for example, in games such as dice. Industries conduct random quality checks on products. Randomized samples are used to predict poll results and so on. One can be very sceptical about randomized algorithms. However, randomized algorithms have been proved to be very effective. Randomized algorithms are discussed in Chapter 19.

1.6.2 Based on DesignBased on design techniques, algorithms can be classi ed into various categories. As discussed earlier, every design technique uses a strategy for solving a problem. Algorithms can be

Fig. 1.6 Computation of 4! using recursion

4!

3!

Reduce

4 × 3 × 2 × 1

3 × 2 × 1

2 × 1

1

Reduce

Reduce

Reduce

2!

1!

0!

Oxford

algorithorith

orithmthm nds ndhat nding tat nding

er 19 of th19 of th

Univer

sityrocessor is cacessor is c

ystems that utems thaocessing are essing a

osely. Distribuosely. Distrifferent placeferent pl

are called looe called lms. ms. DistribuDistri

ms ar

Pres

se

ithms doms dowhile statemhile statem

r anaanalyses alyses a3.3.


classi ed based on design as brute force, divide and conquer, dynamic programming, greedy approach, backtracking, and branch and bound algorithms. This textbook is organized based on this classi cation of algorithms only.

1.6.3 Based on Area of Specializa onAlgorithms can be classi ed based on the area of specialization. General algorithms such as searching and sorting, and order statistics such as nding mean, median, rank, and so on are useful for all elds of specialization. These algorithms can be considered as building blocks of any area of specialization. Other than these basic algorithms, some algorithms are special-ized for a particular domain. String algorithms, graph algorithms, combinatorial algorithms, and geometric problems are some examples of specialized algorithms that are discussed in this book. Let us discuss some of these algorithms now.

Sorting algorithm is a general algorithm that is useful in all areas of specialization. Sorting problem is a structural problem. It involves structuring or rearranging the sequence in a spe-ci c order. The importance of this problem arises from the fact that all modern applications require sorting. For example, an organization may require sorting of employee records based on employee identi cation number (called primary key). A library system may want books to be ordered based on titles or ISBN numbers. Sorting problems for small instances are relatively easy. When sorting is required for a larger number, say a billion elements, designing sorting algorithms becomes more challenging.

Let us discuss some of these domain-speci c algorithms now:String algorithms String algorithms are used frequently in document editing, web searches, and pattern matching.

A sequence of characters is called a string. The sequence can be text, bits, numbers, or gene sequences (A, C, G, or T). One problem that is of considerable interest is string match-ing, which takes a pattern, searches the text string, and reports whether a pattern is present or not. String algorithms, which are discussed in Chapter 16, constitute a very important

eld of study.

Graph algorithms A graph is used for modelling complex systems by representing the rela-tionships among the subsystems. A graph represents a set of nodes and edge. The nodes are also called vertices. The nodes are connected by edges, which are also known as arcs. Consider the graph shown in Fig. 1.7. Here, the nodes are given as {A, B, C, D, E} and the edges as {(A,B), (A,C), (A,D), (B,A), (B,E), (B,C), (C,A), (C,B), (C,D), (C,E), (D,A), (D,C), (D,E), (E,B), (E,C),

and (E,D)}.

Many interesting problems can be formulated using this graph structure. The following are some of the interesting graph algorithms:

ra e ing sa esperson pro e As stated earlier, a salesperson starts from a node, visits all other nodes only once, and gets back to the starting node. This problem is reduced to a Hamiltonian cycle if the graph is undirected. In the case of a weighted graph (where edges carry some weights), the TSP is about

nding a tour of minimum cost.Fig. 1.7 Sample graph

A

E

B C D

ring alg alg.

f characters f characteres (A, C, G, A, C, G,

takes a patttakes a patring algorng algo

y

Univer

sity

omy requireequi

mary key). Amary key). Sorting proborting pr

ger number, sr numberng.

domain-specdomain-sporithmsorith

Pres

sall areas ofareas oor rearrangor rearrang

m the facm the facsos


Chinese postman problem This problem is the same as a TSP, but instead of vertices, an edge should be visited only once. There is no restriction on vertices in this problem.Graph colouring problem This problem is about how to colour all the vertices distinctly using only a small number of colours such that no two neighbouring vertices share the same colour.Combinatorial algorithms The focus of combinatorial algorithms is to nd a combinatorial object inherent in the problem such as permutations, combinations, or a subset that satis es some constraints and objective functions. These problems are dif cult to solve, and many problems do not have algorithmic solutions. TSPs and graph colouring problems are examples of combinatorial problems.Geometric algorithms Geometric algorithms deal with geometrical objects such as points, lines, and polygons. The following are some of the algorithms discussed in this textbook:Closest pair problem This problem deals with nding the distance between points in a 2D space and nding a pair of points that are closest to each other.Convex hull problem This problem deals with nding the smallest convex polygon that includes all points in a 2D space.

1.6.4 Based on TractabilityTractability means solvability of a given problem within a reasonable amount of time and space. An intractable problem is dif cult to solve within the reasonable amount of computer resources. Based on tractability, the following categories are possible:Easily solvable problems These problems have polynomial time complexity or their upper bounds are characterized by a polynomial. These problems are solvable. For example, sorting is an example of solvable or polynomial problems.Unsolvable problems Problems such as halting problems cannot be solved at all. These are called unsolvable or non-computable problems. Knowledge about the non-computability of these problems helps in project management.Intractable problems These problems are of two categories. One category comprises a set of problems that have algorithmic solutions but require more computer resources. Hence, these solutions are practically non-implementable. In addition, these have been proved to be computationally hard. The other category consists of a set of problems that have been proved to be computationally hard. For example, a TSP is a proven computationally hard problem; it had already been discussed that time complexity of the algorithm increases exponentially with an increase in the number of cities.

Oxford

ble ore orems ProbProb

vable or nonvable or nolems helps ms helps

ble problemle problemblems thlems t

Univer

sity

n problem wiproblem cult to solve wt to solv

he following e followinhese problemhese prob

a polynomiaa polynompolynompolyn

Pres

sdistance btance botherher..

ding the smding the sm


Thus, algorithm study can be concluded as the central theme of computer science. An as-piring computer professional should be a better problem solver. The problem-solving process starts with the understanding of a given problem and ends with nding an ef cient program-ming code for the given problem. Problem solving poses many challenges as it is a creative process. Chapter 2 introduces the basics of a problem-solving process and also introduces basic guidelines for writing algorithms.

SUMMARY

An algorithm is a step-by-step procedure for solving a given problem.

A computational problem is characterized by two factors—speci cation of valid input and output param-eters of the algorithm, and speci cation of the relation-ship between inputs and outputs.

An algorithm that yields the correct output for a legal input is called an algorithmic solution.

The art of designing, analysing, and implementing algorithms is called algorithmics.

Algorithms can be contrasted with programs. A program is an expression of algorithm in a programming language.

A valid input is called an instance. The number of binary bits necessary to encode inputs of an algorithm is called the input size.

An algorithm should have characteristics such as a well-de ned order, inputs, outputs, niteness, de nite-ness, ef ciency, and generality.

Problem solving starts with the understanding of a problem statement without any confusion.

A computation or computational model is an abstraction of a real-world computer.

Algorithm design is a way of developing algorithmic solutions using a suitable course of action called a design strategy. Algorithm speci cation is about com-municating the design strategy to a programmer often in the form of a pseudocode.

Algorithmic validation means checking whether the algorithm gives a correct result or not.

Algorithm veri cation is a process of providing a math-ematical proof that the given algorithm works correctly for all instances of data.

A proof of an algorithm is said to exist if the precondi-tions can be shown to imply postconditions logically.

An estimation of the time and space complexities of an algorithm for varying input sizes is called algorithm analysis.

Time complexity refers to the measurement of run time of an algorithm in terms of its input size, and space complexity is the measurement of space required for a given algorithm.

A dataset is a huge collection of valid input data of an algorithm. The standard dataset that is often used for testing of algorithms is called a benchmark dataset.

The theoretically best possible or optimal solution for a given problem speci es a lower bound of a given problem. The worst-case estimate of resources that can be required by an algorithm is called an upper bound. The difference between the upper and the lower bound is called an algorithmic gap.

Problem reduction is a scienti c principle for problem solving. Take a problem, reduce it, and repeat the re-duction process till the problem is reduced to a level where it can be solved directly.

Algorithms can be categorized based on their imple-mentation methods, design techniques, eld of study, and tractability.

GLOSSARY

Agent A performer of an algorithmAlgorist A person who is skilled in algorithm developmentAlgorithm A step-by-step procedure for solving a given

problemAlgorithm gap The difference between lower and upper

boundsFormalization of an algorithm in

a suitable form that can be conveyed to a programmer

The process of providing a mathemati-cal proof that the algorithm works correctly for all valid inputs

Algorithm validation The process of checking the correct-ness of an algorithm, this is done by giving valid inputs to it and checking its results with expected values

Approximation algorithms Algorithms that provide ap-proximate solutions for problems whose exact solutions are dif cult to obtain

nderstanerstanconfusion.fusion.

nal model is annal model is

way of devey of devuitable coursitable courrithm sprithm sp

tra

ed

uch as a uch as a ess, de nitess, de nite-

ing oi

omplexlean algorithm an algorithm

complexity is thmplexity is ta given algoa given aA dataseA dataof an of ause

m is said said to imply pomply po

f the time ane time afor varying infor varying in

ty refty re


REVIEW QUESTIONS

-

EXERCISES

-n 2n

2

n n nn 3.

9 .

Complexity theory

Computability theory

Computational model - -

Computational problem

Deterministic algorithm

Exact algorithm

Experimental algorithmics

Intractable problems

Lower bound Non-deterministic algorithm -

Non-recursive algorithm (iterative algorithm)

Parallel algorithm

-

Proof

Recursive algorithm

Sequential algorithm -

Space complexity

Strategy Time complexity

Tractable problems

Upper bound -

ADDITIONAL PROBLEM

. Nine Algorithms That Changed the Future: The Ingenious Ideas that drive Today’s Computers,

.

.

.

Oxford..

Univer

sity

W QUESTIONQUEST

.

pper bounder boun

Pres

soblemsoblems


Answers to the Crossword are available as online resources

CROSSWORD

1

2

3

4 5

7

8

9 10

11

12

13

Across 1. Algorithms that are solved within reasonable amount

of computable resources 4. Algorithms that use recursive functions . Who coined the word algorithm analysis 7. An abstraction of a real world problem 8. Statistical process of measuring resources on a larger

dataset 10. Process of checking algorithms 11. Theory of algorithms 12. Analytical skills 13. Attributed as Father of Modern Computing by many

Down1. Measurement of time2. Study of analysis of algorithms3. A step-by-step procedure5. A person skilled in algorithms9. Process of removing syntax errors

Oxford

onsanalysisysis

problemproblemsuring resourcing resou

algorithmsalgorithms

Univer

sity

ersit

ytytttytyityyyititrsityityityyyrser

srseeererrrble amount e amount

DowD1

ess

Pres

ssssssessssssee

Presesresss

PrPPre

PrerePPP

Documents

DESIGN AND ANALYSIS OF O ALGORITHMS HM