MC0080 – Analysis and design of algorithms

1. Briefly explain the concept of Dijkstra’s Algorithm. Ans:

Directed Graph

So far we have discussed applications of the Greedy technique to problems involving undirected graphs, in which each edge (a, b) from a to b is equally an edge from b to a. In other words, the two representations (a, b) and (b, a) denote the same edge. Undirected graphs represent symmetric relations. For example, the relation of ‘brother’ between the male members of, say, a city is symmetric. However, on the same set, the relation of ‘father’ is not symmetric. Thus a general relation may be symmetric or asymmetric. A general relation is represented by a directed graph, in which the directed edge, also called an arc, (a, b) denotes an edge from a to b. The directed edge (a, b) is not the same as the directed edge (b, a), which denotes the edge from b to a. Next, we formally define a directed graph and then solve some problems involving directed graphs using the Greedy technique.

A note on notation: in mathematics, (a, b) denotes the ordered pair of the two elements a and b, in which a comes first and b follows. The ordered pair (b, a) denotes a different ordered set, in which b comes first and a follows. So far, however, we have misused the notation (a, b) to denote an unordered set of two elements, i.e., a set in which the order of occurrence of a and b does not matter. In mathematics the usual notation for an unordered set is {a, b}. In this section, we use parentheses, i.e., ( and ), to denote ordered sets and braces, i.e., { and }, to denote general (i.e., unordered) sets.

Definition: A directed graph or digraph is a pair G = (V(G), E(G)), where V(G) denotes the set of vertices of G and E(G) the set of directed edges, also called arcs, of G. An arc from a to b is denoted as (a, b). Graphically, it is drawn as an arrow from a to b, in which the arrow indicates the direction. The vertex a is sometimes called the tail and the vertex b the head of the arc or directed edge.

Definition: A Weighted Directed Graph is a directed graph in which each arc has an assigned weight. A weighted directed graph may be denoted as G = (V(G), E(G)), where any element of E(G) is of the form (a, b, w), in which w denotes the weight of the arc (a, b). The weighted directed graph G = ({a, b, c, d, e}, {(b, a, 3), (b, d, 2), (a, d, 7), (c, b, 4), (c, d, 5), (d, e, 4), (e, c, 6)}) is diagrammatically represented as follows:

Figure 7.7.1

Single-Source Shortest Path
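Before the algorithm is described, it may help to see the example digraph of Figure 7.7.1 as data. The following is a minimal Python sketch, not the textbook's own procedure: it stores the arcs in an adjacency map and computes shortest distances from a source with a priority-queue variant of Dijkstra's algorithm, the single-source shortest path method that this section goes on to describe. The names ARCS, VERTICES, build_adjacency and dijkstra, and the choice of b as the source, are illustrative assumptions.

```python
import heapq
from math import inf

# Example weighted digraph from the text: arcs written as (tail, head, weight).
VERTICES = ["a", "b", "c", "d", "e"]
ARCS = [("b", "a", 3), ("b", "d", 2), ("a", "d", 7), ("c", "b", 4),
        ("c", "d", 5), ("d", "e", 4), ("e", "c", 6)]


def build_adjacency(vertices, arcs):
    """Map each tail vertex to the list of (head, weight) pairs leaving it."""
    adjacency = {v: [] for v in vertices}
    for tail, head, weight in arcs:
        adjacency[tail].append((head, weight))
    return adjacency


def dijkstra(adjacency, source):
    """Return shortest distances from source; unreachable vertices stay at inf.

    Assumes all weights are positive, as the text requires.
    """
    dist = {v: inf for v in adjacency}
    dist[source] = 0
    heap = [(0, source)]          # (tentative distance, vertex)
    finished = set()              # vertices whose distance is final
    while heap:
        d, v = heapq.heappop(heap)
        if v in finished:
            continue              # stale heap entry, already settled
        finished.add(v)
        for x, w in adjacency[v]:
            if d + w < dist[x]:
                dist[x] = d + w
                heapq.heappush(heap, (dist[x], x))
    return dist


if __name__ == "__main__":
    print(dijkstra(build_adjacency(VERTICES, ARCS), "b"))
```

With b as the source, the distances work out to a = 3, d = 2, e = 6 and c = 12. The heap replaces the linear scan for the minimum; the distances it produces are the same.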

Next, we consider the problem of finding the shortest distance of each vertex of a given weighted connected graph from some fixed vertex of the graph. All the weights between pairs of vertices are taken to be positive numbers. The fixed vertex is called the source. The problem is known as the Single-Source Shortest Path Problem (SSSPP). One of the well-known algorithms for SSSPP is due to Dijkstra. The algorithm proceeds iteratively: it first considers the vertex nearest to the source, then the next nearest vertex to the source, and so on. Except for the first vertex and the source, the distances of all vertices are iteratively adjusted, taking into consideration the new minimum distances of the vertices considered earlier. If a vertex is not connected to the source by an edge, then it is initially considered to have distance ∞ from the source.

Algorithm Single-source-Dijkstra (V, E, s)
// The inputs to the algorithm consist of the set of vertices V, the set of edges E, and s,
// the selected vertex, which is to serve as the source. Further, weights w(i, j) between
// every pair of vertices i and j are given. The algorithm finds and returns D(v), the
// minimum distance of each vertex v in V from s. An array D of the size of the
// number of vertices in the graph is used to store the distances of the various vertices
// from the source. Initially the distance of the source from itself is taken as 0
// and the distance D(v) of any other vertex v is taken as ∞.
// Iteratively, the distances of other vertices are modified, taking into consideration the
// minimum distances of the various nodes from the node with the most recently finalized
// distance.
D(s) ← 0
For each vertex v ≠ s do
    D(v) ← ∞
// Let Set-Remaining-Nodes be the set of all those nodes for which the final minimum
// distance is yet to be determined. Initially
Set-Remaining-Nodes ← V
while (Set-Remaining-Nodes ≠ ∅) do
begin
    choose v ∈ Set-Remaining-Nodes such that D(v) is minimum
    Set-Remaining-Nodes ← Set-Remaining-Nodes ~ {v}
    For each node x ∈ Set-Remaining-Nodes such that w(v, x) ≠ ∞ do
        D(x) ← min {D(x), D(v) + w(v, x)}
end

2. Describe the following with suitable examples for each:

o Binary Search Trees

Ans: We know that for binary search trees and red-black trees, any “satellite information” associated with a key is stored in the same node as the key. In practice, one might actually store with each key just a pointer to another disk page containing the satellite information for that key. The pseudocode in this chapter implicitly assumes that the satellite information associated with a key, or the pointer to such satellite information, travels with the key whenever the key is moved from node to node. A common variant of the B – tree, known as the B+ – tree, stores all the satellite information in the leaves and stores only keys and child pointers in the internal nodes, thus maximizing the branching factor of the internal nodes.

Objectives: At the end of this unit the student should be able to:
· Find the height of a B-tree.
· Recognize a Fibonacci Heap.

Properties of B – Trees

A B – tree T is a rooted tree (whose root is root [T]) having the following properties:
1. Every node x has the following fields:
a. n [x], the number of keys currently stored in node x,
b. the n [x] keys themselves, stored in nondecreasing order, so that key_1[x] ≤ key_2[x] ≤ … ≤ key_n[x][x]

c. leaf [x], a Boolean value that is TRUE if x is a leaf and FALSE if x is an internal node.
2. Each internal node x also contains n [x] + 1 pointers c_1[x], c_2[x], …, c_(n[x]+1)[x] to its children. Leaf nodes have no children, so their c_i fields are undefined.
3. The keys key_i[x] separate the ranges of keys stored in each subtree: if k_i is any key stored in the subtree with root c_i[x], then k_1 ≤ key_1[x] ≤ k_2 ≤ key_2[x] ≤ … ≤ key_n[x][x] ≤ k_(n[x]+1).
4. All leaves have the same depth, which is the tree’s height h.
5. There are lower and upper bounds on the number of keys a node can contain. These bounds can be expressed in terms of a fixed integer t ≥ 2 called the minimum degree of the B – tree:
a. Every node other than the root must have at least t – 1 keys. Every internal node other than the root thus has at least t children. If the tree is nonempty, the root must have at least one key.
b. Every node can contain at most 2t – 1 keys. Therefore, an internal node can have at most 2t children. We say that a node is full if it contains exactly 2t – 1 keys.

The simplest B – tree occurs when t = 2. Every internal node then has either 2, 3, or 4 children, and we have a 2-3-4 tree. In practice, however, much larger values of t are typically used.
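The properties above can be turned into a small checker. The following is a minimal Python sketch under stated assumptions: BTreeNode and check_btree are hypothetical names, the tree is held in memory rather than on disk pages, keys are integers, the tree is nonempty, and leaf [x] is derived from the absence of children instead of being stored as a field.

```python
from dataclasses import dataclass, field
from math import inf
from typing import List


@dataclass
class BTreeNode:
    """Hypothetical in-memory node: keys stands for key_i[x], children for c_i[x]."""
    keys: List[int] = field(default_factory=list)
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def leaf(self) -> bool:
        # leaf[x]: derived here from the absence of children (an assumption).
        return not self.children


def check_btree(root: BTreeNode, t: int) -> int:
    """Return the height h if the tree obeys properties 1 to 5, else raise ValueError."""

    def visit(node, low, high, is_root):
        n = len(node.keys)
        min_keys = 1 if is_root else t - 1          # property 5a
        if not min_keys <= n <= 2 * t - 1:          # property 5b
            raise ValueError(f"node {node.keys} violates the key-count bounds")
        if node.keys != sorted(node.keys):          # property 1b
            raise ValueError(f"node {node.keys} is not in nondecreasing order")
        if any(k < low or k > high for k in node.keys):   # property 3
            raise ValueError(f"node {node.keys} is outside its key range")
        if node.leaf:
            return 0
        if len(node.children) != n + 1:             # property 2
            raise ValueError(f"node {node.keys} has the wrong number of children")
        bounds = [low] + node.keys + [high]
        heights = {visit(child, bounds[i], bounds[i + 1], False)
                   for i, child in enumerate(node.children)}
        if len(heights) != 1:                       # property 4
            raise ValueError("leaves are at different depths")
        return heights.pop() + 1

    return visit(root, -inf, inf, True)
```

For t = 2, a root holding keys 10 and 20 with leaf children holding [5], [15] and [25, 30, 35] passes the check with height 1, while a leaf with four keys is rejected because it exceeds 2t – 1 = 3.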

o Red Black Trees

Properties: A red-black tree is a binary search tree in which:

· The root is colored black.
· All the paths from the root to the leaves agree on the number of black nodes.
· No path from the root to a leaf may contain two consecutive nodes colored red.
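The three colouring properties listed above can be checked mechanically. The following is a minimal Python sketch with hypothetical names (RBNode, black_height, is_red_black). It treats an empty subtree, represented by None, as black, and it checks only the colouring rules, not the binary-search-tree ordering of the keys.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"


@dataclass
class RBNode:
    """Hypothetical node type: a key, a colour, and two optional children."""
    key: int
    color: str
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def black_height(node: Optional[RBNode], parent_is_red: bool = False) -> int:
    """Return the number of black nodes on every downward path from node.

    Raises ValueError if two consecutive red nodes occur or if two paths
    disagree on the number of black nodes.
    """
    if node is None:
        return 0                      # empty subtree: nothing more to count
    if node.color == RED and parent_is_red:
        raise ValueError("two consecutive red nodes")
    left = black_height(node.left, node.color == RED)
    right = black_height(node.right, node.color == RED)
    if left != right:
        raise ValueError("paths disagree on the number of black nodes")
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root: Optional[RBNode]) -> bool:
    """Check the three colouring properties; an empty tree is accepted."""
    if root is not None and root.color != BLACK:
        return False                  # the root must be black
    try:
        black_height(root)
    except ValueError:
        return False
    return True
```

A black root with two red children passes, whereas a red root, a red child of a red node, or a black node on only one side of the root fails.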

Empty subtrees of a node are treated as subtrees with roots of black color. The relation n > 2^(h/2) – 1 implies the bound h < 2 log_2 (n + 1).

3. Define and explain a context free grammar.

Ans: Earlier in the discussion of grammars we saw context-free grammars. They are grammars whose productions have the form X -> α, where X is a nonterminal and α is a nonempty string of terminals and nonterminals. The set of strings generated by a context-free grammar is called a context-free language, and context-free languages can describe many practically important systems. Most programming languages can be approximated by context-free grammars, and compilers for them have been developed based on properties of context-free languages. Let us define context-free grammars and context-free languages here.

Definition (Context-Free Grammar): A 4-tuple G = < V , Σ , S , P > is a context-free grammar (CFG) if V and Σ are finite sets sharing no elements between them, S ∈ V is the start symbol, and P is a finite set of productions of the form X -> α, where X ∈ V and α ∈ (V ∪ Σ)*. A language is a context-free language (CFL) if it is exactly the set of strings generated by some context-free grammar.

Example 1: L1 = { a^n b^n | n is a positive integer } is a context-free language. The following context-free grammar G1 = < V1 , Σ , S , P1 > generates L1: V1 = { S } , Σ = { a , b } and P1 = { S -> aSb , S -> ab }.

Example 2: L2 = { w w^r | w ∈ {a, b}+ } is a context-free language, where w is a non-empty string and w^r denotes the reversal of the string w, that is, w is spelled backward to obtain w^r. The following context-free grammar G2 = < V2 , Σ , S , P2 > generates L2: V2 = { S } , Σ = { a , b } and P2 = { S -> aSa , S -> bSb , S -> aa , S -> bb }.

Example 3: Let L3 be the set of algebraic expressions involving the identifiers x and y, the operations + and *, and left and right parentheses. Then L3 is a context-free language. The following context-free grammar G3 = < V3 , Σ3 , S , P3 > generates L3: V3 = { S } , Σ3 = { x , y , ( , ) , + , * } and P3 = { S -> ( S + S ) , S -> S*S , S -> x , S -> y }.

Example 4: Portions of the syntaxes of programming languages can be described by context-free grammars. For example:
{ < statement > -> < if-statement > ,
< statement > -> < for-statement > ,
< statement > -> < assignment > , . . . ,
< if-statement > -> if ( < expression > ) < statement > ,
< for-statement > -> for ( < expression > ; < expression > ; < expression > ) < statement > , . . . ,
< expression > -> < algebraic-expression > ,
< expression > -> < logical-expression > , . . . } .
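To see how the productions of G1 and G2 generate their languages, the short Python sketch below enumerates every terminal string up to a given length by repeatedly expanding the leftmost nonterminal. The function name generate and the dictionary encoding of the productions are illustrative assumptions. The sketch also assumes that each symbol is a single character and that no production shortens a sentential form, which holds for the grammars above.

```python
from collections import deque


def generate(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    productions maps each nonterminal (one character) to its right-hand sides.
    Pruning by length is safe only because no production shrinks a string.
    """
    nonterminals = set(productions)
    results = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, s in enumerate(form) if s in nonterminals), None)
        if idx is None:
            results.add(form)         # only terminals left: a generated string
            continue
        for rhs in productions[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(results, key=lambda s: (len(s), s))


P1 = {"S": ["aSb", "ab"]}                 # grammar G1 of Example 1
P2 = {"S": ["aSa", "bSb", "aa", "bb"]}    # grammar G2 of Example 2

if __name__ == "__main__":
    print(generate(P1, "S", 6))   # ['ab', 'aabb', 'aaabbb']
    print(generate(P2, "S", 4))   # ['aa', 'bb', 'aaaa', 'abba', 'baab', 'bbbb']
```

For instance, aabb arises from the derivation S -> aSb -> aabb, which uses the first production of P1 once and the second once.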

4. Explain in your own words the concept of Turing machines. Ans:

There are a number of versions of a TM. We consider below the Halt State version of the formal definition of a TM.

Definition: Turing Machine (Halt State Version)

A Turing Machine is a sixtuple of the form (Q, Σ, Γ, δ, q0, h), where

(i) Q is the finite set of states,

(ii) Σ is the finite set of non-blank information symbols,

(iii) Γ is the set of tape symbols, including the blank symbol,

(iv) δ is the next-move partial function from Q × Γ to Q × Γ × {L, R, N}, where ‘L’ denotes that the tape Head moves to the left adjacent cell, ‘R’ denotes that the tape Head moves to the right adjacent cell and ‘N’ denotes that the Head does not move, i.e., continues scanning the same cell.

In other words, for qi ∈ Q and ak ∈ Γ, there exist (not necessarily always, because δ is a partial function) some qj ∈ Q and some al ∈ Γ such that δ(qi, ak) = (qj, al, x), where x may assume any one of the values ‘L’, ‘R’ and ‘N’.

The meaning of δ(qi, ak) = (qj, al, x) is that if qi is the current state of the TM and ak is the symbol in the cell currently under the Head, then the TM writes al in the cell currently under the Head and enters the state qj. The Head then moves to the right adjacent cell if the value of x is R, moves to the left adjacent cell if the value of x is L, and continues scanning the same cell if the value of x is N.

(v) q0 ∈ Q is the initial / start state.

(vi) h ∈ Q is the ‘Halt State’, in which the machine stops any further activity.

5. Describe Matrix Chain Multiplication Algorithm using Dynamic Programming.

Ans: It can be seen that if one arrangement is optimal for A1 A2 … An, then it is also optimal for the two sub-products (A1 … Ak) and (Ak+1 … An) of its top-level pairing. For, if there were a better pairing for, say, A1 A2 … Ak, then we could substitute it into A1 A2 … Ak Ak+1 … An to get a pairing better than the initially assumed optimal pairing, leading to a contradiction. Hence the principle of optimality is satisfied. Thus, the Dynamic Programming technique can be applied to the problem, and is discussed below.

Let us first define the problem. Let A i, 1 ≤ i ≤ n, be a d[i – 1] x d[i] matrix. Let the vector d[0…n] store the dimensions of the matrices, where the dimension of A i is d[i – 1] x d[i] for i = 1, 2, …, n. By definition, any subsequence A j … A k of A1 A2 … An, for 1 ≤ j ≤ k ≤ n, is a well-defined product of matrices. Let us consider a table m[1…n, 1…n] in which the entries m[i, j], for 1 ≤ i ≤ j ≤ n, represent the optimal (i.e., minimum) number of operations required to compute the product matrix (A i … A j).

We fill up the table diagonal-wise, i.e., in one iteration we fill up one diagonal m[i, i + s] at a time, for some constant s ≥ 0. Initially we consider the biggest diagonal m[i, i], for which s = 0. Then we consider the diagonal m[i, i + s] for s = 1, and so on.

First, we fill up the entries m[i, i], i = 1, 2, …, n. Now m[i, i] stands for the minimum number of scalar multiplications required to compute the product of the single matrix A i. But no scalar multiplications are required for a single matrix. Hence, m[i, i] = 0 for i = 1, 2, …, n.

Next, we fill up the entries m[i, i + 1] for i = 1, 2, …, (n – 1).

m[i, i + 1] denotes the minimum number of scalar multiplications required to find the product A i A i+1. As A i is a d[i – 1] x d[i] matrix and A i+1 is a d[i] x d[i + 1] matrix, there is a unique number of scalar multiplications for computing A i A i+1, giving

m[i, i + 1] = d[i – 1] d[i] d[i + 1] for i = 1, 2, …, (n – 1).

The above case is also subsumed by the general case m[i, i + s] for s ≥ 1. For the expression A i A i+1 … A i+s, let us consider the top-level pairing (A i A i+1 … A j) (A j+1 … A i+s) for some j with i ≤ j < i + s. Assuming that the optimal numbers of scalar multiplications m[i, j] and m[j + 1, i + s] are already known, we can say that

m[i, i + s] = min over i ≤ j < i + s of (m[i, j] + m[j + 1, i + s] + d[i – 1] d[j] d[i + s]) for i = 1, 2, …, (n – s),

where the term d[i – 1] d[j] d[i + s] represents the number of scalar multiplications required to multiply the resultant matrices (A i … A j) and (A j+1 … A i+s).

Summing up the discussion, m[i, i] = 0 for i = 1, 2, …, n, the entries m[i, i + s] for s ≥ 1 are given by the recurrence above, and m[1, n] is the final answer.

Let us illustrate the algorithm to compute m[i, i + s] discussed above through an example. Let the given matrices be A1, A2, A3 and A4, with the dimension vector d[0 . . 4] given by [14, 6, 90, 4, 35], so that A1 is 14 x 6, A2 is 6 x 90, A3 is 90 x 4 and A4 is 4 x 35.

For s = 0, we know m[i, i] = 0. Thus all the diagonal entries of the table are zero.

Next, consider for s = 1 the entries m[i, i + 1] = d[i – 1] d[i] d[i + 1]: m[1, 2] = 14 x 6 x 90 = 7560, m[2, 3] = 6 x 90 x 4 = 2160 and m[3, 4] = 90 x 4 x 35 = 12600.
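The remaining diagonals of the example can be filled in mechanically. The following is a minimal Python sketch of the diagonal-wise table filling described above, applied to d = [14, 6, 90, 4, 35]. The function name matrix_chain_order is an illustrative assumption, and row 0 and column 0 of the table are left unused so that the indices match the 1-based notation of the text.

```python
def matrix_chain_order(d):
    """Return the table m, where m[i][j] is the minimum number of scalar
    multiplications needed to compute A_i ... A_j (1-based indices).

    d[0..n] holds the dimensions: A_i is a d[i-1] x d[i] matrix.
    """
    n = len(d) - 1
    # Entries m[i][i] stay 0: a single matrix needs no multiplication.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for s in range(1, n):                      # diagonal m[i][i + s]
        for i in range(1, n - s + 1):
            m[i][i + s] = min(
                m[i][j] + m[j + 1][i + s] + d[i - 1] * d[j] * d[i + s]
                for j in range(i, i + s)       # split point, i <= j < i + s
            )
    return m


if __name__ == "__main__":
    table = matrix_chain_order([14, 6, 90, 4, 35])
    print(table[1][4])   # 4456
```

For this input the s = 2 diagonal is m[1, 3] = 2496 and m[2, 4] = 3000, and the final answer is m[1, 4] = 4456, obtained with the split j = 3, i.e., the pairing (A1 A2 A3) A4.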




6. Show that the clique problem is an NP-complete problem. Ans:

Proof: The verification of whether every pair of vertices in a candidate set is connected by an edge in E is done for the different pairs of vertices by a non-deterministic TM, i.e., in parallel. Hence, it takes only polynomial time, because for n vertices we need to verify at most n (n – 1) / 2 edges, the maximum number of edges in a graph with n vertices. This shows that the clique problem is in NP. We next show that the 3-CNF-SAT problem can be transformed to the clique problem in polynomial time.

Take an instance of 3-CNF-SAT. An instance of 3-CNF-SAT consists of a set of n clauses, each consisting of exactly 3 literals, each literal being either a variable or a negated variable. It is satisfiable if we can choose literals in such a way that:

· At least one literal from each clause is chosen.

· If a literal of the form x is chosen, no literal of the form ¬x is chosen.

For each of the literals, create a graph node, and connect each node to every node in the other clauses, except those with the same variable but different sign. This graph can be easily computed from a Boolean formula in 3-CNF in polynomial time.

Consider an example formula φ in 3-CNF with three clauses over the variables x1, x2 and x3, and let G be the graph constructed from φ as described above.

In the given example, a satisfying assignment of φ is (x1 = 0, x2 = 0, x3 = 1). A corresponding clique of size k = 3 consists of the vertices corresponding to ¬x2 from the first clause, x3 from the second clause, and x3 from the third clause.

The problem of finding an n-element clique is equivalent to finding a set of literals satisfying the formula. Because there are no edges between literals of the same clause, such a clique must contain exactly one literal from each clause. And because there are no edges between literals of the same variable but different sign, if the node of literal x is in the clique, no node of a literal of the form ¬x is.

This proves that finding an n-element clique in a 3n-element graph is NP-complete.
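The construction used in the proof can be sketched in a few lines of Python. The names build_graph and find_clique are illustrative, and literals are encoded as nonzero integers (3 for x3, -2 for ¬x2), which is an assumption of this sketch. The example formula PHI is also an assumption: the source formula is not reproduced in the text, so the sketch uses (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3), which is consistent with the satisfying assignment and clique described above. The clique search is brute force and exponential; only the graph construction is the polynomial-time part of the reduction.

```python
from itertools import combinations


def build_graph(clauses):
    """Build the clique-reduction graph: one node per literal occurrence.

    Two nodes are joined when they lie in different clauses and are not a
    variable together with its own negation.
    """
    nodes = [(c, lit) for c, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for u, v in combinations(range(len(nodes)), 2):
        (cu, lu), (cv, lv) = nodes[u], nodes[v]
        if cu != cv and lu != -lv:
            edges.add((u, v))         # stored with u < v
    return nodes, edges


def find_clique(nodes, edges, k):
    """Brute-force search for a k-clique; returns its nodes or None."""
    for subset in combinations(range(len(nodes)), k):
        if all(pair in edges for pair in combinations(subset, 2)):
            return [nodes[i] for i in subset]
    return None


# Assumed example formula (not taken from the source text).
PHI = [(1, -2, -3), (-1, 2, 3), (1, 2, 3)]

if __name__ == "__main__":
    nodes, edges = build_graph(PHI)
    print(find_clique(nodes, edges, len(PHI)))
```

Any clique of size 3 returned for PHI picks one literal per clause with no variable appearing in both signs, so setting the chosen literals to true yields a satisfying assignment.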








