Data Structures 2005

Lecture Notes on Data Structures by Sanjay Goel, July-Dec 2005, JIIT, Noida

Lecture Notes, Sanjay Goel, DS, 2005
Lecture Notes: Data Structures 2005

1. 22 July (1 hr.)
   1. Maintain your enthusiasm and a notebook for this course. The notebook will carry some marks.
   2. Discussion about course objectives.
   3. What is problem solving?
      - Recollection and narration of personal experiences of problem solving.
      - Identification of characteristics of real problems and solutions.
      - Some identified characteristics of real problems:
         i. Some resources or knowledge are missing.
         ii. They are difficult, and often do not have straightforward answers.
         iii. They are unexpected.
         iv. Often people are involved.
      - Some identified characteristics of solutions of real problems:
         i. Many alternative correct solutions are possible.
         ii. They require information/knowledge acquisition during problem solving.
         iii. They require a high level of motivation.
         iv. Real-life problems are solvable.
         v. Often help from others is needed.
   4. Knowledge is constructed during problem-solving experiences.
   5. The best strategy to acquire knowledge is to solve problems.
   6. Course strategy: Confront and challenge students with complex problems.
   7. Assignment: Write about 5 incidents of real-life problems faced by you; at least two of the five should be from academic experiences. How did you resolve these problems? Identify some common characteristics of these problems and your solutions. Discuss your analysis in a group of three and prepare a set of properties to characterize problems and solutions.

2. 26 July (1 hr.)
   1. Group-wise summarization of (1 mark for all participants):
      - Common characteristics of problems
      - Common characteristics of solutions
   2. Solution generation begins by describing the problem clearly and precisely.
   3. Instruments for bringing clarity and precision to our expressions:
      - Give examples and analogies.
      - Give measurements.
      - Give references.
      - Avoid jargon.
      - Avoid unnecessary descriptions and details.
      - Use graphical symbols.
      - Use mathematical symbols and expressions.
   4. We often have to learn, adapt and also invent new symbols for clear and precise expression. All engineering disciplines, including CSE and software engineering, have discipline-specific notations. It is a formal requirement to use these notations in technical communications.
   5. Group Assignment (3 students): Pick any software, machine or other such artifact and describe its functionality in a clear and precise manner.
   6. Programming Assignment: Practice pointers, files and structures. Ask your lab instructors and tutors to give you simple assignments for this practice. This is a non-evaluative exercise, but you must do it to develop confidence.

3. 28 July (1 hr.)
   1. Counterexamples help in bringing precision.
   2. Test the clarity and precision of another group’s description of the functionality of some software/machine.
   3. Prepare a list of the characteristics of the gaps.
   4. Assignment: Take the two best computer programs written by you so far and describe both in a clear and precise manner without using the syntax of a programming language. Make a group of ten students and develop a common checklist and style guideline for describing computer programs. Group members should test the clarity and precision of other members’ descriptions against this style guideline. Identify and enumerate common gaps and deficiencies. Bring this work to the next lecture.
   5. A computer program is an example of clear and precise expression.
   6. ACM and IEEE have suggested the following broad characteristics of ‘Computer Science’ graduates:
      a. System-level perspective;
      b. Appreciation of the interplay between theory and practice;
      c. Familiarity with common themes;
      d. Significant project experience; and
      e. Adaptability.
      They also suggested the following general skills for Computer Science graduates:
      a. Communication;
      b. Teamwork;
      c. Numeracy;
      d. Self management; and
      e. Professional development.
   7. ACM and IEEE have suggested the following broad characteristics of ‘Computer Engg’ graduates:
      a. System-level perspective;
      b. Depth and breadth;
      c. Design experience;
      d. Use of tools;
      e. Professional practice; and
      f. Communication skills.
   8. ACM and IEEE have suggested the following broad characteristics of ‘IT’ graduates:
      a. Use and apply current technical concepts and practices in the core information technologies;
      b. Analyze, identify and define the requirements that must be satisfied to address problems or opportunities faced by organizations or individuals;
      c. Design effective and usable IT-based solutions and integrate them into the user environment;
      d. Assist in the creation of an effective project plan;
      e. Identify and evaluate current and emerging technologies and assess their applicability to address the users’ needs;
      f. Analyze the impact of technology on individuals, organizations and society, including ethical, legal and policy issues;
      g. Demonstrate an understanding of best practices and standards and their application;
      h. Demonstrate independent critical thinking and problem solving skills;
      i. Collaborate in teams to accomplish a common goal by integrating personal initiative and group cooperation;
      j. Communicate effectively and efficiently with clients, users and peers both verbally and in writing, using appropriate terminology; and
      k. Recognize the need for continued learning throughout their career.
   9. Assignment: Read, brainstorm and understand the above recommendations very carefully. Analyze yourself on the above-mentioned parameters and prepare a roadmap to acquire and strengthen these abilities.
   10. Assignment: Learn about the system-level perspective. Discuss this with your seniors.

4. 30 July (1 hr.)
   1. Software development is a team effort involving the following sub-teams:

      a. Analyst and Designer
      b. Programmer
      c. Test designer
      d. Tester
   2. Assignment (4 marks): Make a team of 5 to 10.
      i. Design a common format (style guideline) for writing program descriptions without committing to any specific language.
      ii. Design a common format (style guideline) for writing Test Plans without committing to any specific program.
      iii. Design a common format (style guideline) for writing Test Reports with reference to the planned Test Plan format.
      iv. Rewrite the two program descriptions from your last assignment using this common format.
      v. Forward one description to another member within this group (do not mutually exchange).
      vi. Write a Test Plan for the program given to you by your group member as per the Test Plan format. B writes the Test Plan for the program description forwarded by A.
      vii. Forward the Test Plan, program description and program. Member C will test the program written by A as per the Test Plan prepared by B, prepare the Test Report as per the Test Report format, and give it to A for program modification. A will either modify the program or clarify each reported entry in the Test Report to B and C.
      viii. Identify common pitfalls, gaps, mistakes and errors.
   3. Give a clearer and more precise description of the following problem:
      A design team has conceived the following initial specifications of a search engine for a large company’s internal Digital Library: Only specially authorized users can upload new documents or new versions of old document. All employees can look at the documents. Information systems department will create, update and maintain a list of keywords for faster search facility. The search engine users can also search by entering any word through the keyboard. Searched documents are to be listed as follows:
      Case A, Faster search on a listed keyword: As per the frequency of occurrence of the word, i.e. the documents having higher “density” of the chosen keyword will be listed before the documents having lower density, where
         density[k, d] = (occurrence count of the word k in d) / (word count in d)
      Case B, Search by entering a word through the keyboard: As per the frequency of usage of a document, where usage is defined as the number of times a document is opened by users through the search engine.
      (Those who want to take the challenge of writing this program are exempted from the first 3 weeks’ programming exercises; discuss with your lab instructor.)
   4. Relationship between Engineering, Science and Art. Engineering is not merely applied Science. Engineering is older than Science.

5. 2 August (1 hr.)

   1. Discussion on the assignment. Issues to be addressed in a program description:
      i. Title, author, brief objective and solution description, applicability, target users, usage, target scale
      ii. Bibliography of technical terms (application domain)
      iii. Input/output variables and processing-only variables (name, usage, example, counterexample, domain definition, constraints and so on)
      iv. Examples and counterexamples with processing steps
      v. GUI/screenshots
      vi. Hierarchical description: macro view (executive summary) and detailed view (detailed description)
      vii. Memory sketch:
         a. A view of the data can be shown as a drawing. Just as a flow chart gives a view of the processor’s actions, this sketch can show the activity in memory.
         b. Data may be atomic, molecular or macromolecular in nature. It should be shown from a logical perspective.
         c. A single-line box can be used for atomic data, a double-line box for molecular data and a triple-line box for macromolecular data.
      viii. User’s manual
   2. (1) mark for all those who did this exercise in the class, (-2) marks for all absentees.
   3. The group size can vary from 5 to 10 for this assignment.

6. 4 August (1 hr.)
   1. Review of the format of program description.
   2. More symbols for the schematic format of the memory sketch:
      i. Input data: add an incoming arrowhead on the left side of the data box.
      ii. Output data: add an outgoing arrowhead on the right side of the data box.
      iii. I/O data: add arrowheads on both the left and right sides.
      iv. Processing data: no arrowhead.
      v. Data copy transfer: directed links between data boxes.
      vi. Processed data assignment: use elliptical boxes to show processing of chosen data items.
      [Figure: data boxes A, B and C with a "+" processing symbol]
   4. Format for developing a Test Plan. Be extra careful about planning to test the program at the corners and edges.
   5. Corners and edges of a program:
      i. Loop entries and exits
      ii. Worst-case inputs
      iii. Function calls
      iv. Returns from functions
   6. Develop the program so that it tests successfully even without knowledge of the test plan and under an unfriendly testing process.
   7. Assignment (2 marks): WAP to calculate the factorial of any +ve integer up to 9999. Continue your earlier assignment.
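The factorial assignment cannot be done with built-in integer types, since 9999! runs to tens of thousands of decimal digits. The following is a minimal sketch of one possible approach, assuming the number is kept as an array of decimal digits and multiplied the way it is done by hand. The names `big_factorial`, `print_big` and `MAX_DIGITS` are illustrative and not from the lecture.

```c
#include <stdio.h>

#define MAX_DIGITS 36000   /* assumed capacity; enough for the digits of 9999! */

/* Computes n! into digits[], least significant digit first.
   Returns the number of digits, or 0 if n is outside 0..9999. */
int big_factorial(int n, unsigned char digits[MAX_DIGITS])
{
    int len = 1;
    if (n < 0 || n > 9999)
        return 0;
    digits[0] = 1;
    for (int f = 2; f <= n; f++) {
        int carry = 0;
        /* Multiply every stored digit by f, propagating the carry. */
        for (int i = 0; i < len; i++) {
            int v = digits[i] * f + carry;
            digits[i] = (unsigned char)(v % 10);
            carry = v / 10;
        }
        /* Whatever carry remains becomes new leading digits. */
        while (carry > 0) {
            digits[len++] = (unsigned char)(carry % 10);
            carry /= 10;
        }
    }
    return len;
}

/* Prints the stored number, most significant digit first. */
void print_big(const unsigned char digits[], int len)
{
    for (int i = len - 1; i >= 0; i--)
        putchar('0' + digits[i]);
    putchar('\n');
}
```

A caller would declare a (preferably static) buffer of `MAX_DIGITS` bytes, call `big_factorial(n, buffer)` and pass the returned length to `print_big`. The corner cases worth testing here are the ones listed above: the smallest inputs and the largest permitted input.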

7. 6 August (1 hr.)
   1. Because no student made progress on the last assignment, all students get a (–1) mark.
   2. Individual Assignment (3 marks): Use this training in clearly and precisely describing problems and solutions to propose a computational system in the assigned domain. The last digit of your enrollment number will determine the domain as follows:
      0: Polynomial
      1: Genealogical chart
      2: Time Table
      3: Periodic table
      4: Maps
      5: Analog Circuits
      6: Digital Circuits
      7: Musical compositions/Sketches
      8: Dictionary and Thesaurus
      9: Newspaper
      If you want to work with some other domain outside the above list, you can. However, you are not permitted to work on a domain already assigned to other students.
      You have to propose the functional requirements of software that stores (and also creates) the mentioned object(s) in soft form and allows users to do useful operations with the stored soft form of your object. Identify these operations, their effects, their requirements and the method of activating them.
      Propose the data structure for storing the main objects for your software.
      If you sincerely complete all the assignments mentioned in the lecture notes up to this date within one week, and show good performance, the –ve marks assigned to you so far will be removed.

8. 9 August (1 hr.)
   1. Some common operations that computers are used for with any object:
      - Create/Store
      - Search/Retrieve/Pattern Matching/Traversal
      - Process/Modify
      - Sort/Arrange/Rearrange
      - Generate/Scheduling
      - Render
   2. Group workout on the last assignment.
   3.

9. 11.8.05
   1. 1st version of Conceptual Design of Data Storage: case study of a genealogical chart by Abhinav (B9).
   2. Issues: memory consumption and others.
   3. 2nd version of Conceptual Design of Data Storage: use of codes and a lookup table to decrease the memory requirement.

10. 13.8.05
   1. 1st version of Physical Design of Data Storage for the genealogical chart using facilities offered by the C language.
   2. 3rd version of Conceptual Design to address the constraints of the physical design facilities.
   3. 2nd version of Physical Design (table as an array of records; graph as a matrix or as an array of records [i, j, type of relationship]).
   4. More versions of conceptual and physical design to reduce memory space requirements.
   5. Algorithms for retrieving information that is explicitly stored: requires accessing many data tanks.
   6. Generating possibilities for retrieving information that is not stored explicitly and requires some processing: algorithms.
   7. Generating possibilities for automated updating of information on some events: algorithms.
   8. Assignment: Apply this cyclic process for incremental improvement of the design already assigned to you.
   9. Assignment [4 marks]: Write a C program to store up to 50000 persons and their relationships. Enter the data for 50. WAP to address the following queries:
      i. How is the ith person related to the jth person?
      ii. To how many persons is the ith person related?
      iii. Who are the common persons somehow related to both the ith and jth persons?
      iv. Who are the common persons somehow related to all of the ith, jth and kth persons?
   10. Assignment [2 marks]: Think about how two different family charts can be merged and automatically updated in the event of a marriage between persons belonging to different family charts.
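The "graph as an array of records [i, j, type of relationship]" option from the physical design above can be sketched in C as follows. This is only an illustration under stated assumptions: the relationship codes, the link capacity `MAX_LINKS`, and all function names are hypothetical, and only directly stored links are queried (derived relationships need further algorithms).

```c
#include <string.h>

#define MAX_PERSONS 50000
#define MAX_LINKS   200000   /* assumed capacity, not from the notes */

/* Hypothetical codes for the stored primary relationships. */
enum relation { REL_NONE = 0, FATHER_OF, MOTHER_OF, SPOUSE_OF };

struct person { char name[32]; };
struct link   { int i, j; enum relation type; };   /* "person i is <type> person j" */

static struct person persons[MAX_PERSONS];   /* table as an array of records */
static struct link   links[MAX_LINKS];       /* graph as an array of records */
static int person_count = 0, link_count = 0;

/* Adds a person and returns the new index, or -1 if the table is full. */
int add_person(const char *name)
{
    if (person_count >= MAX_PERSONS)
        return -1;
    /* Static storage is zero-filled, so the name stays NUL-terminated. */
    strncpy(persons[person_count].name, name, sizeof persons[0].name - 1);
    return person_count++;
}

/* Stores one relationship record; returns its index or -1 if full. */
int add_link(int i, int j, enum relation type)
{
    if (link_count >= MAX_LINKS)
        return -1;
    links[link_count].i = i;
    links[link_count].j = j;
    links[link_count].type = type;
    return link_count++;
}

/* Query i, direct case only: how is person i related to person j? */
enum relation direct_relation(int i, int j)
{
    for (int k = 0; k < link_count; k++)
        if (links[k].i == i && links[k].j == j)
            return links[k].type;
    return REL_NONE;
}

/* Query ii, direct case only: number of stored links involving person i. */
int count_direct_links(int i)
{
    int n = 0;
    for (int k = 0; k < link_count; k++)
        if (links[k].i == i || links[k].j == i)
            n++;
    return n;
}
```

Compared with a 50000 x 50000 matrix, this layout stores only the relationships that exist, at the price of a linear scan per query.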

11. 16.8.05
   1. Genealogical chart: algorithmically computing non-primary family relationships from the primary family relationships of father, mother and spouse. Options:
      - Use program logic, or
      - Use program logic + relationship definitions stored as structured data.
   2. Designing coding schemes: encode useful information, just a serial number, or a mix of the two.
   3. Designing codes for members of the genealogical tree.
   4. Algorithms for query processing over the stored genealogical database.
   5. Storage of truth tables.
   6. Individual Assignment: Propose a storage scheme for any 5-variable Boolean function. WAP to compute the output for user-supplied Boolean values of the 5 variables as per the stored function. Write another program without storing the Boolean function as data. Evaluate the two options.

12. 18.8.05
   1. Storage scheme for any 5-variable Boolean function (truth table): different options. The 5 input variables and the output variable are all Boolean.
      a. Store the binary output for all possible combinations:
         i. Boolean TruthTable[2][2][2][2][2]
         ii. Boolean TruthTable[32]
      b. Store, in an array, only those combinations of input variables for which the output is 1. The input variable combinations can be encoded as char strings, bit strings, or decimal numbers.
         Char TruthTable[32][5]
         Byte TruthTable[32]
         Char TruthTable[32]
   2. Performance evaluation of these options in terms of space and time.
   3. Assignment [2 marks]: Generate more options. Evaluate these options in terms of time and space. Write a different function for realizing the truth table's function with each of these storage options. Test the scalability of your design with functions of 4000 hexadecimal variables.
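Two of the storage options above can be compared side by side in C. The sketch below assumes option a.ii is packed as one bit per row in a 32-bit word, and option b is a byte array holding the row numbers whose output is 1. The function `sample_fn` is a made-up example function, and all other names are illustrative as well.

```c
#include <stdint.h>

/* Packs five Boolean inputs into a row number 0..31 (a is the top bit). */
static unsigned row_of(int a, int b, int c, int d, int e)
{
    return (unsigned)(!!a << 4 | !!b << 3 | !!c << 2 | !!d << 1 | !!e);
}

/* Hypothetical example function; any 5-input Boolean function would do. */
int sample_fn(int a, int b, int c, int d, int e)
{
    return (a && b) || (c != d && e);
}

/* Option a.ii: output of every row, one bit per row. */
uint32_t build_full_table(int (*f)(int, int, int, int, int))
{
    uint32_t table = 0;
    for (unsigned r = 0; r < 32; r++)
        if (f(r >> 4 & 1, r >> 3 & 1, r >> 2 & 1, r >> 1 & 1, r & 1))
            table |= (uint32_t)1 << r;
    return table;
}

/* Constant-time lookup in the full table. */
int eval_full(uint32_t table, int a, int b, int c, int d, int e)
{
    return (int)(table >> row_of(a, b, c, d, e) & 1u);
}

/* Option b: only the rows whose output is 1, encoded as numbers. */
struct true_rows { unsigned char row[32]; int count; };

struct true_rows build_true_rows(uint32_t table)
{
    struct true_rows t = { {0}, 0 };
    for (unsigned r = 0; r < 32; r++)
        if (table >> r & 1u)
            t.row[t.count++] = (unsigned char)r;
    return t;
}

/* Linear search: output is 1 exactly when the row is in the list. */
int eval_true_rows(const struct true_rows *t, int a, int b, int c, int d, int e)
{
    unsigned r = row_of(a, b, c, d, e);
    for (int k = 0; k < t->count; k++)
        if (t->row[k] == r)
            return 1;
    return 0;
}
```

The full table answers in constant time with fixed space, while the list of true rows takes space proportional to the number of 1-outputs and needs a search; that is the space/time tradeoff the evaluation in item 2 asks about.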

13. 20.8.05
   1. More options for truth table storage: store, in a linked list, only those combinations of input variables for which the output is 1.
   2. Assignment [2 marks]: WAP to allow the user to enter names, store the names in a linked list, and display the name list in forward as well as reverse order. Use a linked list to store the truth table.
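A possible starting point for the name-list assignment is sketched below. It assumes a singly linked list with names appended at the tail; the reverse listing uses recursion so the list itself is not changed. To keep the sketch easy to check, names are collected into a caller-supplied text buffer instead of being printed. All identifiers are illustrative.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct name_node {
    char name[32];
    struct name_node *next;
};

/* Appends a name at the tail and returns the (possibly new) head. */
struct name_node *append_name(struct name_node *head, const char *name)
{
    struct name_node *node = malloc(sizeof *node);
    if (node == NULL)
        return head;                       /* out of memory: list unchanged */
    snprintf(node->name, sizeof node->name, "%s", name);
    node->next = NULL;
    if (head == NULL)
        return node;
    struct name_node *tail = head;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = node;
    return head;
}

/* Adds "s " to the end of buf if there is room. */
static void append_text(char *buf, size_t size, const char *s)
{
    size_t used = strlen(buf);
    if (used + 1 < size)
        snprintf(buf + used, size - used, "%s ", s);
}

/* Forward order: visit each node, then move on. */
void list_forward(const struct name_node *n, char *buf, size_t size)
{
    for (; n != NULL; n = n->next)
        append_text(buf, size, n->name);
}

/* Reverse order: recurse to the end first, emit names on the way back. */
void list_reverse(const struct name_node *n, char *buf, size_t size)
{
    if (n == NULL)
        return;
    list_reverse(n->next, buf, size);
    append_text(buf, size, n->name);
}

/* Releases every node of the list. */
void free_list(struct name_node *head)
{
    while (head != NULL) {
        struct name_node *next = head->next;
        free(head);
        head = next;
    }
}
```

The caller starts with an empty string in the buffer, calls `list_forward` or `list_reverse`, and prints the buffer.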

14. 23.08.05
   1. Vivek’s non-recursive solution for list traversal without using a significant number of temporary variables or changing the structure of the list from singly linked to doubly linked. The estimated number of nodes to be traversed is O(n^2).
   2. Tanu’s recursive solutions for linked list traversal: forward traversal, backward traversal.
   3. Count the number of nodes traversed in the recursive solution.
   4. Tabular analysis of control flow, lifetime of variables, and variable visibility in recursive algorithms:
      i. Number all executable statements in the source code (including the last ‘}’, indicating the return or end of function, of the functions and also of main) as 1, 2, 3, and so on.
      ii. Each call of the recursive function expected to be made at run time is numbered i, ii, iii, and so on.
      iii. Hence each run-time statement is numbered i.1, i.2, ..., ii.1, ii.2, ... and so on.
      iv. Key variables (parameters and local variables declared within the recursive function) such as varname1 are also labeled i.varname1, ii.varname1, and so on.
      v. Create a control flow analysis table:
         v.i. The first column has the estimated run-time statement number of the current statement (e.g. i.1, i.2), simulating the logic of control flow.
         v.ii. The second column has the list of key live variables (i.varname1, ii.varname1) and their respective estimated values after executing the current statement. Underline the visible variables, since earlier versions of variables, though live (available in memory), become invisible during later calls of the function. On return from the jth call of the function, all variables labeled j.varnamex become dead and are no longer available in memory.
         v.iii. The third column has the estimated run-time statement number of the next statement to be executed after the current statement.
   5. Assignment: Analyze all recursive programs written by you so far with the help of this tabular analysis technique, as discussed in class.
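The O(n^2) estimate in item 1 can be checked with a small sketch. The notes do not record Vivek's actual code, so the following is only one plausible way, assumed here, to list a singly linked list backwards without recursion, without extra storage proportional to n, and without relinking: each pass walks from the head to the node just before the last one already visited. The names are illustrative.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Copies the values into out[] in last-to-first order (out must have room
   for every node) and stores the number of values in *count.
   Returns the total number of node visits made along the way. */
long reverse_collect(const struct node *head, int out[], int *count)
{
    long steps = 0;
    const struct node *stop = NULL;   /* first node that is already done */
    *count = 0;
    while (stop != head) {
        const struct node *p = head;
        steps++;                       /* visiting the head */
        while (p->next != stop) {      /* walk up to the node before stop */
            p = p->next;
            steps++;
        }
        out[(*count)++] = p->value;
        stop = p;
    }
    return steps;
}
```

For a list of n nodes the passes visit n, n-1, ..., 1 nodes, i.e. n(n+1)/2 visits in total, which is where the quadratic estimate comes from. The recursive backward traversal visits each node once but keeps n calls live on the stack.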

15. 25.08.05
   1. More discussion on recursion and its analysis.
   2. Assignment [2 marks]: Write and analyze (tabular analysis) recursive programs for:
      a. Finding the sum of the elements of an array.
      b. Finding the count of the nodes in a linked list.
      c. Finding the maximum number and its index in an array.
      d. Finding the maximum number in a linked list.
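The four recursive programs in this assignment all follow the same pattern: solve the problem for a smaller array or for the rest of the list, then combine that result with one more element. A minimal sketch, with illustrative names and the assumption that `max_index` and `max_in_list` are only called on non-empty inputs:

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* a. Sum of the first n elements: last element plus sum of the rest. */
int sum_array(const int a[], int n)
{
    return n == 0 ? 0 : a[n - 1] + sum_array(a, n - 1);
}

/* b. Number of nodes: one for this node plus the count of the rest. */
int count_nodes(const struct node *p)
{
    return p == NULL ? 0 : 1 + count_nodes(p->next);
}

/* c. Index of the maximum among the first n elements (n >= 1).
   Ties keep the earliest index; the maximum itself is a[result]. */
int max_index(const int a[], int n)
{
    if (n == 1)
        return 0;
    int best = max_index(a, n - 1);
    return a[n - 1] > a[best] ? n - 1 : best;
}

/* d. Maximum value in a non-empty list. */
int max_in_list(const struct node *p)
{
    if (p->next == NULL)
        return p->value;
    int rest = max_in_list(p->next);
    return p->value > rest ? p->value : rest;
}
```

For the tabular analysis, each of these has a single recursive call per invocation, so the call numbering i, ii, iii, ... is a simple chain whose depth equals the number of elements.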

16. 30.08.05
   1. More discussion on recursion and its tabular analysis.
   2. Comparison of recursive and non-recursive solutions: a priori performance estimation in terms of computation time and memory requirement.
   3. Analysis of call-return statements: control transfer, memory demand/release.
   4. Identification of data structures for a given application:
      A design team has conceived the following initial specifications of a search engine for a large company’s internal Digital Library: Only specially authorized users can upload new documents or new versions of old document. All employees can look at the documents. Information systems department will create, update and maintain a list of keywords for faster search facility. The search engine users can also search by entering any word through the keyboard. Searched documents are to be listed as follows:
      Case A, Faster search on a listed keyword: As per the frequency of occurrence of the word, i.e. the documents having higher “density” of the chosen keyword will be listed before the documents having lower density, where
         density[k, d] = (occurrence count of the word k in d) / (word count in d)
      Case B, Search by entering a word through the keyboard: As per the frequency of usage of a document, where usage is defined as the number of times a document is opened by users through the search engine.
   5. Assignment (group of three, 4 marks for each student): Design the DS and a program for the above application.
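The density measure used in Case A can be computed directly from a document's text. The sketch below makes several assumptions that the specification leaves open: a "word" is a maximal run of letters and digits, matching is case-insensitive, and an empty document has density 0. The function names are illustrative.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison of text[0..len-1] with keyword. */
static int same_word(const char *start, size_t len, const char *keyword)
{
    if (strlen(keyword) != len)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (tolower((unsigned char)start[i]) != tolower((unsigned char)keyword[i]))
            return 0;
    return 1;
}

/* density[k, d] = (occurrences of keyword k in d) / (word count of d). */
double keyword_density(const char *text, const char *keyword)
{
    long words = 0, hits = 0;
    const char *p = text;
    while (*p != '\0') {
        /* Skip separators, then scan one word. */
        while (*p != '\0' && !isalnum((unsigned char)*p))
            p++;
        const char *start = p;
        while (*p != '\0' && isalnum((unsigned char)*p))
            p++;
        if (p > start) {
            words++;
            if (same_word(start, (size_t)(p - start), keyword))
                hits++;
        }
    }
    return words > 0 ? (double)hits / (double)words : 0.0;
}
```

For example, the text "Data structures store data." has 4 words and 2 occurrences of "data", giving a density of 0.5. In a full design the densities for listed keywords would probably be precomputed when a document is uploaded, since Case A is meant to be the faster search.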

17. 01.09.05
   1. Review of genealogical database storage.
   2. Array of persons, array of relationship codes, matrix of interpersonal relationships.
   3. Queries for which the algorithms were discussed:
      i. What is the relationship between x and y?
      ii. Who all are related to x?
      iii. Who all are related to x, and how?
      iv. Who all are related to x as well as y?
      v. Who all are related to x or y?
      vi. Who is at the root of the family tree?
      vii. Find the tree position (e.g. 1 (root), 11, 12, 111, 112, 1111, …) for all persons in the person array by processing the relationship matrix.
      viii. Draw the family tree using the stored tree positions of all persons in the database, as computed by the above query.
   4. Assignment: WAP for processing all the above queries. Also WAP for listing the members of the last generation of a family. Generate more queries.

18. 03.09.05
   1. Review of genealogical database storage:
      person_i [relationship code for person_1, relationship code for person_2, ..., relationship code for person_N]
   2. The matrix of interpersonal relationships consumes a lot of space and hence is not sufficiently scalable. What is the solution if N is large?
   3. Several options:
      a. Store only non-null relationships in a limited, same-sized array for each person:
         person_i [(person_j, relationship code), (person_k, relationship code), …]; store only n elements in this array, n << N.
   4. Algorithm design and analysis (time and space) for some of the queries discussed in the last class, with this modified data structure:
      i. What is the relationship between x and y?
      ii. Who all are related to x?
      iii. Who all are related to x, and how?
      iv. Who all are related to x as well as y?
      v. Who all are related to x, y, and z?
   5. Assignment (group of three, 1 mark for each student): WAP for processing all the above queries with the modified DS for the family database. Generate more queries.
   6. Other possible applications of the relationship matrix (relationships between homogeneous objects) as a data storage structure:
      i. Inter-city railway/bus/air route availability
      ii. Inter-city fares
      iii. Inter-chemical affinities
      iv. Inter-city railway/bus/air time table
      v. Inter-state border sharing
      vi. Inter-currency exchange rates
      vii. Inter-team tournament schedule
      viii. Inter-country business
   7. Assignment: Generate problems for two of the above or other similar applications.
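The modified structure from item 3 (a fixed-size list of non-null relationships per person, with n << N) might look as follows in C. This is a sketch under assumptions: the limit `MAX_RELATIVES`, the convention that relationship codes are non-zero, and all identifiers are illustrative.

```c
#define MAX_RELATIVES 16   /* n: assumed per-person limit, n << N */

struct relative   { int person; int code; };   /* (person_j, relationship code) */
struct person_rec { struct relative rel[MAX_RELATIVES]; int count; };

/* Records that y is related to x with the given non-zero code.
   Returns 1 on success, 0 if x's array is already full. */
int add_relative(struct person_rec db[], int x, int y, int code)
{
    struct person_rec *p = &db[x];
    if (p->count >= MAX_RELATIVES)
        return 0;
    p->rel[p->count].person = y;
    p->rel[p->count].code = code;
    p->count++;
    return 1;
}

/* Query i: relationship code stored for (x, y), or 0 if none. */
int relation_code(const struct person_rec db[], int x, int y)
{
    for (int k = 0; k < db[x].count; k++)
        if (db[x].rel[k].person == y)
            return db[x].rel[k].code;
    return 0;
}

/* Query iv: persons related to x as well as y.
   Writes at most max indices to out[] and returns how many were written. */
int common_relatives(const struct person_rec db[], int x, int y,
                     int out[], int max)
{
    int found = 0;
    for (int k = 0; k < db[x].count && found < max; k++) {
        int candidate = db[x].rel[k].person;
        if (relation_code(db, y, candidate) != 0)
            out[found++] = candidate;
    }
    return found;
}
```

Space drops from N x N codes to N x n pairs. Query i now costs up to n comparisons instead of one array access, and query iv costs up to n x n, which is the time/space tradeoff item 4 asks to analyze.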

19. 06.09.05
   1. Review of inter-homogeneous-object relationship matrix storage and query algorithms.
   2. More contexts for the relationship matrix.
   3. Inter-city distance matrix: example queries:
      i. What is the distance between x and y?
      ii. Which cities are directly connected to x?
      iii. Which cities are directly connected to x as well as y?
      iv. Which cities are directly connected to x, y, and z?
   4. Design algorithms for a new query: find the city (k) that connects two directly unconnected cities (i and j) with the minimum total distance between i and j, if they can be connected through one in-between city.
   5. Use of temporary buffers for designing algorithms. Simple buffer: array (for storing partially processed data).
   6. Limitations of an array of arrays of structures: limited nodes, limited scalability.
   7. Consider an array of linked lists of structures for enhancing scalability: the linked list for each person can have a variable length.
   8. Redesign of the algorithms for all queries with the further modified DS.
   9. Key tradeoff issues: scalability, memory size and execution time.
   10. Consider the option of a linked list of linked lists of structures. Compare this DS with the earlier two structures.
   11. Assignment: Generate three more problems for more contexts of inter-homogeneous-object relationships and propose their storage and algorithmic solutions.
   12. Are there more alternative storage possibilities? What if pointers are not available or are not to be used for some reason? What if the objects are not homogeneous but belong to two different categories?
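The new query in item 4 (find the in-between city k for two directly unconnected cities i and j) reduces to a single scan over one row and one column of the distance matrix. A minimal sketch, assuming a small fixed number of cities and that a matrix entry of 0 means "not directly connected"; `CITIES`, `NO_ROAD` and the function name are illustrative.

```c
#define CITIES  4
#define NO_ROAD 0   /* assumed marker for "not directly connected" */

/* Returns the city k that minimizes dist[i][k] + dist[k][j], storing that
   sum in *total, or -1 if i and j are the same city, are already directly
   connected, or share no in-between city (*total is then left unchanged). */
int best_intermediate(int dist[CITIES][CITIES], int i, int j, int *total)
{
    int best_k = -1;
    if (i == j || dist[i][j] != NO_ROAD)
        return -1;
    for (int k = 0; k < CITIES; k++) {
        if (k == i || k == j)
            continue;
        if (dist[i][k] == NO_ROAD || dist[k][j] == NO_ROAD)
            continue;
        int sum = dist[i][k] + dist[k][j];
        if (best_k == -1 || sum < *total) {
            best_k = k;
            *total = sum;
        }
    }
    return best_k;
}
```

With the matrix representation the scan costs N steps. With the per-city list representations discussed in items 6 to 10, the same query becomes an intersection of two lists.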

20. 13.09.05
   1. Using the data structures defined below, the database has been populated with the data shown in TABLE 1 (the info/index1/index2 listing that follows Question 2).

      struct data
      {
          char info;
          int index1;
          int index2;
      };
      struct data database[12];

      What_is_it (int index)
      {
          if index <> -1
          {
              What_is_it (database[index].index2);
              What_is_it (database[index].index1);
              Output database[index].info;
          }
      }

      Analyze the above recursive function using the tabular analysis technique and draw the recursion analysis table. Also show the output if the “What_is_it” function is initially called with index = 0. Recursive algorithms carry a risk of hidden infinite recursion for some specific cases of data, as in the above case.

   2. A database stores information about hyperlinks across Websites. There are 10 Websites, namely A, B, C, … J, each having links to others as shown in TABLE 2. A ‘y’ indicates the presence of a hyperlink. For example, there is a hyperlink from Website ‘A’ to Website ‘E’ and not vice versa.

      Rows and columns: A B C D E F G H I J
      A: y y
      B: y y y
      C: y y y
      D: y y
      E: y y
      F: y y
      G: y y y
      H: y y
      I: y y
      J: y y

      TABLE 2

      QUERY: If it is possible to move from the xth Website to the yth Website with up to two intermediate Websites in between them, then display “link exists between xth and yth Websites” and also give the sequence of names of the intermediate Websites that are visited while moving from the xth to the yth Website; otherwise display “No links Possible”.

      Propose the data structures and an appropriate algorithm for processing the above query.

      Also demonstrate the working of your algorithm by simulating the key steps of the algorithm for two cases, e.g. A E C B.

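      TABLE 1 (data for Question 1; rows listed in array order, database[0] to database[11])

      index   info   index1   index2
      0       Y      1        11
      1       X      7        10
      2       A      6        -1
      3       M      8        -1
      4       F      -1       9
      5       N      4        -1
      6       S      -1       -1
      7       C      3        5
      8       G      -1       -1
      9       E      -1       -1
      10      W      0        2
      11      D      -1       -1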
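The What_is_it function of Question 1 can be tried out in C with the TABLE 1 data. The sketch below assumes the rows of TABLE 1 are database[0] to database[11] in the order listed. Because the data leads the recursion back to index 0, a depth budget is added here as a safety guard; that parameter, the output buffer and the return value are additions for illustration and are not part of the original function.

```c
#include <stddef.h>
#include <string.h>

struct data {
    char info;
    int index1;
    int index2;
};

/* TABLE 1, assuming the rows are listed in array order. */
static const struct data database[12] = {
    {'Y', 1, 11}, {'X', 7, 10}, {'A', 6, -1}, {'M', 8, -1},
    {'F', -1, 9}, {'N', 4, -1}, {'S', -1, -1}, {'C', 3, 5},
    {'G', -1, -1}, {'E', -1, -1}, {'W', 0, 2}, {'D', -1, -1}
};

/* Same visiting order as What_is_it: index2 branch, index1 branch, then
   output info. Output characters are appended to the string in out.
   Returns 1 if the traversal finished, 0 if the depth budget ran out. */
int what_is_it(int index, int depth_left, char *out, size_t size)
{
    if (index == -1)
        return 1;
    if (depth_left == 0)
        return 0;                      /* guard against endless recursion */
    if (!what_is_it(database[index].index2, depth_left - 1, out, size))
        return 0;
    if (!what_is_it(database[index].index1, depth_left - 1, out, size))
        return 0;
    size_t len = strlen(out);
    if (len + 1 < size) {
        out[len] = database[index].info;
        out[len + 1] = '\0';
    }
    return 1;
}
```

Starting from index 0 with a budget of 10 nested calls, the function emits D, S, A and then re-enters index 0 through the entry for W (index1 = 0), so the same letters repeat until the budget is exhausted and 0 is returned. Without the guard the call never returns normally, which is the hidden infinite recursion the question warns about.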

      Options:
      a. Neha’s approach: three nested for loops over a 2D matrix data structure. Limitations: scalability of space and logic.
      b. Hina’s approach: nested for loops over an array of [(source, destination), ……]. How many nested loops are required? What about space scalability? What about logic scalability?
      c. Recursive approach (multiple recursive calls within each call, depending upon the number of linked websites). Use (increment and also decrement) a global counter to stop processing after 2 steps.
      d. Consider using a temporary buffer.
      Assignment [4 marks]: Write, test and evaluate programs for both problems with all the mentioned approaches.
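Option d, the temporary buffer, can be sketched as follows for the Website query. The buffer is assumed here to be a FIFO queue of sites still to be expanded, and each site remembers the site it was reached from so that the intermediate names can be reported. The adjacency matrix contents, `SITES`, and `find_link` are illustrative; the function assumes the source and destination are different sites.

```c
#define SITES 10

/* Looks for a route from src to dst with at most two intermediate sites.
   On success writes the intermediates to mid[] in visiting order and
   returns how many there are (0, 1 or 2); returns -1 if no such route. */
int find_link(int adj[SITES][SITES], int src, int dst, int mid[2])
{
    int queue[SITES];        /* the temporary buffer (FIFO) */
    int parent[SITES];       /* -2 = not seen yet */
    int depth[SITES];        /* number of links followed from src */
    int head = 0, tail = 0;

    for (int i = 0; i < SITES; i++)
        parent[i] = -2;
    parent[src] = -1;
    depth[src] = 0;
    queue[tail++] = src;

    while (head < tail) {
        int cur = queue[head++];
        if (depth[cur] == 3)
            continue;        /* a fourth link would need a third intermediate */
        for (int next = 0; next < SITES; next++) {
            if (!adj[cur][next] || parent[next] != -2)
                continue;
            parent[next] = cur;
            depth[next] = depth[cur] + 1;
            if (next == dst) {
                int count = depth[next] - 1;
                int k = count;
                /* Walk back towards src, filling mid[] from the end. */
                for (int p = parent[next]; p != src; p = parent[p])
                    mid[--k] = p;
                return count;
            }
            queue[tail++] = next;
        }
    }
    return -1;
}
```

Unlike the nested-loop approaches, the same code works unchanged if the permitted number of intermediate sites is raised; only the depth limit (and the size of `mid`) changes. Because every site enters the buffer at most once, cycles among the links cannot cause endless processing.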

21. 15.09.05
   1. Design a data storage scheme for storing polynomial functions and series of numbers. Design an algorithm to test if the given series is a Taylor series approximation (at x) for a given polynomial function and x0.
      The Taylor series approximation of a given function f about x0 is as follows:
         f(x) ≈ f(x0) + f'(x0)(x − x0) + f''(x0)(x − x0)^2/2! + … + f^(n)(x0)(x − x0)^n/n!
   2. Design a dynamic data structure for storing a randomly ordered collection of single-variable polynomial functions and another, static data structure for storing a randomly ordered collection of real-number sequences. The records in both collections should also have additional provision for storing the indices of all the matching entries (if any) in the other collection. One entry in either collection may match with none, one or many entries in the other. A number sequence is declared as matching a polynomial if all the numbers in the sequence match the corresponding terms of the Taylor series expansion of the function for the given x and x0 within the limits of a user-defined ‘permitted-mismatch’. Design an algorithm for updating the matching indices in both collections for a given user-defined input of ‘permitted-mismatch’, x and x0.

3. Assignment [2 marks] : Write programs for above problems. 22. 17.09.05 1. Abhinav’s recursive solution to earlier linked website data base problem. 2. Basic Approach : Recurse-path (Source, destination) { ….. For I=0 to 9 { …… If (xxxxxx) Recurse-path (new source, destination) …… } } 3. Create a tree from the inter homogeneous matrix, with source as the root. Add nodes as children till permitted

depth is not reached or destination is not found. Be Careful : about infinite depth. However, Tree is not a good model for such matrix. View the inter homogeneous matrix as a graph : Adjacency matrix.

4. Design of DS for Graph as a connected collection (non linear linked list) of node interconnected through pointers. 5. Design Algorithm for converting Adjacency matrix into linked form of Graph. 6. Buffer based approach for solving earlier linked website data base problem. Use a simple Buffer for storing seeds.

Start with supplied seed (initial source) and identify new seed and store them in a buffer. Pick next seed from the buffer and use this seed to identify new unidentified and unprocessed seeds. Put these new seeds into the buffer and continue till the objective is not met or possibility of meeting the objective is exhausted. This buffer can be in the form of unordered set, or chronologically ordered either as FIFO Queue or LIFO stack.

7. Assignment [2 marks] : WAP for converting an adjacency matrix into the linked form of a Graph. Implement the linked-website database problem using both the recursive and the buffer-based (simple unordered array) approaches.

23. 20.09.05
1. Pick one of the following, or any other similar popular family game, and propose a design for a software version:
   - Snakes and Ladders
   - Ludo
   - Chess
   - Any cards-based game
   - Make-a-word
   - Rubik's Cube
   You can design your software on the model of ‘play through computer’.
2. Case study : Snakes and Ladders, Ludo.
3. Assignment [3 marks for each student] : WAP for your chosen game in the class in a group of three (with the exception of two groups). Those who were absent are not eligible for this assignment. Absentees get a zero.

24. 22.09.05
1. Case study : Text-based user interface for Snakes and Ladders by Ankush.
2. Case study : Data Structure for Chess by Saurabh.
3. Analysis and design of the Data Structure, UI and Algorithm for a Chess game without a graphical interface. Enable play through the computer, not against the computer: check the legality of the move proposed by a player and swap the turns of the players.

4. Define the rules for legality:
   - Check legality of delta_x, delta_y
   - Check availability of the new destination position
   - Check path clearance
5. Assignment (group of three) [3 extra marks for Chess or 1 extra mark for Ludo] : This is an extension of the last assignment and is not open to those who were absent in this as well as the last class. Absentees get a zero.

25. 27.09.05
1. Rat in the maze : Problem definition, data structures for modeling the problem variables (maze, start position, destination position, current position), basic solution approach (exhaustive search), and constraints. Design a detailed solution.
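One possible shape of the exhaustive search, as a minimal sketch: it assumes a 4 x 4 maze of square cells stored as a 2D array (0 = open, 1 = wall) and a second array that remembers which cells were already tried. The names `solve_maze`, `ROWS`, `COLS` and `visited` are illustrative assumptions.

```c
#include <stdbool.h>

#define ROWS 4
#define COLS 4

/* maze[r][c]: 0 = open cell, 1 = wall. visited marks cells that were
   already tried, so that the rat never walks in circles. */
bool solve_maze(int maze[ROWS][COLS], bool visited[ROWS][COLS],
                int r, int c, int dest_r, int dest_c)
{
    if (r < 0 || r >= ROWS || c < 0 || c >= COLS)
        return false;                    /* outside the maze */
    if (maze[r][c] == 1 || visited[r][c])
        return false;                    /* wall, or already tried */
    if (r == dest_r && c == dest_c)
        return true;                     /* destination reached */

    visited[r][c] = true;
    /* Try the four neighbours: down, right, up, left. */
    return solve_maze(maze, visited, r + 1, c, dest_r, dest_c) ||
           solve_maze(maze, visited, r, c + 1, dest_r, dest_c) ||
           solve_maze(maze, visited, r - 1, c, dest_r, dest_c) ||
           solve_maze(maze, visited, r, c - 1, dest_r, dest_c);
}
```

The current position is carried in the arguments `r` and `c`; the constraints (maze boundary, walls, no revisiting) are the three checks at the top of the function.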

2. Project demo by Siddhartha Batra (3rd year) : Issues in the design of a graphical user interface for a Turbo C program.

26. 27.09.05
1. Problem Solving Process:
   i. Mark the nouns (future data items) and verbs (future functions).
   ii. Create a schematic representation of the world of the problem.
      a. Nouns as data tanks (rectangular boxes) and verbs as related activity pumps/filters (elliptical boxes).
      b. Single-lined boxes for a single occurrence of a data item, double-lined boxes for multiple occurrences of similarly typed homogeneous data items.
      c. Create a Concept Map as a network of these boxes.
   iii. Identify the basic solution approach, i.e. by searching, by comparing, and so on.
   iv. Manually solve the problem.
   v. Test the manual solution with examples.
   vi. Think about your thinking.
   vii. Identify your mental constructs and cognitive processes in solving the problems.
   viii. Represent the details of your thinking process in terms of very simple and stupid atomic operations.
   ix. Simulate the natural memory based constructs with computer memory based constructs, i.e. structured data.
   x. Simulate the cognitive processes with computational (mathematical and logical) processes.
2. Discussion on the problem of Rat in the Maze and its variations:
   a. 2D maze with square cells
   b. Network of roads and squares
   c. 3D maze with cubic cells
   d. 2D maze with hexagonal cells
   e. PCB routing
   f. VLSI routing
   g. Graph traversal, and so on
3. Discussion on Tower of Hanoi.

4. Assignments (eligibility : all students of B2-B5, B7-B9 and only those students of B1 and B6 who marked their attendance in the class) :
   a. (single student, 3 marks) WAP for solving the 2D and 3D Rat in the Maze problem with square and cubic cells (a textual interface will be sufficient).
   b. (group of three, 1 mark for each student) WAP for a 2D maze with hexagonal cells.
5. Optional assignment in lieu of the above assignment (eligibility : out of the above students, only those who love to be challenged and are committed to excellence) (group of three, 5 marks each; students who do this assignment will be exempted from the above assignment, however if you choose to do both of these assignments then the maximum is limited to 7 marks per student): In a new single storey building, the floors in the rooms and corridors are tiled with thick hexagonal tiles with grooves between adjacent tiles. These grooves are to be used for laying several types of cables and pipes. A maximum of three cables/pipes may pass through any groove. However, more are allowed to pass through groove intersections. There is no constraint on cable/pipe size between two groove intersection points. Design the data structures for specifying the building for cabling purposes and propose a machine-readable format for specifying the cabling/piping requirements in terms of end points. Also propose an output format for computer-generated textual cabling/piping specifications in terms of the complete paths (if possible) for each cabling/piping requirement. Design additional data structures and an algorithm to process the data and generate the output in the desired format. WAP based on your design for a given set of cabling/piping requirements as per your format.

27. 01.10.05

1. Project demo and case study by Siddhartha Batra (3rd year) 2. Project demo and case study by Saaransh Bagga (3rd year)

28. 04.10.05

1. Discussion of a recursive algorithm for a modified version of Tower of Hanoi problem using 5 poles and n disks i.e. n disks have to be shifted from source pole to destination pole using 3 extra poles.

2. Divide and Conquer approach to algorithm design: identify the terminating condition (the one which will not require further division of the problem into similar smaller problems).

3. Assignment (eligibility : all students of B2-B7, and only those students of B1, B8, and B9 who marked their attendance in the class) (group of two students, 3 marks each) : Write the algorithm for a modified version of the Tower of Hanoi problem using a user-defined number n of poles (where n >= 3) and m disks, i.e. m disks have to be shifted from the source pole to the destination pole using n-2 extra poles. Use a recursive approach. You can design a simple textual interface instead of using sophisticated graphics. Complete a tabular analysis of your recursive algorithm for n = 7. Validate your analysis with a single-step run.

4. Recursion Tree : Every call creates a child; every return is represented by a movement to the parent node. Factorial has one child per node, Fibonacci has two children per node, Rat in the Maze has up to 4 children per node.

5. Memory requirement can be assessed by the depth of the recursion tree.
6. Processing time requirement can be assessed by the number of nodes in the recursion tree.
7. Call is one of the most expensive instructions on most CPUs.
8. Recursion should be avoided where possible.
9. Converting recursive programs into non-recursive programs is possible through the use of iteration.
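The two measures of a recursion tree (number of nodes and depth) can be observed directly by instrumenting a recursive function. The sketch below does this for Fibonacci; the `rec_stats` structure and the extra `depth` parameter are additions made only for this illustration.

```c
/* Statistics gathered while the recursion runs. */
struct rec_stats {
    long nodes;      /* number of calls = nodes in the recursion tree */
    int  max_depth;  /* deepest level reached = depth of the tree     */
};

/* Plain recursive Fibonacci, instrumented to measure its recursion tree.
   depth is the level of the current call; the root call passes 1. */
long fib(int n, int depth, struct rec_stats *st)
{
    st->nodes++;
    if (depth > st->max_depth)
        st->max_depth = depth;
    if (n < 2)
        return n;                     /* terminating condition */
    return fib(n - 1, depth + 1, st) + fib(n - 2, depth + 1, st);
}
```

Calling `fib(10, 1, &st)` with a zeroed `st` returns 55 after 177 calls, while the depth never exceeds 10: the processing time grows much faster than the memory needed for the call stack.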

10. Converting tail recursion into iteration is simple, e.g. factorial, forward print of a linked list, and so on.
11. What about other types of recursion, e.g. backward printing of a linked list, Rat in the Maze, Tower of Hanoi, and so on?

29-30. 06.10.05
1. Converting a recursive program into an iterative program for cases other than tail recursion.
2. Backward-order print of a linked list using iteration.
3. Options : processing intensive or memory intensive.
4. Processing-intensive approach (suggested by Vivek earlier) : a nested for loop is required, which results in n + (n-1) + (n-2) + … + 2 + 1 = n(n+1)/2 element accesses, i.e. O(n^2). This is very inefficient for large lists: processing time t is proportional to n^2, i.e. when the data size doubles, the processing time quadruples.

5. Memory-intensive approach : change the data structure of the input linked list to a doubly linked list, then run one forward loop followed by one backward loop. Processing requires n forward accesses followed by n-1 backward accesses; processing time t is proportional to n, i.e. O(n). However, changing the input data is not possible in many situations.

6. Memory-intensive approach without changing the data structure of the input (suggested by Anuraj) : use an array of pointers (buffer) to the nodes in the list. Populate the array during one forward pass over the list and print the node contents in a reverse-order loop over this array. Processing requires n forward accesses over the linked list followed by n backward accesses over the array; processing time t is proportional to n, i.e. O(n).
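A minimal sketch of the pointer-buffer approach follows. To keep the result easy to check, the function writes the values into an output array instead of printing them; that choice, the fixed bound `MAX_NODES` and the name `backward_values` are assumptions made for this illustration.

```c
#include <stddef.h>

#define MAX_NODES 100   /* assumed upper bound on list length */

struct node {
    int          data;
    struct node *next;
};

/* Copies the list contents into out[] in backward order and returns the
   number of values written, or -1 if the list does not fit. */
int backward_values(struct node *head, int out[], int capacity)
{
    struct node *buffer[MAX_NODES];   /* array of pointers to the nodes */
    int n = 0;

    /* One forward pass over the list: n accesses. */
    for (struct node *p = head; p != NULL; p = p->next) {
        if (n == MAX_NODES || n == capacity)
            return -1;
        buffer[n++] = p;
    }
    /* One reverse-order loop over the array: n accesses. */
    for (int i = n - 1; i >= 0; i--)
        out[n - 1 - i] = buffer[i]->data;   /* a program would print here */
    return n;
}
```

For a list holding 10, 20, 30 the output array receives 30, 20, 10, and the input list itself is left untouched.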

7. This buffer is being used as a LIFO, hence it can be replaced by a structured stack which does not allow direct access to arbitrary array elements.

8. A stack should support the Push, Pop, Top, Is_stack_empty and Is_stack_full operations.
9. A stack can be implemented using an array or a linked list.
10. An array-based stack can be implemented with two approaches: Fixed Top and Moving Top.
11. The array-based Moving Top version provides O(1) for all stack operations. (Indirect addressing)
12. The array-based Fixed Top version provides O(n) for push and pop. (Direct addressing)
13. An array-based stack has size limitations.
14. A linked-list-based stack can be implemented in two variations: new element as the last element of the linked list or new element as the first element of the linked list. (Linked addressing)
15. New element as the last element: push, pop and top are all O(n).
16. New element as the first element: push, pop and top are all O(1). Hence this is preferred.
17. All recursive programs can be converted into iterative programs with the help of a stack.
18. Assignment (5 marks) : Write stack-based iterative programs for backward printing of a linked list and also for the 2D, 3D and hexagonal Rat in the Maze problems. Use an array-based stack in the first case and a linked-list-based stack in the latter problems.
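The array-based Moving Top stack from the list above could be sketched as follows. The capacity `STACK_SIZE`, the function names and the convention of returning `false` on a failed pre-condition are assumptions of this sketch, not a prescribed interface.

```c
#include <stdbool.h>

#define STACK_SIZE 100   /* assumed capacity */

struct stack {
    int items[STACK_SIZE];
    int top;             /* index of the top element; -1 when empty */
};

void stack_init(struct stack *s)           { s->top = -1; }
bool is_stack_empty(const struct stack *s) { return s->top == -1; }
bool is_stack_full(const struct stack *s)  { return s->top == STACK_SIZE - 1; }

/* Each operation returns false when its pre-condition is violated. */
bool push(struct stack *s, int value)
{
    if (is_stack_full(s)) return false;
    s->items[++s->top] = value;      /* move top, then store: O(1) */
    return true;
}

bool pop(struct stack *s, int *value)
{
    if (is_stack_empty(s)) return false;
    *value = s->items[s->top--];     /* read, then move top back: O(1) */
    return true;
}

bool stack_top(const struct stack *s, int *value)
{
    if (is_stack_empty(s)) return false;
    *value = s->items[s->top];
    return true;
}
```

No element is ever shifted, which is why every operation is O(1); the price is the fixed size limit checked by `is_stack_full`.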

19. Infix, postfix and prefix expressions.
20. Using a stack for evaluation of postfix expressions.
21. Problem : The crack team at Sikand’s Car Wash requires precisely four minutes to wash a car. A car arrives at Sikand’s on average every four minutes. In a typical 10-hour day, management wants to know how long a car waits between the time it arrives and the time it gets washed. Design a computer simulator for this analysis.

22. Identify the objects, activities, and events.
23. Objects : cars (new arrivals, in the waiting queue, and at the wash station).
24. Activities : waiting and movement in the queue, car wash.
25. Events : car arrival, wash start, wash complete.

26. Some approximations (simplifications) : any event occurs in a minute; only one car can arrive in a minute.
27. Simulate the day : a for loop of 600 minutes. The probability of a car arriving in any minute is ¼. Generate a random number from 1 to 4 and associate any one of these four numbers with the car arrival event.

30. 18.10.05
1. Contd. Problem : The crack team at Sikand’s Car Wash requires precisely four minutes to wash a car. A car arrives at Sikand’s on average every four minutes. In a typical 10-hour day, management wants to know how long a car waits between the time it arrives and the time it gets washed. Design a computer simulator for this analysis.

2. Why a for loop of 600 minutes? Why not fewer or more iterations? This is discretisation of a continuous signal. Hence, use the Nyquist criterion for deciding the sampling rate for the simulation of time.

3. Each iteration of the loop represents one time period as per the sampling rate. All activities of the real system that take place during that time period need to be performed (simulated by appropriate functions) in each iteration. In the real world some activities naturally go on simultaneously; on a sequential computer, however, we simulate parallel activities by sequencing the functions (one for each activity) in an appropriate order.
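The minute-by-minute loop might be sketched as below, under the stated simplifications (one loop iteration per minute, at most one arrival per minute, a 4-minute wash). It uses a busy flag, a wash timer and a circular array queue of arrival times. All names (`simulate`, `random_arrivals`, `wash_result`) and the split into a separate arrival array are assumptions made so that the logic can be checked with a fixed arrival pattern.

```c
#include <stdlib.h>

#define DAY_MINUTES 600   /* 10-hour day, one loop iteration per minute */
#define WASH_MINUTES 4

struct wash_result {
    int  cars_arrived;
    int  cars_started;    /* cars whose wash has started */
    long total_wait;      /* sum of (wash start - arrival) in minutes */
};

/* Marks each minute as an arrival with probability 1/4. */
void random_arrivals(int arrives[], int minutes)
{
    for (int m = 0; m < minutes; m++)
        arrives[m] = (rand() % 4 + 1 == 1);  /* random number 1-4; 1 = arrival */
}

/* Runs the simulation for a given arrival pattern (minutes <= DAY_MINUTES). */
struct wash_result simulate(const int arrives[], int minutes)
{
    int queue[DAY_MINUTES];      /* arrival times of the waiting cars */
    int front = 0, count = 0;    /* circular queue state */
    int busy = 0;                /* flag: is the wash station busy? */
    int timer = 0;               /* minutes left for the car being washed */
    struct wash_result r = {0, 0, 0};

    for (int m = 0; m < minutes; m++) {
        /* 1. Arrival check: a new car joins the rear of the queue. */
        if (arrives[m]) {
            queue[(front + count) % DAY_MINUTES] = m;
            count++;
            r.cars_arrived++;
        }
        /* 2. Wash station: update the timer of the car being washed. */
        if (busy && --timer == 0)
            busy = 0;
        /* 3. Queue: move the front car to a free wash station. */
        if (!busy && count > 0) {
            r.total_wait += m - queue[front];
            front = (front + 1) % DAY_MINUTES;
            count--;
            r.cars_started++;
            busy = 1;
            timer = WASH_MINUTES;
        }
    }
    return r;
}
```

The three numbered steps inside the loop are the "parallel" real-world activities placed in sequence. The average wait is `total_wait / cars_started`; calling `srand` with a fixed seed before `random_arrivals` makes a run repeatable.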

4. Real-world activities in this case : arrival_check function, car queue, and wash_station.
5. Wash station status is simulated by a flag that indicates whether the wash station is busy or unused.
6. A local timer (like that of a washing machine) is set for each car when it arrives at the wash station. This timer needs to be updated during each iteration.
7. Each car is simulated by a record. Each arriving car is assigned a unique ID that does not change during the entire simulation. Identify what else we need to store in this record to achieve the goal of our simulation.
8. A queue should support the Enqueue, Dequeue, Front, Rear, Is_queue_empty and Is_queue_full operations.
9. A queue can be implemented using an array or a linked list.
10. A fixed-size array-based queue can be implemented with two approaches:
   i. Fixed Front and moving Rear.
   ii. Moving Front and moving Rear, using the array as a circular list, i.e. the element next to the (n-1)th element of the array is the zeroth element.
   iii. The Fixed Front and moving Rear version provides O(1) for the Enqueue operation but O(n) for the Dequeue operation. Flags can ensure O(1) for Is_queue_empty and Is_queue_full.
   iv. The Moving Front and moving Rear version provides O(1) for both the Enqueue and Dequeue operations. Flags can ensure O(1) for Is_queue_empty and Is_queue_full.
11. A linked-list-based queue can be implemented in two variations, both with front and rear pointers: new element as the last element of the linked list or new element as the first element of the linked list. (Linked addressing)
   i. New element as the last element: all operations are O(1).
   ii. New element as the first element: Enqueue is O(1). Dequeue then removes the last node, which is O(1) only if the node before it can be reached directly (e.g. in a doubly linked list); in a singly linked list, finding the new last node takes O(n).
12. Another similar problem : Design a simulator for a petrol pump offering the facilities of petrol, diesel, CNG and car wash. Also give a diagrammatic representation for this simulator software. Customer arrival rates are as follows:
   5 am – 6 am : 30 customers per hr.
   6 am – 7 am : 50 customers per hr.
   7 am – 10 am : 200 customers per hr.
   10 am – 4 pm : 100 customers per hr.
   4 pm – 8 pm : 200 customers per hr.
   8 pm – 10 pm : 50 customers per hr.
   10 pm – 5 am : 15 customers per hr.
   There are separate queues for petrol, diesel, CNG and car wash. There are three petrol counters, two diesel counters, two CNG counters and one car-wash counter. 50% of the fuel seekers purchase petrol, 25% purchase diesel and the rest purchase CNG. During peak hrs., only fuel buyers are entertained. During 10 am to 6 pm, 10% of customers come for car wash and the rest for fuel. Customers who first buy fuel do not go for car wash. However, 70% also buy fuel after car wash.
13. Identify the objects, activities, and events.
14. Assignment : Group work (three students, 3 + 7 marks for each student) : Write programs for both these simulators. Use an array-based queue in the first and a linked-list-based queue in the second.

31. 20.10.05
1. More discussion on the simulator.
2. Draw a schematic for the problem.
3. Frame clear and precise queries to be answered by the simulator, e.g. What is the average wait time? What is the maximum wait time? How many cars have to wait for more than time T? And so on.
4. Design the simulator to answer these queries.
5. Three options:
   a. The simulator continuously tracks status variables that hold the answers to some fixed queries.
   b. The simulator produces a log table, which is used for query processing.
   c. Mixed approach: some statistics are kept regularly updated, while other queries are answered by processing the log table.
6. Data structures for CAR (id, arrival time) and log table (car id, arrival time, departure time), and statistical status variables for:
   a. Average wait time so far,
   b. Max wait time so far,
   c. Number of cars with more than T wait time, and so on.
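The statistical status variables can be kept up to date with one small update per car, as in this sketch. The structure `wait_stats` and the functions `record_wait` and `average_wait` are hypothetical names; the average is derived from a running total rather than stored directly.

```c
struct wait_stats {
    long cars;          /* cars whose wait is known so far */
    long total_wait;    /* sum of wait times, used for the average */
    int  max_wait;      /* maximum wait time so far */
    long over_limit;    /* cars that waited more than T */
};

/* Call once per car, at the moment its wash starts. */
void record_wait(struct wait_stats *s, int arrival_time, int start_time, int T)
{
    int wait = start_time - arrival_time;
    s->cars++;
    s->total_wait += wait;
    if (wait > s->max_wait) s->max_wait = wait;
    if (wait > T)           s->over_limit++;
}

/* Average wait time so far; 0 when no car has been recorded yet. */
double average_wait(const struct wait_stats *s)
{
    return s->cars ? (double)s->total_wait / s->cars : 0.0;
}
```

This corresponds to the first option (status variables tracked continuously); a log table would instead store one row per car and compute the same figures afterwards.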

7. In the second problem, we have multiple servers, multiple decision points and multiple exit conditions.
8. Architectural options:
   a. Dedicated queue for each server (as suggested by Ishaan),
   b. Single queue shared by similar servers (as suggested by Tushar),
   c. Single queue shared by all kinds of servers (what is the likely problem with this model?).
9. We need multiple queues and multiple random numbers, one at each decision point.

32. 22.10.05
1. Simulators contd. What about the service time if it is not constant?
2. Problem : Design a DS and write programs to insert, delete, modify and inquire about records in the lists of contributors for JYC activities (with their enrollment number, address, hobbies, past record of extracurricular activities, awards, previously held positions, current responsibilities, personally known established hobbyists/professionals and so on) under the four broad divisions of JYC, i.e. literary, cultural, technical, and sports.
3. Develop a deeper understanding of the system by creating a behavioral model (User action → System response).
4. Convert open-ended questions (like Inquire) into a set of clear and precise questions. Formulate a list of queries and rank them in order of likely frequency. Try to evolve the DS and algorithm such that the most frequent queries are the most efficient.
5. Nested structures.
6. Assignment : Group work (three students, 7 marks for each student) : Write a program for the JYC system.

33. 29.10.05

1. Assignment : Individual work (5 marks for each student) : A new C-like programming language, Csmall (a truncated version of C), has a limit of 64 elements on array length. Csmall also does not support pointers and dynamic data structures. Using the features of Csmall, propose the design details for creating four stacks as well as five queues of 512 integers each, such that all the standard stack and queue operations have a time complexity of O(1). Based on your design, write the functions for the standard stack and queue operations. Explain your design with the help of diagrams and examples.

Further, write an algorithm to merge all elements of all these stacks and queues into a single file by randomly interleaving the elements of the different sources, such that a reconstruct algorithm can recover the pre-merger distribution of data. The merge algorithm should progressively empty the stacks and queues by removing one element at a time from a randomly chosen stack or queue and appending that element to the single file. The reconstruct algorithm should read such a file and re-populate the stacks and queues. The reconstructed stacks and queues should be exactly the same as the pre-merger stacks and queues.

2. Element addressing schemes for the elements of a multi-dimensional array:
   i. Row major order (elements are stored row by row).
   ii. Column major order (elements are stored column by column).
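For a two-dimensional array with 0-based indices, the two addressing schemes reduce to the formulas sketched below. The function names are illustrative; the offsets are counted in elements from the start of the array.

```c
#include <stddef.h>

/* Offset of element (i, j) when the array is stored row by row.
   cols is the number of columns in each row. */
size_t row_major_offset(size_t i, size_t j, size_t cols)
{
    return i * cols + j;      /* skip i complete rows, then j elements */
}

/* Offset of element (i, j) when the array is stored column by column.
   rows is the number of rows in each column. */
size_t col_major_offset(size_t i, size_t j, size_t rows)
{
    return j * rows + i;      /* skip j complete columns, then i elements */
}
```

For a 3 x 4 array, element (2, 1) sits at offset 9 in row major order and at offset 5 in column major order. C itself stores its built-in multi-dimensional arrays in row major order.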

3. Design
   - Regard engineering design practice as a process of "story telling".
   - Design is story telling.
   - Elements of a design story:
      - Plot / Theme
      - Characters with defined roles (anything that has a stimulus-response behaviour is a character; identify the stimulus-response pairs for your characters)
      - Events
      - Sequence
      - Time-based events/objects
      - Interactive events/objects
      - Characters take actions
      - Actions have consequences
      - Consequences are actions by other characters.
   Stories → Design Concept → Prototype → ……… → Product
4. Some tips on story writing:
   - Experience lots of stories.
   - Start with the whole and move to the parts: present the big picture within a whole-system global context and connect to local initiatives.
5. Assignment : Group work (three students, 30 marks for each student) :
   A. (23 marks) Create a design story for any kind of computer application. Progressively give more details, create a concept map, identify data tanks, identify data structures and functions, write software for this design, and prepare a test plan, test cases and a test report.
   B. (7 marks) Review the design document of another group and prepare a test plan and test report for that project. Do not mutually exchange the projects.

34. 5.11.05
1. Design a Concept Map for a given design story or application scenario. Some guidelines and hints on the construction of Concept Maps follow.
2. This concept map will provide a bird's-eye view of a collection of interacting and collaborating data tanks and data items.
3. This concept map will be a diagram of data tanks inter-connected via processing units, with labeled boxes and arrows. Use MS-Excel for your drafting.
4. Identify the nouns and verbs. Some nouns will become data tanks in the Concept Map (CM). Verbs will become processing boxes in the CM.
5. The nouns will be singular as well as plural.
6. Use double-line boxes for data tanks containing several homogeneous data items and single-line boxes for single data items/packets, if any.
7. Give examples of the data inside each data tank.
8. Put a small circle on the top right corner of a box if it represents dynamic data, i.e. the data can change as a result of valid operations.
9. Put another circle on the top left corner if the data population size can change during processing because of insertions and deletions. This dynamic data is not to be confused with a dynamic data structure, as this higher level of dynamism can be implemented with dynamic or static data structures at the lower layer.
10. Use oval boxes for processing units.
11. Your concept map should be hierarchical, i.e. it should gradually show more details in different diagrams rather than showing all the details in one diagram. Initially focus on the most critical aspects.
12. Draw three dotted horizontal lines to divide each data tank box into four sub-boxes. Put the name of the data tank (a plural noun) in the top (first) sub-box and give some examples of representative data items in the same sub-box.
13. Write the attributes (fields) in the second sub-box. Put first, and underline, the attribute(s) that are required to have unique values, e.g. ID No.
14. Identify all the operations that are required to be performed on this data tank during the lifetime of the given application. Write these operations in the third sub-box.
15. Divide the fourth sub-box into two parts, left and right.
16. See whether the data tank is required to maintain some order on the elements or not.
17. If no order is required, leave the fourth left and fourth right sub-boxes empty.
18. Check the type of order: is it ordered chronologically, i.e. based on the time of insertion of records, or ordered on attribute value(s)? Also check whether you need ascending or descending order. If ordered on value, identify the attribute(s) that control the order of records.
19. Put ‘T’ in the top left corner of the fourth left sub-box if the tank is ordered on time. Put an upward arrow for ascending order and a downward arrow for descending order.
20. Put ‘V’ and the attribute name(s), followed by the appropriate arrow, if the tank is ordered on attribute value.
21. If the data tank is an ordered collection of Xs, see how the relative position of a specific data item is defined with respect to other similar data items.
22. Some possible positional arrangements of ordered data tanks:
   a. X (Linear)
   b. X (Non linear: Descending Tree)
   c. X (Non linear: Ascending Tree)
   d. X (Non linear: Graph)

23. Place one of the above ordering structure symbols (22.a to 22.d) in the fourth left sub-box.
24. Examine relative positional eligibility for Retrieval (access) : all / only some strategic relative positions.
25. Examine relative positional eligibility for Manipulation : all / none / only some strategic relative positions.
26. Examine relative positional eligibility for Insertion : any empty slot / only some strategic relative positions.
27. Examine relative positional eligibility for Deletion : all / none / only some strategic positions.
28. If retrieval (access), manipulation, insertion and deletion are dependent on some well-defined strategic relative position within the data tank, then observe, identify and define these positions.
29. Some examples of strategic relative positions:
   - Based on order of insertion: earliest, latest, after the latest insertion, before the earliest insertion, 3rd earliest, 4th latest, next relative to the current position as per insertion order, previous relative to the current position as per insertion order, and so on.
   - Based on the value: minimum, maximum, 3rd minimum, in between a given range of values, in between an appropriate range of values.
30. Mention these strategic positions for each position in the fourth right sub-box.
31. Indicate what data moves or changes in any data tank, and when and how.
   a. Example: An analyst has proposed the following requirements for a linear ADT for a given application:
      - Ordered by: Time of insertion.
      - Insertion at: after the latest insertion
      - Deletion at: 3rd earliest, if available; else 2nd earliest, if available; else earliest
      - Access at: Anywhere at 1st to 5th latest and also at earliest insertions
      - Manipulations at: Anywhere at 1st to 10th latest and also at earliest and 2nd earliest insertions.
32. Also mark the data tanks as Input only, Output only, Input and Output, or Buffer.

35. 8.11.05
1. Review of CM design.
2. Hierarchical CM : show all the DTs and processing boxes in one sheet.
3. Use a separate sheet for showing the details of each tank.
4. Processing elements are like the pumps, boilers, mixers, or filters in a chemical plant. Data moves between tanks through processors.
5. Create enough retrieval operations to check the correctness of insertion, manipulation and deletion operations, like a speedometer or a fuel gauge.
6. Validate the design specifications of abstracted DTs from the design stories of the 2003 and 2004 batches.

Legend:
   Topology : Linear, Non Linear
   Order by : None, Value, Time of insertion (TOI), Value+Time (VT)
   Positions : Anywhere (aw), Earliest Insertion (EI), After the last insertion (ALI), Before Last Insertion (BLI), After Last Low Priority Insertion (ALLPI), Before Last High Priority Insertion (BLHPI), Last Low Priority Insertion (LLPI), Last High Priority Insertion (LHPI)

Data tank types (templates) needed for modelling student designs (2003 batch as well as 2004 batch)

Sno.  Topology    Order By      Insertion at  Deletion at     Manipulation at  Retrieval at  Name
1     Linear      Value         ap            aw              aw               aw
2     Linear      TOI           ALI           aw              aw               None          Semi-Open Queue
3     Linear      Value         ap            aw              aw               None
4     Linear      None          aw            aw              aw               aw
5     Linear      None          None          None            aw               aw
6     Linear      TOI           ALI           None            aw               None
7     Linear      TOI           ALI           EI              LI               None          Queue with semi-open back
8     Linear      TOI           ALI           aw              aw               aw            Open Queue
9     Linear      Value         None          None            aw               None
10    Linear      VT            ap            None            aw               None
11    Linear      None          aw            aw              aw               aw
12    Linear      None          aw            aw              aw               None
13    Linear      Value         ap            aw              aw               aw
14    Linear      Value         ap            None            aw               aw
15    Linear      TOI           None          None            aw               None
16    Non Linear  VT            ap            sublist's last  aw               None
17    Linear      TOI           ALI           None            aw               aw
18    Linear      None          aw            None            aw               None

Additional types (templates) required by the 2004 batch

19    Linear      None          None          None            aw               None
20    Linear      VT            ap+aw         aw              aw               aw
21    Linear      VT            ALI           aw              aw               aw
22    Linear      TOI           BLI           EI              aw               None
23    Linear      TOI           None          None            aw               aw
24    Non Linear  VT            ap            aw              aw               None
25    Linear      Value         ap            None            aw               None

Additional important types (templates) not needed for the problems conceived by students (2003 and 2004 batches)

26    Linear      TOI           ALI           LI              LI               ?             Stack
27    Linear      TOI           ALI           EI              LI, EI           ?             Queue
28    Linear      TOI           ALI           LI, EI          LI, EI           ?             Shelf
29    Linear      TOI+Priority  ALLPI, BLHPI  LLPI            LLPI, LHPI       ?             Scroll/Roll
30    Linear      TOI+Priority  ALLPI, BLHPI  LHPI            LLPI, LHPI       ?             Scroll/Roll
31    Linear      TOI+Priority  ALLPI, BLHPI  LLPI, LHPI      LLPI, LHPI       ?             Deque

7. Validate and review the design specifications for a linear ADT for a given application:
   • Ordered by: Time of insertion.
   • Insertion at: after the latest insertion
   • Deletion at: 3rd earliest, if available; else 2nd earliest, if available; else earliest
   • Access (retrieval) at: Anywhere at 1st to 5th latest and also at earliest insertions
   • Manipulations at: Anywhere at 1st to 10th latest and also at earliest and 2nd earliest insertions.
9. Pre-condition : Operation-specific checklist to be checked by the user program (or function) before calling a given operation on the given DT.
10. Post-condition : Operation-specific checklist to be guaranteed by the DT. Ideally the user program or function should be able to check the pre- and post-conditions through retrieval operations.

36. 17.11.05
1. Data tank implementation strategies : ease of creation vs. ease of retrieval.
2. Data storage options : amorphous or structured.
3. Amorphous storage (collapsed structure) makes insertion of data very easy but retrieval of required data is very inefficient.
   • No predictable relative ordering of the records within the data tank and no predictable relative ordering of the fields within a record.
4. Structured data storage requires more discipline and effort at the time of insertion of data so that retrieval becomes more efficient.
5. Several primary data types and data structuring facilities are offered by programming languages.
6. All application-specific data for any data tank needs to be represented and stored in terms of language-specified primary data types, using structuring facilities, for future usage and processing.
7. Data of one data tank can be stored in primary memory, secondary memory, or a mix of both.
8. For storing structured data, the addresses of individual records within a data tank can be realised through the following alternative addressing mechanisms:
   • Formula based (direct addressing)
   • Linked list
   • Indirect addressing (using a directly addressable index)
   • Simulated pointer
9. All data tanks can be realised through any of these mechanisms: amorphous storage or any of the four addressing mechanisms.
10. Completely amorphous : unordered collection of variable-length strings for records with a variable number of fields.
11. Semi-amorphous : unordered collection of variable-length strings for records with a fixed number of fields.
12. Formula based : ordered collection of fixed-length records without any missing keys, i.e. the record number can be a function of the key attribute(s), e.g. roll_no. Example: storage of a lower triangular matrix, a diagonal matrix, an upper triangular matrix or a tri-diagonal matrix in a single-dimensional array.
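As an illustration of formula-based addressing, the sketch below stores a lower triangular matrix row by row in a one-dimensional array and computes the position of element (i, j) from its indices. The order `ORDER`, the 0-based indexing and the names `lt_index` and `lt_get` are assumptions of this example.

```c
#define ORDER 4   /* assumed matrix order (4 x 4) */

/* Position of element (i, j), with i >= j, of a lower triangular matrix
   stored row by row in a one-dimensional array (0-based indices).
   Rows 0..i-1 hold 1 + 2 + ... + i = i(i+1)/2 elements in total. */
int lt_index(int i, int j)
{
    return i * (i + 1) / 2 + j;
}

/* Reads element (i, j); entries above the diagonal are zero and not stored. */
double lt_get(const double packed[ORDER * (ORDER + 1) / 2], int i, int j)
{
    return (j > i) ? 0.0 : packed[lt_index(i, j)];
}
```

A 4 x 4 lower triangular matrix therefore needs only 10 stored values instead of 16, and no search is required to locate any of them.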

13. Linked addressing : addresses are stored in the preceding records, i.e. linked list, doubly linked list, pointer-based tree, pointer-based graph, adjacency list, and so on.

14. Indirect addressing : use an array-based index structure on each field on which efficient retrieval is needed (there may be several such fields). Use binary search over the index array. Store the index in a separate file and make it RAM resident at the run time of the application. Examples: thesaurus, book, railway timetable, and so on.
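Indirect addressing through a sorted index array can be sketched as follows. Each index entry pairs a key with the record number of the corresponding record in the main data; the names `index_entry` and `index_lookup` and the use of an integer key are assumptions made for this illustration.

```c
struct index_entry {
    int key;       /* value of the indexed field, e.g. roll_no */
    int rec_no;    /* position of the record in the main data file/array */
};

/* Binary search over an index sorted by key.
   Returns the record number, or -1 if the key is absent. */
int index_lookup(const struct index_entry index[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (index[mid].key == key)
            return index[mid].rec_no;
        if (index[mid].key < key)
            low = mid + 1;       /* key can only be in the upper half */
        else
            high = mid - 1;      /* key can only be in the lower half */
    }
    return -1;
}
```

The main records can stay in any order (for example, order of arrival); only the small index has to be kept sorted.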

15. History of the evolution of the book's structure.

37. 18.11.05
1. Indirect addressing (indexed storage).
2. An index can be implemented as a sorted array or as a BST. Each node will point to the location of the concerned record in the main database.
3. Linear search in an unsorted array vs. linear search in a sorted array vs. binary search in a sorted array vs. BST.
4. Assignment : Clearly identify and report your query goals for your project.
5. Index structures can be RAM based or can also be on file (if the index is also large). Index structures are also often persistent and are stored in files. At run time the index (or part of it) is loaded into RAM. An index can be stored as a sorted list, a hashed list or, more often, as some kind of search tree. A BST is the simplest form of such a search tree.

6. Implementing an index for indexed storage : simple ordered array-based storage offers good search performance using binary search but makes insertion and deletion slow, as it requires expensive movement of index records. An ordered linear linked list gives flexibility of insertion and deletion, but search is slow as it requires linear search. We want both of the following:
   a. Good search time (similar to binary search over an ordered array), and
   b. Good insertion and deletion time (similar to a singly linked list).

7. Essentially use a special non-linear Linked structure rather than ordered array such that it also follows (at least tries a close approximation of) a search strategy like Binary search over an ordered array.

8. A Binary Search Tree (BST) is a special binary tree that facilitates such a search mechanism. It constrains the position of every node within the tree: all values in a node's left subtree are less than the node's value, and all values in its right subtree are greater.

9. BST search requires no computation at run time to find the next location to compare. Instead, the two candidate next locations are pointed to by the current location (just as each location in a linked list points to one next location).
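The contrast in the point above can be sketched as follows. This is an illustration only: the Node class, the function names and the seven sample keys are assumptions, and the tree is linked by hand rather than built by an insertion routine. The array search computes each probe position, while the BST search only follows the link stored in the current node.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def array_search(keys, target):
    """Binary search: the next probe position is computed from lo and hi."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2  # arithmetic needed at every step
        if keys[mid] == target:
            return True
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False

def bst_search(node, target):
    """BST search: the next probe is whichever child the current node points to."""
    while node is not None:
        if target == node.key:
            return True
        node = node.left if target < node.key else node.right
    return False

# The same seven keys as an ordered array and as a hand-linked, balanced BST.
KEYS = [10, 20, 30, 40, 50, 60, 70]
ROOT = Node(40, Node(20, Node(10), Node(30)), Node(60, Node(50), Node(70)))
```

On this balanced tree both searches visit the same keys in the same order (40, then 20 or 60, and so on); the difference lies only in how the next location is obtained.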

10. The algorithm for inserting a new node into a BST can be fast and very simple: a newly arriving node is always attached at the leaf level, so only the appropriate leaf has to be found within the current BST and the new node linked as its left or right child. This simple insertion may leave the tree unbalanced, and the resulting deeper tree requires more comparisons on average than binary search over an ordered array.
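A small sketch of the simple leaf-level insertion described above, and of how the arrival order of keys affects the depth of the tree. The names insert, height and build are hypothetical, and ignoring duplicate keys is an assumption made only for this example.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Attach key as a new leaf and return the root. Duplicates are ignored (assumption)."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key == cur.key:
            return root
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right

def height(node):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    best, stack = 0, [(node, 1)]
    while stack:
        cur, depth = stack.pop()
        if cur is None:
            continue
        best = max(best, depth)
        stack.append((cur.left, depth + 1))
        stack.append((cur.right, depth + 1))
    return best

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

mixed = build([40, 20, 60, 10, 30, 50, 70])   # keys arrive in a favourable order
skewed = build([10, 20, 30, 40, 50, 60, 70])  # keys arrive already sorted
print(height(mixed), height(skewed))
```

With the same seven keys, the favourable order gives a tree of height 3, while sorted arrival gives a right-skewed tree of height 7 that behaves like a linear linked list during search.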

11. It is possible to design more sophisticated insertion algorithms that keep the BST balanced after every insertion by reorganizing part of the tree. This helps minimize the average number of comparisons at search time.

12. More sophisticated search tree structures are used for storing indexes in real applications such as a DBMS. They essentially perform a multi-way search through multiway trees whose nodes carry many pointers (even 100 or more) rather than just two (left and right).
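A rough arithmetic sketch of why many pointers per node help. It assumes an idealized, perfectly balanced tree in which every level narrows the search by a factor equal to the number of pointers per node; real multiway index trees differ in detail. The function name levels_needed is hypothetical.

```python
def levels_needed(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, reach = 0, 1
    while reach < n_keys:
        reach *= fanout
        levels += 1
    return levels

# One million keys: two pointers per node versus one hundred pointers per node.
print(levels_needed(1_000_000, 2), levels_needed(1_000_000, 100))
```

Under this assumption a million keys need about 20 levels with two pointers per node but only 3 levels with 100 pointers per node, which matters most when each level visited costs one file access.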

13. Indexing facilitates faster retrieval. The key to improving retrieval time often lies in designing better index structures. Index structure design has been a very active CS research area for several decades, and it continues to produce new breakthroughs for specialized and newer applications.

14. Simulated pointer: simulating a pointer with the help of a record number.

• Example: a binary tree stored without using the pointer facility. Record numbers start from 0; -1 indicates null.
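The example can be sketched in code using the info / left child / right child table printed in these notes, with one record per row. Treating record 0 as the root is an assumption (it is the only record that no other record names as a child), and the function names are hypothetical. The inorder traversal uses an explicit stack and the level order traversal uses a queue to hold the nodes that still have to be visited.

```python
from collections import deque

NULL = -1  # record number that stands for "no child"

# (info, left child record no., right child record no.), record numbers 0..11
RECORDS = [
    ("Y", 1, 11), ("X", 7, 10), ("A", 6, -1), ("M", 8, -1),
    ("F", -1, 9), ("N", 4, -1), ("S", -1, -1), ("C", 3, 5),
    ("G", -1, -1), ("E", -1, -1), ("W", -1, 2), ("D", -1, -1),
]

def inorder(records, root=0):
    """Inorder traversal; a stack buffers the nodes to come back to."""
    out, stack, cur = [], [], root
    while stack or cur != NULL:
        while cur != NULL:
            stack.append(cur)
            cur = records[cur][1]   # follow the simulated left pointer
        cur = stack.pop()
        out.append(records[cur][0])
        cur = records[cur][2]       # follow the simulated right pointer
    return out

def level_order(records, root=0):
    """Level order traversal; a queue buffers the children still to be visited."""
    out, queue = [], deque([root])
    while queue:
        cur = queue.popleft()
        if cur == NULL:
            continue
        info, left, right = records[cur]
        out.append(info)
        queue.append(left)
        queue.append(right)
    return out

print("".join(inorder(RECORDS)))      # GMCFENXWSAYD
print("".join(level_order(RECORDS)))  # YXDCWMNAGFSE
```

Because the links are plain record numbers rather than memory addresses, the same table can be written to a file and read back without any change.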

38. 20.11.05

1. All data tanks can be implemented in more than one of the styles mentioned above.

2. Implementation of Stack, Queue, and Double Ended Queue (Deque) in amorphous, semi-amorphous, direct addressing, indirect addressing, linked, and simulated pointer based styles: time and space complexity comparison of the different implementation styles.

3. Graph storage: Linked, Adjacency matrix, and simulated pointer based. How about amorphous and semi amorphous storage of graph?
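As an illustration of two of the storage styles named above, the sketch below stores the same small undirected graph as an adjacency matrix and as per-vertex neighbour lists. The Python lists stand in for linked storage here, which is a simplification, and the function names and sample edges are hypothetical.

```python
def adjacency_matrix(n, edges):
    """n x n table of 0/1 entries; entry [u][v] is 1 when u and v are neighbours."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = matrix[v][u] = 1
    return matrix

def adjacency_list(n, edges):
    """One neighbour list per vertex; space grows with the number of edges."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def neighbours_matrix(matrix, u):
    """Scans a whole row: n steps regardless of how many neighbours u has."""
    return [v for v, bit in enumerate(matrix[u]) if bit]

def neighbours_list(adj, u):
    """Reads only the stored neighbours of u (sorted here for easy comparison)."""
    return sorted(adj[u])

EDGES = [(0, 1), (0, 2), (1, 2), (2, 3)]
MATRIX = adjacency_matrix(4, EDGES)
LISTS = adjacency_list(4, EDGES)
print(neighbours_matrix(MATRIX, 2), neighbours_list(LISTS, 2))
```

The matrix answers "are u and v neighbours?" with a single lookup but always occupies n x n entries; the lists use space proportional to the number of edges but need a scan of one list to answer the same question.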

4. Optional Assignment (Individual work, 5 marks Last date: 6.12.2005): Compare the performance of different graph storage techniques with respect to standard Graph operations and algorithms studied by you in Discrete Maths course.

5. Optional Assignment (Individual work, 5 marks Last date: 6.12.2005): Compare the performance of different tree storage techniques with respect to standard tree operations and algorithms studied by you in Discrete Maths course.

Record no.   Info   Left child   Right child
    0         Y         1            11
    1         X         7            10
    2         A         6            -1
    3         M         8            -1
    4         F        -1             9
    5         N         4            -1
    6         S        -1            -1
    7         C         3             5
    8         G        -1            -1
    9         E        -1            -1
   10         W        -1             2
   11         D        -1            -1

39. 22.11.05 (1.5 hrs.)

1. Storing Tree and Graph using amorphous style. Evaluation of amorphous storage in terms of the following operations:

a. Display Tree/Graph.

b. Find children and parent nodes (tree) / neighbour nodes (graph).

2. Completely full Binary Tree, Complete Binary Tree, Skewed Binary Tree.

3. n-ary tree, positional tree.

4. Tree traversal: inorder, preorder, postorder, level order.

5. Using a stack or queue for buffering the nodes that need to be revisited during tree or graph traversal.

6. Rat in the maze: graph traversal.

7. Direct addressing (formula based) storage of Binary Tree, Trinary tree. Performance evaluation in terms of the above mentioned operations.

8. Performance analysis and comparison of some sorting techniques: Bubble, Insertion, Selection, and Merge sort algorithms.

40. 24.11.05

1. BST: Purpose, Genesis, Search, Node Insertion, Performance Analysis.

2. BST implementation using direct addressing, linked addressing, and simulated pointers.

3. BST: Node deletion (Self Study).
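A minimal sketch of a BST kept in direct addressing (formula based) storage. It assumes 0-based slots in which the children of slot i are at 2i+1 and 2i+2; the notes may intend a different but equivalent formula. The function names are hypothetical, and an empty slot is marked with None.

```python
def insert(slots, key):
    """Place key in an array-backed BST; children of slot i are at 2i+1 and 2i+2."""
    i = 0
    while i < len(slots) and slots[i] is not None:
        if key == slots[i]:
            return  # duplicate keys are ignored (assumption)
        i = 2 * i + 1 if key < slots[i] else 2 * i + 2
    if i >= len(slots):
        slots.extend([None] * (i + 1 - len(slots)))  # grow the array up to slot i
    slots[i] = key

def search(slots, key):
    """Return the slot number holding key, or -1 if it is absent."""
    i = 0
    while i < len(slots) and slots[i] is not None:
        if key == slots[i]:
            return i
        i = 2 * i + 1 if key < slots[i] else 2 * i + 2
    return -1

balanced = []
for k in [40, 20, 60, 10, 30, 50, 70]:
    insert(balanced, k)

skewed = []
for k in [10, 20, 30, 40]:
    insert(skewed, k)

print(balanced)     # [40, 20, 60, 10, 30, 50, 70]
print(len(skewed))  # 15
```

No links are stored at all, so the balanced tree fills its 7 slots exactly, but the skewed tree needs 15 slots for only 4 keys. This space cost is one point to weigh when comparing direct addressing with linked and simulated pointer storage.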