Copyright © 2004-2014 Curt Hill

Other Trees

Applications of the Tree Structure

Expression trees
• An expression tree contains:
  – Operators as interior nodes
  – Values as leaves
• The shape of the expression tree captures the precedence
• Consider the following expression: 2+3*4
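
A minimal sketch of how the tree shape carries precedence, assuming a hypothetical `Node` class and a small operator table that are not part of the slides. Evaluating bottom-up means the deeper `*` node is computed before the `+` at the root.

```python
import operator
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical node type for illustration: interior nodes hold an operator
# symbol, leaves hold a number.
@dataclass
class Node:
    value: Union[str, int]
    left: Optional["Node"] = None
    right: Optional["Node"] = None

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Evaluate an expression tree bottom-up."""
    if node.left is None and node.right is None:
        return node.value  # a leaf is just a value
    return OPS[node.value](evaluate(node.left), evaluate(node.right))

# 2+3*4: the multiplication sits deeper in the tree, so it is done first.
tree = Node("+", Node(2), Node("*", Node(3), Node(4)))
# (2+3)*4 needs a differently shaped tree; no parentheses are stored.
other = Node("*", Node("+", Node(2), Node(3)), Node(4))

print(evaluate(tree))   # 14
print(evaluate(other))  # 20
```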

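Expression trees
[Figure: two expression trees.
2+3*4: root + with left child 2 and right child *; the * node has children 3 and 4.
3*4+2: root + with left child * (children 3 and 4) and right child 2.]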

Traversal
• The names come from the expression tree above
• There are six (3!) ways to traverse the tree, depending on the order of processing:
  – The node
  – The left subtree
  – The right subtree
• Inorder (left and right)
• Preorder
• Postorder
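
The six orders can be enumerated directly. This is a sketch under an assumed representation: trees are nested tuples `(value, left, right)`, and an order is a permutation of the letters N (node), L (left subtree), R (right subtree). None of these names come from the slides.

```python
from itertools import permutations

# Nested-tuple tree for 2+3*4; None marks an empty subtree.
TREE = ("+", (2, None, None), ("*", (3, None, None), (4, None, None)))

def traverse(tree, order):
    """Yield node values; `order` is a permutation of 'N', 'L', 'R'."""
    if tree is None:
        return
    value, left, right = tree
    for step in order:
        if step == "N":
            yield value
        elif step == "L":
            yield from traverse(left, order)
        else:
            yield from traverse(right, order)

def show(order):
    return " ".join(str(v) for v in traverse(TREE, order))

for order in permutations("NLR"):
    print("".join(order), "->", show(order))
```

In this naming, LNR is inorder, NLR is preorder, and LRN is postorder; the other three are their right-to-left mirrors.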

Inorder
• Follows the sorted order of the tree
• Visit lower (left) subtree
• Process node
• Visit upper (right) subtree
• The reverse (right subtree first) produces higher to lower
• Left to right: 2 + 3 * 4
• This gives standard algebraic notation
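
A short sketch of both directions, again assuming nested-tuple trees `(key, left, right)`; the helper `leaf` and the sample search tree are made up for illustration.

```python
def leaf(key):
    return (key, None, None)

def inorder(tree):
    """Left subtree, node, right subtree."""
    if tree is None:
        return []
    key, left, right = tree
    return inorder(left) + [key] + inorder(right)

def reverse_inorder(tree):
    """Right subtree, node, left subtree."""
    if tree is None:
        return []
    key, left, right = tree
    return reverse_inorder(right) + [key] + reverse_inorder(left)

# A small search tree and the expression tree for 2+3*4.
bst = (12, (6, leaf(2), None), (19, leaf(15), leaf(36)))
expr = ("+", leaf(2), ("*", leaf(3), leaf(4)))

print(inorder(bst))          # [2, 6, 12, 15, 19, 36]
print(reverse_inorder(bst))  # [36, 19, 15, 12, 6, 2]
print(inorder(expr))         # [2, '+', 3, '*', 4]
```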

Preorder
• Node first, then subtrees
• Process node
• Visit lower (left) subtree
• Visit upper (right) subtree
• Expression: + 2 * 3 4
• Remember this?
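
As a sketch, preorder output can be read back and evaluated without parentheses. The tuple representation and the two-operator evaluator below are assumptions for illustration only.

```python
def preorder(tree):
    """Node, left subtree, right subtree."""
    if tree is None:
        return []
    value, left, right = tree
    return [value] + preorder(left) + preorder(right)

def eval_prefix(tokens):
    """Evaluate a prefix token list that uses only + and *."""
    it = iter(tokens)

    def read():
        tok = next(it)
        if tok == "+":
            return read() + read()
        if tok == "*":
            return read() * read()
        return tok  # a number

    return read()

expr = ("+", (2, None, None), ("*", (3, None, None), (4, None, None)))
tokens = preorder(expr)
print(tokens)               # ['+', 2, '*', 3, 4]
print(eval_prefix(tokens))  # 14
```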

Postorder
• Subtrees first and then node
• Visit lower (left) subtree
• Visit upper (right) subtree
• Process node
• Expression: 2 3 4 * +
• Reverse Polish notation
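
Postorder output is convenient because a single stack is enough to evaluate it. A minimal sketch, assuming nested-tuple trees and only the + and * operators:

```python
def postorder(tree):
    """Left subtree, right subtree, node."""
    if tree is None:
        return []
    value, left, right = tree
    return postorder(left) + postorder(right) + [value]

def eval_rpn(tokens):
    """Evaluate Reverse Polish tokens with a stack."""
    stack = []
    for tok in tokens:
        if tok in ("+", "*"):
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if tok == "+" else left * right)
        else:
            stack.append(tok)  # operands wait on the stack
    return stack.pop()

expr = ("+", (2, None, None), ("*", (3, None, None), (4, None, None)))
rpn = postorder(expr)
print(rpn)            # [2, 3, 4, '*', '+']
print(eval_rpn(rpn))  # 14
```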

Parse Trees
• Expression trees are a small instance of parse trees
• A presentation on parse trees also exists

Balance and Search Times
• The time it takes to search a tree is based upon the path length to the desired node
• Assuming every key is equally likely to be searched for:
  – The average search length is the sum of the path lengths divided by the number of tree nodes
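
The definition translates almost directly into code. This sketch assumes nested-tuple trees `(key, left, right)` and counts the root as path length 1, matching the slides that follow; the two sample trees are made up.

```python
def depths(tree, depth=1):
    """Yield the path length of every node, counting the root as 1."""
    if tree is None:
        return
    _, left, right = tree
    yield depth
    yield from depths(left, depth + 1)
    yield from depths(right, depth + 1)

def average_search_length(tree):
    lengths = list(depths(tree))
    return sum(lengths) / len(lengths)

balanced = (2, (1, None, None), (3, None, None))
chain = (1, None, (2, None, (3, None, None)))

print(average_search_length(balanced))  # (1 + 2 + 2) / 3
print(average_search_length(chain))     # (1 + 2 + 3) / 3
```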

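Unbalanced Tree
[Figure: unbalanced binary search tree with 11 nodes on 6 levels.
Level 1: 12. Level 2: 6, 19. Level 3: 2, 15, 36. Level 4: 0, 4, 24. Level 5: 30. Level 6: 29.]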

Average Search Length
• 12 – 1
• 6 – 2
• 19 – 2
• 2 – 3
• 15 – 3
• 36 – 3
• 24 – 4
• 0 – 4
• 4 – 4
• 30 – 5
• 29 – 6
• Sum of 37 for 11 nodes gives an average search length of about 3.4 (37/11 ≈ 3.36)
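
The tally can be reproduced by building the tree and measuring it. In this sketch the insertion order is an assumption, chosen so that plain binary-search-tree insertion yields the path lengths listed above; the dict-based node layout is also made up for illustration.

```python
def insert(tree, key):
    """Insert into a BST of nested dicts; returns the (possibly new) root."""
    if tree is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < tree["key"] else "right"
    tree[side] = insert(tree[side], key)
    return tree

def path_lengths(tree, depth=1):
    """Map each key to its path length, counting the root as 1."""
    if tree is None:
        return {}
    result = {tree["key"]: depth}
    result.update(path_lengths(tree["left"], depth + 1))
    result.update(path_lengths(tree["right"], depth + 1))
    return result

root = None
for key in [12, 6, 19, 2, 15, 36, 0, 4, 24, 30, 29]:
    root = insert(root, key)

lengths = path_lengths(root)
total = sum(lengths.values())
print(total, len(lengths), round(total / len(lengths), 2))  # 37 11 3.36
```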

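Perfectly Balanced Tree
[Figure: perfectly balanced tree with 11 nodes on 4 levels.
Level 1: 36. Level 2: 4, 24. Level 3: 2, 12, 15, 28. Level 4: 0, 6, 19, 30.]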

Average Search Length
• 36 – 1
• 4 – 2
• 24 – 2
• 2 – 3
• 12 – 3
• 15 – 3
• 28 – 3
• 0 – 4
• 6 – 4
• 19 – 4
• 30 – 4
• Sum of 33 for 11 nodes gives an average search length of 3.0
• Balanced does perform better
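
One common way to get a perfectly balanced shape is to split a sorted key list at its median, recursively. This is a sketch of that technique, not necessarily how the slide's tree was built, and the resulting node positions need not match the figure; only the level counts (1, 2, 4, 4) matter for the sum.

```python
def build_balanced(keys):
    """Build a balanced BST (nested tuples) from sorted keys by median split."""
    if not keys:
        return None
    mid = len(keys) // 2
    return (keys[mid],
            build_balanced(keys[:mid]),
            build_balanced(keys[mid + 1:]))

def total_path_length(tree, depth=1):
    """Sum of path lengths, counting the root as 1."""
    if tree is None:
        return 0
    _, left, right = tree
    return (depth
            + total_path_length(left, depth + 1)
            + total_path_length(right, depth + 1))

keys = sorted([36, 4, 24, 2, 12, 15, 28, 0, 6, 19, 30])
tree = build_balanced(keys)
total = total_path_length(tree)
print(total, total / len(keys))  # 33 3.0
```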

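AVL Balanced Tree
[Figure: AVL-balanced tree with 11 nodes on 4 levels.
Level 1: 36. Level 2: 6, 19. Level 3: 2, 12, 15, 28. Level 4: 0, 4, 24, 30.]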

Average Search Length
• 36 – 1
• 6 – 2
• 19 – 2
• 2 – 3
• 12 – 3
• 15 – 3
• 28 – 3
• 0 – 4
• 4 – 4
• 24 – 4
• 30 – 4
• Sum of 33 for 11 nodes gives an average search length of 3.0
• Here the AVL-balanced tree has the same performance as the perfectly balanced one
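
AVL balance is usually defined as: at every node, the heights of the two subtrees differ by at most one. A minimal checker under that definition, assuming nested-tuple trees; the two sample trees are hypothetical and are not the tree from the figure.

```python
def height(tree):
    """Number of levels in the tree; an empty tree has height 0."""
    if tree is None:
        return 0
    _, left, right = tree
    return 1 + max(height(left), height(right))

def is_avl(tree):
    """True if every node's subtrees differ in height by at most one."""
    if tree is None:
        return True
    _, left, right = tree
    return (abs(height(left) - height(right)) <= 1
            and is_avl(left) and is_avl(right))

def leaf(key):
    return (key, None, None)

# Not perfectly balanced (levels are uneven) but still AVL balanced.
avl = (12, (6, leaf(2), None), (19, leaf(15), (36, leaf(24), None)))
# A chain fails at the root: heights 0 and 2.
chain = (1, None, (2, None, leaf(3)))

print(is_avl(avl), is_avl(chain))  # True False
```

Recomputing `height` at every node keeps the sketch short at the price of extra work; a real AVL implementation stores a height or balance factor in each node.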

Balanced is Best?
• The idea of balancing a tree is predicated on equal frequencies of keys
  – Reasonable assumption if there is no contrary information
  – However, if we have frequency information we can do better
• C++ keywords are not evenly distributed

Path Lengths
• The idea of balance is nice in general but…
• If we have a reasonable idea of the frequency of entries we can do better than perfectly balanced
• What we want to do is minimize the average path length
• With our previous knowledge we could make no assumptions concerning frequency
• Now we can generate a more precise formula

Average Path Length
• Sum of search length
• Where
  – n is the number of words
  – p is the path length of word i
  – f is the frequency of word i
• The average search length is:
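
One plausible reading of these definitions, stated here as an assumption: the sum of search lengths is the frequency-weighted sum of p_i · f_i over the n words, and the average divides that sum by the total frequency (the sum of all f_i). With equal frequencies this reduces to the plain average used earlier. A sketch with made-up numbers:

```python
def search_lengths(path_lengths, frequencies):
    """Return (weighted sum, weighted average) of the path lengths.

    Assumes the average is sum(p_i * f_i) / sum(f_i).
    """
    weighted_sum = sum(p * f for p, f in zip(path_lengths, frequencies))
    return weighted_sum, weighted_sum / sum(frequencies)

# Three hypothetical words with frequencies 1, 8, 1.
print(search_lengths([2, 1, 2], [1, 8, 1]))  # frequent word at the root
print(search_lengths([1, 2, 3], [1, 8, 1]))  # a chain with the frequent word in the middle
```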

Optimal Search Trees
• What we want are high-frequency words close to the root and low-frequency words at the leaves
• You might think that the most common word should be the root and its two children the second and third most common words
• It does not work that way, since we need to maintain the search order as well

Example
• For example, the word "the" is the most common word in English text
• The top n are:
  – the (20)
  – and (15)
  – of (13)
  – to (12)
  – you (7)
  – in (7)
  – a (6)
• Because the top two are at such extremes of the ordering, it may be better to have "of" as the root
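
The trade-off can be checked with the classic dynamic-programming search for a minimum weighted path length tree. This is a sketch, assuming cost = sum of f_i · p_i with the root at path length 1 and only successful searches; the function names are hypothetical. For these seven words alone it reports a total of 180 with "the" at the root and 186 with "of" at the root (averages of 2.25 and about 2.33 over a total frequency of 80), so "of" comes close but does not win here; a longer word list could shift the balance.

```python
from functools import lru_cache

# Frequencies from the slide, listed in alphabetical (search) order.
WORDS = ["a", "and", "in", "of", "the", "to", "you"]
FREQS = [6, 15, 7, 13, 20, 12, 7]

def make_solver(freqs):
    """Return (best, cost_with_root) for a weighted binary search tree."""
    prefix = [0]
    for f in freqs:
        prefix.append(prefix[-1] + f)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest tree over keys i..j inclusive, as (cost, root index).
        if i > j:
            return (0, None)
        # Hanging a subtree under a root pushes every key in it down a level,
        # which adds the total frequency of the range to the cost.
        weight = prefix[j + 1] - prefix[i]
        return min((best(i, r - 1)[0] + best(r + 1, j)[0] + weight, r)
                   for r in range(i, j + 1))

    def cost_with_root(r):
        # Cheapest tree over all keys when key r is forced to be the root.
        n = len(freqs)
        return best(0, r - 1)[0] + best(r + 1, n - 1)[0] + prefix[n]

    return best, cost_with_root

best, cost_with_root = make_solver(FREQS)
cost, root = best(0, len(FREQS) - 1)
print("optimal root:", WORDS[root], "cost:", cost)
for word in ("of", "the"):
    print(word, "as root:", cost_with_root(WORDS.index(word)))
```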

LISP Lists
• LISP is very old
  – Among high-level languages still in use, second only to FORTRAN
• Usually encountered in Programming Language or Artificial Intelligence classes
• It has a ubiquitous data structure called a list
• However, it is not a list in the sense of being purely linear
• Instead it is a tree, but a tree without a key

Variables in LISP
• A variable may be:
  – An atom
  – A list
• An atom is any word or number
• A list may be:
  – Empty
  – A variable followed by a list

Lists
• A list could be a simple list within parentheses
  – (Three element list)
• It could also have sub-lists
  – (Atom (A sub list) another (list))
  – This is clearly not a linear list such as an STL list
• LISP programs were also lists
  – The programs and data had the same form
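
To make the tree structure visible, the parenthesized text can be read into nested Python lists. This is a toy reader written for illustration, not a LISP implementation: atoms stay as strings and well-formed input is assumed.

```python
def tokenize(text):
    """Split parenthesized text into '(' , ')' and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Parse one atom or list from the front of `tokens` (consumed in place)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    return token  # an atom

def depth(expr):
    """Nesting depth: an atom is 0, a flat list is 1."""
    if not isinstance(expr, list):
        return 0
    return 1 + max((depth(e) for e in expr), default=0)

flat = parse(tokenize("(Three element list)"))
nested = parse(tokenize("(Atom (A sub list) another (list))"))
print(flat, depth(flat))
print(nested, depth(nested))
```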

Implementation
• The LISP language was influenced by the machine on which it was developed
• It had a 36-bit word that was partitioned into two pointers
  – Contents of the Address part of the Register (CAR)
  – Contents of the Decrement part of the Register (CDR)
• An atom used the word for data
• A list used the pointers and atoms
• A list always ended in nil, a special pointer
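
A sketch of the two-pointer cell in Python. The `Cons` class, `make_list`, and `to_text` are hypothetical names for illustration, `None` stands in for nil, and only proper lists (every cdr is a cell or nil) are handled.

```python
NIL = None  # stand-in for LISP's nil

class Cons:
    """One list cell: car points at the item, cdr at the rest of the list."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

def make_list(*items):
    """Chain cells back to front so the last cdr is NIL."""
    result = NIL
    for item in reversed(items):
        result = Cons(item, result)
    return result

def to_text(value):
    """Render an atom or a proper list in parenthesized form."""
    if value is NIL:
        return "()"
    if not isinstance(value, Cons):
        return str(value)
    parts = []
    while value is not NIL:
        parts.append(to_text(value.car))
        value = value.cdr
    return "(" + " ".join(parts) + ")"

simple = make_list("Three", "element", "list")
nested = make_list("Atom", make_list("sub", "list"), "last")
print(to_text(simple))  # (Three element list)
print(to_text(nested))  # (Atom (sub list) last)
```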

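Example
[Figure: cell diagram for (Three element list). A chain of three cells whose first pointers lead to the atoms Three, element, and list; the last cell ends in nil.]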

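Second Example
[Figure: cell diagram for (Atom (sub list) last). The top-level chain holds Atom, a pointer to the sub-list, and last, and ends in nil; the sub-list is its own chain holding sub and list, also ending in nil.]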

List Processing
• There were two functions that were continually used in LISP to process a list
• Car gave the first item of the list
  – Which could be a list itself
• Cdr gave the rest of the list
• With a heavy dose of recursion, LISP could do it all
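
The car/cdr style of recursion can be imitated on nested Python lists. This is a sketch only: `car` and `cdr` here are plain Python helpers, and the empty list plays the role of nil.

```python
def car(lst):
    """First item of a non-empty list (may itself be a list)."""
    return lst[0]

def cdr(lst):
    """Everything after the first item."""
    return lst[1:]

def count_atoms(lst):
    """Count atoms at any nesting depth using only car, cdr and recursion."""
    if not lst:
        return 0
    first = car(lst)
    here = count_atoms(first) if isinstance(first, list) else 1
    return here + count_atoms(cdr(lst))

def flatten(lst):
    """Collect the atoms of a nested list, left to right."""
    if not lst:
        return []
    first = car(lst)
    head = flatten(first) if isinstance(first, list) else [first]
    return head + flatten(cdr(lst))

data = ["Atom", ["A", "sub", "list"], "another", ["list"]]
print(count_atoms(data))  # 6
print(flatten(data))
```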