Project 7: Huffman Code

Extend the most recent version of the Huffman Code program to include decode information in the binary output file and use that information to restore the original ASCII text.

Embedding the Code

In order for the coded file to be useful, we have to store the code along with it.

Then we can read and decode the file at a later time.

Even on a different computer (with the same architecture).

In order to decode:

First read the code.

Then read and decode the message.

Serialization

We will need to serialize the decode tree.

Convert a data structure into a byte array.

Necessary any time an object is to be written to a file or transmitted over a network.

Also deserialize.

Convert the byte array back into the data structure in memory.

Serialize and deserialize methods are required for any class whose objects need to be preserved in files or transmitted over a network.

Serialization

We will write the serialized decode tree into the binary output file ahead of the coded message.

On decode, first read back and deserialize the decode tree, then read and decode the message.

Serialization and deserialization are byte level operations.

Don't want to do one bit at a time.

Will need new operations in our binary file classes.

http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/Project_7_Binary_File_Classes/

Serializing the Decode Tree

How do we convert the decode tree into a byte array that can be converted back into a decode tree elsewhere?

Don't output pointers!

We can't put nodes back into the same memory locations.

Have to create new nodes and link them together in the same way they were linked in the original, but at new memory locations.

Applies to all serialization operations.

How to serialize a decode tree?

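[Figure: decode tree. The root * (1.0) has children * (0.55) and e (0.45). Node * (0.55) has children * (0.35) and * (0.20). Node * (0.35) has children a (0.20) and d (0.15). Node * (0.20) has children c (0.10) and b (0.10).]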

Note that each nonleaf node has two child nodes.

How to deserialize a decode tree?

Need the child node addresses in order to restore a nonleaf node.

Get restored left child address.

Get restored right child address.

Get data for node.

Create new node with data, and pointers to left child and right child.

How to deserialize a decode tree?

Work from the bottom up.

Leaf nodes can be restored immediately from the serialized data.

Push addresses onto a stack.

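Get parent data from serialized data.

Pop the two child addresses. (If the left child was serialized before the right child, the right child's address is on top of the stack and is popped first.)

Restore parent node.

Push parent node address onto stack.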

Serialization Algorithm

Do a postorder depth-first traversal.

Output node data as each node is visited.

[Figure: the decode tree, with nodes 1.0, 0.55, 0.45, 0.35, 0.20, a 0.20, d 0.15, c 0.10, b 0.10.]

Output only the node data, not the pointers.
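
The postorder traversal can be sketched in C++ roughly as follows. This is only an illustration: the Node and Record types, the leaf flag, the '*' marker for nonleaf nodes, and the helper functions are assumptions made for the example, not the course's Huffman_Tree or Char_Freq classes. The left/right placement of children in sample_tree is also assumed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type; the real Huffman_Tree / Char_Freq classes may differ.
struct Node {
    char ch;                      // the character for a leaf, '*' for a nonleaf (assumed)
    double freq;                  // relative frequency
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// What is written for each node: data only, never pointers.
struct Record {
    char ch;
    double freq;
    bool leaf;                    // assumed flag so a reader can tell leaves from nonleaves
};

// Postorder: left subtree, right subtree, then the node itself.
void serialize(const Node* n, std::vector<Record>& out) {
    if (n == nullptr) return;
    serialize(n->left.get(), out);
    serialize(n->right.get(), out);
    out.push_back(Record{n->ch, n->freq, !n->left && !n->right});
}

std::unique_ptr<Node> make_leaf(char c, double f) {
    return std::unique_ptr<Node>(new Node{c, f, nullptr, nullptr});
}

std::unique_ptr<Node> make_parent(std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    double f = l->freq + r->freq;
    return std::unique_ptr<Node>(new Node{'*', f, std::move(l), std::move(r)});
}

// The decode tree from the slides, with an assumed left/right order.
std::unique_ptr<Node> sample_tree() {
    auto ad = make_parent(make_leaf('a', 0.20), make_leaf('d', 0.15));
    auto cb = make_parent(make_leaf('c', 0.10), make_leaf('b', 0.10));
    return make_parent(make_parent(std::move(ad), std::move(cb)), make_leaf('e', 0.45));
}

// Characters of the serialized records, in output order.
std::string postorder_chars(const Node* root) {
    std::vector<Record> out;
    serialize(root, out);
    std::string s;
    for (const Record& r : out) s += r.ch;
    return s;
}
```

With this assumed layout, the records for the sample tree come out in the order a, d, *, c, b, *, *, e, *: every nonleaf node appears immediately after both of its subtrees.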

Deserialization Algorithm

While any node data is left in the serialized stream:

    Get the next node data from the serialized stream.

    If it is a leaf:

        Create a new node in memory with the data.

        Push the address of the new node onto the stack.

    Else:

        Pop a child address from the stack.

        Pop a child address from the stack.

        Create a new node in memory with the data from the serialized stream and the child addresses from the stack.

        Push the address of the new node onto the stack.
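
A minimal C++ sketch of this stack-based algorithm is shown here. The Node and Record types and the leaf flag are hypothetical stand-ins, not the course's classes. The sketch also assumes the stream was produced by a postorder traversal that visits the left child before the right child.

```cpp
#include <cassert>
#include <memory>
#include <stack>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct Node {
    char ch;
    double freq;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

struct Record {
    char ch;
    double freq;
    bool leaf;   // assumed flag that marks leaf records
};

// Rebuild the tree from records written in postorder (left, right, node).
std::unique_ptr<Node> deserialize(const std::vector<Record>& in) {
    std::stack<std::unique_ptr<Node>> st;
    for (const Record& r : in) {
        std::unique_ptr<Node> n(new Node{r.ch, r.freq, nullptr, nullptr});
        if (!r.leaf) {
            if (st.size() < 2) throw std::runtime_error("malformed stream");
            // The right child was serialized last, so it is on top of the stack.
            n->right = std::move(st.top());
            st.pop();
            n->left = std::move(st.top());
            st.pop();
        }
        st.push(std::move(n));
    }
    // A well-formed stream leaves exactly one node on the stack: the root.
    if (st.size() != 1) throw std::runtime_error("malformed stream");
    std::unique_ptr<Node> root = std::move(st.top());
    st.pop();
    return root;
}
```

The new nodes live at whatever addresses the allocator returns. Only the links between them match the original tree.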

Implementing Serialization

Add code to class Huffman_Tree to serialize and deserialize the decode tree.

Output and input data from the Char_Freq objects.

Need serialize and deserialize methods in that class also.

Implementing Serialization

How do we indicate end of the serialized decode tree?

Use a sentinel.

A unique value that cannot appear as real data.

Char_Freq(0, 0)
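
One way the sentinel might work at the byte level is sketched below. CharFreqRec is a hypothetical stand-in for Char_Freq; its layout (one char followed by one double) and all function names are assumptions for the example. The raw copy of the double is only portable between machines with the same architecture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Char_Freq; the real class layout may differ.
struct CharFreqRec {
    char ch;
    double freq;
};

// Bytes used per record in the stream: one char followed by one double.
const std::size_t kRecSize = sizeof(char) + sizeof(double);

// Append one record to the byte buffer.
void put(std::vector<unsigned char>& buf, const CharFreqRec& r) {
    unsigned char tmp[sizeof(double)];
    buf.push_back(static_cast<unsigned char>(r.ch));
    std::memcpy(tmp, &r.freq, sizeof(double));
    buf.insert(buf.end(), tmp, tmp + sizeof(double));
}

// Read the record that starts at byte offset pos.
CharFreqRec get(const std::vector<unsigned char>& buf, std::size_t pos) {
    CharFreqRec r;
    r.ch = static_cast<char>(buf[pos]);
    std::memcpy(&r.freq, &buf[pos + 1], sizeof(double));
    return r;
}

bool is_sentinel(const CharFreqRec& r) {
    return r.ch == 0 && r.freq == 0.0;
}

// Write all records, then the sentinel record (0, 0).
std::vector<unsigned char> write_all(const std::vector<CharFreqRec>& recs) {
    std::vector<unsigned char> buf;
    for (const CharFreqRec& r : recs) put(buf, r);
    put(buf, CharFreqRec{0, 0.0});
    return buf;
}

// Read records until the sentinel (or until the buffer runs out).
// next receives the offset of the first byte not consumed.
std::vector<CharFreqRec> read_all(const std::vector<unsigned char>& buf,
                                  std::size_t& next) {
    std::vector<CharFreqRec> recs;
    std::size_t pos = 0;
    while (pos + kRecSize <= buf.size()) {
        CharFreqRec r = get(buf, pos);
        pos += kRecSize;
        if (is_sentinel(r)) break;
        recs.push_back(r);
    }
    next = pos;
    return recs;
}
```

After read_all finds the sentinel, next is the offset at which the coded message would begin. The value (0, 0) is safe as a sentinel as long as every real node carries a nonzero frequency.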

Sample Run

Test on Full Text

Delete screen output of decoded text.

The Files

Development Environment

You may develop your program on any system you like.

The same source files should compile and run on either Windows or Linux.

Ground Rules

You may work with one other person.

OK to work alone if you prefer.

If you do work as a pair:

Work together!

Both members are expected to contribute.

Both members should understand the program in detail.

Submit a single program.

Do not share your code with other students.

Before or after submitting the project.

OK to discuss the project.

Do not copy any other student’s work.

Don’t look at anyone else’s program.

Don’t let anyone look at your program.

Ground Rules

Except for code posted on the current class web site, do not copy code from the Internet or any other source.

Write your own code.

Submission

Project is due by 11:59 PM, Sunday night, April 24

Deliverables: Source files only. Zip using Windows "Send to Compressed Folder"

If you work with another student, include both names in the assignment comments.

The other student should submit just a Blackboard submission comment including both names.

End of Presentation