Mid presentation, Part A. Project by: Netanel Yamin & Shahar Zuta. Advisor: Moshe Porian. Dual semester project, November 2012

LZRW3 Data Compression Core


Page 1: LZRW3  Data Compression Core

Mid presentation, Part A

Project by: Netanel Yamin & Shahar Zuta

Advisor: Moshe Porian

Dual semester project, November 2012

Page 2: LZRW3  Data Compression Core

Contents

Project Overview
Project goals
Requirements
Architecture
Micro architecture
Problems & solutions
Conclusions
Testability
Methodology
Schedule

Page 3: LZRW3  Data Compression Core

Algorithm overview

INPUT FILE (literal items only) -> LZRW3 COMPRESSOR -> OUTPUT FILE (groups of items, literal/copy)

A copy item consists of two bytes that represent 3 to 18 bytes. A literal item consists of one byte that represents itself.
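The item sizes above can be illustrated with a small Python sketch. A range of 3 to 18 bytes is 16 possible lengths, which fits in 4 bits and leaves 12 bits of the two-byte copy item for locating the earlier data. The exact bit layout and byte order used here are assumptions for illustration only, and `pack_copy_item` / `unpack_copy_item` are hypothetical names.

```python
MIN_COPY = 3    # shortest run a copy item represents (from the slide)
MAX_COPY = 18   # longest run a copy item represents (from the slide)


def pack_copy_item(length, field12):
    """Pack a copy item into two bytes.

    Assumed layout (not taken from the slides): the upper 4 bits hold
    length - 3, the lower 12 bits hold the field that locates the match.
    """
    if not MIN_COPY <= length <= MAX_COPY:
        raise ValueError("copy length must be 3..18")
    if not 0 <= field12 <= 0xFFF:
        raise ValueError("field must fit in 12 bits")
    value = ((length - MIN_COPY) << 12) | field12
    return bytes([value >> 8, value & 0xFF])


def unpack_copy_item(item):
    """Inverse of pack_copy_item: return (length, field12)."""
    value = (item[0] << 8) | item[1]
    return (value >> 12) + MIN_COPY, value & 0xFFF
```

A literal item needs no packing: it is the data byte itself.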

Page 4: LZRW3  Data Compression Core

Mechanism demonstration

[Figure: the HASH FUNCTION maps three bytes of the INPUT FILE to an INDEX (0 to 4095) into a table of offsets; the table entries are shown as XXX, YYY, ZZZ, UUU.]

INPUT FILE: Expression_compression

"Exp": offset value = 0, output as literal items (L.I)

"res": offset value = 3, output as literal items (L.I)

NOTE: The next 3 bytes should be "x p r", then "p r e", and only then "r e s"; we did not demonstrate all the actions, for simplicity.

"L.I" stands for "Literal Item".

Page 5: LZRW3  Data Compression Core

Mechanism demonstration

[Figure: same hash function and offset table, continuing over the INPUT FILE Expression_compression.]

Offset values stored so far: 0, 3, 6, 9

Output:

Exp L.I

res L.I

sio L.I

n_c L.I

Page 6: LZRW3  Data Compression Core

Mechanism demonstration

[Figure: same hash function and offset table; the bytes "o m p" are hashed next.]

Offset values stored so far: 0, 3, 6, 9, 12

Output:

Exp L.I

res L.I

sio L.I

n_c L.I

omp L.I

Page 7: LZRW3  Data Compression Core

Mechanism demonstration

[Figure: same hash function and offset table; the bytes "r e s" are hashed again and hit an entry that already holds an offset.]

Offset values stored so far: 0, 3, 6, 9, 12, 15

Output:

Exp L.I

res L.I

sio L.I

n_c L.I

omp L.I

C.I (length 3 + the number of further matching bytes)

"C.I" stands for "Copy Item".

Page 8: LZRW3  Data Compression Core

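[Flow chart]

START

1. Hash 3 bytes.

2. Look up Hash table[index].

3. Is the index an empty field? If yes: enter offset, O.F. - literal item, FWD 1 byte.

4. If no: get offset. Same 3 bytes? If no: O.F. - literal item, FWD 1 byte.

5. If yes: O.F. - copy item. While there are more same bytes: Length++. Then FWD 3 + Length bytes.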
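The flow chart above can be sketched as a small software model. This is a simplified illustration, not the project's VHDL and not a bit-exact LZRW3 encoder: items are returned as Python tuples instead of packed bytes, the copy item records the earlier offset directly, and the names `compress_items` and `expand_items` are hypothetical.

```python
def lzrw3_hash(b0, b1, b2):
    """Table index (0..4095) of three consecutive bytes."""
    return ((40543 * ((b0 << 8) ^ (b1 << 4) ^ b2)) >> 4) & 0xFFF


def compress_items(data, max_len=18):
    """Return a list of ("L", byte) and ("C", offset, length) items."""
    table = [None] * 4096          # None marks an empty field
    items = []
    pos = 0
    while pos < len(data):
        if pos + 3 > len(data):    # fewer than 3 bytes left: literals only
            items.append(("L", data[pos]))
            pos += 1
            continue
        index = lzrw3_hash(data[pos], data[pos + 1], data[pos + 2])
        candidate = table[index]   # get offset (may be empty)
        table[index] = pos         # enter the current offset
        length = 0
        if candidate is not None:
            # Count matching bytes: "same 3 bytes?" then "more same bytes?"
            while (length < max_len and pos + length < len(data)
                   and data[candidate + length] == data[pos + length]):
                length += 1
        if length >= 3:
            items.append(("C", candidate, length))
            pos += length          # forward over the whole copied run
        else:
            items.append(("L", data[pos]))
            pos += 1               # forward 1 byte
    return items


def expand_items(items):
    """Rebuild the original bytes from the item list."""
    out = bytearray()
    for item in items:
        if item[0] == "L":
            out.append(item[1])
        else:
            _, offset, length = item
            for k in range(length):    # byte by byte, so overlapping runs work
                out.append(out[offset + k])
    return bytes(out)
```

For example, `expand_items(compress_items(b"Expression_compression"))` returns the original bytes. Unlike the simplified slides, this model hashes at every byte position, so the repeated text may be matched slightly earlier than at "res".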

Page 9: LZRW3  Data Compression Core

Project Goals

Implementation of the LZRW3 data compression algorithm

Implementing strong debugging capabilities via GUI

Page 10: LZRW3  Data Compression Core

Requirements

VHDL implementation

DE2 development board that features an Altera Cyclone II FPGA

FPGA - Host communication via UART protocol

Use internal memory on FPGA, no interface to external memory

Adapted to data templates of 2 Kbyte to 32 Kbyte

High performance: data transfer of 1 Gbps

Page 11: LZRW3  Data Compression Core

Requirements

VHDL implementation

XUPV5 development board that features a Xilinx Virtex-5 FPGA

FPGA - Host communication via UART protocol

Use internal memory on FPGA, no interface to external memory

Adapted to data templates of 2 Kbyte to 32 Kbyte

High performance: data transfer of 1 Gbps

Page 12: LZRW3  Data Compression Core

Architecture

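[Figure: block diagram. The GUI on the host connects over UART to the XILINX VIRTEX 5 ON XUVP505 BOARD. On the FPGA: UART -> Rx PATH -> INPUT BLOCK memory -> LZRW3 COMPRESSOR CORE -> COMPRESSED FILE memory -> Tx PATH -> UART.]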

Page 13: LZRW3  Data Compression Core


Page 14: LZRW3  Data Compression Core

LZRW3 COMPRESSOR

CORE

Lzrw3_go

Lzrw3_mode

data_input_byte (7..0)

data_input_valid

data_input_taken

clk

Lzrw3_busy

Lzrw3_done

Lzrw3_output_group_size (4..0)

data_output_valid

data_output_taken

data_output_last

reset

data_output_bytes(13..0)

End_of_file

Page 15: LZRW3  Data Compression Core
Page 16: LZRW3  Data Compression Core
Page 17: LZRW3  Data Compression Core

STAGE 1 – three bytes buffer

3 BYTESBUFFER

enable

reset

New_byte(7..0)

clk

Newer_byte(7..0)

Mid_byte(7..0)

Older_byte(7..0)
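A minimal behavioural sketch of the stage 1 ports listed above, assuming the block is a three-deep shift register that clears to zero on reset (the slide lists only the ports, so this behaviour is an assumption). `ThreeByteBuffer` and `clock` are hypothetical names for a software model, not the VHDL entity.

```python
class ThreeByteBuffer:
    """Software model of a three-byte shift register."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Assumed reset behaviour: all three registers cleared.
        self.older_byte = 0
        self.mid_byte = 0
        self.newer_byte = 0

    def clock(self, new_byte, enable=True):
        """Model one rising clock edge; shift only when enable is set."""
        if enable:
            self.older_byte = self.mid_byte
            self.mid_byte = self.newer_byte
            self.newer_byte = new_byte & 0xFF
        return self.older_byte, self.mid_byte, self.newer_byte
```

After three enabled clocks with the bytes "E", "x", "p", the outputs hold the three bytes in arrival order, ready for the hash function in stage 2.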

Page 18: LZRW3  Data Compression Core
Page 19: LZRW3  Data Compression Core

STAGE 2- hash function

enable

HASH FUNCTION

middle_byte(7..0)

clk

Table_index(11..0)

older_byte(7..0)

Newer_byte(7..0)

reset

Page 20: LZRW3  Data Compression Core

TABLE INDEX = (((40543*(((*(PTR))<<8)^((*((PTR)+1))<<4)^(*((PTR)+2))))>>4) & 0xFFF)

PTR points to the first byte. TABLE INDEX range: 0 to 4095.

Bit alignment of the three bytes a = *(PTR), b = *(PTR+1), c = *(PTR+2) before the multiplication:

a << 8 : a7 a6 a5 a4 , a3 a2 a1 a0 , 0 0 0 0 , 0 0 0 0

b << 4 : 0 0 0 0 , b7 b6 b5 b4 , b3 b2 b1 b0 , 0 0 0 0

c      : 0 0 0 0 , 0 0 0 0 , c7 c6 c5 c4 , c3 c2 c1 c0

XOR    : a7 a6 a5 a4 , a3^b7 a2^b6 a1^b5 a0^b4 , b3^c7 b2^c6 b1^c5 b0^c4 , c3 c2 c1 c0
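The table index formula can be written as a short Python function. The constants come from the formula on this slide; mapping `older_byte`, `middle_byte` and `newer_byte` to PTR, PTR+1 and PTR+2 is an assumption based on the stage 2 port names.

```python
def lzrw3_hash(older_byte, middle_byte, newer_byte):
    """Return the 12-bit hash table index of three consecutive bytes."""
    # Overlap the three bytes into one 16-bit value.
    mixed = (older_byte << 8) ^ (middle_byte << 4) ^ newer_byte
    # Multiply by the constant, drop the 4 low bits, keep 12 bits.
    return ((40543 * mixed) >> 4) & 0xFFF
```

The product never exceeds 40543 * 65535, which is below 2^32, so Python's unbounded integers give the same result as 32-bit unsigned arithmetic would.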

Page 21: LZRW3  Data Compression Core

STAGE 2- RTL view

Page 22: LZRW3  Data Compression Core

STAGE 3 – hash table

HASH TABLE

enable

Data_out_valid

Table_index(0..11)

clk

Offset(19..0)

Current_offset(19..0) (from the offset counter)

reset

clear

Page 23: LZRW3  Data Compression Core

[Figure: the hash table is a memory of 4096 rows, 21 bits per row, with valid bits. INDEX drives ADDRESS and the offset counter (Current_offset) drives DATA_IN; the outputs are Offset and Data_out_valid.]
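A minimal software model of the hash table stage, assuming each access reads the stored entry and then overwrites it with the current offset, and that `clear` only resets the valid bits' meaning (both are assumptions; the slides show the ports and the memory shape only). `HashTable` and `access` are hypothetical names.

```python
class HashTable:
    """4096-row table of offsets with one valid bit per row."""

    ROWS = 4096

    def __init__(self):
        self.clear()

    def clear(self):
        # Assumed: clearing invalidates every row.
        self.valid = [False] * self.ROWS
        self.offsets = [0] * self.ROWS

    def access(self, table_index, current_offset):
        """Return (offset, data_out_valid) for the row, then store current_offset."""
        previous = (self.offsets[table_index], self.valid[table_index])
        self.offsets[table_index] = current_offset
        self.valid[table_index] = True
        return previous
```

The valid bit lets the next stage tell an empty field from a stored offset of 0, which a plain zero-initialised memory could not do.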

Page 24: LZRW3  Data Compression Core

STAGE 4 – input file memory

Page 25: LZRW3  Data Compression Core

Stage 4 implementation

The input file memory should supply three bytes at the same time.

Page 26: LZRW3  Data Compression Core

How to choose bank when byte arrives?

Bank # = current_offset % 3

Address in bank = current_offset / 3 (integer division)
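The bank selection rule can be checked with a few lines of Python. `bank_and_address` is a hypothetical helper name; the two formulas are the ones on this slide.

```python
def bank_and_address(current_offset):
    """Return (bank number, address in bank) for a byte offset."""
    return current_offset % 3, current_offset // 3


# Any three consecutive offsets fall into three different banks.
for offset in range(0, 9):
    banks = {bank_and_address(offset + k)[0] for k in range(3)}
    assert banks == {0, 1, 2}
```

This property is what allows three consecutive bytes to be read in the same clock cycle, one from each bank.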

Page 27: LZRW3  Data Compression Core

SOLUTION

Instead of counting in stage 3 and dividing in stage 4, we increment the offset by one only after three clock cycles.

In this configuration we expand the offset by 2 bits (tagging) to select the bank the data needs to be written into.

The hash table size is now 4096 x (19+2).

Example entry: 1001010101001110011 (19 bits) followed by 10 (2 bits)
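A sketch of the counter described above, showing that an address that advances every third cycle plus a 2-bit tag carries the same information as dividing a byte counter by 3. The class name `TaggedOffsetCounter` and the wrap-around at 19 bits are assumptions for illustration.

```python
ADDRESS_BITS = 19  # width of the offset part, from the slide


class TaggedOffsetCounter:
    """Counter that yields (address, tag) without any division."""

    def __init__(self):
        self.address = 0
        self.tag = 0          # 0, 1 or 2: selects the bank

    def clock(self):
        """Return the pair for the current byte, then advance by one byte."""
        current = (self.address, self.tag)
        if self.tag == 2:
            self.tag = 0
            self.address = (self.address + 1) & ((1 << ADDRESS_BITS) - 1)
        else:
            self.tag += 1
        return current
```

For the n-th byte the counter returns `(n // 3, n % 3)`, so no divider is needed in stage 4.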

Page 28: LZRW3  Data Compression Core

Solution costs (mem units)

Memory usage at stage 3 from synplify_pro: same as before.

20 bit x 4096 = 81920 bit = 80 Kbit -> 3 RAM blocks of 36 Kbit = 108 Kbit

LUT usage:

Page 29: LZRW3  Data Compression Core

Back to stage 4

Page 30: LZRW3  Data Compression Core

[Figure: stage 4 block diagram. Blocks: input file memory banks (bank 0, 1, 2 addresses), comparator, counter, tentative next address, addresses alignment. Signals: offset, TAG, Offset_tag, Tentative_tag, Tentative_taken, Offset_valid, Comprison_valid, Compare_success, Compare_success_P, Item_length_p, Older_byte_P, Continue, INDEX, clk.]

TAG indicates the byte order of the banks.

Page 31: LZRW3  Data Compression Core

[Figure: the same stage 4 block diagram, showing the next step of the example.]

Page 32: LZRW3  Data Compression Core
Page 33: LZRW3  Data Compression Core

Problem (1)

In stage 4, we first implemented the counter that counts the number of successful comparisons inside the comparator, which is an asynchronous process. It passed simulation but was not synthesizable.

Page 34: LZRW3  Data Compression Core

Solution (1)

We changed the architecture of the units so that the counter is implemented in a synchronous unit. It receives a signal from the asynchronous comparator indicating whether the comparison was successful, and responds accordingly.

Page 35: LZRW3  Data Compression Core

Problem (2)

In stage 4, to compare the current three bytes in the pipe with three bytes from the RAM, we need to extract three consecutive bytes from different addresses in one clock period.

Page 36: LZRW3  Data Compression Core

Solution (2)

We split the single memory into 3 RAM banks that hold consecutive addresses, so that extracting 3 consecutive bytes means extracting one byte from each bank.

Page 37: LZRW3  Data Compression Core

Problem (3)

In stage 4, the current pipe bytes that arrive at the comparator are arranged in their arrival order, but the three bytes read from the banks are not necessarily arranged in the right order.

Page 38: LZRW3  Data Compression Core

Reading configurations

1. SAME ADDRESSES

Page 39: LZRW3  Data Compression Core

Reading configurations

2. DIFFERENT ADDRESS

Page 40: LZRW3  Data Compression Core

Reading configurations

3. DIFFERENT ADDRESS # 2

Page 41: LZRW3  Data Compression Core

Solution (3)

We used the TAG, which represents the addresses of the extracted bytes, to determine which extracted byte is compared with which current piped byte.
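The role of the TAG can be sketched in Python under one assumption: the tag is the bank number that holds the first of the three bytes. With that assumption the three reading configurations correspond to tag 0 (same address in all banks), tag 1 (bank 0 reads one address ahead) and tag 2 (banks 0 and 1 read one address ahead). `bank_addresses` and `align_by_tag` are hypothetical names.

```python
def bank_addresses(address, tag):
    """Address presented to banks 0, 1, 2 when the first byte is in bank `tag`."""
    # Banks below the tag hold bytes that wrapped into the next row.
    return [address + 1 if bank < tag else address for bank in range(3)]


def align_by_tag(bank_outputs, tag):
    """Reorder raw bank outputs (bank 0, 1, 2) into stream order."""
    return [bank_outputs[(tag + k) % 3] for k in range(3)]


# Usage: spread a small input over three banks and read back from offset 4.
data = b"Expression"
banks = [data[b::3] for b in range(3)]      # bank b holds offsets b, b+3, ...
address, tag = divmod(4, 3)                 # offset 4 -> address 1, tag 1
raw = [banks[b][a] for b, a in enumerate(bank_addresses(address, tag))]
assert bytes(align_by_tag(raw, tag)) == data[4:7]
```

The reordered bytes can then be compared position by position with the three bytes currently in the pipe.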

Page 42: LZRW3  Data Compression Core

Problem (4)

In stage 4, the RAM banks need to receive the address to read on the next clock before the end of the current clock.

Page 43: LZRW3  Data Compression Core

Solution (4)

We created two units that hold the two possible next addresses (the tentative address unit and the address align unit).

Page 44: LZRW3  Data Compression Core

Conclusions

Writing code for synthesis is different from writing code for simulation.

In an asynchronous implementation all the signals need to be in the sensitivity list.

Reset should not pass through any logic.

Think hardware when writing VHDL code for synthesis.

Keep it simple to achieve more flexibility.

Page 45: LZRW3  Data Compression Core

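Testability

[Figure: a random input generator (or an input file) feeds the bytes A, B, C to both the synthesizable Hash Function block and an unsynthesizable simulation function. The two results are compared by assertion and the outcome is reported to the console. Two labels reading 2048 also appear in the figure.]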
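The same test idea can be sketched in Python: drive two implementations of the hash with random bytes and assert that they agree. Here `block_under_test` is only a stand-in for the synthesizable block (it builds the mixed value from nibbles), `reference_hash` plays the role of the simulation-only function, and the vector count and seed are arbitrary choices.

```python
import random


def reference_hash(a, b, c):
    """Golden model: the table index formula evaluated directly."""
    return ((40543 * ((a << 8) ^ (b << 4) ^ c)) >> 4) & 0xFFF


def block_under_test(a, b, c):
    """Stand-in for the hardware block: same mix assembled nibble by nibble."""
    mixed = (((a >> 4) << 12)
             | (((a & 0xF) ^ (b >> 4)) << 8)
             | (((b & 0xF) ^ (c >> 4)) << 4)
             | (c & 0xF))
    return ((40543 * mixed) >> 4) & 0xFFF


def run_random_test(vectors=1000, seed=0):
    """Compare both models on random inputs; return the number of vectors run."""
    rng = random.Random(seed)
    for _ in range(vectors):
        a, b, c = (rng.randrange(256) for _ in range(3))
        # Assert the comparison and report the failing input if any.
        assert block_under_test(a, b, c) == reference_hash(a, b, c), (a, b, c)
    return vectors


if __name__ == "__main__":
    print("vectors passed:", run_random_test())
```

Using a fixed seed keeps a failing run reproducible, which helps when the mismatch has to be traced back in a waveform viewer.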

Page 46: LZRW3  Data Compression Core

Methodology

Stage data flow review.

Writing VHDL code.

Writing VHDL testbench.

Code review and debugging.

Synthesis check (Synplify): check RTL view, check CLK constraints.

Commit SVN folders and update data flow if needed.

Next stage data flow review.

Simulation & debugging

Page 47: LZRW3  Data Compression Core

Schedule 1/2

Date | Goals

24/4/2012 – 1/5/2012 | Project characterization & algorithm interpretation

2/5/2012 | Characterization presentation

2/5/2012 – 16/5/2012 | Full characterization of all blocks

17/5/2012 – 1/7/2012 | System blocks VHDL design

1/7/2012 – 27/7/2012 | Work on project paused for exams

29/7/2012 – 11/11/2012 | System blocks VHDL design (cont.); writing a simulation testbench for every unit

Page 48: LZRW3  Data Compression Core

Schedule 2/2

Date | Goals

12/11/2012 | Mid presentation

13/11/2012 – 19/12/2012 | System blocks VHDL design (cont.); writing a simulation testbench for every unit

20/1/2013 | Part A final: core simulation vs. golden model

21/1/2013 – 15/2/2013 | Assemble all units and FPGA synthesis

16/2/2013 – 28/2/2013 | GUI implementation

1/3/2013 – 10/3/2013 | Final overall tests & debug

11/3/2013 – 31/3/2013 | Editing and finishing project portfolio

1/4/2013 | Final presentation