Upload
hoangque
View
214
Download
0
Embed Size (px)
Citation preview
A Design Flow for12.8 GBit/s Triple DES using
Dynamic Logic andStandard Synthesis Tools
F. Grassert, S. Flügel, M. Grothmann, M. Haase, P. Nimsch, H. Ploog, A. Wassatsch, D. Timmermann
University of Rostock (Germany)Dep. Electrical Engineering and Information Technology
Institute of Applied Microelectronics and CS
2/22
Purpose
• Wanted: Fast Triple DES processor• Wanted: Integration of dynamic logic in standard synthesis
½Development of design flow for TSPC Logic½Application: Triple DES
3/22
Outline
• Introduction• Basics of dynamic logic and TSPC logic• Design flow for TSPC• Realization of Triple DES• Summary
4/22
Dynamic Logic
• Application– High-end chips
i.e. Posluszny et al „Design methodology for a 1.0 GHz Microprocessor“, ICCD 1998– Ultra fast logic
i.e. Yee, Sechen, „Clock-Delayed Domino for Dynamic Circuit Design“, Trans. On VLSI Systems 2000
• Advantages– Very fast– Can be small
• Disadvantages– Difficult to design– Not supported by synthesis tools– Power consumption– Expensive clock-tree design
5/22
Dynamic Logic - Function
• Works only with two phases
φ
φ
n-BlockInputs
Precharge
EvaluateOutput
Inputs
φ − ClockOutput
Precharge Evaluate
• Logic function using n-transistors (“n-block”) – fast and small
• No direct interconnect of outputs and inputs possible
6/22
True Single Phase Clock (TSPC) Logic (1)
• True Single Phase Clock
• Realizes register function → combined register and logic • Only non-inverting logic functions• P-block: slow, large area
φ
φ
n-BlockφIN
n-Latchn-Logic
p-Block
φ
φ
φ
OUT
p-Latchp-Logic Complementary
part
7/22
True Single Phase Clock (TSPC) Logic (2)
• Second TSPC realization: N-N2 logic• Inverting logic functions only
φ
φ
n-BlockφIN
n-Latchn-Logic
n-Block
φ
φ
φ
OUT
p-Latchn2-Logic
8/22
True Single Phase Clock (TSPC) Logic (3)
• Result: non-inverting and inverting logic functions• No use of second logic block, but possible• Robust and fast cells• Extreme pipelining – extreme clock load• Problem: clock distribution
φ
φ
n-Block φp-Block
φ
φ
φ
OUTIN
n-Latch p-Latchp-Logikn-Logik
φ
φ
n-Block φn-Block
φ
φ
φ
OUTIN
n-Latch p-Latchn2-Logikn-Logik
Non-inverting functions Inverting functions
Used for logic Used for logic
9/22
• Generating the library
• Generating a combinational netlist
• Generating the pipeline
• Implementing the pipeline in the design
• Generating the library
• Generating a combinational netlist
• Generating the pipeline
• Implementing the pipeline in the design
Synthesis of TSPC Logic
10/22
Building a Library
• Cell layout• Verifying the cells• Simulation• Compiling the library
• Combinational function is the main point• Timing behavior not interesting• Register function not important
11/22
Synthesis of TSPC Logic
• Compiling the library
• Generating a combinational netlist
• Generating the pipeline
• Implementing the pipeline in the design
12/22
Generating a Combinational Netlist
• Standard design flow– VHDL description, analyzing, elaboration
• Mapping (using the library with TSPC gates)
• No feedbacks allowed – only straightforward– Otherwise splitting in shorter combinational parts or hand design
13/22
Synthesis of TSPC Logic
• Compiling the library
• Generating a combinational netlist
• Generating the pipeline
• Implementing the pipeline in the design
14/22
Generating the Pipeline
• Interpreting all cells as registers (pipeline stages)• Building the pipeline and integrating buffers
AND
OR
E1E2
E3
A1
A2
AND
OR
E1E2
E3
A1
A21
11
1
15/22
Synthesis of TSPC Logic
• Compiling the library
• Generating a combinational netlist
• Generating the pipeline
• Implementing the pipeline in the design
16/22
DES Core as an Example
• Data Encryption Standard (DES) – symmetric encryption• 64 bit message with 56 bit key to 64 bit cipher text• 16 rounds per data word• Loop-unrolling ideal for pipelining
Li-1 Ri-1
Li=Ri-1 Ri=Li-1⊗f(Ri-1,Ki)
f Ki48
32
48
32
Expansion
permute
SBoxes
17/22
Structure of the DES Chip
• One round – Single DES• Three rounds – Triple DES• 4 channels for encryption• Communication with control
unit through parallel and I²C interfaces
• Control part, Input / Output 200MHz, SCMOS
• Core part 800Mhz, TSPC• Built In Self Test (BIST)
Input Scheduler
DES Core
Output Scheduler
Con
trol
Uni
t
Input Selector
Late
ncy
feedback data
data in
I/O
Output Selector
data
out
18/22
Realization DES Chip
• DES - Core completely realized with automatic synthesis
• Black box verification between VHDL description and C-program via CLI
• Compiling VHDL to combinational netlist with TSPC library information
• Micro Pipeline Reorganization (MPR) step using developed tool
• Verification steps
Input Scheduler
DES Core
Output Scheduler
Con
trol
Uni
t
Input Selector
Late
ncy
feedback data
data in
I/O
Output Selector
data
out
19/22
Realization DES Chip
• Scheduler – TSPC handmade• VHDL description needs
register information• Combining with DES core• Verification
Input Scheduler
DES Core
Output Scheduler
Con
trol
Uni
t
Input Selector
Late
ncy
feedback data
data in
I/O
Output Selector
data
out
20/22
Realization DES Chip
• Control parts with standard synthesis
• VHDL description compiled to standard library
• Complete chip:Combination with TSPC parts
• Verification of behavior• Verification of cycles
Input Scheduler
DES Core
Output Scheduler
Con
trol
Uni
t
Input Selector
Late
ncy
feedback data
data in
I/O
Output Selector
data
out
21/22
Discussion
• 12.8 GBit Triple DES chip (64 Bit @ 200 MHz)• Three separated parts during synthesis• Speed-up of the calculation intensive parts through TSPC• All parts work to capacity – no bottleneck
• Very long TSPC pipeline (ca. 250 stages)• Large area (no optimizations yet)
22/22
Summary
• Automatic design of dynamic logic with standard synthesis tools• Example: GBit DES• Disadvantages: very large area and high power consumption• Demonstrates the possibility to speed up designs through extreme
pipelining and using dynamic logic• Compared with other realizations: large area, but high speed,
standard synthesis