THE RTL DESIGN OF 32-BIT RISC PROCESSOR USING VERILOG HDL
HAFIZUL HASNI BIN MANAB
UNIVERSITI TEKNOLOGI MALAYSIA
A project report submitted in partial fulfilment of the
requirements for the award of the degree of
Master of Engineering (Electrical-Computer & Microelectronic System)
Faculty of Electrical Engineering
Universiti Teknologi Malaysia
JANUARY 2012
To my beloved family, friends and lecturers who have guided and inspired me
along this journey.
ACKNOWLEDGEMENT
Special thanks to my project supervisor, AP Dr. Muhammad Nasir Bin Ibrahim, and to
lecturers Prof. Dr. Mohamed Khalil Hani, Dr. Muhammad Nadzir Marsono and Dr.
Usman Ullah Sheikh Izzat Ullah Sheik for sharing their ideas and knowledge, which
helped me greatly in completing this project.
My greatest appreciation also goes to my family, who supported me
throughout the development of this project, and to the friends who gave me so much
help.
ABSTRACT
The objective of this research is to design a Reduced Instruction Set Computer
(RISC) processor core based on the ARM instruction set architecture for System-on-Chip
(SoC) development. The RISC architecture is selected because it is widely
accepted as the processor architecture for mobile computing and SoC-based
computing systems. Moreover, it reduces processor complexity by replacing a
highly complex microprogrammed instruction set with a limited number of
instructions, each of which can be executed completely in one cycle. As SoC
becomes an attractive solution in various applications, such as hardware
accelerators for video and image processing in embedded systems, microprocessor
design for SoC becomes increasingly important for developing optimal embedded
systems that are fast, have a small memory footprint, and consume little power.
ABSTRAK
The objective of this study is to design a Reduced Instruction Set Computer
(RISC) processor core based on the ARM instruction set architecture for
System-on-Chip (SoC) development. The RISC architecture is chosen because it is
accepted as the processor architecture for mobile computing and for SoC-based
computer systems. In addition, it reduces processor complexity by reducing the
instruction set from a highly complex microprogrammed instruction set to a
limited number of instructions, each of which can be fully executed in one
cycle. As System-on-Chip (SoC) becomes an impressive solution in various
applications, such as hardware accelerators for video and image processing in
embedded systems, the importance of microprocessor design in SoC increases for
developing optimal embedded systems that are fast, small in memory size, and low
in power consumption.
TABLE OF CONTENTS
CHAPTER TITLE
DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF APPENDICES
1 INTRODUCTION
1.1 Research Background
1.2 Project Background
1.3 Problem Statement
1.4 Objective
1.5 Scope of Work
1.6 Project Schedule
1.7 Report Outline
1.8 Summary of Chapter 1
2 LITERATURE REVIEW
2.1 Procedure Identification
2.2 Instruction Set Architecture (ISA)
2.2.1 Complex Instruction Set Computer (CISC)
2.2.2 Reduced Instruction Set Computer (RISC)
2.2.3 Instruction Usage
2.3 Load/Store Architecture
2.4 Pipelining
2.5 Pipeline Hazard
2.5.1 Structural Hazard
2.5.2 Data Hazard
2.5.3 Control Hazard
2.6 Processor Resources
2.6.1 Incrementer
2.6.2 Program Counter
2.6.3 Link Register
2.6.4 Instruction Memory
2.6.5 Instruction Register
2.6.6 Register File
2.6.7 Barrel Shifter
2.6.8 Arithmetic and Logic Unit
2.6.9 Status Register
2.6.10 Data Memory
2.6.11 Control Unit
2.6.12 Data Bus
2.6.13 Address Bus
2.6.14 Control Bus
2.6.15 Pipeline Register
2.7 Previous Work
2.8 Jurij Silc, Advanced Processor Architecture
2.8.1 V. Heuring and H. Jordan, Computer Systems Design and Architecture
2.8.2 VHDL Design of A 32-bit RISC Processor Core for FPGA Implementation
2.8.3 The RTL Design of 32-bit 5-stage Pipeline RISC Processor Using Verilog HDL
2.9 Introduction to ARM Architecture
2.10 Summary of Chapter 2
3 METHODOLOGY AND PROCEDURES
3.1 Methodology
3.2 Research Flow and Methodology
3.3 Design Flow and Methodology
3.4 Summary of Chapter 3
4 RESULT AND DISCUSSION
4.1 Processor Top Level Design
4.2 Processor Design Specification
4.3 Processor Resources
4.3.1 Clock
4.3.2 Incrementer
4.3.3 Program Counter
4.3.4 Link Register
4.3.5 Instruction Memory
4.3.6 Instruction Register
4.3.7 Register File
4.3.8 Barrel Shifter
4.3.8.1 Logical Shift Left
4.3.8.2 Logical Shift Right
4.3.8.3 Arithmetic Shift Right
4.3.8.4 Rotate Right
4.3.8.5 Rotate Right Extended
4.3.9 Arithmetic Logic Unit
4.3.10 Current Program Status Register
4.3.11 Data Memory
4.4 Instruction Set Architecture and Format
4.4.1 Data Processing
4.4.2 Single Data Transfer
4.4.3 Branch
4.5 ASM Flowchart
4.6 Datapath Unit Design
4.7 Control Unit Design
4.8 myARM RISC Processor Core Architecture
4.9 Processor Validation
4.10 Design Optimization and Performance Analysis
4.10.1 Maximum Operating Frequency
4.10.2 Critical Path Delay
4.10.3 Throughput
4.10.4 Latency
4.10.5 Resource Utilization
4.11 Summary of Chapter 4
5 CONCLUSION
5.1 Limitation of the Processor Core
5.2 Project Conclusion
5.3 Future Works
5.3.1 Pipeline Hazard Handling
5.3.2 Interrupt Capability
5.3.3 Multiplier and Divisor Design
5.3.4 Assembler Compatibility
5.3.5 Implementation on FPGA
5.3.6 Memory Extension
5.3.7 Extending the Instruction
5.3.8 Cache Memory Architecture
5.3.9 Interfacing with Co-Processor
5.3.10 I/O System
5.4 Summary of Chapter 5
REFERENCES
Appendix A
LIST OF TABLES
TABLE NO. TITLE
4.1 Register file operation
4.2 Barrel shifter operation
4.3 Logical shift left operation barrel shifter
4.4 Logical shift right operation barrel shifter
4.5 Arithmetic shift right operation barrel shifter
4.6 Rotate right operation of barrel shifter
4.7 Rotate right extended operation of barrel shifter
4.8 Arithmetic and logic unit operation
4.9 Condition flags
4.10 Instruction set and format
4.11 Data processing instruction
4.12 RTL notation for single data transfer operation
4.13 Pipeline stages
4.14 Hardware for pipeline stages
4.15 Pipeline validation
LIST OF FIGURES
FIGURE NO. TITLE
2.1 Basic 5-stage pipeline structure
2.2 5-stage pipeline structure by Jurij Silc
2.3 5-stage pipeline structure by V. Heuring
2.4 VHDL Design of A 32-bit RISC Processor Core
2.5 Top level architecture
2.6 Datapath unit design and the interconnection
2.7 ARM7TDMI core architecture
3.1 Project Methodology
3.2 Research Methodology
3.3 Experiment on memory’s timing performance
3.4 RTL Design Flow
3.5 Hierarchical Modularity of RTL Design Methodology
3.6 RTL Design Methodology
4.1 Selected blocks from ARM7TDMI core architecture
4.2 Hierarchical modularity of processor design
4.3 Hierarchical modularity of processor design
4.4 Clocking design
4.5 Incrementer design
4.6 Incrementer simulation result
4.7 Program counter design
4.8 Output waveform program counter simulation
4.9 Link register design
4.10 Register instantiation module
4.11 Instruction memory design
4.12 Memory initialization file
4.13 Output waveform of instruction memory simulation
4.14 Output waveform of instruction memory simulation
4.15 Register file design
4.16 Register file organization
4.17 Register file organization
4.18 Barrel shifter design
4.19 Output waveform of logical shift left operation
4.20 Output waveform of logical shift right operation
4.21 Output waveform of arithmetic shift right operation
4.22 Output waveform of rotate right operation
4.23 Output waveform of rotate right extended operation
4.24 Arithmetic and logic unit design
4.25 Current Program Status Register design
4.26 Comparing the condition field with CPSR
4.27 Data Memory design
4.28 Store and load instructions
4.29 Data processing instruction binary encoding
4.30 Shift operation binary encoding
4.31 Single data transfer binary encoding
4.32 Branch instruction binary encoding
4.33 Algorithmic State Machine
4.34 Processor’s state transition
4.35 Datapath unit design
4.36 Top level design of datapath unit
4.37 Control unit design
4.38 Conceptual design of myAR
4.39 myARM core architecture
4.40 Test program
4.41 Output waveform of pipeline verification
4.42 Critical path delay
4.43 Resource Usage
5.1 ARM7 instruction set and format
LIST OF ABBREVIATIONS
RISC - Reduced Instruction Set Computer
CISC - Complex Instruction Set Computer
SoC - System-on-Chip
ISA - Instruction Set Architecture
PC - Program Counter
OPCODE - Operation Code
LIST OF APPENDICES
APPENDIX TITLE
A VERILOG CODE
CHAPTER 1
INTRODUCTION
This project report describes the design of a 32-bit, 5-stage pipelined RISC
processor based on the ARM instruction set architecture and format. This chapter
introduces the project, covering the research background, research motivation,
scope of work and report organization. The problem statement and the objective
that led to the implementation of this project are also discussed.
1.1 Research Background
The microprocessor is one of the greatest inventions of the 20th century. It
serves people's needs in daily work and communication, and nowadays people
communicate at any time and from anywhere [1]. As a result, people need
communication devices such as smartphones or tablet computers, and the high-end
applications running on those devices require a high-end computing system. RISC
is one of the simplest and yet most popular processor architectures in the
computing industry [1]. To deliver high-performance computing with low power
consumption and a small area, a microprocessor system that meets those
specifications must be designed.
In this project, a processor based on the RISC design approach is designed. The
design philosophy of a RISC processor is to reduce the complexity of the ISA by
limiting the instruction set to a smaller number of more frequently used
instructions, which yields better efficiency in modern computing [2].
In addition, RISC processor throughput is improved by a pipeline mechanism, and
the processor achieves high speed because its operations are performed on
registers. The RISC architecture was first introduced by IBM in 1975 [16].
However, RISC designs such as Berkeley's RISC processor and Stanford's MIPS
processor, introduced by their respective university research teams, gained
greater popularity as public RISC designs [16].
1.2 Project Background
The proposed RISC processor is based on the ARM processor core architecture and
is designed using Verilog HDL as the design entry. The design methodology
follows the hierarchical modularity of RTL design: each functional unit of the
processor is modeled in a behavioral style, and all functional blocks are then
integrated into a system using structural modeling, for both the datapath and
the control unit of the processor core.
A hardwired control approach is applied to design the control unit, as opposed
to the microprogrammed control approach of a conventional Complex Instruction
Set Computer (CISC) processor [11]. CISC processors have held a major share of
the computing market for decades [5]. They support various addressing modes and
data types. Their instructions are complex, and the instruction length varies
from one instruction to another [6]. A CISC processor also frequently accesses
data in external memory to execute its instructions, which is very slow [5].
In comparison, a RISC processor operates on very few data types, supports only a
few simple addressing modes, and executes only simple instructions [5]. It is
mostly register based: most instructions operate on data held in the register
file, the so-called register-to-register operation, which is faster than the
memory-to-memory operation of CISC [6]. Only load and store instructions access
memory. Furthermore, the RISC instruction length is fixed, so decoding an
instruction to generate the control signals is easier than with the CISC
microprogrammed decoding technique [6].
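The benefit of a fixed instruction length can be sketched in software: every
field sits at a known bit position, so decoding reduces to slicing the 32-bit
word. The Python model below is only an illustration and not the Verilog design
of this report. It assumes the classic 32-bit ARM field layout for the three
instruction classes covered by this project; the names bits and decode and the
dictionary keys are hypothetical.

```python
# Illustrative sketch only. Bit positions follow the classic 32-bit ARM
# encoding; bits(), decode() and the dictionary keys are hypothetical names.

def bits(word: int, hi: int, lo: int) -> int:
    """Return the unsigned value of word[hi:lo] (inclusive bit range)."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(word: int) -> dict:
    """Split one fixed-length 32-bit instruction word into named fields."""
    fields = {"cond": bits(word, 31, 28)}  # condition field, common to all classes
    if bits(word, 27, 26) == 0b00:
        # Simplification: multiply and other encodings that share these bits
        # in the full ARM instruction set are not distinguished here.
        fields.update(
            kind="data_processing",
            immediate=bits(word, 25, 25),
            opcode=bits(word, 24, 21),
            set_flags=bits(word, 20, 20),
            rn=bits(word, 19, 16),
            rd=bits(word, 15, 12),
            operand2=bits(word, 11, 0),
        )
    elif bits(word, 27, 26) == 0b01:
        fields.update(
            kind="single_data_transfer",
            load=bits(word, 20, 20),  # 1 = load, 0 = store
            rn=bits(word, 19, 16),
            rd=bits(word, 15, 12),
            offset=bits(word, 11, 0),
        )
    elif bits(word, 27, 25) == 0b101:
        fields.update(
            kind="branch",
            link=bits(word, 24, 24),  # 1 = branch with link
            offset=bits(word, 23, 0),
        )
    else:
        fields["kind"] = "unsupported"
    return fields


if __name__ == "__main__":
    # 0xE0810002 encodes ADD r0, r1, r2 in the classic ARM encoding.
    print(decode(0xE0810002))
```

Because no field position depends on an earlier field, a hardwired decoder can
extract all fields in parallel, which is the property the paragraph above
contrasts with variable-length CISC decoding.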
Parallel execution of instructions through the pipeline mechanism of the
processor improves the overall throughput [4]. The ARM architecture is used as
the guide for designing the RISC processor because of its success in many
embedded systems as well as in mobile computing. The SoC design could be used to
develop various embedded and mobile computing systems that are fast, occupy a
small die area, and consume little power [4].
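The throughput gain from pipelining can be estimated with a simple cycle count.
The sketch below is an idealized model, not a measurement of this design: it
assumes every stage takes one clock cycle and that no hazards or stalls occur.
The function names are hypothetical.

```python
# Idealized cycle-count model. Assumptions: one clock cycle per stage,
# no stalls, no hazards. All function names are hypothetical.

def cycles_unpipelined(n_instructions: int, stages: int = 5) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * stages


def cycles_pipelined(n_instructions: int, stages: int = 5) -> int:
    """The first instruction fills the pipeline, then one completes per cycle."""
    if n_instructions == 0:
        return 0
    return stages + n_instructions - 1


def speedup(n_instructions: int, stages: int = 5) -> float:
    """Ratio of unpipelined to pipelined cycle counts."""
    return cycles_unpipelined(n_instructions, stages) / cycles_pipelined(
        n_instructions, stages
    )


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, cycles_unpipelined(n), cycles_pipelined(n), round(speedup(n), 2))
```

Under these assumptions the speedup approaches the number of stages as the
instruction count grows; pipeline hazards reduce it in a real processor.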
1.3 Problem Statement
Nowadays, System-on-Chip (SoC) has become a realistic solution in various
application domains, such as cryptography, image processing and digital signal
processing, in embedded systems such as smartphones, portable gaming gadgets and
mobile computers. An SoC moves complex algorithms that demand heavy computation
into hardware, implemented as an Application Specific Instruction Set Processor
(ASISP) acting as a co-processor or, more specifically, as a hardware
accelerator for a processor core, which is faster than implementing them in
software on the same platform. The idea of this project is to bring heavy
computation and complex algorithms down from the software realm to the hardware
realm [6], where specific instructions can be performed faster, so that the
application depends mainly on hardware instead of software. This is done by
designing a processor core that is open to work with any ASIC design, forming a
System-on-Chip (SoC) [6].
1.4 Objective
The objective of this project is to study, design, and validate a 32-bit,
5-stage pipelined RISC processor based on the ARM instruction set architecture
and format. It covers the study of the ARM core datapath design and an
investigation of how the processor executes its instructions.
1.5 Scope of Work
The scope of work in this project covers the design of a 32-bit RISC processor
with a 5-stage pipeline that can execute three main types of ARM instructions:
data processing, single data transfer, and branch. The project covers design
entry using Verilog HDL and synthesis using the Altera Quartus II tool.
1.6 Project Schedule
The project was scheduled over two semesters, spanning eight months. The
hardware specification and requirements were derived in the first semester,
while the Verilog HDL coding, synthesis, compilation, simulation, and validation
of the processor design were done in the second semester.
1.7 Report Outline
This section outlines the content of each chapter of this project report.
Chapter 1 A brief introduction to the research and project background, the
objectives and scope of the project, and the organization of this project report.
Chapter 2 An introduction to instruction set architectures, the pipelining
mechanism, the basic functional units needed to design a RISC processor, and
previous work.
Chapter 3 The research and design methodology applied to design the processor,
and the tools used.
Chapter 4 A discussion of the processor design and the performance analysis.
Chapter 5 The conclusion and suggestions for future work.
1.8 Summary of Chapter 1
The research and project background, the objective of the project, the scope of
work, and the significance of this research have been discussed. The
organization of the project report has also been discussed.
REFERENCES
[1] R. de J. Romero-Troncoso; A. Ordaz-Moreno; J. A. Vite-Frias; A. Garcia-Perez
(2006). 8-bit CISC Microprocessor Core for Teaching Applications in the Digital
Systems Laboratory. Reconfigurable Computing and FPGA's, 2006. ReConFig 2006.
IEEE International Conference. Sept. 2006. San Luis Potosi. 1-5.
[2] Kwanghyun Cho; Jaebeom Kim; Euibong Jung; Sik Kim; Zhenmin Li; Young-Rae Cho;
Byeong Min; Kyu-Myung Choi (2008). Reusable Platform Design Methodology For SoC
Integration and Verification. SoC Design Conference, 2008. ISOCC '08.
International. 24-25 Nov. 2008. Busan. I-78 - I-81.
[3] Zhihui Xiong; Sikun Li; Jihua Chen; Dawei Wang (2004). A Platform Based SoC
Hardware/Software Co-Design Environment. Computer Supported Cooperative Work in
Design, 2004. Proceedings. The 8th International Conference. 26-28 May 2004.
Vol 2. 443-448.
[4] Muhammad Nadzir Bin Marsono (2001). VHDL Design of A 32-Bit RISC Processor
Core For FPGA Implementation. Master's Degree. Universiti Teknologi Malaysia,
Skudai.
[5] Hasan Krad, Aws Yousif Al-Taie (2007). A New Trend for CISC and RISC
Architecture. Department of Computer Science & Engineering, College of
Engineering, Qatar University.
[6] Hannibal (1999). RISC vs. CISC: The Post-RISC Era "A historical approach to
the debate". http://http://arstechnica.com/cpu/4q99/risc-cisc/rvc-2.html.
[7] Mark Smotherman, Manoj Franklin (1995). Improving CISC Instruction Decoding
Performance Using a Fill Unit. Microarchitecture, 1995. Proceedings of the 28th
Annual International Symposium. 29 Nov-1 Dec 1995. Ann Arbor, MI, USA. 219-229.
[8] Shiu, R.-M.; Chiu, J.-C.; Cheng, S.-K.; Shann, J.J.-J. (2000). Decoding of
CISC instruction in superscalar processors with high issue rate. Mar 2000.
Computers and Digital Techniques, IEE Proceedings. 147, Issue 2. 101-107.
[9] Breternitz, M.; Hum, H.; Peri, R.; Pickett, J.; Youfeng Wu (2005). Enhanced
Code Density of Embedded CISC Processor With Echo Technology. Hardware/Software
Codesign and System Synthesis, 2005. CODES+ISSS '05. Third IEEE/ACM/IFIP
International Conference. Sept. 2005. 160-165.
[10] Jun Zhang; Xiao-Ya Fan; Song-He Liu (2008). A Precise Interrupt Mechanism
Based on Micro-Operation Tracing of Instruction Boundary for Embedded Processor.
Networking, Architecture, and Storage, 2008. NAS '08. International Conference.
12-14 June 2008. 181-182.
[11] Alan Clements (2006). Principles of Computer Hardware, Fourth Edition.
Oxford University Press.
[12] Dr Mohamed Khalil Hani (2010). Starter’s Guide to Digital Systems VHDL &
Verilog Design, Revised Edition 2, July 2010. Desktop Publisher.
[13] Jurij Silc. Advanced Processor Architecture. http://csd.ijs.si/silc/.
[14] V. Heuring and H. Jordan (1997). Computer Systems Design and Architecture.
Prentice Hall.
[15] Jonie Lin Joo Nee (2008). The RTL Design of 32-bit 5-stage Pipeline RISC
Processor Using Verilog HDL. Master's Degree. Universiti Teknologi Malaysia,
Skudai.
[16] Koay Boon Wooi (2009). Development of Single Board Computer Based on 32-bit
5-stage Pipeline RISC Processor. Master's Degree. Universiti Teknologi Malaysia,
Skudai.
[17] ARM Limited (2001). ARM Open Access Datasheet. http://infocenter.arm.com.
[18] Stephen Brown & Zvonko Vranesic (2003). Fundamentals of Digital Logic with
Verilog Design, Second Edition. McGraw Hill.
[19] M. Rafiquzzaman (2005). Fundamentals of Digital Logic and Microprocessor
Design, Fifth Edition. John Wiley & Sons, Inc.
[20] S. Furber (2000). ARM System-on-Chip Architecture, Second Edition.