Full Custom Layout Design and FPGA
Implementation of an XOR Based 16-Bit
Carry Select Adder for Area, Delay and
Power Minimization
A. N. M. HOSSAIN
Department of Electrical and Electronic Engineering
Dhaka University of Engineering & Technology, Gazipur
May 2019
Submitted to the Department of Electrical and Electronic Engineering,
Dhaka University of Engineering & Technology, Gazipur in partial
fulfillment of the requirements for the award of degree of
Master of Science in
Electrical and Electronic Engineering
by
A. N. M. HOSSAIN
Student No. 112229-P
under the supervision of
Prof. Dr. Md. Anwarul Abedin
Professor, Dept. of EEE
Department of Electrical and Electronic Engineering
Dhaka University of Engineering & Technology, Gazipur
May 2019
The thesis titled “Full Custom Layout Design and FPGA Implementation of an XOR Based
16-Bit Carry Select Adder for Area, Delay and Power Minimization” submitted by
A. N. M. HOSSAIN, Student ID: 112229-P, Session: 2011-2012 has been accepted as
satisfactory in partial fulfillment of the requirement for the degree of Master of Science in Electrical and Electronic Engineering on May 13, 2019.
Board of Examiners
Dr. Md. Anwarul Abedin, Chairman
Professor (Supervisor)
Department of Electrical and Electronic Engineering
Dhaka University of Engineering & Technology, Gazipur
Dr. Md. Sharafat Hossain, Member
Professor and Head (Ex-officio)
Department of Electrical and Electronic Engineering
Dhaka University of Engineering & Technology, Gazipur
Dr. Md. Saifuddin Faruk, Member
Professor
Department of Electrical and Electronic Engineering
Dhaka University of Engineering & Technology, Gazipur
Dr. Md. Arifur Rahman, Member
Assistant Professor
Department of Electrical and Electronic Engineering
Dhaka University of Engineering & Technology, Gazipur
Dr. Syed Iftekhar Ali, Member
Professor (External)
Department of Electrical & Electronic Engineering
Islamic University of Technology, Gazipur
Declaration
I declare that this thesis is my own work and has not been submitted in any form for another
degree or diploma at any university or other institute of tertiary education. Information
derived from the published and unpublished work of others has been acknowledged in the
text and a list of references is given.
Signature of the candidate
A. N. M. Hossain Date: 13/05/2019
(Student ID: 112229-P)
Dedication
To my family
Acknowledgements
First of all, I thank the Almighty, who gave me the opportunity and strength to carry out this
research work.
I would like to express my sincere gratitude and profound indebtedness to my supervisor
Prof. Dr. Md. Anwarul Abedin for constant guidance, insightful advice, helpful criticism,
valuable suggestions, commendable support and endless patience towards the completion of
this thesis. I feel very proud to have worked with him. Without his inspiring enthusiasm and
encouragement this work could not have been completed.
I am deeply indebted and grateful to Dr. Md. Sharafat Hossain, Professor and Head,
Department of EEE, DUET, Gazipur, for his support throughout the thesis work.
I would like to thank the members of my thesis examination committee, Dr. Md. Saifuddin
Faruk, Dr. Md. Arifur Rahman and Dr. Syed Iftekhar Ali, for their constructive suggestions on
future improvement of my work.
I thank all my teachers and the staff of the Department of EEE, DUET, Gazipur for their
support and encouragement.
I wish to express my gratitude to DUET, Gazipur for providing an excellent environment for
research. The support I have received from DUET, Gazipur is gratefully acknowledged.
I would like to express my most sincere gratitude to my family, my friends and well-wishers,
who have taken great pains for my progress in life, for their sacrifices, blessings and
constant prayers for my advancement.
Finally, I am also thankful to those who have directly or indirectly helped and encouraged
me to complete my thesis. I am sorry that I am not able to express my appreciation to each of
my well-wishers individually, and I ask forgiveness for any improper behavior towards
anyone who intended to help me.
Abstract
Adders are among the most widely used components in processors and other digital circuits.
Low-power, area-efficient, high-speed circuits are one of the most significant areas of research
in VLSI design. The carry select adder is one of the fast adders, with less area and reduced power
consumption. In this thesis, a 16-bit carry select adder is presented that uses a modified
XOR based full adder to reduce circuit complexity, area and delay. The modified full adder
design requires only two XOR gates and one multiplexer. The modified 16-bit carry select
adder gives better results than the conventional carry select adder with respect to area, power
consumption and delay.
The pass transistor based 2×1 MUX is used in the semi custom and full custom designs. Here only
6 transistors are needed in place of the 20 used in the conventional 2×1 MUX. The full custom
2×1 MUX design achieves a reduction of 85.12% in area, 94.07% in power and 70% in total
MOSFET count over the conventional design, whereas it achieves a reduction of 92% in area and
84.89% in power over the semi custom design.
The XOR based full custom 1-bit full adder design achieves a reduction of 60.4% in area, 28.4%
in power, 54.5% in delay and 40% in total MOSFET count over the conventional design, whereas it
achieves a reduction of 78.3% in area, 50.3% in power and 50.8% in delay over the semi custom
design. To design a 16-bit Carry Select Adder (CSLA), 4-bit and 8-bit CSLAs have been designed
first. For the XOR based 16-bit CSLA, the full custom design achieves a reduction of 84.3% in
area, 70.6% in power and 56.5% in delay over the conventional design, whereas it achieves a
reduction of 87.0% in area, 20.1% in power and 55.2% in delay over the semi custom design.
To implement the CSLA in FPGA, Verilog code for the 1-bit adder, 4-bit CSLA, 8-bit CSLA
and 16-bit CSLA has been written. The code is then simulated in ModelSim to check the
functionality of the design. Once the simulation results are correct, synthesis and power
analysis are done in Quartus II. Finally, the 1-bit adder, 4-bit CSLA and 8-bit CSLA are
implemented on the Altera DE2-115 FPGA board. Due to pin constraints, only designs up to the
8-bit CSLA have been implemented on the board.
Table of Contents
Declaration
Dedication
Acknowledgement
Abstract
List of Figures
List of Tables
Chapter 1
Introduction
1.1 Introduction
1.2 Literature Review
1.3 Objective of Thesis
1.4 Organization of Thesis
Chapter 2
Adder Topologies
2.1 Ripple Carry Adder
2.2 Carry Look-ahead Adder
2.3 Carry Save Adder
2.4 Carry Skip Adder
2.5 Carry Select Adder
2.5.1 Uniform sized CSLA
2.5.2 Variable sized CSLA
Chapter 3
Semi Custom Design
3.1 Conventional 2×1 Multiplexer
3.2 Pass Transistor Based 2×1 Multiplexer
3.3 Conventional Full Adder
3.4 XOR Based Full Adder
3.5 Carry Select Adder (CSLA)
3.5.1 XOR Based 4-Bit CSLA
3.5.2 XOR Based 8-Bit CSLA
3.5.3 XOR Based 16-Bit CSLA
Chapter 4
Full Custom Design
4.1 Inverter or NOT Gate
4.2 Two-input AND Gate
4.3 Two-input XOR Gate
4.4 2×1 MUX
4.5 XOR Based Full Adder
4.6 XOR Based 4-bit CSLA
4.7 XOR Based 8-bit CSLA
4.8 XOR Based 16-bit CSLA
Chapter 5
Performance Analysis
5.1 Comparison of Different 2×1 MUX
5.2 Comparison of Different Full Adder
5.3 Comparison of 4-bit CSLA
5.4 Comparison of 8-bit CSLA
5.5 Comparison of 16-bit CSLA
5.6 Comparison with Other Work
Chapter 6
FPGA Implementation of CSLA
6.1 Introduction to FPGA
6.2 Altera DE2-115 FPGA Board
6.3 FPGA Implementation of Conventional 1-bit Full Adder
6.4 FPGA Implementation of 4-bit CSLA
6.5 FPGA Implementation of 8-bit CSLA
6.6 FPGA Implementation of 16-bit CSLA
Chapter 7
Conclusion and Future Recommendation
7.1 Conclusion
7.2 Future Recommendation
References
List of Figures
Fig. 2.1 Ripple carry adder
Fig. 2.2 Carry look-ahead adder
Fig. 2.3 Carry save adder
Fig. 2.4 Carry skip adder
Fig. 2.5 Carry select adder
Fig. 2.6 Uniform sized 16-bit carry select adder
Fig. 2.7 Variable sized 16-bit carry select adder
Fig. 3.1 Schematic circuit of 2×1 Multiplexer
Fig. 3.2 Layout of Conventional 2×1 Multiplexer
Fig. 3.3 Input-Output wave shapes of conventional 2×1 MUX
Fig. 3.4 Pass transistor based 2×1 MUX
Fig. 3.5 Layout of pass transistor based 2×1 MUX
Fig. 3.6 Input-Output wave shapes of pass transistor based 2×1 MUX
Fig. 3.7 Conventional 1-bit full adder
Fig. 3.8 Layout of the conventional 1-bit full adder
Fig. 3.9 Input-Output wave shapes of conventional 1-bit full adder
Fig. 3.10 XOR based 1-bit full adder
Fig. 3.11 Layout of XOR based 1-bit full adder
Fig. 3.12 Input-Output wave shapes of XOR based 1-bit full adder
Fig. 3.13 XOR based 4-bit CSLA
Fig. 3.14 Layout of the XOR based 4-bit CSLA
Fig. 3.15 Input-Output wave shapes of XOR based 4-bit CSLA
Fig. 3.16 XOR based 8-bit CSLA
Fig. 3.17 Layout of XOR based 8-bit CSLA
Fig. 3.18 Input-Output wave shapes of XOR based 8-bit CSLA
Fig. 3.19 XOR based 16-bit CSLA
Fig. 3.20 Layout of XOR based 16-bit CSLA
Fig. 3.21 Input-Output wave shapes of XOR based 16-bit CSLA
Fig. 4.1 VLSI design flow
Fig. 4.2 Full custom layout of NOT gate
Fig. 4.3 Input-Output wave shapes of NOT gate
Fig. 4.4 2-input AND gate symbol
Fig. 4.5 Full custom layout of 2-input AND gate
Fig. 4.6 Input-Output wave shapes of 2-input AND gate
Fig. 4.7 Schematic design of 2-input XOR gate
Fig. 4.8 Full custom layout of 2-input XOR gate
Fig. 4.9 Input-Output wave shapes of 2-input XOR gate
Fig. 4.10 Schematic circuit of 2×1 MUX
Fig. 4.11 Full custom layout of 2×1 MUX
Fig. 4.12 Input-Output wave shapes of the 2×1 MUX
Fig. 4.13 XOR based 1-bit full adder
Fig. 4.14 Full custom layout of an XOR based 1-bit full adder
Fig. 4.15 Input-Output wave shapes of XOR based 1-bit full adder
Fig. 4.16 Full custom layout of XOR based 4-bit CSLA
Fig. 4.17 Input-Output wave shapes of XOR based 4-bit CSLA
Fig. 4.18 Full custom layout of XOR based 8-bit CSLA
Fig. 4.19 Input-Output wave shapes of XOR based 8-bit CSLA
Fig. 4.20 Full custom layout of XOR based 16-bit CSLA
Fig. 4.21 Input-Output wave shapes of XOR based 16-bit CSLA
Fig. 5.1 Area, Power and Delay comparison of 2×1 MUX
Fig. 5.2 Area, Power and Delay comparison of 1-bit Full Adder
Fig. 5.3 Area, Power and Delay comparison of 4-bit CSLA
Fig. 5.4 Area, Power and Delay comparison of 8-bit CSLA
Fig. 5.5 Area, Power and Delay comparison of 16-bit CSLA
Fig. 6.1 FPGA Architecture
Fig. 6.2 Altera DE2-115 FPGA Board
Fig. 6.3 Block diagram of the Altera DE2-115 FPGA Board
Fig. 6.4 Verilog code of 1-bit full adder
Fig. 6.5 Simulation waveform results of conventional 1-bit full adder
Fig. 6.6 Synthesis summary of the conventional 1-bit full adder
Fig. 6.7 Power analysis of the conventional 1-bit full adder
Fig. 6.8 Implementation of the conventional 1-bit full adder in FPGA
Fig. 6.9 Verilog code of 4-bit CSLA
Fig. 6.10 Simulation waveform results of 4-bit CSLA (binary)
Fig. 6.11 Simulation waveform results of 4-bit CSLA (decimal)
Fig. 6.12 Synthesis summary of 4-bit CSLA
Fig. 6.13 Power analysis of 4-bit CSLA
Fig. 6.14 FPGA implementation of 4-bit CSLA
Fig. 6.15 Verilog code of 8-bit CSLA
Fig. 6.16 Simulation waveform results of 8-bit CSLA (binary)
Fig. 6.17 Simulation waveform results of 8-bit CSLA (decimal)
Fig. 6.18 Synthesis summary of 8-bit CSLA
Fig. 6.19 Power analysis of 8-bit CSLA
Fig. 6.20 FPGA implementation result of 8-bit CSLA
Fig. 6.21 Verilog code of 16-bit CSLA
Fig. 6.22 Simulation waveform results of 16-bit CSLA (binary)
Fig. 6.23 Simulation waveform results of 16-bit CSLA (decimal)
Fig. 6.24 Synthesis summary of 16-bit CSLA
Fig. 6.25 Power analysis of 16-bit CSLA
List of Tables
Table 3.1 Area, Delay and Power Dissipation of the conventional 2×1 MUX
Table 3.2 Area, Delay and Power Dissipation of the pass transistor based 2×1 MUX
Table 3.3 Area, Delay and Power Dissipation of conventional 1-bit full adder
Table 3.4 Area, Delay and Power Dissipation of XOR based 1-bit full adder
Table 3.5 Area, Delay and Power Dissipation of XOR based 4-bit CSLA
Table 3.6 Area, Delay and Power Dissipation of XOR based 8-bit CSLA
Table 3.7 Area, Delay and Power Dissipation of XOR based 16-bit CSLA
Table 4.1 Area, Delay and Power Dissipation of NOT gate
Table 4.2 Area, Delay and Power Dissipation of 2-input AND gate
Table 4.3 Area, Delay and Power Dissipation of 2-input XOR gate
Table 4.4 Area, Delay and Power Dissipation of the 2×1 MUX
Table 4.5 Area, Delay and Power Dissipation of the 1-bit full adder
Table 4.6 Area, Delay and Power Dissipation of the XOR based 4-bit CSLA
Table 4.7 Area, Delay and Power Dissipation of the XOR based 8-bit CSLA
Table 4.8 Area, Delay and Power Dissipation of the XOR based 16-bit CSLA
Table 5.1 Performance analysis of Conventional, Semi and Full Custom 2×1 MUX
Table 5.2 Performance analysis of Conventional, Semi and Full Custom 1-bit Full Adder
Table 5.3 Performance analysis of Conventional, Semi and Full Custom 4-bit CSLA
Table 5.4 Performance analysis of Conventional, Semi and Full Custom 8-bit CSLA
Table 5.5 Performance analysis of Conventional, Semi and Full Custom 16-bit CSLA
Table 5.6 Performance analysis of Comparison with Other Work
Table 6.1 Summary of all results obtained from the synthesis and power analysis using Quartus II software
Chapter 1
Introduction
1.1 Introduction
Adders are among the most widely used components in digital integrated circuit design. They have
special significance in VLSI design and are used in computers and many other processors. In the
rapidly growing mobile industry, speed is not the only concern; smaller area and lower power
have also become major concerns in the design of digital circuits [1]. The design of low-power,
area-efficient, high-speed data path logic systems is one of the most significant fields of
research in VLSI design. In mobile electronics, reducing area and power consumption is a key
factor in increasing portability and battery life. High-speed addition and multiplication
have always been fundamental requirements of high-performance processors and systems
[2]. A number of fast adders can be used for addition. Addition is the heart of computer
arithmetic, and the arithmetic unit is often the workhorse of a computational circuit.
Designing power-efficient, high-performance adders is one of the major concerns as far as
VLSI subsystems are concerned. Adders are a necessary component of a data path, e.g. in a
microprocessor or a signal processor. In digital adders, the speed of
addition is limited by the time required to propagate a carry through the adder. The sum for
each bit position in an elementary adder is generated sequentially only after the previous bit
position has been summed and a carry propagated into the next position.
Many adder designs are available in the literature, such as the Ripple Carry Adder
(RCA), Carry Look-Ahead Adder (CLA), Carry Select Adder (CSLA) and Carry Skip Adder
(CSkA), each with its own advantages and disadvantages. The CSLA is one of the fastest
adders, with less area and power consumption. The CSLA consists of two multiplexed RCAs
that operate in parallel with carry Cin = 0 and Cin = 1; the final sum is then selected
through a multiplexer. In the conventional CSLA, full adders based on XOR, AND and OR gates are
used. These adders occupy more chip area because of the large number of transistors in
the gates, and they have higher delay and power consumption [3]. In this work, an XOR
based modified full adder is used as the building block of the modified CSLA to
reduce area, delay and power consumption. The layout of the 16-bit CSLA is designed in
Microwind and the design is then implemented on an Altera DE2-115 FPGA board.
The results obtained from the layout and the hardware are compared with the conventional RCA
and CSLA.
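The XOR based full adder is described in this thesis as needing only two XOR gates and one multiplexer. A minimal behavioral sketch in Python is given below. The wiring is an assumption made for illustration (the first XOR output selects between input A and the carry-in), and the names mux2 and xor_full_adder are hypothetical; it is not the schematic of the thesis.

```python
from itertools import product


def mux2(sel, d0, d1):
    """2x1 multiplexer: returns d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0


def xor_full_adder(a, b, cin):
    """Full adder from two XORs and one 2x1 MUX (assumed wiring)."""
    p = a ^ b                # first XOR: propagate signal
    s = p ^ cin              # second XOR: sum bit
    # p == 0 means a == b, so the carry-out equals a;
    # p == 1 means exactly one input is 1, so the carry-in is passed on.
    cout = mux2(p, a, cin)
    return s, cout


# Exhaustive check against integer addition for all eight input combinations.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = xor_full_adder(a, b, cin)
    assert 2 * cout + s == a + b + cin
```

The multiplexer replaces the AND/OR carry logic of the conventional full adder, which is where the reduction in gate count would come from under this wiring.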
1.2 Literature Review
Efficient, high-performance VLSI systems are increasingly used in
portable and mobile devices, multi-standard wireless receivers and biomedical
instrumentation [4], [5]. An adder is the main component of an arithmetic unit. A complex
digital signal processing (DSP) system involves several adders, so an efficient adder design
essentially improves the performance of a complex DSP system. An RCA uses a simple design,
but the carry propagation delay (CPD) is the main concern in this adder. Carry look-ahead and
carry select (CS) methods have been suggested to reduce the CPD of adders. A conventional
CSLA is an RCA-RCA configuration that generates a pair of sum words and output-carry bits
corresponding to the anticipated input carry (Cin = 0 and 1) and selects one out of each pair
for the final sum and final output carry [6].
A conventional CSLA has less CPD than an RCA, but the design is not attractive since it uses
a dual RCA. A few attempts have been made to avoid the dual use of RCAs in CSLA design. Kim
and Kim [7] used one RCA and one add-one circuit instead of two RCAs, where the add-one
circuit is implemented using a multiplexer (MUX). He et al. [8] proposed a square-root
(SQRT)-CSLA to implement large bit-width adders with less delay. In a SQRT-CSLA,
CSLAs of increasing size are connected in a cascading structure. The main objective of
the SQRT-CSLA design is to provide a parallel path for carry propagation that helps to reduce
the overall adder delay. Ramkumar and Kittur [9] suggested a binary to excess-1
converter (BEC)-based CSLA. The BEC-based CSLA involves fewer logic resources than the
conventional CSLA, but it has marginally higher delay. A CSLA based on common Boolean
logic (CBL) is also proposed in [10] and [11]. The CBL-based CSLA of [10] involves
significantly fewer logic resources than the conventional CSLA, but it has a longer CPD, which is
almost equal to that of the RCA.
To overcome this problem, a SQRT-CSLA based on CBL was proposed in [11]. However,
the CBL-based SQRT-CSLA design of [11] requires more logic resources and delay than the
BEC-based SQRT-CSLA of [9]. These adders need a large chip area because of the large number
of transistors used in the gates, and they have higher delay and power consumption. Therefore,
there is a need to design an efficient CSLA that is less complex, occupies less area, consumes
less power and has minimal delay.
1.3 Objective of Thesis
a. To design the full custom layout of the XOR based 16-bit CSLA.
b. To implement the CSLA in FPGA.
c. To compare the area, delay and power consumption with the conventional design.
1.4 Organization of Thesis
The introduction, literature review and objectives of the thesis are given in Chapter 1. Chapter 2
describes different adder topologies, namely the RCA, CLA, CSA, CSkA and CSLA. Chapter 3
describes the semi custom design of the conventional full adder, the 2×1 multiplexer, the XOR
based full adder, the basic concept of the carry select adder and the XOR based 4-bit, 8-bit and
16-bit CSLA, with schematic diagrams, Verilog code and layout diagrams. The full custom layout
design of the NOT gate, 2-input AND gate, 2-input XOR gate, 2×1 MUX and the 4-bit, 8-bit and
16-bit XOR based CSLA, with input-output wave shapes, is described in Chapter 4.
Chapter 5 compares the performance of the conventional, semi custom and full custom
2×1 MUX, full adder and 4-bit, 8-bit and 16-bit CSLA. Here area, delay, power
dissipation, gate count, IDD(Max) and IDD(Avg) are compared. Chapter 6 describes the
FPGA implementation of the full adder and the 4-bit, 8-bit and 16-bit CSLA, with simulation
results in ModelSim. The conclusion and future recommendations are given in Chapter 7.
Chapter 2
Adder Topologies
The design of various adders, namely the RCA, CLA, CSA, CSkA and CSLA, is discussed in
this chapter. Each adder is named after the way the carry propagates between
its stages. The advantages and distinguishing characteristics of each adder in terms of area,
delay and power are also discussed.
2.1 Ripple Carry Adder
The Ripple Carry Adder (RCA) is the basic adder and works on the basic
addition principle. It is a cascade of full adders (FA) in series. Each full adder
block takes three inputs (two operand bits and a carry-in bit) and produces two outputs, a
sum bit and a carry-out bit; the carry-out of one full adder acts as the carry-in of the next
full adder. Hence, the carry is propagated serially [12], and the delay grows as the
number of bits increases. An n-bit RCA requires n full adders.
The block diagram of an n-bit ripple carry adder is shown in Fig. 2.1. The RCA is slow
when the word length is large, because in the worst case the propagation time from C0 to Cn
includes all the carry bits. The worst-case delay of the RCA occurs when a carry signal transition
ripples through all stages of the adder chain from the least significant bit to the most significant
bit, and is approximated by:
$$t_{adder} \approx (n-1)\,t_{carry} + t_{sum} \qquad (2.1)$$
where $t_{carry}$ is the delay through the carry stage of a full adder and $t_{sum}$ is the delay
to compute the sum of the last stage. The delay of the ripple carry adder is linearly proportional
to n, the number of bits; therefore the performance of the RCA is limited as n grows.
The disadvantage of the RCA is the long delay due to the propagation of the carry from the low
to the high order stages.
The advantages of the RCA are lower power consumption and a compact layout, giving a
smaller chip area and a lower gate count.
Fig. 2.1: Ripple carry adder
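The serial carry propagation described above can be sketched as a small behavioral model in Python. The function names are illustrative, and the delay helper assumes the usual approximation of (n - 1) carry-stage delays plus one sum delay; the delay values passed to it are arbitrary units, not measured results.

```python
def full_adder(a, b, cin):
    """Conventional 1-bit full adder built from XOR, AND and OR operations."""
    s = a ^ b ^ cin
    cout = (a & b) | ((a ^ b) & cin)
    return s, cout


def ripple_carry_add(x, y, cin=0, n=4):
    """Add two n-bit integers one bit at a time; returns (sum, carry_out)."""
    total, carry = 0, cin
    for i in range(n):
        # The carry-out of stage i becomes the carry-in of stage i + 1.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


def rca_worst_case_delay(n, t_carry, t_sum):
    """Worst-case delay estimate: (n - 1) carry stages plus the last sum."""
    return (n - 1) * t_carry + t_sum


print(ripple_carry_add(0b1001, 0b0111))   # 9 + 7 = 16, so (0, 1) in 4 bits
print(rca_worst_case_delay(16, 1.0, 1.0))
```

Doubling n roughly doubles the estimate returned by rca_worst_case_delay, which is the linear growth that motivates the faster topologies in the following sections.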
2.2 Carry Look-ahead Adder
The Carry Look-ahead Adder (CLA) is a type of adder used in digital logic. The
disadvantage of the RCA is that it is very slow when many bits are added. The CLA solves this
problem by pre-calculating the carry signals from the input signals. It is based on the
fact that a carry signal is generated in two cases:
Case 1: When both bits Ai and Bi are 1.
Case 2: When one of the two bits is 1 and the carry from the previous stage is 1.
Thus, it can be written as
$$C_{i+1} = A_i B_i + (A_i \oplus B_i)\,C_i \qquad (2.2)$$
Equation (2.2) can also be written as
$$C_{i+1} = G_i + P_i C_i \qquad (2.3)$$
where $G_i = A_i B_i$ and $P_i = A_i \oplus B_i$ are called the Generate and Propagate terms,
respectively.
Assume that the delay through an AND gate is one gate delay and the delay through an XOR
gate is two gate delays. The Propagate and Generate terms depend only on the input bits and
are thus valid after two and one gate delays, respectively. If eqn. (2.3) is used to calculate
the carry signals, one does not need to wait for the carry to ripple through all the previous
stages to find its proper value. Hence, the carry for each bit is computed independently. As an
example, for a 4-bit adder the carry bits are given by eqns. (2.4) to (2.7).
$$C_1 = G_0 + P_0 C_0 \qquad (2.4)$$
$$C_2 = G_1 + P_1 G_0 + P_1 P_0 C_0 \qquad (2.5)$$
$$C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \qquad (2.6)$$
$$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0 \qquad (2.7)$$
The carry-out bit of the last stage, Ci+1, given in general form in eqn. (2.8), is available after
four gate delays (two gate delays to calculate the Propagate signal and two delays from the
AND and OR gates). In this way, the carry of an n-bit carry look-ahead adder can be
written as
$$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \cdots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_0 C_0 \qquad (2.8)$$
The Sum signal can be calculated as in eqn. (2.9):
$$S_i = A_i \oplus B_i \oplus C_i = P_i \oplus C_i \qquad (2.9)$$
The Sum bit is thus available after two additional gate delays (due to the XOR gate), or a
total of six gate delays after the input signals Ai and Bi have been applied. The advantage is
that these delays are the same regardless of the number of bits to be added, in
contrast to the ripple carry adder. A 4-bit carry look-ahead adder is shown in Fig. 2.2.
Fig. 2.2: Carry look-ahead adder
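As an illustration of the look-ahead idea, the following Python sketch computes the four carries of a 4-bit adder directly from the generate and propagate terms, with no carry rippling from stage to stage. The function name cla_add4 and the list-based signal names are assumptions made for this example.

```python
def cla_add4(x, y, c0=0):
    """4-bit carry look-ahead addition; returns (sum, carry_out)."""
    a = [(x >> i) & 1 for i in range(4)]
    b = [(y >> i) & 1 for i in range(4)]
    g = [a[i] & b[i] for i in range(4)]   # generate: both input bits are 1
    p = [a[i] ^ b[i] for i in range(4)]   # propagate: exactly one bit is 1

    # Every carry is a two-level AND-OR function of g, p and c0 only.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    # Each sum bit is the propagate term XOR the carry into that position.
    total = sum((p[i] ^ carries[i]) << i for i in range(4))
    return total, c4


# Exhaustive check against integer addition.
for x in range(16):
    for y in range(16):
        for c in (0, 1):
            assert cla_add4(x, y, c) == ((x + y + c) & 0xF, (x + y + c) >> 4)
```

The carry expressions grow by one product term per bit position, which is why practical look-ahead adders are usually built from small groups rather than one wide block.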
2.3 Carry Save Adder
A carry save adder (CSA) is used to compute the sum of three or more n-bit numbers in
binary format. It is widely used in the final stages of fast multipliers for summing the partial
products to give the final value [13]-[14]. A carry save adder is shown in Fig. 2.3. It is an
n-bit adder that does not connect the carries: it is simply a parallel ensemble of n full
adders without any horizontal connection, and the carry is saved as an output rather than
propagated to the next higher-order adder. The latency of a carry save adder is the same as
that of a full adder. The propagation delay is independent of the number of bits, and the
amount of circuitry is less than that of a carry select adder.
The advantage of the carry save adder is that the sum is computed faster than in the conventional
RCA. The carry save adder is better than the conventional carry select adder in terms of area
and power consumption, although it is slower than the carry select adder.
However, the CSA has a disadvantage. It does not actually solve the problem of adding two
integers and producing a single output. Instead, it adds three integers and produces two, such
that the sum of these two is equal to the sum of the three inputs.
Fig. 2.3: Carry save adder
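The three-in, two-out behavior of the carry save adder can be sketched in a few lines of Python. Python integers are used here as bit vectors, so each bitwise operation acts on all bit positions at once, which mirrors the row of independent full adders; the function name is illustrative.

```python
def carry_save_add(x, y, z):
    """Reduce three integers to a (sum_vector, carry_vector) pair."""
    sum_vector = x ^ y ^ z                      # per-bit sum, no carry chain
    carry_bits = (x & y) | (y & z) | (x & z)    # per-bit carry-out (majority)
    # Each saved carry has the weight of the next higher bit position.
    return sum_vector, carry_bits << 1


s, c = carry_save_add(5, 3, 6)
print(s, c)        # the two output words
print(s + c)       # one ordinary addition still gives 5 + 3 + 6
```

A conventional adder is still needed to combine the two output words into one result, which is the disadvantage noted above.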
2.4 Carry Skip Adder
A carry skip adder (CSkA) consists of a simple ripple carry adder with a special
speed-up carry chain called a skip chain. To speed up the operation, the carry is
skipped to position i without waiting for it to ripple. A carry skip adder reduces the carry
propagation time by skipping over groups of consecutive adder stages. The carry skip adder
is usually comparable in speed to the carry look-ahead technique, but it requires less chip area
and consumes less power. To implement a carry skip adder, the stages are divided into r-bit
blocks with a simple carry scheme. Carry skip logic is added to each block to detect when the
carry-in of the block can be passed directly to the next block. In each block, a ripple carry
adder is used to produce the sum and carry-out bits. Every block generates a block propagate
and a block generate signal. For a given column, the CSkA uses the carry-out equation in
terms of the carry-in signal:
$$C_{i+1} = G_i + P_i C_i \qquad (2.10)$$
The Generate and Propagate signals used by the carry skip adder are:
$$G_i = A_i B_i \qquad (2.11)$$
$$P_i = A_i \oplus B_i \qquad (2.12)$$
From eqn. (2.10), it can be seen that setting the carry-in signal of a block to zero causes the
carry-out to serve as a block generate signal. In addition, an r-bit AND gate is used to
form the block propagate signal. The block generate and block propagate signals produce the
input carry to the next block [15]. Fig. 2.4 shows the 8-bit carry skip adder using 2-bit
blocks of RCA.
Fig. 2.4: Carry skip adder
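The block structure described above can be sketched as a behavioral Python model of an 8-bit adder with 2-bit blocks. The skip multiplexer is modeled explicitly; in a functional model it returns the same value as the rippled carry, so its benefit is in timing only. The function and variable names are assumptions for illustration.

```python
def carry_skip_add(x, y, cin=0, n=8, r=2):
    """n-bit carry skip addition with r-bit blocks; returns (sum, carry_out)."""
    total, carry = 0, cin
    for base in range(0, n, r):
        block_in = carry          # carry entering this block
        block_p = 1               # AND of all propagate bits in the block
        c = block_in
        for i in range(base, min(base + r, n)):
            a, b = (x >> i) & 1, (y >> i) & 1
            p = a ^ b
            total |= (p ^ c) << i
            c = (a & b) | (p & c)     # ripple carry inside the block
            block_p &= p
        # Skip logic: when every bit propagates, the block carry-in is
        # forwarded directly instead of waiting for the ripple chain.
        carry = block_in if block_p else c
    return total, carry


print(carry_skip_add(0xFF, 0x01))   # (0, 1): the carry reaches the output
```

With all propagate bits set in a block, the hardware path for the carry is a single multiplexer per block rather than r full adders.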
2.5 Carry Select Adder
A carry select adder (CSLA) is divided into sectors, each of which (except for
the least significant) performs two additions in parallel, one assuming a carry-in of zero, the
other a carry-in of one. A four-bit CSLA generally consists of two ripple carry adders and a
multiplexer. The carry select adder is simple but rather fast, having a gate level depth of
$O(\sqrt{n})$. Adding two n-bit numbers with a CSLA is done with two ripple carry
adders in order to perform the calculation twice, once with the assumption of the carry
being zero and once assuming one. After the two results are calculated, the correct sum
and the correct carry-out are selected with the multiplexer once the actual carry-in is
known. The design schematic of the CSLA is shown in Fig. 2.5. Here, two 4-bit ripple carry
adders are multiplexed together, and the resulting carry and sum bits are selected by the
carry-in. Since one ripple carry adder assumes a carry-in of 0 and the other assumes a
carry-in of 1, selecting the adder with the correct assumption via the actual carry-in yields the
desired result. A carry select adder is 40% to 90% faster than an RCA because it performs
additions in parallel and reduces the maximum carry path [3].
Fig. 2.5: Carry select adder
The CSLA is divided into two types: uniform sized and variable sized. If the
bit length is divided equally, the adder is called a uniform sized adder, also known as a
linear CSLA. In a variable sized adder the bit length is divided unequally.
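A single carry select block, as described above, can be modeled in Python as two ripple carry additions followed by a selection on the actual carry-in. This is a behavioral sketch with illustrative names; in hardware the two additions run at the same time, whereas here they are simply computed one after the other.

```python
def ripple_carry(x, y, cin, n=4):
    """n-bit ripple carry addition; returns (sum, carry_out)."""
    total, c = 0, cin
    for i in range(n):
        a, b = (x >> i) & 1, (y >> i) & 1
        total |= (a ^ b ^ c) << i
        c = (a & b) | ((a ^ b) & c)
    return total, c


def mux2(sel, d0, d1):
    """2x1 multiplexer: d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def carry_select_block(x, y, cin, n=4):
    """One CSLA block: both results are precomputed, cin only selects."""
    s0, c0 = ripple_carry(x, y, 0, n)   # result assuming carry-in = 0
    s1, c1 = ripple_carry(x, y, 1, n)   # result assuming carry-in = 1
    return mux2(cin, s0, s1), mux2(cin, c0, c1)


print(carry_select_block(0b1010, 0b0101, 1))   # 10 + 5 + 1 = 16 -> (0, 1)
```

Because cin is used only at the multiplexers, a late-arriving carry-in adds one MUX delay to the block instead of a full ripple chain.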
2.5.1 Uniform sized CSLA
A 16-bit carry select adder with a uniform block size of 4 can be created with three carry
select blocks and one 4-bit ripple carry adder. Since the carry-in is known at the beginning of
the computation, a carry select block is not needed for the first four bits. The delay of this adder
is four full adder delays plus three MUX delays. A uniform sized 16-bit carry select adder
is shown in Fig. 2.6.
Fig. 2.6: Uniform sized 16-bit carry select adder
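The uniform 16-bit arrangement can be sketched as follows: one plain 4-bit ripple carry adder for the least significant bits, followed by three 4-bit carry select blocks. This is a behavioral Python model with illustrative names, intended only to show how the blocks are chained.

```python
def ripple(x, y, cin, n=4):
    """n-bit ripple carry addition; returns (sum, carry_out)."""
    total, c = 0, cin
    for i in range(n):
        a, b = (x >> i) & 1, (y >> i) & 1
        total |= (a ^ b ^ c) << i
        c = (a & b) | ((a ^ b) & c)
    return total, c


def csla16_uniform(x, y, cin=0):
    """16-bit carry select adder with uniform 4-bit blocks."""
    mask = 0xF
    # Bits 3..0: the carry-in is known, so a plain RCA is enough.
    result, carry = ripple(x & mask, y & mask, cin)
    for k in range(1, 4):
        xa, ya = (x >> (4 * k)) & mask, (y >> (4 * k)) & mask
        s0, c0 = ripple(xa, ya, 0)      # speculative result for carry 0
        s1, c1 = ripple(xa, ya, 1)      # speculative result for carry 1
        s, carry = (s1, c1) if carry else (s0, c0)   # MUX on the real carry
        result |= s << (4 * k)
    return result, carry


print(csla16_uniform(0x1234, 0x4321))   # (0x5555, 0) printed in decimal
print(csla16_uniform(0xFFFF, 0x0001))   # (0, 1)
```

The loop body contains no dependence on the real carry until the selection step, which corresponds to the three MUX delays in the delay count above.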
2.5.2 Variable sized CSLA
A 16-bit carry select adder with variable block sizes can be created similarly. Here we show
an adder with block sizes of 2-2-3-4-5. This break-up is ideal when the full adder delay is
equal to the MUX delay, which is unlikely in practice. The total delay is two full adder
delays plus four MUX delays. The block sizes are chosen so that the delay through the two
carry chains of a block equals the arrival time of the carry from the previous stage. The
variable sized 16-bit carry select adder is shown in Fig. 2.7.
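A short worked timing, under the idealized assumption $t_{FA} = t_{MUX} = t$ stated above, shows why the 2-2-3-4-5 break-up is balanced. The symbols $t_{FA}$, $t_{MUX}$, $t$ and $t_{c,k}$ (time at which the carry out of block $k$ is valid) are illustrative and not taken from the original text.

```latex
% Block k has its two speculative carry chains ready after (block size) x t.
% Its MUX can switch once the carry of block k-1 has arrived.
\begin{aligned}
\text{block 1 (2 bits, plain RCA):}\quad & t_{c,1} = 2t \\
\text{block 2 (2 bits):}\quad & \text{chains ready at } 2t,\ \text{select at } 2t \;\Rightarrow\; t_{c,2} = 3t \\
\text{block 3 (3 bits):}\quad & \text{chains ready at } 3t,\ \text{select at } 3t \;\Rightarrow\; t_{c,3} = 4t \\
\text{block 4 (4 bits):}\quad & \text{chains ready at } 4t,\ \text{select at } 4t \;\Rightarrow\; t_{c,4} = 5t \\
\text{block 5 (5 bits):}\quad & \text{chains ready at } 5t,\ \text{select at } 5t \;\Rightarrow\; t_{c,5} = 6t \\
T_{\text{variable}} &= 2\,t_{FA} + 4\,t_{MUX} = 6t
\end{aligned}
```

Under the same assumption the uniform design of Section 2.5.1 needs four full adder delays plus three MUX delays, that is $7t$, so the variable sized break-up saves one unit delay.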
Fig. 2.7: Variable sized 16-bit carry select adder
Chapter 3
Semi Custom Design
Logic networks with the highest performance require deliberate design of the logic network
and of the transistor circuits, the most compact layout of these transistor circuits, and
their manufacture. Such logic networks are realized by full-custom design. In contrast to
full custom design, semi-custom design simplifies the design and layout of the transistor
circuits to save expense and design time. Semi-custom design takes different forms depending
on how the design and layout of the transistor circuits are simplified (e.g., repetition of
a small transistor subcircuit or a less compact layout) and even on how the logic design is
simplified. In semi custom design the designer has little control over the specification and
functionality of the individual cells, but the required design time is shorter. It uses
pre-designed, pre-tested and pre-characterized logic cells (AND gates, OR gates,
multiplexers) known as standard cells. In this chapter the semi custom design of the 16-bit
CSLA is discussed. Verilog code is generated from schematic circuits designed in DSCH
software, and the layout is then constructed from this Verilog code in Microwind software.
3.1 Conventional 2×1 Multiplexer
Multiplexing is the generic term used to describe the operation of sending one or more
analogue or digital signals over a common transmission line at different times or speeds.
The multiplexer, shortened to “MUX”, is a combinational logic circuit designed to switch one
of several input lines through to a single common output line by the application of a control
signal.
The Boolean expression of a 2×1 MUX with data inputs A and B, select signal S and output Z is
$Z = \bar{S}A + SB$    (3.1)
The schematic circuit, extracted layout from the Verilog code and the input-output wave
shapes of the conventional MUX are given in Fig. 3.1, Fig. 3.2 and Fig. 3.3 respectively.
Fig. 3.1: Schematic circuit of 2×1 Multiplexer
Fig. 3.2: Layout of Conventional 2×1 Multiplexer
Fig. 3.3: Input-Output wave shapes of conventional 2×1 MUX
The schematic of Fig. 3.1 is converted to Verilog code using DSCH 3.1 software. The Verilog
code is then compiled in Microwind 3.1 software to generate the layout in a 90 nm CMOS
process. In this way the schematic of the logic design is converted into a physical layout.
From this physical layout, parameters such as area, delay, power dissipation, AT, AT²,
resistance, capacitance, node voltage and current can be estimated. The parameters of the
conventional 2×1 MUX are given in Table 3.1.
Table 3.1: Area, Delay and Power Dissipation of the Conventional 2×1 MUX
Parameter | Value | Parameter | Value
Area | 47.7 µm² | IDD (Max) | 0.690 mA
Delay | 15 ps | No. of NMOS | 10
Power Dissipation | 0.793 µW | No. of PMOS | 10
3.2 Pass Transistor Based 2×1 Multiplexer
The 2×1 MUX of Fig. 3.1 can also be constructed using the pass transistor concept, which
minimizes the number of transistors: only 6 transistors are needed in place of the 20 used
in conventional logic. The schematic circuit, constructed layout and input-output wave
shapes of the pass transistor based 2×1 MUX are given in Fig. 3.4, Fig. 3.5 and Fig. 3.6
respectively.
Fig. 3.4: Pass transistor based 2×1 MUX
Fig. 3.5: Layout of pass transistor based 2×1 MUX
Fig. 3.6: Input-Output wave shapes of pass transistor based 2×1 MUX
The parameters of the pass transistor based 2×1 MUX are given in Table 3.2. Compared to the
conventional 2×1 MUX, the pass transistor based 2×1 MUX occupies more area (88.7 µm² against
47.7 µm²) and its delay is slightly higher (22 ps against 15 ps); however, the number of
transistors and the power dissipation are drastically reduced.
Table 3.2: Area, Delay and Power Dissipation of the pass transistor based 2×1 MUX
Parameter | Value | Parameter | Value
Area | 88.7 µm² | IDD (Max) | 1.189 mA
Delay | 22 ps | No. of NMOS | 03
Power Dissipation | 0.311 µW | No. of PMOS | 03
3.3 Conventional Full Adder
The conventional 1-bit Full Adder consists of two XOR gates, two AND gates and an OR
gate as shown in Fig. 3.7. The constructed layout in Microwind and the input-output wave
shapes are shown in Fig. 3.8 and Fig. 3.9 respectively.
Fig. 3.7: Conventional 1-bit full adder
Fig. 3.8: Layout of the conventional 1-bit full adder
Fig. 3.9: Input-Output wave shapes of conventional 1-bit full adder
Different parameters of the conventional 1-bit Full Adder are given in Table 3.3.
Table 3.3: Area, Delay and Power Dissipation of conventional 1-bit full adder
Parameter | Value | Parameter | Value
Area | 72.5 µm² | No. of NMOS | 15
Delay | 66 ps | No. of PMOS | 15
Power Dissipation | 39.543 µW | IDD (Max) | 0.781 mA
No. of Transistor | 30 | IDD (Avg) | 0.040 mA
3.4 XOR Based Full Adder
A 1-bit full adder realization employing two XOR gates and one 2×1 MUX is shown in
Fig. 3.10. The main difference between the conventional and XOR based adders lies in the
carry logic. Besides the two XOR gates, the XOR based adder uses only one 2×1 MUX, which
needs only 6 MOSFETs, whereas the conventional adder needs two AND gates and one OR gate,
which require at least 18 MOSFETs. The XOR based 1-bit adder therefore saves at least 12
MOSFETs compared with the conventional one, which is expected to reduce area and power
consumption while also giving better delay performance. The layout and the input-output wave
shapes of the XOR based 1-bit full adder are shown in Fig. 3.11 and Fig. 3.12 respectively.
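A gate-level sketch of such an adder is given below in the style of the Verilog listings of Chapter 6. The module name xor_mux_fulladder and the wire name p are hypothetical, and the MUX wiring (select = a XOR b, data inputs a and cin) is the usual arrangement for this kind of adder and is assumed here; the exact connections of Fig. 3.10 may differ.

```verilog
// Sketch of a full adder built from two XOR gates and one 2x1 MUX.
// The MUX wiring is an assumption, not copied from Fig. 3.10.
module xor_mux_fulladder (a, b, cin, sum, cout);
input a, b, cin;
output sum, cout;
wire p;                      // p = a XOR b
xor xor1(p, a, b);
xor xor2(sum, p, cin);       // sum = a XOR b XOR cin
// 2x1 MUX for the carry: if a and b differ (p = 1) the carry-in is passed on,
// otherwise a and b are equal and either of them is the carry-out.
assign cout = p ? cin : a;
endmodule
```

With 6 transistors per XOR gate and 6 per MUX, this structure accounts for the 18 transistors listed in Table 3.4.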
Fig. 3.10: XOR based 1-bit full adder
Fig. 3.11: Layout of XOR based 1-bit full adder
Fig. 3.12: Input-Output wave shapes of XOR based 1-bit full adder
Different parameters of the XOR based 1-bit Full Adder are given in Table 3.4.
Table 3.4: Area, Delay and Power Dissipation of XOR based 1-bit full adder
Parameter | Value | Parameter | Value
Area | 132.0 µm² | No. of NMOS | 09
Delay | 61 ps | No. of PMOS | 09
Power Dissipation | 56.959 µW | IDD (Max) | 0.732 mA
No. of Transistor | 18 | IDD (Avg) | 0.050 mA
3.5 Carry Select Adder
The CSLA is constructed from two RCAs and a multiplexer. To add two n-bit numbers, one RCA
computes the result with the input carry taken as zero while the other RCA computes it with
the input carry taken as one. Once both results are available, the multiplexer connected at
the output selects the correct sum and the correct carry-out according to the actual
carry-in.
Fig. 3.13: XOR based 4-bit CSLA
3.5.1 XOR Based 4-bit CSLA
The 4-bit CSLA consists of two 4-bit RCAs. In one RCA the Cin bit is taken as zero and in
the other the Cin bit is taken as one. When the addition is completed, the MUX takes the
correct sum and Cout from one of the RCAs depending on the actual Cin. The schematic
circuit, semi custom layout and input-output wave shapes of the 4-bit CSLA are shown in
Fig. 3.13, Fig. 3.14 and Fig. 3.15 respectively, and parameters such as area, delay and
power dissipation of the XOR based 4-bit CSLA are given in Table 3.5.
Fig. 3.14: Layout of the XOR based 4-bit CSLA
Fig. 3.15: Input-Output wave shapes of XOR based 4-bit CSLA
Table 3.5: Area, Delay and Power Dissipation of XOR based 4-bit CSLA
Parameter | Value | Parameter | Value
Area | 2177.3 µm² | No. of NMOS | 87
Delay | 91 ps | No. of PMOS | 87
Power Dissipation | 0.291 mW | IDD (Max) | 3.090 mA
No. of Transistor | 174 | IDD (Avg) | 0.240 mA
3.5.2 XOR Based 8-bit CSLA
The 8-bit CSLA consists of three 4-bit RCAs and five 2×1 MUXes. As shown in Fig. 3.16, the
8-bit CSLA is divided into two groups: the first group is a 4-bit RCA and the second group
is a 4-bit CSLA. The semi custom layout and the input-output wave shapes of the XOR based
8-bit CSLA are shown in Fig. 3.17 and Fig. 3.18 respectively, and parameters such as area,
delay and power dissipation of the 8-bit CSLA are given in Table 3.6.
Fig. 3.16: XOR based 8-bit CSLA
Fig. 3.17: Layout of XOR based 8-bit CSLA
Fig. 3.18: Input-Output wave shapes of XOR based 8-bit CSLA
Table 3.6: Area, Delay and Power Dissipation of XOR based 8-bit CSLA
Parameter | Value | Parameter | Value
Area | 2984 µm² | No. of NMOS | 123
Delay | 136 ps | No. of PMOS | 123
Power Dissipation | 0.438 mW | IDD (Max) | 2.968 mA
No. of Transistor | 246 | IDD (Avg) | 0.360 mA
3.5.3 XOR Based 16-bit CSLA
The 16-bit CSLA consists of seven 4-bit RCAs and fifteen 2×1 MUXes. As shown in Fig. 3.19,
the 16-bit CSLA has one 4-bit RCA and three 4-bit CSLAs. The semi custom layout and the
input-output wave shapes of the XOR based 16-bit CSLA are shown in Fig. 3.20 and Fig. 3.21
respectively, and parameters such as area, delay and power dissipation of the 16-bit CSLA
are given in Table 3.7.
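The transistor counts reported for the semi custom CSLAs can be cross-checked from the block counts given above. The check below assumes 18 transistors per XOR based full adder (Table 3.4) and 6 transistors per pass transistor based 2×1 MUX (Table 3.2); for the 4-bit CSLA it additionally assumes five MUXes, as in each carry select group of the larger adders. The symbol $N$ is introduced only for this illustration.

```latex
% N = (number of full adders) x 18 + (number of 2x1 MUXes) x 6
\begin{aligned}
N_{4\text{-bit}}  &= (2 \cdot 4)\cdot 18 + 5 \cdot 6  = 144 + 30 = 174 \\
N_{8\text{-bit}}  &= (3 \cdot 4)\cdot 18 + 5 \cdot 6  = 216 + 30 = 246 \\
N_{16\text{-bit}} &= (7 \cdot 4)\cdot 18 + 15 \cdot 6 = 504 + 90 = 594
\end{aligned}
```

These values agree with the transistor counts listed in Table 3.5, Table 3.6 and Table 3.7.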
Fig. 3.19: XOR based 16-bit CSLA
Fig. 3.20: Layout of XOR based 16-bit CSLA
Fig. 3.21: Input-Output wave shapes of XOR based 16-bit CSLA
Table 3.7: Area, Delay and Power Dissipation of XOR based 16-bit CSLA
Parameter | Value | Parameter | Value
Area | 9732.4 µm² | No. of NMOS | 297
Delay | 136 ps | No. of PMOS | 297
Power Dissipation | 1.029 mW | IDD (Max) | 7.044 mA
No. of Transistor | 594 | IDD (Avg) | 0.994 mA
[Figure: classification of ASICs into Full-Custom ASICs, Semi-Custom ASICs (Standard-Cell Based ASICs and Gate-Array Based ASICs) and Programmable ASICs (PLDs and FPGA)]
Chapter 4
Full Custom Design
Full-custom design is a methodology for designing integrated circuits by specifying the
layout of each individual transistor and the interconnections between them. Full-custom
design is limited to ICs that are to be fabricated in extremely high volumes, notably certain
microprocessors and a small number of ASICs. The VLSI design flow is given in Fig. 4.1.
Fig. 4.1: VLSI design flow
In full custom design, the entire mask design is done anew without the use of any library.
Full-custom design is logic design that aims at the highest performance or the smallest
size, utilizing the most advanced technology. Designers usually try to improve the economic
aspect, that is, performance per cost, at the same time. Every design stage is carefully
carried out for maximum performance, and the transistor circuits are deliberately laid out
on the chip as compactly as possible, which can take many draftspeople and engineers several
months. In this chapter, the full custom design of the 16-bit CSLA is presented, starting
from the design of an inverter.
4.1 Full Custom Design of Inverter or NOT Gate
A NOT gate, often called an inverter, is a natural digital logic gate to start with because
it has only a single input and simple behavior. A NOT gate performs logical negation on its
input: if the input is true, the output is false, and vice versa. The NOT gate is designed
and simulated using Microwind 3.1 software. The designed layout and simulation results of
the NOT gate are shown in Fig. 4.2 and Fig. 4.3 respectively. The technology library used in
this work is a 6-metal 90 nm CMOS technology. The simulated parameters of the NOT gate are
shown in Table 4.1.
Fig. 4.2: Full custom layout of NOT gate
Fig. 4.3: Input-Output wave shapes of NOT gate
Table 4.1: Area, Delay and Power Dissipation of NOT gate
Parameter | Value | Parameter | Value
Area | 2.0 µm² | No. of NMOS | 01
Delay | 1 ps | No. of PMOS | 01
Power Dissipation | 0.042 µW | IDD (Max) | 0.088 mA
No. of Transistor | 02 | IDD (Avg) | 0.000 mA
4.2 Full Custom Design of 2-input AND Gate
The 2-input AND gate is logically represented as shown in Fig. 4.4 with two inputs and one
output.
Fig. 4.4: 2-input AND gate symbol
Fig. 4.5: Full custom layout of 2-input AND gate
The 2-input AND gate is designed and simulated using Microwind 3.1 software. The
designed layout and simulation results of the 2-input AND gate are shown in Fig. 4.5 and
Fig. 4.6 respectively. The simulated parameters of the 2-input AND gate are given in
Table 4.2.
Fig. 4.6: Input-Output wave shapes of 2-input AND gate
Table 4.2: Area, Delay and Power Dissipation of 2-input AND gate
Parameter | Value | Parameter | Value
Area | 4.1 µm² | No. of NMOS | 03
Delay | 9 ps | No. of PMOS | 03
Power Dissipation | 0.127 µW | IDD (Max) | 0.218 mA
No. of Transistor | 06 | IDD (Avg) | 0.000 mA
4.3 Full Custom Design of 2-input XOR Gate
An XOR gate or exclusive OR gate is a digital logic gate with two or more inputs and one
output that performs exclusive disjunction. The output of an XOR gate is true only when
exactly one of its inputs is true.
The Boolean expression of the XOR gate is $Y = A \oplus B = A\bar{B} + \bar{A}B$. The pass transistor
based schematic circuit, full custom layout and input-output wave shapes of the 2-input XOR
gate are shown in Fig. 4.7, Fig. 4.8 and Fig. 4.9 respectively.
Fig. 4.7: Schematic design of 2-input XOR gate
Fig. 4.8: Full custom layout of 2-input XOR gate
Fig. 4.9: Input-Output wave shapes of 2-input XOR gate
Table 4.3 shows the different simulated parameters of 2-input XOR gate.
Table 4.3: Area, Delay and Power Dissipation of 2-input XOR gate
Parameter | Value | Parameter | Value
Area | 5.9 µm² | No. of NMOS | 03
Delay | 46 ps | No. of PMOS | 03
Power Dissipation | 14.052 µW | IDD (Max) | 0.128 mA
No. of Transistor | 06 | IDD (Avg) | 0.013 mA
4.4 Full Custom Design of 2×1 MUX
A two-to-one multiplexer (2×1 MUX) is a common digital circuit that routes one of two input
signals to a single output. The pass transistor based schematic circuit of a 2×1 MUX is
shown in Fig. 4.10. Here, A and B are the two inputs, S is the select signal and Z is the
output.
Fig. 4.10: Schematic circuit of 2×1 MUX
Fig. 4.11: Full custom layout of 2×1 MUX
The full custom layout and input-output wave shapes of the 2×1 MUX are shown in Fig. 4.11
and Fig. 4.12. Table 4.4 shows the different simulated parameters of 2×1 MUX.
Fig. 4.12: Input-Output wave shapes of the 2×1 MUX
Table 4.4: Area, Delay and Power Dissipation of the 2×1 MUX
Parameter | Value | Parameter | Value
Area | 7.1 µm² | No. of NMOS | 03
Delay | 22 ps | No. of PMOS | 03
Power Dissipation | 0.047 µW | IDD (Max) | 0.125 mA
No. of Transistor | 06 | IDD (Avg) | 0.000 mA
4.5 Full Custom Design of XOR Based Full Adder
The schematic circuit of a 1-bit full adder realization employing two XOR gates and one 2×1
MUX is shown in Fig. 4.13. Full custom layout and the input-output wave shapes of the XOR
based full adder are shown in Fig. 4.14 and Fig. 4.15 respectively.
Fig. 4.13: XOR based 1-bit full adder
Fig. 4.14: Full custom layout of an XOR based 1-bit full adder
Fig. 4.15: Input-Output wave shapes of XOR based 1-bit full adder
The simulated parameters of the XOR based 1-bit full adder are given in Table 4.5.
Table 4.5: Area, Delay and Power Dissipation of the 1-bit full adder
Parameter | Value | Parameter | Value
Area | 28.7 µm² | No. of NMOS | 09
Delay | 30 ps | No. of PMOS | 09
Power Dissipation | 28.328 µW | IDD (Max) | 0.223 mA
No. of Transistor | 18 | IDD (Avg) | 0.029 mA
4.6 Full Custom Design of XOR Based 4-bit CSLA
The full custom layout of the 4-bit CSLA is shown in Fig. 4.16.
Fig. 4.16: Full custom layout of XOR based 4-bit CSLA
The input-output wave shapes of the 4-bit CSLA are shown in Fig. 4.17 and the simulated
parameters are given in Table 4.6.
Fig. 4.17: Input-Output wave shapes of XOR based 4-bit CSLA
Table 4.6: Area, Delay and Power Dissipation of the XOR based 4-bit CSLA
Parameter | Value | Parameter | Value
Area | 345.8 µm² | No. of NMOS | 84
Delay | 14 ps | No. of PMOS | 84
Power Dissipation | 196 µW | IDD (Max) | 1.298 mA
No. of Transistor | 168 | IDD (Avg) | 0.202 mA
4.7 Full Custom Design of XOR Based 8-bit CSLA
The 8-bit CSLA consists of three 4-bit RCAs. The full custom layout of the 8-bit CSLA is
shown in Fig. 4.18. The input-output wave shapes of the 8-bit CSLA are shown in Fig. 4.19
and the simulated parameters are given in Table 4.7.
Fig. 4.18: Full custom layout of XOR based 8-bit CSLA
Fig. 4.19: Input-Output wave shapes of XOR based 8-bit CSLA
Table 4.7: Area, Delay and Power Dissipation of the XOR based 8-bit CSLA
Parameter | Value | Parameter | Value
Area | 479.5 µm² | No. of NMOS | 117
Delay | 46 ps | No. of PMOS | 117
Power Dissipation | 289 µW | IDD (Max) | 1.0588 mA
No. of Transistor | 234 | IDD (Avg) | 0.304 mA
4.8 Full Custom Design of XOR Based 16-bit CSLA
The full custom XOR based 16-bit CSLA is constructed with thirty XOR based full adders and
eighteen 2×1 MUXes. The layout and the input-output wave shapes of the XOR based 16-bit
CSLA are shown in Fig. 4.20 and Fig. 4.21 respectively. The simulated parameters are given
in Table 4.8.
Fig. 4.20: Full custom layout of XOR based 16-bit CSLA
Fig. 4.21: Input-Output wave shapes of XOR based 16-bit CSLA
Table 4.8: Area, Delay and Power Dissipation of the XOR based 16-bit CSLA
Parameter | Value | Parameter | Value
Area | 1263.5 µm² | No. of NMOS | 285
Delay | 61 ps | No. of PMOS | 285
Power Dissipation | 823 µW | IDD (Max) | 3.88 mA
No. of Transistor | 570 | IDD (Avg) | 0.966 mA
Chapter 5
Performance Analysis
In this chapter the performance of the designed CSLAs, implemented in Microwind 3.1 using a
90 nm CMOS process, is analyzed. Power dissipation (the power which is consumed by a device
during the execution of its logical operation or the power which is dissipated as heat by
the device), delay and power delay product (PDP) are measured for the different designs. All
simulations are done for a supply voltage of VDD = 1.2 V and clock frequencies from 10 MHz
to 500 MHz.
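For reference, the power delay product is the product of power dissipation and delay. The symbols $P$ and $t_d$ and the worked values below are illustrative: they are derived here from the 16-bit power and delay figures listed in Table 5.5 and are not values reported by the simulations themselves.

```latex
% Power delay product; 1 mW x 1 ps = 1 fJ
\mathrm{PDP} = P \cdot t_d
% 16-bit CSLA, using the power and delay values of Table 5.5:
\begin{aligned}
\mathrm{PDP}_{\text{conventional}} &= 2.803\ \text{mW} \times 140\ \text{ps} \approx 392.4\ \text{fJ} \\
\mathrm{PDP}_{\text{semi custom}}  &= 1.029\ \text{mW} \times 136\ \text{ps} \approx 139.9\ \text{fJ} \\
\mathrm{PDP}_{\text{full custom}}  &= 0.823\ \text{mW} \times 61\ \text{ps} \approx 50.2\ \text{fJ}
\end{aligned}
```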
5.1 Comparison of Different 2 ×1 MUX
Table 5.1 shows the performance analysis of conventional, semi custom and full custom
2 ×1 MUX. Graphical representation of area, delay and power are given in Fig. 5.1.
Table 5.1: Performance Analysis of Conventional, Semi and Full Custom 2×1 MUX
Parameter | Conv. | Semi Custom | Full Custom | % reduction of full custom compared to conventional | % reduction of full custom compared to semi custom
Area (µm²) | 47.7 | 88.7 | 7.1 | 85.12% | 92.00%
Power (µW) | 0.793 | 0.311 | 0.047 | 94.07% | 84.89%
Delay (ps) | 15 | 22 | 22 | -46.67% | 0.00%
Gate Count, NMOS | 10 | 03 | 03 | 70.00% | 0.00%
Gate Count, PMOS | 10 | 03 | 03 | 70.00% | 0.00%
Total Gate Count | 20 | 06 | 06 | 70.00% | 0.00%
IDD (Max) (mA) | 0.690 | 1.189 | 0.125 | 81.88% | 89.49%
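The percentage columns of Table 5.1 follow the usual relative reduction formula. The symbols $X_{\text{ref}}$ (value of the reference design) and $X_{\text{full}}$ (value of the full custom design) are introduced here only for illustration; a negative result means that the full custom value is larger than the reference value.

```latex
\text{Reduction} = \frac{X_{\text{ref}} - X_{\text{full}}}{X_{\text{ref}}} \times 100\%
% Area, full custom against conventional:
\frac{47.7 - 7.1}{47.7} \times 100\% \approx 85.12\%
% Delay, full custom against conventional (an increase, hence negative):
\frac{15 - 22}{15} \times 100\% \approx -46.67\%
```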
Fig 5.1: Area, Power and Delay comparison of 2×1 MUX
[Bar charts of area (µm²), power (µW) and delay (ps) for the Conventional, Semi Custom and Full Custom designs]
5.2 Comparison of Different Full Adder
Performance analysis of conventional, semi custom and full custom 1-bit full adder is given in
Table 5.2 and the graphical representation of area, delay and power are given in Fig. 5.2.
Table 5.2: Performance analysis of Conventional, Semi and Full Custom 1-bit Full Adder
Parameter | Conv. | Semi Custom | Full Custom | % reduction of full custom compared to conventional | % reduction of full custom compared to semi custom
Area (µm²) | 72.5 | 132.0 | 28.7 | 60.4% | 78.3%
Power (µW) | 39.543 | 56.959 | 28.328 | 28.4% | 50.3%
Delay (ps) | 66 | 61 | 30 | 54.5% | 50.8%
Gate Count, NMOS | 15 | 09 | 09 | 40.0% | 0%
Gate Count, PMOS | 15 | 09 | 09 | 40.0% | 0%
Total Gate Count | 30 | 18 | 18 | 40.0% | 0%
IDD (Max) (mA) | 0.781 | 0.732 | 0.223 | 71.5% | 69.6%
IDD (Avg) (mA) | 0.040 | 0.050 | 0.029 | 27.5% | 42.0%
[Bar charts: (a) area (µm²), (b) power (µW) and (c) delay (ps) for the Conventional, Semi Custom and Full Custom designs]
Fig. 5.2: Area, Power and Delay comparison of 1-bit Full Adder
5.3 Comparison of 4-bit CSLA
Performance analysis of conventional, semi custom and full custom 4-bit CSLA is given in
Table 5.3 and the graphical representation of area, delay and power are given in Fig. 5.3.
Table 5.3: Performance analysis of Conventional, Semi and Full Custom 4-bit CSLA
Parameter | Conv. | Semi Custom | Full Custom | % reduction of full custom compared to conventional | % reduction of full custom compared to semi custom
Area (µm²) | 2490 | 2177.3 | 345.8 | 86.5% | 84.2%
Power (mW) | 0.299 | 0.291 | 0.196 | 34.5% | 32.7%
Delay (ps) | 96 | 91 | 14 | 85.5% | 84.6%
Gate Count, NMOS | 170 | 87 | 84 | 50.6% | 3.45%
Gate Count, PMOS | 170 | 87 | 84 | 50.6% | 3.45%
Total Gate Count | 340 | 174 | 168 | 50.6% | 3.45%
IDD (Max) (mA) | 2.941 | 3.090 | 1.298 | 55.8% | 58.0%
IDD (Avg) (mA) | 0.249 | 0.240 | 0.202 | 18.8% | 15.8%
[Bar charts: (a) area (µm²), (b) power (mW) and (c) delay (ps) for the Conventional, Semi Custom and Full Custom designs]
Fig. 5.3: Area, Power and Delay comparison of 4-bit CSLA
For the XOR based 4-bit CSLA, the full custom design achieves 86.5% area, 34.5% power and
85.5% delay reduction over the conventional design, whereas it achieves 84.2% area, 32.7%
power and 84.6% delay reduction over the semi custom design.
5.4 Comparison of 8-bit CSLA
Performance analysis of conventional, semi custom and full custom 8-bit CSLA is given in
Table 5.4 and the graphical representation of area, delay and power are given in Fig. 5.4.
Table 5.4: Performance analysis of Conventional, Semi and Full Custom 8-bit CSLA
Parameter | Conv. | Semi Custom | Full Custom | % reduction of full custom compared to conventional | % reduction of full custom compared to semi custom
Area (µm²) | 3013.5 | 2984 | 479.5 | 84.1% | 84.0%
Power (mW) | 0.936 | 0.438 | 0.289 | 69.2% | 34.02%
Delay (ps) | 138 | 136 | 46 | 66.7% | 66.2%
Gate Count, NMOS | 195 | 123 | 117 | 40.00% | 4.88%
Gate Count, PMOS | 195 | 123 | 117 | 40.00% | 4.88%
Total Gate Count | 390 | 246 | 234 | 40.00% | 4.88%
IDD (Max) (mA) | 2.231 | 2.968 | 1.058 | 52.6% | 64.4%
IDD (Avg) (mA) | 0.550 | 0.360 | 0.304 | 44.7% | 15.6%
[Bar charts: (a) area (µm²), (b) power (mW) and (c) delay (ps) for the Conventional, Semi Custom and Full Custom designs]
Fig. 5.4: Area, Power and Delay comparison of 8-bit CSLA
For the XOR based 8-bit CSLA, the full custom design achieves 84.1% area, 69.2% power and
66.7% delay reduction over the conventional design, whereas it achieves 84.0% area, 34.02%
power and 66.2% delay reduction over the semi custom design.
5.5 Comparison of 16-bit CSLA
Performance analysis of conventional, semi custom and full custom 16-bit CSLA is given in
Table 5.5 and the graphical representation of area, delay and power are given in Fig. 5.5.
Table 5.5: Performance analysis of Conventional, Semi and Full Custom 16-bit CSLA
Parameter | Conv. | Semi Custom | Full Custom | % reduction of full custom compared to conventional | % reduction of full custom compared to semi custom
Area (µm²) | 8046.8 | 9732.4 | 1263.5 | 84.3% | 87.0%
Power (mW) | 2.803 | 1.029 | 0.823 | 70.6% | 20.1%
Delay (ps) | 140 | 136 | 61 | 56.5% | 55.2%
Gate Count, NMOS | 465 | 297 | 285 | 38.71% | 4.04%
Gate Count, PMOS | 465 | 297 | 285 | 38.71% | 4.04%
Total Gate Count | 930 | 594 | 570 | 38.71% | 4.04%
IDD (Max) (mA) | 6.626 | 7.044 | 3.88 | 41.5% | 45.0%
IDD (Avg) (mA) | 2.336 | 0.994 | 0.966 | 58.7% | 2.8%
[Bar charts: (a) area (µm²), (b) power (mW) and (c) delay (ps) for the Conventional, Semi Custom and Full Custom designs]
Fig. 5.5: Area, Power and Delay comparison of 16-bit CSLA
For the XOR based 16-bit CSLA, the full custom design achieves 84.3% area, 70.6% power and
56.5% delay reduction over the conventional design, whereas it achieves 87.0% area, 20.1%
power and 55.2% delay reduction over the semi custom design.
5.6 Comparison with Other Work
A comparison of the semi custom and full custom 8-bit and 16-bit CSLAs with other work is
given in Table 5.6.
Table 5.6: Performance comparison with other work
Bit size | Parameter | Ref. [18] | Ref. [19] | Ref. [3] | This work (Semi Custom) | This work (Full Custom)
8-Bit | Power (mW) | 13.598 | 0.659 | 1.109 | 0.438 | 0.289
8-Bit | Delay (ns) | 2.094 | 2.79 | 2.75 | 0.136 | 0.046
8-Bit | Area (µm²) | 952.343 | 1035 | 6201 | 2984 | 479.5
16-Bit | Power (mW) | 29.311 | 1.316 | - | 1.029 | 0.823
16-Bit | Delay (ns) | 2.450 | 3.79 | - | 0.136 | 0.061
16-Bit | Area (µm²) | 1901.09 | 2325 | - | 9732.4 | 1263.5
Technology | | 180 nm | 180 nm | 120 nm | 90 nm | 90 nm
Chapter 6
FPGA Implementation of CSLA
6.1 Introduction to FPGA
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured
by a customer or a designer after manufacturing – hence the term “field-programmable”. The
FPGA configuration is generally specified using a hardware description language (HDL),
similar to that used for an application-specific integrated circuit (ASIC). Circuit
diagrams were previously used to specify the configuration, but this is increasingly rare due
to the advent of electronic design automation tools.
Fig. 6.1: FPGA Architecture
FPGAs contain an array of programmable logic blocks and a hierarchy of reconfigurable
interconnects that allow the blocks to be “wired together”, like many logic gates that can be
inter-wired in different configurations as shown in Fig. 6.1. Logic blocks can be configured
to perform complex combinational functions or merely simple logic gates like AND and
XOR. In most FPGAs, logic blocks also include memory elements, which may be
simple flip-flops or more complete blocks of memory. Many FPGAs can be reprogrammed to
implement different logic functions allowing flexible reconfigurable computing as performed
in computer software. Here, the Altera DE2-115 FPGA board is used to implement the
CSLA in hardware.
6.2 Altera DE2-115 FPGA Board
The Altera DE2-115 board has become one of the most widely used development boards for
FPGA design and implementation [16]. The purpose of the Altera DE2-115 Development and
Education board is to provide an ideal vehicle for learning about digital logic, computer
organization and FPGAs. It uses state-of-the-art technology in hardware tools to expose
students and professionals to a wide range of topics [17]. The board offers a rich set of
features that make it suitable for use in a laboratory environment for university and
college courses, for a variety of design projects, as well as for the development of
sophisticated digital systems. The Altera DE2-115 FPGA board is shown in Fig. 6.2.
Fig. 6.2: Altera DE2-115 FPGA Board
Fig. 6.3: Block diagram of the Altera DE2-115 FPGA Board
6.3 FPGA Implementation of Conventional 1-bit Full Adder
To implement the conventional 1-bit full adder a Verilog code has been written as shown in
Fig. 6.4.
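module fulladder (a, b, cin, sum, cout);
input a, b, cin;
output sum, cout;
wire [2:0]w;
xor xor1(w[0], a, b); // w[0] = a XOR b
xor xor2(sum, w[0], cin); // sum = a XOR b XOR cin
and and1(w[1], a, b); // carry generated by a AND b
and and2(w[2], cin, w[0]); // carry propagated from cin
or or1(cout, w[1], w[2]); // cout = a.b + cin.(a XOR b)
endmodule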
Fig. 6.4: Verilog code of 1-bit full adder
This code has then been simulated in ModelSim software. The simulation result is shown in
Fig. 6.5. Once the simulation results are verified, the synthesis and power analysis are
done in Quartus II software. The synthesis and power analysis results are shown in Fig. 6.6
and Fig. 6.7 respectively. The implementation of the 1-bit full adder in FPGA is shown in
Fig. 6.8.
Fig. 6.5: Simulation waveform results of conventional 1-bit full adder
Fig. 6.6: Synthesis summary of the conventional 1-bit full adder
Fig. 6.7: Power analysis of the conventional 1-bit full adder
Fig. 6.8: Implementation of the conventional 1-bit full adder in FPGA
6.4 FPGA Implementation of 4-bit CSLA
The Verilog code for the 4-bit CSLA is given in Fig. 6.9. This code has been simulated in
ModelSim software. The simulation result is shown in Fig. 6.10 in binary and in Fig. 6.11 in
decimal. The synthesis and power analysis done in Quartus II software are shown in Fig. 6.12
and Fig. 6.13 respectively. The implementation of the 4-bit CSLA in FPGA is shown in
Fig. 6.14.
module fourbitcsla (a, b, cin, sum, cout);
input [3:0] a, b;
input cin;
output [3:0]sum;
output cout;
wire [2:0]w;
fulladder fulladd1(a[0], b[0], cin, sum[0], w[0]);
fulladder fulladd2(a[1], b[1], w[0], sum[1], w[1]);
fulladder fulladd3(a[2], b[2], w[1], sum[2], w[2]);
fulladder fulladd4(a[3], b[3], w[2], sum[3], cout);
endmodule
module fulladder (a, b, cin, sum, cout);
input a, b, cin;
output sum, cout;
wire [2:0]w;
xor xor1(w[0], a, b);
xor xor2(sum, w[0], cin);
and and1(w[1], a, b);
and and2(w[2], cin, w[0]);
or or1(cout, w[1], w[2]);
endmodule
Fig. 6.9: Verilog code of 4-bit CSLA
Fig. 6.10: Simulation waveform results of 4-bit CSLA (binary)
Fig. 6.11: Simulation waveform results of 4-bit CSLA (decimal)
Fig. 6.12: Synthesis summary of 4-bit CSLA
Fig. 6.13: Power analysis of 4-bit CSLA
Fig. 6.14: FPGA implementation of 4-bit CSLA
In Fig. 6.14 the inputs are B3B2B1B0 = 1111 and A3A2A1A0 = 1010, and the board shows
Cout = 1 and S3S2S1S0 = 1001, that is, 15 + 10 = 25.
6.5 FPGA Implementation of 8-bit CSLA
The Verilog code for the 8-bit CSLA is given in Fig. 6.15. This Verilog code is then
simulated in ModelSim software and the simulation result is shown in Fig. 6.16 in binary and
in Fig. 6.17 in decimal. The synthesis and power analysis are done in Quartus II software
and are shown in Fig. 6.18 and Fig. 6.19 respectively. The implementation of the 8-bit CSLA
in FPGA is shown in Fig. 6.20.
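module eightbitcsla(a, b, cin, c0, c1, sum, cout);
// c0 and c1 are the assumed carry-ins of the two speculative upper 4-bit adders.
// The mux instances below forward the c0 adder when w[3] = 1 and the c1 adder
// when w[3] = 0, so c0 must be driven with 1 and c1 with 0 for correct sums.
input [7:0]a, b;
input cin, c0, c1;
output [7:0]sum;
output cout;
wire [19:0]w;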
fulladder fulladd1(a[0], b[0], cin, sum[0], w[0]);
fulladder fulladd2(a[1], b[1], w[0], sum[1], w[1]);
fulladder fulladd3(a[2], b[2], w[1], sum[2], w[2]);
fulladder fulladd4(a[3], b[3], w[2], sum[3], w[3]);
fulladder fulladd5(a[4], b[4], c0, w[4], w[5]);
fulladder fulladd6(a[5], b[5], w[5], w[6], w[7]);
fulladder fulladd7(a[6], b[6], w[7], w[8], w[9]);
fulladder fulladd8(a[7], b[7], w[9], w[10], w[11]);
fulladder fulladd9(a[4], b[4], c1, w[12], w[13]);
fulladder fulladd10(a[5], b[5], w[13], w[14], w[15]);
fulladder fulladd11(a[6],b[6],w[15],w[16],w[17]);
fulladder fulladd12(a[7], b[7], w[17], w[18], w[19]);
mux mux1(sum[4], w[3], w[4], w[12]);
mux mux2(sum[5], w[3], w[6], w[14]);
mux mux3(sum[6], w[3], w[8], w[16]);
mux mux4(sum[7], w[3], w[10], w[18]);
mux mux5(cout, w[3], w[11], w[19]);
endmodule
Fig. 6.15: Verilog code of 8-bit CSLA
module fulladder(a, b, cin, sum, cout1);
input a, b, cin;
output sum, cout1;
wire [2:0]w;
xor xor1(w[0], a, b);
xor xor2(sum, w[0], cin);
and and1(w[1], a, b);
and and2(w[2], cin, w[0]);
or or1(cout1, w[1], w[2]);
endmodule
module mux(out, S, B, A);
input B,A,S;
wire [2:0]w;
output out;
and and3(w[1], w[0], A);
and and4(w[2], B, S);
or or2(out, w[1], w[2]);
not inv1(w[0], S);
endmodule
Fig. 6.15: Verilog code of 8-bit CSLA (contd…..)
Fig. 6.16: Simulation waveform results of 8-bit CSLA (Binary)
Fig. 6.17: Simulation waveform results of 8-bit CSLA (Decimal)
Fig. 6.18: Synthesis summary of 8-bit CSLA
Fig. 6.19: Power analysis of 8-bit CSLA
Fig. 6.20: FPGA implementation result of 8-bit CSLA.
In Fig. 6.20 the inputs are B7...B0 = 00000000 and A7...A0 = 11111111, and the board shows
Cout = 0 and S7...S0 = 11111111, that is, 255 + 0 = 255.
6.6 FPGA Implementation of 16-bit CSLA
The Verilog code for the 16-bit CSLA is given in Fig. 6.21. This Verilog code is then
simulated in ModelSim software and the simulation result is shown in Fig. 6.22 in binary and
in Fig. 6.23 in decimal. The synthesis and power analysis are done in Quartus II software
and are shown in Fig. 6.24 and Fig. 6.25 respectively. Due to the pin shortage on the FPGA
board, the 16-bit CSLA could not be implemented in FPGA.
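module sixteenbitcsla(a, b, cin, c0, c1, sum, cout);
// c0[k] and c1[k] are the assumed carry-ins of carry select stage k.
// The mux forwards the c0 adder when the stage carry-in is 1, so every c0 bit
// must be driven with 1 and every c1 bit with 0 for correct sums.
input cin;
input [15:0] a, b;
input [3:1] c0, c1;
output cout;
output [15:0] sum;
wire [3:1]w; // w[k] is the carry into stage k
fourbitcsla fourbitcsla11(a[3:0], b[3:0], cin, sum[3:0], w[1]);
// Port order of fourbitcslatwo is (a, b, cin, c0, c1, sum, cout).
fourbitcslatwo fourbitcslatwo1(a[7:4], b[7:4], w[1], c0[1], c1[1], sum[7:4], w[2]);
fourbitcslatwo fourbitcslatwo2(a[11:8], b[11:8], w[2], c0[2], c1[2], sum[11:8], w[3]);
fourbitcslatwo fourbitcslatwo3(a[15:12], b[15:12], w[3], c0[3], c1[3], sum[15:12], cout);
endmodule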
module fourbitcslatwo(a, b, cin, c0, c1, sum, cout);
input [7:4]a,b;
input cin,c0,c1;
output [7:4]sum;
output cout;
wire [19:0]w;
fulladder fulladd1(a[4], b[4], c0, w[4], w[5]);
fulladder fulladd2(a[5], b[5], w[5], w[6], w[7]);
fulladder fulladd3(a[6], b[6], w[7], w[8], w[9]);
fulladder fulladd4(a[7], b[7], w[9], w[10], w[11]);
fulladder fulladd5(a[4],b[4],c1,w[12],w[13]);
fulladder fulladd6(a[5],b[5],w[13],w[14],w[15]);
fulladder fulladd7(a[6],b[6],w[15],w[16],w[17]);
fulladder fulladd8(a[7],b[7],w[17],w[18],w[19]);
mux mux1(sum[4],cin,w[4],w[12]);
mux mux2(sum[5],cin,w[6],w[14]);
mux mux3(sum[6],cin,w[8],w[16]);
mux mux4(sum[7],cin,w[10],w[18]);
mux mux5(cout,cin,w[11],w[19]);
endmodule
Fig. 6.21: Verilog code of 16-bit CSLA
// 4-bit ripple-carry block used for the least significant bits.
module fourbitcsla(a, b, cin, sum, cout);
  input [3:0] a, b;
  input cin;
  output [3:0] sum;
  output cout;
  wire [2:0] w;   // internal carries
  fulladder fulladd1(a[0], b[0], cin, sum[0], w[0]);
  fulladder fulladd2(a[1], b[1], w[0], sum[1], w[1]);
  fulladder fulladd3(a[2], b[2], w[1], sum[2], w[2]);
  fulladder fulladd4(a[3], b[3], w[2], sum[3], cout);
endmodule

// XOR-based 1-bit full adder: sum = a ^ b ^ cin, cout = a.b + cin.(a ^ b).
module fulladder(a, b, cin, sum, cout);
  input a, b, cin;
  output sum, cout;
  wire [2:0] w;
  xor xor1(w[0], a, b);
  xor xor2(sum, w[0], cin);
  and and1(w[1], a, b);
  and and2(w[2], cin, w[0]);
  or or1(cout, w[1], w[2]);
endmodule

// 2x1 multiplexer: out = B when S = 1, out = A when S = 0.
module mux(out, S, B, A);
  input B, A, S;
  output out;
  wire [2:0] w;
  not inv1(w[0], S);
  and and3(w[1], w[0], A);
  and and4(w[2], B, S);
  or or2(out, w[1], w[2]);
endmodule
Fig. 6.21: Verilog code of 16-bit CSLA (contd.)
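The block structure of Fig. 6.21 can be cross-checked with a small behavioural model before running the HDL simulation. The following Python sketch is not part of the thesis flow; it mirrors the full adder gates and the one-ripple-block-plus-three-select-blocks arrangement, and it assumes the two speculative chains of each select block use constant carry-ins of 0 and 1 (in the Verilog these arrive through the external inputs c0 and c1). The function names are illustrative only.

```python
import random

def full_adder(a, b, cin):
    """XOR-based 1-bit full adder with the same gates as the fulladder module."""
    p = a ^ b                              # xor1
    return p ^ cin, (a & b) | (cin & p)    # (sum, cout)

def ripple4(a, b, cin):
    """Four chained full adders on 4-bit operands; returns (4-bit sum, carry-out)."""
    total, carry = 0, cin
    for i in range(4):
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= bit << i
    return total, carry

def select_block(a, b, cin):
    """Carry-select stage: both candidate results are computed, then one is picked.

    Assumption: the candidate chains use constant carry-ins 0 and 1.
    """
    result_if_0 = ripple4(a, b, 0)
    result_if_1 = ripple4(a, b, 1)
    return result_if_1 if cin else result_if_0

def csla16(a, b, cin=0):
    """16-bit carry-select adder: one ripple block followed by three select blocks."""
    total, carry = ripple4(a & 0xF, b & 0xF, cin)
    for k in range(1, 4):
        nibble, carry = select_block((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF, carry)
        total |= nibble << (4 * k)
    return total, carry

def check(trials=2000, seed=1):
    """Compare the model against integer addition on random 16-bit operands."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, cin = rng.randrange(1 << 16), rng.randrange(1 << 16), rng.randrange(2)
        total, cout = csla16(a, b, cin)
        assert (cout << 16) | total == a + b + cin
    return True
```

Calling check() raises an AssertionError on the first mismatch, so the same operand pairs can then be replayed in the ModelSim test bench to localise a wiring error.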
Fig. 6.22: Simulation waveform results of 16-bit CSLA (Binary)
Fig. 6.23: Simulation waveform results of 16-bit CSLA (Decimal)
Fig. 6.24: Synthesis summary of 16-bit CSLA
Fig. 6.25: Power analysis of 16-bit CSLA
A summary of all the results obtained from the synthesis and power analysis in Quartus II
is given in Table 6.1.
Table 6.1: Summary of the synthesis and power analysis results obtained using Quartus II
software
Parameter                                     Full Adder   4-bit CSLA   8-bit CSLA   16-bit CSLA
Total pins                                             5           14           28            56
Total Logic Elements                                   2            8           24            56
Total Thermal Power Dissipation (mW)              115.88       116.72       118.04        120.67
Core Static Thermal Power Dissipation (mW)         99.09        99.09        99.10         99.10
I/O Thermal Power Dissipation (mW)                 16.79        17.63        18.94         21.57
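The figures in Table 6.1 can be checked for internal consistency. The short Python sketch below is not part of the thesis flow; it transcribes the table and tests whether the total thermal power of each design equals the sum of the core static and I/O components at the two-decimal precision reported. This assumes that no other component is large enough to appear at that precision. The dictionary layout and function names are illustrative.

```python
# Table 6.1, keyed by design:
# (total pins, logic elements, total mW, core static mW, I/O mW)
TABLE_6_1 = {
    "Full Adder":  (5, 2, 115.88, 99.09, 16.79),
    "4-bit CSLA":  (14, 8, 116.72, 99.09, 17.63),
    "8-bit CSLA":  (28, 24, 118.04, 99.10, 18.94),
    "16-bit CSLA": (56, 56, 120.67, 99.10, 21.57),
}

def power_residual(row):
    """Total power minus (core static + I/O), rounded to the table's precision."""
    _, _, total, static, io = row
    return round(total - (static + io), 2)

def io_power_increase(design, baseline="Full Adder"):
    """Increase in I/O thermal power (mW) of a design over the baseline row."""
    return round(TABLE_6_1[design][4] - TABLE_6_1[baseline][4], 2)

if __name__ == "__main__":
    for name, row in TABLE_6_1.items():
        print(name, "residual:", power_residual(row), "mW")
    print("I/O increase, 16-bit vs full adder:", io_power_increase("16-bit CSLA"), "mW")
```

On these values the core static power is practically constant across the four designs, so the growth in total power follows the I/O component, which rises with the pin count.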
Chapter 7
Conclusion and Future Recommendation
7.1 Conclusion
A 16-bit CSLA has been implemented using an XOR-based 1-bit full adder as the building block.
The schematic was designed in DSCH software and synthesized using 90 nm CMOS technology.
The layout was created and simulated in Microwind software. The designs are compared in
terms of area, delay and power dissipation, and the performance analysis, simulation results
and comparisons are reported. For the 2×1 MUX, the simulation results show that the full
custom design reduces power consumption by 94.07% relative to the conventional design and
by 84.89% relative to the semi custom design. The area of the full custom MUX is 85.12% less
than that of the conventional design and 92% less than that of the semi custom design.
Compared with the conventional CSLA, the full custom XOR-based CSLA reduces the area, power,
delay and total number of MOSFETs by 86.5%, 34.5%, 85.5% and 50.6% for the 4-bit design, by
84.1%, 69.2%, 66.7% and 40% for the 8-bit design, and by 84.3%, 70.6%, 55.5% and 50.59% for
the 16-bit design, respectively. Compared with the semi custom design, the corresponding
reductions in area, power, delay and number of MOSFETs are 84.2%, 32.7%, 84.6% and 3.45%
for 4-bit, 84%, 34.02%, 66.2% and 4.88% for 8-bit, and 8%, 20.1%, 55% and 3.45% for 16-bit.
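The percentages above read most naturally as reductions relative to the reference design (conventional or semi custom). The Python sketch below shows that calculation and its inverse under this assumption, which this section does not state explicitly. The function names are hypothetical, and the example operands in the comments are illustrative values, not measurements from the thesis.

```python
def percent_reduction(reference, improved):
    """Reduction of `improved` relative to `reference`, in percent.

    Assumption: the reference design is the denominator.
    """
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (reference - improved) / reference

def remaining_percent(reduction):
    """Share of the reference value that is left after a given percent reduction."""
    return round(100.0 - reduction, 2)

# Illustrative values only: a quantity that drops from 200 units to 100 units
# has been reduced by half.
example = percent_reduction(200.0, 100.0)
```

Read this way, the 94.07% power reduction reported for the full custom 2×1 MUX means that it dissipates remaining_percent(94.07) percent of the power of the conventional MUX.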
The CSLA has also been implemented on an Altera DE2-115 FPGA board, and synthesis and power
analysis have been performed. The Verilog code was first simulated in ModelSim. Once the
simulation results had been verified, synthesis and power analysis were carried out in
Quartus II and the design was implemented on the Altera DE2-115 FPGA board. The implemented
hardware was then checked with a set of arbitrary inputs and was found to operate correctly.
7.2 Future Recommendation
In future work, a CSLA that achieves both low area and low delay should be designed in order
to meet the needs of the current VLSI industry. This work can also be extended by designing
and simulating adders with a larger number of bits, such as 32-bit, 64-bit and 128-bit.
73
References:
[1] K. Tejasvi1 and G. S. Kishore, “Low-Power and Area-Efficient N-Bit Carry-Select Adder,”
International Advanced Research Journal in Science, Engineering and Technology, Vol. 3,
Issue 7, pp. 186-189, July 2016.
[2] P. Balasubramanian, N. E. Mastorakis, “High Speed Gate Level Synchronous Full Adder
Designs,” WSEAS Transactions on Circuits and Systems, Volume 8, Issue 2, pp. 290-300,
February 2009.
[3] R. Uma, V. Vijayan, M. Mohanapriya and S. Paul, “Area, Delay and Power Comparison of
Adder Topologies,” International Journal of VLSI design & Communication Systems
(VLSICS), Vol. 3, No. 1, pp. 153-168, February 2012.
[4] K. K. Parhi, VLSI Digital Signal Processing. New York, NY, USA: Wiley, 1998.
[5] A. P. Chandrakasan, N. Verma, and D. C. Daly, “Ultralow-power electronics for biomedical
applications,” Annual Review of Biomedical Engineering, Vol. 10, pp. 247–274, August
2008.
[6] O. J. Bedrij, “Carry-select adder,” IRE Transactions on Electronic Computers, Vol. EC-11,
No. 3, pp. 340–344, June 1962.
[7] Y. Kim and L. S. Kim, “64-bit carry-select adder with reduced area,” Electron. Lett., Vol. 37,
No. 10, pp. 614–615, May 2001.
[8] Y. He, C. H. Chang, and J. Gu, “An area-efficient 64-bit square root carry select adder for
low power application,” IEEE International Symposium on Circuits and Systems, Vol. 4, pp.
4082–4085, 2005.
[9] B. Ramkumar and H. M. Kittur, “Low-power and area-efficient carry-select adder,” IEEE
Transactions on Very Large-Scale Integration (VLSI) Systems, Vol. 20, No. 2, pp. 371–375,
February 2012.
[10] I. C. Wey, C. C. Ho, Y. S. Lin, and C. C. Peng, “An area-efficient carry select adder design
by sharing the common Boolean logic term,” International Multi Conference of Engineers
and Computer Scientists (IMECS), March 2012.
74
[11] S. Manju and V. Sornagopal, “An efficient SQRT architecture of carry select adder design
by common Boolean logic,” International Conference on Emerging Trends in VLSI,
Embedded System, Nano Electronics and Telecommunication System (ICEVENT), January
2013.
[12] P. Devi, A. Girdher, and B. Singh, “Improved Carry Select Adder with Reduced Area and
Low Power Consumption,” International Journal of Computer Applications, Vol.3, No.4,
pp. 14-18, June 2010.
[13] B. Ramkumar, H. M. Kittur and P. M. Kannan, “ASIC Implementation of Modified Faster
Carry Save Adder,” European Journal of Scientific Research, Vol.42, No. 1, pp. 53-58,
2010.
[14] V. G. Oklobdzija, “High-Speed VLSI Arithmetic Units: Adders and Multipliers” Design of
High-Performance Microprocessor Circuits, IEEE press, 2000.
[15] J. E. Stine, “Digital Computer Arithmetic Data Path Design Using Verilog HDL,” Kluwer
academic publishers, 2004.
[16] Altera “DE2 Development and Education Board, user manual” Version 1.4, 2006.
[17] Handouts “Introduction to FPGAs” June 2009.
[18] L. Shanigarapu and B. P. Shrivastava, “Low-Power and High Speed Carry Select Adder,”
International Journal of Scientific and Research Publications, Volume 3, Issue 8, pp. 01-09,
August 2013.
[19] A. Mitra, A. Bakshi, B. Sharma and N. Didwania “Design of a High Speed Adder”
International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-
2015.
[20] A. N. M. Hossain, M. A. Abedin , “Implementation of an XOR Based 16-bit Carry Select
Adder for Area, Delay and Power Minimization”.International Conference on Electrical,
Computer and Communication Engineering (ECCE),CUET-2019.
[21] JISMI.T.A, NITHIN JOSE K, “An Area-Efficient Carry Select Adder Designed by Using
Transmission Gate,” International Journal of VLSI and Embedded Systems-IJVES Volume
06, pp 1519-1522, May 2015.
[22] K. Sanjay, V. V. Teresa,“Modified Full Adder Architecture For Area Efficient Carry Select
Adder,”International Journal of Engineering Research & Technology (IJERT),Vol. 2, pp.
968-972, May 2013.
75
[23] B. Ramkumar, H. M. Kittur and P. M. Kannan, “ASIC implementation of modified faster
carry save adder,” European Journal on Scientific Research,vol. 42, no. 1, pp. 53–58, 2010.
[24] Padma Devi, Ashima Girdher, Balwinder Singh,“Improved Carry Select Adder with
Reduced Area and Low Power Consumption,” International Journal of Computer
Applications (0975-8887), Volume 3, No.4, June 2010.
[25] Mariano Aguirre-Hernandez, Monico LinaresAranda,“CMOS Full-Adders for Energy-
Efficient Arithmetic Applications,” in IEEE Transactions On Very Large Scale Integration
(VLSI) Systems, Vol. 19, No. 4, April 2011.
[26] Guguloth Sreekanth,V Harish, D Mohammad Elias,“Design Of Low Power and Area
Efficient Carry Select Adder (CSLA) Using Verilog Language,” International Journal of
Engineering And Science, Vol.6, PP. 61-66, May 2016.
[27] B. Vijaya Lakshmi, B. Praveen Kumar,“Design of Modified Carry Select Adder with Low
Power and Efficient Area Using D-Latch,” International Journal of Innovative Research in
Science, Engineering and Technology, Vol. 7, February 2018.
[28] B. Ramkumar, H. M. Kittur and P. M. Kannan, “ASIC Implementation of Modified Faster
Carry Save Adder,” European Journal on Scientific Research, vol. 42, no. 1, pp. 53–58,
2010.
[29] B. Ramkumar and Harish M. Kittur, “Low-Power and Area-Efficient Carry Select
Adder,”IEEE Transactions on Very Large Scale Integration (VLSI) Systems, pp. 371–375,
vol. 20, no. 2, February 2012.
[30] Basant Kumar Mohanty and Sujit Kumar Patel, “ Area, Delay and Power Efficient Carry
Select Adder,” IEEE Transactions on Circuits and Systems-II:Express Briefs, vol. 61, no.
6, June 2014.
[31] U. Sajesh Kumar, K. Mohamed Salih and K. Sajith, “Deisgn and Implementation of Carry
Select Adder without using Mutilpexers,” 1st International Conference on Emerging
Technology in Electronics, Communication and Networking, 2012.
[32] K. Bala Sindhuri, K. Padma Vasavi, I. Santi Prabha and N. Udaya Kumar, “VLSI
Architecture for Linear Carry Select Adder with Zero Finding Logic,” 6th International
Advanced Computing Conference, pp. 31, February 2016.
[33] B. Tapasvi, K. Bala Sindhuri, I. Chaitanya Varma and N. Udaya Kumar, “Implementation
of 64 Bit KoggeStone Carry Select Adder with ZFC For Efficient Area,”IEEE International
Conference on Electrical, Computer & Communication Technology, SVS College of
Engineerng, Coimbatore, 5-7 March-2015.