7/29/2019 VLSI Implementation Styles
© 2006 by CRC Press LLC
44 Full-Custom and Semi-Custom Design
CONTENTS
44.1 Introduction
44.1.1 Semi-Custom Design
44.1.2 Full-Custom Design
44.1.3 Motivation for Semi-Custom Design
44.2 Full-Custom Design Sequence of a Digital System
References
44.1 Introduction
As integrated circuits become more inexpensive and compact, many new types of products, such as digital
cameras, digital camcorders, and digital television [2], are being introduced, based on digital systems.
Consequently, logic design must be done under many different motivations. Since each case is different,
we have different design problems. For example, we have to choose an appropriate IC (integrated circuit)
logic family, since these cases have different performance requirements (scientific computers require high
speed, but wristwatches require very low power consumption), although in recent years, CMOS has been
more widely used than other IC logic families, such as ECL, which has been used for fast computers.
Logic functions that are frequently used by many designers, such as a full adder, are commercially
available as off-the-shelf IC packages. (A package means an IC chip or a discrete component encased in
a container.) Logic networks that realize such logic functions are often called standard (logic) networks.
A single component, such as a resistor or a capacitor, is also commercially available as an off-the-shelf
discrete component package. Logic networks can be assembled with these off-the-shelf packages. In many
cases, not only performance requirements but also compactness and low cost are very important for
products such as digital cameras. So, digital systems must accordingly be realized in IC packages that are
designed and tailored to specific objectives, rather than by assembling many of these off-the-shelf packages
on pc-boards, although assembly from off-the-shelf packages has the advantage of making partial
design changes easy.
Here, however, let us consider two important cases of designing an IC chip for an IC package that is
not off-the-shelf. They lead to two sharply contrasting logic design approaches: quick design
and high-performance design. Quick design of IC chips is called semi-custom design (recently called
ASIC design, where ASIC abbreviates Application Specific Integrated Circuit), whereas deliberate design for
high performance is called full-custom design because the design is fully customized for high
performance. Full-custom design is discussed in this chapter, and different approaches to semi-custom
design will be discussed in the succeeding chapters.
Saburo Muroga
University of Illinois at Urbana-Champaign
The second term on the right-hand side of Eq. 44.1, [Manufacturing cost per IC package], is roughly
proportional to the size of each chip once the complexity of manufacturing is determined, and is usually
on the order of dollars, or tens of dollars, in the case of commercial chips. In the case of full-custom
design, chips are deliberately designed by many designers over many months. So [Design expenses],
the first term on the right-hand side of Eq. 44.1, is very high and can easily be on the order of tens of
millions of dollars. Thus, the first term is far greater than the second term, making [Total cost of an IC
package] very expensive unless [Production volume] is very large, on the order of tens of millions or
more. Many digital systems that use IC chips are produced in low volume, so [Design expenses]
must be very low. Semi-custom design serves this purpose, and CAD programs need to be used extensively
to shorten design time and reduce manpower in order to reduce [Design expenses]. In this case,
[Manufacturing cost per IC chip] is higher than in the case of full-custom design because the size of each
chip is larger.
Thus, we can see the following from the formula in Eq. 44.1: chips by semi-custom design are cheaper
in small production volume than those by full-custom design, but more expensive in high production
volume. But chips by full-custom design are cheaper in the case of high volume production, and are
expensive for low volume production.
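The trade-off described above can be sketched numerically. The form of Eq. 44.1 used here (design expenses divided by production volume, plus manufacturing cost per package) is inferred from the surrounding discussion, and every dollar figure is a hypothetical value chosen only to show the crossover.

```python
def total_cost_per_package(design_expenses, production_volume, manufacturing_cost):
    """Eq. 44.1 as inferred from the text: design expenses amortized over the
    production volume, plus the manufacturing cost per package."""
    return design_expenses / production_volume + manufacturing_cost


def break_even_volume(full, semi):
    """Volume at which both styles cost the same per package.

    Each argument is a (design_expenses, manufacturing_cost) pair.
    """
    (d_full, m_full), (d_semi, m_semi) = full, semi
    return (d_full - d_semi) / (m_semi - m_full)


# Hypothetical figures, chosen only to illustrate the trend.
FULL_CUSTOM = (20_000_000, 5.0)   # high design expenses, small chip
SEMI_CUSTOM = (500_000, 15.0)     # low design expenses, larger chip

if __name__ == "__main__":
    for volume in (10_000, 1_000_000, 100_000_000):
        full = total_cost_per_package(FULL_CUSTOM[0], volume, FULL_CUSTOM[1])
        semi = total_cost_per_package(SEMI_CUSTOM[0], volume, SEMI_CUSTOM[1])
        print(f"{volume:>11,d}  full-custom ${full:,.2f}  semi-custom ${semi:,.2f}")
    print("break-even volume:", break_even_volume(FULL_CUSTOM, SEMI_CUSTOM))
```

With these assumed numbers, semi-custom is cheaper below the break-even volume and full-custom is cheaper above it, which is the qualitative conclusion drawn from Eq. 44.1.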
44.2 Full-Custom Design Sequence of a Digital System
Full-custom design flow of a digital system follows a long sequence of different design stages, as follows.
First, the architecture of a digital system is designed by a few people. The performance or cost of the
entire system is predominantly determined by architectural design, which must be done based on good
knowledge of all other aspects of the system, including logic design and also software to be run. If an
inappropriate architecture is chosen, the best performance or lowest cost of the system cannot be achieved,
even if logic networks, or other aspects like software, are designed to yield the best results. For example,
if microprogramming is chosen for the control logic of a microcomputer based on ROM, it occupies too
much of the precious chip area, sacrificing performance and cost, although we have the advantages of
short design time and design flexibility. Thus, if performance or manufacturing cost is important,
realization of control logic by logic networks (i.e., hard-wired control logic) is preferred. Actually, every
design stage is important for the performance of the entire system. Logic design is one of the key factors
for computer performance, along with architecture design, transistor circuit design, layout design, compilers,
and application programs. Even if other factors are the same, computer speed can be significantly
improved by deliberate logic design.
Next, appropriate IC logic families and the corresponding transistor circuit technology are chosen for
each segment of the system. Other aspects such as memories are simultaneously determined in greater
detail. We do not use expensive, high-speed IC logic families where speed is not required.
Architecture and transistor circuits are outside the scope of this handbook, so they are not discussed
here further.
The next stage in the design sequence is the design of logic networks, considering cost reduction and
the highest performance, realizing functions for different segments of the digital system. Logic design
requires many engineers for a fairly long time.
Then, logic networks are converted into transistor circuits. This conversion is called technology
mapping. It is difficult to realize the functions of the digital system with transistor circuits directly,
skipping logic design, although experienced engineers can do logic design and technology mapping
at the same time, at least partly. Logic design with AND, OR, and NOT gates, using conventional switching
theory, is convenient for human minds because AND, OR, and NOT gates in logic networks directly
correspond, respectively, to basic logic operations, AND, OR, and NOT in logic expressions. Thus, logic
design with AND, OR, and NOT gates is usually favored for manual design by designers and then followed
by technology mapping. For example, the logic network with AND and OR gates shown in Figure 44.1(a)
is technology-mapped into the MOS circuit shown in Figure 44.1(c). A variety of IC logic families, such
45 Programmable Logic Devices
CONTENTS
45.1 Introduction
45.2 PLAs and Variations
45.3 Logic Design with PLAs
45.4 Dynamic PLA
45.5 Advantages and Disadvantages of PLAs
45.5.1 Applications of PLAs
45.6 Programmable Array Logic
References
45.1 Introduction
Hardware realization of logic networks is generally very time-consuming and expensive. Also, once logic
functions are realized in hardware, it is difficult to change them. In some cases, we need logic networks
that are easily changeable. One such case is logic networks whose output functions need to be changed
frequently, such as control logic in microprocessors, or logic networks whose outputs need to be flexible,
such as additional functions in wristwatches and calculators. Another case is logic networks that need
to be debugged before finalizing. Programmable logic devices (i.e., PLDs) are for this purpose. On these
PLDs, all transistor circuits are laid out on IC chips prior to designers' use, considering all anticipated
cases. With PLDs, designers can realize logic networks on an IC chip by only deriving concise logic
expressions, such as minimal sums or minimal products, and then making connections among pre-laid logic
gates on the chip. So, designers can realize their own logic networks quickly and inexpensively using these
pre-laid chips, because they need not design logic networks, transistor circuits, and layout for each design
problem. Thus, designers can save months of hardware design time. CAD programs for
deriving minimal sums or minimal products are well developed [1], so logic functions can be realized very
easily and quickly as hardware, using these CAD programs. The ease in changing logic functions without
changing hardware is just like programming in software, so the hardware in this case is regarded as
programmable. Programmable logic arrays (i.e., PLAs) and FPGAs are typical programmable logic devices.
PLDs consist of mask-programmable PLDs and field-programmable PLDs. Mask-programmable
PLDs (i.e., MPLDs) can be made only by semiconductor manufacturers because connections are made
by custom masks. Manufacturers need to make only a few masks for connections, out of a total of more than
20 masks, according to customers' specifications of what logic functions are to be realized. Unlike
mask-programmable PLDs, field-programmable PLDs (i.e., FPLDs) can be programmed by users and are
economical only for small production volume, whereas MPLDs are economical for high production
volume. Logic functions can be realized quicker on FPLDs than on MPLDs, saving payment of charges
by De Morgan's theorem. Thus, this is interpreted as a network of AND gates in the first level and
OR gates in the second (output) level, as illustrated in Figure 45.1(d). This is the reason why the
upper and lower matrices in Figure 45.1(a) are called AND and OR arrays, respectively. The vertical
lines which run through the two arrays in Figure 45.1(a) are called the product lines, since they
correspond to the product terms in disjunctive forms for the output functions f1, f2, and f3. Thus,
any combinational network (or networks) of AND and OR gates in two levels can be realized by a
PLA. The connections of MOSFET gates to horizontal or vertical lines are usually denoted by dots,
as shown in Figure 45.2.
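The two-level interpretation can be made concrete with a small behavioral model. This is only a sketch: the product lines and output connections below are hypothetical and are not the ones drawn in Figure 45.1, and the names `pla_eval`, `AND_ARRAY`, and `OR_ARRAY` are introduced here purely for illustration.

```python
from itertools import product


def pla_eval(and_array, or_array, inputs):
    """Evaluate a two-level AND-OR PLA.

    and_array: one dict per product line, mapping an input name to the
               literal it must match (1 = true literal, 0 = complemented).
    or_array:  one list per output, holding the indices of the product
               lines connected to that output.
    inputs:    dict mapping input names to 0 or 1.
    """
    # AND array: a product line is 1 only if every connected literal holds.
    products = [
        int(all(inputs[name] == value for name, value in line.items()))
        for line in and_array
    ]
    # OR array: an output is 1 if any connected product line is 1.
    return [int(any(products[i] for i in lines)) for lines in or_array]


# Hypothetical programming: three product lines shared by two outputs.
AND_ARRAY = [
    {"x": 1, "y": 1},   # P0 = x y
    {"x": 0, "z": 1},   # P1 = x' z
    {"y": 0, "z": 0},   # P2 = y' z'
]
OR_ARRAY = [
    [0, 1],             # f1 = P0 + P1
    [1, 2],             # f2 = P1 + P2
]

if __name__ == "__main__":
    for x, y, z in product((0, 1), repeat=3):
        print(x, y, z, pla_eval(AND_ARRAY, OR_ARRAY, {"x": x, "y": y, "z": z}))
```

Each dict entry plays the role of a dot in the AND array and each index in `OR_ARRAY` plays the role of a dot in the OR array, so changing the realized functions means changing only these two tables.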
Sequential networks can also be easily realized on a PLA, as shown in Figure 45.2. Some outputs of
the OR array are connected to the inputs of master-slave flip-flops (usually J-K master-slave flip-flops),
whose outputs are in turn connected to the AND array as its inputs. More than one sequential network
can be realized on a single PLA, along with many combinational networks. Flip-flops can also be realized
inside the AND and OR arrays without providing them outside the arrays.
In many PLAs, the option of an output f1 or its complement is provided in order to give flexibility,
as illustrated in the lower right-hand corner of Figure 45.2. By disconnecting one of the two marked
connections at each output, we can have either f1 or its complement as output, as illustrated in
Figure 45.3. When f1 has too many products in its disjunctive form and cannot be realized on a PLA,
its complement may have a sufficiently small
number of terms to be realizable on the PLA, or vice versa.
If the number of product lines in a PLA is too large, each horizontal line gets too long with a significant
increase in parasitic capacitance. Then, if the majority of the MOSFET gates provided are connected to
this horizontal line, the input or its inverter has too many fan-out connections on this horizontal line.
Similarly, the total number of horizontal lines cannot be too large. In other words, the array size of a
PLA is limited because of speed considerations. In contrast, the size of a ROM can be much larger, since
we can use more than one decoder, or use a complex decoding scheme.
FIGURE 45.2 PLA with flip-flops and output-complementation choice.
The PLA shown in Figure 45.3, for example, is minimized for the given functions f1, f2, and f3, with 8
product lines and array size (2 × 4 + 3) × 8 = 88.
However, the minimization of the number of connections in a minimal two-level AND-OR network
may not be as important as the minimization of the number of AND gates, although it tends to reduce
the power consumption, because the chances of faulty PLAs can be greatly reduced by careful fabrication
of chips. But the PLA size is determined by the number of AND gates and cannot be changed by any
other factors. Also, instead of making connections (i.e., dots) as they become necessary on a PLA, a
PLA is sometimes prepared by disconnecting unnecessary connections by laser beam or by blowing
fuses after it has been manufactured with all MOSFET gates connected to the lines. In this case, the
chances of faults can be reduced by increasing the number of connections (i.e., the number of dots) in
the two-level AND-OR network.
For comparison with a PLA, the MOS realization of a ROM is shown in Figure 45.4. The upper matrix
is a decoder which has 2^n vertical lines if there are n input variables. The lower matrix stores information
by connecting or not connecting MOSFET gates. Figure 45.4 actually realizes the same output functions
(in negative logic) as those in Figure 45.1(a). The AND array in Figure 45.1(a) is essentially a counterpart
of the decoder in Figure 45.4, or the decoder may be regarded as a fixed AND array with 2^n product
lines, which is the maximum number of the product lines in a PLA. The AND array in Figure 45.1(a)
has only three vertical lines, whereas the decoder in Figure 45.4 has eight fixed vertical lines. This indicates
the compact information packing capability of PLAs. PLAs are smaller than ROMs, although the packing
advantage of PLAs varies, depending on functions. For example, if we construct a ROM that realizes
the functions of the PLA of Figure 45.3, in a manner similar to Figure 45.4, the decoder consists of 8
horizontal lines and 16 vertical lines, and the lower matrix for information storage consists of 16 vertical
lines and 3 horizontal lines. Thus, the ROM requires the array size of 16 × (8 + 3) = 176, compared with
88 in Figure 45.3.
FIGURE 45.4 ROM that corresponds to the PLA in Figure 45.1.
Generally, the size difference between PLAs and ROMs sharply increases as the number of input
variables increases.
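The array-size arithmetic quoted above can be reproduced in a few lines. The general formulas used here, (2n + m) × p for a PLA and 2^n × (2n + m) for a ROM with n inputs, m outputs, and p product lines, are read off from the two worked figures in the text (88 and 176), so treat them as an interpretation rather than a definition given by the chapter.

```python
def pla_array_size(n_inputs, n_outputs, n_products):
    """(2n + m) x p: each input supplies a true and a complemented line to the
    AND array, each output adds one line to the OR array, and every product
    line crosses all of them."""
    return (2 * n_inputs + n_outputs) * n_products


def rom_array_size(n_inputs, n_outputs):
    """2**n x (2n + m): the decoder behaves like a fixed AND array with one
    product line per input combination."""
    return 2 ** n_inputs * (2 * n_inputs + n_outputs)


if __name__ == "__main__":
    # Figures quoted in the text: 4 inputs, 3 outputs, 8 product lines.
    print(pla_array_size(4, 3, 8), rom_array_size(4, 3))
    # Hypothetical growth: keep 8 product lines and 3 outputs, add inputs.
    for n in (4, 8, 12):
        print(n, pla_array_size(n, 3, 8), rom_array_size(n, 3))
```

Under the assumption that the number of product lines stays fixed, the ratio of ROM size to PLA size is 2^n / p, which is one way to see why the gap widens so quickly as inputs are added.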
A PLA, however, cannot store some functions, such as x1 ⊕ x2 ⊕ ... ⊕ xn if n is large, because 2^(n−1)
product lines are required and the number of these lines is excessively large for a PLA. (The horizontal
lines become too long with excessive fan-out and parasitic capacitance.) However, we can store these
functions in a ROM with an appropriate decoding scheme.
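A short check shows why the exclusive-OR (parity) function of n variables is so costly on a PLA. The sketch below enumerates the input combinations where the parity is 1 and confirms that any two of them differ in at least two positions, so no two can be merged into a larger product term and each needs its own product line. The function names are illustrative only.

```python
from itertools import combinations, product


def parity_minterms(n):
    """Input combinations for which x1 XOR x2 XOR ... XOR xn equals 1."""
    return [bits for bits in product((0, 1), repeat=n) if sum(bits) % 2 == 1]


def min_pairwise_distance(minterms):
    """Smallest number of differing positions between any two minterms
    (None when there are fewer than two)."""
    distances = [
        sum(a != b for a, b in zip(m1, m2))
        for m1, m2 in combinations(minterms, 2)
    ]
    return min(distances) if distances else None


if __name__ == "__main__":
    for n in range(2, 9):
        terms = parity_minterms(n)
        print(n, len(terms), min_pairwise_distance(terms))
```

The count of minterms doubles with every added variable, which matches the 2^(n−1) product lines mentioned in the text.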
Of course, in the case of ROMs, storing a truth table without worrying about conversion of given logic
functions into a minimal sum is convenient, although it makes the ROM size bigger than the PLA size.
Minimal two-level networks of AND and OR gates for the absolute minimization of the PLA size
can be derived by the minimization methods discussed in earlier chapters, if a function to be
minimized has either at most several variables, or many more variables but with a simple relationship
among its prime implicants [8]. But otherwise, we have to be content with near-minimal networks
instead of minimal networks. In many cases, efforts to reduce the PLA size, even without reaching
an absolute minimum, result in significant size reduction. Also, CAD programs have been developed
with heuristic minimization methods [12,13], such as the one by Hong et al. [7], which was the
first powerful heuristic procedure drastically different from conventional minimization procedures.
MINI, the PLA minimization program of Hong et al., was later improved to ESPRESSO by Rudell,
Brayton, et al. [1,10,11]. Recently, however, Coudert and Madre [26] developed a new method for
absolute minimization by implicitly expressing prime implicants and minterms using BDDs
described in Chapter 29. By this method, absolute minimization of functions with greater numbers
of variables is more feasible than before, although it is still time-consuming.
45.4 Dynamic PLA
If we want to realize a PLA in CMOS in order to save power, instead of the static nMOS circuit discussed
in Chapter 33, Section 33.3, then a PLA in static CMOS requires a large area because
we need both pMOS and nMOS subcircuits. Thus, instead of static CMOS, the dynamic CMOS illustrated in
Figure 45.5(a) is usually used. During the absence of a clock pulse of the first- and second-phase clocks,
φ1 and φ2 (i.e., during φ1 = φ2 = 0 (low voltage, using positive logic)), shown in Figure 45.5(b), pMOSFETs
T1, T2, and T3 become conductive and nMOSFETs T4, T5, and T6 become non-conductive, precharging
vertical lines P1, P2, and P3. When a clock pulse of the first-phase clock, φ1, appears but a clock pulse of
the second-phase clock, φ2, has not yet appeared, i.e., when φ1 = 1 (high voltage) and φ2 = 0, pMOSFETs
T1, T2, and T3 become non-conductive and nMOSFETs T4, T5, and T6 become conductive. Then,
depending on the values of x, y, and z, some of the vertical lines P1, P2, and P3 are discharged through some
of the nMOSFETs in the AND array. (For example, if y = 0 (low voltage), P1 is discharged through
nMOSFET A.) A clock pulse of the second-phase clock, φ2, is still absent (i.e., φ2 = 0), so pMOSFETs
T7, T8, and T9 become conductive and nMOSFETs T10, T11, and T12 become non-conductive, precharging
horizontal lines f1, f2, and f3. When a clock pulse of the first-phase clock, φ1, is still present and a clock
pulse of the second-phase clock, φ2, appears, i.e., when φ1 = φ2 = 1, pMOSFETs T7, T8, and T9 become
non-conductive and nMOSFETs T10, T11, and T12 become conductive. Then, some of the horizontal lines f1,
f2, and f3 are discharged through some of the nMOSFETs in the OR array, depending on which of the
vertical lines P1, P2, and P3 are still charged.
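The precharge-then-evaluate sequence can be followed step by step in a purely behavioral sketch. It models only logic levels, not transistors or timing; the two-phase clocks are written phi1 and phi2 in the comments, and the connections in `AND_GATES` and `OR_GATES` are hypothetical rather than those of Figure 45.5. Whether the raw output line level or its inverse is taken as the function value depends on the logic convention and output circuitry, which this sketch leaves out.

```python
from itertools import product


def with_complements(inputs):
    """Add a complemented signal (name + "'") for every input."""
    signals = dict(inputs)
    signals.update({name + "'": 1 - value for name, value in inputs.items()})
    return signals


def dynamic_pla_cycle(and_gates, or_gates, signals):
    """Step through one precharge/evaluate cycle.

    and_gates[i]: signals driving the nMOSFET gates on product line i
                  (any high signal discharges the line).
    or_gates[j]:  product lines driving the nMOSFET gates on output line j
                  (any line still charged discharges the output line).
    Returns the final levels of the product lines and the output lines.
    """
    # phi1 = phi2 = 0: product lines are precharged high.
    p_lines = [1] * len(and_gates)

    # phi1 = 1, phi2 = 0: the AND array evaluates while the output lines
    # are precharged high.
    for i, gates in enumerate(and_gates):
        if any(signals[g] for g in gates):
            p_lines[i] = 0
    f_lines = [1] * len(or_gates)

    # phi1 = phi2 = 1: the OR array evaluates.
    for j, lines in enumerate(or_gates):
        if any(p_lines[i] for i in lines):
            f_lines[j] = 0
    return p_lines, f_lines


# Hypothetical connections, not those of Figure 45.5.
AND_GATES = [["x'", "y'"], ["y'", "z"]]   # P0 stays high iff x y; P1 iff y z'
OR_GATES = [[0, 1]]                       # output line 0 sees P0 and P1

if __name__ == "__main__":
    for x, y, z in product((0, 1), repeat=3):
        sig = with_complements({"x": x, "y": y, "z": z})
        print(x, y, z, dynamic_pla_cycle(AND_GATES, OR_GATES, sig))
```

In this model a product line gated by the complement of y is discharged whenever y = 0, which mirrors the example given in the text for line P1.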
45.5 Advantages and Disadvantages of PLAs
PLAs, like ROMs which are more general, have the following advantages over random-logic gate networks,
where random-logic gate networks are those that are compactly laid out on an IC chip:
1. There is no need for the time-consuming logic design of random-logic gate networks and even
more time-consuming layout.
2. Design checking is easy, and design change is also easy.
PLAs have the following advantage and disadvantage, compared with ROMs:
1. For storing the same functions or tasks, PLAs can be smaller than ROMs; generally, the size
difference sharply increases as the number of input variables increases.
2. The small-size advantage of PLAs diminishes as the number of terms in a disjunctive form
increases. Thus, PLAs cannot store complex functions, i.e., functions whose disjunctive forms
consist of many product terms.
45.5.1 Applications of PLAs
Considering the above advantages and disadvantages, PLAs have numerous unique applications. A
microprocessor chip uses many PLAs because of the ease of design change and checking. In particular, PLAs are used
in its control logic, which is complex and requires many changes, even during its design. Also, PLAs are
used for code conversions, microprogram address conversions, decision tables, bus priority resolvers, and
memory overlay.
When a new product is to be manufactured in small volume or test-marketed, PLAs are a choice. When
the new product is well received in the market and does not need further changes, PLAs can be replaced
by random-logic gate networks for low cost in high-volume production and for high speed. Also, a full-custom
design approach is very time-consuming, probably taking months or years, but if PLAs are used
in the control logic, a number of different custom-design chips with high performance can be made
quickly by changing only one connection mask for the PLAs, although these chips cannot have drastically
different performance and functions.
45.6 Programmable Array Logic
A programmable array logic (PAL) is a special type of a PLA where the OR array is not programmable.
In other words, in a PAL, the AND array is programmable but the OR array is fixed; whereas in a PLA,
both arrays are programmable. The advantage of PALs is the elimination of fuses in the OR array in
Figure 45.1(a) and special electronic circuits to blow these fuses. Since these special electronic circuits
and programmable OR array occupy a very large area, the area is significantly reduced in a PAL. Since
single-output, two-level networks (i.e., many AND gates in the first level and one OR gate as the network
output) are needed most often in design practice, many single-output two-level networks which are
mutually unconnected are placed in some PAL packages.
In digital systems, many non-standard networks are still used because designers want to differentiate
their computers from those of competitors. But logic functions that designers want to have are too diverse to be
standardized by semiconductor manufacturers. When off-the-shelf IC packages for standard networks,
including microprocessors and their peripheral networks, are assembled on pc boards, many non-
standard networks are usually required for interfacing them to other key networks or for minor modi-
fications. So, they require many discrete components and IC packages, each of which has a smaller number
of transistors, in addition to a microprocessor package with millions of gates, occupying a significant
share of the areas on pc boards. Now, we can make connections inside PALs, instead of custom-making
pc boards. Custom-made pc boards are expensive and time-consuming because connection patterns on
pc boards need to be designed, these pc boards need to be manufactured, and then the pins of IC packages
have to be soldered into the holes of the pc boards. The replacement by PAL packages can substantially reduce
the area, time, and cost. If we consider related factors such as reductions of cabinet size, power consump-
tion, and fans, the significance of this reduction is further appreciated.
There are mask-programmable PALs and field-programmable PALs (i.e., FPALs). When logic design
is not finalized and needs to be changed often, FPAL packages can reduce expense and time for repeatedly
redesigning and remaking pc boards.
46.2 CMOS Gate Arrays
CMOS gate arrays are commercially available from many manufacturers in slightly different layout
forms. As an example, Figure 46.2 shows a cell of a CMOS gate array, where a pair of pMOSFETs and
a pair of nMOSFETs are placed on the left and right, respectively, without connections between them.
The NAND gate shown in Figure 46.3(a) can be realized by connecting the components shown in
Figure 46.2 by two metal layers as shown in Figure 46.3(b). These two metal layers are made by
forming the first metal layer shown in Figure 46.3(c), the insulation layer (not shown), and then the
second metal layer shown in Figure 46.3(d). The inverter shown in Figure 46.4(a) can be realized by connections
as shown in Figure 46.4(b).
Many different patterns other than that in Figure 46.2 are available for the components of a cell.
FIGURE 46.1 Gate array: (a) before making connections; (b) after connections made.
FIGURE 46.2 A cell of CMOS gate array. (Courtesy of Fujitsu Ltd. With permission.)
logic networks. The cost difference would be greater (the cost is not necessarily linearly propor-
tional to chip size) for the same production volume.
3. It is difficult to keep gate delays uniform. As the number of fan-outs and the length of fan-out
connections increase, delays increase dramatically. (If delay times of gates are not uniform, the network
tends to generate spurious output signals.) In the case of full-custom design, the increase of gate delay
by long or many-output connections of a gate can be reduced by redesigning the transistor circuit
(e.g., increasing transistor size for delivering greater output power and accordingly reducing the delay).
But such a precise adjustment is not possible in the case of gate arrays.
Responding to a variety of different user needs in terms of speed, power consumption, cost, design
time, ease of change, and possibly others, a large number of different gate arrays are commercially available
from semiconductor manufacturers or are used in-house by computer manufacturers. Different numbers
of gates are placed on a chip, with different configuration capabilities. Some gate arrays contain memories,
for example.
References
1. Okabe, M. et al., A 400k-transistor CMOS sea-of-gate array with continuous track allocation, IEEE
J. Solid-State Circuits, pp. 1280–1286, Oct. 1989.
2. Muroga, S., VLSI System Design, John Wiley & Sons, 1982.
3. Price, J.E., VLSI chip architecture for large computers, in Hardware and Software Concepts in VLSI,
Ed. by G. Rabbat, Van Nostrand Reinhold, pp. 95–115, 1983.
functions by software. Even application programs can be run on FPGAs and perform much faster than
on a general-purpose computer in many cases.
As the price of FPGAs goes down with higher speed, FPGAs are replacing other semi-custom design
approaches in many applications.
47.2 Basic Structures of FPGAs
In the case of mask-programmable gate arrays, designers have to wait a few weeks for delivery of finished
gate arrays from semiconductor manufacturers because the semiconductor manufacturers must prepare
custom masks (although the number of custom masks for gate arrays is smaller than in the case of the
standard-cell library approach described in Chapter 48). With FPGAs, designers can realize their design
on FPGA chips by themselves in minutes. Thus, FPGAs are becoming popular [1,2,8–10].
Several different types of structures for FPGAs are available commercially. All of them have a basic
structure that consists of many logic blocks or logic cells, accompanied by a large number of pre-laid
lines for connecting these logic blocks. So, some manufacturers call FPGAs logic block arrays (LBAs).
One has a structure similar to a gate array with routing channels where each logic cell in a gate array is
replaced with a logic block, as shown in Figure 47.1. Another one is similar to a sea-of-gate array, as shown
in Figure 47.2, illustrated with 16 logic blocks. Also, there is a structure similar to standard cells (to be
discussed in the next chapter) where there are routing channels between a pair of rows of logic blocks,
as shown in Figure 47.3. There is also a structure where outputs of logic blocks are connected to the inputs
of other logic blocks through bus lines, as shown in Figure 47.4.
The internal structure of logic blocks or logic cells differs, depending on the manufacturer. A logic
block consists of SRAMs (used as look-up tables), PALs, NAND gates, along with multiplexers, flip-flops,
and others. Lines are pre-laid horizontally and vertically and are connected to the inputs and outputs of
logic blocks by programmable switches. Various programmable switches, such as fuses, anti-fuses, RAMs,
and non-volatile memories, are provided by different manufacturers. Each line actually consists of many
short line segments and only necessary line segments are connected in order not to add unnecessary
delay due to parasitic capacitance by using an excessive number of line segments. Line segments are also
connected by programmable switches.
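How an SRAM serves as a look-up table inside a logic block can be illustrated with a minimal sketch. The helper names `make_lut` and `lut_read` are hypothetical, and real logic blocks differ by manufacturer; the point is only that a k-input function is stored as 2^k bits and the inputs act as the read address.

```python
def make_lut(function, k):
    """Build the 2**k stored bits of a k-input look-up table.

    The first argument of `function` is taken as the most significant
    address bit.
    """
    table = []
    for address in range(2 ** k):
        bits = [(address >> (k - 1 - i)) & 1 for i in range(k)]
        table.append(int(bool(function(*bits))))
    return table


def lut_read(table, *bits):
    """Read the stored bit addressed by the input values."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return table[address]


# A full adder realized as two 3-input tables (sum and carry).
SUM_LUT = make_lut(lambda a, b, c: a ^ b ^ c, 3)
CARRY_LUT = make_lut(lambda a, b, c: (a & b) | (b & c) | (a & c), 3)

if __name__ == "__main__":
    print("sum table:  ", SUM_LUT)
    print("carry table:", CARRY_LUT)
    print(lut_read(SUM_LUT, 1, 1, 0), lut_read(CARRY_LUT, 1, 1, 0))
```

Realizing a different function only means writing different bits into the table, which is the sense in which the hardware is programmable without being rebuilt.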
FIGURE 47.1 FPGA type of gate array with routing channels. (Labeled elements: logic block, switch matrix, connection lines.)
48.3 Hierarchical Design Approach
The cell-library design approaches, which use cells of different shapes and sizes, can reduce the chip size
more than the polycell design approach. In the polycell design approach, keeping all cells at the same
height wastes a large portion of the area of each cell, and keeping all connections among cells in routing
channels may prevent the connection area from being minimized. Moreover, by using a hierarchical approach
based on cells of different shapes and sizes (in other words, by treating many cells as a building block in
a higher level, many such building blocks as a building block in the next higher level, and so on), we can
further reduce the chip area, as illustrated in Figure 48.2, because global area minimization can be treated
better, even though this is done on the monitor. For example, cells A, B, C, and D are assembled into a
block R (shown in a dot-lined rectangle), as shown in Figure 48.2. Then such blocks R, S, T, and U, shown in
dot-lined rectangles, are assembled into a bigger block W, which is a block in a higher level than blocks
R, S, T, and U. But this is much more time-consuming than the polycell design approach,
and the development of efficient CAD programs is harder. It appears to be difficult to bring the chip area
within about 20% of that of full-custom designed chips, although the areas of full-custom
designed chips vary greatly with designers and, accordingly, comparison is not simple.
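The hierarchy described above (cells grouped into a block, blocks grouped into a higher-level block) can be sketched as a small tree structure. This is only an illustrative sketch, not a CAD algorithm from the text: the class names, the cell dimensions, the contents of blocks S, T, and U, and the "pad every cell to the tallest height" model of the polycell approach are all assumptions made for the example, and routing area is not modeled at all.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Cell:
    """A leaf cell with its own width and height (arbitrary units)."""
    name: str
    width: float
    height: float

    def cell_area(self) -> float:
        return self.width * self.height

    def depth(self) -> int:
        # A leaf cell sits at level 0 of the hierarchy.
        return 0


@dataclass
class Block:
    """A building block assembled from cells or from lower-level blocks."""
    name: str
    children: List[Union["Cell", "Block"]] = field(default_factory=list)

    def cell_area(self) -> float:
        # Sum of leaf-cell areas only; routing and dead space are ignored.
        return sum(child.cell_area() for child in self.children)

    def depth(self) -> int:
        # One level above the deepest child.
        return 1 + max((child.depth() for child in self.children), default=0)


def same_height_area(cells: List[Cell]) -> float:
    """Crude stand-in for the polycell constraint (an assumption):
    every cell keeps its width but is padded to the tallest cell's height."""
    tallest = max(cell.height for cell in cells)
    return sum(cell.width * tallest for cell in cells)


# Hypothetical dimensions; only the names A-D, R-U, and W come from Figure 48.2.
cell_a = Cell("A", 4.0, 2.0)
cell_b = Cell("B", 2.0, 3.0)
cell_c = Cell("C", 3.0, 3.0)
cell_d = Cell("D", 2.0, 1.0)
cells_of_r = [cell_a, cell_b, cell_c, cell_d]

block_r = Block("R", cells_of_r)
# The contents of S, T, and U are not given in the text; placeholders are used.
block_s = Block("S", [Cell("s1", 2.0, 2.0)])
block_t = Block("T", [Cell("t1", 2.0, 2.0)])
block_u = Block("U", [Cell("u1", 2.0, 2.0)])
block_w = Block("W", [block_r, block_s, block_t, block_u])

wasted_in_r = same_height_area(cells_of_r) - block_r.cell_area()
```

With these made-up dimensions, `wasted_in_r` is the area lost in block R when cells of different natural heights are forced to a common height, which is the effect the paragraph attributes to the polycell approach. `block_w.depth()` gives the number of hierarchy levels above the leaf cells.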
References
1. Lauther, U., Cell based VLSI design system, in Hardware and Software Concepts in VLSI, Ed. by
G. Rabbat, Van Nostrand Reinhold, pp. 480-494, 1983.
2. Kick, B. et al., Standard-cell-based design methodology for high-performance support chips, IBM
Jour. Res. Dev., pp. 505-514, July/Sept. 1997.
3. Muroga, S., VLSI System Design, John Wiley & Sons, 1982.
FIGURE 48.2 Hierarchical design approach. (Cells A, B, C, and D form block R; blocks R, S, T, and U form the higher-level block W.)
49-4 The VLSI Handbook
has variations and it makes a difference whether or not libraries of cells or macrocells are prepared from
scratch. (Notice that in Figure 49.2, design approaches are shown in thin-line curves for the sake of
simplicity, but actually they should be represented in very broad lines.) The cost per package for the
off-the-shelf package design approach is fairly uniform over the entire range, but it increases for low
production volumes because the development cost becomes significant as initial investment in the overall
package cost. The relationship shown in this figure will change as the integration size of an IC chip
increases, because the dependence on CAD will inevitably increase.
49.4 Comparison of All Different Design Approaches
As discussed so far, we have a very wide spectrum of different design approaches, from full-custom design
approaches to the design approaches with off-the-shelf packages, as illustrated in Table 49.1. Digital
systems can be designed by combining them. Depending upon different criteria imposed by different
design motivations, such as speed, power consumption, size, design time, ease of changes, and reliability,
designers can use the following approaches:
1. Custom design: full- and semi-custom approaches
2. Off-the-shelf discrete components and off-the-shelf IC packages, along with memory packages
3. Off-the-shelf microcomputers along with off-the-shelf IC packages
The full-custom design approaches give us the highest performance and reliability or the smallest
chip size, although they are the most time-consuming. (Even in the case of microcomputers, full-custom
designed microcomputers have better performance and smaller size than off-the-shelf microcomputers,
by being tailored to the user's specific needs.) This is one end of the wide spectrum of
different design approaches. At the other end, the off-the-shelf microcomputers give us a design
approach where the development time is shortest, because the design is done by programming rather than
by chip design (including logic design), and the design changes are the easiest. The off-the-shelf
discrete components and off-the-shelf IC packages give us logic networks tailored to specific needs
with less programming than the off-the-shelf microcomputers.
Custom design approaches, in particular the full-custom design approaches, are the most economical
for very high production volumes (on the order of a few hundred thousand) but the least economical
for low production volumes.
When the production volume is low, the off-the-shelf discrete components and off-the-shelf IC
packages give us the most economical approaches for simple tasks, but the off-the-shelf microcomputers
are more economical for complex tasks, although performance is usually sacrificed.
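The dependence of cost on production volume described above can be illustrated with a simple amortization model: the cost per unit is the initial investment spread over the production volume, plus a per-unit manufacturing cost. The sketch below is an assumption-laden illustration, not a model taken from the text; every dollar figure is hypothetical and chosen only so that the ordering at low and high volumes follows the qualitative claims made here.

```python
from typing import Dict, Tuple


def unit_cost(initial_investment: float, cost_per_unit: float, volume: int) -> float:
    """Initial investment amortized over the volume, plus per-unit cost."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return initial_investment / volume + cost_per_unit


def break_even_volume(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Volume at which approach a (higher investment, lower per-unit cost)
    costs the same per unit as approach b."""
    invest_a, per_unit_a = a
    invest_b, per_unit_b = b
    if per_unit_b <= per_unit_a:
        raise ValueError("approach a must have the lower per-unit cost")
    return (invest_a - invest_b) / (per_unit_b - per_unit_a)


# Hypothetical (initial investment, per-unit cost) pairs, in arbitrary currency.
APPROACHES: Dict[str, Tuple[float, float]] = {
    "full-custom": (1_000_000.0, 2.0),
    "semi-custom": (200_000.0, 5.0),
    "off-the-shelf IC package": (20_000.0, 20.0),
    "off-the-shelf microcomputer": (5_000.0, 30.0),
}


def cheapest(volume: int) -> str:
    """Name of the approach with the lowest unit cost at the given volume."""
    return min(APPROACHES, key=lambda name: unit_cost(*APPROACHES[name], volume))


low_volume_choice = cheapest(100)
high_volume_choice = cheapest(500_000)
crossover = break_even_volume(APPROACHES["full-custom"], APPROACHES["semi-custom"])
```

Under these assumed figures, the high initial investment of the custom approaches dominates at a volume of 100, while at a few hundred thousand units it is amortized almost entirely and the low per-unit cost wins. The model deliberately ignores task complexity and performance, both of which the text identifies as factors in the real choice.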
TABLE 49.1 Comparison of Different Task-Realization Approaches
                     Full-Custom           Semi-Custom        Off-the-Shelf IC Package   Off-the-Shelf Microcomputer
Speed                Fastest               Fast               Medium                     Slowest
Size                 Smallest (chip size)  Small (chip size)  Large (many chips)         Medium (many chips)
Development time     Longest (layout)      Long (layout)      Medium (logic design)      Short (programming)
Flexibility          Lowest                Low                Medium                     High
Initial investment   Highest (layout)      High (layout)      Medium (logic design)      Low (programming)
Unit cost
  High volume        Lowest                Low                Medium                     Highest
  Low volume         Highest               High               Medium                     Lowest
Reliability          Highest               High               Low                        Medium