Memory
• It’s all about storing bits--binary digits
• Vacuum tubes, CRTs, drums, disks, core, ICs
• Issues of size, cost, speed
• Semiconductor memories (chips)
Memory
• How to store
• How to organize--so as to be able to “store” a bit (or byte or word) and then find it again
• How to associate an address with a “set” of bits
Figure 5.1. Connection of the memory to the processor.
[Diagram: the processor's MAR drives a k-bit address bus and its MDR connects to an n-bit data bus; control lines (R/W, MFC, etc.) run between processor and memory; the memory has up to 2^k addressable locations; word length = n bits.]
Memory
• Memory access (read)
– address → MAR
– MAR → bus → memory
– READ → bus → memory
– bus ← data
– processor ← bus ← MFC
– MDR ← bus
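The read sequence can be sketched in a few lines of Python. This is only an illustrative model: the class names `Memory` and `Processor`, and the choice to return MFC as a boolean alongside the data, are assumptions made for the sketch, not part of the slides.

```python
class Memory:
    """Toy memory with 2**k addressable locations (hypothetical model)."""

    def __init__(self, k):
        self.cells = [0] * (2 ** k)

    def read(self, address):
        # The memory puts the data on the bus and raises MFC
        # (Memory Function Completed) when the word is ready.
        return self.cells[address], True


class Processor:
    def __init__(self, memory):
        self.memory = memory
        self.mar = 0  # Memory Address Register
        self.mdr = 0  # Memory Data Register

    def load(self, address):
        self.mar = address                      # address -> MAR
        data, mfc = self.memory.read(self.mar)  # MAR and READ -> bus -> memory
        if mfc:                                 # processor sees MFC
            self.mdr = data                     # data on the bus -> MDR
        return self.mdr


mem = Memory(k=4)
mem.cells[5] = 0xA7
cpu = Processor(mem)
value = cpu.load(5)
```

In this model the memory access time would correspond to the duration of the `memory.read` call, from READ issued to MFC received.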
Memory
• Memory access time:
– Time from Read issued to MFC received
• Memory cycle time:
– Minimum time between the start of two successive reads
Memory
• Obviously, the speed of the processor depends on the speed of the memory
• Random Access Memory (RAM) simply means that access time is fixed (and the same) for all memory locations (addresses)
Figure 5.2. Organization of bit cells in a memory chip.
[Diagram: address lines A0-A3 feed an address decoder that drives word lines W0-W15; each word line selects a row of flip-flop (FF) memory cells; each column's bit-line pair (b7, b′7 ... b0, b′0) connects to a Sense / Write circuit with data input/output lines b7 ... b0; control inputs R/W and CS.]
• A (very) small example: a 128 bit chip
• 16 words of 8 bits each (16 x 8)
Address decoder
• Four input, sixteen output decoder
• 4 address lines (A0-A3), 16 word addresses
• Outputs W0-W15 are the word lines
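As a rough illustration of the decoder's behavior, the sketch below maps a 4-bit address to sixteen word lines of which exactly one is active. The function name `decode` is made up for this example.

```python
def decode(address, n_inputs=4):
    """Return the 2**n_inputs word lines as 0/1 values, exactly one of them 1."""
    n_outputs = 2 ** n_inputs
    if not 0 <= address < n_outputs:
        raise ValueError("address does not fit on the address lines")
    return [1 if line == address else 0 for line in range(n_outputs)]


word_lines = decode(0b0101)  # A3 A2 A1 A0 = 0101 selects W5
```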
Sense / Write circuit
• Word lines W0-W15 each select one row of memory cells
• Each memory cell stores one bit
• Bit lines connect every cell in a column to that column's Sense / Write circuit
• Data output line b7 carries bit 7 of the selected word
• The same arrangement repeats for each bit position, down to b0
• R/W, of course
• CS: chip select for multi-chip memory
128 bit chip, 16 words of 8 bits each (16 x 8)
• 4 address lines
• 8 data lines
• R/W line
• chip select line
• power, ground
• 16 external connections
Chip with 1024 Memory Cells
• Could be 128 x 8
– Same as the 16 x 8, but with more word lines
– Address decoder: 7 address lines (A0-A6), 128 word lines (W0-W127)
– 3 more address lines, 8 data lines, total of 19 external connections
Chip with 1024 Memory Cells
• Or, it could be 1024 x 1
– Address decoder: 10 address lines (A0-A9), 1024 word lines (W0-W1023) of 1 bit each
– 3 more address lines than the 128 x 8, but only 1 data line, total of 15 external connections
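The connection counts for the different organizations follow from simple arithmetic, sketched below. The helper name `external_connections` is hypothetical; the count assumes, as on the 16 x 8 slide, one R/W line, one chip select line, power and ground.

```python
def external_connections(words, bits_per_word):
    """Count the pins of a words x bits_per_word memory chip."""
    address_lines = (words - 1).bit_length()  # lines needed to address `words` locations
    data_lines = bits_per_word
    control_lines = 2  # R/W and chip select
    supply_lines = 2   # power and ground
    return address_lines + data_lines + control_lines + supply_lines


pins_16x8 = external_connections(16, 8)      # 4 + 8 + 2 + 2 = 16
pins_128x8 = external_connections(128, 8)    # 7 + 8 + 2 + 2 = 19
pins_1024x1 = external_connections(1024, 1)  # 10 + 1 + 2 + 2 = 15
```

The same 1024 cells need fewer pins in the 1024 x 1 organization because each extra address line doubles the number of words, while each data line adds only one bit per word.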
Figure 5.3. Organization of a 1K x 1 memory chip.
[Diagram: the 10-bit address is split into a 5-bit row address and a 5-bit column address; a 5-bit decoder drives word lines W0-W31 of a 32 x 32 memory cell array; Sense/Write circuitry; a 32-to-1 output multiplexer and input demultiplexer select the column; control inputs R/W and CS; data input/output (1 bit).]
Multiplexer
• 2 lines to choose one of the 4 inputs as its output
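A minimal sketch of how the 1K x 1 chip of Figure 5.3 could locate one bit: the row address selects one of 32 word lines, and the 32-to-1 multiplexer picks one bit out of that row. Treating the upper 5 bits as the row and the lower 5 bits as the column is an assumption of this sketch, and the function names are made up.

```python
def split_address(address):
    """Split a 10-bit address into (row, column), 5 bits each."""
    if not 0 <= address < 1024:
        raise ValueError("address must fit in 10 bits")
    row = address >> 5         # assumed: upper 5 bits go to the row decoder
    column = address & 0b11111  # assumed: lower 5 bits drive the multiplexer
    return row, column


def read_bit(cell_array, address):
    row, column = split_address(address)
    selected_row = cell_array[row]  # the word line puts 32 bits on the bit lines
    return selected_row[column]     # the 32-to-1 multiplexer picks one of them


cells = [[0] * 32 for _ in range(32)]  # 32 x 32 memory cell array
cells[17][3] = 1
bit = read_bit(cells, 0b1000100011)    # row 17, column 3
```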
Static Memory
• Static: retains its state (content) as long as power is applied
– and, of course, loses it if powered off (volatile)
• SRAM: static RAM
– fast
– expensive
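Figure 5.4. A static RAM cell.
[Diagram: a latch with internal points X and Y, connected through transistors T1 and T2 to bit lines b and b′; the word line controls T1 and T2.]
• Transistors in the circuit are effectively switches
– Voltage = 0 (ground), open switch
– Voltage = Vs, closed switch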
• If the Word line is low, nothing on the bit lines
• If the cell represents a 1, for example (X = 1, Y = 0), and the Word line goes high (read), then
– b = 1 (high)
– b′ = 0 (low)
– sense line sets output high
• Bit lines lead to the sense/write circuit
• If the cell represents a 1, for example (X = 1, Y = 0)
– To write (say 0), put b low, b′ high, set Word line high, latch changes
– Then, if the Word line goes high (read), b = 0, b′ = 1
CMOS SRAM
• Complementary Metal Oxide Semiconductor
• Uses both “P type” and “N type” transistors
An inverter circuit.
[Diagram, panels (a) to (c): Vsupply, resistor R and output Vout, with a switch S in (a) and a transistor T (gate Vin, drain, source) in (b) and (c).]
• NMOS--closed when Vin raised
• PMOS--open when Vin raised
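Figure 5.5. An example of a CMOS memory cell.
[Diagram: six transistors T1-T6; T3-T6 form the latch between Vsupply and ground with internal points X and Y; T1 and T2 connect X and Y to bit lines b and b′ under control of the word line.]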
CMOS SRAM
• Volatile
• Low power consumption--no current flows except when being accessed
• Fast--access times of a few nanoseconds
• Expensive (6 transistors per cell)
Dynamic RAM (DRAM)
• Simpler cells, higher density
• 1 million to 16 million bits or more per chip
• Less expensive
• But, DRAM cells do not retain their state indefinitely
• Must be refreshed periodically
Figure 5.6. A single-transistor dynamic memory cell
[Diagram: transistor T connects the bit line to capacitor C; the word line controls T.]
• Capacitor is charged to write a 1 (voltage applied to Word line and to the Bit line)
• Charge on the capacitor will discharge over time
• If a Read detects a voltage on the capacitor above the “threshold,” it “sees” a 1, and drives the bit line to full voltage and recharges the capacitor
• If a Read detects a voltage on the capacitor below the “threshold,” it “sees” a 0, and drives the bit line to ground and fully discharges the capacitor
• Refreshes whenever read
• Refresh circuit will periodically read all cells
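Figure 5.7. Internal organization of a 2M x 8 dynamic memory chip.
[Diagram: multiplexed address lines A20-9 / A8-0 feed a row address latch (strobed by RAS) and a column address latch (strobed by CAS); row decoder; 4096 x (512 x 8) cell array; Sense / Write circuits; column decoder; control inputs CS and R/W; data lines D7-D0.]
• 16 megabits, 2 million bytes
• 12 bits to select one of the 4096 rows (4096 lines out of the row decoder)
• 4096 lines (512 x 8) from the selected row to the Sense / Write circuits
• 9 bits to select one of the 512 bytes in a row
• Column decoder delivers the selected byte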
• 21 bit address on 12 lines (reduces external connections)
1. 12 bit row address applied, latched on RAS (Row Address Strobe)
2. 9 bit column address applied, latched on CAS (Column Address Strobe)
• The selected byte then appears on D7-D0
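The two-step addressing can be sketched as follows. This is a toy model, not a timing-accurate one: the class `Dram2Mx8`, its method names, and the sparse dictionary used for storage are all assumptions made for illustration. The split follows the figure's labels, with A20-9 as the row address and A8-0 as the column address.

```python
ROW_BITS = 12
COLUMN_BITS = 9


def multiplex(address):
    """Split a 21-bit byte address into the two values sent over the 12 shared lines."""
    row = address >> COLUMN_BITS                 # A20-9
    column = address & ((1 << COLUMN_BITS) - 1)  # A8-0
    return row, column


class Dram2Mx8:
    """Toy 2M x 8 chip: 4096 rows of 512 bytes, stored sparsely."""

    def __init__(self):
        self.rows = {}
        self.row_latch = 0
        self.column_latch = 0

    def ras(self, lines):
        self.row_latch = lines & 0xFFF     # step 1: latch the 12-bit row address

    def cas(self, lines):
        self.column_latch = lines & 0x1FF  # step 2: latch the 9-bit column address

    def write(self, value):
        row = self.rows.setdefault(self.row_latch, bytearray(512))
        row[self.column_latch] = value & 0xFF

    def read(self):
        row = self.rows.get(self.row_latch, bytes(512))
        return row[self.column_latch]


chip = Dram2Mx8()
row, column = multiplex(0x12345)
chip.ras(row)
chip.cas(column)
chip.write(0x5A)
value = chip.read()
```

Because the row latch keeps its value between accesses, calling `cas` again with a different column reads another byte of the same row without a new `ras` step.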
DRAM
• Possible to leave row selected (all 512 bytes on sense lines)
• Then rapidly retrieve successive bytes by changing column addresses
• Result is a “fast page mode” for “blocks” or “pages” of bytes where appropriate (such as cache loading, disk transfer)
• Or, synchronous DRAM, SDRAM
SDRAM
• Can operate in different modes
• “Burst” modes of different lengths
• Can transfer “blocks” of data on single Read or Write
Figure 5.8. Synchronous DRAM.
[Diagram: row/column address input feeds a row address latch and a column address counter; refresh counter; row decoder; cell array; column decoder; Read/Write circuits & latches; data input register and data output register on the Data lines; mode register and timing control with inputs Clock, RAS, CAS, R/W and CS.]
• Entire row can be addressed and put into latches
• Successive columns put into output register on successive clock pulses
• Clock pulses cause “counting” to select successive columns
Figure 5.9. Burst read of length 4 in an SDRAM.
[Timing diagram: Clock, R/W, RAS, CAS, Address (Row, then Col), Data (D0 D1 D2 D3).]
• Row address latched
• 2 cycles to activate selected row
• Column address latched
• 1 cycle to put data on data lines
• Column address automatically incremented by memory control each cycle
Larger Memories Using Multiple Chips
• Building block: 512K x 8 memory chip
– 19 bit address on chip
– 8-bit data input/output
– chip select (2 bits)
• Goal: 2 million 32-bit words, 21 bit address
• 4 chips for each 32 bit word, 16 chips in all
Figure 5.10. Organization of a 2M x 32 memory module using 512K x 8 static memory chips (16 chips).
[Diagram: 21-bit addresses; A0-A18 form the 19-bit internal chip address sent to every chip; A19 and A20 feed a 2-bit decoder that drives the chip selects; the four chips of a row supply D31-24, D23-16, D15-8 and D7-0.]
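The chip count and the address decoding for the module can be checked with a short sketch. The function names are hypothetical, and the sketch assumes, following the figure's labels, that A19-A20 select one group of four chips while A0-A18 go to every chip.

```python
CHIP_WORDS = 512 * 1024  # 512K locations per chip
CHIP_WIDTH = 8           # bits per location


def chips_needed(module_words, module_width):
    groups = module_words // CHIP_WORDS           # groups selected by the decoder
    chips_per_group = module_width // CHIP_WIDTH  # chips side by side per word
    return groups * chips_per_group


def select(address):
    """Split a 21-bit word address into (chip group, internal chip address)."""
    group = address >> 19                # A20, A19 -> 2-bit decoder
    internal = address & (CHIP_WORDS - 1)  # A18-A0 -> every chip
    return group, internal


total_chips = chips_needed(2 * 1024 * 1024, 32)
group, internal = select(0x180001)
```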
Figure 5.11. Use of a memory controller.
[Diagram: the processor sends Address, R/W, Request and Clock to the memory controller; the controller sends Row/Column address, RAS, CAS, R/W, CS and Clock to the memory; Data lines connect processor and memory.]
• Processor sends all bits of address
• Memory controller does the multiplexing of row and column and issues strobe signals
• Memory controller provides the refresh control if not done on the chip
• Refreshing typically once every 64 ms, at a cost of 0.2 ms
• Less than 0.4% overhead
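The overhead figure is just the ratio of the two times quoted above; a quick check, using only those two numbers:

```python
refresh_period_ms = 64.0  # every cell is refreshed once per 64 ms
refresh_cost_ms = 0.2     # time spent refreshing during each period

overhead = refresh_cost_ms / refresh_period_ms
print(f"{overhead:.2%}")  # prints 0.31%
```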
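ROM: Read Only Memory
Figure 5.12. A ROM cell.
[Diagram: the word line controls transistor T, which links the bit line to ground through connection point P.]
• Not connected at P to store a 1
• Connected at P to store a 0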
PROM: Programmable Read Only Memory
A PROM cell.
[Diagram: same cell as the ROM, with word line, bit line, transistor T and connection point P.]
• Manufactured connected (storing 0), but the connection is a “fuse” and can be burned out with a high current to change it to a 1
EPROM: Erasable Programmable Read Only Memory
An EPROM cell.
[Diagram: word line, bit line, transistor T and connection point P.]
• Connection to ground always made
• Transistor can have a charge put into it that causes it to remain permanently open (programmed to be a 1)
• Can be erased with ultraviolet light
EEPROM: Electrically Erasable PROM
Cells erasable selectively
vs. EPROM, erase all
Flash Memory
Similar to EEPROM--each cell a single transistor
with a “trapped” charge
Read individual cells, write in blocks
Greater density, low power consumption, small, cheap
Can substitute for disks (up to a gigabyte?)
higher cost, but portable
Figure 5.13. Memory hierarchy.
[Diagram, top to bottom: processor registers, primary cache (L1), secondary cache (L2), main memory, magnetic disk secondary memory. Size increases going down; speed and cost per bit increase going up.]
• Registers: always on the processor chip
• Caches: may also be on the processor chip
• Main usually DRAM--cheap enough to be large
• Cache usually SRAM--faster but more expensive
Cache Memories
• Main memory (still) slow in comparison to processor speed
• Main memory constrained by packaging, electronic characteristics and costs
• Cache memory on the processor chip typically ten times faster than main memory
Locality of Reference
• Programs tend to spend their time “focused” on particular groups of instructions
– Loops
– Frequently called procedures
• “Localized” areas of programs executed repeatedly during some time period
• Much (most?) of program not accessed during some time period
Locality of Reference
• Temporal
– Recently executed instruction likely to repeat soon
– When first accessed, move to cache where it will be when referenced again
• Spatial
– Instructions near an executed instruction likely to be executed soon
– When fetching an instruction from memory, move its neighbors into cache as well
Figure 5.14. Use of a cache memory.
[Diagram: Processor, Cache, Main memory, connected in that order.]
• Blocks of memory transferred to (and from) cache
• Processor accesses instructions and data in the cache if there (a “hit”), in main memory if not (a “miss”)
Writing to Cache
• Write through
– Cache copy and main memory copy updated simultaneously
– May repeatedly update the same word in main memory unnecessarily
• Write back
– Update cache only
– Mark cache block “dirty” or “modified”
– Copy it back to main memory when another block needs the cache space
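The difference between the two policies shows up when the same cached word is written many times. The sketch below counts main memory updates for one cached block; the function name and the single-block simplification are assumptions made for illustration.

```python
def main_memory_updates(policy, writes_to_same_word):
    """Count main memory updates caused by repeated writes to one cached word."""
    updates = 0
    dirty = False
    for _ in range(writes_to_same_word):
        if policy == "write-through":
            updates += 1   # main memory copy updated on every write
        elif policy == "write-back":
            dirty = True   # update the cache only and mark the block dirty
        else:
            raise ValueError(f"unknown policy: {policy}")
    if dirty:
        updates += 1       # block copied back once, when it is replaced
    return updates


through = main_memory_updates("write-through", 10)  # 10 updates
back = main_memory_updates("write-back", 10)        # 1 update
```

Note that the single write-back update moves a whole block, whereas each write-through update moves one word, so the comparison is in number of transfers rather than bytes.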
Cache Management
• Mapping
– Determination of where in the cache the blocks (cache lines) of main memory are to be placed
• Replacement
– Determination of when to replace a block in cache with another block of main memory
• Coherency
– Assurance that no problems arise from cache version differing from main memory version
Direct Mapping Example
64K main memory
16 bit address (word addressed only)
View as 4096 blocks of 16 words each
Cache of 128 blocks of 16 words
The 16-bit address
• Tag (5 bits): 0-31, which block “assigned” to this position is in the cache
• Block (7 bits): 0-127, which block in cache
• Word (4 bits): 0-15, which word in block
Direct Mapping Example
Main memory blocks 0, 128, 256, etc to block 0 of cache
Main memory blocks 1, 129, 257, etc to block 1 of cache
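A small sketch of the direct mapping in this example: with 128 cache blocks of 16 words, a 16-bit word address splits into a 5-bit tag, a 7-bit cache block number and a 4-bit word number. The function names are made up for illustration.

```python
def split_address(address):
    """Split a 16-bit word address into (tag, cache block, word)."""
    word = address & 0xF           # low 4 bits: which word in the block
    block = (address >> 4) & 0x7F  # next 7 bits: which block in the cache
    tag = address >> 11            # high 5 bits: which candidate block is present
    return tag, block, word


def cache_block_for(main_memory_block):
    """Main memory block j goes to cache block j modulo 128."""
    return main_memory_block % 128


targets = [cache_block_for(j) for j in (0, 128, 256, 1, 129, 257)]
# targets == [0, 0, 0, 1, 1, 1]

fields = split_address(129 * 16)  # first word of main memory block 129
# fields == (1, 1, 0): tag 1, cache block 1, word 0
```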
Figure 5.16. Associative-mapped cache.
[Diagram: cache blocks 0 to 127, each stored with a tag; main memory blocks 0 to 4095; the main memory address is a 12-bit Tag followed by a 4-bit Word.]
Any block of main memory can be put in any block in the cache
Tags are searched “associatively” to find the referenced block
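An associative lookup can be modeled with a dictionary keyed by the 12-bit tag, standing in for hardware that compares all 128 tags at once. The class name, the method names and the replacement choice (evict the block that was loaded first) are assumptions of this sketch; the slides do not specify a replacement policy here.

```python
class AssociativeCache:
    """Toy associative cache: 128 blocks of 16 words, any block in any position."""

    def __init__(self, n_blocks=128):
        self.n_blocks = n_blocks
        self.blocks = {}  # tag -> list of 16 words

    def lookup(self, address):
        tag, word = address >> 4, address & 0xF
        block = self.blocks.get(tag)  # stands in for the parallel tag search
        if block is None:
            return None               # miss
        return block[word]            # hit

    def fill(self, address, block_words):
        tag = address >> 4
        if tag not in self.blocks and len(self.blocks) >= self.n_blocks:
            oldest = next(iter(self.blocks))  # assumed policy: evict first loaded
            del self.blocks[oldest]
        self.blocks[tag] = list(block_words)


cache = AssociativeCache()
miss = cache.lookup(0x1234)                # nothing loaded yet
cache.fill(0x1234, range(100, 116))        # load the block containing 0x1234
hit = cache.lookup(0x1234)                 # word 4 of that block
```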
Figure 5.24. Caches and external connections in Pentium III processor.
[Diagram: processing units connect to an L1 instruction cache and an L1 data cache; the bus interface unit connects to the L2 cache over the cache bus, and to main memory and input/output over the system bus.]
Figure 5.25. Addressing multiple-module memory systems.
(a) Consecutive words in a module: the MM address is k bits of module number (high-order) followed by m bits of address in module
(b) Consecutive words in consecutive modules: the MM address is m bits of address in module (high-order) followed by k bits of module number
[Diagram: in (a), modules 0, ..., i, ..., n-1; in (b), modules 0, ..., i, ..., 2^k - 1; each module has its own ABR and DBR.]
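The two addressing schemes of Figure 5.25 differ only in which end of the address names the module. A minimal sketch, with hypothetical function names and small field widths (k = 2, m = 4) chosen only to keep the example readable:

```python
def consecutive_in_module(address, k, m):
    """Scheme (a): high-order k bits select the module."""
    module = address >> m
    in_module = address & ((1 << m) - 1)
    return module, in_module


def consecutive_across_modules(address, k, m):
    """Scheme (b): low-order k bits select the module (interleaving)."""
    module = address & ((1 << k) - 1)
    in_module = address >> k
    return module, in_module


K, M = 2, 4  # 4 modules of 16 words each
modules_a = [consecutive_in_module(a, K, M)[0] for a in range(4)]
modules_b = [consecutive_across_modules(a, K, M)[0] for a in range(4)]
# modules_a == [0, 0, 0, 0]: consecutive words stay in one module
# modules_b == [0, 1, 2, 3]: consecutive words fall in consecutive modules
```

With scheme (b), a run of consecutive addresses keeps several modules busy at once, which is the point of interleaving.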
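Figure 5.26. Virtual memory organization.
[Diagram: the processor sends a virtual address to the MMU; the MMU sends a physical address to the cache and to main memory; data moves between processor, cache and main memory; DMA transfers move data between main memory and disk storage.]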