
PHASE CHANGE MEMORY

Seminar Report

2011-2012

Submitted in partial fulfillment for the award of the Degree

of Bachelor of Technology in Electrical and Electronics

By

ARJUN P RAJ
Univ. roll no: 65309

Under the guidance of
THOMAS K P

Department of Electrical and Electronics
RAJAGIRI SCHOOL OF ENGINEERING AND TECHNOLOGY

Rajagiri Valley, Cochin-682039
Kerala, India

CERTIFICATE

This is to certify that the report entitled ‘PHASE CHANGE MEMORY’ is a bona fide record of the project done by Arjun P Raj, of 7th semester Electrical and Electronics Engineering, in partial fulfillment of the requirements for the award of the Degree of Bachelor of Technology in Electrical & Electronics Engineering of the Mahatma Gandhi University, Kottayam, during the academic year 2011-2012.

Mr. Thomas K P (Guide)
Asst. Professor
Dept. of Electrical & Electronics Engineering

Prof. K R Varmah
Professor & HOD
Dept. of Electrical & Electronics Engineering

Place: Kakkanad
Date: 19-12-2011

Acknowledgement

The satisfaction and euphoria that accompany the successful completion of any task would be incomplete without mention of the people who made it possible, whose constant guidance and encouragement crowned my efforts with success.

First and foremost, I would like to express my wholehearted thanks to the invisible, indomitable God for the blessings showered upon me in enabling me to complete this seminar on time.

I would like to extend my heartiest thanks to the management of our college, who provided me with the necessities for the completion of the seminar.

I would also like to extend my heartfelt thanks to Prof. Rajendra Varmah (H.O.D., EEE) for the inspiration inculcated in me and for his apt guidance.

It would be a grave error if I forgot to mention my seminar guide, Asst. Prof. Thomas K P, and the coordinator, Lecturer Jebin Francis, whose constant persistence and support helped me in the completion of this seminar.

I also remember the teachers of the Department of Electrical Engineering, who were always a support in my academics. I am also thankful to my friends and well-wishers for their support and prayers. With a heart full of gratitude I submit this report. Once again I thank all who walked with me to make this venture a grand success.

ARJUN P RAJ


Abstract

The memory subsystem accounts for a significant share of the cost and power budget of a computer system. Current DRAM-based main memory systems are starting to hit the power and cost limit. An alternative memory technology that uses resistance contrast in phase-change materials is being actively investigated in the circuits community. Phase Change Memory (PCM) devices offer more density relative to DRAM, and can help increase the main memory capacity of future systems while remaining within the cost and power constraints. A PCM-based hybrid main memory system is analyzed using an architecture-level model of PCM, and the trade-offs for a main memory system consisting of PCM storage coupled with a small DRAM buffer are explored. Such an architecture has the latency benefits of DRAM and the capacity benefits of PCM.


Contents

1 Introduction
2 What is PCM?
3 Chalcogenide materials
4 Theory Of Operation
  4.1 Writing
  4.2 Reading
5 Main Features
6 Disadvantages
7 Comparison
8 PCM Based Memory Model
9 Hybrid Main Memory Organization
  9.1 Lazy Write Organization
  9.2 Line Level Writes
  9.3 Fine-Grained Wear-Leveling for PCM
  9.4 Page Level Bypass for Write Filtering
  9.5 Impact Of These Techniques
10 Conclusion


List of Figures

2.1 Typical PCM cell
2.2 PCM storage cell and its implementation
3.1 Periodic table-Chalcogenides
4.1 Amorphous and Polycrystalline
4.2 Current-voltage characteristic
4.3 Set Pulse And Reset Pulse
4.4 Reading
7.1 Typical Access Latencies
7.2 Comparison
7.3 PCM VS FLASH
8.1 Main Memory Organisations
9.1 Lazy write Organization
9.2 Fine Grained Wear Leveling


List of Tables

4.1 Set operation VS Reset Operation
9.1 Impact of the different techniques on performance
9.2 Comparison


List of Abbreviations

1. PCM: Phase Change Memory
2. DRAM: Dynamic Random Access Memory
3. OS: Operating System
4. HDD: Hard Disk Drive
5. LLWB: Line Level Write Back
6. FGWL: Fine Grained Wear Leveling
7. PLB: Page Level Bypass


Chapter 1

Introduction

Current computer systems consist of several cores on a chip, and sometimes several chips in a system. As the number of cores in the system increases, the number of concurrently running applications (or threads) increases, which in turn increases the combined working set of the system. The memory system must be capable of supporting this growth in the total working set. For several decades, DRAM has been the building block of the main memories of computer systems. However, with the increasing size of the memory system, a significant portion of the total system power and the total system cost is spent in the memory system. Current DRAM-based main memory systems are starting to hit the power and cost limit. Therefore, technology researchers have been studying new memory technologies that can provide more memory capacity than DRAM while still being competitive in terms of performance, cost, and power. An alternative memory technology that uses resistance contrast in phase-change materials is being actively investigated in the circuits community; this is known as Phase Change Memory (PCM). These devices offer more density relative to DRAM, and can help increase the main memory capacity of future systems while remaining within the cost and power constraints.

There are several challenges to overcome before PCM can become a part of the main memory system. First, because PCM is much slower than DRAM, a memory system consisting exclusively of PCM would have a much higher memory access latency, adversely impacting system performance. Second, PCM devices are likely to sustain significantly fewer writes than DRAM, so the write traffic to these devices must be reduced. Otherwise, the short lifetime may significantly limit the usefulness of PCM for commercial systems.


Chapter 2

What is PCM?

PCM, or Phase Change Memory, is a type of non-volatile memory that exploits the property of chalcogenide glass to switch between two states, amorphous and crystalline, with the application of heat using electrical pulses. The phase change material can be switched from one phase to another reliably, quickly, and a large number of times. The amorphous phase has low optical reflectivity and high electrical resistivity, whereas the crystalline phase has high reflectivity and low resistivity. PCM exploits the difference in the electrical resistivity of the material in these two phases. The difference in resistance between the two states is typically about five orders of magnitude and can be used to infer the logical states of binary data, namely 1 (high bit) and 0 (low bit).

Figure 2.1: Typical PCM cell

The figure shows a graphical representation of a basic PCM storage element. As shown on the left, a layer of chalcogenide is sandwiched between a top electrode and a bottom electrode. A resistive heating element extends from the bottom electrode and contacts a layer of the chalcogenide material. Current injected into the junction of the chalcogenide and the heater induces the phase change through Joule heating. The figure on the right is the actual implementation of the concept, showing an amorphous bit formed in a layer of polycrystalline chalcogenide. Because of the change in reflectivity, the amorphous bit appears as a mushroom-cap-shaped structure in the layer of polycrystalline chalcogenide.

Figure 2.2: PCM storage cell and its implementation


Chapter 3

Chalcogenide materials

PCM technology uses a class of materials known as chalcogenides (pronounced kal-KOJ-uh-nydes). Chalcogenides are alloys that contain an element in the Oxygen/Sulphur family of the Periodic Table, i.e. Group 16 in the new style or Group VIa in the old style Periodic Table (usually combined with group IV and V elements).

Figure 3.1: Periodic table-Chalcogenides

The history of phase-change materials can be traced back to work starting in the 1950s by Dr. Stanford Ovshinsky, who was researching the properties of a class of glassy materials that exhibited the ability to change easily and stably between two phases. By the late 1960s, he had reported that certain of these materials exhibited a reversible change in both resistivity and reflectivity when changing between an ordered (polycrystalline) state and a disordered (amorphous) state. It was recognized that this effect could be exploited for optical memory as well as electronic memory.

Phase-change materials have been in use for many years in high-volume rewritable CDs and DVDs, which make use of the difference in optical properties. Numonyx PCM uses an alloy of Germanium, Antimony and Tellurium (Ge2Sb2Te5), known more commonly as "GST". Most companies performing research and development in PCM today are using GST or closely related alloys. Other alloys being used for PCM research are Nitrogen-doped GST, Sb2Te3 with N-doping (STN), and AgInSbTe (silver-indium-antimony-tellurium).


Chapter 4

Theory Of Operation

Phase-change chalcogenides exhibit a reversible phase change between the amorphous phase and the crystalline phase. As illustrated in Figure 4.1, in the amorphous phase there is an absence of regular order in the lattice. In this phase, the material demonstrates high resistivity and low reflectivity. In contrast, in the polycrystalline phase, the material has a regular crystalline structure and exhibits high reflectivity and low resistivity.

Figure 4.1: Amorphous and Polycrystalline

PCM exploits the difference in resistivity between the two phases of the material. The phase change is induced in the material through localized Joule heating caused by current injection. The final phase of the material is modulated by the magnitude of the injected current and the duration of the operation.


4.1 Writing

The PCM material lies between a top and a bottom electrode, with a heating element that extends from the bottom electrode and establishes contact with the PCM material. When current is injected into the junction of the material and the heating element, it induces the phase change.

Crystallizing the phase-change material by heating it above the crystallization temperature (but below the melting temperature) is called the SET operation. The SET operation uses electrical pulses of moderate power and long duration; it returns the cell to a low-resistance state and logically stores a 1. Melt-quenching the material is called the RESET operation, and it makes the material amorphous. The RESET operation uses high-power pulses, which place the memory cell in a high-resistance state and logically store a 0.

In phase-change memory, threshold switching provides a means to deliver, at low voltage, the programming current needed to program a bit that is in the high-resistance state. From a high-resistance ("RESET") state, a PCM bit is programmed into a low-resistance ("SET") state by applying a programming voltage in excess of Vth, allowing the bit to enter the dynamic ON state. Current is then allowed to flow for a length of time sufficient to ensure crystallization. The device can then be programmed to the RESET state by applying a short, somewhat larger current pulse to a bit in the polycrystalline state. The reset pulse only needs to be of sufficient magnitude and duration to melt the programmed volume of chalcogenide alloy, and to have a falling edge fast enough to let the molten programmed volume cool quickly enough to vitrify. The duration of the reset pulse can be short, since the material in the programmed volume can be heated to the melting point in a few nanoseconds.

Figure 4.2: Current-voltage characteristic

Figure 4.2 shows the current-voltage characteristics for an Ovonic Unified Memory (OUM) cell element in both the RESET (amorphous, high-resistance) and SET (crystalline, low-resistance) states, showing key device parameters: the Read/SET/RESET regimes and the SET and RESET states. Vh is the holding voltage, and Vth is the switching threshold voltage.


Figure 4.3: Set Pulse And Reset Pulse (Ta: amorphization temperature, Tx: crystallization temperature)

SET OPERATION                              RESET OPERATION
Crystallizes the PCM                       Melt-quenches the PCM to make it amorphous
Pulse of moderate power, long duration     Pulse of higher power, short duration
Leads to low-resistance state              Leads to high-resistance state
Logically stores 1                         Logically stores 0
Pulse of 150 µA at 1.2 V                   Pulse of 300 µA at 1.6 V
Dissipates 90 µW for 150 ns                Dissipates 480 µW for 40 ns
Consumes around 13.5 pJ                    Consumes about 19.2 pJ

Table 4.1: Set operation VS Reset Operation
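The energy figures in Table 4.1 follow from multiplying the quoted pulse power by the quoted pulse duration. The short Python sketch below reproduces that arithmetic; the function name is illustrative only and is not taken from any source.

```python
def pulse_energy_pj(power_uw: float, duration_ns: float) -> float:
    """Energy in picojoules of a pulse given its power (uW) and duration (ns).

    1 uW * 1 ns = 1e-6 W * 1e-9 s = 1e-15 J, i.e. one thousandth of a picojoule.
    """
    return power_uw * duration_ns / 1000.0


# Figures quoted in Table 4.1
set_energy_pj = pulse_energy_pj(power_uw=90, duration_ns=150)    # SET pulse
reset_energy_pj = pulse_energy_pj(power_uw=480, duration_ns=40)  # RESET pulse

print(f"SET:   {set_energy_pj} pJ")
print(f"RESET: {reset_energy_pj} pJ")
```

Although the RESET pulse is much shorter, its higher power makes it the more energy-hungry of the two operations.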


4.2 Reading

Figure 4.4: Reading

To read the data, the chip uses a smaller current to determine which state the chalcogenide is in. Information stored in the cell is read out by measuring the cell's resistance. In read mode, the cell resistance is sensed at a voltage less than Vth, typically 0.4 V. This ensures that reading does not affect the state of the cell and that no writing can take place.

Prior to reading the cell, the bitline is precharged to the read voltage. The wordline is active low when using a BJT access transistor. If the selected cell is in the crystalline state, which has low resistance, the bitline is discharged by current flowing through the storage element and the access transistor. Otherwise, if the cell is in the amorphous state, it prevents or limits the bitline current, since in this state the material has high resistance.
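The read decision described above can be sketched in a few lines of Python. This is only an illustration: the 0.4 V read voltage comes from the text, but the two cell resistances and the reference current are hypothetical values chosen to be five orders of magnitude apart, in line with the resistance contrast mentioned in Chapter 2.

```python
READ_VOLTAGE_V = 0.4            # typical read voltage from the text (below Vth)

# Hypothetical values, chosen only for illustration
CRYSTALLINE_OHM = 1e4           # assumed low-resistance (SET) cell
AMORPHOUS_OHM = 1e9             # assumed high-resistance (RESET) cell
REFERENCE_CURRENT_A = 1e-6      # assumed sense-amplifier threshold


def read_cell(cell_resistance_ohm: float) -> int:
    """Return the logical bit inferred from the bitline current at the read voltage."""
    bitline_current = READ_VOLTAGE_V / cell_resistance_ohm  # Ohm's law
    # A large current means a crystalline (SET) cell, which stores a 1.
    return 1 if bitline_current > REFERENCE_CURRENT_A else 0


print(read_cell(CRYSTALLINE_OHM))  # crystalline cell
print(read_cell(AMORPHOUS_OHM))    # amorphous cell
```

Because the two resistance states are so far apart, the reference current can be placed anywhere in a wide window between the two bitline currents.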


Chapter 5

Main Features

1. Bit-alterable
Like RAM or EEPROM, PCM is bit-alterable. Flash technology requires a separate erase step in order to change information. Information stored in bit-alterable memory can be switched from a one to a zero or from a zero to a one without a separate erase step.

2. Scaling
Both NOR and NAND flash rely on floating-gate memory structures, which are difficult to shrink at small lithography nodes. This is because the gate thickness remains constant and an operating voltage of more than 10 V is needed, while the operating voltage of CMOS logic has been scaled to 1 V or even less. Scaling is often described in terms of Moore's Law, under which memory densities double with each smaller generation. With PCM, as the memory cell shrinks, the volume of GST material shrinks as well, providing a truly scalable solution. Chalcogenide films have already been proven to have stable characteristics down to a 5 nm node. As the PCM memory cell shrinks, the volume of GST material involved in the state change shrinks, resulting in reduced power consumption or higher write performance. This unique feature of PCM technology supports the promise of scalability beyond that of other memory technologies.

3. Density
PCM is a dense technology with a feature size comparable to DRAM cells. Furthermore, a PCM cell can be in different degrees of partial crystallization, thereby enabling more than one bit to be stored in each cell. Recently, a prototype with two logical bits in each physical cell has been demonstrated. This means four states with different degrees of partial crystallization are possible, which allows twice as many bits to be stored in the same physical area. The density of PCM is therefore expected to be almost four times that of DRAM, so more information can be stored in a PCM than in a DRAM of a given size.

4. Non-volatile
PCM is non-volatile. It does not require a constant power supply to retain information, while DRAM does. Hence there is no need for refresh circuits in order to maintain the data in a PCM.

5. Read performance
Like RAM and NOR-type flash, the technology features fast random access times. This enables the execution of code directly from the memory, without an intermediate copy to RAM. The read latency of PCM is comparable to single-bit-per-cell NOR flash, while the read bandwidth can match DRAM. In contrast, NAND flash suffers from long random access times, on the order of tens of microseconds, that prevent direct code execution.

6. Write/erase performance
PCM is capable of achieving write speeds like NAND, but with lower latency and with no separate erase step required. NOR flash features moderate write speeds but long erase times. As with RAM, no separate erase step is required with PCM, but the write speed (bandwidth and latency) does not match the capability of RAM today. The capability of PCM is expected, however, to improve with each process generation as the PCM cell area decreases.


Chapter 6

Disadvantages

1. Limited lifetime
The number of writes to a PCM cell is limited to about 10^9, after which the memory cell begins to wear out. This is because the write operation is temperature dependent and subjects the cell to repeated thermal expansion and contraction.

2. High access latencies
PCM also suffers from high access latencies compared to DRAM: around 250 ns for PCM, as against 60 ns for DRAM.

3. High energy consumption
Though PCM enjoys the advantage of having almost zero leakage power, it suffers from higher dynamic power consumption. This is mainly because the read and write operations are temperature dependent.


Chapter 7

Comparison

Figure 7.1: Typical Access Latencies

Figure 7.1 shows the typical access latency (in cycles, assuming a 4 GHz machine) of different memory technologies, and their relative place in the overall memory hierarchy. Hard disk drive (HDD) latency is typically about four to five orders of magnitude higher than that of DRAM. A technology that is denser than DRAM and has an access latency between that of DRAM and hard disk can bridge this speed gap. Flash-based disk caches have already been proposed to bridge the gap between DRAM and hard disk, and to reduce the power consumed in the HDD. However, with Flash being 2^8 (about 256) times slower than DRAM, it is still important to increase DRAM capacity to reduce the accesses to the Flash-based disk cache. The access latency of PCM is much closer to that of DRAM, and coupled with its density advantage, PCM is an attractive technology for increasing memory capacity while remaining within the system cost and power budget. Furthermore, PCM cells can sustain 1000x more writes than Flash cells, which puts the lifetime of a PCM-based memory system in the range of years, as opposed to days for a Flash-based main memory system.
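To relate latencies in nanoseconds to the cycle counts used in Figure 7.1, the following Python sketch converts the approximate DRAM and PCM latencies quoted in Chapter 6 (60 ns and 250 ns) into cycles at the assumed 4 GHz clock. The helper name is illustrative, and the exact values plotted in the figure may differ from these rounded numbers.

```python
CLOCK_GHZ = 4.0  # clock frequency assumed for Figure 7.1


def latency_in_cycles(latency_ns: float, clock_ghz: float = CLOCK_GHZ) -> float:
    """Convert a latency in nanoseconds to processor cycles (ns * cycles per ns)."""
    return latency_ns * clock_ghz


dram_cycles = latency_in_cycles(60)    # approximate DRAM latency from Chapter 6
pcm_cycles = latency_in_cycles(250)    # approximate PCM latency from Chapter 6

print(f"DRAM: {dram_cycles:.0f} cycles")
print(f"PCM:  {pcm_cycles:.0f} cycles")
print(f"PCM / DRAM ratio: {pcm_cycles / dram_cycles:.2f}")
```

The resulting ratio of a little over four agrees with the statement in this chapter that PCM reads are only about 4X slower than DRAM.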

Write endurance is the maximum number of writes each cell can sustain. Data retention is the duration for which a non-volatile technology can retain data. Figure 7.2 shows that PCM has a density similar to NAND Flash and a read latency similar to NOR Flash.

Figure 7.2: Comparison

PCM offers a density advantage similar to NAND Flash, which means more main memory capacity for the same chip area. The read latency of PCM is similar to that of NOR Flash, which is only about 4X slower than DRAM. The write latency of PCM is about an order of magnitude higher than its read latency. However, write latency is typically not on the critical path and can be tolerated using buffers. Finally, PCM is also expected to have higher write endurance (10^6 to 10^8 writes) relative to Flash (10^4 writes).

Figure 7.3: PCM VS FLASH


Chapter 8

PCM Based Memory Model

There are several challenges to overcome before PCM can become a part of the main memory system. First, because PCM is much slower than DRAM, a memory system consisting exclusively of PCM would have a much higher memory access latency, adversely impacting system performance. Second, PCM devices are likely to sustain significantly fewer writes than DRAM, so the write traffic to these devices must be reduced. Otherwise, the short lifetime may significantly limit the usefulness of PCM for commercial systems. There is active research on PCM, and several PCM prototypes have been proposed, each optimizing for some important device characteristic (such as density, latency, bandwidth, or lifetime). While PCM technology matures and becomes ready to be used as a complement to DRAM, it is believed that system architecture solutions can be explored to make these memories part of the main memory and so improve system performance.

Figure 8.1: Main Memory Organisations

Figure 8.1(a) shows a traditional system in which DRAM main memory is backed by a disk. Flash memory is finding widespread use to reduce the latency and power requirement of disks. In fact, some systems have only Flash-based storage without hard disks; for example, the MacBook Air laptop has DRAM backed by a 64GB Flash drive. It is therefore reasonable to expect future high-performance systems to have Flash-based disk caches such as the one shown in Figure 8.1(b). However, because there is still a difference of two orders of magnitude between the access latency of DRAM and that of the next level of storage, a large amount of DRAM main memory is still needed to avoid going to the disks. PCM can be used instead of DRAM to increase main memory capacity, as shown in Figure 8.1(c). However, the relatively higher latency of PCM compared to DRAM will significantly decrease system performance. Therefore, to get the best of both capacity and latency, Figure 8.1(d) shows the hybrid system expected to emerge for future high-performance systems. The larger PCM storage will have the capacity to hold most of the pages needed during program execution, thereby reducing disk accesses due to paging. The fast DRAM memory will act both as a buffer for main memory and as an interface between the PCM main memory and the processor system. A relatively small DRAM buffer (about 3 percent of the size of the PCM storage) can bridge most of the latency gap between DRAM and PCM.


Chapter 9

Hybrid Main Memory Organization

In a hybrid main memory organization, the PCM storage is managed by the Operating System (OS) using a Page Table, in a manner similar to current DRAM main memory systems. The DRAM buffer is organized like a hardware cache that is not visible to the OS, and is managed by the DRAM controller. Although the DRAM buffer can be organized at any granularity, it can be assumed that both the DRAM buffer and the PCM storage are organized at page granularity. The DRAM memory acts as a buffer as well as an interface between the PCM and the processor.

The techniques used in this hybrid main memory organization are:
1. Lazy write organization
2. Line level writes
3. Fine grained wear leveling
4. Page level bypass

9.1 Lazy Write Organization

The Lazy-Write organization reduces the number of writes to the PCM and overcomes the slow write speed of the PCM, both without incurring any performance overhead. When a page fault is serviced, the page fetched from the hard disk (HDD) is written only to the DRAM cache. Although allocating a page table entry at the time of the page fetch from the HDD automatically allocates the space for this page in the PCM, the allocated PCM page is not written with the data brought from the HDD. This eliminates the overhead of writing the PCM. To track the pages present only in the DRAM, and not in the PCM, the DRAM tag directory is extended with a "presence" (P) bit. When the page from the HDD is stored in the DRAM cache, the P bit in the DRAM tag directory is set to 0. In the "lazy write" organization, a page is written to the PCM only when it is evicted from the DRAM storage and either the P bit is 0 or the dirty bit is set. If, on a DRAM miss, the page is fetched from the PCM, then the P bit in the DRAM tag directory entry of that page is set to 1. When a page with the P bit set is evicted from the DRAM, it is not written back to the PCM unless it is dirty. Furthermore, to account for the larger write latency of the PCM, a write queue is associated with the PCM. We assume that the tags of both the write queue and the DRAM buffer are made of SRAM, so that these structures can be probed with low latency. Given the PCM write latency, a write queue of 100 pages is sufficient to avoid stalls due to the write queue being full.

Figure 9.1: Lazy write Organization
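The presence-bit bookkeeping described above can be sketched as a small Python model. This is a simplified illustration under stated assumptions, not the actual controller design: the class and method names are hypothetical, the DRAM buffer is assumed to use least-recently-used replacement (the text does not specify a policy), and the write queue is not modelled.

```python
from collections import OrderedDict


class LazyWriteBuffer:
    """Toy model of a DRAM buffer in front of PCM using the lazy-write policy."""

    def __init__(self, capacity_pages: int):
        self.capacity = capacity_pages
        self.tags = OrderedDict()   # page -> {"P": 0 or 1, "dirty": bool}, in LRU order
        self.pcm = set()            # pages whose data has actually been written to PCM
        self.pcm_writes = 0         # number of page writes issued to PCM

    def access(self, page: int, is_write: bool = False) -> None:
        if page in self.tags:
            self.tags.move_to_end(page)          # DRAM hit: refresh LRU position
        else:
            if len(self.tags) >= self.capacity:
                self._evict()
            # DRAM miss: the page comes from PCM if its data is there (P = 1),
            # otherwise from the hard disk, in which case PCM is NOT written (P = 0).
            self.tags[page] = {"P": 1 if page in self.pcm else 0, "dirty": False}
        if is_write:
            self.tags[page]["dirty"] = True

    def _evict(self) -> None:
        victim, entry = self.tags.popitem(last=False)   # least recently used page
        # Write to PCM only if PCM has no copy yet (P = 0) or the copy is stale (dirty).
        if entry["P"] == 0 or entry["dirty"]:
            self.pcm.add(victim)
            self.pcm_writes += 1


buf = LazyWriteBuffer(capacity_pages=2)
for page in (1, 2, 3, 1, 2):      # pages first arrive from disk, then get evicted
    buf.access(page)
writes_after_first_pass = buf.pcm_writes
buf.access(3)                     # evicts page 1, which is clean and already in PCM
print(writes_after_first_pass, buf.pcm_writes)
```

In the usage example, the final eviction causes no PCM write because the evicted page was fetched from PCM and never modified.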

9.2 Line Level Writes

Typically, the main memory is read and written in pages. However, the "endurance" limits of the PCM require exploring mechanisms to reduce the number of writes to the PCM. We propose writing to the PCM memory in smaller chunks instead of a whole page.

For example, if writes to a page can be tracked at the granularity of a processor's cache line, the number of writes to the PCM page can be minimized by writing only the "dirty" lines within a page. We propose Line Level WriteBack (LLWB), which tracks the writes to pages held in the DRAM on the basis of the processor's cache lines. To do so, the DRAM tag directory shown in Figure 9.1 is extended to hold a "dirty" bit for each cache line in the page. In this organization, when a dirty page is evicted from the DRAM, if the P bit is 1 (i.e., the page is already present in the PCM), only the dirty lines of the page are written to the PCM. When the P bit of a dirty page chosen for eviction is 0, all the lines of the page have to be written to the PCM. LLWB significantly reduces wasteful writes from DRAM to PCM for workloads which write to very few lines in a dirty page. To support LLWB we need one dirty bit per line of a page. For example, for the baseline system with a 4096B page and a 256B line size, we need 16 dirty bits per page in the tag store of the DRAM buffer.
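The LLWB eviction rule can be sketched as follows in Python. The page and line sizes are the baseline values from the text; the function name and the list-of-booleans representation of the per-line dirty bits are assumptions made for illustration.

```python
PAGE_SIZE_BYTES = 4096
LINE_SIZE_BYTES = 256
LINES_PER_PAGE = PAGE_SIZE_BYTES // LINE_SIZE_BYTES   # dirty bits needed per page


def lines_to_write_back(p_bit: int, dirty_bits: list) -> list:
    """Return the indices of the lines that must be written to PCM on eviction."""
    if p_bit == 0:
        # PCM holds no copy of this page yet, so every line has to be written.
        return list(range(LINES_PER_PAGE))
    # PCM already holds the page: write only the lines modified while in DRAM.
    return [index for index, dirty in enumerate(dirty_bits) if dirty]


dirty = [False] * LINES_PER_PAGE
dirty[3] = dirty[7] = True                       # two lines were modified

print(LINES_PER_PAGE)
print(lines_to_write_back(p_bit=1, dirty_bits=dirty))
print(len(lines_to_write_back(p_bit=0, dirty_bits=dirty)))
```

In this example, a page that is already present in the PCM costs only two line writes instead of sixteen.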

9.3 Fine-Grained Wear-Leveling for PCM

Memories with limited endurance typically employ wear-leveling algorithms to extend their life expectancy. For example, in Flash memories, wear-leveling algorithms arrange data so that sector erasures are distributed more evenly across the Flash cell array and single-sector failures due to a high concentration of erase cycles are minimized.

LLWB reduces write traffic to the PCM. However, if only some cache lines within a page are written to frequently, they will wear out sooner than the other lines in that page. We analyze the distribution of write traffic to each line in a PCM page. Figure 9.2 shows the total writeback traffic per dirty page for two database applications, db1 and db2. The average number of writes per line is also shown. The page size is 4KB and the line size is 256B, giving a total of 16 lines per page, numbered from 0 to 15. The lifetime of the PCM can be increased if the writes can be made uniform across all lines in the page. This could be done by tracking the number of writes on a per-line basis; however, this would incur a huge tracking overhead.

Figure 9.2: Fine Grained Wear Leveling

Fine Grained Wear-Leveling (FGWL) is used to make the writes uniform (in the average case) while avoiding per-line storage. In FGWL, the lines in each page are stored in the PCM in a rotated manner. For a system with 16 lines per page, the rotate amount is between 0 and 15 lines. If the rotate value is 0, the page is stored in the traditional manner. If it is 1, then Line 0 of the address space is stored in Line 1 of the physical PCM page, each subsequent line is stored shifted by one, and Line 15 of the address space is stored in Line 0. When a PCM page is read, it is realigned. The pages are written from the Write Queue to the PCM in the line-shifted format. On a page fault, when the page is fetched from the hard disk, a Pseudo Random Number Generator (PRNG) is consulted to get a random 4-bit rotate value, and this value is stored in the WearLevelShift (W) field associated with the PCM page, as shown in Figure 9.1. This value remains constant until the page is replaced, at which point the PRNG is consulted again for the new page allocated in the same physical space of the PCM.
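The line rotation used by FGWL amounts to modular arithmetic on the line index. The Python sketch below illustrates the mapping; the function names are hypothetical, and Python's `random` module stands in for the hardware PRNG described in the text.

```python
import random

LINES_PER_PAGE = 16   # 4KB page with 256B lines


def physical_line(logical_line: int, rotate: int) -> int:
    """Position in the PCM page where a given line of the address space is stored."""
    return (logical_line + rotate) % LINES_PER_PAGE


def logical_line(physical: int, rotate: int) -> int:
    """Inverse mapping, used to realign the page when it is read from PCM."""
    return (physical - rotate) % LINES_PER_PAGE


def new_rotate_value(rng: random.Random) -> int:
    """Draw a 4-bit rotate value (0..15) when a page is allocated on a page fault."""
    return rng.randrange(LINES_PER_PAGE)


rng = random.Random(0)                 # seeded only to make the sketch repeatable
w = new_rotate_value(rng)              # would be kept in the WearLevelShift (W) field
stored_order = [physical_line(i, w) for i in range(LINES_PER_PAGE)]

print(physical_line(0, 1), physical_line(15, 1))
print(stored_order)
```

With a rotate value of 1, line 0 lands in physical line 1 and line 15 wraps around to physical line 0, as described above. Since a fresh value is drawn for every page allocation, a frequently written logical line maps to different physical lines over time.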


9.4 Page Level Bypass for Write Filtering

Not all applications benefit from more memory capacity. For example, streaming applications typically access a large amount of data but have poor reuse. Such applications do not benefit from the capacity boost provided by PCM. In fact, storing the pages of such applications only accelerates the endurance-related wear-out of the PCM. As PCM serves as the main memory, it is necessary to allocate space in the PCM when a page table entry is allocated for a page. However, the actual writing of such pages to the PCM can be avoided by leveraging the lazy write architecture. We call this Page Level Bypass (PLB). When a page is evicted from DRAM, PLB invalidates the Page Table Entry associated with the page, and does not install the page in the PCM. We assume that the OS enables or disables PLB for each application using a configuration bit. If the PLB bit is turned on, all pages of that application bypass the PCM storage.
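A minimal Python sketch of the bypass decision at eviction time is given below. It is deliberately simplified: the page table is modelled as a plain dictionary, the PCM as a set of installed pages, and the presence and dirty bits of the lazy-write scheme are ignored. All names are hypothetical.

```python
def evict_from_dram(page: int, plb_enabled: bool, page_table: dict, pcm: set) -> bool:
    """Handle a DRAM eviction; return True if the page was written to PCM."""
    if plb_enabled:
        # Bypass: invalidate the page table entry and skip the PCM write entirely.
        page_table.pop(page, None)
        return False
    pcm.add(page)          # normal path: install the page in PCM
    return True


page_table = {10: "allocated", 11: "allocated"}
pcm_pages = set()

wrote_streaming = evict_from_dram(10, plb_enabled=True, page_table=page_table, pcm=pcm_pages)
wrote_regular = evict_from_dram(11, plb_enabled=False, page_table=page_table, pcm=pcm_pages)

print(wrote_streaming, wrote_regular, sorted(page_table), sorted(pcm_pages))
```

The page of the streaming application (PLB bit on) never reaches the PCM, while the page of the regular application is installed as usual.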


9.5 Impact Of These Techniques

Configuration     No. of bytes per cycle     Average Lifetime
PCM 32GB          0.317                      7.6 yrs
+1 GB DRAM        0.807                      3.0 yrs
+LAZY WRITE       0.725                      3.4 yrs
+LLWB             0.316                      7.6 yrs
+PLB              0.247                      9.7 yrs

Table 9.1: Impact of the different techniques on performance
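One way to read Table 9.1 is to note that, across all five configurations, the product of the two columns is nearly constant. This is consistent with the average lifetime being inversely proportional to the byte traffic per cycle, assuming that column measures write traffic to the PCM (the table itself does not say so explicitly). The Python sketch below checks this using only the values from the table.

```python
# (configuration, bytes per cycle, average lifetime in years), copied from Table 9.1
table_9_1 = [
    ("PCM 32GB",    0.317, 7.6),
    ("+1 GB DRAM",  0.807, 3.0),
    ("+LAZY WRITE", 0.725, 3.4),
    ("+LLWB",       0.316, 7.6),
    ("+PLB",        0.247, 9.7),
]

products = {name: traffic * years for name, traffic, years in table_9_1}
for name, product in products.items():
    print(f"{name:12s} bytes/cycle x lifetime = {product:.2f}")

# Overall effect of Lazy Write, LLWB and PLB relative to the plain hybrid (+1 GB DRAM)
traffic_reduction = 0.807 / 0.247
lifetime_gain = 9.7 / 3.0
print(f"traffic reduced {traffic_reduction:.1f}x, lifetime increased {lifetime_gain:.1f}x")
```

Every product comes out close to 2.4, and both the traffic reduction and the lifetime gain are a little over 3X.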

No.   Parameter          DRAM    PCM     Hybrid
1.    Scalability        less    high    limited
2.    Density            less    high    high
3.    Latency (read)     less    high    medium
4.    Write speed        high    low     medium
5.    Dynamic power      less    high    medium
6.    Static power       high    nil     medium
7.    Crosstalk effect   high    nil     less

Table 9.2: Comparison


Chapter 10

Conclusion

Phase change memory can be exploited by the memory system and by the convergence of consumer, computer and communication electronic systems. Caching of the existing memory technologies and reduction of the overall system cost and complexity will be the compelling motivations for PCM adoption. Bandwidth will drive the sustaining side of PCM in code and data transfer applications, while the reduction in power dissipation will represent a further added value of this technology.

However, PCM comes with the drawbacks of increased access latency and a limited number of writes. In order to overcome these disadvantages, it can be used in conjunction with a DRAM buffer, making use of three techniques: Lazy Write, LLWB, and PLB. These simple techniques can reduce the write traffic by 3X and increase the average lifetime of PCM from 3 years to 9.7 years. The Fine Grained Wear Leveling (FGWL) technique can be used to make the wear-out of PCM storage uniform across all lines in a page.

PCM is today's memory breakthrough. Like flash, PCM is a non-volatile memory that can store bits even without a power supply. But unlike flash, data can be written to its cells much faster, at rates comparable to the dynamic and static random-access memory (DRAM and SRAM) used in all computers and cell phones today. Quite simply, PCM blends together the best attributes of NOR flash, NAND flash, EEPROM and RAM, delivering a new category of memory for new usage models.


Bibliography

[1] IBM Research (2009, May). "Scalable High Performance Main Memory System Using Phase-Change Memory Technology", 978-1-60558-526-0/09/06, 2009. [On-line]. Available: http://www.cs.ucsb.edu/ chong/290N/pcm.pdf [May 26, 2009].

[2] "The Basics of Phase Change Memory Technology". [On-line]. Available: http://www.numonyx.com/Documents/WhitePapers/ [Jan. 30, 2010].

[3] Wong, H.-S.P.; SangBum Kim; Byoungil Lee; Caldwell, M.A.; Jiale Liang; Yi Wu; Jeyasingh, R.G.D.; Shimeng Yu, "Recent progress of phase change memory (PCM) and resistive switching random access memory (RRAM)", IEEE, 10.1109/ICSICT.2010.5667542, 2010, pp. 1055-1060.
