i Cache Seminar

  • 8/6/2019 i Cache Seminar

    A SEMINAR REPORT

    ON

    iCACHE

    Submitted in partial fulfillment of Bachelor of Engineering Degree

    University Of KOTA, Jaipur

    College logo

    (2009-10)

    Submitted To: Submitted By:

    Mr. XXXX XXXXXX (H.O.D.) XXXX XXXXX

    Mr. XXXXX XXXX

    DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

    XXX Engineering College, JAIPUR

    CERTIFICATE

    This is to certify that the seminar report entitled iCACHE has been submitted by XXXXX XXXXX of final year B.Tech Computer Science, XXX Engineering College, Jaipur. This work is found to be satisfactory, and it was observed that he was sincere during the course of his work.

    Seminar Guide: Mr. XXXXX XXXX

    Head of Department, Computer Science: Mr. XXXX XXXXX

    PREFACE

    It gives me immense pleasure to present this paper to the Department of Computer Science and Engineering, XXX Engineering College, University of Kota, in partial fulfillment of the requirements for the award of the degree of Bachelor of Engineering in Computer Science.

    This report is the crux of the seminar presented on iCACHE at XXX Engineering College, Jaipur.

    Although it was not possible for me to describe everything in detail, I have tried my best to cover all the important concepts.

    Acknowledgment

    I sincerely acknowledge Mr. XXXX XXXX XXX, Head, Computer Engineering Department, XXXXXXXX INSTITUTE OF TECHNOLOGY, JAIPUR, who helped me a lot so that the project was completed within the specified period. I am extremely grateful to him for his kind consent, co-operation and encouragement. I am greatly motivated by his habit of looking at everything from a completely different angle and by his never-ending enthusiasm for work. I also learned a lot while attending his sessions.

    I acknowledge Mr. XXXXXXXXXXX (Seminar Coordinator), who helped and guided me throughout the completion of this project and the challenges that came with it. He was always there to meet and talk about my ideas, and to proofread and mark up my project. I am really fortunate to have worked under his guidance. He was instrumental in making me understand how to implement the project and also provided continuous supervision of the project.

    This project has benefited from the many useful comments provided to me by many of my colleagues. In addition, many of my friends have checked it and offered suggestions and comments. Besides, some books and online resources were of help. Although I cannot mention all these people here, I thank each and every one who supported me on this.

    XXXX XXXXXXX
    B.Tech (IV Year)
    Comp.Sc. & Engg.
    XXX, JAIPUR

    CONTENTS

    S. No. Topic

    1. Introduction

    2. Application of Cache

    3. 2-way Cache

    4. The First Biometric Digital wallet

    5. Ram set 1 and 2

    6. iCACHE operation

    7. Flowchart of Line Load Process

    8. Process Diagram of ICACHE

    9. CPU Bits for Controlling the ICACHE

    10. PowerPC Architecture ICACHE

    11. iCACHE Register

    12. Advantage

    13. Conclusion

    14. Reference

    INTRODUCTION OF INSTRUCTION CACHE

    The average person carries nine credit, debit and store loyalty cards [CNN Money]. That's a lot of bulk to carry around in your wallet. What if you could replace that wallet full of plastic with a single slim device that would function just like each of your cards?

    That's exactly what the makers of iCache have in mind. Realizing that the technology behind the credit or debit card hasn't changed much in the last 40 years, the creators of iCache decided to update the way we conduct our everyday business transactions [CNN Money]. The iCache gadget is as thin as a Razr cell phone, and its sleek style resembles an iPod. Basically, the iCache harbors data from every credit, debit, loyalty (for example, your supermarket discount card) or gift card that you own. The gadget's design is meant to appeal to young and technologically savvy consumers as well as those seeking a safe and secure way to carry out transactions. It's the first gadget of its kind: a digital wallet secured by the owner's fingerprint.

    The iCache could not only alter day-to-day shopping, it could also change the way people travel. Instead of carrying a wallet full of credit and debit cards while you travel domestically or internationally, with the iCache travel gadget, you could streamline all your cards into one device.

    So far, the device has only been distributed by banks and other financial institutions to a select few customers. But it's a good glimpse at the potential future of making transactions. In fact, Jonathan Ramaci, CEO of the Cambridge, Mass.-based iCache, anticipates that 7 million units will be sold by the end of 2009 [CNN Money].

    But how exactly can one device replace all of your credit cards? Read on to learn about the mechanics of the iCache and how to use this unique travel gadget.

    [CNN Money] http://howstuffworks.com/framed.htm?parent=icache.htm&url=http://money.cnn.com/2007/08/23/technology/one_credit_card.biz2/index.htm

    Instruction Cache-

    On the TMS320VC5510 digital signal processor (DSP), instructions can reside in internal memory or external memory. When instructions reside in external memory, the instruction cache (I-Cache) can improve the overall system performance by buffering the most recent instructions accessed by the CPU. To configure the I-Cache and check its status, the CPU accesses a set of registers in the I-Cache.

    For storing instructions, the I-Cache contains:

    One 2-way cache. The 2-way cache uses 2-way set associative mapping and holds up to 16K bytes: 512 sets, two lines per set, four 32-bit words per line. In the 2-way cache, each line is identified by a unique tag.

    Two RAM sets (1 and 2). These two banks of RAM are available to hold blocks of code. Each RAM set holds up to 4K bytes: 256 lines, four 32-bit words per line. Each RAM set uses a single tag to identify a continuous range of memory addresses that is represented in the RAM set.

    Before enabling the I-Cache, configure the I-Cache to use zero, one, or both RAM sets.
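The capacities quoted above follow directly from the line geometry. The short Python check below is only an illustration of that arithmetic; the function and constant names are hypothetical and not part of any TI tool.

```python
# Capacity arithmetic for the C5510 I-Cache, using only the figures quoted above.
BYTES_PER_WORD = 4      # one 32-bit word
WORDS_PER_LINE = 4      # four 32-bit words per line
LINE_BYTES = BYTES_PER_WORD * WORDS_PER_LINE


def two_way_cache_bytes(sets=512, lines_per_set=2):
    """Total data capacity of the 2-way cache in bytes."""
    return sets * lines_per_set * LINE_BYTES


def ram_set_bytes(lines=256):
    """Data capacity of one RAM set in bytes."""
    return lines * LINE_BYTES


print(two_way_cache_bytes())  # 16384 bytes, i.e. 16K
print(ram_set_bytes())        # 4096 bytes, i.e. 4K
```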

    Cache

    In computer science, a cache is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch or to compute compared to the cost of reading the cache. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than re-fetching or recomputing the original data, so the average access time is shorter.

    History

    Use of the word cache in the computer context originated in 1967 during preparation of an article for publication in the IBM Systems Journal. The paper concerned an exciting memory improvement in Model 85, a latecomer in the IBM System/360 product line. The Journal editor, Lyle R. Johnson, pleaded for a more descriptive term than high-speed buffer. When none was forthcoming, he suggested the noun cache, from the French noun meaning a safekeeping or storage place. The paper was published in early 1968, the authors were honored by IBM, their work was widely welcomed and subsequently improved upon, and cache soon became standard usage in computer literature.

    Operation

    A cache is a block of memory for temporary storage of data likely to be used again. The CPU and hard drive frequently use a cache, as do web browsers and web servers.

    A cache is made up of a pool of entries. Each entry has a datum, which is a copy of the datum in some backing store. Each entry also has a tag, which specifies the identity of the datum in the backing store of which the entry is a copy.

    When the cache client wishes to access a datum presumably in the backing store, it first checks the cache. If an entry can be found with a tag matching that of the desired datum, the datum in the entry is used instead. This situation is known as a cache hit. So, for example, a web browser program might check its local cache on disk to see if it has a local copy of the contents of a web page at a particular URL. In this example, the URL is the tag, and the contents of the web page is the datum. The percentage of accesses that result in cache hits is known as the hit rate of the cache.

    The alternate situation, when the cache is consulted and found not to contain a datum with the desired tag, is known as a cache miss. The previously uncached datum fetched from the backing store during miss handling is usually copied into the cache, ready for the next access.

    During a cache miss, the cache usually ejects some other entry in order to make room for the previously uncached datum. The heuristic used to select the entry to eject is known as the replacement policy. One popular replacement policy, least recently used (LRU), replaces the least recently used entry. More efficient caches weigh use frequency against the size of the stored contents, as well as the latencies and throughputs of both the cache and the backing store. While this works well for larger amounts of data, long latencies, and slow throughputs, such as experienced with a hard drive and the internet, it is not efficient to use this for cached main memory.

    When a datum is written to the cache, it must at some point be written to the backing store as well. The timing of this write is controlled by what is known as the write policy. In a write-through cache, every write to the cache causes a synchronous write to the backing store.

    Alternatively, in a write-back cache, writes are not immediately mirrored to the store. Instead, the cache tracks which of its locations have been written over. The data in these locations is written back to the backing store when it is evicted from the cache, an effect referred to as a lazy write. For this reason, a read miss in a write-back cache will often require two memory accesses to service: one to retrieve the needed datum, and one to write replaced data from the cache to the store.
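The hit, miss, replacement and write-policy ideas above can be put together in a few lines. The following is a minimal sketch, not taken from the report: the class name `SimpleCache`, its methods and the dictionary used as backing store are all hypothetical, and it assumes an LRU replacement policy with write allocation.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy cache in front of a dict-like backing store (illustrative only)."""

    def __init__(self, backing_store, capacity, write_back=True):
        self.store = backing_store
        self.capacity = capacity
        self.write_back = write_back
        self.entries = OrderedDict()  # tag -> datum, least recently used first
        self.dirty = set()            # tags changed in the cache but not in the store
        self.hits = 0
        self.misses = 0

    def _make_room(self):
        # Replacement policy: evict the least recently used entry.
        if len(self.entries) >= self.capacity:
            tag, datum = self.entries.popitem(last=False)
            if tag in self.dirty:     # lazy write on eviction (write-back)
                self.store[tag] = datum
                self.dirty.discard(tag)

    def read(self, tag):
        if tag in self.entries:       # cache hit
            self.hits += 1
            self.entries.move_to_end(tag)
            return self.entries[tag]
        self.misses += 1              # cache miss: fetch and keep a copy
        datum = self.store[tag]
        self._make_room()
        self.entries[tag] = datum
        return datum

    def write(self, tag, datum):
        if tag not in self.entries:
            self._make_room()
        self.entries[tag] = datum
        self.entries.move_to_end(tag)
        if self.write_back:
            self.dirty.add(tag)       # defer the write to the backing store
        else:
            self.store[tag] = datum   # write-through: update the store now

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With `write_back=True`, a written value reaches the backing store only when its entry is evicted; with `write_back=False`, the store is updated on every write.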

    Data write-back may be triggered by other policies as well. The client may make many changes to a datum in the cache, and then explicitly notify the cache to write back the datum.

    No-write allocation is a cache policy where only processor reads are cached, thus avoiding the need for write-back or write-through when the old value of the datum was absent from the cache prior to the write.

    The data in the backing store may be changed by entities other than the cache, in which case the copy in the cache may become out of date or stale. Alternatively, when the client updates the data in the cache, copies of the data in other caches will become stale. Communication protocols between the cache managers which keep the data consistent are known as coherency protocols.

    Applications

    CPU caches

    Small memories on or close to the CPU can be made faster than the larger main memory. Most CPUs since the 1980s have used one or more caches, and modern microprocessors inside personal computers may have as many as half a dozen, each specialized to a different part of the task of executing programs.

    Disk cache

    While CPU caches are generally managed entirely by hardware, other caches are managed by a variety of software. The page cache in main memory, which is an example of a disk cache, is usually managed by the operating system kernel.

    While the hard drive's hardware disk buffer is sometimes misleadingly referred to as disk cache, its main functions are write sequencing and read prefetching. Repeated cache hits are relatively rare, due to the small size of the buffer in comparison to the HDD's capacity.

    In turn, fast local hard disks can be used to cache information held on even slower data storage devices, such as remote servers or local tape drives or optical jukeboxes. Such a scheme is the main concept of hierarchical storage management.

    Web cache

    Web caches are employed by web browsers and web proxy servers to store previous responses from web servers, such as web pages. Web caches reduce the amount of information that needs to be transmitted across the network, as information previously stored in the cache can often be re-used. This reduces bandwidth and processing requirements of the web server, and helps to improve responsiveness for users of the web.

    Modern web browsers employ a built-in web cache, but some internet service providers or organizations also use a caching proxy server, which is a web cache that is shared between all users of that network.

    Other caches

    The BIND DNS daemon caches a mapping of domain names to IP addresses, as does a resolver library.

    Write-through operation is common when operating over unreliable networks, because of the enormous complexity of the coherency protocol required between multiple write-back caches when communication is unreliable. For instance, web page caches and client-side network file system caches are typically read-only or write-through, specifically to keep the network protocol simple and reliable.

    Search engines also frequently make web pages they have indexed available from their cache. For example, Google provides a Cached link next to each search result. This is useful when web pages are temporarily inaccessible from a web server.

    Another type of caching is storing computed results that will likely be needed again, or memoization. An example of this type of caching is ccache, a program that caches the output of the compilation to speed up the second-time compilation.

    Introduction to the I-Cache-

    When the C5510 CPU requests instructions, it requests 32 bits at a time. To initiate an instruction fetch, the CPU sends a fetch request and a fetch address to the I-Cache. If the I-Cache is enabled, it handles the fetch request as follows: If the requested word is in one of the data arrays (a hit), the I-Cache delivers the word to the CPU. If the requested word is not in the I-Cache (a miss), the I-Cache uses the external memory interface (EMIF) to fetch the 4-word external memory block that contains the requested word. As soon as the requested word arrives in the I-Cache, it is delivered to the CPU. Timing information for I-Cache hits and misses can be found in section 5 on page 22.

    If the I-Cache is disabled, the data arrays are not checked. Instead, the fetch request and fetch address are passed to the EMIF. Once fetched by the EMIF, the requested 32-bit word is passed directly to the CPU.

    2-Way Cache:

    The 2-way cache has two memory banks. Each memory bank has the same parts:

    Data array. Each data array contains 512 lines (0 through 511) that the I-Cache can fill one by one in response to misses in the 2-way cache.

    Line valid (LV) bit array. Each line has a line valid bit. Once a line has been loaded, its line valid bit is set. Whenever the I-Cache is flushed, all 512 line valid bits are cleared, invalidating all the lines. For more information on flushing the I-Cache, see "CACLR Bit to Flush the I-Cache".

    Tag array. Each line has a tag field. When the I-Cache receives a 24-bit fetch address from the CPU, the I-Cache interprets bits 23-13 as a tag. When a line gets filled, the associated tag is stored in the tag field for that line.

    Across the two memory banks, every two lines with the same number belong to one set. For example, line 0 of memory bank 1 and line 0 of memory bank 2 belong to set 0. When the I-Cache receives a fetch address, the I-Cache finds the set number in bits 12-4. If the I-Cache must replace one of the lines in the set, it uses a least-recently used (LRU) algorithm: The line replaced is the one that has been unused for the longest time. Each set has an LRU bit that is toggled to indicate which line should be replaced.
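To make the address split and the per-set LRU bit concrete, here is a small Python model. It is a sketch under stated assumptions: the bit positions (tag 23-13, set 12-4, word offset 3-2) come from the text, while the class and function names, the 0/1 bank numbering (the text numbers the banks 1 and 2) and the exact encoding of the LRU bit are illustrative choices rather than details of the real hardware.

```python
def split_fetch_address(addr):
    """Split a 24-bit byte address into (tag, set_index, word_offset)."""
    tag = (addr >> 13) & 0x7FF       # bits 23-13
    set_index = (addr >> 4) & 0x1FF  # bits 12-4, sets 0..511
    offset = (addr >> 2) & 0x3       # bits 3-2, word within the line
    return tag, set_index, offset


class TwoWayCache:
    """Tag, line valid and LRU bookkeeping only; no instruction data is stored."""

    def __init__(self, num_sets=512):
        self.tags = [[None] * num_sets for _ in range(2)]    # one tag array per bank
        self.valid = [[False] * num_sets for _ in range(2)]  # one LV bit array per bank
        self.lru = [0] * num_sets    # per set: bank to replace next (assumed encoding)

    def access(self, addr):
        """Return True on a hit; on a miss, fill a line and return False."""
        tag, s, _ = split_fetch_address(addr)
        for bank in (0, 1):
            if self.valid[bank][s] and self.tags[bank][s] == tag:
                self.lru[s] = 1 - bank   # the other bank is now least recently used
                return True
        victim = self.lru[s]             # replace the least recently used line
        self.tags[victim][s] = tag
        self.valid[victim][s] = True
        self.lru[s] = 1 - victim
        return False

    def flush(self):
        """Clear every line valid bit, invalidating all lines."""
        for bank in (0, 1):
            self.valid[bank] = [False] * len(self.valid[bank])
```

Three addresses that share a set but differ in tag are enough to see a replacement: the third one evicts whichever of the first two was used least recently.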

    Figure: 2-Way Cache. Memory bank 1 and memory bank 2 each provide a data array, a line valid (LV) bit array and a tag array for lines 0 through 511. Line n of memory bank 1 and line n of memory bank 2 form set n (sets 0 through 511), and each set has one LRU bit.

    iCache: The First Biometric Digital Wallet

    How long have we been hearing about the digital wallet? I know Bill Gates was a mere millionaire back when I heard him spinning yarns, and yet here today is the iCache. Billed as the first biometric digital wallet, this handy little device holds your credit card data, frequent flier numbers, and virtually anything with a magnetic strip on it for safekeeping. Your finger unlocks the card.

    We Say: If you can brave the beyond-annoying flash-based website, you'll see a pretty cool idea coming to life at last. No word on pricing and the product is still in testing, but we're intrigued. Too bad it looks like an iPod; you know some test user will demand music and photo capability and we'll all be waiting another three years for this.

    RAM Sets 1 and 2:

    RAM set 1 and RAM set 2 each have the following parts:

    Data array. Each data array contains 256 lines (0 through 255).

    Line valid (LV) bit array. Each line has a line valid bit. When a line has been loaded, its line valid bit is set. Whenever the I-Cache is flushed, all 256 line valid bits are cleared, invalidating all the lines. For more information on flushing the I-Cache, see "CACLR Bit to Flush the I-Cache".

    Tag field. The RAM set has one 12-bit tag field that indicates which range of external memory addresses is mapped to the RAM set. To select a tag for RAM set n (1 or 2), write to RAM-set tag register n. When you write to the tag register, the I-Cache immediately fills the RAM set with all the 32-bit words in the address range specified by the tag. As each line is loaded, the associated line valid bit is set.

    Tag valid (TV) bit. The RAM set has one tag valid bit. Just before filling the RAM set, the I-Cache clears the tag valid bit. When the filling is complete, the I-Cache sets the tag valid bit. For RAM set n (1 or 2), the tag valid bit is reflected in RAM-set control register n.

    Note: Once the I-Cache begins to fill a RAM set, the I-Cache will not service CPU instruction-fetch requests until the RAM-set fill operation is complete. For optimal code performance, it is recommended that instruction fetches not be made from external memory during RAM-set fill operations.

    Figure: RAM Sets 1 and 2. Each RAM set has a data array and a line valid (LV) bit array for lines 0 through 255, plus one tag field and one tag valid (TV) bit.

    Bit Field Description:

    Bit    Field    Description
    3-2    Offset   When the I-Cache must read a 32-bit word from one of the lines of the 2-way cache, the offset field indicates which of the four 32-bit words in the line should be read.
    1-0    Byte     This field is not used by the I-Cache but is the part of the fetch address that indicates the specific byte being addressed.

    Fetch Address Fields for a RAM Set

    Bits:    23-12     11-4           3-2      1-0
    Field:   Tag       Index (line)   Offset   Byte
    Width:   12 bits   8 bits         2 bits   2 bits

    Bit Field Description:

    Bit    Field    Description
    23-12  Tag      During an instruction presence check, the I-Cache compares the tag portion of the fetch address with the tag defined in the RAM set.
    11-4   Index    This 8-bit value references one of the 256 lines of the RAM set.
    3-2    Offset   When the I-Cache must read a 32-bit word from one of the lines of the RAM set, the offset field indicates which of the four 32-bit words in the line should be read.
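The RAM-set field layout above can be exercised with a few shifts and masks. This Python sketch only restates the bit positions from the table; the function names are hypothetical, and the address range computed for a tag assumes that the tag simply supplies bits 23-12 of every address held in the RAM set.

```python
def split_ram_set_address(addr):
    """Split a 24-bit fetch address into (tag, index, offset, byte) for a RAM set."""
    tag = (addr >> 12) & 0xFFF   # bits 23-12
    index = (addr >> 4) & 0xFF   # bits 11-4, line 0..255
    offset = (addr >> 2) & 0x3   # bits 3-2, word within the line
    byte = addr & 0x3            # bits 1-0, not used by the I-Cache
    return tag, index, offset, byte


def ram_set_address_range(tag):
    """First and last byte address covered by a RAM set holding this 12-bit tag."""
    start = (tag & 0xFFF) << 12
    return start, start + 0xFFF  # 4K bytes: 256 lines of 16 bytes
```

For example, `split_ram_set_address(0x123456)` gives tag 0x123, line 0x45, word 1, byte 2, and a RAM set tagged 0x123 would cover addresses 0x123000 through 0x123FFF.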

    iCache Eliminates a Fat Wallet

    It's impossible to see everything at CES, but I managed to spot the iCache over at Popgadget, which says this little device is a "mini storage for all your credit cards." It's funny I found this because this weekend I bought a new wallet and realized I carry way too many credit cards. I have a card for Borders, Blockbuster, Costco, Safeway, and other stores I wish I didn't need.

    A concept device predicted to be a big hit among the ladies made its way into CES, and only a few people spotted it. It's called iCache, and it promises to eliminate the bulge from your wallet. The digital wallet consists of a universal card and an iPod-like device that holds all of your credit card information. To activate the iCache digital wallet, you'll have to register all your credit cards to an online account. Anytime you need a particular card, you simply select it from the list on your iCache, insert the universal card to temporarily load it with the data you need, then use it at the store of your choice. The information remains on the card for ten minutes before it's completely wiped out.

    In terms of security, all the information on the iCache is encrypted, and the device itself requires a scan of your fingerprint before it can work. While most of us may not be ready to use this type of technology, do expect it to become available in the next two years, that is, if financial institutions warm up to the idea at all. What do you think of this emerging technology? Guys, would you carry an iCache in your back pocket?

    The next time you go shopping for a wallet, think about how nice it would be to replace all of that plastic you're carrying with one card. That's precisely what iCache will allow you to do, once it's released. The iCache keeps a copy of every card you have programmed into the device and will program the dynamic magnetic strip when you select that particular card.

    How It Works:

    The iCache will most likely be available first through banks, then through retail locations. After you receive your iCache, you have to register your fingerprint and the cards you would like to have programmed into the device by plugging it into your PC through a USB cable. The software will prompt you for your card numbers and expiration dates, which will then be saved on the iCache device itself. Then, when you reach the cash register, place your finger on the print scanner, navigate to the card you want to use and activate it. The magnetic strip will be programmed for that card; the card will eject from the iCache card holder and can then be swiped at the terminal.

    Security:

    You can't talk about the iCache without highlighting the aspect of security. If you lose the device, it will be of no use to anyone else, considering you need to use your fingerprint before it can be activated. According to iCache, if the device is tampered with in any way, the data will be permanently deleted. If you choose to store all of your credit card data online with iCache, you can simply plug your iCache device into your PC and the data will automatically be restored.

    Notice how I said choose. Many people are worried that they have to store their card data online with iCache, which is not the case. It will just be more of a pain to restore the data if you choose not to have them store your data.

    Which Cards Can You Use:

    iCache states that technically, you should be able to program any card that contains a mag stripe. That is, any card with a black magnetic strip that you would swipe at pay terminals. This includes credit cards, debit cards, prepaid cards, ATM cards, loyalty cards, gas cards, gift cards and gift certificate cards. It will be awesome to finally take all of those loyalty cards off of my keyring and put them into this device.

    When Will It Be Available:

    iCache is aiming for a soft release around the 2nd quarter of 2008, with a more widespread release near the end of the year. There is no word yet on how much this device will cost, but if it's $100.00 or cheaper, I'm going to pick one up.

    Instruction Presence Check and the Corresponding I-Cache Response-

    When a fetch request arrives, the I-Cache performs an instruction presence check to determine whether the requested 32-bit word is available in the I-Cache. During the instruction presence check, the I-Cache performs these two operations on both the 2-way cache and the RAM sets:

    1) Compares the tag portion of the fetch address with the tag in the data array at the location referenced by the Index portion of the fetch address.

    2) Checks the line valid bit at the referenced location, to determine whether the line associated with the tag is valid.

    If the tag comparison fails and/or the line valid bit is 0, this qualifies as a miss. If the instruction presence check finds a tag match and the line valid bit is 1, this qualifies as a hit. Whenever a line in the I-Cache must be loaded from external memory (cases 1, 2, and 5), the I-Cache uses the line load process.

    Case  2-Way Cache  RAM Sets             Presence  I-Cache Response
    1     Miss         Miss (no tag match)  False     2-way cache line loaded from external memory, requested 32-bit word delivered to CPU
    2     Miss         Miss but tag match   False     RAM set line loaded from external memory, requested 32-bit word delivered to CPU
    3     Miss         Hit                  True      Requested 32-bit word taken directly from RAM set; 2-way cache line not loaded
    4     Hit          Miss (no tag match)  True      Requested 32-bit word taken directly from 2-way cache
    5     Hit          Miss but tag match   True      RAM set line loaded from external memory, requested 32-bit word delivered to CPU
    6     Hit          Hit                  True      Requested 32-bit word taken directly from RAM set
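The six cases in the table reduce to a short decision function. The Python sketch below simply encodes the table as given; the enum and function names are hypothetical, and the returned action strings are shorthand for the table's response column.

```python
from enum import Enum


class RamSet(Enum):
    """Outcome of the presence check in the RAM sets."""
    MISS_NO_TAG_MATCH = "miss (no tag match)"
    MISS_TAG_MATCH = "miss but tag match"
    HIT = "hit"


def icache_response(two_way_hit, ram_set):
    """Return (presence, action) for one fetch, following the six cases above."""
    if ram_set is RamSet.HIT:             # cases 3 and 6
        return True, "deliver word from RAM set"
    if ram_set is RamSet.MISS_TAG_MATCH:  # cases 2 and 5
        return two_way_hit, "load RAM set line from external memory"
    if two_way_hit:                       # case 4
        return True, "deliver word from 2-way cache"
    return False, "load 2-way cache line from external memory"  # case 1
```

Written this way, it is visible that the RAM sets take priority: whenever the RAM-set tag matches, the response involves the RAM set regardless of the 2-way cache result.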

    Line Load Process:

    When an instruction presence check results in a fetch from the external memory, the 4-word external memory block that contains the requested word is fetched and loaded into a line in the I-Cache. The I-Cache uses the external memory interface (EMIF) to fetch the 4-word block. These four 32-bit words are written to the line in the I-Cache one word at a time. The I-Cache delivers the requested word to the CPU as soon as the word arrives in the data array, even if the rest of the line is still being loaded. When the entire line is loaded in the data array, the corresponding tag is written to the tag array and the line valid bit is set to validate the line.

    Flow Chart of the Line Load Process

    1) The I-Cache must load a 2-way cache line or a RAM set line.

    2) The I-Cache commands the EMIF to read four 32-bit words from external memory.

    3) Is a word received? If no, keep waiting. If yes, write the word to the line.

    4) Is it the requested word? If yes, deliver the word to the I unit of the CPU.

    5) Is the line load done? If no, wait for the next word and return to step 3. If yes, the process ends.
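The line load process can be sketched as a Python generator, which shows how the requested word is delivered before the rest of the line arrives. This is an illustration only: `read_block` is a hypothetical stand-in for the EMIF that yields the four words of a block one at a time, and the event names are invented for the example.

```python
def line_load(read_block, line_address, requested_offset):
    """Yield events while one 4-word line is loaded.

    read_block(line_address) is a hypothetical stand-in for the EMIF; it must
    return an iterable that produces the four 32-bit words of the block.
    """
    line = []
    for offset, word in enumerate(read_block(line_address)):
        line.append(word)                 # write the word to the line
        if offset == requested_offset:
            yield ("deliver", word)       # CPU gets its word as soon as it arrives
    # Only after the whole line is loaded is the tag written and the LV bit set.
    yield ("line_valid", tuple(line))
```

Because the generator is lazy, a caller that asks for word 0 receives the "deliver" event after a single word has been read, which mirrors the early delivery described in the text.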

    Process Diagram

    INCLUSION FILTER

    The inclusion filter enables detection of instructions in the processor pipeline, mainly to support self-modifying code (SMC). To reduce design impact on this timing-sensitive area of the processor pipeline, detection techniques are architecturally minimized to provide pessimistic estimates. The goal is a logically optimized solution in which false SMC detection is sufficiently uncommon that the resulting performance loss is negligible.

    Motivation

    As the processor pipeline capacity increased, the inclusion detection solution needed to be re-examined. The pipeline capacity includes instructions in all stages and structures between instruction fetch and retirement, which increased substantially when the issue width was increased for the Merom architecture from 3 to 4. The increased instruction capacity resulted in increased false SMC detection conditions during Merom silicon testing, which tended to have a more limiting impact on server performance. For example, Transaction Processing Performance Council Benchmark C performance increased by 2 percent with inclusion checking disabled. The Inclusion Filter in the processor significantly reduces false SMC detection by using an alternative technique to filter the most common false detection scenarios from the existing detection mechanism.

    Solution

    Most instructions in the pipeline will also naturally exist in the Instruction Cache (ICache), so the Inclusion Filter monitors ICache activity to algorithmically identify states in which this common property is guaranteed (Figure 9). Snoops in this state can then be filtered from the existing inclusion-detection mechanism, and this combination virtually eliminates false SMC detection.

    Figure 9: The Inclusion Filter reduces false SMC detection

    To increase confidence in this new microarchitectural solution, it was essential to minimize design complexity. To reduce logic risk and validation requirements, the Inclusion Filter has a single functional output. To avoid the risk of frequency degradation, it is logically separated from the existing ICache structure (which is ideal for separating logic vs. process-related debug) and uses only non-timing-critical signals, such as the ICache LRU bits.

    For example, a property of the ICache pseudo-LRU algorithm is that for an X-way configuration, an accessed entry will not be evicted until at least log2(X) different entries in the same set have been accessed. Therefore, for an 8-way cache, each set is allowed to filter at least three ICache evictions prior to resorting to the existing inclusion detection mechanism. Determining a different entry can be accomplished without additional storage by detecting a change of LRU bits. Monitoring changes to specific LRU bits and other control logic can increase the limit substantially using other similar properties.
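The log2(X) property can be checked by brute force on a small model. The sketch below assumes the common binary-tree form of pseudo-LRU (one bit per tree node); the text does not say which pseudo-LRU variant the processor uses, so both the `TreePLRU` class and the search function are illustrative assumptions rather than a description of the actual design.

```python
from itertools import product


class TreePLRU:
    """Tree pseudo-LRU state for one set with `ways` entries (a power of two)."""

    def __init__(self, ways):
        self.ways = ways
        self.bits = [0] * (ways - 1)  # heap-ordered nodes; 0 = victim on the left

    def touch(self, way):
        """Record an access: every node on the path is pointed away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:             # accessed entry is in the left half,
                self.bits[node] = 1   # so the victim pointer goes right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the node bits from the root to the entry that would be evicted."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo


def min_accesses_before_eviction(ways, way=0):
    """Fewest accesses to other entries after which `way` becomes the victim."""
    others = [w for w in range(ways) if w != way]
    for n in range(1, ways):
        for seq in product(others, repeat=n):
            plru = TreePLRU(ways)
            plru.touch(way)
            for w in seq:
                plru.touch(w)
            if plru.victim() == way:
                return n
    return None
```

Under this model, `min_accesses_before_eviction(8)` evaluates to 3, which matches the log2(8) bound quoted above for an 8-way cache.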

    When the Inclusion Filter is saturated and finally allows the existing mechanismto be used, it is more beneficial to have it return to its reset state than tocontinue filtering. From a reset, the average cycles needed for the InclusionFilter to resort to the existing inclusion mechanism is 50 times greater than thecycles needed to ensure that a fetched instruction is no longer in the pipeline.Therefore, when the Inclusion Filter is finally saturated, it takes the opportunityto completely reset its state, but it disables filtering until it is certain that allinstructions in the pipeline at the time of this reset have been retired.

    In effect, a small window is opened during which the existing detection method is used, then it is closed for a very long time (98 percent closed on average). This translates to a 98 percent reduction in false SMC detection, and near-optimal performance.
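    The 98 percent figure is consistent with the 50:1 ratio quoted above. As a rough check, assume (this simplification is not stated in the report) that the filter simply alternates between a closed period of 50 units and an open period of 1 unit, where one unit is the pipeline-drain time:

```python
def closed_fraction(closed_cycles, open_cycles):
    """Fraction of time the filter window is closed in one closed/open period."""
    return closed_cycles / (closed_cycles + open_cycles)


# 50 units filtering for every 1 unit with the window open (assumed alternation).
fraction = closed_fraction(50, 1)
print(f"{fraction:.1%} closed")
```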

    CPU Bits for Controlling the I-Cache:

    Control of the I-Cache is maintained not only through the I-Cache registers but also through three bits located in status register ST3_55 of the CPU.

    CAEN Bit to Enable or Disable the I-Cache


    To enable the I-Cache, set the cache enable (CAEN) bit of ST3_55. To disable the I-Cache, clear the CAEN bit. When disabled, the lines of the I-Cache data arrays are not checked; instead, the I-Cache forwards instruction-fetch requests directly to the external memory interface (EMIF). For proper operation of the I-Cache, configure the I-Cache before enabling it and disable the I-Cache before making any changes to its configuration.
    A DSP reset forces CAEN = 0 (I-Cache disabled).

    CACLR Bit to Flush the I-Cache

    The flush operation is defined as the invalidation of all of the lines in the I-Cache. To flush the I-Cache, write 1 to the cache clear (CACLR) bit of ST3_55. In response, all the line valid bits of the 2-way cache and of the RAM sets are cleared. In addition, the tag valid bit of each RAM set is cleared. The CACLR bit remains 1 until the flush process is complete, at which time CACLR is automatically reset to 0.
    A DSP reset forces CACLR = 0 (no flush in process).


    CAFRZ Bit to Freeze the Contents of the I-Cache

    When you write 1 to the cache freeze (CAFRZ) bit of ST3_55, the contents of the I-Cache are locked. Instruction words that were cached prior to the freeze are still accessible in the case of an I-Cache hit, but the data arrays are not updated in response to an I-Cache miss. To re-enable updates, write 0 to CAFRZ.
    A DSP reset forces CAFRZ = 0 (I-Cache not frozen).

    Note:

    When the I-Cache is frozen (CAFRZ = 1), each I-Cache miss still causes a 4-word (16-byte) fetch cycle in the EMIF. It is recommended that you profile your code to minimize the number of misses during an I-Cache freeze.
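    The effect of the three bits can be summarized in a toy behavioural model. This is only a sketch of the semantics described above: the class, its attributes and the set-of-addresses representation are hypothetical, and nothing here touches real hardware.

```python
class ICacheModel:
    """Toy model of the CAEN, CACLR and CAFRZ semantics (not a hardware driver)."""

    def __init__(self):
        self.caen = False         # reset state: I-Cache disabled
        self.cafrz = False        # reset state: I-Cache not frozen
        self.valid_lines = set()  # line addresses currently held (hypothetical)
        self.emif_fetches = 0     # requests that had to go to external memory

    def flush(self):
        """Writing 1 to CACLR: invalidate every line."""
        self.valid_lines.clear()

    def fetch(self, line_address):
        if not self.caen:
            # Disabled: lines are not checked, request goes straight to the EMIF.
            self.emif_fetches += 1
            return "emif"
        if line_address in self.valid_lines:
            return "hit"
        # A miss always costs an EMIF fetch, even when frozen.
        self.emif_fetches += 1
        if not self.cafrz:
            self.valid_lines.add(line_address)  # line fill only when not frozen
        return "miss"
```

    In this model a frozen cache keeps serving hits on lines loaded earlier, while repeated misses to the same line keep going to the EMIF, which is why profiling before a freeze is recommended.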

    Configuring and Enabling the I-Cache


    This section gives the procedures for preparing and enabling the I-Cache for the three I-Cache configurations:

    _ 2-way cache and no RAM sets
    _ 2-way cache and one RAM set
    _ 2-way cache and two RAM sets

    The cache enable (CAEN) bit that is used to enable and disable the I-Cache is located in CPU status register ST3_55.

    Notes:

    1) Write to the control registers (ICGC, ICWC, ICRC1, and ICRC2) only when the I-Cache is disabled (CAEN = 0 in ST3_55).
    2) Write to the RAM-set tag registers (ICRTAG1 and ICRTAG2) only when the I-Cache is enabled (after making CAEN = 1 in ST3_55, wait for IEN = 1 in ICST).

    To configure with 2-way cache and no RAM sets:

    1) Write to the appropriate control registers:
       _ Write CBFFh to ICGC to indicate no RAM sets.
       _ Write 000Fh to ICWC to initialize the logic for the 2-way cache.
    2) Set the cache enable (CAEN) bit of CPU status register ST3_55 to send an enable request to the I-Cache.
    3) Poll the I-Cache-enabled (IEN) bit of ICST until IEN = 1. (The I-Cache is not instantaneously enabled.)

    To configure with 2-way cache and one RAM set:

    1) Write to the appropriate control registers:
       _ Write CE1Fh to ICGC to indicate one RAM set.
       _ Write 000Fh to ICWC to initialize the logic for the 2-way cache.
       _ Write 000Fh to ICRC1 to initialize the logic for RAM set 1.
    2) Set the cache enable (CAEN) bit of CPU status register ST3_55 to send an enable request to the I-Cache.
    3) Poll the I-Cache-enabled (IEN) bit of ICST until IEN = 1. (The I-Cache is not instantaneously enabled.)


    4) Write the desired tag to ICRTAG1. When you write to the tag register, the tag is used to immediately fill RAM set 1 from external memory. While the I-Cache is enabled, you can write to the tag register at any time to change the RAM-set address range. Each time you load the tag register, RAM set 1 is immediately filled from the selected address range.
    5) To monitor the RAM-set filling, poll the tag-valid bit: When R1TVALID = 1 in ICRC1, the I-Cache has finished filling RAM set 1.

    To configure with 2-way cache and two RAM sets:

    1) Write to the appropriate control registers:
       _ Write CFFFh to ICGC to indicate two RAM sets.
       _ Write 000Fh to ICWC to initialize the logic for the 2-way cache.
       _ Write 000Fh to ICRC1 to initialize the logic for RAM set 1.
       _ Write 000Fh to ICRC2 to initialize the logic for RAM set 2.
    2) Set the cache enable (CAEN) bit of CPU status register ST3_55 to send an enable request to the I-Cache.
    3) Poll the I-Cache-enabled (IEN) bit of ICST until IEN = 1, indicating that the I-Cache is enabled. (The I-Cache is not instantaneously enabled.)
    4) Write to the RAM-set tag registers:
       _ Write the desired tag to ICRTAG1. When you write to the tag register, the tag is used to immediately fill RAM set 1 from external memory.
       _ Write the desired tag to ICRTAG2. When you write to the tag register, the tag is used to immediately fill RAM set 2 from external memory.
       While the I-Cache is enabled, you can write to a tag register at any time to change the address range as necessary. Each time you load a tag register, the corresponding RAM set is immediately filled from the selected address range.
    5) To monitor the RAM-set filling, poll the tag-valid bits:
       _ When R1TVALID = 1 in ICRC1, the I-Cache is done filling RAM set 1.
       _ When R2TVALID = 1 in ICRC2, the I-Cache is done filling RAM set 2.
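    The three procedures share one ordering: control registers first, then CAEN, then a poll on IEN, then the tags, then a poll on the tag-valid bits. The sketch below replays that ordering against a software stand-in. FakeICache, its methods and its latency parameter are hypothetical and exist only so the sequence can be exercised; the register values (CBFFh, CE1Fh, CFFFh, 000Fh) and bit positions (IEN in bit 2 of ICST, RxTVALID in bit 15 of ICRCx) are the ones given in this report.

```python
ICGC_MODE = {0: 0xCBFF, 1: 0xCE1F, 2: 0xCFFF}  # RMODE values for 0, 1, 2 RAM sets
IEN = 1 << 2        # ICST bit 2
TVALID = 1 << 15    # bit 15 of ICRC1 / ICRC2


class FakeICache:
    """Hypothetical software stand-in for the I-Cache registers and the CAEN bit."""

    def __init__(self, latency=2):
        names = ["ICGC", "ICWC", "ICRC1", "ICRC2", "ICRTAG1", "ICRTAG2", "ICST"]
        self.regs = dict.fromkeys(names, 0)
        self.caen = 0
        self.latency = latency   # polls needed before a status bit appears
        self.pending = {}        # register name -> (polls left, bit to set)
        self.log = []            # order of writes, for inspection

    def write(self, name, value):
        if name.startswith("ICRTAG"):
            if not self.regs["ICST"] & IEN:
                raise RuntimeError("tag written before IEN = 1")
            ctrl = "ICRC" + name[-1]
            self.regs[ctrl] &= ~TVALID               # a new RAM-set fill starts
            self.pending[ctrl] = (self.latency, TVALID)
        elif self.caen:
            raise RuntimeError("control register written while CAEN = 1")
        self.regs[name] = value & 0xFFFF
        self.log.append(name)

    def set_caen(self):
        self.caen = 1
        self.pending["ICST"] = (self.latency, IEN)   # enabling is not instantaneous
        self.log.append("CAEN")

    def read(self, name):
        if name in self.pending:
            left, bit = self.pending[name]
            if left <= 1:
                self.regs[name] |= bit
                del self.pending[name]
            else:
                self.pending[name] = (left - 1, bit)
        return self.regs[name]


def configure_icache(dev, ram_sets, tags=()):
    """Apply the documented order for 0, 1 or 2 RAM sets."""
    dev.write("ICGC", ICGC_MODE[ram_sets])
    dev.write("ICWC", 0x000F)
    for n in range(1, ram_sets + 1):
        dev.write(f"ICRC{n}", 0x000F)
    dev.set_caen()
    while not dev.read("ICST") & IEN:        # poll until the I-Cache is enabled
        pass
    filled = list(zip(range(1, ram_sets + 1), tags))
    for n, tag in filled:
        dev.write(f"ICRTAG{n}", tag & 0x0FFF)
    for n, _ in filled:
        while not dev.read(f"ICRC{n}") & TVALID:   # poll until the fill is done
            pass
```

    For example, configure_icache(FakeICache(), 2, (0x123, 0x456)) walks through the two-RAM-set procedure; the stand-in raises an error if a control register is written after CAEN is set or a tag is written before IEN = 1, mirroring the two notes above.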


    Timing Considerations:

    As the I-Cache fetches and returns 32-bit words requested by the CPU, two key time periods affect the speed of the I-Cache:

    _ Hit time
    _ Miss penalty

    Hit Time

    The hit time is the time required for the I-Cache to deliver the 32-bit requested word to the CPU in the case of a hit (when the word is present in the I-Cache). The hit time is either 1 or 2 CPU clock cycles:

    _ An initial request (a request that follows a period of inactivity) has a hit time of 2 cycles.
    _ Subsequent requests have a hit time of 1 cycle if:
       _ The requests are consecutive (no inactivity in between) and
       _ The requests are to sequential addresses
    _ Subsequent requests have a hit time of 2 cycles if:
       _ The requests are not consecutive or
       _ The requests are to nonsequential addresses
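    The hit-time rules can be written as a small function. This is an illustrative sketch: it assumes requests are given as 32-bit word addresses, so that "sequential" means the previous address plus one, and the tuple layout is hypothetical.

```python
def hit_times(requests):
    """Return the hit time in CPU cycles for each request, assuming every one hits.

    requests: list of (word_address, idle_before) tuples, where idle_before is
    True when the request follows a period of inactivity.
    """
    times = []
    previous = None
    for address, idle_before in requests:
        sequential = previous is not None and address == previous + 1
        # 1 cycle only for consecutive requests to sequential addresses.
        times.append(1 if (sequential and not idle_before) else 2)
        previous = address
    return times


example = [(100, True), (101, False), (102, False), (200, False), (201, True)]
print(hit_times(example))
```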


    Miss Penalty

    The miss penalty is the time required for the I-Cache to deliver the 32-bit requested word to the CPU in the case of a miss (when the word must be fetched from external memory). In response to a miss, the I-Cache requests four words from the external memory interface (EMIF) to load the appropriate line. The miss penalty due to an initial request to the EMIF is:

    1) Four cycles for the I-Cache to receive the fetch request, detect an I-Cache miss, and forward the fetch request to the EMIF.
    2) X cycles for the EMIF to get the requested word to the I-Cache, where X depends on factors such as:
       _ The initial access latency of the type of external memory that is used
       _ The position of the requested word in the I-Cache line. For example, if the requested word is the third word of the line, two words are fetched before the requested word.
       _ Whether the four words are fetched in a burst access (if synchronous memory is used)
    3) Three cycles for the I-Cache to get the requested 32-bit word to the instruction fetch unit (I unit) of the CPU.

    Subsequent requests can incur a smaller miss penalty if the external memory is synchronous. After accessing the first word from synchronous memory, the EMIF can return each of the remaining words in a single cycle.
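    Put together, the initial miss penalty is 4 + X + 3 cycles. The sketch below adds up those three parts. The emif_cycles helper is a hypothetical, simplified model of X (an initial latency plus a fixed cost per word fetched ahead of the requested one); real values depend on the external memory and EMIF settings.

```python
DETECT_AND_FORWARD_CYCLES = 4   # receive request, detect miss, forward to EMIF
DELIVER_TO_I_UNIT_CYCLES = 3    # pass the requested word on to the I unit


def emif_cycles(initial_latency, word_position, cycles_per_word=1):
    """Hypothetical model of X: latency to the first word plus earlier words in the line.

    word_position is zero-based, so the third word of a line has position 2.
    """
    return initial_latency + word_position * cycles_per_word


def miss_penalty(x):
    """Initial miss penalty in CPU cycles for a given EMIF time X."""
    return DETECT_AND_FORWARD_CYCLES + x + DELIVER_TO_I_UNIT_CYCLES


# Requested word is the third word of the line; a latency of 6 cycles is assumed.
print(miss_penalty(emif_cycles(initial_latency=6, word_position=2)))
```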

    PowerPC Architecture Instruction Cache (I-Cache)

    The PowerPC 405 accesses memory through the Instruction Cache Unit (ICU) and Data Cache Unit (DCU). Each cache unit includes a Processor Local Bus (PLB) Interface, Cache Arrays, and a Cache Controller. Hits into the instruction cache and data cache appear to the CPU as single-cycle memory accesses.

    The PowerPC 405 implements separate instruction cache and data cache arrays. Each is 16 KB in size, two-way set-associative, and operates using 8-word (32-byte) cache lines. The caches are non-blocking to allow the PowerPC 405 to overlap instruction execution with reads over the PLB. The cache controllers replace cache lines according to a least-recently used (LRU) replacement policy. When a cache line fill occurs, the most-recently accessed line in the cache set is retained and the other line is replaced. The cache controller updates the LRU during a cache line fill.
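    The stated parameters fix the cache geometry by simple arithmetic. The sketch below derives the number of sets and the address split, and models the two-way LRU rule with one "most recently used" marker per set. It assumes 32-bit byte addresses; the names are illustrative and the field boundaries in a real PowerPC 405 implementation may be defined differently.

```python
CACHE_BYTES = 16 * 1024
WAYS = 2
LINE_BYTES = 32
ADDRESS_BITS = 32  # assumption: 32-bit byte addresses

SETS = CACHE_BYTES // (WAYS * LINE_BYTES)
OFFSET_BITS = LINE_BYTES.bit_length() - 1
INDEX_BITS = SETS.bit_length() - 1
TAG_BITS = ADDRESS_BITS - INDEX_BITS - OFFSET_BITS


def split_address(addr):
    """Split a byte address into (tag, set index, byte offset within the line)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class TwoWaySet:
    """One cache set: on a fill, keep the most recently accessed line, replace the other."""

    def __init__(self):
        self.tags = [None, None]
        self.mru = 1  # chosen so that way 0 is filled first

    def access(self, tag):
        if tag in self.tags:
            self.mru = self.tags.index(tag)
            return True                  # hit
        victim = 1 - self.mru            # the line that was not accessed last
        self.tags[victim] = tag
        self.mru = victim                # LRU state is updated on the fill
        return False                     # miss
```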

    The ICU supplies up to two instructions every cycle to the Fetch and Decode unit. The ICU can also forward instructions to the Fetch and Decode unit during a cache line fill, minimizing execution stalls caused by instruction-cache misses. When the ICU is accessed, four instructions are read from the appropriate cache line and placed temporarily in a line buffer. Subsequent ICU accesses check this line buffer for the requested instruction prior to accessing the cache array. This allows the ICU cache array to be accessed as little as once every four instructions, significantly reducing ICU power consumption.
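    The "once every four instructions" figure can be illustrated with a counter. This sketch assumes that the line buffer holds one aligned group of four 4-byte instructions; the alignment and the class itself are assumptions made for illustration.

```python
INSTRUCTION_BYTES = 4            # PowerPC instructions are 32 bits wide
BUFFER_INSTRUCTIONS = 4          # instructions read into the line buffer per access
CHUNK_BYTES = INSTRUCTION_BYTES * BUFFER_INSTRUCTIONS


class LineBufferModel:
    """Counts cache-array accesses when fetches are first checked against a line buffer."""

    def __init__(self):
        self.buffered_chunk = None
        self.array_accesses = 0

    def fetch(self, address):
        chunk = address // CHUNK_BYTES
        if chunk != self.buffered_chunk:
            # Requested instruction is not in the buffer: read the cache array.
            self.array_accesses += 1
            self.buffered_chunk = chunk


icu = LineBufferModel()
for i in range(16):              # 16 sequential instructions from an aligned start
    icu.fetch(0x1000 + i * INSTRUCTION_BYTES)
print(icu.array_accesses)
```

    Straight-line code is the best case; branches that leave the buffered group force an extra array access in this model.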


    Power, Emulation, and Reset Considerations


    Idle Mode for Reducing Power Consumed

    If you want to temporarily halt the I-Cache to reduce power, you can place it in its idle mode:

    1) Select the idle mode for the I-Cache by making CACHEI = 1 in the idle configuration register (ICR) of the DSP.
    2) Execute the IDLE instruction.

    When the I-Cache is in its idle mode or is disabled, instruction-fetch requests are handled by the external memory interface (EMIF).

    To wake the I-Cache from its idle mode:

    1) Deselect the idle mode by making CACHEI = 0 in ICR.
    2) Execute the IDLE instruction.

    Emulator Access

    The software emulator can read the contents of the I-Cache during the debug mode. The contents of the I-Cache are not modified by emulator read operations.

    Effect of Setting a Software Breakpoint

    During emulation, if you set or remove a software breakpoint at an instruction, the corresponding line in the I-Cache is automatically invalidated.

    Reconfiguration Required After a DSP Reset

    After a DSP reset, the I-Cache is not automatically reconfigured for use. Make sure that your initialization code configures the I-Cache.


    I-Cache Registers:

    Control of the I-Cache is maintained through a set of registers within the I-Cache. These registers are accessible at addresses in the I/O space of the DSP.


    I-Cache Global Control Register (ICGC)

    The I-Cache supports one 2-way cache and zero, one, or two RAM sets. Before enabling the I-Cache, use the global control register (ICGC) to select the desired RAM-set mode. Do not write values other than those listed for RMODE to this register.

    Legend: R = Read; W = Write; -x = Value after reset is not defined


    I-Cache Global Control Register (ICGC) Bits

    Bit    Field   Value   Description
    15-0   RMODE           RAM-set mode bits
                   CBFFh   No RAM sets
                   CE1Fh   1 RAM set. Only RAM set 1 is available.
                   CFFFh   2 RAM sets. RAM set 1 and RAM set 2 are available.

    I-Cache Way Control Register (ICWC)

    You must initialize the logic for the 2-way cache before you enable the I-Cache. To perform the initialization, write 000Fh to the way control register (ICWC). Do not write any value other than 0Fh to the WINIT field of ICWC.

    I-Cache Way Control Register (ICWC)

    Bits:     15-5       4-0
    Field:    Reserved   WINIT
    Access:   R-0        R/W-x

    Legend: R = Read; W = Write; -n = Value after reset; -x = Value after reset is not defined

    I-Cache Way Control Register (ICWC) Bits

    Bit    Field      Value   Description
    15-5   Reserved           These read-only bits are not used.
    4-0    WINIT      0Fh     Way initialization bits. Make WINIT = 0Fh to initialize the logic for the 2-way cache.


    I-Cache RAM Set Control Registers (ICRC1 and ICRC2)

    Each RAM set control register contains an initialization field and a tag-valid bit.

    Initialization field (RxINIT). If you have selected one or two RAM sets with the global control register, you must initialize the logic for RAM set 1 before you enable the I-Cache. If you have selected two RAM sets with the global control register, you must also initialize the logic for RAM set 2. To perform the initialization for each RAM set, write 000Fh to its RAM set control register. Do not write any value other than 11b to R1INIT and R2INIT.

    Tag-valid bit (RxTVALID). When the I-Cache completes the process of filling RAM set 1, the I-Cache sets R1TVALID. You can poll this bit to determine when the RAM set is ready.

    I-Cache RAM Set Control Registers (ICRC1 and ICRC2)

    ICRC1
    Bits:     15         14-2       1-0
    Field:    R1TVALID   Reserved   R1INIT
    Access:   R-0        R-0003h    R/W-x

    ICRC2
    Bits:     15         14-2       1-0
    Field:    R2TVALID   Reserved   R2INIT
    Access:   R-0        R-0003h    R/W-x

    Legend: R = Read; W = Write; -n = Value after reset; -x = Value after reset is not defined

    I-Cache RAM Set 1 Control Register (ICRC1) Bits

    Bit    Field      Value   Description
    15     R1TVALID           RAM set 1 tag-valid bit. Check this bit to determine when the I-Cache has completed the process of filling RAM set 1.
                      0       The fill is not started or is not complete.
                      1       The fill is complete.
    14-2   Reserved           These read-only bits are not used.
    1-0    R1INIT     11b     RAM set 1 initialization bits. Make R1INIT = 11b to initialize the logic for RAM set 1.


    I-Cache RAM Set 2 Control Register (ICRC2) Bits

    Bit    Field      Value   Description
    15     R2TVALID           RAM set 2 tag-valid bit. When the I-Cache completes the process of filling RAM set 2, the I-Cache sets R2TVALID.
                      0       The fill is not started or is not complete.
                      1       The fill is complete.
    14-2   Reserved           These read-only bits are not used.
    1-0    R2INIT     11b     RAM set 2 initialization bits. Make R2INIT = 11b to initialize the logic for RAM set 2.

    I-Cache RAM Set Tag Registers (ICRTAG1 and ICRTAG2):

    For each active RAM set (selected with the global control register), you must give the I-Cache a 12-bit tag that defines the range of addresses assigned to that RAM set. Load the tag into the appropriate RAM set tag register. Write a value with zeros in bits 15-12 and the tag in bits 11-0.
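    A small helper can enforce the "zeros in bits 15-12, tag in bits 11-0" rule before a value is written. The function name is hypothetical, and how a memory address maps to the 12-bit tag is not covered in this report, so the sketch only validates and packs a tag that is already known.

```python
TAG_MASK = 0x0FFF  # bits 11-0 hold the tag; bits 15-12 must be zero


def pack_ram_set_tag(tag):
    """Return the 16-bit value to write to ICRTAG1 or ICRTAG2 for a 12-bit tag."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError(f"tag {tag:#x} does not fit in 12 bits")
    return tag & TAG_MASK


print(f"{pack_ram_set_tag(0xABC):04X}h")
```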

    I-Cache RAM Set Tag Registers (ICRTAG1 and ICRTAG2)

    ICRTAG1
    Bits:     15-0
    Field:    R1TAG
    Access:   R/W-0

    ICRTAG2
    Bits:     15-0
    Field:    R2TAG
    Access:   R/W-0

    Legend: R = Read; W = Write; -n = Value after reset

    I-Cache RAM Set 1 Tag Register (ICRTAG1) Bits

    Bit    Field   Value         Description
    15-0   R1TAG   0000h-0FFFh   RAM set 1 tag bits. Write a value with zeros in bits 15-12 and the tag in bits 11-0. This register is only applicable if you have selected one or two RAM sets with the global control register.


    I-Cache RAM Set 2 Tag Register (ICRTAG2) Bits

    Bit    Field   Value         Description
    15-0   R2TAG   0000h-0FFFh   RAM set 2 tag bits. Write a value with zeros in bits 15-12 and the tag in bits 11-0. This register is only applicable if you have selected two RAM sets with the global control register.

    I-Cache Status Register (ICST)

    The status register contains the IEN bit to indicate when the I-Cache is enabled. When you send an enable request to the I-Cache (CAEN = 1 in the CPU status register ST3_55), poll for IEN = 1 before writing to either of the RAM set tag registers.

    I-Cache Status Register (ICST) Bits

    Bit    Field      Value   Description
    15-3   Reserved           These read-only bits are not used.
    2      IEN                I-Cache-enabled bit. When you send an enable request to the I-Cache, poll for IEN = 1 before writing to either of the RAM set tag registers.
                      0       The I-Cache is disabled.
                      1       The I-Cache is enabled.
    1-0    Reserved           These read-only bits are not used.
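    The status fields described in the register tables can be pulled out of a raw 16-bit value with shifts and masks. The helpers below are an illustrative sketch with hypothetical names; only the bit positions (IEN in bit 2 of ICST, RxTVALID in bit 15 and RxINIT in bits 1-0 of ICRCx) come from the tables.

```python
def decode_icst(value):
    """Extract the I-Cache-enabled flag (bit 2) from an ICST value."""
    return {"IEN": (value >> 2) & 1}


def decode_icrc(value):
    """Extract the tag-valid bit (bit 15) and the init field (bits 1-0) from ICRC1 or ICRC2."""
    return {"TVALID": (value >> 15) & 1, "INIT": value & 0b11}


print(decode_icst(0x0004))
print(decode_icrc(0x8003))
```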

    34

  • 8/6/2019 i Cache Seminar

    35/37

    CONCLUSION

    The instruction cache (I-Cache) can improve overall system performance by buffering the instructions most recently accessed by the CPU. Reducing the bulk you carry around in your wallet, by contrast, is exactly what the makers of the iCache device have in mind.

    We have explained how the iCache works. The iCache will most likely be available first through banks, then through retail locations. After you receive your iCache, you have to register your fingerprint and the cards you would like to have programmed into the device by plugging it into your PC through a USB cable. The software will prompt you for your card numbers and expiration dates, which will then be saved on the iCache device itself. Then, when you reach the cash register, place your finger on the print scanner, navigate to the card you want to use, and activate it. We cannot talk about the iCache without highlighting security: all the information on the iCache is encrypted, and the device itself requires a scan of your fingerprint before it will work.

    The PowerPC 405 accesses memory through the Instruction Cache Unit (ICU) and Data Cache Unit (DCU). We show that, contrary to popular belief, strong cache consistency can be maintained for the Web at little or no extra cost compared with the current weak consistency approaches, and that it should be maintained using an invalidation-based protocol.


