Dynamic Tag-Check Omission: A Low-Power Instruction Cache Architecture
Exploiting Execution Footprints
K. Inoue, V. Moshnyaga, and K. Murakami
Introduction
Increase in cache size
Power consumed in on-chip caches (share of total chip power):
• DEC 21164 CPU*: 25%
• StrongARM SA-110 CPU*: 43%
• Bipolar ECL CPU**: 50%
* Kamble et al., "Analytical Energy Dissipation Models for Low Power Caches", ISLPED'97
** Jouppi et al., "A 300-MHz 115-W 32-b Bipolar ECL Microprocessor", IEEE Journal of Solid-State Circuits, 1993
Breakdown of Cache Energy
Energy consumed in cache: Ecache = Edecode + Esram + Eio
Esram = Etag + Edata
[Figure: Breakdown of Esram per access. X-axis: # of words in a subbank entry (total # of subbanks): 8(1), 4(2), 2(4), 1(8). Y-axis: 0% to 100%. Series: Data (bit-lines), Tag (bit-lines), Others.]
Word: 64 bits, Cache Size: 64 KB, Line Size: 64 B
This calculation is based on Kamble et al., "Analytical Energy Dissipation Models for Low Power Caches", ISLPED'97.
[Diagram: Cache Subbanking. Tag Memory and Data Memory (cache lines); the data memory is divided into subbanks.]
History-Based Tag-Comparison (HBTC) Instruction Cache -Motivation-
The hit rate of the instruction cache (I-$) is quite HIGH! Most of the tag-comparisons result in a HIT.
Conventional I-$: performs a tag-comparison in EVERY cache access even though some instructions obviously exist in the cache. Dissipates energy unnecessarily!
HBTC I-$: OMITS the tag-comparison for some cache accesses if it knows that the corresponding instructions obviously exist in the cache. Eliminates the energy consumed by the redundant tag-checks!
Can We Know The Existence of Instructions in the I-Cache without Tag-Comparison?
YES! Consider:
• The target instruction has been referenced before, and
• No cache miss has occurred since the previous reference of the target instruction.
Then we know that the target instruction exists in the cache!
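The two conditions above can be checked with a toy model. The sketch below is not the HBTC hardware: it assumes a direct-mapped, tag-only cache and an idealized tracker (the hypothetical `safe` set) that remembers every line referenced since the most recent miss. It only illustrates why a line that satisfies both conditions must still be resident.

```python
class DirectMappedICache:
    """Toy direct-mapped cache that stores only tags (no data)."""

    def __init__(self, num_lines=512, line_size=32):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines
        # Idealized bookkeeping (an assumption, not part of the slides):
        # line addresses referenced since the most recent cache miss.
        self.safe = set()

    def access(self, addr):
        """Return (hit, tag_check_needed) for one instruction fetch."""
        line_addr = addr // self.line_size
        index = line_addr % self.num_lines
        tag = line_addr // self.num_lines
        hit = self.tags[index] == tag
        # The tag check is redundant if the line was referenced before
        # and no miss has occurred since that reference.
        needed = line_addr not in self.safe
        if not hit:
            # A miss replaces a line, so all earlier knowledge is dropped.
            self.tags[index] = tag
            self.safe.clear()
        self.safe.add(line_addr)
        return hit, needed


cache = DirectMappedICache()
trace = [0, 4, 32, 0, 4, 32, 16384, 0]  # hypothetical fetch addresses
results = [cache.access(addr) for addr in trace]
omitted = sum(1 for hit, needed in results if not needed)
```

Every access for which `needed` is false is a hit, so skipping its tag comparison never returns a wrong line. Real hardware cannot afford a per-line set like this, which is why the following slides record the same information in the BTB.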
But, How? Exploit Execution Footprints Provided by the BTB!
1. Execute an instruction block A at time T. Leave the execution footprint in the corresponding BTB entry.
(2. If a cache miss occurs, then erase all the footprints.)
3. Execute the instruction block A at time T+X. If the footprint is detected in the BTB, then omit the tag comparisons for all the instructions in block A!
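The three steps above can be sketched as a tiny footprint table. This is only an illustration: the class `FootprintTable` and its method names are hypothetical, and a Python dictionary stands in for the footprint bits that the slides place in BTB entries.

```python
class FootprintTable:
    """Hypothetical stand-in for the execution footprints kept in the BTB."""

    def __init__(self):
        # Maps a BTB entry key (here simply a block name) to its footprint bit.
        self.footprints = {}

    def leave(self, entry):
        """Step 1: record that the block reached via this entry was executed."""
        self.footprints[entry] = True

    def erase_all(self):
        """Step 2: a cache miss erases every footprint."""
        self.footprints.clear()

    def detected(self, entry):
        """Step 3: tag comparisons may be omitted only if a footprint exists."""
        return self.footprints.get(entry, False)


table = FootprintTable()
table.leave("A")                          # block A executed at time T
omit_at_t_plus_x = table.detected("A")    # no miss in between: omit tag checks
table.erase_all()                         # a cache miss occurs
omit_after_miss = table.detected("A")     # footprint gone: perform tag checks
```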
HBTC I-$ Architecture
[Figure: block diagram of the HBTC I-$.
BTB: each entry holds Branch Inst. Addr., Target Address, EFT (Execution Footprint for Target), and EFF (EF for Fall-through, used when the branch is not taken).
PBAreg: holds Branch Inst. Addr. and Pred. Result, and provides the address for EF writing.
Mode Controller: receives the branch prediction result and drives the tag-check control of the I-$.
Other labels: PC, Target Instruction Block.]
HBTC I-$ Operation
Normal Mode (NM): w/ tag checks
Omitting Mode (OM): w/o tag checks
Tracing Mode (TM): w/ tag checks (on a BTB hit, the EF addressed by the PBAreg is set to 1)
Mode Transition
• BTB hit: read EFT and EFF.
  EF==1: go to OM.
  EF==0: go to TM; the PC and the prediction result are written to the PBAreg.
• BTB hit in TM: the instruction block between the previous BTB hit and the current BTB hit has been executed, so validate the EF pointed to by the PBAreg!
• GOtoNM: I-cache miss, BTB replacement, RAS address, or branch misprediction. All EFs are invalidated!
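The mode transitions can be sketched as a small state machine. This is a minimal model under stated assumptions, not the authors' implementation: the class and method names are hypothetical, a dictionary keyed by (branch address, predicted direction) stands in for the EFT/EFF bits, and the EF of the current branch is assumed to be read before the pending validation through the PBAreg is applied (the slides do not state this ordering explicitly).

```python
from enum import Enum


class Mode(Enum):
    NM = "normal"    # with tag checks
    OM = "omitting"  # without tag checks
    TM = "tracing"   # with tag checks, footprints being recorded


class ModeController:
    def __init__(self):
        self.mode = Mode.NM
        # (branch_addr, taken) -> footprint bit.
        # taken=True models EFT, taken=False models EFF.
        self.ef = {}
        # Stand-in for PBAreg: (branch_addr, taken) or None.
        self.pba = None

    def tag_check_enabled(self):
        return self.mode is not Mode.OM

    def btb_hit(self, branch_addr, predicted_taken):
        key = (branch_addr, predicted_taken)
        footprint = self.ef.get(key, False)  # assumed: read before write
        if self.mode is Mode.TM and self.pba is not None:
            # The block entered at the previous BTB hit ran without a miss.
            self.ef[self.pba] = True
        if footprint:
            self.mode = Mode.OM
            self.pba = None
        else:
            self.mode = Mode.TM
            self.pba = key

    def go_to_nm(self):
        """I-cache miss, BTB replacement, RAS address, or misprediction."""
        self.ef.clear()
        self.pba = None
        self.mode = Mode.NM


ctrl = ModeController()
modes = []
for _ in range(3):  # a loop whose back-edge branch "C" is always taken
    ctrl.btb_hit("C", True)
    modes.append(ctrl.mode)
checks_in_loop = ctrl.tag_check_enabled()
ctrl.go_to_nm()
mode_after_miss = ctrl.mode
```

In this run the controller traces for two passes and then omits tag checks on the third, and a single `go_to_nm()` call discards every footprint, which mirrors the "All EFs are invalidated!" rule.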
Operation Example
Execution flow: instruction blocks Top, A, B, C, D, F, with the branches "Branch to F", "Branch to A", and "Branch to A"; iteration count 1 to 5.
State of BTB: each Branch Target Buffer entry holds EFT and EFF. Time is written as (Iteration Count - Address of Branch), e.g. 1-C; PBAreg is written as Adr:T/N.

Time     PBAreg  BTB entries         MODE (NM,OM,TM)  Tag checks afterwards
Initial  --:--   Branch-C, Branch-D  -                Performing!
1-C      --:--   Branch-C, Branch-D  Omitting-Mode    Omitting!
2-C      C:N     Branch-C, Branch-D  Tracing-Mode     Performing!
2-D      D:T     Branch-C, Branch-D  Tracing-Mode     Performing!
3-C      --:--   Branch-C, Branch-D  Omitting-Mode    Omitting!
3-D      --:--   Branch-C, Branch-D  Omitting-Mode    Omitting!
Performance/Energy Overhead
Performance: BTB access conflict (1 stall cycle)
• for execution footprint writing
• for execution footprint invalidation
Energy: energy for execution-footprint access
• for reading (every BTB look-up)
• for writing (BTB hit in Tracing Mode)
• for invalidating (cache miss or BTB replacement)
Evaluation – Environment –
• Cache energy model based on Kamble[97]; includes BTB access overhead
• SimpleScalar simulator: 16 KB I-cache (partitioned into 4 subbanks), 32 B block, 2-bit bimod predictor, 4-way 2K-entry BTB
• Benchmarks: 6 from SPECint, 4 from Mediabench
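The structure sizes implied by this configuration follow from simple arithmetic, sketched below. The helper `cache_geometry` is hypothetical, it assumes power-of-two sizes, and the I-cache is treated as direct-mapped (`ways=1`), which is an assumption since the environment slide does not state the associativity.

```python
def cache_geometry(cache_bytes, block_bytes, ways=1):
    """Return (num_blocks, num_sets, offset_bits, index_bits).

    Assumes all sizes are powers of two.
    """
    num_blocks = cache_bytes // block_bytes
    num_sets = num_blocks // ways
    offset_bits = block_bytes.bit_length() - 1  # log2 of the block size
    index_bits = num_sets.bit_length() - 1      # log2 of the set count
    return num_blocks, num_sets, offset_bits, index_bits


# 16 KB I-cache with 32 B blocks, assumed direct-mapped.
icache = cache_geometry(16 * 1024, 32)

# 4-way, 2K-entry BTB; each entry carries two footprint bits (EFT and EFF).
btb_entries = 2 * 1024
btb_sets = btb_entries // 4
ef_bits = 2 * btb_entries
```

Under these assumptions the cache has 512 lines (5 offset bits, 9 index bits), and the footprints add 4096 bits of state to the BTB.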
Evaluation – Tag-Check Counts –
[Figure: Normalized tag-comparison count (axis 0 to 0.8) for ITC, HBTC, and Comb. Benchmarks: 099.go, 124.m88ksim, 126.gcc, 129.compress, 130.li, 132.ijpeg, adpcm(e), adpcm(d), mpeg2(e), mpeg2(d).]
The ITC approach works well for all programs.
The HBTC approach works well for media programs.
The hybrid approach makes significant reductions!
Evaluation – Cache Energy –
[Figure: Normalized cache energy (axis 0 to 1), broken down into Edata, Etag, Eoutput, and Ebtbadd. Benchmarks: 099.go, 124.m88ksim, 126.gcc, 129.compress, 130.li, 132.ijpeg, adpcm(e), adpcm(d), mpeg2(e), mpeg2(d).]
BTB energy overhead does not have a large impact.
For media programs, the Ecache saving is more than 15%.
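How the energy components in this evaluation combine can be illustrated with a short calculation. It assumes that tag energy scales linearly with the number of tag checks performed and that the BTB overhead is a fixed additive term; the function name and all numeric values are hypothetical placeholders, not results from the slides.

```python
def normalized_cache_energy(e_data, e_tag, e_output, e_btb_add, omitted_fraction):
    """Cache energy with tag-check omission, relative to a conventional cache.

    omitted_fraction is the share of tag checks that are skipped (0.0 to 1.0).
    """
    baseline = e_data + e_tag + e_output
    with_omission = (
        e_data
        + (1.0 - omitted_fraction) * e_tag  # only performed checks cost energy
        + e_output
        + e_btb_add                         # extra energy for footprint accesses
    )
    return with_omission / baseline


# Hypothetical shares: data 50%, tag 30%, output 20%, BTB overhead 1%,
# with half of the tag checks omitted.
example = normalized_cache_energy(0.5, 0.3, 0.2, 0.01, 0.5)
```

With these placeholder numbers the cache energy drops to about 0.86 of the baseline. The achievable saving is bounded by the tag share of the total, which is why the breakdown of Esram matters.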
Evaluation – Execution Time –
[Figure: Normalized execution time (axis 0.9 to 1.04). Benchmarks: 099.go, 124.m88ksim, 126.gcc, 129.compress, 130.li, 132.ijpeg, adpcm(e), adpcm(d), mpeg2(e), mpeg2(d).]
For media programs, the performance overhead is trivial.
For others, the performance degradation might not be acceptable.
Evaluation – Invalidation Penalty –
[Figure: Normalized execution time (axis 0 to 3.5) versus invalidation penalty [clock cycles] (1, 2, 4, 8, 16, 32), one curve per benchmark: 099.go, 124.m88ksim, 126.gcc, 129.compress, 130.li, 132.ijpeg, adpcm(e), adpcm(d), mpeg2(e), mpeg2(d).]
If the penalty is equal to or smaller than 4 cycles, the performance overhead is trivial.
If the penalty is greater than 4 cycles, the performance overhead is serious.
Conclusions
History-Based Tag-Comparison Instruction Cache
1. Exploits execution footprints recorded in the BTB.
2. Reduces the tag-comparison count by 95% (adpcm(d)).
3. Achieves a 17% cache energy saving!
Future work
• Analyze energy consumption based on chip design.
Backup Slides (History-Based Tag-Comparison Cache)
Outline
1. Introduction
2. History-Based Tag-Comparison Cache
• Motivation
• Mechanism
• Architecture
• Operation
3. Evaluations
4. Conclusions
Low Power Caches - Reducing both Etag and Edata -
Adding a small L0 cache (Processor, L0 Cache, L1 Cache)
• Filter Cache
• S-Cache
• Block Buffering
Dividing cache module
• MDM Cache
Multiple accessing (sequential way-access: way0, way1, way2, way3)
• MRU Cache
• Hash-Rehash Cache
Low Power Caches - Reducing Edata -
Dividing cache module
• Cache Sub-Banking
Accessing sequentially (Tag, then Line)
• Phased Cache
• Pipelined Cache
[Diagram labels: Tag, Line, Hit!, Miss!, Replace]
Breakdown of Esram
[Figure: Breakdown of energy (0% to 100%) versus # of words in a subbank (total # of subbanks): 8(1), 4(2), 2(4), 1(8). Two panels: 32-bit CPU (CS = 32 KB, LS = 32 B) and 64-bit CPU (CS = 64 KB, LS = 64 B). Series: Esram_data_bit, Esram_tag_bit, Esram_others.]
CS: Cache Size, LS: Line Size
This calculation is based on Kamble et al., "Analytical Energy Dissipation Models for Low Power Caches", ISLPED'97.
Operation Example
Execution flow: instruction blocks Top, A, B, C, D, F, with the branches "Branch to F", "Branch to A", and "Branch to A"; iteration count 1 to 7.
State of BTB: each Branch Target Buffer entry holds EFT and EFF. Time is written as (Iteration Count - Address of Branch), e.g. 1-C; PBAreg is written as Adr:T/N.

Time     PBAreg  BTB entries                   Tag checks afterwards  Note
Initial  --:--   Branch-C                      Performing!
1-C      C:T     Branch-C                      Performing!
2-C      C:T     Branch-C                      Performing!
3-C      --:--   Branch-C                      Omitting!
4-C      C:N     Branch-C                      Performing!            EFF is selected!
4-D      D:T     Branch-C, Branch-D            Performing!            Branch-D: new entry registration!
5-C      --:--   Branch-C, Branch-D            Omitting!              EFF of Branch-C
5-D      --:--   Branch-C, Branch-D            Omitting!
6-C      --:--   Branch-C, Branch-D            Omitting!
6-D      --:--   Branch-C, Branch-D            Omitting!
7-B      B:T     Branch-C, Branch-D, Branch-B  Performing!
Conventional Tag-Check Scheme
The hit rate of the instruction cache (I-$) is quite HIGH! Most of the tag-comparisons result in a MATCH.
Conventional I-$: performs a tag check in EVERY cache access even though some instructions obviously exist in the cache. Wastes energy unnecessarily!
[Figure: Average # of references per stable-time versus cache-line address (0 to 511). 16 KB direct-mapped cache (32 B lines), 132.ijpeg.]
History-Based Tag-Check (HBTC) Scheme (1/2)
• An instruction has been executed at least once.
• No cache miss has occurred since the last reference of the instruction.
We know that the instruction exists in the cache now!
How can we detect these conditions at run time?