Operation Reuse on Handheld Devices

Yonghua Ding and Zhiyuan Li

For LCPC 2003

Outline

Introduction
Computation reuse
Branch reuse by IF-merging
Conclusions

Introduction

Handheld devices have
  Limited processing power
  Limited energy resources
Operation reuse
  Computation reuse
  Branch reuse
Hardware solutions
Software solutions

Computation Reuse

Can be viewed as an extension of CSE (common subexpression elimination)
Redundancy among different instances of a code segment
  Code segment with repetitive inputs
A hash table records the input values and the computed output values
Replace the computation with a table look-up if the input is in the table

An Example Code

    int quan(int val)
    {
        int i;

        for (i = 0; i < 15; i++) {
            if (val < power2[i])
                break;
        }
        return i;
    }

Transformation Code

    int quan(int val)
    {
        int i, key;

        if (check_hash(val, hash_tab, &key) == 0) {
            /* miss: compute as before and record the result */
            for (i = 0; i < 15; i++) {
                if (val < power2[i])
                    break;
            }
            hash_tab[key].output = i;
        } else {
            /* hit: reuse the recorded result */
            i = hash_tab[key].output;
        }
        return i;
    }
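The slide does not show check_hash or the layout of hash_tab, so the following is only a minimal sketch of how the transformed quan could be completed. The direct-mapped table, its size HASH_SIZE, the valid/input fields, the behavior of check_hash on a miss, and the contents of power2 are all assumptions made for illustration. The names quan_orig and quan_reuse are hypothetical labels for the two versions.

```c
#include <assert.h>

#define HASH_SIZE 256   /* assumed table size (power of two) */

/* Assumed contents: the first 15 powers of two. */
static const int power2[15] = {
    1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384
};

struct hash_entry {
    int valid;    /* nonzero once the slot has been claimed */
    int input;    /* input value recorded in this slot */
    int output;   /* output computed for that input */
};

static struct hash_entry hash_tab[HASH_SIZE];

/* Hypothetical helper: stores val's slot index in *key.
   Returns 1 on a hit. On a miss it claims the slot for val and
   returns 0, so the caller must then store the output. */
int check_hash(int val, struct hash_entry *tab, int *key)
{
    *key = (int)((unsigned)val & (HASH_SIZE - 1));
    assert(*key >= 0 && *key < HASH_SIZE);
    if (tab[*key].valid && tab[*key].input == val)
        return 1;
    tab[*key].valid = 1;
    tab[*key].input = val;
    return 0;
}

/* Original computation, kept for comparison. */
int quan_orig(int val)
{
    int i;

    for (i = 0; i < 15; i++) {
        if (val < power2[i])
            break;
    }
    return i;
}

/* Version with computation reuse, following the slide. */
int quan_reuse(int val)
{
    int i, key;

    if (check_hash(val, hash_tab, &key) == 0) {
        for (i = 0; i < 15; i++) {
            if (val < power2[i])
                break;
        }
        hash_tab[key].output = i;
    } else {
        i = hash_tab[key].output;
    }
    return i;
}
```

In this sketch a colliding input simply overwrites the slot, so the table never returns a stale result; it only loses reuse opportunities when two frequent inputs map to the same slot.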

Framework of the Scheme

Identify candidate code segments

Data flow analysis to determine input/output

Estimate hashing overhead

Granularity analysis

Choose code segments for value profiling

Determine code segments to transform

Important Factors

Computation granularity (C)
Hashing overhead (O)
  Hashing function complexity
  The size of input/output
Reuse rate (R)
  R = 1 - N_ds/N

Cost-Benefit Analysis

Cost of computation reuse
  (C + O)(1 - R) + O·R
The gain of computation reuse
  C - [(C + O)(1 - R) + O·R] = R·C - O
Criteria to choose code segments
  R·C - O > 0, or equivalently R > O/C
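The cost-benefit formulas can be checked with a few lines of C. This is a sketch only: the function names are hypothetical, C and O are taken to be in the same unit (for example cycles), and reading N_ds as the number of distinct input sets and N as the number of executions of the segment is an assumption, since the slide does not define them.

```c
#include <assert.h>

/* Reuse rate R = 1 - N_ds/N.
   Assumption: n_distinct is the number of distinct input sets seen,
   n_total the number of times the segment was executed. */
double reuse_rate(unsigned long n_distinct, unsigned long n_total)
{
    assert(n_distinct <= n_total);
    if (n_total == 0)
        return 0.0;
    return 1.0 - (double)n_distinct / (double)n_total;
}

/* Expected cost per execution with reuse:
   a miss costs C + O, a hit costs only O. */
double reuse_cost(double c, double o, double r)
{
    return (c + o) * (1.0 - r) + o * r;
}

/* Gain over the original cost C; algebraically equal to R*C - O. */
double reuse_gain(double c, double o, double r)
{
    return c - reuse_cost(c, o, r);
}

/* Selection criterion from the slide: R*C - O > 0. */
int worth_transforming(double c, double o, double r)
{
    return r * c - o > 0.0;
}
```

With illustrative (not measured) values C = 100, O = 10, and R = 0.5, the expected cost is 60 and the gain is 40, which matches R·C - O = 50 - 10. With R = 0.05 the criterion fails, because R is below O/C = 0.1.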

Experimentation Setup

Compaq iPAQ 3650 PDA
  206 MHz StrongARM SA1110 processor
  32 MB RAM
  16 KB I-cache and 8 KB D-cache
Digital multimeter HP 3458A
Six MediaBench programs and the GNU GO game

Performance Improvement

Program          Original (s)   Reuse (s)   Speedup
G721_encode              2.01        1.53      1.31
G721_decode              3.69        2.76      1.34
MPEG2_encode           120.63      113.30      1.06
MPEG2_decode            83.02       46.06      1.80
RASTA                   14.92       12.66      1.18
UNEPIC                   1.73        0.76      2.28
GNU GO                 788.05      654.51      1.20
Harmonic Mean                                  1.37
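For readers who want to re-derive the summary figures, the sketch below shows how a speedup and a harmonic mean are computed. The function names are hypothetical and the slides do not say how the authors computed or rounded their numbers, so small differences in the last digit are to be expected.

```c
#include <assert.h>
#include <stddef.h>

/* Speedup of the transformed program: original time / reuse time. */
double speedup(double original_s, double reuse_s)
{
    assert(reuse_s > 0.0);
    return original_s / reuse_s;
}

/* Harmonic mean of n positive values: n / sum(1 / x[i]). */
double harmonic_mean(const double *x, size_t n)
{
    double sum_inv = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        assert(x[i] > 0.0);
        sum_inv += 1.0 / x[i];
    }
    return (double)n / sum_inv;
}
```

Applied to the seven rounded speedups in the table (1.31, 1.34, 1.06, 1.80, 1.18, 2.28, 1.20), harmonic_mean gives a value between 1.36 and 1.37, consistent with the reported 1.37.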

Energy Saving

Program          Original (J)   Reuse (J)   Saving
G721_encode              4.59        3.56    22.4%
G721_decode              8.43        6.47    23.3%
MPEG2_encode           281.67      265.12     5.9%
MPEG2_decode           193.85      108.01    44.3%
RASTA                   36.60       31.02    15.2%
UNEPIC                   4.03        1.81    55.1%
GNU GO                1936.23     1613.69    16.7%

Performance Improvement for Different Input Files

Program          Source of Inputs        Speedup
G721_encode      MiBench                    1.35
G721_decode      MiBench                    1.36
MPEG2_encode     Tektronix                  1.19
MPEG2_decode     Tektronix                  1.48
RASTA            Rasta_testsuite_1998       1.18
UNEPIC           EPIC web-site              4.25
GNU GO           “-b 9 -r 2”                1.20
Harmonic Mean                               1.43

Related Work

Richardson’s result cache
Sodani and Sohi’s instruction reuse
Huang and Lilja’s basic block level reuse
Connors and Hwu’s code region level reuse

Branch Reuse by IF-Merging

Motivation
  Branch instructions degrade the efficiency of deep pipelining
  Branches reduce the size of basic blocks
  Branches introduce control dependences
Source-level code transformation

An Example Code

    if (sign) {
        diff = -diff;
    }
    ……
    if (sign)
        valpred -= vpdiff;
    else
        valpred += vpdiff;

Transformation by IF-merging

    if (sign) {
        diff = -diff;
        ……
        valpred -= vpdiff;
    } else {
        ……
        valpred += vpdiff;
    }
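The two forms above can be compared directly in a small test. The sketch below is illustrative only: the intermediate statements elided on the slide are unknown, so a single hypothetical statement (vpdiff += diff / 4) stands in for them, chosen so that it reads diff but does not modify sign. The function names are also hypothetical.

```c
#include <assert.h>

/* Original form: the same condition is tested twice. */
int update_orig(int sign, int diff, int vpdiff, int valpred)
{
    if (sign) {
        diff = -diff;
    }
    vpdiff += diff / 4;   /* hypothetical intermediate statement */
    if (sign)
        valpred -= vpdiff;
    else
        valpred += vpdiff;
    return valpred;
}

/* Merged form: one test of sign; the intermediate statement is
   copied into both branches, after the negation in the taken branch. */
int update_merged(int sign, int diff, int vpdiff, int valpred)
{
    if (sign) {
        diff = -diff;
        vpdiff += diff / 4;
        valpred -= vpdiff;
    } else {
        vpdiff += diff / 4;
        valpred += vpdiff;
    }
    return valpred;
}

/* Checks that both forms agree over a small range of inputs. */
void check_if_merging(void)
{
    int sign, diff, vpdiff;

    for (sign = 0; sign <= 1; sign++)
        for (diff = -16; diff <= 16; diff++)
            for (vpdiff = -16; vpdiff <= 16; vpdiff++)
                assert(update_orig(sign, diff, vpdiff, 100) ==
                       update_merged(sign, diff, vpdiff, 100));
}
```

The merge is only valid here because the intermediate statement leaves sign unchanged. The merged version trades one conditional branch for a duplicated copy of the intermediate code.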

Three Schemes of IF-Merging

A basic IF-merging scheme
  Merge IF statements with identical conditions
An IF-condition factoring scheme
  Factor and merge common sub-predicates
A path profiling scheme
  IF-merging with path profiling information

A Basic IF-Merging Scheme

Symbolic analysis to identify IF statements with identical IF conditions

Data dependence analysis to determine intermediate statements

A Factoring Scheme

Non-identical conditions may share common sub-predicates (e.g., a && b and a && c)

Factor the common sub-predicates to construct a common IF statement

The new IF statement encloses the original IF statements with the remaining sub-predicates as conditions
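A minimal sketch of the factoring step for the a && b and a && c case follows. The function names and the bodies of the IF statements are hypothetical. The sketch assumes that evaluating a has no side effects and that the first body does not change a.

```c
#include <assert.h>

/* Original form: sub-predicate a is evaluated in both conditions. */
int apply_orig(int a, int b, int c)
{
    int result = 0;

    if (a && b)
        result += 1;
    if (a && c)
        result += 2;
    return result;
}

/* Factored form: a is tested once in a new enclosing IF; the
   original IFs keep only the remaining sub-predicates b and c. */
int apply_factored(int a, int b, int c)
{
    int result = 0;

    if (a) {
        if (b)
            result += 1;
        if (c)
            result += 2;
    }
    return result;
}

/* Checks that both forms agree for every truth assignment. */
void check_factoring(void)
{
    int a, b, c;

    for (a = 0; a <= 1; a++)
        for (b = 0; b <= 1; b++)
            for (c = 0; c <= 1; c++)
                assert(apply_orig(a, b, c) == apply_factored(a, b, c));
}
```

When a is false, the factored form skips both inner tests with a single branch, which is where the saving comes from.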

A Path Profiling Scheme

Merge IF statements that have a high rate of all being taken

Exchange nested IF statements whose conditions are dependent

Experimental Results

Program           Speedup   Energy Saving
ADPCM_coder         1.104            9.3%
ADPCM_decoder       1.076            8.0%
G721_encode         1.069            5.8%
G721_decode         1.066            6.1%
GSM_toast           1.067            6.0%
GSM_untoast         1.085            8.2%
PEGWIT_encrypt      1.029            2.5%
PEGWIT_decrypt      1.017            1.5%
Average             1.063            5.9%

Related Work

Kreahling et al.’s profile-based condition merging
Branch prediction
Predicated execution
Mueller and Whalley’s avoiding branches by code replication
Yang et al.’s branch reordering

Conclusions

Operation reuse techniques are desirable for both program speed and energy saving on handheld devices
  Computation reuse
  Branch reuse by IF-merging
