The Queen’s Tower, Imperial College London, South Kensington, SW7
27th Jan 2008 | Ashley Brown
Workshop on Reconfigurable Computing at 2008

Profile-directed speculative optimization of reconfigurable floating point data paths




Introduction

• Computational science requires reproducible and accurate results
• IEEE-754 is a compromise
  – Broad range of values
  – Many special cases
• Idea: use profiling to reduce range and remove special cases
  – Generate floating-point data-paths for FPGAs which are smaller and faster
• BUT KEEP RESULTS CONSISTENT WITH IEEE-754


Advantages of Smaller Floating Point

• Embedded Systems
  – Do the same work for a lower cost
  – Implement IEEE-754 compliant floating point where it may not have been possible before
• High performance
  – Do more work with the same hardware
  – Increase in parallel execution on FPGAs
  – No need to sacrifice IEEE-754 compliance

Four Pictures to Explain: #1

Four Pictures to Explain: #2

Four Pictures to Explain: #3

Four Pictures to Explain: #4

[Figure panels: Pre-optimisation, Post-optimisation]


Optimisation Technique

• Remove features from the floating-point unit:
  – Operand alignment
  – Normalisation
  – Operand swap
• If these were required, detect and fall back to an alternative solution:
  – Software-based on embedded/host processor
  – Hardware-based full implementation for larger designs
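The detect-and-fall-back idea can be sketched in software. The Python model below is a minimal illustration under stated assumptions, not the generated hardware: Python floats stand in for both data-paths, and the names `speculative_add`, `max_align` and `max_norm` are hypothetical. An addition is treated as unsupported when the exponent gap between the operands exceeds the reduced alignment shifter, or when cancellation would need a longer normalisation shift than the reduced unit provides.

```python
import math


def exponent(x: float) -> int:
    """Binary exponent e such that x = m * 2**e with 0.5 <= |m| < 1."""
    return math.frexp(x)[1]


def needs_fallback(a: float, b: float, max_align: int, max_norm: int) -> bool:
    """True when the reduced adder in this model cannot handle a + b."""
    # Assumption: zeros, infinities and NaNs always go to the full unit.
    if a == 0.0 or b == 0.0 or not (math.isfinite(a) and math.isfinite(b)):
        return True
    # Operand alignment: shift needed to line up the smaller operand.
    if abs(exponent(a) - exponent(b)) > max_align:
        return True
    total = a + b
    if total == 0.0 or not math.isfinite(total):
        return True
    # Normalisation: left shift needed after cancellation of leading bits.
    norm_shift = max(exponent(a), exponent(b)) - exponent(total)
    return norm_shift > max_norm


def speculative_add(a: float, b: float, max_align: int = 4, max_norm: int = 4):
    """Return (result, fell_back).

    Both paths compute a + b here; only the choice of path is modelled.
    """
    fell_back = needs_fallback(a, b, max_align, max_norm)
    return a + b, fell_back
```

For example, `speculative_add(1.0, 1.5)` stays on the reduced path, while `speculative_add(1.0, 1e-6)` is flagged for fallback because the operands are about 20 binary orders of magnitude apart.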

Optimisation Options


The stages of optimisation

• Profile target application with training datasets
  – Source usually FORTRAN, C
• Identify frequently-executed blocks
• Check for good value-locality
• Generate reduced-size floating point datapath
  – Reduced operand alignment hardware
  – Reduced normalisation hardware
• Error checking: execute with additional datasets, check error rates
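Two of these stages, finding the frequently-executed blocks and checking value-locality, can be illustrated with a short Python sketch. The inputs are assumptions made for illustration: `op_counts` maps a block name to its floating-point operation count, and `exponents` is a list of binary exponents observed in one block. Neither reflects the actual profiler output format, and `hot_blocks` and `value_locality` are hypothetical names.

```python
from collections import Counter


def hot_blocks(op_counts: dict, coverage: float = 0.9) -> list:
    """Blocks, in descending order of count, that together reach `coverage`
    of all floating-point operations."""
    total = sum(op_counts.values())
    if total == 0:
        return []
    chosen, running = [], 0
    ranked = sorted(op_counts.items(), key=lambda kv: kv[1], reverse=True)
    for block, count in ranked:
        if running / total >= coverage:
            break
        chosen.append(block)
        running += count
    return chosen


def value_locality(exponents, window: int) -> float:
    """Largest fraction of samples that falls inside any run of `window`
    consecutive exponents. Values near 1.0 indicate a narrow range."""
    counts = Counter(exponents)
    if not counts:
        return 0.0
    total = sum(counts.values())
    lo, hi = min(counts), max(counts)
    best = max(
        sum(counts[e] for e in range(start, start + window))
        for start in range(lo, hi + 1)
    )
    return best / total
```

A block is a candidate for a reduced data-path when it is both hot and shows a locality close to 1.0 for a small window; the thresholds to use are a design choice this sketch leaves open.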



FloatWatch Profiler

• Valgrind-based value profiler
• Can return some metrics of interest here:
  – Floating point value ranges
  – Ratio of floating point operands
• Each has uses for optimisation!
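To make the two metrics concrete, the following Python sketch computes them from a recorded trace. It is not FloatWatch's interface: the function names and the trace format (a list of values, and a list of operand pairs) are assumptions for illustration. The operand ratio is expressed as the difference in binary exponents, which is the quantity an alignment shifter has to cover.

```python
import math
from collections import Counter


def value_range(values):
    """(min, max) binary exponent over the finite, non-zero values."""
    exps = [math.frexp(v)[1] for v in values if v != 0.0 and math.isfinite(v)]
    return min(exps), max(exps)


def operand_ratio_histogram(pairs):
    """Histogram of |exponent(a) - exponent(b)| over operand pairs.

    Pairs containing a zero are skipped, since zero has no meaningful exponent.
    """
    hist = Counter()
    for a, b in pairs:
        if a == 0.0 or b == 0.0:
            continue
        hist[abs(math.frexp(a)[1] - math.frexp(b)[1])] += 1
    return hist
```

A narrow `value_range` suggests the exponent field can be reduced, and a histogram concentrated at small differences suggests the alignment hardware can be reduced.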


VFLOAT Library

• VHDL variable-precision floating-point library
  – Initially developed by Belanovic at Northeastern, continued development under the supervision of Leeser
• Allows basic customisation of precision, exponent bit widths
• Further customisations added for our optimisations:
  – Operand alignment
  – Normalisation
• Performance is lower than vendor-specific libraries
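The effect of choosing exponent and precision widths can be seen with a small calculation. The Python sketch below assumes an IEEE-754-style encoding (biased exponent, implicit leading one, top exponent code reserved); the library's own number format may differ, and `format_range` is a hypothetical helper, not part of VFLOAT. It only works for formats no wider than a Python float.

```python
import math


def format_range(exp_bits: int, frac_bits: int):
    """Return (smallest normal, largest finite) magnitude for a format with
    the given exponent and fraction widths, assuming IEEE-754-style encoding."""
    bias = 2 ** (exp_bits - 1) - 1
    min_exp = 1 - bias   # exponent of the smallest normal number
    max_exp = bias       # exponent of the largest finite number
    smallest_normal = math.ldexp(1.0, min_exp)
    largest = math.ldexp(2.0 - 2.0 ** -frac_bits, max_exp)
    return smallest_normal, largest
```

With `exp_bits=11, frac_bits=52` this reproduces the double-precision limits; shrinking the exponent to 5 bits with a 10-bit fraction gives a range of 2^-14 to 65504.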


Data-path Generator

• Takes user-selected data-path and generates VHDL implementation
• Assembles modified version of the RPL library – customised to allow removal of various items
• Builds hardware/software integration layer
  – C library for software
  – VHDL for hardware
• Does not modify the software source automatically (yet)


Proof-of-Concept Testing

• Original application modified to call C library (usually from FORTRAN)
• Data sent to hardware, calculated, and returned
  – Software waits for response
  – No data-aggregation or hardware-side error detection occurs
• Software layer performs same calculation for verification
• Overall error rate reported
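The verification loop can be sketched as follows. Since no hardware is available here, `reduced_hw_add` is a hypothetical stand-in for the hardware call: it simply drops the smaller operand when the exponent gap exceeds `max_align`, which is one plausible failure mode of a shortened alignment shifter and is an assumption of this sketch. `error_rate` plays the role of the software layer, repeating each calculation and counting mismatches.

```python
import math


def reduced_hw_add(a: float, b: float, max_align: int = 4) -> float:
    """Hypothetical stand-in for the hardware adder with reduced alignment."""
    gap = abs(math.frexp(a)[1] - math.frexp(b)[1])
    if gap > max_align:
        # Assumed behaviour: the smaller operand is shifted out entirely.
        return a if abs(a) > abs(b) else b
    return a + b


def error_rate(pairs, hw_add=reduced_hw_add) -> float:
    """Fraction of operations whose hardware result differs from software."""
    if not pairs:
        return 0.0
    mismatches = 0
    for a, b in pairs:
        reference = a + b  # software layer performs the same calculation
        if hw_add(a, b) != reference:
            mismatches += 1
    return mismatches / len(pairs)
```

In this model two of the four pairs `[(1.0, 1.5), (1.0, 1e-6), (2.0, 2.0), (4.0, 1e-9)]` mismatch, so the reported error rate is 0.5.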


‘ydl_pij’

• ‘ydl_pij’ is an iterative solver for quantum mechanics, using the “Molecular Mechanics – Valence Bond” method
• Datasets of various sizes available, allowing a variety of test cases to be used
• Initial profiling and testing use separate datasets


‘ydl_pij’: Profiling (Hot Code Section)

[Figure annotation: Narrow value ranges]


‘ydl_pij’: Identification

• FloatWatch identifies the regions of code executing the most operations
• In this case, these show narrow value ranges
• Create optimised datapaths for testing
  – Maximum operand alignment reduced to 2^n, where n is in the range [1, 6]
  – Normalisation hardware modified similarly
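The sweep over candidate data-paths can be illustrated in Python. The sketch reads the alignment limit as 2^n bit positions for n from 1 to 6 and, for each n, reports the fraction of operand pairs that would exceed it and so need re-execution. The function name and the trace format (a list of operand pairs) are assumptions, and only the alignment limit is modelled, not normalisation.

```python
import math


def reexecution_rates(pairs, n_values=range(1, 7)) -> dict:
    """Map each n to the fraction of pairs whose exponent gap exceeds 2**n."""
    rates = {}
    for n in n_values:
        limit = 2 ** n
        misses = sum(
            1
            for a, b in pairs
            if abs(math.frexp(a)[1] - math.frexp(b)[1]) > limit
        )
        rates[n] = misses / len(pairs)
    return rates
```

The rate can only fall or stay level as n grows, so the smallest n whose rate is acceptable gives the smallest alignment hardware for that workload.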

‘ydl_pij’ Error Rate

[Figure annotation: Not profiled]


‘ydl_pij’: Error Rate and Size

• 20% size reduction with negligible re-execution rate (< 0.5%)

• 27% size reduction with 3% re-execution rate

• Size reduction permits ~40% increase in parallelism due to better space usage
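As a back-of-envelope check on how size reduction relates to parallelism, the Python sketch below assumes that all freed area can be reused for additional copies of the data-path. That assumption is not stated on the slide, and `ideal_parallelism_gain` is a hypothetical helper. Under it, a fractional size reduction s allows 1 / (1 - s) times as many units.

```python
def ideal_parallelism_gain(size_reduction: float) -> float:
    """Fractional increase in unit count if freed area is fully reused.

    size_reduction is a fraction in [0, 1), e.g. 0.27 for a 27% reduction.
    """
    if not 0.0 <= size_reduction < 1.0:
        raise ValueError("size_reduction must be in [0, 1)")
    return 1.0 / (1.0 - size_reduction) - 1.0
```

This area-only ratio gives 25% for the 20% reduction and roughly 37% for the 27% reduction, close to the quoted ~40%. The slide attributes its figure to better space usage, which this simple ratio does not model.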

ydl_pij: Area saving for one F.P. adder/subtractor

[Figure panels: Pre-optimisation, Post-optimisation]


Coming Soon

• Per-operation optimisations
  – Currently only at data-path level

• Optimisation of operand-swap hardware

• Per-operation exponent customisation (size, bias)

• Performance evaluation using state-of-the-art FPGA accelerator hardware

• Implementation of error detection and re-execution

• Potential for even greater size reductions