Heshan Lin: Accelerating Short Read Mapping, Local Realignment, and a Discovery on a Graphics Processing Unit (GPU)

synergy.cs.vt.edu

Accelerating Sequence Analysis on Graphics Processing Unit (GPU)

Wu Feng and Heshan Lin

Department of Computer Science

NGS Democratizing DNA Sequencing

Source: www.genome.gov

Sequencing available to the masses in the near future

Bottleneck Shift -> Computation

Sequencing applications: ChIP-Seq, Transcriptome Sequencing, Complete Genome Re-sequencing, Metagenomics, …

BIG Data

Traditional HPC Resources

HPC Users

Supercomputers

Clusters

The Masses

Graphics Processing Unit (GPU)

Graphics & gaming -> general purpose computing

Ubiquitously available: Desktop, laptop, iPad

“Personalized Supercomputer”

• 10x > CPU

• 512 cores

• 10^12 flops

• On par with the power of a supercomputer in 2004

Traditional CPU Cores

Control (Fetch / Decode)

ALU

Execution Context (Registers)

Out-of-order Control Logic

Branch Predictor

Memory Prefetcher

Data Cache

Courtesy of K. Fatahalian

Optimized for single thread

Power Density will Increase

[Figure: power density (W/cm2, log scale from 1 to 10000) versus year (1970 to 2010) for Intel processors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, and Pentium®, with reference levels marked Hot Plate, Nuclear Reactor, Rocket Nozzle, and Sun's Surface]

Power densities too high to keep junctions at low temps

Source: Borkar, De, Intel

GPU: Optimized for Throughput

Use much simpler cores

Use vectorization to replicate simple cores

[Figure: core diagrams. Simple cores, each with Control (Fetch / Decode), an ALU, and an Execution Context (Registers); and a vectorized core in which one Control (Fetch / Decode) unit drives multiple ALUs, each with its own Execution Context (Registers), plus a Shared Execution Context]

Courtesy of K. Fatahalian

Take with a Grain of Salt

Raw Compute Power != Application Performance

Not all applications are suitable for GPUs

Developing fully optimized code on GPUs is non-trivial and requires computational rethinking

A GPU core is MUCH SLOWER than a CPU core

Need a lot of parallelism to hide memory latency

Reduce branching as much as possible

Think about an army of synchronized snails
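
The advice to reduce branching can be illustrated with a toy cost model. This is a minimal sketch under a stated assumption, not a description of any particular GPU: lanes that share one control unit cannot run different code paths at the same time, so a group whose lanes disagree on a branch executes both paths back to back. The function name, the group size of 32, and the cycle costs are all hypothetical.

```python
def lockstep_cycles(takes_branch, cost_if, cost_else):
    """Cycles one lockstep group spends on an if/else.

    Simplified model (an assumption for illustration): the group runs the
    'if' path when at least one lane needs it, then the 'else' path when at
    least one lane needs it. Lanes not on the active path sit idle.
    """
    any_if = any(takes_branch)
    any_else = not all(takes_branch)
    return (cost_if if any_if else 0) + (cost_else if any_else else 0)


uniform = [True] * 32                        # every lane takes the same path
divergent = [i % 2 == 0 for i in range(32)]  # lanes alternate between paths

print(lockstep_cycles(uniform, cost_if=10, cost_else=10))    # 10
print(lockstep_cycles(divergent, cost_if=10, cost_else=10))  # 20
```

In this model the divergent group does the same useful work as the uniform one but takes twice as long, which is the snail army stopping to wait for each other.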

GPU Potential for Sequence Alignment

Why sequence alignment?

Fundamental in sequence analysis

Computationally intensive

Preliminary study

Algorithm         Description                        Speedup
RMAP              Short read mapping                 10x
Smith-Waterman    Optimal sequence alignment         30x
BLASTP            Sequence database search           6.5x
Indel realigner   Locally realign mismatched reads   Ongoing
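
For readers unfamiliar with Smith-Waterman, the following is a minimal CPU-side sketch of the scoring recurrence, not the GPU implementation measured above. The scoring parameters (match=2, mismatch=-1, linear gap=-1) are illustrative assumptions. Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that regular structure is one commonly exploited source of parallelism.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a score never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("GATTACA", "TTAC"))  # 8
```

The example scores 8 because "TTAC" occurs exactly inside "GATTACA": four matches at 2 points each.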

Lessons Learnt

CPU-optimized code may be difficult to accelerate on GPUs

BLASTP 6.5x vs. Smith-Waterman 30x

Requires rethinking of algorithm design

A scalable but less optimal algorithm can be better

Example: RMAP

Originally uses a hash table to find the match (O(n))

Switched to a slower binary search algorithm (O(n log n))
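
The two lookup strategies in the RMAP example can be contrasted with a small sketch. This is not RMAP's code: the function names and the toy seed list are hypothetical, and the suggestion that a flat sorted array suits GPUs because of its regular memory layout is an assumption, since the slide states only that the scalable algorithm won. For n queries, the hash index costs O(n) expected time in total and the sorted index O(n log n).

```python
from bisect import bisect_left


def build_hash_index(seeds):
    """Map each seed to the list of positions where it occurs."""
    index = {}
    for pos, seed in enumerate(seeds):
        index.setdefault(seed, []).append(pos)
    return index


def build_sorted_index(seeds):
    """Flat array of (seed, position) pairs sorted by seed."""
    return sorted((seed, pos) for pos, seed in enumerate(seeds))


def lookup_sorted(sorted_index, query):
    """Binary-search for the first entry matching query, then scan its run."""
    i = bisect_left(sorted_index, (query,))  # (query,) sorts before (query, pos)
    hits = []
    while i < len(sorted_index) and sorted_index[i][0] == query:
        hits.append(sorted_index[i][1])
        i += 1
    return hits


seeds = ["ACG", "TTA", "ACG", "GGC"]
hash_index = build_hash_index(seeds)
sorted_index = build_sorted_index(seeds)

print(hash_index.get("ACG", []))           # [0, 2]
print(lookup_sorted(sorted_index, "ACG"))  # [0, 2]
print(lookup_sorted(sorted_index, "CCC"))  # []
```

Both indexes return the same positions; they differ only in per-lookup cost and in how the data is laid out in memory.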

Opportunities

[Figure: accuracy versus time plot positioning Smith-Waterman, Needleman-Wunsch, BWA, BLAST, and Bowtie, with an open spot labeled "Next-gen Algorithm?". Axes: Accuracy, Time]

Compute the Cure Initiative

Partnership between NVIDIA and VT

Goal: Leverage GPU power to fight cancer

Current focus: GPU accelerated sequence alignment framework

http://www.nvidia.com/object/compute-the-cure.html

Conclusion

Democratizing DNA sequencing requires more accessible HPC resources

GPUs present both opportunities and challenges

Initial results are promising

For more information: Synergy website – http://synergy.cs.vt.edu

Acknowledgement

Collaborators: David Mittelman, Virginia Bioinformatics Institute

Students: Ashwin Aji, Shucai Xiao

Funding: NVIDIA Compute the Cure Program; NSF Center for High-Performance Reconfigurable Computing