Volunteer Computing with BOINC

David P. Anderson

Space Sciences Laboratory, University of California, Berkeley

High-throughput computing

• Goal: finish lots of jobs in a given time

• Paradigms:

– Supercomputing

– Cluster computing

– Grid computing

– Cloud computing

– Volunteer computing

Cost of 1 TFLOPS-year

• Cluster: $145K

– Computing hardware; power/AC infrastructure; network hardware; storage; power; sysadmin

• Cloud: $1.75M

• Volunteer: $1K - $10K

– Server hardware; sysadmin; web development
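The price gap above can be made concrete with a few lines of arithmetic. The sketch below uses only the dollar figures from this slide; the dictionary layout and the function name `cost_ratio` are illustrative choices, not part of BOINC.

```python
# Cost in USD per TFLOPS-year, copied from the slide as (low, high) ranges.
# Cluster and cloud are single figures, so low == high.
COST_PER_TFLOPS_YEAR = {
    "cluster": (145_000, 145_000),
    "cloud": (1_750_000, 1_750_000),
    "volunteer": (1_000, 10_000),
}


def cost_ratio(paradigm, baseline="volunteer"):
    """Return (smallest, largest) ratio of a paradigm's cost to the baseline's."""
    low, high = COST_PER_TFLOPS_YEAR[paradigm]
    base_low, base_high = COST_PER_TFLOPS_YEAR[baseline]
    # Smallest ratio: cheapest paradigm figure vs. most expensive baseline figure.
    return low / base_high, high / base_low


for name in ("cluster", "cloud"):
    smallest, largest = cost_ratio(name)
    print(f"{name}: {smallest:g}x to {largest:g}x the volunteer cost")
# cluster: 14.5x to 145x the volunteer cost
# cloud: 175x to 1750x the volunteer cost
```

Taking the slide's numbers at face value, volunteer computing is between one and three orders of magnitude cheaper per TFLOPS-year.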

Performance

• Current

– 500K people, 1M computers

– 6.5 PetaFLOPS (3 from GPUs, 1.4 from PS3s)

• Potential

– 1 billion PCs today, 2 billion in 2015

– GPU: approaching 1 TFLOPS

– How to get 1 ExaFLOPS: 4M GPUs * 1 TFLOPS each * 0.25 availability

– How to get 1 Exabyte: 10M PC disks * 100 GB
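The two estimates above can be checked directly. This is a minimal sketch assuming decimal units (1 TFLOPS = 10^12 FLOPS, 1 GB = 10^9 bytes) and rounding the "approaching 1 TFLOPS" GPU figure up to exactly 1 TFLOPS; the function names are illustrative.

```python
TFLOPS = 10**12   # floating-point operations per second
GB = 10**9        # bytes (decimal gigabyte, an assumption)
EXA = 10**18


def aggregate_flops(num_gpus, flops_per_gpu, availability):
    """Sustained FLOPS if each GPU is usable for a fraction `availability` of the time."""
    return num_gpus * flops_per_gpu * availability


def aggregate_storage_bytes(num_disks, bytes_per_disk):
    """Total storage contributed by `num_disks` disks of equal size."""
    return num_disks * bytes_per_disk


flops = aggregate_flops(4_000_000, 1 * TFLOPS, 0.25)
storage = aggregate_storage_bytes(10_000_000, 100 * GB)

print(flops / EXA, "ExaFLOPS")   # 1.0 ExaFLOPS
print(storage / EXA, "Exabytes") # 1.0 Exabytes
```

Against the 1 billion PCs cited above, 4M GPUs is 0.4% of the installed base and 10M disks is 1%.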

History of volunteer computing

[Timeline figure; axis: 1995, 2000, 2005, now]

• Applications

– distributed.net, GIMPS

– SETI@home, Folding@home

– Climateprediction.net, Predictor@home, IBM World Community Grid, Einstein@home, Rosetta@home, ...

• Middleware

– Commercial: Entropia, United Devices, ...

– Academic: Bayanihan, Javelin, ...

– BOINC

The BOINC computing ecosystem

[Figure: volunteers linked by attachments to projects such as CPDN, LHC@home, and WCG]

• Projects compete for volunteers

• Volunteers make their contributions count

• Optimal equilibrium

What apps work well?

• Bags of tasks

– parameter sweeps

– simulations with perturbed initial conditions

– compute-intensive data analysis

• Native, legacy, Java, GPU

– soon: VM-based

• Job granularity: minutes to months
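What makes a "bag of tasks" suitable is that every job is independent of the others, so jobs can be handed to any volunteer host in any order. The sketch below shows this for a parameter sweep. It is plain Python, not BOINC's job-submission interface; the grid contents, the function name `parameter_sweep`, and the job dictionary layout are all hypothetical.

```python
import itertools


def parameter_sweep(grid):
    """Yield one independent job description per combination of parameter values."""
    names = sorted(grid)  # fixed ordering so job IDs are reproducible
    combos = itertools.product(*(grid[name] for name in names))
    for job_id, values in enumerate(combos):
        yield {"job_id": job_id, "params": dict(zip(names, values))}


# Hypothetical grid: 3 temperatures x 2 random seeds = 6 independent jobs.
grid = {"temperature": [280, 290, 300], "seed": [1, 2]}
jobs = list(parameter_sweep(grid))

print(len(jobs))  # 6
print(jobs[0])    # {'job_id': 0, 'params': {'seed': 1, 'temperature': 280}}
```

No job reads another job's output, so a slow or vanished host delays only its own jobs.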

Data size issues

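[Diagram: institution and volunteer hosts connected through the commodity Internet]

• Institution side: ~ 1 Gbps, non-dedicated, underutilized

• Volunteer side: ~ 1 Mbps (450 MB/hr), possibly sporadic, non-dedicated, underutilized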

• Most current projects not data-intensive

• Probably works for data-intensive also
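The 450 MB/hr figure follows from the ~1 Mbps link rate, and the same conversion gives a rough feel for how large a job's input files can be. This is a sketch assuming decimal units (1 Mbps = 10^6 bits/s, 1 MB = 10^6 bytes); the `utilization` parameter is a hypothetical knob for the fraction of the link a volunteer lets the project use.

```python
def mb_per_hour(link_mbps):
    """Convert a link rate in megabits/s to megabytes/hour (decimal units)."""
    return link_mbps / 8 * 3600


def transfer_hours(file_mb, link_mbps, utilization=1.0):
    """Hours to move `file_mb` megabytes using a fraction `utilization` of the link."""
    return file_mb / (mb_per_hour(link_mbps) * utilization)


print(mb_per_hour(1))               # 450.0  (volunteer side, ~1 Mbps)
print(mb_per_hour(1000))            # 450000.0  (institution side, ~1 Gbps)
print(transfer_hours(450, 1))       # 1.0
print(transfer_hours(450, 1, 0.5))  # 2.0
```

A job that computes for many hours per few hundred megabytes transferred keeps a ~1 Mbps link from becoming the bottleneck.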

Example projects

• Einstein@home

• Climateprediction.net

• Rosetta@home

• IBM World Community Grid

• GPUGRID.net

• PrimeGrid

Creating a volunteer computing project

• Set up a server

• Port applications, develop graphics

• Develop software for job submission and result handling

• Develop web site

• Ongoing:

– publicity, volunteer communication

– system, DB admin (Linux, MySQL)

How many CPUs will you get?

• Depends on:

– PR efforts and success

– public appeal

– availability of internal resources

• 12 projects have > 10,000 active hosts

• 3 projects have > 100,000 active hosts

Organizational issues

• Creating a volunteer computing project has startup costs and requires diverse skills

• This limits its use by individual scientists and research groups

• Better model: umbrella projects

– Institutional: Lattice, VTU@home

– Corporate: IBM World Community Grid

– Community: AlmereGrid

Summary

• Volunteer computing is an important paradigm for high-throughput computing

– price/performance

– performance potential

• Low technical barriers to entry (due to BOINC)

• Organizational structure is critical

• Use GPUs if developing new app
