Lessons Learned from SETI@home
David P. Anderson
Director, SETI@home
Space Sciences Laboratory
U.C. Berkeley
April 2, 2002
SETI@home Operations

[Data-flow diagram: data recorder → DLT tapes → splitters → work-unit (WU) storage → data server → screensaver clients; returned results → result queue → accounting queue → garbage collector; science DB and user DB, with master-DB redundancy checking; post-processing: RFI elimination and repeat detection; tape archive/delete and tape backup; web site served by a CGI program and web-page generator]
Radio SETI projects

| Name                 | Sensitivity | Sky coverage (% sky) | Frequency range (MHz) | Max drift rate (Hz/s) | Frequency resolution (Hz) | Computing power (GFLOPs) |
|----------------------|-------------|----------------------|-----------------------|-----------------------|---------------------------|--------------------------|
| Phoenix (SETI Inst.) | 1e-26       | 0.005 (1000 stars)   | 2000                  | 1                     | 1                         | 200                      |
| SETI@home            | 3e-25       | 33                   | 2.5                   | 50                    | 0.07 to 1200              | 25,000                   |
| SERENDIP (Berkeley)  | 1e-24       | 33                   | 100                   | 0.4                   | 0.6                       | 150                      |
| BETA (Harvard)       | 3e-23       | 70                   | 320                   | 0.25                  | 0.5                       | 25                       |
History and statistics
- Conceived 1995, launched April 1999
- Funding: TPS, DiMI, numerous companies
- 3.6M users (0.5M active), 226 countries
- 40 TB of data recorded and processed
- 25 TeraFLOPs average over the last year
- Almost 1 million years of CPU time
- No ET signals yet, but other results
Public-resource computing
- Original: GIMPS, distributed.net
- Commercial: United Devices, Entropia, Porivo, Popular Power
- Academic, open-source: Cosm, Folding@home, SETI@home II
- The peer-to-peer paradigm
Characterizing SETI@home
- Fixed-rate data processing task
- Low bandwidth-to-computation ratio
- Independent parallelism
- Error tolerance
Millions and millions of computers
- Server scalability
- Dealing with excess CPU time
  - Redundant computing deals with cheating and malfunctions
  - Control the load by changing the computation
- Moore's Law is true (and causes the same problems)
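The redundant-computing idea above can be sketched as a server that issues each work unit to several hosts and accepts a canonical result only when enough independent copies agree; the redundancy and quorum values and all names here are hypothetical, not SETI@home's actual parameters.

```python
from collections import Counter, defaultdict

REDUNDANCY = 3   # copies of each work unit issued (assumed value)
QUORUM = 2       # matching results needed before one is accepted

results = defaultdict(list)  # work-unit id -> list of (host, result) pairs

def report_result(wu_id, host, result):
    """Record one host's result; return the accepted value once a quorum
    of independent hosts agree, else None."""
    results[wu_id].append((host, result))
    counts = Counter(r for _, r in results[wu_id])
    value, n = counts.most_common(1)[0]
    if n >= QUORUM:
        return value  # canonical result; disagreeing hosts get no credit
    return None

# Two honest hosts outvote one bad result:
assert report_result("wu1", "host_a", 42) is None
assert report_result("wu1", "host_b", 99) is None  # cheated or corrupted
assert report_result("wu1", "host_c", 42) == 42
```

Majority voting like this is what makes cheating and malfunctions tolerable at the cost of repeating each computation, which is affordable precisely because excess CPU time is the abundant resource.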
Network bandwidth costs money
- SSL (Space Sciences Lab) to campus: 100 Mbps, free, unloaded
- Campus to ISP: 70 Mbps, not free
- First: load limiting at 25 Mbps
- Now: no limit, zero priority
- How to adapt load to capacity?
- What's the break-even point? (about 1 GB per CPU day)
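The 1 GB-per-CPU-day break-even figure invites a back-of-the-envelope check: given a link capacity, how many client CPU-days per day can the project feed before bandwidth, not computing, is the bottleneck? Only the 25 Mbps cap and the 1 GB/CPU-day figure come from the slide; the ~1 MB/CPU-day comparison point is an assumed, SETI@home-like workload.

```python
# Clients a link can feed per day, given the traffic one CPU-day of work moves.
def max_clients(link_mbps, gb_per_cpu_day):
    bytes_per_day = link_mbps * 1e6 / 8 * 86400  # link capacity in bytes/day
    return bytes_per_day / (gb_per_cpu_day * 1e9)

# At the 25 Mbps load limit:
print(round(max_clients(25, 1.0)))    # 270    -- at the 1 GB/CPU-day break-even
print(round(max_clients(25, 0.001)))  # 270000 -- at ~1 MB/CPU-day (assumed workload)
```

The three-orders-of-magnitude gap between the two answers is why a low bandwidth-to-computation ratio is listed as a defining property of the application.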
How to get and retain users
- Graphics are important, but monitors do burn in
- Teams: users recruit other users
- Keep users informed
  - Science news
  - System-management news
  - Periodic project emails
Reward users
- PDF certificates
- Milestone pages and emails
- Leader boards (overall, country, …)
- Class pages
- Personal signal page
Let users express themselves
- User profiles
- Message boards
- Newsgroup (sci.astro.seti)
- Learn about users
  - Online poll
Users are competitive
- Patched clients, benchmark wars
- Results with no computation
- Intentionally bad results
- Team recruitment by spam
- Sale of accounts on eBay
- Accounting is tricky
Anything can be reverse engineered
- Patched versions of the client defeat efforts at self-checksumming
- Replacement of the FFT routine → bad results
- Digital signing: doesn't work
- Techniques for verifying work
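One reason verifying work is harder than it looks: hosts with different FPUs and math libraries legitimately produce slightly different floating-point results, so redundant results cannot be compared with exact equality. A minimal sketch of tolerance-based comparison, with hypothetical (frequency, power) signal records:

```python
import math

def signals_equivalent(a, b, rel_tol=1e-4):
    """Compare two lists of (frequency_hz, power) detections from redundant
    hosts, allowing small platform-dependent floating-point differences."""
    if len(a) != len(b):
        return False
    return all(
        math.isclose(fa, fb, rel_tol=rel_tol) and
        math.isclose(pa, pb, rel_tol=rel_tol)
        for (fa, pa), (fb, pb) in zip(a, b)
    )

host1 = [(1420.40575e6, 31.7), (1420.91002e6, 12.4)]
host2 = [(1420.40576e6, 31.7), (1420.91003e6, 12.4)]  # tiny FP drift
forged = [(1420.40575e6, 99.9)]                       # bad result

assert signals_equivalent(host1, host2)
assert not signals_equivalent(host1, forged)
```

Unlike digital signing of the client binary, comparing independently computed results does not depend on the client being tamper-proof.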
Users will help if you let them
- Web-site translations
- Add-ons
  - Server proxies
  - Statistics DB and display
- Beta testers
- Porting
- Open-source development (will be used in SETI@home II)
Client: mechanism, not policy
- Error handling, versioning
- Load regulation
  - Let the server decide
  - Reasonable default if no server
- Put in a level of indirection
- Separate control and data
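The "mechanism, not policy" idea can be sketched as a client whose tunables come from the server's reply, with safe built-in defaults when no server is reachable. Every name, URL, and parameter below is hypothetical, not SETI@home's actual protocol:

```python
import json
from urllib.request import urlopen
from urllib.error import URLError

# Safe defaults used when the client can't reach any server.
DEFAULT_POLICY = {
    "work_buffer_days": 1.0,     # how much work to cache locally
    "connect_interval_s": 3600,  # how often to poll the scheduler
    "data_server_url": "http://server.example.edu/cgi",  # level of indirection
}

def fetch_policy(scheduler_url, timeout=10):
    """Ask the server for current policy; the client supplies only mechanism.

    Fields the server omits keep their defaults, so old clients keep
    working when the server grows new knobs."""
    try:
        with urlopen(scheduler_url, timeout=timeout) as resp:
            server_policy = json.load(resp)
    except (URLError, ValueError, OSError):
        server_policy = {}  # no server: fall back to reasonable defaults
    return {**DEFAULT_POLICY, **server_policy}
```

Keeping the data-server URL itself inside the policy is the level of indirection: the project can move or split servers without shipping a new client.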
Cross-platform is manageable
- Windows and Mac are harder
- GNU tools and POSIX rule
Server reliability/performance
- Hardware: air conditioning, RAID controller
- Software: database server
- Architect for failure
- Develop diagnostic tools
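On the client side, "architect for failure" usually means retrying with randomized exponential backoff, so a just-restarted server isn't flattened by millions of clients reconnecting at once. A minimal sketch with assumed timing constants:

```python
import random

def backoff_delays(base=60, cap=14400, n=6):
    """Yield n randomized exponential-backoff delays in seconds.

    The random jitter spreads reconnect attempts out, avoiding a
    'thundering herd' against a recovering server."""
    delay = base
    for _ in range(n):
        yield random.uniform(0, delay)
        delay = min(delay * 2, cap)

for attempt, wait in enumerate(backoff_delays(), 1):
    print(f"attempt {attempt}: retry within {wait:.0f}s")
```

The cap keeps even long-offline clients checking back within a few hours once the server returns.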
What’s next for public computing?
- Better handling of large data
  - Network scheduling
  - Reliable multicast
- Expand the computation model
- Multi-application, multi-project platform: BOINC (Berkeley Open Infrastructure for Network Computing)