Lessons Learned from SETI@home
David P. Anderson
Director, SETI@home
Space Sciences Laboratory
U.C. Berkeley
April 2, 2002
SETI@home Operations

[Data-flow diagram: data recorder → DLT tapes → splitters → work-unit (WU) storage → data server → screensaver clients; returned results → result queue → accounting queue → garbage collector; science DB and user DB, with master-DB redundancy checking; post-processing: RFI elimination and repeat detection; tape archive/delete and tape backup; web site served by a CGI program and web-page generator]
Radio SETI projects

| Name                 | Sensitivity | Sky coverage (% sky) | Frequency range (MHz) | Max drift rate (Hz/s) | Frequency resolution (Hz) | Computing power (GFLOPs) |
|----------------------|-------------|----------------------|-----------------------|-----------------------|---------------------------|--------------------------|
| Phoenix (SETI Inst.) | 1e-26       | 0.005 (1000 stars)   | 2000                  | 1                     | 1                         | 200                      |
| SETI@home            | 3e-25       | 33                   | 2.5                   | 50                    | 0.07 to 1200              | 25,000                   |
| SERENDIP (Berkeley)  | 1e-24       | 33                   | 100                   | 0.4                   | 0.6                       | 150                      |
| BETA (Harvard)       | 3e-23       | 70                   | 320                   | 0.25                  | 0.5                       | 25                       |
History and statistics
- Conceived 1995, launched April 1999
- Funding: TPS, DiMI, numerous companies
- 3.6M users (0.5M active), 226 countries
- 40 TB of data recorded and processed
- 25 TeraFLOPs average over the last year
- Almost 1 million years of CPU time
- No ET signals yet, but other results
Public-resource computing
- Original: GIMPS, distributed.net
- Commercial: United Devices, Entropia, Porivo, Popular Power
- Academic, open-source: Cosm, Folding@home, SETI@home II
- The peer-to-peer paradigm
Characterizing SETI@home
- Fixed-rate data processing task
- Low bandwidth-to-computation ratio
- Independent parallelism
- Error tolerance
Millions and millions of computers
- Server scalability
- Dealing with excess CPU time
  - Redundant computing deals with cheating and malfunctions
  - Control the load by changing the computation
- Moore's Law is true (and causes the same problems)
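The redundant-computing idea above can be sketched as a server that issues each work unit to several hosts and accepts a canonical result only when enough independent copies agree; the redundancy and quorum values and all names here are hypothetical, not SETI@home's actual parameters.

```python
from collections import Counter, defaultdict

REDUNDANCY = 3   # copies of each work unit issued (assumed value)
QUORUM = 2       # matching results needed before one is accepted

results = defaultdict(list)  # work-unit id -> list of (host, result) pairs

def report_result(wu_id, host, result):
    """Record one host's result; return the accepted value once a quorum
    of independent hosts agree, else None."""
    results[wu_id].append((host, result))
    counts = Counter(r for _, r in results[wu_id])
    value, n = counts.most_common(1)[0]
    if n >= QUORUM:
        return value  # canonical result; disagreeing hosts get no credit
    return None

# Two honest hosts outvote one bad result:
assert report_result("wu1", "host_a", 42) is None
assert report_result("wu1", "host_b", 99) is None  # cheated or corrupted
assert report_result("wu1", "host_c", 42) == 42
```

Majority voting like this is what makes cheating and malfunctions tolerable at the cost of repeating each computation, which is affordable precisely because excess CPU time is the abundant resource.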
Network bandwidth costs money
- SSL (Space Sciences Lab) to campus: 100 Mbps, free, unloaded
- Campus to ISP: 70 Mbps, not free
- First: load limiting at 25 Mbps
- Now: no limit, zero priority
- How to adapt load to capacity?
- What's the break-even point? (about 1 GB per CPU day)
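The 1 GB-per-CPU-day break-even figure invites a back-of-the-envelope check: given a link capacity, how many client CPU-days per day can the project feed before bandwidth, not computing, is the bottleneck? Only the 25 Mbps cap and the 1 GB/CPU-day figure come from the slide; the ~1 MB/CPU-day comparison point is an assumed, SETI@home-like workload.

```python
# Clients a link can feed per day, given the traffic one CPU-day of work moves.
def max_clients(link_mbps, gb_per_cpu_day):
    bytes_per_day = link_mbps * 1e6 / 8 * 86400  # link capacity in bytes/day
    return bytes_per_day / (gb_per_cpu_day * 1e9)

# At the 25 Mbps load limit:
print(round(max_clients(25, 1.0)))    # 270    -- at the 1 GB/CPU-day break-even
print(round(max_clients(25, 0.001)))  # 270000 -- at ~1 MB/CPU-day (assumed workload)
```

The three-orders-of-magnitude gap between the two answers is why a low bandwidth-to-computation ratio is listed as a defining property of the application.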
How to get and retain users
- Graphics are important, but monitors do burn in
- Teams: users recruit other users
- Keep users informed
  - Science news
  - System-management news
  - Periodic project emails
Reward users
- PDF certificates
- Milestone pages and emails
- Leader boards (overall, country, …)
- Class pages
- Personal signal page
Let users express themselves
- User profiles
- Message boards
- Newsgroup (sci.astro.seti)
- Learn about users
  - Online poll
Users are competitive
- Patched clients, benchmark wars
- Results with no computation
- Intentionally bad results
- Team recruitment by spam
- Sale of accounts on eBay
- Accounting is tricky
Anything can be reverse engineered
- Patched versions of the client defeat efforts at self-checksumming
- Replacement of the FFT routine → bad results
- Digital signing: doesn't work
- Techniques for verifying work
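One reason verifying work is harder than it looks: hosts with different FPUs and math libraries legitimately produce slightly different floating-point results, so redundant results cannot be compared with exact equality. A minimal sketch of tolerance-based comparison, with hypothetical (frequency, power) signal records:

```python
import math

def signals_equivalent(a, b, rel_tol=1e-4):
    """Compare two lists of (frequency_hz, power) detections from redundant
    hosts, allowing small platform-dependent floating-point differences."""
    if len(a) != len(b):
        return False
    return all(
        math.isclose(fa, fb, rel_tol=rel_tol) and
        math.isclose(pa, pb, rel_tol=rel_tol)
        for (fa, pa), (fb, pb) in zip(a, b)
    )

host1 = [(1420.40575e6, 31.7), (1420.91002e6, 12.4)]
host2 = [(1420.40576e6, 31.7), (1420.91003e6, 12.4)]  # tiny FP drift
forged = [(1420.40575e6, 99.9)]                       # bad result

assert signals_equivalent(host1, host2)
assert not signals_equivalent(host1, forged)
```

Unlike digital signing of the client binary, comparing independently computed results does not depend on the client being tamper-proof.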
Users will help if you let them
- Web-site translations
- Add-ons
  - Server proxies
  - Statistics DB and display
- Beta testers
- Porting
- Open-source development (will be used in SETI@home II)
Client: mechanism, not policy
- Error handling, versioning
- Load regulation
  - Let the server decide
  - Reasonable default if no server
- Put in a level of indirection
- Separate control and data
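The "mechanism, not policy" idea can be sketched as a client whose tunables come from the server's reply, with safe built-in defaults when no server is reachable. Every name, URL, and parameter below is hypothetical, not SETI@home's actual protocol:

```python
import json
from urllib.request import urlopen
from urllib.error import URLError

# Safe defaults used when the client can't reach any server.
DEFAULT_POLICY = {
    "work_buffer_days": 1.0,     # how much work to cache locally
    "connect_interval_s": 3600,  # how often to poll the scheduler
    "data_server_url": "http://server.example.edu/cgi",  # level of indirection
}

def fetch_policy(scheduler_url, timeout=10):
    """Ask the server for current policy; the client supplies only mechanism.

    Fields the server omits keep their defaults, so old clients keep
    working when the server grows new knobs."""
    try:
        with urlopen(scheduler_url, timeout=timeout) as resp:
            server_policy = json.load(resp)
    except (URLError, ValueError, OSError):
        server_policy = {}  # no server: fall back to reasonable defaults
    return {**DEFAULT_POLICY, **server_policy}
```

Keeping the data-server URL itself inside the policy is the level of indirection: the project can move or split servers without shipping a new client.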
Cross-platform is manageable
- Windows and Mac are harder
- GNU tools and POSIX rule
Server reliability/performance
- Hardware: air conditioning, RAID controller
- Software: database server
- Architect for failure
- Develop diagnostic tools
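On the client side, "architect for failure" usually means retrying with randomized exponential backoff, so a just-restarted server isn't flattened by millions of clients reconnecting at once. A minimal sketch with assumed timing constants:

```python
import random

def backoff_delays(base=60, cap=14400, n=6):
    """Yield n randomized exponential-backoff delays in seconds.

    The random jitter spreads reconnect attempts out, avoiding a
    'thundering herd' against a recovering server."""
    delay = base
    for _ in range(n):
        yield random.uniform(0, delay)
        delay = min(delay * 2, cap)

for attempt, wait in enumerate(backoff_delays(), 1):
    print(f"attempt {attempt}: retry within {wait:.0f}s")
```

The cap keeps even long-offline clients checking back within a few hours once the server returns.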
What’s next for public computing?
- Better handling of large data
  - Network scheduling
  - Reliable multicast
- Expand the computation model
- Multi-application, multi-project platform: BOINC (Berkeley Open Infrastructure for Network Computing)