Online Science
The World-Wide Telescope as a Prototype For the New Computational Science

Jim Gray
Microsoft Research
http://research.microsoft.com/~gray

Alex Szalay
Johns Hopkins University



The Evolution of Science
• Observational Science
  – Scientist gathers data by direct observation
  – Scientist analyzes data
• Analytical Science
  – Scientist builds analytical model
  – Makes predictions
• Computational Science
  – Simulate analytical model
  – Validate model and make predictions
• Data Exploration Science
  – Data captured by instruments, or data generated by simulator
  – Processed by software
  – Placed in a database / files
  – Scientist analyzes database / files


Information Avalanche
• In science, industry, government, …
  – Better observational instruments and better simulations are producing a data avalanche
• Examples
  – BaBar: grows 1 TB/day (2/3 simulation information, 1/3 observational information)
  – CERN: LHC will generate 1 GB/s, ~10 PB/y
  – VLBA (NRAO) generates 1 GB/s today
  – Pixar: 100 TB/movie
• New emphasis on informatics
  – Capturing, organizing, summarizing, analyzing, visualizing
[Images: image courtesy C. Meneveau & A. Szalay @ JHU; BaBar, Stanford; Space Telescope; P&E Gene Sequencer, from http://www.genome.uci.edu/]
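The rates quoted on this slide can be put on a common per-year footing with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a 365-day year. The "implied duty cycle" is only an inference from comparing 1 GB/s with ~10 PB/y; the slide itself does not state one.

```python
# Back-of-the-envelope conversion of the slide's data rates to yearly volumes.
# Assumptions (not from the slide): decimal units and a 365-day year.
GB = 10**9
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def bytes_per_year(rate_bytes_per_second: float, duty_cycle: float = 1.0) -> float:
    """Bytes accumulated in a year at a sustained rate and fractional uptime."""
    return rate_bytes_per_second * SECONDS_PER_DAY * DAYS_PER_YEAR * duty_cycle


# BaBar: 1 TB/day.
babar_per_year = 1 * TB * DAYS_PER_YEAR

# LHC: 1 GB/s if it ran around the clock.
lhc_full = bytes_per_year(1 * GB)

# Fraction of the year at 1 GB/s that would yield the quoted ~10 PB/y.
implied_duty = (10 * PB) / lhc_full

print(f"BaBar: {babar_per_year / TB:.0f} TB/y")
print(f"LHC at 100% uptime: {lhc_full / PB:.1f} PB/y")
print(f"Implied duty cycle for ~10 PB/y: {implied_duty:.0%}")
```

Under these assumptions, 1 GB/s sustained for a full year is roughly three times the quoted ~10 PB/y, so the two numbers are consistent only if data is taken for about a third of the year.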


World Wide Telescope
Virtual Observatory
http://www.astro.caltech.edu/nvoconf/
http://www.voforum.org/
• Premise: most data is (or could be) online
• The Internet is the world’s best telescope:
  – It has data on every part of the sky
  – In every measured spectral band: optical, x-ray, radio..
  – As deep as the best instruments (2 years ago)
  – It is up when you are up. The “seeing” is always great (no working at night, no clouds, no moons, no..)
  – It’s a smart telescope: links objects and data to literature on them


Why Astronomy Data?
• It has no commercial value
  – No privacy concerns
  – Can freely share results with others
  – Great for experimenting with algorithms
• It is real and well documented
  – High-dimensional data (with confidence intervals)
  – Spatial data
  – Temporal data
• Many different instruments from many different places and many different times
• Federation is a goal
• There is a lot of it (petabytes)
• Great sandbox for data mining algorithms
  – Can share cross company
  – University researchers
• Great way to teach both Astronomy and Computational Science
[Image panel labels: IRAS 100 µm, ROSAT ~keV, DSS Optical, 2MASS 2 µm, IRAS 25 µm, NVSS 20 cm, WENSS 92 cm, GB 6 cm]


SkyServer.SDSS.org
• A modern Astronomy archive
  – Raw pixel data lives in file servers
  – Catalog data (derived objects) lives in database
  – Online query to any and all
• Also used for education
  – 150 hours of online Astronomy
  – Implicitly teaches data analysis
• Interesting things
  – Spatial data search
  – Client query interface via Java Applet
  – Query interface via Emacs
  – Popular
  – Cloned by other surveys (a template design)
  – Web services are core of it
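To make "spatial data search" concrete, here is a minimal sketch of a cone search: return every catalog object within a given angular radius of a sky position. The catalog rows, field names, and function names are hypothetical, and the linear scan is for illustration only; it is not a description of how SkyServer implements its search, and a real archive would presumably rely on a spatial index.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) positions,
    computed with the haversine formula. All inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return the catalog objects within radius_deg of (ra, dec)."""
    return [obj for obj in catalog
            if angular_separation_deg(ra, dec, obj["ra"], obj["dec"]) <= radius_deg]


# Hypothetical catalog rows, made up for this example.
CATALOG = [
    {"id": 1, "ra": 185.00, "dec": 0.00},
    {"id": 2, "ra": 185.01, "dec": 0.01},
    {"id": 3, "ra": 186.50, "dec": 1.20},
]

hits = cone_search(CATALOG, ra=185.0, dec=0.0, radius_deg=0.05)
print([obj["id"] for obj in hits])
```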


Federation: SkyQuery.Net
• Combine 4 archives initially

• Just added 6 more

• Send query to portal, portal joins data from archives.

• Problem: want to do multi-step data analysis (not just single query).

• Solution: Allow personal databases on portal

• Problem: some queries are monsters

• Solution: “batch schedule” on portal server, which deposits the answer in the personal database.
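The two problem/solution pairs above can be sketched together: a portal queues expensive queries, runs them as a batch, and deposits each answer as a table in the submitting user's personal database, where later analysis steps can query it. Everything here is an assumption made for illustration (the `Portal` class, the `photo` table, the user name, and the use of in-memory SQLite); it is not the SkyQuery implementation.

```python
import sqlite3
from collections import deque


class Portal:
    """Toy portal: a shared archive, a batch queue, and per-user databases."""

    def __init__(self, archive):
        self.archive = archive   # connection to the shared archive
        self.personal = {}       # user name -> personal database connection
        self.queue = deque()     # pending batch jobs, first in, first out

    def personal_db(self, user):
        """Return the user's personal database, creating it on first use."""
        if user not in self.personal:
            self.personal[user] = sqlite3.connect(":memory:")
        return self.personal[user]

    def submit(self, user, sql, target_table):
        """Queue a query instead of running it interactively."""
        self.queue.append((user, sql, target_table))

    def run_batch(self):
        """Run every queued job and deposit its answer in the owner's database."""
        while self.queue:
            user, sql, table = self.queue.popleft()
            cursor = self.archive.execute(sql)
            columns = [d[0] for d in cursor.description]
            rows = cursor.fetchall()
            col_list = ", ".join(f'"{c}"' for c in columns)
            placeholders = ", ".join("?" for _ in columns)
            db = self.personal_db(user)
            db.execute(f'CREATE TABLE "{table}" ({col_list})')
            db.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
            db.commit()


# Hypothetical archive contents.
archive = sqlite3.connect(":memory:")
archive.execute("CREATE TABLE photo (objid INTEGER, mag REAL)")
archive.executemany("INSERT INTO photo VALUES (?, ?)",
                    [(1, 17.2), (2, 19.5), (3, 16.8)])

portal = Portal(archive)
portal.submit("alice", "SELECT objid, mag FROM photo WHERE mag < 18", "bright")
portal.run_batch()

# Step two of a multi-step analysis runs against the personal table.
step2 = portal.personal_db("alice").execute(
    "SELECT COUNT(*), MIN(mag) FROM bright").fetchone()
print(step2)
```

Keeping intermediate results next to the portal means a follow-up step does not have to resend the full query to the archives; that is the motivation this sketch tries to capture.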


SkyQuery Structure
• Each SkyNode publishes
  – Schema Web Service
  – Database Web Service
• Portal
  – Plans query (2 phase)
  – Integrates answers
  – Is itself a web service
[Diagram labels: 2MASS, INT, SDSS, FIRST, SkyQuery Portal, Image Cutout]
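One plausible reading of "plans query (2 phase)" is sketched below: in phase 1 the portal asks each node how many rows it would contribute, and in phase 2 it visits the nodes from smallest to largest, keeping only objects that match positionally. The slide says only "2 phase", so this interpretation is an assumption, as are the `SkyNode` class, its methods, the catalog rows, and the simplification of matching every node against the first node's object.

```python
import math


def separation_deg(a, b):
    """Great-circle separation in degrees between two objects (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (a["ra"], a["dec"], b["ra"], b["dec"]))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


class SkyNode:
    """Stand-in for an archive exposing schema and database services."""

    def __init__(self, name, objects):
        self.name = name
        self.objects = objects

    def schema(self):
        """Schema service stand-in: the fields each object carries."""
        return ["id", "ra", "dec"]

    def count(self):
        """Cheap estimate used during planning (phase 1)."""
        return len(self.objects)

    def query(self):
        """Database service stand-in: return all objects (phase 2)."""
        return list(self.objects)


def portal_xmatch(nodes, tolerance_deg):
    """Cross-match objects across nodes and integrate the answers."""
    # Phase 1: order the nodes by how many rows each would return.
    ordered = sorted(nodes, key=lambda n: n.count())
    # Phase 2: start from the smallest node and extend each partial match.
    matches = [{ordered[0].name: obj} for obj in ordered[0].query()]
    for node in ordered[1:]:
        extended = []
        for match in matches:
            anchor = next(iter(match.values()))  # object from the first node
            for obj in node.query():
                if separation_deg(anchor, obj) <= tolerance_deg:
                    extended.append({**match, node.name: obj})
        matches = extended
    return matches


# Hypothetical catalogs; only the node names come from the slide.
sdss = SkyNode("SDSS", [{"id": 1, "ra": 185.0000, "dec": 0.0000},
                        {"id": 2, "ra": 185.5000, "dec": 0.3000},
                        {"id": 3, "ra": 186.0000, "dec": 1.0000}])
twomass = SkyNode("2MASS", [{"id": 10, "ra": 185.0002, "dec": 0.0001},
                            {"id": 11, "ra": 190.0000, "dec": 5.0000}])
first = SkyNode("FIRST", [{"id": 20, "ra": 185.0001, "dec": -0.0001}])

result = portal_xmatch([sdss, twomass, first], tolerance_deg=0.001)
print([{name: obj["id"] for name, obj in m.items()} for m in result])
```

Starting from the node with the fewest candidates keeps the intermediate match list small, which is the usual motivation for a counting pass before execution.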


Outline
• Information Avalanche: science, business, personal
• Astronomy data
  – SkyServer: http://SkyServer.SDSS.org
  – demo http://skyquery.net/
  – pixel space, record space, set space
  – Personal SkyServer download http://skyserver.org/myskyserver/
  – Mention data mining.
• World-Wide Telescope
  – Federated web services
  – demo http://skyquery.net/
  – Other web services
  – Interop with Linux/Python/…
• Other stuff
  – Portal with batch job scheduler http://skyservice.pha.jhu.edu/devel/casjobs/