AstroGrid http://www.astrogrid.ac.uk NAM 2001 Andy Lawrence Cambridge Belfast Cambridge Edinburgh Jodrell Leicester MSSL RAL

AstroGrid http://www.astrogrid.ac.uk

NAM 2001 Andy Lawrence

Cambridge

Belfast, Cambridge, Edinburgh, Jodrell, Leicester, MSSL, RAL

Optical, Infrared, X-ray, Radio, Solar, Space Plasma

collectivisation

• thirty year trend.....

– facility class (common-user) instruments

– central development of data reduction s/w

– calibrated archives with simple tools

– information services (ADS, NED)

– large consortium projects (MACHO, 2dF, SLOAN, VISTA...)

• next steps

– inter-operable archives (joint queries)

– communal exploration and analysis tools (data mining)

– information discovery tools

the archive is the sky

– large fraction of astro-papers based on archives

– HST : retrieval growing faster than ingest

– ISO : whole archive downloaded twice

Already more retrieval than ingest!

graphics from US NVO project

Large database science

• Rare object searches

• modelling populations

• statistical manipulation

• large sample monitoring

• the UNKNOWN

next steps in use of archives

• inter-operability and joint queries

– e.g. retrieve Sloan, UKIDSS and XMM images from single query

– click on image and get spectrum

– give me all objects redder than X that have no radio counterpart but already have a spectrum
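The last request above can be sketched as a single query over federated catalogues. This is a minimal illustration only: the table names (`optical`, `radio`, `spectra`), the colour column `g_minus_r`, and the shared `obj_id` key are all hypothetical, and it assumes the catalogues have already been cross-matched onto common object identifiers, which in practice is a positional matching problem of its own.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for three cross-matched archives (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE optical (obj_id INTEGER PRIMARY KEY, g_minus_r REAL);
        CREATE TABLE radio   (obj_id INTEGER);
        CREATE TABLE spectra (obj_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO optical VALUES (?, ?)",
        [(1, 0.3), (2, 1.4), (3, 1.6), (4, 1.9)],
    )
    conn.executemany("INSERT INTO radio VALUES (?)", [(3,)])
    conn.executemany("INSERT INTO spectra VALUES (?)", [(1,), (2,), (3,)])
    return conn


def redder_without_radio_with_spectrum(conn, colour_cut):
    """Ids of objects redder than colour_cut, with no radio counterpart, that have a spectrum."""
    sql = """
        SELECT o.obj_id
        FROM optical AS o
        WHERE o.g_minus_r > ?
          AND NOT EXISTS (SELECT 1 FROM radio   AS r WHERE r.obj_id = o.obj_id)
          AND     EXISTS (SELECT 1 FROM spectra AS s WHERE s.obj_id = o.obj_id)
        ORDER BY o.obj_id
    """
    return [row[0] for row in conn.execute(sql, (colour_cut,))]
```

Here "redder than X" is taken to mean a larger colour index than the cut; a real service would also need to agree on which colour and which photometric system is meant.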

next steps in use of archives

• exploration and visualisation tools

– large image scrolling and projection

– N-d parameter space plotting

– VR headsets

next steps in use of archives

• large data-set manipulation tools

– Fourier transforms

– Finding outliers

– Data compression

– PCA analysis
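As one concrete instance of the tools listed above, outlier finding can be sketched with a robust score based on the median absolute deviation (MAD). The choice of MAD, the 0.6745 scaling, and the 3.5 threshold are assumptions made for this illustration; the slides do not name a method.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose MAD-based robust score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values coincide with the median: no usable scale estimate.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a Gaussian z-score.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]
```

For example, `mad_outliers([10.1, 9.9, 10.0, 10.2, 9.8, 25.0])` flags only the last entry. Medians are used rather than the mean and standard deviation so that the outliers themselves do not inflate the scale they are measured against.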

next steps in use of archives

• information discovery tools

– intelligent search agents

– networked NED

the scary bit.....

• SDSS science archive a few TB

• WFCAM will produce a TB/week

• VISTA even worse...

• Peta-Byte databases coming your way ...

data intensive computing

• search SuperCOS data : few hours

• search VISTA DB : few months !

• need clever DB structures / query memory

• need parallel machines

– simple PC farms for simple queries ?

– shared memory architecture for manipulations ?
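The "PC farm for simple queries" idea can be sketched as a partitioned scan: the database is split into chunks, each worker filters its own chunk, and the matches are merged. In this illustration threads in one process stand in for separate machines, and `scan_partition` and `farm_query` are hypothetical names; it shows the shape of the approach, not a measured speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Sequentially filter one partition of the data."""
    return [row for row in rows if predicate(row)]


def farm_query(partitions, predicate, workers=4):
    """Run the same simple query on every partition and merge the matches."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields per-partition results in the order the partitions were given.
        per_partition = pool.map(
            lambda part: scan_partition(part, predicate), partitions
        )
        return [row for part in per_partition for row in part]
```

This works for queries where each row can be tested independently. Manipulations that need the whole data set at once (the shared memory case above) do not decompose this way.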

remote analysis services

• Janet delivers 10 Mb/s to door

– 10 TByte dataset takes 93 days to download

• lesson : shift the results not the data

– i.e. data centres must also be service providers

• data subset access

• database query service

• analysis tools on line OR ability to upload code

• remote visualisation service
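The 93-day figure quoted above can be reproduced with a short calculation. The sketch assumes decimal units (10 TByte = 10e12 bytes, 10 Mb/s = 10e6 bits per second) and a link running at its full rate with no protocol overhead, so real transfers would be slower still.

```python
SECONDS_PER_DAY = 86_400


def download_days(size_bytes, rate_bits_per_s):
    """Days needed to move size_bytes over a link sustaining rate_bits_per_s."""
    seconds = size_bytes * 8 / rate_bits_per_s
    return seconds / SECONDS_PER_DAY


# 10 TByte over a 10 Mb/s link: 8e13 bits / 1e7 bits/s = 8e6 s, about 92.6 days.
days_for_full_dataset = download_days(10e12, 10e6)
```

By the same arithmetic a 1 MByte result set crosses the same link in under a second, which is the case for shifting the results rather than the data.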

Grids

• services remote .... also distributed ?

• computational grids

– web is distributed information ; grid is distributed CPU

– networked users have supercomputers at their fingertips

– don't even need to know where they are

– like plugging into the electrical power grid

technical issues

• data format standards

• metadata and annotation standards

• information exchange protocols

• presentation service standards

• request translation middleware

• workload scheduling, resource allocation

• mass storage management

• computing fabric management

• differentiated service network technology

• distributed data management - caching, file replication, file migration

• visualisation technology and algorithms

• data discovery methods

• search agents and AI

• database structure and query methods

• data mining algorithms

• s/w libraries and tools for upload requests

• data quality assurance (levels of club membership ?)

– all science-wide and commerce-wide issues ...

context

• Global Grids work

– basic computer science and technology development

• grids work in other sciences

– CERN grid

– Earth observation grid

• international astro-plans

– US National Virtual Observatory (NVO) project

– UK AstroGrid project

– European Astrophysical Virtual Observatory (AVO) project

AstroGrid project

• developed during LTSR

• proposal to PPARC October 2000

• three year project

• one year Phase A study

– community consultation

– science requirements analysis

– benchmark tests

– pilot database federations

• we need use-cases....

Phase B - preliminary

• uniform AstroGrid interface

• data-mining machines connected in grid

• tool for simultaneous browsing

– plus advanced visualisation, links to spectra etc.

• tools for advanced database analysis

– advanced querying, mixture fitting, statistical manipulations etc.

• tools for on-line data analysis

– statistics, model fitting

• system for uploading code

FIN
