AstroGrid http://www.astrogrid.ac.uk

NAM 2001 Andy Lawrence

Cambridge

Belfast, Cambridge, Edinburgh, Jodrell, Leicester, MSSL, RAL

Optical, Infrared, X-ray, Radio, Solar, Space Plasma

collectivisation

• thirty year trend.....

– facility class (common-user) instruments

– central development of data reduction s/w

– calibrated archives with simple tools

– information services (ADS, NED)

– large consortium projects (MACHO, 2dF, SLOAN, VISTA...)

• next steps

– inter-operable archives (joint queries)

– communal exploration and analysis tools (data mining)

– information discovery tools

the archive is the sky

– large fraction of astro-papers based on archives

– HST : retrieval growing faster than ingest

– ISO : whole archive downloaded twice

Already more retrieval than ingest!

graphics from US NVO project

Large database science

• Rare object searches

• modelling populations

• statistical manipulation

• large sample monitoring

• the UNKNOWN

next steps in use of archives

• inter-operability and joint queries

– e.g. retrieve Sloan, UKIDSS and XMM images from single query

– click on image and get spectrum

– give me all objects redder than X that have no radio counterpart but already have a spectrum
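The last request above can be sketched as a single filter over several archives. This is a minimal illustration only: it assumes the catalogues have already been cross-matched onto a shared object ID, and the catalogue contents, the `g - r` colour definition and the function name are all hypothetical, not part of any AstroGrid interface.

```python
# Hypothetical, pre-cross-matched catalogues keyed by a shared object ID.
optical = {
    "obj1": {"g": 19.2, "r": 17.9},
    "obj2": {"g": 18.0, "r": 17.8},
    "obj3": {"g": 20.5, "r": 18.9},
    "obj4": {"g": 21.0, "r": 19.1},
}
radio_detected = {"obj3"}                  # IDs with a radio counterpart
has_spectrum = {"obj1", "obj2", "obj3"}    # IDs with an archived spectrum


def red_radio_quiet_with_spectrum(optical, radio_detected, has_spectrum, colour_cut):
    """Return IDs redder than colour_cut (in g - r) that have no radio
    counterpart but do have a spectrum."""
    selected = []
    for obj_id, mags in optical.items():
        colour = mags["g"] - mags["r"]
        if (colour > colour_cut
                and obj_id not in radio_detected
                and obj_id in has_spectrum):
            selected.append(obj_id)
    return sorted(selected)


result = red_radio_quiet_with_spectrum(
    optical, radio_detected, has_spectrum, colour_cut=1.0
)
print(result)
```

In a real federation the hard part is the positional cross-match between archives, which this sketch takes as given.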

next steps in use of archives

• exploration and visualisation tools

– large image scrolling and projection

– N-d parameter space plotting

– VR headsets

next steps in use of archives

• large data-set manipulation tools

– Fourier transforms

– Finding outliers

– Data compression

– PCA analysis
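As one illustration of the "finding outliers" item, a robust cut on the median absolute deviation (MAD) flags values far from the bulk of a sample. The technique, the threshold of 5 and the sample values are assumptions chosen for the sketch; the slides do not specify a method.

```python
from statistics import median


def mad_outliers(values, threshold=5.0):
    """Return indices of values more than `threshold` MADs from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the bulk of the sample: the ratio is undefined.
        return []
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


# Hypothetical magnitudes for one parameter of a large sample.
sample = [10.1, 9.9, 10.0, 10.2, 9.8, 25.0, 10.1]
print(mad_outliers(sample))
```

The median-based statistic is used here because a single extreme object barely shifts it, unlike a mean and standard deviation.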

next steps in use of archives

• information discovery tools

– intelligent search agents

– networked NED

the scary bit.....

• SDSS science archive a few TB

• WFCAM will produce a TB/week

• VISTA even worse...

• Peta-Byte databases coming your way ...

data intensive computing

• search SuperCOS data : few hours

• search VISTA DB : few months !

• need clever DB structures / query memory

• need parallel machines

– simple PC farms for simple queries ?

– shared memory architecture for manipulations ?
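The "simple PC farms for simple queries" option can be sketched as a partitioned scan: split the table into chunks, let each worker filter its own chunk, then concatenate the hits. In this sketch threads stand in for farm nodes, and the function names and chunking scheme are illustrative assumptions rather than a proposed design.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk, predicate):
    """Work done by one node: a plain linear scan of its own partition."""
    return [row for row in chunk if predicate(row)]


def farm_query(rows, predicate, n_workers=4):
    """Split rows into roughly equal chunks, scan them concurrently,
    and merge the partial results in the original order."""
    chunk_size = max(1, -(-len(rows) // n_workers))  # ceiling division
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = pool.map(scan_chunk, chunks, [predicate] * len(chunks))
        return [row for part in partial for row in part]


# Hypothetical table of 1000 row IDs; select every 250th.
hits = farm_query(list(range(1000)), lambda row: row % 250 == 0)
print(hits)
```

This works because each row can be tested independently. Manipulations that need the whole data set at once (the shared memory case above) do not partition this easily.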

remote analysis services

• Janet delivers 10 Mb/s to door

– 10 TByte dataset takes 93 days to download

• lesson : shift the results not the data

– i.e. data centres must also be service providers

• data subset access

• database query service

• analysis tools on line OR ability to upload code

• remote visualisation service
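The 93-day figure quoted for a 10 TByte download can be checked with a short calculation. The sketch assumes 1 TByte = 10^12 bytes, 10 Mb/s = 10^7 bits per second, and a fully sustained link with no protocol overhead; none of these are stated on the slide.

```python
SECONDS_PER_DAY = 86400


def download_days(size_bytes, rate_bits_per_s):
    """Days needed to move size_bytes over a link sustaining rate_bits_per_s."""
    seconds = size_bytes * 8 / rate_bits_per_s
    return seconds / SECONDS_PER_DAY


days = download_days(10e12, 10e6)  # 10 TByte at 10 Mb/s
print(f"{days:.1f} days")          # about 92.6, i.e. roughly 93 days
```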

Grids

• services remote .... also distributed ?

• computational grids

– web is distributed information ; grid is distributed CPU

– networked users have supercomputers at their fingertips

– don't even need to know where they are

– like plugging into the electrical power grid

technical issues

• data format standards

• metadata and annotation standards

• information exchange protocols

• presentation service standards

• request translation middleware

• workload scheduling, resource allocation

• mass storage management

• computing fabric management

• differentiated service network technology

• distributed data management - caching, file replication, file migration

• visualisation technology and algorithms

• data discovery methods

• search agents and AI

• database structure and query methods

• data mining algorithms

• s/w libraries and tools for upload requests

• data quality assurance (levels of club membership ?)

– all science-wide and commerce-wide issues ...

context

• Global Grids work

– basic computer science and technology development

• grids work in other sciences

– CERN grid

– Earth observation grid

• international astro-plans

– US National Virtual Observatory (NVO) project

– UK AstroGrid project

– European Astrophysical Virtual Observatory (AVO) project

AstroGrid project

• developed during LTSR

• proposal to PPARC October 2000

• three year project

• one year Phase A study

– community consultation

– science requirements analysis

– benchmark tests

– pilot database federations

• we need use-cases....

Phase B - preliminary

• uniform AstroGrid interface

• data-mining machines connected in grid

• tool for simultaneous browsing

– plus advanced visualisation, links to spectra etc.

• tools for advanced database analysis

– advanced querying, mixture fitting, statistical manipulations etc.

• tools for on-line data analysis

– statistics, model fitting

• system for uploading code

FIN