NASA’s Astrophysics Archives
NASA's Long-Term Astrophysics Data Archives
L. M. Rebull, IRSA
17 Oct 2016 HEASARC
With Input From:
• Vandana Desai (IRSA)
• Harry Teplitz (IRSA)
• Steve Groom (IRSA)
• Rachel Akeson (NExScI)
• Bruce Berriman (NExScI)
• George Helou (IPAC)
• David Imel (IPAC)
• Joe Mazzarella (NED)
• Alberto Accomazzi (ADS)
• Tom McGlynn (GSFC)
• Alan Smale (GSFC)
• Rick White (MAST)
Having said that…
• I’m an astronomer at IRSA, so that is what I know best, and most of my examples are IRSA-focused by necessity.
NASA’s Commitment to Astrophysics Data Archives
• “NASA has regarded data handling and archiving as an integral part of space missions.”
• “This support now provides the major return on the considerable investment the agency made … over the past 20 years.”
“A Sustainable Archive”
• Provides data discovery and analysis tools. Facilitates new science.
• Contains high-quality, reliable data.
• Provides simple and useful tools to a broad community.
• Provides user support to the novice as well as to the power user.
• Adapts and evolves in response to community input.
An Archive’s Job
• Ingest new data (and reprocessing of old data).
• Maintain/serve vital repository of irreplaceable data:
  – Support for observation planning and mission planning.
  – Resource for original science.
  – High level science products.
• Enable cutting-edge research:
  – API and Virtual Observatory.
  – User support by experts.
  – New/enhanced services.
  – Multi-wavelength projects.
Archival Papers Outnumber Non-Archival (GO/PI) Papers
[Figure: percent of refereed papers that use NASA IR data, by year, 2008-2014.]
[Figure: percent of Spitzer papers by year, 2004-2014, split into non-archival, both, and Spitzer archival.]
[Figure: Hubble papers split into non-archival, Hubble archival, and both.]
Archives double an observatory’s output!
IRSA Science Highlights
WISE morphological study of Wolf-Rayet nebulae, Toala et al. (2015, A&A 578, 66)
WISE + 2MASS + PanSTARRS data may reveal super-void in CMB cold spot seen by Planck (Szapudi et al. 2015, MNRAS, 450, 288)
[Figure panels: WISE + 2MASS galaxies; Planck]
WISE + Spitzer discover the coldest brown dwarf (Luhman 2014, ApJL 786, L18)
[Figure panels: WISE 4.5 µm (2010); Spitzer 4.5 µm (2014)]
Time domain; Follow-up Observations
Combination of Surveys
Buckyballs in a Young Planetary Nebula (Cami et al., 2010, Science, 329, 1180)
Re-analysis of Spectra
MAST Science Highlight: HR8799 b, c, d imaged by HST in 1998
• planet b: 83,000x fainter than star at 1.72 arcsec
• planet c: 36,000x fainter than star at 0.96 arcsec
• planet d: 33,000x fainter than star at 0.60 arcsec
Post-processing speckle subtraction gives more than an order of magnitude contrast improvement over the “state of the art” when the data were taken in 1998.
Soummer et al. 2011, Pueyo et al. 2015
HEASARC Science Highlight
Tarantula Nebula: Combining 6 years of Fermi data to discover the first extragalactic gamma-ray pulsar!
Credit: NASA/DOE/Fermi LAT Collaboration, 2015, Science, 350, 801
Some NASA Archives by Center
• IPAC:
  – IRSA – IR, submm
  – NED – Extragalactic
  – NASA Exoplanet Archive
  – KOA (w/ WMKO) – Keck Observatory
• STScI: MAST – UV, optical, IR
• GSFC: HEASARC – high energy, CMB
• SAO/CfA:
  – ADS – literature
  – Chandra
IRSA
• IRSA = NASA/IPAC Infrared Science Archive, @Caltech, @IPAC.
• Charter is to provide interface to all NASA infrared and sub-mm data sets (~1 μm -> ~1 cm).
• Founded 1993, original home to IRAS data (1983).
• IRSA ensures the legacy of NASA’s “golden age” of IR:
  – Enable research that has not yet been envisioned.
  – Priorities set by missions and the community.
  – Support future flight missions.
• IRSA data sets are cited in about 10% of astronomical refereed journal articles.
• Total holdings over a petabyte (>1000 TB); >120 billion rows in catalogs as of 9/2016.
• Total number of queries: over 33.7 million queries, 255 TB downloaded 1-9/2016.
http://irsa.ipac.caltech.edu/
NED
• NED = NASA/IPAC Extragalactic Database (@Caltech, @IPAC).
• Primary hub for multi-wavelength research on extragalactic science.
• Merges data from catalogs and literature.
• 1000s of extragalactic papers per year, with unique measurements for millions of objects.
• 215 million objects with 256 million cross-IDs (from >102,000 articles/catalogs)!
• 2 billion photometric data points joined into spectral energy distributions.
• Myriad cross-links, notes, etc.
• Updates every few months.
http://ned.ipac.caltech.edu/
NASA Exoplanet Archive
• (Also @Caltech, @IPAC)
• Focused on confirmed and candidate exoplanets and on data searching for exoplanets.
• Includes Kepler data, and US portal to CoRoT data.
• Online tools to work with these data, like the periodogram service.
• Place for observers to upload/share data (Exo-FOP).
http://exoplanetarchive.ipac.caltech.edu/
KOA
https://koa.ipac.caltech.edu/
• A collaboration between NExScI and the W. M. Keck Observatory.
• Public data for all ten Keck instruments since the Observatory saw first light in 1994.
• Browse-quality images of raw data.
• Browse-quality, reduced data for HIRES, NIRC2, OSIRIS, and LWS, created by automating pipelines.
• Contributed data: Keck Observatory Database of Ionized Absorption toward Quasars (KODIAQ; N. Lehner, PI).
• Coming soon: NIRSPEC extracted spectra; moving target services.
• See Luca Rizzi’s poster for more details.
[Figure: Growth in peer-reviewed KOA citations]
MAST
• Mikulski Archive for Space Telescopes @STScI.
• Archive established with HST launch in 1990.
• Multi-mission since addition of IUE in 1998.
• Optical, UV, IR.
• Includes Hubble, Kepler, GALEX, IUE, FUSE, TESS, JWST, Pan-STARRS, DSS, GSC2, …
• >700 TB of data (soon to jump to 2.5 PB with Pan-STARRS release), 2 million searches per month, 1200 refereed papers per year.
https://archive.stsci.edu/
HEASARC
• High Energy Astrophysics Science Archive Research Center, @GSFC, since 1990.
• Extremely energetic cosmic phenomena ranging from black holes to the Big Bang.
• Chandra, XMM-Newton, Fermi, Suzaku, NuSTAR, INTEGRAL, ROSAT, Swift, & more than 20 others.
• Merged with Legacy Archive for Microwave Background Data Analysis (LAMBDA) in 2008 (CMBR): WMAP, COBE, ACT, etc.
http://heasarc.gsfc.nasa.gov/
• ADS = Astrophysics Data System (@SAO/CfA).
• Indexes 12 million publications in astronomy, physics, arXiv.
• Complete coverage of astronomy and refereed physics literature.
• Tracks citations, institutional and telescope bibliographies, links to data products (back to the other archives).
• New interface and API integrating ORCID, full-text search, analytics.
More…
• There are other archives (based at these centers), not necessarily NASA-funded, that follow this model.
  – Ex: Pan-STARRS, VLA-FIRST @STScI
  – Ex: Palomar Oschin wide-field survey @IRSA:
    • Zwicky Transient Facility (2017+)
    • intermediate Palomar Transient Factory (iPTF; 2013-2016)
    • Palomar Transient Factory (2009-2012)
• There are (of course) many other non-NASA archives (SDSS, NRAO, …) and non-US archives (Simbad, ESA, ESO, …).
• Also, observers can deliver data back to these centers for distribution (which may include data beyond original program).
Lessons Learned…
Ease of Access
• Researchers at all levels (team members, emeriti, summer students) need to be able to get and use data.
• Intuitive, web-based interface.
  – No extra software installation.
  – Visualization of resources, data, tools…
  – Easy choices to “just give me the table”, etc.
• Help needs to be there when users need it, easily found or promptly answered.
Support
• Need to have knowledgeable staff, who have done science with the data products, who can (a) find problems; (b) pass on valuable experience to new users.
• Help desk:
  – Speed and accuracy matter!
  – Questions can be complex.
• Documentation:
  – Tools/data releases.
  – Documentation updates in response to tickets.
• Demos:
  – Live (AAS, ADASS, DPS, etc.).
  – Video tutorials (IRSA has >60 videos; >4500 views total).
• The complexity of science user needs increases with time.
Visualization
• Data, catalogs, plots
• What do you have?
• What do I need?
  – What I know I need…
  – What did I just find?
[Screenshots: Finder Chart; Spectrum visualization; Catalog Search]
Ease of Use: High Level Science Products
• Greatly enhance the science return of the archives.
• Hubble Legacy high-level science products (HLSP) are used 10x as much as typical pipeline products.
• Make complex data sets accessible to a wider audience of researchers.
• Expand the use of large, coherent projects:
  – Hubble Treasury
  – Spitzer Legacy, Exploration Sci
• Generated by the community or by the center.
[Figure: Spitzer/GLIMPSE]
Ease of Use: Multi-Survey
• Combining information across wavelengths, surveys, missions…
• Source lists from entire missions:
  – Spitzer, Hubble, Chandra, Herschel, WISE...
NED Science Example
• In context of assessing NED completeness, looking at fusion of GALEX, SDSS, 2MASS, WISE, …
• Found super-luminous spiral galaxies!
• Ogle et al., 2016, ApJ, 817, 109
Changing Mission
• This is a result that came from looking at what was in the archive already.
• As data get bigger and bigger, users won’t be able to pull data out of the archive to work with it.
• Mission evolving from “search-and-retrieve” to “do [some] analysis in situ.”
• Science discoveries waiting in the archives that were never imagined or expected by the mission or even program PIs.
Community Feedback
• Need to talk to community to find out what is needed, wanted, wished for.
• Mission members, user committees, surveys, help desk, talking at conferences, and review cycles all feed into setting priorities.
Commitment to Archival Research
• NASA as a whole (+ sometimes missions) explicitly funds archives, and archival research: NASA ADAP (Astrophysics Data Analysis Program).
• Well-designed archive & products can greatly enhance research value of the data set.
  – Reducing barriers: finding data, making data accessible (reliability, units, file format, artifacts, documentation).
• NASA enables new ideas of things to do with older data.
• NASA has strong tradition of active collaboration between missions and archives.
  – Thinking about archive during the mission!
Ease of Use: VO
• VO = Virtual Observatory, IVOA = International Virtual Observatory Alliance
  – Standardized protocols for interoperability between archives (i.e., NOT the applications that use the protocols).
  – Data discovery.
• Use interface you know, to get to data elsewhere.
• Interoperability of tools, within archives and across archives.
• (People) communication and collaboration across archives:
  – Astronomy Data Centers Executive Committee (ADEC).
  – US Virtual Observatory Alliance (USVOA).
  – NASA Astronomical Virtual Observatories (NAVO).
• NAVO:
  – Provide comprehensive and consistent access to all NASA data through VO protocols.
  – Coordinate NASA interactions with international and national VO communities.
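As one illustration of what a standardized protocol buys, the sketch below builds a synchronous query URL in the style of the IVOA Table Access Protocol (TAP 1.0 parameters REQUEST, LANG, FORMAT, QUERY). The slides do not name TAP on this slide, so treat the choice as an assumption. The base URL is assumed to be IRSA's TAP endpoint, `example_catalog` is a hypothetical table name, and `build_tap_sync_url` is a helper invented here. No request is sent; the code only constructs and decodes the URL.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_tap_sync_url(base_url, adql, fmt="csv"):
    """Build a TAP 1.0-style synchronous query URL (no network access)."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": fmt,
        "QUERY": adql,
    }
    return f"{base_url.rstrip('/')}/sync?{urlencode(params)}"


# Assumed endpoint and hypothetical table name, for illustration only.
TAP_BASE = "https://irsa.ipac.caltech.edu/TAP"
adql = (
    "SELECT TOP 5 ra, dec FROM example_catalog "
    "WHERE CONTAINS(POINT('ICRS', ra, dec), "
    "CIRCLE('ICRS', 10.68, 41.27, 0.1)) = 1"
)

url = build_tap_sync_url(TAP_BASE, adql)
# Decode the query string again to confirm the ADQL survives URL encoding.
round_trip = parse_qs(urlparse(url).query)
print(url)
```

Because the parameters are the same at every compliant service, pointing the same helper at another archive should only require a different base URL and table name.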
IRSA VO Weekly Queries
[Figure: smoothed weekly queries, 2014-2016.]
VO queries taking off!!
HEASARC + MAST + IRSA + NED
[Figure: smoothed daily rate of VO queries, 2014-2016.]
VO queries taking off!!
Ease of Use: API
• Application Program Interface.
  – (e.g., call by programs, scripts, command line)
• Allows scripted access to archive data.
• Enables complex projects.
• Enables rapid queries…
• (also inadvertent DoS, so need to watch and throttle!)
[Figure: number of unique users per month vs. log(GB downloaded). Annotations: lots of users, small requests; few users, enormous requests!]
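The "watch and throttle" point can be sketched with a token bucket, one common way to cap scripted request rates per client while still allowing short bursts. This is an illustrative assumption, not a description of how any NASA archive actually throttles; the class name, rate, and capacity are all hypothetical. The clock is injectable so the demo runs deterministically.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Refill by elapsed time, then spend `cost` tokens if available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Deterministic demo with a fake clock (hypothetical limits: 2 req/s, burst of 5).
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=5, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(8)]       # a script firing 8 requests at once
fake_now[0] += 1.0                               # one second later
after_wait = [bucket.allow() for _ in range(3)]
print(burst, after_wait)
```

In the demo the first five requests of the burst pass and the rest are refused; after one second two more tokens have accumulated. A request's `cost` could also be scaled by expected download size, which matters when a few users make enormous requests.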
Operational Issues
• Keep archive running while making it better.
  – Assembling the plane while in flight!
• Growing audience, usage within existing resources.
  – Be efficient in how resources are used.
  – Use the same software across multiple data sets (X. Wu, earlier).
Innovations
• Interactive UI, using pieces developed by others.
• Machine learning.
  – NED: data embedded in free-form text, tables not standard (e.g., RA, Ra, ra, R.A., …); pilot project to apply ML to classify data and facilitate extraction.
• Improving scalability, extensibility, data prospecting...
• Greater integration of functionality and content across systems.
  – ADS: ORCID claiming; search by object via SIMBAD TAP service; embedding of publisher images via APIs.
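To make the non-standard table problem concrete (RA, Ra, ra, R.A., …), here is a small rule-based sketch that maps header spellings onto canonical column names. It is deliberately not the ML approach of the NED pilot project, only the kind of hand-written baseline such a classifier would have to beat. The alias table and function name are hypothetical.

```python
import re

# Hypothetical alias table: canonical name -> normalized spellings seen in tables.
ALIASES = {
    "ra": {"ra", "raj2000", "rightascension"},
    "dec": {"dec", "decl", "dej2000", "declination"},
}


def canonical_column(header):
    """Return the canonical column name for a header, or None if unrecognized."""
    # Lowercase and drop punctuation/whitespace, so "R.A." becomes "ra".
    key = re.sub(r"[^a-z0-9]", "", header.lower())
    for canonical, spellings in ALIASES.items():
        if key in spellings:
            return canonical
    return None


headers = ["RA", "Ra", "ra", "R.A.", "Dec.", "Flux"]
mapped = {h: canonical_column(h) for h in headers}
print(mapped)
```

Rules like these break down quickly on free-form text and unexpected spellings, which is presumably why a learned classifier is attractive for literature tables.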
Technical: Data Ingest
• Spitzer Legacy programs changed culture by requesting products be delivered back to the community.
• Now a common feature of Spitzer proposals.
• Brings these data to larger audience via central archive.
• IRSA has to have resources to ingest these products.
Technical: Data Ingest
• Expense is not necessarily in TB but in education of the people delivering the products.
  – Need to have delivery well-organized and documented.
  – People who have done it a lot: easy.
  – People new at this: not necessarily easy.
  – Tools to help people new at this.
  – Complexity is not just about size!
• Can end up with, e.g., optical and UV data available through the Spitzer archive (SINGS, LVL).
What’s Next: Big Data
• “Big data” in some missions, certainly “big data” across all NASA Archives.
• To some extent, have already been working with “big data”!
• IRSA has already invested in data visualization services to help people identify and experiment with data quickly.
• Planning: identifying the most critical needs of users, including increased analysis at the archive facilitated by user workspaces.
• Richer services for in situ analysis.
• All archives thinking about this in some way.
Summary
• Long-term, stable archives greatly increase the return on observatory investment. (Doubles papers!)
• Robust support for both expert and novice users pays off.
• User support by instrument experts is crucial.
• Standardization of tools within an archive increases efficiency.
• Interoperability between archives benefits everyone.
• High level data products can expand the reach of large data sets.
• Shift in approach from “search and retrieve” to “analyze in situ”.