NASA’s Astrophysics Archives NASA's Long-Term Astrophysics Data Archives L. M. Rebull, IRSA 17 Oct 2016 HEASARC

NASA's Long-Term Astrophysics Data Archives

L. M. Rebull, IRSA

17 Oct 2016 HEASARC


With Input From:

• Vandana Desai (IRSA)
• Harry Teplitz (IRSA)
• Steve Groom (IRSA)
• Rachel Akeson (NExScI)
• Bruce Berriman (NExScI)
• George Helou (IPAC)
• David Imel (IPAC)
• Joe Mazzarella (NED)
• Alberto Accomazzi (ADS)
• Tom McGlynn (GSFC)
• Alan Smale (GSFC)
• Rick White (MAST)


Having said that…

• I’m an astronomer at IRSA, so that is what I know best, and most of my examples are IRSA-focused by necessity.


NASA’s Commitment to Astrophysics Data Archives

• “NASA has regarded data handling and archiving as an integral part of space missions.”
• “This support now provides the major return on the considerable investment the agency made … over the past 20 years.”


“A Sustainable Archive”

• Provides data discovery and analysis tools. Facilitates new science.
• Contains high-quality, reliable data.
• Provides simple and useful tools to a broad community.
• Provides user support to the novice as well as to the power user.
• Adapts and evolves in response to community input.


An Archive’s Job

• Ingest new data (and reprocessing of old data).
• Maintain/serve vital repository of irreplaceable data:
  – Support for observation planning and mission planning.
  – Resource for original science.
  – High level science products.
• Enable cutting-edge research:
  – API and Virtual Observatory.
  – User support by experts.
  – New/enhanced services.
  – Multi-wavelength projects.


Archival Papers Outnumber Non-Archival (GO/PI) Papers

[Figure: "Papers that use NASA IR data". x-axis: Year (2008 to 2014); y-axis: Percent of Refereed Papers (0 to 12).]

[Figure: Percent of Spitzer papers (0 to 100) by Year (2004 to 2014). Legend: non-archival, both, Spitzer archival.]

[Figure: Hubble papers. Legend: Non-Archival, Hubble Archival, Both.]

Archives double an observatory’s output!


IRSA Science Highlights

• WISE morphological study of Wolf-Rayet nebulae, Toala et al. (2015, A&A 578, 66)
• WISE + 2MASS + PanSTARRS data may reveal super-void in CMB cold spot seen by Planck (Szapudi et al. 2015, MNRAS, 450, 288)
  [Image panels: WISE+2MASS galaxies; Planck]
• WISE + Spitzer discover the coldest brown dwarf (Luhman 2014, ApJL 786, L18)
  [Image panels: WISE 4.5 µm (2010); Spitzer 4.5 µm (2014)]
• Buckyballs in a Young Planetary Nebula (Cami et al., 2010, Science, 329, 1180)

[Slide labels: Time domain; Follow-up Observations; Combination of Surveys; Re-analysis of Spectra]


MAST Science Highlight: HR8799 b, c, d imaged by HST in 1998

• planet b: 83,000x fainter than star, at 1.72 arcsec
• planet c: 36,000x fainter than star, at 0.96 arcsec
• planet d: 33,000x fainter than star, at 0.60 arcsec

Post-processing speckle subtraction gives more than an order of magnitude contrast improvement over the “state of the art” when the data were taken in 1998.

Soummer et al. 2011, Pueyo et al. 2015


HEASARC Science Highlight

Tarantula Nebula: Combining 6 years of Fermi data to discover the first extragalactic gamma-ray pulsar!

Credit: NASA/DOE/Fermi LAT Collaboration, 2015, Science, 350, 801


Some NASA Archives by Center

• IPAC:
  – IRSA – IR, submm
  – NED – Extragalactic
  – NASA Exoplanet Archive
  – KOA (w/WMKO) – Keck Observatory
• STScI: MAST – UV, optical, IR
• GSFC: HEASARC – high energy, CMB
• SAO/CfA:
  – ADS – literature
  – Chandra


IRSA

• IRSA = NASA/IPAC Infrared Science Archive, @Caltech, @IPAC.
• Charter is to provide interface to all NASA infrared and sub-mm data sets (~1 μm to ~1 cm).
• Founded 1993, original home to IRAS data (1983).
• IRSA ensures the legacy of NASA’s “golden age” of IR:
  – Enable research that has not yet been envisioned.
  – Priorities set by missions and the community.
  – Support future flight missions.
• IRSA data sets are cited in about 10% of astronomical refereed journal articles.
• Total holdings over a petabyte (>1000 TB); >120 billion rows in catalogs as of 9/2016.
• Total number queries: Over 33.7 million queries, 255 TB downloaded 1-9/2016.

http://irsa.ipac.caltech.edu/



NED

• NED = NASA/IPAC Extragalactic Database (@Caltech, @IPAC).
• Primary hub for multi-wavelength research on extragalactic science.
• Merges data from catalogs and literature.
• 1000s of extragalactic papers per year, with unique measurements for millions of objects.
• 215 million objects with 256 million cross-IDs (from >102,000 articles/catalogs)!
• 2 billion photometric data points joined into spectral energy distributions.
• Myriad cross-links, notes, etc.
• Updates every few months.

http://ned.ipac.caltech.edu/



NASA Exoplanet Archive

• (Also @Caltech, @IPAC)
• Focused on confirmed and candidate exoplanets and on data searching for exoplanets.
• Includes Kepler data, and US portal to CoRoT data.
• Online tools to work with these data, like the periodogram service.
• Place for observers to upload/share data (Exo-FOP).

http://exoplanetarchive.ipac.caltech.edu/



KOA

• A collaboration between NExScI and the W. M. Keck Observatory.
• Public data for all ten Keck instruments since the Observatory saw first-light in 1994.
• Browse-quality images of raw data.
• Browse-quality, reduced data for HIRES, NIRC2, OSIRIS, and LWS, created by automating pipelines.
• Contributed data: Keck Observatory Database of Ionized Absorption toward Quasars (KODIAQ; N. Lehner, PI).
• Coming soon: NIRSPEC extracted spectra; moving target services.
• See Luca Rizzi’s poster for more details.

[Figure: Growth in peer-reviewed KOA citations]

https://koa.ipac.caltech.edu/



MAST

• Mikulski Archive for Space Telescopes @STScI.
• Archive established with HST launch in 1990.
• Multi-mission since addition of IUE in 1998.
• Optical, UV, IR.
• Includes Hubble, Kepler, GALEX, IUE, FUSE, TESS, JWST, Pan-STARRS, DSS, GSC2, …
• >700 TB of data (soon to jump to 2.5 PB with Pan-STARRS release), 2 million searches per month, 1200 refereed papers per year.

https://archive.stsci.edu/



HEASARC

• High Energy Astrophysics Science Archive Research Center, @GSFC, since 1990.
• Extremely energetic cosmic phenomena ranging from black holes to the Big Bang.
• Chandra, XMM-Newton, Fermi, Suzaku, NuSTAR, INTEGRAL, ROSAT, Swift, & more than 20 others.
• Merged with Legacy Archive for Microwave Background Data Analysis (LAMBDA) in 2008 (CMBR): WMAP, COBE, ACT, etc.

http://heasarc.gsfc.nasa.gov/



• ADS = Astrophysics Data System (@SAO/CfA).
• Indexes 12 million publications in astronomy, physics, arXiv.
• Complete coverage of astronomy and refereed physics literature.
• Tracks citations, institutional and telescope bibliographies, links to data products (back to the other archives).
• New interface and API integrating ORCID, full-text search, analytics.



More…

• There are other archives (based at these centers), not necessarily NASA-funded, that follow this model.
  – Ex: Pan-STARRS, VLA-FIRST @STScI
  – Ex: Palomar Oschin wide-field survey @IRSA:
    • Zwicky Transient Facility (2017+)
    • intermediate Palomar Transient Factory (iPTF; 2013-2016)
    • Palomar Transient Factory (2009-2012)
• There are (of course) many other non-NASA archives (SDSS, NRAO, …) and non-US archives (Simbad, ESA, ESO, …).
• Also, observers can deliver data back to these centers for distribution (which may include data beyond original program).


Lessons Learned…


Ease of Access

• Researchers at all levels (team members, emeriti, summer students) need to be able to get and use data.
• Intuitive, web-based interface.
  – No extra software installation.
  – Visualization of resources, data, tools…
  – Easy choices to “just give me the table”, etc.
• Help needs to be there when users need it, easily found or promptly answered.


Support

• Need to have knowledgeable staff, who have done science with the data products, who can (a) find problems; (b) pass on valuable experience to new users.
• Helpdesk:
  – Speed and accuracy matter!
  – Questions can be complex.
• Documentation:
  – Tools/data releases.
  – Documentation updates in response to tickets.
• Demos:
  – Live (AAS, ADASS, DPS, etc.).
  – Video tutorials (IRSA has >60 videos; >4500 views total).
• The complexity of Science User needs increases with time.


Visualization

• Data, catalogs, plots
• What do you have?
• What do I need?
  – What I know I need…
  – What did I just find?

[Screenshots: Finder Chart; Spectrum visualization; Catalog Search]


Ease of Use: High Level Science Products

• Greatly enhance the science return of the archives.
• Hubble Legacy high-level science products (HLSP) are used 10x as much as typical pipeline products.
• Make complex data sets accessible to a wider audience of researchers.
• Expand the use of large, coherent projects:
  – Hubble Treasury
  – Spitzer Legacy, Exploration Sci
• Generated by the community or by the center.

[Image: Spitzer/GLIMPSE]


Ease of Use: Multi-Survey

• Combining information across wavelengths, surveys, missions…
• Source lists from entire missions:
  – Spitzer, Hubble, Chandra, Herschel, WISE...


NED Science Example

• In context of assessing NED completeness, looking at fusion of GALEX, SDSS, 2MASS, WISE, …
• Found super-luminous spiral galaxies!
• Ogle et al., 2016, ApJ, 817, 109


Changing Mission

• This is a result that came from looking at what was in the archive already.
• As data get bigger and bigger, won’t be able to pull data out of the archive to work with it.
• Mission evolving from “search-and-retrieve” to “do [some] analysis in situ.”
• Science discoveries waiting in the archives that were never imagined or expected by the mission or even program PIs.


Community Feedback

• Need to talk to community to find out what is needed, wanted, wished for.
• Mission members, user committees, surveys, helpdesk, talking at conferences, and review cycles all feed into setting priorities.


Commitment to Archival Research

• NASA as a whole (+ sometimes missions) explicitly funds archives, and archival research: NASA ADAP (Astrophysics Data Analysis Program).
• Well-designed archive & products can greatly enhance research value of the data set.
  – Reducing barriers: finding data, making data accessible (reliability, units, file format, artifacts, documentation).
• NASA enables new ideas of things to do with older data.
• NASA has strong tradition of active collaboration between missions and archives.
  – Thinking about archive during the mission!


Ease of Use: VO

• VO = Virtual Observatory, IVOA = International Virtual Observatory Alliance
  – Standardized protocols for interoperability between archives (i.e., NOT the applications that use the protocols).
  – Data discovery.
• Use interface you know, to get to data elsewhere.
• Interoperability of tools, within archives and across archives.
• (People) communication and collaboration across archives:
  – Astronomy Data Centers Executive Committee (ADEC).
  – US Virtual Observatory Alliance (USVOA).
  – NASA Astronomical Virtual Observatories (NAVO).
• NAVO:
  – Provide comprehensive and consistent access to all NASA data through VO protocols.
  – Coordinate NASA interactions with international and national VO communities.
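
To make "standardized protocols" concrete, here is a minimal sketch of one of them, the IVOA Simple Cone Search convention, in which a position query is an HTTP GET with RA, DEC, and SR (search radius) given in decimal degrees. The endpoint URLs and the `cone_search_url` helper are hypothetical placeholders, not any archive's real service addresses; the point is only that the same three parameters work against any compliant archive.

```python
from urllib.parse import urlencode


def cone_search_url(base_url: str, ra_deg: float, dec_deg: float, radius_deg: float) -> str:
    """Build a cone-search query URL from RA, DEC and SR in decimal degrees."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be within [0, 360] degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be within [-90, 90] degrees")
    if not 0.0 <= radius_deg <= 180.0:
        raise ValueError("SR must be within [0, 180] degrees")

    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})

    # A base URL may already carry its own query parameters.
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return f"{base_url}{separator}{query}"


# Hypothetical endpoints (assumptions for illustration only).
ARCHIVES = {
    "archive_a": "https://archive-a.example/scs",
    "archive_b": "https://archive-b.example/cone?catalog=demo",
}

if __name__ == "__main__":
    # The same position and radius, sent to two different archives.
    for name, base in ARCHIVES.items():
        print(name, cone_search_url(base, 10.5, -20.25, 0.1))
```

Because only the base URL changes, a tool written once can reach data at any archive that implements the protocol, which is the sense of "use interface you know, to get to data elsewhere."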


IRSA VO Weekly Queries

[Figure: smoothed weekly queries, 2014 to 2016]

VO queries taking off!!


HEASARC + MAST + IRSA + NED

[Figure: smoothed daily rate, 2014 to 2016]

VO queries taking off!!


Ease of Use: API

• Application Program Interface.
  – (e.g., call by programs, scripts, command line)
• Allows scripted access to archive data.
• Enables complex projects.
• Enables rapid queries…
• (also inadvertent DoS, so need to watch and throttle!)

[Figure: # of Unique Users per Month (1 to 10000, log scale) vs. log(GB Downloaded) (−6 to 4). Annotations: “Lots of users, small requests”; “Few users, enormous requests!”]
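
The "inadvertent DoS" point applies on the client side too: a script that loops over thousands of positions can flood an archive without meaning to. Below is a minimal sketch of a polite client-side throttle, assuming a simple "at most one request per fixed interval" policy. The class name and the interval are illustrative choices, not a policy published by any of the archives.

```python
import time


class MinIntervalThrottle:
    """Space out calls so that at most one happens per `min_interval` seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        if min_interval <= 0:
            raise ValueError("min_interval must be positive")
        self.min_interval = min_interval
        self._clock = clock      # injectable, so the behaviour can be tested
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._clock()
        self._next_allowed = now + self.min_interval
        return delay


if __name__ == "__main__":
    throttle = MinIntervalThrottle(min_interval=0.5)
    for target in ["target-1", "target-2", "target-3"]:
        throttle.wait()
        # A real script would issue its archive query here.
        print("querying", target)
```

Calling `wait()` before each request keeps a batch job in the "lots of users, small requests" regime from the archive's point of view, even when the total amount of data requested is large.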


Operational Issues

• Keep archive running while making it better.
  – Assembling the plane while in flight!
• Growing audience, usage within existing resources.
  – Be efficient in how we use resources.
  – Use the same software across multiple data sets (X. Wu, earlier).


Innovations

• Interactive UI, using pieces developed by others.
• Machine learning.
  – NED: data embedded in free-form text, tables not standard (e.g., RA, Ra, ra, R.A., …); pilot project to apply ML to classify data and facilitate extraction.
• Improving scalability, extensibility, data prospecting...
• Greater integration of functionality and content across systems.
  – ADS: ORCID claiming; search by object via SIMBAD TAP service; embedding of publisher images via APIs.
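
To illustrate the non-standard table problem mentioned for NED (RA, Ra, ra, R.A., …), here is a small rule-based sketch that maps header variants to one canonical column name. This is not NED's pilot ML method, which the slides do not describe; it is a hand-written baseline, and the alias table is an assumption chosen for illustration. It shows why a fixed list does not scale to free-form literature.

```python
import re
from typing import Optional

# Hypothetical alias table (an assumption, not taken from NED).
ALIASES = {
    "ra": "ra",
    "rightascension": "ra",
    "dec": "dec",
    "decl": "dec",
    "declination": "dec",
}


def canonical_column(header: str) -> Optional[str]:
    """Map a table header such as 'R.A.' or 'Ra (deg)' to a canonical name."""
    # Drop a parenthesised or bracketed unit, e.g. "RA (deg)" -> "RA ".
    without_units = re.sub(r"[\(\[].*?[\)\]]", "", header)
    # Keep letters only, lower-cased: "R.A." -> "ra".
    key = re.sub(r"[^a-z]", "", without_units.lower())
    return ALIASES.get(key)


if __name__ == "__main__":
    for header in ["RA", "Ra", "ra", "R.A.", "Right Ascension", "Dec (deg)", "Flux"]:
        print(repr(header), "->", canonical_column(header))
```

Any header outside the alias table returns `None`, which is the gap a learned classifier would be meant to close.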


Technical: Data Ingest

• Spitzer Legacy programs changed culture by requesting products be delivered back to the community.
• Now a common feature of Spitzer proposals.
• Brings these data to larger audience via central archive.
• IRSA has to have resources to ingest these products.


Technical: Data Ingest

• Expense is not necessarily in TB but in education of the people delivering the products.
  – Need to have delivery well-organized and documented.
  – People who have done it a lot: easy.
  – People new at this: not necessarily easy.
  – Tools to help people new at this.
  – Complexity is not just about size!
• Can end up with, e.g., optical and UV data available through the Spitzer archive (SINGS, LVL).


What’s Next: Big Data

• “Big data” in some missions, certainly “big data” across all NASA Archives.
• To some extent, have already been working with “big data”!
• IRSA has already invested in data visualization services to help people identify and experiment with data quickly.
• Planning: identifying the most critical needs of users, including increased analysis at the archive facilitated by user workspaces.
• Richer services for in situ analysis.
• All archives thinking about this in some way.


Summary

• Long-term, stable archives greatly increase the return on observatory investment. (Doubles papers!)
• Robust support for both expert and novice users pays off.
• User support by instrument experts is crucial.
• Standardization of tools within an archive increases efficiency.
• Interoperability between archives benefits everyone.
• High level data products can expand the reach of large data sets.
• Shift in approach from “search and retrieve” to “analyze in situ”.