© 2015 IHS. ALL RIGHTS RESERVED.
KNOWLEDGE ARCHITECTURE AND BIG DATA
How to Apply Knowledge Architecture to Big Data
David Meza, Chief Knowledge Architect, NASA Johnson Space Center
Federal Reserve, June 15, 2016
AGENDA
• Knowledge Architecture
• NASA Data Strategy
• Cognitive Computing
“The most important contribution management needs to make in the 21st Century is to increase the productivity of knowledge work and the knowledge worker.” PETER F. DRUCKER, 1999
To convert data to knowledge, a convergence of Knowledge Management, Information Architecture, and Data Science is necessary.
Knowledge Management
Data Science
Information Architecture
Knowledge Architecture
• The people, processes, and technology of designing, implementing, and applying the intellectual infrastructure of organizations.
• What is an intellectual infrastructure?
• The set of activities to create, capture, organize, analyze, visualize, present, and utilize the information part of the information age.
• Information + Contexts = Knowledge
• Information Architecture + Knowledge Management + Data Science = Knowledge Architecture
• KM without applications is empty (Strategy Only)
• Applications without KA are blind (IT based KM)
• Data Science transforms your data to knowledge
Knowledge Management
“Knowledge management is the process of capturing, distributing, and effectively using knowledge.”
This definition has the virtue of being simple, stark, and to the point. A few years later, the Gartner Group created a second definition of KM, which is perhaps the most frequently cited one (Duhon, 1998):
“Knowledge management is a discipline that promotes an integrated approach to identifying, capturing, evaluating, retrieving, and sharing all of an enterprise's information assets. These assets may include databases, documents, policies, procedures, and previously un-captured expertise and experience in individual workers.”
Information Architecture
The intent is to achieve a variety of capabilities to enable the Agency to efficiently acquire or generate, find and access, use and reuse, share and exchange, manage and govern, and store and retire our data.
Data Science
Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases (KDD).
The Knowledge Discovery in Databases (KDD) process is commonly defined with the stages: (1) Selection (2) Pre-processing (3) Transformation (4) Data Mining (5) Interpretation/Evaluation.
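The five KDD stages can be sketched as a chain of small functions. This is a minimal illustration only, not a method taken from the presentation: the records, the "KSC" label, the function names, and the choice of document-frequency counting as the mining step are all hypothetical assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical toy records standing in for short lessons-learned texts.
RAW_RECORDS: List[Dict[str, str]] = [
    {"center": "JSC", "text": "Valve  LEAK found during test"},
    {"center": "JSC", "text": "valve leak repeated after rework"},
    {"center": "KSC", "text": "Software timing fault"},
    {"center": "JSC", "text": ""},
]


def select(records: List[Dict[str, str]], center: str) -> List[Dict[str, str]]:
    """(1) Selection: keep only the records relevant to the question."""
    return [r for r in records if r["center"] == center]


def preprocess(records: List[Dict[str, str]]) -> List[str]:
    """(2) Pre-processing: drop empty texts, normalize case and whitespace."""
    return [" ".join(r["text"].lower().split()) for r in records if r["text"].strip()]


def transform(texts: List[str]) -> List[List[str]]:
    """(3) Transformation: turn each text into a list of word tokens."""
    return [t.split() for t in texts]


def mine(token_lists: List[List[str]]) -> Counter:
    """(4) Data mining: count in how many documents each term appears."""
    counts: Counter = Counter()
    for tokens in token_lists:
        counts.update(set(tokens))
    return counts


def evaluate(counts: Counter, min_docs: int) -> List[str]:
    """(5) Interpretation/Evaluation: keep terms that recur across documents."""
    return sorted(term for term, n in counts.items() if n >= min_docs)


def kdd_pipeline(records: List[Dict[str, str]], center: str, min_docs: int = 2) -> List[str]:
    """Run the five stages in order for one center."""
    selected = select(records, center)
    tokens = transform(preprocess(selected))
    return evaluate(mine(tokens), min_docs)


if __name__ == "__main__":
    print(kdd_pipeline(RAW_RECORDS, "JSC"))
```

Keeping each stage as its own function lets the mining step be swapped for another technique without touching selection or cleaning.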
Data Strategy
Key Recommendations: • Data Management • Unified Data Lifecycle • Data Governance • Data Analytics Lab • Data Fellows Program • Data Stewards
Data Strategy Framework
Challenge: Lack of an explicit data management framework, fragmented data lifecycle and lack of data integration
Example: No Agency-wide architecture and standards for information interoperability. Much of the data NASA produces is inaccessible or human-readable only, with no method to draw-in, parse, organize, or make use of this data.
Opportunity: Improved architecture, standards and accessibility permitting quicker and more effective collection, digitization and discovery; increased focus on mission-specific data needs and type-specific approaches
Recommendation: 1. Data Management 2. Unified Data Lifecycle 3. Data Governance Program

Challenge: Need for new emerging data analytics technologies and capabilities to address mission specific challenges
Example: Many of NASA’s current data systems are significantly outdated and cannot scale to meet demand.
Opportunity: Experimenting with new algorithms, applications, and techniques
Recommendation: 4. Data Analytics Lab

Challenge: Data expertise gap
Example: Data scientists are in low supply and high demand, and NASA will need to compete with industry to attract the best & brightest.
Opportunity: Collaborative partnerships to build internal capacity and expertise and utilize external talent, tools, and information
Recommendation: 5. Data Fellows Program

Challenge: Need to effectively address culture and policy issues alongside technology
Example: In many cases, individuals are not motivated to share data for collaborative use with others.
Opportunity: Increased cross-agency and cross-stakeholder ownership and approach to data management and data analytics challenges
Recommendation: 6. Data Stewards
KNOWLEDGE ARCHITECTURE – ANALYTICS FRAMEWORK
IT & Intellectual Infrastructure
Security, Data Quality, Workflow Management, Data Management, Resource Management
Data Products: • Predictions • Models • Visualizations • Decision Analysis • Wiki
Sources: • Sensor • Experimental • Computed (modeling & simulation)
Forms: • Digital • Text • Visual
Organization: • Structured • Semi-Structured • Unstructured
Functions: • Governance • Taxonomy • Ontology • Comm. Plan • Operations Management • Security • Master Data Management • Content Management • Metadata • Data Quality
Tools & Environments: • Large scale storage • RDBMS • Parallel RDBMS • NoSQL • Hadoop
Organization: • Structured • Semi-Structured • Unstructured
Tools & Environments: • Computation & data access • Data Mining • Text Mining • Optimization • Net Algorithm • New Algorithm • Visualization
Access Pattern: • Structured • Semi-Structured • Unstructured • Predictable • Unpredictable
Source → Data Acquisition & Creation → Data Management → Data Warehousing → Data Analytics, BI (Knowledge Extraction) → Knowledge Presentation and Visualization → User
“We have an opportunity for everyone in the world to have access to all the world’s information. This has never before been possible. Why is ubiquitous information so profound? It is a tremendous equalizer. Information is power.” ERIC SCHMIDT (FORMER CEO OF GOOGLE)
30% of total R&D spend is wasted duplicating research and work previously done. Source: National Board of Patents and Registration (PRH), WIPO, IFA
54% of decisions are made with incomplete, inconsistent and inadequate information. Source: InfoCentric Research
46%: Workers can’t find the information they need almost half the time. Source: IDC
Knowledge Architecture: The Next Phase
Push versus Pull
WHAT COULD YOU ACCOMPLISH IF YOU COULD:
• Empower faster and more informed decision-making
• Leverage lessons of the past to minimize waste, rework, re-invention and redundancy
• Reduce the learning curve for new employees
• Enhance and extend existing content and document management systems
JSC Knowledge Architecture Services:
• Analytics
• Web Platform for Analysis and Visualization
• NoSQL - Neo4j and MongoDB
• Visualization Services - Business Intelligence
• Repository Specific Search
• Wiki Farm
• Code Sharing and Project Collaboration
• Training
Contact Information
David Meza – [email protected]
Twitter - @davidmeza1
LinkedIn - https://www.linkedin.com/pub/david-meza/16/543/50b
Github – davidmeza1
Blog – davidmeza1.github.io
QUESTIONS?