ibm-analytics
© 2014 IBM Corporation
What you’ll learn…
The opportunity
Big data governance:
− Requirements
− How it works
− Capabilities
A holistic approach
Next steps
What Is Big Data?
Immense volume, variety and velocity of data, in context, beyond what was previously possible
Opportunity to derive new insights, challenged by questionable veracity
Volume
• 12 terabytes of Tweets created daily: Analyze product sentiment
• 350 billion meter readings per annum: Predict power consumption
Velocity
• 5 million trade events per second: Identify potential fraud
• 500 million call detail records per day: Prevent customer churn
Variety
• 100’s of video feeds from surveillance cameras: Monitor events of interest
• 80% of data growth is images, video, documents: Improve customer satisfaction
Veracity: Can I trust what I am seeing?
What Can You Do With Big Data?
Utilities • Weather analysis • Smart grid management
Retail • 360° View of the customer • Real-time promotions
Law Enforcement • Multimodal surveillance • Cyber security detection
Transportation • Logistics optimization • Traffic congestion
Financial Services • Fraud detection • 360° View of the customer
Information Technology • System Log Analysis • Cybersecurity
Health & Life Sciences • Epidemic early warning • ICU monitoring
Telecommunications • Geomapping/marketing • Network monitoring
So, How Are We Doing?
• 1 in 3: Make decisions on untrusted information
• 1 in 2: Don’t have necessary information
• 60%: Have more data than they can use
• 80%: Time spent per big data project to find, prepare, understand & defend information due to lack of context
66% of Americans in a recent survey don’t want personalized online advertising, increasing to 86% when you tell them the information you collect and store in order to do it.
Context, Agility and Security are Essential Requirements to Meet Business Objectives in a Big Data Environment
Agility: A business framework (policies) for determining how and where to use big data.
Context: Flexibility to establish and maintain context independent of the volume, variety and velocity of data.
Security: Protection of data privacy and access; compliance with data security and other regulatory requirements.
Context Requires Governance; Agility Requires a Unique Big Data Approach to Governance
Traditional approach: Govern data to the highest standard. Store it, then use it for multiple purposes.
(Data → Govern to Perfection → Repository → Use)
Big data approach: Understand data and usage. Govern to the appropriate level. Use it, and iterate.
(Data → Explore/Understand → Govern Appropriately → Use)
How does an organization achieve agility in creating and continually evolving a safe and secure context in big data environments?
Big Data Governance is a Holistic Approach
Cycle phases: PLAN, ACT, ASSESS
1. Define Business Problem: Begin by defining the business problem to solve with big data
2. Obtain Executive Sponsorship: Obtain executive sponsor to finalize priorities and goals
3. Align Teams: Update governance roles to account for big data
4. Understand Data Risk and Value: Categorize data to understand risk exposure
5. Implement Analytical / Operational Project(s): Implement planned projects with governed data search, preparation, defense and security
6. Measure Results: Assess governance results and adjust
Governed data scenarios: Find, Prepare, Defend, Secure and Comply
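The "categorize data to understand risk exposure" step above can be illustrated with a small sketch. This is only an assumption about what such a categorization might look like, not a description of any IBM tool: the keyword lists, the three risk levels, and the functions `categorize_column` and `categorize_columns` are all hypothetical.

```python
# Hypothetical keyword lists; a real program would derive these from governance policy.
HIGH_RISK_KEYWORDS = ("ssn", "card", "account", "password")
MEDIUM_RISK_KEYWORDS = ("name", "email", "phone", "address")


def categorize_column(column_name: str) -> str:
    """Return 'high', 'medium', or 'low' based on keywords found in the column name."""
    lowered = column_name.lower()
    if any(keyword in lowered for keyword in HIGH_RISK_KEYWORDS):
        return "high"
    if any(keyword in lowered for keyword in MEDIUM_RISK_KEYWORDS):
        return "medium"
    return "low"


def categorize_columns(column_names):
    """Map each column name to its assumed risk category."""
    return {name: categorize_column(name) for name in column_names}
```

Calling `categorize_columns(["card_number", "customer_email", "meter_reading"])` labels the three columns high, medium, and low respectively. Name-based matching is deliberately crude; it only shows how a risk category could be attached to each data element before deciding how strictly to govern it.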
Key Data Scenarios for Big Data Governance
Find: Establish context to find, visualize, and understand data for improved decision making
Prepare: Understand context to extract, cleanse, integrate and monitor data properly, to increase integrity and trustworthiness for subsequent usage
Defend: Build confidence in information by making it defensible against challenges
Secure and Comply: Protection of data privacy and access; compliance with data security and other regulatory requirements
Analytical use Operational use
Key Data Scenarios for Big Data Governance
Find: Establish context to find, visualize, and understand data for improved decision making
The Cost is High: 80% of data scientists’ time on big data projects is spent finding and preparing data
Capabilities to Consider
• Connectivity to sources
• Real-time queries (SQL, etc.)
• Enterprise search
• Automated data discovery
• Data profiling
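Data profiling, one of the Find capabilities, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the behavior of any particular product: records are plain Python dicts, empty strings count as nulls, and the function names `profile_column` and `profile_table` are hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, nulls, distinct values, most common value."""
    # Assumption: None and the empty string both count as missing.
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    most_common = counts.most_common(1)[0][0] if counts else None
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": most_common,
    }


def profile_table(rows):
    """Profile every column that appears in a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}
```

A profile like this gives a first view of completeness and cardinality, which is the kind of context the Find scenario aims to establish before data is used.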
Key Data Scenarios for Big Data Governance
Prepare: Understand context to extract, cleanse, integrate and monitor data properly to increase integrity and trustworthiness for subsequent usage
The Risk is Real
Capabilities to Consider
• Highly scalable data integration
• Define terms and policies
• Data cleansing
• Quality dashboarding
• Rich annotation
Key Data Scenarios for Big Data Governance
Defend: Build confidence in information by making it defensible against challenges
The Risk is Real: 1 in 3 make decisions on untrusted information
Capabilities to Consider
• Maintain data lineage
• Data quality dashboarding
• Master data management
Key Data Scenarios for Big Data Governance
Secure and Comply: Protection of data privacy and access; compliance with data security and other regulatory requirements
The Risk is Severe: $200 million just to replace cards!
Capabilities to Consider
• Secure data at rest and in motion
• Data masking
• Governed data retention
• Test data management
• Governance reporting
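Data masking, listed among the Secure and Comply capabilities, can be illustrated with a short sketch. It shows one possible technique, keyed-hash pseudonymization with Python's standard `hmac` module, and is not a description of how any IBM product masks data. The function name `mask_value`, the 12-character token length, and the choice to keep the last four characters visible are assumptions made for illustration.

```python
import hashlib
import hmac


def mask_value(value: str, secret_key: bytes, keep_last: int = 4) -> str:
    """Replace a sensitive value with a keyed-hash token, keeping the last few characters."""
    # HMAC-SHA256 makes the token deterministic for a given key, so masked
    # values can still be joined or counted without exposing the original.
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    visible = value[-keep_last:] if keep_last > 0 else ""
    return f"{digest[:12]}-{visible}"
```

Because the same input and key always yield the same token, masked test data keeps its referential integrity. The key itself must be protected, since anyone holding it can recompute tokens for guessed values.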
IBM Big Data Governance Offers a Golden Opportunity
• 4 out of 5: Organizations rated their decision making as 7 or higher on a scale of 1 to 10
• 3X: Organizations are improving at 3 times the rate of competitors
• 77%: Of organizations show high or very high levels of trust
Source: The Big Data Imperative: Why Information Governance Must Be Addressed Now, Aberdeen Group, Dec 2012
All Hadoop Vendors Talk About Their Big “Data Lake”. ONLY IBM Delivers Consumable Big Data From The Swamp.
Hadoop Data Swamp → Clean Hadoop Lake
IBM Big Data Governance, including quality, security, and data lineage, transforms your Hadoop Data Swamp into a consumable Big Data Lake.
A Complete Big Data Solution Is More Than Just An Engine
© 2013 IBM Corporation
IBM Teradata Pivotal INFA Cloudera Horton
Hadoop Distribution Horton
Hadoop Available via Appliance ORCL & HP Teradata
Hadoop SQL Engine Postgre
Streaming Data Flume/ Storm
Flume/ Storm
Data Exploration Tools
Enterprise Reporting
Data Provisioning Tools IBM, INFA Scripting Talend
Security Monitoring Protegrity
ELT, ETL & Replication IBM, INFA Talend
Metadata & Lineage Revelytix
Profile & Cleanse (native) IBM, INFA Talend
Hadoop Matching (native) IBM, INFA
Reference Data Mgmt.
Data Masking on Hadoop IBM, INFA
Archiving on Hadoop
IBM Software, Information Management
Sui Southern Gas Company
Mitigates Business Risk Through Insights Into Supply and Demand
Chemicals & Petroleum, Energy & Utilities
The transformation: Deployed an analytics solution that overlays digital maps with real-time operational and financial data, enabling SSGC to analyze data in a real-world context.
• Reduces reporting time from 2 to 3 days to minutes
• Enables timely analytics combining real-time operational and geographic data from over 5000 sources
• Single source of information that is reliable and provides better clarity into the supply chain
“The IBM analytics solution greatly improves our ability to define and monitor business KPIs, and it brings much greater transparency to reporting. We now have a single version of the truth and a single comprehensive report for each topic.”
- Irfan Zafar, Chief Technology Innovation Officer and Senior General Manager of Customer Services, Sui Southern Gas Company Limited
Legal Disclaimer
• © IBM Corporation 2014. All Rights Reserved.
• The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy, which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.
• References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.
• If the text contains performance statistics or references to benchmarks, insert the following language; otherwise delete: Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
• If the text includes any customer examples, please confirm we have prior written approval from such customer and insert the following language; otherwise delete: All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.
• Please review text for proper trademark attribution of IBM products. At first use, each product name must be the full name and include appropriate trademark symbols (e.g., IBM Lotus® Sametime® Unyte™). Subsequent references can drop “IBM” but should include the proper branding (e.g., Lotus Sametime Gateway, or WebSphere Application Server). Please refer to http://www.ibm.com/legal/copytrade.shtml for guidance on which trademarks require the ® or ™ symbol. Do not use abbreviations for IBM product names in your presentation. All product names must be used as adjectives rather than nouns. Please list all of the trademarks that you use in your presentation as follows; delete any not included in your presentation. IBM, the IBM logo, Lotus, Lotus Notes, Notes, Domino, Quickr, Sametime, WebSphere, UC2, PartnerWorld and Lotusphere are trademarks of International Business Machines Corporation in the United States, other countries, or both. Unyte is a trademark of WebDialogs, Inc., in the United States, other countries, or both.
• If you reference Adobe® in the text, please mark the first use and include the following; otherwise delete: Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
• If you reference Java™ in the text, please mark the first use and include the following; otherwise delete: Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
• If you reference Microsoft® and/or Windows® in the text, please mark the first use and include the following, as applicable; otherwise delete: Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both.
• If you reference Intel® and/or any of the following Intel products in the text, please mark the first use and include those that you use as follows; otherwise delete: Intel, Intel Centrino, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
• If you reference UNIX® in the text, please mark the first use and include the following; otherwise delete: UNIX is a registered trademark of The Open Group in the United States and other countries.
• If you reference Linux® in your presentation, please mark the first use and include the following; otherwise delete: Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.
• If the text/graphics include screenshots, no actual IBM employee names may be used (even your own), if your screenshots include fictitious company names (e.g., Renovations, Zeta Bank, Acme) please update and insert the following; otherwise delete: All references to [insert fictitious company name] refer to a fictitious company and are used for illustration purposes only.