© 2013 IBM Corporation
InfoSphere Guardium Tech Talk:
Big Data Security Use Case: A Holistic Approach to Data Protection
Rodrigo Bisbal / [email protected]
Logistics
This tech talk is being recorded. If you object, please hang up and leave the webcast now.
We’ll post a copy of the slides and a link to the recording on the Guardium community tech talk wiki page: http://ibm.co/Wh9x0o
You can listen to the tech talk using audiocast and ask questions in the chat to the Q and A group.
We’ll try to answer questions in the chat or address them at the speaker’s discretion.
– If we cannot answer your question, please include your email so we can get back to you.
When the speaker pauses for questions:
– We’ll go through existing questions in the chat
Reminder: Guardium Tech Talks
Links to more information about this and upcoming tech talks can be found on the InfoSphere Guardium developerWorks community: http://ibm.co/Wh9x0o
Please submit a comment on this page for ideas for tech talk topics.
Next tech talk: WATCH THIS SPACE
Speakers:
Date & Time:
Register here:
Table of contents
– What is Big Data?
– Why Big Data?
– What to do with it?
– Access methods, new exposure
– How to secure it?
– Playing nice with the enterprise
– Q&A
There is an Explosion in Data and Real World Events
– 4.6 billion mobile phones worldwide
– 1.3 billion RFID tags in 2005; 30 billion RFID tags today
– 2 billion Internet users by 2011
– Twitter processes 7 terabytes of data every day
– Facebook processes 10 terabytes of data every day
– World Data Centre for Climate: 220 terabytes of Web data, 9 petabytes of additional data
– Capital market data volumes grew 1,750%, 2003-06
Information is Exploding…
– 2009: 800,000 petabytes
– 2020: 35 zettabytes
– 44x as much data and content over the coming decade
– 80% of the world’s data is unstructured
Source: IDC, The Digital Universe Decade – Are You Ready?, May 2010
Source: WHATRUNSWHERE.COM http://blog.whatrunswhere.com/big-data-online-marketing-strategy/
Why Big Data?
Volume: Be able to capture large amounts of unstructured data: stats, video, sensor data, messages, likes, etc.
Cost: It is cheaper to store data in a big data platform than in an RDBMS.
Real Time: Use analysis tools to go through large amounts of unstructured data very fast. Results are used to improve business processes and decisions.
Scalability: Be able to grow exponentially without the cost increases that come with warehousing.
……
© 2013 IBM Corporation, 14 November 2013
Why Big Data? Case study: Aviation Data
Jet sensors: Collect jet engine data (temperature, humidity, air pressure) to predict part failure and take preventative action. Reduce cost by pre-empting failure.
Reduce down-time: Preventative maintenance reduces down time, so more planes are available to serve customers.
By analyzing arrival/departure data, weather conditions and other data sources, airlines can better manage their fleets and schedules.
Happier customers: Fewer delays improve customer satisfaction, which increases customer loyalty and bookings.
By analyzing customers’ flying patterns, airlines can identify new routes and add other services that benefit customers and the airline.
Greener: More efficient jet engines consume less fuel and emit less CO2.
Case study: Facebook Messaging
▪ High write throughput
▪ Every message, instant message, SMS, and e-mail
▪ Search indexes for all of the above
▪ Denormalized schema
▪ A product at massive scale on day one
▪ 6k messages a second
▪ 50k instant messages a second
▪ 300TB data growth/month compressed
…
Case Study: Facebook “likes” on outdoor gear
USA 2.5M
MEX 300K
BRA 1.2M
PERU 100K
ARG 350K
SA 80K
INDIA 1.5M
RUSSIA 500K
UK 300K
Two security requirements in the era of big data
#1 Deploy Security Analytics
– Security analytics to predict, prevent and act on information in real time and through historical analysis
– Threats to physical and cyber assets must be understood and analyzed. Analyze in real time and based on persisted data.
#2 Scale Existing Technology to Big Data
– Existing cyber security strategies such as encryption and data activity monitoring must be applied to big data. For example, mask unstructured data types such as medical records or XML data, dynamically mask data from the Hadoop platform, or monitor all Hadoop activity and access patterns.
– Apply data security and privacy policies based on security analytics and business rules
[Figure: Big Data Analytics alongside Traditional Security Operations and Technology. Data source labels: Logs; Events; Alerts; Configuration information; System audit trails; External threat intelligence feeds; Network flows and anomalies; Identity context; Web page text; Full packet and DNS captures; E-mail; Business process data; Customer transactions; Social data (blogs, tweets, chats); Satellites; GPS tracking; Smart devices; Network traffic; Sensors; Images; Spreadsheets; Financial transactions; Telephone records]
Access Methods: create nyse_stocks table from the CSV file
Very easy to load data and create tables from CSV files
Automatic data type detection
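To make the idea of automatic data type detection concrete, here is a minimal Python sketch. It is not the implementation used by the tool shown in the slide; the column names, the sample values and the `infer_column_types` helper are all assumptions made up for illustration. It simply tries the narrowest type first and falls back to a string.

```python
import csv
import io


def infer_type(values):
    """Return 'int', 'float' or 'string' for a list of raw CSV strings."""
    for name, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue  # at least one value does not fit; try the next type
    return "string"


def infer_column_types(csv_text):
    """Map each header name to the narrowest type that fits every row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {col: infer_type([row[col] for row in rows]) for col in rows[0]}


# Hypothetical sample: the real nyse_stocks columns are not shown in the
# slides, and these values are invented.
SAMPLE = """symbol,trade_date,close,volume
IBM,2013-11-13,183.55,4704400
IBM,2013-11-14,182.21,6321500
"""

print(infer_column_types(SAMPLE))
```

The convenience cuts both ways: a file that loads this easily is also easy to read for anyone with access to it, which is the exposure the following slides address.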
Access Methods: run analysis script against Hadoop file
Find the average volume of IBM stock data. Very easy!
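The script shown in the talk is not reproduced in the slide text. As a rough stand-in, the same calculation can be sketched in plain Python over CSV text. The `symbol` and `volume` column names and the sample rows are assumptions, and this runs locally rather than against a Hadoop file.

```python
import csv
import io
from statistics import mean


def average_volume(csv_text, symbol):
    """Average the 'volume' column over the rows whose 'symbol' matches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    volumes = [int(row["volume"]) for row in reader if row["symbol"] == symbol]
    if not volumes:
        raise ValueError(f"no rows for symbol {symbol!r}")
    return mean(volumes)


# Invented rows, used only to exercise the function.
ROWS = """symbol,volume
IBM,100
IBM,300
XYZ,999
"""

print(average_volume(ROWS, "IBM"))
```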
Use InfoSphere Guardium Hadoop Activity Monitor to audit every transaction
Guardium report:
Monitor sensitive data access with InfoSphere Guardium
Authorized users group
Directories that contain sensitive data
Who is accessing my sensitive data?
Unauthorized user accessing sensitive data
Sensitive data directory
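The logic behind such a report can be sketched in a few lines of Python. This is not Guardium's policy engine or report definition; the group members, the directory path and the event fields below are hypothetical. The check is simply: the path is under a sensitive directory and the user is not in the authorized group.

```python
from posixpath import normpath

AUTHORIZED_USERS = {"svc_etl", "analyst1"}   # hypothetical group members
SENSITIVE_DIRS = ("/data/sensitive",)        # hypothetical HDFS directory


def is_under(path, directory):
    """True if path is the directory itself or lies somewhere below it."""
    path, directory = normpath(path), normpath(directory)
    return path == directory or path.startswith(directory + "/")


def unauthorized_accesses(events):
    """Yield events that touch a sensitive directory from outside the group."""
    for event in events:
        sensitive = any(is_under(event["path"], d) for d in SENSITIVE_DIRS)
        if sensitive and event["user"] not in AUTHORIZED_USERS:
            yield event


# Invented audit events.
EVENTS = [
    {"user": "svc_etl", "path": "/data/sensitive/cards.csv"},
    {"user": "guest", "path": "/data/sensitive/cards.csv"},
    {"user": "guest", "path": "/data/public/readme.txt"},
]

for event in unauthorized_accesses(EVENTS):
    print(event["user"], event["path"])
```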
Hadoop – Unauthorized MapReduce Jobs Report
This group contains authorized programs.
This report shows programs that are NOT in the group.
What applications are running on my system?
Who is running them?
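A minimal sketch of the idea behind this report, assuming job records with hypothetical `program` and `user` fields and an invented group of authorized programs. It answers both questions on the slide by listing each unapproved program together with the user who ran it. It is an illustration only, not the Guardium report definition.

```python
AUTHORIZED_PROGRAMS = {"wordcount", "daily_etl"}   # hypothetical group contents


def unauthorized_jobs(jobs):
    """Return sorted, de-duplicated (program, user) pairs not in the group."""
    return sorted(
        {(job["program"], job["user"])
         for job in jobs
         if job["program"] not in AUTHORIZED_PROGRAMS}
    )


# Invented job records.
JOBS = [
    {"program": "wordcount", "user": "analyst1"},
    {"program": "dump_all", "user": "guest"},
    {"program": "dump_all", "user": "guest"},
    {"program": "adhoc_scan", "user": "analyst1"},
]

for program, user in unauthorized_jobs(JOBS):
    print(program, user)
```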
InfoSphere Guardium pre-built reports
Log in as a user
The Hadoop section is on the View tab
The Hadoop section has all the pre-built reports
If you log in as admin, you will need to add the reports to the web console. You can add them to the “My New Reports” tab.
Where to protect?
[Figure: diagram labels: Sensitive Data; Distributed Hadoop Cluster; Traditional Sources; Data Streams; Web Pages; Social Networks; Business Intelligence Output; two “Protect Here” markers]
Securing the Hadoop Filesystem with InfoSphere Guardium Data Encryption:
• High level HDFS access policy easily implemented with Guardium Data Encryption
• Process aware
• User aware
Sample data file exploit:
• Data store files (CSV, images, encoded, etc.) sit in the filesystem
• Direct access can be used to extract data
• Mission critical data needs to be secured
A simple strings command is enough to extract card data from a file!
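To show why an unencrypted data file is so exposed, here is a small Python sketch that mimics what a `strings`-style scan does: it pulls runs of four or more printable ASCII characters out of raw bytes, then looks for 16-digit groups. The byte blob is invented and uses a well-known test card number; the regular expressions are an illustration, not a complete card-number detector.

```python
import re

# Runs of 4+ printable ASCII bytes, similar to what `strings` reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Sixteen digits in groups of four, optionally separated by space or dash.
CARD_LIKE = re.compile(rb"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def printable_strings(blob):
    """Return every printable run found in the raw bytes."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(blob)]


def card_like_numbers(blob):
    """Return substrings that look like 16-digit card numbers."""
    return [m.group().decode("ascii") for m in CARD_LIKE.finditer(blob)]


# Invented file content: binary noise around readable fields.
BLOB = b"\x00\x01name=Jane Doe\x00\xff4111 1111 1111 1111\x00\x02"

print(printable_strings(BLOB))
print(card_like_numbers(BLOB))
```

Nothing here needs Hadoop credentials or application access, only read access to the underlying file.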
Use Guardium Data Encryption to protect
• Define the user that is allowed to access the file
• Define the process that is allowed to access the files
• Specify the Effect: all FS operations: read, write, list, audit, encrypt, etc.
Create a policy to protect the directory:
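The three policy ingredients above (user, process, effect) can be modelled with a short Python sketch. This is a hypothetical data model for illustration only; it is not Guardium Data Encryption's policy syntax or API, and the user names, process paths and first-match-wins rule are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    users: frozenset       # who may act
    processes: frozenset   # through which executable
    actions: frozenset     # which file system operations
    effect: str            # "permit" or "deny"


def evaluate(rules, user, process, action, default="deny"):
    """First matching rule wins; anything unmatched gets the default."""
    for rule in rules:
        if (user in rule.users
                and process in rule.processes
                and action in rule.actions):
            return rule.effect
    return default


# Hypothetical policy: only the Hadoop service account, running through
# the Java process, may read and write the protected directory.
POLICY = [
    Rule(users=frozenset({"hdfs"}),
         processes=frozenset({"/usr/bin/java"}),
         actions=frozenset({"read", "write"}),
         effect="permit"),
]

print(evaluate(POLICY, "hdfs", "/usr/bin/java", "read"))
print(evaluate(POLICY, "root", "/usr/bin/strings", "read"))
```

Because the decision keys on the process as well as the user, even a privileged account is refused when it reaches the files through an unapproved tool.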
Use Guardium Data Encryption to protect ( cont. )
• Different policies can be used for different directories
• Centrally manage file system security across the entire enterprise
Policy has been applied to a directory: Guard point
This time the data exploit fails!
• Only the authorized process can access the files
• In this case the admin cannot read the file contents directly
• Policy allows for authorized processes, apps and backups to access the file
A simple strings command is not enough to extract card data from a file!
How to Secure It and Add Enterprise Value?
How to securely use mission critical data with big data?
Integrate Guardium Data Encryption logs with Guardium Activity Monitoring
- Guarded file system activity is logged in detail: user, action, process, object, timestamp
- In this raw form the logs’ usefulness is limited
- Read the native logs from a CSV stream, or use the Guardium messaging API to send them in real time
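Reading the native logs from a CSV stream can be sketched as follows. The five fields come from the slide (user, action, process, object, timestamp), but the column order, the ISO timestamp format and the sample lines are assumptions, and the Guardium messaging API is deliberately not shown here.

```python
import csv
import io
from datetime import datetime

# Assumed column order; the real log layout is not given in the slides.
FIELDS = ("timestamp", "user", "process", "action", "object")


def parse_gde_csv(stream):
    """Yield one normalized dict per CSV log line."""
    for row in csv.DictReader(stream, fieldnames=FIELDS):
        yield {
            "timestamp": datetime.fromisoformat(row["timestamp"]),
            "user": row["user"].strip().lower(),
            "process": row["process"].strip(),
            "action": row["action"].strip().lower(),
            "object": row["object"].strip(),
        }


# Invented log lines.
LOG = io.StringIO(
    "2013-11-14T10:15:00,ROOT,/usr/bin/strings,READ,/hadoop/data/cards.csv\n"
    "2013-11-14T10:16:30,hdfs,/usr/bin/java,write,/hadoop/data/cards.csv\n"
)

events = list(parse_gde_csv(LOG))
for event in events:
    print(event["timestamp"], event["user"], event["action"], event["object"])
```

Normalizing case and parsing timestamps up front is what makes the records easy to filter and to line up against other audit data later.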
Integrate GDE logs with Guardium!
- GDE audit data is now in Guardium and can be correlated with Hadoop Activity Monitoring
- Data is normalized for easy filtering
- Easily integrate with: Alerting, Workflow, Correlation engine, Quick Search, etc.
Sample Customer Questions about GDE:
Q1: How does this work at runtime? For example, if a file is encrypted and is used by a MR job, how is the decryption invoked (since MR does not like encrypted files)? Is there an API that one needs to call as part of the application? Or is there a plugin at the DFS layer that is invoked on access?
A1: Transparent. You simply define which processes and/or users and groups get access. Nothing changes for those “trusted” with access. No application API calls or code changes are required. This is a crucial benefit we are bringing.
Q2: Is the encryption at a block level? Is it sensitive to Hadoop "splits"? How does it work in concert with compression (especially BZ2 and CMX)?
A2: It is at the FS block level. HDFS writes to the local FS blocks. This is what we care about. We don’t care what HDFS does before actually doing the IO (file write/read) to the underlying FS.
Integrate GDE logs with Guardium!
Read the GDE audit stream and forward the logs into the Guardium audit database for:
- Alerting to QRadar, Syslog, ArcSight, etc.
- Correlation with audit data
- Guardium Quick Search
- Complete risk view
- Analysis of blocked operations
- etc.
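As one example of what the forwarded data makes possible, the sketch below counts blocked operations per user and process. The event fields, in particular `effect`, and the sample records are hypothetical; this is plain Python over invented data, not a Guardium report or query.

```python
from collections import Counter


def blocked_by_user(events):
    """Count denied operations per (user, process) pair."""
    return Counter(
        (event["user"], event["process"])
        for event in events
        if event["effect"] == "deny"
    )


# Invented, already-normalized audit events.
AUDIT = [
    {"user": "root", "process": "/usr/bin/strings", "effect": "deny"},
    {"user": "root", "process": "/usr/bin/strings", "effect": "deny"},
    {"user": "guest", "process": "/bin/cat", "effect": "deny"},
    {"user": "hdfs", "process": "/usr/bin/java", "effect": "permit"},
]

for (user, process), count in blocked_by_user(AUDIT).most_common():
    print(user, process, count)
```

A repeated denial for the same user and tool is the kind of pattern that would feed an alert or a risk view.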
InfoSphere Guardium integration with other IBM products
Master Data Management: InfoSphere MDM
Web Application Platform: WebSphere
Databases: DB2 [LUW, i, z, native agent], Informix, IMS
Data warehouses: Netezza, PureData, PureFlex
Big Data: BigInsights
SIEM: QRadar
Storage and Archival: Optim Archival, Tivoli Storage Manager
Endpoint Configuration Assessment and Patch Management: Tivoli Endpoint Manager
LDAP Directory: Security Directory Server
Static Data Masking: Optim Data Masking
Data Discovery/Classification: InfoSphere Discovery, Business Glossary
Help Desk: Tivoli Maximo
Event Monitoring: Tivoli Netcool
Software Distribution: Tivoli Provisioning Manager
Transaction Application: CICS
Database tools: Change Data Capture, Query Monitor, Optim Test Data Manager, Optim Capture Replay, InfoSphere Data Stage
Analytic Engines: InfoSphere Sensemaking
Business Intelligence: Cognos
[Figure: InfoSphere Guardium at the center, linked to the products above. Connector labels: open tickets; SNMP alerts; distribute STAPs; remediate vulnerability; send alert, audit, vulnerability; user and group mgmt; monitor end-user activity; leverage capture function; leverage audit change; share discovery & policies; share discovery; share discovery & classif.; monitor, audit, protect; monitor, audit; monitor, audit, archive; archive audit]
Resources
• E-book “Planning a security and auditing deployment for Hadoop”: http://www.ibm.com/software/sw-library/en_US/detail/I804665J74548G31.html
• Big Data Security and Auditing with IBM InfoSphere Guardium: http://www.ibm.com/developerworks/data/library/techarticle/dm-1210bigdatasecurity/
• Data Security best practices: A practical guide to implementing data encryption on IBM InfoSphere BigInsights: http://public.dhe.ibm.com/software/dw/bigdata/bd-datasecuritybp/Encryption_1.4.pdf
• Quick Start Edition for BigInsights: http://www.ibm.com/software/data/infosphere/biginsights/quick-start/
Information, training, and community
InfoSphere Guardium Tech Talks – at least one per month. Suggestions welcome!
InfoSphere Guardium YouTube Channel – includes overviews, technical demos, tech talk replays
InfoSphere Guardium newsletter
developerWorks forum (very active)
Guardium DAM User Group on LinkedIn (very active)
Community on developerWorks (includes discussion forum, content and links to a myriad of sources, developerWorks articles, tech talk materials and schedules)
Guardium Info Center (Installation, System Z S-TAPs, how-tos, more to come)
Technical training courses (classroom and self-paced)
InfoSphere Guardium Virtual User Group. Open, technical discussions with other users. Not recorded!
Send a note to [email protected] if interested.