The Heatmap - Why is Security Visualization so Hard?

DESCRIPTION

The extent and impact of recent security breaches show that current approaches are just not working. But what can we do to protect our business? We have long advocated monitoring as a way to detect subtle, advanced attacks. However, products have failed to deliver on this promise: current solutions scale in neither data volume nor analytical insight. In this presentation we will explore why it is so hard to come up with a security monitoring (or shall we call it security intelligence) approach that helps find sophisticated attackers in all the data collected. We will explore the question of how to visualize a billion events, and look at a number of security visualization examples to illustrate the problem and some possible solutions. These examples will also help illustrate how data mining and user experience design help us get a handle on the security visualization challenges, enabling us to gain deep insight for a number of security use-cases.

Citation preview

1. Security Visualization - Why is It So Hard?
Raffael Marty, CEO
ISF, Shanghai, China, November, 2014

2. Visualization - Heatmaps
Security. Analytics. Insight.

3. Visualization - Graphs

4. I am Raffy - I do Viz!
IBM Research

5. How Compromises Are Detected
- 27 days: average time to resolve a cyber attack
- 229 days: attackers in networks before detection
- 1.4: successful attacks per company per week
- $7.2M: average cost per company per year
Sources: Mandiant M-Trends Report; 2014 Threat Report

6. Our Security Goals
- Find Intruders and New Attacks
- Discover Exposure Early
- Communicate Findings

7. Why Visualization?
the stats ... the data ...
http://en.wikipedia.org/wiki/Anscombe%27s_quartet

8. Why Visualization?
http://en.wikipedia.org/wiki/Anscombe%27s_quartet

9. Visualize Me Lots (>1TB) of Data

11. Visualize Me Lots (>1TB) of Data
SecViz is Hard!

12. It's Hard - Understanding Data
- We don't understand the data / logs
- Single log entry: Mar 16 08:09:48 kernel: [0.00000] Normal 1048576 -> 1048576
- Absence of logs? Logging configuration?
- Collection of logs
- Understanding context (setup, business processes)
- Is this normal?
  2011-07-22 20:34:51 282 ce6de14af68ce198 - - - OBSERVED "unavailable" http://www.surfjunky.com/members/sj-a.php?r=44864 200 TCP_NC_MISS GET text/html http www.surfjunky.com 80 /members/sj-a.php ?r=66556 php "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.65 Safari/534.24" 82.137.200.42 1395 663 -

13. It's Hard - The Right Data
- Situational Awareness
- Security Monitoring
- Data Exfiltration
- DNS traffic
- Fraud
- HTTP header sequences
- Application logs
- DB logs
- context feeds!
- Application logs
- DLP
- Proxies
- Phishing et al.
- email logs
Are we focusing on the right data sources?
Everyone focuses on:
- Traffic flows
- IDS data
- Zero Days
- Botnet / Malware infections

14. It's Hard - Mapping the Data
Oct 13 20:00:05.680894 rule 57/0(match): pass in on xl1: 195.141.69.45.1030 > 217.12.4.104.53: 7040 [1au] A? mx1.mail.yahoo.com. (47) (DF)
  1. Understand all elements
  2. Which fields are important?
  3. Do we need more context?
  4. What do we want to see?
     - Time-behavior?
     - Relationships?
  5. How much data do we have? What graph will scale to that?

15. Visualize 1TB of Data - What Graph?
Figure: charts of firewall actions (drop, reject, NONE, ctl, accept) and event types (DNS Update Failed, Log In, IP Fragments, Max Flows Initiated, Packet Flood, UDP Flood, Aggressive Aging, Bootp, Renew, Log Out, Release, NACK, Conflict, DNS Update Successful, DNS record not deleted, DNS Update Request, Port Flood); axis ticks 1, 10000, 100000000.
How much information does each of the graphs convey?

16. It Is Hard - IP Addresses
FOCUS
Info-Viz = (figure)
Sec-Viz = (figure)

17. An Approach - And The Challenges

18. Data Visualization Workflow
Overview; Zoom / Filter; Details on Demand

21. Overview - The Heatmap
Matrix A, where a_ij are integer values mapped to a color scale.
Color scale: a_ij = 1, 10, 20, 30, 40, 50, 60, 70, 80, >90
Figure labels: rows, columns; example cell value 42.

22. Mapping Log Records to Heatmaps
May 5 23:57:50 pixl-ram sudo: pam_unix(sudo:session): session opened for user root by ram(uid=0)
Rows: root, ram, peg, sue
Columns: time, where t .. time bin
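
The mapping from log records to heatmap cells described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the talk: the regular expression `SUDO_RE`, the helper names `heatmap_counts` and `to_matrix`, the choice of the invoking user (`by ram`) as the row, the one-hour bin, and the default year (syslog timestamps carry none) are all hypothetical.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumed pattern for the sudo session line quoted on the slide;
# other log formats would need their own expression.
SUDO_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ sudo: .*"
    r"session opened for user \S+ by (?P<actor>\w+)\("
)


def heatmap_counts(lines, bin_minutes=60, year=2014):
    """Count records per (user, time bin). The year is an assumption."""
    counts = Counter()
    bin_seconds = bin_minutes * 60
    for line in lines:
        match = SUDO_RE.match(line)
        if match is None:
            continue  # skip records that are not sudo session openings
        ts = datetime.strptime(f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S")
        midnight = ts.replace(hour=0, minute=0, second=0)
        # Floor the time of day to the start of its bin.
        offset = int((ts - midnight).total_seconds()) // bin_seconds * bin_seconds
        counts[(match.group("actor"), midnight + timedelta(seconds=offset))] += 1
    return counts


def to_matrix(counts):
    """Turn the sparse counts into matrix A: one row per user, one column per bin."""
    users = sorted({user for user, _ in counts})
    bins = sorted({time_bin for _, time_bin in counts})
    matrix = [[counts.get((user, time_bin), 0) for time_bin in bins] for user in users]
    return users, bins, matrix


if __name__ == "__main__":
    sample = [
        "May 5 23:57:50 pixl-ram sudo: pam_unix(sudo:session): "
        "session opened for user root by ram(uid=0)",
    ]
    print(to_matrix(heatmap_counts(sample)))
```

Each value a_ij of the resulting matrix would then be mapped to a color. Swapping the `+= 1` for a sum or a set of distinct values gives the other measures a heatmap can carry.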

Mapping Log Records to Heatmaps
Each log record adds +1 to the heatmap cell for its user (row) and time bin t (column).

25. Why Heatmaps?
- Scales well to a lot of data (can aggregate ad infinitum)
- Shows more information than a bar chart
- Flexible 'measure' mapping
  - frequency count
  - sum(variable) [avg(), stddev(), ...]
  - distinct count(variable)

26. Why Heatmaps?
BUT information content is limited!
Aggregates too highly in time and potentially value dimensions

27. HeatMap Challenges - Sorting
- Random
- Alphabetically
- Based on values
- Similarity
  - What algorithm?
  - What distance metric?
- Leverage third data field / context?
Figure labels: random row order; rows clustered; user.

28. What's the HeatMap Not Good At
- Showing relationships -> link graphs
- Showing multiple dimensions and their inter-relatedness -> || coords

29. Graphs
Figure labels: SourceIP, DestIP; color = Port.

30. Graphs To Show Relationships

31. Some Graph Challenges
- How to map data to graph
- Don't scale to a few hundred (thousand) nodes
- What layout algorithm to choose?
- Node placement should be semantically motivated
- Graph metrics don't mean anything in security (centrality, etc.)
- Analytics needs interactive features, linked views
- Analytics is not a linear process
Figure labels: source, event, destination; fields shown include sourceIP, destIP, destPort, action, user, URL, source, port.

32. Backend Challenges
Different backend technologies (big data):
- Key-value store
- Search engine
- GraphDB
- RDBMS
- Columnar - can answer analytical questions
- Hadoop (Map Reduce) - good for operations on ALL data
Other things to consider:
- Caching
- Joins

33. Examples
Raffael.Marty@pixlcloud.com

34. Vincent
This heatmap shows behavior over time. In this case, we see activity per user. We can see that vincent is visually different from all of the other users. He shows up very lightly over the entire time period. This seems to be something to look into. We were able to find this purely visually, without understanding the data.

35. Attribution
Authentication Events: users over time
Who is behind these scans?
Challenges: Finding meaningful patterns
Graph credit: Tye Wells

36. Same Pattern For Sources From 4 Countries
Graph credit: Tye Wells

37. Firewall Heatmap

38. Intra-Role Anomaly - Random Order
Figure labels: users, time, dc(machines).

39. Intra-Role Anomaly - With Seriation

40. Intra-Role Anomaly - Sorted by User Role

41. Intra-Role Anomaly - Sorted by User Role
Roles: Administrator, Sales, Development, Finance

42. Intra-Role Anomaly - Sorted by User Role
Roles: Administrator, Sales, Development, Finance
Admin???

43. Graphs - A Story

45. Graphs - A Story
- This looks interesting. What is it?
- Green -> Port 53
- Only port 53?
- What IPs?
- What's the time behavior?
The graph doesn't answer these questions.

46. Graphs - A Story
- Adding a port histogram
- Select DNS traffic and see if other ports light up.

47. DNS Traffic - A Closer Look
Linked Views
- Histograms for Source, Port (Source), Destination
- ||-coord

49. select port 1900

51. port 80

52. After some exploration

53. Firewall Time Behavior
Rows: source (10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4)

54. Firewall Time Behavior
Color mapping: pass; block; block & pass

55. Firewall Time Behavior
Columns: t .. time bin - aggregation

56. High Frequency Sources Over Time
Color mapping: pass; block; block & pass

57. High Frequency Traffic Split Up
Panels: inbound, outbound
Rows: 192.168.0.201, 195.141.69.42, 195.141.69.43, 195.141.69.44, 195.141.69.45, 195.141.69.46, 212.254.110.100, 212.254.110.101, 212.254.110.107, 212.254.110.108, 212.254.110.109, 212.254.110.110, 212.254.110.98, 212.254.110.99, 62.245.245.139

58. Outbound Traffic - Some Questions To Ask
- What happened mid-way through?
- Why is anything outbound blocked?
- What are the top and bottom machines doing?
- Did we get a new machine into the network?
- Some machines went away?

59. Outbound Traffic - Some Questions To Ask
195.141.69.42

60. 195.141.69.42 - Interactions
Figure labels: action, port, dest.
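
The color mapping used in the Firewall Time Behavior slides (pass, block, or block & pass per source and time bin) can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `classify_cells`, the `(epoch_seconds, source_ip, action)` tuple layout, and the one-hour default bin are assumptions made for the example.

```python
from collections import defaultdict


def classify_cells(events, bin_seconds=3600):
    """Label each (source, time bin) cell as 'pass', 'block', or 'block & pass'.

    events: iterable of (epoch_seconds, source_ip, action) tuples (assumed layout).
    Actions other than 'pass' and 'block' are ignored.
    """
    seen = defaultdict(set)
    for timestamp, source, action in events:
        if action in ("pass", "block"):
            seen[(source, timestamp // bin_seconds)].add(action)

    labels = {}
    for cell, actions in seen.items():
        if len(actions) == 2:
            labels[cell] = "block & pass"  # both actions occurred in this bin
        else:
            labels[cell] = next(iter(actions))
    return labels


if __name__ == "__main__":
    demo = [
        (10, "10.0.0.1", "pass"),
        (20, "10.0.0.1", "block"),
        (4000, "10.0.0.2", "block"),
    ]
    for (source, time_bin), label in sorted(classify_cells(demo).items()):
        print(source, time_bin, label)
```

Keeping a set of actions per cell, rather than a count, is what makes the third "block & pass" color possible. A frequency count per cell would hide the fact that a source was both allowed and denied within the same bin.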

Inbound - Zooming in on Top Rows
rows 0,300

64. Inbound - Zooming in on Top Rows
rows 0,300
rows 200,260

65. Zooming in on Top Rows
Hardly any pass-block

66. Zooming in on Top Rows
Hardly any pass-block
Row labels: 212.254.110.66 and 212.254.110.96 through 212.254.110.127

67. Zooming in on Top Rows
Oct 22 14:20:08.351202 rule 237/0(match): block in on xl0: 66.220.17.151.80 > 212.254.110.103.1881: S 1451746674:1451746678(4) ack 1137377281 win 16384 (DF)
ao.lop.com: 66.220.17.151 - Spyware Gang (LOP)
http://www.freedomlist.com/forum/viewtopic.php?t=15724

70. Zooming in on Top Rows
212.254.110.102
Oct 16 13:14:05.627835 rule 0/0(match): pass in on xl0: 66.220.17.151.80 > 212.254.110.102.1977: S 1841864015:1841864019(4) ack 1308753921 win 16384 (DF)
pass in log quick on $ext from any to $honey

71. This Guy Sure Keeps Busy
212.254.144.40

72. This Guy Sure Keeps Busy
212.254.144.40
dest port

73. Recap
- Attackers are very successful
- Data can reveal adversaries
- We have a big data analytics problem
- We need the right analytics and visualizations
- Security visualization is hard
- Data visualization workflow is a promising approach
- Analytics is not a linear process
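
The firewall log lines quoted in the examples have to be broken into fields before they can be mapped to heatmap rows or graph nodes. The following is a hedged sketch of that step for the pf log lines shown in this deck. The regular expression `PF_RE` and the function `parse_pf_line` are assumptions written against those sample lines only, not a complete parser for the format.

```python
import re

# Assumed pattern, derived only from the sample lines in the slides.
PF_RE = re.compile(
    r"^(?P<time>\w{3}\s+\d+ [\d:.]+) rule (?P<rule>\d+)/\d+\(match\): "
    r"(?P<action>pass|block) (?P<direction>in|out) on (?P<iface>\w+): "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+)\s*>\s*"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
)


def parse_pf_line(line):
    """Return a dict of fields for a matching line, or None if it does not match."""
    match = PF_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the numeric fields so they can be sorted and binned as numbers.
    for key in ("rule", "sport", "dport"):
        record[key] = int(record[key])
    return record


if __name__ == "__main__":
    sample = (
        "Oct 22 14:20:08.351202 rule 237/0(match): block in on xl0: "
        "66.220.17.151.80 > 212.254.110.103.1881: S 1451746674:1451746678(4) "
        "ack 1137377281 win 16384 (DF)"
    )
    print(parse_pf_line(sample))
```

Once records are in this form, the fields of interest (for example source, action, and destination port) can be picked for whichever view comes next in the workflow.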