LAB1-R04 – Analytics and Detection through Coding, Scripting and Organizing Data
Post-Conference Summary
Mischel Kwon
Dilan Bellinghoven
Brian Kwon
Matt Norris
David Smith
Table of Contents
INTRODUCTION
    Thank you for attending our Lab at RSA!
LAB SUMMARY
    Key Takeaways
    Lessons Learned
    Further Resources
    Take-Home Exercise
Introduction
Thank you for attending our Lab at RSAC!
We appreciate you spending time with us during RSA Conference 2017. We hope you learned something new from our Analytics and Detection through Coding, Scripting and Organizing Data Lab. We had a lot of interaction during the Lab which, to us, was a sign that people were engaged and interested. We’re excited to have this chance to recap what we covered and share further information with you.
Our Lab was designed to expose you to a new approach to aggregating and organizing alerts and data to better identify attack patterns. It showed you how to put the data in a ticketing system like JIRA so that you could capture, share, and use the data in the future. This summary contains key takeaways, lessons learned, key feedback, and further resources should you wish to delve deeper into analytics and detection through coding, scripting, and organizing data.
See you at the Conference next year!
Mischel Kwon, Dilan Bellinghoven, Brian Kwon, Matt Norris, and David Smith
MKA Cyber and Phantom Cyber
Lab Summary
Key Takeaways
ANALYZING DATA OUTSIDE OF A SIEM
Alerts by themselves do not tell an analyst much. SIEMs can be useful in extracting information from alerts, but they are too often in a state of neglect. Being able to perform as many SIEM functions as possible with alternative tools and methods can therefore offset those deficiencies and keep operations running when the SIEM is disrupted. A high volume of alerts still poses a challenge: how do analysts make sense of such massive amounts of data without a SIEM? One approach is to implement a standard process organized by patterns of attack. We call these patterns “use cases” and break them down further into specific activities called “scenarios”. Content, meaning specific rules, signatures, and indicators, is then written to detect those scenarios. A hierarchical view of this idea is shown below.
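As a rough stand-in for that hierarchy, the use case / scenario / content relationship can be sketched as a small Python data structure. The names are illustrative: the Malware entries come from the Lab's EXE-file-header example, while the Data Exfiltration content line is a hypothetical.

```python
# Sketch of the use case -> scenario -> content hierarchy.
DETECTION_CATALOG = {
    "Malware": {                                   # use case: a category of activity
        "EXE file header found in environment": [  # scenario: specific activity
            # content: rules/signatures/indicators written to detect the scenario
            "Snort rule aggregated by signature and destination IP",
        ],
    },
    "Data Exfiltration": {
        "Unusual large upload": [
            "Threshold rule on outbound byte counts",  # hypothetical content
        ],
    },
}

def content_for(use_case, scenario):
    """Return the detection content written for a given scenario."""
    return DETECTION_CATALOG.get(use_case, {}).get(scenario, [])

print(content_for("Malware", "EXE file header found in environment"))
```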
AGGREGATING ALERTS TO FIND PATTERNS
Many methods by which a large volume of alerts can be organized into a useful format exist, and many of those are likely to be at your fingertips right now. Several command‐line tools which are built into Unix, such as “grep”, “awk”, “sort”, and “uniq”, can perform such operations. For instance, a file containing a high volume of Snort alerts can be parsed and rearranged to reveal a variety of useful information, such as the top ten highest‐frequency destination IP addresses, the type of alerts, peak traffic times, and much more. Though it requires some degree of Unix aptitude, this method can be highly effective.
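In the same spirit, the stack-and-sort idea behind those command-line pipelines can be sketched in a few lines of Python. This is a rough equivalent of `grep | awk | sort | uniq -c | sort -rn | head`, assuming the standard "fast" alert format in which each line ends with `SRC_IP:PORT -> DST_IP:PORT` (IPv4 only).

```python
from collections import Counter

def top_destinations(alert_lines, n=10):
    """Count destination IPs in Snort 'fast' alert lines, most frequent first."""
    counts = Counter()
    for line in alert_lines:
        if "->" not in line:
            continue
        dst = line.rsplit("->", 1)[1].strip()  # 'DST_IP:PORT'
        counts[dst.split(":")[0]] += 1         # keep just the IP
    return counts.most_common(n)

alerts = [
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 166.62.112.150:80",
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 81.177.139.111:80",
    "03/16-07:30:02.740000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access "
    "[**] {TCP} 192.168.202.79:50771 -> 166.62.112.150:80",
]
print(top_destinations(alerts, n=2))
# [('166.62.112.150', 2), ('81.177.139.111', 1)]
```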
USING EXISTING TOOLS TO STORE AND SHARE YOUR FINDINGS
Though command‐line tools are useful, often it is not enough to see the patterns just once by yourself. The real value lies in sharing that knowledge with the rest of the SOC. Communicating and documenting findings enhances future threat detection which improves the performance of the SOC. There are many free or low‐cost options available to a SOC that enable team‐wide knowledge sharing and collaboration. For this Lab, we chose JIRA as the platform since it is available in a free trial version and has a robust REST API system. Keep in mind that if automation is to be implemented, some sort of programming language is required. We chose Python simply because of familiarity, but many other options are equally effective, and likely at minimal to no cost. It is highly likely that both of these tools or their equivalent alternatives are already present in a SOC, and they can be leveraged to perform more robust organization and sharing of the alert data.
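To make the JIRA idea concrete, here is a minimal sketch of building an issue payload for JIRA's documented create-issue REST endpoint (`POST /rest/api/2/issue`). The `SOC` project key and `Task` issue type are assumptions for illustration; substitute the values configured in your own instance.

```python
import json

def build_issue_payload(dst_ip, alerts, project_key="SOC"):
    """Build the JSON body for JIRA's create-issue endpoint."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": "Snort alerts for destination " + dst_ip,
            "description": "\n".join(alerts),
            "issuetype": {"name": "Task"},
        }
    }

payload = build_issue_payload("166.62.112.150", ["alert_1", "alert_2"])
print(json.dumps(payload, indent=2))

# With the Requests library this could then be posted, e.g.:
# requests.post("http://<IP>:8080/rest/api/2/issue",
#               auth=("RSA", "mkarsa"), json=payload)
```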
AUTOMATING REPETITIVE TASKS FOR SOC OPTIMIZATION
Using tools like JIRA and Python, and with some basic programming experience, an analyst can automate the processing (e.g. parsing, rearranging, organizing, etc.) of the messy, raw Snort logs as we did using Unix command‐line tools. However, doing so programmatically opens up a far wider variety of functions that can be performed on the data, and enables repeatable data processing since the program can be re‐run again and again. In this Lab, we leveraged some of Python’s built‐in libraries to parse the logs, reorganize them, and feed them to JIRA via its REST API to create new tickets, all in a modularized, stepwise fashion. Developing automated workflows like this frees an analyst from performing menial tasks such as creating tickets and manually organizing data, and thus allows him or her to devote more time and energy towards tasks of higher value.
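A minimal sketch of that parse-and-reorganize step follows. The Lab itself used PyParsing; this regex-based stand-in only extracts enough from each line to bucket alerts by destination IP, in the spirit of the mapping the Lab's scripts produced.

```python
import re
from collections import defaultdict

# Matches a 'fast' alert line and captures the destination IPv4 address
FAST_ALERT = re.compile(r"\[\d+:\d+:\d+\].*->\s+(?P<dst>\d{1,3}(?:\.\d{1,3}){3})")

def map_alerts_by_dst(lines):
    """Group raw alert lines into {destination_ip: [alert, ...]}."""
    mapping = defaultdict(list)
    for line in lines:
        m = FAST_ALERT.search(line)
        if m:  # silently skip lines that do not parse
            mapping[m.group("dst")].append(line.strip())
    return dict(mapping)

alerts = [
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 81.177.139.111:80",
    "03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 166.62.112.150:80",
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 166.62.112.150:80",
]
by_dst = map_alerts_by_dst(alerts)
print(sorted(by_dst))  # ['166.62.112.150', '81.177.139.111']
```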
Lessons Learned
RESOURCEFULNESS CAN IMPROVE CONTINUITY OF OPERATIONS
An unfortunate trend among many SOCs today is an unhealthy dependence on a single tool or technology, such as a SIEM. Because of this dependence, a SOC might be left fumbling in the dark during a disruption in the SIEM. To avoid this loss of continuity – and to therefore avoid leaving the proverbial front door open – SOC analysts should be taught to be resourceful. They should be able to use the tools at their disposal and their experience to do what is necessary to ensure continuity.
As demonstrated in the exercises, in many cases, a plethora of suitable substitutes exist for any sort of data processing operation. Many of these substitutes are free and readily‐available to anyone. A SOC composed of analysts versed in a multitude of tools and technologies will be more resilient in the face of a failure in or unavailability of a certain tool. Such a SOC will be well‐prepared to meet the demands of the ever‐changing landscape of today’s cyber battlegrounds.
AUTOMATION GIVES ANALYSTS MORE TIME TO FOCUS ON HIGHER‐VALUE OPERATIONS
A SOC performs certain tasks that are tedious or repetitive and do not require a high level of skill, but that are nonetheless essential to its healthy operation. These are sometimes referred to as “tier one” tasks. Some examples include scavenging trusted cybersecurity blogs for new indicators of compromise; parsing, filtering, and reformatting sensor logs; or finding contextual data related to a certain indicator of compromise. Such tasks are often highly amenable to automation.
As shown in the Lab exercises, automating these tier one tasks can save SOC analysts a significant amount of time. It can often reduce the time spent on a task by orders of magnitude. These time savings are apparent in the final exercise in which we demonstrated the beginnings of a web scraper which crawls trusted cybersecurity blogs and scrapes from them indicators of compromise. A program such as this can reduce the duration of the task from hours to seconds. Automation can free up SOC analysts to focus on tasks of higher value, such as investigating an incident. With the automation of each task, the performance of the SOC is further optimized.
Further Resources
There are many resources available for learning the basics of the Unix command line. For one which covers the basics of processing log files with Unix, see this blog post: https://www.loggly.com/ultimate-guide/analyzing-linux-logs/
For a basic JIRA tutorial, see the official JIRA 101 page on Atlassian’s website: https://confluence.atlassian.com/jira064/jira-101-720412861.html
Python is no small subject to cover, but a good reference on using Python to automate some basic operations is Automate the Boring Stuff with Python by Al Sweigart. It is highly accessible to beginners with no prior Python experience.
Take‐Home Exercise
During the Lab, attendees were presented with a final take‐home challenge. This section reviews that challenge and provides the solution script so that you may check your solution.
INTRODUCTION
This exercise will introduce you to the basics of the Python Requests library, an API for interacting with the web from Python. In this demonstration you will learn how to pull links out of the FireEye blog.
Note that in order to implement this program, you may need to install the Python "Requests" and "BeautifulSoup" packages. In order to do this, it would be useful to first install "pip", a Python package manager (see https://packaging.python.org/installing/). After installing pip, you may use it to install Requests and BeautifulSoup from the command line with the following commands:
pip install requests
pip install bs4
From the command line, run the Python interpreter by typing “python” and hitting enter. This will open the interpreter shell. From the interpreter shell, import Requests and BeautifulSoup.
>>> import requests
>>> from bs4 import BeautifulSoup
Request "https://www.fireeye.com/blog.html" and store Requests object in r.
>>> r = requests.get("https://www.fireeye.com/blog.html", headers={'User-Agent' : 'Magic Browser'})
>>> print r.text
As you can see, contained in r is the entire blog web page. Next, create a BeautifulSoup object with our Requests object in order to traverse the HTML page in r.
>>> soup = BeautifulSoup(r.text, 'html.parser')
Build a list containing all the links in any of the elements of the HTML page of class "c05_item".
>>> list_items = soup.find_all(class_="c05_item")
>>> list_items[0].find('a').get('href')
Store all hrefs in 'articles'
>>> articles = []
>>> for item in list_items:
...     articles.append(item.find('a').get('href'))
>>> articles = ['https://www.fireeye.com' + i for i in articles]
>>> for link in articles:
...     print link
ASSIGNMENT
Using this strategy, find a way to request the ZeusTracker IP blocklist at (https://zeustracker.abuse.ch/blocklist.php?download=badips) and, with the resultant Requests object, find a way to create a CSV file of the following format:
ip,source,date,time
- IP: The IPs shown on the web page (e.g. "039b1ee.netsolhost.com", "0if1nl6.org", etc.)
- Source: The URL ("https://zeustracker.abuse.ch/blocklist.php?download=badips")
- Date: The runtime date (see Python's datetime library)
- Time: The runtime time (see Python's datetime library)
SOLUTION
import requests, csv, datetime, os
from bs4 import BeautifulSoup

def main():
    url = "https://zeustracker.abuse.ch/blocklist.php?download=badips"
    r = requests.get(url, headers={'User-Agent' : 'Magic Browser'})
    date = datetime.date.today()
    time = datetime.datetime.now().time()
    soup = BeautifulSoup(r.text, 'html.parser')
    html_text = soup.get_text()
    bad_ips = html_text.split('\n')[6:-1]
    # Write each IP to the CSV along with the source URL and runtime date/time
    with open(os.getcwd() + '/zeustracker_ip_csv.csv', 'wb') as outfile:
        csvwriter = csv.writer(outfile)
        for ip in bad_ips:
            csvwriter.writerow([ip, url, date, time])

if __name__ == '__main__':
    main()
#RSAC

LAB1-R04: Analytics and Detection through Coding, Scripting and Organizing Data

Mischel Kwon, Founder, MKACyber (@mkacyber)
Brian Kwon, Analyst, MKACyber
Matt Norris, Senior Analyst, MKACyber
Dilan Bellinghoven, SOC Analyst, MKACyber
David Smith, QA Manager, MKACyber
All hail the mighty SIEM!
Alerts are just alerts
How do we actually find patterns in the noise without drinking from the firehose?
Most SIEM products are in some state of neglect
So how do we get around this?
Organize the operation via a use case and scenario framework
Perform data and alert aggregation to see the signal through the noise
Establish workflow and content management via efficient use of ticketing systems
Automate basic analysis and enrichment
Threat based approach. Context?
Most of the alerts you see in a modern SOC are missing one thing… Context!
Categorizing as you go helps you specifically know what you’re looking for, when to look for it, and make sure you have detection across the spectrum
Avoids detecting down the rabbit hole and getting tunnel vision
But how do we tag things?
K.I.S.S. Well... as simple as you can in this case
Use Cases: A category for activity on a system (pieces of knowledge, or scenarios)
Scenarios: Individual pieces of activity on a system
Content: Specific rules, signatures, and indicators written to detect scenarios
Current MKA Use Cases
Web
Malware
High Value Targets
Unauthorized Access and Privilege Escalation
VPN
Data Exfiltration
Traffic Anomalies
Vulnerability
Break out of scenarios
Data Exfiltration:
Unusual large upload
Unusual large download
Unusual large transfer during off business hours
Mismatched file headers
Unusual network session lengths
Matches on keywords/PII
Unusual large outbound traffic to suspect country
Unusually large outbound traffic to competitor/adversary
Example: Widget Warehouse
You come on as a new analyst
They have a plethora of detection tools
A new site is coming online that has not been integrated into the larger security architecture, but has sensors deployed
IT has commented that they’ve seen a large amount of malware infections, but the SOC can’t do anything yet
What can we do?
The Macarena?
Throw our hands up and get mad at the engineers?
SSH into the sensor and try and set up manual detection in the meantime?
Example: File header EXE detection
Use Case: Malware
Scenario: EXE File header found in environment
Content: Snort rule aggregated by Signature and then destination IP
We’ll walk through this example in class
Alert aggregation
Stack and Sort to make patterns manageable
What do we care about in the data?
Or this?
Top 25 Snort Exe events:
cat alert.fast.maccdc2012_00000.pcap | awk -F"\[**\]" '{print $3;}' | sed -e 's/\[$//' -e 's/^\]//' | grep 'exe' | sort | uniq -c | sort -rn | head -25

105 [1:2059:1] WEB-MISC MsmMask.exe access
105 [1:2058:1] WEB-MISC MsmMask.exe attempt
82 [1:2326:3] WEB-IIS sgdynamo.exe access
82 [1:1610:11] WEB-CGI formmail arbitrary command execution attempt
69 [1:809:11] WEB-CGI whois_raw.cgi arbitrary command execution attempt
66 [1:2018403:7] ET TROJAN GENERIC Likely Malicious Fake IE Downloading .exe
65 [1:1165:9] WEB-MISC Novell Groupwise gwweb.exe access
47 [1:832:11] WEB-CGI perl.exe access
47 [1:2019714:2] ET CURRENT_EVENTS Terse alphanumeric executable downloader high likelihood of being hostile
47 [1:1648:7] WEB-CGI perl.exe command attempt
44 [1:1614:8] WEB-MISC Novell Groupwise gwweb.exe attempt
42 [1:1158:10] WEB-MISC windmail.exe access
42 [1:100000217:1] COMMUNITY WEB-MISC man2web cmd exec attempt
41 [1:2241:5] WEB-MISC cwmail.exe access
41 [1:1654:6] WEB-CGI cart32.exe access
41 [1:1536:8] WEB-CGI calendar_admin.pl arbitrary command execution attempt
37 [1:1762:5] WEB-CGI phf arbitrary command execution attempt
37 [1:1547:11] WEB-CGI csSearch.cgi arbitrary command execution attempt
21 [1:989:11] BACKDOOR sensepost.exe command shell attempt
21 [1:889:10] WEB-CGI ppdscgi.exe access
21 [1:2244:4] WEB-MISC VsSetCookie.exe access
21 [1:1655:6] WEB-CGI pfdispaly.cgi arbitrary command execution attempt
21 [1:1595:10] WEB-IIS htimage.exe access
7 [1:962:13] WEB-FRONTPAGE shtml.exe access
7 [1:2010704:8] ET WEB_SERVER Possible HP OpenView Network Node Manager ovalarm.exe CGI Buffer Overflow Attempt
Emerging Threat Hit on “.exe”
Connections to two hosts
Lots of connections between the hosts
cat alert.fast.maccdc2012_00000.pcap | grep 'ET TROJAN GENERIC Likely Malicious Fake IE Downloading .exe' | awk -F"\ " '{split($23,a,":");print " " a[1] " -> " $25;}' | sort | uniq -c | sort -rn
50 192.168.202.110 -> 192.168.27.203:8080
16 192.168.202.110 -> 192.168.27.102:3128
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.203\t8080" | wc -l
17894
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.102\t3128" | wc -l
3663
Scanning…
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.102\t3128" | awk '{print $8 " " $10;}' | sort | uniq -c | sort -rn | head
541 GET /
22 GET /<IMG
12 GET /scripts/
12 GET /cgi-bin/
6 GET /index.jsp
4 GET /scripts/index.php
4 GET /index.php
4 GET /cgi-bin/index.php
4 GET /cgi-bin/index.cgi
4 GET /../../../../../../../../../../../../etc/passwd
Scanning…
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.102\t3128" | awk '{print $8 " " $10;}' | sort | uniq -c | sort -rn | tail
1 GET././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././../../../../../../../../
1 GET ..\\..\\..\\..\\..\\..\\winnt\\win.ini
1 GET ..\\..\\..\\..\\..\\..\\windows\\win.ini
1 GET ..\..\..\..\..\..\winnt\win.ini
1 GET ..\..\..\..\..\..\windows\win.ini
1 GET ..\..\..\..\..\..\..\..\..\..\winnt\win.ini
1 GET ..\..\..\..\..\..\..\..\..\..\windows\win.ini
1 GET .
1 GET %.
1 CONNECT localhost:3141
Another event
Potential cmd.exe traffic:
cat alert.fast.maccdc2012_00000.pcap | grep 'ATTACK-RESPONSES Microsoft cmd.exe banner'
03/16-08:13:32.500000 [**] [1:2123:3] ATTACK-RESPONSES Microsoft cmd.exe banner [**] [Classification: Successful Administrator Privilege Gain] [Priority: 1] {TCP} 192.168.28.100:1138 -> 192.168.202.96:443
cat ../conn.log | grep -e '192.168.28.100\t1138\t192.168.202.96\t443'
1331903612.440000 CdDGxV32GCW7JfVHba 192.168.28.100 1138 192.168.202.96 443 tcp -0.060000 104 0 S1 - 0 ShADa 3 232 2 88 (empty)
1331904353.160000 CCuhkk1UHaue9Hp6y4 192.168.28.100 1138 192.168.202.96 443 tcp -66.440000 1814 225 SHR - 0 dDafA 12 2294 19 985 (empty)
cat ../conn.log | grep '192.168.28.100' | ../send_recv_counter.pl
192.168.202.110 -> 192.168.28.100 103688
192.168.202.110 <- 192.168.28.100 172126
Lots of other traffic between the hosts
‘encrypted’ connections with not much traffic
Multiple ways to skin the cat
Grep for “ EXE ” in the snort alerts
AWK destination IP
Sort and count unique lines (uniq)
grep -i exe sample.pcap
grep -i exe sample.pcap | awk -F '}' '{print $2}' | awk -F ' ' '{print $3}' | awk -F ':' '{print $1}'

grep -i exe sample.pcap | awk -F '}' '{print $2}' | awk -F ' ' '{print $3}' | awk -F ':' '{print $1}' | sort | uniq -c
Poor cat
Grep for “ EXE ” in the snort alerts
CUT destination IP
Sort and count unique lines (uniq)
grep -i exe sample.pcap
grep -i exe sample.pcap | cut -d '{' -f 2 | cut -d ' ' -f 4 | cut -d ':' -f 1
grep -i exe sample.pcap | cut -d '{' -f 2 | cut -d ' ' -f 4 | cut -d ':' -f 1 | sort | uniq -c
Exercise
Basic Linux analysis on Snort logs
Entirely doable in PowerShell for the Windows-oriented
Example walkthrough of blackhole script
Back to Widget Warehouse
We have all of this security architecture lying around… can we use any of it?
How about the big shiny JIRA box?
We can share with the rest of the SOC
Build on the knowledge base everyone else is using
Avoid reinventing the wheel
Gather all of your information in one useable and documentable place
But how do we talk to it?
Is it possible to correlate without a SIEM?
How about Python?
Easier than most people think
You don’t actually need a full development team
Python isn’t the only option (we just like it)
How do we pull out the data?
Snort alert fast is just text. How do we find and organize the good bits?
How do we pull out the data?
PyParsing – A quick, easy, and modular parser in Python
Snort_log_parser.py – Builds PyParsing object for Snort logs
Regex is fun, but not necessary
snort_log_parser.py <PyParsing object>
03/16-07:30:02.730000 [**] [1:2016141:2] ET INFO Executable Download from dotted-quad Host [**] [Classification: A Network Trojan was Detected] [Priority: 1] {TCP} 192.168.202.79:50770 -> 17.172.224.47:80
[['03/16-07:30:02.730000', '1:2016141:2', 'A Network Trojan was Detected', '1', 'TCP', '192.168.202.79', '50770', '17.172.224.47', '80']]
Exercise 2a: Broken Parser
Parser works with old log format:
Doesn’t work with new log format:
How do we fix it?
03/16-07:30:00.000000 [--] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected (Nmap Scripting Engine) [--] [Category: Web Application Attack] [Priority: 1] {TCP} 192.168.202.79:50465 -> 192.168.229.251:80

03/16-07:30:00.000000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected (Nmap Scripting Engine) [**] [Classification: Web Application Attack] [Priority: 1] {TCP} 192.168.202.79:50465 -> 192.168.229.251:80
Exercise 2a: Broken Parser
Exercise instructions:
Change directories to ~/exercise_2
Open ~/exercise_2/snort_log_parser_broken.py
Test whether the log is being properly parsed (follow along)
[RSA@RSAMKA ~]$ cd exercise_2
[RSA@RSAMKA ~]$ vim ~/exercise_2/snort_log_parser_broken.py
Exercise 2a: Broken Parser
Once fixed, move it to ~/exercise_2/snort_parsing
[RSA@RSAMKA exercise_2]$ mv snort_log_parser_broken.py ./snort_parsing/snort_log_parser.py
How do we map the data?

03/16-07:30:00.000000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50465 -> 192.168.229.251:80
03/16-07:30:00.010000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50467 -> 192.168.229.251:80
03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 81.177.139.111:80
03/16-07:30:00.030000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50469 -> 192.168.229.251:80
03/16-07:30:00.040000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50471 -> 192.168.229.251:80
03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80
03/16-07:30:00.050000 [**] [1:2102924:4] GPL NETBIOS SMB-DS repeated logon failure [**] ... 192.168.229.153:445 -> 192.168.202.79:55173
03/16-07:30:00.050000 [**] [1:2924:3] NETBIOS SMB-DS repeated logon failure [**] ... 192.168.229.153:445 -> 192.168.202.79:55173
03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80
03/16-07:30:00.050000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50473 -> 192.168.229.251:80
03/16-07:30:00.060000 [**] [1:402:7] ICMP Destination Unreachable Port Unreachable [**] ... 192.168.27.25 -> 192.168.202.100
03/16-07:30:00.070000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50475 -> 192.168.229.251:80
03/16-07:30:00.080000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50477 -> 192.168.229.251:80

03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 81.177.139.111:80
03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80
03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80

{
  "81.177.139.111" : [
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 81.177.139.111:80"
  ],
  "166.62.112.150" : [
    "03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80",
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80"
  ]
}
How do we map the data? (cont’d)
How can we process our logs and find patterns?
Can we organize them for easier use in a later stage?
Yes we can! With snort_mapping.py
Uses a parser object from previous exercise
Runs parser over data sample to create a JSON object with sorted, organized, structured, and beautiful data
How do we map the data? (cont’d)
Step 1: Parse Snort alert log file
[Diagram: Snort alert log file (sample.pcap) + snort_log_parser.py (SnortParser) -> snort_mapping.py -> DEST_IP_ALERTS_MAP]
# Create a PyParsing object for the Snort alert log file
parser = snort_log_parser.SnortParser()
# Parse logfile using parser and return a mapping of
# {dst_ip_1 : [alert_1, alert_2, ..., alert_n],
#  dst_ip_2 : [alert_1, alert_2, ..., alert_n], ...}
mapping = snort_mapping.build_unique_dest_ips(parser, logfile)
How do we insert data into JIRA?
JIRA.py
Iterates over the structured output (JSON) from the previous step
Gives a way to perform actions (maybe intel gathering) per ticket by creating a pipeline
AUTOMATE TIER ONE ALL THE TIME
JIRA ingests the JSON object via REST API to create one ticket per destination IP in JSON
How do we insert data into JIRA?
Step 2: Use mapping and JIRA REST API to create one ticket per destination IP in JIRA
# Create JIRA tickets using JIRA REST API
JIRA.transmit(creds, JIRA_domain, mapping)

[Diagram: DEST_IP_ALERTS_MAP -> JIRA.py -> JIRA REST API -> new tickets created in JIRA]
How do we insert data into JIRA? (cont’d)

[Diagram: (1) Snort alert log file -> snort_log_parser.py (SnortParser) -> (2) snort_mapping.py -> DEST_IP_ALERTS_MAP, i.e. {dst_ip_1 : [alert_1, alert_2], dst_ip_2 : [alert_1, alert_2], ...} -> (3) JIRA.py -> (4) JIRA REST API -> new tickets created in JIRA; (5) osint.py adds additional context: open source intel for each dest. IP, stored in a file]
Exercise 2b: Put it all together
Change directories to snort_parsing
Get a fresh IP
Determine your VM’s IP address (ifconfig)
[RSA@RSAMKA exercise_2]$ cd snort_parsing
[RSA@RSAMKA snort_parsing]$ sudo dhclient -r
[RSA@RSAMKA snort_parsing]$ sudo dhclient
Exercise 2b: JIRA setup
Run JIRA on your local machine.
Using the IP from Step 1, type in your browser: http://<IP>:8080/
Login using:
Username: RSA
Password: mkarsa
If you get a “Base URL Error”, click the “update base URL” button in the pop-up box with the yellow banner
Feel free to explore a bit in JIRA
Exercise 2b: Explore the program
In snort_parsing, open master.py
Be sure not to edit the code
If you accidentally edit the code, type “:q!” and press Enter (do not type the quotation marks)
Try to understand how each piece fits together
[RSA@RSAMKA exercise_2]$ vim master.py
Exercise 2b: Try it for yourself
Run the script
Go to JIRA, select the “Issues” dropdown, and click “Search for Issues” to see the newly-created tickets
[RSA@RSAMKA snort_parsing]$ python master.py sample.pcap -u http://<IP>:8080/ -a RSA:rsamka
Exercise 2b: Congratulations!
For you to try at home:
Open hub.py and add the missing code to make the program work
The module master.py is the solution to hub.py
Data Enrichment
Analysis of data provided by tools is good, but that data is better when it has additional context!
Some classic examples of enrichment:
Network ranges (by organization, building, etc.)
User (hostname, organization, etc.)
OSINT (Alexa, reputation, NoD, etc.)
Net Defense (blocklists, greylists, etc.)
Analyst assistance (whitelists, previous tickets, etc.)
But I want to use (Insert Tool Name Here)
Well how about Splunk?
Splunk loves to eat JSON
CSV lookups make things easy
Splunk Lookup Tables
Splunk has several mechanisms for data enrichment:
Comma Separated Value (CSV)
External tool
Key/Value store
CSV is the easiest to work with initially since it is a simple mechanism that a lot of tools will import/export
Keep It Simple
Basic Configuration
In their most basic form, CSV lookup tables consist of a base lookup file stored in the ‘<appname>/lookups’ directory, which is then referenced through a ‘transforms.conf’ entry
These lookup tables can then be referenced through multiple mechanisms
status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
…
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
…
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
…
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error
[http_status]
filename = http_status.csv
http://docs.splunk.com/Documentation/Splunk/6.5.1/Knowledge/ConfigureCSVlookups
Two methods of populating
The results are enriched through defining key/value pairs to be inserted when a specified field in the dataset matches a specified field in the lookup table.
Two sample ways of performing these lookups are through the use of:
The ‘lookup’ search command.
— When there is a match, the specified fields will be output into the result set
Automatic lookups through ‘props.conf’
... | lookup http_status status OUTPUT status_description, status_type
[http_log]
LOOKUP-http_log_lookup = http_status status AS code OUTPUT status_description status_type
http://docs.splunk.com/Documentation/Splunk/6.5.1/Knowledge/ConfigureCSVlookups
Example
When the below data is ingested, it will be parsed and fields extracted.
In this example, the 7th field of the input is the HTTP status code
The config on the previous page will match against the first column of the lookup!
status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
…
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
…
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
…
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error
192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
127.0.0.1 - - [28/Jul/2006:10:22:04 -0300] "GET / HTTP/1.0" 200 2216
127.0.0.1 - - [28/Jul/2006:10:27:32 -0300] "GET /hidden/ HTTP/1.0" 404 7218
x.x.x.90 - - [13/Sep/2006:07:01:53 -0700] "PROPFIND /svn/[xxxx]/Extranet/branches/SOW-101 HTTP/1.1" 401 587
x.x.x.90 - - [13/Sep/2006:07:01:51 -0700] "PROPFIND /svn/[xxxx]/[xxxx]/trunk HTTP/1.1" 401 587
x.x.x.90 - - [13/Sep/2006:07:00:53 -0700] "PROPFIND /svn/[xxxx]/[xxxx]/2.5 HTTP/1.1" 401 587
http://ossec-docs.readthedocs.io/en/latest/log_samples/apache/apache.html
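The lookup mechanics Splunk performs here can be sketched outside Splunk in a few lines of Python: match a field from each event against the first column of the lookup table and copy the remaining columns onto the event. The table below is a trimmed-down, illustrative subset of the HTTP status lookup.

```python
import csv, io

LOOKUP_CSV = """status,status_description,status_type
200,OK,Successful
401,Unauthorized,Client Error
404,Not Found,Client Error
"""

def load_lookup(csv_text, key_field):
    """Index the lookup table by its key column (Splunk's 'first column')."""
    return {row[key_field]: row for row in csv.DictReader(io.StringIO(csv_text))}

def enrich(event, lookup, key_field):
    """Copy the matching lookup row's remaining columns onto the event."""
    extra = lookup.get(event.get(key_field), {})
    enriched = dict(event)
    enriched.update({k: v for k, v in extra.items() if k != key_field})
    return enriched

lookup = load_lookup(LOOKUP_CSV, "status")
event = {"clientip": "127.0.0.1", "uri": "/hidden/", "status": "404"}
print(enrich(event, lookup, "status")["status_description"])  # Not Found
```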
OSINT Bash script
To assist analysis, we need more context
Open source intel (OSINT) script to the rescue!
Bash
— More variations on cat skinning
Quick and dirty way to look up indicators from several vetted open sources
Written by another analyst doing work on their own
Retrofitted into this workflow to provide intel on our IPs
How do we insert data into JIRA? (cont’d)
[Diagram: the same five-step parse -> map -> JIRA -> OSINT pipeline shown on the earlier “How do we insert data into JIRA?” slide]
“Common” Event Format (CEF)

Attempt to standardize network/security event data into a single format
Easily add new sources of event data to your existing scripts and decision-making process

Goal: Your tools create data in an output format that is easily managed by one central system
Gotcha: Don’t worry about conforming to the exact standard
Easily stored in JSON; remember to structure it to keep fields available for other information

"severity": "high",
"label": "event",
"type": "network",
"cef": {
    "fileHash": "dae375687c520e06cb159887a37141bf",
    "requestURL": "www.adssa-org.1gb.ru",
    "destinationPort": "80",
    "sourcePort": "4286",
    "deviceDirection": "inbound",
    "destinationUserName": "Dan",
    "sourceAddress": "213.57.77.220"
},
"update_time": "2017-01-11T19:11:09.992701Z",
"hash": "f0053797a5e4509f4cc093b8e67f0259",
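Producing such an event is straightforward in code. Below is a minimal sketch that wraps one parsed alert in a CEF-style dictionary; the field names follow the sample above, and, per the slide's advice, it does not attempt exact conformance with the official CEF standard.

```python
def to_cef_event(parsed_alert):
    """Wrap one parsed alert (a dict of src/dst fields) in a CEF-style event."""
    return {
        "severity": "high",
        "label": "event",
        "type": "network",
        "cef": {
            "sourceAddress": parsed_alert["src_ip"],
            "sourcePort": parsed_alert["src_port"],
            "destinationPort": parsed_alert["dst_port"],
            "deviceDirection": "inbound",
        },
    }

event = to_cef_event({"src_ip": "213.57.77.220", "src_port": "4286", "dst_port": "80"})
print(event["cef"]["sourceAddress"])  # 213.57.77.220
```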
Security Automation and Orchestration: Integrated Work Flow

[Diagram: Alert phase: Snort alert log file and other data sources -> snort_mapping.py -> DEST_IP_ALERTS_MAP (in CEF format) -> ingestion to platform as events with artifacts -> execute specific “playbook(s)” based on event type. Enrichment phase (OSINT-type tools): Whois, IP/URL reputation, file reputation. Playbook decision: malicious? If yes: playbook automation creates a ticket and prompts the user for a decision to block the IP or DNS black-hole; if no: close the event with an automated comment.]
Security Automation and Orchestration: Integrated Work Flow

[Diagram: Data sources / alerts -> ingestion to platform as events with artifacts -> automated enrichment and investigation; manual investigation/enrichment with pivoting on data; automation execution with action approval and automated resolution -> case management, reporting, and centralized display of data.]
Security Automation and Orchestration: Benefits to Automation and Orchestration Platforms

Manage all your event data in one single combined screen
App development: connectivity to hundreds of third-party tools
Standardized data formats
Asset administration
Visual automation / playbook development
Case Management and Reporting
Pivoting on Data
Exercise: Beautiful Soup
HTML or XML parser
Tree traversal
Web scrapers
And here’s what we did with it