LAB1-R04 – Analytics and Detection through Coding, Scripting and Organizing Data
Post-Conference Summary
Mischel Kwon
Dilan Bellinghoven
Brian Kwon
Matt Norris
David Smith
Table of Contents
INTRODUCTION
    Thank you for attending our Lab at RSA!
LAB SUMMARY
    Key Takeaways
    Lessons Learned
    Further Resources
    Take-Home Exercise
Introduction
Thank you for attending our Lab at RSAC!
We appreciate you spending time with us during RSA Conference 2017. We hope you learned something new from our Analytics and Detection through Coding, Scripting and Organizing Data Lab. We had a lot of interaction during the Lab which, to us, was a sign that people were engaged and interested. We’re excited to have this chance to recap what we covered and share further information with you.
Our Lab was designed to expose you to a new approach to aggregating and organizing alerts and data to better identify attack patterns. It showed you how to put the data in a ticketing system like JIRA so that you could capture, share, and use the data in the future. This summary contains key takeaways, lessons learned, key feedback, and further resources should you wish to delve deeper into analytics and detection through coding, scripting, and organizing data.
See you at the Conference next year!
Mischel Kwon, Dilan Bellinghoven, Brian Kwon, Matt Norris, and David Smith
MKA Cyber and Phantom Cyber
Lab Summary
Key Takeaways
ANALYZING DATA OUTSIDE OF A SIEM
Alerts by themselves do not tell an analyst much. SIEMs can be useful in extracting information from alerts, but they are too often in a state of neglect. Being able to perform as many SIEM functions as possible with alternative tools and methods can therefore offset those deficiencies and keep operations running when the SIEM is disrupted. A high volume of alerts still poses a challenge: how do analysts make sense of such massive amounts of data without a SIEM? One approach is to implement a standard process organized by patterns of attack. We call these patterns “use cases” and break them down further into specific activities called “scenarios”. Content, meaning specific rules, signatures, and indicators, is then written to detect those scenarios. A hierarchical view of this idea is shown below.
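As a rough stand-in for that hierarchy, the use case / scenario / content relationship can be sketched as a small Python data structure. The names are illustrative: the Malware entries come from the Lab's EXE-file-header example, while the Data Exfiltration content line is a hypothetical.

```python
# Sketch of the use case -> scenario -> content hierarchy.
DETECTION_CATALOG = {
    "Malware": {                                   # use case: a category of activity
        "EXE file header found in environment": [  # scenario: specific activity
            # content: rules/signatures/indicators written to detect the scenario
            "Snort rule aggregated by signature and destination IP",
        ],
    },
    "Data Exfiltration": {
        "Unusual large upload": [
            "Threshold rule on outbound byte counts",  # hypothetical content
        ],
    },
}

def content_for(use_case, scenario):
    """Return the detection content written for a given scenario."""
    return DETECTION_CATALOG.get(use_case, {}).get(scenario, [])

print(content_for("Malware", "EXE file header found in environment"))
```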
AGGREGATING ALERTS TO FIND PATTERNS
Many methods by which a large volume of alerts can be organized into a useful format exist, and many of those are likely to be at your fingertips right now. Several command‐line tools which are built into Unix, such as “grep”, “awk”, “sort”, and “uniq”, can perform such operations. For instance, a file containing a high volume of Snort alerts can be parsed and rearranged to reveal a variety of useful information, such as the top ten highest‐frequency destination IP addresses, the type of alerts, peak traffic times, and much more. Though it requires some degree of Unix aptitude, this method can be highly effective.
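In the same spirit, the stack-and-sort idea behind those command-line pipelines can be sketched in a few lines of Python. This is a rough equivalent of `grep | awk | sort | uniq -c | sort -rn | head`, assuming the standard "fast" alert format in which each line ends with `SRC_IP:PORT -> DST_IP:PORT` (IPv4 only).

```python
from collections import Counter

def top_destinations(alert_lines, n=10):
    """Count destination IPs in Snort 'fast' alert lines, most frequent first."""
    counts = Counter()
    for line in alert_lines:
        if "->" not in line:
            continue
        dst = line.rsplit("->", 1)[1].strip()  # 'DST_IP:PORT'
        counts[dst.split(":")[0]] += 1         # keep just the IP
    return counts.most_common(n)

alerts = [
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 166.62.112.150:80",
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 81.177.139.111:80",
    "03/16-07:30:02.740000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access "
    "[**] {TCP} 192.168.202.79:50771 -> 166.62.112.150:80",
]
print(top_destinations(alerts, n=2))
# [('166.62.112.150', 2), ('81.177.139.111', 1)]
```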
USING EXISTING TOOLS TO STORE AND SHARE YOUR FINDINGS
Though command‐line tools are useful, often it is not enough to see the patterns just once by yourself. The real value lies in sharing that knowledge with the rest of the SOC. Communicating and documenting findings enhances future threat detection which improves the performance of the SOC. There are many free or low‐cost options available to a SOC that enable team‐wide knowledge sharing and collaboration. For this Lab, we chose JIRA as the platform since it is available in a free trial version and has a robust REST API system. Keep in mind that if automation is to be implemented, some sort of programming language is required. We chose Python simply because of familiarity, but many other options are equally effective, and likely at minimal to no cost. It is highly likely that both of these tools or their equivalent alternatives are already present in a SOC, and they can be leveraged to perform more robust organization and sharing of the alert data.
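To make the JIRA idea concrete, here is a minimal sketch of building an issue payload for JIRA's documented create-issue REST endpoint (`POST /rest/api/2/issue`). The `SOC` project key and `Task` issue type are assumptions for illustration; substitute the values configured in your own instance.

```python
import json

def build_issue_payload(dst_ip, alerts, project_key="SOC"):
    """Build the JSON body for JIRA's create-issue endpoint."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": "Snort alerts for destination " + dst_ip,
            "description": "\n".join(alerts),
            "issuetype": {"name": "Task"},
        }
    }

payload = build_issue_payload("166.62.112.150", ["alert_1", "alert_2"])
print(json.dumps(payload, indent=2))

# With the Requests library this could then be posted, e.g.:
# requests.post("http://<IP>:8080/rest/api/2/issue",
#               auth=("RSA", "mkarsa"), json=payload)
```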
AUTOMATING REPETITIVE TASKS FOR SOC OPTIMIZATION
Using tools like JIRA and Python, and with some basic programming experience, an analyst can automate the processing (e.g. parsing, rearranging, organizing, etc.) of the messy, raw Snort logs as we did using Unix command‐line tools. However, doing so programmatically opens up a far wider variety of functions that can be performed on the data, and enables repeatable data processing since the program can be re‐run again and again. In this Lab, we leveraged some of Python’s built‐in libraries to parse the logs, reorganize them, and feed them to JIRA via its REST API to create new tickets, all in a modularized, stepwise fashion. Developing automated workflows like this frees an analyst from performing menial tasks such as creating tickets and manually organizing data, and thus allows him or her to devote more time and energy towards tasks of higher value.
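A minimal sketch of that parse-and-reorganize step follows. The Lab itself used PyParsing; this regex-based stand-in only extracts enough from each line to bucket alerts by destination IP, in the spirit of the mapping the Lab's scripts produced.

```python
import re
from collections import defaultdict

# Matches a 'fast' alert line and captures the destination IPv4 address
FAST_ALERT = re.compile(r"\[\d+:\d+:\d+\].*->\s+(?P<dst>\d{1,3}(?:\.\d{1,3}){3})")

def map_alerts_by_dst(lines):
    """Group raw alert lines into {destination_ip: [alert, ...]}."""
    mapping = defaultdict(list)
    for line in lines:
        m = FAST_ALERT.search(line)
        if m:  # silently skip lines that do not parse
            mapping[m.group("dst")].append(line.strip())
    return dict(mapping)

alerts = [
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 81.177.139.111:80",
    "03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 166.62.112.150:80",
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access "
    "[**] {TCP} 192.168.202.79:50770 -> 166.62.112.150:80",
]
by_dst = map_alerts_by_dst(alerts)
print(sorted(by_dst))  # ['166.62.112.150', '81.177.139.111']
```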
Lessons Learned
RESOURCEFULNESS CAN IMPROVE CONTINUITY OF OPERATIONS
An unfortunate trend among many SOCs today is an unhealthy dependence on a single tool or technology, such as a SIEM. Because of this dependence, a SOC might be left fumbling in the dark during a disruption in the SIEM. To avoid this loss of continuity – and to therefore avoid leaving the proverbial front door open – SOC analysts should be taught to be resourceful. They should be able to use the tools at their disposal and their experience to do what is necessary to ensure continuity.
As demonstrated in the exercises, in many cases, a plethora of suitable substitutes exist for any sort of data processing operation. Many of these substitutes are free and readily‐available to anyone. A SOC composed of analysts versed in a multitude of tools and technologies will be more resilient in the face of a failure in or unavailability of a certain tool. Such a SOC will be well‐prepared to meet the demands of the ever‐changing landscape of today’s cyber battlegrounds.
AUTOMATION GIVES ANALYSTS MORE TIME TO FOCUS ON HIGHER‐VALUE OPERATIONS
A SOC performs certain tasks that are tedious or repetitive and do not require a high level of skill, but that are nonetheless essential to its healthy operation. These are sometimes referred to as “tier one” tasks. Some examples include scavenging trusted cybersecurity blogs for new indicators of compromise; parsing, filtering, and reformatting sensor logs; or finding contextual data related to a certain indicator of compromise. Such tasks are often highly amenable to automation.
As shown in the Lab exercises, automating these tier one tasks can save SOC analysts a significant amount of time. It can often reduce the time spent on a task by orders of magnitude. These time savings are apparent in the final exercise in which we demonstrated the beginnings of a web scraper which crawls trusted cybersecurity blogs and scrapes from them indicators of compromise. A program such as this can reduce the duration of the task from hours to seconds. Automation can free up SOC analysts to focus on tasks of higher value, such as investigating an incident. With the automation of each task, the performance of the SOC is further optimized.
Further Resources
There are many resources available for learning the basics of the Unix command line. For one which covers the basics of processing log files with Unix, see this blog post: https://www.loggly.com/ultimate-guide/analyzing-linux-logs/
For a basic JIRA tutorial, see the official JIRA 101 page on Atlassian’s website: https://confluence.atlassian.com/jira064/jira-101-720412861.html
Python is no small subject to cover, but a good reference on using Python to automate some basic operations is Automate the Boring Stuff with Python by Al Sweigart. It is highly accessible to beginners with no prior Python experience.
Take‐Home Exercise
During the Lab, attendees were presented with a final take‐home challenge. This section reviews that challenge and provides the solution script so that you may check your solution.
INTRODUCTION
This exercise will introduce you to the basics of the Python Requests library, an API for interacting with the web from Python. In this demonstration you will learn how to pull links out of the FireEye blog.
Note that in order to implement this program, you may need to install the Python "Requests" and "BeautifulSoup" packages. In order to do this, it would be useful to first install "pip", a Python package manager (see https://packaging.python.org/installing/). After installing pip, you may use it to install Requests and BeautifulSoup from the command line with the following commands:
pip install requests
pip install bs4
From the command line, run the Python interpreter by typing “python” and hitting enter. This will open the interpreter shell. From the interpreter shell, import Requests and BeautifulSoup.
>>> import requests
>>> from bs4 import BeautifulSoup
Request "https://www.fireeye.com/blog.html" and store Requests object in r.
>>> r = requests.get("https://www.fireeye.com/blog.html", headers={'User-Agent' : 'Magic Browser'})
>>> print r.text
As you can see, contained in r is the entire blog web page. Next, create a BeautifulSoup object with our Requests object in order to traverse the HTML page in r.
>>> soup = BeautifulSoup(r.text, 'html.parser')
Build a list containing all the links in any of the elements of the HTML page of class "c05_item".
>>> list_items = soup.find_all(class_="c05_item")
>>> list_items[0].find('a').get('href')
Store all hrefs in 'articles'
>>> articles = []
>>> for item in list_items:
...     articles.append(item.find('a').get('href'))
>>> articles = ['https://www.fireeye.com' + i for i in articles]
>>> for link in articles:
...     print link
ASSIGNMENT
Using this strategy, find a way to request the ZeusTracker IP blocklist at (https://zeustracker.abuse.ch/blocklist.php?download=badips) and, with the resultant Requests object, find a way to create a CSV file of the following format:
ip,source,date,time
- IP: The IPs shown on the web page (e.g. "039b1ee.netsolhost.com", "0if1nl6.org", etc.)
- Source: The URL ("https://zeustracker.abuse.ch/blocklist.php?download=badips")
- Date: The runtime date (see Python's datetime library)
- Time: The runtime time (see Python's datetime library)
SOLUTION
import requests, csv, datetime, os
from bs4 import BeautifulSoup

def main():
    url = "https://zeustracker.abuse.ch/blocklist.php?download=badips"
    r = requests.get(url, headers={'User-Agent' : 'Magic Browser'})
    date = datetime.date.today()
    time = datetime.datetime.now().time()
    soup = BeautifulSoup(r.text, 'html.parser')
    html_text = soup.get_text()
    bad_ips = html_text.split('\n')[6:-1]
    # Write each IP to the CSV along with the source URL and runtime date/time
    with open(os.getcwd() + '/zeustracker_ip_csv.csv', 'wb') as outfile:
        csvwriter = csv.writer(outfile)
        for ip in bad_ips:
            csvwriter.writerow([ip, url, date, time])

if __name__ == '__main__':
    main()
#RSAC

LAB1-R04: Analytics and Detection through Coding, Scripting and Organizing Data

Mischel Kwon, Founder, MKACyber (@mkacyber)
Brian Kwon, Analyst, MKACyber
Matt Norris, Senior Analyst, MKACyber
Dilan Bellinghoven, SOC Analyst, MKACyber
David Smith, QA Manager, MKACyber
All hail the mighty SIEM!
Alerts are just alerts
How do we actually find patterns in the noise without drinking from the firehose?
Most SIEM products are in some state of neglect
So how do we get around this?
Organize the operation via a use case and scenario framework
Perform data and alert aggregation to see the signal through the noise
Establish workflow and content management via efficient use of ticketing systems
Automate basic analysis and enrichment
Threat based approach. Context?
Most of the alerts you see in a modern SOC are missing one thing… Context!
Categorizing as you go helps you specifically know what you’re looking for, when to look for it, and make sure you have detection across the spectrum
Avoids detecting down the rabbit hole and getting tunnel vision
But how do we tag things?
K.I.S.S. Well... as simple as you can in this case
Use Cases: A category for activity on a system (pieces of knowledge, or scenarios)
Scenarios: Individual pieces of activity on a system
Content: Specific rules, signatures, and indicators written to detect scenarios
Current MKA Use Cases
Web
Malware
High Value Targets
Unauthorized Access and Privilege Escalation
VPN
Data Exfiltration
Traffic Anomalies
Vulnerability
Break out of scenarios
Data Exfiltration:
Unusual large upload
Unusual large download
Unusual large transfer during off business hours
Mismatched file headers
Unusual network session lengths
Matches on keywords/PII
Unusual large outbound traffic to suspect country
Unusually large outbound traffic to competitor/adversary
Example: Widget Warehouse
You come on as a new analyst
They have a plethora of detection tools
A new site is coming online that has not been integrated into the larger security architecture, but has sensors deployed
IT has commented that they’ve seen a large amount of malware infections, but the SOC can’t do anything yet
What can we do?
The Macarena?
Throw our hands up and get mad at the engineers?
SSH into the sensor and try and set up manual detection in the meantime?
Example: File header EXE detection
Use Case: Malware
Scenario: EXE File header found in environment
Content: Snort rule aggregated by Signature and then destination IP
We’ll walk through this example in class
Alert aggregation
Stack and Sort to make patterns manageable
What do we care about in the data?
Or this?
Top 25 Snort Exe events:
cat alert.fast.maccdc2012_00000.pcap | awk -F"\[**\]" '{print $3;}' | sed -e 's/\[$//' -e 's/^\]//' | grep 'exe' | sort | uniq -c | sort -rn | head -25

105 [1:2059:1] WEB-MISC MsmMask.exe access
105 [1:2058:1] WEB-MISC MsmMask.exe attempt
82 [1:2326:3] WEB-IIS sgdynamo.exe access
82 [1:1610:11] WEB-CGI formmail arbitrary command execution attempt
69 [1:809:11] WEB-CGI whois_raw.cgi arbitrary command execution attempt
66 [1:2018403:7] ET TROJAN GENERIC Likely Malicious Fake IE Downloading .exe
65 [1:1165:9] WEB-MISC Novell Groupwise gwweb.exe access
47 [1:832:11] WEB-CGI perl.exe access
47 [1:2019714:2] ET CURRENT_EVENTS Terse alphanumeric executable downloader high likelihood of being hostile
47 [1:1648:7] WEB-CGI perl.exe command attempt
44 [1:1614:8] WEB-MISC Novell Groupwise gwweb.exe attempt
42 [1:1158:10] WEB-MISC windmail.exe access
42 [1:100000217:1] COMMUNITY WEB-MISC man2web cmd exec attempt
41 [1:2241:5] WEB-MISC cwmail.exe access
41 [1:1654:6] WEB-CGI cart32.exe access
41 [1:1536:8] WEB-CGI calendar_admin.pl arbitrary command execution attempt
37 [1:1762:5] WEB-CGI phf arbitrary command execution attempt
37 [1:1547:11] WEB-CGI csSearch.cgi arbitrary command execution attempt
21 [1:989:11] BACKDOOR sensepost.exe command shell attempt
21 [1:889:10] WEB-CGI ppdscgi.exe access
21 [1:2244:4] WEB-MISC VsSetCookie.exe access
21 [1:1655:6] WEB-CGI pfdispaly.cgi arbitrary command execution attempt
21 [1:1595:10] WEB-IIS htimage.exe access
7 [1:962:13] WEB-FRONTPAGE shtml.exe access
7 [1:2010704:8] ET WEB_SERVER Possible HP OpenView Network Node Manager ovalarm.exe CGI Buffer Overflow Attempt
Emerging Threat Hit on “.exe”
Connections to two hosts
Lots of connections between the hosts
cat alert.fast.maccdc2012_00000.pcap | grep 'ET TROJAN GENERIC Likely Malicious Fake IE Downloading .exe' | awk -F"\ " '{split($23,a,":");print " " a[1] " -> " $25;}' | sort | uniq -c | sort -rn
50 192.168.202.110 -> 192.168.27.203:8080
16 192.168.202.110 -> 192.168.27.102:3128
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.203\t8080" | wc -l
17894
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.102\t3128" | wc -l
3663
Scanning…
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.102\t3128" | awk '{print $8 " " $10;}' | sort | uniq -c | sort -rn | head
541 GET /
22 GET /<IMG
12 GET /scripts/
12 GET /cgi-bin/
6 GET /index.jsp
4 GET /scripts/index.php
4 GET /index.php
4 GET /cgi-bin/index.php
4 GET /cgi-bin/index.cgi
4 GET /../../../../../../../../../../../../etc/passwd
Scanning…
cat ../http.log | grep 192.168.202.110 | grep "192.168.27.102\t3128" | awk '{print $8 " " $10;}' | sort | uniq -c | sort -rn | tail
1 GET././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././././../../../../../../../../
1 GET ..\\..\\..\\..\\..\\..\\winnt\\win.ini
1 GET ..\\..\\..\\..\\..\\..\\windows\\win.ini
1 GET ..\..\..\..\..\..\winnt\win.ini
1 GET ..\..\..\..\..\..\windows\win.ini
1 GET ..\..\..\..\..\..\..\..\..\..\winnt\win.ini
1 GET ..\..\..\..\..\..\..\..\..\..\windows\win.ini
1 GET .
1 GET %.
1 CONNECT localhost:3141
Another event
Potential cmd.exe traffic:
cat alert.fast.maccdc2012_00000.pcap | grep 'ATTACK-RESPONSES Microsoft cmd.exe banner'
03/16-08:13:32.500000 [**] [1:2123:3] ATTACK-RESPONSES Microsoft cmd.exe banner [**] [Classification: Successful Administrator Privilege Gain] [Priority: 1] {TCP} 192.168.28.100:1138 -> 192.168.202.96:443
cat ../conn.log | grep -e '192.168.28.100\t1138\t192.168.202.96\t443'
1331903612.440000 CdDGxV32GCW7JfVHba 192.168.28.100 1138 192.168.202.96 443 tcp -0.060000 104 0 S1 - 0 ShADa 3 232 2 88 (empty)
1331904353.160000 CCuhkk1UHaue9Hp6y4 192.168.28.100 1138 192.168.202.96 443 tcp -66.440000 1814 225 SHR - 0 dDafA 12 2294 19 985 (empty)
cat ../conn.log | grep '192.168.28.100' | ../send_recv_counter.pl
192.168.202.110 -> 192.168.28.100 103688
192.168.202.110 <- 192.168.28.100 172126
Lots of other traffic between the hosts
‘encrypted’ connections with not much traffic
Multiple ways to skin the cat
Grep for “ EXE ” in the snort alerts
AWK destination IP
Sort and count unique lines (uniq)
grep -i exe sample.pcap
grep -i exe sample.pcap | awk -F '}' '{print $2}' | awk -F ' ' '{print $3}' | awk -F ':' '{print $1}'

grep -i exe sample.pcap | awk -F '}' '{print $2}' | awk -F ' ' '{print $3}' | awk -F ':' '{print $1}' | sort | uniq -c
Poor cat
Grep for “ EXE ” in the snort alerts
CUT destination IP
Sort and count unique lines (uniq)
grep -i exe sample.pcap
grep -i exe sample.pcap | cut -d '{' -f 2 | cut -d ' ' -f 4 | cut -d ':' -f 1
grep -i exe sample.pcap | cut -d '{' -f 2 | cut -d ' ' -f 4 | cut -d ':' -f 1 | sort | uniq -c
Exercise
Basic Linux analysis on Snort logs
Entirely doable in PowerShell for the Windows-oriented
Example walkthrough of blackhole script
Back to Widget Warehouse
We have all of this security architecture lying around… can we use any of it?
How about the big shiny JIRA box?
We can share with the rest of the SOC
Build on the knowledge base everyone else is using
Avoid reinventing the wheel
Gather all of your information in one useable and documentable place
But how do we talk to it?
Is it possible to correlate without a SIEM?
How about Python?
Easier than most people think
You don’t actually need a full development team
Python isn’t the only option (we just like it)
How do we pull out the data?
Snort alert fast is just text. How do we find and organize the good bits?
How do we pull out the data?
PyParsing – A quick, easy, and modular parser in Python
Snort_log_parser.py – Builds PyParsing object for Snort logs
Regex is fun, but not necessary
snort_log_parser.py <PyParsing object>
03/16-07:30:02.730000 [**] [1:2016141:2] ET INFO Executable Download from dotted-quad Host [**] [Classification: A Network Trojan was Detected] [Priority: 1] {TCP} 192.168.202.79:50770 -> 17.172.224.47:80
[['03/16-07:30:02.730000', '1:2016141:2', 'A Network Trojan was Detected', '1', 'TCP', '192.168.202.79', '50770', '17.172.224.47', '80']]
Exercise 2a: Broken Parser
Parser works with old log format:
Doesn’t work with new log format:
How do we fix it?
03/16-07:30:00.000000 [--] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected (Nmap Scripting Engine) [--] [Category: Web Application Attack] [Priority: 1] {TCP} 192.168.202.79:50465 -> 192.168.229.251:80

03/16-07:30:00.000000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected (Nmap Scripting Engine) [**] [Classification: Web Application Attack] [Priority: 1] {TCP} 192.168.202.79:50465 -> 192.168.229.251:80
Exercise 2a: Broken Parser
Exercise instructions:
Change directories to ~/exercise_2
Open ~/exercise_2/snort_log_parser_broken.py
Test whether the log is being properly parsed (follow along)
[RSA@RSAMKA ~]$ cd exercise_2
[RSA@RSAMKA ~]$ vim ~/exercise_2/snort_log_parser_broken.py
Exercise 2a: Broken Parser
Once fixed, move it to ~/exercise_2/snort_parsing
[RSA@RSAMKA exercise_2]$ mv snort_log_parser_broken.py ./snort_parsing/snort_log_parser.py
How do we map the data?

03/16-07:30:00.000000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50465 -> 192.168.229.251:80
03/16-07:30:00.010000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50467 -> 192.168.229.251:80
03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 81.177.139.111:80
03/16-07:30:00.030000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50469 -> 192.168.229.251:80
03/16-07:30:00.040000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50471 -> 192.168.229.251:80
03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80
03/16-07:30:00.050000 [**] [1:2102924:4] GPL NETBIOS SMB-DS repeated logon failure [**] ... 192.168.229.153:445 -> 192.168.202.79:55173
03/16-07:30:00.050000 [**] [1:2924:3] NETBIOS SMB-DS repeated logon failure [**] ... 192.168.229.153:445 -> 192.168.202.79:55173
03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80
03/16-07:30:00.050000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50473 -> 192.168.229.251:80
03/16-07:30:00.060000 [**] [1:402:7] ICMP Destination Unreachable Port Unreachable [**] ... 192.168.27.25 -> 192.168.202.100
03/16-07:30:00.070000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50475 -> 192.168.229.251:80
03/16-07:30:00.080000 [**] [1:2009358:5] ET SCAN Nmap Scripting Engine User-Agent Detected ... 192.168.202.79:50477 -> 192.168.229.251:80

03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 81.177.139.111:80
03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80
03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80

{
  "81.177.139.111" : [
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 81.177.139.111:80"
  ],
  "166.62.112.150" : [
    "03/16-07:30:02.730000 [**] [1:2100952:8] GPL WEB_SERVER author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80",
    "03/16-07:30:02.730000 [**] [1:952:6] WEB-FRONTPAGE author.exe access [**] ... 192.168.202.79:50770 -> 166.62.112.150:80"
  ]
}
How do we map the data? (cont’d)
How can we process our logs and find patterns?
Can we organize them for easier use in a later stage?
Yes we can! With snort_mapping.py
Uses a parser object from previous exercise
Runs parser over data sample to create a JSON object with sorted, organized, structured, and beautiful data
How do we map the data? (cont’d)
Step 1: Parse Snort alert log file
[Diagram: Snort alert log file (sample.pcap) + snort_log_parser.py (SnortParser) -> snort_mapping.py -> DEST_IP_ALERTS_MAP]
# Create a PyParsing object for the Snort alert log file
parser = snort_log_parser.SnortParser()
# Parse logfile using parser and return a mapping of
# {dst_ip_1 : [alert_1, alert_2, ..., alert_n],
#  dst_ip_2 : [alert_1, alert_2, ..., alert_n], ...}
mapping = snort_mapping.build_unique_dest_ips(parser, logfile)
How do we insert data into JIRA?
JIRA.py
Iterates over the structured output (JSON) from the previous step
Gives a way to perform actions (maybe intel gathering) per ticket by creating a pipeline
AUTOMATE TIER ONE ALL THE TIME
JIRA ingests the JSON object via REST API to create one ticket per destination IP in JSON
How do we insert data into JIRA?
Step 2: Use mapping and JIRA REST API to create one ticket per destination IP in JIRA
# Create JIRA tickets using JIRA REST API
JIRA.transmit(creds, JIRA_domain, mapping)

[Diagram: DEST_IP_ALERTS_MAP -> JIRA.py -> JIRA REST API -> new tickets created in JIRA]
How do we insert data into JIRA? (cont’d)

[Diagram: (1) Snort alert log file -> snort_log_parser.py (SnortParser) -> (2) snort_mapping.py -> DEST_IP_ALERTS_MAP, i.e. {dst_ip_1 : [alert_1, alert_2], dst_ip_2 : [alert_1, alert_2], ...} -> (3) JIRA.py -> (4) JIRA REST API -> new tickets created in JIRA; (5) osint.py adds additional context: open source intel for each dest. IP, stored in a file]
Exercise 2b: Put it all together
Change directories to snort_parsing
Get a fresh IP
Determine your VM’s IP address (ifconfig)
[RSA@RSAMKA exercise_2]$ cd snort_parsing
[RSA@RSAMKA snort_parsing]$ sudo dhclient -r
[RSA@RSAMKA snort_parsing]$ sudo dhclient
Exercise 2b: JIRA setup
Run JIRA on your local machine.
Using the IP from Step 1, type in your browser: http://<IP>:8080/
Login using:
Username: RSA
Password: mkarsa
If you get a “Base URL Error”, click the “update base URL” button in the pop-up box with the yellow banner
Feel free to explore a bit in JIRA
Exercise 2b: Explore the program
In snort_parsing, open master.py
Be sure not to edit the code
If you accidentally edit the code, type “:q!” and press Enter (do not type the quotation marks)
Try to understand how each piece fits together
[RSA@RSAMKA exercise_2]$ vim master.py
Exercise 2b: Try it for yourself
Run the script
Go to JIRA, select the “Issues” dropdown, and click “Search for Issues” to see the newly-created tickets
[RSA@RSAMKA snort_parsing]$ python master.py sample.pcap -u http://<IP>:8080/ -a RSA:rsamka
Exercise 2b: Congratulations!
For you to try at home:
Open hub.py and add the missing code to make the program work
The module master.py is the solution to hub.py
Data Enrichment
Analysis of data provided by tools is good, but that data is better when it has additional context!
Some classic examples of enrichment:
Network ranges (by organization, building, etc.)
User (hostname, organization, etc.)
OSINT (Alexa, reputation, NoD, etc.)
Net Defense (blocklists, greylists, etc.)
Analyst assistance (whitelists, previous tickets, etc.)
But I want to use (Insert Tool Name Here)
Well how about Splunk?
Splunk loves to eat JSON
CSV lookups make things easy
Splunk Lookup Tables
Splunk has several mechanisms for data enrichment:
Comma Separated Value (CSV)
External tool
Key/Value store
CSV is the easiest to work with initially since it is a simple mechanism that a lot of tools will import/export
Keep It Simple
Basic Configuration
In their most basic form, CSV lookup tables consist of a base lookup file stored in the ‘<appname>/lookups’ directory, which is then referenced through a ‘transforms.conf’ entry
These lookup tables can then be referenced through multiple mechanisms
status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
…
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
…
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
…
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error
[http_status]
filename = http_status.csv
http://docs.splunk.com/Documentation/Splunk/6.5.1/Knowledge/ConfigureCSVlookups
Two methods of populating
The results are enriched through defining key/value pairs to be inserted when a specified field in the dataset matches a specified field in the lookup table.
Two sample ways of performing these lookups are through the use of:
The ‘lookup’ search command.
— When there is a match, the specified fields will be output into the result set
Automatic lookups through ‘props.conf’
... | lookup http_status status OUTPUT status_description, status_type
[http_log]
LOOKUP-http_log_lookup = http_status status AS code OUTPUT status_description status_type
http://docs.splunk.com/Documentation/Splunk/6.5.1/Knowledge/ConfigureCSVlookups
Example
When the below data is ingested, it will be parsed and fields extracted.
In this example, the 7th field of the input is the HTTP status code
The config on the previous page will match against the first column of the lookup!
status,status_description,status_type
100,Continue,Informational
101,Switching Protocols,Informational
200,OK,Successful
201,Created,Successful
202,Accepted,Successful
…
300,Multiple Choices,Redirection
301,Moved Permanently,Redirection
302,Found,Redirection
…
400,Bad Request,Client Error
401,Unauthorized,Client Error
402,Payment Required,Client Error
403,Forbidden,Client Error
404,Not Found,Client Error
405,Method Not Allowed,Client Error
…
500,Internal Server Error,Server Error
501,Not Implemented,Server Error
502,Bad Gateway,Server Error
503,Service Unavailable,Server Error
504,Gateway Timeout,Server Error
505,HTTP Version Not Supported,Server Error
192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
127.0.0.1 - - [28/Jul/2006:10:22:04 -0300] "GET / HTTP/1.0" 200 2216
127.0.0.1 - - [28/Jul/2006:10:27:32 -0300] "GET /hidden/ HTTP/1.0" 404 7218
x.x.x.90 - - [13/Sep/2006:07:01:53 -0700] "PROPFIND /svn/[xxxx]/Extranet/branches/SOW-101 HTTP/1.1" 401 587
x.x.x.90 - - [13/Sep/2006:07:01:51 -0700] "PROPFIND /svn/[xxxx]/[xxxx]/trunk HTTP/1.1" 401 587
x.x.x.90 - - [13/Sep/2006:07:00:53 -0700] "PROPFIND /svn/[xxxx]/[xxxx]/2.5 HTTP/1.1" 401 587
http://ossec-docs.readthedocs.io/en/latest/log_samples/apache/apache.html
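The lookup mechanics Splunk performs here can be sketched outside Splunk in a few lines of Python: match a field from each event against the first column of the lookup table and copy the remaining columns onto the event. The table below is a trimmed-down, illustrative subset of the HTTP status lookup.

```python
import csv, io

LOOKUP_CSV = """status,status_description,status_type
200,OK,Successful
401,Unauthorized,Client Error
404,Not Found,Client Error
"""

def load_lookup(csv_text, key_field):
    """Index the lookup table by its key column (Splunk's 'first column')."""
    return {row[key_field]: row for row in csv.DictReader(io.StringIO(csv_text))}

def enrich(event, lookup, key_field):
    """Copy the matching lookup row's remaining columns onto the event."""
    extra = lookup.get(event.get(key_field), {})
    enriched = dict(event)
    enriched.update({k: v for k, v in extra.items() if k != key_field})
    return enriched

lookup = load_lookup(LOOKUP_CSV, "status")
event = {"clientip": "127.0.0.1", "uri": "/hidden/", "status": "404"}
print(enrich(event, lookup, "status")["status_description"])  # Not Found
```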
OSINT Bash script
To assist analysis, we need more context
Open source intel (OSINT) script to the rescue!
Bash
— More variations on cat skinning
Quick and dirty way to look up indicators from several vetted open sources
Written by another analyst doing work on their own
Retrofitted into this workflow to provide intel on our IPs
How do we insert data into JIRA? (cont’d)
[Diagram: the same five-step parse -> map -> JIRA -> OSINT pipeline shown on the earlier “How do we insert data into JIRA?” slide]
“Common” Event Format (CEF)

Attempt to standardize network/security event data into a single format
Easily add new sources of event data to your existing scripts and decision-making process

Goal: Your tools create data in an output format that is easily managed by one central system
Gotcha: Don’t worry about conforming to the exact standard
Easily stored in JSON; remember to structure it to keep fields available for other information

"severity": "high",
"label": "event",
"type": "network",
"cef": {
    "fileHash": "dae375687c520e06cb159887a37141bf",
    "requestURL": "www.adssa-org.1gb.ru",
    "destinationPort": "80",
    "sourcePort": "4286",
    "deviceDirection": "inbound",
    "destinationUserName": "Dan",
    "sourceAddress": "213.57.77.220"
},
"update_time": "2017-01-11T19:11:09.992701Z",
"hash": "f0053797a5e4509f4cc093b8e67f0259",
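Producing such an event is straightforward in code. Below is a minimal sketch that wraps one parsed alert in a CEF-style dictionary; the field names follow the sample above, and, per the slide's advice, it does not attempt exact conformance with the official CEF standard.

```python
def to_cef_event(parsed_alert):
    """Wrap one parsed alert (a dict of src/dst fields) in a CEF-style event."""
    return {
        "severity": "high",
        "label": "event",
        "type": "network",
        "cef": {
            "sourceAddress": parsed_alert["src_ip"],
            "sourcePort": parsed_alert["src_port"],
            "destinationPort": parsed_alert["dst_port"],
            "deviceDirection": "inbound",
        },
    }

event = to_cef_event({"src_ip": "213.57.77.220", "src_port": "4286", "dst_port": "80"})
print(event["cef"]["sourceAddress"])  # 213.57.77.220
```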
Security Automation and Orchestration: Integrated Work Flow

[Diagram: Alert phase: Snort alert log file and other data sources -> snort_mapping.py -> DEST_IP_ALERTS_MAP (in CEF format) -> ingestion to platform as events with artifacts -> execute specific “playbook(s)” based on event type. Enrichment phase (OSINT-type tools): Whois, IP/URL reputation, file reputation. Playbook decision: malicious? If yes: playbook automation creates a ticket and prompts the user for a decision to block the IP or DNS black-hole; if no: close the event with an automated comment.]
Security Automation and Orchestration: Integrated Work Flow

[Diagram: Data sources / alerts -> ingestion to platform as events with artifacts -> automated enrichment and investigation; manual investigation/enrichment with pivoting on data; automation execution with action approval and automated resolution -> case management, reporting, and centralized display of data.]
Security Automation and Orchestration: Benefits to Automation and Orchestration Platforms

Manage all your event data in one single combined screen
App development: connectivity to hundreds of third-party tools
Standardized data formats
Asset administration
Visual automation / playbook development
Case Management and Reporting
Pivoting on Data
Exercise: Beautiful Soup
HTML or XML parser
Tree traversal
Web scrapers
And here’s what we did with it