Using the GSP data mining algorithm to detect malicious flows in Lawrence Berkeley National Laboratory FTP
Amir Razmjou
Pattern-based Techniques and Today’s Cybersecurity Challenges
• Protocol specifications evolve more rapidly.
• Vendor-specific, closed-standard protocols.
• Verifying network traffic against protocol specifications does not always tell legitimate traffic apart from abuse of valid protocol features:
– XML XXE attacks
– FTP bounce attacks
• Unknown attacks.
• Definitions of abnormality must account for changes in user interactions.
Sequential Pattern Mining
• It is similar to frequent itemset mining, but takes ordering into account.
• Sequential pattern mining is useful in many applications:
– Customer shopping sequences
– Medical treatments, natural disasters (e.g., earthquakes), science & engineering processes, stocks and markets, etc.
• Useful for extracting knowledge from semi-structured data (e.g. XML)
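The contrast with frequent itemset mining can be made concrete with a small sketch. The toy sequences and both helper functions below are illustrative assumptions, not data or code from the study: one function counts sequences that contain a set of items in any order, the other counts only those in which the items appear in the given order.

```python
def itemset_support(sequences, items):
    """Count sequences that contain every item, in any order."""
    return sum(1 for seq in sequences if set(items) <= set(seq))


def sequence_support(sequences, pattern):
    """Count sequences that contain the pattern's items in the given order."""
    def occurs_in_order(seq):
        remaining = iter(seq)
        # `item in remaining` consumes the iterator up to the first match,
        # so each later item has to appear after the previous one.
        return all(item in remaining for item in pattern)
    return sum(1 for seq in sequences if occurs_in_order(seq))


# Hypothetical toy data: one list of events per sequence.
toy_sequences = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
]

unordered = itemset_support(toy_sequences, ["a", "b"])
ordered = sequence_support(toy_sequences, ["a", "b"])
print(unordered, ordered)  # 3 2
```

All three toy sequences contain both items, but only two of them contain a followed later by b, which is the extra constraint that sequential pattern mining adds.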
What Are a Sequence Database and Sequential Pattern Mining?
• A sequence database consists of ordered elements or events where each element is an unordered set of items.
SID sequences
10 <a(abc)(ac)d(cf)>
20 <(ad)c(bc)(ae)>
30 <(ef)(ab)(df)cb>
40 <eg(af)cbc>
TID itemsets
10 a, b, d
20 a, c, d
30 a, d, e
40 b, e, f
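As a rough illustration of how support is counted over the sequence table above, the sketch below parses the angle-bracket notation and tests whether a pattern is contained in a sequence. The function names (`parse_sequence`, `contains`, `support`) are hypothetical helpers introduced here, and the containment rule assumed is the usual one: each pattern element must be a subset of some sequence element, with the matched elements in order.

```python
def parse_sequence(text):
    """Parse notation such as '<a(abc)(ac)d(cf)>' into a list of item sets."""
    elements, group = [], None
    for ch in text.strip("<>"):
        if ch == "(":
            group = set()              # start of a multi-item element
        elif ch == ")":
            elements.append(frozenset(group))
            group = None
        elif group is not None:
            group.add(ch)              # item inside parentheses
        else:
            elements.append(frozenset(ch))  # single-item element
    return elements


def contains(sequence, pattern):
    """True if every pattern element is a subset of a later sequence element."""
    pos = 0
    for wanted in pattern:
        # Greedily take the earliest element that covers `wanted`.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(database, pattern_text):
    """Number of sequences in the database that contain the pattern."""
    pattern = parse_sequence(pattern_text)
    return sum(1 for seq in database.values() if contains(seq, pattern))


database = {
    10: parse_sequence("<a(abc)(ac)d(cf)>"),
    20: parse_sequence("<(ad)c(bc)(ae)>"),
    30: parse_sequence("<(ef)(ab)(df)cb>"),
    40: parse_sequence("<eg(af)cbc>"),
}

print(support(database, "<(ab)c>"))  # 2 (sequences 10 and 30)
print(support(database, "<ac>"))     # 4
```

With a minimum support of 2, `<(ab)c>` would count as a sequential pattern in this database, while the same items taken as an unordered itemset would ignore the fact that c must come after the element containing a and b.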
Sequential Shopping Cart
[Figure: four shopping sequences (Sequence 1 to Sequence 4), each listed as a series of transactions over grocery categories such as biscuits, snack, baking needs, frozen foods, salads, cake, fruit, chickens, beef, pet food, lamb, vegetables, electrical, and brushware.]
Sample FTP Flow
Welcome to Microsoft FTP Server 3.4
USER anonymous
331 Guest login ok, send your complete e-mail address as password.
PASS <password>
230 Guest login ok, access restrictions apply.
TYPE I
200 Type set to I.
CWD xfig
250 CWD command successful.
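One way to turn a flow like the sample above into command and reply-code pairs is sketched below. This is an assumption about the preparation step, not the study's actual tooling: the regular expressions and the `pair_commands` helper are hypothetical, and each command is simply paired with the first three-digit reply that follows it.

```python
import re

# A reply line starts with a three-digit code followed by a space or hyphen.
REPLY = re.compile(r"^(\d{3})[ -]")
# A command line starts with a three- or four-letter upper-case verb.
COMMAND = re.compile(r"^([A-Z]{3,4})\b")


def pair_commands(lines):
    """Pair each client command with the first reply code that follows it."""
    pairs = []
    pending = None
    for line in lines:
        reply = REPLY.match(line)
        if reply:
            if pending is not None:
                pairs.append((pending, reply.group(1)))
                pending = None
            continue
        command = COMMAND.match(line)
        if command:
            pending = command.group(1)
    return pairs


sample_flow = [
    "Welcome to Microsoft FTP Server 3.4",
    "USER anonymous",
    "331 Guest login ok, send your complete e-mail address as password.",
    "PASS <password>",
    "230 Guest login ok, access restrictions apply.",
    "TYPE I",
    "200 Type set to I.",
    "CWD xfig",
    "250 CWD command successful.",
]

pairs = pair_commands(sample_flow)
print(pairs)
# [('USER', '331'), ('PASS', '230'), ('TYPE', '200'), ('CWD', '250')]
```

The banner line is skipped because it matches neither pattern, and command arguments such as the password or directory name are dropped, leaving only the verb and the reply code.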
Data Preparation
Resulting Dataset
Source               Destination      APP Signature  COMMAND  CODE
4.251.189.14:33257   131.243.1.10:21  custom1        USER     331
4.251.189.14:33257   131.243.1.10:21  custom1        PASS     230
4.251.189.14:33257   131.243.1.10:21  custom1        REST     350
4.251.189.14:33257   131.243.1.10:21  custom1        TYPE     200
4.251.189.14:33257   131.243.1.10:21  custom1        CWD      250
4.251.189.14:33257   131.243.1.10:21  custom1        TYPE     200
140.114.97.25:33983  131.243.1.10:21  custom1        USER     331
140.114.97.25:33983  131.243.1.10:21  custom1        PASS     230
140.114.97.25:33983  131.243.1.10:21  custom1        SYST     215
140.114.97.25:33983  131.243.1.10:21  custom1        CWD      550
53.55.176.50:10011   131.243.1.10:21  custom1        USER     331
53.55.176.50:10011   131.243.1.10:21  custom1        PASS     230
53.55.176.50:10011   131.243.1.10:21  custom1        FEAT     500
53.55.176.50:10011   131.243.1.10:21  custom1        SYST     215
53.55.176.50:10011   131.243.1.10:21  custom1        PWD      257
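Rows like these still have to be grouped into one sequence per flow before a sequential pattern miner can use them. The sketch below shows one plausible way to do that, assuming a flow is identified by its source and destination endpoints and that each row becomes one element holding the command and its reply code. The `build_sequences` helper is hypothetical; the rows are the first two flows from the table above.

```python
from collections import defaultdict

ROWS = """\
4.251.189.14:33257 131.243.1.10:21 custom1 USER 331
4.251.189.14:33257 131.243.1.10:21 custom1 PASS 230
4.251.189.14:33257 131.243.1.10:21 custom1 REST 350
4.251.189.14:33257 131.243.1.10:21 custom1 TYPE 200
4.251.189.14:33257 131.243.1.10:21 custom1 CWD 250
4.251.189.14:33257 131.243.1.10:21 custom1 TYPE 200
140.114.97.25:33983 131.243.1.10:21 custom1 USER 331
140.114.97.25:33983 131.243.1.10:21 custom1 PASS 230
140.114.97.25:33983 131.243.1.10:21 custom1 SYST 215
140.114.97.25:33983 131.243.1.10:21 custom1 CWD 550
"""


def build_sequences(rows):
    """Group rows by (source, destination) into ordered lists of item sets."""
    flows = defaultdict(list)
    for row in rows.splitlines():
        if not row.strip():
            continue
        source, destination, _signature, command, code = row.split()
        # Assumption: one element per row, containing the command and its code.
        flows[(source, destination)].append(frozenset({command, code}))
    return dict(flows)


sequences = build_sequences(ROWS)
for flow_id, elements in sequences.items():
    print(flow_id, [sorted(element) for element in elements])
```

Keeping the command and the reply code in the same element lets a mined pattern refer to either one or both, for example an element that requires only a reply code.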
Result Sequence Rules
[1] <{USER}{PASS,230}{TYPE,200}{PASV,227}{RETR,150}> 6391
[2] <{USER}{PASS,230}{TYPE,200}{SIZE,213}{RETR,150}> 4853
[3] <{USER,331}{PASS}{TYPE,200}{PASV,227}{RETR,150}> 6391
[4] <{USER,331}{PASS}{TYPE,200}{SIZE,213}{RETR,150}> 4853
[5] <{USER,331}{PASS,230}{CWD,250}{TYPE,200}{150}> 4872
[6] <{USER,331}{PASS,230}{TYPE}{PASV,227}{RETR,150}> 6391
[7] <{USER,331}{PASS,230}{TYPE}{SIZE,213}{RETR,150}> 4853
[8] <{USER,331}{PASS,230}{TYPE,200}{PASV}{RETR,150}> 6392
[9] <{USER,331}{PASS,230}{TYPE,200}{PASV,227}{RETR}> 7927
[10] <{USER,331}{PASS,230}{TYPE,200}{PASV,227}{150}> 8342
[11] <{USER,331}{PASS,230}{TYPE,200}{SIZE}{RETR,150}> 5062
[12] <{USER,331}{PASS,230}{TYPE,200}{SIZE,213}{RETR}> 4893
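A minimal sketch of how rules in this notation could be used to separate matched from unmatched flows follows. It rests on two assumptions that the slides do not spell out: the trailing number on each rule is taken to be a support count, and a flow is treated as unmatched when it contains none of the mined sequences. The helpers `parse_rule`, `contains`, `is_unmatched`, and `as_flow` are hypothetical names.

```python
import re


def parse_rule(line):
    """Split '[n] <{A,1}{B}> 123' into a list of item sets and the count."""
    body, count = re.search(r"<(.*)>\s*(\d+)", line).groups()
    pattern = [frozenset(element.split(","))
               for element in re.findall(r"\{([^}]*)\}", body)]
    return pattern, int(count)


def contains(flow, pattern):
    """True if each pattern element is a subset of a later flow element."""
    pos = 0
    for wanted in pattern:
        while pos < len(flow) and not wanted <= flow[pos]:
            pos += 1
        if pos == len(flow):
            return False
        pos += 1
    return True


def is_unmatched(flow, rules):
    """Assumption: a flow is unmatched if it contains none of the rules."""
    return not any(contains(flow, pattern) for pattern, _count in rules)


def as_flow(pairs):
    return [frozenset(pair) for pair in pairs]


rules = [parse_rule(line) for line in (
    "[5] <{USER,331}{PASS,230}{CWD,250}{TYPE,200}{150}> 4872",
    "[9] <{USER,331}{PASS,230}{TYPE,200}{PASV,227}{RETR}> 7927",
)]

# Hypothetical download flow, made up for illustration.
download = as_flow([("USER", "331"), ("PASS", "230"), ("TYPE", "200"),
                    ("PASV", "227"), ("RETR", "150")])
# Second flow from the dataset table (login, SYST, failed CWD).
failed_cwd = as_flow([("USER", "331"), ("PASS", "230"),
                      ("SYST", "215"), ("CWD", "550")])

print(is_unmatched(download, rules))    # False
print(is_unmatched(failed_cwd, rules))  # True
```

Only two of the twelve rules are loaded here to keep the example short; a flow flagged by this check is a candidate for closer inspection rather than a confirmed attack.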
Abnormal Flows
• Commands in unmatched flows
USER, 331, PASS, 230, PORT, 200, 500, QUIT, 221, 220, PWD, 257, SYST, 215, CWD, 550, PASV, 227, TYPE, SIZE, 213, RETR, 150, 226, MDTM, 250, LIST, 421, ABOR, 533, Udd20dfd1U, U15030ab9U, U54668fafU, Udb6ef1c3U, U7694531dU, PORTQUIT, U07c4edf9U, U8855979dU, Uab12679fU, Uc2ca1083U, U5b79257aU, U5f561953U, Ud4a28da8U
• Signatures of FTP servers in unmatched flows
wu2616121custom1wu2616120proftpdrc2general172125general8msftp4msftpsunos41sunos56othergeneral5vxworks54WarFTPd167proftpdpre
7%
Sequence Size and Support