WARNINGBIRD: A NEAR REAL-TIME DETECTION SYSTEM FOR SUSPICIOUS URLS IN TWITTER STREAM

Presented by: AUGUSTIN JOSE, S7-CSE, Roll No: 21

Guided by: NEETHA K N, CS Department

INTRODUCTION

o TWITTER:
• Popular social networking site
• Users share information as tweets
• Tweet length is limited to 140 characters
• URL shortening services such as bit.ly and tinyurl.com are widely used

o URL (Uniform Resource Locator)

o THREATS: common forms of web attack
• Spam
• Phishing
• Malicious software downloading

AREA OF SEMINAR

SPAM DETECTION:

• Spam becomes a problem as soon as an online communication medium becomes popular

• Spam consists of unwanted messages that contain malicious items

• Because tweets are short, shortened URLs are commonly used

• Through a shortened URL, the user can be redirected to a suspicious site

LITERATURE SURVEY

Each entry lists: Name/founder, Features, Disadvantages, Reference.

Twitter
• Features: Detection based on account information
• Disadvantages: Consumes more time; can be easily fabricated
• Reference: G. Stringhini, C. Kruegel, and G. Vigna, “Detecting Spammers on Social Networks,” 2010.

Don’t Follow Me
• Features: Detection based on account information
• Disadvantages: Can be easily fabricated
• Reference: A. Wang, “Don’t Follow Me: Spam Detecting in Twitter,” 2010.

ARROW
• Features: Based on correlated URLs
• Disadvantages: Does not detect all types of spam; uses more time
• Reference: J. Zhang, C. Seifert, “ARROW: Generating Signatures to Detect Drive-By Downloads,” 2011.

H. Gao & Y. Chen
• Features: Uses message-based features
• Disadvantages: Features can be easily fabricated
• Reference: H. Gao, Y. Chen, “Towards Online Spam Filtering in Social Networks,” 2012.

J. Song & S. Lee
• Features: Sender-receiver relationship
• Disadvantages: Uses the Twitter graph; time and resource consuming
• Reference: J. Song, S. Lee, “Spam Filtering in Twitter Using Sender-Receiver Relationship,” 2012.

PhishAri: Automatic Realtime Phishing Detection on Twitter
• Features: Uses details of the user and the content of the tweet
• Disadvantages: Can't detect suspicious URLs; less browser compatibility
• Reference: Anupama Aggarwal, Ashwin Rajadesingan, “PhishAri: Automatic Realtime Phishing Detection on Twitter,” 2012.

EXISTING SYSTEM

1. Detect accounts based on account information
• Example feature: ratio of tweets with URLs to tweets without URLs
• Easily fabricated by attackers

2. Detect accounts based on the social graph
• Connectivity measures for each node
• Hard to obtain and analyze large amounts of Twitter data

3. Crawl URLs to classify them
• Detect malicious URLs based on HTML content
• Redirection chains are used by attackers

PROPOSED SYSTEM-BASIC IDEAS

Redirection chains:
• A chain starts from a shortened URL
• The attacker posts a tweet containing the shortened URL
• An intermediate URL in the chain acts as the entry point
• Conditional redirections are now used
• Normal browsers are redirected to malicious landing pages
• Crawlers are redirected to benign pages
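To make the conditional redirection idea concrete, here is a minimal toy model. It is only a sketch: the URLs, the `CRAWLER_MARKERS` tuple, and the user-agent check are assumptions made for illustration, and real attackers may use other signals. No network access is involved.

```python
from __future__ import annotations

# Toy model of conditional redirection. All URLs and names are hypothetical.
CRAWLER_MARKERS = ("bot", "crawler", "spider")


def entry_point_redirect(user_agent: str) -> str:
    """Return the landing page a hypothetical entry point would serve."""
    ua = user_agent.lower()
    if any(marker in ua for marker in CRAWLER_MARKERS):
        # Clients that look like crawlers are sent to a harmless page.
        return "http://benign.example/landing"
    # Everything else is treated as a normal browser.
    return "http://malicious.example/landing"


def follow_chain(short_url: str, user_agent: str) -> list[str]:
    """Build the redirect chain that a given client would observe."""
    chain = [short_url, "http://hop.example/r", "http://entry.example/go"]
    chain.append(entry_point_redirect(user_agent))
    return chain


browser_chain = follow_chain("http://bit.ly/abc", "Mozilla/5.0 (Windows NT 10.0)")
crawler_chain = follow_chain("http://bit.ly/abc", "ExampleCrawler/1.0")
```

Both clients see the same chain up to the entry point and differ only in the final hop, which is why a crawler that inspects only the landing page would judge the URL benign.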

Correlated redirect chains:
• Attackers have limited resources, so they reuse them
• As a result, their redirect chains share the same URLs
• The most frequently shared URL is the entry point
• Grouping the same domains produces a correlated redirect chain
• Example with chains A, B and C: A3 = C3, A4 = B3 = C4 (entry point), A6 = B5
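The example above (A3 = C3, A4 = B3 = C4, A6 = B5) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name `find_entry_point` and the placeholder URL strings are assumptions, and each URL is counted once per chain.

```python
from collections import Counter


def find_entry_point(chains):
    """Return (url, count) for the URL shared by the most chains."""
    counts = Counter()
    for chain in chains:
        counts.update(set(chain))  # count each URL at most once per chain
    return counts.most_common(1)[0]


# Placeholder chains mirroring the slide: positions are 1-based in the slide.
chain_a = ["a1", "a2", "shared-1", "entry", "a5", "shared-2"]  # A3, A4, A6 shared
chain_b = ["b1", "b2", "entry", "b4", "shared-2"]              # B3, B5 shared
chain_c = ["c1", "c2", "shared-1", "entry"]                    # C3, C4 shared

entry_url, shared_by = find_entry_point([chain_a, chain_b, chain_c])
```

Here `entry_url` is the URL that appears in all three chains, matching A4 = B3 = C4 in the example.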

PROPOSED SYSTEM-DETAILS

1.DATA COLLECTION:

• The input is the Twitter stream
• Only tweets with URLs are kept
• The URL chain of each URL is crawled and stored
• The tweets are pushed into a tweet queue
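A minimal sketch of this collection step, under stated assumptions: tweets are plain dictionaries with a `text` field, the regular expression is a simplification, and `fake_crawl` is a hypothetical stand-in for a real crawler that would follow HTTP redirects.

```python
import re
from collections import deque

URL_PATTERN = re.compile(r"https?://\S+")  # simplified URL matcher


def collect(stream, crawl, tweet_queue):
    """Keep only tweets with URLs, attach redirect chains, and enqueue them."""
    for tweet in stream:
        urls = URL_PATTERN.findall(tweet["text"])
        if not urls:
            continue  # drop tweets without URLs
        record = dict(tweet)
        record["chains"] = [crawl(url) for url in urls]
        tweet_queue.append(record)


def fake_crawl(url):
    """Hypothetical crawler: returns a fixed redirect chain for any URL."""
    return [url, "http://entry.example/go", "http://landing.example/"]


tweet_queue = deque()
stream = [
    {"user": "alice", "text": "good morning"},
    {"user": "bob", "text": "check this http://bit.ly/abc"},
]
collect(stream, fake_crawl, tweet_queue)
```

Passing the crawler in as a function keeps the sketch runnable offline; a real crawler could be substituted without changing `collect`.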

2.FEATURE EXTRACTION

• When more than w tweets have been collected, the tweet queue pops them
• The URL chains of these w tweets are checked for the same domains
• Domains that share an IP address are grouped, e.g. xyz.com = 20.30.40.50 = abc.com
• The most frequent URL is identified as the entry point
• 11 features are extracted for each redirect chain
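The domain grouping step can be illustrated with a small union-find sketch. The function name `group_domains` and the input format (a mapping from domain to its resolved IP addresses) are assumptions; DNS resolution itself is left out, and `other.example` with its address is made-up sample data.

```python
def group_domains(domain_ips):
    """Map each domain to a group label; domains sharing an IP share a label."""
    parent = {domain: domain for domain in domain_ips}

    def find(domain):
        # Follow parent links to the group representative (with path halving).
        while parent[domain] != domain:
            parent[domain] = parent[parent[domain]]
            domain = parent[domain]
        return domain

    first_seen = {}  # IP address -> first domain observed with that address
    for domain, ips in domain_ips.items():
        for ip in ips:
            if ip in first_seen:
                parent[find(domain)] = find(first_seen[ip])
            else:
                first_seen[ip] = domain
    return {domain: find(domain) for domain in domain_ips}


resolved = {
    "xyz.com": {"20.30.40.50"},
    "abc.com": {"20.30.40.50"},
    "other.example": {"198.51.100.7"},
}
groups = group_domains(resolved)
```

After grouping, `xyz.com` and `abc.com` carry the same label, so URLs on either domain can be treated as the same node when redirect chains are compared.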

3.TRAINING AND CLASSIFICATION

• All features are normalized between 0 and 1
• An offline supervised learning algorithm is used
• The status of each account is checked
• URLs from suspended accounts are considered suspicious
• The classifier flags the corresponding URLs and tweet information as suspicious
• Suspicious URLs are passed to security experts
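The normalization and labeling steps can be sketched as follows. This assumes min-max scaling, which is one common way to map features into [0, 1]; the slides do not say which method is used. The three feature columns and the status strings are made-up sample data, not the 11 features of the system, and the classifier itself is omitted because the slides do not name it.

```python
def min_max_normalize(rows):
    """Scale each feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0
         for value, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def label_from_account_status(statuses):
    """Label a sample 1 (suspicious) if its account is suspended, else 0."""
    return [1 if status == "suspended" else 0 for status in statuses]


# Hypothetical feature vectors, one row per redirect chain.
features = [[2, 10, 300], [6, 40, 100], [4, 25, 200]]
normalized = min_max_normalize(features)
labels = label_from_account_status(["active", "suspended", "active"])
```

The resulting `normalized` rows and `labels` are in the form that any offline supervised learner would take as training input.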

COMPARISON OF EXISTING VS PROPOSED SYSTEM

EXISTING SYSTEM
• Detection based on account features
• Time consuming
• Fabrication possible
• More live optimization
• Less detection accuracy

PROPOSED SYSTEM
• Detection based on correlated URL redirect chains
• Real-time detection
• Fabrication not possible
• Less live optimization
• High detection accuracy

ADVANTAGES AND DISADVANTAGES OF WARNINGBIRD

Advantages:
1. Real-time detection of suspicious URLs
2. No need to access the Twitter graph
3. Discovers new features of suspicious URLs
4. No fabrication is possible

Disadvantages:
1. Dynamic redirection cannot be handled
2. Multiple redirections cannot be handled
3. Coverage and scalability are limited

FUTURE SCOPE

• Adaptation to other services such as Facebook and LinkedIn
• Greater scalability and coverage
• Develop more features that cannot be fabricated
• Handle multiple redirections

CONCLUSION

• Conventional suspicious URL detection systems are ineffective against conditional redirections
• This system can effectively handle conditional redirections
• It identifies an important feature that others have ignored
• An attacker must either spend more on additional redirection servers or risk being caught

REFERENCES

S. Lee and J. Kim, “WarningBird: Detecting Suspicious URLs in Twitter Stream,” Proc. 19th Network and Distributed System Security Symp. (NDSS), 2012.

G. Stringhini, C. Kruegel, and G. Vigna, “Detecting Spammers on Social Networks,” Proc. 26th Ann. Computer Security Applications Conf. (ACSAC), 2010.

H. Kwak, C. Lee, H. Park, and S. Moon, “What Is Twitter, a Social Network or a News Media?” Proc. 19th Int’l World Wide Web Conf. (WWW), 2010.

A. Wang, “Don’t Follow Me: Spam Detecting in Twitter,” Proc. Int’l Conf. Security and Cryptography (SECRYPT), 2010.

H. Gao, Y. Chen, K. Lee, D. Palsetia, and A. Choudhary, “Towards Online Spam Filtering in Social Networks,” Proc. 19th Network and Distributed System Security Symp. (NDSS), 2012.
