A.C. Chen 2012/07/23 @ ADL 1
A FRAMEWORK FOR DETECTING MALFORMED SMS ATTACK
M. Zubair Rafique, Muhammad Khurram Khan, Khaled Alghathbar, Muddassar Farooq
The 8th FTRA International Conference on Secure and Trust Computing, Data Management, and Applications (STA 2011)

Outline
• Introduction
• Malformed message detection framework
• Evaluation and experimental results
• Conclusion

SMS Delivery Process
[Figure: SMS delivery path, with SMS_SUBMIT and SMS_DELIVER legs]
• BSC: Base Station Controller
• MSC: Mobile Switching Center
• GMSC: Gateway MSC
• IWMSC: Interworking MSC

Short Message Service (SMS)
• A message sent to or from a mobile phone is first sent to an intermediate component called the Short Message Service Center (SMSC)
• An SMS message exists in 2 formats
  • SMS_SUBMIT: mobile phone to SMSC
  • SMS_DELIVER: SMSC to mobile phone

GSM Modem
• An SMS received on a mobile phone is handled through the GSM modem
• Provides an interface between the GSM network and the application processor of a smartphone
• Controlled through standardized AT commands
[Figure: Apps / Telephony Stack / Modem, exchanging AT commands and AT result codes]
• Modem: responsible for cellular communications
• Telephony stack: responsible for the communication between the application processor and the modem

Example: SMS_DELIVER
• First line: AT result code plus the length of the SMS
• Second line: the complete SMS string in hex

Malformed SMS Attack
• Causes the application processor to reach an undefined state
  • Significant processing delays
  • Unauthorized access
  • Denying legitimate users access
  • …
[Figure: Apps / Telephony Stack / Modem]
• However, malformed message detection in mobile phones has received little attention

In This Paper…
• A malformed message detection framework is proposed
• It automatically extracts novel syntactical features to detect a malformed SMS at the access layer of mobile phones

Common Idea
• Anomalies are deviations from a learnt normal model [Patrick Düssel, et al.]
  • Learning → Normal model → Anomaly detection
• Supported by our pilot studies
  • The distance values of malformed messages are normally greater than those of benign messages

SMS Detection Framework
[Figure: Message Analyzer → Feature Extraction → Feature Selection → Classification]

Message Analyzer
• Message dissection
  • Transforms incoming SMS messages into a format from which we can extract intelligent features
  • Extracts the complete SMS message string, i.e. the second line of the AT result code
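A minimal sketch of the dissection step, assuming the modem reports a new message in PDU mode as a two-line `+CMT:` notification (a header carrying the length, then the hex string). The function name `extract_sms_string` and the sample input are hypothetical and do not come from the paper.

```python
def extract_sms_string(notification: str) -> tuple[int, str]:
    """Split a two-line notification into (declared length, hex SMS string)."""
    # Drop empty lines left over from the <CR><LF> framing.
    lines = [line.strip() for line in notification.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[0].startswith("+CMT:"):
        raise ValueError("not a +CMT notification followed by a message line")
    # Assumed header shape: '+CMT: ,<length>'; the length is the last comma field.
    declared_length = int(lines[0].rsplit(",", 1)[1])
    # The second line is the complete SMS string in hex.
    return declared_length, lines[1].upper()


# Placeholder hex for illustration only, not a valid SMS_DELIVER PDU.
length, sms_hex = extract_sms_string("+CMT: ,24\r\n0791ab00\r\n")
```

Only the second line is passed on to feature extraction; the declared length is returned as well so that a caller could compare it against the actual string if desired.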

Extraction of String Features
• Mine features from an incoming SMS message
  • Exploit the properties of a suffix tree
  • Use a set of attribute strings to model the content of the incoming message
• Entrenching function: extracts the (attribute, value) pairs from the suffix tree
  • attribute: a feature string a
  • value: the frequency of a, taken from the nodes of the suffix tree
  • Example

Raw Model Vectors
• For the purpose of training, we prepared a training data set
  • 𝛫: set of messages used for training, 𝛫 = {m1, …, mk}
  • After each mi passes through the entrenching function, we have our raw model
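The slide does not spell out the layout of the raw model. One plausible reading, shown here purely as an assumption, is a table with one row per training message and one column per attribute string seen anywhere in the training set. The name `build_raw_model` and the sample pairs are hypothetical.

```python
def build_raw_model(pair_vectors: list[dict[str, int]]) -> tuple[list[str], list[list[int]]]:
    """Align per-message (attribute, value) pairs on a shared attribute list."""
    # Union of all attribute strings, sorted for a stable column order.
    attributes = sorted({attr for pairs in pair_vectors for attr in pairs})
    # An attribute that is missing from a message gets frequency 0.
    matrix = [[pairs.get(attr, 0) for attr in attributes] for pairs in pair_vectors]
    return attributes, matrix


# Made-up output of the entrenching function for two training messages.
training_pairs = [{"0": 2, "1": 1}, {"1": 3, "2": 1}]
attributes, matrix = build_raw_model(training_pairs)
```

With this layout the number of columns grows with every new substring, which is the high dimensionality that feature selection is meant to reduce.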

Feature Selection
• The high dimensionality of the raw model results in large processing overheads
• Remove redundant features that have low classification potential
  • Not at the cost of a high false alarm rate

Selection Techniques
• Use 3 selection mechanisms to obtain 3 distinct model sets of attributes
  • Information Gain (IG)
  • Gain Ratio (GR)
  • Chi-squared (CH)
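As an illustration of one of the three mechanisms, the sketch below computes the textbook information gain of a single discrete attribute with respect to class labels. It assumes that labelled messages (for example benign versus malformed) are available for ranking attributes, which the slides do not state; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels: list) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values: list, labels: list) -> float:
    """Entropy of the labels minus their entropy once the attribute value is known."""
    groups: dict = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: the first attribute separates the classes, the second does not.
labels = ["benign", "benign", "malformed", "malformed"]
useful = information_gain([0, 0, 1, 1], labels)
useless = information_gain([0, 1, 0, 1], labels)
```

Attributes would then be ranked by this score and only the top ones kept. Gain Ratio and Chi-squared follow the same rank-and-keep pattern with a different scoring function.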

Distance/Divergence
• For a given vector of pairs, compute the deviation (message score, distance) of the vector
• Use 2 well-known distance measures to obtain the score
  • Manhattan distance (md)
  • Itakura-Saito Divergence (isd)
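The two measures can be sketched with their standard definitions, md(x, y) = Σ|x_i − y_i| and isd(x, y) = Σ(x_i/y_i − ln(x_i/y_i) − 1). How the paper chooses the reference vector y is not given on the slide, so the vectors below are made up. The small `eps` that keeps zero frequencies from breaking the ratio is an assumption of this sketch.

```python
import math


def manhattan(x: list[float], y: list[float]) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def itakura_saito(x: list[float], y: list[float], eps: float = 1e-9) -> float:
    """Itakura-Saito divergence; eps (an assumption) guards against zeros."""
    total = 0.0
    for a, b in zip(x, y):
        ratio = (a + eps) / (b + eps)
        total += ratio - math.log(ratio) - 1.0
    return total


# Made-up frequency vectors over the same selected attributes.
message_vector = [2.0, 2.0, 1.0]
model_vector = [1.0, 2.0, 3.0]
md_score = manhattan(message_vector, model_vector)
isd_score = itakura_saito(message_vector, model_vector)
```

Unlike the Manhattan distance, the Itakura-Saito divergence is not symmetric, so the order of the message vector and the model vector matters.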

Classification
• Threshold value
  • The largest distance score of a message in the training model
• Raise an alarm
  • If the distance score of an incoming SMS is greater than the threshold value
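The rule above is small enough to sketch directly. The function names and the scores are hypothetical; the scores stand in for distance values computed during training.

```python
def learn_threshold(training_scores: list[float]) -> float:
    """Threshold = largest distance score seen in the training model."""
    return max(training_scores)


def raise_alarm(score: float, threshold: float) -> bool:
    """Flag an incoming SMS whose score is strictly greater than the threshold."""
    return score > threshold


# Made-up training scores.
threshold = learn_threshold([0.8, 1.5, 1.1])
alarm_for_far_message = raise_alarm(2.0, threshold)
alarm_for_boundary_message = raise_alarm(1.5, threshold)
```

Because the comparison is strict, a message that scores exactly at the threshold is treated as benign, which matches the "greater than" wording of the slide.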

Review
• Training is only required in the beginning
[Figure: framework overview, with labels "threshold" and "message score"]

Evaluation
• Collected a real-world dataset of SMS messages
  • ≥ 5000 benign messages
    • Developed a modem terminal interface to collect more than 5000 real-world benign SMS messages
  • ≥ 5000 malformed messages
    • SMS injection framework (Mulliner, C., et al., 2009)

Experimental Goal
• To select the best feature selection technique and distance measure
  • 3 feature selection modules
    • Information Gain (IG)
    • Gain Ratio (GR)
    • Chi-squared (CH)
  • 2 distance measures
    • Manhattan distance (md)
    • Itakura-Saito Divergence (isd)

Parameters and Definitions
• Used 4 parameters to define the detection accuracy and the false alarm rate
  • True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN)
  • Detection Rate
  • False Alarm Rate
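The formulas for the two rates did not survive in the slide text. The sketch below uses the standard definitions, DR = TP / (TP + FN) and FAR = FP / (FP + TN), on the assumption that the slide showed the same. The counts are made up and are not results from the paper.

```python
def detection_rate(tp: int, fn: int) -> float:
    """Fraction of malformed messages that were flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def false_alarm_rate(fp: int, tn: int) -> float:
    """Fraction of benign messages that were wrongly flagged: FP / (FP + TN)."""
    return fp / (fp + tn)


# Made-up confusion-matrix counts, for illustration only.
dr = detection_rate(tp=98, fn=2)
far = false_alarm_rate(fp=1, tn=99)
```

Sweeping the alarm threshold and plotting the detection rate against the false alarm rate yields a ROC curve.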

Results: Receiver Operating Characteristic Curves
[Figures: ROC using Manhattan Distance; ROC using Itakura-Saito Divergence]

Results: Overheads
• Training and threshold calculation overheads (ms/100 SMS) and testing overheads (ms/1 SMS) using Information Gain, Gain Ratio and Chi-squared, for Manhattan distance and Itakura-Saito Divergence
[Table omitted; one entry is annotated "Provides the best performance"]
• Average training time = 3.5 s/100 SMS
• Average detection time of a malformed message = 10 ms

Conclusion
• A real-time malformed message detection framework
  • Tested on real datasets of SMS messages
  • Successfully detects malformed messages with a detection accuracy of more than 98%
• Future research will focus on further optimizing the framework and deploying it on real-world mobile devices and smartphones

Q & A

Example of a Suffix Tree
• Extract feature strings from an incoming message m = 0110223
• The set of attribute strings is thus generated

Example of Entrenching Function
• Message m = 0110223
• Set of attributes: {3, 0, 1, 2, 23, 223, 110223, 10223, 0223, 0110223}
• Vector of pairs = (3, 1), (0, 2), (1, 2), (2, 2), (23, 1), (223, 1), …
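The example can be reproduced with a short sketch. Instead of building a real suffix tree, it enumerates all substrings and keeps those that label a suffix-tree node: every suffix (a leaf) and every substring followed by at least two different characters (a branching node). This brute-force substitute is only meant for short strings. The function name `entrench` is hypothetical, and the code assumes that `$` never occurs in the message.

```python
from collections import defaultdict


def entrench(message: str) -> dict[str, int]:
    """Map each suffix-tree node label of `message` to its frequency."""
    counts: dict[str, int] = defaultdict(int)
    followers: dict[str, set] = defaultdict(set)
    terminated = message + "$"  # '$' marks the end of the message
    n = len(message)
    for start in range(n):
        for end in range(start + 1, n + 1):
            sub = message[start:end]
            counts[sub] += 1  # counts every (overlapping) occurrence
            followers[sub].add(terminated[end])  # character right after it
    return {
        sub: freq
        for sub, freq in counts.items()
        if len(followers[sub]) >= 2 or message.endswith(sub)
    }


pairs = entrench("0110223")
```

For m = 0110223 this yields ten attributes: the branching nodes 0, 1 and 2 with frequency 2, and the seven suffixes with frequency 1, which agrees with the set and the pairs listed above.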

The RIL in the context of Android's Telephony system architecture [ref]

Modules that implement telephony functionality