An empirical study of third party APK’s URL using scriptable API and fast identifier-specific filter
Ruo Ando, National Institute of Informatics, Japan
Yuuki Takano, Shinsuke Miwa, National Institute of Information and Communications Technology, Japan
ICCSN 2017: 2017 9th IEEE International Conference on Communication Software and Networks, Guangzhou University City, Guangdong University of Technology, May 6-7
Abstract: URLs of Android third party’s APK files
• With the rising popularity of Android applications, third-party APK markets have become an attractive target for attackers. In this paper, we present a framework to inspect the URL strings to which third-party APKs connect, using a headless browser and a fast URL filter. To collect APK files, our system uses navigation scripting in JavaScript, which enables more interactive web page crawling so that results can be fetched after dynamic page loading.
• In addition, FARIS (fast uniform resource identifier-specific filter) is applied to match URL strings in APKs against the blacklists of AdBlock Plus, one of the most popular ad blockers.
http://www.cmcm.com/blog/en/security/2016-01-20/925.html
Android App Stores Become Significant Sources for Malware
System Overview: mining destination URLs in APKs
[Figure: system overview. Labels: CasperJS, PhantomJS, AdBlock Plus; 800,000 APK files collected and 12,000 URLs extracted]
Faris VM
AdBlock's syntax                    Regular expression
*                                   /.*/
| at the beginning of the line      /^/
| at the end of the line            /$/
|| at the beginning of the line     /[\w\-]+\/+/
[Figure: ranking of destinations (top 40); axis from 0 to 250,000]
Enhanced APK crawler
Enhanced String Matching
FARIS is a bytecode interpreter for regular expressions; for simplicity, it provides only four instructions.
Overview of D2 (Droid Dowser)
[Figure: D2 architecture. Components: Perl, CasperJS, PhantomJS, Qt WebKit, AWS CloudFormation. Labels: XPath templates, stack templates (deploy), loop generation, lightweight DSL for each distribution site, API invocation, Qt metacall invocation, crawler deployment for parallel retrieval]
PhantomJS - sendEvent
Event Loop
Qt Metacall
SendEvent
void WebPage::qt_static_metacall(QObject *_o, QMetaObject::Call _c, int _id, void **_a)
{
    // excerpt from the moc-generated dispatcher; other cases are not shown on the slide
    switch (_id) {
    case 0: _t->initialized(); break;
    // ...
    case 31: _t->sendEvent((*reinterpret_cast< const QString(*)>(_a[1])),
                           (*reinterpret_cast< const QVariant(*)>(_a[2])),
                           (*reinterpret_cast< const QVariant(*)>(_a[3])),
                           (*reinterpret_cast< const QString(*)>(_a[4])),
                           (*reinterpret_cast< const QVariant(*)>(_a[5])));
        break;
    // ...
    }
}
* - eventType: "keypress", "keyup" or "keydown" (default: "keypress")
#4 0x000000000041b603 in WebPage::sendEvent (this=0x2cd5370, type=..., arg1=..., arg2=..., mouseButton=..., modifierArg=...) at webpage.cpp:1449
#5 0x000000000041b7a2 in WebPage::sendEvent (this=0x2cd5370, type=..., arg1=..., arg2=..., mouseButton=..., modifierArg=...) at webpage.cpp:1465
#6 0x0000000000467c4f in WebPage::qt_static_metacall (_o=0x2cd5370, _c=QMetaObject::InvokeMetaMethod, _id=33, _a=0x7fffffffd9f0) at moc_webpage.cpp:265
#7 0x00000000004687d6 in WebPage::qt_metacall (this=0x2cd5370, _c=QMetaObject::InvokeMetaMethod, _id=33, _a=0x7fffffffd9f0) at moc_webpage.cpp:361
#8 0x0000000000543b9f in JSC::Bindings::QtRuntimeMetaMethod::call(JSC::ExecState*) ()
https://software.intel.com/zh-cn/forums/topic/289577
CasperJS – navigation scripting without callbacks
[Figure: CasperJS navigation flow. Labels: start(), then(), run(), evaluate, execute function, callbacks, Qt metacalls, PhantomJS, CasperJS, query selector, DOM elements, response (async), send event, passing function, return native type]
Headless Browser with Scriptable JavaScript API
var x = require('casper').selectXPath;
casper.options.viewportSize = {width: 1300, height: 700};
casper.test.begin('test', 1, function(test) {
    casper.start('http://www.freewarelovers.com/android', function() { });
    // wait until the click on the XPath-selected link succeeds
    casper.waitFor(function check() {
        return this.click(x("//*[@id=\"fieldset\"]/table/tbody/tr[2]/td[1]/p[1]/a[1]")) != 0;
    },
    function then() {
        // step body is not shown on the slide
    });
    casper.run(function() { test.done(); });
});

// second snippet on the slide; ARGV[1] is a template placeholder, not a JavaScript variable
casper.start(ARGV[1], function() {
    this.capture('google.png');
});
Perl: XPath templates, loop generation and timeout
this.click(x("//*[\@id=\"fieldset\"]/table/tbody/tr[2]/td[1]/p[1]/a[1]")),215
this.click(x("//*[\@id=\"fieldset\"]/table/tbody/tr[2]/td[1]/p[1]/a[2]")),255
# emit one casper.waitFor() step per item in the listing
for ($counter = 1; $counter < $item; $counter++) {
    print "casper.waitFor(function check() { \n";
    print "return this.click(x(";
    print "\"";
    print "//*[\@id=";
    print "\\";
    print "\"fieldset";
    print "\\";
    print "\"]";
    print "/table/tbody/tr[1]/td[3]/table/tbody/tr[";
    print $counter."]/td/p/b/a\")) !=0; \n";
    print "},";
    print "function then() { \n";
    # ... (remaining generated lines are not shown on the slide)
}
# run each generated script under CasperJS with a 10-second timeout
for ($counter = 1; $counter < $item; $counter++) {
    $TIMEOUT = 10;
    eval {
        local $SIG{ALRM} = sub { die };
        alarm($TIMEOUT);
        # the test script name appended here is not shown on the slide
        $str = "/home/ubuntu/casperjs/bin/casperjs test ";
        $pid = fork;
        if ($pid == 0) {
            exec($str);
        }
        else {
            wait;
        }
        my $timeleft = alarm(0);
    };
    if ($@) {
        # timeout
        kill('KILL', $pid);
    }
}
Generating JavaScript
FarisVM and AdBlock’s syntax
The FARIS VM has two registers, i.e., the string pointer (SP) and the program counter (PC), as well as a frame stack for the SP and PC.
AdBlock's syntax                    Regular expression
*                                   /.*/
| at the beginning of the line      /^/
| at the end of the line            /$/
|| at the beginning of the line     /[\w\-]+\/+/
With this syntax, URL filters can be expressed efficiently and practically. For example, ads.com, which is an exact pattern, does not distinguish between http://ads.com/b.gif and http://ads.com/idx.html; however, ads.com^*.gif will filter only the former.
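The conventional regex approach that FARIS is compared against can be sketched as a direct translation of a filter rule into a regular expression. This is a minimal sketch, not the authors' implementation: the function name adblockToRegExp is hypothetical, the `^` separator is taken from the AdBlock Plus filter documentation (any character other than a letter, a digit, or one of `_ - . %`, or the end of the address), and the `||` prefix is assumed here to mean a leading scheme such as `http://`, which differs slightly from the expression as printed on the slide.

```javascript
// Hypothetical helper: translate one AdBlock-style rule into a RegExp.
function adblockToRegExp(rule) {
  let prefix = '';
  let suffix = '';
  let body = rule;
  if (body.startsWith('||')) {
    prefix = '^[\\w\\-]+:\\/+';          // assumption: skip a scheme such as "http://"
    body = body.slice(2);
  } else if (body.startsWith('|')) {
    prefix = '^';                        // anchor at the beginning of the line
    body = body.slice(1);
  }
  if (body.endsWith('|')) {
    suffix = '$';                        // anchor at the end of the line
    body = body.slice(0, -1);
  }
  // Escape regex metacharacters, leaving '*' and '^' for the next step.
  const escaped = body.replace(/[.+?()[\]{}\\$|]/g, '\\$&');
  const source = escaped
    .replace(/\*/g, '.*')                 // wildcard
    .replace(/\^/g, '(?:[^\\w\\-.%]|$)'); // separator character or end of address
  return new RegExp(prefix + source + suffix);
}

const exact = adblockToRegExp('ads.com');
const gifOnly = adblockToRegExp('ads.com^*.gif');
console.log(exact.test('http://ads.com/b.gif'), exact.test('http://ads.com/idx.html'));
console.log(gifOnly.test('http://ads.com/b.gif'), gifOnly.test('http://ads.com/idx.html'));
```

The two rules are the ones from the example above: the exact pattern accepts both URLs, while the rule with the separator and wildcard accepts only the .gif URL.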
FARIS VM
• FARIS takes a virtual machine approach to regular expressions, but for simplicity, it provides only four instructions.
• FARIS is a bytecode interpreter. Thus, to perform pattern matching, AdBlock Plus's rules are translated into its machine instructions. FARIS interprets the following four instructions: char, skip_to, skip_scheme, match.
input         instruction
*c            skip_to c
*^            skip_to separator
c             char c
^             char separator
|| + line     skip_scheme
| + line      char head
line + |      char tail
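The translation table and the register description can be combined into a small interpreter. The following is a sketch under stated assumptions, not the FARIS source: the names compile, run, consume and isSeparator are hypothetical, an unanchored rule is treated as if it began with `*`, `||` only skips a leading `scheme://`, and each skip_to pushes one (SP, PC) frame so that a later mismatch can retry from the next occurrence.

```javascript
// Separator: anything except a letter, digit, or one of _ - . % ; end of input also counts.
const isSeparator = (ch) => ch === undefined || !/[A-Za-z0-9_\-.%]/.test(ch);

// Translate a rule into instructions following the input/instruction table.
function compile(rule) {
  const prog = [];
  let i = 0;
  let end = rule.length;
  let skip = true;                        // assumption: unanchored rules act like a leading '*'
  let tail = false;
  if (rule.startsWith('||')) { prog.push({ op: 'skip_scheme' }); i = 2; skip = false; }
  else if (rule.startsWith('|')) { prog.push({ op: 'char', arg: { head: true } }); i = 1; skip = false; }
  if (end > i && rule[end - 1] === '|') { tail = true; end -= 1; }
  for (; i < end; i++) {
    const c = rule[i];
    if (c === '*') { skip = true; continue; }
    const arg = c === '^' ? { sep: true } : { lit: c };
    prog.push({ op: skip ? 'skip_to' : 'char', arg });
    skip = false;
  }
  if (tail) prog.push({ op: skip ? 'skip_to' : 'char', arg: { tail: true } });
  prog.push({ op: 'match' });
  return prog;
}

// Number of characters the operand consumes at position sp, or -1 on mismatch.
function consume(arg, url, sp) {
  if (arg.head) return sp === 0 ? 0 : -1;
  if (arg.tail) return sp === url.length ? 0 : -1;
  if (arg.sep) return isSeparator(url[sp]) ? (sp < url.length ? 1 : 0) : -1;
  return url[sp] === arg.lit ? 1 : -1;
}

// Interpreter with two registers (sp, pc) and a frame stack of saved (sp, pc) pairs.
function run(prog, url) {
  let sp = 0;
  let pc = 0;
  const frames = [];
  for (;;) {
    const ins = prog[pc];
    if (ins.op === 'match') return true;
    let n = -1;
    if (ins.op === 'skip_scheme') {
      const k = url.indexOf('://');
      n = k >= 0 ? k + 3 : 0;             // no scheme: stay at the start
    } else if (ins.op === 'char') {
      n = consume(ins.arg, url, sp);
    } else {                              // skip_to: scan forward to the next occurrence
      while (sp <= url.length && (n = consume(ins.arg, url, sp)) < 0) sp++;
      if (n >= 0 && sp < url.length) frames.push({ sp: sp + 1, pc });
    }
    if (n >= 0) { sp += n; pc++; continue; }
    if (frames.length === 0) return false;
    ({ sp, pc } = frames.pop());          // backtrack to the most recent skip_to
  }
}

console.log(run(compile('ads.com^*.gif'), 'http://ads.com/b.gif'));
console.log(run(compile('ads.com^*.gif'), 'http://ads.com/idx.html'));
```

Keeping the instruction set to char, skip_to, skip_scheme and match means the only backtracking point is a skip_to, which is why a single frame stack of (SP, PC) pairs is enough in this sketch.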
Experiments: MATCHING URL WITH ADBLOCK
list                   FARIS (ms)    grep with regex (ms)
easylist_france        62416         3079
easylist_germany       487361        50454
easylist_italy         58318         1978
easyprivacy            4745          6740
fanboy_annoyance       4760          11276
japanese               56241         6992
japanese_tohu          1090          1383
malwaredomains_full    1032          15407
FARIS should be quite suitable for Web
browsers or browser extensions. AdBlock Plus
is one of the most popular browser extensions,
but it is implemented inefficiently. Using FARIS
could increase AdBlock Plus’s performance and
reduce its large memory utilization. Thus,
embedding FARIS into Web browsers or
JavaScript engines is a good choice for
improving overall performance.
Table VI compares the processing time for
matching URL strings against AdBlock Plus
lists. We measured the computing time taken by
basic regular expressions and by FARIS. The
results differ from list to list. However, it
can be concluded that the proposed method with
FARIS works in reasonable processing time
compared with conventional regular expressions.
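To make the per-list differences easier to read, the timings in the table above can be turned into ratios. This is only an arithmetic sketch: the numbers are copied from the table, the column order is assumed to be as printed (FARIS first, grep second), and the variable names are hypothetical.

```javascript
// Timings in milliseconds, copied from the table (column order as printed).
const timings = [
  { list: 'easylist_france', faris: 62416, grep: 3079 },
  { list: 'easylist_germany', faris: 487361, grep: 50454 },
  { list: 'easylist_italy', faris: 58318, grep: 1978 },
  { list: 'easyprivacy', faris: 4745, grep: 6740 },
  { list: 'fanboy_annoyance', faris: 4760, grep: 11276 },
  { list: 'japanese', faris: 56241, grep: 6992 },
  { list: 'japanese_tohu', faris: 1090, grep: 1383 },
  { list: 'malwaredomains_full', faris: 1032, grep: 15407 },
];

// ratio < 1 means the FARIS column is smaller than the grep column for that list.
const withRatio = timings.map((t) => ({ ...t, ratio: t.faris / t.grep }));
const farisFaster = withRatio.filter((t) => t.ratio < 1).map((t) => t.list);

for (const t of withRatio) {
  console.log(t.list.padEnd(22), t.ratio.toFixed(2));
}
console.log('FARIS column smaller for:', farisFaster.join(', '));
```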
Conclusion: investigating URLs of Android third party’s APK files using Faris VM
With the rising popularity of Android applications, third-party APK markets have become an attractive target for attackers. Unfortunately, there have been very few empirical studies of the large number of APKs distributed by third-party markets. In this paper, we present a framework to inspect the URL strings to which third-party APKs connect, using a headless browser and a fast URL filter.
In our experiment, we collected 800,000 APK files and extracted 12,000 URLs. To match URLs against AdBlock Plus lists, we applied FARIS to inspect URL strings with lists such as easylist, easyprivacy and malwaredomains_full. The experiment shows that FARIS can process these strings in reasonable computing time compared with the conventional regex method.