

Our Data, Ourselves Hack-Day

Department of Digital Humanities

Giles Greenway

Tobias Blanke

Jenifer Pybus

Mark Cote

The Project:

• What and how much data do smartphone apps collect?

• What can it say about us and how is it used?

• What do we think about this?

• Can we put it to better use?

• ~20 Young Rewired State coders issued with Android phones.

• Custom MobileMiner app reports on app usage.

• Sends data to a modified CKAN instance.

• CKAN is written in Python, based on the Pylons framework.

• Released by the Open Knowledge Foundation: http://ckan.org/
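As a rough illustration of what sending records to CKAN can look like, the sketch below builds (but does not send) a request to the `datastore_upsert` action of CKAN's standard action API. The project's instance is modified, so whether MobileMiner uses this endpoint is an assumption; the host, API key, resource id and record fields are all hypothetical.

```python
import json
import urllib.request


def build_upsert_request(base_url, api_key, resource_id, records):
    """Build (but do not send) a POST request for CKAN's datastore_upsert action."""
    payload = {
        "resource_id": resource_id,
        "records": records,
        "method": "insert",  # append rows rather than update existing ones
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/3/action/datastore_upsert",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_upsert_request(
    "http://ckan.example.org",
    "my-api-key",
    "my-resource-id",
    [{"package": "com.example.app", "port": 3000, "tx_bytes": 1024}],
)
# urllib.request.urlopen(request) would send it to a live instance.
```

Building the request separately from sending it makes the payload easy to inspect before any data leaves the device.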

The Data:

• Poll /proc/<pid>/net/<tcp/udp>.

• Look for sockets/ports.

• Count transmitted/received bytes.

• GSM cell ids.

• Mobile and wireless networks.

• App notifications.

• Periodically save data to an internal SQLite database that users can access.

• Upload data to a CKAN instance.
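To make the polling step concrete, here is a minimal sketch of decoding socket rows in the text layout that Linux uses for /proc/<pid>/net/tcp. It is not MobileMiner's code (the app runs on Android), and the sample row is made up; it only shows how the hex-encoded addresses and ports can be read.

```python
import socket
import struct

# Made-up sample in the layout of /proc/<pid>/net/tcp (header plus one socket).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0BB8 00000000:0000 0A 00000000:00000000 00:00000000 00000000 10001        0 12345
"""


def decode_address(field):
    """Turn 'HEXIP:HEXPORT' into (dotted_ip, port).

    Assumes a little-endian CPU (typical for ARM and x86 devices),
    where the IPv4 address is printed byte-reversed.
    """
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_sockets(text):
    """Return one dict per socket row, skipping the header line."""
    rows = []
    for line in text.splitlines()[1:]:
        parts = line.split()
        if len(parts) < 8:
            continue  # not a complete socket row
        rows.append({
            "local": decode_address(parts[1]),
            "remote": decode_address(parts[2]),
            "state": int(parts[3], 16),
            "uid": int(parts[7]),
        })
    return rows


sockets = parse_sockets(SAMPLE)
```

In the sample row the local port field `0BB8` decodes to 3000, and the `uid` column is one possible way to attribute a socket to an app.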

MobileMiner App: http://kingsbsd.github.io/MobileMiner


GSM Cell Tower Locations: http://opencellid.org

• Full GPS is too invasive, and consumes power.

• Avoid use of Google location API.

• OpenCellId provides locations of (many) cell towers.

• Currently include UK database within the app.

• Next: Bridge MobileMiner to cell DB via CKAN API?
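A lookup against a bundled cell database might look something like the sketch below. The table layout is an assumption that borrows column names from the OpenCellID CSV export (mcc, net, area, cell, lon, lat); the rows and coordinates are made up, and the app's actual schema may differ.

```python
import sqlite3

# Made-up rows: (mcc, net, area, cell, lon, lat). Coordinates are illustrative only.
CELLS = [
    (234, 10, 1234, 56789, -0.1160, 51.5115),
    (234, 15, 4321, 98765, -1.2577, 51.7520),
]


def open_cell_db(rows):
    """Create an in-memory table keyed on the four identifiers of a GSM cell."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cells (mcc INTEGER, net INTEGER, area INTEGER, cell INTEGER,"
        " lon REAL, lat REAL, PRIMARY KEY (mcc, net, area, cell))"
    )
    db.executemany("INSERT INTO cells VALUES (?, ?, ?, ?, ?, ?)", rows)
    return db


def locate(db, mcc, net, area, cell):
    """Return (lat, lon) for a known cell, or None when the tower is not listed."""
    return db.execute(
        "SELECT lat, lon FROM cells WHERE mcc=? AND net=? AND area=? AND cell=?",
        (mcc, net, area, cell),
    ).fetchone()


db = open_cell_db(CELLS)
known = locate(db, 234, 10, 1234, 56789)
unknown = locate(db, 234, 10, 1, 1)  # a cell id that is not in the table
```

Returning `None` for unlisted towers keeps the "many, but not all" coverage of the cell database visible to whatever code consumes the result.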

CKAN:


Processing The Data:

• Aggregate app usage per user per day.

• Cluster GSM cells by k-means using the scikit-learn Python library.

• Label clusters using OpenStreetMap.

• Gather app data by scraping the Play Store (BeautifulSoup, PhantomJS and Selenium).
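The first processing step, aggregating usage per user per day, could be sketched as below. The record layout (user id, app package, timestamp, byte count) and all values are hypothetical; the real data lives in CKAN and may be shaped differently.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (user id, app package, ISO timestamp, bytes transferred).
RECORDS = [
    ("user1", "com.example.chat", "2015-01-01T09:15:00", 2048),
    ("user1", "com.example.chat", "2015-01-01T17:40:00", 1024),
    ("user1", "com.example.game", "2015-01-02T11:00:00", 512),
    ("user2", "com.example.chat", "2015-01-01T08:00:00", 4096),
]


def aggregate_daily(records):
    """Sum transferred bytes for each (user, day, app) combination."""
    totals = defaultdict(int)
    for user, app, stamp, n_bytes in records:
        day = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S").date().isoformat()
        totals[(user, day, app)] += n_bytes
    return dict(totals)


daily = aggregate_daily(RECORDS)
```

Keying on the calendar day rather than the full timestamp is what collapses the two morning and evening readings for the same app into one daily total.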

Docker: https://www.docker.com/

• Docker Linux Containers: Dockerfile->Image->Container

• Installs CKAN, packages, libraries.

• Link to containers for PostgreSQL and Solr.

• Create users and database tables.

• Provide access to the data via IPython Notebooks.

• Provide tools like NumPy, scikit-learn and NLTK.

• Allows users to experiment.

• Documents the software environment.

• Allows for easy deployment.

• Free public image hosting.

Questions:

• Can we link app usage to physical locations?

• Can we make use of cells whose locations are unknown?

• Can we cluster on a spatial AND temporal basis?

• Do apps with certain permissions use more data?

The Line!

Getting an .apk package:

http://apps.evozi.com/apk-downloader/

Fighting Back?

• Grab the app's .apk package file from a rooted phone?

• Decompress the package and examine AndroidManifest.xml.

• Decompile the app and examine the source code.

Fighting Back: Decompressing the .apk

apktool d com.onetouchgame.TheLine.apk

http://code.google.com/p/android-apktool/

AndroidManifest.xml

<receiver android:enabled="true" android:name="com.simplecreator.app.RemoteNotificationReceiver">

<intent-filter>

<action android:name="cn.jpush.android.intent.REGISTRATION"/>

<action android:name="cn.jpush.android.intent.UNREGISTRATION"/>

<action android:name="cn.jpush.android.intent.MESSAGE_RECEIVED"/>

<action android:name="cn.jpush.android.intent.NOTIFICATION_RECEIVED"/>

<action android:name="cn.jpush.android.intent.NOTIFICATION_OPENED"/>

<action android:name="cn.jpush.android.intent.ACTION_RICHPUSH_CALLBACK"/>

<category android:name="com.onetouchgame.TheLine"/>

</intent-filter>

</receiver>

<service android:name="com.umeng.update.net.DownloadingService" android:process=":DownloadingService"/>

<activity android:name="com.umeng.update.UpdateDialogActivity" android:theme="@android:style/Theme.Translucent.NoTitleBar"/>

• The app receives intents from the push notification service jpush.cn. It also declares a service and an activity from umeng.com, a mobile analytics service.

• Is that why it had open sockets on port 3000?
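Once apktool has decoded the manifest to plain XML, the same inspection can be automated. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed-down copy of the excerpt above; the wrapping `<manifest>` and `<application>` elements are added here so the fragment parses, and are an assumption about the full file.

```python
import xml.etree.ElementTree as ET

# Attributes such as android:name live in this XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Trimmed-down manifest; only two of the receiver's actions are kept.
MANIFEST = """\
<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <application>
    <receiver android:enabled="true" android:name="com.simplecreator.app.RemoteNotificationReceiver">
      <intent-filter>
        <action android:name="cn.jpush.android.intent.REGISTRATION"/>
        <action android:name="cn.jpush.android.intent.MESSAGE_RECEIVED"/>
      </intent-filter>
    </receiver>
    <service android:name="com.umeng.update.net.DownloadingService"/>
  </application>
</manifest>
"""


def summarise(manifest_xml):
    """Map each declared component name to the intent actions it listens for."""
    root = ET.fromstring(manifest_xml)
    summary = {}
    for tag in ("receiver", "service", "activity"):
        for component in root.iter(tag):
            name = component.get(ANDROID_NS + "name")
            actions = [a.get(ANDROID_NS + "name") for a in component.iter("action")]
            summary[name] = actions
    return summary


summary = summarise(MANIFEST)
```

Component and action names that sit outside the app's own package (here `cn.jpush` and `com.umeng`) are the ones worth a closer look.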


Fighting Back: Decompile the App

http://code.google.com/p/dex2jar/

dex2jar.sh com.onetouchgame.TheLine

Decompile the .jar file:

Fighting Back: “The Usual Suspects”

Look for PhoneStateListeners and LocationListeners:

if (paramLocation != null) {
    d1 = paramLocation.getLatitude();
    d2 = paramLocation.getLongitude();
    boolean bool1 = d1 < 29.999998211860657D;

Classes provided by tencent.com (a mobile ad service) reference latitude and longitude.

Classes provided by jpush.cn and umeng.com also reference LocationListeners.
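Searching a tree of decompiled sources for these listeners by hand gets tedious, so a small script can help. The sketch below assumes the .jar has already been decompiled into a directory of .java files; the directory layout is hypothetical, and a plain text match like this will miss obfuscated or reflective uses.

```python
import os
import re

# Class names to look for in the decompiled sources.
PATTERN = re.compile(r"\b(LocationListener|PhoneStateListener)\b")


def find_listeners(source_dir):
    """Return {relative_path: sorted listener names} for .java files that mention them."""
    hits = {}
    for folder, _dirs, files in os.walk(source_dir):
        for filename in files:
            if not filename.endswith(".java"):
                continue
            path = os.path.join(folder, filename)
            # Decompiler output is not always clean UTF-8, so replace bad bytes.
            with open(path, encoding="utf-8", errors="replace") as handle:
                found = set(PATTERN.findall(handle.read()))
            if found:
                hits[os.path.relpath(path, source_dir)] = sorted(found)
    return hits
```

Calling `find_listeners` on the decompiled source directory returns only the files worth opening, grouped by which listener they mention.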

Download our app:

Follow us on Twitter: @KingsBSD

Read our blog:

Slideshare: http://www.slideshare.net/kingsBSD/
