arXiv:1703.02874v2 [cs.CR] 31 Mar 2017

A Study of MAC Address Randomization in Mobile Devices and When it Fails

Jeremy Martin ∗, Travis Mayberry †, Collin Donahue, Lucas Foppe, Lamont Brown,

Chadwick Riggins, Erik C. Rye ‡, and Dane Brown §

US Naval Academy

Abstract

Media Access Control (MAC) address randomization is a privacy technique whereby mobile devices rotate through random hardware addresses in order to prevent observers from singling out their traffic or physical location from other nearby devices. Adoption of this technology, however, has been sporadic and varied across device manufacturers. In this paper, we present the first wide-scale study of MAC address randomization in the wild, including a detailed breakdown of different randomization techniques by operating system, manufacturer, and model of device. We then identify multiple flaws in these implementations which can be exploited to defeat randomization as performed by existing devices. First, we show that devices commonly make improper use of randomization by sending wireless frames with the true, global address when they should be using a randomized address. We move on to extend the passive identification techniques of Vanhoef et al. to effectively defeat randomization in ∼96% of Android phones. Finally, we show a method that can be used to track 100% of devices using randomization, regardless of manufacturer, by exploiting a previously unknown flaw in the way existing wireless chipsets handle low-level control frames.


1 Introduction

Smartphones are one of the most impactful technologies of this century. The ability to access the Internet anytime and anywhere has fundamentally changed both work and personal life across the globe [21]. It is gradually becoming clear, however, that in exchange for this level of access to the Internet people may be giving up a substantial amount of privacy. In particular, it has recently been made public that state-sponsored intelligence agencies, in countries such as Russia and China [5, 11, 3], as well as private sector companies [18], are actively attempting to track cellphone users.

Smartphones conventionally have two major modes of communication, both of which can potentially be used to track users. The first and most obvious is the cellular radio itself [8, 20]. However, an often overlooked second avenue for tracking cellphones (and their corresponding users) is the 802.11 (WiFi) radio that most smartphones also use.

Every 802.11 radio on a mobile device possesses a 48-bit link-layer MAC address that is a globally unique identifier for that specific WiFi device. The MAC address is a crucial part of WiFi communication, being included in every link-layer frame that is sent to or from the device. This unfortunately poses a glaring privacy problem because any third party eavesdropping on nearby WiFi traffic can uniquely identify nearby cellphones, and their traffic, through their MAC addresses [10].

There is one particular type of WiFi packet, called a probe request frame, that is an especially vulnerable part of WiFi traffic with respect to surveillance. Since probe requests are continuously broadcast at a semi-constant rate, they make tracking trivial. Mobile devices are effectively playing an endless game of digital “Marco Polo,” but in addition to “Marco” they are also broadcasting out their IDs (in the form of a MAC address) to anyone that cares to listen.

To address this problem, some modern mobile devices make use of temporary, randomized MAC addresses that are distinct from their true global address. When probe requests are sent out, they use a randomized pseudonym MAC address that is changed periodically. A listener should be unable to continuously track the phone because the MAC changes in a way that hopefully cannot be linked to the previous address.

In this work we evaluate the effectiveness of various deployed MAC address randomization schemes. We first investigate how exactly different mobile Operating Systems (OSs) actually implement randomization techniques, specifically looking at how the addresses are generated and under what conditions the devices actually use the randomized address instead of the global one. Using real-world datasets we provide the first evaluation of adoption rates for randomization across a diverse manufacturer and model corpus.

After establishing the current state of randomization for widely used phone models and OS versions, we move on to show several weaknesses in these schemes that allow us to track phones within and across multiple collections of WiFi traffic. Our work builds on the fingerprinting techniques of Matte et al. [17] in addition to new approaches for deanonymizing phones based on weaknesses we discovered while analyzing wireless traffic from many randomizing phones.

This paper makes the following novel contributions:

• We decompose a large 802.11 corpus, providing the first granular breakdown of real-world MAC address randomization. Specifically, we develop novel techniques to identify and isolate randomization and randomization schemes from large collections of wireless traffic.

• We present the first manufacturer and device breakdown for MAC randomization, describing the particular technique each uses. Our results indicate that adoption rates are surprisingly low, specifically for Android devices.

• We review previous techniques for determining global MAC addresses and find them to be insufficient. We provide additional context and improvements to existing passive and active techniques, substantially increasing their effectiveness.

• We identify significant flaws in the majority of randomization implementations on Android devices. These flaws allow for trivial retrieval of the global MAC address.

• We discover and implement a control frame attack which exposes the global MAC address (and thus allows tracking/surveillance) for all known devices, regardless of OS, manufacturer, device type, or randomization scheme. Furthermore, Android devices can be susceptible to this attack even when the user disables WiFi and/or enables Airplane Mode.

2 Background

2.1 MAC Addresses

Every network interface on an 802.11 capable device has a 48-bit MAC address, a layer-2 hardware identifier. MAC addresses are designed to be persistent and globally unique. In order to guarantee the uniqueness of MAC addresses across devices, the Institute of Electrical and Electronics Engineers (IEEE) assigns blocks of addresses to organizations in exchange for a fee. A MAC Address Block Large (MA-L), commonly known as an Organizationally Unique Identifier (OUI), may be purchased and registered with the IEEE [15], which gives the organization control of and responsibility for all addresses with a particular three-byte prefix. The manufacturer is then free to assign the remaining low-order three bytes (2^24 distinct addresses) any value they wish when initializing devices, subject to the condition that they do not use the same MAC address twice.

An implication of the IEEE registration system is that it is trivial to look up the manufacturer of a device given its MAC address. Using, again, the example of a wireless eavesdropper, this means that anyone listening to 802.11 traffic can determine the manufacturer of nearby devices. To combat this, the IEEE also provides the ability to purchase a “private” OUI which does not include the company’s name in the register. However, this additional privacy feature is not currently used by any major manufacturers that we are aware of.

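[Figure 1: 48-bit MAC Address Structure. The example address 01:23:45:67:89:AB is split into a three-byte OUI (01:23:45) and a three-byte NIC portion (67:89:AB). The first byte (00000001) carries the Unicast/Multicast bit (least significant bit) and the Universal/Local bit (second least significant bit).]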

In addition to the public, globally unique, and manufacturer-assigned MAC address, modern devices frequently use locally assigned addresses [6] which are distinguished by a Universal/Local bit in the most significant byte. Locally assigned addresses are not guaranteed to be unique, and generally are not used in a persistent manner. Locally assigned addresses are used in a variety of contexts, including multi-Service Set IDentifier (SSID) configured access points (APs), mobile device-tethered hotspots, and peer-to-peer (P2P) services. A visual depiction of the MAC address byte structure is illustrated in Figure 1.

Most importantly for this paper, locally assigned addresses may also be used to create randomized MAC addresses as an additional measure of privacy. Similar to an OUI, a three-byte Company Identifier (CID) prefix can be purchased from the IEEE, with the agreement that assignment from this address space will not be used for globally unique applications. As such, a CID always has the local bit set, and is predisposed for use within MAC address randomization schemas. One such example, the Google-owned DA:A1:19 CID, is prominent within our dataset.

With the advent of randomized, locally assigned MAC addresses that change over time, tracking a wireless device is no longer trivial. For this reason, we frequently observe 802.11 probe requests using locally assigned addresses when the device is in a disassociated state (not associated with an AP). When a mobile device attempts to connect to an AP, however, it reverts to using its globally unique MAC address. As such, tracking smartphones becomes trivial while they are operating in an associated state.

Since mobile devices are usually only associated while the user is relatively stationary (otherwise they would be out of range of the AP), tracking them in this state is less of a privacy vulnerability than having the ability to track devices in an unassociated state, which usually occurs when the user is moving from one location to another. Additionally, there are several good reasons to use a global address in an associated state, such as to support MAC address filtering on the network. Therefore we concentrate, in this paper, on evaluating randomization methods and tracking of unassociated devices.
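The address structure described in this section can be illustrated with a short Python sketch. It only assumes the standard IEEE 802 bit layout (Unicast/Multicast bit 0x01 and Universal/Local bit 0x02 in the first byte); the helper names are hypothetical and are not taken from the paper's tooling.

```python
def parse_mac(mac: str) -> bytes:
    """Convert a colon-separated MAC string into its six raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six octets, got {len(parts)}")
    return bytes(int(p, 16) for p in parts)


def is_locally_assigned(mac: str) -> bool:
    """True when the Universal/Local bit (0x02 of the first byte) is set."""
    return bool(parse_mac(mac)[0] & 0x02)


def is_multicast(mac: str) -> bool:
    """True when the Unicast/Multicast bit (0x01 of the first byte) is set."""
    return bool(parse_mac(mac)[0] & 0x01)


def three_byte_prefix(mac: str) -> str:
    """Return the three-byte prefix (an OUI or a CID) in upper case."""
    return ":".join(f"{b:02X}" for b in parse_mac(mac)[:3])
```

For instance, any address under the Google-owned DA:A1:19 CID is reported as locally assigned, while the example address from Figure 1 has its multicast bit set.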

2.2 Mobile OS MAC Randomization

A particularly sensitive privacy issue arises from the manner in which wireless devices identify access points within close proximity. Traditionally, devices perform active scanning where they broadcast probe request frames asking nearby APs to identify themselves and respond with 802.11 parameter information required for connection setup. These probe request frames require a source MAC address, but if an 802.11 device uses its globally unique MAC address then it is effectively broadcasting its identity at all times to any wireless receiver that is nearby. Wireless device users can then easily be tracked across temporal and spatial boundaries as their devices are transmitting with their unique identity.

To combat this privacy concern, both Android and Apple iOS operating systems allow for devices in a disassociated state to use random, locally assigned MAC addresses when performing active scans. Since the MAC address is now random, users gain a measure of anonymity up until they associate with an AP.
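As a rough illustration of what "random, locally assigned" means at the bit level, the sketch below generates such an address in Python. It is an assumption-laden example, not the mechanism any particular OS uses: the function name is hypothetical, and the optional fixed three-byte prefix simply mirrors the CID-style addresses discussed earlier.

```python
import secrets
from typing import Optional


def random_local_mac(prefix: Optional[str] = None) -> str:
    """Generate a random unicast, locally assigned MAC address.

    With no prefix, all 46 free bits are random. With a three-byte
    prefix such as "DA:A1:19", only the low-order three bytes are random.
    """
    if prefix is None:
        # Set the Universal/Local bit (0x02), clear the multicast bit (0x01).
        first = (secrets.randbits(8) | 0x02) & 0xFE
        octets = [first] + [secrets.randbits(8) for _ in range(5)]
    else:
        head = [int(p, 16) for p in prefix.split(":")]
        if len(head) != 3:
            raise ValueError("prefix must have exactly three octets")
        octets = head + [secrets.randbits(8) for _ in range(3)]
    return ":".join(f"{o:02X}" for o in octets)
```

A fully random address offers 2^46 possibilities, whereas a fixed prefix leaves only 2^24, which also makes the scheme easy to recognize in captured traffic.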

The particular software hooks used for randomization vary between operating systems. See Appendix A for a discussion of the OS mechanisms and configuration files that support MAC randomization.

3 Related Work

Vanhoef et al. [22] present several techniques for tracking devices regardless of privacy countermeasures such as MAC address randomization. These attacks rely on devices’ support for Wi-Fi Protected Setup (WPS), a protocol that allows unauthenticated devices to negotiate a secure connection with access points. Unfortunately, in order to facilitate this process, extra WPS fields are added in a device’s probe requests that contain useful information for device tracking. Among these are the manufacturer and model of the device, but also a unique identifier called the Universally Unique IDentifier-Enrollee (UUID-E) which is used to establish WPS connections. The flaw that Vanhoef et al. [22] discovered is that the UUID-E is derived from a device’s global MAC address, and by using pre-computed hash tables an attacker can simply look up the UUID-E in the table and retrieve the global MAC address [22, 16]. We refer to this technique as UUID-E reversal. Since the UUID-E does not change, the implication is that even if the MAC address is randomized, an attacker can still recover the original, global address by performing this reversal technique on the UUID-E.
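The table-lookup idea behind UUID-E reversal can be sketched as follows. This is a minimal illustration under stated assumptions: it supposes the UUID-E is a SHA-1 name-based (version 5) UUID computed over the six MAC bytes with a fixed namespace, and the namespace constant below is an assumed value rather than something given in the text. All function names are hypothetical.

```python
import hashlib
import uuid

# Assumption: a fixed namespace UUID shared by all devices using this derivation.
ASSUMED_NAMESPACE = uuid.UUID("526480f8-c99b-4be5-a655-58ed5f5d6084")


def uuid_e_from_mac(mac: bytes) -> uuid.UUID:
    """Derive a version-5 style UUID from the raw six-byte MAC address."""
    digest = hashlib.sha1(ASSUMED_NAMESPACE.bytes + mac).digest()
    return uuid.UUID(bytes=digest[:16], version=5)


def build_table(oui: bytes, suffixes) -> dict:
    """Pre-compute UUID-E -> MAC for the given low-order suffixes of one OUI."""
    table = {}
    for suffix in suffixes:
        mac = oui + suffix.to_bytes(3, "big")
        table[uuid_e_from_mac(mac)] = mac
    return table


def reverse_uuid_e(observed: uuid.UUID, table: dict):
    """Return the global MAC for an observed UUID-E, or None if not in the table."""
    return table.get(observed)
```

Covering one OUI completely means iterating over all 2^24 suffixes; the sketch accepts any iterable so that a small range can be used for experimentation.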

While the revelation of the flaw was significant, several holes in the analysis were observed due to the dataset on which the work was evaluated. The attack was applied against an anonymized dataset from 2013 [7]. This dataset did not include randomized MAC address implementations as they did not exist in 2013. Additionally, due to the fact that the data was anonymized, and ground truth was not available, a validation of the reversal technique was not provided. The authors state that the address could not be confirmed to be the WiFi MAC address; rather, it may represent the Bluetooth MAC address of the device. Because of this, the reader is left with little understanding of the scope of practical use of these attacks. Namely, is the attack truly viable against devices performing randomization?

The first contribution of this paper is a better evaluation of the attacks presented by Vanhoef et al. [22]. Using more recent real-world data, we verify that this technique is plausible for defeating randomization for a small set of devices. However, we also show that an improvement on their technique can achieve a higher success rate, up to 99.9% effectiveness against vulnerable devices. We are also able to confirm, using additional techniques, that the retrieved MAC address is in fact the 802.11 WiFi identifier and not the Bluetooth address. More importantly, we provide a real-world assessment for the scope of the attack, revealing that only a small portion of Android devices are actually vulnerable.

Vanhoef et al. [22] and Matte et al. [17] present an additional technique: fingerprinting of the probe request 802.11 Information Elements (IEs). IEs are optional, variable-length fields which appear in WiFi management frames and are generally used to implement extensions and special features on top of the standard WiFi protocol. Importantly, there are enough of these extensions and manufacturer-specific functions that the various combinations which are supported on a particular device may be unique to that device, causing the IEs to form a fingerprint which can be used to identify traffic coming from that device.
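To make the fingerprinting idea concrete, the following Python sketch condenses the ordered list of IEs in a probe request into a short signature. It is illustrative only: the input format (pairs of IE tag number and optional vendor OUI), the choice to keep IE order as part of the fingerprint, and the use of a truncated SHA-256 digest are all assumptions, not the construction used in the cited work.

```python
import hashlib


def ie_signature(ies) -> str:
    """Build a signature from an ordered list of (tag, vendor_oui) pairs.

    tag is the numeric IE identifier; vendor_oui is a hex string for
    vendor-specific IEs (tag 221) and None otherwise.
    """
    parts = []
    for tag, vendor in ies:
        parts.append(str(tag) if vendor is None else f"{tag}:{vendor.lower()}")
    canonical = ",".join(parts)
    # A short digest is easier to store and compare than the raw list.
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()[:16]
```

Two probe requests with the same IE layout map to the same signature regardless of their source MAC address, which is what lets a fingerprint link frames sent under different randomized addresses.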

However, we find one significant flaw in the evaluation of these fingerprints: locally assigned MAC addresses were ignored by the authors. Nearly all randomization schemes utilize locally assigned MAC addresses to perform randomization. As such, previous research failed to identify problems observed when tracking randomized MAC addresses. A simple example of this is the signature of a device’s probe request, which we observed changing during randomization and even when not randomizing. Only by observing these behaviors can we truly implement effective derandomization techniques and present honest reflections on the limitations of the attack methods.


Also presented in [22] is a revival of the Karma attack using a top-n popular SSID honeypot approach. As noted above, MAC randomization stops once a device becomes associated with an AP. Karma attacks are active attacks where a rogue AP is configured with an identical name (SSID) to one that the device is set up to automatically connect to. In effect, this forces the device into an authenticated state where it reveals its global MAC address and bypasses randomization. We validate this attack by finding that the increased prevalence of seamless WiFi-offloading from cellular networks means that many devices in the wild are vulnerable.

4 Methodology

Our initial goal is to identify which mobile devices are using randomization, in order to narrow down further investigation into their exact methods for doing so. Since this is not a capability that is advertised on a spec sheet, we resort to broad capture and analysis of WiFi traffic in order to determine which device models are doing randomization.

Over the course of approximately two years, we captured unencrypted 802.11 device traffic using inexpensive commodity hardware and open-source software. We primarily use an LG Nexus 5 Android phone running Kismet PcapCapture paired with an AWUS036H 802.11b/g Alfa card. We hop between the 2.4 GHz channels 1, 6, and 11 to maximize coverage. We additionally employ several Raspberry Pi devices running Kismet with individual wireless cards each dedicated to channels 1, 6, and 11. Our corpus spans January 2015 to December 2016 and encompasses approximately 9,000 individual packet captures. The collection contains over 600 gigabytes (GBs) of 802.11 traffic, consisting of over 2.8 million unique devices.

It is important to note that, since devices only randomize when they are unassociated, the only traffic we are interested in is 802.11 management frames and unencrypted multicast Domain Name System (mDNS) packets. Therefore we did not capture actual intentional user traffic from the device, i.e. web browsing, email, etc., but only automatic, non-personal traffic sent by the device.

4.1 Ethical Considerations

Our collection methodology is entirely passive. At no time did we attempt to decrypt any data, or perform active actions to stimulate or alter normal network behavior while outside of our lab environment. Our intent is to show the ease with which one can build this capability with low-cost, off-the-shelf equipment. However, given the nature of our data collection, we consulted with our Institutional Review Board (IRB).

The primary concerns of the IRB centered on: i) the information collected; and ii) whether the experiment collects data “about whom” or “about what.” Because we limit our analysis to 802.11 management frames and unencrypted mDNS packets, we do not observe Personally Identifiable Information (PII). Although we observe IP addresses, our experiment does not use these layer-3 addresses. Even with an IP address, we have no reasonable way to map the address to an individual. Further, humans are incidental to our experimentation as our interest is in the randomization of wireless device layer-2 MAC addresses, or “what.” Again, we have no way to map MAC addresses to individuals.

Finally, in consideration of beneficence and respect for persons, our work presents no expectation of harm, while the concomitant opportunity for network measurement and security provides a societal benefit. Our experiment was therefore determined to not be human subject research.

4.2 Identifying Randomization

We know devices implement MAC randomization in different ways. In order to quantify the vulnerabilities of employed randomization policies, we first attempt to categorize devices into different bins, with identical behavior, so that we can investigate characteristics of these individual techniques and seek to identify flaws in their implementation. For instance, as we will see, all iOS devices fall into the same bin, in that they handle randomization in a similar way. Android devices, on the other hand, differ significantly from iOS, and also vary greatly from manufacturer to manufacturer.

Table 1: Corpus Statistics

Category            # MACs
Corpus              2,604,901
Globally Unique     1,204,148
Locally Assigned    1,400,753

Table 2: Locally Assigned Bins

Category            # MACs
Locally Assigned    1,400,753
Service             3,147
Randomized          1,388,566
Unknown             9,040

Table 3: Randomization Bins

Category                    # MACs
Randomized                  1,388,566
Android: DA:A1:19 (WPS)     8,761
Android: DA:A1:19           43,924
Android: 92:68:C3 (WPS)     8,961
iOS                         1,326,951
Windows 10 / Linux          59

Our first step is to identify whether a device is performing randomization. This starts with extracting all source MAC addresses derived from probe request frames in our corpus. If the local bit of the MAC address is set, we store the address as a locally assigned MAC address in our database. Since randomized addresses cannot be guaranteed to be globally unique, we assume at this point that any device using randomization will set the local bit in its MAC address and therefore all randomization candidates will be in this dataset. For each address we then parse the advertised WPS manufacturer, model name, model number, and uuid_e values. Additionally, we build signatures derived from a mapping of the advertised 802.11 IE vendor fields using techniques from related work in device-model classification [22, 13]. Each MAC address, associated WPS values (when applicable), and the device IE signature are stored in our database.

Our device signatures are created using custom-built Wireshark dissectors to parse the 802.11 vendor IE fields and values. Our modifications to standard Wireshark files (packet-ieee80211.c and packet-ieee80211.h) allow us to efficiently create the individual device signatures as we process the packet captures, eliminating any need for post-processing scripts. Furthermore, this allows us to use a signature as a display filter while capturing. We will later use the device signatures for both passive and active derandomization techniques.

Our corpus contained a total of ∼66 million individual probe requests. We have a dataset of 2.6 million unique source MAC addresses after removing duplicates. In Table 1 we observe that 1.4 million (∼53%) of the 2.6 million distinct MAC addresses had locally assigned MAC addresses. Recall that locally assigned addresses are not only used for randomization. Therefore, after partitioning the corpus, we separate the locally assigned MAC addresses that are used for services such as P2P and WiFi extenders from those used as randomized addresses for privacy purposes. Doing so required us to manually inspect the frame attributes and look for identifying characteristics.
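The first partitioning step (deduplicate source addresses, then split on the local bit) can be sketched in a few lines of Python. The function names are hypothetical, and the input is assumed to be an iterable of colon-separated MAC strings already extracted from probe requests.

```python
def partition_macs(macs):
    """Deduplicate source MACs and split them into (global, local) sets."""
    unique = {m.upper() for m in macs}
    # The Universal/Local bit is 0x02 of the first octet.
    local = {m for m in unique if int(m[:2], 16) & 0x02}
    return unique - local, local


def share(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`."""
    return 100.0 * part / total
```

Applied to the counts in Table 1, `share(1_400_753, 2_604_901)` gives the fraction of distinct addresses that are locally assigned.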

One prevalent P2P service that makes use of locally assigned addresses is WiFi-Direct. Fortunately, WiFi-Direct operations contain a WiFi-Direct IE (0x506f9a,10). Specifically, the following attributes are observed with all WiFi-Direct traffic: i) the WiFi-Direct IE is present, ii) the observed OUI is simply the original OUI with the local bit set, and iii) the SSID value, if observed, is set with a prefix of DIRECT-. Furthermore, manual inspection of the packet capture reveals that these devices use a single locally assigned MAC address for all observed probe request frames. As these devices are not conducting randomization we remove them from our dataset.

Similarly, Nintendo devices operating in a P2P mode are observed utilizing a locally assigned address. Associated frames use a modified Nintendo OUI, one with the local bit set. Additionally, all Nintendo P2P probe requests contain a unique Vendor Specific IE, 0x00:1F:32, allowing for an efficient identification and removal from our dataset.

Lastly, the remainder of our service-based locally assigned addresses were attributed to WiFi extenders forwarding client probe requests. These were also identified as modifying their original OUI by setting the local bit. Commonly observed OUIs, such as Cisco, D-Link, and Belkin, indicated a likely association to infrastructure devices. We confirmed our assumptions through manual packet analysis, which showed: i) the MAC address never changes, ii) each unique device probes for only one SSID, and iii) devices with WPS attributes clearly indicate wireless extender models.
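The IE-based part of this service filtering could look roughly like the Python sketch below. It is a simplified stand-in for the manual inspection described above: the frame representation (a dictionary with a `vendor_ies` list of `(oui, type)` pairs) and the function name are hypothetical, and the WiFi extender case is omitted because it was identified through behavior over time rather than a single IE.

```python
# Vendor IE markers named in the text.
WIFI_DIRECT_IE = ("506f9a", 10)   # WiFi-Direct IE (OUI, type)
NINTENDO_OUI = "001f32"           # Nintendo Vendor Specific IE


def classify_probe(frame: dict) -> str:
    """Label a locally addressed probe request as a service or a candidate.

    frame["vendor_ies"] is assumed to be a list of (oui_hex, type) pairs
    taken from the vendor-specific IEs of the probe request.
    """
    vendor_ies = frame.get("vendor_ies", [])
    if WIFI_DIRECT_IE in vendor_ies:
        return "wifi-direct"
    if any(oui == NINTENDO_OUI for oui, _ in vendor_ies):
        return "nintendo-p2p"
    # Everything else remains a randomization candidate for further analysis.
    return "candidate"
```

Frames labeled as a service are dropped before the randomization bins are built; only the candidates are carried forward.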

Table 2 illustrates that 99.12% of all locally assigned MAC addresses are randomized addresses, representing ∼53% of our total corpus. While this may seem like it indicates a large rate of adoption for MAC randomization, these addresses do not directly correlate to the number of unique devices in our dataset. While globally unique addresses have a 1-to-1 relationship with individual devices, a device performing randomization has a 1-to-many relationship. It is plausible that a device conducting randomization may have tens of thousands of addresses over a collection period. Therefore we posit that much less than ∼50% of devices conduct randomization.

Our goal, to identify and evaluate potential flaws in currently fielded randomization policies, requires that we must first answer non-trivial questions about our real-world dataset. How many devices were actually performing randomization? Which manufacturers and models have implemented randomization in practice and why? What operating systems are prevalent? Which randomization policies are actually used?

As discussed above, we must first identify distinct bins of randomization within the data. Table 3 highlights the results of this analysis. We completed this analysis by evaluating the following: i) the MAC address prefix (OUI, CID, random), ii) WPS attributes, iii) 802.11 IE derived device signatures, and iv) mDNS fingerprinting techniques [16]. Lastly, we confirm our analysis using devices procured by our team and evaluated in a controlled Radio Frequency (RF) environment. We provide detailed analysis of our methods, results, and answers to our stated questions in §5.

5 Analysis

5.1 Android Randomization

After removing all of the service-based locally assigned MAC addresses described in §4.2, we aim to separate the remaining ∼1.388 million addresses into distinct bins. First we perform a simple query of our database where we identify the most common three-byte prefixes. We expect that the prefixes with the highest occurrences will be the CIDs owned by the representative devices. Our findings were surprising: first, the Google-owned CID DA:A1:19 was by far the most commonly observed prefix (52,595), while the second most common prefix, 92:68:C3, observed 8,691 times, was not an IEEE-allocated CID, but rather a Motorola-owned OUI with the local bit set.

The remaining 177k observed three-byte prefixes, each with total occurrences ranging from a low of two to a high of seven, show no indication of being a defined prefix or CID. While we expected to see the Google-owned CID, we also expected to see additional CIDs configured by manufacturers to override the default Google CID.
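The prefix query described above amounts to a frequency count over the first three bytes of each address. The Python sketch below shows one way to express it, together with a helper that clears the local bit of a prefix so it can be compared against registered OUIs. Both function names are hypothetical, and the paper's own analysis ran as a database query rather than this in-memory version.

```python
from collections import Counter


def top_prefixes(macs, n=2):
    """Return the n most common three-byte prefixes with their counts."""
    counts = Counter(m.upper()[:8] for m in macs)
    return counts.most_common(n)


def clear_local_bit(prefix: str) -> str:
    """Clear the Universal/Local bit of a three-byte prefix.

    Useful for checking whether a locally assigned prefix is simply a
    registered OUI with the local bit set.
    """
    octets = prefix.upper().split(":")
    octets[0] = f"{int(octets[0], 16) & 0xFD:02X}"
    return ":".join(octets)
```

A prefix that dominates the counts and maps back to a registered OUI after clearing the local bit is a strong hint of a manufacturer-defined scheme, as opposed to the long tail of prefixes seen only a handful of times.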

5.1.1 92:68:C3

Investigating the 92:68:C3 prefix in more detail, we see that devices using this prefix always transmit granular WPS details. This is helpful as it lets us easily determine the device model (see §3). First, the Motorola Nexus 6 is the only device using this prefix. Using the WPS derived UUID-E as a unique identifier, we see that there were 849 individual Motorola Nexus 6 devices in our dataset. Second, in order to retrieve the global MAC address we use the UUID-E reversal technique previously mentioned [22, 16]. We find that the actual prefix of the device’s MAC address is not the expected 90:68:C3 OUI. Rather, we observe a set of different Motorola-owned OUIs. In combination with the config.xml file (see Appendix A) retrieved from publicly available repositories, we identify that the prefix 92:68:C3 was purposefully set by Motorola to replace the Google-owned CID.

Searching open source Android code repositories revealed no additional config.xml defined prefixes other than the Google and Motorola ones. This matches what we observe in our real-world dataset.


5.1.2 DA:A1:19

The analysis of the Google CID DA:A1:19 proved more complex, having serious implications for prior work in derandomization attacks. Unlike the Motorola prefix, not all devices using the Google CID transmit WPS attributes. This had multiple effects on our analysis. First, we were unable to easily identify the manufacturer and model information when no WPS information was present. Lacking a UUID-E, we were unable to precisely identify total device counts. More importantly, we were unable to retrieve the global MAC address via the reversal technique. Surprisingly, only ∼19% of observed MAC addresses with the Google CID contain UUID-E values. Since the reversal technique of Vanhoef et al. [22] requires a UUID-E, this emphasizes the fact that previous evaluations are insufficient. A large majority of Android phones are not vulnerable to UUID-E reversal, despite how valuable the technique initially seems.

We evaluated the 8,761 addresses that have WPS values before attempting to break down the 43,924 DA:A1:19 MAC addresses with no WPS information. We observed a diverse, yet limited spread of manufacturers and models, depicted in Table 4. Huawei was the most prevalent manufacturer observed, primarily attributed to the (Google) Nexus 6P (1660 unique devices). Various versions of the Huawei Mate and Huawei P9 were also commonly observed. Sony was well-represented with 277 unique devices across 23 variations of Xperia models. There were several surprising observations in this list, namely that Samsung was absent despite having the largest market share for Android manufacturers [19]. Blackberry, HTC, and LG were also poorly represented. The Blackberry device models were actually four derivations of the Blackberry Priv, accounting for 234 unique devices observed. HTC was largely represented by the HTC Nexus 9 from the Google Nexus line, which explains the likely use of randomization. The HTC One M10 was the remaining HTC device and was only observed once. The only observed LG device was the LG G4 model. We provide a full device breakdown in Appendix C.

Table 4: DA:A1:19 Manufacturer Breakdown

Manufacturer   Total Devices   Model Diversity
Huawei         1708            11
Sony           277             23
BlackBerry     234             4
HTC            108             2
Google         13              2
LG             1               1

In all, devices having randomized MAC addresses with a Google CID and containing WPS attributes amount to a total of 2,341 unique devices. Taking into account the 849 unique Motorola Nexus 6 devices, only 3,188 devices spanning 44 unique models are susceptible to the UUID-E reversal attack. Effectively, ∼99.98% of the locally assigned MAC addresses in our corpus are not vulnerable to the UUID-E attack. Furthermore, our corpus contains approximately 1.2 million client devices with globally unique MAC addresses and over 600 manufacturers and 3,200 distinct models using WPS data fields. This raises two questions: are a large number of Android devices not conducting randomization? Do we expect the 43,924 randomized addresses using the Google CID that did not transmit WPS information to make up all remaining Android devices?

We attempt to answer these questions by evaluating the 43,924 DA:A1:19 MAC addresses where no WPS-derived data is available. The process proceeds as follows:

1. Divide the entire bin into segments based on the device signature described in §4.2, resulting in 67 distinct device signatures, with a starting hypothesis that each signature represents a distinct model of phone.

2. For each signature, parse every packet capture file where that device signature and the CID DA:A1:19 were observed.

3. Apply our custom Tshark device signature filter while parsing, and limit the output to probe request frames.

The output of the algorithm is the source MAC address, sequence number, SSID, and device signature.
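The binning step can be illustrated with a minimal Python sketch. It assumes the probe requests have already been exported as records carrying the four output fields named above; the record type, field names, and capture file names are hypothetical and are not part of our tooling.

```python
from collections import defaultdict
from typing import NamedTuple

GOOGLE_CID = "da:a1:19"  # CID prefix named in the text


class ProbeRecord(NamedTuple):
    capture: str    # hypothetical capture file name
    src_mac: str    # source MAC, colon-separated
    seq: int        # 802.11 sequence number
    ssid: str       # empty string for a broadcast probe
    signature: str  # device signature string (see §4.2)


def bin_by_signature(records):
    """Group Google-CID probe requests by (signature, capture file)."""
    bins = defaultdict(list)
    for rec in records:
        if rec.src_mac.lower().startswith(GOOGLE_CID):
            bins[(rec.signature, rec.capture)].append(rec)
    return dict(bins)


# Illustrative records only; none of these values come from the corpus.
records = [
    ProbeRecord("cap1.pcap", "da:a1:19:00:00:01", 100, "", "0,1,50"),
    ProbeRecord("cap1.pcap", "da:a1:19:00:00:02", 101, "", "0,1,50"),
    ProbeRecord("cap2.pcap", "da:a1:19:00:00:03", 7, "", "0,1,50,3"),
    ProbeRecord("cap2.pcap", "00:11:22:33:44:55", 8, "", "0,1,50,3"),
]
bins = bin_by_signature(records)
```

Each key of `bins` corresponds to one output file in the process above: one device signature paired with one packet capture.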


Left with 2,858 output files, each mapping a device signature to a distinct packet capture, we systematically retrieve the global MAC addresses for the randomized devices. We describe the derandomization methods for this portion of the dataset in detail in §6. After we obtain the global MAC address for the set of randomized MAC addresses within each bin, we attempt to identify the device model using a variety of techniques. Identifying the manufacturer is trivial, as the OUI provides sufficient resolution. To conjecture the device model, however, we borrow from the work of [16], in which model granularity is obtained from MAC address decomposition. Next, we look for any case where a device using a global MAC address as the source of a probe request matches the desired signature and also transmitted an mDNS packet at some point. For this subset we simply retrieve the model information from the mDNS packet [16]. This leaves us with guesses as to which devices randomize MAC addresses using the DA:A1:19 CID and transmit no granular WPS-derived model data. We posit that our set of 67 signature bins can be condensed into groups of similar signatures based on our derived model correlations.

In order to better evaluate our assumptions, and now that we have a smaller, manageable set of possible devices, we procure devices for lab testing. We test each device in an RF-enclosed chamber to ensure that we limit our collection to our individual test phones. We leave each device in the chamber for approximately five minutes, collecting only the probe requests.

We evaluate the collection results by comparing them to our derived signatures and ask the following: do we observe MAC address randomization? If so, does the device signature match expectations when using a global address? Similarly, does the device signature match expectations when using a randomized address? Our findings are presented in Table 5.

Bin 1 is represented by the Google devices LG Nexus 5X and Google Pixel. This bin encompasses 57.7% of the 43,924 MAC addresses observed using the Google CID without WPS data. It is prudent to mention that we cannot claim that this is an exhaustive list of devices implementing randomization using this set of signatures.

Table 5: DA:A1:19 no WPS

Category                 Confirmed   % of no WPS
Bin 1                    √           57.7%
  LG Nexus 5X
  Google Pixel
Bin 2                    √           18.5%
  LG G5
  LG G4
Bin 3                    √           2.0%
  OnePlus 3
  Xiaomi Mi Note Pro
Bin 4                    √           0.2%
  Huawei
  Sony
Bin 5                                2.6%
  Cat S60
Bin 6                                12.2%
  Composite
Bin 7                                6.8%
  Unknown

Next, we evaluate bin 2, representing 18.5% of the category’s total. We observe only LG devices; specifically, we posit that LG G series devices make up this subset. We confirm that both the LG G4 and G5 devices match the signatures and behavior of this bin. We surmise that additional G series devices are represented; however, we have no validation at this time. It is worth mentioning that the LG G4 and Pixel identified in the previous DA:A1:19 with WPS section were only observed because a WPS action was triggered. By default, WPS data is not transmitted by the devices in our no-WPS category. We confirm this analysis in our lab environment, observing WPS data fields only when the user triggers a WPS event.

In bin 3, a smaller bin (2%), the OnePlus 3 and the Xiaomi Mi Note Pro are representative of the identified signatures.

Bin 4, the smallest of our bins with less than one percent of our dataset, consisted of Huawei and Sony devices. These are devices seen using WPS that, in some frames, do not include the WPS data fields.

The Cat S60 smartphone was the only device identified in bin 5. As with the other bins, we do not claim that no other devices share this signature.

Bin 6 represents a combination of the aforementioned devices observed in the various bins. This is caused by devices that on occasion rotate between a standard device signature and a stripped-down version with limited 802.11 IE fields. An example of this signature behavior is described in §6.4 and depicted in Figure 2. As such, this bin is represented by the previously mentioned devices.

We fail to identify anything with any sense of confidence within bin 7.

5.1.3 Motorola

After an exhaustive look at the randomization schemes employed by Android, we still lack any evidence of MAC address randomization by Samsung or Motorola devices (other than the Google-based Motorola Nexus 6). We attempt to find any evidence of non-standard randomization employed by these models by looking at probe requests with globally assigned MAC addresses. In a similar manner to how we identified the most common prefixes for locally assigned addresses, we attempt to identify OUIs with unusually high occurrences within individual packet captures. Our premise is that this will indicate the use of an OUI as a prefix for a set of randomized MAC addresses.

We first ruled out all P2P service related addresses as previously described, leaving a single manufacturer of interest: Motorola. We identified multiple occurrences of various Motorola OUIs with an abnormally high percentage of the unique addresses in a packet capture. After inspecting forty captures with this anomaly we confirmed that a subset of Motorola devices perform randomization using neither a CID nor an OUI with the local bit set. These devices used one of several Motorola-owned OUIs, using the global MAC address occasionally and a new randomized MAC address when transmitting probe requests.

This is an especially strange result because it shows that Motorola is using randomized global addresses. This violates the core expectation that no two devices will use the same global MAC address. In particular, it is possible for one of these devices to temporarily use the true, global MAC address of another device as one of its random addresses.

We identified two distinct signatures consistently observed within this Motorola dataset. Using the aforementioned mDNS techniques to guess a device model, we posit that one signature belongs to the Moto G4 model while the second corresponds to a Moto E2. We acquired Moto G4 and E2 smartphones and confirmed our hypothesis. Additionally, we observed that a Moto Z2 Play device shares the same randomization behavior and signature as the Moto G4.
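The OUI-frequency heuristic described in this section can be sketched as follows. This is an illustration under stated assumptions rather than our collection code: the function names, the 25% threshold, and the example addresses are all hypothetical.

```python
from collections import defaultdict


def oui_shares(capture_macs):
    """Return each OUI's share of the unique MAC addresses in one capture."""
    unique = {m.lower() for m in capture_macs}
    per_oui = defaultdict(set)
    for mac in unique:
        per_oui[mac[:8]].add(mac)  # first three octets, e.g. "aa:bb:cc"
    total = len(unique)
    return {oui: len(macs) / total for oui, macs in per_oui.items()}


def suspicious_ouis(capture_macs, threshold=0.25):
    """OUIs holding an unusually large share of unique addresses.

    The threshold is an assumed value chosen for illustration only.
    """
    shares = oui_shares(capture_macs)
    return sorted(oui for oui, share in shares.items() if share >= threshold)


# Hypothetical capture: one OUI supplies six of the ten unique addresses.
macs = [f"aa:bb:cc:00:00:{i:02x}" for i in range(6)] + [
    "00:11:22:00:00:01",
    "00:33:44:00:00:01",
    "00:55:66:00:00:01",
    "00:77:88:00:00:01",
]
flagged = suspicious_ouis(macs)
```

An OUI flagged this way is only a candidate; as described above, each anomaly still requires manual inspection of the capture.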

5.1.4 Samsung

It is interesting to note that we never observed Samsung devices performing MAC address randomization, despite Samsung being the leading manufacturer of Android smartphones. Samsung uses its own 802.11 chipsets, so it is possible that chipset compatibility issues prevent implementing randomized MAC addresses. Samsung devices alone represent ∼23% of Android devices in our data set, contributing substantially to the low adoption rate that we see.

5.2 iOS Randomization

After completing the randomization analysis of Android devices, we still have over 1.3 million MAC addresses not attributed to any randomization scheme. Next we turn to the analysis of iOS randomization.

Upon the release of iOS 8.0, Apple introduced MAC address randomization, continuing with minor but valuable updates to the policy across subsequent iOS releases. We were faced with an immediate dilemma: how do we identify iOS-associated probe requests? Apple iOS devices do not transmit WPS fields to indicate any sort of model information, and we had no knowledge of any Apple-owned CID. In order to identify any prefix pattern we once again utilized our RF-clean environment to test Apple device behavior. Our goal was to create as many randomized MAC addresses as possible from a device and look for a pattern in the resulting prefixes. To force a new randomized MAC address we simply enable and disable WiFi mode repeatedly.

Our initial thought was that Apple would use an OUI or CID like other manufacturers and simply randomize the least significant 24 bits of the MAC address. However, we quickly found that the MAC addresses randomly generated by iOS devices do not share any common prefix. In fact, they appear to be completely random, including the 24 OUI bits, except for the local bit, which is always set to 1, and the multicast bit, which is set to 0. To lend credence to this new hypothesis we sampled 47,255 random MAC addresses from an iOS device and ran standard statistical tests to determine if they were uniformly distributed (see Appendix B). These tests confirmed that, with the exception of the local and unicast bits, iOS most likely implements true randomization across the entire MAC address. This is interesting given the fact that the IEEE licenses CID prefixes for a price, meaning that Apple is freely making use of address space that other companies have paid for.

Based on these findings, we are faced with identifying a randomization scheme where randomness is applied across 46 bits of the address, yielding 2^46 possible values. We cannot simply assume that a device is an iOS device if its prefix does not match an offset of an allocated OUI. This is due to the aforementioned clobbering of other manufacturers’ OUI space. Our next step was to leverage the use of mDNS once again. We take the union of global MAC addresses derived from probe requests that are also seen as source addresses for iOS related mDNS packets. This results in a set of probe requests that we can confirm are from Apple iOS devices. We then extract all of the signatures for these devices. We suspected that this retrieved only a portion of the relevant iOS signatures. Next we collected signatures from all of our Apple iOS lab test devices using our RF enclosure. Finally, we identify signatures of all remaining locally assigned MAC addresses for which we have no assigned categorization. We then seek to find any probe requests with global source addresses that have matching signatures. If the OUI of the global address resolves to an Apple OUI, we consider that a valid signature. This is slightly different than our mDNS test, as we cannot attribute the signature to a specific set of iOS device models. We test our entire iOS signature set and ensure that no non-iOS global MAC addresses are ever observed with these signatures.

In June 2016, midway through our research, iOS 10 was released. Inexplicably, an Apple vendor specific IE was added to all transmitted probe requests. This made identification of iOS 10 Apple devices trivial regardless of the use of MAC address randomization. We believe the difficulty of identifying MAC address randomization to be one of the best countermeasures against attempts to defeat randomization. Compounding our incredulity, the data field associated with this IE never changes across devices.

Using our combined set of all Apple iOS signatures, we identify ∼1.3 million distinct randomized MAC addresses, by far the most populous (94.7%) of our randomization categories.
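The bit layout underlying this analysis can be made concrete with a short Python sketch. The local and multicast flags are the two least significant bits of the first octet; the helper names are hypothetical, and the generator only mimics the observed address format (local bit 1, multicast bit 0, 46 random bits), not Apple’s actual implementation.

```python
import secrets


def parse_mac(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def is_locally_assigned(mac):
    """True if the local (U/L) bit, 0x02 of the first octet, is set."""
    return bool(parse_mac(mac)[0] & 0x02)


def is_multicast(mac):
    """True if the multicast (I/G) bit, 0x01 of the first octet, is set."""
    return bool(parse_mac(mac)[0] & 0x01)


def random_local_unicast_mac():
    """Random address with the local bit set and the multicast bit cleared."""
    raw = bytearray(secrets.token_bytes(6))
    raw[0] = (raw[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in raw)


example = random_local_unicast_mac()
```

Both CID prefixes discussed earlier, DA:A1:19 and 92:68:C3, satisfy `is_locally_assigned`, which is why the local bit alone cannot separate iOS addresses from other randomization schemes.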

5.3 Windows 10 and Linux Randomization

To conclude our categorization of randomization schemes, we look to identify the probe requests from devices using Windows 10 and Linux MAC address randomization implementations. Our first test compares the signatures obtained from laboratory laptops to the signatures of our locally assigned dataset. We find 59 matches to our laptop signatures, indicating possible Windows 10 or Linux randomization. Next, we parse collection files using the locally assigned MAC addresses from the probe request frames of these devices. Our hypothesis is that if we find matching locally assigned MAC addresses in authentication, association, or data frames, the randomization scheme is likely Windows 10 or Linux. This assumption is due to the fact that these randomization policies use the same locally assigned address for network establishment and higher layer data frames. To that end, we find that 14 of the 59 devices assessed to be Windows/Linux computers use a locally assigned MAC address when associated to a network.

6 MAC Randomization Flaws

Now that we have a baseline understanding of the randomization implementations used by modern mobile OSs, we are able to assess them for vulnerabilities.

6.1 Adoption Rate

The most glaring observation, while not necessarily a flaw per se, is that the overwhelming majority of Android devices are not implementing the available randomization capabilities built into the Android OS. We expect that this may be partly due to 802.11 chipset and firmware incompatibilities. However, some non-randomizing devices share the same chipsets as those implementing randomization, so it is not entirely clear why they are not utilizing randomization. Clearly, no effort by an attacker is required to target these devices.

6.2 Global Probe Request

We next explore the flaws of the observed MAC address randomization schemes. One such flaw is the inexplicable transmission of the global MAC address in tandem with the use of randomized MAC addresses. We observe this flaw across the gamut of Android devices. The single device in which we do not observe this was the Cat S60 smartphone. In no instance did the Cat S60 transmit a global MAC address probe request, except immediately prior to an association attempt. Exploiting this flaw, it was trivial to link the global and randomized MAC addresses using our device signatures and sequence number analysis. Between probe requests, the sequence numbers increase predictably, so an entire series of random addresses can be linked with a global address by just following the chain of sequence numbers. While using sequence numbers has been discussed before in prior work [22], the fact that the global MAC address is utilized while in a supposedly randomized scan state has not. This strange behavior is a substantial flaw, and effectively negates any privacy benefits obtained from randomization. In our lab environment we observed that in addition to periodic global MAC addressed probe requests, we were able to force the transmission of additional such probes for all Android devices. First, any time the user simply turned on the screen, a set of global probe requests was transmitted. An active user, in effect, renders randomization moot, eliminating the privacy countermeasure altogether. Second, if the phone received a call, global probe requests are transmitted regardless of whether the user answers the call. While it may not always be practical for an attacker to actively stimulate the phone in this manner, it is unfortunate and disconcerting that device activity unrelated to WiFi causes unexpected consequences for user privacy.
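The sequence number chaining described in this section can be sketched in a few lines of Python. The sketch assumes the input frames have already been filtered to a single device signature; the function name, the maximum gap of 8, and the example addresses are assumptions made for illustration, not values from our analysis.

```python
SEQ_MODULUS = 4096  # 802.11 sequence numbers are 12 bits wide


def link_by_sequence(frames, max_gap=8):
    """Chain source addresses whose sequence numbers follow one another.

    frames: iterable of (timestamp, src_mac, seq) sharing one signature.
    max_gap: assumed tolerance for frames missed between observations.
    Returns a list of chains, each a list of MACs in order of appearance.
    """
    chains = []  # each chain: {"macs": [...], "last_seq": int}
    for _, mac, seq in sorted(frames):
        # Prefer the chain that already contains this address.
        match = next((c for c in chains if mac in c["macs"]), None)
        if match is None:
            # Otherwise look for a chain this frame continues (with wraparound).
            match = next(
                (c for c in chains
                 if 0 < (seq - c["last_seq"]) % SEQ_MODULUS <= max_gap),
                None,
            )
        if match is None:
            match = {"macs": [], "last_seq": seq}
            chains.append(match)
        if mac not in match["macs"]:
            match["macs"].append(mac)
        match["last_seq"] = seq
    return [c["macs"] for c in chains]


# Hypothetical observations: two random addresses, then a global one.
frames = [
    (0.0, "da:a1:19:aa:aa:01", 4094),
    (0.1, "da:a1:19:aa:aa:01", 4095),
    (5.0, "da:a1:19:bb:bb:02", 0),     # counter wraps from 4095 to 0
    (5.1, "da:a1:19:bb:bb:02", 1),
    (9.0, "00:11:22:33:44:55", 3),     # global address continues the chain
    (9.5, "da:a1:19:cc:cc:03", 2000),  # unrelated device
]
chains = link_by_sequence(frames)
```

In a busy capture, counters from different devices can coincide, which is why the signature filter is applied before chaining.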

6.3 UUID-E Reversal

Vanhoef et al. introduce the UUID-E reversal attack against Android devices [22]. Devices transmitting probe request frames with WPS-enriched data fields, specifically the UUID-E, are vulnerable to a reversal attack in which the global MAC address can be retrieved using the WPS UUID-E value. The flaw is caused by the construction of the UUID-E, where the MAC address is used as an input variable along with a non-random, hard-coded seed value. This implementation design flaw allows hash tables to be pre-computed, so that retrieving the global MAC address requires only a simple search of the tables. This revelation, both groundbreaking and disconcerting, still leaves the reader to guess as to the plausibility of the attack against randomized devices. We find several issues with their approach, specifically with respect to derandomization analysis: i) randomization was not employed in 2013, when the data used in their evaluation was gathered, ii) anonymized data eliminates accuracy checks, and iii) removing locally assigned MAC addresses effectively eliminates the ability to evaluate the attack against devices performing randomization.

Accordingly, we use our corpus of DA:A1:19 and 92:68:C3 datasets to evaluate the effectiveness and viability of the UUID-E attack. Our foremost observation is that only 29% of random MAC addresses from Android devices include WPS attributes. Effectively, 71% of this Android dataset is completely immune to the UUID-E reversal attack. This is in addition to the fact that iOS devices are wholly immune to the attack, as they do not use WPS. We refer back to Table 4 for the limited number of Android models performing randomization and transmitting the necessary WPS UUID-E attribute.

We then retrieve the global MAC address from the probe requests of these devices that used both random and global MAC addresses, exploiting the previously discussed flaw. We use this set of 1,417 ground truth MAC addresses to test the effectiveness of the UUID-E reversal attack. First we pre-compute the required hash tables. Building hash tables for the entire IEEE space would be non-trivial, requiring significant disk space and processing time. While an exhaustive compilation of the address space is certainly possible, we use the knowledge gained from decomposing the randomization schemes to efficiently construct our tables. We build the hash tables using only the OUIs owned by manufacturers we have observed to implement randomization. The resulting hash table is a manageable 2.5 TB, and with pre-sorting techniques we can retrieve a UUID-E’s global MAC address in < 1 second.

We retrieve a global MAC address for 3,187 of the 3,188 UUID-Es. In previous work it was left inconclusive whether the retrieved MAC addresses were in fact the global 802.11 MAC address or instead the Bluetooth MAC address. The UUID-E derived from the HTC One M10 device was the example UUID-E listed in the wpa_supplicant.conf file. With the exception of the HTC Nexus 9, all HTC phones in our dataset (regardless of randomization) used this non-unique UUID-E.

Comparing the 1,417 ground truth addresses to those retrieved from the UUID-E attack, we achieve a 100% success rate. This indicates that the retrieved addresses are in fact the global 802.11 MAC addresses, completing the missing link from the evaluation of Vanhoef et al. [22].
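The table-lookup idea can be illustrated with the following Python sketch. It assumes a name-based, SHA-1 style construction in which a fixed seed is hashed together with the MAC address; that construction is an assumption for illustration, the seed shown is a placeholder rather than the real hard-coded value, and the OUI is hypothetical.

```python
import hashlib
import uuid

# Placeholder 16-byte seed; the real hard-coded value is NOT reproduced here.
ASSUMED_SEED = bytes(range(16))


def uuid_e_from_mac(mac, seed=ASSUMED_SEED):
    """Derive a deterministic UUID from a 6-byte MAC (assumed construction)."""
    digest = hashlib.sha1(seed + mac).digest()
    return uuid.UUID(bytes=digest[:16], version=5)


def build_table(oui, suffixes):
    """Map UUID-E -> MAC for every listed 24-bit suffix under one OUI."""
    table = {}
    for n in suffixes:
        mac = oui + n.to_bytes(3, "big")
        table[uuid_e_from_mac(mac)] = mac
    return table


oui = bytes.fromhex("aabbcc")            # hypothetical OUI
table = build_table(oui, range(0x1000))  # tiny slice of the 2**24 suffixes

target_mac = bytes.fromhex("aabbcc000abc")
observed = uuid_e_from_mac(target_mac)   # stands in for a sniffed UUID-E
recovered = table.get(observed)
```

Because the seed is fixed, the mapping is the same for every device, so a table built once per OUI can be reused for every observed UUID-E. A full table covers all 2^24 suffixes per OUI, which is where the storage cost reported above comes from.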

6.4 Device Signature

To aid in derandomization we employ fingerprinting techniques, using signatures derived from the 802.11 IEs borrowed from previous work [22, 13]. We used this technique first to aid in the identification of the randomization schemes employed by Android and iOS devices.

This technique allows us to remove all extraneous probe request traffic, providing us a “cleaner” dataset in which to employ sequence number analysis. We modify the Wireshark files packet-ieee80211.c and packet-ieee80211.h, creating a new dissector filter, device.signature. We are able to filter previous collection files as well as conduct filtering on live collection. While our contribution to the Wireshark distribution is novel, the fingerprinting technique is not, as we borrowed from related work. However, prior work tested against datasets not performing randomization, which fails to provide accurate context. We test the signature technique against our real world corpus, revealing flaws in previous signature based attacks.

Regardless of the Android implementation, a device transmits probe request frames which have varying signatures (based on IEs, see §3). Devices often use two or more signatures while using a global MAC address, so simply using the signature is insufficient. The same holds for randomized addresses, for which we also observe multiple signatures. In both cases, the second signature has minimal 802.11 IEs. Because nearly all devices periodically use this signature, it adds significant complexity to any signature based derandomization attack. Finally, as Figure 2 illustrates, we observe that most Android devices use different signatures when randomizing compared to when using a global MAC address. As such, previously described signature-based tracking methods fail to correlate the addresses. Using our decomposition of Android randomization schemes, and the derived knowledge of how distinct bins of devices behave, we properly pair the signatures of probe requests using global and randomized MAC addresses. Only by combining these signatures are we able to accurately and efficiently retrieve the global MAC address.

We observe no such change in signatures of iOS devices within a collection timeframe. While an iOS device may not use alternate signatures, it also does not send globally addressed probe requests. Therefore, at this juncture, we have not identified a method of resolving the global MAC address.
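To make the notion of an IE-derived signature concrete, the following Python sketch builds the tag-ordering portion of a signature in the style of Figure 2. It is a simplification under stated assumptions: the input is a hypothetical, already-parsed list of (tag number, payload) pairs, and the capability suffixes shown in Figure 2 (htcap, htagg, htmcs) are omitted.

```python
def ie_signature(elements):
    """Join IE tag numbers in transmission order into a signature string.

    elements: list of (tag_number, payload_bytes) from one probe request.
    Vendor specific IEs (tag 221) are annotated with their OUI and type.
    """
    parts = []
    for tag, payload in elements:
        if tag == 221 and len(payload) >= 4:
            oui = int.from_bytes(payload[:3], "big")
            parts.append(f"221({oui:#x},{payload[3]})")
        else:
            parts.append(str(tag))
    return ",".join(parts)


# Payload contents are illustrative; only tags and the vendor header matter.
global_state = [
    (0, b""), (1, b"\x02\x04"), (50, b"\x0c"), (3, b"\x01"),
    (45, b"\x2c\x01"), (221, bytes.fromhex("0050f208")),
]
random_state = [(0, b""), (1, b"\x02\x04"), (50, b"\x0c")]

sig_global = ie_signature(global_state)
sig_random = ie_signature(random_state)
```

The two strings differ even though they would come from the same phone, which is the pairing problem described above.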

6.5 Association/Authentication Frames

We observe that Android and iOS devices use sequential sequence numbers across management frame types. Using only passive analysis we can follow a device’s transition from randomized probe requests to an authentication or association frame by following the sequence numbers. This is particularly useful as all authentication and association frames from iOS and Android devices use the global MAC address. Using the techniques described in [13] we create a set of signatures for the association frames of iOS devices, specifically to aid in confirming that the device observed in the probe request is the same device type as the one sending the association frame. This method relies on the targeted device attempting to establish a network connection with a nearby AP. As this is fairly user-activity dependent, we reinvestigate the plausibility of the Karma attack against current randomization schemes.

SigG = 0,1,50,3,45,221(0x50f2,8),htcap:012c,htagg:03,htmcs:000000ff

SigR = 0,1,50

Figure 2: Device Signature (Motorola Moto E2)

6.6 Karma Attack

The current versions of iOS and Android randomization policies have eliminated the vast majority of cases where a directed probe is used. A directed probe is a probe request containing a specific SSID to which the device wishes to establish a connection (a previously known or configured SSID), as opposed to a broadcast probe, which solicits a response from all APs in range. Today, the predominant use of broadcast probes has directly affected the ability of a Karma-based attack to succeed. Karma-based attacks work by simulating an access point that a device prefers to connect to. A variety of follow-on consequences, such as man-in-the-middle attacks, are common; however, we are only interested in retrieving the global MAC address and therefore require only a single authentication frame to be transmitted by a targeted device. To this end, Vanhoef et al. also investigate Karma attacks, implemented via a predefined top-n SSID attack, achieving a 17.4% success rate, albeit not specifically related to devices performing randomization.

Unlike previous work, we observe devices while in a randomized state in order to identify specific behaviors that directly counteract randomization privacy goals. Specifically, do we observe traits that allow for a targeted Karma attack? It is well known that hidden networks require directed probes, so while this is a vulnerability to randomization, it is fairly uncommon and results from a configuration the user chooses. Similarly, previous connections to ad hoc networks, saved to the device’s network list, cause both Android and iOS devices to send directed probes. As with hidden networks, this uncommon condition requires action from the user; when it is observed, however, the Karma attack is viable.

Finally, we observe a more disconcerting trend: devices configured for seamless cellular to WiFi data offloading, such as Hotspot 2.0, EAP-SIM and EAP-AKA, force the use of directed probes and are inherently vulnerable to Karma-based attacks [4]. The expanding growth of such handover policies reveals a significant vulnerability to randomization countermeasures. Further exacerbating the problem, these devices are pre-configured with these settings, requiring no user interaction. We confirmed these settings by inspecting the wpa_supplicant.conf file of a Motorola Nexus 6 and Nexus 5X. Removing the networks from the configuration file requires deletion by a rare user with both command line savvy and awareness of this issue.

We test for the presence of these network configurations in our corpus by evaluating all randomized addresses using WPS fields. We are able to accurately evaluate unique devices using the UUID-E value as the unique identifier. We filter for any instance where the device sends a directed probe, retrieving the SSID value for each. Sorted by number of occurrences, the three most common SSIDs were BELL_WIFI, 5099251212, and attwifibn. The SSIDs BELL_WIFI and 5099251212 are used by the mobile carrier Bell Canada for seamless WiFi offloading. Interestingly, the attwifibn SSID is related to free WiFi hotspots provided by the Barnes and Noble bookstore. Only ∼5% of the dataset’s 3,188 devices transmitted a directed probe. However, of those that did, 17% were caused by the preconfigured mobile provider settings.

Next we take a cursory look at Apple iOS and Android devices with no amplifying WPS information. We do not obtain precise statistics; however, we observe the same trend.
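The directed probe tally for the WPS-bearing devices can be sketched as follows. The sketch assumes probe requests reduced to hypothetical (UUID-E, SSID) pairs, with an empty SSID marking a broadcast probe; the function names, device labels, and SSIDs are illustrative only.

```python
from collections import Counter


def top_directed_ssids(probes, n=3):
    """Most common SSIDs in directed probes, counting each device once."""
    seen = {(dev, ssid) for dev, ssid in probes if ssid}
    counts = Counter(ssid for _, ssid in seen)
    return counts.most_common(n)


def directed_fraction(probes):
    """Fraction of unique devices that sent at least one directed probe."""
    devices = {dev for dev, _ in probes}
    directed = {dev for dev, ssid in probes if ssid}
    return len(directed) / len(devices) if devices else 0.0


# Hypothetical observations keyed by a UUID-E label.
probes = [
    ("uuid-1", "CarrierWiFi"),
    ("uuid-1", "CarrierWiFi"),  # repeat from the same device, counted once
    ("uuid-2", "CarrierWiFi"),
    ("uuid-3", "CafeNet"),
    ("uuid-4", ""),             # broadcast probe
]
top = top_directed_ssids(probes)
fraction = directed_fraction(probes)
```

Keying on the UUID-E rather than the source MAC is what keeps a randomizing device from being counted several times.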

6.7 Control Frame Attack

We now evaluate active attack methods for identifying a device by its global MAC address while it is in a randomized state. Our premise: can we force a device performing MAC address randomization to respond to frames targeting the global MAC address? This would allow for easy tracking of devices, even when they are randomizing, because an active attacker could elicit a specific response from them at any time if they are within wireless range.

Table 6: Class 1 Frames [12]

Control          Management         Data
RTS              Probe Request      Frame w/DS bits false
CTS              Probe Response
Ack              Beacon
CF-End           Authentication
CF-End+CF-Ack    Deauthentication
                 ATIM

Figure 3 depicts the 802.11 state diagram illustrating the various states of association for 802.11 devices [12]. We are particularly interested in the frame types that can be sent or received while in an unauthenticated and unassociated state (State 1). The frame types (Class 1 frames) allowed while in State 1 are depicted in Table 6.

In our lab environment, we use packet crafting tools (Scapy, libtins) to transmit customized packets for each frame type, targeting the global MAC of the device.

The source MAC address of the frame is a uniquely crafted MAC address. It is not the actual MAC address of our transmitter. This ensures that we can accurately track any responses to our crafted message, removing any possible control frames that happen to be sent to the actual transmitter address. Of the twelve Class 1 frame types used for the attack, only the Request-to-Send (RTS) frame successfully elicited a response.

Figure 3: 802.11 State Diagram. State 1 (unauthenticated and unassociated) permits Class 1 frames; State 2 (authenticated and unassociated) permits Class 1 and 2 frames; State 3 (authenticated and associated) permits Class 1, 2 and 3 frames. Authentication moves a device from State 1 to State 2 and association from State 2 to State 3; disassociation returns it from State 3 to State 2 and deauthentication returns it to State 1.

Request to Send and Clear to Send (RTS/CTS) transmissions are available in the IEEE 802.11 specification as part of a Carrier Sense Multiple Access with Collision Avoidance scheme. When a node desires to send data, an RTS may be sent to alert other nodes on the channel that a transmission is about to begin and to announce the period of time during which they should not transmit on that channel so as to avoid collisions. If there are no conflicting uses of the channel, the target node will respond with a CTS to acknowledge the request and give the transmitting node permission to solely communicate on the medium.

As for previous location and tracking attacks, some researchers have used RTS/CTS messages to perform Time of Arrival computations [14] while others have extended these techniques to perform Time Difference of Arrival calculations from timestamps in exchanged frames [9]. These older methods perform localization on Access Points from client devices. The novelty in our method is that we are sending RTS frames to IEEE 802.11 client devices, not APs, to extract a CTS response message from which we derive the true global MAC address of that device. Instead of a localization attack, we are using RTS/CTS exchanges to perform derandomization attacks.

The result of sending an RTS frame to the global MAC address of a device performing randomization was that the target device responded with a CTS frame. A CTS frame, having no source MAC address, is confirmed as a response to our attack based on the fact that it was sent to the original, crafted source MAC address. A full listing of the devices used for the control frame attack is available in Appendix D.

Once the global MAC address is known, that device can be easily tracked just as if randomization were never enabled. This might cause one to wonder why vendors would go to such lengths to include MAC address randomization in a device only to allow that same device to divulge the protected information through an administrative protocol. We assert that this phenomenon is beyond the control of individual vendors. The fact is that this behavior occurs across the board on every device we have physically tested, as shown in Appendix D. This leads us to believe that RTS/CTS responses are not a function of the OS, but of the underlying IEEE 802.11 chipset. Manufacturers have configured their chipset hardware with default RTS/CTS operation which may not even be accessible to configure at the OS level. If we are correct, this derandomization issue cannot be fixed with a simple patch or OS update. Susceptible mobile devices will be unmasked by this method for the lifetime of the device. Additionally, due to the hardware level nature of this phenomenon, there will be a significant delay in the market until mobile devices resistant to this attack are produced, assuming manufacturers recognize this as a flaw and subsequently design a process truly capable of delivering MAC address privacy.

There are multiple scenarios in which a motivated attacker could use this method to violate the privacy of an unsuspecting user. If the global MAC address for a user is ever known, it can then be added to a database for future tracking. This global MAC address can be divulged using the techniques discussed in this paper, but it can also be observed any time the user is legitimately using that global MAC address, such as when connected to an AP at home or work. This single leakage of the true identifier will allow an attacker to send an RTS frame containing that global MAC address in the future, to which that host will respond with a correct CTS when it is in range. Conceivably, an adversary with a sufficiently large database and advanced transmission capabilities could render randomization protections moot.

Additional tests, conducted with WiFi and Airplane modes toggled on the target device, revealed further concerns. Namely, Android devices performing location-service enabled functions wake the 802.11 radio. Our RTS attack was thus able to trigger a CTS response from the target, circumventing even extreme privacy countermeasures.

Lastly, we add improvements, using our Wireshark signature filters, to eliminate the constant barrage of transmitted RTS frames. Our collection algorithm is pre-loaded with the device signature of the target of interest; upon observing the signature in the target area, we transmit an RTS frame to the preconfigured MAC address. We test this against our diverse test phones with 100% success.

6.7.1 Bluetooth Correlation

We offer an additional method to derive the global WiFi MAC address for later use in an RTS attack. Wright and Cache [23] claim that Apple iPhone devices, beginning with the iPhone 3G, utilize a one-off scheme for the allocation of the Bluetooth and WiFi MAC addresses, where the WiFi MAC address is equal to the Bluetooth address, plus or minus one. Using a novel algorithm to calculate the WiFi and Bluetooth MAC address from iOS devices operating in hotspot mode, we provide evidence countering this claim.

We identified that Apple iOS devices, operating in hotspot mode, send beacon management frames containing an Apple vendor specific IE. This Type 6 field closely resembles the source MAC address of the device. As Wireshark does not process this field correctly, we built custom dissectors to create display filters for the Apple vendor tag IE and associated data fields. We first test on 29 Apple iOS lab devices, placing each in hotspot mode and collecting the beacon frames. We retrieve the true Bluetooth and WiFi MAC addresses from the device settings menu of the phone. We then parse the beacon frames, outputting the source MAC address and six byte Type 6 IE.

Table 7: Derandomization Technique Results

Randomization Bin        UUID-E Reversal   Global MAC Address   Auth/Assoc   Hotspot 2.0 - Directed Probes   RTS Attack
                                           Probe Request        Frames       Karma Attack
DA:A1:19 with WPS        ✓                 ✓                    ✓            ✓                               ✓
DA:A1:19 w/o WPS         ×                 ✓                    ✓            ✓                               ✓
92:68:C3 with WPS        ✓                 ✓                    ✓            ✓                               ✓
Motorola (No local bit)  ×                 ×                    ✓            ✓                               ✓
Apple iOS                ×                 ×                    ✓            ✓                               ✓

We observe that the Type 6 field exactly represents the Bluetooth MAC address. The source MAC address of the Beacon frame has the local bit set. However, the first byte of the source MAC address is not a simple offset of the global MAC address as seen in most P2P operations. To resolve the actual global MAC address, we find that by replacing the first byte of the source MAC address with the first byte of the Type 6 (Bluetooth-derived) MAC address, we obtain the correct WiFi MAC address of the device. This substitution was successfully tested on all 29 test devices across the gamut of models and iOS versions.
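The byte substitution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the addresses are made up, and extraction of the Type 6 field from the beacon (done in the paper with custom Wireshark dissectors) is assumed to have already happened.

```python
def parse_mac(text: str) -> bytes:
    """Convert a colon-separated MAC string into six raw bytes."""
    raw = bytes(int(part, 16) for part in text.split(":"))
    if len(raw) != 6:
        raise ValueError(f"expected six octets, got {len(raw)}")
    return raw


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a lowercase colon-separated MAC string."""
    return ":".join(f"{octet:02x}" for octet in raw)


def recover_wifi_mac(beacon_source: str, type6_field: str) -> str:
    """Take the first byte from the Type 6 field and the other five
    bytes from the beacon source address (hypothetical helper)."""
    source = parse_mac(beacon_source)
    bluetooth = parse_mac(type6_field)
    return format_mac(bluetooth[:1] + source[1:])


# Made-up addresses for illustration only (not from any real device).
print(recover_wifi_mac("ee:11:22:33:44:55", "a4:11:22:33:44:56"))
# prints a4:11:22:33:44:55
```

Only the first byte is taken from the Type 6 field; the remaining five bytes come from the beacon source address, which is why the result can differ from the Bluetooth address in its last byte.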

Interestingly, six of the 29 test devices did not show a one-off MAC address allocation. As such, we seek to assess the accuracy of the previous claim that iOS devices use this one-off scheme by evaluating it across our entire corpus.

A total of 3,576 devices containing the Type 6 field were identified in our dataset, of which ∼95.4% utilized a one-off addressing scheme. Interestingly, ∼88.2% of those devices had a Bluetooth address that was one higher than the WiFi MAC address, indicating that even when the offset is used it is not uniformly implemented. We are unsure as to why ∼4.6% of iOS devices do not use the one-off policy. Regardless, in all cases the OUI of the two interfaces is the same. Using the mDNS model correlation analysis, we observed no indication that the offset scheme is correlated with the device model.
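A tally of this kind could be computed roughly as follows. This is a minimal sketch, not the authors' tooling: the function names and category labels are hypothetical, and the input is assumed to be a list of (WiFi MAC, Bluetooth MAC) string pairs already recovered from the beacons.

```python
from collections import Counter


def mac_to_int(text: str) -> int:
    """Interpret a colon-separated MAC string as a 48-bit integer."""
    return int(text.replace(":", ""), 16)


def classify_offset(wifi_mac: str, bluetooth_mac: str) -> str:
    """Label the relationship between the two addresses of one device."""
    delta = mac_to_int(bluetooth_mac) - mac_to_int(wifi_mac)
    if delta == 1:
        return "bluetooth_one_higher"
    if delta == -1:
        return "bluetooth_one_lower"
    return "not_one_off"


def same_oui(wifi_mac: str, bluetooth_mac: str) -> bool:
    """Check whether both addresses share the same first three bytes."""
    return wifi_mac.lower().split(":")[:3] == bluetooth_mac.lower().split(":")[:3]


def tally_offsets(pairs) -> Counter:
    """Count how many devices fall into each offset category."""
    return Counter(classify_offset(wifi, bt) for wifi, bt in pairs)


# Made-up pairs for illustration only.
sample = [
    ("a4:11:22:33:44:55", "a4:11:22:33:44:56"),
    ("a4:11:22:33:44:58", "a4:11:22:33:44:57"),
    ("a4:11:22:33:44:60", "a4:11:22:aa:bb:cc"),
]
print(tally_offsets(sample))
```

Comparing the addresses as 48-bit integers keeps the carry between octets correct, for example when the WiFi address ends in ff.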

7 Conclusions

We provide a detailed breakdown of the randomization policies implemented, the associated device models, and the identification methods thereof. This granular decomposition allowed for fine-tuned improvements to prior attempts at MAC address derandomization as well as novel additions.

Our analysis illustrates that MAC address randomization policies are neither universally implemented nor effective at eliminating privacy concerns. Table 7 depicts the diversity of presented attacks, across the spectra of randomization schemes and OSs, highlighted by the RTS control frame attack targeting a widespread low-level chipset vulnerability.

To be truly effective, randomization should be universally adopted. A continued lack of adoption, allowing for simpler identification, effectively reduces the problem set for an attacker. The more devices performing randomization within a test set, the harder it will be to distinguish each device’s associated traffic. This is particularly true if we can continue to bin the various schemes, further reducing the problem set.

We propose the following best practices for MAC address randomization. Firstly, mandate a universal randomization policy to be used across the spectra of 802.11 client devices. We have illustrated that when vendors implement unique MAC address randomization schemes it becomes easier to identify and track those devices. A universal policy must include, at minimum, rules for randomized MAC address byte structure, 802.11 IE usage, and sequence number behavior.

To reiterate, these best practices can only be truly effective when enforced across the spectrum of devices. Granular examples of such policy rules:

• Randomize across the entire address, providing 46 bits of randomization (2^46 possible addresses).

• Use a random address for every probe request frame.

• Remove sequence numbers from probe requests.

• If sequence numbers are used, reset the sequence number when transmitting authentication and association frames.

• Never send probe requests using a global MAC address.

• Enforce a policy requiring a minimal and standard set of vendor IEs. Move any lost functionality to the authentication/association process, or upon network establishment utilize discovery protocols.

• Specifically, the use of WPS attributes should be removed except when performing P2P operations. Prohibit unique vendor tags such as those introduced by Apple iOS 10.

• Eliminate the use of directed probe requests for cellular offloading.

• Mandate that chipset firmware remove behavior where RTS frames received while in State 1 elicit a CTS response.
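As an illustration of the first rule, the sketch below draws all 48 bits at random and then fixes the two reserved bits of the first octet (locally administered set, multicast cleared), which leaves 46 random bits. The function name is hypothetical and the snippet is not taken from any vendor implementation.

```python
import secrets

LOCAL_BIT = 0x02      # locally administered flag in the first octet
MULTICAST_BIT = 0x01  # group/multicast flag in the first octet


def random_probe_mac() -> str:
    """Return a fresh unicast, locally administered MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Only these two bits are fixed; the other 46 bits stay random.
    octets[0] = (octets[0] | LOCAL_BIT) & (0xFF ^ MULTICAST_BIT)
    return ":".join(f"{octet:02x}" for octet in octets)


# A new address per probe request, as the second rule recommends.
for _ in range(3):
    print(random_probe_mac())
```

Because no vendor prefix is retained, addresses generated this way cannot be binned by OUI or CID.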

Acknowledgments

We thank Rob Beverly, Adam Aviv, and Dan Roche for early feedback.

References

[1] Linux wpa_supplicant (IEEE 802.1X, WPA, WPA2, RSN, IEEE 802.11i). URL https://w1.fi/wpa_supplicant/.

[2] wpa_supplicant change log. URL https://w1.fi/cgit/hostap/plain/wpa_supplicant/ChangeLo

[3] China deputizes smart phones to spy on Beijing residents’ real-time location, Oct 2011. URL https://www.eff.org/deeplinks/2011/03/china-deputizes-

[4] Wifigate - how mobile carriers expose us to wi-fi attacks, Apr 2014. URL https://www.skycure.com/blog/wifigate-how-mobile-carri

[5] Danger close: Fancy bear tracking of Ukrainian field artillery units, Jan 2017. URL https://www.crowdstrike.com/blog/danger-close-fancy-be

[6] D. Eastlake 3rd and J. Abley. IANA Considerations and IETF Protocol and Documentation Usage for IEEE 802 Parameters. RFC 7042 (Best Current Practice), Oct. 2013. URL http://www.ietf.org/rfc/rfc7042.txt.

[7] M. V. Barbera, A. Epasto, A. Mei, S. Kosta, V. C. Perta, and J. Stefa. CRAWDAD dataset sapienza/probe-requests. http://crawdad.org/sapienza/probe-requests/20130910, Sept. 2013.

[8] J. Bard. Unpacking the dirtbox: Confronting cell phone location tracking with the fourth amendment. BCL Rev., 57:731, 2016.

[9] Z. Cui and A. Agrawala. Wifi localization based on ieee 802.11 rts/cts mechanism. In Proceedings of the 12th EAI International Conference on Mobile and Ubiquitous Systems, pages 199–208. ICST, 2015.

[10] M. Cunche. I know your mac address: targeted tracking of individual using wi-fi. Journal of Computer Virology and Hacking Techniques, 2014.

[11] D. Kerr. Russian police spy on people’s mobile data to catch thieves, Jul 2013. URL https://www.cnet.com/news/russian-police-spy-on-people

[12] M. Gast. 802.11 wireless networks: the definitive guide. O’Reilly, Beijing, Farnham, 2005. ISBN 0-596-10052-3. URL http://opac.inria.fr/record=b1128195.

[13] D. Gentry and A. Pennarun. Passive taxonomy of wifi clients using MLME frame contents. CoRR, abs/1608.01725, 2016. URL http://arxiv.org/abs/1608.01725.

[14] C. Hoene and J. Willmann. Four-way toa and software-based trilateration of ieee 802.11 devices. In 2008 IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications, pages 1–6, Sept 2008.

[15] IEEE. OUI Public Listing. http://standards.ieee.org/develop/regauth/oui/oui.txt.

[16] J. Martin, E. Rye, and R. Beverly. Decomposition of mac address structure for granular device inference. In Proceedings of the 32nd Annual Conference on Computer Security Applications, pages 78–88. ACM, 2016.

[17] C. Matte, M. Cunche, F. Rousseau, and M. Vanhoef. Defeating mac address randomization through timing attacks. In Proceedings of the 9th ACM Conference on Security & Privacy in Wireless and Mobile Networks, WiSec ’16, pages 15–20. ACM, 2016.

[18] C. Mims. If you have a smart phone, anyone can now track your every move, Oct 2012. URL https://www.technologyreview.com/s/427687/if-you-have-a-smart-phone-anyone-can-now-track-your-every

[19] T. Mitchell. 2. smartphone ownership rates skyrocket in many emerging economies, but digital divide remains, Feb 2016. URL http://www.pewglobal.org/2016/02/22/smartphone-ownership-rates-skyrocket-in-many-emerging-economies-

[20] B. L. Owsley. Spies in the skies: Dirtboxes and airplane electronic surveillance. Mich. L. Rev. First Impressions, 113:75–75, 2015.

[21] M. Sarwar and T. R. Soomro. Impact of smartphone’s on society. European journal of scientific research, 98(2):216–226, 2013.

[22] M. Vanhoef, C. Matte, M. Cunche, L. Cardoso, and F. Piessens. Why MAC Address Randomization is not Enough: An Analysis of Wi-Fi Network Discovery Mechanisms. In ACM AsiaCCS, 2016.

[23] J. Wright and J. Cache. Hacking Exposed Wireless: Wireless Security Secrets & Solutions. McGraw-Hill Education Group, 3rd edition, 2015. ISBN 0071827633, 9780071827638.

A OS Randomization Configuration

A.1 Android

In October 2014 the wpa_supplicant.conf file, used by Android, Linux, Windows, and OS X client stations [1] for configuration of 802.11 networking, was updated to add experimental support for MAC address randomization in network scans. Full implementation support was added in March 2015 [2]. Listing 1 depicts the added support for MAC address randomization. It is worth noting that the configuration file provides two policies for using a non-globally unique address while in an associated state. If the variable mac_addr is set to 1, the device will use a randomized MAC address for each unique network the device connects to. If mac_addr is set to 2, the device will randomize the lower three bytes of the MAC address, prefixed with the original OUI where the local bit has been set to 1.

The wpa_supplicant.conf file also addresses the randomization policies available for disassociated devices conducting active scanning. In this case, the variable preassoc_mac_addr can be set similarly to the previously described address policies.

Listing 1: wpa_supplicant.conf

# MAC address policy default
# 0 = use permanent MAC address
# 1 = use random MAC address for each ESS connection
# 2 = like 1, but maintain OUI (with local admin bit set)
#
# By default, permanent MAC address is used unless policy is changed by
# the per-network mac_addr parameter. Global mac_addr=1 can be used to
# change this default behavior.
#mac_addr=0

# Lifetime of random MAC address in seconds (default: 60)
#rand_addr_lifetime=60

# MAC address policy for pre-association operations (scanning, ANQP)
# 0 = use permanent MAC address
# 1 = use random MAC address
# 2 = like 1, but maintain OUI (with local admin bit set)
#preassoc_mac_addr=0
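The three policy values in Listing 1 can be paraphrased in Python as follows. This is a sketch of the semantics described in the listing's comments, not code from wpa_supplicant itself; the function name and the example permanent address are hypothetical.

```python
import secrets


def address_for_policy(permanent_mac: str, policy: int) -> str:
    """Return the address a station would present under policy 0, 1 or 2."""
    octets = bytearray(int(part, 16) for part in permanent_mac.split(":"))
    if policy == 0:
        pass  # 0 = use permanent MAC address
    elif policy == 1:
        # 1 = fully random address, marked local and unicast
        octets = bytearray(secrets.token_bytes(6))
        octets[0] = (octets[0] | 0x02) & 0xFE
    elif policy == 2:
        # 2 = keep the OUI (with local admin bit set), randomize the rest
        octets[0] = (octets[0] | 0x02) & 0xFE
        octets[3:] = secrets.token_bytes(3)
    else:
        raise ValueError(f"unknown policy {policy}")
    return ":".join(f"{octet:02x}" for octet in octets)


# Made-up permanent address for illustration only.
for policy in (0, 1, 2):
    print(policy, address_for_policy("a4:11:22:33:44:55", policy))
```

Under policy 2 the first three bytes remain recognizable (a4 becomes a6 in this example), so the vendor can still be inferred from the prefix, whereas policy 1 hides it.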

Android introduced MAC address randomization for probe requests with Android 6.0 (Marshmallow) and in an incremental patch to 5.0 (Lollipop). With the release of Marshmallow, the WifiStateMachine.java and WifiNative.java files were modified to implement MAC address randomization for active scanning. When the SupplicantStartedState function is called upon enabling WiFi, a call to the newly added setRandomMacOui function sets the first three bytes of the MAC address to the default Google CID (DA:A1:19). If the config_wifi_random_mac_oui variable has been redefined in the config.xml file, that prefix will be used in place of the default Google CID. The XML configuration file thus allows an Android smartphone manufacturer to override the default Google CID with a prefix to be used as the substitute for the OUI. Finally, the prefix is passed to another function, setScanningMacOui, located in the WifiNative.java file, which calls a corresponding function at a lower, native level. If the device chipset supports randomization, then the prefix will be used during active scans.
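The prefix selection described above amounts to "use the vendor override if present, otherwise the Google CID". The Python sketch below mirrors that logic only; it is not the Android source, the function names are hypothetical, and the random three-byte tail is an assumption about how the remaining bytes are filled.

```python
import secrets

DEFAULT_GOOGLE_CID = "da:a1:19"  # default prefix named in the text


def choose_scan_prefix(vendor_override=None) -> str:
    """Return the three-byte prefix to use for scanning addresses."""
    prefix = (vendor_override or DEFAULT_GOOGLE_CID).lower()
    parts = prefix.split(":")
    if len(parts) != 3 or any(len(part) != 2 for part in parts):
        raise ValueError(f"prefix must be three octets: {prefix!r}")
    int("".join(parts), 16)  # raises ValueError on non-hex input
    return prefix


def scanning_mac(vendor_override=None) -> str:
    """Append three random bytes to the chosen prefix (assumed behavior)."""
    tail = ":".join(f"{byte:02x}" for byte in secrets.token_bytes(3))
    return f"{choose_scan_prefix(vendor_override)}:{tail}"


print(scanning_mac())            # default Google CID prefix
print(scanning_mac("92:68:C3"))  # a vendor override, as in Table 7
```

Since the prefix is constant per vendor configuration, every address produced this way falls into the same identifiable bin.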

We extracted the wpa_supplicant.conf, WifiStateMachine.java, and WifiNative.java files from Android devices that do and do not perform MAC address randomization. We found that the wpa_supplicant.conf file was never utilized to implement randomization, as attempts to modify the randomization settings of the file had no effect on any device. The Java files also included the supporting functions for randomization, regardless of whether the device used them. Interestingly, with logging enabled, the devices that did not conduct randomization sent output to the logs indicating that the random MAC had been set, whereas devices seen randomizing did not.

A.2 iOS

In late 2014, Apple introduced MAC address randomization with the release of iOS 8.0. Apple iOS randomization settings are not device-model customizable, unlike Android, which allows each model to modify settings such as the CID. As of the current iOS 10.x version, Apple devices only use the locally assigned MAC address while in a disassociated state. Since iOS is not open source, we cannot determine the exact method or configuration options that Apple uses on their devices to support randomization. Instead, we are left to determine device behavior from a “black box” perspective by observing communication from different devices and iOS versions in §5.

B iOS Randomization Tests

To determine if iOS is using random prefixes, or if there is just a pattern that we have not been able to see, we used several standard statistical tests to compare our observations with an ideal, random distribution. First, we calculated the number of collisions we observed, where the same prefix appeared more than once. If the prefixes are truly random we would expect to see a moderate number of collisions, which is easy to quantify. We would also expect to see a certain, far smaller, number of triple collisions where one prefix appears three times. These numbers can be calculated as follows:

\[
E[\#\text{ of collisions}] = \frac{\binom{n}{2}}{m}
\qquad
E[\#\text{ of triple collisions}] = \frac{\binom{n}{3}}{m^{2}}
\]

where n = # of addresses observed
      m = # of possible prefixes (2^{22})

Comparing our empirical results with the statistical expectations, we get:

For :

Collisions : expected = 266, observed = 262

Triple collisions : expected = 1, observed = 3
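The two expectations can be evaluated directly with Python's math.comb. The sketch below uses a hypothetical sample size purely for illustration, since the number of addresses observed is not reproduced here; only m = 2^22 comes from the text.

```python
from math import comb


def expected_collisions(n: int, m: int) -> float:
    """Expected number of colliding pairs among n uniform draws from m values."""
    return comb(n, 2) / m


def expected_triple_collisions(n: int, m: int) -> float:
    """Expected number of colliding triples among n uniform draws from m values."""
    return comb(n, 3) / m ** 2


POSSIBLE_PREFIXES = 2 ** 22   # m, as stated above
HYPOTHETICAL_N = 10_000       # assumed sample size, not the paper's value

print(expected_collisions(HYPOTHETICAL_N, POSSIBLE_PREFIXES))
print(expected_triple_collisions(HYPOTHETICAL_N, POSSIBLE_PREFIXES))
```

Each pair of addresses collides with probability 1/m and each triple with probability 1/m^2, so the expectations follow from linearity over all pairs and triples.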

Additionally, we decomposed the bytes of subsequent MAC addresses into a bit stream and ran the tests specified in the FIPS 140-1 standard published by NIST to test random number generators. We obtained the following results:

• Monobit test: 9939

• Poker test: 13.56

• Runs test length 1: 2515

• Runs test length 2: 1342

• Runs test length 3: 581

• Runs test length 4: 281

• Runs test length 5: 166

• Longest run test: 12

All tests passed within the allowable ranges. These tests indicate to us that the MAC addresses are distributed uniformly.
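For readers unfamiliar with these tests, the following sketch shows how the monobit and poker statistics are computed on a 20,000-bit sample. The pass ranges are quoted from memory of FIPS 140-1 and should be treated as an assumption to verify against the standard; the function names are hypothetical and this is not the authors' test harness.

```python
from collections import Counter

SAMPLE_BITS = 20_000
MONOBIT_RANGE = (9_654, 10_346)  # assumed FIPS 140-1 bounds (exclusive)
POKER_RANGE = (1.03, 57.4)       # assumed FIPS 140-1 bounds (exclusive)


def check_length(bits) -> None:
    if len(bits) != SAMPLE_BITS:
        raise ValueError(f"need exactly {SAMPLE_BITS} bits, got {len(bits)}")


def monobit_count(bits) -> int:
    """Number of ones in the sample."""
    check_length(bits)
    return sum(bits)


def poker_statistic(bits) -> float:
    """Chi-square style statistic over the 5,000 four-bit segments."""
    check_length(bits)
    counts = Counter(
        bits[i] << 3 | bits[i + 1] << 2 | bits[i + 2] << 1 | bits[i + 3]
        for i in range(0, SAMPLE_BITS, 4)
    )
    return 16 * sum(count * count for count in counts.values()) / 5000 - 5000


def within(value, bounds) -> bool:
    low, high = bounds
    return low < value < high


# The values reported above fall inside the assumed ranges.
print(within(9939, MONOBIT_RANGE), within(13.56, POKER_RANGE))
```

A perfectly alternating stream illustrates why more than one test is needed: it has exactly 10,000 ones and passes the monobit check, yet every four-bit segment is identical, so its poker statistic is far outside the range.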

C Google CID Device Breakdown

Table 8: DA:A1:19 with WPS Model Breakdown

Manufacturer   Model        Distinct Devices

Huawei         Nexus 6P     1660
BlackBerry     STV100-3     133
HTC            Nexus 9      107
BlackBerry     STV100-1     71
Sony           E5823        61
Sony           E6653        59
Sony           SO-01H       29
Sony           E6853        23
BlackBerry     STV100-4     20
Huawei         NXT-L29      17
Sony           SO-02H       17
Google         Pixel C      12
Sony           SO-03H       11
Sony           SOV32        11
Huawei         NXT-AL10     11
BlackBerry     STV100-2     10
Sony           SO-03G       9
Sony           SOV31        8
Sony           E6883        8
Sony           E5803        8
Sony           E6553        7
Huawei         NXT-L09      6
Sony           E6683        6
Huawei         EVA-L09      5
Sony           F5121        5
Sony           E6533        4
Huawei         EVA-AL00     3
Huawei         KNT-AL20     2
Huawei         EVA-AL10     2
Sony           SGP712       2
Sony           SGP771       2
Sony           E6603        1
Sony           E6633        1
Sony           SO-05G       1
LGE            LG-H811      1
Sony           E6833        1
Huawei         VIE-AL10     1
Huawei         EVA-DL00     1
Sony           402SO        1
Google         Pixel XL     1
Sony           501SO        1
Huawei         EVA-L19      1
Sony           F5321        1
HTC            HTC 2PS650   1

D RTS Control Frame Attack - Device Diversity

Table 9: RTS Control Frame Attack - Device Diversity

Model                OS Version   Success

iPhone 6s            10.1.1       ✓
iPhone 6s            9.3.5        ✓
iPhone 6s Plus       9.3.5        ✓
iPhone 5s            10.1         ✓
iPhone 5s            9.3.5        ✓
iPhone 5             9.3.5        ✓
iPad Air             9.3.5        ✓
Google Pixel XL      7.1          ✓
LGE Nexus 5X         7.0          ✓
LGE G5               6.0.1        ✓
LGE G4               6.0.1        ✓
Motorola Nexus 6     6.0.1        ✓
Moto E2              5.1.1        ✓
Moto Z Play          6.0.1        ✓
OnePlus 3            6.0.1        ✓
Xiaomi Mi Note Pro   5.1.1        ✓
