What Can You Learn from an IP?
Simran Patil and Nikita Borisov, University of Illinois at Urbana-Champaign
@SimranPatil25 @nikitab
In the beginning…
ANRW '19 S. Patil & N. Borisov, "What Can You Learn from an IP?"
GET /~nikitab/ HTTP/1.1
Host: geocities.com…
HTTP/1.1 200 OK…
<blink>this page is under construction</blink>
http://geocities.com/~nikitab/
under construction
Today
GET /anrw/2019/ HTTP/1.1
Host: irtf.org…
HTTP/1.1 200 OK…
<title>ANRW’19</title>
TLS encrypted
A? irtf.org
irtf.org A 4.31.198.44
ClientHello… SNI irtf.org
Server Certificate… CN=irtf.org
TLS handshake
DNS query
https://irtf.org/???
???
Soon?
GET /anrw/2019/ HTTP/1.1
Host: irtf.org…
HTTP/1.1 200 OK…
<title>ANRW’19</title>
TLS encrypted
A? irtf.org
irtf.org A 4.31.198.44
ClientHello… SNI irtf.org
Server Certificate… CN=irtf.org
TLS handshake
DNS query
TLS 1.3
DNS-over-HTTPS/TLS
ESNI
4.31.198.44
What can you learn from a domain name?
drugrehab.ca
lymphoma.ca
foxnews.com
aljazeera.com
dailystormer.name
www.lgbtcenters.org
www.oshawamosque.com
montrealcathedral.ca
whatisabrony.com
furrycons.com
anime-expo.org
nickleback.com
vim.org
Methodology
Alexa global top 1 000 000
MIDA
Page resources: URLs, domains, types
944 094 sites, 90 514 000 objects
zdns
domains => IP address => rDNS
1 819 087 domains, 1 795 506 resolved, 741 049 IPs
rDNS
Public Suffix List (PSL) match: server1.facebook.com =~ facebook.com
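The PSL comparison on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `TOY_SUFFIXES` is a hypothetical stand-in for the real Public Suffix List (which is much larger and has wildcard and exception rules ignored here), and the function names are made up for the example.

```python
# Toy stand-in for the Public Suffix List (assumption for illustration);
# the real list is far larger and has wildcard/exception rules.
TOY_SUFFIXES = {"com", "org", "net", "ca", "co.uk"}


def registrable_domain(hostname, suffixes=TOY_SUFFIXES):
    """Return the public suffix plus one label, or None if nothing matches."""
    labels = hostname.lower().rstrip(".").split(".")
    # Scan from the longest candidate suffix so "co.uk" wins over a bare "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the hostname is itself a public suffix
            return ".".join(labels[i - 1:])
    return None


def rdns_matches(rdns_name, domain):
    """True when the rDNS name and the domain share a registrable domain."""
    a = registrable_domain(rdns_name)
    b = registrable_domain(domain)
    return a is not None and a == b


print(rdns_matches("server1.facebook.com", "facebook.com"))
```

Comparing registrable domains rather than full hostnames is what lets a host-specific rDNS entry such as server1.facebook.com count as a match for facebook.com.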
Domains and IPs
[Diagram: graph mapping domain1 through domain6 to IP1 through IP5]
Average degree: 1.46 (IPs per domain)
Average in-degree: 3.14 (domains per IP)
IP Anonymity Sets
ANRW ’19, July 22, 2019, Montreal, QC, Canada Simran Patil and Nikita Borisov
Figure 3: If an IP address is hit by several different web sites, then a reverse DNS lookup will not provide much information about which website the user was looking at. But if the IP address has a one-to-one backward mapping to a website, then the chances of the user’s web activity being profiled increase significantly, which is a threat to the user’s privacy.
average of 1.46 IP addresses, but many domain names map to the same address.)
We can now calculate how well an adversary, armed with this data set, can map an IP back to a domain name. For each IP address, we compute the set of domain names that map to it as its anonymity set. Figure 4 shows a histogram of these sizes. A slight minority of the IPs in our data set (47.6%) correspond to a single domain. For these domains, under our threat model, where the adversary knows the set of potential addresses a user may look up and is able to perform forward lookups on them, encrypted DNS provides little to no benefit. Note that this technique is much more successful than using reverse DNS: only 34 840 domains resolved to IPs that had the corresponding domain as its rDNS entry. The median anonymity set size is 2 and the average is 3.14. Some IP addresses map to a large set of domains, including one that corresponds to over 16 000 domains.
Figure 4: A histogram of IP anonymity set sizes. For each IP in our dataset we calculate the number of domains that map to it as its anonymity set size. The median anonymity set has size 2, and the average is 3.14. The largest is 16 050 (only the top 100 are shown in the figure for clarity).
Figure 5: A CDF of the anonymity set size that domains map to.
Figure 5 shows a cumulative distribution function of the anonymity set sizes that each domain belongs to. Note that since larger anonymity sets contain more domains, a median domain corresponds to an IP address with an anonymity set size of 4.
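The anonymity set computation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the toy mapping uses made-up names and documentation-range addresses, and the domain-weighted median counts each (domain, IP) pair once, which is one plausible reading of Figure 5 rather than a confirmed detail.

```python
from collections import defaultdict
from statistics import mean, median


def anonymity_sets(domain_to_ips):
    """Invert a domain -> IPs mapping into IP -> set of domains."""
    sets = defaultdict(set)
    for domain, ips in domain_to_ips.items():
        for ip in ips:
            sets[ip].add(domain)
    return dict(sets)


def summarize(domain_to_ips):
    sets = anonymity_sets(domain_to_ips)
    sizes = [len(domains) for domains in sets.values()]
    # Domain-weighted view (assumption): each (domain, IP) pair contributes
    # the size of the anonymity set of that IP.
    per_domain = [len(sets[ip]) for ips in domain_to_ips.values() for ip in ips]
    return {
        "single_domain_fraction": sum(1 for s in sizes if s == 1) / len(sizes),
        "median_per_ip": median(sizes),
        "mean_per_ip": mean(sizes),
        "median_per_domain": median(per_domain),
    }


# Hypothetical toy data, not taken from the crawl.
toy = {
    "a.example": {"192.0.2.1"},
    "b.example": {"192.0.2.2"},
    "c.example": {"192.0.2.2"},
    "d.example": {"192.0.2.2", "192.0.2.3"},
}
print(summarize(toy))
```

In the toy data the per-IP median is 1 while the domain-weighted median is 3, the same effect the text describes: large sets hold more domains, so the median domain sits in a larger set than the median IP.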
We do note that there is some potential for consolidation that is present here. We use the domain names returned during a lookup (including CNAMEs) to classify a sample of our IP addresses as belonging to various content distribution networks, as shown in fig. 6. This shows that a significant fraction of addresses come from CDNs. Today, many CDNs are able to serve a large number of sites from a small set of IP addresses (a feature exploited by domain fronting [3]).
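One way such a CNAME-based classification could work is a suffix match on the names seen during the lookup. The sketch below is an assumption about the general approach only: the text does not give the authors' rules, and `CDN_SUFFIXES` is an illustrative table of a few well-known CDN hostname suffixes, not their list.

```python
# Illustrative suffix table (assumption); the classification rules actually
# used for fig. 6 are not given in the text.
CDN_SUFFIXES = {
    "cloudfront.net": "Amazon CloudFront",
    "akamaiedge.net": "Akamai",
    "edgekey.net": "Akamai",
    "fastly.net": "Fastly",
}


def classify_cdn(lookup_names, table=CDN_SUFFIXES):
    """Return the CDN label of the first name with a known suffix, else None."""
    for name in lookup_names:
        name = name.lower().rstrip(".")
        for suffix, label in table.items():
            # Match whole labels only, so "notfastly.net" is not a hit.
            if name == suffix or name.endswith("." + suffix):
                return label
    return None


# Hypothetical CNAME chain for a site fronted by a CDN.
chain = ["www.example.com", "d111111abcdef8.cloudfront.net."]
print(classify_cdn(chain))
```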
47.6% of IPs have an anonymity set of 1
Largest anonymity set has 16 050 domains
Site-unique IPs
[Diagram: graph connecting site1 through site3, domain1 through domain6, and IP1 through IP5]
E.g., 74.125.132.154 has an anonymity set of 1 (stats.g.doubleclick.net) but is seen on over 10% of all the sites in our dataset!
68% of IPs in our set are site-unique
43% of sites use at least 1 resource that maps to a site-unique IP
For 39.5% of sites, the front page maps to a site-unique IP
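A rough sketch of how site-unique IPs could be counted, taking "site-unique" to mean an IP contacted during the page load of exactly one site in the crawl (the reading suggested by the doubleclick example). The data layout, function names, and toy values are assumptions for illustration.

```python
from collections import defaultdict


def ip_to_sites(site_domains, domain_ips):
    """Map each IP to the set of sites whose page loads touch it."""
    seen = defaultdict(set)
    for site, domains in site_domains.items():
        for domain in domains:
            for ip in domain_ips.get(domain, ()):
                seen[ip].add(site)
    return dict(seen)


def site_unique_stats(site_domains, domain_ips):
    seen = ip_to_sites(site_domains, domain_ips)
    # An IP is site-unique when exactly one site ever contacts it.
    unique_ips = {ip for ip, sites in seen.items() if len(sites) == 1}
    sites_with_unique = {next(iter(seen[ip])) for ip in unique_ips}
    return {
        "unique_ip_fraction": len(unique_ips) / len(seen),
        "site_fraction": len(sites_with_unique) / len(site_domains),
    }


# Hypothetical toy crawl: one tracker domain shared by every site.
site_domains = {
    "site1": ["site1.example", "tracker.example"],
    "site2": ["site2.example", "tracker.example"],
    "site3": ["tracker.example"],
}
domain_ips = {
    "site1.example": ["192.0.2.1"],
    "site2.example": ["192.0.2.2"],
    "tracker.example": ["192.0.2.9"],
}
print(site_unique_stats(site_domains, domain_ips))
```

The tracker IP here has a domain anonymity set of 1 yet is not site-unique, which is why the two measures are reported separately.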
Page Load Fingerprints
23.64.109.196
192.33.31.70
98.84.112.4
193.200.231.133
site ???
Site IP sets
[Diagram: graph connecting site1 through site3, domain1 through domain6, and IP1 through IP5, with the IP set of site3 marked]
95.7% of sites have a unique IP set
A cluster of 903 sites has the same IP set
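The IP-set fingerprint idea can be sketched as below. This is a simplified illustration, not the authors' method: it assumes the observer sees the complete set of IPs for a page load and matches it exactly, whereas real traffic would be affected by caching and partial loads. All names and toy data are hypothetical.

```python
from collections import Counter


def site_ip_sets(site_domains, domain_ips):
    """Collapse each site's page load into a frozenset of contacted IPs."""
    return {
        site: frozenset(ip for d in domains for ip in domain_ips.get(d, ()))
        for site, domains in site_domains.items()
    }


def unique_fingerprint_fraction(ip_sets):
    """Fraction of sites whose IP set is shared with no other site."""
    counts = Counter(ip_sets.values())
    return sum(1 for s in ip_sets.values() if counts[s] == 1) / len(ip_sets)


def identify(observed_ips, ip_sets):
    """Return the sites whose full IP set equals the observed set."""
    observed = frozenset(observed_ips)
    return sorted(site for site, ips in ip_sets.items() if ips == observed)


# Hypothetical toy crawl: siteA and siteB share hosting, siteC does not.
crawl = {
    "siteA": ["shared.example", "tracker.example"],
    "siteB": ["shared.example", "tracker.example"],
    "siteC": ["c.example", "tracker.example"],
}
resolved = {
    "shared.example": ["192.0.2.1"],
    "c.example": ["192.0.2.3"],
    "tracker.example": ["192.0.2.9"],
}
fingerprints = site_ip_sets(crawl, resolved)
print(unique_fingerprint_fraction(fingerprints))
print(identify(["192.0.2.9", "192.0.2.3"], fingerprints))
```

Sites that share every IP, like siteA and siteB here, form a cluster that this fingerprint cannot tell apart.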
What about CDNs?
• Many CDNs could use the same IP address for all sites but don’t
  • Ported IP space
  • Connections w/o SNI
• In our dataset 200K domains are hosted by CloudFlare, using 91K IPs
  • Including 3% of the sites with a site-unique front page IP
• Randomizing or normalizing IP addresses could help
Conclusions
• DNS privacy offers limited protection
  • For web browsing
  • Against an adversary with a good prior list of sites
• In our Alexa 1M crawl dataset
  • 48% of all IPs map to a single domain
  • 68% of all IPs map to a single site
  • 43% of all sites contain a site-unique IP
  • 95% of sites have a unique IP set
• Changes to web hosting infrastructure could help
  • Normalize or randomize CDN IP addresses