Ben Gross at BayCHI November 10, 2009
@
(Ab)using Identifiers
Ben Gross
University of Illinois Urbana Champaign
Library and Information Science
[email protected]
http://bengross.com/
@ BayCHI 2009-11-10
@
@
[email protected]
@bengross
http://facebook.com/bengross
http://bengross.com
http://flickr.com/bengross
Why I am interested
@
@
How many
Do you have?
@ Social network profiles
Web site logins
Instant messenger IDs
Email addresses
Phone numbers
Domain names
@
All your @’s
are belong to us
@
Why you might care
•Usability implications
•Productivity implications
•Security implications
•Employee satisfaction
@
How did I get here?
•“I only have one email address...”
•“Well, except that one I only use for...”
•“And that other one I use with...”
@
Half a million users
“... average user has 6.5 passwords, each of which is shared across 3.9 different sites. Each user has about 25 accounts that require passwords, and types an average of 8 passwords per day.”
Dinei Florêncio and Cormac Herley. A Large-Scale Study of Web Password Habits. WWW ’07
@
Population
•Qualitative in-depth interview study
•44 people across two Bay Area firms
•Financial services firm (regulated)
•Design firm (unregulated)
@
Data
•Financial services: average # of email addresses = 1.8 (min 1 / max 4); IM = 1.8 (min 1 / max 4)
•Design firm: average # of email addresses = 3.6 (min 1 / max 10); IM = 1.7 (min 1 / max 3)
•Combined total: average = 3.3
@
“The individual in ordinary work situations presents himself and his activity to others, the ways in which he guides and controls the impression they form of him and the kinds of things he may and may not do while sustaining his performance before them.”
Erving Goffman, Presentation of Self in Everyday Life, 1959.
@
Why more than one?
@
Social factors
•“I knew that my college one wasn't forever, so I wanted something more permanent after I graduated.”
•“...I didn't like the name that I picked when it was my first email.”
•“...you just say oh my first name and last name at gmail.com ... something easy to remember.”
@
Technical factors
•Namespace saturation AKA the [email protected] problem
•Firewalls and VPNs AKA “They don’t let me use Hotmail at work...”
•Configuration problems AKA “What does SMTP-AUTH with MD5 checksums on port 567 mean?”
@
Regulatory factors
@
It’s Just Data...
“We’re an information economy. They teach you that in school. What they don't tell you is that it's impossible to move, to live, to operate at any level without leaving traces, bits, seemingly meaningless fragments that can be retrieved, amplified...”
William Gibson, Johnny Mnemonic
@
What’s Underneath?
•Developer Tools
•FireBug/FireCookie
•Safari Web Inspector
•Charles Proxy/HTTP Analyzer
•Forensic Tools
@
Cookies
@
More detail
@
Bake Your Own
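As a rough illustration of what baking your own cookie involves, here is a minimal sketch using Python's standard `http.cookies` module. The cookie name `visitor_id`, the one-year lifetime, and both function names are assumptions made for this example; they are not taken from the talk.

```python
from http.cookies import SimpleCookie
import uuid


def bake_tracking_cookie(name="visitor_id", max_age_days=365):
    """Build a Set-Cookie header value carrying a fresh random identifier.

    The name and lifetime are hypothetical defaults for illustration.
    """
    cookie = SimpleCookie()
    cookie[name] = uuid.uuid4().hex           # 32 hex characters, unique per visitor
    cookie[name]["path"] = "/"                # sent back for every page on the site
    cookie[name]["max-age"] = max_age_days * 24 * 60 * 60
    return cookie[name].OutputString()


def read_cookie(cookie_header, name="visitor_id"):
    """Pull the identifier back out of a request's Cookie header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel else None
```

Once the browser stores the value, every later request to the same site carries it back, which is all a server needs to link visits together.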
@
Managing Flash Cookies
http://www.macromedia.com/support/documentation/en/flashplayer/help/settings_manager07.html
@
Referer (sic)
•adsl-75-18-132-43.dsl.pltn13.sbcglobal.net - - [10/Nov/2009:14:50:56 -0800] "GET /wireless.html HTTP/1.1" 200 29149 "http://bengross.com/voip.html" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-us) AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9"
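The log line above appears to follow the common "combined" access log layout, so the leaked fields can be pulled apart with a short regular expression. This is a sketch under that assumption; the pattern and the `parse_log_line` helper are written for this example only.

```python
import re

# Assumed layout: host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the named fields of one log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


SAMPLE = (
    'adsl-75-18-132-43.dsl.pltn13.sbcglobal.net - - [10/Nov/2009:14:50:56 -0800] '
    '"GET /wireless.html HTTP/1.1" 200 29149 "http://bengross.com/voip.html" '
    '"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-us) AppleWebKit/531.9 '
    '(KHTML, like Gecko) Version/4.0.3 Safari/531.9"'
)

fields = parse_log_line(SAMPLE)
```

Here `fields["referer"]` holds the page the visitor came from, `fields["host"]` reveals the ISP and region, and `fields["agent"]` gives the browser and operating system versions.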
@
Leaky Headers
On the Leakage of Personally Identifiable Information Via Online Social Networks
Balachander Krishnamurthy and Craig Wills
@
More Options
•URL Munging and Session IDs in URL
•Flash Cookies/Local Shared Object
•Silverlight Cookies
•Virtual Page Views, Event (Google Analytics) User Defined Values
@
Synthetic IDs
•Everything in the Referer header can be used to form a synthetic identifier.
•The User Agent is a good source
•IP addresses if you have them
•Screen dimensions, user agent
•Hash of IP address/remote ports
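One way the sources listed above could be combined is to hash them into a single value. The sketch below is an assumption about how such an identifier might be built, not a description of any particular tracker; the function name, the choice of SHA-256, and the 16-character truncation are all arbitrary.

```python
import hashlib


def synthetic_id(ip, user_agent, screen):
    """Derive a stable pseudo-identifier from values the browser sends anyway.

    `screen` is a (width, height) tuple. The field choice and the truncation
    length are illustrative assumptions.
    """
    parts = [ip, user_agent, f"{screen[0]}x{screen[1]}"]
    # A separator that cannot appear in the fields keeps
    # ("ab", "c") distinct from ("a", "bc").
    material = "\x1f".join(parts)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
```

No cookie is stored on the client, so clearing cookies does not change the result; only a change to one of the inputs does.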
@
Other Sources of Bits
•Last Modified and ETag headers
•HTTP Keepalive
•SSL Session IDs
•TCP Timestamps
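To show how a caching header can carry bits, here is a minimal simulation of an ETag used as an identifier. The `respond` function is hypothetical and stands in for a server handler: it issues a unique ETag on first contact and recognizes the visitor when the browser echoes it back in `If-None-Match`.

```python
import uuid


def respond(request_headers):
    """Simulate a server handler for one cacheable resource.

    Returns (status, response_headers, visitor_id). Illustrative only.
    """
    tag = request_headers.get("If-None-Match")
    if tag:
        # Returning visitor: the browser replayed the tag issued earlier.
        return 304, {"ETag": tag}, tag.strip('"')
    # First visit: mint a unique tag that the browser will cache with the resource.
    visitor = uuid.uuid4().hex
    return 200, {"ETag": f'"{visitor}"'}, visitor
```

The identifier lives in the browser cache rather than the cookie jar, so it can outlast a cookie purge.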
@
The Art of Being Lost
•“We do not collect personal contact information from visitors to your website. Personal contact information means billing address, physical address, individual name, email address, etc.” (OpenTracker.com)
@
Netflix Data Released
•Dataset contains 100,480,507 movie ratings, created by 480,189 Netflix subscribers between December 1999 and December 2005.
•“...all customer identifying information has been removed; all that remains are ratings and dates. This follows our privacy policy...”
•No unique identifiers or quasi-identifiers
@
You Only Need Two
•Robust De-anonymization of Large Sparse Datasets by Arvind Narayanan and Vitaly Shmatikov
•IMDb as a source of entropy
•“With 8 movie ratings (of which 2 may be completely wrong) and dates that may have a 14-day error, 99% of records can be uniquely identified in the dataset.”
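The idea of matching with tolerance for wrong ratings and fuzzy dates can be sketched with a toy example. This is a deliberately simplified illustration, not the algorithm from the Narayanan and Shmatikov paper; the data, names, and thresholds below are invented.

```python
from datetime import date


def count_hits(aux, record, max_days=14):
    """Count auxiliary (movie, rating, date) observations consistent with a record.

    `record` maps movie -> (rating, date). A hit needs the same rating and a
    date within `max_days`.
    """
    hits = 0
    for movie, rating, when in aux:
        if movie in record:
            rec_rating, rec_when = record[movie]
            if rec_rating == rating and abs((rec_when - when).days) <= max_days:
                hits += 1
    return hits


def candidates(aux, dataset, min_hits):
    """Return the IDs of records that agree with at least `min_hits` observations."""
    return [uid for uid, rec in dataset.items() if count_hits(aux, rec) >= min_hits]


# Invented "anonymized" records and publicly known auxiliary ratings.
dataset = {
    "u1": {"A": (5, date(2004, 1, 10)), "B": (3, date(2004, 2, 1)), "C": (4, date(2004, 3, 5))},
    "u2": {"A": (5, date(2004, 1, 12)), "B": (1, date(2004, 6, 1))},
}
aux = [
    ("A", 5, date(2004, 1, 15)),
    ("B", 3, date(2004, 2, 10)),
    ("C", 1, date(2004, 3, 1)),   # deliberately wrong rating
]
matched = candidates(aux, dataset, min_hits=2)
```

Allowing one of the three observations to be wrong (`min_hits=2`) still leaves a single candidate in this toy dataset, which mirrors the quoted result on a very small scale.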
@
It comes down to this
“Q: If you don't publicly rate movies on IMDb and similar forums, there is nothing to worry about.
A: ...you should not ever mention any movies you watched prior to 2005 on a public blog or website.
Everybody who was a Netflix subscriber prior to 2005 should restrain themselves from these activities...
We do not think this is a feasible privacy policy.”
FAQ, “How to Break Anonymity of the Netflix Prize Dataset”
@
Guessing Your SSN
•Predicting Social Security Numbers from Public Data by Alessandro Acquisti and Ralph Gross
•...I’ll just need the last 4 of your SSN for verification purposes...
•“...we accurately predicted the first 5 digits of 2% of California records with 1980 birthdays, and 90% of Vermont records with 1995 birthdays.”
@
Disclosure and UI
•“Facebook Beacon is a way for you to bring actions you take online into Facebook. Beacon works by allowing affiliate websites to send stories about actions you take to Facebook.”
•Launched November 2007
•Class action lawsuit August 2008
•Shut down September 2009
@
Opt Out: First Try
@
Opt Out: Second Try
@
Evasion
•Ghostery
•Opt Out Tools
•Ad Blockers/Flash Blockers
•HTTP Cookie/LSO Managers
•Header Modification Tools
•Proxies/Tor
@
What’s Next?
•Geolocation
•Roll up for more large collections
•More of the additional bits needed for de-anonymization are available via social networks
@