Alessandro Acquisti and Ralph Gross
Heinz College / CyLab, Carnegie Mellon University
Research support from National Science Foundation, U.S. Army Research Office (through CyLab), Carnegie Mellon Berkman Fund, and Pittsburgh Supercomputing Center
Black Hat USA 2009
1. Show that Social Security numbers (SSNs) are predictable from publicly available data
  Knowledge of an individual's birthday and birthplace can be exploited to infer narrow ranges of values likely to include that individual's SSN
  This is due in part to well-meaning, but counter-effective, public policy initiatives
2. Highlight associated risks and implications
3. Discuss possible risk-mitigating strategies & policies
SSNs were designed and issued by the Social Security Administration (SSA) for the first time in 1936 as identifiers for accounts tracking individual earnings
Unfortunately, over time they started being used, and abused, as authentication devices
  Notwithstanding warnings by SSA, FTC, GAO, scholars, and so forth
  Naturally, the same number can't be used securely both as identifier and for authentication
The wide availability of SSNs, and their dual use as identifiers and authenticators, make identity theft easy and widespread
Knowledge of somebody's name, DOB, and SSN is often a sufficient condition for access to financial, medical, and other services
  Sometimes, even applications with just 7 out of 9 correct digits are accepted as valid (FTC 2004)
Each SSN has 9 digits: XXX-YY-ZZZZ
and is composed of three parts:
  Area number: XXX
  Group number: YY
  Serial number: ZZZZ
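As a minimal sketch of the three-part layout, the following Python splits a 9-digit string into its area, group, and serial numbers. The names `split_ssn` and `SSNParts` are illustrative choices, not anything defined by the SSA, and the sample value is a dummy number.

```python
import re
from typing import NamedTuple


class SSNParts(NamedTuple):
    area: str    # XXX
    group: str   # YY
    serial: str  # ZZZZ


def split_ssn(ssn: str) -> SSNParts:
    """Split an SSN written as XXX-YY-ZZZZ or XXXYYZZZZ into its three parts."""
    digits = re.sub(r"[-\s]", "", ssn)  # drop hyphens and whitespace
    if not re.fullmatch(r"\d{9}", digits):
        raise ValueError("expected exactly 9 digits")
    return SSNParts(area=digits[:3], group=digits[3:5], serial=digits[5:])


parts = split_ssn("123-45-6789")  # dummy value for illustration
print(parts.area, parts.group, parts.serial)
```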
The SSN issuance scheme is complex, but not stochastic
  The SSA itself has for a long time publicly revealed its details
This is well known
  In fact, inference of the likely time and location of SSN applications based on their digits has been exploited to catch fraudsters and impostors
However, the SSA also states that the SSN assignment process is, effectively, random:
  "SSNs are assigned randomly by computer within the confines of the area numbers allocated to a particular state based on data keyed to the Modernized Enumeration System" (RM00201.060)
[Figure: Alaska and New York. Panel labels: "First 5 digits with 1 guess"; "All 9 digits with"]
In the last 30 years, SSN issuance has become more regular
  Increasing computerization of the public administration, including SSA and its various field offices
  After 1972, SSN assignment centralized from Baltimore
  Tax Reform Act of 1986 (P.L. 99-514)
  After 1989, Enumeration at Birth Process (EAB)
    Prior to 1989, only a small percentage of people received an SSN when they were born
    Currently at least 90 percent of all newborns receive an SSN via EAB together with their birth certificate
1. We expected SSN issuance patterns to have become more regular over the years, i.e. increasingly correlated with an individual's birthday and birthplace
  This should be detected through analysis of available SSN data
2. We expected these patterns to have become so regular that it is possible to infer unknown SSNs based on the patterns detected on available SSNs
  This should be verified by contrasting estimated SSNs against known SSNs
Outside the SSA, the current understanding of the assignment of the first 3 digits was incorrect, and the relationship between demographic patterns and the sequentiality of the last 4 digits was unexplored
Hence, previous work in this area focused on inferring the likely year or years and state of SSN issuance of a known SSN (e.g., [Wessmiller, 2002], [Sweeney, 2004], [EPIC, 2008])
We focused on the inverse, harder, and much more consequential inference: exploiting the presumptive day and location of SSN application to predict unknown SSNs
[Figure: Alaska, 1998 and New York, 1998. Panel labels: "First 5 digits with 1 guess"; "All 9 digits with"]
The Social Security Administration's Death Master File (DMF) is a publicly available database of the SSNs of individuals who are deceased
  One of the purposes of making this data available was to combat fraud
  Unfortunately, it can also be analyzed to find patterns in the SSN issuance scheme
We used DMF data to find patterns in the issuance of SSNs by date of birth and State of SSN issuance for deceased individuals
  Namely, we sorted records by reported DOB and grouped them by reported State of issuance
  An iterative process
Name        Birth        Death     Last Residence                  SSN          Issued
JOHN SMITH  21 Jun 1904  Oct 1979  33540 (Zephyrhills, Pasco, FL)  022-10-3459  Massachusetts
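The sort-and-group step described for the DMF records can be sketched in a few lines of Python. The field layout (name, date of birth, state) and the sample rows are hypothetical placeholders rather than real DMF entries. This only illustrates the ordering step and is not the authors' actual pipeline.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (name, reported date of birth, reported state of issuance).
records = [
    ("PERSON A", date(1975, 3, 2), "NY"),
    ("PERSON B", date(1975, 3, 1), "AK"),
    ("PERSON C", date(1975, 3, 1), "NY"),
]


def group_by_state_sorted_by_dob(rows):
    """Group rows by state of issuance, then order each group by date of birth."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[2]].append(row)
    return {state: sorted(rs, key=lambda r: r[1]) for state, rs in groups.items()}


grouped = group_by_state_sorted_by_dob(records)
for state, rows in grouped.items():
    print(state, [name for name, _, _ in rows])
```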
1. TEST 1: We used more than half a million DMF records to detect patterns in SSN issuance based on birthplace and state of issuance, and used those patterns to predict (and verify) individual SSNs in the DMF
2. TEST 2: We mined data from an online social network to retrieve individuals' self-reported birthdays and birthplaces, and estimated their SSNs by interpolating that data with DMF patterns. We verified the estimates against official Enrollment data using a protected (and IRB-approved) protocol
1. Whether we could predict the first 5 digits of an individual's SSN with one attempt
2. Whether we could predict the entire SSN with fewer than 10, 100, and 1,000 attempts
  Note: 1,000 attempts is equivalent to a 3-digit PIN
  That is, very insecure and vulnerable to brute force attacks
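The PIN comparison follows from simple counting: a decimal PIN with d digits has 10^d possible values, so a secret that falls within N guesses is about as strong as a PIN of log10(N) digits. A small Python check, where `equivalent_pin_digits` is an illustrative helper name:

```python
import math


def equivalent_pin_digits(attempts: int) -> float:
    """Length of a decimal PIN whose search space equals the number of attempts."""
    if attempts < 1:
        raise ValueError("attempts must be a positive integer")
    return math.log10(attempts)


for attempts in (10, 100, 1000):
    # Rounded for display; these inputs are exact powers of ten.
    print(attempts, "attempts ~", round(equivalent_pin_digits(attempts)), "digit PIN")
```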
[Figure: results for ME and CA, 1973 to 2003. Annotation: "EAB starts here (1989)"]
With a single attempt (first five digits only):
  7% (1973-1988)
  44% (1989-2003)
With 10 attempts (complete 9-digit SSNs):
  0.01% (1973-1988)
  0.1% (1989-2003)
With 1,000 attempts (complete 9-digit SSNs):
  0.8% (1973-1988)
  8.5% (1989-2003)
These are weighted averages; for smaller states and recent years, prediction rates are higher. E.g., 1 out of 20 SSNs in DE, 1996, is identifiable with 10 or fewer attempts
In Test 2 we used birthday data of 621 alive individuals to predict their SSNs, based on interpolation with DMF data
  Our sample: born in 1986-1990 (i.e., mostly before EAB)
  In most populous states (i.e., worst case scenario)
Birthday and birthplace data can be obtained from several sources, but most easily, and in mass amounts, from online social networks (OSNs)
  It is trivial for an attacker to write scripts to penetrate OSN communities and download massive amounts of data
Test 2 confirmed results of Test 1 (for same mix of years/states of birth)
  This validates that interpolation of SSN data for deceased individuals and birthday data for alive individuals can lead to the prediction of the latter's SSNs
Extrapolating to the US living population, this would imply the identification of around 40 million SSNs' first 5 digits and almost 8 million individuals' complete SSNs
  Caveat: Assuming knowledge of birth data!
Personal knowledge
Online social networks
Voter registration lists
Free online people search services
Commercial databases
Statistical predictions do not amount, alone, to identity theft
How can you test 10, 100, or 1,000 variations of an SSN without raising red flags?
  Using botnets and distributed online services for brute force verification attacks
  "Tumbling" attacks have been documented by ID Analytics
Phishing
SSNVS: SSN Verification Service (SSA)
eVerify (DHS)
Instant credit approval services
  DOB/SSN match often is a sufficient condition to get approved for several services
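Seen from the defender's side, one possible red flag is the same name and date of birth appearing with many different SSNs across applications. The Python sketch below is a hypothetical illustration: the field layout, the threshold of 3, and the name `flag_tumbling` are assumptions. It does not describe the actual method of ID Analytics or of any verification service. The SSN values are dummy strings.

```python
from collections import defaultdict


def flag_tumbling(applications, max_distinct_ssns=3):
    """Return the (name, dob) identities seen with more than max_distinct_ssns SSNs."""
    seen = defaultdict(set)
    for name, dob, ssn in applications:
        seen[(name, dob)].add(ssn)
    return {ident for ident, ssns in seen.items() if len(ssns) > max_distinct_ssns}


# Hypothetical application log: (name, date of birth, submitted SSN).
applications = [
    ("PERSON A", "1990-01-01", "000-00-0001"),
    ("PERSON A", "1990-01-01", "000-00-0002"),
    ("PERSON A", "1990-01-01", "000-00-0003"),
    ("PERSON A", "1990-01-01", "000-00-0004"),
    ("PERSON B", "1985-06-15", "000-00-1000"),
    ("PERSON B", "1985-06-15", "000-00-1000"),  # repeat of the same SSN, not a variation
]

flagged = flag_tumbling(applications)
print(flagged)
```

A distributed attack spreads its attempts over many sources, so a check of this kind would only help when run on data pooled across the channels that receive the applications.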
[Diagram]
SSN predictability
Online verification systems: instant credit approvals, eVerify, SSNVS
SSNs as authenticators: CRAs, financial institutions, medical services
Availability of birth data: commercial databases, free online people searches, voter registration lists, online social networks
Distributed attacks: botnets
Randomize assignment scheme (all digits)?
Improve personal computer security?
Be on the alert for distributed attacks? Improve real-time coordination? (ID Analytics 2003)
Stop using SSNs for authentication, revert to single use as identifiers?
Change default settings? Change access/security policies?
Improve lax verification procedures?
Short term
  Randomize scheme
  But, this alone is not enough:
    Does not protect issued SSNs; does not resolve authenticator/identifier issue
Long term
  Reconsider legislative initiatives focusing on redacting/removing SSNs from documents/public exposure
  Phase out authentication usage
    Negligence argument for businesses that use them as such?
  Sunset solution?
    E.g., make all SSNs public by year 2014; transition to secure, private, efficient authentication methods in the meanwhile?
Research support from the National Science Foundation under Grant 0713361, from the U.S. Army Research Office under Contract DAAD190210389, from Carnegie Mellon Berkman Development Fund, and from the Pittsburgh Supercomputing Center is gratefully acknowledged