Assessing Disclosure Risk via Record Linkage

Josep Domingo-Ferrer, Sara Ricci and Jordi Soria-Comas

Universitat Rovira i Virgili
Dept. of Computer Engineering and Mathematics
UNESCO Chair in Data Privacy
Av. Països Catalans 26, 43007 Tarragona, Catalonia

October 5, 2015


Why anonymization?

Private information is routinely collected and stored.

Google

Hospitals

Universities

Problem: Privacy and Utility


Statistical disclosure control

Statistical disclosure control methods are about protecting the privacy of individual subjects whose answers constitute the original data set. Two main approaches exist:

Utility-first: Priority is given to preserving certain utility properties. Disclosure risk is assessed a posteriori.

Privacy-first: A privacy model is adopted to specify privacy guarantees before anonymization. Utility is assessed a posteriori.

Note

We propose an a posteriori disclosure risk analysis that simulates attacks via record linkage.


Record linkage: standard record linkage

[Figure: linkage between Data 1 and external information.]

The data protector needs to make assumptions on the attacker's background knowledge (external non-de-identified data sets available, attributes that can be used for linkage, etc.).

Record linkage mainly focuses on identity disclosure.


Record linkage: our approach

[Figure: linkage between the original data and the anonymized data.]

We assume a maximum-knowledge attacker.

Attribute disclosure can be assessed.

The attacker can assess the accuracy of any record linkage he wishes to claim.

The protector can use the methodology to tune the anonymization level so that the attacker can claim no linkage.


Permutation distance

The permutation distance measures the dissimilarity between two records.

$$d(x_1, x_2) = \max_{1 \le i \le m} \left| \mathrm{rank}_{X^i}(x_1^i) - \mathrm{rank}_{X^i}(x_2^i) \right|$$

Example. The max(d) column gives the distance from the X record (178808, 8730, 824), whose rank vector is (3,1,2), to each record of A. Ranks are counted from 0.

             X                        A
    X1      X2     X3       A1      A2     A3    max(d)
  270914   45554   4173   299391   38993   3894    15
  250802   57610   2639   248899   70240   2005    16
  299391   56606   3315   280169   59646   2548    15
  167656   38993   1619   176962   40050   2745     8
  176962   40462   4604   197601   45554   4173    14
  193328   30406   3433   167656   25938   3315     9
  178808    8730    824   200017    8730    705     5
  260530   25938   4145   253471   14669   5575    17
  187347   95500   5575   193328   95500   4846    18
  253471   72700   3894   260530   72700   4145    17
  270708   36956   2548   280501   36956   2188    14
  280169   56839   1804   242873   56606   1619    11
  178670    6539    215   178670    6539    215     2
  248899   70240   3536   250802   58427   3433    14
  345103   23024   2005   345103   15380   1804    16
  197601   59646   4846   178808   57610   4604    15
  200017   58427   2188   196740   40462   2639     9
  196740   15380    705   187347   23024    824     3
  280501   40050   1323   270914   56839   1323    12
  242873   14669   2745   270708   30406   3536    11

Callouts on the slide:

A record (248899, 70240, 2005): rank (10,17,6), d=(7,16,4), max(d) = 16.

A record (178670, 6539, 215): rank (2,0,0), d=(1,1,2), max(d) = 2.

A record (187347, 23024, 824): rank (4,4,2), d=(1,3,0), max(d) = 3.
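The example above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `ranks` and `permutation_distance` are illustrative and not taken from the slides, ranks are assumed to be zero-based, and attribute values are assumed to have no ties.

```python
def ranks(column):
    """Zero-based rank of each value within its attribute (assumes no ties)."""
    order = sorted(column)
    return [order.index(v) for v in column]


def permutation_distance(rank_a, rank_b):
    """Largest absolute rank difference over all attributes."""
    return max(abs(p - q) for p, q in zip(rank_a, rank_b))


# Attribute columns copied from the example table (X1..X3 and A1..A3).
X_cols = [
    [270914, 250802, 299391, 167656, 176962, 193328, 178808, 260530, 187347, 253471,
     270708, 280169, 178670, 248899, 345103, 197601, 200017, 196740, 280501, 242873],
    [45554, 57610, 56606, 38993, 40462, 30406, 8730, 25938, 95500, 72700,
     36956, 56839, 6539, 70240, 23024, 59646, 58427, 15380, 40050, 14669],
    [4173, 2639, 3315, 1619, 4604, 3433, 824, 4145, 5575, 3894,
     2548, 1804, 215, 3536, 2005, 4846, 2188, 705, 1323, 2745],
]
A_cols = [
    [299391, 248899, 280169, 176962, 197601, 167656, 200017, 253471, 193328, 260530,
     280501, 242873, 178670, 250802, 345103, 178808, 196740, 187347, 270914, 270708],
    [38993, 70240, 59646, 40050, 45554, 25938, 8730, 14669, 95500, 72700,
     36956, 56606, 6539, 58427, 15380, 57610, 40462, 23024, 56839, 30406],
    [3894, 2005, 2548, 2745, 4173, 3315, 705, 5575, 4846, 4145,
     2188, 1619, 215, 3433, 1804, 4604, 2639, 824, 1323, 3536],
]

# One rank vector per record, each attribute ranked within its own data set.
X_ranks = list(zip(*(ranks(c) for c in X_cols)))
A_ranks = list(zip(*(ranks(c) for c in A_cols)))

x = X_ranks[6]  # the record (178808, 8730, 824)
distances = [permutation_distance(x, a) for a in A_ranks]

print(x)               # (3, 1, 2)
print(distances)       # reproduces the max(d) column of the table
print(min(distances))  # 2
```

In this sketch the records of A are ranked within A, which is what the rank callouts of the example appear to do.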


Maximum-knowledge attacker model

Axiom (Kerckhoffs's principle)

A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

It can be applied in two different ways:

The attacker knows both the original and the anonymized data set, but not the linkage between anonymized and original records (re-identification disclosure).

The attacker knows all the original data set except one attribute, and all the anonymized data set (attribute disclosure).


A Relative Measure of Disclosure Risk

Definition

For each record $x \in X$ we define its linked record $y_x \in Y$ as one of the anonymized records in Y at shortest distance from x.

Let $M_{X,Y}(x, y_x)$ be a function measuring the amount of masking between $x$ and $y_x$, that will be

the linkage distance (re-identification disclosure).

$M_{X,Y}(x, y_x) = \left| \mathrm{rank}_{X^m}(x) - \mathrm{rank}_{Y^m}(y_x) \right|$ (attribute disclosure).

Given $Y_1$ and $Y_2$, two different anonymizations of X,

$\mathrm{dist}_M(X, Y_1)$ and $\mathrm{dist}_M(X, Y_2)$
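As a rough illustration of the re-identification case, the sketch below finds a linked record for every x and tabulates the resulting linkage distances for two anonymizations. The function names and the toy rank vectors are assumptions made for this example, the distance is the permutation distance on rank vectors, and ties are broken by taking the first closest record.

```python
from collections import Counter


def perm_dist(r1, r2):
    """Permutation distance between two rank vectors."""
    return max(abs(a - b) for a, b in zip(r1, r2))


def linked_index(x_rank, y_ranks):
    """Index of an anonymized record at shortest distance from x (first one on ties)."""
    return min(range(len(y_ranks)), key=lambda j: perm_dist(x_rank, y_ranks[j]))


def linkage_distance_distribution(x_ranks, y_ranks):
    """Empirical distribution of M(x, y_x), with M taken as the linkage distance."""
    dists = [perm_dist(x, y_ranks[linked_index(x, y_ranks)]) for x in x_ranks]
    return Counter(dists)


# Hypothetical rank vectors: one original data set and two anonymizations of it.
X_r = [(0, 0), (1, 2), (2, 1), (3, 3)]
Y1_r = [(0, 0), (2, 2), (1, 1), (3, 3)]  # lighter masking
Y2_r = [(1, 1), (0, 3), (3, 0), (2, 2)]  # heavier masking

dist_1 = linkage_distance_distribution(X_r, Y1_r)
dist_2 = linkage_distance_distribution(X_r, Y2_r)
print(dist_1)
print(dist_2)
```

In this toy case the linkage distances under Y2 are never zero, while under Y1 two of the four records still link at distance 0, so the two distributions can be compared directly.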


Record linkage

Algorithm (Disclosure risk assessment via record linkage)

Require: Original data set X.
Require: Anonymized data set Y.
dist ← distribution of linkage distances between X and Y.
dist′ ← distribution of distances of a non-disclosive linkage.
return comparison of dist and dist′.


  X              D_X
  A    B         A    B
  a1   b1        a1   b1
  a2   b2        a1   b2
                 a2   b1
                 a2   b2

We call dictionary an artificial data set including all the possible records containing combinations of attribute values of the original data set.
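One straightforward way to build such a dictionary is a Cartesian product of the value sets of the attributes. The sketch below is a minimal illustration; the function name `dictionary` is hypothetical, and sorting the values is only a convenience to make the output order predictable.

```python
from itertools import product


def dictionary(records):
    """All combinations of the attribute values that occur in `records`."""
    columns = zip(*records)                          # one tuple per attribute
    domains = [sorted(set(col)) for col in columns]  # distinct values per attribute
    return list(product(*domains))


X = [("a1", "b1"), ("a2", "b2")]
DX = dictionary(X)
print(DX)  # [('a1', 'b1'), ('a1', 'b2'), ('a2', 'b1'), ('a2', 'b2')]
```

The size of the dictionary is the product of the numbers of distinct values per attribute, so it grows quickly with the number of attributes.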


Note (Dictionary Linkage)

In the dictionary linkage test we compare the distribution of linkage distances between X and Y to the distribution of linkage distances between $D_X$ and Y.


Note (Linkage to Permuted Data Set)

In this test we compare the distribution of linkage distances between X and Y to the distribution of linkage distances between X and Y′, where Y′ is a data set of the same dimension as X, and with the same attributes, but randomly permuted and assigned to records.

  Y              Y′
  A    B         A        B
  a1   b1        aσ(1)    bρ(1)
  a2   b2        aσ(2)    bρ(2)
  ...  ...       ...      ...
  an   bn        aσ(n)    bρ(n)
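The construction of Y′ can be sketched by shuffling each attribute column independently, which corresponds to drawing the separate permutations σ and ρ in the table. The function name `permute_attributes` and the `seed` parameter are assumptions introduced for this example.

```python
import random


def permute_attributes(records, seed=None):
    """Return a data set whose attribute columns are independently shuffled."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*records)]
    for col in columns:
        rng.shuffle(col)  # a fresh random permutation for each attribute
    return list(zip(*columns))


Y = [(1, 10), (2, 20), (3, 30), (4, 40)]
Y_perm = permute_attributes(Y, seed=0)
print(Y_perm)
```

Each attribute keeps exactly the same values, but the assignment of values to records is random, so any linkage between X and Y′ is non-disclosive by construction.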


Attribute linkage

The Attribute Disclosure Test is based on attribute linkage. Let X be the original data set with m attributes.

Note (Attribute Disclosure Test)

The attacker knows attributes $A_1, \ldots, A_{m-1}$ of X and his goal is to determine the value of $A_m$ as accurately as possible.

  Age  Height  Income      Age  Height  Income
  55    1.80    2000       57    1.82    1500
  44    1.60    1100       40    1.65    1350
  32    1.83    1500       29    1.80    1650
  67    1.78     900       70    1.76    1000
  36    1.56     750       41    1.55     700
  72    1.70    1350       69    1.67    1200
  45    1.85     600       46    1.87     750
  23    1.71     400       30    1.66     350

Callouts on the slide, linkage on the known attributes (Age, Height): the left record (32, 1.83) has rank (2, 7); candidates on the right are annotated rank (1, 7), d = 1 and rank (3,8), d = 2.

Callouts on the slide, unknown attribute (Income): the left record (32, 1.83, 1500) has rank (7) and the right record (29, 1.80, 1650) has rank (8), so d = 1.
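The test can be sketched in Python on the example data. This is an illustrative sketch under several assumptions: the left table is taken to be X and the right table Y, ranks are one-based as in the Income callout, linkage on the known attributes uses the permutation distance, and the function names are hypothetical.

```python
def ranks(column):
    """One-based rank of each value within its attribute (assumes no ties)."""
    order = sorted(column)
    return [order.index(v) + 1 for v in column]


def rank_table(records):
    """Replace every value by its rank within its attribute."""
    cols = [ranks(list(c)) for c in zip(*records)]
    return list(zip(*cols))


def attribute_masking(X, Y, i):
    """Link X[i] on the first m-1 attributes, then compare ranks of attribute m."""
    rx, ry = rank_table(X), rank_table(Y)
    known = rx[i][:-1]

    def dist(j):
        return max(abs(a - b) for a, b in zip(known, ry[j][:-1]))

    j = min(range(len(Y)), key=dist)        # linked record (first one on ties)
    return j, abs(rx[i][-1] - ry[j][-1])    # index of y_x and the value of M


X = [(55, 1.80, 2000), (44, 1.60, 1100), (32, 1.83, 1500), (67, 1.78, 900),
     (36, 1.56, 750), (72, 1.70, 1350), (45, 1.85, 600), (23, 1.71, 400)]
Y = [(57, 1.82, 1500), (40, 1.65, 1350), (29, 1.80, 1650), (70, 1.76, 1000),
     (41, 1.55, 700), (69, 1.67, 1200), (46, 1.87, 750), (30, 1.66, 350)]

j, m_value = attribute_masking(X, Y, 2)  # the record (32, 1.83, 1500)
print(Y[j], m_value)  # (29, 1.8, 1650) 1
```

For the record (32, 1.83, 1500) the sketch links to (29, 1.80, 1650) and returns M = 1, in agreement with the Income callout.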


Experimental results: dictionary linkage test

[Figure: two plots.]

Left: distribution of linkage distances between X and Y, and distribution of linkage distances between $D_X$ and Y.

Right: same as the left plot, but replacing X by a random permutation $X_\sigma$ and $D_X$ by $D_{X_\sigma}$.


Linkage to permuted data set and attribute linkage

[Figure panels: Linkage to permuted data set; Attribute linkage.]


Correlations

[Figure panels: Noise addition; Differential privacy.]

Solid curve: distance between attribute correlation matrices of X and Y.

Dashed curve: minimum linkage distance between X and Y.


Conclusions and further research

We have proposed a general method for disclosure risk assessment based on record linkage by a maximum-knowledge attacker.

We have presented three specific record linkage tests: two focused on re-identification disclosure risk and one focused on attribute disclosure risk.

Achieving perfect anonymization requires huge noise, and hence causes a lot of utility damage.

Our empirical results show that the amount of noise needed for safe anonymization increases with the dependency between the attributes of the original data set (the more independent the attributes, the less noise is needed).

As future research, we will use different distances to see which one is more representative for assessing disclosure risk.
