Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality Farm Structure Survey: Considerations on the Release of a European Microdata L. Franconi D. Ichim L. Corallo Istituto Nazionale di Statistica (ISTAT) ITALIA Tarragona, Spain, 26-28 October 2011

Page 1: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Farm Structure Survey: Considerations on the Release of a European Microdata

L. Franconi

D. Ichim

L. Corallo

Istituto Nazionale di Statistica (ISTAT)

ITALIA

Tarragona, Spain, 26-28 October 2011

Page 2: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Summary

• Description of the European Farm Structure Survey (FSS)

• Disclosure Scenarios and Risk Analysis

• Disclosure Limitation Procedure

• Information Loss Assessment

• Conclusions

Page 3: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Objective

To explore, analyse and make recommendations on the release of a European FSS MFR

CASE STUDY

• Italian FSS 2005

• Dutch FSS 2007

Page 4: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Description of the European FSS

• The survey provides information on:
  a) utilised agricultural area (UAA)
  b) livestock units
  c) total standard gross margin (SGM), in ESU
  d) geographical location (NUTS)
  e) farming type

• Regional character and sparsity of the data

• The survey unit: agricultural holdings

• The target population: agricultural holdings

• Member States:
  a) census at least every 10 years
  b) BE, LU, NL, FI, SE: census in each survey round
  c) UK, NO: sample survey and census

• Response rate > 90%

Page 5: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

SGM and farming type

• Diagram elements: agricultural quantities, coefficients, partial SGM, total SGM, farming type

• Coefficients: based on three-year average prices, available with a one-year delay

• The dominating activity: partial SGM = more than 66% of total SGM

• Classification of farms: 70 farming type subdivisions, 50 particular types, 17 principal types, 9 general types
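The dominance rule above can be sketched in a few lines. This is only an illustration, assuming that each partial SGM is an agricultural quantity multiplied by its coefficient and that the total SGM is the sum of the partial SGMs. The activity names, coefficient values and function names are hypothetical, and the official typology rules are more detailed than this.

```python
from typing import Dict, Optional


def partial_sgm(quantities: Dict[str, float],
                coefficients: Dict[str, float]) -> Dict[str, float]:
    """Partial SGM per activity: quantity (ha or heads) times its coefficient."""
    return {act: qty * coefficients[act] for act, qty in quantities.items()}


def dominating_activity(partial: Dict[str, float],
                        threshold: float = 0.66) -> Optional[str]:
    """Return the activity whose partial SGM exceeds `threshold` of total SGM."""
    total = sum(partial.values())
    if total <= 0:
        return None  # no economic size, nothing to classify
    activity, value = max(partial.items(), key=lambda item: item[1])
    return activity if value / total > threshold else None


# Hypothetical coefficients and quantities, for illustration only.
coefficients = {"cereals": 2.0, "dairy": 10.0}
specialised = partial_sgm({"cereals": 40.0, "dairy": 2.0}, coefficients)
mixed = partial_sgm({"cereals": 25.0, "dairy": 5.0}, coefficients)

print(dominating_activity(specialised))  # cereals share is 80 / 100
print(dominating_activity(mixed))        # shares are 50 / 100 each, below the threshold
```

Keeping the threshold as a parameter makes it easy to compare the 66% figure quoted on the slide with other cut-offs.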

Page 6: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Analysis of the temporal detail

• Relative variations of the mean number of holdings with respect to the year 2000, at NUTS2 level

• Stability of the phenomenon
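A relative variation with respect to a base year is presumably computed as (value in year t minus value in 2000) divided by the value in 2000. The sketch below assumes that definition; the figures and the function name are made up for illustration and are not taken from the FSS.

```python
from typing import Dict


def relative_variation(series: Dict[int, float],
                       base_year: int = 2000) -> Dict[int, float]:
    """Relative change of each year's value with respect to the base year."""
    base = series[base_year]
    return {year: (value - base) / base
            for year, value in sorted(series.items())
            if year != base_year}


# Hypothetical mean number of holdings in one NUTS2 region, by FSS wave.
region = {2000: 200.0, 2003: 190.0, 2005: 180.0, 2007: 170.0}
variations = relative_variation(region)
for year, change in variations.items():
    print(year, f"{change:+.1%}")
```

Small and smoothly changing values across waves would be one way to read the "stability of the phenomenon" noted on the slide.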

Page 7: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Analysis of the geographical detail

[Figure: Percentage of large holdings at NUTS0 level in each wave of FSS (2000, 2003, 2005, 2007), by country: BE BG CZ DK DE EE IE GR ES FR IT CY LV LT LU HU MT NL AT PL PT RO SI SK FI SE UK NO]

i) it is difficult to analyse the phenomenon in a single MS

ii) «small» countries do not have many large holdings

Page 8: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Disclosure scenario and risk analysis

Scenario: Spontaneous Identification

• categorical structural variables:
  Area status (A05) – 3 categories
  SGM region code (A07) – NUTS2 – 21 categories for Italy
  Holder-Sex (L011) – 3 categories
  Age group (L012) – 7 categories

COUNTRY            COMBINATIONS    SAMPLE UNIQUES    SAMPLE DOUBLES
Italy              649             4%                3%
The Netherlands    27              15%               11%
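Sample uniques and sample doubles are combinations of the key variables that occur exactly once or exactly twice in the sample. A minimal sketch of that count follows, using the four variable codes listed above. The records are invented, the function name is hypothetical, and the slide does not state which denominator its percentages use, so only raw counts are returned.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple


def uniques_and_doubles(records: Iterable[Dict[str, object]],
                        keys: Sequence[str]) -> Tuple[int, int, int]:
    """Return (observed combinations, sample uniques, sample doubles)."""
    freq = Counter(tuple(rec[k] for k in keys) for rec in records)
    uniques = sum(1 for count in freq.values() if count == 1)
    doubles = sum(1 for count in freq.values() if count == 2)
    return len(freq), uniques, doubles


KEYS = ("A05", "A07", "L011", "L012")

# Invented records, for illustration only.
sample = [
    {"A05": 1, "A07": "ITC1", "L011": 1, "L012": 3},
    {"A05": 1, "A07": "ITC1", "L011": 1, "L012": 3},
    {"A05": 2, "A07": "ITC1", "L011": 2, "L012": 5},
    {"A05": 1, "A07": "ITF3", "L011": 1, "L012": 4},
    {"A05": 1, "A07": "ITF3", "L011": 1, "L012": 4},
    {"A05": 1, "A07": "ITF3", "L011": 1, "L012": 4},
]

combinations, uniques, doubles = uniques_and_doubles(sample, KEYS)
print(combinations, uniques, doubles)
```

Re-running the count with a coarser geographical variable, or without one of the keys, shows how recoding and suppression change the number of risky combinations.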

Page 9: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

VISIBLE RE-IDENTIFICATION

External information

Page 10: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

SGM and Farming Type

TWO STRATEGIES

• release SGM as it is in the original data

• recalculation of SGM based on the recoded and perturbed variables (agricultural quantities)

SUPPRESSION of some identification variables

AGGREGATION of some categorical variables

PERTURBATION of some numerical variables

Page 11: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Suppression and aggregation

Variable          Description     UNIQUE CASES
A04A (NUTS3)      District        61%
A04D              Municipality    12%
A07 (NUTS2)       Region          4%
Suppression A05   Area status     0%

The suppression of A05 might be considered by some MS a significant loss of data utility.

• variables with a high percentage of zero values (missing phenomenon): ADD UP

• regional character and sparsity of the data

Page 12: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Perturbation

• continuous variables

• skew distribution

• re-identification

INDIVIDUAL RANKING

- microaggregation k=3
- preserve the weighted means
- SGM region as blocking variable
- retains characteristics of the data
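The points above can be illustrated with a small sketch of individual-ranking microaggregation. It is not the authors' implementation: it assumes that each variable is sorted on its own, that consecutive records are grouped k at a time (the last group absorbs any remainder), and that each value is replaced by the weighted mean of its group, which keeps the weighted total, and therefore the weighted mean, unchanged within each block. Weights are assumed to be positive, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence


def microaggregate(values: Sequence[float], weights: Sequence[float],
                   k: int = 3) -> List[float]:
    """Replace each value by the weighted mean of its rank-based group."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # fewer than k records left over: merge them in
            end = n
        group = order[start:end]
        weight_sum = sum(weights[i] for i in group)
        group_mean = sum(weights[i] * values[i] for i in group) / weight_sum
        for i in group:
            result[i] = group_mean
        start = end
    return result


def microaggregate_by_block(records: List[Dict[str, Any]], variable: str,
                            weight: str, block: str, k: int = 3) -> List[float]:
    """Apply microaggregate() separately within each block (e.g. SGM region)."""
    positions = defaultdict(list)
    for pos, rec in enumerate(records):
        positions[rec[block]].append(pos)
    result = [0.0] * len(records)
    for members in positions.values():
        masked = microaggregate([records[p][variable] for p in members],
                                [records[p][weight] for p in members], k)
        for p, value in zip(members, masked):
            result[p] = value
    return result


# Groups formed here: {1, 2, 3} and {10, 20, 30, 100}.
masked = microaggregate([1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 100.0], [1.0] * 7)
print(masked)
```

A block with fewer than k records ends up as a single group in this sketch; a production procedure would need an explicit rule for such blocks.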

Page 13: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Information loss assessment

• percentage variation of the means/variances

• the skew distributions as a consequence of the sparsity

• Member States decide: IR at NUTS3 or NUTS2 level
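The first measure above can be sketched as follows, assuming the percentage variation is 100 × (perturbed − original) / original. For brevity the example uses unweighted means and population variances, whereas survey estimates would normally use the sampling weights. The data and function names are hypothetical.

```python
from statistics import fmean, pvariance
from typing import Dict, Sequence


def percentage_variation(original: float, perturbed: float) -> float:
    """Percentage change of a statistic after perturbation."""
    return 100.0 * (perturbed - original) / original


def information_loss(original: Sequence[float],
                     perturbed: Sequence[float]) -> Dict[str, float]:
    """Percentage variation of the mean and of the variance."""
    return {
        "mean": percentage_variation(fmean(original), fmean(perturbed)),
        "variance": percentage_variation(pvariance(original),
                                         pvariance(perturbed)),
    }


# Invented values; the perturbed series holds group means for k=3.
original = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
perturbed = [2.0, 2.0, 2.0, 20.0, 20.0, 20.0]
loss = information_loss(original, perturbed)
print(loss)
```

Replacing values by group means leaves the mean unchanged and lowers the variance, which is why both statistics are worth monitoring after microaggregation.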

Page 14: Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Conclusion

• EACH FARM IS VISIBLE

• REGIONAL CHARACTER AND SPARSITY OF THE DATA

GEOGRAPHICAL DETAIL = NUTS2 LEVEL

• RELEASE SGM ORIGINAL AND FARMING TYPE

• FARMING TYPE AT PARTICULAR LEVEL

• VARIATION OF INDIVIDUAL RANKING