Business data linking: recent UK experience

business data in the UK
• common register (IDBR) since 1994
• key law: Statistics of Trade Act 1947
• data collection supervised by a Survey Control Unit
  – concerns over burden on business
  – exemptions from repeat surveys for smallest firms
• devolved political and statistical framework
  – government departments separate bodies
  – data sharing has purposes and limitations specified

the Business Data Linking project (BDL)
• begun in the late 1990s
  – core dataset: Annual Respondents Database
  – other datasets: R&D, skills, Community Innovation Surveys, e-commerce, New Earnings Survey…
• joint venture between ONS, OGDs*, academics
• academics on secondment work in a “safe setting”
• no access outside ONS
• outputs checked manually for disclosure
*OGD: other government department

sample outputs
• solving the productivity problem?
  – UK multinationals as productive as foreign-owned firms
  – domestically-oriented firms even more unproductive?
• e-commerce lowers prices!
  – ...perhaps...
  – actually seems to emphasise existing market conditions
  – competition increases - but monopolies get stronger too
• on-the-job versus general skills
  – linking skills and schooling data to firm data indicates a genuine productivity gain from general human capital

problems (1): “the ministry for adding things up”
• microdata quality suffers
  – statistical editing and block adjustment
• redefinition and interpretation of data or metadata
  – more problematic for micro users
  – eg SIC80-SIC92
• longitudinal integrity
  – crucial to micro analysis, irrelevant to macro numbers
  – not designed into repeat surveys
• documentation
  – different focus

problems (2): sampling frames
• small firms
  – low probability of reselection
  – smallest excluded by design
• changes in census band
• voluntary surveys
• non-IDBR sample selection

problems (3): inconsistencies
• inconsistent across time
  – eg ICT and innovation surveys
• inconsistent across surveys
  – eg foreign ownership

problems (4): confidentiality
• linking complicates disclosure control
  – increases number of quality assurers
• linking across small samples
  – reduces frequencies
  – increases likelihood of disclosiveness
• no general government right to share data
  – explicit agreement needed to share data across OGDs
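
The point about small samples can be illustrated with a minimal sketch: linking keeps only firms present in both surveys, so cell frequencies fall and some cells drop below whatever minimum count is required for release. The threshold, the firm ids, the industries and the function names below are all made up for illustration; the slides do not state the actual disclosure rules.

```python
from collections import Counter

# Hypothetical minimum number of firms per published cell; the real
# threshold is a policy choice that the slides do not state.
MIN_CELL_COUNT = 3

def cell_counts(firm_ids, cell_of):
    """Count firms per output cell (here: industry) for the given firm ids."""
    return Counter(cell_of[f] for f in firm_ids)

def unsafe_cells(counts, threshold=MIN_CELL_COUNT):
    """Return the cells whose frequency falls below the threshold."""
    return sorted(cell for cell, n in counts.items() if n < threshold)

# Made-up data: firm id -> industry, plus the respondents of two surveys.
industry = {1: "retail", 2: "retail", 3: "retail", 4: "retail",
            5: "steel", 6: "steel", 7: "steel"}
survey_a = {1, 2, 3, 4, 5, 6, 7}
survey_b = {1, 2, 3, 5, 9}

linked = survey_a & survey_b  # only firms that answered both surveys
before = cell_counts(survey_a, industry)
after = cell_counts(linked, industry)

print(unsafe_cells(before))  # cells below the threshold before linking
print(unsafe_cells(after))   # cells below the threshold after linking
```

In this toy data every cell passes before linking, while after linking the "steel" cell rests on a single firm and would need suppressing or merging.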

new developments
• timely electronic documentation
• automatic matching
• feedback into survey design
• integrated data and metadata system
• increasing awareness of benefits of microdata
  – increases value of data
  – lowers business burden
  – answers new questions
  – improves knowledge of datasets

what have we learnt?
• enthusiastic data providers are the key
  – plan early for disclosure checking too
  – feed back
• check data version
  – may not be a ‘definitive’ file
  – and even ‘clean’ datasets need preparation time
• check micro validity - macro validity isn’t enough
  – duplicates and bad values
  – inconsistencies within and across datasets and time
• “useless” data can be useful when linked
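
The micro validity checks listed above might look something like the sketch below. It is only an illustration: the field names (`ref`, `year`, `employment`, `foreign_owned`), the records and the helper functions are hypothetical and do not describe the actual BDL datasets or tools.

```python
from collections import Counter

def record_key(record, key=("ref", "year")):
    """Build the identifying key (firm reference, year) for a record."""
    return tuple(record[k] for k in key)

def duplicate_keys(records):
    """Return keys that occur more than once in a dataset."""
    counts = Counter(record_key(r) for r in records)
    return sorted(k for k, n in counts.items() if n > 1)

def bad_values(records, field):
    """Return records whose field is missing or negative."""
    return [r for r in records if r.get(field) is None or r[field] < 0]

def disagreements(a, b, field):
    """Return keys present in both datasets where the field differs."""
    lookup = {record_key(r): r[field] for r in b}
    return sorted({record_key(r) for r in a
                   if record_key(r) in lookup
                   and lookup[record_key(r)] != r[field]})

# Made-up records standing in for two surveys of the same firms.
survey_x = [
    {"ref": 101, "year": 2000, "employment": 25, "foreign_owned": True},
    {"ref": 101, "year": 2000, "employment": 25, "foreign_owned": True},
    {"ref": 102, "year": 2000, "employment": -4, "foreign_owned": False},
    {"ref": 103, "year": 2000, "employment": None, "foreign_owned": False},
]
survey_y = [
    {"ref": 101, "year": 2000, "foreign_owned": False},
    {"ref": 102, "year": 2000, "foreign_owned": False},
]

print(duplicate_keys(survey_x))
print([r["ref"] for r in bad_values(survey_x, "employment")])
print(disagreements(survey_x, survey_y, "foreign_owned"))
```

None of these problems need show up in an aggregate total, which is why macro validity alone is not enough for linked microdata.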

finally...
• be prepared to take the lead
• don’t get stressed
  – recognise the data wasn’t collected for this purpose
  – enjoy the fact that it is available
• talk about it

contact
Felix Ritchie
Business Data Linking
Office for National Statistics
1 Drummond Gate
London SW1V 1QQ