The Experiences of Web Based Data Collection from Enterprises in Finland
August 9th 2006, JSM Seattle USA
Rami Peltola
Introduction - Strategies And Methods
- Statistics Finland’s Strategy for EDR
  - To offer an electronic option in all data collections by 2007 (not in person statistics)
  - It’s the respondent’s choice whether to use it or not
- Data Collection Methods
  - About 97% of data are derived from administrative registers
  - About 3% are from direct data collection (paper forms, machine readable data / primary EDI, EDR, interviews by CATI/CAPI systems: mainly Blaise)
- Business Data Collections
  - About 50 surveys (excluding collections with fewer than 30 respondents)
  - 45 Web (Internet form) collections in use
Background - Data Collection And Infrastructure
- Traditionally high response rates (in both annual and sub-annual business surveys)
  - Up to over 99%, persistent staff
- Good relations with data providers
  - Experienced staff, continuous personal contacts, high level of trust
- High level of Internet use
  - Almost every enterprise has access to the Internet (employees 10+ 98%, employees 100+ 100%)
  - Business surveys are directed to the largest enterprises
- Positive atmosphere for using the Internet with the government
  - Respondents are even enthusiastic about using the Internet
  - “It’s fun to fill in web forms instead of paper ones!”
Background - Three Generations of In-house EDR Solutions
- 1st generation: Building cost index 2001
  - Built using Microsoft Windows DNA (Distributed interNet Applications Architecture)
- 2nd generation: 7 EDR solutions 2002-2005
  - VB.NET
- 3rd generation: 23 EDR solutions 2005-2006
  - XCola
- 11 EDR solutions made by an outside service provider 1997-2006
- Pilot in integrated data collection (tourism statistics)
Technical information - XCola in a Nutshell
- A generic application for Web surveys
- Processes the XML questionnaires and transforms them into Web applications
- Supports client and server side validations
- Executed on the server side, does not require any installation on the respondent side
- Works on every modern browser
- Easy to implement new questionnaires in just hours
- Main developer: Mr. Toni Räikkönen
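The slide does not show XCola's questionnaire format or its code, so the following is only a minimal sketch of the general idea: an XML questionnaire definition is rendered into a Web form with client-side checks, and the same checks are repeated on the server. The element and attribute names (`questionnaire`, `question`, `id`, `label`, `min`, `max`) and both function names are hypothetical and are not XCola's actual schema or API.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical questionnaire definition; not XCola's real format.
QUESTIONNAIRE_XML = """
<questionnaire id="sales" title="Sales inquiry">
  <question id="turnover" label="Turnover (EUR 1000)" min="0" max="1000000"/>
  <question id="employees" label="Number of employees" min="0" max="100000"/>
</questionnaire>
"""


def render_form(xml_text):
    """Turn the XML definition into an HTML form with client-side range checks."""
    root = ET.fromstring(xml_text)
    form_id = escape(root.get("id"))
    lines = [f'<form method="post" action="/submit/{form_id}">']
    for q in root.findall("question"):
        qid = escape(q.get("id"))
        label = escape(q.get("label"))
        lo, hi = escape(q.get("min")), escape(q.get("max"))
        lines.append(f'  <label for="{qid}">{label}</label>')
        lines.append(
            f'  <input type="number" id="{qid}" name="{qid}" '
            f'min="{lo}" max="{hi}" required>'
        )
    lines.append('  <button type="submit">Send</button>')
    lines.append("</form>")
    return "\n".join(lines)


def validate_submission(xml_text, answers):
    """Server-side validation: repeat the checks and return errors per question."""
    root = ET.fromstring(xml_text)
    errors = {}
    for q in root.findall("question"):
        qid = q.get("id")
        raw = answers.get(qid, "").strip()
        if not raw:
            errors[qid] = "missing value"
            continue
        try:
            value = int(raw)
        except ValueError:
            errors[qid] = "not a whole number"
            continue
        if not int(q.get("min")) <= value <= int(q.get("max")):
            errors[qid] = "out of range"
    return errors
```

Because the form and the validation are both derived from the same definition, adding a questionnaire in this sketch means writing a new XML file rather than new application code, which is one plausible reading of "new questionnaires in just hours".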
Benefits - Summary of Main Benefits
- Cost-efficiency
  - Simplifying data collection process
  - Reducing need for human resources
  - Reducing other data collection costs
- Accuracy
  - Improving the quality of collected data
  - Decreasing non-response
- Timeliness
  - Speeding up the data accumulation
- Data provider relations
  - Reducing response burden
  - Enabling direct individual feedback for respondents
  - Enabling browsing of previously submitted data
  - Assuring high level data security
Achieved Cost-Efficiency - 2nd Generation
- Four second generation solutions have been in production for 3 years (3300 respondents per month plus 800 per quarter)
  - On average, over 40% of the work in the data collection phase has been saved (2 person years)
  - The amount of ground mail has been reduced by 65% (0.5 person years)
  - The number of reminders sent has gone down by half
  - “Mass e-mailer” for all kinds of collections
- The investment is paid off in about a year
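The payback claim can be checked roughly against figures given in this presentation: savings of 2 and 0.5 person years on this slide, and a total input of about 2.5 person years for the first and second phases on the costs slide. The sketch below assumes that the savings are per year and that the 2.5 person years cover the whole investment; neither is stated explicitly on the slides, and the function name is illustrative.

```python
# Figures from the slides; reading the savings as "per year" is an assumption.
INVESTMENT_PERSON_YEARS = 2.5   # phases 1 and 2, "learning by doing"
SAVED_COLLECTION_WORK = 2.0     # over 40% of data collection work
SAVED_MAIL_HANDLING = 0.5       # 65% less ground mail


def payback_years(investment, *annual_savings):
    """Years until the summed annual savings cover the investment."""
    total = sum(annual_savings)
    if total <= 0:
        raise ValueError("annual savings must be positive")
    return investment / total


years = payback_years(
    INVESTMENT_PERSON_YEARS, SAVED_COLLECTION_WORK, SAVED_MAIL_HANDLING
)
print(f"Payback time: about {years:.1f} year(s)")
```

Under these assumptions the result agrees with the "about a year" stated on the slide.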
Cost-Efficiency Continues to Improve - 3rd Generation
- Common framework (one engine) for similar systems
  - An effective build-up
  - Simple method for transferring data between collection and production databases
  - Only one application to maintain and support
  - Support and development knowledge easier to acquire and spread
- Reducing need for human resources
  - As manual handling diminishes, it can be replaced by more rewarding tasks
An Example - Working Hours Used in Data Collection and Validation in Sale Inquiry
[Figure: hours (axis 0 to 2500) by year, 2001 to 2005]
Accuracy and Timeliness
- The data received are of better quality: “25% less errors” (both annual and sub-annual surveys)
- Response rates have remained at a high level
- The average response time of monthly surveys has been reduced
  - in the best case by 8-10 days or 30%
- The number of reminders sent has decreased substantially
  - in the best case by 50% (from 1000 to 500 in just 4 months)
- The share of respondents using the EDR solution has in most cases reached a high level
  - sub-annual surveys > 60% (in the best case 85%)
  - annual surveys ~ 30% (in the best case 75%)
An Example - Sale Inquiry Accumulation of Data 01/2002 - 01/2006
[Figure: responses (axis 0 to 2000) over time, 01/2002 to 01/2006]
Data Provider Relations
- Perceived response burden has gone down
  - E-mail informs of the survey and reminds respondents to answer
  - Questionnaire is “always” available and fast to fill in
  - Option to fill in the questionnaire in separate sessions
  - Good design of the questionnaire
  - Helpful validity checks - no additional inquiries
  - Contextual on-line help
  - Support for several languages
- Individually tailored feedback
- Access to all the previously submitted data and pre-filled questionnaires
High Level of Data Security
- Data security audit by an outside consultant
- All traffic on the Internet is SSL-encrypted
- An authentication / authorisation process is always needed
- New user IDs and passwords every year
- User IDs and passwords are initially sent in a letter
  - Only one of them can be sent by email
  - The other one must always be sent in a letter or given over by telephone
- Only a certain number of our staff have access to user IDs and passwords (usually two persons per survey)
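The yearly renewal and the split-delivery rule can be sketched as follows. This is an illustration only, not Statistics Finland's implementation: the function names, the credential format and the channel labels are hypothetical, and Python's standard `secrets` module stands in for whatever generator is actually used.

```python
import secrets

# Hypothetical channel labels, taken from the delivery options on the slide.
ALLOWED_CHANNELS = {"letter", "email", "telephone"}


def new_credentials(respondent_id, year):
    """Create a fresh user ID and password for one survey year."""
    user_id = f"{respondent_id}-{year}-{secrets.token_hex(3)}"
    password = secrets.token_urlsafe(12)
    return user_id, password


def check_delivery(user_id_channel, password_channel):
    """Enforce the rule that at most one of the two items travels by e-mail."""
    for channel in (user_id_channel, password_channel):
        if channel not in ALLOWED_CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
    if user_id_channel == "email" and password_channel == "email":
        raise ValueError("user ID and password must not both be sent by e-mail")
    return True
```

For example, `check_delivery("email", "letter")` passes, while `check_delivery("email", "email")` raises an error, so an intercepted e-mail alone never gives access to a questionnaire.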
An Example - Sale Inquiry Change in Response Media 12/2001 - 12/2005
[Figure: responses (axis 0 to 1800) by response medium (Fax, Mail, EDR) for 12/2001, 12/2002, 12/2003, 12/2004 and 12/2005]
Costs - Investment and Maintenance
- The costs have dropped by 60-70% during the last few years
- Average investment cost per new EDR solution (today)
  - An outside service provider: was EUR 5000
  - In-house solution (XCola): less than 150 hours of work
- Maintenance costs of EDR solution per year (today)
  - An outside service provider: was EUR 1000
  - In-house solution (XCola): less than 50 hours of work
- During the first and second phases the total resource input was about 2.5 person years (“learning by doing”)
  - Included the development of a secure communication environment
  - Included the implementation of 7 solutions
An Example - Work Done in Development and Maintenance of An EDR Solution (Sale Inquiry)
[Figure: hours (axis 0 to 900) by year, 2002 to 2005. Note in figure: includes hours used in development of infrastructure]
Challenges - In-house Development and Maintenance
- The development of surveys can be very fast if the IT personnel have good skills in XML and related techniques
  - At the moment the number of very skilled survey developers is limited
- The whole production environment around XCola is not yet finished
  - Somewhat dependent on certain named persons
- The statistics departments typically have a lot of requirements for the surveys
  - Some minor development in XCola is needed all the time
Pilot - Integrated Data Collection (Tourism)
- Data are delivered directly from hotel management systems into our database
  - No manual work needed (except to initiate the transfer)
  - After reception, the data are submitted to the standard validation process
- Software vendors implement a module for the hotel’s management software using Statistics Finland’s definitions for data and service interface
  - Implemented using a typical B2B integration technique: XML Web Services
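The slides do not give Statistics Finland's data definitions or service interface, so the sketch below only illustrates the flow described above: a hotel management system produces an XML message, and the receiving side parses it and runs it through validation before it is stored. The message layout, the field names and the validation rules are all hypothetical, and the Web Service transport itself is left out.

```python
import calendar
import xml.etree.ElementTree as ET

# Hypothetical message from a hotel management system; not the real definition.
MESSAGE = """
<accommodationReport hotelId="H123" period="2006-06">
  <roomsAvailable>40</roomsAvailable>
  <roomNightsSold>900</roomNightsSold>
  <arrivals>650</arrivals>
</accommodationReport>
"""


def parse_report(xml_text):
    """Convert the XML message into a plain dictionary of typed values."""
    root = ET.fromstring(xml_text)
    return {
        "hotel_id": root.get("hotelId"),
        "period": root.get("period"),
        "rooms_available": int(root.findtext("roomsAvailable")),
        "room_nights_sold": int(root.findtext("roomNightsSold")),
        "arrivals": int(root.findtext("arrivals")),
    }


def validate_report(report):
    """Apply simple plausibility checks; return a list of error messages."""
    errors = []
    counts = ("rooms_available", "room_nights_sold", "arrivals")
    if any(report[name] < 0 for name in counts):
        errors.append("negative value")
    year, month = (int(part) for part in report["period"].split("-"))
    days = calendar.monthrange(year, month)[1]
    if report["room_nights_sold"] > report["rooms_available"] * days:
        errors.append("room nights sold exceed capacity")
    return errors


report = parse_report(MESSAGE)
problems = validate_report(report)
print("accepted" if not problems else problems)
```

In this sketch a report with no errors would be written to the collection database; a report with errors would be returned to the sender, which mirrors the "no manual work needed" goal of the pilot.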
Near Future - Productisation and Integration
- More integrated data collections?
  - Co-operation with management system providers
- Project for productisation of XCola (since June 2006)
  - Already completed (XCola v. 3.1):
    - Developer’s manual, finalised administration tools
    - Routines for transfers between collection and production databases
    - XCola version for outside evaluation has been built
  - Under development:
    - Graphical editor for building questionnaires and links to metadata
- Project for co-ordination of business surveys
  - In the future more co-ordinated surveys - instead of many independent surveys targeted towards businesses