robert-daniels
Benefits of data access
Public money
Professional responsibility to share data
Data access and scientific progress
Fosters open scientific community and transparency in scientific inquiry
Allows for verification, refutation and refinement of findings
Promotes new research
Use for evidence-based policy making
Improves methods and measurements
Encourages multiple perspectives
Protects against faulty data
Increases chances of future funding
Risks of data access
Poor quality data
Ownership, authorship
– Fear that secondary analyst may publish first
– Concern of loss of control over data
Sponsor concerns
Financial costs for data collection / access
– Secondary researcher gains more, loses less
Perceived risk to privacy / confidentiality
– Breach of confidentiality
– Statistical disclosure
Ethical use of data
– Use for non-analytic, non-research purpose
– Concern of integrity, competence of secondary analyst
Lack of incentives for data sharing
– Lack of disincentives for not sharing data
Developing a Prototype for Data Sharing Among Indepth-Network Member Sites – Building capacity in data management
- Kanchanaburi (Thailand)
- Wosera (Papua New Guinea)
- Vadu (India)
- International Institute of Information Technology (I2IT), India
Indepth Network (SIDA SAREC small grants proposal on capacity building)
Stakeholders
Vadu, India
Kanchanaburi, Thailand
Wosera, Papua New Guinea
I2IT
Indepth Network
Aims
Strengthen data sharing mechanisms within Indepth sites and prepare the sites for sharing with other partners
Develop a prototype for data sharing amongst Indepth sites
Objectives
Define minimal and optimal data sets that allow data sharing and data analysis amongst sites.
Develop a standardized system with a single data structure across Indepth sites to allow data sharing and merging for cross-site comparisons.
Strengthen data collection systems suitable for data sharing, and promote data sharing at Indepth sites while safeguarding site and citizen interests.
Steps towards data access & data sharing
Commitment to data sharing
Minimum data sets
– Technical issues
– Technology related issues
Minimum quality assurance checks
Modality of data access
Ensuring confidentiality, anonymity
Legal, financial, ethical, scientific considerations
Technical Procedure
The site data manager sends data in the format agreed by all sites.
Each site is free to send data from whichever database manager it uses for data management.
I2IT converts the data into the centralized database, which uses MySQL as the back end and PHP as the front end.
I2IT uploads the converted data to its server and grants each site, as well as other users, permission to access and use the data.
Site source systems:
Vadu: MySQL & PHP
Kanchanaburi: SQL Server
Wosera: FoxPro & Visual Basic
Data flow: Minimum Dataset -> ETL Tool (created by I2IT, to convert data into MySQL & PHP) -> Central Database (created and managed by I2IT, MySQL & PHP) -> Server
Vadu, Kanchanaburi, Wosera: upload and download data
Other Users (with prior online permission to view data)
Data Sharing, Upload & Download
For users other than member sites
Other Users (with prior online permission to view data)
Vadu, Kanchanaburi, Wosera
I2IT
Create User
Terms and Condition by sites for prior permission
Request to I2IT by E-mail
Request site by E-mail
Approval by site for sharing data
Permission by I2IT for data downloading
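The permission steps above can be read as a small workflow in which a download is possible only after every earlier step has been completed. The following is a minimal sketch of that idea, not the project's actual implementation: the step order is assumed to follow the order of the list, and the names Step and AccessRequest are hypothetical.

```python
from enum import Enum


class Step(Enum):
    # Assumed order, taken from the order of the steps listed on the slide.
    USER_CREATED = 1
    TERMS_ACCEPTED = 2
    REQUESTED_FROM_I2IT = 3
    REQUESTED_FROM_SITE = 4
    SITE_APPROVED = 5
    DOWNLOAD_PERMITTED = 6


class AccessRequest:
    """Tracks one external user's request for one site's data (hypothetical)."""

    def __init__(self, user: str, site: str) -> None:
        self.user = user
        self.site = site
        self.step = Step.USER_CREATED

    def advance(self, to: Step) -> None:
        # Only allow a move to the immediately following step, so that
        # no approval can be skipped.
        if to.value != self.step.value + 1:
            raise ValueError(f"cannot go from {self.step.name} to {to.name}")
        self.step = to

    def can_download(self) -> bool:
        return self.step is Step.DOWNLOAD_PERMITTED
```

Under these assumptions a request for Vadu data starts at USER_CREATED, and can_download() returns True only once the request has passed through site approval and the final I2IT permission.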
I2IT - tasks
Extraction of the data provided by the sites.
Conversion of the given fields into the fields defined by the prototype.
Uploading and storage of the data on the website.
Providing web access to the sites and other interested users.
Providing authorizations and implementing security constraints for data access through the website.
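The field conversion task can be illustrated with a per-site mapping from source field names to prototype field names. This is only a sketch in Python (the project itself used Base SAS for ETL); the site keys, field names and the function to_prototype are all hypothetical, since the slides do not list the actual fields.

```python
# Hypothetical source-to-prototype field names, for illustration only.
FIELD_MAP = {
    "site_a": {"person_id": "pid", "birth_dt": "date_of_birth", "gender": "sex"},
    "site_b": {"ID": "pid", "DOB": "date_of_birth", "SEX": "sex"},
}


def to_prototype(site: str, record: dict) -> dict:
    """Rename one source record's fields to the shared prototype fields."""
    mapping = FIELD_MAP[site]
    # Refuse records that lack a mapped field rather than loading partial rows.
    missing = [src for src in mapping if src not in record]
    if missing:
        raise KeyError(f"{site} record is missing fields: {missing}")
    # Fields that are not part of the mapping are dropped.
    return {dst: record[src] for src, dst in mapping.items()}
```

With this layout, records from differently structured site databases end up with the same keys, which is what allows them to be merged in one central table.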
ETL?
ETL is an acronym for Extract, Transform and Load.
Extraction involves pulling the data from various data sources such as Oracle, Visual FoxPro, Excel, SQL Server, etc.
Transformation involves converting the data from the source structure to the destination structure.
This transformation is crucial for the entire project.
Example: conversion of the Vadu, India PID from 12-digit to 15-digit format
APTGNT0023004 (12 digit)
106001068002304 (15 digit)
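The slides do not give the rule that maps an old PID to a new one, so no conversion function is attempted here. What can be sketched is a quality check on the result: every converted PID should be a 15-digit numeric string, and no two source PIDs should map to the same target. The function name check_converted_pids is hypothetical.

```python
import re

# Target format stated on the slide: 15 numeric digits.
TARGET_PID = re.compile(r"[0-9]{15}")


def check_converted_pids(pairs):
    """Return a list of problems found in (source_pid, target_pid) pairs."""
    problems = []
    seen_targets = {}  # target PID -> source PID that first produced it
    for source, target in pairs:
        if not TARGET_PID.fullmatch(target):
            problems.append(f"{source}: target {target!r} is not 15 digits")
        elif target in seen_targets and seen_targets[target] != source:
            problems.append(
                f"{source}: target {target} already used by {seen_targets[target]}"
            )
        else:
            seen_targets[target] = source
    return problems
```

For the pair shown on the slide, check_converted_pids([("APTGNT0023004", "106001068002304")]) reports no problems.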
ETL (SAS) I2IT team
Continued
We have extracted nearly 70,000 records from the Vadu database, converted their 12-digit alphanumeric PIDs into the 15-digit numeric format, and stored them in the centralized database.
Conversion of the Papua New Guinea and Kanchanaburi records is currently in progress.
The entire ETL process is documented meticulously.
Software & Hardware Requirements
Operating system : Linux/Unix/Windows
Database : MySQL
Front end : PHP/Java
ETL tools : Base SAS 9.1.3
Hardware : Pentium Dual Pro
Hard disk : 160 GB
RAM : 1 GB
Architecture
Vadu, Kanchanaburi, Wosera -> ETL -> Centralized database -> Web -> End users
Website tour
http://localhost/index.php
Thank You