June 14, 2012
Brief History of DOE’s Comprehensive Epidemiologic Data Resource (CEDR)
Dr. Cliff Strader, HS-13, CEDR Program Manager, US Dept of Energy
Phil Wallace, Manager, Information Technology, Oak Ridge Associated Universities
Formation of CEDR
• Recommendation from SPEERA report in 1990
• Data providers were ORAU, Los Alamos, and Hanford.
• Decision made to house CEDR at LBNL
• Many working groups in the initial effort to establish and define CEDR
• Working group eventually reduced to LBNL staff, DOE staff, one representative each from the 3 data providers, and Jack Fix of PNNL
• Goal to combine all epidemiologic data from the 3 DOE epidemiologic groups (data providers) into one publicly accessible resource
Formation of CEDR – cont’d
• Decisions included contents, data access and permissions, who is responsible, etc.
• Early decision made to ensure data were de-identified – protecting individuals’ privacy essential
• All data initially submitted to CEDR were processed at ORAU to establish a pseudo-ID number for each worker and insert that pseudo-ID in all records for that worker
• Each of the 3 data providers prepared their own files with this pseudo-ID number on every record and submitted to LBNL
• Documentation prepared to define all data fields and explain all codes
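The pseudo-ID step above can be illustrated with a minimal sketch. It is not the procedure ORAU actually used, which the slides do not describe. The field names (`worker_key`, `pseudo_id`) and the random 8-digit ID scheme are assumptions made for illustration only.

```python
import secrets

def assign_pseudo_ids(records, id_field="worker_key"):
    """Replace the real identifier in each record with a per-worker pseudo-ID.

    Hypothetical sketch: `id_field` and the 8-digit random IDs are
    illustrative assumptions, not CEDR's documented scheme.
    Returns (de-identified records, real-ID -> pseudo-ID mapping).
    """
    mapping = {}   # real identifier -> pseudo-ID (kept only by the linking party)
    used = set()   # pseudo-IDs already handed out, to guarantee uniqueness
    out = []
    for rec in records:
        real = rec[id_field]
        if real not in mapping:
            pid = secrets.randbelow(10**8)
            while pid in used:
                pid = secrets.randbelow(10**8)
            used.add(pid)
            mapping[real] = pid
        # Copy every field except the real identifier, then attach the pseudo-ID.
        clean = {k: v for k, v in rec.items() if k != id_field}
        clean["pseudo_id"] = mapping[real]
        out.append(clean)
    return out, mapping
```

Because every record for a given worker carries the same pseudo-ID, files prepared separately (for example, dosimetry and work history) can still be linked without exposing the real identifier.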
Formation of CEDR – cont’d
Measures taken to protect workers’ privacy included elimination or masking of:
• Name
• SSN
• Date of birth (masked to July 1 plus the year)
• Date of death (masked to July 1 plus the year)
• Date of hire (day masked to 15 plus the year)
• Date of termination (day masked to 15 plus the year)
• Badge numbers
• Vital status date (masked to July 1 plus year)
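The rules listed above can be sketched in a few lines. This is an illustration under stated assumptions, not CEDR's actual processing code. The field names are hypothetical, and the sketch assumes that "day masked to 15" keeps the original month and year.

```python
from datetime import date

# Hypothetical field names, chosen for illustration only.
DROP_FIELDS = {"name", "ssn", "badge_number"}
JULY_1_FIELDS = {"birth_date", "death_date", "vital_status_date"}
DAY_15_FIELDS = {"hire_date", "termination_date"}

def deidentify(record):
    """Return a copy of `record` with identifiers dropped and dates masked."""
    out = {}
    for field, value in record.items():
        if field in DROP_FIELDS:
            continue  # eliminated entirely
        if value is not None and field in JULY_1_FIELDS:
            value = date(value.year, 7, 1)   # keep year only, set to July 1
        elif value is not None and field in DAY_15_FIELDS:
            value = value.replace(day=15)    # assumption: month and year kept
        out[field] = value
    return out
```

Missing dates (for example, no date of death for a living worker) are passed through as `None` rather than masked.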
Formation of CEDR – cont’d
State Co-operation:
• All states contacted in initial CEDR development to seek their consent to use their data without review
• NYC gave emphatic NO
• Pennsylvania, Rhode Island, and Alabama asked to review each account request
• All other states gave tacit approval
• So, forms were created…
Formation of CEDR – cont’d
• LBNL staff continually monitored CEDR for potential vulnerabilities and brought concerns to advisory group for consideration
• CEDR was established before Internet access, so catalog was published and data requests manually prepared by LBNL staff and furnished on acceptable medium (diskette, CD)
• CEDR publicized using brochures and online bulletin boards where possible
• Data made accessible online when Internet established
• Online access required development of new method to allow access
• Initially 2 types of users – “general” user could only view data, but “authorized” user could actually download data
Structure of CEDR
• Contained both “Working Data” and “Analytic Data”
• Analytic data only from some later studies (e.g., ORNL update by UNC)
• Surveillance data from Former Worker Programs
• Bibliographic section – epidemiologic studies, links to CDC documents, Los Alamos Historical Document Retrieval and Assessment
• Naval shipyard data – requires separate account approval
Cybersecurity and CEDR
• CEDR was migrated from LBNL to ORISE in January 2010
• For this program to reside at ORISE, an Authority to Operate (ATO) is required. ATO was in place when migration began
• Before CEDR could be placed into production, a Software Application Risk Assessment (SARA) had to be performed internally by ORAU’s Cybersecurity group
• Privacy Impact Assessment also conducted to examine system for PII / PHI and assess impact should breach occur
• CEDR production version went online June 30, 2010
Cybersecurity and CEDR
User Accounts:
• Account requests are treated as OUO information
• Passwords distributed according to NIST 800-53 guidelines; changed on annual basis
• Notice there is a one-way data flow from CEDR to user
• Viewing data shows only 15 records, not the entire file
• All web site access is recorded
Physical Protection and CEDR:
• Locked facility; triple layer of locks
• Data stored on ORISE network; backups performed per schedule
CEDR: What’s new?
• Look and feel of website updated
• Redesigned web page
• Better organized
• Greater visual appeal
• Easier site navigation
• More data, information
CEDR now has
• Data on 1,045,968 individuals
• Extensive bibliographic holdings related to data
• Updates on 8,923 workers to determine mortality status completed just a few months ago
• Analysis files identical to those used by former researchers permit confirmation of their analyses as a teaching tool
• Free data updates on many people, not yet analyzed
• Sandia cohort hasn't yet been examined
• So, hundreds of thousands of untapped records await interested researchers
CEDR open webinar planned:
• Focus on increasing awareness of resource
• Use data as teaching tool
• Engage students now in training in public health
• Improve the program through feedback from users
CEDR resources for teaching
• Mail contact with schools of public health to introduce program
• Distribute updated CEDR brochures to encourage interest
• Emphasize opportunities for teaching data analysis techniques, conducting research on exceptional data sets, publication of as yet minimally explored topics
• Potential use for MPH and PhD level research