Comprehensive Inventory of Research Networks
Clinical Data Research Networks, Patient-Powered Research Networks, and Patient Registries
This report was prepared by researchers based at the University of California, San Diego, RAND Corporation, and San Francisco State University on behalf of PCORI.
Acknowledgements
This report was prepared by researchers based at the University of California, San Diego, RAND Corporation, and San Francisco State University on behalf of PCORI. PCORI would like to thank the team for its thorough report, delivered on a quick timeline. PCORI notes that any networks omitted from this report and limitations in the amount of detail provided on each network have resulted from the tight turnaround time required for this report. Information on the original Request for Proposal for the Comprehensive Inventory of Networks is available at pcori.org.
Report submitted February 19, 2013, and published June 12, 2013. It was revised July 30, 2013.
Principal Investigator: Lucila Ohno-Machado, University of California San Diego
Researchers:
University of California San Diego, Division of Biomedical Informatics: Neda Alipanah, Michele E. Day, Robert El-Kareh, Seena Farzaneh, Patricia Freeland, Adela Grando, Hyeon-eui Kim
RAND Corporation: Daniella Meeker
San Francisco State University: Katherine Kim
CDRN, PPRN, Patient Registries: Taxonomy and Comprehensive Inventories
DISCLAIMER: All statements in this publication, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of the Patient-Centered Outcomes Research Institute (PCORI) or its Board of Governors. This publication was developed through a contract to support PCORI’s research agenda, and PCORI has not peer-reviewed or edited the content. The publication is being made available free of charge for the information of the scientific community and general public as part of PCORI’s ongoing research programs. Questions or comments may be sent to PCORI at [email protected] or by mail to 1828 L St., NW, Washington, DC 20036.
Executive Summary
Objective The objective of this summary is to provide a lay summary of our methods, key findings about patient engagement, and descriptions of the final products (taxonomy and comprehensive inventories). We were tasked with developing a taxonomy and comprehensive inventories of three types of collaboratives: clinical data research networks (CDRNs), patient-powered research networks (PPRNs), and patient registries based on 22 criteria defined by the Patient-Centered Outcomes Research Institute (PCORI).
Methods We translated the 22 criteria into interview questions (see Appendix for the original 22 criteria with our reworded criteria) and defined CDRNs, PPRNs, and patient registries using the characteristics listed in Table 1.
Table 1. Characteristics used to classify networks into CDRNs, PPRNs, and patient registries.

CDRN
• Provides researchers with access to aggregate data such as counts and descriptive statistics (in some cases, patient-level data are provided)
• Includes multiple healthcare institutions and/or research organizations
• Has the ability to extract all data, i.e., does not only extract data based on a specific condition or disease

PPRN
• Provides patients with access to patient-provided data and/or their own genetic data
• Enables patient-patient interactions
• Has the ability to involve physicians and researchers

Patient Registry
• Provides researchers with patient-level data
• Has a specific condition or disease focus
• Sometimes provides patient contact information to researchers
• Sometimes allows patients to contact researchers
• No patient-patient interactions

We identified CDRNs, PPRNs, and patient registries by consulting experts, browsing the Internet and funded research projects on NIH reporter, searching for citations, and reading through PCORI’s RFIs. Initially, we focused on CDRNs that covered at least one million lives and PPRNs that covered at least 10,000 individuals with a particular condition (or 1,000 for rare diseases as defined by the NIH—fewer than 200,000 affected individuals in the United States). However, through our search for the CDRNs and PPRNs to include in this report, we found some that have been in existence for only a few years and therefore included fewer than one million lives (e.g., Community Health Applied Research Network or CHARN) or fewer than 10,000 individuals (e.g., Cancer Commons). Although these do not meet the covered lives criterion, including them provides a richer landscape of the types of existing networks; therefore, we expanded our parameters to include these networks. The final list of CDRNs, PPRNs, and patient registries
included in this report is given in Tables 2, 3, and 4, respectively. We also categorized the PPRNs into tiers:
• Tier 1 – meets minimum population criterion and characteristics from Table 1
• Tier 2 – new and does not meet minimum population criterion yet
• Tier 3 – meets minimum population criterion and collects data, but not necessarily for research
• Tier 4 – meets minimum population criterion but does not collect any data for research (e.g., message boards)
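
The population thresholds and tier definitions above can be read as a small decision rule. The following is a minimal sketch of one such reading, not the procedure the authors used: the record type `Pprn`, its field names, and the ordering of the checks are assumptions made for illustration, and the Table 1 characteristics required for Tier 1 are not modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the Methods text.
PPRN_MIN_INDIVIDUALS = 10_000
PPRN_MIN_INDIVIDUALS_RARE = 1_000
RARE_DISEASE_US_CEILING = 200_000  # NIH: fewer than 200,000 affected in the US


@dataclass
class Pprn:
    """Hypothetical record; field names are illustrative, not from the report."""
    name: str
    members: int
    us_prevalence: Optional[int]  # None when no single condition applies
    is_new: bool
    collects_data: bool
    data_used_for_research: bool


def meets_population_criterion(p: Pprn) -> bool:
    """Apply the 10,000 threshold, or 1,000 for a rare disease."""
    rare = p.us_prevalence is not None and p.us_prevalence < RARE_DISEASE_US_CEILING
    threshold = PPRN_MIN_INDIVIDUALS_RARE if rare else PPRN_MIN_INDIVIDUALS
    return p.members >= threshold


def assign_tier(p: Pprn) -> Optional[int]:
    """Return a tier 1-4, or None when no tier definition fits (assumed reading)."""
    if not meets_population_criterion(p):
        # Tier 2: new and does not meet the minimum population criterion yet.
        return 2 if p.is_new else None
    if not p.collects_data:
        return 4  # e.g., message boards
    return 1 if p.data_used_for_research else 3
```

For example, under these assumptions a hypothetical network with 1,500 members and a condition affecting 30,000 people in the United States meets the rare-disease threshold; if it collects data that is not used for research, `assign_tier` places it in Tier 3.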
We created a preliminary taxonomy structure based on an initial assessment of how the criteria could be grouped by criterion subject. This structure evolved as we conducted our research so that the taxonomy would better represent the distinguishing features of the networks and registries. We collected data from information obtained through each CDRN, PPRN, and patient registry’s respective website, articles obtained from the website, and RFI sent by PCORI if available. We also conducted 48 phone interviews (see Tables 2, 3, and 4). Note that figures and tables in the inventory pages were taken from documents generated by the respective network or registry.
Table 2. CDRNs included in this inventory. Underlined name indicates that information from PCORI’s RFI was incorporated in the inventory. “E-mail” in Interview Date column indicates that questions were answered through e-mail and a phone interview was declined.
CDRN Name Website Interview Date
1 Association of Asian Pacific Community Health Organizations (AAPCHO) http://www.aapcho.org/ 2/7/13
2 Breast Cancer Surveillance Consortium (BCSC) http://breastscreening.cancer.gov/ 2/4/13
3 Cancer Research Network (CRN) http://crn.cancer.gov 2/1/13
4 Connecticut Center for Primary Care (CCPC) http://www.centerforprimarycare.org/ 1/30/13
5 CER2 Not available 2/15/13
6 CERTAIN http://www.becertain.org 2/8/13
7 Community Health Applied Research Network (CHARN) http://www.kpchr.org/CHARN 1/29/13
8 Children’s Hospital of Philadelphia Research Consortium (CHOP-PeRC) http://www.research.chop.edu 2/12/13
9 Distributed Ambulatory Research in Therapeutics Network (DARTNet) http://www.dartnet.info/ 1/31/13
10 Electronic Medical Records and Genomics (eMERGE) Network http://emerge.mc.vanderbilt.edu/emerge-network 1/30/13
11 HMO Research Network (HMORN) http://www.hmoresearchnetwork.org/ 2/5/13
12 HOspital Medicine Reengineering Network (HOMERuN) Not available 2/11/13
13 Mini-Sentinel http://www.mini-sentinel.org/ 1/29/13
14 The National Dental Practice-Based Research Network http://nationaldentalpbrn.org/ E-mail*
15 Pediatric Emergency Care Applied Research Network (PECARN) http://www.pecarn.org 1/29/13
16 Pediatric Health Information System (PHIS+) Not available 2/1/13
17 SCAlable National Network for Effectiveness Research (SCANNER) http://scanner.ucsd.edu 1/25/13
18 Society for Vascular Surgery Vascular Quality Initiative (SVS VQI) http://www.vascularqualityinitiative.org 1/28/13
19 UC-Research eXchange (UCReX) http://www.ucrex.org 2/8/13
20 Wisconsin Network for Health Research (WiNHR) https://ictr.wisc.edu/WiNHR 2/11/13
*Information forwarded did not answer the criteria. No response to follow-up requests for additional information.
Table 3. PPRNs included in this inventory. “E-mail” in Interview Date column indicates that questions were answered through e-mail and phone interviews were declined.
PPRN Name Tier # Website Interview Date
1 23andMe Tier 1 http://www.23andme.com 1/29/13
2 Association of Cancer Online Resources (ACOR) Tier 4 http://www.acor.org/ No response
3 Dr. Susan Love Research Foundation’s Love/Avon Army of Women Tier 1 http://www.armyofwomen.org/ 2/19/13
4 Asthmapolis Tier 2 http://asthmapolis.com/ 2/15/13
5 BRIDGE Tier 2 http://sagebridge.org Declined
6 Cancer Commons Tier 2 http://www.cancercommons.org 2/18/13
7 Crohnology Tier 2 http://crohnology.com/ 2/5/13
8 Collaborative Chronic Care Network (C3N) Tier 1 http://c3nproject.org/ 2/4/13
9 DIYgenomics Tier 1 (population unknown) http://www.diygenomics.org/ 2/15/13
10 Genomera Tier 1 http://genomera.com/ 2/11/13
11 Glu Tier 1 http://www.myglu.org E-mail
12 Inspire Tier 1 http://www.inspire.com No response
13 Insulindependence Tier 3 http://www.insulindependence.org 2/6/13
14 International Waldenstrom’s Macroglobulinemia Foundation Tier 1 http://www.imwf.com 2/12/13
15 MDJunction Tier 4 http://www.mdjunction.com/ No response
16 MedHelp Tier 3 http://www.medhelp.org/ 2/1/13
17 PatientsLikeMe Tier 1 http://www.patientslikeme.com 1/15/13
18 Personal Genome Project Tier 2 http://www.personalgenomes.org/ No response
19 Quantified Self Tier 4 http://quantifiedself.com/ 2/8/13
20 TuDiabetes Tier 1 http://www.tudiabetes.org/ 2/15/13
Table 4. Patient registries included in this inventory. Underlined name indicates that information from PCORI’s RFI was incorporated in the inventory. “E-mail” in Interview Date column indicates that questions were answered through e-mail and phone interviews were declined.
Patient Registry Name Website Interview Date
1 Autism Genetic Resource Exchange (AGRE) http://agre.autismspeaks.org 2/14/13
2 Autism Treatment Network http://www.autismspeaks.org/science/resources-programs/autism-treatment-network E-mail
3 Be the Match Bone Marrow Donor Registry http://marrow.org 2/19/13
4 Breast Cancer Family Registry (BCFR) http://epi.grants.cancer.gov/CFR/about_breast.html 2/13/13
5 BreastCancerTrials.org (BCT) https://www.breastcancertrials.org 2/11/13
6 California Cancer Registry (CCR) http://www.ccrcal.org 2/13/13
7 California Immunization Registry (CAIR) http://cairweb.org/ E-mail
8 California Joint Replacement Registry (CJRR) http://www.caljrr.org 2/6/13
9 The Colon Cancer Family Registry (CCFR) http://epi.grants.cancer.gov/CFR/about_colon.html 2/13/13
10 Cystic Fibrosis Patient Registry http://www.cff.org/LivingWithCF/QualityImprovement/PatientRegistryReport/ 1/30/13
11 Kaiser Permanente Total Joint Replacement Registry Not available No response
12 Life Raft Group http://liferaftgroup.org/ 1/31/13
13 Multi-Institutional Consortium for Comparative Effectiveness Research in Prevention and Treatment of Diabetes Mellitus (SUPREME-DM) http://www.supreme-dm.org No response
14 MURDOCK https://www.murdock-study.com/ 2/13/13
15 New York State Congenital Malformations Registry http://www.health.ny.gov/diseases/congenital_malformations/cmrhome.htm No response
16 Physician-Hospital Organization (PHO) Not available 2/11/13
17 Reg4ALL https://www.reg4all.org/ 2/15/13
18 ResearchMatch http://www.researchmatch.org 2/4/13
19 United Network for Organ Sharing (UNOS) http://www.unos.org 2/21/13
20 Utah Population Database http://www.huntsmancancer.org/research/shared-resources/utah-population-database/overview 2/15/13
Final products
Taxonomy The final taxonomy (excerpted in Figures 1A-E) organizes the 22 criteria into three top-level classes: Network Characteristics, Evidence of Clinical Studies Capacity, and Data Processing (Figure 1A). Each class is broken down into subclasses. For example, Network Characteristics is a top-level class with Clinical Focus as one of its subclasses (Figure 1B). Where possible, we added further subclasses beneath a subclass to capture more specific details in the answers we gathered. The entire taxonomy structure is provided in the Appendix. These subclasses were annotated with criterion number and question (Figure 1C). We considered this annotation as representing the instances for the ontology classes. Instances are the answers to the criteria that were gathered during our data collection process (Figure 1D). We also included the Resource Description Framework (RDF) file, which defines the classes and subclasses in the taxonomy, their annotations, and their hierarchy (Figure 1E).
Figure 1. Taxonomy. A-C. OWL file viewed with Protégé¹. D. Answers gathered for each criterion. E. Resource Description Framework².
Comprehensive Inventories A comprehensive inventory of each CDRN, PPRN, and patient registry is included in the Appendix. Each inventory is displayed with the same format of Criteria in the left column and Answers in the right column.
1 Protégé is an open source framework for modeling taxonomies that is available for download at http://protege.stanford.edu/.
2 The Resource Description Framework file can be viewed with a text editor application.
[Figure 1, panels A-C: Protégé screenshots of the taxonomy.]

D
Criteria: 1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)?
Answer: Yes
Criteria: 1.e.i.1. What does the network focus on?
Answer: Medically underserved populations of Asian Americans, Native Hawaiians, and other Pacific Islanders

E
<!-- http://www.semanticweb.org/ontologies/2013/1/PCORI.owl#Clinical_Focus -->
<owl:Class rdf:about="http://www.semanticweb.org/ontologies/2013/1/PCORI.owl#Clinical_Focus">
    <rdfs:subClassOf rdf:resource="http://www.semanticweb.org/ontologies/2013/1/PCORI.owl#Network_Characteristics"/>
    <rdfs:comment xml:lang="en">1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? 1.e.i.1. What does the network focus on? </rdfs:comment>
</owl:Class>
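
Because the taxonomy is distributed as RDF/XML, the class hierarchy and annotations in Figure 1E can also be read programmatically. The sketch below is illustrative only and assumes the standard OWL, RDF, and RDFS namespace IRIs; the enclosing `rdf:RDF` element and the helper names `local_name` and `read_classes` are assumptions added here, since the figure shows only a single class definition.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
BASE = "http://www.semanticweb.org/ontologies/2013/1/PCORI.owl#"

# The Figure 1E class, wrapped in an rdf:RDF root that declares the
# namespaces (the wrapper is an assumption; the figure omits it).
FRAGMENT = f"""<rdf:RDF xmlns:owl="{OWL}" xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <owl:Class rdf:about="{BASE}Clinical_Focus">
    <rdfs:subClassOf rdf:resource="{BASE}Network_Characteristics"/>
    <rdfs:comment xml:lang="en">1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? 1.e.i.1. What does the network focus on?</rdfs:comment>
  </owl:Class>
</rdf:RDF>"""


def local_name(iri):
    """Return the part of an IRI after the '#'."""
    return iri.rsplit("#", 1)[-1]


def read_classes(xml_text):
    """Map each owl:Class to its parent class and its annotation text."""
    root = ET.fromstring(xml_text)
    classes = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        parent = cls.find(f"{{{RDFS}}}subClassOf")
        comment = cls.find(f"{{{RDFS}}}comment")
        classes[local_name(cls.get(f"{{{RDF}}}about"))] = {
            "parent": (local_name(parent.get(f"{{{RDF}}}resource"))
                       if parent is not None else None),
            "annotation": comment.text.strip() if comment is not None else "",
        }
    return classes


classes = read_classes(FRAGMENT)
print(classes["Clinical_Focus"]["parent"])
```

Run on the fragment, this reports Network_Characteristics as the parent of Clinical_Focus, matching the subclass relation described for Figure 1B.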
Key Findings To assess the level of patient engagement in the networks and registries, we examined our data at three levels: governance, study, and data. While we are not covering the full spectrum of patient engagement for our evaluation, we assessed the following criteria to determine if patients are involved vs. not directly involved at each level.
• Governance
  o 1.g.i. Are patients involved in the decision-making process on the use of data they provided to the network?
• Study
  o 1.f. Does the network use informed consent forms?
    § 1.f.i. Do patients consent to the broad³ … or specific use of their electronic data?
    § 1.f.ii. Do patients consent to the broad … or specific use of their biological specimens?
    § 1.f.iii. Can patients be re-contacted for consent for a new study?
• Data
  o 1.g.ii.1. What are the sources of self-reported data?
  o 1.g.ii.2. What are the sources of health care-derived data?
We found that patient involvement in the decision-making process for the use of their data is high in PPRNs (17 out of 20) and relatively low in CDRNs (5 out of 20) and patient registries (6 out of 20) (Figure 2). Examples of how patients were involved in the decision-making process include serving as members of the advisory board, controlling how much data are shared via privacy settings, and owning data and determining how much data to contribute. We also analyzed whether informed consent is used (Figure 3), whether consent is for broad or specific use of the respective patient’s data (Figure 4), and whether it would be possible to re-contact a patient for a new study (Figure 5). Based on Figure 3, all three types of collaboratives tend to engage patients in a study through the use of informed consent forms. The consent within CDRNs tends to be for specific use of data, while the consent within PPRNs tends to be for broad use of data. The types of data used are almost exclusively health care-derived in CDRNs and mostly self-reported in PPRNs (Figure 6).
3 We define broad to mean that data may be analyzed for other research.
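
The governance counts quoted above convert directly to shares of each collaborative type. The short sketch below only restates that arithmetic; the variable and function names are illustrative.

```python
TOTAL_PER_TYPE = 20  # each inventory lists 20 collaboratives (Tables 2-4)

# Collaboratives in which patients are involved in decisions about data use.
involved = {"CDRNs": 5, "PPRNs": 17, "Patient registries": 6}


def share(count, total=TOTAL_PER_TYPE):
    """Return count as a percentage of total."""
    return 100 * count / total


shares = {kind: share(n) for kind, n in involved.items()}
for kind, pct in shares.items():
    print(f"{kind}: {involved[kind]}/{TOTAL_PER_TYPE} = {pct:.0f}%")
```

This gives 25% for CDRNs, 85% for PPRNs, and 30% for patient registries.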
Figure 2. Counts of patient involvement in the decision-making process on the use of his or her data.
[Bar chart "Patients involved in decision-making". CDRNs: 5 yes, 15 no. PPRNs: 17 yes, 3 no. Registries: 6 yes, 13 no, 1 not available.]

Figure 3. Counts of each collaborative type using informed consent.
[Bar chart "Informed Consent" by collaborative type (CDRNs, PPRNs, Registries); categories: Yes, No, Not available.]
Figure 4. Counts of patient consent to the broad or specific use of his or her electronic data or biological specimens. Counts of not available and not applicable are not depicted.
[Bar chart "Patients consent for use of data"; series: Registries, PPRNs, CDRNs; groups: Use of Electronic Data (Broad, Specific), Use of Biological Data (Broad, Specific), Both.]

Figure 5. Counts of whether patients can be re-contacted for a new study.
[Bar chart "Patients can be re-contacted" by collaborative type (CDRNs, PPRNs, Registries); categories: Yes, No, Not available.]
Figure 6. Counts of each collaborative type using Health-Care Derived and/or Self-Reported data.
Taken together, meaningful patient engagement throughout the research study process seems to be missing other than engaging the patient as a research subject. This gap may be due to the different perspectives regarding data sharing, e.g., patients with rare diseases want multiple scientists to share data so they would not need to participate in so many studies, while scientists are reluctant to share unanalyzed, raw data or analyze data collected by another group (see “Families Push for New Ways to Research Rare Diseases” from the Wall Street Journal).
Issues and Gaps We found that demographic information is not represented consistently across the networks and registries. In addition, some network/registry information was confidential, not provided, or difficult to interpret through interviews, e.g., budget and what elements would be considered metadata. Because of time constraints, the interviewees were not given the option to review inventories. The information contained in this report represents our interpretation of answers provided by the interviewees and therefore may contain errors. We removed a few registries from our list whose websites did not contain enough information to answer the criteria and whose listed contact did not respond to our requests for an interview, e.g., Alzheimer Disease Patient Registry (http://www.washington.edu/research/centers/146), BioSense (http://www.cdc.gov/biosense/), and National Spina Bifida Patient Registry (http://www.cdc.gov/ncbddd/spinabifida/NSBPRregistry.html).
Contents

Executive Summary
I. Taxonomy
II. Original 22 Criteria with Reworded Criteria
III. Inventories of CDRNs
IV. Inventories of PPRNs
V. Inventories of Patient Registries

I. Taxonomy

1. Network Characteristics
   a. Patient Population
      i. Number of Lives Covered (1.a)
      ii. Demographics
         1. Racial/Ethnic (1.b.i.1)
         2. Geography (1.b.i.2)
         3. Age (1.b.i.3)
         4. Gender (1.b.i.4)
   b. Clinical Focus (1.e.i, 1.e.i.1)
   c. Finances
      i. Total annual budget (1.c.i)
      ii. Total annual cost network infrastructure and maintenance (1.c.i.1) (1.c.iii)
      iii. Total annual cost conducting studies (1.c.i.2)
      iv. Sources of funding (1.c.ii)
   d. Years in existence (1.d)
   e. Clinical data
      i. Electronic Data
         1. Source
            i. Self-reported data (1.g.ii.1)
            ii. Health care-Derived data (1.g.ii.2)
            iii. Data collected in clinical trials (1.g.ii.3)
         2. Type (4.f)
      ii. Biospecimen
         1. Source
            i. Biobank (3.a)
            ii. Collected by the network for research (3.d)
         2. Type (3.b)
   f. Policies
      i. Patient-related policies
         1. Type of Consent (1.f)
            a. No consent required (1.f)
            b. Broad use of electronic data (1.f.i)
            c. Broad use of biosamples (1.f.ii)
         2. Governance involvement mechanisms (1.g.i, 1.g.i.1)
         3. Re-contact for new study needed (1.f.iii.1)
      ii. Data sharing
         1. Requirements for institutional investigators to collaborate with each other (1.g.iii.1.a)
         2. Requirements for sharing outside the network (1.g.iii.1.b)
         3. Policies for protecting proprietary data (1.g.iii.1.c)
   g. Healthcare organizations engagement (2.c.i)
      i. Mechanisms of participation (2.c.ii)
   h. Methods for Data Security (4.a)

2. Evidence of Clinical Studies Capacity
   a. Publications
      i. Evidence of clinical care or quality improvement (1.a.iii.1)
      ii. Studies published in peer reviewed journals (2.a)
      iii. Evidence of longitudinal follow-up studies (2.b.i)
      iv. Evidence of randomized control trials (2.d.i.1)
   b. Study type
      i. New studies in the same or different condition from the clinical focus (1.a.iii)
      ii. Longitudinal follow up (2.b)
         1. From existing reports by passively reviewing the data (2.b.ii)
            a. Using mechanisms to standardize data elements (2.b.ii.1)
      iii. Randomized controlled trials using network data (2.d.i)
      iv. Analysis of biospecimens
         1. Analysis of biospecimens from biobanks (3.c)
         2. Analysis of biospecimens collected by the network (3.d.i)
         3. Results linkable to patient outcomes (3.d.ii)
      v. Clinical care delivery (1.a.iii)
      vi. Quality improvement (1.a.iii)

3. Data processing
   a. Harmonization
      i. Query distribution via central hub (4.b.i)
         1. Architecture (4.b.ii)
      ii. Standardized terminologies adopted (4.c.i, 4.c.ii)
      iii. Common data model used (4.d.i, 4.d.ii)
         1. Data mapping and transformation mechanism (4.d.iii)
      iv. Metadata collected (4.e.i)
         1. Description (4.e.i.1)
   b. Extraction
      i. Natural language processing (4.g.i)
         1. Approaches (4.g.ii)
   c. Aggregation
      i. Before it leaves the local site (4.h.i)
      ii. Transformation method (4.h.ii)
   d. Statistical analysis
      i. Applications (4.i)
   e. Integration for longitudinal analysis (4.j.i)
      i. Tools used (4.j.ii)

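
One way to work with this outline programmatically is as a nested mapping from class names to subclasses, with the criterion numbers at the leaves. The sketch below is an illustration under stated assumptions, not the structure of the OWL file itself: it encodes only a small subset of the outline, the names `TAXONOMY` and `find_paths` are hypothetical, and it simplifies by treating "Natural language processing" as a leaf even though the outline gives it a further subclass.

```python
# Small subset of the taxonomy outline; each leaf lists the criterion
# numbers shown in parentheses in the outline.
TAXONOMY = {
    "Network Characteristics": {
        "Clinical Focus": ["1.e.i", "1.e.i.1"],
        "Years in existence": ["1.d"],
        "Methods for Data Security": ["4.a"],
    },
    "Evidence of Clinical Studies Capacity": {
        "Publications": {
            "Studies published in peer reviewed journals": ["2.a"],
        },
    },
    "Data processing": {
        "Extraction": {
            "Natural language processing": ["4.g.i"],
        },
    },
}


def find_paths(tree, criterion, trail=()):
    """Yield every class path whose leaf lists the given criterion."""
    for name, child in tree.items():
        here = trail + (name,)
        if isinstance(child, dict):
            yield from find_paths(child, criterion, here)
        elif criterion in child:
            yield here


for path in find_paths(TAXONOMY, "4.g.i"):
    print(" > ".join(path))
```

Looking up criterion 4.g.i walks from the top-level class Data processing through Extraction to Natural language processing, mirroring the class-subclass nesting of the outline.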
Criteria Listed in RFP Reworded Criteria1. Number of covered lives 1.a. How many people does the network cover or involve?11. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
20. Reusability (is the network available for new studies in the same or a different condition, or is it restricted to a single study?)
1.a.ii.1. Can the network be used for new studies in the same or a different condition?
21. Ability of the network to perform quality improvement and assist in clinical care delivery
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
21. Ability of the network to perform quality improvement and assist in clinical care delivery 1.a.iii.1. What is the evidence?
2. Demographics: describe the covered population in terms ofracial/ethnic groups 1.b.i.1. Demographics: racial/ethnic
2. Demographics: describe the covered population in terms of geography 1.b.i.2. Demographics: geography
2. Demographics: describe the covered population in terms of age 1.b.i.3. Demographics: age2. Demographics: describe the covered population in terms of gender 1.b.i.4. Demographics: gender17. Total annual budget 1.c.i. What is the total annual budget?
17. proportions dedicated to maintenance and infrastructure 1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance?
17. proportions dedicated to conduct of studies 1.c.i.2. How much of that budget is dedicated to conducting studies?17. current source(s) of funding 1.c.ii. What are the current sources of funding? 16. Annual cost of maintaining and updating network 1.c.iii. How much does it cost each year to maintain and update the network?18. Years in existence 1.d. How many years has this network existed? 3. Specify the clinical characteristics, such as disease, condition, or treatment focus, if any 1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)?
3. Specify the clinical characteristics, such as disease, condition, or treatment focus, if any 1.e.i.1. What does the network focus on?
combination of 4. and 5. 1.f. (Y/N) Does the network use informed consent forms?4. Whether patient consent for broad use of electronic data is present and currently in effect
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
4. Whether patient consent for broad use of biological specimens is present and currently in effect
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
5. Whether patient consent for re-‐contact is present and currently in effect 1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study?
6. Are patients involved in governance of the uses of network data? 1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
6. If so, how? 1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
7. Sources of electronic data: claims; registry data; electronic health record (EHR) data (which EHR vendor?); and the capacity to link with pharmacy and diagnostic databases, especially imaging-‐ and lab-‐based
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
7. Sources of electronic data: claims; registry data; electronic health record (EHR) data (which EHR vendor?); and the capacity to link with pharmacy and diagnostic databases, especially imaging-‐ and lab-‐based
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
7. Sources of electronic data: claims; registry data; electronic health record (EHR) data (which EHR vendor?); and the capacity to link with pharmacy and diagnostic databases, especially imaging-‐ and lab-‐based
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
8. Data sharing policy, including existence of requirements for collaboration with institutional investigators
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
8. Data sharing policy, including existence of requirements for collaboration with institutional investigators 1.g.iii.1.b. Policies for sharing data outside the network
8. Data sharing policy, including policies in place to protect proprietary data 1.g.iii.1.c. Policies for protecting proprietary data
19. Exemplar studies (at least three with publications in peer-‐reviewed literature, if available)
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
9. Evidence of capacity to conduct, and experience in conducting, longitudinal follow-‐up for clinical outcomes
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
9. evidence of the capacity to analyze data from longitudinal follow-‐up 2.b.i. What is the evidence?
10. Are there passive means of determining follow-‐up and ongoing observation?
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
10. If so, describe any standardization of data elements 2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
3
II. Original Criteria with Reworded Criteria
Criteria Listed in RFP Reworded Criteria
12. Extent to which the network benefits from the support of, or active involvement from, a healthcare delivery system
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
12. Extent to which the network benefits from the support of, or active involvement from, a healthcare delivery system 2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
13. Past performance conducting randomized controlled trials (cluster, individual) using the database
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
13. Past performance conducting randomized controlled trials (cluster, individual) using the database 2.d.i.1. What is the evidence?
14. Present availability of biospecimens/biobank 3.a. (Y/N) Does the network have biobanks?14. detail on type of biospecimens (such as DNA, RNA, protein, and other biomarkers) collected 3.b. What types of biospecimens are collected?
14. for what types of analysis 3.c. What types of analysis are done on them?
15. Prior experience in collecting biospecimens for research purposes 3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes?
15. Prior experience in analyzing biospecimens for research purposes 3.d.i. What types of analyses do they conduct on them?
15. capacity to link biospecimens [sic] to patient outcomes 3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
22.a. Does the network manage security 4.a. What type of security technology does the network use? 22.a. Does the network manage query distribution via a central hub? 4.b.i. (Y/N) Are queries distributed via a central hub? 22. a. Please describe in brief. 4.b.ii. What is the architecture of the query distribution?
22. b. Does the network use standardized terminologies (ie, ICD-‐9, SNOMED, etc)?
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)?
22. b. If so, please provide information on which terminologies are used. 4.c.ii. Which terminologies?
22. c. Does the network use a common data model (CDM)? 4.d.i.(Y/N) Does the network use a common data model (CDM)?
22. c. If so, please provide information on which CDM is used
4.d.ii. Which CDM is used?
22. c. If so, please provide information on how the data is transformed and mapped to the model.
4.d.iii. How are the data transformed and mapped?
22. d. Is metadata routinely collected?
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)?
22. d. If so, please list key metadata elements collected.
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
22. e. Please list the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
22. f. Are you conducting natural language processing?
4.g.i. (Y/N) Does the network use natural language processing?
22. f. If so, which application or approach are you using?
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
22. g. Is data aggregated before it leaves the local site and shared with the network?
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
22. g. Please describe in brief how the data is transformed and when it leaves control of the local site.
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
22. h. Does the network provide data analysis tools for researchers? Please describe in brief.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network?
22. i. Are IT or informatics tools used to integrate administrative, billing, and/or clinical records data into patient-level longitudinal data?
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
22. i. If so, which informatics tools?
4.j.ii. What informatics tools are used?
Criteria Answers
1.a. How many people does the network cover or involve? 450,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
EMRs are currently being installed in all clinics; the network was recently awarded an NIH CBPR grant; new clinical health sites are being added to the network; and the network partners with CHARN, N^2, and NACHC to increase capacity for research.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? This network helps doctors make decisions about clinical care by incorporating enabling services data and social determinants of health data, supports culturally efficient and effective care that advances health and reduces disparities, and integrates essential enabling services (e.g., interpretation, eligibility assistance) that facilitate access to care.
1.b.i.1. Demographics: racial/ethnic High concentrations of medically underserved Asian Americans, Native Hawaiians, and other Pacific Islanders (66%)
1.b.i.2. Demographics: geography California, Hawaii, Washington, New York, Massachusetts, Minnesota, Illinois, Florida, and the Republic of the Marshall Islands
1.b.i.3. Demographics: age
0-2: 5.8%
<15: 23.2%
15-64: 67.5%
>65: 9.3%
1.b.i.4. Demographics: gender Male: 32%; Females: 58%
1.c.i. What is the total annual budget? $666,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $250,000-350,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $250,000
1.c.ii. What are the current sources of funding?
Bureau of Primary Health Care, ARC funded project (N^2), Health Resources and Services Administration (HRSA), The California Endowment, Centers for Disease Control and Prevention, Gilead Sciences, National Institutes of Health (NIH), New York University Center for the Study of Asian American Health, Office of Minority Health
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 25
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Medically underserved populations of Asian Americans, Native Hawaiians, and other Pacific Islanders
1.f. (Y/N) Does the network use informed consent forms? No - IRB approval and waivers of authorization are required for research studies
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
There is a Community IRB, which includes members of the patient population. Community stakeholders also collaborated with the network to develop the Criteria for Community Engagement in Research that includes principles of community involvement, alignment with community mission, equity, and community accountability.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR (NextGen, Centricity)
Association of Asian Pacific Community Health Organizations (AAPCHO)
III. Inventories of CDRNs
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Researchers must file IRB application forms, data request form, memoranda of understanding, business associate agreement, and data use agreement
1.g.iii.1.b. Policies for sharing data outside the network
The network does not currently share data outside the network; if data were shared, the same IRB application process as for sharing within the network would be required.
1.g.iii.1.c. Policies for protecting proprietary data Each health center can see only its own patient-level data. All other visible data are aggregated.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Chang Weir, R., Law, H. Enabling Services Health Information Exchange at Hawaii Community Health Centers: Evaluation Report. Association of Asian Pacific Community Health Organizations, February 2012.
2) Chang Weir, R., Law, H., Valle-Perez, M., & Ayson, A. The Pacific Innovation Collaborative Health Information Technology: A report highlighting the development of the PIC data repository and report manager. Association of Asian Pacific Community Health Organizations, October 2011.
3) Chang Weir, R., Law, H., Oneha, M., Lee, S., & Chien, A. (Under Review). Impact of a Pay for Performance Program to Improve Emergency Department Utilization at Community Health Centers serving Asian American, Native Hawaiian, and Other Pacific Islander Communities. Submitted February 2013 to Journal of Health Care for the Poor and Underserved.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Chang Weir, R., Law, H., Oneha, M., Lee, S., & Chien, A. (Under Review). Impact of a Pay for Performance Program to Improve Emergency Department Utilization at Community Health Centers serving Asian American, Native Hawaiian, and Other Pacific Islander Communities. Submitted February 2013 to Journal of Health Care for the Poor and Underserved.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
The network tries to code lists in the same manner that is reported to UDS (Uniform Data System, Health Resources and Services Administration reporting system) whenever possible.
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Clinics give access to patient EHRs and data on other patient enabling services
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
When a health center statistician logs onto the network, they can see the data and ask for customized reports to be sent to them. External collaborators would submit a query to the website and, if approved, would get the data returned in a standard SQL format.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, SNOMED
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Not available
4.d.iii. How are the data transformed and mapped? Not available
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Using a home grown data dictionary that includes social determinants of health. There is also a change log and a limited level of versioning.
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
EHR, lab, pharmacy, ER/urgent care, specialty/referral, and data on non-clinical support services including case management assessment, case management treatment or planning, referrals, interpretation, transportation, eligibility assistance, health education, and outreach services
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
The health plan and health center data are aggregated together on the regional end and then forwarded to the central hub.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
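Item 1.g.iii.1.c above states that each health center sees only its own patient-level data and that all other visible data are aggregated. A minimal Python sketch of that visibility rule follows. The record layout, the site names, and the function name visible_data are hypothetical and are not taken from the network's systems; the aggregate shown (a patient count per site) is likewise only an assumption.

```python
from collections import Counter

# Hypothetical patient-level rows; field and site names are illustrative only.
ROWS = [
    {"site": "center_a", "patient_id": 1, "visits": 3},
    {"site": "center_a", "patient_id": 2, "visits": 1},
    {"site": "center_b", "patient_id": 3, "visits": 5},
    {"site": "center_c", "patient_id": 4, "visits": 2},
    {"site": "center_c", "patient_id": 5, "visits": 4},
]


def visible_data(rows, viewer_site):
    """Return (own patient-level rows, aggregate patient counts for other sites)."""
    # The viewing site keeps full row detail for its own patients.
    own_rows = [row for row in rows if row["site"] == viewer_site]
    # Every other site is reduced to a single count; no row detail is exposed.
    other_counts = Counter(
        row["site"] for row in rows if row["site"] != viewer_site
    )
    return own_rows, dict(other_counts)


own, others = visible_data(ROWS, "center_a")
print(len(own), others)
```

A real deployment would enforce this rule in the database or reporting layer rather than in application code; the sketch only illustrates the split between row-level and aggregate views.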
Criteria Answers
1.a. How many people does the network cover or involve? 2,300,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
A current project evaluates performance characteristics of standard and advanced breast imaging technologies based on breast cancer risk and specific subgroups (e.g., age, race/ethnicity, breast density), as these technologies disseminate into community practices. The BCSC will use existing and new data collected from the 6 current BCSC breast imaging registries. Collaborations using the data will be possible in the future.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
The BCSC has worked collaboratively with the American College of Radiology (ACR) to develop common data forms that collect patient and radiology information. This collaboration has resulted in improvements in the quality of mammography data collected and has improved the quality of data within the BCSC. BCSC sites provide reports to participating facilities that include information on volume of mammograms read, true positives, false positives, and other data. Radiologists use this information for quality improvement and in their Mammography Quality Standards Act (MQSA) compliance activities.
1.b.i.1. Demographics: racial/ethnic
Total Population
White (Non-Hispanic): 70%
Hispanic: 7.3%
Black (Non-Hispanic): 5.6%
Asian/Pacific Islander: 6%
American Indian/Alaskan Native: 0.9%
Mixed/Other/Unknown: 10.2%
1.b.i.2. Demographics: geography
Sites: Carolina Mammography Registry (Chapel Hill, NC), Vermont Breast Cancer Surveillance System (Burlington, VT), Group Health (Seattle, WA), San Francisco Mammography Registry (San Francisco, CA), New Hampshire Mammography Network (Lebanon, NH), New Mexico Mammography Project (Albuquerque, NM), Colorado Mammography Project (Golden, CO)
1.b.i.3. Demographics: age Ages 18 and up, but the vast majority of patients are over age 40
1.b.i.4. Demographics: gender Female: 100%
1.c.i. What is the total annual budget? $1,500,000 dedicated to the statistical coordinating center
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? National Cancer Institute contract
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 15
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Mammography performance, performance of new breast imaging technologies (e.g., breast MRI), effectiveness of breast imaging by patient and provider factors, and biological measures of risk
1.f. (Y/N) Does the network use informed consent forms?
Yes - Varies by registry. Some registry sites get informed consent while other sites get a waiver of informed consent at the time of their mammogram.
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
At some registry sites, patients consent to broad use of their data
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
At some registry sites, patients consent to broad use of their data
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
Breast Cancer Surveillance Consortium (BCSC)
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
A potential investigator presents a concept proposal form specifying the scientific idea. The steering committee reviews the concept and, if the committee approves it, the researcher works with an analyst at the statistical coordinating center to write up a full proposal. The steering committee then reviews the full proposal and approves it if it is feasible.
1.g.iii.1.b. Policies for sharing data outside the network
A potential investigator presents a concept proposal form specifying the scientific idea. The steering committee reviews the concept and, if the committee approves it, the researcher works with an analyst at the statistical coordinating center to write up a full proposal. The steering committee then reviews the full proposal and approves it if it is feasible.
1.g.iii.1.c. Policies for protecting proprietary data None - No data that are considered sensitive are released.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Henderson LM, Hubbard RA, Onega TL, Zhu W, Buist DS, Fishman P, Tosteson AN. Assessing health care use and cost consequences of a new screening modality: the case of digital mammography. Med Care 50(12):1045-52. 2012 Dec
2) Onega T, Smith M, Miglioretti DL, Carney PA, Geller BA, Kerlikowske K, Buist DS, Rosenberg RD, Smith RA, Sickles EA, Haneuse S, Anderson ML, Yankaskas B. Radiologist agreement for mammographic recall by case difficulty and finding type. J Am Coll Radiol 9(11):788-94. 2012 Nov
3) James TA, Mace JL, Virnig BA, Geller BM. Preoperative needle biopsy improves the quality of breast cancer surgery. J Am Coll Surg 215(4):562-8. 2012 Oct
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? The Breast Cancer Surveillance Consortium (BCSC) has the nation’s largest longitudinal collection of mammography data from breast cancer screening in community practice.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
When a new question is added to the patient questionnaire, the statistical coordinating center makes sure that the question is asked the same way at each site so that the data can be coordinated across all sites. When two different sources of data exist for the same construct, the coordinating center creates a new data element that is populated by the new data only, retains the old data elements that are populated by the old data, and then creates a computed variable to harmonize the two.
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
The data that BCSC collects from women and radiologists/facilities are linked to cancer outcomes data from population-based cancer and pathology registries. This linkage occurs at each site. Three sites (Group Health Cooperative, the New Mexico Mammography Project, and the San Francisco Mammography Registry) are linked to registries within NCI’s Surveillance, Epidemiology, and End Results (SEER) Program. The Colorado Mammography Project is linked to its statewide pathology registry. The Carolina Mammography Registry, New Hampshire Mammography Network, and Vermont Breast Cancer Surveillance System collect benign and malignant breast pathology reports from laboratories in their defined regions and additionally link to their respective state cancer registries.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? In one study, radiologists were randomized to receive an intervention to try to improve their interpretive performance in reading mammograms.
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Breast tissue biopsies
3.c. What types of analysis are done on them?
Type: total mastectomy, partial mastectomy, core biopsy, fine needle aspiration
Guidance: clinical palpation, ultrasonography, stereotaxis, needle localized, mammographic
Pathologic Variables
Histologic type: ductal, lobular, other special types; grade, estrogen and progesterone receptor status
Staging: tumor size, number of positive lymph nodes, distant metastasis (American Joint Committee on Cancer TNM stage), extent of disease (SEER)
Histopathology: atypical hyperplasia (ductal and/or lobular), ductal hyperplasia, fibroadenoma, phyllodes tumor, other benign, normal, inconclusive
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? All data are encrypted.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
A query is sent to the coordinating center and an analyst runs the query for the researcher and sends the results back to him or her.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Home grown common data model
4.d.iii. How are the data transformed and mapped?
Local sites collect the data and code according to the data dictionary. The data are encrypted and put up on the SSP. Then, the coordinating center receives the data, decrypts, and processes the raw data files for data quality. Finally, data are pulled together from all the sites to create computed variable versions that are used in analysis.
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown data dictionary
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
• Demographics, risk factors, clinical history
• Mammography examinations: indication, assessment, recommendation, breast density
• Facilities: services, technologies, characteristics
• Tumor registries and pathology labs: breast and ovarian cancer, tumor characteristics, benign breast disease, treatment
• Vital statistics: death date and cause of death
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS codes, STATA scripts, R code
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable
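Item 2.b.ii.1 above describes keeping the old and new data elements side by side and deriving a computed variable that harmonizes the two. A minimal Python sketch of that idea follows. The field names, the code maps, and the rule of preferring the new element whenever it is populated are assumptions made for illustration; they are not drawn from the BCSC data dictionary.

```python
from typing import Optional

# Hypothetical code maps: each questionnaire version is translated into one
# shared set of categories. None of these codes come from the BCSC.
OLD_TO_COMMON = {"Y": "yes", "N": "no"}
NEW_TO_COMMON = {"1": "yes", "2": "no", "9": "unknown"}


def harmonize(old_value: Optional[str], new_value: Optional[str]) -> Optional[str]:
    """Compute one harmonized value from the old and new data elements.

    Assumed rule: use the new element when it is populated, otherwise
    fall back to the old element; return None when both are empty.
    """
    if new_value is not None:
        return NEW_TO_COMMON.get(new_value)
    if old_value is not None:
        return OLD_TO_COMMON.get(old_value)
    return None


# Both source elements are retained; the computed variable is added beside them.
records = [
    {"family_history_old": "Y", "family_history_new": None},
    {"family_history_old": None, "family_history_new": "2"},
    {"family_history_old": None, "family_history_new": None},
]

for record in records:
    record["family_history_computed"] = harmonize(
        record["family_history_old"], record["family_history_new"]
    )
```

Keeping the source elements untouched means the harmonization rule can be revised later without losing what each site originally collected.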
Criteria Answers
1.a. How many people does the network cover or involve? 364,293
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
CCPC collaborates with DARTNet
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
The Children’s Fund of Connecticut and the Child Health and Development Institute of Connecticut (CHDI) have approved a strategic implementation plan to explore the engagement of commercial insurers in CHDI’s work underway with the Connecticut Department of Social Services and the HUSKY insurance program. This plan includes support for targeted strategies in primary care that, when implemented, will improve care and outcomes for children, and among other things, will ensure their readiness for school, efficient utilization of health and other services, and overall improved health status. The targeted strategies for which CHDI is currently seeking support include:
Universal developmental screening at 9, 18, and 24 to 30 months of age.
Reimbursement for care coordination services performed in the primary care setting.
Expanded capacity of pediatric primary care to address behavioral health issues.
1.b.i.1. Demographics: racial/ethnic
Total Population
White (Non-Hispanic): 92.23%
Hispanic: 5.78%
Black: 4.76%
Asian: 2.55%
American Indian/Alaska Native: 0.27%
Pacific Islander/Hawaii Island: 0.19%
1.b.i.2. Demographics: geography Connecticut
1.b.i.3. Demographics: age
Total Population
0-19: 10,3848
20-24: 26,670
25-29: 16,041
30-34: 15,425
35-39: 16,186
40-44: 21,214
45-49: 24,834
50-54: 27,525
55-59: 25,918
60-64: 21,768
65-69: 18,246
1.b.i.4. Demographics: gender Males: 47.6%; Females: 53.4%
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? $400,000
1.c.ii. What are the current sources of funding?
Primary Care Summit brings in $40,000-50,000 - there is no core sustainable infrastructure
U.S. Department of Education, the U.S. Department of Health and Human Services, the Agency for Healthcare Research and Quality, the Connecticut Department of Public Health, the University of Connecticut, the Donaghue Foundation, the Commonwealth Fund, the American Academy of Pediatrics, public contributors
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 11
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Primary Care
1.f. (Y/N) Does the network use informed consent forms? Yes - for studies that involve direct interactions with patients (e.g., a survey)
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad consent, or the IRB gives waivers for specific studies because the data are de-identified
Connecticut Center for Primary Care (CCPC)
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Three community members with non-medical backgrounds are on the Board of Directors.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR (AllScripts Enterprise)
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Institutional agreements and business associate agreements, data are always de-identified
1.g.iii.1.b. Policies for sharing data outside the network
Multiple collaborative studies with other networks that require data use agreements. The database is not open to the public.
1.g.iii.1.c. Policies for protecting proprietary data Data are de-identified by CCPC
2.a. Three most recent (or high impact) studies published in peer-reviewed journals All articles written thus far were for conferences
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? CCPC is currently applying for grants to extend the database to look at patient outcomes over time.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
By giving access to EHRs, by enrolling patients in research studies who are coming to the healthcare organization to be seen by a physician
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
ProHealth maintains a dedicated secure computer center in their corporate office, a certified co-‐location facility for business continuity, a SAN for hourly data backup, and a VMware server environment for application recovery. A full fiber optic WAN connects each site to the central facility. All ProHealth clinical encounters are processed through a central administrative system which includes Microsoft Business Intelligence solutions for analysis and presentation of administrative and clinical data. This informatics capability runs on a dedicated integrated SQL data repository and a SharePoint communication platform.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
The researcher asks CCPC's Principal Investigator (PI) for information. The data from the 8 sites all go into a common clinical repository, and the PI strips and de-identifies the information. The PI writes the SQL code and returns it to the researcher.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, CPT4
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
EHR and all payer claims data, surveys
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Proprietary 3rd party software called FollowMyHealth by Jardogs
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS code
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Linked together by the practice management system
Criteria Answers
1.a. How many people does the network cover or involve? 3,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Covers mostly surgical care and outcomes
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Adoption of Laparoscopy for Elective Colorectal Resection: A Report from the Surgical Care and Outcomes Assessment Program. J Am Coll Surg 2012 Jun;214(6):909-18.
1.b.i.1. Demographics: racial/ethnic
White: 61.6%; Hispanic: 4.6%; Black/African American: 2.9%; American Indian/Alaska Native: 1.0%; Asian: 9.6%; Pacific Islander/Hawaiian: 13.7%; Other: 10.9%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age < 18: 1.7%; 18-30: 15.4%
1.b.i.4. Demographics: gender Female: 54.5%; Male: 45.5%
1.c.i. What is the total annual budget? $2,300,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $500,000-800,000/yr
1.c.i.2. How much of that budget is dedicated to conducting studies? $500,000-800,000/yr
1.c.ii. What are the current sources of funding? AHRQ, Life Sciences Discovery Fund, Nestle Foundation
1.c.iii. How much does it cost each year to maintain and update the network? $500,000-800,000/yr
1.d. How many years has this network existed? 2
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Improving patient surgical outcomes
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific - Consent is obtained when identified patient-level data are being used for specific studies, not when de-identified data are being used for quality improvement studies
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
The Patient Advisory Groups bring together patients, advocates, or advocacy organizations to provide a valuable patient perspective to researchers and clinicians in multiple CERTAIN research studies. Patient Advisory Groups, or individual patient advisors within the groups, routinely provide feedback on research questions; research materials; maximizing patient participation and the benefit to individual patients of research participation; interpretation of study findings; and development of publicly released information, documents, or tools to share with other patients broadly.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
CERTAIN
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR (CERNER, EPIC, MEDITECH), data extracted from skilled nurse systems and/or doctor's offices
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Other groups may request either SCOAP (quality improvement) or CERTAIN (research data) through existing data use policies and application procedures. These will soon be posted to the CERTAIN website; in the interim, initial inquiries may be submitted to the CERTAIN Program Director.
1.g.iii.1.b. Policies for sharing data outside the network
Other groups may request either SCOAP (quality improvement) or CERTAIN (research data) through existing data use policies and application procedures. These will soon be posted to the CERTAIN website; in the interim, initial inquiries may be submitted to the CERTAIN Program Director.
1.g.iii.1.c. Policies for protecting proprietary data
CERTAIN employs rigorous processes for ensuring the protection of all patient data collected for research purposes. A unique study code is assigned to each study participant and is used on all study-related data collection documents and analyses. A master list of codes and identifiers is maintained in a secured, password-protected spreadsheet on the research computers. Only select research personnel directly involved in conducting study procedures have access to the master list. These persons have signed a Confidentiality Agreement. The link between the subject identifiers and unique study code will be maintained for the duration of the study and destroyed once all data points have been analyzed.
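The study-code approach described above can be illustrated with a minimal Python sketch. It is not CERTAIN's implementation: the function names, the `patient_id` field, and the code format are hypothetical, and the master list here is only an in-memory dictionary standing in for the password-protected spreadsheet mentioned in the text.

```python
import secrets


def assign_study_codes(identifiers, prefix="S"):
    """Build a master list mapping each identifier to a unique random study code.

    Hypothetical helper: the code format (prefix + 8 hex characters) is an assumption.
    """
    master = {}
    used = set()
    for ident in identifiers:
        if ident in master:
            continue  # a participant keeps one code for the whole study
        code = prefix + secrets.token_hex(4).upper()
        while code in used:  # regenerate on the (unlikely) collision
            code = prefix + secrets.token_hex(4).upper()
        used.add(code)
        master[ident] = code
    return master


def recode(records, master, id_field="patient_id"):
    """Return copies of the records with the identifier replaced by the study code."""
    coded = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k != id_field}
        row["study_code"] = master[rec[id_field]]
        coded.append(row)
    return coded


# Illustrative data only; not taken from the report.
records = [
    {"patient_id": "MRN-1", "procedure": "appendectomy"},
    {"patient_id": "MRN-2", "procedure": "colectomy"},
    {"patient_id": "MRN-1", "procedure": "follow-up"},
]
master_list = assign_study_codes(r["patient_id"] for r in records)
analysis_rows = recode(records, master_list)
```

In this sketch only `analysis_rows` would be shared with analysts; destroying `master_list` at the end of the study breaks the link between codes and identifiers, as the policy describes.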
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Progress in the Diagnosis of Appendicitis: A Report from Washington State’s Surgical Care and Outcomes Assessment Program. Ann Surg 2012 Oct;256(4):586-94.
2) Adoption of Laparoscopy for Elective Colorectal Resection: A Report from the Surgical Care and Outcomes Assessment Program. J Am Coll Surg 2012 Jun;214(6):909-18.
3) β-blocker continuation after noncardiac surgery: a report from the surgical care and outcomes assessment program. Arch Surg. 2012 May;147(5):467-73.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence?
Spine SCOAP: For the Spine SCOAP module, the Patient Voices Project is capturing PROs through the use of the Oswestry Disability Index and Neck Disability Index – two validated instruments to assess functional outcomes as reported by patients. Presently, questionnaires are administered to patients in the 30 days following their surgical procedure, and then bi-annually through 5 years post procedure date.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
They have standardized quarterly reports that researchers can review.
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Sharing data from EHRs
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? CERTAIN has both the allocation and concealment methods to adequately perform such randomization, and a broad enough population of hospitals/clinics, providers, and patients to be able to identify them and match them accordingly.
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? No
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use?
Only select research personnel directly involved in conducting study procedures have access to the master list. These persons have signed a Confidentiality Agreement. The link between the subject identifiers and unique study code will be maintained for the duration of the study and destroyed once all data points have been analyzed. Data gathered for research purposes are entered and analyzed on password-protected computers belonging to the research center. Only research personnel have access to these computers. Domain passwords must be at least 8 characters in length, conform to complexity rules, and be changed at least every 120 days. All laptops are encrypted using PGP Whole Disk Encryption (PGP Corp., Menlo Park CA, 94025). All computing systems are configured with active anti-virus software, host-based firewalls, and automatic installation of operating system critical patches and updates. Anti-virus software is configured to update daily. The host-based firewalls restrict in-bound connections to only the subnets where department workforce reside or that are needed for firewall administration. The firewall rule set on the dedicated server is further restricted to the network subnets used by research personnel. On the file server, all project data will be located in a folder structure with access rights controlled by domain security groups whose membership is restricted to selected workforce.
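As an illustration of the password rules stated above, the following Python sketch checks a password against the two figures given in the text (minimum length of 8 and a maximum age of 120 days). The report does not define its "complexity rules", so the rule used here (at least three of four character classes) is an assumption, and the function names are hypothetical.

```python
import string
from datetime import date, timedelta

MIN_LENGTH = 8        # stated in the text
MAX_AGE_DAYS = 120    # stated in the text


def meets_complexity(password):
    """Assumed rule: at least 3 of 4 character classes are present."""
    classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return sum(classes) >= 3


def password_ok(password, last_changed, today):
    """True if the password satisfies length, complexity, and age limits."""
    recent_enough = (today - last_changed) <= timedelta(days=MAX_AGE_DAYS)
    return len(password) >= MIN_LENGTH and meets_complexity(password) and recent_enough
```

A real deployment would normally enforce such rules through the domain's own policy settings rather than application code; the sketch only makes the stated thresholds concrete.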
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, CPT, LOINC, UMLS, HL7
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Home grown CDM
4.d.iii. How are the data transformed and mapped? All data come in looking the same from each of the sites based on the normalized ad hoc extraction
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
They use their own home grown method by normalizing the data ad hoc, not post-hoc, i.e., they defined standards at the beginning to keep data consistent across all sites.
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographic, pre-hospital conditions, medications, lab work, discrete operative decision making, post-operative outcomes up to 12 months, surveys up to 3 years
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
They have their own team working on home grown NLP algorithms.
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated and sent to the centralized data warehouse.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Have CER tools available
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Use a home grown systematic matching algorithm
Criteria Answers
1.a. How many people does the network cover or involve? 800,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Not available
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Fiks AG, Grundmeier RW, Margolis B, Bell LM, Steffes J, Massey J, Wasserman RC. Comparative effectiveness research using the electronic medical record: an emerging area of investigation in pediatric primary care. J Pediatr 2012; 160:719-724.
1.b.i.1. Demographics: racial/ethnic Confidential
1.b.i.2. Demographics: geography Confidential
1.b.i.3. Demographics: age Confidential
1.b.i.4. Demographics: gender Confidential
1.c.i. What is the total annual budget? $1,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Too complex to break down
1.c.i.2. How much of that budget is dedicated to conducting studies? Part of the $1 million
1.c.ii. What are the current sources of funding?
Health Resources and Services Administration Maternal and Child Health Bureau and the Eunice Kennedy Shriver National Institute of Child Health & Human Development
1.c.iii. How much does it cost each year to maintain and update the network? Too complex to break down
1.d. How many years has this network existed? 6 months
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Mainly on children with chronic conditions as well as less common but prevalent conditions
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable - The network as a whole does not use consent forms because the data they collect are limited data sets and therefore do not require consent forms. If more specific patient level data are needed for a study, then a consent form will be developed and utilized.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR and Claims data
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
CER2
Note: This is a fairly new CDRN that is comprised of 5 already established Health Organizations
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Use Agreements needed for investigators to gain access to the data collected by the network
1.g.iii.1.b. Policies for sharing data outside the network Currently, there are no policies for sharing outside the network.
1.g.iii.1.c. Policies for protecting proprietary data All data captured are de-identified to HIPAA limited status
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Participating health organizations provide access to EHR data and also participate in research studies
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Received funding to begin a randomized control trial in 3 years
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? The data are aggregated at a central site and then the investigator queries that site
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? NDC, LOINC, SNOMED-CT
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Using a proprietary vendor provider to do the standardization for them
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographic, Conditions, Medications
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Are working towards using NLP applications and approaches
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated at the data center
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 519,636
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
CHARN has just partnered with the National Dental Practice-Based Research Network, which will allow CHARN to study dental practices and policies related to their patient population. "CHARN currently has both patient-level and visit-level data from our patients from 2008-2010 and will be expanding that range to 2006-2012. CHARN is currently creating registries to assist in the research process."
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
The ePro Project is facilitating engagement between providers and patients by determining the patients’ preferences, risk behaviors, and symptoms and making those preferences available to the provider during the encounter. Patients enter information into a touch-screen tablet while waiting for their provider appointment. CHARN has previously demonstrated that patients are more willing to report inadequate medication adherence, substance use, sexual risk behavior, and other potentially socially non-desirable behaviors on the tablet than to providers even in situations where the patient knows the provider will receive the results. Collecting information on the tablets facilitates more comprehensive capture of patient-reported data enabling better patient-provider communication and clinical care.
1.b.i.1. Demographics: racial/ethnic
White: 314,487 (60.5%); Black/African American: 94,849 (18.3%); American Indian/Alaska Native: 4,500 (0.9%); Asian/NHOPI: 91,092 (17.5%); Multi-racial: 4,964 (1.0%)
Hispanic: Hispanic or Latino: 242,960 (46.8%); Not Hispanic or Latino: 94,221 (18.1%); Missing (reported unknown): 92,566 (17.8%); Missing (left blank): 89,889 (17.3%)
Other: 26,848 (5.2%); No race indicated (missing): 57,328 (11.0%)
1.b.i.2. Demographics: geography
Association of Asian Pacific Community Health Node: New York, Hawaii, California
Alliance of Chicago Community Health Services Node: Illinois, North Georgia, Arizona, California
Fenway Health Node: Maryland, South Carolina, Massachusetts
Oregon Community Health Information Center, Inc. Node: Oregon
1.b.i.3. Demographics: age
Less than 18: 155,531 (29.9%); 18-25: 72,827 (14.0%); 26-39: 113,334 (21.8%); 40-64: 144,935 (27.9%); 65-79: 26,867 (5.2%); 80 and older: 6,141 (1.2%)
1.b.i.4. Demographics: gender
Male: 217,169 (41.8%); Female: 302,311 (58.2%); Transgendered: 125 (0.0%); Unknown or missing: 31 (0.0%)
1.c.i. What is the total annual budget? $10,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Almost all of the annual budget is directed towards infrastructure building
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? HRSA
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 3
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Primary care in safety net populations, especially focusing on cardiovascular disease, diabetes, dyslipidemia, hypertension, hepatitis A and B, and AIDS and AIDS-related conditions
1.f. (Y/N) Does the network use informed consent forms? Yes
Community Health Applied Research Network (CHARN)
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Not applicable - no studies have involved direct patient contact
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
APCHO, OCHIN and ALLIANCE have data sharing agreements within their network, i.e., they agree to share limited data sets without needing to go through specific consent. Any Community Health Center (CHC) or node can choose to participate in any project, and express consent is given for specific projects. If a CHC is participating in a project, a representative of the CHC is involved in that project.
1.g.iii.1.b. Policies for sharing data outside the network Has not been addressed yet
1.g.iii.1.c. Policies for protecting proprietary data
Data ownership resides with the Community Health Center (CHC) -‐ the coordinating center does not do anything to any data without express consent of the CHCs. CHCs own their data.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
All of the CHCs are responsible for Uniform Data System (UDS) reporting, so CHARN's code lists follow the format reported to UDS whenever possible. UDS is a HRSA reporting system to which all Health Centers must contribute data. CHARN captures race and ethnicity data using criteria from the U.S. Census 2010. CHARN is using the newly mandated HIV variables, such as sexual orientation, that will be added to health center EHRs in the future.
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By giving access to EHRs
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
The data that come from the Community Health Centers (CHCs) are de-identified. The data are then put into a predefined SQL database and uploaded to a 128-bit encrypted website, where they are posted. Two employees at the node level have access to those data, and a few network-level employees have access to individual files. They can retrieve files from the website by secure file transfer and place them on their network in a shared file service for a particular project only.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Version one of the data hub is a series of disease cohorts. These data are structured into an SQL database. The Community Health Centers (CHCs) populate the SQL structure locally and upload the local structure to the coordinating center, where the data are combined into a centralized resource; queries can be made locally or centrally.
Version two is not restricted to particular cohorts. Data on medications, procedures, specified labs, patient characteristics, and encounter characteristics are captured in a 5-year period.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, SNOMED
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? A home grown SQL structure is pushed out to the nodes by the central hub and the SQL comes back to the central hub with those common data elements.
4.d.iii. How are the data transformed and mapped? SQL fields are populated by the nodes and then sent to the central hub
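The hub-and-node pattern described in 4.b.ii and 4.d can be sketched in Python with the standard `sqlite3` module. This is an illustration under stated assumptions, not CHARN's actual structure: the `encounter` table, its columns, the node names, and the sample rows are all hypothetical. It shows nodes filling the same SQL structure locally, the coordinating center combining the uploads, and one query running either locally or centrally.

```python
import sqlite3

# Hypothetical common structure pushed out to every node.
SCHEMA = """
CREATE TABLE encounter (
    node TEXT NOT NULL,
    patient_key TEXT NOT NULL,      -- de-identified key, unique within a node
    encounter_year INTEGER NOT NULL,
    icd9_code TEXT NOT NULL
)"""

# Counts distinct patients with an ICD-9 250.xx (diabetes) code.
DIABETES_QUERY = """
SELECT COUNT(DISTINCT node || ':' || patient_key)
FROM encounter WHERE icd9_code LIKE '250%'"""


def build_node_db(node, rows):
    """A node populates the common structure locally."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    db.executemany("INSERT INTO encounter VALUES (?, ?, ?, ?)",
                   [(node, *row) for row in rows])
    db.commit()
    return db


def combine(node_dbs):
    """The coordinating center merges the uploaded node structures."""
    hub = sqlite3.connect(":memory:")
    hub.execute(SCHEMA)
    for db in node_dbs:
        rows = db.execute(
            "SELECT node, patient_key, encounter_year, icd9_code FROM encounter"
        ).fetchall()
        hub.executemany("INSERT INTO encounter VALUES (?, ?, ?, ?)", rows)
    hub.commit()
    return hub


def count_diabetes(db):
    return db.execute(DIABETES_QUERY).fetchone()[0]


# Illustrative rows only; not data from the report.
node_a = build_node_db("node_a", [("a1", 2009, "250.00"), ("a2", 2010, "401.9")])
node_b = build_node_db("node_b", [("b1", 2010, "250.02"), ("b1", 2011, "250.02")])
hub = combine([node_a, node_b])
```

Because every node uses the same table definition, `count_diabetes` works unchanged on `node_a`, `node_b`, or `hub`, which is the practical benefit of a shared structure.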
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown, using a data dictionary and a data submissions document
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
All medical encounters (visits, emails, and phone calls), medications ordered, lab results, and diagnoses if they had one of the seven CHARN related conditions of interest. These include diabetes, cardiovascular disease, HIV, Hepatitis B and C, hypertension, and dyslipidemia
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Based on the needs of the researcher
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 204,827
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Have 41 active studies involving asthma, obesity, ADHD, depression and Autism
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Fiks, AG, Alessandrini, EA, Luberti, AA, Ostapenko, S., Zhang, X., and Silber, JH. Identifying factors predicting immunization delay for children followed in an urban primary care network using an electronic health record. Pediatrics. 2006 Dec;118(6):e1680-6
1.b.i.1. Demographics: racial/ethnic
White: 55.75%; Black/African American: 27.91%; American Indian/Alaskan Native: 0.09%; Asian: 2.66%; Native Hawaiian/Other Pacific Islander: 0.02%; Two or More: 0.14%; Missing/Unknown: 13.41%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age
< 1: 7.07%; 1-6: 38.76%; 7-12: 31.16%; 13 or more: 23.21%
1.b.i.4. Demographics: gender Male: 50.5%; Female: 49.5%
1.c.i. What is the total annual budget? $25,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $241,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $56,000
1.c.ii. What are the current sources of funding? National Institutes of Health, Foundation, State grants, and/or Institutional grants
1.c.iii. How much does it cost each year to maintain and update the network? $241,000
1.d. How many years has this network existed? 11
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Clinical Decision Support to study a variety of childhood chronic conditions
1.f. (Y/N) Does the network use informed consent forms? Yes - The CHOP Institutional Review Board (IRB) manages all issues of informed consent and ethics.
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not available
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
Children's Hospital of Philadelphia Research Consortium (CHOP-PeRC)
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
PeRC is governed under one single institutional structure, which means a single IRB and the ability to easily study network-wide interventions.
1.g.iii.1.b. Policies for sharing data outside the network CHOP enters into Data Use Agreements (DUA) with organizations that wish to collaborate and share data.
1.g.iii.1.c. Policies for protecting proprietary data Information is de-identified
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Fiks AG, Localio AR, Alessandrini EA, Asch DA, Guevara JP, “Shared Decision Making in Pediatrics: A National Perspective,” Pediatrics, 2010, Vol. 126: 306-314.
2) Fiks AG, Mayne S, Localio AR, Alessandrini EA, Guevara JP, “Shared Decision Making, Health Care Expenditures and Utilization Among Children with Special Health Care Needs,” Pediatrics, 2012, Vol. 129: 99-107.
3) Fiks AG, Mayne S, Localio R, Feudtner C, Alessandrini EA, Guevara JP, “Shared decision making and behavioral impairment: a national study among children with special health care needs,” BMC Pediatrics, 2012, Vol. 12: 153.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Power, TJ, Mautone, JA, Manz, PH, Frye, L, Blum, NJ. Managing attention-deficit/hyperactivity disorder in primary care: A systematic analysis of roles and challenges. Pediatrics. 2008 Jan;121;e65-e72
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By providing EHR access and participating in research studies
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence?
1) Fiks AG, Hunter, KF, Localio, AR, Grundmeier, RW, Bryant-Stephens, T, Luberti, AA, Bell, LM, Alessandrini, EA “Impact of Electronic Health Record-Based Primary Care Clinical Alerts on Influenza Vaccination for Children and Adolescents with Asthma: A Cluster Randomized Trial,” Pediatrics, 2009, Vol. 124: 159-169.
2) Bell LM, Grundmeier R, Localio R, Zorc J, Fiks A, Zhang X, Guevara J, Bryant-Stephens T, Swietlik M. Electronic Health Record Based Decision Support to Improve Asthma Care: A Cluster Randomized Trial. Pediatrics 2010;125:e770-e777.
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood
3.c. What types of analysis are done on them? whole genome sequencing
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? whole genome sequencing
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use?
Multi-layer approach: edge protection against attack, internal segregation including access and control, and ongoing monitoring
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
All data available in the EHR, ranging from demographics, medications, conditions, vitals, procedures, etc.
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Uses a home-grown approach
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS codes, SPSS scripts, R
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 10,966,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Covers cancer studies that also involve other risk factors. Has more than 106 active studies
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Potter MB, Somkin CP, Ackerson LM, Gomez V, Dao T, Horberg MA, Walsh JME. "The FLU-FIT program: an effective colorectal cancer screening program for high volume flu shot clinics." Am J Manag Care 17(8):577-83, 2011
1.b.i.1. Demographics: racial/ethnic
White: 87%; African American: 2%; Asian American: 3%; American Indian: < 1%; Hispanic: 8%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age
<= 24: 29%; 25-44: 24%; 45-64: 27%; 65-74: 9%; >= 75: 11%
1.b.i.4. Demographics: gender Male: 49%; Female: 51%
1.c.i. What is the total annual budget? $3,300,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $660,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $1,980,000
1.c.ii. What are the current sources of funding? National Cancer Institute
1.c.iii. How much does it cost each year to maintain and update the network? $660,000
1.d. How many years has this network existed? 13
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Mainly cancer research and treatment but also conducts research on other health factors
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific - If a researcher is conducting a study on a new intervention, then an additional patient consent form is needed for that specific study.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
Cancer Research Network (CRN)
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Researchers within the network propose a research study and describe the data elements they would like to collect. Each site then determines which of its local data match those criteria and collects those data.
1.g.iii.1.b. Policies for sharing data outside the network Public-use data are not shared, but data are shared through scientific institutions
1.g.iii.1.c. Policies for protecting proprietary data Make sure no patient identifiers are transmitted outside the network - data are encrypted and de-identified
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) The prevalence of obesity and obesity-related health conditions in a large, multiethnic cohort of young adults in California. Koebnick C, Smith N, Huang K, Martinez MP, Clancy HA, Kushi LH. Ann Epidemiol. 2012 Sep;22(9):609-16
2) Identifying primary and recurrent cancers using a SAS-based natural language processing algorithm. Strauss JA, Chao CR, Kwan ML, Ahmed SA, Schottinger JE, Quinn VP. J Am Med Inform Assoc. 2012 Aug 2
3) Factors associated with inadequate colorectal cancer screening with flexible sigmoidoscopy. Laiyemo AO, Doubeni C, Pinsky PF, Doria-Rose VP, Sanderson AK 2nd, Bresalier R, Weissfeld J, Schoen RE, Marcus PM, Prorok PC, Berg CD. Cancer Epidemiol. 2012 Aug;36(4):395-9
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? The prevalence of obesity and obesity-related health conditions in a large, multiethnic cohort of young adults in California. Koebnick C, Smith N, Huang K, Martinez MP, Clancy HA, Kushi LH. Ann Epidemiol. 2012 Sep;22(9):609-16
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHR data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Analyze specimens for genetic markers, biopsies, tissue blocks, microdissections
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes - looked at recurrence
4.a. What type of security technology does the network use?
The caBIG® Data Sharing and Security Framework (DSSF) is used as a decision support tool to facilitate data sharing by determining which data can be shared and under which type of access and data security controls. This requires assessing the sensitivity of the data using the Framework's four elements: Economic/Proprietary/IP Value; Privacy/Confidentiality/Security Considerations; IRB or Institutional Restrictions; and Sponsor Restrictions
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, CPT, RxNorm
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? HMORN Virtual Data Warehouse
4.d.iii. How are the data transformed and mapped? A research team identifies the data elements, which are then sent to a central location
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
caGrid standard service metadata. Data services expose standard data service metadata (DomainModel), which details not only the UML classes exposed by the service but also their relationships, such as associations and inheritance. This information describes the logical model over which data service queries are executed.
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographics and vital signs, enrollment into the health care plan, utilization, including inpatient and outpatient visits, emergency department visits, long-term care admissions and home health visits, and communications with health professionals via phone, diagnoses, procedures, and lab tests/results, incident cancer, pharmacy data, provider information, census data, birth and death data, outside claims, patient scheduling, deaths, cost
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Clinical Text Analysis and Knowledge Extraction System (cTAKES)
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not available
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 5,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
DARTNet is conducting three simultaneous projects focusing on asthma
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
Clinicians can join eLearning Alliance clinical practice communities and Methods and Research communities to learn about new approaches to care and EHRs. The network also presents clinical data with the goal of informing best practices for care. The network's projects aim to disseminate tested clinical decision support algorithms and encourage workflow sharing amongst groups and non-members for quality improvement.
1.b.i.1. Demographics: racial/ethnic Most offices in the network are private practice, so they do not collect information on race/ethnicity.
1.b.i.2. Demographics: geography California, Colorado, Kansas, Missouri, Arkansas, Illinois, Indiana, Tennessee, Florida, North Carolina, Virginia, Ohio, Pennsylvania, New Jersey, New York, Connecticut, New Hampshire, Vermont, Minnesota, Wyoming, Alaska, Montana, and Idaho
1.b.i.3. Demographics: age 65 and older: 25%; Under 18: 15%; Average age: 45
1.b.i.4. Demographics: gender Male: 42%; Female: 58%
1.c.i. What is the total annual budget? $2,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Unknown, because multiple networks were joined together as DARTNet in December 2011
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? CTSA, internal funding from clinical organizations, NIH, revenue from sharing data outside the network
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 6
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? No
1.e.i.1. What does the network focus on? Not applicable
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Patients are involved in advisory board activities of member networks
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
Distributed Ambulatory Research in Therapeutics Network (DARTNet)
Umbrella network for: AAFP Electronic National Quality Improvement and Research Network (eNQUIRENet), the Collaborative National Network Examining Comparative Effectiveness Trials (CoNNECT), the South Texas Ambulatory Research Network (STARNet), ProHealth, the Scalable Architecture for Federated Therapeutic Inquiries Network (SAFTINet), the Upstate New York Practice Based Research Network (UNYNet), the Washington, Alaska, Montana, and Idaho Area PBRN (WAMI), the Free Clinic Research & Educational Engagement Network (FREENet), and the Minnesota Academy of Family Physicians Research Network (MAFPRN)
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Researchers who are members of the subnetworks that have donated data are not charged for queries, but researchers are charged for analysis
1.g.iii.1.b. Policies for sharing data outside the network Services and data may be purchased through the DARTNet Website
1.g.iii.1.c. Policies for protecting proprietary data Not available
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Rates of 5 Common Antidepressant Side Effects Among New Adult and Adolescent Cases of Depression: A Retrospective US Claims Study. Anderson HS, Pace WD, Libby AM, West DR, Valuck RJ. Clinical Therapeutics. 2012; 34(1): 113-123.
2) Enhancing Electronic Health Record Measurement of Depression Severity and Suicide Ideation: A Distributed Ambulatory Research in Therapeutics Network (DARTNet) Study. Valuck RJ, Anderson HO, Libby AM, Brandt E, Bryan C, Allen RR, Staton EW, West DR, Pace WD. J Am Board Fam Med. 2012 Sep;25(5):582-93.
3) An assessment of the Hawthorne Effect in practice-based research. Fernald DH, Coombs L, DeAlleaume L, West D, Parnes B. J Am Board Fam Med. 2012 Jan-Feb;25(1):83-6.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Data are pulled every 3 months to follow patients with chronic kidney disease
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHR data and claims data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? This network has used advanced methods for cluster randomized trials where numerous outcome variables and practice level variables are included.
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use?
Data are queried via a secure web portal; permission from each practice is required each time data are made available to DARTNet; databases reside at individual practices, which are responsible for their own firewalls; the limited data set that sits on the grid node operates within the triad system run by Ohio State; three-level security logins are required
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
There are three different methods for query distribution:
(1) All the data are locally mapped and crosswalked into the Observational Medical Outcomes Partnership (OMOP) standards and sent back to the central hub.
(2) Data are standardized and pulled by a third party and sent back to the central hub.
(3) A clinic standardizes its own data: ROSITA converts and standardizes the data to OMOP, the data are placed in a local OMOP data structure behind the clinic's own firewall, and then the results are sent back to the central hub.
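The hub-and-site pattern in the third method can be pictured with a small sketch. It is illustrative only: `Site`, `run_local_count`, and `hub_query` are hypothetical names rather than DARTNet or ROSITA interfaces, the in-memory lists stand in for each clinic's local data structure, and the condition labels are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for one clinic's local store behind its own firewall."""
    name: str
    # Each record is a dict of already-standardized fields (assumed layout).
    records: list = field(default_factory=list)

    def run_local_count(self, condition_code: str) -> int:
        """Run the query locally and return only an aggregate count."""
        return sum(1 for r in self.records if r["condition_code"] == condition_code)


def hub_query(sites, condition_code):
    """Central hub: send the same query to every site and collect per-site counts."""
    per_site = {s.name: s.run_local_count(condition_code) for s in sites}
    return per_site, sum(per_site.values())


# Two made-up sites with placeholder condition labels.
sites = [
    Site("clinic_a", [{"condition_code": "ASTHMA"}, {"condition_code": "CKD"}]),
    Site("clinic_b", [{"condition_code": "ASTHMA"}]),
]
per_site, total = hub_query(sites, "ASTHMA")
```

In this sketch only counts cross the site boundary; patient-level rows stay in each `Site` object.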
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? SNOMED, ICD-9, LOINC, CPT, First Data Bank
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Observational Medical Outcomes Partnership (OMOP)
4.d.iii. How are the data transformed and mapped?
ROSITA mapping system takes the file in and performs record linkage if data from the same set of patients are being loaded from multiple sources (e.g., EHR and claims). It recodes the source values into standardized concept IDs (using OMOP V4 Vocabulary and local mapping), strips direct patient identifiers, and outputs a limited data set to the grid node housed at each site where the data are available for query.
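The three steps named above (record linkage, recoding to concept IDs, and stripping direct identifiers) can be sketched as follows. This is not ROSITA's implementation: the linkage key, the field names, the `CONCEPT_MAP` table, and its concept IDs are all assumptions made for illustration, and mapping unknown codes to 0 is likewise a choice of this sketch.

```python
# Hypothetical local mapping from source codes to standardized concept IDs.
# The IDs are placeholders, not real OMOP concept IDs.
CONCEPT_MAP = {("ICD9", "493.90"): 1001, ("ICD9", "585.9"): 1002}


def link_key(record):
    """Naive deterministic linkage key (assumption): date of birth, sex, and ZIP."""
    return (record["dob"], record["sex"], record["zip"])


def transform(records):
    """Link records across sources, recode source values, and strip identifiers."""
    person_ids = {}
    output = []
    for rec in records:
        # 1. Record linkage: records sharing a key get the same surrogate person_id.
        pid = person_ids.setdefault(link_key(rec), len(person_ids) + 1)
        # 2. Recode the source value; unmapped codes become concept 0 in this sketch.
        concept = CONCEPT_MAP.get((rec["vocab"], rec["code"]), 0)
        # 3. Strip direct identifiers by emitting only the fields listed here.
        output.append({"person_id": pid, "concept_id": concept, "source": rec["source"]})
    return output


# Made-up input rows from two sources for the same patient, plus one other patient.
records = [
    {"source": "ehr", "name": "A", "dob": "2001-02-03", "sex": "F",
     "zip": "80202", "vocab": "ICD9", "code": "493.90"},
    {"source": "claims", "name": "A.", "dob": "2001-02-03", "sex": "F",
     "zip": "80202", "vocab": "ICD9", "code": "585.9"},
    {"source": "ehr", "name": "B", "dob": "1990-07-01", "sex": "M",
     "zip": "80014", "vocab": "ICD9", "code": "000.0"},
]
limited = transform(records)
```

Real record linkage is usually probabilistic and more tolerant of typos than the exact-match key used here.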
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
EHR data, insurance claims data, patient-reported outcomes data, clinician-reported outcomes data
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not available
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SPSS scripts, SAS code
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 310,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Covers genetic studies and phenotype studies on certain conditions
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography See Table 1
1.b.i.3. Demographics: age See Table 1
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? $6,775,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? $500,000 per research group over a 3-year period
1.c.ii. What are the current sources of funding?
RFA-HG-11-022: [grants.nih.gov] The Electronic Medical Records and Genomics (eMERGE) Network, Phase II - Pediatric Study Investigators (U01)
RFA-HG-10-010: [grants.nih.gov] The Electronic Medical Records and Genomics (eMERGE) Network, Phase II - Coordinating Center (U01)
RFA-HG-10-009: [grants.nih.gov] The Electronic Medical Records and Genomics (eMERGE) Network, Phase II - Study Investigators (U01)
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 5
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? More than a dozen phenotypes that are currently being investigated including Multiple Sclerosis, Crohn's Disease, Atrial Fibrillation
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not available
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad - If a patient agrees to take part in the biobank, some of their genetic and health information might be placed into one or more scientific databases.
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
If a researcher would like to use a patient's biospecimen for a study, the researcher must contact the patient, who may opt in or out of that study
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR, Biobanks and genetic data
Electronic Medical Records and Genomics (eMERGE) Network
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
The eMERGE Network is open to all academic, government, and private sector scientists who are interested in participating in an open process to facilitate genomic research in biorepositories with electronic medical records and application of genomic results to clinical care, and who agree.
1.g.iii.1.b. Policies for sharing data outside the network
To maximize these collaborations between sites, participating institutions had to develop Data Use Agreements in order to share de-identified research data, including the HIPAA-defined limited data sets, with other sites within the Consortium.
1.g.iii.1.c. Policies for protecting proprietary data Data Use Agreement and publication policy, and all data are de-identified once data leave the site
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Validation and discovery of genotype-phenotype associations in chronic diseases using linked data. Pathak J, Kiefer R, Freimuth R, Chute C. Stud Health Technol Inform. 2012;180:549-53.
2) Gene-centric meta-analyses of 108 912 individuals confirm known body mass index loci and reveal three novel signals. Guo Y, Lanktree MB, Taylor KC, Hakonarson H, Lange LA, Keating BJ; IBC 50K SNP array BMI Consortium. Hum Mol Genet. 2013 Jan 1;22(1):184-201.
3) Large-‐scale gene-‐centric meta-‐analysis across 32 studies identifies multiple lipid loci.Asselbergs FW, Guo Y, van Iperen EP, Sivapalaratnam S, Tragante V, Lanktree MB, Lange LA, Almoguera B, Appelman YE, Barnard J, Baumert J, Beitelshees AL, Bhangale TR, Chen YD, Gaunt TR, Gong Y, Hopewell JC, Johnson T, Kleber ME, Langaee TY, Li M, Li YR, Liu K, McDonough CW, Meijs MF, Middelberg RP, Musunuru K, Nelson CP, O'Connell JR, Padmanabhan S, Pankow JS, Pankratz N, Rafelt S, Rajagopalan R, Romaine SP, Schork NJ, Shaffer J, Shen H, Smith EN, Tischfield SE, van der Most PJ, van Vliet-‐Ostaptchouk JV, Verweij N, Volcik KA, Zhang L, Bailey KR, Bailey KM, Bauer F, Boer JM, Braund PS, Burt A, Burton PR, Buxbaum SG, Chen W, Cooper-‐Dehoff RM, Cupples LA, deJong JS, Delles C, Duggan D, Fornage M, Furlong CE, Glazer N, Gums JG, Hastie C, Holmes MV, Illig T, Kirkland SA, Kivimaki M, Klein R, Klein BE, Kooperberg C, Kottke-‐Marchant K, Kumari M, LaCroix AZ, Mallela L, Murugesan G, Ordovas J, Ouwehand WH, Post WS, Saxena R, Scharnagl H, Schreiner PJ, Shah T, Shields DC, Shimbo D, Srinivasan SR, Stolk RP, Swerdlow DI, Taylor HA Jr, Topol EJ, Toskala E, van Pelt JL, van Setten J, Yusuf S, Whittaker JC, Zwinderman AH; LifeLines Cohort Study, Anand SS, Balmforth AJ, Berenson GS, Bezzina CR, Boehm BO, Boerwinkle E, Casas JP, Caulfield MJ, Clarke R, Connell JM, Cruickshanks KJ, Davidson KW, Day IN, de Bakker PI, Doevendans PA, Dominiczak AF, Hall AS, Hartman CA, Hengstenberg C, Hillege HL, Hofker MH, Humphries SE, Jarvik GP, Johnson JA, Kaess BM, Kathiresan S, Koenig W, Lawlor DA, März W, Melander O, Mitchell BD, Montgomery GW, Munroe PB, Murray SS, Newhouse SJ, Onland-‐Moret NC, Poulter N, Psaty B, Redline S, Rich SS, Rotter JI, Schunkert H, Sever P, Shuldiner AR, Silverstein RL, Stanton A, Thorand B, Trip MD, Tsai MY, van der Harst P, van der Schoot E, van der Schouw YT, Verschuren WM, Watkins H, Wilde AA, Wolffenbuttel BH, Whitfield JB, Hovingh GK, Ballantyne CM, Wijmenga C, 
Reilly MP, Martin NG, Wilson JG, Rader DJ, Samani NJ, Reiner AP, Hegele RA, Kastelein JJ, Hingorani AD, Talmud PJ, Hakonarson H, Elbers CC, Keating BJ, Drenos F.Am J Hum Genet. 2012 Nov 2;91(5):823-‐38
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? https://www.zotero.org/groups/emerge_network/items/collectionKey/NUV7UTBP
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
By signing the DUA, a healthcare organization can participate in using eMERGE for research purposes as well as providing genomic and EHR data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected?
RBC count, hemoglobin level, mean corpuscular volume, mean corpuscular hemoglobin, RBC distribution width and erythrocyte sedimentation rate. DNA, plasma, serum, and neuroimaging
3.c. What types of analysis are done on them? Genomic analyses, complete blood counts, chemistry panel, B12, thyroid stimulating hormone
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Genomic analyses, complete blood counts, chemistry panel, B12, thyroid stimulating hormone
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use? The security technology is different for each of the local sites and therefore cannot be assessed
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Researchers log in to a web portal but can only obtain record counts of patients within the network based on ICD-9 codes and demographics
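A count-only query of the kind described in this answer could be sketched as follows. This is a minimal illustration, not the eMERGE portal's implementation; the record layout, the field names (`icd9`, `sex`, `age`), and the `count_patients` function are assumptions made for the example.

```python
# Hypothetical de-identified records; the field names are assumptions for illustration.
RECORDS = [
    {"icd9": {"250.00", "401.9"}, "sex": "F", "age": 54},
    {"icd9": {"250.00"}, "sex": "M", "age": 61},
    {"icd9": {"401.9"}, "sex": "F", "age": 47},
]


def count_patients(records, icd9_code, sex=None, min_age=None, max_age=None):
    """Return only the number of matching patients, never the records themselves."""
    n = 0
    for record in records:
        if icd9_code not in record["icd9"]:
            continue
        if sex is not None and record["sex"] != sex:
            continue
        if min_age is not None and record["age"] < min_age:
            continue
        if max_age is not None and record["age"] > max_age:
            continue
        n += 1
    return n


if __name__ == "__main__":
    print(count_patients(RECORDS, "250.00"))
    print(count_patients(RECORDS, "401.9", sex="F", min_age=50))
```

Returning a single integer rather than rows is the design point: the caller learns cohort size for feasibility checks without receiving patient-level data.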
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9/10, RxNorm, CPT, LOINC, SNOMED-CT, caDSR, NCI
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? CDISC SDTM
4.d.iii. How are the data transformed and mapped? caBIG
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
eleMAP allows researchers to harmonize their local phenotype data dictionaries to existing metadata and terminology standards such as caDSR, NCIT, and SNOMED-‐CT
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Demographics, medical conditions, medications, vitals, and genetic data
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
clinical Text and Knowledge Extraction System (cTAKES); Health Information Text Extraction (HITEX); NegEx; ConText; National Library of Medicine's MetaMap; MedEx; SecTag; Stanford Named Entity Recognizer (NER); Stanford CoreNLP
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated based on the request of the researcher established in the DUA
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? PheWAS methods leverage EHR billing data (ICD-9 codes) to derive case and control populations. Using these data, a large number of disease phenotypes can be investigated simultaneously for association with a specified variant or variants.
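The PheWAS idea described in this answer can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the eMERGE tooling: the patient layout (`carrier`, `icd9`) and the `phewas_odds_ratios` function are hypothetical, each raw ICD-9 code is treated as its own phenotype, and the association measure is a plain odds ratio with a 0.5 continuity correction. Real analyses typically also adjust for covariates and correct for multiple testing.

```python
def phewas_odds_ratios(patients):
    """For one variant, compute an odds ratio per billing code.

    patients: list of dicts with 'carrier' (bool, variant present) and
    'icd9' (set of billing codes). Cases for a code are patients who have
    it; controls are patients who do not.
    """
    codes = sorted({code for p in patients for code in p["icd9"]})
    results = {}
    for code in codes:
        a = b = c = d = 0  # 2x2 table: case/control by carrier/non-carrier
        for p in patients:
            is_case = code in p["icd9"]
            if is_case and p["carrier"]:
                a += 1
            elif is_case:
                b += 1
            elif p["carrier"]:
                c += 1
            else:
                d += 1
        # Add 0.5 to every cell so empty cells do not cause division by zero.
        results[code] = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    return results


# Hypothetical toy cohort.
PATIENTS = [
    {"carrier": True, "icd9": {"250.00"}},
    {"carrier": True, "icd9": {"250.00", "401.9"}},
    {"carrier": True, "icd9": {"401.9"}},
    {"carrier": False, "icd9": {"401.9"}},
    {"carrier": False, "icd9": set()},
    {"carrier": False, "icd9": {"250.00"}},
]

if __name__ == "__main__":
    for code, odds_ratio in phewas_odds_ratios(PATIENTS).items():
        print(code, round(odds_ratio, 2))
```

The loop over codes is what distinguishes this from a single-phenotype study: one genotype is fixed and every phenotype derived from billing data is scanned against it.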
Table 1
*Table from http://www.genome.gov/27540473
Criteria Answers
1.a. How many people does the network cover or involve? 18,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
In two years, HMORN increased from 11 to 19 sites, and from 10 million to 18 million covered lives
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
Transforming Primary Care Study: Evaluating the spread of Group Health's Medical Home, PI: Robert J. Reid. 7,018 followed for two years. An evaluation of the effects of the patient-‐centered medical home model of primary care on patients’ experiences, quality, burnout of clinicians, and total costs; results showed improvements in patients’ experiences, quality, and clinician burnout—with an estimated total savings of $10.3 per patient per month. Citations: Reid, Fishman et al. 2009; Reid, Coleman et al. 2010
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography
19 research centers: Denver-Boulder-Colorado Springs, Atlanta, Central Texas, Hawaii, Northwest Oregon-Southwest Washington, Sacramento-San Francisco Bay Area, New Mexico, Washington-Northern Idaho, Wisconsin, Northeast and Central Pennsylvania, Southeast Michigan, Minneapolis-St. Paul, Massachusetts-New Hampshire-Maine, Massachusetts, Los Angeles County-Orange County-San Diego County, Wisconsin-Minnesota-North Dakota-Idaho, Tel Aviv (Israel), Maryland-Virginia-District of Columbia, Northern California
1.b.i.3. Demographics: age See Table 1
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? See Table 2
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? See Table 2
1.c.i.2. How much of that budget is dedicated to conducting studies? See Table 2
1.c.ii. What are the current sources of funding? NIH, CDC, AHRQ, Community Benefit Funds
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 18
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? No
1.e.i.1. What does the network focus on? Not applicable
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR (Epic, Care Plus, Allscripts, Cattails MD, ICT, Next Gen)
HMO Research Network (HMORN)
Umbrella network for: Cancer Research Network (CRN), Cardiovascular Research Network (CVRN), Diabetes Research Network, Accelerating Change and Transformation in Organizations and Networks (ACTION II), Developing Evidence to Improve Decisions about Effectiveness (DEcIDE) Network, Medical Exposure in Pregnancy Risk Evaluation Program (MEPREP), Mental Health Research Network (MHRN), Mini-Sentinel, Multi-Institutional COnsortium for Comparative Effectiveness Research in Prevention and Treatment of Diabetes Mellitus (SUPREME-DM).
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Use Agreements
1.g.iii.1.b. Policies for sharing data outside the network
Data are shared on a case-by-case basis. Typically the outside researcher needs to collaborate with a researcher who is part of the network.
1.g.iii.1.c. Policies for protecting proprietary data Patient data are de-‐identified
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) PS2-‐51: Utilization Quality Assurance: Are We Better Yet? Donald Bachman, Terry Field, Christine Bredfeldt, Mark Hornbrook, Alan Bauck, Heather Tavel, Lucas Ovans, Debbie Godwin and Dean Kjar, Clinical Medicine & Research August 1, 2012 vol. 10 no. 3 195-‐196.
2) PS2-‐58: A Survey of HMORN VDW Tumor Data Sources. Rick Krajenta, Dustin Key and Amy Butani. Clinical Medicine & Research August 1, 2012 vol. 10 no. 3 197.
3) PS2-‐61: Establishment of a Cohort of Women to Study the Effect of Cervical Procedures on Reproductive Health Outcomes. Erin Masterson, Sheila Weinmann, Allison Naleway, Meredith Vandermeer, Tracy Dodge, Bhakti Arondekar, Jovelle Fernandez, Shanthy Krishnarajah, Geeta Swamy and Evan Myers. Clinical Medicine & Research August 1, 2012 vol. 10 no. 3 180.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence?
Adult Changes in Thought (ACT) Study, Eric B. Larson. About 2,000 (at any one time, new participants are enrolled as others die) followed since 1994. Total enrollment to date > 4,000 including >400 research quality autopsy specimens. An ongoing longitudinal study following adults over age 65 to identify risk factors for cognitive decline with aging and related conditions, such as Alzheimer's disease. Citations: Gray, Anderson et al. 2008; Breitner, Haneuse et al. 2009; Ehlenbach, Hough et al. 2010; Gray, Walker et al. 2011; Trittschuh, Crane et al. 2011
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Access to EHRs, administrative, laboratory data, pharmacy data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Effect of Massage on Chronic Low Back Pain, Daniel C. Cherkin. 400 patients followed for one year: first study to compare structural and relaxation (Swedish) massage for chronic low back pain; the randomized controlled trial found that both types of massage worked well, with few side effects. Citations: Cherkin, Sherman et al. 2009; Cherkin, Sherman et al. 2011
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood, serum, and DNA samples
3.c. What types of analysis are done on them? DNA sequence analysis to identify genetic variants
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? DNA sequence analysis to identify genetic variants, examined three biomarkers as potential predictors of future diagnosis
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use? Data stored locally, computerized datasets stored behind separate security firewalls
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution?
For multi-‐site studies that use data from the standardized Virtual Data Warehouse, efficiencies are achieved by sharing data extraction code that has been written and validated at a single site, then deployed at other sites to be run against local Virtual Data Warehouse files. Data management staff at all sites work closely with site investigators to refine data queries and prepare analytic data sets.
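The pattern described in this answer, in which one validated extraction program is shipped to every site and run against local files so that only aggregates leave the site, might be sketched as follows. The report indicates HMORN shares SAS macros for the Virtual Data Warehouse; the Python below is only an illustration of the same idea, and the row layout (`patient_id`, `dx`), site names, and function names are hypothetical.

```python
def extraction_program(local_rows, diagnosis_code):
    """Shared logic written once and run unchanged at each site.

    Returns an aggregate (distinct patient count), not patient-level rows.
    """
    matching = {row["patient_id"] for row in local_rows if row["dx"] == diagnosis_code}
    return {"n_patients": len(matching)}


# Hypothetical local files; in practice each would stay behind its site's firewall.
SITE_FILES = {
    "site_a": [
        {"patient_id": 1, "dx": "250.00"},
        {"patient_id": 1, "dx": "250.00"},  # repeat encounter, same patient
        {"patient_id": 2, "dx": "401.9"},
    ],
    "site_b": [
        {"patient_id": 7, "dx": "250.00"},
        {"patient_id": 8, "dx": "250.00"},
    ],
}


def run_multisite(site_files, diagnosis_code):
    """Run the same program against every site's data and pool the aggregates."""
    per_site = {
        site: extraction_program(rows, diagnosis_code)
        for site, rows in site_files.items()
    }
    total = sum(result["n_patients"] for result in per_site.values())
    return per_site, total


if __name__ == "__main__":
    per_site, total = run_multisite(SITE_FILES, "250.00")
    print(per_site, total)
```

Because every site stores its data in the same standardized layout, the program needs no site-specific branches, which is where the efficiency described in the answer comes from.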
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? NDC, ICD-9/10, CPT-4, DRG, ISO, HCPCS, LOINC
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Virtual Data Warehouse
4.d.iii. How are the data transformed and mapped? Data are stored locally and are mapped when data extraction code is sent to the local sites.
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
EHR/EMR records, health plan claims, medical charts, lab data, clinical registries, biospecimen resources, patient reported outcomes, data on health care cost, utilization, and benefit designs, pharmacy data, survey data, clinical trials data, cancer registries, Medicare/Medicaid, vital records
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Machine learning, logistic regression, support vector machine
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Based on the query of the researcher
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Virtual Data Warehouse SAS Macros
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable
Table 1. Population Characteristics (last updates range, 2010-2012)
EH FCHP GH GHS HFHS HP HPHC KPCO KPGA KPHI KPMA KPNC KPNW KPSC LCF MC MHS PAMF S&W
Age
% ≤ 17 yrs 25 19 16 19 16 25 22 21 22 20 20 22 22 25 38 21 41 21 22
% 18 – 44 yrs 60* 33 31 40 27 39 39 34 37 35 36 35 34 36 26 32 32 39 33
% 45 – 64 yrs * 29 35 26 34 29 34 30 32 30 32 29 31 28 21 26 19 27 28
% 65 + 15 19 18 16 22 5 4 15 9 15 12 13 14 11 15 20 5 12 17
Race [8]
% American Indian/Alaska Native 4 <1 1 0 1 1 0 1 0 1 <1 <1 1 <1 NA <1 0 <1 0
% Asian <1 3 4 0 3 5 5 3 2 38 9 17 5 10 NA <1 0 32 1
% Native Hawaiian or Other Pacific Islander <1 0 0 1 0 0 0 0 0 33 NA 4 0 NA NA <1 0 NA 0
% Black or African American <1 2 2 1 38 10 12 4 18 1 37 8 3 10 NA <1 0 2 6
% White 97 87 33 98 52 59 83 57 18 27 42 51 87 37 NA 68 95 51 45
% Other or unknown <1 0 60 0 0 25 0 36 62 0 2 0 5 3 100 30 5 <1 45
% ethnicity known not specified, not specified, not specified, not specified, not specified, not specified, not specified, 54, not specified, not specified, not specified, not specified, 50, not specified, not specified, 68, not specified, not specified, not specified
% known Hispanic or Latino ethnicity - 8 2 1 1 2 4 10 2 4 19 5 41 40 2 0 14 7
Member Retention
% enrolled at 1 yr n/a 95 82 82 99 88 85 91 82 85 67 87 83 90 80 88 99 90 84
% enrolled at 3 yrs n/a 92 63 64 86 70 54 66 57 72 39 75 67 76 51 82 98 79 65
% enrolled at 5 yrs n/a 92 52 47 63 55 45 54 43 63 27 66 59 66 40 70 98 68 53
* EH: the single value 60 appears to cover ages 18 – 64 (the 18 – 44 and 45 – 64 rows combined).
[8] May be > 100% if multiple responses allowed at collection; 'other' may include persons reporting multiple races.
Health Plan Acronyms: EH = Essentia Health; FCHP = Fallon Community Health Plan; GHC = Group Health; GHS = Geisinger Health System; HFHS = Henry Ford Health System; HP = HealthPartners; HPHC = Harvard Pilgrim Health Care; KPCO = Kaiser Permanente Colorado; KPGA = Kaiser Permanente Georgia; KPHI = Kaiser Permanente Hawaii; KPMA = Kaiser Permanente Mid-Atlantic; KPNC = Kaiser Permanente Northern California; KPNW = Kaiser Permanente Northwest; KPSC = Kaiser Permanente Southern California; LCF = Lovelace Clinic Foundation; MCRF = Marshfield Clinic; MHS = Maccabi Healthcare Services; PAMF = Palo Alto Medical Foundation; S&W = Scott & White Healthcare
Table 2
GHRI GHS HFHS HPRF HPHC KPCO KPGA KPHI KPNC KPNW LCF MCRF MPCI S&W
Began 1983 2003 1983 1989 1969 1987 1998 1991 1961 1964 1990 1959 1996 1985
Research clinic
Survey department
Facility that can do research lab tests
Facility that can fill research prescriptions
2010 Funding§§ – all sources ($ millions) 43.3 10.5 52.4 17.0 32.1 16.6 3.3 4.4 94.4 35.3 5.7 31.9 4.3 13.1
2010 Federal Funding, % 82 16 50 64 84 54 44 62 69 76 91 32 72 22
PI FTE 32 10 82 23 36 10 6 5 48 31 7 31 26 23
Investigator-initiated clinical trials (avg/year) 1-10 >10 1-10 1-10 0 0 1-10 0 >10 >10 0 1-10 1-10 >10
Total clinical trials (avg/year) <50 50+ 50+ <50 <50 50+ <50 <50 50+ 50+ 0 50+ <50 50+
§§ Revenue/expense.
Criteria Answers
1.a. How many people does the network cover or involve? 50,000 patients per year; 1,200 patients in the Transitions of Care project
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Although HOMERuN is currently using their data for one project, they receive data for all patients in the network who were admitted to network hospitals
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Transitions of Care project data are being used locally, and sites have discussed increasing post-discharge acute clinic care follow-ups, as well as issues surrounding decreased access to pre- and post-acute care.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $1,600,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $800,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $800,000
1.c.ii. What are the current sources of funding? Association of American Medical Colleges
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 4
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Discharge care coordination and a readmission review at all 13 sites to determine preventability
1.f. (Y/N) Does the network use informed consent forms? Yes; the UCSF IRB approves informed consent documentation
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
Hospital Medicine Reengineering Network (HOMERuN)
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Use Agreements, Business Associate Agreements
1.g.iii.1.b. Policies for sharing data outside the network This network does not share data, and does not plan to share data, outside the network
1.g.iii.1.c. Policies for protecting proprietary data All proprietary data are property of UCSF
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
Publications under review: 1) PREVENTABILITY OF READMISSIONS IN A NATIONAL SAMPLE OF PATIENTS: PRELIMINARY RESULTS FROM THE HOSPITAL MEDICINE REENGINEERING NETWORK (HOMERUN), AD Auerbach, Mourad M, Maselli J, Sehgal N, Lindenauer PK, Kim C, Robinson E, Ruhnke G, Metlay J, Herzig S, Vasilevskis E, Kripalani S, Williams M, Fletcher G, Critchfield J, Schnipper J
2) PRIMARY CARE PHYSICIAN AND HOSPITALIST PERCEPTIONS OF CAUSES OF READMISSIONS: PRELIMINARY RESULTS FROM THE HOSPITAL MEDICINE REENGINEERING NETWORK (HOMERUN), AD Auerbach, Mourad M, Maselli J, Sehgal N, Lindenauer PK, Kim C, Robinson E, Ruhnke G, Metlay J, Herzig S, Vasilevskis E, Kripalani S, Williams M, Fletcher G, Critchfield J, Schnipper J
3) The Hospital Medicine Reengineering Network (HOMERuN): A learning organization focused on improving hospital care. Andrew D. Auerbach MD MPH, Mitesh S. Patel MD MBA, Joshua P. Metlay MD PhD, Jeffrey L. Schnipper MD MPH, Mark V. Williams MD, Edmondo J. Robinson MD MBA, Sunil Kripalani MD MSc, Peter K. Lindenauer MD MSc
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
No standardization has been needed thus far
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
Healthcare organizations are allowing researchers to include their patients in research and also interact with patients for purposes of the clinical trial while a patient is in the hospital.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? All data are stored utilizing the security technology of the UCSF secure data center.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
The query is submitted to the principal investigator and the principal investigator compiles the data set and sends it back to the researcher as an Excel, CSV, SAS, or STATA dataset.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Not for the current project but moving toward using ICD-‐10
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
EHR data (for initial review only), Patient chart review, Surveys of physicians, Patient interviews
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are transformed based on the data needs of the researcher
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 126,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Conducts studies involving drug, vaccine, and medical device safety
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age See Table 1
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? $14,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies? Confidential
1.c.ii. What are the current sources of funding? FDA
1.c.iii. How much does it cost each year to maintain and update the network? Confidential
1.d. How many years has this network existed? 4
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Mainly on drugs, vaccines, other biologics (such as blood products), and medical devices
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR, Claims data
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Partners may use their own original source data transformed into Mini-Sentinel Common Data Model format for other purposes, such as research, as long as they comply with applicable state and federal laws and regulations, including HIPAA and the Common Rule.
Mini-Sentinel
1.g.iii.1.b. Policies for sharing data outside the network
Data Use Agreements are not required for Mini-Sentinel activities. However, Collaborators and the Mini-Sentinel Coordinating Center, including all its components, may only use data obtained from sources other than their own institution (referred to as “outside source data”) in the conduct of Mini-Sentinel activities for Mini-Sentinel’s public health purposes. Such data may not be reused, re-disclosed, altered, or sold for any purposes other than those defined in the base contracts and subsequent task order contracts.
1.g.iii.1.c. Policies for protecting proprietary data
Direct patient identifiers may be used by Data Partners when necessary to gather additional clinical and demographic information or to link their data to data from other sources, as required by specific projects. Prior to sharing information with the Operations Center, direct patient identifiers are stripped.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Nguyen, M., Ball, R., Midthun, K., & Lieu, T. A. (2012). The Food and Drug Administration's Post-Licensure Rapid Immunization Safety Monitoring program: strengthening the federal vaccine safety enterprise. Pharmacoepidemiology and Drug Safety, 21, 291-297.
2) Fireman, B., Toh, S., Butler, M. G., Go, A. S., Joffe, H. V., Graham, D. J., ... & Selby, J. V. (2012). A protocol for active surveillance of acute myocardial infarction in association with the use of a new antidiabetic pharmaceutical agent. Pharmacoepidemiology and Drug Safety, 21, 282-290.
3) Lopez, M. H., Holve, E., Sarkar, I. N., & Segal, C. (2012). Building the Informatics Infrastructure for Comparative Effectiveness Research (CER): A Review of the Literature. Medical Care, 50, S38-S48.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Validation of Severe Liver Injury Cases
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By providing EHR data and DUAs
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use?
All current implementations using PopMedNet are NIST 800-53 REVE 2 / FISMA compliant and have successfully passed a full audit of the hosting facility, application, and operations procedures. The Application Portal is hosted in a two-server configuration. One server (the Portal Web server) runs the Portal application under IIS and ASP.NET and services all application requests that come in via the Web. The second server (the Portal Database server) houses the Portal Database in an MS SQL Server 2008 instance. There is no connection from the Portal Database server to the Web; all requests are made via the Portal Web server.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
The Data Mart Client polls the portal for queries awaiting execution, downloads the query, executes the query, and manages the workflows associated with query execution (Administrator inbox, notification, workflow processing, etc.). The Data Mart executes the query directly via an ODBC connection; it is not passed off to another service. Queries can be reviewed before local execution, and results can be reviewed before release. The system does not require an open port and is not designed to be fully synchronous, although all query fulfillment steps can be automated.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9/10/11, NDC, LOINC, SNOMED-CT, CPT-4, HCPCS, HCPCS Level III, CPT Cat II, CPT Cat III
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Mini-Sentinel
4.d.iii. How are the data transformed and mapped? Mini-Sentinel utilizes SAS Macro toolkits to extract data from EHR/EMR from the current site
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Each query allows the requester to describe the nature of the query. System metadata include the requester name and contact information, his/her role in the system, the query description, and which other sites also received the query. The Data Mart Administrator can see the query parameters and its results before uploading to the portal.
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Enrollment, Demographic, Medication, Encounter, Diagnosis, Procedures, Labs, Vitals
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated after they leave the local site, and what a user sees depends on that user's permissions. Some users may view only aggregated results, and others may view site-specific results.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS toolkits are available for users to utilize with the Mini-Sentinel network
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? ETL tools are used to load the data
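The pull-based query distribution described under 4.b.ii can be sketched roughly as follows. This is a minimal illustration in Python, not PopMedNet or Mini-Sentinel code: the Portal, Query, and DataMartClient classes, the approval callbacks, and the returned status strings are all hypothetical names introduced here, and an in-memory SQLite database stands in for the ODBC-connected local data mart. The sketch only shows the general pattern the answer describes: the site polls the hub (so no inbound port is needed at the site), a query can be reviewed before local execution, and results can be reviewed before release.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Query:
    query_id: int
    sql: str


@dataclass
class Portal:
    """Hypothetical central hub: holds queued queries and collects results."""
    pending: list = field(default_factory=list)
    results: dict = field(default_factory=dict)

    def submit(self, query: Query) -> None:
        self.pending.append(query)

    def fetch_pending(self) -> Optional[Query]:
        # Answers a poll from a site; the portal never connects to the site.
        return self.pending.pop(0) if self.pending else None

    def upload(self, query_id: int, rows: list) -> None:
        self.results[query_id] = rows


class DataMartClient:
    """Hypothetical site-side client that pulls work from the portal."""

    def __init__(
        self,
        portal: Portal,
        conn: sqlite3.Connection,  # assumption: stands in for the ODBC connection
        approve_query: Callable[[Query], bool] = lambda q: True,
        approve_release: Callable[[Query, list], bool] = lambda q, rows: True,
    ) -> None:
        self.portal = portal
        self.conn = conn
        self.approve_query = approve_query
        self.approve_release = approve_release

    def poll_once(self) -> str:
        """Run one poll cycle and report what happened."""
        query = self.portal.fetch_pending()
        if query is None:
            return "idle"
        if not self.approve_query(query):      # review before local execution
            return "rejected"
        rows = self.conn.execute(query.sql).fetchall()
        if not self.approve_release(query, rows):  # review before release
            return "held"
        self.portal.upload(query.query_id, rows)
        return "uploaded"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE encounter (patient_id INTEGER, dx TEXT)")
    conn.executemany(
        "INSERT INTO encounter VALUES (?, ?)",
        [(1, "410.0"), (2, "410.0"), (3, "250.0")],
    )
    portal = Portal()
    portal.submit(Query(1, "SELECT dx, COUNT(*) FROM encounter GROUP BY dx ORDER BY dx"))
    client = DataMartClient(portal, conn)
    print(client.poll_once(), portal.results)
```

With both callbacks left at their defaults, every step runs without human review, which corresponds to the fully automated mode the answer mentions; supplying stricter callbacks models a Data Mart Administrator reviewing queries or results by hand.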
Table 1. Snapshot of the Mini-Sentinel Distributed Database Demographic Table in Extract 1 (Unique Individuals = 83,003,100)
*Table from http://www.mini-sentinel.org/work_products/Data_Activities/Mini-Sentinel_Year-1-Data-Quality-and-Characterization-Procedures-and-Findings-Report.pdf, page 41
Criteria Answers
1.a. How many people does the network cover or involve? 40,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Four main types of studies: retrospective studies using dental records, observational studies of routine care activities, case-control studies, and clinical trials comparing alternative treatment strategies
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
Allows participating dentists to compare their results to the aggregated results of other practices; network tracks whether member practices have implemented the approaches disseminated by research.
1) Riley JL III, Gordan VV, Rindal DB, Fellows JL, Qvist V, Sager P, Foy P, Williams OD, Gilbert GH for The National Dental PBRN Collaborative Group. Components of patient satisfaction with a dental restorative visit: results from The Dental Practice-Based Research Network. Journal of the American Dental Association 2012; 143(9):1002-1010.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography
United States: Alabama, California, Colorado, Delaware, District of Columbia, Florida, Georgia, Illinois, Kentucky, Louisiana, Maine, Massachusetts, Michigan, Minnesota, Mississippi, New Jersey, New Mexico, North Carolina, Ohio, Oregon, Pennsylvania, South Carolina, Tennessee, Texas, Washington, Wisconsin. The network is divided into 6 regional nodes: the Western Region, the Midwest Region, the Northeast Region, the Southwest Region, the South Central Region, and the South Atlantic Region
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $66,800,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? National Institute of Dental and Craniofacial Research (NIDCR)
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 1
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Dental Practice-Based Research
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
The National Dental Practice-Based Research Network (NDPBRN)
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
A researcher from the network submits a protocol concept to the network. Once a protocol concept has been approved, a study team is formed. This team administers the study from protocol development, to feasibility and pilot testing, data collection, data analysis, and study closure.
1.g.iii.1.b. Policies for sharing data outside the network Data are not shared outside the network
1.g.iii.1.c. Policies for protecting proprietary data Practitioners sign a confidentiality agreement and receive Human Subjects training and HIPAA training
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Houston TK, DeLaughter KL, Ray MN, Gilbert GH, Allison JJ, Kiefe CI, Volkman JE for the National Dental PBRN Collaborative Group. Impact of a web-assisted tobacco quality improvement intervention of subsequent smoker behavior: a National Dental PBRN study. BMC Oral Health 2013; accepted for publication.
2) Blue CM, Funkhouser DE, Riggs S, Rindal DB, Worley D, Pihlstrom DJ, Gilbert GH for the National DPBRN Collaborative Group. Utilization of non-dentist providers and attitudes toward new provider models: findings from the National Dental Practice-Based Research Network. Journal of Public Health Dentistry 2013; accepted for publication.
3) Ray MN, Allison JJ, Coley HL, Williams JH, Kohler C, Gilbert GH, Richman JS, Kiefe CI, Sadasivam RS, Houston TK for the National DPBRN Collaborative Group. Variations in tobacco control in National Dental PBRN practices: the role of patient and practice factors. Special Care in Dentistry 2013; accepted for publication.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence?
Houston TK, Coley HL, Sadasivam RS, Ray MN, Williams JH, Allison JJ, Gilbert GH, Kiefe CI, Kohler C for The DPBRN Collaborative Group. Impact of content-specific email reminders on provider participation in an online intervention: a Dental PBRN study. Studies in Health Technology and Informatics 2010; 160: 801-805.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
Dentists have three options for participation once they have joined the network: 1. informational (receive newsletters and correspondence only); 2. limited (also participate in questionnaires); or 3. full (also participate in in-office clinical studies)
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Houston TK, Richman JS, Ray MN, Allison JJ, Gilbert GH, Shewchuk RM, Kohler CL, Kiefe CI, for The DPBRN Collaborative Group. Internet-delivered support for tobacco control in dental practice: randomized controlled trial in The Dental PBRN. Journal of Medical Internet Research 2008; 10(5): e38.
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Not available
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Not available
4.d.ii. Which CDM is used? Not available
4.d.iii. How are the data transformed and mapped? Not available
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Demographics, treatments, procedures, medications, and surveys collected in clinical trials
4.g.i. (Y/N) Does the network use natural language processing? Not available
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Not available
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not available
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts and SPSS code
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Not available
4.j.ii. What informatics tools are used? Not available
Criteria Answers
1.a. How many people does the network cover or involve? 1,200,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Collaborates with 18 hospital emergency departments, where children are treated for acute illnesses and injuries across a wide spectrum of conditions, from the most common to the very rare
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Stanley R, Lillis K, Zuspan SJ, Lichenstein R, Ruddy RM, Gerardi MJ, Dean JAM, and the Pediatric Emergency Care Applied Research Network. Development and implementation of a performance measure tool in an academic pediatric research network. Controlled Clinical Trials 2010
1.b.i.1. Demographics: racial/ethnic
White (Non-Hispanic): 33%; Hispanic: 21%; Black or African American: 38%; Asian: 1%; American Indian or Alaskan Native: 0%; Native Hawaiian or Other Pacific Islander: 0%; Other: 4%; Multiple Races: 1%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age
0: 17%; 1: 13%; 2: 10%; 3: 7%; 4: 6%; 5: 5%; 6: 4%; 7: 4%; 8: 4%; 9: 3%; 10: 3%; 11: 3%; 12: 3%; 13: 3%; 14: 3%; 15: 3%; 16: 3%; 17: 3%; 18: 1%
1.b.i.4. Demographics: gender Male: 53%; Female: 47%
1.c.i. What is the total annual budget? $5,280,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Percentage of the total annual budget
1.c.i.2. How much of that budget is dedicated to conducting studies? $5,000,000
1.c.ii. What are the current sources of funding?
HRSA/MCHB/EMSC funds the infrastructure. External grants are funded by NICHD, NHLBI, CDC, NIH-Eunice Kennedy Shriver National Institute of Child Health & Human Development, AHRQ, NIAAA, and HRSA/MCHB/EMSC
1.c.iii. How much does it cost each year to maintain and update the network? Percentage of the total annual budget
1.d. How many years has this network existed? 11
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Focuses on Pediatric Emergency Care
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific - Consent to specific use of their data based on the IRB submitted by the researcher. Consent forms are changed based on the study.
Pediatric Emergency Care Applied Research Network (PECARN)
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific - Consent to specific use of their data based on the IRB submitted by the researcher. Consent forms are changed based on the study.
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Investigators will request the use of a specific dataset by submitting a formal request that includes a research plan describing the proposed research, a signed Research Data Use Agreement (RDUA), and either approval from the researcher’s IRB for use of the dataset or documentation that the use of public data sets is exempt from IRB review by institutional policy. The data coordinating center will disseminate the dataset after receipt of the aforementioned items.
1.g.iii.1.b. Policies for sharing data outside the network
Investigators will request the use of a specific dataset by submitting a formal request that includes a research plan describing the proposed research, a signed Research Data Use Agreement (RDUA), and either approval from the researcher’s IRB for use of the dataset or documentation that the use of public data sets is exempt from IRB review by institutional policy. The data coordinating center will disseminate the dataset after receipt of the aforementioned items.
1.g.iii.1.c. Policies for protecting proprietary data
Data collected in this project do not include names but do include enough identifying information (such as date of birth, gender, and zip code) that project investigators must protect the confidentiality of the data in accordance with privacy regulations such as the Health Insurance Portability and Accountability Act (HIPAA).
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Holmes, JF, Lillis K, Monroe D, Borgialli D, Kerrey BT, Mahajan P, Adelgais K, Ellison AM, Yen K, Atabaki S, Menaker J, Bonsu B, Quayle KS, Garcia M, Rogers A, Blumberg S, Lee L, Tunik M, Kooistra J, Kwok M, Cook LJ, Dean JM, Sokolove PE, Wisner DH, Ehrlich P, Cooper A, Dayan PS, Wootton-Gorges S, Kuppermann N, Pediatric Emergency Care Applied Research Network (PECARN). Identifying Children at Very Low Risk of Clinically Important Blunt Abdominal Injuries. Annals of Emergency Medicine, Available online 1 Feb 2013, ISSN 0196-0644, 10.1016/j.annemergmed.2012.11.009.
2) Pemberton VL, Browning B, Webster A, Dean JM, Moler FW. Therapeutic hypothermia after pediatric cardiac arrest trials: the vanguard phase experience and implications for other trials. Pediatr Crit Care Med. 2013 Jan;14(1):19-26.
3) Shaw KN, Lillis KA, Ruddy RM, Mahajan PV, Lichenstein R, Olsen CS, Chamberlain JM. Reported Medication Events in a Paediatric Emergency Research Network: Sharing to Improve Patient Safety. Emerg Med J. 2012
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Holmes JF, Borgialli DA, Nadel FM, Quayle KS, Schamban N, Cooper A, Schunk JE, Miskin ML, Atabaki SM, Hoyle JD, Dayan PS, Kuppermann N, and the TBI Study Group for the PECARN. Do children with blunt head trauma and normal cranial CT scans require hospitalization for neurological observation? Ann Emerg Med 2011.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHRs and providing biospecimens
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence?
Corneli HM, Zorc JJ, Mahajan P, Shaw KN, Holubkov R, Reeves SD, Ruddy RM, Malik B, Nelson KA, Bregstein JS, Brown KM, Denenberg MN, Lillis KA, Cimpello LB, Tsung JW, Borgialli DA, Baskin MN, Teshome G, Goldstein MA, Monroe D, Dean JM, Kuppermann N; Bronchiolitis Study Group of the Pediatric Emergency Care Applied Research Network (PECARN). A multicenter, randomized, controlled trial of dexamethasone for bronchiolitis. N Engl J Med. 2007 Jul 26;357(4):331-9.
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood
3.c. What types of analysis are done on them? Explore the differences in host responses to bacterial vs. non-bacterial infections in young, febrile infants by quantifying changes in the host gene mRNA expression (transcriptional biosignatures)
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them?
Explore the differences in host responses to bacterial vs. non-bacterial infections in young, febrile infants by quantifying changes in the host gene mRNA expression (transcriptional biosignatures)
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use?
The Data Coordinating Center (DCC) is housed in a building with 24-hour on-site security guards. The DCC coordinates network infrastructure and security with the Health Sciences Campus (HSC) information systems at the University of Utah. This provides the DCC with effective firewall hardware, automatic network intrusion detection, and the expertise of dedicated security experts working at the University. User authentication is centralized with two Windows 2003-2008 domain servers. Communication over public networks is encrypted with virtual point-to-point sessions using secure socket layer (SSL) or virtual private network (VPN) technologies, both of which provide at least 128-bit encryption. All of the DCC Web-based systems use the SSL protocol to transmit data securely over the Internet.
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9/10, CPT
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Site, Patient ID, Date of Birth, Gender, Race, Ethnicity, Zip Code, Triage Category, Chief Complaint, Procedure Codes, Diagnosis Codes, E-Code, Payer Type (Insurance), ED Disposition, Date Time (Triage Date/Time and Discharge Date/Time), Mode of Arrival
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
The data are aggregated before they leave the site. The data sets are put on CD or DVD along with the Data Dictionary and then sent to the researcher.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Utilize SAS or Excel
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 6,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The FURTHER platform is scalable to allow addition of new hospitals and data types. PHIS+ will augment the Children's Hospital Association's existing database, PHIS, with laboratory and radiology data for children seen in the ambulatory and inpatient departments of six large children's hospitals.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
"Merging of the National Cancer Institute–funded cooperative oncology group data with an administrative data source to develop a more effective platform for clinical trial analysis and comparative effectiveness research: a report from the Children's Oncology Group" R. Aplenc, B. T. Fisher, Y. S. Huang, Y. Li, T. A. Alonzo, R. B. Gerbing, M. Hall, D. Bertoch, R. Keren, A. E. Seif, L. Sung, P. C. Adamson, A. Gamis. Pharmacoepidemiology and Drug Safety Supplement: Methods for Developing and Analyzing Clinically Rich Data for Patient-‐Centered Outcomes Research Volume 21, Issue Supplement S2, pages 37–43, May 2012
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography
The laboratory and radiology data for PHIS+ come from six hospitals: Children's Hospital of Philadelphia (CHOP), Cincinnati Children’s Hospital Medical Center (CCHMC), Children’s Hospital Boston (CHB), Children’s Hospital of Pittsburgh (CHP), Primary Children’s Medical Center, Intermountain Healthcare (Salt Lake City) (PCMC), and Seattle Children’s Hospital (SCH). Administrative data also come from all 43 PHIS hospitals.
1.b.i.3. Demographics: age Most patients are ages 0-18
1.b.i.4. Demographics: gender Females: 159,663 Males: 645,255
1.c.i. What is the total annual budget? $9,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $600,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $2,900,000
1.c.ii. What are the current sources of funding? Agency for Healthcare Research and Quality (AHRQ)
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 3
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Pediatrics
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
Pediatric Health Information System+ (PHIS+)
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Each of the hospitals that send data to Children's Hospital Association (CHA) has a business associate agreement with CHA. This means the entire patient record (with PHI) is sent to CHA, but when researchers go to pull the data, researchers fill out data use agreements with CHA and the researchers receive limited data sets, meaning they receive masked MRN and account numbers that allow researchers to follow patients longitudinally.
(1) Business Associate Agreement (BAA) Between Hospitals and CHA: In order to facilitate matching of PHIS+ clinical data with corresponding administrative data shared with CHA through PHIS, hospital clinical data sent to CHA contain patient identifiers such as medical record number, hospital billing number, and date of service. To authorize the sharing of data with identifiers, a business associate agreement (BAA) was employed between each hospital and CHA. This BAA was already in place as a result of the PHIS participation of the 6 hospitals.
(2) Data Use Agreement Between CHA and University of Utah BMIC: CHA drafted a data use agreement governing the sharing of de-identified hospital clinical data with the University of Utah BMIC. Under the agreement, CHA sends de-identified clinical data (as limited data sets) to BMIC, which uses the data to test and refine its mapping software. BMIC then sends the mapped results back to CHA. The only personal identifiers contained in the limited data sets are dates of service. This data use agreement is needed until CHA assumes responsibility for the mapping of clinical data sent from the hospitals.
(3) Data Use Agreement Between CHA and Participating Hospitals: After PHIS+ is established, hospitals that want to receive limited data sets for research will sign a separate DUA for these data. CHA drafted a data use agreement to govern the delivery of PHIS+ data to hospital investigators.
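The masked MRN and account numbers described above let researchers follow a patient across visits without seeing the real identifiers. The report does not say how the masking is done, so the following is only a minimal sketch of one common approach, a keyed hash (HMAC-SHA256). The key, field names, and sample records are hypothetical and are not taken from PHIS+.

```python
import hashlib
import hmac


def mask_identifier(raw_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for raw_id using a keyed hash (assumed technique)."""
    digest = hmac.new(secret_key, raw_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def to_limited_record(record: dict, secret_key: bytes) -> dict:
    """Replace the MRN and account number with masked values; keep the service date."""
    return {
        "masked_mrn": mask_identifier(record["mrn"], secret_key),
        "masked_account": mask_identifier(record["account"], secret_key),
        "service_date": record["service_date"],
        "lab_code": record["lab_code"],
    }


# Hypothetical key and records, for illustration only.
KEY = b"hypothetical-secret-held-by-the-data-steward"
visits = [
    {"mrn": "000123", "account": "A-1", "service_date": "2012-01-05", "lab_code": "GLU"},
    {"mrn": "000123", "account": "A-2", "service_date": "2012-03-09", "lab_code": "GLU"},
    {"mrn": "000456", "account": "A-3", "service_date": "2012-02-11", "lab_code": "NA"},
]
limited = [to_limited_record(v, KEY) for v in visits]
```

Because the same MRN always yields the same masked value under one key, the first two records can still be linked longitudinally, while the raw MRN never appears in the limited data set.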
1.g.iii.1.b. Policies for sharing data outside the network
No outside researcher has access to PHIS+ data. In order to access this data, the researcher's hospital must contribute this data.
1.g.iii.1.c. Policies for protecting proprietary data
Researchers sign DUAs promising not to attempt to identify any of the patients. Additionally, hospitals cannot be identified in the research.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) S. P. Narus, R. Srivastava, R. Gouripeddi, O. E. Livne, P. Mo, J. P. Bickel, D. de Regt, J. W. Hales, E. Kirkendall, R. L. Stepanek, J. Toth, and R. Keren, (2011). “Federating Clinical Data from Six Pediatric Hospitals: Process and Initial Results from the PHIS+ Consortium,” AMIA Annu Symp Proc, vol. 2011, pp. 994–1003.
2) R. Gouripeddi, P. Warner, P. Mo, J. E. Levin, R. Srivastava, S. S. Shah, D. de Regt, E. Kirkendall, J. Bickel, E. K. Korgenski, M. Precourt, R. L. Stepanek, J. A. Mitchell, S. P. Narus, R. Keren, (2012). Federating Clinical Data from Six Pediatric Hospitals: Process and Initial Results for Microbiology from the PHIS+ Consortium. AMIA 2012 Annual Symposium Proceedings, November 3-7, 2012, Proposal ID: AMIA-0205-A2012.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence?
1) Prolonged Intravenous Therapy Versus Early Transition to Oral Antimicrobial Therapy for Acute Osteomyelitis in Children, Theoklis Zaoutis, A. Russell Localio, Kateri Leckerman, Stephanie Saddlemire, David Bertoch and Ron Keren, Pediatrics 2009;123;636
2) Reflux related hospital admissions after fundoplication in children with neurological impairment: retrospective cohort study, Rajendu Srivastava, Jay G Berry, Matt Hall, Earl C Downey, Molly O’Gorman, J Michael Dean, Douglas C Barnhart, BMJ 2009;339:b4411
3) Hospital-Level Compliance With Asthma Care Quality Measures at Children’s Hospitals and Subsequent Asthma-Related Outcomes, Rustin B. Morse, Matthew Hall, Evan S. Fieldston, Gerd McGwire, Melanie Anspacher, Marion R. Sills, Kristi Williams, Naomi Oyemwense, Keith J. Mann, Harold K. Simon, Samir S. Shah, JAMA. 2011;306(13):1454-1460
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
A unique patient identifier permits longitudinal tracking of individual patients
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHRs, and data from inpatient, emergency department, observation and outpatient care settings
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Not applicable
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
Child Health Corporation of America (CHCA) utilizes real-time security scans using an intrusion prevention system (IPS). This identifies network security threats and shuns traffic to prevent damage to the organization or compromised data. This service analyzes global and local sensor data in real time and identifies hostile activity and other threats. An automated update engine then automatically issues commands to the firewall to block attacks. CHCA leverages a clustered firewall system in conjunction with the IPS to provide layered defenses against unauthorized access to data assets. CHCA application architecture isolates the databases from the SSL and SFTP processes, the ETL processes, as well as the web collection tools. The data gathered through web collection tools are SSL encrypted in transit as they pass from the local device through the web server and onto the application server. No CHCA-developed web collection tools create local copies of data on the local device. Database tape backups are additionally encrypted with 256-bit AES.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
With FURTHeR, on-the-fly query capability is replaced with a data file adapter. FURTHeR typically aggregates and stores translated query results in a temporary, in-memory database for presentation and analysis by the investigator for the duration of the user’s session. PHIS+ has added software to allow the in-memory database to instantiate a Hibernate object that could be persisted to a physical, JDBC-compliant database. A special adapter also parses the text batch files.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? LOINC, SNOMED, HL7, RxNorm, CPT
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Using the Federated Utah Research and Translation Health electronic Repository (FURTHeR), the data are translated from the original source system to a common database with a tool developed by the Biomedical Informatics Core in Utah, using the Regenstrief LOINC Mapping Assistant (RELMA).
4.d.iii. How are the data transformed and mapped?
All laboratory and radiology data from six hospitals are pulled and sent to CHA and run through filters. If statisticians at CHA have mapped a particular element to a corresponding data element in the common database, then the data element will map. Everything that does not get mapped is put in a "bin" of unmapped data. These data can be used at a later date: if statisticians choose to add data elements, the unmapped data are waiting to be remapped using them.
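The "bin" of unmapped data can be pictured as a simple partition that is re-run whenever the mapping grows. The sketch below is an assumption about how such a step might look, not CHA's actual filter; the function name, local codes, and common element names are hypothetical.

```python
def apply_mapping(records, mapping):
    """Split records into mapped results and a bin of still-unmapped records."""
    mapped, unmapped_bin = [], []
    for rec in records:
        target = mapping.get(rec["local_code"])
        if target is None:
            unmapped_bin.append(rec)  # keep the raw record for a later pass
        else:
            mapped.append({**rec, "common_element": target})
    return mapped, unmapped_bin


# Hypothetical local codes and common data elements.
records = [
    {"local_code": "GLU", "value": 5.4},
    {"local_code": "NA", "value": 140.0},
    {"local_code": "XYZ", "value": 1.0},
]
mapping = {"GLU": "glucose"}

mapped, unmapped_bin = apply_mapping(records, mapping)

# Later, a new data element is added and only the bin is remapped.
mapping["NA"] = "sodium"
newly_mapped, unmapped_bin = apply_mapping(unmapped_bin, mapping)
```

Keeping the raw unmapped records, rather than discarding them, is what makes the second pass possible without asking the hospitals to resend data.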
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
See Table 1
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
EHR data, Radiology Data, Laboratory Data, Administrative Data
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
An adaptation of the clinical information extraction system ‘Textractor’
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
When a query is submitted by a researcher, a simple command line interface initiates the process by pointing the data file adapter to the correct sample file and configuration file, and invoking the FURTHeR application. The translation engine marshals the raw lab data into the FURTHeR lab object and translates all local codes to the standard terminologies (using the code associations in the terminology server). Unrecognized codes and malformed input data are flagged to a log file for manual review. An output adapter takes each translated lab result and inserts it into a MySQL database via a Java Hibernate object.
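The translate, flag, and insert steps described above can be sketched as follows. This is not FURTHeR code: it assumes a plain dictionary in place of the terminology server, a Python list in place of the log file, and an in-memory SQLite database in place of the MySQL/Hibernate output adapter. All site names, local codes, and standard codes are hypothetical placeholders.

```python
import sqlite3

# Stand-in for the terminology server: (site, local code) -> standard code.
TERMINOLOGY = {
    ("hospital_a", "GLU"): "STD:GLUCOSE",
    ("hospital_a", "NA"): "STD:SODIUM",
}


def translate_rows(rows, terminology):
    """Translate local codes; flag malformed rows and unrecognized codes for review."""
    translated, flagged = [], []
    for row in rows:
        try:
            site, local_code, value = row
            value = float(value)
        except (ValueError, TypeError):
            flagged.append(("malformed", row))
            continue
        standard = terminology.get((site, local_code))
        if standard is None:
            flagged.append(("unrecognized", row))
            continue
        translated.append((site, standard, value))
    return translated, flagged


def load(conn, translated):
    """Insert each translated lab result into the output database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lab_result "
        "(site TEXT, standard_code TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO lab_result VALUES (?, ?, ?)", translated)
    conn.commit()


rows = [
    ("hospital_a", "GLU", "5.4"),
    ("hospital_a", "XYZ", "1.0"),   # code missing from the terminology table
    ("hospital_a", "NA", "n/a"),    # value is not numeric
    ("hospital_a", "NA", "140"),
]
translated, flagged = translate_rows(rows, TERMINOLOGY)
conn = sqlite3.connect(":memory:")
load(conn, translated)
row_count = conn.execute("SELECT COUNT(*) FROM lab_result").fetchone()[0]
```

Flagged rows are kept alongside the reason they failed, which mirrors the manual review step the answer mentions.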
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SPSS code, SAS scripts, STATA code, and R code
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Clinical Transaction Code System by Thomson Reuters
Table 1. A subset of the Lab Sample 1 metadata fields and their descriptions.
Criteria Answers
1.a. How many people does the network cover or involve? Not available
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
SCANNER is designed to be scalable so that additional studies can be added using the existing technology
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes, SCANNER is designed to be study-agnostic
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? In theory, this should be the case, but no evidence yet
1.b.i.1. Demographics: racial/ethnic
Caucasian: 8%; African American: 0.94%; American Indian/Eskimo: 0.048%; Asian/Pacific Islander: 1.17%; Hispanic: 2.47%; Hispanic/Latino: 2.54%; Multi-Racial: 2.92%; Non-Hispanic: 10.56%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age
< 18: 4.6%; 18-30: 12.2%; 31-50: 30%; 51-70: 32%; > 70: 21.2%
1.b.i.4. Demographics: gender Male: 47%; Female: 53%
1.c.i. What is the total annual budget? $2,769,968
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Agency for Healthcare Research and Quality
1.c.iii. How much does it cost each year to maintain and update the network? Not applicable until after network is deployed
1.d. How many years has this network existed? Not applicable until after network is deployed
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Comparative Effectiveness Research: 1. medication surveillance: old vs. new antiplatelet medications (patients with acute coronary syndrome) and old vs. new anticoagulant medications (patients with atrial fibrillation and patients with venous thromboembolism); 2. medication therapy management in patients with diabetes and hypertension
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific: patients consent on a study-by-study basis
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
SCAlable National Network for Effectiveness Research (SCANNER)
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
IRB needed for all institutions; IRB and data sharing agreement needed for VA (VA's data sharing agreement is called CRADA)
1.g.iii.1.b. Policies for sharing data outside the network Sharing allowed with approved IRB and data use agreements (for limited data sets or identified data)
1.g.iii.1.c. Policies for protecting proprietary data Data are HIPAA-compliant and limited datasets are shared with approved IRB in place
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHRs
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Not applicable
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Not applicable
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Planned to be 2-factor authentication and study-based authorization
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution? Query distribution via central hub through a portal; the architecture of the distribution is hub and spoke style
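A hub-and-spoke distribution of this kind can be illustrated with a toy model in which a central hub forwards one query to every node and collects only the per-site results. This is a sketch under stated assumptions, not SCANNER's portal: the `Hub` and `Node` classes, the site names, and the patient records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A spoke: holds its own patient data and runs queries locally."""
    name: str
    patients: List[dict]

    def run(self, predicate: Callable[[dict], bool]) -> int:
        return sum(1 for p in self.patients if predicate(p))


@dataclass
class Hub:
    """The central hub: forwards a query to each node and gathers the counts."""
    nodes: List[Node] = field(default_factory=list)

    def distribute(self, predicate: Callable[[dict], bool]) -> Dict[str, int]:
        return {node.name: node.run(predicate) for node in self.nodes}


hub = Hub(nodes=[
    Node("site_a", [{"age": 70}, {"age": 45}, {"age": 81}]),
    Node("site_b", [{"age": 66}, {"age": 30}]),
])
counts = hub.distribute(lambda p: p["age"] >= 65)
```

In this arrangement the nodes never talk to each other; every request and every response passes through the hub.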
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, RxNorm, LOINC, CPT
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Observational Medical Outcomes Partnership (OMOP)
4.d.iii. How are the data transformed and mapped? SQL scripts, ETL
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
If studies require introducing additional concepts to the OMOP vocabulary, the OMOP vocabulary is augmented.
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographics, encounters, procedures, medications, labs, vitals, and conditions
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)? If data are aggregated, options include summary statistics from regressions executed locally on each node.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network?
Options include custom implementations of multivariate statistics, as well as features of the Weka package that are installed on every node
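One way to picture sharing summary statistics instead of patient rows is a simple linear regression in which each node reports only a handful of sums and the hub combines them. The report does not describe SCANNER's actual method, so this sketch swaps in ordinary least squares with one predictor as an assumed, illustrative technique; the function names and the data are hypothetical.

```python
def local_summary(xs, ys):
    """Computed at a node: sufficient statistics for y = intercept + slope * x."""
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxx": sum(x * x for x in xs),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
    }


def combine(summaries):
    """Computed at the hub: pool the node summaries and solve for the coefficients."""
    total = {key: sum(s[key] for s in summaries) for key in summaries[0]}
    n, sx, sy = total["n"], total["sx"], total["sy"]
    slope = (n * total["sxy"] - sx * sy) / (n * total["sxx"] - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data held at two nodes; only the summaries leave each site.
node_a = local_summary([1.0, 2.0], [3.0, 5.0])
node_b = local_summary([3.0, 4.0], [7.0, 9.0])
slope, intercept = combine([node_a, node_b])
```

For ordinary least squares the pooled sums give the same coefficients as fitting on the combined rows, which is why only five numbers per node need to be shared in this simplified case.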
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? ETL and source data management is left to the discretion of each site.
Criteria Answers
1.a. How many people does the network cover or involve? 65,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Covers vascular procedure studies at multiple hospitals
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes, same condition
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Use of protamine sulfate to reverse heparin anticoagulation during carotid endarterectomy, Stone et al, J Vasc Surg, 2010
1.b.i.1. Demographics: racial/ethnic
White: 88.6%; Black or African American: 7.9%; Hispanic or Latino: 3.1%; Asian: 0.6%; American Indian or Alaska Native: 0.2%; Native Hawaiian or Other Pacific Islander: 0.1%; Unknown/Other: 2.6%; More than 1 race: 0.1%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age 69.2 +/- 11 years (range 15-103)
1.b.i.4. Demographics: gender Male: 63.5%; Female: 36.5%
1.c.i. What is the total annual budget? $2,100,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $1,500,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $650,000
1.c.ii. What are the current sources of funding? Participating center annual subscription fees
1.c.iii. How much does it cost each year to maintain and update the network? $1,500,000
1.d. How many years has this network existed? 2
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Vascular surgery procedures
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
Society for Vascular Surgery Vascular Quality Initiative (SVS VQI)
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Participants who want to be involved with VQI must sign agreements with M2S (Network security provider) and SVS PSO (Patient Safety Organization). Once these agreements have been signed and approved, the participant must pay annual fees.
1.g.iii.1.b. Policies for sharing data outside the network
Participants who want to be involved with VQI must sign agreements with M2S (Network security provider) and SVS PSO (Patient Safety Organization). Once these agreements have been signed and approved, the participant must pay annual fees.
1.g.iii.1.c. Policies for protecting proprietary data Utilize the AHRQ-listed Patient Safety Organization; data stored in the network are all de-identified
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Nolan BW, De Martino RR, Goodney PP, Schanzer A, Stone DH, Butzel D, Kwolek CJ, Cronenwett JL; Vascular Study Group of New England. Comparison of carotid endarterectomy and stenting in real world practice using a regional quality improvement registry. J Vasc Surg. 2012;55:990-6.
2) Simons JP, Schanzer A, Nolan BW, Stone DH, Kalish JA, Cronenwett JL, Goodney PP; Vascular Study Group of New England. Outcomes and practice patterns in patients undergoing lower extremity bypass. J Vasc Surg. 2012;55:1629-36.
3) Wallaert JB, Nolan BW, Adams J, Stanley AC, Eldrup-Jorgensen J, Cronenwett JL, Goodney PP. The impact of diabetes on perioperative outcomes following lower-extremity bypass surgery. J Vasc Surg. 2012;56:1317-23.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Simons JP, Schanzer A, Nolan BW, Stone DH, Kalish JA, Cronenwett JL, Goodney PP; Vascular Study Group of New England. Outcomes and practice patterns in patients undergoing lower extremity bypass. J Vasc Surg. 2012;55:1629-36.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By providing claims data and manually entering data into a web form
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
PATHWAYS is a cloud-based platform which stores information directly into a database at a central data warehouse managed by M2S. Unique username-password combinations authenticate users and permit access only to the appropriate content. All passwords are stored using a one-way hash encryption process with a custom salt. Passwords expire every 180 days and cannot be reused for five generations. This ensures that the user is the only person who knows his or her password. PATHWAYS will also automatically log the user out of his or her session after 15 minutes of inactivity. To protect accounts from malicious attacks, users will be locked out of the system after five consecutive unsuccessful attempts to log in. The database manager will then need to unlock the account before the user can log in again. PATHWAYS utilizes 256-bit SSL encryption protocols, which is the same technology used by online banking and financial institutions, as well as healthcare providers, to protect their customers’ personal information. M2S registry users do not interface directly with the database server, but rather connect to the registry through a separate server, or “proxy” server. This proxy server filters all communication between the clients and the database and prevents unauthorized users from accessing the registry data. Communication from authorized users is relayed by the proxy server to the database through M2S’s internal firewall. Registry data is never stored on the proxy server, which greatly reduces the possibility for data to be lost, stolen, or accessed by an unauthorized party. PATHWAYS protects PHI by preventing the browser from caching sensitive data. Furthermore, PATHWAYS does not require ActiveX or Java plug-ins to run, and never writes PHI to the user’s computer.
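Two of the controls described above, salted one-way password hashing and lockout after five consecutive failed attempts, can be sketched as follows. The report does not name the hash function or the salting scheme, so PBKDF2-HMAC-SHA256 with a random per-account salt is an assumption chosen for illustration; the `Account` class and its methods are hypothetical and are not the PATHWAYS implementation.

```python
import hashlib
import hmac
import os

MAX_FAILED_ATTEMPTS = 5  # lockout threshold stated in the text above


def hash_password(password: str, salt: bytes) -> bytes:
    """One-way salted hash (assumed algorithm: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class Account:
    def __init__(self, password: str):
        self.salt = os.urandom(16)  # random per-account salt (assumption)
        self.password_hash = hash_password(password, self.salt)
        self.failed_attempts = 0
        self.locked = False

    def log_in(self, password: str) -> bool:
        """Return True on success; lock the account after five consecutive failures."""
        if self.locked:
            return False
        candidate = hash_password(password, self.salt)
        if hmac.compare_digest(candidate, self.password_hash):
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        if self.failed_attempts >= MAX_FAILED_ATTEMPTS:
            self.locked = True
        return False

    def unlock(self) -> None:
        """Stands in for the database manager unlocking the account."""
        self.failed_attempts = 0
        self.locked = False


account = Account("correct horse")
results = [account.log_in("wrong guess") for _ in range(5)]
locked_after_failures = account.locked
rejected_while_locked = not account.log_in("correct horse")
account.unlock()
accepted_after_unlock = account.log_in("correct horse")
```

Only the salt and the derived hash are stored, so the stored value cannot be reversed to recover the password; a locked account rejects even the correct password until it is unlocked.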
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? CPT, ICD-9
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown, they use a data dictionary
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographic, risk factor, major outcomes, and complication data
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
The data are aggregated before they leave the site and are then sent electronically and securely to the researcher requesting the data
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 11,800,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
No
1.a.ii.1. Can the network be used for new studies in the same or a different condition? No, not yet
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $1,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $1,000,000
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? NCATS and NIH
1.c.iii. How much does it cost each year to maintain and update the network? $1,000,000
1.d. How many years has this network existed? 2
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? No, none yet
1.e.i.1. What does the network focus on? Not applicable
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Does not share data outside network
UC-Research eXchange (UCReX)
1.g.iii.1.b. Policies for sharing data outside the network Does not share data outside network
1.g.iii.1.c. Policies for protecting proprietary data
Queries return only aggregate counts. Aggregate numbers are blurred (obfuscated), so the counts returned are an estimate of the number of patients meeting the query criteria at each institution. No personally identifiable patient information ever leaves an individual institution.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Provide access to EHR
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
Each user of the system must be authenticated at their individual institution to verify employment and faculty status. All communications are encrypted using standards approved by the W3C. Institution-specific user log-in credentials never leave an individual institution. Users must register the topics they would like to query with the Data Steward application. The Data Steward administrator manually reviews all query requests to make sure they are in compliance. Actual query histories are logged and audited on a regular basis to ensure that there have been no violations of the Terms and Conditions.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
The database can be queried either by ICD-9 codes for diagnoses or by demographics (no standardized terminologies used); results are returned as counts
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? LOINC, SNOMED-CT, CPT-4, ICD-9, UCUM, RxNorm, HL7
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? i2b2
4.d.iii. How are the data transformed and mapped? SQL scripts, ETL
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Standardize data types and date ranges
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Age, Ethnicity, Gender, Language, Marital Status, Race, Religion, Diagnosis, and procedure data
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Only counts are being aggregated locally and then sent out to the central node
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? None
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? Custom ETL tool
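The UCReX answers above describe queries that return only aggregate counts, blurred locally so that each institution's figure is an estimate. The following is a minimal sketch of one way such blurring might be done. The Gaussian noise model, the `sigma` and `threshold` values, and the function names are all illustrative assumptions and are not documented UCReX behavior.

```python
import random


def obfuscate_count(true_count, sigma=1.5, threshold=10, rng=None):
    """Return a blurred patient count for one site.

    sigma and threshold are hypothetical values chosen for illustration.
    """
    rng = rng or random.Random()
    if true_count < threshold:
        # Small cells are reported only as "fewer than threshold".
        return f"<{threshold}"
    # Add zero-mean Gaussian noise so the reported value is an estimate.
    noisy = true_count + rng.gauss(0, sigma)
    return max(threshold, round(noisy))


def aggregate_sites(site_counts, **kwargs):
    """Blur each site's count locally; only blurred values reach the hub."""
    return {site: obfuscate_count(n, **kwargs) for site, n in site_counts.items()}
```

For example, `aggregate_sites({"site_a": 4, "site_b": 120})` would report `"<10"` for the first site and a value near 120 for the second; no patient-level rows are involved at any point.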
Criteria Answers
1.a. How many people does the network cover or involve? 4,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The network consists of hospitals across the state of Wisconsin and conducts studies on a multitude of conditions
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies? Confidential
1.c.ii. What are the current sources of funding? Confidential
1.c.iii. How much does it cost each year to maintain and update the network? Confidential
1.d. How many years has this network existed? 8
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? No - the network does not have a focus because it covers a wide range of hospitals across the state and sees a large population
1.e.i.1. What does the network focus on? Not applicable
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Currently the institutions involved within our network participate in the Wisconsin Institutional Review Board (IRB) Consortium (WIC). This leads to a shared vision of human subjects protection priorities as well as shared Standard Operating Procedures.
Wisconsin Network for Health Research (WiNHR)
1.g.iii.1.b. Policies for sharing data outside the network
To use data within the WiNHR network, a researcher must have at least 2 WiNHR sites that have agreed to participate before obtaining the data
1.g.iii.1.c. Policies for protecting proprietary data All data are HIPAA compliant and de-identified
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) WMJ. 2011 Apr; 110(2):68-73. “The differential diagnosis of pulmonary blastomycosis using case vignettes: a Wisconsin Network for Health Research (WiNHR) study.” Baumgardner DJ, Temte JL, Gutowski E, Agger WA, Bailey H, Burmester JK, Banerjee I.
2) WMJ. 2009 Dec; 108(9):453-8. “The Wisconsin Network for Health Research (WiNHR): a statewide, collaborative, multi-disciplinary, research group.” Bailey H, Agger W, Baumgardner D, Burmester JK, Cisler RA, Evertsen J, Glurich I, Hartman D, Yale SH, DeMets D.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHR data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Genetic analyses
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use? Source data are protected and managed by the OnCore security framework
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Each integration channel is supported through a set of internal APIs. The APIs can be used to either automate the process of determining what data become part of OnCore or, if necessary, allow users to review the data to manually determine what should be transferred to OnCore.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
XMAPS
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Demographics, labs, pathology, vitals, medications, conditions, procedures, and treatments
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? No
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? Built-‐in protocol in OnCore
Criteria Answers
1.a. How many people does the network cover or involve? 150,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
23andMe mainly conducts studies on Parkinson's disease, but its researchers also conduct other genetic studies using customers' data
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Mostly White and 10,000 African Americans (Roots into the Future project)
1.b.i.2. Demographics: geography NY and CA
1.b.i.3. Demographics: age 30-65
1.b.i.4. Demographics: gender Male: 60%, Female: 40%
1.c.i. What is the total annual budget? $50,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Factored into the staffing costs
1.c.i.2. How much of that budget is dedicated to conducting studies? Roots into the Future project
1.c.ii. What are the current sources of funding? Venture funding, subscription costs
1.c.iii. How much does it cost each year to maintain and update the network? Factored into the staffing costs
1.d. How many years has this network existed? 4
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Parkinson's Disease, Sarcoma, and Myeloproliferative Neoplasms
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Have control on changing privacy settings and consent status
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Currently does not share data with institutional investigators
1.g.iii.1.b. Policies for sharing data outside the network Does not share data outside of the network
23andMe
IV. Inventories of PPRNs
1.g.iii.1.c. Policies for protecting proprietary data Investigators do not have access to personally identifying "Registration Information"
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals None but have research findings on their website: https://www.23andme.com/about/factoids/
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By referring patients to studies (esp. Parkinson's disease)
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? DNA sequencing
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use? Security audits, Telepost Kabel-Service (tks) protocol, Transport Layer Security (TLS) protocol
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Demographics, Genetic Data, Conditions, Medications, Outcomes
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Use customized scripts to extract drug names from free text
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data is located at one site already
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? R, Python scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
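The 23andMe answers above mention customized scripts that extract drug names from free text. As a minimal sketch of what a script of that kind could look like, the example below does a simple dictionary lookup over lowercased word tokens. The lexicon contents, the function name, and the tokenization rule are hypothetical and chosen only for illustration; the report does not describe the actual scripts.

```python
import re

# Hypothetical lexicon; a real script would load a much larger vocabulary.
DRUG_LEXICON = {"levodopa", "carbidopa", "ibuprofen", "metformin"}


def extract_drug_names(text, lexicon=DRUG_LEXICON):
    """Return lexicon terms found in free text, in order of first appearance."""
    found = []
    # Lowercase the text and split it into alphabetic tokens.
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in lexicon and token not in found:
            found.append(token)
    return found
```

A lookup like this misses misspellings and multi-word names, which is one reason a rule-based script is usually paired with a curated vocabulary.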
Criteria Answers
1.a. How many people does the network cover or involve? Not available
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Users interested in a new condition can start a new list by contacting the list coordinator
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 18
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Virtual cancer support groups
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No - studies are conducted using data collected by this website
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
The data required from the users by ACOR are e-‐mail and name. It is at the discretion of the user as to what other personal identifying information they choose to share on a message board
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Information collected about patients may be used by researchers in aggregate form only. All surveys initiated by a third party must be approved by ACOR before being posted
1.g.iii.1.b. Policies for sharing data outside the network
Information collected about patients may be used by researchers in aggregate form only. All surveys initiated by a third party must be approved by ACOR before being posted
1.g.iii.1.c. Policies for protecting proprietary data Proprietary data is only released in aggregate form.
ACOR
2.a. Three most recent (or high impact) studies published in peer-reviewed journals Not applicable
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Not applicable
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Not available
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Name, Email, list subscriptions, disease subtopic interests
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 371,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Have matched women volunteers to 71 studies
1.a.ii.1. Can the network be used for new studies in the same or a different condition? No
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography See Table 3
1.b.i.3. Demographics: age See Table 2
1.b.i.4. Demographics: gender Male: 0.3%, Female: 99.7%
1.c.i. What is the total annual budget? $300,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not applicable
1.c.ii. What are the current sources of funding? Avon Foundation for Women
1.c.iii. How much does it cost each year to maintain and update the network? $250,000
1.d. How many years has this network existed? 5
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Primarily on matching women who would like to participate in breast cancer studies with researchers
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Researchers must register and submit an application to Army of Women with information about themselves, including a CV, and about the study they would like to conduct. They must also submit IRB documentation. The Science Advisory Committee reviews the application and, once it is approved, the researcher is assigned to two Army of Women advocates who will aid in the research
The Dr. Susan Love Research Foundation’s Love/Avon Army of Women
1.g.iii.1.b. Policies for sharing data outside the network
Researchers must register and submit an application to Army of Women with information about themselves, including a CV, and about the study they would like to conduct. They must also submit IRB documentation. The Science Advisory Committee reviews the application and, once it is approved, the researcher is assigned to two Army of Women advocates who will aid in the research
1.g.iii.1.c. Policies for protecting proprietary data None
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By referring patients
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Hosted on a secure website using firewalls
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Name, age, city, and state of residence
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
When making presentations at scientific conferences
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Army of Women Demographics (February 20, 2013)
Ethnicity         # members  Percent total
Caucasian         321,000    86.35%
African American  12,776     3.44%
Hispanic/Latina   12,344     3.32%
Asian             4,031      1.08%
Native American   1,697      0.46%
Pacific Islander  648        0.17%
Other             5,945      1.6%
None Selected     13,286     3.57%
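As a consistency check on the ethnicity table, the percentages can be recomputed from the member counts. The short Python sketch below assumes that each percentage is the group's count divided by the sum of all rows, including "None Selected"; that denominator is an inference from the figures rather than something the report states, and the variable names are illustrative only.

```python
# Member counts by ethnicity, copied from the Army of Women demographics table.
counts = {
    "Caucasian": 321_000,
    "African American": 12_776,
    "Hispanic/Latina": 12_344,
    "Asian": 4_031,
    "Native American": 1_697,
    "Pacific Islander": 648,
    "Other": 5_945,
    "None Selected": 13_286,
}

# Assumption: the denominator is the sum of every row in the table.
total = sum(counts.values())

# Percent share of each group, rounded to two decimals as in the table.
shares = {group: round(100 * n / total, 2) for group, n in counts.items()}

for group, pct in shares.items():
    print(f"{group}: {counts[group]:,} members, {pct:.2f}% of {total:,}")
```

Under that assumption the recomputed shares agree with the published column to two decimal places, which suggests the table is internally consistent.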
Year of Birth (Ordered by Year)
Year    # Women  % Total
1910    6        0%
1912    1        0%
1913    1        0%
1914    1        0%
1916    2        0%
1917    2        0%
1918    2        0%
1919    3        0%
1920    44       0.01%
1921    35       0.01%
1922    40       0.01%
1923    68       0.02%
1924    79       0.02%
1925    118      0.03%
1926    154      0.04%
1927    205      0.06%
1928    270      0.07%
1929    297      0.08%
1930    452      0.12%
1931    531      0.14%
1932    660      0.18%
1933    731      0.2%
1934    1002     0.27%
1935    1290     0.35%
1936    1555     0.42%
1937    1956     0.53%
1938    2472     0.67%
1939    2844     0.77%
1940    3587     0.96%
1941    4362     1.17%
1942    5524     1.49%
1943    6022     1.62%
1944    6079     1.64%
1945    6519     1.75%
1946    8743     2.35%
1947    9959     2.68%
1948    10011    2.69%
1949    9887     2.66%
1950    10102    2.72%
1951    10445    2.81%
1952    10804    2.91%
1953    10775    2.9%
1954    10932    2.94%
1955    10798    2.9%
1956    10467    2.82%
1957    10768    2.9%
1958    10269    2.76%
1959    9904     2.66%
1960    9595     2.58%
1961    9240     2.49%
1962    8802     2.37%
1963    8774     2.36%
1964    8471     2.28%
1965    7822     2.1%
1966    7385     1.99%
1967    7235     1.95%
1968    7253     1.95%
1969    7402     1.99%
1970    7496     2.02%
1971    6936     1.87%
1972    6309     1.7%
1973    5914     1.59%
1974    5949     1.6%
1975    5765     1.55%
1976    5599     1.51%
1977    5712     1.54%
1978    5464     1.47%
1979    5437     1.46%
1980    5234     1.41%
1981    5038     1.36%
1982    4500     1.21%
1983    4106     1.1%
1984    3698     0.99%
1985    3135     0.84%
1986    2553     0.69%
1987    2051     0.55%
1988    1677     0.45%
1989    1346     0.36%
1990    4056     1.09%
1991    550      0.15%
1992    292      0.08%
1993    111      0.03%
1994    41       0.01%
1995    2        0%
Members by State
State                 # Women  % Total
California            38814    10.44%
None Selected         30460    8.19%
New York              22562    6.07%
Florida               21744    5.85%
Texas                 18913    5.09%
Pennsylvania          14218    3.82%
Illinois              13266    3.57%
Massachusetts         12426    3.34%
Michigan              12208    3.28%
Ohio                  12098    3.25%
Virginia              12093    3.25%
New Jersey            11198    3.01%
Georgia               10622    2.86%
North Carolina        10461    2.81%
Maryland              9921     2.67%
Washington            7956     2.14%
Colorado              7542     2.03%
Arizona               7234     1.95%
Wisconsin             7226     1.94%
Indiana               6513     1.75%
Minnesota             6092     1.64%
Missouri              5780     1.55%
Connecticut           5766     1.55%
Tennessee             5207     1.4%
Oregon                4937     1.33%
South Carolina        4625     1.24%
Alabama               3716     1%
Iowa                  3661     0.98%
Kentucky              3562     0.96%
Kansas                3090     0.83%
Maine                 3087     0.83%
New Hampshire         2679     0.72%
Oklahoma              2607     0.7%
Louisiana             2401     0.65%
Nevada                2167     0.58%
New Mexico            2134     0.57%
Rhode Island          2046     0.55%
Nebraska              2041     0.55%
Arkansas              1987     0.53%
Idaho                 1840     0.49%
Utah                  1750     0.47%
West Virginia         1579     0.42%
Mississippi           1396     0.38%
Delaware              1320     0.36%
Vermont               1148     0.31%
Montana               1074     0.29%
District of Columbia  1068     0.29%
Alaska                1021     0.27%
Hawaii                787      0.21%
Ontario               780      0.21%
South Dakota          618      0.17%
North Dakota          591      0.16%
Wyoming               535      0.14%
British Columbia      336      0.09%
Alberta               221      0.06%
Puerto Rico           189      0.05%
Quebec                88       0.02%
Nova Scotia           75       0.02%
Manitoba              64       0.02%
Saskatchewan          60       0.02%
New Brunswick         47       0.01%
AE                    29       0.01%
Newfoundland          16       0%
Prince Edward Island  12       0%
AP                    7        0%
Yukon Territory       6        0%
Criteria Answers
1.a. How many people does the network cover or involve? 1,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The network has just signed contracts that will double the number of users in March 2013.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? http://www.asthmapolis.com/wp-content/uploads/2012/12/Quality-Measures-White-Paper.pdf
1.b.i.1. Demographics: racial/ethnic High levels of Hispanics and African Americans
1.b.i.2. Demographics: geography Louisville, KY; Sacramento, CA; Florida; Boston; Hawaii; Seattle
1.b.i.3. Demographics: age Ages 5 and older
1.b.i.4. Demographics: gender Same as the census
1.c.i. What is the total annual budget? Not available (spending increases ~10% per month)
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Walgreens is a major funder
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? Since summer 2012
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Improving the management of asthma
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific - The patient decides to share specific information with their care team, family, or friends. The patient can give or remove access to their data instantly through their personal profile.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
The patient owns and controls their data. Asthmapolis is opt-in and the patient can change their preferences about which data, if any, they share.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Business associate agreements, IRB and advisory board activities
1.g.iii.1.b. Policies for sharing data outside the network Business associate agreements, IRB and advisory board activities
1.g.iii.1.c. Policies for protecting proprietary data The partner hospitals own the data, the patients own the data
Asthmapolis
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
Van Sickle, D, Magzamen, S, Truelove, S, and Morrison, T. “Remote Monitoring of Inhaled Bronchodilator Use and Weekly Feedback About Asthma Management: An Open-Group Short-Term Pilot Study of the Impact on Asthma Control.” PLoS One (2013): XX(XX):XX-XX. Publication is under embargo.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Asthmapolis brings remote monitoring to asthma epidemiology by providing the first real-time geospatial view of where asthma symptoms are occurring and asthma inhalers are used
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
For McKesson or Allere healthways, Asthmapolis is the replacement for their disease management systems. Dignity Health Systems and the VA in Seattle use Asthmapolis for their COPD patients. Asthmapolis then uses the data collected at these sites for research.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Currently underway at Dignity Health Systems in Sacramento, CA.
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
Multiple networks - frequency hopping protocol for Bluetooth, SSL encryption for all the data, everything is run on Amazon Web Services
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
The SQL server pulls data being recorded by the apps and the web. No data is held locally on the phone or personal computer. A provider or researcher logs onto a secure portal using a secure login and the researcher can download the data that they need from the website.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Activity limitations, symptoms triggers, day to day burden and management, inhaler medications (daily and as needed medications), frequency, time, and location of rescue medication, diagnostic results
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Sometimes, shown in an aggregate way to the patients themselves
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
It is based on who the information is being given to (researcher, patient, doctor)
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? BETA program of about 100 users
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Mongo using C++
Criteria Answers
1.a. How many people does the network cover or involve? 923
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Currently only focusing on three areas: Breast Cancer, Fanconi Anemia, and Real Names (Amyotrophic Lateral Sclerosis (ALS) and Parkinson's)
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Confidential
1.b.i.2. Demographics: geography Confidential
1.b.i.3. Demographics: age Confidential
1.b.i.4. Demographics: gender Confidential
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies? $200,000
1.c.ii. What are the current sources of funding? Confidential
1.c.iii. How much does it cost each year to maintain and update the network? Confidential
1.d. How many years has this network existed? 9 months
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Breast Cancer, ALS, Parkinson's, Fanconi Anemia
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad - Portable Legal Consent (PLC) is a standardized informed consent system for anyone who has obtained data relevant to their health and would like to donate that data for research purposes. PLC works by running volunteers through a short process in which they learn about informed consent, sign an IRB-approved informed consent form, and then upload the data they have chosen for donation. The existing PLC system does not transmit “identified” data; donors must indicate that they understand there are some risks of re-identification and harm in volunteering for donation. For the purposes of the RNDP it will be necessary to rewrite the PLC to recognize that all RNDP participants will willingly provide their own names and genomic data.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad - Portable Legal Consent (PLC) is a standardized informed consent system for anyone who has obtained data relevant to their health and would like to donate that data for research purposes. PLC works by running volunteers through a short process in which they learn about informed consent, sign an IRB-approved informed consent form, and then upload the data they have chosen for donation. The existing PLC system does not transmit “identified” data; donors must indicate that they understand there are some risks of re-identification and harm in volunteering for donation. For the purposes of the RNDP it will be necessary to rewrite the PLC to recognize that all RNDP participants will willingly provide their own names and genomic data.
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
The users decide what information they provide when signing up and whether they participate in answering surveys.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
BRIDGE
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Portable Legal Consent (PLC) is a standardized informed consent system for anyone who has obtained data relevant to their health and would like to donate that data for research purposes. PLC works by running volunteers through a short process in which they learn about informed consent, sign an IRB-approved informed consent form, and then upload the data they have chosen for donation. The existing PLC system does not transmit “identified” data; donors must indicate that they understand there are some risks of re-identification and harm in volunteering for donation. For the purposes of the RNDP it will be necessary to rewrite the PLC to recognize that all RNDP participants will willingly provide their own names and genomic data.
1.g.iii.1.b. Policies for sharing data outside the network Not available
1.g.iii.1.c. Policies for protecting proprietary data
Personal information is collected but is encrypted using the SSL protocol. Personal information may be used or disclosed without separate consent to provide information about BRIDGE or other issues of interest, to inform users about new studies of interest, or to meet legal requirements. BRIDGE does not sell personal information without prior written consent.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood
3.c. What types of analysis are done on them?
Whole genome sequence for each patient (from whole blood); serial draw whole blood transcriptomics data; serial draw blood serum proteomics data; serial draw blood serum metabolomics data
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Sequencing
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use? Web 2.0, the secure server software receives encrypted information through the Secure Sockets Layer (SSL)
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution? REST
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographics, conditions, medications, and genomic data
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
The REST querying system aggregates the data
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? R, Bioconductor, Gene Pattern
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 15,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Approximately 10 studies are currently underway. The network is actively engaged in quality improvement and has automated population management and pre-visit planning tools that provide real-time clinical information and decision support.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Crandall WV, Boyle BM, Colletti RB, Margolis PA, Kappelman MD. Development of process and outcome measures for improvement: lessons learned in a quality improvement collaborative for pediatric inflammatory bowel disease. Inflamm Bowel Dis. 2011 Oct;17(10):2184-91
1.b.i.1. Demographics: racial/ethnic White: 85%; African American: 10%; Other: 5%
1.b.i.2. Demographics: geography National (27 states) + 1 site in London, England
1.b.i.3. Demographics: age 0 to 14 years: 40%; 15 to 17 years: 35%; >17 years: 25%
1.b.i.4. Demographics: gender Male: 55%; Female: 45%
1.c.i. What is the total annual budget? $2,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $30,000-$100,000/year
1.c.i.2. How much of that budget is dedicated to conducting studies? $925,500
1.c.ii. What are the current sources of funding? AHRQ Enhanced Registries grant and NIH-funded Transformative TR01 grant
1.c.iii. How much does it cost each year to maintain and update the network? $1,200,000
1.d. How many years has this network existed? 5
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Focuses primarily on Crohn's disease
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable - Under the new protocol, the entire population is included for clinical and QI purposes (with no consent required). It includes consent for research and limited research datasets (i.e., dates). The new IRB includes provisions for transferring data from legacy patients based on local IRB review.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Patients are able to change privacy settings and elect the amount of information they provide when registering and/or updating their condition status
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR and Registry data
Collaborative Chronic Care Network (C3N)
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Policy is a work in progress; not yet available
1.g.iii.1.b. Policies for sharing data outside the network Policy is a work in progress; not yet available
1.g.iii.1.c. Policies for protecting proprietary data
Queries return only aggregate counts. Aggregate numbers are blurred (or obfuscated), so that the counts returned are an estimate of the number of patients meeting the query criteria at each institution. No personally identifiable patient information ever leaves an individual institution.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Colletti RB, Baldassano RN, Milov DE, Margolis PA, Bousvaros A, Crandall WV, Crissinger KD, D'Amico MA, Day AS, Denson LA, Dubinsky M, Ebach DR, Hoffenberg EJ, Kader HA, Keljo DJ, Leibowitz IH, Mamula P, Pfefferkorn MD, Qureshi MA. Variation in care in pediatric Crohn disease. J Pediatr Gastroenterol Nutr. 2009 Sep;49(3):297-303
2) Kappelman MD, Crandall WV, Colletti RB, Goudie A, Leibowitz IH, Duffy L, Milov DE, Kim SC, Schoen BT, Patel AS, Grunow J, Larry E, Fairbrother G, Margolis P. Short pediatric Crohn's disease activity index for quality improvement and observational research. Inflamm Bowel Dis. 2011 Jan;17(1):112-7
3) Burt RS, Meltzer DO, Seid M, Borgert A, Chung JW, Colletti RB, Dellal G, Kahn SA, Kaplan HC, Peterson LE, Margolis P. What's in a name generator? Choosing the right name generators for social network surveys in healthcare quality and safety research. BMJ Qual Saf. 2012 Dec;21(12):992-1000.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? The registry contains data from all visits for each patient including standardized process of care measures and outcome measures. Beginning in January 2013, patient reported outcomes will begin to be measured
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
The network is actively collaborating with 50 clinical sites, many of which are large academic medical centers, including most of the largest children’s hospitals. There is senior leadership involvement at many, if not all, of these sites.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence?
The network has conducted a pilot center-based study involving planned allocation of centers to different combinations of chronic care management approaches across 30+ centers. The network is currently supporting a project involving randomization of treatments for individual patients as part of N-of-1 trials. Randomization of patients for RCTs has not been undertaken but is feasible.
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
Connections to the web-based registry front-end are encrypted using SSL. Users are given a unique username and password that needs to be changed on a periodic basis and must conform to certain characteristics. The servers are located in the hospital (Cincinnati Children's) data center, which is physically secured. The servers are on a protected network that is firewalled off from the hospital network and the internet. Access is controlled via an identity and access management appliance. Non-date PHI elements are stored in an encrypted database.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Users log in to the SHRINE web portal and enter a query based on conditions and demographics. The query is then sent to the participating sites, each of which aggregates its data and returns a count.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? LOINC, RxNorm
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? i2b2
4.d.iii. How are the data transformed and mapped? Utilizing the SHRINE network
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Centers are provided with a case report form and asked to modify their clinical visit forms to capture the necessary data in the medical record. For centers with an EMR, they can configure a form to capture that data directly at the point of care. This data can then be extracted from the EMR and uploaded to the registry. We are pushing for a model where there is one form for each of the major EMR vendors used by ImproveCareNow centers (Epic, Cerner, GE). We will create one mapping per vendor. This already exists for Epic and is in process for Cerner and GE. Centers who are not live with or do not have an EMR can abstract the data and perform double data entry into the registry webforms
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Registry data, EMR, patient-reported outcomes, daily symptoms, disease activity indices, PROMIS short-form survey, PDSQL, remote sensors, custom SMS queries
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Exploring these approaches
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
The data are aggregated at a central site.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Researchers can use the statistical tool available through SHRINE; extracts are also available for SAS.
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? Not available
Criteria Answers
1.a. How many people does the network cover or involve? 100
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Working on studies with Melanoma Research Foundation, Melanoma Research Alliance, Lung Cancer Foundation, Lung Cancer Alliance
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $2,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $200,000
1.c.i.2. How much of that budget is dedicated to conducting studies? None
1.c.ii. What are the current sources of funding? Pharmaceutical Companies, Stand up to Cancer, SEED Philanthropy
1.c.iii. How much does it cost each year to maintain and update the network? $200,000
1.d. How many years has this network existed? 6 months
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Building a community to bring Patients, Physicians, and Researchers together
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
They are able to control how much of their data are shared
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Share data with others using a Data Use Agreement
1.g.iii.1.b. Policies for sharing data outside the network Does not share outside of network yet
1.g.iii.1.c. Policies for protecting proprietary data Stores only de-identified data
Cancer Commons
Criteria Answers
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Not available
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Not available
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Refer patients, provide EHR, and participate in research
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Genomic sequencing to determine the subtype of cancer the patient has
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use? Third-party cloud server that is HIPAA-compliant
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Have a home grown mapping tool
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Demographic data, cancer subtype, treatments, biomarkers, outcomes
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not available
Criteria Answers
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 2,800
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The network began with a few hundred users and currently has 2,800.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography United States: 75%; Europe, Australia, Asia, and South Africa: 25%
1.b.i.3. Demographics: age See Chart 1
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Budget for 3 software developers
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? None
1.c.ii. What are the current sources of funding? Y Combinator, angel investors
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 2
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Crohn's disease and colitis
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Patients decide what personally identifying information to provide to the website.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data sharing has not taken place but in the future Crohnology would like to share data with institutional collaborators.
1.g.iii.1.b. Policies for sharing data outside the network
No patient information has been shared outside the network for research purposes yet, but Crohnology would like to share data outside the network in the future.
Crohnology
Criteria Answers
1.g.iii.1.c. Policies for protecting proprietary data No proprietary data
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals No published studies
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
No standardization yet necessary because the PPRN is so new
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? The server is kept in a secure location in Colorado.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Database queries are written in SQL. The researcher determines the output format, which can be graphic visualizations, Excel spreadsheets, etc.
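A minimal sketch of this query-then-export workflow follows, using Python's standard `sqlite3` and `csv` modules. The `ratings` table, its columns, and the sample rows are hypothetical and are not taken from Crohnology's actual schema; CSV stands in here for a spreadsheet-friendly output format.

```python
import csv
import io
import sqlite3

# In-memory database with a hypothetical table of self-reported wellness ratings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (treatment TEXT, wellness INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?)",
    [("diet_a", 4), ("diet_a", 2), ("drug_b", 5)],
)

# The researcher writes the query in SQL.
rows = conn.execute(
    "SELECT treatment, AVG(wellness) AS avg_wellness, COUNT(*) AS n "
    "FROM ratings GROUP BY treatment ORDER BY treatment"
).fetchall()

# The researcher chooses the output format; here, CSV text that a
# spreadsheet program can open.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["treatment", "avg_wellness", "n"])
writer.writerows(rows)
csv_text = buffer.getvalue()
conn.close()
```

The same `rows` could equally be passed to a plotting library to produce a visualization instead of a spreadsheet.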
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown standards
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Patient's birthday, date of diagnosis, dates of treatment use combined with overall user self-reported wellness scores, daily self-rated health rating, treatments patients are considering taking, patient's supplements, treatments, food (each one rated by self-reported overall wellness scores of users while taking the medication, and rated on a 1 to 5 star scale for quality of the treatment)
4.g.i. (Y/N) Does the network use natural language processing? No
Criteria Answers
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Chart 1
Criteria Answers
1.a. How many people does the network cover or involve? Not available
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The website is designed to allow users to create their own studies based on diseases and conditions of their interest.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 3
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Preventive medicine through crowdsourced health research studies especially focusing on health risk, drug response, and athletic performance.
1.f. (Y/N) Does the network use informed consent forms?
Yes, first when they become registered users of the system, and second when a user joins a study (each study has its own consent process).
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Either. Users can choose to share specific data on a study-by-study basis or to share their data broadly.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Users can initiate and become principal investigators for studies on this website. Users also can contribute data to the studies if they choose to join as a participant.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Study results are always reported back to the participants of the study. There are no data use agreements because institutional investigators are not involved in the studies.
1.g.iii.1.b. Policies for sharing data outside the network Only data that has been approved by users can be shared outside the network
DIYgenomics
Criteria Answers
1.g.iii.1.c. Policies for protecting proprietary data Data are protected by Genomera's privacy policy.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Swan, M. Scaling crowdsourced health studies: the emergence of a new form of contract research organization. Personalized Medicine 2012, Mar;9(2):223-234.
2) Swan, M., Hathaway, K., Hogg, C., McCauley, R., Vollrath, A. Citizen science genomics as a model for crowdsourced preventive medicine research. J Participat Med. 2010 Dec 23; 2:e20.
3) Swan, M. Emerging patient-driven health care models: an examination of health social networks, consumer personalized medicine and quantified self-tracking. Int. J. Environ. Res. Public Health 2009, 2, 492-525.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Highest-level encryption; browser connections use secure HTTP
4.b.i. (Y/N) Are queries distributed via a central hub? Principal investigator-users sign onto the Genomera platform and can download the data in formats including CSV or JSON.
4.b.ii. What is the architecture of the query distribution?
When a participant agrees to participate in the study, the participant's data generated from the study goes to the study's data collection and to the user's profile. The researcher can see the data flowing into their data collection portal and they also have access to links they can use to download the de-‐identified data from their study.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9/10 and HL-7 codes will be used after Meaningful Use Stage 2
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
Criteria Answers
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
E-mail, username, and optionally real name, current city of residence, birthdate, gender, and ancestry (a record field offered as an experiment). Users can also describe their interests in free form; examples include genetics, omega 3, and sleep. Specialized data types like a genome file can also be submitted. Each study has one or more data collection instruments: device-reported data (ZO sleep monitor), lab-reported data (urine analysis), and user-reported data (examples include demographic surveys and morning and evening evaluations).
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Varies based on the study
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 1000's
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The website is designed to allow users to create their own studies based on diseases and conditions of their interest.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Angel investors and venture capitalists
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 3
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? The network focuses on democratizing the process of conducting research, by allowing individuals who are not academic researchers to team with other users to conduct clinical-‐style research studies.
1.f. (Y/N) Does the network use informed consent forms? Yes, first when they become registered users of the system, and again for each study, since each study has its own consent process
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Either. Users can choose to share specific data on a study-by-study basis or to share their data broadly.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Users can initiate and become principal investigators for studies on this website. Users also can contribute data to the studies if they choose to join as a participant.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Study results are always reported back to the participants of the study. There are no data use agreements because institutional investigators are not involved in the studies.
1.g.iii.1.b. Policies for sharing data outside the network Only data that has been approved by users can be shared outside the network
Genomera
Criteria Answers
1.g.iii.1.c. Policies for protecting proprietary data Data is protected by Genomera's privacy policy.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
Swan, M. Crowdsourced health research studies: an important emerging complement to clinical trials in the public health research ecosystem. J Med Internet Res. 2012 Mar-Apr; 14(2):e46. (Reviewed by Paul Wicks, Thomas Pickard, and Ute Francke.)
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? http://genomera.com/studies/aging-risk-reduction-for-common-aging-conditions-through-monitoring-intervention
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
No standardization done
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Referring patients to Genomera for studies
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Highest-level encryption; browser connections use secure HTTP
4.b.i. (Y/N) Are queries distributed via a central hub? Principal investigator-users sign onto the Genomera platform and can download the data in formats including CSV or JSON.
4.b.ii. What is the architecture of the query distribution?
When a participant agrees to participate in the study, the participant's data generated from the study goes to the study's data collection and to the user's profile. The researcher can see the data flowing into their data collection portal and they also have access to links they can use to download the de-‐identified data from their study.
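As an illustration of working with such a download, the sketch below loads a de-identified export in either CSV or JSON form into the same list-of-records shape. The `load_export` helper, the field names (`participant_id`, `sleep_hours`), and the sample values are hypothetical; the actual layout of a Genomera export is not described in this report.

```python
import csv
import io
import json
from typing import Dict, List


def load_export(text: str, fmt: str) -> List[Dict[str, str]]:
    """Parse a downloaded study export into a list of records.

    `fmt` is "csv" or "json"; both are assumed to describe the same rows.
    """
    if fmt == "json":
        return json.loads(text)
    if fmt == "csv":
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]
    raise ValueError(f"unsupported export format: {fmt}")


# Made-up, de-identified sample exports for one hypothetical study.
csv_export = "participant_id,sleep_hours\np1,7.5\np2,6.0\n"
json_export = (
    '[{"participant_id": "p1", "sleep_hours": "7.5"},'
    ' {"participant_id": "p2", "sleep_hours": "6.0"}]'
)

records_from_csv = load_export(csv_export, "csv")
records_from_json = load_export(json_export, "json")
```

Normalizing both formats to one record shape lets downstream analysis code stay independent of which download link the researcher used.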
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9/10 and HL-7 codes will be used after Meaningful Use Stage 2
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
E-mail, username, and optionally real name, current city of residence, birthdate, gender, and ancestry (a record field offered as an experiment). Users can also describe their interests in free form; examples include genetics, omega 3, and sleep. Specialized data types like a genome file can also be submitted. Each study has one or more data collection instruments: device-reported data (ZO sleep monitor), lab-reported data (urine analysis), and user-reported data (examples include demographic surveys and morning and evening evaluations).
Criteria Answers
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Varies based on the study
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 25,883
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Mainly involves patients with Type 1 Diabetes
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes but only within the same condition
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic
White (Non-Hispanic): 81%; Black (Non-Hispanic): 5%; Hispanic or Latino: 8%; Native Hawaiian/Other Pacific Islander: 1%; Asian: 1%; American Indian/Alaskan Native: 1%; Other: 3%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age
< 6: 49%; 6-13: 27%; 13-18: 24%; 18-26: 15%; 26-31: 4%; 31-50: 13.3%; 50-65: 8.31%; >= 65: 2.74%
1.b.i.4. Demographics: gender Male: 50%; Female: 50%
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies? Confidential
1.c.ii. What are the current sources of funding? Helmsley Charitable Trust
1.c.iii. How much does it cost each year to maintain and update the network? Pays other sites $75 per patient to update data manually
1.d. How many years has this network existed? 1
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? An online community of type 1 diabetes
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Patients can provide as much or as little information as they feel comfortable with. However, any information provided up to the point at which a patient stops providing information remains in the database indefinitely for use in research
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
Glu
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Researchers will be able to request to use the information for their research. This might involve the Glu team performing analyses on the information and giving the researcher the results. It also could involve giving the researcher a dataset or information. There may be a charge to researchers when they request analyses of the information or a dataset. This charge is intended to cover the costs involved in collecting, storing, processing, and analyzing the information Glu members provide.
1.g.iii.1.b. Policies for sharing data outside the network
Researchers will be able to request to use the information for their research. This might involve the Glu team performing analyses on the information and giving the researcher the results. It also could involve giving the researcher a dataset or information. There may be a charge to researchers when they request analyses of the information or a dataset. This charge is intended to cover the costs involved in collecting, storing, processing, and analyzing the information Glu members provide.
1.g.iii.1.c. Policies for protecting proprietary data All information provided to researchers is de-identified and sometimes aggregated
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized controlled trials using the data collected in the network?
No - the network will start its first RCT in May 2013
2.d.i.1. What is the evidence?
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? DNA, RNA, peripheral blood mononuclear cells (PBMC), serum and plasma
3.c. What types of analysis are done on them? Metabolic measures including HbA1c, glucose and C-peptide. Immune and genetic measures such as HLA typing and diabetes-related autoantibodies
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them?
Metabolic measures including HbA1c, glucose and C-peptide. Immune and genetic measures such as HLA typing and diabetes-related autoantibodies
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use?
Data are entered on the Jaeb Center for Health Research's secure website through an SSL encrypted connection. The Jaeb Center websites are maintained on Unix and Linux servers running Apache web server software and on a Windows server running IIS, all with strong encryption. The study website is password-protected and restricted to users who have been authorized by the Jaeb Center to gain access.
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? RxNorm, MedDRA
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Nickname, mixology, My story/quote, E-mail, Designation (you have type 1 or are a caregiver), Date of Birth, Country, Terms and Conditions Consent, Data Use Consent, Gender, Race/Ethnicity, Zip code, Age of diagnosis, diagnosis scenario, insulin delivery method, other, information about when you developed diabetes, how your diabetes has been treated, blood sugar measurements, problems related to your diabetes, other medical problems you may have, blood tests that have been done, medicines that you take, whether anyone else in the family has diabetes, your education level (such as whether you completed high school), your family income level, what type of health insurance you have, if any, how you feel about your diabetes, problems in your life, information about your lifestyle, such as how much you exercise
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 315,274
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Not available
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Not available
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Not available
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 7
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Not available
1.e.i.1. What does the network focus on? Not available
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Users do not have to provide their information. To interact with other users and post content, they must provide additional information. Information in posts becomes public, so users are in control of what they post.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Not available
1.g.iii.1.b. Policies for sharing data outside the network Not available
1.g.iii.1.c. Policies for protecting proprietary data Not available
Inspire
2.a. Three most recent (or high impact) studies published in peer-reviewed journals Not available
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Not available
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Not available
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized controlled trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Not available
3.a. (Y/N) Does the network have biobanks? Not available
3.b. What types of biospecimens are collected? Not available
3.c. What types of analysis are done on them? Not available
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Not available
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Not available
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Not available
4.d.ii. Which CDM is used? Not available
4.d.iii. How are the data transformed and mapped? Not available
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Your inspiration, photo, relationship status, birthday, gender, zip code and country of residence, interests
4.g.i. (Y/N) Does the network use natural language processing? Not available
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Not available
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not available
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not available
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Not available
4.j.ii. What informatics tools are used? Not available
Criteria Answers
1.a. How many people does the network cover or involve? 15,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Focus mainly on the fitness and recreation of patients with diabetes but do want to facilitate research studies using the data collected
1.a.ii.1. Can the network be used for new studies in the same or a different condition? No
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Grassroots, Corporate Sponsorship, Helmsley Charitable Trust
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 7
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Diabetes self-management outreach, not specifically a research network/organization
1.f. (Y/N) Does the network use informed consent forms? No - not the site specifically
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable - If a research investigator would like to have individuals from the site participate in a study, it is the responsibility of that researcher to obtain consent from the patient
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable - If a research investigator would like to have individuals from the site participate in a study, it is the responsibility of that researcher to obtain consent from the patient
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Not available
1.g.iii.1.b. Policies for sharing data outside the network Not available
Insulindependence
1.g.iii.1.c. Policies for protecting proprietary data
"When you submit information via our Websites, we take efforts to protect your information both online and offline. Please keep in mind, however, that whenever you give out personal information online, such information is not always secure in transit. While we strive to protect your privacy and secure your information, we cannot guarantee the security of information sent over the Internet, and you disclose such information at your own risk."
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No - but the network is working to make this possible in the future
2.b.i. What is the evidence?
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized controlled trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Fairway Technologies is the security technology company supporting the website and database
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographics, medications, conditions, devices used, health status
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 4,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Conducts studies mainly on Waldenstrom's macroglobulinemia
1.a.ii.1. Can the network be used for new studies in the same or a different condition? No
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $500,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? A percentage of $500,000/year
1.c.i.2. How much of that budget is dedicated to conducting studies? $500,000
1.c.ii. What are the current sources of funding? Confidential
1.c.iii. How much does it cost each year to maintain and update the network? A percentage of $500,000/year
1.d. How many years has this network existed? 18
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Focuses mainly on Waldenstrom's macroglobulinemia
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable - Researchers wanting to use patient data from the Registry must get consent from the patients themselves on a specific study
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
By providing as much information about their condition and health status
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
To use the data, an institutional investigator must apply for a grant through IWMF and then go through a review process before being given access to the data to conduct the study
1.g.iii.1.b. Policies for sharing data outside the network
To use the data, an investigator must apply for a grant through IWMF and then go through a review process before being given access to the data to conduct the study
1.g.iii.1.c. Policies for protecting proprietary data Data stored are all de-identified
International Waldenstrom's Macroglobulinemia Foundation
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
MYD88 L265P in Waldenstrom's Macroglobulinemia, IgM Monoclonal Gammopathy, and other B-cell Lymphoproliferative Disorders using Conventional and Quantitative Allele-Specific PCR. Xu L, Hunter ZR, Yang G, Zhou Y, Cao Y, Liu X, Morra E, Trojani A, Greco A, Arcaini L, Varettoni M, Brown JR, Tai YT, Anderson KC, Munshi NC, Patterson CJ, Manning R, Tripsas C, Lindeman NI, Treon SP. Blood. 2013 Jan 15.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Researchers at these organizations participate in ongoing research with IWMF
2.d.i. (Y/N) Have there been any randomized controlled trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Not available
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Not available
4.d.ii. Which CDM is used? Not available
4.d.iii. How are the data transformed and mapped? Not available
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Blood properties, trends, treatments, demographics
4.g.i. (Y/N) Does the network use natural language processing? Not available
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Not available
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not available
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? Visited by 16,000,000 in the past year
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
There is a link for users to apply to start their own support groups
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Private philanthropists
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 8
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Support groups
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Patients choose what information to share on the message board
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
No data are offered to investigators
1.g.iii.1.b. Policies for sharing data outside the network No data are offered to investigators
MDJunction
1.g.iii.1.c. Policies for protecting proprietary data No proprietary data are collected
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Not available
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)?
No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Name, e-mail
4.g.i. (Y/N) Does the network use natural language processing? Not available
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? None available
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 12,000,000 site visitors monthly
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Yes - Conditions are added to the site based on what conditions receive the most hits on Google
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? 80% of the total budget
1.c.i.2. How much of that budget is dedicated to conducting studies? Confidential
1.c.ii. What are the current sources of funding? Confidential
1.c.iii. How much does it cost each year to maintain and update the network? Confidential
1.d. How many years has this network existed? 19
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? No
1.e.i.1. What does the network focus on? Not applicable
1.f. (Y/N) Does the network use informed consent forms? Yes - User consent by signing a disclaimer
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Patients decide what data they want to be shared with the public and with their healthcare providers.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data shared outside the PPRN contains no personally identifiable information. Third parties must agree that they will not attempt to make this information personally identifiable. The user has the option of granting access to personally identifiable information to their physician or hospital.
1.g.iii.1.b. Policies for sharing data outside the network Investigators from outside the network follow the same data sharing procedures as investigators inside the network.
1.g.iii.1.c. Policies for protecting proprietary data The data contain no personally identifiable information.
MedHelp
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
Cataract and intraocular implant surgery concerns and comments posted at two internet eye care forums. Hagan JC 3rd, Kutryb MJ. Mo Med. 2009 Jan-Feb;106(1):78-82.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Data are encrypted
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
When a researcher submits a query, the MedHelp team queries their database and sends the results back to the researcher
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
JSON
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Consumer collected data, from condition-specific health applications and Personal Health Records (PHRs)
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Confidential
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
The data are aggregated based on the needs of the researcher.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Google analytics and home grown analysis tools
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
PatientsLikeMe
Criteria Answers
1.a. How many people does the network cover or involve?
202,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
In April 2011, the platform expanded so that any patient with any condition (and multiple conditions) could use the system. To date, there are over 2,000 conditions registered on the system. There are over 30,000 patients with fibromyalgia or MS; over 10,000 patients with major depressive disorder, chronic fatigue syndrome, or generalized anxiety disorder; and over 5,000 with epilepsy, type 2 diabetes, Parkinson's disease, ALS, panic disorder, social anxiety disorder, PTSD, or rheumatoid arthritis. There are also substantial numbers of patients with rare conditions: for example, over 2,000 with kidney transplant; over 1,000 with cystic fibrosis; over 400 with primary lateral sclerosis, Devic's neuromyelitis optica, or progressive muscular atrophy; over 300 with polycystic kidney disease or idiopathic pulmonary fibrosis; and over 60 with the orphan disease alkaptonuria.
1.a.ii.1. Can the network be used for new studies in the same or a different condition?
Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Three peer-reviewed studies: A study in ALS, MS, Parkinson’s, HIV, Fibromyalgia, and mood disorders suggested a number of patient-reported benefits from using the system including better understanding of their condition and symptoms, better quality of life, and better medication adherence (Wicks P, Massagli M, Frost J, Brownstein C, Okun S, Vaughan T, Bradley R, Heywood J (2010) Sharing Health Data for Better Outcomes on PatientsLikeMe, Journal of Medical Internet Research, 12(2):e19).
A second study replicated these findings in epilepsy, and also found some evidence of better clinical outcomes (e.g., fewer ER admissions, fewer seizures) as well as a "dose-effect curve" of benefits against social interactions on the site (Wicks P, Keininger DL, Massagli MP, de la Loge C, Brownstein C, Isojarvi J, Heywood JA (2012) Perceived benefits of sharing health data between people with epilepsy on an online platform, Epilepsy & Behavior, 23:16-23).
An additional study assessing quality of care in epilepsy was used by the American Academy of Neurology to update how they train neurologists and in a submission to the National Quality Forum on quality of care in epilepsy (Wicks P & Fountain NB (2012) Patient assessment of physician performance of epilepsy quality-of-care measures, Neurology Clinical Practice, 2:335-345).
1.b.i.1. Demographics: racial/ethnic Among those reporting race: White: 85%; Black or African-American: 4%; Mixed Race: 4%; Prefer not to answer: 4%; Asian: 3%; American Indian or Alaskan Native: 1%; Native Hawaiian or other Pacific Islander: <1%
Among those reporting ethnicity: Non-Hispanic: 83%; Prefer not to answer: 10%; Hispanic: 6%
1.b.i.2. Demographics: geography Among those reporting location: USA: 80%; UK: 6%; Canada: 5%; Australia: 2%; 184 other countries: 1% or less
1.b.i.3. Demographics: age Among those reporting age: <10: 1%; 11-20: 2%; 21-30: 13%; 31-40: 21%; 41-50: 27%; 51-60: 23%; 61-70: 11%; 71-80: 3%; 81-90: <1%; 91+: <1%
1.b.i.4. Demographics: gender Among those reporting gender: Female: 72%; Male: 28%
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance?
Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies?
Confidential
1.c.ii. What are the current sources of funding?
The PatientsLikeMe Research Team has received research funding from Abbott, Acorda, The AKU Society, AstraZeneca, Avanir, Biogen, Boehringer Ingelheim, Genzyme, Johnson & Johnson, Merck, National Institutes of Health, Novartis, The Robert Wood Johnson Foundation, Sanofi, and UCB.
1.c.iii. How much does it cost each year to maintain and update the network?
Confidential
1.d. How many years has this network existed?
Founded in 2004, ALS community launched in 2006.
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)?
Yes
1.e.i.1. What does the network focus on? Any illness or medical condition, but with a historical emphasis on neurological conditions (e.g. ALS, Parkinson's, MS, epilepsy) and serious or disabling medical conditions (e.g. organ transplants, HIV, mood disorder, fibromyalgia)
1.f. (Y/N) Does the network use informed consent forms?
No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad - Patients are told upfront when they sign up that their information will be used for studies and will also be sold to partner companies for research purposes. In addition, when additional information is collected via surveys, there may be additional informed consent language specified by the respective partner companies' and institutions' IRBs.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study?
Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Patients control the use of their data through how much information they provide and share in their online profile. Users of PatientsLikeMe can opt in to sharing their profile and information with the public. Approximately 14% of users share their information in a manner accessible to the public. The remainder keep their data visible only to other members of the community.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Primarily self-reported but starting to include sensor data (e.g., voice, devices)
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
PatientsLikeMe has partnered with the University of Michigan to assist in an ongoing clinical trial by (1) providing aggregate-level statistics on patient-reported data through the PLM platform for all individuals who join PLM through the clinical trial, (2) conducting yearly surveys of those individuals, and (3) providing individual-level patient-reported data for those individuals who have given PLM permission to provide that individual-level data to the University of Michigan for purposes of the clinical trial. Additionally, a free and publicly available tool allows members (and non-members) to easily access all the trials registered on ClinicalTrials.gov. If they provide demographic information such as age, sex, and location, and the name of their condition, they are shown the trials most relevant to them.
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Institutional investigators need to contact the PatientsLikeMe research team and provide an initial research proposal. If the team feels that the research project will be interesting and beneficial to its users, PatientsLikeMe will assist in writing a grant proposal and help describe what it does for a local IRB.
1.g.iii.1.b. Policies for sharing data outside the network
"Please write to the research team with your initial research proposal. If we think a research project has the potential to benefit our users we would be happy to assist you in writing a grant proposal and helping to describe what we do for your local Internal Review Board (IRB). The proportion of funding we would receive depends on a number of factors including the contribution of our staff to the design, the difficulty of accessing the specific population of interest, and the source of funding."
1.g.iii.1.c. Policies for protecting proprietary data
Outside of what the user shares to the public and/or within the network, PatientsLikeMe does not share any personal identifying information
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Nakamura C, Bromberg M, Bhargava S, Wicks P, Zeng-Treitler Q. Mining Online Social Network Data for Biomedical Research: A Comparison of Clinicians’ and Patients’ Perceptions About Amyotrophic Lateral Sclerosis Treatments. J Med Internet Res 2012;14(3):e90
2) Bove R, Secor E, Healy BC, Musallam A, Vaughan T, Glanz BI, Weiner HL, Chitnis T, Wicks P, de Jager PL. Evaluation of an online platform for multiple sclerosis research: Patient description, validation of severity scale, and exploration of BMI effects on disease progression. PLoS ONE 2013;8(3):e59707
3) Wicks P, Vaughan TE, Massagli MP, Heywood J. Accelerated clinical discovery using self-reported patient data collected online and a patient-matching algorithm. Nature Biotechnology 2011;29:411–414
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? http://www.patientslikeme.com/research
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Standardized questionnaires, such as the EQ-5D, have been used (with permission from the licensors) to ensure comparability of populations across multiple studies.
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
The Veterans Administration (VA) is engaged in a research study called "Policy for Optimal Epilepsy Management" (POEM), which refers veterans with seizures to PLM to establish whether the platform helps improve self-efficacy. In another study, Movement Disorders specialists at Johns Hopkins are offering telemedicine consultations to PLM members with Parkinson's disease, using the information in their profiles to enhance the consult. Results from both studies are anticipated in 2013/14.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Clinical trial investigators have (unwittingly) had their patients sign up for PatientsLikeMe. This was described in a paper (Heywood, Vaughan, Wicks (2012) Waiting for p<0.05, Figshare, http://dx.doi.org/10.6084/m9.figshare.96802)
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected?
Not applicable
3.c. What types of analysis are done on them?
Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes?
No
3.d.i. What types of analyses do they conduct on them?
Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
"We follow the best practices in security (as per HIPAA Security Compliance). We use a respected, secure hosting provider for the site, which has signed a HIPAA compliance agreement and earned SAS Type II certification. We also use state of the art firewalls for our production servers, and our systems have been developed to prevent the most common security vulnerabilities. For secure browsing, we use 128-bit SSL encryption using Verisign certificates. Finally, when we do any testing and development work to the site, we use sanitized versions of the site, with all personally identification information stripped out."
4.b.i. (Y/N) Are queries distributed via a central hub?
No
4.b.ii. What is the architecture of the query distribution?
Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)?
Yes
4.c.ii. Which terminologies? Multiple UMLS terminologies including SNOMED-CT, ICD-10, ICF, HL7, MedDRA, unifying grammar, internal PatientsLikeMe Patient Vocabulary
4.d.i.(Y/N) Does the network use a common data model (CDM)?
No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped?
Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)?
No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
PatientsLikeMe Patient Vocabulary is a home-grown repository of symptoms, conditions, side effects, and treatments. It maps patient-entered terminology to standardized vocabularies including ICD-10, SNOMED-CT, and MedDRA.
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Biographical information, e.g., photograph, biography, gender, age, location (city, state and country), general notes; Condition/disease information, e.g., diagnosis date, first symptom, family history; Treatment information, e.g., treatment start dates, stop dates, dosages, side effects, treatment evaluations; Symptom information, e.g., severity, duration; Primary and secondary outcome scores over time, e.g., ALSFRS-R, MSRS, PDRS, FVC, PFRS, Mood Map, Quality of Life, weight, InstantMe; Laboratory results, e.g., CD-4 count, viral load, creatinine; Genetic information, e.g., information on individual genes and/or entire genetic scans; Individual and aggregated survey responses; Information shared via free text fields, e.g., the forum, treatment evaluations, surveys, annotations, journals, feeds, adverse event reports
4.g.i. (Y/N) Does the network use natural language processing?
NLP has been used for adverse event detection processes. NLP and machine learning will be used by end of 2013 for various purposes.
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Publicly shared data are reported in aggregate, based on demographic distribution by treatment and/or condition
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network?
Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? PatientsLikeMe developed a "User Voice Dashboard" where data not previously captured in their databases is triaged by a clinical team (RNs, PharmDs). These data are curated using internal data integrity conventions and informatics science.
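As an illustration of the approach described under 4.e.i.1 and 4.h.ii above (mapping patient-entered terminology to standardized vocabularies, then reporting in aggregate), the following is a minimal sketch only. It is not PatientsLikeMe's implementation: the lookup table, the function names, and the codes are hypothetical placeholders, and the codes are deliberately not real SNOMED-CT or MedDRA identifiers.

```python
from collections import Counter

# Hypothetical lookup table: patient-entered phrase -> (vocabulary, code).
# The codes are placeholders, NOT real SNOMED-CT or MedDRA identifiers.
PATIENT_VOCABULARY = {
    "brain fog": ("SNOMED-CT", "EXAMPLE-0001"),
    "foggy thinking": ("SNOMED-CT", "EXAMPLE-0001"),
    "pins and needles": ("MedDRA", "EXAMPLE-0002"),
}


def normalize(term):
    """Lowercase and collapse whitespace so simple variants match."""
    return " ".join(term.lower().split())


def map_term(term):
    """Return the (vocabulary, code) pair for a patient-entered term, or None."""
    return PATIENT_VOCABULARY.get(normalize(term))


def aggregate_by_code(entries):
    """Count entries per standardized code.

    Unmapped terms are counted under None so that a curator can review them.
    """
    return Counter(map_term(entry) for entry in entries)


entries = ["Brain  fog", "foggy thinking", "Pins and needles", "wobbly legs"]
counts = aggregate_by_code(entries)
```

In this sketch, two different patient phrasings collapse onto one placeholder code, and the unmapped entry is kept under `None` rather than discarded, which loosely mirrors the curation step described for data not previously captured.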
Criteria Answers
1.a. How many people does the network cover or involve? 2,428
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Personal genomes in progress: from the human genome project to the personal genome project. Lunshof JE, Bobe J, Aach J, Angrist M, Thakuria JV, Vorhaus DB, Hoehe MR, Church GM. PMID: 20373666; PMCID: PMC3181947
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? M. P. Ball et al. A public resource facilitating clinical use of genomes. Proc. Natl Acad. Sci. USA 13 July 2012 (doi:10.1073/pnas.1201904109)
1.b.i.1. Demographics: racial/ethnic Information not available in aggregate form
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Information not available in aggregate form
1.b.i.4. Demographics: gender Information not available in aggregate form
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 12
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Personal genomic sequencing
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
All data are publicly available
1.g.iii.1.b. Policies for sharing data outside the network All data are publicly available
Personal Genome Project
1.g.iii.1.c. Policies for protecting proprietary data All data are publicly available
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
M. P. Ball et al. A public resource facilitating clinical use of genomes. Proc. Natl Acad. Sci. USA 13 July 2012 (doi: 10.1073/pnas.1201904109)
G M Church. The Personal Genome Project. Molecular Systems Biology 1:2005.0030
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Not available
2.d.i.1. What is the evidence? Not available
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Tissue Samples
3.c. What types of analysis are done on them? Creation of cell lines, transformation into somatic cell-derived stem cells, DNA sequencing, gene expression, and the identification of bacteria and viruses in the specimen sample
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them?
The study of biological characteristics, including DNA, RNA (gene expression), physical traits, biochemical traits, and the presence and characteristics of microorganisms and viruses in the specimen.
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Not available
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Biometric data, conditions, medications, allergies, family members
Imaging data, EHR, procedures, test results, immunizations
Data from 23andMe, surveys, enrollment history, cell lines, genomic and phenotypic data
Complete Genomics-CGI sample, weight, fat mass, red blood cell count, white blood cell count, total PSA, total protein, RDW, platelet count, pH, occult blood, non-HDL cholesterol, nitrite, neutrophils, MPV, monocytes, MCV, LDL cholesterol, ketones, hyaline cast, hemoglobin, glucose, reflexive urine culture, sodium, triglycerides, calcium, AST, demographic information
4.g.i. (Y/N) Does the network use natural language processing? Not available
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Genome-Environment-Trait Evidence
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Not available
4.j.ii. What informatics tools are used? Not available
1.a. How many people does the network cover or involve? 16,000 signed up for MeetUps
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The MeetUp groups sponsored by Quantified Self are open to any citizen scientist (amateur or nonprofessional scientist) who would like to attend or present at a meeting.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Autodesk, Intel, 23andMe, Scanadu
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? Not available
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Fostering self-tracking and self-experimentation on health behaviors, conditions, etc.
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Patients create a profile for themselves at MeetUp.com, so they can see where Quantified Self meetings are taking place. They can also decide who can see their MeetUp profile information.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Quantified Self does not hold users' data
1.g.iii.1.b. Policies for sharing data outside the network Quantified Self does not hold users' data
1.g.iii.1.c. Policies for protecting proprietary data Quantified Self does not hold users' data
Quantified Self
2.a. Three most recent (or high impact) studies published in peer-reviewed journals None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes. Note that in this case the researchers are citizen scientists (amateur or nonprofessional scientists)
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes. The researchers are not third parties but citizen scientists, i.e., the users themselves
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
No study has required standardization
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Data are not collected by the network
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Yes. The network is creating tools (data aggregation systems) that help users who study themselves make sense of their data
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
1.a. How many people does the network cover or involve? English (tudiabetes.org): 27,000; Spanish (tuesdiabetes.org): 20,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Yes. Studies are survey-based and added routinely; TuAnalyze has the capacity to cover users internationally as well as nationally
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not collected
1.b.i.2. Demographics: geography US: 60+%, US with Canada, UK, India, and Australia: 90% of members
1.b.i.3. Demographics: age Average age mid 40s, 80% of members between ages 35 and 65
1.b.i.4. Demographics: gender Female: 60%
1.c.i. What is the total annual budget? $70,000-$75,000 (Diabetes Hands Foundation receives $600,000)
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $65,000-$70,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $5,000
1.c.ii. What are the current sources of funding? Diabetes Hands Foundation
1.c.iii. How much does it cost each year to maintain and update the network? Included in amount of annual budget dedicated to infrastructure and maintenance
1.d. How many years has this network existed? 5
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Type 1 and 2 Diabetes
1.f. (Y/N) Does the network use informed consent forms? No for TuDiabetes.org; Yes for TuAnalyze
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific: users choose what data may be seen by researchers
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Patients control what information they make public to other users, to the Internet community, and to researchers
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
If a patient makes data public, researchers at Children's Hospital Boston can see it because they operate the TuAnalyze site. Researchers cannot see individual data that a user has marked private; they can view such data only in aggregate form.
1.g.iii.1.b. Policies for sharing data outside the network
A researcher approaches TuDiabetes.org with a ".edu" e-mail address and proof that the survey has been approved by the researcher's home IRB. If TuDiabetes.org also approves the survey, it allows the researcher to post the survey on the website and sends e-mails inviting users to take it.
TuDiabetes.org with TuAnalyze
1.g.iii.1.c. Policies for protecting proprietary data Data marked private can be viewed only in aggregate form
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
Weitzman ER, Adida B, Kelemen S, Mandl KD (2011) Sharing Data for Public Health Research by Members of an International Online Diabetes Social Network. PLoS ONE 6(4): e19256. doi:10.1371/journal.pone.0019256
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? No
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? NING platform and network, IP blocking to prevent spammers
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Site administrators query the database for TuDiabetes, and Children's Hospital Boston researchers query the database for TuAnalyze; in each case they send the resulting information to the requesting researcher
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
TuDiabetes asks for type of diabetes, how long a user has had it, type of therapy (optional), A1C question (optional), location, name, e-mail
TuAnalyze: a survey is conducted that serves as metadata for all other surveys that the user fills out while using TuAnalyze. The survey asks for name, type of diabetes, type of therapy, and an A1C question
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
TuDiabetes has entered an agreement with another company to assign key terms to an open field from the website that asks users, "What do you want to get out of the community?"
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Aggregated based on the researcher's needs
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
1.a. How many people does the network cover or involve? 10,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
None
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Wall DP, Dally R, Luyster R, Jung JY, Deluca TF. Use of artificial intelligence to shorten the behavioral diagnosis of autism. PLoS One. 2012;7(8):e43855.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $800,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $400,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $400,000
1.c.ii. What are the current sources of funding? National Institutes of Health (NIH)
1.c.iii. How much does it cost each year to maintain and update the network? A percentage of the $400,000
1.d. How many years has this network existed? 15
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Autism research involving families with two or more children with autism
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
The Principal Investigator must obtain IRB approval or exemption and then sign the AGRE Researcher Distribution Agreement.
1.g.iii.1.b. Policies for sharing data outside the network Investigators go through a rigorous approval process by obtaining an IRB approval and by signing an agreement with AGRE.
Autism Genetic Resource Exchange (AGRE)
V. Inventory of Patient Registries
1.g.iii.1.c. Policies for protecting proprietary data AGRE has a series of protocols that protect the PHI housed in its database. Data are de-identified.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Martin LA, Horriat NL. The effects of birth order and birth interval on the phenotypic expression of autism spectrum disorder. PLoS One. 2012;7(11):e51049. doi: 10.1371/journal.pone.0051049. Epub 2012 Nov 30. PMID:23226454
2) Skafidas E, Testa R, Zantomio D, Chana G, Everall IP, Pantelis C. Predicting the diagnosis of autism spectrum disorder using gene pathway analysis. Mol Psychiatry. 2012 Sep 11. doi: 10.1038/mp.2012.126. [Epub ahead of print] PMID:22965006
3) Hall D, Huerta MF, McAuliffe MJ, Farber GK. Sharing Heterogeneous Data: The National Database for Autism Research. Neuroinformatics. 2012 May 24. [Epub ahead of print] PMID:22622767
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Norris M, Lecavalier L, Edwards MC. The Structure of Autism Symptoms as Measured by the Autism Diagnostic Observation Schedule. J Autism Dev Disord. 2011 Aug 20
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? LCL DNA, Transformed Cell Lines, Serum, Plasma, Whole Blood
3.c. What types of analysis are done on them? Whole genome scan and fine mapping, High-density SNP
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? The Diagnostic and Statistical Manual of Mental Disorders (DSM)
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Genotype Data, Phenotype Data, Clinical Data, Medical Data, Demographic Data
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
1.a. How many people does the network cover or involve? 1,550
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Conduct studies within the realm of children with Autism
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes, but within the same condition
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic White: 73%; African American: 6%; Asian: 5%; Latino: 10%
1.b.i.2. Demographics: geography US and Canada
1.b.i.3. Demographics: age < 5: 45%; 5-7: 20%; 7+: 32%
1.b.i.4. Demographics: gender Male: 84%; Female: 16%
1.c.i. What is the total annual budget? $4,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? $4,800,000
1.c.ii. What are the current sources of funding? Health Resources and Services Administration, Maternal and Child Health Bureau
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 4
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Providing self-management support, shared decision-making, delivery system design, decision support, and coordination of care
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad: patients consent only to have their de-identified data included in the patient registry.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific: there is a separate informed consent form.
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Registry Data and EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
Autism Treatment Network
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Investigators within the network have access to the data
1.g.iii.1.b. Policies for sharing data outside the network
Clinics and/or researchers must first submit an application for a "Custom Form". Once the application is approved, the "Custom Form" can be filled out and submitted for approval. As soon as the form is approved, they gain access to the Registry Data.
1.g.iii.1.c. Policies for protecting proprietary data Stored data are de-identified
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Coury D. Very little high-quality evidence to support most medications for children with autism spectrum disorders. J Pediatr. 2011; 159(5):872-3.
2) Coury DL. Review: little evidence of clear benefit for most medical treatments for children with autism spectrum disorders. Evid Based Ment Health. 2011; 14(4):105. Epub 2011 Sep 30.
3) Goldman S, McGrew S, Johnson K, Richdale A, Clemons T, & Malow B. Sleep is associated with problem behaviors in children and adolescents with Autism Spectrum Disorders. Res Autism Spectr Disord. 2011; 5(3):1223-1229. doi: 10.1016/j.rasd.2011.01.010.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Not available
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Developed a set of proprietary or "custom" forms to be used across the clinics
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Providing EHR data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood and urine
3.c. What types of analysis are done on them? Fragile X testing and genotyping
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? None yet
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Current security protocols
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? DSM-IV (Diagnostic and Statistical Manual of Mental Disorders)
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographics, clinical data, medications, conditions, outcomes
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 10,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
None
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Not available
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $23,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? U.S. Government Funding
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 16
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Improving transplantation of marrow for patients who have Leukemia or Lymphoma
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific - The donor or patient is given details about the type of data asked for or sample needed and the purpose of the research.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Specific - The donor or patient is included in the research only if he or she agrees and signs a consent form.
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Other - Transplant Centers, Donor Centers, Cord Blood Banks, Collection Centers, Apheresis Centers, Laboratories, Repositories
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Any research project that involves the donors or patients also is reviewed and approved by their Institutional Review Board (IRB) before the research begins. The IRB continues to oversee each project until it is complete. IRB members are doctors, ethicists, and people of the community who have no stake in the research. The IRB exists to protect the rights of our donors and patients who participate in research.
Be The Match Bone Marrow Donor Registry
1.g.iii.1.b. Policies for sharing data outside the network
Any research project that involves the donors or patients also is reviewed and approved by their Institutional Review Board (IRB) before the research begins. The IRB continues to oversee each project until it is complete. IRB members are doctors, ethicists and people of the community who have no stake in the research. The IRB exists to protect the rights of our donors and patients who participate in research.
1.g.iii.1.c. Policies for protecting proprietary data
When a donor joins the Be The Match Registry®, he or she gives a swab of cheek cells or a blood sample and is assigned a donor ID number. The blood or cell sample is labeled only with the donor ID number and is tested for the donor's tissue type. The only time the blood or cell sample and ID number are ever linked with a donor's name is when it is necessary to contact a donor to ask for more testing because he or she matches a patient. All staff and subcontractors that provide services for Be The Match, such as storing blood and cell samples, are required by law and contract to keep donor-identifying information private.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Lenalidomide after stem-cell transplantation for multiple myeloma. McCarthy PL, Owzar K, Hofmeister CC, Hurd DD, Hassoun H, Richardson PG, Giralt S, Stadtmauer EA, Weisdorf DJ, Vij R, Moreb JS, Callander NS, Van Besien K, Gentile T, Isola L, Maziarz RT, Gabriel DA, Bashey A, Landau H, Martin T, Qazilbash MH, Levitan D, McClune B, Schlossman R, Hars V, Postiglione J, Jiang C, Bennett E, Barry S, Bressler L, Kelly M, Seiler M, Rosenbaum C, Hari P, Pasquini MC, Horowitz MM, Shea TC, Devine SM, Anderson KC, Linker C. New England Journal of Medicine 366(19):1770-1781
2) Costs and cost-effectiveness of hematopoietic cell transplantation. Preussler JM, Denzen EM, Majhail NS. Biology of Blood & Marrow Transplantation 18(11):1620-1628
3) A combined DPA1~DPB1 amino acid epitope is the primary unit of selection on the HLA-DP heterodimer. Hollenbach JA, Madbouly A, Gragert L, Vierra-Green C, Flesch S, Spellman S, Begovich A, Noreen H, Trachtenberg E, Williams T, Yu N, Shaw B, Fleischhauer K, Fernandez-Vina M, Maiers M. Immunogenetics 64(8):559-569
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence?
Outcomes after matched unrelated donor versus identical sibling hematopoietic cell transplantation in adults with acute myelogenous leukemia. Wael Saber, Shaun Opie, J. Douglas Rizzo, Mei-Jie Zhang, Mary M. Horowitz, Jeff Schriber. Blood. 2012 April 26; 119(17): 3908–3916. Prepublished online 2012 February 10. doi: 10.1182/blood-2011-09-381699
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Using Health and Human Services standards and noting the dates of change, but mostly correcting the data elements by hand
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By referring patients and participating in research
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence?
Acute graft-versus-host disease biomarkers measured during therapy can predict treatment outcomes: a Blood and Marrow Transplant Clinical Trials Network study. John E. Levine, Brent R. Logan, Juan Wu, Amin M. Alousi, Javier Bolaños-Meade, James L. M. Ferrara, Vincent T. Ho, Daniel J. Weisdorf, Sophie Paczesny. Blood. 2012 April 19; 119(16): 3854–3860. Prepublished online 2012 March 1. doi: 10.1182/blood-2012-01-403063
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected?
Whole blood, Cryopreserved whole blood, Plasma, Blood spotted on filter paper, Peripheral blood mononuclear cells (PBMC) viable and non-viable, B-Lymphoblastoid cell lines (B-LCL) viable and non-viable, Granulocytes, Serum, DNA, Whole genome amplified DNA
3.c. What types of analysis are done on them? Human Leukocyte Antigen characteristics
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Human Leukocyte Antigen characteristics
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
A preliminary search is done using donor HLA characteristics; a more formal search can then be done by entering the patient name, HLA, disease, etc., which generates a report sorted by match rank. The system also links to other registries' databases.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographics, condition, HLA
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? See Table 1
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The BCFR conducts special recruitment initiatives, including initiatives to recruit Ashkenazi families and racial and ethnic minorities, to further broaden its study of breast cancer.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Ontario Cancer Center (Canada), University of Southern California Consortium, University of Melbourne (Australia), Hawaii Cancer Registry, Mayo Clinic (Rochester, MN), Fred Hutchinson Cancer Research Center (Seattle, WA)
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? National Cancer Institute (NCI)
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? Not available
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Breast cancer in families
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not available
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Not available
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Use Agreements and Data Submission Agreements are required from sites that send data.
1.g.iii.1.b. Policies for sharing data outside the network Outside investigators must collaborate with a member of the consortium.
Breast Cancer Family Registry (BCFR)
1.g.iii.1.c. Policies for protecting proprietary data The individual clinical sites own the data.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) A meta-analysis of genome-wide association studies of breast cancer identifies two novel susceptibility loci at 6q14 and 20q11. Hum Mol Genet. 2012 Dec 15;21(24):5373-84.
2) Better cancer biomarker discovery through better study design. Eur J Clin Invest. 2012 Dec;42(12):1350-9.
3) Risk of Asynchronous Contralateral Breast Cancer in Noncarriers of BRCA1 and BRCA2 Mutations With a Family History of Breast Cancer: A Report From the Women's Environmental Cancer and Radiation Epidemiology Study. J Clin Oncol. 2013 Feb 1;31(4):433-9.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Genes, Environment and Breast Cancer Risk: the 15 Year Follow-Up of the Prof-SC - http://maps.cancer.gov/overview/DCCPSGrants/abstract.jsp?applId=8196169&term=CA159868
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Healthcare organizations agree to share patient data with the data coordination center.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Not available
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood/buccal samples, Cell lines, Tumor material
3.c. What types of analysis are done on them? Not available
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use?
Data are held at the Georgetown UIS Laurel Data Center. Users are authenticated, and the backend is available only to programmers. The NetID system at Georgetown requires that the principal investigator at Georgetown approve everyone who receives an ID to the database. Data are not sent via e-mail or transferred on hard drives.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
A project concept is submitted to the steering committee. If approved, the data coordination center sends the investigator a link to the data request form. The coordination center processes the data request by querying the central database and puts it into the format that the investigator requests and puts it on their website. The investigator logs into the website and downloads the data.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Home grown
4.d.iii. How are the data transformed and mapped?
Common data elements were created by the central hub working group and the query is sent to the individual sites. The data elements are captured and sent back to the central hub.
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown standards
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Previous cancer diagnoses in the patient and the patient's parents, siblings, and children; all cancers, except non-melanoma skin cancers and cervical carcinoma in situ; dates of all cancer diagnoses and deaths, demographics, race/ethnicity, religion; personal history of cancer, breast and ovarian surgeries, radiation exposure, smoking and alcohol consumption, menstrual and pregnancy history, breast-feeding, hormone use, weight, height, and physical activity; frequency of food consumption and portion size; 30 ml sample of blood, paraffin blocks are requested for individuals with a history of breast or ovarian cancer
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No, but they are de-identified
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? R code and SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Table 1. Family Recruitment
*Table from http://epi.grants.cancer.gov/CFR/about_breast.html
Criteria Answers
1.a. How many people does the network cover or involve? 29,977
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Each week, 4-5 new clinical trials for breast cancer are added to the site.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No - *BCT.org does not track whether a patient signed up for a study or what the results of that study were
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic
White (Non-Hispanic): 89%
White (Hispanic): 3%
Asian: 4%
African-American: 3%
American Indian/Alaskan Native: 0.6%
Pacific Islander: 0.03%
1.b.i.2. Demographics: geography See Table 1
1.b.i.3. Demographics: age
Total Patients
< 30: 1.5%
30-39: 10%
40-49: 32%
50-59: 38%
60-69: 16%
70-79: 2.9%
80: 0.3%
1.b.i.4. Demographics: gender
Female: 80%
Male: 20%
1.c.i. What is the total annual budget? $350,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $25,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $60,000
1.c.ii. What are the current sources of funding?
Safeway Food Stores, California Endowment, Research and collaboration with community-‐based organizations, CA Breast Cancer Research Program, individual donors
1.c.iii. How much does it cost each year to maintain and update the network? This amount is included in the amount of the budget dedicated to infrastructure and maintenance annually
1.d. How many years has this network existed? 5
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Breast Cancer Clinical Trials
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Users have complete control over what is contained in their "Health History" and with whom it can be shared. BCT never shares a user's personal health information with any individual or organization without the user's explicit permission.
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
BreastCancerTrials.org (BCT)
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
BCT will only release patient information to Trial Site Network sites that users have explicitly requested BCT to contact on their behalf. BCT requires that all BCT Trial Site Network sites agree to protect the privacy and security of BCT-referred patient health information as they would their own patient records and in full compliance with their institution's HIPAA policies and procedures. Furthermore, BCT requires that research sites only permit individuals who have been authorized by a designated BCT liaison to log onto BCT and view patient records.
1.g.iii.1.b. Policies for sharing data outside the network
Data are not shared outside the network unless a patient allows the registry to connect him or her to researchers using the SecureConnect program.
1.g.iii.1.c. Policies for protecting proprietary data All data sharing is patient directed and can be shared on behalf of the patient using SecureConnect only.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals No studies published
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Not available
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Healthcare organizations are conducting the trials that the BCT connects its users to.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Not applicable
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? All patients and researchers have user IDs and passwords
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
SecureConnect: If a trial is in a matching service and the patient wants to participate, the patient notifies BCT. BCT then sends a notification to the researcher saying that the patient is interested in the trial. The researcher can then log on to the BCT site, see the patient's medical history, and decide whether to contact the patient.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Pathology report detailing precisely what the pathologist saw in the tumor tissue, breast cancer staging information, imaging reports such as mammograms, ultrasounds, bone scans, CT, MRI, and PET scans, breast cancer treatment, or survivorship plans.
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Based on criteria of the clinical trial
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Table 1. Geographical Distribution of BreastCancerTrials Users in United States
*Submitted by interviewee from BreastCancerTrials.org
Criteria Answers
1.a. How many people does the network cover or involve? 1,277,200
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Mainly conducts studies involving cancer research
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? $1,200,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Percentage of the annual budget
1.c.i.2. How much of that budget is dedicated to conducting studies? Percentage of the annual budget
1.c.ii. What are the current sources of funding? Centers for Disease Control (CDC) and Surveillance, Epidemiology and End Results (SEER)
1.c.iii. How much does it cost each year to maintain and update the network? Percentage of the annual budget
1.d. How many years has this network existed? 5
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Cancer Research
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Cancer researchers must go through a rigorous process to access any CCR data. The CCR will only release patient contact information to qualified researchers under tightly controlled circumstances where the research has first been approved by the California State Committee for the Protection of Human Subjects (CPHS) Institutional Review Board. Research proposals are evaluated by CPHS to ensure patients’ rights are protected and the research justified. Additionally, a federally approved Institutional Review Board (IRB) at the researcher’s institution must also approve the research proposal. This IRB will also ensure that patient rights are monitored and protected.
California Cancer Registry (CCR)
1.g.iii.1.b. Policies for sharing data outside the network
Cancer researchers must go through a rigorous process to access any CCR data. The CCR will only release patient contact information to qualified researchers under tightly controlled circumstances where the research has first been approved by the California State Committee for the Protection of Human Subjects (CPHS) Institutional Review Board. Research proposals are evaluated by CPHS to ensure patients’ rights are protected and the research justified. Additionally, a federally approved Institutional Review Board (IRB) at the researcher’s institution must also approve the research proposal. This IRB will also ensure that patient rights are monitored and protected.
1.g.iii.1.c. Policies for protecting proprietary data Safeguards in place to protect, but not all HIPAA identifiers are removed.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Y. Zak, K. F. Rhoads and B. C. Visser. Predictors of Surgical Intervention for Hepatocellular Carcinoma: Race, Socioeconomic Status, and Hospital Type. Arch Surg. 2011. 146(7) 778-84
2) H. Zheng, W. Zhang, J. Z. Ayanian, L. B. Zaborski and A. M. Zaslavsky. Profiling Hospitals by Survival of Patients with Colorectal Cancer. Health Serv Res. 2011. 46(3) 729-46
3) M. Cockburn, P. Mills, X. Zhang, J. Zadnick, D. Goldberg and B. Ritz. Prostate Cancer and Ambient Pesticide Exposure in Agriculturally Intensive Areas in California. Am J Epidemiol. 2011. 173(11) 1280-8
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Providing EHR data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Barracuda system, RSA for 2-Factor Authentication, IP-Filtering, External and Internal Firewalls
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, SEER ICDO
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? North American Association of Central Cancer Registries Data Model
4.d.iii. How are the data transformed and mapped? There are code crosswalks that allow data to be mapped and transformed from the source.
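As a rough illustration of what a code crosswalk does, the sketch below translates a source-system code into a target code through a lookup table. The table contents, the field name `site_code`, and the helper `map_record` are all hypothetical and are not taken from the CCR or NAACCR documentation; the report does not describe the actual crosswalk files.

```python
# Hypothetical crosswalk: local source codes -> target registry codes.
# Both the source codes and the pairing are invented for illustration only.
SITE_CROSSWALK = {
    "BR": "C50",
    "CO": "C18",
}


def map_record(record, crosswalk, field="site_code"):
    """Return a copy of `record` with `field` translated via the crosswalk.

    Codes missing from the crosswalk are left unchanged and flagged so
    they can be reviewed by hand instead of being silently dropped.
    """
    mapped = dict(record)
    source = record[field]
    if source in crosswalk:
        mapped[field] = crosswalk[source]
        mapped["unmapped"] = False
    else:
        mapped["unmapped"] = True
    return mapped


records = [{"id": 1, "site_code": "BR"}, {"id": 2, "site_code": "ZZ"}]
mapped = [map_record(r, SITE_CROSSWALK) for r in records]
```

Flagging unmapped codes rather than discarding them is one possible design choice; a real registry may handle them differently.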
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
North American Association of Central Cancer Registries Data Standards
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Patient’s name, address at time of diagnosis, sex, race, and age at diagnosis, type of cancer (such as breast cancer) and stage of disease at time of diagnosis, whether the patient had surgery, radiation, or chemotherapy as the first course of treatment.
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? The SEER*Stat tool provided by SEER National Cancer Institute
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Table 1
* Table from http://www.ccrcal.org/pdf/Reports/ACS_2012.pdf, page 23
Criteria Answers
1.a. How many people does the network cover or involve? 12,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The registry's target is to cover 62 percent of children under the age of 6 years. The registry plans to expand to allow schools to access immunization information electronically.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? The registry provides reminders when an immunization is due or overdue, consolidates immunizations into a single record, provides current recommendations and information on new vaccines, helps identify high-risk populations and under-immunized populations, and generates a variety of reports including coverage reports, e.g., HEDIS.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography All counties in California except Imperial County
1.b.i.3. Demographics: age More heavily weighted towards 0-18 years although all ages are included
1.b.i.4. Demographics: gender Comparable to state of California gender composition
1.c.i. What is the total annual budget? $2,600,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $2,600,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $0
1.c.ii. What are the current sources of funding? All federal - not further specified
1.c.iii. How much does it cost each year to maintain and update the network? This amount is included in budget dedicated to infrastructure and maintenance annually.
1.d. How many years has this network existed? 15
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Immunization records for residents of California
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable - covered by HIPAA, which allows collection of data that is required by law to be sent to the database, but a disclosure is shared with all parents.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
None
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
None
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Exchange Agreement between doctors and the registry. Additionally, epidemiologists have internal access during outbreak investigations.
1.g.iii.1.b. Policies for sharing data outside the network Only outside access is to health plans for HEDIS determinations
California Immunization Registry (CAIR)
1.g.iii.1.c. Policies for protecting proprietary data User registry access agreements define conditions for data usage
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
The Challenge and Potential of Childhood Immunization Records. Victoria A. Freeman and Gordon H. DeFriese. Annu. Rev. Public Health 2003. 24:227–46.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
Health care providers and public health departments link the CAIR system with their EHR system and update patient records of immunization into the system.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Secure File Transfer Protocol (SFTP)
4.b.i. (Y/N) Are queries distributed via a central hub? Data is sent via Secure File Transfer Protocol (SFTP)
4.b.ii. What is the architecture of the query distribution? No querying system because the network uses an SFTP server
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? HL7
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Not available
4.d.iii. How are the data transformed and mapped?
Through an export process that retrieves immunization data from the clinic's EHR system and then exports it as an HL7 or flat file to CAIR.
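The flat-file branch of such an export can be sketched as below, under stated assumptions. The column names in `FIELDS`, the pipe delimiter, and the sample row are hypothetical; the report does not give CAIR's file specification, and the HL7 branch is not shown.

```python
import csv
import io

# Hypothetical column layout; the real CAIR flat-file format is not
# described in this report.
FIELDS = ["patient_id", "vaccine_code", "date_administered"]


def export_flat_file(immunizations):
    """Serialize immunization rows to pipe-delimited text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="|", lineterminator="\n"
    )
    writer.writeheader()
    for row in immunizations:
        # Keep only the agreed columns, in the agreed order.
        writer.writerow({name: row[name] for name in FIELDS})
    return buffer.getvalue()


rows = [
    {
        "patient_id": "P001",
        "vaccine_code": "V-EXAMPLE",
        "date_administered": "2013-01-15",
    }
]
flat_text = export_flat_file(rows)
```

The resulting text could then be written to disk and uploaded over SFTP, which is the transfer mechanism the registry reports using.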
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Immunization Records from EHRs
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? HL7
Criteria Answers
1.a. How many people does the network cover or involve? 11 hospitals and 54 surgeons
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Mainly conducts studies involving joint replacement procedures and outcomes
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Only in the same condition
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? http://www.caljrr.org/pdf/Rationale_for_CJRR.pdf
1.b.i.1. Demographics: racial/ethnic Confidential
1.b.i.2. Demographics: geography Confidential
1.b.i.3. Demographics: age Confidential
1.b.i.4. Demographics: gender Confidential
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies? Confidential
1.c.ii. What are the current sources of funding? Funded by California HealthCare Foundation and Pacific Business Group on Health
1.c.iii. How much does it cost each year to maintain and update the network? Confidential
1.d. How many years has this network existed? 3
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Hip and knee joint replacement
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad - Give your permission to your surgeon and hospital so that it can share information about you, your surgery, and how you felt before and after it with the database
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Participants own the data from their own institutions even after that data has been contributed to the CJRR. Specific terms of use for the data provided by a participant are outlined in Business Associate Agreements and Participation Agreements agreed upon by each participating site and the CJRR.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
The registry is new and does not yet allow others to access the data.
1.g.iii.1.b. Policies for sharing data outside the network The registry is new and does not yet allow others to access the data.
California Joint Replacement Registry (CJRR)
1.g.iii.1.c. Policies for protecting proprietary data
To protect your SSN, before sending any information to the CJRR registry, special software is used to scramble each patient’s SSN and create a new number to track each patient. This scrambled number (not your SSN) is then saved in the registry database. Only the hospital where you received care can match your SSN to the scrambled code; the CJRR cannot do this matching. The registry stores data on dedicated servers that have physical and electronic protections and verifies that all communications with the registry are from valid sources (“authenticated”).
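One common way to obtain a scrambled identifier with the properties described above is a keyed hash computed at the hospital. The sketch below assumes HMAC-SHA256 and a secret key that never leaves the hospital. This is an assumption for illustration: the report does not say which algorithm the CJRR software uses, and `scramble_ssn` and the sample key are hypothetical.

```python
import hashlib
import hmac


def scramble_ssn(ssn, hospital_key):
    """Derive a stable pseudonym from an SSN with a keyed hash.

    The same SSN and key always give the same token, so a patient can be
    tracked over time, while a party without the key cannot recompute
    the token from a known SSN.
    """
    # Normalize formatting so "123-45-6789" and "123456789" match.
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return hmac.new(
        hospital_key, digits.encode("ascii"), hashlib.sha256
    ).hexdigest()


# Hypothetical key; in practice it would be generated randomly and
# stored securely at the hospital.
key = b"hypothetical-hospital-secret"
token_a = scramble_ssn("123-45-6789", key)
token_b = scramble_ssn("123456789", key)
```

Only the token would be sent to the registry. Because the key stays at the hospital, the hospital can recompute the token for any of its patients, which matches the stated property that the registry itself cannot do the matching.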
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence?
Complete a survey about you before your surgery and at several points in time after your surgery (6 months, one year). The survey collects information about you that only you know, such as whether you can walk better after your surgery and whether you are free from pain. The survey takes about 20 minutes to complete. The questions do not require that you provide long answers. If you participate in the CJRR, you would fill out the surveys through a secure online application that you would get to from an e-mail link sent to you by your hospital or surgeon.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By referring patients
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
No
4.a. What type of security technology does the network use?
Data are stored at a data center that is not accessible via the web or online. SFTP is used by sites to upload data to the database. Users have to contact the registry and go through a process in order to obtain the data.
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Home grown
4.d.iii. How are the data transformed and mapped? Not available
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Utilizes a data dictionary
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
where your surgery took place (which hospital); who your surgeon was; the specific type of implant you received; which side of your body you were operated on; the medications given to you before and after your surgery; other selected information about you that is important to know since it can impact the results of the surgery, such as your age and whether you have other conditions like diabetes or heart disease; information from you about how you felt before and after your surgery (called “patient-reported outcomes”). This information is collected through surveys that you would fill out on a secure website before your surgery and at a few times after your surgery (e.g., six months and one year); and your scrambled Social Security Number which identifies you as you
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated, but at patient level, and can be identified or de-identified based on what the researchers requested. Then, the data are sent to the researcher on an encrypted disk.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? Not available
Criteria Answers
1.a. How many people does the network cover or involve? See Table 1
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
CCFR conducts its enrollment efforts in phases. In phase I recruitment (1998-2002), population-based sampling ranged from all incident cases of colorectal cancer to a subsample based on age at diagnosis and/or family cancer history. During phase II (2002-2007), population-based recruitment targeted cases diagnosed before the age of 50 years, which are more likely attributable to genetic factors.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Fox Chase Cancer Center (Philadelphia, PA), Columbia University (New York), University of Utah, University of Melbourne (Australia), Ontario Cancer Center (Canada), Northern California Cancer Center (Fremont), University of California, Irvine
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? National Cancer Institute
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? Not available
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Colon cancer in families
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not available
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Not available
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data Use Agreements and Data Submission Agreements from sites that send data. Within the consortium, there is free collaboration.
1.g.iii.1.b. Policies for sharing data outside the network Outside investigators must collaborate with a member of the consortium
The Colon Cancer Family Registry (CCFR)
1.g.iii.1.c. Policies for protecting proprietary data The individual clinical sites own the data
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Peters U, Hutter CM, Hsu L, Schumacher FR, Conti DV, Carlson CS, Edlund CK, Haile RW, Gallinger S, Zanke BW, Lemire M, Rangrej J, Vijayaraghavan R, Chan AT, Hazra A, Hunter DJ, Ma J, Fuchs CS, Giovannucci EL, Kraft P, Liu Y, Chen L, Jiao S, Makar KW, Taverna D, Gruber SB, Rennert G, Moreno V, Ulrich CM, Woods MO, Green RC, Parfrey PS, Prentice RL, Kooperberg C, Jackson RD, Lacroix AZ, Caan BJ, Hayes RB, Berndt SI, Chanock SJ, Schoen RE, Chang-Claude J, Hoffmeister M, Brenner H, Frank B, Bézieau S, Küry S, Slattery ML, Hopper JL, Jenkins MA, Le Marchand L, Lindor NM, Newcomb PA, Seminara D, Hudson TJ, Duggan DJ, Potter JD, Casey G. Meta-analysis of new genome-wide association studies of colorectal cancer risk. Hum Genet. 2012 Feb;131(2):217-34.
2) Adams SV, Newcomb PA, Burnett-Hartman AN, White E, Mandelson MT, Potter JD. Circulating 25-hydroxyvitamin-D and risk of colorectal adenomas and hyperplastic polyps. Nutr Cancer. 2011 Apr;63(3):319-26.
3) Bertuccio P, La Vecchia C, Silverman DT, Petersen GM, Bracci PM, Negri E, Li D, Risch HA, Olson SH, Gallinger S, Miller AB, Bueno-de-Mesquita HB, Talamini R, Polesel J, Ghadirian P, Baghurst PA, Zatonski W, Fontham ET, Bamlet WR, Holly EA, Lucenteforte E, Hassan M, Yu H, Kurtz RC, Cotterchio M, Su J, Maisonneuve P, Duell EJ, Bosetti C, Boffetta P. Cigar and pipe smoking, smokeless tobacco use and pancreatic cancer: an analysis from the International Pancreatic Cancer Case-Control Consortium (PanC4). Ann Oncol. 2011 Jun;22(6):1420-6.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? The Family Health Promotion Project (FHPP): design and baseline data from a randomized trial to increase colonoscopy screening in high risk families. Lowery JT, Marcus A, Kinney A, Bowen D, Finkelstein DM, Horick N, Garrett K, Haile R, Sandler R, Ahnen DJ. Contemp Clin Trials. 2012 Mar;33(2):426-35. doi: 10.1016/j.cct.2011.11.005. Epub 2011 Nov 12.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Healthcare organizations agree to share patient data with the data coordination center
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? The Family Health Promotion Project (FHPP): design and baseline data from a randomized trial to increase colonoscopy screening in high risk families. Lowery JT, Marcus A, Kinney A, Bowen D, Finkelstein DM, Horick N, Garrett K, Haile R, Sandler R, Ahnen DJ. Contemp Clin Trials. 2012 Mar;33(2):426-‐35. doi: 10.1016/j.cct.2011.11.005. Epub 2011 Nov 12.
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood/buccal samples, cell lines, tumor material
3.c. What types of analysis are done on them? Blood sample separation and aliquoting (or tissue sectioning)
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use?
Data are held at the Georgetown UIS Laurel Data Center. Authentication for users and the backend is only available to programmers. The NetID system at Georgetown requires that the PI at Georgetown approve everyone who receives an ID to the database. Data are not sent via e-mail or transferred on hard drives.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
A project concept is submitted to the steering committee. If it is approved, the data coordination center sends the investigator a link to the data request form. The coordination center then processes the request by querying the central database, puts the results into the format the investigator requests, and posts them on its website. The investigator logs into the website and downloads the data.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? Home grown
4.d.iii. How are the data transformed and mapped?
Common data elements were created by the central hub working group. The query is sent to the individual sites, where the data elements are captured and sent back to the central hub.
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown standards
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Information on the number, sex, and birthdates of first-‐degree relatives (parents, siblings, and children), their cancer history, vital status, and, if deceased, date of death. All cancers, except for nonmelanoma skin cancers, were recorded with dates of diagnoses; information on established and suspected risk factors for colorectal cancer, including medical history and medication use, reproductive history (for female participants), physical activity, demographics, alcohol and tobacco use, race and ethnicity, and limited dietary data; blood and paraffin-‐embedded tumor tissue
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No, but data are de-‐identified
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? R code and SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Table 1. Family Recruitment
*Table from http://epi.grants.cancer.gov/CFR/about_colon.html
Criteria Answers
1.a. How many people does the network cover or involve? 27,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Patient data include any new treatments or studies in which the cystic fibrosis (CF) patient is participating
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
Annual center-level reports inform healthcare professionals of their current practice patterns and clinical outcomes and allow comparisons to the national averages. Patient data are continually updated, which allows the healthcare community to see a comprehensive medical description of the CF population as a whole, see the impact of specific treatments, and gauge the care of CF patients.
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age See Table 1
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Cystic Fibrosis Foundation
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 57
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Cystic Fibrosis
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific: patients must sign an informed consent to participate in the registry and an additional consent for each study they participate in.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Not available
Cystic Fibrosis Patient Registry
1.g.iii.1.b. Policies for sharing data outside the network Center-level data are available publicly on the CF Foundation website (www.cff.org)
1.g.iii.1.c. Policies for protecting proprietary data Not available
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Yen EH, Quinton H, Borowitz D. Better Nutritional Status in Early Childhood is Associated with Improved Clinical Outcomes and Survival in Patients with Cystic Fibrosis. J Pediatr. 2012 Oct 11. [Epub ahead of print]
2) Quon BS, Psoter K, Mayer-Hamblett N, Aitken ML, Li CI, Goss CH. Disparities in Access to Lung Transplantation for Cystic Fibrosis Patients by Socioeconomic Status. Am J Respir Crit Care Med. 2012 Sep 13. [Epub ahead of print]
3) Quon BS, Mayer-Hamblett N, Aitken ML, Goss CH. Risk of Post Lung Transplant Renal Dysfunction in Adults with Cystic Fibrosis. Chest. Published online before print January 5, 2012. doi: 10.1378/chest.11-1926.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Patient data are updated and forwarded to the registry after each visit, and patients fill out an annual questionnaire.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Over 100 participating clinics update and send patient data to the registry.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Michael R. Knowles, M.D., Kathy W. Hohneker, R.N., Zhaoquing Zhou, Ph.D., et al. A Controlled Study of Adenoviral-Vector-Mediated Gene Transfer in the Nasal Epithelium of Patients with Cystic Fibrosis. The New England Journal of Medicine; September 1995.
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Blood, urine, stool, and tissue from CF clinical trials
3.c. What types of analysis are done on them? Spirometry, Exacerbations, Blood Inflammatory Mediators, LRT Microbiology, Growth, Sweat Chloride, Sputum Inflammatory Mediators
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them?
Spirometry, Exacerbations, Blood Inflammatory Mediators, LRT Microbiology, Growth, Sweat Chloride, Sputum Inflammatory Mediators
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
The researcher submits the data request to the Cystic Fibrosis Foundation via a central hub and if approved the data are returned to the researcher.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Not available
4.d.ii. Which CDM is used? Not available
4.d.iii. How are the data transformed and mapped? Not available
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
State of residence, height, weight, gender, CF mutations, lung function results from pulmonary function tests, medication use, complications (problems) related to CF
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Statistical process control charts, GeneGo's MetaMiner CF
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Table 1
*Table from http://www.cff.org/UploadedFiles/research/ClinicalResearch/2011-Patient-Registry.pdf, page 26.
Criteria Answers
1.a. How many people does the network cover or involve? 160,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
The registry adds 17,000 new patients each year.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence?
The registry documents surgical techniques and implant characteristics; characterizes patients undergoing joint replacements and the relationships between these characteristics and techniques/implant selection; compares incidence rates and variations in clinical care; identifies relationships between variations in practice and short-term outcomes; and identifies risk factors associated with joint replacement revisions.
- The registry helps Kaiser Permanente immediately identify and notify patients with recalled or defective implants prior to an official recall notice.
- The registry has successfully monitored and identified two recalls and advisories.
- Information sharing from the registry prevented 16 revisions.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Southern California, Northern California, Washington, Oregon, Idaho, Hawaii, Colorado, Georgia, Ohio, Maryland, District of Columbia, Virginia
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Kaiser Permanente Integrated Health Plan
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 12
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Joint Replacements
1.f. (Y/N) Does the network use informed consent forms?
Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
Kaiser Permanente Total Joint Replacement Registry
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Not available
1.g.iii.1.b. Policies for sharing data outside the network Data are not shared outside the network
1.g.iii.1.c. Policies for protecting proprietary data HIPAA compliant
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals Not available
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Registry consists of patients who have had a joint replacement at the Kaiser Permanente Healthcare organization
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Data are collected from the individual sites and stored at a central hub (Clarity Database), where they can be queried using SAS and merged into a SQL database with a Microsoft Access front-end application.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9-CM
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Home grown core data standards
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Patient (e.g., age, gender, and diagnoses), procedure (e.g., operative date, laterality, surgical approach), hospital admission (e.g., length of stay, discharge disposition), implant and fixation information (e.g., manufacturer, catalog, and lot numbers) and outcome variables including complications (i.e., surgical site infections, VTE), revisions, re-operations, hospital readmissions, and death
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated based on encounter or transaction.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts, Crystal Reports
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? SQL scripts
Criteria Answers
1.a. How many people does the network cover or involve? 1,500
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Can cover additional lives in the tissue bank and registry
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? A nonrandom association of gastrointestinal stromal tumor (GIST) and desmoid tumor (deep fibromatosis): Case series of 28 patients. A.G. Dumont; L. Rink; A.K. Godwin; M. Miettinen; H. Joensuu; J.R. Strosberg; A. Gronchi; C.L. Corless; D. Goldstein; B.P. Rubin; et al. Annals of Oncology. 2012;23(5):1335-1340.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $2,835,317
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? $2,126,487
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 10
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Gastrointestinal Stromal Tumors (GIST)
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Patients may decide to contribute as little or as much information as they feel comfortable with. This ranges from their e-mail address, symptoms, and date of diagnosis to full contributions to the tissue bank.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Nine research team members must agree to share results and tissue in order to receive funding from the Life Raft Group
1.g.iii.1.b. Policies for sharing data outside the network
Researchers have access to a de-identified, HIPAA-compliant tissue bank by signing a DUA. Stanford University's IRB handles research data requests because the tissue is stored in its Microarray Database.
Life Raft Group
1.g.iii.1.c. Policies for protecting proprietary data
All tissue and pathology reports are de-identified after being processed by Oregon Health Sciences University (HSU). All information about the patient is HIPAA compliant.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals Not available
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Researchers are currently studying a particular metabolic pathway using the tissue from the tissue bank and matching it with de-‐identified patient data. The publication should be released in a few months.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
The same data elements are collected over time.
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
No
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Not applicable
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? Tumor tissue, paraffin-based
3.c. What types of analysis are done on them? Tissue undergoes mutational testing at Oregon HSU and then is processed into a tissue microarray at Stanford University.
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Mutational Testing
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use?
Researchers are given entry into a cordoned-off portion of the electronic registry that includes only de-identified patient data. Only the patient registry supervisor can match patient-identifying information to other information. The registry is housed at Life Raft Group on a separate server.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Life Raft Group sends the de-identified patient record to the researcher, matched with that patient's tissue, which is sent by Stanford. Researchers typically ask for data based on one or two criteria; the information can be provided electronically in a spreadsheet or as a hard copy.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
The registry uses home grown standards and a data dictionary
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Conditions, medications, procedures, health-‐related quality of life, all updated after each doctor appointment, registry data, pathology reports, biospecimens
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SPSS code
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 9,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
With guidance from a nationally recognized group of epidemiologists and the MURDOCK Study Leadership group, the Kannapolis-based team will begin recruiting a representative sample of the local population in January 2013 into the MURDOCK Study Community Registry and Biorepository.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that shows the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Hispanic: 9%; African American: 13%
1.b.i.2. Demographics: geography North Carolina: Kannapolis and Cabarrus Counties
1.b.i.3. Demographics: age Median age is 55
1.b.i.4. Demographics: gender Male: 25%; Female: 65%
1.c.i. What is the total annual budget? $2,000,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $400,000
1.c.i.2. How much of that budget is dedicated to conducting studies? $1,600,000
1.c.ii. What are the current sources of funding? Mr. David H. Murdock
1.c.iii. How much does it cost each year to maintain and update the network? Included in the annual budget
1.d. How many years has this network existed? 4
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Improve disease characterization on a molecular level
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Broad
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Broad
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes, up to 4 times a year
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Registry participants are on registry boards; volunteers from the community recruit registry patients at locations around the community
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Collected in Clinical Trials
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Proposal form reviewed by leadership group. The group reviews proposals on an ad hoc basis; once a proposal is approved, the investigators work with study personnel. An agreement is in place that results are returned to the study and publications must identify the MURDOCK Study.
1.g.iii.1.b. Policies for sharing data outside the network
Research proposal is submitted and leadership team decides how to proceed. Budget is generated and from there the process parallels the policy for institutional investigators.
MURDOCK
1.g.iii.1.c. Policies for protecting proprietary data All managed through consent or investigator agreements
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) The Measurement to Understand Reclassification of Disease of Cabarrus/Kannapolis (MURDOCK) Study Community Registry and Biorepository. Sayanti Bhattacharya, Ashley A Dunham, Melissa A Cornish, Victoria A Christian, Geoffrey S Ginsburg, Jessica D Tenenbaum, Meredith L Nahm, Marie Lynn Miranda, Robert M Califf, Rowena J Dolor, L. Kristin Newby. Am J Transl Res 2012;4(4):458-470.
2) The MURDOCK Study: a long-term initiative for disease reclassification through advanced biomarker discovery and integration with electronic health records. Jessica D Tenenbaum, Victoria Christian, Melissa A Cornish, Rowena J Dolor, Ashley A Dunham, Geoffrey S Ginsburg, Virginia B Kraus, John G McHutchison, Meredith L Nahm, L. Kristin Newby, Laura P Svetkey, Krishna Udayakumar, Robert M Califf. Am J Transl Res 2012;4(3):291-301.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? MURDOCK is designed to be a population-based, longitudinal health study. Participants of the registry commit to yearly follow-up exams. Researchers are currently in the process of following these cohorts in studies.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Versioning and making electronic notations
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
The study uses health sites to enroll patients in the study. Some staff at these sites enroll patients in the study. Sites also give access to their patients' EHRs.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected?
(1) plasma, n=16 (500 uL each); (2) buffy coat; (3) serum, n=10 (500 uL each); (4) environmental serum, n=1 (3 mL each); (5) whole blood, n=2 (2 mL each); (6) PaxGene RNA, n=3; (7) urine, n=4 (10 mL each)
3.c. What types of analysis are done on them? Analysis is determined based on the research that is being conducted
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Proteomic analysis or genomic testing
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Papers addressing this link to patient outcomes are forthcoming
4.a. What type of security technology does the network use? Resides on Duke servers, behind firewalls
4.b.i. (Y/N) Are queries distributed via a central hub?
Yes - The MURDOCK Integrated Data Repository (MIDR) houses all the clinical data from early projects of the MURDOCK studies, plus study metadata, consent data, omics and imaging metadata, biospecimen data, and EHR data.
4.b.ii. What is the architecture of the query distribution?
Queries are distributed through a web-based querying system called the Registry Query Interface (RQI). Datasets are stored at their original sites and can be sent via secure FTP to the MURDOCK database for researchers to access.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? RxNorm, ICD-9, SNOMED, UMLS
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Data dictionary
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Environmental exposures, personal and family history of disease, patient-reported outcomes, a series of questions from the NIH PROMIS study
EHR data
Longitudinal outcomes assessment, biobanked samples, additional data collected for particular cohorts: MS, severe acne, physical performance, memory health screener for the over-55 cohort, individuals over the age of 100 for genome sequencing
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Data are aggregated at the central hub (not at the site level) for reporting purposes
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Registry Query Interface (home grown)
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Multiple systems integrate this data
Criteria Answers
1.a. How many people does the network cover or involve? 9,584
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Data entry method allows for new types of congenital malformation information to be entered. By law, newly diagnosed patients must be added to the registry.
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? The registry ensures that families of children identified in the registry locate available resources so that each child can maximize his or her development. The registry also assists in identifying families of children with specific malformations who may be invited to participate in research studies.
1.b.i.1. Demographics: racial/ethnic See Table 1
1.b.i.2. Demographics: geography See Table 1
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender See Table 1
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 32
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Congenital malformations in children diagnosed before age 2 in New York State
1.f. (Y/N) Does the network use informed consent forms? No consent - patient data are required by law to be added by physicians
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
New York State Registry - physicians and hospitals send reports over the Internet using the New York State Department of Health’s (NYSDOH) Health Provider Network (HPN).
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
New York State Congenital Malformations Registry
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
All investigators are outside the institution and must follow policies listed for data sharing outside the network by filing a report using the New York State Department of Health's (NYSDOH) Health Provider Network (HPN) website
1.g.iii.1.b. Policies for sharing data outside the network
Researchers must fill out a data request form. Families of registered patients are never contacted without prior consent of the Department of Health's Institutional Review Board and the notification of the patient's physician.
1.g.iii.1.c. Policies for protecting proprietary data
Data collected by the registry can be used only for surveillance and to facilitate epidemiologic research into the prevention of environmental diseases, as prescribed by Public Health Law 206(1J). Confidentiality of all data reported to the Registry is strictly maintained by Department of Health staff and rigorously safeguarded by Section 206(1J), which specifically prohibits the release of personal identifiers.
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) Lin S, Herdt-Losavio M, Gensburg L, Marshall E, Druschel C. "Maternal asthma, asthma medication use and the risk of congenital heart defects." Birth Defects Research, Part A 2009; 85(2):161-168.
2) Kumar J, Gordillo R, Kaskel FJ, Druschel CM, Woroniecki RP. "Increased Prevalence of Renal and Urinary Tract Anomalies in Children with Congenital Hypothyroidism." The Journal of Pediatrics 2009; 263-266.
3) Wang Y, Tao Z, Cross PK, Le LH, Steen PK, LaSelva nee-Babcock GD, Druschel CM, Hwang SA. Development of a Web-based Integrated Birth Defects Surveillance System in New York State. J Public Health Manag & Pract. 2008; 14(6):E1-E10.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Physicians are required by law to submit information on patients diagnosed with a congenital malformation.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Not available
2.d.i.1. What is the evidence? Not available
3.a. (Y/N) Does the network have biobanks? Yes
3.b. What types of biospecimens are collected? DNA samples
3.c. What types of analysis are done on them? Chromosomal studies reporting the karyotype
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
An ID and password are needed to input and review patient information. Physicians can see information only for patients whose information they input. Browser must support 128-bit strength SSL encryption.
4.b.i. (Y/N) Are queries distributed via a central hub? Not applicable
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9-CM, ICD-10-CM, British Pediatric Association (BPA)
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Congenital Anomalies, Fetal Alcohol Syndrome, Amniotic Bands, Congenital Infections (including rubella, cytomegalovirus, toxoplasmosis, and herpes simplex), lipoma, benign neoplasm of skin, hemangioma of skin, umbilical hernia, accessory auricle, other specified anomalies of ear, unspecified anomaly of ear, branchial cleft cyst, other specified anomalies of face and neck, other unspecified anomalies of face and neck, single umbilical artery, embryonic cyst of cervix, vagina and external female genitalia, imperforate hymen, dermatoglyphic anomalies, vascular hamartomas, congenital pigmentation anomalies of skin, other anomalies of skin, specified anomalies of hair, specified anomalies of nails, specified anomalies of breast, other specified anomalies of integument, unspecified anomalies of the integument
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? No tools utilized - information is input into the patient-level data form
Table 1
*Table from http://www.health.ny.gov/diseases/congenital_malformations/2007/section1.htm
Criteria Answers
1.a. How many people does the network cover or involve? 200,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Web-based asthma registry with longitudinal tracking/reporting of patient, transparent comparative practice, and network-level data for key process and outcome measures
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Mandel KE. Aligning rewards with large-scale improvement. JAMA. 2010 Feb 17;303(7):663-4.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $1,400,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? $200,000
1.c.i.2. How much of that budget is dedicated to conducting studies? Percentage of the annual budget
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? $200,000
1.d. How many years has this network existed? 16
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Children with asthma
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable - Quality improvement initiative falls under “operations,” thus obtaining patient consent is not required. Business associate agreements are in place between each primary care practice and the PHO. Primary care practices issue a notice of privacy practices document to patients/families.
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Not applicable
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
No formal policies exist—these decisions would be made by primary care independent practice association (IPA) Board and PHO Board.
1.g.iii.1.b. Policies for sharing data outside the network
No formal policies exist—these decisions would be made by primary care independent practice association (IPA) Board and PHO Board.
1.g.iii.1.c. Policies for protecting proprietary data
No formal policies exist—these decisions would be made by primary care independent practice association (IPA) Board and PHO Board.
Physician-Hospital Organization (PHO)
2.a. Three most recent (or high impact) studies published in peer-reviewed journals
1) “Aligning Rewards with Large-Scale Improvement” (JAMA, 2010)
2) “Planning a Registry: Managing Care and Quality Improvement for Chronic Diseases” (Agency for Healthcare Research and Quality: “Registries for Evaluating Patient Outcomes: A User’s Guide, 2nd Edition,” 2010)
3) “Pay for Performance Alone Cannot Drive Quality” (Archives of Pediatrics and Adolescent Medicine, 2007)
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
Yes
2.b.i. What is the evidence? Mandel KE. Aligning rewards with large-scale improvement. JAMA. 2010 Feb 17;303(7):663-4.
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Yes
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Health Organizations refer patients, give access to EHR data, and participate in research activities
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not supported; would need to obtain approval from IPA Board, PHO Board, and IRB
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? HIPAA security privacy protection
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9, CPT
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Demographic data captured in web-based registry/database: date of birth, address (including zip code), payor
Data collected at point of care from patients/parents and providers. Admission and ED/urgent care visit data
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
Yes
4.j.ii. What informatics tools are used? Primary care practice billing systems: structured queries submitted to PHO via secure email file transfer.
Criteria Answers
1.a. How many people does the network cover or involve? 0 (user enrollment will begin on Feb. 28th)
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Not applicable
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Not applicable
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Not applicable
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not applicable
1.b.i.2. Demographics: geography Not applicable
1.b.i.3. Demographics: age Not applicable
1.b.i.4. Demographics: gender Not applicable
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Sanofi's Collaborate Activate Innovation Challenge
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? Network goes online on Feb. 28th
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? No
1.e.i.1. What does the network focus on? Not applicable
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-making process?
Participants control what data they store, with whom they share, and for what purposes their information is used
1.g.ii.1. What are the sources of Self-Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-related quality of life)
Self-reported
1.g.ii.2. What are the sources of Health care-Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Any research study must first meet all of clinicaltrials.gov's requirements. Users decide if they want to be discoverable for research. A researcher sees aggregate data when making the query to search for study participants. Reg4All sends the users all the information from clinicaltrials.gov and IRB approval; the participant then decides whether to make himself or herself available for the study, at which point the user's identifying information is shared with the researcher.
Reg4ALL
1.g.iii.1.b. Policies for sharing data outside the network No data shared outside the network
1.g.iii.1.c. Policies for protecting proprietary data No data shared outside the network
2.a. Three most recent (or high impact) studies published in peer-reviewed journals No publications
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Registry has not been in existence long enough to need to standardize data over time
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Outreach organizations, Genetic Alliance, Sanofi will refer patients to Reg4All
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
Approved researchers send a query for the audience they want to connect with and the results are presented back to them in an aggregate format.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? ICD-9/10
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
NIH common data elements (CDE) codes
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-reported outcomes, etc.).
Surveys asking common health questions, common data elements (NIH CDE) in a survey format, disease specific data, uploaded clinical data sets from their EHRs and data from groups like Personal Genome Project, biobanked tissue (2014)
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
When a researcher first searches for potential participants, they can see counts of participants in the registry that meet the research criteria.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-level data?)
No
4.j.ii. What informatics tools are used? Not applicable
Criteria Answers
1.a. How many people does the network cover or involve? 31,806
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Enables volunteers to be matched with researchers for a wide variety of studies involving different diseases and conditions
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic
White: 79%; Black or African American: 11%; Hispanic or Latino: 6%; Asian: 4%; American Indian or Alaska Native: 1%; Multi-Racial: 3%; Other: 2%
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Male: 28%; Female: 72%
1.c.i. What is the total annual budget? Confidential
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Confidential
1.c.i.2. How much of that budget is dedicated to conducting studies? Confidential
1.c.ii. What are the current sources of funding? NIH, Clinical and Translational Science Award (CTSA) and National Center for Advancing Translational Sciences (NCATS)
1.c.iii. How much does it cost each year to maintain and update the network? Confidential
1.d. How many years has this network existed? 3
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? To match volunteers to researchers for studies
1.f. (Y/N) Does the network use informed consent forms? No
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not applicable
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-contacted for consent for a new study? Yes; it is the responsibility of the researcher to re-contact the patient
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Yes
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Patients may enter as much or as little information as they like. A patient who decides to stop participating in ResearchMatch can remove their profile, after which their information is no longer shared or available.
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Self-‐Reported
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Not applicable
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
ResearchMatch
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Researchers from any participating institution may use ResearchMatch to recruit study participants. Researchers must first agree to ResearchMatch’s rules of use, including maintaining volunteers’ confidentiality and stipulating that all activity will be approved by local IRBs. After creating a ResearchMatch profile (contact information plus username/password), the researcher must electronically submit an IRB approval letter for at least one actively recruiting study. Researchers’ access requests are automatically routed to the appropriate institutional liaisons, who confirm the requests’ legitimacy and accuracy using local IRB approval letters. Once access is approved, the liaison sets an access expiration date that corresponds to the study’s local IRB expiration date; the liaison can extend the access expiration date on receiving proof that the local IRB has extended its approval. ResearchMatch allows more than one authorized researcher to access the same protocol (e.g., a principal investigator plus multiple study coordinators). Access by other researchers in the same study requires permission from the principal investigator and the institutional liaison.
1.g.iii.1.b. Policies for sharing data outside the network The researcher must be part of the network (a CTSA institution); ResearchMatch is in the process of expanding outside the network
1.g.iii.1.c. Policies for protecting proprietary data
No one has access to the user's data unless they give permission via their account settings to share their identifying information with researchers. None of the staff and/or liaisons have access to the volunteer's data.
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
None
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
No
2.b.i. What is the evidence? Not applicable
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.)
Each organization provides a liaison to work with ResearchMatch; the liaison then coordinates with the researchers at the local site as well as with the local IRB
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
No
2.d.i.1. What is the evidence? Not applicable
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
Data are encrypted at rest. The application is written in the PHP scripting language and is housed on the VUMC Apache server primary website. The back-end database for the application is a MySQL server maintained on a separate server, which houses all of the data related to the registry. All research subject recruitment data sent between the web server and browsers are encrypted using Secure Sockets Layer (SSL) protection. Any record fields identified as health information (HI) are encrypted before being stored in the database (encryption at rest) to ensure maximum data security. Both web and database servers are secure and firewall protected. Inputs are also filtered for web attacks, such as cross-site scripting or SQL injection.
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? UMLS
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? No
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not applicable
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Geographical data, demographic data (age, height, weight, body mass index, gender, race, ethnicity, tobacco use, multiple birth status), and medical conditions
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Not applicable
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
1.a. How many people does the network cover or involve? 1,300,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
None
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes, but only for studies involving diabetes mellitus
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Not available
1.a.iii.1. What is the evidence? Not available
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? Not available
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? Not available
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? Not available
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Diabetes mellitus
1.f. (Y/N) Does the network use informed consent forms? Not available
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not available
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not available
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Not available
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
Not available
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not available
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Not available
1.g.iii.1.b. Policies for sharing data outside the network Not available
1.g.iii.1.c. Policies for protecting proprietary data Not available
SUPREME-DM
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Nichols GA, Desai J, Elston Lafata J, Lawrence JM, O’Connor PJ, Pathak RD, Raebel MA, Reid RJ, Selby JV, Silverman BG, Steiner JF, Stewart WF, Vupputuri S, Waitzfelder B. Construction of a Multisite DataLink Using Electronic Health Records for the Identification, Surveillance, Prevention, and Management of Diabetes Mellitus: The SUPREME-DM Project. Preventing Chronic Disease 2012; 9:110311. DOI: http://dx.doi.org/10.5888/pcd9.110311
2) Desai JR, Wu P, Nichols GA, Lieu TA, O'Connor PJ. Diabetes and asthma case identification, validation, and representativeness when using electronic health data to construct registries for comparative effectiveness and epidemiologic research. Medical Care 2012 Jul; 50 Suppl:S30-5. PMID: 22692256
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Nichols GA, Desai J, Lawrence JM, Reid R, Schroeder EB, Steiner JF, Vupputuri S, Yan X, for the SUPREME-DM Study Group. 5-Year incidence of diabetes among 6.7 million adult HMO members: The SUPREME-DM project. Diabetes 2012; 61(Suppl 1):A 356.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
Not available
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not available
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) About 11 healthcare organizations provide demographic and clinical data elements and EHR data.
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Not available
2.d.i.1. What is the evidence? Not available
3.a. (Y/N) Does the network have biobanks? Not available
3.b. What types of biospecimens are collected? Not available
3.c. What types of analysis are done on them? Not available
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Not available
3.d.i. What types of analyses do they conduct on them? Not available
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not available
4.a. What type of security technology does the network use? Not available
4.b.i. (Y/N) Are queries distributed via a central hub? No
4.b.ii. What is the architecture of the query distribution? Not applicable
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Not available
4.c.ii. Which terminologies? Not available
4.d.i. (Y/N) Does the network use a common data model (CDM)? Yes
4.d.ii. Which CDM is used? HMORN Virtual Data Warehouse
4.d.iii. How are the data transformed and mapped? The data are mapped and transformed locally at each site to its own Virtual Data Warehouse
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Not available
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
EHR including demographic and clinical data elements
4.g.i. (Y/N) Does the network use natural language processing? Yes
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
MediClass
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
Yes
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Programs are typically distributed via e-mail or by posting them to a secure website. They must be manually downloaded, approved by the individual site for execution, and run by personnel at the sites; results are then returned manually. Thus, site personnel retain complete control over their local data.
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Not available
4.j.ii. What informatics tools are used? Not available
1.a. How many people does the network cover or involve? 1,000,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
None
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
No
1.a.iii.1. What is the evidence? Not applicable
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $42,762,536
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Not available
1.c.i.2. How much of that budget is dedicated to conducting studies? Not available
1.c.ii. What are the current sources of funding? HRSA, computer registration fees when patients are listed for transplants, data services
1.c.iii. How much does it cost each year to maintain and update the network? Not available
1.d. How many years has this network existed? 13
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Organ donation and transplants
1.f. (Y/N) Does the network use informed consent forms? No, not for purposes of being entered into the registry. UNOS has an IRB exemption.
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Specific consent is necessary for extenuating situations, for example, a patient who wants to receive an organ from an expanded criteria donor
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not applicable
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? No
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
Other: clinical information, medical history, and treatment information entered manually into the UNOS system by the participating hospitals
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
Data use agreements
1.g.iii.1.b. Policies for sharing data outside the network Data use agreement; researchers who are not members of UNOS are charged for the data they receive
1.g.iii.1.c. Policies for protecting proprietary data UNOS does not store patient identity information in its database
United Network for Organ Sharing (UNOS)
2.a. Three most recent (or high impact) studies published in peer-reviewed journals Not available
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Not available
2.b.i. What is the evidence? Not available
2.b.ii. (Y/N) Can researchers conduct follow-up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) By entering data about donors and candidates via a web-based application run by UNOS
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Not available
2.d.i.1. What is the evidence? Not available
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? No
3.d.i. What types of analyses do they conduct on them? Not applicable
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Not applicable
4.a. What type of security technology does the network use?
UNOS monitors emerging threats and vulnerabilities using physical and automated tools and audits performed by internal and external personnel, with the goal of zero security incidents and minimal interruption to service. Future improvements to security are based on a process-driven analysis of emerging security threats and vulnerabilities, realistic assessment of the risk, implementation of controls to mitigate the risk, and regular testing of the controls to assure proper operation.
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution?
A researcher submits a data request and the Research Department at UNOS returns the data in the form of a report or a research dataset.
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? No
4.c.ii. Which terminologies? Not applicable
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
UNOS uses a data dictionary
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Clinical information, medical history, treatment information
4.g.i. (Y/N) Does the network use natural language processing? Not available
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-based) are being used?
Not available
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? SAS scripts
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
No
4.j.ii. What informatics tools are used? Not applicable
1.a. How many people does the network cover or involve? 6,500,000
1.a.i. Evidence of capacity for expansion to cover additional lives, diseases, conditions, or procedures
Supports a wide variety of research covering different conditions
1.a.ii.1. Can the network be used for new studies in the same or a different condition? Yes
1.a.iii. (Y/N) Is there evidence from the past that show the network can be used for clinical care delivery or quality improvement?
Yes
1.a.iii.1. What is the evidence? Coffield JE, Metos JM, Utz RL, Waitzman NJ. "A multivariate analysis of federally mandated school wellness policies on adolescent obesity." J Adolesc Health. 2011 Oct;49(4):363-70.
1.b.i.1. Demographics: racial/ethnic Not available
1.b.i.2. Demographics: geography Not available
1.b.i.3. Demographics: age Not available
1.b.i.4. Demographics: gender Not available
1.c.i. What is the total annual budget? $1,500,000
1.c.i.1. How much of that budget is dedicated to infrastructure and maintenance? Percentage of the $1.5 million
1.c.i.2. How much of that budget is dedicated to conducting studies? Percentage of the $1.5 million
1.c.ii. What are the current sources of funding? NIH, Huntsman Cancer Institute
1.c.iii. How much does it cost each year to maintain and update the network? Percentage of the $1.5 million
1.d. How many years has this network existed? 30
1.e.i. (Y/N) Does the network have a focus (i.e., topic area or purpose)? Yes
1.e.i.1. What does the network focus on? Biomedical, cancer, and other health-related research across the state of Utah
1.f. (Y/N) Does the network use informed consent forms? Yes
1.f.i. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their electronic data?
Not available
1.f.ii. Do patients consent to the broad (meaning data may be analyzed for other research) or specific use of their biological specimens?
Not available
1.f.iii. (Y/N) Can patients be re-‐contacted for consent for a new study? Yes
1.g.i. (Y/N) Are patients involved in the decision-‐making process on the use of the data they provided to the network?
No
1.g.i.1. What are the roles patients play and in what mechanism? How are they involved in the decision-‐making process?
Not applicable
1.g.ii.1. What are the sources of Self-‐Reported data collected in the network? (e.g., conditions, medications, medication adherence, procedures, labs/imaging, health-‐related quality of life)
Not applicable
1.g.ii.2. What are the sources of Health care-‐Derived data collected in the network? (e.g., coded diagnostics, pharmacy orders, pharmacy fulfillment, procedures, lab orders, diagnostic results, imaging data)
EHR
1.g.ii.3. What are the sources of Clinical Trials data collected in the network? (e.g., coded diagnostics, drug information, procedures, lab orders, diagnostic results, imaging data, biospecimen, health-‐related quality of life)
Not applicable
1.g.iii.1.a. Data use and sharing policies for institutional investigators to collaborate with each other using the data
All research projects must have IRB and RGE (Resources for Genetic and Epidemiologic Research) approval
1.g.iii.1.b. Policies for sharing data outside the network All research projects must have IRB and RGE (Resources for Genetic and Epidemiologic Research) approval
Utah Population Database
1.g.iii.1.c. Policies for protecting proprietary data HIPAA, Data Use Agreement
2.a. Three most recent (or high impact) studies published in peer-‐reviewed journals
1) Hawkes JE, Cassidy PB, Manga P, Boissy RE, Goldgar D, Cannon-Albright L, Florell SR, Leachman SA. "Report of a novel OCA2 gene mutation and an investigation of OCA2 variants on melanoma risk in a familial melanoma pedigree." J Dermatol Sci. 2013 Jan;69(1):30-7. doi: 10.1016/j.jdermsci.2012.09.016.
2) Hurdle JF, Haroldsen SC, Hammer A, Spigle C, Fraser AM, Mineau GP, Courdy SJ. "Identifying clinical/translational research cohorts: ascertainment via querying an integrated multi-source database." J Am Med Inform Assoc. 2013 Jan 1;20(1):164-71. doi: 10.1136/amiajnl-2012-001050.
3) Xu J, Lange EM, Lu L, Zheng SL, Wang Z, Thibodeau SN, Cannon-Albright LA, Teerlink CC, Camp NJ, Johnson AM, Zuhlke KA, Stanford JL, Ostrander EA, Wiley KE, Isaacs SD, Walsh PC, Maier C, Luedeke M, Vogel W, Schleutker J, Wahlfors T, Tammela T, Schaid D, McDonnell SK, Derycke MS, Cancel-Tassin G, Cussenot O, Wiklund F, Gronberg H, Eeles R, Easton D, Kote-Jarai Z, Whittemore AS, Hsieh Cl, Giles GG, Hopper JL, Severi G, Catalona WJ, Mandal D, Ledet E, Foulkes WD, Hamel N, Mahle L, Moller P, Powell I, Bailey-Wilson JE, Carpten JD, Seminara D, Cooney KA, Isaacs WB; International Consortium for Prostate Cancer Genetics. "HOXB13 is a susceptibility gene for prostate cancer: results from the International Consortium for Prostate Cancer Genetics (ICPCG)." Hum Genet. 2013 Jan;132(1):5-14. doi: 10.1007/s00439-012-1229-4.
2.b. (Y/N) Have researchers conducted studies that involve longitudinal (multiple values rather than one time) follow-‐up?
Yes
2.b.i. What is the evidence? Brown SM, Jones JP, Aronsky D, Jones BE, Janspa MJ, Dean NC. "Relationships among initial hospital triage, disease progression and mortality in community-acquired pneumonia." Respirology. 2012 Nov;17(8):1207-13. doi: 10.1111/j.1440-1843.2012.02225.x.
2.b.ii. (Y/N) Can researchers conduct follow-‐up or ongoing observation from existing reports by passively reviewing data rather than actively pulling it?
No
2.b.ii.1. How do researchers standardize those data items? (e.g., how do researchers standardize survey type questions over a period of time?)
Not applicable
2.c.i. (Y/N) Are healthcare organizations (hospitals, outpatient centers) actively participating or engaging in research activities conducted by the network?
Yes
2.c.ii. How? (Examples: by referring patients, giving access to EHRs, etc.) Giving access to EHR data
2.d.i. (Y/N) Have there been any randomized control trials using the data collected in the network?
Yes
2.d.i.1. What is the evidence? Not available
3.a. (Y/N) Does the network have biobanks? No
3.b. What types of biospecimens are collected? Not applicable
3.c. What types of analysis are done on them? Not applicable
3.d. (Y/N) Do researchers in the network collect biospecimens for research purposes? Yes
3.d.i. What types of analyses do they conduct on them? Genome sequencing, identify biomarkers
3.d.ii. Were they able to link the analysis/research results back to patient outcomes?
Yes
4.a. What type of security technology does the network use? firewalls, HIPAA review
4.b.i. (Y/N) Are queries distributed via a central hub? Yes
4.b.ii. What is the architecture of the query distribution? Not available
4.c.i. (Y/N) Does the network use standardized terminologies (i.e., ICD-‐9, SNOMED, etc.)? Yes
4.c.ii. Which terminologies? CPT, ICD-9, Diagnosis Related Group codes (DRG)
4.d.i. (Y/N) Does the network use a common data model (CDM)? No
4.d.ii. Which CDM is used? Not applicable
4.d.iii. How are the data transformed and mapped? Not applicable
4.e.i. (Y/N) Does the network collect additional fields to help with analysis and interpretation (metadata)? Yes
4.e.i.1. What standards, possibly home grown, are used? If home grown, is there a way to map back to standards? (Data Dictionary?)
Not available
4.f. List the types of data that are being collected or accessed and incorporated into the network (e.g., EHR data, claims, patient-‐reported outcomes, etc.).
Family History (Genealogy File and Ancestral File), Cancer Records (Utah Cancer Registry, Cancer Data Registry of Idaho), Vital Records (Birth and Death Certificates, Marriage and Divorce Records), Utah Driver License, Social Security Death Index, Voter Registration, Patient visits, demographic information, facility code, admission date, discharge date and status, principal diagnosis code, other diagnosis codes, CPT-4 or principal procedure codes, other procedure codes, procedure coding method, total charges, primary payer, secondary payer, third payer. Claims data, hospital code, principal diagnosis and principal procedure codes, eight (maximum) other diagnosis and other procedure codes, an external injury E-code, admit and discharge information, mortality risk codes, and payer category
4.g.i. (Y/N) Does the network use natural language processing? No
4.g.ii. What applications (e.g., UIMA, cTAKES, NegEx, MetaMap, many different parsers, etc.) or approaches (examples are machine learning, rule-‐based) are being used?
Not applicable
4.h.i. (Y/N) Are data aggregated before the data leave the local site and are shared with the network?
No
4.h.ii. How are the data transformed (i.e., based on what criteria are the data aggregated)?
Not applicable
4.i. What data (statistical) analysis tools, if any, are available for researchers through the network? Kinclass and Dynaped
4.j.i. (Y/N) Are administrative, billing, and/or clinical records integrated into longitudinal patient-‐level data? (Are administrative, billing, and clinical records kept in individual places or lumped in with patient-‐level data?)
Yes
4.j.ii. What informatics tools are used? Not applicable