© 2009 IBM Corporation
Implementation of CDISC at BI – Overview
CDISC German User Group Meeting Sep 2009
Dr. Jens Wientges
IBM Global Business Services, Life Sciences / Pharma Consulting
1. Motivation, Objectives and expected Benefits
2. System Landscape, Data Flow and Processes
3. Approach
4. Real life examples of issues and sponsor defined elements
1. Motivation, Objectives and expected Benefits
Implementing CDISC at BI (ICBI) - Motivation
– Requests for analyses on substance/project databases (SDB/PDB) are increasing
  • need to use and exploit clinical data effectively beyond single trials
  • need to build efficient substance databases
– A harmonized data model based on CDISC allows for
  • a wider range of standard reporting tools
  • re-use of standard programs
  • facilitated familiarization with new trials/projects
  • higher flexibility in assignments to projects
  • quicker response to regulatory requests (same view on data)
– BI has decided to implement the CDISC data standards to manage, exploit and report clinical data effectively
ICBI - Objectives
Corporate-wide, harmonized clinical data structure
1. Effective for:
   - single clinical trials
   - pooled databases (PDB)
2. Operational data structure, allowing:
   - data quality checks
   - ADS/ADaM generation
   - ad hoc statistical analysis
3. Based on the principles of the CDISC data standards
ICBI - Business Benefits
Shown in three categories:
1. Submission / Regulatory Compliance
2. Knowledge Generation
3. Effort & Time Saving
ICBI - Business Benefits
1. Submission / Regulatory Compliance
– Working with a data structure close to the one requested for submission
  • allows traceability from analysis data (ADaM) back to raw data (BI-CDISC and plain SDTM)
  • allows for semi-automated generation of plain SDTM and define.xml
  • is a one-time effort per submission
  • is less time-consuming
  • creates no external costs
– Having the same view on data as the authorities
  • increases transparency
  • leads to higher efficiency / shorter turn-around time in answering questions
Standardized Data Structure will
- further enhance compliance with regulatory requirements
- allow more efficient creation of the submission package
ICBI - Business Benefits
2. Knowledge Generation
Working with one data structure across trials:
• Allows easier creation of PDB and pooling of trial data
• Leads to effective meta-analyses on project and/or substance level
• Increases re-use of standard programs, program templates and views
• Supports exchange between OPUs and functions (e.g. PK/PD, PGx, partners, …)
• Allows (semi-)automated load, transformation and incorporation of external data from vendors, suppliers, pharmaceutical and collaboration partners
• Leads to higher flexibility in assignments to trial & project tasks
• Reduces time to answer internal requests (various customers, e.g. medical affairs)
• Reduces time to answer external (regulatory) questions
Standardized Data Structure will further enhance effective pooling of data and pooled analyses
ICBI - Business Benefits
3. Effort & Time Saving
Working with BI-CDISC facilitates downstream processes:
• Semi-automated generation of define.xml for SDTM and ADS/ADaM
  - no review cycles for define.xml generated externally
• Same view on data as authorities
  - increases transparency
  - results in higher efficiency in answering questions
• A higher degree of automation, making use of metadata (CDR)
  - enables more efficient programming
  - reduces validation efforts
  - reduces effort for creation of standard ADS/ADaM
Standardized Data Structure will
- establish a higher level of standardization
- further enhance analysis with reduced timelines
2. System Landscape, Data Flow and Processes
Chosen Approach for BI-CDISC
In line with the recommendations of the SDTM and Analysis Datasets Implementation Expert Team for a CDISC data standards implementation, we defined the following cornerstones for our data model:
1. Define a sponsor-specific in-house data structure (BI-CDISC) and create SDTM and ADaM/ADS in parallel from there
2. Define transformation rules from BI-CDISC to SDTM and from BI-CDISC to ADaM/ADS (but do not create ADS from SDTM)
3. The data model contains both collected and derived data
4. The data model omits RELREC and SUPPQUAL (these are only created when plain SDTM is generated for submission)
5. BI-CDISC makes use of the SDTM vocabulary
   • SDTM vocabulary is defined as variable metadata and controlled terminology, not the SDTM structure
6. BI-CDISC is defined by metadata and (long-term vision) metadata shall drive the transformations from BI-CDISC to SDTM and ADaM/ADS. Traceability between SDTM and ADaM is sufficiently ensured by including the SEQ variable in the CDR and passing it on to SDTM/ADaM, and/or by metadata defining the various transformation steps
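Cornerstones 1, 2 and 6 can be illustrated with a minimal Python sketch. It is not the BI implementation: the metadata layout, the variable selection and the sample values are assumptions made for illustration only. Both targets are derived in parallel from the same BI-CDISC records, and the inherited sequence variable links an analysis record back to its SDTM record.

```python
# Hypothetical variable metadata: for each BI-CDISC variable, the target name
# in plain SDTM and in ADS/ADaM (None means the variable is not carried over).
VARIABLE_METADATA = [
    {"source": "USUBJID", "sdtm": "USUBJID", "adam": "USUBJID"},
    {"source": "LBSEQ", "sdtm": "LBSEQ", "adam": "LBSEQ"},
    {"source": "LBTESTCD", "sdtm": "LBTESTCD", "adam": "PARAMCD"},
    {"source": "LBORRES", "sdtm": "LBORRES", "adam": None},
    {"source": "LBORRESN", "sdtm": None, "adam": "AVAL"},
]


def transform(records, target):
    """Select and rename BI-CDISC variables for one target ('sdtm' or 'adam')."""
    mapping = {
        m["source"]: m[target] for m in VARIABLE_METADATA if m[target] is not None
    }
    return [{new: rec[old] for old, new in mapping.items()} for rec in records]


# Invented sample records in the in-house structure
bi_cdisc_lb = [
    {"USUBJID": "T1-001", "LBSEQ": 1, "LBTESTCD": "GLUC", "LBORRES": "5.4", "LBORRESN": 5.4},
    {"USUBJID": "T1-001", "LBSEQ": 2, "LBTESTCD": "HGB", "LBORRES": "13.9", "LBORRESN": 13.9},
]

# Both targets are created in parallel from BI-CDISC, not one from the other
sdtm_lb = transform(bi_cdisc_lb, "sdtm")
adam_lb = transform(bi_cdisc_lb, "adam")

# Traceability: look up the SDTM record behind the second analysis record
sdtm_by_key = {(r["USUBJID"], r["LBSEQ"]): r for r in sdtm_lb}
traced = sdtm_by_key[(adam_lb[1]["USUBJID"], adam_lb[1]["LBSEQ"])]
print(traced["LBORRES"])
```

Because the rename rules live in one metadata table, adding a variable to either target would only require a new metadata row in this sketch.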
ICBI Data Flow through System Landscape
[Figure: data flow diagram. Recoverable labels: Study Setup; O*C Trial Database; O*C Export; Data Load; Load from O*C and Transform in CDR (LSH); CDR (LSH) Trial Database / Substance DB; Transform CDR 1; Transform CDR 2; Trial 1 SDTM+; Trial 2 SDTM+; Pool as is; Pooled Database; ADS Dev.; Displays Dev.; SDTM, ADaM, Tables, Listings, Profiles, + Metadata, define.xml; Final Report; Submission to FDA; Meta info; Master Mapping Table (trial specifics manually / partially manually).]
Cornerstones of ICBI
• There will be no impact on early processes like study set-up, data entry, and user friendliness of RDC. Data cleaning and discrepancy management remain in O*C
• ICBI requires a certain upfront effort (once for each trial) for the trial-specific transformation to SDTM+ and its QC/validation
• Once data are available in the O*C database, they are loaded into LSH. Loading is triggered by a completed Batch Validation session in O*C
• After the data have been loaded into LSH, they can be transformed automatically into the SDTM+ structure (load and transformation steps can be combined in one LSH workflow)
• ADS/ADaM will be created from SDTM+ and form the basis for reporting
• The submission data sets in plain SDTM are created by sub-setting and restructuring SDTM+ (can be automated)
• The define.xml can be created semi-automatically from the metadata available in LSH, thus improving quality (fewer inconsistencies) and timely delivery of the final submission data sets
• To gather all meta information needed for SDTM, ADS and define.xml, a process needs to be implemented that captures the meta information throughout the process (see Module “Meta Data Collection and Master Mapping Table”)
• To enable DQRM reporting to be based on SDTM+, the data need to be available in SDTM+ structure early, close to First Patient In
• Training would be required for all functions working with the data in LSH. The O*C part of the process would not be affected (only overview training recommended)
3. Approach
Overall Approach
[Figure: overview diagram. Recoverable labels: Sources; O*C; OC Views; Mapping Table; BI-DM; Plain SDTM.]
Inputs listed on the slide:
• SDTM Implementation Guide
• CDISC Controlled Terminology
• BI-DM User Requirements
• BI PDB Requirements
• BI GLIB CT (formats)
• ADaM IG
• BI ADS Guideline
• Data Quality Requirements
• T/PSAP
• ADS Plan
• Protocol
• aCRF
Overall Approach – Trials
• Design Data Model based on two trials of indication A
• Expand Data Model with two trials of indication B
• Prove Data Model (PoC)
  – Create Pooled Database (PDB) of all four trials
  – Re-create trial ADS from PDB
  – Create submission SDTM from PDB
Overall Approach – Teams
Topic teams:
• Treat/Exposure
• Efficacy
• Safety
• Lab/Ext. Data
Teams with one rep from each team:
• Keys & Relations
• CT & Formats
Overall Approach – Scope for Teams
Topics:
• Lab - External Data
• Safety
• Efficacy
• Treat. - Exposure - TD
Studies: Study A, Study B
O*C Views available for the studies used for mapping
• are the starting point for the mapping
• are divided up among the groups according to topics
  - topics are based on logical grouping of SDTM domains
4. Real life examples of issues and sponsor defined elements
Using --SEQ
• --SEQ should not be used for any SAS/SQL evaluation
• --SEQ is dynamically assigned and might change until a database is locked
  - If BI-CDISC datasets are created multiple times prior to lock, --SEQ will be assigned differently whenever rows/observations have been added or removed
• In different snapshots of the same trial, the value of --SEQ will not be applied consistently to common observations
• The Keys and Relations team does not consider the above points to be issues (maintaining consistency in --SEQ would be very difficult or impossible to achieve, with little or no gain)
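The snapshot effect can be reproduced with a minimal Python sketch. The assignment rule used here (number the records per subject after sorting by start date and term) is an assumption made for illustration; the slides do not state how --SEQ is actually assigned. The data values are invented.

```python
def assign_aeseq(records):
    """Assign AESEQ per subject by position after sorting (hypothetical rule)."""
    counters = {}
    numbered = []
    ordered = sorted(records, key=lambda r: (r["USUBJID"], r["AESTDTC"], r["AETERM"]))
    for rec in ordered:
        counters[rec["USUBJID"]] = counters.get(rec["USUBJID"], 0) + 1
        numbered.append({**rec, "AESEQ": counters[rec["USUBJID"]]})
    return numbered


def seq_of(rows, term):
    """Return the AESEQ of the first record with the given term."""
    return next(r["AESEQ"] for r in rows if r["AETERM"] == term)


snapshot_1 = [
    {"USUBJID": "001", "AETERM": "HEADACHE", "AESTDTC": "2009-03-02"},
    {"USUBJID": "001", "AETERM": "NAUSEA", "AESTDTC": "2009-03-10"},
]
# Later snapshot: an earlier event was entered after the first snapshot
snapshot_2 = snapshot_1 + [
    {"USUBJID": "001", "AETERM": "DIZZINESS", "AESTDTC": "2009-03-01"},
]

before = seq_of(assign_aeseq(snapshot_1), "HEADACHE")
after = seq_of(assign_aeseq(snapshot_2), "HEADACHE")
print(before, after)  # 1 2
```

The same observation carries a different sequence number in the two snapshots, which is why selections or joins should rely on content-based keys instead.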
I. Pooling Identifiers / Keys
Proposed Variables are:
1. SUBSTANCE
2. PROJECT
3. STUDYID
4. USUBJID/PTNO
5. VISITNUM
6. TPTNUM
7. VISDT
8. --DT
9. --ONDT
10. --ENDT
11. --CAT
12. --SCAT
13. --TESTCD
14. --METHOD
15. --SPEC
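How such identifiers might be used when pooling is sketched below in Python. The key subset, the trial and substance names and the values are invented for illustration. Each record is stamped with its substance and project, and a check confirms that the chosen key is unique in the pooled data.

```python
from collections import Counter

# Subset of the proposed identifiers, chosen for this example only
POOLING_KEYS = ["SUBSTANCE", "PROJECT", "STUDYID", "PTNO", "VISITNUM", "LBTESTCD"]


def pool(trials):
    """Stack trial datasets, stamping each record with substance and project."""
    pooled = []
    for trial in trials:
        for rec in trial["records"]:
            pooled.append(
                {"SUBSTANCE": trial["substance"], "PROJECT": trial["project"], **rec}
            )
    return pooled


def duplicate_keys(records, keys):
    """Return key combinations that identify more than one record."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return [key for key, n in counts.items() if n > 1]


trial_a = {"substance": "SUBST1", "project": "PROJ1", "records": [
    {"STUDYID": "TRIAL-A", "PTNO": 1001, "VISITNUM": 1, "LBTESTCD": "GLUC"},
    {"STUDYID": "TRIAL-A", "PTNO": 1001, "VISITNUM": 2, "LBTESTCD": "GLUC"},
]}
trial_b = {"substance": "SUBST1", "project": "PROJ1", "records": [
    {"STUDYID": "TRIAL-B", "PTNO": 1001, "VISITNUM": 1, "LBTESTCD": "GLUC"},
]}

pooled = pool([trial_a, trial_b])
# Patient numbers alone clash across trials; the full key does not
clashes_without_study = duplicate_keys(pooled, ["PTNO", "VISITNUM", "LBTESTCD"])
clashes_with_full_key = duplicate_keys(pooled, POOLING_KEYS)
print(clashes_without_study, clashes_with_full_key)
```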
ICBI – Interdomain Dependencies
Mappings are often not trivial
– BI-CDISC variables should be derived only once and from one single source
– Domains have to be created/populated in a defined order
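One way to obtain a defined creation order is a topological sort over the interdomain dependencies. The Python sketch below uses the standard-library `graphlib` module (Python 3.9 or later). The dependency table is an assumption for illustration, not the dependency set identified in the project.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: domain -> domains that must be populated first
DEPENDS_ON = {
    "DM": set(),
    "EX": {"DM"},        # e.g. exposure needs subject-level data
    "AE": {"DM", "EX"},  # e.g. AE derivations refer to treatment periods
    "LB": {"DM"},
}

# static_order() yields every domain after all of its predecessors;
# a circular dependency raises graphlib.CycleError
build_order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(build_order)
```

Domains without a mutual dependency (here EX and LB) may come out in either order, so only the relative order of dependent domains should be relied on.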
CT Consolidation – LABNM Format
For LABNM (>1000 code/decodes) it was decided to split the values out into three variables (LBTESTCD, LBSPEC and LBMETHOD)
In special cases additional variables are required (position, fasting status, time, …)
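The split can be pictured as a lookup against a mapping table. The following Python sketch makes that assumption explicit: the three LABNM codes and their decompositions are invented, and the real format holds more than 1000 entries.

```python
# Hypothetical excerpt of a LABNM mapping table (codes and values invented)
LABNM_MAP = {
    1: {"LBTESTCD": "GLUC", "LBSPEC": "SERUM", "LBMETHOD": ""},
    2: {"LBTESTCD": "GLUC", "LBSPEC": "URINE", "LBMETHOD": "DIPSTICK"},
    3: {"LBTESTCD": "HGB", "LBSPEC": "BLOOD", "LBMETHOD": ""},
}


def split_labnm(record):
    """Add LBTESTCD, LBSPEC and LBMETHOD to a record based on its LABNM code."""
    code = record["LABNM"]
    if code not in LABNM_MAP:
        # Fail loudly so that unmapped codes are added to the table, not guessed
        raise KeyError(f"LABNM code {code!r} has no entry in the mapping table")
    return {**record, **LABNM_MAP[code]}


result = split_labnm({"LABNM": 2, "LBORRES": "NEGATIVE"})
print(result["LBTESTCD"], result["LBSPEC"], result["LBMETHOD"])
```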
Identified SDTM+
Columns: Topic | SDTM | SDTM(+) | Workload plain | Workload plus | R/B*

Topic: Numeric dates/times
  SDTM: All dates are CHAR (ISO8601)
  SDTM(+): Keep O*C dates (NUM) and ISO8601 dates in parallel
  Workload plain: Medium, because all dates have to be transformed to ISO8601 and to NUM for analysis
  Workload plus: Low, because NUM dates are kept and used for analysis. No back-transformation necessary
  R/B: B

Topic: Missing SDTM definitions
  SDTM: No definition available for some variables in SDTM V3.1.2
  SDTM(+): Have to be kept as plus variables: variables required as input to the current XAE or XTRTGEN macro (N.B. closely evaluate future need of variable as input to new X-Macros)
  Workload plain: Not possible to create ADS from plain SDTM, because the variables required for the XAE and/or XGENTRT macro will not be available with plain SDTM
  Workload plus: Very low effort expected, because the variable needed in the macros can be extracted as is from the available plus variable without complex referencing, transformations, derivations or imputations
  R/B: R

Topic: Key concept
  SDTM: STUDYID, USUBJID, DOMAIN, --SEQ, --GRPID, --REFID, --SPID
  SDTM(+): STUDYID, USUBJID, DOMAIN, meaningful keys to be defined (based on content)
  Workload plain: Very high. Values of ID variables are not unique across subjects. Only designed for merging parent domains to SUPPQUAL, CO, RELREC. Does not support merging by content across domains (e.g. XR to XD)
  Workload plus: Medium. Needs to be defined when creating SDTM+; beneficial for analysis & reporting (no additional work)
  R/B: R

* R – required, B - beneficial
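The "Numeric dates/times" row can be illustrated with a short Python sketch. It assumes that the numeric date is a SAS date value (days since 1 January 1960) and uses LBDT/LBDTC as example variable names. How O*C actually delivers the numeric date is not stated in the slides.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this day


def add_iso_date(record, num_var, iso_var):
    """Keep the numeric date and add its ISO 8601 text form alongside it."""
    value = record[num_var]
    iso = "" if value is None else (SAS_EPOCH + timedelta(days=value)).isoformat()
    return {**record, iso_var: iso}


# Numeric date for 13 April 2009, built from the calendar date for clarity
lbdt = (date(2009, 4, 13) - SAS_EPOCH).days
record = add_iso_date({"USUBJID": "001", "LBDT": lbdt}, "LBDT", "LBDTC")

# Analysis can use the NUM value directly; plain SDTM takes the CHAR value
print(record["LBDT"], record["LBDTC"])
```

With both forms stored, no back-transformation from text to number is needed for analysis, which is the workload argument made in the table.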
Identified SDTM+
Columns: Topic | SDTM | SDTM(+) | Workload plain | Workload plus | R/B*

Topic: NUM - CHAR
  SDTM: Variables are of type CHAR in general. Example: USUBJID, --ORRES
  SDTM(+): Keep both CHAR- and NUM-type variables. Example: USUBJID and PTNO, --ORRES and --ORRESN
  Workload plain: Medium. Numeric O*C values are converted to CHAR, then need to be converted back to NUM for analysis & reporting
  Workload plus: Low. Convert once to CHAR for SDTM. Keep numeric values from O*C as a plus for analysis & reporting (no re-conversion)
  R/B: B

Topic: Code - Decode
  SDTM: Only Decode (CHAR). Example: XRCAT, EPOCH
  SDTM(+): Have Code (NUM), associated SAS format and Decode (CHAR)
  Workload plain: Medium. Without formats it is not possible to reproduce all the options offered in the CRF
  Workload plus: Very low
  R/B: R

Topic: No SUPPQUAL
  SDTM: SUPPQUAL domain
  SDTM(+): No SUPPQUAL domain, variables included in parent domain. Additional metadata required to identify qualifier information destined for SUPPQUAL. Additional variable that contains the qualifier information destined for SUPPQUAL
  Workload plain: High. Merging needed because information that clinically belongs together is scattered (search and merge)
  Workload plus: Medium. Information that clinically belongs together is located in one domain. One-time effort to create plain SDTM (selecting and splitting)
  R/B: B

* R – required, B - beneficial
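The "selecting and splitting" step for SUPPQUAL might look like the Python sketch below. It is a simplification: the metadata set `SUPP_VARIABLES` and the sample record are hypothetical, and only a subset of the SUPPQUAL variables is produced (QLABEL, QORIG and QEVAL are left out).

```python
# Hypothetical metadata: parent-domain variables destined for SUPPQUAL
SUPP_VARIABLES = {"AETRTEM"}


def split_suppqual(records, domain, supp_vars):
    """Split SDTM+ records into plain parent records and SUPPQUAL records."""
    seq_var = domain + "SEQ"
    parent, supp = [], []
    for rec in records:
        # The parent record keeps everything except the flagged qualifiers
        parent.append({k: v for k, v in rec.items() if k not in supp_vars})
        for name in sorted(supp_vars):
            value = rec.get(name)
            if value in (None, ""):
                continue  # no supplemental record for missing values
            supp.append({
                "STUDYID": rec["STUDYID"],
                "RDOMAIN": domain,
                "USUBJID": rec["USUBJID"],
                "IDVAR": seq_var,
                "IDVARVAL": str(rec[seq_var]),
                "QNAM": name,
                "QVAL": str(value),
            })
    return parent, supp


sdtm_plus_ae = [
    {"STUDYID": "TRIAL-A", "USUBJID": "TRIAL-A-1001", "AESEQ": 1,
     "AETERM": "HEADACHE", "AETRTEM": "Y"},
]
ae, suppae = split_suppqual(sdtm_plus_ae, "AE", SUPP_VARIABLES)
print(ae[0], suppae[0])
```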
Identified SDTM+
Columns: Topic | SDTM | SDTM(+) | Workload plain | Workload plus | R/B*

Topic: Date/time imputation
  SDTM: Reported date/time (ISO8601)
  SDTM(+): Have reported date/time, imputed date/time and imputation rule in parallel
  Workload plain: High. In case of incomplete dates, imputation needs to be done by hand (error-prone process)
  Workload plus: Low if the imputation rule is implemented in the O*C views. Otherwise it needs to be defined once for creation of SDTM+
  R/B: B

Topic: Relationship to CRF/DCM
  SDTM: Not included
  SDTM(+): Keep the name of the DCM from which the variable originated
  Workload plain: Medium. Connection between SDTM data and CRF is not readily available
  Workload plus: Low. Primarily to ease programming and help with debugging
  R/B: B

Topic: Tracking of same patient in multiple trials (e.g. extension trial information)
  SDTM: Previous Trial Number and Previous Patient Number could possibly be stored in the Subject Characteristics domain (SC). This needs to be investigated
  SDTM(+): Previous Trial Number, Previous Patient Number
  Workload plain: Low. Previous Trial Number and Previous Patient Number would have to be moved into the Subject Characteristics domain (SC)
  Workload plus: Very low. The collected variables need to be copied from O*C into SDTM+ (DM domain?). These two variables are collected at the site and need to be available in SDTM+ for CTR reporting and to facilitate reporting from the P/SDB
  R/B: R

* R – required, B - beneficial
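Keeping the reported value, the imputed value and the rule side by side, as in the "Date/time imputation" row, can be sketched in Python as follows. The rule itself (impute missing parts of a start date with the first day or month) and the rule labels are assumptions for illustration; the slides do not specify the imputation rules used.

```python
def impute_start_date(reported):
    """Return (imputed ISO date, rule label) for a possibly partial ISO date."""
    parts = reported.split("-") if reported else []
    if len(parts) == 3:
        return reported, ""  # complete date, nothing imputed
    if len(parts) == 2:
        return f"{reported}-01", "DAY=01"
    if len(parts) == 1:
        return f"{reported}-01-01", "MONTH=01,DAY=01"
    return "", "MISSING"  # nothing reported, nothing imputed


def with_imputation(record, var):
    """Keep the reported value and add imputed value and rule in parallel."""
    imputed, rule = impute_start_date(record[var])
    # The _IMP and _RULE suffixes are hypothetical naming choices
    return {**record, var + "_IMP": imputed, var + "_RULE": rule}


row = with_imputation({"USUBJID": "001", "AESTDTC": "2009-03"}, "AESTDTC")
print(row["AESTDTC"], row["AESTDTC_IMP"], row["AESTDTC_RULE"])
```

Storing the rule next to the value keeps the imputation reproducible and lets plain SDTM be generated from the reported value alone.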
IBM Global Business Services
Contacts
Dr. Jens Wientges
Mailto: [email protected]
+49 160 5826897
Peter Leister
Mailto: [email protected]
+49 160 3671761