PharmaSUG 2020 - Paper SS-151
Supplementary Steps to Create a More Precise ADaM define.xml in Pinnacle 21 Enterprise
Majdoub Haloui, Hong Qi, Merck & Co., Inc, North Wales, PA, USA
ABSTRACT
The analysis data definition document, ADaM define.xml, is a required document in a regulatory submission package. It provides the information needed to describe the submitted ADaM datasets and their variables. A high-quality define.xml is important for a smooth review process. Pinnacle 21 Enterprise automates and standardizes the generation of a high-quality define.xml. However, due to certain limitations of the current version of the Pinnacle 21 Enterprise software, extra steps are needed to create a more precise define.xml after the import of the ADaM datasets specification. These steps generate an ADaM define.xml with better descriptions of attributes, controlled terms, and the source for certain variables. In this paper, the authors introduce the detailed steps leading to a more accurate ADaM define.xml file.
INTRODUCTION
Creating a high-quality define.xml in the past required solid knowledge of the standards and mastery of XML. Pinnacle 21 Enterprise eliminates the need for the latter and overcomes the challenges of learning and becoming proficient with the standards. The define.xml generator is based on Excel, which allows you to focus on the metadata content instead of the complex XML syntax.
Pinnacle 21 Enterprise is an industry-leading web-based application, used by sponsors and CROs, to validate SDTM/ADaM datasets and define.xml against CDISC standards. FDA and PMDA use Pinnacle 21 Enterprise to review submission data from sponsors. Pinnacle 21 Enterprise has many useful features, including validating SDTM, ADaM and define.xml, as well as generating define.xml version 2.0. When creating define.xml, care must be taken so that all the information within the define is clear, concise and compliant with CDISC and regulatory agency requirements. This paper introduces the extra steps needed to create a more precise define.xml after the import of the ADaM datasets specification. These steps generate an ADaM define.xml with better descriptions of the attributes, controlled terms, and the source for certain variables.
ADAM DEFINE.XML GENERATION PROCESS FLOW
In the past, creating a submission-ready define.xml demanded strong knowledge of the CDISC standards and mastery of XML. The absence of such knowledge turned out to be a major setback for many statistical programmers working on a regulatory submission. Pinnacle 21 Enterprise has a module that creates define.xml; however, this module has some limitations when creating Value Level Metadata for variables other than AVAL and AVALC (Display 1.1).
Display 1.1 Pinnacle 21 Enterprise Module Creating Value Level Metadata
We collaborated with the Pinnacle 21 team to create a customized define.xml generator to help generate the define.xml based on our ADaM datasets specification in Excel format. This customized software allows statistical programmers to focus more on the metadata content of the study rather than the complex define.xml syntax (Display 1.2).
Display 1.2 Metadata Content in define.xml Generated by Customized Pinnacle 21 Enterprise Module
The overview of ADaM define.xml generation process flow is shown in Figure 1.
Figure 1 ADaM define.xml generation Process Flow
EXTRA STEPS
This section describes the additional steps needed to create a more precise define.xml after the import of the ADaM datasets specification into the Pinnacle 21 Enterprise tool. These steps work around some limitations in the automation feature of the current Pinnacle 21 Enterprise software and generate an ADaM define.xml with better descriptions of the attributes, controlled terms, and the source for certain variables. Moreover, a few of the steps also lead to an improved ADaM validation score.
1. PREDECESSOR ORIGIN TYPE FOR DATE VARIABLE IN CHARACTER FORMAT
The Define-XML standard defines the following origin types: “CRF”, “Derived”, “Assigned”, “Protocol”, “eDT” and “Predecessor”. The “CRF”, “Protocol” and “eDT” origin types are generally used by SDTM variables, while the “Derived”, “Assigned” and “Predecessor” origin types are used by ADaM variables and parameters. The Define-XML Origin element is used to provide metadata traceability for SDTM, ADaM and SEND data. If a variable is carried over from another dataset into an ADaM dataset “as-is” (same value, same data type and same label), then the origin should be “Predecessor”.
Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Origin / Source / Method / Comment
VARIABLE | Variable Label | text | 12 | | Predecessor: DATASET/VARIABLE
Display 2.1 Example of a Variable with Predecessor Origin Type
In Display 2.1, “DATASET.VARIABLE” is the predecessor value, such as “DM.USUBJID”. The DATASET referenced in the predecessor value must exist in the submission folder, and the VARIABLE must exist in that DATASET. Many date/time variables in character format (DTC) are of “Predecessor” origin. These include but are not limited to the variables listed in Table 1.1.
Dataset Variable Label Type
ADAE AESTDTC Start Date/Time of Adverse Event date
ADAE AEENDTC End Date/Time of Adverse Event date
ADCM CMSTDTC Start Date/Time of Medication date
ADCM CMENDTC End Date/Time of Medication date
ADCM PRGRSDTC Progression Date After Treatment Phase date
ADEX EXSTDTC Start Date/Time of Treatment datetime
ADEX EXENDTC End Date/Time of Treatment datetime
ADLBGRD LBDTC Date/Time of Specimen Collection date
ADMH MHSTDTC Start Date/Time of Medical History Event date
ADMH MHDTC Date/Time of History Collection date
ADSL RFSTDTC Subject Reference Start Date/Time datetime
ADSL RFENDTC Subject Reference End Date/Time datetime
ADTL TRDTC Date/Time of Tumor Measurement date
Table 1.1 DTC of Predecessor Origin
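The requirement that every predecessor value resolves to an existing dataset and variable can be checked before import. The following is a minimal sketch, not part of Pinnacle 21 Enterprise; the function name, the dictionary layout and the sample metadata are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def find_broken_predecessors(
    predecessors: Dict[str, str],
    submitted: Dict[str, Set[str]],
) -> List[str]:
    """Return the ADaM variables whose predecessor does not resolve.

    predecessors maps an ADaM variable to its "DATASET.VARIABLE" source;
    submitted maps each dataset in the submission folder to its variables.
    """
    broken = []
    for adam_var, source in predecessors.items():
        dataset, _, variable = source.partition(".")
        # A missing dataset and a missing variable are both failures.
        if variable not in submitted.get(dataset, set()):
            broken.append(adam_var)
    return broken


# Hypothetical metadata: AE.AESTDTC is deliberately absent.
predecessors = {
    "ADSL.USUBJID": "DM.USUBJID",
    "ADSL.RFSTDTC": "DM.RFSTDTC",
    "ADAE.AESTDTC": "AE.AESTDTC",
}
submitted = {"DM": {"USUBJID", "RFSTDTC"}, "AE": {"AEENDTC"}}

broken = find_broken_predecessors(predecessors, submitted)
print(broken)
```

Any variable reported here would need either a corrected predecessor value or a different origin type in the specification.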
In the ADaM datasets specification, the type of these DTCs is documented as “Char” and the length is specified to define the characteristics of SAS variables. Taking ADSL.RFSTDTC as an example, Display 2.2 shows how RFSTDTC from the DM domain is defined in the ADSL tab.
Variable Name | Variable Label | Type | Length | Sig Digits | Format | Codelist / Controlled Terms | Origin | Define Derivation
RFSTDTC | Subject Reference Start Date/Time | Char | 19 | | | | Predecessor | DM.RFSTDTC
Display 2.2 ADSL.RFSTDTC defined in ADaM datasets specification
The define files generated by simply importing an ADaM datasets specification to Pinnacle 21 Enterprise describe ADSL.RFSTDTC as shown in Display 2.3. The attributes of ADSL.RFSTDTC deviate from its “Predecessor”, DM.RFSTDTC, in terms of Type, Length and Controlled Terms or Format (Display 2.4).
Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Origin / Source / Method / Comment
RFSTDTC | Subject Reference Start Date/Time | text | 19 | | Predecessor: DM.RFSTDTC
Display 2.3 ADSL.RFSTDTC described in the ADaM define.xml
Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Origin / Source / Method / Comment
RFSTDTC | Subject Reference Start Date/Time | datetime | | ISO8601 | Derived: First dose of study medication
Display 2.4 DM.RFSTDTC described in the SDTM define.xml
To describe ADSL.RFSTDTC the same as its “Predecessor” DM.RFSTDTC, a manual update of the Define module in Pinnacle 21 Enterprise is needed. This process is illustrated in Figure 2.
Figure 2 Update Attributes of DTC in Pinnacle 21 Enterprise to Achieve Attributes Consistent with the Predecessor’s
Define in Pinnacle 21 Enterprise, as imported:
Dataset | Variable | Label | Type | Length
ADSL | RFSTDTC | Subject Reference Start Date/Time | text | 19
Define in Pinnacle 21 Enterprise, after replacing Type with “datetime” and deleting Length:
Dataset | Variable | Label | Type | Length
ADSL | RFSTDTC | Subject Reference Start Date/Time | datetime |
Export ADaM define.xml. ADSL.RFSTDTC with attributes consistent with its predecessor:
Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Source/Derivation/Comment
RFSTDTC | Subject Reference Start Date/Time | datetime | | ISO8601 | Predecessor: DM.RFSTDTC
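The manual edit illustrated in Figure 2 amounts to a simple rule: for a character DTC variable of “Predecessor” origin, take the predecessor's type and drop the length. The sketch below expresses that rule in Python for illustration only; the VariableMeta class and the function are hypothetical and do not correspond to any Pinnacle 21 Enterprise interface, where the change is made by hand.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VariableMeta:
    """Hypothetical container for one row of variable metadata."""
    dataset: str
    name: str
    data_type: str
    length: Optional[int]
    origin: str


def align_with_predecessor(var: VariableMeta, predecessor_type: str) -> VariableMeta:
    """Apply the Figure 2 edit: use the predecessor's type, delete the length."""
    is_dtc = var.name.endswith("DTC")
    is_predecessor = var.origin.startswith("Predecessor")
    if is_dtc and is_predecessor and var.data_type == "text":
        return replace(var, data_type=predecessor_type, length=None)
    # Any other variable is returned unchanged.
    return var


imported = VariableMeta("ADSL", "RFSTDTC", "text", 19, "Predecessor: DM.RFSTDTC")
aligned = align_with_predecessor(imported, "datetime")
print(aligned)
```

The predecessor type ("date" or "datetime", as in Table 1.1) is passed in explicitly because it has to come from the SDTM define.xml rather than from the ADaM specification.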
2. VARIABLE LENGTH
The variable length specified in define.xml should match that of the variable value in the dataset. Variable length describes the maximum expected length of the variable's values. It should only be present for a data type of "text", "integer", or "float". In the case of Type="integer", the length refers to the maximum length of the numeric value expressed in characters. Because the length of an integer SAS variable can only be specified as 8 in the data specification, the actual length needs to be updated in both the Variables and Value Level tabs of the Pinnacle 21 Enterprise tool (Display 3.2 and Display 3.3) when it exceeds 8. Otherwise, the ADaM validation report shows “Error” for Rule ID SD1231, as shown in Display 3.1 below. This usually happens for the variable SRCSEQ in multiple datasets when its length exceeds 8.
Issue Summary
Dataset | Rule ID | Publisher ID | Message | FDA | PMDA | Found
ADRS | SD1231 | | SRCSEQ value is longer than defined max length 8 when PARAMCD == 'BORCFIRC' | Error | Warning | 973
ADRS | SD1231 | | SRCSEQ value is longer than defined max length 8 when PARAMCD == 'ORINV' | Error | Warning | 3090
ADRS | SD1231 | | SRCSEQ value is longer than defined max length 8 when PARAMCD == 'ORIRC' | Error | Warning | 3868
ADRS | SD1324 | | Define.xml/dataset variable label mismatch | Error | Error | 1
Display 3.1 Pinnacle 21 Enterprise Validation Report
Display 3.2 Length Entered for ADRS.SRCSEQ in Variables Tab of Pinnacle 21
Display 3.3 Length Entered for ADRS.SRCSEQ in Value Level Tab of Pinnacle 21
After the steps in Display 3.2 and Display 3.3, the generated define.xml shows SRCSEQ with “15” in the Length or Display Format column, as in Display 3.4.
Analysis Dataset of Response (ADRS) [Location: adrs.xpt]
Variable | Label / Description | Type | Length or Display Format | Controlled Terms or ISO Format | Source/Derivation/Comment
SRCSEQ | Source Sequence Number | integer | 15 | | Refer to Parameter Value Level Metadata
Display 3.4 ADRS.SRCSEQ in define.xml
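The length to enter in the Variables and Value Level tabs can be worked out from the data before validation. The sketch below is an illustration under stated assumptions: the SRCSEQ values are made up, and the length is taken as the number of characters in the decimal representation, following the description of Type="integer" in this section.

```python
from typing import Iterable


def integer_display_length(values: Iterable[int]) -> int:
    """Maximum number of characters needed to write the given integers."""
    return max((len(str(v)) for v in values), default=0)


def exceeds_defined_length(values: Iterable[int], defined_length: int) -> bool:
    """True when at least one value is longer than the defined length."""
    return integer_display_length(values) > defined_length


# Hypothetical SRCSEQ values; the longest one has 15 digits.
srcseq_values = [101, 20200315001, 123456789012345]

needed = integer_display_length(srcseq_values)
flagged = exceeds_defined_length(srcseq_values, defined_length=8)
print(needed, flagged)
```

A variable flagged in this way keeps length 8 in the SAS-oriented specification, while the value returned by integer_display_length is the one to enter manually in the tool.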
3. CONTROLLED TERMINOLOGY AND ADAM IG
The ADaM Implementation Guide (ADaM IG v1.1) lists many variables that are subject to controlled terminology (CT). As displayed in the table below, the variables AGEU, SEX, and RACE should have codelists in the define.xml.
Example:
However, the ADaM IG does not offer much guidance on providing CT for other variables. This does not mean that we do not need to define the CT for them. Section 2.6.3, General Considerations for codelists, of the Define-XML Version 2.0 Completion Guidelines document states: “In addition to variables subject to controlled terminology as per CDISC IGs and sponsor-specific controlled terminology, codelist should also be provided for all other variables and value-level definitions which have a predefined and finite set of categorical allowable values.” Some variables that should have CT are PARAM/PARAMCD/PARAMN and AVISIT/AVISITN. For more situations where codelists are expected, please refer to Table 2.6.3.2, Situations where codelists are expected, in the Define-XML Version 2.0 Completion Guidelines document. Some National Cancer Institute (NCI) codes in codelists, as well as external codelists, cannot be directly imported into the Pinnacle 21 Enterprise tool from an ADaM datasets specification and therefore need to be entered manually, as described below.
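For variables such as PARAMCD/PARAM, which have a predefined and finite set of values, the candidate codelist is simply the set of distinct code and decode pairs in the data. The following sketch shows one way to list them; the function name and the PARAM labels are assumptions for illustration and are not taken from the study.

```python
from typing import Dict, Iterable, List, Tuple


def derive_codelist(
    records: Iterable[Dict[str, str]],
    code_var: str,
    decode_var: str,
) -> List[Tuple[str, str]]:
    """Collect the distinct (code, decode) pairs found in the records."""
    pairs = {(rec[code_var], rec[decode_var]) for rec in records}
    return sorted(pairs)


# Hypothetical ADRS records; only PARAMCD and PARAM are shown.
adrs_records = [
    {"PARAMCD": "ORINV", "PARAM": "Overall Response by Investigator"},
    {"PARAMCD": "ORIRC", "PARAM": "Overall Response by IRC"},
    {"PARAMCD": "ORINV", "PARAM": "Overall Response by Investigator"},
]

param_codelist = derive_codelist(adrs_records, "PARAMCD", "PARAM")
for code, decode in param_codelist:
    print(code, "=", decode)
```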
3.1 NCI CODE IN CODELIST OF CONTROLLED TERMS
When an NCI code is missing in the Codelists or Terms tab for codelist value(s), Pinnacle 21 Enterprise define validation reports “Error” for Rule ID DD0031 or DD0032 (Display 4.1). This happens for the codelists AEACN (Action Taken with Study Treatment), ADTL0DTYPE (Derivation Type of Target Lesion) and ADTL0PARAMTYP (Parameter Type of Target Lesions). These NCI codes need to be typed into the Codelists (Display 4.2) or Terms (Display 4.3) tab of the Pinnacle 21 Enterprise define.
Issue Summary
Dataset | Rule ID | Publisher ID | Message | FDA | PMDA | Found
 | DD0032 | | Missing NCI Code for Term in Codelist 'ADTL0PARAMTYP' | Error | Error | 1
 | DD0031 | | Missing NCI Code for Codelist 'AEACN' | Error | Error | 1
 | DD0032 | | Missing NCI Code for Term in Codelist 'AEACN' | Error | Error | 4
 | DD0031 | | Missing NCI Code for Codelist 'ADTL0DTYPE' | Error | Error | 1
 | DD0031 | | Missing NCI Code for Codelist 'ADTL0PARAMTYP' | Error | Error | 1
Display 4.1 Pinnacle 21 Enterprise Tool Validation Report Displaying Errors for Missing NCI Code
Display 4.2 NCI Code Entered for AEACN in Codelists Tab of Pinnacle 21
Display 4.3 NCI Codes Entered for AEACN in Terms Tab of Pinnacle 21 Tool
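Missing NCI codes can also be spotted in the specification before validation is run. The sketch below imitates the two messages in Display 4.1; it is not the Pinnacle 21 Enterprise rule implementation, the data layout is an assumption, and the codes shown are placeholders rather than real NCI codes.

```python
from typing import Dict, List, Optional, TypedDict


class Codelist(TypedDict):
    nci_code: Optional[str]
    terms: Dict[str, Optional[str]]  # term value -> NCI code, if any


def missing_nci_codes(codelists: Dict[str, Codelist]) -> List[str]:
    """List codelists and terms that have no NCI code entered."""
    issues = []
    for name, codelist in codelists.items():
        if not codelist["nci_code"]:
            issues.append(f"Missing NCI Code for Codelist '{name}'")
        missing_terms = sum(1 for code in codelist["terms"].values() if not code)
        if missing_terms:
            issues.append(
                f"Missing NCI Code for Term in Codelist '{name}' ({missing_terms} found)"
            )
    return issues


# Placeholder content: "Cxxxxx" stands for a real NCI code.
codelists: Dict[str, Codelist] = {
    "AEACN": {
        "nci_code": None,
        "terms": {"DOSE NOT CHANGED": None, "DRUG WITHDRAWN": "Cxxxxx"},
    },
    "SEX": {"nci_code": "Cxxxxx", "terms": {"F": "Cxxxxx", "M": "Cxxxxx"}},
}

issues = missing_nci_codes(codelists)
for issue in issues:
    print(issue)
```

In practice the check would apply only to codelists that are expected to carry NCI codes; sponsor-defined codelists are outside its scope.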
3.2 EXTERNAL DICTIONARIES/CODELIST
Codelists such as coding dictionaries provided by third-party vendors are referred to as “External Codelist” in the Define-XML document. They require different information in the document than other CDISC or sponsor-defined controlled terminologies. Third-party dictionaries such as MedDRA/WHODD have regulated terms (i.e., the same coding result applies to the same AE under the same dictionary version regardless of sponsor or study). In a Define-XML document, such a dictionary needs to be listed under External Dictionaries with the dictionary name and version specified (Display 5.1).
Display 5.1 MedDRA Display in the External Dictionaries
With the current version of the custom Pinnacle 21 Enterprise tool, if the user enters ‘MedDRA’ in the CT column of the ADaM spec, the tool creates a new ‘MedDRA’ codelist instead of linking to the MedDRA dictionary already present in the tool (Display 5.2 a). The user needs to manually select ‘MedDRA’ from the codelist dropdown in the Pinnacle 21 tool (Display 5.3) for variables that should have the ‘MedDRA’ codelist.
Display 5.2 Property tab in the Define of Pinnacle 21 Enterprise
Display 5.3 Codelist Term “MedDRA” Manually Entered in Define Pinnacle 21 Enterprise
After this manual change, the corresponding define.xml generated shows and links the codelist as “Medical Dictionary for Regulatory Activities” (Display 5.4).
Display 5.4 Controlled Terms or Format for MedDRA in define.xml
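For orientation, an external dictionary is represented in a Define-XML document as a CodeList element that contains an ExternalCodeList child carrying the dictionary name and version, rather than enumerated terms. The sketch below builds such a fragment with the Python standard library; the OID and the version number are hypothetical, and the export produced by Pinnacle 21 Enterprise may differ in detail.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"


def external_codelist(oid: str, name: str, dictionary: str, version: str) -> ET.Element:
    """Build a CodeList element that points to an external dictionary."""
    codelist = ET.Element(
        f"{{{ODM_NS}}}CodeList",
        {"OID": oid, "Name": name, "DataType": "text"},
    )
    ET.SubElement(
        codelist,
        f"{{{ODM_NS}}}ExternalCodeList",
        {"Dictionary": dictionary, "Version": version},
    )
    return codelist


# The OID and version below are illustrative assumptions.
meddra = external_codelist(
    oid="CL.MEDDRA",
    name="Medical Dictionary for Regulatory Activities",
    dictionary="MedDRA",
    version="22.0",
)
print(ET.tostring(meddra, encoding="unicode"))
```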
4. ADRG AND ARM INFORMATION
The submitted define.xml often displays links to the ADRG and ARM (Display 6.1). However, the IDs, titles and exact file names (Href) of the ADRG and ARM need to be manually entered into Documents in the Property tab of the Pinnacle 21 Enterprise tool, as shown in Display 5.2 b above.
Display 6.1 define.xml with Links to ADRG and ARM
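Because the document entries are typed by hand, a quick consistency check helps avoid broken links in the exported define.xml. The sketch below verifies that each entry has an ID, a title and an Href, and that the Href matches a file in the submission package; the field names and file names are assumptions for illustration.

```python
from typing import Dict, List, Set

REQUIRED_FIELDS = ("id", "title", "href")


def document_issues(documents: List[Dict[str, str]], package_files: Set[str]) -> List[str]:
    """Report incomplete document entries and Hrefs with no matching file."""
    issues = []
    for doc in documents:
        label = doc.get("id") or "<no id>"
        for field in REQUIRED_FIELDS:
            if not doc.get(field):
                issues.append(f"{label}: missing {field}")
        href = doc.get("href")
        # The Href must match the file name exactly, including case.
        if href and href not in package_files:
            issues.append(f"{label}: file '{href}' not found in package")
    return issues


# Hypothetical entries and package content.
documents = [
    {"id": "ADRG", "title": "Analysis Data Reviewer's Guide", "href": "adrg.pdf"},
    {"id": "ARM", "title": "Analysis Results Metadata", "href": "arm.pdf"},
]
package_files = {"adrg.pdf", "define.xml"}

problems = document_issues(documents, package_files)
print(problems)
```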
CONCLUSION
Because of its proven benefits and the ease with which it creates a compliant, complex ADaM define.xml, the Pinnacle 21 Enterprise tool has been broadly used, although not required, to generate ADaM define.xml. The extra steps we implemented in the current version are performed in addition to the direct automation, after importing an ADaM datasets specification and before exporting the define.xml. These steps are highly recommended to help enhance the precision of ADaM dataset and variable descriptions in define.xml for the regulatory submission.
ACKNOWLEDGMENTS
The authors would like to thank Ms. Ellen Asam, Ms. Mary N. Varughese, and Ms. Amy Gillespie for reviewing the paper and for their great suggestions.
RECOMMENDED READING
1. PhUSE. 2019. “Define-XML Version 2.0 Completion Guidelines”.
2. ADaM Implementation Guide (ADaM IG v1.1). https://www.cdisc.org/system/files/members/standard/foundational/adam/ADaMIG_v1.1.pdf
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the authors at:
Majdoub Haloui, Principal Scientist, Statistical Programming, Merck & Co., Inc., [email protected]
Hong Qi, Principal Scientist, Statistical Programming, Merck & Co., Inc., [email protected]
Any brand and product names are trademarks of their respective companies.