Richard Lewis Octagon Research Solutions 2008-10-17 Midwest User Network


2

Agenda

• 9:00-9:20 Group Update

• 9:20-9:30 Portal Update

• 9:30-10:30 Discussion Questions

• 10:30-11:00 When to Integrate SDTM

• 11:00-12:00 Changes from 3.1.1 to 3.1.2

3

Welcome!

To existing members and (many) new members

4

Next Meeting Presentations

• Vote on external presentations for next meeting.

– define.xml team

– ADaM team

– ISD Pilot team

– Controlled Terminology

– Others?

• Internal Presentation Volunteers

– Implementing ADaM

• Overcoming pitfalls

5

Lunch Meetings

• Successfully Used by Other User Groups

• Timing

– Once a month

• Topic Out Ahead of Time

• Casual

– Discussion rather than presentation

• Interest outside the N IL area?

• List of possible restaurants

6

CDISC Interchange

October 28th – 30th

Arlington VA

7

Portal Update

[Diagram: portal hierarchy]

CDISC Portal (link from CDISC Website): SDS, ADaM, Global User Network

Global User Network: Asia, EU, North America

North America: Bay-Area, Midwest

8

Portal Update

9

Portal Update

• Now ‘Midwest (Chicago)’

• Adding Final Touches

– Up and Fully Running Soon

• With the caveat that we have been saying that since 2006

10

Portal Questions?

11

Discussion Questions (Cathy)

• How are companies handling the release of new controlled terminology packages?

• Should the database match what is on the CRF, or is it OK to map values that are equivalent?

• Are companies using the new LBTESTCDs? For example, do you use “GLUC” for both serum glucose and urine glucose?

• Will we need separate TESTCDs (GLUC and UGLUC) in ADaM, based on the recent draft ADaMIG?

12

Discussion Questions

• Are companies using RELREC for PC and PP, and/or other domains?  How does it get created?

13

Discussion Questions

• Are companies including Screen Failure data in SDTM (DM, DS, SC)?

14

Discussion Questions

• What methods are being used to control the quality of SDTM besides WebSDM?

• WebSDM checks (Sandy VanPelt Nguyen)

• Is the FDA using V1.5 or V2.6?

• CDISC Website points to V1.5

15

Discussion Questions

• What are companies doing for legacy conversions? Do they create:

– study SDTM

– study ADaM

– integrated SDTM

– integrated ADaM

16

Discussion Questions

• For integrated ADaM, are only data sets for SCE/SCS needed? Most likely, yes? For example, there is no need to have ADIE, ADPE…

17

Discussion Questions

• Are SDTM aCRFs needed for all studies, including those with screen shots only (i.e. EDC studies)? Usually, there are too many pages.

18

Discussion Questions

• For submissions with CDISC data sets, do we still prepare patient profiles? (According to CDISC documents, it will not be needed.)

19

Discussion Questions (Susan)

• When reviewing the ADaM model I was wondering how one documents the Source/Computational Method in the define.xml when

1) the *true* source of the collected data is an internal analysis dataset (not SDTM)

2) the *true* source of the derived variable/parameter is an internal analysis dataset (not SDTM).

• Should the ADaM define.xml describe the data in terms of the SDTM, even if the SDTM isn't the source of the data?  

• What does one do if they produce SDTM and ADaM from their internal analysis datasets?  

• Do you define ADaM in terms of SDTM to maintain transparency between ADaM and SDTM?  

• Do you need to provide the computational method in the define.xml if the variable/parameter comes from another dataset?

20

Discussion Questions

• For reference, here are some excerpts from the ADaM Document:

1.1 Purpose The Analysis Data Model describes key principles that apply to all analysis datasets, with the overall principle being that the design of analysis datasets and associated metadata facilitate explicit communication of the content, input, and purpose of submitted analysis datasets.

1.4 Definitions Input Data – The data used for the creation of analysis data sets. Traceability – The property that permits the user of an analysis dataset to understand the relationship of analysis values to the study tabulation datasets.

2.1 Introduction Analysis datasets should facilitate clear and unambiguous communication of the content of the datasets supporting the statistical analysis performed in a clinical study, should provide a level of traceability to allow an understanding of the relationship of analysis values to the input data, and ...

21

Email From John

In many respects, ADaM is still about philosophy and the idea that analysis dataset metadata should refer to the input data is logical. From the point of view of the recipient of the SDTM and ADaM data, it makes sense that ADaM refers to the study data tabulation in hand, rather than to input data that are not sent. Otherwise, I think it would be difficult for a recipient of ADaM to verify the derivation of the datasets or to perform sensitivity analyses.

ADaM is a CDISC standard, and as such, I believe that ADaM metadata is about how you can get to ADaM from SDTM. That wasn't always recognized so clearly as now. This is partly the reason that there are some discrepancies perhaps between the documents/sections still.

ADaM may be in fact generated from a different source than SDTM, however if so, I think the metadata should still refer to SDTM as if it had been the source; and this is difficult if SDTM is not really the source.

CDISC now has a vision that CDISC metadata should be bidirectional and permit one to go from collected values through to analysis, and vice-versa. Implicit or explicit in this is that ADaM metadata refer to SDTM. The vision is CDASH-SDTM-ADaM-analysis (via ADaM results metadata).

Another thing, besides data, some have said that "input" may also refer to SAP, Protocol, third party algorithms, thresholds, etc.

22

Email From Susan

As you are probably aware, this has been a central issue for many ADaM discussions. It is probably worth noting that in earlier versions of the ADaM standards, we did have SDTM as our ‘input’ box but some team members felt this was too restrictive and the oft cited adage that “CDISC can not endorse any particular process” was used to change the documentation to make it more generic. In addition, even in the Linear process (SDTM first, then ADaM), because of timing, there might be inputs to ADaM that are not yet in SDTM (PK spreadsheets come to mind, randomization schedules, protocol deviations, etc.).

But we tried to reinforce the concepts in words that there must be some describable relationship between SDTM and ADaM, regardless of what the input to ADaM is.  The way we approach this in the ADaM training is to emphasize that the FDA reviewer receives 1) the SDTM tabulation data and 2) ADaM data.  They do not receive any ‘raw’ data.  If their task is to understand what you did in the analysis, then it follows that they must understand this with the data they have in hand.  If they have questions about how you created a derived observation in ADaM, they are going to be asking this relative to the observations in SDTM.  You will be hard pressed to answer their questions if you do not understand the relationship between SDTM and ADaM.

Creating ADaM from something other than SDTM is not impossible and in fact there are more than a couple of large pharma’s who are doing it this way.  But it does add another layer of effort to create the trace between SDTM and ADaM. 

23

Discussion: When to Integrate SDTM (Yen)

• Late Stage Conversions

– Data collected in ‘legacy’ format

– SDTM created in final stages

– Analysis datasets created independently of SDTM

– CSR may be written

24

Discussion: When to Integrate SDTM (cont)

• Mid Stage Conversions

– Data collected in ‘legacy’ format

– Converted to SDTM after collection

– Analysis datasets created from SDTM

• Upstream

– Data collected in SDTM (like) format

– No or minimal conversion necessary

25

When to Integrate SDTM?

Pros & Cons at each stage

Late-stage:

• Pro: Minimum disruption of business process

• Pro: Fastest way to submit SDTM

• Con: Submitted data not source for analysis

• Con: Convert at time-critical point in project

26

When to Integrate SDTM?

Mid-stage:

• Pro: Midrange disruption of business process

• Pro: SDTM data is source for analysis

• Pro: Efficient data exchange with vendors & partners

• Con: Convert at time-critical point in project

27

When to Integrate SDTM?

Upstream, in collection systems:

• Pro: Build SDTM, not convert to SDTM

• Pro: Most efficient data exchange with vendors & partners

• Con: Maximum disruption of business process

28

Changes from SDTMIG 3.1.1 to 3.1.2

29

Scope of Review

• Not domain by domain review

• Review of changes in Section 4

– Changes each impact many domains

– Basic SDTM knowledge independent of SDTM domains

– Although I couldn’t resist adding a couple of domains which had major changes at the end

30

4.1.1.4 Order of the Variables

• Variable order no longer flexible

1) Identifiers

2) Topic

3) Qualifiers

4) Timing

– Within each role, the order should be the order shown in 2.2.1 to 2.2.5 of the SDTM
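The role order above can be checked mechanically. The following is a minimal Python sketch, not part of the implementation guide. It assumes a hypothetical `role_of` lookup (variable name to role) that a sponsor would build from the SDTM tables, and it checks only the order of the roles, not the order of variables within a role.

```python
# Role order described on the slide for section 4.1.1.4.
ROLE_ORDER = ["Identifier", "Topic", "Qualifier", "Timing"]


def roles_in_order(variables, role_of):
    """Return True if variables appear as Identifiers, Topic, Qualifiers, Timing.

    `variables` is the dataset's variable list in dataset order.
    `role_of` is a hypothetical dict mapping each variable name to its role.
    """
    ranks = [ROLE_ORDER.index(role_of[name]) for name in variables]
    return ranks == sorted(ranks)


# Illustrative roles for a few AE variables (assumed for this example).
AE_ROLES = {
    "STUDYID": "Identifier",
    "USUBJID": "Identifier",
    "AESEQ": "Identifier",
    "AETERM": "Topic",
    "AESEV": "Qualifier",
    "AESTDTC": "Timing",
}
```

For example, `roles_in_order(["STUDYID", "AETERM", "USUBJID"], AE_ROLES)` is false because an Identifier follows the Topic variable.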

31

4.1.1.6 Additional Guidance on Dataset Naming

• Domain codes beginning with X, Y or Z are reserved for custom domains

– Will not be used by SDTM in the future

– Second letter can be any letter or number

– Using X-, Y- or Z- is optional and not required
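As a small illustration of the naming rule, here is a Python sketch that tests whether a two-character domain code uses the reserved X, Y or Z prefix. The function name is hypothetical, and because the prefix is optional, a false result does not mean a custom domain is named incorrectly.

```python
def is_reserved_custom_code(code):
    """True if a two-character domain code starts with X, Y or Z
    and its second character is an ASCII letter or digit."""
    code = code.upper()
    return (
        len(code) == 2
        and code.isascii()
        and code[0] in "XYZ"
        and code[1].isalnum()
    )
```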

32

4.1.1.7 Splitting Domains

• Why sponsors will split is not addressed

• Two methods

– General observation classes

• Split by --CAT, which must be populated in all cases

– FA Domain

• Split by --CAT

• Split relative to parent domain of the value in --OBJ

– For example, FACM would store Findings About CM records.

33

4.1.1.7 Splitting Domains (cont)

• Other rules:

1) Values in DOMAIN remain the same

2) Domain prefixes use value in DOMAIN

3) --SEQ unique within USUBJID across domains

4) Variables with same name must have same length across datasets

5) Permissible variables do not have to be in all of the datasets

34

4.1.1.7 Splitting Domains (cont)

• Other Rules: (cont)

6) Up to 4 character dataset names

• First two letters are the same as the original domain

7) SUPPQUALs of split domains also split

• SUPPQS36, SUPPFACM

8) RELREC relationship defined for split FA domains may reference 4 character dataset name
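Rules 3, 6 and 7 lend themselves to simple automated checks. The Python sketch below is illustrative only: the function names and the shape of `split_datasets` (dataset name mapped to a list of `(USUBJID, --SEQ)` tuples) are assumptions, not anything defined by the SDTMIG.

```python
from collections import Counter


def valid_split_name(dataset_name, domain):
    """Rule 6: at most 4 characters, starting with the 2-letter DOMAIN value."""
    name, domain = dataset_name.upper(), domain.upper()
    return len(domain) == 2 and len(name) <= 4 and name.startswith(domain)


def supp_name(dataset_name):
    """Rule 7: the supplemental qualifier dataset follows the split name."""
    return "SUPP" + dataset_name.upper()


def duplicate_seq(split_datasets):
    """Rule 3: return (USUBJID, --SEQ) pairs that occur more than once
    when all split datasets of one domain are pooled.

    `split_datasets` maps a dataset name to a list of (usubjid, seq) tuples.
    """
    counts = Counter(
        pair for records in split_datasets.values() for pair in records
    )
    return sorted(pair for pair, n in counts.items() if n > 1)
```

An empty list from `duplicate_seq` means --SEQ is unique within USUBJID across the pooled split datasets.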

35

Splitting Domains - Sample

36

4.1.1.8 Origin Metadata

• Origin Column of Define.xml

– CRF

– eDT

– Derived

– Assigned

• determined by individual judgment (by an evaluator other than the subject or investigator)

– Protocol

• defined as part of the Trial Design preparation

• Multiple Sources

– Variable-level metadata will list all types separated by commas, eg ‘Derived, CRF’

– Value-level metadata will show origin at test level
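One way to produce the comma-separated variable-level origin is to roll it up from the value-level origins. This Python sketch assumes the value-level origins are available as a simple list of strings; keeping them in first-seen order is a choice made here, not something the slide prescribes.

```python
def variable_origin(value_level_origins):
    """Join the distinct value-level origins, in first-seen order,
    into one comma-separated variable-level origin string."""
    distinct = list(dict.fromkeys(value_level_origins))
    return ", ".join(distinct)
```

For example, value-level origins of Derived, CRF and Derived would roll up to the string 'Derived, CRF'.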

37

4.1.1.9 Assigning Natural Keys in the Metadata

• Defines ‘Natural Keys’

• Keys may include SUPPQUAL

– STUDYID, USUBJID, PEDTC, PETESTCD, PELOC, PEMETHOD, QNAM.PEMAKE, QNAM.PEMODEL

• Generic test codes rather than bunching

38

4.1.2.3 Use of “Subject” and USUBJID

• No two subjects can share the same USUBJID

• Conversely, every subject must retain the same USUBJID throughout the submission (if known)

• Format not specified

– STUDY-SITE-SUBJID

– 000001

39

4.1.2.5 Convention for Missing Values

• Missing values represented by nulls

• Previously stated that convention used should be specified in the define file

40

4.1.2.6 Grouping Variables and Categorization

STUDYID

DOMAIN

--CAT

--SCAT

USUBJID

--GRPID

--REFID

41

4.1.2.6 Grouping Variables and Categorization (cont)

• --CAT/--SCAT

– Subset groups within a domain

– Known about the data before it is collected

– Group data across subjects

– May have controlled terminology

42

4.1.2.6 Grouping Variables and Categorization (cont)

• --GRPID

– Groups data within a subject

– Have no meaning across subjects

– Assigned during or after data collection

– Sponsor defined, not controlled terminology

• --REFID

– Groups data within a subject

– Example, sample identifier for blood sample

43

4.1.2.7 Submitting Free Text From the CRF

• ‘Specify’ values for non-result qualifiers

– When free-text information is collected to supplement a standard non-result qualifier, free-text value goes into SUPPQUAL.

Reason for Dose Adjustment Describe

___ Adverse Event [EXADJ] _[SUPPQUAL]_

___ Insufficient Response _____________

___ Non-medical Reason _____________

44

4.1.2.7 Submitting Free Text From the CRF (cont)

• ‘Specify’ values for non-result qualifiers (cont)

– Location of Injection: Other, Specify: ____

• Verbatim = UPPER RIGHT ABDOMEN

• Option 1: EXLOC=OTHER

– Sponsor maintains original CT

– Verbatim goes in SUPPQUAL

• Option 2: EXLOC=ABDOMEN

– Sponsor has expanded CT based on their coding decision of the verbatim text

– Verbatim goes in SUPPQUAL

• Option 3: EXLOC = UPPER RIGHT ABDOMEN

– Sponsor does not care about CT for this variable

45

4.1.2.7 Submitting Free Text From the CRF

• ‘Specify’ values for result qualifiers

– Eye Color: Other, Specify________

• Verbatim = BLUEISH GRAY

• Option 1:

– SCORRES = BLUEISH GRAY

– SCSTRESC = OTHER

– Sponsor wishes to maintain CT

• Option 2:

– SCORRES = BLUEISH GRAY

– SCSTRESC = GRAY

– Sponsor will expand CT based on their coding decision

• Option 3:

– SCORRES = BLUEISH GRAY

– SCSTRESC = BLUEISH GRAY

– Sponsor does not care about maintaining CT

46

4.1.2.7 Submitting Free Text From the CRF

• ‘Specify’ values for topic variable

– Interventions

[CRF options: Acetaminophen, Aspirin, Other: ______]

• Verbatim will be entered into --TRT

– Events

• Verbatim entered into --TERM

– Findings

• Verbatim needs to be coded so that --TEST/--TESTCD are CT and not free text

47

4.1.2.8 Multiple Values for a Variable

• Topic variable (--TRT, --TERM)

– Assumed sponsor will split or resolve for their data management procedures

– DS is an exception

• Covered in 6.2.2.1

• Sponsor chooses primary

• Submit others in SUPPQUAL

48

4.1.2.8 Multiple Values for a Variable (cont)

• Findings result variable

– Split into 2 rows

• EGORRES=ATRIAL FIBRILLATION

• EGORRES=ATRIAL FLUTTER

• Non-result qualifier variable

– Variable value should be MULTIPLE

– Individual values stored in SUPPQUAL

• AETERM=RASH, AELOC = MULTIPLE

• QNAM.AELOC1 = FACE

• QNAM.AELOC2 = NECK

• QNAM.AELOC3 = CHEST

– UNLESS

• If one is considered of primary interest, that value can go into the variable, with the others stored in SUPPQUAL

– Will reviewer know these are in SUPPQUAL? Document!
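The non-result qualifier case (AELOC = MULTIPLE with the individual values in SUPPQUAL) can be sketched in Python as below. The function name and the reduced record layout (only QNAM and QVAL) are assumptions for illustration; real SUPPQUAL records carry further variables, and the "primary interest" exception is not handled here.

```python
def split_multiple(variable, values):
    """Return (parent value, SUPPQUAL records) for a non-result qualifier.

    With more than one collected value, the parent variable is set to
    MULTIPLE and each value goes to SUPPQUAL as <variable>1, <variable>2, ...
    """
    if len(values) <= 1:
        return (values[0] if values else ""), []
    supp = [
        {"QNAM": f"{variable}{i}", "QVAL": value}
        for i, value in enumerate(values, start=1)
    ]
    return "MULTIPLE", supp
```

Calling `split_multiple("AELOC", ["FACE", "NECK", "CHEST"])` reproduces the rash example: the parent value is MULTIPLE and the three locations become AELOC1, AELOC2 and AELOC3.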

49

4.1.3 Coding and Controlled Terminology Assumptions

• ‘*’ if no controlled terminology exists

• List of the terms if the list is not maintained elsewhere

• Name of the external codelist

– http://www.cancer.gov/cancertopics/terminologyresources/CDISC

• Full CT Discussion to be held in the future

50

4.1.4.7 Use of Relative Timing Variables

• Introduction to the new SDTM variables

1) --STRTPT

2) --STTPT

– Examples: "2003-12-25" or "VISIT 2".

3) --ENRTPT

4) --ENTPT

– Timepoints are not anchored to RFSTDTC and RFENDTC as in --ENRF and --STRF

– Valid values in --STRTPT or --ENRTPT are:

• BEFORE

• COINCIDENT

• AFTER

• U

51

4.1.4.7 Use of Relative Timing Variables - Example

• If an AE is known to be ongoing at the end of a subject’s study participation, which is on October 17th, 2008, then:

– AEENRTPT = ONGOING

– AEENTPT = 2008-10-17
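A rough Python sketch of how an end-relative value might be assigned against a reference time point. It assumes complete ISO 8601 dates (YYYY-MM-DD) and a hypothetical `ongoing` flag taken from the CRF; partial dates and reference points given as text such as "VISIT 2" are not handled.

```python
from datetime import date


def end_relative_to(end_dtc, reference_dtc, ongoing=False):
    """Suggest an --ENRTPT value for an end date against a reference time point.

    Dates are complete ISO 8601 strings (YYYY-MM-DD); `end_dtc` may be None.
    """
    if ongoing:
        return "ONGOING"
    if end_dtc is None:
        return "U"
    end = date.fromisoformat(end_dtc)
    reference = date.fromisoformat(reference_dtc)
    if end < reference:
        return "BEFORE"
    if end > reference:
        return "AFTER"
    return "COINCIDENT"
```

For the AE example, `end_relative_to(None, "2008-10-17", ongoing=True)` gives ONGOING, with 2008-10-17 stored as the reference time point in AEENTPT.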

52

4.1.5 Other Assumptions

• --ORRES should generally not be populated for derived records

– Still not required but highly encouraged

• If symbol is collected with original results, for example <10,000 then this gets copied into --STRESC, but --STRESN is null

– Also applies to values such as TRACE, 1+, etc.

– Discouraging derivations in SDTM

– Recommended that this be done in ADaM
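The symbol rule can be sketched in Python as follows. This is a simplification under stated assumptions: it copies the original result text unchanged into the character result and ignores unit conversion and any other standardization, and the function name is hypothetical.

```python
import re

# Plain decimal numbers only; anything with a symbol or text is not numeric.
_PLAIN_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def standard_results(orres):
    """Return (--STRESC, --STRESN) for an original result string.

    The text is copied to --STRESC. --STRESN is populated only when the
    text is a plain number, so '<10,000', 'TRACE' and '1+' give None.
    """
    text = orres.strip()
    stresn = float(text) if _PLAIN_NUMBER.fullmatch(text) else None
    return text, stresn
```

No numeric value is imputed for results such as <10,000, in line with the recommendation to leave such derivations to ADaM.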

53

4.1.5 Other Assumptions (cont)

• If --TEST (except for IETEST and TI.IETEST) values > 40 characters then --TEST should be:

– 1st 40 characters

– Shortened but meaningful version

– In either case, if the full text is on the CRF, then link to that from the Origin column. If it is not on the CRF, then link to another PDF which contains the full test name

– Also applies to QLABEL in SUPPQUAL

54

4.1.5 Other Assumptions (cont)

• Clinical Significance

– Should all go to SUPPQUAL

– 3.1.1 had EG examples with CS in the results field.

• --REAS standard QNAM for reason test was performed

55

4.1.5 Other Assumptions (cont)

• Introduction to the new SDTM variable --PRESP

– Indicates that an event or intervention was prespecified on the CRF

– Values are Y or null

Situation                               --PRESP   --OCCUR   --STAT

Spontaneously reported event occurred   (null)    (null)    (null)

Pre-specified event occurred            Y         Y         (null)

Pre-specified event did not occur       Y         N         (null)

Pre-specified event has no response     Y         (null)    NOT DONE
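The four situations in the table can be expressed as a small Python function. The function and argument names are hypothetical, and empty strings stand in for null values.

```python
def prespecified_flags(prespecified, occurred=None):
    """Return --PRESP, --OCCUR and --STAT values for one event record.

    `occurred` is True, False, or None when no response was collected.
    Empty strings stand for null.
    """
    if not prespecified:
        # Spontaneously reported event: all three variables stay null.
        return {"PRESP": "", "OCCUR": "", "STAT": ""}
    if occurred is None:
        return {"PRESP": "Y", "OCCUR": "", "STAT": "NOT DONE"}
    return {"PRESP": "Y", "OCCUR": "Y" if occurred else "N", "STAT": ""}
```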

56

Domain Models

• New assumptions with most tables listing what variables would generally not be added into the domain

• Examples moved from Section 9 to Section 6, under corresponding domain table

• Variables dropped/added

• Variable order changes

• Label changes

• Assumptions added/clarified/dropped

57

DM

• Multiple race should be handled as multiple response for non-result qualifier

• Additional race data now goes into SUPPDM, instead of SC

58

CO

• No longer restricts the addition of Identifiers and Timing variables

– When not related to other domain records

59

SE / SV

• Moved from Trial Design to Special Purpose

60

EX

• Assumption that EX is required for all studies which include investigational product

– Observed by Investigator

– Automated dispensing device records

– Subject Recall (eg via diary)

– Derived from DA (pill count)

– Derived from the protocol

61

AE

• Removed AEOCCUR

• AE is only for AEs that actually occurred

62

CE

• Clinical events of interest that would not be classified as adverse events

63

FA

• Not subclass of the findings domain

• Only domain that can use the --OBJ SDTM variable

• Previously CF domain (3.1.2 draft)

64

QUESTIONS?