Datacut Strategies: What, Why & How Hiren Naygandhi Acknowledgements: Chris McKenna, Hinal Patel & Vijay Reddi

Hiren Naygandhi - Lex Jansen · 2014. 10. 27. · Hiren Naygandhi Acknowledgements: Chris McKenna, Hinal Patel & Vijay Reddi . 2 Contents ! What is a datacut? ! Why do we need it?



Datacut Strategies: What, Why & How Hiren Naygandhi

Acknowledgements: Chris McKenna, Hinal Patel & Vijay Reddi


Contents

Ø What is a datacut?

Ø Why do we need it?

Ø Scenarios

Ø What if there was no datacut?

Ø Who should be responsible?

Ø Conclusion


What is a datacut?

Ø Data that have been cut into subsets, according to a set of rules

Ø Could be at a date, for a number of patients or significant point in time

Ø Usually
ü defined by other function e.g. Stats, Science
ü performed by Stats Programming, Data Management
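As a minimal sketch of the simplest rule, a cut at a date, the Python below keeps only records dated on or before a cutoff. The record layout, the field names and the inclusive comparison are all assumptions made for illustration; the slides do not prescribe them.

```python
from datetime import date

# Hypothetical adverse-event records; field names are illustrative only.
records = [
    {"subject": "001", "event": "Headache", "start": date(2011, 12, 20)},
    {"subject": "002", "event": "Nausea", "start": date(2012, 1, 5)},
    {"subject": "003", "event": "Rash", "start": date(2012, 1, 17)},
]


def apply_datacut(rows, cutoff, date_field="start"):
    """Keep rows dated on or before the cutoff (inclusive by assumption)."""
    return [row for row in rows if row[date_field] <= cutoff]


cut = apply_datacut(records, date(2012, 1, 5))
print([row["subject"] for row in cut])
```

A cut "for a number of patients" would need a different rule (for example, the date on which the Nth patient reached a milestone), which would then feed the same kind of filter.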


Why do we need it?

Ø Need to ensure data is “clean”

Ø Need to consider:
ü Identify cut-off fields
ü Imputation of partial dates
ü Rollback i.e. amend data to ongoing if after datacut

e.g. Cutoff date: 05JAN2012, Snapshot date: 23JAN2012

[Timeline figure: Clinical Cut-Off, followed by Snapshot]
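The two date-handling steps named above can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's method: the function names, the record layout, and the choice to impute a missing day or month to the earliest (or latest) possible value are all hypothetical, and the actual imputation rule is study-specific.

```python
import calendar
from datetime import date

CUTOFF = date(2012, 1, 5)  # cutoff date taken from the slide example


def impute_partial(year, month=None, day=None, to_end=False):
    """Impute a partial date to the first (or last) possible day.

    Which direction to impute is an assumption here, not a rule from the slides.
    """
    if month is None:
        month = 12 if to_end else 1
    if day is None:
        day = calendar.monthrange(year, month)[1] if to_end else 1
    return date(year, month, day)


def rollback(record, cutoff=CUTOFF):
    """Return a copy in which an end date after the cutoff is reset to ongoing."""
    rolled = dict(record)
    end = rolled.get("end")
    if end is not None and end > cutoff:
        rolled["end"] = None
        rolled["ongoing"] = True
    return rolled


# Hypothetical event: started in DEC2011 (day unknown), ended 17JAN2012.
event = {
    "event": "Headache",
    "start": impute_partial(2011, 12),
    "end": date(2012, 1, 17),
    "ongoing": False,
}
rolled = rollback(event)
print(rolled["start"], rolled["end"], rolled["ongoing"])
```

Because the event ended after the cutoff but before the snapshot, the rolled-back copy treats it as still ongoing at the cutoff, while the original record is left unchanged.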


Scenarios

Scenario   Cut applied at (Source or Analysis)   Imputed dates   Rollback
1          Source                                Retained        Yes
2          Analysis                              Retained        No
3          Source                                Dropped         No (flags created in analysis datasets)


Scenario 1

Scenario   Cut applied at (Source or Analysis)   Imputed dates   Rollback
1          Source                                Retained        Yes

Ø Advantages:
ü Imputations made only once
ü Rollback applied, which gives a more accurate reflection of events without bias

Ø Disadvantages:
ü Derivations added within source datasets
ü Rollback applied, though would it have a significant impact?


Scenario 2

Scenario   Cut applied at (Source or Analysis)   Imputed dates   Rollback
2          Analysis                              Retained        No

Ø Advantages:
ü Source data remains intact
ü Only need to apply date imputations once

Ø Disadvantages:
ü Mis-match between source and analysis data
ü No Rollback applied, which could lead to bias


Scenario 3

Scenario   Cut applied at (Source or Analysis)   Imputed dates   Rollback
3          Source                                Dropped         No (flags created in analysis datasets)

Ø Advantages:
ü No rollback applied, therefore no source data changed
ü Flags will identify events after datacut and allow flexibility for reporting

Ø Disadvantages:
ü Date imputations would need to be performed twice
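A minimal Python sketch of the flag approach in Scenario 3: instead of amending end dates, each record gains a flag saying whether it ended after the datacut, and reporting decides how to treat flagged records. The flag name `end_after_cut`, the record layout and the example dates are hypothetical.

```python
from datetime import date

CUTOFF = date(2012, 1, 5)  # illustrative cutoff date

# Hypothetical events; an end of None means the event is still ongoing.
events = [
    {"id": "AE1", "end": date(2011, 12, 28)},
    {"id": "AE2", "end": date(2012, 1, 17)},
    {"id": "AE3", "end": None},
]


def flag_after_cut(rows, cutoff=CUTOFF):
    """Add a flag for end dates after the cutoff; dates themselves are not changed."""
    flagged = []
    for row in rows:
        end = row["end"]
        flagged.append({**row, "end_after_cut": end is not None and end > cutoff})
    return flagged


flagged = flag_after_cut(events)
print([(row["id"], row["end_after_cut"]) for row in flagged])
```

Keeping the original end date alongside the flag is what gives the reporting flexibility noted above: a table can treat flagged events as ongoing, while a listing can still show the recorded date.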


What if there was no datacut?

Ø What if we did NOT apply any date imputations?

Ø Use database entry date instead of event date

Ø Advantages:
ü No need for imputations
ü No need for datacut
ü Accurate reflection of database

Ø Disadvantages:
ü Delay in data entry
ü Inaccurate reflection of study
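To illustrate why a delay in data entry matters, the Python sketch below selects records by event date and by database entry date and compares the results. The records, field names and dates are invented for illustration only.

```python
from datetime import date

CUTOFF = date(2012, 1, 5)  # illustrative cutoff date

# Hypothetical records: event "A" happened before the cutoff but was entered late.
records = [
    {"id": "A", "event_date": date(2012, 1, 3), "entry_date": date(2012, 1, 20)},
    {"id": "B", "event_date": date(2011, 12, 15), "entry_date": date(2011, 12, 16)},
    {"id": "C", "event_date": date(2012, 1, 10), "entry_date": date(2012, 1, 11)},
]


def ids_on_or_before(rows, field, cutoff=CUTOFF):
    """Return the ids of rows whose chosen date field is on or before the cutoff."""
    return {row["id"] for row in rows if row[field] <= cutoff}


by_event = ids_on_or_before(records, "event_date")
by_entry = ids_on_or_before(records, "entry_date")
missed = by_event - by_entry  # occurred before the cutoff, entered after it
print(sorted(missed))
```

Selecting on entry date reflects what was in the database at the cutoff, but it leaves out event "A", which did occur in the study period.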


Who should be responsible?

Ø Stats Programming or Data Management?

Ø What if Data Management (DM) were owners for datacuts?

Ø Advantages:
ü We (Statistical Programming) don’t need to worry about performing the datacut!
ü DM best placed with access and tools available to them
ü DM responsibility to provide “analysis-ready” data

Ø Disadvantages:
ü May be perceived as performing “analysis” by DM
ü Depends on DM resource/agreement

Ø Need to ensure:
ü There is a clear process
ü Stakeholders are included early



Conclusion

Ø Much more to a datacut than just subsetting!

Ø Various strategies available: No wrong or right solution

Ø  Recommendation:

ü  Mixture of multiple strategies depending on purpose

ü  Ensure this is consistent across all projects i.e. one process to follow

ü  Ensure other stakeholders, e.g. Statistics and Science, are included in decisions

ü  As programmers we can play a vital role in defining process


We have reached the Cut-off!

Ø What strategy do you apply?

Ø Who should take ownership of the process?

Ø Any questions?


Doing now what patients need next