Constructing Binary Decision Tree for Predicting Deep Venous Thrombosis (DVT)

Christopher Nwosisi1,2, Sung-Hyuk Cha1, Yoo Jung An, Charles C. Tappert1, Evan Lipsitz2

1Computer Science Department, Pace University, New York, USA

2Vascular Laboratory, Montefiore Medical Center, New York, USA

Statement of Problem

• The use of decision tree algorithms such as ID3 and C4.5 in medical diagnostic applications is promising, but the resulting trees often suffer from excessive complexity and can even be incomprehensible.

• Especially in predicting DVT, which has high mortality, a simple and accurate decision model is preferred by potential patients, medical technologists, and physicians before patients are sent for expensive medical examinations.

Proposed approach

• Use the Genetic Algorithm (GA) to minimize the complexity (size) and/or maximize the accuracy of the decision tree.

• The new approach found shorter and/or more accurate decision trees than those produced by the conventional ID3 and C4.5 algorithms.
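The slides do not spell out how the GA encodes or scores a tree, so the following is only a minimal sketch under stated assumptions: a tree is a full binary tree of fixed depth whose internal-node attributes are stored in heap order, each leaf takes the majority class of the training rows that reach it, and fitness is plain training accuracy. The names `leaf_index`, `accuracy`, and `evolve`, the operators (tournament selection, one-point crossover, point mutation, elitism), and all parameter values are illustrative choices, not the authors' method.

```python
import random


def leaf_index(chromosome, row, depth):
    """Follow the attribute tests stored in heap order; return the leaf slot reached."""
    node = 0
    for _ in range(depth):
        bit = row[chromosome[node]]
        node = 2 * node + 1 + bit          # left child for 0, right child for 1
    return node - (2 ** depth - 1)         # position among the 2**depth leaves


def accuracy(chromosome, rows, labels, depth):
    """Label each leaf with its majority class, then score the training rows."""
    counts = [[0, 0] for _ in range(2 ** depth)]
    for row, label in zip(rows, labels):
        counts[leaf_index(chromosome, row, depth)][label] += 1
    return sum(max(neg, pos) for neg, pos in counts) / len(rows)


def evolve(rows, labels, depth=2, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Assumed GA loop: elitism, tournament selection, one-point crossover, mutation."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])
    n_internal = 2 ** depth - 1

    def fitness(chromosome):
        return accuracy(chromosome, rows, labels, depth)

    population = [[rng.randrange(n_attrs) for _ in range(n_internal)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = [max(population, key=fitness)]            # keep the best tree
        while len(next_gen) < pop_size:
            mother = max(rng.sample(population, 3), key=fitness)
            father = max(rng.sample(population, 3), key=fitness)
            cut = rng.randrange(1, n_internal) if n_internal > 1 else 0
            child = mother[:cut] + father[cut:]              # one-point crossover
            child = [rng.randrange(n_attrs) if rng.random() < mutation_rate else gene
                     for gene in child]                      # point mutation
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)


# Toy binary data (hypothetical, not the DVT datasets): label = attr0 XOR attr1.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, c in rows]
best_tree, best_accuracy = evolve(rows, labels, depth=2)
print(best_tree, best_accuracy)
```

In this sketch the depth limit is the only control on tree size; a fitness that also penalizes size would be a natural variation.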

DVT / VTE: Magnitude of the Problem

• DVT: 2 million

• PE: 600,000

• Silent PE: 1 million

• Death: 60,000

• Post-thrombotic syndrome: 800,000

• Pulmonary hypertension: 30,000

• Estimated cost of VTE care: $1.5 billion/year

Goldhaber SZ, et al. Lancet 1999;353:1386-19.

Patients with deep vein thrombosis have a painful swollen leg which limits their mobility

Clinical Problem

Montefiore Hospital Vascular Laboratory, 2008

DVT-Duplex Evaluation

Criteria for positive diagnosis:

- incompressibility of a venous segment

- visualization of thrombus

- absence of flow

Montefiore Hospital Vascular Laboratory

Database Overview

Two datasets are extracted from two databases:

• Medical History

• Physical Exam

• Diagnostic Tests

• 515 records from the Laboratory

- 350 patients are positive for DVT

- 165 patients are negative for DVT

• 620 records from the general registry

- 420 patients are positive for DVT

- 200 patients are negative for DVT

Table 1 - Database Attributes

No.  Name      Description
1    Sex       1 = male; 0 = female
2    Age       Age in years {1-99}
3    Diabetes  0 = normal; 1 = patient is receiving some treatment
4    Smoking   0 = never smoked; 1 = patient is an active smoker; 2 = patient stopped smoking
5    Surgery   0 = never had surgery; 1 = patient had previous surgery
6    Pain      0 = no pain in the leg; 1 = patient experienced pain in the leg {Right, Left or Bilateral}
7    Swelling  0 = no swelling below the knee; 1 = swelling in the leg
     DVT       0 = examination result indicates negative for DVT; 1 = examination result indicates positive for DVT

Medical History

Table 2 - Database Attributes

No.  Name                         Description
1    Sex                          1 = male; 0 = female
2    Age                          Age in years {1-99}
3    Diabetes                     0 = normal; 1 = patient is receiving some treatment
4    Smoking                      0 = never smoked; 1 = patient is an active smoker; 2 = patient stopped smoking
5    Surgery                      0 = never had surgery; 1 = patient had previous surgery
6    Swelling                     0 = no swelling below the knee; 1 = swelling in the leg
7    Chest Pain                   0 = none; 1 = pain in chest
8    Cancer                       0 = normal; 1 = positive for cancer
9    Cellulitis                   0 = normal; 1 = positive for cellulitis
10   Injury                       0 = no injury; 1 = previous and current injuries
11   Pulmonary embolism           0 = never diagnosed; 1 = previously diagnosed
12   Congestive heart failure     0 = never diagnosed; 1 = previously diagnosed
13   Obesity                      0 = obesity not specified; 1 = obesity specified
14   Accident                     0 = never had a fall; 1 = previously had a fall
15   Hyperlipidemia               0 = normal; 1 = patient is diagnosed
16   Cardiac dysrhythmia          0 = normal; 1 = patient is diagnosed
17   Lymphoproliferative disease  0 = normal; 1 = patient is diagnosed
     DVT                          0 = examination result indicates negative for DVT; 1 = examination result indicates positive for DVT

Medical History

Physical Exam

Diagnostic Tests

Sex Age Diabetes Smoking surgery pain swelling DVT

M 77 y no y n n yes

M 53 n no y n n yes

M 55 n yes n n y yes

F 73 n no y n y yes

F 84 y no y n n yes

F 68 n yes y n n yes

F 81 n no y n n yes

M 84 y yes n n n yes

F 84 y no y n n yes

M 84 n no y n n yes

F 73 n no y n y yes

F 56 n no n n y yes

M 63 n no n n n yes

F 76 y no y n n yes

F 70 y no y n n yes

M 75 n no y n n yes

F 92 n no n n n no

F 73 n no y y n no

F 61 n stopped n y n no

M 63 y stopped y n n no

M 78 n no y n n no

F 96 n no y n n no

F 71 n no y n n no

M 71 n no n n y no

DVT database (Table 1)

Table 2.1.1.1 - DVT sample data set II

AGE SEX Ob Sm Swell CHF Canc Surg ChestPain Lip Lymp CardDysr DB OthrPE ACC/Fall LegInj LegCell DVT

50 M 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1

82 F 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1

88 F 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1

67 F 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1

83 F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

79 M 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 1

54 M 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1

69 M 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1

68 M 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1

62 M 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1

26 F 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1

64 F 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1

80 F 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1

82 F 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1

78 M 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1

33 F 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1

26 M 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 1

54 M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1

45 F 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1

47 F 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1

74 F 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1

60 F 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1

58 M 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1

42 F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

63 M 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1

45 F 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 1

30 F 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0

87 F 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0

77 F 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

97 F 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

88 F 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0

18 M 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

85 F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

35 M 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

68 F 0 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0

48 F 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0

85 M 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0

68 M 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

42 F 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

DVT database (Table II)

Datasets Relationship

[Figure: diagram relating the attributes of Dataset I and Dataset II. Attribute labels as extracted from the figure: SMSBSS, PEHFOB.]

Preprocessing (Binarization)

Heterogeneous type attributes

Sex Smoking … pain DVT

M no N yes

F no L yes

F yes Bi yes

F no N yes

M yes N yes

F stopped R no

M no N no

Homogeneous Binary type attributes

Original table (above) and binary table (below; Smoking and pain each occupy two binary columns)

Sex Smoking … pain DVT

1 0 0 0 0 1

0 0 0 1 0 1

0 1 1 1 1 1

0 0 0 0 0 1

1 1 1 0 0 1

0 1 0 0 1 0

1 0 0 0 0 0

Why Binary Attribute?

• Applying GA on non-binary attributes is extremely difficult and currently an open problem

• To use the GA to build a binary decision tree, the attribute types must be binary

Age Distributions (numeric)

Nominal type attributes (|v| > 2)

Leg Pain {L, R, Bi, N}

LP  RP  Leg Pain
1   1   Bi
1   0   L
0   1   R
0   0   None

Smoking {N, Stopped, Yes}

Bit 1  Bit 2  Smoking
1      1      Smoking
1      0      Stopped
0      0      None
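The two-bit codes above can be applied mechanically. The sketch below reproduces the Sex / Smoking / pain / DVT columns of the binary table; the function names and dictionary keys are hypothetical. The numeric Age attribute appears in the binarized tables as A60, but the slides do not state the cut-off rule, so `age_bit` assumes A60 = 1 when age is 60 or more.

```python
# Two-bit codes copied from the encoding tables on the slide.
LEG_PAIN_BITS = {"Bi": (1, 1), "L": (1, 0), "R": (0, 1), "N": (0, 0)}
SMOKING_BITS = {"yes": (1, 1), "stopped": (1, 0), "no": (0, 0)}


def binarize(record):
    """Map one heterogeneous record to (sex, smoking x2, pain x2, DVT) as 0/1 values."""
    bits = [1 if record["sex"] == "M" else 0]      # 1 = male, 0 = female (Table 1)
    bits.extend(SMOKING_BITS[record["smoking"]])
    bits.extend(LEG_PAIN_BITS[record["pain"]])
    bits.append(1 if record["dvt"] == "yes" else 0)
    return tuple(bits)


def age_bit(age, threshold=60):
    """ASSUMPTION: A60 is 1 when age >= threshold; the slides do not give the rule."""
    return 1 if age >= threshold else 0


# Sixth row of the original table: F, stopped, R, no
print(binarize({"sex": "F", "smoking": "stopped", "pain": "R", "dvt": "no"}))
# (0, 1, 0, 0, 1, 0)
```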

Dataset I Binarized Table

A60 GN DB SM SR PN SW DVT
1   1  0  0  0  0  0  1
0   0  0  0  0  0  1  1
1   0  0  0  1  0  0  1
1   1  0  0  1  0  0  1
0   1  0  0  1  0  0  1
0   1  0  0  1  0  1  0
0   0  0  0  0  0  0  0
1   0  0  0  0  0  0  0
0   0  0  0  0  0  0  0
1   1  1  0  0  0  0  0

A60 GN DB OB SM SR SW HF CR CP HL LD CD PE AC IJ CL DVT

0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1

1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1

1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1

1 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1

0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1

1 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Dataset II Binarized Table

Decision Tree

Decision trees represent acquired knowledge in a tree form that is intuitive and generally easy for humans to assimilate. In general, DT classifiers have accuracy comparable to that of more complex classifiers but are simpler to understand and visualize.

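[Figure: example decision tree with internal nodes CR and SW and leaves labeled pos or neg; annotations in the figure include (17/25), (12/13), (11/12), and (10/10).]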

Decision Tree Representation

• Decision trees classify instances by sorting them down the tree from the root to a leaf node, which provides the classification of the instance.

• Each internal node in the tree specifies a test of some attribute of the instance.

• Each branch descending from an internal node corresponds to one of the possible values of that attribute.

• Each leaf node assigns a classification.
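The sorting procedure described in these bullets can be sketched for binary attributes as follows. The `Node` class and the toy tree are hypothetical and are not one of the trees from the slides; attribute names such as SW and SR are borrowed from the binarized tables only for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    attribute: Optional[str] = None   # attribute tested at an internal node, e.g. "SW"
    zero: Optional["Node"] = None     # branch followed when the attribute is 0
    one: Optional["Node"] = None      # branch followed when the attribute is 1
    label: Optional[str] = None       # "pos" or "neg" at a leaf

    def is_leaf(self):
        return self.label is not None


def classify(node, instance):
    """Sort the instance from the root down to a leaf; return (label, questions asked)."""
    questions = 0
    while not node.is_leaf():
        questions += 1
        node = node.one if instance[node.attribute] == 1 else node.zero
    return node.label, questions


# Hypothetical two-level tree, NOT taken from the slides.
toy_tree = Node("SW",
                zero=Node(label="neg"),
                one=Node("SR", zero=Node(label="neg"), one=Node(label="pos")))

print(classify(toy_tree, {"SW": 1, "SR": 0}))  # ('neg', 2)
```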

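Decision Trees from Dataset I

[Figure: three decision trees from Dataset I. (a) 59.5% by C4.5; (b) 61.5% by GA; (c) 64.5% by GA. Node labels recovered from the figure: SW, SR, DB, A6, DB; DB, A6, GN. Leaves are labeled p or n.]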

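Decision Trees from Dataset II (Figure 5)

[Figure 5: four decision trees from Dataset II. (a) C4.5 (72.25%), depth = 12; (b) 69.75% by GA, depth = 5; (c) 73.75% by GA; (d) 75.25% by GA, depth = 7. Node labels recovered from the figure: SM, CR; GN, HF, GN; SM, a6, a6; HF, SW, CR; PN, a6, GN, GN. Leaves are labeled p or n.]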

The Best Measure of Efficiency (shortness) for a DT

• Average number of questions required to obtain a prediction.

Other measures:

• the depth of the tree

• the number of nodes in the tree

Complexity of Decision Trees

Depth limit  Performance rate (%)  Average # of questions
5            69.75                 2.9525
6            73.75                 3.3725
7            75.25                 3.8955
8            76.50                 4.3275
9            76.75                 4.8225
10           78.00                 5.1225
11           78.50                 5.4675
12           79.50                 5.8675
13           80.25                 6.3075

12           72.25                 7.485   (C4.5 tree, Figure 5a)
16           80.0

From both the depth and the average-number-of-questions perspective, the decision tree in Figure 5(d) can be considered much more efficient (simpler) than the decision tree from the C4.5 algorithm (Figure 5a).
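Both measures in this comparison are easy to compute for a given tree. The sketch below assumes the average number of questions is taken over the instances being classified (the slides do not say which set was used) and uses a hypothetical nested-tuple tree: a leaf is a label string and an internal node is `(attribute, zero_branch, one_branch)`.

```python
def questions_asked(tree, instance):
    """Count the attribute tests needed to reach a leaf for one instance."""
    count = 0
    while not isinstance(tree, str):           # a string is a leaf label
        attribute, zero_branch, one_branch = tree
        count += 1
        tree = one_branch if instance[attribute] else zero_branch
    return count


def average_questions(tree, instances):
    """Assumed definition: mean number of tests per classified instance."""
    return sum(questions_asked(tree, x) for x in instances) / len(instances)


def depth(tree):
    """Length of the longest root-to-leaf path."""
    if isinstance(tree, str):
        return 0
    _, zero_branch, one_branch = tree
    return 1 + max(depth(zero_branch), depth(one_branch))


# Hypothetical tree and instances, NOT taken from the slides.
toy_tree = ("SW", "neg", ("SR", "neg", "pos"))
toy_instances = [
    {"SW": 0, "SR": 0},
    {"SW": 1, "SR": 0},
    {"SW": 1, "SR": 1},
    {"SW": 0, "SR": 1},
]
print(depth(toy_tree), average_questions(toy_tree, toy_instances))  # 2 1.5
```

A deep but unbalanced tree can still have a low average if most instances exit at shallow leaves, which is why the two measures are reported separately.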


Optimal DT

[Figure: candidate optimal decision tree with leaves labeled pos or neg. Annotations recovered from the figure: (17/25), (12/13), (30/43), (20/22), (13/16), (11/12), (10/10), (56/79), (43/52).]

This might be the optimal decision tree for the data. It indicates that combining human knowledge with machine processing speed can often produce a better result than either the human or the machine could produce separately.

Conclusion

• Experimental results on two datasets suggest that the GA can find more accurate and more efficient decision trees.

• The decision trees produced by the GA have significant clinical relevance.

• The results shown here increase the probability of correctly predicting whether a patient will develop or has had DVT, which advances the diagnosis of DVT.

Future Work

The decision trees found by the GA tend to be almost full binary trees, i.e., the width is large while the depth is short.

For future work, the C4.5 pruning mechanism could be applied to decision trees produced by the GA to make the trees sparser and to further reduce the potential over-fitting problem.
