Click here to load reader
Upload
oswin-brooks
View
218
Download
4
Embed Size (px)
Citation preview
Comparison of Modelling Techniques to Predict Risk of Mortality using Routinely Collected Data
Tessy Badriyah
Healthy Computing, 1nd June 2011
Aim :
to contribute to the building of effective and efficient methods to predict clinical outcome that can be constructed from routinely collected data
DatasetModel was built from Biochemistry and Haematology Outcome Model (BHOM) dataset, during 12-month study period => 17,417 patients.The fields are : death - at discharge - F=alive, T =dead (class attribute), age at admission, mode of admission (mostly emergency, but some elective), gender, haemoglobin, white cell count, urea, serum sodium, serum potassium, creatinine, urea / creatinine
Statistical AnalysisThe statistical analysis to asses the overall performance of the model are discrimination (area under ROC curve or c-index) and calibration (chi-test)
Design ExperimentWe conducted our experiment using Logistic Regression as standard method (‘gold standard’) in the Health Care data and several methods in machine learning techniques (Decision Trees, Neural Networks, K-Nearest Neighbour,).We used 10-fold cross validation method and the process was repeated 10
times to avoid bias during the formation of the cross validation data splits.
Methodology
The Average of Discrimination (c_index) over 10 iteration using 10-fold cross validation
iteration1
iteration4
iteration7
iteration10
0.7
0.74
0.78
0.82
0.86
0.9
Logistic RegressionDecision TreesNeural NetworkK-Nearest Neighbour
The Importance of Calibration
The Result and Work Plan
Both Logistic regression and Machine Learning methods have shown a good resultsMachine Learning methods is worth to looking at to predict clinical outcomeIn another experiment, we were increasing the number of non-survivors in the dataset to test the hypothesis that the discrimination can be improved when the proportion of non-survivors has been increased => hypothesis proved.Further investigation and model development to predict another clinical outcome (e.g. readmission rates) which has different types of target data with previous clinical outcome.