Medical data diagnosis

Preview:

Citation preview

Feature Selection of Medical Diagnosis DataUsing Genetic Algorithm and Data Mining

Introduction● Data mining - Helps us with making sense and finding

patterns in huge collection of data that is obtained. Automate prediction and minimize human interaction.

● Feature Selection - Redundant and irrelevant variables and predictors need to be removed from the data.To simplify the model, remove noise and prevent

overfitting.● Genetic algorithm - Simulating a process of natural

selection.

How it worksStep 1

Obtain the Data

Step 2

Feature Selection

Step 3

Genetic Algorithm

Medical Data

Mined Data Disease specific data

Mathematical model

Prediction Appropriate Treatment

New Records

Data MiningData is obtained from

Medical Records of patients in Hospitals, Clinics.

We tabulate all the data of each patient into a number of parameters or variables

This data is the Training Data for the program.

These are collected and tabulated medical records. This is called as training data. The data of 14 persons have been noted with their age, BMI (Body Mass Index), hereditary and vision along with the result, the risk of a medical condition. This data is subjected to Feature selection where irrelevant variables are eliminated. It is passed through a filter algorithm to obtain a better training data set. This data is crucial in the prediction of new patients disease.

Feature Selection● Selecting a subset of the training data which is

relevant to the specific disease.● Removes irrelevant and redundant variables.● Find the variables that affect the outcome the most,

discard the rest.● Improves processing time and prevents overfitting● Three major methods

○ Filter Method○ Wrapper Method○ Embedded Method

Feature Selection Methods

Filter Method

Pick out intrinsic properties of the data

Two stepped process

Ranking

Subset Selection

Fast and prevents overfitting

Wrapper Method

Almost the same as Filter except it can detect possible interaction between variables.

More specific but increases computation time

Embedded Method

Embedded into the model construction process

Combines the advantages of Filter and Wrapper methods

Genetic Algorithm● Genetic Algorithms are adaptive optimization methods that mimic

natural evolution processes via non-exhaustive searches among randomly generated solutions.

● Inspired by natural evolution, such as inheritance, mutation, selection, and crossover.

● Application is in Medicine: Clinical Decision Support● The data is considered to be population, every record in the data

is treated like an “individual” and it’s output is treated as its score● We select the best individuals and apply the Genetic Algorithm to

create new individuals and repeat this till we get a population of the best individuals

Fitness EvaluationFind the outputs of the inputs and find the best individuals

Initial PopulationAssess the population (data) and assign scores to each of them

Process of Genetic Algorithm

Mating/MutationTwo selected inputs can be mated with a chance of mutation to obtain an input with hopefully a better output

Quality CheckCheck if the population has sufficient quality. If yes, end the process. Else, repeat the process

Process of Genetic Algorithm

Initial PopulationAssess the population (data) and assign scores to each of them

Fitness EvaluationFind the outputs of the inputs and find the best individuals

Mating/MutationTwo selected inputs can be mated with a chance of mutation to obtain an input with hopefully a better output

Quality CheckCheck if the population has sufficient quality. If yes, end the process. Else, repeat the process

In Conclusion

● Data Mining and Genetic Algorithm techniques yield efficient results in the diagnosis of a disease.

● Feature selection methods enable elimination of irrelevant variables and generation of a better training set.

● The prediction for the new record is accurate and less time is consumed by the mathematical model to generate the prediction.

● Time is critical in diagnosis of disease, so early treatment results in high success rates for curing of disease.

Thank You

Recommended