8/12/2019 Data Mining in Pharmacovigilance to Reduce ADRs
www.ijcait.com International Journal of Computer Applications & Information Technology
Vol. II, Issue I, January 2013 (ISSN: 2278-7720)
Data mining in pharmacovigilance to reduce Adverse Drug Effects (ADRs)
Ms. Miral Kothari
Gujarat Technological University
AITS, Yogidham Gurukul, Kalawad Road, Rajkot.

Ms. Priti Sadaria
Saurashtra University
Virani Science College, Yogidham Gurukul, Kalawad Road, Rajkot.

Ms. Nehal Dave
Saurashtra University
Virani Science College, Yogidham Gurukul, Kalawad Road, Rajkot.
ABSTRACT
The pharmaceutical industry provides medicines in different formats: tablets, capsules, liquids or injectables. A drug in any form may cause adverse effects, which vary from person to person. Before a drug is put on the market, it is tested for adverse effects on a large scale. Pharmacovigilance is the science concerned with the discovery, understanding and anticipation of Adverse Drug Effects (ADEs). Pharmaceutical experts and industries rely heavily on data mining algorithms and techniques to understand the large volume of data collected from healthcare professionals and patients, and to use that data for further research and development of new drugs. In this paper, the authors apply the Bayesian classification method of data mining to assist researchers in decision making.
Keywords
Data mining, pharmacovigilance, Bayesian classification
1. INTRODUCTION
Medicines must be evaluated in terms of the harm they can do to the human body. Harm can be short term or long term. Before being introduced to the market, every drug is tested, but on a comparatively small number of people. In the wider population, a drug may cause reactions that were not detected during testing. Adverse Drug Effects (ADEs), also referred to as Adverse Drug Reactions (ADRs), are harmful responses to a medicine that is used. Every patient is a unique medicine user, with a different lifestyle and circumstances, whose body will react in a different way.
Pharmacovigilance is a tool or science which can be used to evaluate and improve the safety of medicines [1]. It is a collection of activities conducted to detect, assess, understand, monitor or prevent Adverse Drug Reactions (ADRs) [2].
The main question that arises in pharmacovigilance is why the adverse reactions of drugs need to be monitored at all, given that every drug goes through adequate study before being introduced to the market for commercial purposes. The answer is simple: the highest priority is given to human health, so monitoring adverse effects remains necessary to keep people safe and to make the drug safer even after adequate testing [3]. The process of pharmacovigilance involves risk analysis and risk management.
The process is illustrated in Figure 1. Risk analysis is the phase which involves identification, quantification and assessment of drug reactions. In identification, a survey is held to collect data and identify the actual cause of the reaction. If the cause really is the drug taken, the quantification stage follows, in which more samples are considered. The last stage of risk analysis is assessment, in which the collected and identified samples are assessed to evaluate how risky the drug is to the patient. Once the risk is understood through risk analysis, the second main phase of the pharmacovigilance process, risk management, is carried out. In this phase, the identified risks are managed and remedies are taken to avoid the risk or reduce its intensity. The risks are measured and evaluated at the administrative level, and their significance is communicated to all levels. Based on the risks, the administration forms strategies to prevent them. As the risks concern human health, the strategies are developed to reduce or eliminate the risk as far as possible.
[Figure 1: Risk Analysis (Identification, Quantification, Assessment) and Risk Management (Administrative, Communicating, Prevention)]
Fig 1: Initial phases of pharmacovigilance
The aims of pharmacovigilance are:
• Revealing the adverse effects of existing drugs
• Discovering unexpected effects of newer drugs
• Recognizing the risk factors associated with the development of adverse drug reactions
• Quantitative estimation of the risk factors, such as:
o How many hospital admissions are due to ADRs?
o What is the mortality ratio of ADRs?
Data mining is the tool which assists pharmacovigilance in reducing, eliminating or understanding the risk factors of ADRs [4-6]. Data mining is discussed in the succeeding section of the paper.
2. PRACTICAL ASPECT OF DATA MINING
Due to the development of industries and technologies, industries produce very large amounts of data. This data must be managed before it can be utilized. Based on it, the decision maker can make business decisions. But data in its available form is tough to manage and analyze, so it is necessary to abstract it. Data mining can be used as a tool to discover the patterns available in the data and the facts hidden behind the data [7]. Data mining is a combination of data, database management and data visualization. Its purpose is to extract knowledge, and it is particularly useful when the data set is very large. Two points are to be considered. First, once the data mining task is stated precisely, i.e. it is decided on which kind of data the mining is to be done, even a large data set becomes small, because only part of the data is of interest from the viewpoint of that task. Second, in a very large database, a sample is sufficient for an accurate model [8].
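The second point, that a sample of a very large database can be enough, can be illustrated with a minimal Python sketch. The population below is synthetic and purely hypothetical (100,000 reports with an assumed adverse-reaction rate of about 10%); none of these numbers come from the paper.

```python
import random


def adr_rate(reports):
    """Fraction of reports that flag an adverse reaction."""
    return sum(reports) / len(reports)


# Fixed seed so the sketch is repeatable.
rng = random.Random(0)

# Hypothetical population: True means the report flags an adverse reaction.
population = [rng.random() < 0.10 for _ in range(100_000)]

# Draw a simple random sample of 2,000 reports without replacement.
sample = rng.sample(population, 2_000)

FULL_RATE = adr_rate(population)
SAMPLE_RATE = adr_rate(sample)

if __name__ == "__main__":
    print(f"rate in full data: {FULL_RATE:.3f}")
    print(f"rate in sample:    {SAMPLE_RATE:.3f}")
```

The two rates should be close, which is the sense in which a sample can stand in for the full database when estimating a simple quantity. This only holds if the sample is drawn without selection bias.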
To mine the data means to extract useful information from it. At first sight, this task does not seem to require any kind of expertise. In practice, however, making data mining effective and extracting knowledge from a large data set is not easy. Expertise is required mainly in three areas: the subject, the data, and the analysis. A subject expert can decide which kinds of questions can be answered from the analysis. A data expert is able to decide from where the data is to be collected. An analysis expert requires strong judgment based on statistics, along with consideration of selection bias [9]. Data mining is a repetitive process whose result is knowledge. Figure 2 illustrates the process. The main stages of data mining involve six major activities.
2.1 Problem definition
In this stage, the problem is identified. To identify the problem means to judge which kind of knowledge or information will be the output when the process completes. It is advisable to decide in advance how the produced outcome will be used. The outputs can be categorized into three [10], as under.
1. The result can be used for descriptive purposes, i.e. the resultant data may be used to describe any segment or group of the whole data set.
2. The discovered facts that were hidden behind the data may be used to predict situations outside the database.
3. The result can be directly involved in the system being developed.
Fig 2: Main stages of process of data mining
There are certainly possibilities of bias in the data. In spite of this possibility, what has to be considered is the extent to which the data is related to the question to be solved by the data mining. Biased data produces wrong conclusions. When a correlation is found between the produced data and the requirement, it cannot be justified by data analysis alone; it also requires knowledge of the domain.
2.2 Achieve required information
Although analysis and conclusions are based on databases, the purpose of creating a database is to support business decision making and processes. The analysis can suggest the statistical design. To identify the actual pattern in the dataset, it is necessary to understand the possible biases. If a bias occurs frequently on the same dataset and for the same goal, it may lead away from the actual knowledge and ultimately affect the decision. After the problem is defined, the necessary information is to be acquired. While acquiring it, it must be kept in mind that whatever information is acquired must be related or targeted to the final result. If information is acquired that is not concerned with the final output, the effort is wasted and collection has to be done again. This is illustrated in Figure 3.
[Figure 2 stages: Problem definition, Achieve required info, Selection of data, Pre-processing of data, Interpretation, Use]
[Figure 3 labels: Information; Biased and not useful; Unbiased and useful; Desired information]
Fig 3: Collection of information
Therefore, in the second stage of the data mining process, it is very important to acquire unbiased, useful and timely information.
2.3 Selection of data
A very important stage in the data mining process is the selection of data. Once the required information is acquired, selecting the related data out of it is a significant job that demands responsibility. The selected data should reflect the data mining belief of letting the data speak for itself: selection should be free from criteria set before looking at the data. To select appropriate data, different sources such as data warehouses and data marts are also to be considered. Analysts generally like to use a data warehouse or data mart to select data accurately and relevantly [11]; such sources help the analyst filter the relevant data. Before integrating more than one source, it is advisable to take preventive steps to avoid conflicts and inconsistencies in the data; otherwise integration may become a time-consuming process. Thus, considering the risk and time factors, the selection of data is to be done very precisely.
2.4 Pre-processing of data
At the fourth stage of data mining, we already have some data on hand, selected out of the identified data. Before analysis takes place, this data has to be processed, which is why the stage is called pre-processing. More than one data mining function can be used for the same type of data. If there is more than one model, each model should be assessed against the data by experts. Deriving new attributes from the existing ones is also an important task carried out during this stage.
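Deriving new attributes can be sketched in a few lines of Python. The field names and the common/severe grouping below are assumptions modeled on the side-effect columns of Table 1, and the example report is invented for illustration; the paper does not prescribe any particular derived attributes.

```python
# Hypothetical grouping of side effects, modeled on the column groups of Table 1.
COMMON = ("constipation", "itching", "vomiting")
SEVERE = ("diabetes_fluctuation", "chest_pain", "blurred_vision")


def derive_attributes(report):
    """Return a copy of one patient report with three derived attributes added."""
    derived = dict(report)
    # Count how many effects of each group the patient reported.
    derived["n_common"] = sum(bool(report[name]) for name in COMMON)
    derived["n_severe"] = sum(bool(report[name]) for name in SEVERE)
    # Flag patients who reported at least one severe effect.
    derived["any_severe"] = derived["n_severe"] > 0
    return derived


# Example report (values chosen for illustration only).
REPORT = {
    "constipation": True, "itching": False, "vomiting": True,
    "diabetes_fluctuation": False, "chest_pain": True, "blurred_vision": False,
}
ENRICHED = derive_attributes(REPORT)

if __name__ == "__main__":
    print(ENRICHED)
```

The original report is left unchanged, so the derived attributes can be recomputed or dropped if experts judge them unhelpful for a given model.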
2.5 Interpretation
Analysis and evaluation of the data take place in this phase. It is a really significant phase because interpreting data demands expertise in analysis. If the data is interpreted wrongly, it may lead to a wrong business decision or conclusion. Analyzing data in the correct direction needs all three kinds of expertise discussed earlier. Knowledge of the domain being mined is required to interpret the result correctly. Data expertise is required to understand the patterns discovered during the process. Data mining expertise is applied to the technical interpretation of results. Further data mining questions can be raised for sub-regions of the data and attributes where the average of the target variable is smaller (or larger) than its average over the whole data set [12]. That is, we have to verify whether the model or processed data achieves the business objective and whether all business issues have been considered.
2.6 Use
The experts use the results produced during the process of data mining. Data which is selected, related to the domain and interpreted correctly will be used in the database domain. The results are stored and can be used at any stage, either as input for a further process or integrated directly into an application. The result (output) of one process can be the raw material (input) for another.
Thus, looking at the data mining process, we can say that by passing the data through 6 main stages, at the end of the cycle we have meaningful and useful data which can assist the analyst in changing the strategy of the product. The techniques which can be implemented in pharmacovigilance are discussed in the next section.
3. BAYESIAN CLASSIFICATION - IMPLEMENTATION
For pharmacovigilance, any one of the many available data mining techniques can be used. Classification and prediction are two techniques of data analysis that can be used to describe significant and useful data or to predict future requirements. Many classification and prediction methods have been proposed by researchers in machine learning, pattern recognition and statistics. The Bayesian classifier is one of the most efficient techniques used for classification.
Suppose an officer in the pharmaceutical research industry wants to analyze ADE data for one particular drug, say gatifloxacin, to decide whether to change the dosage or to withdraw it from the market. The decision can be taken by classifying the ADEs of gatifloxacin into two categories: common side effects and severe side effects. The data available from patients, clinical experts and pharmacists is categorized, and a classifier is constructed to predict categorical labels such as common or severe. Classification and prediction methods can be compared by factors like accuracy, speed, robustness, scalability and interpretability. Bayesian classification is a well-known approach for data classification.
Bayesian methods provide practical learning algorithms which are very useful in real-life applications. They combine prior knowledge with recently observed data. This is a model-based approach which offers a useful conceptual framework. As the Bayesian approach follows a probabilistic model specification, any sequences or objects can be classified. The Bayesian classifier is comparable with other techniques such as decision trees. Bayesian classifiers show high accuracy and speed when applied to large databases [13]. The naïve Bayesian classifier assumes that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is known as class conditional independence. Bayesian belief networks, in contrast, allow the representation of dependencies among subsets of attributes. Bayes theorem is presented in the following subsection.
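Class conditional independence can be written compactly. In the sketch below, x_1, ..., x_n denote the attribute values of one record and h denotes a class; this notation is an assumption chosen to match the h and d used in Section 3.1, with d = (x_1, ..., x_n).

```latex
% Naive Bayes assumption: attributes are independent given the class h
P(x_1, \dots, x_n \mid h) = \prod_{i=1}^{n} P(x_i \mid h)

% Resulting classification rule: pick the class with the largest score
P(h \mid x_1, \dots, x_n) \propto P(h) \prod_{i=1}^{n} P(x_i \mid h)
```

Under this assumption, each factor P(x_i | h) can be estimated separately from simple counts, which is what makes the naïve classifier fast on large databases.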
3.1 Bayes Rule
Bayes theorem is named after Thomas Bayes. It is a theorem with two different interpretations.
1. Bayesian interpretation
2. Frequentist interpretation
The first expresses how a subjective degree of belief should rationally change to account for evidence. The second relates inverse representations of the probabilities. In the Bayesian interpretation, Bayes theorem is based on statistics and can be applied to various fields like science, engineering, microeconomics, game theory, medicine and law.
Table 1: To keep the drug in market or to withdraw
Bayes rule relates the probability of a hypothesis after seeing the data to its probability before seeing the data. In the Bayes rule:
d = data
h = hypothesis (model)
Rearranging,
$P(h \mid d)\,P(d) = P(d \mid h)\,P(h)$
$P(d, h) = P(d, h)$
i.e. the joint probability appears on both sides.
The terms in Bayes rule are:
P(h): probability of hypothesis h before seeing any data
P(d|h): probability of the data if hypothesis h is true
P(d): marginal probability of the data
P(h|d): probability of hypothesis h after seeing the data
The Bayesian method can be illustrated by an example. Let us assume that there is a drug, say A. After its introduction to the market, some unexpected and harmful results emerge. In this situation, data on side effects is collected from health experts, patients and pharmacists. The data collected is shown in Table 1. The goal is to read the probabilities of the outcomes from the table and compute with them according to Bayes rule. Based on the computed result, the medical authorities decide whether to continue the drug or to declare it harmful and stop its usage. From the last column of the table, the probability of the drug being withdrawn from the market is higher than that of it being kept in the market:
P(w) = 7/10 = 0.7
P(u) = 3/10 = 0.3
where w = withdrawing the drug and u = using the drug.
Looking at the result, we conclude that it is preferable to withdraw the drug from the market, as 7 out of 10 consumers or patients face danger to their health.
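The two prior probabilities can be reproduced from Table 1 with a short Python sketch. The paper itself stops at the priors; the `posterior` function goes one step further and applies a naïve Bayes classifier with add-one (Laplace) smoothing to a hypothetical new patient. The smoothing, the function names and the new patient are all assumptions made for illustration, not part of the paper.

```python
from fractions import Fraction

# Table 1, one string per patient: six side-effect flags in column order
# (constipation, itching, vomiting, diabetes fluctuation, chest pain,
# blurred vision) followed by the "Should the drug be used?" column.
TABLE_1 = [
    "Y N N N N N Y",
    "Y Y N Y N N N",
    "Y Y Y N N N Y",
    "N N N N Y Y N",
    "N Y N N N Y N",
    "Y N Y Y Y N N",
    "N Y N N Y N N",
    "N N Y Y N N N",
    "N N Y N N N Y",
    "Y Y Y Y Y Y N",
]


def parse(rows):
    """Turn each row into (list of six booleans, 'use' or 'withdraw')."""
    parsed = []
    for row in rows:
        flags = [cell == "Y" for cell in row.split()]
        label = "use" if flags[-1] else "withdraw"
        parsed.append((flags[:-1], label))
    return parsed


def priors(data):
    """P(h) for each class h, estimated as a relative frequency."""
    counts = {}
    for _, label in data:
        counts[label] = counts.get(label, 0) + 1
    return {label: Fraction(n, len(data)) for label, n in counts.items()}


def posterior(data, patient):
    """P(h | patient) under naive Bayes with add-one smoothing (an assumption)."""
    scores = {}
    for label, prior in priors(data).items():
        rows = [flags for flags, lab in data if lab == label]
        score = prior
        for i, value in enumerate(patient):
            matches = sum(1 for flags in rows if flags[i] == value)
            score *= Fraction(matches + 1, len(rows) + 2)
        scores[label] = score
    total = sum(scores.values())  # plays the role of P(d) in Bayes rule
    return {label: score / total for label, score in scores.items()}


DATA = parse(TABLE_1)
PRIORS = priors(DATA)
# Hypothetical new patient reporting only chest pain and blurred vision.
NEW_PATIENT = [False, False, False, False, True, True]
POSTERIOR = posterior(DATA, NEW_PATIENT)

if __name__ == "__main__":
    print("P(w) =", float(PRIORS["withdraw"]), " P(u) =", float(PRIORS["use"]))
    print("posterior for new patient:",
          {label: round(float(p), 3) for label, p in POSTERIOR.items()})
```

Exact fractions are used so that the priors match the hand calculation above without rounding. With only ten rows, any posterior from this table is a rough illustration rather than evidence for a regulatory decision.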
4. CONCLUSION
The pharmaceutical industry is directly related to human life. It produces disadvantages as much as it produces benefits. Any drug may react adversely, varying from person to person. Pharmacovigilance experts devote extensive effort to post-marketing observation of adverse drug reactions. Based on the observations and their data, they use data mining techniques to find the hidden facts and take decisions accordingly. The Bayesian rule is comparatively efficient, fast and useful for pharmacovigilance.
5. REFERENCES
[1] haiweb.org/19072009/19July2009FactsheetTheEuropeanCommission'sProposalforaPharmacovigilanceDirective.pdf
[2] who.int/medicines/areas/quality_safety/safety_efficacy/S.AfricaDraftGuidelines.pdf
[3] Dhikav, V., Singh, S. 2004 Adverse drug reactions monitoring in India, Journal, Indian Academy of Clinical Medicine vol. 5, no. 1, 28-33
[4] Bates, DW., Spell, N., Cullen, DJ et al. 1997 The costs of adverse drug reactions in hospitalised patients. JAMA 277: 301-07.
[5] Leape, LL. 1994 Errors in medicine. JAMA 272: 1851-7.
[6] Bates, DW., Cullen, DJ., Laird, N et al. 1997 Incidence of adverse drug events and potential adverse drug events in hospitalized patients. JAMA 277: 307-11.
[7] María, S. P., Alberto, S., Víctor, R., Pilar, H., Jose M, P. 2007 Design and implementation of a data mining grid-aware architecture, Future Generation Computer Systems 23, 42-47
[8] Feelders, A., Daniels, H., Holsheimer, M. 2000 Briefings: Methodological and practical aspects of data mining, Information & Management 37, 271-281
[9] Hand, D. J., 1998, Data mining: statistics and more? The American Statistician 52 (2), pp. 112-118.
[10] Glymour, C., Madigan, D., Pregibon, D., Smyth, P., 1997 Statistical themes and lessons for data mining, Data Mining and Knowledge Discovery 1, pp. 11-28.
[11] Subramanian, A., Smith, L. D., Nelson, A. C., Campbell, J.F., Bird, D. A., 1997 Strategic planning for data warehousing, Information and Management 33, pp. 99-113.
[12] Friedman, J.H., Fisher, N. I., 1999 Bump hunting in high-dimensional data, Statistics and Computing 9 (2), pp. 123-143.
[13] Jiawei, H., Micheline, K., 2011 Data mining concepts and techniques,
Bayes rule:
$$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{P(d)}$$
Table 1 data:
        | Common adverse effects            | Severe adverse effects                             |
Patient | Constipation | Itching | Vomiting | Diabetes fluctuation | Chest pain | Blurred vision | Should the drug be used?
1       | YES          | NO      | NO       | NO                   | NO         | NO             | YES
2       | YES          | YES     | NO       | YES                  | NO         | NO             | NO
3       | YES          | YES     | YES      | NO                   | NO         | NO             | YES
4       | NO           | NO      | NO       | NO                   | YES        | YES            | NO
5       | NO           | YES     | NO       | NO                   | NO         | YES            | NO
6       | YES          | NO      | YES      | YES                  | YES        | NO             | NO
7       | NO           | YES     | NO       | NO                   | YES        | NO             | NO
8       | NO           | NO      | YES      | YES                  | NO         | NO             | NO
9       | NO           | NO      | YES      | NO                   | NO         | NO             | YES
10      | YES          | YES     | YES      | YES                  | YES        | YES            | NO