Data Mining in Pharmacovigilance – to Reduce ADRs


  • 8/12/2019 Data Mining in Pharmacovigilance to Reduce ADRs


    www.ijcait.com International Journal of Computer Applications & Information Technology

    Vol. II, Issue I, January 2013 (ISSN: 2278-7720)


    Data mining in pharmacovigilance to reduce Adverse Drug Effects (ADRs)

    Ms. Miral Kothari
    Gujarat Technological University
    AITS, Yogidham Gurukul, Kalawad Road, Rajkot.

    Ms. Priti Sadaria
    Saurashtra University
    Virani Science College, Yogidham Gurukul, Kalawad Road, Rajkot.

    Ms. Nehal Dave
    Saurashtra University
    Virani Science College, Yogidham Gurukul, Kalawad Road, Rajkot.

    ABSTRACT

    The pharmaceutical industry provides medicines in different formats: tablets, capsules, liquids or injectables. Any drug, in any form, may cause adverse effects that vary from person to person. Before a drug is put on the market, it is tested for adverse effects on a large scale. Pharmacovigilance is a science concerned with the discovery, understanding and anticipation of Adverse Drug Effects (ADEs). Pharmaceutical experts and industries rely heavily on data mining algorithms and techniques to understand the huge volume of data collected from healthcare professionals and patients, and to use that data for further research and the development of new drugs. In this paper, the authors attempt to apply the Bayesian classification method of data mining to assist researchers in decision making.

    Keywords

    Data mining, pharmacovigilance, Bayesian classification

    1. INTRODUCTION

    Medicines must be evaluated in terms of the harm they may do to the human body. Harm can be short term or long term. Before being introduced to the market, every drug or medicine is tested, but on a comparatively small number of people. In the wider population, the drug may cause reactions in the human body that were not detected during testing. Adverse Drug Effects (ADEs), also referred to as Adverse Drug Reactions (ADRs), are the responses to the medicine that is used. Every patient is a unique medicine user, with a different lifestyle and circumstances, whose body will react in a different way.

    Pharmacovigilance is a tool or science that can be used to evaluate and improve the safety of medicines [1]. It is a collection of activities conducted to detect, assess, understand, monitor or prevent Adverse Drug Reactions (ADRs) [2]. The main question that arises in pharmacovigilance is: why monitor the adverse reactions of drugs? The question matters because every drug goes through adequate study before being introduced to the market for commercial purposes. The answer is simple: human health is given the highest priority, so monitoring adverse effects remains necessary even after adequate testing, to keep people safe and to make the drug safer [3]. The process of pharmacovigilance involves risk analysis and risk management.

    The process is illustrated in Figure 1. Risk analysis is the phase that involves identification, quantification and assessment of drug reactions. In identification, a survey is held to collect data and identify the actual cause of the reaction. If the cause really is the drug taken, the quantification stage follows, in which more samples are considered. The last stage of risk analysis is assessment, in which the collected and identified samples are assessed to evaluate how risky the drug is to the patient. Once the risk is understood through risk analysis, the second main phase of the pharmacovigilance process, risk management, is carried out. In this phase the identified risks are managed, and remedies are taken to avoid the risk or to reduce its intensity. The risks are measured and evaluated at the administrative level, and their significance is communicated to all levels. Based on the risks, the administration forms strategies to prevent them. As the risks concern human health, the strategies are developed to reduce or eliminate the risk as far as possible.

    Fig 1: Initial phases of pharmacovigilance
    [Figure: Risk Analysis (Identification, Quantification, Assessment); Risk Management (Administrative, Communicating, Prevention)]

    The aims of pharmacovigilance are:

    • Revealing the adverse effects of existing drugs
    • Discovering unexpected effects of newer drugs
    • Recognizing the risk factors associated with the development of adverse drug reactions
    • Quantitative estimation of the risk factors, such as:


        o How many hospital admissions are due to ADRs?
        o What is the mortality ratio of ADRs?

    Data mining is a tool that can assist pharmacovigilance in understanding, reducing or eliminating the risk factors of ADRs [4-6]. Data mining is discussed in the next section of the paper.
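
    The two quantitative estimates named among the aims (the share of hospital admissions due to ADRs and the mortality ratio of ADRs) reduce to simple proportions once counts are available. The sketch below is illustrative only: the function names and all counts are hypothetical and are not taken from the paper or from any real data set, and "mortality ratio" is assumed here to mean deaths per ADR case.

```python
def adr_admission_share(adr_admissions: int, total_admissions: int) -> float:
    """Fraction of hospital admissions attributed to ADRs."""
    if total_admissions <= 0:
        raise ValueError("total_admissions must be positive")
    return adr_admissions / total_admissions


def adr_mortality_ratio(adr_deaths: int, adr_cases: int) -> float:
    """Fraction of ADR cases that ended in death (assumed definition)."""
    if adr_cases <= 0:
        raise ValueError("adr_cases must be positive")
    return adr_deaths / adr_cases


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
share = adr_admission_share(adr_admissions=50, total_admissions=1000)
mortality = adr_mortality_ratio(adr_deaths=2, adr_cases=50)
print(f"Admissions due to ADRs: {share:.1%}")
print(f"ADR mortality ratio: {mortality:.1%}")
```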

    2. PRACTICAL ASPECT OF DATA MINING

    With the development of industries and technologies, industries produce very large amounts of data. This data must be managed before it can be used. Decision makers can base business decisions on it, but data in its available form is tough to manage and analyze, so it is necessary to abstract it. Data mining can be used as a tool to discover the patterns present in the data and the facts hidden behind it [7]. Data mining is a combination of data, database management and data visualization. Its purpose is to extract knowledge, and it is particularly useful when the data set is very large. Two points should be considered. First, once the data mining task is stated precisely, i.e. once it is decided which kind of data is to be mined, a large data set becomes small, because only part of it is of interest from the viewpoint of the task. Second, in a very large database, a sample is sufficient to build an accurate model [8].
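
    The second point, that a sample of a very large database can be enough to fit a model, can be sketched with the Python standard library. The record layout, the sample size and the fixed seed below are assumptions made for illustration; whether a given sample is large enough for an accurate model still has to be checked for the problem at hand.

```python
import random


def draw_sample(records, sample_size, seed=0):
    """Return a simple random sample (without replacement) of the records."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(records, sample_size)


# Hypothetical "large" data set: 100,000 reports, 30% flagged as severe.
reports = [{"id": i, "severe": i % 10 < 3} for i in range(100_000)]

sample = draw_sample(reports, sample_size=2_000)
full_rate = sum(r["severe"] for r in reports) / len(reports)
sample_rate = sum(r["severe"] for r in sample) / len(sample)
print(f"severe rate, full data: {full_rate:.3f}")
print(f"severe rate, sample:    {sample_rate:.3f}")
```

    The sample rate will differ slightly from the full-data rate; the gap shrinks as the sample grows.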

    To mine data means to extract useful information from it. At first sight, this task does not seem to require any kind of expertise. In practice, however, it is not easy to make data mining effective and to extract knowledge from a large data set. Expertise is required mainly in three areas: the subject, the data and the analysis. A subject expert can decide which kinds of questions can be answered by the analysis. A data expert can decide where the data is to be collected from. An analysis expert needs strong judgment grounded in statistics, including awareness of selection bias [9]. Data mining is a repetitive process whose result is knowledge. Figure 2 illustrates the process of data mining, which involves six main stages.

    2.1 Problem definition

    In this stage, the problem is identified. Identifying the problem means judging which kind of knowledge or information will be the output once the process is complete. It is advisable to decide in advance how the outcome will be used. The outputs can be categorized into three kinds [10]:

    1. The result can be used for descriptive purposes, i.e. the resulting data may be used to describe any segment or group of the whole data set.
    2. The discovered facts that were hidden behind the data may be used to predict situations outside the database.
    3. The result can be directly incorporated into the system being developed.

    Fig 2: Main stages of process of data mining

    There is no doubt that the data may contain biases. Despite this possibility, the key consideration is the extent to which the data is related to the question that the data mining is meant to answer. Biased data produces wrong conclusions. When a correlation is found between the produced data and the requirement, it cannot be justified by data analysis alone; it also requires knowledge of the domain.

    2.2 Achieve required information

    Although analysis and conclusions are based on databases, the purpose of creating a database is to support business decision making and processes. The analysis can suggest the statistical design. To identify the actual pattern in the data set, it is necessary to understand the possible biases. If a bias occurs frequently on the same data set and for the same goal, it may lead the analysis away from the actual knowledge and ultimately affect the decision. After the problem is defined, the necessary information is to be acquired. While acquiring it, keep in mind that whatever information is acquired must be related or targeted to the final result. If information is acquired that is not relevant to the final output, the effort is wasted and the collection has to be repeated. This is illustrated in Figure 3.

    [Figure 2 stages: Problem definition; Achieve required info; Selection of data; Pre-processing of data; Interpretation; Use]

    [Figure 3 labels: Information; Biased and not useful; Unbiased and useful; Desired information]


    Fig 3: Collection of information

    Therefore, in the second stage of the data mining process, it is very important to acquire unbiased, useful and timely information.

    2.3 Selection of data

    A very important stage in the data mining process is the selection of data. Once the required information is acquired, selecting the related data out of it is a significant job, and selecting appropriate and relevant data calls for responsibility. The selected data should reflect the data mining belief of "letting the data speak for itself". Selection should be free from pre-defined criteria set prior to looking at the data. To select appropriate data, data from different sources, such as data warehouses and data marts, should also be considered. Analysts generally like to use a data warehouse or data mart to select data accurately and relevantly [11]; such sources help the analyst filter the relevant data. It is advisable to take preventive steps against conflicts and inconsistencies in the data before integrating more than one source, otherwise integration may become a time-consuming process. Given these risk and time factors, the selection of data is to be done very precisely.
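
    The advice to check for conflicts before integrating more than one source can be sketched as follows. The two sources, their field names and their records are hypothetical; the point is only that records sharing a key are compared field by field before they are merged.

```python
def find_conflicts(source_a, source_b, key="patient_id"):
    """Return (key value, field, value in A, value in B) for every disagreement."""
    index_b = {row[key]: row for row in source_b}
    conflicts = []
    for row_a in source_a:
        row_b = index_b.get(row_a[key])
        if row_b is None:
            continue  # record exists only in source A, nothing to compare
        for field in row_a.keys() & row_b.keys():
            if field != key and row_a[field] != row_b[field]:
                conflicts.append((row_a[key], field, row_a[field], row_b[field]))
    return sorted(conflicts)


# Hypothetical extracts from a data warehouse and a data mart.
warehouse = [
    {"patient_id": 1, "itching": "NO", "vomiting": "NO"},
    {"patient_id": 2, "itching": "YES", "vomiting": "NO"},
]
data_mart = [
    {"patient_id": 2, "itching": "NO", "vomiting": "NO"},
    {"patient_id": 3, "itching": "YES", "vomiting": "YES"},
]

conflicts = find_conflicts(warehouse, data_mart)
print(conflicts)
```

    Here the two sources disagree on one field for patient 2, which would need to be resolved before the records are integrated.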

    2.4 Pre-processing of data

    At the fourth stage of data mining, we already have some data on hand, selected out of the identified data. The data was selected from the required information, but before analysis takes place it has to be processed, which is why this stage is called pre-processing. Experts say that more than one data mining function can be used for the same type of data. If there is more than one model, each model should be assessed against the data by experts. Deriving new attributes from the existing ones is also one of the important tasks carried out during the data mining process.
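
    Deriving new attributes from existing ones can be as simple as counting. The sketch below takes a row shaped like those of Table 1 and adds two derived attributes: the number of common and the number of severe adverse effects reported. The split of the six effects into three "common" and three "severe" columns, the attribute names and the helper function are assumptions made for illustration.

```python
# Assumed grouping of the Table 1 columns.
COMMON = ("constipation", "itching", "vomiting")
SEVERE = ("diabetes_fluctuation", "chest_pain", "blurred_vision")


def add_derived_attributes(row):
    """Return a copy of row with counts of common and severe effects added."""
    derived = dict(row)
    derived["n_common"] = sum(row[name] == "YES" for name in COMMON)
    derived["n_severe"] = sum(row[name] == "YES" for name in SEVERE)
    return derived


# Patient 6 from Table 1.
patient_6 = {
    "constipation": "YES", "itching": "NO", "vomiting": "YES",
    "diabetes_fluctuation": "YES", "chest_pain": "YES", "blurred_vision": "NO",
}
enriched = add_derived_attributes(patient_6)
print(enriched["n_common"], enriched["n_severe"])
```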

    2.5 Interpretation

    Analysis and evaluation of the data take place in this phase. It is a significant phase for data mining because interpreting data calls for expertise in analysis. If the data is interpreted wrongly, it may lead to a wrong business decision or conclusion. Analyzing data in the right direction needs all three kinds of expertise discussed earlier. Knowledge of the domain being mined is required to interpret the results correctly. Data expertise is required to understand the patterns discovered during the process. Data mining expertise is needed for the technical interpretation of results. Further data mining questions can be raised for sub-regions of the data and attributes where the average of the target variable is considerably larger or smaller than its average over the whole data set [12]. In other words, we have to verify whether the model or processed data achieves the business objective and whether all business issues have been considered.
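
    The idea of looking for sub-regions where the average of the target variable departs from the overall average can be sketched with a single attribute. This is not the bump hunting algorithm of [12], only a one-attribute comparison; the attribute name, the age bands and the rows are hypothetical.

```python
from statistics import mean


def subgroup_means(rows, attribute, target):
    """Mean of the target within each value of the attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    return {value: mean(values) for value, values in groups.items()}


# Hypothetical rows: the target is 1 when a severe reaction was reported.
rows = [
    {"age_band": "under 60", "severe": 0},
    {"age_band": "under 60", "severe": 0},
    {"age_band": "under 60", "severe": 1},
    {"age_band": "under 60", "severe": 0},
    {"age_band": "60 and over", "severe": 1},
    {"age_band": "60 and over", "severe": 1},
    {"age_band": "60 and over", "severe": 0},
    {"age_band": "60 and over", "severe": 1},
]

overall = mean(row["severe"] for row in rows)
by_band = subgroup_means(rows, "age_band", "severe")
# Sub-regions whose average exceeds the overall average merit a closer look.
flagged = {band: m for band, m in by_band.items() if m > overall}
print(overall, by_band, flagged)
```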

    2.6 Use

    Experts use the results produced during the data mining process. Data that has been selected, relates to the domain and has been interpreted correctly is put to use in that domain. The results are stored and can be used at any stage, either as input for a further process or integrated directly into an application. The result (output) of one process can be the raw material (input) for another.

    Thus, by passing the data through the six main stages, at the end of the cycle we obtain meaningful and useful data that can assist the analyst in changing the strategy of the product. The techniques that can be applied in pharmacovigilance are discussed in the next section.

    3. BAYESIAN CLASSIFICATION - IMPLEMENTATION

    For pharmacovigilance, any one of the many available data mining techniques can be used. Classification and prediction are two techniques of data analysis that can be used to describe significant and useful data or to predict future requirements. Many classification and prediction methods have been proposed by researchers in machine learning, pattern recognition and statistics.

    The Bayesian classifier is one of the most efficient techniques used for classification. Suppose an officer in the pharmaceutical research industry wants to analyze ADE data for one particular drug, say gatifloxacin, to decide whether to change the dosage or to withdraw the drug from the market. The decision can be taken by classifying the ADEs of gatifloxacin into two categories: common side effects and severe side effects. The data available from patients, clinical experts and pharmacists is categorized, and a classifier is constructed to predict categorical labels such as "common" or "severe". Classification and prediction methods can be compared on factors like accuracy, speed, robustness, scalability and interpretability.

    Bayesian classification is a well-known approach to data classification. Bayesian methods provide practical learning algorithms, and they combine previously obtained knowledge with recently observed data. It is a model-based approach that offers a useful conceptual framework. As Bayesian methods follow a probabilistic model specification, any sequences or objects can be classified. The Bayesian classifier is comparable with other techniques such as decision trees, and Bayesian classifiers have shown high accuracy and speed when applied to large databases [13]. The naive Bayesian classifier assumes that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is known as class conditional independence. Bayesian belief networks, in contrast, allow the representation of dependencies among subsets of attributes. Bayes' theorem follows.
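
    Before turning to the theorem itself, the class conditional independence assumption can be written out in symbols. The notation is ours, not the paper's: x_1, ..., x_n are the attribute values of one record and C is a class label. Under the assumption, the joint likelihood factorizes into one term per attribute:

```latex
P(x_1, \dots, x_n \mid C) = \prod_{k=1}^{n} P(x_k \mid C)
```

    Each factor can then be estimated separately from the data, which is what keeps the naive Bayesian classifier simple and fast.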

    3.1 Bayes Rule

    Bayes' theorem is named after Thomas Bayes. It is a theorem with two different interpretations:

    1. Bayesian interpretation
    2. Frequentist interpretation

    The first expresses how a subjective degree of belief should rationally change to account for evidence. The second relates inverse representations of the probabilities concerning two events. In the Bayesian interpretation, Bayes' theorem underlies Bayesian statistics and can be applied in fields such as science, engineering, microeconomics, game theory, medicine and law.


    Table 1: To keep the drug in market or to withdraw

    The formula provided by Bayes, known as Bayes' rule, gives the probability of a hypothesis given the data as the probability of the data under the hypothesis, multiplied by the prior probability of the hypothesis and divided by the marginal probability of the data.

    In the Bayes rule:

    d = data
    h = hypothesis (model)

    Rearranging (multiplying both sides by P(d)):

    P(h|d) P(d) = P(d|h) P(h)
    P(d, h) = P(d, h)

    That is, both sides are the same joint probability.

    The terms in Bayes' rule mean the following:

    P(h): probability of hypothesis h before seeing any data
    P(d|h): probability of the data if hypothesis h is true
    P(d): marginal probability of the data
    P(h|d): probability of hypothesis h after seeing the data

    The Bayesian method can be illustrated more clearly by an example. Let us assume that there is a drug, say A.

    After the drug is introduced to the market, some unexpected and harmful effects are reported. In this situation, data on the side effects is collected from health experts, patients and pharmacists. The collected data is shown in Table 1. The goal is to read the probability of the outcomes from the table and compute it according to Bayes' rule. Based on the computed result, the medical authorities decide whether to continue the drug or to declare it harmful and stop its usage. From the table, we can estimate that the probability of the drug being withdrawn from the market is higher than the probability of it being kept:

    P(w) = 7/10 = 0.7
    P(u) = 3/10 = 0.3

    where w = withdraw the drug and u = use the drug.

    Looking at the result, we conclude that it is preferable to withdraw the drug from the market, as 7 out of 10 consumers or patients face danger to their health.
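
    The values P(w) = 0.7 and P(u) = 0.3 are class priors counted from the last column of Table 1. The sketch below recomputes them from the table and then, as one illustrative extension that the paper does not carry out, applies Bayes' rule to a single piece of evidence (vomiting reported). The column names, the helper functions and the choice of evidence are assumptions made for illustration.

```python
from fractions import Fraction

# Table 1: six adverse-effect columns, then "Should the drug be used?".
COLUMNS = ("constipation", "itching", "vomiting",
           "diabetes_fluctuation", "chest_pain", "blurred_vision", "use")
ROWS = """
YES NO NO NO NO NO YES
YES YES NO YES NO NO NO
YES YES YES NO NO NO YES
NO NO NO NO YES YES NO
NO YES NO NO NO YES NO
YES NO YES YES YES NO NO
NO YES NO NO YES NO NO
NO NO YES YES NO NO NO
NO NO YES NO NO NO YES
YES YES YES YES YES YES NO
""".split("\n")
table = [dict(zip(COLUMNS, line.split())) for line in ROWS if line.strip()]


def prob(rows, condition):
    """Fraction of rows that satisfy condition, as an exact Fraction."""
    return Fraction(sum(1 for r in rows if condition(r)), len(rows))


def posterior(rows, hypothesis, evidence):
    """P(h|d) = P(d|h) P(h) / P(d), with every term counted from rows."""
    p_h = prob(rows, hypothesis)
    p_d = prob(rows, evidence)
    p_d_given_h = prob([r for r in rows if hypothesis(r)], evidence)
    return p_d_given_h * p_h / p_d


def withdraw(row):   # hypothesis w
    return row["use"] == "NO"


def keep(row):       # hypothesis u
    return row["use"] == "YES"


def vomiting(row):   # evidence d, chosen here only as an example
    return row["vomiting"] == "YES"


p_w = prob(table, withdraw)
p_u = prob(table, keep)
p_w_given_vomiting = posterior(table, withdraw, vomiting)
print(f"P(w) = {p_w} = {float(p_w)}")
print(f"P(u) = {p_u} = {float(p_u)}")
print(f"P(w | vomiting) = {p_w_given_vomiting}")
```

    With only ten patients these figures are rough estimates; a full naive Bayesian classifier would combine all six attributes in the same way.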

    4. CONCLUSION

    The pharmaceutical industry is directly related to human life. It produces disadvantages as well as benefits: any drug may act adversely, and the reaction varies from person to person. Pharmacovigilance experts devote extensive effort to the post-marketing observation of adverse drug reactions. Based on these observations and the resulting data, they use data mining techniques to find hidden facts and take decisions accordingly. The Bayesian rule is comparatively efficient, fast and useful for pharmacovigilance.

    5. REFERENCES

    [1] haiweb.org/19072009/19July2009FactsheetTheEuropeanCommission'sProposalforaPharmacovigilanceDirective.pdf
    [2] who.int/medicines/areas/quality_safety/safety_efficacy/S.AfricaDraftGuidelines.pdf
    [3] Dhikav, V., Singh, S. 2004 Adverse drug reactions monitoring in India, Journal, Indian Academy of Clinical Medicine vol. 5, no. 1, pp. 28-33.
    [4] Bates, DW., Spell, N., Cullen, DJ. et al. 1997 The costs of adverse drug reactions in hospitalised patients, JAMA 277, pp. 301-07.
    [5] Leape, LL. 1994 Errors in medicine, JAMA 272, pp. 1851-7.
    [6] Bates, DW., Collen, DJ., Laird, N. et al. 1997 Incidence of adverse drug events and potential adverse drug events in hospitalized patients, JAMA 277, pp. 307-11.
    [7] María, S. P., Alberto, S., Víctor, R., Pilar, H., Jose M., P. 2007 Design and implementation of a data mining grid-aware architecture, Future Generation Computer Systems 23, pp. 42-47.
    [8] Feelders, A., Daniels, H., Holsheimer, M. 2000 Briefings: Methodological and practical aspects of data mining, Information & Management 37, pp. 271-281.
    [9] Hand, D. J. 1998 Data mining: statistics and more?, The American Statistician 52 (2), pp. 112-118.
    [10] Glymour, C., Madigan, D., Pregibon, D., Smyth, P. 1997 Statistical themes and lessons for data mining, Data Mining and Knowledge Discovery 1, pp. 11-28.
    [11] Subramanian, A., Smith, L. D., Nelson, A. C., Campbell, J. F., Bird, D. A. 1997 Strategic planning for data warehousing, Information and Management 33, pp. 99-113.
    [12] Friedman, J. H., Fisher, N. I. 1999 Bump hunting in high-dimensional data, Statistics and Computing 9 (2), pp. 123-143.
    [13] Jiawei, H., Micheline, K. 2011 Data mining concepts and techniques.

    Bayes rule:

    $$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{P(d)}$$

    Table 1 (contents)

             Common adverse effects           Severe adverse effects
    Patient  Constipation  Itching  Vomiting  Diabetes fluctuation  Chest pain  Blurred vision  Should the drug be used?
    1        YES           NO       NO        NO                    NO          NO              YES
    2        YES           YES      NO        YES                   NO          NO              NO
    3        YES           YES      YES       NO                    NO          NO              YES
    4        NO            NO       NO        NO                    YES         YES             NO
    5        NO            YES      NO        NO                    NO          YES             NO
    6        YES           NO       YES       YES                   YES         NO              NO
    7        NO            YES      NO        NO                    YES         NO              NO
    8        NO            NO       YES       YES                   NO          NO              NO
    9        NO            NO       YES       NO                    NO          NO              YES
    10       YES           YES      YES       YES                   YES         YES             NO