Aspects of Multivariate Analysis

  • 7/25/2019 Aspects of Multivariate Analysis

    1/4

    ASPECTS OF MULTIVARIATE ANALYSIS

    1.1 Introduction

    Scientific inquiry is an iterative learning process. Objectives pertaining to the explanation of a social or physical phenomenon must be specified and then tested by gathering and analyzing data. In turn, an analysis of the data gathered by experimentation or observation will usually suggest a modified explanation of the phenomenon. Throughout this iterative learning process, variables are often added or deleted from the study. Thus, the complexities of most phenomena require an investigator to collect observations on many different variables. This book is concerned with statistical methods designed to elicit information from these kinds of data sets. Because the data include simultaneous measurements on many variables, this body of methodology is called multivariate analysis.

    The need to understand the relationships between many variables makes multivariate analysis an inherently difficult subject. Often, the human mind is overwhelmed by the sheer bulk of the data. Additionally, more mathematics is required to derive multivariate statistical techniques for making inferences than in a univariate setting. We have chosen to provide explanations based upon algebraic concepts and to avoid the derivations of statistical results that require the calculus of many variables. Our objective is to introduce several useful multivariate techniques in a clear manner, making heavy use of illustrative examples and a minimum of mathematics. Nonetheless, some mathematical sophistication and a desire to think quantitatively will be required.

    Most of our emphasis will be on the analysis of measurements obtained without actively controlling or manipulating any of the variables on which the measurements are made. Only in Chapters 6 and 7 shall we treat a few experimental plans (designs) for generating data that prescribe the active manipulation of important variables. Although the experimental design is ordinarily the most important part of a scientific investigation, it is frequently impossible to control the generation of appropriate data in certain disciplines. (This is true, for example, in business, economics, ecology, geology, and sociology.) You should consult [6] and [7] for detailed accounts of design principles that, fortunately, also apply to multivariate situations.

    It will become increasingly clear that many multivariate methods are based upon an underlying probability model known as the multivariate normal distribution. Other methods are ad hoc in nature and are justified by logical or commonsense arguments. Regardless of their origin, multivariate techniques must, invariably, be implemented on a computer. Recent advances in computer technology have been accompanied by the development of rather sophisticated statistical software packages, making the implementation step easier.

    Multivariate analysis is a "mixed bag." It is difficult to establish a classification scheme for multivariate techniques that is both widely accepted and indicates the appropriateness of the techniques. One classification distinguishes techniques designed to study interdependent relationships from those designed to study dependent relationships. Another classifies techniques according to the number of populations and the number of sets of variables being studied. Chapters in this text are divided into sections according to inference about treatment means, inference about covariance structure, and techniques for sorting or grouping. This should not, however, be considered an attempt to place each method into a slot. Rather, the choice of methods and the types of analyses employed are largely determined by the objectives of the investigation. In Section 1.2, we list a smaller number of practical problems designed to illustrate the connection between the choice of a statistical method and the objectives of the study. These problems, plus the examples in the text, should provide you with an appreciation of the applicability of multivariate techniques across different fields.

    The objectives of scientific investigations to which multivariate methods most naturally lend themselves include the following:

    1. Data reduction or structural simplification. The phenomenon being studied is represented as simply as possible without sacrificing valuable information. It is hoped that this will make interpretation easier.

    2. Sorting and grouping. Groups of "similar" objects or variables are created, based upon measured characteristics. Alternatively, rules for classifying objects into well-defined groups may be required.

    3. Investigation of the dependence among variables. The nature of the relationships among variables is of interest. Are all the variables mutually independent or are one or more variables dependent on the others? If so, how?

    . H. C. Marriott [19], page 89. The statement was made in a discussion of cluster analysis, but we feel it is appropriate for a broader range of methods. You should keep it in mind whenever you attempt or read about a data analysis. It allows one to maintain a proper perspective and not be overwhelmed by the elegance of some of the theory:

    If the results disagree with informed opinion, do not admit a simple logical interpretation, and do not show up clearly in a graphical presentation, they are probably wrong. There is no magic about numerical methods, and many ways in which they can break down. They are a valuable aid to the interpretation of data, not sausage machines automatically transforming bodies of numbers into packets of scientific fact.

    1.2 Applications of Multivariate Techniques

    The published applications of multivariate methods have increased tremendously in recent years. It is now difficult to cover the variety of real-world applications of these methods with brief discussions, as we did in earlier editions of this book. However, in order to give some indication of the usefulness of multivariate techniques, we offer the following short descriptions of the results of studies from several disciplines. These descriptions are organized according to the categories of objectives given in the previous section. Of course, many of our examples are multifaceted and could be placed in more than one category.

    Data reduction or simplification

    Using data on several variables related to cancer patient responses to radiotherapy, a simple measure of patient response to radiotherapy was constructed. (See Exercise 1.15.)

    Track records from many nations were used to develop an index of performance for both male and female athletes. (See [8] and [22].)

    Multispectral image data collected by a high-altitude scanner were reduced to a form that could be viewed as images (pictures) of a shoreline in two dimensions. (See [23].)

    Data on several variables relating to yield and protein content were used to create an index to select parents of subsequent generations of improved bean plants. (See [13].)

    A matrix of tactic similarities was developed from aggregate data derived from professional mediators. From this matrix the number of dimensions by which professional mediators judge the tactics they use in resolving disputes was determined. (See [21].)
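    To make the idea of a single summary index concrete, the following is a minimal sketch, not the method used in any of the studies cited above. It assumes that the index is taken to be the first principal component of the standardized variables, found here by power iteration. The data are synthetic, and all function and variable names are hypothetical.

```python
import math
import random


def standardize(rows):
    """Center each column at 0 and scale it to unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def correlation_matrix(z):
    """Sample correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=200):
    """Power iteration: unit-length leading eigenvector and its eigenvalue."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v[i] * mv[i] for i in range(p))
    return v, eigenvalue


# Synthetic data (an assumption for illustration): 200 units, 3 variables
# that all reflect one underlying trait plus independent noise.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(200)]
rows = [[t + 0.3 * rng.gauss(0, 1) for _ in range(3)] for t in latent]

z = standardize(rows)
weights, eigenvalue = leading_eigenpair(correlation_matrix(z))

# The index for each unit is the weighted sum of its standardized variables.
index = [sum(w * x for w, x in zip(weights, row)) for row in z]

# Share of total standardized variance carried by the single index.
share = eigenvalue / len(weights)
print(weights, share)
```

    In this sketch, `share` indicates how much of the total variation in the three variables the one-number index retains, which is one way to judge whether the simplification sacrifices valuable information.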

    Sorting and grouping

    Data on several variables related to computer use were employed to create clusters of categories of computer jobs that allow a better determination of existing (or planned) computer utilization. (See [2].)

    Measurements of several physiological variables were used to develop a screening procedure that discriminates alcoholics from nonalcoholics. (See [26].)

    Data related to responses to visual stimuli were used to develop a rule for separating people suffering from a multiple-sclerosis-caused visual pathology from those not suffering from the disease. (See Exercise 1.1

    executives were used to assess the relation between risk-taking behavior and performance. (See [18].)

    Prediction

    The associations between test scores, and several high school performance variables, and several college performance variables were used to develop predictors of success in college. (See [10].)

    Data on several variables related to the size distribution of sediments were used to develop rules for predicting different depositional environments. (See [7] and [20].)

    Measurements on several accounting and financial variables were used to develop a method for identifying potentially insolvent property-liability insurers. (See [28].)

    cDNA microarray experiments (gene expression data) are increasingly used to study the molecular variations among cancer tumors. A reliable classification of tumors is essential for successful diagnosis and treatment of cancer. (See [9].)
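    Several of the examples above amount to building a rule that assigns a new observation to one of a few known groups. The sketch below shows one very simple such rule, assuming that each observation is assigned to the group whose mean vector is nearest in Euclidean distance. The text does not say which methods the cited studies used; the group labels, variables, and numbers here are hypothetical.

```python
import math


def group_means(samples):
    """samples maps a group label to a list of observation vectors."""
    means = {}
    for label, rows in samples.items():
        p = len(rows[0])
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
    return means


def classify(x, means):
    """Assign x to the group whose mean vector is closest to it."""
    return min(means, key=lambda label: math.dist(x, means[label]))


# Hypothetical training data: two made-up financial ratios per insurer.
training = {
    "solvent": [[2.1, 0.9], [1.8, 1.1], [2.4, 1.0]],
    "insolvent": [[0.6, 2.2], [0.9, 1.9], [0.4, 2.5]],
}
means = group_means(training)

# Classify two new, equally hypothetical, insurers.
print(classify([2.0, 1.2], means))
print(classify([0.5, 2.0], means))
```

    A rule of this kind ignores differences in variability and correlation among the variables, so it should be read only as an illustration of what "developing a rule" means, not as a recommended procedure.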

    Hypotheses testing

    Several pollution-related variables were measured to determine whether levels for

    Fg. 6=