7/25/2019 Aspects of Multivariate Analysis
ASPECTS OF MULTIVARIATE ANALYSIS
1.1 Introduction
Scientific inquiry is an iterative learning process. Objectives pertaining to the explanation of a social or physical phenomenon must be specified and then tested by gathering and analyzing data. In turn, an analysis of the data gathered by experimentation or observation will usually suggest a modified explanation of the phenomenon. Throughout this iterative learning process, variables are often added or deleted from the study. Thus, the complexities of most phenomena require an investigator to collect observations on many different variables. This book is concerned with statistical methods designed to elicit information from these kinds of data sets. Because the data include simultaneous measurements on many variables, this body of methodology is called multivariate analysis.
The need to understand the relationships between many variables makes multivariate analysis an inherently difficult subject. Often, the human mind is overwhelmed by the sheer bulk of the data. Additionally, more mathematics is required to derive multivariate statistical techniques for making inferences than in a univariate setting. We have chosen to provide explanations based upon algebraic concepts and to avoid the derivations of statistical results that require the calculus of many variables. Our objective is to introduce several useful multivariate techniques in a clear manner, making heavy use of illustrative examples and a minimum of mathematics. Nonetheless, some mathematical sophistication and a desire to think quantitatively will be required.
Most of our emphasis will be on the analysis of measurements obtained without actively controlling or manipulating any of the variables on which the measurements are made. Only in Chapters 6 and 7 shall we treat a few experimental plans (designs) for generating data that prescribe the active manipulation of important variables. Although the experimental design is ordinarily the most important part of a scientific investigation, it is frequently impossible to control the generation of appropriate data in certain disciplines. (This is true, for example, in business, economics, ecology, geology, and sociology.) You should consult [6] and [7] for detailed accounts of design principles that, fortunately, also apply to multivariate situations.
It will become increasingly clear that many multivariate methods are based upon an underlying probability model known as the multivariate normal distribution. Other methods are ad hoc in nature and are justified by logical or commonsense arguments. Regardless of their origin, multivariate techniques must, invariably, be implemented on a computer. Recent advances in computer technology have been accompanied by the development of rather sophisticated statistical software packages, making the implementation step easier.
Multivariate analysis is a "mixed bag." It is difficult to establish a classification scheme for multivariate techniques that is both widely accepted and indicates the appropriateness of the techniques. One classification distinguishes techniques designed to study interdependent relationships from those designed to study dependent relationships. Another classifies techniques according to the number of populations and the number of sets of variables being studied. Chapters in this text are divided into sections according to inference about treatment means, inference about covariance structure, and techniques for sorting or grouping. This should not, however, be considered an attempt to place each method into a slot. Rather, the choice of methods and the types of analyses employed are largely determined by the objectives of the investigation. In Section 1.2, we list a smaller number of practical problems designed to illustrate the connection between the choice of a statistical method and the objectives of the study. These problems, plus the examples in the text, should provide you with an appreciation of the applicability of multivariate techniques across different fields.
The objectives of scientific investigations to which multivariate methods most naturally lend themselves include the following:
1. Data reduction or structural simplification. The phenomenon being studied is represented as simply as possible without sacrificing valuable information. It is hoped that this will make interpretation easier.
2. Sorting and grouping. Groups of "similar" objects or variables are created, based upon measured characteristics. Alternatively, rules for classifying objects into well-defined groups may be required.
3. Investigation of the dependence among variables. The nature of the relationships among variables is of interest. Are all the variables mutually independent or are one or more variables dependent on the others? If so, how?
We conclude this brief overview of multivariate analysis with a quotation from F. H. C. Marriott [19], page 89. The statement was made in a discussion of cluster analysis, but we feel it is appropriate for a broader range of methods. You should keep it in mind whenever you attempt or read about a data analysis. It allows one to maintain a proper perspective and not be overwhelmed by the elegance of some of the theory:
If the results disagree with informed opinion, do not admit a simple logical interpretation, and do not show up clearly in a graphical presentation, they are probably wrong. There is no magic about numerical methods, and many ways in which they can break down. They are a valuable aid to the interpretation of data, not sausage machines automatically transforming bodies of numbers into packets of scientific fact.
1.2 Applications of Multivariate Techniques
The published applications of multivariate methods have increased tremendously in recent years. It is now difficult to cover the variety of real-world applications of these methods with brief discussions, as we did in earlier editions of this book. However, in order to give some indication of the usefulness of multivariate techniques, we offer the following short descriptions of the results of studies from several disciplines. These descriptions are organized according to the categories of objectives given in the previous section. Of course, many of our examples are multifaceted and could be placed in more than one category.
Data reduction or simplification
Using data on several variables related to cancer patient responses to radiotherapy, a simple measure of patient response to radiotherapy was constructed. (See Exercise 1.15.)
Track records from many nations were used to develop an index of performance for both male and female athletes. (See [8] and [22].)
Multispectral image data collected by a high-altitude scanner were reduced to a form that could be viewed as images (pictures) of a shoreline in two dimensions. (See [23].)
Data on several variables relating to yield and protein content were used to create an index to select parents of subsequent generations of improved bean plants. (See [13].)
A matrix of tactic similarities was developed from aggregate data derived from professional mediators. From this matrix, the number of dimensions by which professional mediators judge the tactics they use in resolving disputes was determined. (See [21].)
Sorting and grouping
Data on several variables related to computer use were employed to create clusters of categories of computer jobs that allow a better determination of existing (or planned) computer utilization. (See [2].)
Measurements of several physiological variables were used to develop a screening procedure that discriminates alcoholics from nonalcoholics. (See [26].)
Data related to responses to visual stimuli were used to develop a rule for separating people suffering from a multiple-sclerosis-caused visual pathology from those not suffering from the disease. (See Exercise 1.1
executives were used to assess the relation between risk-taking behavior and performance. (See [18].)
Prediction
The associations between test scores, and several high school performance variables, and several college performance variables were used to develop predictors of success in college. (See [10].)
Data on several variables related to the size distribution of sediments were used to develop rules for predicting different depositional environments. (See [7] and [20].)
Measurements on several accounting and financial variables were used to develop a method for identifying potentially insolvent property-liability insurers. (See [28].)
cDNA microarray experiments (gene expression data) are increasingly used to study the molecular variations among cancer tumors. A reliable classification of tumors is essential for successful diagnosis and treatment of cancer. (See [9].)
Hypotheses testing
Several pollution-related variables were measured to determine whether levels for
Fg. 6=