Authors: Bland, Martin
Title: An Introduction to Medical Statistics, 3rd Edition
Copyright © 2000 Oxford University Press
Author
Martin Bland
Professor of Medical Statistics
St George's Hospital Medical School, London
Dedication
To the memory of Ernest and Phyllis Bland, my parents
Preface to the Third Edition
In preparing this third edition of An Introduction to Medical Statistics, I have taken the opportunity to correct a number of mistakes and typographical errors, and to change some of the examples and add a few more. I have extended the treatment of several topics and introduced some new ones, previously omitted through lack of space or energy, or because they were then rarely seen in the medical literature. In one case, number needed to treat, the concept had not even been invented when the second edition was written. Other new topics include consent in clinical trials, design and analysis of cluster-randomized trials, ecological studies, conditional probability, repeated testing, random effects models, intraclass correlation, and conditional odds ratios. Thanks to the wonders of computerized typesetting, I have managed to extend the contents of the book with a very small increase in the number of pages.
This book is for medical students, doctors, medical researchers, nurses, members of professions allied to medicine, and all others concerned with medical data. The range of statistical methods used in the medical and health care literature, and hence described in this book, continues to grow, but the time available in the undergraduate curriculum does not. Some of the topics covered here are beyond the needs of many students, so I have indicated by an asterisk sections which would not usually be included in first courses. These are intended for postgraduate students and medical researchers.
This third edition is being published with a companion volume, Statistical Questions in Evidence-based Medicine (Bland and Peacock 2000). This book of questions and answers includes no calculations and is complementary to the exercises given here. In the solutions given we make many references to An Introduction to Medical Statistics. Because we wanted Statistical Questions in Evidence-based Medicine to be usable with the second edition of An Introduction to Medical Statistics (Bland 1995), I have kept the same order and numbering of the sections in the third edition. New material has all been added at the ends of the chapters. If the structure sometimes seems a little unwieldy, that is why.
This is a book about data, not statistical theory. The fundamental concepts of study design, data collection and data analysis are explained by illustration and example. Only enough mathematics and formulae are given to make clear what is going on. For those who wish to go a little further in their understanding, some of the more mathematical background to the techniques described is given as appendices to the chapters rather than in the main text.
The material covered includes all the statistical work that would be required for a course in medicine and for the examinations of most of the royal colleges. It includes the design of clinical trials and epidemiological studies, data collection, summarizing and presenting data, probability, the Binomial, Normal, Poisson, t and Chi-squared distributions, standard errors, confidence intervals, tests of significance, large sample and small sample comparisons of means, the use of transformations, regression and correlation, methods based on ranks, contingency tables, odds ratios, measurement error, reference ranges, mortality data, vital statistics, analysis of variance, multiple and logistic regression, survival analysis, sample size estimation, and the choice of the statistical method.
The book is firmly grounded in medical data, particularly in medical research, and the interpretation of the results of calculations in their medical context is emphasized. Except for a few obviously invented numbers used to illustrate the mechanics of calculations, all the data in the examples and exercises are real, from my own research and statistical consultation or from the medical literature.
There are two kinds of exercise in this book. Each chapter has a set of multiple choice questions of the ‘true or false’ type, 100 in all. Multiple choice questions can cover a large amount of material in a short time, so are a useful tool for revision. As MCQs are widely used in postgraduate examinations, these exercises should also be useful to those preparing for memberships. All the MCQs have solutions, with reference to an appropriate part of the text or a detailed explanation for most of the answers. Each chapter also has one long exercise. Although these usually involve calculation, I have tried to avoid merely slotting figures into formulae. These exercises include not only the application of statistical techniques, but also the interpretation of the results in the light of the source of the data.
I wish to thank many people who have contributed to the writing of this book. First, there are the many medical students, doctors, research workers, nurses, physiotherapists, and radiographers whom it has been my pleasure to teach, and from whom I have learned so much. Second, the book contains many examples drawn from research carried out with other statisticians, epidemiologists, and social scientists, particularly Douglas Altman, Ross Anderson, Mike Banks, Barbara Butland, Beulah Bewley, and Walter Holland. These studies could not have been done without the assistance of Patsy Bailey, Bob Harris, Rebecca McNair, Janet Peacock, Swatee Patel, and Virginia Pollard. Third, the clinicians and scientists with whom I have collaborated or who have come to me for statistical advice not only taught me about medical data but many of them have left me with data which are used here, including Naib Al-Saady, Thomas Bewley, Frances Boa, Nigel Brown, Jan Davies, Peter Fish, Caroline Flint, Nick Hall, Tessi Hanid, Michael Hutt, Riahd Jasrawi, Ian Johnston, Moses Kipembwa, Pam Luthra, Hugh Mather, Daram Maugdal, Douglas Maxwell, Charles Mutoka, Tim Northfield, Andreas Papadopoulos, Mohammed Raja, Paul Richardson, and Alberto Smith. I am particularly indebted to John Morgan, as Chapter 16 is partly based on his work.
The original manuscript was typed by Sue Nash, Sue Fisher, Susan Harding, Sheilah Skipp, and myself. This edition has been set by me using LaTeX, so any errors which remain are definitely my own. All the graphs have been drawn using Stata except for the pie charts, done using Harvard Graphics.
I thank Douglas Altman, David Jones, Robin Prescott, Klim McPherson, Janet Peacock, and Stuart Pocock for their helpful comments on earlier drafts. I have corrected a number of errors from the first and second editions, and I am grateful to colleagues who have pointed them out to me, in particular to Daniel Heitjan. I am very grateful to Janet Peacock, who proof-read this edition. Special thanks are due to my head of department, Ross Anderson, for all his support, and to the staff of Oxford University Press. Most of all I thank my wife, Pauline Bland, for her unfailing confidence and encouragement, and my children, Emily and Nicholas Bland, for keeping my feet firmly on the ground.
M.B.
London, March 2000
Sections marked * contain material usually found only in postgraduate courses
1
Introduction
1.1 Statistics and medicine
Evidence-based practice is the new watchword in every profession concerned with the treatment and prevention of disease and promotion of health and well-being. This requires both the gathering of evidence and its critical interpretation. The former is bringing more people into the practice of research, and the latter is requiring of all health professionals the ability to evaluate the research carried out. Much of this evidence is in the form of numerical data. The essential skill required for the collection, analysis, and evaluation of numerical data is statistics. Thus Statistics, the science of assembling and interpreting numerical data, is the core science of evidence-based practice.
In the past forty years medical research has become deeply involved with the techniques of statistical inference. The work published in medical journals is full of statistical jargon and the results of statistical calculations. This acceptance of statistics, though gratifying to the medical statistician, may even have gone too far. More than once I have told a colleague that he did not need me to prove that his difference existed, as anyone could see it, only to be told in turn that without the magic of the P value he could not have his paper published.
Statistics has not always been so popular with the medical profession. Statistical methods were first used in medical research in the 19th century by workers such as Pierre-Charles-Alexandre Louis, William Farr, Florence Nightingale and John Snow. Snow's studies of the modes of communication of cholera, for example, made use of epidemiological techniques upon which we have still made little improvement. Despite the work of these pioneers, however, statistical methods did not become widely used in clinical medicine until the middle of the twentieth century. It was then that the methods of randomized experimentation and statistical analysis based on sampling theory, which had been developed by Fisher and others, were introduced into medical research, notably by Bradford Hill. It rapidly became apparent that research in medicine raised many new problems in both design and analysis, and much work has been done since towards solving these by clinicians, statisticians and epidemiologists.
Although considerable progress has been made in such fields as the design of clinical trials, there remains much to be done in developing research methodology in medicine. It seems likely that this will always be so, for every research project is something new, something which has never been done before. Under these circumstances we make mistakes. No piece of research can be perfect and there will always be something which, with hindsight, we would have changed. Furthermore, it is often from the flaws in a study that we can learn most about research methods. For this reason, the work of several researchers is described in this book to illustrate the problems into which their designs or analyses led them. I do not wish to imply that these people were any more prone to error than the rest of the human race, or that their work was not a valuable and serious undertaking. Rather I want to learn from their experience of attempting something extremely difficult, trying to extend our knowledge, so that researchers and consumers of research may avoid these particular pitfalls in the future.
1.2 Statistics and mathematics
Many people are discouraged from the study of statistics by a fear of being overwhelmed by mathematics. It is true that many professional statisticians are also mathematicians, but not all are, and there are many very able appliers of statistics to their own fields. It is possible, though perhaps not very useful, to study statistics simply as a part of mathematics, with no concern for its application at all. Statistics may also be discussed without appearing to use any mathematics at all (e.g. Huff 1954).
The aspects of statistics described in this book can be understood and applied with the use of simple algebra. Only the algebra which is essential for explaining the most important concepts is given in the main text. This means that several of the theoretical results used are stated without a discussion of their mathematical basis. This is done when the derivation of the result would not aid much in understanding the application. For many readers the reasoning behind these results is not of great interest. For the reader who does not wish to take these results on trust, several chapters have appendices in which simple mathematical proofs are given. These appendices are designed to help increase the understanding of the more mathematically inclined reader and to be omitted by those who find that the mathematics serves only to confuse.
1.3 Statistics and computing
Practical statistics has always involved large amounts of calculation. When the methods of statistical inference were being developed in the first half of the twentieth century, calculations were done using pencil, paper, tables, slide rules and, with luck, a very expensive mechanical adding machine. Older books on statistics spend much time on the details of carrying out calculations and any reference to a ‘computer’ means a person who computes, not an electronic device. The development of the digital computer has brought changes to statistics as to many other fields. Calculations can be done quickly, easily and, we hope, accurately with a range of machines from pocket calculators with built-in statistical functions to powerful computers analysing data on many thousands of subjects. Many statistical methods would not be contemplated without computers, and the development of new methods goes hand in hand with the development of software to carry them out. The theory of multilevel modelling (Goldstein 1995) and the programs MLn and MLWin are a good example. Most of the calculations in this book were done using a computer and the graphs were produced with one.
As an added bonus, my little MSDOS program Clinstat (not to be confused with any commercial package of the same name) can be downloaded free from my website at http://www.sghms.ac.uk/depts/phs/staff/jmb/. It does most of the calculations in this book, including sample size calculations and random sampling and allocation. It does not do any multifactorial analyses, sorry. There is also a little program to find some exact confidence intervals.
There is therefore no need to consider the problems of manual calculation in detail. The important thing is to know why particular calculations should be done and what the results of these calculations actually mean. Indeed, the danger in the computer age is not so much that people carry out complex calculations wrongly, but that they apply very complicated statistical methods without knowing why or what the computer output means. More than once I have been approached by a researcher bearing a two inch thick computer printout, and asking what it all means. Sadly, too often, it means that another tree has died in vain.
The widespread availability of computers means that more calculations are being done, and being published, than ever before, and the chance of inappropriate statistical methods being applied may actually have increased. This misuse arises partly because people regard their data analysis problems as computing problems, not statistical ones, and seek advice from computer experts rather than statisticians. They often get good advice on how to do it, but rather poor advice about what to do, why to do it and how to interpret the results afterwards. It is therefore more important than ever that the consumers of research understand something about the uses and limitations of statistical techniques.
1.4 The scope of this book
This book is intended as an introduction to some of the statistical ideas important to medicine. It does not tell you all you need to know to do medical research. Once you have understood the concepts discussed here, it is much easier to learn about the techniques of study design and statistical analysis required to answer any particular question. There are several excellent standard works which describe the solutions to problems in the analysis of data (Armitage and Berry 1994, Snedecor and Cochran 1980, Altman 1991) and also more specialized books to which reference will be made where required.
What I hope the book will do is to give enough understanding of the statistical ideas commonly used in medicine to enable the health professional to read the medical literature competently and critically. It covers enough material (and more) for an undergraduate course in statistics for students of medicine, nursing, physiotherapy, etc. At the time of writing, as far as can be established, it covers the material required to answer statistical questions set in the examinations of most of the Royal Colleges, except for the MRCPsych. I have indicated by an asterisk in the subheading those sections which I think will be required only by the postgraduate or the researcher.
When working through a textbook, it is useful to be able to check your understanding of the material covered. Like most such books, this one has exercises at the end of each chapter, but to ease the tedium most of these are of the multiple choice type. There is also one long exercise, usually involving calculations, for each chapter. In keeping with the computer age, where laborious calculation would be necessary intermediate results are given to avoid this. Thus the exercises can be completed quite quickly and the reader is advised to try them. You can also download some of the data sets from my website (http://www.sghms.ac.uk/depts/phs/staff/jmb). Solutions are given at the end of the book, in full for the long exercises and as brief notes with references to the relevant sections in the text for MCQs. Readers who would like more numerical exercises are recommended to Osborn (1979). For a wealth of exercises in the understanding and interpretation of statistics in medical research, drawn from the published literature and popular media, you should try the companion volume to this one, Statistical Questions in Evidence-based Medicine (Bland and Peacock 2000).
Finally, a question many students of medicine ask as they struggle with statistics: is it worth it? As Altman (1982) has argued, bad statistics leads to bad research and bad research is unethical. Not only may it give misleading results, which can result in good therapies being abandoned and bad ones adopted, but it means that patients may have been exposed to potentially harmful new treatments for no good reason. Medicine is a rapidly changing field. In ten years' time, many of the therapies currently prescribed and many of our ideas about the causes and prevention of disease will be obsolete. They will be replaced by new therapies and new theories, supported by research studies and data of the kind described in this book, and probably presenting many of the same problems in interpretation. The practitioner will be expected to decide for her- or himself what to prescribe or advise based on these studies. So a knowledge of medical statistics is one of the most useful things any doctor could acquire during her or his training.
2
The design of experiments
2.1 Comparing treatments
There are two broad types of study in medical research: observational and experimental. In observational studies, aspects of an existing situation are observed, as in a survey or a clinical case report. We then try to interpret our data to give an explanation of how the observed state of affairs has come about. In experimental studies, we do something, such as giving a drug, so that we can observe the result of our action. This chapter is concerned with the way statistical thinking is involved in the design of experiments. In particular, it deals with comparative experiments where we wish to study the difference between the effects of two or more treatments. These experiments may be carried out in the laboratory in vitro or on animals or human volunteers, in the hospital or community on human patients, or, for trials of preventive interventions, on currently healthy people. We call trials of treatments on human subjects clinical trials. The general principles of experimental design are the same, although there are special precautions which must be taken when experimenting with human subjects. The experiments whose results most concern clinicians are clinical trials, so the discussion will deal mainly with them.
Suppose we want to know whether a new treatment is more effective than the present standard treatment. We could approach this in a number of ways.
First, we could compare the results of the new treatment on new patients with records of previous results using the old treatment. This is seldom convincing, because there may be many differences between the patients who received the old treatment and the patients who will receive the new. As time passes, the general population from which patients come may become healthier, standards of ancillary treatment and nursing care may improve, or the social mix in the catchment area of the hospital may change. The nature of the disease itself may change. All these factors may produce changes in the patients' apparent response to treatment. For example, Christie (1979) showed this by comparing the survival of stroke patients in 1978, after the introduction of a C-T head scanner, with that of patients treated in 1974, before the introduction of the scanner. He took the records of a group of patients treated in 1978, who received a C-T scan, and matched each of them with a patient treated in 1974 of the same age, diagnosis and level of consciousness on admission. As the first column of Table 2.1 shows, patients in 1978 clearly tended to have better survival than similar patients in 1974.
The scanned 1978 patient did better than the unscanned 1974 patient in 31% of pairs, whereas the unscanned 1974 patient did better than the scanned 1978 patient in only 7% of pairs. However, he also compared the survival of patients in 1978 who did not receive a C-T scan with matched patients in 1974. These patients too showed a marked improvement in survival from 1974 to 1978 (Table 2.1). The 1978 patients did better in 38% of pairs and the 1974 patients in only 19% of pairs. There was a general improvement in outcome over a fairly short period of time. If we did not have the data on the unscanned patients from 1978 we might be tempted to interpret these data as evidence for the effectiveness of the C-T scanner. Historical controls like this are seldom very convincing, and usually favour the new treatment. We need to compare the old and new treatments concurrently.
Table 2.1. Analysis of the difference in survival for matched pairs of stroke patients (Christie 1979)

                                    C-T scan in 1978    No C-T scan in 1978
Pairs with 1978 better than 1974    9 (31%)             34 (38%)
Pairs with same outcome             18 (62%)            38 (43%)
Pairs with 1978 worse than 1974     2 (7%)              17 (19%)
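The percentages in Table 2.1 are each count divided by the number of pairs in its column. As a minimal sketch, assuming Python is to hand, they can be recomputed from the counts; the dictionary layout and the function name pair_percentages are my own illustration, not part of Christie's analysis.

```python
# Counts of matched pairs copied from Table 2.1 (Christie 1979).
table_2_1 = {
    "C-T scan in 1978": {"better": 9, "same": 18, "worse": 2},
    "No C-T scan in 1978": {"better": 34, "same": 38, "worse": 17},
}


def pair_percentages(counts):
    """Express each outcome count as a whole-number percentage of all pairs."""
    total = sum(counts.values())
    return {outcome: round(100 * n / total) for outcome, n in counts.items()}


# One dictionary of percentages per column of the table.
percentages = {group: pair_percentages(c) for group, c in table_2_1.items()}

for group, pct in percentages.items():
    print(group, pct)
```

Both columns show more pairs favouring 1978 than 1974, which is why the scanned column alone cannot be read as evidence for the scanner.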
Second, we could obtain concurrent groups by comparing our own patients, given the new treatment, with patients given the standard treatment in another hospital or clinic, or by another clinician in our own institution. Again, there may be differences between the patient groups due to catchment, diagnostic accuracy, preference by patients for a particular clinician, or you might just be a better therapist. We cannot separate these differences from the treatment effect.
Third, we could ask people to volunteer for the new treatment and give the standard treatment to those who do not volunteer. The difficulty here is that people who volunteer and people who do not volunteer are likely to be different in many ways apart from the treatments we give them. Volunteers might be more likely to follow medical advice, for example. We will consider an example of the effects of volunteer bias in §2.4.
Fourth, we can allocate patients to the new treatment or the standard treatment and observe the outcome. The way in which patients are allocated to treatments can influence the results enormously, as the following example (Hill 1962) shows. Between 1927 and 1944 a series of trials of BCG vaccine were carried out in New York (Levine and Sackett 1946). Children from families where there was a case of tuberculosis were allocated to a vaccination group and given BCG vaccine, or to a control group who were not vaccinated. Between 1927 and 1932 physicians vaccinated half the children, the choice of which children to vaccinate being left to them. There was a clear advantage in survival for the BCG group (Table 2.2). However, there was also a clear tendency for the physician to vaccinate the children of more cooperative parents, and to leave those of less cooperative parents as controls. From 1933 allocation to treatment or control was done centrally, alternate children being assigned to control and vaccine. The difference in degree of cooperation between the parents of the two groups of children disappeared, and so did the difference in mortality. Note that these were a special group of children, from families where there was tuberculosis. In large trials using children drawn from the general population, BCG was shown to be effective in greatly reducing deaths from tuberculosis (Hart and Sutherland 1977).
Table 2.2. Results of studies of BCG vaccine in New York City (Hill 1962)

Columns: (1) No. of children; (2) No. of deaths from TB; (3) Death rate; (4) Average no. of visits to clinic during 1st year of follow-up; (5) Proportion of parents giving good cooperation as judged by visiting nurses

Period of trial                                        (1)    (2)    (3)      (4)    (5)
1927–32 Selection made by physician
  BCG group                                            445    3      0.67%    3.6    43%
  Control group                                        545    18     3.30%    1.7    24%
1933–44 Alternative selection carried out centrally
  BCG group                                            566    8      1.41%    2.8    40%
  Control group                                        528    8      1.52%    2.4    34%
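The death rates in Table 2.2 are simply deaths divided by children, expressed as a percentage. The sketch below recomputes them from the counts; the tuple layout, the period labels and the function name death_rate_percent are illustrative choices of mine, not taken from Hill (1962).

```python
# (period, group, number of children, deaths from TB), copied from Table 2.2.
table_2_2 = [
    ("1927-32", "BCG", 445, 3),
    ("1927-32", "Control", 545, 18),
    ("1933-44", "BCG", 566, 8),
    ("1933-44", "Control", 528, 8),
]


def death_rate_percent(children, deaths):
    """Deaths per 100 children, rounded to two decimal places."""
    return round(100 * deaths / children, 2)


rates = {
    (period, group): death_rate_percent(children, deaths)
    for period, group, children, deaths in table_2_2
}

for (period, group), rate in rates.items():
    print(f"{period} {group}: {rate:.2f}%")
```

Set side by side, the rates make the point of the example: the large gap between groups under physician selection all but vanishes once allocation is taken out of the physician's hands.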
Different methods of allocation to treatment can produce different results. This is because the method of allocation may not produce groups of subjects which are comparable, similar in every respect except the treatment. We need a method of allocation to treatments in which the characteristics of subjects will not affect their chance of being put into any particular group. This can be done using random allocation.
2.2 Random allocation
If we want to decide which of two people receives an advantage, in such a way that each has an equal chance of receiving it, we can use a simple, widely accepted method. We toss a coin. This is used to decide the way football matches begin, for example, and all appear to agree that it is fair. So if we want to decide which of two subjects should receive a vaccine, we can toss a coin. Heads and the first subject receives the vaccine, tails and the second receives it. If we do this for each pair of subjects we build up two groups which have been assembled without any characteristics of the subjects themselves being involved in the allocation. The only differences between the groups will be those due to chance. As we shall see later (Chapters 8 and 9), statistical methods enable us to measure the likely effects of chance. Any difference between the groups which is larger than this should be due to the treatment, since there will be no other differences between the groups. This method of dividing subjects into groups is called random allocation or randomization.
Several methods of randomizing have been in use for centuries, including coins, dice, cards, lots, and spinning wheels. Some of the theory of probability which we shall use later to compare randomized groups was first developed as an aid to gambling. For large randomizations we use a different, non-physical randomizing method: random number tables. Table 2.3 provides an example, a table of 1000 random digits. These are more properly called pseudo-random numbers, as they are generated by a mathematical process. They are available in tables (Kendall and Babington Smith 1971) or can be produced by computer and some calculators. We can use tables of random numbers in several ways to achieve random allocation. For example, let us randomly allocate 20 subjects to two groups, which I shall label A and B. We choose a random starting point in the table, using one of the physical methods described above. (I used decimal dice. These are 20-sided dice, numbered 0 to 9 twice, which fit our number system more conveniently than the traditional cube. Two such dice give a random number between 1 and 100, counting ‘0,0’ as 100.) The random starting point was row 22, column 20, and the first 20 digits were 3, 4, 6, 2, 9, 7, 5, 3, 2, 6, 9, 7, 9, 3, 9, 2, 3, 3, 2 and 4. We now allocate subjects corresponding to odd digits to group A and those corresponding to even digits to B. The first digit, 3, is odd, so the first subject goes into group A. The second digit, 4, is even, so the second subject goes into group B, and so on. We get the allocation shown in Table 2.4. We could allocate into three groups by assigning to A if the digit is 1, 2, or 3, to B if 4, 5, or 6, and to C if 7, 8, or 9, ignoring 0. There are many possibilities.
Table 2.3. The 1000 random digits

        Column
Row     1–4    5–8    9–12   13–16  17–20  21–24  25–28  29–32  33–36  37–40
 1      3645   8831   2873   5943   4632   0032   6715   3249   5455   7517
 2      9051   4066   1846   9554   6589   1680   9533   1588   1860   5646
 3      9841   9022   4837   8031   9139   3380   4082   3826   2039   7182
 4      5525   7127   1468   6404   9924   8230   7343   9268   1899   4754
 5      0299   1075   7721   8855   7997   7032   5987   7535   1834   6253
 6      7985   5566   6384   0863   0400   1834   5394   5801   5505   9099
 7      3353   9528   0681   3495   1393   3716   9506   1591   8999   3716
 8      7475   1313   2216   3776   1557   4238   9623   9024   5826   7146
 9      0666   3043   0066   3260   3660   4605   1731   6680   9101   6235
10      9283   3160   8730   7683   1785   3148   1323   1732   6814   8496
11      6121   3149   9829   7770   7211   3523   6947   1427   1474   5235
12      2782   0101   7441   3877   5368   5326   5516   3566   3187   8209
13      6105   5010   9485   8632   1072   9567   8821   7209   4873   0397
14      1157   8567   9491   4948   3549   3941   8017   5445   2366   8260
15      1516   0890   9286   1332   2601   2002   7245   9474   9719   9946
16      2209   2966   1544   7674   9492   4813   7585   8128   9541   3630
17      6913   5355   3587   4323   8332   7940   9220   8376   8261   2420
18      0829   7937   0033   3534   8655   1091   1886   4350   6779   3358
19      3729   9985   5563   3266   7198   8520   3193   6391   7721   9962
20      6511   1404   8886   2892   0403   4299   8708   2055   3053   8224
21      6622   8158   3080   2110   1553   2690   3377   5119   1749   2714
22      3721   7713   6931   2022   6713   4629   7532   6979   3923   3243
23      5143   0972   6838   0577   1462   8907   3789   2530   9209   0692
24      3159   3783   9255   1531   2124   0393   3597   8461   9685   4551
25      7905   4369   5293   0077   4482   9165   1171   2537   8913   6387
Table 2.4. Allocation of 20 subjects to two groups
Subject Digit Group
1 3 A
2 4 B
3 6 B
4 2 B
5 9 A
6 7 A
7 5 A
8 3 A
9 2 B
10 6 B
11 9 A
12 7 A
13 9 A
14 3 A
15 9 A
16 2 B
17 3 A
18 3 A
19 2 B
20 4 B
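The rule used to build Table 2.4 is mechanical enough to write down as a few lines of code. This is a minimal sketch in Python, using the 20 digits read from row 22, column 20 of Table 2.3; the function names allocate_two_groups and allocate_three_groups are hypothetical labels of mine.

```python
# The 20 digits read from Table 2.3, starting at row 22, column 20.
digits = [3, 4, 6, 2, 9, 7, 5, 3, 2, 6, 9, 7, 9, 3, 9, 2, 3, 3, 2, 4]


def allocate_two_groups(digits):
    """Odd digit -> group A, even digit -> group B."""
    return ["A" if d % 2 == 1 else "B" for d in digits]


def allocate_three_groups(digits):
    """1-3 -> A, 4-6 -> B, 7-9 -> C; a 0 is skipped and the next digit used."""
    groups = []
    for d in digits:
        if d == 0:
            continue
        groups.append("ABC"[(d - 1) // 3])
    return groups


two_groups = allocate_two_groups(digits)
three_groups = allocate_three_groups(digits)

for subject, (digit, group) in enumerate(zip(digits, two_groups), start=1):
    print(subject, digit, group)
```

Applied to the same 20 digits, the three-group rule would put 9 subjects in A, 5 in B and 6 in C, which shows how unequal simple randomization can leave small groups.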
The system described above gave us unequal numbers in the two groups, 12 in A and 8 in B. We sometimes want the groups to be of equal size. One way to do this would be to proceed as above until either A or B has 10 subjects in it, all the remaining subjects going into the other group. This is satisfactory in that each subject has an equal chance of being allocated to A or B, but it has a disadvantage. There is a tendency for the last few subjects all to have the same treatment. This characteristic sometimes worries researchers, who feel that the randomization is not quite right. In statistical terms the possible allocations are not equally likely. If we use this method for the random allocation described above, the 10th subject in group A would be reached at subject 15 and the last five subjects would all be in group B. We can ensure that all randomizations are equally likely by using the table of random numbers in a different way. For example, we can use the table to draw a random sample of 10 subjects from 20, as described in §3.4. These would form group A, and the remaining 10 group B. Another way is to put our subjects into small equal-sized groups, called blocks, and within each block to allocate equal numbers to A and B. This gives approximately equal numbers on the two treatments and will do so whenever the trial stops.
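The three ways of getting equal groups can be compared in a short sketch. It assumes Python's standard pseudo-random generator in place of the printed table; the seed 2000, the block size of 4 and all the function names are arbitrary illustrative choices, not anything prescribed in the text.

```python
import random

# The 20 digits read from Table 2.3, starting at row 22, column 20.
digits = [3, 4, 6, 2, 9, 7, 5, 3, 2, 6, 9, 7, 9, 3, 9, 2, 3, 3, 2, 4]


def fill_until_full(digits, group_size=10):
    """Odd -> A, even -> B until one group is full; the rest go to the other."""
    allocation = []
    for d in digits:
        if allocation.count("A") == group_size:
            allocation.append("B")
        elif allocation.count("B") == group_size:
            allocation.append("A")
        else:
            allocation.append("A" if d % 2 == 1 else "B")
    return allocation


def random_sample_allocation(n_subjects, rng):
    """Draw a random sample of half the subjects for A; the rest form B."""
    group_a = set(rng.sample(range(1, n_subjects + 1), n_subjects // 2))
    return ["A" if s in group_a else "B" for s in range(1, n_subjects + 1)]


def blocked_allocation(n_subjects, block_size, rng):
    """Within each block (even size), put equal numbers on A and B in random order."""
    allocation = []
    while len(allocation) < n_subjects:
        block = ["A", "B"] * (block_size // 2)
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_subjects]


rng = random.Random(2000)  # arbitrary seed so the sketch is repeatable
truncated = fill_until_full(digits)
sampled = random_sample_allocation(20, rng)
blocked = blocked_allocation(20, 4, rng)

print("".join(truncated))  # the last five subjects all fall in group B
print("".join(sampled))
print("".join(blocked))
```

With blocks of four, the two groups can never differ by more than two subjects at any point, which is the sense in which the numbers stay approximately equal whenever the trial stops.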
The use of random numbers and the generation of the random numbers themselves are simple mathematical operations well suited to the computers which are now readily available to researchers. It is very easy to program a computer to carry out random allocation, and once a program is available it can be used over and over again for further experiments. My program Clinstat (§1.3) does several different randomization schemes, even offering blocks of random size.
The trial carried out by the Medical Research Council (MRC 1948) to test the efficacy of streptomycin for the treatment of pulmonary tuberculosis is generally considered to have been the first randomized experiment in medicine. In this study the target population was patients with acute progressive bilateral pulmonary tuberculosis, aged 15–30 years. All cases were bacteriologically proved and were considered unsuitable for other treatments then available. The trial took place in three centres and allocation was by a series of random numbers, drawn up for each sex at each centre. The streptomycin group contained 55 patients and the control group 52 cases. The condition of the patients on admission is shown in Table 2.5. The frequency distributions of temperature and sedimentation rate were similar for the two groups; if anything the treated (S) group were slightly worse. However, this difference is no greater than could have arisen by chance, which, of course, is how it arose. The two groups are certain to be slightly different in some characteristics, especially with a fairly small sample, and we can take account of this in the analysis (Chapter 17).
Table 2.5. Condition of patients on admission to trial of streptomycin (MRC 1948)

                                                   Group
                                                   S      C
General condition                   Good           8      8
                                    Fair           17     20
                                    Poor           30     24
Max. evening temperature in         98-98.9        4      4
first week (°F)                     99-99.9        13     12
                                    100-100.9      15     17
                                    101+           24     19
Sedimentation rate                  0-10           0      0
                                    11-20          3      2
                                    21-50          16     20
                                    51+            36     29
Table 2.6. Survival at six months in the MRC streptomycin trial, stratified by initial condition (MRC 1948)

Maximum evening temperature                    Group
during first observation week    Outcome       Streptomycin group    Control group
98-98.9°F                        Alive         3                     4
                                 Dead          0                     0
99-99.9°F                        Alive         13                    11
                                 Dead          0                     1
100-100.9°F                      Alive         15                    12
                                 Dead          0                     5
101°F and above                  Alive         20                    11
                                 Dead          4                     8
After six months, 93% of the S group survived, compared to 73% of the control group. There was a clear advantage to the streptomycin group. The relationship of survival to initial condition is shown in Table 2.6. Survival was more likely for patients with lower temperatures, but the difference in survival between the S and C groups is clearly present within each temperature category where deaths occurred.
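The overall survival figures, and the comparison within each temperature category, can be recomputed directly from the counts in Table 2.6. The following is a sketch only; the dictionary layout and the function names are my own, and rounding to whole percentages is assumed to match the text.

```python
# (alive, dead) at six months by temperature stratum, copied from Table 2.6.
table_2_6 = {
    "98-98.9": {"S": (3, 0), "C": (4, 0)},
    "99-99.9": {"S": (13, 0), "C": (11, 1)},
    "100-100.9": {"S": (15, 0), "C": (12, 5)},
    "101+": {"S": (20, 4), "C": (11, 8)},
}


def survival_percent(alive, dead):
    """Survivors as a whole-number percentage of all patients."""
    return round(100 * alive / (alive + dead))


def overall_survival(table, group):
    """Pool the strata and return the group's overall survival percentage."""
    alive = sum(row[group][0] for row in table.values())
    dead = sum(row[group][1] for row in table.values())
    return survival_percent(alive, dead)


overall = {group: overall_survival(table_2_6, group) for group in ("S", "C")}
by_stratum = {
    stratum: {group: survival_percent(*counts) for group, counts in row.items()}
    for stratum, row in table_2_6.items()
}

print(overall)
for stratum, pct in by_stratum.items():
    print(stratum, pct)
```

In every stratum where anyone died, the streptomycin percentage is the higher of the two, so the overall advantage is not an artefact of the groups' starting temperatures.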
Randomized trials are not restricted to two treatments. We can compare several treatments. A drug trial might include the new drug, a rival drug, and no drug at all. We can carry out experiments to compare several factors at once. For example, we might wish to study the effect of a drug at different doses in the presence or absence of a second drug, with the subject standing or supine. This is usually designed as a factorial experiment, where every possible combination of treatments is used. These designs are unusual in clinical research but are sometimes used in laboratory work. They are described in more advanced texts (Armitage and Berry 1994, Snedecor and Cochran 1980). For more on randomized trials in general, see Pocock (1983) and Johnson and Johnson (1977).
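To see what ‘every possible combination’ means in practice, the treatment combinations of a factorial experiment can be listed by machine. This sketch assumes three dose levels, which the text does not specify; the level names are invented for illustration.

```python
from itertools import product

# Hypothetical factor levels; the text names the factors but not the doses.
doses = ["low", "medium", "high"]
second_drug = ["absent", "present"]
posture = ["standing", "supine"]

# A full factorial design uses every combination of the factor levels.
combinations = list(product(doses, second_drug, posture))

for dose, drug, position in combinations:
    print(dose, drug, position)
```

Three doses, two states of the second drug and two postures give 3 × 2 × 2 = 12 combinations, and the count multiplies with every factor added, which may be one reason such designs are more at home in the laboratory.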
Randomized experimentation may be criticized because we are withholding a potentially beneficial treatment from patients. Any biologically active treatment is potentially harmful, however, and we are surely not justified in giving potentially harmful treatments to patients before the benefits have been demonstrated conclusively. Without properly conducted controlled clinical trials to support it, each administration of a treatment to a patient becomes an uncontrolled experiment, whose outcome, good or bad, cannot be predicted.
2.3* Methods of allocation without random numbers
In the second stage of the New York studies of BCG vaccine, the children were allocated to treatment or control alternately. Researchers often ask why this method cannot be used instead of randomization, arguing that the order in which patients arrive is random, so the groups thus formed will be comparable. First, although the patients may appear to be in a random order, there is no guarantee that this is the case. We could never be sure that the groups are comparable. Second, this method is very susceptible to mistakes, or even to cheating in the patients' perceived interest. The experimenter knows what treatment the subject will receive before the subject is admitted to the trial. This knowledge may influence the decision to admit the subject, and so lead to biased groups. For example, an experimenter might be more prepared to admit a frail patient if the patient will be on the control treatment than if the patient would be exposed to the risk of the new treatment. This objection applies to using the last digit of the hospital number for allocation.
Knowledge of what treatment the next patient will receive can certainly lead to bias. For example, Schulz et al. (1995) looked at 250 controlled trials. They compared trials where treatment allocation was not adequately concealed from researchers with trials where there was adequate concealment. They found an average treatment effect 41% larger in the trials with inadequate concealment.
There are several examples reported in the literature of alterations to treatment allocations. Holten (1951) reported a trial of anticoagulant therapy for patients with coronary thrombosis. Patients who presented on even dates were to be treated and patients arriving on odd dates were to form the control group. The author reports that some of the clinicians involved found it ‘difficult to remember’ the criterion for allocation. Overall the treated patients did better than the controls (Table 2.7). Curiously, the controls on the even dates (wrongly allocated) did considerably better than control patients on the odd dates (correctly allocated) and even managed to do marginally better than those who received the treatment. The best outcome, treated or not, was for those who were incorrectly allocated. Allocation in this trial appears to have been rather selective.
Table 2.7. Outcome of a clinical trial using systematic allocation, with errors in allocation (Holten 1951)

Outcome     Even dates             Odd dates
            Treated    Control     Treated    Control
Survived    125        39          10         125
Died        39 (25%)   11 (22%)    0 (0%)     81 (36%)
Total       164        50          10         206
Other methods of allocation set out to be random but can fall into this sort of difficulty. For example, we could use physical mixing to achieve randomization. This is quite difficult to do. As an experiment, take a deck of cards and order them in suits from ace of clubs to king of spades. Now shuffle them in the usual way and examine them. You will probably see many runs of several cards which remain together in order. Cards must be shuffled very thoroughly indeed before the ordering ceases to be apparent. The physical randomization method can be applied to an experiment by marking equal numbers of slips of paper with the names of the treatments, sealing them into envelopes and shuffling them. The treatment for a subject is decided by withdrawing an envelope. This method was used in another study of anticoagulant therapy by Carleton et al. (1960). These authors reported that in the latter stages of the trial some of the clinicians involved had attempted to read the contents of the envelopes by holding them up to the light, in order to allocate patients to their own preferred treatment.
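The envelope procedure can be imitated on a computer. The following is a minimal sketch in Python, not the method used in any of the trials cited here; the function name `allocation_list` and the seed value are hypothetical. It writes equal numbers of slips for each treatment and shuffles them with a pseudo-random generator, which avoids the runs that poor physical mixing leaves behind.

```python
import random


def allocation_list(n_per_group, treatments=("treatment", "control"), seed=None):
    """Return a shuffled list holding n_per_group slips for each treatment.

    Hypothetical helper for illustration only; the seed is optional and is
    used here just to make the example repeatable.
    """
    rng = random.Random(seed)
    # One slip per subject, with equal numbers for every treatment.
    slips = [t for t in treatments for _ in range(n_per_group)]
    rng.shuffle(slips)  # the computer's equivalent of shuffling the envelopes
    return slips


# Twenty subjects, ten to each arm; subject i receives schedule[i].
schedule = allocation_list(10, seed=2000)
print(schedule)
```

As with sealed envelopes, such a list protects against bias only if it is kept from whoever decides to admit the next subject.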
Interfering with the randomization can actually be built into the allocation procedure, with equally disastrous results. In the Lanarkshire Milk Experiment, discussed by Student (1931), 10000 school children received three quarters of a pint of milk per day and 10000 children acted as controls. The children were weighed and measured at the beginning and end of the six-month experiment. The object was to see whether the milk improved the growth of children. The allocation to the ‘milk’ or control group was done as follows:
The teachers selected the two classes of pupils, those getting milk and those acting as controls, in two different ways. In certain cases they selected them by ballot and in others on an alphabetical system. In any particular school where there was any group to which these methods had given an undue proportion of well-fed or ill-nourished children, others were substituted to obtain a more level selection.
The result of this was that the control group had a markedly greater average height and weight at the start of the experiment than did the milk group. Student interpreted this as follows:
Presumably this discrimination in height and weight was not made deliberately, but it would seem probable that the teachers, swayed by the very human feeling that the poorer children needed the milk more than the comparatively well to do, must have unconsciously made too large a substitution for the ill-nourished among the (milk group) and too few among the controls and that this unconscious selection affected secondarily, both measurements.
Whether the bias was conscious or not, it spoiled the experiment, despite being from the best possible motives.
There is one non-random method which can be used successfully in clinical trials: minimization. In this method, new subjects are allocated to treatments so as to make the treatment groups as similar as possible in terms of the important prognostic factors. It is beyond the scope of this book, but see Pocock (1983) for a description.
2.4 Volunteer bias
People who volunteer for new treatments and those who refuse them may be very different. An illustration is provided by the field trial of Salk poliomyelitis vaccine carried out in 1954 in the USA (Meier 1977). This was carried out using two different designs simultaneously, due to a dispute about the correct method. In some districts, second grade school children were invited to participate in the trial, and randomly allocated to receive vaccine or an inert saline injection. In other districts, all second grade children were offered vaccination and the first and third grade left unvaccinated as controls. The argument against this ‘observed control’ approach was that the groups may not be comparable, whereas the argument against the randomized control method was that the saline injection could provoke paralysis in infected children. The results are shown in Table 2.8. In the randomized control areas the vaccinated group clearly experienced far less polio than the control group. Since these were randomly allocated, the only difference between them should be the treatment, which is clearly preferable to saline. However, the control group also had more polio than those who had refused to participate in the trial. The difference between the control and not inoculated group is in both treatment (saline injection) and selection; they are self-selected as volunteers and refusers. The observed control areas enable us to distinguish between these two factors. The polio rates in the vaccinated children are very similar in both parts of the study, as are the rates in the not inoculated second grade children. It is the two control groups which differ. These were selected in different ways: in the randomized control areas they were volunteers, whereas in the observed control areas they were everybody eligible, both potential volunteers and potential refusers.
Now suppose that the vaccine were saline instead, and that the randomized vaccinated children had the same polio experience as those receiving saline. We would expect 200745 × 57/100000 = 114 cases, instead of the 33 observed. The total number of cases in the randomized areas would be 114 + 115 + 121 = 350 and the rate per 100000 would be 47. This compares very closely with the rate of 46 in the observed control first and third grade group. Thus it seems that the principal difference between the saline control group of volunteers and the not inoculated group of refusers is selection, not treatment.
There is a simple explanation of this. Polio is a viral disease transmitted by the faecal-oral route. Before the development of vaccine almost everyone in the population was exposed to it at some time, usually in childhood. In the majority of cases, paralysis does not result and immunity is conferred without the child being aware of having been exposed to polio. In a small minority of cases, about 1 in 200, paralysis or death occurs and a diagnosis of polio is made. The older the exposed individual is, the greater the chance of paralysis developing. Hence, children who are protected from infection by high standards of hygiene are likely to be older when they are first exposed to polio than those children from homes with low standards of hygiene, and thus more likely to develop the clinical disease. There are many factors which may influence parents in their decision as to whether to volunteer or refuse their child for a vaccine trial. These may include education, personal experience, current illness, and others, but certainly include interest in health and hygiene. Thus in this trial the high risk children tended to be volunteered and the low risk children tended to be refused. The higher risk volunteer control children experienced 57 cases of polio per 100000, compared to 36 per 100000 among the lower risk refusers.
Table 2.8. Result of the field trial of Salk poliomyelitis vaccine (Meier 1977)

Study group                    Number in group    Paralytic polio
                                                  Number of cases    Rate per 100000
Randomized control
  Vaccinated                   200745             33                 16
  Control                      201229             115                57
  Not inoculated               338778             121                36
Observed control
  Vaccinated 2nd grade         221998             38                 17
  Control 1st and 3rd grade    725173             330                46
  Unvaccinated 2nd grade       123605             43                 35
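The 'what if the vaccine were saline' arithmetic of this section can be checked directly from the counts in Table 2.8. The following Python sketch is offered only as an illustration; the variable and function names are my own, and the rounding to whole cases and whole rates follows the rounding used in the text.

```python
# Counts from Table 2.8 (Meier 1977): (number in group, cases of paralytic polio)
randomized = {
    "vaccinated": (200745, 33),
    "control": (201229, 115),
    "not_inoculated": (338778, 121),
}
observed_control_1st_3rd = (725173, 330)


def rate_per_100000(number, cases):
    """Cases of paralytic polio per 100000 children."""
    return 100000 * cases / number


# Rate in the saline controls, rounded as in the table (57 per 100000).
control_rate = round(rate_per_100000(*randomized["control"]))

# Cases expected among the vaccinated if the vaccine had been saline.
expected_cases = round(randomized["vaccinated"][0] * control_rate / 100000)

# Whole randomized-area population under that supposition.
total_cases = expected_cases + randomized["control"][1] + randomized["not_inoculated"][1]
total_children = sum(number for number, _ in randomized.values())
supposed_rate = round(rate_per_100000(total_children, total_cases))

observed_rate = round(rate_per_100000(*observed_control_1st_3rd))
print(expected_cases, total_cases, supposed_rate, observed_rate)
```

The supposed rate for all children in the randomized areas and the observed rate in the first and third grade controls come out within one case per 100000 of each other, which is the comparison the argument rests on.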
In most diseases, the effect of volunteer bias is opposite to this. Poor conditions are related both to refusal to participate and to high risk, whereas volunteers tend to be low risk. The effect of volunteer bias is then to produce an apparent difference in favour of the treatment. We can see that comparisons between volunteers and other groups can never be reliable indicators of treatment effects.
2.5 Intention to treat
In the observed control areas of the Salk trial (Table 2.8), quite apart from the non-random age difference, the vaccinated and control groups are not comparable. However, it is possible to make a reasonable comparison in this study by comparing all second grade children, both vaccinated and refused, to the control group. The rate in the second grade children is 23 per 100000, which is less than the rate of 46 in the control group, demonstrating the effectiveness of the vaccine. The ‘treatment’ which we are evaluating is not vaccination itself, but a policy of offering vaccination and treating those who accept. A similar problem can arise in a randomized trial, for example in evaluating the effectiveness of health checkups (South-east London Screening Study Group 1977). Subjects were randomized to a screening group or to a control group. The screening group were invited to attend for an examination; some accepted and were screened and some refused. When comparing the results in terms of subsequent mortality, it was essential to compare the controls to the screening groups containing both screened and refusers. For example, the refusers may have included people who were already too ill to come for screening. The important point is that the random allocation procedure produces comparable groups and it is these we must compare, whatever selection may be made within them. We therefore analyse the data according to the way we intended to treat subjects, not the way in which they were actually treated. This is analysis by intention to treat. The alternative, analysing by treatment actually received, is called on treatment analysis.
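The rate of 23 per 100000 quoted for all second grade children can be reproduced from the observed control rows of Table 2.8. This is a small Python sketch with names of my own choosing; it contrasts the intention to treat comparison (everyone offered vaccine) with the on treatment comparison (only those actually vaccinated).

```python
# Observed control areas, Table 2.8: (number in group, cases of paralytic polio)
vaccinated_2nd_grade = (221998, 38)
unvaccinated_2nd_grade = (123605, 43)
control_1st_3rd_grade = (725173, 330)


def rate_per_100000(number, cases):
    return 100000 * cases / number


# Intention to treat: all second grade children, accepters and refusers together.
offered_number = vaccinated_2nd_grade[0] + unvaccinated_2nd_grade[0]
offered_cases = vaccinated_2nd_grade[1] + unvaccinated_2nd_grade[1]
intention_to_treat_rate = rate_per_100000(offered_number, offered_cases)

# On treatment: only those who were actually vaccinated.
on_treatment_rate = rate_per_100000(*vaccinated_2nd_grade)

control_rate = rate_per_100000(*control_1st_3rd_grade)

print(round(intention_to_treat_rate), round(on_treatment_rate), round(control_rate))
```

The intention to treat rate lies between the on treatment rate and the control rate, because it includes the refusers who received no vaccine.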
Analysis by intention to treat is not free of bias. As some patients may receive the other group's treatment, the difference may be smaller than it should be. We know that there is a bias and we know that it will make the treatment difference smaller, by an unknown amount. On treatment analyses, on the other hand, are biased in favour of showing a difference, whether there is one or not. Statisticians call methods which are biased against finding any effect conservative. If we must err, we like to do so in the conservative direction.
2.6 Cross-over designs
Sometimes it is possible to use a subject as her or his own control. For example, when comparing analgesics in the treatment of arthritis, patients may receive in succession a new drug and a control treatment. The response to the two treatments can then be compared for each patient. These designs have the advantage of removing variability between subjects. We can carry out a trial with fewer subjects than would be needed for a two group trial.
Although all subjects receive all treatments, these trials must still be randomized. In the simplest case of treatment and control, patients may be given two different regimes: control followed by treatment or treatment followed by control. These may not give the same results, e.g. there may be a long-term carry-over effect or time trend which makes treatment followed by control show less of a difference than control followed by treatment. Subjects are, therefore, assigned to a given order at random. It is possible in the analysis of cross-over studies to estimate the size of any carry-over effects which may be present.
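Assigning subjects to an order at random can be sketched in a few lines of Python. This is an illustration under simple assumptions, not a description of any trial cited here: each patient is given one of the two orders independently with equal probability, and the function name `assign_orders`, the patient numbers, and the seed are hypothetical.

```python
import random

ORDERS = (("control", "treatment"), ("treatment", "control"))


def assign_orders(patient_ids, seed=None):
    """Give each patient one of the two treatment orders at random.

    Hypothetical helper: simple randomization, so the two orders need not
    be used equally often.
    """
    rng = random.Random(seed)
    return {patient: rng.choice(ORDERS) for patient in patient_ids}


orders = assign_orders(range(1, 13), seed=1963)
for patient, (first, second) in orders.items():
    print(patient, first, "then", second)
```

If equal numbers on each order were wanted, a shuffled list with half the entries for each order could be used instead.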
As an example of the advantages of a cross-over trial, consider a trial of pronethalol in the treatment of angina pectoris (Pritchard et al. 1963). Angina pectoris is a chronic disease characterized by attacks of acute pain. Patients in this trial received either pronethalol or an inert control treatment (or placebo, see §2.8) in four periods of two weeks, two periods on the drug and two on the control treatment. These periods were in random order. The outcome measure was the number of attacks of angina experienced. These were recorded by the patient in a diary. Twelve patients took part in the trial. The results are shown in Table 2.9. The advantage in favour of pronethalol is shown by 11 of the 12 patients reporting fewer attacks of pain while on pronethalol than while on the control treatment. If we had obtained the same data from two separate groups of patients instead of the same group under two conditions, it would be far from clear that pronethalol is superior because of the huge variation between subjects. Using a two group design, we would need a much larger sample of patients to demonstrate the efficacy of the treatment.
Table 2.9. Results of a trial of pronethalol for the treatment of angina pectoris (Pritchard et al. 1963)

Patient    Number of attacks while on    Difference
number     Placebo      Pronethalol      placebo - pronethalol
1          71           29               42
2          323          348              -25
3          8            1                7
4          14           7                7
5          23           16               7
6          34           25               9
7          79           65               14
8          60           41               19
9          2            0                2
10         3            0                3
11         17           15               2
12         7            2                5
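The point about variation between subjects can be seen by working with the within-patient differences in Table 2.9. The Python sketch below is only an illustration, using the standard `statistics` module; the variable names are my own. It recomputes the difference column, counts the patients who had fewer attacks on pronethalol, and sets the mean difference beside the spread of attack counts between patients.

```python
from statistics import mean, stdev

# Number of attacks for patients 1 to 12, from Table 2.9 (Pritchard et al. 1963)
placebo = [71, 323, 8, 14, 23, 34, 79, 60, 2, 3, 17, 7]
pronethalol = [29, 348, 1, 7, 16, 25, 65, 41, 0, 0, 15, 2]

# Each patient acts as her or his own control: placebo minus pronethalol.
differences = [p - d for p, d in zip(placebo, pronethalol)]

# Patients with fewer attacks while on pronethalol.
fewer_on_drug = sum(diff > 0 for diff in differences)

mean_difference = mean(differences)
spread_between_patients = stdev(placebo)  # standard deviation on placebo

print(differences)
print(fewer_on_drug, "of", len(differences), "patients had fewer attacks on pronethalol")
print(mean_difference, spread_between_patients)
```

The spread between patients is many times the mean within-patient difference, which is why the same figures from two separate groups would be far less convincing.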
Cross-over designs can be useful for laboratory experiments on animals or human volunteers. They should only be used in clinical trials where the treatment will not affect the course of the disease and where the patient's condition would not change appreciably over the course of the trial. A cross-over trial could be used to compare different treatments for the control of arthritis or asthma, for example, but not to compare different regimes for the management of myocardial infarction. However, a cross-over trial cannot be used to demonstrate the long-term action of a treatment, as the nature of the design means that the treatment period must be limited. As most treatments of chronic disease must be used by the patient for a long time, a two sample trial of long duration is usually required to investigate fully the effectiveness of the treatment. Pronethalol, for example, was later found to have quite unacceptable side effects in long term use.
For more on cross-over trials, see Senn (1993) and Jones and Kenward (1989).
2.7 Selection of subjects for clinical trials
I have discussed the allocation of subjects to treatments at some length, but we have not considered where they come from. The way in which subjects are selected for an experiment may have an effect on its outcome. In practice, we are usually limited to subjects which are easily available to us. For example, in an animal experiment we must take the latest batch from the animal house. In a clinical trial of the treatment of myocardial infarction, we must be content with patients who are brought into the hospital. In experiments on human volunteers we sometimes have to use the researchers themselves.
As we shall see more fully in Chapter 3, this has important consequences for the interpretation of results. In trials of myocardial infarction, for example, we would not wish to conclude that, say, the survival rate with a new treatment in a trial in London would be the same as in a trial in Edinburgh. The patients may have a different history of diet, for example, and this may have a considerable effect on the state of their arteries and hence on their prognosis. Indeed, it would be very rash to suppose that we would get the same survival rate in a hospital a mile down the road. What we rely on is the comparison between randomized groups from the same population of subjects, and hope that if a treatment reduces mortality in London it will also do so in Edinburgh. This may be a reasonable supposition, and effects which appear in one population are likely to appear in another, but it cannot be proved on statistical grounds alone. Sometimes in extreme cases it turns out not to be true. BCG vaccine has been shown, by large, well conducted randomized trials, to be effective in reducing the incidence of tuberculosis in children in the UK. However, in India it appears to be far less effective (Lancet 1980). This may be because the amount of exposure to tuberculosis is so different in the two populations.
Given that we can use only the experimental subjects available to us, there are some principles which we use to guide our selection from them. As we shall see later, the lower the variability between the subjects in an experiment is, the better chance we have of detecting a treatment difference if it exists. This means that uniformity is desirable in our subjects. In an animal experiment this can be achieved by using animals of the same strain raised under controlled conditions. In a clinical trial we usually restrict our attention to patients of a defined age group and severity of disease. The Salk vaccine trial (§2.4) only used children in one school year. In the streptomycin trial (§2.2) the subjects were restricted to patients with acute bilateral pulmonary tuberculosis, bacteriologically proved, aged between 15 and 30 years, and unsuitable for other current therapy. Even with this narrow definition there was considerable variation among the patients, as Tables 2.5 and 2.6 show. Tuberculosis had to be bacteriologically proved because it is important to make sure that everyone has the disease we wish to treat. Patients with a different disease are not only potentially being wrongly treated themselves, but may make the results difficult to interpret. Restricting attention to a particular subset of patients, though useful, can lead to difficulties. For example, a treatment shown to be effective and safe in young people may not necessarily be so in the elderly. Trials have to be carried out on the sort of patients it is proposed to treat.
2.8 Response bias and placebos
The knowledge that she or he is being treated may alter a patient's response to treatment. This is called the placebo effect. A placebo is a pharmacologically inactive treatment given as if it were an active treatment. This effect may take many forms, from a desire to please the doctor to measurable biochemical changes in the brain. Mind and body are intimately connected, and unless the psychological effect is actually part of the treatment we usually try to eliminate such factors from treatment comparisons. This is particularly important when we are dealing with subjective assessments, such as of pain or well-being.
Fig. 2.1. Pain relief in relation to drug and to colour of placebo (after Huskisson 1974)
A fascinating example of the power of the placebo effect is given by Huskisson (1974). Three active analgesics, aspirin, Codis and Distalgesic, were compared with an inert placebo. Twenty-two patients each received the four treatments in a cross-over design. The patients reported pain relief on a four point scale, from 0 = no relief to 3 = complete relief. All the treatments produced some pain relief, maximum relief being experienced after about two hours (Figure 2.1). The three active treatments were all superior to placebo, but not by very much. The four drug treatments were given in the form of tablets identical in shape and size, but each drug was given in four different colours. This was done so that patients could distinguish the drugs received, to say which they preferred. Each patient received four different colours, one for each drug, and the colour combinations were allocated randomly. Thus some patients received red placebos, some blue, and so on. As Figure 2.1 shows, red placebos were markedly more effective than other colours, and were just as effective as the active drugs! In this study not only is the effect of a pharmacologically inert placebo in producing reported pain relief demonstrated, but also the wide variability and unpredictability of this response. We must clearly take account of this in trial design. Incidentally, we should not conclude that red placebos always work best. There is, for example, some evidence that patients being treated for anxiety prefer tablets to be in a soothing green, and depressive symptoms respond best to a lively yellow (Schapira et al. 1970).
In any trial involving human subjects it is desirable that the subjects should not be able to tell which treatment is which. In a study to compare two or more treatments this should be done by making the treatments as similar as possible. Where subjects are to receive no treatment an inactive placebo should be used if possible. Sometimes when two very different active treatments are compared a double placebo or double dummy can be used. For example, when comparing a drug given as a single dose with a drug taken daily for seven days, subjects on the single dose drug may receive a daily placebo and those on the daily dose a single placebo at the start.
Placebos are not always possible or ethical. In the MRC trial of streptomycin, where the treatment involved several injections a day for several months, it was not regarded as ethical to do the same with an inert saline solution and no placebo was given. In the Salk vaccine trial, the inert saline injections were placebos. It could be argued that paralytic polio is not likely to respond to psychological influences, but how could we be really sure of this? The certain knowledge that a child had been vaccinated may have altered the risk of exposure to infection as parents allowed the child to go swimming, for example. Finally, the use of a placebo may also reduce the risk of assessment bias as we shall see in §2.9.
2.9 Assessment bias and double blind studies
The response of subjects is not the only thing affected by knowledge of the treatment. The assessment by the researcher of the response to treatment may also be influenced by the knowledge of the treatment.
Some outcome measures do not allow for much bias on the part of the assessor. For example, if the outcome is survival or death, there is little possibility that unconscious bias may affect the observation. However, if we are interested in an overall clinical impression of the patient's progress, or in changes in an X-ray picture, the measurement may be influenced by our desire (or otherwise) that the treatment should succeed. It is not enough to be aware of this danger and allow for it, as we may have the similar problem of ‘bending over backwards to be fair’. Even such an apparently objective measure as blood pressure can be influenced by the expectations of the experimenter, and special measuring equipment has been devised to avoid this (Rose et al. 1964).
We can avoid the possibility of such bias by using blind assessment, that is, the assessor does not know which treatment the subject is receiving. If a clinical trial cannot be conducted in such a way that the clinician in charge does not know the treatment, blind assessment can still be carried out by an external assessor. When the subject does not know the treatment and blind assessment is used, the trial is said to be double blind. (Researchers on eye disease hate the terms ‘blind’ and ‘double blind’, preferring ‘masked’ and ‘double masked’ instead.)
Placebos may be just as useful in avoiding assessment bias as for response bias. The subject is unable to tip the assessor off as to treatment, and there is likely to be less material evidence to indicate to an assessor what it is. In the anticoagulant study by Carleton et al. (1960) described above, the treatment was supplied through an intravenous drip. Control patients had a dummy drip set up, with a tube taped to the arm but no needle inserted, primarily to avoid assessment bias. In the Salk trial, the injections were coded and the code for a case was only broken after the decision had been made as to whether the child had polio and if so of what severity.
In the streptomycin trial, one of the outcome measures was radiological change. X-ray plates were numbered and then assessed by two radiologists and a clinician, none of whom knew to which patient and treatment the plate belonged. The assessment was done independently, and they only discussed a plate if they had not all come to the same conclusion. Only when a final decision had been arrived at was the link between plate and patient made. The results are shown in Table 2.10. The clear advantage of streptomycin is shown in the considerable improvement of over half the S group, compared to only 8% of the controls.
Table 2.10. Assessment of radiological appearance at six months as compared with appearance on admission (MRC 1948)

Radiological assessment             S Group        C Group
Considerable improvement            28    51%      4     8%
Moderate or slight improvement      10    18%      13    25%
No material change                  2     4%       3     6%
Moderate or slight deterioration    5     9%       12    23%
Considerable deterioration          6     11%      6     11%
Deaths                              4     7%       14    27%
Total                               55    100%     52    100%
2.10* Laboratory experiments
So far we have looked at clinical trials, but exactly the same principles apply to laboratory research on animals. It may well be that in this area the principles of randomization are not so well understood and even more critical attention is needed from the reader of research reports. One reason for this may be that great effort has been put into producing genetically similar animals, raised in conditions as close to uniform as is practicable. The researcher using such animals as subjects may feel that the resulting animals show so little biological variability that any natural differences between them will be dwarfed by the treatment effects. This is not necessarily so, as the following examples illustrate.
A colleague was looking at the effect of tumour growth on macrophage counts in rats. The only significant difference was between the initial values in tumour induced and non-induced rats, that is, before the tumour-inducing treatment was given. There was a simple explanation for this surprising result. The original design had been to give the tumour-inducing treatment to each of a group of rats. Some would develop tumours and others would not, and then the macrophage counts would be compared between the two groups thus defined. In the event, all the rats developed tumours. In an attempt to salvage the experiment my colleague obtained a second batch of animals, which he did not treat, to act as controls. The difference between the treated and untreated animals was thus due to differences in parentage or environment, not to treatment.
That problem arose by changing the design during the course of the experiment. Problems can also arise from ignoring randomization in the design of a comparative experiment. Another colleague wanted to know whether a treatment would affect weight gain in mice. Mice were taken from a cage one by one and the treatment given, until half the animals had been treated. The treated animals were put into smaller cages, five to a cage, which were placed together in a constant environment chamber. The control mice were in cages also placed together in the constant environment chamber. When the data were analysed, it was discovered that the mean initial weight was greater in the treated animals than in the control group. In a weight gain experiment this could be quite important! Perhaps larger animals were easier to pick up, and so were selected first. What that experimenter should have done was to place the mice in the boxes, give each box a place in the constant environment chamber, then allocate the boxes to treatment or control at random. We would then have two groups which were comparable, both in initial values and in any environmental differences which may exist in the constant environment chamber.
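The allocation recommended here, with boxes rather than individual mice assigned at random, might be sketched in Python as follows. The function name `allocate_boxes`, the eight box numbers, and the seed are hypothetical and chosen only for illustration; the mice are assumed to be already in their boxes and the boxes already in their places in the chamber.

```python
import random


def allocate_boxes(box_ids, seed=None):
    """Randomly split the boxes into equal treatment and control groups.

    Hypothetical helper: the box, not the mouse, is what gets allocated.
    With an odd number of boxes the control group gets the extra one.
    """
    rng = random.Random(seed)
    boxes = list(box_ids)
    rng.shuffle(boxes)
    half = len(boxes) // 2
    return {"treatment": sorted(boxes[:half]), "control": sorted(boxes[half:])}


# Eight boxes of five mice each, numbered by their position in the chamber.
allocation = allocate_boxes(range(1, 9), seed=1)
print(allocation)
```

Because every box has the same chance of receiving the treatment, neither the ease of picking up a mouse nor the position of a box in the chamber can favour one group.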
2.11* Experimental units
In the weight gain experiment described above, each box of mice contained five animals. These animals were not independent of one another, but interacted. In a box the other four animals formed part of the environment of the fifth, and so might influence its growth. The box of five mice is called an experimental unit. An experimental unit is the smallest group of subjects in an experiment whose response cannot be affected by other subjects. We need to know the amount of natural variation which exists between experimental units before we can decide whether the treatment effect is distinguishable from this natural variation. In the weight gain experiment, the mean weight gain in each box should be calculated, and the mean difference estimated using the two-sample t method (§10.3). In human studies, the same thing happens when groups of patients, such as all those in a hospital ward or a general practice, are randomized as a group. This might happen in a trial of health promotion, for example, where special clinics are advertised and set up in GP surgeries. It would be impractical to exclude some patients from the clinic and impossible to prevent patients from the practice interacting with and influencing one another. All the practice patients must be treated as a single unit. Trials where experimental units contain more than one subject are called cluster randomized.
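Analysing by experimental unit means reducing each box to a single number before any comparison is made. The Python sketch below uses invented weight gains (in grams) for four treated and four control boxes of five mice, purely to show the arithmetic; none of the figures comes from a real experiment. The pooled-variance t statistic is written out by hand with the standard `statistics` module, and the function name `two_sample_t` is my own.

```python
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL weight gains (g): one inner list per box of five mice.
treated_boxes = [[12, 14, 11, 13, 15], [10, 12, 13, 11, 14],
                 [15, 16, 14, 13, 17], [11, 13, 12, 14, 10]]
control_boxes = [[9, 11, 10, 12, 8], [10, 9, 11, 12, 13],
                 [8, 10, 9, 7, 11], [12, 11, 10, 13, 9]]

# Step 1: one observation per experimental unit, the mean gain in each box.
treated_means = [mean(box) for box in treated_boxes]
control_means = [mean(box) for box in control_boxes]


def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    standard_error = sqrt(pooled * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / standard_error, nx + ny - 2


# Step 2: compare the box means, not the 40 individual mice.
t_statistic, degrees_of_freedom = two_sample_t(treated_means, control_means)
print(treated_means, control_means)
print(t_statistic, degrees_of_freedom)
```

The degrees of freedom are based on the eight boxes, not the forty mice; treating the mice as independent would overstate the amount of information in the experiment.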
The question of the experimental unit arises when the treatment is applied to the provider of care rather than to the patient directly. For example, White et al. (1989) compared three randomly allocated groups of GPs, the first given an intensive programme of small group education to improve their treatment of asthma, the second a lesser intervention, and the third no intervention at all. For each GP, a sample of her or his asthmatic patients was selected. These patients received questionnaires about their symptoms, the research hypothesis being that the intensive programme would result in fewer symptoms among their patients. The experimental unit was the GP, not the patient. The asthma patients treated by an individual GP were used to monitor the effect of the intervention on that GP. The proportion of patients who reported symptoms was used as a measure of the GP's effectiveness, and the mean of these proportions was compared between the groups using one-way analysis of variance (§10.9). Another example would be a trial of population screening for a disease (§15.3), where screening centres were set up in some health districts and not in others. We should find the mortality rate for each district separately and then compare the mean rate in the group of screening districts with that in the group of control districts.
The most extreme case arises when there is only one experimental unit per treatment. For example, consider a health education experiment involving two schools. In one school a special health education programme was mounted, aimed to discourage children from smoking. Both before and afterwards, the children in each school completed questionnaires about cigarette smoking. In this example the school is the experimental unit. There is no reason to suppose that two schools should have the same proportion of smokers among their pupils, or that two schools which do have equal proportions of smokers will remain so. The experiment would be much more convincing if we had several schools and randomly allocated them to receive the health education programme or to be controls. We would then look for a consistent difference between the treated and control schools, using the proportion of smokers in the school as the variable.
2.12* Consent in clinical trials
I started my research career in agriculture. Our experimental subjects, being barley plants, had no rights. We sprayed them with whatever chemicals we chose and burnt them after harvest and weighing. We cannot treat human subjects in the same way. We must respect the rights of our research subjects and their welfare must be our primary concern. This has not always been the case, most notoriously in the Nazi death camps (Leaning 1996). The Declaration of Helsinki (BMJ 1996a), which lays down the principles which govern research on human subjects, grew out of the trials in Nuremberg of the perpetrators of these atrocities (BMJ 1996b).
If there is a treatment, we should not leave patients untreated if this in any way affects their well-being. The world was rightly outraged by the Tuskegee Study, where men with syphilis were left untreated to see what the long-term effects of the disease might be (Brawley 1998, Ramsay 1998). This is an extreme example but it is not the only one. Women with dysplasia found at cervical cytology have been left untreated to see whether cancer developed (Mudur 1997). Patients are still being asked to enter pharmaceutical trials where they may get a placebo, even though an effective treatment is available, allegedly because regulators insist on it.
People should not be treated without their consent. This general principle is not confined to research. Patients should also be asked whether they wish to take part in a research project and whether they agree to be randomized. They should know to what they are consenting, and usually recruits to clinical trials are given information sheets which explain to them randomization, the alternative treatments, and the possible risks and benefits. Only then can they give informed or valid consent. For children who are old enough to understand, both child and parent should be informed and give their consent, otherwise parents must consent (Doyal 1997). People get very upset and angry if they think that they have been experimented on without their knowledge and consent, or if they feel that they have been tricked into it without being fully informed. A group of women with cervical cancer were given an experimental radiation treatment, which resulted in severe damage, without proper information (Anon 1997). They formed a group which they called RAGE, which speaks for itself.
Patients are sometimes recruited into trials when they are very distressed and very vulnerable. If possible they should have time to think about the trial and discuss it with their family. Patients in trials are often not at all clear about what is going on and have wrong ideas about what is happening (Snowdon et al. 1997). They may be unable to recall giving their consent, and deny having given it. They should always be asked to sign consent forms and should be given a separate patient information sheet and a copy of the form to keep.
A difficulty arises with the randomized consent design (Zelen 1979, 1992). In this, we have a new, active treatment and either no control treatment or usual care. We randomize subjects to active or control. We then offer the new treatment to the active group, who may refuse, and the control group gets usual care. The active group is asked to consent to the new treatment and all subjects are asked to consent to any measurement required. They might be told that they are in a research study, but not that they have been randomized. Thus only patients in the active group can refuse the trial, though all can refuse measurement. Analysis is then by intention to treat (§2.5). For example, Dennis et al. (1997) wanted to evaluate a stroke family care worker. They randomized patients without their knowledge, then asked them to consent to follow-up consisting of interviews by a researcher. The care worker visited those patients and their families who had been randomized to her. McLean (1997) argued that if patients could not be informed about the randomization without jeopardizing the trial, the research should not be done. Dennis (1997) argued that to ask for consent to randomization might bias the results, because patients who did not receive the care worker might be resentful and be harmed by this. My own view is that we should not allow one ethical consideration, informed consent, to outweigh all others and this design can be acceptable (Bland 1997).
There is a special problem in cluster randomized trials. Patients cannot consent to randomization, but only to treatment. In a trial where general practices are allocated to offer health checks, for example, patients can consent to the health checks only if they are in a health check practice, though all would have to consent to an end of trial assessment.
Research on human subjects should always be approved by an independent ethics committee, whose role is to represent the interests of the research subject. Where such a system is not in place, terrible things can happen. In the USA, research can be carried out without ethical approval if the subjects are private patients in a private hospital without any public funding, and no new drug or device is used. Under these circumstances, plastic surgeons carried out a trial comparing two methods of performing face-lifts, one on each side of the face, without patients' consent (Bulletin of Medical Ethics 1998).
2M Multiple choice questions 1 to 6
(Each branch is either true or false)
1. When testing a new medical treatment, suitable control groups include patients who:
(a) are treated by a different doctor at the same time;
(b) are treated in a different hospital;
(c) are not willing to receive the new treatment;
(d) were treated by the same doctor in the past;
(e) are not suitable for the new treatment.
2. In an experiment to compare two treatments, subjects are allocated using random numbers so that:
(a) the sample may be referred to a known population;
(b) when deciding to admit a subject to the trial, we do not know which treatment that subject would receive;
(c) the subjects will get the treatment best suited to them;
(d) the two groups will be similar, apart from treatment;
(e) treatments may be assigned according to the characteristics of the subject.
3. In a double blind clinical trial:
(a) the patients do not know which treatment they receive;
(b) each patient receives a placebo;
(c) the patients do not know that they are in a trial;
(d) each patient receives both treatments;
(e) the clinician making assessment does not know which treatment the patient receives.
4. In a trial of a new vaccine, children were assigned at random to a ‘vaccine’ and a ‘control’ group. The ‘vaccine’ group were offered vaccination, which two-thirds accepted. The control group were offered nothing:
(a) the group which should be compared to the controls is all children who accepted vaccination;
(b) those refusing vaccination should be included in the control group;
(c) the trial is double blind;
(d) those refusing vaccination should be excluded;
(e) the trial is useless because not all the treated group were vaccinated.
Table 2.11. Method of delivery in the KYM study
Method of delivery    Accepted KYM      Refused KYM       Control women
                      %       n         %       n         %       n
Normal                80.7    352       69.8    30        74.8    354
Instrumental          12.4    54        14.0    6         17.8    84
Caesarian             6.9     30        16.3    7         7.4     35
5. Cross-over designs for clinical trials:
(a) may be used to compare several treatments;
(b) involve no randomization;
(c) require fewer patients than do designs comparing independent groups;
(d) are useful for comparing treatments intended to alleviate chronic symptoms;
(e) use the patient as his own control.
6. Placebos are useful in clinical trials:
(a) when two apparently similar active treatments are to be compared;
(b) to guarantee comparability in non-randomized trials;
(c) because the fact of being treated may itself produce a response;
(d) because they may help to conceal the subject's treatment from assessors;
(e) when an active treatment is to be compared to no treatment.
2E Exercise: The ‘Know Your Midwife’ trial
The Know Your Midwife (KYM) scheme was a method of delivering maternity care for low-risk women. A team of midwives ran a clinic, and the same midwife would give all antenatal care for a mother, deliver the baby, and give postnatal care. The KYM scheme was compared to standard antenatal care in a randomized trial (Flint and Poulengeris 1986). It was thought that the scheme would be very attractive to women and that if they knew it was available they might be reluctant to be randomized to standard care. Eligible women were randomized without their knowledge to KYM or to the control group, who received the standard antenatal care provided by St. George's Hospital. Women randomized to KYM were sent a letter explaining the KYM scheme and inviting them to attend. Some women declined and attended the standard clinic instead. The mode of delivery for the women is shown in Table 2.11. Normal obstetric data were recorded on all women, and the women were asked to complete questionnaires (which they could refuse) as part of a study of antenatal care, though they were not told about the trial.
1. The women knew what type of care they were receiving. What effect might this have on the outcome?
2. What comparison should be made to test whether KYM has any effect on method of delivery?
3. Do you think it was ethical to randomize women without their knowledge?
3 Sampling and observational studies
3.1 Observational studies
In this chapter we shall be concerned with observational studies. Instead of changing something and observing the result, as in an experiment or clinical trial, we observe the existing situation and try to understand what is happening. Most medical studies are observational, including research into human biology in healthy people, the natural history of disease, the causes and distribution of disease, the quality of measurement, and the process of medical care.
One of the most important and difficult tasks in medicine is to determine the causes of disease, so that we may devise methods of prevention. We are working in an area where experiments are often neither possible nor ethical. For example, to determine that cigarette smoking caused cancer, we could imagine a study in which children were randomly allocated to a ‘twenty cigarettes a day for fifty years’ group and a ‘never smoke in your life’ group. All we would have to do then would be to wait for the death certificates. However, we could not persuade our subjects to stick to the treatment and deliberately setting out to cause cancer is hardly ethical. We must therefore observe the disease process as best we can, by watching people in the wild rather than under laboratory conditions.
We can never come to an unequivocal conclusion about causation in observational studies. The disease effect and possible cause do not exist in isolation but in a complex interplay of many intervening factors. We must do our best to assure ourselves that the relationship we observe is not the result of some other factor acting on both ‘cause’ and ‘effect’. For example, it was once thought that the African fever tree, the yellow-barked acacia, caused malaria, because those unwise enough to camp under them were likely to develop the disease. This tree grows by water where mosquitos breed, and provides an ideal day-time resting place for these insects, whose bite transmits the plasmodium parasite which produces the disease. It was the water and the mosquitos which were the important factors, not the tree. Indeed, the name ‘malaria’ comes from a similar incomplete observation. It means ‘bad air’ and comes from the belief that the disease was caused by the air in low-lying, marshy places, where the mosquitos bred. Epidemiological study designs must try to deal with the complex interrelationships between different factors in order to deduce the true mechanism of disease causation. We also use a number of different approaches to the study of these problems, to see whether all produce the same answer.
There are many problems in interpreting observational studies, and the medical consumer of such research must be aware of them. We have no better way to tackle many questions and so we must make the best of them and look for consistent relationships which stand up to the most severe examination. We can also look for confirmation of our findings indirectly, from animal models and from dose-response relationships in the human population. However, we must accept that perfect proof is impossible and it is unreasonable to demand it. Sometimes, as with smoking and health, we must act on the balance of the evidence.
We shall start by considering how to get descriptive information about populations in which we are interested. We shall go on to the problem of using such information to study disease processes and the possible causes of disease.
3.2 Censuses
One simple question we can ask about any group of interest is how many members it has. For example, we need to know how many people live in a country and how many of them are in various age and sex categories, in order to monitor the changing pattern of disease and to plan medical services. We can obtain this information by a census. In a census, the whole of a defined population is counted. In the United Kingdom, as in many developed countries, a population census is held every ten years. This is done by dividing the entire country into small areas called enumeration districts, usually containing between 100 and 200 households. It is the responsibility of an enumerator to identify every household in the district and ensure that a census form is completed, listing all members of the household and providing a few simple pieces of information. Even though completion of the census form is compelled by law, and enormous effort goes into ensuring that every household is included, there are undoubtedly some who are missed. The final data, though extremely useful, are not totally reliable.
The medical profession takes part in a massive, continuing census of deaths, by providing death certificates for each death which occurs, including not only the name of the deceased and cause of death, but also details of age, sex, place of residence and occupation. Census methods are not restricted to national populations. They can be used for more specific administrative purposes too. For example, we might want to know how many patients are in a particular hospital at a particular time, how many of them are in different diagnostic groups, in different age/sex groups, and so on. We can then use this information together with estimates of the death and discharge rates to estimate how many beds these patients will occupy at various times in the future (Bewley et al. 1975, 1981).
3.3 Sampling
A census of a single hospital can only give us reliable information about that hospital. We cannot easily generalize our results to hospitals in general. If we want to obtain information about the hospitals of the United Kingdom, two courses are open to us: we can study every hospital, or we can take a representative sample of hospitals and use that to draw conclusions about hospitals as a whole.
Most statistical work is concerned with using samples to draw conclusions about some larger population. In the clinical trials described in Chapter 2, the patients act as a sample from a larger population consisting of all similar patients and we do the trial to find out what would happen to this larger group were we to give them a new treatment.
The word ‘population’ is used in common speech to mean ‘all the people living in an area’, frequently of a country. In statistics, we define the term more widely. A population is any collection of individuals in which we may be interested, where these individuals may be anything, and the number of individuals may be finite or infinite. Thus, if we are interested in some characteristics of the British people, the population is ‘all people in Britain’. If we are interested in the treatment of diabetes the population is ‘all diabetics’. If we are interested in the blood pressure of a particular patient, the population is ‘all possible measurements of blood pressure in that patient’. If we are interested in the toss of two coins, the population is ‘all possible tosses of two coins’. The first two examples are finite populations and could in theory if not practice be completely examined; the second two are infinite populations and could not. We could only ever look at a sample, which we will define as being a group of individuals taken from a larger population and used to find out something about that population.
How should we choose a sample from a population? The problem of getting a representative sample is similar to that of getting comparable groups of patients discussed in §2.1, 2, 3. We want our sample to be representative, in some sense, of the population. We want it to have all the characteristics in terms of the proportions of individuals with particular qualities as has the whole population. In a sample from a human population, for example, we want the sample to have about the same proportion of men and women as in the population, the same proportions in different age groups, in occupational groups, with different diseases, and so on. In addition, if we use a sample to estimate the proportion of people with a disease, we want to know how reliable this estimate is, how far from the proportion in the whole population the estimate is likely to be.
It is not sufficient to choose the most convenient group. For example, if we wished to predict the results of an election, we would not take as our sample people waiting in bus queues. These may be easy to interview, at least until the bus comes, but the sample would be heavily biased towards those who cannot afford cars and thus towards lower income groups. In the same way, if we wanted a sample of medical students we would not take the front two rows of the lecture theatre. They may be unrepresentative in having an unusually high thirst for knowledge, or poor eyesight.
How can we choose a sample which does not have a built-in bias? We might divide our population into groups, depending on how we think various characteristics will affect the result. To ask about an election, for example, we might group the population according to age, sex and social class. We then choose a number of people in each group by knocking on doors until the quota is made up, and interview them. Then, knowing the distributions of these categories in the population (from census data, etc.) we can get a far better picture of the views of the population. This is called quota sampling. In the same way we could try to choose a sample of rats by choosing given numbers of each weight, age, sex, etc. There are difficulties with this approach. First, it is rarely possible to think of all the relevant classifications. Second, it is still difficult to avoid bias within the classifications, by picking interviewees who look friendly, or rats which are easy to catch. Third, we can only get an idea of the reliability of findings by repeatedly doing the same type of survey, and of the representativeness of the sample by knowing the true population values (which we can actually do in the case of elections), or by comparing the results with a sample which does not have these drawbacks. Quota sampling can be quite effective when similar surveys are made repeatedly as in opinion polls or market research. It is less useful for medical problems, where we are continually asking new questions. We need a method where bias is avoided and where we can estimate the reliability of the sample from the sample itself. As in §2.2, we use a random method: random sampling.
3.4 Random sampling
The problem of obtaining a sample which is representative of a larger population is very similar to that of allocating patients into two comparable groups. We want a way of choosing members of the sample which does not depend on their own characteristics. The only way to be sure of this is to select them at random, so that whether or not each member of the population is chosen for the sample is purely a matter of chance.
For example, to take a random sample of 5 students from a class of 80, we could write all the names on pieces of paper, mix them thoroughly in a hat or other suitable container, and draw out five. All students have the same chance of being chosen, and so we have a random sample. All samples of 5 students are equally likely, too, because each student is chosen quite independently of the others. This method is called simple random sampling.
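The hat can be imitated on a computer. The following is only a sketch, assuming the students are labelled 1 to 80 and using Python's standard `random` module; the function name and the seed are illustrative choices, not part of the text.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members so that every subset of size k is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable (illustrative)
    return sorted(rng.sample(population, k))


# Assumed labelling: the 80 students are numbered 1 to 80.
students = list(range(1, 81))
sample = simple_random_sample(students, 5, seed=2000)
print(sample)
```

Drawing without replacement, as `random.sample` does, corresponds to not putting a name back in the hat once it has been drawn.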
As we have seen in §2.2, physical methods of randomizing are often not very suitable for statistical work. We usually use tables of random digits, such as Table 2.3, or random numbers generated by a computer program. We could use Table 2.3 to draw our sample of 5 from 80 students in several ways. For example, we could list the students, numbered from 1 to 80. This list from which the sample is to be drawn is called the sampling frame. We choose a starting point in the random number table (Table 2.3), say row 20, column 5. This gives us the following pairs of digits:
14 04 88 86 28 92 04 03 42 99 87 08
We could use these pairs of digits directly as subject numbers. We choose subjects numbered 14 and 4. There is no subject 88 or 86, so the next chosen is number 28. There is no 92, so the next is 4. We already have this subject in the sample, so we carry on to the next pair of digits, 03. The final member of the sample is number 42. Our sample of 5 students is thus numbers 3, 4, 14, 28 and 42.
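The same procedure can be written out step by step. This is a minimal sketch of the rule just described (skip numbers outside the sampling frame, skip duplicates, stop at five), applied to the digits quoted above; the function name is an illustrative choice.

```python
def sample_from_digit_pairs(digits, population_size, sample_size):
    """Read random digits two at a time and use each pair as a subject number."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        in_frame = 1 <= number <= population_size  # e.g. there is no subject 88
        if in_frame and number not in chosen:      # skip subjects already drawn
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return sorted(chosen)


# Digits read from the random number table, as quoted in the text.
digits = "140488862892040342998708"
sample = sample_from_digit_pairs(digits, population_size=80, sample_size=5)
print(sample)  # [3, 4, 14, 28, 42]
```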
There appears to be some pattern in this sample. Two numbers are adjacent (3 and 4) and three are divisible by 14 (14, 28 and 42). Random numbers often appear to us to have pattern, perhaps because the human mind is always looking for it. On the other hand, if we try to make the sample ‘more random’ by replacing either 3 or 4 by a subject near the end of the list, we are imposing a pattern of uniformity on the sample and destroying its randomness. All groups of five are equally likely and may happen, even 1, 2, 3, 4, 5.
This method of using the table is fine for drawing a small sample, but it can be tedious for drawing large samples, because of the need to check for duplicates. There are many other ways of doing it. For example, we can drop the requirement for a sample of fixed size, and only require that each member of the population will have a fixed probability of being in the sample. We could draw a 5/80 = 1/16 sample of our class by using the digits in groups to give a decimal number, say,
0.1404 0.8886 0.2892 0.0403 0.4299 0.8708
We then choose the first member of the population if 0.1404 is less than 1/16. It is not, so we do not include this member, nor the second, corresponding to 0.8886, nor the third, corresponding to 0.2892. The fourth corresponds to 0.0403, which is less than 1/16 (0.0625) and so the fourth member is chosen as a member of the sample, and so on. This method is only suitable for fairly large samples, as the size of the sample obtained can be very variable in small sampling problems. In the example there is a higher than 1 in 10 chance of finishing with a sample of 2 or fewer.
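Both the selection rule and the ‘higher than 1 in 10’ figure can be checked with a short sketch. It assumes that the 80 inclusion decisions are independent, so that the sample size follows a Binomial distribution with n = 80 and p = 1/16. That distribution is an assumption brought in here for the check, not something this section states.

```python
from math import comb


def fixed_probability_sample(random_numbers, p):
    """Return the 1-based positions whose random number falls below p."""
    return [i for i, u in enumerate(random_numbers, start=1) if u < p]


def prob_at_most(k, n, p):
    """P(sample size <= k) when each of n members is included with probability p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))


p = 5 / 80  # the 1/16 sampling fraction from the text
first_six = [0.1404, 0.8886, 0.2892, 0.0403, 0.4299, 0.8708]
selected = fixed_probability_sample(first_six, p)  # decisions for members 1 to 6 only
p_two_or_fewer = prob_at_most(2, 80, p)
print(selected, round(p_two_or_fewer, 3))
```

Under these assumptions only the fourth of the first six members is selected, and the chance of ending with two or fewer students is roughly 0.12, in line with the statement above.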
As with random allocation (§2.2), random sampling is an operation ideally suited to computers. My free program Clinstat (§1.3) provides two random sampling schemes. Some computer programs for managing primary care practices actually have the capacity to take a random sample for any defined group of patients built in.
Random sampling ensures that the only ways in which the sample differs from the population will be those due to chance. It has a further advantage: because the sample is random, we can apply the methods of probability theory to the data obtained. As we shall see in Chapter 8, this enables us to estimate how far from the population value the sample value is likely to be.
The problem with random sampling is that we must have a list of the population from which the sample is to be drawn. Lists of populations may be hard to find, or they may be very cumbersome. For example, to sample the adult population in the UK, we could use the electoral roll. But a list of some 40 000 000 names would be difficult to handle, and in practice we would first take a random sample of electoral wards, and then a random sample of electors within these wards. This is, for obvious reasons, a multi-stage random sample. This approach contains the element of randomness, and so samples will be representative of the populations from which they are drawn. However, not all samples have an equal chance of being chosen, so it is not the same as simple random sampling.
We can also carry out sampling without a list of the population itself, provided we have a list of some larger units which contain all the members of the population. For example, we can obtain a random sample of schoolchildren in an area by starting with a list of schools, which is quite easy to come by. We then draw a simple random sample of schools and all the children within our chosen schools form the sample of children. This is called a cluster sample, because we take a sample of clusters of individuals. Another example would be sampling from any age/sex group in the general population by taking a sample of addresses and then taking everyone at the chosen addresses who matched our criteria.
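A cluster sample of this kind might be sketched as follows. The ten schools of thirty children each are invented purely for illustration, as are the names used; only the list of schools is needed to make the draw.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Draw a simple random sample of clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names only
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members


# Hypothetical sampling frame: 10 schools with 30 children in each.
schools = {
    f"school_{s}": [f"school_{s}_child_{c}" for c in range(1, 31)]
    for s in range(1, 11)
}
chosen_schools, children = cluster_sample(schools, n_clusters=3, seed=1)
print(chosen_schools, len(children))
```

A multi-stage sample would differ only in the last step: instead of keeping every child in the chosen schools, we would draw a further random sample within each.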
Sometimes it is desirable to divide the population into different strata, for example into age and sex groups, and take random samples within these. This is rather like quota sampling, except that within the strata we choose at random. If the different strata have different values of the quantity we are measuring, this stratified random sampling can increase our precision considerably. There are many complicated sampling schemes for use in different situations. For example, in a study of cigarette smoking and respiratory disease in Derbyshire schoolchildren, we drew a random sample of schools, stratified by school type (single-sex/mixed, selective/non-selective, etc.). Some schools which took children to age 13 then fed into the same 14+ school were combined into one sampling unit. Our sample of children was all children in the chosen schools who were in their first secondary school year (Banks et al. 1978). We thus had a stratified random cluster sample. These sampling methods affect the estimate obtained. Stratification improves the precision, cluster sampling worsens it. The sampling scheme should be taken into account in the analysis (Cochran 1977, Kish 1994). Often it is ignored, as was done by Banks et al. (1978) (that is, by me), but it should not be and results may be reported as being more precise than they really are.
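Stratified random sampling can be sketched in the same way. The frame of thirty schools, the two school types and the numbers drawn from each stratum are all hypothetical; they are not the figures from the Derbyshire study.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {
        name: sorted(rng.sample(strata[name], n_per_stratum[name]))
        for name in sorted(strata)
    }


# Hypothetical frame: (school number, school type), 20 mixed and 10 single-sex.
frame = [(i, "single-sex" if i % 3 == 0 else "mixed") for i in range(1, 31)]
sample = stratified_sample(
    frame,
    stratum_of=lambda school: school[1],
    n_per_stratum={"mixed": 4, "single-sex": 2},  # assumed allocation
    seed=7,
)
print(sample)
```

Unlike quota sampling, the choice within each stratum is left entirely to the random number generator.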
In §2.3 I looked at the difficulties which can arise using methods of allocation which appear random but do not use random numbers. In sampling, two such methods are often suggested by researchers. One is to take every tenth subject from the list, or whatever fraction is required. The other is to use the last digit of some reference number, such as the hospital number, and take as the sample subjects where this is, say, 3 or 4. These sampling methods are systematic or quasi-random. It is not usually obvious why they should not give ‘random’ samples, and it may be that in many cases they would be just as good as random sampling. They are certainly easier. To use them, we must be very sure that there is no pattern to the list which could produce an unrepresentative group. If it is possible, random sampling seems safer.
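To see how a pattern in the list can defeat systematic sampling, consider a deliberately contrived sketch. The list below is hypothetical: 200 patients listed ward by ward, ten to a ward, with the first bed in each ward assumed to be a side room.

```python
def systematic_sample(frame, step, start=0):
    """Take every step-th entry of the list, beginning at position start."""
    return frame[start::step]


# Hypothetical list with a hidden period of ten: one side room per ward of ten beds.
frame = ["side room" if i % 10 == 0 else "open ward" for i in range(200)]
every_tenth = systematic_sample(frame, step=10, start=0)

share_in_frame = frame.count("side room") / len(frame)
share_in_sample = every_tenth.count("side room") / len(every_tenth)
print(share_in_frame, share_in_sample)
```

In this artificial list one patient in ten is in a side room, yet every patient in the systematic sample is. Real lists are seldom so regular, but the example shows why the ordering must be understood before the method is trusted.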
Volunteer bias can be as serious a problem in sampling studies as it is in trials (§2.4). If we can only obtain data from a subset of our random sample, then this subset will not be a random sample of the population. Its members will be self selected. It is often very difficult to get data from every member of a sample. The proportion for whom data are obtained is called the response rate and in a sample survey of the general population is likely to be between 70% and 80%. The possibility that those lost from the sample are different in some way must be considered. For example, they may tend to be ill, which can be a serious problem in disease prevalence studies. In the school study of Banks et al. (1978), the response rate was 80%, most of those lost being absent from school on the day. Now, some of these absentees were ill and some were truants. Our sample may thus lead us to underestimate the prevalence of respiratory symptoms, by omitting sufferers with current acute disease, and the prevalence of cigarette smoking by omitting those who have gone for a quick smoke behind the bike sheds.
One of the most famous sampling disasters, the Literary Digest poll of 1936, illustrates these dangers (Bryson 1976). This was a poll of voting intentions in the 1936 US presidential election, fought by Roosevelt and Landon. The sample was a complex one. In some cities every registered voter was included, in others one in two, and for the whole of Chicago one in three. Ten million sample ballots were mailed to prospective voters, but only 2.3 million, less than a quarter, were returned. Still, two million is a lot of Americans, and these predicted a 60% vote to Landon. In fact, Roosevelt won with 62% of the vote. The response was so poor that the sample was most unlikely to be representative of the population, no matter how carefully the original sample was drawn. Two million Americans can be wrong! It is not the mere size of the sample, but its representativeness which is important. Provided the sample is truly representative, 2000 voters is all you need to estimate voting intentions to within 2%, which is enough for election prediction if they tell the truth and do not change their minds (see §18E).
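The ‘within 2%’ figure can be checked roughly. The sketch below assumes a simple random sample, the usual large-sample standard error of a proportion, a 95% range (multiplier 1.96) and the worst case of an evenly split vote. None of these is stated in this section, which leaves such calculations to later chapters.

```python
from math import sqrt


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * sqrt(p * (1 - p) / n)


margin_2000 = margin_of_error(2000)        # truly representative sample of 2000
response_rate = 2.3 / 10                   # Literary Digest: ballots returned / mailed
margin_digest = margin_of_error(2_300_000) # sampling error alone, ignoring bias

print(round(margin_2000, 3), round(response_rate, 2), round(margin_digest, 5))
```

On these assumptions 2000 voters give a margin of about 2 percentage points. The Literary Digest returns would have had a far smaller sampling error, which is exactly the point: the error that sank the poll was bias from a 23% response, and no formula of this kind measures that.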
3.5 Sampling in clinical and epidemiological studies
Having extolled the virtues of random sampling and cast doubt on all other sampling methods, I must admit that most medical data are not obtained in this way. This is partly because the practical difficulties are immense. To obtain a reasonable sample of the population of the UK, anyone can get a list of electoral wards, take a random sample of them, buy copies of the electoral rolls for the chosen wards and then take a random sample of names from them. But suppose you want to obtain a sample of patients with carcinoma of the bronchus. You could get a list of hospitals easily enough and get a random sample of them, but then things would become difficult. The names of patients will only be released by the consultant in charge should he so wish, and you will need his permission before approaching them. Any study of human patients requires ethical approval, and you will need this from the ethics committee of each of your chosen hospitals. Getting the cooperation of so many people is a task to daunt the hardiest, and obtaining ethical approval alone can take more than a year. In the UK, we now have a system of multi-centre research ethics committees, but as local approval must also be obtained the delays may still be immense.
The result of this is that clinical studies are done on the patients to hand. I have touched on this problem in the context of clinical trials (§2.7) and the same applies to other types of clinical study. In a clinical trial we are concerned with the comparison of two treatments and we hope that the superior treatment in Stockport will also be the superior treatment in Southampton. If we are studying clinical measurement, we can hope that a measurement method which is repeatable in Middlesbrough will be repeatable in Maidenhead, and that two different methods giving similar results in one place will give similar results in another. Studies which are not comparative give more cause for concern. The natural history of a disease described in one place may differ in unpredictable ways from that in another, due to differences in the environment and the genetic makeup of the local population. Reference ranges for quantities of clinical interest, the limits within which values from most healthy people will lie, may well differ from place to place.
Studies based on local groups of patients are not without value. This is particularly so when we are concerned with comparisons between groups, as in a clinical trial, or relationships between different variables. However, we must always bear the limitations of the sampling method in mind when interpreting the results of such studies.
In general, most medical research has to be carried out using samples drawn from populations which are much more restricted than those about which we wish to draw conclusions. We may have to use patients in one hospital instead of all patients, or the population of a small area rather than that of the whole country or planet. We may have to rely on volunteers for studies of normal subjects, given most people's dislike of having needles pushed into them and disinclination to spend hours hooked up to batteries of instruments. Groups of normal subjects contain medical students, nurses and laboratory staff far more often than would be expected by chance. In animal research the problem is even worse, for not only does one batch of one strain of mice have to represent the whole species, it often has to represent members of a different order, namely humans.
Findings from such studies can only apply to the population from which the sample was drawn. Any conclusion which we come to about wider populations, such as all patients with the disease in question, depends on evidence which is not statistical and often unspecified, namely our general experience of natural variability and experience of similar studies. This may let us down, and results established in one population may not apply to another. We have seen this in the use of BCG vaccine in India (§2.7). It is very important wherever possible that studies should be repeated by other workers on other populations, so that we can sample the larger population at least to some extent.
In two types of study, case reports and case series, the subjects come before the research, as it is suggested by their existence. There is no sampling. They are used to raise questions rather than to answer them.
A case report is a description of a single patient whose case displays interesting features. This is used to generate ideas and raise questions, rather than to answer them. It clearly cannot be planned in advance; it arises from the case. For example, Velzeboer et al. (1997) reported the case of an 11-month-old Pakistani girl who was admitted to hospital because of drowsiness, malaise and anorexia. She had stopped crawling or standing up and scratched her skin continuously. All investigations were negative. Her 6-year-old sister was then brought in with similar symptoms. (Note that there are two patients here, but they are part of the same case.) The doctors guessed that exposure to mercury might be to blame. When asked, the mother reported that 2 weeks before the younger child's symptoms started, mercury from a broken thermometer had been dropped on the carpet in the children's room. Mercury concentration in a urine sample taken on admission was 12.6 µg/l (slightly above the accepted normal value of 10 µg/l). Exposure was confirmed by a high mercury concentration in her hair. After 3 months' treatment the symptoms had disappeared totally and urinary mercury had fallen below the detection limit of 1 µg/l. This case called into question the normal values for mercury in children.
A case series is similar to a case report, except that a number of similar cases have been observed. For example, Shaker et al. (1997) described 15 patients examined for hypocalcaemia or skeletal disease, in whom the diagnosis of coeliac disease was subsequently made. In 11 of them gastrointestinal symptoms were absent or mild. They concluded that bone loss may be a sign of coeliac disease and this diagnosis should be considered. The design does not allow them to draw any conclusions about how often this might happen. To do that they would have to collect data systematically, using a cohort design (§3.7) for example.
3.6 Cross-sectional studies
One possible approach to the sampling problem is the cross-sectional study. We take some sample or whole narrowly defined population and observe them at one point in time. We get poor estimates of means and proportions in any more general population, but we can look at relationships within the sample. For example, in an epidemiological study, Banks et al. (1978) gave questionnaires to all first year secondary school boys in a random sample of schools in Derbyshire (§3.4). Among boys who had never smoked, 3% reported a cough first thing in the morning, compared to 19% of boys who said that they smoked one or more cigarettes per week. The sample was representative of boys of this age in Derbyshire who answer questionnaires, but we want our conclusions to apply at least to the United Kingdom, if not the developed world or the whole planet. We argue that although the prevalence of symptoms and the strength of the relationship may vary between populations, the existence of the relationship is unlikely to occur only in the population studied. We cannot conclude that smoking causes respiratory symptoms. Smoking and respiratory symptoms may not be directly related, but may both be related to some other factor. A factor related to both possible cause and possible effect is called confounding. For example, children whose parents smoke may be more likely to develop respiratory symptoms, because of passive inhalation of their parents' smoke, and also be more influenced to try smoking themselves. We can test this by looking separately at the relationship between the child's smoking and symptoms for those whose parents are not smokers, and for those whose parents are smokers. As Figure 3.1 shows, this relationship in fact persisted and there was no reason to suppose that a third causal factor was at work.
Fig. 3.1. Prevalence of self-reported morning cough in Derbyshire schoolboys, by their own and their parents' cigarette smoking (Bland et al. 1978)
Most diseases are not suited to this simple cross-sectional approach, because they are rare events. For example, lung cancer accounts for 9% of male deaths in the UK (OPCS, DH2 No. 7), and so is a very important disease. However, the proportion of people who are known to have the disease at any given time, the prevalence, is quite low. Most deaths from lung cancer take place after the age of 45, so we will consider a sample of men aged 45 and over. The average remaining life span of these men, in which they could contract lung cancer, will be about 30 years. The average time from diagnosis to death is about a year, so of those who will contract lung cancer only 1/30 will have been diagnosed when the sample is drawn. Only 9% of the sample will develop lung cancer anyway, so the proportion with the disease at any time is 1/30 × 9% = 0.3% or 3 per thousand. We would need a very large sample indeed to get a worthwhile number of lung cancer cases.
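The arithmetic, and what it implies for sample size, can be laid out as a short sketch. The target of 100 cases is an arbitrary illustration of ‘a worthwhile number’, not a figure from the text.

```python
from math import ceil

lifetime_proportion = 0.09    # 9% of men will die of lung cancer
fraction_diagnosed_now = 1 / 30  # about 1 year with the diagnosis out of about 30 remaining

prevalence = fraction_diagnosed_now * lifetime_proportion  # about 0.003, 3 per thousand


def sample_size_for_cases(expected_cases, prevalence):
    """Sample size whose expected number of prevalent cases reaches the target."""
    return ceil(expected_cases / prevalence)


# Hypothetical target: how many men must be sampled to expect 100 current cases?
needed = sample_size_for_cases(100, prevalence)
print(round(prevalence, 4), needed)
```

Even this rough calculation calls for a sample of more than thirty thousand men to expect a hundred cases.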
Cross-sectional designs are used in clinical studies also. For example, Rodin et al. (1998) studied polycystic ovary disease (PCO) in a random sample of Asian women from the lists of local general practices and from a local translating service. We found that 52% of the sample had PCO, very high compared to that found in other UK samples. However, this would not provide a good estimate for Asian women in general, because there may be many differences between this sample, such as their regions of origin, and Asian women living elsewhere. We also found that PCO women had higher fasting glucose levels than non-PCO women. As this is a comparison within the sample, it seems plausible to conclude that among Asian women PCO tends to be associated with raised glucose. We cannot say whether PCO raises glucose or whether raised glucose increases the risk of PCO, because they are measured at the same time.
Table 3.1. Standardized death rates per year per 1000 men aged 35 or more in relation to most recent amount smoked, 53 months follow-up (Doll and Hill 1956)
                       Death rate among
                                              Men smoking a daily average weight of tobacco of
Cause of death         Non-smokers  Smokers   1–14 g    15–24 g   25+ g
Lung cancer            0.07         0.90      0.47      0.86      1.66
Other cancer           2.04         2.02      2.01      1.56      2.63
Other respiratory      0.81         1.13      1.00      1.11      1.41
Coronary thrombosis    4.22         4.87      4.64      4.60      5.99
Other causes           6.11         6.89      6.82      6.38      7.19
All causes             13.25        15.78     14.92     14.49     18.84
3.7 Cohort studies
One way of getting round the problem of the small proportion of people with the disease of interest is the cohort study. We take a group of people, the cohort, and observe whether they have the suspected causal factor. We then follow them over time and observe whether they develop the disease. This is a prospective design, as we start with the possible cause and see whether this leads to the disease in the future. It is also longitudinal, meaning that subjects are studied at more than one time. A cohort study usually takes a long time, as we must wait for the future event to occur. It involves keeping track of large numbers of people, sometimes for many years, and often very large numbers must be included in the sample to ensure sufficient numbers will develop the disease to enable comparisons to be made between those with and without the factor.
A noted cohort study of mortality in relation to cigarette smoking was carried out by Doll and Hill (1956). They sent a questionnaire to all members of the medical profession in the UK, who were asked to give their name, address, age and details of current and past smoking habits. The deaths among this group were recorded. Only 60% of doctors cooperated, so in fact the cohort does not represent all doctors. The results for the first 53 months are shown in Table 3.1.
The cohort represents doctors willing to return questionnaires, not people as a whole. We cannot use the death rates as estimates for the whole population, or even for all doctors. What we can say is that, in this group, smokers were far more likely than non-smokers to die from lung cancer. It would be surprising if this relationship were only true for doctors, but we cannot definitely say that this would be the case for the whole population, because of the way the sample has been chosen.
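One way to put a number on ‘far more likely’ is to divide the death rate among smokers by the rate among non-smokers for each cause in Table 3.1. The ratio is a summary added here for illustration and is not reported in this section; the rates themselves are copied from the table.

```python
# Standardized death rates per 1000 men per year from Table 3.1:
# cause -> (non-smokers, smokers)
death_rates = {
    "Lung cancer": (0.07, 0.90),
    "Other cancer": (2.04, 2.02),
    "Other respiratory": (0.81, 1.13),
    "Coronary thrombosis": (4.22, 4.87),
    "Other causes": (6.11, 6.89),
    "All causes": (13.25, 15.78),
}

# Ratio of the smokers' rate to the non-smokers' rate for each cause.
rate_ratios = {
    cause: smokers / non_smokers
    for cause, (non_smokers, smokers) in death_rates.items()
}

for cause, ratio in rate_ratios.items():
    print(f"{cause:20s} {ratio:5.1f}")
```

For lung cancer the smokers' rate is about 13 times the non-smokers' rate, whereas for the other causes the ratio lies between about 1.0 and 1.4. As the text stresses, this describes the doctors who replied, not the population at large.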
We also have the problem of other intervening variables. Doctors were not allocated to be smokers or non-smokers as in a clinical trial; they chose for themselves. The decision to begin smoking may be related to many things (social factors, personality factors, genetic factors) which may also be related to lung cancer. We must consider these alternative explanations very carefully before coming to any conclusion about the causes of cancer. In this study there were no data to test such hypotheses.
The same technique is used, usually on a smaller scale, in clinical studies. For example, Casey et al. (1996) studied 55 patients with very severe rheumatoid arthritis affecting the spine and the use of all four limbs. These patients were operated on in an attempt to improve their condition and their subsequent progress was monitored. We found that only 25% had a favourable outcome. We could not conclude from this that surgery would be worthwhile in 25% of such patients generally. Our patients might have been particularly ill or unusually fit, our surgeons might be the best or they might be (relatively speaking) ham-fisted butchers. However, we compared these results with other studies published in the medical literature, which were similar. These studies together gave a much better sample of such patients than any study alone could do (see §17.11, meta-analysis). We looked at which characteristics of the patients predicted a good or bad outcome and found that the area of cross-section of the spinal cord was the important predictor. We were much more confident of this finding, because it arose from studying relationships between variables within the sample. It seems quite plausible from this study alone that patients whose spinal cords have already atrophied are unlikely to benefit from surgery.
3.8 Case-control studies
Another solution to the problem of the small number of people with the disease of interest is the case-control study. In this we start with a group of people with the disease, the cases. We compare them to a second group without the disease, the controls. In an epidemiological study, we then find the exposure of each subject to the possible causative factor and see whether this differs between the two groups. Before their cohort study, Doll and Hill (1950) carried out a case-control study into the aetiology of lung cancer. Twenty London hospitals notified all patients admitted with carcinoma of the lung, the cases. An interviewer visited the hospital to interview the case, and, at the same time, selected a patient with diagnosis other than cancer, of the same sex and within the same 5 year age group as the case, in the same hospital at the same time, as a control. When more than one suitable patient was available, the patient chosen was the first in the ward list considered by the ward sister to be fit for interview. Table 3.2 shows the relationship between smoking and lung cancer for these patients. A smoker was anyone who had smoked as much as one cigarette a day for as much as one year. It appears that cases were more likely than controls to smoke cigarettes. Doll and Hill concluded that smoking is an important factor in the production of carcinoma of the lung.
The case-control study is an attractive method of investigation, because of its relative speed and cheapness compared to other approaches. However, there are difficulties in the selection of cases, the selection of controls, and obtaining the data. Because of these, case-control studies sometimes produce contradictory and conflicting results.
The first problem is the selection of cases. This usually receives little consideration beyond a definition of the type of disease and a statement about the confirmation of the diagnosis. This is understandable, as there is usually little else that the investigators can do about it. They start with the available set of patients. However, these patients do not exist in isolation. They are the result of some process which has led to them being diagnosed as having the disease and thus being available for study. For example, suppose we suspect that oral contraceptives might cause cancer of the breast. We have a group of patients diagnosed as having cancer of the breast. We must ask ourselves whether any of these were detected at a medical examination which took place because the woman was seeing a doctor to receive a prescription. If this were so, the risk factor (pill) would be associated with the detection of the disease rather than its cause. This is called ascertainment bias.
Table 3.2. Numbers of smokers and non-smokers among lung cancer patients and age and sex matched controls with diseases other than cancer (Doll and Hill 1950)
                          Non-smokers    Smokers        Total
Males
  Lung cancer patients    2 (0.3%)       647 (99.7%)    649
  Control patients        27 (4.2%)      622 (95.8%)    649
Females
  Lung cancer patients    19 (31.7%)     41 (68.3%)     60
  Control patients        32 (53.3%)     28 (46.7%)     60
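The percentages in Table 3.2 can be recomputed from the counts. The following is a minimal sketch in Python, with the counts copied from the table; the names table_3_2 and percent_smokers are illustrative only and do not come from the original study.

```python
# Counts copied from Table 3.2 (Doll and Hill 1950): (non-smokers, smokers).
table_3_2 = {
    ("Males", "Lung cancer patients"): (2, 647),
    ("Males", "Control patients"): (27, 622),
    ("Females", "Lung cancer patients"): (19, 41),
    ("Females", "Control patients"): (32, 28),
}


def percent_smokers(non_smokers, smokers):
    """Percentage of a group who were smokers, to one decimal place."""
    return round(100 * smokers / (non_smokers + smokers), 1)


smoker_percentages = {
    group: percent_smokers(*counts) for group, counts in table_3_2.items()
}

for (sex, group), pct in smoker_percentages.items():
    print(f"{sex:8s}{group:22s}{pct:5.1f}% smokers")
```

Within each sex the cases show a higher percentage of smokers than the controls, which is the comparison the text draws from the table.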
Far more difficulty is caused by the selection of controls. We want a group of people who do not have the disease in question, but who are otherwise comparable to our cases. We must first decide the population from which they are to be drawn. There are two main sources of controls: the general population and patients with other diseases. The latter may be preferred because of its accessibility. Now these two populations are clearly not the same. For example, Doll and Hill (1950) gave the current smoking habits of 1014 men and women with diseases other than cancer, 14% of whom were currently non-smokers. They commented that there was no difference between smoking in the disease groups respiratory disease, cardiovascular disease, gastro-intestinal disease and others. However, in the general population the percentage of current non-smokers was 18% for men and 59% for women (Todd 1972). The smoking rate in the patient group as a whole was high. Since their report, of course, smoking has been associated with diseases in each group. Smokers get more disease and are more likely to be in hospital than non-smokers.
Intuitively, the comparison we want to make is between people with the disease and healthy people, not people with a lot of other diseases. We want to find out how to prevent disease, not how to choose one disease or another! However, it is much easier to use hospital patients as controls. There may then be a bias because the factor of interest may be associated with other diseases. Suppose we want to investigate the relationship between a disease and cigarette smoking using hospital controls. Should we exclude patients with lung cancer from the control group? If we include them, our controls may have more smokers than the general population, but if we exclude them we may have fewer. This problem is usually resolved by choosing specific patient groups, such as fracture cases, whose illness is thought to be unrelated to the factor being investigated. In case-control studies using cancer registries, controls are sometimes people with other forms of cancer. Sometimes more than one control group is used.
Having defined the population we must choose the sample. There are many factors which affect exposure to risk factors, such as age and sex. The most straightforward way is to take a large random sample of the control population, ascertain all the relevant characteristics, and then adjust for differences during the analysis, using methods described in Chapter 17. The alternative is to try to match a control to each case, so that for each case there is a control of the same age, sex, etc. Having done this, then we can compare our cases and controls knowing that the effects of these intervening variables are automatically adjusted for. If we wish to exclude a case we must exclude its control, too, or the groups will no longer be comparable. We can have more than one control per case, but the analysis becomes complicated.
Matching on some variables does not ensure comparability on all. Indeed, if it did there would be no study. Doll and Hill matched on age, sex and hospital. They recorded area of residence and found that 25% of their cases were from outside London, compared to 14% of controls. If we want to see whether this influences the smoking and lung cancer relationship we must make a statistical adjustment anyway. What should we match for? The more we match for, the fewer intervening variables there are to worry about. On the other hand, it becomes more and more difficult to find matches. Even matching on age and sex, Doll and Hill could not always find a control in the same hospital, and had to look elsewhere. Matching for more than age and sex can be very difficult.
Having decided on the matching variables we then find in the control population all the possible matches. If there are more matches than we need, we should choose the number required at random. Other methods, such as that used by Doll and Hill who allowed the ward sister to choose, have obvious problems of potential bias. If no suitable control can be found, we can do two things. We can widen the matching criteria, say age to within ten years rather than five, or we can exclude the case.
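The selection procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, field names and the functions find_matches and choose_control are hypothetical, matching is on sex and on age within a fixed number of years, and one control is chosen per case.

```python
import random


def find_matches(case, pool, age_window):
    """Return all controls of the same sex whose age is within age_window years."""
    return [
        c for c in pool
        if c["sex"] == case["sex"] and abs(c["age"] - case["age"]) <= age_window
    ]


def choose_control(case, pool, rng):
    """Pick one control at random from all the possible matches.

    If no match is found within five years, widen the age criterion to
    ten years. Return None if there is still no match, so that the case
    can be excluded.
    """
    for window in (5, 10):
        matches = find_matches(case, pool, window)
        if matches:
            return rng.choice(matches)
    return None


# Hypothetical records, invented purely for illustration.
pool = [
    {"id": "c1", "sex": "F", "age": 61},
    {"id": "c2", "sex": "F", "age": 64},
    {"id": "c3", "sex": "M", "age": 62},
    {"id": "c4", "sex": "F", "age": 48},
]
case = {"id": "k1", "sex": "F", "age": 63}

rng = random.Random(1)  # fixed seed only so that the sketch is repeatable
control = choose_control(case, pool, rng)
print(control["id"])
```

Leaving the choice to a random number generator, rather than to a person, is what removes the potential bias mentioned above.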
There are difficulties in interpreting the results of case-control studies. One is that the case-control design is often retrospective, that is, we are starting with the present disease state, e.g. lung cancer, and relating it to the past, e.g. history of smoking. We may have to rely on the unreliable memories of our subjects. This may lead both to random errors among cases and controls and systematic recall bias, where one group, usually the cases, recalls events better than the other. For example, the mother of a handicapped child may be more likely than the mother of a normal child to remember events in pregnancy which may have caused damage. There is a problem of assessment bias in such studies, just as in clinical trials (§2.9). Interviewers will very often know whether the interviewee is a case or control and this may well affect the way questions are asked. These and other considerations make case-control studies extremely difficult to interpret. The evidence from such studies can be useful, but data from other types of investigation must be considered, too, before any firm conclusions are drawn.
The case-control design is used clinically to investigate the natural history of disease by comparing patients with healthy subjects or patients with another disease. For example, Kiely et al. (1995) were interested in lymphatic function in inflammatory arthritis. We compared arthritis patients (the cases) with healthy volunteers (the controls). Lymphatic flow was measured in the arms of these subjects and the groups compared. We found that lymphatic drainage was less in the cases than in the control group, but this was only so for arms which were swollen (oedematous).
3.9 * Questionnaire bias in observational studies
In observational studies, much data may have to be supplied by the subjects themselves. The way in which a question is asked may influence the reply. Sometimes the bias in a question is obvious. Compare these:
(a) Do you think people should be free to provide the best medical care possible for themselves and their families, free of interference from a State bureaucracy?
(b) Should the wealthy be able to buy a place at the head of the queue for medical care, pushing aside those with greater need, or should medical care be shared solely on the basis of need for it?
Version (a) expects the answer yes, version (b) expects the answer no. We would hope not to be misled by such blatant manipulation, but the effects of question wording can be much more subtle than this. Hedges (1978) reports several examples of the effects of varying the wording of questions. He asked two groups of about 800 subjects one of the following:
(a) Do you feel you take enough care of your health, or not?
(b) Do you feel you take enough care of your health, or do you think you could take more care of your health?
In reply to question (a), 82% said that they took enough care, whereas only 68% said this in reply to question (b). Even more dramatic was the difference between this pair:
(a) Do you think a person of your age can do anything to prevent ill-health in the future or not?
(b) Do you think a person of your age can do anything to prevent ill-health in the future, or is it largely a matter of chance?
Not only was there a difference in the percentage who replied that they could do something, but as Table 3.3 shows this answer was related to age for version (a) but not for version (b). Here version (b) is ambiguous, as it is quite possible to think that health is largely a matter of chance but that there is still something one can do about it. Only if it is totally a matter of chance is there nothing one can do.
Table 3.3. Replies to two similar questions about ill health, by age (Hedges 1978)
                          Age (years)
                          16–34    35–54    55+      Total
Can do something (a)      75%      64%      56%      65%
Can do something (b)      45%      49%      50%      49%
Sometimes the respondents may interpret the question in a different way from the questioner. For example, when asked whether they usually coughed first thing in the morning, 3.7% of the Derbyshire schoolchildren replied that they did. When their parents were asked about the child's symptoms 2.4% replied positively, not a dramatic difference. Yet when asked about cough at other times in the day or at night 24.8% of children said yes, compared to only 4.5% of their parents (Bland et al. 1979). These symptoms all showed relationships to the child's smoking and other potentially causal variables, and also to one another. We are forced to admit that we are measuring something, but that we are not sure what!
Another possibility is that respondents may not understand the question at all, especially when it includes medical terms. In an earlier study of cigarette smoking by children, we found that 85% of a sample agreed that smoking caused cancer, but that 41% agreed that smoking was not harmful (Bewley et al. 1974). There are at least two possible explanations for this: being asked to agree with the negative statement ‘smoking is not harmful’ may have confused the children, or they may not see cancer as harmful. We have evidence for both of these possibilities. In a repeat study in Kent we asked a further sample of children whether they agreed that smoking caused cancer and that ‘smoking is bad for your health’ (Bewley and Bland 1976). In this study 90% agreed that smoking causes cancer and 91% agreed that smoking is bad for your health. In another study (Bland et al. 1975), we asked children what was meant by the term ‘lung cancer’. Only 13% seemed to us to understand and 32% clearly did not, often saying ‘I don't know’. They nearly all knew that lung cancer was caused by smoking, however.
The setting in which a question is asked may also influence replies. Opinion pollsters International Communications and Market Research conducted a poll in which half the subjects were questioned by interviewers about their voting preference and half were given a secret ballot (McKie 1992). By each method 33% chose ‘Labour’, but 28% chose ‘Conservative’ at interview and 7% would not say, whereas 35% chose ‘Conservative’ by secret ballot and only 1% would not say. Hence the secret method produced a Conservative majority, as at the then recent general election, and the open interview a Labour majority. For another example, Sibbald et al. (1994) compared two random samples of GPs. One sample were approached by post and then by telephone if they did not reply after two reminders, and the other were contacted directly by telephone. Of the predominantly postal sample, 19% reported that they provided counselling themselves, compared to 36% of the telephone sample, and 14% reported that their health visitor provided counselling compared to 30% of the telephone group. Thus the method of asking the question influenced the answer. One must be very cautious when interpreting questionnaire replies.
Fig. 3.2. Volatile substance abuse mortality and unemployment in the counties of Great Britain (The area of the circle is proportional to the population of the county, so reflects the importance of the observation)
Often the easiest and best method, if not the only method, of obtaining data about people is to ask them. When we do it, we must be very careful to ensure that questions are straightforward, unambiguous and in language the respondents will understand. If we do not do this then disaster is likely to follow.
3.10 * Ecological studies
Ecology is the study of living things in relation to their environment. In epidemiology, an ecological study is one where the disease is studied in relation to characteristics of the communities in which people live. For example, we might take the death rates from heart disease in several countries and see whether this is related to the national annual consumption of animal fat per head.
Esmail et al. (1977) carried out an ecological study of factors related to deaths from volatile substance abuse (VSA, also called solvent abuse, inhalant abuse or glue sniffing). The observational units were the administrative counties of Great Britain. The deaths were obtained from a national register of deaths held at St. George's and the age and sex distribution in each county from national census data. These were used to calculate an index of mortality adjusted for age, the standardized mortality ratio (§16.3). Indicators of social deprivation were also obtained from census data. Figure 3.2 shows the relationship between VSA mortality and unemployment in the counties. Clearly, there is a relationship. The mortality is higher in counties where unemployment is high.
Relationships found in ecological studies are indirect. We must not conclude that there is a relationship at the level of the person. This is the ecological fallacy. For example, we cannot conclude from Figure 3.2 that unemployed people are at a greater risk of dying from VSA than the employed. The peak age for VSA death is among schoolchildren, who are not included in the unemployment figures. It is not the unemployed people who are dying. Unemployment is just one indicator of social deprivation, and VSA deaths are associated with many of them.
Ecological studies can be useful to generate hypotheses. For example, the observation that hypertension is common in countries where there is a high intake of dietary salt might lead us to investigate the salt consumption and blood pressure of individual people, and a relationship there might in turn lead to dietary interventions. These leads often turn out to be false, however, and the ecological study alone is never enough.
3M Multiple choice questions 7 to 13
(Each branch is either true or false)
7. In statistical terms, a population:
(a) consists only of people;
(b) may be finite;
(c) may be infinite;
(d) can be any set of things in which we are interested;
(e) may consist of things which do not actually exist.
8. A one day census of in-patients in a psychiatric hospital could:
(a) give good information about the patients in that hospital at that time;
(b) give reliable estimates of seasonal factors in admissions;
(c) enable us to draw conclusions about the psychiatric hospitals of Britain;
(d) enable us to estimate the distribution of different diagnoses in mental illness in the local area;
(e) tell us how many patients there were in the hospital.
9. In simple random sampling:
(a) each member of the population has an equal chance of being chosen;
(b) adjacent members of the population must not be chosen;
(c) likely errors cannot be estimated;
(d) each possible sample of the given size has an equal chance of being chosen;
(e) the decision to include a subject in the sample depends only on the subject's own characteristics.
10. Advantages of random sampling include:
(a) it can be applied to any population;
(b) likely errors can be estimated;
(c) it is not biassed;
(d) it is easy to do;
(e) the sample can be referred to a known population.
11. In a case-control study to investigate whether eczema in children is related to cigarette smoking by their parents:
(a) parents would be asked about their smoking habits at the child's birth and the child observed for subsequent development of eczema;
(b) children of a group of parents who smoke would be compared to children of a group of parents who are non-smokers;
(c) parents would be asked to stop smoking to see whether their children's eczema was reduced;
(d) the smoking habits of the parents of a group of children with eczema would be compared to the smoking habits of the parents of a group of children without eczema;
(e) parents would be randomly allocated to smoking or non-smoking groups.
12. To examine the relationship between alcohol consumption and cancer of the oesophagus, feasible studies include:
(a) questionnaire survey of a random sample from the electoral roll;
(b) comparison of history of alcohol consumption between a group of oesophageal cancer patients and a group of healthy controls matched for age and sex;
(c) comparison of current oesophageal cancer rates in a group of alcoholics and a group of teetotallers;
(d) comparison by questionnaire of history of alcohol consumption between a group of oesophageal cancer patients and a random sample from the electoral roll in the surrounding district;
(e) comparison of death rates due to cancer of the oesophagus in a large sample of subjects whose alcohol consumption has been determined in the past.
13. * In a study of hospital patients, 20 hospitals were chosen at random from a list of all hospitals. Within each hospital, 10% of patients were chosen at random:
(a) the sample of patients is a random sample;
(b) all hospitals had an equal chance of being chosen;
(c) all hospital patients had an equal chance of being chosen at the outset;
(d) the sample could be used to make inferences about all hospital patients at that time;
(e) all possible samples of patients had an equal chance of being chosen.
Table 3.4. Doorstep delivery of milk bottles and exposure to bird attack
                                                              No. (%) exposed
                                                              Cases       Controls
Doorstep milk delivery                                        29 (91%)    47 (73%)
Previous milk bottle attack by birds                          26 (81%)    25 (39%)
Milk bottle attack in week before illness                     26 (81%)    5 (8%)
Protective measures taken                                     6 (19%)     14 (22%)
Handling attacked milk bottle in week before illness          17 (53%)    5 (8%)
Drinking milk from attacked bottle in week before illness     25 (80%)    5 (8%)
Table 3.5. Frequency of bird attacks on milk bottles
Number of days of week when attacks took place    Cases    Controls
0                                                 3        42
1–3                                               11       3
4–5                                               5        1
6–7                                               10       1
3E Exercise: Campylobacter jejuni infection
Campylobacter jejuni is a bacterium causing gastro-intestinal illness, spread by the faecal-oral route. It infects many species, and human infection has been recorded from handling pet dogs and cats, handling and eating chicken and other meats, and via milk and water supplies. Treatment is by antibiotics.
In May, 1990, there was a fourfold rise in the isolation rate of C. jejuni in the Ogwr District, Mid-Glamorgan. The mother of a young boy admitted to hospital with febrile convulsions resulting from C. jejuni infection reported that her milk bottles had been attacked by birds during the week before her son's illness, a phenomenon which had been associated with campylobacter infection in another area. This observation, with the rise in C. jejuni, prompted a case-control study (Southern et al. 1990).
A ‘case’ was defined as a person with laboratory confirmed C. jejuni infection with onset between May 1 and June 1 1990, resident in an area with Bridgend at its centre. Cases were excluded if they had spent one or more nights away from this area in the week before onset, if they could have acquired the infection elsewhere, or were members of a household in which there had been a case of diarrhea in the preceding four weeks.
The controls were selected from the register of the general practice of the case, or in a few instances from practices serving the same area. Two controls were selected for each case, matched for sex, age (within 5 years), and area of residence.
Cases and controls were interviewed by means of a standard questionnaire at home or by telephone. Cases were asked about their exposure to various factors in the week before the onset of illness. Controls were asked the same questions about the corresponding week for their matched cases. If a control or member of his or her family had had diarrhea lasting more than 3 days in the week before or during the illness of the respective case, or had spent any nights during that week away from home, another control was found. Evidence of bird attack included the pecking or tearing off of milk bottle tops. A history of bird attack was defined as a previous attack at that house.
Fifty-five people with Campylobacter infection resident in the area were reported during the study period. Of these, 19 were excluded and 4 could not be interviewed, leaving 32 cases and 64 matched controls. There was no difference in milk consumption between cases and controls, but more cases than controls reported doorstep delivery of bottled milk, previous milk bottle attack by birds, milk bottle attack by birds in the index week, and handling or drinking milk from an attacked bottle (Table 3.4). Cases reported bird attacks more frequently than controls (Table 3.5). Controls were more likely to have protected their milk bottles from attack or to have discarded milk from attacked bottles. Almost all subjects whose milk bottles had been attacked mentioned that magpies and jackdaws were common in their area, though only 3 had actually witnessed attacks and none reported bird droppings near bottles.
None of the other factors investigated (handling raw chicken; eating chicken bought raw; eating chicken, beef or ham bought cooked; eating out; attending barbecue; cat or dog in the house; contact with other cats or dogs; and contact with farm animals) were significantly more common in controls than cases. Bottle attacks seemed to have ceased when the study was carried out, and no milk could be obtained for analysis.
1. What problems were there in selecting cases?
2. What problems were there in the selection of controls?
3. Are there any problems about data collection?
4. From the above, do you think there is convincing evidence that bird attacks on milk bottles cause campylobacter infection?
5. What further studies might be carried out?
4
Summarizing data
4.1 Types of data
In Chapters 2 and 3 we looked at ways in which data are collected. In this chapter we shall see how data can be summarized to help to reveal information they contain. We do this by calculating numbers from the data which extract the important material. These numbers are called statistics. A statistic is anything calculated from the data alone.
It is often useful to distinguish between three types of data: qualitative, discrete quantitative and continuous quantitative. Qualitative data arise when individuals may fall into separate classes. These classes may have no numerical relationship with one another at all, e.g. sex: male, female; types of dwelling: house, maisonette, flat, lodgings; eye colour: brown, grey, blue, green, etc. Quantitative data are numerical, arising from counts or measurements. If the values of the measurements are integers (whole numbers), like the number of people in a household, or number of teeth which have been filled, those data are said to be discrete. If the values of the measurements can take any number in a range, such as height or weight, the data are said to be continuous. In practice there is overlap between these categories. Most continuous data are limited by the accuracy with which measurements can be made. Human height, for example, is difficult to measure more accurately than to the nearest millimetre and is more usually measured to the nearest centimetre. So only a finite set of possible measurements is actually available, although the quantity ‘height’ can take an infinite number of possible values, and the measured height is really discrete. However, the methods described below for continuous data will be seen to be those appropriate for its analysis.
We shall refer to qualities or quantities such as sex, height, age, etc. as variables, because they vary from one member of a sample to another. A qualitative variable is also termed a categorical variable or an attribute. We shall use these terms interchangeably.
4.2 Frequency distributions
When data are purely qualitative, the simplest way to deal with them is to count the number of cases in each category. For example, in the analysis of the census of a psychiatric hospital population (§3.2), one of the variables of interest was the patient's principal diagnosis (Bewley et al. 1975). To summarize these data, we count the number of patients having each diagnosis. The results are shown in Table 4.1. The count of individuals having a particular quality is called the frequency of that quality. For example, the frequency of schizophrenia is 474. The proportion of individuals having the quality is called the relative frequency or proportional frequency. The relative frequency of schizophrenia is 474/1467 = 0.32 or 32%. The set of frequencies of all the possible categories is called the frequency distribution of the variable.
Table 4.1. Principal diagnosis of patients in Tooting Bec Hospital
Diagnosis                 Number of patients
Schizophrenia             474
Affective disorders       277
Organic brain syndrome    405
Subnormality              58
Alcoholism                57
Other and not known       196
Total                     1467
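The relative frequencies for Table 4.1 can be checked with a short calculation. This is a minimal Python sketch with the counts copied from the table; the variable names are illustrative only.

```python
# Counts copied from Table 4.1.
diagnosis_counts = {
    "Schizophrenia": 474,
    "Affective disorders": 277,
    "Organic brain syndrome": 405,
    "Subnormality": 58,
    "Alcoholism": 57,
    "Other and not known": 196,
}

# Relative frequency = frequency of the category / total number of individuals.
total = sum(diagnosis_counts.values())
relative_frequency = {
    diagnosis: count / total for diagnosis, count in diagnosis_counts.items()
}

for diagnosis, count in diagnosis_counts.items():
    print(f"{diagnosis:24s}{count:6d}{relative_frequency[diagnosis]:8.2f}")
print(f"{'Total':24s}{total:6d}")
```

The relative frequencies necessarily add up to 1, which is a useful check on the arithmetic.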
Table 4.2. Likelihood of discharge of patients in Tooting Bec Hospital
Discharge    Frequency    Relative frequency    Cumulative frequency    Relative cumulative frequency
Unlikely     871          0.59                  871                     0.59
Possible     339          0.23                  1210                    0.82
Likely       257          0.18                  1467                    1.00
Total        1467         1.00                  1467                    1.00
In this census we assessed whether patients were ‘likely to be discharged’, ‘possibly to be discharged’ or ‘unlikely to be discharged’. The frequencies of these categories are shown in Table 4.2. Likelihood of discharge is a qualitative variable, like diagnosis, but the categories are ordered. This enables us to use another set of summary statistics, the cumulative frequencies. The cumulative frequency for a value of a variable is the number of individuals with values less than or equal to that value. Thus, if we order likelihood of discharge from ‘unlikely’, through ‘possibly’ to ‘likely’ the cumulative frequencies are 871, 1210 (= 871 + 339) and 1467. The relative cumulative frequency for a value is the proportion of individuals in the sample with values less than or equal to that value. For the example they are 0.59 (= 871/1467), 0.82 and 1.00. Thus we can see that the proportion of patients for whom discharge was not thought likely was 0.82 or 82%.
As we have noted, likelihood of discharge is a qualitative variable, with ordered categories. Sometimes this ordering is taken into account in analysis, sometimes not. Although the categories are ordered these are not quantitative data. There is no sense in which the difference between ‘likely’ and ‘possibly’ is the same as the difference between ‘possibly’ and ‘unlikely’.
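The cumulative and relative cumulative frequencies for likelihood of discharge can be reproduced as running totals. This is a minimal Python sketch using the frequencies from Table 4.2; the variable names are illustrative only.

```python
from itertools import accumulate

# Categories in their natural order, with frequencies from Table 4.2.
categories = ["Unlikely", "Possible", "Likely"]
frequencies = [871, 339, 257]

total = sum(frequencies)

# Cumulative frequency: number of individuals at or below each category.
cumulative = list(accumulate(frequencies))

# Relative cumulative frequency: the same running total as a proportion.
relative_cumulative = [round(c / total, 2) for c in cumulative]

for name, f, c, rc in zip(categories, frequencies, cumulative, relative_cumulative):
    print(f"{name:10s}{f:6d}{c:6d}{rc:6.2f}")
```

The running total only makes sense because the categories are ordered; applied to diagnosis, where there is no order, it would depend arbitrarily on how the categories were listed.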
Table 4.3. Parity of 125 women attending antenatal clinics at St. George's Hospital
Parity    Frequency    Relative frequency (per cent)    Cumulative frequency    Relative cumulative frequency (per cent)
0         59           47.2                             59                      47.2
1         44           35.2                             103                     82.4
2         14           11.2                             117                     93.6
3         3            2.4                              120                     96.0
4         4            3.2                              124                     99.2
5         1            0.8                              125                     100.0
Total     125          100.0                            125                     100.0
Table 4.4. FEV1 (litres) of 57 male medical students
2.85    3.19    3.50    3.69    3.90    4.14    4.32    4.50
2.85    3.20    3.54    3.70    3.96    4.16    4.44    4.56
2.98    3.30    3.54    3.70    4.05    4.20    4.47    4.68
3.04    3.39    3.57    3.75    4.08    4.20    4.47    4.70
3.10    3.42    3.60    3.78    4.10    4.30    4.47    4.71
3.10    3.48    3.60    3.83    4.14    4.30    4.50    4.78
Table 4.3 shows the frequency distribution of a quantitative variable, parity. This shows the number of previous pregnancies for a sample of women booking for delivery at St. George's Hospital. Only certain values are possible, as the number of pregnancies must be an integer, so this variable is discrete. The frequency of each separate value is given.
Table 4.4 shows a continuous variable, forced expiratory volume in one second (FEV1) in a sample of male medical students. As most of the values occur only once, to get a useful frequency distribution we need to divide the FEV1 scale into class intervals, e.g. from 3.0 to 3.5, from 3.5 to 4.0, and so on, and count the number of individuals with FEV1s in each class interval. The class intervals should not overlap, so we must decide which interval contains the boundary point to avoid it being counted twice. It is usual to put the lower boundary of an interval into that interval and the higher boundary into the next interval. Thus the interval starting at 3.0 and ending at 3.5 contains 3.0 but not 3.5. We can write this as ‘3.0-’ or ‘3.0-3.5-’ or ‘3.0-3.499’. Including the lower boundary in the class interval has this advantage. Most distributions of measurements have a zero point below which we cannot go, whereas few have an exact upper limit. If we were to include the upper boundary in the interval instead of the lower, we would have two possible ways of dealing with zero. It could be left as an isolated point, not in an interval. Alternatively, it could be included in the lowest interval, which would then not be exactly comparable to the others as it would include both boundaries while all the other intervals only included the upper.
If we take a starting point of 2.5 and an interval of 0.5 we get the frequency distribution shown in Table 4.5. Note that this is not unique. If we take a starting point of 2.4 and an interval of 0.2 we get a different set of frequencies.
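Counting into class intervals that include their lower boundary but not their upper can be sketched as follows. This is a minimal Python illustration, not the method used to produce Table 4.5: the function name frequency_distribution is hypothetical, only the 13 smallest FEV1 values from Table 4.4 are used, and the values are assumed to be recorded to two decimal places so that the arithmetic can be done in whole hundredths, avoiding floating-point trouble at the boundaries.

```python
from collections import Counter

# The 13 smallest FEV1 values (litres) from Table 4.4, used for illustration.
fev1 = [2.85, 2.85, 2.98, 3.04, 3.10, 3.10, 3.19,
        3.20, 3.30, 3.39, 3.42, 3.48, 3.50]


def frequency_distribution(values, start, width):
    """Count values in intervals [start, start + width), [start + width, ...).

    Each interval contains its lower boundary but not its upper one.
    Returns a dict mapping the lower limit of each non-empty interval
    to its frequency.
    """
    start_c = round(start * 100)   # work in hundredths of a litre
    width_c = round(width * 100)
    counts = Counter()
    for x in values:
        index = (round(x * 100) - start_c) // width_c
        lower = (start_c + index * width_c) / 100
        counts[lower] += 1
    return dict(sorted(counts.items()))


print(frequency_distribution(fev1, start=2.5, width=0.5))
print(frequency_distribution(fev1, start=2.4, width=0.2))
```

With a starting point of 2.5 and an interval of 0.5, the value 3.50 falls in the interval beginning at 3.5, not the one beginning at 3.0. Changing the starting point and interval regroups the same observations, which is why the frequency distribution is not unique.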
Table 4.5. Frequency distribution of FEV1 in 57 male medical students
FEV1     Frequency    Relative frequency (per cent)
2.0      0            0.0
2.5      3            5.3
3.0      9            15.8
3.5      14           24.6
4.0      15           26.3
4.5      10           17.5
5.0      6            10.5
5.5      0            0.0
Total    57           100.0
Table 4.6. Tally system for finding the frequency distribution of FEV1
FEV1     Tally              Frequency
2.0                         0
2.5      ///                3
3.0      /////////          9
3.5      //////////////     14
4.0      ///////////////    15
4.5      //////////         10
5.0      //////             6
5.5                         0
Total                       57
The frequency distribution can be calculated easily and accurately using a computer. Manual calculation is not so easy and must be done carefully and systematically. One way recommended by many texts (e.g. Hill 1977) is to set up a tally system, as in Table 4.6. We go through the data and for each individual make a tally mark by the appropriate interval. We then count up the number in each interval. In practice this is very difficult to do accurately, and it needs to be checked and double-checked. Hill (1977) recommends writing each number on a card and dealing the cards into piles corresponding to the intervals. It is then easy to check that each pile contains only those cases in that interval and count them. This is undoubtedly superior to the tally system. Another method is to order the observations from lowest to highest before marking the interval boundaries and counting, or to use the stem and leaf plot described below. Personally, I always use a computer.
4.3 Histograms and other frequency graphs
Graphical methods are very useful for examining frequency distributions. Figure 4.1 shows a graph of the cumulative frequency distribution for the FEV1 data. This is called a step function. We can smooth this by joining successive points where the cumulative frequency changes by straight lines, to give a cumulative frequency polygon. Figure 4.2 shows this for the cumulative relative frequency distribution of FEV1. This plot is very useful for calculating some of the summary statistics referred to in §4.5.
Fig. 4.1. Cumulative frequency distribution of FEV1 in a sample of male medical students
Fig. 4.2. Cumulative frequency polygon of FEV1
The most common way of depicting a frequency distribution is by a histogram. This is a diagram where the class intervals are on an axis and rectangles with heights or areas proportional to the frequencies are erected on them. Figure 4.3 shows the histogram for the FEV1 distribution in Table 4.5. The vertical scale shows frequency, the number of observations in each interval.
Sometimes we want to show the distribution of a discrete variable (e.g. Table 4.3) as a histogram. If our intervals are 0–1-, 1–2-, etc., the actual observations will all be at one end of the interval. Making the starting point of the interval a fraction rather than an integer gives a slightly better picture (Figure 4.5). This can also be helpful for continuous data when there is a lot of digit preference (§15.2). For example, where most observations are recorded as integers or as something point five, starting the interval at something point seven five can give a more accurate picture.
Fig. 4.3. Histogram of FEV1: frequency scale
Fig. 4.4. Histogram of FEV1: frequency per unit FEV1 or frequency density scale
Fig. 4.5. Histograms of parity (Table 4.3) using integer and fractional cut-off points for the intervals
Table 4.7. Distribution of age in people suffering accidents in the home (Whittington 1977)
Age group    Relative frequency (per cent)    Relative frequency per year (per cent)
0–4          25.3                             5.06
5–14         18.9                             1.89
15–44        30.3                             1.01
45–64        13.6                             0.68
65+          11.7                             0.33
Fig. 4.6. Histograms of age distribution of home accident victims, using the relative frequency scale and the relative frequency density scale
Figure 4.4 shows a histogram for the same distribution as Figure 4.3, with frequency per unit FEV1 (or frequency density) shown on the vertical axis. The distributions appear identical and we may well wonder whether it matters which method we choose. We see that it does matter when we consider a frequency distribution with unequal intervals, as in Table 4.7. If we plot the histogram using the heights of the rectangles to represent relative frequency in the interval we get the left-hand histogram in Figure 4.6, whereas if we use the relative frequency per year we get the right-hand histogram. These histograms tell different stories. The left-hand histogram in Figure 4.6 suggests that the most common age for accident victims is between 15 and 44 years, whereas the right-hand histogram suggests it is between 0 and 4. The right-hand histogram is correct, the left-hand histogram being distorted by the unequal class intervals. It is therefore preferable in general to use the frequency per unit (frequency density) rather than per class interval when plotting a histogram. The frequency for a particular interval is then represented by the area of the rectangle on that interval. Only when the class intervals are all equal can the frequency for the class interval be represented by the height of the rectangle. The computer programmer finds equal intervals much easier, however, and histograms with unequal intervals are now uncommon.
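The relative frequency per year in Table 4.7 is the relative frequency divided by the width of the age group in years. This is a minimal Python sketch of that calculation; the variable names are illustrative, and closing the open-ended ‘65+’ group at 100 years is an assumption made here because it reproduces the tabulated 0.33, not something stated in the source.

```python
# (label, lower age, upper age (exclusive), relative frequency in per cent),
# copied from Table 4.7. The upper limit of 100 for '65+' is an assumption.
age_groups = [
    ("0-4", 0, 5, 25.3),
    ("5-14", 5, 15, 18.9),
    ("15-44", 15, 45, 30.3),
    ("45-64", 45, 65, 13.6),
    ("65+", 65, 100, 11.7),
]

# Frequency density: relative frequency per year of age.
density = {
    label: round(pct / (upper - lower), 2)
    for label, lower, upper, pct in age_groups
}

# The group with the tallest rectangle depends on which scale is used.
highest_by_frequency = max(age_groups, key=lambda g: g[3])[0]
highest_by_density = max(density, key=density.get)

for label, lower, upper, pct in age_groups:
    print(f"{label:6s} width {upper - lower:3d}  {pct:5.1f}%  {density[label]:5.2f}% per year")
print(highest_by_frequency, highest_by_density)
```

On the relative frequency scale the 15 to 44 group looks the most common only because it spans 30 years; per year of age, the 0 to 4 group is highest, as the text explains.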
Fig. 4.7. Frequency polygons of FEV1 and PEF in medical students
Fig. 4.8. Stem and leaf plot for the FEV1 data, rounded down to one decimal place
Rather than a histogram consisting of vertical rectangles, we can plot a frequency polygon instead. To do this we join the centre points of the tops of the rectangles, then omit the rectangles (Figure 4.7(a)). Where a cell of the histogram is empty we join the line to the centre of the cell at the horizontal axis (Figure 4.7(b), males). This can be useful if we want to show two or more frequency distributions on the same graph, as in Figure 4.7(b). When we do this, the comparison is easier if we use relative frequency or relative frequency density rather than frequency. This makes it easier to compare distributions with different numbers of subjects.
A different version of the histogram has been developed by Tukey (1977), the stem and leaf plot (Figure 4.8). The rectangles are replaced by the numbers themselves. The ‘stem’ is the first digit or digits of the number and the ‘leaf’ the trailing digit. The first row of Figure 4.8 represents the numbers 2.8, 2.8, and 2.9, which in the data are 2.85, 2.85, and 2.98. The plot provides a good summary of data structure while at the same time we can see other characteristics such as a tendency to prefer some trailing digits to others, called digit preference (§15.1). It is also easy to construct and much less prone to error than the tally method of finding a frequency distribution.
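A stem and leaf plot of this kind is easy to build by program. This is a minimal Python sketch, assuming values recorded to two decimal places, with the whole number of litres as the stem and the first decimal place (rounded down) as the leaf. The function name stem_and_leaf is hypothetical, only the 13 smallest FEV1 values from Table 4.4 are used, and the exact layout of Figure 4.8 may differ.

```python
from collections import defaultdict

# The 13 smallest FEV1 values (litres) from Table 4.4, used for illustration.
fev1 = [2.85, 2.85, 2.98, 3.04, 3.10, 3.10, 3.19,
        3.20, 3.30, 3.39, 3.42, 3.48, 3.50]


def stem_and_leaf(values):
    """Return {stem: string of leaves}, rounding down to one decimal place."""
    rows = defaultdict(list)
    for x in sorted(values):
        tenths = round(x * 100) // 10      # 2.85 -> 285 hundredths -> 28 tenths
        rows[tenths // 10].append(str(tenths % 10))
    return {stem: "".join(leaves) for stem, leaves in rows.items()}


plot = stem_and_leaf(fev1)
for stem, leaves in plot.items():
    print(f"{stem} | {leaves}")
```

The first row printed, 2 | 889, corresponds to the values 2.85, 2.85 and 2.98, matching the description of the first row of Figure 4.8.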
4.4 Shapes of frequency distribution
Figure 4.3 shows a frequency distribution of a shape often seen in medical data. The distribution is roughly symmetrical about its central value and has frequency concentrated about one central point. The most common value is called the mode of the distribution and Figure 4.3 has one such point. It is unimodal. Figure 4.9 shows a very different shape. Here there are two distinct modes, one near 5 and the other near 8.5. This distribution is bimodal. We must be careful to distinguish between the unevenness in the histogram which results from using a small sample to represent a large population and that which results from genuine bimodality in the data. The trough between 6 and 7 in Figure 4.9 is very marked and might represent a genuine bimodality. In this case we have children, some of whom have a condition which raises the cholesterol level and some of whom do not. We actually have two separate populations represented with some overlap between them. However, almost all distributions encountered in medical statistics are unimodal.
Fig. 4.9. Serum cholesterol in children from kinships with familial hypercholesterolaemia (Leonard et al. 1977)
Fig. 4.10. Serum triglyceride in cord blood from 282 babies (Table 4.8)
Figure 4.10 differs from Figure 4.3 in a different way. The distribution of serum triglyceride is skew, that is, the distance from the central value to the extreme is much greater on one side than it is on the other. The parts of the histogram near the extremes are called the tails of the distribution. If the tails are equal the distribution is symmetrical, as in Figure 4.3. If the tail on the right is longer than the tail on the left as in Figure 4.10, the distribution is skew to the right or positively skew. If the tail on the left is longer, the distribution is skew to the left or negatively skew. This is unusual, but Figure 4.11 shows an example. The negative skewness comes about because babies can be born alive at any gestational age from about 20 weeks, but soon after 40 weeks the baby will have to be born. Pregnancies will not be allowed to go on for more than 44 weeks; the birth would be induced artificially. Most distributions encountered in medical work are symmetrical or skew to the right, for reasons we shall discuss later (§7.4).
Table 4.8. Serum triglyceride measurements in cord blood from 282 babies
0.15 0.29 0.32 0.36 0.40 0.42 0.46 0.50
0.16 0.29 0.33 0.36 0.40 0.42 0.46 0.50
0.20 0.29 0.33 0.36 0.40 0.42 0.47 0.52
0.20 0.29 0.33 0.36 0.40 0.44 0.47 0.52
0.20 0.29 0.33 0.36 0.40 0.44 0.47 0.52
0.20 0.29 0.33 0.36 0.40 0.44 0.47 0.52
0.21 0.30 0.33 0.36 0.40 0.44 0.47 0.52
0.22 0.30 0.33 0.36 0.40 0.44 0.48 0.52
0.24 0.30 0.33 0.37 0.40 0.44 0.48 0.52
0.25 0.30 0.34 0.37 0.40 0.44 0.48 0.53
0.26 0.30 0.34 0.37 0.40 0.44 0.48 0.54
0.26 0.30 0.34 0.37 0.40 0.44 0.48 0.54
0.26 0.30 0.34 0.38 0.40 0.45 0.48 0.54
0.27 0.30 0.34 0.38 0.40 0.45 0.48 0.54
0.27 0.30 0.34 0.38 0.41 0.45 0.48 0.54
0.27 0.31 0.34 0.38 0.41 0.45 0.48 0.54
0.28 0.31 0.34 0.38 0.41 0.45 0.48 0.55
0.28 0.32 0.35 0.39 0.41 0.45 0.48 0.55
0.28 0.32 0.35 0.39 0.41 0.46 0.48 0.55
0.28 0.32 0.35 0.39 0.41 0.46 0.49 0.55
0.28 0.32 0.35 0.39 0.41 0.46 0.49 0.55
0.28 0.32 0.35 0.39 0.42 0.46 0.49 0.55
0.28 0.32 0.35 0.40 0.42 0.46 0.50 0.55
0.28 0.32 0.36 0.40 0.42 0.46 0.50 0.55
4.5 Medians and quantiles
We often want to summarize a frequency distribution in a few numbers, for ease of reporting or comparison. The most direct method is to use quantiles. The quantiles are values which divide the distribution such that there is a given proportion of observations below the quantile. For example, the median is a quantile. The median is the central value of the distribution, such that half the points are less than or equal to it and half are greater than or equal to it. We can estimate any quantile easily from the cumulative frequency distribution or a stem and leaf plot. For the FEV1 data the median is 4.1, the 29th value in Table 4.4. If we have an even number of points, we choose a value midway between the two central values.
Fig. 4.11. Gestational age at birth for 1749 deliveries at St. George's Hospital
In general, we estimate the q quantile, the value such that a proportion q will be below it, as follows. We have n ordered observations which divide the scale into n + 1 parts: below the lowest observation, above the highest and between each adjacent pair. The proportion of the distribution which lies below the ith observation is estimated by i/(n + 1). We set this equal to q and get i = q(n + 1). If i is an integer, the ith observation is the required quantile estimate. If not, let j be the integer part of i, the part before the decimal point. The quantile will lie between the jth and (j + 1)th observations. We estimate it by
\[ x_j + (x_{j+1} - x_j) \times (i - j) \]
For the median, for example, the 0.5 quantile, i = q(n + 1) = 0.5 × (57 + 1) = 29, the 29th observation as before.
Other quantiles which are particularly useful are the quartiles of the distribution. The quartiles divide the distribution into four equal parts, called fourths. The second quartile is the median. For the FEV1 data the first and third quartiles are 3.54 and 4.53. For the first quartile, i = 0.25 × 58 = 14.5. The quartile is between the 14th and 15th observations, which are both 3.54. For the third quartile, i = 0.75 × 58 = 43.5, so the quartile lies between the 43rd and 44th observations, which are 4.50 and 4.56. The quantile is given by 4.50 + (4.56 - 4.50) × (43.5 - 43) = 4.53. We often divide the distribution at 99 centiles or percentiles. The median is thus the 50th centile. For the 20th centile of FEV1, i = 0.2 × 58 = 11.6, so the quantile is between the 11th and 12th observations, 3.42 and 3.48, and can be estimated by 3.42 + (3.48 - 3.42) × (11.6 - 11) = 3.46. We can estimate these easily from Figure 4.2 by finding the position of the quantile on the vertical axis, e.g. 0.2 for the 20th centile or 0.5 for the median, drawing a horizontal line to intersect the cumulative frequency polygon, and reading the quantile off the horizontal axis.
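The rule i = q(n + 1) with linear interpolation between neighbouring observations can be written as a short Python function. This is a sketch rather than a definitive implementation: the name `quantile` and the handling of positions that fall outside the sample (returning the lowest or highest observation) are assumptions, and the nine values are a small illustrative data set, not the FEV1 data.

```python
def quantile(values, q):
    """Estimate the q quantile using position i = q(n + 1)."""
    xs = sorted(values)
    n = len(xs)
    i = q * (n + 1)       # position on the scale of ordered observations
    j = int(i)            # integer part of i
    if j < 1:             # assumption: below the lowest observation
        return xs[0]
    if j >= n:            # assumption: at or above the highest observation
        return xs[-1]
    # interpolate between the jth and (j + 1)th observations (1-based)
    return xs[j - 1] + (xs[j] - xs[j - 1]) * (i - j)

data = [2, 3, 9, 5, 4, 0, 6, 3, 4]     # illustrative values
median = quantile(data, 0.5)           # i = 5, the 5th ordered value
first_quartile = quantile(data, 0.25)  # i = 2.5, between 2nd and 3rd values
third_quartile = quantile(data, 0.75)  # i = 7.5, between 7th and 8th values
print(first_quartile, median, third_quartile)
```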
Fig. 4.12. Box and whisker plots for FEV1 and for serum triglyceride
Fig. 4.13. Box plots showing a roughly symmetrical variable in four groups, with an outlying point (data in Table 10.8)
Tukey (1977) used the median, quartiles, maximum and minimum as a convenient five figure summary of a distribution. He also suggested a neat graph, the box and whisker plot, which represents this (Figure 4.12). The box shows the distance between the quartiles, with the median marked as a line, and the ‘whiskers’ show the extremes. The different shapes of the FEV1 and serum triglyceride distributions are clear from the graph. For display purposes, an observation whose distance from the edge of the box (i.e. the quartile) is more than 1.5 times the length of the box (i.e. the interquartile range, §4.7) may be called an outlier. Outliers may be shown as separate points (Figure 4.13). The plot can be useful for showing the comparison of several groups (Figure 4.13).
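The five figure summary and the 1.5 box-length rule for outliers can be illustrated with Python's standard `statistics` module. This is a sketch under stated assumptions: the function names are made up, the ten values are invented, and the `"exclusive"` method of `statistics.quantiles` is used because, as far as I can tell, it interpolates at position q(n + 1) in the same way as the method of this section.

```python
import statistics

def five_figure_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), q1, median, q3, max(values)

def outliers(values):
    """Observations more than 1.5 box lengths beyond either quartile."""
    _, q1, _, q3, _ = five_figure_summary(values)
    box_length = q3 - q1                     # the interquartile range
    low = q1 - 1.5 * box_length
    high = q3 + 1.5 * box_length
    return [v for v in values if v < low or v > high]

data = [2, 3, 9, 5, 4, 0, 6, 3, 4, 20]       # invented values; 20 is extreme
summary = five_figure_summary(data)
extreme = outliers(data)
print(summary, extreme)
```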
4.6 The mean
The median is not the only measure of central value for a distribution. Another is the arithmetic mean or average, usually referred to simply as the mean. This is found by taking the sum of the observations and dividing by their number. For example, consider the following hypothetical data:
2 3 9 5 4 0 6 3 4
The sum is 36 and there are 9 observations, so the mean is 36/9 = 4.0. At this point we will need to introduce some algebraic notation, widely used in statistics. We denote the observations by
\[ x_1, x_2, \ldots, x_i, \ldots, x_n \]
There are n observations and the ith of these is $x_i$. For the example, $x_4 = 5$ and n = 9. The sum of all the $x_i$ is
\[ \sum_{i=1}^{n} x_i \]
The summation sign is an upper case Greek letter, sigma, the Greek S. When it is obvious that we are adding the values of $x_i$ for all values of i, which runs from 1 to n, we abbreviate this to $\sum x_i$ or simply to $\sum x$. The mean of the $x_i$ is denoted by $\bar{x}$ (‘x bar’), and
\[ \bar{x} = \frac{1}{n} \sum x_i \]
The sum of the 57 FEV1s is 231.51 and hence the mean is 231.51/57 = 4.06. This is very close to the median, 4.1, so the median is within 1% of the mean. This is not so for the triglyceride data. The median triglyceride (Table 4.8) is 0.46 but the mean is 0.51, which is higher. The median is 10% away from the mean. If the distribution is symmetrical the sample mean and median will be about the same, but in a skew distribution they will not. If the distribution is skew to the right, as for serum triglyceride, the mean will be greater; if it is skew to the left the median will be greater. This is because the values in the tails affect the mean but not the median.
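The effect of a long tail on the mean, but not on the median, can be checked with a small Python example. The first list is the hypothetical data of this section; the second, in which the largest value 9 is replaced by 90, is an invented variation used only to exaggerate the right-hand tail.

```python
import statistics

data = [2, 3, 9, 5, 4, 0, 6, 3, 4]            # hypothetical data, sum 36
skewed = [90 if x == 9 else x for x in data]  # invented: stretch the right tail

mean_data = statistics.mean(data)             # 36 / 9
median_data = statistics.median(data)
mean_skewed = statistics.mean(skewed)         # 117 / 9
median_skewed = statistics.median(skewed)

print(mean_data, median_data)
print(mean_skewed, median_skewed)
```

The extreme value pulls the mean well above the median, while the median itself is unchanged.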
The sample mean has much nicer mathematical properties than the median and is thus more useful for the comparison methods described later. The median is a very useful descriptive statistic, but not much used for other purposes.
4.7 Variance, range and interquartile range
The mean and median are measures of the position of the middle of the distribution, which we call the central tendency. We shall also need a measure of the spread or variability of the distribution, called the dispersion.
One obvious measure is the range, the difference between the highest and lowest values. For the data of Table 4.4, the range is 5.43 - 2.85 = 2.58 litres. The range is often presented as the two extremes, 2.85–5.43 litres, rather than their difference. The range is a useful descriptive measure, but has two disadvantages. Firstly, it depends only on the extreme values and so can vary a lot from sample to sample. Secondly, it depends on the sample size. The larger the sample is, the further apart the extremes are likely to be. We can see this if we consider a sample of size 2. If we add a third member to the sample the range will only remain the same if the new observation falls between the other two, otherwise the range will increase. We can get round the second of these problems by using the interquartile range, the difference between the first and third quartiles. For the data of Table 4.4, the interquartile range is 4.53 - 3.54 = 0.99 litres. The interquartile range, too, is often presented as the two extremes, 3.54–4.53 litres. However, the interquartile range is quite variable from sample to sample and is also mathematically intractable. Although a useful descriptive measure, it is not the one preferred for purposes of comparison.
Table 4.9. Deviations from the mean of 9 observations
Observations    Deviations from the mean    Squared deviations
$x_i$           $x_i - \bar{x}$             $(x_i - \bar{x})^2$
 2                -2                           4
 3                -1                           1
 9                 5                          25
 5                 1                           1
 4                 0                           0
 0                -4                          16
 6                 2                           4
 3                -1                           1
 4                 0                           0
Total 36           0                          52
The most commonly used measures of dispersion are the variance and standard deviation. We start by calculating the difference between each observation and the sample mean, called the deviations from the mean, Table 4.9. If the data are widely scattered, many of the observations $x_i$ will be far from the mean $\bar{x}$ and so many deviations $x_i - \bar{x}$ will be large. If the data are narrowly scattered, very few observations will be far from the mean and so few deviations $x_i - \bar{x}$ will be large. We need some kind of average deviation to measure the scatter. If we add all the deviations together, we get zero, because $\sum(x_i - \bar{x}) = \sum x_i - \sum \bar{x} = \sum x_i - n\bar{x}$ and $n\bar{x} = \sum x_i$. Instead we square the deviations and then add them, as shown in Table 4.9. This removes the effect of sign; we are only measuring the size of the deviation, not the direction. This gives us $\sum(x_i - \bar{x})^2$, in the example equal to 52, called the sum of squares about the mean, usually abbreviated to sum of squares.
Clearly, the sum of squares will depend on the number of observations as well as the scatter. We want to find some kind of average squared deviation. This leads to a difficulty. Although we want an average squared deviation, we divide the sum of squares by n - 1, not n. This is not the obvious thing to do and puzzles many students of statistical methods. The reason is that we are interested in estimating the scatter of the population, rather than the sample, and the sum of squares about the sample mean is proportional to n - 1 (§4A, §6B). Dividing by n would lead to small samples producing lower estimates of variability than large samples. The minimum number of observations from which the variability can be estimated is 2; a single observation cannot tell us how variable the data are. If we used n as our divisor, for n = 1 the sum of squares would be zero, giving a variance of zero. With the correct divisor of n - 1, n = 1 gives the meaningless ratio 0/0, reflecting the impossibility of estimating variability from a single observation. The estimate of variability is called the variance, defined by
\[ \text{variance} = \frac{1}{n-1} \sum (x_i - \bar{x})^2 \]
We have already said that $\sum(x_i - \bar{x})^2$ is called the sum of squares. The quantity n - 1 is called the degrees of freedom of the variance estimate (§7A). We have:
\[ \text{variance} = \frac{\text{sum of squares}}{\text{degrees of freedom}} \]
We shall usually denote the variance by $s^2$. In the example, the sum of squares is 52 and there are 9 observations, giving 8 degrees of freedom. Hence $s^2 = 52/8 = 6.5$.
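The formula $\sum(x_i - \bar{x})^2$ gives us a rather tedious calculation. There is another formula for the sum of squares, which makes the calculation easier to carry out. This is simply an algebraic manipulation of the first form and gives exactly the same answers. We thus have two formulae for variance:
\[ s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 \]
\[ s^2 = \frac{1}{n-1} \left( \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \right) \]
The algebra is quite simple and is given in §4B. For example, using the second formula for the nine observations, we have $\sum x_i^2 = 196$ and $\sum x_i = 36$, so
\[ s^2 = \frac{1}{9-1} \left( 196 - \frac{36^2}{9} \right) = \frac{196 - 144}{8} = \frac{52}{8} = 6.5 \]
as before. On a calculator this is a much easier formula than the first, as the numbers need only be put in once. It can be inaccurate, because we subtract one large number from another to get a small one. For this reason the first formula would be used in a computer program.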
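The two formulae, and the loss of accuracy that the second can suffer, can be demonstrated in Python. This is a minimal sketch: the function names are made up, and the `shifted` data (the nine hypothetical observations with 1 000 000 000 added to each) are an invented device for making the two subtracted numbers very large compared with their difference.

```python
def variance_deviations(xs):
    """First formula: sum of squared deviations from the mean over n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_shortcut(xs):
    """Second formula: (sum of x squared - (sum of x) squared / n) over n - 1."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [2, 3, 9, 5, 4, 0, 6, 3, 4]
print(variance_deviations(data), variance_shortcut(data))   # both 6.5

# Adding a constant does not change the true variance, which is still 6.5.
shifted = [x + 1e9 for x in data]
print(variance_deviations(shifted))   # unaffected by the shift
print(variance_shortcut(shifted))     # typically far from 6.5 in floating point
```

The exact figure printed on the last line depends on the floating point arithmetic, which is why no value is quoted for it here.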
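4.8 Standard deviation
The variance is calculated from the squares of the observations. This means that it is not in the same units as the observations, which limits its use as a descriptive statistic. The obvious answer to this is to take the square root, which will then have the same units as the observations and the mean. The square root of the variance is called the standard deviation, usually denoted by s. Thus,
\[ s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2} \]
Returning to the FEV1 data, we calculate the variance and standard deviation as follows. We have n = 57, $\sum x_i = 231.51$, $\sum x_i^2 = 965.45$:
\[ s^2 = \frac{1}{57-1} \left( 965.45 - \frac{231.51^2}{57} \right) = \frac{965.45 - 940.296}{56} = \frac{25.154}{56} = 0.449 \]
\[ s = \sqrt{0.449} = 0.67 \text{ litres} \]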
Figure 4.14 shows the relationship between mean, standard deviation and frequency distribution. For FEV1, we see that the majority of observations are within one standard deviation of the mean, and nearly all within two standard deviations of the mean. There is a small part of the histogram outside the $\bar{x} - 2s$ to $\bar{x} + 2s$ interval, on either side of this symmetrical histogram. As Figure 4.14 also shows, this is true for the highly skew triglyceride data, too. In this case, however, the outlying observations are all in one tail of the distribution. In general, we expect roughly 2/3 of observations to lie within one standard deviation of the mean and 95% to lie within two standard deviations of the mean.
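The proportion of observations lying within one or two standard deviations of the mean is easy to count directly. The sketch below uses the nine hypothetical observations of §4.6 rather than the FEV1 data, so with such a small sample the proportions can only roughly match the 2/3 and 95% guide; the function name `proportion_within` is made up.

```python
import statistics

def proportion_within(values, k):
    """Proportion of observations within k standard deviations of the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)     # uses the n - 1 divisor
    inside = [v for v in values if abs(v - mean) <= k * s]
    return len(inside) / len(values)

data = [2, 3, 9, 5, 4, 0, 6, 3, 4]   # mean 4, variance 6.5
within_one = proportion_within(data, 1)
within_two = proportion_within(data, 2)
print(within_one, within_two)
```

Here 7 of the 9 observations fall within one standard deviation of the mean and all 9 within two.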
Fig. 4.14. Histograms of FEV1 and triglyceride with mean and standard deviation
Table 4.10. Population of 100 random digits for a sampling experiment
9 1 0 7 5 6 9 5 8 8 1 0 5 7
1 8 8 8 5 2 4 8 3 1 6 5 5 7
2 8 1 8 5 8 4 0 1 9 2 1 6 9
1 9 7 9 7 2 7 7 0 8 1 6 3 8
7 0 2 8 8 7 2 5 4 1 8 6 8 3
Appendices
4A Appendix: The divisor for the variance
The variance is found by dividing the sum of squares about the sample mean by n - 1, not by n. This is because we want the scatter about the population mean, and the scatter about the sample mean is always less. The sample mean is ‘closer’ to the data points than is the population mean. We shall try a little sampling experiment to show this. Table 4.10 shows a set of 100 random digits which we shall take as the population to be sampled. They have mean 4.74 and the sum of squares about the mean is 811.24. Hence the average squared difference from the mean is 8.1124. We can take samples of size two at random from this population using a pair of decimal dice, which will enable us to choose any digit numbered from 00 to 99. The first pair chosen was 5 and 6, which has mean 5.5. The sum of squares about the population mean 4.74 is (5 - 4.74)² + (6 - 4.74)² = 1.655. The sum of squares about the sample mean is (5 - 5.5)² + (6 - 5.5)² = 0.5.
The sum of squares about the population mean is greater than the sum of squares about the sample mean, and it can never be smaller. Table 4.11 shows this for 20 such samples of size two. The average sum of squares about the population mean is 13.6, and about the sample mean it is 5.7. Hence dividing by the sample size (n = 2) we have mean square differences of 6.8 about the population mean and 2.9 about the sample mean. Compare this to 8.1 for the population as a whole. We see that the mean square difference about the population mean, 6.8, is quite close to 8.1, while that about the sample mean, 2.9, is much less. However, if we divide the sum of squares about the sample mean by n - 1, i.e. 1, instead of n we have 5.7, which is not much different to the 6.8 from the sum of squares about the population mean.
Table 4.11. Sampling pairs from Table 4.10
Sample    $\sum(x_i - \mu)^2$    $\sum(x_i - \bar{x})^2$
5 6        1.655                  0.5
8 8       21.255                  0.0
6 1       15.575                 12.5
9 3       21.175                 18.0
5 5        0.135                  0.0
7 7       10.215                  0.0
1 7       19.095                 18.0
9 8       28.775                  0.5
3 3        6.055                  0.0
5 1       14.055                  8.0
8 3       13.655                 12.5
5 7        5.175                  2.0
5 2        7.575                  4.5
5 7        5.175                  2.0
8 8       21.255                  0.0
3 2       10.535                  0.5
0 4       23.015                  8.0
9 3       21.175                 18.0
5 2        7.575                  4.5
6 9       19.735                  4.5
Mean      13.6432                 5.7
Table 4.12. Mean sums of squares about the sample mean for sets of 100 random samples from Table 4.10
Number in        Mean variance estimates
sample, n        divisor n    divisor n - 1
 2               4.5          9.1
 3               5.4          8.1
 4               5.9          7.9
 5               6.2          7.7
10               7.2          8.0
Table 4.12 shows the results of a similar experiment with more samples being taken. The table shows the two average variance estimates using n and n - 1 as the divisor of the sum of squares, for sample sizes 2, 3, 4, 5 and 10. We see that the sum of squares about the sample mean divided by n increases steadily with sample size, but if we divide it by n - 1 instead of n the estimate does not change systematically as the sample size increases. The sum of squares about the sample mean is proportional to n - 1.
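The sampling experiment can be carried to its limit by listing every possible sample instead of drawing a few at random. The Python sketch below does this for samples of size two drawn with replacement, as with the decimal dice. It is an illustration only: the population is just the first row of digits of Table 4.10 as printed above, not the full 100 digits, and the helper name `sum_sq` is made up.

```python
from itertools import product

# First row of Table 4.10 as printed here; a stand-in for the full population.
population = [9, 1, 0, 7, 5, 6, 9, 5, 8, 8, 1, 0, 5, 7]

def sum_sq(xs, centre):
    """Sum of squared differences of xs from the given centre."""
    return sum((x - centre) ** 2 for x in xs)

N = len(population)
mu = sum(population) / N
population_mean_square = sum_sq(population, mu) / N

n = 2
samples = list(product(population, repeat=n))   # every ordered pair
mean_ss = sum(sum_sq(s, sum(s) / n) for s in samples) / len(samples)

estimate_divisor_n = mean_ss / n
estimate_divisor_n_minus_1 = mean_ss / (n - 1)
print(population_mean_square, estimate_divisor_n, estimate_divisor_n_minus_1)
```

Averaged over all possible pairs, the estimate with divisor n - 1 reproduces the population mean square, while the estimate with divisor n gives only half of it.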
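4B Appendix: Formulae for the sum of squares
The different formulae for sums of squares are derived as follows:
\[ \sum (x_i - \bar{x})^2 = \sum \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) = \sum x_i^2 - 2\bar{x} \sum x_i + n\bar{x}^2 \]
because $\bar{x}$ has the same value for each of the n observations. Now, $\sum x_i = n\bar{x}$, so
\[ \sum (x_i - \bar{x})^2 = \sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \]
We thus have three formulae for variance:
\[ s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 = \frac{1}{n-1} \left( \sum x_i^2 - n\bar{x}^2 \right) = \frac{1}{n-1} \left( \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \right) \]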
4M Multiple choice questions 14 to 19
(Each branch is either true or false)
14. Which of the following are qualitative variables:
(a) sex;
(b) parity;
(c) diastolic blood pressure;
(d) diagnosis;
(e) height.
15. Which of the following are continuous variables:
(a) blood glucose;
(b) peak expiratory flow rate;
(c) age last birthday;
(d) exact age;
(e) family size.
16. When a distribution is skew to the right:
(a) the median is greater than the mean;
(b) the distribution is unimodal;
(c) the tail on the left is shorter than the tail on the right;
(d) the standard deviation is less than the variance;
(e) the majority of observations are less than the mean.
17. The shape of a frequency distribution can be described using:
(a) a box and whisker plot;
(b) a histogram;
(c) a stem and leaf plot;
(d) mean and variance;
(e) a table of frequencies.
18. For the sample 3, 1, 7, 2, 2:
(a) the mean is 3;
(b) the median is 7;
(c) the mode is 2;
(d) the range is 1;
(e) the variance is 5.5.
19. Diastolic blood pressure has a distribution which is slightly skew to the right. If the mean and standard deviation were calculated for the diastolic pressures of a random sample of men:
(a) there would be fewer observations below the mean than above it;
(b) the standard deviation would be approximately equal to the mean;
(c) the majority of observations would be more than one standard deviation from the mean;
(d) the standard deviation would estimate the accuracy of blood pressure measurement;
(e) about 95% of observations would be expected to be within two standard deviations of the mean.
4E Exercise: Mean and standard deviation
This exercise gives some practice in one of the most fundamental calculations in statistics, that of the sum of squares and standard deviation. It also shows the relationship of the standard deviation to the frequency distribution. Table 4.13 shows blood glucose levels obtained from a group of medical students.
1. Make a stem and leaf plot for these data.
2. Find the minimum, maximum and quartiles and sketch a box and whisker plot.
3. Find the frequency distribution, using a class interval of 0.5.
Table 4.13. Random blood glucose levels from a group of first year medical students (mmol/litre)
4.7 3.6 3.8 2.2 4.7 4.1 3.6 4.0 4.4 5.1
4.2 4.1 4.4 5.0 3.7 3.6 2.9 3.7 4.7 3.4
3.9 4.8 3.3 3.3 3.6 4.6 3.4 4.5 3.3 4.0
3.4 4.0 3.8 4.1 3.8 4.4 4.9 4.9 4.3 6.0
4. Sketch the histogram of this frequency distribution. What term best describes the shape: symmetrical, skew to the right or skew to the left?
5. For the first column only, i.e. for 4.7, 4.2, 3.9, and 3.4, calculate the standard deviation using the deviations from the mean formula
\[ s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2} \]
First calculate the sum of the observations and the sum of the observations squared. Hence calculate the sum of squares about the mean. Is this the same as that found in 4 above? Hence calculate the variance and the standard deviation.
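6. For the same four numbers, calculate the standard deviation using the formula
\[ s = \sqrt{\frac{1}{n-1} \left( \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} \right)} \]
First calculate the sum of the observations and the sum of the observations squared. Hence calculate the sum of squares about the mean. Is this the same as that found in 4 above? Hence calculate the variance and the standard deviation.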
7. Use the following summations for the whole sample: $\sum x_i = 162.2$, $\sum x_i^2 = 676.74$. Calculate the mean of the sample, the sum of squares about the mean, the degrees of freedom for this sum of squares, and hence estimate the variance and standard deviation.
8. Calculate the mean ± one standard deviation and mean ± two standard deviations. Indicate these points and the mean on the histogram. What do you observe about their relationship to the frequency distribution?
5
Presenting data
5.1 Rates and proportions
Having collected our data as described in Chapters 2 and 3 and extracted information from it using the methods of Chapter 4, we must find a way to convey this information to others. In this chapter we shall look at some of the methods of doing that. We begin with rates and proportions.
When we have data in the form of frequencies, we often need to compare the frequency with certain conditions in groups containing different totals. In Table 2.1, for example, two groups of patient pairs were compared, 29 where the later patient had a C-T scan and 89 where neither had a C-T scan. The later patient did better in 9 of the first group and 34 of the second group. To compare these frequencies we compare the proportions 9/29 and 34/89. These are 0.31 and 0.38, and we can conclude that there is little difference. In Table 2.1, these were given as percentages, that is, the proportion out of 100 rather than out of 1, to avoid the decimal point. In Table 2.8, the Salk vaccine trial, the proportions contracting polio were presented as the number per 100000 for the same reason.
A rate expresses the frequency of the characteristic of interest per 1000 (or per 100000, etc.) of the population, per unit of time. For example, in Table 3.1, the results of the study of smoking by doctors, the data were presented as the number of deaths per 1000 doctors per year. This is not a proportion, as a further adjustment has been made to allow for the time period observed. Furthermore, the rate has been adjusted to take account of any differences in the age distributions of smokers and non-smokers (§16.2). Sometimes the actual denominator for a rate may be continually changing. The number of deaths from lung cancer among men in England and Wales for 1983 was 26502. The denominator for the death rate, the number of males in England and Wales, changed throughout 1983, as some died, some were born, some left the country and some entered it. The death rate is calculated by using a representative number, the estimated population at the end of June 1983, the middle of the year. This was 24175900, giving a death rate of 26502/24175900, which equals 0.001096, or 109.6 deaths per 100000 at risk per year. A number of the rates used in medical statistics are described in §16.5.
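The arithmetic of the lung cancer death rate is simple enough to check in a few lines of Python. The figures are those quoted above; the variable names are only illustrative.

```python
deaths = 26502                    # lung cancer deaths among men, 1983
mid_year_population = 24175900    # estimated male population, end of June 1983

rate = deaths / mid_year_population   # deaths per person at risk per year
rate_per_100000 = rate * 100000       # the same rate per 100000 per year

print(f"{rate:.6f}")              # 0.001096
print(f"{rate_per_100000:.1f}")   # 109.6
```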
The use of rates and proportions enables us to compare frequencies obtained from unequal sized groups, base populations or time periods, but we must beware of their use when their bases or denominators are not given. Victora (1982) reported a drug advertisement sent to doctors which described the antibiotic phosphomycin as being ‘100% effective in chronic urinary infections’. This is very impressive. How could we fail to prescribe a drug which is 100% effective? The study on which this was based used 8 patients, after excluding ‘those whose urine contained phosphomycin-resistant bacteria’. If the advertisement had said the drug was effective in 100% of 8 cases, we would have been less impressed. Had we known that it worked in 100% of 8 cases selected because it might work in them, we would have been still less impressed. The same paper quotes an advertisement for a cold remedy, where 100% of patients showed improvement. This was out of 5 patients! As Victora remarked, such small samples are understandable in the study of very rare diseases, but not for the common cold.
Sometimes we can fool ourselves as well as others by omitting denominators. I once carried out a study of the distribution of the soft tissue tumour Kaposi's sarcoma in Tanzania (Bland et al. 1977), and while writing it up I came across a paper setting out to do the same thing (Schmid 1973). One of the factors studied was tribal group, of which there are over 100 in Tanzania. This paper reported ‘the tribal incidence in the Wabende, Wambwe and Washirazi is remarkable … These small tribes, each with fewer than 90000 people, constitute the group in which a tribal factor can be suspected’. This is based on the following rates of tumours per 10000 population: national, 0.1; Wabende, 1.3; Wambwe, 0.7; Washirazi, 1.3. These are very big rates compared to the national, but the populations on which they are based are small, 8000, 14000 and 15000 respectively (Egero and Henin 1973). To get a rate of 1.3/10000 out of 8000 Wabende people we must have 8000 × 1.3/10000 = 1 case! Similarly we have 1 case among the 14000 Wambwe and 2 among the 15000 Washirazi. We can see that there are not enough data to draw the conclusions which the author has done. Rates and proportions are powerful tools and we must beware of them becoming detached from the original data.
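Recovering the number of cases behind a quoted rate is a matter of multiplying the rate by its denominator. The Python sketch below does this for the three tribal groups, using the populations and rates quoted above; the function name `implied_cases` is made up for the illustration.

```python
def implied_cases(population, rate_per_10000):
    """Number of cases implied by a rate per 10000 in a given population."""
    return population * rate_per_10000 / 10000

# (population, tumours per 10000 population), as quoted in the text
tribes = {
    "Wabende": (8000, 1.3),
    "Wambwe": (14000, 0.7),
    "Washirazi": (15000, 1.3),
}

cases = {name: round(implied_cases(pop, rate))
         for name, (pop, rate) in tribes.items()}
print(cases)
```

The implied values are 1.04, 0.98 and 1.95, which can only correspond to 1, 1 and 2 actual cases.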
5.2 Significant figures
When we calculated the death rate due to lung cancer among men in 1983 we quoted the answer as 0.001096 or 109.6 per 100000 per year. This is an approximation. The rate to the greatest number of figures my calculator will give is 0.001096215653 and this number would probably go on indefinitely, turning into a recurring series of digits. The decimal system of representing numbers cannot in general represent fractions exactly. We know that 1/2 = 0.5, but 1/3 = 0.33333333…, recurring infinitely. This does not usually worry us, because for most applications the difference between 0.333 and 1/3 is too small to matter. Only the first few non-zero digits of the number are important and we call these the significant digits or significant figures. There is usually little point in quoting statistical data to more than three significant figures. After all, it hardly matters whether the lung cancer mortality rate is 0.001096 or 0.001097. The value 0.001096 is given to 4 significant figures. The leading zeros are not significant, the first significant digit in this number being ‘1’. To three significant figures we get 0.00110, because the last digit is 6 and so the 9 which precedes it is rounded up to 10. Note that significant figures are not the same as decimal places. The number 0.00110 is given to 5 decimal places, the number of digits after the decimal point. When rounding to the nearest digit, we leave the last significant digit, 9 in this case, if what follows it is less than 5, and increase by one if what follows is greater than 5. When we have exactly 5, I would always round up, i.e. 1.5 goes to 2. This means that 0, 1, 2, 3, 4 go down and 5, 6, 7, 8, 9 go up, which seems unbiased. Some writers take the view that 5 should go up half the time and down half the time, since it is exactly midway between the preceding digit and that digit plus one. Various methods are suggested for doing this but I do not recommend them myself. In any case, it is usually a mistake to round to so few significant figures that this matters.
How many significant figures we need depends on the use to which the number is to be put and on how accurate it is anyway. For example, if we have a sample of 10 sublingual temperatures measured to the nearest half degree, there is little point in quoting the mean to more than 3 significant figures. What we should not do is to round numbers to a few significant figures before we have completed our calculations. In the lung cancer mortality rate example, suppose we round the numerator and denominator to two significant figures. We have 27000/24000000 = 0.001125 and the answer is only correct to two figures. This can spread through calculations causing errors to build up. We always try to retain several more significant figures than we required for the final answer.
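Rounding to a given number of significant figures, with an exact 5 always rounded up, can be sketched with Python's `decimal` module. The function name `round_sig` is made up, and going through `str(x)` to obtain the decimal digits is a convenience assumed for this illustration, not the only way to do it.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(x, figures):
    """Round x to the given number of significant figures, halves going up."""
    d = Decimal(str(x))
    if d == 0:
        return d
    # adjusted() is the power of ten of the first significant digit
    exponent = d.adjusted() - figures + 1
    return d.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_UP)

rate = 0.001096215653
print(round_sig(rate, 4))           # 0.001096
print(round_sig(rate, 3))           # 0.00110
print(int(round_sig(127435, 1)))    # 100000, as in Table 5.2

# Rounding numerator and denominator too early spoils the third figure.
early = 27000 / 24000000
print(round_sig(early, 3))          # 0.00113 rather than 0.00110
```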
Consider Table 5.1. This shows mortality data in terms of the exact numbers of deaths in one year. The table is taken from a much larger table (OPCS 1991) which shows the numbers dying from every cause of death in the International Classification of Diseases (ICD), which gives numerical codes to many hundreds of causes of death. The full table, which also gives deaths by age group, covers 70 A4 pages. Table 5.1 shows deaths for broad groups of diseases called ICD chapters. This table is not a good way to present these data if we want to get an understanding of the frequency distribution of cause of death, and the differences between causes in men and women. This is even more true of the 70 page original. This is not the purpose of the table, of course. It is a source of data, a reference document from which users extract information for their own purposes. Let us see how Table 5.1 can be simplified. First, we can reduce the number of significant figures. Let us be extreme and reduce the data to one significant figure (Table 5.2).
This makes comparisons rather easier, but it is still not obvious which are the most important causes of death. We can improve this by re-ordering the table to put the most frequent cause, diseases of the circulatory system, first (Table 5.3). We can also combine a lot of the smaller categories into an ‘others’ group. I did this arbitrarily, by combining all those accounting for less than 2% of the total. Now it is clear at a glance that the most important causes of death in England and Wales are diseases of the circulatory system, neoplasms and diseases of the respiratory system, and that these dwarf all the others. Of course, mortality is not the only indicator of the importance of a disease. ICD chapter XIII, diseases of the musculo-skeletal system and connective tissues, is easily seen from Table 5.2 to be only a minor cause of death, but this group includes arthritis and rheumatism, the most important illness in its effects on daily activity.
Table 5.1. Deaths by sex and cause, England and Wales, 1989 (OPCS 1991, DH2 No. 10)
I.C.D. chapter  Type of disease  Number of deaths, males  Number of deaths, females
I  Infectious and parasitic  1246  1297
II  Neoplasms (cancers)  75172  69948
III  Endocrine, nutritional and metabolic diseases and immunity disorders  4395  5758
IV  Blood and blood forming organs  1002  1422
V  Mental disorders  4493  9225
VI  Nervous system and sense organs  5466  5990
VII  Circulatory system  127435  137165
VIII  Respiratory system  33489  33223
IX  Digestive system  7900  10779
X  Genitourinary system  3616  4156
XI  Complications of pregnancy, childbirth and the puerperium  0  56
XII  Skin and subcutaneous tissues  250  573
XIII  Musculo-skeletal system and connective tissues  1235  4139
XIV  Congenital anomalies  897  869
XV  Certain conditions originating in the perinatal period  122  118
XVI  Signs, symptoms and ill-defined conditions  1582  3082
XVII  Injury and poisoning  11073  6427
Total  279373  294227
5.3 Presenting tables
Tables 5.1, 5.2 and 5.3 illustrate a number of useful points about the presentation of tables. Like all the tables in this book, they are designed to stand alone from the text. There is no need to refer to material buried in some paragraph to interpret the table. A table is intended to communicate information, so it should be easy to read and understand. A table should have a clear title, stating clearly and unambiguously what the table represents. The rows and columns must also be labelled clearly.
When proportions, rates or percentages are used in a table together with frequencies, they must be easy to distinguish from one another. This can be done, as in Table 2.10, by adding a ‘%’ symbol, or by including a place of decimals. The addition in Table 2.10 of the ‘total’ row and the ‘100%’ makes it clear that the percentages are calculated from the number in the treatment group, rather than the number with that particular outcome or the total number of patients.
Table 5.2. Deaths by sex and cause, England and Wales, 1989, rounded to one significant figure
I.C.D. chapter  Type of disease  Number of deaths, males  Number of deaths, females
I  Infectious and parasitic  1000  1000
II  Neoplasms (cancers)  80000  70000
III  Endocrine, nutritional and metabolic diseases and immunity disorders  4000  6000
IV  Blood and blood forming organs  1000  1000
V  Mental disorders  4000  9000
VI  Nervous system and sense organs  5000  6000
VII  Circulatory system  100000  100000
VIII  Respiratory system  30000  30000
IX  Digestive system  8000  10000
X  Genitourinary system  4000  4000
XI  Complications of pregnancy, childbirth and the puerperium  0  60
XII  Skin and subcutaneous tissues  300  600
XIII  Musculo-skeletal system and connective tissues  1000  4000
XIV  Congenital anomalies  900  900
XV  Certain conditions originating in the perinatal period  100  100
XVI  Signs, symptoms and ill-defined conditions  2000  3000
XVII  Injury and poisoning  10000  6000
Total  300000  300000
Table 5.3. Deaths by sex, England and Wales, 1989, for major causes
Type of disease (I.C.D. chapter)  Number of deaths, males  Number of deaths, females
Circulatory system (VII)  100000  100000
Neoplasms (cancers) (II)  80000  70000
Respiratory system (VIII)  30000  30000
Injury and poisoning (XVII)  10000  6000
Digestive system (IX)  8000  10000
Others  20000  20000
Total  300000  300000
5.4 Pie charts
It is often convenient to present data pictorially. Information can be conveyed much more quickly by a diagram than by a table of numbers. This is particularly useful when data are being presented to an audience, as here the information has to be got across in a limited time. It can also help a reader get the salient points of a table of numbers. Unfortunately, unless great care is taken, diagrams can also be very misleading and should be treated only as an addition to numbers, not a replacement.
We have already discussed methods of illustrating the frequency distribution of a quantitative variable. We will now look at an equivalent of the histogram for qualitative data, the pie chart or pie diagram. This shows the relative frequency for each category by dividing a circle into sectors, the angles of which are proportional to the relative frequency. We thus multiply each relative frequency by 360, to give the corresponding angle in degrees.
Table 5.4. Calculations for a pie chart of the distribution of cause of death
Cause of death  Frequency  Relative frequency  Angle (degrees)
Circulatory system  137165  0.46619  168
Neoplasms (cancers)  69948  0.23773  86
Respiratory system  33223  0.11292  41
Injury and poisoning  6427  0.02184  8
Digestive system  10779  0.03663  13
Nervous system  5990  0.02036  7
Others  30695  0.10432  38
Total  294227  1.00000  361
Fig. 5.1. Pie chart showing the distribution of cause of death among females, England and Wales, 1989
Table 5.4 shows the calculation for drawing a pie chart to represent the distribution of cause of death for females, using the data of Tables 5.1 and 5.3. (The total degrees are 361 rather than 360 because of rounding errors in the calculations.) The resulting pie chart is shown in Figure 5.1. This diagram is said to resemble a pie cut into pieces for serving, hence the name.
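The calculation of Table 5.4 can be reproduced in a few lines of Python. The frequencies are those of the table; the variable names are only illustrative.

```python
female_deaths = {
    "Circulatory system": 137165,
    "Neoplasms (cancers)": 69948,
    "Respiratory system": 33223,
    "Injury and poisoning": 6427,
    "Digestive system": 10779,
    "Nervous system": 5990,
    "Others": 30695,
}

total = sum(female_deaths.values())

# Angle of each sector: relative frequency times 360, to the nearest degree.
angles = {cause: round(360 * frequency / total)
          for cause, frequency in female_deaths.items()}

for cause, angle in angles.items():
    print(f"{cause:22s} {female_deaths[cause] / total:.5f} {angle:4d}")
print("Total angle:", sum(angles.values()))
```

The rounded angles add up to 361 rather than 360, which is the rounding error mentioned above.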
5.5 Bar charts
A bar chart or bar diagram shows data in the form of horizontal or vertical bars. For example, Table 5.5 shows the mortality due to cancer of the oesophagus in England and Wales over a 10 year period. Figure 5.2 shows these data in the form of a bar chart, the heights of the bars being proportional to the mortality.
Table 5.5. Cancer of the oesophagus: standardized mortality rate per 100000 per year, England and Wales, 1960–1969
Year Mortality rate Year Mortality rate
60 5.1 65 5.4
61 5.0 66 5.4
62 5.2 67 5.6
63 5.2 68 5.8
64 5.2 69 6.0
Fig. 5.2. Bar chart showing the relationship between mortality due to cancer of the oesophagus and year, England and Wales, 1960–1969
There are many uses for bar charts. As in Figure 5.2, they can be used to show the relationship between two variables, one being quantitative and the other either qualitative or a quantitative variable which is grouped, as is time in years. The values of the first variable are shown by the heights of bars, one bar for each category of the second variable.
Bar charts can be used to represent relationships between more than two variables. Figure 5.3 shows the relationship between children's reports of breathlessness and cigarette smoking by themselves and their parents. We can see quickly that the prevalence of the symptom increases both with the child's smoking and with that of their parents. In the published paper reporting these respiratory symptom data (Bland et al. 1978) the bar chart was not used; the data were given in the form of tables. They were thus available for other researchers to compare to their own or to carry out calculations upon. The bar chart was used to present the results during a conference, where the most important thing was to convey an outline of the analysis quickly.
Bar charts can also be used to show frequencies. For example, Figure 5.4(a) shows the relative frequency distributions of causes of death among men and women, and Figure 5.4(b) shows the frequency distribution of cause of death among men. Figure 5.4(b) looks very much like a histogram. The distinction between these two terms is not clear. Most statisticians would describe Figures 4.3, 4.4, and 4.6 as histograms, and Figures 5.2 and 5.3 as bar charts, but I have seen books which actually reverse this terminology and others which reserve the term ‘histogram’ for a frequency density graph, like Figures 4.4 and 4.6.
Fig.5.3.Barchartshowingtherelationshipbetweentheprevalenceofself-reportedbreathlessnessamongschoolchildrenandtwopossiblecausativefactors
Fig.5.4.BarchartsshowingdatafromTable5.1
5.6 Scatter diagrams
The bar chart would be a rather clumsy method for showing the relationship between two continuous variables, such as vital capacity and height (Table 5.6). For this we use a scatter diagram or scattergram (Figure 5.5). This is made by marking the scales of the two variables along horizontal and vertical axes. Each pair of measurements is plotted with a cross, circle, or some other suitable symbol at the point indicated by using the measurements as coordinates.
Table 5.7 shows serum albumin measured from a group of alcoholic patients and a group of controls (Hickish et al. 1989). We can use a scatter diagram to present these data also. The vertical axis represents albumin and we choose two arbitrary points on the horizontal axis to represent the groups.
Table 5.6. Vital capacity (VC) and height for 44 female medical students
Height (cm)  VC (litres)  Height (cm)  VC (litres)  Height (cm)  VC (litres)  Height (cm)
155.0 2.20 161.2 3.39 166.0 3.66 170.0
155.0 2.65 162.0 2.88 166.0 3.69 171.0
155.4 3.06 162.0 2.96 166.6 3.06 171.0
158.0 2.40 162.0 3.12 167.0 3.48 171.5
160.0 2.30 163.0 2.72 167.0 3.72 172.0
160.2 2.63 163.0 2.82 167.0 3.80 172.0
161.0 2.56 163.0 3.40 167.6 3.06 174.0
161.0 2.60 164.0 2.90 167.8 3.70 174.2
161.0 2.80 165.0 3.07 168.0 2.78 176.0
161.0 2.90 166.0 3.03 168.0 3.63 177.0
161.0 3.40 166.0 3.50 169.4 2.80 180.6
Fig. 5.5. Scatter diagram showing the relationship between vital capacity and height for a group of female medical students
Table 5.7. Albumin measured in alcoholics and controls
Alcoholics Controls
15 28 39 41 44 48 34 41 43 45 45
16 29 39 43 45 48 39 42 43 45 45
17 32 39 43 45 49 39 42 43 45 45
18 37 40 44 46 51 40 42 43 45 46
20 38 40 44 46 51 41 42 44 45 46
21 38 40 44 46 52 41 42 44 45 47
28 38 41 44 47 41 42 44 45 47
Fig. 5.6. Scatter diagrams showing the data of Table 5.7
In Table 5.7 there are many identical observations in each group, so we need to allow for this in the scatter diagram. If there is more than one observation at the same coordinate we can indicate this in several ways. We can use the number of observations in place of the chosen symbol, but this method is becoming obsolete. As in Figure 5.6(a), we can displace the points slightly in a random direction (called jittering). This is what Stata does and so what I have done in most of this book. Alternatively, we can use a systematic sideways shift, to form a more orderly picture as in Figure 5.6(b). The latter is often used when the variable on the horizontal axis is categorical rather than continuous. Such scatter diagrams are very useful for checking the assumptions of some of the analytical methods which we shall use later. A scatter diagram where one variable is a group is also called a dot plot. As a presentational device, they enable us to show far more information than a bar chart of the group means can do. For this reason, statisticians usually prefer them to other types of graphical display.
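The two ways of separating coincident points can be sketched as follows. This is a minimal illustration, not the method used by Stata or by any particular package: the function names, the jitter width and the step size are all assumptions chosen for the example, and only the horizontal position of each point is moved.

```python
import random
from collections import Counter

def jitter(positions, width=0.1, seed=1):
    """Move each horizontal position by a random amount in [-width, +width]."""
    rng = random.Random(seed)  # fixed seed so the picture can be reproduced
    return [x + rng.uniform(-width, width) for x in positions]

def systematic_offsets(values, step=0.05):
    """Sideways offsets that spread tied values evenly about the group position."""
    totals = Counter(values)   # how many times each value occurs
    seen = Counter()           # how many times we have met it so far
    offsets = []
    for v in values:
        i = seen[v]
        seen[v] += 1
        offsets.append((i - (totals[v] - 1) / 2) * step)
    return offsets

# First row of the controls in Table 5.7, plotted at horizontal position 2.
controls = [34, 41, 43, 45, 45]
x_jittered = jitter([2.0] * len(controls))
x_shifted = [2.0 + d for d in systematic_offsets(controls)]
```

The vertical coordinate (albumin) is left untouched in both cases, so the displayed values are not distorted; only the arbitrary horizontal position changes.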
5.7 Line graphs and time series
The data of Table 5.5 are ordered in a way that those of Table 5.6 are not, in that they are recorded at intervals in time. Such data are called a time series. If we plot a scatter diagram of such data, as in Figure 5.7, it is natural to join successive points by lines to form a line graph. We do not even need to mark the points at all; all we need is the line. This would not be sensible in Figure 5.5, as the observations are independent of one another and quite unrelated, whereas in Figure 5.7 there is likely to be a relationship between adjacent points. Here the mortality rate recorded for cancer of the oesophagus will depend on a number of things which vary over time including possibly causal factors, such as tobacco and alcohol consumption, and clinical factors, such as better diagnostic techniques and methods of treatment.
Line graphs are particularly useful when we want to show the change of more than one quantity over time. Figure 5.8 shows levels of zidovudine (AZT) in the blood of AIDS patients at several times after administration of the drug, for patients with normal fat absorption and with fat malabsorption (§10.7). The difference in response to the two treatments is very clear.
Fig. 5.7. Line graph showing changes in cancer of the oesophagus mortality over time
Fig. 5.8. Line graph to show the response to administration of zidovudine in two groups of AIDS patients
5.8 Misleading graphs
Figure 5.2 is clearly titled and labelled and can be read independently of the surrounding text. The principles of clarity outlined for tables apply equally here. After all, a diagram is a method of conveying information quickly and this object is defeated if the reader or audience has to spend time trying to sort out exactly what a diagram really means. Because the visual impact of diagrams can be so great, further problems arise in their use.
The first of these is the missing zero. Figure 5.9 shows a second bar chart representing the data of Table 5.5. This chart appears to show a very rapid increase in mortality, compared to the gradual increase shown in Figure 5.2. Yet both show the same data. Figure 5.9 omits most of the vertical scale, and instead stretches that small part of the scale where the change takes place. Even when we are aware of this, it is difficult to look at this graph and not think that it shows a large increase in mortality. It helps if we visualize the baseline as being somewhere near the bottom of the page.
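The size of the distortion can be put into numbers. The sketch below compares the drawn heights of the first and last bars of Table 5.5 when the vertical axis starts at zero and when it starts just below the smallest rate. The baseline of 4.9 is an assumption made for illustration; the text does not say where the scale of Figure 5.9 begins, and `bar_height_ratio` is a name invented here.

```python
# Mortality rates for 1960 and 1969, from Table 5.5.
rate_1960, rate_1969 = 5.1, 6.0

def bar_height_ratio(first, last, baseline):
    """Ratio of the drawn bar heights when the vertical axis starts at `baseline`."""
    return (last - baseline) / (first - baseline)

full_scale = bar_height_ratio(rate_1960, rate_1969, baseline=0.0)
truncated = bar_height_ratio(rate_1960, rate_1969, baseline=4.9)  # assumed baseline

print(f"Axis from zero: last bar is {full_scale:.2f} times the first")
print(f"Axis from 4.9:  last bar is {truncated:.2f} times the first")
```

With the zero shown, the last bar is less than one fifth taller than the first; with the truncated axis it is drawn several times taller, although the data are identical.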
Fig. 5.9. Bar chart with zero omitted on the vertical scale
There is no zero on the horizontal axis in Figures 5.2 and 5.9, either. There are two reasons for this. There is no practical ‘zero time’ on the calendar; we use an arbitrary zero. Also, there is an unstated assumption that mortality rates vary with time and not the other way round.
The zero is omitted in Figure 5.5. This is almost always done in scatter diagrams, yet if we are to gauge the importance of the relationship between vital capacity and height by the relative change in vital capacity over the height range we need the zero on the vital capacity scale. The origin is often omitted on scatter diagrams because we are usually concerned with the existence of a relationship and the distributions followed by the observations, rather than its magnitude. We estimate the latter in a different way, described in Chapter 11.
Line graphs are particularly at risk of undergoing the sort of distortion of missing zero described in §5.8. Many computer programs resist drawing bar charts like Figure 5.9, but will produce a line graph with a truncated scale as the default. Figure 5.10 shows a line graph with a truncated scale, corresponding to Figure 5.9. Just as there, the message of the graph is a dramatic increase in mortality, which the data themselves do not really support. We can make this even more dramatic by stretching the vertical scale and compressing the horizontal scale. The effect is now really impressive and looks much more likely than Figure 5.7 to attract research funds, Nobel prizes and interviews on television. Huff (1954) aptly names such horrors ‘gee whiz’ graphs. They are even more dramatic if we omit the scales altogether and show only the soaring line.
Fig. 5.10. Line graphs with a missing zero and with a stretched vertical and compressed horizontal scale, a ‘gee whiz’ graph
Fig. 5.11. Figure 5.1 with three-dimensional effects
This is not to say that authors who show only part of the scale are deliberately trying to mislead. There are often good arguments against graphs with vast areas of boring blank paper. In Figure 5.5, we are not interested in vital capacities near zero and can feel quite justified in excluding them. In Figure 5.10 we certainly are interested in zero mortality; it is surely what we are aiming for. The point is that graphs can so easily mislead the unwary reader, so let the reader beware.
The advent of powerful personal computers led to an increase in the ability to produce complicated graphics. Simple charts, such as Figure 5.1, are informative but not visually exciting. One way of decorating such graphs is to make them appear three-dimensional. Figure 5.11 shows the effect. The angles are no longer proportional to the numbers which they represent. The areas are, but because they are different shapes it is difficult to compare them. This defeats the primary object of conveying information quickly and accurately. Another approach to decorating diagrams is to turn them into pictures. In a pictogram the bars of the bar chart are replaced by pictures. Pictograms can be highly misleading, as the height of a picture, drawn with three-dimensional effect, is proportional to the number represented, but what we see is the volume. Such decorated graphs are like the illuminated capitals of medieval manuscripts: nice to look at but hard to read. I think they should be avoided.
Fig. 5.12. Tuberculosis mortality in England and Wales, 1871–1971 (DHSS 1976)
Huff (1954) recounts that the president of a chapter of the American Statistical Association criticized him for accusing presenters of data of trying to deceive. The statistician argued that incompetence was the problem. Huff's reply was that diagrams frequently sensationalize by exaggeration and rarely minimize anything, that presenters of data rarely distort those data to make their case appear weaker than it is. The errors are too one-sided for us to ignore the possibility that we are being fooled. When presenting data, especially graphically, be very careful that the data are shown fairly. When on the receiving end, beware!
5.9 Logarithmic scales
Figure 5.12 shows a line graph representing the fall in tuberculosis mortality in England and Wales over 100 years (DHSS 1976). We can see a rather unsteady curve, showing the continuing decline in the disease. Figure 5.12 also shows the mortality plotted on a logarithmic (or log) scale. A logarithmic scale is one where two pairs of points will be the same distance apart if their ratios are equal, rather than their differences. Thus the distance between 1 and 10 is equal to that between 10 and 100, not to that between 10 and 19. (See §5A if you do not understand this.) The logarithmic line shows a clear kink in the curve about 1950, the time when a number of effective anti-TB measures, chemotherapy with streptomycin, BCG vaccine and mass screening with X-rays, were introduced. If we consider the properties of logarithms (§5A), we can see how the log scale for the tuberculosis mortality data produced such sharp changes in the curve. If the relationship is such that the mortality is falling with a constant proportion, such as 10% per year, the absolute fall each year depends on the absolute level in the preceding year:
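$$\text{mortality in 1960} = \text{constant} \times \text{mortality in 1959}$$
So if we plot mortality on a log scale we get:
$$\log(\text{mortality in 1960}) = \log(\text{constant}) + \log(\text{mortality in 1959})$$
For mortality in 1961, we have
$$\log(\text{mortality in 1961}) = \log(\text{constant}) + \log(\text{mortality in 1960}) = 2 \times \log(\text{constant}) + \log(\text{mortality in 1959})$$
Hence we get a straight line relationship between log mortality and time t:
$$\log(\text{mortality after } t \text{ years}) = t \times \log(\text{constant}) + \log(\text{mortality at start})$$
When the constant proportion changes, the slope of the straight line formed by plotting log(mortality) against time changes and there is a very obvious kink in the line.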
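A small numerical check may make the straight line easier to believe. The sketch below uses invented figures (a starting mortality of 100 and a fall of 10% per year, so the constant is 0.9); nothing here is taken from the tuberculosis data. The absolute falls shrink from year to year, while the steps in log mortality are all equal to log(constant).

```python
import math

def mortality_series(start, constant, years):
    """Mortality when each year's value is `constant` times the previous year's."""
    return [start * constant ** t for t in range(years + 1)]

# Hypothetical values chosen for illustration only.
series = mortality_series(start=100.0, constant=0.9, years=5)
logs = [math.log10(m) for m in series]

# Year-to-year changes on the natural scale and on the log scale.
absolute_falls = [a - b for a, b in zip(series, series[1:])]
log_steps = [b - a for a, b in zip(logs, logs[1:])]

print([round(f, 2) for f in absolute_falls])  # falls get smaller each year
print([round(s, 4) for s in log_steps])       # steps are all the same size
```

If the constant were changed part of the way through the series, the log steps would change size at that point, which is the kink seen on the log scale.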
Log scales are very useful analytic tools. However, a graph on a log scale can be very misleading if the reader does not allow for the nature of the scale. The log scale in Figure 5.12 shows the increased rate of reduction in mortality associated with the anti-TB measures quite plainly, but it gives the impression that these measures were important in the decline of TB. This is not so. If we look at the corresponding point on the natural scale, we can see that all these measures did was to accelerate a decline which had been going on for a long time (see Radical Statistics Health Group 1976).
Appendices
5A Appendix: Logarithms
Logarithms are not simply a method of calculation dating from before the computer age, but a set of fundamental mathematical functions. Because of their special properties they are much used in statistics. We shall start with logarithms (or logs for short) to base 10, the common logarithms used in calculations. The log to base 10 of a number x is y where
$$x = 10^y$$
We write $y = \log_{10}(x)$. Thus for example $\log_{10}(10) = 1$, $\log_{10}(100) = 2$, $\log_{10}(1000) = 3$, $\log_{10}(10000) = 4$, and so on. If we multiply two numbers, the log of the product is the sum of their logs:
$$\log(xy) = \log(x) + \log(y)$$
For example:
$$100 \times 1000 = 10^2 \times 10^3 = 10^{2+3} = 10^5 = 100000$$
Or in log terms:
$$\log_{10}(100 \times 1000) = \log_{10}(100) + \log_{10}(1000) = 2 + 3 = 5$$
Hence, $100 \times 1000 = 10^5 = 100000$. This means that any multiplicative relationship of the form
$$y = a \times b \times c \times d$$
can be made additive by a log transformation:
$$\log(y) = \log(a) + \log(b) + \log(c) + \log(d)$$
This is the process underlying the fit to the Lognormal distribution described in §7.4.
There is no need to use 10 as the base for logarithms. We can use any number. The log of a number x to base b can be found from the log to base a by a simple calculation:
$$\log_b(x) = \frac{\log_a(x)}{\log_a(b)}$$
Ten is convenient for arithmetic using log tables, but for other purposes it is less so. For example, the gradient, slope or differential of the curve $y = \log_{10}(x)$ is $\log_{10}(e)/x$, where e = 2.718281… is a constant which does not depend on the base of the logarithm. This leads to awkward constants spreading through formulae. To keep this to a minimum we use logs to the base e, called natural or Napierian logarithms after the mathematician John Napier. This is the logarithm usually produced by LOG(X) functions in computer languages.
Figure 5.13 shows the log curve for three different bases, 2, e and 10. The curves all go through the point (1, 0), i.e. log(1) = 0. As x approaches 0, log(x) becomes a larger and larger negative number, tending towards minus infinity as x tends to zero. There are no logs of negative numbers. As x increases from 1, the curve becomes flatter and flatter. Though log(x) continues to increase, it does so more and more slowly. The curves all go through (base, 1), i.e. log(base) = 1. The curve for log to the base 2 goes through (2, 1), (4, 2), (8, 3) because $2^1 = 2$, $2^2 = 4$, $2^3 = 8$. We can see that the effect of replacing data by their logs will be to stretch out the scale at the lower end and contract it at the upper.
We often work with logarithms of data rather than the data themselves. This may have several advantages. Multiplicative relationships may become additive, curves may become straight lines and skew distributions may become symmetrical.
We transform back to the natural scale using the antilogarithm or antilog. If $y = \log_{10}(x)$, $x = 10^y$ is the antilog of y. If $z = \log_e(x)$, $x = e^z$ or x = exp(z) is the antilog of z. If your computer program does not transform back, most calculators have $e^x$ and $10^x$ functions for this purpose.
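These properties are easy to try out on a computer. The following sketch uses Python's standard `math` module, in which `math.log` gives natural logs and `math.log10` gives logs to base 10; the helper `log_base` and the example numbers are chosen here purely for illustration.

```python
import math

# The log of a product is the sum of the logs.
x, y = 100.0, 1000.0
log_of_product = math.log10(x * y)
sum_of_logs = math.log10(x) + math.log10(y)

def log_base(value, base):
    """Log of `value` to any base, found from natural logs by change of base."""
    return math.log(value) / math.log(base)

# Antilogs take us back to the natural scale.
original = 250.0
back_from_log10 = 10 ** math.log10(original)
back_from_loge = math.exp(math.log(original))

print(log_of_product, sum_of_logs)
print(log_base(8, 2))
print(back_from_log10, back_from_loge)
```

Results computed this way are subject to the usual small rounding errors of floating point arithmetic, so they may differ from the exact values in the last decimal place.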
Fig. 5.13. Logarithmic curves to three different bases
5M Multiple choice questions 20 to 24 (Each branch is either true or false)
20. ‘After treatment with Wondermycin, 66.67% of patients made a complete recovery’
(a) Wondermycin is wonderful;
(b) this statement may be misleading because the denominator is not given;
(c) the number of significant figures used suggests a degree of precision which may not be present;
(d) some control information is required before we can draw any conclusions about Wondermycin;
(e) there might be only a very small number of patients.
21. The number 1729.54371:
(a) to two significant figures is 1700;
(b) to three significant figures is 1720;
(c) to six decimal places is 1729.54;
(d) to three decimal places is 1729.544;
(e) to five significant figures is 1729.5.
Fig. 5.14. A dubious graph
22. Figure 5.14:
(a) shows a histogram;
(b) should have the vertical axis labelled;
(c) should show the zero on the vertical axis;
(d) should show the zero on the horizontal axis;
(e) should show the units for the vertical axis.
23. Logarithmic scales used in graphs showing time trends:
(a) show changes in the trend clearly;
(b) often produce straight lines;
(c) give a clear idea of the magnitude of changes;
(d) should show the zero point from the original scale;
(e) compress intervals between large numbers compared to those between small numbers.
24. The following methods can be used to show the relationship between two variables:
(a) histogram;
(b) pie chart;
(c) scatter diagram;
(d) bar chart;
(e) line graph.
Table 5.8. Weekly geriatric admissions in Wandsworth Health District from May to September, 1982 and 1983 (Fish et al. 1985)
Week 1982 1983 Week 1982 1983
1 24 20 12 11 25
2 22 17 13 6 22
3 21 21 14 10 26
4 22 17 15 13 12
5 24 22 16 19 33
6 15 23 17 13 19
7 23 20 18 17 21
8 21 16 19 10 28
9 18 24 20 16 19
10 21 21 21 24 13
11 17 20 22 15 29
5E Exercise: Creating graphs
In this exercise we shall display graphically some of the data we have studied so far.
1. Table 4.1 shows diagnoses of patients in a hospital census. Display these data as a graph.
2. Table 2.8 shows the paralytic polio rates for several groups of children. Construct a bar chart for the results from the randomized control areas.
3. Table 3.1 shows some results from the study of mortality in British doctors. Show these graphically.
4. Table 5.8 shows the numbers of geriatric admissions in Wandsworth Health District for each week from May to September in 1982 and 1983. Show these data graphically. Why do you think the two years were different?
6
Probability
6.1 Probability
We use data from a sample to draw conclusions about the population from which it is drawn. For example, in a clinical trial we might observe that a sample of patients given a new treatment respond better than patients given an old treatment. We want to know whether an improvement would be seen in the whole population of patients, and if so how big it might be. The theory of probability enables us to link samples and populations, and to draw conclusions about populations from samples. We shall start the discussion of probability with some simple randomizing devices, such as coins and dice, but the relevance to medical problems should soon become apparent.
We first ask what exactly is meant by ‘probability’. In this book I shall take the frequency definition: the probability that an event will happen under given circumstances may be defined as the proportion of repetitions of those circumstances in which the event would occur in the long run. For example, if we toss a coin it comes down either heads or tails. Before we toss it, we have no way of knowing which will happen, but we do know that it will either be heads or tails. After we have tossed it, of course, we know exactly what the outcome is. If we carry on tossing our coin, we should get several heads and several tails. If we go on doing this for long enough, then we would expect to get as many heads as we do tails. So the probability of a head being thrown is half, because in the long run a head should occur on half of the throws. The number of heads which might arise in several tosses of the coin is called a random variable, that is, a variable which can take more than one value with given probabilities. In the same way, a thrown die can show six faces, numbered one to six, with equal probability. We can investigate random variables such as the number of sixes in a given number of throws, the number of throws before the first six, and so on. There is another, broader definition of probability which leads to a different approach to statistics, the Bayesian school (Bland and Altman 1998), but it is beyond the scope of this book.
The frequency definition of probability also applies to continuous measurement, such as human height. For example, suppose the median height in a population of women is 168 cm. Then half the women are above 168 cm in height. If we choose women at random (i.e. without the characteristics of the woman influencing the choice) then in the long run half the women chosen will have heights above 168 cm. The probability of a woman having height above 168 cm is one half. Similarly, if 1/10 of the women have height greater than 180 cm, a woman chosen at random will have height greater than 180 cm with probability 1/10. In the same way we can find the probability of height being between any given values. When we measure a continuous quantity we are always limited by the method of measurement, and so when we say a woman's height is 170 cm we mean that it is between, say, 169.5 and 170.5 cm, depending on the accuracy with which we measure. So what we are interested in is the probability of the random variable taking values between certain limits rather than particular values.
6.2 Properties of probability
The following simple properties follow from the definition of probability.
1. A probability lies between 0.0 and 1.0. When the event never happens the probability is 0.0, when it always happens the probability is 1.0.
2. Addition rule. Suppose two events are mutually exclusive, i.e. when one happens the other cannot happen. Then the probability that one or the other happens is the sum of their probabilities. For example, a thrown die may show a one or a two, but not both. The probability that it shows a one or a two = 1/6 + 1/6 = 2/6.
3. Multiplication rule. Suppose two events are independent, i.e. knowing one has happened tells us nothing about whether the other happens. Then the probability that both happen is the product of their probabilities. For example, suppose we toss two coins. One coin does not influence the other, so the results of the two tosses are independent, and the probability of two heads occurring is 1/2 × 1/2 = 1/4. Consider two independent events A and B. The proportion of times A happens in the long run is the probability of A. Since A and B are independent, of those times when A happens, a proportion, equal to the probability of B, will have B happen also. Hence the proportion of times that A and B happen together is the probability of A multiplied by the probability of B.
6.3 Probability distributions and random variables
Suppose we have a set of events which are mutually exclusive and which includes all the events which can possibly happen. The sum of their probabilities is 1.0. The set of these probabilities make up a probability distribution. For example, if we toss a coin the two possibilities, head or tail, are mutually exclusive and these are the only events which can happen. The probability distribution is:
PROB(head) = 1/2
PROB(tail) = 1/2
Now, let us define a variable, which we will denote by the symbol X, such that X = 0 if the coin shows a tail and X = 1 if the coin shows a head. X is the number of heads shown on a single toss, which must be 0 or 1. We do not know before the toss what X will be, but do know the probability of it having any possible value. X is a random variable (§6.1) and the probability distribution is also the distribution of X. We can represent this with a diagram, as in Figure 6.1(a).
Fig. 6.1. Probability distributions for the number of heads shown in the toss of one coin and in tosses of two coins
What happens if we toss two coins at once? We now have four possible events: a head and a head, a head and a tail, a tail and a head, a tail and a tail. Clearly, these are equally likely and each has probability 1/4. Let Y be the number of heads. Y has three possible values: 0, 1, and 2. Y = 0 only when we get a tail and a tail and has probability 1/4. Similarly, Y = 2 only when we get a head and a head, so has probability 1/4. However, Y = 1 either when we get a head and tail, or when we have a tail and a head, and so has probability 1/4 + 1/4 = 1/2. We can write this probability distribution as:
PROB(Y = 0) = 1/4
PROB(Y = 1) = 1/2
PROB(Y = 2) = 1/4
The probability distribution of Y is shown in Figure 6.1(b).
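The same distributions can be obtained by listing every equally likely outcome and counting heads, which is a useful check on the reasoning. The sketch below does this with exact fractions; the function name `heads_distribution` is invented for the example, and the coins are assumed to be fair.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_coins):
    """Exact distribution of the number of heads when n fair coins are tossed."""
    outcomes = list(product("HT", repeat=n_coins))  # every sequence of H and T
    each = Fraction(1, len(outcomes))               # all sequences equally likely
    dist = Counter()
    for outcome in outcomes:
        # Addition rule: mutually exclusive sequences with the same number
        # of heads have their probabilities added together.
        dist[outcome.count("H")] += each
    return dict(dist)

for heads, prob in sorted(heads_distribution(2).items()):
    print(f"PROB(Y = {heads}) = {prob}")
```

Calling `heads_distribution(1)` gives the distribution of X for a single coin in the same way.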
6.4 The Binomial distribution
We have considered the probability distributions of two random variables: X, the number of heads in one toss of a coin, taking values 0 and 1, and Y, the number of heads when we toss two coins, taking values 0, 1 or 2. We can increase the number of coins; Figure 6.2 shows the distribution of the number of heads obtained when 15 coins are tossed. We do not need the probability of a ‘head’ to be 0.5: we can count the number of sixes when dice are thrown. Figure 6.2 also shows the distribution of the number of sixes obtained from 10 dice. In general, we can think of the coin or the die as trials, which can have outcomes success (head or six) or failure (tail or one to five). The distributions in Figures 6.1 and 6.2 are all examples of the Binomial distribution, which arises frequently in medical applications. The Binomial distribution is the distribution followed by the number of successes in n independent trials when the probability of any single trial being a success is p. The Binomial distribution is in fact a family of distributions, the members of which are defined by the values of n and p. The values which define which member of the distribution family we have are called the parameters of the distribution.
Fig. 6.2. Distribution of the number of heads shown when 15 coins are tossed and of the number of sixes shown when 10 dice are thrown, examples of the Binomial distribution
Simple randomizing devices like coins and dice are of interest in themselves, but not of obvious relevance to medicine. However, suppose we are carrying out a random sample survey to estimate the unknown prevalence, p, of a disease. Since members of the sample are chosen at random and independently from the population, the probability of any chosen subject having the disease is p. We thus have a series of independent trials, each with probability of success p, and the number of successes, i.e. members of the sample with the disease, will follow a Binomial distribution. As we shall see later, the properties of the Binomial distribution enable us to say how accurate is the estimate of prevalence obtained (§8.4).
We could calculate the probabilities for a Binomial distribution by listing all the ways in which, say, 15 coins can fall. However, there are $2^{15} = 32768$ combinations of 15 coins, so this is not very practical. Instead, there is a formula for the probability in terms of the number of throws and the probability of a head. This enables us to work these probabilities out for any probability of success and any number of trials. In general, we have n independent trials with the probability that a trial is a success being p. The probability of r successes is
$$\text{PROB}(r \text{ successes}) = \frac{n!}{r!(n-r)!}\, p^r (1-p)^{n-r}$$
where n!, called n factorial, is n × (n − 1) × (n − 2) × … × 2 × 1. This rather forbidding formula arises like this. For any particular series of r successes, each with probability p, and n − r failures, each with probability 1 − p, the probability of the series happening is $p^r(1-p)^{n-r}$, since the trials are independent and the multiplicative rule applies. The number of ways in which r things may be chosen from n things is $n!/(r!(n-r)!)$ (§6A). Only one combination can happen at one time, so we have $n!/(r!(n-r)!)$ mutually exclusive ways of having r successes, each with probability $p^r(1-p)^{n-r}$. The probability of having r successes is the sum of these $n!/(r!(n-r)!)$ probabilities, giving the formula above. Those who remember the binomial expansion in mathematics will see that this is one term of it, hence the name Binomial distribution.
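Fig. 6.3. Binomial distributions with different n, p = 0.3
We can apply this to the number of heads in tosses of two coins. The number of heads will be from a Binomial distribution with p = 0.5 and n = 2. Hence the probability of two heads (r = 2) is:
$$\text{PROB}(r = 2) = \frac{2!}{2!\,0!} \times 0.5^2 \times 0.5^0 = 1 \times 0.25 \times 1 = 0.25$$
Note that 0! = 1 (§6A), and anything to the power 0 is 1. Similarly for r = 1 and r = 0:
$$\text{PROB}(r = 1) = \frac{2!}{1!\,1!} \times 0.5^1 \times 0.5^1 = 2 \times 0.5 \times 0.5 = 0.5$$
$$\text{PROB}(r = 0) = \frac{2!}{0!\,2!} \times 0.5^0 \times 0.5^2 = 1 \times 1 \times 0.25 = 0.25$$
This is what was found for two coins in §6.3. We can use this distribution whenever we have a series of trials with two possible outcomes. If we treat a group of patients, the number who recover is from a Binomial distribution. If we measure the blood pressure of a group of people, the number classified as hypertensive is from a Binomial distribution.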
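The Binomial probability, n!/(r!(n − r)!) × p^r × (1 − p)^(n − r), is straightforward to compute. The sketch below is one way of doing so, using `math.comb` from the Python standard library for the number of ways of choosing r things from n; the function name `binomial_prob` is an assumption made for the example.

```python
from math import comb

def binomial_prob(r, n, p):
    """Probability of r successes in n independent trials, each with success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Two coins: the probabilities of 0, 1 and 2 heads.
two_coins = [binomial_prob(r, n=2, p=0.5) for r in range(3)]
print(two_coins)

# Fifteen coins: far quicker than listing all 32768 ways the coins can fall.
fifteen_coins = [binomial_prob(r, n=15, p=0.5) for r in range(16)]
print(sum(fifteen_coins))
```

The same function serves for the dice example by setting p to 1/6, or for a survey by setting p to the prevalence.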
Figure 6.3 shows the Binomial distribution for p = 0.3 and increasing values of n. The distribution becomes more symmetrical as n increases. It is converging to the Normal distribution, described in the next chapter.
6.5 Mean and variance
The number of different probabilities in a Binomial distribution can be very large and unwieldy. When n is large, we usually need to summarize these probabilities in some way. Just as a frequency distribution can be described by its mean and variance, so can a probability distribution and its associated random variable.
The mean is the average value of the random variable in the long run. It is also called the expected value or expectation and the expectation of a random variable X is usually denoted by E(X). For example, consider the number of heads in tosses of two coins. We get 0 heads in 1/4 of pairs of coins, i.e. with probability 1/4. We get 1 head in 1/2 of pairs of coins, and 2 heads in 1/4 of pairs. The average value we should get in the long run is found by multiplying each value by the proportion of pairs in which it occurs and adding:
$$0 \times \tfrac{1}{4} + 1 \times \tfrac{1}{2} + 2 \times \tfrac{1}{4} = 0 + \tfrac{1}{2} + \tfrac{1}{2} = 1$$
If we kept on tossing pairs of coins, the average number of heads per pair would be 1. Thus for any random variable which takes discrete values the mean, expectation or expected value is found by summing each possible value multiplied by its probability.
Note that the expected value of a random variable does not have to be a value that the random variable can actually take. For example, for the mean number of heads in throws of one coin we have either no heads or 1 head, each with probability half, and the expected value is 0 × ½ + 1 × ½ = ½. The number of heads must be 0 or 1, but the expected value is half, the average which we would get in the long run.
The variance of a random variable is the average squared difference from the mean. For the number of heads in tosses of two coins, 0 is 1 unit from the mean and occurs for 1/4 of pairs of coins, 1 is 0 units from the mean and occurs for half of the pairs and 2 is 1 unit from the mean and occurs for 1/4 of pairs, i.e. with probability 1/4. The variance is then found by squaring these differences, multiplying by the proportion of times the difference will occur (the probability) and adding:
$$(0-1)^2 \times \tfrac{1}{4} + (1-1)^2 \times \tfrac{1}{2} + (2-1)^2 \times \tfrac{1}{4} = \tfrac{1}{4} + 0 + \tfrac{1}{4} = \tfrac{1}{2}$$
We denote the variance of a random variable X by VAR(X). In mathematical terms,
$$\text{VAR}(X) = E\left(X^2 - E(X)^2\right)$$
The square root of the variance is the standard deviation of the random variable or distribution. We often use the Greek letters µ, pronounced ‘mu’, and σ, ‘sigma’, for the mean and standard deviation of a probability distribution. The variance is then σ².
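For a discrete random variable both summaries are simple weighted sums, as the following sketch shows for the number of heads in tosses of two coins and of one coin. Exact fractions are used so that no rounding is involved; the function names are invented for the illustration and a distribution is represented, by assumption, as a dictionary from value to probability.

```python
from fractions import Fraction

def mean(dist):
    """Expected value: each possible value multiplied by its probability, summed."""
    return sum(value * prob for value, prob in dist.items())

def variance(dist):
    """Average squared difference from the mean."""
    mu = mean(dist)
    return sum((value - mu) ** 2 * prob for value, prob in dist.items())

two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
one_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

print(mean(two_coins), variance(two_coins))
print(mean(one_coin), variance(one_coin))

# The alternative form E(X^2) - E(X)^2 gives the same variance.
mean_of_squares = sum(value ** 2 * prob for value, prob in two_coins.items())
print(mean_of_squares - mean(two_coins) ** 2)
```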
The mean and variance of the distribution of a continuous variable, of which more in Chapter 7, are defined in a similar way. Calculus is used to define them as integrals, but this need not concern us here. Essentially what happens is that the continuous scale is broken up into many very small intervals and the value of the variable in that very small interval is multiplied by the probability of being in it, then these are added.
6.6 Properties of means and variances
When we use the mean and variance of probability distributions in statistical calculations, it is not the details of their formulae which we need to know, but some of their simple properties. Most of the formulae used in statistical calculations are derived from these. The reasons for these properties are quite easy to see in a non-mathematical way.
If we add a constant to a random variable, the new variable so created has a mean equal to that of the original variable plus the constant. The variance and standard deviation will be unchanged. Suppose our random variable is human height. We can add a constant to the height by measuring the heights of people standing on a box. The mean height of people plus box will now be the mean height of the people plus the constant height of the box. The box will not alter the variability of the heights, however. The difference between the tallest and smallest, for example, will be unchanged. We can subtract a constant by asking the people to stand in a constant hole to be measured. This reduces the mean but leaves the variance unchanged as before. (My free program Clinstat (§1.3) has a simple graphics program which illustrates this.)
If we multiply a random variable by a positive constant, the mean and standard deviation are multiplied by the constant, the variance is multiplied by the square of the constant. For example, if we change our units of measurements, say from inches to centimetres, we multiply each measurement by 2.54. This has the effect of multiplying the mean by the constant, 2.54, and multiplying the standard deviation by the constant since it is in the same units as the observations. However, the variance is measured in squared units, and so is multiplied by the square of the constant. Division by a constant works in the same way. If the constant is negative, the mean is multiplied by the constant and so changes sign. The variance is multiplied by the square of the constant, which is positive, so the variance remains positive. The standard deviation, which is the square root of the variance, is always positive. It is multiplied by the absolute value of the constant, i.e. the constant without the negative sign.
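The effect of adding and multiplying by constants can be checked on a small set of numbers. The sketch below treats four invented heights in inches as a complete distribution, each value being equally likely, and uses the population versions of the variance and standard deviation from Python's `statistics` module. The heights and the 12 inch box are hypothetical.

```python
from statistics import mean, pstdev, pvariance

heights_in = [61.0, 64.0, 66.0, 69.0]      # hypothetical heights in inches
on_box = [h + 12.0 for h in heights_in]    # standing on a 12 inch box
heights_cm = [h * 2.54 for h in heights_in]  # change of units to centimetres
negated = [h * -1.0 for h in heights_in]   # multiplying by a negative constant

# Adding a constant shifts the mean and leaves the variance alone.
print(mean(on_box) - mean(heights_in), pvariance(on_box), pvariance(heights_in))

# Multiplying by 2.54 multiplies the mean and SD by 2.54, the variance by 2.54 squared.
print(mean(heights_cm) / mean(heights_in))
print(pstdev(heights_cm) / pstdev(heights_in))
print(pvariance(heights_cm) / pvariance(heights_in))

# A negative constant changes the sign of the mean but not of the SD.
print(mean(negated), pstdev(negated))
```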
If we add two random variables the mean of the sum is the sum of the means, and, if the two variables are independent, the variance of the sum is the sum of their variances. We can do this by measuring the height of people standing on boxes of random height. The mean height of people on boxes is the mean height of people + the mean height of the boxes. The variability of the heights is also increased. This is because some short people will find themselves on small boxes, and some tall people will find themselves on large boxes. If the two variables are not independent, something different happens. The mean of the sum remains the sum of the means, but the variance of the sum is not the sum of the variances. Suppose our people have decided to stand on the boxes, not just at a statistician's whim, but for a purpose. They wish to change a light bulb, and so must reach a required height. Now the short people must pick large boxes, whereas tall people can make do with small ones. The result is a reduction in variability to almost nothing. On the other hand, if we told the tallest people to find the largest boxes and the shortest to find the smallest boxes, the variability would be increased. Independence is an important condition.
If we subtract one random variable from another, the mean of the difference is the difference between the means, and, if the two variables are independent, the variance of the difference is the sum of their variances. Suppose we measure the heights above ground level of our people standing in holes of random depth. The mean height above ground is the mean height of the people minus the mean depth of the hole. The variability is increased, because some short people stand in deep holes and some tall people stand in shallow holes. If the variables are not independent, the additivity of the variances breaks down, as it did for the sum of two variables. When the people try to hide in the holes, and so must find a hole deep enough to hold them, the variability is again reduced.
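These rules can be verified exactly for a toy example. In the sketch below a fair die stands in for both the person's height and the box height, which is purely an assumption for illustration. Pairing every face with every other face represents independence; pairing each face a with 7 − a mimics the light bulb example, and pairing each face with itself mimics giving the largest boxes to the tallest people.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Variance of a variable that takes each listed value with equal probability."""
    n = len(values)
    mu = Fraction(sum(values), n)
    return sum((v - mu) ** 2 for v in values) / n

faces = list(range(1, 7))
pairs = list(product(faces, faces))  # independent: every pairing equally likely

var_single = variance(faces)
var_sum = variance([a + b for a, b in pairs])        # independent sum
var_diff = variance([a - b for a, b in pairs])       # independent difference
var_compensating = variance([a + (7 - a) for a in faces])  # short person, large box
var_matched = variance([a + a for a in faces])             # tall person, large box

print(var_single, var_sum, var_diff, var_compensating, var_matched)
```

For the independent pairings the variances of both the sum and the difference are twice the variance of a single die. The compensating choice removes the variability altogether, and the matched choice makes it larger than the sum of the variances.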
The effects of multiplying two random variables and of dividing one by another are much more complicated. Fortunately we rarely need to do this.
We can now find the mean and variance of the Binomial distribution with parameters n and p. First consider n = 1. Then the probability distribution is:
value  probability
0  1 − p
1  p
The mean is therefore 0 × (1 − p) + 1 × p = p. The variance is
$$(0-p)^2 \times (1-p) + (1-p)^2 \times p = p(1-p)(p + 1 - p) = p(1-p)$$
Now, a variable from the Binomial distribution with parameters n and p is the sum of n independent variables from the Binomial distribution with parameters 1 and p. So its mean is the sum of n means all equal to p, and its variance is the sum of n variances all equal to p(1 − p). Hence the Binomial distribution has mean = np and variance = np(1 − p). For large sample problems, these are more useful than the Binomial probability formula.
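The results mean = np and variance = np(1 − p) can be confirmed directly from the individual Binomial probabilities. The sketch below does so with exact fractions for the number of sixes among 10 dice (n = 10, p = 1/6); the function name and the choice of example are assumptions made here.

```python
from fractions import Fraction
from math import comb

def binomial_distribution(n, p):
    """All the Binomial probabilities for n trials with success probability p."""
    return {r: comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)}

n, p = 10, Fraction(1, 6)
dist = binomial_distribution(n, p)

# Mean and variance worked out the long way, from the definition.
mean = sum(r * prob for r, prob in dist.items())
variance = sum((r - mean) ** 2 * prob for r, prob in dist.items())

print(mean, n * p)
print(variance, n * p * (1 - p))
```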
The properties of means and variances of random variables enable us to find a formal solution to the problem of degrees of freedom for the sample variance discussed in Chapter 4. We want an estimate of variance whose expected value is the population variance. The expected value of $\sum (x_i - \bar{x})^2$ can be shown to be (n − 1)VAR(x) (§6B) and hence we divide by n − 1, not n, to get our estimate of variance.
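The claim that the sum of squares about the sample mean has expected value (n − 1)VAR(x) can be checked by brute force for a small case. The sketch below lists every possible sample of size 3 drawn independently from the six faces of a fair die, which is simply a convenient population chosen for illustration, and averages the sum of squares over all of them using exact fractions.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # a fair die; an assumption for illustration
n = 3                            # sample size

def sum_of_squares(sample):
    """Sum of squared differences from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

# Every possible sample of n independent draws is equally likely.
samples = list(product(population, repeat=n))
expected_ss = sum(sum_of_squares(s) for s in samples) / len(samples)

pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

print(expected_ss, (n - 1) * pop_var)
print(expected_ss / (n - 1), pop_var)  # dividing by n - 1 recovers the variance
```

Dividing the expected sum of squares by n instead would give a value smaller than the population variance.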
Fig. 6.4. Poisson distributions with four different means
6.7* The Poisson distribution
The Binomial distribution is one of many probability distributions which are used in statistics. It is a discrete distribution, that is it can take only a finite set of possible values, and is the discrete distribution most commonly encountered in medical applications. One other discrete distribution is worth discussing at this point, the Poisson distribution. Although, like the Binomial, the Poisson distribution arises from a simple probability model, the mathematics involved is more complicated and will be omitted.
Suppose events happen randomly and independently in time at a constant rate. The Poisson distribution is the distribution followed by the number of events which happen in a fixed time interval. If events happen with rate µ events per unit time, the probability of r events happening in unit time is
$$\text{PROB}(r \text{ events}) = \frac{e^{-\mu}\mu^r}{r!}$$
where e = 2.718…, the mathematical constant. If events happen randomly and independently in space, the Poisson distribution gives the probabilities for the number of events in unit volume or area.
There is seldom any need to use individual probabilities of this distribution, as its mean and variance suffice. The mean of the Poisson distribution for the number of events per unit time is simply the rate, µ. The variance of the Poisson distribution is also equal to µ. Thus the Poisson is a family of distributions, like the Binomial, but with only one parameter, µ. This distribution is important, because deaths from many diseases can be treated as occurring randomly and independently in the population. Thus, for example, the number of deaths from lung cancer in one year among people in an occupational group, such as coal miners, will be an observation from a Poisson distribution, and we can use this to make comparisons between mortality rates (§16.3).
Figure6.4showsthePoissondistributionforfourdifferentmeans.YouwillseethatasthemeanincreasesthePoissondistributionlooksratherliketheBinomialdistributioninFigure6.3.Weshalldiscussthissimilarityfurtherinthenextchapter.
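The statement that the Poisson mean and variance both equal µ can be checked numerically. This is a minimal sketch assuming the standard Poisson formula e^(-µ)µ^r/r!; the rate µ = 3 is an arbitrary illustrative choice, and the sum is cut off at r = 59, beyond which the probabilities are negligible for this rate.

```python
from math import exp, factorial

def poisson_prob(r, mu):
    """Probability of exactly r events when the mean number of events is mu."""
    return exp(-mu) * mu**r / factorial(r)

mu = 3.0  # illustrative rate, events per unit time
probs = [poisson_prob(r, mu) for r in range(60)]  # tail beyond 59 is negligible here

total = sum(probs)
mean = sum(r * pr for r, pr in enumerate(probs))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
print(total, mean, variance)
```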
6.8* Conditional probability
Sometimes we need to think about the probability of an event if another event has happened. For example, we might ask what is the probability that a patient has coronary artery disease if he or she has tingling pain in the left arm. This is called a conditional probability, the probability of the event (coronary artery disease) given a condition (tingling pain). We write this probability thus, separating the event and the condition by a vertical bar:
PROB(coronaryarterydisease|tinglingpain)
Conditional probabilities are useful in statistical aids to diagnosis (§15.7). For a simpler example, we can go back to tosses of two coins. If we toss one coin then the other, the first toss alters the probabilities for the possible outcomes for the two coins:
PROB(bothcoinsheads|firstcoinhead)=0.5
PROB(headandtail|firstcoinhead)=0.5
PROB(bothcoinstails|firstcoinhead)=0.0
and
PROB(bothcoinsheads|firstcointail)=0.0
PROB(headandtail|firstcointail)=0.5
PROB(bothcoinstails|firstcointail)=0.5
Themultiplicativerule(§6.2)canbeextendedtodealwitheventswhicharenotindependent.FortwoeventsAandB:
PROB(AandB)=PROB(A|B)PROB(B)=PROB(B|A)PROB(A).
It is important to understand that PROB(A|B) and PROB(B|A) are not the same. For example, Table 6.1 shows the relationship between two diseases, hay fever and eczema, in a large group of children. The probability that in this group a child with hay fever will have eczema also is
PROB(eczema | hay fever) = 141/1069 = 0.13
the proportion of children with hay fever who have eczema also. This is clearly much less than the probability that a child with eczema will have hay fever,
PROB(hay fever | eczema) = 141/561 = 0.25
the proportion of children with eczema who have hay fever also.
Table 6.1. Relationship between hay fever and eczema at age 11 in the National Child Development Study
             Hay fever
Eczema       Yes      No       Total
Yes          141      420      561
No           928      13525    14453
Total        1069     13945    15014
This may look obvious, but confusion between conditional probabilities is common and can cause serious problems, for example in the consideration of forensic evidence. Typically, this will produce the probability that material found at a crime scene (DNA, fibres, etc.) will match the suspect as closely as it does given that the material did not come from the suspect. This is
PROB(evidence|suspectnotatcrimescene).
Itisnotthesameas
PROB(suspectnotatcrimescene|evidence),
but this is often how it is interpreted, an inversion known as the prosecutor's fallacy.
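The hay fever and eczema example can be reproduced from the four cell counts of Table 6.1. The sketch below uses variable names of my own choosing; it shows that the two conditional probabilities differ and that both routes through the extended multiplicative rule give the same PROB(A and B).

```python
# Cell counts from Table 6.1 (rows: eczema, columns: hay fever)
both = 141             # eczema and hay fever
eczema_only = 420      # eczema, no hay fever
hay_fever_only = 928   # hay fever, no eczema
neither = 13525

hay_fever_total = both + hay_fever_only
eczema_total = both + eczema_only
total = both + eczema_only + hay_fever_only + neither

p_eczema_given_hay_fever = both / hay_fever_total
p_hay_fever_given_eczema = both / eczema_total

# Multiplicative rule: PROB(A and B) = PROB(A|B)PROB(B) = PROB(B|A)PROB(A)
p_both = both / total
via_hay_fever = p_eczema_given_hay_fever * (hay_fever_total / total)
via_eczema = p_hay_fever_given_eczema * (eczema_total / total)

print(round(p_eczema_given_hay_fever, 2), round(p_hay_fever_given_eczema, 2))
```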
Appendices
6AAppendix:Permutationsandcombinations
Forthosewhoneverknew,orhaveforgotten,thetheoryofcombinations,itgoeslikethis.First,welookatthenumberofpermutations,i.e.waysofarrangingasetofobjects.Supposewehavenobjects.Howmanywayscanweorderthem?Thefirstobjectcanbechosennways,i.e.anyobject.Foreachfirstobjecttherearen-1possiblesecondobjects,sotherearen×(n-1)possiblefirstandsecondpermutations.Therearenowonlyn-2choicesforthethirdobject,n-3choicesforthefourth,andsoon,untilthereisonlyonechoiceforthelast.Hence,therearen×(n-1)×(n-2)×…×2×1permutationsofnobjects.Wecallthisnumberthefactorialofnandwriteit‘n!’.
Nowwewanttoknowhowmanywaysthereareofchoosingrobjectsfromnobjects.Havingmadeachoiceofrobjects,wecanorderthoseinr!ways.Wecanalsoorderthen-rnotchosenin(n-r)!ways.Sotheobjectscanbeorderedinr!(n-r)!wayswithoutalteringtheobjectschosen.Forexample,saywechoosethefirsttwofromthreeobjects,A,BandC.TheniftheseareAandB,twopermutationsgivethischoice,ABCandBAC.Thisis,ofcourse,2!×1!=2permutations.Eachcombinationofrthingsaccountsforr!(n-r)!ofthen!permutationspossible,sothereare
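\[ \frac{n!}{r!(n-r)!} \]
possible combinations. For example, consider the number of combinations of two objects out of three, say A, B and C. The possible choices are AB, AC and BC. There is no other possibility. Applying the formula, we have n = 3 and r = 2 so
\[ \frac{n!}{r!(n-r)!} = \frac{3!}{2!(3-2)!} = \frac{3 \times 2 \times 1}{(2 \times 1) \times 1} = 3. \]
Sometimes in using this formula we come across r = 0 or r = n leading to 0!. This cannot be defined in the way we have chosen, but we can calculate its only possible value, 0! = 1. Because there is only one way of choosing n objects from n, we have
\[ \frac{n!}{n!(n-n)!} = \frac{n!}{n! \times 0!} = \frac{1}{0!} = 1 \]
so 0! = 1.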
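The counting argument above can be checked by brute force for the three objects A, B and C. This sketch uses Python's standard itertools and math modules; the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import comb, factorial

objects = ["A", "B", "C"]
n, r = len(objects), 2

# All orderings of the n objects: there should be n! of them
n_permutations = len(list(permutations(objects)))

# All choices of r objects, ignoring order
choices = ["".join(c) for c in combinations(objects, r)]

# The formula n!/(r!(n-r)!), which math.comb also implements
by_formula = factorial(n) // (factorial(r) * factorial(n - r))

print(n_permutations, choices, by_formula, comb(n, r), factorial(0))
```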
6BAppendix:Expectedvalueofasumofsquares
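The properties of means and variances described in §6.6 can be used to answer the question raised in §4.7 and §4A about the divisor in the sample variance. We ask why the variance from a sample is
\[ s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2 \]
and not
\[ \frac{1}{n}\sum (x_i - \bar{x})^2. \]
We shall be concerned with the general properties of samples of size n, so we shall treat n as a constant and xi and x̄ as random variables. We shall suppose xi has mean µ and variance σ².
The expected value of the sum of squares is
\[ E\left(\sum (x_i - \bar{x})^2\right) = E\left(\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2\right) = E\left(\sum x_i^2\right) - \frac{1}{n}E\left(\left(\sum x_i\right)^2\right) \]
because the expected value of the difference is the difference between the expected values and n is a constant. Now, the population variance σ² is the average squared distance from the population mean µ, so
\[ \sigma^2 = E\left((x_i - \mu)^2\right) = E\left(x_i^2 - 2\mu x_i + \mu^2\right) = E(x_i^2) - 2\mu E(x_i) + \mu^2 \]
because µ is a constant. Because E(xi) = µ, we have
\[ \sigma^2 = E(x_i^2) - 2\mu^2 + \mu^2 = E(x_i^2) - \mu^2 \]
and so we find E(xi²) = σ² + µ² and so E(Σxi²) = n(σ² + µ²), being the sum of n numbers all of which are σ² + µ². We now find the value of E((Σxi)²). We need
\[ E\left(\sum x_i\right) = n\mu, \qquad \mathrm{VAR}\left(\sum x_i\right) = n\sigma^2. \]
Just as E(xi²) = σ² + µ² = VAR(xi) + (E(xi))², so
\[ E\left(\left(\sum x_i\right)^2\right) = \mathrm{VAR}\left(\sum x_i\right) + \left(E\left(\sum x_i\right)\right)^2 = n\sigma^2 + n^2\mu^2. \]
So
\[ E\left(\sum (x_i - \bar{x})^2\right) = n(\sigma^2 + \mu^2) - \frac{1}{n}\left(n\sigma^2 + n^2\mu^2\right) = n\sigma^2 + n\mu^2 - \sigma^2 - n\mu^2 = (n-1)\sigma^2. \]
So the expected value of the sum of squares is (n - 1)σ² and we must divide the sum of squares by n - 1, not n, to obtain the estimate of the variance, σ².
We shall find the variance of the sample mean, x̄, useful later (§8.2):
\[ \mathrm{VAR}(\bar{x}) = \mathrm{VAR}\left(\frac{1}{n}\sum x_i\right) = \frac{1}{n^2}\mathrm{VAR}\left(\sum x_i\right) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}. \]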
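The result that the expected sum of squares is (n - 1)σ² can be verified exactly for a small case by listing every possible sample. The sketch below assumes a made-up population of three equally likely values (1, 2 and 6) and independent samples of size 3; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 6]  # hypothetical population, each value equally likely
n = 3                   # sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def sum_of_squares(sample):
    """Sum of squared differences from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

# Every possible sample of n independent observations, all equally likely
samples = list(product(population, repeat=n))

expected_ss = sum(sum_of_squares(s) for s in samples) / len(samples)
var_of_mean = sum((Fraction(sum(s), n) - mu) ** 2 for s in samples) / len(samples)

print(expected_ss, (n - 1) * sigma2)  # expected sum of squares vs (n-1) * variance
print(var_of_mean, sigma2 / n)        # variance of the sample mean vs variance / n
```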
6M Multiple choice questions 25 to 31
(Each branch is either true or false.)
25.TheeventsAandBaremutuallyexclusive,so:
(a)PROB(AorB)=PROB(A)+PROB(B);
(b)PROB(AandB)=0;
(c)PROB(AandB)=PROB(A)PROB(B);
(d)PROB(A)=PROB(B);
(e)PROB(A)+PROB(B)=1.
26.Theprobabilityofawomanaged50havingconditionXis0.20andtheprobabilityofherhavingconditionYis0.05.Theseprobabilitiesareindependent:
Fig.19.5.PiechartshowingthedistributionofpatientsinTootingBecHospitalbydiagnosticgroup
Fig.19.6.BarchartshowingtheresultsoftheSalkvaccinetrial
(a)theprobabilityofherhavingbothconditionsis0.01;
(b)theprobabilityofherhavingbothconditionsis0.25;
(c)theprobabilityofherhavingeitherX,orY,orbothis0.24;
(d)ifshehasconditionX,theprobabilityofherhavingYalsois0.01;
(e)ifshehasconditionY,theprobabilityofherhavingXalsois0.20.
27.ThefollowingvariablesfollowaBinomialdistribution:
(a)numberofsixesin20throwsofadie;
(b)humanweight;
(c)numberofarandomsampleofpatientswhorespondtoatreatment;
(d)numberofredcellsin1mlofblood;
(e)proportionofhypertensivesinarandomsampleofadultmen.
28.Twoparentseachcarrythesamerecessivegenewhicheachtransmitstotheirchildwithprobability0.5.Iftheirchildwilldevelopclinicaldiseaseifitinheritsthegenefrombothparentsandwillbeacarrierifitinheritsthegenefromoneparentonlythen:
(a) the probability that their next child will have clinical disease is 0.25;
(b) the probability that two successive children will both develop clinical disease is 0.25 × 0.25;
(c) the probability their next child will be a carrier without clinical disease is 0.50;
(d) the probability of a child being a carrier or having clinical disease is 0.75;
(e) if the first child does not have clinical disease, the probability that the second child will not have clinical disease is 0.75².
Table 6.2. Number of men remaining alive at ten year intervals (from English Life Table No. 11, Males)
Age in years, x   Number surviving, lx   Age in years, x   Number surviving, lx
0                 1000                   60                758
10                959                    70                524
20                952                    80                211
30                938                    90                22
40                920                    100               0
50                876
29.Ifacoinisspuntwiceinsuccession:
(a)theexpectednumberoftailsis1.5;
(b)theprobabilityoftwotailsis0.25;
(c)thenumberoftailsfollowsaBinomialdistribution;
(d)theprobabilityofatleastonetailis0.5;
(e)thedistributionofthenumberoftailsissymmetrical.
30.IfXisarandomvariable,meanµandvarianceσ2:
(a)E(X+2)=µ;
(b)VAR(X+2)=σ2;
(c)E(2X)=2µ;
(d)VAR(2X)=2σ2;
(e)VAR(X/2)=σ2/4.
31.IfXandYareindependentrandomvariables:
(a)VAR(X+Y)=VAR(X)+VAR(Y);
(b)E(X+Y)=E(X)+E(Y);
(c)E(X-Y)=E(X)-E(Y);
(d)VAR(X-Y)=VAR(X)-VAR(Y);
(e)VAR(-X)=-VAR(X).
6E Exercise: Probability and the life table
In this exercise we shall apply some of the basic laws of probability to a practical exercise. The data are based on a life table. (I shall say more about these in §16.4.) Table 6.2 shows the number of men, from a group numbering 1000 at birth, who we would expect to be alive at different ages. Thus, for example, after 10 years, we see that 959 survive and so 41 have died, at 20 years 952 survive and so 48 have died, 41 between ages 0 and 9 and 7 between ages 10 and 19.
1.Whatistheprobabilitythatanindividualchosenatrandomwillsurvivetoage10?
2.Whatistheprobabilitythatthisindividualwilldiebeforeage10?Whichpropertyofprobabilitydoesthisdependon?
3. What are the probabilities that the individual will survive to ages 10, 20, 30, 40, 50, 60, 70, 80, 90, 100? Is this set of probabilities a probability distribution?
4.Whatistheprobabilitythatanindividualaged60yearssurvivestoage70?
5. What is the probability that two men aged 60 will both survive to age 70? Which property of probability is used here?
6.Ifwehad100individualsaged60,howmanywouldweexpecttoattainage70?
7.Whatistheprobabilitythatamandiesinhisseconddecade?YoucanusethefactthatPROB(deathin2nd)+PROB(survivesto3rd)=PROB(survivesto2nd).
8.Foreachdecade,whatistheprobabilitythatagivenmanwilldieinthatdecade?Thisisaprobabilitydistribution—why?Sketchthedistribution.
9.Asanapproximation,wecanassumethattheaveragenumberofyearslivedinthedecadeofdeathis5.Thus,thosewhodieinthe2nddecadewillhaveanaveragelifespanof15years.Theprobabilityofdyinginthe2nddecadeis0.007,i.e.aproportion0.007ofmenhaveameanlifetimeof15years.Whatisthemeanlifetimeofallmen?Thisistheexpectationoflifeatbirth.
7
The Normal distribution
7.1 Probability for continuous variables
When we derived the theory of probability in the discrete case, we were able to say what the probability was of a random variable taking a particular value. As the number of possible values increases, the probability of a particular value decreases. For example, in the Binomial distribution with p = 0.5 and n = 2, the most likely value, 1, has probability 0.5. In the Binomial distribution with p = 0.5 and n = 100 the most likely value, 50, has probability 0.08. In such cases we are usually more interested in the probability of a range of values than one particular value.
Foracontinuousvariable,suchasheight,thesetofpossiblevaluesisinfiniteandtheprobabilityofanyparticularvalueiszero(§6.1).Weareinterestedintheprobabilityoftherandomvariabletakingvaluesbetweencertainlimitsratherthantakingparticularvalues.Iftheproportionofindividualsinthepopulationwhosevaluesarebetweengivenlimitsisp,andwechooseanindividualatrandom,theprobabilityofchoosinganindividualwholiesbetweentheselimitsisequaltop.Thiscomesfromourdefinitionofprobability,thechoiceofeachindividualbeingequallylikely.Theproblemisfindingandgivingavaluetothisprobability.
When we find the frequency distribution for a sample of observations, we count the number of values which fall within certain limits (§4.2). We can represent this as a histogram such as Figure 7.1 (§4.3). One way of presenting the histogram is as relative frequency density, the proportion of observations in the interval per unit of X (§4.3). Thus, when the interval size is 5, the relative frequency density is the relative frequency divided by 5 (Figure 7.1). The relative frequency in an interval is now represented by the width of the interval multiplied by the density, which gives the area of the rectangle. Thus, the relative frequency between any two points can be found from the area under the histogram between the points. For example, to estimate the relative frequency between 10 and 20 in Figure 7.1 we have the density from 10 to 15 as 0.05 and between 15 and 20 as 0.03. Hence the relative frequency is
0.05×(15-10)+0.03×(20-15)=0.25+0.15=0.40
Ifwetakealargersamplewecanusesmallerintervals.Wegetasmootherlookinghistogram,asinFigure7.2,andaswetakelargerandlargersamples,andsosmallerandsmallerintervals,wegetashapeveryclosetoasmoothcurve(Figure7.3).Asthesamplesizeapproachesthatofthepopulation,whichwecanassumetobeverylarge,thiscurvebecomestherelativefrequencydensityofthewholepopulation.Thuswecanfindtheproportionofobservationsbetweenanytwolimitsbyfindingtheareaunderthecurve,asindicatedinFigure7.3.
Fig.7.1.Histogramshowingrelativefrequencydensity
Fig.7.2.Theeffectonafrequencydistributionofincreasingsamplesize
Ifweknowtheequationofthiscurve,wecanfindtheareaunderit.(Mathematicallywedothisbyintegration,butwedonotneedtoknowhowtointegratetouseortounderstandpracticalstatistics—alltheintegralsweneedhavebeendoneandtabulated.)Now,ifwechooseanindividualatrandom,theprobabilitythatXliesbetweenanygivenlimitsisequaltotheproportionofindividualswhofallbetweentheselimits.Hence,therelativefrequencydistributionforthewholepopulationgivesustheprobabilitydistributionofthevariable.Wecallthiscurvetheprobabilitydensityfunction.
Fig.7.3.Relativefrequencydensityorprobabilitydensityfunction,showingtheprobabilityofanobservationbetween10and20
Fig.7.4.Mean,µ,standarddeviation,σ,andaprobabilitydensityfunction
Probabilitydensityfunctionshaveanumberofgeneralproperties.Forexample,thetotalareaunderthecurvemustbeone,sincethisisthetotalprobabilityofallpossibleevents.Continuousrandomvariableshavemeans,variancesandstandarddeviationsdefinedinasimilarwaytothosefordiscreterandomvariablesandpossessingthesameproperties(§6.5).Themeanwillbesomewherenearthemiddleofthecurveandmostoftheareaunderthecurvewillbebetweenthemeanminustwostandarddeviationsandthemeanplustwostandarddeviations(Figure7.4).
The precise shape of the curve is more difficult to ascertain. There are many possible probability density functions and some of these can be shown to arise from simple probability situations, as were the Binomial and Poisson distributions. However, most continuous variables with which we have to deal, such as height, blood pressure, serum cholesterol, etc., do not arise from simple probability situations. As a result, we do not know the probability distribution for these measurements on theoretical grounds. As we shall see, we can often find a standard distribution whose mathematical properties are known, which fits observed data well and which enables us to draw conclusions about them. Further, as sample size increases the distribution of certain statistics calculated from the data, such as the mean, becomes independent of the distribution of the observations themselves and follows one particular distribution form, the Normal distribution. We shall devote the remainder of this chapter to a study of this distribution.
Fig.7.5.Binomialdistributionsforp=0.3andsixdifferentvaluesofn,withcorrespondingNormaldistributioncurves
7.2 The Normal distribution
The Normal distribution, also known as the Gaussian distribution, may be regarded as the fundamental probability distribution of statistics. The word ‘normal’ is not used here in its common meaning of ‘ordinary or common’, or its medical meaning of ‘not diseased’. The usage relates to its older meaning of ‘conforming to a rule or pattern’, and as we shall see, the Normal distribution is the form to which the Binomial distribution tends as its parameter n increases. There is no implication that most variables follow a Normal distribution.
WeshallstartbyconsideringtheBinomialdistributionasnincreases.Wesawin§6.4that,asnincreases,theshapeofthedistributionchanges.Themostextremepossiblevaluesbecomelesslikelyandthedistributionbecomesmoresymmetrical.Thishappenswhateverthevalueofp.Thepositionofthedistributionalongthehorizontalaxis,anditsspread,arestilldeterminedbyp,buttheshapeisnot.Asmoothcurvecanbedrawnwhichgoesveryclosetothesepoints.ThisistheNormaldistributioncurve,thecurveofthecontinuousdistributionwhichtheBinomialdistributionapproachesasnincreases.
Any Binomial distribution may be approximated by the Normal distribution of the same mean and variance provided n is large enough. Figure 7.5 shows the Binomial distributions of Figure 6.3 with the corresponding Normal distribution curves. From n = 10 onwards the two distributions are very close. Generally, if both np and n(1 - p) exceed 5 the approximation of the Binomial to the Normal distribution is quite good enough for most practical purposes. See §8.4 for an application. The Poisson distribution has the same property, as Figure 6.4 suggests.
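One way to see the approximation improve with n is to compare each Binomial probability with the height of the Normal curve of the same mean and variance at that value. This is only a sketch: p = 0.3 follows Figure 7.5, but the particular values of n and the 'largest gap' measure are illustrative choices of mine.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_prob(r, n, p):
    """Probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def largest_gap(n, p):
    """Largest absolute difference between a Binomial probability and the
    density of the Normal distribution with mean np and variance np(1-p)."""
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return max(abs(binomial_prob(r, n, p) - normal.pdf(r)) for r in range(n + 1))

p = 0.3
gaps = {n: largest_gap(n, p) for n in (10, 50, 100)}
for n, gap in gaps.items():
    print(n, "np =", n * p, "n(1-p) =", n * (1 - p), "largest gap =", round(gap, 4))
```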
Fig.7.6.SumsofobservationsfromaUniformdistribution
The Binomial variable may be regarded as the sum of n independent identically distributed random variables, each being the outcome of one trial taking value 1 with probability p. In general, if we have any series of independent, identically distributed random variables, then their sum tends to a Normal distribution as the number of variables increases. This is known as the central limit theorem. As most sets of measurements are observations of such a series of random variables, this is a very important property. From it, we can deduce that the sum or mean of any large series of independent observations follows, approximately, a Normal distribution.
For example, consider the Uniform or Rectangular distribution. This is the distribution where all values between two limits, say 0 and 1, are equally likely and no other values are possible. Observations from this arise if we take random digits from a table of random numbers such as Table 2.3. Each observation of the Uniform variable is formed by a series of such digits placed after a decimal point. On a microcomputer, this is usually the distribution produced by the RND(X) function in the BASIC language. Figure 7.6 shows the histogram for the frequency distribution of 500 observations from the Uniform distribution between 0 and 1. It is quite different from the Normal distribution. Now suppose we create a new variable by taking two Uniform variables and adding them (Figure 7.6). The shape of the distribution of the sum of two Uniform variables is quite different from the shape of the Uniform distribution. The sum is unlikely to be close to either extreme, here 0 or 2, and observations are concentrated in the middle near the expected value. The reason for this is that to obtain a low sum, both the Uniform variables forming it must be low; to make a high sum both must be high. But we get a sum near the middle if the first is high and the second low, or the first is low and second high, or both first and second are moderate. The distribution of the sum of two is much closer to the Normal than is the Uniform distribution itself. However, the abrupt cut-off at 0 and at 2 is unlike the corresponding Normal distribution. Figure 7.6 also shows the result of adding four Uniform variables and six Uniform variables. The similarity to the Normal distribution increases as the number added increases and for the sum of six the correspondence is so close that the distributions could not easily be told apart.
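The experiment behind Figure 7.6 is easy to repeat. The following sketch draws 500 sums of 1, 2, 4 and 6 Uniform variables using Python's random module in place of the BASIC RND function; the seed and the summary printed are my own choices, and a histogram of each list would be needed to see the shapes described above.

```python
import random
from statistics import mean, pvariance

random.seed(1)  # fixed seed so that repeated runs give the same numbers

def uniform_sums(k, size=500):
    """Return `size` observations, each the sum of k Uniform(0, 1) variables."""
    return [sum(random.random() for _ in range(k)) for _ in range(size)]

results = {k: uniform_sums(k) for k in (1, 2, 4, 6)}

for k, sums in results.items():
    # The sum of k Uniform(0, 1) variables has mean k/2 and variance k/12
    print(k, round(mean(sums), 2), round(pvariance(sums), 3), k / 2, round(k / 12, 3))
```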
The approximation of the Binomial to the Normal distribution is a special case of the central limit theorem. The Poisson distribution is another. If we take a set of Poisson variables with the same rate and add them, we will get a variable which is the number of random events in a longer time interval (the sum of the intervals for the individual variables) and which is therefore a Poisson distribution with increased mean. As it is the sum of a set of independent, identically distributed random variables it will tend towards the Normal as the mean increases. Hence as the mean increases the Poisson distribution becomes approximately Normal. For most practical purposes this is when the mean exceeds 10. The similarity between the Poisson and the Binomial noted in §6.7 is a part of a more general convergence shown by many other distributions.
7.3 Properties of the Normal distribution
In its simplest form the equation of the Normal distribution curve, called the Standard Normal distribution, is usually denoted by φ(z), where φ is the Greek letter ‘phi’:
\[ \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) \]
where π is the usual mathematical constant. The medical reader can be reassured that we do not need to use this forbidding formula in practice. The Standard Normal distribution has a mean of 0, a standard deviation of 1 and a shape as shown in Figure 7.7. The curve is symmetrical about the mean and often described as ‘bell-shaped’ (though I have never seen a bell like it). We can note that most of the area, i.e. the probability, is between -1 and +1, the large majority between -2 and +2, and almost all between -3 and +3.
Although the Normal distribution curve has many remarkable properties, it has one rather awkward one: it cannot be integrated analytically. In other words, there is no simple formula for the probability of a random variable from a Normal distribution lying between given limits. The areas under the curve can be found numerically, however, and these have been calculated and tabulated. Table 7.1 shows the area under the probability density curve for different values of the Normal distribution. To be more precise, for a value z the table shows the area under the curve to the left of z, i.e. from minus infinity to z (Figure 7.8). Thus Φ(z) is the probability that a value chosen at random from the Standard Normal distribution will be less than z. Φ is the Greek capital ‘phi’. Note that half this table is not strictly necessary. We need only the half for positive z as Φ(-z) + Φ(z) = 1. This arises from the symmetry of the distribution. To find the probability of z lying between two values a and b, where b > a, we find Φ(b) - Φ(a). To find the probability of z being greater than a we find 1 - Φ(a). These formulae are all examples of the additive law of probability. Table 7.1 gives only a few values of z, and much more extensive ones are available (Lindley and Miller 1955, Pearson and Hartley 1970). Good statistical computer programs will calculate these values when they are needed.
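As an illustration of how a program might calculate these areas, Φ(z) can be written in terms of the error function, which Python's math module evaluates numerically. This is one possible sketch, not a description of how any particular statistical package works; the function names are my own.

```python
from math import erf, sqrt

def Phi(z):
    """Area under the Standard Normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def prob_between(a, b):
    """Probability of a Standard Normal value lying between a and b (b > a)."""
    return Phi(b) - Phi(a)

def prob_greater(a):
    """Probability of a Standard Normal value being greater than a."""
    return 1 - Phi(a)

for z in (-2.0, -1.0, 0.0, 1.0):
    print(z, round(Phi(z), 3))  # compare with Table 7.1
print(round(prob_between(-1, 1), 3), round(prob_greater(1.0), 3))
```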
Fig.7.7.TheStandardNormaldistribution
Table7.1.TheNormaldistribution
z Φ(z) z Φ(z) z Φ(z) z Φ(z)
-3.0 0.001 -2.0 0.023 -1.0 0.159 0.0 0.500
-2.9 0.002 -1.9 0.029 -0.9 0.184 0.1 0.540
-2.8 0.003 -1.8 0.036 -0.8 0.212 0.2 0.579
-2.7 0.003 -1.7 0.045 -0.7 0.242 0.3 0.618
-2.6 0.005 -1.6 0.055 -0.6 0.274 0.4 0.655
-2.5 0.006 -1.5 0.067 -0.5 0.309 0.5 0.691
-2.4 0.008 -1.4 0.081 -0.4 0.345 0.6 0.726
-2.3 0.011 -1.3 0.097 -0.3 0.382 0.7 0.758
-2.2 0.014 -1.2 0.115 -0.2 0.421 0.8 0.788
-2.1 0.018 -1.1 0.136 -0.1 0.460 0.9 0.816
-2.0 0.023 -1.0 0.159 0.0 0.500 1.0 0.841
There is another way of tabulating a distribution, using what are called percentage points. The one-sided P percentage point of a distribution is the value z such that there is a probability P% of an observation from that distribution being greater than or equal to z (Figure 7.8). The two-sided P percentage point is the value z such that there is a probability P% of an observation being greater than or equal to z or less than or equal to -z (Figure 7.8). Table 7.2 shows both one-sided and two-sided percentage points for the Normal distribution. The probability is quoted as a percentage because when we use percentage points we are usually concerned with rather small probabilities, such as 0.05 or 0.01, and use of the percentage form, making them 5% and 1%, cuts out the leading zero.
Table 7.2. Percentage points of the Normal distribution
One-sided          Two-sided
P1(z)    z         P2(z)    z
50       0.00
25       0.67      50       0.67
10       1.28      20       1.28
5        1.64      10       1.64
2.5      1.96      5        1.96
1        2.33      2        2.33
0.5      2.58      1        2.58
0.1      3.09      0.2      3.09
0.05     3.29      0.1      3.29
The table shows the probability P1(z) of a Normal variable with mean 0 and variance 1 being greater than z, and the probability P2(z) of a Normal variable with mean 0 and variance 1 being less than -z or greater than z.
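The percentage points in Table 7.2 can be recovered by inverting Φ. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist` with an `inv_cdf` method; the two helper functions are hypothetical names introduced here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def one_sided_point(percent):
    """Value z with probability percent% of an observation being greater than z."""
    return standard_normal.inv_cdf(1 - percent / 100)

def two_sided_point(percent):
    """Value z with probability percent% of an observation lying below -z or above z."""
    return standard_normal.inv_cdf(1 - percent / 200)

for percent in (5, 1, 0.1):
    print(percent, round(one_sided_point(percent), 2), round(two_sided_point(percent), 2))
```

Each two-sided P% point equals the one-sided (P/2)% point, which is why the z columns of Table 7.2 repeat.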
Fig.7.8.One-andtwo-sidedpercentagepoints(5%)oftheStandardNormaldistribution
SofarwehaveexaminedtheNormaldistributionwithmean0andstandarddeviation1.IfweaddaconstantµtoaStandardNormalvariable,wegetanewvariablewhichhasmeanµ(see§6.6).Figure7.9showstheNormaldistributionwithmean0andthedistributionobtainedbyadding1toittogetherwiththeirtwo-sided5%points.Thecurvesareidenticalapartfromashiftalongtheaxis.
On the curve with mean 0 nearly all the probability is between -3 and +3. For the curve with mean 1 it is between -2 and +4, i.e. between the mean - 3 and the mean + 3. The probability of being a given number of units from the mean is the same for both distributions, as is also shown by the 5% points.
Fig.7.9.Normaldistributionswithdifferentmeansandwithdifferentvariances,showingtwo-sided5%points
IfwetakeaStandardNormalvariable,withstandarddeviation1,andmultiplybyaconstantσwegetanewvariablewhichhasstandarddeviationσ.Figure7.9showstheNormaldistributionwithmean0andstandarddeviation1andthedistributionobtainedbymultiplyingby2.Thecurvesdonotappearidentical.Forthedistributionwithstandarddeviation2,nearlyalltheprobabilityisbetween-6and+6,amuchwiderintervalthanthe-3and+3forthestandarddistribution.Thevalues-6and+6are-3and+3standarddeviations.Wecanseethattheprobabilityofbeingagivennumberofstandarddeviationsfromthemeanisthesameforbothdistributions.Thisisalsoseenfromthe5%points,whichrepresentthemeanplusorminus1.96standarddeviationsineachcase.
In fact if we multiply a Standard Normal variable by σ and then add µ, we get a Normal distribution of mean µ and standard deviation σ. Tables 7.1 and 7.2 apply to it directly, if we denote by z the number of standard deviations above the mean, rather than the numerical value of the variable. Thus, for example, the two-sided 5% points of a Normal distribution with mean 10 and standard deviation 5 are found by 10 - 1.96 × 5 = 0.2 and 10 + 1.96 × 5 = 19.8, the value 1.96 being found from Table 7.2.
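The worked example with mean 10 and standard deviation 5 can be checked as follows. The sketch takes 1.96 from Table 7.2 and uses `statistics.NormalDist` (Python 3.8 or later) only to confirm that the two limits enclose about 95% of the distribution.

```python
from statistics import NormalDist

z = 1.96            # two-sided 5% point from Table 7.2
mu, sigma = 10, 5   # mean and standard deviation from the example

lower = mu - z * sigma
upper = mu + z * sigma

# Probability that a Normal variable with this mean and SD lies between the limits
dist = NormalDist(mu=mu, sigma=sigma)
prob_inside = dist.cdf(upper) - dist.cdf(lower)

print(round(lower, 1), round(upper, 1), round(prob_inside, 3))
```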
This property of the Normal distribution, that multiplying or adding constants still gives a Normal distribution, is not as obvious as it might seem. The Binomial does not have it, for example. Take a Binomial variable with n = 3, possible values 0, 1, 2, and 3, and multiply by 2. The possible values are now 0, 2, 4, and 6. The Binomial distribution with n = 6 has also possible values 1, 3, and 5, so the distributions are different and the one which we have derived is not a member of the Binomial family.
We have seen that adding a constant to a variable from a Normal distribution gives another variable which follows a Normal distribution. If we add two variables from Normal distributions together, even with different means and variances, the sum follows a Normal distribution. The difference between two variables from Normal distributions also follows a Normal distribution.
Fig.7.10.Distributionofheightinasampleof1794pregnantwomen(dataofBrookeetal.1989)
Fig.7.11.Distributionofserumtriglyceride(Table4.8)andlog10triglycerideincordbloodfor282babies,withcorrespondingNormaldistributioncurves
7.4 Variables which follow a Normal distribution
So far we have discussed the Normal distribution as it arises from sampling as the sum or limit of other distributions. However, many naturally occurring variables, such as human height, appear to follow a Normal distribution very closely. We might expect this to happen if the variable were the result of adding variation from a number of different sources. The process shown by the central limit theorem may well produce a result close to Normal. Figure 7.10 shows the distribution of height in a sample of pregnant women, and the corresponding Normal distribution curve. The fit to the Normal distribution is very good.
If the variable we measure is the result of multiplying several different sources of variation we would not expect the result to be Normal from the properties discussed in §7.2, which were all based on addition of variables. However, if we take the log transformation of such a variable (§5A) we would then get a new variable which is the sum of several different sources of variation and which may well have a Normal distribution.
Thisprocessoftenhappenswithquantitieswhicharepartofmetabolicpathways,therateatwhichreactioncantakeplacedependingontheconcentrationsofothercompounds.Manymeasurementsofbloodconstituentsexhibitthis,forexample.Figure7.11showsthedistributionofserumtriglyceridemeasuredincordbloodfor282babies(Table4.8).ThedistributionishighlyskewedandquiteunliketheNormaldistributioncurve.However,whenwetakethelogarithmofthetriglycerideconcentration,wehavearemarkablygoodfittotheNormaldistribution(Fig.7.11).IfthelogarithmofarandomvariablefollowsaNormaldistribution,therandomvariableitselffollowsaLognormaldistribution.
We often want to change the scale on which we analyse our data so as to get a Normal distribution. We call this process of analysing a mathematical function of the data rather than the data themselves transformation. The logarithm is the transformation most often used; the square root and reciprocal are others (see also §10.4). For a single sample, transformation enables us to use the Normal distribution to estimate centiles (§4.5). For example, we often want to estimate the 2.5th and 97.5th centiles, which together enclose 95% of the observations. For a Normal distribution, these can be estimated by x̄ ± 1.96s. We can transform the data so that the distribution is Normal, calculate the centile, and then transform back to the original scale.
Consider the triglyceride data of Figure 7.11 and Table 4.8. The mean is 0.51 and the standard deviation 0.22. The mean for the log10 transformed data is -0.33 and the standard deviation is 0.17. What happens if we transform back by the antilog? For the mean, we get 10^(-0.33) = 0.47. This is less than the mean for the raw data. The antilog of the mean log is not the same as the untransformed arithmetic mean. In fact, this is the geometric mean, which is the nth root of the product of the observations. If we add the logs of the observations we get the log of their product (§5A). If we multiply the log of a number by a second number, we get the log of the first raised to the power of the second. So if we divide the log by n, we get the log of the nth root. Thus the mean of the logs is the log of the geometric mean. On back transformation, the reciprocal transformation also yields a mean with a special name, the harmonic mean, the reciprocal of the mean of the reciprocals.
Thegeometricmeanisintheoriginalunits.Iftriglycerideismeasuredinmmol/litre,thelogofasingleobservationisthelogofameasurementinmmol/litre.Thesumofnlogsisthelogoftheproductofnmeasurementsinmmol/litreandisthelogofameasurementinmmol/litretothenth.Thenthrootisthusagainthelogofanumberinmmol/litreandtheantilogisbackintheoriginalunits,mmol/litre(see§5A).
The antilog of the standard deviation, however, is not measured in the original units. To calculate the standard deviation we take the difference between each log observation and the log geometric mean, using the usual formula Σ(xi - x̄)²/(n - 1) for the variance (§4.8). Thus we have the difference between the logs of two numbers each measured in mmol/litre, giving the log of their ratio (§5A), which is the log of a dimensionless pure number. It would be the same if the triglycerides were measured in mmol/litre or mg/100ml. We cannot transform the standard deviation back to the original scale.
Ifwewanttousethestandarddeviation,itiseasiesttodoallcalculationsonthetransformedscaleandtransformback,ifnecessary,attheend.Forexample,the2.5thcentileonthelogscaleis-0.33-1.96×0.17=-0.66andthe97.5thcentileis-0.33+1.96×0.17=0.00.Togetthesewetookthelogofsomethinginmmol/litreandaddedorsubtractedthelogofapurenumber(i.e.multipliedonthenaturalscale),sowestillhavethelogofsomethinginmmol/litre.Togetbacktotheoriginalscaleweantilogtoget2.5thcentile=0.22and97.5thcentile=1.00mmol/litre.
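The back-transformation can be followed step by step in code. The sketch below starts from the summary figures quoted in the text (mean log10 of -0.33 and standard deviation 0.17) and, as the text does, rounds the centiles on the log scale to two decimal places before taking antilogs. The three values used for the geometric mean check at the end are made up for illustration and are not triglyceride data.

```python
from math import log10, prod

# Summary statistics for log10 triglyceride, as quoted in the text
mean_log = -0.33
sd_log = 0.17

geometric_mean = 10 ** mean_log  # antilog of the mean log

# 2.5th and 97.5th centiles on the log scale, rounded as in the text
lower_log = round(mean_log - 1.96 * sd_log, 2)
upper_log = round(mean_log + 1.96 * sd_log, 2)

# Antilog to get back to mmol/litre
lower = 10 ** lower_log
upper = 10 ** upper_log
print(round(geometric_mean, 2), round(lower, 2), round(upper, 2))

# Hypothetical values: the antilog of the mean of the logs is the nth root of the product
values = [0.2, 0.4, 0.8]
mean_of_logs = sum(log10(v) for v in values) / len(values)
nth_root_of_product = prod(values) ** (1 / len(values))
print(10 ** mean_of_logs, nth_root_of_product)
```

Without the intermediate rounding the upper centile comes out very slightly above 1.00, a reminder that rounding on the log scale is magnified on the original scale.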
Transforming the data to a Normal distribution and then analysing on the transformed scale may look like cheating. I do not think it is. The scale on which we choose to measure things need not be linear, though this is often convenient. Other scales can be much more useful. We measure pH on a logarithmic scale, for example. Should the magnitude of an earthquake be measured in mm of amplitude (linear) or on the Richter scale (logarithmic)? Should spectacle lenses be measured in terms of focal length in cm (linear) or dioptres (reciprocal)? We often choose non-linear scales because they suit our purpose and for statistical analysis it often suits us to make the distribution Normal, by finding a scale of measurement where this is the case.
7.5 The Normal plot
Many statistical methods can only be used if the observations follow a Normal distribution (see Chapters 10 and 11). There are several ways of investigating whether observations follow a Normal distribution. With a large sample we can inspect a histogram to see whether it looks like a Normal distribution curve. This does not work well with a small sample, and a more reliable method is the Normal plot. This is a graphical method, which can be done using ordinary graph paper and a table of the Normal distribution, with specially printed Normal probability paper, or, much more easily, using a computer. Any good general statistical package will give Normal plots; if it does not then it is not a good package. The Normal plot method can be used to investigate the Normal assumption in samples of any size, and is a very useful check when using methods such as the t distribution methods described in Chapter 10.
The Normal plot is a plot of the cumulative frequency distribution for the data against the cumulative frequency distribution for the Normal distribution. First, we order the data from lowest to highest. For each ordered observation we find the expected value of the observation if the data followed a Standard Normal distribution. There are several approximate formulae for this. I shall follow Armitage and Berry (1994) and use for the ith observation z where Φ(z) = (i - 0.5)/n. Some books and programs use Φ(z) = i/(n + 1) and there are other more complex formulae. It does not make much difference which is used. We find from a table of the Normal distribution the values of z which correspond to Φ(z) = 0.5/n, 1.5/n, etc. (Table 7.1 lacks detail for practical work, but will do for illustration.) For 5 points, for example, we have Φ(z) = 0.1, 0.3, 0.5, 0.7, and 0.9, and z = -1.3, -0.5, 0, 0.5, and 1.3. These are the points of the Standard Normal distribution which correspond to the observed data. Now, if the observed data come from a Normal distribution of mean µ and variance σ², the observed point should equal σz + µ, where z is the corresponding point of the Standard Normal distribution. If we plot the Standard Normal points against the observed values we should get something close to a straight line. We can write the equation of this line as σz + µ = x, where x is the observed variable and z the corresponding quantile of the Standard Normal distribution. We can rewrite this as
\[ z = \frac{x}{\sigma} - \frac{\mu}{\sigma} \]
which goes through the point defined by (µ, 0) and has slope 1/σ (see §11.1). If the data are not from a Normal distribution we will not get a straight line, but a curve of some sort. Because we plot the quantiles of the observed frequency distribution against the corresponding quantiles of the theoretical (here Normal) distribution, this is also referred to as a quantile–quantile plot or q–q plot.
Table 7.3. Vitamin D levels measured in the blood of 26 healthy men, data of Hickish et al. (1989)
14 25 30 42 54
17 26 31 43 54
20 26 31 46 63
21 26 32 48 67
22 27 35 52 83
24
Table 7.4. Calculation of the Normal plot for the vitamin D data
i Vit D Φ(z) z i Vit D Φ(z) z
1 14 0.019 -2.07 14 31 0.519 0.05
2 17 0.058 -1.57 15 32 0.558 0.15
3 20 0.096 -1.30 16 35 0.596 0.24
4 21 0.135 -1.10 17 42 0.635 0.34
5 22 0.173 -0.94 18 43 0.673 0.45
6 24 0.212 -0.80 19 46 0.712 0.56
7 25 0.250 -0.67 20 48 0.750 0.67
8 26 0.288 -0.56 21 52 0.788 0.80
9 26 0.327 -0.45 22 54 0.827 0.94
10 26 0.365 -0.34 23 54 0.865 1.10
11 27 0.404 -0.24 24 63 0.904 1.30
12 30 0.442 -0.15 25 67 0.942 1.57
13 31 0.481 -0.05 26 83 0.981 2.07
Φ(z)=(i-0.5)/26
Fig. 7.12. Blood vitamin D levels and log10 vitamin D for 26 normal men, with Normal plots
Table 7.3 shows vitamin D levels measured in the blood of 26 healthy men. The calculation of the Normal plot is shown in Table 7.4. Note that the Φ(z) = (i - 0.5)/26 and z are symmetrical, the second half being the first half with opposite sign. The value of the Standard Normal deviate, z, can be found by interpolation in Table 7.1, by using a fuller table, or by computer. Figure 7.12 shows the histogram and the Normal plot for these data. The distribution is skew and the Normal plot shows a pronounced curve. Figure 7.12 also shows the vitamin D data after log transformation. It is quite easy to produce the Normal plot, as the corresponding Standard Normal deviate, z, is unchanged. We only need to log the observations and plot again. The Normal plot for the transformed data conforms very well to the theoretical line, suggesting that the distribution of log vitamin D level is close to the Normal.
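The calculation in Table 7.4 can be sketched by computer as follows. This is only an illustration, assuming Python's standard library `statistics.NormalDist` for the inverse of Φ; the function name `normal_scores` is made up for this example and is not from any statistical package.

```python
from statistics import NormalDist

# Vitamin D levels from Table 7.3, already in ascending order.
vit_d = [14, 17, 20, 21, 22, 24, 25, 26, 26, 26, 27, 30, 31,
         31, 32, 35, 42, 43, 46, 48, 52, 54, 54, 63, 67, 83]

def normal_scores(n):
    """Standard Normal deviates z with Phi(z) = (i - 0.5)/n for i = 1..n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

x = sorted(vit_d)
z = normal_scores(len(x))

# Print the columns of Table 7.4: i, observation, Phi(z), z.
for i, (xi, zi) in enumerate(zip(x, z), start=1):
    print(f"{i:2d} {xi:3d} {(i - 0.5) / len(x):.3f} {zi:6.2f}")
```

Plotting `z` against `x` (or against the logarithms of `x`) with any graphics program then gives the Normal plot.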
A single bend in the Normal plot indicates skewness. A double curve indicates that both tails of the distribution are different from the Normal, usually being too long, and many curves may indicate that the distribution is bimodal (Figure 7.13). When the sample is small, of course, there will be some random fluctuations.
There are several different ways to display the Normal plot. Some programs plot the data distribution on the vertical axis and the theoretical Normal distribution on the horizontal axis, which reverses the direction of the curve. Some plot the theoretical Normal distribution with mean x̄, the sample mean, and standard deviation s, the sample standard deviation. This is done by calculating x̄ + sz. Figure 7.14(a) shows both these features, the Normal plot drawn by the program Stata's ‘qnorm’ command. The straight line is the line of equality. This plot is identical to the second plot in Figure 7.12, except for the change of scale and switching of the axes. A slight variation is the standardized Normal probability plot or p-p plot, where we standardize the observations to zero mean and standard deviation one, y = (x - x̄)/s, and plot the cumulative Normal probabilities, Φ(y), against (i - 0.5)/n or i/(n + 1) (Figure 7.14(b), produced by the Stata command ‘pnorm’). There is very little difference between Figure 7.14(a) and (b) and the quantile and probability versions of the Normal plot should be interpreted in the same way.
Fig. 7.13. Blood sodium and systolic blood pressure measured in 250 patients in the Intensive Therapy Unit at St. George's Hospital, with Normal plots (data of Freidland et al. 1996)
Fig. 7.14. Variations on the Normal plot for the vitamin D data
Appendices
7A Appendix: Chi-squared, t, and F
Less mathematically inclined readers can skip this section, but those who persevere should find that applications like chi-squared tests (Chapter 13) appear much more logical.
Many probability distributions can be derived for functions of Normal variables which arise in statistical analysis. Three of these are particularly important: the Chi-squared, t and F distributions. These have many applications, some of which we shall discuss in later chapters.
The Chi-squared distribution is defined as follows. Suppose Z is a Standard Normal variable, so having mean 0 and variance 1. Then the variable formed by Z² follows the Chi-squared distribution with 1 degree of freedom. If we have n such independent Standard Normal variables, Z1, Z2, …, Zn, then the variable defined by
$$\chi^2 = Z_1^2 + Z_2^2 + \cdots + Z_n^2$$
is defined to follow the Chi-squared distribution with n degrees of freedom. χ is the Greek letter ‘chi’, pronounced ‘ki’ as in ‘kite’. The distribution curves for several different numbers of degrees of freedom are shown in Figure 7.15. The mathematical description of this curve is rather complicated, but we do not need to go into this.
Some properties of the Chi-squared distribution are easy to deduce. As the distribution is the sum of n independent identically distributed random variables it tends to the Normal as n increases, from the central limit theorem (§7.2). The convergence is slow, however (Figure 7.15), and the square root of chi-squared converges much more quickly. The expected value of Z² is the variance of Z, the expected value of Z being 0, and so E(Z²) = 1. The expected value of chi-squared with n degrees of freedom is thus n:
$$E(\chi^2) = E\left(\sum_{i=1}^{n} Z_i^2\right) = \sum_{i=1}^{n} E(Z_i^2) = \sum_{i=1}^{n} 1 = n$$
The Chi-squared distribution has a very important property. Suppose we restrict our attention to a subset of possible outcomes for the n random variables Z1, Z2, …, Zn. The subset will be defined by those values of Z1, Z2, …, Zn which satisfy the equation a1Z1 + a2Z2 + … + anZn = k, where a1, a2, …, an, and k are constants. (This is called a linear constraint.) Then under this restriction, χ² = ΣZi² follows a Chi-squared distribution with n - 1 degrees of freedom. If there are m such constraints such that none of the equations can be calculated from the others, then we have a Chi-squared distribution with n - m degrees of freedom. This is the source of the name ‘degrees of freedom’.
Fig. 7.15. Some Chi-squared distributions
The proof of this is too complicated to give here, involving such mathematical abstractions as n dimensional spheres, but its implications are very important. First, consider the sum of squares about the population mean µ of a sample of size n from a Normal distribution, divided by σ². Σ(xi - µ)²/σ² will follow a Chi-squared distribution with n degrees of freedom, as the (xi - µ)/σ have mean 0 and variance 1 and they are independent. Now suppose we replace µ by an estimate calculated from the data, x̄. The variables are no longer independent: they must satisfy the relationship Σ(xi - x̄) = 0, and we now have n - 1 degrees of freedom. Hence Σ(xi - x̄)²/σ² follows a Chi-squared distribution with n - 1 degrees of freedom. The sum of squares about the mean of any Normal sample with variance σ² follows the distribution of a Chi-squared variable multiplied by σ². It therefore has expected value (n - 1)σ² and we divide by n - 1 to give the estimate of σ².
Thus, provided the data are from a Normal distribution, not only does the sample mean follow a Normal distribution, but the sample variance is from a Chi-squared distribution times σ²/(n - 1). Because the square root of the Chi-squared distribution converges quite rapidly to the Normal, the distribution of the sample standard deviation is approximately Normal for n > 20, provided the data themselves are from a Normal distribution. Another important property of the variances of Normal samples is that, if we take many random samples from the same population, the sample variance and sample mean are independent if, and only if, the data are from a Normal distribution.
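The claim that the sum of squares about the sample mean has expected value (n - 1)σ² can be checked informally by simulation. The sketch below is an illustration only: the sample size, mean, standard deviation, number of repetitions and seed are arbitrary choices made for this example, not values from the text.

```python
import random
from statistics import fmean

def sum_of_squares_about_mean(sample):
    """Sum of squared deviations of the observations from their own mean."""
    m = fmean(sample)
    return sum((x - m) ** 2 for x in sample)

def simulate_sums_of_squares(n, mu, sigma, reps, seed=1):
    """Draw `reps` Normal samples of size n and return their sums of squares."""
    rng = random.Random(seed)
    return [
        sum_of_squares_about_mean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# Arbitrary illustrative settings (assumptions, not from the text).
n, mu, sigma = 5, 10.0, 2.0
ss = simulate_sums_of_squares(n, mu, sigma, reps=20000)
mean_ss = fmean(ss)

print("average sum of squares:", round(mean_ss, 2))
print("(n - 1) * sigma^2     :", (n - 1) * sigma ** 2)
print("n * sigma^2           :", n * sigma ** 2)
```

The average should come out near (n - 1)σ² = 16 rather than nσ² = 20, which is why we divide by n - 1 when estimating the variance.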
The F distribution with m and n degrees of freedom is the distribution of (χ²_m/m)/(χ²_n/n), the ratio of two independent χ² variables each divided by its degrees of freedom. This distribution is used for comparing variances. If we have two independent estimates of the same variance calculated from Normal data, the variance ratio will follow the F distribution. We can use this for comparing two estimates of variance (§10.8), but its main uses are in comparing groups of means (§10.9) and in examining the effects of several factors together (§17.2).
7M Multiple choice questions 32 to 37
(Each branch is either true or false)
32. The Normal distribution:
(a) is also called the Gaussian distribution;
(b) is followed by many variables;
(c) is a family of distributions with two parameters;
(d) is followed by all measurements made in healthy people;
(e) is the distribution towards which the Poisson distribution tends as its mean increases.
33. The Standard Normal distribution:
(a) is skew to the left;
(b) has mean = 1.0;
(c) has standard deviation = 0.0;
(d) has variance = 1.0;
(e) has the median equal to the mean.
34. The PEFRs of a group of 11-year-old girls follow a Normal distribution with mean 300 litre/min and a standard deviation 20 litre/min:
(a) about 95% of the girls have PEFR between 260 and 340 litre/min;
(b) 50% of the girls have PEFR above 300 litre/min;
(c) the girls have healthy lungs;
(d) about 5% of girls have PEFR below 260 litre/min;
(e) all the PEFRs must be less than 340 litre/min.
35. The mean of a large sample:
(a) is always greater than the median;
(b) is calculated from the formula Σxi/n;
(c) is from an approximately Normal distribution;
(d) increases as the sample size increases;
(e) is always greater than the standard deviation.
36. If X and Y are independent variables which follow Standard Normal distributions, a Normal distribution is also followed by:
(a) 5X;
(b) X²;
(c) X + 5;
(d) X - Y;
(e) X/Y.
37. When a Normal plot is drawn with the Standard Normal deviate on the y axis:
(a) a straight line indicates that observations are from a Normal Distribution;
(b) a curve with decreasing slope indicates positive skewness;
(c) an ‘S’ shaped curve (or ogive) indicates long tails;
(d) a vertical line will occur if all observations are equal;
(e) if there is a straight line its slope depends on the standard deviation.
7E Exercise: A Normal plot
In this exercise we shall return to the blood glucose data of §4E and try to decide how well they conform to a Normal distribution.
1. From the box and whisker plot and the histogram found in exercise §4E (if you have not tried exercise §4E see the solution in Chapter 19), do the blood glucose levels look like a Normal distribution?
2. Construct a Normal plot for the data. This is quite easy as they are ordered already. Find (i - 0.5)/n for i = 1 to 40 and obtain the corresponding cumulative Normal probabilities from Table 7.1. Now plot these probabilities against the corresponding blood glucose.
3. Does the plot appear to give a straight line? Do the data follow a Normal distribution?
8
Estimation
8.1 Sampling distributions
We have seen in Chapter 3 how samples are drawn from much larger populations. Data are collected about the sample so that we can find out something about the population. We use samples to estimate quantities such as disease prevalence, mean blood pressure, mean exposure to a carcinogen, etc. We also want to know by how much these estimates might vary from sample to sample.
In Chapters 6 and 7 we saw how the theory of probability enables us to link random samples with the populations from which they are drawn. In this chapter we shall see how probability theory enables us to use samples to estimate quantities in populations, and to determine the precision of these estimates. First we shall consider what happens when we draw repeat samples from the same population. Table 8.1 shows a set of 100 random digits which we can use as the population for a sampling experiment. The distribution of the numbers in this population is shown in Figure 8.1. The population mean is 4.7 and the standard deviation is 2.9.
The sampling experiment is done by using a suitable random sampling method to draw repeated samples from the population. In this case decimal dice were a convenient method. A sample of size four was chosen: 6, 4, 6 and 1. The mean was calculated: 17/4 = 4.25. This was repeated to draw a second sample of 4 numbers: 7, 8, 1, 8. Their mean is 6.00. This sampling procedure was done 20 times altogether, to give the samples and their means shown in Table 8.2.
These sample means are not all the same. They show random variation. If we were able to draw all of the 3 921 225 possible samples of size 4 and calculate their means, these means themselves would form a distribution. Our 20 sample means are themselves a sample from this distribution. The distribution of all possible sample means is called the sampling distribution of the mean. In general, the sampling distribution of any statistic is the distribution of the values of the statistic which would arise from all possible samples.
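A sampling experiment of this kind is easy to imitate by computer. The sketch below is hypothetical in its details: it generates its own population of 100 random digits rather than using Table 8.1, and it draws 2000 samples of size 4 rather than 20, so the numbers it prints will not match the tables. It also confirms the count of possible samples of size 4 from 100 items.

```python
import math
import random
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed so that the run is repeatable

# Hypothetical population of 100 random digits (a stand-in for Table 8.1).
population = [rng.randrange(10) for _ in range(100)]

# Number of distinct samples of size 4 that can be chosen from 100 items.
n_possible = math.comb(100, 4)

# Draw many random samples of size 4 and record their means.
sample_means = [fmean(rng.sample(population, 4)) for _ in range(2000)]

pop_mean, pop_sd = fmean(population), pstdev(population)
means_mean, means_sd = fmean(sample_means), pstdev(sample_means)

print("possible samples of size 4:", n_possible)
print(f"population:   mean {pop_mean:.2f}, SD {pop_sd:.2f}")
print(f"sample means: mean {means_mean:.2f}, SD {means_sd:.2f}")
```

The sample means should centre on the population mean but be much less scattered than the individual digits.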
Table 8.1. Population of 100 random digits for a sampling experiment
9 1 0 7 5 6 9 5 8 8 1 0 5 7
1 8 8 8 5 2 4 8 3 1 6 5 5 7
2 8 1 8 5 8 4 0 1 9 2 1 6 9
1 9 7 9 7 2 7 7 0 8 1 6 3 8
7 0 2 8 8 7 2 5 4 1 8 6 8 3
Fig. 8.1. Distribution of the population of Table 8.1
8.2 Standard error of a sample mean
For the moment we shall consider the sampling distribution of the mean only. As our sample of 20 means is a random sample from it, we can use this to estimate some of the parameters of the distribution. The twenty means have their own mean and standard deviation. The mean is 5.1 and the standard deviation is 1.1. Now the mean of the whole population is 4.7, which is close to the mean of the samples. But the standard deviation of the population is 2.9, which is considerably greater than that of the sample means. If we plot a histogram for the sample of means (Figure 8.2) we see that the centre of the sampling distribution and the parent population distribution are the same, but the scatter of the sampling distribution is much less.
Table 8.2. Random samples drawn in a sampling experiment
Sample 6 7 7 1 5 5 4
4 8 9 8 2 5 2
6 1 2 8 9 7 7
1 8 7 4 5 8 6
Mean 4.25 6.00 6.25 5.25 5.25 6.25 4.75
Sample 7 7 2 8 3 4 5
8 3 5 0 7 8 5
7 8 0 7 4 7 8
2 7 8 7 8 7 3
Mean 6.00 6.25 3.75 5.50 5.50 6.50 5.25
Fig. 8.2. Distribution of the population of Table 8.1 and of the sample of the means of Table 8.2
The sample mean is an estimate of the population mean. The standard deviation of its sampling distribution is called the standard error of the estimate. It provides a measure of how far from the true value the estimate is likely to be. In most estimation, the estimate is likely to be within one standard error of the true mean and unlikely to be more than two standard errors from it. We shall look at this more precisely in §8.3.
In almost all practical situations we do not know the true value of the population variance σ² but only its estimate s² (§4.7). We can use this to estimate the standard error by s/√n. This estimate is also referred to as the standard error of the mean. It is usually clear from the context whether the standard error is the true value or that estimated from the data.
When the sample size n is large, the sampling distribution of x̄ tends to a Normal distribution. Also, we can assume that s² is a good estimate of σ². So for large n, x̄ is, in effect, an observation from a Normal distribution with mean µ and standard deviation estimated by s/√n. So with probability 0.95, x̄ is within two, or more precisely within 1.96, standard errors of µ. With small samples we cannot assume either a Normal distribution or, more importantly, that s² is a good estimate of σ². We shall discuss this in Chapter 10.
Fig. 8.3. Samples of means from a Standard Normal variable
The mean and standard error are often written as 4.062 ± 0.089. This is rather misleading, as the true value may be up to two standard errors from the mean with a reasonable probability. This practice is not recommended.
There is often confusion between the terms ‘standard error’ and ‘standard deviation’. This is understandable, as the standard error is a standard deviation (of the sampling distribution) and the terms are often interchanged in this context. The convention is this: we use the term ‘standard error’ when we measure the precision of estimates, and the term ‘standard deviation’ when we are concerned with the variability of samples, populations or distributions. If we want to say how good our estimate of the mean FEV1 measurement is, we quote the standard error of the mean. If we want to say how widely scattered the FEV1 measurements are, we quote the standard deviation, s.
8.3 Confidence intervals
The estimate of mean FEV1 is a single value and so is called a point estimate. There is no reason to suppose that the population mean will be exactly equal to the point estimate, the sample mean. It is likely to be close to it, however, and the amount by which it is likely to differ from the estimate can be found from the standard error. What we do is find limits which are likely to include the population mean, and say that we estimate the population mean to lie somewhere in the interval (the set of all possible values) between these limits. This is called an interval estimate.
Fig. 8.4. Sampling distribution of the mean of 4 observations from a Standard Normal distribution
For instance, if we regard the 57 FEV1 measurements as being a large sample we can assume that the sampling distribution of the mean is Normal, and that the standard error is a good estimate of its standard deviation (see §10.6 for a discussion of how large is large). We therefore expect about 95% of such means to be within 1.96 standard errors of the population mean, µ. Hence, for about 95% of all possible samples, the population mean must be greater than the sample mean minus 1.96 standard errors and less than the sample mean plus 1.96 standard errors. If we calculated x̄ - 1.96se and x̄ + 1.96se for all possible samples, 95% of such intervals would contain the population mean. In this case these limits are 4.062 - 1.96 × 0.089 to 4.062 + 1.96 × 0.089, which gives 3.89 to 4.24, or 3.9 to 4.2 litres, rounding to two significant figures; 3.9 and 4.2 are called the 95% confidence limits for the estimate, and the set of values between 3.9 and 4.2 is called the 95% confidence interval. The confidence limits are the values at the ends of the confidence interval.
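The arithmetic for the FEV1 interval can be reproduced with a few lines of code. This is a minimal sketch, assuming Python's `statistics.NormalDist` to supply the Standard Normal multiplier; `normal_ci` is a name invented for this illustration.

```python
from statistics import NormalDist

def normal_ci(estimate, se, level=0.95):
    """Large sample confidence limits: estimate -/+ z * standard error."""
    # z is the two-sided Standard Normal point, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * se, estimate + z * se

mean_fev1, se_fev1 = 4.062, 0.089  # values quoted in the text

lo95, hi95 = normal_ci(mean_fev1, se_fev1)
lo99, hi99 = normal_ci(mean_fev1, se_fev1, level=0.99)

print(f"95% CI: {lo95:.2f} to {hi95:.2f} litres")
print(f"99% CI: {lo99:.1f} to {hi99:.1f} litres")
```

Changing `level` gives limits for other degrees of confidence; a higher level uses a larger multiplier and so gives a wider interval.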
Strictly speaking, it is incorrect to say that there is a probability of 0.95 that the population mean lies between 3.9 and 4.2, though it is often put that way (even by me). The population mean is a number, not a random variable, and has no probability. It is the probability that limits calculated from a random sample will include the population value which is 95%. Figure 8.5 shows confidence intervals for the mean for 20 random samples of 100 observations from the Standard Normal distribution. The population mean is, of course, 0.0, shown by the horizontal line. Some sample means are close to 0.0, some further away, some above and some below. The population mean is contained by 19 of the 20 confidence intervals. In general, for 95% of confidence intervals it will be true to say that the population value lies within the interval. We just don't know which 95%. We express this by saying that we are 95% confident that the mean lies between these limits.
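The idea that it is the interval, not the population mean, which varies can be illustrated by repeating the experiment of Figure 8.5 many times. The following sketch is an assumption-laden illustration: it uses 2000 repetitions and an arbitrary seed, where the figure shows only 20 samples.

```python
import random
from statistics import fmean, stdev

def ci95(sample):
    """Large sample 95% confidence limits for the mean of `sample`."""
    m = fmean(sample)
    se = stdev(sample) / len(sample) ** 0.5  # estimated standard error s/sqrt(n)
    return m - 1.96 * se, m + 1.96 * se

def coverage(reps, n, seed=2):
    """Proportion of intervals from Standard Normal samples that contain 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        lo, hi = ci95([rng.gauss(0.0, 1.0) for _ in range(n)])
        hits += lo <= 0.0 <= hi  # True counts as 1
    return hits / reps

cov = coverage(reps=2000, n=100)
print(f"proportion of intervals containing the population mean: {cov:.3f}")
```

The proportion should be close to 0.95, although for any single interval we cannot tell whether it is one of those that include the population mean.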
Fig. 8.5. Mean and 95% confidence interval for 20 random samples of 100 observations from the Standard Normal distribution
In the FEV1 example, the sampling distribution of the mean is Normal and its standard deviation is well estimated because the sample is large. This is not always true and although it is usually possible to calculate confidence intervals for an estimate they are not all quite as simple as that for the mean estimated from a large sample. We shall look at the mean estimated from a small sample in §10.2.
There is no necessity for the confidence interval to have a probability of 95%. For example, we can also calculate 99% confidence limits. The upper 0.5% point of the Standard Normal distribution is 2.58 (Table 7.2), so the probability of a Standard Normal deviate being above 2.58 or below -2.58 is 1% and the probability of being within these limits is 99%. The 99% confidence limits for the mean FEV1 are therefore 4.062 - 2.58 × 0.089 and 4.062 + 2.58 × 0.089, i.e. 3.8 and 4.3 litres. These give a wider interval than the 95% limits, as we would expect since we are more confident that the mean will be included. The probability we choose for a confidence interval is thus a compromise between the desire to include the estimated population value and the desire to avoid parts of scale where there is a low probability that the mean will be found. For most purposes, 95% confidence intervals have been found to be satisfactory.
Standard error is not the only way in which we can calculate confidence intervals, although at present it is the one used for most problems. In §8.8 I describe a different approach based on the exact probabilities of a distribution, which requires no large sample assumption. In §8.9 I describe a large sample method which uses the Binomial distribution directly. There are others, which I shall omit because they are rarely used.
8.4 Standard error and confidence interval for a proportion
The standard error of a proportion estimate can be calculated in the same way. Suppose the proportion of individuals who have a particular condition in a given population is p, and we take a random sample of size n, the number observed with the condition being r. Then the estimated proportion is r/n. We have seen (§6.4) that r comes from a Binomial distribution with mean np and variance np(1 - p). Provided n is large, this distribution is approximately Normal. So r/n, the estimated proportion, is Normally distributed with mean given by np/n = p, and variance given by
$$\mathrm{VAR}\left(\frac{r}{n}\right) = \frac{\mathrm{VAR}(r)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$$
since n is constant, and the standard error is
$$\sqrt{\frac{p(1-p)}{n}}$$
We can estimate this by replacing p by r/n.
The standard error of the proportion is only of use if the sample is large enough for the Normal approximation to apply. A rough guide to this is that np and n(1 - p) should both exceed 5. This is usually the case when we are concerned with straightforward estimation. If we try to use the method for smaller samples, we may get absurd results. For example, in a study of the prevalence of HIV in ex-prisoners (Turnbull et al. 1992), of 29 women who did not inject drugs one was HIV positive. The authors reported this to be 3.4%, with a 95% confidence interval -3.1% to 9.9%. The lower limit of -3.1%, obtained from the observed proportion minus 1.96 standard errors, is impossible. As Newcombe (1992) pointed out, the correct 95% confidence interval can be obtained from the exact probabilities of the Binomial distribution and is 0.1% to 17.8% (§8.8).
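The way the standard error method breaks down for one positive out of 29 can be seen directly. The function below is a sketch written for this illustration (`proportion_ci` is not a library routine). Computed without intermediate rounding, the limits come out close to, though not exactly equal to, the published -3.1% to 9.9%; the small difference presumably reflects rounding in the original calculation.

```python
from math import sqrt

def proportion_ci(r, n, z=1.96):
    """Estimated proportion, its standard error, and Normal approximation limits."""
    p = r / n
    se = sqrt(p * (1 - p) / n)  # standard error with p replaced by r/n
    return p, se, p - z * se, p + z * se

p, se, lo, hi = proportion_ci(1, 29)

print(f"proportion {p:.3f}, standard error {se:.4f}")
print(f"approximate 95% CI: {100 * lo:.1f}% to {100 * hi:.1f}%")
print("np =", round(29 * p, 1), "which is well below the rough guide of 5")
```

The negative lower limit is the warning sign that the Normal approximation should not have been used here.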
8.5 The difference between two means
In many studies we are more interested in the difference between two parameters than in their absolute value. These could be means, proportions, the slopes of lines, and many other statistics. When samples are large we can assume that sample means and proportions are observations from a Normal distribution, and that the calculated standard errors are good estimates of the standard deviations of these Normal distributions. We can use this to find confidence intervals.
For example, suppose we wish to compare the means, x̄1 and x̄2, of two large samples, sizes n1 and n2. The expected difference between the sample means is equal to the difference between the population means, i.e. E(x̄1 - x̄2) = µ1 - µ2. What is the standard error of the difference? The variance of the difference between two independent random variables is the sum of their variances (§6.6). Hence, the standard error of the difference between two independent estimates is the square root of the sum of the squares of their standard errors. The standard error of a mean is √(s²/n), so the standard error of the difference between two independent means is
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
For an example, in a study of respiratory symptoms in schoolchildren (Bland et al. 1974), we wanted to know whether children reported by their parents to have respiratory symptoms had worse lung function than children who were not reported to have symptoms. Ninety-two children were reported to have cough during the day or at night, and their mean PEFR was 294.8 litre/min with standard deviation 57.1 litre/min, and 1643 children were not reported to have this symptom, their mean PEFR being 313.6 litre/min with standard deviation 55.2 litre/min. We thus have two large samples, and can apply the Normal distribution. We have
n1 = 92, x̄1 = 294.8, s1 = 57.1, n2 = 1643, x̄2 = 313.6, s2 = 55.2
The difference between the two groups is x̄1 - x̄2 = 294.8 - 313.6 = -18.8. The standard error of the difference is
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{57.1^2}{92} + \frac{55.2^2}{1643}} = 6.11$$
We shall treat the sample as being large, so the difference between the means can be assumed to come from a Normal distribution and the estimated standard error to be a good estimate of the standard deviation of this distribution. (For small samples see §10.3 and §10.6.) The 95% confidence limits for the difference are thus -18.8 - 1.96 × 6.11 and -18.8 + 1.96 × 6.11, i.e. -30.8 and -6.8 litre/min. The confidence interval does not include zero, so we have good evidence that, in this population, children reported to have day or night cough have lower mean PEFR than others. The difference is estimated to be between 7 and 31 litre/min lower in children with the symptom, so it may be quite small.
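The PEFR comparison can be worked through in code as a check on the arithmetic. This is a sketch only; `diff_means_ci` is a hypothetical helper, and the fixed multiplier 1.96 is the large sample value used in the text.

```python
from math import sqrt

def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Difference between two independent means with large sample limits."""
    diff = mean1 - mean2
    # Square root of the sum of the squared standard errors of the two means.
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff, se, diff - z * se, diff + z * se

# Children with cough (group 1) and without cough (group 2), from the text.
diff, se, lo, hi = diff_means_ci(294.8, 57.1, 92, 313.6, 55.2, 1643)

print(f"difference {diff:.1f} litre/min, standard error {se:.2f}")
print(f"95% CI: {lo:.1f} to {hi:.1f} litre/min")
```

Most of the standard error comes from the smaller group, since 57.1²/92 is much larger than 55.2²/1643.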
When we have paired data, such as a cross-over trial (§2.6) or a matched case-control study (§3.8), the two-sample method does not work. Instead, we calculate the differences between the paired observations for each subject, then find the mean difference, its standard error and confidence interval as in §8.3.
Table 8.3. Cough during the day or at night at age 14 and bronchitis before age 5 (Holland et al. 1978)
Cough at 14    Bronchitis at 5      Total
               Yes       No
Yes             26        44          70
No             247      1002        1249
Total          273      1046        1319
8.6 Comparison of two proportions
Provided the conditions of Normal approximation are met (see §8.4) we can find a confidence interval for the difference in the usual way.
For example, consider Table 8.3. The researchers wanted to know to what extent children with bronchitis in infancy get more respiratory symptoms in later life than others. We can estimate the difference between the proportions reported to cough during the day or at night among children with and children without a history of bronchitis before age 5 years. We have estimates of two proportions, p1 = 26/273 = 0.09524 and p2 = 44/1046 = 0.04207. The difference between them is p1 - p2 = 0.09524 - 0.04207 = 0.05317. The standard error of the difference is
$$\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} = \sqrt{\frac{0.09524 \times (1 - 0.09524)}{273} + \frac{0.04207 \times (1 - 0.04207)}{1046}} = 0.0188$$
The 95% confidence interval for the difference is 0.05317 - 1.96 × 0.0188 to 0.05317 + 1.96 × 0.0188 = 0.016 to 0.090. Although the difference is not very precisely estimated, the confidence interval does not include zero and gives us clear evidence that children with bronchitis reported in infancy are more likely than others to be reported to have respiratory symptoms in later life. The data on lung function in §8.5 give us some reason to suppose that this is not entirely due to response bias (§3.9). As in §8.4, the confidence interval must be estimated differently for small samples.
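The calculation for Table 8.3 can be sketched in code. As before this is an illustration, with `diff_proportions_ci` a name chosen for the example; it uses the unrounded proportions, which give the same results to the precision quoted in the text.

```python
from math import sqrt

def diff_proportions_ci(r1, n1, r2, n2, z=1.96):
    """Difference between two independent proportions, large sample limits."""
    p1, p2 = r1 / n1, r2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, diff - z * se, diff + z * se

# Cough at 14: 26 of 273 with bronchitis before 5, 44 of 1046 without.
diff, se, lo, hi = diff_proportions_ci(26, 273, 44, 1046)

print(f"difference {diff:.5f}, standard error {se:.4f}")
print(f"95% CI: {lo:.3f} to {hi:.3f}")
```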
This difference in proportions may not be very easy to interpret. The ratio of two proportions is often more useful. Another method, the odds ratio, is described in §13.7. The ratio of the proportion with cough at age 14 for bronchitis before 5 to the proportion with cough at age 14 for those without bronchitis before 5 is p1/p2 = 0.09524/0.04207 = 2.26. Children with bronchitis before 5 are more than twice as likely to cough during the day or at night at age 14 than children with no such history.
The standard error for this ratio is complex, and as it is a ratio rather than a difference it does not approximate well to a Normal distribution. If we take the logarithm of the ratio, however, we get the difference between two logarithms, because log(p1/p2) = log(p1) - log(p2) (§5A). We can find the standard error for the log ratio quite easily. We use the result that, for any random variable X with mean µ and variance σ², the approximate variance of log(X) is given by VAR(loge(X)) = σ²/µ² (see Kendall and Stuart 1969). Hence, the variance of log(p) is
$$\mathrm{VAR}(\log_e(p)) = \frac{p(1-p)/n}{p^2} = \frac{1-p}{np}$$
For the difference between the two logarithms we get
$$\mathrm{VAR}(\log_e(p_1/p_2)) = \mathrm{VAR}(\log_e(p_1)) + \mathrm{VAR}(\log_e(p_2)) = \frac{1-p_1}{n_1 p_1} + \frac{1-p_2}{n_2 p_2}$$
The standard error is the square root of this. (This formula is often written in terms of frequencies, but I think this version is clearer.) For the example the log ratio is loge(2.26385) = 0.81707 and the standard error is
$$\sqrt{\frac{1 - 0.09524}{273 \times 0.09524} + \frac{1 - 0.04207}{1046 \times 0.04207}} = 0.23784$$
The 95% confidence interval for the log ratio is therefore 0.81707 - 1.96 × 0.23784 to 0.81707 + 1.96 × 0.23784 = 0.35089 to 1.28324. The 95% confidence interval for the ratio of proportions itself is the antilog of this: e^0.35089 to e^1.28324 = 1.42 to 3.61. Thus we estimate that the proportion of children reported to cough during the day or at night among those with a history of bronchitis is between 1.4 and 3.6 times the proportion among those without a history of bronchitis.
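The log ratio method can be followed step by step in code. This is a sketch under the same large sample assumptions as the text, with `ratio_proportions_ci` a name invented here. Because it works with unrounded proportions, the log ratio differs from the text's 0.81707 in the fourth decimal place, but the final interval agrees to the two decimal places quoted.

```python
from math import exp, log, sqrt

def ratio_proportions_ci(r1, n1, r2, n2, z=1.96):
    """Ratio of two independent proportions, with limits found on the log scale."""
    p1, p2 = r1 / n1, r2 / n2
    log_ratio = log(p1 / p2)
    # Approximate variance of log(p) is (1 - p)/(n p); variances add.
    se_log = sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
    lo = exp(log_ratio - z * se_log)  # antilog of the lower log limit
    hi = exp(log_ratio + z * se_log)  # antilog of the upper log limit
    return p1 / p2, se_log, lo, hi

ratio, se_log, lo, hi = ratio_proportions_ci(26, 273, 44, 1046)

print(f"ratio {ratio:.2f}, standard error of log ratio {se_log:.4f}")
print(f"95% CI for the ratio: {lo:.2f} to {hi:.2f}")
```

Note that the interval is not symmetrical about the ratio itself, only about its logarithm.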
The proportion of individuals in a population who develop a disease or symptom is equal to the probability that any given individual will develop the disease, called the risk of an individual developing a disease. Thus in Table 8.3 the risk that a child with bronchitis before age 5 will cough at age 14 is 26/273 = 0.09524, and the risk for a child without bronchitis before age 5 is 44/1046 = 0.04207. To compare risks for people with and without a particular risk factor, we look at the ratio of the risk with the factor to the risk without the factor, the relative risk. The relative risk of cough at age 14 for bronchitis before 5 is thus 2.26. To estimate the relative risk directly, we need a cohort study (§3.7) as in Table 8.3. We estimate relative risk for a case-control study in a different way (§13.7).
In the unusual situation when the samples are paired, either matched or two observations on the same subject, we use a different method (§13.9).
8.7* Standard error of a sample standard deviation
8.8* Confidence interval for a proportion when numbers are small
In §8.4 I mentioned that the standard error method for a proportion does not work when the sample is small. Instead, the confidence interval can be found using the exact probabilities of the Binomial distribution. The method works like this. Given n, we find the value pL for the parameter p of the Binomial distribution which gives a probability 0.025 of getting an observed number of successes, r, as big as or bigger than the value observed. We do this by calculating the probabilities from the formula in §6.4, iterating round different possible values of p until we get the right one. We also find the value pU for the parameter p of the Binomial distribution which gives a probability 0.025 of getting an observed number of successes as small as or smaller than the value observed. The exact 95% confidence interval is pL to pU. For example, suppose we observe 3 successes out of 10 trials. The Binomial distribution with n = 10 which has the total probability for 3 or more successes equal to 0.025 has parameter p = 0.067. The distribution which has the total probability for 3 or fewer successes equal to 0.025 has p = 0.652. Hence the 95% confidence interval for the proportion in the population is 0.067 to 0.652. Figure 8.6 shows the two distributions. No large sample approximation is required and we can use this for any size of sample. Pearson and Hartley (1970) give a table for calculating exact Binomial confidence intervals. Even better, you can download a free program from my website (§1.3).
Fig. 8.6. Distributions showing the calculation of the exact confidence interval for three successes out of ten trials.
Unless the observed proportion is zero or one, these values are never included in the exact confidence interval. The population proportion of successes cannot be zero if we have observed a success in the sample. It cannot be one if we have observed a failure.
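The iteration described in this section can be sketched as a simple bisection search. This is one possible way of doing the search, not necessarily the one used by the program mentioned above; the function names are invented for the illustration and the code uses only the Binomial probabilities of §6.4.

```python
from math import comb

def binomial_prob(k, n, p):
    """Probability of exactly k successes in n trials with parameter p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_at_least(r, n, p):
    return sum(binomial_prob(k, n, p) for k in range(r, n + 1))

def prob_at_most(r, n, p):
    return sum(binomial_prob(k, n, p) for k in range(0, r + 1))

def bisect_root(f, lo=0.0, hi=1.0, iterations=100):
    """Root of f between lo and hi, assuming f changes sign exactly once."""
    f_lo_positive = f(lo) > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_binomial_ci(r, n, tail=0.025):
    """Exact limits pL and pU for r successes out of n trials."""
    # pL: P(r or more successes) = tail; taken as 0 if no successes were seen.
    p_lower = 0.0 if r == 0 else bisect_root(lambda p: prob_at_least(r, n, p) - tail)
    # pU: P(r or fewer successes) = tail; taken as 1 if no failures were seen.
    p_upper = 1.0 if r == n else bisect_root(lambda p: prob_at_most(r, n, p) - tail)
    return p_lower, p_upper

p_lower, p_upper = exact_binomial_ci(3, 10)
print(f"3 successes out of 10: {p_lower:.3f} to {p_upper:.3f}")

hiv_lower, hiv_upper = exact_binomial_ci(1, 29)
print(f"1 out of 29: {100 * hiv_lower:.1f}% to {100 * hiv_upper:.1f}%")
```

The second call applies the same search to the one HIV positive woman out of 29 of §8.4, where the standard error method gave an impossible negative limit.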
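8.9* Confidence interval for a median and other quantiles
For a sample of n observations, the number of observations less than the q quantile is an observation from a Binomial distribution with parameters n and q, and hence has mean nq and standard deviation √(nq(1 - q)). For a large sample 95% confidence interval we calculate j and k such that
$$j = nq - 1.96\sqrt{nq(1-q)}, \qquad k = nq + 1.96\sqrt{nq(1-q)}$$
We round j and k up to the next integer. Then the 95% confidence interval is between the jth and the kth observations in the ordered data. For the 57 FEV1 measurements of Table 4.4, the median was 4.1 litres (§4.5). For the 95% confidence interval for the median, n = 57 and q = 0.5, and
$$j = 57 \times 0.5 - 1.96\sqrt{57 \times 0.5 \times 0.5} = 21.1, \qquad k = 57 \times 0.5 + 1.96\sqrt{57 \times 0.5 \times 0.5} = 35.9$$
The 95% confidence interval is thus from the 22nd to the 36th observation, 3.75 to 4.30 litres from Table 4.4. Compare this to the 95% confidence interval for the mean, 3.9 to 4.2 litres, which is completely included in the interval for the median. This method of estimating percentiles is relatively imprecise. Another example is given in §15.5.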
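The ranks of the observations which bound the interval can be found with a short function. This is a sketch assuming the large sample Binomial approximation, with limits nq ∓ 1.96√(nq(1 - q)) rounded up to the next integer; `quantile_ci_ranks` is a hypothetical name, and the guard that keeps the ranks between 1 and n is an addition made for this example.

```python
from math import ceil, sqrt

def quantile_ci_ranks(n, q=0.5, z=1.96):
    """Ranks j and k of the ordered observations bounding the interval."""
    centre = n * q
    half_width = z * sqrt(n * q * (1 - q))
    j = ceil(centre - half_width)  # round up to the next integer
    k = ceil(centre + half_width)
    # Keep the ranks inside the data (an added safeguard for small n).
    return max(j, 1), min(k, n)

j, k = quantile_ci_ranks(57, q=0.5)
print(f"95% CI for the median: observations {j} to {k} of the ordered data")
```

The limits themselves are then read off as the jth and kth values of the sorted data, for example `sorted(data)[j - 1]` and `sorted(data)[k - 1]` for a list `data` of the observations.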
8.10 What is the correct confidence interval?
Confidence intervals only estimate errors due to sampling. They do not allow for any bias in the sample and give us an estimate for the population of which our data can be considered a random sample. As discussed in §3.5, it is often not clear what this population is, and we rely far more on the estimation of differences than absolute values. This is particularly true in clinical trials. We start with patients in one locality, exclude some, allow refusals, and the patients cannot be regarded as a random sample of patients in general. However, we then randomize into two groups which are then two samples from the same population, and only the treatment differs between them. Thus the difference is the thing we want the confidence interval for, not for either group separately. Yet researchers often ignore the direct comparison in favour of estimation using each group separately.
For example, Salvesen et al. (1992) reported follow-up of two randomized controlled trials of routine ultrasonography screening during pregnancy. At ages 8 to 9 years, children of women who had taken part in these trials were followed up. A subgroup of children underwent specific tests for dyslexia. The test results classified 21 of the 309 screened children (7%, 95% confidence interval 3–10%) and 26 of the 294 controls (9%, 95% confidence interval 4–12%) as dyslexic. Much more useful would be a confidence interval for the difference between prevalences (-6.3 to 2.2 percentage points) or their ratio (0.44 to 1.34), because we could then compare the groups directly.
8M Multiple choice questions 38 to 43
(Each branch is either true or false)
38. The standard error of the mean of a sample:
(a) measures the variability of the observations;
(b) is the accuracy with which each observation is measured;
(c) is a measure of how far the sample mean is likely to be from the population mean;
(d) is proportional to the number of observations;
(e) is greater than the estimated standard deviation of the population.
39. The 95% confidence limits for the mean estimated from a set of observations:
(a) are limits between which, in the long run, 95% of observations fall;
(b) are a way of measuring the precision of the estimate of the mean;
(c) are limits within which the sample mean falls with probability 0.95;
(d) are limits which would include the population mean for 95% of possible samples;
(e) are a way of measuring the variability of a set of observations.
40. If the size of a random sample were increased, we would expect:
(a) the mean to decrease;
(b) the standard error of the mean to decrease;
(c) the standard deviation to decrease;
(d) the sample variance to increase;
(e) the degrees of freedom for the estimated variance to increase.
41. The prevalence of a condition in a population is 0.1. If the prevalence is estimated repeatedly from samples of size 100, these estimates will form a distribution which:
(a) is a sampling distribution;
(b) is approximately Normal;
(c) has mean = 0.1;
(d) has variance = 9;
(e) is Binomial.
42. It is necessary to estimate the mean FEV1 by drawing a sample from a large population. The accuracy of the estimate will depend on:
(a) the mean FEV1 in the population;
(b) the number in the population;
(c) the number in the sample;
(d) the way the sample is selected;
(e) the variance of FEV1 in the population.
43. In a study of 88 births to women with a history of thrombocytopenia (Samuels et al. 1990), the same condition was recorded in 20% of babies (95% confidence interval 13% to 30%, exact method):
(a) Another sample of the same size will show a rate of thrombocytopenia between 13% and 30%;
(b) 95% of such women have a probability of between 13% and 30% of having a baby with thrombocytopenia;
(c) It is likely that between 13% and 30% of births to such women in the area would show thrombocytopenia;
(d) If the sample were increased to 880 births, the 95% confidence interval would be narrower;
(e) It would be impossible to get these data if the rate for all women was 10%.
8EExercise:MeansoflargesamplesTable8.4summarizesdatacollectedinastudyofplasmamagnesiumindiabetics.Thediabeticsubjectswereallinsulin-dependentsubjectsattendingadiabeticclinicovera5monthperiod.Thenon-diabeticcontrolswereamixtureofblooddonorsandpeopleattendingdaycentresfortheelderly,togiveawideage
distribution.PlasmamagnesiumfollowsaNormaldistributionveryclosely.
Table8.4.Plasmamagnesiumininsulin-dependentdiabeticsandhealthycontrols
Number Mean Standarddeviation
Insulin-dependentdiabetics
227 0.719 0.068
Non-diabeticcontrols 140 0.810 0.057
Fig.8.7.Distributionofmagnesiumindiabeticsandcontrols,showingtheproportionofdiabeticsabovethelowerlimitofreferenceinterval
1.Calculateanintervalwhichwouldinclude95%ofplasmamagnesiummeasurementsfromthecontrolpopulation.Thisiswhatwecallthe95%referenceinterval,describedindetailin§15.5.Ittellsussomethingaboutthedistributionofplasmamagnesiuminthepopulation.
2. What proportion of insulin-dependent diabetics would lie within this 95% reference interval? (Hint: find how many standard deviations from the diabetic mean the lower limit is, then use the table of the Normal distribution, Table 7.1, to find the probability of exceeding this. See Figure 8.7.)
3. Find the standard error of the mean plasma magnesium for each group.
4. Find a 95% confidence interval for the mean plasma magnesium in the healthy population. How does the confidence interval differ from the 95% reference interval? Why are they different?
5. Find the standard error of the difference in mean plasma magnesium between insulin-dependent diabetics and healthy people.
6. Find a 95% confidence interval for the difference in mean plasma magnesium between insulin-dependent diabetics and healthy people. Is there any evidence that diabetics have lower plasma magnesium than non-diabetics in the population from which these data come?
7. Would plasma magnesium be a good diagnostic test for diabetes?
9
Significance tests
9.1 Testing a hypothesis
In Chapter 8 I dealt with estimation and the precision of estimates. This is one form of statistical inference, the process by which we use samples to draw conclusions about the populations from which they are taken. In this chapter I shall introduce a different form of inference, the significance test or hypothesis test.
A significance test enables us to measure the strength of the evidence which the data supply concerning some proposition of interest. For example, consider the cross-over trial of pronethalol for the treatment of angina (§2.6). Table 9.1 shows the number of attacks over four weeks on each treatment. These 12 patients are a sample from the population of all patients. Would the other members of this population experience fewer attacks while using pronethalol? We can see that the number of attacks is highly variable from one patient to another, and it is quite possible that this is true from one period of time to another as well. So it could be that some patients would have fewer attacks while on pronethalol than while on placebo quite by chance. In a significance test, we ask whether the difference observed was small enough to have occurred by chance if there were really no difference in the population. If it were so, then the evidence in favour of there being a difference between the treatment periods would be weak or absent. On the other hand, if the difference were much larger than we would expect due to chance if there were no real population difference, then the evidence in favour of a real difference would be strong.
To carry out the test of significance we suppose that, in the population, there is no difference between the two treatments. The hypothesis of ‘no difference’ or ‘no effect’ in the population is called the null hypothesis. If this is not true, then the alternative hypothesis must be true, that there is a difference between the treatments in one direction or the other. We then find the probability of getting data as different from what would be expected, if the null hypothesis were true, as are those data actually observed. If this probability is large the data are consistent with the null hypothesis; if it is small the data are unlikely to have arisen if the null hypothesis were true and the evidence is in favour of the alternative hypothesis.
Table 9.1. Trial of pronethalol for the prevention of angina pectoris
Number of attacks while on      Difference                Sign of
Placebo      Pronethalol        placebo − pronethalol     difference
71           29                 42                        +
323          348                -25                       -
8            1                  7                         +
14           7                  7                         +
23           16                 7                         +
34           25                 9                         +
79           65                 14                        +
60           41                 19                        +
2            0                  2                         +
3            0                  3                         +
17           15                 2                         +
7            2                  5                         +
9.2 An example: The sign test
I shall now describe a particular test of significance, the sign test, to test the null hypothesis that placebo and pronethalol have the same effect on angina. Consider the differences between the number of attacks on the two treatments for each patient, as in Table 9.1. If the null hypothesis were true, then differences in number of attacks would be just as likely to be positive as negative; they would be random. The probability of a change being negative would be equal to the probability of it being positive, so both probabilities would be 0.5. Then the number of negatives would be an observation from a Binomial distribution (§6.4) with n = 12 and p = 0.5. (If there were any subjects who had the same number of attacks on both regimes we would omit them, as they provide no information about the direction of any difference between the treatments. In this test, n is the number of subjects for whom there is a difference, one way or the other.)
If the null hypothesis were true, what would be the probability of getting an observation from this distribution as extreme as the value we have actually observed? The expected number of negatives would be np = 6. What is the probability of getting a value as far from expectation as is that observed? The number of negative differences is 1. The probability of getting one negative change is
$$\frac{12!}{1!\,11!} \times 0.5^{1} \times 0.5^{11} = 12 \times 0.5^{12} = 0.00293$$
This is not a likely event in itself. However, we are interested in the probability of getting a value as far or further from the expected value, 6, as is 1, and clearly 0 is further and must be included. The probability of no negative changes is
$$\frac{12!}{0!\,12!} \times 0.5^{0} \times 0.5^{12} = 0.5^{12} = 0.00024$$
So the probability of one or fewer negative changes is 0.00293 + 0.00024 = 0.00317. The null hypothesis is that there is no difference, so the alternative hypothesis is that there is a difference in one direction or the other. We must, therefore, consider the probability of getting a value as extreme on the other side of the mean, that is 11 or 12 negatives (Figure 9.1). The probability of 11 or 12 negatives is also 0.00317, because the distribution is symmetrical. Hence, the probability of getting as extreme a value as that observed, in either direction, is 0.00317 + 0.00317 = 0.00634. This means that if the null hypothesis were true we would have a sample which is so extreme that the probability of it arising by chance is 0.006, less than one in a hundred.
Fig. 9.1. Extremes of the Binomial distribution for the sign test
Thus, we would have observed a very unlikely event if the null hypothesis were true. This means that the data are not consistent with the null hypothesis, and we can conclude that there is strong evidence in favour of a difference between the treatments. (Since this was a double blind randomized trial, it is reasonable to suppose that this was caused by the activity of the drug.)
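The sign test arithmetic above can be checked with a short Python sketch. The function name `sign_test_p_value` is only illustrative and is not from the text; it assumes that ties have already been omitted and that the two-sided probability counts every outcome at least as far from the expected value as the one observed.

```python
from math import comb


def sign_test_p_value(n_negative, n):
    """Two-sided sign test P value.

    n_negative: number of negative differences observed.
    n: number of non-zero differences (ties are omitted beforehand).
    """
    expected = n / 2  # expected count under the null hypothesis, np with p = 0.5
    distance = abs(n_negative - expected)
    # Add up Binomial(n, 0.5) probabilities for every count at least as far
    # from the expected value as the observed count, in either direction.
    favourable = sum(comb(n, k) for k in range(n + 1)
                     if abs(k - expected) >= distance)
    return favourable / 2 ** n


# Table 9.1: 12 patients with a difference, 1 of them negative.
p_one = comb(12, 1) * 0.5 ** 12    # probability of exactly one negative
p_zero = comb(12, 0) * 0.5 ** 12   # probability of no negatives
p_two_sided = sign_test_p_value(1, 12)

print(f"P(1 negative)  = {p_one:.5f}")
print(f"P(0 negatives) = {p_zero:.5f}")
print(f"two-sided P    = {p_two_sided:.5f}")
```

Carried to full precision, the two-sided probability is 26/4096, which agrees with the 0.006 quoted in the text; the last digit can differ slightly from 0.00634 because the text adds already rounded values.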
9.3 Principles of significance tests
The sign test is an example of a test of significance. The number of negative changes is called the test statistic, something calculated from the data which can be used to test the null hypothesis. The general procedure for a significance test is as follows.
1. Set up the null hypothesis and its alternative.
2. Find the value of the test statistic.
3. Refer the test statistic to a known distribution which it would follow if the null hypothesis were true.
4. Find the probability of a value of the test statistic arising which is as or more extreme than that observed, if the null hypothesis were true.
5. Conclude that the data are consistent or inconsistent with the null hypothesis.
We shall deal with several different significance tests in this and subsequent chapters. We shall see that they all follow this pattern.
If the data are not consistent with the null hypothesis, the difference is said to be statistically significant. If the data do not support the null hypothesis, it is sometimes said that we reject the null hypothesis, and if the data are consistent with the null hypothesis it is said that we accept it. Such an ‘all or nothing’ decision making approach is seldom appropriate in medical research. It is preferable to think of the significance test probability as an index of the strength of evidence against the null hypothesis. The term ‘accept the null hypothesis’ is also misleading because it implies that we have concluded that the null hypothesis is true, which we should not do. We cannot prove statistically that something, such as a treatment effect, does not exist. It is better to say that we have not rejected or have failed to reject the null hypothesis.
The probability of such an extreme value of the test statistic occurring if the null hypothesis were true is often called the P value. It is not the probability that the null hypothesis is true. This is a common misconception. The null hypothesis is either true or it is not; it is not random and has no probability. I suspect that many researchers have managed to use significance tests quite effectively despite holding this incorrect view.
9.4 Significance levels and types of error
We must still consider the question of how small is small. A probability of 0.006, as in the example above, is clearly small and we have a quite unlikely event. But what about 0.06, or 0.1? Suppose we take a probability of 0.01 or less as constituting reasonable evidence against the null hypothesis. If the null hypothesis is true, we shall make a wrong decision one in a hundred times. Deciding against a true null hypothesis is called an error of the first kind, type I error, or α error. We get an error of the second kind, type II error, or β error if we do not reject a null hypothesis which is in fact false. (α and β are the Greek letters ‘alpha’ and ‘beta’.) Now the smaller we demand the probability be before we decide against the null hypothesis, the larger the observed difference must be, and so the more likely we are to miss real differences. By reducing the risk of an error of the first kind we increase the risk of an error of the second kind.
The conventional compromise is to say that differences are significant if the probability is less than 0.05. This is a reasonable guideline, but should not be taken as some kind of absolute demarcation. There is not a great difference between probabilities of 0.06 and 0.04, and they surely indicate similar strength of evidence. It is better to regard probabilities around 0.05 as providing some evidence against the null hypothesis, which increases in strength as the probability falls. If we decide that the difference is significant, the probability is sometimes referred to as the significance level. We say that the significance level is high if the P value is low.
Fig. 9.2. One- and two-sided tests
As a rough and ready guide, we can think of P values as indicating the strength of evidence like this:
greater than 0.1: little or no evidence of a difference or relationship
between 0.05 and 0.1: weak evidence of a difference or relationship
between 0.01 and 0.05: evidence of a difference or relationship
less than 0.01: strong evidence of a difference or relationship
less than 0.001: very strong evidence of a difference or relationship
9.5 One- and two-sided tests of significance
In the above example, the alternative hypothesis was that there was a difference in one direction or the other. This is called a two-sided or two-tailed test, because we used the probabilities of extreme values in both directions. It would have been possible to have the alternative hypothesis that there was a decrease in the pronethalol direction, in which case the null hypothesis would be that the number of attacks on the placebo was less than or equal to the number on pronethalol. This would give P = 0.00317, and of course, a higher significance level than the two-sided test. This would be a one-sided or one-tailed test (Figure 9.2). The logic of this is that we should ignore any signs that the active drug is harmful to the patients. If what we were saying was ‘if this trial does not give a significant reduction in angina using pronethalol we will not use it again’, this might be reasonable, but the medical research process does not work like that. This is one of several pieces of evidence and so we should certainly use a method of inference which would enable us to detect effects in either direction.
The question of whether one- or two-sided tests should be the norm has been the subject of considerable debate among practitioners of statistical methods. Perhaps the position taken depends on the field in which the testing is usually done. In biological science, treatments seldom have only one effect and relationships between variables are usually complex. Two-sided tests are almost always preferable.
There are circumstances in which a one-sided test is appropriate. In a study of the effects of an investigative procedure, laparoscopy and hydrotubation, on the fertility of sub-fertile women (Luthra et al. 1982), we studied women presenting at an infertility clinic. These women were observed for several months, during which some conceived, before laparoscopy was carried out on those still infertile. These were then observed for several months afterwards and some of these women also conceived. We compared the conception rate in the period before laparoscopy with that afterwards. Of course, women who conceived during the first period did not have a laparoscopy. We argued that the less fertile a woman was the longer it was likely to take her to conceive. Hence, the women who had the laparoscopy should have a lower conception rate (by an unknown amount) than the larger group who entered the study, because the more fertile women had conceived before their turn for laparoscopy came. To see whether laparoscopy increased fertility, we could test the null hypothesis that the conception rate after laparoscopy was less than or equal to that before. The alternative hypothesis was that the conception rate after laparoscopy was higher than that before. A two-sided test was inappropriate because if the laparoscopy had no effect on fertility the post-laparoscopy rate was expected to be lower; chance did not come into it. In fact the post-laparoscopy conception rate was very high and the difference clearly significant.
9.6 Significant, real and important
If a difference is statistically significant, then it may well be real, but not necessarily important. For example, we may look at the effect of a drug, given for some other purpose, on blood pressure. Suppose we find that the drug raises blood pressure by an average of 1 mm Hg, and that this is significant. A rise in blood pressure of 1 mm Hg is not clinically important, so, although it may be there, it does not matter. It is (statistically) significant, and real, but not important.
On the other hand, if a difference is not statistically significant, it could still be real. We may simply have too small a sample to show that a difference exists. Furthermore, the difference may still be important. The difference in mortality in the anticoagulant trial of Carleton et al. (1960), described in Chapter 2, was not significant, the difference in percentage survival being 5.5 in favour of the active treatment. However, the authors also quote a confidence interval for the difference in percentage survival of 24.2 percentage points in favour of heparin to 13.3 percentage points in favour of the control treatment. A difference in survival of 24 percentage points in favour of the treatment would certainly be important if it turned out to be the case. ‘Not significant’ does not imply that there is no effect. It means that we have failed to demonstrate the existence of one. Later studies showed that anticoagulation is indeed effective.
A particular case of misinterpretation of non-significant results occurs in the interpretation of randomized clinical trials where there is a measurement before treatment and another afterwards. Rather than compare the after treatment measure between the two groups, researchers can be tempted to test separately the null hypotheses that the measure in the treatment group has not changed from baseline and that the measure in the control group has not changed from baseline. If one group shows a significant difference and the other does not, the researchers then conclude that the treatments are different.
For example, Kerrigan et al. (1993) assessed the effects of different levels of information on anxiety in patients due to undergo surgery. They randomized patients to receive either simple or detailed information about the procedure and its risks. Anxiety was again measured after patients had been given the information. Kerrigan et al. (1993) calculated significance tests for the mean change in anxiety score for each group separately. In the group given detailed information the mean change in anxiety was not significant (P = 0.2), interpreted incorrectly as ‘no change’. In the other group the reduction in anxiety was significant (P = 0.01). They concluded that there was a difference between the two groups because the change was significant in one group but not in the other. This is incorrect. There may, for example, be a difference in one group which just fails to reach the (arbitrary) significance level and a difference in the other which just exceeds it, the differences in the two groups being similar. We should compare the two groups directly. It is these which are comparable apart from the effects of treatment, being randomized, not the before and after treatment means which could be influenced by many other factors. An alternative analysis tested the null hypothesis that after adjustment for initial anxiety score the mean anxiety scores are the same in patients given simple and detailed information. This showed a significantly higher mean score in the detailed information group (Bland and Altman 1993). Testing within each group separately is essentially the same error as calculating a confidence interval for each group separately (§8.9).
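9.7 Comparing the means of large samples
We can use the 95% confidence interval for the difference between two means, x̄1 − x̄2 plus or minus 1.96 standard errors, to carry out a significance test of the null hypothesis that the difference between the means is zero, i.e. the alternative hypothesis is that µ1 and µ2 are not equal. If the confidence interval includes zero, then the probability of getting such extreme data if the null hypothesis were true is greater than 0.05 (i.e. 1 − 0.95). If the confidence interval excludes zero, then the probability of such extreme data under the null hypothesis is less than 0.05 and the difference is significant. Another way of doing the same thing is to note that, for large samples of size n1 and n2 with standard deviations s1 and s2,
$$z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
is from a Standard Normal distribution, i.e. mean 0 and variance 1. Under the null hypothesis that µ1 = µ2 or µ1 − µ2 = 0, this is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
This is the test statistic, and if it lies between −1.96 and +1.96 then the probability of such an extreme value is greater than 0.05 and the difference is not significant. If the test statistic is greater than 1.96 or less than −1.96, there is a less than 0.05 probability of such data arising if the null hypothesis were true, and the data are not consistent with the null hypothesis; the difference is significant at the 0.05 or 5% level. This is the large sample Normal test or z test for two means.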
For an example, in a study of respiratory symptoms in schoolchildren (§8.5), we wanted to know whether children reported by their parents to have respiratory symptoms had worse lung function than children who were not reported to have symptoms. Ninety-two children were reported to have cough during the day or at night, and their mean PEFR was 294.8 litre/min with standard deviation 57.1 litre/min; 1643 children were reported not to have the symptom, and their mean PEFR was 313.6 litre/min with standard deviation 55.2 litre/min. We thus have two large samples, and can apply the Normal test. We have
n1 = 92, x̄1 = 294.8, s1 = 57.1, n2 = 1643, x̄2 = 313.6, s2 = 55.2
The difference between the two groups is x̄1 − x̄2 = 294.8 − 313.6 = −18.8. The standard error of the difference is
$$\sqrt{\frac{57.1^2}{92} + \frac{55.2^2}{1643}} = 6.11 \text{ litre/min}$$
The test statistic is
$$\frac{-18.8}{6.11} = -3.1$$
Under the null hypothesis this is an observation from a Standard Normal distribution, and so P < 0.01 (Table 7.2). If the null hypothesis were true, the data which we have observed would be unlikely. We can conclude that there is good evidence that children reported to have cough during the day or at night have lower PEFR than other children.
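On the other hand, for the comparison of children with and without phlegm, the difference in mean PEFR was −14.6 litre/min with standard error 10.5 litre/min, giving a test statistic of −14.6/10.5 = −1.4. This has a probability of about 0.16, and so the data are consistent with the null hypothesis. However, the 95% confidence interval for the difference is −14.6 − 1.96 × 10.5 to −14.6 + 1.96 × 10.5, giving −35 to 6 litre/min. We see that the difference could be just as great as for cough. Because the size of the smaller sample is not so great, the test is less likely to detect a difference for the phlegm comparison than for the cough comparison. The advantages of confidence intervals over tests of significance are discussed by Gardner and Altman (1986). Confidence intervals are usually more informative than P values, particularly non-significant ones.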
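As a rough check on the cough example, the large sample Normal test can be sketched in Python using only the standard library. The helper names `normal_cdf` and `z_test_two_means` are illustrative, not from the text, and the two-sided P value is computed from the Normal distribution function rather than read from Table 7.2.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard Normal distribution function, Phi(z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def z_test_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """Large sample Normal (z) test and 95% CI for a difference in means."""
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # standard error of the difference
    z = diff / se                              # test statistic under the null
    p_value = 2 * (1 - normal_cdf(abs(z)))     # two-sided probability
    ci = (diff - 1.96 * se, diff + 1.96 * se)  # 95% confidence interval
    return diff, se, z, p_value, ci


# Cough example: 92 children with cough, 1643 without (PEFR in litre/min).
diff, se, z, p_value, ci = z_test_two_means(294.8, 57.1, 92, 313.6, 55.2, 1643)
print(f"difference = {diff:.1f}, SE = {se:.2f}, z = {z:.2f}, P = {p_value:.4f}")
print(f"95% CI: {ci[0]:.1f} to {ci[1]:.1f}")
```

The confidence interval excludes zero exactly when the test statistic lies outside −1.96 to +1.96, which is the equivalence described in the text.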
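9.8 Comparison of two proportions
Suppose we wish to compare two proportions p1 and p2, estimated from large independent samples of size n1 and n2. The null hypothesis is that the proportions in the populations from which the samples are drawn are the same, p say. Since under the null hypothesis the proportions for the two groups are the same, we can get one common estimate of the proportion and use it to estimate the standard errors. We estimate the common proportion from the data by
$$p = \frac{r_1 + r_2}{n_1 + n_2}$$
where p1 = r1/n1 and p2 = r2/n2. We want to make inferences from the difference between sample proportions, p1 − p2, so we require the standard error of this difference.
$$SE(p_1) = \sqrt{\frac{p(1-p)}{n_1}}, \qquad SE(p_2) = \sqrt{\frac{p(1-p)}{n_2}}, \qquad SE(p_1 - p_2) = \sqrt{SE(p_1)^2 + SE(p_2)^2}$$
since the samples are independent. Hence
$$SE(p_1 - p_2) = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
As p is based on more subjects than either p1 or p2, if the null hypothesis were true then standard errors would be more reliable than those estimated in §8.6 using p1 and p2 separately. We then find the test statistic
$$z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$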
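In §8.6, we looked at the proportions of children with bronchitis in infancy and with no such history who were reported to have respiratory symptoms in later life. We had 273 children with a history of bronchitis before age 5 years, 26 of whom were reported to have day or night cough at age 14. We had 1046 children with no bronchitis before age 5 years, 44 of whom were reported to have day or night cough at age 14. We shall test the null hypothesis that the prevalence of the symptom is the same in both populations, against the alternative that it is not:
n1 = 273, p1 = 26/273 = 0.09524, n2 = 1046, p2 = 44/1046 = 0.04207
$$p = \frac{26 + 44}{273 + 1046} = 0.05307$$
$$SE(p_1 - p_2) = \sqrt{0.05307 \times (1 - 0.05307) \times \left(\frac{1}{273} + \frac{1}{1046}\right)} = 0.01524$$
$$z = \frac{0.09524 - 0.04207}{0.01524} = 3.49$$
Referring this to Table 7.2 of the Normal distribution, we find the probability of such an extreme value is less than 0.01, so we conclude that the data are not consistent with the null hypothesis. There is good evidence that children with a history of bronchitis are more likely to be reported to have day or night cough at age 14.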
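The bronchitis example can be reproduced with a short Python sketch of the large sample test for two proportions. The function names are illustrative only; the pooled proportion and standard error follow the formulae of this section, and the P value is computed from the Normal distribution function rather than taken from Table 7.2.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard Normal distribution function, Phi(z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def z_test_two_proportions(r1, n1, r2, n2):
    """Large sample Normal test of the null hypothesis p1 = p2."""
    p1, p2 = r1 / n1, r2 / n2
    p = (r1 + r2) / (n1 + n2)                   # common (pooled) proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # SE valid under the null only
    z = (p1 - p2) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))      # two-sided probability
    return p1, p2, p, se, z, p_value


# 26 of 273 children with bronchitis and 44 of 1046 without had cough at 14.
p1, p2, p, se, z, p_value = z_test_two_proportions(26, 273, 44, 1046)
print(f"p1 = {p1:.5f}, p2 = {p2:.5f}, pooled p = {p:.5f}")
print(f"SE = {se:.5f}, z = {z:.2f}, P = {p_value:.4f}")
```

Because the pooled standard error assumes the null hypothesis, this sketch is for testing only; a confidence interval for the difference would use the separate-proportion standard error of §8.6.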
Note that the standard error used here is not the same as that found in §8.6. It is only correct if the null hypothesis is true. The formula of §8.6 should be used for finding the confidence interval. Thus the standard error used for testing is not identical to that used for estimation, as was the case for the comparison of two means. It is possible for the test to be significant and the confidence interval include zero. This property is possessed by several related tests and confidence intervals.
This is a large sample method, and is equivalent to the chi-squared test for a 2 by 2 table (§13.1, 2). How small the sample can be and methods for small samples are discussed in §13.3-6.
Note that we do not need a different test for the ratio of two proportions, as the null hypothesis that the ratio in the population is one is the same as the null hypothesis that the difference in the population is zero.
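9.9* The power of a test
The test for comparing means in §9.7 is more likely to detect a large difference between two populations than a small one. The probability that a test will produce a significant difference at a given significance level is called the power of the test. For a given test, this will depend on the true difference between the populations compared, the sample size and the significance level chosen. We have already noted in §9.4 that we are more likely to obtain a significant difference with a significance level of 0.05 than with one of 0.01. We have greater power if the P value chosen to be considered as significant is larger.
For the comparison of PEFR in children with and without phlegm (§9.7), for example, suppose that the population means were in fact µ1 = 310 and µ2 = 295 litre/min, and each population had standard deviation 55 litre/min. The sample sizes were n1 = 1708 and n2 = 27, so the standard error of the difference would be
$$\sqrt{\frac{55^2}{1708} + \frac{55^2}{27}} = 10.67 \text{ litre/min}$$
The population difference we want to be able to detect is µ1 − µ2 = 310 − 295 = 15, and so
$$\frac{\bar{x}_1 - \bar{x}_2 - 15}{10.67}$$
is from a Standard Normal distribution. The test is significant at the 0.05 level if (x̄1 − x̄2)/10.67 is greater than 1.96 or less than −1.96. The second is very unlikely when the population difference is 15, so the power is approximately the probability that the Standard Normal variable above exceeds 1.96 − 15/10.67 = 1.96 − 1.41 = 0.55, which is 1 − Φ(0.55).
From Table 7.1, Φ(0.55) is between 0.691 and 0.726, about 0.71. The power of the test would be 1 − 0.71 = 0.29. If these were the population means and standard deviation, our test would have had a poor chance of detecting the difference in means, even though it existed. The test would have low power. Figure 9.3 shows how the power of this test changes with the difference between population means. As the difference gets larger, the power increases, getting closer and closer to 1. The power is not zero even when the population difference is zero, because there is always the possibility of a significant difference, even when the null hypothesis is true. 1 − power = β, the probability of a Type II or beta error (§9.4) if the population difference = 15 litre/min.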
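The power calculation can be sketched in Python as follows. The function `power_two_means` is a hypothetical helper, not from the text; it assumes a common population standard deviation, a two-sided test at the 0.05 level, and it includes the small contribution from the opposite tail, which the hand calculation ignores.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard Normal distribution function, Phi(z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def power_two_means(delta, sd, n1, n2, z_crit=1.96):
    """Power of the large sample Normal test for a true difference delta."""
    se = sqrt(sd ** 2 / n1 + sd ** 2 / n2)    # standard error of the difference
    shift = delta / se                        # mean of the test statistic
    upper = 1 - normal_cdf(z_crit - shift)    # P(test statistic > +z_crit)
    lower = normal_cdf(-z_crit - shift)       # P(test statistic < -z_crit)
    return se, upper + lower


# Phlegm comparison: true difference 15 litre/min, SD 55, samples 1708 and 27.
se, power = power_two_means(15, 55, 1708, 27)
print(f"SE = {se:.2f} litre/min, power = {power:.2f}")

# With no true difference the 'power' is just the significance level.
_, power_at_zero = power_two_means(0, 55, 1708, 27)
print(f"power at zero difference = {power_at_zero:.3f}")
```

Evaluating the function over a range of differences would trace out a power curve of the kind shown in Figure 9.3.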
Fig. 9.3. Power curve for a comparison of two means from samples of size 1708 and 27
9.10* Multiple significance tests
If we test a null hypothesis which is in fact true, using 0.05 as the critical significance level, we have a probability of 0.95 of coming to a ‘not significant’ (i.e. correct) conclusion. If we test two independent true null hypotheses, the probability that neither test will be significant is 0.95 × 0.95 = 0.90 (§6.2). If we test twenty such hypotheses the probability that none will be significant is $0.95^{20} = 0.36$. This gives a probability of 1 − 0.36 = 0.64 of getting at least one significant result; we are more likely to get one than not. The expected number of spurious significant results is 20 × 0.05 = 1.
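A few lines of Python make the arithmetic for independent tests explicit. The function name is illustrative only, and the calculation assumes that all the null hypotheses are true and the tests are independent, as in the paragraph above.

```python
def prob_at_least_one_significant(k, alpha=0.05):
    """Chance of one or more 'significant' results among k independent
    tests of true null hypotheses, each carried out at level alpha."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 6, 20):
    none = (1 - 0.05) ** k                       # probability none significant
    some = prob_at_least_one_significant(k)      # probability at least one
    print(f"k = {k:2d}: P(none) = {none:.2f}, P(at least one) = {some:.2f}")

expected_spurious = 20 * 0.05  # expected number of spurious significant results
```

The rows for k = 2 and k = 20 correspond to the 0.90 and 0.36 quoted in the text.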
Many medical research studies are published with large numbers of significance tests. These are not usually independent, being carried out on the same set of subjects, so the above calculations do not apply exactly. However, it is clear that if we go on testing long enough we will find something which is ‘significant’. We must beware of attaching too much importance to a lone significant result among a mass of non-significant ones. It may be the one in twenty which we should get by chance alone.
This is particularly important when we find that a clinical trial or epidemiological study gives no significant difference overall, but does so in a particular subset of subjects, such as women aged over 60. For example, Lee et al. (1980) simulated a clinical trial of the treatment of coronary artery disease by allocating 1073 patient records from past cases into two ‘treatment’ groups at random. They then analysed the outcome as if it were a genuine trial of two treatments. The analysis was quite detailed and thorough. As we would expect, it failed to show any significant difference in survival between those patients allocated to the two ‘treatments’. Patients were then subdivided by two variables which affect prognosis, the number of diseased coronary vessels and whether the left ventricular contraction pattern was normal or abnormal. A significant difference in survival between the two ‘treatment’ groups was found in those patients with three diseased vessels (the maximum) and abnormal ventricular contraction. As this would be the subset of patients with the worst prognosis, the finding would be easy to account for by saying that the superior ‘treatment’ had its greatest advantage in the most severely ill patients! The moral of this story is that if there is no difference between the treatments overall, significant differences in subsets are to be treated with the utmost suspicion. This method of looking for a difference in treatment effect between subgroups of subjects is incorrect. A correct approach would be to use a multifactorial analysis, as described in Chapter 17, with treatment and group as two factors, and test for an interaction between groups and treatments. The power for detecting such interactions is quite low, and we need a larger sample than would be needed simply to show a difference overall (Altman and Matthews 1996, Matthews and Altman 1996a, b).
This spurious significant difference comes about because, when there is no real difference, the probability of getting no significant differences in six subgroups is $0.95^6 = 0.74$, not 0.95. We can allow for this effect by the Bonferroni method. In general, if we have k independent significance tests, at the α level, of null hypotheses which are all true, the probability that we will get no significant differences is $(1-\alpha)^k$. If we make α small enough, we can make the probability that none of the separate tests is significant equal to 0.95. Then if any of the k tests has a P value less than α, we will have a significant difference between the treatments at the 0.05 level. Since α will be very small, it can be shown that $(1-\alpha)^k \approx 1 - k\alpha$. If we put kα = 0.05, so α = 0.05/k, we will have probability 0.05 that one of the k tests will have a P value less than α if the null hypotheses are true. Thus, if in a clinical trial we compare two treatments within 5 subsets of patients, the treatments will be significantly different at the 0.05 level if there is a P value less than 0.01 within any of the subsets. This is the Bonferroni method. Note that they are not significant at the 0.01 level, but at only the 0.05 level. The k tests together test the composite null hypothesis that there is no treatment effect on any variable.
We can do the same thing by multiplying the observed P value from the significance tests by the number of tests, k, any kP which exceeds one being ignored. Then if any kP is less than 0.05, the two treatments are significant at the 0.05 level.
For example, Williams et al. (1992) randomly allocated elderly patients discharged from hospital to two groups. The intervention group received timetabled visits by health visitor assistants; the control group patients were not visited unless there was perceived need. Soon after discharge and after one year, patients were assessed for physical health, disability, and mental state using questionnaire scales. There were no significant differences overall between the intervention and control groups, but among women aged 75–79 living alone the control group showed significantly greater deterioration in physical score than did the intervention group (P = 0.04), and among men over 80 years the control group showed significantly greater deterioration in disability score than did the intervention group (P = 0.03). The authors stated that ‘Two small sub-groups of patients were possibly shown to have benefited from the intervention…. These benefits, however, have to be treated with caution, and may be due to chance factors.’ Subjects were cross-classified by age groups, whether living alone, and sex, so there were at least eight subgroups, if not more. Thus even if we consider the three scales separately, only a P value less than 0.05/8 = 0.006 would provide evidence of a treatment effect. Alternatively, the true P values are 8 × 0.04 = 0.32 and 8 × 0.03 = 0.24.
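The Bonferroni adjustment for this example can be written as a minimal Python sketch. The function name `bonferroni_adjust` is illustrative; capping kP at one is an assumption of this sketch, standing in for the text's instruction that any kP exceeding one be ignored.

```python
def bonferroni_adjust(p_values, k=None):
    """Multiply each P value by the number of tests k, capping at 1.

    If k is not given, it is taken as the number of P values supplied.
    """
    if k is None:
        k = len(p_values)
    return [min(1.0, k * p) for p in p_values]


# Two subgroup results reported among at least eight subgroups.
k = 8
threshold = 0.05 / k                        # per-test level for overall 0.05
adjusted = bonferroni_adjust([0.04, 0.03], k)
significant = [kp < 0.05 for kp in adjusted]

print(f"per-test threshold = {threshold:.5f}")
print("adjusted P values:", [round(kp, 2) for kp in adjusted])
print("significant at 0.05:", significant)
```

Comparing each raw P value with 0.05/k and comparing each kP with 0.05 are the same decision rule, which is why neither subgroup finding survives the adjustment.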
A similar problem arises if we have multiple outcome measurements. For example, Newnham et al. (1993) randomized pregnant women to receive a series of Doppler ultrasound blood flow measurements or to control. They found a significantly higher proportion of birthweights below the 10th and 3rd centiles (P = 0.006 and P = 0.02). These were only two of many comparisons, however, and one would suspect that there may be some spurious significant differences among so many. At least 35 were reported in the paper, though only these two were reported in the abstract. (Birthweight was not the intended outcome variable for the trial.) These tests are not independent, because they are all on the same subjects, using variables which may not be independent. The proportions of birthweights below the 10th and 3rd centiles are clearly not independent, for example. The probability that two correlated variables both give non-significant differences when the null hypothesis is true is greater than $(1-\alpha)^2$ because if the first test is not significant, the second now has a probability greater than 1 − α of being not significant also. (Similarly, the probability that both are significant exceeds $\alpha^2$, and the probability that only one is significant is reduced.)
For k tests the probability of no significant differences is greater than $(1-\alpha)^k$ and so greater than 1 − kα. Thus if we carry out each test at the α = 0.05/k level, we will have a probability of no significant differences which is greater than 0.95. A P value less than α for any variable, or kP < 0.05, would mean that the treatments were significantly different. For the example, the P values could be adjusted by 35 × 0.006 = 0.21 and 35 × 0.02 = 0.70.
Because the probability of obtaining no significant differences if the null hypotheses are all true is greater than the 0.95 which we want it to be, the overall P value is actually smaller than the nominal 0.05, by an unknown amount which depends on the lack of independence between the tests. The power of the test, its ability to detect true differences in the population, is correspondingly diminished. In statistical terms, the test is conservative.
Other multiple testing problems arise when we have more than two groups of subjects and wish to compare each pair of groups (§10.9), when we have a series of observations over time, such as blood pressure every 15 min after administration of a drug, where there may be a temptation to test each time point separately (§10.7), and when we have relationships between many variables to examine, as in a survey. For all these problems, the multiple tests are highly correlated and the Bonferroni method is inappropriate, as it will be highly conservative and may miss real differences.
9.11* Repeated significance tests and sequential analysis
A special case of multiple testing arises in clinical trials, where patients are admitted at different times. There can be a temptation to keep looking at the data and carrying out significance tests. As described above (§9.10), this is liable to produce spurious significant differences, detecting treatment effects where none exist. I have heard of researchers testing the difference each time a patient is added and stopping the trial as soon as the difference is significant, then submitting the paper for publication as if only one test had been carried out. I will be charitable and put this down to ignorance.
It is quite legitimate to set up a trial where the treatment difference is tested every time a patient is added, provided this repeated testing is designed into the trial and the overall chance of a significant difference when the null hypothesis is true remains 0.05. Such designs are called sequential clinical trials. A comprehensive account is given by Whitehead (1997).
An alternative approach which is quite often used is to take a small number of looks at the data as the trial progresses, testing at a predetermined P value. For example, we could test three times, rejecting the null hypothesis of no treatment effect the first time only if P < 0.001, the second time if P < 0.01, and the third time if P < 0.04. Then if the null hypothesis is true, the probability that there will not be a significant difference is approximately 0.999 × 0.99 × 0.96 = 0.949, so the overall alpha level will be 1 − 0.949 = 0.051, i.e. approximately 0.05. (The calculation is approximate because the tests are not independent.) If the null hypothesis is rejected at any of these tests, the overall P value is 0.05, not the nominal one. This approach can be used by data monitoring committees, where if the trial shows a large difference early on the trial can be stopped yet still allow a statistical conclusion to be drawn. This is called the alpha spending or P-value spending approach.
Two particular methods which you might come across are the grouped sequential design of Pocock (1977, 1982), where each test is done at the same nominal alpha value, and the method of O'Brien and Fleming (1979), widely used by the pharmaceutical industry, where the nominal alpha values decrease sharply as the trial progresses.
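The approximate overall alpha level for a small number of looks can be sketched in Python. The function name is illustrative only, and the calculation treats the looks as independent, which the text notes is only an approximation.

```python
def approximate_overall_alpha(nominal_levels):
    """Approximate overall type I error for repeated looks at the data,
    treating the looks as independent tests of a true null hypothesis."""
    prob_no_rejection = 1.0
    for alpha in nominal_levels:
        prob_no_rejection *= 1 - alpha   # no significant result at this look
    return 1 - prob_no_rejection


# Three looks, rejecting at P < 0.001, P < 0.01 and P < 0.04 in turn.
looks = [0.001, 0.01, 0.04]
overall = approximate_overall_alpha(looks)
print(f"probability of no rejection = {1 - overall:.3f}")
print(f"approximate overall alpha   = {overall:.3f}")
```

Trying other sets of nominal levels shows how much of the overall 0.05 each look spends, for example equal levels at every look compared with levels that are very strict early and more lenient at the final analysis.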
9M Multiple choice questions 44 to 49
(Each branch is either true or false)
44. In a case–control study, patients with a given disease drank coffee more frequently than did controls, and the difference was highly significant. We can conclude that:
(a) drinking coffee causes the disease;
(b) there is evidence of a real relationship between the disease and coffee drinking in the sampled population;
(c) the disease is not related to coffee drinking;
(d) eliminating coffee would prevent the disease;
(e) coffee and the disease always go together.
45. When comparing the means of two large samples using the Normal test:
(a) the null hypothesis is that the sample means are equal;
(b) the null hypothesis is that the means are not significantly different;
(c) standard error of the difference is the sum of the standard errors of the means;
(d) the standard errors of the means must be equal;
(e) the test statistic is the ratio of the difference to its standard error.
46. In a comparison of two methods of measuring PEFR, 6 of 17 subjects had higher readings on the Wright peak flow meter, 10 had higher readings on the mini peak flow meter and one had the same on both. If the difference between the instruments is tested using a sign test:
(a) the test statistic may be the number with the higher reading on the Wright meter;
(b) the null hypothesis is that there is no tendency for one instrument to read higher than the other;
(c) a one-tailed test of significance should be used;
(d) the test statistic should follow the Binomial distribution (n = 16 and p = 0.5) if the null hypothesis were true;
(e) the instruments should have been presented in random order.
47. In a small randomized double blind trial of a new treatment in acute myocardial infarction, the mortality in the treated group was half that in the control group, but the difference was not significant. We can conclude that:
(a) the treatment is useless;
(b) there is no point in continuing to develop the treatment;
(c) the reduction in mortality is so great that we should introduce the treatment immediately;
(d) we should keep adding cases to the trial until the Normal test for comparison of two proportions is significant;
(e) we should carry out a new trial of much greater size.
48. In a large sample comparison between two groups, increasing the sample size will:
(a) improve the approximation of the test statistic to the Normal distribution;
(b) decrease the chance of an error of the first kind;
(c) decrease the chance of an error of the second kind;
(d) increase the power against a given alternative;
(e) make the null hypothesis less likely to be true.
49. In a study of breast feeding and intelligence (Lucas et al. 1992), 300 children who were very small at birth were given their mother's breast milk or infant formula, at the choice of the mother. At the age of 8 years the IQ of these children was measured. The mean IQ in the formula group was 92.8, compared to a mean of 103.0 in the breast milk group. The difference was significant, P<0.001:
(a) there is good evidence that formula feeding of very small babies reduces IQ at age eight;
(b) there is good evidence that choosing to express breast milk is related to higher IQ in the child at age eight;
(c) type of milk has no effect on subsequent IQ;
(d) the probability that type of milk affects subsequent IQ is less than 0.1%;
(e) if type of milk were unrelated to subsequent IQ, the probability of getting a difference in mean IQ as big as that observed is less than 0.001.
9E Exercise: Crohn's disease and cornflakes
The suggestion that cornflakes may cause Crohn's disease arose in the study of James (1977). Crohn's disease is an inflammatory disease, usually of the last part of the small intestine. It can cause a variety of symptoms, including vague pain, diarrhoea, acute pain and obstruction. Treatment may be by drugs or surgery, but many patients have had the disease for many years. James' initial hypothesis was that foods taken at breakfast may be associated with Crohn's disease. James studied 16 men and 18 women with Crohn's disease, aged 19–64 years, mean time since diagnosis 4.2 years. These were compared to controls, drawn from hospital patients without major gastro-intestinal symptoms. Two controls were chosen per patient, matched for age and sex. James interviewed all cases and controls himself. Cases were asked whether they ate various foods for breakfast before the onset of symptoms, and controls were asked whether they ate various foods before a corresponding time (Table 9.2). There was a significant excess of eating of cornflakes, wheat and bran among the Crohn's patients. The consumption of different cereals was interrelated, people reporting one cereal being likely to report others. In James' opinion the principal association of Crohn's disease was with cornflakes, based on the apparent strength of the association. Only one case had never eaten cornflakes.
Table 9.2. Numbers of Crohn's disease patients and controls who ate various cereals regularly (at least once per week) (James 1977)
Cereal      Frequency        Patients  Controls  Significance test
Cornflakes  Regularly        23        17        P<0.0001
            Rarely or never  11        51
Wheat       Regularly        16        12        P<0.01
            Rarely or never  18        56
Porridge    Regularly        11        15        0.5>P>0.1
            Rarely or never  23        53
Rice        Regularly        8         10        0.5>P>0.1
            Rarely or never  26        56
Bran        Regularly        6         2         P=0.02
            Rarely or never  28        66
Muesli      Regularly        4         3         P=0.17
            Rarely or never  30        65
Several papers soon appeared in which this study was repeated, with variations. None was identical in design to James' study and none appeared to support his findings. Mayberry et al. (1978) interviewed 100 patients with Crohn's disease, mean duration nine years. They obtained 100 controls, matched for age and sex, from patients and their relatives attending a fracture clinic. Cases and controls were interviewed about their current breakfast habits (Table 9.3). The only significant difference was an excess of fruit juice drinking in controls. Cornflakes were eaten by 29 cases compared to 22 controls, which was not significant. In this study there was no particular tendency for cases to report more foods than controls. The authors also asked cases whether they knew of an association between food (unspecified) and Crohn's disease. The association with cornflakes was reported by 29, and 12 of these had stopped eating them, having previously eaten them regularly. In their 29 matched controls, 3 were past cornflakes eaters. Of the 71 Crohn's patients who were unaware of the association, 21 had discontinued eating cornflakes compared to 10 of their 71 controls. The authors remarked ‘seemingly patients with Crohn's disease had significantly reduced their consumption of cornflakes compared with controls, irrespective of whether they were aware of the possible association’.
1. Are the cases and controls comparable in either of these studies?
2. What other sources of bias could there be in these designs?
Table 9.3. Number of patients and controls regularly consuming certain foods at least twice weekly (Mayberry et al. 1978)
Foods at breakfast                      Crohn's patients (n=100)  Controls (n=100)  Significance test
Bread                                   91                        86
Toast                                   59                        64
Egg                                     31                        37
Fruit or fruit juice                    14                        30                P<0.02
Porridge                                20                        18
Weetabix, shreddies or shredded wheat   21                        19
Cornflakes                              29                        22
Special K                               4                         7
Rice krispies                           6                         6
Sugar puffs                             3                         1
Bran or all bran                        13                        12
Muesli                                  3                         10
Any cereal                              55                        55
3. What is the main point of difference in design between the study of James and that of Mayberry et al.?
4. In the study of Mayberry et al. how many Crohn's cases and how many controls had ever been regular eaters of cornflakes? How does this compare with James' findings?
5. Why did James think that eating cornflakes was particularly important?
6. For the data of Table 9.2, calculate the percentage of cases and controls who said that they ate the various cereals. Now divide the proportion of cases who said that they had eaten the cereal by the proportion of controls who reported eating it. This tells us, roughly, how much more likely cases were to report the cereal than were controls. Do you think eating cornflakes is particularly important?
7. If we have an excess of all cereals when we ask what was ever eaten, and none when we ask what is eaten now, what possible factors could account for this?
10
Comparing the means of small samples
10.1 The t distribution
We have seen in Chapters 8 and 9 how the Normal distribution can be used to calculate confidence intervals and to carry out tests of significance for the means of large samples. In this chapter we shall see how similar methods may be used when we have small samples, using the t distribution, and go on to compare several means.
So far, the probability distributions we have used have arisen because of the way data were collected, either from the way samples were drawn (Binomial distribution), or from the mathematical properties of large samples (Normal distribution). The distribution did not depend on any property of the data themselves. To use the t distribution we must make an assumption about the distribution from which the observations themselves are taken, the distribution of the variable in the population. We must assume this to be a Normal distribution. As we saw in Chapter 7, many naturally occurring variables have been found to follow a Normal distribution closely. I shall discuss the effects of any deviations from the Normal later.
Fig. 10.1. Student's t distribution with 1, 4 and 20 degrees of freedom, showing convergence to the Standard Normal distribution
Table 10.1. Two-tailed probability points of the t distribution
D.f.  Probability                  D.f.  Probability
      0.10  0.05   0.01   0.001          0.10  0.05
      10%   5%     1%     0.1%           10%   5%
1 6.31 12.70 63.66 636.62 16 1.75 2.12
2 2.92 4.30 9.93 31.60 17 1.74 2.11
3 2.35 3.18 5.84 12.92 18 1.73 2.10
4 2.13 2.78 4.60 8.61 19 1.73 2.09
5 2.02 2.57 4.03 6.87 20 1.72 2.09
6 1.94 2.45 3.71 5.96 21 1.72 2.08
7 1.89 2.36 3.50 5.41 22 1.72 2.07
8 1.86 2.31 3.36 5.04 23 1.71 2.07
9 1.83 2.26 3.25 4.78 24 1.71 2.06
10 1.81 2.23 3.17 4.59 25 1.71 2.06
11 1.80 2.20 3.11 4.44 30 1.70 2.04
12 1.78 2.18 3.05 4.32 40 1.68 2.02
13 1.77 2.16 3.01 4.22 60 1.67 2.00
14 1.76 2.14 2.98 4.14 120 1.66 1.98
15 1.75 2.13 2.95 4.07 ∞ 1.64 1.96
D.f. = Degrees of freedom.
∞ = infinity, same as the Standard Normal distribution.
Like the Normal distribution, the t distribution function cannot be integrated algebraically and its numerical values have been tabulated. Because the t distribution depends on the degrees of freedom, it is not usually tabulated in full like the Normal distribution in Table 7.1. Instead, probability points are given for different degrees of freedom. Table 10.1 shows two sided probability points for selected degrees of freedom. Thus, with 4 degrees of freedom, we can see that, with probability 0.05, t will be 2.78 or more from its mean, zero.
Because only certain probabilities are quoted, we cannot usually find the exact probability associated with a particular value of t. For example, suppose we want to know the probability of t on 9 degrees of freedom being further from zero than 3.7. From Table 10.1 we see that the 0.01 point is 3.25 and the 0.001 point is 4.78. We therefore know that the required probability lies between 0.01 and 0.001. We could write this as 0.001<P<0.01. Often the lower bound, 0.001, is omitted and we write P<0.01. With a computer it is possible to calculate the exact probability every time, so this common practice is due to disappear.
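As an illustration of the computer calculation, here is a minimal Python sketch that finds the two-tailed probability by numerical integration of the t density, using only the standard library. The function names `t_pdf` and `t_two_tailed_p` and the choice of Simpson's rule are assumptions made for this example; a statistical package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_tailed_p(t, df, steps=10_000):
    """P(|T| > |t|), by Simpson's rule on the density between 0 and |t|."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half

# The example from the text: t = 3.7 on 9 degrees of freedom.
p = t_two_tailed_p(3.7, 9)
print(f"P = {p:.4f}")
```

The result should fall between the tabulated bounds of 0.001 and 0.01, and the same function gives a probability close to 0.05 for t = 2.78 on 4 degrees of freedom.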
Fig. 10.2. Sample t ratios derived from 750 samples of 4 human heights and the t distribution, after Student (1908)
10.2 The one-sample t method
We can use the t distribution to find confidence intervals for means estimated from a small sample from a Normal distribution. We do not usually have small samples in sample surveys, but we often find them in clinical studies. For example, we can use the t distribution to find confidence intervals for the size of difference between two treatment groups, or between measurements obtained from subjects under two conditions. I shall deal with the latter, single sample problem first.
The population mean, µ, is unknown and we wish to estimate it using a 95% confidence interval. We can see that, for 95% of samples, the difference between $\bar{x}$ and µ is at most t standard errors, where t is the value of the t distribution such that 95% of observations will be closer to zero than t. For a large sample this will be 1.96 as for the Normal distribution. For small samples we must use Table 10.1. In this table, the probability that the t distribution is further from zero than t is given, so we must first find one minus our desired probability, 0.95. We have 1 - 0.95 = 0.05, so we use the 0.05 column of the table to get the value of t. We then have the 95% confidence interval: $\bar{x}$ - t standard errors to $\bar{x}$ + t standard errors. The usual application of this is to differences between measurements made on the same or on matched pairs of subjects. In this application the one sample t test is also known as the paired t test.
Consider the data of Table 10.2. (I asked the researcher why there were so many missing data. He told me that some of the biopsies were not usable to count the capillaries, and that some of these patients were amputees and the foot itself was missing.) We shall estimate the difference in capillary density between the worse foot (in terms of ulceration, not capillaries) and the better foot for the ulcerated patients. The first step is to find the differences (worse – better). We then find the mean difference and its standard error, as described in §8.2. These are in the last column of Table 10.2.
Table 10.2. Capillary density (per mm²) in the feet of ulcerated patients and a healthy control group (data supplied by Marc Lamah)
                    Controls                    Ulcerated patients
                    Right  Left   Average of    Worse  Better  Average of  Difference
                    foot   foot   right and     foot   foot    worse and   worse -
                                  left†                        better†     better
                    19     16     17.5          9      ?       9.0
                    25     30     27.5          11     ?       11.0
                    25     29     27.0          15     10      12.5        5
                    26     33     29.5          16     21      18.5        -5
                    26     28     27.0          18     18      18.0        0
                    30     28     29.0          18     18      18.0        0
                    33     36     34.5          19     26      22.5        -7
                    33     29     31.0          20     ?       20.0
                    34     37     35.5          20     20      20.0        0
                    34     33     33.5          20     33      26.5        -13
                    34     37     35.5          20     26      23.0        -6
                    34     ?      34.0          21     15      18.0        6
                    35     38     36.5          22     23      22.5        -1
                    36     40     38.0          22     ?       22.0
                    39     41     40.0          23     23      23.0        0
                    40     39     39.5          25     30      27.5        -5
                    41     39     40.0          26     31      28.5        -5
                    41     39     40.0          27     26      26.5        1
                    56     48     52.0          27     ?       27.0
                                                35     23      29.0        12
                                                47     42      44.5        5
                                                ?      24      24.0
                                                ?      28      28.0
Number                            19                           23          16
Mean                              34.08                        22.59       -0.81
Sum of squares                    956.13                       1176.32     550.44
Variance                          53.12                        53.47       36.70
Standard deviation                7.29                         7.31        6.06
Standard error                    1.67                         1.52        1.51
† When one observation is missing the average = the other observation. ? = Missing data.
To find the 95% confidence interval for the mean difference we must suppose that the differences follow a Normal distribution. To calculate the interval, we first require the relevant point of the t distribution from Table 10.1. There are 16 non-missing differences and hence n - 1 = 15 degrees of freedom associated with s². We want a probability of 0.95 of being closer to zero than t, so we go to Table 10.1 with probability = 1 - 0.95 = 0.05. Using the 15 d.f. row, we get t = 2.13. Hence the difference between a sample mean and the population mean is less than 2.13 standard errors for 95% of samples, and the 95% confidence interval is -0.81 - 2.13×1.51 to -0.81 + 2.13×1.51 = -4.03 to +2.41 capillaries/mm².
On the basis of these data, the capillary density could be less in the worse affected foot by as much as 4.03 capillaries/mm², or greater by as much as 2.41 capillaries/mm². In the large sample case, we would use the Normal distribution instead of the t distribution, putting 1.96 instead of 2.13. We would not then need the differences themselves to follow a Normal distribution.
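The interval can be reproduced with a few lines of Python. This is only a sketch: the 16 pairs are transcribed from the worse foot and better foot columns of Table 10.2 for patients with both measurements, the value 2.13 is read from Table 10.1, and the variable names are illustrative. Because the text rounds the mean and standard error before multiplying, the limits may differ from those quoted in the second decimal place.

```python
from math import sqrt
from statistics import mean, stdev

# (worse foot, better foot) for the 16 ulcerated patients with both feet measured
pairs = [(15, 10), (16, 21), (18, 18), (18, 18), (19, 26), (20, 20),
         (20, 33), (20, 26), (21, 15), (22, 23), (23, 23), (25, 30),
         (26, 31), (27, 26), (35, 23), (47, 42)]
diffs = [worse - better for worse, better in pairs]

n = len(diffs)
d_bar = mean(diffs)             # mean difference
se = stdev(diffs) / sqrt(n)     # standard error of the mean difference
t_05 = 2.13                     # 0.05 point of t with n - 1 = 15 d.f. (Table 10.1)

lower, upper = d_bar - t_05 * se, d_bar + t_05 * se
print(f"mean {d_bar:.2f}, SE {se:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```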
Fig. 10.3. Normal plot for differences and plot of difference against average for the data of Table 10.2, ulcerated patients
We can also use the t distribution to test the null hypothesis that in the population the mean difference is zero. If the null hypothesis were true, and the differences follow a Normal distribution, the test statistic mean/standard error would be from a t distribution with n - 1 degrees of freedom. This is because the null hypothesis is that the mean difference µ = 0, hence the numerator $\bar{x}$ - µ = $\bar{x}$. We have the usual ‘estimate over standard error’ formula. For the example, we have
$$t = \frac{\bar{x}}{\text{standard error}} = \frac{-0.81}{1.51} = -0.54$$
If we go to the 15 degrees of freedom row of Table 10.1, we find that the probability of such an extreme value arising is greater than 0.10, the 0.10 point of the distribution being 1.75. Using a computer we would find P = 0.6. The data are consistent with the null hypothesis and we have failed to demonstrate the existence of a difference. Note that the confidence interval is more informative than the significance test.
We could also use the sign test to test the null hypothesis of no difference. This gives us 5 positives out of 12 differences (4 differences, being zero, give no useful information) which gives a two sided probability of 0.8, a little larger than that given by the t test. Provided the assumption of a Normal distribution is true, the t test is preferred because it is the most powerful test, and so most likely to detect differences should they exist.
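The sign test probability quoted above can be checked directly from the Binomial distribution with n = 12 and p = 0.5. The following Python sketch does this; the function name `sign_test_two_sided` is illustrative only.

```python
from math import comb

def sign_test_two_sided(n_positive, n_nonzero):
    """Two-sided sign test P value from the Binomial(n_nonzero, 0.5) distribution."""
    k = min(n_positive, n_nonzero - n_positive)
    # Probability of k or fewer in the smaller group, doubled for two sides.
    tail = sum(comb(n_nonzero, i) for i in range(k + 1)) / 2 ** n_nonzero
    return min(1.0, 2 * tail)

# 5 positive differences out of the 12 non-zero differences.
p = sign_test_two_sided(5, 12)
print(round(p, 2))  # 0.77
```

To one decimal place this is the 0.8 given in the text.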
The validity of the paired t method described above depends on the assumption that the differences are from a Normal distribution. We can check the assumption of a Normal distribution by a Normal plot (§7.5). Figure 10.3 shows a Normal plot for the differences. The points lie close to the expected line, suggesting that there is little deviation from the Normal.
Another plot which is a useful check here is the difference against the subject mean (Figure 10.3). If the difference depends on magnitude, then we should be careful of drawing any conclusion about the mean difference. We may want to investigate this further, perhaps by transforming the data (§10.4). In this case the difference between the two feet does not appear to be related to the level of capillary density and we need not be concerned about this.
The differences may look like a fairly good fit to the Normal even when the measurements themselves do not. There are two reasons for this: the subtraction removes variability between subjects, leaving the measurement error which is more likely to be Normal, and the two measurement errors are then added by the differencing, producing the tendency of sums to the Normal seen in the Central Limit theorem (§7.3). The assumption of a Normal distribution for the one sample case is quite likely to be met. I discuss this further in §10.5.
10.3 The means of two independent samples
Suppose we have two samples from populations which have a Normal distribution, with which we want to estimate the difference between the population means. If the samples were large, the 95% confidence interval for the difference would be the observed difference - 1.96 standard errors to observed difference + 1.96 standard errors. Unfortunately, we cannot simply replace 1.96 by a number from Table 10.1. This is because the standard error does not have the simple form described in §10.1. It is not based on a single sum of squares, but rather is the square root of the sum of two constants multiplied by two sums of squares. Hence, it does not follow the square root of the Chi-squared distribution as required for the denominator of a t distributed random variable (§7A). In order to use the t distribution we must make a further assumption about the data. Not only must the samples be from Normal distributions, they must be from Normal distributions with the same variance. This is not as unreasonable an assumption as it may sound. A difference in mean but not in variability is a common phenomenon. The PEFR data for children with and without symptoms analysed in §8.5 and §9.6 show the characteristic very clearly, as do the average capillary densities in Table 10.2.
We now estimate the common variance, s². First we find the sum of squares about the sample mean for each sample, which we can label SS1 and SS2. We form a combined sum of squares by SS1 + SS2. The sum of squares for the first group, SS1, has $n_1 - 1$ degrees of freedom and the second, SS2, has $n_2 - 1$ degrees of freedom. The total degrees of freedom is therefore $n_1 - 1 + n_2 - 1 = n_1 + n_2 - 2$. We have lost 2 degrees of freedom because we have a sum of squares about two means, each estimated from the data. The combined estimate of variance is
$$s^2 = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$
The standard error of $\bar{x}_1 - \bar{x}_2$ is
$$\sqrt{\frac{s^2}{n_1} + \frac{s^2}{n_2}} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Now we have a standard error related to the square root of the Chi-squared distribution and we can get a t distributed variable by
$$\frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
having $n_1 + n_2 - 2$ degrees of freedom. The 95% confidence interval for the difference between population means, $\mu_1 - \mu_2$, is
$$\bar{x}_1 - \bar{x}_2 - t\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \quad\text{to}\quad \bar{x}_1 - \bar{x}_2 + t\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where t is the 0.05 point with $n_1 + n_2 - 2$ degrees of freedom from Table 10.1. Alternatively, we can test the null hypothesis that in the population the difference is zero, i.e. that $\mu_1 = \mu_2$, using the test statistic
$$\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
which would follow the t distribution with $n_1 + n_2 - 2$ d.f. if the null hypothesis were true.
Fig. 10.4. Scatter plot against group and Normal plot for the patient averages of Table 10.2
For a practical example, Table 10.2 shows the average capillary density over both feet (if present) for normal control subjects as well as ulcer patients. We shall estimate the difference between the ulcerated patients and controls. We can check the assumptions of Normal distribution and uniform variance. From Table 10.2 the variances appear remarkably similar, 53.12 and 53.47. Figure 10.4 shows that there appears to be a shift of mean only. The Normal plot combines the groups by taking the differences between each observation and its group mean, called the residuals. This has a slight kink at the end but no pronounced curve, suggesting that there is little deviation from the Normal. I therefore feel quite happy that the assumptions of the two-sample t method are met.
First we find the common variance estimate, s². The sums of squares about the two sample means are 956.13 and 1176.32. This gives the combined sum of squares about the sample means to be 956.13 + 1176.32 = 2132.45. The combined degrees of freedom are $n_1 + n_2 - 2$ = 19 + 23 - 2 = 40. Hence s² = 2132.45/40 = 53.31. The standard error of the difference between means is
$$\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{53.31\left(\frac{1}{19} + \frac{1}{23}\right)} = 2.26$$
The value of the t distribution for the 95% confidence interval is found from the 0.05 column and 40 degrees of freedom row of Table 10.1, giving t0.05 = 2.02. The difference between means (control – ulcerated) is 34.08 - 22.59 = 11.49. Hence the 95% confidence interval is 11.49 - 2.02×2.26 to 11.49 + 2.02×2.26, giving 6.92 to 16.06 capillaries/mm². Hence there is clearly a difference in capillary density between normal controls and ulcerated patients.
To test the null hypothesis that in the population the control – ulcerated difference is zero, the test statistic is difference over standard error, 11.49/2.26 = 5.08. If the null hypothesis were true, this would be an observation from the t distribution with 40 degrees of freedom. From Table 10.1, the probability of such an extreme value is less than 0.001. Hence the data are not consistent with the null hypothesis and we can conclude that there is strong evidence of a difference in the populations which these patients represent.
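The two-sample calculation can be sketched in Python as follows. The two lists are the average capillary densities transcribed from Table 10.2, the value 2.02 is the 0.05 point for 40 d.f. from Table 10.1, and names such as `sum_of_squares` are illustrative rather than taken from any package.

```python
from math import sqrt
from statistics import mean

# Average capillary density per subject, transcribed from Table 10.2
controls = [17.5, 27.5, 27.0, 29.5, 27.0, 29.0, 34.5, 31.0, 35.5, 33.5,
            35.5, 34.0, 36.5, 38.0, 40.0, 39.5, 40.0, 40.0, 52.0]
ulcerated = [9.0, 11.0, 12.5, 18.5, 18.0, 18.0, 22.5, 20.0, 20.0, 26.5,
             23.0, 18.0, 22.5, 22.0, 23.0, 27.5, 28.5, 26.5, 27.0, 29.0,
             44.5, 24.0, 28.0]

def sum_of_squares(xs):
    """Sum of squared deviations about the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(controls), len(ulcerated)
df = n1 + n2 - 2
s2 = (sum_of_squares(controls) + sum_of_squares(ulcerated)) / df  # common variance
se = sqrt(s2 * (1 / n1 + 1 / n2))        # standard error of the difference
diff = mean(controls) - mean(ulcerated)  # control - ulcerated
t_05 = 2.02                              # 0.05 point of t, 40 d.f. (Table 10.1)

lower, upper = diff - t_05 * se, diff + t_05 * se
t_stat = diff / se
print(f"s2 {s2:.2f}, SE {se:.2f}, difference {diff:.2f}")
print(f"95% CI {lower:.2f} to {upper:.2f}, t = {t_stat:.2f} on {df} d.f.")
```

Small differences from the figures in the text in the last decimal place can arise because the text works from rounded intermediate values.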
10.4 The use of transformations
We have already seen (§7.4) that some variables which do not follow a Normal distribution can be made to do so by a suitable transformation. The same transformation can be used to make the variance similar in different groups; such transformations are called variance stabilizing transformations. Because mean and variance in samples from the same population are independent if and only if the distribution is Normal (§7A), stable variances and Normal distributions tend to go together.
Often standard deviation and mean are connected by a simple relationship of the form $s = a\bar{x}^b$, where a and b are constants. If this is so, it can be shown that the variance will be stabilized by raising the observations to the power 1 - b, unless b = 1, when we use the log. (I shall resist the temptation to prove this, though I can. Any book on mathematical statistics will do it.) Thus, if the standard deviation is proportional to the square root of the mean (i.e. variance proportional to mean), e.g. Poisson variance (§6.7), b = 0.5, 1 - b = 0.5, and we use a square root transformation. If the standard deviation is proportional to the mean we log. If the standard deviation is proportional to the square of the mean we have b = 2, 1 - b = -1, and we use the reciprocal. Another, rarely seen transformation, the arcsine square root transformation, is used when observations are Binomial proportions. Here the standard deviation increases as the proportion goes from 0.0 to 0.5, then decreases as the proportion goes from 0.5 to 1.0. Whether it works depends on how much other variation there is. It has now been largely superseded by logistic regression (§17.8).
Table 10.3. Biceps skinfold thickness (mm) in two groups of patients
Crohn's disease          Coeliac disease
1.8 2.8 4.2 6.2 1.8 3.8
2.2 3.2 4.4 6.6 2.0 4.2
2.4 3.6 4.8 7.0 2.0 5.4
2.5 3.8 5.6 10.0 2.0 7.6
2.8 4.0 6.0 10.4 3.0
Fig. 10.5. Scatter plot, histogram, and Normal plot for the biceps skinfold data
When we have several groups we can plot log(s) against log($\bar{x}$) then draw a line through the points. The slope of the line is b (see Healy 1968). Trial and error, however, combined with scatter plots, histograms, and Normal plots, usually suffice.
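The slope method can be sketched in Python as a least squares line through the points (log mean, log s), followed by the rule that the suggested power is 1 - b, or the log when b is close to 1. The group means and standard deviations below are invented purely for illustration, and the function names and the tolerance of 0.1 are assumptions of this sketch, not part of any published method.

```python
import math

def estimate_b(means, sds):
    """Least squares slope of log(s) on log(mean) across groups."""
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def suggested_transformation(b, tol=0.1):
    """Power 1 - b stabilizes the variance; use the log when b is about 1."""
    if abs(b - 1) < tol:
        return "log"
    return f"power {1 - b:g}"

# Hypothetical groups in which the standard deviation is proportional to the mean
means = [2.0, 4.0, 8.0, 16.0]
sds = [0.5, 1.0, 2.0, 4.0]

b = estimate_b(means, sds)
print(round(b, 2), suggested_transformation(b))  # 1.0 log
```

With b = 0.5 the same rule suggests the square root (power 0.5), and with b = 2 the reciprocal (power -1).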
Table 10.3 shows some data from a study of anthropometry and diagnosis in patients with intestinal disease (Maugdal et al. 1985). We were interested in differences in anthropometrical measurements between patients with different diagnoses, and here we have the biceps skinfold measurements for 20 patients with Crohn's disease and 9 patients with coeliac disease. The data have been put into order of magnitude and it is fairly obvious that the distribution is skewed to the right. Figure 10.5 shows this clearly. I have subtracted the group mean from each observation, giving what is called the within-group residuals, and then found both the frequency distribution and Normal plot. The distribution is clearly skew, and this is reflected in the Normal plot, which shows a pronounced curvature.
Fig. 10.6. Scatter plot, histogram, and Normal plot for the biceps skinfold data, after square root, log, and reciprocal transformations
We need a Normalizing transformation, if one can be found. The usual best guesses are square root, log, and reciprocal, with the log being the most likely to succeed. Figure 10.6 shows the scatter plot, histogram, and Normal plot for the residuals after transformation. (These logarithms are natural, to base e, rather than to base 10. It makes no difference to the final result and the calculations are the same to the computer.) The fit to the Normal distribution is not perfect, but for each transformation is much better than in Figure 10.5. The log looks the best for the equality of variance and the Normal distribution. We could use the two-sample t method on these data quite happily.
Table 10.4 shows the results of the two sample t method used with the raw, untransformed data and with each transformation. The t test statistic increases and its associated probability decreases as we move closer to a Normal distribution, reflecting the increasing power of the t test as its assumptions are more closely met. Table 10.4 also shows the ratio of the variances in the two samples. We can see that, as the transformed data get closer to a Normal distribution, the variances tend to become more equal also.
The transformed data clearly give a better test of significance than the raw data. The confidence intervals for the transformed data are more difficult to interpret, however, so the gain here is not so apparent. The confidence limits for the difference cannot be transformed back to the original scale. If we try it, the square root and reciprocal limits give ludicrous results. The log gives interpretable results (0.89 to 2.03) but these are not limits for the difference in millimetres. How could they be, for they do not contain zero yet the difference is not significant? They are in fact the 95% confidence limits for the ratio of the Crohn's disease geometric mean to the coeliac disease geometric mean (§7.4). If there were no difference, of course, the expected value of this ratio would be one, not zero, and so lies within the limits. The reason is that when we take the difference between the logarithms of two numbers, we get the logarithm of their ratio, not of their difference (§5A).
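The back-transformation of the log limits is a one-line calculation, sketched here in Python. The limits -0.114 and 0.706 are the natural log scale confidence limits reported for the logarithm row of Table 10.4; the variable names are illustrative.

```python
from math import exp, log, isclose

# 95% confidence limits for the difference in mean log skinfold (Table 10.4)
log_lower, log_upper = -0.114, 0.706

# Antilogs give limits for the ratio of geometric means, not a difference in mm.
ratio_lower, ratio_upper = exp(log_lower), exp(log_upper)
print(f"{ratio_lower:.2f} to {ratio_upper:.2f}")  # 0.89 to 2.03

# The underlying identity: log(a) - log(b) = log(a / b)
a, b = 4.0, 2.5
print(isclose(log(a) - log(b), log(a / b)))  # True
```

An interval for a ratio that includes one plays the same role as an interval for a difference that includes zero.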
Table 10.4. Biceps skinfold thickness compared for two groups of patients, using different transformations
Transformation   Two-sample t test, 27 d.f.   95% confidence interval for       Variance ratio,
                 t        P                   difference on transformed scale   larger/smaller
None, raw data   1.28     0.21                -0.71 to 3.07 mm                  1.52
Square root      1.38     0.18                -0.140 to 0.714                   1.16
Logarithm        1.48     0.15                -0.114 to 0.706                   1.10
Reciprocal       -1.65    0.11                -0.203 to 0.022                   1.63
Because the log transformation is the only one which gives useful confidence intervals, I would use it unless it were clearly inadequate for the data, and another transformation clearly superior. When this happens we are reduced to a significance test only, with no meaningful estimate.
10.5 Deviations from the assumptions of t methods
The methods described in this chapter depend on some strong assumptions about the distributions from which the data come. This often worries users of statistical methods, who feel that these assumptions must limit greatly the use of t distribution methods and find the attitude of many statisticians, who often use methods based on Normal assumptions almost as a matter of course, rather sanguine. We shall look at some consequences of deviations from the assumptions.
First we shall consider a non-Normal distribution. As we have seen, some variables conform very closely to the Normal distribution, others do not. Deviations occur in two main ways: grouping and skewness. Grouping occurs when a continuous variable, such as human height, is measured in units which are fairly large relative to the range. This happens, for example, if we measure human height to the nearest inch. The heights in Figure 10.2 were to the nearest inch, and the fit to the t distribution is very good. This was a very coarse grouping, as the standard deviation of heights was 2.5 inches and so 95% of the 3000 observations had values over a range of 10 inches, only 10 or 11 possible values in all. We can see from this that if the underlying distribution is Normal, rounding the measurement is not going to affect the application of the t distribution by much.
The other assumption of the two-sample t method is that the variances in the two populations are the same. If this is not correct, the t distribution will not necessarily apply. The effect is usually small if the two populations are from a Normal distribution. This situation is unusual because, for samples from the same population, mean and variance are independent if the distribution is Normal (§7A). There is an approximate t method, as we noted in §10.3. However, unequal variance is more often associated with skewness in the data, in which case a transformation designed to correct one fault often tends to correct the other as well.
Both the paired and two-sample t methods are robust to most deviations from the assumptions. Only large deviations are going to have much effect on these methods. The main problem is with skewed data in the one-sample method, but for reasons given in §10.2, the paired test will usually provide differences with a reasonable distribution. If the data do appear to be non-Normal, then a Normalizing transformation will improve matters. If this does not work, then we must turn to methods which do not require these assumptions (§9.2, §12.2, §12.3).
10.6 What is a large sample?
In this chapter we have looked at small sample versions of the large sample methods of §8.5 and §9.7. There we ignored both the distribution of the variable and the variability of s², on the grounds that they did not matter provided the samples were large. How small can a large sample be? This question is critical to the validity of these methods, but seldom seems to be discussed in textbooks.
Provided the assumptions of the t test apply, the question is easy enough to answer. Inspection of Table 10.1 will show that for 30 degrees of freedom the 5% point is 2.04, which is so close to the Normal value of 1.96 that it makes little difference which is used. So for Normal data with uniform variance we can forget the t distribution when we have more than 30 observations.
When the data are not in this happy state, things are not so simple. If the t method is not valid, we cannot assume that a large sample method which approximates to it will be valid. I recommend the following rough guide. First, if in doubt, treat the sample as small. Second, transform to a Normal distribution if possible. In the paired case you should transform before subtraction. Third, the more non-Normal the data, the larger the sample needs to be before we can ignore errors in the Normal approximation.
Table 10.5. Blood zidovudine levels at times after administration of the drug by presence of fat malabsorption
Time since administration of zidovudine (min)
0 15 30 45 60 90 120 150
Malabsorption patients:
0.08 13.15 5.70 3.22 2.69 1.91 1.72 1.22
0.08 0.08 0.14 2.10 6.37 4.89 2.11 1.40
0.08 0.08 3.29 3.47 1.42 1.61 1.41 1.09
0.08 0.08 1.33 1.71 3.30 1.81 1.16 0.69
0.08 6.69 8.27 5.02 3.98 1.90 1.24 1.01
0.08 4.28 4.92 1.22 1.17 0.88 0.34 0.24
0.08 0.13 9.29 6.03 3.65 2.32 1.25 1.02
0.08 0.64 1.19 1.65 2.37 2.07 2.54 1.34
0.08 2.39 3.53 6.28 2.61 2.29 2.23 1.97
Normal absorption patients:
0.08 3.72 16.02 8.17 5.21 4.84 2.12 1.50
0.08 6.72 5.48 4.84 2.30 1.95 1.46 1.49
0.08 9.98 7.28 3.46 2.42 1.69 0.70 0.76
0.08 1.12 7.27 3.77 2.97 1.78 1.27 0.99
0.08 13.37 17.61 3.90 5.53 7.17 5.16 3.84
There is no simple answer to the question: ‘how large is a large sample?’. We should be reasonably safe with inferences about means if the sample is greater than 100 for a single sample, or if both samples are greater than 50 for two samples. The application of statistical methods is a matter of judgement as well as knowledge.
10.7* Serial data
Table 10.5 shows levels of zidovudine (AZT) in the blood of AIDS patients at several times after administration of the drug, for patients with normal fat absorption or fat malabsorption. A line graph of these data was shown in Figure 5.6. One common approach to such data is to carry out a two-sample t test at each time separately, and researchers often ask at what time the difference becomes significant. This is a misleading question, as significance is a property of the sample rather than the population. The difference at 15 min may not be significant because the sample is small and the difference to be detected is small, not because there is no difference in the population. Further, if we do this for each time point we are carrying out multiple significance tests (§9.10) and each test only uses a small part of the data so we are losing power (§9.9). It is better to ask whether there is any evidence of a difference between the response of normal and malabsorption subjects over the whole period of observation.
The simplest approach is to reduce the data for a subject to one number. We can use the highest value attained by the subject, the time at which this peak value was reached, or the area under the curve. The first two are self-explanatory.
The area under the curve (AUC) is found by drawing a line through all the points and finding the area between it and the horizontal axis. The ‘curve’ is usually formed by a series of straight lines found by joining all the points for the subject, and Figure 10.7 shows this for the first subject in Table 10.5. The area under the curve can be calculated by taking each straight line segment and calculating the area under this. This is the base multiplied by the average of the two vertical heights. We calculate this for each line segment, i.e. between each pair of adjacent time points, and add. Thus for the first subject we get (15 - 0)×(0.08 + 13.15)/2 + (30 - 15)×(13.15 + 5.70)/2 + … + (360 - 300)×(0.43 + 0.32)/2 = 667.425. This can be done fairly easily by most statistical computer packages. The area for each subject is shown in Table 10.6.
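The segment-by-segment calculation can be sketched in Python as below. The function name `area_under_curve` is illustrative. Only the time points up to 150 minutes are reproduced in Table 10.5 as shown here, so the example gives the partial area for the first subject over 0 to 150 minutes, not the full figure of 667.425, which uses later time points as well.

```python
def area_under_curve(times, values):
    """Area under straight lines joining the points: base times average height, summed."""
    return sum((t1 - t0) * (y0 + y1) / 2
               for t0, t1, y0, y1 in zip(times, times[1:], values, values[1:]))

times = [0, 15, 30, 45, 60, 90, 120, 150]                         # minutes
first_subject = [0.08, 13.15, 5.70, 3.22, 2.69, 1.91, 1.72, 1.22]  # first row of Table 10.5

partial_auc = area_under_curve(times, first_subject)
print(round(partial_auc, 3))  # 519.375
```

The first two terms of the sum are the (15 - 0)×(0.08 + 13.15)/2 and (30 - 15)×(13.15 + 5.70)/2 of the worked example.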
Fig. 10.7. Calculation of the area under the curve for one subject
Table 10.6. Area under the curve for data of Table 10.5
Malabsorption patients    Normal patients
667.425 256.275 919.875
569.625 527.475 599.850
306.000 388.800 499.500
298.200 505.875 472.875
617.850 1377.975
Fig. 10.8. Normal plots for area under the curve and log area for the data of Table 10.5
10.8* Comparing two variances by the F test
We can test the null hypothesis that two population variances are equal using the F distribution. Provided the data are from a Normal distribution, the ratio of two independent estimates of the same variance will follow an F distribution (§7A), the degrees of freedom being the degrees of freedom of the two estimates. The F distribution is defined as that of the ratio of two independent Chi-squared variables divided by their degrees of freedom:
$$F_{m,n} = \frac{\chi^2_m / m}{\chi^2_n / n}$$
where m and n are the degrees of freedom (§7A). For Normal data the distribution of a sample variance s² from n observations is that of $\sigma^2\chi^2_{n-1}/(n-1)$ and when we divide one estimate of variance by another to give the F ratio, the σ² cancels out. Like other distributions derived from the Normal, the F distribution cannot be integrated and so we must use a table. Because it has two degrees of freedom, the table is cumbersome, covering several pages, and I shall omit it. Most F methods are done using computer programs which calculate the probability directly. The table is usually only given as the upper percentage points.
To test the null hypothesis, we divide the larger variance by the smaller. For the skinfold data of §10.4, the variances are 5.860 with 19 degrees of freedom for the Crohn's patients and 3.860 with 8 degrees of freedom for the coeliacs, giving F = 5.860/3.860 = 1.52. The probability of this being exceeded by the F distribution with 19 and 8 degrees of freedom is 0.3, the 5% point of the distribution being 3.16, so there is no evidence from these data that the variance of skinfold differs between patients with Crohn's disease and coeliac disease.
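The variance ratio can be reproduced from the skinfold measurements with a short Python sketch. The two lists are transcribed from Table 10.3, the 5% point of 3.16 is the value quoted in the text rather than something computed here, and the variable names are illustrative.

```python
from statistics import variance

# Biceps skinfold thickness (mm), transcribed from Table 10.3
crohns = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
          4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]

var_crohns = variance(crohns)     # 19 degrees of freedom
var_coeliac = variance(coeliac)   # 8 degrees of freedom

# Larger variance over smaller, as in the text.
f_ratio = max(var_crohns, var_coeliac) / min(var_crohns, var_coeliac)
five_percent_point = 3.16         # quoted in the text for 19 and 8 d.f.

print(round(var_crohns, 3), round(var_coeliac, 3))
print(round(f_ratio, 2), f_ratio > five_percent_point)  # 1.52 False
```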
Several variances can be compared by Bartlett's test or the Levene test (see Armitage and Berry 1994, Snedecor and Cochran 1980).
Table 10.7. Mannitol and lactulose gut permeability tests in a group of HIV patients and controls
HIV status Diarrhoea %mannitol %lactulose HIV status Diarrhoea
AIDS Yes 14.9 1.17 ARC Yes
AIDS Yes 7.074 1.203 ARC No
AIDS Yes 5.693 1.008 ARC No
AIDS Yes 16.82 0.367 HIV+ No
AIDS Yes 4.93 1.13 HIV+ No
AIDS Yes 9.974 0.545 HIV+ No
AIDS Yes 2.069 0.14 HIV+ No
AIDS Yes 10.9 0.86 HIV+ No
AIDS Yes 6.28 0.08 HIV+ No
AIDS Yes 11.23 0.398 HIV+ No
AIDS No 13.95 0.6 HIV- No
AIDS No 12.455 0.4 HIV- No
AIDS No 10.45 0.18 HIV- No
AIDS No 8.36 0.189 HIV- No
AIDS No 7.423 0.175 HIV- No
AIDS No 2.657 0.039 HIV- No
AIDS No 19.95 1.43 HIV- No
AIDS No 15.17 0.2 HIV- No
AIDS No 12.59 0.25 HIV- No
AIDS No 21.8 1.15 HIV- No
AIDS No 11.5 0.36 HIV- No
AIDS No 10.5 0.33 HIV- No
AIDS No 15.22 0.29 HIV- No
AIDS No 17.71 0.47 HIV- No
AIDS Yes 7.256 0.252 HIV- No
AIDS No 17.75 0.47 HIV- No
ARC Yes 7.42 0.21 HIV- No
ARC Yes 9.174 0.399 HIV- No
ARC Yes 9.77 0.215 HIV- No
ARC No 22.03 0.651
10.9* Comparing several means using analysis of variance
Consider the data of Table 10.7. These are measures of gut permeability obtained from four groups of subjects, diagnosed with AIDS, AIDS related complex (ARC), asymptomatic HIV positive, and HIV negative controls. We want to investigate the differences between the groups.
One approach would be to use the t test to compare each pair of groups. This has disadvantages. First, there are many comparisons, m(m-1)/2 where m is the number of groups. The more groups we have, the more likely it is that two of them will be far enough apart to produce a ‘significant’ difference when the null hypothesis is true and the population means are the same (§9.10). Second, when groups are small, there may not be many degrees of freedom for the estimate of variance. If we can use all the data to estimate variance we will have more degrees of freedom and hence a more powerful comparison. We can do this by analysis of variance, which compares the variation between the groups to the variation within the groups.
Table 10.8. Some artificial data to illustrate how analysis of variance works
Group1 Group2 Group3 Group4
6 4 7 3
7 5 9 5
8 6 10 6
8 6 11 6
9 6 11 6
11 8 13 8
Mean 8.167 5.833 10.167 5.667
To illustrate how the analysis of variance, or anova, works, I shall use some artificial data, as set out in Table 10.8. In practice, equal numbers in each group are unusual in medical applications. We start by estimating the common variance within the groups, just as we do in a two-sample t test (§10.3). We find the sum of squares about the group mean for each group and add them. We call this the within groups sum of squares. For Table 10.8 this gives 57.833. For each group we estimate the mean from the data, so we have estimated 4 parameters and have 24 - 4 = 20 degrees of freedom. In general, for m groups of size n each we have nm - m = m(n-1) degrees of freedom. This gives us an estimate of variance of
$$s^2 = \frac{57.833}{20} = 2.892$$
This is the within groups variance or residual variance. There is an assumption here. For a common variance, we assume that the variances are the same in the four populations represented by the four groups.
We can also find an estimate of variance from the group means. The variance of the four group means is 4.562. If there were no difference between the means in the population from which the sample comes, this variance would be the variance of the sampling distribution of the mean of n observations, which is $s^2/n$, the square of the standard error (§8.2). Thus n times this variance should estimate the same quantity as the within groups variance. For the example, this is 4.562 × 6 = 27.375, which is much greater than the 2.892 found within the groups. We express this by the ratio of one variance estimate to the other, between groups over within groups, which we call the variance ratio or F ratio. If the null hypothesis is true and if the observations are from a Normal distribution with uniform variance, this ratio follows a known distribution, the F distribution with m-1 and m(n-1) degrees of freedom (§10.8).
For the example we would have 3 and 20 degrees of freedom and
$$F_{3,20} = \frac{27.375}{2.892} = 9.47$$
If the null hypothesis were true, the expected value of this ratio would be about 1.0. A large value gives us evidence of a difference between the means in the four populations. For the example we have a large value of 9.47 and the probability of getting a value as big as this if the null hypothesis were true would be 0.0004. Thus there is a significant difference between the four groups.
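The calculation described above can be checked with a short Python sketch using only the standard library. The function name `one_way_anova` and the dictionary keys are illustrative choices, not part of any package, and the probability is not computed because the standard library has no F distribution.

```python
from statistics import mean

# Artificial data of Table 10.8, one list per group.
groups = [
    [6, 7, 8, 8, 9, 11],
    [4, 5, 6, 6, 6, 8],
    [7, 9, 10, 11, 11, 13],
    [3, 5, 6, 6, 6, 8],
]


def one_way_anova(groups):
    """Sums of squares, mean squares and F ratio for a one-way layout."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Within groups: squared deviations from each group's own mean.
    within_ss = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    )
    # Between groups: squared deviations of group means from the overall
    # mean, weighted by group size (so unequal groups are allowed).
    between_ss = sum(
        len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means)
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = between_ss / df_between
    ms_within = within_ss / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "between_ss": between_ss,
        "within_ss": within_ss,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


anova = one_way_anova(groups)
for name, value in anova.items():
    print(f"{name}: {value:.3f}")
```

Weighting each group mean by its own group size is why the same function also works when, as with the mannitol data, the groups are unequal.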
Table 10.9. One-way analysis of variance for the data of Table 10.8
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 23                   139.958
Between groups        3                    82.125           27.375        9.47                 0.0004
Within groups         20                   57.833           2.892
Table 10.10. One-way analysis of variance for the mannitol data
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 58                   1559.036
Between groups        3                    49.012           16.337        0.6
Residual              55                   1510.024         27.455
We can set these calculations out in an analysis of variance table, as shown in Table 10.9. The sum of squares in the ‘between groups’ row is the sum of squares of the group means about the overall mean, times n. We call this the between groups sum of squares. Notice that in the ‘degrees of freedom’ and ‘sum of squares’ columns the ‘within groups’ and ‘between groups’ rows add up to the total. The within groups sum of squares is also called the residual sum of squares, because it is what is left when the group effect is removed, or the error sum of squares, because it measures the random variation or error remaining when all systematic effects have been removed.
The sum of squares of the whole data, ignoring the groups, is called the total sum of squares. It is the sum of the between groups and within groups sums of squares.
Returning to the mannitol data, as so often happens the groups are of unequal size. The calculation of the between groups sum of squares becomes more complicated and we usually do it by subtracting the within groups sum of squares from the total sum of squares. Otherwise, the table is the same, as shown in Table 10.10. As these calculations are usually done by computer the extra complexity in calculation does not matter. Here there is no significant difference between the groups.
If we have only two groups, one-way analysis of variance is another way of doing a two-sample t test. For example, the analysis of variance table for the comparison of average capillary density (§10.3) is shown in Table 10.11. The probability is the same and the F ratio, 25.78, is the square of the t statistic, 5.08. The residual mean square is the common variance of the t test.
Table 10.11. One-way analysis of variance for the comparison of mean capillary density between ulcerated patients and controls, Table 10.2
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 41                   3506.57
Between groups        1                    1374.114         1374.114      25.78                <0.0001
Residual              40                   2132.458         53.311
Fig. 10.9. Plots of the mannitol data, showing that the assumptions of Normal distribution and homoscedasticity are reasonable
10.10* Assumptions of the analysis of variance
There are two assumptions for analysis of variance: that data come from Normal distributions within the groups and that the variances of these distributions are the same. The technical term for uniformity of variance is homoscedasticity; lack of uniformity is heteroscedasticity. Heteroscedasticity can affect analyses of variance a lot and we try to guard against it.
We can examine these assumptions graphically. For mannitol (Figure 10.9) the scatter plot for the groups shows that the spread of data in each group is similar, suggesting that the assumption of uniform variance is met, the histogram looks Normal and the Normal plot looks straight. This is not the case for the lactulose data, as Figure 10.10 shows. The variances are not uniform and the histogram and Normal plot suggest positive skewness. As is often the case, the group with the highest mean, AIDS, has the greatest spread. The square root transformation of the lactulose fits better, giving a good Normal distribution although the variability is not uniform. The log transform over-compensates for skewness, by producing skewness in the opposite direction, though the variances appear uniform. Either the square root or the logarithmic transformation would be better than the raw data. I picked the square root because the distribution looked better. Table 10.12 shows the analysis of variance for square root transformed lactulose.
There are also significance tests which we can apply for Normal distribution and homoscedasticity. I shall omit the details.
10.11* Comparison of means after analysis of variance
Concluding from Tables 10.9 and 10.12 that there is a significant difference between the means is rather unsatisfactory. We want to know which means differ from which. There are a number of ways of doing this, called multiple comparisons procedures. These are mostly designed to give only one type I error (§9.3) per 20 analyses when the null hypothesis is true, as opposed to doing t tests for each pair of groups, which gives one error per 20 comparisons when the null hypothesis is true. I shall not go into details, but look at a couple of examples. There are several tests which can be used when the numbers in each group are the same: Tukey's Honestly Significant Difference, the Newman-Keuls sequential procedure (both called Studentized range tests), Duncan's multiple range test, etc. The one you use will depend on which computer program you have. The results of the Newman-Keuls sequential procedure for the data of Table 10.8 are shown in Table 10.13. At the 5% level, group 1 is significantly different from groups 2 and 4, and group 3 from groups 2 and 4. At the 1% level, the only significant differences are between group 3 and groups 2 and 4.
Fig. 10.10. Plots of the lactulose data on the natural scale and after square root and log transformation
Table 10.12. One-way analysis of variance for the square root transformed lactulose data of Table 10.7
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 58                   3.25441
HIV status            3                    0.42870          0.14290       2.78                 0.0495
Residual              55                   2.82571          0.05138
For unequal-sized groups, the choice of multiple comparison procedures is more limited. Gabriel's test can be used with unequal-sized groups. For the root transformed lactulose data, the results of Gabriel's test are shown in Table 10.14. This shows that the AIDS subjects are significantly different from the asymptomatic HIV+ patients and from the HIV- controls. For the mannitol data, most multiple comparison procedures will give no significant differences because they are designed to give only one type I error per analysis of variance. When the F test is not significant, no group comparisons will be either.
Table 10.13. The Newman-Keuls test for the data of Table 10.8
        0.05 level                 0.01 level
Group   Group                      Group   Group
        1    2    3                        1    2    3
2       S                          2       N
3       N    S                     3       N    S
4       S    N    S                4       N    N    S
S = significant, N = not significant.
Table 10.14. Gabriel's test for the root transformed lactulose data
        0.05 level                 0.01 level
Group   Group                      Group   Group
        AIDS  ARC  HIV+                    AIDS  ARC  HIV+
ARC     N                          ARC     N
HIV+    S     N                    HIV+    N     N
HIV-    S     N    N               HIV-    N     N    N
S = significant, N = not significant.
10.12* Random effects in analysis of variance
Although the technique is called analysis of variance, in §10.9 to §10.11 we have been using it for the comparison of means. In this section we shall look at another application, where we shall indeed use anova to look at variances. When we estimate and compare the means of groups representing different diagnoses, different treatments, etc., we call these fixed effects. In other applications, groups are members of a random sample from a larger population and, rather than estimate the mean of each group, we estimate the variance between them. We call these groups random effects.
Consider Table 10.15, which shows repeated measurements of pulse rate on a group of medical students. Each measurement was made by a different observer. Observations made repeatedly under the same circumstances are called replicates and here we have two replicates per subject. We can do a one-way analysis of variance on these data, with subject as the grouping factor (Table 10.16).
The test of significance in Table 10.16 is redundant, because we know each pair of measurements is from a different person, and the null hypothesis that all pairs are from the same population is clearly false. What we can use this anova for is to estimate some variances. There are two different variances in the data. One is between measurements on the same person, the within-subject variance, which we shall denote by $\sigma_w^2$. In this example the within-subject variance is the measurement error, and we shall assume it is the same for everyone. The other is the variance between the subjects' true or average pulse rates, about which the individual measurements for a subject are distributed. A subject's true pulse rate is the average of all possible measurements for that subject, not the average of the two measurements we actually have. This variance is the between-subjects variance and we shall denote it by $\sigma_b^2$. A single measurement observed from a single individual is the sum of the subject's true pulse rate and the measurement error. Such measurements therefore have variance $\sigma_b^2 + \sigma_w^2$. We can estimate both these variances from the anova table.
Table 10.15. Paired measurements of 30 second pulse in 45 medical students
Subject  Pulse A  Pulse B  Subject  Pulse A  Pulse B  Subject  Pulse A  Pulse B
1 46 42 16 34 36 31 43 43
2 50 42 17 30 36 32 30 29
3 39 37 18 35 45 33 31 36
4 40 54 19 32 34 34 43 43
5 41 46 20 44 46 35 38 43
6 35 35 21 39 42 36 31 37
7 31 44 22 34 37 37 45 43
8 43 35 23 36 38 38 39 43
9 47 45 24 33 34 39 48 48
10 48 36 25 34 35 40 40 40
11 32 46 26 51 48 41 46 45
12 36 34 27 31 30 42 44 42
13 37 30 28 30 31 43 36 34
14 34 36 29 42 43 44 33 28
15 38 36 30 39 35 45 39 42
Table 10.16. One-way analysis of variance for the 30 second pulse data of Table 10.15
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 89                   3054.99
Between subjects      44                   2408.49          54.74         3.81                 <0.0001
Within subjects       45                   646.50           14.37
For the simple example of the same number of replicates m on each of n subjects, the estimation of the variances is quite simple. We estimate $\sigma_w^2$ directly from the mean square within subjects, MSw, giving an estimate $s_w^2$. We can show (although I shall omit it) that the mean square between subjects, MSb, is an estimate of $m\sigma_b^2 + \sigma_w^2$. The variance ratio, F = MSb/MSw, will be expected to be 1.0 if $\sigma_b^2 = 0$, i.e. if the null hypothesis that all subjects are the same is true. We can estimate $\sigma_b^2$ by $s_b^2 = (\mathrm{MSb} - \mathrm{MSw})/m$.
For the example, $s_w^2 = 14.37$ and $s_b^2 = (54.74 - 14.37)/2 = 20.19$. Thus the variability between measurements by different observers on the same subject is not much less than the variability of the underlying pulse rate between different subjects. The measurement (by these untrained and inexperienced observers) does not tell us much about the subjects. We shall see a practical application in the study of measurement error and observer variation in §15.2, and consider another aspect of this analysis, intraclass correlation, in §11.13.
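As a small sketch of the estimation just described, the two variance components can be computed from the mean squares of Table 10.16. The function name `variance_components` is hypothetical, and the code assumes the balanced case of m replicates on every subject.

```python
def variance_components(ms_between, ms_within, replicates):
    """Estimate within- and between-subject variances in the balanced case.

    MS within estimates sigma_w^2; MS between estimates
    replicates * sigma_b^2 + sigma_w^2.
    """
    s2_within = ms_within
    s2_between = (ms_between - ms_within) / replicates
    return s2_within, s2_between


# Mean squares from Table 10.16, two replicates per subject.
s2_w, s2_b = variance_components(ms_between=54.74, ms_within=14.37, replicates=2)

# Variance of a single measurement on a single subject.
s2_single = s2_b + s2_w

print(f"within-subject variance  s2_w = {s2_w:.2f}")
print(f"between-subject variance s2_b = {s2_b:.2f}")
print(f"variance of one measurement   = {s2_single:.2f}")
```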
Table 10.17. Number of X-ray requests conforming to the guidelines for each practice in the intervention and control groups (Oakeshott et al. 1994)
Intervention group                                   Control group
Total requests  Conforming  Percentage conforming    Total requests  Conforming
20 20 100 7 7
7 7 100 37 33
16 15 94 38 32
31 28 90 28 23
20 18 90 20 16
24 21 88 19 15
7 6 86 9 7
6 5 83 25 19
30 25 83 120 90
66 53 80 89 64
5 4 80 22 15
43 33 77 76 52
43 32 74 21 14
23 16 70 127 83
64 44 69 22 14
6 4 67 34 21
18 10 56 10 4
Total 429 341 704 509
Mean 81.6
SD 11.9
If we have different numbers of replicates per subject or other factors to consider (e.g. if each observer made two repeated measurements) the analysis becomes fiendishly complicated (see Searle et al. 1992, if you must). These estimates of variance deserve confidence intervals like any other estimate, but these are even more fiendishly complicated, as Burdick and Graybill (1992) convincingly demonstrate. I would recommend you consult a statistician experienced in these matters, if you can find one.
10.13* Units of analysis and cluster-randomized trials
A cluster-randomized study (§2.11) is one where a group of subjects, such as the patients in a hospital ward or a general practice list, are randomized to the same treatment together. The treatment might be applied to the patient directly, such as an offer of breast cancer screening to all eligible women in a district, or be applied to the care provider, such as treatment guidelines given to the GP. The design of the study must be taken into account in the analysis.
Table 10.17 shows an example (Oakeshott et al. 1994, Kerry and Bland 1998).
Fig. 10.11. Scatter plots and Normal plots for the data of Table 10.17, showing the effect of an arcsine square root transformation
In this study guidelines as to appropriate referral for X-ray were given to GPs in 17 practices and another 17 practices served as controls. We could say we have 341 out of 429 appropriate referrals in the treated group and 509 out of 704 in the control group and compare these proportions as in §8.6 and §9.8. This would be wrong, because to follow a Binomial distribution, all the referrals must be independent (§6.4). They are not, as the individual GP may have a profound effect on the decision to refer. Even where the practitioner is not directly involved, members of a cluster may be more similar to one another than they are to members of another cluster and so not be independent. Ignoring the clustering may result in confidence intervals which are too narrow and P values which are too small, producing spurious significant differences.
The easiest way to analyse the data from such studies is to make the experimental unit, that which is randomized (§2.11), the unit of analysis. We can construct a summary statistic for each cluster and then analyse these summary values. The idea is similar to the analysis of repeated measurements on the same subject, where we construct a single summary statistic over the times for each individual (§10.7). For Table 10.17, the practice's percentage of referrals which are appropriate is the summary statistic. The mean percentages in the two groups can then be compared by the two-sample t method. The observed difference is 81.6 - 73.6 = 8.0 and the standard error of the difference is 4.3. There are 32 degrees of freedom and, from Table 10.1, the 5% point of the t distribution is 2.04. This gives a 95% confidence interval for the treatment difference of 8.0 ± 2.037 × 4.3, or -1 to 17 percentage points. For the test of significance, the test statistic is 8.0/4.3 = 1.86, P = 0.07.
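The summary-statistic analysis above can be sketched in Python with the standard library only. The function names are illustrative. The percentages are recomputed here from the totals and conforming counts of Table 10.17 without rounding, which is an assumption on my part, so the results agree with the quoted figures only to about the first decimal place. The t value 2.037 is taken from the text, not computed.

```python
from math import sqrt
from statistics import mean, variance

# (total requests, conforming requests) for each practice, Table 10.17.
intervention = [(20, 20), (7, 7), (16, 15), (31, 28), (20, 18), (24, 21),
                (7, 6), (6, 5), (30, 25), (66, 53), (5, 4), (43, 33),
                (43, 32), (23, 16), (64, 44), (6, 4), (18, 10)]
control = [(7, 7), (37, 33), (38, 32), (28, 23), (20, 16), (19, 15),
           (9, 7), (25, 19), (120, 90), (89, 64), (22, 15), (76, 52),
           (21, 14), (127, 83), (22, 14), (34, 21), (10, 4)]


def percentages(practices):
    """One summary statistic per cluster: percentage of requests conforming."""
    return [100 * conforming / total for total, conforming in practices]


def two_sample_t(a, b):
    """Difference in means, its standard error, t statistic and df,
    using the pooled (common) variance of the two-sample t method."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(a) - mean(b)
    return diff, se, diff / se, df


diff, se, t_stat, df = two_sample_t(percentages(intervention),
                                    percentages(control))

T_POINT = 2.037  # 5% point of t with 32 df, as quoted in the text
lower, upper = diff - T_POINT * se, diff + T_POINT * se

print(f"difference = {diff:.1f}, SE = {se:.1f}, t = {t_stat:.2f}, df = {df}")
print(f"95% CI: {lower:.1f} to {upper:.1f} percentage points")
```

The practice, not the individual referral, is the unit here: each group contributes 17 observations, however many referrals lie behind them.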
In this example, each observation is a Binomial proportion, so we could consider an arcsine square root transformation of the proportions (§10.4). As Figure 10.11 shows, if anything the transformation makes the fit to the Normal distribution worse. This is reflected in a larger P value, giving P = 0.10.
There is a widely varying number of referrals between practices, which must reflect the list size and number of GPs in the practice. We can take this into account with an analysis which weights each observation by the numbers of referrals. Bland and Kerry (1998) give details.
Appendices
10A Appendix: The ratio mean/standard error
As if by magic, we have our sample mean over its standard error. I shall not bother to go into this detail for the other similar ratios which we shall encounter. Any quantity which follows a Normal distribution with mean zero (such as $\bar{x} - \mu$), divided by its standard error, will follow a t distribution provided the standard error is based on one sum of squares and hence is related to the Chi-squared distribution.
10M Multiple choice questions 50 to 56
(Each branch is either true or false)
50. The paired t test is:
(a) impractical for large samples;
(b) useful for the analysis of qualitative data;
(c) suitable for very small samples;
(d) used for independent samples;
(e) based on the Normal distribution.
51. Which of the following conditions must be met for a valid t test between the means of two samples:
(a) the numbers of observations must be the same in the two groups;
(b) the standard deviations must be approximately the same in the two groups;
(c) the means must be approximately equal in the two groups;
(d) the observations must be from approximately Normal distributions;
(e) the samples must be small.
52. In a two-sample clinical trial, one of the outcome measures was highly skewed. To test the difference between the levels of this measure in the two groups of patients, possible approaches include:
(a) a standard t test using the observations;
(b) a Normal approximation if the sample is large;
(c) transforming the data to a Normal distribution and using a t test;
(d) a sign test;
(e) the standard error of the difference between two proportions.
53. In the two-sample t test, deviation from the Normal distribution by the data may seriously affect the validity of the test if:
(a) the sample sizes are equal;
(b) the distribution followed by the data is highly skewed;
(c) one sample is much larger than the other;
(d) both samples are large;
(e) the data deviate from a Normal distribution because the measurement unit is large and only a few values are possible.
Table 10.18. Semen analyses for successful and unsuccessful sperm donors (Paraskevaides et al. 1991)
                          Successful donors        Unsuccessful donors
                          n    Mean    (sd)        n    Mean    (sd)
Volume (ml)               17   3.14    (1.28)      19   2.91    (0.91)
Semen count (10^6/ml)     18   146.4   (95.7)      19   124.8   (81.8)
% Motility                17   60.7    (9.7)       19   58.5    (12.8)
% Abnormal morphology     13   22.8    (8.4)       16   20.3    (8.5)
All differences not significant, t test.
54. Table 10.18 shows a comparison of successful (i.e. fertile) and unsuccessful artificial insemination donors. The authors concluded that ‘Conventional semen analysis may be too insensitive an indicator of high fertility [in AID]’:
(a) the table would be more informative if P values were given;
(b) the t test is important to the conclusion given;
(c) it is likely that semen count follows a Normal distribution;
(d) if the null hypothesis were true, the sampling distribution of the t test statistic for semen count would approximate to a t distribution;
(e) if the null hypothesis were false, the power of the t test for semen count could be increased by a log transformation.
55. If we take samples of size n from a Normal distribution and calculate the sample mean $\bar{x}$ and variance $s^2$:
(a) samples with large values of $\bar{x}$ will tend to have large $s^2$;
(b) the sampling distribution of $\bar{x}$ will be Normal;
(c) the sampling distribution of $s^2$ will be related to the Chi-squared distribution with n-1 degrees of freedom;
(e) the sampling distribution of s will be approximately Normal if n > 20.
56. In the one-way analysis of variance table for the comparison of three groups:
(a) the group mean square + the error mean square = the total mean square;
(b) there are two degrees of freedom for groups;
(c) the group sum of squares + the error sum of squares = the total sum of squares;
(d) the numbers in each group must be equal;
(e) the group degrees of freedom + the error degrees of freedom = the total degrees of freedom.
10E Exercise: The paired t method
Table 10.19 shows the total static compliance of the respiratory system and the arterial oxygen tension (pa(O2)) in 16 patients in intensive care (Al-Saady, personal communication). The patients' breathing was assisted by a respirator and the question was whether their respiration could be improved by varying the characteristics of the air flow. Table 10.19 compares a constant inspiratory flow waveform with a decelerating inspiratory flow waveform. We shall examine the effect of waveform on compliance.
Table 10.19. pa(O2) and compliance for two inspiratory flow waveforms
Patient   pa(O2) (kPa)               Compliance (ml/cm H2O)
          Constant   Decelerating    Constant   Decelerating
1 9.1 10.8 65.4 72.9
2 5.6 5.9 73.7 94.4
3 6.7 7.2 37.4 43.3
4 8.1 7.9 26.3 29.0
5 16.2 17.0 65.0 66.4
6 11.5 11.6 35.2 36.4
7 7.9 8.4 24.7 27.7
8 7.2 10.0 23.0 27.5
9 17.7 22.3 133.2 178.2
10 10.5 11.1 38.4 39.3
11 9.5 11.1 29.2 31.8
12 13.7 11.7 28.3 26.9
13 9.7 9.0 46.6 45.0
14 10.5 9.9 61.5 58.2
15 6.9 6.3 25.7 25.7
16 18.1 13.9 48.7 42.3
1. Calculate the changes in compliance. Find a stem and leaf plot (hint: you will need both a zero and a minus zero row).
2. As a check on the validity of the t method, plot the difference against the subject's mean compliance. Do they appear to be related?
3. Calculate the mean, variance, standard deviation and standard error of the mean for the compliance differences.
4. Even though the compliance differences are far from a Normal distribution, calculate the 95% confidence interval using the t distribution. We will compare this with that for transformed data.
5. Find the logarithms of the compliance and repeat steps 1 to 3. Do the assumptions of the t distribution method apply more closely?
6. Calculate the 95% confidence interval for the log difference and transform back to the original scale. What does this mean and how does it compare to that based on the untransformed data?
7. What can be concluded about the effect of inspiratory waveform on static compliance in intensive care patients?
11 Regression and correlation
11.1 Scatter diagrams
In this chapter I shall look at methods of analysing the relationship between two quantitative variables. Consider Table 11.1, which shows data collected by a group of medical students in a physiology class. Inspection of the data suggests that there may be some relationship between FEV1 and height. Before trying to quantify this relationship, we can plot the data and get an idea of its nature. The usual first plot is a scatter diagram (§5.6). Which variable we choose for which axis depends on our ideas as to the underlying relationship between them, as discussed below. Figure 11.1 shows the scatter diagram for FEV1 and height.
Inspection of Figure 11.1 suggests that FEV1 increases with height. The next step is to try to draw a line which best represents the relationship. The simplest line is a straight one; I shall consider more complicated relationships in Chapter 17.
The equation of a straight line relationship between variables x and y is y = a + bx, where a and b are constants. The first, a, is called the intercept. It is the value of y when x is 0. The second, b, is called the slope or gradient of the line. It is the increase in y corresponding to an increase of one unit in x. Their geometrical meaning is shown in Figure 11.2. We can find the values of a and b which best fit the data by regression analysis.
11.2 Regression
Regression is a method of estimating the numerical relationship between variables. For example, we would like to know what is the mean or expected FEV1 for students of a given height, and what increase in FEV1 is associated with a unit increase in height.
Table 11.1. FEV1 and height for 20 male medical students
Height (cm)  FEV1 (litres)  Height (cm)  FEV1 (litres)  Height (cm)  FEV1 (litres)
164.0 3.54 172.0 3.78 178.0 2.98
167.0 3.54 174.0 4.32 180.7 4.80
170.4 3.19 176.0 3.75 181.0 3.96
171.2 2.85 177.0 3.09 183.1 4.78
171.2 3.42 177.0 4.05 183.6 4.56
171.3 3.20 177.0 5.43 183.7 4.68
172.0 3.60 177.4 3.60
Fig. 11.1. Scatter diagram showing the relationship between FEV1 and height for a group of male medical students
Fig. 11.2. Coefficients of a straight line
The name ‘regression’ is due to Galton (1886), who developed the technique to investigate the relationship between the heights of children and of their parents. He observed that if we choose a group of parents of a given height, the mean height of their children will be closer to the mean height of the population than is the given height. In other words, tall parents tend to be taller than their children, short parents tend to be shorter. Galton termed this phenomenon ‘regression towards mediocrity’, meaning ‘going back towards the average’. It is now called regression towards the mean (§11.4). The method used to investigate it was called regression analysis and the name has stuck. However, in Galton's terminology there was ‘no regression’ if the relationship between the variables was such that one predicted the other exactly; in modern terminology there is no regression if the variables are not related at all.
In regression problems we are interested in how well one variable can be used to predict another. In the case of FEV1 and height, for example, we are concerned with estimating the mean FEV1 for a given height rather than mean height for given FEV1. We have two kinds of variables: the outcome variable which we are trying to predict, in this case FEV1, and the predictor or explanatory variable, in this case height. The predictor variable is often called the independent variable and the outcome variable is called the dependent variable. However, these terms have other meanings in probability (§6.2), so I shall not use them. If we denote the predictor variable by X and the outcome by Y, the relationship between them may be written as
$$Y = a + bX + E$$
where a and b are constants and E is a random variable with mean 0, called the error, which represents that part of the variability of Y which is not explained by the relationship with X. If the mean of E were not zero, we could make it so by changing a. We assume that E is independent of X.
11.3 The method of least squares
If the points all lay along a line and there was no random variation, it would be easy to draw a line on the scatter diagram. In Figure 11.1 this is not the case. There are many possible values of a and b which could represent the data and we need a criterion for choosing the best line. Figure 11.3 shows the deviation of a point from the line, the distance from the point to the line in the Y direction. The line will fit the data well if the deviations from it are small, and will fit badly if they are large. These deviations represent the error E, that part of the variable Y not explained by X. One solution to the problem of finding the best line is to choose that which leaves the minimum amount of the variability of Y unexplained, by making the variance of E a minimum. This will be achieved by making the sum of squares of the deviations about the line a minimum. This is called the method of least squares and the line found is the least squares line.
The method of least squares is the best method if the deviations from the line are Normally distributed with uniform variance along the line. This is likely to be the case, as the regression tends to remove from Y the variability between subjects and leave the measurement error, which is likely to be Normal. I shall deal with deviations from this assumption in §11.8.
Many users of statistics are puzzled by the minimization of variation in one direction only. Usually both variables are measured with some error and yet we seem to ignore the error in X. Why not minimize the perpendicular distances to the line rather than the vertical? There are two reasons for this. First, we are finding the best prediction of Y from the observed values of X, not from the ‘true’ values of X. The measurement error in both variables is one of the causes of deviations from the line, and is included in these deviations measured in the Y direction. Second, the line found by minimizing perpendicular distances depends on the units in which the variables are measured. For the data of Table 11.1 the line found by this method is
FEV1 (litres) = -9.33 + 0.075 × height (cm)
If we measure height in metres instead of centimetres, we get
FEV1 (litres) = -34.70 + 22.0 × height (m)
Thus by this method the predicted FEV1 for a student of height 170 cm is 3.42 litres, but for a student of height 1.70 m it is 2.70 litres. This is clearly unsatisfactory and we will not consider this approach further.
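Fig. 11.3. Deviations from the line in the y direction
Returning to Figure 11.3, the equation of the line which minimizes the sum of squared deviations from the line in the outcome variable is found quite easily (§11A). The solution is:
$$b = \frac{\text{sum of products about the mean}}{\text{sum of squares about the mean of } X} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
We then find the intercept a by
$$a = \bar{y} - b\bar{x}$$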
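The equation Y = a + bX is called the regression equation of Y on X, Y being the outcome variable and X the predictor. The gradient, b, is also called the regression coefficient. We shall calculate it for the data of Table 11.1. We have
$$\sum x_i = 3507.6, \quad \bar{x} = 175.38, \quad \sum y_i = 77.12, \quad \bar{y} = 3.856$$
$$\sum (x_i - \bar{x})^2 = 576.352, \quad \sum (y_i - \bar{y})^2 = 9.43868, \quad \sum (x_i - \bar{x})(y_i - \bar{y}) = 42.8744$$
We do not need the sum of squares for Y yet, but we shall later. Then
$$b = \frac{42.8744}{576.352} = 0.074389, \quad a = 3.856 - 0.074389 \times 175.38 = -9.19$$
Hence the regression equation of FEV1 on height is
FEV1 (litres) = -9.19 + 0.0744 × height (cm)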
Figure 11.4 shows the line drawn on the scatter diagram.
The coefficients a and b have dimensions, depending on those of X and Y. If we change the units in which X and Y are measured we also change a and b, but we do not change the line. For example, if height is measured in metres we divide the $x_i$ by 100 and we find that b is multiplied by 100 to give b = 7.4389 litres/m. The line is
FEV1 (litres) = -9.19 + 7.44 × height (m)
This is exactly the same line on the scatter diagram.
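The least squares calculation, and the effect of changing units, can be sketched in Python using only the standard library. The function name `least_squares` is an illustrative choice; the data are those of Table 11.1.

```python
from statistics import mean

# Data of Table 11.1: height (cm) and FEV1 (litres) for 20 students.
heights_cm = [164.0, 167.0, 170.4, 171.2, 171.2, 171.3, 172.0,
              172.0, 174.0, 176.0, 177.0, 177.0, 177.0, 177.4,
              178.0, 180.7, 181.0, 183.1, 183.6, 183.7]
fev1_litres = [3.54, 3.54, 3.19, 2.85, 3.42, 3.20, 3.60,
               3.78, 4.32, 3.75, 3.09, 4.05, 5.43, 3.60,
               2.98, 4.80, 3.96, 4.78, 4.56, 4.68]


def least_squares(x, y):
    """Return (intercept a, slope b) of the least squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sum_sq_x = sum((xi - x_bar) ** 2 for xi in x)
    sum_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sum_products / sum_sq_x
    a = y_bar - b * x_bar
    return a, b


a_cm, b_cm = least_squares(heights_cm, fev1_litres)

# Same data with height in metres: the slope changes, the line does not.
heights_m = [h / 100 for h in heights_cm]
a_m, b_m = least_squares(heights_m, fev1_litres)

print(f"height in cm: FEV1 = {a_cm:.2f} + {b_cm:.4f} x height")
print(f"height in m:  FEV1 = {a_m:.2f} + {b_m:.2f} x height")
print(f"prediction at 170 cm: {a_cm + b_cm * 170:.2f} litres")
print(f"prediction at 1.70 m: {a_m + b_m * 1.70:.2f} litres")
```

Unlike the perpendicular-distance line discussed earlier, the two predictions agree whichever unit is used.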
Fig. 11.4. The regression of FEV1 on height
Fig. 11.5. The two regression lines for the data of Tables 11.1 and 10.15
11.4* The regression of X on Y
What happens if we change our choice of outcome and predictor variables? The regression equation of height on FEV1 is
height = 158 + 4.54 × FEV1
This is not the same line as the regression of FEV1 on height. For if we rearrange this equation by subtracting 158 from each side and dividing by 4.54 we get
FEV1 = -34.8 + 0.220 × height
The slope of the regression of height on FEV1 is greater than that of FEV1 on height (Figure 11.5). In general, the slope of the regression of X on Y is greater than that of Y on X, when X is the horizontal axis. Only if all the points lie exactly on a straight line are the two equations the same.
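A short Python sketch shows why the two lines differ: both regressions share the same sum of products about the mean, but each divides it by the sum of squares of its own predictor. The variable names are illustrative and the summary quantities are those quoted elsewhere in this chapter for Table 11.1.

```python
# Summary quantities for the data of Table 11.1, as quoted in the text.
sum_sq_height = 576.352          # sum of squares about the mean, height (cm)
sum_sq_fev1 = 9.43868            # sum of squares about the mean, FEV1 (litres)
slope_fev1_on_height = 0.074389  # litres per cm

# Sum of products about the mean, recovered from the slope of FEV1 on height.
sum_products = slope_fev1_on_height * sum_sq_height

# Regression of height on FEV1 divides by the FEV1 sum of squares instead.
slope_height_on_fev1 = sum_products / sum_sq_fev1   # cm per litre

# Drawn with height on the horizontal axis, this line has slope 1/4.54.
implied_slope = 1 / slope_height_on_fev1            # litres per cm

print(f"height on FEV1: {slope_height_on_fev1:.2f} cm per litre")
print(f"as a slope against height: {implied_slope:.3f} litres per cm")
print(f"FEV1 on height: {slope_fev1_on_height:.4f} litres per cm")
```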
Figure 11.5 also shows the two 30 second pulse measurements of Table 10.15, with the lines representing the regression of the second measurement on the first and the first measurement on the second. The regression equations are 2nd pulse = 17.3 + 0.572 × 1st pulse and 1st pulse = 14.9 + 0.598 × 2nd pulse. Each regression coefficient is less than one. This means that for subjects with any given first pulse measurement, the predicted second pulse measurement will be closer to the mean than the first measurement, and for any given second pulse measurement, the predicted first measurement will be closer to the mean than the second measurement. This is regression towards the mean (§11.2). Regression towards the mean is a purely statistical phenomenon, produced by the selection of the given value of the predictor and the imperfect relationship between the variables. Regression towards the mean may manifest itself in many ways. For example, suppose we measure the blood pressure of an unselected group of people and then select subjects with high blood pressure, e.g. diastolic > 95 mm Hg. If we then measure the selected group again, the mean diastolic pressure for the selected group will be less on the second occasion than on the first, without any intervention or treatment. The apparent fall is caused by the initial selection.
11.5 The standard error of the regression coefficient
In any estimation procedure, we want to know how reliable our estimates are. We do this by finding their standard errors and hence confidence intervals. We can also test hypotheses about the coefficients, for example, the null hypothesis that in the population the slope is zero and there is no linear relationship. The details are given in §11C. We first find the sum of squares of the deviations from the line, that is, the difference between the observed $y_i$ and the values predicted by the regression line. This is
$$\sum (y_i - a - bx_i)^2 = \sum (y_i - \bar{y})^2 - b^2 \sum (x_i - \bar{x})^2$$
In order to estimate the variance we need the degrees of freedom with which to divide the sum of squares. We have estimated not one parameter from the data, as for the sum of squares about the mean (§4.6), but two, a and b. We lose two degrees of freedom, leaving us with n-2. Hence the variance of Y about the line, called the residual variance, is
$$s^2 = \frac{1}{n-2}\left(\sum (y_i - \bar{y})^2 - b^2 \sum (x_i - \bar{x})^2\right)$$
If we are to estimate the variation about the line, we must assume that it is the same all the way along the line, i.e. that the variance is uniform. This is the same as for the two-sample t method (§10.3) and analysis of variance (§10.9). For the FEV1 data the sum of squares due to the regression is 0.074389² × 576.352 = 3.18937 and the sum of squares about the regression is 9.43868 - 3.18937 = 6.24931. There are 20 - 2 = 18 degrees of freedom, so the variance about the regression is $s^2$ = 6.2493/18 = 0.34718. The standard error of b is given by
$$SE(b) = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}} = \sqrt{\frac{0.34718}{576.352}} = 0.02454$$
We have already assumed that the error E is Normally distributed, so b must be, too. The standard error is based on a single sum of squares, so b/SE(b) is an observation from the t distribution with n-2 degrees of freedom (§10.1). We can find a 95% confidence interval for b by taking t standard errors on either side of the estimate. For the example, we have 18 degrees of freedom. From Table 10.1, the 5% point of the t distribution is 2.10, so the 95% confidence interval for b is 0.074389 - 2.10 × 0.02454 to 0.074389 + 2.10 × 0.02454 or 0.02 to 0.13 litres/cm. We can see that FEV1 and height are related, though the slope is not very well estimated.
Wecanalsotestthenullhypothesisthat,inthepopulation,theslope=0againstthealternativethattheslopeisnotequalto0,arelationshipineitherdirection.Theteststatisticisb/SE(b)andifthenullhypothesisistruethiswillbefromatdistributionwithn-2degreesoffreedom.Fortheexample,
FromTable10.1thishastwo-tailedprobabilityoflessthan0.01.Thecomputertellsusthattheprobabilityisabout0.007.Hencethedataareinconsistentwiththenullhypothesisandthedataprovidefairlygoodevidencethatarelationshipexists.Ifthesampleweremuchlarger,wecoulddispensewiththetdistributionandusetheStandardNormaldistributioninitsplace.
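The arithmetic of this section can be retraced in a few lines of Python. This is only an illustrative sketch: it starts from the summary figures quoted above (the slope, the two sums of squares about the mean and n), takes the 5% point of t with 18 degrees of freedom as 2.10 from Table 10.1, and the variable names are assumptions made for the example.

```python
import math

# Summary figures quoted in the text for the FEV1 data.
n = 20
b = 0.074389            # slope, litres per cm
ss_height = 576.352     # sum of squares about the mean, height
ss_fev1 = 9.43868       # sum of squares about the mean, FEV1
t_crit = 2.10           # two-sided 5% point of t, 18 d.f. (Table 10.1)

ss_regression = b ** 2 * ss_height       # sum of squares due to the regression
ss_residual = ss_fev1 - ss_regression    # sum of squares about the regression
s2 = ss_residual / (n - 2)               # residual variance
se_b = math.sqrt(s2 / ss_height)         # standard error of the slope
t_stat = b / se_b                        # test statistic for slope = 0
ci = (b - t_crit * se_b, b + t_crit * se_b)

print(f"s2 = {s2:.5f}, SE(b) = {se_b:.5f}, t = {t_stat:.2f}")
print(f"95% CI for b: {ci[0]:.2f} to {ci[1]:.2f} litres/cm")
```

Working from sums of squares rather than the raw observations mirrors the hand calculation in the text; with the raw data a statistical package would normally do all of this in one step.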
11.6* Using the regression line for prediction
We can use the regression equation to predict the mean or expected Y for any given value of X. This is called the regression estimate of Y. We can use this to say whether any individual has an observed Y greater or less than would be expected given X. For example, the predicted FEV1 for students with height 177 cm is -9.19 + 0.0744 × 177 = 3.98 litres. Three subjects had height 177 cm. The first had observed FEV1 of 5.43 litres, 1.45 litres above that expected. The second had a rather low FEV1 of 3.09 litres, 0.89 litres below expectation, while the third with an FEV1 of 4.05 litres was very close to that predicted. We can use this clinically to adjust a measured lung function for height and thus get a better idea of the patient's status. We would, of course, use a much larger sample to establish a precise estimate of the regression equation. We can also use a variant of the method (§17.1) to adjust FEV1 for height in comparing different groups, where we can both remove variation in FEV1 due to variation in height and allow for differences in mean height between the groups. We may wish to do this to compare patients with respiratory disease on different therapies, or to compare subjects exposed to different environmental factors, such as air pollution, cigarette smoking, etc.
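The comparison of observed and expected FEV1 for the three students of height 177 cm can be written as a short Python sketch. The function name and layout are illustrative assumptions; the coefficients and observations are those quoted in the text.

```python
# Regression line quoted in the text: FEV1 = a + b * height.
a = -9.19      # intercept, litres
b = 0.0744     # slope, litres per cm

def predicted_fev1(height_cm):
    """Regression estimate: the mean FEV1 expected at this height."""
    return a + b * height_cm

observed = [5.43, 3.09, 4.05]      # the three students of height 177 cm
expected = predicted_fev1(177)
differences = [y - expected for y in observed]

print(f"expected FEV1 at 177 cm: {expected:.2f} litres")
for y, d in zip(observed, differences):
    print(f"observed {y:.2f} litres, observed minus expected {d:+.2f}")
```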
Fig. 11.6. Confidence intervals for the regression estimate
As with all sample estimates, the regression estimate is subject to sampling variation. We estimate its precision by standard error and confidence interval in the usual way. The standard error of the expected Y for an observed value x is

$$\sqrt{s^2\left(\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right)}$$

We need not go into the algebraic details of this. It is very similar to that in §11C. For x = 177 we have

$$\sqrt{0.34718 \times \left(\frac{1}{20} + \frac{(177 - \bar{x})^2}{576.352}\right)} = 0.138$$

This gives a 95% confidence interval of 3.98 - 2.10 × 0.138 to 3.98 + 2.10 × 0.138, giving from 3.69 to 4.27 litres. Here 3.98 is the estimate and 2.10 is the 5% point of the t distribution with n - 2 = 18 degrees of freedom.
The standard error is a minimum at x = x̄, and increases as we move away from x̄ in either direction. It can be useful to plot the standard error and 95% confidence interval about the line on the scatter diagram. Figure 11.6 shows this for the FEV1 data. Notice that the lines diverge considerably as we reach the extremes of the data. It is very dangerous to extrapolate beyond the data. Not only do the standard errors become very wide, but we often have no reason to suppose that the straight line relationship would persist.
The intercept a, the predicted value of Y when X = 0, is a special case of this. Clearly, we cannot actually have a medical student of height zero and with FEV1 of -9.19 litres. Figure 11.6 also shows the confidence interval for the regression estimate with a much smaller scale, to show the intercept. The confidence interval is very wide at height = 0, and this does not take account of any breakdown in linearity.
We may wish to use the value of X for a subject to estimate that subject's individual value of Y, rather than the mean for all subjects with this X. The estimate is the same as the regression estimate, but the standard error is much greater:

$$\sqrt{s^2\left(1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right)}$$

For a student with a height of 177 cm, the predicted FEV1 is 3.98 litres, with standard error 0.61 litres. Figure 11.7 shows the precision of the prediction of a further observation. As we might expect, the 95% confidence intervals include all but one of the 20 observations. This is only going to be a useful prediction when the residual variance s² is small.
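The two standard errors at 177 cm are closely linked: the variance for an individual's predicted value is the variance of the regression estimate plus the residual variance s². The Python sketch below, with illustrative variable names, uses only the figures quoted in the text (s² = 0.34718, standard error 0.138 for the regression estimate, t = 2.10) to recover the confidence interval for the mean and the standard error 0.61 for an individual.

```python
import math

s2 = 0.34718       # residual variance (§11.5)
se_mean = 0.138    # SE of the regression estimate at x = 177 cm
estimate = 3.98    # predicted FEV1 at 177 cm, litres
t_crit = 2.10      # 5% point of t with 18 degrees of freedom

# 95% confidence interval for the mean FEV1 of students 177 cm tall.
ci_mean = (estimate - t_crit * se_mean, estimate + t_crit * se_mean)

# For one further student the residual variance is added to the variance
# of the regression estimate before taking the square root.
se_individual = math.sqrt(s2 + se_mean ** 2)
interval_individual = (estimate - t_crit * se_individual,
                       estimate + t_crit * se_individual)

print(f"mean: {ci_mean[0]:.2f} to {ci_mean[1]:.2f} litres")
print(f"individual: SE {se_individual:.2f}, "
      f"{interval_individual[0]:.2f} to {interval_individual[1]:.2f} litres")
```

Because s² dominates the sum, collecting more data narrows the interval for the mean much more than it narrows the interval for an individual.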
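We can also use the regression equation of Y on X to predict X from Y. This is much less accurate than predicting Y from X. The standard errors are

$$SE(\text{mean } X \text{ for given } y) = \sqrt{\frac{s^2}{b^2}\left(\frac{1}{n} + \frac{(y - \bar{y})^2}{b^2 \sum (x_i - \bar{x})^2}\right)}$$

$$SE(\text{individual } X \text{ for given } y) = \sqrt{\frac{s^2}{b^2}\left(1 + \frac{1}{n} + \frac{(y - \bar{y})^2}{b^2 \sum (x_i - \bar{x})^2}\right)}$$

For example, if we use the regression of height on FEV1 (Figure 11.5) to predict the FEV1 of an individual student with height 177 cm, we get a prediction of 4.21 litres, with standard error 1.05 litres. This is almost twice the standard error obtained from the regression of FEV1 on height, 0.61. Only if there is no possibility of deviations in X fulfilling the assumptions of Normal distribution and uniform variance, and so no way of fitting X = a + bY, should we consider predicting X from the regression of Y on X. This might happen if X is fixed in advance, e.g. the dose of a drug.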
11.7* Analysis of residuals
It is often very useful to examine the residuals, the differences between the observed and predicted Y. This is best done graphically. We can assess the assumption of a Normal distribution by looking at the histogram or Normal plot (§7.5). Figure 11.8 shows these for the FEV1 data. The fit is quite good.
Figure 11.9 shows a plot of residuals against the predictor variable. This plot enables us to examine deviations from linearity. For example, if the true relationship were quadratic, so that Y increases more and more rapidly as X increases, we should see that the residuals are related to X. Large and small X would tend to have positive residuals whereas central values would have negative residuals. Figure 11.9 shows no relationship between the residuals and height, and the linear model seems to be an adequate fit to the data.
Fig. 11.7. Confidence interval for a further observation
Fig. 11.8. Distribution of residuals for the FEV1 data
Fig. 11.9. Residuals against height for the FEV1 data
Fig. 11.10. Data which do not meet the conditions of the method of least squares, before and after log transformation
Figure 11.9 shows something else, however. One point stands out as having a rather larger residual than the others. This may be an outlier, a point which may well come from a different population. It is often difficult to know what to do with such data. At least we have been warned to double check this point for transcription errors. It is all too easy to transpose adjoining digits when transferring data from one medium to another. This may have been the case here, as an FEV1 of 4.53, rather than the 5.43 recorded, would have been more in line with the rest of the data. If this happened at the point of recording, there is not much we can do about it. We could try to measure the subject again, or exclude him and see whether this makes any difference. I think that, on the whole, we should work with all the data unless there are very good reasons for not doing so. I have retained this case here.
11.8* Deviations from assumptions in regression
Both the appropriateness of the method of least squares and the use of the t distribution for confidence intervals and tests of significance depend on the assumption that the residuals are from a Normal distribution with uniform variance. This assumption is easily met, for the same reasons that it is in the paired t test (§10.2). The removal of the variation due to X tends to remove some of the variation between individuals, leaving the measurement error. Problems can arise, however, and it is always a good idea to plot the original scatter diagram and the residuals to check that there are no gross departures from the assumptions of the method. Not only does this help preserve the validity of the statistical method used, but it may also help us learn more about the structure of the data.
Figure 11.10 shows the relationship between gestational age and cord blood levels of AVP, the antidiuretic hormone, in a sample of male foetuses. The variability of the outcome variable AVP depends on the actual value of the variable, being larger for large values of AVP. The assumptions of the method of least squares do not apply. However, we can use a transformation as we did for the comparison of means in §10.4. Figure 11.10 also shows the data after AVP has been log transformed, together with the least squares line.
As in §10.4, the transformation is found by trial and error. The log transformation enables us to interpret the regression coefficient in a way which other transformations do not. I used logs to base 10 for this transformation and got the following regression equation:

$$\log_{10}(\mathrm{AVP}) = -0.651253 + 0.011771 \times \text{gestational age}$$

This means that for every one day increase in gestational age, log10(AVP) increases by 0.011771. Adding 0.011771 to log10(AVP) multiplies AVP by $10^{0.011771} = 1.027$, the antilog of 0.011771. We can antilog the confidence limits for the slope to give the confidence interval for this factor.
It may be more convenient to report the increase per week or per month. These would be factors of $10^{0.011771 \times 7} = 1.209$ or $10^{0.011771 \times 30} = 2.255$ respectively. When the data are a random sample, it is often convenient to quote the slope calculated from logs as the effect of a difference of one standard deviation of the predictor. For gestational age the standard deviation is 61.16104 days, so the effect of a change of one SD is to multiply AVP by $10^{0.011771 \times 61.16104} = 5.247$, so a difference of one standard deviation is associated with a fivefold increase in AVP. Another approach is to look at the difference between two centiles, such as the 10th and the 90th. For gestational age these are 98 and 273 days, so the effect on AVP would be to multiply it by $10^{0.011771 \times (273 - 98)} = 114.796$. Thus the difference over this intercentile range is to raise AVP 115-fold.
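The multiplicative factors quoted above all come from the same antilog calculation, which the following Python sketch makes explicit. The helper name `factor` and the dictionary labels are illustrative assumptions; the slope, standard deviation and centiles are those given in the text.

```python
slope = 0.011771              # increase in log10(AVP) per day of gestation
sd_days = 61.16104            # standard deviation of gestational age, days
centile_10, centile_90 = 98, 273

def factor(days):
    """Factor by which AVP is multiplied over a difference of `days` days."""
    return 10 ** (slope * days)

factors = {
    "one day": factor(1),
    "one week": factor(7),
    "one 30-day month": factor(30),
    "one standard deviation": factor(sd_days),
    "10th to 90th centile": factor(centile_90 - centile_10),
}
for label, value in factors.items():
    print(f"{label}: AVP multiplied by {value:.3f}")
```

The same function applied to the lower and upper confidence limits of the slope would give the confidence interval for each factor.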
11.9 Correlation
The regression method tells us something about the nature of the relationship between two variables, how one changes with the other, but it does not tell us how close that relationship is. To do this we need a different coefficient, the correlation coefficient. The correlation coefficient is based on the sum of products about the mean of the two variables, so I shall start by considering the properties of the sum of products and why it is a good indicator of the closeness of the relationship.
Figure 11.11 shows the scatter diagram of Figure 11.1 with two new axes drawn through the mean point. The distances of the points from these axes represent the deviations from the mean. In the top right section of Figure 11.11, the deviations from the mean of both variables, FEV1 and height, are positive. Hence, their products will be positive. In the bottom left section, the deviations from the mean of the two variables will both be negative. Again, their product will be positive. In the top left section of Figure 11.11, the deviations of FEV1 from its mean will be positive, and the deviation of height from its mean will be negative. The product of these will be negative. In the bottom right section, the product will again be negative. So in Figure 11.11 nearly all these products will be positive, and their sum will be positive. We say that there is a positive correlation between the two variables; as one increases so does the other. If one variable decreased as the other increased, we would have a scatter diagram where most of the points lay in the top left and bottom right sections. In this case the sum of the products would be negative and there would be a negative correlation between the variables. When the two variables are not related, we have a scatter diagram with roughly the same number of points in each of the sections. In this case, there are as many positive as negative products, and the sum is zero. There is zero correlation or no correlation. The variables are said to be uncorrelated.
Fig. 11.11. Scatter diagram with axes through the mean point
The value of the sum of products depends on the units in which the two variables are measured. We can find a dimensionless coefficient if we divide the sum of products by the square roots of the sums of squares of X and Y. This gives us the product moment correlation coefficient, or the correlation coefficient for short, usually denoted by r.
If the n pairs of observations are denoted by (xi, yi), then r is given by

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$

For the FEV1 and height we have

$$r = \frac{42.874}{\sqrt{576.352 \times 9.43868}} = 0.58$$

The effect of dividing the sum of products by the root sum of squares of deviations of each variable is to make the correlation coefficient lie between -1.0 and +1.0. When all the points lie exactly on a straight line such that Y increases as X increases, r = 1. This can be shown by putting a + bxi in place of yi in the equation for r; everything cancels out leaving r = 1. When all the points lie exactly on a straight line with negative slope, r = -1. When there is no relationship at all, r = 0, because the sum of products is zero. The correlation coefficient describes the closeness of the linear relationship between two variables. It does not matter which variable we take to be Y and which to be X. There is no choice of predictor and outcome variable, as there is in regression.
Fig. 11.12. Data where the correlation coefficient may be misleading
The correlation coefficient measures how close the points are to a straight line. Even if there is a perfect mathematical relationship between X and Y, the correlation coefficient will not be exactly 1 unless this is of the form y = a + bx. For example, Figure 11.12 shows two variables which are perfectly related but have r = 0.86. Figure 11.12 also shows two variables which are clearly related but have zero correlation, because the relationship is not linear. This shows again the importance of plotting the data and not relying on summary statistics such as the correlation coefficient only. In practice, relationships like those of Figure 11.12 are rare in medical data, although the possibility is always there. More often, there is so much random variation that it is not easy to discern any relationship at all.
The correlation coefficient r is related to the regression coefficient b in a simple way. If Y = a + bX is the regression of Y on X, and X = a′ + b′Y is the regression of X on Y, then r² = bb′. This arises from the formulae for r and b. For the FEV1 data, b = 0.074389 and b′ = 4.5424, so bb′ = 0.074389 × 4.5424 = 0.33790, the square root of which is 0.58129, the correlation coefficient. We also have

$$r^2 = \frac{\left(\sum (x_i - \bar{x})(y_i - \bar{y})\right)^2}{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2} = \frac{b^2 \sum (x_i - \bar{x})^2}{\sum (y_i - \bar{y})^2} = \frac{\text{sum of squares due to regression}}{\text{total sum of squares}}$$

This is the proportion of variability explained, described in §11.5.
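The relationships between r, b and b′ can be checked numerically from the summary figures for the FEV1 data. The sketch below is illustrative: it recovers the sum of products from b and the sum of squares for height (since b is the sum of products divided by that sum of squares), and the variable names are assumptions.

```python
import math

ss_height = 576.352    # sum of squares about the mean, height (X)
ss_fev1 = 9.43868      # sum of squares about the mean, FEV1 (Y)
b = 0.074389           # slope of the regression of FEV1 on height

sum_products = b * ss_height           # b = sum of products / ss_height
b_prime = sum_products / ss_fev1       # slope of the regression of height on FEV1
r = sum_products / math.sqrt(ss_height * ss_fev1)
proportion_explained = b ** 2 * ss_height / ss_fev1

print(f"b' = {b_prime:.4f}")
print(f"r = {r:.5f}, r squared = {r ** 2:.5f}, b * b' = {b * b_prime:.5f}")
print(f"proportion of variability explained = {proportion_explained:.5f}")
```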
Table 11.2. Two-sided 5% and 1% points of the distribution of the correlation coefficient, r, under the null hypothesis
n 5% 1% n 5% 1% n 5% 1%
3 1.00 1.00 16 0.50 0.62 29 0.37 0.47
4 0.95 0.99 17 0.48 0.61 30 0.36 0.46
5 0.88 0.96 18 0.47 0.59 40 0.31 0.40
6 0.81 0.92 19 0.46 0.58 50 0.28 0.36
7 0.75 0.87 20 0.44 0.56 60 0.25 0.33
8 0.71 0.83 21 0.43 0.55 70 0.24 0.31
9 0.67 0.80 22 0.42 0.54 80 0.22 0.29
10 0.63 0.77 23 0.41 0.53 90 0.21 0.27
11 0.60 0.74 24 0.40 0.52 100 0.20 0.25
12 0.58 0.71 25 0.40 0.51 200 0.14 0.18
13 0.55 0.68 26 0.39 0.50 500 0.09 0.12
14 0.53 0.66 27 0.38 0.49 1000 0.06 0.08
15 0.51 0.64 28 0.37 0.48
n = Number of observations.
11.10 Significance test and confidence interval for r
Testing the null hypothesis that r = 0 in the population, i.e. that there is no linear relationship, is simple. The test is numerically equivalent to testing the null hypothesis that b = 0, and the test is valid provided at least one of the variables is from a Normal distribution. This condition is effectively the same as that for testing b, where the residuals in the Y direction must be Normal. If b = 0, the residuals in the Y direction are simply the deviations from the mean, and these will only be Normally distributed if Y is. If the condition is not met, we can use a transformation (§11.8), or one of the rank correlation methods (§12.4-5).
Because the correlation coefficient does not depend on the means or variances of the observations, the distribution of the sample correlation coefficient when the population coefficient is zero is easy to tabulate. Table 11.2 shows the correlation coefficient at the 5% and 1% level of significance. For the example we have r = 0.58 from 20 observations. The 1% point for 20 observations is 0.56, so we have P < 0.01, and the correlation is unlikely to have arisen if there were no linear relationship in the population. Note that the values of r which can arise by chance with small samples are quite high. With 10 points r would have to be greater than 0.63 to be significant. On the other hand with 1000 points very small values of r, as low as 0.06, will be significant.
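The numerical equivalence with the test of b = 0 can be seen through the standard identity t = r√((n - 2)/(1 - r²)), which is not stated in the text but gives the same t statistic as b/SE(b). The Python sketch below, with illustrative function names, uses it in both directions: to turn r into t, and to turn a 5% point of t into the corresponding entry of Table 11.2.

```python
import math

def t_from_r(r, n):
    """t statistic, on n - 2 degrees of freedom, equivalent to testing r = 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def critical_r(t_crit, n):
    """Smallest |r| reaching significance, given the matching point of t."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

t_stat = t_from_r(0.58129, 20)    # FEV1 and height
r_5pct = critical_r(2.10, 20)     # 2.10 is the 5% point of t with 18 d.f.

print(f"t = {t_stat:.2f} on 18 degrees of freedom")
print(f"5% point of r for n = 20: {r_5pct:.2f}")
```

The t value agrees with the 3.03 found for the slope in §11.5, and the critical value agrees with the 0.44 tabulated for n = 20.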
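Finding a confidence interval for the correlation coefficient is more difficult. Even when X and Y are both Normally distributed, r does not itself approach a Normal distribution until the sample size is in the thousands. Furthermore, its distribution is rather sensitive to deviations from the Normal in X and Y. However, if both variables are from Normal distributions, Fisher's z transformation gives a Normally distributed variable whose mean and variance are known in terms of the population correlation coefficient which we wish to estimate. From this a confidence interval can be found. Fisher's z transformation is

$$z = \frac{1}{2}\log_e\left(\frac{1 + r}{1 - r}\right)$$

which follows a Normal distribution with mean

$$z_\rho = \frac{1}{2}\log_e\left(\frac{1 + \rho}{1 - \rho}\right)$$

and variance 1/(n - 3), approximately, where ρ is the population correlation coefficient and n is the sample size. For the FEV1 data, r = 0.58 and n = 20, so z = 0.6625 and the 95% confidence interval on the z scale is 0.6625 - 1.96 × √(1/17) to 0.6625 + 1.96 × √(1/17), or 0.1871 to 1.1378. The transformation back to the correlation coefficient scale is r = (exp(2z) - 1)/(exp(2z) + 1), so for the lower limit we have

$$\frac{\exp(2 \times 0.1871) - 1}{\exp(2 \times 0.1871) + 1} = 0.18$$

and for the upper limit

$$\frac{\exp(2 \times 1.1378) - 1}{\exp(2 \times 1.1378) + 1} = 0.81$$

and the 95% confidence interval is 0.18 to 0.81. This is very wide, reflecting the sampling variation which the correlation coefficient has for small samples. Correlation coefficients must be treated with some caution when derived from small samples.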
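The confidence interval calculation is easily programmed. The following Python sketch assumes the usual large-sample form of Fisher's method, with variance 1/(n - 3) on the z scale and 1.96 as the Normal 5% point; the function name and argument names are illustrative.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation coefficient."""
    z = 0.5 * math.log((1 + r) / (1 - r))     # Fisher's z transformation
    se = math.sqrt(1 / (n - 3))               # standard error on the z scale
    limits_z = (z - z_crit * se, z + z_crit * se)

    def back(value):
        # Transform a limit back from the z scale to the r scale.
        return (math.exp(2 * value) - 1) / (math.exp(2 * value) + 1)

    return back(limits_z[0]), back(limits_z[1])

lower, upper = fisher_ci(0.58, 20)
print(f"95% CI for r: {lower:.2f} to {upper:.2f}")
```

Note that the interval is not symmetrical about r = 0.58, because the back transformation compresses the scale as r approaches 1.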
The ease of the significance test compared to the relative complexity of the confidence interval calculation has meant that in the past a significance test was usually given for the correlation coefficient. The increasing availability of computers with well-written statistical packages should lead to correlation coefficients appearing with confidence intervals in the future.
Table 11.3. Simulated data showing 10 pairs of measurements of two independent variables for four subjects
Subject1 Subject2 Subject3 Subject4
x y x y x y x
47 51 49 52 51 46 63
46 53 50 56 46 48 70
50 57 42 46 46 47 63
52 54 48 52 45 55 58
46 55 60 53 52 49 59
36 53 47 49 54 61 61
47 54 51 52 48 53 67
46 57 57 50 47 48 64
36 61 49 50 47 50 59
44 57 49 49 54 44 61
Means 45.0 55.2 50.2 50.9 49.0 50.1 62.5
r=-0.33 r=0.49 r=0.06 r=-0.39
P=0.35 P=0.15 P=0.86 P=0.27
11.11 Uses of the correlation coefficient
The correlation coefficient has several uses. Using Table 11.2, it provides a simple test of the null hypothesis that the variables are not linearly related, with less calculation than the regression method. It is also useful as a summary statistic for the strength of relationship between two variables. This is of great value when we are considering the interrelationships between a large number of variables. We can set up a square array of the correlations of each pair of variables, called the correlation matrix. Examination of the correlation matrix can be very instructive, but we must bear in mind the possibility of non-linear relationships. There is no substitute for plotting the data. The correlation matrix also provides the starting point for a number of methods for dealing with a large number of variables simultaneously.
Of course, for the reasons discussed in Chapter 3, the fact that two variables are correlated does not mean that one causes the other.
11.12* Using repeated observations
In clinical research we are often able to take several measurements on the same patient. We may want to investigate the relationship between two variables, and take pairs of readings with several pairs from each of several patients. The analysis of such data is quite complex. This is because the variability of measurements made on different subjects is usually much greater than the variability between measurements on the same subject, and we must take these two kinds of variability into account. What we must not do is to put all the data together, as if they were one sample.
Consider the simulated data of Table 11.3. The data were generated from random numbers, and there is no relationship between X and Y at all. First values of X and Y were generated for each ‘subject’, then a further random number was added to make the individual ‘observation’. For each subject separately, there was no significant correlation between X and Y. For the subject means, the correlation coefficient was r = 0.77, P = 0.23. However, if we put all 40 observations together we get r = 0.53, P = 0.0004. Even though the coefficient is smaller than that between subject means, because it is based on 40 pairs of observations rather than 4 it becomes significant. The data are plotted in Figure 11.13, with three other simulations. As the null hypothesis is always true in these simulated data, the population correlations for each ‘subject’ and for the means are zero. Because the numbers of observations are small, the sample correlations vary greatly. As Table 11.2 shows, large correlation coefficients can arise by chance in small samples. However, the overall correlation is ‘significant’ in three of the four simulations, though in different directions.
Fig. 11.13. Simulations of 10 pairs of observations on four subjects
We only have four subjects and only four points. By using the repeated data, we are not increasing the number of subjects, but the statistical calculation is done as if we have, and so the number of degrees of freedom for the significance test is incorrectly increased and a spurious significant correlation produced.
There are two simple ways to approach this type of data, and which is chosen depends on the question being asked. If we want to know whether subjects with a high value of X tend to have a high value of Y also, we use the subject means and find the correlation between them. If we have different numbers of observations for each subject, we can use a weighted analysis, weighted by the number of observations for the subject. If we want to know whether changes in one variable in the same subject are paralleled by changes in the other, we need to use multiple regression, taking subjects out as a factor (§17.1, §17.6). In either case, we should not mix observations from different subjects indiscriminately.
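A simulation in the spirit of Table 11.3 can be sketched in Python as follows. It is not the procedure used to produce the table: the Normal distributions, the between-subject standard deviation of 10, the within-subject standard deviation of 5 and the seed are all assumptions chosen for illustration. Each 'subject' is given independent levels of X and Y, so every population correlation is zero, and the three correlations discussed above are then computed.

```python
import math
import random

def pearson_r(xs, ys):
    """Product moment correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n_subjects=4, n_pairs=10, seed=1):
    """Independent X and Y: a subject level plus within-subject variation."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n_subjects):
        level_x = rng.gauss(50, 10)    # assumed between-subject variation
        level_y = rng.gauss(50, 10)
        xs = [level_x + rng.gauss(0, 5) for _ in range(n_pairs)]
        ys = [level_y + rng.gauss(0, 5) for _ in range(n_pairs)]
        subjects.append((xs, ys))
    return subjects

subjects = simulate()
within = [pearson_r(xs, ys) for xs, ys in subjects]      # one r per subject
mean_x = [sum(xs) / len(xs) for xs, _ in subjects]
mean_y = [sum(ys) / len(ys) for _, ys in subjects]
r_means = pearson_r(mean_x, mean_y)                      # based on 4 points
pooled_x = [x for xs, _ in subjects for x in xs]
pooled_y = [y for _, ys in subjects for y in ys]
r_pooled = pearson_r(pooled_x, pooled_y)                 # wrongly treats 40 pairs as independent

print("within subjects:", [round(r, 2) for r in within])
print("subject means:", round(r_means, 2), "pooled:", round(r_pooled, 2))
```

Running this with several different seeds shows the pooled coefficient following the chance arrangement of the four subject means, while being judged against the critical value for 40 observations rather than 4.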
Fig. 11.14. Scatter plots of the 30 second pulse data as in Table 10.15 and with half the pairs of observations reversed
11.13* Intraclass correlation
Sometimes we have pairs of observations where there is no obvious choice of X and Y. The data of Table 10.15 are a good example. Each subject has two measurements made by different observers, different pairs of observers being used for each subject. The choice of X and Y is arbitrary. Figure 11.14 shows the data as in Table 10.15 and with half the pairs arbitrarily reversed. The scatter plots look a little different and there is no good reason to choose one against the other. The correlation coefficients are a little different too: for the original order r = 0.5848 and for the second order r = 0.5804. These are very similar, of course, but which should we use?
It would be nice to have an average correlation coefficient across all the $2^{45}$ possible orderings. This is provided by the intraclass correlation coefficient or ICC. This can be found from the estimates of within subject variance, $s_w^2$, and between subjects variance, $s_b^2$, found from the analysis of variance in §10.12. We have:

$$ICC = \frac{s_b^2}{s_b^2 + s_w^2}$$

For the example, $s_w^2$ = 14.37 and $s_b^2$ = 20.19 (§10.12), hence

$$ICC = \frac{20.19}{20.19 + 14.37} = 0.58$$

The ICC was originally developed for applications such as correlation between variables measured in pairs of twins (which twin is X and which is Y?). We do not have to have pairs of measurements to use the ICC. It works just as well for triplets or for any number of observations within the groups, not necessarily all the same.
Although not used nearly as often as the product moment correlation coefficient, the ICC has some important applications. One is in the study of measurement error and observer variation (§15.2), where if measurements are true replicates the order in which they were made is not important. Another is in the design of cluster-randomized trials where the group is the cluster and may have hundreds of observations within it (§18.8).
Appendices
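11A Appendix: The least squares estimates
This section requires knowledge of calculus. We want to find a and b so that the sum of squares about the line y = a + bx is a minimum. We therefore want to minimize Σ(yi - a - bxi)². This will have a minimum when the partial differentials with respect to a and b are both zero.

$$\frac{\partial}{\partial a}\sum (y_i - a - b x_i)^2 = -2\sum (y_i - a - b x_i) = 0$$

$$\frac{\partial}{\partial b}\sum (y_i - a - b x_i)^2 = -2\sum x_i (y_i - a - b x_i) = 0$$

These give two simultaneous equations:

$$\sum y_i = na + b\sum x_i$$

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2$$

Multiplying the first equation by Σxi/n gives

$$\frac{\sum x_i \sum y_i}{n} = a\sum x_i + \frac{b\left(\sum x_i\right)^2}{n}$$

Subtracting this from the second equation we get

$$\sum x_i y_i - \frac{\sum x_i \sum y_i}{n} = b\left(\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right)$$

This gives us

$$b = \frac{\sum x_i y_i - \sum x_i \sum y_i / n}{\sum x_i^2 - \left(\sum x_i\right)^2 / n} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad a = \bar{y} - b\bar{x}$$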
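The closed form solution of the two simultaneous equations, b equal to the sum of products about the mean divided by the sum of squares of x about its mean and a = ȳ - bx̄, translates directly into code. The Python sketch below is illustrative only; the function name is an assumption and the five data points are hypothetical, chosen to lie exactly on the line y = 1 + 2x so that the answer can be checked by eye.

```python
def least_squares(xs, ys):
    """Least squares intercept a and slope b for the line y = a + b x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Sum of products about the mean over sum of squares of x about its mean.
    b = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
a, b = least_squares(xs, ys)
print(f"a = {a}, b = {b}")
```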
11B Appendix: Variance about the regression line
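The sum of squares about the line used in §11.5 can be obtained with a few lines of algebra. The derivation below is a sketch, in notation assumed to match §11A; it relies only on the least squares results that a = ȳ - bx̄ and that the sum of products about the mean equals b times the sum of squares of x about its mean.

```latex
\begin{align*}
\sum (y_i - a - b x_i)^2
  &= \sum \bigl( (y_i - \bar{y}) - b (x_i - \bar{x}) \bigr)^2
     && \text{since } a = \bar{y} - b\bar{x} \\
  &= \sum (y_i - \bar{y})^2
     - 2b \sum (x_i - \bar{x})(y_i - \bar{y})
     + b^2 \sum (x_i - \bar{x})^2 \\
  &= \sum (y_i - \bar{y})^2
     - 2b^2 \sum (x_i - \bar{x})^2
     + b^2 \sum (x_i - \bar{x})^2
     && \text{since } \sum (x_i - \bar{x})(y_i - \bar{y}) = b \sum (x_i - \bar{x})^2 \\
  &= \sum (y_i - \bar{y})^2 - b^2 \sum (x_i - \bar{x})^2
\end{align*}
```

Dividing this by its n - 2 degrees of freedom gives the residual variance s².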
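11C Appendix: The standard error of b
To find the standard error of b, we must bear in mind that in our regression model all the random variation is in Y. We first rewrite the sum of products:

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum (x_i - \bar{x}) y_i - \bar{y}\sum (x_i - \bar{x}) = \sum (x_i - \bar{x}) y_i$$

since Σ(xi - x̄) = 0. The variance of a constant times a random variable is the square of the constant times the variance of the random variable (§6.6). The xi are constants, not random variables, so

$$VAR(b) = VAR\left(\frac{\sum (x_i - \bar{x}) y_i}{\sum (x_i - \bar{x})^2}\right) = \frac{\sum (x_i - \bar{x})^2 \, VAR(y_i)}{\left(\sum (x_i - \bar{x})^2\right)^2}$$

VAR(yi) is the same for all yi, say VAR(yi) = s². Hence

$$VAR(b) = \frac{s^2 \sum (x_i - \bar{x})^2}{\left(\sum (x_i - \bar{x})^2\right)^2} = \frac{s^2}{\sum (x_i - \bar{x})^2}$$

The standard error of b is the square root of this.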
11M Multiple choice questions 57 to 61
(Each branch is either true or false)
57. In Figure 11.15(a):
(a) predictor and outcome are independent;
(b) predictor and outcome are uncorrelated;
(c) the correlation between predictor and outcome is less than 1;
(d) predictor and outcome are perfectly related;
(e) the relationship is best estimated by simple linear regression.
58. In Figure 11.15(b):
(a) predictor and outcome are independent random variables;
(b) the correlation between predictor and outcome is close to zero;
(c) outcome increases as predictor increases;
(d) predictor and outcome are linearly related;
(e) the relationship could be made linear by a logarithmic transformation of the outcome.
Fig. 11.15. Scatter diagrams
59. A simple linear regression equation:
(a) describes a line which goes through the origin;
(b) describes a line with zero slope;
(c) is not affected by changes of scale;
(d) describes a line which goes through the mean point;
(e) is affected by the choice of dependent variable.
60. If the t distribution is used to find a confidence interval for the slope of a regression line:
(a) deviations from the line in the independent variable must follow a Normal distribution;
(b) deviations from the line in the dependent variable must follow a Normal distribution;
(c) the variance about the line is assumed to be the same throughout the range of the predictor variable;
(d) the y variable must be log transformed;
(e) all the points must lie on the line.
61. The product moment correlation coefficient, r:
(a) must lie between -1 and +1;
(b) can only have a valid significance test carried out when at least one of the variables is from a Normal distribution;
(c) is 0.5 when there is no relationship;
(d) depends on the choice of dependent variable;
(e) measures the magnitude of the change in one variable associated with a change in the other.
11E Exercise: Comparing two regression lines
Table 11.4 and Figure 11.16 show the PEFR and heights of samples of male and female medical students. Table 11.5 shows the sums of squares and products for these data.
1. Estimate the slopes of the regression lines for females and males.
2. Estimate the standard errors of the slopes.
3. Find the standard error for the difference between the slopes, which are independent. Calculate a 95% confidence interval for the difference.
4. Use the standard error to test the null hypothesis that the slopes are the same in the population from which these data come.
Fig. 11.16. PEFR and height for female and male medical students
Table 11.4. Height and PEFR in a sample of medical students
Females
Ht PEFR Ht PEFR Ht PEFR Ht PEFR Ht
155 450 163 428 168 480 164 540 175
155 475 163 548 168 595 167 470 176
155 503 164 485 169 510 167 530 176
158 440 165 485 170 455 167 598 177
160 360 166 430 171 430 168 510 177
161 383 166 440 171 537 168 560 177
161 461 166 485 172 442 170 510 177
161 470 166 510 172 463 170 547 177
161 470 167 415 172 490 170 553 177
161 475 167 455 174 540 170 560 177
161 480 167 470 174 540 171 460 178
162 450 167 500 176 535 171 473 178
162 475 168 430 177 513 171 550 178
162 550 168 440 181 522 171 575 178
163 370 172 480 178
172 550 180
172 620 181
174 550 181
174 550 181
174 616
Table 11.5. Summary statistics for height and PEFR in a sample of medical students
Females Males
Number 43 58
Sum of squares, height 1469.9 2292.0
Sum of squares, PEFR 101124.8 226994.1
Sum of products about mean 4220.1 9048.2
12
Methods based on rank order
12.1* Non-parametric methods
In Chapters 10 and 11 I described a number of methods of analysis which relied on the assumption that the data came from a Normal distribution. To be more precise, we could say the data come from one of the Normal family of distributions, the particular Normal distribution involved being defined by its mean and standard deviation, the parameters of the distribution. These methods are called parametric because we estimate the parameters of the underlying Normal distribution. Methods which do not assume a particular family of distributions for the data are said to be non-parametric. In this and the next chapter I shall consider some non-parametric tests of significance. There are many others, but these will illustrate the general principle. We have already met one non-parametric test, the sign test (§9.2). The large sample Normal test could also be regarded as non-parametric.
It is useful to distinguish between three types of measurement scales. On an interval scale, the size of the difference between two values on the scale has a consistent meaning. For example, the difference in temperature between 1°C and 2°C is the same as the difference between 31°C and 32°C. On an ordinal scale, observations are ordered, but differences may not have a meaning. For example, anxiety is often measured using sets of questions, the number of positive answers giving the anxiety scale. A set of 36 questions would give a scale from 0 to 36. The difference in anxiety between scores of 1 and 2 is not necessarily the same as the difference between scores 31 and 32. On a nominal scale, we have a qualitative or categorical variable, where individuals are grouped but not necessarily ordered. Eye colour is a good example. When categories are ordered, we can treat the scale as either ordered or nominal, as appropriate.
All the methods of Chapters 10 and 11 apply to interval data, being based on differences of observations from the mean. Most of the methods in this chapter apply to ordinal data. Any interval scale which does not meet the requirements of Chapters 10 and 11 may be treated as ordinal, since it is, of course, ordered. This is the more common application in medical work.
General texts such as Armitage and Berry (1994), Snedecor and Cochran (1980) and Colton (1974) tend not to go into a lot of detail about rank and related methods, and more specialized books are needed (Siegel 1956, Conover 1980).
12.2* The Mann-Whitney U test
This is the non-parametric analogue of the two-sample t test (§10.3). It works like this. Consider the following artificial data showing observations of a variable in two independent groups, A and B:
A 7 4 9 17
B 11 6 21 14
We want to know whether there is any evidence that A and B are drawn from populations with different levels of the variable. The null hypothesis is that there is no tendency for members of one population to exceed members of the other. The alternative is that there is such a tendency, in one direction or the other. First we arrange the observations in ascending order, i.e. we rank them:
4 6 7 9 11 14 17 21
A B A A B B A B
We now choose one group, say A. For each A, we count how many Bs precede it. For the first A, 4, no Bs precede. For the second, 7, one B precedes, for the third A, 9, one B, for the fourth, 17, three Bs. We add these numbers of preceding Bs together to give U = 0 + 1 + 1 + 3 = 5. Now, if U is very small, nearly all the As are less than nearly all the Bs. If U is large, nearly all As are greater than nearly all Bs. Moderate values of U mean that As and Bs are mixed. The minimum U is 0, when all Bs exceed all As, and maximum U is n1 × n2 when all As exceed all Bs. The magnitude of U has a meaning, because U/n1n2 is an estimate of the probability that an observation drawn at random from population A would exceed an observation drawn at random from population B.
There is another possible U, which we will call U′, obtained by counting the number of As before each B, rather than the number of Bs before each A. This would be 1 + 3 + 3 + 4 = 11. The two possible values of U and U′ are related by U + U′ = n1n2. So we subtract U′ from n1n2 to give 4 × 4 - 11 = 5.
If we know the distribution of U under the null hypothesis that the samples come from the same population, we can say with what probability these data could have arisen if there were no difference. We can carry out the test of significance. The distribution of U under the null hypothesis can be found easily. The two sets of four observations can be arranged in 70 different ways, from AAAABBBB to BBBBAAAA (8!/(4!4!) = 70, §6A). Under the null hypothesis these arrangements are all equally likely and, hence, have probability 1/70. Each has its value of U, from 0 to 16, and by counting the number of arrangements which give each value of U we can find the probability of that value. For example, U = 0 only arises from the order AAAABBBB and so has probability 1/70 = 0.014. U = 1 only arises from AAABABBB and so has probability 1/70 = 0.014 also. U = 2 can arise in two ways: AAABBABB and AABAABBB. It has probability 2/70 = 0.029. The full set of probabilities is shown in Table 12.1.
We apply this to the example. For groups A and B, U = 5 and the probability of this is 0.071. As we did for the sign test (§9.2) we consider the probability of more extreme values of U, U = 5 or less, which is 0.071 + 0.071 + 0.043 + 0.029 + 0.014 + 0.014 = 0.242.
This gives a one-sided test. For a two-sided test, we must consider the probabilities of a difference as extreme in the opposite direction. We can see from Table 12.1 that the distribution of U is symmetrical, so the probability of an equally extreme value in the opposite direction is also 0.242, hence the two-sided probability is 0.242 + 0.242 = 0.484. Thus the observed difference would have been quite probable if the null hypothesis were true and the two samples could have come from the same population.
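The counting argument can be carried out by brute force. The following Python sketch is illustrative and its function names are assumptions: it computes U for the artificial data by counting the Bs that precede each A, then lists every way of placing four As among eight positions to build the null distribution of U.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def u_statistic(a_values, b_values):
    """Number of (A, B) pairs in which the B precedes (is less than) the A."""
    return sum(1 for a in a_values for b in b_values if b < a)

def null_distribution(n_a, n_b):
    """Exact distribution of U when every arrangement is equally likely."""
    total = n_a + n_b
    counts = Counter()
    for a_positions in combinations(range(total), n_a):
        a_set = set(a_positions)
        b_positions = [p for p in range(total) if p not in a_set]
        counts[u_statistic(a_positions, b_positions)] += 1
    n_arrangements = sum(counts.values())
    return {u: Fraction(c, n_arrangements) for u, c in sorted(counts.items())}

u = u_statistic([7, 4, 9, 17], [11, 6, 21, 14])
dist = null_distribution(4, 4)
one_sided = sum(p for value, p in dist.items() if value <= u)
two_sided = 2 * one_sided          # the distribution is symmetrical

print("U =", u)
print("one-sided P =", float(one_sided), "two-sided P =", float(two_sided))
```

The exact one-sided probability is 17/70, which is 0.243 to three decimal places; the 0.242 in the text comes from adding the probabilities after each has been rounded.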
Table 12.1. Distribution of the Mann-Whitney U statistic, for two samples of size 4
U Probability U Probability U Probability
0 0.014 6 0.100 12 0.071
1 0.014 7 0.100 13 0.043
2 0.029 8 0.114 14 0.029
3 0.043 9 0.100 15 0.014
4 0.071 10 0.100 16 0.014
5 0.071 11 0.071
Table 12.2. Two-sided 5% points for the distribution of the smaller value of U in the Mann-Whitney U test
n2
n1 2 3 4 5 6 7 8 9 10 11
2 - - - - - - 0 0 0 0
3 - - - 0 1 1 2 2 3 3
4 - - 0 1 2 3 4 4 5 6
5 - 0 1 2 3 5 6 7 8 9
6 - 1 2 3 5 6 8 10 11 13
7 - 1 3 5 6 8 10 12 14 16
8 0 2 4 6 8 10 13 15 17 19
9 0 2 4 7 10 12 15 17 20 23
10 0 3 5 8 11 14 17 20 23 26
11 0 3 6 9 13 16 19 23 26 30
12 1 4 7 11 14 18 22 26 29 33
13 1 4 8 12 16 20 24 28 33 37
14 1 5 9 13 17 22 26 31 36 40
15 1 5 10 14 19 24 29 34 39 44
16 1 6 11 15 21 26 31 37 42 47
17 2 6 11 17 22 28 34 39 45 51
18 2 7 12 18 24 30 36 42 48 55
19 2 7 13 19 25 32 38 45 52 58
20 2 8 13 20 27 34 41 48 55 62
If U is less than or equal to the tabulated value the difference is significant.
In practice, there is no need to carry out the summation of probabilities described above, as these are already tabulated. Table 12.2 shows the 5% points of U for each combination of sample sizes n1 and n2 up to 20. For our groups A and B, U = 5. We find the n2 = 4 column and the n1 = 4 row. From this we see that the 5% point for U is 0, and so U = 5 is not significant. If we had calculated the larger of the two values of U, 11, we can use Table 12.2 by finding the lower value, n1n2 - U = 16 - 11 = 5.
Table 12.3. Biceps skinfold thickness (mm) in two groups of patients
Crohn's disease Coeliac disease
1.8 2.8 4.2 6.2 1.8 3.8
2.2 3.2 4.4 6.6 2.0 4.2
2.4 3.6 4.8 7.0 2.0 5.4
2.5 3.8 5.6 10.0 2.0 7.6
2.8 4.0 6.0 10.4 3.0
We can now turn to the practical analysis of some real data. Consider the biceps skinfold thickness data of Table 10.4, reproduced as Table 12.3. We will analyse these using the Mann-Whitney U test. Denote the Crohn's disease group by A and the coeliac group by B. The joint order is as follows:
1.8 1.8 2.0 2.0 2.0 2.2 2.4 2.5 2.8 2.8 3.0 3.2 3.6 3.8 3.8 4.0 4.2 4.2 4.4 4.8 5.4 5.6 6.0 6.2 6.6 7.0 7.6 10.0 10.4
A B B B B A A A A A B A A A B A A B A A B A A A A A B A A
Let us count the As before each B. Immediately we have a problem. The first A and the first B have the same value. Does the first A come before the first B or after it? We resolve this dilemma by counting one half for the tied A. The ties between the second, third and fourth Bs do not matter, as we can count the number of As before each without difficulty. We have for the U statistic:
U = 0.5 + 1 + 1 + 1 + 6 + 8.5 + 10.5 + 13 + 18 = 59.5
This is the lower value, since n1n2 = 9 × 20 = 180 and so the middle value is 90. We can therefore refer U to Table 12.2. The critical value at the 5% level for groups size 9 and 20 is 48, which our value exceeds. Hence the difference is not significant at the 5% level and the data are consistent with the null hypothesis that there is no tendency for members of one population to exceed members of the other. This is the same as the result of the t test of §10.4.
For larger values of n1 and n2 calculation of U can be rather tedious. A simple formula for U can be found using the ranks. The rank of the lowest observation is 1, of the next is 2, and so on. If a number of observations are tied, each having the same value and hence the same rank, we give each the average of the ranks they would have were they ordered. For example, in the skinfold data the first two observations are each 1.8. They each receive rank (1 + 2)/2 = 1.5. The third, fourth and fifth are tied at 2.0, giving each of them rank (3 + 4 + 5)/3 = 4. The sixth, 2.2, is not tied and so has rank 6. The ranks for the skinfold data are as follows:
skinfold 1.8 1.8 2.0 2.0 2.0 2.2 2.4 2.5 2.8 2.8
group A B B B B A A A A A
rank 1.5 1.5 4 4 4 6 7 8 9.5 9.5
r1 r2 r3 r4
skinfold 3.0 3.2 3.6 3.8 3.8 4.0 4.2 4.2 4.4 4.8
group B A A A B A A B A A
rank 11 12 13 14.5 14.5 16 17.5 17.5 19 20
r5 r6 r7
skinfold 5.4 5.6 6.0 6.2 6.6 7.0 7.6 10.0 10.4
group B A A A A A B A A
rank 21 22 23 24 25 26 27 28 29
r8 r9
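We denote the ranks of the B group by r1, r2, …, rn1. The number of As preceding the first B must be r1 - 1, since there are no Bs before it and it is the r1th observation. The number of As preceding the second B is r2 - 2, since it is the r2th observation, and one preceding observation is a B. Similarly, the number preceding the third B is r3 - 3, and the number preceding the ith B is ri - i. Hence we have:
$$U = \sum_{i=1}^{n_1} (r_i - i) = \sum_{i=1}^{n_1} r_i - \sum_{i=1}^{n_1} i = \sum_{i=1}^{n_1} r_i - \frac{n_1(n_1+1)}{2}$$
That is, we add together the ranks of all the n1 observations, subtract n1(n1 + 1)/2 and we have U. For the example, we have
$$U = 1.5 + 4 + 4 + 4 + 11 + 14.5 + 17.5 + 21 + 27 - \frac{9 \times 10}{2} = 104.5 - 45 = 59.5$$
as before. This formula is sometimes written
$$U' = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum_{i=1}^{n_1} r_i$$
But this is simply based on the other group, since U + U′ = n1n2. For testing we use the smaller value, as before.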
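The rank formula can be checked with a short Python sketch. The helper `average_ranks` is a hypothetical name introduced here for illustration; it gives tied observations the average of the ranks they would otherwise occupy, as described in the text, and the data are again those of Table 12.3.

```python
crohn = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]  # group A
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]        # group B


def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + 1 + j + 1) / 2  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


ranks = average_ranks(crohn + coeliac)
rank_sum_b = sum(ranks[len(crohn):])        # ranks of the coeliac group (B)
n1, n2 = len(coeliac), len(crohn)
u = rank_sum_b - n1 * (n1 + 1) / 2          # U from the ranks of group B
u_other = n1 * n2 - u                       # the value based on the other group
print(rank_sum_b, u, u_other)
```

Whichever group the ranks are taken from, the two values sum to n1n2, so the smaller can always be recovered.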
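If the null hypothesis is true and the samples are not too small, U is approximately Normally distributed with mean n1n2/2 and standard deviation √(n1n2(n1 + n2 + 1)/12). Hence
$$\frac{U - n_1 n_2/2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}}$$
is an observation from a Standard Normal distribution. For the example, n1 = 9 and n2 = 20. We have
$$\frac{59.5 - 9 \times 20/2}{\sqrt{9 \times 20 \times (9 + 20 + 1)/12}} = \frac{-30.5}{21.21} = -1.44$$
From Table 7.1 this gives two-sided probability = 0.15, similar to that found by the two sample t test (§10.3).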
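As a rough check on the large sample calculation, the following Python sketch standardizes U = 59.5 using mean n1n2/2 and standard deviation √(n1n2(n1 + n2 + 1)/12), then takes the two-sided tail area from the Standard Normal distribution. It assumes no adjustment for ties, and uses `math.erfc` in place of Table 7.1.

```python
from math import erfc, sqrt

n1, n2, u = 9, 20, 59.5

mean_u = n1 * n2 / 2
sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mean_u) / sd_u

# Two-sided tail area of the Standard Normal distribution beyond |z|.
p_two_sided = erfc(abs(z) / sqrt(2))
print(round(z, 2), round(p_two_sided, 2))
```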
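Neither Table 12.2 nor the above formula for the standard deviation of U take ties into account; both assume the data can be fully ranked. Their use for data with ties is an approximation. For small samples we must accept this. For the Normal approximation, ties can be allowed for using the following formula for the standard deviation of U when the null hypothesis is true, written here in one standard form:
$$\sqrt{\frac{n_1 n_2}{N(N-1)}\left(\frac{N^3 - N}{12} - \sum \frac{t^3 - t}{12}\right)}$$
where N = n1 + n2, t is the number of observations sharing a particular tied value, and the sum is over all groups of tied observations. When there are no ties the sum is zero and this reduces to the formula given above.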
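A minimal sketch of a tie-adjusted standard deviation for U follows. It uses one standard form of the adjustment, sqrt(n1 n2 / (N(N - 1)) × ((N³ - N)/12 - Σ(t³ - t)/12)), where N = n1 + n2 and t is the size of each group of tied values; this form and the function name `sd_u_with_ties` are assumptions made for illustration and may be laid out differently elsewhere.

```python
from collections import Counter
from math import sqrt

crohn = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]


def sd_u_with_ties(sample_a, sample_b):
    """Standard deviation of U under the null hypothesis, allowing for ties."""
    n1, n2 = len(sample_a), len(sample_b)
    n = n1 + n2
    tie_sizes = Counter(sample_a + sample_b).values()
    tie_term = sum(t ** 3 - t for t in tie_sizes) / 12  # zero when nothing is tied
    return sqrt(n1 * n2 / (n * (n - 1)) * ((n ** 3 - n) / 12 - tie_term))


sd_ties = sd_u_with_ties(crohn, coeliac)
sd_no_ties = sqrt(9 * 20 * 30 / 12)
print(round(sd_ties, 3), round(sd_no_ties, 3))
```

For these data the adjustment is very small, which fits the remark that ignoring a few ties is a reasonable approximation.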
The Mann-Whitney U test is a non-parametric analogue of the two sample t test. The advantage over the t test is that the only assumption about the distribution of the data is that the observations can be ranked, whereas for the t test we must assume the data are from Normal distributions with uniform variance. There are disadvantages. For data which are Normally distributed, the U test is less powerful than the t test, i.e. the t test, when valid, can detect smaller differences for given sample size. The U test is almost as powerful for moderate and large sample sizes, and this difference is important only for small samples. For very small samples, e.g. two groups of three observations, the test is useless as all possible values of U have probabilities above 0.05 (Table 12.2). The U test is primarily a test of significance. The t method also enables us to estimate the size of the difference and gives a confidence interval. Although as noted above U/n1n2 has an interpretation, we cannot, so far as I know, find a confidence interval for it.
Table 12.4. Frequency distributions of number of nodes involved in breast cancers detected at screening and detected in the intervals between screens (data of Mohammed Raja)
Screening cancers Interval cancers
Nodes Frequency Nodes Frequency
0 291 0 66
1 43 1 22
2 16 2 7
3 20 3 7
4 13 4 2
5 3 5 4
6 1 6 4
7 4 7 3
8 3 8 3
9 1 9 2
10 1 10 2
11 2 12 2
12 1 13 1
15 1 15 1
16 1 16 1
17 2 20 1
18 2
20 1
27 1
33 1
Total 408 128
Mean 1.21 2.19
Median 0 0
75%ile 1 3
The null hypothesis of the Mann–Whitney test is sometimes presented as being that the populations have the same median. There is even a confidence interval for the difference between two medians based on the Mann–Whitney test (Campbell and Gardner 1989). This is surprising, as the medians are not involved in the calculation. Furthermore, we can have two groups which are significantly different using the Mann–Whitney U test yet have the same median. Table 12.4 shows an example. The majority of observations in both groups are zero, so transformation to the Normal is impossible. Although the samples are quite large, the distribution is so skew that a rank method, appropriately adjusted for ties, may be safer than the method of §9.7. The Mann–Whitney U test was highly significant, yet the medians are both zero. As the medians were equal, I suggested the 75th percentile as a measure of location for the distributions.
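The summary rows of Table 12.4 can be reproduced from the frequency distributions with a short Python sketch. The percentile rule used here (the smallest value whose cumulative frequency reaches the required proportion) is an assumption made for illustration; other conventions exist, although they give the same answers for these data.

```python
# Nodes involved -> number of cancers, typed in from Table 12.4.
screening = {0: 291, 1: 43, 2: 16, 3: 20, 4: 13, 5: 3, 6: 1, 7: 4, 8: 3, 9: 1,
             10: 1, 11: 2, 12: 1, 15: 1, 16: 1, 17: 2, 18: 2, 20: 1, 27: 1, 33: 1}
interval = {0: 66, 1: 22, 2: 7, 3: 7, 4: 2, 5: 4, 6: 4, 7: 3, 8: 3, 9: 2,
            10: 2, 12: 2, 13: 1, 15: 1, 16: 1, 20: 1}


def mean(freq):
    """Mean of a frequency distribution given as {value: frequency}."""
    return sum(value * f for value, f in freq.items()) / sum(freq.values())


def percentile(freq, p):
    """Smallest value whose cumulative frequency reaches proportion p."""
    target = p * sum(freq.values())
    running = 0
    for value in sorted(freq):
        running += freq[value]
        if running >= target:
            return value


for name, freq in (("screening", screening), ("interval", interval)):
    print(name, sum(freq.values()), round(mean(freq), 2),
          percentile(freq, 0.5), percentile(freq, 0.75))
```

The medians coincide at zero while the means and upper quartiles differ, which is the point of the example.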
The reason for these two different views of the Mann–Whitney U test lies in the assumptions we make about the distributions in the two populations. If we make no assumptions, we can test the null hypothesis: that the probability that a member of the first population drawn at random will exceed a member of the second population drawn at random is one half. Some people choose to make an assumption about the distributions: that they have the same shape and differ only in location (mean or median). If this assumption is true, then if the distributions are different the medians must be different. The means must differ by the same amount. It is a very strong assumption. For example, if it is true then the variances must be the same in the two populations. For the reasons given in §10.5 and §7A, it is unlikely that we could get this if the distributions were not Normal. Under this assumption the Mann–Whitney U test will rarely be valid if the two sample t test is not valid also.
There are other non-parametric tests which test the same or similar null hypotheses. Two of these, the Wilcoxon two sample test and the Kendall Tau test, are different versions of the Mann–Whitney U test which were developed around the same time and later shown to be identical. These names are sometimes used interchangeably. The test statistics and tables are not the same, and the user must be very careful that the calculation of the test statistic being used corresponds to the table to which it is referred. Another difficulty with tables is that some are drawn so that for a significant difference U must be less than or equal to the tabulated value (as in Table 12.2), for others U must be strictly less than the tabulated value.
For more than two groups, the rank analogue of one-way analysis of variance (§10.9) is the Kruskal–Wallis test, see Conover (1980) and Siegel (1956). Conover (1980) also describes a multiple comparison test for the pairs of groups, similar to those described in §10.11.
12.3* The Wilcoxon matched pairs test
This test is an analogue of the paired t test. We have a sample measured under two conditions and the null hypothesis is that there is no tendency for the outcome on one condition to be higher or lower than the other. The alternative hypothesis is that the outcome on one condition tends to be higher or lower than the other. As the test is based on the magnitude of the differences, the data must be interval.
Consider the data of Table 12.5, previously discussed in §2.6 and §9.2, where we used the sign test for the analysis. In the sign test, we have ignored the magnitude of differences, and only considered their signs. If we can use information about the magnitude, we would hope to have a more powerful test. Clearly, we must have interval data to do this. To avoid making assumptions about the distribution of the differences, we use their rank order in a similar manner to the Mann–Whitney U test.
Table 12.5. Results of a trial of pronethalol for the prevention of angina pectoris (Pritchard et al. 1963), in rank order of differences
Number of attacks while on     Difference                Rank of difference
Placebo     Pronethalol        placebo - pronethalol     All     Positive     Negative
2           0                  2                         1.5     1.5
17          15                 2                         1.5     1.5
3           0                  3                         3       3
7           2                  5                         4       4
8           1                  7                         6       6
14          7                  7                         6       6
23          16                 7                         6       6
34          25                 9                         8       8
79          65                 14                        9       9
60          41                 19                        10      10
323         348                -25                       11                   11
71          29                 42                        12      12
Sum of ranks                                                     67           11
First, we rank the differences by their absolute values, i.e. ignoring the sign. As in §12.2, tied observations are given the average of their ranks. We now sum the ranks of the positive differences, 67, and the ranks of the negative differences, 11 (Table 12.5). If the null hypothesis were true and there was no difference, we would expect the rank sums for positive and negative differences to be about the same, equal to 39 (their average). The test statistic is the lesser of these sums, T. The smaller T is, the lower the probability of the data arising by chance.
The distribution of T when the null hypothesis is true can be found by enumerating all the possibilities, as described for the Mann–Whitney U statistic. Table 12.6 gives the 5% and 1% points for this distribution, for sample size n up to 25. For the example, n = 12 and so the difference would be significant at the 5% level if T were less than or equal to 14. We have T = 11, so the data are not consistent with the null hypothesis. The data support the view that there is a real tendency for patients to have fewer attacks while on the active treatment.
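The calculation of T can be sketched in Python as follows. The data are typed in from Table 12.5; `average_ranks` is a hypothetical helper written for this illustration, and zero differences (there are none here) are dropped before ranking.

```python
placebo = [2, 17, 3, 7, 8, 14, 23, 34, 79, 60, 323, 71]
pronethalol = [0, 15, 0, 2, 1, 7, 16, 25, 65, 41, 348, 29]


def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + 1 + j + 1) / 2
        i = j + 1
    return ranks


# Differences placebo - pronethalol, omitting any zero differences.
diffs = [p - q for p, q in zip(placebo, pronethalol) if p != q]
ranks = average_ranks([abs(d) for d in diffs])  # rank by absolute value

t_positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
t_negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
t = min(t_positive, t_negative)
print(len(diffs), t_positive, t_negative, t)
```

The value of `t` is then referred to Table 12.6 with n equal to the number of non-zero differences.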
From Table 12.6, we can see that the probability that T ≤ 11 lies between 0.05 and 0.01. This is greater than the probability given by the sign test, which was 0.006 (§9.2). Usually we would expect greater power, and hence lower probabilities when the null hypothesis is false, when we use more of the information. In this case, the greater probability reflects the fact that the one negative difference, -25, is large. Examination of the original data shows that this individual had very large numbers of attacks on both treatments, and it seems possible that he may belong to a different population from the other eleven.
Like Table 12.2, Table 12.6 is based on the assumption that the differences can be fully ranked and there are no ties. Ties may occur in two ways in this test. Firstly, ties may occur in the ranking sense. In the example we had two differences of +2 and three of +7. These were ranked equally: 1.5 and 1.5, and 6, 6 and 6. When ties are present between negative and positive differences, Table 12.6 only approximates to the distribution of T.
Table 12.6. Two-sided 5% and 1% points of the distribution of T (lower value) in the Wilcoxon one-sample test
(Entries are the tabulated values for which the probability that T ≤ the tabulated value is 5% or 1%)
Sample size n 5% 1% Sample size n 5% 1%
5 - - 16 30 19
6 1 - 17 35 23
7 2 - 18 40 28
8 4 0 19 46 32
9 6 2 20 52 37
10 8 3 21 59 43
11 11 5 22 66 49
12 14 7 23 73 55
13 17 10 24 81 61
14 21 13 25 90 68
15 25 16
Ties may also occur between the paired observations, where the observed difference is zero. In the same way as for the sign test, we omit zero differences (§9.2). Table 12.6 is used with n as the number of non-zero differences only, not the total number of differences. This seems odd, in that a lot of zero differences would appear to support the null hypothesis. For example, if in Table 12.5 we had another dozen patients with zero differences, the calculation and conclusion would be the same. However, the mean difference would be smaller and the Wilcoxon test tells us nothing about the size of the difference, only its existence. This illustrates the danger of allowing significance tests to outweigh all other ways of looking at the data.
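For larger samples a Normal approximation can be used. When the null hypothesis is true, T has mean n(n + 1)/4 and standard deviation √(n(n + 1)(2n + 1)/24), so that
$$\frac{T - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$$
is from a Standard Normal distribution if the null hypothesis is true. For the example of Table 12.5, we have:
$$\frac{11 - 12 \times 13/4}{\sqrt{12 \times 13 \times 25/24}} = \frac{11 - 39}{12.75} = -2.20$$
From Table 7.1 this gives a two-tailed probability of 0.028, similar to that obtained from Table 12.6.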
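The large sample calculation for T can be sketched as below, assuming the usual null mean n(n + 1)/4 and standard deviation √(n(n + 1)(2n + 1)/24) with no adjustment for ties; `math.erfc` stands in for Table 7.1.

```python
from math import erfc, sqrt

n, t = 12, 11  # number of non-zero differences and the smaller rank sum

mean_t = n * (n + 1) / 4
sd_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (t - mean_t) / sd_t

# Two-tailed area of the Standard Normal distribution beyond |z|.
p_two_tailed = erfc(abs(z) / sqrt(2))
print(mean_t, round(sd_t, 2), round(z, 2), round(p_two_tailed, 3))
```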
We have three possible tests for paired data, the Wilcoxon, sign and paired t methods. If the differences are Normally distributed, the t test is the most powerful test. The Wilcoxon test is almost as powerful, however, and in practice the difference is not great except for small samples. Like the Mann–Whitney U test, the Wilcoxon is useless for very small samples. The sign test is similar in power to the Wilcoxon for very small samples, but as the sample size increases the Wilcoxon test becomes much more powerful. This might be expected since the Wilcoxon test uses more of the information. The Wilcoxon test uses the magnitude of the differences, and hence requires interval data. This means that, as for t methods, we will get different results if we transform the data. For truly ordinal data we should use the sign test. The paired t method also gives a confidence interval for the difference. The Wilcoxon test is purely a test of significance, but a confidence interval for the median difference can be found using the Binomial method described in §8.9.
12.4* Spearman's rank correlation coefficient, ρ
We noted in Chapter 11 the sensitivity to assumptions of Normality of the product moment correlation coefficient, r. This led to the development of non-parametric approaches based on ranks. Spearman's approach was direct. First we rank the observations, then calculate the product moment correlation of the ranks, rather than of the observations themselves. The resulting statistic has a distribution which does not depend on the distribution of the original variables. It is usually denoted by the Greek letter ρ, pronounced ‘rho’, or by rs.
Table 12.7 shows data from a study of the geographical distribution of a tumour, Kaposi's sarcoma, in mainland Tanzania. The incidence rates were calculated from cancer registry data and there was considerable doubt that all cases were notified. The degree of reporting of cases may have been related to population density or availability of health services. In addition, incidence was closely related to age and sex (where recorded) and so could be related to the age and sex distribution in the region. To check that none of these were producing artefacts in the geographical distribution, I calculated the rank correlation of disease incidence with each of the possible explanatory variables. Table 12.7 shows the relationship of incidence to the percentage of the population living within 10 km of a health centre. Figure 12.1 shows the scatter diagram of these data. The percentage within 10 km of a health centre is very highly skewed, whereas the disease incidence appears somewhat bimodal. The assumptions of the product moment correlation do not appear to be met, so rank correlation was preferred.
Table 12.7. Incidence of Kaposi's sarcoma and access of population to health centres for each region of mainland Tanzania (Bland et al. 1977)
Region   Incidence per million per year   Percent population within 10 km of health centre   Rank order: incidence   Rank order: population %
Coast 1.28 4.0 1 3
Shinyanga 1.66 9.0 2 7
Mbeya 2.06 6.7 3 6
Tabora 2.37 1.8 4 1
Arusha 2.46 13.7 5 13
Dodoma 2.60 11.1 6 10
Kigoma 4.22 9.2 7 8
Mara 4.29 4.4 8 4
Tanga 4.54 23.0 9 16
Singida 6.17 10.8 10 9
Morogoro 6.33 11.7 11 11
Mtwara 6.40 14.8 12 14
Westlake 6.60 12.5 13 12
Kilimanjaro 6.65 57.3 14 17
Ruvuma 7.21 6.6 15 5
Iringa 8.46 2.6 16 2
Mwanza 8.54 20.7 17 15
Fig. 12.1. Incidence of Kaposi's sarcoma per million per year and percentage of population within 10 km of a health centre, for 17 regions of mainland Tanzania
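The calculation of Spearman's ρ proceeds as follows. The ranks for the two variables are found (Table 12.7). We apply the formula for the product moment correlation (§11.9) to these ranks. Writing xi and yi for the two ranks of region i, with means x̄ and ȳ, we define:
$$\rho = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$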
Table 12.8. Two-sided 5% and 1% points of the distribution of Spearman's ρ
(Entries are the tabulated values for which the probability that ρ is as far or further from 0 than the tabulated value is 5% or 1%)
Sample size n 5% 1%
4 - -
5 1.00 -
6 0.89 1.00
7 0.82 0.96
8 0.79 0.93
9 0.70 0.83
10 0.68 0.81
We have ignored the problem of ties in the above. We treat observations with the same value as described in §12.2. We give them the average of the ranks they would have if they were separable and apply the rank correlation formula as described above. In this case the distribution of Table 12.8 is only approximate.
There are several ways of calculating this coefficient, resulting in formulae which appear quite different, though they give the same result (see Siegel 1956).
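As an illustration of two such calculations agreeing, the sketch below applies the product moment formula to the ranks of Table 12.7 and compares the result with the well-known shortcut 1 - 6Σd²/(n(n² - 1)), where d is the difference between the two ranks of a region. The shortcut is not given in the text above and is exact only when there are no ties; the function name `product_moment` is hypothetical.

```python
from math import sqrt

# Rank orders typed in from Table 12.7 (regions listed in order of incidence).
incidence_rank = list(range(1, 18))
access_rank = [3, 7, 6, 1, 13, 10, 8, 4, 16, 9, 11, 14, 12, 17, 5, 2, 15]


def product_moment(x, y):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rho = product_moment(incidence_rank, access_rank)

n = len(incidence_rank)
sum_d2 = sum((a - b) ** 2 for a, b in zip(incidence_rank, access_rank))
rho_shortcut = 1 - 6 * sum_d2 / (n * (n * n - 1))  # valid here because no ranks are tied
print(round(rho, 2), round(rho_shortcut, 2))
```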
12.5* Kendall's rank correlation coefficient, τ
Spearman's rank correlation is quite satisfactory for testing the null hypothesis of no relationship, but is difficult to interpret as a measurement of the strength of the relationship. Kendall developed a different rank correlation coefficient, Kendall's τ, which has some advantages over Spearman's. (The Greek letter τ is pronounced ‘tau’.) It is rather more tedious to calculate than Spearman's, but in the computer age this hardly matters. For each pair of subjects we observe whether the subjects are ordered in the same way by the two variables, a concordant pair, ordered in opposite ways, a discordant pair, or equal for one of the variables and so not ordered at all, a tied pair. Kendall's τ is the proportion of concordant pairs minus the proportion of discordant pairs. τ will be one if the rankings are identical, as all pairs will be ordered in the same way, and minus one if the rankings are exactly opposite, as all pairs will be ordered in the opposite way.
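We shall denote the number of concordant pairs (ordered the same way) by nc, the number of discordant pairs (ordered in opposite ways) by nd, and the difference, nc - nd, by S. There are n(n - 1)/2 pairs altogether, so
$$\tau = \frac{n_c - n_d}{n(n-1)/2} = \frac{S}{n(n-1)/2}$$
When there are no ties, nc + nd = n(n - 1)/2.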
The simplest way to calculate nc is to order the observations by one of the variables, as in Table 12.7 which is ordered by disease incidence. Now consider the second ranking (% population within 10 km of a health centre). The first region, Coast, has 14 regions below it which have greater rank, so the pairs formed by the first region and these will be in the correct order. There are 2 regions below it which have lower rank, so the pairs formed by the first region and these will be in the opposite order. The second region, Shinyanga, has 10 regions below it with greater rank and so contributes 10 further pairs in the correct order. Note that the pair ‘Coast and Shinyanga’ has already been counted. There are 5 pairs in opposite order. The third region, Mbeya, has 10 regions below it in the same order and 4 in opposite orders, and so on. We add these numbers to get nc and nd:
nc = 14 + 10 + 10 + 13 + 4 + 6 + 7 + 8 + 1 + 5 + 4 + 2 + 2 + 0 + 1 + 1 + 0 = 88
nd = 2 + 5 + 4 + 0 + 8 + 5 + 3 + 1 + 7 + 2 + 2 + 3 + 2 + 3 + 1 + 0 + 0 = 48
The number of pairs is n(n - 1)/2 = 17 × 16/2 = 136. Because there are no ties, we could also calculate nd by nd = n(n - 1)/2 - nc = 136 - 88 = 48. S = nc - nd = 88 - 48 = 40. Hence τ = S/(n(n - 1)/2) = 40/136 = 0.29.
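The pair counting can be checked by brute force over all pairs of regions. This Python sketch uses the rank orders of Table 12.7; the function name `concordance_counts` is a hypothetical one chosen for illustration.

```python
# Rank orders typed in from Table 12.7 (regions listed in order of incidence).
incidence_rank = list(range(1, 18))
access_rank = [3, 7, 6, 1, 13, 10, 8, 4, 16, 9, 11, 14, 12, 17, 5, 2, 15]


def concordance_counts(x, y):
    """Return (nc, nd): pairs ordered the same way and in opposite ways."""
    nc = nd = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                nc += 1
            elif product < 0:
                nd += 1
            # product == 0 is a tied pair and contributes to neither count
    return nc, nd


nc, nd = concordance_counts(incidence_rank, access_rank)
n = len(incidence_rank)
s = nc - nd
tau = s / (n * (n - 1) / 2)
print(nc, nd, s, round(tau, 2))
```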
When there are ties, τ cannot be one. However, we could have perfect correlation if the ties were between the same subjects for both variables. To allow for this, we use a different version of τ, τb. Consider the denominator. There are n(n - 1)/2 possible pairs. If there are t individuals tied at a particular rank for variable X, no pairs from these t individuals contribute to S. There are t(t - 1)/2 such pairs. If we consider all the groups of tied individuals we have Σt(t - 1)/2 pairs which do not contribute to S, summing over all groups of tied ranks. Hence the total number of pairs which can contribute to S is n(n - 1)/2 - Σt(t - 1)/2, and S cannot be greater than n(n - 1)/2 - Σt(t - 1)/2. The size of S is also limited by ties in the second ranking. If we denote the number of individuals with the same value of Y by u, then the number of pairs which can contribute to S is n(n - 1)/2 - Σu(u - 1)/2. We now define τb by
$$\tau_b = \frac{S}{\sqrt{\left(n(n-1)/2 - \sum t(t-1)/2\right)\left(n(n-1)/2 - \sum u(u-1)/2\right)}}$$
Note that if there are no ties, Σt(t - 1)/2 = 0 = Σu(u - 1)/2. When the rankings are identical τb = 1, no matter how many ties there are. Kendall (1970) also discusses two other ways of dealing with ties, obtaining coefficients τa and τc, but their use is restricted.
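A minimal sketch of τb follows, dividing S by the square root of the product of the two tie-adjusted pair counts described above. The small data set with ties is made up purely for illustration, and `kendall_tau_b` is a hypothetical function name.

```python
from collections import Counter
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: S over the geometric mean of the tie-adjusted pair counts."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            s += (product > 0) - (product < 0)  # +1 concordant, -1 discordant, 0 tied
    pairs = n * (n - 1) / 2
    tied_x = sum(t * (t - 1) / 2 for t in Counter(x).values())
    tied_y = sum(u * (u - 1) / 2 for u in Counter(y).values())
    return s / sqrt((pairs - tied_x) * (pairs - tied_y))


# Made-up rankings with ties, identical for both variables.
x = [1, 2, 2, 3, 4, 4, 4]
tau_b_identical = kendall_tau_b(x, x)

# With no ties tau-b reduces to S / (n(n - 1)/2), as for the Table 12.7 ranks.
incidence_rank = list(range(1, 18))
access_rank = [3, 7, 6, 1, 13, 10, 8, 4, 16, 9, 11, 14, 12, 17, 5, 2, 15]
tau_b_kaposi = kendall_tau_b(incidence_rank, access_rank)
print(tau_b_identical, round(tau_b_kaposi, 2))
```

For the made-up data, the unadjusted τ would be 17/21, since 4 of the 21 pairs are tied, whereas τb is exactly one.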
We often want to test the null hypothesis that there is no relationship between the two variables in the population from which our sample was drawn. As usual, we are concerned with the probability of S being as or more extreme (i.e. far from zero) than the observed value. Table 12.9 was calculated in the same way as Tables 12.1 and 12.2. It shows the probability of being as extreme as the observed value of S for n up to 10. For convenience, S is tabulated rather than τ. When ties are present this is only an approximation.
When the sample size is greater than 10, S has an approximately Normal distribution under the null hypothesis, with mean zero. If there are no ties, the variance is
$$\mathrm{VAR}(S) = \frac{n(n-1)(2n+5)}{18}$$
When there are ties, the variance formula is very complicated (Kendall 1970). I shall omit it, as in practice these calculations will be done using computers anyway. If there are not many ties it will not make much difference if the simple form is used.
For the example, S = 40, n = 17 and there are no ties, so the Standard Normal variate is
$$\frac{S}{\sqrt{n(n-1)(2n+5)/18}}$$
From Table 7.1 of the Normal distribution we find that the two-sided probability of a value as extreme as this is 0.06 × 2 = 0.12, which is very similar to that found using Spearman's ρ. The product moment correlation, r, gives r = 0.30, P = 0.24, but of course the non-Normal distributions of the variables make this P invalid.
Why have two different rank correlation coefficients? Spearman's ρ is older than Kendall's τ, and can be thought of as a simple analogue of the product moment correlation coefficient, Pearson's r. τ is a part of a more general and consistent system of ranking methods, and has a direct interpretation, as the difference between the proportions of concordant and discordant pairs. In general, the numerical value of ρ is greater than that of τ. It is not possible to calculate τ from ρ or ρ from τ; they measure different sorts of correlation. ρ gives more weight to reversals of order when data are far apart in rank than when there is a reversal close together in rank; τ does not. However, in terms of tests of significance both have the same power to reject a false null hypothesis, so for this purpose it does not matter which is used.
Table 12.9. Two-sided 5% and 1% points of the distribution of S for Kendall's τ
(Entries are the tabulated values for which the probability that S is as far or further from the expected than the tabulated value is 5% or 1%)
Sample size n 5% 1%
4 - -
5 10 -
6 13 15
7 15 19
8 18 22
9 20 26
10 23 29
12.6* Continuity corrections
In this chapter, when samples were large we have used a continuous distribution, the Normal, to approximate to a discrete distribution, U, T or S. For example, Figure 12.2 shows the distribution of the Mann–Whitney U statistic for n1 = 4, n2 = 4 (Table 12.1) with the corresponding Normal curve. From the exact distribution, the probability that U ≤ 2 is 0.014 + 0.014 + 0.029 = 0.057. The corresponding Standard Normal deviate is
$$\frac{U - n_1 n_2/2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}} = \frac{2 - 4 \times 4/2}{\sqrt{4 \times 4 \times 9/12}} = \frac{-6}{3.464} = -1.732$$
This has a probability of 0.042, interpolating in Table 7.1. This is smaller than the exact probability. The disparity arises because the continuous distribution gives probability to values other than the integers 0, 1, 2, etc. The estimated probability for U = 2 can be found by the area under the curve between U = 1.5 and U = 2.5. The corresponding Normal deviates are -1.876 and -1.588, which have probabilities from Table 7.1 of 0.030 and 0.056. This gives the estimated probability for U = 2 to be 0.056 - 0.030 = 0.026, which compares quite well with the exact figure of 0.029. Thus to estimate the probability that U ≤ 2, we estimate the area below U = 2.5, not below U = 2. This gives us a Standard Normal deviate of -1.588, as already noted, and hence a probability of 0.056. This corresponds remarkably well with the exact probability of 0.057, especially when we consider how small n1 and n2 are.
We will get a better approximation from our Standard Normal deviate if we make U closer to its expected value by 1/2. In general, we get a better fit if we make the observed value of the statistic closer to its expected value by half of the interval between adjacent discrete values. This is a continuity correction.
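The effect of the correction for n1 = n2 = 4 can be seen by enumerating the exact distribution of U and comparing the lower tail at U = 2 with the Normal approximation, with and without the half-unit adjustment. This Python sketch assumes that under the null hypothesis every choice of 4 ranks out of 8 for one group is equally likely; the helper name `normal_lower_tail` is hypothetical.

```python
from itertools import combinations
from math import erfc, sqrt

n1 = n2 = 4
total = n1 + n2

# Exact null distribution: each set of ranks for one group is equally likely.
u_values = [sum(group_ranks) - n1 * (n1 + 1) // 2
            for group_ranks in combinations(range(1, total + 1), n1)]
exact = sum(1 for u in u_values if u <= 2) / len(u_values)

mean_u = n1 * n2 / 2
sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)


def normal_lower_tail(x):
    """Area of the approximating Normal curve below x."""
    z = (x - mean_u) / sd_u
    return 0.5 * erfc(-z / sqrt(2))


uncorrected = normal_lower_tail(2)    # area below U = 2
corrected = normal_lower_tail(2.5)    # area below U = 2.5, i.e. U moved 1/2 towards its mean
print(round(exact, 3), round(uncorrected, 3), round(corrected, 3))
```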
Fig. 12.2. Distribution of the Mann-Whitney U statistic, n1 = 4, n2 = 4, when the null hypothesis is true, with the corresponding Normal distribution and area estimating PROB(U = 2)
For S, the interval between adjacent values is 2, not 1, for S = nc - nd = 2nc - n(n - 1)/2, and nc is an integer. A change of one unit in nc produces a change of two units in S. The continuity correction is therefore half of 2, which is 1. We make S closer to the expected value of 0 by 1 before applying the Normal approximation. For the Kaposi's sarcoma data, we had S = 40, with n = 17. Using the continuity correction gives
$$\frac{S - 1}{\sqrt{n(n-1)(2n+5)/18}}$$
This gives a two-sided probability of 0.066 × 2 = 0.13, slightly greater than the uncorrected value of 0.12.
Continuity corrections are important for small samples; for large samples they are negligible. We shall meet another in Chapter 13.
12.7* Parametric or non-parametric methods?
For many statistical problems there are several possible solutions, just as for many diseases there are several treatments, similar perhaps in their overall efficacy but displaying variation in their side effects, in their interactions with other diseases or treatments and in their suitability for different types of patients. There is often no one right treatment, but rather treatment is decided on the prescriber's judgement of these effects, past experience and plain prejudice. Many problems in statistical analysis are like this. In comparing the means of two small groups, for instance, we could use a t test, a t test with a transformation, a Mann-Whitney U test, or one of several others. Our choice of method depends on the plausibility of Normal assumptions, the importance of obtaining a confidence interval, the ease of calculation, and so on. It depends on plain prejudice, too. Some users of statistical methods are very concerned about the implications of Normal assumptions and will advocate non-parametric methods wherever possible, while others are too careless of the errors that may be introduced when assumptions are not met.
I sometimes meet people who tell me that they have used non-parametric methods throughout their analysis as if this is some kind of badge of statistical purity. It is nothing of the kind. It may mean that their significance tests have less power than they might have, and that results are left as ‘not significant’ when, for example, a confidence interval for a difference might be more informative.
On the other hand, such methods are very useful when the necessary assumptions of the t distribution method cannot be made, and it would be equally wrong to eschew their use. Rather, we should choose the method most suited to the problem, bearing in mind both the assumptions we are making and what we really want to know. We shall say more about what method to use when in Chapter 14.
There is a common misconception that when the number of observations is very small, usually said to be less than six, Normal distribution methods such as t tests and regression must not be used and that rank methods should be used instead. I have never seen any argument put forward in support of this, but inspection of Tables 12.2, 12.6, 12.8, and 12.9 will show that it is nonsense. For such small samples rank tests cannot produce any significance at the usual 5% level. Should one need statistical analysis of such small samples, Normal methods are required.
12M* Multiple choice questions 62 to 66
(Each branch is either true or false)
62. For comparing the responses to a new treatment of a group of patients with the responses of a control group to a standard treatment, possible approaches include:
(a) the two-sample t method;
(b) the sign test;
(c) the Mann-Whitney U test;
(d) the Wilcoxon matched pairs test;
(e) rank correlation between responses to the treatments.
63. Suitable methods for truly ordinal data include:
(a) the sign test;
(b) the Mann-Whitney U test;
(c) the Wilcoxon matched pairs test;
(d) the two sample t method;
(e) Kendall's rank correlation coefficient.
64. Kendall's rank correlation coefficient between two variables:
(a) depends on which variable is regarded as the predictor;
(b) is zero when there is no relationship;
(c) cannot have a valid significance test when there are tied observations;
(d) must lie between -1 and +1;
(e) is not affected by a log transformation of the variables.
65. Tests of significance based on ranks:
(a) are always to be preferred to methods which assume the data to be Normally distributed;
(b) are less powerful than methods based on the Normal distribution when data are Normally distributed;
(c) enable confidence intervals to be estimated easily;
(d) require no assumptions about the data;
(e) are often to be preferred when data cannot be assumed to follow any particular distribution.
66. Ten men with angina were given an active drug and a placebo on alternate days in random order. Patients were tested using the time in minutes for which they could exercise until angina or fatigue stopped them. The existence of an active drug effect could be examined by:
(a) paired t test;
(b) Mann-Whitney U test;
(c) sign test;
(d) Wilcoxon matched pairs test;
(e) Spearman's ρ.
12E* Exercise: Application of rank methods
In this exercise we shall analyse the respiratory compliance data of §10E using non-parametric methods.
1. For the data of Table 10.19, use the sign test to test the null hypothesis that changing the waveform has no effect on static compliance.
2. Test the same null hypothesis using a test based on ranks.
3. Repeat step 1 using log transformed compliance. Does the transformation make any difference?
4. Repeat step 2 using log compliance. Why do you get a different answer?
5. What do you conclude about the effect of waveform from the non-parametric tests?
6. How do the conclusions of the parametric and non-parametric approaches differ?
13
The analysis of cross-tabulations
13.1 The chi-squared test for association
Table 13.1 shows for a sample of mothers the relationship between housing tenure and whether they had a preterm delivery. This kind of cross-tabulation of frequencies is also called a contingency table or cross-classification. Each entry in the table is a frequency, the number of individuals having these characteristics (§4.1). It can be quite difficult to measure the strength of the association between two qualitative variables like these, but it is easy to test the null hypothesis that there is no relationship or association between the two variables. If the sample is large, we do this by a chi-squared test.
The chi-squared test for association in a contingency table works like this. The null hypothesis is that there is no association between the two variables, the alternative being that there is an association of any kind. We find for each cell of the table the frequency which we would expect if the null hypothesis were true. To do this we use the row and column totals, so we are finding the expected frequencies for tables with these totals, called the marginal totals.
There are 1443 women, of whom 899 were owner occupiers, a proportion 899/1443. If there were no relationship between time of delivery and housing tenure, we would expect each column of the table to have the same proportion, 899/1443, of its members in the first row. Thus the 99 patients in the first column would be expected to have 99 × 899/1443 = 61.7 in the first row. By ‘expected’ we mean the average frequency we would get in the long run. We could not actually observe 61.7 subjects. The 1344 patients in the second column would be expected to have 1344 × 899/1443 = 837.3 in the first row. The sum of these two expected frequencies is 899, the row total. Similarly, there are 258 patients in the second row and so we would expect 99 × 258/1443 = 17.7 in the second row, first column and 1344 × 258/1443 = 240.3 in the second row, second column. We calculate the expected frequency for each row and column combination, or cell. The 10 cells of Table 13.1 give us the expected frequencies shown in Table 13.2. Notice that the row and column totals are the same as in Table 13.1. In general, the expected frequency for a cell of the contingency table is found by
$$\text{expected frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
It does not matter which variable is the row and which the column.
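The expected frequencies can be computed directly from the marginal totals, as in this Python sketch. The observed frequencies are typed in from Table 13.1, and `expected_frequencies` is a hypothetical helper name.

```python
# Observed frequencies from Table 13.1: rows are housing tenure, columns preterm and term.
observed = [[50, 849], [29, 229], [11, 164], [6, 66], [3, 36]]


def expected_frequencies(table):
    """Expected frequency for each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 1) for e in row])
```

Each row of `expected` sums to the corresponding observed row total, which is why the marginal totals of the expected table match those of the observed table.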
Table 13.1. Contingency table showing time of delivery by housing tenure
Housing tenure        Preterm   Term   Total
Owner–occupier        50        849    899
Council tenant        29        229    258
Private tenant        11        164    175
Lives with parents    6         66     72
Other                 3         36     39
Total                 99        1344   1443
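We now compare the observed and expected frequencies. If the two variables are not associated, the observed and expected frequencies should be close together, any discrepancy being due to random variation. We need a test statistic which measures this. The differences between observed and expected frequencies are a good place to start. We cannot simply sum them as the sum would be zero, both observed and expected frequencies having the same grand total, 1443. We can resolve this as we resolved a similar problem with differences from the mean (§4.7), by squaring the differences. The size of the difference will also depend in some way on the number of patients. When the row and column totals are small, the difference between observed and expected is forced to be small. It turns out, for reasons discussed in §13A, that the best statistic is
$$\sum_{\text{all cells}} \frac{(\text{observed frequency} - \text{expected frequency})^2}{\text{expected frequency}}$$
This is often written as
$$\sum \frac{(O - E)^2}{E}$$
For Table 13.1 this is
$$\sum \frac{(O - E)^2}{E} = \frac{(50 - 61.7)^2}{61.7} + \frac{(849 - 837.3)^2}{837.3} + \cdots + \frac{(36 - 36.3)^2}{36.3} = 10.5$$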
As will be explained in §13A, the distribution of this test statistic when the null hypothesis is true and the sample is large enough is the Chi-squared distribution (§7A) with (r - 1)(c - 1) degrees of freedom, where r is the number of rows and c is the number of columns. I shall discuss what is meant by ‘large enough’ in §13.3. We are treating the row and column totals as fixed and only considering the distribution of tables with these totals. The test is said to be conditional on these totals. We can prove that we lose very little information by doing this and we get a simple test.
Table 13.2. Expected frequencies under the null hypothesis for Table 13.1
Housing tenure        Preterm   Term     Total
Owner–occupier        61.7      837.3    899
Council tenant        17.7      240.3    258
Private tenant        12.0      163.0    175
Lives with parents    4.9       67.1     72
Other                 2.7       36.3     39
Total                 99        1344     1443
Fig. 13.1. Percentage point of the Chi-squared distribution
For Table 13.1 we have (5 - 1) × (2 - 1) = 4 degrees of freedom. Table 13.3 shows some percentage points of the Chi-squared distribution for selected degrees of freedom. These are the upper percentage points, as shown in Figure 13.1. We see that for 4 degrees of freedom the 5% point is 9.49 and 1% point is 13.28, so our observed value of 10.5 has probability between 1% and 5%, or 0.01 and 0.05. If we use a computer program which prints out the actual probability, we find P = 0.03. The data are not consistent with the null hypothesis and we can conclude that there is good evidence of a relationship between housing tenure and time of delivery.
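The whole test for Table 13.1 can be sketched in Python without any statistical library. The upper tail probability uses the closed form exp(-x/2) Σ (x/2)^k / k!, summed for k from 0 to df/2 - 1, which holds only when the degrees of freedom are even; that restriction and the function names are assumptions of this illustration, not a general-purpose routine.

```python
from math import exp

observed = [[50, 849], [29, 229], [11, 164], [6, 66], [3, 36]]  # Table 13.1


def chi_squared(table):
    """Sum over all cells of (O - E)^2 / E, with E from the marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


def upper_tail_even_df(x, df):
    """P(Chi-squared > x) for an even number of degrees of freedom only."""
    half = x / 2
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


x2 = chi_squared(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p = upper_tail_even_df(x2, df)
print(round(x2, 1), df, round(p, 2))
```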
The chi-squared statistic is not an index of the strength of the association. If we double the frequencies in Table 13.1, this will double chi-squared, but the strength of the association is unchanged. Note that we can only use the chi-squared test when the numbers in the cells are frequencies, not when they are percentages, proportions or measurements.
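The doubling property is easy to confirm numerically. This sketch recomputes the statistic for Table 13.1 and for the same table with every frequency doubled; the function name `chi_squared` is hypothetical.

```python
observed = [[50, 849], [29, 229], [11, 164], [6, 66], [3, 36]]  # Table 13.1
doubled = [[2 * o for o in row] for row in observed]


def chi_squared(table):
    """Sum over all cells of (O - E)^2 / E, with E from the marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return sum((o - row_totals[i] * col_totals[j] / grand_total) ** 2
               / (row_totals[i] * col_totals[j] / grand_total)
               for i, row in enumerate(table) for j, o in enumerate(row))


x2_original = chi_squared(observed)
x2_doubled = chi_squared(doubled)

# The row percentages, which describe the association, are the same in both tables.
preterm_percent = [100 * row[0] / sum(row) for row in observed]
preterm_percent_doubled = [100 * row[0] / sum(row) for row in doubled]
print(round(x2_original, 1), round(x2_doubled, 1))
```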
Table 13.3. Percentage points of the Chi-squared distribution
(Entries are the values exceeded with the stated probability, Figure 13.1)
Degrees of freedom 10% 5% 1% 0.1%
1 2.71 3.84 6.63 10.83
2 4.61 5.99 9.21 13.82
3 6.25 7.81 11.34 16.27
4 7.78 9.49 13.28 18.47
5 9.24 11.07 15.09 20.52
6 10.64 12.59 16.81 22.46
7 12.02 14.07 18.48 24.32
8 13.36 15.51 20.09 26.13
9 14.68 16.92 21.67 27.88
10 15.99 18.31 23.21 29.59
11 17.28 19.68 24.73 31.26
12 18.55 21.03 26.22 32.91
13 19.81 22.36 27.69 34.53
14 21.06 23.68 29.14 36.12
15 22.31 25.00 30.58 37.70
16 23.54 26.30 32.00 39.25
17 24.77 27.59 33.41 40.79
18 25.99 28.87 34.81 42.31
19 27.20 30.14 36.19 43.82
20 28.41 31.41 37.57 45.32
Table 13.4. Cough during the day or at night at age 14 for children with and without a history of bronchitis before age 5 (Holland et al. 1978)
            Bronchitis   No bronchitis   Total
Cough       26           44              70
No cough    247          1002            1249
Total       273          1046            1319
13.2 Tests for 2 by 2 tables
Consider the data on cough symptom and history of bronchitis discussed in §9.8. We had 273 children with a history of bronchitis of whom 26 were reported to have day or night cough, and 1046 children without history of bronchitis, of whom 44 were reported to have day or night cough. We can set these data out as a contingency table, as in Table 13.4. We can also use the chi-squared test to test the null hypothesis of no association between cough and history. The expected values are shown in Table 13.5. The test statistic is
$$\sum \frac{(O - E)^2}{E} = \frac{(26 - 14.49)^2}{14.49} + \frac{(44 - 55.51)^2}{55.51} + \frac{(247 - 258.51)^2}{258.51} + \frac{(1002 - 990.49)^2}{990.49} = 12.2$$
We have r = 2 rows and c = 2 columns, so there are (r - 1)(c - 1) = (2 - 1) × (2 - 1) = 1 degree of freedom. We see from Table 13.3 that the 5% point is 3.84, and the 1% point is 6.63, so we have observed something very unlikely if the null hypothesis were true. Hence we reject the null hypothesis of no association and conclude that there is a relationship between present cough and history of bronchitis.
Table 13.5. Expected frequencies for Table 13.4
            Bronchitis   No bronchitis   Total
Cough       14.49        55.51           70.00
No cough    258.51       990.49          1249.00
Total       273.00       1046.00         1319.00
Now the null hypothesis ‘no association between cough and bronchitis’ is the same as the null hypothesis ‘no difference between the proportions with cough in the bronchitis and no bronchitis groups’. If there were a difference, the variables would be associated. Thus we have tested the same null hypothesis in two different ways. In fact these tests are exactly equivalent. If we take the Normal deviate from §9.8, which was 3.49, and square it, we get 12.2, the chi-squared value. The method of §9.8 and §8.6 has the advantage that it can also give us a confidence interval for the size of the difference, which the chi-squared method does not. Note that the chi-squared test corresponds to the two-sided z test, even though only the upper tail of the chi-squared distribution is used.
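The equivalence can be verified numerically for Table 13.4. The sketch below assumes that the Normal deviate of §9.8 is the difference between the two proportions divided by its standard error under the null hypothesis, computed from the pooled proportion; that reading of §9.8 is an assumption made here.

```python
from math import sqrt

# Table 13.4: children with cough and group sizes.
cough_b, n_b = 26, 273      # history of bronchitis
cough_nb, n_nb = 44, 1046   # no history of bronchitis

# Large sample Normal test for two proportions, using the pooled proportion.
p_b, p_nb = cough_b / n_b, cough_nb / n_nb
pooled = (cough_b + cough_nb) / (n_b + n_nb)
se = sqrt(pooled * (1 - pooled) * (1 / n_b + 1 / n_nb))
z = (p_b - p_nb) / se

# Chi-squared statistic for the same 2 by 2 table.
observed = [[cough_b, cough_nb], [n_b - cough_b, n_nb - cough_nb]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
x2 = sum((o - row_totals[i] * col_totals[j] / grand_total) ** 2
         / (row_totals[i] * col_totals[j] / grand_total)
         for i, row in enumerate(observed) for j, o in enumerate(row))
print(round(z, 2), round(z * z, 1), round(x2, 1))
```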
13.3Thechi-squaredtestforsmallsamplesWhenthenullhypothesisistrue,theteststatisticΣ(O-E)2/E,whichwecancallthechi-squaredstatistic,followstheChi-squareddistributionprovidedtheexpectedvaluesarelargeenough.Thisisalargesampletest,likethoseof§9.7and§9.8.Thesmallertheexpectedvaluesbecome,themoredubiouswillbethetest.
The conventional criterion for the test to be valid is usually attributed to the great statistician W. G. Cochran. The rule is this: the chi-squared test is valid if at least 80% of the expected frequencies exceed 5 and all the expected frequencies exceed 1. We can see that Table 13.2 satisfies this requirement, since only 2 out of 10 expected frequencies, 20%, are less than 5 and none is less than 1. Note that this condition applies to the expected frequencies, not the observed frequencies. It is quite acceptable for an observed frequency to be 0, provided the expected frequencies meet the criterion.
This criterion is open to question. Simulation studies appear to suggest that the condition may be too conservative and that the chi-squared approximation works for smaller expected values, especially for larger numbers of rows and columns. At the time of writing the analysis of tables based on small sample sizes, particularly 2 by 2 tables, is the subject of hot dispute among statisticians. As yet, no-one has succeeded in devising a better rule than Cochran's, so I would recommend keeping to it until the theoretical questions are resolved. Any chi-squared test which does not satisfy the criterion is always open to the charge that its validity is in doubt.
Table 13.6. Observed and expected frequencies of categories of radiological appearance at six months as compared with appearance on admission in the MRC streptomycin trial, patients with an initial temperature of 100–100.9°F
Radiological         Streptomycin            Control
assessment        Observed  Expected    Observed  Expected
Improvement          13        8.4          5        9.6
Deterioration         2        4.2          7        4.8
Death                 0        2.3          5        2.7
Total                15       15           17       17
If the criterion is not satisfied we can usually combine or delete rows and columns to give bigger expected values. Of course, this cannot be done for 2 by 2 tables, which we consider in more detail below. For example, Table 13.6 shows data from the MRC streptomycin trial (§2.2), the results of radiological assessment for a subgroup of patients defined by a prognostic variable. We want to know whether there is evidence of a streptomycin effect within this subgroup, so we want to test the null hypothesis of no effect using a chi-squared test. There are 4 out of 6 expected values less than 5, so the test on this table would not be valid. We can combine the rows so as to raise the expected values. Since the small expected frequencies are in the ‘deterioration’ and ‘death’ rows, it makes sense to combine these to give a ‘deterioration or death’ row. The expected values are then all greater than 5 and we can do the chi-squared test with 1 degree of freedom. This editing must be done with regard to the meaning of the various categories. In Table 13.6, there would be no point in combining rows 1 and 3 to give a new category of ‘considerable improvement or death’ to be compared to the remainder, as the comparison would be absurd. The new table is shown in Table 13.7. We have
$$\sum \frac{(O-E)^2}{E} = \frac{(13-8.4)^2}{8.4} + \frac{(5-9.6)^2}{9.6} + \frac{(2-6.6)^2}{6.6} + \frac{(12-7.4)^2}{7.4} = 10.8$$
Under the null hypothesis this is from a Chi-squared distribution with one degree of freedom, and from Table 13.3 we can see that the probability of getting a value as extreme as 10.8 is less than 1%. We have data inconsistent with the null hypothesis and we can conclude that the evidence suggests a treatment effect in this subgroup.
If the table does not meet the criterion even after reduction to a 2 by 2 table, we can apply either a continuity correction to improve the approximation to the Chi-squared distribution (§13.5), or an exact test based on a discrete distribution (§13.4).
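The check against Cochran's rule and the combining of rows can be sketched in a few lines of Python. This is an illustration only; the function names are hypothetical labels of mine. Because the sketch keeps unrounded expected frequencies, it gives a chi-squared value of about 10.6 for the reduced table rather than the 10.8 obtained from expected values rounded to one decimal place.

```python
def expected_frequencies(table):
    """Expected frequency of each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def meets_cochran_rule(table):
    """At least 80% of expected frequencies exceed 5 and all exceed 1."""
    expected = [e for row in expected_frequencies(table) for e in row]
    share_above_5 = sum(e > 5 for e in expected) / len(expected)
    return share_above_5 >= 0.8 and all(e > 1 for e in expected)

def chi_squared(table):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Observed frequencies from Table 13.6 (streptomycin, control).
table_13_6 = [[13, 5],   # improvement
              [2, 7],    # deterioration
              [0, 5]]    # death

# Combine 'deterioration' and 'death' into one row, as in Table 13.7.
table_13_7 = [table_13_6[0],
              [x + y for x, y in zip(table_13_6[1], table_13_6[2])]]

print(meets_cochran_rule(table_13_6), meets_cochran_rule(table_13_7))
print(round(chi_squared(table_13_7), 2))
```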
Table 13.7. Reduction of Table 13.6 to a 2 by 2 table
Radiological               Streptomycin            Control
assessment              Observed  Expected    Observed  Expected
Improvement                13        8.4          5        9.6
Deterioration or death      2        6.6         12        7.4
Total                      15       15.0         17       17.0
Table 13.8. Artificial data to illustrate Fisher's exact test
              Survived   Died   Total
Treatment A       3        1      4
Treatment B       2        2      4
Total             5        3      8
13.4 Fisher's exact test
The chi-squared test described in §13.1 is a large sample test. When the sample is not large and expected values are less than 5, we can turn to an exact distribution like that for the Mann–Whitney U statistic (§12.2). This method is called Fisher's exact test.
The exact probability distribution for the table can only be found when the row and column totals are given. Just as with the large sample chi-squared test, we restrict our attention to tables with these totals. This difficulty has led to much controversy about the use of this test. I shall show how the test works, then discuss its applicability.
Consider the following artificial example. In an experiment, we randomly allocate 4 patients to treatment A and 4 to treatment B, and get the outcome shown in Table 13.8. We want to know the probability of so large a difference in mortality between the two groups if the treatments have the same effect (the null hypothesis). We could have randomized the subjects into two groups in many ways, but if the null hypothesis is true the same three would have died. The row and column totals would therefore be the same for all these possible allocations. If we keep the row and column totals constant, there are only 4 possible tables, shown in Table 13.9. These tables are found by putting the values 0, 1, 2, 3 in the ‘Died in group A’ cell. Any other values would make the D total greater than 3.
Now, let us label our subjects a to h. The survivors we will call a to e, and the deaths f to h. How many ways can these patients be arranged in two groups of 4 to give tables i, ii, iii and iv? Table i can arise in 5 ways. Patients f, g, and h would have to be in group B, to give 3 deaths, and the remaining member of B could be a, b, c, d or e. Table ii can arise in 30 ways. The 3 survivors in group A can be abc, abd, abe, acd, ace, ade, bcd, bce, bde, cde, 10 ways. The death in A can be f, g or h, 3 ways. Hence the group can be made up in 10 × 3 = 30 ways. Table iii is the same as table ii, with A and B reversed, so arises in 30 ways. Table iv is the same as table i with A and B reversed, so arises in 5 ways.
Hence we can arrange the 8 patients into 2 groups of 4 in 5 + 30 + 30 + 5 = 70 ways. Now, the probability of any one arrangement arising by chance is 1/70, since they are all equally likely if the null hypothesis is true. Table i arises from 5 of the 70 arrangements, so has probability 5/70 = 0.071. Table ii arises from 30 out of 70 arrangements, so has probability 30/70 = 0.429. Similarly, Table iii has probability 30/70 = 0.429, and Table iv has probability 5/70 = 0.071.
Table 13.9. Possible tables for the totals of Table 13.8
i.      S   D   T        ii.     S   D   T
   A    4   0   4           A    3   1   4
   B    1   3   4           B    2   2   4
   T    5   3   8           T    5   3   8
iii.    S   D   T        iv.     S   D   T
   A    2   2   4           A    1   3   4
   B    3   1   4           B    4   0   4
   T    5   3   8           T    5   3   8
Hence, under the null hypothesis that there is no association between treatment and survival, Table ii, which we observed, has a probability of 0.429. It could easily have arisen by chance and so it is consistent with the null hypothesis. As in §9.2, we must also consider tables more extreme than the observed. In this case, there is one more extreme table in the direction of the observed difference, Table i. In the direction of the observed difference, the probability of the observed table or a more extreme one is 0.071 + 0.429 = 0.5. This is the P value for a one-sided test (§9.5).
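The counting argument can be reproduced by brute force. The sketch below is illustrative Python, with variable names of my own choosing: it lists every way of allocating the eight labelled patients to treatment A and counts how many deaths fall in that group.

```python
from collections import Counter
from itertools import combinations

patients = "abcdefgh"
survivors = set("abcde")   # under the null hypothesis f, g and h die anyway

# Every way of choosing the 4 patients allocated to treatment A;
# the other 4 patients form group B.
deaths_in_a = Counter(
    sum(p not in survivors for p in group_a)
    for group_a in combinations(patients, 4)
)

total = sum(deaths_in_a.values())
probabilities = {d: count / total for d, count in sorted(deaths_in_a.items())}

# Table ii (1 death in A) was observed; table i (0 deaths in A) is more extreme.
one_sided_p = probabilities[0] + probabilities[1]

print(dict(sorted(deaths_in_a.items())), total, one_sided_p)
```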
Fisher's exact test is essentially one sided. It is not clear what the corresponding deviations in the other direction would be, especially when all the marginal totals are different. This is because in that case the distribution is asymmetrical, unlike those of §12.2–5. One solution is to double the one-sided probability to get a two-sided test when this is required. I follow Armitage and Berry (1994) in preferring this option. Another solution is to calculate probabilities for every possible table and sum all probabilities less than or equal to the probability for the observed table to give the P value. This may give a smaller P value than the doubling method.
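There is no need to enumerate all the possible tables, as above. The probability can be found from a simple formula (§13B). The probability of observing a set of frequencies f11, f12, f21, f22, when the row and column totals are r1, r2, c1, and c2 and the grand total is n, is
$$\frac{r_1!\,r_2!\,c_1!\,c_2!}{n!\,f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$
(See §6A for the meaning of n!.) We can calculate this for each possible table so find the probability for the observed table and each more extreme one. For the example:
$$\text{Table i:}\quad \frac{4!\,4!\,5!\,3!}{8!\,4!\,0!\,1!\,3!} = 0.071 \qquad\qquad \text{Table ii:}\quad \frac{4!\,4!\,5!\,3!}{8!\,3!\,1!\,2!\,2!} = 0.429$$
giving a total of 0.50 as before.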
Unlike the exact distributions for the rank statistics, this distribution is fairly easy to calculate but difficult to tabulate. A good table of this distribution required a small book (Finney et al. 1963).
We can apply this test to Table 13.7. The 2 by 2 tables to be tested and their probabilities are:
   13    5           14    4           15    3
    2   12            1   13            0   14
P = 0.0013782     P = 0.0000757     P = 0.0000014
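The total one-sided probability is 0.0014553, which doubled for a two-sided test gives 0.0029. The method using all smaller probabilities gives P = 0.00159. Either is larger than the probability for the X² value of 10.6 (the value found for Table 13.7 when the expected frequencies are not rounded), which is 0.0011.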
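These probabilities can be reproduced with the factorial formula written in terms of binomial coefficients, which is algebraically the same thing. The following Python sketch is an illustration under that assumption; `table_probability` is a hypothetical helper name, not a library function.

```python
from math import comb

def table_probability(f11, r1, r2, c1):
    """Probability of the 2 by 2 table with top-left frequency f11 when the
    row totals r1, r2 and the first column total c1 are fixed.
    Equal to r1! r2! c1! c2! / (n! f11! f12! f21! f22!)."""
    n = r1 + r2
    return comb(r1, f11) * comb(r2, c1 - f11) / comb(n, c1)

# Table 13.7: rows are improvement (13, 5) and deterioration or death (2, 12);
# the first column is streptomycin.
r1, r2, c1 = 18, 14, 15
observed_f11 = 13

# All values of f11 that are possible with these marginal totals.
possible = range(max(0, c1 - r2), min(r1, c1) + 1)
probs = {f: table_probability(f, r1, r2, c1) for f in possible}

# Observed table plus the more extreme tables in the same direction.
one_sided = sum(p for f, p in probs.items() if f >= observed_f11)
doubled = 2 * one_sided

# Alternative two-sided method: sum every probability no larger than that of
# the observed table (a small tolerance guards against floating point ties).
all_smaller = sum(p for p in probs.values()
                  if p <= probs[observed_f11] * (1 + 1e-9))

print(one_sided, doubled, all_smaller)
```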
Fisher's exact test was originally devised for the 2 × 2 table and only used when the expected frequencies were small. This was because for larger numbers and larger tables the calculations were impractical. With computers things have changed, and Fisher's exact test can be done for any 2 × 2 table. Some programs will also calculate Fisher's exact test for larger tables. As the number of rows and columns increases, the number of possible tables increases very rapidly and it becomes impracticable to calculate and store the probability for each one. There are specialist programs such as StatXact which create a random sample of the possible tables and use them to estimate a distribution of probabilities whose tail area is then found. Methods which sample the possibilities in this way are (rather endearingly) called Monte Carlo methods.
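13.5 Yates' continuity correction for the 2 by 2 table
The discrepancy in probabilities between the chi-squared test and Fisher's exact test arises because we are estimating the discrete distribution of the test statistic by the continuous Chi-squared distribution. A continuity correction like those of §12.6, called Yates' correction, can be used to improve the fit. The observed frequencies change in units of one, so we make them closer to their expected values by one half. Hence the formula for the corrected chi-squared statistic for a 2 by 2 table is
$$\sum \frac{\left(|O-E| - \tfrac{1}{2}\right)^2}{E}$$
For Table 13.7, calculating with unrounded expected frequencies, this gives approximately 8.4.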
This has probability 0.0037, which is closer to the exact probability, though there is still a considerable discrepancy. At such extremely low values any approximate probability model such as this is liable to break down. In the critical area between 0.10 and 0.01, the continuity correction usually gives a very good fit to the exact probability. As Fisher's exact test is now so easy to do, Yates' correction may soon disappear.
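A short Python sketch of the corrected statistic for Table 13.7 follows. It is an illustration only: the function names are mine, and the P value uses the fact that a Chi-squared variable with one degree of freedom is the square of a Standard Normal variable, so its upper tail area is erfc(√(x/2)).

```python
from math import erfc, sqrt

def expected_frequencies(table):
    """Expected frequency of each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_squared_2x2(table, yates=False):
    """Chi-squared statistic, optionally moving each |O - E| towards zero by 1/2."""
    correction = 0.5 if yates else 0.0
    expected = expected_frequencies(table)
    return sum(
        (abs(o - e) - correction) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_1df(x):
    """Upper tail area of the Chi-squared distribution with 1 degree of freedom."""
    return erfc(sqrt(x / 2))

table_13_7 = [[13, 5], [2, 12]]
uncorrected = chi_squared_2x2(table_13_7)
corrected = chi_squared_2x2(table_13_7, yates=True)
print(round(uncorrected, 2), round(p_value_1df(uncorrected), 4))
print(round(corrected, 2), round(p_value_1df(corrected), 4))
```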
13.6* The validity of Fisher's and Yates' methods
There has been much dispute among statisticians about the validity of the exact test and the continuity correction which approximates to it. Among the more argumentative of the founding fathers of statistical inference, such as Fisher and Neyman, this was quite acrimonious. The problem is still unresolved, and generating almost as much heat as light.
Note that although both are 2 by 2 tables, Tables 13.4 and 13.7 arose in different ways. In Table 13.7, the column totals were fixed by the design of the experiment and only the row totals are from a random variable. In Table 13.4 neither row nor column totals were set in advance. Both are from the Binomial distribution, depending on the incidence of bronchitis and prevalence of chronic cough in the population. There is a third possibility, that both the row and column totals are fixed. This is rare in practice, but it can be achieved by the following experimental design. We want to know whether a subject can distinguish an active treatment from a placebo. We present him with 10 tablets, 5 of each, and ask him to sort the tablets into the 5 active and 5 placebo. This would give a 2 by 2 table, subject's choice versus truth, in which all row and column totals are preset to 5. There are several variations on these types of table, too. It can be shown that the same chi-squared test applies to all these cases when samples are large. When samples are small, this is not necessarily so. A discussion of the problem is well beyond the scope of this book. For some of these cases, Fisher's exact test and Yates' correction may be conservative, that is, give rather larger probabilities than they should, though this is a matter of debate. My own opinion is that Yates' correction and Fisher's exact test should be used. If we must err, it seems better to err on the side of caution.
Table 13.10. The 2 by 2 table in symbolic notation
                           Total
          a       b        a + b
          c       d        c + d
Total   a + c   b + d   a + b + c + d
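13.7 Odds and odds ratios
If the probability of an event is p then the odds of that event is o = p/(1 - p). The probability that a coin shows a head is 0.5, the odds is 0.5/(1 - 0.5) = 1. Note that ‘odds’ is a singular word, not the plural of ‘odd’. The odds has advantages for some types of analysis, as it is not constrained to lie between 0 and 1, but can take any value from zero to infinity. We often use the logarithm to the base e of the odds, the log odds or logit:
$$\text{logit}(p) = \log_e\left(\frac{p}{1-p}\right)$$
This can vary from minus infinity to plus infinity and thus is very useful in fitting regression type models (§17.8). The logit is zero when p = 1/2 and the logit of 1 - p is minus the logit of p:
$$\text{logit}(1-p) = \log_e\left(\frac{1-p}{p}\right) = -\log_e\left(\frac{p}{1-p}\right) = -\text{logit}(p)$$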
Consider Table 13.4. The probability of cough for children with a history of bronchitis is 26/273 = 0.09524. The odds of cough for children with a history of bronchitis is 26/247 = 0.10526. The probability of cough for children without a history of bronchitis is 44/1046 = 0.04207. The odds of cough for children without a history of bronchitis is 44/1002 = 0.04391.
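The conversion between probability, odds and logit is easy to sketch in Python. The function names below are illustrative labels of my own.

```python
from math import log

def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1 - p)

def logit(p):
    """Log odds, to the base e."""
    return log(odds(p))

# Probabilities of cough from Table 13.4.
p_bronchitis = 26 / 273
p_no_bronchitis = 44 / 1046

print(round(p_bronchitis, 5), round(odds(p_bronchitis), 5))
print(round(p_no_bronchitis, 5), round(odds(p_no_bronchitis), 5))

# The logit is zero at p = 1/2 and changes sign when p is replaced by 1 - p.
print(logit(0.5), logit(0.2) + logit(0.8))
```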
One way to compare children with and without bronchitis is to find the ratio of the proportions of children with cough in the two groups (the relative risk, §8.6). Another is to find the odds ratio, the ratio of the odds of cough in children with bronchitis and children without bronchitis. This is (26/247)/(44/1002) = 0.10526/0.04391 = 2.39718. Thus the odds of cough in children with a history of bronchitis is 2.39718 times the odds of cough in children without a history of bronchitis.
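If we denote the frequencies in the table by a, b, c, and d, as in Table 13.10, the odds ratio is given by
$$OR = \frac{a/c}{b/d} = \frac{ad}{bc}$$
This is symmetrical; we get the same thing by
$$OR = \frac{a/b}{c/d} = \frac{ad}{bc}$$
We can estimate the standard error and confidence interval using the log of the odds ratio (§13C). The standard error of the log odds ratio is:
$$SE(\log_e OR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$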
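Hence we can find the 95% confidence interval. For Table 13.4, the log odds ratio is loge(2.39718) = 0.87429, with standard error
$$\sqrt{\frac{1}{26} + \frac{1}{44} + \frac{1}{247} + \frac{1}{1002}} = 0.25736$$
Provided the sample is large enough, we can assume that the log odds ratio comes from a Normal distribution and hence the approximate 95% confidence interval is
0.87429 - 1.96 × 0.25736 to 0.87429 + 1.96 × 0.25736 = 0.36986 to 1.37872
To get a confidence interval for the odds ratio itself we must antilog:
$$e^{0.36986} \text{ to } e^{1.37872} = 1.45 \text{ to } 3.97$$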
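The whole calculation for Table 13.4 can be sketched as a small Python function. This is an illustration, with a function name of my own; it assumes the large sample Normal approximation on the log scale described above and the cell labels a, b, c, d of Table 13.10.

```python
from math import exp, log, sqrt

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc, the standard error of its log, and an approximate
    confidence interval found on the log scale and then antilogged."""
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, se_log, (lower, upper)

# Table 13.4: a = cough with bronchitis, b = cough without bronchitis,
# c = no cough with bronchitis, d = no cough without bronchitis.
odds_ratio, se_log, (lower, upper) = odds_ratio_with_ci(26, 44, 247, 1002)
print(round(odds_ratio, 3), round(se_log, 5), round(lower, 2), round(upper, 2))
```

The odds ratio from the sketch differs from 2.39718 in the fifth significant figure because the text divides odds that have already been rounded.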
The odds ratio can be used to estimate the relative risk in a case–control study. The calculation of relative risk in §8.6 depended on the fact that we could estimate the risks. We could do this because we had a prospective study and so knew how many of the risk group developed the symptom. This cannot be done if we start with the outcome, in this case cough at age 14, and try to work back to the risk factor, bronchitis, as in a case–control study.
Table 13.11 shows data from a case–control study of smoking and lung cancer (see §3.8). We start with a group of cases, patients with lung cancer, and a group of controls, here hospital patients without cancer. We cannot calculate risks (the column totals would be meaningless and have been omitted), but we can still estimate the relative risk.
Suppose the prevalence of lung cancer is p, a small number, and the table is as Table 13.10. Then we can estimate the probability of both having lung cancer and being a smoker by pa/(a + b), because a/(a + b) is the conditional probability of smoking in lung cancer patients (§6.8). Similarly, the probability of being a smoker without lung cancer is (1 - p)c/(c + d). The probability of being a smoker is therefore pa/(a + b) + (1 - p)c/(c + d), the probability of being a smoker with lung cancer plus the probability of being a smoker without lung cancer. Because p is much smaller than 1 - p, the first term can be ignored and the probability of being a smoker is approximately (1 - p)c/(c + d). The risk of lung cancer for smokers is found by dividing the probability of being a smoker with lung cancer by the probability of being a smoker:
$$\frac{pa/(a+b)}{(1-p)c/(c+d)} = \frac{pa(c+d)}{(1-p)c(a+b)}$$
Table 13.11. Smokers and non-smokers among male cancer patients and controls (Doll and Hill 1950)
               Smokers   Non-smokers   Total
Lung cancer      647          2         649
Controls         622         27         649
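Similarly, the probability of both being a non-smoker and having lung cancer is pb/(a + b) and the probability of being a non-smoker without lung cancer is (1 - p)d/(c + d). The probability of being a non-smoker is therefore pb/(a + b) + (1 - p)d/(c + d), and since p is much smaller than 1 - p, the first term can be ignored and the probability of being a non-smoker is approximately (1 - p)d/(c + d). This gives a risk of lung cancer among non-smokers of approximately
$$\frac{pb/(a+b)}{(1-p)d/(c+d)} = \frac{pb(c+d)}{(1-p)d(a+b)}$$
The relative risk of lung cancer for smokers is thus, approximately,
$$\frac{pa(c+d)}{(1-p)c(a+b)} \Big/ \frac{pb(c+d)}{(1-p)d(a+b)} = \frac{ad}{bc}$$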
This is, of course, the odds ratio. Thus for case–control studies the relative risk is approximated by the odds ratio.
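How good the approximation is depends on how small the prevalence is. The Python sketch below keeps the terms that the argument above ignores and compares the resulting relative risk with the odds ratio for the frequencies of Table 13.11. The prevalence values are hypothetical, chosen purely for illustration, and the function name is my own.

```python
def relative_risk_given_prevalence(a, b, c, d, prevalence):
    """Relative risk of disease for exposed versus unexposed, reconstructed
    from case-control frequencies (Table 13.10 layout: a, b are exposed and
    unexposed cases; c, d are exposed and unexposed controls) and an assumed
    population prevalence, without dropping any terms."""
    p = prevalence
    exposed_and_case = p * a / (a + b)
    unexposed_and_case = p * b / (a + b)
    p_exposed = exposed_and_case + (1 - p) * c / (c + d)
    p_unexposed = unexposed_and_case + (1 - p) * d / (c + d)
    return (exposed_and_case / p_exposed) / (unexposed_and_case / p_unexposed)

a, b, c, d = 647, 2, 622, 27          # Table 13.11
odds_ratio = (a * d) / (b * c)

# Hypothetical prevalences: the rarer the disease, the closer the match.
for prevalence in (0.0001, 0.01, 0.2):
    rr = relative_risk_given_prevalence(a, b, c, d, prevalence)
    print(prevalence, round(rr, 2), round(odds_ratio, 2))
```

For a rare disease the two agree closely; for a common one the odds ratio overstates the relative risk.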
For Table 13.11 we have
$$\frac{ad}{bc} = \frac{647 \times 27}{2 \times 622} = 14.04$$
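Thus the risk of lung cancer in smokers is about 14 times that of non-smokers. This is a surprising result from a table with so few non-smokers, but a direct estimate from the cohort study (Table 3.1) is 0.90/0.07 = 12.9, which is very similar. The log odds ratio is 2.64210 and its standard error is
$$\sqrt{\frac{1}{647} + \frac{1}{2} + \frac{1}{622} + \frac{1}{27}} = 0.73498$$
Hence the approximate 95% confidence interval is
2.64210 - 1.96 × 0.73498 to 2.64210 + 1.96 × 0.73498 = 1.20154 to 4.08266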
Table 13.12. Cough during the day or at night and cigarette smoking by 12-year-old boys (Bland et al. 1978)
                              Boy's smoking
             Non-smoker        Occasional         Regular
Cough         266   20.4%      395   28.8%       80   46.5%
No cough     1037   79.6%      977   71.2%       92   53.5%
Total        1303  100.0%     1372  100.0%      172  100.0%
To get a confidence interval for the odds ratio itself we must antilog:
$$e^{1.20154} \text{ to } e^{4.08266} = 3.3 \text{ to } 59.3$$
The very wide confidence interval is because the numbers of non-smokers, particularly for lung cancer cases, are so small.
13.8* The chi-squared test for trend
Consider the data of Table 13.12. Using the chi-squared test described in §13.1, we can test the null hypothesis that there is no relationship between reported cough and smoking against the alternative that there is a relationship of some sort. The chi-squared statistic is 64.25, with 2 degrees of freedom, P < 0.001. The data are not consistent with the null hypothesis.
Now, we would have got the same value of chi-squared whatever the order of the columns. The test ignores the natural ordering of the columns, but we might expect that if there were a relationship between reported cough and smoking, the prevalence of cough would be greater for greater amounts of smoking. In other words, we look for a trend in cough prevalence from one end of the table to the other. We can test for this using the chi-squared test for trend.
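First, we define two random variables, X and Y, whose values depend on the categories of the row and column variables. For example, we could put X = 1 for non-smokers, X = 2 for occasional smokers and X = 3 for regular smokers, and put Y = 1 for ‘cough’ and Y = 2 for ‘no cough’. Then for a non-smoker who coughs, the value of X is 1 and the value of Y is 1. Both X and Y may have more than two categories, provided both are ordered. If there are n individuals, we have n pairs of observations (xi, yi). If there is a linear trend across the table, there will be linear regression of Y on X which has non-zero slope. We fit the usual least squares regression line, Y = a + bX, where
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad SE(b) = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}}$$
and where s² is the estimated variance of Y. In simple linear regression, as described in Chapter 11, we are usually concerned with estimating b and making statements about its precision. Here we are only going to test the null hypothesis that in the population b = 0. Under the null hypothesis, the variance about the line is equal to the total variance of Y, since the line has zero slope. We use the estimate
$$s^2 = \frac{1}{n} \sum (y_i - \bar{y})^2$$
(We use n as the denominator, not n - 1, because the test is conditional on the row and column totals as described in §13A. There is a good reason for it, but it is not worth going into here.) As in §11.5, the standard error of b is
$$SE(b) = \sqrt{\frac{\frac{1}{n}\sum (y_i - \bar{y})^2}{\sum (x_i - \bar{x})^2}}$$
For a large sample, b/SE(b) is an observation from a Standard Normal distribution if the null hypothesis is true, and its square is the chi-squared statistic for trend, with 1 degree of freedom:
$$\chi^2_1 = \left(\frac{b}{SE(b)}\right)^2 = \frac{n\left[\sum (x_i - \bar{x})(y_i - \bar{y})\right]^2}{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}$$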
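For practical calculations we use the alternative forms of the sums of squares and products:
$$\sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}, \qquad \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}, \qquad \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{n}$$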
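Note that it does not matter which variable is X and which is Y. The sums of squares and products are easy to work out. For example, for the column variable, X, we have 1303 individuals with X = 1, 1372 with X = 2 and 172 with X = 3. For our data we have
$$\sum x_i^2 = 1^2 \times 1303 + 2^2 \times 1372 + 3^2 \times 172 = 8339, \qquad \sum x_i = 1 \times 1303 + 2 \times 1372 + 3 \times 172 = 4563$$
$$\sum (x_i - \bar{x})^2 = 8339 - \frac{4563^2}{2847} = 1025.70$$
Similarly, Σyi² = 9165 and Σyi = 4953;
$$\sum (y_i - \bar{y})^2 = 9165 - \frac{4953^2}{2847} = 548.14, \qquad \sum (x_i - \bar{x})(y_i - \bar{y}) = 7830 - \frac{4563 \times 4953}{2847} = -108.37$$
$$\chi^2_1 = \frac{2847 \times (-108.37)^2}{1025.70 \times 548.14} = 59.47$$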
If the null hypothesis is true, the chi-squared for trend statistic is an observation from the Chi-squared distribution with 1 degree of freedom. The value 59.47 is highly unlikely from this distribution and the trend is significant.
There are several points to note about this method. The choice of values for X and Y is arbitrary. By putting X = 1, 2 or 3 we assumed that the difference between non-smokers and occasional smokers is the same as that between occasional smokers and regular smokers. This need not be so and a different choice of X would give a different chi-squared for trend statistic. The choice is not critical, however. For example, putting X = 1, 2 or 4, so making regular smokers more different from occasional smokers than occasional smokers are from non-smokers, we get χ² for trend to be 64.22. The fit to the data is rather better, but the conclusions are unchanged.
The trend may be significant even if the overall contingency table chi-squared is not. This is because the test for trend has greater power for detecting trends than has the ordinary chi-squared test. On the other hand, if we had an association where those who were occasional smokers had far more symptoms than either non-smokers or regular smokers, the trend test would not detect it. If the hypothesis we wish to test involves the order of the categories, we should use the trend test; if it does not, we should use the contingency table test of §13.1. Note that the trend test statistic is always less than the overall chi-squared statistic.
The distribution of the trend chi-squared statistic depends on a large sample regression model, not on the theory given in §13A. The table does not have to meet Cochran's rule (§13.3) for the trend test to be valid. As long as there are at least 30 observations the approximation should be valid.
Some computer programs offer a slightly different test, the Mantel–Haenszel trend test (not to be confused with the Mantel–Haenszel method for combining 2 by 2 tables, §17.11). This is almost identical to the method described here. As an alternative to the chi-squared test for trend, we could calculate Kendall's rank correlation coefficient, τb, between X and Y (§12.5). For Table 13.12 we get τb = -0.136 with standard error 0.018. We get a χ² statistic with 1 degree of freedom by (τb/SE(τb))² = 57.09. This is very similar to the X² for trend value 59.47.
13.9* Methods for matched samples
The chi-squared test described above enables us, among other things, to test the null hypothesis that binomial proportions estimated from two independent samples are the same. We can do this for the one sample or matched sample problem also. For example, Holland et al. (1978) obtained respiratory symptom questionnaires for 1319 Kent schoolchildren at ages 12 and 14. One question we asked was whether the prevalence of reported symptoms was different at the two ages. At age 12, 356 (27%) children were reported to have had severe colds in the past 12 months compared to 468 (35%) at age 14. Was there evidence of a real increase? Just as in the one sample or paired t test (§10.2) we would hope to improve our analysis by taking into account the fact that this is the same sample. We might expect, for instance, that symptoms on the two occasions will be related.
Table 13.13. Severe colds reported at two ages for Kent schoolchildren (Holland et al. 1978)
Severe colds      Severe colds at age 14      Total
at age 12            Yes         No
Yes                  212        144            356
No                   256        707            963
Total                468        851           1319
The method which enables us to do this is McNemar's test, another version of the sign test. We need to know that 212 children were reported to have colds on both occasions, 144 to have colds at 12 but not at 14, 256 to have colds at 14 but not at 12 and 707 to have colds at neither age. Table 13.13 shows the data in tabular form.
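The null hypothesis is that the proportions saying yes on the first and second occasions are the same, the alternative being that one exceeds the other. This is a hypothesis about the row and column totals, quite different from that for the contingency table chi-squared test. If the null hypothesis were true we would expect the frequencies for ‘yes, no’ and ‘no, yes’ to be equal. In other words, as many should go up as down. (Compare this with the sign test, §9.2.) If we denote these frequencies by fyn and fny, then the expected frequencies will be (fyn + fny)/2. We get the test statistic:
$$\sum \frac{(O-E)^2}{E} = \frac{\left(f_{yn} - \frac{f_{yn}+f_{ny}}{2}\right)^2}{\frac{f_{yn}+f_{ny}}{2}} + \frac{\left(f_{ny} - \frac{f_{yn}+f_{ny}}{2}\right)^2}{\frac{f_{yn}+f_{ny}}{2}}$$
which follows a Chi-squared distribution provided the expected values are large enough. There are two observed frequencies and one constraint, that the sum of the observed frequencies = the sum of the expected frequencies. Hence there is one degree of freedom. Like the chi-squared test (§13.1) and Fisher's exact test (§13.4), we assume a total to be fixed. In this case it is fyn + fny, not the row and column totals, which are what we are testing. The test statistic can be simplified considerably, to:
$$\frac{(f_{yn} - f_{ny})^2}{f_{yn} + f_{ny}}$$
For Table 13.13, we have
$$\frac{(144 - 256)^2}{144 + 256} = \frac{12544}{400} = 31.4$$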
This can be referred to Table 13.3 with one degree of freedom and is clearly highly significant. There was a difference between the two ages. As there was no change in any of the other symptoms studied, we thought that this was possibly due to an epidemic of upper respiratory tract infection just before the second questionnaire.
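There is a continuity correction, again due to Yates. If the observed frequency fyn increases by 1, fny decreases by 1 and fyn - fny increases by 2. Thus half the difference between adjacent possible values is 1 and we make the observed difference nearer to the expected difference (zero) by 1. Thus the continuity corrected test statistic is
$$\frac{\left(|f_{yn} - f_{ny}| - 1\right)^2}{f_{yn} + f_{ny}}$$
where |fyn - fny| is the absolute value, without sign. For Table 13.13:
$$\frac{\left(|144 - 256| - 1\right)^2}{144 + 256} = \frac{111^2}{400} = 30.8$$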
There is very little difference because the expected values are so large, but if the expected values are small, say less than 20, the correction is advisable. For small samples, we can also take fny as an observation from the Binomial distribution with p = ½ and n = fyn + fny and proceed as for the sign test (§9.2).
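McNemar's statistic needs only the two discordant frequencies, as this small Python sketch shows. The function name and its `continuity_correction` argument are illustrative choices of mine.

```python
def mcnemar_statistic(f_yn, f_ny, continuity_correction=False):
    """McNemar's chi-squared statistic (1 degree of freedom) from the two
    off-diagonal frequencies of a matched 2 by 2 table."""
    difference = abs(f_yn - f_ny)
    if continuity_correction:
        difference -= 1   # Yates: move the difference 1 nearer to zero
    return difference ** 2 / (f_yn + f_ny)

# Table 13.13: 144 children had colds at 12 only, 256 at 14 only.
uncorrected = mcnemar_statistic(144, 256)
corrected = mcnemar_statistic(144, 256, continuity_correction=True)
print(uncorrected, corrected)
```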
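We can find a confidence interval for the difference between the proportions. The estimated difference is p1 - p2 = (fyn - fny)/n. We rearrange this:
$$p_1 - p_2 = \frac{f_{yn} - f_{ny}}{n} = \frac{f_{yn} + f_{ny}}{n} \left(\frac{2 f_{yn}}{f_{yn} + f_{ny}} - 1\right)$$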
We can treat the fyn as an observation from a Binomial distribution with parameter n = fyn + fny, which, of course, we are treating as fixed. (I am using n here to mean the parameter of the Binomial distribution as in §6.4, not to mean the total sample size.) We find a confidence interval for fyn/(fyn + fny) using either the z method of §8.4 or the exact method of §8.8. We then multiply these limits by 2, subtract 1 and multiply by (fyn + fny)/n.
For the example, the estimated difference is (144 - 256)/1319 = -0.085. For the confidence interval, fyn + fny = 400 and fyn = 144. The 95% confidence interval for fyn/(fyn + fny) is 0.313 to 0.407 by the large sample method. Hence the confidence interval for p1 - p2 is (2 × 0.313 - 1) × 400/1319 = -0.113 to (2 × 0.407 - 1) × 400/1319 = -0.056. We estimate that the proportion of colds on the first occasion was less than that on the second by between 0.06 and 0.11.
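The steps of this confidence interval can be followed in a short Python sketch. It is an illustration only, assuming the large sample z method for the Binomial proportion; the function name is hypothetical.

```python
from math import sqrt

def paired_difference_ci(f_yn, f_ny, n_total, z=1.96):
    """Difference between two paired proportions, with an approximate
    confidence interval obtained by transforming the limits for the
    Binomial proportion f_yn / (f_yn + f_ny)."""
    m = f_yn + f_ny                      # discordant pairs, treated as fixed
    p = f_yn / m
    se = sqrt(p * (1 - p) / m)
    limits_for_p = (p - z * se, p + z * se)
    difference = (f_yn - f_ny) / n_total
    # Multiply each limit by 2, subtract 1, then multiply by m / n_total.
    ci = tuple((2 * limit - 1) * m / n_total for limit in limits_for_p)
    return difference, limits_for_p, ci

difference, limits_for_p, ci = paired_difference_ci(144, 256, 1319)
print(round(difference, 3))
print(tuple(round(x, 3) for x in limits_for_p))
print(tuple(round(x, 3) for x in ci))
```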
We may wish to compare the distribution of a variable with three or more categories in matched samples. If the categories are ordered, like smoking experience in Table 13.12, we are usually looking for a shift from one end of the distribution to the other, and we can use the sign test (§9.2), counting positives when smoking increased, negative when it decreased, and zero if the category was the same. When the categories are not ordered, as in Table 13.1, there is a test due to Stuart (1955), described by Maxwell (1970). The test is difficult to do and the situation is very unusual, so I shall omit details. My free program Clinstat will do it (§1.3).
Table 13.14. Parity of 125 women attending antenatal clinics at St. George's Hospital, with the calculation of the chi-squared goodness of fit test
We can also find an odds ratio for the matched table, called the conditional odds ratio. Like McNemar's method, it uses the frequencies in the off diagonal only. The estimate is very simple: fyn/fny. Thus for Table 13.13 the odds of having severe colds at age 12 is 144/256 = 0.56 times that at age 14. This example is not very interesting, but the method is particularly useful in matched case–control studies, where it provides an estimate of the relative risk. A confidence interval is provided in the same way as for the difference between proportions. We can estimate p = fyn/(fyn + fny) and then the odds ratio is given by p/(1 - p). For the example, p = 144/400 = 0.36 and turning p back to the odds ratio p/(1 - p) = 0.36/(1 - 0.36) = 0.56 as before. The 95% confidence interval for p is 0.313 to 0.407, as above. Hence the 95% confidence interval for the conditional odds ratio is 0.31/(1 - 0.31) = 0.45 to 0.41/(1 - 0.41) = 0.69.
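The same transformation idea gives the interval for the conditional odds ratio. The Python sketch below is illustrative, again assuming the large sample z method for the proportion; because it does not round the limits for p to two decimal places first, its interval is 0.46 to 0.69 rather than 0.45 to 0.69.

```python
from math import sqrt

def conditional_odds_ratio_ci(f_yn, f_ny, z=1.96):
    """Conditional odds ratio f_yn / f_ny for a matched 2 by 2 table, with an
    approximate confidence interval found by turning the limits for
    p = f_yn / (f_yn + f_ny) into odds."""
    m = f_yn + f_ny
    p = f_yn / m
    se = sqrt(p * (1 - p) / m)
    p_low, p_high = p - z * se, p + z * se
    return f_yn / f_ny, (p_low / (1 - p_low), p_high / (1 - p_high))

odds_ratio, (lower, upper) = conditional_odds_ratio_ci(144, 256)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```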
13.10* The chi-squared goodness of fit test
Another use of the Chi-squared distribution is the goodness of fit test. Here we test the null hypothesis that a frequency distribution follows some theoretical distribution such as the Poisson or Normal. Table 13.14 shows a frequency distribution. We shall test the null hypothesis that it is from a Poisson distribution, i.e. that conception is a random event among fertile women.
First we estimate the parameter of the Poisson distribution, its mean, µ, in this case 0.816. We then calculate the probability for each value of the variable, using the Poisson formula of §6.7:
$$\frac{e^{-\mu} \mu^r}{r!}$$
where r is the number of events. The probabilities are shown in Table 13.14. The probability that the variable exceeds five is found by subtracting the probabilities for 0, 1, 2, 3, 4, and 5 from 1.0. We then multiply these by the number of observations, 125, to give the frequencies we would expect from 125 observations from a Poisson distribution with mean 0.816.
Table 13.15. Time of onset of 554 strokes (Wroe et al. 1992)
Time           Frequency     Time           Frequency
00.01–02.00        21        12.01–14.00        34
02.01–04.00        16        14.01–16.00        59
04.01–06.00        22        16.01–18.00        44
06.01–08.00       104        18.01–20.00        51
08.01–10.00        95        20.01–22.00        32
10.01–12.00        66        22.01–24.00        10
We now have a set of observed and expected frequencies and can compute a chi-squared statistic in the usual way. We want all the expected frequencies to be greater than 5 if possible. We achieve this here by combining all the categories for parity greater than or equal to 3. We then add (O - E)²/E for the categories to give a χ² statistic. We now find the degrees of freedom. This is the number of categories minus the number of parameters fitted from the data (one in the example) minus one. Thus we have 4 - 1 - 1 = 2 degrees of freedom. From Table 13.3 the observed χ² value of 2.99 has P > 0.10 and the deviation from the Poisson distribution is clearly not significant.
The same test can be used for testing the fit of any distribution. For example, Wroe et al. (1992) studied diurnal variation in onset of strokes. Table 13.15 shows the frequency distribution of times of onset. If the null hypothesis that there is no diurnal variation were true, the time at which strokes occurred would follow a Uniform distribution (§7.2). The expected frequency in each time interval would be the same. There were 554 cases altogether, so the expected frequency for each time is 554/12 = 46.167. We then work out (O - E)²/E for each interval and add to give the chi-squared statistic, in this case equal to 218.8. There is only one constraint, that the frequencies total 554, as no parameters have been estimated. Hence if the null hypothesis were true we would have an observation from the Chi-squared distribution with 12 - 1 = 11 degrees of freedom. The calculated value of 218.8 is very unlikely, P < 0.001 from Table 13.3, and the data are not consistent with the null hypothesis. When we test the equality of a set of frequencies like this the test is also called the Poisson heterogeneity test.
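For the stroke data the goodness of fit calculation takes only a few lines. The following Python sketch is an illustration with variable names of my own; it uses the frequencies of Table 13.15 and a Uniform expected frequency.

```python
# Frequencies of stroke onset in the twelve two-hour intervals of Table 13.15,
# in order from 00.01-02.00 to 22.01-24.00.
observed = [21, 16, 22, 104, 95, 66, 34, 59, 44, 51, 32, 10]

n = sum(observed)
expected = n / len(observed)          # same expected frequency in every interval

chi2 = sum((o - expected) ** 2 / expected for o in observed)

# One constraint (the total) and no parameters estimated from the data.
degrees_of_freedom = len(observed) - 1

print(n, round(expected, 3), round(chi2, 1), degrees_of_freedom)
```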
Appendices
13A Appendix: Why the chi-squared test works
We noted some of the properties of the Chi-squared distribution in §7A. In particular, it is the sum of the squares of a set of independent Standard Normal variables, and if we look at a subset of values defined by independent linear relationships between these variables we lose one degree of freedom for each constraint. It is on these two properties that the chi-squared test depends.
Suppose we did not have a fixed size to the birth study of Table 13.1, but observed subjects as they delivered over a fixed time. Then the number in a given cell of the table would be from a Poisson distribution and the set of Poisson variables corresponding to the cell frequency would be independent of one another. Our table is one set of samples from these Poisson distributions. However, we do not know the expected values of these distributions under the null hypothesis; we only know their expected values if the table has the row and column totals we observed. We can only consider the subset of outcomes of these variables which has the observed row and column totals. The test is said to be conditional on these row and column totals.
Table 13.16. Symbolic representation of a 2 × 2 table
                        Total
          f11    f12     r1
          f21    f22     r2
Total     c1     c2      n
The mean and variance of a Poisson variable are equal (§6.7). If the null hypothesis is true, the means of these variables will be equal to the expected frequency calculated in §13.1. Thus O, the observed cell frequency, is from a Poisson distribution with mean E, the expected cell frequency, and standard deviation √E. Provided E is large enough, this Poisson distribution will be approximately Normal. Hence (O - E)/√E is from a Normal distribution with mean 0 and variance 1. Hence if we find
$$\sum \left(\frac{O-E}{\sqrt{E}}\right)^2 = \sum \frac{(O-E)^2}{E}$$
this is the sum of the squares of a set of Normally distributed random variables with mean 0 and variance 1, and so is from a Chi-squared distribution (§7A).
We will now find the degrees of freedom. Although the underlying variables are independent, we are only considering a subset defined by the row and column totals. Consider the table as in Table 13.16. Here, f11 to f22 are the observed frequencies, r1, r2 the row totals, c1, c2 the column totals, and n the grand total. Denote the corresponding expected values by e11 to e22. There are three linear constraints on the frequencies:
$$f_{11} + f_{12} + f_{21} + f_{22} = n$$
$$f_{11} + f_{12} = r_1$$
$$f_{11} + f_{21} = c_1$$
Any other constraint can be made up of these. For example, we must have
$$f_{21} + f_{22} = r_2$$
This can be found by subtracting the second equation from the first. Each of these linear constraints on f11 to f22 is also a linear constraint on (f11 - e11)/√e11 to (f22 - e22)/√e22. This is because e11 is fixed and so (f11 - e11)/√e11 is a linear function of f11. There are four observed frequencies and so four (O - E)/√E variables, with three constraints. We lose one degree of freedom for each constraint and so have 4 - 3 = 1 degree of freedom.
If we have r rows and c columns, then we have one constraint that the sum of the frequencies is n. Each row must add up, but when we reach the last row the constraint can be obtained by subtracting the first r - 1 rows from the grand total. The rows contribute only r - 1 further constraints. Similarly the columns contribute c - 1 constraints. Hence, there being rc frequencies, the degrees of freedom are
$$rc - 1 - (r-1) - (c-1) = rc - r - c + 1 = (r-1)(c-1)$$
So we have degrees of freedom given by the number of rows minus one times the number of columns minus one.
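13B Appendix: The formula for Fisher's exact test
The derivation of Fisher's formula is strictly for the algebraically minded. Remember that the number of ways of choosing r things out of n things (§6A) is n!/(r!(n - r)!). Now, suppose we have a 2 by 2 table made up of n individuals as shown in Table 13.16. First, we ask how many ways n individuals can be arranged to give marginal totals r1, r2, c1 and c2. They can be arranged in columns in n!/(c1!c2!) ways, since we are choosing c1 objects out of n, and in rows in n!/(r1!r2!) ways. (Remember n - c1 = c2 and n - r1 = r2.) Hence they can be arranged in
$$\frac{n!}{c_1!\,c_2!} \times \frac{n!}{r_1!\,r_2!} = \frac{n!\,n!}{c_1!\,c_2!\,r_1!\,r_2!}$$
ways. For example, the table with totals 5 and 3 in one margin and 4 and 4 in the other, as in Table 13.8, can happen in
$$\frac{8!}{5!\,3!} \times \frac{8!}{4!\,4!} = 56 \times 70 = 3920 \text{ ways.}$$
As we saw in §13.4, the margin with totals 4 and 4 can be arranged in 70 ways. Now we ask, of these ways how many make up a particular table? We are now dividing the n into four groups of sizes f11, f12, f21 and f22. We can choose the first group in n!/(f11!(n - f11)!) ways, as before. We are now left with n - f11 individuals, so we can choose f12 in (n - f11)!/(f12!(n - f11 - f12)!) ways. We are now left with n - f11 - f12, and so we choose f21 in (n - f11 - f12)!/(f21!(n - f11 - f12 - f21)!) ways. This leaves n - f11 - f12 - f21, which is, of course, equal to f22 and so f22 can only be chosen in one way. Hence we have altogether:
$$\frac{n!}{f_{11}!\,(n-f_{11})!} \times \frac{(n-f_{11})!}{f_{12}!\,(n-f_{11}-f_{12})!} \times \frac{(n-f_{11}-f_{12})!}{f_{21}!\,(n-f_{11}-f_{12}-f_{21})!} = \frac{n!}{f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$
because n - f11 - f12 - f21 = f22. So out of the
$$\frac{n!\,n!}{c_1!\,c_2!\,r_1!\,r_2!}$$
possible tables, the given table arises in
$$\frac{n!}{f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$
ways. The probability of this table arising by chance is
$$\frac{n!}{f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!} \Big/ \frac{n!\,n!}{c_1!\,c_2!\,r_1!\,r_2!} = \frac{r_1!\,r_2!\,c_1!\,c_2!}{n!\,f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$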
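13C Appendix: Standard error for the log odds ratio
This is for the mathematical reader. We start with a general result concerning log transformations. If X is a random variable with mean µ, the approximate variance of loge(X) is given by
$$\text{VAR}(\log_e(X)) = \frac{\text{VAR}(X)}{\mu^2}$$
If an event happens a times and does not happen b times, the log odds is loge(a/b) = loge(a) - loge(b). The frequencies a and b are from independent Poisson distributions with means estimated by a and b respectively. Hence their variances are also estimated by a and b, and the variances of their logarithms are estimated by a/a² = 1/a and b/b² = 1/b respectively. The variance of the log odds is given by
$$\text{VAR}(\log_e(a/b)) = \text{VAR}(\log_e(a)) + \text{VAR}(\log_e(b)) = \frac{1}{a} + \frac{1}{b}$$
The standard error of the log odds is thus given by
$$SE(\log_e(a/b)) = \sqrt{\frac{1}{a} + \frac{1}{b}}$$
The log odds ratio is the difference between the log odds:
$$\log_e(OR) = \log_e(a/b) - \log_e(c/d)$$
The variance of the log odds ratio is the sum of the variances of the log odds and for a 2 by 2 table such as Table 13.10 we have
$$\text{VAR}(\log_e(OR)) = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}$$
The standard error is the square root of this:
$$SE(\log_e(OR)) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$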
13M Multiple choice questions 67 to 73
(Each branch is either true or false)
67. The standard chi-squared test for a 2 by 2 contingency table is valid only if:
(a) all the expected frequencies are greater than five;
(b) both variables are continuous;
(c) at least one variable is from a Normal distribution;
(d) all the observed frequencies are greater than five;
(e) the sample is very large.
68. In a chi-squared test for a 5 by 3 contingency table:
(a) variables must be quantitative;
(b) observed frequencies are compared to expected frequencies;
(c) there are 15 degrees of freedom;
(d) at least 12 cells must have expected values greater than five;
(e) all the observed values must be greater than one.
Table 13.17. Cough first thing in the morning in a group of schoolchildren, as reported by the child and by the child's parents (Bland et al. 1979)

| Parents' report | Child's report: Yes | Child's report: No | Total |
|---|---|---|---|
| Yes | 29 | 104 | 133 |
| No | 172 | 5097 | 5269 |
| Total | 201 | 5201 | 5402 |
69. In Table 13.17:
(a) the association between reports by parents and children can be tested by a chi-squared test;
(b)* the difference between symptom prevalence as reported by children and parents can be tested by McNemar's test;
(c)* if McNemar's test is significant, the contingency chi-squared test is not valid;
(d) the contingency chi-squared test has one degree of freedom;
(e) it would be important to use the continuity correction in the contingency chi-squared test.
70. Fisher's exact test for a contingency table:
(a) applies to 2 by 2 tables;
(b) usually gives a larger probability than the ordinary chi-squared test;
(c) usually gives about the same probability as the chi-squared test with Yates' continuity correction;
(d) is suitable when expected frequencies are small;
(e) is difficult to calculate when the expected frequencies are large.
71. When an odds ratio is calculated from a 2 by 2 table:
(a) the odds ratio is a measure of the strength of the relationship between the row and column variables;
(b) if the order of the rows and the order of the columns is reversed, the odds ratio will be unchanged;
(c) the ratio may take any positive value;
(d) the odds ratio will be changed to its reciprocal if the order of the columns is changed;
(e) the odds ratio is the ratio of the proportions of observations in the first row for the two columns.
Table 13.18. Bird attacks on milk bottles reported by cases of Campylobacter jejuni infection and controls (Southern et al. 1990)

| Number of days of week when attacks took place | Number of cases | Number of controls | OR |
|---|---|---|---|
| 0 | 3 | 42 | 1 |
| 1–3 | 11 | 3 | 51 |
| 4–5 | 5 | 1 | 70 |
| 6–7 | 10 | 1 | 140 |
72. Table 13.18 appeared in the report of a case control study of infection with Campylobacter jejuni (§3E):
(a)* a chi-squared test for trend could be used to test the null hypothesis that risk of disease does not increase with the number of bird attacks;
(b) ‘OR’ means the odds ratio;
(c)* a significant chi-squared test would show that risk of disease increases with increasing numbers of bird attacks;
(d) ‘OR’ provides an estimate of the relative risk of Campylobacter jejuni infection;
(e)* Kendall's rank correlation coefficient, τb, could be used to test the null hypothesis that risk of disease does not increase with the number of bird attacks.
73.* McNemar's test could be used:
(a) to compare the numbers of cigarette smokers among cancer cases and age and sex matched healthy controls;
(b) to examine the change in respiratory symptom prevalence in a group of asthmatics from winter to summer;
(c) to look at the relationship between cigarette smoking and respiratory symptoms in a group of asthmatics;
(d) to examine the change in PEFR in a group of asthmatics from winter to summer;
(e) to compare the number of cigarette smokers among a group of cancer cases and a random sample of the general population.
13E Exercise: Admissions to hospital in a heatwave
In this exercise we shall look at some data assembled to test the hypothesis that there is a considerable increase in the number of admissions to geriatric wards during heatwaves. Table 13.19 shows the number of admissions to geriatric wards in a health district for each week during the summers of 1982, which was cold, and 1983, which was hot. Also shown are the average of the daily peak temperatures for each week.
1. When do you think the heatwave began and ended?
2. How many admissions were there during the heatwave and in the corresponding period of 1982? Would this be sufficient evidence to conclude that heatwaves produce an increase in admissions?
3. We can use the periods before and after the heatwave weeks as controls for changes in other factors between the years. Divide the years into three periods, before, during, and after the heatwave and set up a two-way table showing numbers of admissions by period and year.
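Table 13.19. Mean peak daily temperatures for each week from May to September of 1982 and 1983, with geriatric admissions in Wandsworth (Fish 1985)

| Week | Mean peak, °C, 1982 | Mean peak, °C, 1983 | Admissions 1982 | Admissions 1983 | Week | Mean peak, °C, 1982 | Mean peak, °C, 1983 |
|---|---|---|---|---|---|---|---|
| 1 | 12.4 | 15.3 | 24 | 20 | 12 | 21.7 | 25.0 |
| 2 | 18.2 | 14.4 | 22 | 17 | 13 | 22.5 | 27.3 |
| 3 | 20.4 | 15.5 | 21 | 21 | 14 | 25.7 | 22.9 |
| 4 | 18.8 | 15.6 | 22 | 17 | 15 | 23.6 | 24.3 |
| 5 | 25.3 | 19.6 | 24 | 22 | 16 | 20.4 | 26.5 |
| 6 | 23.2 | 21.6 | 15 | 23 | 17 | 19.6 | 25.0 |
| 7 | 18.6 | 18.9 | 23 | 20 | 18 | 20.2 | 21.2 |
| 8 | 19.4 | 22.0 | 21 | 16 | 19 | 22.2 | 19.7 |
| 9 | 20.6 | 21.0 | 18 | 24 | 20 | 23.3 | 16.6 |
| 10 | 23.4 | 26.5 | 21 | 21 | 21 | 18.1 | 18.4 |
| 11 | 22.8 | 30.4 | 17 | 20 | 22 | 17.3 | 20.7 |

4. We can use this table to test for a heatwave effect. State the null hypothesis and calculate the frequencies expected if the null hypothesis were true.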
5. Test the null hypothesis. What conclusions can you draw?
6. What other information could be used to test the relationship between heatwaves and geriatric admissions?
14
Choosing the statistical method
14.1* Method oriented and problem oriented teaching
The choice of method of analysis for a problem depends on the comparison to be made and the data to be used. In Chapters 8, 9, 10, 11, 12, and 13, statistical methods have been arranged largely by type of data, large samples, Normal, ordinal, categorical, etc, rather than by type of comparison. In this chapter we look at how the appropriate method is chosen for the three most common problems in statistical inference:
comparison of two independent groups, for example, groups of patients given different treatments;
comparison of the response of one group under different conditions, as in a cross-over trial, or of matched pairs of subjects, as in some case–control studies;
investigation of the relationship between two variables measured on the same sample of subjects.
This chapter acts as a map of the methods described in Chapters 8, 9, 10, 11, 12, and 13. Subsequent chapters describe methods for special problems in clinical medicine, population study, dealing with several factors at once, and the choice of sample size.
As was discussed in §12.7, there are often several different approaches to even a simple statistical problem. The methods described here and recommended for particular types of question may not be the only methods, and may not always be universally agreed as the best method. Statisticians are at least as prone to disagree as clinicians. However, these would usually be considered as valid and satisfactory methods for the purposes for which they are suggested here. When there is more than one valid approach to a problem, they will usually be found to give similar answers.
14.2* Types of data
The study design is one factor which determines the method of analysis; the variable being analysed is another. We can classify variables into the following types:
Ratio scales. The ratio of two quantities has a meaning, so we can say that one observation is twice another. Human height is a ratio scale. Ratio scales allow us to carry out power transformations like log or square root.
Interval scales. The interval or distance between points on the scale has precise meaning: a change of one unit at one scale point is the same as a change of one unit at another. For example, temperature in °C is an interval scale, though not a ratio scale because the zero is arbitrary. We can add and subtract on an interval scale. All ratio scales are also interval scales. Interval scales allow us to calculate means and variances, and to find standard errors and confidence intervals for these.
Ordinal scale. The scale enables us to order the subjects, from that with the lowest value to that with the highest. Any ties which cannot be ordered are assumed to be because the measurement is not sufficiently precise. A typical example would be an anxiety score calculated from a questionnaire. A person scoring 10 is more anxious than a person scoring 8, but not necessarily higher by the same amount that a person scoring 4 is higher than a person scoring 2.
Ordered nominal scale. We can group subjects into several categories, which have an order. For example, we can ask patients if their condition is much improved, improved a little, no change, a little worse, much worse.
Nominal scale. We can group subjects into categories which need not be ordered in any way. Eye colour is measured on a nominal scale.
Dichotomous scales. Subjects are grouped into only two categories, for example: survived or died. This is a special case of the nominal scale.
Clearly these classes are not mutually exclusive, and an interval scale is also ordinal. Sometimes it is useful to apply methods appropriate to a lower level of measurement, ignoring some of the information. The combination of the type of comparison and the scale of measurement should direct us to the appropriate method.
14.3* Comparing two groups
The methods used for comparing two groups are summarized in Table 14.1.
Interval data. For large samples, say more than 50 in each group, confidence intervals for the mean can be found by the Normal approximation (§8.5). For smaller samples, confidence intervals for the mean can be found using the t distribution provided the data follow or can be transformed to a Normal distribution (§10.3, §10.4). If not, a significance test of the null hypothesis that the means are equal can be carried out using the Mann–Whitney U test (§12.2). This can be useful when the data are censored, that is, there are values too small or too large to measure. This happens, for example, when concentrations are too small to measure and labelled ‘not detectable’. Provided that data are from Normal distributions, it is possible to compare the variances of the groups using the F test (§10.8).
Ordinal data. The tendency for one group to exceed members of the other is tested by the Mann–Whitney U test (§12.2).
Ordered nominal data. First the data are set out as a two way table, one variable being group and the other the ordered nominal data. A chi-squared test (§13.1) will test the null hypothesis that there is no relationship between group and variable, but takes no account of the ordering. This is done by using the chi-squared test for trend, which takes the ordering into account and provides a much more powerful test (§13.8).
Table 14.1. Methods for comparing two samples

| Type of data | Size of sample | Method |
|---|---|---|
| Interval | Large, >50 each sample | Normal distribution for means (§8.5, §9.7) |
| | Small, <50 each sample, with Normal distribution and uniform variance | Two-sample t method (§10.3) |
| | Small, <50 each sample, non-Normal | Mann–Whitney U test (§12.2) |
| Ordinal | Any | Mann–Whitney U test (§12.2) |
| Nominal, ordered | Large, n>30 | Chi-squared for trend (§13.8) |
| Nominal, not ordered | Large, most expected frequencies >5 | Chi-squared test (§13.1) |
| | Small, more than 20% expected frequencies <5 | Reduce number of categories by combining or excluding as appropriate (§13.3) |
| Dichotomous | Large, all expected frequencies >5 | Comparison of two proportions (§8.6, §9.8), chi-squared test (§13.1), odds ratio (§13.7) |
| | Small, at least one expected frequency <5 | Chi-squared test with Yates' correction (§13.5), Fisher's exact test (§13.4) |
Nominal data. Set the data out as a two way table as described above. The chi-squared test for a two way table is the appropriate test (§13.1). The condition for validity of the test, that at least 80% of the expected frequencies should be greater than 5, must be met by combining or deleting categories as appropriate (§13.3). If the table reduces to a 2 by 2 table without the condition being met, use Fisher's exact test.
Dichotomous data. For large samples, either present the data as two proportions and use the Normal approximation to find the confidence interval for the difference (§8.6), or set the data up as a 2 by 2 table and do a chi-squared test (§13.1). These are equivalent methods. An odds ratio can also be calculated (§13.7). If the sample is small, the fit to the Chi-squared distribution can be improved by using Yates' correction (§13.5). Alternatively, use Fisher's exact test (§13.4).
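The choices described in this section can be written down as a simple lookup. The sketch below is only one way of encoding Table 14.1 and is not part of the original text: the function name, argument names and category labels are all hypothetical, and the thresholds are the rough guides given above rather than firm rules.

```python
def two_group_method(data_type, n_per_group=None, normal=None,
                     all_expected_over_5=None):
    """Suggest a method for comparing two groups, following Table 14.1.

    data_type: 'interval', 'ordinal', 'ordered nominal', 'nominal'
    or 'dichotomous'. The other arguments are used only where relevant.
    """
    if data_type == "interval":
        if n_per_group is not None and n_per_group > 50:
            return "Normal distribution for means"
        if normal:
            return "Two-sample t method"
        return "Mann-Whitney U test"
    if data_type == "ordinal":
        return "Mann-Whitney U test"
    if data_type == "ordered nominal":
        return "Chi-squared test for trend"
    if data_type == "nominal":
        return "Chi-squared test, combining categories if expected frequencies are small"
    if data_type == "dichotomous":
        if all_expected_over_5:
            return "Comparison of two proportions, chi-squared test or odds ratio"
        return "Chi-squared test with Yates' correction or Fisher's exact test"
    raise ValueError(f"unknown data type: {data_type!r}")


print(two_group_method("interval", n_per_group=60))
print(two_group_method("interval", n_per_group=12, normal=False))
print(two_group_method("dichotomous", all_expected_over_5=False))
```

A lookup like this is no substitute for judgement: as noted in §14.1, there is often more than one valid approach.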
Table 14.2. Methods for differences in one or paired sample

| Type of data | Size of sample | Method |
|---|---|---|
| Interval | Large, >100 | Normal distribution (§8.3) |
| | Small, <100, Normal differences | Paired t method (§10.2) |
| | Small, <100, non-Normal differences | Wilcoxon matched pairs test (§12.3) |
| Ordinal | Any | Sign test (§9.2) |
| Nominal, ordered | Any | Sign test (§9.2) |
| Nominal | Any | Stuart test (§13.9) |
| Dichotomous | Any | McNemar's test (§13.9) |
14.4* One sample and paired samples
Methods of analysis for paired samples are summarized in Table 14.2.
Interval data. Inferences are on differences between the variable as observed on the two conditions. For large samples, say n > 100, the confidence interval for the mean difference is found using the Normal approximation (§8.3). For small samples, provided the differences are from a Normal distribution, use the paired t test (§10.2). This assumption is often very reasonable, as most of the variation between individuals is removed and random error is largely made up of measurement error. Furthermore, the error is the result of two added measurement errors and so tends to follow a Normal distribution anyway. If not, transformation of the original data will often make differences Normal (§10.4). If no assumption of a Normal distribution can be made, use the Wilcoxon signed-rank matched-pairs test (§12.3).
It is rarely asked whether there is a difference in variability in paired data. This can be tested by finding the differences between the two conditions and their sum. Then if there is no change in variance the correlation between difference and sum has expected value zero (Pitman's test). This is not obvious but it is true.
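The reason Pitman's test works is that the covariance of the difference and the sum of two variables is the difference between their variances, so the correlation is zero exactly when the variances are equal. The following sketch computes the correlation between difference and sum; the function names and the two small data sets are hypothetical, and the significance test of the resulting correlation coefficient (§11.9) is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pitman_correlation(first, second):
    """Correlation between the differences and the sums of paired data."""
    diffs = [a - b for a, b in zip(first, second)]
    sums = [a + b for a, b in zip(first, second)]
    return pearson_r(diffs, sums)


# Hypothetical pairs in which the first condition is more variable.
r_unequal = pitman_correlation([10, 12, 14, 16, 18], [13, 13, 14, 15, 15])

# Hypothetical pairs with identical variance (same values, reordered).
r_equal = pitman_correlation([1, 2, 3], [3, 1, 2])

print(round(r_unequal, 3), round(r_equal, 3))
```

A positive correlation indicates greater variability under the first condition, a negative one greater variability under the second.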
Ordinal data. If the data do not form an interval scale, as noted in §14.2 the difference between conditions is not meaningful. However, we can say what direction the difference is in, and this can be examined by the sign test (§9.2).
Ordered nominal data. Use the sign test, with changes in one direction being positive, in the other negative, no change as zero (§9.2).
Nominal data. With more than two categories, this is difficult. Use Stuart's generalization to more than two categories of McNemar's test (§13.9).
Dichotomous data. Here we are comparing the proportions of individuals in a given state under the two conditions. The appropriate test is McNemar's test (§13.9).
14.5* Relationship between two variables
The methods for studying relationships between variables are summarized in Table 14.3. Relationships with dichotomous variables can be studied as the difference between two groups (§14.3), the groups being defined by the two states of the dichotomous variable. Dichotomous data have been excluded from the text of this section, but are included in Table 14.3.
Interval and interval data. Two methods are used: regression and correlation. Regression (§11.2, §11.5) is usually preferred, as it gives information about the nature of the relationship as well as about its existence. Correlation (§11.9) measures the strength of the relationship. For regression, residuals about the line must follow a Normal distribution with uniform variance. For estimation, the correlation coefficient requires an assumption that both variables follow a Normal distribution, but to test the null hypothesis only one variable needs to follow a Normal distribution. If neither variable can be assumed to follow a Normal distribution or be transformed to it (§11.8), use rank correlation (§12.4, §12.5).
Interval and ordinal data. Rank correlation coefficient (§12.4, §12.5).
Interval and ordered nominal data. This can be approached by rank correlation, using Kendall's τ (§12.5) because it copes with the large number of ties better than does Spearman's ρ, or by analysis of variance as described for interval and nominal data. The latter requires an assumption of Normal distribution and uniform variance for the interval variable. These two approaches are not equivalent.
Interval and nominal data. If the interval scale follows a Normal distribution, use one-way analysis of variance (§10.9). The assumption is that within categories the interval variable is from Normal distributions with uniform variance. If this assumption is not reasonable, use Kruskal–Wallis analysis of variance by ranks (§12.2).
Ordinal and ordinal data. Use a rank correlation coefficient, Spearman's ρ (§12.4) or Kendall's τ (§12.5). Both will give very similar answers for testing the null hypothesis of no relationship in the absence of ties. For data with many ties and for comparing the strengths of different relationships, Kendall's τ is preferable.
Ordinal and ordered nominal data. Use Kendall's rank correlation coefficient, τ (§12.5).
Ordinal and nominal data. Kruskal–Wallis one-way analysis of variance by ranks (§12.2).
Ordered nominal and ordered nominal data. Use chi-squared for trend (§13.8).
Ordered nominal and nominal data. Use the chi-squared test for a two-way table (§13.1).
Nominal and nominal data. Use the chi-squared test for a two-way table (§13.1), provided the expected values are large enough. Otherwise use Yates' correction (§13.5) or Fisher's exact test (§13.4).
Table 14.3. Methods for relationships between variables

| | Interval, Normal | Interval, non-Normal | Ordinal |
|---|---|---|---|
| Interval, Normal | Regression (§11.2), correlation (§11.9) | Regression (§11.2), rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5) |
| Interval, non-Normal | Regression (§11.2), rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5) |
| Ordinal | Rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5) |
| Nominal, ordered | Kendall's rank correlation (§12.5) | Kendall's rank correlation (§12.5) | Kendall's rank correlation (§12.5) |
| Nominal | Analysis of variance (§10.9) | Kruskal–Wallis test (§12.2) | Kruskal–Wallis test (§12.2) |
| Dichotomous | t test (§10.3), Normal test (§8.5, §9.7) | Large sample Normal test (§8.5, §9.7), Mann–Whitney U test (§12.2) | Mann–Whitney U test (§12.2) |

| | Nominal, ordered | Nominal | Dichotomous |
|---|---|---|---|
| Interval, Normal | Rank correlation (§12.4, §12.5) | Analysis of variance (§10.9) | t test (§10.3), Normal test (§8.5, §9.7) |
| Interval, non-Normal | Kendall's rank correlation (§12.5) | Kruskal–Wallis test (§12.2) | Large sample Normal test (§8.5, §9.7), Mann–Whitney U test (§12.2) |
| Ordinal | Kendall's rank correlation (§12.5) | Kruskal–Wallis test (§12.2) | Mann–Whitney U test (§12.2) |
| Nominal, ordered | Chi-squared test for trend (§13.8) | Chi-squared test (§13.1) | Chi-squared test for trend (§13.8) |
| Nominal | Chi-squared test (§13.1) | Chi-squared test (§13.1) | Chi-squared test (§13.1) |
| Dichotomous | Chi-squared test for trend (§13.8) | Chi-squared test (§13.1) | Chi-squared test (§13.1, §13.5), Fisher's exact test (§13.4) |
14M Multiple choice questions 74 to 80
(*Each branch is either true or false)
74. The following variables have interval scales of measurement:
(a) height;
(b) presence or absence of asthma;
(c) Apgar score;
(d) age;
(e) Forced Expiratory Volume.
75. The following methods may be used to investigate a relationship between two continuous variables:
(a) paired t test;
(b) the correlation coefficient, r;
(c) simple linear regression;
(d) Kendall's τ;
(e) Spearman's ρ.
76. When analysing nominal data the following statistical methods may be used:
(a) simple linear regression;
(b) correlation coefficient, r;
(c) paired t test;
(d) Kendall's τ;
(e) chi-squared test.
77. To compare levels of a continuous variable in two groups, possible methods include:
(a) the Mann–Whitney U test;
(b) Fisher's exact test;
(c) a t test;
(d) Wilcoxon matched-pairs signed-rank test;
(e) the sign test.
Table 14.4. Number of rejection episodes over 16 weeks following heart transplant in two groups of patients

| Episodes | Group A | Group B | Total |
|---|---|---|---|
| 0 | 10 | 8 | 18 |
| 1 | 15 | 6 | 21 |
| 2 | 4 | 0 | 4 |
| 3 | 3 | 0 | 3 |
| Total patients | 32 | 14 | 46 |

78. Table 14.4 shows the number of rejection episodes following heart transplant in two groups of patients:
(a) the rejection rates in the two populations could be compared by a Mann–Whitney U test;
(b) the rejection rates in the two populations could be compared by a two-sample t test;
(c) the rejection rates in the two populations could be compared by a chi-squared test for trend;
(d) the chi-squared test for a 4 by 2 table would not be valid;
(e) the hypothesis that the number of episodes follows a Poisson distribution could be investigated using a chi-squared test for goodness of fit.
79. Twenty arthritis patients were given either a new analgesic or aspirin on successive days in random order. The grip strength of the patients was measured. Methods which could be used to investigate the existence of a treatment effect include:
(a) Mann–Whitney U test;
(b) paired t method;
(c) sign test;
(d) Normal confidence interval for the mean difference;
(e) Wilcoxon matched-pairs signed-rank test.
80. In a study of boxers, computer tomography revealed brain atrophy in 3 of 6 professionals and 1 of 8 amateurs (Kaste et al. 1982). These groups could be compared using:
(a) Fisher's exact test;
(b) the chi-squared test;
(c) the chi-squared test with Yates' correction;
(d)* McNemar's test;
(e) the two-sample t test.
Table 14.5. Gastric pH and urinary nitrite concentrations in 26 subjects (Hall and Northfield, private communication)

| pH | Nitrite | pH | Nitrite | pH | Nitrite | pH | Nitrite |
|---|---|---|---|---|---|---|---|
| 1.72 | 1.64 | 2.64 | 2.33 | 5.29 | 50.6 | 5.77 | 48.9 |
| 1.93 | 7.13 | 2.73 | 52.0 | 5.31 | 43.9 | 5.86 | 3.26 |
| 1.94 | 12.1 | 2.94 | 6.53 | 5.50 | 35.2 | 5.90 | 63.4 |
| 2.03 | 15.7 | 4.07 | 22.7 | 5.55 | 83.8 | 5.91 | 81.2 |
| 2.11 | 0.19 | 4.91 | 17.8 | 5.59 | 52.5 | 6.03 | 19.5 |
| 2.17 | 1.48 | 4.94 | 55.6 | 5.59 | 81.8 | | |
| 2.17 | 9.36 | 5.18 | 0.0 | 5.17 | 21.9 | | |
14E* Exercise: Choosing a statistical method
1. In a cross-over trial to compare two appliances for ileostomy patients, of 14 patients who received system A first, 5 expressed a preference for A, 9 for system B and none had no preference. Of the patients who received system B first, 7 preferred A, 5 preferred B and 4 had no preference. How would you decide whether one treatment was preferable? How would you decide whether the order of treatment influenced the choice?
2. Burr et al. (1976) tested a procedure to remove house-dust mites from the bedding of adult asthmatics in an attempt to improve subjects' lung function, which they measured by PEFR. The trial was a two period cross-over design, the control or placebo treatment being thorough dust removal from the living room. The means and standard errors for PEFR in the 32 subjects were:
active treatment: 335 litres/min, SE = 19.6 litres/min
placebo treatment: 329 litres/min, SE = 20.8 litres/min
differences within subjects: (treatment – placebo) 6.45 litres/min, SE = 5.05 litres/min
How would you decide whether the treatment improves PEFR?
3. In a trial of screening and treatment for mild hypertension (Reader et al. 1980), 1138 patients completed the trial on active treatment, with 9 deaths, and 1080 completed on placebo, with 19 deaths. A further 583 patients allocated to active treatment withdrew, of whom 6 died, and 626 allocated to placebo withdrew, of whom 16 died during the trial period. How would you decide whether screening and treatment for mild hypertension reduces the risk of dying?
4. Table 14.5 shows the pH and nitrite concentrations in samples of gastric fluid from 26 patients. A scatter diagram is shown in Figure 14.1. How would you assess the evidence of a relationship between pH and nitrite concentration?
5. The lung function of 79 children with a history of hospitalization for whooping cough and 178 children without a history of whooping cough, taken from the same school classes, was measured. The mean transit time for the whooping cough cases was 0.49 seconds (s.d. = 0.14 seconds) and for the controls 0.47 seconds (s.d. = 0.11 seconds), (Johnston et al. 1983). How could you analyse the difference in lung function between children who had had whooping cough and those who had not? Each case had two matched controls. If you had all the data, how could you use this information?
Fig. 14.1. Gastric pH and urinary nitrite

Table 14.6. Visual acuity and results of a contrast sensitivity vision test before and after cataract surgery (Wilkins, personal communication)

| Case | Visual acuity: Before | Visual acuity: After | Contrast sensitivity test: Before | Contrast sensitivity test: After |
|---|---|---|---|---|
| 1 | 6/9 | 6/9 | 1.35 | 1.50 |
| 2 | 6/9 | 6/9 | 0.75 | 1.05 |
| 3 | 6/9 | 6/9 | 1.05 | 1.35 |
| 4 | 6/9 | 6/9 | 0.45 | 0.90 |
| 5 | 6/12 | 6/6 | 1.05 | 1.35 |
| 6 | 6/12 | 6/9 | 0.90 | 1.20 |
| 7 | 6/12 | 6/9 | 0.90 | 1.05 |
| 8 | 6/12 | 6/12 | 1.05 | 1.20 |
| 9 | 6/12 | 6/12 | 0.60 | 1.05 |
| 10 | 6/18 | 6/6 | 0.75 | 1.05 |
| 11 | 6/18 | 6/12 | 0.90 | 1.05 |
| 12 | 6/18 | 6/12 | 0.90 | 1.50 |
| 13 | 6/24 | 6/18 | 0.45 | 0.75 |
| 14 | 6/36 | 6/18 | 0.15 | 0.45 |
| 15 | 6/36 | 6/36 | 0.45 | 0.60 |
| 16 | 6/60 | 6/9 | 0.45 | 1.05 |
| 17 | 6/60 | 6/12 | 0.30 | 1.05 |

6. Table 14.6 shows some data from a pre- and post-treatment study of cataract patients. The second number in the visual acuity score represents the size of letter which can be read at a distance of six metres, so high numbers represent poor vision. For the contrast sensitivity test, which is a measurement, high numbers represent good vision. What methods could be used to test the difference in visual acuity and in the contrast sensitivity test pre- and post-operation? What method could be used to investigate the relationship between visual acuity and the contrast sensitivity test post-operation?
Table 14.7. Asthma or wheeze by maternal age (Anderson et al. 1986)

| Asthma or wheeze reported | Mother's age at child's birth: 15–19 | 20–29 | 30+ |
|---|---|---|---|
| Never | 261 | 4017 | 2146 |
| Onset by age 7 | 103 | 984 | 487 |
| Onset from 8 to 11 | 27 | 189 | 95 |
| Onset from 12 to 16 | 20 | 157 | 67 |

Table 14.8. Colon transit time (hours) in groups of mobile and immobile elderly patients (data of Dr Michael O'Connor)

| Mobile | Mobile | Mobile | Mobile | Mobile | Immobile | Immobile | Immobile |
|---|---|---|---|---|---|---|---|
| 8.4 | 21.6 | 45.5 | 62.4 | 68.4 | 15.6 | 38.8 | 54.0 |
| 14.4 | 25.2 | 48.0 | 66.0 | | 24.0 | 42.0 | 54.0 |
| 19.2 | 30.0 | 50.4 | 66.0 | | 24.0 | 43.2 | 57.6 |
| 20.4 | 36.0 | 60.0 | 66.0 | | 32.4 | 47.0 | 58.8 |
| 20.4 | 38.4 | 60.0 | 67.2 | | 34.8 | 52.8 | 62.4 |

Mobile patients: n1 = 21, x̄1 = 42.57, s1 = 20.58
Immobile patients: n2 = 21, x̄2 = 49.63, s2 = 16.39

7. Table 14.7 shows the relationship between age of onset of asthma in children and maternal age at the child's birth. How would you test whether these were related? The children were all born in one week in March, 1958. Apart from the possibility that young mothers in general tend to have children prone to asthma, what other possible explanations are there for this finding?
8. In a study of thyroid hormone in premature babies, we wanted to study the relationship of free T3 measured at several time points over seven days with the number of days the babies remained oxygen dependent. Some babies died, mostly within a few days of birth, and some babies went home still oxygen dependent and were not followed any longer by the researchers. How could you reduce the series of T3 measurements on a baby to a single variable? How could you test the relationship with time on oxygen?
9. Table 14.8 shows colon transit times measured in a group of elderly patients who were mobile and in a second group who were unable to move independently. Figure 14.2 shows a scatter diagram and histogram and Normal plot of residuals for these data. What two statistical approaches could be used here? Which would you prefer and why?
Fig. 14.2. Scatter plot, histogram, and Normal plot for the colon transit time data of Table 14.8
15
Clinical measurement
15.1 Making measurements
In this chapter we shall look at a number of problems associated with clinical measurement. These include how precisely we can measure, how different methods of measurement can be compared, how measurements can be used in diagnosis and how to deal with incomplete measurements of survival.
When we make a measurement, particularly a biological measurement, the number we obtain is the result of several things: the true value of the quantity we want to measure, biological variation, the measurement instrument itself, the position of the subject, the skill, experience and expectations of the observer, and even the relationship between observer and subject. Some of these factors, such as the variation within the subject, are outside the control of the observer. Others, such as position, are not, and it is important to standardize these. One which is most under our control is the precision with which we read scales and record the result. When blood pressure is measured, for example, some observers record to the nearest 5 mm Hg, others to the nearest 10 mm Hg. Some observers may record diastolic pressure at Korotkov sound four, others at five. Observers may think that as blood pressure is such a variable quantity, errors in recording of this magnitude are unimportant. In the monitoring of the individual patient, such lack of uniformity may make apparent changes difficult to interpret. In research, imprecise measurement can lead to problems in the analysis and to loss of power.
How precisely should we record data? While this must depend to some extent on the purpose for which the data are to be recorded, any data which are to be subjected to statistical analysis should be recorded as precisely as possible. A study can only be as good as the data, and data are often very costly and time-consuming to collect. The precision to which data are to be recorded and all other procedures to be used in measurement should be decided in advance and stated in the protocol, the written statement of how the study is to be carried out. We should bear in mind that the precision of recording depends on the number of significant figures (§5.2) recorded, not the number of decimal places. The observations 0.15 and 1.66 from Table 4.8, for example, are both recorded to two decimal places, but 0.15 has two significant figures and 1.66 has three. The second observation is recorded more precisely. This becomes very important when we come to analyse the data, for the data of Table 4.8 have a skew distribution which we wish to log transform. The greater imprecision of recording at the lower end of the scale is magnified by the transformation.
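The effect of the log transformation on recording precision can be seen with a little arithmetic. In this sketch, which is an illustration and not part of the original text, a value recorded to two decimal places is taken to represent any true value within half a unit in the last place either side of it; the function name is hypothetical.

```python
from math import log


def log_interval_width(value, decimals=2):
    """Width, on the natural log scale, of the range of true values
    that would all be recorded as `value` to the given decimal places."""
    half_unit = 0.5 * 10 ** (-decimals)
    return log(value + half_unit) - log(value - half_unit)


width_low = log_interval_width(0.15)    # two significant figures
width_high = log_interval_width(1.66)   # three significant figures
print(round(width_low, 4), round(width_high, 4))
print(round(width_low / width_high, 1))
```

On the original scale both observations carry the same uncertainty of 0.01, but on the log scale the uncertainty attached to 0.15 is about eleven times that attached to 1.66.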
In measurement there is usually uncertainty in the last digit. Observers will often have some values for this last digit which they record more often than others. Many observers are more likely to record a terminal zero than a nine or a one, for example. This is known as digit preference. The tendency to read blood pressure to the nearest 5 or 10 mm Hg mentioned above is an example of this. Observer training and awareness of the problem help to minimize digit preference, but if possible readings should be taken to sufficient significant figures for the last digit to be unimportant. Digit preference is particularly important when differences in the last digit are of importance to the outcome, as it might be in Table 15.1, where we are dealing with the difference between two similar numbers. Because of this it is a mistake to have one measurer take readings under one set of conditions and a second under another, as their degree of digit preference may differ. It is also important to agree the precision to which data are to be recorded and to ensure that instruments have sufficiently fine scales for the job in hand.
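A simple way to look for digit preference is to tabulate the terminal digits of a set of readings. The sketch below does this for a small set of made-up blood pressure readings; the function name and the data are hypothetical and serve only to show the idea. Without digit preference we would expect each terminal digit to occur about equally often.

```python
from collections import Counter


def terminal_digit_counts(readings):
    """Count how often each final digit (0 to 9) occurs in whole-number readings."""
    counts = Counter(abs(int(r)) % 10 for r in readings)
    return {digit: counts.get(digit, 0) for digit in range(10)}


# Hypothetical systolic blood pressure readings, mm Hg.
readings = [120, 135, 140, 130, 125, 150, 118, 140, 160, 145, 130, 122]
counts = terminal_digit_counts(readings)
for digit, count in counts.items():
    print(digit, count)
```

Here ten of the twelve readings end in 0 or 5, the pattern expected from an observer rounding to the nearest 5 or 10 mm Hg.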
15.2* Repeatability and measurement error
I have already discussed some factors which may produce bias in measurements (§2.7, §2.8, §3.6). I have not yet considered the natural biological variability, in subject and in measurement method, which may lead to measurement error. ‘Error’ comes from a Latin root meaning ‘to wander’, and its use in statistics is closely related to this, as in §11.2, for example. Thus error in measurement may include the natural continual variation of a biological quantity, when a single observation will be used to characterize the individual. For example, in the measurement of blood pressure we are dealing with a quantity that varies continuously, not only from heart beat to heart beat but from day to day, season to season, and even with the sex of the measurer. The measurer, too, will show variation in the perception of the Korotkov sound and reading of the manometer. Because of this, most clinical measurements cannot be taken at face value without some consideration being given to their error.
The quantification of measurement error is not difficult in principle. To do it we need a set of replicate readings, obtained by measuring each member of a sample of subjects more than once. We can then estimate the standard deviation of repeated measurements on the same subject. Table 15.1 shows some replicated measurements of peak expiratory flow rate, made by the same observer (myself) with a Wright Peak Flow Meter. For each subject, the measured PEFR varies from observation to observation. This variation is the measurement error. We can quantify measurement error in two ways: using the standard deviation for repeated measurements on the same subject and by correlation.
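Table 15.1. Pairs of readings made with a Wright Peak Flow Meter on 17 healthy volunteers

| Subject | PEFR (litres/min): First | Second | Subject | PEFR (litres/min): First | Second |
|---|---|---|---|---|---|
| 1 | 494 | 490 | 10 | 433 | 429 |
| 2 | 395 | 397 | 11 | 417 | 420 |
| 3 | 516 | 512 | 12 | 656 | 633 |
| 4 | 434 | 401 | 13 | 267 | 275 |
| 5 | 476 | 470 | 14 | 478 | 492 |
| 6 | 557 | 611 | 15 | 178 | 165 |
| 7 | 413 | 415 | 16 | 423 | 372 |
| 8 | 442 | 431 | 17 | 427 | 421 |
| 9 | 650 | 638 | | | |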
Table 15.2. Analysis of variance by subject for the PEFR data of Table 15.1

| Source of variation | Degrees of freedom | Sum of squares | Mean square | Variance ratio (F) | Probability |
|---|---|---|---|---|---|
| Total | 33 | 445581.5 | | | |
| Between subjects | 16 | 441598.5 | 27599.9 | 117.8 | |
| Residual (within subjects) | 17 | 3983.0 | 234.3 | | |
Table 15.3. Analysis of variance by subject for the log (base e) transformed PEFR data of Table 15.1

| Source of variation | Degrees of freedom | Sum of squares | Mean square | Variance ratio (F) | Probability |
|---|---|---|---|---|---|
| Total | 33 | 3.160104 | | | |
| Subjects | 16 | 3.139249 | 0.196203 | 159.9 | |
| Residual (within subjects) | 17 | 0.020855 | 0.001227 | | |
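The residual line of Table 15.2 can be checked directly from the pairs in Table 15.1. The sketch below relies on a standard identity for duplicate readings, stated here as an assumption since it is not spelled out in the text: with two readings per subject, the within-subjects sum of squares is half the sum of the squared differences, with one degree of freedom per subject. The variable names are illustrative.

```python
from math import sqrt

# First and second PEFR readings (litres/min) for the 17 subjects of Table 15.1.
first = [494, 395, 516, 434, 476, 557, 413, 442, 650,
         433, 417, 656, 267, 478, 178, 423, 427]
second = [490, 397, 512, 401, 470, 611, 415, 431, 638,
          429, 420, 633, 275, 492, 165, 372, 421]

n = len(first)                                  # subjects, also residual d.f.
residual_ss = sum((a - b) ** 2 for a, b in zip(first, second)) / 2
residual_ms = residual_ss / n                   # within-subjects variance
s_w = sqrt(residual_ms)                         # within-subjects standard deviation

print(residual_ss, round(residual_ms, 1), round(s_w, 1))
```

The within-subjects standard deviation of about 15.3 litres/min is the square root of the residual mean square.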
We should check to see whether the error does depend on the value of the measurement, usually being larger for larger values. We can do this by plotting a scatter diagram of the absolute value of the difference (i.e. ignoring the sign) and the mean of the two observations (Figure 15.1). For the PEFR data, there is no obvious relationship. We can check this by calculating a correlation (§11.9) or rank correlation coefficient (§12.4, §12.5). For Figure 15.1 we have τ = 0.17, P = 0.3, so there is little to suggest that the measurement error is related to the size of the PEFR. Hence the coefficient of variation is not as appropriate as the within subjects standard deviation as a representation of the measurement error. For most medical measurements, the standard deviation is either independent of or proportional to the measurement and so one of these two approaches can be used.
Fig. 15.1. Absolute difference versus sum for 17 pairs of Wright Peak Flow Meter measurements
Measurement error may also be presented as the correlation coefficient between pairs of readings. This is sometimes called the reliability of the measurement, and is often used for psychological measurements using questionnaire scales. However, the correlation depends on the amount of variation between subjects. If we deliberately choose subjects to have a wide spread of possible values, the correlation will be bigger than if we take a random sample of subjects. Thus this method should only be used if we have a representative sample of the subjects in whom we are interested. The intra-class correlation coefficient (§11.13), which does not take into account the order in which observations were taken and which can be used with more than two observations per subject, is preferred for this application. Applying the method of §11.13 to Table 15.1 we get ICC = 0.98. ICC and $s_w$ are closely related, because $\mathrm{ICC}=1-s_w^2/(s_b^2+s_w^2)$. ICC therefore depends also on the variation between subjects, and thus relates to the population of which the subjects can be considered a random sample. Streiner and Norman (1996) give an interesting discussion.
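The relationship between the ICC and the within-subjects variance can be illustrated from the mean squares of Table 15.2. This sketch assumes the usual one-way random effects estimate of the between-subjects variance, (between mean square minus within mean square) divided by the number of observations per subject; that estimate is an assumption here, since the details of §11.13 are not reproduced in this section.

```python
# Mean squares from Table 15.2 and the number of readings per subject.
ms_between = 27599.9
ms_within = 234.3
m = 2

s_w2 = ms_within                        # within-subjects variance
s_b2 = (ms_between - ms_within) / m     # between-subjects variance (assumed estimate)
icc = 1 - s_w2 / (s_b2 + s_w2)

print(round(s_b2, 1), round(icc, 2))
```

The result agrees with the ICC = 0.98 quoted above. If the same measurement error were combined with a much smaller spread between subjects, s_b2 would shrink and the ICC would fall, which is why the ICC relates to a particular population.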
15.3* Comparing two methods of measurement
In clinical measurement, most of the things we want to measure, hearts, lungs, livers and so on, are deep within living bodies and out of reach. This means that many of the methods we use to measure them are indirect and we cannot be sure how closely they are related to what we really want to know. When a new method of measurement is developed, rather than compare its outcome to a set of known values we must often compare it to another method just as indirect. This is a common type of study, and one which is often badly done (Altman and Bland 1983, Bland and Altman 1986).
Table 15.4 shows measurements of PEFR by two different methods, the Wright meter data coming from Table 15.1. For simplicity, I shall use only one measurement by each method here. We could make use of the duplicate data by using the average of each pair first, but this introduces an extra stage in the calculation. Bland and Altman (1986) give details.
Table 15.4. Comparison of two methods of measuring PEFR
Subject   PEFR (litres/min)            Difference
number    Wright meter   Mini meter    Wright - mini
1         494            512           -18
2         395            430           -35
3         516            520           -4
4         434            428           6
5         476            500           -24
6         557            600           -43
7         413            364           49
8         442            380           62
9         650            658           -8
10        433            445           -12
11        417            432           -15
12        656            626           30
13        267            260           7
14        478            477           1
15        178            259           -81
16        423            350           73
17        427            451           -24
Total                                  -36
Mean                                   -2.1
S.d.                                   38.8
The first step in the analysis is to plot the data as a scatter diagram (Figure 15.2). If we draw the line of equality, along which the two measurements would be exactly equal, this gives us an idea of the extent to which the two methods agree. This is not the best way of looking at data of this type, because much of the graph is empty space and the interesting information is clustered along the line. A better approach is to plot the difference between the methods against the sum or average. The sign of the difference is important, as there is a possibility that one method may give higher values than the other and this may be related to the true value we are trying to measure. This plot is also shown in Figure 15.2.
Two methods of measurement agree if the difference between observations on the same subject using both methods is small enough for us to use the methods interchangeably. How small this difference has to be depends on the measurement and the use to which it is to be put. It is a clinical, not a statistical, decision. We quantify the differences by estimating the bias, which is the mean difference, and the limits within which most differences will lie. We estimate these limits from the mean and standard deviation of the differences. If we are to estimate these quantities, we want them to be the same for high values and for low values of the measurement. We can check this from the plot. There is no clear evidence of a relationship between difference and mean in Figure 15.4, and we can check this by a test of significance using the correlation coefficient. We get r = 0.19, P = 0.5.
The mean difference is close to zero, so there is little evidence of overall bias.
We can find a confidence interval for the mean difference as described in §10.2. The differences have a mean $\bar{d}$ = -2.1 litres/min, and a standard deviation of 38.8. The standard error of the mean is thus $s/\sqrt{n} = 38.8/\sqrt{17} = 9.41$ litres/min and the corresponding value of t with 16 degrees of freedom is 2.12. The 95% confidence interval for the bias is thus -2.1 ± 2.12 × 9.41 = -22 to +18 litres/min. Thus on the basis of these data we could have a bias of as much as 22 litres/min, which could be clinically important. The original comparison of these instruments used a much larger sample and found that any bias was very small (Oldham et al. 1979).
Fig. 15.2. PEFR measured by two different instruments, mini meter versus Wright meter and difference versus mean of mini and Wright meters
Fig. 15.3. Distribution of differences between PEFR measured by two methods
The standard deviation of the differences between measurements made by the two methods provides a good index of the comparability of the methods. If we can estimate the mean and standard deviation reliably, with small standard errors, we can then say that the difference between methods will be at most two standard deviations on either side of the mean for 95% of observations. These $\bar{d} \pm 2s$ limits for the difference are called the 95% limits of agreement. For the PEFR data, the standard deviation of the differences is estimated to be 38.8 litres/min and the mean is -2 litres/min. Two standard deviations is therefore 78 litres/min. The reading with the mini meter is expected to be 80 litres below to 76 litres above for most subjects. These limits are shown as horizontal lines in Figure 15.4. The limits depend on the assumption that the distribution of the differences is approximately Normal, which can be checked by histogram and Normal plot (§7.5) (Figure 15.3).
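The bias, its confidence interval and the 95% limits of agreement can be reproduced in a few lines. The following is a minimal sketch in Python, using the differences from Table 15.4 and the t value of 2.12 quoted in the text; the variable names are illustrative only. Because it works from unrounded values, the limits may differ by a unit from the hand calculation, which rounds the mean to -2 and two standard deviations to 78.

```python
import math
import statistics

# Differences (Wright minus mini meter, litres/min) from Table 15.4.
differences = [-18, -35, -4, 6, -24, -43, 49, 62, -8,
               -12, -15, 30, 7, 1, -81, 73, -24]

n = len(differences)
mean_diff = statistics.mean(differences)   # estimated bias
sd_diff = statistics.stdev(differences)    # sample s.d. (divisor n - 1)
se_mean = sd_diff / math.sqrt(n)           # standard error of the bias

T_16 = 2.12  # two-sided 5% point of t with 16 d.f., as quoted in the text
bias_ci = (mean_diff - T_16 * se_mean, mean_diff + T_16 * se_mean)

# 95% limits of agreement: mean difference +/- 2 standard deviations.
limits = (mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff)

print(f"bias = {mean_diff:.1f}, s.d. = {sd_diff:.1f}, s.e. = {se_mean:.2f}")
print(f"95% CI for bias: {bias_ci[0]:.0f} to {bias_ci[1]:.0f}")
print(f"95% limits of agreement: {limits[0]:.0f} to {limits[1]:.0f}")
```

Whether limits of this width are acceptable remains the clinical judgement described above; the code only does the arithmetic.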
Fig. 15.4. Difference versus sum for PEFR measured by two methods
On the basis of these data we would not conclude that the two methods are comparable or that the mini meter could reliably replace the Wright peak flow meter. As remarked in §10.2, this meter had received considerable wear.
When there is a relationship between the difference and the mean, we can try to remove it by a transformation. This is usually accomplished by the logarithm, and leads to an interpretation of the limits similar to that described in §15.2. Bland and Altman (1986, 1999) give details.
15.4 Sensitivity and specificity
One of the main reasons for making clinical measurements is to aid in diagnosis. This may be to identify one of several possible diagnoses in a patient, or to find people with a particular disease in an apparently healthy population. The latter is known as screening. In either case the measurement provides a test which enables us to classify subjects into two groups, one group whom we think are likely to have the disease in which we are interested, and another group unlikely to have the disease. When developing such a test, we need to compare the test result with a true diagnosis. The test may be based on a continuous variable and the disease indicated if it is above or below a given level, or it may be a qualitative observation such as carcinoma in situ cells on a cervical smear. In either case I shall call the test positive if it indicates the disease and negative if not, and the disease positive if the disease is later confirmed, negative if not.
How do we measure the effectiveness of the test? Table 15.5 shows three artificial sets of test and disease data. We could take as an index of test effectiveness the proportion giving the correct diagnosis from the test. For Test 1 in the example it is 94%. Now consider Test 2, which always gives a negative result. Test 2 will never detect any cases of the disease. We are now right for 95% of the subjects! However, the first test is useful, in that it detects some cases of the disease, and the second is not, so this is clearly a poor index.
Table 15.5. Some artificial test and diagnosis data
Disease   Test 1         Test 2         Test 3         Total
          +ve    -ve     +ve    -ve     +ve    -ve
Yes       4      1       0      5       2      3       5
No        5      90      0      95      0      95      95
Total     9      91      0      100     2      98      100
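There is no one simple index which enables us to compare different tests in all the ways we would like. This is because there are two things we need to measure: how good the test is at finding disease positives, i.e. those with the condition, and how good the test is at excluding disease negatives, i.e. those who do not have the condition. The indices conventionally employed to do this are:
$$\text{sensitivity} = \frac{\text{number who are both disease positive and test positive}}{\text{number who are disease positive}}$$
$$\text{specificity} = \frac{\text{number who are both disease negative and test negative}}{\text{number who are disease negative}}$$
In other words, the sensitivity is the proportion of disease positives who are test positive, and the specificity is the proportion of disease negatives who are test negative. For our three tests these are: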
Test1 Test2 Test3
Sensitivity 0.80 0.00 0.40
Specificity 0.95 1.00 1.00
Test 2, of course, misses all the disease positives and finds all the disease negatives, by saying all are negative. The difference between Tests 1 and 3 is brought out by the greater sensitivity of 1 and the greater specificity of 3. We are comparing tests in two dimensions. We can see that Test 3 is better than Test 2, because its sensitivity is higher and specificity the same. However, it is more difficult to see whether Test 3 is better than Test 1. We must come to a judgement based on the relative importance of sensitivity and specificity in the particular case.
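As a check on these figures, the sketch below recomputes the proportion correct, sensitivity and specificity for the three artificial tests of Table 15.5. It is a minimal illustration in Python; the function and variable names are not from the text.

```python
# Counts from Table 15.5 for each test, in the order:
# (disease yes & test +ve, disease yes & test -ve,
#  disease no & test +ve,  disease no & test -ve)
tests = {
    "Test 1": (4, 1, 5, 90),
    "Test 2": (0, 5, 0, 95),
    "Test 3": (2, 3, 0, 95),
}

def sensitivity(tp, fn):
    """Proportion of disease positives who are test positive."""
    return tp / (tp + fn)

def specificity(fp, tn):
    """Proportion of disease negatives who are test negative."""
    return tn / (fp + tn)

def proportion_correct(tp, fn, fp, tn):
    """Proportion of all subjects given the correct diagnosis."""
    return (tp + tn) / (tp + fn + fp + tn)

results = {}
for name, (tp, fn, fp, tn) in tests.items():
    results[name] = (sensitivity(tp, fn),
                     specificity(fp, tn),
                     proportion_correct(tp, fn, fp, tn))
    sens, spec, correct = results[name]
    print(f"{name}: sensitivity {sens:.2f}, specificity {spec:.2f}, "
          f"correct {correct:.2f}")
```

Test 2 has the highest proportion correct but zero sensitivity, which is why a single index is misleading.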
Sensitivity and specificity are often multiplied by 100 to give percentages. They are both binomial proportions, so their standard errors and confidence intervals are found as described in §8.4 and §8.8. Because the proportions are often near to 1.0, the large sample approach (§8.4) may not be valid. The exact method using the Binomial probabilities (§8.8) is preferable. Harper and Reeves (1999) point out that confidence intervals are almost always omitted in studies of diagnostic tests reported outside the major general medical journals, and recommend that they should always be given. As the reader might expect, I agree with them! The sample size required for the reliable estimation of sensitivity and specificity can be calculated as described in §18.2.
Sometimes a test is based on a continuous variable. For example, Table 15.6 shows measurements of creatine kinase (CK) in patients with unstable angina and acute myocardial infarction (AMI). Figure 15.5(a) shows a scatter plot. We wish to detect patients with AMI among patients who may have either condition and this measurement is a potential test, AMI patients tending to have high values. How do we choose the cut-off point? The lowest CK in AMI patients is 90, so a cut-off below this will detect all AMI patients. Using 80, for example, we would detect all AMI patients, sensitivity = 1.00, but would also only have 42% of angina patients below 80, so the specificity = 0.42. We can alter the sensitivity and specificity by changing the cut-off point. Raising the cut-off point will mean fewer cases will be detected and so the sensitivity will be decreased. However, there will be fewer false positives, positives on test but who do not in fact have the disease, and the specificity will be increased. For example, if CK ≥ 100 were the criterion for AMI, sensitivity would be 0.96 and specificity 0.62. There is a trade-off between sensitivity and specificity. It can be helpful to plot sensitivity against specificity to examine this trade-off. This is called a receiver operating characteristic or ROC curve. (The name comes from telecommunications.)
We often plot sensitivity against 1 - specificity, as in Figure 15.5(b). We can see from Figure 15.5(b) that we can get both high sensitivity and high specificity if we choose the right cut-off. With 1 - specificity less than 0.1, i.e. specificity greater than 0.9, we can get sensitivity greater than 0.9 also. In fact, a cut-off of 200 would give sensitivity = 0.93 and specificity = 0.91 in this sample. These estimates will be biased, because we are estimating the cut-off and testing it in the same sample. We should check the sensitivity and specificity of this cut-off in a different sample to be sure.
Table 15.6. Creatine kinase in patients with unstable angina and acute myocardial infarction (AMI) (data of Frances Boa)
Unstable angina                        AMI
 23   48   62   83  104  130  307       90    648
 33   49   63   84  105  139  351      196    894
 36   52   63   85  105  150  360      302    962
 37   52   65   86  107  155           311   1015
 37   52   65   88  108  157           325   1143
 41   53   66   88  109  162           335   1458
 41   54   67   88  111  176           347   1955
 41   57   71   89  114  180           349   2139
 42   57   72   91  116  188           363   2200
 42   58   72   94  118  198           377   3044
 43   58   73   94  121  226           390   7590
 45   58   73   95  121  232           398  11138
 47   60   75   97  122  257           545
 48   60   80  100  126  257           577
 48   60   80  103  130  297           629
Fig. 15.5. Scatter diagram and ROC curve for the data of Table 15.6
The area under the ROC curve is often quoted (here it is 0.9753). It estimates the probability that a member of one population chosen at random will exceed a member of the other population, in the same way as does $U/n_1 n_2$ in the Mann-Whitney U test (§12.2). It can be useful in comparing different tests. In this study another blood test gave us an area under the ROC curve = 0.9825, suggesting that that test may be slightly better than CK.
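The cut-off calculations and the area under the ROC curve can be checked directly from Table 15.6. The sketch below is a minimal Python illustration. It assumes the table is read as 93 unstable angina values and 27 AMI values (the reading that reproduces the figures quoted for cut-offs of 100 and 200), treats CK at or above the cut-off as test positive, and computes the area as the Mann-Whitney proportion of (AMI, angina) pairs in which the AMI value is higher, counting ties as one half. The function names are illustrative.

```python
# CK values from Table 15.6 (assumed split: 93 unstable angina, 27 AMI).
angina = [
    23, 33, 36, 37, 37, 41, 41, 41, 42, 42, 43, 45, 47, 48, 48,
    48, 49, 52, 52, 52, 53, 54, 57, 57, 58, 58, 58, 60, 60, 60,
    62, 63, 63, 65, 65, 66, 67, 71, 72, 72, 73, 73, 75, 80, 80,
    83, 84, 85, 86, 88, 88, 88, 89, 91, 94, 94, 95, 97, 100, 103,
    104, 105, 105, 107, 108, 109, 111, 114, 116, 118, 121, 121, 122, 126, 130,
    130, 139, 150, 155, 157, 162, 176, 180, 188, 198, 226, 232, 257, 257, 297,
    307, 351, 360,
]
ami = [
    90, 196, 302, 311, 325, 335, 347, 349, 363, 377, 390, 398, 545, 577, 629,
    648, 894, 962, 1015, 1143, 1458, 1955, 2139, 2200, 3044, 7590, 11138,
]

def sens_spec(cutoff):
    """Sensitivity and specificity when CK >= cutoff is called test positive."""
    sens = sum(x >= cutoff for x in ami) / len(ami)
    spec = sum(x < cutoff for x in angina) / len(angina)
    return sens, spec

def roc_area():
    """Area under the ROC curve as U / (n1 * n2), ties counted as 1/2."""
    wins = sum((a > u) + 0.5 * (a == u) for a in ami for u in angina)
    return wins / (len(ami) * len(angina))

for cutoff in (100, 200):
    sens, spec = sens_spec(cutoff)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
print(f"area under ROC curve: {roc_area():.4f}")
```

Evaluating `sens_spec` over every observed CK value would give the points of the ROC curve itself.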
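We can also estimate the positive predictive value or PPV, the probability that a subject who is test positive will be a true positive (i.e. has the disease and is correctly classified), and the negative predictive value or NPV, the probability that a subject who is test negative will be a true negative (i.e. does not have the disease and is correctly classified). These depend on the prevalence of the condition, $p_{prev}$, as well as the sensitivity, $p_{sens}$, and the specificity, $p_{spec}$. If the sample is a single group of people, we know the prevalence and can estimate PPV and NPV for this population directly as simple proportions. If we started with a sample of cases and a sample of controls, we do not know the prevalence, but we can estimate PPV and NPV for a population with any given prevalence. As described in §6.8, $p_{sens}$ is the conditional probability of a positive test given the disease, so the probability of being both test positive and disease positive is $p_{sens} \times p_{prev}$. Similarly, the probability of being both test positive and disease negative is $(1 - p_{spec}) \times (1 - p_{prev})$. The probability of being test positive is the sum of these (§6.2): $p_{sens} \times p_{prev} + (1 - p_{spec}) \times (1 - p_{prev})$, and the PPV is
$$PPV = \frac{p_{sens}\, p_{prev}}{p_{sens}\, p_{prev} + (1 - p_{spec})(1 - p_{prev})}$$
Similarly, the NPV is
$$NPV = \frac{p_{spec}(1 - p_{prev})}{p_{spec}(1 - p_{prev}) + (1 - p_{sens})\, p_{prev}}$$
In screening situations the prevalence is almost always small and the PPV is low. Suppose we have a fairly sensitive and specific test, $p_{sens}$ = 0.95 and $p_{spec}$ = 0.90, and the disease has prevalence $p_{prev}$ = 0.01 (1%). Then
$$PPV = \frac{0.95 \times 0.01}{0.95 \times 0.01 + (1 - 0.90)(1 - 0.01)} = 0.088, \qquad NPV = \frac{0.90 \times 0.99}{0.90 \times 0.99 + (1 - 0.95) \times 0.01} = 0.999$$
so only 8.8% of test positives would be true positives, but almost all test negatives would be true negatives. Most screening tests are dealing with much smaller prevalences than this, so most test positives are false positives.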
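The dependence of the predictive values on prevalence is easy to explore numerically. The following is a minimal Python sketch of the calculation for the screening example (sensitivity 0.95, specificity 0.90, prevalence 0.01); the function name `predictive_values` is illustrative.

```python
def predictive_values(p_sens, p_spec, p_prev):
    """Return (PPV, NPV) for given sensitivity, specificity and prevalence."""
    true_pos = p_sens * p_prev                # test +ve and disease +ve
    false_pos = (1 - p_spec) * (1 - p_prev)   # test +ve and disease -ve
    true_neg = p_spec * (1 - p_prev)          # test -ve and disease -ve
    false_neg = (1 - p_sens) * p_prev         # test -ve and disease +ve
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

ppv, npv = predictive_values(p_sens=0.95, p_spec=0.90, p_prev=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Calling the function with a smaller prevalence shows the PPV falling further, in line with the remark about most screening tests.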
15.5 Normal range or reference interval
In §15.4 we were concerned with the diagnosis of particular diseases. In this section we look at it the other way round and ask what values measurements on normal, healthy people are likely to have. There are difficulties in doing this. Who is 'normal' anyway? In the UK population almost everyone has hard fatty deposits in their coronary arteries, which result in death for many of them. Very few Africans have this; they die from other causes. So it is normal in the UK to have an abnormality. We usually say that normal people are the apparently healthy members of the local population. We can draw a sample of these as described in Chapter 3 and make the measurement on them.
The next problem is to estimate the set of values. If we use the range of the observations, the difference between the two most extreme values, we can be fairly confident that if we carry on sampling we will eventually find observations outside it, and the range will get bigger and bigger (§4.7). To avoid this we use a range between two quantiles (§4.7), usually the 2.5 centile and the 97.5 centile, which is called the normal range, 95% reference range or 95% reference interval. This leaves 5% of normals outside the 'normal range', which is the set of values within which 95% of measurements from apparently healthy individuals will lie.
A third difficulty comes from confusion between 'normal' as used in medicine and 'Normal distribution' as used in statistics. This has led some people to develop approaches which say that all data which do not fit under a Normal curve are abnormal! Such methods are simply absurd; there is no reason to suppose that all variables follow a Normal distribution (§7.4, §7.5). The term 'reference interval', which is becoming widely used, has the advantage of avoiding this confusion. However, the most commonly used method of calculation rests on the assumption that the variable follows a Normal distribution.
We have already seen that in general most observations fall within two standard deviations of the mean, and that for a Normal distribution 95% are within these limits with 2.5% below and 2.5% above. If we estimate the mean and standard deviation of data from a Normal population we can estimate the reference interval as $\bar{x} - 2s$ to $\bar{x} + 2s$.
Consider the FEV1 data of Table 4.5. We will estimate the reference interval for FEV1 in male medical students. We have 57 observations, mean 4.06 and standard deviation 0.67 litres. The reference interval is thus from 2.7 to 5.4 litres. From Table 4.4 we see that in fact only one student (2%) is outside these limits, although the sample is rather small.
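Each limit, $\bar{x} - 2s$ or $\bar{x} + 2s$, combines the sampling variation of the mean, with variance $s^2/n$, and that of twice the standard deviation, with variance approximately $4 \times s^2/(2(n-1))$. Hence, provided Normal assumptions hold, the standard error of the limit of the reference interval is
$$\sqrt{\frac{s^2}{n} + \frac{2s^2}{n-1}}$$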
Compare the serum triglyceride measurements of Table 4.8. As already noted (§4.4, §7.4), the data are highly skewed, and we cannot use the Normal method directly. If we did, the lower limit would be 0.07, well below any of the observations, and the upper limit would be 0.94, greater than which are 5% of the observations. It is possible for such data to give a negative lower limit.
Because of the obviously unsatisfactory nature of the Normal method for some data, some authors have advocated the estimation of the percentiles directly (§4.5), without any distributional assumptions. This is an attractive idea. We want to know the point below which 2.5% of values will fall. Let us simply rank the observations and find the point below which 2.5% of the observations fall. For the 282 triglycerides, the 2.5 and 97.5 centiles are found as follows. For the 2.5 centile, we find i = q(n + 1) = 0.025 × (282 + 1) = 7.08. The required quantile will be between the 7th and 8th observation. The 7th is 0.21, the 8th is 0.22, so the 2.5 centile would be estimated by 0.21 + (0.22 - 0.21) × (7.08 - 7) = 0.211. Similarly the 97.5 centile is 1.039.
This approach gives an unbiased estimate whatever the distribution. The log transformed triglyceride would give exactly the same results. Note that the Normal theory limits from the log transformed data are very similar. We now look at the confidence interval. The 95% confidence interval for the q quantile, here q being 0.025 or 0.975, estimated directly from the data is found by the Binomial distribution method (§8.9). For the triglyceride data, n = 282 and so for the lower limit, q = 0.025, we have
$$j = nq - 1.96\sqrt{nq(1-q)} = 282 \times 0.025 - 1.96\sqrt{282 \times 0.025 \times 0.975}$$
$$k = nq + 1.96\sqrt{nq(1-q)} = 282 \times 0.025 + 1.96\sqrt{282 \times 0.025 \times 0.975}$$
This gives j = 1.9 and k = 12.2, which we round up to j = 2 and k = 13. In the triglyceride data the second observation, corresponding to j = 2, is 0.16 and the 13th is 0.26. Thus the 95% confidence interval for the lower reference limit is 0.16 to 0.26. The corresponding calculation for q = 0.975 gives j = 270 and k = 281. The 270th observation is 0.96 and the 281st is 1.64, giving a 95% confidence interval for the upper reference limit of 0.96 to 1.64. These are wider confidence intervals than those found by the Normal method, those for the long tail particularly so. This method of estimating percentiles in long tails is relatively imprecise.
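The rank arithmetic for the direct centile estimate and its confidence interval can be sketched as follows. This is a minimal Python illustration under stated assumptions: the full triglyceride data of Table 4.8 are not reproduced here, so only the ordered values quoted in the text (7th = 0.21, 8th = 0.22) are used, and the ranks j and k are taken from the large-sample Binomial approximation nq ± 1.96√(nq(1 - q)), which reproduces the values quoted above. Function names are illustrative.

```python
import math

def centile_position(q, n):
    """Rank position i = q(n + 1) of the q quantile among n ordered values."""
    return q * (n + 1)

def interpolate(lower_value, upper_value, position):
    """Linear interpolation between the two ordered observations
    on either side of a fractional rank position."""
    fraction = position - math.floor(position)
    return lower_value + (upper_value - lower_value) * fraction

def quantile_ci_ranks(q, n, z=1.96):
    """Approximate ranks (j, k) bounding a 95% CI for the q quantile."""
    half_width = z * math.sqrt(n * q * (1 - q))
    return n * q - half_width, n * q + half_width

n = 282
position = centile_position(0.025, n)
estimate = interpolate(0.21, 0.22, position)   # 7th and 8th observations
j_low, k_low = quantile_ci_ranks(0.025, n)
j_high, k_high = quantile_ci_ranks(0.975, n)

print(f"position {position:.2f}, 2.5 centile estimate {estimate:.3f}")
print(f"lower limit: j = {j_low:.1f}, k = {k_low:.1f}; "
      f"ranks {math.ceil(j_low)} and {math.ceil(k_low)}")
print(f"upper limit: ranks {math.ceil(j_high)} and {math.ceil(k_high)}")
```

The confidence limits themselves are then read off as the observations at the rounded-up ranks in the ordered data.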
15.6* Survival data
We often have data which represent the time from some event to death, such as time from diagnosis or from entry to a clinical trial, but survival analysis does not have to be about death. In cancer studies we can use survival analysis for the time to metastasis or to local recurrence of a tumour, in a study of medical care we can use it to analyse the time to readmission to hospital, in a study of breast-feeding we could look at the age at which breast-feeding ceased or at which bottle feeding was first introduced, and in a study of the treatment of infertility we can treat the time from treatment to conception as survival data. We usually refer to the terminal event, death, conception, etc., as the endpoint.
Problems arise in the measurement of survival because often we do not know the exact survival times of all subjects. This is because some will still be surviving when we want to analyse the data. When cases have entered the study at different times, some of the recent entrants may be surviving, but only have been observed for a short time. Their observed survival time may be less than those cases admitted early in the study and who have since died. The method of calculating survival curves described below takes this into account. Observations which are known only to be greater than some value are right censored, often shortened to censored. (We get left censored data when the measurement method cannot detect anything below some cut-off value, and observations are recorded as 'none detectable'. The rank methods in Chapter 12 are useful for such data.)
Table 15.7. Survival time in years of patients after diagnosis of parathyroid cancer
Alive Deaths
<1 <1
<1 2
1 6
1 6
4 7
5 9
6 9
8 11
10 14
10
17
Table 15.7 shows some survival data, for patients with parathyroid cancer. The survival times are recorded in completed years. A patient who survived for 6 years and then died can be taken as having lived for 6 years and then died in the seventh. In the first year from diagnosis, one patient died, two patients were observed for only part of this year, and 17 survived into the next year. The subjects who have only been observed for part of the year are censored, also called lost to follow-up or withdrawn from follow-up. (These are rather misleading names, often wrongly interpreted as meaning that these subjects have dropped out of the study. This may be the case, but most of these subjects are simply still alive and their further survival is unknown.) There is no information about the survival of these subjects after the first year, because it has not happened yet. These patients are only at risk of dying for part of the year and we cannot say that 1 out of 20 died as they may yet contribute another death in the first year. We can say that such patients will contribute half a year of risk, on average, so the number of patient years at risk in the first year is 18 (17 who survived and 1 who died) plus 2 halves for those withdrawn from follow-up, giving 19 altogether. We get an estimate of the probability of dying in the first year of 1/19, and an estimated probability of surviving of 1 - 1/19. We can do this for each year until the limits of the data are reached. We thus trace the survival of these patients estimating the probability of death or survival at each year and the cumulative probability of survival to each year. This set of probabilities is called a life table.
To carry out the calculation, we first set out for each year, x, the number alive at the start, $n_x$, the number withdrawn during the year, $w_x$, the number at risk, $r_x$, and the number dying, $d_x$ (Table 15.8). Thus in year 1 the number at the start is 20, the number withdrawn is 2, the number at risk $r_1 = n_1 - \frac{1}{2}w_1 = 20 - \frac{1}{2} \times 2 = 19$ and the number of deaths is 1. As there were 2 withdrawals and 1 death the number at the start of year 2 is 17. For each year we calculate the probability of dying in that year for patients who have reached the beginning of it, $q_x = d_x/r_x$, and hence the probability of surviving to the next year, $p_x = 1 - q_x$. Finally we calculate the cumulative survival probability.
Table 15.8. Life table calculation for parathyroid cancer survival
Year   Number     Withdrawn     At risk   Deaths   Prob. of   Prob. of         Cumulative prob.
       at start   during year                      death      surviving year   of survival
x      nx         wx            rx        dx       qx         px               Px
1      20         2             19        1        0.0526     0.9474           0.9474
2      17         2             16        0        0          1                0.9474
3      15         0             15        1        0.0667     0.9333           0.8842
4      14         0             14        0        0          1                0.8842
5      14         1             13.5      0        0          1                0.8842
6      13         1             12.5      0        0          1                0.8842
7      12         1             11.5      2        0.1739     0.8261           0.7304
8      9          0             9         1        0.1111     0.8889           0.6493
9      8          1             7.5       0        0          1                0.6493
10     7          0             7         2        0.2857     0.7143           0.4638
11     5          2             4         0        0          1                0.4638
12     3          0             3         1        0.3333     0.6667           0.3092
13     2          0             2         0        0          1                0.3092
14     2          0             2         0        0          1                0.3092
15     2          0             2         1        0.5000     0.5000           0.1546
16     1          0             1         0        0          1                0.1546
17     1          0             1         0        0          1                0.1546
18     1          1             0.5       0        0          1                0.1546
$r_x = n_x - \frac{1}{2}w_x$, $q_x = d_x/r_x$, $p_x = 1 - q_x$, $P_x = p_x P_{x-1}$.
For the first year, this is the probability of surviving that year, $P_1 = p_1$. For the second year, it is the probability of surviving up to the start of the second year, $P_1$, times the probability of surviving that year, $p_2$, to give $P_2 = p_2 P_1$. The probability of surviving for 3 years is similarly $P_3 = p_3 P_2$, and so on. From this life table we can estimate the five year survival rate, a useful measure of prognosis in cancer. For the parathyroid cancer, the five year survival rate is 0.8842, or 88%. We can see that the prognosis for this cancer is quite good. If we know the exact time of death or withdrawal for each subject, then instead of using fixed time intervals we use x as the exact time, with a row of the table for each time when either an endpoint or a withdrawal occurs. Then $r_x = n_x$ and we can omit the $r_x = n_x - \frac{1}{2}w_x$ step.
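The life table calculation is mechanical and can be sketched in a few lines. The following minimal Python illustration works from the completed years in Table 15.7, coding '<1' as 0, and applies the half-year adjustment for withdrawals described above; the function name and tuple layout are illustrative only.

```python
# Completed years of survival from Table 15.7 ("<1" coded as 0).
alive = [0, 0, 1, 1, 4, 5, 6, 8, 10, 10, 17]   # censored (still alive)
deaths = [0, 2, 6, 6, 7, 9, 9, 11, 14]

def life_table(censored, died):
    """Return rows (x, n_x, w_x, r_x, d_x, q_x, p_x, P_x), one per year."""
    n = len(censored) + len(died)          # number alive at the start
    cumulative = 1.0
    rows = []
    last_year = max(censored + died) + 1   # year in which the last event falls
    for year in range(1, last_year + 1):
        w = censored.count(year - 1)       # withdrawn during this year
        d = died.count(year - 1)           # deaths during this year
        at_risk = n - w / 2                # withdrawals count as half a year
        q = d / at_risk                    # probability of dying in the year
        p = 1 - q                          # probability of surviving the year
        cumulative *= p                    # P_x = p_x * P_(x-1)
        rows.append((year, n, w, at_risk, d, q, p, cumulative))
        n -= w + d                         # number starting the next year
    return rows

table = life_table(alive, deaths)
for year, n, w, r, d, q, p, big_p in table:
    print(f"{year:2d} {n:3d} {w:2d} {r:5.1f} {d:2d} {q:.4f} {p:.4f} {big_p:.4f}")

five_year_survival = table[4][7]
print(f"five year survival rate: {five_year_survival:.4f}")
```

With exact event times, the same loop would run over distinct event times instead of years and drop the `w / 2` adjustment.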
We can draw a graph of the cumulative survival probability, the survival curve. This is usually drawn in steps, with abrupt changes in probability (Figure 15.6). This convention emphasizes the relatively poor estimation at the long survival end of the curve, where the small numbers at risk produce large steps. When the exact times of death and censoring are known, this is called a Kaplan-Meier survival curve. The times at which observations are censored may be marked by small vertical lines above the survival curve (Figure 15.7), and the number remaining at risk may be written at suitable intervals below the time axis.
The standard error and confidence interval for the survival probabilities can be found (see Armitage and Berry 1994). These are useful for estimates such as five year survival rate. They do not provide a good method for comparing survival curves, as they do not include all the data, only using those up to the chosen time. Survival curves start off together at 100% survival, possibly diverge, but eventually come together at zero survival. Thus the comparison would depend on the time chosen. Survival curves can be compared by several significance tests, of which the best known is the logrank test. This is a non-parametric test which makes use of the full survival data without making any assumption about the shape of the survival curve.
Fig. 15.6. Survival curve for parathyroid cancer patients
Table 15.9 shows the time to recurrence of gallstones following dissolution by bile acid treatment or lithotripsy. Here we shall compare the two groups defined by having single or multiple gallstones, using the logrank test. We shall look at the quantitative variables diameter of gallstone and months to dissolve in §17.9. Figure 15.7 shows the time to recurrence for subjects with single primary gallstones and multiple primary gallstones. The null hypothesis is that there is no difference in recurrence-free survival time, the alternative that there is such a difference. The calculation of the logrank test is set out in Table 15.10. For each time at which a recurrence or a censoring occurred, we have the numbers under observation in each group, $n_1$ and $n_2$, the number of recurrences, $d_1$ and $d_2$ (d for death), and the number of censorings, $w_1$ and $w_2$ (w for withdrawal). For each time, we calculate the probability of recurrence, $p_d = (d_1 + d_2)/(n_1 + n_2)$, which each subject would have if the null hypothesis were true. For each group, we calculate the expected number of recurrences, $e_1 = p_d \times n_1$ and $e_2 = p_d \times n_2$. We then calculate the numbers at risk at the next time, $n_1 - d_1 - w_1$ and $n_2 - d_2 - w_2$. We do this for each time. We then add the $d_1$ and $d_2$ columns to get the observed numbers of recurrences, and the $e_1$ and $e_2$ columns to get the numbers of recurrences expected if the null hypothesis were true.
We have observed frequencies of recurrence $d_1$ and $d_2$, and expected frequencies $e_1$ and $e_2$. Of course, $d_1 + d_2 = e_1 + e_2$, so we only need to calculate $e_1$ as in Table 15.10, and hence $e_2$ by subtraction. This only works for two groups, however, and the method of Table 15.10 works for any number of groups.
Table 15.9. Time to recurrence of gallstones following dissolution, whether previous gallstones were multiple, maximum diameter of previous gallstones, and months previous gallstones took to dissolve
Time Rec. Mult. Diam. Dis. Time Rec. Mult. Diam.
3 No Yes 4 10 13 No No 11
3 No No 18 3 13 No No 22
3 No Yes 5 27 13 No No 13
4 No Yes 4 4 13 Yes Yes
5 No No 19 20 14 No Yes
6 No Yes 3 10 14 No No 23
6 No Yes 4 6 14 No No 15
6 No Yes 4 20 16 Yes Yes
6 Yes Yes 5 8 16 Yes Yes
6 Yes Yes 3 18 16 No No 18
6 Yes Yes 7 9 17 No No
6 No No 25 9 17 No Yes
6 No Yes 4 6 17 No Yes
6 Yes Yes 10 38 17 Yes No
6 Yes Yes 8 15 17 No Yes
6 No Yes 4 13 18 Yes No 10
7 Yes Yes 4 15 18 Yes Yes
7 No Yes 3 7 18 No Yes 11
7 Yes Yes 10 48 19 No No 26
8 Yes Yes 14 29 19 No Yes 11
8 Yes No 18 14 19 Yes Yes
8 Yes Yes 6 6 20 No No 11
8 No No 15 1 20 No No 13
8 No Yes 1 12 20 No No
8 No Yes 5 6 21 No Yes 11
9 No Yes 2 15 21 No Yes 13
9 Yes Yes 7 6 21 No Yes
9 No No 19 8 22 No No 10
10 Yes Yes 14 8 22 No No 20
11 No Yes 8 12 23 No No 16
11 No No 15 15 24 No No 15
11 Yes No 5 8 24 No Yes
11 No Yes 3 6 24 No No 15
11 Yes Yes 5 12 24 Yes Yes
11 No Yes 4 6 25 No No 13
11 No Yes 4 3 25 Yes Yes
11 No Yes 13 18 25 No No
11 Yes No 7 8 26 No No 17
12 Yes Yes 5 7 26 No Yes
12 Yes Yes 8 12 26 Yes No 16
12 No Yes 4 6 28 No No 20
12 No Yes 4 8 28 Yes No 30
12 Yes Yes 7 19 29 No No 16
12 Yes No 7 3 29 Yes No 12
12 No Yes 5 22 29 Yes Yes 10
12 Yes No 8 1 29 No Yes
12 No No 6 6 30 No Yes
12 No No 26 4 30 No No
13 No Yes 5 6 30 Yes Yes 22
13 No No 13 6 30 Yes Yes
31 No Yes 5 6 38 No No 10
31 No No 26 3 38 Yes Yes
31 No No 7 24 38 No No
32 Yes Yes 10 12 40 No No 23
32 No Yes 5 6 41 No No 16
32 No No 4 6 41 No No
32 No No 18 10 42 No No 15
33 No No 13 9 42 No Yes 16
34 No No 15 8 42 No Yes
34 No No 20 30 42 No Yes 14
34 No Yes 15 8 43 Yes No
34 No No 27 8 44 No Yes
35 No No 6 12 44 No Yes 10
36 No No 18 5 45 No No 12
36 No Yes 6 16 47 No Yes
36 No Yes 5 6 48 No No 21
36 No Yes 8 17 48 No No
36 No No 5 4 53 No Yes
37 No Yes 5 7 60 Yes No 15
37 No No 19 4 61 No No 10
37 No Yes 4 4 65 No Yes
37 No Yes 4 12 70 No Yes
Fig. 15.7. Gallstone-free survival after the dissolution of single and multiple gallstones
Table 15.10. Calculation for the logrank test
Time n1 d1 w1 n2 d2 w2 pd e1
3 65 0 1 79 0 2 0.000 0.000
4 64 0 0 77 0 1 0.000 0.000
5 64 0 1 76 0 0 0.000 0.000
6 63 0 1 76 5 5 0.036 2.266
7 62 0 0 66 2 1 0.016 0.969
8 62 1 1 63 2 2 0.024 1.488
9 60 0 1 59 1 1 0.008 0.504
10 59 0 0 57 1 0 0.009 0.509
11 59 2 1 56 1 5 0.026 1.539
12 56 2 2 50 3 3 0.047 2.642
13 52 0 4 44 1 1 0.010 0.542
14 48 0 2 42 0 1 0.000 0.000
16 46 0 1 41 2 0 0.023 1.057
17 45 1 1 39 0 3 0.012 0.536
18 43 1 0 36 1 1 0.025 1.089
19 42 0 1 34 1 1 0.013 0.553
20 41 0 3 32 0 0 0.000 0.000
21 38 0 0 32 0 3 0.000 0.000
22 38 0 2 29 0 0 0.000 0.000
23 36 0 1 29 0 0 0.000 0.000
24 35 0 2 29 1 1 0.016 0.547
25 33 0 2 27 1 0 0.017 0.550
26 31 1 1 26 0 1 0.018 0.544
28 29 1 1 25 0 0 0.019 0.537
29 27 1 1 25 1 1 0.038 1.038
30 25 0 1 23 2 1 0.042 1.042
31 24 0 2 20 0 1 0.000 0.000
32 22 0 2 19 1 1 0.024 0.537
33 20 0 1 17 0 0 0.000 0.000
34 19 0 3 17 0 1 0.000 0.000
35 16 0 1 16 0 0 0.000 0.000
36 15 0 2 16 0 3 0.000 0.000
37 13 0 1 13 0 3 0.000 0.000
38 12 0 2 10 1 0 0.045 0.545
40 10 0 1 9 0 0 0.000 0.000
41 9 0 2 9 0 0 0.000 0.000
42 7 0 1 9 0 3 0.000 0.000
43 6 1 0 6 0 0 0.083 0.500
44 5 0 0 4 0 2 0.000 0.000
45 5 0 1 4 0 0 0.000 0.000
47 4 0 0 4 0 1 0.000 0.000
48 4 0 2 3 0 0 0.000 0.000
53 2 0 0 3 0 1 0.000 0.000
60 2 1 0 2 0 0 0.250 0.500
61 1 0 1 2 0 0 0.000 0.000
65 0 0 0 2 0 1 0.000 0.000
70 0 0 0 1 0 1 0.000 0.000
Total 12 27 20.032
$p_d = (d_1 + d_2)/(n_1 + n_2)$, $e_1 = p_d n_1$, $e_2 = p_d n_2$.
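We can test the null hypothesis that the risk of recurrence in any month is equal for the two populations by a chi-squared test:
$$\chi^2 = \sum \frac{(d_i - e_i)^2}{e_i} = \frac{(12 - 20.032)^2}{20.032} + \frac{(27 - 18.968)^2}{18.968} = 6.62$$
There is one constraint, that the two frequencies add to the sum of the expected (i.e. the total number of recurrences), so we lose one degree of freedom, giving 2 - 1 = 1 degree of freedom. From Table 13.3, this has a probability of 0.01.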
Some texts describe this test differently, saying that under the null hypothesis $d_1$ is from a Normal distribution with mean $e_1$ and variance $e_1 e_2/(e_1 + e_2)$. This is algebraically identical to the chi-squared method, but only works for two groups.
The logrank test is non-parametric, because we make no assumptions about either the distribution of survival time or any difference in recurrence rates. It requires the survival or censoring times to be exact. A similar method for grouped data as in Table 15.8 is given by Mantel (1966).
The logrank test is a test of significance and, of course, an estimate of the difference is preferable if we can get one. The logrank test calculation can be used to give us one: the hazard ratio. This is the ratio of the risk of death in group 1 to the risk of death in group 2. For this to make sense, we have to assume that this ratio is the same at all times, otherwise there could not be a single estimate. (Compare the paired t method, §10.2.) The risk of death is the number of deaths divided by the population at risk, but the population keeps changing due to censoring. However, the populations at risk in the two groups are proportional to the numbers of expected deaths, $e_1$ and $e_2$. We can thus calculate the hazard ratio by
$$h = \frac{d_1/e_1}{d_2/e_2}$$
For Table 15.10, we have
$$h = \frac{12/20.032}{27/18.968} = 0.42$$
Thus we estimate the risk of recurrence with single stones to be 0.42 times the risk for multiple stones. The direct calculation of a confidence interval for the hazard ratio is tedious and I shall omit it. Altman (1991) gives details. It can also be done by Cox regression (§17.9).
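The logrank totals, the chi-squared statistic and the hazard ratio can be recomputed from Table 15.10. The sketch below is a minimal Python illustration that uses only the rows of the table in which at least one recurrence occurred (other rows contribute zero to the expected numbers). The P value uses the identity that the upper tail of chi-squared with 1 degree of freedom equals erfc(√(χ²/2)), in place of a table look-up; variable names are illustrative.

```python
import math

# (n1, d1, n2, d2) for each time in Table 15.10 with at least one recurrence.
# Group 1 = single gallstones, group 2 = multiple gallstones.
event_rows = [
    (63, 0, 76, 5), (62, 0, 66, 2), (62, 1, 63, 2), (60, 0, 59, 1),
    (59, 0, 57, 1), (59, 2, 56, 1), (56, 2, 50, 3), (52, 0, 44, 1),
    (46, 0, 41, 2), (45, 1, 39, 0), (43, 1, 36, 1), (42, 0, 34, 1),
    (35, 0, 29, 1), (33, 0, 27, 1), (31, 1, 26, 0), (29, 1, 25, 0),
    (27, 1, 25, 1), (25, 0, 23, 2), (22, 0, 19, 1), (12, 0, 10, 1),
    (6, 1, 6, 0), (2, 1, 2, 0),
]

d1 = sum(row[1] for row in event_rows)   # observed recurrences, group 1
d2 = sum(row[3] for row in event_rows)   # observed recurrences, group 2

# Expected recurrences in group 1: sum over times of p_d * n1.
e1 = sum((r[1] + r[3]) / (r[0] + r[2]) * r[0] for r in event_rows)
e2 = d1 + d2 - e1                        # by subtraction (two groups only)

chi_squared = (d1 - e1) ** 2 / e1 + (d2 - e2) ** 2 / e2
p_value = math.erfc(math.sqrt(chi_squared / 2))   # chi-squared, 1 d.f.
hazard_ratio = (d1 / e1) / (d2 / e2)

print(f"observed {d1}, {d2}; expected {e1:.3f}, {e2:.3f}")
print(f"chi-squared = {chi_squared:.2f}, P = {p_value:.3f}")
print(f"hazard ratio = {hazard_ratio:.2f}")
```

For more than two groups, $e_2$ (and any further expected totals) would be accumulated row by row in the same way as `e1`, rather than obtained by subtraction.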
15.7* Computer aided diagnosis
Reference intervals (§15.5) are one area where statistical methods are involved directly in diagnosis; computer aided diagnosis is another. The 'aided' is put in to persuade clinicians that the main purpose is not to do them out of a job, but, naturally, they have their doubts. Computer aided diagnosis is partly a statistical exercise. There are two types of computer aided diagnosis: statistical methods, where diagnosis is based on a set of data obtained from past cases, and decision tree methods, which try to imitate the thought processes of an expert in the field. We shall look briefly at each approach.
There are several methods of statistical computer aided diagnosis. One uses discriminant analysis. In this we start with a set of data on subjects whose diagnosis was subsequently confirmed, and calculate one or more discriminant functions. A discriminant function has the form:
constant1 × variable1 + constant2 × variable2 + … + constantk × variablek
The constants are calculated so that the values of the functions are as similar as possible for members of the same group and as different as possible for members of different groups. In the case of only two groups, we have one discriminant function and all the subjects in one group will have high values of the function and all subjects in the other will have low values. For each new subject we evaluate the discriminant function and use it to allocate the subject to a group or diagnosis. We can estimate the probability of the subject falling in that group, and in any other. Many forms of discriminant analysis have been developed to try and improve this form of computer diagnosis, but it does not seem to make much difference which is used. Logistic regression (§17.8) can also be used.
A different approach uses Bayesian analysis. This is based on Bayes' theorem, a result about conditional probability (§6.8) which may be stated in terms of the probability of diagnosis A being true if we have observed data B, as:
$$\text{PROB(diagnosis A given data B)} = \frac{\text{PROB(data B given diagnosis A)} \times \text{PROB(diagnosis A)}}{\text{PROB(data B)}}$$
If we have a large data set of known diagnoses and their associated symptoms and signs, we can determine PROB(diagnosis A) easily. It is simply the proportion of times A has been diagnosed. The problem of finding the probability of a particular combination of symptoms and signs is more difficult. If they are all independent, we can say that the probability of a given symptom is the proportion of times it occurs, and the probability of the symptom for each diagnosis is found in the same way. The probability of any combination of symptoms can be found by multiplying their individual probabilities together, as described in §6.2. In practice the assumption that signs and symptoms are independent is most unlikely to be met and a more complicated analysis would be required to deal with this. However, some systems of computer aided diagnosis have been found to work quite well with the simple approach.
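The simple independence approach can be illustrated with a short sketch. The following Python example is hypothetical throughout: the two diagnoses, the two signs and all the probabilities are invented for illustration and do not come from any data set. It also makes one substitution that should be stated plainly: rather than estimating PROB(data) by multiplying the overall symptom proportions, it obtains PROB(data) by summing PROB(data given diagnosis) × PROB(diagnosis) over the diagnoses, which assumes the listed diagnoses are mutually exclusive and exhaustive.

```python
def posterior(priors, likelihoods, observed):
    """Posterior probability of each diagnosis, assuming signs are
    independent within each diagnosis.

    priors:      {diagnosis: PROB(diagnosis)}
    likelihoods: {diagnosis: {sign: PROB(sign present given diagnosis)}}
    observed:    {sign: True if present, False if absent}
    """
    joint = {}
    for diagnosis, prior in priors.items():
        prob = prior
        for sign, present in observed.items():
            p = likelihoods[diagnosis][sign]
            prob *= p if present else 1 - p   # multiply independent signs
        joint[diagnosis] = prob               # PROB(data and diagnosis)
    total = sum(joint.values())               # PROB(data), summed over diagnoses
    return {diagnosis: value / total for diagnosis, value in joint.items()}

# Hypothetical numbers, for illustration only.
priors = {"diagnosis A": 0.2, "diagnosis B": 0.8}
likelihoods = {
    "diagnosis A": {"sign 1": 0.9, "sign 2": 0.5},
    "diagnosis B": {"sign 1": 0.1, "sign 2": 0.4},
}
result = posterior(priors, likelihoods, {"sign 1": True, "sign 2": True})
for diagnosis, prob in result.items():
    print(f"{diagnosis}: {prob:.3f}")
```

In practice the probabilities would be proportions estimated from the large data set of confirmed diagnoses described above.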
Expert or knowledge-based systems work in a different way. Here the knowledge of a human expert or group of experts in the field is converted into a series of decision rules, e.g. 'if the patient has high CK then the patient has myocardial infarction, if not then on to the next decision'. These systems can be modified by asking further experts to test the system with cases from their own experience and to suggest further decision rules if the program fails. They also have the advantage that the program can 'explain' the reason for its 'decision' by listing the series of steps which led to it. Most of Chapter 14 consists of rules of just this type and could be turned into an expert system for statistical analysis.
Although there have been some impressive achievements in the field of computer diagnosis, it has to date made little progress towards acceptance in routine medical practice. As computers become more familiar to clinicians, more common in their surgeries and more powerful in terms of data storage and processing speed, we may expect computer aided diagnosis to become as well established as computer aided statistical analysis is today.
15.8* Number needed to treat
When a clinical trial has a dichotomous outcome measure, such as survival or death, there are several ways in which we can express the difference between the two treatments. These include the difference between proportions of successes, ratio of proportions (risk ratio or relative risk), and the odds ratio. The number needed to treat (NNT) is the number of patients we would need to treat with the new treatment to achieve one more success than we would on the old treatment (Laupacis et al. 1988; Cook and Sackett 1995). It is the reciprocal of the difference between the proportion of success on the new treatment and the proportion on the old treatment. For example, in the MRC streptomycin trial (Table 2.10) the survival rates after 6 months were 93% in the streptomycin group and 73% in the control group. The difference in proportions surviving was thus 0.93 - 0.73 = 0.20 and the number needed to treat to prevent one death over six months was 1/0.20 = 5. The smaller the NNT, the more effective the treatment will be. The smallest possible value for NNT is 1.0, when the proportions successful are 1.0 and 0.0. This would mean that the new treatment was always effective and the old treatment was never effective. The NNT cannot be zero. If the treatment has no effect at all, the NNT will be infinite, because the difference in the proportion of successes will be zero. If the treatment is harmful, so that the success rate is less than on the control treatment, the NNT will be negative. The number is then called the number needed to harm (NNH). This idea has caught on very quickly and has been widely used and developed, for example as the number needed to screen (Rembold 1998).
The NNT is an estimate and should have a confidence interval. This is apparently quite straightforward. We find the confidence interval for the difference in the proportions, and the reciprocals of these limits are the confidence limits for the NNT. For the MRC streptomycin trial the 95% confidence interval for the difference is 0.0578 to 0.3352, reciprocals 17.3 and 3.0. Thus the 95% confidence interval for the NNT is 3 to 17.
This is deceptively simple. As Altman (1998) pointed out, there are problems when the difference is not significant. The confidence interval for the difference between proportions includes zero, so infinity is a possible value for NNT, and negative values are also possible, i.e. the treatment may harm. The confidence interval must allow for this.
For example, Henzi et al. (2000) calculated NNT for several studies, including that of Lopez-Olaondo et al. (1996). This study compared dexamethasone against placebo to prevent postoperative nausea and vomiting. They observed nausea in 5/25 patients on dexamethasone and 10/25 on placebo. Thus the difference in proportions without nausea (success) is 0.80 - 0.60 = 0.20, 95% confidence interval -0.0479 to 0.4479 (§8.6). The number needed to treat is the reciprocal of this difference, 1/0.20 = 5.0. The reciprocals of the confidence limits are 1/(-0.0479) = -20.9 and 1/0.4479 = 2.2. But the confidence interval for the NNT is not -20.9 to 2.2. Zero, which this includes, is not a possible value for the NNT. Since there may be no treatment difference at all, zero difference between proportions, the NNT may be infinite. In fact, the confidence interval for NNT is not the values between -20.9 and 2.2, but the values outside this interval, i.e. 2.2 to infinity (number needed to achieve an extra success, NNT) and minus infinity to -20.9 (number needed to achieve an extra failure, NNH). Thus the NNT is estimated to be anything greater than 2.2, and the NNH to be anything greater than 20.9. The confidence interval is in two parts, -∞ to -20.9 and 2.2 to ∞. (‘∞’ is the symbol for infinity.) Henzi et al. (2000) quote this confidence interval as 2.2 to -21, which they say the reader should interpret as including infinity. Altman (1998) recommends ‘NNTH = 20.9 to ∞ to NNTB 2.2’, where NNTH means ‘number needed to harm’ and NNTB means ‘number needed to benefit’. I prefer ‘-∞ to -20.9, 2.2 to ∞’. Here -∞ and ∞ each tell us that it does not matter which treatment is used.
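The reasoning above can be sketched in a few lines of Python. This is only an illustration, not a program from the book: the function names `difference_ci` and `nnt_interval` are hypothetical, and the interval for the difference uses the usual large-sample Normal approximation, which is assumed here because it reproduces the limits quoted in the text.

```python
import math

def difference_ci(success_new, n_new, success_old, n_old, z=1.96):
    """Difference between two proportions with a large-sample 95% interval."""
    p_new = success_new / n_new
    p_old = success_old / n_old
    diff = p_new - p_old
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_old * (1 - p_old) / n_old)
    return diff, diff - z * se, diff + z * se

def nnt_interval(diff_lower, diff_upper):
    """Confidence interval for the NNT as a list of (low, high) parts.

    If the interval for the difference excludes zero there is one part,
    the reciprocals of the limits. If it includes zero there are two
    parts, running out to minus infinity and to plus infinity.
    """
    if diff_lower < 0 < diff_upper:
        return [(-math.inf, 1 / diff_lower), (1 / diff_upper, math.inf)]
    if diff_lower == 0 or diff_upper == 0:
        raise ValueError("a limit of exactly zero gives an infinite NNT limit")
    low, high = sorted((1 / diff_lower, 1 / diff_upper))
    return [(low, high)]

# Streptomycin trial: limits for the difference as quoted in the text.
streptomycin_parts = nnt_interval(0.0578, 0.3352)

# Dexamethasone trial: 20/25 without nausea on treatment, 15/25 on placebo.
diff, lower, upper = difference_ci(20, 25, 15, 25)
dexamethasone_parts = nnt_interval(lower, upper)

print(streptomycin_parts)
print(round(1 / diff, 1), dexamethasone_parts)
```

Returning a list of parts, rather than a single pair of limits, keeps the significant and non-significant cases in the same form and makes it hard to misread a two-part interval as the values between its inner limits.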
Two-part confidence intervals are not exactly intuitive and I think that the problems of interpretation of NNT in negative trials limit its value to being a supplementary description of trial results.
15M Multiple choice questions 81 to 86
(Each answer is true or false)
81.* The repeatability or precision of measurements may be measured by:
(a) the coefficient of variation of repeated measurements;
(b) the standard deviation of measurements between subjects;
(c) the standard deviation of the difference between pairs of measurements;
(d) the standard deviation of repeated measurements within subjects;
(e) the difference between the means of two sets of measurements on the same set of subjects.
82. The specificity of a test for a disease:
(a) has a standard error derived from the Binomial distribution;
(b) measures how well the test detects cases of the disease;
(c) measures how well the test excludes subjects without the disease;
(d) measures how often a correct diagnosis is obtained from the test;
(e) is all we need to tell us how good the test is.
83. The level of an enzyme measured in blood is used as a diagnostic test for a disease, the test being positive if the enzyme concentration is above a critical value. The sensitivity of the diagnostic test:
(a) is one minus the specificity;
(b) is a measure of how well the test detects cases of the disease;
(c) is the proportion of people with the disease who are positive on the test;
(d) increases if the critical value is lowered;
(e) measures how well people without the disease are excluded.
84. A 95% reference interval, 95% reference range, or normal range:
(a) may be calculated as two standard deviations on either side of the mean;
(b) may be calculated directly from the frequency distribution;
(c) can only be calculated if the observations follow a Normal distribution;
(d) gets wider as the sample size increases;
(e) may be calculated from the mean and its standard error.
85. If the 95% reference interval for haematocrit in men is 43.2 to 49.2:
(a) any man with haematocrit outside these limits is abnormal;
(b) haematocrits outside these limits are proof of disease;
(c) a man with a haematocrit of 46 must be very healthy;
(d) a woman with a haematocrit of 48 has a haematocrit within normal limits;
(e) a man with a haematocrit of 42 may be ill.
86.* When a survival curve is calculated from censored survival times:
(a) the estimated proportion surviving becomes less reliable as survival time increases;
(b) individuals withdrawn during the first time interval are excluded from the analysis;
(c) survival estimates depend on the assumption that survival rates remain constant over the study period;
(d) it may be that the survival curve will not reach zero survival;
(e) the five year survival rate can be calculated even if some of the subjects were identified less than five years ago.
15E Exercise: A reference interval
In this exercise we shall estimate a reference interval. Mather et al. (1979) measured plasma magnesium in 140 apparently healthy people, to compare with a sample of diabetics. The normal sample was chosen from blood donors and people attending day centres for the elderly in the area of St. George's Hospital, to give 10 male and 10 female subjects in each age decade from 15–24 to 75 years and over. Questionnaires were used to exclude any subject with persistent diarrhoea, excessive alcohol intake or who were on regular drug therapy other than hypnotics and mild analgesics in the elderly. The distribution of plasma magnesium is shown in Figure 15.8. The mean was 0.810 mmol/litre and the standard deviation 0.057 mmol/litre.
Fig. 15.8. Distribution of plasma magnesium in 140 apparently healthy people
1. What do you think of the sampling method? Why use blood donors and elderly people attending day centres?
2. Why were some potential subjects excluded? Was this a good idea? Why were certain drugs allowed for the elderly?
3. Does plasma magnesium appear to follow a Normal distribution?
4. What is the reference interval for plasma magnesium, using the Normal distribution method?
5. Find confidence intervals for the reference limits.
6. Would it matter if mean plasma magnesium in normal people increased with age? What method might be used to improve the estimate of the reference interval in this case?
16
Mortality statistics and population structure
16.1 Mortality rates
Mortality statistics are one of our principal sources of information about the changing pattern of disease within a country and the differences in disease between countries. In most developed countries, any death must be certified by a doctor, who records the cause, date and place of death and some data about the deceased. In Britain, these include the date of birth, area of residence and last known occupation. These death certificates form the raw material from which mortality statistics are compiled by a national bureau of censuses, in Britain the Office for National Statistics. The numbers of deaths can be tabulated by cause, sex, age, types of occupation, area of residence, and marital status. Table 5.1 shows one such tabulation, of deaths by cause and sex.
For purposes of comparison we must relate the number of deaths to the number in the population in which they occur. We have this information fairly reliably at 10 year intervals from the decennial census of the country. We can estimate the size and age and sex structure of the population between censuses using registration of births and deaths. Each birth or death is notified to an official registrar, and so we can keep some track of changes in the population. There are other, less well documented changes taking place, such as immigration and emigration, which mean that population size estimates between the census years are only approximations. Some estimates, such as the numbers in different occupations, are so unreliable that mortality data is only tabulated by them for census years.
If we take the number of deaths over a given period of time and divide it by the number in the population and the time period, we get a mortality rate, the number of deaths per unit time per person. We usually take the number of deaths over one calendar year, although when the number of deaths is small we may take deaths over several years, to increase the precision of the numerator. The number in the population is changing continually, and we take as the denominator the estimated population at the mid-point of the time period. Mortality rates are often very small numbers, so we usually multiply them by a constant, such as 1000 or 100000, to avoid strings of zeros after the decimal point.
When we are dealing with deaths in the whole population, irrespective of age, the rate we obtain is called the crude mortality rate or crude death rate. The terms ‘death rate’ and ‘mortality rate’ are used interchangeably. We calculate the crude mortality rate for a population as:
$$\text{crude mortality rate} = \frac{\text{deaths occurring over a given period}}{\text{number in population at mid-point of period} \times \text{length of period}} \times 1000$$
Table 16.1. Age-specific mortality rates and age distribution in adult males, England and Wales, 1901 and 1981
Columns: age group (years); age-specific death rate per 1000 per year, 1901 and 1981; % of adult population in age group, 1901 and 1981
15–19 3.5 0.8 15.36 11.09
20–24 4.7 0.8 14.07 9.75
25–34 6.2 0.9 23.76 18.81
35–44 10.6 1.8 18.46 15.99
45–54 18.0 6.1 13.34 14.75
55–64 33.5 17.7 8.68 14.04
65–74 67.8 45.6 4.57 10.65
75–84 139.8 105.2 1.58 4.28
85+ 276.5 226.2 0.17 0.64
If the period is in years, this gives the crude mortality rate as deaths per 1000 population per year.
The crude mortality rate is so called because no allowance is made for the age distribution of the population, and comparisons between populations with different age structures can therefore be misleading. For example, in 1901 the crude mortality rate among adult males (aged over 15 years) in England and Wales was 15.7 per 1000 per year, and in 1981 it was 14.8 per 1000 per year. It seems strange that with all the improvements in medicine, housing and nutrition between these times there has been so little improvement in the crude mortality rate. To see why we must look at the age-specific mortality rates, the mortality rates within narrow age groups. Age-specific mortality rates are usually calculated for one, five or ten year age groups. In 1901 the age-specific mortality rate for men aged 15 to 19 was 3.5 deaths per 1000 per year, whereas in 1981 it was only 0.8. As Table 16.1 shows, the age-specific mortality rate in 1901 was greater than that in 1981 for every age group. However, in 1901 there was a much greater proportion of the population in the younger age groups, where mortality was low, than there was in 1981. Correspondingly, there was a smaller proportion of the 1901 population than the 1981 population in the higher mortality older age groups. Although mortality was lower at any given age in 1981, the greater proportion of older people meant that there were almost as many deaths as in 1901.
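The point can be checked numerically, because a crude rate is the average of the age-specific rates weighted by the population's own age distribution. The following Python sketch uses the figures of Table 16.1; the function name `crude_rate` is hypothetical and the code is an illustration only.

```python
# Age groups 15-19, 20-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+
# Age-specific death rates per 1000 per year, adult males (Table 16.1).
rates_1901 = [3.5, 4.7, 6.2, 10.6, 18.0, 33.5, 67.8, 139.8, 276.5]
rates_1981 = [0.8, 0.8, 0.9, 1.8, 6.1, 17.7, 45.6, 105.2, 226.2]

# Percentage of the adult male population in each age group (Table 16.1).
percent_1901 = [15.36, 14.07, 23.76, 18.46, 13.34, 8.68, 4.57, 1.58, 0.17]
percent_1981 = [11.09, 9.75, 18.81, 15.99, 14.75, 14.04, 10.65, 4.28, 0.64]

def crude_rate(rates, percents):
    """Weighted average of age-specific rates, weights = own age distribution."""
    return sum(rate * pct / 100 for rate, pct in zip(rates, percents))

crude_1901 = crude_rate(rates_1901, percent_1901)
crude_1981 = crude_rate(rates_1981, percent_1981)

# Every age-specific rate fell, yet the crude rates are almost equal.
every_rate_fell = all(r01 > r81 for r01, r81 in zip(rates_1901, rates_1981))

print(round(crude_1901, 1), round(crude_1981, 1), every_rate_fell)
```

The two crude rates come out close together although every age-specific rate is lower in 1981, which is the effect of the older 1981 age distribution.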
To eliminate the effects of different age structures in the populations which we want to compare, we can look at the age-specific death rates. But if we are comparing several populations, this is a rather cumbersome procedure, and it is often more convenient to calculate a single summary figure from the age-specific rates. There are many ways of doing this, of which three are frequently used: the direct and indirect methods of age standardization and the life table.
Table 16.2. Calculation of the age-standardized mortality rate by the direct method
Columns: age group (years); standard proportion in age group (a); observed mortality rate per 1000 (b); a × b
15–19 0.1536 0.8 0.1229
20–24 0.1407 0.8 0.1126
25–34 0.2376 0.9 0.2138
35–44 0.1846 1.8 0.3323
45–54 0.1334 6.1 0.8137
55–64 0.0868 17.7 1.5364
65–74 0.0457 45.6 2.0839
75–84 0.0158 105.2 1.6622
85+ 0.0017 226.2 0.3845
Sum 7.2623
16.2 Age standardization using the direct method
I shall describe the direct method first. We use a standard population structure, i.e. a standard age distribution or set of proportions of people in each age group. We then calculate the overall mortality rate which a population with the standard age structure would have if it experienced the age-specific mortality rates of the observed population, the population whose mortality rate is to be adjusted. We shall take the 1901 population as the standard and calculate the mortality rate the 1981 population would have experienced if the age distribution were the same as in 1901. We do this by multiplying each 1981 age-specific mortality rate by the proportion in that age group in the standard 1901 population, and adding. This then gives us an average mortality rate for the whole population, the age-standardized mortality rate. For example, the 1981 mortality rate in age group 15–19 was 0.8 per 1000 per year and the proportion in the standard population in this age group is 15.36% or 0.1536. The contribution of this age group is 0.8 × 0.1536 = 0.1229. The calculation is set out in Table 16.2.
If we used the population's own proportions in each age group in this calculation we would get the crude mortality rate. Since 1901 has been chosen as the standard population, its crude mortality rate of 15.7 is also the age-standardized mortality rate. The age-standardized mortality rate for 1981 was 7.3 per 1000 men per year. We can see that there was a much higher age-standardized mortality in 1901 than 1981, reflecting the difference in age-specific mortality rates.
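The calculation of Table 16.2 amounts to a single weighted sum, as this short Python sketch shows. The function name `direct_standardized_rate` is hypothetical; the data are the 1981 rates and the 1901 proportions used in the table.

```python
# Age groups 15-19, 20-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+
# Standard population: proportion of adult males in each age group, 1901.
standard_proportions = [0.1536, 0.1407, 0.2376, 0.1846, 0.1334,
                        0.0868, 0.0457, 0.0158, 0.0017]

# Observed population: age-specific death rates per 1000 per year, 1981.
observed_rates = [0.8, 0.8, 0.9, 1.8, 6.1, 17.7, 45.6, 105.2, 226.2]

def direct_standardized_rate(proportions, rates):
    """Rate the standard population would have at the observed age-specific rates."""
    return sum(a * b for a, b in zip(proportions, rates))

# Contribution of each age group (the a x b column of the table) and their sum.
contributions = [a * b for a, b in zip(standard_proportions, observed_rates)]
standardized_1981 = direct_standardized_rate(standard_proportions, observed_rates)

print(round(contributions[0], 4))    # youngest age group
print(round(standardized_1981, 1))   # per 1000 men per year
```

Only the weights differ from the crude rate calculation: swapping in the 1981 population's own proportions would return the crude 1981 rate instead.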
16.3 Age standardization by the indirect method
The direct method relies upon age-specific mortality rates for the observed population. If we have very few deaths, these age-specific rates will be very poorly estimated. This will be particularly so in the younger age groups, where we may even have no deaths at all. Such situations arise when considering mortality due to particular conditions or in relatively small groups, such as those defined by occupation. The indirect method of standardization is used for such data. We calculate the number of deaths we would expect in the observed population if it experienced the age-specific mortality rates of a standard population. We then compare the expected number of deaths with that actually observed.
Table 16.3. Age-specific mortality rates due to cirrhosis of the liver and age distributions of all men and medical practitioners, England and Wales, 1971
Columns: age group (years); mortality per million men per year; number of men; number of doctors
15–24 5.859 3584320 1080
25–34 13.050 3065100 12860
35–44 46.937 2876170 11510
45–54 161.503 2965880 10330
55–64 271.358 2756510 7790
I shall take as an example the deaths due to cirrhosis of the liver among male qualified medical practitioners in England and Wales, recorded around the 1971 census. There were 14 deaths among 43570 doctors aged below 65, a crude mortality rate of 14/43570 = 321 per million, compared to 1423 out of 15247980 adult males (aged 15–64), or 93 per million. The mortality among doctors appears high, but the medical population may be older than the population of men as a whole, as it will contain relatively few below the age of 25. Also the actual number of deaths among doctors is small and any difference not explained by the age effect may be due to chance. The indirect method enables us to test this. Table 16.3 shows the age-specific mortality rates for cirrhosis of the liver among all men aged 15 to 64, and the number of men estimated in each ten-year age group, for all men and for doctors. We can see that the two age distributions do appear to be different.
The calculation of the expected number of deaths is similar to the direct method, but different populations and rates are used. For each age group, we take the number in the observed population, and multiply it by the standard age-specific mortality rate, which would be the probability of dying if mortality in the observed population were the same as that in the standard population. This gives us the number we would expect to die in this age group in the observed population. We add these over the age groups and obtain the expected number of deaths. The calculation is set out in Table 16.4.
The expected number of deaths is 4.4965, which is considerably less than the 14 observed. We usually express the result of the calculation as the ratio of observed to expected deaths, called the standardized mortality ratio or SMR. Thus the SMR for cirrhosis among doctors is
$$\text{SMR} = \frac{\text{observed deaths}}{\text{expected deaths}} = \frac{14}{4.4965} = 3.11$$
We usually multiply the SMR by 100 to get rid of the decimal point, and report the SMR as 311. If we do not adjust for age at all, the ratio of the crude death rates is 3.44, compared to the age-adjusted figure of 3.11, so the adjustment has made some, but not much, difference in this case.
Table 16.4. Calculation of the expected number of deaths due to cirrhosis of the liver among medical practitioners, using the indirect method
Columns: age group (years); standard mortality rate (a); observed population, number of doctors (b); a × b
15–24 0.000005859 1080 0.0063
25–34 0.000013050 12860 0.1678
35–44 0.000046937 11510 0.5402
45–54 0.000161503 10330 1.6683
55–64 0.000271358 7790 2.1139
Total 4.4965
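The indirect method of Table 16.4 can be sketched in Python as follows. The variable names are hypothetical, the data are those of Table 16.3, and the expected number differs from the table in the fourth decimal place only because the table adds rounded products.

```python
# Age groups 15-24, 25-34, 35-44, 45-54, 55-64 (Table 16.3).
# Standard population: cirrhosis deaths per million men per year.
standard_rate_per_million = [5.859, 13.050, 46.937, 161.503, 271.358]
# Observed population: number of doctors in each age group.
doctors = [1080, 12860, 11510, 10330, 7790]

observed_deaths = 14

# Expected deaths if doctors had the age-specific rates of all men.
expected_deaths = sum(rate / 1_000_000 * n
                      for rate, n in zip(standard_rate_per_million, doctors))

# Standardized mortality ratio, reported on a scale where all men = 100.
smr = 100 * observed_deaths / expected_deaths

# For comparison: ratio of crude rates, with no adjustment for age.
crude_doctors = observed_deaths / sum(doctors)
crude_all_men = 1423 / 15247980
crude_ratio = crude_doctors / crude_all_men

print(round(expected_deaths, 4), round(smr), round(crude_ratio, 2))
```

Note that the age-specific death rates of the doctors themselves are never needed, which is why the method suits small groups with few deaths.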
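We can calculate a confidence interval for the SMR quite easily. Denote the observed deaths by O and expected by E. It is reasonable to suppose that the deaths are independent of one another and happening randomly in time, so the observed number of deaths is from a Poisson distribution (§6.7). The standard deviation of this Poisson distribution is the square root of its mean and so can be estimated by the square root of the observed deaths, √O. The expected number is calculated from a very much larger sample and is so well estimated it can be treated as a constant, so the standard deviation of 100 × O/E, which is the standard error of the SMR, is estimated by 100 × √O/E. Provided the number of deaths is large enough, say more than 10, an approximate 95% confidence interval is given by
$$100 \times \frac{O}{E} - 1.96 \times 100 \times \frac{\sqrt{O}}{E} \quad \text{to} \quad 100 \times \frac{O}{E} + 1.96 \times 100 \times \frac{\sqrt{O}}{E}$$
For the cirrhosis data the formula gives
$$100 \times \frac{14}{4.4965} - 1.96 \times 100 \times \frac{\sqrt{14}}{4.4965} \quad \text{to} \quad 100 \times \frac{14}{4.4965} + 1.96 \times 100 \times \frac{\sqrt{14}}{4.4965}$$
that is, 311 - 163 to 311 + 163, or 148 to 474.
The confidence interval clearly excludes 100 and the high mortality cannot be ascribed to chance.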
For small observed frequencies tables based on the exact probabilities of the Poisson distribution are available (Pearson and Hartley 1970). The calculations are easily done by computer and my free program Clinstat (§1.3) does them. There is also an exact method for comparing two SMRs, which Clinstat does. For the cirrhosis data the exact 95% confidence interval is 170 to 522. This is not quite the same as the large sample approximation. Better approximations and exact methods of calculating confidence intervals are described by Morris and Gardner (1989) and Breslow and Day (1987).
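Both intervals can be sketched in Python using only the standard library. This is not the Clinstat implementation. The exact limits are found here by bisection on the Poisson distribution function, an assumed but standard approach: the lower limit is the mean for which the probability of O or more deaths is 0.025, the upper limit the mean for which the probability of O or fewer is 0.025. All function names are hypothetical.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable with the given mean."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total

def bisect(g, low, high, iterations=100):
    """Root of g between low and high, where g changes sign."""
    g_low = g(low)
    for _ in range(iterations):
        mid = (low + high) / 2
        g_mid = g(mid)
        if (g_mid > 0) == (g_low > 0):
            low, g_low = mid, g_mid
        else:
            high = mid
    return (low + high) / 2

def smr_ci_approx(observed, expected, z=1.96):
    """Large-sample interval: SMR plus or minus z standard errors."""
    smr = 100 * observed / expected
    se = 100 * math.sqrt(observed) / expected
    return smr - z * se, smr + z * se

def smr_ci_exact(observed, expected, alpha=0.05):
    """Exact Poisson limits for the mean of O, divided by E and scaled by 100."""
    if observed == 0:
        mean_low = 0.0
    else:
        mean_low = bisect(
            lambda m: (1 - poisson_cdf(observed - 1, m)) - alpha / 2,
            0.0, float(observed))
    mean_high = bisect(
        lambda m: poisson_cdf(observed, m) - alpha / 2,
        float(observed), observed + 10 * math.sqrt(observed + 1) + 10)
    return 100 * mean_low / expected, 100 * mean_high / expected

approx_low, approx_high = smr_ci_approx(14, 4.4965)
exact_low, exact_high = smr_ci_exact(14, 4.4965)
print(round(approx_low), round(approx_high))
print(round(exact_low), round(exact_high))
```

The exact interval is shifted upwards relative to the approximation, reflecting the skewness of the Poisson distribution when the number of deaths is small.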
We can also test the null hypothesis that in the population the SMR = 100. If the null hypothesis is true, O is from a Poisson distribution with mean E and hence standard deviation √E, provided the sample is large enough, say E > 10. Then (O - E)/√E would be an observation from the Standard Normal distribution if the null hypothesis were true. The sample of doctors is too small for this test to be reliable, but if it were, we would have (O - E)/√E = (14 - 4.4965)/√4.4965 = 4.48, P = 0.0001. Again, there is an exact method. This gives P = 0.0005. As so often happens, large sample methods become too liberal and give P values which are too small when used with samples which are too small for the test to be valid.
The highly significant difference suggests that doctors are at increased risk of death from cirrhosis of the liver, compared to employed men as a whole. The news is not all bad for medical practitioners, however. Their SMR for cancer of the trachea, bronchus and lung is only 32. Doctors may drink, but they do not smoke!
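The test statistic and an exact P value can be sketched in Python as follows. The exact method used by the author's program is not described in the text, so the two-sided P value here is obtained by doubling the upper tail probability of the Poisson distribution with mean E, which is an assumption; it does agree with the exact figure quoted. Function names are hypothetical.

```python
import math

def poisson_upper_tail(k, mean):
    """P(X >= k) for a Poisson variable with the given mean."""
    term = math.exp(-mean)      # P(X = 0)
    below = 0.0                 # accumulates P(X <= k - 1)
    for i in range(k):
        below += term
        term *= mean / (i + 1)
    return 1 - below

observed, expected = 14, 4.4965

# Large-sample statistic, referred to the Standard Normal distribution.
z = (observed - expected) / math.sqrt(expected)

# Exact test: probability of 14 or more deaths when the mean is E,
# doubled for a two-sided test (assumed convention).
p_one_sided = poisson_upper_tail(observed, expected)
p_exact = min(1.0, 2 * p_one_sided)

print(round(z, 2), round(p_exact, 4))
```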
16.4 Demographic life tables
We have already discussed a use of the life table technique for the analysis of clinical survival data (§15.6). The life table was found by following the survival of a group of subjects from some starting point to death. In demography, which means the study of human populations, this longitudinal method of analysis is impractical, because we could only study people born more than 100 years ago. Demographic life tables are generated in a different way, using a cross-sectional approach. Rather than charting the progress of a group from birth to death, we start with the present age-specific mortality rates. We then calculate what would happen to a cohort of people from birth if these age-specific mortality rates applied unchanged throughout their lives. We denote the probability of dying between ages $x$ and $x+1$ years (the age-specific mortality rate at age $x$) by $q_x$. As in Table 15.8, the probability of surviving from age $x$ to $x+1$ is $p_x = 1 - q_x$. We now suppose that we have a cohort of size $l_0$ at age 0, i.e. at birth. $l_0$ is usually 100000 or 10000. The number who would still be alive after $x$ years is $l_x$. We can see that the number alive after $x+1$ years is $l_{x+1} = p_x \times l_x$, so given all the $p_x$ from $x = 0$ onwards we can calculate the $l_x$. The cumulative survival probability to age $x$ is then $P_x = l_x/l_0$.
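The recurrence for the number alive at each age is easy to follow in code. This Python sketch starts a cohort of 100000 and applies the probabilities of dying at ages 0 to 3 given in Table 16.5; the function name `survivors` is hypothetical.

```python
def survivors(q, cohort_size=100000):
    """Number alive at each age: l[0] = cohort size, l[x+1] = (1 - q[x]) * l[x]."""
    alive = [float(cohort_size)]
    for q_x in q:
        p_x = 1 - q_x                    # probability of surviving the year
        alive.append(p_x * alive[-1])
    return alive

# Probability of dying between ages x and x+1 for x = 0, 1, 2, 3 (Table 16.5).
q = [0.03266, 0.00241, 0.00141, 0.00102]

alive = survivors(q)
# Cumulative survival probability to each age, l_x / l_0.
cumulative = [l_x / alive[0] for l_x in alive]

print([round(l_x) for l_x in alive])
print(round(cumulative[-1], 4))
```

Rounded to whole numbers the cohort falls from 100000 to 96734, 96501, 96365 and 96267 at ages 1 to 4, with most of the loss occurring in the first year of life.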
Table 16.5 shows an extract from Life Table Number 11, 1950–52, for England and Wales. With the exception of 1941, a life table like this has been produced every 10 years since 1871, based on the decennial census year. The life table is based on the census year because only then do we have a good measure of the number of people at each age, the denominator in the calculation of $q_x$. A three year period is used to increase the number of deaths for a year of age and so improve the estimation of $q_x$. Separate tables are produced for males and females because the mortality of the two sexes is very different. Age-specific death rates are higher in males than females at every age. Between census years life tables are still produced but are only published in an abridged form, giving $l_x$ at five year intervals only after age five (Table 16.6).
Table 16.5. Extract from English Life Table Number 11, 1950–52, Males
Columns: age in years, x; expected number alive at age x, lx; probability an individual dies between ages x and x+1, qx; expected life at age x years, ex
0 100000 0.03266 66.42
1 96734 0.00241 67.66
2 96501 0.00141 66.82
3 96365 0.00102 65.91
4 96267 0.00084 64.98
. . . .
. . . .
. . . .
100 23 0.44045 1.67
101 13 0.45072 1.62
102 7 0.46011 1.58
103 4 0.46864 1.53
104 2 0.47636 1.50
The final column in Tables 16.5 and 16.6 is the expected life, expectation of life or life expectancy, $e_x$. This is the average life still to be lived by those reaching age $x$. We have already calculated this as the expected value of the probability distribution of year of death (§6E). We can do the calculation in a number of other ways. For example, if we add $l_{x+1}$, $l_{x+2}$, $l_{x+3}$, etc. we will get the total number of years to be lived, because the $l_{x+1}$ who survive to $x+1$ will have added $l_{x+1}$ years to the total, the $l_{x+2}$ of these who survive from $x+1$ to $x+2$ will add a further $l_{x+2}$ years, and so on. If we divide this sum by $l_x$ we get the average number of whole years to be lived. If we then remember that people do not die on their birthdays, but scattered throughout the year, we can add half to allow for the average of half a year lived in the year of death. We thus get
$$e_x = \frac{1}{2} + \frac{1}{l_x} \sum_{i=x+1} l_i$$
i.e. summing the $l_i$ from age $x+1$ to the end of the life table.
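As a rough illustration, the formula can be applied to the last rows of Table 16.5. The sketch below treats age 104 as the end of the life table, which is an assumption made only because the extract stops there; the published table is computed from unrounded figures and may continue to later ages, so the result is close to, but not the same as, the tabulated 1.67 years at age 100. The function name `expectation_of_life` is hypothetical.

```python
def expectation_of_life(alive, index):
    """Half a year plus the survivors at all later ages divided by those alive now."""
    return 0.5 + sum(alive[index + 1:]) / alive[index]

# Expected number alive at ages 100, 101, 102, 103 and 104 (Table 16.5).
ages = [100, 101, 102, 103, 104]
alive = [23, 13, 7, 4, 2]

# Expectation of life at age 100, truncating the table at age 104.
e_100 = expectation_of_life(alive, 0)
# At the last age in a table only the half year in the year of death remains.
e_last = expectation_of_life(alive, len(alive) - 1)

print(round(e_100, 2), e_last)
```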
If many people die in early life, with high age-specific death rates for children, this has a great effect on expectation of life at birth. Table 16.7 shows expectation of life at selected ages from four English Life Tables (Office for National Statistics 1997). In 1991, for example, expectation of life at birth for males was 74 years, compared to only 40 years in 1841, an improvement of 34 years. However, expectation of life at age 45 in 1991 was 31 years compared to 23 years in 1841, an improvement of only 8 years. At age 65, male expectation of life was 11 years in 1841 and 14 years in 1991, an even smaller change. Hence the change in life expectancy at birth was due to changes in mortality in early life, not late life.
Table 16.6. Abridged Life Table 1988–90, England and Wales
Columns: age x; males lx and ex; females lx and ex
0 10000 73.0 10000 78.5
1 9904 72.7 9928 78.0
2 9898 71.7 9922 77.1
3 9893 70.8 9919 76.1
4 9890 69.8 9916 75.1
5 9888 68.8 9914 74.2
10 9877 63.9 9907 69.2
15 9866 58.9 9899 64.3
20 9832 54.1 9885 59.4
25 9790 49.3 9870 54.4
30 9749 44.5 9852 49.5
35 9702 39.7 9826 44.6
40 9638 35.0 9784 39.8
45 9542 30.3 9718 35.1
50 9375 25.8 9607 30.5
55 9097 21.5 9431 26.0
60 8624 17.5 9135 21.7
65 7836 14.0 8645 17.8
70 6689 11.0 7918 14.2
75 5177 8.4 6869 11.0
80 3451 6.4 5446 8.2
85 1852 4.9 3659 5.9
There is a common misconception that a life expectancy at birth of 40 years, as in 1841, meant that most people died about age 40. For example (Rowe 1992):
Mothers have always provoked rage and resentment in their adult daughters, while the adult daughters have always provoked anguish and guilt in their mothers. In past centuries, however, such matched misery did not last for long. Daughters could bury their rage and resentment under a concern for duty while they cared for their mothers who, turning 40, rapidly aged, grew frail and died. Now mothers turning 40 are strong and healthy, and only halfway through their lives.
This is absurd. As Table 16.7 shows, since life expectancy was first estimated women turning 40 have had average remaining lives of more than 20 years. They did not rapidly age, grow frail, and die.
‘Expectation’ is used in its statistical sense of the average of a distribution. It does not mean that each person can know when they will die. From the most recent life table for England and Wales, for 1994–96 (Office for National Statistics 1998a), a man aged 53 (myself, for example) has a life expectancy of 24 years. This is the average lifetime which all men aged 53 years would have if the present age-specific mortality rates do not change. (These should go down over time, putting life-spans up.) About half of these men will have shorter lives and half longer. If we could calculate life expectancies for men with different combinations of risk factors, we might find that my life expectancy would be decreased because I am short (so unfair I think) and fat and increased because I do not smoke (like almost all medical statisticians) and am of professional social class. However my expectation of life was adjusted, it would remain an average, not a guaranteed figure for me.
Table 16.7. Life expectancy in 1841, 1901, 1951, and 1991, England and Wales
Columns: age; sex; expectation of life in years in 1841, 1901, 1951 and 1991
Birth Males 40 49 66 74
Females 42 52 72 79
15yrs Males 43 47 54 59
Females 44 50 59 65
45yrs Males 23 23 27 31
Females 24 26 31 36
65yrs Males 11 11 12 14
Females 12 12 14 18
Life tables have a number of uses, both medical and non-medical. Expectation of life provides a useful summary of mortality without the need for a standard population. The table enables us to predict the future size and age structure of a population given its present state, called a population projection. This can be very useful in predicting such things as the future requirement for geriatric beds in a health district. Life tables are also invaluable in non-medical applications, such as the calculation of insurance premiums, pensions and annuities.
The main difficulty with prediction from a life table is finding a table which applies to the populations under consideration. For the general population of, say, a health district, the national life table will usually be adequate, but for special populations this may not be the case. If we want to predict the future need for care of an institutionalized population, such as in a long stay psychiatric hospital or old people's home, the mortality may be considerably greater than that in the general population. Predictions based on the national life table can only be taken as a very rough guide. If possible life tables calculated on that type of population should be used.
16.5 Vital statistics
We have seen a number of occasions where ordinary words have been given quite different meanings in statistics from those they have in common speech; ‘Normal’ and ‘significant’ are good examples. ‘Vital statistics’ is the opposite, a technical term which has acquired a completely unrelated popular meaning. As far as the medical statistician is concerned, vital statistics have nothing to do with the dimensions of female bodies. They are the statistics relating to life and death: birth rates, fertility rates, marriage rates and death rates. I have already mentioned crude mortality rate, age-specific mortality rates, age-standardized mortality rate, standardized mortality ratio, and expectation of life. In this section I shall define a number of other statistics which are often quoted in the medical literature.
The infant mortality rate is the number of deaths under one year of age divided by the number of live births, usually expressed as deaths per 1000 live births. The neonatal mortality rate is the same thing for deaths in the first 4 weeks of life. The stillbirth rate is the number of stillbirths divided by the total number of births, live and still. A stillbirth is a child born dead after 28 weeks gestation. The perinatal mortality rate is the number of stillbirths and deaths in the first week of life divided by the total births, again usually presented per 1000 births. Infant and perinatal mortality rates are regarded as particularly sensitive indicators of the health status of the population. The maternal mortality rate is the number of deaths of mothers ascribed to problems of pregnancy and birth, divided by the total number of births. The birth rate is the number of live births per year divided by the total population. The fertility rate is the number of live births per year divided by the number of women of childbearing age, taken as 15–44 years.
The attack rate for a disease is the proportion of people exposed to infection who develop the disease. The case fatality rate is the proportion of people with the disease who die from it. The prevalence of a disease is the proportion of people who have it at one point in time. The incidence is the number of new cases in one year divided by the number at risk.
16.6 The population pyramid
The age distribution of a population can be presented as a histogram, using the methods of §4.3. However, because the mortality of males and females is so different the age distributions for males and females are also different. It is usual to present the age distributions for the two sexes separately. Figure 16.1 shows the age distributions for the male and female populations of England and Wales in 1901. Now, these histograms have the same horizontal scale. The conventional way to display them is with the age scale vertically and the frequency scale horizontally as in Figure 16.2. The frequency scale has zero in the middle and increases to the right for females and to the left for males. This is called a population pyramid, because of its shape.
Figure 16.3 shows the population pyramid for England and Wales in 1991. The shape is quite different. Instead of a triangle we have an irregular figure with almost vertical sides which begin to bend very sharply inwards at about age 65. The post-war and 1960s baby booms can be seen as bulges at ages 25–30 and 40–45. A major change in population structure has taken place, with a vast increase in the proportion of elderly. This has major implications for medicine, as the care of the elderly has become a large proportion of the work of doctors, nurses and their colleagues. It is interesting to see how this has come about.
It is popularly supposed that people are now living much longer as a result of modern medicine, which prevents deaths in middle life. This is only partly true.
Fig. 16.1. Age distributions for the population of England and Wales, by sex, 1901
Fig. 16.2. Population pyramid for England and Wales, 1901
Fig. 16.3. Population pyramid for England and Wales, 1991
As Table 16.7 shows, life expectancy at birth increased dramatically between 1901 and 1991, but the increase in later life is much less. The change is not an extension of every life by 25 years, which would be seen at every age, but mainly a reduction in mortality in childhood and early adulthood. Mortality in later life has changed relatively little. Now, a big reduction in mortality in childhood would result in an increase in the base part of the pyramid, as more children survived, unless there was a corresponding fall in the number of babies being born. In the 19th century, women were having many children and despite the high mortality in childhood the number who survived into adulthood to have children of their own exceeded that of their own parents. The population expanded and this history is embodied in the 1901 population pyramid. In the 20th century, infant mortality fell and people responded to this by having fewer children. In 1841–45 the infant mortality rate was 148 per 1000 live births; it was 138 in 1901–05, 10 in 1981–85 (OPCS 1992) and only 5.9 in 1997 (Office for National Statistics 1999). The birth rate was 32.2 per 1000 population per year in 1841–45, in 1901–05 it was 28.2, and in 1987–97 it was 13.5 (Office for National Statistics 1998b). The base of the pyramid ceased to expand. As those who were in the base of the 1901 pyramid grew older, the population in the top half of the pyramid increased. The survivors of the 0–4 age group in the 1901 pyramid are the 90+ age group in the 1991 pyramid. Had the birth rate not fallen, the population would have continued to expand and we would have as great or greater a proportion of young people in 1991 as we did in 1901, and a vastly larger population. Thus the increase in the proportion of the elderly is not primarily because adult lives have been extended, although this has a small effect, but because fertility has declined. Life expectancy for the elderly has changed relatively little. Most developed countries have stable population pyramids like Figure 16.3 and those of most developing countries have expanding pyramids like Figure 16.2.
16M Multiple choice questions 87 to 92
(Each branch is either true or false)
87. Age-specific mortality rate:
(a) is a ratio of observed to expected deaths;
(b) can be used to compare mortality between different age groups;
(c) is an age adjusted mortality rate;
(d) measures the number of deaths in a year;
(e) measures the age structure of the population.
88. Expectation of life:
(a) is the number of years most people live;
(b) is a way of summarizing age-specific death rates;
(c) is the expected value of a particular probability distribution;
(d) varies with age;
(e) is derived from life tables.
89. In a study of post-natal suicide (Appleby 1991), the SMR for suicide among women who had just had a baby was 17 with a 95% confidence interval 14 to 21 (all women = 100). For women who had had a stillbirth, the SMR was 105 (95% confidence interval 31 to 277). We can conclude that:
(a) women who had just had a baby were less likely to commit suicide than other women of the same age;
(b) women who had just had a stillbirth were less likely to commit suicide than other women of the same age;
(c) women who had just had a live baby were less likely to commit suicide than women of the same age who had had a stillbirth;
(d) it is possible that having a stillbirth increases the risk of suicide;
(e) suicidal women should have babies.
90. In 1971, the SMR for cirrhosis of the liver for men was 773 for publicans and innkeepers and 25 for window cleaners, both being significantly different from 100 (Donnan and Haskey 1977). We can conclude that:
(a) publicans are more than 7 times as likely as the average person to die from cirrhosis of the liver;
(b) the high SMR for publicans may be because they tend to be found in the older age groups;
(c) being a publican causes cirrhosis of the liver;
(d) window cleaning protects men from cirrhosis of the liver;
(e) window cleaners are at high risk of cirrhosis of the liver.
91. The age and sex structure of a population may be described by:
(a) a life table;
(b) a correlation coefficient;
(c) a standardized mortality ratio;
(d) a population pyramid;
(e) a bar chart.
92. The following statistics are adjusted to allow for the age distribution of the population:
(a) age-standardized mortality rate;
(b) fertility rate;
(c) perinatal mortality rate;
(d) crude mortality rate;
(e) expectation of life at birth.
16EExercise:DeathsfromvolatilesubstanceabuseAndersonetal.(1985)studiedmortalityassociatedwithvolatilesubstanceabuse(VSA),oftencalledgluesniffing.InthisstudyallknowndeathsassociatedwithVSAfrom1971to1983inclusivewerecollected,usingsourcesincludingthreepresscuttingsagenciesandasix-monthlysystematicsurveyofallcoroners.CaseswerealsonotifiedbytheOfficeofPopulationCensusesandSurveysforEnglandandWalesandbytheCrownOfficeandprocuratorsfiscalinScotland.
Table16.8showstheagedistributionofthesedeathsforGreatBritainandforScotlandalone,withthecorrespondingagedistributionsatthe1981decennialcensus.
1.Calculateage-specificmortalityratesforVSAperyearandforthewholeperiod.Whatisunusualabouttheseage-specificmortalityrates?
2. Calculate the SMR for VSA deaths for Scotland.
3. Calculate the 95% confidence interval for this SMR.
4. Does the number of deaths in Scotland appear particularly high? Apart from a lot of glue sniffing, are there any other factors which should be considered as possible explanations for this finding?
Table 16.8. Volatile substance abuse mortality and population size, Great Britain and Scotland, 1971–83 (Anderson et al. 1985)
Age group (years)  Great Britain: VSA deaths  Great Britain: population (thousands)  Scotland: VSA deaths  Scotland: population (thousands)
0–9 0 6770 0 653
10–14 44 4271 13 425
15–19 150 4467 29 447
20–24 45 3959 9 394
25–29 15 3616 0 342
30–39 8 7408 0 659
40–49 2 6055 0 574
50–59 7 6242 0 579
60+ 4 10769 0 962
17
Multifactorial methods
17.1* Multiple regression
In Chapters 10 and 11 we looked at methods of analysing the relationship between a continuous outcome variable and a predictor. The predictor could be quantitative, as in regression, or qualitative, as in one-way analysis of variance. In this chapter we shall look at the extension of these methods to more than one predictor variable, and describe related methods for use when the outcome is dichotomous or censored survival data. These methods are very difficult to do by hand and computer programs are always used. I shall omit the formulae.
Table 17.1 shows the ages, heights and maximum voluntary contraction of the quadriceps muscle (MVC) in a group of male alcoholics. The outcome variable is MVC. Figure 17.1 shows the relationship between MVC and height. We can fit a regression line of the form MVC = a + b × height (§11.2–3). This enables us to predict what the mean MVC would be for men of any given height. But MVC varies with other things besides height. Figure 17.2 shows the relationship between MVC and age.
Table 17.1. Maximum voluntary contraction (MVC) of quadriceps muscle, age and height, of 41 male alcoholics (Hickish et al. 1989)
Age (years)  Height (cm)  MVC (newtons)  Age (years)  Height (cm)  MVC (newtons)
24 166 466 42 178 417
27 175 304 47 171 294
28 173 343 47 162 270
28 175 404 48 177 368
31 172 147 49 177 441
31 172 294 49 178 392
32 160 392 50 167 294
32 172 147 51 176 368
32 179 270 53 159 216
32 177 412 53 173 294
34 175 402 53 175 392
34 180 368 53 172 466
35 167 491 55 170 304
37 175 196 55 178 324
38 172 343 55 155 196
39 172 319 58 160 98
39 161 387 61 162 216
39 173 441 62 159 196
40 173 441 65 168 137
41 168 343 65 168 74
41 178 540
Fig. 17.1. Muscle strength (MVC) against height
Fig. 17.2. Muscle strength (MVC) against age
We can show the strengths of the linear relationships between all three variables by their correlation matrix. This is a tabular display of the correlation coefficients between each pair of variables, matrix being used in its mathematical sense as a rectangular array of numbers. The correlation matrix for the data of Table 17.1 is shown in Table 17.2. The coefficients on the main diagonal are all 1.0, because they show the correlation of the variable with itself, and the correlation matrix is symmetrical about this diagonal. Because of this symmetry many computer programs print only the part of the matrix below the diagonal. Inspection of Table 17.2 shows that older men were shorter and weaker than younger men, that taller men were stronger than shorter men, and that the magnitudes of all three relationships were similar. Reference to Table 11.2 with 41 - 2 = 39 degrees of freedom shows that all three correlations are significant.
Table 17.2. Correlation matrix for the data of Table 17.1
Age Height MVC
Age 1.000 -0.338 -0.417
Height -0.338 1.000 0.419
MVC -0.417 0.419 1.000
We could fit a regression line of the form MVC = a + b × age, from which we could predict the mean MVC for any given age. However, MVC would still vary with height. To investigate the effect of both age and height, we can use multiple regression to fit a regression equation of the form
MVC = b0 + b1 × height + b2 × age
The coefficients are calculated by a least squares procedure, exactly the same in principle as for simple regression. In practice, this is always done using a computer program. For the data of Table 17.1, the multiple regression equation is
MVC = -466 + 5.40 × height - 3.08 × age
From this, we would estimate the mean MVC of men with any given age and height, in the population of which these are a sample.
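As a rough illustration of the least squares calculation, the following Python sketch fits MVC on height and age by solving the normal equations. The data are transcribed from Table 17.1; the function names solve and fit_least_squares are illustrative choices, not part of any regression program. If the transcription is accurate, the printed coefficients should be close to those of the equation above.

```python
# Data transcribed from Table 17.1 as (age in years, height in cm, MVC in newtons).
DATA = [
    (24, 166, 466), (27, 175, 304), (28, 173, 343), (28, 175, 404),
    (31, 172, 147), (31, 172, 294), (32, 160, 392), (32, 172, 147),
    (32, 179, 270), (32, 177, 412), (34, 175, 402), (34, 180, 368),
    (35, 167, 491), (37, 175, 196), (38, 172, 343), (39, 172, 319),
    (39, 161, 387), (39, 173, 441), (40, 173, 441), (41, 168, 343),
    (41, 178, 540), (42, 178, 417), (47, 171, 294), (47, 162, 270),
    (48, 177, 368), (49, 177, 441), (49, 178, 392), (50, 167, 294),
    (51, 176, 368), (53, 159, 216), (53, 173, 294), (53, 175, 392),
    (53, 172, 466), (55, 170, 304), (55, 178, 324), (55, 155, 196),
    (58, 160, 98), (61, 162, 216), (62, 159, 196), (65, 168, 137),
    (65, 168, 74),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(rows):
    """Fit y = b0 + b1*x1 + ... from rows (x1, ..., xq, y) via (X'X) b = X'y."""
    xs = [[1.0] + [float(v) for v in r[:-1]] for r in rows]
    ys = [float(r[-1]) for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


# Put the predictors in the order used in the text: height first, then age.
rows = [(height, age, mvc) for age, height, mvc in DATA]
b0, b_height, b_age = fit_least_squares(rows)
print(f"MVC = {b0:.0f} + {b_height:.2f} x height + {b_age:.2f} x age")
```

Solving the normal equations directly is adequate for a small example like this one; regression programs generally use numerically more stable methods.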
There are a number of assumptions implicit here. One is that the relationship between MVC and height is the same at each age, that is, that there is no interaction between height and age. Another is that the relationship between MVC and height is linear, that is of the form MVC = a + b × height. Multiple regression analysis enables us to test both of these assumptions.
Multiple regression is not limited to two predictor variables. We can have any number, although the more variables we have the more difficult it becomes to interpret the regression. We must, however, have more points than variables. The degrees of freedom for the residual variance are n - 1 - q if q variables are fitted, and this number should be large enough for satisfactory estimation of confidence intervals and tests of significance. This will become clear after the next section.
17.2* Significance tests and estimation in multiple regression
As we saw in §11.5, the significance of a simple linear regression line can be tested using the t distribution. We can carry out the same test using analysis of variance. For the FEV1 and height data of Table 11.1 the sums of squares and products were calculated in §11.3. The total sum of squares for FEV1 is Syy = 9.43868, with n - 1 = 19 degrees of freedom. The sum of squares due to regression was calculated in §11.5 to be 3.18937. The residual sum of squares, i.e. the sum of squares about the regression line, is found by subtraction as 9.43868 - 3.18937 = 6.24931, and this has n - 2 = 18 degrees of freedom. We can now set up an analysis of variance table as described in §10.9, shown in Table 17.3.
Table 17.3. Analysis of variance for the regression of FEV1 on height
Source of variation          Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                        19                  9.43868
Due to regression            1                   3.18937         3.18937      9.19
Residual (about regression)  18                  6.24931         0.34718
Table 17.4. Analysis of variance for the regression of MVC on height and age
Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                40                  503344
Regression           2                   131495          65748        6.72
Residual             38                  371849          9785
Note that the square root of the variance ratio is 3.03, the value of t found in §11.5. The two tests are equivalent. Note also that the regression sum of squares divided by the total sum of squares = 3.18937/9.43868 = 0.3379 is the square of the correlation coefficient, r = 0.58 (§11.5, §11.10). This ratio, sum of squares due to regression over total sum of squares, is the proportion of the variability accounted for by the regression. The percentage variability accounted for or explained by the regression is 100 times this, i.e. 34%.
Returning to the MVC data, we can test the significance of the regression of MVC on height and age together by analysis of variance. If we fit the regression model in §17.1, the regression sum of squares has two degrees of freedom, because we have fitted two regression coefficients. The analysis of variance for the MVC regression is shown in Table 17.4.
The regression is significant; it is unlikely that this association could have arisen by chance if the null hypothesis were true. The proportion of variability accounted for, denoted by R², is 131495/503344 = 0.26. The square root of this is called the multiple correlation coefficient, R. R² must lie between 0 and 1, and as no meaning can be given to the direction of correlation in the multivariate case, R is also taken as positive. The larger R is, the more closely correlated with the outcome variable the set of predictor variables are. When R = 1 the variables are perfectly correlated in the sense that the outcome variable is a linear combination of the others. When the outcome variable is not linearly related to any of the predictor variables, R will be small, but not zero.
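The arithmetic behind Tables 17.3 and 17.4 can be sketched in a few lines of Python. The function name regression_anova is hypothetical; the sums of squares and degrees of freedom are those quoted in the text. P values are not computed here, as that would need the F distribution.

```python
import math


def regression_anova(total_ss, total_df, regression_ss, regression_df):
    """Return residual terms, the variance ratio F and R squared for a regression."""
    residual_ss = total_ss - regression_ss
    residual_df = total_df - regression_df
    regression_ms = regression_ss / regression_df
    residual_ms = residual_ss / residual_df
    return {
        "residual_ss": residual_ss,
        "residual_df": residual_df,
        "residual_ms": residual_ms,
        "F": regression_ms / residual_ms,
        "R2": regression_ss / total_ss,
    }


# FEV1 on height (Table 17.3) and MVC on height and age (Table 17.4).
fev1 = regression_anova(9.43868, 19, 3.18937, 1)
mvc = regression_anova(503344, 40, 131495, 2)

for name, table in (("FEV1 on height", fev1), ("MVC on height and age", mvc)):
    print(f"{name}: F = {table['F']:.2f}, R squared = {table['R2']:.2f}")

# With a single predictor, the square root of F is the t statistic of section 11.5.
print(f"square root of F for FEV1 = {math.sqrt(fev1['F']):.2f}")
```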
We may wish to know whether both or only one of our variables leads to the association. To do this, we can calculate a standard error for each regression coefficient (Table 17.5). This will be done automatically by the regression program. We can use this to test each coefficient separately by a t test. We can also find a confidence interval for each, using t standard errors on either side of the estimate, where t is the appropriate point of the t distribution with the residual degrees of freedom. For the example, both age and height have P = 0.04 and we can conclude that both age and height are independently associated with MVC.
Table 17.5. Coefficients for the regression of MVC on height and age, with standard errors and confidence intervals
Predictor variable  Coefficient  Standard error  t ratio  P     95% Confidence interval
height              5.40         2.55            2.12     0.04  0.25 to 10.55
age                 -3.08        1.47            -2.10    0.04  -6.05 to -0.10
intercept           -465.63      460.33          -1.01    0.3   -1397.52 to 466.27
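A minimal sketch of how the t ratios and confidence intervals of Table 17.5 follow from each coefficient and its standard error. The value 2.024 is assumed to be the two-sided 5% point of the t distribution with 38 degrees of freedom, taken from tables rather than computed, and the function name is illustrative. Small differences from the printed intervals arise because the coefficients used here are already rounded.

```python
# Assumed two-sided 5% point of the t distribution with 41 - 1 - 2 = 38 d.f.
T_CRITICAL_38_DF = 2.024


def t_ratio_and_interval(coefficient, standard_error, t_critical=T_CRITICAL_38_DF):
    """Return the t ratio and the 95% confidence limits for one coefficient."""
    t_ratio = coefficient / standard_error
    half_width = t_critical * standard_error
    return t_ratio, coefficient - half_width, coefficient + half_width


# Coefficients and standard errors as printed in Table 17.5.
for name, coef, se in (("height", 5.40, 2.55), ("age", -3.08, 1.47)):
    t_ratio, lower, upper = t_ratio_and_interval(coef, se)
    print(f"{name}: t = {t_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```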
A difficulty arises when the predictor variables are correlated with one another. This increases the standard error of the estimates, and variables may have a multiple regression coefficient which is not significant despite being related to the outcome variable. We can see that this will be so most clearly by taking an extreme case. Suppose we try to fit
MVC = b0 + b1 × height + b2 × height
For the MVC data
MVC = -908 + 6.20 × height + 1.00 × height
is a regression equation which minimizes the residual sum of squares. However, it is not unique, because
MVC = -908 + 5.20 × height + 2.00 × height
will do so too. The two equations give the same predicted MVC. There is no unique solution, and so no regression equation can be fitted, even though there is a clear relationship between MVC and height. When the predictor variables are highly correlated the individual coefficients will be poorly estimated and have large standard errors. Correlated predictor variables may obscure the relationship of each with the outcome variable.
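The extreme case above can be checked numerically. This sketch, with illustrative function names and an arbitrary handful of heights, shows that the two equations give the same predictions and that the matrix X'X used in least squares has determinant zero when the same predictor is entered twice, which is why no unique solution exists.

```python
import math


def predict_first(height):
    """First equation from the text: coefficients 6.20 and 1.00 on height."""
    return -908 + 6.20 * height + 1.00 * height


def predict_second(height):
    """Second equation from the text: coefficients 5.20 and 2.00 on height."""
    return -908 + 5.20 * height + 2.00 * height


def determinant_3x3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


heights = [155, 160, 170, 180]  # arbitrary illustrative heights in cm
same = all(math.isclose(predict_first(h), predict_second(h)) for h in heights)

# Design matrix with a constant column and the height column entered twice.
rows = [(1, h, h) for h in heights]
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
det = determinant_3x3(xtx)

print("same predictions:", same)
print("determinant of X'X:", det)
```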
A different (and equivalent) way of testing the effects of two correlated predictor variables separately is to proceed as follows. We fit three models:
1. MVC on height and age, regression sum of squares = 131495, d.f. = 2
2. MVC on height, regression sum of squares = 88511, d.f. = 1
3. MVC on age, regression sum of squares = 87471, d.f. = 1
Note that 88511 + 87471 = 175982 is greater than 131495. This is because age and height are correlated. We then test the effect of height if age is taken into account, referred to as the effect of height given age. The regression sum of squares for height given age is the regression sum of squares (age and height) minus regression sum of squares (age only), which is 131495 - 87471 = 44024. This has degrees of freedom = 2 - 1 = 1. Similarly, the effect of age allowing for height, i.e. age given height, is tested by regression sum of squares (age and height) minus regression sum of squares (height only) = 131495 - 88511 = 42984, with degrees of freedom = 2 - 1 = 1. We can set all this out in an analysis of variance table (Table 17.6). The third to sixth rows of the table are indented for the source of variation, degrees of freedom and sum of squares columns, to indicate that they are different ways of looking at variation already accounted for in the second row. The indented rows are not included when the degrees of freedom and sums of squares are added to give the total. After adjustment for age there is still evidence of a relationship between MVC and height, and after adjustment for height there is still evidence of a relationship between MVC and age. Note that the P values are the same as those found by a t test for the regression coefficient. This approach is essential for qualitative predictor variables with more than two categories (§17.6), when several t statistics may be printed for the variable.
Table 17.6. Analysis of variance for the regression of MVC on height and age, showing adjusted sums of squares
Source of variation   Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                 40                  503344
Regression            2                   131495          65748        6.72
  Age alone             1                   87471         87471        8.94
  Height given age      1                   44024         44024        4.50
  Height alone          1                   88511         88511        9.05
  Age given height      1                   42984         42984        4.39
Residual              38                  371849          9785
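The adjusted rows of Table 17.6 can be reproduced from the three regression sums of squares. The sketch below is only an illustration of the subtraction described in the text; the function name extra_sum_of_squares is hypothetical, and P values are omitted because they need the F distribution.

```python
import math

RESIDUAL_SS = 371849   # residual sum of squares for the model with both predictors
RESIDUAL_DF = 38


def extra_sum_of_squares(full_ss, reduced_ss, df_added=1):
    """Sum of squares and F for the terms added to a reduced model."""
    extra_ss = full_ss - reduced_ss
    residual_ms = RESIDUAL_SS / RESIDUAL_DF
    f_ratio = (extra_ss / df_added) / residual_ms
    return extra_ss, f_ratio


height_given_age = extra_sum_of_squares(131495, 87471)
age_given_height = extra_sum_of_squares(131495, 88511)

for label, (ss, f_ratio) in (("height given age", height_given_age),
                             ("age given height", age_given_height)):
    # With one degree of freedom, the square root of F matches the size of the
    # t ratio for that coefficient in Table 17.5.
    print(f"{label}: SS = {ss}, F = {f_ratio:.2f}, sqrt(F) = {math.sqrt(f_ratio):.2f}")
```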
17.3* Interaction in multiple regression
An interaction between two predictor variables arises when the effect of one on the outcome depends on the value of the other. For example, tall men may be stronger than short men when they are young, but the difference may disappear as they age.
We can test for interaction as follows. We have fitted
MVC = b0 + b1 × height + b2 × age
An interaction may take two simple forms. As height increases, the effect of age may increase so that the difference in MVC between young and old tall men is greater than the difference between young and old short men. Alternatively, as height increases, the effect of age may decrease. More complex interactions are beyond the scope of this discussion. Now, if we fit
MVC = b0 + b1 × height + b2 × age + b3 × height × age
for fixed height the effect of age is b2 + b3 × height. If there is no interaction, the effect of age is the same at all heights, and b3 will be zero. Of course, b3 will not be exactly zero, but only within the limits of random variation. We can fit such a model just as we fitted the first one. We get
Table 17.7. Analysis of variance for the interaction of height and age
Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                40                  503344
Regression           3                   202719          67573        8.32                0.0002
  Height and age       2                   131495        65748        8.09                0.001
  Height × age         1                   71224         71224        8.77                0.005
Residual             37                  300625          8125
MVC = 4661 - 24.7 × height - 112.8 × age + 0.650 × height × age
The regression is still significant, as we would expect. However, the coefficients of height and age have changed; they have even changed sign. The regression equation can be written
MVC = 4661 + (-24.7 + 0.650 × age) × height - 112.8 × age
The coefficient of height depends on age, the difference in strength between short and tall subjects being greater for older subjects than for younger.
The analysis of variance for this regression equation is shown in Table 17.7. The regression sum of squares is divided into two parts: that due to age and height, and that due to the interaction term after the main effects of age and height have been accounted for. The interaction row is the difference between the regression row in Table 17.7, which has 3 degrees of freedom, and the regression row in Table 17.4, which has 2. From this we see that the interaction is highly significant. The effects of height and age on MVC are not additive. Another example of the investigation of a possible interaction is given in §17.7.
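To see what the interaction term does, the fitted equation can be evaluated directly. In this sketch the function names are illustrative and the ages chosen are arbitrary; it shows how the slope of MVC on height changes with age, and recomputes the interaction row of Table 17.7 by subtraction.

```python
def predicted_mvc(height, age):
    """Fitted interaction model from the text (height in cm, age in years)."""
    return 4661 - 24.7 * height - 112.8 * age + 0.650 * height * age


def height_slope(age):
    """Change in predicted MVC per cm of height at a given age."""
    return -24.7 + 0.650 * age


def age_slope(height):
    """Change in predicted MVC per year of age at a given height."""
    return -112.8 + 0.650 * height


for age in (30, 45, 60):  # arbitrary illustrative ages
    print(f"age {age}: slope on height = {height_slope(age):.1f} newtons per cm")

# Interaction sum of squares: regression row of Table 17.7 minus that of Table 17.4.
interaction_ss = 202719 - 131495
interaction_f = interaction_ss / (300625 / 37)
print(f"interaction SS = {interaction_ss}, F = {interaction_f:.2f}")
```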
17.4* Polynomial regression
So far, we have assumed that all the regression relationships have been linear, i.e. that we are dealing with straight lines. This is not necessarily so. We may have data where the underlying relationship is a curve rather than a straight line. Unless there is a theoretical reason for supposing that a particular form of the equation, such as logarithmic or exponential, is needed, we test for non-linearity by using a polynomial. Clearly, if we can fit a relationship of the form
MVC = b0 + b1 × height + b2 × age
we can also fit one of the form
MVC = b0 + b1 × height + b2 × height²
Table 17.8. Analysis of variance for polynomial regression of MVC on height
Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                40                  503344
Regression           2                   89103           44552        4.09                0.02
  Linear               1                   88522         88522        7.03                0.01
  Quadratic            1                   581           581          0.05                0.8
Residual             38                  414241          12584
to give a quadratic equation, and continue adding powers of height to give equations which are cubic, quartic, etc.
Height and height squared are highly correlated, which can lead to problems in estimation. To reduce the correlation, we can subtract a number close to mean height from height before squaring. For the data of Table 17.1, the correlation between height and height squared is 0.9998. Mean height is 170.7 cm, so 170 is a convenient number to subtract. The correlation between height and height minus 170 squared is -0.44, so the correlation has been reduced, though not eliminated. The regression equation is
MVC = -961 + 7.49 × height + 0.092 × (height - 170)²
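The effect of subtracting 170 before squaring can be checked with a short calculation. This sketch uses the 41 heights transcribed from Table 17.1 and a plain Pearson correlation; the function name pearson_r is an illustrative choice.

```python
import math

# Heights in cm transcribed from Table 17.1 (left-hand block, then right-hand block).
HEIGHTS = [
    166, 175, 173, 175, 172, 172, 160, 172, 179, 177, 175, 180, 167, 175,
    172, 172, 161, 173, 173, 168, 178,
    178, 171, 162, 177, 177, 178, 167, 176, 159, 173, 175, 172, 170, 178,
    155, 160, 162, 159, 168, 168,
]


def pearson_r(x, y):
    """Product moment correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


mean_height = sum(HEIGHTS) / len(HEIGHTS)
r_raw = pearson_r(HEIGHTS, [h ** 2 for h in HEIGHTS])
r_centred = pearson_r(HEIGHTS, [(h - 170) ** 2 for h in HEIGHTS])

print(f"mean height = {mean_height:.1f} cm")
print(f"correlation of height with height squared = {r_raw:.4f}")
print(f"correlation of height with (height - 170) squared = {r_centred:.2f}")
```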
To test for non-linearity, we proceed as in §17.2. We fit two regression equations, a linear and a quadratic. The non-linearity is then tested by the difference between the sum of squares due to the quadratic equation and the sum of squares due to the linear. The analysis of variance is shown in Table 17.8. In this case the quadratic term is not significant, so there is no evidence of non-linearity. Were the quadratic term significant, we could fit a cubic equation and test the effect of the cubic term in the same way. Polynomial regression of one variable can be combined with ordinary linear regression of others to give regression equations of the form
MVC = b0 + b1 × height + b2 × height² + b3 × age
and so on. Royston and Altman (1994) have shown that quite complex curves can be fitted with a small number of coefficients if we use log(x) and powers -1, -0.5, 0.5, 1 and 2 in the regression equation.
17.5* Assumptions of multiple regression
For the regression estimates to be optimal and the F tests valid, the residuals (the difference between observed values of the dependent variable and those predicted by the regression equation) should follow a Normal distribution and have the same variance throughout the range. We also assume that the relationships which we are modelling are linear. These assumptions are the same as for simple linear regression (§11.8) and can be checked graphically in the same way, using histograms, Normal plots and scatter diagrams. If the assumptions of Normal distribution and uniform variance are not met, we can use a transformation as described in §10.4 and §11.8. Non-linearity can be dealt with using polynomial regression.
Fig. 17.3. Histogram and Normal plot of residuals of MVC about height and age
Fig. 17.4. Residuals against observed MVC, to check uniformity of variance, and age, to check linearity
The regression equation of strength on height and age is MVC = -466 + 5.40 × height - 3.08 × age and the residuals are given by
residual = MVC - (-466 + 5.40 × height - 3.08 × age)
Figure 17.3 shows a histogram and a Normal plot of the residuals for the MVC data. The distribution looks quite good. Figure 17.4 shows a plot of residuals against MVC. The variability looks uniform. We can also check the linearity by plotting residuals against the predictor variables. Figure 17.4 also shows the residual against age. There is an indication that residual may be related to age. The possibility of a nonlinear relationship can be checked by polynomial regression, which, in this case, does not produce a quadratic term which approaches significance.
17.6* Qualitative predictor variables
In §17.1 the predictor variables, height and age, were quantitative. In the study from which these data come, we also recorded whether or not subjects had cirrhosis of the liver. Cirrhosis was recorded as ‘present’ or ‘absent’, so the variable was dichotomous. It is easy to include such variables as predictors in multiple regression. We create a variable which is 0 if the characteristic is absent, 1 if present, and use this in the regression equation just as we did height. The regression coefficient of this dichotomous variable is the difference in the mean of the outcome variable between subjects with the characteristic and subjects without. If the coefficient in this example were negative, it would mean that subjects with cirrhosis were not as strong as subjects without cirrhosis. In the same way, we can use sex as a predictor variable by creating a variable which is 0 for females and 1 for males. The coefficient then represents the difference in mean between male and female. If we use only one, dichotomous predictor variable in the equation, the regression is exactly equivalent to a two-sample t test between the groups defined by the variable (§10.3).
A predictor variable with more than two categories or classes is called a class variable or a factor. We cannot simply use a class variable in the regression equation, unless we can assume that the classes are ordered in the same way as their codes, and that adjoining classes are in some sense the same distance apart. For some variables, such as the diagnosis data of Table 4.1 and the housing data of Table 13.1, this is absurd. For others, such as the AIDS categories of Table 10.7, it is a very strong assumption. What we do instead is to create a set of dichotomous variables to represent the factor. For the AIDS data of Table 10.7, we can create three variables:
hiv1 = 1 if subject has AIDS, 0 otherwise
hiv2 = 1 if subject has ARC, 0 otherwise
hiv3 = 1 if subject is HIV positive but has no symptoms, 0 otherwise
If the subject is HIV negative, all three variables are zero. hiv1, hiv2, and hiv3 are called dummy variables. Some computer programs will calculate the dummy variables automatically if the variable is declared to be a factor; for others the user must define them. We put the three dummy variables into the regression equation. This gives the equation:
mannitol = 11.4 - 0.066 × hiv1 - 2.56 × hiv2 - 1.69 × hiv3
Each coefficient is the difference in mean mannitol absorption between the class represented by that variable and the class represented by all dummy variables being zero, HIV negative, called the reference class. The analysis of variance for this regression equation is shown in Table 17.9, and the F test shows that there is no significant relationship between mannitol absorption and HIV status. The regression program prints out standard errors and t tests for each dummy variable, but these t tests should be ignored, because we cannot interpret one dummy variable in isolation from the others.
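The coding of dummy variables can be sketched as follows. The category labels and function names are illustrative assumptions; the coefficients are those of the mannitol equation above, so each prediction is simply the estimated mean for that HIV class.

```python
REFERENCE_CLASS = "HIV negative"

# Which dummy variable is set to 1 for each non-reference class (labels assumed).
DUMMY_FOR_CLASS = {
    "AIDS": "hiv1",
    "ARC": "hiv2",
    "HIV positive, no symptoms": "hiv3",
}

INTERCEPT = 11.4
COEFFICIENTS = {"hiv1": -0.066, "hiv2": -2.56, "hiv3": -1.69}


def dummy_variables(status):
    """Return the three 0/1 dummy variables for one subject's HIV status."""
    if status != REFERENCE_CLASS and status not in DUMMY_FOR_CLASS:
        raise ValueError(f"unknown HIV status: {status!r}")
    active = DUMMY_FOR_CLASS.get(status)  # None for the reference class
    return {name: int(name == active) for name in COEFFICIENTS}


def predicted_mannitol(status):
    """Predicted mannitol absorption from the fitted equation."""
    dummies = dummy_variables(status)
    return INTERCEPT + sum(COEFFICIENTS[name] * dummies[name] for name in COEFFICIENTS)


for status in [REFERENCE_CLASS] + list(DUMMY_FOR_CLASS):
    print(f"{status}: dummies {dummy_variables(status)}, "
          f"predicted mannitol {predicted_mannitol(status):.2f}")
```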
Table 17.9. Analysis of variance for the regression of mannitol excretion on HIV status
Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                58                  1559.035
Regression           3                   49.011          16.337       0.60                0.6
Residual             55                  1510.024        27.455
Table 17.10. Two-way analysis of variance for mannitol excretion, with HIV status and diarrhoea as factors
Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                58                  1559.035
Model                4                   134.880         33.720       1.28                0.3
HIV                  3                   58.298          19.432       0.74                0.5
Diarrhoea            1                   85.869          85.869       3.26                0.08
Residual             54                  1424.155        26.373
17.7* Multi-way analysis of variance
A different approach to the analysis of multifactorial data is provided by the direct calculation of analysis of variance. Table 17.9 is identical to the one-way analysis of variance for the same data in Table 10.8. We can also produce analyses of variance for several factors at once. Table 17.10 shows the two-way analysis of variance for the mannitol data, the factors being HIV status and presence or absence of diarrhoea. This could be produced equally well by multiple regression with two categorical predictor variables. If there were the same number of patients with and without diarrhoea in each HIV group the factors would be balanced. The model sum of squares would then be the sum of the sums of squares for HIV and for diarrhoea, and these could be calculated very simply from the totals of the HIV groups and the diarrhoea groups. For balanced data we can assess many categorical factors and their interactions quite easily by manual calculation. See Armitage and Berry (1994) for details. Complex multifactorial balanced experiments are rare in medical research, and they can be analysed by regression anyway to get identical results. Most computer programs in fact use the regression method to calculate analyses of variance.
For another example, consider Table 17.11, which shows the results of a study of the production of Tumour Necrosis Factor (TNF) by cells in vitro. Two different potential stimulating factors, Mycobacterium tuberculosis (MTB) and Fixed Activated T-cells (FAT), have been added, singly and together. Cells from the same 11 donors have been used throughout. Thus we have three factors, MTB, FAT, and donor. Three measurements were made at each combination of factors; Figure 17.5(a) shows the means of these sets of three. Every possible combination of factors is used the same number of times in a perfect three-way factorial arrangement. There are two missing observations. These things happen, even in the best regulated laboratories. There are some negative values of TNF.
Table 17.11. TNF measured under four different conditions using cells from 11 donors (data of Dr. Jan Davies)
No MTB MTB
FAT Donor TNF, 3 replicates FAT Donor
No 1 -0.01 -0.01 -0.13 No 1
No 2 16.13 -9.62 -14.88 No 2
No 3 Missing -0.3 -0.95 No 3
No 4 3.63 47.5 55.2 No 4
No 5 -3.21 -5.64 -5.32 No 5
No 6 16.26 52.21 17.93 No 6
No 7 -12.74 -5.23 -4.06 No 7
No 8 -4.67 20.1 110 No 8
No 9 -5.4 20 10.3 No 9
No 10 -10.94 -5.26 -2.73 No 10
No 11 -4.19 -11.83 -6.29 No 11
Yes 1 88.16 97.58 66.27 Yes 1
Yes 2 196.5 114.1 134.2 Yes 2
Yes 3 6.02 1.19 3.38 Yes 3
Yes 4 935.4 1011 951.2 Yes 4
Yes 5 606 592.7 608.4 Yes 5
Yes 6 1457 1349 1625 Yes 6
Yes 7 1457 1349 1625 Yes 7
Yes 8 196.7 270.8 160.7 Yes 8
Yes 9 135.2 221.5 268 Yes 9
Yes 10 -14.47 79.62 304.1 Yes 10
Yes 11 516.3 585.9 562.6 Yes 11
Fig. 17.5. Tumour Necrosis Factor (TNF) measured in the presence and absence of Fixed Activated T-cells (FAT) and Mycobacterium tuberculosis (MTB), the natural and a transformed scale
The negative values do not mean that the cells were sucking TNF in from their environment; they are an artifact of the assay method and represent measurement error.
The subject means are shown in Figure 17.5(a). This suggests several things: there is a strong donor effect (donor 6 is always high, donor 3 is always low, for example), MTB and FAT each increase TNF, both together have a greater effect than either individually, the distribution of TNF is highly skew, and the variance of TNF varies greatly from group to group and increases with the mean. As the mean for MTB and FAT combined is much greater than the sum of their individual means, the researcher thought there was synergy, i.e. that MTB and FAT worked together, the presence of one enhancing the effect of the other. She was seeking statistical support for this conclusion (Jan Davies, personal communication).
Table 17.12. Analysis of variance for the effects of MTB, FAT and donor on transformed TNF
Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                43                  194.04030
Donor                10                  38.89000        3.88900      3.72                0.003
MTB                  1                   58.49320        58.49320     55.88               <0.0001
FAT                  1                   65.24482        65.24482     62.33               <0.0001
MTB × FAT            1                   0.00811         0.00811      0.01                0.9
Residual             30                  31.40418        1.04681
For statistical analysis, we would like Normal distributions with uniform variances between the groups. A log transformation looks like a good bet, but some observations are negative. As the log (or the square root) will not work for negative numbers, we have to adjust the data further. The easiest approach is to add a constant to all the observations before transformation. I chose 20, which makes all the observations positive but is small compared to most of the observations. I did this by trial and error. As Figure 17.5(b) shows, the transformation has not been totally successful, but the transformed data look much more amenable to a Normal theory analysis than do the raw data.
The repeated measurements give us a more accurate measurement of TNF, but do not contribute anything else. I therefore analysed the mean transformed TNF. The analysis of variance is shown in Table 17.12. Donor is a factor with 11 categories, hence has 10 degrees of freedom. It is not of any importance to the science here, but is what we call a nuisance variable, one we need to allow for but are not interested in. I have included an interaction between MTB and FAT, because looking for this is one of the objectives of the experiment. The main effects of MTB and FAT are highly significant, but the interaction term is not. The estimates of the effects with their confidence intervals are shown in Table 17.13. As the analysis was on a log scale, the antilogs (exponentials) are also shown. The antilog gives us the ratio of the (geometric) mean in the presence of the factor to the mean in the absence of the factor, i.e. the amount by which TNF is multiplied when the factor is present. Strictly speaking, of course, it is the ratio of the geometric means of TNF plus 20, but as 20 is small compared to most TNF measurements the ratio will be approximately the increase in TNF.
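The transformation and its back-transformation can be sketched as below. The offset of 20 and the effect estimates are those given in the text and in Table 17.13; the function names are illustrative, and natural logarithms are assumed because the antilogs are described as exponentials.

```python
import math

OFFSET = 20  # constant added before taking logs, as chosen in the text


def transform(tnf):
    """Natural log of TNF plus the offset; defined for any TNF above -20."""
    return math.log(tnf + OFFSET)


def ratio_effect(log_effect):
    """Antilog of an effect estimated on the log scale: a ratio of geometric means."""
    return math.exp(log_effect)


# The most negative observation in Table 17.11 is still transformable.
print(f"transform(-14.88) = {transform(-14.88):.3f}")

# Effects and 95% confidence limits on the log scale (model with interaction term).
for name, estimate, lower, upper in (("MTB", 2.333, 1.442, 3.224),
                                     ("FAT", 2.463, 1.572, 3.354),
                                     ("MTB x FAT", 0.054, -1.206, 1.314)):
    print(f"{name}: ratio {ratio_effect(estimate):.1f} "
          f"({ratio_effect(lower):.1f} to {ratio_effect(upper):.1f})")
```

Because the model is additive on the log scale, an interaction ratio near 1 means that the combined effect of MTB and FAT is close to the product of their separate ratio effects.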
The estimated interaction is small and not significant. The confidence interval is wide (the sample is very small), so we cannot exclude the possibility of an interaction, but there is certainly no evidence that one exists. This was not what the researcher expected. This contradiction comes about because the statistical model used is of additive effects on the logarithmic scale, i.e. of multiplicative effects on the natural scale. This is forced on us by the nature of the data. The lack of interaction between the effects shows that the data are consistent with this model, this view of what is happening. The lack of interaction can be seen quite clearly in Figure 17.5(b), as the mean for MTB and FAT looks very similar to the sum of the means for MTB alone and FAT alone.
Table 17.13. Estimated effects on TNF of MTB, FAT and their interaction
                          Effect (log scale)  95% Confidence interval  Ratio effect (natural scale)  95% Confidence interval
With interaction term
MTB                       2.333               (1.442 to 3.224)         10.3                          (4.2 to 25.1)
FAT                       2.463               (1.572 to 3.354)         11.7                          (4.8 to 28.6)
MTB × FAT                 0.054               (-1.206 to 1.314)        1.1                           (0.3 to 3.7)
Without interaction term
MTB                       2.306               (1.687 to 2.925)         10.0                          (5.4 to 18.6)
FAT                       2.435               (1.816 to 3.054)         11.4                          (6.1 to 21.2)
Multiple regression in which qualitative and quantitative predictor variables are both used is also known as analysis of covariance. For ordinal data, there is a two-way analysis of variance using ranks, the Friedman test (see Conover 1980, Altman 1991).
17.8* Logistic regression
Logistic regression is used when the outcome variable is dichotomous, a ‘yes or no’, whether or not the subject has a particular characteristic such as a symptom. We want a regression equation which will predict the proportion of individuals who have the characteristic, or, equivalently, estimate the probability that an individual will have the symptom. We cannot use an ordinary linear regression equation, because this might predict proportions less than zero or greater than one, which would be meaningless. Instead we use the logit of the proportion as the outcome variable. The logit of a proportion p is the log odds (§13.7):
logit(p) = log(p/(1 - p))
The logit can take any value from minus infinity, when p = 0, to plus infinity, when p = 1. We can fit regression models to the logit which are very similar to the ordinary multiple regression and analysis of variance models found for data from a Normal distribution. We assume that relationships are linear on the logistic scale:
log(p/(1 - p)) = b0 + b1 × x1 + b2 × x2 + … + bm × xm
where x1, …, xm are the predictor variables and p is the proportion to be predicted. The method is called logistic regression, and the calculation is computer intensive. The effects of the predictor variables are found as log odds ratios. We will look at the interpretation in an example.
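A small sketch of the logit and its inverse may help fix ideas. The function names are illustrative, and natural logarithms are assumed, consistent with the use of exp for antilogs later in this section.

```python
import math


def logit(p):
    """Log odds of a proportion p, for 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(x):
    """Proportion corresponding to a log odds x; always between 0 and 1."""
    return 1 / (1 + math.exp(-x))


for p in (0.01, 0.2, 0.5, 0.8, 0.99):
    print(f"p = {p}: logit = {logit(p):.3f}, back-transformed = {inverse_logit(logit(p)):.2f}")

# However extreme the value on the logit scale, the proportion stays inside (0, 1).
print(inverse_logit(-10), inverse_logit(10))
```

This is why a linear equation on the logit scale cannot predict an impossible proportion, whereas a linear equation for the proportion itself can.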
Fig. 17.6. Body mass index (BMI) in women undergoing trial of scar
Table 17.14. Coefficients in the logistic regression for predicting caesarian section
                 Coef.    Std. Err.  z      P       95% Confidence interval
BMI              0.0883   0.0200     4.42   <0.001  0.0492 to 0.1275
Induction        0.6471   0.2141     3.02   0.003   0.2276 to 1.0667
Prev. vag. del.  -1.7963  0.2981     -6.03  <0.001  -2.3805 to -1.2120
Intercept        -3.7000  0.5343     -6.93  <0.001  -4.7473 to -2.6528
When giving birth, women who have had a previous caesarian section usually have a trial of scar, that is, they attempt a natural labour with vaginal delivery and only have another caesarian if this is deemed necessary. Several factors may increase the risk of a caesarian, and in this study the factor of interest was obesity, as measured by the body mass index or BMI, defined as weight/height². The distribution of BMI is shown in Figure 17.6 (data of Andreas Papadopoulos). For caesarians the mean BMI was 26.4 kg/m² and for vaginal deliveries the mean was 24.9 kg/m². Two other variables had a strong relationship with a subsequent caesarian. Women who had had a previous vaginal delivery (PVD) were less likely to need a caesarian, odds ratio = 0.18, 95% confidence interval 0.10 to 0.32. Women whose labour was induced had an increased risk of a caesarian, odds ratio = 2.11, 95% confidence interval 1.44 to 3.08. All these relationships were highly significant. The question to be answered was whether the relationship between BMI and caesarian section remained when the effects of induction and previous deliveries were allowed for.
The results of the logistic regression are shown in Table 17.14. We have the coefficients for the equation predicting the log odds of a caesarian:
log(o) = -3.7000 + 0.0883 × BMI + 0.6471 × induction - 1.7963 × PVD
where induction and PVD are 1 if present, 0 if not. Thus for a woman who had BMI = 25 kg/m², had not been induced and had had a previous vaginal delivery, the log odds of a caesarian is estimated to be
Table 17.15. Odds ratios from the logistic regression for predicting caesarian section
                 Odds ratio  P       95% Confidence interval
BMI              1.092       <0.001  1.050 to 1.136
Induction        1.910       0.003   1.256 to 2.906
Prev. vag. del.  0.166       <0.001  0.096 to 0.298
log(o) = -3.7000 + 0.0883 × 25 + 0.6471 × 0 - 1.7963 × 1 = -3.2888
The odds is exp(-3.2888) = 0.03730 and the probability is given by p = o/(1 + o) = 0.03730/(1 + 0.03730) = 0.036. If labour had been induced, the log odds would rise to
log(o) = -3.7000 + 0.0883 × 25 + 0.6471 × 1 - 1.7963 × 1 = -2.6417
giving odds exp(-2.6417) = 0.07124 and hence probability 0.07124/(1 + 0.07124) = 0.067.
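The worked example can be reproduced with a few lines of Python. The coefficients are those of Table 17.14; the function and variable names are illustrative.

```python
import math

# Coefficients from Table 17.14 for the log odds of a caesarian.
INTERCEPT = -3.7000
B_BMI = 0.0883
B_INDUCTION = 0.6471
B_PVD = -1.7963


def log_odds(bmi, induction, pvd):
    """Predicted log odds; induction and pvd are 1 if present, 0 if not."""
    return INTERCEPT + B_BMI * bmi + B_INDUCTION * induction + B_PVD * pvd


def probability(bmi, induction, pvd):
    """Predicted probability of a caesarian, p = o / (1 + o)."""
    odds = math.exp(log_odds(bmi, induction, pvd))
    return odds / (1 + odds)


for induction in (0, 1):
    lo = log_odds(25, induction, 1)
    print(f"BMI 25, induction {induction}, PVD 1: log odds = {lo:.4f}, "
          f"odds = {math.exp(lo):.5f}, probability = {probability(25, induction, 1):.3f}")
```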
Because the logistic regression equation predicts the log odds, the coefficients represent the difference between two log odds, a log odds ratio. The antilog of the coefficients is thus an odds ratio. Some programs will print these odds ratios directly, as in Table 17.15. We can see that induction increases the odds of a caesarian by a factor of 1.910 and a previous vaginal delivery multiplies the odds by 0.166, i.e. reduces them to about one sixth. These are often called adjusted odds ratios. In this example they and their confidence intervals are similar to the unadjusted odds ratios given above, because the three predictor variables happen not to be closely related to each other.
For a continuous predictor variable, such as BMI, the coefficient is the change in log odds for an increase of one unit in the predictor variable. The antilog of the coefficient, the odds ratio, is the factor by which the odds must be multiplied for a unit increase in the predictor. Two units increase in the predictor increases the odds by the square of the odds ratio, and so on. A difference of 5 kg/m² in BMI gives an odds ratio for a caesarian of 1.092⁵ = 1.55, thus the odds of a caesarian are multiplied by 1.55. See §11.8 for a similar interpretation and fuller discussion when a continuous outcome variable is log transformed.
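The step from coefficients to odds ratios is just exponentiation, as this sketch shows. The coefficients are taken from Table 17.14 and the names used are illustrative.

```python
import math

COEFFICIENTS = {"BMI": 0.0883, "Induction": 0.6471, "Prev. vag. del.": -1.7963}

# The antilog of each coefficient is the adjusted odds ratio of Table 17.15.
odds_ratios = {name: math.exp(b) for name, b in COEFFICIENTS.items()}
for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.3f}")

# For a continuous predictor, k units multiply the odds by the odds ratio to the
# power k, which is the same as taking the antilog of k times the coefficient.
or_five_units = math.exp(5 * COEFFICIENTS["BMI"])
print(f"odds ratio for 5 kg/m2 of BMI = {or_five_units:.2f}")
```

Working from the unrounded coefficient gives a value very slightly different from 1.092 raised to the fifth power, though the two agree when rounded to 1.55 or 1.56 depending on the rounding taken.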
When we have a case control study, we can analyse the data by using the case or control status as the outcome variable in a logistic regression. The coefficients are then the approximate log relative risks due to the factors (§13.7). There is a variant called conditional logistic regression, which can be used when the cases and controls are in matched pairs, triples, etc.
Logistic regression is a large sample method. A rule of thumb is that there should be at least 10 ‘yes’s and 10 ‘no’s, and preferably 20, for each predictor variable (Peduzzi et al. 1996).
17.9* Survival data using Cox regression
One problem of survival data, the censoring of individuals who have not died at the time of analysis, has been discussed in §15.6. There is another which is important for multifactorial analysis. We often have no suitable mathematical model of the way survival is related to time, i.e. the survival curve. The solution now widely adopted to this problem was proposed by Cox (1972), and is known as Cox regression or the proportional hazards model. In this approach, we say that for subjects who have lived to time t, the probability of an endpoint (e.g. dying) instantaneously at time t is h(t), which is an unknown function of time. We call the probability of an endpoint the hazard, and h(t) is the hazard function. We then assume that anything which affects the hazard does so by the same ratio at all times. Thus, something which doubles the risk of an endpoint on day one will also double the risk of an endpoint on day two, day three and so on. So, if h0(t) is the hazard function for subjects with all the predictor variables equal to zero, and h(t) is the hazard function for a subject with some other values for the predictor variables, h(t)/h0(t) depends only on the predictor variables, not on time t. We call h(t)/h0(t) the hazard ratio. It is the relative risk of an endpoint occurring at any given time.
In statistics, it is convenient to work with differences rather than ratios, so we take the logarithm of the ratio (see §5A) and have a regression-like equation:
log(h(t)/h0(t)) = b1 × x1 + b2 × x2 + … + bp × xp
where x1, …, xp are the predictor variables and b1, …, bp are the coefficients which we estimate from the data. This is Cox's proportional hazards model. Cox regression enables us to estimate the values of b1, …, bp which best predict the observed survival. There is no constant term b0, its place being taken by the baseline hazard function h0(t).
Table 15.7 shows the time to recurrence of gallstones, or the time for which patients are known to have been gallstone-free, following dissolution by bile acid treatment or lithotripsy, with the number of previous gallstones, their maximum diameter, and the time required for their dissolution. The difference between patients with a single and with multiple previous gallstones was tested using the logrank test (§15.6). Cox regression enables us to look at continuous predictor variables, such as diameter of gallstone, and to examine several predictor variables at once. Table 17.16 shows the result of the Cox regression. We can carry out an approximate test of significance by dividing the coefficient by its standard error, and if the null hypothesis that the coefficient would be zero in the population is true, this follows a Standard Normal distribution. The chi-squared statistic tests the relationship between the time to recurrence and the three variables together. The maximum diameter has no significant relationship to time to recurrence, so we can try a model without it (Table 17.17). As the change in overall chi-squared shows, removing diameter has had very little effect.
The coefficients in Table 17.17 are the log hazard ratios. The coefficient for multiple gallstones is 0.963. If we antilog this, we get exp(0.963) = 2.62. As multiple gallstones is a 0 or 1 variable, the coefficient measures the difference between those with single and multiple stones. A patient with multiple gallstones is 2.62 times as likely to have a recurrence at any time as a patient with a single stone. The 95% confidence interval for this estimate is found from the antilogs of the confidence interval in Table 17.17, 1.30 to 5.26. Note that a positive coefficient means an increased risk of the event, in this case recurrence. The coefficient for months to dissolution is 0.043, which has antilog = 1.04. This is a quantitative variable, and for each month to dissolve the hazard ratio increases by a factor of 1.04. Thus a patient whose stone took two months to dissolve has a risk of recurrence 1.04 times that for a patient whose stone took one month, a patient whose stone took three months has a risk $1.04^2$ times that for a one month patient, and so on.
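The antilog calculations in this paragraph can be checked with a few lines of Python. This is a minimal sketch using only the coefficients and confidence limits printed in Table 17.17; the function name `hazard_ratio` is illustrative and does not come from any statistical package.

```python
import math

def hazard_ratio(coef, ci_low, ci_high):
    """Antilog a Cox coefficient and its confidence limits."""
    return math.exp(coef), math.exp(ci_low), math.exp(ci_high)

# Multiple gallstones (0/1 variable): coefficient and 95% CI from Table 17.17
hr_mult, hr_mult_low, hr_mult_high = hazard_ratio(0.963, 0.266, 1.661)

# Months to dissolution (quantitative variable): hazard ratio per extra month
hr_month, hr_month_low, hr_month_high = hazard_ratio(0.043, 0.010, 0.076)

# For a stone taking k months longer to dissolve the hazard ratio is
# exp(k * coef), i.e. the one-month hazard ratio raised to the power k.
hr_two_months_longer = math.exp(2 * 0.043)

print(f"multiple stones: {hr_mult:.2f} ({hr_mult_low:.2f} to {hr_mult_high:.2f})")
print(f"per month: {hr_month:.2f}; two months longer: {hr_two_months_longer:.2f}")
```

Because the model is linear on the log scale, hazard ratios multiply: the last line is the same quantity as `hr_month ** 2`.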
Table 17.16. Cox regression of time to recurrence of gallstones on presence of multiple stones, maximum diameter of stone and months to dissolution
Variable             Coef.     Std. Err.   z       P       95% Conf. interval
Mult. gallstones     0.838     0.401       2.09    0.038   0.046 to 1.631
Max. diam.           -0.023    0.036       -0.63   0.532   -0.094 to 0.049
Months to dissol.    0.044     0.017       2.64    0.009   0.011 to 0.078
χ² = 12.57, 3 d.f., P = 0.006.
Table 17.17. Cox regression of time to recurrence of gallstones on presence of multiple stones and months to dissolution
Variable             Coef.     Std. Err.   z       P       95% Conf. interval
Mult. gallstones     0.963     0.353       2.73    0.007   0.266 to 1.661
Months to dissol.    0.043     0.017       2.59    0.011   0.010 to 0.076
χ² = 12.16, 2 d.f., P = 0.002.
If we have only the dichotomous variable multiple gallstones in the Cox model, we get for the overall test statistic χ² = 6.11, 1 degree of freedom. In §15.6 we analysed these data by comparison of two groups using the logrank test which gave χ² = 6.62, 1 degree of freedom. The two methods give similar, but not identical results. The logrank test is non-parametric, making no assumption about the distribution of survival time. The Cox method is said to be semi-parametric, because although it makes no assumption about the shape of the distribution of survival time, it does require assumptions about the hazard ratio.
Likelogisticregression(§17.8),Coxregressionisalargesamplemethod.Aruleofthumbisthatthereshouldbeatleast10,andpreferably20,events(deaths)foreachpredictorvariable.FulleraccountsofCoxregressionaregivenbyAltman(1991),MatthewsandFarewell(1988),ParmarandMachin(1995),andHosmerandLemeshow(1999).
17.10* Stepwise regression
Stepwise regression is a technique for choosing predictor variables from a large set. The stepwise approach can be used with multiple linear, logistic and Cox regression and with other, less often seen, regression techniques (§17.12) too.
There are two basic strategies: step-up and step-down, also called forward and backward. In step-up or forward regression, we fit all possible one-way regression equations. Having found the one which accounts for the greatest variance, all two-way regressions including this variable are fitted. The equation accounting for the most variation is chosen, and all three-way regressions including these are fitted, and so on. This continues until no significant increase in variation accounted for is found. In the step-down or backward method, we first fit the regression with all the predictor variables, and then the variable is removed which reduces the amount of variation accounted for by the least amount, and so on. There are also more complex methods, in which variables can both enter and leave the regression equation.
Thesemethodsmustbetreatedwithcare.Differentstepwisetechniquesmayproducedifferentsetsofpredictorvariablesintheregressionequation.Thisisespeciallylikelywhenthepredictorvariablesarecorrelatedwithoneanother.Thetechniqueisveryusefulforselectingasmallsetofpredictorvariablesforpurposesofstandardizationandprediction.Fortryingtogetanunderstandingof
theunderlyingsystem,stepwisemethodscanbeverymisleading.Whenpredictorvariablesarehighlycorrelated,onceonehasenteredtheequationinastep-upanalysis,theotherwillnotenter,eventhoughitisrelatedtotheoutcome.Thusitwillnotappearinthefinalequation.
17.11* Meta-analysis: Data from several studies
Meta-analysis is the combination of data from several studies to produce a single estimate. From the statistical point of view, meta-analysis is a straightforward application of multifactorial methods. We have several studies of the same thing, which might be clinical trials or epidemiological studies, perhaps carried out in different countries. Each trial gives us an estimate of an effect. We assume that these are estimates of the same global population value. We check the assumptions of the analysis, and, if these assumptions are satisfied, we combine the separate study estimates to make a common estimate. This is a multifactorial analysis, where the treatment or risk factor is one predictor variable and the study is another, categorical, predictor variable.
Themainproblemsofmeta-analysisarisebeforewebegintheanalysisofthedata.First,wemusthaveacleardefinitionofthequestionsothatweonlyincludestudieswhichaddressthis.Forexample,ifwewanttoknowwhetherloweringserumcholesterolreducesmortalityfromcoronaryarterydisease,wewouldnotwanttoincludeastudywheretheattempttolowercholesterolfailed.Ontheotherhand,ifweaskwhetherdietaryadvicelowersmortality,wewouldincludesuchastudy.Whichstudiesweincludemayhaveaprofoundinfluenceontheconclusions(Thompson1993).Second,wemusthavealltherelevant
studies.Asimpleliteraturesearchisnotenough.Notallstudieswhichhavebeenstartedarepublished;studieswhichproducesignificantdifferencesaremorelikelytobepublishedthanthosewhichdonot(e.g.PocockandHughes1990;Easterbrooketal.1991).Withinastudy,resultswhicharesignificantmaybeemphasizedandpartsofthedatawhichproducenodifferencesmaybeignoredbytheinvestigatorsasuninteresting.Publicationofunfavourableresultsmaybediscouragedbythesponsorsofresearch.ResearcherswhoarenotnativeEnglish
speakersmayfeelthatpublicationintheEnglishlanguageliteratureismoreprestigiousasitwillreachawideraudience,andsotrytherefirst,onlypublishingintheirownlanguageiftheycannotpublishinEnglish.TheEnglishlanguageliteraturemaythuscontainmorepositiveresultsthandootherliteratures.Thephenomenonbywhichsignificantandpositiveresultsaremorelikelytobereported,andreportedmoreprominently,thannon-significantandnegativeonesiscalledpublicationbias.Thuswemustnotonlytrawlthepublishedliteratureforstudies,butusepersonalknowledgeofourselvesandotherstolocatealltheunpublishedstudies.Onlythenshouldwecarryoutthemeta-analysis.
Whenwehaveallthestudieswhichmeetthedefinition,wecombinethemtogetacommonestimateoftheeffectofthetreatmentorriskfactor.Weregardthestudiesasprovidingseveralobservationsofthesamepopulationvalue.Therearetwostagesinmeta-analysis.Firstwecheckthatthestudiesdoprovideestimatesofthesamething.Second,wecalculatethecommonestimateanditsconfidenceinterval.Todothiswemayhavetheoriginaldatafromallthestudies,whichwecancombineintoonelargedatafilewithstudyasoneofthevariables,orwemayonlyhavesummarystatisticsobtainedfrompublications.
Iftheoutcomemeasureiscontinuous,suchasmeanfallinbloodpressure,wecancheckthatsubjectsarefromthesamepopulationbyanalysisofvariance,withtreatmentorriskfactor,study,andinteractionbetweentheminthemodel.Multipleregressioncanalsobeused,rememberingthatstudyisacategoricalvariableanddummyvariablesarerequired.Wetestthetreatmenttimesstudyinteractionintheusualway.Iftheinteractionissignificantthisindicatesthatthetreatmenteffectisnotthesameinallstudies,andsowecannotcombinethestudies.Itistheinteractionwhichisimportant.Itdoesnotmattermuchifthemeanbloodpressurevariesfromstudytostudy.Whatmattersiswhethertheeffectofthetreatmentonbloodpressurevariesmorethanwewouldexpect.Wemaywanttoexaminethestudiestoseewhetheranycharacteristicofthestudiesexplainsthisvariation.Thismightbeafeatureofthesubjects,thetreatmentorthedatacollection.Ifthereisnointeraction,thenthedataareconsistentwiththetreatmentorriskfactoreffectbeingconstant.Thisiscalleda
fixedeffectsmodel(see§10.12).Wecandroptheinteractiontermfromthemodelandthetreatmentorriskfactoreffectisthentheestimatewewant.Itsstandarderrorandconfidenceintervalarefoundasdescribedin§17.2.Ifthereisaninteraction,wecannotestimateasingletreatmenteffect.Wecanthinkofthestudiesasarandomsampleofthepossibletrialsandestimatethemeantreatmenteffectforthispopulation.Thisiscalledtherandomeffectsmodel(§10.12).The
confidenceintervalisusuallymuchwiderthanthatfoundusingthefixedeffectmodel.
Table 17.18. Deaths and numbers of subjects in five studies of vitamin A supplementation in infectious disease (Glasziou and Mackerras 1993)
Study   Dose regime                   Vitamin A            Controls
                                      Deaths    Number     Deaths    Number
1       200 000 IU six-monthly        101       12991      130       12209
2       200 000 IU six-monthly        39        7076       41        7006
3       8333 IU weekly                37        7764       80        7755
4       200 000 IU four-monthly       152       12541      210       12264
5       200 000 IU once               138       3786       167       3411
Table 17.19. Odds ratios and confidence intervals in five studies of vitamin A supplementation in infectious disease
Study   Odds ratio   95% Confidence interval
1       0.73         0.56 to 0.95
2       0.94         0.61 to 1.46
3       0.46         0.31 to 0.68
4       0.70         0.57 to 0.87
5       0.73         0.58 to 0.93
Iftheoutcomemeasureisdichotomous,suchassurvivedordied,theestimateofthetreatmentorriskfactoreffectwillbeintheformofanoddsratio(§13.7).Wecanproceedinthesamewayasforacontinuousoutcome,usinglogisticregression(§17.8).Severalothermethodsexistforcheckingthehomogeneityoftheoddsratiosacrossstudies,suchas
Woolf'stest(seeArmitageandBerry1994)orthatofBreslowandDay(1980).Theyallgivesimilaranswers,and,sincetheyarebasedondifferentlarge-sampleapproximations,thelargerthestudysamplesthemoresimilartheresultswillbe.Providedtheoddsratiosarehomogeneousacrossstudies,wecanthenestimatethecommonoddsratio.ThiscanbedoneusingtheMantel-Haenszelmethod(seeArmitageandBerry1994)orbylogisticregression.
Forexample,GlasziouandMackerras(1993)carriedoutameta-analysisofvitaminAsupplementationininfectiousdisease.TheirdataforfivecommunitystudiesareshowninTable17.18.Wecanobtainoddsratiosandconfidenceintervalsasdescribedin§13.7,showninTable17.19.
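As an illustration, the odds ratios of Table 17.19 can be recomputed from the counts in Table 17.18. The sketch below assumes the usual large-sample interval based on the standard error of the log odds ratio, the square root of the sum of the reciprocals of the four cell frequencies; the function name `odds_ratio_ci` is made up for this example.

```python
import math

# (deaths, number) for vitamin A then controls, from Table 17.18
studies = {
    1: (101, 12991, 130, 12209),
    2: (39, 7076, 41, 7006),
    3: (37, 7764, 80, 7755),
    4: (152, 12541, 210, 12264),
    5: (138, 3786, 167, 3411),
}

def odds_ratio_ci(deaths_t, n_t, deaths_c, n_c, u=1.96):
    """Odds ratio with an approximate 95% CI from the log odds ratio."""
    a, b = deaths_t, n_t - deaths_t   # treated: died, survived
    c, d = deaths_c, n_c - deaths_c   # control: died, survived
    odds_ratio = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - u * se_log)
    high = math.exp(math.log(odds_ratio) + u * se_log)
    return odds_ratio, low, high

results = {study: odds_ratio_ci(*counts) for study, counts in studies.items()}

for study, (odds_ratio, low, high) in results.items():
    print(f"Study {study}: OR {odds_ratio:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

The limits agree with Table 17.19 to within rounding in the second decimal place.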
Thecommonoddsratiocanbefoundinseveralways.Touselogisticregression,weregresstheeventofdeathonvitaminAtreatmentandstudy.Ishalltreatthetreatmentasadichotomousvariable,setto1iftreatedwithvitaminA,0ifcontrol.Studyisacategoricalvariable,sowecreatedummyvariablesstudy1tostudy4,whicharesettooneforstudies1to4respectively,andtozerootherwise.Wetesttheinteractionbycreatinganothersetofvariables,theproductsofstudy1tostudy4andvitaminA.LogisticregressionofdeathonvitaminA,studyandinteractiongivesachi-squaredstatisticforthemodelof
496.99with9degreesoffreedom,whichishighlysignificant.Logisticregressionwithouttheinteractiontermsgives490.33with5degreesoffreedom.Thedifferenceis496.99-490.33=6.66with9-5=4degreesoffreedom,whichhasP=0.15,sowecandroptheinteractionfromthemodel.TheadjustedoddsratioforvitaminAis0.70,95%confidenceinterval0.62to0.79,P<0.0001.
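The test of the interaction is a difference between two model chi-squared statistics, and its P value can be sketched without a statistics package because the chi-squared tail probability has a closed form when the degrees of freedom are even. The second function is a cross-check rather than the logistic regression used in the text: it computes the Mantel-Haenszel common odds ratio, mentioned above as an alternative, from the counts in Table 17.18. Both function names are illustrative.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

# Model chi-squared with and without the interaction terms (from the text)
chi2_diff = 496.99 - 490.33
df_diff = 9 - 5
p_interaction = chi2_upper_tail_even_df(chi2_diff, df_diff)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel common odds ratio over several 2 by 2 tables."""
    numerator = denominator = 0.0
    for deaths_t, n_t, deaths_c, n_c in tables:
        a, b = deaths_t, n_t - deaths_t
        c, d = deaths_c, n_c - deaths_c
        total = n_t + n_c
        numerator += a * d / total
        denominator += b * c / total
    return numerator / denominator

# (deaths, number) for vitamin A then controls, from Table 17.18
tables = [
    (101, 12991, 130, 12209),
    (39, 7076, 41, 7006),
    (37, 7764, 80, 7755),
    (152, 12541, 210, 12264),
    (138, 3786, 167, 3411),
]
common_or = mantel_haenszel_or(tables)

print(f"interaction: chi-squared {chi2_diff:.2f}, {df_diff} d.f., P = {p_interaction:.3f}")
print(f"Mantel-Haenszel common odds ratio: {common_or:.2f}")
```

The Mantel-Haenszel estimate comes out close to the adjusted odds ratio of 0.70 from the logistic regression, as expected when the odds ratios are homogeneous.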
Fig.17.7.Meta-analysisoffivevitaminAtrials(dataofGlasziouandMackerras1993).Theverticallinesaretheconfidenceintervals.
TheoddsratiosandtheirconfidenceintervalsareshowninFigure17.7.Theconfidenceintervalisindicatedbyaline,thepointestimateoftheoddsratiobyacircle.Inthispicturethemostimportanttrialappearstobestudy2,withthewidestconfidenceinterval.Infact,itisthestudywiththeleasteffectonthewholeestimate,becauseitisthestudywheretheoddsratioisleastwellestimated.Inthesecondpicture,theoddsratioisindicatedbythemiddleofasquare.Theareaofthesquareisproportionaltothenumberofsubjectsinthestudy.Thisnowmakesstudy2appearrelativelyunimportant,andmakestheoverallestimatestandout.
Therearemanyvariantsonthisstyleofgraph,whichissometimescalledaforestdiagram.Thegraphisoftenshownwiththestudiesontheverticalaxis
andtheoddsratioordifferenceinmeanonthehorizontalaxis(Figure17.8).Thecombinedestimateoftheeffectmaybeshownasalozengeordiamondshapeandforoddsratiosalogarithmicscaleisoftenemployed,asinFigure17.8.
Fig.17.8.Meta-analysisoffivevitaminAtrials,verticalversion
17.12* Other multifactorial methods
The choice of multiple, logistic or Cox regression is determined by the nature of the outcome variable: continuous, dichotomous, or survival times respectively. There are other types of outcome variable and corresponding multifactorial techniques. I shall not go into any details, but this list may help should you come across any of them. I would recommend you consult a statistician should you actually need to use one of these methods. The techniques for dealing with predictor variables described in §17.2–17.4 and §17.6 apply to all of them.
Iftheoutcomevariableiscategoricalwithmorethantwocategories,e.g.severaldiagnosticgroups,weuseaprocedurecalledmultinomiallogisticregression.Thisestimatesforasubjectwithgivenvaluesofthepredictorvariabletheprobabilitythatthesubjectwillbeineachcategory.Ifthecategoriesareordered,e.g.tumourstage,wecantaketheorderingintoaccountusingorderedlogisticregression.Boththesetechniquesarecloselyrelatedtologisticregression(§17.8).
Iftheoutcomeisacount,suchashospitaladmissionsinadayordeathsrelatedtoaspecificcauseperweekormonth,wecanusePoisson
regression.Thisisparticularlyusefulwhenwehavemanytimeintervalsbutthenumbersofeventsperintervalissmall,sothattheassumptionsofmultipleregression(§17.5)donotapply.
Aslightlydifferentproblemariseswithmulti-waycontingencytableswherethereisnoobviousoutcomevariable.Wecanuseatechniquecalledloglinearmodelling.Thisenablesustotesttherelationshipbetweenanytwoofthevariablesinthetableholdingtheothersconstant.
17M* Multiple choice questions 93 to 97
(Each answer is true or false)
93. In multiple regression, R²:
(a) is the square of the multiple correlation coefficient;
(b) would be unchanged if we exchanged the outcome (dependent) variable and one of the predictor (independent) variables;
(c) is called the proportion of variability explained by the regression;
(d) is the ratio of the error sum of squares to the total sum of squares;
(e) would increase if more predictor variables were added to the model.
Table 17.20. Analysis of variance for the effects of age, sex and ethnic group (Afro-Caribbean versus White) on inter-pupil distance (Imafedon, personal communication)
Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 37                   603.586
Age group             2                    124.587          62.293        6.81                 0.003
Sex                   1                    1.072            1.072         0.12                 0.7
Ethnic group          1                    134.783          134.783       14.74                0.0005
Residual              33                   301.782          9.145
94. The analysis of variance table for a study of the distance between the pupils of the eyes is shown in Table 17.20:
(a) there were 34 observations;
(b) there is good evidence of an ethnic group difference in the population;
(c) we can conclude that there is no difference in inter-pupil distance between men and women;
(d) there were two age groups;
(e) the difference between ethnic groups is likely to be due to a relationship between ethnicity and age in the sample.
Table 17.21. Logistic regression of graft failure after 6 months (Thomas et al. 1993)
Variable           Coef.      Std. Err.   z = coef/se   P        95% Conf.
White cell count   1.238      0.273       4.539         <0.001   0.695
Graft type 1       0.175      0.876       0.200         0.842    -1.570
Graft type 2       0.973      1.030       0.944         0.348    -1.080
Graft type 3       0.038      1.518       0.025         0.980    -2.986
Female             -0.289     0.767       -0.377        0.708    -1.816
Age                0.022      0.035       0.633         0.528    -0.048
Smoker             0.998      0.754       1.323         0.190    -0.504
Diabetic           1.023      0.709       1.443         0.153    -0.389
Constant           -13.726    3.836       -3.578        0.001    -21.369
Number of observations = 84, chi-squared = 38.05, d.f. = 8, P < 0.0001.
95.Table17.21showsthelogisticregressionofveingraftfailureonsomepotentialexplanatoryvariables.Fromthisanalysis:
(a)patientswithhighwhitecellcountsweremorelikelytohavegraftfailure;
(b)thelogoddsofgraftfailureforadiabeticisbetween0.389lessand2.435greaterthanthatforanon-diabetic;
(c)graftsweremorelikelytofailinfemalesubjects,thoughthisisnotsignificant;
(d)therewerefourtypesofgraft;
(e)anyrelationshipbetweenwhitecellcountandgraftfailuremaybeduetosmokershavinghigherwhitecellcounts.
Fig.17.9.Oralandforeheadtemperaturemeasurementsmadeinagroupofpyrexicpatients
96.ForthedatainFigure17.9:
(a)therelationshipcouldbeinvestigatedbylinearregression;
(b)an‘oralsquared’termcouldbeusedtotestwhetherthereisanyevidencethattherelationshipisnotastraightline;
(c)ifan‘oralsquared’termwereincludedtherewouldbe2degreesoffreedomforthemodel;
(d)thecoefficientsofan‘oral’andan‘oralsquared’termwouldbeuncorrelated;
(e)theestimationofthecoefficientofaquadratictermwouldbeimprovedbysubtractingthemeanfromtheoraltemperaturebeforesquaring.
Table 17.22. Cox regression of time to readmission for asthmatic children following discharge from hospital (Mitchell et al. 1994)
Variable                              Coef.     Std. err.   coef/se   P
Boy                                   -0.197    0.088       -2.234    0.026
Age                                   -0.126    0.017       -7.229    <0.001
Previous admissions (square root)     0.395     0.034       11.695    <0.001
Inpatient i.v. therapy                0.267     0.093       2.876     0.004
Inpatient theophylline                -0.728    0.295       -2.467    0.014
Number of observations = 1024, χ² = 167.15, 5 d.f., P < 0.0001.
97.Table17.22showstheresultsofanobservationalstudyfollowingupasthmaticchildrendischargedfromhospital.Fromthistable:
(a)theanalysiscouldonlyhavebeendoneifallchildrenhadbeenreadmittedtohospital;
(b)theproportionalhazardsmodelwouldhavebeenbetterthanCoxregression;
(c) boys have a shorter average time before readmission than do girls;
(d) the use of theophylline prevents readmission to hospital;
(e)childrenwithseveralpreviousadmissionshaveanincreasedriskofreadmission.
Fig.17.10.Cushionvolumeagainstnumberofpairsofsomitesfortwogroupsofmouseembryos(WebbandBrown,personalcommunication)
Table17.23.Numberofsomitesandcushionvolumeinmouseembryos
Normal Trisomy-16
som. c.vol. som. c.vol. som. c.vol. som.
17 2.674 28 3.704 15 0.919 28
20 3.299 31 6.358 17 2.047 28
21 2.486 32 3.966 18 3.302 28
23 1.202 32 7.184 20 4.667 31
23 4.263 34 8.803 20 4.930 32
23 4.620 35 4.373 23 4.942 34
25 4.644 40 4.465 23 6.500 35
25 4.403 42 10.940 23 7.122 36
27 5.417 43 6.035 25 7.688 40
27 4.395 25 4.230 42
27 8.647
17E* Exercise: A multiple regression analysis
Trisomy-16 mice can be used as an animal model for Down's syndrome. This analysis looks at the volume of a region of the heart, the atrioventricular cushion, of a mouse embryo, compared between trisomic and normal embryos. The embryos were at varying stages of development, indicated by the number of pairs of somites (precursors of vertebrae). Figure 17.10 and Table 17.23 show the data. The group was coded 1 = normal, 2 = trisomy-16. Table 17.24 shows the results of a regression analysis and Figure 17.11 shows residual plots.
1.Isthereanyevidenceofadifferenceinvolumebetweengroupsforgivenstageofdevelopment?
2.Figure17.11showsresidualplotsfortheanalysisofTable17.24.Arethereanyfeaturesofthedatawhichmightmaketheanalysisinvalid?
Table 17.24. Regression of cushion volume on number of pairs of somites and group in mouse embryos
Source of variation           Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                         39                   328.976
Due to regression             2                    197.708          98.854        27.86                P < 0.0001
Residual (about regression)   37                   131.268          3.548

Variable   Coef.   Std. Err.   t      P        95% Conf. interval
group      2.44    0.60        4.06   <0.001   1.29 to 3.65
somites    0.27    0.04        6.70   <0.001   0.19 to 0.36
Fig.17.11.ResidualagainstnumberofpairsofsomitesandNormalplotofresidualsfortheanalysisofTable17.24
3.ItappearsfromFigure17.10thattherelationshipbetweenvolumeandnumberofpairsofsomitesmaynotbethesameinthetwogroups.Table17.25showstheanalysisofvarianceforregressionanalysisincludinganinteractionterm.CalculatetheF-ratiototesttheevidencethattherelationshipisdifferentinnormalandtrisomy-16embryos.YoucanfindtheprobabilityfromTable10.1,usingthefactthatthesquarerootofFwith1andndegreesoffreedomistwithndegreesoffreedom.
Table 17.25. Analysis of variance for regression with number of pairs of somites × group interaction
Source of variation           Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                         39                   328.976
Due to regression             3                    207.139          69.046        20.40                P < 0.0001
Residual (about regression)   36                   121.837          3.384
18
Determinationofsamplesize
18.1* Estimation of a population mean
One of the questions most frequently asked of a medical statistician is ‘How large a sample should I take?’ In this chapter we shall see how statistical methods for deciding sample sizes can be used in practice as an aid in designing investigations. The methods we shall use are large sample methods, that is, they assume that large sample methods will be used in the analysis and so take no account of degrees of freedom.
Wecanusetheconceptsofstandarderrorandconfidenceintervaltohelpdecidehowmanysubjectsshouldbeincludedinasample.Ifwewanttoestimatesomepopulationquantity,suchasthemean,andweknowhowthestandarderrorisrelatedtothesamplesize,thenwecancalculatethesamplesizerequiredtogiveaconfidenceintervalwiththedesiredwidth.Thedifficultyisthatthestandarderrormayalsodependeitheronthequantitywewishtoestimate,oronsomeotherpropertyofthepopulation,suchasthestandarddeviation.Wemustestimatethesequantitiesfromdataalreadyavailable,orcarryoutapilotstudytoobtainaroughestimate.Thecalculationofsamplesizecanonlybeapproximateanyway,sotheestimatesusedtodoitneednotbeprecise.
If we want to estimate the mean of a population, we can use the formula for the standard error of a mean, $s/\sqrt{n}$, to estimate the sample size required. For example, suppose we wish to estimate the mean FEV1 in a population of young men. We know that in another study FEV1 had standard deviation s = 0.67 litre (§4.8). We therefore expect the standard error of the mean to be $0.67/\sqrt{n}$. We can set the size of standard error we want and choose the sample size to achieve this. We might decide that a standard error of 0.1 litre is what we want, so that we would estimate the mean to within 1.96 × 0.1 = 0.2 litre. Then: $SE = 0.67/\sqrt{n}$, $n = 0.67^2/SE^2 = 0.67^2/0.1^2 = 45$. We can also see what the standard error and width of the 95% confidence interval would be for different values of n:
n                         10      20      50      100     200     500
standard error            0.212   0.150   0.095   0.067   0.047   0.030
95% confidence interval   ±0.42   ±0.29   ±0.19   ±0.13   ±0.09   ±0.06
So that if we had a sample size of 200, we would expect the 95% confidence interval to be 0.09 litre on either side of the sample mean (1.96 standard errors) whereas with a sample of 50 the 95% confidence interval would be 0.19 litre on either side of the mean.
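The arithmetic of this section is easy to reproduce. The sketch below assumes the standard deviation of 0.67 litre quoted above; the function names are illustrative.

```python
import math

SD_FEV1 = 0.67  # litres, standard deviation taken from the earlier study (§4.8)

def n_for_standard_error(sd, target_se):
    """Sample size giving a chosen standard error of the mean."""
    return (sd / target_se) ** 2

def standard_error_of_mean(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)

print(f"n for SE of 0.1 litre: {n_for_standard_error(SD_FEV1, 0.1):.1f}")

# Standard error and half-width of the 95% confidence interval for several n
for n in (10, 20, 50, 100, 200, 500):
    se = standard_error_of_mean(SD_FEV1, n)
    print(f"n = {n:4d}  SE = {se:.3f}  95% CI = ±{1.96 * se:.2f}")
```

Quadrupling the sample size only halves the standard error, which is why the gain in precision tails off so quickly along the table.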
18.2* Estimation of a population proportion
When we wish to estimate a proportion we have a further problem. The standard error depends on the very quantity which we wish to estimate. We must guess the proportion first. For example, suppose we wish to estimate the prevalence of a disease, which we suspect to be about 2%, to within 5%, i.e. to the nearest 1 per 1000. The unknown proportion, p, is guessed to be 0.02 and we want the 95% confidence interval to be 0.001 on either side, so the standard error must be half this, 0.0005.

$$0.0005 = \sqrt{\frac{0.02(1-0.02)}{n}}, \qquad n = \frac{0.02(1-0.02)}{0.0005^2} = 78\,400$$
The accurate estimation of very small proportions requires very large samples. This is a rather extreme example and we do not usually need to estimate proportions with such accuracy. A wider confidence interval, obtainable with a smaller sample, is usually acceptable. We can also ask ‘If we can only afford a sample size of 1000, what will be the standard error?’

$$\sqrt{\frac{0.02(1-0.02)}{1000}} = 0.0044$$
The95%confidencelimitswouldbe,roughly,p±0.009.Forexample,iftheestimatewere0.02,the95%confidencelimitswouldbe0.011to0.029.Ifthisaccuracyweresufficientwecouldproceed.
TheseestimatesofsamplesizearebasedontheassumptionthatthesampleislargeenoughtousetheNormaldistribution.Ifaverysmallsampleisindicateditwillbeinadequateandothermethodsmustbeusedwhicharebeyondthescopeofthisbook.
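A short sketch of the two calculations in this section, assuming the guessed prevalence of 0.02 and the Normal approximation to the Binomial standard error; the function names are illustrative.

```python
import math

def n_for_proportion(p_guess, target_se):
    """Sample size so that sqrt(p(1 - p)/n) equals the target standard error."""
    return p_guess * (1 - p_guess) / target_se ** 2

def se_of_proportion(p_guess, n):
    """Standard error of a sample proportion for sample size n."""
    return math.sqrt(p_guess * (1 - p_guess) / n)

# Prevalence guessed at 2%, 95% CI wanted to be 0.001 either side,
# so the standard error is taken as half of that, as in the text.
n_needed = n_for_proportion(0.02, 0.0005)

# What a sample of 1000 would deliver instead
se_1000 = se_of_proportion(0.02, 1000)
half_width_1000 = 1.96 * se_1000

print(f"n needed: {n_needed:.0f}")
print(f"n = 1000: SE = {se_1000:.4f}, 95% limits p ± {half_width_1000:.3f}")
```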
18.3* Sample size for significance tests
We often want to demonstrate the existence of a difference or relationship as well as wanting to estimate its magnitude, as in a clinical trial, for example. We base these sample size calculations on significance tests, using the power of a test (§9.9) to help choose the sample size required to detect a difference if it exists. The power of a test is related to the postulated difference in the population, the standard error of the sample difference (which in turn depends on the sample size), and the significance level, which we usually take to be α = 0.05. These quantities are linked by an equation which enables us to determine any one of them given the others. We can then say what sample size would be required to detect any given difference. We then decide what difference we need to be able to detect. This might be a difference which would have clinical importance or a difference which we think the treatment may produce.
Suppose we have a sample which gives an estimate d of the population difference $\mu_d$. We assume d comes from a Normal distribution with mean $\mu_d$ and has standard error SE(d). Here d might be the difference between two means, two proportions, or anything else we can calculate from data. We are interested in testing the null hypothesis that there is no difference in the population, i.e. $\mu_d = 0$. We are going to use a significance test at the α level, and want the power, the probability of detecting a significant difference, to be P.
I shall define $u_\alpha$ to be the value such that the Standard Normal distribution (mean 0 and variance 1) is less than $-u_\alpha$ or greater than $u_\alpha$ with probability α. For example, $u_{0.05} = 1.96$. The probability of lying between $-u_\alpha$ and $u_\alpha$ is 1 − α. Thus $u_\alpha$ is the two sided α probability point of the Standard Normal distribution, as shown in Table 7.2.
If the null hypothesis were true, the test statistic d/SE(d) would be from a Standard Normal distribution. We reject the null hypothesis at the α level if the test statistic is greater than $u_\alpha$ or less than $-u_\alpha$, 1.96 for the usual 5% significance level. For significance we must have:

$$\frac{d}{SE(d)} < -u_\alpha \quad \text{or} \quad \frac{d}{SE(d)} > u_\alpha$$

Let us assume that we are trying to detect a difference such that d will be greater than 0. The first alternative is then extremely unlikely and can be ignored. Thus we must have, for a significant difference: $d/SE(d) > u_\alpha$ so $d > u_\alpha SE(d)$. The critical value which d must exceed is $u_\alpha SE(d)$.
Now, d is a random variable, and for some samples it will be greater than its mean, $\mu_d$, for some it will be less than its mean. d is an observation from a Normal distribution with mean $\mu_d$ and variance $SE(d)^2$. We want d to exceed the critical value with probability P, the chosen power of the test. The value of the Standard Normal distribution which is exceeded with probability P is $-u_{2(1-P)}$ (see Figure 18.1). (1 − P) is often represented as β (beta). This is the probability of failing to obtain a significant difference when the null hypothesis is false and the population difference is $\mu_d$. It is the probability of a Type II error (§9.4). The value which d exceeds with probability P is the mean minus $u_{2(1-P)}$ standard deviations: $\mu_d - u_{2(1-P)}SE(d)$. Hence for significance this must exceed the critical value, $u_\alpha SE(d)$. This gives

$$\mu_d - u_{2(1-P)}SE(d) = u_\alpha SE(d)$$

Putting the correct standard error formula into this will yield the required sample size. We can rearrange it as

$$\mu_d^2 = \left(u_\alpha + u_{2(1-P)}\right)^2 SE(d)^2$$

This is the condition which must be met if we are to have a probability P of detecting a significant difference at the α level. We shall use the expression $(u_\alpha + u_{2(1-P)})^2$ a lot, so for convenience I shall denote it by f(α,P). Table 18.1 shows the values of the factor f(α,P) for different values of α and P. The usual value used for α is 0.05, and P is usually 0.80, 0.90, or 0.95.
Fig. 18.1. Relationship between P and $u_{2(1-P)}$
Table 18.1. Values of $f(\alpha,P) = (u_\alpha + u_{2(1-P)})^2$ for different P and α
Power, P   Significance level, α
           0.05    0.01
0.50       3.8     6.6
0.70       6.2     9.6
0.80       7.9     11.7
0.90       10.5    14.9
0.95       13.0    17.8
0.99       18.4    24.0
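The factor f(α,P) can be computed directly from Standard Normal quantiles. The sketch below uses `NormalDist` from the Python standard library; the mapping of $u_\alpha$ to the quantile at 1 − α/2 and of $u_{2(1-P)}$ to the quantile at P follows from the definitions above, and the function name `f` simply mirrors the notation of the text.

```python
from statistics import NormalDist

def f(alpha, power):
    """f(alpha, P) = (u_alpha + u_2(1-P))^2 from Standard Normal quantiles."""
    standard_normal = NormalDist()              # mean 0, standard deviation 1
    u_alpha = standard_normal.inv_cdf(1 - alpha / 2)   # two-sided alpha point
    u_power = standard_normal.inv_cdf(power)           # exceeded with prob 1 - P
    return (u_alpha + u_power) ** 2

for power in (0.50, 0.70, 0.80, 0.90, 0.95, 0.99):
    print(f"P = {power:.2f}  f(0.05, P) = {f(0.05, power):.2f}  "
          f"f(0.01, P) = {f(0.01, power):.2f}")
```

The values agree with Table 18.1 to within about 0.1; small differences reflect the rounding used when the table was prepared.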
Sometimeswedonotexpectthenewtreatmenttobebetterthanthestandardtreatment,buthopethatitwillbeasgood.Wewanttotesttreatmentswhichmaybeasgoodastheexistingtreatmentbecausethenewtreatmentmaybecheaper,havefewersideeffects,belessinvasive,orunderourpatent.Wecannotusethepowermethodbasedonthedifferencewewanttobeabletodetect,becausewearenotlookingforadifference.Whatwedoisspecifyhowdifferentthetreatmentsmightbeinthepopulationandstillberegardedasequivalent,anddesignourstudytodetectsuchadifference.Thiscangetrathercomplicatedandspecialised,soIshallleavethedetailstoMachinetal.(1998).
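18.4* Comparison of two means
When we are comparing the means of two samples, sample sizes $n_1$ and $n_2$, from populations with means $\mu_1$ and $\mu_2$, with the variance of the measurements being $\sigma^2$, we have $\mu_d = \mu_1 - \mu_2$ and

$$SE(d) = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}$$

so the equation becomes:

$$(\mu_1 - \mu_2)^2 = f(\alpha,P)\,\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

For example, suppose we want to compare biceps skinfold in patients with Crohn's disease and coeliac disease, following up the inconclusive comparison of biceps skinfold in Table 10.4 with a larger study. We shall need an estimate of the variability of biceps skinfold in the population we are considering. We can usually get this from the medical literature, or as here from our own data. If not we must do a pilot study, a small preliminary investigation to collect some data and calculate the standard deviation. For the data of Table 10.4, the within-groups standard deviation is 2.3 mm. We must decide what difference we want to detect. In practice this may be difficult. In my small study the mean skinfold thickness in the Crohn's patients was 1 mm greater than in my coeliac patients. I will design my larger study to detect a difference of 0.5 mm. I shall take the usual significance level of 0.05. I want a fairly high power, so that there is a high probability of detecting a difference of the chosen size should it exist. I shall take 0.90, which gives f(α,P) = 10.5 from Table 18.1. The equation becomes:

$$0.5^2 = 10.5 \times 2.3^2 \times \left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

We have one equation with two unknowns, so we must decide on the relationship between $n_1$ and $n_2$. I shall try to recruit equal numbers in the two groups:

$$n_1 = n_2 = n, \qquad 0.5^2 = 10.5 \times 2.3^2 \times \frac{2}{n}, \qquad n = \frac{10.5 \times 2.3^2 \times 2}{0.5^2} = 444.36$$

and I need 444 subjects in each group.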
Itmaybethatwedonotknowexactlywhatsizeofdifferenceweareinterestedin.Ausefulapproachistolookatthesizeofthedifferencewecoulddetectusingdifferentsamplesizes,asinTable18.2.Thisisdonebyputtingdifferentvaluesofninthesamplesizeequation.
Table 18.2. Difference in mean biceps skinfold thickness (mm) detected at the 5% significance level with power 90% for different sample sizes, equal groups
Size of each group, n   Difference detected with probability 0.90
10                      3.33
20                      2.36
50                      1.49
100                     1.05
200                     0.75
500                     0.47
1000                    0.33
Table 18.3. Sample size required in each group to detect a difference between two means at the 5% significance level with power 90%, using equally sized samples
Difference in         n        Difference in         n      Difference in         n
standard deviations            standard deviations          standard deviations
0.01                  210000   0.1                   2100   0.6                   58
0.02                  52500    0.2                   525    0.7                   43
0.03                  23333    0.3                   233    0.8                   33
0.04                  13125    0.4                   131    0.9                   26
0.05                  8400     0.5                   84     1.0                   21
Ifwemeasurethedifferenceintermsofstandarddeviations,wecanmakeageneraltable.Table18.3givesthesamplesizerequiredtodetectdifferencesbetweentwoequallysizedgroups.Altman(1982)givesaneatgraphicalmethodofcalculation.
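For equal groups the sample size equation rearranges to n = 2 f(α,P) σ²/(µ1 − µ2)². The sketch below takes f(α,P) = 10.5 from Table 18.1 (5% significance, 90% power) and the within-groups standard deviation of 2.3 mm used in the skinfold example; the function names are illustrative.

```python
import math

F_05_90 = 10.5  # f(alpha, P) for alpha = 0.05 and P = 0.90, from Table 18.1

def n_per_group(difference, sd, f=F_05_90):
    """Subjects needed in each of two equal groups to detect `difference`."""
    return 2 * f * sd ** 2 / difference ** 2

def detectable_difference(n, sd, f=F_05_90):
    """Difference detectable with two equal groups of size n."""
    return math.sqrt(2 * f * sd ** 2 / n)

# Biceps skinfold example: detect 0.5 mm when the standard deviation is 2.3 mm
print(f"n per group: {n_per_group(0.5, 2.3):.2f}")

# Detectable differences for a range of group sizes (compare Table 18.2)
for n in (10, 20, 50, 100, 200, 500, 1000):
    print(f"n = {n:4d}  difference = {detectable_difference(n, 2.3):.2f} mm")

# Working in standard deviation units (sd = 1) gives the general Table 18.3
for d in (0.1, 0.2, 0.5, 1.0):
    print(f"difference = {d} SD  n = {n_per_group(d, 1.0):.0f}")
```

Setting `sd=1.0` is what makes Table 18.3 general: any difference expressed in standard deviations can be looked up without knowing the measurement scale.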
Wedonotneedtohaven1=n2=n.Wecancalculateµ1-µ2fordifferentcombinationsofn1andn2.Thesizeofdifference,intermsofstandarddeviations,whichwouldbedetectedisgiveninTable18.4.Wecanseefromthisthatwhatmattersisthesizeofthesmallersample.Forexample,ifwehave10ingroup1and20ingroup2,wedonotgainverymuchbyincreasingthesizeofgroup2:increasinggroup2from20to100produceslessadvantagethanincreasinggroup1from10to20.Inthiscasetheoptimumisclearlytohavesamplesofequalsize.
Table 18.4. Difference (in standard deviations) detectable at the 5% significance level with power 90% for different sample sizes, unequal groups
n2      n1
        10     20     50     100    200    500    1000
10      1.45   1.25   1.13   1.08   1.05   1.03   1.03
20      1.25   1.03   0.85   0.80   0.75   0.75   0.73
50      1.13   0.85   0.65   0.55   0.50   0.48   0.48
100     1.08   0.80   0.55   0.45   0.40   0.35   0.35
200     1.05   0.75   0.50   0.40   0.33   0.28   0.25
500     1.03   0.75   0.48   0.35   0.28   0.20   0.18
1000    1.03   0.73   0.48   0.35   0.25   0.18   0.15
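With unequal groups the detectable difference in standard deviation units is the square root of f(α,P)(1/n1 + 1/n2). A minimal sketch, again taking f(α,P) = 10.5 from Table 18.1; the function name is illustrative, and the printed table appears to be rounded more coarsely than two decimal places, so agreement is approximate.

```python
import math

F_05_90 = 10.5  # f(alpha, P) for alpha = 0.05 and P = 0.90, from Table 18.1

def detectable_sd_units(n1, n2, f=F_05_90):
    """Detectable difference, in standard deviations, for group sizes n1, n2."""
    return math.sqrt(f * (1 / n1 + 1 / n2))

sizes = (10, 20, 50, 100, 200, 500, 1000)
for n2 in sizes:
    row = "  ".join(f"{detectable_sd_units(n1, n2):.2f}" for n1 in sizes)
    print(f"n2 = {n2:4d}  {row}")

# The smaller sample dominates: starting from groups of 10 and 20,
# enlarging the larger group to 100 helps less than enlarging the
# smaller group to 20.
enlarge_larger_group = detectable_sd_units(10, 100)
enlarge_smaller_group = detectable_sd_units(20, 20)
print(f"{enlarge_larger_group:.2f} versus {enlarge_smaller_group:.2f}")
```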
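18.5* Comparison of two proportions
Using the same approach, we can also calculate the sample sizes for comparing two proportions. If we have two samples with sizes $n_1$ and $n_2$ from Binomial populations with proportions $p_1$ and $p_2$ the difference is $\mu_d = p_1 - p_2$, the standard error of the difference between the sample proportions (§8.6) is:

$$SE(d) = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$

If we put these into the previous formula we have:

$$(p_1 - p_2)^2 = f(\alpha,P)\left(\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}\right)$$

The size of the proportions, $p_1$ and $p_2$, is important, as well as their difference. (The significance test implied here is similar to the chi-squared test for a 2 by 2 table.) When the sample sizes are equal, i.e. $n_1 = n_2 = n$, we have

$$n = \frac{f(\alpha,P)\left(p_1(1-p_1) + p_2(1-p_2)\right)}{(p_1 - p_2)^2}$$

There are several slight variations on this formula. Different computer programs may therefore give slightly different sample size estimates.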
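Suppose we wish to compare the survival rate with a new treatment with that with an old treatment, where it is about 60%. What values of $n_1$ and $n_2$ will have 90% chance of giving a significant difference at the 5% level for different values of $p_2$? For P = 0.90 and α = 0.05, f(α,P) = 10.5. Suppose we wish to detect an increase in the survival rate on the new treatment to 80%, so $p_2 = 0.80$, and $p_1 = 0.60$.

$$n = \frac{10.5 \times (0.80 \times 0.20 + 0.60 \times 0.40)}{(0.80 - 0.60)^2} = 105$$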
Table 18.5. Sample size in each group required to detect different proportions p2 when p1 = 0.6 at the 5% significance level with power 90%, equal groups
p2      n
0.90    39
0.80    105
0.70    473
0.65    1964
Table 18.6. n2 for different n1 and p2 when p1 = 0.05 at the 5% significance level with power 90%
p2      n1
        50      100     200     500     1000    2000    5000
0.06    .       .       .       .       .       .       237000
0.07    .       .       .       .       .       4500    2300
0.08    .       .       .       .       1900    1200    970
0.10    .       .       1500    630     472     420     390
0.15    5400    270     180     150     140     140     140
0.20    134     96      84      78      76      76      75
We would require 105 in each group to have a 90% chance of showing a significant difference if the population proportions were 0.6 and 0.8.
When we do not have a clear idea of the value of p2 in which we are interested, we can calculate the sample size required for several proportions, as in Table 18.5. It is immediately apparent that to detect small differences between proportions we need very large samples.
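As a check on Table 18.5, the equal-groups calculation can be sketched in a few lines. This is a minimal illustration assuming the formula n = f(α,P)[p1(1 - p1) + p2(1 - p2)]/(p1 - p2)² with f(α,P) = 10.5; the function names are illustrative, and the small tolerance subtracted before rounding up is only there to guard against floating-point noise.

```python
import math


def n_per_group(p1, p2, f=10.5):
    """Unrounded sample size per group for comparing two proportions (equal groups)."""
    return f * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2


def n_per_group_rounded(p1, p2, f=10.5):
    """Round up to a whole subject, tolerating tiny floating-point overshoot."""
    return math.ceil(n_per_group(p1, p2, f) - 1e-9)


for p2 in (0.90, 0.80, 0.70, 0.65):
    print(f"p2 = {p2:.2f}: n = {n_per_group_rounded(0.60, p2)}")
```

Halving the difference from 0.10 to 0.05 roughly quadruples the required sample, because the difference enters the formula squared.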
The case where samples are of equal size is usual in experimental studies, but not in observational studies. Suppose we wish to compare the prevalence of a certain condition in two populations. We expect that in one population it will be 5% and that it may be more common in the second. We can rearrange the equation:
$$n_2 = \frac{f(\alpha,P)\,p_2(1-p_2)}{(p_1 - p_2)^2 - f(\alpha,P)\,p_1(1-p_1)/n_1}$$
Table 18.6 shows n2 for different n1 and p2. For some values of n1 we get a negative value of n2. This means that no value of n2 is large enough. It is clear that when the proportions themselves are small, the detection of small differences requires very large samples indeed.
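The unequal-groups calculation behind Table 18.6 can be sketched as follows. It assumes that the general relation (p1 - p2)² = f(α,P)[p1(1 - p1)/n1 + p2(1 - p2)/n2] is solved for n2, with f(α,P) = 10.5; the function name `n2_required` is illustrative. Cells where the denominator is close to zero are extremely sensitive to rounding of f(α,P), so only the well-conditioned cells should be expected to agree closely with the printed table.

```python
def n2_required(n1, p1, p2, f=10.5):
    """Size of the second sample needed, given the first sample size n1.

    Returns None when the denominator is not positive, i.e. when no
    value of n2 is large enough (shown as '.' in Table 18.6).
    """
    denominator = (p1 - p2) ** 2 - f * p1 * (1 - p1) / n1
    if denominator <= 0:
        return None
    return f * p2 * (1 - p2) / denominator


for n1 in (50, 100, 1000):
    for p2 in (0.10, 0.20):
        print(n1, p2, n2_required(n1, 0.05, p2))
```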
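18.6* Detecting a correlation
Investigations are often set up to look for a relationship between two continuous variables. It is convenient to treat this as an estimation of or test of a correlation coefficient. The correlation coefficient has an awkward distribution, which tends only very slowly to the Normal, even when both variables themselves follow a Normal distribution. We can use Fisher's z transformation:
$$z = \frac{1}{2}\log_e\left(\frac{1+r}{1-r}\right)$$
which follows a Normal distribution with mean
$$z_\rho = \frac{1}{2}\log_e\left(\frac{1+\rho}{1-\rho}\right) + \frac{\rho}{2(n-1)}$$
and variance 1/(n - 3) approximately, where ρ is the population correlation coefficient and n is the sample size (§11.10). For sample size calculations we can approximate zρ by
$$z_\rho = \frac{1}{2}\log_e\left(\frac{1+\rho}{1-\rho}\right)$$
Thus we have
$$\left[\frac{1}{2}\log_e\left(\frac{1+\rho}{1-\rho}\right)\right]^2 = f(\alpha,P)\,\frac{1}{n-3}$$
and we can estimate n, ρ or P given the other two. Table 18.7 shows the sample size required to detect a correlation coefficient with a power of P = 0.9 and a significance level α = 0.05.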
Table 18.7. Approximate sample size required to detect a correlation at the 5% significance level with power 90%
ρ n ρ n ρ n
0.01 100000 0.1 1000 0.6 25
0.02 26000 0.2 260 0.7 17
0.03 12000 0.3 110 0.8 12
0.04 6600 0.4 62 0.9 8
0.05 4200 0.5 38
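Table 18.7 can be reproduced approximately with the sketch below. It assumes the approximation zρ = ½ log((1 + ρ)/(1 - ρ)), variance 1/(n - 3) and f(α,P) = 10.5, so that n = f(α,P)/zρ² + 3; the function name is illustrative. The printed table appears to be rounded to about two significant figures for the larger sample sizes.

```python
import math


def n_for_correlation(rho, f=10.5):
    """Approximate sample size needed to detect a population correlation rho."""
    z_rho = 0.5 * math.log((1 + rho) / (1 - rho))  # Fisher's z transformation
    return f / z_rho ** 2 + 3


for rho in (0.1, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"rho = {rho:.1f}: n = {n_for_correlation(rho):.1f}")
```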
18.7* Accuracy of the estimated sample size
In this chapter I have assumed that samples are sufficiently large for sampling distributions to be approximately Normal and for estimates of variance to be good estimates. With very small samples this may not be the case. Various more accurate methods exist, but any sample size calculation is approximate and, except for very small samples, say less than 10, the methods described above should be adequate. When the sample is very small, we might need to replace the significance test component of f(α,P) by the corresponding number from the t distribution.
These methods depend on assumptions about the size of difference sought and the variability of the observations. The population to be studied may not have exactly the same characteristics as those from which the standard deviation or proportions were estimated. The likely effects of changes in these can be examined by putting different values of them in the formula. However, there is always an element of venturing into the unknown when embarking on a study and we can never be sure that the sample and population will be as we expect. The determination of sample size as described above is thus only a guide, and it is probably as well always to err on the side of a larger sample when coming to a final decision.
The choice of power is arbitrary, in that there is no optimum choice of power for a study. I usually recommend 90%, but 80% is often quoted. This gives smaller estimated sample sizes, but, of course, a greater chance of failing to detect effects.
For a fuller treatment of sample size estimation and fuller tables see Machin et al. (1998) and Lemeshow et al. (1990).
18.8* Trials randomized in clusters
When we randomize by cluster rather than individual (§2.11) we lose power compared to an individually-randomized trial of the same size. Hence to get the power we want, we must increase the sample size from that required for an individually randomized trial. The ratio of the number of patients required for a cluster trial to that for a simply randomized trial is called the design effect of the study. It depends on the number of subjects per cluster. For the purpose of sample size calculations we usually assume this is constant.
If the outcome measurement is continuous, e.g. serum cholesterol, a simple method of analysis is based on the mean of the observations for all subjects in the cluster, and compares these means between the treatment groups (§10.13). We will denote the variance of observations within one cluster by s_w² and assume that this variance is the same for all clusters. If there are m subjects in each cluster then the variance of a single sample mean is s_w²/m. The true cluster mean (unknown) will vary from cluster to cluster, with variance s_c² (see §10.12). The observed variance of the cluster means will be the sum of the variance between clusters and the variance within clusters, i.e. variance of outcome = s_c² + s_w²/m. Hence the standard error for the difference between means is given by
$$\sqrt{\left(s_c^2 + \frac{s_w^2}{m}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where n1 and n2 are the numbers of clusters in the two groups. For most trials n1 = n2 = n, so
$$\sqrt{\left(s_c^2 + \frac{s_w^2}{m}\right)\frac{2}{n}}$$
Hence, using the general method of §18.3, we can calculate the required number of clusters by
$$n = \frac{2 f(\alpha,P)\left(s_c^2 + s_w^2/m\right)}{(\mu_1 - \mu_2)^2}$$
When the outcome is a dichotomous, ‘yes or no’ variable, we replace s_w² by p(1 - p), where p is the probability of a ‘yes’.
For example, in a proposed study of a behavioural intervention to lower cholesterol in general practice, practices were to be randomised into two groups, one to offer intensive dietary intervention by specially trained practice nurses using a behavioural approach and the other to usual general practice care. The outcome measure would be mean cholesterol levels in patients attending each practice one year later. Estimates of between-practice variance and within-practice variance were obtained from the MRC thrombosis prevention trial (Meade et al. 1992) and were s_c² = 0.0046 and s_w² = 1.28 respectively. The minimum difference considered to be clinically relevant was 0.1 mmol/l. If we recruit 50 patients per practice, we would have s² = s_c² + s_w²/m = 0.0046 + 1.28/50 = 0.0302. If we choose power P = 0.90 and significance level α = 0.05, from Table 18.1 f(α,P) = 10.5. The number of practices required to detect a difference of 0.1 mmol/l is given by n = 10.5 × 0.0302 × 2/0.1² = 63 in each group. This would give us 63 × 50 = 3150 patients in each group. A completely randomized trial without clusters would have s² = 0.0046 + 1.28 = 1.2846 and we would need n = 10.5 × 1.2846 × 2/0.1² = 2698 patients per group. Thus the design effect of having clusters of 50 patients is 3150/2698 = 1.17.
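The arithmetic of this example can be sketched as below, assuming the formula n = 2 f(α,P)(s_c² + s_w²/m)/(difference)² for the number of clusters per group; the function and variable names are illustrative. Treating an individually randomized trial as the special case m = 1 recovers the 2698 patients per group quoted above. The unrounded ratio is about 1.18 rather than 1.17, because the text rounds the number of practices to 63 before dividing.

```python
def clusters_per_group(s2c, s2w, m, difference, f=10.5):
    """Number of clusters per group for a continuous outcome.

    s2c: between-cluster variance, s2w: within-cluster variance,
    m: subjects per cluster, difference: smallest difference to detect.
    """
    variance_of_cluster_mean = s2c + s2w / m
    return 2.0 * f * variance_of_cluster_mean / difference ** 2


practices = clusters_per_group(0.0046, 1.28, 50, 0.1)    # about 63.4 practices
individuals = clusters_per_group(0.0046, 1.28, 1, 0.1)   # m = 1: about 2698 patients
deff_example = practices * 50 / individuals              # unrounded design effect

print(round(practices, 2), round(individuals, 1), round(deff_example, 3))
```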
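The equation for the design effect is
$$\text{Deff} = \frac{s_c^2 + s_w^2/m}{\left(s_c^2 + s_w^2\right)/m}$$
If we calculate an intra-class correlation coefficient (ICC) for these clusters (§11.13), we have
$$\text{ICC} = \frac{s_c^2}{s_c^2 + s_w^2}$$
In this context, the ICC is called the intra-cluster correlation coefficient. By a bit of algebra we get
$$\text{Deff} = 1 + (m - 1)\,\text{ICC}$$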
If there is only one observation per cluster, m = 1 and the design effect is 1.0 and the two designs are the same. Otherwise, the larger the ICC, i.e. the more important the variation between clusters is, the bigger the design effect and the more subjects we will need to get the same power as a simply-randomized study. Even a small ICC will have an impact if the cluster size is large. The X-ray guidelines study (§10.13) had ICC = 0.019. A study with the same ICC and m = 50 referrals per practice would have design effect Deff = 1 + (50 - 1) × 0.019 = 1.93. Thus it would require almost twice as many subjects as a trial where patients were randomized to treatment individually.
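The relation between the ICC and the design effect is simple enough to sketch directly. The two helper names below are illustrative; the formulas are ICC = s_c²/(s_c² + s_w²) and Deff = 1 + (m - 1) × ICC as given above.

```python
def icc(s2c, s2w):
    """Intra-cluster correlation from between- and within-cluster variances."""
    return s2c / (s2c + s2w)


def design_effect(m, icc_value):
    """Design effect for clusters of constant size m."""
    return 1 + (m - 1) * icc_value


# X-ray guidelines example from the text: ICC = 0.019, 50 referrals per practice.
print(round(design_effect(50, 0.019), 2))

# Cholesterol example: variances 0.0046 and 1.28, 50 patients per practice.
print(round(design_effect(50, icc(0.0046, 1.28)), 3))
```

The second call agrees with the ratio of sample sizes in the cholesterol example, which is a useful check that the two routes to the design effect are equivalent.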
The main difficulty in calculating sample size for cluster randomized studies is obtaining an estimate of the between-cluster variation or ICC. Estimates of variation between individuals can often be obtained from the literature but even studies that use the cluster as the unit of analysis may not publish their results in such a way that the between-practice variation can be estimated. Donner et al. (1990), recognizing this problem, recommended that authors publish the cluster-specific event rates observed in their trial. This would enable other workers to use this information to plan further studies.
In some trials, where the intervention is directed at the individual subjects and the number of subjects per cluster is small, we may judge that the design effect can be ignored. On the other hand, where the number of subjects per cluster is large, an estimate of the variability between clusters will be very important. When the number of clusters is very small, we may have to use the small sample adjustments mentioned in §18.7.
18M* Multiple choice questions 98 to 100
(Each answer is true or false)
98.* The power of a two-sample t test:
(a) increases if the sample sizes are increased;
(b) depends on the difference between the population means which we wish to detect;
(c) depends on the difference between the sample means;
(d) is the probability that the test will detect a given population difference;
(e) cannot be zero.
99.* The sample size required for a study to compare two proportions:
(a) depends on the magnitude of the effect we wish to detect;
(b) depends on the significance level we wish to employ;
(c) depends on the power we wish to have;
(d) depends on the anticipated values of the proportions themselves;
(e) should be decided by adding subjects until the difference is significant.
100.* The sample size required for a study to estimate a mean:
(a) depends on the width of the confidence interval which we want;
(b) depends on the variability of the quantity being studied;
(c) depends on the power we wish to have;
(d) depends on the anticipated value of the mean;
(e) depends on the anticipated value of the standard deviation.
18E* Exercise: Estimation of sample sizes
1. What sample size would be required to estimate a 95% reference interval using the Normal distribution method, so that the 95% confidence interval for the reference limits were at most 20% of the reference interval size?
2. How big a sample would be required for an opinion pollster to estimate voter preferences to within two percentage points?
3. Mortality from myocardial infarction after admission to hospital is about 15%. How many patients would be required for a clinical trial to detect a 10% reduction in mortality, i.e. to 13.5%, if the power required was 90%? How many would be needed if the power were only 80%?
4. How many patients would be required in a clinical study to compare an enzyme concentration in patients with a particular disease and controls, if differences of less than one standard deviation would not be clinically important? If there was already a sample of measurements from 100 healthy controls, how many disease cases would be required?
5. In a proposed trial of a health promotion programme, the programme was to be implemented across a whole county. The plan was to use four counties, two counties to be allocated to receive the programme and two counties to act as controls. The programme would be evaluated by a survey of samples of about 750 subjects drawn from the at-risk populations in each county. A conventional sample size calculation, which ignored the clustering, had indicated that 1500 subjects in each treatment group would be required to give power 80% to detect the required difference. The applicants were aware of the problem of cluster randomisation and the need to take it into account in the analysis, e.g. by analysis at the level of the cluster (county). They had an estimate of the intracluster correlation = 0.005, based on a previous study. They argued that this was so small that they could ignore the clustering. Were they correct?
19
Solutions to exercises
Some of the multiple choice questions are quite hard. If you score +1 for a correct answer, -1 for an incorrect answer, and 0 for a part which you omitted, I would regard 40% as the pass level, 50% as good, 60% as very good, and 70% as excellent. These questions are hard to set and some may be ambiguous, so you will not score 100%.
Solution to Exercise 2M: Multiple choice questions 1 to 6
1. FFFFF. Controls should be treated in the same place at the same time, under the same conditions other than the treatment under test (§2.1). All must be willing and eligible to receive either treatment (§2.4).
2. FTFTF. Random allocation is done to achieve comparable groups, allocation being unrelated to the subjects' characteristics (§2.2). The use of random numbers helps to prevent bias in recruitment (§2.3).
3. TFFFT. Patients do not know their treatment, but they usually do know that they are in a trial (§2.9). Not the same as a cross-over trial (§2.6).
4. FFFFF. Vaccinated and refusing children are self-selected (§2.4). We analyse by intention to treat (§2.5). We can compare the effect of a vaccination programme by comparing the whole vaccination group, vaccinated and refusers, to the controls.
5. TFTTT. §2.6. The order is randomized.
6. FFTTT. §2.8, §2.9. The purpose of placebos is to make dissimilar treatments appear similar. Only in randomized trials can we rely on comparability, and then only within the limits of random variation (§2.2).
Solution to Exercise 2E
1. It was hoped that women in the KYM group would be more satisfied with their care. The knowledge that they would receive continuity of care was an important part of the treatment, and so the lack of blindness is essential. More difficult is that KYM women were given a choice and so may have felt more committed to whichever scheme, KYM or standard, they had chosen, than did the control group. We must accept this element of patient control as part of the treatment.
2. The study should be (and was) analysed by intention to treat (§2.5). As often happens, the refusers did worse than did the acceptors of KYM, and worse than the control group. When we compare all those allocated to KYM with those allocated to control, there is very little difference (Table 19.1).
Table 19.1. Method of delivery in the KYM study
Method of delivery   Allocated to KYM    Allocated to control
                     %      n            %      n
Normal               79.7   382          74.8   354
Instrumental         12.5   60           17.8   84
Caesarian            7.7    37           7.4    35
3. Women had booked for hospital antenatal care expecting the standard service. Those allocated to this therefore received what they had requested. Those allocated to the KYM scheme were offered a treatment which they could refuse if they wished, refusers getting the care for which they had originally booked. No extra examinations were carried out for research purposes, the only special data being the questionnaires, which could be refused. There was therefore no need to get the women's permission for the randomization. I thought this was a convincing argument.
Solution to Exercise 3M: Multiple choice questions 7 to 13
7. FTTTT. A population can be anything (§3.3).
8. TFFFT. A census tells us who is there on that day, and only applies to current in-patients. The hospital could be quite unusual. Some diagnoses are less likely than others to lead to admission or to long stay (§3.2).
9. TFFTF. All members and all samples have equal chances of being chosen (§3.4). We must stick to the sample the random process produces. Errors can be estimated using confidence intervals and significance tests. Choice does not depend on the subject's characteristics at all, except for its being in the population.
10. FTTFT. Some populations are unidentifiable and some cannot be listed easily (§3.4).
11. FFFTF. In a case-control study we start with a group with the disease, the cases, and a group without the disease, the controls (§3.8).
12. FTFTT. We must have a cohort or case-control study to get enough cases (§3.7, §3.8).
13. TTTTF. This is a random cluster sample (§3.4). Each patient had the same chance of their hospital being chosen and then the same chance of being chosen within the hospital. This would not be so if we chose a fixed number from each hospital rather than a fixed proportion, as those in small hospitals would be more likely to be chosen than those in large hospitals. In part (e), what about a sample with patients in every hospital?
Solution to Exercise 3E
1. Many cases of infection may be unreported, but there is not much that could be done about that. Many organisms produce similar symptoms, hence the need for laboratory confirmation. There are many sources of infection, including direct transmission, hence the exclusion of cases exposed to other water supplies and to infected people.
2. Controls must be matched for age and sex as these may be related to their exposure to risk factors such as handling raw meat. Inclusion of controls who may have had the disease would have weakened any relationships with the cause, and the same exclusion criteria were applied as for the cases, to keep them comparable.
3. Data are obtained by recall. Cases may remember events in relation to the disease more easily than controls in relation to the same time. Cases may have been thinking about possible causes of the disease and so be more likely to recall milk attacks. The lack of positive association with any other risk factors suggests that this is not important here.
4. I was convinced. The relationship is very strong and these scavenging birds are known to carry the organism. There was no relationship with any other risk factor. The only problem is that there was little evidence that these birds had actually attacked the milk. Others have suggested that cats may also remove the tops of milk bottles to drink the milk and may be the real culprits (Balfour 1991).
5. Further studies: testing of attacked milk bottles for Campylobacter (have to wait for the next year). Possibly a cohort study, asking people about history of bird attacks and drinking attacked milk, then follow for future Campylobacter (and other) infections. Possibly an intervention study. Advise people to protect their milk and observe the subsequent pattern of infection.
Solution to Exercise 4M: Multiple choice questions 14 to 19
14. TFFTF. §4.1. Parity is quantitative and discrete, height and blood pressure are continuous.
15. TTFTF. §4.1. Age last birthday is discrete, exact age includes years and fraction of a year.
16. FFTFT. §4.4, §4.6. It could have more than one mode, we cannot say. Standard deviation is less than variance if the variance is greater than one (§4.7,8).
17. TTTFT. §4.2,3,4. Mean and variance only tell us the location and spread of the distribution (§4.6,7).
18. TFTFT. §4.5,6,7. Median = 2, the observations must be ordered before the central one is found, mode = 2, range = 7 - 1 = 6, variance = 22/4 = 5.5.
19. FFFFT. §4.6,7,8. There would be more observations below the mean than above, because the median would be less than the mean. Most observations will be within one standard deviation of the mean whatever the shape. The standard deviation measures how widely the blood pressure is spread between people, not for a single person, which would be needed to estimate accuracy. See also §15.2.
Fig. 19.1. Stem and leaf plot of blood glucose
Fig. 19.2. Box and whisker plot of blood glucose
Solution to Exercise 4E
1. The stem and leaf plot is shown in Figure 19.1.
2. Minimum = 2.2, maximum = 6.0. The median is the average of the 20th and 21st ordered observations, since the number of observations is even. These are both 4.0, so the median is 4.0. The first quartile is between the 10th and 11th, which are both 3.6. The third quartile is between the 30th and 31st observations, which are 4.5 and 4.6. We have q = 0.75, i = 0.75 × 41 = 30.75, and the quartile is given by 4.5 + (4.6 - 4.5) × 0.75 = 4.575 (§4.5). The box and whisker plot is shown in Figure 19.2.
Fig. 19.3. Histogram of blood glucose
3. The frequency distribution is derived easily from the stem and leaf plot:
Interval Frequency
2.0–2.4 1
2.5–2.9 1
3.0–3.4 6
3.5–3.9 10
4.0–4.4 11
4.5–4.9 8
5.0–5.4 2
5.5–5.9 0
6.0–6.4 1
Total 40
4. The histogram is shown in Figure 19.3. The distribution is symmetrical.
5. The mean is given by
$$\bar{x} = \frac{\sum x_i}{n} = \frac{16.2}{4} = 4.05$$
The deviations and their squares are as follows:
xi xi - x̄ (xi - x̄)²
4.7 0.65 0.4225
4.2 0.15 0.0225
3.9 -0.15 0.0225
3.4 -0.65 0.4225
Total 16.2 0.00 0.8900
There are n - 1 = 4 - 1 = 3 degrees of freedom. The variance is given by
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{0.8900}{3} = 0.2967$$
6. As before, the sum is ∑xi = 16.2. The sum of squares is ∑xi² = 66.5 and the sum of squares about the mean is then given by
$$\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = 66.5 - \frac{16.2^2}{4} = 66.5 - 65.61 = 0.89$$
This is the same as found in 5 above, so, as before,
$$s^2 = \frac{0.89}{3} = 0.2967$$
7. For the mean we have ∑xi = 162.2,
$$\bar{x} = \frac{162.2}{40} = 4.055$$
The sum of squares about the mean is given by:
There are n - 1 = 40 - 1 = 39 degrees of freedom. The variance is given by
9. For the limits, x̄ - 2s = 4.055 - 2 × 0.698 = 2.659, x̄ - s = 4.055 - 0.698 = 3.357, x̄ = 4.055, x̄ + s = 4.055 + 0.698 = 4.753, and x̄ + 2s = 4.055 + 2 × 0.698 = 5.451. Figure 19.3 shows the mean and standard deviation marked on the histogram. The majority of points fall within one standard deviation of the mean and nearly all within two standard deviations of the mean. Because the distribution is symmetrical, it extends just beyond the x̄ ± 2s points on either side.
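The two routes to the variance used in parts 5 and 6 above (deviations from the mean, and the sum-of-squares shortcut) can be checked with a short sketch using the four observations from the table of deviations. The variable names are illustrative; `statistics.variance` from the Python standard library is used only as an independent cross-check.

```python
import statistics

x = [4.7, 4.2, 3.9, 3.4]  # the four observations tabulated in part 5
n = len(x)

mean = sum(x) / n
ss_deviations = sum((xi - mean) ** 2 for xi in x)          # part 5: sum of squared deviations
ss_shortcut = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n   # part 6: shortcut formula
variance = ss_deviations / (n - 1)                         # n - 1 = 3 degrees of freedom
sd = variance ** 0.5

print(round(mean, 3), round(ss_deviations, 4), round(ss_shortcut, 4), round(variance, 4))
print(round(statistics.variance(x), 4), round(sd, 3))
```

Both sums of squares agree apart from floating-point rounding, which is why the shortcut formula is convenient for hand calculation.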
Solution to Exercise 5M: Multiple choice questions 20 to 24
20. FTTTT. §5.1, §5.2. Without a control group we have no idea how many would get better anyway (§2.1). 66.67% is 2/3. We may only have 3 patients.
21. TFFTT. §5.2. To three significant figures, it should be 1730. We round up because of the 9. To six decimal places it is 1729.543710.
22. FTTFT. This is a bar chart showing the relationship between two variables (§5.5). See Figure 19.4. Calendar time has no true zero to show.
23. TTFFT. §5.9, §5A. There is no logarithm of zero.
24. FFTTT. §5.5,6,7. A histogram (§4.3) and a pie chart (§5.4) each show the distribution of a single variable.
Fig. 19.4. A dubious graph revised
Table 19.2. Calculations for a pie chart for the Tooting Bec data
Category                 Frequency  Relative frequency  Angle
Schizophrenia            474        0.32311             116
Affective illness        277        0.18882             68
Organic brain syndrome   405        0.27607             99
Subnormality             58         0.03954             14
Alcoholism               57         0.03885             14
Other                    196        0.13361             48
Total                    1467       1.00000             359
Solution to Exercise 5E
1. This is the frequency distribution of a qualitative variable, so a pie chart can be used to display it. The calculations are set out in Table 19.2. Notice that we have lost one degree through rounding errors. We could work to fractions of a degree, but the eye is unlikely to spot the difference. The pie chart is shown in Figure 19.5.
2. See Figure 19.6.
3. There are several possibilities. In the original paper, Doll and Hill used a separate bar chart for each disease, similar to Figure 19.7.
4. Line graphs can be used here, as we have simple time series (Figure 19.8). For an explanation of the difference between years, see §13E.
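The angle calculation in Table 19.2 (part 1 above) can be sketched as follows: each angle is 360 degrees times the relative frequency, rounded to the nearest degree. The dictionary layout and variable names are illustrative; the frequencies are those in Table 19.2. The sketch also shows the one degree lost through rounding.

```python
frequencies = {
    "Schizophrenia": 474,
    "Affective illness": 277,
    "Organic brain syndrome": 405,
    "Subnormality": 58,
    "Alcoholism": 57,
    "Other": 196,
}

total = sum(frequencies.values())
relative = {name: count / total for name, count in frequencies.items()}
angles = {name: round(360 * rel) for name, rel in relative.items()}

for name in frequencies:
    print(f"{name:24s} {frequencies[name]:5d} {relative[name]:.5f} {angles[name]:4d}")
print(f"{'Total':24s} {total:5d} {sum(relative.values()):.5f} {sum(angles.values()):4d}")
```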
Solution to Exercise 6M: Multiple choice questions 25 to 31
25. TTFFF. §6.2. If they are mutually exclusive they cannot both happen. There is no reason why they should be equiprobable or exhaustive, the only events which can happen (§6.3).
26. TFTFT. For both, the probabilities are multiplied, 0.2 × 0.05 = 0.01 (§6.2). Clearly the probability of both must be less than that for each one. The probability of both is 0.01, so the probability of X alone is 0.20 - 0.01 = 0.19 and the probability of Y alone is 0.05 - 0.01 = 0.04. The probability of having X or Y is the probability of X alone + probability of Y alone + probability of X and Y together, because these are three mutually exclusive events. Having X and having Y are not mutually exclusive as she can have both. Having X tells us nothing about whether she has Y. If she has X the probability of having Y is still 0.05, because X and Y are independent.
Fig. 19.5. Pie chart showing the distribution of patients in Tooting Bec Hospital by diagnostic group
Fig. 19.6. Bar chart showing the results of the Salk vaccine trial
27. TFTFF. §6.4. Weight is continuous. Patients respond or not with equal probability, being selected at random from a population where the probability of responding varies. The number of red cells might follow a Poisson distribution (§6.7); there is no set of independent trials. The number of hypertensives follows a Binomial distribution, not the proportion.
Fig. 19.7. Mortality in British doctors by smoking habits, after Doll and Hill (1956)
Fig. 19.8. Line graphs for geriatric admissions in Wandsworth in the summers of 1982 and 1983
28. TTTTF. The probability of clinical disease is 0.5 × 0.5 = 0.25. The probability of carrier status = probability that father passes the gene and mother does not + probability that mother passes the gene and father does not = 0.5 × 0.5 + 0.5 × 0.5 = 0.5. Probability of not inheriting the gene = 0.5 × 0.5 = 0.25. Probability of not having clinical disease = 1 - 0.25 = 0.75. Successive children are independent, so the probabilities for the second child are unaffected by the first (§6.2).
29. FTTFT. §6.3,4. The expected number is one (§6.6). The spins are independent (§6.2). At least one tail means one tail (PROB = 0.5) or two tails (PROB = 0.25). These are mutually exclusive, so the probability of at least one tail is 0.5 + 0.25 = 0.75.
Table 19.3. Probability of surviving to different ages
Survive to age Probability Survive to age Probability
10 0.959 60 0.758
20 0.952 70 0.524
30 0.938 80 0.211
40 0.920 90 0.022
50 0.876 100 0.000
30. FTTFT. §6.6. E(X + 2) = µ + 2, VAR(2X) = 4σ².
31. TTTFF. §6.6. The variance of a difference is the sum of the variances. Variances cannot be negative. VAR(-X) = (-1)² × VAR(X) = VAR(X).
Solution to Exercise 6E
1. Probability of survival to age 10. This illustrates the frequency definition of probability. 959 out of 1000 survive, so the probability is 959/1000 = 0.959.
2. Survival and death are mutually exclusive, exhaustive events, so PROB(survives) + PROB(dies) = 1. Hence PROB(dies) = 1 - 0.959 = 0.041.
3. These are the number surviving divided by 1000 (Table 19.3). The events are not mutually exclusive, e.g. a man cannot survive to age 20 if he does not survive to age 10. This does not form a probability distribution.
4. The probability is found by
$$\frac{\text{PROB(survival to age 70)}}{\text{PROB(survival to age 60)}} = \frac{0.524}{0.758} = 0.691$$
5. Independent events. PROB(survival 60 to 70) = 0.691, PROB(both survive) = 0.691 × 0.691 = 0.477.
6. The proportion surviving on average is the probability of survival = 0.691. So a proportion of 0.691 of the 100 survive. We expect 0.691 × 100 = 69.1 to survive.
7. The probability is found by
8. As in 7, we find probabilities of dying for each decade (Table 19.4). This is a set of mutually exclusive events and they are exhaustive – there is no other decade in which death can take place. The sum of the probabilities is therefore 1.0. The distribution is shown in Figure 19.9.
9. We find the expected value or mean of a probability distribution by summing each value times its probability (§6.4), to give life expectancy at birth = 66.6 years (Table 19.5).
Table 19.4. Probability of dying in each decade
Decade Probability of dying Decade Probability of dying
1st 0.041 6th 0.118
2nd 0.007 7th 0.234
3rd 0.014 8th 0.313
4th 0.018 9th 0.189
5th 0.044 10th 0.022
Fig. 19.9. Probability distribution of decade of death
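The steps from the survival probabilities of Table 19.3 to the decade-of-death distribution of Table 19.4 and the life expectancy of Table 19.5 can be sketched as below. Two assumptions are made for illustration: survival to age 0 is taken as 1.000 (the 1000 births implied by the frequency definition), and, as in Table 19.5, deaths in a decade are placed at its midpoint (5, 15, ..., 95). Variable names are illustrative.

```python
# Probability of surviving to ages 0, 10, 20, ..., 100 (Table 19.3; age 0 assumed 1.000).
survival = [1.000, 0.959, 0.952, 0.938, 0.920, 0.876,
            0.758, 0.524, 0.211, 0.022, 0.000]

# Probability of dying in each decade = survive to its start minus survive to its end.
death_probs = [round(survival[k] - survival[k + 1], 3) for k in range(10)]

# Deaths are placed at the decade midpoints, as in Table 19.5.
midpoints = [10 * k + 5 for k in range(10)]

# Expected value: sum of each value times its probability.
life_expectancy = sum(age * p for age, p in zip(midpoints, death_probs))

print(death_probs)
print(round(sum(death_probs), 3), round(life_expectancy, 1))
```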
Solution to Exercise 7M: Multiple choice questions 32 to 37
32. TTTFT. §7.2,3,4.
33. FFFTT. Symmetrical, µ = 0, σ = 1 (§7.3, §4.6).
34. TTFFF. §7.2. Median = mean. The Normal distribution has nothing to do with normal physiology. 2.5% will be less than 260, 2.5% will be greater than 340 litres/min.
Table 19.5. Calculation of expectation of life
5 × 0.041 = 0.205
15 × 0.007 = 0.105
25 × 0.014 = 0.350
35 × 0.018 = 0.630
45 × 0.044 = 1.980
55 × 0.118 = 6.490
65 × 0.234 = 15.210
75 × 0.313 = 23.475
85 × 0.189 = 16.065
95 × 0.022 = 2.090
Total 66.600
Fig. 19.10. Histogram of the blood glucose data with the corresponding Normal distribution curve, and Normal plot
35. FTTFF. §4.6, §7.3. The sample size should not affect the mean. The relative sizes of mean, median and standard deviation depend on the shape of the frequency distribution.
36. TFTTF. §7.2, §7.3. Adding, subtracting or multiplying by a constant, or adding or subtracting an independent Normal variable gives a Normal distribution. X² follows a very skew Chi-squared distribution with one degree of freedom and X/Y follows a t distribution with one degree of freedom (§7A).
37. TTTTT. A gentle slope indicates that observations are far apart, a steep slope that there are many observations close together. Hence gentle-steep-gentle (‘S’ shaped) indicates long tails (§7.5).
Solution to Exercise 7E
1. The box and whisker plot shows a very slight degree of skewness, the lower whisker being shorter than the upper and the lower half of the box smaller than the upper. From the histogram it appears that the tails are a little longer than the Normal curve of Figure 7.10 would suggest. Figure 19.10 shows the Normal distribution with the same mean and variance superimposed on the histogram, which also indicates this.
2. We have n = 40. For i = 1 to 40 we want to calculate (i - 0.5)/n = (2i - 1)/2n. This gives us a probability. We use Table 7.1 to find the value of the Normal distribution corresponding to this probability. For example, for i = 1 we have
$$\Phi(x) = \frac{2 \times 1 - 1}{2 \times 40} = \frac{1}{80} = 0.0125$$
From Table 7.1 we cannot find the value of x corresponding to Φ(x) = 0.0125 directly, but we see that x = -2.3 corresponds to Φ(x) = 0.011 and x = -2.2 to Φ(x) = 0.014. Φ(x) = 0.0125 is mid-way between these probabilities so we can estimate the value of x as mid-way between -2.3 and -2.2, giving -2.25. This corresponds to the lowest blood glucose, 2.2. For i = 2 we have Φ(x) = 0.0375. Referring to the table we have x = -1.8, Φ(x) = 0.036 and x = -1.7, Φ(x) = 0.045. For Φ(x) = 0.0375 we must have x just above -1.8, about -1.78. The corresponding blood glucose is 2.9. We do not have to be very accurate because we are only using this plot for a rough guide. We get a set of probabilities as follows:
i (2i-1)/2n = Φ(x) x Blood glucose
1 1/80 = 0.0125 -2.25 2.2
2 3/80 = 0.0375 -1.78 2.9
3 5/80 = 0.0625 -1.53 3.3
4 7/80 = 0.0875 -1.36 3.3
and so on. Because of the symmetry of the Normal distribution, from i = 21 onwards the values of x are those corresponding to 40 - i + 1, but with a positive sign. The Normal plot is shown in Figure 19.10.
3. The points do not lie on a straight line. There are pronounced bends near each end. These bends reflect rather long tails of the distribution of blood glucose. If the line showed a steady curve, rising less steeply as the blood glucose value increased, this would show simple skewness which can often be corrected by a log transformation. This would not work here; the bend at the lower end would be made worse.
The deviation from a straight line is not very great, compared, say, to the vitamin D measurements in Figure 7.12. As we see in Chapter 10, such small deviations from the Normal do not usually matter.
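The Normal scores looked up by hand in part 2 can also be computed directly. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library in place of interpolation in Table 7.1; the function name `normal_scores` is illustrative. The exact value for i = 1 is about -2.24, so the interpolated -2.25 in the text is slightly rough, which is harmless for a plot used as a rough guide.

```python
from statistics import NormalDist


def normal_scores(n):
    """Expected Normal deviates for the ordered observations, using (i - 0.5)/n."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


scores = normal_scores(40)
for i in range(4):
    print(i + 1, round((2 * (i + 1) - 1) / 80, 4), round(scores[i], 2))
```

Plotting the ordered blood glucose values against `scores` would give the Normal plot; the list is symmetric about zero, as noted in part 2.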
Solution to Exercise 8M: Multiple choice questions 38 to 43
39. FTFTF. §8.3. The sample mean is always in the middle of the limits.
41. TTTFF. (§8.1,2, §6.4) Variance is p(1 - p)/n = 0.1 × 0.9/100 = 0.0009. The number in the sample with the condition follows a Binomial distribution, not the proportion.
42. FFTTT. It depends on the variability of FEV1 and the number in the sample (§8.2). The sample should be random (§3.3,4).
43. FFTTF. §8.3,4. It is unlikely that we would get these data if the population rate were 10%, but not impossible.
Solution to Exercise 8E
1. The interval will be 1.96 standard deviations less than and greater than the mean. The lower limit is 0.810 - 1.96 × 0.057 = 0.698 mmol/litre. The upper limit is 0.810 + 1.96 × 0.057 = 0.922 mmol/litre.
2. For the diabetics, the mean is 0.719 and the standard deviation 0.068, so the lower limit of 0.698 will be (0.698 - 0.719)/0.068 = -0.309 standard deviations from the mean. From Table 7.1, the probability of being below this is 0.38, so the probability of being above is 1 - 0.38 = 0.62. Thus the probability that an insulin-dependent diabetic would be within the reference interval would be 0.62 or 62%. This is the proportion we require.
4. The 95% confidence interval is the mean ± 1.96 standard errors. For the controls, 0.810 - 1.96 × 0.00482 to 0.810 + 1.96 × 0.00482 gives us 0.801 to 0.819 mmol/litre. This is much narrower than the interval of part 1. This is because the confidence interval tells us how far the sample mean might be from the population mean. The 95% reference interval tells us how far an individual observation might be from the population mean.
5. The groups are independent, so the standard error of the difference between means is given by:
6. The difference between the means is 0.719 - 0.810 = -0.091 mmol/litre. The 95% confidence interval is thus -0.091 - 1.96 × 0.00660 to -0.091 + 1.96 × 0.00660, giving -0.104 to -0.078. Hence the mean plasma magnesium level for insulin-dependent diabetics is between 0.078 and 0.104 mmol/litre below that of non-diabetics.
7. Although the difference is significant, this would not be a good test because the majority of diabetics are within the 95% reference interval.
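The interval calculations in parts 1, 2, 4 and 6 above all have the form estimate ± 1.96 × spread, and can be sketched with one helper. The means, standard deviations and standard errors are those quoted in the solution; the helper name `interval` is illustrative, and `statistics.NormalDist` replaces the lookup in Table 7.1. Unlike the hand calculation, the sketch also subtracts the very small proportion of diabetics above the upper reference limit, which does not change the answer to two decimal places.

```python
from statistics import NormalDist

Z95 = 1.96  # two-sided 5% point of the standard Normal distribution


def interval(centre, spread, z=Z95):
    """Return (centre - z * spread, centre + z * spread)."""
    return centre - z * spread, centre + z * spread


# Part 1: 95% reference interval for controls (mean 0.810, SD 0.057).
ref_low, ref_high = interval(0.810, 0.057)

# Part 2: proportion of diabetics (mean 0.719, SD 0.068) inside that interval.
diabetics = NormalDist(mu=0.719, sigma=0.068)
proportion_within = diabetics.cdf(ref_high) - diabetics.cdf(ref_low)

# Part 4: 95% confidence interval for the control mean (SE 0.00482).
ci_controls = interval(0.810, 0.00482)

# Part 6: 95% confidence interval for the difference in means (SE 0.00660).
ci_difference = interval(0.719 - 0.810, 0.00660)

print(round(ref_low, 3), round(ref_high, 3), round(proportion_within, 2))
print([round(v, 3) for v in ci_controls], [round(v, 3) for v in ci_difference])
```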
Solution to Exercise 9M: Multiple choice questions 44 to 49
44. FTFFF. There is evidence for a relationship (§9.6), which is not necessarily causal. There may be other differences related to coffee drinking, such as smoking (§3.8).
46. TTFTT. §9.2. It is quite possible for either to be higher and deviations in either direction are important (§9.5). n = 16 because the subject giving the same reading on both gives no information about the difference and is excluded from the test. The order should be random, as in a cross-over trial (§2.6).
47. FFFFT. The trial is small and the difference may be due to chance, but there may also be a large treatment effect. We must do a bigger trial to increase the power (§9.9). Adding cases would completely invalidate the test. If the null hypothesis is true, the test will give a ‘significant’ result one in 20 times. If we keep adding cases and doing many tests we have a very high chance of getting a ‘significant’ result on one of them, even though there is no treatment effect (§9.10).
48. TFTTF. Large sample methods depend on estimates of variance obtained from the data. This estimate gets closer to the population value as the sample size increases (§9.7, §9.8). The chance of an error of the first kind is the significance level set in advance, say 5%. The larger the sample the more likely we are to detect a difference should one exist (§9.9). The null hypothesis depends on the phenomena we are investigating, not on the sample size.
49. FTFFT. We cannot conclude causation in an observational study (§3.6,7,8), but we can conclude that there is evidence of a difference (§9.6). 0.001 is the probability of getting so large a difference if the null hypothesis were true (§9.3).
Solution to Exercise 9E
1. Both control groups are drawn from populations which were easy to get to, one being hospital patients without gastro-intestinal symptoms, the other being fracture patients and their relatives. Both are matched for age and sex; Mayberry et al. (1978) also matched for social class and marital status. Apart from the matching factors, we have no way of knowing whether cases and controls are comparable, or any way of knowing whether controls are representative of the general population. This is usual in case-control studies and is a major problem with this design.
2. There are two obvious sources of bias: interviews were not blind and information is being recalled by the subject. The latter is particularly a problem for data about the past. In James' study subjects were asked what they used to eat several years in the past. For the cases this was before a definite event, onset of Crohn's disease, for the controls it was not, the time being time of onset of the disease in the matched case.
3. The question in James' study was ‘what did you eat in the past?’, that in Mayberry et al. (1978) was ‘what do you eat now?’
4. Of the 100 patients with Crohn's disease, 29 were current eaters of cornflakes. Of 29 cases who knew of the cornflakes association, 12 were ex-eaters of cornflakes, and among the other 71 cases 21 were ex-eaters of cornflakes, giving a total of 33 past but not present eaters of cornflakes. Combining these with the 29 current consumers, we get 62 cases who had at some time been regular eaters of cornflakes. If we carry out the same calculation for the controls, we obtain 3 + 10 = 13 past eaters and with 22 current eaters this gives 35 sometime regular cornflakes eaters. Cases were more likely than controls to have eaten cornflakes regularly at some time, the proportion of cases reporting having eaten cornflakes being almost twice as great as for controls. Compare this to James' data, where 17/68 = 25% of controls and 23/34 = 68% of cases, 2.7 times as many, had eaten cornflakes regularly. The results are similar.
5. The relationship between Crohn's disease and reported consumption of cornflakes had a much smaller probability for the significance test and hence stronger evidence that a relationship existed. Also, only one case had never eaten cornflakes (it was also the most popular cereal among controls).
6. Of the Crohn's cases, 67.6% (i.e. 23/34) reported having eaten cornflakes regularly compared to 25.0% of controls. Thus cases were 67.6/25.0 = 2.7 times as likely as controls to report having eaten cornflakes. The corresponding ratios for the other cereals are: wheat, 2.7; porridge, 1.5; rice, 1.6; bran, 6.1; muesli, 2.7. Cornflakes does not stand out when we look at the data in this way. The small probability simply arises because it is the most popular cereal. The P value is a property of the sample, not of the population.
7. We can conclude that there is no evidence that eating cornflakes is more closely related to Crohn's disease than is consumption of other cereals. The tendency for Crohn's cases to report excessive eating of breakfast foods before onset of the disease may be the result of greater variation in diet than in controls, as they try different foods in response to their symptoms. They may also be more likely to recall what they used to eat, being more aware of the effects of diet because of their disease.
Solution to Exercise 10M: Multiple choice questions 50 to 56
50. FFTFT. §10.2. It is equivalent to the Normal distribution method (§8.3).
51. FTFTF. §10.3. Whether the (population) means are equal is what we are trying to find out. The large sample case is like the Normal test of §9.7, except for the common variance estimate. It is valid for any sample size.
52. FTTFF. The assumption of Normality would not be met for a small sample t test (§10.3) without transformation (§10.4), but for a large sample the distribution followed by the data would not matter (§9.7). The sign test is for paired data. We have measurements, not qualitative data.
53. FTTFF. §10.5. The more different the sample sizes are, the worse is the approximation to the t distribution. When both samples are large, this becomes a large sample Normal distribution test (§9.7). Grouping of data is not a serious problem.
54. TFFTT. A P value conveys more information than a statement that the difference is significant or not significant. A confidence interval would be even better. What is important is how well the diagnostic test discriminates, i.e. by how much the distributions overlap, not any difference in mean. Semen count cannot follow a Normal distribution because two standard deviations exceed the mean and some observations would be negative (§7.4). Approximately equal numbers make the t test very robust but skewness reduces the power (§10.5).
56. FTTFT. §10.9. Sums of squares and degrees of freedom add up, mean squares do not. Three groups give two degrees of freedom. We can have any sizes of groups.
Solution to Exercise 10E
1. The differences for compliance are shown in Table 19.6. The stem and leaf plot is shown in Figure 19.11.
Table 19.6. Differences and means for static compliance
Patient   Constant   Decelerating   Difference   Mean
 1          65.4        72.9           -7.5       69.15
 2          73.7        94.4          -20.7       84.05
 3          37.4        43.3           -5.9       40.35
 4          26.3        29.0           -2.7       27.65
 5          65.0        66.4           -1.4       65.70
 6          35.2        36.4           -1.2       35.80
 7          24.7        27.7           -3.0       26.20
 8          23.0        27.5           -4.5       25.25
 9         133.2       178.2          -45.0      155.70
10          38.4        39.3           -0.9       38.85
11          29.2        31.8           -2.6       30.50
12          28.3        26.9            1.4       27.60
13          46.6        45.0            1.6       45.80
14          61.5        58.2            3.3       59.85
15          25.7        25.7            0.0       25.70
16          48.7        42.3            6.4       45.50
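Fig. 19.11. Stem and leaf plot for compliance
2. The plot of difference against mean is Figure 19.12. The distribution is highly skewed and the difference closely related to the mean.
3. The sum and sum of the squared differences are ∑d = -82.7 and ∑d² = 2648.43, hence the mean is d̄ = -82.7/16 = -5.16875. For the sum of squares about the mean we have ∑d² - (∑d)²/n = 2648.43 - (-82.7)²/16 = 2220.97, giving a variance of 2220.97/15 = 148.065, a standard deviation of 12.1682 and a standard error of 12.1682/√16 = 3.04205.
4. We have 15 degrees of freedom and from Table 7.1 the 5% point of the t distribution is 2.13. The 95% confidence interval is -5.16875 - 2.13 × 3.04205 to -5.16875 + 2.13 × 3.04205, giving -11.6 to +1.3.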
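The arithmetic in parts 3 and 4 can be checked with a short script. This is only a sketch: the differences are copied from Table 19.6, the 5% point 2.13 is the value quoted from Table 7.1, and the variable names are illustrative.

```python
import math

# Differences (constant minus decelerating) copied from Table 19.6
diffs = [-7.5, -20.7, -5.9, -2.7, -1.4, -1.2, -3.0, -4.5,
         -45.0, -0.9, -2.6, 1.4, 1.6, 3.3, 0.0, 6.4]

n = len(diffs)
mean = sum(diffs) / n
# Sum of squares about the mean: sum(d^2) - (sum d)^2 / n
ss = sum(d * d for d in diffs) - sum(diffs) ** 2 / n
variance = ss / (n - 1)
se = math.sqrt(variance / n)

t_5pct = 2.13  # 5% point of t with 15 d.f., as quoted in the text
lower = mean - t_5pct * se
upper = mean + t_5pct * se

print(f"mean = {mean:.5f}, SE = {se:.5f}")
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```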
Fig. 19.12. Difference versus mean for compliance
6. The 95% confidence interval is -0.028688 - 2.13 × 0.012503 to -0.028688 + 2.13 × 0.012503, which gives -0.055312 to -0.002057. These limits have not been rounded, because we need to transform them first. If we transform these limits back by taking the antilogs we get 0.880 to 0.995. This means that the compliance with a decelerating waveform is between 0.880 and 0.995 times that with a constant waveform. There is some evidence that waveform has an effect, whereas with the untransformed data the confidence interval for the difference included zero. Because of the skewness of the raw data the confidence interval was too wide.
7. We can conclude that there is some evidence of a reduction in mean compliance, which could be up to 12% (from (1 - 0.880) × 100), but could also be negligibly small.
Solution to Exercise 11M: Multiple choice questions 57 to 61
57. FFTTF. Outcome and predictor variables are perfectly related but do not lie on a straight line, so r < 1 (§11.9).
Fig. 19.13. Stem and leaf plots for log compliance
Table 19.7. Difference and mean for log transformed compliance (to base 10)
Patient   Constant   Decelerating   Difference   Mean
 1         1.816       1.863         -0.047      1.8395
 2         1.867       1.975         -0.108      1.9210
 3         1.573       1.636         -0.063      1.6045
 4         1.420       1.462         -0.042      1.4410
 5         1.813       1.822         -0.009      1.8175
 6         1.547       1.561         -0.014      1.5540
 7         1.393       1.442         -0.049      1.4175
 8         1.362       1.439         -0.077      1.4005
 9         2.125       2.251         -0.126      2.1880
10         1.584       1.594         -0.010      1.5890
11         1.465       1.502         -0.037      1.4835
12         1.452       1.430          0.022      1.4410
13         1.668       1.653          0.015      1.6605
14         1.789       1.765          0.024      1.7770
15         1.410       1.410          0.000      1.4100
16         1.688       1.626          0.062      1.6570
Fig. 19.14. Difference versus mean for log compliance
58. FTFFF. Knowledge of the predictor tells us something about the outcome variable (§6.2). This is not a straight line relationship. For part of the scale the outcome variable decreases as the predictor increases, then the outcome variable increases again. The correlation coefficient will be close to zero (§11.9). A logarithmic transformation would work if the outcome increased more and more rapidly as the predictor increased (§5.9).
59. FFFTT. A regression line usually has non-zero intercept and slope, which have dimensions (§11.3). Exchanging X and Y changes the line (§11.4).
60. FTTFF. The predictor variable has no error in the regression model (§11.3). Transformations are only used if needed to meet the assumptions (§11.8). There is a scatter about the line (§11.3).
61. TTFFF. §11.9, 10. There is no distinction between predictor and outcome. r should not be confused with the regression coefficient (§11.3).
Solution to Exercise 11E
1. The slope is found by
For females, bf = 2.8710.
For males, bm = 3.9477.
2. For the standard error, we first need the variances about the line:
then the standard error is
For females:
For males:
3. The standard error of the difference between two independent variables is the square root of the sum of their standard errors squared (§8.5), here 1.7225.
The sample is reasonably large, almost attaining 50 in each group, so this standard error should be fairly well estimated and we can use a large sample Normal approximation. The 95% confidence interval is thus 1.96 standard errors on either side of the estimate. The observed difference is bf - bm = 2.8710 - 3.9477 = -1.0767. The 95% confidence interval is thus -1.0767 - 1.96 × 1.7225 = -4.5 to -1.0767 + 1.96 × 1.7225 = 2.3. If the samples were small, we could do this using the t distribution, but we would need to estimate a common variance. It would be better to use multiple regression, testing the height × sex interaction (§17.3).
4. For the test of significance the test statistic is observed difference over standard error: -1.0767/1.7225 = -0.63.
If the null hypothesis were true, this would be an observation from a Standard Normal distribution. From Table 7.2, P > 0.5.
Solution to Exercise 12M: Multiple choice questions 62 to 66
62. TFTFF. §10.3, §12.2. The sign and Wilcoxon tests are for paired data (§9.2, §12.3). Rank correlation looks for the existence of relationships between two ordinal variables, not a comparison between two groups (§12.4, §12.5).
63. TTFFT. §9.2, §12.2, §10.3, §12.5. The Wilcoxon test is for interval data (§12.3).
64. FTFTT. §12.5. There is no predictor variable in correlation. Log transformation would not affect the rank order of the observations.
65. FTFFT. If Normal assumptions are met the methods using them are better (§12.7). Estimation of confidence intervals using rank methods is difficult. Rank methods require the assumption that the scale is ordinal, i.e. that the data can be ranked.
66. TFTTF. We need a paired test: t, sign or Wilcoxon (§10.2, §9.2, §12.3).
Solution to Exercise 12E
1. The differences are shown in Table 19.6. We have 4 positive, 11 negative and 1 zero. Under the null hypothesis of no difference, the number of positives is from the Binomial distribution with p = 0.5, n = 15. We have n = 15 because the single zero contributes no information about the direction of the difference. For PROB(r ≤ 4) we have 0.05924.
If we double this for a two-sided test we get 0.11848, again not significant.
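The sign test probability can be reproduced directly from the Binomial distribution. The following is a minimal sketch using only the counts stated above (4 positives out of 15 non-zero differences); the variable names are illustrative.

```python
from math import comb

n = 15  # non-zero differences (the single zero is dropped)
r = 4   # number of positive differences

# One-sided probability of r or fewer positives when p = 0.5
p_one_sided = sum(comb(n, k) for k in range(r + 1)) / 2 ** n
p_two_sided = 2 * p_one_sided

print(f"PROB(r <= {r}) = {p_one_sided:.5f}")
print(f"two-sided P  = {p_two_sided:.5f}")
```

The last decimal place of the two-sided value can differ slightly from the figure in the text, which doubles an already rounded one-sided probability.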
2. Using the Wilcoxon matched pairs test we get
Diff.   -0.9   -1.2   -1.4    1.4    1.6   -2.6   -2.7   -3.0
Rank     1      2      3.5    3.5    5      6      7      8
Diff.    3.3   -4.5   -5.9    6.4   -7.5  -20.7  -45.0
Rank     9     10     11     12     13     14     15
As for the sign test, the zero is omitted. Sum of ranks for positive differences is T = 3.5 + 5 + 9 + 12 = 29.5. From Table 12.5 the 5% point for n = 15 is 25, which T exceeds, so the difference is not significant at the 5% level. The three tests give similar answers.
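The ranking step, including the tied pair at 1.4, can be sketched as follows. The differences are those of Table 19.6; the tie-handling loop and the names used are illustrative rather than a method prescribed by the text.

```python
# Differences from Table 19.6; the zero is omitted, as for the sign test
diffs = [-7.5, -20.7, -5.9, -2.7, -1.4, -1.2, -3.0, -4.5,
         -45.0, -0.9, -2.6, 1.4, 1.6, 3.3, 0.0, 6.4]
nonzero = [d for d in diffs if d != 0]

# Rank by absolute size, giving tied values the average of their ranks
ordered = sorted(nonzero, key=abs)
rank_of = {}
i = 0
while i < len(ordered):
    j = i
    while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
        j += 1
    # positions i..j-1 hold ranks i+1..j, whose average is (i + 1 + j) / 2
    rank_of[abs(ordered[i])] = (i + 1 + j) / 2
    i = j

t_positive = sum(rank_of[abs(d)] for d in nonzero if d > 0)
t_negative = sum(rank_of[abs(d)] for d in nonzero if d < 0)
print(f"T (positive ranks) = {t_positive}, negative ranks = {t_negative}")
```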
3. Using the log transformed differences in Table 19.7, we still have 4 positives, 11 negatives and 1 zero, with a sign test probability of 0.11848. The transformation does not alter the direction of the changes and so does not affect the sign test.
4. For the Wilcoxon matched pairs test on the log compliance:
Diff.   -0.009  -0.010  -0.014   0.015   0.022   0.024
Rank     1       2       3       4       5       6
Diff.   -0.037  -0.042  -0.047  -0.049   0.062  -0.063
Rank     7       8       9      10      11      12
Diff.   -0.077  -0.108  -0.126
Rank    13      14      15
Hence T = 4 + 5 + 6 + 11 = 26. This is just above the 5% point of 25 and is different from that in the untransformed data. This is because the transformation has altered the relative size of the differences. This test assumes interval data. By changing to a log scale we have moved to a scale where the differences are more comparable, because the change does depend on the magnitude of the original value. This does not happen with the other rank tests, the Mann–Whitney U test and rank correlation coefficients, which involve no differencing.
5. Although there is a possibility of a reduction in compliance it does not reach the conventional level of significance.
6. The conclusions are broadly similar, but the effect on compliance is more strongly suggested by the t method. Provided the data can be transformed to approximate Normality the t distribution analysis is more powerful, and as it also gives confidence intervals more easily, I would prefer it.
Solution to Exercise 13M: Multiple choice questions 67 to 73
67. TFFFF. §13.3. 80% of 4 is greater than 3, so all expected frequencies must exceed 5. The sample size can be as small as 20, if all row and column totals are 10.
68. FTFTF. §13.1, §13.3. (5 - 1) × (3 - 1) = 8 degrees of freedom, 80% × 15 = 12 cells must have expected frequencies > 5. It is O.K. for an observed frequency to be zero.
69. TTFTF. §13.1, §13.9. The two tests are independent. There are (2 - 1) × (2 - 1) = 1 degree of freedom. With such large numbers Yates' correction does not make much difference. Without it we get χ² = 124.5, with it we get χ² = 119.4 (§13.5).
70. TTTTT. §13.4, 5. The factorials of large numbers can be difficult to calculate.
71. TTTTF. §13.7.
72. TTFTT. Chi-squared for trend and τb will both test the null hypothesis of no trend in the table, but an ordinary chi-squared test will not (§13.8). The odds ratio (OR) is an estimate of the relative risk for a case-control study (§13.7).
73. TTFFF. The test compares proportions in matched samples (§13.9). For a relationship, we use the chi-squared test (§13.1). PEFR is a continuous variable, so we use the paired t method (§10.2). For two independent samples we use the chi-squared test (§13.1).
Solution to Exercise 13E
1. The heatwave appears to begin in week 10 and continue to include week 17. This period was much hotter than the corresponding period of 1982.
Table 19.8. Cross-tabulation of time period by year for geriatric admissions
Year     Before heatwave   During heatwave   After heatwave   Total
1982          190               110                82           382
1983          180               178               110           468
Total         370               288               192           850
Table 19.9. Expected frequencies for Table 19.8
Year     Before heatwave   During heatwave   After heatwave   Total
1982         166.3             129.4              86.3         382.0
1983         203.7             158.6             105.7         468.0
Total        370.0             288.0             192.0         850.0
2. There were 178 admissions during the heatwave in 1983 and 110 in the corresponding weeks of 1982. We could test the null hypothesis that these came from distributions with the same admission rate and we would get a significant difference. This would not be convincing, however. It could be due to other factors, such as the closure of another hospital with resulting changes in catchment area.
3. The cross-tabulation is shown in Table 19.8.
4. The null hypothesis is that there is no association between year and period, in other words that the distribution of admissions between the periods will be the same for each year. The expected values are shown in Table 19.9.
5. The chi-squared statistic is given by the sum of (observed - expected)²/expected over the six cells of the table.
There are 2 rows and 3 columns, giving us (2 - 1) × (3 - 1) = 2 degrees of freedom. Thus we have chi-squared = 11.8 with 2 degrees of freedom. From Table 13.3 we see that this has probability of less than 0.01. The data are not consistent with the null hypothesis. The evidence supports the view that admissions rose by more than could be ascribed to chance during the 1983 heatwave. We cannot be certain that this was due to the heatwave and not some other factor which happened to operate at the same time.
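The expected frequencies of Table 19.9 and the chi-squared statistic can be recomputed from the observed counts in Table 19.8. This is a sketch in plain Python; the names are illustrative.

```python
# Observed admissions from Table 19.8: rows 1982 and 1983,
# columns before, during and after the heatwave period
observed = [[190, 110, 82],
            [180, 178, 110]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency = row total x column total / grand total
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_squared = sum((o - e) ** 2 / e
                  for obs_row, exp_row in zip(observed, expected)
                  for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

print([[round(e, 1) for e in row] for row in expected])
print(f"chi-squared = {chi_squared:.1f} with {df} d.f.")
```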
6. We could see whether the same effect occurred in other districts between 1982 and 1983. We could also look at older records to see whether there was a similar increase in admissions, say for the heatwaves of 1975 and 1976.
Solution to Exercise 14M: Multiple choice questions 74 to 80
74. TFFTT. Table 14.2.
75. FTTTT. §14.5.
76. FFFFT. Regression, correlation and paired t methods need continuous data (§11.3, §11.9, §10.2). Kendall's τ can be used for ordered categories.
77. TFTFF. §14.2.
78. TFTTT. A t test could not be used because the data do not follow a Normal distribution (§10.3). The expected frequencies will be too small for a chi-squared test (§13.3), but a trend test would be O.K. (§13.8). A goodness of fit test could be used (§13.10).
79. FTTFT. A small-sample, paired method is needed (Table 14.4).
80. TFTFF. For a two by two table with small expected frequencies we can use Fisher's exact test or Yates' correction (§13.4, 5). McNemar's test is inappropriate because the groups are not matched (§13.9).
Solution to Exercise 14E
1. Overall preference: we have one sample of patients so we use Table 14.2. Of these 12 preferred A, 14 preferred B and 4 did not express a preference. We can use a Binomial or sign test (§9.2), only considering those who expressed a preference. Those for A are positives, those for B are negatives. We get two-sided P = 0.85, not significant.
Preference and order: we have the relationship between two variables (Table 14.3), preference and order, both nominal. We set up a two way table and do a chi-squared test. For the 3 by 2 table we have two expected frequencies less than five, so we must edit the table. There are no obvious combinations, but we can delete those who expressed no preference, leaving a 2 by 2 table, χ² = 1.3, 1 degree of freedom, P > 0.05.
2. The data are paired (Table 14.2) so we use a paired t test (§10.2). The assumption of a Normal distribution for the differences should be met as PEFR itself follows a Normal distribution fairly well. We get t = 6.45/5.05 = 1.3, degrees of freedom = 31, which is not significant. Using t = 2.04 (Table 10.1) we get a 95% confidence interval of -3.85 to 16.75 litres/min.
3. We have two independent samples (Table 14.1). We must use the total number of patients randomized to treatments, in an intention to treat analysis (§2.5). Thus we have 1721 active treatment patients including 15 deaths, and 1706 placebo patients with 35 deaths. A chi-squared test gives us χ² = 8.3, d.f. = 1, P < 0.01. A comparison of two proportions gives a difference of -0.0118 with 95% confidence interval -0.0198 to -0.0038 (§8.6) and a test of significance using the Standard Normal distribution gives a value of 2.88, P < 0.01 (§9.8).
4. We are looking at the relationship between two variables (Table 14.3). Both variables have very non-Normal distributions. Nitrite is highly skew and pH is bimodal. It might be possible to transform the nitrites to a Normal distribution but the transformation would not be a simple one. The zero prevents a simple logarithmic transformation, for example. Because of this, regression and correlation are not appropriate and rank correlation can be used. Spearman's ρ = 0.58 and Kendall's τ = 0.40, both giving a probability of 0.004.
5. We have two independent samples (Table 14.1). We have two large samples and can do the Normal comparison of two means (§8.5). The standard error of the difference is 0.0178 s and the observed difference is 0.02 s, giving a 95% confidence interval of -0.015 to 0.055 s for the excess mean transit time in the controls. If we had all the data, for each case we could calculate the mean MTT for the two controls matched to each case, find the difference between case MTT and control mean MTT, and use the one sample method of §8.3.
6. These are paired data, so we refer to Table 14.2. The unequal steps in the visual acuity scale suggest that it is best treated as an ordinal scale, so the sign test is appropriate. Pre minus post, there are 10 positive differences, no negative differences and 7 zeros. Thus we refer 0 to the Binomial distribution with p = 0.5 and n = 10. The probability is given by 0.5^10 = 0.00098.
7. We want to test for the relationship between two variables, which are both presented as categorical (Table 14.3). We use a chi-squared test for a contingency table, χ² = 38.1, d.f. = 6, P < 0.001. One possibility is that some other variable, such as the mother's smoking or poverty, is related to both maternal age and asthma. Another is that there is a cohort effect. All the age 14–19 mothers were born during the second world war, and some common historical experience may have produced the asthma in their children.
8. The serial measurements of thyroid hormone could be summarized using the area under the curve (§10.7). The oxygen dependence is tricky. The babies who died had the worst outcome, but if we took their survival time as the time they were oxygen dependent, we would be treating them as if they had a good outcome. We must also allow for the babies who went home on oxygen having a long but unknown oxygen dependence. My solution was to assign an arbitrary large number of days, larger than any for the babies sent home without oxygen, to the babies sent home on oxygen. I assigned an even larger number of days to the babies who died. I then used Kendall's tau b (§12.5) to assess the relationship with thyroid hormone AUC. Kendall's rank correlation was chosen in preference to Spearman's because of the large number of ties which the arbitrary assignment of large numbers produced.
9. This is a comparison of two independent samples, so we use Table 14.1. The variable is interval and the samples are small. We could either use the two sample t method (§10.3) or the Mann–Whitney U test (§12.2). The groups have similar variances, but the distribution shows a slight negative skewness. As the two sample t method is fairly robust to deviations from the Normal distribution and as I wanted a confidence interval for the difference I chose this option. I did not think that the slight skewness was sufficient to cause any problems.
By the two sample t method we get the difference between the means, immobile - mobile, to be 7.06, standard error = 5.74, t = 1.23, P = 0.23, 95% confidence interval = -4.54 to 18.66 hours. By the Mann–Whitney, we get U = 178.5, z = -1.06, P = 0.29. The two methods give very similar results and lead to similar conclusions, as we expect them to do when both methods are valid.
Solution to Exercise 15M: Multiple choice questions 81 to 86
81. TFTTF. §15.2. Unless the measurement process changes the subject, we would expect the difference in mean to be zero.
82. TFTFF. §15.4. We need the sensitivity as well as specificity. There are other things, dependent on the population studied, which may be important too, like the positive predictive value.
83. FTTTF. §15.4. Specificity, not sensitivity, measures how well people without the disease are eliminated.
84. TTFFF. §15.5. The 95% reference interval should not depend on the sample size.
85. FFFFT. §15.5. We expect 5% of ‘normal’ men to be outside these limits. The patient may have a disease which does not produce an abnormal haematocrit. This reference interval is for men, not women who may have a different distribution of haematocrit. It is dangerous to extrapolate the reference interval to a different population. In fact, for women the reference interval is 35.8 to 45.4, putting a woman with a haematocrit of 48 outside the reference interval. A haematocrit outside the 95% reference interval suggests that the man may be ill, although it does not prove it.
86. TFTTT. §15.6. As time increases, rates are based on fewer potential survivors. Withdrawals during the first interval contribute half an interval at risk. If survival rates change those subjects starting later in calendar time, and so more likely to be withdrawn, will have a different survival to those starting earlier. The first part of the curve will represent a different population to the second. The longest survivor may still be alive and so become a withdrawal.
Solution to Exercise 15E
1. The blood donors were used because it was easy to get the blood. This would produce a sample deficient in older people, so it was supplemented by people attending day centres. This would ensure that these were reasonably active, healthy people for their age. Given the problem of getting blood and the limited resources available, this seems a fairly satisfactory sample for the purpose. The alternative would be to take a random sample from the local population and try to persuade them to give the blood. There might have been so many refusals that volunteer bias would make the sample unrepresentative anyway. The sample is also biased geographically, being drawn from one part of London. In the context of the study, where we wanted to compare diabetics with normals, this did not matter so much, as both groups came from the same place. For a reference interval which would apply nationally, if there were a geographical factor the interval would be biased in other places. To look at this we would have to repeat the study in several places, compare the resulting reference intervals and pool as appropriate.
2. We want normal, healthy people for the sample, so we want to exclude people with obvious pathology and especially those with disease known to affect the quantity being measured. However, if we excluded all elderly people currently receiving drug therapy we would find it very difficult to get a sufficiently large sample. It is indeed ‘normal’ for the elderly to be taking analgesics and hypnotics, so these were permitted.
3. From the shape of the histogram and the Normal plot, the distribution of plasma magnesium does indeed appear Normal.
4. The reference interval, outside which about 5% of normal values are expected to lie, is x̄ - 2s to x̄ + 2s, or 0.810 - 2 × 0.057 to 0.810 + 2 × 0.057, which is 0.696 to 0.924, or 0.70 to 0.92 mmol/litre.
5. As the sample is large and the data Normally distributed the standard error of the limits is approximately √(3s²/n), here 0.0083439.
For the 95% confidence interval we take 1.96 standard errors on either side of the limit, 1.96 × 0.0083439 = 0.016. The 95% confidence interval for the lower reference limit is 0.696 - 0.016 to 0.696 + 0.016 = 0.680 to 0.712 or 0.68 to 0.71 mmol/litre. The confidence interval for the upper limit is 0.924 - 0.016 to 0.924 + 0.016 = 0.908 to 0.940 or 0.91 to 0.94 mmol/litre. The reference interval is well estimated as far as sampling errors are concerned.
6. Plasma magnesium did indeed increase with age. The variability did not. This would mean that for older people the lower limit would be too low and the upper limit too high, as the few above this would all be elderly. We could simply estimate the reference interval separately at different ages. We could do this using separate means but a common estimate of variance, obtained by one-way analysis of variance (§10.9). Or we could use the regression of magnesium on age to get a formula which would predict the reference interval for any age. The method chosen would depend on the nature of the relationship.
Solution to Exercise 16M: Multiple choice questions 87 to 92
87. FTFFF. §16.1. It is for a specific age group, not age adjusted. It measures the number of deaths per person at risk, not the total number. It tells us nothing about age structure.
88. FTTTT. §16.4. The life table is calculated from age specific death rates. Expectation of life is the expected value of the distribution of age at death if these mortality rates apply (§6E). It usually increases with age.
89. TFTTF. The SMR (§16.3) for women who had just had a baby is lower than 100 (all women) and 105 (stillbirth women). The confidence intervals do not overlap so there is good evidence for a difference. Women who had had a stillbirth may be less or more likely than all women to commit suicide, we cannot tell. We cannot conclude that giving birth prevents suicide; it may be that optimists conceive, for example.
90. TFFFF. §16.3. Age effects have been adjusted for. It may also be that heavy drinkers become publicans. It is difficult to infer causation from observational data. Men at high risk of cirrhosis of the liver, i.e. heavy drinkers, may not become window cleaners, or window cleaners who drink may change their occupation, which requires good balance. Window cleaners have low risk. The ‘average’ ratio is 100, not 1.0.
91. FFFTF. §16.6. A life table tells us about mortality, not population structure. A bar chart shows the relationship between two variables, not their frequency distribution (§5.5).
92. TFFFT. §16.1, §16.2, §16.5. Expectation of life does not depend on age distribution (§16.4).
Solution to Exercise 16E
1. We obtain the rates for the whole period by dividing the number of deaths in an age group by the population size. Thus for ages 10–14 we have 44/4271 = 0.01030 cases per thousand population. This is for a 13 year period so the rate per year is 0.01030/13 = 0.00079 per 1000 per year, or 0.79 per million per year. Table 19.10 shows the rates for each age group. The rates are unusual because they are highest among the adolescent group, where mortality rates for most causes are low. Anderson et al. (1985) note that ‘… our results suggest that among adolescent males abuse of volatile substances currently account for 2% of deaths from all causes …’. The rates are also unusual because we have not calculated them separately for each sex. This is partly for simplicity and partly because the number of cases in most age groups is small as it is.
2. The expected number of deaths is found by multiplying the number in the age group in Scotland by the death rate for the period, i.e. per 13 years, for Great Britain. We then add these to get 27.19 deaths expected altogether. We observed 48, so the SMR is 48/27.19 = 1.77, or 177 with Great Britain as 100.
Table 19.10. Age-specific mortality rates for volatile substance abuse, Great Britain, and calculation of SMR for Scotland
Age      Great Britain ASMRs                        Scotland       Scotland
group    Per million    Per thousand                population     expected
         per year       per 13 years                (thousands)    deaths
0–9        0.00           0.00000                      653          0.00000
10–14      0.79           0.01030                      425          4.37750
15–19      2.58           0.03358                      447         15.01026
20–24      0.87           0.01137                      394          4.47978
25–29      0.32           0.00415                      342          1.41930
30–39      0.08           0.00108                      659          0.71172
40–49      0.03           0.00033                      574          0.18942
50–59      0.09           0.00112                      579          0.64848
60+        0.03           0.00037                      962          0.35594
Total                                                              27.19240
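3. We find the standard error of the SMR by dividing the square root of the observed number of deaths by the expected number: √48/27.19 = 0.2548.
The 95% confidence interval is then 1.77 - 1.96 × 0.2548 to 1.77 + 1.96 × 0.2548, or 1.27 to 2.27. Multiplying by 100 as usual, we get 127 to 227. The observed number is quite large enough for the Normal approximation to the Poisson distribution to be used.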
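The expected deaths, SMR and its confidence interval can be recomputed from the columns of Table 19.10. This is a sketch under the same Normal approximation; the list and variable names are illustrative.

```python
import math

# Great Britain rates per thousand per 13 years and Scotland populations
# in thousands, by age group, copied from Table 19.10
gb_rate = [0.00000, 0.01030, 0.03358, 0.01137, 0.00415,
           0.00108, 0.00033, 0.00112, 0.00037]
scotland_pop = [653, 425, 447, 394, 342, 659, 574, 579, 962]

expected = sum(rate * pop for rate, pop in zip(gb_rate, scotland_pop))
observed = 48  # deaths observed in Scotland

smr = observed / expected
se = math.sqrt(observed) / expected  # standard error of the SMR
lower = smr - 1.96 * se
upper = smr + 1.96 * se

print(f"expected = {expected:.2f}, SMR = {smr:.2f}, SE = {se:.4f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Because the script carries the unrounded SMR through, the upper limit can differ in the second decimal place from the text, which rounds the SMR to 1.77 before adding 1.96 standard errors.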
4. Yes, the confidence interval is well away from 100. Other factors relate to the data collection, which was from newspapers, coroners, death registrations etc. Scotland has different newspapers and other news media and a different legal system to the rest of Great Britain. It may be that the association of deaths with VSA is more likely to be reported there than in England and Wales.
Solution to Exercise 17M: Multiple choice questions 93 to 97
93. TFTFT. §17.2. It is the ratio of the regression sum of squares to the total sum of squares.
94. FTFFF. §17.2. There were 37 + 1 = 38 observations. There is a highly significant ethnic group effect. The non-significant sex effect does not mean that there is no difference (§9.6). There are three age groups, so two degrees of freedom. If the effect of ethnicity were due entirely to age, it would have disappeared when age was included in the model.
95. TTTTF. §17.8. A four-level factor has three dummy variables (§17.6). If the effect of white cell count were due entirely to smoking, it would have disappeared when smoking was included in the model.
96. TTTFT. §17.4.
97. FFFFT. §17.9. Boys have a lower risk of readmission than girls, shown by the negative coefficient, and hence a longer time before being readmitted. Theophylline is related to a lower risk of readmission but we cannot conclude causation. Treatment may depend on the type and severity of asthma.
Solution to Exercise 17E
1. The difference is highly significant (P < 0.001) and is estimated to be between 1.3 and 3.7, i.e. volumes are higher in group 2, the trisomy-16 group.
2. From both the Normal plot and the plot against number of pairs of somites there appears to be one point which may be rather separate from the rest of the data, an outlier. Inspection of the data showed no reason to suppose that the point was an error, so it was retained. Otherwise the fit to the Normal distribution seems quite good. The plot against number of pairs of somites shows that there may be a relationship between mean and variability, but this is very small and will not affect the analysis too much. There is also a possible non-linear relationship, which should be investigated. (The addition of a quadratic term did not improve the fit significantly.)
3. Model difference in sum of squares = 207.139 - 197.708 = 9.431, residual sum of squares = 3.384, F ratio = 9.431/3.384 = 2.79 with 1 and 36 degrees of freedom, corresponding to t = 1.67, P > 0.1, not significant.
Solution to Exercise 18M: Multiple choice questions 98 to 100
98. TTFTT. §9.9. Power is a property of the test, not the sample. It cannot be zero, as even when there is no population difference at all the test may be significant.
99. TTTTF. §18.5. If we keep on adding observations and testing, we are carrying out multiple testing and so invalidate the test (§9.10).
100. TTFFT. §18.1. Power is not involved in estimation.
Solution to Exercise 18E
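3. This is a comparison of two proportions (§18.5). We have p1 = 0.15 and p2 = 0.15 × 0.9 = 0.135, a reduction of 10%. With a power of 90% and a significance level of 5%, we have n = 10.5 × (0.15 × 0.85 + 0.135 × 0.865)/(0.15 - 0.135)² = 11 400.
Hence we need 11 400 in each group, 22 800 patients altogether. With a power of 80% and a significance level of 5%, we have n = 7.9 × (0.15 × 0.85 + 0.135 × 0.865)/(0.15 - 0.135)² = 8577.
Hence we need 8577 in each group, 17 154 patients altogether. Lowering the power reduces the required sample size, but, of course, reduces the chance of detecting a difference if there really is one.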
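The two sample sizes can be reproduced with the usual large sample formula for comparing two proportions. This is a sketch under stated assumptions: the multiplier 7.9 for 80% power is the value quoted later in this solution, while 10.5 for 90% power is assumed here as the standard value (1.96 + 1.28)²; the function name is hypothetical.

```python
import math

def n_per_group(p1, p2, f):
    """Large sample size per group for comparing two proportions.

    f is the multiplier f(alpha, P) for the chosen significance
    level and power.
    """
    return f * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2

p1 = 0.15
p2 = p1 * 0.9  # a 10% reduction

n_90 = n_per_group(p1, p2, 10.5)  # assumed multiplier for 90% power
n_80 = n_per_group(p1, p2, 7.9)   # multiplier for 80% power

print(math.ceil(n_90), math.ceil(n_80))
```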
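4. This is the comparison of two means (§18.4). We estimate the sample size for a difference of one standard deviation, µ1 - µ2 = σ. With a power of 90% and a significance level of 5%, the number in each group is given by n = 2 × 10.5 × σ²/(µ1 - µ2)² = 2 × 10.5 = 21.
Hence we need 21 in each group. If we have unequal samples and n1 = 100, n2 is given by 1/n1 + 1/n2 = 2/21, so that 1/n2 = 2/21 - 1/100 = 0.0852 and n2 = 11.7,
and so we need 12 subjects in the disease group.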
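The same calculation for two means, including the unequal-groups case, can be sketched as follows. The multiplier 10.5 for 90% power and 5% significance is an assumption (the standard value (1.96 + 1.28)²), and the names are illustrative.

```python
import math

f_90 = 10.5       # assumed multiplier f(0.05, 0.90)
effect_in_sd = 1  # difference to detect, in standard deviations

# Equal groups: n = 2 f sigma^2 / (mu1 - mu2)^2
n_equal = 2 * f_90 / effect_in_sd ** 2

# Unequal groups: 1/n1 + 1/n2 must equal 2/n_equal
n1 = 100
n2 = 1 / (2 / n_equal - 1 / n1)

print(n_equal, math.ceil(n2))
```

Fixing the larger group at 100 lets the smaller group fall well below 21, because precision depends on the sum of the reciprocals of the two sample sizes.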
5. When the number of clusters is very small and the number of individuals within a cluster is large, as in this study, clustering can have a major effect. The design effect, by which the estimated sample size should be multiplied, is DEFF = 1 + (750 - 1) × 0.005 = 4.745. Thus the estimated sample size for any given comparison should be multiplied by 4.745. Looking at it another way, the effective sample size is the actual sample size, 3000, divided by 4.745, about 632. Further, sample size calculations should take into account degrees of freedom. In large sample approximation sample size calculations, power 80% and alpha 5% are embodied in the multiplier f(α, P) = f(0.05, 0.80) = (1.96 + 0.85)² = 7.90. For a small sample calculation using the t test, 1.96 must be replaced by the corresponding 5% point of the t distribution with the appropriate degrees of freedom, here 2 degrees of freedom giving t = 4.30. Hence the multiplier is (4.30 + 0.85)² = 26.52, 3.36 times that for the large sample.
The effect of the small number of clusters would reduce the effective sample size even more, down to 630/3.36 = 188. Thus the 3000 men in two groups of two clusters would give the same power to detect the same difference as 188 men randomized individually. The applicants resubmitted a proposal with many more clusters.
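The chain of adjustments in part 5 can be followed numerically. This sketch uses only the figures quoted above (cluster size 750, intracluster correlation 0.005, 3000 men, and the Normal and t multipliers); the variable names are illustrative.

```python
cluster_size = 750
icc = 0.005          # intracluster correlation
total_men = 3000

# Design effect and the effective sample size it implies
deff = 1 + (cluster_size - 1) * icc
effective_n = total_men / deff

# Multipliers for 80% power, 5% significance:
# large sample (Normal) versus t with 2 degrees of freedom
f_large = (1.96 + 0.85) ** 2
f_small = (4.30 + 0.85) ** 2
ratio = f_small / f_large

equivalent_n = effective_n / ratio
print(f"DEFF = {deff:.3f}, effective n = {effective_n:.0f}")
print(f"multiplier ratio = {ratio:.2f}, equivalent n = {equivalent_n:.0f}")
```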
References
Altman, D.G. (1982). Statistics and ethics in medical research. In Statistics in Practice (ed. S.M. Gore and D.G. Altman). British Medical Association, London.
Altman, D.G. (1991). Practical Statistics for Medical Research, Chapman and Hall, London.
Altman, D.G. (1998). Confidence intervals for the number needed to treat. British Medical Journal, 317, 1309–12.
Altman, D.G. and Bland, J.M. (1983). Measurement in medicine: the analysis of method comparison studies. The Statistician, 32, 307–17.
Altman, D.G. and Matthews, J.N.S. (1996). Statistics Notes: Interaction 1: heterogeneity of effects. British Medical Journal, 313, 486.
Anderson, H.R., Bland, J.M., Patel, S., and Peckham, C. (1986). The natural history of asthma in childhood. Journal of Epidemiology and Community Health, 40, 121–9.
Anderson, H.R., MacNair, R.S., and Ramsey, J.D. (1985). Deaths from abuse of substances, a national epidemiological study. British Medical Journal, 290, 304–7.
Anon (1997). All trials must have informed consent. British Medical Journal, 314, 1134–5.
Appleby, L. (1991). Suicide during pregnancy and in the first postnatal year. British Medical Journal, 302, 137–40.
Armitage, P. and Berry, G. (1994). Statistical Methods in Medical Research, Blackwell, Oxford.
Balfour, R.P. (1991). Birds, milk and campylobacter. Lancet, 337, 176.
Ballard, R.A., Ballard, P.C., Creasy, R.K., Padbury, J., Polk, D.H., Bracken, M., Maya, F.R., and Gross, I. (1992). Respiratory disease in very-low-birthweight infants after prenatal thyrotropin releasing hormone and glucocorticoid. Lancet, 339, 510–5.
Banks, M.H., Bewley, B.R., Bland, J.M., Dean, J.R., and Pollard, V.M. (1978). A long term study of smoking by secondary schoolchildren. Archives of Disease in Childhood, 53, 12–19.
Bewley, B.R. and Bland, J.M. (1976). Academic performance and social factors related to cigarette smoking by schoolchildren. British Journal of Preventive and Social Medicine, 31, 18–24.
Bewley, B.R., Bland, J.M., and Harris, R. (1974). Factors associated with the starting of cigarette smoking by primary school children. British Journal of Preventive and Social Medicine, 28, 37–44.
Bewley, T.H., Bland, J.M., Ilo, M., Walch, E., and Willington, G. (1975). Census of mental hospital patients and life expectancy of those unlikely to be discharged. British Medical Journal, 4, 671–5.
Bewley, T.H., Bland, J.M., Mechen, D., and Walch, E. (1981). ‘New chronic’ patients. British Medical Journal, 283, 1161–4.
Bland, J.M. and Altman, D.G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i, 307–10.
Bland, J.M. and Altman, D.G. (1993). Informed consent. British Medical Journal, 306, 928.
Bland, J.M. and Altman, D.G. (1998). Statistics Notes. Bayesians and frequentists. British Medical Journal, 317, 1151.
Bland, J.M. and Altman, D.G. (1999). Measuring agreement in method comparison studies. Statistical Methods in Medical Research, 8, 135–60.
Bland, J.M., Bewley, B.R., Banks, M.H., and Pollard, V.M. (1975). Schoolchildren's beliefs about smoking and disease. Health Education Journal, 34, 71–8.
Bland, J.M., Bewley, B.R., Pollard, V., and Banks, M.H. (1978). Effect of children's and parents' smoking on respiratory symptoms. Archives of Disease in Childhood, 53, 100–5.
Bland, J.M., Bewley, B.R., and Banks, M.H. (1979). Cigarette smoking and children's respiratory symptoms: validity of questionnaire method. Revue d'Epidemiologie et Santé Publique, 27, 69–76.
Bland, J.M., Holland, W.W., and Elliott, A. (1974). The development of respiratory symptoms in a cohort of Kent schoolchildren. Bulletin Physio-Pathologie Respiratoire, 10, 699–716.
Bland, J.M. and Kerry, S.M. (1998). Statistics Notes. Weighted comparison of means. British Medical Journal, 316, 129.
Bland, J.M., Mutoka, C., and Hutt, M.S.R. (1977). Kaposi's sarcoma in Tanzania. East African Journal of Medical Research, 4, 47–53.
Bland, J.M. and Peacock, J.L. (2000). Statistical Questions in Evidence-Based Medicine, University Press, Oxford.
Bland, M. (1995). An Introduction to Medical Statistics, 2nd. ed., University Press, Oxford.
Bland, M. (1997). Informed consent in medical research: Let readers judge for themselves. British Medical Journal, 314, 1477–8.
BMJ (1996a). The Declaration of Helsinki. British Medical Journal, 313, 1448.
BMJ (1996b). The Nuremberg code (1947). British Medical Journal, 313, 1448.
Brawley, O.W. (1998). The study of untreated syphilis in the negro male. International Journal of Radiation Oncology, Biology, Physics, 40, 5–8.
Breslow, N.E. and Day, N.E. (1987). Statistical Methods in Cancer Research. Volume II: The Design and Analysis of Cohort Studies, IARC, Lyon.
British Standards Institution (1979). Precision of test methods. 1: Guide for the determination and reproducibility of a standard test method (BS5497, part 1), BSI, London.
Brooke, O.G., Anderson, H.R., Bland, J.M., Peacock, J., and Stewart, M. (1989). The influence on birthweight of smoking, alcohol, caffeine, psychosocial and socio-economic factors. British Medical Journal, 298, 795–801.
Bryson,M.C.(1976).TheLiteraryDigestpoll:makingofastatisticalmyth.TheAmericanStatistician,30,184–5.
BulletinofMedicalEthics(1998).News:LivelydebateonresearchethicsintheUS.November,3–4.
Burdick,R.K.andGraybill,F.A.(1992).ConfidenceIntervalsonVarianceComponents,NewYork,Dekker.
Burr,M.L.,St.Leger,A.S.,andNeale,E.(1976).Anti-mitemeasuresinmite-sensitiveadultasthma:acontrolledtrial.Lancet,i,333–5.
Campbell,M.J.andGardner,M.J.(1989).Calculatingconfidenceintervalsforsomenon-parametricanalyses.InStatisticswithConfidence(ed.Gardner,M.J.andAltman,D.G.).BritishMedicalJournal,London.
Carleton,R.A.,Sanders,C.A.,andBurack,W.R.(1960).Heparinadministrationafteracutemyocardialinfarction.NewEnglandJournalofMedicine,263,1002–4.
Casey,A.T.H.,Crockard,H.A.,Bland,J.M.,Stevens,J.,Moskovich,R.,andRansford,A.(1996).Predictorsofoutcomeinthequadripareticnonambulatorymyelopathicpatientwithrheumatoid-arthritis—aprospectivestudyof55surgicallytreatedRanawatclassIIIBpatients.JournalofNeurosurgery,85,574–81.
Christie,D.(1979).Before-and-aftercomparisons:acautionarytale.BritishMedicalJournal,2,1629–30.
Cochran,W.G.(1977).SamplingTechniques,Wiley,NewYork.
Colton,T.(1974).StatisticsinMedicine,LittleBrown,Boston.
Cook,R.J.andSackett,D.L.(1995).Thenumberneededtotreat:aclinicallyusefulmeasureoftreatmenteffect.BritishMedicalJournal,310,452–4.
Conover,W.J.(1980).PracticalNonparametricStatistics,JohnWileyandSons,NewYork.
Cox,D.R.(1972).Regressionmodelsandlifetables.JournaloftheRoyalStatisticalSocietySeriesB,34,187–220.
Curtis,M.J.,Bland,J.M.,andRing,P.A.(1992).TheRingtotalkneereplacement—acomparisonofsurvivorship.JournaloftheRoyalSocietyofMedicine,85,208–10.
Davies,O.L.andGoldsmith,P.L.(1972).StatisticalMethodsinResearchandProduction,OliverandBoyd,Edinburgh.
Dennis,M.(1997).Commentary:Whywedidn'taskpatientsfortheirconsent.BritishMedicalJournal,314,1077.
Dennis,M.,O'Rourke,S.,Slattery,J.,Staniforth,T.,andWarlow,C.(1997).Evaluationofastrokefamilycareworker:resultsofarandomisedcontrolledtrial.BritishMedicalJournal,314,1071–11.
DHSS(1976).PreventionandHealth:Everybody'sBusiness,HMSO,London.
Doll,R.andHill,A.B.(1950).Smokingandcarcinomaofthelung.BritishMedicalJournal,ii,739–48.
Doll,R.andHill,A.B.(1956).Lungcancerandothercausesofdeathinrelationtosmoking:asecondreportonthemortalityofBritishdoctors.BritishMedicalJournal,ii,1071–81.
Donnan,S.P.B.andHaskey,J.(1977).Alcoholismandcirrhosisoftheliver.PopulationTrends,7,18–24.
Donner,A.,Brown,K.S.,andBrasher,P.(1990).Amethodologicalreviewofnon-therapeuticinterventiontrialsemployingclusterrandomisation1979–1989.InternationalJournalofEpidemiology,19,795–800.
Doyal,L.(1997).Informedconsentinmedicalresearch:Journalsshouldnotpublishresearchtowhichpatientshavenotgivenfullyinformedconsent—withthreeexceptions.BritishMedicalJournal,314,1107–11.
Easterbrook,P.J.,Berlin,J.A.,Gopalan,R.,andMathews,D.R.(1991).Publicationbiasinclinicalresearch.Lancet,337,867–72.
Egero,B.andHenin,R.A.(1973).ThePopulationofTanzania,BureauofStatistics,DaresSalaam.
Esmail,A.,Warburton,B.,Bland,J.M.,Anderson,H.R.,Ramsey,J.(1997).RegionalvariationsindeathsfromvolatilesubstanceabuseinGreatBritain.Addiction,92,1765–71.
Finney, D.J., Latscha, R., Bennett, B.M., and Hsu, P. (1963). Tables for Testing Significance in a 2×2 Contingency Table, Cambridge University Press, London.
Fish, P.D., Bennett, G.C.J., and Millard, P.H. (1985). Heatwave morbidity and mortality in old age. Age and Ageing, 14, 243–5.
Flint,C.andPoulengeris,P.(1986).The‘KnowYourMidwife’Report,CarolineFlint,London.
Friedland,J.S.,Porter,J.C.,Daryanani,S.,Bland,J.M.,Screaton,N.J.,Vesely,M.J.J.,Griffin,G.E.,Bennett,E.D.,andRemick,D.G.(1996).Plasmaproinflammatorycytokineconcentrations,AcutePhysiologyandChronicHealthEvaluation(APACHE)IIIscoresandsurvivalinpatientsinanintensivecareunit.CriticalCareMedicine,24,1775–81.
Galton,F.(1886).Regressiontowardsmediocrityinhereditarystature.JournaloftheAnthropologicalInstitute,15,246–63.
Gardner,M.J.andAltman,D.G.(1986).ConfidenceintervalsratherthanPvalues:estimationratherthanhypothesistesting.BritishMedicalJournal,292,746–50.
Glasziou,P.P.andMackerras,D.E.M.(1993).VitaminAsupplementationininfectiousdisease:ameta-analysis.BritishMedicalJournal,306,366–70.
Goldstein,H.(1995).MultilevelStatisticalModels,EdwardArnold,London.
Harper,R.andReeves,B.(1999).Reportingofprecisionofestimatesfordiagnosticaccuracy:areview.BritishMedicalJournal,318,1322–3.
Hart,P.D.andSutherland,I.(1977).BCGandvolebacillusinthepreventionoftuberculosisinadolescenceandearlyadultlife.BritishMedicalJournal,2,293–5.
Healy,M.J.R.(1968).Discipliningmedicaldata.BritishMedicalBulletin,24,210–4.
Hedges,B.M.(1978).Questionwordingeffects:presentingoneorbothsidesofacase.TheStatistician,28,83–99.
Henzi, I., Walder, B., and Tramèr, M.R. (2000). Dexamethasone for the prevention of postoperative nausea and vomiting: a quantitative systematic review. Anesthesia & Analgesia, 90, 186–94.
Hickish,T.,Colston,K.,Bland,J.M.,andMaxwell,J.D.(1989).VitaminDdeficiencyandmusclestrengthinmalealcoholics.ClinicalScience,77,171–6.
Hill,A.B.(1962).StatisticalMethodsinClinicalandPreventiveMedicine,ChurchillLivingstone,Edinburgh.
Hill,A.B.(1977).AShortTextbookofMedicalStatistics,HodderandStoughton,London.
Holland,W.W.,Bailey,P.,andBland,J.M.(1978).Long-termconsequencesofrespiratorydiseaseininfancy.JournalofEpidemiologyandCommunityHealth,32,256–9.
Holten,C.(1951).Anticoagulantsinthetreatmentofcoronarythrombosis.ActaMedicaScandinavica,140,340–8.
Hosmer,D.W.andLemeshow,S.(1999).AppliedSurvivalAnalysis,JohnWileyandSons,NewYork.
Huff,D.(1954).HowtoLiewithStatistics,Gollancz,London.
Huskisson,E.C.(1974).Simpleanalgesicsforarthritis.BritishMedicalJournal,4,196–200.
James,A.H.(1977).BreakfastandCrohn'sdisease.BritishMedicalJournal,1,943–7.
Johnson,F.N.andJohnson,S.(ed.)(1977).ClinicalTrials,Blackwell,Oxford.
Johnston,I.D.A.,Anderson,H.R.,Lambert,H.P.,andPatel,S.(1983).Respiratorymorbidityandlungfunctionafterwhoopingcough.Lancet,ii,1104–8.
Jones, B. and Kenward, M.G. (1989). Design and Analysis of Cross-Over Trials, Chapman and Hall, London.
Kaste,M.,Kuurne,T.,Vilkki,J.,Katevuo,K.,Sainio,K.,andMeurala,H.(1982).Ischronicbraindamageinboxingahazardofthepast?Lancet,ii,1186–8.
Kendall,M.G.(1970).RankCorrelationMethods,CharlesGriffin,London.
Kendall,M.G.andBabingtonSmith,B.(1971).TablesofRandomSamplingNumbers,CambridgeUniversityPress,Cambridge.
Kendall,M.G.andStuart,A.(1969).TheAdvancedTheoryofStatistics,3rd.ed.,vol.1,CharlesGriffin,London.
Kerrigan, D.D., Thevasagayam, R.S., Woods, T.O., McWelch, I., Thomas, W.E.G., Shorthouse, A.J., and Dennison, A.R. (1993). Who's afraid of informed consent? British Medical Journal, 306, 298–300.
Kerry,S.M.andBland,J.M.(1998).StatisticsNotes:Analysisofatrialrandomizedinclusters.BritishMedicalJournal,316,54.
Kiely, P.D.W., Bland, J.M., Joseph, A.E.A., Mortimer, P.S., and Bourke, B.E. (1995). Upper limb lymphatic function in inflammatory arthritis. Journal of Rheumatology, 22, 214–17.
Kish,L.(1994).SurveySampling,WileyClassicLibrary,NewYork.
Lancet(1980).BCG:badnewsfromIndia.Lancet,i,73–4.
Laupacis,A.,Sackett,D.L.,Roberts,R.S.(1988).Anassessmentofclinicallyusefulmeasuresoftheconsequencesoftreatment.NewEnglandJournalofMedicine,318,1728–33.
Leaning,J.(1996).Warcrimesandmedicalscience.BritishMedicalJournal,313,1413–15.
Lee, K.L., McNeer, J.F., Starmer, F.C., Harris, P.J., and Rosati, R.A. (1980). Clinical judgements and statistics: lessons from a simulated randomized trial in coronary artery disease. Circulation, 61, 508–15.
Lemeshow,S.,Hosmer,D.W.,Klar,J.,andLwanga,S.K.(1990).AdequacyofSampleSizeinHealthStudies,JohnWileyandSons,Chichester.
Leonard, J.V., Whitelaw, A.G.L., Wolff, O.H., Lloyd, J.K., and Slack, S. (1977). Diagnosing familial hypercholesterolaemia in childhood by measuring serum cholesterol. British Medical Journal, 1, 1566–8.
Levine,M.I.andSackett,M.F.(1946).ResultsofBCGimmunizationinNewYorkCity.AmericanReviewofTuberculosis,53,517–32.
Lindley, D.V. and Miller, J.C.P. (1955). Cambridge Elementary Statistical Tables, Cambridge University Press, Cambridge.
Lopez-Olaondo,L.,Carrascosa,F.,Pueyo,F.J.,Monedero,P.,Busto,N.,andSaez,A.(1996).Combinationofondansetronanddexamethasoneintheprophylaxisofpostoperativenauseaandvomiting.BritishJournalofAnaesthesia,76,835–40.
Lucas,A.,Morley,R.,Cole,T.J.,Lister,G.,andLeeson-Payne,C.(1992).Breastmilkandsubsequentintelligencequotientinchildrenbornpreterm.Lancet,339,510–5.
Luthra,P.,Bland,J.M.,andStanton,S.L.(1982).Incidenceofpregnancyafterlaparoscopyandhydrotubation.BritishMedicalJournal,284,1013.
Machin,D.,Campbell,M.J.,Fayers,P.,andPinol,A.(1998).StatisticalTablesfortheDesignofClinicalStudies,SecondEdition,Blackwell,Oxford.
Mantel,N.(1966).Evaluationofsurvivaldataandtwonewrankorderstatisticsarisinginitsconsideration.CancerChemotherapyReports,50,163–70.
Mather,H.M.,Nisbet,J.A.,Burton,G.H.,Poston,G.J.,Bland,J.M.,Bailey,P.A.,andPilkington,T.R.E.(1979).Hypomagnesaemiaindiabetes.ClinicaChemicaActa,95,235–42.
Matthews,D.E.andFarewell,V.(1988).UsingandUnderstandingMedicalStatistics,SecondEdition,Karger,Basel.
Matthews,J.N.S.andAltman,D.G.(1996a).StatisticsNotes:Interaction2:compareeffectsizesnotPvalues.BritishMedicalJournal,313,808.
Matthews,J.N.S.andAltman,D.G.(1996b).StatisticsNotes:Interaction3:howtoexamineheterogeneity.BritishMedicalJournal,313,862.
Matthews,J.N.S.,Altman,D.G.,Campbell,M.J.,andRoyston,P.(1990).Analysisofserialmeasurementsinmedicalresearch.BritishMedicalJournal,300,230–5.
Maugdal,D.P.,Ang,L.,Patel,S.,Bland,J.M.,andMaxwell,J.D.(1985).Nutritionalassessmentinpatientswithchronicgastro-intestinalsymptoms:comparisonoffunctionalandorganicdisorders.HumanNutrition:ClinicalNutrition,39,203–12.
Maxwell,A.E.(1970).Comparingtheclassificationofsubjectsbytwoindependentjudges.BritishJournalofPsychiatry,116,651–5.
Mayberry,J.F.,Rhodes,J.,andNewcombe,R.G.(1978).BreakfastanddietaryaspectsofCrohn'sdisease.BritishMedicalJournal,2,1401.
McKie,D.(1992).Pollstersturntosecretballot.TheGuardian,London,24August,p.20.
McLean,S.(1997).Commentary:Noconsentmeansnottreatingthepatientwithrespect.BritishMedicalJournal,314,1076.
Meade, T.W., Roderick, P.J., Brennan, P.J., Wilkes, H.C., and Kelleher, C.C. (1992). Extra-cranial bleeding and other symptoms due to low dose aspirin and low intensity oral anticoagulation. Thrombosis and Haemostasis, 68, 1–6.
Meier,P.(1977).Thebiggesthealthexperimentever:the1954fieldtrialoftheSalkpoliomyelitisvaccine.InStatistics:AGuidetotheBiologicalandHealthSciences(ed.J.M.Tanur,etal.).Holden-Day,SanFrancisco.
Mitchell,E.A.,Bland,J.M.,andThompson,J.M.D.(1994).Riskfactorsforreadmissiontohospitalforasthma.Thorax,49,33–36.
Morris, J.A. and Gardner, M.J. (1989). Calculating confidence intervals for relative risks, odds ratios and standardized ratios and rates. In Statistics with Confidence (ed. Gardner, M.J. and Altman, D.G.). British Medical Journal, London.
MRC(1948).Streptomycintreatmentofpulmonarytuberculosis.BritishMedicalJournal,2,769–82.
Mudur,G.(1997).Indianstudyofwomenwithcervicallesionscalledunethical.BritishMedicalJournal,314,1065.
Newcombe,R.G.(1992).Confidenceintervals:enlighteningormystifying.BritishMedicalJournal,304,381–2.
Newnham,J.P.,Evans,S.F.,Con,A.M.,Stanley,F.J.,andLandau,L.I.(1993).Effectsoffrequentultrasoundduringpregnancy:arandomizedcontrolledtrial.Lancet,342,887–91.
Oakeshott,P.,Kerry,S.M.,andWilliams,J.E.(1994).RandomisedcontrolledtrialoftheeffectoftheRoyalCollegeofRadiologists'guidelinesongeneralpractitioners'referralforradiographicexamination.BritishJournalofGeneralPractice,44,197–200.
O'Brien,P.C.andFleming,T.R.(1979).Amultipletestingprocedureforclinicaltrials.Biometrics,35,549–56.
OfficeforNationalStatistics(1997).1995,1996,1997MortalityStatistics,General,SeriesDH1,No.28,HMSO,London.
OfficeforNationalStatistics(1998a).1998MortalityStatistics,General,SeriesDH1,No.29,HMSO,London.
OfficeforNationalStatistics(1998b).1997BirthStatistics,SeriesFM1,No.26,HMSO,London.
OfficeforNationalStatistics(1999).MortalityStatistics,Childhood,InfantandPerinatal,SeriesDH3,No.30,HMSO,London.
Oldham,H.G.,Bevan,M.M.,andMcDermott,M.(1979).ComparisonofthenewminiatureWrightpeakflowmeterwiththestandardWrightpeakflowmeter.Thorax,34,807–8.
OPCS(1991).MortalityStatistics,SeriesDH2,No.16,HMSO,London.
OPCS(1992).MortalityStatistics,SeriesDH1,No.24,HMSO,London.
Osborn,J.F.(1979).StatisticalExercisesinMedicalResearch,Blackwell,Oxford.
Paraskevaides,E.C.,Pennington,G.W.,Naik,S.,andGibbs,A.A.(1991).Prefreeze/post-freezesemenmotilityratio.Lancet,337,366–7.
Parmar,M.andMachin,D.(1995).SurvivalAnalysis,JohnWileyandSons,Chichester.
Pearson,E.S.andHartley,H.O.(1970).BiometrikaTablesforStatisticians,vol.1,CambridgeUniversityPress,Cambridge.
Pearson,E.S.andHartley,H.O.(1972).BiometrikaTablesforStatisticians,vol.2,CambridgeUniversityPress,Cambridge.
Peduzzi,P.,Concato,J.,Kemper,E.,Holford,T.R.,andFeinstein,A.R.(1996).Asimulationstudyofthenumberofeventspervariableinlogisticregressionanalysis.JournalofClinicalEpidemiology,49,1373–9.
Pocock,S.J.(1977).Groupsequentialmethodsinthedesignandanalysisofclinicaltrials.Biometrika,64,191–9.
Pocock,S.J.(1982).Interimanalysesforrandomisedclinicaltrials:thegroupsequentialapproach.Biometrics,38,153–62.
Pocock,S.J.(1983).ClinicalTrials:APracticalApproach,JohnWileyandSons,Chichester.
Pocock,S.J.andHughes,M.D.(1990).Estimationissuesinclinicaltrialsandoverviews.StatisticsinMedicine,9,657–71.
Pritchard, B.N.C., Dickinson, C.J., Alleyne, G.A.O., Hurst, P., Hill, I.D., Rosenheim, M.L., and Laurence, D.R. (1963). Report of a clinical trial from Medical Unit and MRC Statistical Unit, University College Hospital Medical School, London. British Medical Journal, 2, 1226–7.
RadicalStatisticsHealthGroup(1976).WhosePriorities?,RadicalStatistics,London.
Ramsay,S.(1998).MissEvers'Boys(review).Lancet,352,1075.
Reader,R.,etal.(1980).TheAustraliantrialinmildhypertension:reportbythemanagementcommittee.Lancet,i,1261–7.
Rembold, C. (1998). Number needed to screen: development of a statistic for disease screening. British Medical Journal, 317, 307–12.
Rodin,D.A.,Bano,G.,Bland,J.M.,Taylor,K.,andNussey,S.S.(1998).PolycysticovariesandassociatedmetabolicabnormalitiesinIndiansubcontinentAsianwomen.ClinicalEndocrinology,49,91–9.
Rose,G.A.,Holland,W.W.,andCrowley,E.A.(1964).Asphygmomanometerforepidemiologists.Lancet,i,296–300.
Rowe,D.(1992).Motheranddaughteraren'tdoingwell.TheGuardian,London,14July,p.33.
Royston,P.andAltman,D.G.(1994).Regressionusingfractionalpolynomialsofcontinuouscovariates:parsimoniousparametricmodelling.AppliedStatistics,43,429–467.
Salvesen,K.A.,Bakketeig,L.S.,Eik-nes,S.H.,Undheim,J.O.,andOkland,O.(1992).Routineultrasonographyinuteroandschoolperformanceatage8–9years.Lancet,339,85–9.
Samuels,P.,Bussel,J.B.,Braitman,L.E.,Tomaski,A.,Druzin,M.L.,Mennuti,M.T.,andCines,D.B.(1990).Estimationoftheriskofthrombocytopeniaintheoffspringofpregnantwomenwithpresumedimmunethrombocytopeniapurpura.NewEnglandJournalofMedicine,323,229–35.
Schapira,K.,McClelland,H.A.,Griffiths,N.R.,andNewell,D.J.(1970).Studyontheeffectsoftabletcolourinthetreatmentofanxietystates.BritishMedicalJournal,2,446–9.
Schmid,H.(1973).Kaposi'ssarcomainTanzania:astatisticalstudyof220cases.TropicalGeographicalMedicine,25,266–76.
Schulz,K.F.,Chalmers,I.,Hayes,R.J.,andAltman,D.G.(1995).Biasduetonon-concealmentofrandomizationandnon-double-blinding.JournaloftheAmericanMedicalAssociation,273,408–12.
Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components, Wiley, New York.
Senn,S.(1989).Cross-OverTrialsinClinicalResearch,Wiley,Chichester.
Shaker, J.L., Brickner, R.C., Findling, J.W., Kelly, T.M., Rapp, R., Rizk, G., Haddad, J.G., Schalch, D.S., and Shenker, Y. (1997). Hypocalcemia and skeletal disease as presenting features of celiac disease. Archives of Internal Medicine, 157, 1013–6.
Siegel, S. (1956). Non-parametric Statistics for the Behavioural Sciences, McGraw-Hill Kogakusha, Tokyo.
Sibbald,B.,AddingtonHall,J.,Brenneman,D.,andFreeling,P.(1994).Telephoneversuspostalsurveysofgeneralpractitioners.BritishJournalofGeneralPractice,44,297–300.
Snedecor,G.W.andCochran,W.G.(1980).StatisticalMethods,7thedn.,IowaStateUniversityPress,Ames,Iowa.
Snowdon, C., Garcia, J., and Elbourne, D.R. (1997). Making sense of randomisation: Responses of parents of critically ill babies to random allocation of treatment in a clinical trial. Social Science and Medicine, 45, 1337–55.
South-eastLondonScreeningStudyGroup(1977).Acontrolledtrialofmultiphasicscreeninginmiddle-age:resultsoftheSouth-EastLondonScreeningStudy.InternationalJournalofEpidemiology,6,357–63.
Southern, J.P., Smith, R.M.M., and Palmer, S.R. (1990). Bird attack on milk bottles: possible mode of transmission of Campylobacter jejuni to man. Lancet, 336, 1425–7.
Streiner,D.L.andNorman,G.R.(1996).HealthMeasurementScales:APracticalGuidetoTheirDevelopmentandUse,secondedition,Oxford,UniversityPress.
Stuart,A.(1955).Atestforhomogeneityofthemarginaldistributionsinatwo-wayclassification.Biometrika,42,412.
‘Student’(1908).Theprobableerrorofamean.Biometrika,6,1–24.
‘Student’(1931).TheLanarkshireMilkExperiment.Biometrika,23,398–406.
Thomas,P.R.S.,Queraishy,M.S.,Bowyer,R.,Scott,R.A.P.,Bland,J.M.,andDormandy,J.A.(1993).Leucocytecount:apredictorofearlyfemoropoplitealgraftfailure.CardiovascularSurgery,1,369–72.
Thompson,S.G.(1993).Controversiesinmeta-analysis:thecaseofthetrialsofserumcholesterolreduction.StatisticalMethodsinMedicalResearch,2,173–92.
Todd,G.F.(1972).StatisticsofSmokingintheUnitedKingdom,6thed.,TobaccoResearchCouncil,London.
Tukey,J.W.(1977).ExploratoryDataAnalysis,Addison-Wesley,NewYork.
Turnbull,P.J.,Stimson,G.V.,andDolan,K.A.(1992).PrevalenceofHIVinfectionamongex-prisoners.BritishMedicalJournal,304,90–1.
Velzeboer,S.C.J.M.,Frenkel,J.,anddeWolff,F.A.(1997).Ahypertensivetoddler.Lancet,349,1810.
Victora,C.G.(1982).Statisticalmalpracticeindrugpromotion:acase-studyfromBrazil.SocialScienceandMedicine,16,707–9.
White,P.T.,Pharoah,C.A.,Anderson,H.R.,andFreeling,P.(1989).Improvingtheoutcomeofchronicasthmaingeneralpractice:arandomizedcontrolledtrialofsmallgroupeducation.JournaloftheRoyalCollegeofGeneralPractitioners,39,182–6.
Whitehead,J.(1997).TheDesignandAnalysisofSequentialClinicalTrials,revised2nd.ed.,Chichester,Wiley.
Whittington,C.(1977).Safetybeginsathome.NewScientist,76,340–2.
Williams,E.I.,Greenwell,J.,andGroom,L.M.(1992).Thecareofpeopleover75yearsoldafterdischargefromhospital:anevaluationoftimetabledvisitingbyHealthVisitorAssistants.JournalofPublicHealthMedicine,14,138–44.
Wroe,S.J.,Sandercock,P.,Bamford,J.,Dennis,M.,Slattery,J.,andWarlow,C.(1992).Diurnalvariationinincidenceofstroke:Oxfordshirecommunitystrokeproject.BritishMedicalJournal,304,155–7.
Zelen,M.(1979).Anewdesignforclinicaltrials.NewEnglandJournalofMedicine,300,1242–5.
Zelen,M.(1992).Randomizedconsentdesignsforclinicaltrials:anupdate.StatisticsinMedicine,11,131–2.
Aabridgedlifetable200–1absolutedifference271–2absolutevalue239acceptingnullhypothesis140accidents53acutemyocardialinfarction277additionrule88adjustedoddsratio323admissionstohospital86 255–6 354 356 370–1age53 56–7 267 308–14 316 373age,gestational56–7ageinlifetableseelifetableage-specificmortalityrate295–6 299–300 302 307 376–7age-standardizedmortalityrate74 296 302age-standardizedmortalityratio297–9 303 307 376–7agreement272–5AIDS58 77–8 169–71 172 174–8 317–8alphaspending152albumin76–7alcoholics76–7 308–17allocationtotreatment6–13 15 20–1 23alterationsto11–13 21alternate6–7 11alternatedates11–12bygeneralpractice21 23byward21cheatingin12–13knowninadvance11inclusters21–2 179–81 344–6
minimization13non-random11–13 21–2physicalrandomization12random7–11 15 17 20–1 25systematic11–12usingenvelopes12usinghospitalnumber11
alphaerror140alternateallocation6–7 11alternativehypothesis137 139–42ambiguousquestions40–1analgesics15 18analysisofcovariance321analysisofvariance172–9 261–2 267–8 318–21assumptions173 175–6balanced318inestimationofmeasurementerror271fixedeffects177Friedman321Kruskal–Wallis217 261–2inmeta-analysis327multi-way318–21one-way172–9 261–2randomeffects177–9inregression310–15 315two-way318usingranks217 261–2 321
anginapectoris15–16 138–9 218–20animalexperiments5 16–17 20–1 33anticoagulanttherapy11–12 19 142antidiuretichormone196–7antilogarithm83appropriateconfidenceintervalsforcomparison134appropriatesignificancetestsforcomparison142–3anxiety18 143 210ARC58 172 174–7arcsinesquareroottransformation165
areaunderthecurve104–5 109–11 169–71 278 373–4probability104–5 109–11serialdata169–71 373–4ROCcurve278
arithmeticmean59arterialoxygentension183–4arthritis15 18 37 40Asianwomen35assessment19–20ascertainmentbias38association230–2asthma21 265 267 332 372 373atrophyofspinalchord37attackrate303attribute47AUCseeareaunderthecurveaverageseemeanAVP196–7AZT(zidovudine)77–8 169–71
Bbabies267 373–4back-transformation166–7 271backwardsregression326barchart73–5 354–6barnotation59Bartlett'stest172baseoflogarithm82–4baseline79baselinehazard324BASIC107Bayesianprobability87Bayes'theorem289BCGvaccine6–7 11 17 33 81betaerror140 337betweengroupssumofsquares174betweenclustervariance345–6betweensubjectsvariance178–9 204bias6 11–14 17–20 28 39–42 283–4 327 350 363ascertainment38inallocation11–13ascertainment38inassessment19–20publication327inquestionwording40–2recall39 350 363inreporting17–19response17–19insampling28 31volunteer6 13–14 32
bicepsskinfold165–7 213–15 339bimodaldistribution54–5binaryvariableseedichotomousvariableBinomialdistribution89–91 94 103 106–8 110 128 130–1 132–3 180andNormaldistribution91 106–8meanandvariance94probability90–1insigntest138–9 247
biologicalvariation269birds45–6 255 350birthrate303 305birthweight150blindassessment19–20blocks9bloodpressure19 28 117 191 268–9BMIseebodymassindexbodymassindex(BMI)322–3Bonferronimethod148–51boxandwhiskerplot58 66 351 359boxers264boxes93–4breastcancer37 216–17breastfeeding153breathlessness74–5BritishStandardsInstitution270bronchitis130–2 146 233–4
CCampylobacterjejuni44–6 255 350C-Tscanner5–6 68caesariansection25 349calculationerror70calibration194cancer23 32–9 41 69–74 216–17 241–3breast37 216–17cervicalcancer23lung32 35–9 68–70 241–3 299oesophagus74 78–80parathyroidcancerregistry39
capillarydensity159–64 174cards7 12 50carry-overeffect15case-controlstudy37–40 45–6 153–5 241–3 248 323 349–50 362–3casefatalityrate303casereport33–4caseseries33–4cataracts266 373categoricaldata47–8 373 seenominaldatacats350causeofdeath70–3 75celloftable230censoredobservations281 308 324–5census27 47–8 86 294decennial27 294hospital27 47–8 86local27
national27 294years294 299
centile57–8 279–81centrallimittheorem107–8cervicalcancer23cervicalsmear275cervicalcytology22chartbarseebarchartpieseepiechart
cheatinginallocation12–13Chi-squareddistribution118–20 232–3andsamplevariance119–20 132contingencytables231–3 249–51degreesoffreedom118–19 231–2 251table233
chi-squaredtest230–6 238–40 243–51 249–51 258–9 261–2 371 372373contingencytable230–6 238–40 243–7 249–51 258–9 261–2 371 372373continuitycorrection238–40 247 259 261degreesoffreedom231–2 251goodnessoffit248–9logranktest287–8samplesize341trend243–5 259 261–2validity234–6 239–40 245
childrenseeschoolchildrenchoiceofstatisticalmethod257–267cholesterol55 326 345cigarettesmokingseesmokingcirrhosis297–9 306 317classinterval49–50classvariable317clinicaltrials5–25 32–3 326–30allocationtotreatment6–15 20–1 23assessment19–20
combiningresultsfrom326–30clusterrandomized21–2 179–81 205 344–6 380consentofsubjects22–4cross-over15–16 341doubleblind19–20doublemaskedseedoubleblindethics19 22–4groupedsequential152informedconsent22–4intentiontotreat14–15 23 348 372meta-analysis326–30placeboeffect17–19randomized7–11samplesize336–42 344–6 347selectionofsubjects16–17sequential151–2volunteerbias13–14
Clinstatcomputerprogram3 9 30 93 248 298clusterrandomization21–2 179–81 205 344–6 380clustersampling31 344–6Cochran,W.G.230coefficientofvariation271coefficientsinregression189 191–2 310–12 314 317 322–3 325Cox325andinteraction314logistic322–3multiple310–12 314 317simplelinear189 191–2
coeliacdisease34 165–7 213–15 339cohortstudy36–7 350cohort,hypotheticalinlifetable299coins7 28 87–92colds69 241–3colontransittime267combinations97–8combiningdatafromdifferentstudies326–30commoncoldseecolds
commonestimate326–30commonoddsratio328–30commonproportion145–7commonvariance162–4 173comparisonmultipleseemultiplecomparisonsofmeans12–19 143–5 162–4 170–6 338–41 347 361 379–80ofmethodsofmeasurement269–73ofproportions130–2 145–7 233–4 245–7 259 341–3 347 372 379ofregressionlines208 9 367–8oftwogroups128–32 143–7 162–4 211–17 233–4 254 255–7 338–43344–6 347 361 372 379–80ofvariances172 260withinonegroup159–62 217–20 245–7 257 260–1 341
compliance183–4 228–9 363–7 369–70computer2 8–9 30 107 166 174 201 238 288–90 298 308 310 318diagnosis288–90randomnumbergeneration8–9 107programforconfidenceintervalofproportion132programsforsampling30statisticalanalysis2 174 201 298 308 310 318
conception142conditionallogisticregression323conditionaloddsratio248conditionalprobability96–7conditionaltest250confidenceinterval126–34appropriateforcomparison134centile133 280–1correlationcoefficient200–1differencebetweentwomeans128–9 136 162–4 361differencebetweentwoproportions130–1 243differencebetweentworegressioncoefficients208–9 368hazardratio288 325mean126–7 136 159–60 335–6 361median133numberneededtotreat290–1
oddsratio241–3 248percentile133 280–1predictedvalueinregression194–5proportion128 132–3 336quantile133 280–1ratiooftwoproportions131–2referenceinterval280–1 290 375 378regressioncoefficient191–2regressionestimate192–4andsamplesize335–6orsignificancetest142 145 227SMR298–9 307 376–7sensitivity276sensitivity276survivalprobability283transformeddata166–7usingrankorder216 220
confidencelimits126–34confounding34–5consentofresearchsubjects22–4conservativemethods15constraint118–19 250–1contingencytable230 330
continuitycorrection225–6 238–40 247chi-squaredtest238–40Kendall'srankcorrelationcoefficient226Mann-WhitneyUtest225McNemarstest247
continuousvariable47–50 75 87–8 93 103–6 276–8 323indiagnostictest276–8
contrastsensitivity266 373controlgroupcasecontrolstudy37–9 350 362–3clinicaltrial5–7
controlledtrialseeclinicaltrialcornflakes153–5 362–3
coronaryarterydisease34 149 326coronarythrombosis11–12 36correlation197–205 220 260–2 309–11assumptions200–1betweenrepeatedmeasurements341coefficient197–204confidenceinterval200–1Fisher'sztransformation201 339–40 343intra-class179 204–5 272 346intra-cluster346 347linearrelationship199matrix202 309–10multiple311negative198positive197productmoment198r198–200r2199–200rankseerankcorrelationandregression199–200 311repeatedobservations202–4samplesize343–4significancetest200–1tableof200tableofsamplesize344zero198
cough34–5 41 128–32 144–7 233–4 240–1 254counselling41–2counties347covarianceanalysis321Coxregression324–5crime97Crohn'sdisease153–5 165–7 213–15 339 362–3cross-classification230 370–1cross-overtrial15–16 137 341cross-sectionalstudy34–5cross-tabulation230 370–1
crudedeathrate294–5crudemortalityrate294–5 302cumulativefrequency48–51 56cumulativesurvivalprobability282–3 299cushionvolume333–4 378cut-offpoint277–8 281
Ddeath27 70–3 96 101–2 281deathcertificate27 294deathrateseemortalityratedecennialcensus27 294decimaldice8decimalplaces70 268decimalpoint70decimalsystem69–70decisiontree289–90DeclarationofHelsinki22degreesoffreedom61 67 118–20 153–4 159 169 171–2 191 231–2251 288 309 311 319 331analysisofvariance173–5Chi-squareddistribution118–20chi-squaredtest231–2 251Fdistribution120Ftest171 173–5goodnessoffittest248–9logranktest288regression191 310 313samplesizecalculations335tdistribution120 157–8tmethod157–8 160–4varianceestimate61 67 94–5 119 352–3
delivery25 230–1 322–3 349demography299denominator68–9dependentvariable187depressivesymptoms18
Derbyshire128designeffect344–6 380detection,belowlimitof281deviationfromassumptions161–2 164 167–8 175–6 196–7deviationsfrommean61 352deviationsfromregressionline187–8dexamethasone290–1diabetes135–6 360–1diagnosis47–8 86 275–9 288–90 317diagnostictest136 275–9 361diagrams72–82 85–6barseebarchartpieseepiechartscatterseescatterdiagram
diarrhoea172 318diastolicbloodpressureseebloodpressuredice7–8 87–9 122dichotomousvariable258–62 308 317 321–3 325 328differenceagainstmeanplot161–2 184 271–5 364–5 367differences129–30 138–9 159–62 184 217–20 271–5 341 364–5 369–70differencesbetweentwogroups128–31 136 143–7 162–7 211–17 258–9 338–43 344–6 347 362–3digitpreference269directstandardization296dischargefromhospital48discretedata47 49discriminantanalysis289distributionBinomialseeBinomialdistributionChi-squaredseeChi-squareddistributioncumulativefrequencyseecumulativefrequencydistributionFseeFdistributionfrequencyseefrequencydistributionNormalseeNormaldistributionPoissonseePoissondistributionprobabilityseeprobabilitydistributionRectangularseeRectangulardistribution
tseetdistributionUniformseeUniformdistribution
distribution-freemethods210diurnalvariation249DNA97doctors36 68 86 297–9 356Dopplerultrasound150dotplot77doubleblind19–20doubledummy18doublemaskedseedoubleblinddoubleplaceboseedoubledummydrug69dummytreatmentseeplacebodummyvariables317 328Duncan'smultiplerangetest176
Ee,mathematicalconstant83–4 95ecologicalfallacy42–3ecologicalstudies42–3eczema97election28 32 41electoralroll30 32embryos333–4 378enumerationdistrict27envelopes12enzymeconcentration347 379–80epidemiologicalstudies32 34–40 42–3 45–6 326equality,lineof273–4error70 140 187 192 269–72 337alpha140beta140 337calculation70firstkind140measurement269–72secondkind140 337terminregressionmodel187 192typeI140typeII140 337
estimate61 122–36 326–30estimation122–36 335–6ethicalapproval32ethics4 19 22–4 32evidence-basedpractice1expectation92–4ofadistribution92–3
ofBinomialdistribution94ofChi-squareddistribution118oflife102 300–2 305 357–8ofsumofsquares60–4 98–9 119
expectedfrequency230–31 26 250expectednumberofdeaths297–9expectedvalueseeexpectation,expectedfrequencyexperimentalunit21–2 180experiments5–25animal5 16–17 20–1 33clinicalseeclinicaltrialsdesignof5–25factorial10–11laboratory5 16–17 20–1
expertsystem288–90ex-prisoners128
FFdistribution118 120 334Ftest171 173–5 311 313–15 317–18 320 334 378face-lifts23factor317–18factorial90 97factorialexperiment10–11falsenegative277–9falsepositive277–9familyofdistributions90 96Farr,William1FATseefixedactivatedT-cellsfatabsorption78 169–71fatalityrate303feet,ulcerated159–64 174fertility142 302–3fertilityrate303FEVl49–54 57–60 62–3 125–7 133 185–6 188–95 197–9 201 279–80310–11 335–6fevertree26Fisher1Fisher'sexacttest236–40 251–2 259 262Fisher'sztransformation201 343fivefiguresummary58fiveyearsurvivalrate283fixedactivatedT-cells(FAT)318–21fixedeffects177–9 328follow-up,losttoorwithdrawnfrom282footulcers159–64 174forcedexpiratoryvolumeseeFEV1
forestdiagram330forwardregression326fourths57frequency48–56 68–9 230–1 250cumulative48–51density52–4 104–5distribution48–56 66–7 103–5 351–2 354expected230–1 250perunit52–4polygon54andprobability87 103–5proportion68relative48–50 53–4 104–5tallysystem50 54intables71 230–1
G
G.P. 41
Gabriel's test 177
gallstones 284–8, 324–5
Galton 186
gastric pH 265–6, 372–3
Gaussian distribution see Normal distribution
gee whiz graph 79–80
geometric mean 113, 167, 320
geriatric admissions 86, 255–6, 354, 356, 370–1
gestational age 196–7
glucose 35, 66–7, 121–2, 351–3, 359–60
glue sniffing see volatile substance abuse
goodness of fit test 248–9
Gossett see Student
gradient 185–6
graft failure 331
graphs 72–82, 85–6
group comparison see comparisons
grouped sequential trials 152
grouping of data 167
guidelines 179–81
Hharmonicmean113hayfever97hazardratio288 324–25health40–1healthcentre220–1healthpromotion347healthypopulation279 292–3hearttransplants264heatwave86 255–6 356 370–1height75–6 87–8 93–4 112 159 185–6 188–95 197–9 201 208–9 308–17 367–9Helsinki,Declarationof22heteroscedasticity175heterogeneitytest249 328–9Hill,Bradford1histogram50–7 67 72 75 103–4 267 303–4 352 354 356 359historicalcontrols6HIV58 128 172 174–7holes93–4homogeneityofoddsratios328–9homogeneityofvarianceseeuniformvariancehomoscedasticity175hospitaladmissions86 255–6 356 370–1hospitalcensus27 47–8 85hospitalcontrols38–9house-dustmite265 372housingtenure230–1 317Huff79 81humanimmunodeficiencyvirusseeHIV
hypercholesterolaemia55hypertension43 91 265 372hypocalcaemia34hypothesis,alternativeseealternativehypothesishypothesis,nullseenullhypothesis
IICCseeintra-classcorrelationICDseeInternationalClassificationofDiseaseileostomy265 372incidence303independentevents88 357independentgroups128–32 143–7 162–4 172–7 211–17independentrandomvariables93–4independenttrials90independentvariableinregression187India17 33indirectstandardization296–9inductionoflabour322–3infantmortalityrate303infinity(∞)291inflammatoryarthritis40informedconsent22–3instrumentaldelivery25 349intentiontotreat14–15 348–9 372interaction310 313–14 320–1 327–9 334 378intercept185–6InternationalClassificationofDisease70–72inter-pupildistance331interquartilerange60interval,class49intervalestimate126intervalscale210 217 258–62 373intra-classcorrelationcoefficient179 204–5 272 380intra-clustercorrelationcoefficient272 380
Jjitteringinscatterdiagrams77
KKaplan-Meiersurvivalcurve283Kaposi'ssarcoma69 220–1Kendall'srankcorrelationcoefficient222–6 245 261–2 373 374continuitycorrection226incontingencytables245τ222table225tau222ties23–4comparedtoSpearman's224–5
Kendall'stestfortwogroups217Kent245–7KnowYourMidwifetrial25 348–9knowledgebasedsystem289–90Korotkovsounds268–9Kruskal-Wallistest217 261–2
Llabour322–3 348–9laboratoryexperiment5 16 20–1lactulose172 175–7Lanarkshiremilkexperiment12laparoscopy142largesample126 128–32 143–7 168–9 258–60 335–6leastsquares187–90 205–6 310leftcensoreddata281Levenetest172lifeexpectancy102 300–2 305 357–8lifetable101–2 282–3 296 299–302limitsofagreement274–5linegraph77–80 354 356lineofequality273–4linearconstraint118–19 243–5 250–1linearregressionseeregression,multipleregressionlinearrelationship185–209 243–5lineartrendincontingencytable243–5LiteraryDigest31lithotrypsy284logseelogarithm,logarithmicloghazard324–5log-linearmodel330logodds240 252–3 321–3logoddsratio241–2 252–3 323logarithm82–4 131baseof82–4
logarithmofproportion131logarithmofratio131
logarithmicscale81–2logarithmictransformation113–14 116 164–7 175–6 184andcoefficientofvariation271andconfidenceinterval167geometricmean113 167toequalvariance164–7 175–6 196–7 271toNormaldistribution113–14 116 164–7 175–6 184 360 364–5 372standarddeviation113–14varianceof131 248
logisticregression289 321–3 326 328–9 330conditional323multinomial330ordinal330
logittransformation235 248–9 321–3Lognormaldistribution83 113logranktest284 287–9 325longitudinalstudy36–7losstofollow-up282Louis,Pierre-Charles-Alexandre1lungcancer32 35–9 68–70 96 242–3 299lungfunctionseeFEV1,PEFR,meantransittime,vitalcapacitylymphaticdrainage40
Mmagnesium135–6 292–3 360–1 375–6malaria26mannitol58 172 174–7 317–18Mann–WhitneyUtest164 211–17 225–7 258–9 259 278 373–4andtwo-sampletmethod211 215–17continuitycorrection225–6Normalapproximation215 225–6andROCcurve278table212tablesof217ties213 215
Mantel'smethodforsurvivaldata288Mantel-Haenszelmethodforcombining2by2tables328methodfortrend245
marginaltotals230–1matchedsamples159–62 217–20 245–7 260 341 363–7 369–70matching39 45–6maternalage267 373maternalmortalityrate303maternitycare25mathematics2matrix309maximum58 65 169 345maximumvoluntarycontraction308–16McNemar'stest245–7 260meantransittime265 368mean59–60 67arithmetic59
comparisonoftwo128–9 143–5 162–4 338–41 361 378–9confidenceintervalfor126–7 132 335 361deviationsfrom60geometric113 167harmonic113ofpopulation126–7 335–6ofprobabilitydistribution92–4 105–6ofasample56–8 65–6 352–3samplesize335–6 338–41samplingdistributionof122–5standarderrorof126–7 136 156 335 361sumofsquaresabout60–65
measurement268–9measurementerror269–72measurementmethods272–5median56–9 133 216–7 220 351confidenceintervalfor133 220
MedicalResearchCouncil9mercury34meta-analysis326–30methodsofmeasurement269–73mice21 33 333–4 378midwives25 342–3mildhypertension265 368milk12–13 45–6 255 349–50miniWrightpeakflowmeterseepeakflowmeterminimization13minimum58 66 351misleadinggraphs78–81missingdenominator69missingzero79–80mites265 372MLn3MLWin3mode55modulus239Montecarlomethods238
mortality15 36 70–6 86 294–6 302–3 347 356 357–8 376–7mortalityrate36 294–6 302–3age-specific295–6 299–300 302 307 376–7age-standardized296 302crude294–5 302infant303 305neonatal303perinatal303
mosquitos26MTBseemycobacteriumtuberculosisMTTseemeantransittimemultifactorialmethods308–34multi-levelmodelling3multinomiallogisticregression330multiplecomparisons175–7multipleregression308–18 333–4analysisofvariancefor310–15andanalysisofvariance318assumptions310 315–16backward326classvariable317–18coefficients310–12 314 378computerprograms308 310 318correlatedpredictorvariables312degreesoffreedom310 312dichotomouspredictor317dummyvariables317–18Ftest311 313 317factor317–18forward326interaction310 313–14 333–4 378leastsquares310linear310 314inmeta-analysis327non-linear310 314–15 378Normalassumption315–16outcomevariable308
polynomial314–15predictorvariable308 312–13 316–18quadraticterm315 316 378qualitativepredictors316–18R2311referenceclass317residualvariance310residuals315–16 333–4 378significancetests310–13standarderrors311–12stepwise326sumofsquares310 313–14 378ttests310–12 317transformations316uniformvariance316varianceratio311variationexplained311
multiplesignificancetests148–52 169multiplicativerule88 90 92–4 96multi-wayanalysisofvariance318–21multi-waycontingencytables330musclestrength308–16mutuallyexclusiveevents88 90 357mycobacteriumtuberculosis(MTB)318–21myocardialinfarction277 347 379
NNapier83naturalhistory26 33naturallogarithm83naturalscale81–2nauseaandvomiting290–1Nazideathcamps22negativepredictivevalue278–9neonatalmortalityrate300NewYork6–7 10Newman-Keulstest176Nightingale,Florence1nitrite265 372–3NNHseenumberneededtoharmNNTseenumberneededtotreatnodesinbreastcancer216–17nominalscale210 258–62non-parametricmethods210 226–7non-significant140–1 142–3 149nonedetectable281Normalcurve106–9normaldelivery25 349Normaldistribution91 101–20andBinomial91 106–8inconfidenceintervals126–7 258–60 262 373incorrelation200–1deriveddistributions118–20independenceofsamplemeanandvariance119–20aslimit106–8andnormalrange279–81 293
ofobservations112–18 156 210 258–62 359–60andreferenceinterval279–81 293 375 378inregression187 192 194 315–16insignificancetests143–7 258–60 262 368standarderrorofsamplestandarddeviation132intmethod156–8tables109–10
Normalplot114–19 121–2 161 163 165–7 170–3 175–6 180–1 267 359–60Normalprobabilitypaper114normalrangeseereferenceintervalnullhypothesis137 139–42numberneededtoharm290numberneededtotreat290–1Nurembergtrials22nuisancevariable320
Oobservationalstudies5 26–46observedandexpectedfrequencies230–1occupation96odds240 321–3oddsratio240–2 248 252–3 259 323 328–9oesophagealcancer74 77–80OfficeofNationalStatistics294ontreatmentanalysis15one-sidedpercentagepoint110one-sidedtest141–2 237one-tailedtest141–2 237opinionpoll29 32 41 347 378–9orderednominalscale258–62ordinallogisticregression330ordinalscale210 220 258–62 373outcomevariable187 190 308 321outliers58 196 378overview326–30oxygendependence267 373–4
Ppa(O2)183–4pain15–16 18painreliefscore18paireddata129–30 138–9 159–62 167–8 217–20 245–7 260 341 363–7 369–70 372inlargesample129–30McNemar'stestseeMcNemar'stestsamplesize341signtestseesigntesttmethodseetmethodsWilcoxontestseeWilcoxontest
parameter90parametricmethods210 226–7parathyroidcancer282–4parity49 52–3 248–9passivesmoking34–5PCOseepolycysticovarydiseasepeakexpiratoryflowrateseePEFRpeakflowmeter269–75peakvalue169Pearson'scorrelationcoefficientseecorrelationcoefficientPEFR54 128–9 144–5 147–8 208–9 265 269–75 363–4 368percentage68 71percentagepoint109–10 347 378percentile57 279–81perinatalmortalityrate303permutation97–8pH265 372–3phlegm145 147–8
phosphomycin69physicalmixing12pictogram80–1piechart72–3 80 354–5piediagramseepiechartpilotstudy335 339 341Pitman'stest260placebo17–20 22pointestimate125Poissondistribution95–6 108 165 248–50 252 298–9Poissonheterogeneitytest249Poissonregression330poliomyelitis13–14 19 68 86 355polycysticovarydisease35polygonseefrequencypolygonpolynomialregression314–15population27–34 36 39 87 335–6census27 294estimate294mean126–7 335–6national27 294projection302pyramid303–5restricted33standarddeviation124–5statisticalusage28variance124–5
positivepredictivevalue278power147–8 337–46p–pplot117–18precision268–9predictorvariable187 190 308 312–13 316–18 321 323 324pregnancy25 49 348–9prematurebabies267presentingdata68–86presentingtables71–2prevalence35 90 278–9 303
probability87–122additionrule88conditional96–7densityfunction104–6distribution88–9 92–4 103–6 357–8ofdying101–2 299–300 357–8multiplicationrule88 96paper114insignificancetests137–9ofsurvival101–2 357–8thatnullhypothesisistrue140
productmomentcorrelationcoefficientseecorrelationcoefficientpronethalol15–16 138–9 217–20proportion68–9 71 128 130–3 165 321–3arcsinesquareroottransformation165confidenceintervalfor128 132–3 336denominator69differencebetweentwo130–1 145–7 233–4 245–7 341–3 347asoutcomevariable321–3ratiooftwo131–2 147samplesize336 341–3 347standarderror128 336intables71ofvariabilityexplained191 200
proportionalfrequency48proportionalhazardsmodel324–5prosecutor'sfallacy97prospectivestudy36–7protocol268pseudo-random8publicationbias327pulmonarytuberculosisseetuberculosispulserate178–9 190–1 204Pvalue1 139–41Pvaluespending152pyramid,population303–5
Qq–qplotseequantile–quantileplotquadraticterm315 316 378qualitativedata47 258–62 316–18quantile56–8 116–18 133 279–81confidenceinterval133 280–1
quantile-quantileplot116–18quantitativedata47 49quartile57–8 66 351quasi-randomsampling31questionnaires36 40–2quotasampling28–29
Rr,correlationcoefficient198–9r2199–200 311rS,Spearmanrankcorrelation220R,multiplecorrelationcoefficient311R2311radiologicalappearance20RAGE23randomallocation7–11 15 17 20–3 25bygeneralpractice21 23byward21inclusters21–2 344–6
randombloodglucose66–7randomeffects177–9 328randomnumbers8 10 29–30randomsampling9 29–32 38 90randomvariable87–118additionofaconstant93differencebetweentwo94expectedvalueof92–4meanof92–4multipliedbyaconstant92sumoftwo92–3varianceof92–4
randomizationseerandomallocationrandomizedconsent23randomizingdevices7–8 87 90range59–60 279interquartile59–60normalseereferenceinterval
referenceseereferenceintervalrank211 213–14 218 221 223rankcorrelation220–6 261–2 373 374choiceof226 261–2Kendall's222–6 261–2 373 374Spearman's220–2 226 261–2 374
rankorder211 213–14 221ranksumtest210–20onesampleseeWilcoxontwosampleseeMannWhitney
rate68–9 71agespecificmortality295–6 299–300 302 307agestandardizedmortality296 302attack303birth303 305casefatality303crudemortality294–5 302denominator69fertility303fiveyearsurvival283incidence303infantmortality303 305maternalmortality303mortality294–6 302–3multiplier68 295neonatalmortality303perinatalmortality303prevalence303response31–2stillbirth303survival283
ratiooddsseeoddsratioofproportions131–2 147scale257–8standardizedmortalityseestandardizedmortalityratio
rats20
rawdata167recallbias39 350 363receiveroperatingcharacteristiccurveseeROCcurvereciprocaltransformation165–7Rectangulardistribution107–8referenceclass317referenceinterval33 136 279–81 293 361 375 378confidenceinterval280–1 293 375 378bydirectestimation280–1samplesize347 378usingNormaldistribution279–80 293 361 375 378usingtransformation280
refusingtreatment13–15 25registerofdeaths27regression185–9 199–200 205–7 208–9 261–2 308–18 312–30 333–4analysisofvariancefor310–15assumptions187 191–2 194–5 196–7backward326coefficient189 191–2comparingtwolines208–9 367–8confidenceinterval192incontingencytable234–5andcorrelation199–200Cox324–5dependentvariable187deviationsfrom187deviationsfromassumptions196–7equation189errorterm187 192estimate192–3explanatoryvariable187forward326gradient185–6independentvariable187intercept185–6leastsquares187–90 205–6line187
linear189logistic321–3 326 328–9multinomiallogistic330multipleseemultipleregressionordinallogistic330outcomevariable187 190outliers196perpendiculardistancefromline187–8Poisson330polynomialseepolynomialregressionprediction192–4predictorvariable187 190proportionalhazards324–5residualsumofsquares191residualvariance191residuals194–6significancetest192simplelinear189slope185–6standarderror191–4stepwise326sumofproducts189sumofsquaresabout191–2 310sumofsquaresdueto191–2towardsthemean186–7 191variabilityexplained191 200varianceaboutline191–2 205–6XonY190–1
rejectingnullhypothesis140–1relationshipbetweenvariables33 73–8 185–209 220–6 230–45 257261–2 308–34relativefrequency48–50 53 103–5relativerisk132 241–3 248 323reliability272repeatability33 269–72repeatedobservations169–71 202–3repeatedsignificancetests151–2 169
replicates177representativesample28–32 34residualmeansquare174 270residualstandarddeviation191–2 270residualsumofsquares174 310 312residualvariance173 310residuals165–6 175–6 267 315–16 333–4aboutregressionline194–6 315–16plotsof162–4 173–4 194–6 315–16 333–4 378withingroups165–6 175–6
respiratorydisease32 34–5respiratorysymptoms32 34–5 41 125–9 142–7 233–4 240–1 243–7254responsebias17–19responserate31–2responsevariableseeoutcomevariableretrospectivestudy39rheumatoidarthritis37Richterscale114risk131–2riskfactor39 326–7 350RND(X)107robustnesstodeviationsfromassumptions167–9ROCcurve277–8
Ss2,symbolforvariance61saline13–14Salkvaccine13–14 17 19 68 355salt43sample87large127–31 168–9 258–60 262 335–6meanseemeansizeseesizeofsamplesmall130–1 132–3 156–69 227 258–60 262 344varianceseevariance
sampling27–34inclinicalstudies32–4 293 375cluster31distribution122–5 127inepidemiologicalstudies32 34–9experiment63–4 122–5frame29multi-stage30quasi-random31quota29random29–31simplerandom29–30stratified31systematic31
scanner5–6scatterdiagram75–7 185–6scattergramseescatterdiagramschoolchildren12–13 17 22 31 34–5 41 43 128–32 143–7 233–4 240–1 243–7 254
schools22 31 34screening15 22 81 216–7 265 275–9selectionofsubjects16–17 32–3 37–9incasecontrolstudies37–9inclinicaltrials16–17self31–2
selfselection31–2semenanalysis183semi-parametric325sensitivity276–8sequentialanalysis151–2sequentialtrials151–2serialmeasurements169–71sex71–2signtest138–9 161 210 217 219–20 228 246–7 260 369–70 372 373signed-ranktestseeWilcoxonsignificanceandimportance142–3significanceandpublication327significancelevel140–1 147significancetests137–55multiple148–52 169andsamplesize336–8insubsets149–50inferiortoconfidenceintervals142 145
significantdifference140significantdigitsseesignificantfiguressignificantfigures69–72 268–9sizeofsample32 147–8 335–47accuracyofestimation344inclusterrandomization344–6correlationcoefficient343–4andestimation335–6pairedsamples341referenceinterval347 378andsignificancetests147–8 336–8singlemean335–6singleproportion336 378–9
twomeans338–41 379–80twoproportions341–3 379
skewdistribution56 59 67 112–14 116–17 165 167–8 360skinfoldthickness165–7 213–15 335slope185–6smallsamples156–67 227 258–60smoking22 26 31–2 34–9 41 67 74–5 241–3 356SMR297–9 303 307 376–7Snow,John1sodium116–17somites333–4 378SouthEastLondonScreeningStudy15Spearman'srankcorrelationcoefficient220–2 226 261–2 373table219ties219
specificity276–8spinalcordatrophy37squareroottransformation165–7 175–7squares,sumofseesumofsquaresstandardagespecificmortalityrates297–8standarddeviation60 62–4 67 92–4 119–21degreesoffreedomfor63–4 67 119ofdifferences159–62ofpopulation123–4ofprobabilitydistribution92–4 105ofsample62–4 67 119 353ofsamplingdistribution123–4andtransformation113–14andstandarderror126standarderrorof132withinsubjects269–70
standarderror122–5andconfidenceintervals126–7centile280correlationcoefficient201 343differencebetweentwomeans128–9 136 338–41 361 379–80differencebetweentwoproportions130–1 145–7 341–3 379
differencebetweentworegressioncoefficients208 367–8differentinsignificancetestandconfidenceinterval147loghazardratio325logoddsratio241–2 252–3logisticregressioncoefficient322mean123–5 136 335–6 361percentile280predictedvalueinregression192–4proportion128 336 378–9quantile280ratiooftwoproportions121–2referenceinterval280 370–1 378regressioncoefficient191–2 311–12 317regressionestimate192–3SMR298–9 377standarddeviation132survivalrate283–4 341
StandardNormaldeviate114–17 225–6StandardNormaldistribution108–11 143 156–8 337–8standardpopulation296standardizedmortalityrate74 296standardizedmortalityratio296–9 303 307standardizedNormalprobabilityplot117–18Stata118StatExact238statistic47 139 302–3 337test139 337vital302–3
Statistics1statisticalsignificanceseesignificanceteststemandleafplot54 57 66 184 351 364–6stepfunction51 283step-down326step-up326stepwiseregression326stillbirthrate303
stratification31strength308–16strengthofevidence137 140 362streptomycin9–10 17 19–20 81 235–6 290stroke5–6 23 249Stuarttest248 260Student12–13 156 158–9Student'stdistributionseetdistributionStudentizedrange176subsets149–50success90suicide306sumofproductsaboutmean189 198–200sumofsquares60–1 63–5 98–9 119 173–4 310 313–14aboutmean60–1 63–5 119 352–3aboutregression191–2 310 313–14duetoregression191–2 310 313–14expectedvalueof63–4 98–9 119
summarystatistics169 180–1 327summation59survey28–9 42 90survival10 101–2 281–8 324–5analysis281–8 324–5curve283–4 286 324probability101–2 282–4 287rate283time162 281–8
symmetricaldistribution54 56 59synergy320syphilis22systolicbloodpressureseebloodpressure
Ttdistribution120 156–9degreesoffreedom120 153–4 157–8andNormaldistribution120 156–8shapeof157table158
tmethods114 156–69assumptions161–8 184 365–7confidenceintervals159–63 164 167deviationsfromassumptions161–2 164 167–8differencebetweenmeansinmatchedsample159–61 184 260 363–7370 372differencebetweenmeansintwosamples162–7 258–9 262onesample159–62 184 260 363–7 370 372paired159–62 167–8 184 217 220 260 363–7 370 372regressioncoefficient191–2 310–12 317singlemean159–62 176twosample162–7 217 258–9 262 317 373–4unpairedsameastwosample
tableofprobabilitydistributionChi-squared233correlationcoefficient200Kendall'sτ225Mann–WhitneyU212Normal109–10Spearman'sρ222t158Wilcoxonmatchedpairs219
tableofsamplesizeforcorrelationcoefficient344tablesofrandomnumbers8–9 29–30
tables,presentationof71–2tables,twoway230–48tailsofdistributions56 359–60tallysystem49–50 54Tanzania69 220–4TBseetuberculosistelephonesurvey42temperature10 70 86 210 255–6 332test,diagnostic136 275–9 361test,significanceseesignificancetestteststatistic136 337threedimensionaleffectingraphs80thrombosis11–12 36 345thyroidhormone267 373–4tiesinranktests213 215 218–19 222–4tiesinsigntest138time324–5timeseries77–8 169–71 354 356timetopeak169time,survivalseesurvivaltimeTNFseetumournecrosisfactortotalsumofsquares174transformations112–14 163–7 320arcsinesquareroot165andconfidenceintervals167Fisher'sz201 343logarithmic112–14 116 163–7 170–1 175–6 184 320 364–7 369–70logit240 252–3toNormaldistribution112–14 116 164–7 175–6 184reciprocal113 165–7andsignificantfigures269squareroot165–7 175–7touniformvariance163–7 168 175–6 196–7 271
treatedgroup5–7treatment5–7 326–7treatmentguidelines179–81trendincontingencytables243–5
chi-squaredtest243–5Kendall'sτb245Mantel–Haenszel245
trial,clinicalseeclinicaltrialtrialofscar322–3triglyceride55–6 58–59 63 112–13 280–2trisomy-16333–4 378truedifference147truenegative278truepositive278tuberculosis6–7 9–10 17 81–2 290Tukey54 58Tukey'sHonestlySignificantDifference176tumourgrowth20tumournecrosisfactor(TNF)318–21TuskegeeStudy22twins204two-samplettestseetmethodstwo-sampletrial16two-sidedpercentagepoint110two-sidedtest141–2two-tailedtest141–2typeIerror140typeIIerror140 337
Uulceratedfeet159–64 174ultrasonography134unemployment42Uniformdistribution107–8 249uniformvariance159 162–4 167–8 175–6 187 191 196–7 316 319–20unimodaldistribution55unitofanalysis21–2 179–81urinaryinfection69urinarynitrite265 372–3
Vvaccine6–7 11 13–14 17 19validityofchi-squaredtest234–6 239–40 245variability59–64 269variabilityexplainedbyregression191 200variable47categorical47continuous47 49dependent187dichotomous259–62discrete47 49explanatory187independent187nominal210 259–62nuisance320ordinal210 259–62outcome187 190 308 321predictor187 190 308 312–13 316–18 321 323 324qualitative47 316–18quantitative47randomseerandomvariable
variance59–64 67aboutregressionline191–2 205–6analysisofseeanalysisofvariancebetweenclusters345–6betweensubjects178–9 204common162–4 170 173comparisoninpaireddata260comparisonofseveral172comparisonoftwo171 260
degreesoffreedomfor61 63–4 352–3estimate59–64 124–5oflogarithm131 252population123–4ofprobabilitydistribution91–4 105ofrandomvariable91–4ratio120 311residual192 205–6 310sample59–64 67 94 98–9 119 352–3uniform162 163–7 168 174–6 187 196–7 316withinclusters345–6withinsubjects178–9 204 269–72
variation,coefficientof271visualacuity266 373vitalcapacity75–6vitalstatistics302–3vitaminA328–9vitaminD115–16volatilesubstanceabuse42 307 376–7volunteerbias6 13–14 32volunteers5–6 13–14 16–17VSAseevolatilesubstanceabuse
WWandsworthHealthDistrict86 255–6 356website3 4weightgain20–1wheeze267whoopingcough265 373Wilcoxontest217–20 260 369–70 373matchedpairs217–20 260 369–70 373onesample217–20 260 369–70 373signedrank217–20 260 369–70 373table219ties218–19twosample217 seeMann-Whitney
withdrawnfromfollow-up282withinclustervariance345–6withingroupresidualsseeresidualswithingroupssumofsquares173withingroupsvariance173withinsubjectsvariance178–9 204
withinsubjectsvariation178–9 269–72Woolf'stest328Wrightpeakflowmeterseepeakflowmeter
X[xwithbarabove],symbolformean59X-ray19–20 81 179–81
YYates'correction238–40 247 259 261
Zztest143–7 258–9 262 234ztransformation201 343zero,missing78–80zidovudineseeAZT%symbol71!(symbolforfactorial)90 97∞(symbolforinfinity)291|(symbolforgiven)96|(symbolforabsolutevalue)239α(symbolforalpha)140β(symbolforbeta)140χ(symbolforchi)118–19µ(symbolformu)92–3φ(symbolforphi)108Φ(symbolforPhi)109ρ(symbolforrho)220–2Σ(symbolforsummation)57σ(symbolforsigma)92–3τ(symbolfortau)222–5