An Introduction to Medical Statistics by Martin Bland

Authors: Bland, Martin
Title: Introduction to Medical Statistics, An, 3rd Edition

Copyright © 2000 Oxford University Press


Author

Martin Bland
Professor of Medical Statistics
St George's Hospital Medical School, London




Dedication

To the memory of Ernest and Phyllis Bland, my parents




Preface to the Third Edition

In preparing this third edition of An Introduction to Medical Statistics, I have taken the opportunity to correct a number of mistakes and typographical errors, and to change some of the examples and add a few more. I have extended the treatment of several topics and introduced some new ones, previously omitted through lack of space or energy, or because they were then rarely seen in the medical literature. In one case, number needed to treat, the concept had not even been invented when the second edition was written. Other new topics include consent in clinical trials, design and analysis of cluster-randomized trials, ecological studies, conditional probability, repeated testing, random effects models, intraclass correlation, and conditional odds ratios. Thanks to the wonders of computerized typesetting, I have managed to extend the contents of the book with a very small increase in the number of pages.

This book is for medical students, doctors, medical researchers, nurses, members of professions allied to medicine, and all others concerned with medical data. The range of statistical methods used in the medical and health care literature, and hence described in this book, continues to grow, but the time available in the undergraduate curriculum does not. Some of the topics covered here are beyond the needs of many students, so I have indicated by an asterisk sections which would not usually be included in first courses. These are intended for postgraduate students and medical researchers.

This third edition is being published with a companion volume, Statistical Questions in Evidence-based Medicine (Bland and Peacock 2000). This book of questions and answers includes no calculations and is complementary to the exercises given here. In the solutions given we make many references to An Introduction to Medical Statistics. Because we wanted Statistical Questions in Evidence-based Medicine to be usable with the second edition of An Introduction to Medical Statistics (Bland 1995), I have kept the same order and numbering of the sections in the third edition. New material has all been added at the ends of the chapters. If the structure sometimes seems a little unwieldy, that is why.

This is a book about data, not statistical theory. The fundamental concepts of study design, data collection and data analysis are explained by illustration and example. Only enough mathematics and formulae are given to make clear what is going on. For those who wish to go a little further in their understanding, some of the more mathematical background to the techniques described is given as appendices to the chapters rather than in the main text.

The material covered includes all the statistical work that would be required for a course in medicine and for the examinations of most of the royal colleges. It includes the design of clinical trials and epidemiological studies, data collection, summarizing and presenting data, probability, the Binomial, Normal, Poisson, t and Chi-squared distributions, standard errors, confidence intervals, tests of significance, large sample and small sample comparisons of means, the use of transformations, regression and correlation, methods based on ranks, contingency tables, odds ratios, measurement error, reference ranges, mortality data, vital statistics, analysis of variance, multiple and logistic regression, survival analysis, sample size estimation, and the choice of the statistical method.

The book is firmly grounded in medical data, particularly in medical research, and the interpretation of the results of calculations in their medical context is emphasized. Except for a few obviously invented numbers used to illustrate the mechanics of calculations, all the data in the examples and exercises are real, from my own research and statistical consultation or from the medical literature.

There are two kinds of exercise in this book. Each chapter has a set of multiple choice questions of the ‘true or false’ type, 100 in all. Multiple choice questions can cover a large amount of material in a short time, so are a useful tool for revision. As MCQs are widely used in postgraduate examinations, these exercises should also be useful to those preparing for memberships. All the MCQs have solutions, with reference to an appropriate part of the text or a detailed explanation for most of the answers. Each chapter also has one long exercise. Although these usually involve calculation, I have tried to avoid merely slotting figures into formulae. These exercises include not only the application of statistical techniques, but also the interpretation of the results in the light of the source of the data.

I wish to thank many people who have contributed to the writing of this book. First, there are the many medical students, doctors, research workers, nurses, physiotherapists, and radiographers whom it has been my pleasure to teach, and from whom I have learned so much. Second, the book contains many examples drawn from research carried out with other statisticians, epidemiologists, and social scientists, particularly Douglas Altman, Ross Anderson, Mike Banks, Barbara Butland, Beulah Bewley, and Walter Holland. These studies could not have been done without the assistance of Patsy Bailey, Bob Harris, Rebecca McNair, Janet Peacock, Swatee Patel, and Virginia Pollard. Third, the clinicians and scientists with whom I have collaborated or who have come to me for statistical advice not only taught me about medical data but many of them have left me with data which are used here, including Naib Al-Saady, Thomas Bewley, Frances Boa, Nigel Brown, Jan Davies, Peter Fish, Caroline Flint, Nick Hall, Tessi Hanid, Michael Hutt, Riahd Jasrawi, Ian Johnston, Moses Kipembwa, Pam Luthra, Hugh Mather, Daram Maugdal, Douglas Maxwell, Charles Mutoka, Tim Northfield, Andreas Papadopoulos, Mohammed Raja, Paul Richardson, and Alberto Smith. I am particularly indebted to John Morgan, as Chapter 16 is partly based on his work.

The original manuscript was typed by Sue Nash, Sue Fisher, Susan Harding, Sheilah Skipp, and myself. This edition has been set by me using LaTeX, so any errors which remain are definitely my own. All the graphs have been drawn using Stata except for the pie charts, done using Harvard Graphics.

I thank Douglas Altman, David Jones, Robin Prescott, Klim McPherson, Janet Peacock, and Stuart Pocock for their helpful comments on earlier drafts. I have corrected a number of errors from the first and second editions, and I am grateful to colleagues who have pointed them out to me, in particular to Daniel Heitjan. I am very grateful to Janet Peacock, who proof-read this edition. Special thanks are due to my head of department, Ross Anderson, for all his support, and to the staff of Oxford University Press. Most of all I thank my wife, Pauline Bland, for her unfailing confidence and encouragement, and my children, Emily and Nicholas Bland, for keeping my feet firmly on the ground.

M.B.
London, March 2000




Sections marked * contain material usually found only in postgraduate courses




1

Introduction

1.1 Statistics and medicine

Evidence-based practice is the new watchword in every profession concerned with the treatment and prevention of disease and promotion of health and well-being. This requires both the gathering of evidence and its critical interpretation. The former is bringing more people into the practice of research, and the latter is requiring of all health professionals the ability to evaluate the research carried out. Much of this evidence is in the form of numerical data. The essential skill required for the collection, analysis, and evaluation of numerical data is statistics. Thus Statistics, the science of assembling and interpreting numerical data, is the core science of evidence-based practice.

In the past forty years medical research has become deeply involved with the techniques of statistical inference. The work published in medical journals is full of statistical jargon and the results of statistical calculations. This acceptance of statistics, though gratifying to the medical statistician, may even have gone too far. More than once I have told a colleague that he did not need me to prove that his difference existed, as anyone could see it, only to be told in turn that without the magic of the P value he could not have his paper published.

Statistics has not always been so popular with the medical profession. Statistical methods were first used in medical research in the 19th century by workers such as Pierre-Charles-Alexandre Louis, William Farr, Florence Nightingale and John Snow. Snow's studies of the modes of communication of cholera, for example, made use of epidemiological techniques upon which we have still made little improvement. Despite the work of these pioneers, however, statistical methods did not become widely used in clinical medicine until the middle of the twentieth century. It was then that the methods of randomized experimentation and statistical analysis based on sampling theory, which had been developed by Fisher and others, were introduced into medical research, notably by Bradford Hill. It rapidly became apparent that research in medicine raised many new problems in both design and analysis, and much work has been done since towards solving these by clinicians, statisticians and epidemiologists.

Although considerable progress has been made in such fields as the design of clinical trials, there remains much to be done in developing research methodology in medicine. It seems likely that this will always be so, for every research project is something new, something which has never been done before. Under these circumstances we make mistakes. No piece of research can be perfect and there will always be something which, with hindsight, we would have changed. Furthermore, it is often from the flaws in a study that we can learn most about research methods. For this reason, the work of several researchers is described in this book to illustrate the problems into which their designs or analyses led them. I do not wish to imply that these people were any more prone to error than the rest of the human race, or that their work was not a valuable and serious undertaking. Rather I want to learn from their experience of attempting something extremely difficult, trying to extend our knowledge, so that researchers and consumers of research may avoid these particular pitfalls in the future.

1.2 Statistics and mathematics

Many people are discouraged from the study of statistics by a fear of being overwhelmed by mathematics. It is true that many professional statisticians are also mathematicians, but not all are, and there are many very able appliers of statistics to their own fields. It is possible, though perhaps not very useful, to study statistics simply as a part of mathematics, with no concern for its application at all. Statistics may also be discussed without appearing to use any mathematics at all (e.g. Huff 1954).

The aspects of statistics described in this book can be understood and applied with the use of simple algebra. Only the algebra which is essential for explaining the most important concepts is given in the main text. This means that several of the theoretical results used are stated without a discussion of their mathematical basis. This is done when the derivation of the result would not aid much in understanding the application. For many readers the reasoning behind these results is not of great interest. For the reader who does not wish to take these results on trust, several chapters have appendices in which simple mathematical proofs are given. These appendices are designed to help increase the understanding of the more mathematically inclined reader and to be omitted by those who find that the mathematics serves only to confuse.

1.3 Statistics and computing

Practical statistics has always involved large amounts of calculation. When the methods of statistical inference were being developed in the first half of the twentieth century, calculations were done using pencil, paper, tables, slide rules and, with luck, a very expensive mechanical adding machine. Older books on statistics spend much time on the details of carrying out calculations and any reference to a ‘computer’ means a person who computes, not an electronic device. The development of the digital computer has brought changes to statistics as to many other fields. Calculations can be done quickly, easily and, we hope, accurately with a range of machines from pocket calculators with built-in statistical functions to powerful computers analysing data on many thousands of subjects. Many statistical methods would not be contemplated without computers, and the development of new methods goes hand in hand with the development of software to carry them out. The theory of multilevel modelling (Goldstein 1995) and the programs MLn and MLWin are a good example. Most of the calculations in this book were done using a computer and the graphs were produced with one.

As an added bonus, my little MS-DOS program Clinstat (not to be confused with any commercial package of the same name) can be downloaded free from my website at http://www.sghms.ac.uk/depts/phs/staff/jmb/. It does most of the calculations in this book, including sample size calculations and random sampling and allocation. It does not do any multifactorial analyses, sorry. There is also a little program to find some exact confidence intervals.

There is therefore no need to consider the problems of manual calculation in detail. The important thing is to know why particular calculations should be done and what the results of these calculations actually mean. Indeed, the danger in the computer age is not so much that people carry out complex calculations wrongly, but that they apply very complicated statistical methods without knowing why or what the computer output means. More than once I have been approached by a researcher bearing a two-inch-thick computer printout, and asking what it all means. Sadly, too often, it means that another tree has died in vain.

The widespread availability of computers means that more calculations are being done, and being published, than ever before, and the chance of inappropriate statistical methods being applied may actually have increased. This misuse arises partly because people regard their data analysis problems as computing problems, not statistical ones, and seek advice from computer experts rather than statisticians. They often get good advice on how to do it, but rather poor advice about what to do, why to do it and how to interpret the results afterwards. It is therefore more important than ever that the consumers of research understand something about the uses and limitations of statistical techniques.

1.4 The scope of this book

This book is intended as an introduction to some of the statistical ideas important to medicine. It does not tell you all you need to know to do medical research. Once you have understood the concepts discussed here, it is much easier to learn about the techniques of study design and statistical analysis required to answer any particular question. There are several excellent standard works which describe the solutions to problems in the analysis of data (Armitage and Berry 1994, Snedecor and Cochran 1980, Altman 1991) and also more specialized books to which reference will be made where required.

What I hope the book will do is to give enough understanding of the statistical ideas commonly used in medicine to enable the health professional to read the medical literature competently and critically. It covers enough material (and more) for an undergraduate course in statistics for students of medicine, nursing, physiotherapy, etc. At the time of writing, as far as can be established, it covers the material required to answer statistical questions set in the examinations of most of the Royal Colleges, except for the MRCPsych. I have indicated by an asterisk in the subheading those sections which I think will be required only by the postgraduate or the researcher.

When working through a textbook, it is useful to be able to check your understanding of the material covered. Like most such books, this one has exercises at the end of each chapter, but to ease the tedium most of these are of the multiple choice type. There is also one long exercise, usually involving calculations, for each chapter. In keeping with the computer age, where laborious calculation would be necessary intermediate results are given to avoid this. Thus the exercises can be completed quite quickly and the reader is advised to try them. You can also download some of the data sets from my website (http://www.sghms.ac.uk/depts/phs/staff/jmb). Solutions are given at the end of the book, in full for the long exercises and as brief notes with references to the relevant sections in the text for MCQs. Readers who would like more numerical exercises are recommended to Osborn (1979). For a wealth of exercises in the understanding and interpretation of statistics in medical research, drawn from the published literature and popular media, you should try the companion volume to this one, Statistical Questions in Evidence-based Medicine (Bland and Peacock 2000).

Finally, a question many students of medicine ask as they struggle with statistics: is it worth it? As Altman (1982) has argued, bad statistics leads to bad research and bad research is unethical. Not only may it give misleading results, which can result in good therapies being abandoned and bad ones adopted, but it means that patients may have been exposed to potentially harmful new treatments for no good reason. Medicine is a rapidly changing field. In ten years' time, many of the therapies currently prescribed and many of our ideas about the causes and prevention of disease will be obsolete. They will be replaced by new therapies and new theories, supported by research studies and data of the kind described in this book, and probably presenting many of the same problems in interpretation. The practitioner will be expected to decide for her- or himself what to prescribe or advise based on these studies. So a knowledge of medical statistics is one of the most useful things any doctor could acquire during her or his training.




2

The design of experiments

2.1 Comparing treatments

There are two broad types of study in medical research: observational and experimental. In observational studies, aspects of an existing situation are observed, as in a survey or a clinical case report. We then try to interpret our data to give an explanation of how the observed state of affairs has come about. In experimental studies, we do something, such as giving a drug, so that we can observe the result of our action. This chapter is concerned with the way statistical thinking is involved in the design of experiments. In particular, it deals with comparative experiments where we wish to study the difference between the effects of two or more treatments. These experiments may be carried out in the laboratory in vitro or on animals or human volunteers, in the hospital or community on human patients, or, for trials of preventive interventions, on currently healthy people. We call trials of treatments on human subjects clinical trials. The general principles of experimental design are the same, although there are special precautions which must be taken when experimenting with human subjects. The experiments whose results most concern clinicians are clinical trials, so the discussion will deal mainly with them.

Suppose we want to know whether a new treatment is more effective than the present standard treatment. We could approach this in a number of ways.

First, we could compare the results of the new treatment on new patients with records of previous results using the old treatment. This is seldom convincing, because there may be many differences between the patients who received the old treatment and the patients who will receive the new. As time passes, the general population from which patients come may become healthier, standards of ancillary treatment and nursing care may improve, or the social mix in the catchment area of the hospital may change. The nature of the disease itself may change. All these factors may produce changes in the patients' apparent response to treatment. For example, Christie (1979) showed this by comparing the survival of stroke patients in 1978, after the introduction of a C-T head scanner, with that of patients treated in 1974, before the introduction of the scanner. He took the records of a group of patients treated in 1978, who received a C-T scan, and matched each of them with a patient treated in 1974 of the same age, diagnosis and level of consciousness on admission. As the first column of Table 2.1 shows, patients in 1978 clearly tended to have better survival than similar patients in 1974.

The scanned 1978 patient did better than the unscanned 1974 patient in 31% of pairs, whereas the unscanned 1974 patient did better than the scanned 1978 patient in only 7% of pairs. However, he also compared the survival of patients in 1978 who did not receive a C-T scan with matched patients in 1974. These patients too showed a marked improvement in survival from 1974 to 1978 (Table 2.1). The 1978 patients did better in 38% of pairs and the 1974 patients in only 19% of pairs. There was a general improvement in outcome over a fairly short period of time. If we did not have the data on the unscanned patients from 1978 we might be tempted to interpret these data as evidence for the effectiveness of the C-T scanner. Historical controls like this are seldom very convincing, and usually favour the new treatment. We need to compare the old and new treatments concurrently.

Table 2.1. Analysis of the difference in survival for matched pairs of stroke patients (Christie 1979)

                                    C-T scan in 1978    No C-T scan in 1978
Pairs with 1978 better than 1974    9 (31%)             34 (38%)
Pairs with same outcome             18 (62%)            38 (43%)
Pairs with 1978 worse than 1974     2 (7%)              17 (19%)
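The percentages in Table 2.1 are each count divided by the number of matched pairs in its column. As a check on the arithmetic, here is a minimal Python sketch; the dictionary layout and the function name `percentages` are illustrative choices, not part of the original analysis.

```python
# Counts of matched pairs taken from Table 2.1 (Christie 1979).
pairs = {
    "C-T scan in 1978": {"better": 9, "same": 18, "worse": 2},
    "No C-T scan in 1978": {"better": 34, "same": 38, "worse": 17},
}


def percentages(counts):
    """Express each outcome as a whole-number percentage of all pairs."""
    total = sum(counts.values())
    return {outcome: round(100 * n / total) for outcome, n in counts.items()}


for group, counts in pairs.items():
    # Print the column label, the number of pairs, and the percentages.
    print(group, sum(counts.values()), percentages(counts))
```

Both columns show more pairs favouring 1978 than 1974, which is the point of the comparison: the improvement appears whether or not the 1978 patient was scanned.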

Second, we could obtain concurrent groups by comparing our own patients, given the new treatment, with patients given the standard treatment in another hospital or clinic, or by another clinician in our own institution. Again, there may be differences between the patient groups due to catchment, diagnostic accuracy, preference by patients for a particular clinician, or you might just be a better therapist. We cannot separate these differences from the treatment effect.

Third, we could ask people to volunteer for the new treatment and give the standard treatment to those who do not volunteer. The difficulty here is that people who volunteer and people who do not volunteer are likely to be different in many ways apart from the treatments we give them. They might be more likely to follow medical advice, for example. We will consider an example of the effects of volunteer bias in §2.4.

Fourth, we can allocate patients to the new treatment or the standard treatment and observe the outcome. The way in which patients are allocated to treatments can influence the results enormously, as the following example (Hill 1962) shows. Between 1927 and 1944 a series of trials of BCG vaccine were carried out in New York (Levine and Sackett 1946). Children from families where there was a case of tuberculosis were allocated to a vaccination group and given BCG vaccine, or to a control group who were not vaccinated. Between 1927 and 1932 physicians vaccinated half the children, the choice of which children to vaccinate being left to them. There was a clear advantage in survival for the BCG group (Table 2.2). However, there was also a clear tendency for the physician to vaccinate the children of more cooperative parents, and to leave those of less cooperative parents as controls. From 1933 allocation to treatment or control was done centrally, alternate children being assigned to control and vaccine.

The difference in degree of cooperation between the parents of the two groups of children disappeared, and so did the difference in mortality. Note that these were a special group of children, from families where there was tuberculosis. In large trials using children drawn from the general population, BCG was shown to be effective in greatly reducing deaths from tuberculosis (Hart and Sutherland 1977).

Table 2.2. Results of studies of BCG vaccine in New York City (Hill 1962)

Period of trial    No. of      No. of deaths    Death    Average no. of visits    Proportion of parents giving
                   children    from TB          rate     to clinic during 1st     good cooperation as judged
                                                         year of follow-up        by visiting nurses

1927–32 Selection made by physician
  BCG group        445         3                0.67%    3.6                      43%
  Control group    545         18               3.30%    1.7                      24%

1933–44 Alternate selection carried out centrally
  BCG group        566         8                1.41%    2.8                      40%
  Control group    528         8                1.52%    2.4                      34%
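The death rates in Table 2.2 are deaths from TB as a percentage of the children in each group. The short Python sketch below recomputes them from the counts in the table; the tuple layout and the name `death_rate_percent` are assumptions made for illustration.

```python
# (period, group, number of children, deaths from TB), from Table 2.2.
trials = [
    ("1927-32", "BCG", 445, 3),
    ("1927-32", "Control", 545, 18),
    ("1933-44", "BCG", 566, 8),
    ("1933-44", "Control", 528, 8),
]


def death_rate_percent(children, deaths):
    """Deaths as a percentage of the children in the group."""
    return 100 * deaths / children


for period, group, children, deaths in trials:
    rate = death_rate_percent(children, deaths)
    print(f"{period} {group:8s} {rate:.2f}%")
```

In the first period the control rate is several times the BCG rate; in the second, after central allocation, the two rates are almost equal.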

Different methods of allocation to treatment can produce different results. This is because the method of allocation may not produce groups of subjects which are comparable, similar in every respect except the treatment. We need a method of allocation to treatments in which the characteristics of subjects will not affect their chance of being put into any particular group. This can be done using random allocation.

2.2 Random allocation

If we want to decide which of two people receive an advantage, in such a way that each has an equal chance of receiving it, we can use a simple, widely accepted method. We toss a coin. This is used to decide the way football matches begin, for example, and all appear to agree that it is fair. So if we want to decide which of two subjects should receive a vaccine, we can toss a coin. Heads and the first subject receives the vaccine, tails and the second receives it. If we do this for each pair of subjects we build up two groups which have been assembled without any characteristics of the subjects themselves being involved in the allocation. The only differences between the groups will be those due to chance. As we shall see later (Chapters 8 and 9), statistical methods enable us to measure the likely effects of chance. Any difference between the groups which is larger than this should be due to the treatment, since there will be no other differences between the groups. This method of dividing subjects into groups is called random allocation or randomization.

Several methods of randomizing have been in use for centuries, including coins, dice, cards, lots, and spinning wheels. Some of the theory of probability which we shall use later to compare randomized groups was first developed as an aid to gambling. For large randomizations we use a different, non-physical randomizing method: random number tables. Table 2.3 provides an example, a table of 1000 random digits. These are more properly called pseudo-random numbers, as they are generated by a mathematical process. They are available in tables (Kendall and Babington Smith 1971) or can be produced by computer and some calculators. We can use tables of random numbers in several ways to achieve random allocation. For example, let us randomly allocate 20 subjects to two groups, which I shall label A and B. We choose a random starting point in the table, using one of the physical methods described above. (I used decimal dice. These are 20-sided dice, numbered 0 to 9 twice, which fit our number system more conveniently than the traditional cube. Two such dice give a random number between 1 and 100, counting ‘0,0’ as 100.) The random starting point was row 22, column 20, and the first 20 digits were 3, 4, 6, 2, 9, 7, 5, 3, 2, 6, 9, 7, 9, 3, 9, 2, 3, 3, 2 and 4. We now allocate subjects corresponding to odd digits to group A and those corresponding to even digits to B. The first digit, 3, is odd, so the first subject goes into group A. The second digit, 4, is even, so the second subject goes into group B, and so on. We get the allocation shown in Table 2.4. We could allocate into three groups by assigning to A if the digit is 1, 2, or 3, to B if 4, 5, or 6, and to C if 7, 8, or 9, ignoring 0. There are many possibilities.

Table 2.3. 1000 random digits

                                  Column
Row   1–4    5–8    9–12   13–16  17–20  21–24  25–28  29–32  33–36  37–40
  1   3645   8831   2873   5943   4632   0032   6715   3249   5455   7517
  2   9051   4066   1846   9554   6589   1680   9533   1588   1860   5646
  3   9841   9022   4837   8031   9139   3380   4082   3826   2039   7182
  4   5525   7127   1468   6404   9924   8230   7343   9268   1899   4754
  5   0299   1075   7721   8855   7997   7032   5987   7535   1834   6253
  6   7985   5566   6384   0863   0400   1834   5394   5801   5505   9099
  7   3353   9528   0681   3495   1393   3716   9506   1591   8999   3716
  8   7475   1313   2216   3776   1557   4238   9623   9024   5826   7146
  9   0666   3043   0066   3260   3660   4605   1731   6680   9101   6235
 10   9283   3160   8730   7683   1785   3148   1323   1732   6814   8496
 11   6121   3149   9829   7770   7211   3523   6947   1427   1474   5235
 12   2782   0101   7441   3877   5368   5326   5516   3566   3187   8209
 13   6105   5010   9485   8632   1072   9567   8821   7209   4873   0397
 14   1157   8567   9491   4948   3549   3941   8017   5445   2366   8260
 15   1516   0890   9286   1332   2601   2002   7245   9474   9719   9946
 16   2209   2966   1544   7674   9492   4813   7585   8128   9541   3630
 17   6913   5355   3587   4323   8332   7940   9220   8376   8261   2420
 18   0829   7937   0033   3534   8655   1091   1886   4350   6779   3358
 19   3729   9985   5563   3266   7198   8520   3193   6391   7721   9962
 20   6511   1404   8886   2892   0403   4299   8708   2055   3053   8224
 21   6622   8158   3080   2110   1553   2690   3377   5119   1749   2714
 22   3721   7713   6931   2022   6713   4629   7532   6979   3923   3243
 23   5143   0972   6838   0577   1462   8907   3789   2530   9209   0692
 24   3159   3783   9255   1531   2124   0393   3597   8461   9685   4551
 25   7905   4369   5293   0077   4482   9165   1171   2537   8913   6387

Table 2.4. Allocation of 20 subjects to two groups

Subject   Digit   Group
   1        3       A
   2        4       B
   3        6       B
   4        2       B
   5        9       A
   6        7       A
   7        5       A
   8        3       A
   9        2       B
  10        6       B
  11        9       A
  12        7       A
  13        9       A
  14        3       A
  15        9       A
  16        2       B
  17        3       A
  18        3       A
  19        2       B
  20        4       B
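The odd/even rule behind Table 2.4, and the three-group rule mentioned in the text, can be written out in a few lines. This is a minimal Python sketch using the 20 digits read from row 22, column 20 of Table 2.3; the function names are hypothetical and chosen only for illustration.

```python
# The 20 digits read from Table 2.3, starting at row 22, column 20.
digits = [3, 4, 6, 2, 9, 7, 5, 3, 2, 6, 9, 7, 9, 3, 9, 2, 3, 3, 2, 4]


def allocate_two_groups(digits):
    """Odd digit -> group A, even digit -> group B."""
    return ["A" if d % 2 == 1 else "B" for d in digits]


def allocate_three_groups(digits):
    """1-3 -> A, 4-6 -> B, 7-9 -> C; a 0 is ignored and the next digit used."""
    groups = []
    for d in digits:
        if d == 0:
            continue  # skip zeros, as described in the text
        groups.append("ABC"[(d - 1) // 3])
    return groups


allocation = allocate_two_groups(digits)
for subject, (digit, group) in enumerate(zip(digits, allocation), start=1):
    print(subject, digit, group)

# Group sizes: 12 subjects in A and 8 in B for these digits.
print(allocation.count("A"), allocation.count("B"))
```

With the three-group rule, a 0 uses up a digit without allocating a subject, so slightly more than one digit per subject is needed on average.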

The system described above gave us unequal numbers in the two groups, 12 in A and 8 in B. We sometimes want the groups to be of equal size. One way to do this would be to proceed as above until either A or B has 10 subjects in it, all the remaining subjects going into the other group. This is satisfactory in that each subject has an equal chance of being allocated to A or B, but it has a disadvantage. There is a tendency for the last few subjects all to have the same treatment. This characteristic sometimes worries researchers, who feel that the randomization is not quite right. In statistical terms the possible allocations are not equally likely. If we use this method for the random allocation described above, the 10th subject in group A would be reached at subject 15 and the last five subjects would all be in group B. We can ensure that all randomizations are equally likely by using the table of random numbers in a different way. For example, we can use the table to draw a random sample of 10 subjects from 20, as described in §3.4. These would form group A, and the remaining 10 group B. Another way is to put our subjects into small equal-sized groups, called blocks, and within each block to allocate equal numbers to A and B. This gives approximately equal numbers on the two treatments and will do so whenever the trial stops.
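The three ways of getting equal groups described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the method used by any particular program: it uses Python's built-in pseudo-random generator in place of a printed table, the seed 2000 is arbitrary, and the function names and the block size of 4 are hypothetical.

```python
import random

# The 20 digits read from Table 2.3, starting at row 22, column 20.
digits = [3, 4, 6, 2, 9, 7, 5, 3, 2, 6, 9, 7, 9, 3, 9, 2, 3, 3, 2, 4]


def fill_until_full(digits, size=10):
    """Odd -> A, even -> B until one group has `size` subjects;
    every later subject then goes into the other group."""
    groups = []
    for d in digits:
        if groups.count("A") == size:
            groups.append("B")
        elif groups.count("B") == size:
            groups.append("A")
        else:
            groups.append("A" if d % 2 == 1 else "B")
    return groups


def random_sample_allocation(n_subjects, rng):
    """Draw half the subjects at random to form group A; the rest form B.
    Every equal split is then equally likely."""
    in_a = set(rng.sample(range(n_subjects), n_subjects // 2))
    return ["A" if i in in_a else "B" for i in range(n_subjects)]


def block_allocation(n_subjects, block_size, rng):
    """Allocate in blocks, each holding equal numbers of A and B
    in random order (block_size must be even)."""
    groups = []
    while len(groups) < n_subjects:
        block = ["A", "B"] * (block_size // 2)
        rng.shuffle(block)
        groups.extend(block)
    return groups[:n_subjects]


rng = random.Random(2000)  # arbitrary seed so the run can be repeated
print("".join(fill_until_full(digits)))
print("".join(random_sample_allocation(20, rng)))
print("".join(block_allocation(20, 4, rng)))
```

The first line of output ends in five Bs, the run of identical treatments that the text warns about. With blocks, the two group sizes can never differ by more than half a block, wherever the trial stops.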

The use of random numbers and the generation of the random numbers themselves are simple mathematical operations well suited to the computers which are now readily available to researchers. It is very easy to program a computer to carry out random allocation, and once a program is available it can be used over and over again for further experiments. My program Clinstat (§1.3) does several different randomization schemes, even offering blocks of random size.

The trial carried out by the Medical Research Council (MRC 1948) to test the efficacy of streptomycin for the treatment of pulmonary tuberculosis is generally considered to have been the first randomized experiment in medicine. In this study the target population was patients with acute progressive bilateral pulmonary tuberculosis, aged 15–30 years. All cases were bacteriologically proved and were considered unsuitable for other treatments then available. The trial took place in three centres and allocation was by a series of random numbers, drawn up for each sex at each centre. The streptomycin group contained 55 patients and the control group 52 cases. The condition of the patients on admission is shown in Table 2.5. The frequency distributions of temperature and sedimentation rate were similar for the two groups; if anything the treated (S) group were slightly worse. However, this difference is no greater than could have arisen by chance, which, of course, is how it arose. The two groups are certain to be slightly different in some characteristics, especially with a fairly small sample, and we can take account of this in the analysis (Chapter 17).

Table 2.5. Condition of patients on admission to trial of streptomycin (MRC 1948)

                                                  Group
                                                  S      C
General condition              Good               8      8
                               Fair               17     20
                               Poor               30     24
Max. evening temperature       98-98.9            4      4
in first week (°F)             99-99.9            13     12
                               100-100.9          15     17
                               101+               24     19
Sedimentation rate             0-10               0      0
                               11-20              3      2
                               21-50              16     20
                               51+                36     29

Table 2.6. Survival at six months in the MRC streptomycin trial, stratified by initial condition (MRC 1948)

Maximum evening temperature                     Group
during first observation week    Outcome    Streptomycin group    Control group
98-98.9°F                        Alive      3                     4
                                 Dead       0                     0
99-99.9°F                        Alive      13                    11
                                 Dead       0                     1
100-100.9°F                      Alive      15                    12
                                 Dead       0                     5
101°F and above                  Alive      20                    11
                                 Dead       4                     8

After six months, 93% of the S group survived, compared to 73% of the control group. There was a clear advantage to the streptomycin group. The relationship of survival to initial condition is shown in Table 2.6. Survival was more likely for patients with lower temperatures, but the difference in survival between the S and C groups is clearly present within each temperature category where deaths occurred.
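The overall survival figures, and the survival within each temperature category, can be recomputed from the counts in Table 2.6. The Python sketch below is illustrative only; the dictionary layout and the function names are assumptions, not part of the MRC analysis.

```python
# (alive, dead) at six months by temperature category, from Table 2.6.
# "S" is the streptomycin group, "C" the control group.
table_2_6 = {
    "98-98.9": {"S": (3, 0), "C": (4, 0)},
    "99-99.9": {"S": (13, 0), "C": (11, 1)},
    "100-100.9": {"S": (15, 0), "C": (12, 5)},
    "101 and above": {"S": (20, 4), "C": (11, 8)},
}


def survival_percent(alive, dead):
    """Survivors as a percentage of all patients in the cell."""
    return 100 * alive / (alive + dead)


def overall_survival(group):
    """Survival percentage for one group, pooled over all categories."""
    alive = sum(row[group][0] for row in table_2_6.values())
    dead = sum(row[group][1] for row in table_2_6.values())
    return survival_percent(alive, dead)


for category, row in table_2_6.items():
    s = survival_percent(*row["S"])
    c = survival_percent(*row["C"])
    print(f"{category:14s} S {s:5.1f}%   C {c:5.1f}%")

print(f"Overall        S {overall_survival('S'):5.1f}%   C {overall_survival('C'):5.1f}%")
```

In every category where a death occurred, the streptomycin percentage is the higher of the two, which is what the stratified table is meant to show.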

Randomized trials are not restricted to two treatments. We can compare several treatments. A drug trial might include the new drug, a rival drug, and no drug at all. We can carry out experiments to compare several factors at once. For example, we might wish to study the effect of a drug at different doses in the presence or absence of a second drug, with the subject standing or supine. This is usually designed as a factorial experiment, where every possible combination of treatments is used. These designs are unusual in clinical research but are sometimes used in laboratory work. They are described in more advanced texts (Armitage and Berry 1994, Snedecor and Cochran 1980). For more on randomized trials in general, see Pocock (1983) and Johnson and Johnson (1977).

Randomized experimentation may be criticized because we are withholding a potentially beneficial treatment from patients. Any biologically active treatment is potentially harmful, however, and we are surely not justified in giving potentially harmful treatments to patients before the benefits have been demonstrated conclusively. Without properly conducted controlled clinical trials to support it, each administration of a treatment to a patient becomes an uncontrolled experiment, whose outcome, good or bad, cannot be predicted.

2.3* Methods of allocation without random numbers

In the second stage of the New York studies of BCG vaccine, the children were allocated to treatment or control alternately. Researchers often ask why this method cannot be used instead of randomization, arguing that the order in which patients arrive is random, so the groups thus formed will be comparable. First, although the patients may appear to be in a random order, there is no guarantee that this is the case. We could never be sure that the groups are comparable. Second, this method is very susceptible to mistakes, or even to cheating in the patients' perceived interest. The experimenter knows what treatment the subject will receive before the subject is admitted to the trial. This knowledge may influence the decision to admit the subject, and so lead to biased groups. For example, an experimenter might be more prepared to admit a frail patient if the patient will be on the control treatment than if the patient would be exposed to the risk of the new treatment. This objection applies to using the last digit of the hospital number for allocation.

Knowledge of what treatment the next patient will receive can certainly lead to bias. For example, Schulz et al. (1995) looked at 250 controlled trials. They compared trials where treatment allocation was not adequately concealed from researchers with trials where there was adequate concealment. They found an average treatment effect 41% larger in the trials with inadequate concealment.

There are several examples reported in the literature of alterations to treatment allocations. Holten (1951) reported a trial of anticoagulant therapy for patients with coronary thrombosis. Patients who presented on even dates were to be treated and patients arriving on odd dates were to form the control group. The author reports that some of the clinicians involved found it ‘difficult to remember’ the criterion for allocation. Overall the treated patients did better than the controls (Table 2.7). Curiously, the controls on the even dates (wrongly allocated) did considerably better than control patients on the odd dates (correctly allocated) and even managed to do marginally better than those who received the treatment. The best outcome, treated or not, was for those who were incorrectly allocated. Allocation in this trial appears to have been rather selective.

Table 2.7. Outcome of a clinical trial using systematic allocation, with errors in allocation (Holten 1951)

Outcome      Even dates               Odd dates
             Treated     Control      Treated     Control
Survived     125         39           10          125
Died         39 (25%)    11 (22%)     0 (0%)      81 (36%)
Total        164         50           10          206

Other methods of allocation set out to be random but can fall into this sort of difficulty. For example, we could use physical mixing to achieve randomization. This is quite difficult to do. As an experiment, take a deck of cards and order them in suits from ace of clubs to king of spades. Now shuffle them in the usual way and examine them. You will probably see many runs of several cards which remain together in order. Cards must be shuffled very thoroughly indeed before the ordering ceases to be apparent. The physical randomization method can be applied to an experiment by marking equal numbers of slips of paper with the names of the treatments, sealing them into envelopes and shuffling them. The treatment for a subject is decided by withdrawing an envelope. This method was used in another study of anticoagulant therapy by Carleton et al. (1960). These authors reported that in the latter stages of the trial some of the clinicians involved had attempted to read the contents of the envelopes by holding them up to the light, in order to allocate patients to their own preferred treatment.

Interfering with the randomization can actually be built into the allocation procedure, with equally disastrous results. In the Lanarkshire Milk Experiment, discussed by Student (1931), 10000 school children received three quarters of a pint of milk per day and 10000 children acted as controls. The children were weighed and measured at the beginning and end of the six-month experiment. The object was to see whether the milk improved the growth of children. The allocation to the ‘milk’ or control group was done as follows:

The teachers selected the two classes of pupils, those getting milk and those acting as controls, in two different ways. In certain cases they selected them by ballot and in others on an alphabetical system. In any particular school where there was any group to which these methods had given an undue proportion of well-fed or ill-nourished children, others were substituted to obtain a more level selection.

The result of this was that the control group had a markedly greater average height and weight at the start of the experiment than did the milk group. Student interpreted this as follows:

Presumably this discrimination in height and weight was not made deliberately, but it would seem probable that the teachers, swayed by the very human feeling that the poorer children needed the milk more than the comparatively well to do, must have unconsciously made too large a substitution for the ill-nourished among the (milk group) and too few among the controls and that this unconscious selection affected secondarily, both measurements.

Whether the bias was conscious or not, it spoiled the experiment, despite being from the best possible motives.

There is one non-random method which can be used successfully in clinical trials: minimization. In this method, new subjects are allocated to treatments so as to make the treatment groups as similar as possible in terms of the important prognostic factors. It is beyond the scope of this book, but see Pocock (1983) for a description.

2.4 Volunteer bias

People who volunteer for new treatments and those who refuse them may be very different. An illustration is provided by the field trial of Salk poliomyelitis vaccine carried out in 1954 in the USA (Meier 1977). This was carried out using two different designs simultaneously, due to a dispute about the correct method. In some districts, second grade school children were invited to participate in the trial, and randomly allocated to receive vaccine or an inert saline injection. In other districts, all second grade children were offered vaccination and the first and third grade left unvaccinated as controls. The argument against this ‘observed control’ approach was that the groups may not be comparable, whereas the argument against the randomized control method was that the saline injection could provoke paralysis in infected children. The results are shown in Table 2.8. In the randomized control areas the vaccinated group clearly experienced far less polio than the control group. Since these were randomly allocated, the only difference between them should be the treatment, which is clearly preferable to saline. However, the control group also had more polio than those who had refused to participate in the trial. The difference between the control and not inoculated group is in both treatment (saline injection) and selection; they are self-selected as volunteers and refusers. The observed control areas enable us to distinguish between these two factors. The polio rates in the vaccinated children are very similar in both parts of the study, as are the rates in the not inoculated second grade children. It is the two control groups which differ. These were selected in different ways: in the randomized control areas they were volunteers, whereas in the observed control areas they were everybody eligible, both potential volunteers and potential refusers. Now suppose that the vaccine were saline instead, and that the randomized vaccinated children had the same polio experience as those receiving saline. We would expect 200745 × 57/100000 = 114 cases, instead of the 33 observed. The total number of cases in the randomized areas would be 114 + 115 + 121 = 350 and the rate per 100000 would be 47. This compares very closely with the rate of 46 in the observed control first and third grade group. Thus it seems that the principal difference between the saline control group of volunteers and the not inoculated group of refusers is selection, not treatment.
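
The arithmetic in this argument is easy to check. The following is a minimal sketch in Python, using only the figures in Table 2.8; the variable names are my own and are not part of the original analysis.

```python
# Figures from Table 2.8, randomized control areas (Meier 1977).
vaccinated_n = 200745
control_n = 201229
not_inoculated_n = 338778

control_cases = 115
not_inoculated_cases = 121
control_rate = 57  # cases per 100000, as rounded in Table 2.8

# Cases we would expect among the vaccinated children if the vaccine
# had been no better than saline.
expected_vaccinated_cases = round(vaccinated_n * control_rate / 100000)

# Everybody eligible in the randomized areas: volunteers plus refusers.
total_cases = expected_vaccinated_cases + control_cases + not_inoculated_cases
total_n = vaccinated_n + control_n + not_inoculated_n
overall_rate = round(total_cases / total_n * 100000)

print(expected_vaccinated_cases, total_cases, overall_rate)
```

The hypothetical rate for everybody eligible in the randomized areas can then be set beside the rate of 46 per 100000 in the observed control first and third grade group.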

There is a simple explanation of this. Polio is a viral disease transmitted by the faecal-oral route. Before the development of vaccine almost everyone in the population was exposed to it at some time, usually in childhood. In the majority of cases, paralysis does not result and immunity is conferred without the child being aware of having been exposed to polio. In a small minority of cases, about 1 in 200, paralysis or death occurs and a diagnosis of polio is made. The older the exposed individual is, the greater the chance of paralysis developing. Hence, children who are protected from infection by high standards of hygiene are likely to be older when they are first exposed to polio than those children from homes with low standards of hygiene, and thus more likely to develop the clinical disease. There are many factors which may influence parents in their decision as to whether to volunteer or refuse their child for a vaccine trial. These may include education, personal experience, current illness, and others, but certainly include interest in health and hygiene. Thus in this trial the high risk children tended to be volunteered and the low risk children tended to be refused. The higher risk volunteer control children experienced 57 cases of polio per 100000, compared to 36 per 100000 among the lower risk refusers.

Table 2.8. Result of the field trial of Salk poliomyelitis vaccine (Meier 1977)

Study group                     Number in group    Paralytic polio
                                                   Number of cases    Rate per 100000
Randomized control
  Vaccinated                    200745             33                 16
  Control                       201229             115                57
  Not inoculated                338778             121                36
Observed control
  Vaccinated 2nd grade          221998             38                 17
  Control 1st and 3rd grade     725173             330                46
  Unvaccinated 2nd grade        123605             43                 35

In most diseases, the effect of volunteer bias is opposite to this. Poor conditions are related both to refusal to participate and to high risk, whereas volunteers tend to be low risk. The effect of volunteer bias is then to produce an apparent difference in favour of the treatment. We can see that comparisons between volunteers and other groups can never be reliable indicators of treatment effects.

2.5 Intention to treat

In the observed control areas of the Salk trial (Table 2.8), quite apart from the non-random age difference, the vaccinated and control groups are not comparable. However, it is possible to make a reasonable comparison in this study by comparing all second grade children, both vaccinated and refused, to the control group. The rate in the second grade children is 23 per 100000, which is less than the rate of 46 in the control group, demonstrating the effectiveness of the vaccine. The ‘treatment’ which we are evaluating is not vaccination itself, but a policy of offering vaccination and treating those who accept. A similar problem can arise in a randomized trial, for example in evaluating the effectiveness of health checkups (South-east London Screening Study Group 1977). Subjects were randomized to a screening group or to a control group. The screening group were invited to attend for an examination, some accepted and were screened and some refused. When comparing the results in terms of subsequent mortality, it was essential to compare the controls to the screening groups containing both screened and refusers. For example, the refusers may have included people who were already too ill to come for screening. The important point is that the random allocation procedure produces comparable groups and it is these we must compare, whatever selection may be made within them. We therefore analyse the data according to the way we intended to treat subjects, not the way in which they were actually treated. This is analysis by intention to treat. The alternative, analysing by treatment actually received, is called on treatment analysis.
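
The rate of 23 per 100000 for all second grade children can be reproduced from Table 2.8 by pooling those vaccinated with those who were not. This is a short sketch in Python; the names are illustrative only.

```python
# Observed control areas, Table 2.8 (Meier 1977).
vaccinated_2nd = {"n": 221998, "cases": 38}
unvaccinated_2nd = {"n": 123605, "cases": 43}
control_1st_3rd = {"n": 725173, "cases": 330}


def rate_per_100000(cases, n):
    """Cases of paralytic polio per 100000 children."""
    return cases / n * 100000


# Intention to treat: everyone who was offered vaccination,
# whether or not they accepted it.
offered_cases = vaccinated_2nd["cases"] + unvaccinated_2nd["cases"]
offered_n = vaccinated_2nd["n"] + unvaccinated_2nd["n"]

offered_rate = round(rate_per_100000(offered_cases, offered_n))
control_rate = round(rate_per_100000(control_1st_3rd["cases"], control_1st_3rd["n"]))

print(offered_rate, control_rate)
```

The comparison is between the policy of offering vaccination and the policy of not offering it, which is why the refusers stay in the second grade group.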

Analysis by intention to treat is not free of bias. As some patients may receive the other group's treatment, the difference may be smaller than it should be. We know that there is a bias and we know that it will make the treatment difference smaller, by an unknown amount. On treatment analyses, on the other hand, are biased in favour of showing a difference, whether there is one or not. Statisticians call methods which are biased against finding any effect conservative. If we must err, we like to do so in the conservative direction.
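
To see how an on treatment analysis can show a difference where none exists, consider a screening trial in which screening has no effect at all on mortality but those who refuse screening are sicker than those who accept. The numbers below are invented purely for illustration and do not come from any study; the sketch is in Python.

```python
# Hypothetical numbers, invented for illustration only.
# Screening is assumed to have NO effect on mortality.
screened = {"n": 700, "deaths": 35}   # accepted the invitation
refusers = {"n": 300, "deaths": 60}   # declined, and were sicker to begin with
controls = {"n": 1000, "deaths": 95}  # same mix of people, never invited


def mortality(group):
    """Proportion of the group who died."""
    return group["deaths"] / group["n"]


# Intention to treat: the whole group as randomized, screened plus refusers.
screening_arm = {
    "n": screened["n"] + refusers["n"],
    "deaths": screened["deaths"] + refusers["deaths"],
}
itt_difference = mortality(controls) - mortality(screening_arm)

# On treatment: only those actually screened are compared with the controls.
on_treatment_difference = mortality(controls) - mortality(screened)

print(f"Intention to treat difference: {itt_difference:.3f}")
print(f"On treatment difference:       {on_treatment_difference:.3f}")
```

Under these assumptions the intention to treat comparison correctly shows no difference, while the on treatment comparison credits screening with a benefit which is entirely due to the selection of healthier subjects.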

2.6 Cross-over designs

Sometimes it is possible to use a subject as her or his own control. For example, when comparing analgesics in the treatment of arthritis, patients may receive in succession a new drug and a control treatment. The response to the two treatments can then be compared for each patient. These designs have the advantage of removing variability between subjects. We can carry out a trial with fewer subjects than would be needed for a two group trial.

Although all subjects receive all treatments, these trials must still be randomized. In the simplest case of treatment and control, patients may be given two different regimes: control followed by treatment or treatment followed by control. These may not give the same results, e.g. there may be a long-term carry-over effect or time trend which makes treatment followed by control show less of a difference than control followed by treatment. Subjects are, therefore, assigned to a given order at random. It is possible in the analysis of cross-over studies to estimate the size of any carry-over effects which may be present.

As an example of the advantages of a cross-over trial, consider a trial of pronethalol in the treatment of angina pectoris (Pritchard et al. 1963). Angina pectoris is a chronic disease characterized by attacks of acute pain. Patients in this trial received either pronethalol or an inert control treatment (or placebo, see §2.8) in four periods of two weeks, two periods on the drug and two on the control treatment. These periods were in random order. The outcome measure was the number of attacks of angina experienced. These were recorded by the patient in a diary. Twelve patients took part in the trial. The results are shown in Table 2.9. The advantage in favour of pronethalol is shown by 11 of the 12 patients reporting fewer attacks of pain while on pronethalol than while on the control treatment. If we had obtained the same data from two separate groups of patients instead of the same group under two conditions, it would be far from clear that pronethalol is superior because of the huge variation between subjects. Using a two group design, we would need a much larger sample of patients to demonstrate the efficacy of the treatment.

Table 2.9. Results of a trial of pronethalol for the treatment of angina pectoris (Pritchard et al. 1963)

Patient     Number of attacks while on     Difference
number      Placebo      Pronethalol       (placebo - pronethalol)
 1           71           29                42
 2          323          348               -25
 3            8            1                 7
 4           14            7                 7
 5           23           16                 7
 6           34           25                 9
 7           79           65                14
 8           60           41                19
 9            2            0                 2
10            3            0                 3
11           17           15                 2
12            7            2                 5
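
The point about variation between subjects can be seen by comparing the spread of the raw counts with the spread of the within-patient differences. The sketch below, in Python, uses the data of Table 2.9; it is only a descriptive summary and not the analysis reported by the original authors.

```python
from statistics import mean, median, stdev

# Number of attacks for patients 1 to 12, from Table 2.9.
placebo = [71, 323, 8, 14, 23, 34, 79, 60, 2, 3, 17, 7]
pronethalol = [29, 348, 1, 7, 16, 25, 65, 41, 0, 0, 15, 2]

# Each patient acts as her or his own control.
differences = [p - d for p, d in zip(placebo, pronethalol)]
fewer_on_drug = sum(1 for diff in differences if diff > 0)

print("Differences (placebo - pronethalol):", differences)
print("Patients with fewer attacks on pronethalol:", fewer_on_drug, "of", len(differences))
print("Median difference:", median(differences))

# Spread between patients on each treatment, against spread of the differences.
print("SD of placebo counts:     ", round(stdev(placebo), 1))
print("SD of pronethalol counts: ", round(stdev(pronethalol), 1))
print("SD of differences:        ", round(stdev(differences), 1))
print("Mean difference:          ", round(mean(differences), 1))
```

The standard deviation of the differences is much smaller than that of either set of counts, which is why the paired comparison is so much more sensitive than a comparison of two separate groups would be.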

Cross-over designs can be useful for laboratory experiments on animals or human volunteers. They should only be used in clinical trials where the treatment will not affect the course of the disease and where the patient's condition would not change appreciably over the course of the trial. A cross-over trial could be used to compare different treatments for the control of arthritis or asthma, for example, but not to compare different regimes for the management of myocardial infarction. However, a cross-over trial cannot be used to demonstrate the long-term action of a treatment, as the nature of the design means that the treatment period must be limited. As most treatments of chronic disease must be used by the patient for a long time, a two sample trial of long duration is usually required to investigate fully the effectiveness of the treatment. Pronethalol, for example, was later found to have quite unacceptable side effects in long term use.

For more on cross-over trials, see Senn (1993) and Jones and Kenward (1989).

2.7 Selection of subjects for clinical trials

I have discussed the allocation of subjects to treatments at some length, but we have not considered where they come from. The way in which subjects are selected for an experiment may have an effect on its outcome. In practice, we are usually limited to subjects which are easily available to us. For example, in an animal experiment we must take the latest batch from the animal house. In a clinical trial of the treatment of myocardial infarction, we must be content with patients who are brought into the hospital. In experiments on human volunteers we sometimes have to use the researchers themselves.

As we shall see more fully in Chapter 3, this has important consequences for the interpretation of results. In trials of myocardial infarction, for example, we would not wish to conclude that, say, the survival rate with a new treatment in a trial in London would be the same as in a trial in Edinburgh. The patients may have a different history of diet, for example, and this may have a considerable effect on the state of their arteries and hence on their prognosis. Indeed, it would be very rash to suppose that we would get the same survival rate in a hospital a mile down the road. What we rely on is the comparison between randomized groups from the same population of subjects, and hope that if a treatment reduces mortality in London it will also do so in Edinburgh. This may be a reasonable supposition, and effects which appear in one population are likely to appear in another, but it cannot be proved on statistical grounds alone. Sometimes in extreme cases it turns out not to be true. BCG vaccine has been shown, by large, well conducted randomized trials, to be effective in reducing the incidence of tuberculosis in children in the UK. However, in India it appears to be far less effective (Lancet 1980). This may be because the amount of exposure to tuberculosis is so different in the two populations.

Given that we can use only the experimental subjects available to us, there are some principles which we use to guide our selection from them. As we shall see later, the lower the variability between the subjects in an experiment is, the better chance we have of detecting a treatment difference if it exists. This means that uniformity is desirable in our subjects. In an animal experiment this can be achieved by using animals of the same strain raised under controlled conditions. In a clinical trial we usually restrict our attention to patients of a defined age group and severity of disease. The Salk vaccine trial (§2.4) only used children in one school year. In the streptomycin trial (§2.2) the subjects were restricted to patients with acute bilateral pulmonary tuberculosis, bacteriologically proved, aged between 15 and 30 years, and unsuitable for other current therapy. Even with this narrow definition there was considerable variation among the patients, as Tables 2.5 and 2.6 show. Tuberculosis had to be bacteriologically proved because it is important to make sure that everyone has the disease we wish to treat. Patients with a different disease are not only potentially being wrongly treated themselves, but may make the results difficult to interpret. Restricting attention to a particular subset of patients, though useful, can lead to difficulties. For example, a treatment shown to be effective and safe in young people may not necessarily be so in the elderly. Trials have to be carried out on the sort of patients it is proposed to treat.

2.8 Response bias and placebos

The knowledge that she or he is being treated may alter a patient's response to treatment. This is called the placebo effect. A placebo is a pharmacologically inactive treatment given as if it were an active treatment. This effect may take many forms, from a desire to please the doctor to measurable biochemical changes in the brain. Mind and body are intimately connected, and unless the psychological effect is actually part of the treatment we usually try to eliminate such factors from treatment comparisons. This is particularly important when we are dealing with subjective assessments, such as of pain or well-being.

Fig. 2.1. Pain relief in relation to drug and to colour of placebo (after Huskisson 1974)

A fascinating example of the power of the placebo effect is given by Huskisson (1974). Three active analgesics, aspirin, Codis and Distalgesic, were compared with an inert placebo. Twenty two patients each received the four treatments in a cross-over design. The patients reported pain relief on a four point scale, from 0 = no relief to 3 = complete relief. All the treatments produced some pain relief, maximum relief being experienced after about two hours (Figure 2.1). The three active treatments were all superior to placebo, but not by very much. The four drug treatments were given in the form of tablets identical in shape and size, but each drug was given in four different colours. This was done so that patients could distinguish the drugs received, to say which they preferred. Each patient received four different colours, one for each drug, and the colour combinations were allocated randomly. Thus some patients received red placebos, some blue, and so on. As Figure 2.1 shows, red placebos were markedly more effective than other colours, and were just as effective as the active drugs! In this study not only is the effect of a pharmacologically inert placebo in producing reported pain relief demonstrated, but also the wide variability and unpredictability of this response. We must clearly take account of this in trial design. Incidentally, we should not conclude that red placebos always work best. There is, for example, some evidence that patients being treated for anxiety prefer tablets to be in a soothing green, and depressive symptoms respond best to a lively yellow (Schapira et al. 1970).

In any trial involving human subjects it is desirable that the subjects should not be able to tell which treatment is which. In a study to compare two or more treatments this should be done by making the treatments as similar as possible. Where subjects are to receive no treatment an inactive placebo should be used if possible. Sometimes when two very different active treatments are compared a double placebo or double dummy can be used. For example, when comparing a drug given as a single dose with a drug taken daily for seven days, subjects on the single dose drug may receive a daily placebo and those on the daily dose a single placebo at the start.

Placebos are not always possible or ethical. In the MRC trial of streptomycin, where the treatment involved several injections a day for several months, it was not regarded as ethical to do the same with an inert saline solution and no placebo was given. In the Salk vaccine trial, the inert saline injections were placebos. It could be argued that paralytic polio is not likely to respond to psychological influences, but how could we be really sure of this? The certain knowledge that a child had been vaccinated may have altered the risk of exposure to infection as parents allowed the child to go swimming, for example. Finally, the use of a placebo may also reduce the risk of assessment bias as we shall see in §2.9.

2.9 Assessment bias and double blind studies

The response of subjects is not the only thing affected by knowledge of the treatment. The assessment by the researcher of the response to treatment may also be influenced by the knowledge of the treatment.

Some outcome measures do not allow for much bias on the part of the assessor. For example, if the outcome is survival or death, there is little possibility that unconscious bias may affect the observation. However, if we are interested in an overall clinical impression of the patient's progress, or in changes in an X-ray picture, the measurement may be influenced by our desire (or otherwise) that the treatment should succeed. It is not enough to be aware of this danger and allow for it, as we may have the similar problem of ‘bending over backwards to be fair’. Even such an apparently objective measure as blood pressure can be influenced by the expectations of the experimenter, and special measuring equipment has been devised to avoid this (Rose et al. 1964).

We can avoid the possibility of such bias by using blind assessment, that is, the assessor does not know which treatment the subject is receiving. If a clinical trial cannot be conducted in such a way that the clinician in charge does not know the treatment, blind assessment can still be carried out by an external assessor. When the subject does not know the treatment and blind assessment is used, the trial is said to be double blind. (Researchers on eye disease hate the terms ‘blind’ and ‘double blind’, preferring ‘masked’ and ‘double masked’ instead.)

Placebos may be just as useful in avoiding assessment bias as for response bias. The subject is unable to tip the assessor off as to treatment, and there is likely to be less material evidence to indicate to an assessor what it is. In the anticoagulant study by Carleton et al. (1960) described above, the treatment was supplied through an intravenous drip. Control patients had a dummy drip set up, with a tube taped to the arm but no needle inserted, primarily to avoid assessment bias. In the Salk trial, the injections were coded and the code for a case was only broken after the decision had been made as to whether the child had polio and if so of what severity.

In the streptomycin trial, one of the outcome measures was radiological change. X-ray plates were numbered and then assessed by two radiologists and a clinician, none of whom knew to which patient and treatment the plate belonged. The assessment was done independently, and they only discussed a plate if they had not all come to the same conclusion. Only when a final decision had been arrived at was the link between plate and patient made. The results are shown in Table 2.10. The clear advantage of streptomycin is shown in the considerable improvement of over half the S group, compared to only 8% of the controls.

Table 2.10. Assessment of radiological appearance at six months as compared with appearance on admission (MRC 1948)

Radiological assessment               S Group          C Group
Considerable improvement              28     51%        4      8%
Moderate or slight improvement        10     18%       13     25%
No material change                     2      4%        3      6%
Moderate or slight deterioration       5      9%       12     23%
Considerable deterioration             6     11%        6     11%
Deaths                                 4      7%       14     27%
Total                                 55    100%       52    100%

2.10* Laboratory experiments

So far we have looked at clinical trials, but exactly the same principles apply to laboratory research on animals. It may well be that in this area the principles of randomization are not so well understood and even more critical attention is needed from the reader of research reports. One reason for this may be that great effort has been put into producing genetically similar animals, raised in conditions as close to uniform as is practicable. The researcher using such animals as subjects may feel that the resulting animals show so little biological variability that any natural differences between them will be dwarfed by the treatment effects. This is not necessarily so, as the following examples illustrate.

A colleague was looking at the effect of tumour growth on macrophage counts in rats. The only significant difference was between the initial values in tumour induced and non-induced rats, that is, before the tumour-inducing treatment was given. There was a simple explanation for this surprising result. The original design had been to give the tumour-inducing treatment to each of a group of rats. Some would develop tumours and others would not, and then the macrophage counts would be compared between the two groups thus defined. In the event, all the rats developed tumours. In an attempt to salvage the experiment my colleague obtained a second batch of animals, which he did not treat, to act as controls. The difference between the treated and untreated animals was thus due to differences in parentage or environment, not to treatment.

That problem arose by changing the design during the course of the experiment. Problems can also arise from ignoring randomization in the design of a comparative experiment. Another colleague wanted to know whether a treatment would affect weight gain in mice. Mice were taken from a cage one by one and the treatment given, until half the animals had been treated. The treated animals were put into smaller cages, five to a cage, which were placed together in a constant environment chamber. The control mice were in cages also placed together in the constant environment chamber. When the data were analysed, it was discovered that the mean initial weight was greater in the treated animals than in the control group. In a weight gain experiment this could be quite important! Perhaps larger animals were easier to pick up, and so were selected first. What that experimenter should have done was to place the mice in the boxes, give each box a place in the constant environment chamber, then allocate the boxes to treatment or control at random. We would then have two groups which were comparable, both in initial values and in any environmental differences which may exist in the constant environment chamber.
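
The allocation which should have been used can be sketched in a few lines of Python. This is a minimal illustration, assuming ten boxes numbered by their position in the chamber; the function name, the number of boxes, and the use of a seed to make the allocation reproducible are my own choices and not part of the experiment described.

```python
import random


def allocate_boxes(box_ids, seed=None):
    """Randomly split boxes into equal-sized treatment and control groups.

    The boxes are already filled and in position before allocation,
    so the experimenter's choices cannot influence which box gets which.
    """
    rng = random.Random(seed)
    shuffled = list(box_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Ten boxes of five mice each, numbered by position in the chamber.
allocation = allocate_boxes(range(1, 11), seed=2000)
print(allocation)
```

Because the allocation is applied to whole boxes after they have been filled and placed, neither the ease of catching a mouse nor the position of its box can be related to treatment except by chance.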

2.11* Experimental units

In the weight gain experiment described above, each box of mice contained five animals. These animals were not independent of one another, but interacted. In a box the other four animals formed part of the environment of the fifth, and so might influence its growth. The box of five mice is called an experimental unit. An experimental unit is the smallest group of subjects in an experiment whose response cannot be affected by other subjects. We need to know the amount of natural variation which exists between experimental units before we can decide whether the treatment effect is distinguishable from this natural variation. In the weight gain experiment, the mean weight gain in each box should be calculated, and the mean difference estimated using the two-sample t method (§10.3). In human studies, the same thing happens when groups of patients, such as all those in a hospital ward or a general practice are randomized as a group. This might happen in a trial of health promotion, for example, where special clinics are advertised and set up in GP surgeries. It would be impractical to exclude some patients from the clinic and impossible to prevent patients from the practice interacting with and influencing one another. All the practice patients must be treated as a single unit. Trials where experimental units contain more than one subject are called cluster randomized.
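
The analysis by experimental unit can be sketched as follows, in Python. The weight gains are invented solely to show the mechanics and do not come from the experiment described; the function names are also my own. The point of the sketch is that the t statistic is calculated from the box means, so that the degrees of freedom depend on the number of boxes and not on the number of mice.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical weight gains (g), five mice per box.
# These numbers are invented for illustration only.
treated_boxes = [
    [5.1, 4.8, 5.5, 5.0, 4.6],
    [6.0, 5.7, 6.2, 5.9, 6.2],
    [4.9, 5.2, 5.0, 5.3, 5.1],
]
control_boxes = [
    [4.0, 4.2, 3.9, 4.1, 4.3],
    [4.8, 4.6, 5.0, 4.7, 4.9],
    [4.4, 4.1, 4.3, 4.5, 4.2],
]


def box_means(boxes):
    """One summary value per experimental unit (box)."""
    return [mean(box) for box in boxes]


def two_sample_t(x, y):
    """Two-sample t statistic with pooled variance, and its degrees of freedom."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    std_err = sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / std_err, nx + ny - 2


treated_means = box_means(treated_boxes)
control_means = box_means(control_boxes)
t_stat, df = two_sample_t(treated_means, control_means)

print("Treated box means:", [round(m, 2) for m in treated_means])
print("Control box means:", [round(m, 2) for m in control_means])
print("t =", round(t_stat, 2), "on", df, "degrees of freedom")
```

Had the 30 mice been treated as independent observations there would have been 28 degrees of freedom instead of 4, and the variation between boxes would have been ignored.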

The question of the experimental unit arises when the treatment is applied to the provider of care rather than to the patient directly. For example, White et al. (1989) compared three randomly allocated groups of GPs, the first given an intensive programme of small group education to improve their treatment of asthma, the second a lesser intervention, and the third no intervention at all. For each GP, a sample of her or his asthmatic patients was selected. These patients received questionnaires about their symptoms, the research hypothesis being that the intensive programme would result in fewer symptoms among their patients. The experimental unit was the GP, not the patient. The asthma patients treated by an individual GP were used to monitor the effect of the intervention on that GP. The proportion of patients who reported symptoms was used as a measure of the GP's effectiveness, and the mean of these proportions was compared between the groups using one-way analysis of variance (§10.9). Another example would be a trial of population screening for a disease (§15.3), where screening centres were set up in some health districts and not in others. We should find the mortality rate for each district separately and then compare the mean rate in the group of screening districts with that in the group of control districts.

The most extreme case arises when there is only one experimental unit per treatment. For example, consider a health education experiment involving two schools. In one school a special health education programme was mounted, aimed to discourage children from smoking. Both before and afterwards, the children in each school completed questionnaires about cigarette smoking. In this example the school is the experimental unit. There is no reason to suppose that two schools should have the same proportion of smokers among their pupils, or that two schools which do have equal proportions of smokers will remain so. The experiment would be much more convincing if we had several schools and randomly allocated them to receive the health education programme or to be controls. We would then look for a consistent difference between the treated and control schools, using the proportion of smokers in the school as the variable.

2.12* Consent in clinical trials

I started my research career in agriculture. Our experimental subjects, being barley plants, had no rights. We sprayed them with whatever chemicals we chose and burnt them after harvest and weighing. We cannot treat human subjects in the same way. We must respect the rights of our research subjects and their welfare must be our primary concern. This has not always been the case, most notoriously in the Nazi death camps (Leaning 1996). The Declaration of Helsinki (BMJ 1996a), which lays down the principles which govern research on human subjects, grew out of the trials in Nuremberg of the perpetrators of these atrocities (BMJ 1996b).

If there is a treatment, we should not leave patients untreated if this in any way affects their well-being. The world was rightly outraged by the Tuskegee Study, where men with syphilis were left untreated to see what the long-term effects of the disease might be (Brawley 1998, Ramsay 1998). This is an extreme example but it is not the only one. Women with dysplasia found at cervical cytology have been left untreated to see whether cancer developed (Mudur 1997). Patients are still being asked to enter pharmaceutical trials where they may get a placebo, even though an effective treatment is available, allegedly because regulators insist on it.

People should not be treated without their consent. This general principle is not confined to research. Patients should also be asked whether they wish to take part in a research project and whether they agree to be randomized. They should know to what they are consenting, and usually recruits to clinical trials are given information sheets which explain to them randomization, the alternative treatments, and the possible risks and benefits. Only then can they give informed or valid consent. For children who are old enough to understand, both child and parent should be informed and give their consent, otherwise parents must consent (Doyal 1997). People get very upset and angry if they think that they have been experimented on without their knowledge and consent, or if they feel that they have been tricked into it without being fully informed. A group of women with cervical cancer were given an experimental radiation treatment, which resulted in severe damage, without proper information (Anon 1997). They formed a group which they called RAGE, which speaks for itself.

Patients are sometimes recruited into trials when they are very distressed and very vulnerable. If possible they should have time to think about the trial and discuss it with their family. Patients in trials are often not at all clear about what is going on and have wrong ideas about what is happening (Snowdon et al. 1997). They may be unable to recall giving their consent, and deny having given it. They should always be asked to sign consent forms and should be given a separate patient information sheet and a copy of the form to keep.

A difficulty arises with the randomized consent design (Zelen 1979, 1992). In this, we have a new, active treatment and either no control treatment or usual care. We randomize subjects to active or control. We then offer the new treatment to the active group, who may refuse, and the control group gets usual care. The active group is asked to consent to the new treatment and all subjects are asked to consent to any measurement required. They might be told that they are in a research study, but not that they have been randomized. Thus only patients in the active group can refuse the trial, though all can refuse measurement. Analysis is then by intention to treat (§2.5). For example, Dennis et al. (1997) wanted to evaluate a stroke family care worker. They randomized patients without their knowledge, then asked them to consent to follow-up consisting of interviews by a researcher. The care worker visited those patients and their families who had been randomized to her. McLean (1997) argued that if patients could not be informed about the randomization without jeopardizing the trial, the research should not be done. Dennis (1997) argued that to ask for consent to randomization might bias the results, because patients who did not receive the care worker might be resentful and be harmed by this. My own view is that we should not allow one ethical consideration, informed consent, to outweigh all others and this design can be acceptable (Bland 1997).

There is a special problem in cluster randomized trials. Patients cannot consent to randomization, but only to treatment. In a trial where general practices are allocated to offer health checks, for example, patients can consent to the health checks only if they are in a health check practice, though all would have to consent to an end of trial assessment.

Research on human subjects should always be approved by an independent ethics committee, whose role is to represent the interests of the research subject. Where such a system is not in place, terrible things can happen. In the USA, research can be carried out without ethical approval if the subjects are private patients in a private hospital without any public funding, and no new drug or device is used. Under these circumstances, plastic surgeons carried out a trial comparing two methods of performing face-lifts, one on each side of the face, without patients' consent (Bulletin of Medical Ethics 1998).

2M Multiple choice questions 1 to 6

(Each branch is either true or false)

1. When testing a new medical treatment, suitable control groups include patients who:

(a) are treated by a different doctor at the same time;

(b) are treated in a different hospital;

(c) are not willing to receive the new treatment;

(d) were treated by the same doctor in the past;

(e) are not suitable for the new treatment.

2. In an experiment to compare two treatments, subjects are allocated using random numbers so that:

(a) the sample may be referred to a known population;

(b) when deciding to admit a subject to the trial, we do not know which treatment that subject would receive;

(c) the subjects will get the treatment best suited to them;

(d) the two groups will be similar, apart from treatment;

(e) treatments may be assigned according to the characteristics of the subject.

3. In a double blind clinical trial:

(a) the patients do not know which treatment they receive;

(b) each patient receives a placebo;

(c) the patients do not know that they are in a trial;

(d) each patient receives both treatments;

(e) the clinician making assessment does not know which treatment the patient receives.

4. In a trial of a new vaccine, children were assigned at random to a ‘vaccine’ and a ‘control’ group. The ‘vaccine’ group were offered vaccination, which two-thirds accepted. The control group were offered nothing:

(a) the group which should be compared to the controls is all children who accepted vaccination;

(b) those refusing vaccination should be included in the control group;

(c) the trial is double blind;

(d) those refusing vaccination should be excluded;

(e) the trial is useless because not all the treated group were vaccinated.

Table 2.11. Method of delivery in the KYM study

Method of delivery    Accepted KYM     Refused KYM     Control women
                        %       n        %       n        %       n
Normal                80.7     352     69.8      30     74.8     354
Instrumental          12.4      54     14.0       6     17.8      84
Caesarian              6.9      30     16.3       7      7.4      35

5. Cross-over designs for clinical trials:

(a) may be used to compare several treatments;

(b) involve no randomization;

(c) require fewer patients than do designs comparing independent groups;

(d) are useful for comparing treatments intended to alleviate chronic symptoms;

(e) use the patient as his own control.

6. Placebos are useful in clinical trials:

(a) when two apparently similar active treatments are to be compared;

(b) to guarantee comparability in non-randomized trials;

(c) because the fact of being treated may itself produce a response;

(d) because they may help to conceal the subject's treatment from assessors;

(e) when an active treatment is to be compared to no treatment.

2E Exercise: The ‘Know Your Midwife’ trial

The Know Your Midwife (KYM) scheme was a method of delivering maternity care for low-risk women. A team of midwives ran a clinic, and the same midwife would give all antenatal care for a mother, deliver the baby, and give postnatal care. The KYM scheme was compared to standard antenatal care in a randomized trial (Flint and Poulengeris 1986). It was thought that the scheme would be very attractive to women and that if they knew it was available they might be reluctant to be randomized to standard care. Eligible women were randomized without their knowledge to KYM or to the control group, who received the standard antenatal care provided by St. George's Hospital. Women randomized to KYM were sent a letter explaining the KYM scheme and inviting them to attend. Some women declined and attended the standard clinic instead. The mode of delivery for the women is shown in Table 2.11. Normal obstetric data were recorded on all women, and the women were asked to complete questionnaires (which they could refuse) as part of a study of antenatal care, though they were not told about the trial.

1. The women knew what type of care they were receiving. What effect might this have on the outcome?

2. What comparison should be made to test whether KYM has any effect on method of delivery?

3. Do you think it was ethical to randomize women without their knowledge?



3

Sampling and observational studies

3.1 Observational studies

In this chapter we shall be concerned with observational studies. Instead of changing something and observing the result, as in an experiment or clinical trial, we observe the existing situation and try to understand what is happening. Most medical studies are observational, including research into human biology in healthy people, the natural history of disease, the causes and distribution of disease, the quality of measurement, and the process of medical care.

One of the most important and difficult tasks in medicine is to determine the causes of disease, so that we may devise methods of prevention. We are working in an area where experiments are often neither possible nor ethical. For example, to determine that cigarette smoking caused cancer, we could imagine a study in which children were randomly allocated to a ‘twenty cigarettes a day for fifty years’ group and a ‘never smoke in your life’ group. All we would have to do then would be to wait for the death certificates. However, we could not persuade our subjects to stick to the treatment and deliberately setting out to cause cancer is hardly ethical. We must therefore observe the disease process as best we can, by watching people in the wild rather than under laboratory conditions.

We can never come to an unequivocal conclusion about causation in observational studies. The disease effect and possible cause do not exist in isolation but in a complex interplay of many intervening factors. We must do our best to assure ourselves that the relationship we observe is not the result of some other factor acting on both ‘cause’ and ‘effect’. For example, it was once thought that the African fever tree, the yellow-barked acacia, caused malaria, because those unwise enough to camp under them were likely to develop the disease. This tree grows by water where mosquitos breed, and provides an ideal day-time resting place for these insects, whose bite transmits the plasmodium parasite which produces the disease. It was the water and the mosquitos which were the important factors, not the tree. Indeed, the name ‘malaria’ comes from a similar incomplete observation. It means ‘bad air’ and comes from the belief that the disease was caused by the air in low-lying, marshy places, where the mosquitos bred. Epidemiological study designs must try to deal with the complex interrelationships between different factors in order to deduce the true mechanism of disease causation. We also use a number of different approaches to the study of these problems, to see whether all produce the same answer.

There are many problems in interpreting observational studies, and the medical consumer of such research must be aware of them. We have no better way to tackle many questions and so we must make the best of them and look for consistent relationships which stand up to the most severe examination. We can also look for confirmation of our findings indirectly, from animal models and from dose-response relationships in the human population. However, we must accept that perfect proof is impossible and it is unreasonable to demand it. Sometimes, as with smoking and health, we must act on the balance of the evidence.

We shall start by considering how to get descriptive information about populations in which we are interested. We shall go on to the problem of using such information to study disease processes and the possible causes of disease.

3.2 Censuses

One simple question we can ask about any group of interest is how many members it has. For example, we need to know how many people live in a country and how many of them are in various age and sex categories, in order to monitor the changing pattern of disease and to plan medical services. We can obtain this information by a census. In a census, the whole of a defined population is counted. In the United Kingdom, as in many developed countries, a population census is held every ten years. This is done by dividing the entire country into small areas called enumeration districts, usually containing between 100 and 200 households. It is the responsibility of an enumerator to identify every household in the district and ensure that a census form is completed, listing all members of the household and providing a few simple pieces of information. Even though completion of the census form is compelled by law, and enormous effort goes into ensuring that every household is included, there are undoubtedly some who are missed. The final data, though extremely useful, are not totally reliable.

The medical profession takes part in a massive, continuing census of deaths, by providing death certificates for each death which occurs, including not only the name of the deceased and cause of death, but also details of age, sex, place of residence and occupation. Census methods are not restricted to national populations. They can be used for more specific administrative purposes too. For example, we might want to know how many patients are in a particular hospital at a particular time, how many of them are in different diagnostic groups, in different age/sex groups, and so on. We can then use this information together with estimates of the death and discharge rates to estimate how many beds these patients will occupy at various times in the future (Bewley et al. 1975, 1981).

3.3 Sampling

A census of a single hospital can only give us reliable information about that hospital. We cannot easily generalize our results to hospitals in general. If we want to obtain information about the hospitals of the United Kingdom, two courses are open to us: we can study every hospital, or we can take a representative sample of hospitals and use that to draw conclusions about hospitals as a whole.

Most statistical work is concerned with using samples to draw conclusions about some larger population. In the clinical trials described in Chapter 2, the patients act as a sample from a larger population consisting of all similar patients and we do the trial to find out what would happen to this larger group were we to give them a new treatment.

The word ‘population’ is used in common speech to mean ‘all the people living in an area’, frequently of a country. In statistics, we define the term more widely. A population is any collection of individuals in which we may be interested, where these individuals may be anything, and the number of individuals may be finite or infinite. Thus, if we are interested in some characteristics of the British people, the population is ‘all people in Britain’. If we are interested in the treatment of diabetes the population is ‘all diabetics’. If we are interested in the blood pressure of a particular patient, the population is ‘all possible measurements of blood pressure in that patient’. If we are interested in the toss of two coins, the population is ‘all possible tosses of two coins’. The first two examples are finite populations and could in theory if not practice be completely examined; the second two are infinite populations and could not. We could only ever look at a sample, which we will define as being a group of individuals taken from a larger population and used to find out something about that population.

How should we choose a sample from a population? The problem of getting a representative sample is similar to that of getting comparable groups of patients discussed in §2.1, 2.2, 2.3. We want our sample to be representative, in some sense, of the population. We want it to have all the characteristics in terms of the proportions of individuals with particular qualities as has the whole population. In a sample from a human population, for example, we want the sample to have about the same proportion of men and women as in the population, the same proportions in different age groups, in occupational groups, with different diseases, and so on. In addition, if we use a sample to estimate the proportion of people with a disease, we want to know how reliable this estimate is, that is, how far from the proportion in the whole population the estimate is likely to be.

It is not sufficient to choose the most convenient group. For example, if we wished to predict the results of an election, we would not take as our sample people waiting in bus queues. These may be easy to interview, at least until the bus comes, but the sample would be heavily biased towards those who cannot afford cars and thus towards lower income groups. In the same way, if we wanted a sample of medical students we would not take the front two rows of the lecture theatre. They may be unrepresentative in having an unusually high thirst for knowledge, or poor eyesight.

How can we choose a sample which does not have a built-in bias? We might divide our population into groups, depending on how we think various characteristics will affect the result. To ask about an election, for example, we might group the population according to age, sex and social class. We then choose a number of people in each group by knocking on doors until the quota is made up, and interview them. Then, knowing the distributions of these categories in the population (from census data, etc.) we can get a far better picture of the views of the population. This is called quota sampling. In the same way we could try to choose a sample of rats by choosing given numbers of each weight, age, sex, etc. There are difficulties with this approach. First, it is rarely possible to think of all the relevant classifications. Second, it is still difficult to avoid bias within the classifications, by picking interviewees who look friendly, or rats which are easy to catch. Third, we can only get an idea of the reliability of findings by repeatedly doing the same type of survey, and of the representativeness of the sample by knowing the true population values (which we can actually do in the case of elections), or by comparing the results with a sample which does not have these drawbacks. Quota sampling can be quite effective when similar surveys are made repeatedly as in opinion polls or market research. It is less useful for medical problems, where we are continually asking new questions. We need a method where bias is avoided and where we can estimate the reliability of the sample from the sample itself. As in §2.2, we use a random method: random sampling.

3.4 Random sampling

The problem of obtaining a sample which is representative of a larger population is very similar to that of allocating patients into two comparable groups. We want a way of choosing members of the sample which does not depend on their own characteristics. The only way to be sure of this is to select them at random, so that whether or not each member of the population is chosen for the sample is purely a matter of chance.

For example, to take a random sample of 5 students from a class of 80, we could write all the names on pieces of paper, mix them thoroughly in a hat or other suitable container, and draw out five. All students have the same chance of being chosen, and so we have a random sample. All samples of 5 students are equally likely, too, because each student is chosen quite independently of the others. This method is called simple random sampling.

As we have seen in §2.2, physical methods of randomizing are often not very suitable for statistical work. We usually use tables of random digits, such as Table 2.3, or random numbers generated by a computer program. We could use Table 2.3 to draw our sample of 5 from 80 students in several ways. For example, we could list the students, numbered from 1 to 80. This list from which the sample is to be drawn is called the sampling frame. We choose a starting point in the random number table (Table 2.3), say row 20, column 5. This gives us the following pairs of digits:

14 04 88 86 28 92 04 03 42 99 87 08

We could use these pairs of digits directly as subject numbers. We choose subjects numbered 14 and 4. There is no subject 88 or 86, so the next chosen is number 28. There is no 92, so the next is 4. We already have this subject in the sample, so we carry on to the next pair of digits, 03. The final member of the sample is number 42. Our sample of 5 students is thus numbers 3, 4, 14, 28 and 42.
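The procedure just described can be sketched in a few lines of Python. This is only an illustration: the function name `draw_sample` and its arguments are my own labels, not part of any statistical package, and the digits are those quoted above.

```python
def draw_sample(digits: str, population_size: int, sample_size: int) -> list[int]:
    """Read successive two-digit pairs and keep the first `sample_size`
    distinct subject numbers that lie between 1 and `population_size`."""
    chosen: list[int] = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        # Skip numbers with no matching subject and subjects already chosen.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Pairs of digits quoted in the text (Table 2.3, row 20, column 5).
table_digits = "140488862892040342998708"
sample = sorted(draw_sample(table_digits, population_size=80, sample_size=5))
print(sample)
```

With the digits above this reproduces the five subject numbers found by hand; the pairs 88, 86 and 92 are skipped because no such subject exists, and the second 04 is skipped as a duplicate.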

There appears to be some pattern in this sample. Two numbers are adjacent (3 and 4) and 3 are divisible by 14 (14, 28 and 42). Random numbers often appear to us to have pattern, perhaps because the human mind is always looking for it. On the other hand, if we try to make the sample ‘more random’ by replacing either 3 or 4 by a subject near the end of the list, we are imposing a pattern of uniformity on the sample and destroying its randomness. All groups of five are equally likely and may happen, even 1, 2, 3, 4, 5.

This method of using the table is fine for drawing a small sample, but it can be tedious for drawing large samples, because of the need to check for duplicates. There are many other ways of doing it. For example, we can drop the requirement for a sample of fixed size, and only require that each member of the population will have a fixed probability of being in the sample. We could draw a 5/80 = 1/16 sample of our class by using the digits in groups to give a decimal number, say,

0.1404 0.8886 0.2892 0.0403 0.4299 0.8708

We then choose the first member of the population if 0.1404 is less than 1/16. It is not, so we do not include this member, nor the second, corresponding to 0.8886, nor the third, corresponding to 0.2892. The fourth corresponds to 0.0403, which is less than 1/16 (0.0625) and so the fourth member is chosen as a member of the sample, and so on. This method is only suitable for fairly large samples, as the size of the sample obtained can be very variable in small sampling problems. In the example there is a higher than 1 in 10 chance of finishing with a sample of 2 or fewer.
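The ‘higher than 1 in 10’ figure can be checked on the assumption that each of the 80 students is included independently with probability 1/16, so that the sample size follows a Binomial distribution. The sketch below uses my own variable names and only the Python standard library.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


n_students = 80
p_include = 1 / 16

# The six decimal numbers quoted in the text: only the fourth is below 1/16.
decimals = [0.1404, 0.8886, 0.2892, 0.0403, 0.4299, 0.8708]
chosen_positions = [i for i, u in enumerate(decimals, start=1) if u < p_include]

expected_size = n_students * p_include            # average sample size
prob_two_or_fewer = binomial_cdf(2, n_students, p_include)

print(chosen_positions, expected_size, round(prob_two_or_fewer, 3))
```

The expected sample size is 5, but under this assumption the chance of ending with only 0, 1 or 2 students is between 11% and 12%, which is why the method is better kept for larger sampling problems.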

As with random allocation (§2.2), random sampling is an operation ideally suited to computers. My free program Clinstat (§1.3) provides two random sampling schemes. Some computer programs for managing primary care practices actually have the capacity to take a random sample for any defined group of patients built in.
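As an illustration of how little is involved, here is a minimal sketch using the `random` module of Python's standard library. It is not Clinstat or any practice management program; the seed value is arbitrary and is fixed only so that the draw can be repeated.

```python
import random

# Sampling frame: students numbered 1 to 80, as in the example above.
sampling_frame = list(range(1, 81))

# A seeded generator makes the sample reproducible; the seed is arbitrary.
rng = random.Random(2000)

# random.sample draws without replacement, so no duplicate check is needed.
simple_sample = sorted(rng.sample(sampling_frame, k=5))
print(simple_sample)
```

Because the draw is without replacement, every set of five students has the same chance of being chosen, which is the defining property of a simple random sample.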

Random sampling ensures that the only ways in which the sample differs from the population will be those due to chance. It has a further advantage: because the sample is random, we can apply the methods of probability theory to the data obtained. As we shall see in Chapter 8, this enables us to estimate how far from the population value the sample value is likely to be.

The problem with random sampling is that we must have a list of the population from which the sample is to be drawn. Lists of populations may be hard to find, or they may be very cumbersome. For example, to sample the adult population in the UK, we could use the electoral roll. But a list of some 40 000 000 names would be difficult to handle, and in practice we would first take a random sample of electoral wards, and then a random sample of electors within these wards. This is, for obvious reasons, a multi-stage random sample. This approach contains the element of randomness, and so samples will be representative of the populations from which they are drawn. However, not all samples have an equal chance of being chosen, so it is not the same as simple random sampling.

We can also carry out sampling without a list of the population itself, provided we have a list of some larger units which contain all the members of the population. For example, we can obtain a random sample of school children in an area by starting with a list of schools, which is quite easy to come by. We then draw a simple random sample of schools and all the children within our chosen schools form the sample of children. This is called a cluster sample, because we take a sample of clusters of individuals. Another example would be sampling from any age/sex group in the general population by taking a sample of addresses and then taking everyone at the chosen addresses who matched our criteria.

Sometimes it is desirable to divide the population into different strata, for example into age and sex groups, and take random samples within these. This is rather like quota sampling, except that within the strata we choose at random. If the different strata have different values of the quantity we are measuring, this stratified random sampling can increase our precision considerably. There are many complicated sampling schemes for use in different situations. For example, in a study of cigarette smoking and respiratory disease in Derbyshire school children, we drew a random sample of schools, stratified by school type (single-sex/mixed, selective/non-selective, etc.). Some schools which took children to age 13 then fed into the same 14+ school were combined into one sampling unit. Our sample of children was all children in the chosen schools who were in their first secondary school year (Banks et al. 1978). We thus had a stratified random cluster sample. These sampling methods affect the estimate obtained. Stratification improves the precision, cluster sampling worsens it. The sampling scheme should be taken into account in the analysis (Cochran 1977, Kish 1994). Often it is ignored, as was done by Banks et al. (1978) (that is, by me), but it should not be and results may be reported as being more precise than they really are.
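A stratified random cluster sample of the kind described can be sketched as follows. The list of schools is entirely hypothetical (40 ‘mixed’ and 20 ‘single-sex’ schools), as are the function name and the sampling fraction of one quarter; this is not the scheme actually used in the Derbyshire study.

```python
import random
from collections import Counter


def stratified_sample(frame, stratum_of, fraction, rng):
    """Draw a simple random sample of the given fraction within each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name in sorted(strata):
        units = strata[name]
        k = max(1, round(fraction * len(units)))   # at least one unit per stratum
        sample.extend(rng.sample(units, k))
    return sample


# Hypothetical sampling frame of schools: (school type, school number).
schools = [("mixed", i) for i in range(40)] + [("single-sex", i) for i in range(20)]

rng = random.Random(1978)   # arbitrary seed, fixed for repeatability
chosen_schools = stratified_sample(schools, lambda s: s[0], 0.25, rng)
schools_per_stratum = Counter(school_type for school_type, _ in chosen_schools)
print(schools_per_stratum)
```

Each chosen school is then a cluster: every eligible child in it would enter the sample. The stratification guarantees that both types of school are represented in proportion, which simple random sampling of schools would not.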

In §2.3 I looked at the difficulties which can arise using methods of allocation which appear random but do not use random numbers. In sampling, two such methods are often suggested by researchers. One is to take every tenth subject from the list, or whatever fraction is required. The other is to use the last digit of some reference number, such as the hospital number, and take as the sample subjects where this is, say, 3 or 4. These sampling methods are systematic or quasi-random. It is not usually obvious why they should not give ‘random’ samples, and it may be that in many cases they would be just as good as random sampling. They are certainly easier. To use them, we must be very sure that there is no pattern to the list which could produce an unrepresentative group. If it is possible, random sampling seems safer.
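To see how a pattern in the list can defeat systematic sampling, consider a deliberately artificial example of my own: a list of 100 patients in ward order, ten beds to a ward, where the first bed on each ward always holds a severely ill patient. The data and names below are invented for illustration.

```python
WARD_SIZE = 10

# Hypothetical list in ward order: the first bed on every ward is 'severe'.
frame = [{"patient": i, "severe": i % WARD_SIZE == 0} for i in range(100)]


def proportion_severe(patients):
    """Proportion of the given patients who are flagged as severe."""
    return sum(p["severe"] for p in patients) / len(patients)


every_tenth_from_first = frame[0::10]    # beds 1, 11, 21, ...
every_tenth_from_fourth = frame[3::10]   # beds 4, 14, 24, ...

print(proportion_severe(frame))                    # whole list
print(proportion_severe(every_tenth_from_first))   # systematic sample A
print(proportion_severe(every_tenth_from_fourth))  # systematic sample B
```

One in ten of the patients on the list is severely ill, yet taking every tenth name gives a sample in which either all or none are, depending on the starting point. Real lists are rarely so regular, but the example shows why the absence of pattern has to be checked rather than assumed.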

Volunteer bias can be as serious a problem in sampling studies as it is in trials (§2.4). If we can only obtain data from a subset of our random sample, then this subset will not be a random sample of the population. Its members will be self selected. It is often very difficult to get data from every member of a sample. The proportion for whom data is obtained is called the response rate and in a sample survey of the general population is likely to be between 70% and 80%. The possibility that those lost from the sample are different in some way must be considered. For example, they may tend to be ill, which can be a serious problem in disease prevalence studies. In the school study of Banks et al. (1978), the response rate was 80%, most of those lost being absent from school on the day. Now, some of these absentees were ill and some were truants. Our sample may thus lead us to underestimate the prevalence of respiratory symptoms, by omitting sufferers with current acute disease, and the prevalence of cigarette smoking by omitting those who have gone for a quick smoke behind the bike sheds.

One of the most famous sampling disasters, the Literary Digest poll of 1936, illustrates these dangers (Bryson 1976). This was a poll of voting intentions in the 1936 US presidential election, fought by Roosevelt and Landon. The sample was a complex one. In some cities every registered voter was included, in others one in two, and for the whole of Chicago one in three. Ten million sample ballots were mailed to prospective voters, but only 2.3 million, less than a quarter, were returned. Still, two million is a lot of Americans, and these predicted a 60% vote to Landon. In fact, Roosevelt won with 62% of the vote. The response was so poor that the sample was most unlikely to be representative of the population, no matter how carefully the original sample was drawn. Two million Americans can be wrong! It is not the mere size of the sample, but its representativeness which is important. Provided the sample is truly representative, 2000 voters is all you need to estimate voting intentions to within 2%, which is enough for election prediction if they tell the truth and do not change their minds (see §18E).
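The ‘within 2%’ figure can be checked roughly. The sketch below assumes a simple random sample, the usual Normal approximation for a proportion, a 95% multiplier of 1.96 and the worst case of an evenly split vote; none of these is stated in this section, and the function name is my own.

```python
from math import sqrt


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of size n (Normal approximation)."""
    return z * sqrt(p * (1 - p) / n)


margin_2000 = margin_of_error(0.5, 2000)          # representative sample of 2000
margin_digest = margin_of_error(0.5, 2_300_000)   # nominal figure for 2.3 million

print(round(100 * margin_2000, 1), "percentage points")
print(round(100 * margin_digest, 2), "percentage points")
```

The nominal margin for 2.3 million replies is tiny, yet the poll was wrong by more than 20 percentage points. The formula measures only chance variation in a random sample; it says nothing about the bias introduced when three-quarters of the sample do not reply.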

3.5 Sampling in clinical and epidemiological studies

Having extolled the virtues of random sampling and cast doubt on all other sampling methods, I must admit that most medical data are not obtained in this way. This is partly because the practical difficulties are immense. To obtain a reasonable sample of the population of the UK, anyone can get a list of electoral wards, take a random sample of them, buy copies of the electoral rolls for the chosen wards and then take a random sample of names from them. But suppose you want to obtain a sample of patients with carcinoma of the bronchus. You could get a list of hospitals easily enough and get a random sample of them, but then things would become difficult. The names of patients will only be released by the consultant in charge should he so wish, and you will need his permission before approaching them. Any study of human patients requires ethical approval, and you will need this from the ethics committee of each of your chosen hospitals. Getting the cooperation of so many people is a task to daunt the hardiest, and obtaining ethical approval alone can take more than a year. In the UK, we now have a system of multi-centre research ethics committees, but as local approval must also be obtained the delays may still be immense.

The result of this is that clinical studies are done on the patients to hand. I have touched on this problem in the context of clinical trials (§2.7) and the same applies to other types of clinical study. In a clinical trial we are concerned with the comparison of two treatments and we hope that the superior treatment in Stockport will also be the superior treatment in Southampton. If we are studying clinical measurement, we can hope that a measurement method which is repeatable in Middlesbrough will be repeatable in Maidenhead, and that two different methods giving similar results in one place will give similar results in another. Studies which are not comparative give more cause for concern. The natural history of a disease described in one place may differ in unpredictable ways from that in another, due to differences in the environment and the genetic make up of the local population. Reference ranges for quantities of clinical interest, the limits within which values from most healthy people will lie, may well differ from place to place.

Studies based on local groups of patients are not without value. This is particularly so when we are concerned with comparisons between groups, as in a clinical trial, or relationships between different variables. However, we must always bear the limitations of the sampling method in mind when interpreting the results of such studies.

In general, most medical research has to be carried out using samples drawn from populations which are much more restricted than those about which we wish to draw conclusions. We may have to use patients in one hospital instead of all patients, or the population of a small area rather than that of the whole country or planet. We may have to rely on volunteers for studies of normal subjects, given most people's dislike of having needles pushed into them and disinclination to spend hours hooked up to batteries of instruments. Groups of normal subjects contain medical students, nurses and laboratory staff far more often than would be expected by chance. In animal research the problem is even worse, for not only does one batch of one strain of mice have to represent the whole species, it often has to represent members of a different order, namely humans.

Findings from such studies can only apply to the population from which the sample was drawn. Any conclusion which we come to about wider populations, such as all patients with the disease in question, depends on evidence which is not statistical and often unspecified, namely our general experience of natural variability and experience of similar studies. This may let us down, and results established in one population may not apply to another. We have seen this in the use of BCG vaccine in India (§2.7). It is very important wherever possible that studies should be repeated by other workers on other populations, so that we can sample the larger population at least to some extent.

In two types of study, case reports and case series, the subjects come before the research, as it is suggested by their existence. There is no sampling. They are used to raise questions rather than to answer them.

A case report is a description of a single patient whose case displays interesting features. This is used to generate ideas and raise questions, rather than to answer them. It clearly cannot be planned in advance; it arises from the case. For example, Velzeboer et al. (1997) reported the case of an 11-month-old Pakistani girl who was admitted to hospital because of drowsiness, malaise and anorexia. She had stopped crawling or standing up and scratched her skin continuously. All investigations were negative. Her 6-year-old sister was then brought in with similar symptoms. (Note that there are two patients here, but they are part of the same case.) The doctors guessed that exposure to mercury might be to blame. When asked, the mother reported that 2 weeks before the younger child's symptoms started, mercury from a broken thermometer had been dropped on the carpet in the children's room. Mercury concentration in a urine sample taken on admission was 12.6 µg/l (slightly above the accepted normal value of 10 µg/l). Exposure was confirmed by a high mercury concentration in her hair. After 3 months' treatment the symptoms had disappeared totally and urinary mercury had fallen below the detection limit of 1 µg/l. This case called into question the normal values for mercury in children.

A case series is similar to a case report, except that a number of similar cases have been observed. For example, Shaker et al. (1997) described 15 patients examined for hypocalcaemia or skeletal disease, in whom the diagnosis of coeliac disease was subsequently made. In 11 of them gastrointestinal symptoms were absent or mild. They concluded that bone loss may be a sign of coeliac disease and this diagnosis should be considered. The design does not allow them to draw any conclusions about how often this might happen. To do that they would have to collect data systematically, using a cohort design (§3.7) for example.

3.6 Cross-sectional studies

One possible approach to the sampling problem is the cross-sectional study. We take some sample or whole narrowly defined population and observe them at one point in time. We get poor estimates of means and proportions in any more general population, but we can look at relationships within the sample. For example, in an epidemiological study, Banks et al. (1978) gave questionnaires to all first year secondary school boys in a random sample of schools in Derbyshire (§3.4). Among boys who had never smoked, 3% reported a cough first thing in the morning, compared to 19% of boys who said that they smoked one or more cigarettes per week. The sample was representative of boys of this age in Derbyshire who answer questionnaires, but we want our conclusions to apply at least to the United Kingdom, if not the developed world or the whole planet. We argue that although the prevalence of symptoms and the strength of the relationship may vary between populations, the existence of the relationship is unlikely to occur only in the population studied. We cannot conclude that smoking causes respiratory symptoms. Smoking and respiratory symptoms may not be directly related, but may both be related to some other factor. A factor related to both possible cause and possible effect is called confounding. For example, children whose parents smoke may be more likely to develop respiratory symptoms, because of passive inhalation of their parents' smoke, and also be more influenced to try smoking themselves. We can test this by looking separately at the relationship between the child's smoking and symptoms for those whose parents are not smokers, and for those whose parents are smokers. As Figure 3.1 shows, this relationship in fact persisted and there was no reason to suppose that a third causal factor was at work.

Fig. 3.1. Prevalence of self-reported morning cough in Derbyshire schoolboys, by their own and their parents' cigarette smoking (Bland et al. 1978)

Most diseases are not suited to this simple cross-sectional approach, because they are rare events. For example, lung cancer accounts for 9% of male deaths in the UK (OPCS, DH2 No. 7), and so is a very important disease. However, the proportion of people who are known to have the disease at any given time, the prevalence, is quite low. Most deaths from lung cancer take place after the age of 45, so we will consider a sample of men aged 45 and over. The average remaining life span of these men, in which they could contract lung cancer, will be about 30 years. The average time from diagnosis to death is about a year, so of those who will contract lung cancer only 1/30 will have been diagnosed when the sample is drawn. Only 9% of the sample will develop lung cancer anyway, so the proportion with the disease at any time is 1/30 × 9% = 0.3% or 3 per thousand. We would need a very large sample indeed to get a worthwhile number of lung cancer cases.
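The arithmetic can be laid out explicitly. In the sketch below the three inputs are the round figures quoted above; the target of 100 cases is a hypothetical number chosen only to show the scale of sample required.

```python
lifetime_risk = 0.09               # proportion of men who will develop lung cancer
years_at_risk = 30                 # average remaining life span at age 45 and over
years_diagnosis_to_death = 1       # average survival after diagnosis

# Proportion of the sample with diagnosed disease at the time of the survey.
prevalence = lifetime_risk * years_diagnosis_to_death / years_at_risk

target_cases = 100                 # hypothetical 'worthwhile' number of cases
sample_needed = target_cases / prevalence

print(f"prevalence = {prevalence:.3%}")
print(f"men needed for {target_cases} cases: about {sample_needed:,.0f}")
```

On these rough figures a cross-sectional sample of more than 30 000 men would be needed to find about 100 current cases, which is why other designs are preferred for rare diseases.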

Cross-sectional designs are used in clinical studies also. For example, Rodin et al. (1998) studied polycystic ovary disease (PCO) in a random sample of Asian women from the lists of local general practices and from a local translating service. We found that 52% of the sample had PCO, very high compared to that found in other UK samples. However, this would not provide a good estimate for Asian women in general, because there may be many differences between this sample, such as their regions of origin, and Asian women living elsewhere. We also found that PCO women had higher fasting glucose levels than non-PCO women. As this is a comparison within the sample, it seems plausible to conclude that among Asian women PCO tends to be associated with raised glucose. We cannot say whether PCO raises glucose or whether raised glucose increases the risk of PCO, because they are measured at the same time.

Table 3.1. Standardized death rates per year per 1000 men aged 35 or more in relation to most recent amount smoked, 53 months follow-up (Doll and Hill 1956)

                                     Death rate among
Cause of death         Non-smokers   Smokers   Men smoking a daily average weight of tobacco of
                                               1–14 g    15–24 g    25+ g
Lung cancer                0.07        0.90      0.47      0.86      1.66
Other cancer               2.04        2.02      2.01      1.56      2.63
Other respiratory          0.81        1.13      1.00      1.11      1.41
Coronary thrombosis        4.22        4.87      4.64      4.60      5.99
Other causes               6.11        6.89      6.82      6.38      7.19
All causes                13.25       15.78     14.92     14.49     18.84

3.7 Cohort studies

One way of getting round the problem of the small proportion of people with the disease of interest is the cohort study. We take a group of people, the cohort, and observe whether they have the suspected causal factor. We then follow them over time and observe whether they develop the disease. This is a prospective design, as we start with the possible cause and see whether this leads to the disease in the future. It is also longitudinal, meaning that subjects are studied at more than one time. A cohort study usually takes a long time, as we must wait for the future event to occur. It involves keeping track of large numbers of people, sometimes for many years, and often very large numbers must be included in the sample to ensure sufficient numbers will develop the disease to enable comparisons to be made between those with and without the factor.

A noted cohort study of mortality in relation to cigarette smoking was carried out by Doll and Hill (1956). They sent a questionnaire to all members of the medical profession in the UK, who were asked to give their name, address, age and details of current and past smoking habits. The deaths among this group were recorded. Only 60% of doctors cooperated, so in fact the cohort does not represent all doctors. The results for the first 53 months are shown in Table 3.1.

The cohort represents doctors willing to return questionnaires, not people as a whole. We cannot use the death rates as estimates for the whole population, or even for all doctors. What we can say is that, in this group, smokers were far more likely than non-smokers to die from lung cancer. It would be surprising if this relationship were only true for doctors, but we cannot definitely say that this would be the case for the whole population, because of the way the sample has been chosen.
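One simple way to express ‘far more likely’ is the ratio of the death rate among smokers to that among non-smokers. The sketch below takes the rates from Table 3.1; the use of a ratio as the summary, and the variable names, are my own choices for illustration.

```python
# Standardized death rates per 1000 men per year, from Table 3.1:
# (non-smokers, all smokers, heaviest smokers at 25+ g per day).
rates = {
    "Lung cancer": (0.07, 0.90, 1.66),
    "Coronary thrombosis": (4.22, 4.87, 5.99),
    "All causes": (13.25, 15.78, 18.84),
}

rate_ratios = {
    cause: {
        "smokers": smokers / non_smokers,
        "25+ g": heavy / non_smokers,
    }
    for cause, (non_smokers, smokers, heavy) in rates.items()
}

for cause, ratios in rate_ratios.items():
    print(f"{cause}: smokers {ratios['smokers']:.1f}, 25+ g {ratios['25+ g']:.1f}")
```

In this cohort the lung cancer death rate among smokers was about 13 times that among non-smokers, and higher still among the heaviest smokers, whereas for all causes together the ratio was about 1.2. These ratios describe the doctors who replied and carry the same limitations as the rates themselves.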

We also have the problem of other intervening variables. Doctors were not allocated to be smokers or non-smokers as in a clinical trial; they chose for themselves. The decision to begin smoking may be related to many things (social factors, personality factors, genetic factors) which may also be related to lung cancer. We must consider these alternative explanations very carefully before coming to any conclusion about the causes of cancer. In this study there were no data to test such hypotheses.

The same technique is used, usually on a smaller scale, in clinical studies. For example, Casey et al. (1996) studied 55 patients with very severe rheumatoid arthritis affecting the spine and the use of all four limbs. These patients were operated on in an attempt to improve their condition and their subsequent progress was monitored. We found that only 25% had a favourable outcome. We could not conclude from this that surgery would be worthwhile in 25% of such patients generally. Our patients might have been particularly ill or unusually fit, our surgeons might be the best or they might be (relatively speaking) ham-fisted butchers. However, we compared these results with other studies published in the medical literature, which were similar. These studies together gave a much better sample of such patients than any study alone could do (see §17.11, meta-analysis). We looked at which characteristics of the patients predicted a good or bad outcome and found that the area of cross-section of the spinal cord was the important predictor. We were much more confident of this finding, because it arose from studying relationships between variables within the sample. It seems quite plausible from this study alone that patients whose spinal cords have already atrophied are unlikely to benefit from surgery.

3.8 Case-control studies

Another solution to the problem of the small number of people with the disease of interest is the case-control study. In this we start with a group of people with the disease, the cases. We compare them to a second group without the disease, the controls. In an epidemiological study, we then find the exposure of each subject to the possible causative factor and see whether this differs between the two groups. Before their cohort study, Doll and Hill (1950) carried out a case-control study into the aetiology of lung cancer. Twenty London hospitals notified all patients admitted with carcinoma of the lung, the cases. An interviewer visited the hospital to interview the case, and, at the same time, selected a patient with diagnosis other than cancer, of the same sex and within the same 5 year age group as the case, in the same hospital at the same time, as a control. When more than one suitable patient was available, the patient chosen was the first in the ward list considered by the ward sister to be fit for interview. Table 3.2 shows the relationship between smoking and lung cancer for these patients. A smoker was anyone who had smoked as much as one cigarette a day for as much as one year. It appears that cases were more likely than controls to smoke cigarettes. Doll and Hill concluded that smoking is an important factor in the production of carcinoma of the lung.

The case-control study is an attractive method of investigation, because of its relative speed and cheapness compared to other approaches. However, there are difficulties in the selection of cases, the selection of controls, and obtaining the data. Because of these, case-control studies sometimes produce contradictory and conflicting results.

The first problem is the selection of cases. This usually receives little consideration beyond a definition of the type of disease and a statement about the confirmation of the diagnosis. This is understandable, as there is usually little else that the investigators can do about it. They start with the available set of patients. However, these patients do not exist in isolation. They are the result of some process which has led to them being diagnosed as having the disease and thus being available for study. For example, suppose we suspect that oral contraceptives might cause cancer of the breast. We have a group of patients diagnosed as having cancer of the breast. We must ask ourselves whether any of these were detected at a medical examination which took place because the woman was seeing a doctor to receive a prescription. If this were so, the risk factor (pill) would be associated with the detection of the disease rather than its cause. This is called ascertainment bias.

Table 3.2. Numbers of smokers and non-smokers among lung cancer patients and age and sex matched controls with diseases other than cancer (Doll and Hill 1950)

                          Non-smokers   Smokers        Total
Males
  Lung cancer patients    2 (0.3%)      647 (99.7%)    649
  Control patients        27 (4.2%)     622 (95.8%)    649
Females
  Lung cancer patients    19 (31.7%)    41 (68.3%)     60
  Control patients        32 (53.3%)    28 (46.7%)     60
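The percentages in Table 3.2 can be rechecked from the counts. The short Python sketch below is an illustration only and is not part of Doll and Hill's analysis; the dictionary layout and the function name `percent_smokers` are assumptions made for this example.

```python
# Counts from Table 3.2 (Doll and Hill 1950): (non-smokers, smokers).
table_3_2 = {
    ("Males", "Lung cancer patients"): (2, 647),
    ("Males", "Control patients"): (27, 622),
    ("Females", "Lung cancer patients"): (19, 41),
    ("Females", "Control patients"): (32, 28),
}


def percent_smokers(non_smokers, smokers):
    """Percentage of a group who were smokers, to one decimal place."""
    return round(100 * smokers / (non_smokers + smokers), 1)


for (sex, group), (non_smokers, smokers) in table_3_2.items():
    pct = percent_smokers(non_smokers, smokers)
    print(f"{sex:8s}{group:22s}{pct:5.1f}% smokers")
```

In both sexes the percentage of smokers is higher among the cases than among the matched controls, which is the comparison the text describes.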

Far more difficulty is caused by the selection of controls. We want a group of people who do not have the disease in question, but who are otherwise comparable to our cases. We must first decide the population from which they are to be drawn. There are two main sources of controls: the general population and patients with other diseases. The latter may be preferred because of its accessibility. Now these two populations are clearly not the same. For example, Doll and Hill (1950) gave the current smoking habits of 1014 men and women with diseases other than cancer, 14% of whom were currently non-smokers. They commented that there was no difference between smoking in the disease groups: respiratory disease, cardiovascular disease, gastro-intestinal disease and others. However, in the general population the percentage of current non-smokers was 18% for men and 59% for women (Todd 1972). The smoking rate in the patient group as a whole was high. Since their report, of course, smoking has been associated with diseases in each group. Smokers get more disease and are more likely to be in hospital than non-smokers.

Intuitively, the comparison we want to make is between people with the disease and healthy people, not people with a lot of other diseases. We want to find out how to prevent disease, not how to choose one disease or another! However, it is much easier to use hospital patients as controls. There may then be a bias because the factor of interest may be associated with other diseases. Suppose we want to investigate the relationship between a disease and cigarette smoking using hospital controls. Should we exclude patients with lung cancer from the control group? If we include them, our controls may have more smokers than the general population, but if we exclude them we may have fewer. This problem is usually resolved by choosing specific patient groups, such as fracture cases, whose illness is thought to be unrelated to the factor being investigated. In case-control studies using cancer registries, controls are sometimes people with other forms of cancer. Sometimes more than one control group is used.

Having defined the population we must choose the sample. There are many factors which affect exposure to risk factors, such as age and sex. The most straightforward way is to take a large random sample of the control population, ascertain all the relevant characteristics, and then adjust for differences during the analysis, using methods described in Chapter 17. The alternative is to try to match a control to each case, so that for each case there is a control of the same age, sex, etc. Having done this, then we can compare our cases and controls knowing that the effects of these intervening variables are automatically adjusted for. If we wish to exclude a case we must exclude its control, too, or the groups will no longer be comparable. We can have more than one control per case, but the analysis becomes complicated.

Matching on some variables does not ensure comparability on all. Indeed, if it did there would be no study. Doll and Hill matched on age, sex and hospital. They recorded area of residence and found that 25% of their cases were from outside London, compared to 14% of controls. If we want to see whether this influences the smoking and lung cancer relationship we must make a statistical adjustment anyway. What should we match for? The more we match for, the fewer intervening variables there are to worry about. On the other hand, it becomes more and more difficult to find matches. Even matching on age and sex, Doll and Hill could not always find a control in the same hospital, and had to look elsewhere. Matching for more than age and sex can be very difficult.

Having decided on the matching variables we then find in the control population all the possible matches. If there are more matches than we need, we should choose the number required at random. Other methods, such as that used by Doll and Hill who allowed the ward sister to choose, have obvious problems of potential bias. If no suitable control can be found, we can do two things. We can widen the matching criteria, say age to within ten years rather than five, or we can exclude the case.
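The procedure just described, listing every possible match and then choosing at random, might be sketched in Python as follows. This is only an illustration under stated assumptions: the records, the identifiers and the function name `choose_control` are all hypothetical, and matching is on sex and 5 year age group only.

```python
import random


def choose_control(case, candidates, rng):
    """Pick one control at random from those matching the case.

    Matching is on sex and 5 year age group. Returns None when no
    candidate matches, so that the caller can either widen the
    matching criteria or exclude the case.
    """
    matches = [
        c for c in candidates
        if c["sex"] == case["sex"] and c["age"] // 5 == case["age"] // 5
    ]
    return rng.choice(matches) if matches else None


rng = random.Random(2000)  # fixed seed so that the sketch is reproducible

# Hypothetical records, invented for this example.
case = {"id": "case-1", "sex": "M", "age": 57}
candidates = [
    {"id": "ctl-1", "sex": "M", "age": 55},
    {"id": "ctl-2", "sex": "M", "age": 59},
    {"id": "ctl-3", "sex": "F", "age": 56},
    {"id": "ctl-4", "sex": "M", "age": 61},
]

control = choose_control(case, candidates, rng)
print(control["id"])
```

Only `ctl-1` and `ctl-2` fall in the same sex and age group as the case, so one of these two is chosen, with neither the investigator nor the ward staff influencing which.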

There are difficulties in interpreting the results of case-control studies. One is that the case-control design is often retrospective, that is, we are starting with the present disease state, e.g. lung cancer, and relating it to the past, e.g. history of smoking. We may have to rely on the unreliable memories of our subjects. This may lead both to random errors among cases and controls and systematic recall bias, where one group, usually the cases, recalls events better than the other. For example, the mother of a handicapped child may be more likely than the mother of a normal child to remember events in pregnancy which may have caused damage. There is a problem of assessment bias in such studies, just as in clinical trials (§2.9). Interviewers will very often know whether the interviewee is a case or control and this may well affect the way questions are asked. These and other considerations make case-control studies extremely difficult to interpret. The evidence from such studies can be useful, but data from other types of investigation must be considered, too, before any firm conclusions are drawn.

The case-control design is used clinically to investigate the natural history of disease by comparing patients with healthy subjects or patients with another disease. For example, Kiely et al. (1995) were interested in lymphatic function in inflammatory arthritis. We compared arthritis patients (the cases) with healthy volunteers (the controls). Lymphatic flow was measured in the arms of these subjects and the groups compared. We found that lymphatic drainage was less in the cases than in the control group, but this was only so for arms which were swollen (oedematous).

3.9* Questionnaire bias in observational studies

In observational studies, much data may have to be supplied by the subjects themselves. The way in which a question is asked may influence the reply. Sometimes the bias in a question is obvious. Compare these:

(a) Do you think people should be free to provide the best medical care possible for themselves and their families, free of interference from a State bureaucracy?

(b) Should the wealthy be able to buy a place at the head of the queue for medical care, pushing aside those with greater need, or should medical care be shared solely on the basis of need for it?

Version (a) expects the answer yes, version (b) expects the answer no. We would hope not to be misled by such blatant manipulation, but the effects of question wording can be much more subtle than this. Hedges (1978) reports several examples of the effects of varying the wording of questions. He asked two groups of about 800 subjects one of the following:

(a) Do you feel you take enough care of your health, or not?

(b) Do you feel you take enough care of your health, or do you think you could take more care of your health?

In reply to question (a), 82% said that they took enough care, whereas only 68% said this in reply to question (b). Even more dramatic was the difference between this pair:

(a) Do you think a person of your age can do anything to prevent ill-health in the future or not?

(b) Do you think a person of your age can do anything to prevent ill-health in the future, or is it largely a matter of chance?

Not only was there a difference in the percentage who replied that they could do something, but as Table 3.3 shows this answer was related to age for version (a) but not for version (b). Here version (b) is ambiguous, as it is quite possible to think that health is largely a matter of chance but that there is still something one can do about it. Only if it is totally a matter of chance is there nothing one can do.

Table 3.3. Replies to two similar questions about ill health, by age (Hedges 1978)

                          Age (years)
                          16–34   35–54   55+    Total
Can do something (a)      75%     64%     56%    65%
Can do something (b)      45%     49%     50%    49%

Sometimes the respondents may interpret the question in a different way from the questioner. For example, when asked whether they usually coughed first thing in the morning, 3.7% of the Derbyshire schoolchildren replied that they did. When their parents were asked about the child's symptoms 2.4% replied positively, not a dramatic difference. Yet when asked about cough at other times in the day or at night 24.8% of children said yes, compared to only 4.5% of their parents (Bland et al. 1979). These symptoms all showed relationships to the child's smoking and other potentially causal variables, and also to one another. We are forced to admit that we are measuring something, but that we are not sure what!

Another possibility is that respondents may not understand the question at all, especially when it includes medical terms. In an earlier study of cigarette smoking by children, we found that 85% of a sample agreed that smoking caused cancer, but that 41% agreed that smoking was not harmful (Bewley et al. 1974). There are at least two possible explanations for this: being asked to agree with the negative statement ‘smoking is not harmful’ may have confused the children, or they may not see cancer as harmful. We have evidence for both of these possibilities. In a repeat study in Kent we asked a further sample of children whether they agreed that smoking caused cancer and that ‘smoking is bad for your health’ (Bewley and Bland 1976). In this study 90% agreed that smoking causes cancer and 91% agreed that smoking is bad for your health. In another study (Bland et al. 1975), we asked children what was meant by the term ‘lung cancer’. Only 13% seemed to us to understand and 32% clearly did not, often saying ‘I don't know’. They nearly all knew that lung cancer was caused by smoking, however.

The setting in which a question is asked may also influence replies. Opinion pollsters International Communications and Market Research conducted a poll in which half the subjects were questioned by interviewers about their voting preference and half were given a secret ballot (McKie 1992). By each method 33% chose ‘Labour’, but 28% chose ‘Conservative’ at interview and 7% would not say, whereas 35% chose ‘Conservative’ by secret ballot and only 1% would not say. Hence the secret method produced a Conservative majority, as at the then recent general election, and the open interview a Labour majority. For another example, Sibbald et al. (1994) compared two random samples of GPs. One sample were approached by post and then by telephone if they did not reply after two reminders, and the other were contacted directly by telephone. Of the predominantly postal sample, 19% reported that they provided counselling themselves, compared to 36% of the telephone sample, and 14% reported that their health visitor provided counselling compared to 30% of the telephone group. Thus the method of asking the question influenced the answer. One must be very cautious when interpreting questionnaire replies.

Fig. 3.2. Volatile substance abuse mortality and unemployment in the counties of Great Britain (The area of the circle is proportional to the population of the county, so reflects the importance of the observation)

Often the easiest and best method, if not the only method, of obtaining data about people is to ask them. When we do it, we must be very careful to ensure that questions are straightforward, unambiguous and in language the respondents will understand. If we do not do this then disaster is likely to follow.

3.10* Ecological studies

Ecology is the study of living things in relation to their environment. In epidemiology, an ecological study is one where the disease is studied in relation to characteristics of the communities in which people live. For example, we might take the death rates from heart disease in several countries and see whether this is related to the national annual consumption of animal fat per head.

Esmail et al. (1977) carried out an ecological study of factors related to deaths from volatile substance abuse (VSA, also called solvent abuse, inhalant abuse or glue sniffing). The observational units were the administrative counties of Great Britain. The deaths were obtained from a national register of deaths held at St. George's and the age and sex distribution in each county from national census data. These were used to calculate an index of mortality adjusted for age, the standardized mortality ratio (§16.3). Indicators of social deprivation were also obtained from census data. Figure 3.2 shows the relationship between VSA mortality and unemployment in the counties. Clearly, there is a relationship. The mortality is higher in counties where unemployment is high.

Relationships found in ecological studies are indirect. We must not conclude that there is a relationship at the level of the person. This is the ecological fallacy. For example, we cannot conclude from Figure 3.2 that unemployed people are at a greater risk of dying from VSA than the employed. The peak age for VSA death is among schoolchildren, who are not included in the unemployment figures. It is not the unemployed people who are dying. Unemployment is just one indicator of social deprivation, and VSA deaths are associated with many of them.

Ecological studies can be useful to generate hypotheses. For example, the observation that hypertension is common in countries where there is a high intake of dietary salt might lead us to investigate the salt consumption and blood pressure of individual people, and a relationship there might in turn lead to dietary interventions. These leads often turn out to be false, however, and the ecological study alone is never enough.


7. In statistical terms, a population:

(a) consists only of people;

(b) may be finite;

(c) may be infinite;

(d) can be any set of things in which we are interested;

(e) may consist of things which do not actually exist.

8. A one day census of in-patients in a psychiatric hospital could:

(a) give good information about the patients in that hospital at that time;

(b) give reliable estimates of seasonal factors in admissions;

(c) enable us to draw conclusions about the psychiatric hospitals of Britain;

(d) enable us to estimate the distribution of different diagnoses in mental illness in the local area;

(e) tell us how many patients there were in the hospital.

9. In simple random sampling:

(a) each member of the population has an equal chance of being chosen;

(b) adjacent members of the population must not be chosen;

(c) likely errors cannot be estimated;

(d) each possible sample of the given size has an equal chance of being chosen;

(e) the decision to include a subject in the sample depends only on the subject's own characteristics.

10. Advantages of random sampling include:

(a) it can be applied to any population;

(b) likely errors can be estimated;

(c) it is not biassed;

(d) it is easy to do;

(e) the sample can be referred to a known population.

11. In a case-control study to investigate whether eczema in children is related to cigarette smoking by their parents:

(a) parents would be asked about their smoking habits at the child's birth and the child observed for subsequent development of eczema;

(b) children of a group of parents who smoke would be compared to children of a group of parents who are non-smokers;

(c) parents would be asked to stop smoking to see whether their children's eczema was reduced;

(d) the smoking habits of the parents of a group of children with eczema would be compared to the smoking habits of the parents of a group of children without eczema;

(e) parents would be randomly allocated to smoking or non-smoking groups.

12. To examine the relationship between alcohol consumption and cancer of the oesophagus, feasible studies include:

(a) questionnaire survey of a random sample from the electoral roll;

(b) comparison of history of alcohol consumption between a group of oesophageal cancer patients and a group of healthy controls matched for age and sex;

(c) comparison of current oesophageal cancer rates in a group of alcoholics and a group of teetotallers;

(d) comparison by questionnaire of history of alcohol consumption between a group of oesophageal cancer patients and a random sample from the electoral roll in the surrounding district;

(e) comparison of death rates due to cancer of the oesophagus in a large sample of subjects whose alcohol consumption has been determined in the past.

13.* In a study of hospital patients, 20 hospitals were chosen at random from a list of all hospitals. Within each hospital, 10% of patients were chosen at random:

(a) the sample of patients is a random sample;

(b) all hospitals had an equal chance of being chosen;

(c) all hospital patients had an equal chance of being chosen at the outset;

(d) the sample could be used to make inferences about all hospital patients at that time;

(e) all possible samples of patients had an equal chance of being chosen.

Table 3.4. Doorstep delivery of milk bottles and exposure to bird attack

                                                             No. (%) exposed
                                                             Cases       Controls
Doorstep milk delivery                                       29 (91%)    47 (73%)
Previous milk bottle attack by birds                         26 (81%)    25 (39%)
Milk bottle attack in week before illness                    26 (81%)    5 (8%)
Protective measures taken                                    6 (19%)     14 (22%)
Handling attacked milk bottle in week before illness         17 (53%)    5 (8%)
Drinking milk from attacked bottle in week before illness    25 (80%)    5 (8%)

Table 3.5. Frequency of bird attacks on milk bottles

Number of days of week when attacks took place    Cases   Controls
0                                                 3       42
1–3                                               11      3
4–5                                               5       1
6–7                                               10      1

3E Exercise: Campylobacter jejuni infection

Campylobacter jejuni is a bacterium causing gastro-intestinal illness, spread by the faecal-oral route. It infects many species, and human infection has been recorded from handling pet dogs and cats, handling and eating chicken and other meats, and via milk and water supplies. Treatment is by antibiotics.

In May, 1990, there was a fourfold rise in the isolation rate of C. jejuni in the Ogwr District, Mid-Glamorgan. The mother of a young boy admitted to hospital with febrile convulsions resulting from C. jejuni infection reported that her milk bottles had been attacked by birds during the week before her son's illness, a phenomenon which had been associated with campylobacter infection in another area. This observation, with the rise in C. jejuni, prompted a case-control study (Southern et al. 1990).

A ‘case’ was defined as a person with laboratory confirmed C. jejuni infection with onset between May 1 and June 1 1990, resident in an area with Bridgend at its centre. Cases were excluded if they had spent one or more nights away from this area in the week before onset, if they could have acquired the infection elsewhere, or were members of a household in which there had been a case of diarrhea in the preceding four weeks.

The controls were selected from the register of the general practice of the case, or in a few instances from practices serving the same area. Two controls were selected for each case, matched for sex, age (within 5 years), and area of residence.

Cases and controls were interviewed by means of a standard questionnaire at home or by telephone. Cases were asked about their exposure to various factors in the week before the onset of illness. Controls were asked the same questions about the corresponding week for their matched cases. If a control or member of his or her family had had diarrhea lasting more than 3 days in the week before or during the illness of the respective case, or had spent any nights during that week away from home, another control was found. Evidence of bird attack included the pecking or tearing off of milk bottle tops. A history of bird attack was defined as a previous attack at that house.

Fifty-five people with Campylobacter infection resident in the area were reported during the study period. Of these, 19 were excluded and 4 could not be interviewed, leaving 32 cases and 64 matched controls. There was no difference in milk consumption between cases and controls, but more cases than controls reported doorstep delivery of bottled milk, previous milk bottle attack by birds, milk bottle attack by birds in the index week, and handling or drinking milk from an attacked bottle (Table 3.4). Cases reported bird attacks more frequently than controls (Table 3.5). Controls were more likely to have protected their milk bottles from attack or to have discarded milk from attacked bottles. Almost all subjects whose milk bottles had been attacked mentioned that magpies and jackdaws were common in their area, though only 3 had actually witnessed attacks and none reported bird droppings near bottles.

None of the other factors investigated (handling raw chicken; eating chicken bought raw; eating chicken, beef or ham bought cooked; eating out; attending barbecue; cat or dog in the house; contact with other cats or dogs; and contact with farm animals) were significantly more common in controls than cases. Bottle attacks seemed to have ceased when the study was carried out, and no milk could be obtained for analysis.

1. What problems were there in selecting cases?

2. What problems were there in the selection of controls?

3. Are there any problems about data collection?

4. From the above, do you think there is convincing evidence that bird attacks on milk bottles cause campylobacter infection?

5. What further studies might be carried out?




4

Summarizing data

4.1 Types of data

In Chapters 2 and 3 we looked at ways in which data are collected. In this chapter we shall see how data can be summarized to help to reveal information they contain. We do this by calculating numbers from the data which extract the important material. These numbers are called statistics. A statistic is anything calculated from the data alone.

It is often useful to distinguish between three types of data: qualitative, discrete quantitative and continuous quantitative. Qualitative data arise when individuals may fall into separate classes. These classes may have no numerical relationship with one another at all, e.g. sex: male, female; types of dwelling: house, maisonette, flat, lodgings; eye colour: brown, grey, blue, green, etc. Quantitative data are numerical, arising from counts or measurements. If the values of the measurements are integers (whole numbers), like the number of people in a household, or number of teeth which have been filled, those data are said to be discrete. If the values of the measurements can take any number in a range, such as height or weight, the data are said to be continuous. In practice there is overlap between these categories. Most continuous data are limited by the accuracy with which measurements can be made. Human height, for example, is difficult to measure more accurately than to the nearest millimetre and is more usually measured to the nearest centimetre. So only a finite set of possible measurements is actually available, although the quantity ‘height’ can take an infinite number of possible values, and the measured height is really discrete. However, the methods described below for continuous data will be seen to be those appropriate for its analysis.

We shall refer to qualities or quantities such as sex, height, age, etc. as variables, because they vary from one member of a sample to another. A qualitative variable is also termed a categorical variable or an attribute. We shall use these terms interchangeably.

4.2 Frequency distributions

When data are purely qualitative, the simplest way to deal with them is to count the number of cases in each category. For example, in the analysis of the census of a psychiatric hospital population (§3.2), one of the variables of interest was the patient's principal diagnosis (Bewley et al. 1975). To summarize these data, we count the number of patients having each diagnosis. The results are shown in Table 4.1. The count of individuals having a particular quality is called the frequency of that quality. For example, the frequency of schizophrenia is 474. The proportion of individuals having the quality is called the relative frequency or proportional frequency. The relative frequency of schizophrenia is 474/1467 = 0.32 or 32%. The set of frequencies of all the possible categories is called the frequency distribution of the variable.

Table 4.1. Principal diagnosis of patients in Tooting Bec Hospital

Diagnosis                 Number of patients
Schizophrenia             474
Affective disorders       277
Organic brain syndrome    405
Subnormality              58
Alcoholism                57
Other and not known       196
Total                     1467
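As a small illustration of the definitions of frequency and relative frequency, the Python sketch below recomputes the relative frequencies for Table 4.1. The variable names are assumptions made for this example; only the counts come from the table.

```python
# Frequencies from Table 4.1.
frequency = {
    "Schizophrenia": 474,
    "Affective disorders": 277,
    "Organic brain syndrome": 405,
    "Subnormality": 58,
    "Alcoholism": 57,
    "Other and not known": 196,
}

# Relative frequency = frequency of the category / total number of individuals.
total = sum(frequency.values())
relative_frequency = {d: n / total for d, n in frequency.items()}

for diagnosis, n in frequency.items():
    print(f"{diagnosis:24s}{n:5d}{relative_frequency[diagnosis]:7.2f}")
print(f"{'Total':24s}{total:5d}")
```

The relative frequencies necessarily add to 1.00, apart from rounding error when they are printed to two decimal places.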

Table 4.2. Likelihood of discharge of patients in Tooting Bec Hospital

Discharge   Frequency   Relative frequency   Cumulative frequency   Relative cumulative frequency
Unlikely    871         0.59                 871                    0.59
Possible    339         0.23                 1210                   0.82
Likely      257         0.18                 1467                   1.00
Total       1467        1.00                 1467                   1.00

In this census we assessed whether patients were ‘likely to be discharged’, ‘possibly to be discharged’ or ‘unlikely to be discharged’. The frequencies of these categories are shown in Table 4.2. Likelihood of discharge is a qualitative variable, like diagnosis, but the categories are ordered. This enables us to use another set of summary statistics, the cumulative frequencies. The cumulative frequency for a value of a variable is the number of individuals with values less than or equal to that value. Thus, if we order likelihood of discharge from ‘unlikely’, through ‘possibly’ to ‘likely’ the cumulative frequencies are 871, 1210 (= 871 + 339) and 1467. The relative cumulative frequency for a value is the proportion of individuals in the sample with values less than or equal to that value. For the example they are 0.59 (= 871/1467), 0.82 and 1.00. Thus we can see that the proportion of patients for whom discharge was not thought likely was 0.82 or 82%.
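The cumulative and relative cumulative frequencies of Table 4.2 are running totals, which can be sketched in Python as below. This is an illustration only; the variable names are assumptions, and `itertools.accumulate` from the standard library is used for the running sum.

```python
from itertools import accumulate

categories = ["Unlikely", "Possible", "Likely"]  # ordered categories
frequency = [871, 339, 257]                      # frequencies from Table 4.2

total = sum(frequency)
# Cumulative frequency: number of individuals at or below each category.
cumulative = list(accumulate(frequency))
# Relative cumulative frequency: cumulative frequency as a proportion of the total.
relative_cumulative = [round(c / total, 2) for c in cumulative]

for row in zip(categories, frequency, cumulative, relative_cumulative):
    print(*row)
```

The calculation only makes sense because the categories are ordered; for an unordered variable such as diagnosis the running total would depend on an arbitrary listing order.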

As we have noted, likelihood of discharge is a qualitative variable, with ordered categories. Sometimes this ordering is taken into account in analysis, sometimes not. Although the categories are ordered these are not quantitative data. There is no sense in which the difference between ‘likely’ and ‘possibly’ is the same as the difference between ‘possibly’ and ‘unlikely’.

Table 4.3. Parity of 125 women attending antenatal clinics at St. George's Hospital

Parity   Frequency   Relative frequency (per cent)   Cumulative frequency   Relative cumulative frequency (per cent)
0        59          47.2                            59                     47.2
1        44          35.2                            103                    82.4
2        14          11.2                            117                    93.6
3        3           2.4                             120                    96.0
4        4           3.2                             124                    99.2
5        1           0.8                             125                    100.0
Total    125         100.0                           125                    100.0

Table 4.4. FEV1 (litres) of 57 male medical students

2.85 3.19 3.50 3.69 3.90 4.14 4.32 4.50

2.85 3.20 3.54 3.70 3.96 4.16 4.44 4.56

2.98 3.30 3.54 3.70 4.05 4.20 4.47 4.68

3.04 3.39 3.57 3.75 4.08 4.20 4.47 4.70

3.10 3.42 3.60 3.78 4.10 4.30 4.47 4.71

3.10 3.48 3.60 3.83 4.14 4.30 4.50 4.78

Table 4.3 shows the frequency distribution of a quantitative variable, parity. This shows the number of previous pregnancies for a sample of women booking for delivery at St. George's Hospital. Only certain values are possible, as the number of pregnancies must be an integer, so this variable is discrete. The frequency of each separate value is given.

Table 4.4 shows a continuous variable, forced expiratory volume in one second (FEV1) in a sample of male medical students. As most of the values occur only once, to get a useful frequency distribution we need to divide the FEV1 scale into class intervals, e.g. from 3.0 to 3.5, from 3.5 to 4.0, and so on, and count the number of individuals with FEV1s in each class interval. The class intervals should not overlap, so we must decide which interval contains the boundary point to avoid it being counted twice. It is usual to put the lower boundary of an interval into that interval and the higher boundary into the next interval. Thus the interval starting at 3.0 and ending at 3.5 contains 3.0 but not 3.5. We can write this as ‘3.0-’ or ‘3.0-3.5-’ or ‘3.0-3.499’. Including the lower boundary in the class interval has this advantage. Most distributions of measurements have a zero point below which we cannot go, whereas few have an exact upper limit. If we were to include the upper boundary in the interval instead of the lower, we would have two possible ways of dealing with zero. It could be left as an isolated point, not in an interval. Alternatively, it could be included in the lowest interval, which would then not be exactly comparable to the others as it would include both boundaries while all the other intervals only included the upper.

If we take a starting point of 2.5 and an interval of 0.5 we get the frequency distribution shown in Table 4.5. Note that this is not unique. If we take a starting point of 2.4 and an interval of 0.2 we get a different set of frequencies.
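The rule that each class interval includes its lower boundary but not its upper one, and the dependence of the frequencies on the starting point and interval width, can be sketched in Python as follows. The function name `class_interval_counts` is an assumption made for this example, and only a few of the values from Table 4.4 are used, so the counts are not those of Table 4.5.

```python
from collections import Counter


def class_interval_counts(values, start, width):
    """Count values in intervals [start + k*width, start + (k+1)*width).

    The arithmetic is done in hundredths of a litre (integers), so that
    boundary values such as 3.50 are assigned exactly to the interval
    which they begin.
    """
    start_h, width_h = round(start * 100), round(width * 100)
    counts = Counter()
    for v in values:
        k = (round(v * 100) - start_h) // width_h
        counts[(start_h + k * width_h) / 100] += 1
    return dict(sorted(counts.items()))


# The first column of Table 4.4 plus two boundary values from the table.
fev1 = [2.85, 2.85, 2.98, 3.04, 3.10, 3.10, 3.50, 4.50]

print(class_interval_counts(fev1, start=2.5, width=0.5))
print(class_interval_counts(fev1, start=2.4, width=0.2))
```

Each key in the result is the lower boundary of an interval. With the first choice 3.50 is counted in the interval starting at 3.5, not in the one starting at 3.0; with the second choice the same observations fall into intervals starting at 2.8, 3.0, 3.4 and 4.4.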

Table 4.5. Frequency distribution of FEV1 in 57 male medical students

FEV1 Frequency Relative frequency (per cent)

2.0 0 0.0

2.5 3 5.3

3.0 9 15.8

3.5 14 24.6

4.0 15 26.3

4.5 10 17.5

5.0 6 10.5

5.5 0 0.0

Total 57 100.0

Table 4.6. Tally system for finding the frequency distribution of FEV1

FEV1 Frequency

2.0 0

2.5 /// 3

3.0 ///////// 9

3.5 ////////////// 14

4.0 /////////////// 15

4.5 ////////// 10

5.0 ////// 6

5.5 0

Total 57

The frequency distribution can be calculated easily and accurately using a computer. Manual calculation is not so easy and must be done carefully and systematically. One way recommended by many texts (e.g. Hill 1977) is to set up a tally system, as in Table 4.6. We go through the data and for each individual make a tally mark by the appropriate interval. We then count up the number in each interval. In practice this is very difficult to do accurately, and it needs to be checked and double-checked. Hill (1977) recommends writing each number on a card and dealing the cards into piles corresponding to the intervals. It is then easy to check that each pile contains only those cases in that interval and count them. This is undoubtedly superior to the tally system. Another method is to order the observations from lowest to highest before marking the interval boundaries and counting, or to use the stem and leaf plot described below. Personally, I always use a computer.

4.3 Histograms and other frequency graphs

Graphical methods are very useful for examining frequency distributions. Figure 4.1 shows a graph of the cumulative frequency distribution for the FEV1 data. This is called a step function. We can smooth this by joining successive points where the cumulative frequency changes by straight lines, to give a cumulative frequency polygon. Figure 4.2 shows this for the cumulative relative frequency distribution of FEV1. This plot is very useful for calculating some of the summary statistics referred to in §4.5.

Fig. 4.1. Cumulative frequency distribution of FEV1 in a sample of male medical students

Fig. 4.2. Cumulative frequency polygon of FEV1

The most common way of depicting a frequency distribution is by a histogram. This is a diagram where the class intervals are on an axis and rectangles with heights or areas proportional to the frequencies are erected on them. Figure 4.3 shows the histogram for the FEV1 distribution in Table 4.5. The vertical scale shows frequency, the number of observations in each interval.

Sometimes we want to show the distribution of a discrete variable (e.g. Table 4.3) as a histogram. If our intervals are 0–1-, 1–2-, etc., the actual observations will all be at one end of the interval. Making the starting point of the interval a fraction rather than an integer gives a slightly better picture (Figure 4.5). This can also be helpful for continuous data when there is a lot of digit preference (§15.2). For example, where most observations are recorded as integers or as something point five, starting the interval at something point seven five can give a more accurate picture.

Fig. 4.3. Histogram of FEV1: frequency scale

Fig. 4.4. Histogram of FEV1: frequency per unit FEV1 or frequency density scale

Fig. 4.5. Histograms of parity (Table 4.3) using integer and fractional cut-off points for the intervals

Table 4.7. Distribution of age in people suffering accidents in the home (Whittington 1977)

Age group   Relative frequency (per cent)   Relative frequency per year (per cent)
0–4         25.3                            5.06
5–14        18.9                            1.89
15–44       30.3                            1.01
45–64       13.6                            0.68
65+         11.7                            0.33

Fig. 4.6. Histograms of age distribution of home accident victims, using the relative frequency scale and the relative frequency density scale

Figure 4.4 shows a histogram for the same distribution as Figure 4.3, with frequency per unit FEV1 (or frequency density) shown on the vertical axis. The distributions appear identical and we may well wonder whether it matters which method we choose. We see that it does matter when we consider a frequency distribution with unequal intervals, as in Table 4.7. If we plot the histogram using the heights of the rectangles to represent relative frequency in the interval we get the left-hand histogram in Figure 4.6, whereas if we use the relative frequency per year we get the right-hand histogram. These histograms tell different stories. The left-hand histogram in Figure 4.6 suggests that the most common age for accident victims is between 15 and 44 years, whereas the right-hand histogram suggests it is between 0 and 4. The right-hand histogram is correct, the left-hand histogram being distorted by the unequal class intervals. It is therefore preferable in general to use the frequency per unit (frequency density) rather than per class interval when plotting a histogram. The frequency for a particular interval is then represented by the area of the rectangle on that interval. Only when the class intervals are all equal can the frequency for the class interval be represented by the height of the rectangle. The computer programmer finds equal intervals much easier, however, and histograms with unequal intervals are now uncommon.
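The frequency density calculation behind Table 4.7 is simply the relative frequency divided by the width of the class interval. The Python sketch below illustrates it; the upper limit of 100 years for the open-ended 65+ group is an assumption made here, chosen because it reproduces the tabulated figure of 0.33, and the variable names are likewise invented for the example.

```python
# Age groups from Table 4.7:
# (label, lower limit, upper limit, relative frequency in per cent).
# The upper limit of 100 for the open-ended 65+ group is an assumption,
# chosen because it reproduces the tabulated 0.33 per cent per year.
age_groups = [
    ("0-4", 0, 5, 25.3),
    ("5-14", 5, 15, 18.9),
    ("15-44", 15, 45, 30.3),
    ("45-64", 45, 65, 13.6),
    ("65+", 65, 100, 11.7),
]

# Frequency density = relative frequency / width of the interval in years.
density = {
    label: round(rel_freq / (upper - lower), 2)
    for label, lower, upper, rel_freq in age_groups
}

for label, per_year in density.items():
    print(f"{label:6s}{per_year:5.2f} per cent per year")
```

Although the 15–44 group has the largest relative frequency, it is spread over 30 years of age, so its density is only about a fifth of that of the 0–4 group.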

Fig. 4.7. Frequency polygons of FEV1 and PEF in medical students

Fig. 4.8. Stem and leaf plot for the FEV1 data, rounded down to one decimal place

Rather than a histogram consisting of vertical rectangles, we can plot a frequency polygon instead. To do this we join the centre points of the tops of the rectangles, then omit the rectangles (Figure 4.7(a)). Where a cell of the histogram is empty we join the line to the centre of the cell at the horizontal axis (Figure 4.7(b), males). This can be useful if we want to show two or more frequency distributions on the same graph, as in Figure 4.7(b). When we do this, the comparison is easier if we use relative frequency or relative frequency density rather than frequency. This makes it easier to compare distributions with different numbers of subjects.

A different version of the histogram has been developed by Tukey (1977), the stem and leaf plot (Figure 4.8). The rectangles are replaced by the numbers themselves. The ‘stem’ is the first digit or digits of the number and the ‘leaf’ the trailing digit. The first row of Figure 4.8 represents the numbers 2.8, 2.8, and 2.9, which in the data are 2.85, 2.85, and 2.98. The plot provides a good summary of data structure while at the same time we can see other characteristics such as a tendency to prefer some trailing digits to others, called digit preference (§15.1). It is also easy to construct and much less prone to error than the tally method of finding a frequency distribution.
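The construction of a stem and leaf plot can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Tukey's own procedure: the function name `stem_and_leaf` is invented here, values are rounded down to one decimal place as in Figure 4.8, and only the first column of Table 4.4 is used.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: leaves}, with values rounded down to one decimal place."""
    plot = defaultdict(str)
    for v in sorted(values):
        tenths = round(v * 100) // 10     # 2.85 -> 285 -> 28, i.e. 2.8
        stem, leaf = divmod(tenths, 10)   # 28 -> stem 2, leaf 8
        plot[stem] += str(leaf)
    return dict(plot)


# The first column of Table 4.4 only, to keep the sketch short.
fev1 = [2.85, 2.85, 2.98, 3.04, 3.10, 3.10]

for stem, leaves in stem_and_leaf(fev1).items():
    print(f"{stem} | {leaves}")
```

The first row printed is `2 | 889`, corresponding to 2.8, 2.8 and 2.9, which agrees with the description of the first row of Figure 4.8.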

4.4 Shapes of frequency distribution

Figure 4.3 shows a frequency distribution of a shape often seen in medical data. The distribution is roughly symmetrical about its central value and has frequency concentrated about one central point. The most common value is called the mode of the distribution and Figure 4.3 has one such point. It is unimodal. Figure 4.9 shows a very different shape. Here there are two distinct modes, one near 5 and the other near 8.5. This distribution is bimodal. We must be careful to distinguish between the unevenness in the histogram which results from using a small sample to represent a large population and that which results from genuine bimodality in the data. The trough between 6 and 7 in Figure 4.9 is very marked and might represent a genuine bimodality. In this case we have children, some of whom have a condition which raises the cholesterol level and some of whom do not. We actually have two separate populations represented with some overlap between them. However, almost all distributions encountered in medical statistics are unimodal.

Fig. 4.9. Serum cholesterol in children from kinships with familial hypercholesterolaemia (Leonard et al. 1977)

Fig. 4.10. Serum triglyceride in cord blood from 282 babies (Table 4.8)

Figure 4.10 differs from Figure 4.3 in a different way. The distribution of serum triglyceride is skew, that is, the distance from the central value to the extreme is much greater on one side than it is on the other. The parts of the histogram near the extremes are called the tails of the distribution. If the tails are equal the distribution is symmetrical, as in Figure 4.3. If the tail on the right is longer than the tail on the left as in Figure 4.10, the distribution is skew to the right or positively skew. If the tail on the left is longer, the distribution is skew to the left or negatively skew. This is unusual, but Figure 4.11 shows an example. The negative skewness comes about because babies can be born alive at any gestational age from about 20 weeks, but soon after 40 weeks the baby will have to be born. Pregnancies will not be allowed to go on for more than 44 weeks; the birth would be induced artificially. Most distributions encountered in medical work are symmetrical or skew to the right, for reasons we shall discuss later (§7.4).

Table 4.8. Serum triglyceride measurements in cord blood from 282 babies

0.15 0.29 0.32 0.36 0.40 0.42 0.46 0.50

0.16 0.29 0.33 0.36 0.40 0.42 0.46 0.50

0.20 0.29 0.33 0.36 0.40 0.42 0.47 0.52

0.20 0.29 0.33 0.36 0.40 0.44 0.47 0.52

0.20 0.29 0.33 0.36 0.40 0.44 0.47 0.52

0.20 0.29 0.33 0.36 0.40 0.44 0.47 0.52

0.21 0.30 0.33 0.36 0.40 0.44 0.47 0.52

0.22 0.30 0.33 0.36 0.40 0.44 0.48 0.52

0.24 0.30 0.33 0.37 0.40 0.44 0.48 0.52

0.25 0.30 0.34 0.37 0.40 0.44 0.48 0.53

0.26 0.30 0.34 0.37 0.40 0.44 0.48 0.54

0.26 0.30 0.34 0.37 0.40 0.44 0.48 0.54

0.26 0.30 0.34 0.38 0.40 0.45 0.48 0.54

0.27 0.30 0.34 0.38 0.40 0.45 0.48 0.54

0.27 0.30 0.34 0.38 0.41 0.45 0.48 0.54

0.27 0.31 0.34 0.38 0.41 0.45 0.48 0.54

0.28 0.31 0.34 0.38 0.41 0.45 0.48 0.55

0.28 0.32 0.35 0.39 0.41 0.45 0.48 0.55

0.28 0.32 0.35 0.39 0.41 0.46 0.48 0.55

0.28 0.32 0.35 0.39 0.41 0.46 0.49 0.55

0.28 0.32 0.35 0.39 0.41 0.46 0.49 0.55

0.28 0.32 0.35 0.39 0.42 0.46 0.49 0.55

0.28 0.32 0.35 0.40 0.42 0.46 0.50 0.55

0.28 0.32 0.36 0.40 0.42 0.46 0.50 0.55

4.5 Medians and quantiles

We often want to summarize a frequency distribution in a few numbers, for ease of reporting or comparison. The most direct method is to use quantiles. The quantiles are values which divide the distribution such that there is a given proportion of observations below the quantile. For example, the median is a quantile. The median is the central value of the distribution, such that half the points are less than or equal to it and half are greater than or equal to it. We can estimate any quantile easily from the cumulative frequency distribution or a stem and leaf plot. For the FEV1 data the median is 4.1, the 29th value in Table 4.4. If we have an even number of points, we choose a value midway between the two central values.

Fig. 4.11. Gestational age at birth for 1749 deliveries at St. George's Hospital

In general, we estimate the q quantile, the value such that a proportion q will be below it, as follows. We have n ordered observations which divide the scale into n + 1 parts: below the lowest observation, above the highest and between each adjacent pair. The proportion of the distribution which lies below the ith observation is estimated by i/(n + 1). We set this equal to q and get i = q(n + 1). If i is an integer, the ith observation is the required quantile estimate. If not, let j be the integer part of i, the part before the decimal point. The quantile will lie between the jth and (j + 1)th observations. Writing $x_j$ for the jth ordered observation, we estimate it by

$$x_j + (x_{j+1} - x_j) \times (i - j)$$

For the median, for example, the 0.5 quantile, i = q(n + 1) = 0.5 × (57 + 1) = 29, the 29th observation as before.

Other quantiles which are particularly useful are the quartiles of the distribution. The quartiles divide the distribution into four equal parts, called fourths. The second quartile is the median. For the FEV1 data the first and third quartiles are 3.54 and 4.53. For the first quartile, i = 0.25 × 58 = 14.5. The quartile is between the 14th and 15th observations, which are both 3.54. For the third quartile, i = 0.75 × 58 = 43.5, so the quartile lies between the 43rd and 44th observations, which are 4.50 and 4.56. The quantile is given by 4.50 + (4.56 − 4.50) × (43.5 − 43) = 4.53. We often divide the distribution at 99 centiles or percentiles. The median is thus the 50th centile. For the 20th centile of FEV1, i = 0.2 × 58 = 11.6, so the quantile is between the 11th and 12th observations, 3.42 and 3.48, and can be estimated by 3.42 + (3.48 − 3.42) × (11.6 − 11) = 3.46. We can estimate these easily from Figure 4.2 by finding the position of the quantile on the vertical axis, e.g. 0.2 for the 20th centile or 0.5 for the median, drawing a horizontal line to intersect the cumulative frequency polygon, and reading the quantile off the horizontal axis.
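The rule i = q(n + 1) with linear interpolation can be sketched in a few lines of Python. This is only an illustration: the function name `quantile` and the choice to return the lowest or highest observation when i falls outside 1 to n are assumptions made here, not part of the method as described. The nine hypothetical observations of §4.6 are used as data.

```python
import math

def quantile(sorted_values, q):
    """Estimate the q quantile by the i = q(n + 1) rule with interpolation."""
    n = len(sorted_values)
    i = q * (n + 1)
    j = math.floor(i)              # integer part of i
    if j < 1:                      # assumption: fall back to the lowest value
        return sorted_values[0]
    if j >= n:                     # assumption: fall back to the highest value
        return sorted_values[-1]
    lower = sorted_values[j - 1]   # jth ordered observation (1-based)
    upper = sorted_values[j]       # (j + 1)th ordered observation
    return lower + (upper - lower) * (i - j)

data = sorted([2, 3, 9, 5, 4, 0, 6, 3, 4])
q1 = quantile(data, 0.25)
median = quantile(data, 0.5)
q3 = quantile(data, 0.75)
print(q1, median, q3)
```

With n = 9 the first quartile has i = 2.5 and so lies halfway between the 2nd and 3rd ordered observations, and the median is simply the 5th.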

Fig. 4.12. Box and whisker plots for FEV1 and for serum triglyceride

Fig. 4.13. Box plots showing a roughly symmetrical variable in four groups, with an outlying point (data in Table 10.8)

Tukey (1977) used the median, quartiles, maximum and minimum as a convenient five figure summary of a distribution. He also suggested a neat graph, the box and whisker plot, which represents this (Figure 4.12). The box shows the distance between the quartiles, with the median marked as a line, and the ‘whiskers’ show the extremes. The different shapes of the FEV1 and serum triglyceride distributions are clear from the graph. For display purposes, an observation whose distance from the edge of the box (i.e. the quartile) is more than 1.5 times the length of the box (i.e. the interquartile range, §4.7) may be called an outlier. Outliers may be shown as separate points (Figure 4.13). The plot can be useful for showing the comparison of several groups (Figure 4.13).
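As a rough sketch, the five figure summary and the 1.5 box-length rule for outliers might be computed as follows. The data are hypothetical and the function names are invented for this example. Python's `statistics.quantiles` is used for the quartiles; its default "exclusive" method places the quantiles at positions q(n + 1), which matches the rule of §4.5.

```python
from statistics import quantiles

def five_figure_summary(values):
    """Return minimum, first quartile, median, third quartile, maximum."""
    q1, median, q3 = quantiles(values, n=4)   # default method is "exclusive"
    return min(values), q1, median, q3, max(values)

def outliers(values):
    """Observations more than 1.5 box lengths beyond the edge of the box."""
    _, q1, _, q3, _ = five_figure_summary(values)
    box_length = q3 - q1                      # the interquartile range
    low, high = q1 - 1.5 * box_length, q3 + 1.5 * box_length
    return [x for x in values if x < low or x > high]

sample = [0, 2, 3, 3, 4, 4, 5, 6, 9, 20]      # hypothetical observations
summary = five_figure_summary(sample)
flagged = outliers(sample)
print(summary)
print(flagged)
```

Here the box runs from 2.75 to 6.75, so only the observation 20 lies more than 1.5 × 4 = 6 beyond its edge.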

4.6 The mean

The median is not the only measure of central value for a distribution. Another is the arithmetic mean or average, usually referred to simply as the mean. This is found by taking the sum of the observations and dividing by their number. For example, consider the following hypothetical data:

2 3 9 5 4 0 6 3 4

The sum is 36 and there are 9 observations, so the mean is 36/9 = 4.0. At this point we will need to introduce some algebraic notation, widely used in statistics. We denote the observations by

$$x_1, x_2, \ldots, x_i, \ldots, x_n$$

There are n observations and the ith of these is $x_i$. For the example, $x_4 = 5$ and n = 9. The sum of all the $x_i$ is

$$\sum_{i=1}^{n} x_i$$

The summation sign is an upper case Greek letter, sigma, the Greek S. When it is obvious that we are adding the values of $x_i$ for all values of i, which runs from 1 to n, we abbreviate this to $\sum x_i$ or simply to $\sum x$. The mean of the $x_i$ is denoted by $\bar{x}$ (‘x bar’), and

$$\bar{x} = \frac{1}{n}\sum x_i$$

The sum of the 57 FEV1s is 231.51 and hence the mean is 231.51/57 = 4.06. This is very close to the median, 4.1, so the median is within 1% of the mean. This is not so for the triglyceride data. The median triglyceride (Table 4.8) is 0.46 but the mean is 0.51, which is higher. The median is 10% away from the mean. If the distribution is symmetrical the sample mean and median will be about the same, but in a skew distribution they will not. If the distribution is skew to the right, as for serum triglyceride, the mean will be greater; if it is skew to the left the median will be greater. This is because the values in the tails affect the mean but not the median.
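The effect of skewness on the mean and median can be checked on a small example. The two samples below are hypothetical, chosen only so that one has a long right tail and the other is its mirror image.

```python
from statistics import mean, median

right_skew = [1, 2, 2, 3, 3, 3, 4, 10]       # hypothetical, long right tail
left_skew = [11 - x for x in right_skew]     # mirror image, long left tail

right_mean, right_median = mean(right_skew), median(right_skew)
left_mean, left_median = mean(left_skew), median(left_skew)
print(right_mean, right_median)   # mean pulled up by the value 10
print(left_mean, left_median)     # mean pulled down by the value 1
```

In each case the single extreme value moves the mean towards the long tail while the median stays with the bulk of the observations.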

The sample mean has much nicer mathematical properties than the median and is thus more useful for the comparison methods described later. The median is a very useful descriptive statistic, but not much used for other purposes.

4.7 Variance, range and interquartile range

The mean and median are measures of the position of the middle of the distribution, which we call the central tendency. We shall also need a measure of the spread or variability of the distribution, called the dispersion.

One obvious measure is the range, the difference between the highest and lowest values. For the data of Table 4.4, the range is 5.43 − 2.85 = 2.58 litres. The range is often presented as the two extremes, 2.85–5.43 litres, rather than their difference. The range is a useful descriptive measure, but has two disadvantages. Firstly, it depends only on the extreme values and so can vary a lot from sample to sample. Secondly, it depends on the sample size. The larger the sample is, the further apart the extremes are likely to be. We can see this if we consider a sample of size 2. If we add a third member to the sample the range will only remain the same if the new observation falls between the other two, otherwise the range will increase. We can get round the second of these problems by using the interquartile range, the difference between the first and third quartiles. For the data of Table 4.4, the interquartile range is 4.53 − 3.54 = 0.99 litres. The interquartile range, too, is often presented as the two extremes, 3.54–4.53 litres. However, the interquartile range is quite variable from sample to sample and is also mathematically intractable. Although a useful descriptive measure, it is not the one preferred for purposes of comparison.

Table 4.9. Deviations from the mean of 9 observations

Observations    Deviations from the mean    Squared deviations
$x_i$           $x_i - \bar{x}$             $(x_i - \bar{x})^2$

      2                   -2                         4
      3                   -1                         1
      9                    5                        25
      5                    1                         1
      4                    0                         0
      0                   -4                        16
      6                    2                         4
      3                   -1                         1
      4                    0                         0

Total 36                   0                        52

The most commonly used measures of dispersion are the variance and standard deviation. We start by calculating the difference between each observation and the sample mean, called the deviations from the mean (Table 4.9). If the data are widely scattered, many of the observations $x_i$ will be far from the mean $\bar{x}$ and so many deviations $x_i - \bar{x}$ will be large. If the data are narrowly scattered, very few observations will be far from the mean and so few deviations $x_i - \bar{x}$ will be large. We need some kind of average deviation to measure the scatter. If we add all the deviations together, we get zero, because $\sum(x_i - \bar{x}) = \sum x_i - \sum\bar{x} = \sum x_i - n\bar{x}$ and $n\bar{x} = \sum x_i$. Instead we square the deviations and then add them, as shown in Table 4.9. This removes the effect of sign; we are only measuring the size of the deviation, not the direction. This gives us $\sum(x_i - \bar{x})^2$, in the example equal to 52, called the sum of squares about the mean, usually abbreviated to sum of squares.

Clearly, the sum of squares will depend on the number of observations as well as the scatter. We want to find some kind of average squared deviation. This leads to a difficulty. Although we want an average squared deviation, we divide the sum of squares by n − 1, not n. This is not the obvious thing to do and puzzles many students of statistical methods. The reason is that we are interested in estimating the scatter of the population, rather than the sample, and the sum of squares about the sample mean is proportional to n − 1 (§4A, §6B). Dividing by n would lead to small samples producing lower estimates of variability than large samples. The minimum number of observations from which the variability can be estimated is 2; a single observation cannot tell us how variable the data are. If we used n as our divisor, for n = 1 the sum of squares would be zero, giving a variance of zero. With the correct divisor of n − 1, n = 1 gives the meaningless ratio 0/0, reflecting the impossibility of estimating variability from a single observation. The estimate of variability is called the variance, defined by

$$\text{variance} = \frac{1}{n-1}\sum(x_i - \bar{x})^2$$

We have already said that $\sum(x_i - \bar{x})^2$ is called the sum of squares. The quantity n − 1 is called the degrees of freedom of the variance estimate (§7A). We have:

$$\text{variance} = \frac{\text{sum of squares}}{\text{degrees of freedom}}$$

We shall usually denote the variance by $s^2$. In the example, the sum of squares is 52 and there are 9 observations, giving 8 degrees of freedom. Hence $s^2 = 52/8 = 6.5$.

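The formula $\sum(x_i - \bar{x})^2$ gives us a rather tedious calculation. There is another formula for the sum of squares,

$$\sum(x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$

which makes the calculation easier to carry out. This is simply an algebraic manipulation of the first form and gives exactly the same answers. We thus have two formulae for variance:

$$s^2 = \frac{1}{n-1}\sum(x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right)$$

The algebra is quite simple and is given in §4B. For example, using the second formula for the nine observations, we have:

$$s^2 = \frac{1}{8}\left(4 + 9 + 81 + 25 + 16 + 0 + 36 + 9 + 16 - \frac{36^2}{9}\right) = \frac{196 - 144}{8} = \frac{52}{8} = 6.5$$

as before. On a calculator this is a much easier formula than the first, as the numbers need only be put in once. It can be inaccurate, because we subtract one large number from another to get a small one. For this reason the first formula would be used in a computer program.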
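The two formulae, and the loss of accuracy in the second, can be illustrated with a short Python sketch. The function names are invented for this example, and adding 1e9 to every observation is simply a convenient, hypothetical way of making the numbers large without changing their scatter.

```python
def sum_of_squares_deviations(xs):
    """First formula: add up the squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

def sum_of_squares_shortcut(xs):
    """Second formula: sum of x squared minus (sum of x) squared over n."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

data = [2, 3, 9, 5, 4, 0, 6, 3, 4]
ss_dev = sum_of_squares_deviations(data)
ss_short = sum_of_squares_shortcut(data)
variance = ss_dev / (len(data) - 1)
print(ss_dev, ss_short, variance)

# Shift every observation by a large constant: the scatter is unchanged,
# so the sum of squares about the mean should still be 52.
shifted = [x + 1e9 for x in data]
ss_dev_shifted = sum_of_squares_deviations(shifted)
ss_short_shifted = sum_of_squares_shortcut(shifted)
print(ss_dev_shifted, ss_short_shifted)
```

For the original nine observations both formulae give 52. For the shifted data the deviations formula still gives 52, but the shortcut subtracts two numbers of about 9 × 10^18, which ordinary floating point arithmetic cannot hold to the nearest unit, so its answer is no longer reliable.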

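4.8 Standard deviation

The variance is calculated from the squares of the observations. This means that it is not in the same units as the observations, which limits its use as a descriptive statistic. The obvious answer to this is to take the square root, which will then have the same units as the observations and the mean. The square root of the variance is called the standard deviation, usually denoted by s. Thus,

$$s = \sqrt{\frac{1}{n-1}\sum(x_i - \bar{x})^2}$$

Returning to the FEV1 data, we calculate the variance and standard deviation as follows. We have n = 57, $\sum x_i = 231.51$, $\sum x_i^2 = 965.45$:

$$\sum(x_i - \bar{x})^2 = 965.45 - \frac{231.51^2}{57} = 965.45 - 940.30 = 25.15$$

$$s^2 = \frac{25.15}{57 - 1} = 0.449, \qquad s = \sqrt{0.449} = 0.67 \text{ litres}$$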

Figure 4.14 shows the relationship between mean, standard deviation and frequency distribution. For FEV1, we see that the majority of observations are within one standard deviation of the mean, and nearly all within two standard deviations of the mean. There is a small part of the histogram outside the $\bar{x} - 2s$ to $\bar{x} + 2s$ interval, on either side of this symmetrical histogram. As Figure 4.14 also shows, this is true for the highly skew triglyceride data, too. In this case, however, the outlying observations are all in one tail of the distribution. In general, we expect roughly 2/3 of observations to lie within one standard deviation of the mean and 95% to lie within two standard deviations of the mean.
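The calculation for the FEV1 data can be reproduced from the three summary figures quoted above (n = 57, sum 231.51, sum of squares of the observations 965.45). The sketch below uses the shortcut formula for the sum of squares and then forms the intervals of one and two standard deviations about the mean; the variable names are chosen for this example only.

```python
import math

n = 57
sum_x = 231.51       # sum of the FEV1 observations (litres)
sum_x2 = 965.45      # sum of the squared observations

mean = sum_x / n
sum_sq = sum_x2 - sum_x ** 2 / n          # sum of squares about the mean
variance = sum_sq / (n - 1)               # n - 1 degrees of freedom
sd = math.sqrt(variance)

one_sd = (mean - sd, mean + sd)
two_sd = (mean - 2 * sd, mean + 2 * sd)
print(round(mean, 2), round(sum_sq, 2), round(variance, 3), round(sd, 2))
print([round(v, 2) for v in one_sd], [round(v, 2) for v in two_sd])
```

To two decimal places this gives a mean of 4.06 litres and a standard deviation of 0.67 litres, so the two standard deviation interval runs from about 2.72 to 5.40 litres.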

Fig. 4.14. Histograms of FEV1 and triglyceride with mean and standard deviation

Table 4.10. Population of 100 random digits for a sampling experiment

9 1 0 7 5 6 9 5 8 8 1 0 5 7

1 8 8 8 5 2 4 8 3 1 6 5 5 7

2 8 1 8 5 8 4 0 1 9 2 1 6 9

1 9 7 9 7 2 7 7 0 8 1 6 3 8

7 0 2 8 8 7 2 5 4 1 8 6 8 3

Appendices

4A Appendix: The divisor for the variance

The variance is found by dividing the sum of squares about the sample mean by n − 1, not by n. This is because we want the scatter about the population mean, and the scatter about the sample mean is always less. The sample mean is ‘closer’ to the data points than is the population mean. We shall try a little sampling experiment to show this. Table 4.10 shows a set of 100 random digits which we shall take as the population to be sampled. They have mean 4.74 and the sum of squares about the mean is 811.24. Hence the average squared difference from the mean is 8.1124. We can take samples of size two at random from this population using a pair of decimal dice, which will enable us to choose any digit numbered from 00 to 99. The first pair chosen was 5 and 6 which has mean 5.5. The sum of squares about the population mean 4.74 is $(5 - 4.74)^2 + (6 - 4.74)^2 = 1.655$. The sum of squares about the sample mean is $(5 - 5.5)^2 + (6 - 5.5)^2 = 0.5$.

The sum of squares about the population mean is greater than the sum of squares about the sample mean, and this will always be so. Table 4.11 shows this for 20 such samples of size two. The average sum of squares about the population mean is 13.6, and about the sample mean it is 5.7. Hence dividing by the sample size (n = 2) we have mean square differences of 6.8 about the population mean and 2.9 about the sample mean. Compare this to 8.1 for the population as a whole. We see that the mean square difference about the population mean is quite close to 8.1, while the mean square difference about the sample mean is much less. However, if we divide the sum of squares about the sample mean by n − 1, i.e. 1, instead of n we have 5.7, which is not much different from the 6.8 from the sum of squares about the population mean.

Table 4.11. Sampling pairs from Table 4.10

Sample   $\sum(x_i - \mu)^2$   $\sum(x_i - \bar{x})^2$

5 6 1.655 0.5

8 8 21.255 0.0

6 1 15.575 12.5

9 3 21.175 18.0

5 5 0.135 0.0

7 7 10.215 0.0

1 7 19.095 18.0

9 8 28.775 0.5

3 3 6.055 0.0

5 1 14.055 8.0

8 3 13.655 12.5

5 7 5.175 2.0

5 2 7.575 4.5

5 7 5.175 2.0

8 8 21.255 0.0

3 2 10.535 0.5

0 4 23.015 8.0

9 3 21.175 18.0

5 2 7.575 4.5

6 9 19.735 4.5

Mean 13.6432 5.7

Table 4.12. Mean sums of squares about the sample mean for sets of 100 random samples from Table 4.10

Number in       Mean variance estimates
sample, n       divisor n    divisor n − 1

 2              4.5          9.1
 3              5.4          8.1
 4              5.9          7.9
 5              6.2          7.7
10              7.2          8.0

Table 4.12 shows the results of a similar experiment with more samples being taken. The table shows the two average variance estimates using n and n − 1 as the divisor of the sum of squares, for sample sizes 2, 3, 4, 5 and 10. We see that the sum of squares about the sample mean divided by n increases steadily with sample size, but if we divide it by n − 1 instead of n the estimate does not change as the sample size increases. The sum of squares about the sample mean is proportional to n − 1.
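The same point can be made without dice by listing every possible sample. The sketch below is an illustration under stated assumptions: the population is the ten digits 0 to 9 once each (a hypothetical stand-in, not the 100 digits of Table 4.10), and samples are drawn with replacement so that all ordered samples of size n are equally likely.

```python
from itertools import product

population = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical population
mu = sum(population) / len(population)
pop_mean_square = sum((x - mu) ** 2 for x in population) / len(population)

def average_estimates(n):
    """Average variance estimates over all samples of size n,
    using divisor n and divisor n - 1."""
    total_ss = 0.0
    count = 0
    for sample in product(population, repeat=n):
        xbar = sum(sample) / n
        total_ss += sum((x - xbar) ** 2 for x in sample)
        count += 1
    mean_ss = total_ss / count     # average sum of squares about sample mean
    return mean_ss / n, mean_ss / (n - 1)

print("population:", pop_mean_square)
for n in (2, 3, 4):
    by_n, by_n_minus_1 = average_estimates(n)
    print(n, round(by_n, 3), round(by_n_minus_1, 3))
```

For this population the average squared difference from the mean is 8.25. Dividing by n gives averages which rise with n but stay below 8.25, whereas dividing by n − 1 gives 8.25 for every sample size, the pattern seen by experiment in Table 4.12.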

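4B Appendix: Formulae for the sum of squares

The different formulae for sums of squares are derived as follows:

$$\sum(x_i - \bar{x})^2 = \sum(x_i^2 - 2x_i\bar{x} + \bar{x}^2) = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2$$

because $\bar{x}$ has the same value for each of the n observations. Now, $\sum x_i = n\bar{x}$, so

$$\sum(x_i - \bar{x})^2 = \sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$

We thus have three formulae for variance:

$$s^2 = \frac{1}{n-1}\sum(x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right) = \frac{1}{n-1}\left(\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right)$$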


14. Which of the following are qualitative variables:

(a) sex;
(b) parity;
(c) diastolic blood pressure;
(d) diagnosis;
(e) height.

15. Which of the following are continuous variables:

(a) blood glucose;
(b) peak expiratory flow rate;
(c) age last birthday;
(d) exact age;
(e) family size.

16. When a distribution is skew to the right:

(a) the median is greater than the mean;
(b) the distribution is unimodal;
(c) the tail on the left is shorter than the tail on the right;
(d) the standard deviation is less than the variance;
(e) the majority of observations are less than the mean.

17. The shape of a frequency distribution can be described using:

(a) a box and whisker plot;
(b) a histogram;
(c) a stem and leaf plot;
(d) mean and variance;
(e) a table of frequencies.

18. For the sample 3, 1, 7, 2, 2:

(a) the mean is 3;
(b) the median is 7;
(c) the mode is 2;
(d) the range is 1;
(e) the variance is 5.5.

19. Diastolic blood pressure has a distribution which is slightly skew to the right. If the mean and standard deviation were calculated for the diastolic pressures of a random sample of men:

(a) there would be fewer observations below the mean than above it;
(b) the standard deviation would be approximately equal to the mean;
(c) the majority of observations would be more than one standard deviation from the mean;
(d) the standard deviation would estimate the accuracy of blood pressure measurement;
(e) about 95% of observations would be expected to be within two standard deviations of the mean.

4E Exercise: Mean and standard deviation

This exercise gives some practice in one of the most fundamental calculations in statistics, that of the sum of squares and standard deviation. It also shows the relationship of the standard deviation to the frequency distribution. Table 4.13 shows blood glucose levels obtained from a group of medical students.

1. Make a stem and leaf plot for these data.

2. Find the minimum, maximum and quartiles and sketch a box and whisker plot.

3. Find the frequency distribution, using a class interval of 0.5.

Table 4.13. Random blood glucose levels from a group of first year medical students (mmol/litre)

4.7 3.6 3.8 2.2 4.7 4.1 3.6 4.0 4.4 5.1

4.2 4.1 4.4 5.0 3.7 3.6 2.9 3.7 4.7 3.4

3.9 4.8 3.3 3.3 3.6 4.6 3.4 4.5 3.3 4.0

3.4 4.0 3.8 4.1 3.8 4.4 4.9 4.9 4.3 6.0

4. Sketch the histogram of this frequency distribution. What term best describes the shape: symmetrical, skew to the right or skew to the left?

5. For the first column only, i.e. for 4.7, 4.2, 3.9, and 3.4, calculate the standard deviation using the deviations from the mean formula

$$s = \sqrt{\frac{1}{n-1}\sum(x_i - \bar{x})^2}$$

First calculate the sum of the observations and the sum of the observations squared. Hence calculate the sum of squares about the mean. Is this the same as that found in 4 above? Hence calculate the variance and the standard deviation.

6. For the same four numbers, calculate the standard deviation using the formula

$$s = \sqrt{\frac{1}{n-1}\left(\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right)}$$

First calculate the sum of the observations and the sum of the observations squared. Hence calculate the sum of squares about the mean. Is this the same as that found in 4 above? Hence calculate the variance and the standard deviation.

7. Use the following summations for the whole sample: $\sum x_i = 162.2$, $\sum x_i^2 = 676.74$. Calculate the mean of the sample, the sum of squares about the mean, the degrees of freedom for this sum of squares, and hence estimate the variance and standard deviation.

8. Calculate the mean ± one standard deviation and mean ± two standard deviations. Indicate these points and the mean on the histogram. What do you observe about their relationship to the frequency distribution?

5 Presenting data

5.1 Rates and proportions

Having collected our data as described in Chapters 2 and 3 and extracted information from it using the methods of Chapter 4, we must find a way to convey this information to others. In this chapter we shall look at some of the methods of doing that. We begin with rates and proportions.

When we have data in the form of frequencies, we often need to compare the frequency with certain conditions in groups containing different totals. In Table 2.1, for example, two groups of patient pairs were compared, 29 where the later patient had a C-T scan and 89 where neither had a C-T scan. The later patient did better in 9 of the first group and 34 of the second group. To compare these frequencies we compare the proportions 9/29 and 34/89. These are 0.31 and 0.38, and we can conclude that there is little difference. In Table 2.1, these were given as percentages, that is, the proportion out of 100 rather than out of 1, to avoid the decimal point. In Table 2.8, the Salk vaccine trial, the proportions contracting polio were presented as the number per 100000 for the same reason.

A rate expresses the frequency of the characteristic of interest per 1000 (or per 100000, etc.) of the population, per unit of time. For example, in Table 3.1, the results of the study of smoking by doctors, the data were presented as the number of deaths per 1000 doctors per year. This is not a proportion, as a further adjustment has been made to allow for the time period observed. Furthermore, the rate has been adjusted to take account of any differences in the age distributions of smokers and non-smokers (§16.2). Sometimes the actual denominator for a rate may be continually changing. The number of deaths from lung cancer among men in England and Wales for 1983 was 26502. The denominator for the death rate, the number of males in England and Wales, changed throughout 1983, as some died, some were born, some left the country and some entered it. The death rate is calculated by using a representative number, the estimated population at the end of June 1983, the middle of the year. This was 24175900, giving a death rate of 26502/24175900, which equals 0.001096, or 109.6 deaths per 100000 at risk per year. A number of the rates used in medical statistics are described in §16.5.
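The arithmetic for the two proportions and for the lung cancer death rate is easily checked. The short sketch below uses only the figures quoted in the text; the variable names are chosen for this example.

```python
# Proportions of patient pairs in which the later patient did better
prop_scan = 9 / 29          # later patient had a C-T scan
prop_no_scan = 34 / 89      # neither patient had a C-T scan
print(round(prop_scan, 2), round(prop_no_scan, 2))

# Lung cancer death rate among men, England and Wales, 1983
deaths = 26502
mid_year_population = 24175900
rate = deaths / mid_year_population          # deaths per person per year
per_100000 = rate * 100000                   # deaths per 100000 per year
print(f"{rate:.6f}", f"{per_100000:.1f}")
```

The rate is the same quantity whether quoted per person or per 100000; multiplying by 100000 only moves the decimal point to make the figure easier to read.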

The use of rates and proportions enables us to compare frequencies obtained from unequal sized groups, base populations or time periods, but we must beware of their use when their bases or denominators are not given. Victora (1982) reported a drug advertisement sent to doctors which described the antibiotic phosphomycin as being ‘100% effective in chronic urinary infections’. This is very impressive. How could we fail to prescribe a drug which is 100% effective? The study on which this was based used 8 patients, after excluding ‘those whose urine contained phosphomycin-resistant bacteria’. If the advertisement had said the drug was effective in 100% of 8 cases, we would have been less impressed. Had we known that it worked in 100% of 8 cases selected because it might work in them, we would have been still less impressed. The same paper quotes an advertisement for a cold remedy, where 100% of patients showed improvement. This was out of 5 patients! As Victora remarked, such small samples are understandable in the study of very rare diseases, but not for the common cold.

Sometimes we can fool ourselves as well as others by omitting denominators. I once carried out a study of the distribution of the soft tissue tumour Kaposi's sarcoma in Tanzania (Bland et al. 1977), and while writing it up I came across a paper setting out to do the same thing (Schmid 1973). One of the factors studied was tribal group, of which there are over 100 in Tanzania. This paper reported ‘the tribal incidence in the Wabende, Wambwe and Washirazi is remarkable… These small tribes, each with fewer than 90000 people, constitute the group in which a tribal factor can be suspected’. This is based on the following rates of tumours per 10000 population: national, 0.1; Wabende, 1.3; Wambwe, 0.7; Washirazi, 1.3. These are very big rates compared to the national, but the populations on which they are based are small, 8000, 14000 and 15000 respectively (Egero and Henin 1973). To get a rate of 1.3/10000 out of 8000 Wabende people we must have 8000 × 1.3/10000 = 1 case! Similarly we have 1 case among the 14000 Wambwe and 2 among the 15000 Washirazi. We can see that there are not enough data to draw the conclusions which the author has done. Rates and proportions are powerful tools and we must beware of them becoming detached from the original data.
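Working back from a rate to the number of cases behind it is a useful check whenever denominators are small. A minimal sketch, using the rates and populations quoted above and rounding to the nearest whole case:

```python
rates_per_10000 = {"Wabende": 1.3, "Wambwe": 0.7, "Washirazi": 1.3}
populations = {"Wabende": 8000, "Wambwe": 14000, "Washirazi": 15000}

# Number of cases implied by each rate, to the nearest whole case
cases = {
    tribe: round(populations[tribe] * rates_per_10000[tribe] / 10000)
    for tribe in populations
}
print(cases)
```

The three ‘remarkable’ rates rest on four cases in all.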

5.2 Significant figures

When we calculated the death rate due to lung cancer among men in 1983 we quoted the answer as 0.001096 or 109.6 per 100000 per year. This is an approximation. The rate to the greatest number of figures my calculator will give is 0.001096215653 and this number would probably go on indefinitely, turning into a recurring series of digits. The decimal system of representing numbers cannot in general represent fractions exactly. We know that 1/2 = 0.5, but 1/3 = 0.33333333…, recurring infinitely. This does not usually worry us, because for most applications the difference between 0.333 and 1/3 is too small to matter. Only the first few non-zero digits of the number are important and we call these the significant digits or significant figures. There is usually little point in quoting statistical data to more than three significant figures. After all, it hardly matters whether the lung cancer mortality rate is 0.001096 or 0.001097. The value 0.001096 is given to 4 significant figures. The leading zeros are not significant, the first significant digit in this number being ‘1’. To three significant figures we get 0.00110, because the last digit is 6 and so the 9 which precedes it is rounded up to 10. Note that significant figures are not the same as decimal places. The number 0.00110 is given to 5 decimal places, the number of digits after the decimal point. When rounding to the nearest digit, we leave the last significant digit, 9 in this case, if what follows it is less than 5, and increase by one if what follows is greater than 5. When we have exactly 5, I would always round up, i.e. 1.5 goes to 2. This means that 0, 1, 2, 3, 4 go down and 5, 6, 7, 8, 9 go up, which seems unbiased. Some writers take the view that 5 should go up half the time and down half the time, since it is exactly midway between the preceding digit and that digit plus one. Various methods are suggested for doing this but I do not recommend them myself. In any case, it is usually a mistake to round to so few significant figures that this matters.

How many significant figures we need depends on the use to which the number is to be put and on how accurate it is anyway. For example, if we have a sample of 10 sublingual temperatures measured to the nearest half degree, there is little point in quoting the mean to more than 3 significant figures. What we should not do is to round numbers to a few significant figures before we have completed our calculations. In the lung cancer mortality rate example, suppose we round the numerator and denominator to two significant figures. We have 27000/24000000 = 0.001125 and the answer is only correct to two figures. This can spread through calculations causing errors to build up. We always try to retain several more significant figures than we require for the final answer.
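Rounding to a given number of significant figures, with an exact 5 always going up, can be sketched with Python's `decimal` module. The function name `round_sig` is invented for this example. Note that Python's built-in `round` follows the other convention mentioned above, sending an exact half to the nearest even digit, so it is not used here; `ROUND_HALF_UP` rounds halves away from zero, which is ‘up’ for the positive numbers considered.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value, figures):
    """Round to the given number of significant figures, halves going up."""
    d = Decimal(str(value))
    if d == 0:
        return d
    # adjusted() is the power of ten of the first significant digit
    exponent = d.adjusted() - (figures - 1)
    return d.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_UP)

print(round_sig(0.001096, 3))          # three significant figures
print(int(round_sig(75172, 1)))        # one significant figure
print(round_sig(1.5, 1), round_sig(2.5, 1), round(2.5))

# Rounding too early: numerator and denominator to two significant figures
early = round_sig(26502, 2) / round_sig(24175900, 2)
print(early)
```

The last line reproduces the 0.001125 of the text, which agrees with the unrounded rate 0.001096 only in its first two significant figures.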

Consider Table 5.1. This shows mortality data in terms of the exact numbers of deaths in one year. The table is taken from a much larger table (OPCS 1991) which shows the numbers dying from every cause of death in the International Classification of Diseases (ICD), which gives numerical codes to many hundreds of causes of death. The full table, which also gives deaths by age group, covers 70 A4 pages. Table 5.1 shows deaths for broad groups of diseases called ICD chapters. This table is not a good way to present these data if we want to get an understanding of the frequency distribution of cause of death, and the differences between causes in men and women. This is even more true of the 70 page original. This is not the purpose of the table, of course. It is a source of data, a reference document from which users extract information for their own purposes. Let us see how Table 5.1 can be simplified. First, we can reduce the number of significant figures. Let us be extreme and reduce the data to one significant figure (Table 5.2).

This makes comparisons rather easier, but it is still not obvious which are the most important causes of death. We can improve this by re-ordering the table to put the most frequent cause, diseases of the circulatory system, first (Table 5.3). We can also combine a lot of the smaller categories into an ‘others’ group. I did this arbitrarily, by combining all those accounting for less than 2% of the total. Now it is clear at a glance that the most important causes of death in England and Wales are diseases of the circulatory system, neoplasms and diseases of the respiratory system, and that these dwarf all the others. Of course, mortality is not the only indicator of the importance of a disease. ICD chapter XIII, diseases of the musculo-skeletal system and connective tissues, are easily seen from Table 5.2 to be only minor causes of death, but this group includes arthritis and rheumatism, the most important illness in its effects on daily activity.

Table 5.1. Deaths by sex and cause, England and Wales, 1989 (OPCS 1991, DH2 No. 10)

I.C.D. chapter and type of disease                                            Number of deaths
                                                                              Males    Females

I     Infectious and parasitic                                                 1246       1297
II    Neoplasms (cancers)                                                     75172      69948
III   Endocrine, nutritional and metabolic diseases and immunity disorders     4395       5758
IV    Blood and blood forming organs                                           1002       1422
V     Mental disorders                                                         4493       9225
VI    Nervous system and sense organs                                          5466       5990
VII   Circulatory system                                                     127435     137165
VIII  Respiratory system                                                      33489      33223
IX    Digestive system                                                         7900      10779
X     Genitourinary system                                                     3616       4156
XI    Complications of pregnancy, childbirth and the puerperium                   0         56
XII   Skin and subcutaneous tissues                                             250        573
XIII  Musculo-skeletal system and connective tissues                           1235       4139
XIV   Congenital anomalies                                                      897        869
XV    Certain conditions originating in the perinatal period                    122        118
XVI   Signs, symptoms and ill-defined conditions                               1582       3082
XVII  Injury and poisoning                                                    11073       6427

      Total                                                                  279373     294227

5.3 Presenting tables

Tables 5.1, 5.2 and 5.3 illustrate a number of useful points about the presentation of tables. Like all the tables in this book, they are designed to stand alone from the text. There is no need to refer to material buried in some paragraph to interpret the table. A table is intended to communicate information, so it should be easy to read and understand. A table should have a clear title, stating clearly and unambiguously what the table represents. The rows and columns must also be labelled clearly.

When proportions, rates or percentages are used in a table together with frequencies, they must be easy to distinguish from one another. This can be done, as in Table 2.10, by adding a ‘%’ symbol, or by including a place of decimals. The addition in Table 2.10 of the ‘total’ row and the ‘100%’ makes it clear that the percentages are calculated from the number in the treatment group, rather than the number with that particular outcome or the total number of patients.

Table 5.2. Deaths by sex and cause, England and Wales, 1989, rounded to one significant figure

I.C.D. chapter and type of disease                                            Number of deaths
                                                                              Males    Females

I     Infectious and parasitic                                                 1000       1000
II    Neoplasms (cancers)                                                     80000      70000
III   Endocrine, nutritional and metabolic diseases and immunity disorders     4000       6000
IV    Blood and blood forming organs                                           1000       1000
V     Mental disorders                                                         4000       9000
VI    Nervous system and sense organs                                          5000       6000
VII   Circulatory system                                                     100000     100000
VIII  Respiratory system                                                      30000      30000
IX    Digestive system                                                         8000      10000
X     Genitourinary system                                                     4000       4000
XI    Complications of pregnancy, childbirth and the puerperium                   0         60
XII   Skin and subcutaneous tissues                                             300        600
XIII  Musculo-skeletal system and connective tissues                           1000       4000
XIV   Congenital anomalies                                                      900        900
XV    Certain conditions originating in the perinatal period                    100        100
XVI   Signs, symptoms and ill-defined conditions                               2000       3000
XVII  Injury and poisoning                                                    10000       6000

      Total                                                                  300000     300000

Table 5.3. Deaths by sex, England and Wales, 1989, for major causes

I.C.D. chapter and type of disease     Number of deaths
                                       Males    Females

Circulatory system (VII)              100000     100000
Neoplasms (cancers) (II)               80000      70000
Respiratory system (VIII)              30000      30000
Injury and poisoning (XVII)            10000       6000
Digestive system (IX)                   8000      10000
Others                                 20000      20000

Total                                 300000     300000

5.4 Pie charts

It is often convenient to present data pictorially. Information can be conveyed much more quickly by a diagram than by a table of numbers. This is particularly useful when data are being presented to an audience, as here the information has to be got across in a limited time. It can also help a reader get the salient points of a table of numbers. Unfortunately, unless great care is taken, diagrams can also be very misleading and should be treated only as an addition to numbers, not a replacement.

We have already discussed methods of illustrating the frequency distribution of a quantitative variable. We will now look at an equivalent of the histogram for qualitative data, the pie chart or pie diagram. This shows the relative frequency for each category by dividing a circle into sectors, the angles of which are proportional to the relative frequency. We thus multiply each relative frequency by 360, to give the corresponding angle in degrees.

Table 5.4. Calculations for a pie chart of the distribution of cause of death

Cause of death          Frequency   Relative frequency   Angle (degrees)

Circulatory system        137165         0.46619               168
Neoplasms (cancers)        69948         0.23773                86
Respiratory system         33223         0.11292                41
Injury and poisoning        6427         0.02184                 8
Digestive system           10779         0.03663                13
Nervous system              5990         0.02036                 7
Others                     30695         0.10432                38

Total                     294227         1.00000               361

Fig. 5.1. Pie chart showing the distribution of cause of death among females, England and Wales, 1989

Table 5.4 shows the calculation for drawing a pie chart to represent the distribution of cause of death for females, using the data of Tables 5.1 and 5.3. (The total degrees are 361 rather than 360 because of rounding errors in the calculations.) The resulting pie chart is shown in Figure 5.1. This diagram is said to resemble a pie cut into pieces for serving, hence the name.
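The calculation in Table 5.4 can be reproduced in a few lines. The frequencies are those given in the table for females; the dictionary and variable names are chosen for this example.

```python
female_deaths = {
    "Circulatory system": 137165,
    "Neoplasms (cancers)": 69948,
    "Respiratory system": 33223,
    "Injury and poisoning": 6427,
    "Digestive system": 10779,
    "Nervous system": 5990,
    "Others": 30695,
}

total = sum(female_deaths.values())
for cause, frequency in female_deaths.items():
    relative = frequency / total            # relative frequency
    angle = round(360 * relative)           # sector angle, whole degrees
    print(f"{cause:22s} {frequency:7d} {relative:.5f} {angle:4d}")

angles = [round(360 * f / total) for f in female_deaths.values()]
print(total, sum(angles))
```

Each angle is rounded separately, which is why the rounded angles add to 361 rather than 360. If the angles are to be drawn by hand, one of the larger sectors can be reduced by a degree without visible effect.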

5.5 Bar charts

A bar chart or bar diagram shows data in the form of horizontal or vertical bars. For example, Table 5.5 shows the mortality due to cancer of the oesophagus in England and Wales over a 10 year period. Figure 5.2 shows these data in the form of a bar chart, the heights of the bars being proportional to the mortality.

Table 5.5. Cancer of the oesophagus: standardized mortality rate per 100000 per year, England and Wales, 1960–1969

Year Mortalityrate Year Mortalityrate

60 5.1 65 5.4

61 5.0 66 5.4

62 5.2 67 5.6

63 5.2 68 5.8

64 5.2 69 6.0

Fig. 5.2. Bar chart showing the relationship between mortality due to cancer of the oesophagus and year, England and Wales, 1960–1969

There are many uses for bar charts. As in Figure 5.2, they can be used to show the relationship between two variables, one being quantitative and the other either qualitative or a quantitative variable which is grouped, as is time in years. The values of the first variable are shown by the heights of bars, one bar for each category of the second variable.

Bar charts can be used to represent relationships between more than two variables. Figure 5.3 shows the relationship between children's reports of breathlessness and cigarette smoking by themselves and their parents. We can see quickly that the prevalence of the symptom increases both with the child's smoking and with that of their parents. In the published paper reporting these respiratory symptom data (Bland et al. 1978) the bar chart was not used; the data were given in the form of tables. They were thus available for other researchers to compare to their own or to carry out calculations upon. The bar chart was used to present the results during a conference, where the most important thing was to convey an outline of the analysis quickly.

Bar charts can also be used to show frequencies. For example, Figure 5.4(a) shows the relative frequency distributions of causes of death among men and women; Figure 5.4(b) shows the frequency distribution of cause of death among men. Figure 5.4(b) looks very much like a histogram. The distinction between these two terms is not clear. Most statisticians would describe Figures 4.3, 4.4, and 4.6 as histograms, and Figures 5.2 and 5.3 as bar charts, but I have seen books which actually reverse this terminology and others which reserve the term ‘histogram’ for a frequency density graph, like Figures 4.4 and 4.6.

Fig.5.3.Barchartshowingtherelationshipbetweentheprevalenceofself-reportedbreathlessnessamongschoolchildrenandtwopossiblecausativefactors

Fig.5.4.BarchartsshowingdatafromTable5.1

5.6 Scatter diagrams

The bar chart would be a rather clumsy method for showing the relationship between two continuous variables, such as vital capacity and height (Table 5.6). For this we use a scatter diagram or scattergram (Figure 5.5). This is made by marking the scales of the two variables along horizontal and vertical axes. Each pair of measurements is plotted with a cross, circle, or some other suitable symbol at the point indicated by using the measurements as coordinates.

Table 5.7 shows serum albumin measured from a group of alcoholic patients and a group of controls (Hickish et al. 1989). We can use a scatter diagram to present these data also. The vertical axis represents albumin and we choose two arbitrary points on the horizontal axis to represent the groups.

Table 5.6. Vital capacity (VC) and height for 44 female medical students

Height (cm)  VC (litres)  Height (cm)  VC (litres)  Height (cm)  VC (litres)  Height (cm)

155.0 2.20 161.2 3.39 166.0 3.66 170.0

155.0 2.65 162.0 2.88 166.0 3.69 171.0

155.4 3.06 162.0 2.96 166.6 3.06 171.0

158.0 2.40 162.0 3.12 167.0 3.48 171.5

160.0 2.30 163.0 2.72 167.0 3.72 172.0

160.2 2.63 163.0 2.82 167.0 3.80 172.0

161.0 2.56 163.0 3.40 167.6 3.06 174.0

161.0 2.60 164.0 2.90 167.8 3.70 174.2

161.0 2.80 165.0 3.07 168.0 2.78 176.0

161.0 2.90 166.0 3.03 168.0 3.63 177.0

161.0 3.40 166.0 3.50 169.4 2.80 180.6

Fig. 5.5. Scatter diagram showing the relationship between vital capacity and height for a group of female medical students

Table 5.7. Albumin measured in alcoholics and controls

Alcoholics Controls

15 28 39 41 44 48 34 41 43 45 45

16 29 39 43 45 48 39 42 43 45 45

17 32 39 43 45 49 39 42 43 45 45

18 37 40 44 46 51 40 42 43 45 46

20 38 40 44 46 51 41 42 44 45 46

21 38 40 44 46 52 41 42 44 45 47

28 38 41 44 47 41 42 44 45 47

Fig. 5.6. Scatter diagrams showing the data of Table 5.7

In Table 5.7 there are many identical observations in each group, so we need to allow for this in the scatter diagram. If there is more than one observation at the same coordinate we can indicate this in several ways. We can use the number of observations in place of the chosen symbol, but this method is becoming obsolete. As in Figure 5.6(a), we can displace the points slightly in a random direction (called jittering). This is what Stata does and so what I have done in most of this book. Alternatively, we can use a systematic sideways shift, to form a more orderly picture as in Figure 5.6(b). The latter is often used when the variable on the horizontal axis is categorical rather than continuous. Such scatter diagrams are very useful for checking the assumptions of some of the analytical methods which we shall use later. A scatter diagram where one variable is a group is also called a dot plot. As a presentational device, they enable us to show far more information than a bar chart of the group means can do. For this reason, statisticians usually prefer them to other types of graphical display.

5.7 Line graphs and time series

The data of Table 5.5 are ordered in a way that those of Table 5.6 are not, in that they are recorded at intervals in time. Such data are called a time series. If we plot a scatter diagram of such data, as in Figure 5.7, it is natural to join successive points by lines to form a line graph. We do not even need to mark the points at all; all we need is the line. This would not be sensible in Figure 5.5, as the observations are independent of one another and quite unrelated, whereas in Figure 5.7 there is likely to be a relationship between adjacent points. Here the mortality rate recorded for cancer of the oesophagus will depend on a number of things which vary over time including possibly causal factors, such as tobacco and alcohol consumption, and clinical factors, such as better diagnostic techniques and methods of treatment.

Line graphs are particularly useful when we want to show the change of more than one quantity over time. Figure 5.8 shows levels of zidovudine (AZT) in the blood of AIDS patients at several times after administration of the drug, for patients with normal fat absorption and with fat malabsorption (§10.7). The difference in response to the two treatments is very clear.

Fig. 5.7. Line graph showing changes in cancer of the oesophagus mortality over time

Fig. 5.8. Line graph to show the response to administration of zidovudine in two groups of AIDS patients

5.8 Misleading graphs

Figure 5.2 is clearly titled and labelled and can be read independently of the surrounding text. The principles of clarity outlined for tables apply equally here. After all, a diagram is a method of conveying information quickly and this object is defeated if the reader or audience has to spend time trying to sort out exactly what a diagram really means. Because the visual impact of diagrams can be so great, further problems arise in their use.

The first of these is the missing zero. Figure 5.9 shows a second bar chart representing the data of Table 5.5. This chart appears to show a very rapid increase in mortality, compared to the gradual increase shown in Figure 5.2. Yet both show the same data. Figure 5.9 omits most of the vertical scale, and instead stretches that small part of the scale where the change takes place. Even when we are aware of this, it is difficult to look at this graph and not think that it shows a large increase in mortality. It helps if we visualize the baseline as being somewhere near the bottom of the page.
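The size of the missing-zero effect can be put into numbers. The sketch below compares the drawn heights of the 1969 and 1960 bars, using the rates from Table 5.5, when the vertical axis starts at zero and when it starts at a truncated baseline. The baseline of 4.9 is a hypothetical value chosen for illustration, not the one used in Figure 5.9, and the function name `apparent_ratio` is likewise an assumption.

```python
rate_1960, rate_1969 = 5.1, 6.0  # mortality rates from Table 5.5

def apparent_ratio(first, last, baseline):
    """Ratio of the drawn bar heights when the vertical axis starts at `baseline`."""
    return (last - baseline) / (first - baseline)

# Axis starting at zero: bar heights are proportional to the rates themselves
ratio_from_zero = apparent_ratio(rate_1960, rate_1969, baseline=0.0)

# Axis starting at a hypothetical truncated baseline of 4.9
ratio_truncated = apparent_ratio(rate_1960, rate_1969, baseline=4.9)

print(f"Zero baseline: 1969 bar is {ratio_from_zero:.2f} times the 1960 bar")
print(f"Baseline 4.9:  1969 bar is {ratio_truncated:.2f} times the 1960 bar")
```

With the zero included, the last bar is less than one fifth taller than the first; with the truncated baseline it is drawn several times taller, although the data are unchanged.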

Fig. 5.9. Bar chart with zero omitted on the vertical scale

There is no zero on the horizontal axis in Figures 5.2 and 5.9, either. There are two reasons for this. There is no practical ‘zero time’ on the calendar; we use an arbitrary zero. Also, there is an unstated assumption that mortality rates vary with time and not the other way round.

The zero is omitted in Figure 5.5. This is almost always done in scatter diagrams, yet if we are to gauge the importance of the relationship between vital capacity and height by the relative change in vital capacity over the height range we need the zero on the vital capacity scale. The origin is often omitted on scatter diagrams because we are usually concerned with the existence of a relationship and the distributions followed by the observations, rather than its magnitude. We estimate the latter in a different way, described in Chapter 11.

Line graphs are particularly at risk of undergoing the sort of distortion of missing zero described in §5.8. Many computer programs resist drawing bar charts like Figure 5.9, but will produce a line graph with a truncated scale as the default. Figure 5.10 shows a line graph with a truncated scale, corresponding to Figure 5.9. Just as there, the message of the graph is a dramatic increase in mortality, which the data themselves do not really support. We can make this even more dramatic by stretching the vertical scale and compressing the horizontal scale. The effect is now really impressive and looks much more likely than Figure 5.7 to attract research funds, Nobel prizes and interviews on television. Huff (1954) aptly names such horrors ‘gee whiz’ graphs. They are even more dramatic if we omit the scales altogether and show only the soaring line.

Fig. 5.10. Line graphs with a missing zero and with a stretched vertical and compressed horizontal scale, a ‘gee whiz’ graph

Fig. 5.11. Figure 5.1 with three-dimensional effects

This is not to say that authors who show only part of the scale are deliberately trying to mislead. There are often good arguments against graphs with vast areas of boring blank paper. In Figure 5.5, we are not interested in vital capacities near zero and can feel quite justified in excluding them. In Figure 5.10 we certainly are interested in zero mortality; it is surely what we are aiming for. The point is that graphs can so easily mislead the unwary reader, so let the reader beware.

The advent of powerful personal computers led to an increase in the ability to produce complicated graphics. Simple charts, such as Figure 5.1, are informative but not visually exciting. One way of decorating such graphs is to make them appear three-dimensional. Figure 5.11 shows the effect. The angles are no longer proportional to the numbers which they represent. The areas are, but because they are different shapes it is difficult to compare them. This defeats the primary object of conveying information quickly and accurately. Another approach to decorating diagrams is to turn them into pictures. In a pictogram the bars of the bar chart are replaced by pictures. Pictograms can be highly misleading, as the height of a picture, drawn with three-dimensional effect, is proportional to the number represented, but what we see is the volume. Such decorated graphs are like the illuminated capitals of medieval manuscripts: nice to look at but hard to read. I think they should be avoided.

Fig. 5.12. Tuberculosis mortality in England and Wales, 1871–1971 (DHSS 1976)

Huff (1954) recounts that the president of a chapter of the American Statistical Association criticized him for accusing presenters of data of trying to deceive. The statistician argued that incompetence was the problem. Huff's reply was that diagrams frequently sensationalize by exaggeration and rarely minimize anything, that presenters of data rarely distort those data to make their case appear weaker than it is. The errors are too one-sided for us to ignore the possibility that we are being fooled. When presenting data, especially graphically, be very careful that the data are shown fairly. When on the receiving end, beware!

5.9 Logarithmic scales

Figure 5.12 shows a line graph representing the fall in tuberculosis mortality in England and Wales over 100 years (DHSS 1976). We can see a rather unsteady curve, showing the continuing decline in the disease. Figure 5.12 also shows the mortality plotted on a logarithmic (or log) scale. A logarithmic scale is one where two pairs of points will be the same distance apart if their ratios are equal, rather than their differences. Thus the distance between 1 and 10 is equal to that between 10 and 100, not to that between 10 and 19. (See §5A if you do not understand this.) The logarithmic line shows a clear kink in the curve about 1950, the time when a number of effective anti-TB measures, chemotherapy with streptomycin, BCG vaccine and mass screening with X-rays, were introduced. If we consider the properties of logarithms (§5A), we can see how the log scale for the tuberculosis mortality data produced such sharp changes in the curve. If the relationship is such that the mortality is falling with a constant proportion, such as 10% per year, the absolute fall each year depends on the absolute level in the preceding year:

mortality in 1960 = constant × mortality in 1959

So if we plot mortality on a log scale we get:

log(mortality in 1960) = log(constant) + log(mortality in 1959)

For mortality in 1961, we have

log(mortality in 1961) = log(constant) + log(mortality in 1960)
= log(constant) + log(constant) + log(mortality in 1959)
= 2 × log(constant) + log(mortality in 1959)

Hence we get a straight line relationship between log mortality and time t:

log(mortality after t years) = t × log(constant) + log(mortality at start)

When the constant proportion changes, the slope of the straight line formed by plotting log(mortality) against time changes and there is a very obvious kink in the line.
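The straight line relationship can be checked numerically. The sketch below assumes a hypothetical starting rate of 100 and a constant of 0.9 (a fall of 10% per year); neither value comes from the tuberculosis data. The absolute falls get smaller each year, while the steps in log mortality are all equal to log(constant).

```python
import math

start = 100.0    # hypothetical mortality rate at the start
constant = 0.9   # mortality falls by 10% each year

# mortality after t years = start × constant^t
rates = [start * constant**t for t in range(6)]

# Absolute fall from each year to the next: depends on the preceding level
absolute_falls = [a - b for a, b in zip(rates, rates[1:])]

# Year-to-year change in log mortality: the same every year
log_rates = [math.log10(r) for r in rates]
log_steps = [b - a for a, b in zip(log_rates, log_rates[1:])]

print("absolute falls:", [round(f, 3) for f in absolute_falls])
print("log steps:     ", [round(s, 5) for s in log_steps])
```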

Log scales are very useful analytic tools. However, a graph on a log scale can be very misleading if the reader does not allow for the nature of the scale. The log scale in Figure 5.12 shows the increased rate of reduction in mortality associated with the anti-TB measures quite plainly, but it gives the impression that these measures were important in the decline of TB. This is not so. If we look at the corresponding point on the natural scale, we can see that all these measures did was to accelerate a decline which had been going on for a long time (see Radical Statistics Health Group 1976).

Appendices

5A Appendix: Logarithms

Logarithms are not simply a method of calculation dating from before the computer age, but a set of fundamental mathematical functions. Because of their special properties they are much used in statistics. We shall start with logarithms (or logs for short) to base 10, the common logarithms used in calculations. The log to base 10 of a number x is y where

$$x = 10^y$$

We write $y = \log_{10}(x)$. Thus for example $\log_{10}(10) = 1$, $\log_{10}(100) = 2$, $\log_{10}(1000) = 3$, $\log_{10}(10\,000) = 4$, and so on. If we multiply two numbers, the log of the product is the sum of their logs:

$$\log(xy) = \log(x) + \log(y)$$

For example:

$$100 \times 1000 = 10^2 \times 10^3 = 10^{2+3} = 10^5 = 100\,000$$

Or in log terms:

$$\log_{10}(100 \times 1000) = \log_{10}(100) + \log_{10}(1000) = 2 + 3 = 5$$

Hence, $100 \times 1000 = 10^5 = 100\,000$. This means that any multiplicative relationship of the form

$$y = a \times b \times c \times d$$

can be made additive by a log transformation:

$$\log(y) = \log(a) + \log(b) + \log(c) + \log(d)$$

This is the process underlying the fit to the Lognormal distribution described in §7.4.
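As a quick numerical check of this additive property, the following sketch uses four hypothetical factors (they stand in for a, b, c and d and are not taken from any data set) and compares the log of their product with the sum of their logs, using base 10 logs from Python's standard `math` module.

```python
import math

factors = [2.5, 40.0, 0.2, 7.0]  # hypothetical values of a, b, c, d

product = math.prod(factors)                        # y = a × b × c × d
log_of_product = math.log10(product)                # log(y)
sum_of_logs = sum(math.log10(f) for f in factors)   # log(a) + log(b) + log(c) + log(d)

print(f"log of product = {log_of_product:.6f}")
print(f"sum of logs    = {sum_of_logs:.6f}")
```

The two results agree apart from floating-point rounding error.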

There is no need to use 10 as the base for logarithms. We can use any number. The log of a number x to base b can be found from the log to base a by a simple calculation:

$$\log_b(x) = \frac{\log_a(x)}{\log_a(b)}$$

Ten is convenient for arithmetic using log tables, but for other purposes it is less so. For example, the gradient, slope or differential of the curve $y = \log_{10}(x)$ is $\log_{10}(e)/x$, where e = 2.718281… is a constant which does not depend on the base of the logarithm. This leads to awkward constants spreading through formulae. To keep this to a minimum we use logs to the base e, called natural or Napierian logarithms after the mathematician John Napier. This is the logarithm usually produced by LOG(X) functions in computer languages.
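A change of base is easy to carry out on a computer. In Python, for instance, `math.log(x)` returns the natural log, and a log to any other base can be obtained by dividing by the natural log of that base. The helper name `log_base` below is an assumption made for this sketch.

```python
import math

def log_base(x, b):
    """Log of x to base b, found from natural logs: log_b(x) = ln(x) / ln(b)."""
    return math.log(x) / math.log(b)

log2_of_8 = log_base(8, 2)          # should be close to 3, since 2^3 = 8
log10_of_1000 = log_base(1000, 10)  # should be close to 3, since 10^3 = 1000

print(log2_of_8, math.log2(8))
print(log10_of_1000, math.log10(1000))
```

The results may differ from the exact values in the last decimal place because of floating-point rounding.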

Figure 5.13 shows the log curve for three different bases, 2, e and 10. The curves all go through the point (1, 0), i.e. log(1) = 0. As x approaches 0, log(x) becomes a larger and larger negative number, tending towards minus infinity as x tends to zero. There are no logs of negative numbers. As x increases from 1, the curve becomes flatter and flatter. Though log(x) continues to increase, it does so more and more slowly. The curves all go through (base, 1), i.e. log(base) = 1. The curve for log to the base 2 goes through (2, 1), (4, 2), (8, 3) because $2^1 = 2$, $2^2 = 4$, $2^3 = 8$. We can see that the effect of replacing data by their logs will be to stretch out the scale at the lower end and contract it at the upper.

We often work with logarithms of data rather than the data themselves. This may have several advantages. Multiplicative relationships may become additive, curves may become straight lines and skew distributions may become symmetrical.

We transform back to the natural scale using the antilogarithm or antilog. If $y = \log_{10}(x)$, $x = 10^y$ is the antilog of y. If $z = \log_e(x)$, $x = e^z$ or $x = \exp(z)$ is the antilog of z. If your computer program does not transform back, most calculators have $e^x$ and $10^x$ functions for this purpose.

Fig. 5.13. Logarithmic curves to three different bases


20. ‘After treatment with Wondermycin, 66.67% of patients made a complete recovery’

(a) Wondermycin is wonderful;

(b) this statement may be misleading because the denominator is not given;

(c) the number of significant figures used suggests a degree of precision which may not be present;

(d) some control information is required before we can draw any conclusions about Wondermycin;

(e) there might be only a very small number of patients.

21. The number 1729.54371:

(a) to two significant figures is 1700;

(b) to three significant figures is 1720;

(c) to six decimal places is 1729.54;

(d) to three decimal places is 1729.544;

(e) to five significant figures is 1729.5.

Fig. 5.14. A dubious graph

22. Figure 5.14:

(a) shows a histogram;

(b) should have the vertical axis labelled;

(c) should show the zero on the vertical axis;

(d) should show the zero on the horizontal axis;

(e) should show the units for the vertical axis.

23. Logarithmic scales used in graphs showing time trends:

(a) show changes in the trend clearly;

(b) often produce straight lines;

(c) give a clear idea of the magnitude of changes;

(d) should show the zero point from the original scale;

(e) compress intervals between large numbers compared to those between small numbers.

24. The following methods can be used to show the relationship between two variables:

(a) histogram;

(b) pie chart;

(c) scatter diagram;

(d) bar chart;

(e) line graph.

Table 5.8. Weekly geriatric admissions in Wandsworth Health District from May to September, 1982 and 1983 (Fish et al. 1985)

Week 1982 1983 Week 1982 1983

1 24 20 12 11 25

2 22 17 13 6 22

3 21 21 14 10 26

4 22 17 15 13 12

5 24 22 16 19 33

6 15 23 17 13 19

7 23 20 18 17 21

8 21 16 19 10 28

9 18 24 20 16 19

10 21 21 21 24 13

11 17 20 22 15 29

5E Exercise: Creating graphs

In this exercise we shall display graphically some of the data we have studied so far.

1. Table 4.1 shows diagnoses of patients in a hospital census. Display these data as a graph.

2. Table 2.8 shows the paralytic polio rates for several groups of children. Construct a bar chart for the results from the randomized control areas.

3. Table 3.1 shows some results from the study of mortality in British doctors. Show these graphically.

4. Table 5.8 shows the numbers of geriatric admissions in Wandsworth Health District for each week from May to September in 1982 and 1983. Show these data graphically. Why do you think the two years were different?

6

Probability

6.1 Probability

We use data from a sample to draw conclusions about the population from which it is drawn. For example, in a clinical trial we might observe that a sample of patients given a new treatment respond better than patients given an old treatment. We want to know whether an improvement would be seen in the whole population of patients, and if so how big it might be. The theory of probability enables us to link samples and populations, and to draw conclusions about populations from samples. We shall start the discussion of probability with some simple randomizing devices, such as coins and dice, but the relevance to medical problems should soon become apparent.

We first ask what exactly is meant by ‘probability’. In this book I shall take the frequency definition: the probability that an event will happen under given circumstances may be defined as the proportion of repetitions of those circumstances in which the event would occur in the long run. For example, if we toss a coin it comes down either heads or tails. Before we toss it, we have no way of knowing which will happen, but we do know that it will either be heads or tails. After we have tossed it, of course, we know exactly what the outcome is. If we carry on tossing our coin, we should get several heads and several tails. If we go on doing this for long enough, then we would expect to get as many heads as we do tails. So the probability of a head being thrown is half, because in the long run a head should occur on half of the throws. The number of heads which might arise in several tosses of the coin is called a random variable, that is, a variable which can take more than one value with given probabilities. In the same way, a thrown die can show six faces, numbered one to six, with equal probability. We can investigate random variables such as the number of sixes in a given number of throws, the number of throws before the first six, and so on. There is another, broader definition of probability which leads to a different approach to statistics, the Bayesian school (Bland and Altman 1998), but it is beyond the scope of this book.

The frequency definition of probability also applies to continuous measurement, such as human height. For example, suppose the median height in a population of women is 168 cm. Then half the women are above 168 cm in height. If we choose women at random (i.e. without the characteristics of the woman influencing the choice) then in the long run half the women chosen will have heights above 168 cm. The probability of a woman having height above 168 cm is one half. Similarly, if 1/10 of the women have height greater than 180 cm, a woman chosen at random will have height greater than 180 cm with probability 1/10. In the same way we can find the probability of height being between any given values. When we measure a continuous quantity we are always limited by the method of measurement, and so when we say a woman's height is 170 cm we mean that it is between, say, 169.5 and 170.5 cm, depending on the accuracy with which we measure. So what we are interested in is the probability of the random variable taking values between certain limits rather than particular values.

6.2 Properties of probability

The following simple properties follow from the definition of probability.

1. A probability lies between 0.0 and 1.0. When the event never happens the probability is 0.0, when it always happens the probability is 1.0.

2. Addition rule. Suppose two events are mutually exclusive, i.e. when one happens the other cannot happen. Then the probability that one or the other happens is the sum of their probabilities. For example, a thrown die may show a one or a two, but not both. The probability that it shows a one or a two = 1/6 + 1/6 = 2/6.

3. Multiplication rule. Suppose two events are independent, i.e. knowing one has happened tells us nothing about whether the other happens. Then the probability that both happen is the product of their probabilities. For example, suppose we toss two coins. One coin does not influence the other, so the results of the two tosses are independent, and the probability of two heads occurring is 1/2 × 1/2 = 1/4. Consider two independent events A and B. The proportion of times A happens in the long run is the probability of A. Since A and B are independent, of those times when A happens, a proportion, equal to probability of B, will have B happen also. Hence the proportion of times that A and B happen together is the probability of A multiplied by the probability of B.
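The addition and multiplication rules can be checked by listing equally likely outcomes and counting. The sketch below does this for one die and for two coins, using exact fractions; the helper `prob` and the variable names are assumptions introduced for this illustration.

```python
from fractions import Fraction
from itertools import product

def prob(event, outcomes):
    """Proportion of equally likely outcomes for which event(outcome) is true."""
    outcomes = list(outcomes)
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Addition rule: a die shows a one or a two (mutually exclusive events)
die = range(1, 7)
p_one_or_two = prob(lambda face: face in (1, 2), die)
by_addition = prob(lambda face: face == 1, die) + prob(lambda face: face == 2, die)

# Multiplication rule: two coins both show heads (independent events)
two_coins = list(product("HT", repeat=2))
p_two_heads = prob(lambda o: o == ("H", "H"), two_coins)
by_multiplication = (prob(lambda o: o[0] == "H", two_coins)
                     * prob(lambda o: o[1] == "H", two_coins))

print(p_one_or_two, by_addition)
print(p_two_heads, by_multiplication)
```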

6.3 Probability distributions and random variables

Suppose we have a set of events which are mutually exclusive and which includes all the events which can possibly happen. The sum of their probabilities is 1.0. The set of these probabilities make up a probability distribution. For example, if we toss a coin the two possibilities, head or tail, are mutually exclusive and these are the only events which can happen. The probability distribution is:

PROB(head) = 1/2

PROB(tail) = 1/2

Now, let us define a variable, which we will denote by the symbol X, such that X = 0 if the coin shows a tail and X = 1 if the coin shows a head. X is the number of heads shown on a single toss, which must be 0 or 1. We do not know before the toss what X will be, but do know the probability of it having any possible value. X is a random variable (§6.1) and the probability distribution is also the distribution of X. We can represent this with a diagram, as in Figure 6.1(a).

Fig. 6.1. Probability distributions for the number of heads shown in the toss of one coin and in tosses of two coins

What happens if we toss two coins at once? We now have four possible events: a head and a head, a head and a tail, a tail and a head, a tail and a tail. Clearly, these are equally likely and each has probability 1/4. Let Y be the number of heads. Y has three possible values: 0, 1, and 2. Y = 0 only when we get a tail and a tail and has probability 1/4. Similarly, Y = 2 only when we get a head and a head, so has probability 1/4. However, Y = 1 either when we get a head and tail, or when we have a tail and a head, and so has probability 1/4 + 1/4 = 1/2. We can write this probability distribution as:

PROB(Y = 0) = 1/4

PROB(Y = 1) = 1/2

PROB(Y = 2) = 1/4

The probability distribution of Y is shown in Figure 6.1(b).
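The distribution of Y can also be built up by enumerating the four equally likely outcomes and counting the heads in each, as in this short sketch (the variable names are assumptions for illustration).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two coins
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution of Y, the number of heads
dist_y = {y: Fraction(c, len(outcomes)) for y, c in sorted(counts.items())}

for y, p in dist_y.items():
    print(f"PROB(Y = {y}) = {p}")
```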

6.4 The Binomial distribution

We have considered the probability distributions of two random variables: X, the number of heads in one toss of a coin, taking values 0 and 1, and Y, the number of heads when we toss two coins, taking values 0, 1 or 2. We can increase the number of coins; Figure 6.2 shows the distribution of the number of heads obtained when 15 coins are tossed. We do not need the probability of a ‘head’ to be 0.5: we can count the number of sixes when dice are thrown. Figure 6.2 also shows the distribution of the number of sixes obtained from 10 dice. In general, we can think of the coin or the die as trials, which can have outcomes success (head or six) or failure (tail or one to five). The distributions in Figures 6.1 and 6.2 are all examples of the Binomial distribution, which arises frequently in medical applications. The Binomial distribution is the distribution followed by the number of successes in n independent trials when the probability of any single trial being a success is p. The Binomial distribution is in fact a family of distributions, the members of which are defined by the values of n and p. The values which define which member of the distribution family we have are called the parameters of the distribution.

Fig. 6.2. Distribution of the number of heads shown when 15 coins are tossed and of the number of sixes shown when 10 dice are thrown, examples of the Binomial distribution

Simple randomizing devices like coins and dice are of interest in themselves, but not of obvious relevance to medicine. However, suppose we are carrying out a random sample survey to estimate the unknown prevalence, p, of a disease. Since members of the sample are chosen at random and independently from the population, the probability of any chosen subject having the disease is p. We thus have a series of independent trials, each with probability of success p, and the number of successes, i.e. members of the sample with the disease, will follow a Binomial distribution. As we shall see later, the properties of the Binomial distribution enable us to say how accurate is the estimate of prevalence obtained (§8.4).

We could calculate the probabilities for a Binomial distribution by listing all the ways in which, say, 15 coins can fall. However, there are $2^{15} = 32\,768$ combinations of 15 coins, so this is not very practical. Instead, there is a formula for the probability in terms of the number of throws and the probability of a head. This enables us to work these probabilities out for any probability of success and any number of trials. In general, we have n independent trials with the probability that a trial is a success being p. The probability of r successes is

$$\mathrm{PROB}(r\ \text{successes}) = \frac{n!}{r!(n-r)!}\, p^r (1-p)^{n-r}$$

where n!, called n factorial, is n × (n − 1) × (n − 2) × … × 2 × 1. This rather forbidding formula arises like this. For any particular series of r successes, each with probability p, and n − r failures, each with probability 1 − p, the probability of the series happening is $p^r(1-p)^{n-r}$, since the trials are independent and the multiplicative rule applies. The number of ways in which r things may be chosen from n things is $n!/(r!(n-r)!)$ (§6A). Only one combination can happen at one time, so we have $n!/(r!(n-r)!)$ mutually exclusive ways of having r successes, each with probability $p^r(1-p)^{n-r}$. The probability of having r successes is the sum of these $n!/(r!(n-r)!)$ probabilities, giving the formula above. Those who remember the binomial expansion in mathematics will see that this is one term of it, hence the name Binomial distribution.
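The Binomial probability, the number of ways of choosing r things from n multiplied by $p^r(1-p)^{n-r}$, is straightforward to compute. The sketch below uses `math.comb` from the Python standard library for $n!/(r!(n-r)!)$; the function name `binomial_prob` is an assumption made here for illustration.

```python
from math import comb

def binomial_prob(r, n, p):
    """Probability of r successes in n independent trials, each with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Number of heads in tosses of two coins: n = 2, p = 0.5
two_coins = [binomial_prob(r, 2, 0.5) for r in range(3)]

# Number of heads when 15 coins are tossed: the 16 probabilities should sum to 1
fifteen_coins = [binomial_prob(r, 15, 0.5) for r in range(16)]

print(two_coins)
print(sum(fifteen_coins))
```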

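Fig. 6.3. Binomial distributions with different n, p = 0.3

We can apply this to the number of heads in tosses of two coins. The number of heads will be from a Binomial distribution with p = 0.5 and n = 2. Hence the probability of two heads (r = 2) is:

$$\mathrm{PROB}(r = 2) = \frac{2!}{2!\,0!} \times 0.5^2 \times 0.5^0 = \frac{2}{2 \times 1} \times 0.25 \times 1 = 0.25$$

Note that 0! = 1 (§6A), and anything to the power 0 is 1. Similarly for r = 1 and r = 0:

$$\mathrm{PROB}(r = 1) = \frac{2!}{1!\,1!} \times 0.5^1 \times 0.5^1 = \frac{2}{1 \times 1} \times 0.5 \times 0.5 = 0.5$$

$$\mathrm{PROB}(r = 0) = \frac{2!}{0!\,2!} \times 0.5^0 \times 0.5^2 = \frac{2}{1 \times 2} \times 1 \times 0.25 = 0.25$$

This is what was found for two coins in §6.3. We can use this distribution whenever we have a series of trials with two possible outcomes. If we treat a group of patients, the number who recover is from a Binomial distribution. If we measure the blood pressure of a group of people, the number classified as hypertensive is from a Binomial distribution.

Figure 6.3 shows the Binomial distribution for p = 0.3 and increasing values of n. The distribution becomes more symmetrical as n increases. It is converging to the Normal distribution, described in the next chapter.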

6.5 Mean and variance

The number of different probabilities in a Binomial distribution can be very large and unwieldy. When n is large, we usually need to summarize these probabilities in some way. Just as a frequency distribution can be described by its mean and variance, so can a probability distribution and its associated random variable.

The mean is the average value of the random variable in the long run. It is also called the expected value or expectation and the expectation of a random variable X is usually denoted by E(X). For example, consider the number of heads in tosses of two coins. We get 0 heads in 1/4 of pairs of coins, i.e. with probability 1/4. We get 1 head in 1/2 of pairs of coins, and 2 heads in 1/4 of pairs. The average value we should get in the long run is found by multiplying each value by the proportion of pairs in which it occurs and adding:

$$0 \times \tfrac{1}{4} + 1 \times \tfrac{1}{2} + 2 \times \tfrac{1}{4} = 0 + \tfrac{1}{2} + \tfrac{1}{2} = 1$$

If we kept on tossing pairs of coins, the average number of heads per pair would be 1. Thus for any random variable which takes discrete values the mean, expectation or expected value is found by summing each possible value multiplied by its probability.

Note that the expected value of a random variable does not have to be a value that the random variable can actually take. For example, for the mean number of heads in throws of one coin we have either no heads or 1 head, each with probability half, and the expected value is 0 × ½ + 1 × ½ = ½. The number of heads must be 0 or 1, but the expected value is half, the average which we would get in the long run.

The variance of a random variable is the average squared difference from the mean. For the number of heads in tosses of two coins, 0 is 1 unit from the mean and occurs for 1/4 of pairs of coins, 1 is 0 units from the mean and occurs for half of the pairs and 2 is 1 unit from the mean and occurs for 1/4 of pairs, i.e. with probability 1/4. The variance is then found by squaring these differences, multiplying by the proportion of times the difference will occur (the probability) and adding:

$$(0 - 1)^2 \times \tfrac{1}{4} + (1 - 1)^2 \times \tfrac{1}{2} + (2 - 1)^2 \times \tfrac{1}{4} = \tfrac{1}{4} + 0 + \tfrac{1}{4} = \tfrac{1}{2}$$

We denote the variance of a random variable X by VAR(X). In mathematical terms,

$$\mathrm{VAR}(X) = E\left(X^2 - E(X)^2\right) = E(X^2) - E(X)^2$$

The square root of the variance is the standard deviation of the random variable or distribution. We often use the Greek letters µ, pronounced ‘mu’, and σ, ‘sigma’, for the mean and standard deviation of a probability distribution. The variance is then $\sigma^2$.
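The mean and variance of a discrete random variable are simple weighted sums, and can be sketched as below. The distribution is held as a dictionary mapping each value to its probability; the function names `mean` and `variance` are assumptions made for this illustration, and exact fractions are used so that the results match the hand calculation for two coins.

```python
from fractions import Fraction

def mean(dist):
    """Expected value: each possible value multiplied by its probability, summed."""
    return sum(value * p for value, p in dist.items())

def variance(dist):
    """Average squared difference from the mean."""
    m = mean(dist)
    return sum((value - m) ** 2 * p for value, p in dist.items())

# Number of heads in tosses of two coins
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
# Number of heads in one toss of a coin
one_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# The same variance found as E(X^2) - E(X)^2
mean_of_squares = sum(value**2 * p for value, p in two_coins.items())
alt_variance = mean_of_squares - mean(two_coins) ** 2

print(mean(two_coins), variance(two_coins), alt_variance)
print(mean(one_coin))
```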

The mean and variance of the distribution of a continuous variable, of which more in Chapter 7, are defined in a similar way. Calculus is used to define them as integrals, but this need not concern us here. Essentially what happens is that the continuous scale is broken up into many very small intervals and the value of the variable in that very small interval is multiplied by the probability of being in it, then these are added.

6.6 Properties of means and variances

When we use the mean and variance of probability distributions in statistical calculations, it is not the details of their formulae which we need to know, but some of their simple properties. Most of the formulae used in statistical calculations are derived from these. The reasons for these properties are quite easy to see in a non-mathematical way.

If we add a constant to a random variable, the new variable so created has a mean equal to that of the original variable plus the constant. The variance and standard deviation will be unchanged. Suppose our random variable is human height. We can add a constant to the height by measuring the heights of people standing on a box. The mean height of people plus box will now be the mean height of the people plus the constant height of the box. The box will not alter the variability of the heights, however. The difference between the tallest and smallest, for example, will be unchanged. We can subtract a constant by asking the people to stand in a constant hole to be measured. This reduces the mean but leaves the variance unchanged as before. (My free program Clinstat (§1.3) has a simple graphics program which illustrates this.)

If we multiply a random variable by a positive constant, the mean and standard deviation are multiplied by the constant, the variance is multiplied by the square of the constant. For example, if we change our units of measurements, say from inches to centimetres, we multiply each measurement by 2.54. This has the effect of multiplying the mean by the constant, 2.54, and multiplying the standard deviation by the constant since it is in the same units as the observations. However, the variance is measured in squared units, and so is multiplied by the square of the constant. Division by a constant works in the same way. If the constant is negative, the mean is multiplied by the constant and so changes sign. The variance is multiplied by the square of the constant, which is positive, so the variance remains positive. The standard deviation, which is the square root of the variance, is always positive. It is multiplied by the absolute value of the constant, i.e. the constant without the negative sign.

If we add two random variables the mean of the sum is the sum of the means, and, if the two variables are independent, the variance of the sum is the sum of their variances. We can do this by measuring the height of people standing on boxes of random height. The mean height of people on boxes is the mean height of people + the mean height of the boxes. The variability of the heights is also increased. This is because some short people will find themselves on small boxes, and some tall people will find themselves on large boxes. If the two variables are not independent, something different happens. The mean of the sum remains the sum of the means, but the variance of the sum is not the sum of the variances. Suppose our people have decided to stand on the boxes, not just at a statistician's whim, but for a purpose. They wish to change a light bulb, and so must reach a required height. Now the short people must pick large boxes, whereas tall people can make do with small ones. The result is a reduction in variability to almost nothing. On the other hand, if we told the tallest people to find the largest boxes and the shortest to find the smallest boxes, the variability would be increased. Independence is an important condition.

If we subtract one random variable from another, the mean of the difference is the difference between the means, and, if the two variables are independent, the variance of the difference is the sum of their variances. Suppose we measure the heights above ground level of our people standing in holes of random depth. The mean height above ground is the mean height of the people minus the mean depth of the hole. The variability is increased, because some short people stand in deep holes and some tall people stand in shallow holes. If the variables are not independent, the additivity of the variances breaks down, as it did for the sum of two variables. When the people try to hide in the holes, and so must find a hole deep enough to hold them, the variability is again reduced.
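These properties can be verified exactly for a simple random variable. The sketch below uses the score on a fair die as a hypothetical stand-in for height, boxes and holes; the helper names `transform` and `combine` are assumptions for this illustration, and `combine` forms the distribution of a function of two independent variables by multiplying their probabilities.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum((v - m) ** 2 * p for v, p in dist.items())

def transform(dist, f):
    """Distribution of f(X) for a random variable X with distribution dist."""
    out = {}
    for v, p in dist.items():
        out[f(v)] = out.get(f(v), 0) + p
    return out

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y) when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        out[op(x, y)] = out.get(op(x, y), 0) + px * py
    return out

die = {face: Fraction(1, 6) for face in range(1, 7)}  # hypothetical random variable

shifted = transform(die, lambda v: v + 10)             # add a constant
scaled = transform(die, lambda v: -2 * v)              # multiply by a negative constant
total = combine(die, die, lambda x, y: x + y)          # sum of two independent variables
difference = combine(die, die, lambda x, y: x - y)     # difference of two independent variables

for name, d in [("die", die), ("shifted", shifted), ("scaled", scaled),
                ("total", total), ("difference", difference)]:
    print(f"{name:10s} mean = {mean(d)}, variance = {variance(d)}")
```

Adding the constant changes only the mean, multiplying by −2 multiplies the variance by 4, and both the sum and the difference of two independent scores have twice the variance of a single score.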

The effects of multiplying two random variables and of dividing one by another are much more complicated. Fortunately we rarely need to do this.

We can now find the mean and variance of the Binomial distribution with parameters n and p. First consider n = 1. Then the probability distribution is:

PROB(X = 0) = 1 − p

PROB(X = 1) = p

The mean is therefore 0 × (1 − p) + 1 × p = p. The variance is

$$(0 - p)^2 \times (1 - p) + (1 - p)^2 \times p = p(1 - p)(p + 1 - p) = p(1 - p)$$

Now, a variable from the Binomial distribution with parameters n and p is the sum of n independent variables from the Binomial distribution with parameters 1 and p. So its mean is the sum of n means all equal to p, and its variance is the sum of n variances all equal to p(1 − p). Hence the Binomial distribution has mean = np and variance = np(1 − p). For large sample problems, these are more useful than the Binomial probability formula.
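The results mean = np and variance = np(1 − p) can be checked against the Binomial probabilities themselves. The sketch below does so for n = 15 and p = 0.3, values chosen here only for illustration; the function name `binomial_prob` is likewise an assumption.

```python
from math import comb

def binomial_prob(r, n, p):
    """Probability of r successes in n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 15, 0.3  # hypothetical parameter values
probs = [binomial_prob(r, n, p) for r in range(n + 1)]

# Mean and variance computed directly from the probabilities
mean = sum(r * q for r, q in enumerate(probs))
variance = sum((r - mean) ** 2 * q for r, q in enumerate(probs))

print(f"mean = {mean:.6f},     np = {n * p:.6f}")
print(f"variance = {variance:.6f}, np(1-p) = {n * p * (1 - p):.6f}")
```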

The properties of means and variances of random variables enable us to find a formal solution to the problem of degrees of freedom for the sample variance discussed in Chapter 4. We want an estimate of variance whose expected value is the population variance. The expected value of $\sum (x_i - \bar{x})^2$ can be shown to be (n − 1)VAR(x) (§6B) and hence we divide by n − 1, not n, to get our estimate of variance.
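For a small case the expected value of the sum of squares about the sample mean can be found by brute force. The sketch below takes a hypothetical population consisting of the six faces of a die, lists every possible sample of size n = 3 drawn independently (that is, with replacement), and averages $\sum (x_i - \bar{x})^2$ over all of them. All names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

population = range(1, 7)  # hypothetical population: faces of a die, equally likely
n = 3                     # sample size

def sum_of_squares(sample):
    """Sum of squared differences from the sample mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)

# Every possible sample of size n, each equally likely
samples = list(product(population, repeat=n))
expected_ss = sum(sum_of_squares(s) for s in samples) / len(samples)

# Population mean and variance
pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

print(f"expected sum of squares = {expected_ss}")
print(f"(n - 1) x variance      = {(n - 1) * pop_var}")
```

The two quantities agree, so dividing the sum of squares by n − 1 gives an estimate whose expected value is the population variance.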

Fig.6.4.Poissondistributionswithfourdifferentmeans

6.7*ThePoissondistribution

TheBinomialdistributionisoneofmanyprobabilitydistributionswhichareusedinstatistics.Itisadiscretedistribution,thatisitcantakeonlyafinitesetofpossiblevalues,andisthediscretedistributionmostcommonlyencounteredinmedicalapplications.Oneotherdiscretedistributionisworthdiscussingatthispoint,thePoissondistribution.Although,liketheBinomial,thePoissondistributionarisesfromasimpleprobabilitymodel,themathematicsinvolvedismorecomplicatedandwillbeomitted.

Suppose events happen randomly and independently in time at a constant rate. The Poisson distribution is the distribution followed by the number of events which happen in a fixed time interval. If events happen with rate µ events per unit time, the probability of r events happening in unit time is

$$\mathrm{PROB}(r) = \frac{e^{-\mu}\mu^{r}}{r!}$$

where e = 2.718…, the mathematical constant. If events happen randomly and independently in space, the Poisson distribution gives the probabilities for the number of events in unit volume or area.

There is seldom any need to use individual probabilities of this distribution, as its mean and variance suffice. The mean of the Poisson distribution for the number of events per unit time is simply the rate, µ. The variance of the Poisson distribution is also equal to µ. Thus the Poisson is a family of distributions, like the Binomial, but with only one parameter, µ. This distribution is important, because deaths from many diseases can be treated as occurring randomly and independently in the population. Thus, for example, the number of deaths from lung cancer in one year among people in an occupational group, such as coal miners, will be an observation from a Poisson distribution, and we can use this to make comparisons between mortality rates (§16.3).
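For readers who do want individual Poisson probabilities, the calculation is short. The sketch below uses the standard formula e^(-µ) µ^r / r! and checks numerically that the mean and variance both come out equal to µ. The rate µ = 2.5 and the cut-off of 60 terms are arbitrary illustrative choices; the cut-off simply truncates a sum whose remaining terms are negligible for a rate this small.

```python
from math import exp, factorial

def poisson_probability(r, mu):
    """Probability of exactly r events when the expected number is mu."""
    return exp(-mu) * mu**r / factorial(r)

mu = 2.5  # hypothetical rate, events per unit time
probabilities = [poisson_probability(r, mu) for r in range(60)]

total = sum(probabilities)
mean = sum(r * pr for r, pr in enumerate(probabilities))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probabilities))

print("total probability:", total)
print("mean:", mean, "variance:", variance)
```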

Figure 6.4 shows the Poisson distribution for four different means. You will see that as the mean increases the Poisson distribution looks rather like the Binomial distribution in Figure 6.3. We shall discuss this similarity further in the next chapter.

6.8* Conditional probability

Sometimes we need to think about the probability of an event if another event has happened. For example, we might ask what is the probability that a patient has coronary artery disease if he or she has tingling pain in the left arm. This is called a conditional probability, the probability of the event (coronary artery disease) given a condition (tingling pain). We write this probability thus, separating the event and the condition by a vertical bar:

PROB(coronary artery disease | tingling pain)

Conditional probabilities are useful in statistical aids to diagnosis (§15.7). For a simpler example, we can go back to tosses of two coins. If we toss one coin then the other, the first toss alters the probabilities for the possible outcomes for the two coins:

PROB(both coins heads | first coin head) = 0.5

PROB(head and tail | first coin head) = 0.5

PROB(both coins tails | first coin head) = 0.0

and

PROB(both coins heads | first coin tail) = 0.0

PROB(head and tail | first coin tail) = 0.5

PROB(both coins tails | first coin tail) = 0.5

The multiplicative rule (§6.2) can be extended to deal with events which are not independent. For two events A and B:

PROB(A and B) = PROB(A | B) PROB(B) = PROB(B | A) PROB(A).

It is important to understand that PROB(A | B) and PROB(B | A) are not the same. For example, Table 6.1 shows the relationship between two diseases, hay fever and eczema, in a large group of children. The probability that in this group a child with hay fever will have eczema also is

PROB(eczema | hay fever) = 141/1069 = 0.13

the proportion of children with hay fever who have eczema also. This is clearly much less than the probability that a child with eczema will have hay fever,

PROB(hay fever | eczema) = 141/561 = 0.25

the proportion of children with eczema who have hay fever also.

Table 6.1. Relationship between hay fever and eczema at age 11 in the National Child Development Study

             Hay fever
Eczema       Yes       No        Total
Yes          141       420       561
No           928       13525     14453
Total        1069      13945     15014
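The two conditional probabilities can be reproduced directly from the four cell counts of Table 6.1. The sketch below also checks the extended multiplicative rule, PROB(A and B) = PROB(A | B) PROB(B) = PROB(B | A) PROB(A). The variable names are illustrative only.

```python
# Cell counts from Table 6.1 (eczema by hay fever)
both = 141              # eczema and hay fever
eczema_only = 420       # eczema, no hay fever
hay_fever_only = 928    # hay fever, no eczema
neither = 13525

n_children = both + eczema_only + hay_fever_only + neither
n_eczema = both + eczema_only
n_hay_fever = both + hay_fever_only

# Conditional probabilities: restrict attention to the children with the condition
p_eczema_given_hay_fever = both / n_hay_fever
p_hay_fever_given_eczema = both / n_eczema

# Unconditional probabilities for the multiplicative rule
p_both = both / n_children
p_hay_fever = n_hay_fever / n_children
p_eczema = n_eczema / n_children

print(round(p_eczema_given_hay_fever, 2), round(p_hay_fever_given_eczema, 2))
print(p_both, p_eczema_given_hay_fever * p_hay_fever, p_hay_fever_given_eczema * p_eczema)
```

The three numbers on the last line should agree, although the two conditional probabilities themselves are quite different.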

This may look obvious, but confusion between conditional probabilities is common and can cause serious problems, for example in the consideration of forensic evidence. Typically, this will produce the probability that material found at a crime scene (DNA, fibres, etc.) will match the suspect as closely as it does given that the material did not come from the suspect. This is

PROB(evidence | suspect not at crime scene).

It is not the same as

PROB(suspect not at crime scene | evidence),

but this is often how it is interpreted, an inversion known as the prosecutor's fallacy.

Appendices

6A Appendix: Permutations and combinations

For those who never knew, or have forgotten, the theory of combinations, it goes like this. First, we look at the number of permutations, i.e. ways of arranging a set of objects. Suppose we have n objects. How many ways can we order them? The first object can be chosen n ways, i.e. any object. For each first object there are n - 1 possible second objects, so there are n × (n - 1) possible first and second permutations. There are now only n - 2 choices for the third object, n - 3 choices for the fourth, and so on, until there is only one choice for the last. Hence, there are n × (n - 1) × (n - 2) × … × 2 × 1 permutations of n objects. We call this number the factorial of n and write it ‘n!’.

Now we want to know how many ways there are of choosing r objects from n objects. Having made a choice of r objects, we can order those in r! ways. We can also order the n - r not chosen in (n - r)! ways. So the objects can be ordered in r!(n - r)! ways without altering the objects chosen. For example, say we choose the first two from three objects, A, B and C. Then if these are A and B, two permutations give this choice, ABC and BAC. This is, of course, 2! × 1! = 2 permutations. Each combination of r things accounts for r!(n - r)! of the n! permutations possible, so there are

$$\binom{n}{r} = \frac{n!}{r!(n-r)!}$$

possible combinations. For example, consider the number of combinations of two objects out of three, say A, B and C. The possible choices are AB, AC and BC. There is no other possibility. Applying the formula, we have n = 3 and r = 2 so

$$\binom{3}{2} = \frac{3!}{2!(3-2)!} = \frac{3 \times 2 \times 1}{2 \times 1 \times 1} = 3$$

Sometimes in using this formula we come across r = 0 or r = n leading to 0!. This cannot be defined in the way we have chosen, but we can calculate its only possible value, 0! = 1. Because there is only one way of choosing n objects from n, we have

$$\binom{n}{n} = \frac{n!}{n!(n-n)!} = \frac{n!}{n! \times 0!} = \frac{1}{0!} = 1$$

so 0! = 1.
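The counting argument can be checked by listing the arrangements explicitly. This sketch uses Python's standard library; the helper name `n_combinations` is an illustrative choice, and `math.comb` is the built-in equivalent.

```python
from itertools import combinations, permutations
from math import comb, factorial

def n_combinations(n, r):
    """Number of ways of choosing r objects from n: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

objects = ["A", "B", "C"]

# All orderings of three objects: there should be 3! of them
orderings = ["".join(p) for p in permutations(objects)]
print(len(orderings), factorial(3))

# All choices of two objects from three, ignoring order
pairs = ["".join(c) for c in combinations(objects, 2)]
print(pairs, n_combinations(3, 2), comb(3, 2))

# r = n and r = 0 both rely on 0! = 1
print(n_combinations(3, 3), n_combinations(3, 0), factorial(0))
```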

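6B Appendix: Expected value of a sum of squares

The properties of means and variances described in §6.6 can be used to answer the question raised in §4.7 and §4A about the divisor in the sample variance. We ask why the variance from a sample is

$$s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$$

and not

$$\frac{1}{n}\sum (x_i - \bar{x})^2$$

We shall be concerned with the general properties of samples of size n, so we shall treat n as a constant and xᵢ and x̄ as random variables. We shall suppose xᵢ has mean µ and variance σ².

The expected value of the sum of squares is

$$E\left(\sum (x_i - \bar{x})^2\right) = E\left(\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2\right) = E\left(\sum x_i^2\right) - \frac{1}{n}E\left(\left(\sum x_i\right)^2\right)$$

because the expected value of the difference is the difference between the expected values and n is a constant. Now, the population variance σ² is the average squared distance from the population mean µ, so

$$\sigma^2 = E\left((x_i - \mu)^2\right) = E\left(x_i^2 - 2\mu x_i + \mu^2\right) = E(x_i^2) - 2\mu E(x_i) + \mu^2$$

because µ is a constant. Because E(xᵢ) = µ, we have

$$\sigma^2 = E(x_i^2) - 2\mu^2 + \mu^2 = E(x_i^2) - \mu^2$$

and so we find E(xᵢ²) = σ² + µ² and so E(Σxᵢ²) = n(σ² + µ²), being the sum of n numbers all of which are σ² + µ². We now find the value of E((Σxᵢ)²). We need

$$E\left(\sum x_i\right) = \sum E(x_i) = n\mu, \qquad \mathrm{VAR}\left(\sum x_i\right) = \sum \mathrm{VAR}(x_i) = n\sigma^2$$

Just as E(xᵢ²) = σ² + µ² = VAR(xᵢ) + (E(xᵢ))² so

$$E\left(\left(\sum x_i\right)^2\right) = \mathrm{VAR}\left(\sum x_i\right) + \left(E\left(\sum x_i\right)\right)^2 = n\sigma^2 + n^2\mu^2$$

So

$$E\left(\sum (x_i - \bar{x})^2\right) = n(\sigma^2 + \mu^2) - \frac{1}{n}\left(n\sigma^2 + n^2\mu^2\right) = n\sigma^2 + n\mu^2 - \sigma^2 - n\mu^2 = (n-1)\sigma^2$$

So the expected value of the sum of squares is (n - 1)σ² and we must divide the sum of squares by n - 1, not n, to obtain the estimate of the variance, σ².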
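The result can also be confirmed without algebra for a small made-up population, by enumerating every possible sample of size n and averaging the sum of squares about the sample mean, weighting each sample by its probability. The population values and probabilities below are hypothetical, chosen only so that exact fractions can be used, and the approach assumes independent observations, as in the derivation.

```python
from fractions import Fraction
from itertools import product

# Hypothetical discrete population: possible values and their probabilities
values = [0, 1, 4]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

mu = sum(v * p for v, p in zip(values, probs))
sigma2 = sum((v - mu) ** 2 * p for v, p in zip(values, probs))

def expected_sum_of_squares(n):
    """Exact expected value of sum((x_i - xbar)^2) over all samples of size n."""
    total = Fraction(0)
    for indices in product(range(len(values)), repeat=n):
        sample = [values[i] for i in indices]
        probability = Fraction(1)
        for i in indices:
            probability *= probs[i]  # independent observations multiply
        xbar = Fraction(sum(sample), n)
        total += probability * sum((x - xbar) ** 2 for x in sample)
    return total

for n in (2, 3, 4):
    print(n, expected_sum_of_squares(n), (n - 1) * sigma2)
```

For each n the two printed fractions should be identical, which is the reason for dividing by n - 1.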

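We shall find the variance of the sample mean, x̄, useful later (§8.2):

$$\mathrm{VAR}(\bar{x}) = \mathrm{VAR}\left(\frac{1}{n}\sum x_i\right) = \frac{1}{n^2}\mathrm{VAR}\left(\sum x_i\right) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$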

6M Multiple choice questions 25 to 31 (Each branch is either true or false.)

25. The events A and B are mutually exclusive, so:

(a) PROB(A or B) = PROB(A) + PROB(B);
(b) PROB(A and B) = 0;
(c) PROB(A and B) = PROB(A)PROB(B);
(d) PROB(A) = PROB(B);
(e) PROB(A) + PROB(B) = 1.

26. The probability of a woman aged 50 having condition X is 0.20 and the probability of her having condition Y is 0.05. These probabilities are independent:

(a) the probability of her having both conditions is 0.01;
(b) the probability of her having both conditions is 0.25;
(c) the probability of her having either X, or Y, or both is 0.24;
(d) if she has condition X, the probability of her having Y also is 0.01;
(e) if she has condition Y, the probability of her having X also is 0.20.

27. The following variables follow a Binomial distribution:

(a) number of sixes in 20 throws of a die;
(b) human weight;
(c) number of a random sample of patients who respond to a treatment;
(d) number of red cells in 1 ml of blood;
(e) proportion of hypertensives in a random sample of adult men.

28. Two parents each carry the same recessive gene which each transmits to their child with probability 0.5. If their child will develop clinical disease if it inherits the gene from both parents and will be a carrier if it inherits the gene from one parent only then:

(a) the probability that their next child will have clinical disease is 0.25;
(b) the probability that two successive children will both develop clinical disease is 0.25 × 0.25;
(c) the probability their next child will be a carrier without clinical disease is 0.50;
(d) the probability of a child being a carrier or having clinical disease is 0.75;
(e) if the first child does not have clinical disease, the probability that the second child will not have clinical disease is 0.75².

Table 6.2. Number of men remaining alive at ten year intervals (from English Life Table No. 11, Males)

Age in years, x    Number surviving, lx    Age in years, x    Number surviving, lx
 0                 1000                     60                758
10                  959                     70                524
20                  952                     80                211
30                  938                     90                 22
40                  920                    100                  0
50                  876

29. If a coin is spun twice in succession:

(a) the expected number of tails is 1.5;
(b) the probability of two tails is 0.25;
(c) the number of tails follows a Binomial distribution;
(d) the probability of at least one tail is 0.5;
(e) the distribution of the number of tails is symmetrical.

30. If X is a random variable, mean µ and variance σ²:

(a) E(X + 2) = µ;
(b) VAR(X + 2) = σ²;
(c) E(2X) = 2µ;
(d) VAR(2X) = 2σ²;
(e) VAR(X/2) = σ²/4.

31. If X and Y are independent random variables:

(a) VAR(X + Y) = VAR(X) + VAR(Y);
(b) E(X + Y) = E(X) + E(Y);
(c) E(X - Y) = E(X) - E(Y);
(d) VAR(X - Y) = VAR(X) - VAR(Y);
(e) VAR(-X) = -VAR(X).

6E Exercise: Probability and the life table

In this exercise we shall apply some of the basic laws of probability to a practical exercise. The data are based on a life table. (I shall say more about these in §16.4.) Table 6.2 shows the number of men, from a group numbering 1000 at birth, who we would expect to be alive at different ages. Thus, for example, after 10 years, we see that 959 survive and so 41 have died, at 20 years 952 survive and so 48 have died, 41 between ages 0 and 9 and 7 between ages 10 and 19.

1. What is the probability that an individual chosen at random will survive to age 10?

2. What is the probability that this individual will die before age 10? Which property of probability does this depend on?

3. What are the probabilities that the individual will survive to ages 10, 20, 30, 40, 50, 60, 70, 80, 90, 100? Is this set of probabilities a probability distribution?

4. What is the probability that an individual aged 60 years survives to age 70?

5. What is the probability that two men aged 60 will both survive to age 70? Which property of probability is used here?

6. If we had 100 individuals aged 60, how many would we expect to attain age 70?

7. What is the probability that a man dies in his second decade? You can use the fact that PROB(death in 2nd) + PROB(survives to 3rd) = PROB(survives to 2nd).

8. For each decade, what is the probability that a given man will die in that decade? This is a probability distribution. Why? Sketch the distribution.

9. As an approximation, we can assume that the average number of years lived in the decade of death is 5. Thus, those who die in the 2nd decade will have an average lifespan of 15 years. The probability of dying in the 2nd decade is 0.007, i.e. a proportion 0.007 of men have a mean lifetime of 15 years. What is the mean lifetime of all men? This is the expectation of life at birth.



7

The Normal distribution

7.1 Probability for continuous variables

When we derived the theory of probability in the discrete case, we were able to say what the probability was of a random variable taking a particular value. As the number of possible values increases, the probability of a particular value decreases. For example, in the Binomial distribution with p = 0.5 and n = 2, the most likely value, 1, has probability 0.5. In the Binomial distribution with p = 0.5 and n = 100 the most likely value, 50, has probability 0.08. In such cases we are usually more interested in the probability of a range of values than one particular value.

For a continuous variable, such as height, the set of possible values is infinite and the probability of any particular value is zero (§6.1). We are interested in the probability of the random variable taking values between certain limits rather than taking particular values. If the proportion of individuals in the population whose values are between given limits is p, and we choose an individual at random, the probability of choosing an individual who lies between these limits is equal to p. This comes from our definition of probability, the choice of each individual being equally likely. The problem is finding and giving a value to this probability.

When we find the frequency distribution for a sample of observations, we count the number of values which fall within certain limits (§4.2). We can represent this as a histogram such as Figure 7.1 (§4.3). One way of presenting the histogram is as relative frequency density, the proportion of observations in the interval per unit of X (§4.3). Thus, when the interval size is 5, the relative frequency density is the relative frequency divided by 5 (Figure 7.1). The relative frequency in an interval is now represented by the width of the interval multiplied by the density, which gives the area of the rectangle. Thus, the relative frequency between any two points can be found from the area under the histogram between the points. For example, to estimate the relative frequency between 10 and 20 in Figure 7.1 we have the density from 10 to 15 as 0.05 and between 15 and 20 as 0.03. Hence the relative frequency is

0.05 × (15 - 10) + 0.03 × (20 - 15) = 0.25 + 0.15 = 0.40

If we take a larger sample we can use smaller intervals. We get a smoother looking histogram, as in Figure 7.2, and as we take larger and larger samples, and so smaller and smaller intervals, we get a shape very close to a smooth curve (Figure 7.3). As the sample size approaches that of the population, which we can assume to be very large, this curve becomes the relative frequency density of the whole population. Thus we can find the proportion of observations between any two limits by finding the area under the curve, as indicated in Figure 7.3.

Fig. 7.1. Histogram showing relative frequency density

Fig. 7.2. The effect on a frequency distribution of increasing sample size

If we know the equation of this curve, we can find the area under it. (Mathematically we do this by integration, but we do not need to know how to integrate to use or to understand practical statistics; all the integrals we need have been done and tabulated.) Now, if we choose an individual at random, the probability that X lies between any given limits is equal to the proportion of individuals who fall between these limits. Hence, the relative frequency distribution for the whole population gives us the probability distribution of the variable. We call this curve the probability density function.

Fig. 7.3. Relative frequency density or probability density function, showing the probability of an observation between 10 and 20

Fig. 7.4. Mean, µ, standard deviation, σ, and a probability density function

Probability density functions have a number of general properties. For example, the total area under the curve must be one, since this is the total probability of all possible events. Continuous random variables have means, variances and standard deviations defined in a similar way to those for discrete random variables and possessing the same properties (§6.5). The mean will be somewhere near the middle of the curve and most of the area under the curve will be between the mean minus two standard deviations and the mean plus two standard deviations (Figure 7.4).

The precise shape of the curve is more difficult to ascertain. There are many possible probability density functions and some of these can be shown to arise from simple probability situations, as were the Binomial and Poisson distributions. However, most continuous variables with which we have to deal, such as height, blood pressure, serum cholesterol, etc., do not arise from simple probability situations. As a result, we do not know the probability distribution for these measurements on theoretical grounds. As we shall see, we can often find a standard distribution whose mathematical properties are known, which fits observed data well and which enables us to draw conclusions about them. Further, as sample size increases the distribution of certain statistics calculated from the data, such as the mean, becomes independent of the distribution of the observations themselves and follows one particular distribution form, the Normal distribution. We shall devote the remainder of this chapter to a study of this distribution.

Fig. 7.5. Binomial distributions for p = 0.3 and six different values of n, with corresponding Normal distribution curves

7.2 The Normal distribution

The Normal distribution, also known as the Gaussian distribution, may be regarded as the fundamental probability distribution of statistics. The word ‘normal’ is not used here in its common meaning of ‘ordinary or common’, or its medical meaning of ‘not diseased’. The usage relates to its older meaning of ‘conforming to a rule or pattern’, and as we shall see, the Normal distribution is the form to which the Binomial distribution tends as its parameter n increases. There is no implication that most variables follow a Normal distribution.

We shall start by considering the Binomial distribution as n increases. We saw in §6.4 that, as n increases, the shape of the distribution changes. The most extreme possible values become less likely and the distribution becomes more symmetrical. This happens whatever the value of p. The position of the distribution along the horizontal axis, and its spread, are still determined by p, but the shape is not. A smooth curve can be drawn which goes very close to these points. This is the Normal distribution curve, the curve of the continuous distribution which the Binomial distribution approaches as n increases.

Any Binomial distribution may be approximated by the Normal distribution of the same mean and variance provided n is large enough. Figure 7.5 shows the Binomial distributions of Figure 6.3 with the corresponding Normal distribution curves. From n = 10 onwards the two distributions are very close. Generally, if both np and n(1 - p) exceed 5 the approximation of the Binomial to the Normal distribution is quite good enough for most practical purposes. See §8.4 for an application. The Poisson distribution has the same property, as Figure 6.4 suggests.
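How close the approximation is can be examined numerically. The sketch below compares the exact Binomial probability of r or fewer successes with the corresponding area under the Normal curve of the same mean and variance, for p = 0.3 as in Figure 7.5. Two things here are assumptions of the sketch rather than part of the text: the sample sizes 5, 20 and 100 are arbitrary, and the Normal area is taken up to r + 0.5 (a continuity correction), which is a common refinement when a discrete distribution is compared with a continuous one.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cumulative(r, n, p):
    """Exact probability of r or fewer successes in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r + 1))

def normal_cumulative(r, n, p):
    """Area to the left of r + 0.5 under the Normal curve with mean np, variance np(1 - p)."""
    approximation = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approximation.cdf(r + 0.5)

def max_error(n, p):
    """Largest disagreement between the two cumulative probabilities."""
    return max(abs(binomial_cumulative(r, n, p) - normal_cumulative(r, n, p))
               for r in range(n + 1))

p = 0.3
for n in (5, 20, 100):
    rule_satisfied = n * p > 5 and n * (1 - p) > 5
    print(n, rule_satisfied, round(max_error(n, p), 4))
```

The largest disagreement shrinks as n grows, in line with the rule of thumb that both np and n(1 - p) should exceed 5.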

Fig. 7.6. Sums of observations from a Uniform distribution

The Binomial variable may be regarded as the sum of n independent identically distributed random variables, each being the outcome of one trial taking value 1 with probability p. In general, if we have any series of independent, identically distributed random variables, then their sum tends to a Normal distribution as the number of variables increases. This is known as the central limit theorem. As most sets of measurements are observations of such a series of random variables, this is a very important property. From it, we can deduce that the sum or mean of any large series of independent observations follows a Normal distribution.

For example, consider the Uniform or Rectangular distribution. This is the distribution where all values between two limits, say 0 and 1, are equally likely and no other values are possible. Observations from this arise if we take random digits from a table of random numbers such as Table 2.3. Each observation of the Uniform variable is formed by a series of such digits placed after a decimal point. On a microcomputer, this is usually the distribution produced by the RND(X) function in the BASIC language. Figure 7.6 shows the histogram for the frequency distribution of 500 observations from the Uniform distribution between 0 and 1. It is quite different from the Normal distribution. Now suppose we create a new variable by taking two Uniform variables and adding them (Figure 7.6). The shape of the distribution of the sum of two Uniform variables is quite different from the shape of the Uniform distribution. The sum is unlikely to be close to either extreme, here 0 or 2, and observations are concentrated in the middle near the expected value. The reason for this is that to obtain a low sum, both the Uniform variables forming it must be low; to make a high sum both must be high. But we get a sum near the middle if the first is high and the second low, or the first is low and second high, or both first and second are moderate. The distribution of the sum of two is much closer to the Normal than is the Uniform distribution itself. However, the abrupt cut-off at 0 and at 2 is unlike the corresponding Normal distribution. Figure 7.6 also shows the result of adding four Uniform variables and six Uniform variables. The similarity to the Normal distribution increases as the number added increases and for the sum of six the correspondence is so close that the distributions could not easily be told apart.
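The experiment behind Figure 7.6 is easy to repeat by simulation. The sketch below draws 500 observations of the sum of 1, 2, 4 and 6 Uniform variables and reports, for each, the sample mean, the sample standard deviation, and the proportion of observations within one standard deviation of the mean. It relies on two standard results that are not derived in the text: a Uniform variable on 0 to 1 has mean 1/2 and variance 1/12, so the sum of k of them has mean k/2 and standard deviation √(k/12). The seed and function names are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sums_of_uniforms(k, n_obs, rng):
    """Return n_obs observations, each the sum of k Uniform(0, 1) variables."""
    return [sum(rng.random() for _ in range(k)) for _ in range(n_obs)]

def proportion_within_one_sd(data):
    """Proportion of observations within one sample SD of the sample mean."""
    m, s = mean(data), stdev(data)
    return sum(1 for x in data if abs(x - m) <= s) / len(data)

rng = random.Random(2000)  # arbitrary fixed seed so that the run is repeatable
for k in (1, 2, 4, 6):
    data = sums_of_uniforms(k, 500, rng)
    print(k,
          round(mean(data), 2), k / 2,                 # observed and theoretical mean
          round(stdev(data), 2), round(sqrt(k / 12), 2),  # observed and theoretical SD
          round(proportion_within_one_sd(data), 2))
```

For a Normal distribution about 68% of observations lie within one standard deviation of the mean. The single Uniform variable should give a noticeably smaller proportion, and the proportion should move towards 0.68 as more variables are added, subject to sampling fluctuation.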

The approximation of the Binomial to the Normal distribution is a special case of the central limit theorem. The Poisson distribution is another. If we take a set of Poisson variables with the same rate and add them, we will get a variable which is the number of random events in a longer time interval (the sum of the intervals for the individual variables) and which is therefore a Poisson distribution with increased mean. As it is the sum of a set of independent, identically distributed random variables it will tend towards the Normal as the mean increases. Hence as the mean increases the Poisson distribution becomes approximately Normal. For most practical purposes this is when the mean exceeds 10. The similarity between the Poisson and the Binomial noted in §6.7 is a part of a more general convergence shown by many other distributions.

7.3 Properties of the Normal distribution

In its simplest form the equation of the Normal distribution curve, called the Standard Normal distribution, is usually denoted by φ(z), where φ is the Greek letter ‘phi’:

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$

where π is the usual mathematical constant. The medical reader can be reassured that we do not need to use this forbidding formula in practice. The Standard Normal distribution has a mean of 0, a standard deviation of 1 and a shape as shown in Figure 7.7. The curve is symmetrical about the mean and often described as ‘bell-shaped’ (though I have never seen a bell like it). We can note that most of the area, i.e. the probability, is between -1 and +1, the large majority between -2 and +2, and almost all between -3 and +3.

Although the Normal distribution curve has many remarkable properties, it has one rather awkward one: it cannot be integrated. In other words, there is no simple formula for the probability of a random variable from a Normal distribution lying between given limits. The areas under the curve can be found numerically, however, and these have been calculated and tabulated. Table 7.1 shows the area under the probability density curve for different values of the Normal distribution. To be more precise, for a value z the table shows the area under the curve to the left of z, i.e. from minus infinity to z (Figure 7.8). Thus Φ(z) is the probability that a value chosen at random from the Standard Normal distribution will be less than z. Φ is the Greek capital ‘phi’. Note that half this table is not strictly necessary. We need only the half for positive z as Φ(-z) + Φ(z) = 1. This arises from the symmetry of the distribution. To find the probability of z lying between two values a and b, where b > a, we find Φ(b) - Φ(a). To find the probability of z being greater than a we find 1 - Φ(a). These formulae are all examples of the additive law of probability. Table 7.1 gives only a few values of z, and much more extensive ones are available (Lindley and Miller 1955, Pearson and Hartley 1970). Good statistical computer programs will calculate these values when they are needed.
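As an illustration of a program calculating these values, the sketch below uses the `NormalDist` class from Python's standard `statistics` module (available from Python 3.8). The helper names `Phi`, `prob_between` and `prob_greater` are illustrative, not standard.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def Phi(z):
    """Area under the Standard Normal curve to the left of z."""
    return standard_normal.cdf(z)

def prob_between(a, b):
    """Probability of a Standard Normal value lying between a and b, where b > a."""
    return Phi(b) - Phi(a)

def prob_greater(a):
    """Probability of a Standard Normal value exceeding a."""
    return 1 - Phi(a)

print(round(Phi(-1.0), 3), round(Phi(0.5), 3))   # compare with Table 7.1
print(Phi(-0.7) + Phi(0.7))                      # symmetry: should be 1
print(round(prob_between(-1, 1), 3), round(prob_between(-2, 2), 3))
print(round(prob_greater(2.0), 3))
```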

Fig. 7.7. The Standard Normal distribution

Table 7.1. The Normal distribution

   z     Φ(z)       z     Φ(z)       z     Φ(z)       z     Φ(z)
-3.0    0.001    -2.0    0.023    -1.0    0.159     0.0    0.500
-2.9    0.002    -1.9    0.029    -0.9    0.184     0.1    0.540
-2.8    0.003    -1.8    0.036    -0.8    0.212     0.2    0.579
-2.7    0.003    -1.7    0.045    -0.7    0.242     0.3    0.618
-2.6    0.005    -1.6    0.055    -0.6    0.274     0.4    0.655
-2.5    0.006    -1.5    0.067    -0.5    0.309     0.5    0.691
-2.4    0.008    -1.4    0.081    -0.4    0.345     0.6    0.726
-2.3    0.011    -1.3    0.097    -0.3    0.382     0.7    0.758
-2.2    0.014    -1.2    0.115    -0.2    0.421     0.8    0.788
-2.1    0.018    -1.1    0.136    -0.1    0.460     0.9    0.816
-2.0    0.023    -1.0    0.159     0.0    0.500     1.0    0.841

There is another way of tabulating a distribution, using what are called percentage points. The one-sided P percentage point of a distribution is the value z such that there is a probability P% of an observation from that distribution being greater than or equal to z (Figure 7.8). The two-sided P percentage point is the value z such that there is a probability P% of an observation being greater than or equal to z or less than or equal to -z (Figure 7.8). Table 7.2 shows both one-sided and two-sided percentage points for the Normal distribution. The probability is quoted as a percentage because when we use percentage points we are usually concerned with rather small probabilities, such as 0.05 or 0.01, and use of the percentage form, making them 5% and 1%, cuts out the leading zero.

Table 7.2. Percentage points of the Normal distribution

    One-sided             Two-sided
P1(z)       z         P2(z)       z
50        0.00
25        0.67        50        0.67
10        1.28        20        1.28
 5        1.64        10        1.64
 2.5      1.96         5        1.96
 1        2.33         2        2.33
 0.5      2.58         1        2.58
 0.1      3.09         0.2      3.09
 0.05     3.29         0.1      3.29

The table shows the probability P1(z) of a Normal variable with mean 0 and variance 1 being greater than z, and the probability P2(z) of a Normal variable with mean 0 and variance 1 being less than -z or greater than z.
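Percentage points are the inverse of the calculation of Φ(z): given a tail probability, we want the value of z. A minimal sketch, again assuming Python's standard `statistics.NormalDist` and using illustrative function names, is:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def one_sided_point(percent):
    """z with probability percent% of an observation being greater than z."""
    return standard_normal.inv_cdf(1 - percent / 100)

def two_sided_point(percent):
    """z with probability percent% of an observation lying outside -z to +z."""
    return standard_normal.inv_cdf(1 - percent / 200)  # half the probability in each tail

for percent in (25, 10, 5, 2.5, 1, 0.5, 0.1, 0.05):
    print(percent, round(one_sided_point(percent), 2))

for percent in (50, 20, 10, 5, 2, 1, 0.2, 0.1):
    print(percent, round(two_sided_point(percent), 2))
```

The two-sided P point is simply the one-sided P/2 point, which is why the z columns of Table 7.2 are the same on both sides.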

Fig. 7.8. One- and two-sided percentage points (5%) of the Standard Normal distribution

So far we have examined the Normal distribution with mean 0 and standard deviation 1. If we add a constant µ to a Standard Normal variable, we get a new variable which has mean µ (see §6.6). Figure 7.9 shows the Normal distribution with mean 0 and the distribution obtained by adding 1 to it together with their two-sided 5% points. The curves are identical apart from a shift along the axis.

On the curve with mean 0 nearly all the probability is between -3 and +3. For the curve with mean 1 it is between -2 and +4, i.e. between the mean - 3 and the mean + 3. The probability of being a given number of units from the mean is the same for both distributions, as is also shown by the 5% points.

Fig. 7.9. Normal distributions with different means and with different variances, showing two-sided 5% points

If we take a Standard Normal variable, with standard deviation 1, and multiply by a constant σ we get a new variable which has standard deviation σ. Figure 7.9 shows the Normal distribution with mean 0 and standard deviation 1 and the distribution obtained by multiplying by 2. The curves do not appear identical. For the distribution with standard deviation 2, nearly all the probability is between -6 and +6, a much wider interval than the -3 and +3 for the standard distribution. The values -6 and +6 are -3 and +3 standard deviations. We can see that the probability of being a given number of standard deviations from the mean is the same for both distributions. This is also seen from the 5% points, which represent the mean plus or minus 1.96 standard deviations in each case.

In fact if we multiply a Standard Normal variable by σ and then add µ, we get a Normal distribution of mean µ and standard deviation σ. Tables 7.1 and 7.2 apply to it directly, if we denote by z the number of standard deviations above the mean, rather than the numerical value of the variable. Thus, for example, the two-sided 5% points of a Normal distribution with mean 10 and standard deviation 5 are found by 10 - 1.96 × 5 = 0.2 and 10 + 1.96 × 5 = 19.8, the value 1.96 being found from Table 7.2.
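The worked example with mean 10 and standard deviation 5 can be reproduced as follows. This is a sketch using Python's standard `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 10, 5
z = NormalDist().inv_cdf(0.975)  # two-sided 5% point of the Standard Normal, about 1.96

lower = mean - z * sd
upper = mean + z * sd
print(round(lower, 1), round(upper, 1))

# Check: the Normal distribution with this mean and SD has 95% of its
# probability between the two limits
distribution = NormalDist(mu=mean, sigma=sd)
coverage = distribution.cdf(upper) - distribution.cdf(lower)
print(round(coverage, 3))
```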

This property of the Normal distribution, that multiplying or adding constants still gives a Normal distribution, is not as obvious as it might seem. The Binomial does not have it, for example. Take a Binomial variable with n = 3, possible values 0, 1, 2, and 3, and multiply by 2. The possible values are now 0, 2, 4, and 6. The Binomial distribution with n = 6 has also possible values 1, 3, and 5, so the distributions are different and the one which we have derived is not a member of the Binomial family.

We have seen that adding a constant to a variable from a Normal distribution gives another variable which follows a Normal distribution. If we add two variables from Normal distributions together, even with different means and variances, the sum follows a Normal distribution. The difference between two variables from Normal distributions also follows a Normal distribution.

Fig. 7.10. Distribution of height in a sample of 1794 pregnant women (data of Brooke et al. 1989)

Fig. 7.11. Distribution of serum triglyceride (Table 4.8) and log10 triglyceride in cord blood for 282 babies, with corresponding Normal distribution curves

7.4 Variables which follow a Normal distribution

So far we have discussed the Normal distribution as it arises from sampling as the sum or limit of other distributions. However, many naturally occurring variables, such as human height, appear to follow a Normal distribution very closely. We might expect this to happen if the variable were the result of adding variation from a number of different sources. The process shown by the central limit theorem may well produce a result close to Normal. Figure 7.10 shows the distribution of height in a sample of pregnant women, and the corresponding Normal distribution curve. The fit to the Normal distribution is very good.

If the variable we measure is the result of multiplying several different sources of variation we would not expect the result to be Normal from the properties discussed in §7.2, which were all based on addition of variables. However, if we take the log transformation of such a variable (§5A) we would then get a new variable which is the sum of several different sources of variation and which may well have a Normal distribution.

This process often happens with quantities which are part of metabolic pathways, the rate at which reaction can take place depending on the concentrations of other compounds. Many measurements of blood constituents exhibit this, for example. Figure 7.11 shows the distribution of serum triglyceride measured in cord blood for 282 babies (Table 4.8). The distribution is highly skewed and quite unlike the Normal distribution curve. However, when we take the logarithm of the triglyceride concentration, we have a remarkably good fit to the Normal distribution (Fig. 7.11). If the logarithm of a random variable follows a Normal distribution, the random variable itself follows a Lognormal distribution.

We often want to change the scale on which we analyse our data so as to get a Normal distribution. We call this process of analysing a mathematical function of the data rather than the data themselves transformation. The logarithm is the transformation most often used, the square root and reciprocal are others (see also §10.4). For a single sample, transformation enables us to use the Normal distribution to estimate centiles (§4.5). For example, we often want to estimate the 2.5th and 97.5th centiles, which together enclose 95% of the observations. For a Normal distribution, these can be estimated by x̄ ± 1.96s. We can transform the data so that the distribution is Normal, calculate the centile, and then transform back to the original scale.

Consider the triglyceride data of Figure 7.11 and Table 4.8. The mean is 0.51 and the standard deviation 0.22. The mean for the log10 transformed data is -0.33 and the standard deviation is 0.17. What happens if we transform back by the antilog? For the mean, we get 10^(-0.33) = 0.47. This is less than the mean for the raw data. The antilog of the mean log is not the same as the untransformed arithmetic mean. In fact, this is the geometric mean, which is the nth root of the product of the observations. If we add the logs of the observations we get the log of their product (§5A). If we multiply the log of a number by a second number, we get the log of the first raised to the power of the second. So if we divide the log by n, we get the log of the nth root. Thus the mean of the logs is the log of the geometric mean. On back transformation, the reciprocal transformation also yields a mean with a special name, the harmonic mean, the reciprocal of the mean of the reciprocals.

The geometric mean is in the original units. If triglyceride is measured in mmol/litre, the log of a single observation is the log of a measurement in mmol/litre. The sum of n logs is the log of the product of n measurements in mmol/litre and is the log of a measurement in mmol/litre to the nth. The nth root is thus again the log of a number in mmol/litre and the antilog is back in the original units, mmol/litre (see §5A).

The antilog of the standard deviation, however, is not measured in the original units. To calculate the standard deviation we take the difference between each log observation and the log geometric mean, using the usual formula Σ(xᵢ - x̄)²/(n - 1) (§4.8). Thus we have the difference between the log of two numbers each measured in mmol/litre, giving the log of their ratio (§5A) which is the log of a dimensionless pure number. It would be the same if the triglycerides were measured in mmol/litre or mg/100ml. We cannot transform the standard deviation back to the original scale.

If we want to use the standard deviation, it is easiest to do all calculations on the transformed scale and transform back, if necessary, at the end. For example, the 2.5th centile on the log scale is -0.33 - 1.96 × 0.17 = -0.66 and the 97.5th centile is -0.33 + 1.96 × 0.17 = 0.00. To get these we took the log of something in mmol/litre and added or subtracted the log of a pure number (i.e. multiplied on the natural scale), so we still have the log of something in mmol/litre. To get back to the original scale we antilog to get 2.5th centile = 0.22 and 97.5th centile = 1.00 mmol/litre.
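The back-transformation can be followed step by step in a few lines. The sketch below starts from the summary statistics quoted in the text for the log10 triglyceride data (mean -0.33, standard deviation 0.17); it does not use the raw data of Table 4.8. The log-scale centiles are rounded to two decimal places before taking antilogs, as in the worked figures above; this rounding is a choice made here to match the text.

```python
# Summary statistics on the log10 scale, as quoted in the text
mean_log = -0.33
sd_log = 0.17

# Antilog of the mean log is the geometric mean, in mmol/litre
geometric_mean = 10 ** mean_log

# 2.5th and 97.5th centiles on the log scale, rounded to two decimals as in the text
lower_log = round(mean_log - 1.96 * sd_log, 2)
upper_log = round(mean_log + 1.96 * sd_log, 2)

# Transform back to the original scale only at the end
lower = 10 ** lower_log
upper = 10 ** upper_log

print(round(geometric_mean, 2))
print(lower_log, upper_log)
print(round(lower, 2), round(upper, 2))
```

Without the intermediate rounding the upper limit would come out at about 1.01 rather than 1.00, a reminder that rounding is best left to the final step in practical work.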

Transforming the data to a Normal distribution and then analysing on the transformed scale may look like cheating. I do not think it is. The scale on which we choose to measure things need not be linear, though this is often convenient. Other scales can be much more useful. We measure pH on a logarithmic scale, for example. Should the magnitude of an earthquake be measured in mm of amplitude (linear) or on the Richter scale (logarithmic)? Should spectacle lenses be measured in terms of focal length in cm (linear) or dioptres (reciprocal)? We often choose non-linear scales because they suit our purpose and for statistical analysis it often suits us to make the distribution Normal, by finding a scale of measurement where this is the case.

7.5 The Normal plot

Many statistical methods can only be used if the observations follow a Normal distribution (see Chapters 10 and 11). There are several ways of investigating whether observations follow a Normal distribution. With a large sample we can inspect a histogram to see whether it looks like a Normal distribution curve. This does not work well with a small sample, and a more reliable method is the Normal plot. This is a graphical method, which can be done using ordinary graph paper and a table of the Normal distribution, with specially printed Normal probability paper, or, much more easily, using a computer. Any good general statistical package will give Normal plots; if it does not then it is not a good package. The Normal plot method can be used to investigate the Normal assumption in samples of any size, and is a very useful check when using methods such as the t distribution methods described in Chapter 10.

The Normal plot is a plot of the cumulative frequency distribution for the data against the cumulative frequency distribution for the Normal distribution. First, we order the data from lowest to highest. For each ordered observation we find the expected value of the observation if the data followed a Standard Normal distribution. There are several approximate formulae for this. I shall follow Armitage and Berry (1994) and use for the ith observation z where Φ(z) = (i - 0.5)/n. Some books and programs use Φ(z) = i/(n + 1) and there are other more complex formulae. It does not make much difference which is used. We find from a table of the Normal distribution the values of z which correspond to Φ(z) = 0.5/n, 1.5/n, etc. (Table 7.1 lacks detail for practical work, but will do for illustration.) For 5 points, for example, we have Φ(z) = 0.1, 0.3, 0.5, 0.7, and 0.9, and z = -1.3, -0.5, 0, 0.5, and 1.3. These are the points of the Standard Normal distribution which correspond to the observed data. Now, if the observed data come from a Normal distribution of mean µ and variance σ², the observed point should equal σz + µ, where z is the corresponding point of the Standard Normal distribution. If we plot the Standard Normal points against the observed values we should get something close to a straight line. We can write the equation of this line as σz + µ = x, where x is the observed variable and z the corresponding quantile of the Standard Normal distribution. We can rewrite this as

$$z = \frac{x - \mu}{\sigma} = \frac{x}{\sigma} - \frac{\mu}{\sigma}$$

which goes through the point defined by (µ, 0) and has slope 1/σ (see §11.1). If the data are not from a Normal distribution we will not get a straight line, but a curve of some sort. Because we plot the quantiles of the observed frequency distribution against the corresponding quantiles of the theoretical (here Normal) distribution, this is also referred to as a quantile-quantile plot or q-q plot.

Table 7.3. Vitamin D levels measured in the blood of 26 healthy men, data of Hickish et al. (1989)

14  25  30  42  54
17  26  31  43  54
20  26  31  46  63
21  26  32  48  67
22  27  35  52  83
24

Table 7.4. Calculation of the Normal plot for the vitamin D data

i Vit D Φ(z) z i Vit D Φ(z) z

1 14 0.019 -2.07 14 31 0.519 0.05

2 17 0.058 -1.57 15 32 0.558 0.15

3 20 0.096 -1.30 16 35 0.596 0.24

4 21 0.135 -1.10 17 42 0.635 0.34

5 22 0.173 -0.94 18 43 0.673 0.45

6 24 0.212 -0.80 19 46 0.712 0.56

7 25 0.250 -0.67 20 48 0.750 0.67

8 26 0.288 -0.56 21 52 0.788 0.80

9 26 0.327 -0.45 22 54 0.827 0.94

10 26 0.365 -0.34 23 54 0.865 1.10

11 27 0.404 -0.24 24 63 0.904 1.30

12 30 0.442 -0.15 25 67 0.942 1.57

13 31 0.481 -0.05 26 83 0.981 2.07

Φ(z)=(i-0.5)/26

Fig. 7.12. Blood vitamin D levels and log10 vitamin D for 26 normal men, with Normal plots

Table 7.3 shows vitamin levels measured in the blood of 26 healthy men. The calculation of the Normal plot is shown in Table 7.4. Note that the Φ(z) = (i - 0.5)/26 and z are symmetrical, the second half being the first half with opposite sign. The value of the Standard Normal deviate, z, can be found by interpolation in Table 7.1, by using a fuller table, or by computer. Figure 7.12 shows the histogram and the Normal plot for these data. The distribution is skew and the Normal plot shows a pronounced curve. Figure 7.12 also shows the vitamin D data after log transformation. It is quite easy to produce the Normal plot, as the corresponding Standard Normal deviate, z, is unchanged. We only need to log the observations and plot again. The Normal plot for the transformed data conforms very well to the theoretical line, suggesting that the distribution of log vitamin D level is close to the Normal.
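The calculation of Table 7.4 can be sketched in a few lines of Python. This is only an illustration, not the program used to draw Figure 7.12: it assumes the inverse Normal distribution function from Python's standard `statistics` module in place of Table 7.1, and the names `vitamin_d` and `normal_scores` are invented for the example.

```python
from math import log10
from statistics import NormalDist

# Vitamin D levels from Table 7.3, in ascending order.
vitamin_d = [14, 17, 20, 21, 22, 24, 25, 26, 26, 26, 27, 30, 31,
             31, 32, 35, 42, 43, 46, 48, 52, 54, 54, 63, 67, 83]


def normal_scores(n):
    """Return z_i with Phi(z_i) = (i - 0.5)/n for i = 1, ..., n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


z = normal_scores(len(vitamin_d))

# Points for the two Normal plots: z is the same for both, only the
# observations are transformed.
raw_points = list(zip(sorted(vitamin_d), z))
log_points = [(log10(x), zi) for x, zi in raw_points]

# Print the columns i, Vit D, Phi(z), z as in Table 7.4.
for i, (x, zi) in enumerate(raw_points, start=1):
    print(f"{i:2d} {x:3d} {(i - 0.5) / len(vitamin_d):.3f} {zi:6.2f}")
```

Plotting `raw_points` and `log_points` with any graphics package should give plots of the kind described above, a curve for the raw data and something close to a straight line for the logged data.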

A single bend in the Normal plot indicates skewness. A double curve indicates that both tails of the distribution are different from the Normal, usually being too long, and many curves may indicate that the distribution is bimodal (Figure 7.13). When the sample is small, of course, there will be some random fluctuations.

There are several different ways to display the Normal plot. Some programs plot the data distribution on the vertical axis and the theoretical Normal distribution on the horizontal axis, which reverses the direction of the curve. Some plot the theoretical Normal distribution with mean x̄, the sample mean, and standard deviation s, the sample standard deviation. This is done by calculating x̄ + sz. Figure 7.14(a) shows both these features, the Normal plot drawn by the program Stata's ‘qnorm’ command. The straight line is the line of equality. This plot is identical to the second plot in Figure 7.12, except for the change of scale and switching of the axes. A slight variation is the standardized Normal probability plot or p-p plot, where we standardize the observations to zero mean and standard deviation one, y = (x - x̄)/s, and plot the cumulative Normal probabilities, Φ(y), against (i - 0.5)/n or i/(n + 1) (Figure 7.14(b), produced by the Stata command ‘pnorm’). There is very little difference between Figure 7.14(a) and (b) and the quantile and probability versions of the Normal plot should be interpreted in the same way.

Fig. 7.13. Blood sodium and systolic blood pressure measured in 250 patients in the Intensive Therapy Unit at St. George's Hospital, with Normal plots (data of Freidland et al. 1996)

Fig. 7.14. Variations on the Normal plot for the vitamin D data

Appendices

7A Appendix: Chi-squared, t, and F

Less mathematically inclined readers can skip this section, but those who persevere should find that applications like chi-squared tests (Chapter 13) appear much more logical.

Many probability distributions can be derived for functions of Normal variables which arise in statistical analysis. Three of these are particularly important: the Chi-squared, t and F distributions. These have many applications, some of which we shall discuss in later chapters.

The Chi-squared distribution is defined as follows. Suppose Z is a Standard Normal variable, so having mean 0 and variance 1. Then the variable formed by Z² follows the Chi-squared distribution with 1 degree of freedom. If we have n such independent Standard Normal variables, Z₁, Z₂, …, Zₙ, then the variable defined by

\[ \chi^2 = Z_1^2 + Z_2^2 + \dots + Z_n^2 \]

is defined to have the Chi-squared distribution with n degrees of freedom. χ is the Greek letter ‘chi’, pronounced ‘ki’ as in ‘kite’. The distribution curves for several different numbers of degrees of freedom are shown in Figure 7.15. The mathematical description of this curve is rather complicated, but we do not need to go into this.

Some properties of the Chi-squared distribution are easy to deduce. As the distribution is the sum of n independent identically distributed random variables it tends to the Normal as n increases, from the central limit theorem (§7.2). The convergence is slow, however (Figure 7.15), and the square root of chi-squared converges much more quickly. The expected value of Z² is the variance of Z, the expected value of Z being 0, and so E(Z²) = 1. The expected value of chi-squared with n degrees of freedom is thus n:

\[ E(\chi^2) = E\left(\sum_{i=1}^{n} Z_i^2\right) = \sum_{i=1}^{n} E(Z_i^2) = \sum_{i=1}^{n} 1 = n \]

The Chi-squared distribution has a very important property. Suppose we restrict our attention to a subset of possible outcomes for the n random variables Z₁, Z₂, …, Zₙ. The subset will be defined by those values of Z₁, Z₂, …, Zₙ which satisfy the equation a₁Z₁ + a₂Z₂ + … + aₙZₙ = k, where a₁, a₂, …, aₙ, and k are constants. (This is called a linear constraint.) Then under this restriction, χ² = ΣZᵢ² follows a Chi-squared distribution with n - 1 degrees of freedom. If there are m such constraints such that none of the equations can be calculated from the others, then we have a Chi-squared distribution with n - m degrees of freedom. This is the source of the name ‘degrees of freedom’.

Fig. 7.15. Some Chi-squared distributions

The proof of this is too complicated to give here, involving such mathematical abstractions as n dimensional spheres, but its implications are very important. First, consider the sum of squares about the population mean µ of a sample of size n from a Normal distribution, divided by σ². Σ(xᵢ - µ)²/σ² will follow a Chi-squared distribution with n degrees of freedom, as the (xᵢ - µ)/σ have mean 0 and variance 1 and they are independent. Now suppose we replace µ by an estimate calculated from the data, x̄. The variables are no longer independent: they must satisfy the relationship Σ(xᵢ - x̄) = 0, and we now have n - 1 degrees of freedom. Hence Σ(xᵢ - x̄)²/σ² follows a Chi-squared distribution with n - 1 degrees of freedom. The sum of squares about the mean of any Normal sample with variance σ² follows the distribution of a Chi-squared variable multiplied by σ². It therefore has expected value (n - 1)σ² and we divide by n - 1 to give the estimate of σ².
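Although the proof is omitted, the loss of one degree of freedom can be checked by simulation. The sketch below is an illustration only: the population mean and standard deviation, the sample size of 5, the number of samples and the random seed are all arbitrary choices made for the example. It compares the average of the sum of squares about µ with the average of the sum of squares about x̄, each divided by σ².

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so that the run is repeatable

# Arbitrary illustrative values, not taken from the text.
mu, sigma, n, n_samples = 10.0, 2.0, 5, 20000

ss_about_mu = []    # sums of squares about the population mean, over sigma^2
ss_about_xbar = []  # sums of squares about the sample mean, over sigma^2

for _ in range(n_samples):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(x)
    ss_about_mu.append(sum((xi - mu) ** 2 for xi in x) / sigma ** 2)
    ss_about_xbar.append(sum((xi - xbar) ** 2 for xi in x) / sigma ** 2)

mean_about_mu = fmean(ss_about_mu)      # expected to be close to n
mean_about_xbar = fmean(ss_about_xbar)  # expected to be close to n - 1
print(round(mean_about_mu, 2), round(mean_about_xbar, 2))
```

The two averages should come out near 5 and 4 respectively, the expected values of Chi-squared variables with n and n - 1 degrees of freedom.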

Thus, provided the data are from a Normal distribution, not only does the sample mean follow a Normal distribution, but the sample variance is from a Chi-squared distribution times σ²/(n - 1). Because the square root of the Chi-squared distribution converges quite rapidly to the Normal, the distribution of the sample standard deviation is approximately Normal for n > 20, provided the data themselves are from a Normal distribution. Another important property of the variances of Normal samples is that, if we take many random samples from the same population, the sample variance and sample mean are independent if, and only if, the data are from a Normal distribution.

The F distribution with m and n degrees of freedom is the distribution of (χ²ₘ/m)/(χ²ₙ/n), the ratio of two independent χ² variables each divided by its degrees of freedom. This distribution is used for comparing variances. If we have two independent estimates of the same variance calculated from Normal data, the variance ratio will follow the F distribution. We can use this for comparing two estimates of variance (§10.8), but its main uses are in comparing groups of means (§10.9) and in examining the effects of several factors together (§17.2).


32.TheNormaldistribution:

(a)isalsocalledtheGaussiandistribution;

(b)isfollowedbymanyvariables;

(c)isafamilyofdistributionswithtwoparameters;

(d)isfollowedbyallmeasurementsmadeinhealthypeople;

(e)isthedistributiontowardswhichthePoissondistributiontendsasitsmeanincreases.


33.TheStandardNormaldistribution:

(a)isskewtotheleft;

(b)hasmean=1.0;

(c)hasstandarddeviation=0.0;

(d)hasvariance=1.0;

(e)hasthemedianequaltothemean.


34.ThePEFRsofagroupof11-year-oldgirlsfollowaNormaldistributionwithmean300litre/minandastandarddeviation20litre/min:

(a)about95%ofthegirlshavePEFRbetween260and340litre/min;

(b)50%ofthegirlshavePEFRabove300litre/min;

(c)thegirlshavehealthylungs;

(d)about5%ofgirlshavePEFRbelow260litre/min;

(e)allthePEFRsmustbelessthan340litre/min.


35.Themeanofalargesample:

(a)isalwaysgreaterthanthemedian;

(b)iscalculatedfromtheformulaΣxn/n

(c)isfromanapproximatelyNormaldistribution;

(d)increasesasthesamplesizeincreases;

(e)isalwaysgreaterthanthestandarddeviation.


36.IfXandYareindependentvariableswhichfollowStandardNormaldistributions,aNormaldistributionisalsofollowedby:

(a)5X;

(b)X2;

(c)X+5;

(d)X-Y;

(e)X/Y.


37.WhenaNormalplotisdrawnwiththeStandardNormaldeviateontheyaxis:

(a)astraightlineindicatesthatobservationsarefromaNormalDistribution;

(b)acurvewithdecreasingslopeindicatespositiveskewness;

(c)an‘S’shapedcurve(orogive)indicateslongtails;

(d)averticallinewilloccurifallobservationsareequal;

(e)ifthereisastraightlineitsslopedependsonthestandarddeviation.


7EExercise:ANormalplotInthisexerciseweshallreturntothebloodglucosedataof§4EandtrytodecidehowwelltheyconformtoaNormaldistribution.

1.Fromtheboxandwhiskerplotandthehistogramfoundinexercise§4E(ifyouhavenottriedexercise§4Eseethesolutionin

Chapter19),dothebloodglucoselevelslooklikeaNormaldistribution?


2.ConstructaNormalplotforthedata.Thisisquiteeasyastheyareorderedalready.Find(i-0.5)/nfori=1to40andobtainthecorrespondingcumulativeNormalprobabilitiesfromTable7.1.Nowplottheseprobabilitiesagainstthecorrespondingbloodglucose.


3.Doestheplotappeartogiveastraightline?DothedatafollowaNormaldistribution?





8

Estimation

8.1 Sampling distributions

We have seen in Chapter 3 how samples are drawn from much larger populations. Data are collected about the sample so that we can find out something about the population. We use samples to estimate quantities such as disease prevalence, mean blood pressure, mean exposure to a carcinogen, etc. We also want to know by how much these estimates might vary from sample to sample.

In Chapters 6 and 7 we saw how the theory of probability enables us to link random samples with the populations from which they are drawn. In this chapter we shall see how probability theory enables us to use samples to estimate quantities in populations, and to determine the precision of these estimates. First we shall consider what happens when we draw repeat samples from the same population. Table 8.1 shows a set of 100 random digits which we can use as the population for a sampling experiment. The distribution of the numbers in this population is shown in Figure 8.1. The population mean is 4.7 and the standard deviation is 2.9.

The sampling experiment is done by using a suitable random sampling method to draw repeated samples from the population. In this case decimal dice were a convenient method. A sample of size four was chosen: 6, 4, 6 and 1. The mean was calculated: 17/4 = 4.25. This was repeated to draw a second sample of 4 numbers: 7, 8, 1, 8. Their mean is 6.00. This sampling procedure was done 20 times altogether, to give the samples and their means shown in Table 8.2.

These sample means are not all the same. They show random variation. If we were able to draw all of the 3 921 225 possible samples of size 4 and calculate their means, these means themselves would form a distribution. Our 20 sample means are themselves a sample from this distribution. The distribution of all possible sample means is called the sampling distribution of the mean. In general, the sampling distribution of any statistic is the distribution of the values of the statistic which would arise from all possible samples.
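The count of possible samples and the first two sample means can be checked directly. This short sketch assumes that ‘possible samples’ means unordered samples of 4 different members of the population of 100, which is the interpretation that reproduces the figure quoted above; the variable names are invented for the example.

```python
from math import comb
from statistics import fmean

# Number of ways of choosing 4 members from a population of 100,
# ignoring the order in which they are drawn.
n_possible_samples = comb(100, 4)

# The first two samples drawn in the sampling experiment.
first_sample = [6, 4, 6, 1]
second_sample = [7, 8, 1, 8]
first_mean = fmean(first_sample)
second_mean = fmean(second_sample)

print(n_possible_samples, first_mean, second_mean)
```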

Table8.1.Populationof100randomdigitsforasamplingexperiment

9 1 0 7 5 6 9 5 8 8 1 0 5 7

1 8 8 8 5 2 4 8 3 1 6 5 5 7

2 8 1 8 5 8 4 0 1 9 2 1 6 9

1 9 7 9 7 2 7 7 0 8 1 6 3 8

7 0 2 8 8 7 2 5 4 1 8 6 8 3

Fig.8.1.DistributionofthepopulationofTable8.1

8.2 Standard error of a sample mean

For the moment we shall consider the sampling distribution of the mean only. As our sample of 20 means is a random sample from it, we can use this to estimate some of the parameters of the distribution. The twenty means have their own mean and standard deviation. The mean is 5.1 and the standard deviation is 1.1. Now the mean of the whole population is 4.7, which is close to the mean of the samples. But the standard deviation of the population is 2.9, which is considerably greater than that of the sample means. If we plot a histogram for the sample of means (Figure 8.2) we see that the centre of the sampling distribution and the parent population distribution are the same, but the scatter of the sampling distribution is much less.

Table8.2.Randomsamplesdrawninasamplingexperiment

Sample 6 7 7 1 5 5 4

4 8 9 8 2 5 2

6 1 2 8 9 7 7

1 8 7 4 5 8 6

Mean 4.25 6.00 6.25 5.25 5.25 6.25 4.75

Sample 7 7 2 8 3 4 5

8 3 5 0 7 8 5

7 8 0 7 4 7 8

2 7 8 7 8 7 3

Mean 6.00 6.25 3.75 5.50 5.50 6.50 5.25

Fig.8.2.DistributionofthepopulationofTable8.1andofthesampleofthemeansofTable8.2

The sample mean is an estimate of the population mean. The standard deviation of its sampling distribution is called the standard error of the estimate. It provides a measure of how far from the true value the estimate is likely to be. In most estimation, the estimate is likely to be within one standard error of the true mean and unlikely to be more than two standard errors from it. We shall look at this more precisely in §8.3.

In almost all practical situations we do not know the true value of the population variance σ² but only its estimate s² (§4.7). We can use this to estimate the standard error by s/√n. This estimate is also referred to as the standard error of the mean. It is usually clear from the context whether the standard error is the true value or that estimated from the data.

When the sample size n is large, the sampling distribution of x̄ tends to a Normal distribution. Also, we can assume that s² is a good estimate of σ². So for large n, x̄ is, in effect, an observation from a Normal distribution with mean µ and standard deviation estimated by s/√n. So with probability 0.95, x̄ is within two, or more precisely within 1.96, standard errors of µ. With small samples we cannot assume either a Normal distribution or, more importantly, that s² is a good estimate of σ². We shall discuss this in Chapter 10.

Fig.8.3.SamplesofmeansfromaStandardNormalvariable

The mean and standard error are often written as 4.062 ± 0.089. This is rather misleading, as the true value may be up to two standard errors from the mean with a reasonable probability. This practice is not recommended.

There is often confusion between the terms ‘standard error’ and ‘standard deviation’. This is understandable, as the standard error is a standard deviation (of the sampling distribution) and the terms are often interchanged in this context. The convention is this: we use the term ‘standard error’ when we measure the precision of estimates, and the term ‘standard deviation’ when we are concerned with the variability of samples, populations or distributions. If we want to say how good our estimate of the mean FEV1 measurement is, we quote the standard error of the mean. If we want to say how widely scattered the FEV1 measurements are, we quote the standard deviation, s.

8.3 Confidence intervals

The estimate of mean FEV1 is a single value and so is called a point estimate. There is no reason to suppose that the population mean will be exactly equal to the point estimate, the sample mean. It is likely to be close to it, however, and the amount by which it is likely to differ from the estimate can be found from the standard error. What we do is find limits which are likely to include the population mean, and say that we estimate the population mean to lie somewhere in the interval (the set of all possible values) between these limits. This is called an interval estimate.

Fig.8.4.Samplingdistributionofthemeanof4observationsfromaStandardNormaldistribution

For instance, if we regard the 57 FEV1 measurements as being a large sample we can assume that the sampling distribution of the mean is Normal, and that the standard error is a good estimate of its standard deviation (see §10.6 for a discussion of how large is large). We therefore expect about 95% of such means to be within 1.96 standard errors of the population mean, µ. Hence, for about 95% of all possible samples, the population mean must be greater than the sample mean minus 1.96 standard errors and less than the sample mean plus 1.96 standard errors. If we calculated x̄ - 1.96se and x̄ + 1.96se for all possible samples, 95% of such intervals would contain the population mean. In this case these limits are 4.062 - 1.96 × 0.089 to 4.062 + 1.96 × 0.089 which gives 3.89 to 4.24, or 3.9 to 4.2 litres, rounding to two significant figures; 3.9 and 4.2 are called the 95% confidence limits for the estimate, and the set of values between 3.9 and 4.2 is called the 95% confidence interval. The confidence limits are the values at the ends of the confidence interval.

Strictly speaking, it is incorrect to say that there is a probability of 0.95 that the population mean lies between 3.9 and 4.2, though it is often put that way (even by me). The population mean is a number, not a random variable, and has no probability. It is the probability that limits calculated from a random sample will include the population value which is 95%. Figure 8.5 shows confidence intervals for the mean for 20 random samples of 100 observations from the Standard Normal distribution. The population mean is, of course, 0.0, shown by the horizontal line. Some sample means are close to 0.0, some further away, some above and some below. The population mean is contained by 19 of the 20 confidence intervals. In general, for 95% of confidence intervals it will be true to say that the population value lies within the interval. We just don't know which 95%. We express this by saying that we are 95% confident that the mean lies between these limits.

Fig.8.5.Meanand95%confidenceintervalfor20randomsamplesof100observationsfromtheStandardNormaldistribution

In the FEV1 example, the sampling distribution of the mean is Normal and its standard deviation is well estimated because the sample is large. This is not always true and although it is usually possible to calculate confidence intervals for an estimate they are not all quite as simple as that for the mean estimated from a large sample. We shall look at the mean estimated from a small sample in §10.2.

There is no necessity for the confidence interval to have a probability of 95%. For example, we can also calculate 99% confidence limits. The upper 0.5% point of the Standard Normal distribution is 2.58 (Table 7.2), so the probability of a Standard Normal deviate being above 2.58 or below -2.58 is 1% and the probability of being within these limits is 99%. The 99% confidence limits for the mean FEV1 are therefore 4.062 - 2.58 × 0.089 and 4.062 + 2.58 × 0.089, i.e. 3.8 and 4.3 litres. These give a wider interval than the 95% limits, as we would expect since we are more confident that the mean will be included. The probability we choose for a confidence interval is thus a compromise between the desire to include the estimated population value and the desire to avoid parts of the scale where there is a low probability that the mean will be found. For most purposes, 95% confidence intervals have been found to be satisfactory.
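The 95% and 99% limits for the mean FEV1 can be reproduced with a small function. This is a sketch under the large sample assumption discussed above; the function name `normal_ci` is invented for the example, and the multiplier is taken from Python's standard `statistics.NormalDist` rather than from Table 7.2, so it is carried to more decimal places than 1.96 and 2.58.

```python
from statistics import NormalDist


def normal_ci(estimate, se, level=0.95):
    """Large sample confidence interval: estimate -/+ z * standard error."""
    # z is about 1.96 for level 0.95 and about 2.58 for level 0.99.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


mean_fev1, se_fev1 = 4.062, 0.089  # mean and standard error quoted in the text

ci95 = normal_ci(mean_fev1, se_fev1, 0.95)
ci99 = normal_ci(mean_fev1, se_fev1, 0.99)
print([round(v, 2) for v in ci95], [round(v, 2) for v in ci99])
```

Raising the level from 0.95 to 0.99 changes only the multiplier, which is why the 99% interval is wider.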

Standard error is not the only way in which we can calculate confidence intervals, although at present it is the one used for most problems. In §8.8 I describe a different approach based on the exact probabilities of a distribution, which requires no large sample assumption. In §8.9 I describe a large sample method which uses the Binomial distribution directly. There are others, which I shall omit because they are rarely used.

8.4 Standard error and confidence interval for a proportion

The standard error of a proportion estimate can be calculated in the same way. Suppose the proportion of individuals who have a particular condition in a given population is p, and we take a random sample of size n, the number observed with the condition being r. Then the estimated proportion is r/n. We have seen (§6.4) that r comes from a Binomial distribution with mean np and variance np(1 - p). Provided n is large, this distribution is approximately Normal. So r/n, the estimated proportion, is approximately Normally distributed with mean given by np/n = p, and variance given by

\[ \mathrm{VAR}\left(\frac{r}{n}\right) = \frac{\mathrm{VAR}(r)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n} \]

since n is constant, and the standard error is

\[ \sqrt{\frac{p(1-p)}{n}} \]

We can estimate this by replacing p by r/n.

The standard error of the proportion is only of use if the sample is large enough for the Normal approximation to apply. A rough guide to this is that np and n(1 - p) should both exceed 5. This is usually the case when we are concerned with straightforward estimation. If we try to use the method for smaller samples, we may get absurd results. For example, in a study of the prevalence of HIV in ex-prisoners (Turnbull et al. 1992), of 29 women who did not inject drugs one was HIV positive. The authors reported this to be 3.4%, with a 95% confidence interval -3.1% to 9.9%. The lower limit of -3.1%, obtained from the observed proportion minus 1.96 standard errors, is impossible. As Newcombe (1992) pointed out, the correct 95% confidence interval can be obtained from the exact probabilities of the Binomial distribution and is 0.1% to 17.8% (§8.8).
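The failure of the standard error method for the HIV example can be seen by carrying out the calculation. The sketch below applies the large sample formula to one positive out of 29; the names are invented for the example, and the rough guide is applied with the estimate r/n in place of the unknown p. The limits it prints need not agree with the published ones in the last digit, since the rounding used by the original authors is not known, but the lower limit is negative in the same way.

```python
from math import sqrt


def proportion_se(r, n):
    """Estimated standard error of the proportion r/n, sqrt(p(1 - p)/n)."""
    p = r / n
    return sqrt(p * (1 - p) / n)


r, n = 1, 29  # one HIV positive woman out of 29
p_hat = r / n
se = proportion_se(r, n)
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se

# Rough guide to the Normal approximation: np and n(1 - p) should exceed 5.
approximation_ok = n * p_hat > 5 and n * (1 - p_hat) > 5

print(f"{100 * p_hat:.1f}% ({100 * lower:.1f}% to {100 * upper:.1f}%)")
print("Normal approximation reasonable:", approximation_ok)
```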

8.5 The difference between two means

In many studies we are more interested in the difference between two parameters than in their absolute value. These could be means, proportions, the slopes of lines, and many other statistics. When samples are large we can assume that sample means and proportions are observations from a Normal distribution, and that the calculated standard errors are good estimates of the standard deviations of these Normal distributions. We can use this to find confidence intervals.

For example, suppose we wish to compare the means, x̄₁ and x̄₂, of two large samples, sizes n₁ and n₂. The expected difference between the sample means is equal to the difference between the population means, i.e. E(x̄₁ - x̄₂) = µ₁ - µ₂. What is the standard error of the difference? The variance of the difference between two independent random variables is the sum of their variances (§6.6). Hence, the standard error of the difference between two independent estimates is the square root of the sum of the squares of their standard errors. The standard error of a mean is √(s²/n), so the standard error of the difference between two independent means is

\[ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]

For an example, in a study of respiratory symptoms in schoolchildren (Bland et al. 1974), we wanted to know whether children reported by their parents to have respiratory symptoms had worse lung function than children who were not reported to have symptoms. Ninety-two children were reported to have cough during the day or at night, and their mean PEFR was 294.8 litre/min with standard deviation 57.1 litre/min, and 1643 children were not reported to have this symptom, their mean PEFR being 313.6 litre/min with standard deviation 55.2 litre/min. We thus have two large samples, and can apply the Normal distribution. We have

n₁ = 92, x̄₁ = 294.8, s₁ = 57.1, n₂ = 1643, x̄₂ = 313.6, s₂ = 55.2

The difference between the two groups is x̄₁ - x̄₂ = 294.8 - 313.6 = -18.8. The standard error of the difference is

\[ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{57.1^2}{92} + \frac{55.2^2}{1643}} = 6.11 \]

We shall treat the sample as being large, so the difference between the means can be assumed to come from a Normal distribution and the estimated standard error to be a good estimate of the standard deviation of this distribution. (For small samples see §10.3 and §10.6.) The 95% confidence limits for the difference are thus -18.8 - 1.96 × 6.11 and -18.8 + 1.96 × 6.11, i.e. -30.8 and -6.8 litre/min. The confidence interval does not include zero, so we have good evidence that, in this population, children reported to have day or night cough have lower mean PEFR than others. The difference is estimated to be between 7 and 31 litre/min lower in children with the symptom, so it may be quite small.
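The arithmetic of the PEFR example can be checked as follows. This is a minimal sketch of the large sample calculation only, using the summary figures quoted above; the variable names are invented for the example.

```python
from math import sqrt

# Summary statistics quoted in the text (PEFR in litre/min).
n1, mean1, sd1 = 92, 294.8, 57.1     # children reported to have cough
n2, mean2, sd2 = 1643, 313.6, 55.2   # children not reported to have cough

diff = mean1 - mean2
# Standard error of the difference between two independent means.
se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
lower, upper = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"difference {diff:.1f}, SE {se_diff:.2f}, "
      f"95% CI {lower:.1f} to {upper:.1f} litre/min")
```

Almost all of the standard error comes from the smaller group: 57.1²/92 is about 35.4, against about 1.9 for 55.2²/1643.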

When we have paired data, such as a cross-over trial (§2.6) or a matched case-control study (§3.8), the two-sample method does not work. Instead, we calculate the differences between the paired observations for each subject, then find the mean difference, its standard error and confidence interval as in §8.3.

Table 8.3. Cough during the day or at night at age 14 and bronchitis before age 5 (Holland et al. 1978)

Cough at 14    Bronchitis at 5         Total
               Yes        No
Yes            26         44           70
No             247        1002         1249
Total          273        1046         1319

8.6 Comparison of two proportions

Provided the conditions of Normal approximation are met (see §8.4) we can find a confidence interval for the difference in the usual way.

For example, consider Table 8.3. The researchers wanted to know to what extent children with bronchitis in infancy get more respiratory symptoms in later life than others. We can estimate the difference between the proportions reported to cough during the day or at night among children with and children without a history of bronchitis before age 5 years. We have estimates of two proportions, p₁ = 26/273 = 0.09524 and p₂ = 44/1046 = 0.04207. The difference between them is p₁ - p₂ = 0.09524 - 0.04207 = 0.05317. The standard error of the difference is

\[ \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} = \sqrt{\frac{0.09524 \times (1 - 0.09524)}{273} + \frac{0.04207 \times (1 - 0.04207)}{1046}} = 0.0188 \]

The 95% confidence interval for the difference is 0.05317 - 1.96 × 0.0188 to 0.05317 + 1.96 × 0.0188 = 0.016 to 0.090. Although the difference is not very precisely estimated, the confidence interval does not include zero and gives us clear evidence that children with bronchitis reported in infancy are more likely than others to be reported to have respiratory symptoms in later life. The data on lung function in §8.5 give us some reason to suppose that this is not entirely due to response bias (§3.9). As in §8.4, the confidence interval must be estimated differently for small samples.
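The difference between the two proportions of Table 8.3 and its large sample confidence interval can be computed as below. This is a sketch of the standard error method only and is not suitable for small samples; the variable names are invented for the example.

```python
from math import sqrt

# Frequencies from Table 8.3.
r1, n1 = 26, 273     # cough at 14 among those with bronchitis before 5
r2, n2 = 44, 1046    # cough at 14 among those without bronchitis before 5

p1, p2 = r1 / n1, r2 / n2
diff = p1 - p2
# Standard error of the difference between two independent proportions.
se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
lower, upper = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"difference {diff:.5f}, SE {se_diff:.4f}, "
      f"95% CI {lower:.3f} to {upper:.3f}")
```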

This difference in proportions may not be very easy to interpret. The ratio of two proportions is often more useful. Another method, the odds ratio, is described in §13.7. The ratio of the proportion with cough at age 14 for bronchitis before 5 to the proportion with cough at age 14 for those without bronchitis before 5 is p₁/p₂ = 0.09524/0.04207 = 2.26. Children with bronchitis before 5 are more than twice as likely to cough during the day or at night at age 14 than children with no such history.

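The standard error for this ratio is complex, and as it is a ratio rather than a difference it does not approximate well to a Normal distribution. If we take the logarithm of the ratio, however, we get the difference between two logarithms, because log(p₁/p₂) = log(p₁) - log(p₂) (§5A). We can find the standard error for the log ratio quite easily. We use the result that, for any random variable X with mean µ and variance σ², the approximate variance of log(X) is given by VAR(logₑ(X)) = σ²/µ² (see Kendall and Stuart 1969). Hence, the variance of log(p) is

\[ \mathrm{VAR}(\log_e(p)) = \frac{p(1-p)/n}{p^2} = \frac{1-p}{np} \]

For the difference between the two logarithms we get

\[ \mathrm{VAR}(\log_e(p_1/p_2)) = \mathrm{VAR}(\log_e(p_1)) + \mathrm{VAR}(\log_e(p_2)) = \frac{1-p_1}{n_1 p_1} + \frac{1-p_2}{n_2 p_2} \]

The standard error is the square root of this. (This formula is often written in terms of frequencies, but I think this version is clearer.) For the example the log ratio is logₑ(2.26385) = 0.81707 and the standard error is

\[ \sqrt{\frac{1 - 0.09524}{273 \times 0.09524} + \frac{1 - 0.04207}{1046 \times 0.04207}} = 0.23784 \]

The 95% confidence interval for the log ratio is therefore 0.81707 - 1.96 × 0.23784 to 0.81707 + 1.96 × 0.23784 = 0.35089 to 1.28324. The 95% confidence interval for the ratio of proportions itself is the antilog of this: e^0.35089 to e^1.28324 = 1.42 to 3.61. Thus we estimate that the proportion of children reported to cough during the day or at night among those with a history of bronchitis is between 1.4 and 3.6 times the proportion among those without a history of bronchitis.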
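The confidence interval for the ratio of proportions can be sketched in the same way, working on the log scale and taking antilogs at the end. The code uses unrounded proportions throughout, so intermediate values such as the log ratio may differ from those above in the fourth decimal place; the variable names are invented for the example.

```python
from math import exp, log, sqrt

# Frequencies from Table 8.3.
r1, n1 = 26, 273     # cough at 14 among those with bronchitis before 5
r2, n2 = 44, 1046    # cough at 14 among those without bronchitis before 5

p1, p2 = r1 / n1, r2 / n2
ratio = p1 / p2
log_ratio = log(ratio)  # natural logarithm

# Approximate variance of log(p) is (1 - p)/(n p); the variances of the
# two independent log proportions are added.
se_log_ratio = sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))

# Confidence limits on the log scale, then antilog.
lower = exp(log_ratio - 1.96 * se_log_ratio)
upper = exp(log_ratio + 1.96 * se_log_ratio)

print(f"ratio {ratio:.2f}, SE of log ratio {se_log_ratio:.4f}, "
      f"95% CI {lower:.2f} to {upper:.2f}")
```

Because the limits are found on the log scale, the interval is not symmetrical about the estimated ratio.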

The proportion of individuals in a population who develop a disease or symptom is equal to the probability that any given individual will develop the disease, called the risk of an individual developing a disease. Thus in Table 8.3 the risk that a child with bronchitis before age 5 will cough at age 14 is 26/273 = 0.09524, and the risk for a child without bronchitis before age 5 is 44/1046 = 0.04207. To compare risks for people with and without a particular risk factor, we look at the ratio of the risk with the factor to the risk without the factor, the relative risk. The relative risk of cough at age 14 for bronchitis before 5 is thus 2.26. To estimate the relative risk directly, we need a cohort study (§3.7) as in Table 8.3. We estimate relative risk for a case-control study in a different way (§13.7).

In the unusual situation when the samples are paired, either matched or two observations on the same subject, we use a different method (§13.9).

8.7* Standard error of a sample standard deviation

8.8* Confidence interval for a proportion when numbers are small

In §8.4 I mentioned that the standard error method for a proportion does not work when the sample is small. Instead, the confidence interval can be found using the exact probabilities of the Binomial distribution. The method works like this. Given n, we find the value pL for the parameter p of the Binomial distribution which gives a probability 0.025 of getting an observed number of successes, r, as big as or bigger than the value observed. We do this by calculating the probabilities from the formula in §6.4, iterating round different possible values of p until we get the right one. We also find the value pU for the parameter p of the Binomial distribution which gives a probability 0.025 of getting an observed number of successes as small as or smaller than the value observed. The exact 95% confidence interval is pL to pU. For example, suppose we observe 3 successes out of 10 trials. The Binomial distribution with n = 10 which has the total probability for 3 or more successes equal to 0.025 has parameter p = 0.067. The distribution which has the total probability for 3 or fewer successes equal to 0.025 has p = 0.652. Hence the 95% confidence interval for the proportion in the population is 0.067 to 0.652. Figure 8.6 shows the two distributions. No large sample approximation is required and we can use this for any size of sample. Pearson and Hartley (1970) give a table for calculating exact Binomial confidence intervals. Even better, you can download a free program from my website (§1.3).
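The iteration described above can be sketched with a simple bisection search. This is an illustration only, not the program mentioned in the text: the function names are invented for the example, and bisection is just one convenient way of ‘iterating round different possible values of p’.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes out of n."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_r_or_more(r, n, p):
    return sum(binom_pmf(k, n, p) for k in range(r, n + 1))


def prob_r_or_fewer(r, n, p):
    return sum(binom_pmf(k, n, p) for k in range(0, r + 1))


def solve(f, target, increasing, tol=1e-10):
    """Bisection on 0 < p < 1 for f(p) = target, where f is monotonic."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # the solution lies above mid
        else:
            hi = mid  # the solution lies at or below mid
    return (lo + hi) / 2


def exact_ci(r, n, alpha=0.05):
    """Exact Binomial confidence interval for r successes out of n."""
    # With no successes the lower limit is 0; with no failures the upper is 1.
    p_lower = 0.0 if r == 0 else solve(
        lambda p: prob_r_or_more(r, n, p), alpha / 2, increasing=True)
    p_upper = 1.0 if r == n else solve(
        lambda p: prob_r_or_fewer(r, n, p), alpha / 2, increasing=False)
    return p_lower, p_upper


print([round(v, 3) for v in exact_ci(3, 10)])   # 3 successes out of 10
print([round(v, 3) for v in exact_ci(1, 29)])   # HIV example of section 8.4
```

Applied to one positive out of 29, the same function gives limits that round to the 0.1% to 17.8% quoted in §8.4.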

Fig. 8.6. Distributions showing the calculation of the exact confidence interval for three successes out of ten trials.

Unless the observed proportion is zero or one, these values are never included in the exact confidence interval. The population proportion of successes cannot be zero if we have observed a success in the sample. It cannot be one if we have observed a failure.

8.9* Confidence interval for a median and other quantiles

We round j and k up to the next integer. Then the 95% confidence interval is between the jth and the kth observations in the ordered data. For the 57 FEV1 measurements of Table 4.4, the median was 4.1 litres (§4.5). For the 95% confidence interval for the median, n = 57 and q = 0.5, and

\[ j = 57 \times 0.5 - 1.96\sqrt{57 \times 0.5 \times (1 - 0.5)} = 21.10, \qquad k = 57 \times 0.5 + 1.96\sqrt{57 \times 0.5 \times (1 - 0.5)} = 35.90 \]

The 95% confidence interval is thus from the 22nd to the 36th observation, 3.75 to 4.30 litres from Table 4.4. Compare this to the 95% confidence interval for the mean, 3.9 to 4.2 litres, which is completely included in the interval for the median. This method of estimating percentiles is relatively imprecise. Another example is given in §15.5.
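The ranks used for the median can be reproduced with a few lines of code. The general formula for j and k does not appear in the text as it stands, so this sketch assumes the usual large sample Binomial form, j = nq - 1.96√(nq(1 - q)) and k = nq + 1.96√(nq(1 - q)), which gives the 22nd and 36th observations quoted above; the function name is invented for the example.

```python
from math import ceil, sqrt


def quantile_ci_ranks(n, q, z=1.96):
    """Ranks j, k of the ordered observations bounding the CI for quantile q.

    Assumed formula: n*q -/+ z*sqrt(n*q*(1 - q)), each rounded up to the
    next integer.
    """
    half_width = z * sqrt(n * q * (1 - q))
    j = ceil(n * q - half_width)
    k = ceil(n * q + half_width)
    return j, k


j, k = quantile_ci_ranks(57, 0.5)  # median of the 57 FEV1 measurements
print(j, k)
```

The interval itself is then read off the ordered data as the jth and kth observations.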

8.10 What is the correct confidence interval?

Confidence intervals only estimate errors due to sampling. They do not allow for any bias in the sample and give us an estimate for the population of which our data can be considered a random sample. As discussed in §3.5, it is often not clear what this population is, and we rely far more on the estimation of differences than absolute values. This is particularly true in clinical trials. We start with patients in one locality, exclude some, allow refusals, and the patients cannot be regarded as a random sample of patients in general. However, we then randomize into two groups which are then two samples from the same population, and only the treatment differs between them. Thus the difference is the thing we want the confidence interval for, not for either group separately. Yet researchers often ignore the direct comparison in favour of estimation using each group separately.

For example, Salvesen et al. (1992) reported follow-up of two randomized controlled trials of routine ultrasonography screening during pregnancy. At ages 8 to 9 years, children of women who had taken part in these trials were followed up. A subgroup of children underwent specific tests for dyslexia. The test results classified 21 of the 309 screened children (7%, 95% confidence interval 3–10%) and 26 of the 294 controls (9%, 95% confidence interval 4–12%) as dyslexic. Much more useful would be a confidence interval for the difference between prevalences (-6.3 to 2.2 percentage points) or their ratio (0.44 to 1.34), because we could then compare the groups directly.


38.Thestandarderrorofthemeanofasample:

(a)measuresthevariabilityoftheobservations;

(b)istheaccuracywithwhicheachobservationismeasured;

(c)isameasureofhowfarthesamplemeanislikelytobefromthepopulationmean;

(d)isproportionaltothenumberofobservations;

(e)isgreaterthantheestimatedstandarddeviationofthepopulation.


39.The95%confidencelimitsforthemeanestimatedfromasetofobservations

(a)arelimitsbetweenwhich,inthelongrun,95%ofobservationsfall;

(b)areawayofmeasuringtheprecisionoftheestimateofthemean;

(c)arelimitswithinwhichthesamplemeanfallswithprobability0.95;

(d)arelimitswhichwouldincludethepopulationmeanfor95%ofpossiblesamples;

(e)areawayofmeasuringthevariabilityofasetofobservations.


40.Ifthesizeofarandomsamplewereincreased,wewouldexpect:

(a)themeantodecrease;

(b)thestandarderrorofthemeantodecrease;

(c)thestandarddeviationtodecrease;

(d)thesamplevariancetoincrease;

(e)thedegreesoffreedomfortheestimatedvariancetoincrease.


41.Theprevalenceofaconditioninapopulationis0.1.Iftheprevalenceisestimatedrepeatedlyfromsamplesofsize100,theseestimateswillformadistributionwhich:

(a)isasamplingdistribution;

(b)isapproximatelyNormal;

(c)hasmean=0.1;

(d)havevariance=9;

(e)isBinomial.


42.ItisnecessarytoestimatethemeanFEV1bydrawingasamplefromalargepopulation.Theaccuracyoftheestimatewilldependon:

(a)themeanFEV1inthepopulation;

(b)thenumberinthepopulation;

(c)thenumberinthesample;

(d)thewaythesampleisselected;

(e)thevarianceofFEV1inthepopulation.


43. In a study of 88 births to women with a history of thrombocytopenia (Samuels et al. 1990), the same condition was recorded in 20% of babies (95% confidence interval 13% to 30%, exact method):

(a)Anothersampleofthesamesizewillshowarateofthrombocytopeniabetween13%and30%;

(b)95%ofsuchwomenhaveaprobabilityofbetween13%and30%ofhavingababywiththrombocytopenia;

(c)Itislikelythatbetween13%and30%ofbirthstosuchwomenintheareawouldshowthrombocytopenia;

(d)Ifthesamplewereincreasedto880births,the95%confidenceintervalwouldbenarrower;

(e)Itwouldbeimpossibletogetthesedataiftherateforallwomenwas10%.


8E Exercise: Means of large samples

Table 8.4 summarizes data collected in a study of plasma magnesium in diabetics. The diabetic subjects were all insulin-dependent subjects attending a diabetic clinic over a 5 month period. The non-diabetic controls were a mixture of blood donors and people attending day centres for the elderly, to give a wide age distribution. Plasma magnesium follows a Normal distribution very closely.

Table 8.4. Plasma magnesium in insulin-dependent diabetics and healthy controls

                              Number   Mean    Standard deviation
Insulin-dependent diabetics   227      0.719   0.068
Non-diabetic controls         140      0.810   0.057

Fig.8.7.Distributionofmagnesiumindiabeticsandcontrols,showingtheproportionofdiabeticsabovethelowerlimitofreferenceinterval

1. Calculate an interval which would include 95% of plasma magnesium measurements from the control population. This is what we call the 95% reference interval, described in detail in §15.5. It tells us something about the distribution of plasma magnesium in the population.

2. What proportion of insulin-dependent diabetics would lie within this 95% reference interval? (Hint: find how many standard deviations from the diabetic mean the lower limit is, then use the table of the Normal distribution, Table 7.1, to find the probability of exceeding this. See Figure 8.7.)

3. Find the standard error of the mean plasma magnesium for each group.

4. Find a 95% confidence interval for the mean plasma magnesium in the healthy population. How does the confidence interval differ from the 95% reference interval? Why are they different?

5. Find the standard error of the difference in mean plasma magnesium between insulin-dependent diabetics and healthy people.

6. Find a 95% confidence interval for the difference in mean plasma magnesium between insulin-dependent diabetics and healthy people. Is there any evidence that diabetics have lower plasma magnesium than non-diabetics in the population from which these data come?

7. Would plasma magnesium be a good diagnostic test for diabetes?




9

Significance tests

9.1 Testing a hypothesis

In Chapter 8 I dealt with estimation and the precision of estimates. This is one form of statistical inference, the process by which we use samples to draw conclusions about the populations from which they are taken. In this chapter I shall introduce a different form of inference, the significance test or hypothesis test.

A significance test enables us to measure the strength of the evidence which the data supply concerning some proposition of interest. For example, consider the cross-over trial of pronethalol for the treatment of angina (§2.6). Table 9.1 shows the number of attacks over four weeks on each treatment. These 12 patients are a sample from the population of all patients. Would the other members of this population experience fewer attacks while using pronethalol? We can see that the number of attacks is highly variable from one patient to another, and it is quite possible that this is true from one period of time to another as well. So it could be that some patients would have fewer attacks while on pronethalol than while on placebo quite by chance. In a significance test, we ask whether the difference observed was small enough to have occurred by chance if there were really no difference in the population. If it were so, then the evidence in favour of there being a difference between the treatment periods would be weak or absent. On the other hand, if the difference were much larger than we would expect due to chance if there were no real population difference, then the evidence in favour of a real difference would be strong.

To carry out the test of significance we suppose that, in the population, there is no difference between the two treatments. The hypothesis of 'no difference' or 'no effect' in the population is called the null hypothesis. If this is not true, then the alternative hypothesis must be true, that there is a difference between the treatments in one direction or the other. We then find the probability of getting data as different from what would be expected, if the null hypothesis were true, as are those data actually observed. If this probability is large the data are consistent with the null hypothesis; if it is small the data are unlikely to have arisen if the null hypothesis were true and the evidence is in favour of the alternative hypothesis.

Table 9.1. Trial of pronethalol for the prevention of angina pectoris

Placebo   Pronethalol   Difference (placebo - pronethalol)   Sign of difference
71        29             42                                  +
323       348           -25                                  -
8         1               7                                  +
14        7               7                                  +
23        16              7                                  +
34        25              9                                  +
79        65             14                                  +
60        41             19                                  +
2         0               2                                  +
3         0               3                                  +
17        15              2                                  +
7         2               5                                  +

9.2 An example: The sign test

I shall now describe a particular test of significance, the sign test, to test the null hypothesis that placebo and pronethalol have the same effect on angina. Consider the differences between the number of attacks on the two treatments for each patient, as in Table 9.1. If the null hypothesis were true, then differences in number of attacks would be just as likely to be positive as negative; they would be random. The probability of a change being negative would be equal to the probability of it being positive, so both probabilities would be 0.5. Then the number of negatives would be an observation from a Binomial distribution (§6.4) with n = 12 and p = 0.5. (If there were any subjects who had the same number of attacks on both regimes we would omit them, as they provide no information about the direction of any difference between the treatments. In this test, n is the number of subjects for whom there is a difference, one way or the other.)

If the null hypothesis were true, what would be the probability of getting an observation from this distribution as extreme as the value we have actually observed? The expected number of negatives would be np = 6. What is the probability of getting a value as far from expectation as is that observed? The number of negative differences is 1. The probability of getting one negative change is

$$\frac{12!}{1!\,11!} \times 0.5^{1} \times 0.5^{11} = 12 \times 0.5^{12} = 0.00293$$

This is not a likely event in itself. However, we are interested in the probability of getting a value as far or further from the expected value, 6, as is 1, and clearly 0 is further and must be included. The probability of no negative changes is

$$\frac{12!}{0!\,12!} \times 0.5^{0} \times 0.5^{12} = 0.5^{12} = 0.00024$$

So the probability of one or fewer negative changes is 0.00293 + 0.00024 = 0.00317. The null hypothesis is that there is no difference, so the alternative hypothesis is that there is a difference in one direction or the other. We must, therefore, consider the probability of getting a value as extreme on the other side of the mean, that is 11 or 12 negatives (Figure 9.1). The probability of 11 or 12 negatives is also 0.00317, because the distribution is symmetrical. Hence, the probability of getting as extreme a value as that observed, in either direction, is 0.00317 + 0.00317 = 0.00634. This means that if the null hypothesis were true we would have a sample which is so extreme that the probability of it arising by chance is 0.006, less than one in a hundred.
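The sign test arithmetic can be checked with a few lines of Python. This is only a sketch: the function names `binomial_pmf` and `sign_test_two_sided` are illustrative choices, not part of the original text, and the two-sided probability is obtained by doubling one tail, which relies on the symmetry of the Binomial distribution when p = 0.5.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r events in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def sign_test_two_sided(n_negative: int, n: int) -> float:
    """Two-sided sign test probability under the null hypothesis p = 0.5.

    n is the number of subjects with a non-zero difference (ties omitted).
    """
    k = min(n_negative, n - n_negative)  # work from the nearer tail
    one_tail = sum(binomial_pmf(r, n, 0.5) for r in range(k + 1))
    return min(1.0, 2 * one_tail)  # symmetric distribution: double one tail


# Pronethalol trial: 1 negative difference out of 12 non-zero differences.
p_one = binomial_pmf(1, 12, 0.5)      # probability of exactly one negative
p_zero = binomial_pmf(0, 12, 0.5)     # probability of no negatives
p_value = sign_test_two_sided(1, 12)  # both tails together
print(p_one, p_zero, p_value)
```

The unrounded two-sided probability is 26/4096, about 0.0063; the 0.00634 in the text comes from adding the rounded tail probabilities.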

Fig. 9.1. Extremes of the Binomial distribution for the sign test

Thus, we would have observed a very unlikely event if the null hypothesis were true. This means that the data are not consistent with the null hypothesis, and we can conclude that there is strong evidence in favour of a difference between the treatments. (Since this was a double blind randomized trial, it is reasonable to suppose that this was caused by the activity of the drug.)

9.3 Principles of significance tests

The sign test is an example of a test of significance. The number of negative changes is called the test statistic, something calculated from the data which can be used to test the null hypothesis. The general procedure for a significance test is as follows.

1. Set up the null hypothesis and its alternative.

2. Find the value of the test statistic.

3. Refer the test statistic to a known distribution which it would follow if the null hypothesis were true.

4. Find the probability of a value of the test statistic arising which is as or more extreme than that observed, if the null hypothesis were true.

5. Conclude that the data are consistent or inconsistent with the null hypothesis.

We shall deal with several different significance tests in this and subsequent chapters. We shall see that they all follow this pattern.

If the data are not consistent with the null hypothesis, the difference is said to be statistically significant. If the data do not support the null hypothesis, it is sometimes said that we reject the null hypothesis, and if the data are consistent with the null hypothesis it is said that we accept it. Such an 'all or nothing' decision making approach is seldom appropriate in medical research. It is preferable to think of the significance test probability as an index of the strength of evidence against the null hypothesis. The term 'accept the null hypothesis' is also misleading because it implies that we have concluded that the null hypothesis is true, which we should not do. We cannot prove statistically that something, such as a treatment effect, does not exist. It is better to say that we have not rejected or have failed to reject the null hypothesis.

The probability of such an extreme value of the test statistic occurring if the null hypothesis were true is often called the P value. It is not the probability that the null hypothesis is true. This is a common misconception. The null hypothesis is either true or it is not; it is not random and has no probability. I suspect that many researchers have managed to use significance tests quite effectively despite holding this incorrect view.

9.4 Significance levels and types of error

We must still consider the question of how small is small. A probability of 0.006, as in the example above, is clearly small and we have a quite unlikely event. But what about 0.06, or 0.1? Suppose we take a probability of 0.01 or less as constituting reasonable evidence against the null hypothesis. If the null hypothesis is true, we shall make a wrong decision one in a hundred times. Deciding against a true null hypothesis is called an error of the first kind, type I error, or α error. We get an error of the second kind, type II error, or β error if we do not reject a null hypothesis which is in fact false. (α and β are the Greek letters 'alpha' and 'beta'.) Now the smaller we demand the probability be before we decide against the null hypothesis, the larger the observed difference must be, and so the more likely we are to miss real differences. By reducing the risk of an error of the first kind we increase the risk of an error of the second kind.

The conventional compromise is to say that differences are significant if the probability is less than 0.05. This is a reasonable guideline, but should not be taken as some kind of absolute demarcation. There is not a great difference between probabilities of 0.06 and 0.04, and they surely indicate similar strength of evidence. It is better to regard probabilities around 0.05 as providing some evidence against the null hypothesis, which increases in strength as the probability falls. If we decide that the difference is significant, the probability is sometimes referred to as the significance level. We say that the significance level is high if the P value is low.

Fig. 9.2. One- and two-sided tests

As a rough and ready guide, we can think of P values as indicating the strength of evidence like this:

greater than 0.1: little or no evidence of a difference or relationship

between 0.05 and 0.1: weak evidence of a difference or relationship

between 0.01 and 0.05: evidence of a difference or relationship

less than 0.01: strong evidence of a difference or relationship

less than 0.001: very strong evidence of a difference or relationship

9.5 One- and two-sided tests of significance

In the above example, the alternative hypothesis was that there was a difference in one direction or the other. This is called a two-sided or two-tailed test, because we used the probabilities of extreme values in both directions. It would have been possible to have the alternative hypothesis that there was a decrease in the pronethalol direction, in which case the null hypothesis would be that the number of attacks on the placebo was less than or equal to the number on pronethalol. This would give P = 0.00317, and of course, a higher significance level than the two-sided test. This would be a one-sided or one-tailed test (Figure 9.2). The logic of this is that we should ignore any signs that the active drug is harmful to the patients. If what we were saying was 'if this trial does not give a significant reduction in angina using pronethalol we will not use it again', this might be reasonable, but the medical research process does not work like that. This is one of several pieces of evidence and so we should certainly use a method of inference which would enable us to detect effects in either direction.

The question of whether one- or two-sided tests should be the norm has been the subject of considerable debate among practitioners of statistical methods. Perhaps the position taken depends on the field in which the testing is usually done. In biological science, treatments seldom have only one effect and relationships between variables are usually complex. Two-sided tests are almost always preferable.

There are circumstances in which a one-sided test is appropriate. In a study of the effects of an investigative procedure, laparoscopy and hydrotubation, on the fertility of sub-fertile women (Luthra et al. 1982), we studied women presenting at an infertility clinic. These women were observed for several months, during which some conceived, before laparoscopy was carried out on those still infertile. These were then observed for several months afterwards and some of these women also conceived. We compared the conception rate in the period before laparoscopy with that afterwards. Of course, women who conceived during the first period did not have a laparoscopy. We argued that the less fertile a woman was the longer it was likely to take her to conceive. Hence, the women who had the laparoscopy should have a lower conception rate (by an unknown amount) than the larger group who entered the study, because the more fertile women had conceived before their turn for laparoscopy came. To see whether laparoscopy increased fertility, we could test the null hypothesis that the conception rate after laparoscopy was less than or equal to that before. The alternative hypothesis was that the conception rate after laparoscopy was higher than that before. A two-sided test was inappropriate because if the laparoscopy had no effect on fertility the post-laparoscopy rate was expected to be lower; chance did not come into it. In fact the post-laparoscopy conception rate was very high and the difference clearly significant.

9.6 Significant, real and important

If a difference is statistically significant, then it may well be real, but not necessarily important. For example, we may look at the effect of a drug, given for some other purpose, on blood pressure. Suppose we find that the drug raises blood pressure by an average of 1 mm Hg, and that this is significant. A rise in blood pressure of 1 mm Hg is not clinically important, so, although it may be there, it does not matter. It is (statistically) significant, and real, but not important.

On the other hand, if a difference is not statistically significant, it could still be real. We may simply have too small a sample to show that a difference exists. Furthermore, the difference may still be important. The difference in mortality in the anticoagulant trial of Carleton et al. (1960), described in Chapter 2, was not significant, the difference in percentage survival being 5.5 in favour of the active treatment. However, the authors also quote a confidence interval for the difference in percentage survival of 24.2 percentage points in favour of heparin to 13.3 percentage points in favour of the control treatment. A difference in survival of 24 percentage points in favour of the treatment would certainly be important if it turned out to be the case. 'Not significant' does not imply that there is no effect. It means that we have failed to demonstrate the existence of one. Later studies showed that anticoagulation is indeed effective.

A particular case of misinterpretation of non-significant results occurs in the interpretation of randomized clinical trials where there is a measurement before treatment and another afterwards. Rather than compare the after treatment measure between the two groups, researchers can be tempted to test separately the null hypotheses that the measure in the treatment group has not changed from baseline and that the measure in the control group has not changed from baseline. If one group shows a significant difference and the other does not, the researchers then conclude that the treatments are different.

For example, Kerrigan et al. (1993) assessed the effects of different levels of information on anxiety in patients due to undergo surgery. They randomized patients to receive either simple or detailed information about the procedure and its risks. Anxiety was again measured after patients had been given the information. Kerrigan et al. (1993) calculated significance tests for the mean change in anxiety score for each group separately. In the group given detailed information the mean change in anxiety was not significant (P = 0.2), interpreted incorrectly as 'no change'. In the other group the reduction in anxiety was significant (P = 0.01). They concluded that there was a difference between the two groups because the change was significant in one group but not in the other. This is incorrect. There may, for example, be a difference in one group which just fails to reach the (arbitrary) significance level and a difference in the other which just exceeds it, the differences in the two groups being similar. We should compare the two groups directly. It is these which are comparable apart from the effects of treatment, being randomized, not the before and after treatment means which could be influenced by many other factors. An alternative analysis tested the null hypothesis that after adjustment for initial anxiety score the mean anxiety scores are the same in patients given simple and detailed information. This showed a significantly higher mean score in the detailed information group (Bland and Altman 1993). Testing within each group separately is essentially the same error as calculating a confidence interval for each group separately (§8.9).

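9.7 Comparing the means of large samples

We can use the large sample 95% confidence interval for the difference between two means, $\bar{x}_1 - \bar{x}_2 \pm 1.96\sqrt{s_1^2/n_1 + s_2^2/n_2}$, to carry out a significance test of the null hypothesis that the difference between the means is zero, i.e. the alternative hypothesis is that $\mu_1$ and $\mu_2$ are not equal. If the confidence interval includes zero, then the probability of getting such extreme data if the null hypothesis were true is greater than 0.05 (i.e. 1 - 0.95). If the confidence interval excludes zero, then the probability of such extreme data under the null hypothesis is less than 0.05 and the difference is significant. Another way of doing the same thing is to note that

$$\frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

is from a Standard Normal distribution, i.e. mean 0 and variance 1. Under the null hypothesis that $\mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$, this is

$$\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

This is the test statistic, and if it lies between -1.96 and +1.96 then the probability of such an extreme value is greater than 0.05 and the difference is not significant. If the test statistic is greater than 1.96 or less than -1.96, there is a less than 0.05 probability of such data arising if the null hypothesis were true, and the data are not consistent with the null hypothesis; the difference is significant at the 0.05 or 5% level. This is the large sample Normal test or z test for two means.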

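For an example, in a study of respiratory symptoms in schoolchildren (§8.5), we wanted to know whether children reported by their parents to have respiratory symptoms had worse lung function than children who were not reported to have symptoms. Ninety-two children were reported to have cough during the day or at night, and their mean PEFR was 294.8 litre/min with standard deviation 57.1 litre/min; 1643 children were reported not to have the symptom, and their mean PEFR was 313.6 litre/min with standard deviation 55.2 litre/min. We thus have two large samples, and can apply the Normal test. We have

$$n_1 = 92,\quad \bar{x}_1 = 294.8,\quad s_1 = 57.1,\qquad n_2 = 1643,\quad \bar{x}_2 = 313.6,\quad s_2 = 55.2$$

The difference between the two groups is $\bar{x}_1 - \bar{x}_2 = 294.8 - 313.6 = -18.8$. The standard error of the difference is

$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{57.1^2}{92} + \frac{55.2^2}{1643}} = 6.11$$

The test statistic is

$$\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{-18.8}{6.11} = -3.08$$

Under the null hypothesis this is an observation from a Standard Normal distribution, and so P < 0.01 (Table 7.2). If the null hypothesis were true, the data which we have observed would be unlikely. We can conclude that there is good evidence that children reported to have cough during the day or at night have lower PEFR than other children.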
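The large sample Normal test for the cough comparison can be reproduced as follows. This is a minimal sketch using only the summary statistics quoted in the text; the function names `two_sided_p` and `z_test_two_means` are illustrative, and the two-sided probability is taken from the Standard Normal distribution via the complementary error function rather than from a printed table.

```python
from math import erfc, sqrt


def two_sided_p(z: float) -> float:
    """Two-sided tail probability of a Standard Normal value z."""
    return erfc(abs(z) / sqrt(2))


def z_test_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """Large sample Normal (z) test for the difference between two means."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # standard error of the difference
    z = (mean1 - mean2) / se              # test statistic under the null hypothesis
    return se, z, two_sided_p(z)


# Children with cough (group 1) versus children without cough (group 2).
se_cough, z_cough, p_cough = z_test_two_means(294.8, 57.1, 92, 313.6, 55.2, 1643)
print(round(se_cough, 2), round(z_cough, 2), p_cough)
```

The standard error comes out at about 6.11 litre/min and the test statistic at about -3.08, with a two-sided probability well below 0.01.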

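For the corresponding comparison of PEFR between children with and without phlegm, the difference in means is -14.6 litre/min with standard error 10.5 litre/min, so the test statistic is -14.6/10.5 = -1.39. This has a probability of about 0.16, and so the data are consistent with the null hypothesis. However, the 95% confidence interval for the difference is -14.6 - 1.96 × 10.5 to -14.6 + 1.96 × 10.5, giving -35 to 6 litre/min. We see that the difference could be just as great as for cough. Because the size of the smaller sample is not so great, the test is less likely to detect a difference for the phlegm comparison than for the cough comparison. The advantages of confidence intervals over tests of significance are discussed by Gardner and Altman (1986). Confidence intervals are usually more informative than P values, particularly non-significant ones.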
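The link between the confidence interval and the test can be illustrated with the two figures quoted for the phlegm comparison (difference -14.6 litre/min, standard error 10.5 litre/min). This is a sketch only; the helper names `ci95` and `normal_two_sided_p` are hypothetical.

```python
from math import erfc, sqrt


def ci95(diff: float, se: float, z_crit: float = 1.96):
    """Large sample 95% confidence interval for a difference."""
    return diff - z_crit * se, diff + z_crit * se


def normal_two_sided_p(diff: float, se: float) -> float:
    """Two-sided probability for the test statistic diff / se."""
    return erfc(abs(diff / se) / sqrt(2))


diff_phlegm, se_phlegm = -14.6, 10.5
lower, upper = ci95(diff_phlegm, se_phlegm)
p_phlegm = normal_two_sided_p(diff_phlegm, se_phlegm)

# The interval includes zero exactly when the two-sided probability exceeds 0.05.
includes_zero = lower <= 0.0 <= upper
print(round(lower, 1), round(upper, 1), round(p_phlegm, 2), includes_zero)
```

Here the interval runs from about -35 to 6 litre/min and includes zero, matching the non-significant probability of about 0.16.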

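9.8 Comparison of two proportions

Suppose we wish to compare two proportions $p_1$ and $p_2$, estimated from large independent samples of size $n_1$ and $n_2$. The null hypothesis is that the proportions in the populations from which the samples are drawn are the same, $p$ say. Since under the null hypothesis the proportions for the two groups are the same, we can get one common estimate of the proportion and use it to estimate the standard errors. We estimate the common proportion from the data by

$$p = \frac{r_1 + r_2}{n_1 + n_2}$$

where $p_1 = r_1/n_1$ and $p_2 = r_2/n_2$. We want to make inferences from the difference between sample proportions, $p_1 - p_2$, so we require the standard error of this difference.

$$\mathrm{SE}(p_1 - p_2) = \sqrt{\mathrm{SE}(p_1)^2 + \mathrm{SE}(p_2)^2}$$

since the samples are independent. Hence

$$\mathrm{SE}(p_1 - p_2) = \sqrt{\frac{p(1-p)}{n_1} + \frac{p(1-p)}{n_2}} = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

As $p$ is based on more subjects than either $p_1$ or $p_2$, if the null hypothesis were true then standard errors would be more reliable than those estimated in §8.6 using $p_1$ and $p_2$ separately. We then find the test statistic

$$z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

In §8.6, we looked at the proportions of children with bronchitis in infancy and with no such history who were reported to have respiratory symptoms in later life. We had 273 children with a history of bronchitis before age 5 years, 26 of whom were reported to have day or night cough at age 14. We had 1046 children with no bronchitis before age 5 years, 44 of whom were reported to have day or night cough at age 14. We shall test the null hypothesis that the prevalence of the symptom is the same in both populations, against the alternative that it is not:

$$p_1 = \frac{26}{273} = 0.09524,\qquad p_2 = \frac{44}{1046} = 0.04207,\qquad p = \frac{26 + 44}{273 + 1046} = 0.05307$$

$$\mathrm{SE}(p_1 - p_2) = \sqrt{0.05307 \times (1 - 0.05307) \times \left(\frac{1}{273} + \frac{1}{1046}\right)} = 0.01524$$

$$z = \frac{0.09524 - 0.04207}{0.01524} = 3.49$$

Referring this to Table 7.2 of the Normal distribution, we find the probability of such an extreme value is less than 0.01, so we conclude that the data are not consistent with the null hypothesis. There is good evidence that children with a history of bronchitis are more likely to be reported to have day or night cough at age 14.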
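The comparison of two proportions can be sketched in the same way, using the counts from the bronchitis example. The function name `z_test_two_proportions` is an illustrative assumption; the calculation uses the pooled proportion for the standard error, as in the test described above, and is not the formula to use for a confidence interval.

```python
from math import erfc, sqrt


def z_test_two_proportions(r1: int, n1: int, r2: int, n2: int):
    """Large sample Normal test for two proportions, pooled standard error."""
    p1, p2 = r1 / n1, r2 / n2
    p = (r1 + r2) / (n1 + n2)                   # common estimate under the null
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # standard error of p1 - p2
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))            # two-sided Normal probability
    return p, se, z, p_value


# Cough at age 14: 26 of 273 with bronchitis before age 5, 44 of 1046 without.
p_pooled, se_prop, z_prop, p_prop = z_test_two_proportions(26, 273, 44, 1046)
print(round(p_pooled, 5), round(se_prop, 5), round(z_prop, 2), p_prop)
```

The test statistic is about 3.49, so the two-sided probability is well below 0.01.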

Note that the standard error used here is not the same as that found in §8.6. It is only correct if the null hypothesis is true. The formula of §8.6 should be used for finding the confidence interval. Thus the standard error used for testing is not identical to that used for estimation, as was the case for the comparison of two means. It is possible for the test to be significant and the confidence interval include zero. This property is possessed by several related tests and confidence intervals.

This is a large sample method, and is equivalent to the chi-squared test for a 2 by 2 table (§13.1, 2). How small the sample can be and methods for small samples are discussed in §13.3-6.

Note that we do not need a different test for the ratio of two proportions, as the null hypothesis that the ratio in the population is one is the same as the null hypothesis that the difference in the population is zero.

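9.9* The power of a test

The test for comparing means in §9.7 is more likely to detect a large difference between two populations than a small one. The probability that a test will produce a significant difference at a given significance level is called the power of the test. For a given test, this will depend on the true difference between the populations compared, the sample size and the significance level chosen. We have already noted in §9.4 that we are more likely to obtain a significant difference with a significance level of 0.05 than with one of 0.01. We have greater power if the P value chosen to be considered as significant is larger.

For the comparison of PEFR in children with and without phlegm (§9.7), for example, suppose that the population means were in fact $\mu_1 = 310$ and $\mu_2 = 295$ litre/min, and each population had standard deviation 55 litre/min. The sample sizes were $n_1 = 1708$ and $n_2 = 27$, so the standard error of the difference would be

$$\sqrt{\frac{55^2}{1708} + \frac{55^2}{27}} = 10.67 \text{ litre/min}$$

The population difference we want to be able to detect is $\mu_1 - \mu_2 = 310 - 295 = 15$, and so the test statistic $(\bar{x}_1 - \bar{x}_2)/10.67$ has mean $15/10.67 = 1.41$ and standard deviation 1. The test is significant in the expected direction when the test statistic exceeds 1.96, that is, when the Standard Normal variable

$$\frac{\bar{x}_1 - \bar{x}_2 - 15}{10.67}$$

exceeds $1.96 - 1.41 = 0.55$, which has probability $1 - \Phi(0.55)$. (The probability of a significant difference in the opposite direction is negligible.)

From Table 7.1, Φ(0.55) is between 0.691 and 0.726, about 0.71. The power of the test would be 1 - 0.71 = 0.29. If these were the population means and standard deviation, our test would have had a poor chance of detecting the difference in means, even though it existed. The test would have low power. Figure 9.3 shows how the power of this test changes with the difference between population means. As the difference gets larger, the power increases, getting closer and closer to 1. The power is not zero even when the population difference is zero, because there is always the possibility of a significant difference, even when the null hypothesis is true. 1 - power = β, the probability of a type II or beta error (§9.4) if the population difference = 15 litre/min.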
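The power calculation can be sketched numerically. The code below assumes, as in the text, a common population standard deviation of 55 litre/min and a two-sided test at the 0.05 level; the names `phi` and `power_two_means` are illustrative. Unlike the hand calculation, it also adds the very small probability of a significant result in the wrong direction.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def power_two_means(delta, sd, n1, n2, z_crit=1.96):
    """Power of the two-sided large sample Normal test for two means.

    delta is the true difference between population means and sd is the
    common population standard deviation (both assumed known here).
    """
    se = sqrt(sd**2 / n1 + sd**2 / n2)
    shift = delta / se                 # mean of the test statistic
    upper = 1 - phi(z_crit - shift)    # significant in the expected direction
    lower = phi(-z_crit - shift)       # significant in the opposite direction
    return upper + lower


power_15 = power_two_means(15, 55, 1708, 27)
power_0 = power_two_means(0, 55, 1708, 27)  # null hypothesis true
print(round(power_15, 2), round(power_0, 2))

# Points along a power curve like Figure 9.3, for differences 0, 10, ..., 60.
curve = [(d, round(power_two_means(d, 55, 1708, 27), 2)) for d in range(0, 61, 10)]
print(curve)
```

With a true difference of 15 litre/min the power is about 0.29, and with no true difference it equals the significance level, 0.05.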

Fig. 9.3. Power curve for a comparison of two means from samples of size 1708 and 27

9.10* Multiple significance tests

If we test a null hypothesis which is in fact true, using 0.05 as the critical significance level, we have a probability of 0.95 of coming to a 'not significant' (i.e. correct) conclusion. If we test two independent true null hypotheses, the probability that neither test will be significant is 0.95 × 0.95 = 0.90 (§6.2). If we test twenty such hypotheses the probability that none will be significant is $0.95^{20} = 0.36$. This gives a probability of 1 - 0.36 = 0.64 of getting at least one significant result; we are more likely to get one than not. The expected number of spurious significant results is 20 × 0.05 = 1.
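These probabilities are easy to tabulate. The short sketch below assumes independent tests of true null hypotheses, each at the same level; the function name `prob_at_least_one_significant` is illustrative.

```python
def prob_at_least_one_significant(k: int, alpha: float = 0.05) -> float:
    """Chance of one or more spurious significant results in k independent
    tests of true null hypotheses, each carried out at level alpha."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 6, 20):
    none_significant = (1 - 0.05) ** k
    expected_spurious = k * 0.05
    print(k, round(none_significant, 2),
          round(prob_at_least_one_significant(k), 2), expected_spurious)
```

For 20 tests the probability of at least one significant result is about 0.64, as in the text.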

Many medical research studies are published with large numbers of significance tests. These are not usually independent, being carried out on the same set of subjects, so the above calculations do not apply exactly. However, it is clear that if we go on testing long enough we will find something which is 'significant'. We must beware of attaching too much importance to a lone significant result among a mass of non-significant ones. It may be the one in twenty which we should get by chance alone.

This is particularly important when we find that a clinical trial or epidemiological study gives no significant difference overall, but does so in a particular subset of subjects, such as women aged over 60. For example, Lee et al. (1980) simulated a clinical trial of the treatment of coronary artery disease by allocating 1073 patient records from past cases into two 'treatment' groups at random. They then analysed the outcome as if it were a genuine trial of two treatments. The analysis was quite detailed and thorough. As we would expect, it failed to show any significant difference in survival between those patients allocated to the two 'treatments'. Patients were then subdivided by two variables which affect prognosis, the number of diseased coronary vessels and whether the left ventricular contraction pattern was normal or abnormal. A significant difference in survival between the two 'treatment' groups was found in those patients with three diseased vessels (the maximum) and abnormal ventricular contraction. As this would be the subset of patients with the worst prognosis, the finding would be easy to account for by saying that the superior 'treatment' had its greatest advantage in the most severely ill patients! The moral of this story is that if there is no difference between the treatments overall, significant differences in subsets are to be treated with the utmost suspicion. This method of looking for a difference in treatment effect between subgroups of subjects is incorrect. A correct approach would be to use a multifactorial analysis, as described in Chapter 17, with treatment and group as two factors, and test for an interaction between groups and treatments. The power for detecting such interactions is quite low, and we need a larger sample than would be needed simply to show a difference overall (Altman and Matthews 1996, Matthews and Altman 1996a, b).

This spurious significant difference comes about because, when there is no real difference, the probability of getting no significant differences in six subgroups is $0.95^{6} = 0.74$, not 0.95. We can allow for this effect by the Bonferroni method. In general, if we have k independent significance tests, at the α level, of null hypotheses which are all true, the probability that we will get no significant differences is $(1 - \alpha)^k$. If we make α small enough, we can make the probability that none of the separate tests is significant equal to 0.95. Then if any of the k tests has a P value less than α, we will have a significant difference between the treatments at the 0.05 level. Since α will be very small, it can be shown that $(1 - \alpha)^k \approx 1 - k\alpha$. If we put kα = 0.05, so α = 0.05/k, we will have probability 0.05 that one of the k tests will have a P value less than α if the null hypotheses are true. Thus, if in a clinical trial we compare two treatments within 5 subsets of patients, the treatments will be significantly different at the 0.05 level if there is a P value less than 0.01 within any of the subsets. This is the Bonferroni method. Note that they are not significant at the 0.01 level, but at only the 0.05 level. The k tests together test the composite null hypothesis that there is no treatment effect on any variable.

We can do the same thing by multiplying the observed P value from the significance tests by the number of tests, k, any kP which exceeds one being ignored. Then if any kP is less than 0.05, the two treatments are significant at the 0.05 level.

For example, Williams et al. (1992) randomly allocated elderly patients discharged from hospital to two groups. The intervention group received timetabled visits by health visitor assistants; patients in the control group were not visited unless there was perceived need. Soon after discharge and after one year, patients were assessed for physical health, disability, and mental state using questionnaire scales. There were no significant differences overall between the intervention and control groups, but among women aged 75–79 living alone the control group showed significantly greater deterioration in physical score than did the intervention group (P = 0.04), and among men over 80 years the control group showed significantly greater deterioration in disability score than did the intervention group (P = 0.03). The authors stated that 'Two small sub-groups of patients were possibly shown to have benefited from the intervention…. These benefits, however, have to be treated with caution, and may be due to chance factors.' Subjects were cross-classified by age groups, whether living alone, and sex, so there were at least eight subgroups, if not more. Thus even if we consider the three scales separately, only a P value less than 0.05/8 = 0.006 would provide evidence of a treatment effect. Alternatively, the true P values are 8 × 0.04 = 0.32 and 8 × 0.03 = 0.24.
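The Bonferroni adjustment applied to this example can be written as a small helper. This is a sketch: the function names are hypothetical, and capping kP at one is a common convention adopted here as an assumption, standing in for the text's instruction to ignore any kP which exceeds one.

```python
def bonferroni_threshold(k: int, overall_alpha: float = 0.05) -> float:
    """Level at which each of k tests is judged, so that the overall
    probability of any spurious significant result is about overall_alpha."""
    return overall_alpha / k


def bonferroni_adjust(p_values, k=None):
    """Multiply each observed P value by the number of tests k.

    k defaults to the number of P values supplied; products above one are
    capped at one (an assumed convention).
    """
    k = len(p_values) if k is None else k
    return [min(1.0, k * p) for p in p_values]


# Two reported subgroup P values, with at least eight subgroups examined.
threshold = bonferroni_threshold(8)
adjusted = bonferroni_adjust([0.04, 0.03], k=8)
print(round(threshold, 3), [round(p, 2) for p in adjusted])
```

Neither adjusted P value (0.32 and 0.24) is below 0.05, in line with the authors' caution.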

A similar problem arises if we have multiple outcome measurements. For example, Newnham et al. (1993) randomized pregnant women to receive a series of Doppler ultrasound blood flow measurements or to control. They found a significantly higher proportion of birthweights below the 10th and 3rd centiles (P = 0.006 and P = 0.02). These were only two of many comparisons, however, and one would suspect that there may be some spurious significant differences among so many. At least 35 were reported in the paper, though only these two were reported in the abstract. (Birthweight was not the intended outcome variable for the trial.) These tests are not independent, because they are all on the same subjects, using variables which may not be independent. The proportions of birthweights below the 10th and 3rd centiles are clearly not independent, for example. The probability that two correlated variables both give non-significant differences when the null hypothesis is true is greater than $(1 - \alpha)^2$ because if the first test is not significant, the second now has a probability greater than 1 - α of being not significant also. (Similarly, the probability that both are significant exceeds $\alpha^2$, and the probability that only one is significant is reduced.)

For k tests the probability of no significant differences is greater than $(1 - \alpha)^k$ and so greater than 1 - kα. Thus if we carry out each test at the α = 0.05/k level, we will have a probability of no significant differences which is greater than 0.95. A P value less than α for any variable, or kP < 0.05, would mean that the treatments were significantly different. For the example, the P values could be adjusted by 35 × 0.006 = 0.21 and 35 × 0.02 = 0.70.

Because the probability of obtaining no significant differences if the null hypotheses are all true is greater than the 0.95 which we want it to be, the overall P value is actually smaller than the nominal 0.05, by an unknown amount which depends on the lack of independence between the tests. The power of the test, its ability to detect true differences in the population, is correspondingly diminished. In statistical terms, the test is conservative.

Other multiple testing problems arise when we have more than two groups of subjects and wish to compare each pair of groups (§10.9), when we have a series of observations over time, such as blood pressure every 15 min after administration of a drug, where there may be a temptation to test each time point separately (§10.7), and when we have relationships between many variables to examine, as in a survey. For all these problems, the multiple tests are highly correlated and the Bonferroni method is inappropriate, as it will be highly conservative and may miss real differences.

9.11* Repeated significance tests and sequential analysis

A special case of multiple testing arises in clinical trials, where patients are admitted at different times. There can be a temptation to keep looking at the data and carrying out significance tests. As described above (§9.10), this is liable to produce spurious significant differences, detecting treatment effects where none exist. I have heard of researchers testing the difference each time a patient is added and stopping the trial as soon as the difference is significant, then submitting the paper for publication as if only one test had been carried out. I will be charitable and put this down to ignorance.

It is quite legitimate to set up a trial where the treatment difference is tested every time a patient is added, provided this repeated testing is designed into the trial and the overall chance of a significant difference when the null hypothesis is true remains 0.05. Such designs are called sequential clinical trials. A comprehensive account is given by Whitehead (1997).

An alternative approach which is quite often used is to take a small number of looks at the data as the trial progresses, testing at a predetermined P value. For example, we could test three times, rejecting the null hypothesis of no treatment effect the first time only if P < 0.001, the second time if P < 0.01, and the third time if P < 0.04. Then if the null hypothesis is true, the probability that there will not be a significant difference is approximately 0.999 × 0.99 × 0.96 = 0.949, so the overall alpha level will be 1 - 0.949 = 0.051, i.e. approximately 0.05. (The calculation is approximate because the tests are not independent.) If the null hypothesis is rejected at any of these tests, the overall P value is 0.05, not the nominal one. This approach can be used by data monitoring committees, where if the trial shows a large difference early on the trial can be stopped yet still allow a statistical conclusion to be drawn. This is called the alpha spending or P-value spending approach.
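The approximate overall alpha level for a planned series of looks can be computed directly. The sketch below treats the looks as independent, which the text notes is only an approximation; the function name `approximate_overall_alpha` is illustrative.

```python
from math import prod


def approximate_overall_alpha(nominal_levels) -> float:
    """Approximate chance of at least one significant result over several
    looks at the data when the null hypothesis is true, treating the looks
    as independent (an approximation, since they share data)."""
    return 1 - prod(1 - a for a in nominal_levels)


# Three looks, tested at nominal levels 0.001, 0.01 and 0.04.
looks = [0.001, 0.01, 0.04]
overall = approximate_overall_alpha(looks)
print(round(overall, 3))
```

For the three looks in the text this gives about 0.051, close to the intended overall level of 0.05.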

Two particular methods which you might come across are the grouped sequential design of Pocock (1977, 1982), where each test is done at the same nominal alpha value, and the method of O'Brien and Fleming (1979), widely used by the pharmaceutical industry, where the nominal alpha values decrease sharply as the trial progresses.


44. In a case–control study, patients with a given disease drank coffee more frequently than did controls, and the difference was highly significant. We can conclude that:

(a) drinking coffee causes the disease;

(b) there is evidence of a real relationship between the disease and coffee drinking in the sampled population;

(c) the disease is not related to coffee drinking;

(d) eliminating coffee would prevent the disease;

(e) coffee and the disease always go together.

45. When comparing the means of two large samples using the Normal test:

(a) the null hypothesis is that the sample means are equal;

(b) the null hypothesis is that the means are not significantly different;

(c) standard error of the difference is the sum of the standard errors of the means;

(d) the standard errors of the means must be equal;

(e) the test statistic is the ratio of the difference to its standard error.

46. In a comparison of two methods of measuring PEFR, 6 of 17 subjects had higher readings on the Wright peak flow meter, 10 had higher readings on the mini peak flow meter and one had the same on both. If the difference between the instruments is tested using a sign test:

(a) the test statistic may be the number with the higher reading on the Wright meter;

(b) the null hypothesis is that there is no tendency for one instrument to read higher than the other;

(c) a one-tailed test of significance should be used;

(d) the test statistic should follow the Binomial distribution (n = 16 and p = 0.5) if the null hypothesis were true;

(e) the instruments should have been presented in random order.

47. In a small randomized double blind trial of a new treatment in acute myocardial infarction, the mortality in the treated group was half that in the control group, but the difference was not significant. We can conclude that:

(a) the treatment is useless;

(b) there is no point in continuing to develop the treatment;

(c) the reduction in mortality is so great that we should introduce the treatment immediately;

(d) we should keep adding cases to the trial until the Normal test for comparison of two proportions is significant;

(e) we should carry out a new trial of much greater size.


48. In a large sample comparison between two groups, increasing the sample size will:

(a) improve the approximation of the test statistic to the Normal distribution;

(b) decrease the chance of an error of the first kind;

(c) decrease the chance of an error of the second kind;

(d) increase the power against a given alternative;

(e) make the null hypothesis less likely to be true.


49. In a study of breast feeding and intelligence (Lucas et al. 1992), 300 children who were very small at birth were given their mother's breast milk or infant formula, at the choice of the mother. At the age of 8 years the IQ of these children was measured. The mean IQ in the formula group was 92.8, compared to a mean of 103.0 in the breast milk group. The difference was significant, P < 0.001:

(a) there is good evidence that formula feeding of very small babies reduces IQ at age eight;

(b) there is good evidence that choosing to express breast milk is related to higher IQ in the child at age eight;

(c) type of milk has no effect on subsequent IQ;

(d) the probability that type of milk affects subsequent IQ is less than 0.1%;

(e) if type of milk were unrelated to subsequent IQ, the probability of getting a difference in mean IQ as big as that observed is less than 0.001.


9E Exercise: Crohn's disease and cornflakes

The suggestion that cornflakes may cause Crohn's disease arose in the study of James (1977). Crohn's disease is an inflammatory disease, usually of the last part of the small intestine. It can cause a variety of symptoms, including vague pain, diarrhoea, acute pain and obstruction. Treatment may be by drugs or surgery, but many patients have had the disease for many years. James' initial hypothesis was that foods taken at breakfast may be associated with Crohn's disease. James studied 16 men and 18 women with Crohn's disease, aged 19–64 years, mean time since diagnosis 4.2 years. These were compared to controls, drawn from hospital patients without major gastro-intestinal symptoms. Two controls were chosen per patient, matched for age and sex. James interviewed all cases and controls himself. Cases were asked whether they ate various foods for breakfast before the onset of symptoms, and controls were asked whether they ate various foods before a corresponding time (Table 9.2). There was a significant excess of eating of cornflakes, wheat and bran among the Crohn's patients. The consumption of different cereals was interrelated, people reporting one cereal being likely to report others. In James' opinion the principal association of Crohn's disease was with cornflakes, based on the apparent strength of the association. Only one case had never eaten cornflakes.

Table 9.2. Numbers of Crohn's disease patients and controls who ate various cereals regularly (at least once per week) (James 1977)

Cereal       Frequency         Patients   Controls   Significance test
Cornflakes   Regularly         23         17         P < 0.0001
             Rarely or never   11         51
Wheat        Regularly         16         12         P < 0.01
             Rarely or never   18         56
Porridge     Regularly         11         15         0.5 > P > 0.1
             Rarely or never   23         53
Rice         Regularly         8          10         0.5 > P > 0.1
             Rarely or never   26         56
Bran         Regularly         6          2          P = 0.02
             Rarely or never   28         66
Muesli       Regularly         4          3          P = 0.17
             Rarely or never   30         65

Several papers soon appeared in which this study was repeated, with variations. None was identical in design to James' study and none appeared to support his findings. Mayberry et al. (1978) interviewed 100 patients with Crohn's disease, mean duration nine years. They obtained 100 controls, matched for age and sex, from patients and their relatives attending a fracture clinic. Cases and controls were interviewed about their current breakfast habits (Table 9.3). The only significant difference was an excess of fruit juice drinking in controls. Cornflakes were eaten by 29 cases compared to 22 controls, which was not significant. In this study there was no particular tendency for cases to report more foods than controls. The authors also asked cases whether they knew of an association between food (unspecified) and Crohn's disease. The association with cornflakes was reported by 29, and 12 of these had stopped eating them, having previously eaten them regularly. In their 29 matched controls, 3 were past cornflakes eaters. Of the 71 Crohn's patients who were unaware of the association, 21 had discontinued eating cornflakes compared to 10 of their 71 controls. The authors remarked ‘seemingly patients with Crohn's disease had significantly reduced their consumption of cornflakes compared with controls, irrespective of whether they were aware of the possible association’.

1. Are the cases and controls comparable in either of these studies?


2. What other sources of bias could there be in these designs?


Table 9.3. Number of patients and controls regularly consuming certain foods at least twice weekly (Mayberry et al. 1978)

Foods at breakfast                       Crohn's patients (n = 100)   Controls (n = 100)   Significance test
Bread                                    91                           86
Toast                                    59                           64
Egg                                      31                           37
Fruit or fruit juice                     14                           30                   P < 0.02
Porridge                                 20                           18
Weetabix, shreddies or shredded wheat    21                           19
Cornflakes                               29                           22
Special K                                4                            7
Rice krispies                            6                            6
Sugar puffs                              3                            1
Bran or allbran                          13                           12
Muesli                                   3                            10
Any cereal                               55                           55

3. What is the main point of difference in design between the study of James and that of Mayberry et al.?


4. In the study of Mayberry et al. how many Crohn's cases and how many controls had ever been regular eaters of cornflakes? How does this compare with James' findings?


5. Why did James think that eating cornflakes was particularly important?


6. For the data of Table 9.2, calculate the percentage of cases and controls who said that they ate the various cereals. Now divide the proportion of cases who said that they had eaten the cereal by the proportion of controls who reported eating it. This tells us, roughly, how much more likely cases were to report the cereal than were controls. Do you think eating cornflakes is particularly important?


7. If we have an excess of all cereals when we ask what was ever eaten, and none when we ask what is eaten now, what possible factors could account for this?





10 Comparing the means of small samples

10.1 The t distribution

We have seen in Chapters 8 and 9 how the Normal distribution can be used to calculate confidence intervals and to carry out tests of significance for the means of large samples. In this chapter we shall see how similar methods may be used when we have small samples, using the t distribution, and go on to compare several means.

So far, the probability distributions we have used have arisen because of the way data were collected, either from the way samples were drawn (Binomial distribution), or from the mathematical properties of large samples (Normal distribution). The distribution did not depend on any property of the data themselves. To use the t distribution we must make an assumption about the distribution from which the observations themselves are taken, the distribution of the variable in the population. We must assume this to be a Normal distribution. As we saw in Chapter 7, many naturally occurring variables have been found to follow a Normal distribution closely. I shall discuss the effects of any deviations from the Normal later.

Fig. 10.1. Student's t distribution with 1, 4 and 20 degrees of freedom, showing convergence to the Standard Normal distribution

Table 10.1. Two-tailed probability points of the t distribution

D.f.   Probability                          D.f.   Probability
       0.10    0.05    0.01    0.001               0.10    0.05
       (10%)   (5%)    (1%)    (0.1%)              (10%)   (5%)
1      6.31    12.70   63.66   636.62       16     1.75    2.12
2      2.92    4.30    9.93    31.60        17     1.74    2.11
3      2.35    3.18    5.84    12.92        18     1.73    2.10
4      2.13    2.78    4.60    8.61         19     1.73    2.09
5      2.02    2.57    4.03    6.87         20     1.72    2.09
6      1.94    2.45    3.71    5.96         21     1.72    2.08
7      1.89    2.36    3.50    5.41         22     1.72    2.07
8      1.86    2.31    3.36    5.04         23     1.71    2.07
9      1.83    2.26    3.25    4.78         24     1.71    2.06
10     1.81    2.23    3.17    4.59         25     1.71    2.06
11     1.80    2.20    3.11    4.44         30     1.70    2.04
12     1.78    2.18    3.05    4.32         40     1.68    2.02
13     1.77    2.16    3.01    4.22         60     1.67    2.00
14     1.76    2.14    2.98    4.14         120    1.66    1.98
15     1.75    2.13    2.95    4.07         ∞      1.64    1.96

D.f. = Degrees of freedom.

∞ = infinity, same as the Standard Normal distribution.

Like the Normal distribution, the t distribution function cannot be integrated algebraically and its numerical values have been tabulated. Because the t distribution depends on the degrees of freedom, it is not usually tabulated in full like the Normal distribution in Table 7.1. Instead, probability points are given for different degrees of freedom. Table 10.1 shows two-sided probability points for selected degrees of freedom. Thus, with 4 degrees of freedom, we can see that, with probability 0.05, t will be 2.78 or more from its mean, zero.

Because only certain probabilities are quoted, we cannot usually find the exact probability associated with a particular value of t. For example, suppose we want to know the probability of t on 9 degrees of freedom being further from zero than 3.7. From Table 10.1 we see that the 0.01 point is 3.25 and the 0.001 point is 4.78. We therefore know that the required probability lies between 0.01 and 0.001. We could write this as 0.001 < P < 0.01. Often the lower bound, 0.001, is omitted and we write P < 0.01. With a computer it is possible to calculate the exact probability every time, so this common practice is due to disappear.
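As an illustration of what 'calculating the exact probability' might look like, here is a minimal Python sketch using only the standard library. It is not taken from the book: it integrates the t density numerically by Simpson's rule, and the function names `t_density`, `two_tailed_p` and `two_tailed_point` are hypothetical helpers introduced for this example. A statistical package would normally be used instead.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df):
    """P(|T| > t), from Simpson's rule applied to the density on [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    steps = 2 * max(1000, int(50 * t))  # even number of intervals, width <= 0.01
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3        # area between 0 and |t|
    return 1 - 2 * central_area

def two_tailed_point(p, df):
    """Value of t exceeded in absolute size with probability p (by bisection)."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The example in the text: t = 3.7 on 9 degrees of freedom.
print(f"P(|t| > 3.7) on 9 d.f. = {two_tailed_p(3.7, 9):.4f}")
# A probability point of the kind tabulated in Table 10.1.
print(f"0.05 point on 4 d.f.   = {two_tailed_point(0.05, 4):.2f}")
```

The first result should fall between the 0.001 and 0.01 bounds read from the table, and the second should agree with the tabulated 2.78.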

Fig. 10.2. Sample t ratios derived from 750 samples of 4 human heights and the t distribution, after Student (1908)

10.2 The one-sample t method

We can use the t distribution to find confidence intervals for means estimated from a small sample from a Normal distribution. We do not usually have small samples in sample surveys, but we often find them in clinical studies. For example, we can use the t distribution to find confidence intervals for the size of difference between two treatment groups, or between measurements obtained from subjects under two conditions. I shall deal with the latter, single sample problem first.

The population mean, µ, is unknown and we wish to estimate it using a 95% confidence interval. We can see that, for 95% of samples, the difference between x̄ and µ is at most t standard errors, where t is the value of the t distribution such that 95% of observations will be closer to zero than t. For a large sample this will be 1.96 as for the Normal distribution. For small samples we must use Table 10.1. In this table, the probability that the t distribution is further from zero than t is given, so we must first find one minus our desired probability, 0.95. We have 1 - 0.95 = 0.05, so we use the 0.05 column of the table to get the value of t. We then have the 95% confidence interval: x̄ - t standard errors to x̄ + t standard errors. The usual application of this is to differences between measurements made on the same or on matched pairs of subjects. In this application the one sample t test is also known as the paired t test.

Consider the data of Table 10.2. (I asked the researcher why there were so many missing data. He told me that some of the biopsies were not usable to count the capillaries, and that some of these patients were amputees and the foot itself was missing.) We shall estimate the difference in capillary density between the worse foot (in terms of ulceration, not capillaries) and the better foot for the ulcerated patients. The first step is to find the differences (worse - better). We then find the mean difference and its standard error, as described in §8.2. These are in the last column of Table 10.2.

Table 10.2. Capillary density (per mm²) in the feet of ulcerated patients and a healthy control group (data supplied by Marc Lamah)

Controls                        Ulcerated patients
Right   Left    Average†        Worse   Better  Average†   Difference
foot    foot                    foot    foot               (worse - better)
19      16      17.5            9       ?       9.0        ?
25      30      27.5            11      ?       11.0       ?
25      29      27.0            15      10      12.5       5
26      33      29.5            16      21      18.5       -5
26      28      27.0            18      18      18.0       0
30      28      29.0            18      18      18.0       0
33      36      34.5            19      26      22.5       -7
33      29      31.0            20      ?       20.0       ?
34      37      35.5            20      20      20.0       0
34      33      33.5            20      33      26.5       -13
34      37      35.5            20      26      23.0       -6
34      ?       34.0            21      15      18.0       6
35      38      36.5            22      23      22.5       -1
36      40      38.0            22      ?       22.0       ?
39      41      40.0            23      23      23.0       0
40      39      39.5            25      30      27.5       -5
41      39      40.0            26      31      28.5       -5
41      39      40.0            27      26      26.5       1
56      48      52.0            27      ?       27.0       ?
                                35      23      29.0       12
                                47      42      44.5       5
                                ?       24      24.0       ?
                                ?       28      28.0       ?

                     Controls    Ulcerated   Difference
                     (average)   (average)   (worse - better)
Number               19          23          16
Mean                 34.08       22.59       -0.81
Sum of squares       956.13      1176.32     550.44
Variance             53.12       53.47       36.70
Standard deviation   7.29        7.31        6.06
Standard error       1.67        1.52        1.51

† Average of right and left foot (controls) or of worse and better foot (ulcerated patients). When one observation is missing the average = the other observation. ? = Missing data.

To find the 95% confidence interval for the mean difference we must suppose that the differences follow a Normal distribution. To calculate the interval, we first require the relevant point of the t distribution from Table 10.1. There are 16 non-missing differences and hence n - 1 = 15 degrees of freedom associated with s². We want a probability of 0.95 of being closer to zero than t, so we go to Table 10.1 with probability = 1 - 0.95 = 0.05. Using the 15 d.f. row, we get t = 2.13. Hence the difference between a sample mean and the population mean is less than 2.13 standard errors for 95% of samples, and the 95% confidence interval is -0.81 - 2.13 × 1.51 to -0.81 + 2.13 × 1.51 = -4.03 to +2.41 capillaries/mm².

On the basis of these data, the capillary density could be less in the worse affected foot by as much as 4.03 capillaries/mm², or greater by as much as 2.41 capillaries/mm². In the large sample case, we would use the Normal distribution instead of the t distribution, putting 1.96 instead of 2.13. We would not then need the differences themselves to follow a Normal distribution.
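The arithmetic of this interval can be sketched in a few lines of Python. This is an illustration rather than the author's own calculation: the two lists are the worse-foot and better-foot values for the 16 ulcerated patients in Table 10.2 who have both measurements, and the t point 2.13 is simply copied from Table 10.1. Because the code carries unrounded intermediate values, the lower limit may differ from the text in the second decimal place.

```python
import math
import statistics

# Ulcerated patients in Table 10.2 with both feet measured (16 pairs).
worse = [15, 16, 18, 18, 19, 20, 20, 20, 21, 22, 23, 25, 26, 27, 35, 47]
better = [10, 21, 18, 18, 26, 20, 33, 26, 15, 23, 23, 30, 31, 26, 23, 42]
differences = [w - b for w, b in zip(worse, better)]  # worse - better

n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)  # standard error of the mean

t_point = 2.13  # 0.05 two-tailed point for n - 1 = 15 d.f., read from Table 10.1
lower = mean_diff - t_point * se_diff
upper = mean_diff + t_point * se_diff

print(f"n = {n}, mean difference = {mean_diff:.2f}, standard error = {se_diff:.2f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f} capillaries/mm2")
```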

Fig. 10.3. Normal plot for differences and plot of difference against average for the data of Table 10.2, ulcerated patients

We can also use the t distribution to test the null hypothesis that in the population the mean difference is zero. If the null hypothesis were true, and the differences follow a Normal distribution, the test statistic mean/standard error would be from a t distribution with n - 1 degrees of freedom. This is because the null hypothesis is that the mean difference µ = 0, hence the numerator x̄ - µ = x̄. We have the usual ‘estimate over standard error’ formula. For the example, we have

\[ t = \frac{\bar{x}}{\text{standard error}} = \frac{-0.81}{1.51} = -0.54 \]

If we go to the 15 degrees of freedom row of Table 10.1, we find that the probability of such an extreme value arising is greater than 0.10, the 0.10 point of the distribution being 1.75. Using a computer we would find P = 0.6. The data are consistent with the null hypothesis and we have failed to demonstrate the existence of a difference. Note that the confidence interval is more informative than the significance test.

We could also use the sign test to test the null hypothesis of no difference. This gives us 5 positives out of 12 differences (4 differences, being zero, give no useful information) which gives a two-sided probability of 0.8, a little larger than that given by the t test. Provided the assumption of a Normal distribution is true, the t test is preferred because it is the most powerful test, and so most likely to detect differences should they exist.
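Both tests can be reproduced from the paired data with a short Python sketch. It is offered as an illustration under stated assumptions, not as the book's procedure: the lists hold the 16 complete worse/better pairs from Table 10.2, the paired t statistic is compared with the 0.10 point quoted in the text rather than converted to an exact P value, and the sign test probability is computed exactly from the Binomial distribution with p = 0.5.

```python
import math
import statistics

# The 16 complete pairs for ulcerated patients in Table 10.2.
worse = [15, 16, 18, 18, 19, 20, 20, 20, 21, 22, 23, 25, 26, 27, 35, 47]
better = [10, 21, 18, 18, 26, 20, 33, 26, 15, 23, 23, 30, 31, 26, 23, 42]
differences = [w - b for w, b in zip(worse, better)]

# Paired t test: estimate over standard error, on n - 1 degrees of freedom.
n = len(differences)
se = statistics.stdev(differences) / math.sqrt(n)
t_stat = statistics.mean(differences) / se
ten_percent_point = 1.75  # 0.10 two-tailed point for 15 d.f., from Table 10.1
print(f"t = {t_stat:.2f} on {n - 1} d.f.; "
      f"P > 0.10 is {abs(t_stat) < ten_percent_point}")

# Sign test: zero differences carry no information and are dropped.
positives = sum(d > 0 for d in differences)
negatives = sum(d < 0 for d in differences)
m = positives + negatives
k = min(positives, negatives)
one_tail = sum(math.comb(m, i) for i in range(k + 1)) / 2 ** m  # P(X <= k), X ~ Binomial(m, 0.5)
p_sign = min(1.0, 2 * one_tail)
print(f"sign test: {positives} positives out of {m}, two-sided P = {p_sign:.2f}")
```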

The validity of the paired t method described above depends on the assumption that the differences are from a Normal distribution. We can check the assumption of a Normal distribution by a Normal plot (§7.5). Figure 10.3 shows a Normal plot for the differences. The points lie close to the expected line, suggesting that there is little deviation from the Normal.

Another plot which is a useful check here is the difference against the subject mean (Figure 10.3). If the difference depends on magnitude, then we should be careful of drawing any conclusion about the mean difference. We may want to investigate this further, perhaps by transforming the data (§10.4). In this case the difference between the two feet does not appear to be related to the level of capillary density and we need not be concerned about this.

The differences may look like a fairly good fit to the Normal even when the measurements themselves do not. There are two reasons for this: the subtraction removes variability between subjects, leaving the measurement error which is more likely to be Normal, and the two measurement errors are then added by the differencing, producing the tendency of sums to the Normal seen in the Central Limit theorem (§7.3). The assumption of a Normal distribution for the one sample case is quite likely to be met. I discuss this further in §10.5.

10.3 The means of two independent samples

Suppose we have two samples from populations which have a Normal distribution, with which we want to estimate the difference between the population means. If the samples were large, the 95% confidence interval for the difference would be the observed difference - 1.96 standard errors to observed difference + 1.96 standard errors. Unfortunately, we cannot simply replace 1.96 by a number from Table 10.1. This is because the standard error does not have the simple form described in §10.1. It is not based on a single sum of squares, but rather is the square root of the sum of two constants multiplied by two sums of squares. Hence, it does not follow the square root of the Chi-squared distribution as required for the denominator of a t distributed random variable (§7A). In order to use the t distribution we must make a further assumption about the data. Not only must the samples be from Normal distributions, they must be from Normal distributions with the same variance. This is not as unreasonable an assumption as it may sound. A difference in mean but not in variability is a common phenomenon. The PEFR data for children with and without symptoms analysed in §8.5 and §9.6 show the characteristic very clearly, as do the average capillary densities in Table 10.2.

We now estimate the common variance, s². First we find the sum of squares about the sample mean for each sample, which we can label SS1 and SS2. We form a combined sum of squares by SS1 + SS2. The sum of squares for the first group, SS1, has n1 - 1 degrees of freedom and the second, SS2, has n2 - 1 degrees of freedom. The total degrees of freedom is therefore n1 - 1 + n2 - 1 = n1 + n2 - 2. We have lost 2 degrees of freedom because we have a sum of squares about two means, each estimated from the data. The combined estimate of variance is

\[ s^2 = \frac{SS_1 + SS_2}{n_1 + n_2 - 2} \]

The standard error of x̄1 - x̄2 is

\[ \sqrt{\frac{s^2}{n_1} + \frac{s^2}{n_2}} = \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]

Now we have a standard error related to the square root of the Chi-squared distribution and we can get a t distributed variable by

\[ \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]

having n1 + n2 - 2 degrees of freedom. The 95% confidence interval for the difference between population means, µ1 - µ2, is

\[ \bar{x}_1 - \bar{x}_2 - t\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \quad\text{to}\quad \bar{x}_1 - \bar{x}_2 + t\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]

where t is the 0.05 point with n1 + n2 - 2 degrees of freedom from Table 10.1. Alternatively, we can test the null hypothesis that in the population the difference is zero, i.e. that µ1 = µ2, using the test statistic

\[ \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]

which would follow the t distribution with n1 + n2 - 2 d.f. if the null hypothesis were true.

Fig. 10.4. Scatter plot against group and Normal plot for the patient averages of Table 10.2

For a practical example, Table 10.2 shows the average capillary density over both feet (if present) for normal control subjects as well as ulcer patients. We shall estimate the difference between the ulcerated patients and controls. We can check the assumptions of Normal distribution and uniform variance. From Table 10.2 the variances appear remarkably similar, 53.12 and 53.47. Figure 10.4 shows that there appears to be a shift of mean only. The Normal plot combines the groups by taking the differences between each observation and its group mean, called the residuals. This has a slight kink at the end but no pronounced curve, suggesting that there is little deviation from the Normal. I therefore feel quite happy that the assumptions of the two-sample t method are met.

First we find the common variance estimate, s². The sums of squares about the two sample means are 956.13 and 1176.32. This gives the combined sum of squares about the sample means to be 956.13 + 1176.32 = 2132.45. The combined degrees of freedom are n1 + n2 - 2 = 19 + 23 - 2 = 40. Hence s² = 2132.45/40 = 53.31. The standard error of the difference between means is

\[ \sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{53.31\left(\frac{1}{19} + \frac{1}{23}\right)} = 2.26 \]

The value of the t distribution for the 95% confidence interval is found from the 0.05 column and 40 degrees of freedom row of Table 10.1, giving t0.05 = 2.02. The difference between means (control - ulcerated) is 34.08 - 22.59 = 11.49. Hence the 95% confidence interval is 11.49 - 2.02 × 2.26 to 11.49 + 2.02 × 2.26, giving 6.92 to 16.06 capillaries/mm². Hence there is clearly a difference in capillary density between normal controls and ulcerated patients.

To test the null hypothesis that in the population the control - ulcerated difference is zero, the test statistic is difference over standard error, 11.49/2.26 = 5.08. If the null hypothesis were true, this would be an observation from the t distribution with 40 degrees of freedom. From Table 10.1, the probability of such an extreme value is less than 0.001. Hence the data are not consistent with the null hypothesis and we can conclude that there is strong evidence of a difference in the populations which these patients represent.
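The two-sample calculation needs only the summary figures quoted above, so it can be checked with a short Python sketch. This is an illustration, not the author's code: the numbers, means and sums of squares are those given for the averaged capillary densities in Table 10.2, the t point 2.02 is copied from Table 10.1, and the variable names are chosen for this example.

```python
import math

# Summary statistics for the per-subject averages in Table 10.2.
n1, mean1, ss1 = 19, 34.08, 956.13    # controls
n2, mean2, ss2 = 23, 22.59, 1176.32   # ulcerated patients

df = n1 + n2 - 2                          # combined degrees of freedom
s2 = (ss1 + ss2) / df                     # common (pooled) variance estimate
se = math.sqrt(s2 * (1 / n1 + 1 / n2))    # standard error of the difference
diff = mean1 - mean2                      # control - ulcerated

t_point = 2.02  # 0.05 two-tailed point for 40 d.f., read from Table 10.1
lower = diff - t_point * se
upper = diff + t_point * se
t_stat = diff / se

print(f"s2 = {s2:.2f} on {df} d.f., standard error = {se:.2f}")
print(f"difference = {diff:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print(f"t = {t_stat:.2f}")
```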

10.4 The use of transformations

We have already seen (§7.4) that some variables which do not follow a Normal distribution can be made so by a suitable transformation. The same transformation can be used to make the variance similar in different groups, called variance stabilizing transformations. Because mean and variance in samples from the same population are independent if and only if the distribution is Normal (§7A), stable variances and Normal distributions tend to go together.

Often standard deviation and mean are connected by a simple relationship of the form s = a x̄^b, where a and b are constants. If this is so, it can be shown that the variance will be stabilized by raising the observations to the power 1 - b, unless b = 1, when we use the log. (I shall resist the temptation to prove this, though I can. Any book on mathematical statistics will do it.) Thus, if the standard deviation is proportional to the square root of the mean (i.e. variance proportional to mean), e.g. Poisson variance (§6.7), b = 0.5, 1 - b = 0.5, and we use a square root transformation. If the standard deviation is proportional to the mean we use the log. If the standard deviation is proportional to the square of the mean we have b = 2, 1 - b = -1, and we use the reciprocal. Another, rarely seen transformation is the arcsine square root transformation, used when observations are Binomial proportions. Here the standard deviation increases as the proportion goes from 0.0 to 0.5, then decreases as the proportion goes from 0.5 to 1.0. Whether it works depends on how much other variation there is. It has now been largely superseded by logistic regression (§17.8).

Table 10.3. Biceps skinfold thickness (mm) in two groups of patients

Crohn's disease                   Coeliac disease
1.8    2.8    4.2    6.2          1.8    3.8
2.2    3.2    4.4    6.6          2.0    4.2
2.4    3.6    4.8    7.0          2.0    5.4
2.5    3.8    5.6    10.0         2.0    7.6
2.8    4.0    6.0    10.4         3.0

Fig. 10.5. Scatter plot, histogram, and Normal plot for the biceps skinfold data

When we have several groups we can plot log(s) against log(x̄) then draw a line through the points. The slope of the line is b (see Healy 1968). Trial and error, however, combined with scatter plots, histograms, and Normal plots, usually suffice.
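The slope method can be sketched in Python as follows. This is only an illustration: the group means and standard deviations are hypothetical values constructed so that s = 0.8 × √x̄ exactly, the line is fitted by ordinary least squares (the text only says 'draw a line'), and the function names and the tolerance used to decide that b is 'about 1' are assumptions made for this example.

```python
import math

def slope_of_log_sd_on_log_mean(means, sds):
    """Least-squares slope b of log(s) on log(mean) across groups."""
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def suggested_transformation(b, tolerance=0.1):
    """Raise to the power 1 - b, unless b is about 1, when the log is used."""
    if abs(b - 1) < tolerance:
        return "log"
    return f"power {1 - b:.2f}"

# Hypothetical group summaries (not from the book), built so that s = 0.8 * sqrt(mean).
group_means = [4.0, 9.0, 16.0, 25.0]
group_sds = [0.8 * math.sqrt(m) for m in group_means]

b = slope_of_log_sd_on_log_mean(group_means, group_sds)
print(f"b = {b:.2f}, suggested transformation: {suggested_transformation(b)}")
```

With real data the points will not lie exactly on a line, so the fitted b is only a guide to which of the square root, log, or reciprocal to try first.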

Table 10.3 shows some data from a study of anthropometry and diagnosis in patients with intestinal disease (Maugdal et al. 1985). We were interested in differences in anthropometrical measurements between patients with different diagnoses, and here we have the biceps skinfold measurements for 20 patients with Crohn's disease and 9 patients with coeliac disease. The data have been put into order of magnitude and it is fairly obvious that the distribution is skewed to the right. Figure 10.5 shows this clearly. I have subtracted the group mean from each observation, giving what is called the within-group residuals, and then found both the frequency distribution and Normal plot. The distribution is clearly skew, and this is reflected in the Normal plot, which shows a pronounced curvature.

Fig. 10.6. Scatter plot, histogram, and Normal plot for the biceps skinfold data, after square root, log, and reciprocal transformations

We need a Normalizing transformation, if one can be found. The usual best guesses are square root, log, and reciprocal, with the log being the most likely to succeed. Figure 10.6 shows the scatter plot, histogram, and Normal plot for the residuals after transformation. (These logarithms are natural, to base e, rather than to base 10. It makes no difference to the final result and the calculations are the same to the computer.) The fit to the Normal distribution is not perfect, but for each transformation is much better than in Figure 10.5. The log looks the best for the equality of variance and the Normal distribution. We could use the two-sample t method on these data quite happily.

Table 10.4 shows the results of the two sample t method used with the raw, untransformed data and with each transformation. The t test statistic increases and its associated probability decreases as we move closer to a Normal distribution, reflecting the increasing power of the t test as its assumptions are more closely met. Table 10.4 also shows the ratio of the variances in the two samples. We can see that, as the transformed data get closer to a Normal distribution, the variances tend to become more equal also.

The transformed data clearly give a better test of significance than the raw data. The confidence intervals for the transformed data are more difficult to interpret, however, so the gain here is not so apparent. The confidence limits for the difference cannot be transformed back to the original scale. If we try it, the square root and reciprocal limits give ludicrous results. The log gives interpretable results (0.89 to 2.03) but these are not limits for the difference in millimetres. How could they be, for they do not contain zero yet the difference is not significant? They are in fact the 95% confidence limits for the ratio of the Crohn's disease geometric mean to the coeliac disease geometric mean (§7.4). If there were no difference, of course, the expected value of this ratio would be one, not zero, and so lies within the limits. The reason is that when we take the difference between the logarithms of two numbers, we get the logarithm of their ratio, not of their difference (§5A).
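The back-transformation of the log-scale interval can be followed step by step in a short Python sketch. It is an illustration under stated assumptions rather than the author's computation: the data are the skinfold values of Table 10.3, natural logarithms are used as in the text, and the 0.05 point for 27 degrees of freedom, 2.052, is a standard tabulated value assumed here because Table 10.1 has no row for 27 d.f.

```python
import math
import statistics

# Biceps skinfold thickness (mm), Table 10.3.
crohn = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]

log_crohn = [math.log(x) for x in crohn]
log_coeliac = [math.log(x) for x in coeliac]
n1, n2 = len(log_crohn), len(log_coeliac)

# Two-sample t method on the log scale, with a common variance estimate.
ss1 = statistics.variance(log_crohn) * (n1 - 1)
ss2 = statistics.variance(log_coeliac) * (n2 - 1)
s2 = (ss1 + ss2) / (n1 + n2 - 2)
se = math.sqrt(s2 * (1 / n1 + 1 / n2))
diff = statistics.mean(log_crohn) - statistics.mean(log_coeliac)

t_point = 2.052  # assumed 0.05 two-tailed point for 27 d.f. (not in Table 10.1)
lower, upper = diff - t_point * se, diff + t_point * se
t_stat = diff / se

# Antilogs turn a difference of logs into a ratio of geometric means.
ratio_lower, ratio_upper = math.exp(lower), math.exp(upper)

print(f"t = {t_stat:.2f}, log-scale 95% CI {lower:.3f} to {upper:.3f}")
print(f"ratio of geometric means {math.exp(diff):.2f}, "
      f"95% CI {ratio_lower:.2f} to {ratio_upper:.2f}")
```

The interval for the ratio contains one, which is the value expected if there were no difference between the groups.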

Table 10.4. Biceps skinfold thickness compared for two groups of patients, using different transformations

Transformation    Two-sample t test, 27 d.f.   95% confidence interval for        Variance ratio,
                  t        P                   difference on transformed scale    larger/smaller
None, raw data    1.28     0.21                -0.71 to 3.07 mm                   1.52
Square root       1.38     0.18                -0.140 to 0.714                    1.16
Logarithm         1.48     0.15                -0.114 to 0.706                    1.10
Reciprocal        -1.65    0.11                -0.203 to 0.022                    1.63

Because the log transformation is the only one which gives useful confidence intervals, I would use it unless it were clearly inadequate for the data, and another transformation clearly superior. When this happens we are reduced to a significance test only, with no meaningful estimate.

10.5 Deviations from the assumptions of t methods

The methods described in this chapter depend on some strong assumptions about the distributions from which the data come. This often worries users of statistical methods, who feel that these assumptions must limit greatly the use of t distribution methods and find the attitude of many statisticians, who often use methods based on Normal assumptions almost as a matter of course, rather sanguine. We shall look at some consequences of deviations from the assumptions.

First we shall consider a non-Normal distribution. As we have seen, some variables conform very closely to the Normal distribution, others do not. Deviations occur in two main ways: grouping and skewness. Grouping occurs when a continuous variable, such as human height, is measured in units which are fairly large relative to the range. This happens, for example, if we measure human height to the nearest inch. The heights in Figure 10.2 were to the nearest inch, and the fit to the t distribution is very good. This was a very coarse grouping, as the standard deviation of heights was 2.5 inches and so 95% of the 3000 observations had values over a range of 10 inches, only 10 or 11 possible values in all. We can see from this that if the underlying distribution is Normal, rounding the measurement is not going to affect the application of the t distribution by much.

The other assumption of the two-sample t method is that the variances in the two populations are the same. If this is not correct, the t distribution will not necessarily apply. The effect is usually small if the two populations are from a Normal distribution. This situation is unusual because, for samples from the same population, mean and variance are independent if the distribution is Normal (§7A). There is an approximate t method, as we noted in §10.3. However, unequal variance is more often associated with skewness in the data, in which case a transformation designed to correct one fault often tends to correct the other as well.

Both the paired and two-sample t methods are robust to most deviations from the assumptions. Only large deviations are going to have much effect on these methods. The main problem is with skewed data in the one-sample method, but for reasons given in §10.2, the paired test will usually provide differences with a reasonable distribution. If the data do appear to be non-Normal, then a Normalizing transformation will improve matters. If this does not work, then we must turn to methods which do not require these assumptions (§9.2, §12.2, §12.3).

10.6 What is a large sample?

In this chapter we have looked at small sample versions of the large sample methods of §8.5 and §9.7. There we ignored both the distribution of the variable and the variability of s², on the grounds that they did not matter provided the samples were large. How small can a large sample be? This question is critical to the validity of these methods, but seldom seems to be discussed in textbooks.

Provided the assumptions of the t test apply, the question is easy enough to answer. Inspection of Table 10.1 will show that for 30 degrees of freedom the 5% point is 2.04, which is so close to the Normal value of 1.96 that it makes little difference which is used. So for Normal data with uniform variance we can forget the t distribution when we have more than 30 observations.

When the data are not in this happy state, things are not so simple. If the t method is not valid, we cannot assume that a large sample method which approximates to it will be valid. I recommend the following rough guide. First, if in doubt, treat the sample as small. Second, transform to a Normal distribution if possible. In the paired case you should transform before subtraction. Third, the more non-Normal the data, the larger the sample needs to be before we can ignore errors in the Normal approximation.

Table 10.5. Blood zidovudine levels at times after administration of the drug by presence of fat malabsorption

Time since administration of zidovudine (min)
0       15      30      45      60      90      120     150

Malabsorption patients:
0.08    13.15   5.70    3.22    2.69    1.91    1.72    1.22
0.08    0.08    0.14    2.10    6.37    4.89    2.11    1.40
0.08    0.08    3.29    3.47    1.42    1.61    1.41    1.09
0.08    0.08    1.33    1.71    3.30    1.81    1.16    0.69
0.08    6.69    8.27    5.02    3.98    1.90    1.24    1.01
0.08    4.28    4.92    1.22    1.17    0.88    0.34    0.24
0.08    0.13    9.29    6.03    3.65    2.32    1.25    1.02
0.08    0.64    1.19    1.65    2.37    2.07    2.54    1.34
0.08    2.39    3.53    6.28    2.61    2.29    2.23    1.97

Normal absorption patients:
0.08    3.72    16.02   8.17    5.21    4.84    2.12    1.50
0.08    6.72    5.48    4.84    2.30    1.95    1.46    1.49
0.08    9.98    7.28    3.46    2.42    1.69    0.70    0.76
0.08    1.12    7.27    3.77    2.97    1.78    1.27    0.99
0.08    13.37   17.61   3.90    5.53    7.17    5.16    3.84

There is no simple answer to the question: ‘how large is a large sample?’. We should be reasonably safe with inferences about means if the sample is greater than 100 for a single sample, or if both samples are greater than 50 for two samples. The application of statistical methods is a matter of judgement as well as knowledge.

10.7* Serial data

Table 10.5 shows levels of zidovudine (AZT) in the blood of AIDS patients at several times after administration of the drug, for patients with normal fat absorption or fat malabsorption. A line graph of these data was shown in Figure 5.6. One common approach to such data is to carry out a two-sample t test at each time separately, and researchers often ask at what time the difference becomes significant. This is a misleading question, as significance is a property of the sample rather than the population. The difference at 15 min may not be significant because the sample is small and the difference to be detected is small, not because there is no difference in the population. Further, if we do this for each time point we are carrying out multiple significance tests (§9.10) and each test only uses a small part of the data so we are losing power (§9.9). It is better to ask whether there is any evidence of a difference between the response of normal and malabsorption subjects over the whole period of observation.

The simplest approach is to reduce the data for a subject to one number. We can use the highest value attained by the subject, the time at which this peak value was reached, or the area under the curve. The first two are self-explanatory.

The area under the curve (AUC) is found by drawing a line through all the points and finding the area between it and the horizontal axis. The ‘curve’ is usually formed by a series of straight lines found by joining all the points for the subject, and Figure 10.7 shows this for the first subject in Table 10.5. The area under the curve can be calculated by taking each straight line segment and calculating the area under this. This is the base multiplied by the average of the two vertical heights. We calculate this for each line segment, i.e. between each pair of adjacent time points, and add. Thus for the first subject we get (15 - 0) × (0.08 + 13.15)/2 + (30 - 15) × (13.15 + 5.70)/2 + … + (360 - 300) × (0.43 + 0.32)/2 = 667.425. This can be done fairly easily by most statistical computer packages. The area for each subject is shown in Table 10.6.
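The segment-by-segment rule can be written as a small Python function. This is a sketch for illustration only: the function name `area_under_curve` is introduced here, and the data are the first subject's values for the time points 0 to 150 min that appear in Table 10.5 as reproduced here. Since the later time points (up to 360 min) are not listed, the result is a partial area and is smaller than the 667.425 quoted in the text.

```python
def area_under_curve(times, values):
    """Sum, over adjacent time points, of base times average of the two heights."""
    points = list(zip(times, values))
    return sum((t1 - t0) * (y0 + y1) / 2
               for (t0, y0), (t1, y1) in zip(points, points[1:]))

# First malabsorption subject, time points 0 to 150 min only.
times = [0, 15, 30, 45, 60, 90, 120, 150]
levels = [0.08, 13.15, 5.70, 3.22, 2.69, 1.91, 1.72, 1.22]

first_segment = area_under_curve(times[:2], levels[:2])  # (15 - 0) x (0.08 + 13.15)/2
partial_auc = area_under_curve(times, levels)
print(f"first segment = {first_segment:.3f}")
print(f"area from 0 to 150 min = {partial_auc:.3f}")
```

Taking the maximum of `levels`, or the time at which it occurs, would give the other two summary measures mentioned above.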

Fig. 10.7. Calculation of the area under the curve for one subject

Table 10.6. Area under the curve for data of Table 10.5

Malabsorption patients      Normal patients
667.425     256.275         919.875
569.625     527.475         599.850
306.000     388.800         499.500
298.200     505.875         472.875
617.850                     1377.975

Fig. 10.8. Normal plots for area under the curve and log area for the data of Table 10.5

10.8* Comparing two variances by the F test

We can test the null hypothesis that two population variances are equal using the F distribution. Provided the data are from a Normal distribution, the ratio of two independent estimates of the same variance will follow an F distribution (§7A), the degrees of freedom being the degrees of freedom of the two estimates. The F distribution is defined as that of the ratio of two independent Chi-squared variables divided by their degrees of freedom:

\[ F_{m,n} = \frac{\chi^2_m / m}{\chi^2_n / n} \]

where m and n are the degrees of freedom (§7A). For Normal data the distribution of a sample variance s² from n observations is that of σ²χ²/(n - 1), where χ² has n - 1 degrees of freedom, and when we divide one estimate of variance by another to give the F ratio, the σ² cancels out. Like other distributions derived from the Normal, the F distribution cannot be integrated and so we must use a table. Because it has two degrees of freedom, the table is cumbersome, covering several pages, and I shall omit it. Most F methods are done using computer programs which calculate the probability directly. The table is usually only given as the upper percentage points.

To test the null hypothesis, we divide the larger variance by the smaller. For the skinfold data of §10.4, the variances are 5.860 with 19 degrees of freedom for the Crohn's patients and 3.860 with 8 degrees of freedom for the coeliacs, giving F = 5.860/3.860 = 1.52. The probability of this being exceeded by the F distribution with 19 and 8 degrees of freedom is 0.3, the 5% point of the distribution being 3.16, so there is no evidence from these data that the variance of skinfold differs between patients with Crohn's disease and coeliac disease.
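The variance ratio itself is easy to check from the raw data. The Python sketch below is an illustration only: it recomputes the two sample variances from the skinfold values of Table 10.3 and compares the ratio with the 5% point of 3.16 quoted above, rather than computing the exact probability, which would need an F distribution routine from a statistical package.

```python
import statistics

# Biceps skinfold thickness (mm), Table 10.3.
crohn = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]

var_crohn = statistics.variance(crohn)      # 19 degrees of freedom
var_coeliac = statistics.variance(coeliac)  # 8 degrees of freedom

# Larger variance over smaller, so that F is at least 1.
f_ratio = max(var_crohn, var_coeliac) / min(var_crohn, var_coeliac)
five_percent_point = 3.16  # quoted in the text for 19 and 8 d.f.

print(f"variances {var_crohn:.3f} and {var_coeliac:.3f}, F = {f_ratio:.2f}")
print("exceeds 5% point" if f_ratio > five_percent_point else "below 5% point")
```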

Several variances can be compared by Bartlett's test or the Levene test (see Armitage and Berry 1994, Snedecor and Cochran 1980).

Table 10.7. Mannitol and lactulose gut permeability tests in a group of HIV patients and controls

HIV status   Diarrhoea   % mannitol   % lactulose     HIV status   Diarrhoea
AIDS         Yes         14.9         1.17            ARC          Yes
AIDS         Yes         7.074        1.203           ARC          No
AIDS         Yes         5.693        1.008           ARC          No
AIDS         Yes         16.82        0.367           HIV+         No
AIDS         No          13.95        0.6             HIV-         No
AIDS         Yes         7.256        0.252           HIV-         No
ARC          Yes         7.42         0.21            HIV-         No
ARC          No          22.03        0.651

10.9* Comparing several means using analysis of variance

Consider the data of Table 10.7. These are measures of gut permeability obtained from four groups of subjects, diagnosed with AIDS, AIDS related complex (ARC), asymptomatic HIV positive, and HIV negative controls. We want to investigate the differences between the groups.

One approach would be to use the t test to compare each pair of groups. This has disadvantages. First, there are many comparisons, m(m - 1)/2 where m is the number of groups. The more groups we have, the more likely it is that two of them will be far enough apart to produce a ‘significant’ difference when the null hypothesis is true and the population means are the same (§9.10). Second, when groups are small, there may not be many degrees of freedom for the estimate of variance. If we can use all the data to estimate variance we will have more degrees of freedom and hence a more powerful comparison. We can do this by analysis of variance, which compares the variation between the groups to the variation within the groups.

Table10.8.Someartificialdatatoillustratehowanalysisofvarianceworks

Group1 Group2 Group3 Group4

6 4 7 3

7 5 9 5

8 6 10 6

8 6 11 6

9 6 11 6

11 8 13 8

Mean 8.167 5.833 10.167 5.667

To illustrate how the analysis of variance, or anova, works, I shall use some artificial data, as set out in Table 10.8. In practice, equal numbers in each group are unusual in medical applications. We start by estimating the common variance within the groups, just as we do in a two-sample t test (§10.3). We find the sum of squares about the group mean for each group and add them. We call this the within groups sum of squares. For Table 10.8 this gives 57.833. For each group we estimate the mean from the data, so we have estimated 4 parameters and have 24 - 4 = 20 degrees of freedom. In general, for m groups of size n each we have nm - m = m(n-1) degrees of freedom. This gives us an estimate of variance of

$$s^2 = \frac{57.833}{20} = 2.892$$

This is the within groups variance or residual variance. There is an assumption here. For a common variance, we assume that the variances are the same in the four populations represented by the four groups.

We can also find an estimate of variance from the group means. The variance of the four group means is 4.562. If there were no difference between the means in the population from which the sample comes, this variance would be the variance of the sampling distribution of the mean of n observations, which is $s^2/n$, the square of the standard error (§8.2). Thus n times this variance should be equal to the within groups variance. For the example, this is 4.562 × 6 = 27.375, which is much greater than the 2.892 found within the groups. We express this by the ratio of one variance estimate to the other, between groups over within groups, which we call the variance ratio or F ratio. If the null hypothesis is true and if the observations are from a Normal distribution with uniform variance, this ratio follows a known distribution, the F distribution with m-1 and m(n-1) degrees of freedom (§10.8).

For the example we would have 3 and 20 degrees of freedom and

$$F_{3,20} = \frac{27.375}{2.892} = 9.47$$

If the null hypothesis were true, the expected value of this ratio would be about 1.0. A large value gives us evidence of a difference between the means in the four populations. For the example we have a large value of 9.47 and the probability of getting a value as big as this if the null hypothesis were true would be 0.0004. Thus there is a significant difference between the four groups.
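The sums of squares and the variance ratio for the artificial data of Table 10.8 can be checked with a short Python sketch. The function name `one_way_anova` and the dictionary keys are illustrative choices, and no P value is computed because that would need an F distribution routine from a statistics library.

```python
# Artificial data of Table 10.8, one list per group.
groups = [
    [6, 7, 8, 8, 9, 11],
    [4, 5, 6, 6, 6, 8],
    [7, 9, 10, 11, 11, 13],
    [3, 5, 6, 6, 6, 8],
]


def mean(values):
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    m = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Within groups: squared deviations about each group's own mean.
    within_ss = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    # Between groups: squared deviations of group means about the grand
    # mean, each weighted by the group size.
    between_ss = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    df_between = m - 1
    df_within = n_total - m
    ms_between = between_ss / df_between
    ms_within = within_ss / df_within
    return {
        "between_ss": between_ss,
        "within_ss": within_ss,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Weighting each group mean by its own group size means the same function also works when the groups are of unequal size.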

Table 10.9. One-way analysis of variance for the data of Table 10.8

Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                23                  139.958
Between groups       3                   82.125          27.375       9.47                0.0004
Within groups        20                  57.833          2.892

Table 10.10. One-way analysis of variance for the mannitol data

Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                58                  1559.036
Between groups       3                   49.012          16.337       0.6
Residual             55                  1510.024        27.455

We can set these calculations out in an analysis of variance table, as shown in Table 10.9. The sum of squares in the ‘between groups’ row is the sum of squares of the group means about their overall mean, times n. We call this the between groups sum of squares. Notice that in the ‘degrees of freedom’ and ‘sum of squares’ columns the ‘within groups’ and ‘between groups’ rows add up to the total. The within groups sum of squares is also called the residual sum of squares, because it is what is left when the group effect is removed, or the error sum of squares, because it measures the random variation or error remaining when all systematic effects have been removed.

The sum of squares of the whole data, ignoring the groups, is called the total sum of squares. It is the sum of the between groups and within groups sums of squares.

Returning to the mannitol data, as so often happens the groups are of unequal size. The calculation of the between groups sum of squares becomes more complicated and we usually do it by subtracting the within groups sum of squares from the total sum of squares. Otherwise, the table is the same, as shown in Table 10.10. As these calculations are usually done by computer the extra complexity in calculation does not matter. Here there is no significant difference between the groups.

If we have only two groups, one-way analysis of variance is another way of doing a two-sample t test. For example, the analysis of variance table for the comparison of average capillary density (§10.3) is shown in Table 10.11. The probability is the same and the F ratio, 25.78, is the square of the t statistic, 5.08. The residual mean square is the common variance of the t test.

Table 10.11. One-way analysis of variance for the comparison of mean capillary density between ulcerated patients and controls, Table 10.2

Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                41                  3506.57
Between groups       1                   1374.114        1374.114     25.78               <0.0001
Residual             40                  2132.458        53.311

Fig. 10.9. Plots of the mannitol data, showing that the assumptions of Normal distribution and homoscedasticity are reasonable

10.10 * Assumptions of the analysis of variance

There are two assumptions for analysis of variance: that data come from Normal distributions within the groups and that the variances of these distributions are the same. The technical term for uniformity of variance is homoscedasticity; lack of uniformity is heteroscedasticity. Heteroscedasticity can affect analyses of variance a lot and we try to guard against it.

We can examine these assumptions graphically. For mannitol (Figure 10.9) the scatter plot for the groups shows that the spread of data in each group is similar, suggesting that the assumption of uniform variance is met, the histogram looks Normal and the Normal plot looks straight. This is not the case for the lactulose data, as Figure 10.10 shows. The variances are not uniform and the histogram and Normal plot suggest positive skewness. As is often the case, the group with the highest mean, AIDS, has the greatest spread. The square root transformation of the lactulose fits better, giving a good Normal distribution although the variability is not uniform. The log transform over-compensates for skewness, by producing skewness in the opposite direction, though the variances appear uniform. Either the square root or the logarithmic transformation would be better than the raw data. I picked the square root because the distribution looked better. Table 10.12 shows the analysis of variance for square root transformed lactulose.

There are also significance tests which we can apply for Normal distribution and homoscedasticity. I shall omit the details.

10.11 * Comparison of means after analysis of variance

Concluding from Tables 10.9 and 10.12 that there is a significant difference between the means is rather unsatisfactory. We want to know which means differ from which. There are a number of ways of doing this, called multiple comparisons procedures. These are mostly designed to give only one type I error (§9.3) per 20 analyses when the null hypothesis is true, as opposed to doing t tests for each pair of groups, which gives one error per 20 comparisons when the null hypothesis is true. I shall not go into details, but look at a couple of examples. There are several tests which can be used when the numbers in each group are the same: Tukey's Honestly Significant Difference, the Newman-Keuls sequential procedure (both called Studentized range tests), Duncan's multiple range test, etc. The one you use will depend on which computer program you have. The results of the Newman-Keuls sequential procedure for the data of Table 10.8 are shown in Table 10.13. At the 5% level, group 1 is significantly different from groups 2 and 4, and group 3 from groups 2 and 4. At the 1% level, the only significant differences are between group 3 and groups 2 and 4.

Fig. 10.10. Plots of the lactulose data on the natural scale and after square root and log transformation

Table 10.12. One-way analysis of variance for the square root transformed lactulose data of Table 10.7

Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                58                  3.25441
HIV status           3                   0.42870         0.14290      2.78                0.0495
Residual             55                  2.82571         0.05138

For unequal-sized groups, the choice of multiple comparison procedures is more limited. Gabriel's test can be used with unequal-sized groups. For the root transformed lactulose data, the results of Gabriel's test are shown in Table 10.14. This shows that, at the 5% level, the AIDS subjects are significantly different from the asymptomatic HIV+ patients and from the HIV- controls. For the mannitol data, most multiple comparison procedures will give no significant differences because they are designed to give only one type I error per analysis of variance. When the F test is not significant, no group comparisons will be either.

Table 10.13. The Newman-Keuls test for the data of Table 10.8

         0.05 level               0.01 level
Group    1    2    3     Group    1    2    3
2        S               2        N
3        N    S          3        N    S
4        S    N    S     4        N    N    S

S = significant, N = not significant.

Table 10.14. Gabriel's test for the root transformed lactulose data

         0.05 level                     0.01 level
Group    AIDS   ARC   HIV+     Group    AIDS   ARC   HIV+
ARC      N                     ARC      N
HIV+     S      N              HIV+     N      N
HIV-     S      N     N        HIV-     N      N     N

S = significant, N = not significant.

10.12 * Random effects in analysis of variance

Although the technique is called analysis of variance, in §10.9-11 we have been using it for the comparison of means. In this section we shall look at another application, where we shall indeed use anova to look at variances. When we estimate and compare the means of groups representing different diagnoses, different treatments, etc., we call these fixed effects. In other applications, groups are members of a random sample from a larger population and, rather than estimate the mean of each group, we estimate the variance between them. We call these groups random effects.

Consider Table 10.15, which shows repeated measurements of pulse rate on a group of medical students. Each measurement was made by a different observer. Observations made repeatedly under the same circumstances are called replicates and here we have two replicates per subject. We can do a one-way analysis of variance on these data, with subject as the grouping factor (Table 10.16).

The test of significance in Table 10.16 is redundant, because we know each pair of measurements is from a different person, and the null hypothesis that all pairs are from the same population is clearly false. What we can use this anova for is to estimate some variances. There are two different variances in the data. One is between measurements on the same person, the within-subject variance which we shall denote by $\sigma_w^2$. In this example the within-subject variance is the measurement error, and we shall assume it is the same for everyone. The other is the variance between the subjects' true or average pulse rates, about which the individual measurements for a subject are distributed. A subject's true pulse rate is the average of all possible measurements for that subject, not the average of the two measurements we actually have. This variance is the between-subjects variance and we shall denote it by $\sigma_b^2$. A single measurement observed from a single individual is the sum of the subject's true pulse rate and the measurement error. Such measurements therefore have variance $\sigma_b^2 + \sigma_w^2$. We can estimate both these variances from the anova table.

Table 10.15. Paired measurements of 30 second pulse in 45 medical students

Subject  Pulse A  Pulse B  Subject  Pulse A  Pulse B  Subject  Pulse A  Pulse B

1 46 42 16 34 36 31 43 43

2 50 42 17 30 36 32 30 29

3 39 37 18 35 45 33 31 36

4 40 54 19 32 34 34 43 43

5 41 46 20 44 46 35 38 43

6 35 35 21 39 42 36 31 37

7 31 44 22 34 37 37 45 43

8 43 35 23 36 38 38 39 43

9 47 45 24 33 34 39 48 48

10 48 36 25 34 35 40 40 40

11 32 46 26 51 48 41 46 45

12 36 34 27 31 30 42 44 42

13 37 30 28 30 31 43 36 34

14 34 36 29 42 43 44 33 28

15 38 36 30 39 35 45 39 42

Table 10.16. One-way analysis of variance for the 30 second pulse data of Table 10.15

Source of variation  Degrees of freedom  Sum of squares  Mean square  Variance ratio (F)  Probability
Total                89                  3054.99
Between subjects     44                  2408.49         54.74        3.81                <0.0001
Within subjects      45                  646.50          14.37

For the simple example of the same number of replicates m on each of n subjects, the estimation of the variances is quite simple. We estimate $\sigma_w^2$ directly from the mean square within subjects, $MS_w$, giving an estimate $s_w^2$. We can show (although I shall omit it) that the mean square between subjects, $MS_b$, is an estimate of $m\sigma_b^2 + \sigma_w^2$. The variance ratio, $F = MS_b/MS_w$, will be expected to be about 1.0 if $\sigma_b^2 = 0$, i.e. if the null hypothesis that all subjects are the same is true. We can estimate $\sigma_b^2$ by $s_b^2 = (MS_b - MS_w)/m$.

For the example, $s_w^2 = 14.37$ and $s_b^2 = (54.74 - 14.37)/2 = 20.19$. Thus the variability between measurements by different observers on the same subject is not much less than the variability of the underlying pulse rate between different subjects. The measurement (by these untrained and inexperienced observers) does not tell us much about the subjects. We shall see a practical application in the study of measurement error and observer variation in §15.2, and consider another aspect of this analysis, intraclass correlation, in §11.13.
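The two variance estimates follow directly from the mean squares of Table 10.16. A minimal Python sketch, in which the function name `variance_components` is a hypothetical label and the balanced design (the same number of replicates on every subject) is assumed:

```python
def variance_components(ms_between, ms_within, replicates):
    """Estimate within- and between-subject variances from a one-way
    anova with the same number of replicates on every subject."""
    s2_within = ms_within
    s2_between = (ms_between - ms_within) / replicates
    return s2_within, s2_between


# Mean squares from Table 10.16; two pulse measurements per subject.
s2_w, s2_b = variance_components(54.74, 14.37, 2)

print(f"within-subject variance:  {s2_w:.2f}")
print(f"between-subject variance: {s2_b:.2f}")
print(f"variance of a single measurement: {s2_w + s2_b:.2f}")
```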

Table 10.17. Number of X-ray requests conforming to the guidelines for each practice in the intervention and control groups (Oakeshott et al. 1994)

Intervention group: number of requests   Percentage    Control group: number of requests
Total  Conforming                        conforming    Total  Conforming

20 20 100 7 7

7 7 100 37 33

16 15 94 38 32

31 28 90 28 23

20 18 90 20 16

24 21 88 19 15

7 6 86 9 7

6 5 83 25 19

30 25 83 120 90

66 53 80 89 64

5 4 80 22 15

43 33 77 76 52

43 32 74 21 14

23 16 70 127 83

64 44 69 22 14

6 4 67 34 21

18 10 56 10 4

Total 429 341 704 509

Mean 81.6

SD 11.9

If we have different numbers of replicates per subject or other factors to consider (e.g. if each observer made two repeated measurements) the analysis becomes fiendishly complicated (see Searle et al. 1992, if you must). These estimates of variance deserve confidence intervals like any other estimate, but these are even more fiendishly complicated, as Burdick and Graybill (1992) convincingly demonstrate. I would recommend you consult a statistician experienced in these matters, if you can find one.

10.13 * Units of analysis and cluster-randomized trials

A cluster-randomized study (§2.11) is one where a group of subjects, such as the patients in a hospital ward or a general practice list, are randomized to the same treatment together. The treatment might be applied to the patient directly, such as an offer of breast cancer screening to all eligible women in a district, or be applied to the care provider, such as treatment guidelines given to the GP. The design of the study must be taken into account in the analysis.

Table 10.17 shows an example (Oakeshott et al. 1994, Kerry and Bland 1998).

Fig. 10.11. Scatter plots and Normal plots for the data of Table 10.17, showing the effect of an arcsine square root transformation

In this study guidelines as to appropriate referral for X-ray were given to GPs in 17 practices and another 17 practices served as controls. We could say we have 341 out of 429 appropriate referrals in the treated group and 509 out of 704 in the control group and compare these proportions as in §8.6 and §9.8. This would be wrong, because to follow a Binomial distribution, all the referrals must be independent (§6.4). They are not, as the individual GP may have a profound effect on the decision to refer. Even where the practitioner is not directly involved, members of a cluster may be more similar to one another than they are to members of another cluster and so not be independent. Ignoring the clustering may result in confidence intervals which are too narrow and P values which are too small, producing spurious significant differences.

The easiest way to analyse the data from such studies is to make the experimental unit, that which is randomized (§2.11), the unit of analysis. We can construct a summary statistic for each cluster and then analyse these summary values. The idea is similar to the analysis of repeated measurements on the same subject, where we construct a single summary statistic over the times for each individual (§10.7). For Table 10.17, the practice's percentage of referrals which are appropriate is the summary statistic. The mean percentages in the two groups can then be compared by the two-sample t method. The observed difference is 81.6 - 73.6 = 8.0 and the standard error of the difference is 4.3. There are 32 degrees of freedom and, from Table 10.1, the 5% point of the t distribution is 2.04. This gives a 95% confidence interval for the treatment difference of 8.0 ± 2.037 × 4.3, or -1 to 17 percentage points. For the test of significance, the test statistic is 8.0/4.3 = 1.86, P = 0.07.
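The summary-statistic approach can be sketched in Python using the counts of Table 10.17, with each practice reduced to a single percentage. The helper names are illustrative, and the 5% point 2.037 is taken from the text rather than computed. Because the percentages are recomputed here from the raw counts, the results should land close to the figures quoted above but need not agree to the last digit.

```python
from math import sqrt

# (total requests, conforming requests) for each practice, Table 10.17.
intervention = [(20, 20), (7, 7), (16, 15), (31, 28), (20, 18), (24, 21),
                (7, 6), (6, 5), (30, 25), (66, 53), (5, 4), (43, 33),
                (43, 32), (23, 16), (64, 44), (6, 4), (18, 10)]
control = [(7, 7), (37, 33), (38, 32), (28, 23), (20, 16), (19, 15),
           (9, 7), (25, 19), (120, 90), (89, 64), (22, 15), (76, 52),
           (21, 14), (127, 83), (22, 14), (34, 21), (10, 4)]


def percentages(practices):
    """One summary value per cluster: percentage of requests conforming."""
    return [100 * conforming / total for total, conforming in practices]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Sample variance with n - 1 in the denominator."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


pct_i = percentages(intervention)
pct_c = percentages(control)
n_i, n_c = len(pct_i), len(pct_c)

# Two-sample t method applied to the practice percentages.
diff = mean(pct_i) - mean(pct_c)
pooled_var = ((n_i - 1) * variance(pct_i)
              + (n_c - 1) * variance(pct_c)) / (n_i + n_c - 2)
se_diff = sqrt(pooled_var * (1 / n_i + 1 / n_c))
t_stat = diff / se_diff

T_CRIT = 2.037  # 5% point of t with 32 degrees of freedom, from the text
ci = (diff - T_CRIT * se_diff, diff + T_CRIT * se_diff)

print(f"difference = {diff:.1f}, SE = {se_diff:.1f}, t = {t_stat:.2f}")
print(f"95% CI: {ci[0]:.0f} to {ci[1]:.0f} percentage points")
```

The unit of analysis here is the practice, so the degrees of freedom come from the 34 practices, not from the 1133 individual referrals.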

In this example, each observation is a Binomial proportion, so we could consider an arcsine square root transformation of the proportions (§10.4). As Figure 10.11 shows, if anything the transformation makes the fit to the Normal distribution worse. This is reflected in a larger P value, giving P = 0.10.

There is a widely varying number of referrals between practices, which must reflect the list size and number of GPs in the practice. We can take this into account with an analysis which weights each observation by the number of referrals. Bland and Kerry (1998) give details.

Appendices

10A Appendix: The ratio mean/standard error

As if by magic, we have our sample mean over its standard error. I shall not bother to go into this detail for the other similar ratios which we shall encounter. Any quantity which follows a Normal distribution with mean zero (such as $\bar{x} - \mu$), divided by its standard error, will follow a t distribution provided the standard error is based on one sum of squares and hence is related to the Chi-squared distribution.


50. The paired t test is:

(a) impractical for large samples;
(b) useful for the analysis of qualitative data;
(c) suitable for very small samples;
(d) used for independent samples;
(e) based on the Normal distribution.

51. Which of the following conditions must be met for a valid t test between the means of two samples:

(a) the numbers of observations must be the same in the two groups;
(b) the standard deviations must be approximately the same in the two groups;
(c) the means must be approximately equal in the two groups;
(d) the observations must be from approximately Normal distributions;
(e) the samples must be small.

52. In a two-sample clinical trial, one of the outcome measures was highly skewed. To test the difference between the levels of this measure in the two groups of patients, possible approaches include:

(a) a standard t test using the observations;
(b) a Normal approximation if the sample is large;
(c) transforming the data to a Normal distribution and using a t test;
(d) a sign test;
(e) the standard error of the difference between two proportions.

53. In the two-sample t test, deviation from the Normal distribution by the data may seriously affect the validity of the test if:

(a) the sample sizes are equal;
(b) the distribution followed by the data is highly skewed;
(c) one sample is much larger than the other;
(d) both samples are large;
(e) the data deviate from a Normal distribution because the measurement unit is large and only a few values are possible.

Table 10.18. Semen analyses for successful and unsuccessful sperm donors (Paraskevaides et al. 1991)

                           Successful donors       Unsuccessful donors
                           n    Mean    (sd)       n    Mean    (sd)
Volume (ml)                17   3.14    (1.28)     19   2.91    (0.91)
Semen count ($10^6$/ml)    18   146.4   (95.7)     19   124.8   (81.8)
% Motility                 17   60.7    (9.7)      19   58.5    (12.8)
% Abnormal morphology      13   22.8    (8.4)      16   20.3    (8.5)

All differences not significant, t test.

54. Table 10.18 shows a comparison of successful (i.e. fertile) and unsuccessful artificial insemination donors. The authors concluded that ‘Conventional semen analysis may be too insensitive an indicator of high fertility [in AID]’:

(a) the table would be more informative if P values were given;
(b) the t test is important to the conclusion given;
(c) it is likely that semen count follows a Normal distribution;
(d) if the null hypothesis were true, the sampling distribution of the t test statistic for semen count would approximate to a t distribution;
(e) if the null hypothesis were false, the power of the t test for semen count could be increased by a log transformation.

55. If we take samples of size n from a Normal distribution and calculate the sample mean $\bar{x}$ and variance $s^2$:

(a) samples with large values of $\bar{x}$ will tend to have large $s^2$;
(b) the sampling distribution of $\bar{x}$ will be Normal;
(c) the sampling distribution of $s^2$ will be related to the Chi-squared distribution with n-1 degrees of freedom;
(e) the sampling distribution of s will be approximately Normal if n > 20.

56. In the one-way analysis of variance table for the comparison of three groups:

(a) the group mean square + the error mean square = the total mean square;
(b) there are two degrees of freedom for groups;
(c) the group sum of squares + the error sum of squares = the total sum of squares;
(d) the numbers in each group must be equal;
(e) the group degrees of freedom + the error degrees of freedom = the total degrees of freedom.

10E Exercise: The paired t method

Table 10.19 shows the total static compliance of the respiratory system and the arterial oxygen tension (pa(O2)) in 16 patients in intensive care (Al-Saady, personal communication). The patients' breathing was assisted by a respirator and the question was whether their respiration could be improved by varying the characteristics of the air flow. Table 10.19 compares a constant inspiratory flow waveform with a decelerating inspiratory flow waveform. We shall examine the effect of waveform on compliance.

Table 10.19. pa(O2) and compliance for two inspiratory flow waveforms

Patient  pa(O2) (kPa), by waveform   Compliance (ml/cm H2O), by waveform
         Constant  Decelerating      Constant  Decelerating

1 9.1 10.8 65.4 72.9

2 5.6 5.9 73.7 94.4

3 6.7 7.2 37.4 43.3

4 8.1 7.9 26.3 29.0

5 16.2 17.0 65.0 66.4

6 11.5 11.6 35.2 36.4

7 7.9 8.4 24.7 27.7

8 7.2 10.0 23.0 27.5

9 17.7 22.3 133.2 178.2

10 10.5 11.1 38.4 39.3

11 9.5 11.1 29.2 31.8

12 13.7 11.7 28.3 26.9

13 9.7 9.0 46.6 45.0

14 10.5 9.9 61.5 58.2

15 6.9 6.3 25.7 25.7

16 18.1 13.9 48.7 42.3

1. Calculate the changes in compliance. Find a stem and leaf plot (hint: you will need both a zero and a minus zero row).

2. As a check on the validity of the t method, plot the difference against the subject's mean compliance. Do they appear to be related?

3. Calculate the mean, variance, standard deviation and standard error of the mean for the compliance differences.

4. Even though the compliance differences are far from a Normal distribution, calculate the 95% confidence interval using the t distribution. We will compare this with that for transformed data.

5. Find the logarithms of the compliance and repeat steps 1 to 3. Do the assumptions of the t distribution method apply more closely?

6. Calculate the 95% confidence interval for the log difference and transform back to the original scale. What does this mean and how does it compare to that based on the untransformed data?

7. What can be concluded about the effect of inspiratory waveform on static compliance in intensive care patients?




11

Regression and correlation

11.1 Scatter diagrams

In this chapter I shall look at methods of analysing the relationship between two quantitative variables. Consider Table 11.1, which shows data collected by a group of medical students in a physiology class. Inspection of the data suggests that there may be some relationship between FEV1 and height. Before trying to quantify this relationship, we can plot the data and get an idea of its nature. The usual first plot is a scatter diagram, §5.6. Which variable we choose for which axis depends on our ideas as to the underlying relationship between them, as discussed below. Figure 11.1 shows the scatter diagram for FEV1 and height.

Inspection of Figure 11.1 suggests that FEV1 increases with height. The next step is to try and draw a line which best represents the relationship. The simplest line is a straight one; I shall consider more complicated relationships in Chapter 17.

The equation of a straight line relationship between variables x and y is y = a + bx, where a and b are constants. The first, a, is called the intercept. It is the value of y when x is 0. The second, b, is called the slope or gradient of the line. It is the increase in y corresponding to an increase of one unit in x. Their geometrical meaning is shown in Figure 11.2. We can find the values of a and b which best fit the data by regression analysis.

11.2 Regression

Regression is a method of estimating the numerical relationship between variables. For example, we would like to know what is the mean or expected FEV1 for students of a given height, and what increase in FEV1 is associated with a unit increase in height.

Table 11.1. FEV1 and height for 20 male medical students

Height (cm)  FEV1 (litres)  Height (cm)  FEV1 (litres)  Height (cm)  FEV1 (litres)

164.0 3.54 172.0 3.78 178.0 2.98

167.0 3.54 174.0 4.32 180.7 4.80

170.4 3.19 176.0 3.75 181.0 3.96

171.2 2.85 177.0 3.09 183.1 4.78

171.2 3.42 177.0 4.05 183.6 4.56

171.3 3.20 177.0 5.43 183.7 4.68

172.0 3.60 177.4 3.60

Fig. 11.1. Scatter diagram showing the relationship between FEV1 and height for a group of male medical students

Fig. 11.2. Coefficients of a straight line

The name ‘regression’ is due to Galton (1886), who developed the technique to investigate the relationship between the heights of children and of their parents. He observed that if we choose a group of parents of a given height, the mean height of their children will be closer to the mean height of the population than is the given height. In other words, tall parents tend to be taller than their children, short parents tend to be shorter. Galton termed this phenomenon ‘regression towards mediocrity’, meaning ‘going back towards the average’. It is now called regression towards the mean (§11.4). The method used to investigate it was called regression analysis and the name has stuck. However, in Galton's terminology there was ‘no regression’ if the relationship between the variables was such that one predicted the other exactly; in modern terminology there is no regression if the variables are not related at all.

In regression problems we are interested in how well one variable can be used to predict another. In the case of FEV1 and height, for example, we are concerned with estimating the mean FEV1 for a given height rather than mean height for given FEV1. We have two kinds of variables: the outcome variable which we are trying to predict, in this case FEV1, and the predictor or explanatory variable, in this case height. The predictor variable is often called the independent variable and the outcome variable is called the dependent variable. However, these terms have other meanings in probability (§6.2), so I shall not use them. If we denote the predictor variable by X and the outcome by Y, the relationship between them may be written as

$$Y = a + bX + E$$

where a and b are constants and E is a random variable with mean 0, called the error, which represents that part of the variability of Y which is not explained by the relationship with X. If the mean of E were not zero, we could make it so by changing a. We assume that E is independent of X.

11.3 The method of least squares

If the points all lay along a line and there was no random variation, it would be easy to draw a line on the scatter diagram. In Figure 11.1 this is not the case. There are many possible values of a and b which could represent the data and we need a criterion for choosing the best line. Figure 11.3 shows the deviation of a point from the line, the distance from the point to the line in the Y direction. The line will fit the data well if the deviations from it are small, and will fit badly if they are large. These deviations represent the error E, that part of the variable Y not explained by X. One solution to the problem of finding the best line is to choose that which leaves the minimum amount of the variability of Y unexplained, by making the variance of E a minimum. This will be achieved by making the sum of squares of the deviations about the line a minimum. This is called the method of least squares and the line found is the least squares line.

The method of least squares is the best method if the deviations from the line are Normally distributed with uniform variance along the line. This is likely to be the case, as the regression tends to remove from Y the variability between subjects and leave the measurement error, which is likely to be Normal. I shall deal with deviations from this assumption in §11.8.

Many users of statistics are puzzled by the minimization of variation in one direction only. Usually both variables are measured with some error and yet we seem to ignore the error in X. Why not minimize the perpendicular distances to the line rather than the vertical? There are two reasons for this. First, we are finding the best prediction of Y from the observed values of X, not from the ‘true’ values of X. The measurement error in both variables is one of the causes of deviations from the line, and is included in these deviations measured in the Y direction. Second, the line found by minimizing perpendicular distances depends on the units in which the variables are measured. For the data of Table 11.1 the line found by this method is

FEV1 (litre) = -9.33 + 0.075 × height (cm)

If we measure height in metres instead of centimetres, we get

FEV1 (litre) = -34.70 + 22.0 × height (m)

Thus by this method the predicted FEV1 for a student of height 170 cm is 3.42 litres, but for a student of height 1.70 m it is 2.70 litres. This is clearly unsatisfactory and we will not consider this approach further.
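The dependence on units can be reproduced with a small Python sketch. It assumes that "minimizing perpendicular distances" means the usual orthogonal (major axis) line through the point of means, whose slope has the closed form used in the hypothetical function `perpendicular_line` below. The coefficients it gives are close to those quoted above, though small rounding differences are to be expected.

```python
from math import sqrt

# Data of Table 11.1: height (cm) and FEV1 (litres) for 20 students.
height_cm = [164.0, 167.0, 170.4, 171.2, 171.2, 171.3, 172.0,
             172.0, 174.0, 176.0, 177.0, 177.0, 177.0, 177.4,
             178.0, 180.7, 181.0, 183.1, 183.6, 183.7]
fev1 = [3.54, 3.54, 3.19, 2.85, 3.42, 3.20, 3.60,
        3.78, 4.32, 3.75, 3.09, 4.05, 5.43, 3.60,
        2.98, 4.80, 3.96, 4.78, 4.56, 4.68]


def perpendicular_line(x, y):
    """Intercept and slope of the line minimizing the sum of squared
    perpendicular distances (assumed: orthogonal regression)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = (syy - sxx + sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return y_bar - slope * x_bar, slope


a_cm, b_cm = perpendicular_line(height_cm, fev1)
height_m = [h / 100 for h in height_cm]
a_m, b_m = perpendicular_line(height_m, fev1)

# The same student, described in two different units.
pred_cm = a_cm + b_cm * 170
pred_m = a_m + b_m * 1.70

print(f"height in cm: FEV1 = {a_cm:.2f} + {b_cm:.3f} x height")
print(f"height in m:  FEV1 = {a_m:.2f} + {b_m:.1f} x height")
print(f"predicted FEV1: {pred_cm:.2f} litres (cm) vs {pred_m:.2f} litres (m)")
```

If the line did not depend on the units, the slope in metres would be exactly 100 times the slope in centimetres and the two predictions would agree.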

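Fig. 11.3. Deviations from the line in the y direction

Returning to Figure 11.3, the equation of the line which minimizes the sum of squared deviations from the line in the outcome variable is found quite easily (§11A). The solution is:

$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

We then find the intercept a by

$$a = \bar{y} - b\bar{x}$$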
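The equation Y = a + bX is called the regression equation of Y on X, Y being the outcome variable and X the predictor. The gradient, b, is also called the regression coefficient. We shall calculate it for the data of Table 11.1. We have

$$\sum x_i = 3507.6 \qquad \bar{x} = 175.38 \qquad \sum (x_i - \bar{x})^2 = 576.352$$

$$\sum y_i = 77.12 \qquad \bar{y} = 3.856 \qquad \sum (y_i - \bar{y})^2 = 9.43868$$

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = 42.874$$

We do not need the sum of squares for Y yet, but we shall later.

$$b = \frac{42.874}{576.352} = 0.0744 \qquad a = 3.856 - 0.0744 \times 175.38 = -9.19$$

Hence the regression equation of FEV1 on height is

FEV1 (litres) = -9.19 + 0.0744 × height (cm)

Figure 11.4 shows the line drawn on the scatter diagram.

The coefficients a and b have dimensions, depending on those of X and Y. If we change the units in which X and Y are measured we also change a and b, but we do not change the line. For example, if height is measured in metres we divide the $x_i$ by 100 and we find that b is multiplied by 100 to give b = 7.4389 litres/m. The line is

FEV1 (litres) = -9.19 + 7.44 × height (m)

This is exactly the same line on the scatter diagram.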
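The least squares calculation for Table 11.1, and the effect of changing the units of height, can be sketched in Python as follows. The function name `least_squares` is an illustrative choice, and only the standard formulae for the slope and intercept are used.

```python
# Data of Table 11.1: height (cm) and FEV1 (litres) for 20 students.
height_cm = [164.0, 167.0, 170.4, 171.2, 171.2, 171.3, 172.0,
             172.0, 174.0, 176.0, 177.0, 177.0, 177.0, 177.4,
             178.0, 180.7, 181.0, 183.1, 183.6, 183.7]
fev1 = [3.54, 3.54, 3.19, 2.85, 3.42, 3.20, 3.60,
        3.78, 4.32, 3.75, 3.09, 4.05, 5.43, 3.60,
        2.98, 4.80, 3.96, 4.78, 4.56, 4.68]


def least_squares(x, y):
    """Return intercept a and slope b of the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squares about the mean for x, and sum of products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


a_cm, b_cm = least_squares(height_cm, fev1)

# Measuring height in metres rescales the slope but not the line.
height_m = [h / 100 for h in height_cm]
a_m, b_m = least_squares(height_m, fev1)

print(f"FEV1 = {a_cm:.2f} + {b_cm:.4f} x height (cm)")
print(f"FEV1 = {a_m:.2f} + {b_m:.2f} x height (m)")
```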

Fig. 11.4. The regression of FEV1 on height

Fig. 11.5. The two regression lines for the data of Tables 11.1 and 10.15

11.4 * The regression of X on Y

What happens if we change our choice of outcome and predictor variables? The regression equation of height on FEV1 is

height = 158 + 4.54 × FEV1

This is not the same line as the regression of FEV1 on height. For if we rearrange this equation to make FEV1 the subject, dividing each side by 4.54, we get

FEV1 = -34.8 + 0.220 × height

The slope of the regression of height on FEV1 is greater than that of FEV1 on height (Figure 11.5). In general, the slope of the regression of X on Y is greater than that of Y on X, when X is the horizontal axis. Only if all the points lie exactly on a straight line are the two equations the same.

Figure 11.5 also shows the two 30 second pulse measurements of Table 10.15, with the lines representing the regression of the second measurement on the first and the first measurement on the second. The regression equations are 2nd pulse = 17.3 + 0.572 × 1st pulse and 1st pulse = 14.9 + 0.598 × 2nd pulse. Each regression coefficient is less than one. This means that for subjects with any given first pulse measurement, the predicted second pulse measurement will be closer to the mean than the first measurement, and for any given second pulse measurement, the predicted first measurement will be closer to the mean than the second measurement. This is regression towards the mean (§11.2). Regression towards the mean is a purely statistical phenomenon, produced by the selection of the given value of the predictor and the imperfect relationship between the variables. Regression towards the mean may manifest itself in many ways. For example, suppose we measure the blood pressure of an unselected group of people and then select subjects with high blood pressure, e.g. diastolic > 95 mm Hg. If we then measure the selected group again, the mean diastolic pressure for the selected group will be less on the second occasion than on the first, without any intervention or treatment. The apparent fall is caused by the initial selection.

11.5 The standard error of the regression coefficient

In any estimation procedure, we want to know how reliable our estimates are. We do this by finding their standard errors and hence confidence intervals. We can also test hypotheses about the coefficients, for example, the null hypothesis that in the population the slope is zero and there is no linear relationship. The details are given in §11C. We first find the sum of squares of the deviations from the line, that is, the difference between the observed $y_i$ and the values predicted by the regression line. This is

$$\sum (y_i - a - bx_i)^2 = \sum (y_i - \bar{y})^2 - b^2 \sum (x_i - \bar{x})^2$$

In order to estimate the variance we need the degrees of freedom with which to divide the sum of squares. We have estimated not one parameter from the data, as for the sum of squares about the mean (§4.6), but two, a and b. We lose two degrees of freedom, leaving us with n-2. Hence the variance of Y about the line, called the residual variance, is

$$s^2 = \frac{1}{n-2}\left(\sum (y_i - \bar{y})^2 - b^2 \sum (x_i - \bar{x})^2\right)$$

If we are to estimate the variation about the line, we must assume that it is the same all the way along the line, i.e. that the variance is uniform. This is the same as for the two-sample t method (§10.3) and analysis of variance (§10.9). For the FEV1 data the sum of squares due to the regression is $0.074389^2 \times 576.352 = 3.18937$ and the sum of squares about the regression is 9.43868 - 3.18937 = 6.24931. There are 20 - 2 = 18 degrees of freedom, so the variance about the regression is $s^2 = 6.24931/18 = 0.34718$. The standard error of b is given by

$$SE(b) = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}} = \sqrt{\frac{0.34718}{576.352}} = 0.02454$$

We have already assumed that the error E is Normally distributed, so b must be, too. The standard error is based on a single sum of squares, so b/SE(b) is an observation from the t distribution with n-2 degrees of freedom (§10.1). We can find a 95% confidence interval for b by taking t standard errors on either side of the estimate. For the example, we have 18 degrees of freedom. From Table 10.1, the 5% point of the t distribution is 2.10, so the 95% confidence interval for b is 0.074389 - 2.10 × 0.02454 to 0.074389 + 2.10 × 0.02454 or 0.02 to 0.13 litres/cm. We can see that FEV1 and height are related, though the slope is not very well estimated.

We can also test the null hypothesis that, in the population, the slope = 0 against the alternative that the slope is not equal to 0, a relationship in either direction. The test statistic is b/SE(b) and if the null hypothesis is true this will be from a t distribution with n-2 degrees of freedom. For the example,

$$t = \frac{b}{SE(b)} = \frac{0.074389}{0.02454} = 3.03$$

From Table 10.1 this has two-tailed probability of less than 0.01. The computer tells us that the probability is about 0.007. Hence the data are inconsistent with the null hypothesis and the data provide fairly good evidence that a relationship exists. If the sample were much larger, we could dispense with the t distribution and use the Standard Normal distribution in its place.
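The residual variance, the standard error of the slope, the confidence interval and the t statistic can all be recomputed from Table 11.1. The Python sketch below does so under the same assumptions as the text. The 5% point 2.10 is taken from the text, and the variable names are illustrative.

```python
from math import sqrt

# Data of Table 11.1: height (cm) and FEV1 (litres) for 20 students.
height_cm = [164.0, 167.0, 170.4, 171.2, 171.2, 171.3, 172.0,
             172.0, 174.0, 176.0, 177.0, 177.0, 177.0, 177.4,
             178.0, 180.7, 181.0, 183.1, 183.6, 183.7]
fev1 = [3.54, 3.54, 3.19, 2.85, 3.42, 3.20, 3.60,
        3.78, 4.32, 3.75, 3.09, 4.05, 5.43, 3.60,
        2.98, 4.80, 3.96, 4.78, 4.56, 4.68]

n = len(height_cm)
x_bar = sum(height_cm) / n
y_bar = sum(fev1) / n
sxx = sum((x - x_bar) ** 2 for x in height_cm)
syy = sum((y - y_bar) ** 2 for y in fev1)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(height_cm, fev1))

b = sxy / sxx

# Sum of squares about the regression, on n - 2 degrees of freedom.
residual_ss = syy - b ** 2 * sxx
s2 = residual_ss / (n - 2)

se_b = sqrt(s2 / sxx)
t_stat = b / se_b

T_CRIT = 2.10  # 5% point of t with 18 degrees of freedom, from the text
ci = (b - T_CRIT * se_b, b + T_CRIT * se_b)

print(f"residual variance = {s2:.5f}")
print(f"b = {b:.6f}, SE(b) = {se_b:.5f}, t = {t_stat:.2f}")
print(f"95% CI for b: {ci[0]:.2f} to {ci[1]:.2f} litres/cm")
```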

11.6* Using the regression line for prediction

We can use the regression equation to predict the mean or expected Y for any given value of X. This is called the regression estimate of Y. We can use this to say whether any individual has an observed Y greater or less than would be expected given X. For example, the predicted FEV1 for students with height 177 cm is -9.19 + 0.0744 × 177 = 3.98 litres. Three subjects had height 177 cm. The first had observed FEV1 of 5.43 litres, 1.45 litres above that expected. The second had a rather low FEV1 of 3.09 litres, 0.89 litres below expectation, while the third with an FEV1 of 4.05 litres was very close to that predicted. We can use this clinically to adjust a measured lung function for height and thus get a better idea of the patient's status. We would, of course, use a much larger sample to establish a precise estimate of the regression equation. We can also use a variant of the method (§17.1) to adjust FEV1 for height in comparing different groups, where we can both remove variation in FEV1 due to variation in height and allow for differences in mean height between the groups. We may wish to do this to compare patients with respiratory disease on different therapies, or to compare subjects exposed to different environmental factors, such as air pollution, cigarette smoking, etc.

Fig. 11.6. Confidence intervals for the regression estimate

As with all sample estimates, the regression estimate is subject to sampling variation. We estimate its precision by standard error and confidence interval in the usual way. The standard error of the expected Y for an observed value x is

\[ \sqrt{s^2\left(\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right)} \]

We need not go into the algebraic details of this. It is very similar to that in §11C. For x = 177 the standard error is 0.138 litres.

This gives a 95% confidence interval of 3.98 - 2.10 × 0.138 to 3.98 + 2.10 × 0.138, giving from 3.69 to 4.27 litres. Here 3.98 is the estimate and 2.10 is the 5% point of the t distribution with n - 2 = 18 degrees of freedom.

The standard error is a minimum at x = x̄, and increases as we move away from x̄ in either direction. It can be useful to plot the standard error and 95% confidence interval about the line on the scatter diagram. Figure 11.6 shows this for the FEV1 data. Notice that the lines diverge considerably as we reach the extremes of the data. It is very dangerous to extrapolate beyond the data. Not only do the standard errors become very wide, but we often have no reason to suppose that the straight line relationship would persist.

The intercept a, the predicted value of Y when X = 0, is a special case of this. Clearly, we cannot actually have a medical student of height zero and with FEV1 of -9.19 litres. Figure 11.6 also shows the confidence interval for the regression estimate with a much smaller scale, to show the intercept. The confidence interval is very wide at height = 0, and this does not take account of any breakdown in linearity.

We may wish to use the value of X for a subject to estimate that subject's individual value of Y, rather than the mean for all subjects with this X. The estimate is the same as the regression estimate, but the standard error is much greater:

\[ \sqrt{s^2\left(1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right)} \]

For a student with a height of 177 cm, the predicted FEV1 is 3.98 litres, with standard error 0.61 litres. Figure 11.7 shows the precision of the prediction of a further observation. As we might expect, the 95% confidence intervals include all but one of the 20 observations. This is only going to be a useful prediction when the residual variance s² is small.
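
The two standard errors differ only by the extra s² inside the square root, which a short calculation makes visible. This is a sketch under stated assumptions: s², n and the sum of squares of height are the values quoted in the text, but the mean height is not given in this section, so `XBAR_ASSUMED` is a hypothetical placeholder and the printed standard errors will not reproduce 0.138 and 0.61 exactly.

```python
import math

S2 = 0.34718          # variance about the regression (from the text)
N = 20                # number of observations (from the text)
SXX = 576.352         # sum of squares of height about its mean (from the text)
XBAR_ASSUMED = 175.0  # hypothetical mean height in cm; NOT given in the text

def se_mean(x, xbar=XBAR_ASSUMED):
    """Standard error of the regression estimate (mean Y) at height x."""
    return math.sqrt(S2 * (1 / N + (x - xbar) ** 2 / SXX))

def se_individual(x, xbar=XBAR_ASSUMED):
    """Standard error of the prediction for one further subject at height x."""
    return math.sqrt(S2 * (1 + 1 / N + (x - xbar) ** 2 / SXX))

for height in (160, 170, 175, 180, 190):
    print(height, round(se_mean(height), 3), round(se_individual(height), 3))
```

The printed values are smallest at the assumed mean and grow in both directions, which is the widening of the confidence band described above.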

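We can also use the regression equation of Y on X to predict X from Y. This is much less accurate than predicting Y from X. The standard errors, for the mean X and for an individual X at an observed value y, are

\[ \sqrt{\frac{s^2}{b^2}\left(\frac{1}{n} + \frac{(y - \bar{y})^2}{b^2\sum (x_i - \bar{x})^2}\right)} \quad\text{and}\quad \sqrt{\frac{s^2}{b^2}\left(1 + \frac{1}{n} + \frac{(y - \bar{y})^2}{b^2\sum (x_i - \bar{x})^2}\right)} \]

For example, if we use the regression of height on FEV1 (Figure 11.5) to predict the FEV1 of an individual student with height 177 cm, we get a prediction of 4.21 litres, with standard error 1.05 litres. This is almost twice the standard error obtained from the regression of FEV1 on height, 0.61. Only if there is no possibility of deviations in X fulfilling the assumptions of Normal distribution and uniform variance, and so no way of fitting X = a + bY, should we consider predicting X from the regression of Y on X. This might happen if X is fixed in advance, e.g. the dose of a drug.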

11.7* Analysis of residuals

It is often very useful to examine the residuals, the differences between the observed and predicted Y. This is best done graphically. We can assess the assumption of a Normal distribution by looking at the histogram or Normal plot (§7.5). Figure 11.8 shows these for the FEV1 data. The fit is quite good.

Figure 11.9 shows a plot of residuals against the predictor variable. This plot enables us to examine deviations from linearity. For example, if the true relationship were quadratic, so that Y increases more and more rapidly as X increases, we should see that the residuals are related to X. Large and small X would tend to have positive residuals whereas central values would have negative residuals. Figure 11.9 shows no relationship between the residuals and height, and the linear model seems to be an adequate fit to the data.

Fig. 11.7. Confidence interval for a further observation

Fig. 11.8. Distribution of residuals for the FEV1 data

Fig. 11.9. Residuals against height for the FEV1 data

Fig. 11.10. Data which do not meet the conditions of the method of least squares, before and after log transformation

Figure 11.9 shows something else, however. One point stands out as having a rather larger residual than the others. This may be an outlier, a point which may well come from a different population. It is often difficult to know what to do with such data. At least we have been warned to double check this point for transcription errors. It is all too easy to transpose adjoining digits when transferring data from one medium to another. This may have been the case here, as an FEV1 of 4.53, rather than the 5.43 recorded, would have been more in line with the rest of the data. If this happened at the point of recording, there is not much we can do about it. We could try to measure the subject again, or exclude him and see whether this makes any difference. I think that, on the whole, we should work with all the data unless there are very good reasons for not doing so. I have retained this case here.

11.8* Deviations from assumptions in regression

Both the appropriateness of the method of least squares and the use of the t distribution for confidence intervals and tests of significance depend on the assumption that the residuals are from a Normal distribution with uniform variance. This assumption is easily met, for the same reasons that it is in the paired t test (§10.2). The removal of the variation due to X tends to remove some of the variation between individuals, leaving the measurement error. Problems can arise, however, and it is always a good idea to plot the original scatter diagram and the residuals to check that there are no gross departures from the assumptions of the method. Not only does this help preserve the validity of the statistical method used, but it may also help us learn more about the structure of the data.

Figure 11.10 shows the relationship between gestational age and cord blood levels of AVP, the antidiuretic hormone, in a sample of male foetuses. The variability of the outcome variable AVP depends on the actual value of the variable, being larger for large values of AVP. The assumptions of the method of least squares do not apply. However, we can use a transformation as we did for the comparison of means in §10.4. Figure 11.10 also shows the data after AVP has been log transformed, together with the least squares line.

As in §10.4, the transformation is found by trial and error. The log transformation enables us to interpret the regression coefficient in a way which other transformations do not. I used logs to base 10 for this transformation and got the following regression equation:

log10(AVP) = -0.651253 + 0.011771 × gestational age

This means that for every one day increase in gestational age, log10(AVP) increases by 0.011771. Adding 0.011771 to log10(AVP) multiplies AVP by 10^0.011771 = 1.027, the antilog of 0.011771. We can antilog the confidence limits for the slope to give the confidence interval for this factor.

It may be more convenient to report the increase per week or per month. These would be factors of 10^(0.011771 × 7) = 1.209 or 10^(0.011771 × 30) = 2.255 respectively. When the data are a random sample, it is often convenient to quote the slope calculated from logs as the effect of a difference of one standard deviation of the predictor. For gestational age the standard deviation is 61.16104 days, so the effect of a change of one SD is to multiply AVP by 10^(0.011771 × 61.16104) = 5.247, so a difference of one standard deviation is associated with a fivefold increase in AVP. Another approach is to look at the difference between two centiles, such as the 10th and the 90th. For gestational age these are 98 and 273 days, so the effect on AVP would be to multiply it by 10^(0.011771 × (273 - 98)) = 114.796. Thus the difference over this intercentile range is to raise AVP 115-fold.
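
The multiplicative factors quoted above all come from the same antilog calculation. As a small illustrative sketch (the function name `avp_factor` is made up for this example, not from the text), the slope on the log10 scale is converted to a ratio for any difference in gestational age:

```python
SLOPE_LOG10 = 0.011771   # increase in log10(AVP) per day, from the text

def avp_factor(days):
    """Factor by which AVP is multiplied for a difference of `days` in gestational age."""
    return 10 ** (SLOPE_LOG10 * days)

per_day = avp_factor(1)
per_week = avp_factor(7)
per_month = avp_factor(30)
per_sd = avp_factor(61.16104)      # one standard deviation of gestational age
centile_range = avp_factor(273 - 98)   # 10th to 90th centile

print(round(per_day, 3), round(per_week, 3), round(per_month, 3))
print(round(per_sd, 3), round(centile_range, 3))
```

The same function applied to the confidence limits of the slope would give the confidence interval for each factor.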

11.9 Correlation

The regression method tells us something about the nature of the relationship between two variables, how one changes with the other, but it does not tell us how close that relationship is. To do this we need a different coefficient, the correlation coefficient. The correlation coefficient is based on the sum of products about the mean of the two variables, so I shall start by considering the properties of the sum of products and why it is a good indicator of the closeness of the relationship.

Figure 11.11 shows the scatter diagram of Figure 11.1 with two new axes drawn through the mean point. The distances of the points from these axes represent the deviations from the mean. In the top right section of Figure 11.11, the deviations from the mean of both variables, FEV1 and height, are positive. Hence, their products will be positive. In the bottom left section, the deviations from the mean of the two variables will both be negative. Again, their product will be positive. In the top left section of Figure 11.11, the deviations of FEV1 from its mean will be positive, and the deviation of height from its mean will be negative. The product of these will be negative. In the bottom right section, the product will again be negative. So in Figure 11.11 nearly all these products will be positive, and their sum will be positive. We say that there is a positive correlation between the two variables; as one increases so does the other. If one variable decreased as the other increased, we would have a scatter diagram where most of the points lay in the top left and bottom right sections. In this case the sum of the products would be negative and there would be a negative correlation between the variables. When the two variables are not related, we have a scatter diagram with roughly the same number of points in each of the sections. In this case, there are as many positive as negative products, and the sum is zero. There is zero correlation or no correlation. The variables are said to be uncorrelated.

Fig. 11.11. Scatter diagram with axes through the mean point

The value of the sum of products depends on the units in which the two variables are measured. We can find a dimensionless coefficient if we divide the sum of products by the square roots of the sums of squares of X and Y. This gives us the product moment correlation coefficient, or the correlation coefficient for short, usually denoted by r.

If the n pairs of observations are denoted by (xi, yi), then r is given by

\[ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} \]

For the FEV1 and height we have

\[ r = \frac{42.874}{\sqrt{576.352 \times 9.43868}} = 0.58 \]

where 42.874 is the sum of products, equal to b times the sum of squares of height, 0.074389 × 576.352.

The effect of dividing the sum of products by the root sum of squares of deviations of each variable is to make the correlation coefficient lie between -1.0 and +1.0. When all the points lie exactly on a straight line such that Y increases as X increases, r = 1. This can be shown by putting a + bxi in place of yi in the equation for r; everything cancels out leaving r = 1. When all the points lie exactly on a straight line with negative slope, r = -1. When there is no relationship at all, r = 0, because the sum of products is zero. The correlation coefficient describes the closeness of the linear relationship between two variables. It does not matter which variable we take to be Y and which to be X. There is no choice of predictor and outcome variable, as there is in regression.

Fig. 11.12. Data where the correlation coefficient may be misleading

The correlation coefficient measures how close the points are to a straight line. Even if there is a perfect mathematical relationship between X and Y, the correlation coefficient will not be exactly 1 unless this is of the form y = a + bx. For example, Figure 11.12 shows two variables which are perfectly related but have r = 0.86. Figure 11.12 also shows two variables which are clearly related but have zero correlation, because the relationship is not linear. This shows again the importance of plotting the data and not relying on summary statistics such as the correlation coefficient only. In practice, relationships like those of Figure 11.12 are rare in medical data, although the possibility is always there. More often, there is so much random variation that it is not easy to discern any relationship at all.

The correlation coefficient r is related to the regression coefficient b in a simple way. If Y = a + bX is the regression of Y on X, and X = a′ + b′Y is the regression of X on Y, then r² = bb′. This arises from the formulae for r and b. For the FEV1 data, b = 0.074389 and b′ = 4.5424, so bb′ = 0.074389 × 4.5424 = 0.33790, the square root of which is 0.58129, the correlation coefficient. We also have

\[ r^2 = \frac{\text{sum of squares due to regression}}{\text{total sum of squares}} = \frac{3.18937}{9.43868} = 0.33790 \]

This is the proportion of variability explained, described in §11.5.
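
The links between r, the two regression slopes and the proportion of variability explained can be verified numerically. The sketch below is illustrative only; it rebuilds the sum of products from the slope and the sum of squares of height quoted earlier in the chapter, and the variable names are not from the text.

```python
import math

b = 0.074389      # slope of FEV1 on height (from the text)
sxx = 576.352     # sum of squares of height about its mean
syy = 9.43868     # sum of squares of FEV1 about its mean

sxy = b * sxx                     # sum of products about the mean
r = sxy / math.sqrt(sxx * syy)    # product moment correlation coefficient
b_prime = sxy / syy               # slope of height on FEV1
explained = (b ** 2 * sxx) / syy  # regression sum of squares / total

print(f"r = {r:.5f}, b' = {b_prime:.4f}")
print(f"b*b' = {b * b_prime:.5f}, r^2 = {r ** 2:.5f}, explained = {explained:.5f}")
```

All three routes to r² agree because each is the squared sum of products divided by the product of the two sums of squares.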

Table 11.2. Two-sided 5% and 1% points of the distribution of the correlation coefficient, r, under the null hypothesis

  n     5%     1%       n     5%     1%        n     5%     1%
  3    1.00   1.00     16    0.50   0.62      29    0.37   0.47
  4    0.95   0.99     17    0.48   0.61      30    0.36   0.46
  5    0.88   0.96     18    0.47   0.59      40    0.31   0.40
  6    0.81   0.92     19    0.46   0.58      50    0.28   0.36
  7    0.75   0.87     20    0.44   0.56      60    0.25   0.33
  8    0.71   0.83     21    0.43   0.55      70    0.24   0.31
  9    0.67   0.80     22    0.42   0.54      80    0.22   0.29
 10    0.63   0.77     23    0.41   0.53      90    0.21   0.27
 11    0.60   0.74     24    0.40   0.52     100    0.20   0.25
 12    0.58   0.71     25    0.40   0.51     200    0.14   0.18
 13    0.55   0.68     26    0.39   0.50     500    0.09   0.12
 14    0.53   0.66     27    0.38   0.49    1000    0.06   0.08
 15    0.51   0.64     28    0.37   0.48

n = Number of observations.

11.10 Significance test and confidence interval for r

Testing the null hypothesis that r = 0 in the population, i.e. that there is no linear relationship, is simple. The test is numerically equivalent to testing the null hypothesis that b = 0, and the test is valid provided at least one of the variables is from a Normal distribution. This condition is effectively the same as that for testing b, where the residuals in the Y direction must be Normal. If b = 0, the residuals in the Y direction are simply the deviations from the mean, and these will only be Normally distributed if Y is. If the condition is not met, we can use a transformation (§11.8), or one of the rank correlation methods (§12.4-5).

Because the correlation coefficient does not depend on the means or variances of the observations, the distribution of the sample correlation coefficient when the population coefficient is zero is easy to tabulate. Table 11.2 shows the correlation coefficient at the 5% and 1% level of significance. For the example we have r = 0.58 from 20 observations. The 1% point for 20 observations is 0.56, so we have P < 0.01, and the correlation is unlikely to have arisen if there were no linear relationship in the population. Note that the values of r which can arise by chance with small samples are quite high. With 10 points r would have to be greater than 0.63 to be significant. On the other hand with 1000 points very small values of r, as low as 0.06, will be significant.

Finding a confidence interval for the correlation coefficient is more difficult. Even when X and Y are both Normally distributed, r does not itself approach a Normal distribution until the sample size is in the thousands. Furthermore, its distribution is rather sensitive to deviations from the Normal in X and Y. However, if both variables are from Normal distributions, Fisher's z transformation gives a Normally distributed variable whose mean and variance are known in terms of the population correlation coefficient which we wish to estimate. From this a confidence interval can be found. Fisher's z transformation is

\[ z = \frac{1}{2}\log_e\left(\frac{1 + r}{1 - r}\right) \]

which follows a Normal distribution with mean

\[ z_\rho = \frac{1}{2}\log_e\left(\frac{1 + \rho}{1 - \rho}\right) \]

and variance 1/(n - 3), where ρ is the population correlation coefficient. For the FEV1 data, r = 0.58 and n = 20, so z = 0.6625 with standard error 1/√17 = 0.2425. The 95% confidence interval for z_ρ is 0.6625 ± 1.96 × 0.2425, or 0.187 to 1.138. Transforming back to the correlation scale using r = (e^(2z) - 1)/(e^(2z) + 1), so for the lower limit we have

\[ \frac{e^{2 \times 0.187} - 1}{e^{2 \times 0.187} + 1} = 0.18 \]

and for the upper limit

\[ \frac{e^{2 \times 1.138} - 1}{e^{2 \times 1.138} + 1} = 0.81 \]

and the 95% confidence interval is 0.18 to 0.81. This is very wide, reflecting the sampling variation which the correlation coefficient has for small samples. Correlation coefficients must be treated with some caution when derived from small samples.

The ease of the significance test compared to the relative complexity of the confidence interval calculation has meant that in the past a significance test was usually given for the correlation coefficient. The increasing availability of computers with well-written statistical packages should lead to correlation coefficients appearing with confidence intervals in the future.
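
The confidence interval calculation is short once written as code. The following is a minimal sketch, assuming both variables are from Normal distributions as the method requires; the function name `fisher_ci` is illustrative. It relies on the fact that Fisher's z is the inverse hyperbolic tangent of r, so `math.atanh` and `math.tanh` perform the transformation and its inverse.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation coefficient via Fisher's z."""
    z = math.atanh(r)              # identical to 0.5 * log((1 + r) / (1 - r))
    se = 1 / math.sqrt(n - 3)      # standard error of z
    lower_z, upper_z = z - z_crit * se, z + z_crit * se
    return math.tanh(lower_z), math.tanh(upper_z)   # back to the r scale

lower, upper = fisher_ci(0.58, 20)
print(f"95% CI for r: {lower:.2f} to {upper:.2f}")
```

Because the back-transformation is non-linear, the interval is not symmetrical about r.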

Table 11.3. Simulated data showing 10 pairs of measurements of two independent variables for four subjects

         Subject 1      Subject 2      Subject 3      Subject 4
          x     y        x     y        x     y        x
         47    51       49    52       51    46       63
         46    53       50    56       46    48       70
         50    57       42    46       46    47       63
         52    54       48    52       45    55       58
         46    55       60    53       52    49       59
         36    53       47    49       54    61       61
         47    54       51    52       48    53       67
         46    57       57    50       47    48       64
         36    61       49    50       47    50       59
         44    57       49    49       54    44       61
Means   45.0  55.2     50.2  50.9     49.0  50.1     62.5
         r = -0.33      r = 0.49       r = 0.06       r = -0.39
         P = 0.35       P = 0.15       P = 0.86       P = 0.27

11.11 Uses of the correlation coefficient

The correlation coefficient has several uses. Using Table 11.2, it provides a simple test of the null hypothesis that the variables are not linearly related, with less calculation than the regression method. It is also useful as a summary statistic for the strength of relationship between two variables. This is of great value when we are considering the interrelationships between a large number of variables. We can set up a square array of the correlations of each pair of variables, called the correlation matrix. Examination of the correlation matrix can be very instructive, but we must bear in mind the possibility of non-linear relationships. There is no substitute for plotting the data. The correlation matrix also provides the starting point for a number of methods for dealing with a large number of variables simultaneously.

Of course, for the reasons discussed in Chapter 3, the fact that two variables are correlated does not mean that one causes the other.

11.12* Using repeated observations

In clinical research we are often able to take several measurements on the same patient. We may want to investigate the relationship between two variables, and take pairs of readings with several pairs from each of several patients. The analysis of such data is quite complex. This is because the variability of measurements made on different subjects is usually much greater than the variability between measurements on the same subject, and we must take these two kinds of variability into account. What we must not do is to put all the data together, as if they were one sample.

Consider the simulated data of Table 11.3. The data were generated from random numbers, and there is no relationship between X and Y at all. First values of X and Y were generated for each 'subject', then a further random number was added to make the individual 'observation'. For each subject separately, there was no significant correlation between X and Y. For the subject means, the correlation coefficient was r = 0.77, P = 0.23. However, if we put all 40 observations together we get r = 0.53, P = 0.0004. Even though the coefficient is smaller than that between subject means, because it is based on 40 pairs of observations rather than 4 it becomes significant. The data are plotted in Figure 11.13, with three other simulations. As the null hypothesis is always true in these simulated data, the population correlations for each 'subject' and for the means are zero. Because the numbers of observations are small, the sample correlations vary greatly. As Table 11.2 shows, large correlation coefficients can arise by chance in small samples. However, the overall correlation is 'significant' in three of the four simulations, though in different directions.

Fig. 11.13. Simulations of 10 pairs of observations on four subjects

We only have four subjects and only four points. By using the repeated data, we are not increasing the number of subjects, but the statistical calculation is done as if we have, and so the number of degrees of freedom for the significance test is incorrectly increased and a spurious significant correlation produced.

There are two simple ways to approach this type of data, and which is chosen depends on the question being asked. If we want to know whether subjects with a high value of X tend to have a high value of Y also, we use the subject means and find the correlation between them. If we have different numbers of observations for each subject, we can use a weighted analysis, weighted by the number of observations for the subject. If we want to know whether changes in one variable in the same subject are paralleled by changes in the other, we need to use multiple regression, taking subjects out as a factor (§17.1, §17.6). In either case, we should not mix observations from different subjects indiscriminately.
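
The effect of pooling repeated observations can be explored by simulation. The sketch below is not the procedure used to produce Table 11.3; it is a hypothetical reconstruction of the same idea, with subject-level values and observation-level noise drawn independently for X and Y. The means, standard deviations and seed are arbitrary assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Product moment correlation coefficient from sums of squares and products."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n_subjects=4, n_obs=10, seed=1):
    """Independent X and Y: a random subject level plus random observation noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        level_x, level_y = rng.gauss(50, 8), rng.gauss(50, 8)   # assumed between-subject SD
        xs = [level_x + rng.gauss(0, 4) for _ in range(n_obs)]  # assumed within-subject SD
        ys = [level_y + rng.gauss(0, 4) for _ in range(n_obs)]
        data.append((xs, ys))
    return data

data = simulate()
pooled_x = [x for xs, _ in data for x in xs]
pooled_y = [y for _, ys in data for y in ys]
r_pooled = pearson_r(pooled_x, pooled_y)                 # wrongly treats 40 pairs as independent
r_means = pearson_r([sum(xs) / len(xs) for xs, _ in data],
                    [sum(ys) / len(ys) for _, ys in data])  # based on 4 subjects
r_within = [pearson_r(xs, ys) for xs, ys in data]        # one coefficient per subject

print("pooled:", round(r_pooled, 2), "means:", round(r_means, 2))
print("within subjects:", [round(r, 2) for r in r_within])
```

Running this with several seeds shows pooled coefficients that often track the correlation between the four subject means while being judged against 38 degrees of freedom rather than 2.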

Fig. 11.14. Scatter plots of the 30 second pulse data as in Table 10.15 and with half the pairs of observations reversed

11.13* Intraclass correlation

Sometimes we have pairs of observations where there is no obvious choice of X and Y. The data of Table 10.15 are a good example. Each subject has two measurements made by different observers, different pairs of observers being used for each subject. The choice of X and Y is arbitrary. Figure 11.14 shows the data as in Table 10.15 and with half the pairs arbitrarily reversed. The scatter plots look a little different and there is no good reason to choose one against the other. The correlation coefficients are a little different too: for the original order r = 0.5848 and for the second order r = 0.5804. These are very similar, of course, but which should we use?

It would be nice to have an average correlation coefficient across all the 2^45 possible orderings. This is provided by the intraclass correlation coefficient or ICC. This can be found from the estimates of within subject variance, s²w, and between subjects variance, s²b, found from the analysis of variance in §10.12. We have:

\[ \mathrm{ICC} = \frac{s_b^2}{s_b^2 + s_w^2} \]

For the example, s²w = 14.37 and s²b = 20.19 (§10.12). Hence

\[ \mathrm{ICC} = \frac{20.19}{20.19 + 14.37} = 0.58 \]

The ICC was originally developed for applications such as correlation between variables measured in pairs of twins (which twin is X and which is Y?). We do not have to have pairs of measurements to use the ICC. It works just as well for triplets or for any number of observations within the groups, not necessarily all the same.

Although not used nearly as often as the product moment correlation coefficient, the ICC has some important applications. One is in the study of measurement error and observer variation (§15.2), where if measurements are true replicates the order in which they were made is not important. Another is in the design of cluster-randomized trials where the group is the cluster and may have hundreds of observations within it (§18.8).

Appendices

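11A Appendix: The least squares estimates

This section requires knowledge of calculus. We want to find a and b so that the sum of squares about the line y = a + bx is a minimum. We therefore want to minimize Σ(yi - a - bxi)². This will have a minimum when the partial differentials with respect to a and b are both zero.

\[ \frac{\partial}{\partial a}\sum (y_i - a - bx_i)^2 = -2\sum (y_i - a - bx_i) = 0 \]

\[ \frac{\partial}{\partial b}\sum (y_i - a - bx_i)^2 = -2\sum x_i(y_i - a - bx_i) = 0 \]

The first equation gives Σyi - na - bΣxi = 0, so that a = ȳ - bx̄, and the second gives Σxiyi - aΣxi - bΣxi² = 0. Multiplying the first equation by x̄, and using nx̄ = Σxi, we have

\[ \bar{x}\sum y_i - a\sum x_i - b\bar{x}\sum x_i = 0 \]

Subtracting this from the second equation we get

\[ \sum x_i y_i - \bar{x}\sum y_i - b\left(\sum x_i^2 - \bar{x}\sum x_i\right) = 0 \]

This gives us

\[ b = \frac{\sum x_i y_i - \bar{x}\sum y_i}{\sum x_i^2 - \bar{x}\sum x_i} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad a = \bar{y} - b\bar{x} \]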

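11B Appendix: Variance about the regression line

11C Appendix: The standard error of b

To find the standard error of b, we must bear in mind that in our regression model all the random variation is in Y. We first rewrite the sum of products:

\[ \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum (x_i - \bar{x})y_i - \bar{y}\sum (x_i - \bar{x}) = \sum (x_i - \bar{x})y_i \]

because Σ(xi - x̄) = 0. Hence b = Σ(xi - x̄)yi / Σ(xi - x̄)². The variance of a constant times a random variable is the square of the constant times the variance of the random variable (§6.6). The xi are constants, not random variables, so

\[ \mathrm{VAR}(b) = \frac{\sum (x_i - \bar{x})^2\,\mathrm{VAR}(y_i)}{\left(\sum (x_i - \bar{x})^2\right)^2} \]

VAR(yi) is the same for all yi, say VAR(yi) = s². Hence

\[ \mathrm{VAR}(b) = \frac{s^2}{\sum (x_i - \bar{x})^2} \]

The standard error of b is the square root of this.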


57. In Figure 11.15(a):

(a) predictor and outcome are independent;
(b) predictor and outcome are uncorrelated;
(c) the correlation between predictor and outcome is less than 1;
(d) predictor and outcome are perfectly related;
(e) the relationship is best estimated by simple linear regression.

58. In Figure 11.15(b):

(a) predictor and outcome are independent random variables;
(b) the correlation between predictor and outcome is close to zero;
(c) outcome increases as predictor increases;
(d) predictor and outcome are linearly related;
(e) the relationship could be made linear by a logarithmic transformation of the outcome.

Fig. 11.15. Scatter diagrams

59. A simple linear regression equation:

(a) describes a line which goes through the origin;
(b) describes a line with zero slope;
(c) is not affected by changes of scale;
(d) describes a line which goes through the mean point;
(e) is affected by the choice of dependent variable.

60. If the t distribution is used to find a confidence interval for the slope of a regression line:

(a) deviations from the line in the independent variable must follow a Normal distribution;
(b) deviations from the line in the dependent variable must follow a Normal distribution;
(c) the variance about the line is assumed to be the same throughout the range of the predictor variable;
(d) the y variable must be log transformed;
(e) all the points must lie on the line.

61. The product moment correlation coefficient, r:

(a) must lie between -1 and +1;
(b) can only have a valid significance test carried out when at least one of the variables is from a Normal distribution;
(c) is 0.5 when there is no relationship;
(d) depends on the choice of dependent variable;
(e) measures the magnitude of the change in one variable associated with a change in the other.

11E Exercise: Comparing two regression lines

Table 11.4 and Figure 11.16 show the PEFR and heights of samples of male and female medical students. Table 11.5 shows the sums of squares and products for these data.

1. Estimate the slopes of the regression lines for females and males.

2. Estimate the standard errors of the slopes.

3. Find the standard error for the difference between the slopes, which are independent. Calculate a 95% confidence interval for the difference.

4. Use the standard error to test the null hypothesis that the slopes are the same in the population from which these data come.

Fig. 11.16. PEFR and height for female and male medical students

Table 11.4. Height and PEFR in a sample of medical students

Females

Ht PEFR Ht PEFR Ht PEFR Ht PEFR Ht

155 450 163 428 168 480 164 540 175

155 475 163 548 168 595 167 470 176

155 503 164 485 169 510 167 530 176

158 440 165 485 170 455 167 598 177

160 360 166 430 171 430 168 510 177

161 383 166 440 171 537 168 560 177

161 461 166 485 172 442 170 510 177

161 470 166 510 172 463 170 547 177

161 470 167 415 172 490 170 553 177

161 475 167 455 174 540 170 560 177

161 480 167 470 174 540 171 460 178

162 450 167 500 176 535 171 473 178

162 475 168 430 177 513 171 550 178

162 550 168 440 181 522 171 575 178

163 370 172 480 178

172 550 180

172 620 181

174 550 181

174 550 181

174 616

Table 11.5. Summary statistics for height and PEFR in a sample of medical students

                               Females       Males
Number                              43          58
Sum of squares, height          1469.9      2292.0
Sum of squares, PEFR          101124.8    226994.1
Sum of products about mean      4220.1      9048.2




12

Methods based on rank order

12.1* Non-parametric methods

In Chapters 10 and 11 I described a number of methods of analysis which relied on the assumption that the data came from a Normal distribution. To be more precise, we could say the data come from one of the Normal family of distributions, the particular Normal distribution involved being defined by its mean and standard deviation, the parameters of the distribution. These methods are called parametric because we estimate the parameters of the underlying Normal distribution. Methods which do not assume a particular family of distributions for the data are said to be non-parametric. In this and the next chapter I shall consider some non-parametric tests of significance. There are many others, but these will illustrate the general principle. We have already met one non-parametric test, the sign test (§9.2). The large sample Normal test could also be regarded as non-parametric.

It is useful to distinguish between three types of measurement scales. On an interval scale, the size of the difference between two values on the scale has a consistent meaning. For example, the difference in temperature between 1°C and 2°C is the same as the difference between 31°C and 32°C. On an ordinal scale, observations are ordered, but differences may not have a meaning. For example, anxiety is often measured using sets of questions, the number of positive answers giving the anxiety scale. A set of 36 questions would give a scale from 0 to 36. The difference in anxiety between scores of 1 and 2 is not necessarily the same as the difference between scores 31 and 32. On a nominal scale, we have a qualitative or categorical variable, where individuals are grouped but not necessarily ordered. Eye colour is a good example. When categories are ordered, we can treat the scale as either ordered or nominal, as appropriate.

All the methods of Chapters 10 and 11 apply to interval data, being based on differences of observations from the mean. Most of the methods in this chapter apply to ordinal data. Any interval scale which does not meet the requirements of Chapters 10 and 11 may be treated as ordinal, since it is, of course, ordered. This is the more common application in medical work.

General texts such as Armitage and Berry (1994), Snedecor and Cochran (1980) and Colton (1974) tend not to go into a lot of detail about rank and related methods, and more specialized books are needed (Siegel 1956, Conover 1980).

12.2* The Mann-Whitney U test

This is the non-parametric analogue of the two-sample t test (§10.3). It works like this. Consider the following artificial data showing observations of a variable in two independent groups, A and B:

A    7    4    9   17
B   11    6   21   14

We want to know whether there is any evidence that A and B are drawn from populations with different levels of the variable. The null hypothesis is that there is no tendency for members of one population to exceed members of the other. The alternative is that there is such a tendency, in one direction or the other. First we arrange the observations in ascending order, i.e. we rank them:

4    6    7    9   11   14   17   21
A    B    A    A    B    B    A    B

We now choose one group, say A. For each A, we count how many Bs precede it. For the first A, 4, no Bs precede. For the second, 7, one B precedes, for the third A, 9, one B, for the fourth, 17, three Bs. We add these numbers of preceding Bs together to give U = 0 + 1 + 1 + 3 = 5. Now, if U is very small, nearly all the As are less than nearly all the Bs. If U is large, nearly all As are greater than nearly all Bs. Moderate values of U mean that As and Bs are mixed. The minimum U is 0, when all Bs exceed all As, and maximum U is n1 × n2 when all As exceed all Bs. The magnitude of U has a meaning, because U/n1n2 is an estimate of the probability that an observation drawn at random from population A would exceed an observation drawn at random from population B.

There is another possible U, which we will call U′, obtained by counting the number of As before each B, rather than the number of Bs before each A. This would be 1 + 3 + 3 + 4 = 11. The two possible values of U and U′ are related by U + U′ = n1n2. So we subtract U′ from n1n2 to give 4 × 4 - 11 = 5.

If we know the distribution of U under the null hypothesis that the samples come from the same population, we can say with what probability these data could have arisen if there were no difference. We can carry out the test of significance. The distribution of U under the null hypothesis can be found easily. The two sets of four observations can be arranged in 70 different ways, from AAAABBBB to BBBBAAAA (8!/4!4! = 70, §6A). Under the null hypothesis these arrangements are all equally likely and, hence, have probability 1/70. Each has its value of U, from 0 to 16, and by counting the number of arrangements which give each value of U we can find the probability of that value. For example, U = 0 only arises from the order AAAABBBB and so has probability 1/70 = 0.014. U = 1 only arises from AAABABBB and so has probability 1/70 = 0.014 also. U = 2 can arise in two ways: AAABBABB and AABAABBB. It has probability 2/70 = 0.029. The full set of probabilities is shown in Table 12.1.

We apply this to the example. For groups A and B, U = 5 and the probability of this is 0.071. As we did for the sign test (§9.2) we consider the probability of more extreme values of U, U = 5 or less, which is 0.071 + 0.071 + 0.043 + 0.029 + 0.014 + 0.014 = 0.242.

This gives a one-sided test. For a two-sided test, we must consider the probabilities of a difference as extreme in the opposite direction. We can see from Table 12.1 that the distribution of U is symmetrical, so the probability of an equally extreme value in the opposite direction is also 0.242, hence the two-sided probability is 0.242 + 0.242 = 0.484. Thus the observed difference would have been quite probable if the null hypothesis were true and the two samples could have come from the same population.
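
The counting argument can be carried out mechanically for small samples. The following sketch is illustrative rather than a library routine: it computes U for the artificial groups A and B and then enumerates all 70 equally likely arrangements to obtain the null distribution. The function names are invented for this example.

```python
from collections import Counter
from itertools import combinations

def mann_whitney_u(a, b):
    """For each value in a, count the values in b that precede it (a tie counts one half)."""
    return sum((y < x) + 0.5 * (y == x) for x in a for y in b)

def null_distribution(n1, n2):
    """Exact distribution of U when every arrangement of As and Bs is equally likely."""
    positions = range(n1 + n2)
    counts = Counter()
    for a_positions in combinations(positions, n1):
        a_set = set(a_positions)
        b_positions = [p for p in positions if p not in a_set]
        u = sum(1 for p in a_positions for q in b_positions if q < p)
        counts[u] += 1
    total = sum(counts.values())
    return {u: c / total for u, c in sorted(counts.items())}

group_a = [7, 4, 9, 17]
group_b = [11, 6, 21, 14]
u = mann_whitney_u(group_a, group_b)          # Bs preceding each A
u_prime = mann_whitney_u(group_b, group_a)    # As preceding each B

dist = null_distribution(len(group_a), len(group_b))
one_sided = sum(p for value, p in dist.items() if value <= u)
two_sided = 2 * one_sided                     # valid because the distribution is symmetrical

print(u, u_prime, round(one_sided, 3))
```

Full enumeration is only practical for small groups, which is why tables and the Normal approximation are used for larger samples.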

Table 12.1. Distribution of the Mann-Whitney U statistic, for two samples of size 4

 U   Probability     U   Probability     U   Probability
 0      0.014        6      0.100       12      0.071
 1      0.014        7      0.100       13      0.043
 2      0.029        8      0.114       14      0.029
 3      0.043        9      0.100       15      0.014
 4      0.071       10      0.100       16      0.014
 5      0.071       11      0.071

Table 12.2. Two-sided 5% points for the distribution of the smaller value of U in the Mann-Whitney U test

                          n2
 n1     2    3    4    5    6    7    8    9   10   11
  2     -    -    -    -    -    -    0    0    0    0
  3     -    -    -    0    1    1    2    2    3    3
  4     -    -    0    1    2    3    4    4    5    6
  5     -    0    1    2    3    5    6    7    8    9
  6     -    1    2    3    5    6    8   10   11   13
  7     -    1    3    5    6    8   10   12   14   16
  8     0    2    4    6    8   10   13   15   17   19
  9     0    2    4    7   10   12   15   17   20   23
 10     0    3    5    8   11   14   17   20   23   26
 11     0    3    6    9   13   16   19   23   26   30
 12     1    4    7   11   14   18   22   26   29   33
 13     1    4    8   12   16   20   24   28   33   37
 14     1    5    9   13   17   22   26   31   36   40
 15     1    5   10   14   19   24   29   34   39   44
 16     1    6   11   15   21   26   31   37   42   47
 17     2    6   11   17   22   28   34   39   45   51
 18     2    7   12   18   24   30   36   42   48   55
 19     2    7   13   19   25   32   38   45   52   58
 20     2    8   13   20   27   34   41   48   55   62

If U is less than or equal to the tabulated value the difference is significant.

In practice, there is no need to carry out the summation of probabilities described above, as these are already tabulated. Table 12.2 shows the 5% points of U for each combination of sample sizes n1 and n2 up to 20. For our groups A and B, U = 5. We find the n2 = 4 column and the n1 = 4 row. From this we see that the 5% point for U is 0, and so U = 5 is not significant. If we had calculated the larger of the two values of U, 11, we can use Table 12.2 by finding the lower value, n1n2 - U = 16 - 11 = 5.

Table 12.3. Biceps skinfold thickness (mm) in two groups of patients

     Crohn's Disease          Coeliac Disease
 1.8    2.8    4.2    6.2       1.8    3.8
 2.2    3.2    4.4    6.6       2.0    4.2
 2.4    3.6    4.8    7.0       2.0    5.4
 2.5    3.8    5.6   10.0       2.0    7.6
 2.8    4.0    6.0   10.4       3.0

We can now turn to the practical analysis of some real data. Consider the biceps skinfold thickness data of Table 10.4, reproduced as Table 12.3. We will analyse these using the Mann-Whitney U test. Denote the Crohn's disease group by A and the coeliac group by B. The joint order is as follows:

Let us count the As before each B. Immediately we have a problem. The first A and the first B have the same value. Does the first A come before the first B or after it? We resolve this dilemma by counting one half for the tied A. The ties between the second, third and fourth Bs do not matter, as we can count the number of As before each without difficulty. We have for the U statistic:

U = 0.5 + 1 + 1 + 1 + 6 + 8.5 + 10.5 + 13 + 18 = 59.5

This is the lower value, since n1n2 = 9 × 20 = 180 and so the middle value is 90. We can therefore refer U to Table 12.2. The critical value at the 5% level for groups of size 9 and 20 is 48, which our value exceeds. Hence the difference is not significant at the 5% level and the data are consistent with the null hypothesis that there is no tendency for members of one population to exceed members of the other. This is the same as the result of the t test of §10.4.

For larger values of n1 and n2 calculation of U can be rather tedious. A simple formula for U can be found using the ranks. The rank of the lowest observation is 1, of the next is 2, and so on. If a number of observations are tied, each having the same value and hence the same rank, we give each the average of the ranks they would have were they ordered. For example, in the skinfold data the first two observations are each 1.8. They each receive rank (1 + 2)/2 = 1.5. The third, fourth and fifth are tied at 2.0, giving each of them rank (3 + 4 + 5)/3 = 4. The sixth, 2.2, is not tied and so has rank 6. The ranks for the skinfold data are as follows:

skinfold    1.8    1.8    2.0    2.0    2.0    2.2    2.4    2.5    2.8    2.8
group       A      B      B      B      B      A      A      A      A      A
rank        1.5    1.5    4      4      4      6      7      8      9.5    9.5
                   r1     r2     r3     r4

skinfold    3.0    3.2    3.6    3.8    3.8    4.0    4.2    4.2    4.4    4.8
group       B      A      A      A      B      A      A      B      A      A
rank        11     12     13     14.5   14.5   16     17.5   17.5   19     20
            r5                          r6                   r7

skinfold    5.4    5.6    6.0    6.2    6.6    7.0    7.6    10.0   10.4
group       B      A      A      A      A      A      B      A      A
rank        21     22     23     24     25     26     27     28     29
            r8                                        r9

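We denote the ranks of the B group by r1, r2, …, rn1. The number of As preceding the first B must be r1 - 1, since there are no Bs before it and it is the r1th observation. The number of As preceding the second B is r2 - 2, since it is the r2th observation, and one preceding observation is a B. Similarly, the number preceding the third B is r3 - 3, and the number preceding the ith B is ri - i. Hence we have:

U = Σ(ri - i) = Σri - Σi = Σri - n1(n1 + 1)/2

That is, we add together the ranks of all the n1 observations, subtract n1(n1 + 1)/2 and we have U. For the example, we have

U = 1.5 + 4 + 4 + 4 + 11 + 14.5 + 17.5 + 21 + 27 - 9 × 10/2 = 104.5 - 45 = 59.5

as before. This formula is sometimes written

U′ = n1n2 + n1(n1 + 1)/2 - Σri

But this is simply based on the other group, since U + U′ = n1n2. For testing we use the smaller value, as before.

When the samples are too large for Table 12.2 we use a large sample approximation. If the null hypothesis is true, U is approximately Normally distributed with mean n1n2/2 and standard deviation √(n1n2(n1 + n2 + 1)/12), so that

(U - n1n2/2) / √(n1n2(n1 + n2 + 1)/12)

is an observation from a Standard Normal distribution. For the example, n1 = 9 and n2 = 20. We have

(59.5 - 9 × 20/2) / √(9 × 20 × (9 + 20 + 1)/12) = (59.5 - 90)/√450 = -1.44

From Table 7.1 this gives two-sided probability = 0.15, similar to that found by the two sample t test (§10.3).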
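The rank calculation above can be sketched in a few lines of Python. This is an illustration only, not part of the original text: the function names `midranks` and `mann_whitney_u` are hypothetical, the data are read off the ranked table above, and the Normal probability is taken from the standard library's complementary error function rather than Table 7.1.

```python
import math

# Skinfold data as read off the ranked table in the text (A: 20 values, B: 9 values).
group_a = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
           4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
group_b = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]


def midranks(values):
    """Return the rank of each value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        average = (start + 1 + end + 1) / 2  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def mann_whitney_u(sample, other):
    """U for `sample`: sum of its pooled ranks minus n(n + 1)/2."""
    n = len(sample)
    ranks = midranks(sample + other)
    return sum(ranks[:n]) - n * (n + 1) / 2


n1, n2 = len(group_b), len(group_a)
u = mann_whitney_u(group_b, group_a)        # based on the B group
u_other = mann_whitney_u(group_a, group_b)  # U' based on the A group

# Large sample Normal approximation (no allowance for ties).
z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(u, u_other, round(z, 2), round(p_two_sided, 2))
```

The two values of U sum to n1n2, so either group can be used as the starting point and the smaller value taken for the test.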

Neither Table 12.2 nor the above formula for the standard deviation of U takes ties into account; both assume the data can be fully ranked. Their use for data with ties is an approximation. For small samples we must accept this. For the Normal approximation, ties can be allowed for using the following formula for the standard deviation of U when the null hypothesis is true:

The Mann-Whitney U test is a non-parametric analogue of the two sample t test. The advantage over the t test is that the only assumption about the distribution of the data is that the observations can be ranked, whereas for the t test we must assume the data are from Normal distributions with uniform variance. There are disadvantages. For data which are Normally distributed, the U test is less powerful than the t test, i.e. the t test, when valid, can detect smaller differences for given sample size. The U test is almost as powerful for moderate and large sample sizes, and this difference is important only for small samples. For very small samples, e.g. two groups of three observations, the test is useless as all possible values of U have probabilities above 0.05 (Table 12.2). The U test is primarily a test of significance. The t method also enables us to estimate the size of the difference and gives a confidence interval. Although as noted above U/n1n2 has an interpretation, we cannot, so far as I know, find a confidence interval for it.

Table 12.4. Frequency distributions of number of nodes involved in breast cancers detected at screening and detected in the intervals between screens (data of Mohammed Raja)

Screening cancers         Interval cancers
Nodes     Frequency       Nodes     Frequency
0         291             0         66
1         43              1         22
2         16              2         7
3         20              3         7
4         13              4         2
5         3               5         4
6         1               6         4
7         4               7         3
8         3               8         3
9         1               9         2
10        1               10        2
11        2               12        2
12        1               13        1
15        1               15        1
16        1               16        1
17        2               20        1
18        2
20        1
27        1
33        1
Total     408             Total     128
Mean      1.21            Mean      2.19
Median    0               Median    0
75%ile    1               75%ile    3

The null hypothesis of the Mann–Whitney test is sometimes presented as being that the populations have the same median. There is even a confidence interval for the difference between two medians based on the Mann–Whitney test (Campbell and Gardner 1989). This is surprising, as the medians are not involved in the calculation. Furthermore, we can have two groups which are significantly different using the Mann–Whitney U test yet have the same median. Table 12.4 shows an example. The majority of observations in both groups are zero, so transformation to the Normal is impossible. Although the samples are quite large, the distribution is so skew that a rank method, appropriately adjusted for ties, may be safer than the method of §9.7. The Mann–Whitney U test was highly significant, yet the medians are both zero. As the medians were equal, I suggested the 75th percentile as a measure of location for the distributions.
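To see how two samples can share a median and still differ in the sense tested by the U statistic, the summary rows of Table 12.4 can be recomputed from the frequencies. The sketch below is illustrative only: the helper names are hypothetical, and the percentile rule used (the smallest value with at least the required proportion of observations at or below it) is an assumption, one of several definitions in use.

```python
import math
from statistics import mean, median

# Frequencies from Table 12.4: number of nodes -> number of women.
screening = {0: 291, 1: 43, 2: 16, 3: 20, 4: 13, 5: 3, 6: 1, 7: 4, 8: 3,
             9: 1, 10: 1, 11: 2, 12: 1, 15: 1, 16: 1, 17: 2, 18: 2,
             20: 1, 27: 1, 33: 1}
interval = {0: 66, 1: 22, 2: 7, 3: 7, 4: 2, 5: 4, 6: 4, 7: 3, 8: 3,
            9: 2, 10: 2, 12: 2, 13: 1, 15: 1, 16: 1, 20: 1}


def expand(freq):
    """Turn a frequency table into a sorted list of observations."""
    return sorted(v for v, f in freq.items() for _ in range(f))


def percentile(sorted_values, q):
    """Smallest value with at least a proportion q of observations at or below it."""
    return sorted_values[math.ceil(q * len(sorted_values)) - 1]


def prob_exceeds(freq_x, freq_y):
    """Estimate P(X > Y) + 0.5 P(X = Y), which equals U/(n1 n2) with ties counted as one half."""
    n_x, n_y = sum(freq_x.values()), sum(freq_y.values())
    total = 0.0
    for x, fx in freq_x.items():
        for y, fy in freq_y.items():
            if x > y:
                total += fx * fy
            elif x == y:
                total += 0.5 * fx * fy
    return total / (n_x * n_y)


s, i = expand(screening), expand(interval)
print(len(s), len(i))                             # sample sizes
print(round(mean(s), 2), round(mean(i), 2))       # means
print(median(s), median(i))                       # medians
print(percentile(s, 0.75), percentile(i, 0.75))   # 75th percentiles
print(round(prob_exceeds(interval, screening), 2))
```

Both medians are zero, yet the estimated probability that an interval cancer has more nodes than a screening cancer (counting ties as one half) is above one half, which is the kind of difference the rank test responds to.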

The reason for these two different views of the Mann–Whitney U test lies in the assumptions we make about the distributions in the two populations. If we make no assumptions, we can test the null hypothesis: that the probability that a member of the first population drawn at random will exceed a member of the second population drawn at random is one half. Some people choose to make an assumption about the distributions: that they have the same shape and differ only in location (mean or median). If this assumption is true, then if the distributions are different the medians must be different. The means must differ by the same amount. It is a very strong assumption. For example, if it is true then the variances must be the same in the two populations. For the reasons given in §10.5 and §7A, it is unlikely that we could get this if the distributions were not Normal. Under this assumption the Mann–Whitney U test will rarely be valid if the two sample t test is not valid also.

There are other non-parametric tests which test the same or similar null hypotheses. Two of these, the Wilcoxon two sample test and the Kendall Tau test, are different versions of the Mann–Whitney U test which were developed around the same time and later shown to be identical. These names are sometimes used interchangeably. The test statistics and tables are not the same, and the user must be very careful that the calculation of the test statistic being used corresponds to the table to which it is referred. Another difficulty with tables is that some are drawn so that for a significant difference U must be less than or equal to the tabulated value (as in Table 12.2), for others U must be strictly less than the tabulated value.

For more than two groups, the rank analogue of one-way analysis of variance (§10.9) is the Kruskal–Wallis test, see Conover (1980) and Siegel (1956). Conover (1980) also describes a multiple comparison test for the pairs of groups, similar to those described in §10.11.

12.3 * The Wilcoxon matched pairs test

This test is an analogue of the paired t test. We have a sample measured under two conditions and the null hypothesis is that there is no tendency for the outcome on one condition to be higher or lower than the other. The alternative hypothesis is that the outcome on one condition tends to be higher or lower than the other. As the test is based on the magnitude of the differences, the data must be interval.

Consider the data of Table 12.5, previously discussed in §2.6 and §9.2, where we used the sign test for the analysis. In the sign test, we have ignored the magnitude of differences, and only considered their signs. If we can use information about the magnitude, we would hope to have a more powerful test. Clearly, we must have interval data to do this. To avoid making assumptions about the distribution of the differences, we use their rank order in a similar manner to the Mann–Whitney U test.

Table 12.5. Results of a trial of pronethalol for the prevention of angina pectoris (Pritchard et al. 1963), in rank order of differences

                          Difference               Rank of difference
Placebo    Pronethalol    placebo – pronethalol    All     Positive   Negative
2          0              2                        1.5     1.5
17         15             2                        1.5     1.5
3          0              3                        3       3
7          2              5                        4       4
8          1              7                        6       6
14         7              7                        6       6
23         16             7                        6       6
34         25             9                        8       8
79         65             14                       9       9
60         41             19                       10      10
323        348            -25                      11                 11
71         29             42                       12      12
Sum of ranks                                               67         11

First, we rank the differences by their absolute values, i.e. ignoring the sign. As in §12.2, tied observations are given the average of their ranks. We now sum the ranks of the positive differences, 67, and the ranks of the negative differences, 11 (Table 12.5). If the null hypothesis were true and there was no difference, we would expect the rank sums for positive and negative differences to be about the same, equal to 39 (their average). The test statistic is the lesser of these sums, T. The smaller T is, the lower the probability of the data arising by chance.

The distribution of T when the null hypothesis is true can be found by enumerating all the possibilities, as described for the Mann–Whitney U statistic. Table 12.6 gives the 5% and 1% points for this distribution, for sample size n up to 25. For the example, n = 12 and so the difference would be significant at the 5% level if T were less than or equal to 14. We have T = 11, so the data are not consistent with the null hypothesis. The data support the view that there is a real tendency for patients to have fewer attacks while on the active treatment.

From Table 12.6, we can see that the probability that T ≤ 11 lies between 0.05 and 0.01. This is greater than the probability given by the sign test, which was 0.006 (§9.2). Usually we would expect greater power, and hence lower probabilities when the null hypothesis is false, when we use more of the information. In this case, the greater probability reflects the fact that the one negative difference, -25, is large. Examination of the original data shows that this individual had very large numbers of attacks on both treatments, and it seems possible that he may belong to a different population from the other eleven.

Like Table 12.2, Table 12.6 is based on the assumption that the differences can be fully ranked and there are no ties. Ties may occur in two ways in this test. Firstly, ties may occur in the ranking sense. In the example we had two differences of +2 and three of +7. These were ranked equally: 1.5 and 1.5, and 6, 6 and 6. When ties are present between negative and positive differences, Table 12.6 only approximates to the distribution of T.

Table 12.6. Two-sided 5% and 1% points of the distribution of T (lower value) in the Wilcoxon one-sample test

                 Probability that T ≤                     Probability that T ≤
Sample size n    the tabulated value     Sample size n    the tabulated value
                 5%      1%                               5%      1%
5                -       -               16               30      19
6                1       -               17               35      23
7                2       -               18               40      28
8                4       0               19               46      32
9                6       2               20               52      37
10               8       3               21               59      43
11               11      5               22               66      49
12               14      7               23               73      55
13               17      10              24               81      61
14               21      13              25               90      68
15               25      16

Ties may also occur between the paired observations, where the observed difference is zero. In the same way as for the sign test, we omit zero differences (§9.2). Table 12.6 is used with n as the number of non-zero differences only, not the total number of differences. This seems odd, in that a lot of zero differences would appear to support the null hypothesis. For example, if in Table 12.5 we had another dozen patients with zero differences, the calculation and conclusion would be the same. However, the mean difference would be smaller and the Wilcoxon test tells us nothing about the size of the difference, only its existence. This illustrates the danger of allowing significance tests to outweigh all other ways of looking at the data.

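For larger samples than Table 12.6 covers we can use a Normal approximation. If the null hypothesis is true, T has mean n(n + 1)/4 and standard deviation √(n(n + 1)(2n + 1)/24), so that

(T - n(n + 1)/4) / √(n(n + 1)(2n + 1)/24)

is from a Standard Normal distribution if the null hypothesis is true. For the example of Table 12.5, we have:

(11 - 12 × 13/4) / √(12 × 13 × 25/24) = (11 - 39)/√162.5 = -2.20

From Table 7.1 this gives a two-tailed probability of 0.028, similar to that obtained from Table 12.6.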
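The ranking of the differences and the large sample approximation for T can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the author's own calculation: the helper name `midranks` is hypothetical, zero differences are dropped as described above, and the Normal probability comes from the standard library rather than Table 7.1.

```python
import math

# Attacks on placebo and on pronethalol, from Table 12.5.
placebo     = [2, 17, 3, 7, 8, 14, 23, 34, 79, 60, 323, 71]
pronethalol = [0, 15, 0, 2, 1,  7, 16, 25, 65, 41, 348, 29]


def midranks(values):
    """Return the rank of each value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        average = (start + 1 + end + 1) / 2  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


# Zero differences are omitted; n counts the non-zero differences only.
differences = [a - b for a, b in zip(placebo, pronethalol) if a != b]
n = len(differences)

ranks = midranks([abs(d) for d in differences])
sum_positive = sum(r for r, d in zip(ranks, differences) if d > 0)
sum_negative = sum(r for r, d in zip(ranks, differences) if d < 0)
t = min(sum_positive, sum_negative)

# Normal approximation to the distribution of T under the null hypothesis.
z = (t - n * (n + 1) / 4) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(sum_positive, sum_negative, round(z, 2), round(p_two_sided, 3))
```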

We have three possible tests for paired data, the Wilcoxon, sign and paired t methods. If the differences are Normally distributed, the t test is the most powerful test. The Wilcoxon test is almost as powerful, however, and in practice the difference is not great except for small samples. Like the Mann–Whitney U test, the Wilcoxon is useless for very small samples. The sign test is similar in power to the Wilcoxon for very small samples, but as the sample size increases the Wilcoxon test becomes much more powerful. This might be expected since the Wilcoxon test uses more of the information. The Wilcoxon test uses the magnitude of the differences, and hence requires interval data. This means that, as for t methods, we will get different results if we transform the data. For truly ordinal data we should use the sign test. The paired t method also gives a confidence interval for the difference. The Wilcoxon test is purely a test of significance, but a confidence interval for the median difference can be found using the Binomial method described in §8.9.

12.4 * Spearman's rank correlation coefficient, ρ

We noted in Chapter 11 the sensitivity to assumptions of Normality of the product moment correlation coefficient, r. This led to the development of non-parametric approaches based on ranks. Spearman's approach was direct. First we rank the observations, then calculate the product moment correlation of the ranks, rather than of the observations themselves. The resulting statistic has a distribution which does not depend on the distribution of the original variables. It is usually denoted by the Greek letter ρ, pronounced ‘rho’, or by rs.

Table 12.7 shows data from a study of the geographical distribution of a tumour, Kaposi's sarcoma, in mainland Tanzania. The incidence rates were calculated from cancer registry data and there was considerable doubt that all cases were notified. The degree of reporting of cases may have been related to population density or availability of health services. In addition, incidence was closely related to age and sex (where recorded) and so could be related to the age and sex distribution in the region. To check that none of these were producing artefacts in the geographical distribution, I calculated the rank correlation of disease incidence with each of the possible explanatory variables. Table 12.7 shows the relationship of incidence to the percentage of the population living within 10 km of a health centre. Figure 12.1 shows the scatter diagram of these data. The percentage within 10 km of a health centre is very highly skewed, whereas the disease incidence appears somewhat bimodal. The assumptions of the product moment correlation do not appear to be met, so rank correlation was preferred.

Table 12.7. Incidence of Kaposi's sarcoma and access of population to health centres for each region of mainland Tanzania (Bland et al. 1977)

               Incidence      Percent population     Rank order
               per million    within 10 km of
Region         per year       health centre          Incidence   Population %
Coast          1.28           4.0                    1           3
Shinyanga      1.66           9.0                    2           7
Mbeya          2.06           6.7                    3           6
Tabora         2.37           1.8                    4           1
Arusha         2.46           13.7                   5           13
Dodoma         2.60           11.1                   6           10
Kigoma         4.22           9.2                    7           8
Mara           4.29           4.4                    8           4
Tanga          4.54           23.0                   9           16
Singida        6.17           10.8                   10          9
Morogoro       6.33           11.7                   11          11
Mtwara         6.40           14.8                   12          14
Westlake       6.60           12.5                   13          12
Kilimanjaro    6.65           57.3                   14          17
Ruvuma         7.21           6.6                    15          5
Iringa         8.46           2.6                    16          2
Mwanza         8.54           20.7                   17          15

Fig. 12.1. Incidence of Kaposi's sarcoma per million per year and percentage of population within 10 km of a health centre, for 17 regions of mainland Tanzania

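The calculation of Spearman's ρ proceeds as follows. The ranks for the two variables are found (Table 12.7). We apply the formula for the product moment correlation (§11.9) to these ranks. Writing ri and si for the ranks of the ith subject on the two variables, and r̄ and s̄ for their means, we define:

ρ = Σ(ri - r̄)(si - s̄) / √(Σ(ri - r̄)² Σ(si - s̄)²)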

Table 12.8. Two-sided 5% and 1% points of the distribution of Spearman's ρ

                 Probability that ρ is as far or further
Sample size n    from 0 than the tabulated value
                 5%        1%
4                -         -
5                1.00      -
6                0.89      1.00
7                0.82      0.96
8                0.79      0.93
9                0.70      0.83
10               0.68      0.81

We have ignored the problem of ties in the above. We treat observations with the same value as described in §12.2. We give them the average of the ranks they would have if they were separable and apply the rank correlation formula as described above. In this case the distribution of Table 12.8 is only approximate.

There are several ways of calculating this coefficient, resulting in formulae which appear quite different, though they give the same result (see Siegel 1956).
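As an illustration of two formulae that look different but agree, the sketch below computes ρ for the data of Table 12.7 in two ways: as the product moment correlation of the ranks, and by the widely quoted shortcut 1 - 6Σd²/(n(n² - 1)), where d is the difference between a subject's two ranks. The shortcut is not taken from the text above and holds only when there are no ties; the function names are hypothetical.

```python
import math

# Table 12.7: incidence per million per year and percentage of the
# population within 10 km of a health centre, for 17 regions.
incidence = [1.28, 1.66, 2.06, 2.37, 2.46, 2.60, 4.22, 4.29, 4.54,
             6.17, 6.33, 6.40, 6.60, 6.65, 7.21, 8.46, 8.54]
percent = [4.0, 9.0, 6.7, 1.8, 13.7, 11.1, 9.2, 4.4, 23.0,
           10.8, 11.7, 14.8, 12.5, 57.3, 6.6, 2.6, 20.7]


def ranks_no_ties(values):
    """Rank from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def product_moment(x, y):
    """Product moment correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = ranks_no_ties(incidence)
s = ranks_no_ties(percent)
n = len(r)

rho_from_ranks = product_moment(r, s)
sum_d_squared = sum((a - b) ** 2 for a, b in zip(r, s))
rho_from_differences = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(s)  # ranks of the percentage variable
print(round(rho_from_ranks, 3), round(rho_from_differences, 3))
```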

12.5 * Kendall's rank correlation coefficient, τ

Spearman's rank correlation is quite satisfactory for testing the null hypothesis of no relationship, but is difficult to interpret as a measurement of the strength of the relationship. Kendall developed a different rank correlation coefficient, Kendall's τ, which has some advantages over Spearman's. (The Greek letter τ is pronounced ‘tau’.) It is rather more tedious to calculate than Spearman's, but in the computer age this hardly matters. For each pair of subjects we observe whether the subjects are ordered in the same way by the two variables, a concordant pair, ordered in opposite ways, a discordant pair, or equal for one of the variables and so not ordered at all, a tied pair. Kendall's τ is the proportion of concordant pairs minus the proportion of discordant pairs. τ will be one if the rankings are identical, as all pairs will be ordered in the same way, and minus one if the rankings are exactly opposite, as all pairs will be ordered in the opposite way.

We shall denote the number of concordant pairs (ordered the same way) by nc, the number of discordant pairs (ordered in opposite ways) by nd, and the difference, nc - nd, by S. There are n(n - 1)/2 pairs altogether, so

τ = (nc - nd) / (n(n - 1)/2) = S / (n(n - 1)/2)

When there are no ties, nc + nd = n(n - 1)/2.

The simplest way to calculate nc is to order the observations by one of the variables, as in Table 12.7 which is ordered by disease incidence. Now consider the second ranking (% population within 10 km of a health centre). The first region, Coast, has 14 regions below it which have greater rank, so the pairs formed by the first region and these will be in the correct order. There are 2 regions below it which have lower rank, so the pairs formed by the first region and these will be in the opposite order. The second region, Shinyanga, has 10 regions below it with greater rank and so contributes 10 further pairs in the correct order. Note that the pair ‘Coast and Shinyanga’ has already been counted. There are 5 pairs in opposite order. The third region, Mbeya, has 10 regions below it in the same order and 4 in opposite orders, and so on. We add these numbers to get nc and nd:

nc = 14 + 10 + 10 + 13 + 4 + 6 + 7 + 8 + 1 + 5 + 4 + 2 + 2 + 0 + 1 + 1 + 0 = 88

nd = 2 + 5 + 4 + 0 + 8 + 5 + 3 + 1 + 7 + 2 + 2 + 3 + 2 + 3 + 1 + 0 + 0 = 48

The number of pairs is n(n - 1)/2 = 17 × 16/2 = 136. Because there are no ties, we could also calculate nd by nd = n(n - 1)/2 - nc = 136 - 88 = 48. S = nc - nd = 88 - 48 = 40. Hence τ = S/(n(n - 1)/2) = 40/136 = 0.29.
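The counting procedure just described is easily mechanized. The following sketch, with variable names of my own choosing, takes the two rank orders from Table 12.7 and counts concordant and discordant pairs directly; it assumes there are no ties.

```python
# Rank orders from Table 12.7; regions are listed in order of incidence.
incidence_rank = list(range(1, 18))
percent_rank = [3, 7, 6, 1, 13, 10, 8, 4, 16, 9, 11, 14, 12, 17, 5, 2, 15]

n = len(incidence_rank)
nc = 0  # concordant pairs: ordered the same way by both variables
nd = 0  # discordant pairs: ordered in opposite ways

for i in range(n):
    for j in range(i + 1, n):
        # The sign of this product shows whether the pair is ordered
        # the same way (positive) or in opposite ways (negative).
        product = ((incidence_rank[j] - incidence_rank[i])
                   * (percent_rank[j] - percent_rank[i]))
        if product > 0:
            nc += 1
        elif product < 0:
            nd += 1

pairs = n * (n - 1) // 2
s = nc - nd
tau = s / pairs

print(nc, nd, pairs, s, round(tau, 2))
```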

When there are ties, τ cannot be one. However, we could have perfect correlation if the ties were between the same subjects for both variables. To allow for this, we use a different version of τ, τb. Consider the denominator. There are n(n - 1)/2 possible pairs. If there are t individuals tied at a particular rank for variable X, no pairs from these t individuals contribute to S. There are t(t - 1)/2 such pairs. If we consider all the groups of tied individuals we have Σt(t - 1)/2 pairs which do not contribute to S, summing over all groups of tied ranks. Hence the total number of pairs which can contribute to S is n(n - 1)/2 - Σt(t - 1)/2, and S cannot be greater than n(n - 1)/2 - Σt(t - 1)/2. The size of S is also limited by ties in the second ranking. If we denote the number of individuals with the same value of Y by u, then the number of pairs which can contribute to S is n(n - 1)/2 - Σu(u - 1)/2. We now define τb by

τb = S / √((n(n - 1)/2 - Σt(t - 1)/2) × (n(n - 1)/2 - Σu(u - 1)/2))

Note that if there are no ties, Σt(t - 1)/2 = 0 = Σu(u - 1)/2. When the rankings are identical τb = 1, no matter how many ties there are. Kendall (1970) also discusses two other ways of dealing with ties, obtaining coefficients τa and τc, but their use is restricted.
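A minimal sketch of τb, assuming the definition in terms of S and the two tie-adjusted pair counts described above. The function name `kendall_tau_b` and the small data sets are invented for illustration only.

```python
import math
from collections import Counter


def kendall_tau_b(x, y):
    """Kendall's tau-b: S divided by the geometric mean of the numbers of
    pairs not tied on x and not tied on y."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[j] - x[i]) * (y[j] - y[i])
            if product > 0:
                s += 1      # concordant pair
            elif product < 0:
                s -= 1      # discordant pair; tied pairs contribute nothing
    total_pairs = n * (n - 1) / 2
    tied_x = sum(t * (t - 1) / 2 for t in Counter(x).values())
    tied_y = sum(u * (u - 1) / 2 for u in Counter(y).values())
    return s / math.sqrt((total_pairs - tied_x) * (total_pairs - tied_y))


# Identical rankings with a tie: tau-b is still one.
print(kendall_tau_b([1, 1, 2, 3], [1, 1, 2, 3]))

# No ties: tau-b reduces to S / (n(n - 1)/2).
print(kendall_tau_b([1, 2, 3], [1, 3, 2]))
```

The first call shows the property noted in the text: with the uncorrected denominator of 6 pairs the coefficient would be 5/6, whereas τb gives 1.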

We often want to test the null hypothesis that there is no relationship between the two variables in the population from which our sample was drawn. As usual, we are concerned with the probability of S being as or more extreme (i.e. far from zero) than the observed value. Table 12.9 was calculated in the same way as Tables 12.1 and 12.2. It shows the probability of being as extreme as the observed value of S for n up to 10. For convenience, S is tabulated rather than τ. When ties are present this is only an approximation.

When the sample size is greater than 10, S has an approximately Normal distribution under the null hypothesis, with mean zero. If there are no ties, the variance is

VAR(S) = n(n - 1)(2n + 5)/18

When there are ties, the variance formula is very complicated (Kendall 1970). I shall omit it, as in practice these calculations will be done using computers anyway. If there are not many ties it will not make much difference if the simple form is used.

For the example, S = 40, n = 17 and there are no ties, so the Standard Normal variate is S/√VAR(S) = 40/√(17 × 16 × 39/18).

From Table 7.1 of the Normal distribution we find that the two-sided probability of a value as extreme as this is 0.06 × 2 = 0.12, which is very similar to that found using Spearman's ρ. The product moment correlation, r, gives r = 0.30, P = 0.24, but of course the non-Normal distributions of the variables make this P invalid.

Why have two different rank correlation coefficients? Spearman's ρ is older than Kendall's τ, and can be thought of as a simple analogue of the product moment correlation coefficient, Pearson's r. τ is a part of a more general and consistent system of ranking methods, and has a direct interpretation, as the difference between the proportions of concordant and discordant pairs. In general, the numerical value of ρ is greater than that of τ. It is not possible to calculate τ from ρ or ρ from τ; they measure different sorts of correlation. ρ gives more weight to reversals of order when data are far apart in rank than when there is a reversal close together in rank; τ does not. However, in terms of tests of significance both have the same power to reject a false null hypothesis, so for this purpose it does not matter which is used.

Table 12.9. Two-sided 5% and 1% points of the distribution of S for Kendall's τ

                 Probability that S is as far or further
Sample size n    from the expected than the tabulated value
                 5%        1%
4                -         -
5                10        -
6                13        15
7                15        19
8                18        22
9                20        26
10               23        29

12.6 * Continuity corrections

In this chapter, when samples were large we have used a continuous distribution, the Normal, to approximate to a discrete distribution, U, T or S. For example, Figure 12.2 shows the distribution of the Mann–Whitney U statistic for n1 = 4, n2 = 4 (Table 12.1) with the corresponding Normal curve. From the exact distribution, the probability that U ≤ 2 is 0.014 + 0.014 + 0.029 = 0.057. The corresponding Standard Normal deviate is

(2 - 4 × 4/2) / √(4 × 4 × (4 + 4 + 1)/12) = (2 - 8)/√12 = -1.73

This has a probability of 0.048, interpolating in Table 7.1. This is smaller than the exact probability. The disparity arises because the continuous distribution gives probability to values other than the integers 0, 1, 2, etc. The estimated probability for U = 2 can be found by the area under the curve between U = 1.5 and U = 2.5. The corresponding Normal deviates are -1.876 and -1.588, which have probabilities from Table 7.1 of 0.030 and 0.056. This gives the estimated probability for U = 2 to be 0.056 - 0.030 = 0.026, which compares quite well with the exact figure of 0.029. Thus to estimate the probability that U ≤ 2, we estimate the area below U = 2.5, not below U = 2. This gives us a Standard Normal deviate of -1.588, as already noted, and hence a probability of 0.056. This corresponds remarkably well with the exact probability of 0.057, especially when we consider how small n1 and n2 are.

We will get a better approximation from our Standard Normal deviate if we make U closer to its expected value by 1/2. In general, we get a better fit if we make the observed value of the statistic closer to its expected value by half of the interval between adjacent discrete values. This is a continuity correction.
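The effect of the correction for n1 = 4, n2 = 4 can be checked by brute force. The sketch below is my own illustration, not the method used to build Table 12.1: it enumerates every way of choosing which four of the eight ranks belong to the first group, tabulates U, and compares the exact probabilities with the continuity-corrected Normal areas. The Normal probabilities come from the standard library's error function rather than Table 7.1.

```python
import math
from collections import Counter
from itertools import combinations

n1, n2 = 4, 4

# Exact null distribution of U: every choice of n1 ranks is equally likely.
counts = Counter()
for ranks in combinations(range(1, n1 + n2 + 1), n1):
    u = sum(ranks) - n1 * (n1 + 1) // 2
    counts[u] += 1
arrangements = sum(counts.values())

exact_u_equals_2 = counts[2] / arrangements
exact_u_at_most_2 = sum(counts[u] for u in range(0, 3)) / arrangements

mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)


def normal_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Continuity correction: use the area below U = 2.5 for P(U <= 2),
# and the area between 1.5 and 2.5 for P(U = 2).
z_low = (1.5 - mean_u) / sd_u
z_high = (2.5 - mean_u) / sd_u
approx_u_at_most_2 = normal_cdf(z_high)
approx_u_equals_2 = normal_cdf(z_high) - normal_cdf(z_low)

print(arrangements, counts[0], counts[1], counts[2])
print(round(exact_u_at_most_2, 3), round(approx_u_at_most_2, 3))
print(round(exact_u_equals_2, 3), round(approx_u_equals_2, 3))
```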

Fig. 12.2. Distribution of the Mann-Whitney U statistic, n1 = 4, n2 = 4, when the null hypothesis is true, with the corresponding Normal distribution and area estimating PROB(U = 2)

For S, the interval between adjacent values is 2, not 1, for S = nc - nd = 2nc - n(n - 1)/2, and nc is an integer. A change of one unit in nc produces a change of two units in S. The continuity correction is therefore half of 2, which is 1. We make S closer to the expected value of 0 by 1 before applying the Normal approximation. For the Kaposi's sarcoma data, we had S = 40, with n = 17. Using the continuity correction gives the Standard Normal deviate (S - 1)/√(n(n - 1)(2n + 5)/18) = (40 - 1)/√(17 × 16 × 39/18).

This gives a two-sided probability of 0.066 × 2 = 0.13, slightly greater than the uncorrected value of 0.12.

Continuity corrections are important for small samples; for large samples they are negligible. We shall meet another in Chapter 13.

12.7 * Parametric or non-parametric methods?

For many statistical problems there are several possible solutions, just as for many diseases there are several treatments, similar perhaps in their overall efficacy but displaying variation in their side effects, in their interactions with other diseases or treatments and in their suitability for different types of patients. There is often no one right treatment, but rather treatment is decided on the prescriber's judgement of these effects, past experience and plain prejudice. Many problems in statistical analysis are like this. In comparing the means of two small groups, for instance, we could use a t test, a t test with a transformation, a Mann-Whitney U test, or one of several others. Our choice of method depends on the plausibility of Normal assumptions, the importance of obtaining a confidence interval, the ease of calculation, and so on. It depends on plain prejudice, too. Some users of statistical methods are very concerned about the implications of Normal assumptions and will advocate non-parametric methods wherever possible, while others are too careless of the errors that may be introduced when assumptions are not met.

I sometimes meet people who tell me that they have used non-parametric methods throughout their analysis as if this is some kind of badge of statistical purity. It is nothing of the kind. It may mean that their significance tests have less power than they might have, and that results are left as ‘not significant’ when, for example, a confidence interval for a difference might be more informative.

On the other hand, such methods are very useful when the necessary assumptions of the t distribution method cannot be made, and it would be equally wrong to eschew their use. Rather, we should choose the method most suited to the problem, bearing in mind both the assumptions we are making and what we really want to know. We shall say more about what method to use when in Chapter 14.

There is a common misconception that when the number of observations is very small, usually said to be less than six, Normal distribution methods such as t tests and regression must not be used and that rank methods should be used instead. I have never seen any argument put forward in support of this, but inspection of Tables 12.2, 12.6, 12.8, and 12.9 will show that it is nonsense. For such small samples rank tests cannot produce any significance at the usual 5% level. Should one need statistical analysis of such small samples, Normal methods are required.

12M * Multiple choice questions 62 to 66

(Each branch is either true or false)

62. For comparing the responses to a new treatment of a group of patients with the responses of a control group to a standard treatment, possible approaches include:

(a) the two-sample t method;
(b) the sign test;
(c) the Mann-Whitney U test;
(d) the Wilcoxon matched pairs test;
(e) rank correlation between responses to the treatments.

63. Suitable methods for truly ordinal data include:

(a) the sign test;
(b) the Mann-Whitney U test;
(c) the Wilcoxon matched pairs test;
(d) the two sample t method;
(e) Kendall's rank correlation coefficient.

64. Kendall's rank correlation coefficient between two variables:

(a) depends on which variable is regarded as the predictor;
(b) is zero when there is no relationship;
(c) cannot have a valid significance test when there are tied observations;
(d) must lie between -1 and +1;
(e) is not affected by a log transformation of the variables.

65. Tests of significance based on ranks:

(a) are always to be preferred to methods which assume the data to be Normally distributed;
(b) are less powerful than methods based on the Normal distribution when data are Normally distributed;
(c) enable confidence intervals to be estimated easily;
(d) require no assumptions about the data;
(e) are often to be preferred when data cannot be assumed to follow any particular distribution.

66. Ten men with angina were given an active drug and a placebo on alternate days in random order. Patients were tested using the time in minutes for which they could exercise until angina or fatigue stopped them. The existence of an active drug effect could be examined by:

(a) paired t test;
(b) Mann-Whitney U test;
(c) sign test;
(d) Wilcoxon matched pairs test;
(e) Spearman's ρ.

12E * Exercise: Application of rank methods

In this exercise we shall analyse the respiratory compliance data of §10E using non-parametric methods.

1. For the data of Table 10.19, use the sign test to test the null hypothesis that changing the waveform has no effect on static compliance.

2. Test the same null hypothesis using a test based on ranks.

3. Repeat step 1 using log transformed compliance. Does the transformation make any difference?

4. Repeat step 2 using log compliance. Why do you get a different answer?

5. What do you conclude about the effect of waveform from the non-parametric tests?

6. How do the conclusions of the parametric and non-parametric approaches differ?




13

The analysis of cross-tabulations

13.1 The chi-squared test for association

Table 13.1 shows for a sample of mothers the relationship between housing tenure and whether they had a preterm delivery. This kind of cross-tabulation of frequencies is also called a contingency table or cross-classification. Each entry in the table is a frequency, the number of individuals having these characteristics (§4.1). It can be quite difficult to measure the strength of the association between two qualitative variables like these, but it is easy to test the null hypothesis that there is no relationship or association between the two variables. If the sample is large, we do this by a chi-squared test.

The chi-squared test for association in a contingency table works like this. The null hypothesis is that there is no association between the two variables, the alternative being that there is an association of any kind. We find for each cell of the table the frequency which we would expect if the null hypothesis were true. To do this we use the row and column totals, so we are finding the expected frequencies for tables with these totals, called the marginal totals.

There are 1443 women, of whom 899 were owner occupiers, a proportion 899/1443. If there were no relationship between time of delivery and housing tenure, we would expect each column of the table to have the same proportion, 899/1443, of its members in the first row. Thus the 99 patients in the first column would be expected to have 99 × 899/1443 = 61.7 in the first row. By ‘expected’ we mean the average frequency we would get in the long run. We could not actually observe 61.7 subjects. The 1344 patients in the second column would be expected to have 1344 × 899/1443 = 837.3 in the first row. The sum of these two expected frequencies is 899, the row total. Similarly, there are 258 patients in the second row and so we would expect 99 × 258/1443 = 17.7 in the second row, first column and 1344 × 258/1443 = 240.3 in the second row, second column. We calculate the expected frequency for each row and column combination, or cell. The 10 cells of Table 13.1 give us the expected frequencies shown in Table 13.2. Notice that the row and column totals are the same as in Table 13.1. In general, the expected frequency for a cell of the contingency table is found by

expected frequency = row total × column total / grand total

It does not matter which variable is the row and which the column.
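The rule for expected frequencies is simple enough to apply by machine. The sketch below, with variable names of my own choosing, applies it to the observed frequencies of Table 13.1 and should reproduce the values of Table 13.2 to one decimal place.

```python
# Observed frequencies from Table 13.1: (preterm, term) for each tenure group.
tenure = ["Owner-occupier", "Council tenant", "Private tenant",
          "Lives with parents", "Other"]
observed = [(50, 849), (29, 229), (11, 164), (6, 66), (3, 36)]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total.
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

for name, row in zip(tenure, expected):
    print(f"{name:20s}{row[0]:8.1f}{row[1]:8.1f}")
```

Each row of expected frequencies adds up to the observed row total, and each column to the observed column total, as the text notes.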

Table 13.1. Contingency table showing time of delivery by housing tenure

Housing tenure        Preterm    Term    Total
Owner–occupier        50         849     899
Council tenant        29         229     258
Private tenant        11         164     175
Lives with parents    6          66      72
Other                 3          36      39
Total                 99         1344    1443

We now compare the observed and expected frequencies. If the two variables are not associated, the observed and expected frequencies should be close together, any discrepancy being due to random variation. We need a test statistic which measures this. The differences between observed and expected frequencies are a good place to start. We cannot simply sum them as the sum would be zero, both observed and expected frequencies having the same grand total, 1443. We can resolve this as we resolved a similar problem with differences from the mean (§4.7), by squaring the differences. The size of the difference will also depend in some way on the number of patients. When the row and column totals are small, the difference between observed and expected is forced to be small. It turns out, for reasons discussed in §13A, that the best statistic is

Σ (observed frequency - expected frequency)² / expected frequency, summed over all cells

This is often written as

Σ(O - E)²/E

For Table 13.1 this is

(50 - 61.7)²/61.7 + (849 - 837.3)²/837.3 + … + (36 - 36.3)²/36.3 = 10.5

As will be explained in §13A, the distribution of this test statistic when the null hypothesis is true and the sample is large enough is the Chi-squared distribution (§7A) with (r - 1)(c - 1) degrees of freedom, where r is the number of rows and c is the number of columns. I shall discuss what is meant by ‘large enough’ in §13.3. We are treating the row and column totals as fixed and only considering the distribution of tables with these totals. The test is said to be conditional on these totals. We can prove that we lose very little information by doing this and we get a simple test.

Table 13.2. Expected frequencies under the null hypothesis for Table 13.1

Housing tenure        Preterm    Term     Total
Owner–occupier        61.7       837.3    899
Council tenant        17.7       240.3    258
Private tenant        12.0       163.0    175
Lives with parents    4.9        67.1     72
Other                 2.7        36.3     39
Total                 99         1344     1443

Fig. 13.1. Percentage point of the Chi-squared distribution

For Table 13.1 we have (5 - 1) × (2 - 1) = 4 degrees of freedom. Table 13.3 shows some percentage points of the Chi-squared distribution for selected degrees of freedom. These are the upper percentage points, as shown in Figure 13.1. We see that for 4 degrees of freedom the 5% point is 9.49 and 1% point is 13.28, so our observed value of 10.5 has probability between 1% and 5%, or 0.01 and 0.05. If we use a computer program which prints out the actual probability, we find P = 0.03. The data are not consistent with the null hypothesis and we can conclude that there is good evidence of a relationship between housing tenure and time of delivery.

The chi-squared statistic is not an index of the strength of the association. If we double the frequencies in Table 13.1, this will double chi-squared, but the strength of the association is unchanged. Note that we can only use the chi-squared test when the numbers in the cells are frequencies, not when they are percentages, proportions or measurements.
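The whole calculation for Table 13.1, including the probability and the effect of doubling the frequencies, can be sketched as follows. The function names are hypothetical. The upper tail probability uses a closed form for the Chi-squared distribution which is valid only when the number of degrees of freedom is even, as it is here; this is a convenience for the illustration, not a general method.

```python
import math


def chi_squared_statistic(observed):
    """Sum of (O - E)^2 / E over all cells of a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


def upper_tail_even_df(x, df):
    """P(Chi-squared > x) for an even number of degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("this closed form needs even degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Table 13.1: (preterm, term) for each housing tenure group.
table = [(50, 849), (29, 229), (11, 164), (6, 66), (3, 36)]
df = (len(table) - 1) * (len(table[0]) - 1)

chi_squared = chi_squared_statistic(table)
p_value = upper_tail_even_df(chi_squared, df)

# Doubling every frequency doubles the statistic.
doubled = [tuple(2 * o for o in row) for row in table]
chi_squared_doubled = chi_squared_statistic(doubled)

print(df, round(chi_squared, 1), round(p_value, 2), round(chi_squared_doubled, 1))
```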

Table 13.3. Percentage points of the Chi-squared distribution

Degrees of    Probability that the tabulated value is exceeded (Figure 13.1)
freedom       10%       5%        1%        0.1%
1             2.71      3.84      6.63      10.83
2             4.61      5.99      9.21      13.82
3             6.25      7.81      11.34     16.27
4             7.78      9.49      13.28     18.47
5             9.24      11.07     15.09     20.52
6             10.64     12.59     16.81     22.46
7             12.02     14.07     18.48     24.32
8             13.36     15.51     20.09     26.13
9             14.68     16.92     21.67     27.88
10            15.99     18.31     23.21     29.59
11            17.28     19.68     24.73     31.26
12            18.55     21.03     26.22     32.91
13            19.81     22.36     27.69     34.53
14            21.06     23.68     29.14     36.12
15            22.31     25.00     30.58     37.70
16            23.54     26.30     32.00     39.25
17            24.77     27.59     33.41     40.79
18            25.99     28.87     34.81     42.31
19            27.20     30.14     36.19     43.82
20            28.41     31.41     37.57     45.32

Table 13.4. Cough during the day or at night at age 14 for children with and without a history of bronchitis before age 5 (Holland et al. 1978)

            Bronchitis    No bronchitis    Total
Cough       26            44               70
No cough    247           1002             1249
Total       273           1046             1319

13.2 Tests for 2 by 2 tables

Consider the data on cough symptom and history of bronchitis discussed in §9.8. We had 273 children with a history of bronchitis of whom 26 were reported to have day or night cough, and 1046 children without history of bronchitis, of whom 44 were reported to have day or night cough. We can set these data out as a contingency table, as in Table 13.4. We can also use the chi-squared test to test the null hypothesis of no association between cough and history. The expected values are shown in Table 13.5. The test statistic is

(26 - 14.49)²/14.49 + (44 - 55.51)²/55.51 + (247 - 258.51)²/258.51 + (1002 - 990.49)²/990.49 = 12.2

We have r = 2 rows and c = 2 columns, so there are (r - 1)(c - 1) = (2 - 1) × (2 - 1) = 1 degree of freedom. We see from Table 13.3 that the 5% point is 3.84, and the 1% point is 6.63, so we have observed something very unlikely if the null hypothesis were true. Hence we reject the null hypothesis of no association and conclude that there is a relationship between present cough and history of bronchitis.

Table 13.5. Expected frequencies for Table 13.4

            Bronchitis    No bronchitis    Total
Cough       14.49         55.51            70.00
No cough    258.51        990.49           1249.00
Total       273.00        1046.00          1319.00

Now the null hypothesis ‘no association between cough and bronchitis’ is the same as the null hypothesis ‘no difference between the proportions with cough in the bronchitis and no bronchitis groups’. If there were a difference, the variables would be associated. Thus we have tested the same null hypothesis in two different ways. In fact these tests are exactly equivalent. If we take the Normal deviate from §9.8, which was 3.49, and square it, we get 12.2, the chi-squared value. The method of §9.8 and §8.6 has the advantage that it can also give us a confidence interval for the size of the difference, which the chi-squared method does not. Note that the chi-squared test corresponds to the two-sided z test, even though only the upper tail of the chi-squared distribution is used.
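The equivalence of the two tests can be checked numerically. The sketch below assumes that the Normal test of §9.8 uses the pooled proportion in the standard error of the difference, which is the form for which the identity z² = chi-squared holds exactly; the variable names are mine.

```python
import math

# Table 13.4: rows are cough / no cough, columns are bronchitis / no bronchitis.
cough = (26, 44)
no_cough = (247, 1002)

n_bronchitis = cough[0] + no_cough[0]
n_no_bronchitis = cough[1] + no_cough[1]
total = n_bronchitis + n_no_bronchitis

# Chi-squared statistic from observed and expected frequencies.
observed = [cough, no_cough]
row_totals = [sum(row) for row in observed]
column_totals = [n_bronchitis, n_no_bronchitis]
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * column_totals[j] / total
        chi_squared += (o - e) ** 2 / e

# Large sample Normal test for the difference between two proportions,
# with the pooled proportion used in the standard error (an assumption
# about the form of the test in section 9.8).
p1 = cough[0] / n_bronchitis
p2 = cough[1] / n_no_bronchitis
pooled = (cough[0] + cough[1]) / total
se = math.sqrt(pooled * (1 - pooled) * (1 / n_bronchitis + 1 / n_no_bronchitis))
z = (p1 - p2) / se

# Upper tail of Chi-squared with 1 d.f. equals the two-sided Normal probability.
p_from_chi_squared = math.erfc(math.sqrt(chi_squared / 2))
p_from_z = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 2), round(z * z, 1), round(chi_squared, 1))
```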

13.3 The chi-squared test for small samples

When the null hypothesis is true, the test statistic Σ(O - E)²/E, which we can call the chi-squared statistic, follows the Chi-squared distribution provided the expected values are large enough. This is a large sample test, like those of §9.7 and §9.8. The smaller the expected values become, the more dubious will be the test.

The conventional criterion for the test to be valid is usually attributed to the great statistician W. G. Cochran. The rule is this: the chi-squared test is valid if at least 80% of the expected frequencies exceed 5 and all the expected frequencies exceed 1. We can see that Table 13.2 satisfies this requirement, since only 2 out of 10 expected frequencies, 20%, are less than 5 and none is less than 1. Note that this condition applies to the expected frequencies, not the observed frequencies. It is quite acceptable for an observed frequency to be 0, provided the expected frequencies meet the criterion.
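Cochran's rule as stated above is easy to encode. The function below is a sketch with a hypothetical name, `cochran_rule_satisfied`; it is applied to the expected frequencies of Table 13.2 and to those given for the streptomycin subgroup in Table 13.6, before and after the ‘deterioration’ and ‘death’ rows are added together.

```python
def cochran_rule_satisfied(expected):
    """True if at least 80% of expected frequencies exceed 5 and all exceed 1."""
    cells = [e for row in expected for e in row]
    over_five = sum(1 for e in cells if e > 5)
    # Integer comparison avoids rounding trouble at exactly 80%.
    return 5 * over_five >= 4 * len(cells) and all(e > 1 for e in cells)


# Expected frequencies from Table 13.2.
table_13_2 = [[61.7, 837.3], [17.7, 240.3], [12.0, 163.0],
              [4.9, 67.1], [2.7, 36.3]]

# Expected frequencies from Table 13.6 (streptomycin, control).
table_13_6 = [[8.4, 9.6], [4.2, 4.8], [2.3, 2.7]]

# Combining the last two rows adds their expected frequencies.
combined = [table_13_6[0],
            [a + b for a, b in zip(table_13_6[1], table_13_6[2])]]

print(cochran_rule_satisfied(table_13_2))
print(cochran_rule_satisfied(table_13_6))
print(cochran_rule_satisfied(combined))
```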

This criterion is open to question. Simulation studies appear to suggest that the condition may be too conservative and that the chi-squared approximation works for smaller expected values, especially for larger numbers of rows and columns. At the time of writing the analysis of tables based on small sample sizes, particularly 2 by 2 tables, is the subject of hot dispute among statisticians. As yet, no-one has succeeded in devising a better rule than Cochran's, so I would recommend keeping to it until the theoretical questions are resolved. Any chi-squared test which does not satisfy the criterion is always open to the charge that its validity is in doubt.

Table 13.6. Observed and expected frequencies of categories of radiological appearance at six months as compared with appearance on admission in the MRC streptomycin trial, patients with an initial temperature of 100–100.9°F

Radiological         Streptomycin            Control
assessment        Observed  Expected    Observed  Expected
Improvement           13       8.4           5       9.6
Deterioration          2       4.2           7       4.8
Death                  0       2.3           5       2.7
Total                 15      15            17      17

If the criterion is not satisfied we can usually combine or delete rows and columns to give bigger expected values. Of course, this cannot be done for 2 by 2 tables, which we consider in more detail below. For example, Table 13.6 shows data from the MRC streptomycin trial (§2.2), the results of radiological assessment for a subgroup of patients defined by a prognostic variable. We want to know whether there is evidence of a streptomycin effect within this subgroup, so we want to test the null hypothesis of no effect using a chi-squared test. There are 4 out of 6 expected values less than 5, so the test on this table would not be valid. We can combine the rows so as to raise the expected values. Since the small expected frequencies are in the ‘deterioration’ and ‘death’ rows, it makes sense to combine these to give a ‘deterioration or death’ row. The expected values are then all greater than 5 and we can do the chi-squared test with 1 degree of freedom. This editing must be done with regard to the meaning of the various categories. In Table 13.6, there would be no point in combining rows 1 and 3 to give a new category of ‘considerable improvement or death’ to be compared to the remainder, as the comparison would be absurd. The new table is shown in Table 13.7. We have

$$X^2 = \frac{(13-8.4)^2}{8.4} + \frac{(5-9.6)^2}{9.6} + \frac{(2-6.6)^2}{6.6} + \frac{(12-7.4)^2}{7.4} = 10.8$$

Under the null hypothesis this is from a Chi-squared distribution with one degree of freedom, and from Table 13.3 we can see that the probability of getting a value as extreme as 10.8 is less than 1%. We have data inconsistent with the null hypothesis and we can conclude that the evidence suggests a treatment effect in this subgroup.
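The check of Cochran's rule and the combination of rows can be sketched in a few lines of Python. This is an illustration only: the function names are mine, and the expected frequencies are computed without rounding, so the statistic comes out as 10.6 and not the 10.8 obtained from expected values rounded to one decimal place.

```python
def expected_frequencies(table):
    """Expected frequency for each cell: row total x column total / n."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def meets_cochran_rule(expected):
    """At least 80% of expected frequencies exceed 5 and all exceed 1."""
    cells = [e for row in expected for e in row]
    share_above_5 = sum(e > 5 for e in cells) / len(cells)
    return share_above_5 >= 0.8 and all(e > 1 for e in cells)

def chi_squared(table):
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Observed frequencies of Table 13.6
# (rows: improvement, deterioration, death; columns: streptomycin, control).
table_13_6 = [[13, 5], [2, 7], [0, 5]]

# Combine 'deterioration' and 'death' into one row, as in Table 13.7.
combined = [table_13_6[0],
            [a + b for a, b in zip(table_13_6[1], table_13_6[2])]]

print(meets_cochran_rule(expected_frequencies(table_13_6)))  # rule fails
print(meets_cochran_rule(expected_frequencies(combined)))    # rule satisfied
print(round(chi_squared(combined), 1))
```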

If the table does not meet the criterion even after reduction to a 2 by 2 table, we can apply either a continuity correction to improve the approximation to the Chi-squared distribution (§13.5), or an exact test based on a discrete distribution (§13.4).

Table 13.7. Reduction of Table 13.6 to a 2 by 2 table

Radiological                Streptomycin            Control
assessment               Observed  Expected    Observed  Expected
Improvement                  13       8.4           5       9.6
Deterioration or death        2       6.6          12       7.4
Total                        15      15.0          17      17.0

Table 13.8. Artificial data to illustrate Fisher's exact test

               Survived   Died   Total
Treatment A        3        1      4
Treatment B        2        2      4
Total              5        3      8

13.4 Fisher's exact test

The chi-squared test described in §13.1 is a large sample test. When the sample is not large and expected values are less than 5, we can turn to an exact distribution like that for the Mann–Whitney U statistic (§12.2). This method is called Fisher's exact test.

The exact probability distribution for the table can only be found when the row and column totals are given. Just as with the large sample chi-squared test, we restrict our attention to tables with these totals. This difficulty has led to much controversy about the use of this test. I shall show how the test works, then discuss its applicability.

Consider the following artificial example. In an experiment, we randomly allocate 4 patients to treatment A and 4 to treatment B, and get the outcome shown in Table 13.8. We want to know the probability of so large a difference in mortality between the two groups if the treatments have the same effect (the null hypothesis). We could have randomized the subjects into two groups in many ways, but if the null hypothesis is true the same three would have died. The row and column totals would therefore be the same for all these possible allocations. If we keep the row and column totals constant, there are only 4 possible tables, shown in Table 13.9. These tables are found by putting the values 0, 1, 2, 3 in the ‘Died in group A’ cell. Any other values would make the D total greater than 3.

Now, let us label our subjects a to h. The survivors we will call a to e, and the deaths f to h. How many ways can these patients be arranged in two groups of 4 to give tables i, ii, iii and iv? Table i can arise in 5 ways. Patients f, g, and h would have to be in group B, to give 3 deaths, and the remaining member of B could be a, b, c, d or e. Table ii can arise in 30 ways. The 3 survivors in group A can be abc, abd, abe, acd, ace, ade, bcd, bce, bde, cde, 10 ways. The death in A can be f, g or h, 3 ways. Hence the group can be made up in 10 × 3 = 30 ways. Table iii is the same as table ii, with A and B reversed, so arises in 30 ways. Table iv is the same as table i with A and B reversed, so arises in 5 ways.

Hence we can arrange the 8 patients into 2 groups of 4 in 5 + 30 + 30 + 5 = 70 ways. Now, the probability of any one arrangement arising by chance is 1/70, since they are all equally likely if the null hypothesis is true. Table i arises from 5 of the 70 arrangements, so has probability 5/70 = 0.071. Table ii arises from 30 out of 70 arrangements, so has probability 30/70 = 0.429. Similarly, Table iii has probability 30/70 = 0.429, and Table iv has probability 5/70 = 0.071.
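The counting argument can be confirmed by brute force. The following Python sketch, which is an illustration and not part of the original text, lists every way of choosing the 4 members of group A from patients a to h and counts how many deaths (f, g, h) fall in group A each time.

```python
from itertools import combinations
from collections import Counter

survivors = "abcde"
deaths = "fgh"
patients = survivors + deaths

# Count allocations by the number of deaths placed in group A.
# 0 deaths in A is Table i, 1 is Table ii, 2 is Table iii, 3 is Table iv.
counts = Counter()
for group_a in combinations(patients, 4):
    deaths_in_a = sum(p in deaths for p in group_a)
    counts[deaths_in_a] += 1

total = sum(counts.values())
probabilities = {d: counts[d] / total for d in sorted(counts)}

print(total)
print(dict(sorted(counts.items())))
print({d: round(p, 3) for d, p in probabilities.items()})
```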

Table 13.9. Possible tables for the totals of Table 13.8

i.
      S   D   T
A     4   0   4
B     1   3   4
T     5   3   8

ii.
      S   D   T
A     3   1   4
B     2   2   4
T     5   3   8

iii.
      S   D   T
A     2   2   4
B     3   1   4
T     5   3   8

iv.
      S   D   T
A     1   3   4
B     4   0   4
T     5   3   8

Hence, under the null hypothesis that there is no association between treatment and survival, Table ii, which we observed, has a probability of 0.429. It could easily have arisen by chance and so it is consistent with the null hypothesis. As in §9.2, we must also consider tables more extreme than the observed. In this case, there is one more extreme table in the direction of the observed difference, Table i. In the direction of the observed difference, the probability of the observed table or a more extreme one is 0.071 + 0.429 = 0.5. This is the P value for a one-sided test (§9.5).

Fisher's exact test is essentially one sided. It is not clear what the corresponding deviations in the other direction would be, especially when all the marginal totals are different. This is because in that case the distribution is asymmetrical, unlike those of §12.2–5. One solution is to double the one-sided probability to get a two-sided test when this is required. I follow Armitage and Berry (1994) in preferring this option. Another solution is to calculate probabilities for every possible table and sum all probabilities less than or equal to the probability for the observed table to give the P value. This may give a smaller P value than the doubling method.

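There is no need to enumerate all the possible tables, as above. The probability can be found from a simple formula (§13B). The probability of observing a set of frequencies f11, f12, f21, f22, when the row and column totals are r1, r2, c1, and c2 and the grand total is n, is

$$\frac{r_1!\,r_2!\,c_1!\,c_2!}{n!\,f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$

(See §6A for the meaning of n!.) We can calculate this for each possible table and so find the probability for the observed table and each more extreme one. For the example:

$$\text{Table i:}\quad \frac{4!\,4!\,5!\,3!}{8!\,4!\,0!\,1!\,3!} = 0.071$$

$$\text{Table ii:}\quad \frac{4!\,4!\,5!\,3!}{8!\,3!\,1!\,2!\,2!} = 0.429$$

giving a total of 0.50 as before.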
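As a check on the arithmetic, the factorial formula of §13B can be evaluated directly. This is a minimal Python sketch; the function name `table_probability` is my own and is not taken from the text.

```python
from math import factorial

def table_probability(f11, f12, f21, f22):
    """Probability of a 2 by 2 table with the given cell frequencies,
    conditional on its row and column totals."""
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    numerator = factorial(r1) * factorial(r2) * factorial(c1) * factorial(c2)
    denominator = (factorial(n) * factorial(f11) * factorial(f12)
                   * factorial(f21) * factorial(f22))
    return numerator / denominator

p_table_i = table_probability(4, 0, 1, 3)   # Table i of Table 13.9
p_table_ii = table_probability(3, 1, 2, 2)  # Table ii, the observed table
one_sided_p = p_table_i + p_table_ii

print(round(p_table_i, 3), round(p_table_ii, 3), round(one_sided_p, 2))
```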

Unlike the exact distributions for the rank statistics, this distribution is fairly easy to calculate but difficult to tabulate. A good table of this distribution required a small book (Finney et al. 1963).

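We can apply this test to Table 13.7. The 2 by 2 tables to be tested and their probabilities are:

     13    5             14    4             15    3
      2   12              1   13              0   14

  P = 0.0013782      P = 0.0000757      P = 0.0000014

The total one-sided probability is 0.0014553, which doubled for a two-sided test gives 0.0029. The method using all smaller probabilities gives P = 0.00159. Either is larger than the probability for the X² value of 10.6, which is 0.0011. (The value 10.6 is the chi-squared statistic for Table 13.7 calculated from unrounded expected frequencies; the 10.8 of §13.3 used expected values rounded to one decimal place.)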
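Both two-sided conventions can be reproduced with a short program. The sketch below is an illustration under my own naming; it uses binomial coefficients, which give the same probabilities as the factorial formula, and a small relative tolerance when comparing probabilities so that ties are not lost to floating point error.

```python
from math import comb

def fisher_exact(f11, f12, f21, f22):
    """Return (one-sided P, doubled P, P from summing all tables whose
    probability is no greater than that of the observed table)."""
    r1, r2 = f11 + f12, f21 + f22
    c1 = f11 + f21
    n = r1 + r2

    def prob(k):
        # Probability that the top left cell equals k, totals fixed.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    possible = range(max(0, c1 - r2), min(r1, c1) + 1)
    probs = {k: prob(k) for k in possible}
    observed = probs[f11]
    upper = sum(p for k, p in probs.items() if k >= f11)
    lower = sum(p for k, p in probs.items() if k <= f11)
    one_sided = min(upper, lower)
    doubled = min(1.0, 2 * one_sided)
    small_p = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return one_sided, doubled, small_p

# Table 13.7: improvement 13 and 5, deterioration or death 2 and 12.
one_sided, doubled, small_p = fisher_exact(13, 5, 2, 12)
print(round(one_sided, 7), round(doubled, 4), round(small_p, 5))
```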

Fisher's exact test was originally devised for the 2 × 2 table and only used when the expected frequencies were small. This was because for larger numbers and larger tables the calculations were impractical. With computers things have changed, and Fisher's exact test can be done for any 2 × 2 table. Some programs will also calculate Fisher's exact test for larger tables. As the number of rows and columns increases, the number of possible tables increases very rapidly and it becomes impracticable to calculate and store the probability for each one. There are specialist programs such as StatXact which create a random sample of the possible tables and use them to estimate a distribution of probabilities whose tail area is then found. Methods which sample the possibilities in this way are (rather endearingly) called Monte Carlo methods.
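The Monte Carlo idea can be illustrated on the small example of Table 13.8, where the exact answer (0.5) is already known. The sketch below is only a demonstration of the principle and makes no claim to reproduce the algorithm of any specialist program; the function name, the number of samples and the seed are arbitrary choices of mine.

```python
import random

def monte_carlo_one_sided_p(f11, f12, f21, f22, samples=20000, seed=1):
    """Estimate the one-sided Fisher P value by random re-allocation.

    Rows are treatments A and B; columns are survived and died.
    The outcomes are held fixed and shuffled between the two groups,
    which keeps the row and column totals constant.
    """
    rng = random.Random(seed)
    n_a = f11 + f12
    outcomes = [0] * (f11 + f21) + [1] * (f12 + f22)  # 0 survived, 1 died
    observed_deaths_a = f12
    hits = 0
    for _ in range(samples):
        rng.shuffle(outcomes)
        deaths_a = sum(outcomes[:n_a])  # first n_a subjects form group A
        # As or more extreme in the observed direction: as few or
        # fewer deaths in group A.
        if deaths_a <= observed_deaths_a:
            hits += 1
    return hits / samples

estimate = monte_carlo_one_sided_p(3, 1, 2, 2)
print(estimate)  # close to the exact value of 0.5
```

For a 2 by 2 table this is unnecessary, since the exact probabilities are easy to calculate; the approach earns its keep only when the possible tables are too numerous to enumerate.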

13.5 Yates' continuity correction for the 2 by 2 table

The discrepancy in probabilities between the chi-squared test and Fisher's exact test arises because we are estimating the discrete distribution of the test statistic by the continuous Chi-squared distribution. A continuity correction like those of §12.6, called Yates' correction, can be used to improve the fit. The observed frequencies change in units of one, so we make them closer to their expected values by one half. Hence the formula for the corrected chi-squared statistic for a 2 by 2 table is

$$X^2 = \sum \frac{\left(|O - E| - \tfrac{1}{2}\right)^2}{E}$$

For Table 13.7, using unrounded expected frequencies, this gives X² = 8.42.

This has probability 0.0037, which is closer to the exact probability, though there is still a considerable discrepancy. At such extremely low values any approximate probability model such as this is liable to break down. In the critical area between 0.10 and 0.01, the continuity correction usually gives a very good fit to the exact probability. As Fisher's exact test is now so easy to do, Yates' correction may soon disappear.
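The corrected statistic and its probability for Table 13.7 can be reproduced as follows. This Python sketch is illustrative and the function names are mine. It relies on the fact that, for one degree of freedom, the upper tail of the Chi-squared distribution equals the two-sided tail of the Standard Normal distribution, which can be written with the complementary error function.

```python
import math

def yates_chi_squared(table):
    """Chi-squared statistic for a 2 by 2 table with Yates' correction:
    each |O - E| is reduced by one half before squaring."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (abs(o - e) - 0.5) ** 2 / e
    return stat

def p_value_1df(stat):
    """Upper tail probability of Chi-squared with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

table_13_7 = [[13, 5], [2, 12]]
x2_corrected = yates_chi_squared(table_13_7)
p_corrected = p_value_1df(x2_corrected)
print(round(x2_corrected, 2), round(p_corrected, 4))
```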

13.6* The validity of Fisher's and Yates' methods

There has been much dispute among statisticians about the validity of the exact test and the continuity correction which approximates to it. Among the more argumentative of the founding fathers of statistical inference, such as Fisher and Neyman, this was quite acrimonious. The problem is still unresolved, and generating almost as much heat as light.

Note that although both are 2 by 2 tables, Tables 13.4 and 13.7 arose in different ways. In Table 13.7, the column totals were fixed by the design of the experiment and only the row totals are from a random variable. In Table 13.4 neither row nor column totals were set in advance. Both are from the Binomial distribution, depending on the incidence of bronchitis and prevalence of chronic cough in the population. There is a third possibility, that both the row and column totals are fixed. This is rare in practice, but it can be achieved by the following experimental design. We want to know whether a subject can distinguish an active treatment from a placebo. We present him with 10 tablets, 5 of each, and ask him to sort the tablets into the 5 active and 5 placebo. This would give a 2 by 2 table, subject's choice versus truth, in which all row and column totals are preset to 5. There are several variations on these types of table, too. It can be shown that the same chi-squared test applies to all these cases when samples are large. When samples are small, this is not necessarily so. A discussion of the problem is well beyond the scope of this book. For some of these cases, Fisher's exact test and Yates' correction may be conservative, that is, give rather larger probabilities than they should, though this is a matter of debate. My own opinion is that Yates' correction and Fisher's exact test should be used. If we must err, it seems better to err on the side of caution.

Table 13.10. The 2 by 2 table in symbolic notation

                             Total
           a        b        a + b
           c        d        c + d
Total    a + c    b + d    a + b + c + d

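13.7 Odds and odds ratios

If the probability of an event is p then the odds of that event is o = p/(1 - p). The probability that a coin shows a head is 0.5, so the odds is 0.5/(1 - 0.5) = 1. Note that ‘odds’ is a singular word, not the plural of ‘odd’. The odds has advantages for some types of analysis, as it is not constrained to lie between 0 and 1, but can take any value from zero to infinity. We often use the logarithm to the base e of the odds, the log odds or logit:

$$\mathrm{logit}(p) = \log_e\left(\frac{p}{1-p}\right)$$

This can vary from minus infinity to plus infinity and thus is very useful in fitting regression type models (§17.8). The logit is zero when p = 1/2 and the logit of 1 - p is minus the logit of p:

$$\mathrm{logit}(1-p) = \log_e\left(\frac{1-p}{p}\right) = -\log_e\left(\frac{p}{1-p}\right) = -\mathrm{logit}(p)$$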

Consider Table 13.4. The probability of cough for children with a history of bronchitis is 26/273 = 0.09524. The odds of cough for children with a history of bronchitis is 26/247 = 0.10526. The probability of cough for children without a history of bronchitis is 44/1046 = 0.04207. The odds of cough for children without a history of bronchitis is 44/1002 = 0.04391.

One way to compare children with and without bronchitis is to find the ratio of the proportions of children with cough in the two groups (the relative risk, §8.6). Another is to find the odds ratio, the ratio of the odds of cough in children with bronchitis and children without bronchitis. This is (26/247)/(44/1002) = 0.10526/0.04391 = 2.39718. Thus the odds of cough in children with a history of bronchitis is 2.39718 times the odds of cough in children without a history of bronchitis.

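If we denote the frequencies in the table by a, b, c, and d, as in Table 13.10, the odds ratio is given by

$$\mathrm{OR} = \frac{a/c}{b/d} = \frac{ad}{bc}$$

This is symmetrical; we get the same thing by

$$\frac{a/b}{c/d} = \frac{ad}{bc}$$

We can estimate the standard error and confidence interval using the log of the odds ratio (§13C). The standard error of the log odds ratio is:

$$SE(\log_e(\mathrm{OR})) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$

Hence we can find the 95% confidence interval. For Table 13.4, the log odds ratio is loge(2.39718) = 0.87429, with standard error

$$\sqrt{\frac{1}{26} + \frac{1}{44} + \frac{1}{247} + \frac{1}{1002}} = 0.25736$$

Provided the sample is large enough, we can assume that the log odds ratio comes from a Normal distribution and hence the approximate 95% confidence interval is

0.87429 - 1.96 × 0.25736 to 0.87429 + 1.96 × 0.25736 = 0.36986 to 1.37872

To get a confidence interval for the odds ratio itself we must antilog:

$$e^{0.36986}\ \text{to}\ e^{1.37872} = 1.45\ \text{to}\ 3.97$$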
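The whole calculation for Table 13.4 can be sketched in Python as follows. This is an illustration with a function name of my own choosing. Because it does not round the two odds before dividing, the odds ratio comes out as 2.3971 where the hand calculation gives 2.39718; the confidence limits agree to two decimal places.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc for a table laid out as in Table 13.10, with the
    standard error of its log and an approximate confidence interval."""
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)  # antilog of the lower limit
    upper = math.exp(log_or + z * se)  # antilog of the upper limit
    return odds_ratio, se, lower, upper

# Table 13.4: a = cough and bronchitis, b = cough and no bronchitis,
# c = no cough and bronchitis, d = no cough and no bronchitis.
or_13_4, se_13_4, lower_13_4, upper_13_4 = odds_ratio_ci(26, 44, 247, 1002)
print(round(or_13_4, 4), round(se_13_4, 5),
      round(lower_13_4, 2), round(upper_13_4, 2))
```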

The odds ratio can be used to estimate the relative risk in a case–control study. The calculation of relative risk in §8.6 depended on the fact that we could estimate the risks. We could do this because we had a prospective study and so knew how many of the risk group developed the symptom. This cannot be done if we start with the outcome, in this case cough at age 14, and try to work back to the risk factor, bronchitis, as in a case–control study.

Table 13.11 shows data from a case–control study of smoking and lung cancer (see §3.8). We start with a group of cases, patients with lung cancer, and a group of controls, here hospital patients without cancer. We cannot calculate risks (the column totals would be meaningless and have been omitted), but we can still estimate the relative risk.

Suppose the prevalence of lung cancer is p, a small number, and the table is as Table 13.10. Then we can estimate the probability of both having lung cancer and being a smoker by pa/(a + b), because a/(a + b) is the conditional probability of smoking in lung cancer patients (§6.8). Similarly, the probability of being a smoker without lung cancer is (1 - p)c/(c + d). The probability of being a smoker is therefore pa/(a + b) + (1 - p)c/(c + d), the probability of being a smoker with lung cancer plus the probability of being a smoker without lung cancer. Because p is much smaller than 1 - p, the first term can be ignored and the probability of being a smoker is approximately (1 - p)c/(c + d). The risk of lung cancer for smokers is found by dividing the probability of being a smoker with lung cancer by the probability of being a smoker:

$$\frac{pa/(a+b)}{(1-p)c/(c+d)} = \frac{pa(c+d)}{(1-p)c(a+b)}$$

Table 13.11. Smokers and non-smokers among male cancer patients and controls (Doll and Hill 1950)

               Smokers   Non-smokers   Total
Lung cancer      647          2         649
Controls         622         27         649

Similarly, the probability of both being a non-smoker and having lung cancer is pb/(a + b) and the probability of being a non-smoker without lung cancer is (1 - p)d/(c + d). The probability of being a non-smoker is therefore pb/(a + b) + (1 - p)d/(c + d), and since p is much smaller than 1 - p, the first term can be ignored and the probability of being a non-smoker is approximately (1 - p)d/(c + d). This gives a risk of lung cancer among non-smokers of approximately

$$\frac{pb/(a+b)}{(1-p)d/(c+d)} = \frac{pb(c+d)}{(1-p)d(a+b)}$$

The relative risk of lung cancer for smokers is thus, approximately,

$$\frac{pa(c+d)}{(1-p)c(a+b)} \Big/ \frac{pb(c+d)}{(1-p)d(a+b)} = \frac{ad}{bc}$$

This is, of course, the odds ratio. Thus for case–control studies the relative risk is approximated by the odds ratio.
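The quality of this approximation can be explored numerically. The Python sketch below keeps the terms that the argument above drops and computes the relative risk for an assumed prevalence p. The prevalences used (0.001 and 0.000001) are hypothetical values chosen for illustration; they are not estimates from the study, and the function name is my own.

```python
def relative_risk_from_case_control(a, b, c, d, p):
    """Relative risk implied by a case-control table laid out as in
    Table 13.10 (cases in the first row), for disease prevalence p,
    without ignoring the small terms."""
    exposed_and_case = p * a / (a + b)
    exposed_no_case = (1 - p) * c / (c + d)
    unexposed_and_case = p * b / (a + b)
    unexposed_no_case = (1 - p) * d / (c + d)
    risk_exposed = exposed_and_case / (exposed_and_case + exposed_no_case)
    risk_unexposed = unexposed_and_case / (unexposed_and_case + unexposed_no_case)
    return risk_exposed / risk_unexposed

# Table 13.11: smokers and non-smokers among cases and controls.
a, b, c, d = 647, 2, 622, 27
odds_ratio = (a * d) / (b * c)
rr_rare = relative_risk_from_case_control(a, b, c, d, p=0.001)
rr_very_rare = relative_risk_from_case_control(a, b, c, d, p=0.000001)
print(round(odds_ratio, 2), round(rr_rare, 2), round(rr_very_rare, 2))
```

As p becomes smaller the relative risk approaches ad/bc, which is the sense in which the odds ratio estimates the relative risk for a rare disease.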

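For Table 13.11 we have

$$\frac{ad}{bc} = \frac{647 \times 27}{2 \times 622} = 14.04$$

Thus the risk of lung cancer in smokers is about 14 times that of non-smokers. This is a surprising result from a table with so few non-smokers, but a direct estimate from the cohort study (Table 3.1) is 0.90/0.07 = 12.9, which is very similar. The log odds ratio is 2.64210 and its standard error is

$$\sqrt{\frac{1}{647} + \frac{1}{2} + \frac{1}{622} + \frac{1}{27}} = 0.73498$$

Hence the approximate 95% confidence interval is

2.64210 - 1.96 × 0.73498 to 2.64210 + 1.96 × 0.73498 = 1.20154 to 4.08266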

Table 13.12. Cough during the day or at night and cigarette smoking by 12-year-old boys (Bland et al. 1978)

                              Boy's smoking
             Non-smoker        Occasional        Regular
Cough         266   20.4%      395   28.8%      80   46.5%
No cough     1037   79.6%      977   71.2%      92   53.5%
Total        1303  100.0%     1372  100.0%     172  100.0%

To get a confidence interval for the odds ratio itself we must antilog:

$$e^{1.20154}\ \text{to}\ e^{4.08266} = 3.3\ \text{to}\ 59.3$$

The very wide confidence interval is because the numbers of non-smokers, particularly for lung cancer cases, are so small.

13.8* The chi-squared test for trend

Consider the data of Table 13.12. Using the chi-squared test described in §13.1, we can test the null hypothesis that there is no relationship between reported cough and smoking against the alternative that there is a relationship of some sort. The chi-squared statistic is 64.25, with 2 degrees of freedom, P < 0.001. The data are not consistent with the null hypothesis.

Now, we would have got the same value of chi-squared whatever the order of the columns. The test ignores the natural ordering of the columns, but we might expect that if there were a relationship between reported cough and smoking, the prevalence of cough would be greater for greater amounts of smoking. In other words, we look for a trend in cough prevalence from one end of the table to the other. We can test for this using the chi-squared test for trend.

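First, we define two random variables, X and Y, whose values depend on the categories of the row and column variables. For example, we could put X = 1 for non-smokers, X = 2 for occasional smokers and X = 3 for regular smokers, and put Y = 1 for ‘cough’ and Y = 2 for ‘no cough’. Then for a non-smoker who coughs, the value of X is 1 and the value of Y is 1. Both X and Y may have more than two categories, provided both are ordered. If there are n individuals, we have n pairs of observations (xi, yi). If there is a linear trend across the table, there will be linear regression of Y on X which has non-zero slope. We fit the usual least squares regression line, Y = a + bX, where

$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \qquad SE(b) = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}}$$

and where s² is the estimated variance of Y. In simple linear regression, as described in Chapter 11, we are usually concerned with estimating b and making statements about its precision. Here we are only going to test the null hypothesis that in the population b = 0. Under the null hypothesis, the variance about the line is equal to the total variance of Y, since the line has zero slope. We use the estimate

$$s^2 = \frac{1}{n}\sum (y_i - \bar{y})^2$$

(We use n as the denominator, not n - 1, because the test is conditional on the row and column totals as described in §13A. There is a good reason for it, but it is not worth going into here.) As in §11.5, the standard error of b is

$$SE(b) = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n \sum (x_i - \bar{x})^2}}$$

and the test statistic is the square of b divided by its standard error.

For practical calculations we use the alternative forms of the sums of squares and products:

$$\sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$

$$\sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{(\sum y_i)^2}{n}$$

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$$

Note that it does not matter which variable is X and which is Y. The sums of squares and products are easy to work out. For example, for the column variable, X, we have 1303 individuals with X = 1, 1372 with X = 2 and 172 with X = 3. For our data we have

$$\sum x_i^2 = 1303 \times 1^2 + 1372 \times 2^2 + 172 \times 3^2 = 8339$$

$$\sum x_i = 1303 \times 1 + 1372 \times 2 + 172 \times 3 = 4563$$

$$\sum (x_i - \bar{x})^2 = 8339 - \frac{4563^2}{2847} = 1025.70$$

Similarly, Σyi² = 9165 and Σyi = 4953;

$$\sum (y_i - \bar{y})^2 = 9165 - \frac{4953^2}{2847} = 548.14$$

$$\sum x_i y_i = 7830 \qquad \sum (x_i - \bar{x})(y_i - \bar{y}) = 7830 - \frac{4563 \times 4953}{2847} = -108.37$$

$$b = \frac{-108.37}{1025.70} = -0.1057 \qquad SE(b) = \sqrt{\frac{548.14}{2847 \times 1025.70}} = 0.0137$$

$$\chi^2_1 = \left(\frac{b}{SE(b)}\right)^2 = \frac{2847 \times (-108.37)^2}{1025.70 \times 548.14} = 59.47$$

If the null hypothesis is true, χ²₁ is an observation from the Chi-squared distribution with 1 degree of freedom. The value 59.47 is highly unlikely from this distribution and the trend is significant.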
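The calculation for Table 13.12 can be sketched in Python as below. This is an illustration only; the function name and argument layout are my own. The scores for X and Y are passed in explicitly, so that a different scoring can be tried by changing the lists.

```python
def chi_squared_trend(counts, x_scores, y_scores):
    """Chi-squared test for trend.

    counts[i][j] is the frequency with Y = y_scores[i] and X = x_scores[j].
    Returns the slope b, its standard error (variance of Y estimated
    with n as the denominator) and the chi-squared statistic on 1 df.
    """
    n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0
    for y, row in zip(y_scores, counts):
        for x, f in zip(x_scores, row):
            n += f
            sum_x += f * x
            sum_y += f * y
            sum_xx += f * x * x
            sum_yy += f * y * y
            sum_xy += f * x * y
    sxx = sum_xx - sum_x ** 2 / n      # sum of squares about the mean, X
    syy = sum_yy - sum_y ** 2 / n      # sum of squares about the mean, Y
    sxy = sum_xy - sum_x * sum_y / n   # sum of products about the means
    b = sxy / sxx
    se_b = (syy / (n * sxx)) ** 0.5
    return b, se_b, (b / se_b) ** 2

# Table 13.12: rows are cough (Y = 1) and no cough (Y = 2); columns are
# non-smoker (X = 1), occasional (X = 2) and regular smoker (X = 3).
table_13_12 = [[266, 395, 80], [1037, 977, 92]]
b, se_b, x2_trend = chi_squared_trend(table_13_12, [1, 2, 3], [1, 2])
print(round(b, 4), round(se_b, 4), round(x2_trend, 2))
```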

There are several points to note about this method. The choice of values for X and Y is arbitrary. By putting X = 1, 2 or 3 we assumed that the difference between non-smokers and occasional smokers is the same as that between occasional smokers and regular smokers. This need not be so and a different choice of X would give a different chi-squared for trend statistic. The choice is not critical, however. For example, putting X = 1, 2 or 4, so making regular smokers more different from occasional smokers than occasional smokers are from non-smokers, we get χ² for trend to be 64.22. The fit to the data is rather better, but the conclusions are unchanged.

The trend may be significant even if the overall contingency table chi-squared is not. This is because the test for trend has greater power for detecting trends than has the ordinary chi-squared test. On the other hand, if we had an association where those who were occasional smokers had far more symptoms than either non-smokers or regular smokers, the trend test would not detect it. If the hypothesis we wish to test involves the order of the categories, we should use the trend test; if it does not we should use the contingency table test of §13.1. Note that the trend test statistic is always less than the overall chi-squared statistic.

The distribution of the trend chi-squared statistic depends on a large sample regression model, not on the theory given in §13A. The table does not have to meet Cochran's rule (§13.3) for the trend test to be valid. As long as there are at least 30 observations the approximation should be valid.

Some computer programs offer a slightly different test, the Mantel–Haenszel trend test (not to be confused with the Mantel–Haenszel method for combining 2 by 2 tables, §17.11). This is almost identical to the method described here. As an alternative to the chi-squared test for trend, we could calculate Kendall's rank correlation coefficient, τb, between X and Y (§12.5). For Table 13.12 we get τb = -0.136 with standard error 0.018. We get a χ²₁ statistic by (τb/SE(τb))² = 57.09. This is very similar to the χ² for trend value 59.47.

13.9* Methods for matched samples

The chi-squared test described above enables us, among other things, to test the null hypothesis that binomial proportions estimated from two independent samples are the same. We can do this for the one sample or matched sample problem also. For example, Holland et al. (1978) obtained respiratory symptom questionnaires for 1319 Kent schoolchildren at ages 12 and 14. One question we asked was whether the prevalence of reported symptoms was different at the two ages. At age 12, 356 (27%) children were reported to have had severe colds in the past 12 months compared to 468 (35%) at age 14. Was there evidence of a real increase? Just as in the one sample or paired t test (§10.2) we would hope to improve our analysis by taking into account the fact that this is the same sample. We might expect, for instance, that symptoms on the two occasions will be related.

Table 13.13. Severe colds reported at two ages for Kent schoolchildren (Holland et al. 1978)

Severe colds      Severe colds at age 14
at age 12            Yes        No        Total
Yes                  212       144         356
No                   256       707         963
Total                468       851        1319

The method which enables us to do this is McNemar's test, another version of the sign test. We need to know that 212 children were reported to have colds on both occasions, 144 to have colds at 12 but not at 14, 256 to have colds at 14 but not at 12 and 707 to have colds at neither age. Table 13.13 shows the data in tabular form.

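The null hypothesis is that the proportions saying yes on the first and second occasions are the same, the alternative being that one exceeds the other. This is a hypothesis about the row and column totals, quite different from that for the contingency table chi-squared test. If the null hypothesis were true we would expect the frequencies for ‘yes, no’ and ‘no, yes’ to be equal. In other words, as many should go up as down. (Compare this with the sign test, §9.2.) If we denote these frequencies by fyn and fny, then the expected frequencies will be (fyn + fny)/2. We get the test statistic:

$$X^2 = \frac{\left(f_{yn} - \frac{f_{yn}+f_{ny}}{2}\right)^2}{\frac{f_{yn}+f_{ny}}{2}} + \frac{\left(f_{ny} - \frac{f_{yn}+f_{ny}}{2}\right)^2}{\frac{f_{yn}+f_{ny}}{2}}$$

which follows a Chi-squared distribution provided the expected values are large enough. There are two observed frequencies and one constraint, that the sum of the observed frequencies = the sum of the expected frequencies. Hence there is one degree of freedom. Like the chi-squared test (§13.1) and Fisher's exact test (§13.4), we assume a total to be fixed. In this case it is fyn + fny, not the row and column totals, which are what we are testing. The test statistic can be simplified considerably, to:

$$X^2 = \frac{(f_{yn} - f_{ny})^2}{f_{yn} + f_{ny}}$$

For Table 13.13, we have

$$X^2 = \frac{(144 - 256)^2}{144 + 256} = 31.36$$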

This can be referred to Table 13.3 with one degree of freedom and is clearly highly significant. There was a difference between the two ages. As there was no change in any of the other symptoms studied, we thought that this was possibly due to an epidemic of upper respiratory tract infection just before the second questionnaire.

There is a continuity correction, again due to Yates. If the observed frequency fyn increases by 1, fny decreases by 1 and fyn - fny increases by 2. Thus half the difference between adjacent possible values is 1 and we make the observed difference nearer to the expected difference (zero) by 1. Thus the continuity corrected test statistic is

$$X^2 = \frac{(|f_{yn} - f_{ny}| - 1)^2}{f_{yn} + f_{ny}}$$

where |fyn - fny| is the absolute value, without sign. For Table 13.13:

$$X^2 = \frac{(|144 - 256| - 1)^2}{144 + 256} = 30.80$$

There is very little difference because the expected values are so large but if the expected values are small, say less than 20, the correction is advisable. For small samples, we can also take fny as an observation from the Binomial distribution with p = ½ and n = fyn + fny and proceed as for the sign test (§9.2).

We can find a confidence interval for the difference between the proportions. The estimated difference is p1 - p2 = (fyn - fny)/n. We rearrange this:

$$p_1 - p_2 = \frac{f_{yn} - f_{ny}}{n} = \frac{f_{yn} + f_{ny}}{n}\left(\frac{2 f_{yn}}{f_{yn} + f_{ny}} - 1\right)$$

We can treat fyn as an observation from a Binomial distribution with parameter n = fyn + fny, which, of course, we are treating as fixed. (I am using n here to mean the parameter of the Binomial distribution as in §6.4, not to mean the total sample size.) We find a confidence interval for fyn/(fyn + fny) using either the z method of §8.4 or the exact method of §8.8. We then multiply these limits by 2, subtract 1 and multiply by (fyn + fny)/n.

For the example, the estimated difference is (144 - 256)/1319 = -0.085. For the confidence interval, fyn + fny = 400 and fyn = 144. The 95% confidence interval for fyn/(fyn + fny) is 0.313 to 0.407 by the large sample method. Hence the confidence interval for p1 - p2 is (2 × 0.313 - 1) × 400/1319 = -0.113 to (2 × 0.407 - 1) × 400/1319 = -0.056. We estimate that the proportion of colds on the first occasion was less than that on the second by between 0.06 and 0.11.
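McNemar's test, its continuity correction and the large sample confidence interval for the difference can be put together in a few lines of Python. This is a sketch with names of my own, assuming the large sample (z) method for the interval and not the exact method.

```python
import math

def mcnemar(f_yn, f_ny, n, z=1.96):
    """McNemar's test for a matched 2 by 2 table.

    f_yn and f_ny are the two off diagonal frequencies and n is the
    total number of pairs. Returns the chi-squared statistic, the
    continuity corrected statistic, the estimated difference between
    the proportions and its approximate confidence interval.
    """
    discordant = f_yn + f_ny
    x2 = (f_yn - f_ny) ** 2 / discordant
    x2_corrected = (abs(f_yn - f_ny) - 1) ** 2 / discordant
    difference = (f_yn - f_ny) / n
    # Large sample interval for the Binomial proportion f_yn / discordant.
    p = f_yn / discordant
    se = math.sqrt(p * (1 - p) / discordant)
    # Multiply the limits by 2, subtract 1, multiply by discordant / n.
    limits = tuple((2 * q - 1) * discordant / n
                   for q in (p - z * se, p + z * se))
    return x2, x2_corrected, difference, limits

# Table 13.13: colds at 12 but not 14, colds at 14 but not 12, total.
x2, x2_corrected, difference, limits = mcnemar(144, 256, 1319)
print(round(x2, 2), round(x2_corrected, 2), round(difference, 3),
      tuple(round(v, 3) for v in limits))
```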

We may wish to compare the distribution of a variable with three or more categories in matched samples. If the categories are ordered, like smoking experience in Table 13.12, we are usually looking for a shift from one end of the distribution to the other, and we can use the sign test (§9.2), counting positives when smoking increased, negatives when it decreased, and zero if the category was the same. When the categories are not ordered, as in Table 13.1, there is a test due to Stuart (1955), described by Maxwell (1970). The test is difficult to do and the situation is very unusual, so I shall omit details. My free program Clinstat will do it (§1.3).

Table 13.14. Parity of 125 women attending antenatal clinics at St. George's Hospital, with the calculation of the chi-squared goodness of fit test

We can also find an odds ratio for the matched table, called the conditional odds ratio. Like McNemar's method, it uses the frequencies in the off diagonal only. The estimate is very simple: fyn/fny. Thus for Table 13.13 the odds of having severe colds at age 12 is 144/256 = 0.56 times that at age 14. This example is not very interesting, but the method is particularly useful in matched case–control studies, where it provides an estimate of the relative risk. A confidence interval is provided in the same way as for the difference between proportions. We can estimate p = fyn/(fyn + fny) and then the odds ratio is given by p/(1 - p). For the example, p = 144/400 = 0.36 and turning p back to the odds ratio p/(1 - p) = 0.36/(1 - 0.36) = 0.56 as before. The 95% confidence interval for p is 0.313 to 0.407, as above. Hence the 95% confidence interval for the conditional odds ratio is 0.31/(1 - 0.31) = 0.45 to 0.41/(1 - 0.41) = 0.69.

13.10* The chi-squared goodness of fit test

Another use of the Chi-squared distribution is the goodness of fit test. Here we test the null hypothesis that a frequency distribution follows some theoretical distribution such as the Poisson or Normal. Table 13.14 shows a frequency distribution. We shall test the null hypothesis that it is from a Poisson distribution, i.e. that conception is a random event among fertile women.

First we estimate the parameter of the Poisson distribution, its mean, µ, in this case 0.816. We then calculate the probability for each value of the variable, using the Poisson formula of §6.7:

$$\frac{e^{-\mu}\mu^{r}}{r!}$$

where r is the number of events. The probabilities are shown in Table 13.14. The probability that the variable exceeds five is found by subtracting the probabilities for 0, 1, 2, 3, 4, and 5 from 1.0. We then multiply these by the number of observations, 125, to give the frequencies we would expect from 125 observations from a Poisson distribution with mean 0.816.
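The expected frequencies can be generated as follows. This Python sketch uses only the two figures quoted in the text, the mean 0.816 and the 125 observations; the function name is my own, and the observed frequencies of Table 13.14 are not used here.

```python
import math

def poisson_expected(mean, n, max_value):
    """Expected frequencies for the values 0, 1, ..., max_value and for
    'more than max_value', from n observations of a Poisson variable."""
    probs = [math.exp(-mean) * mean ** r / math.factorial(r)
             for r in range(max_value + 1)]
    probs.append(1 - sum(probs))  # probability of exceeding max_value
    return [n * p for p in probs]

expected = poisson_expected(0.816, 125, 5)
for r, e in enumerate(expected):
    label = str(r) if r <= 5 else "more than 5"
    print(label, round(e, 2))
```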

Table 13.15. Time of onset of 554 strokes (Wroe et al. 1992)

Time            Frequency     Time            Frequency
00.01–02.00         21        12.01–14.00         34
02.01–04.00         16        14.01–16.00         59
04.01–06.00         22        16.01–18.00         44
06.01–08.00        104        18.01–20.00         51
08.01–10.00         95        20.01–22.00         32
10.01–12.00         66        22.01–24.00         10

We now have a set of observed and expected frequencies and can compute a chi-squared statistic in the usual way. We want all the expected frequencies to be greater than 5 if possible. We achieve this here by combining all the categories for parity greater than or equal to 3. We then add (O - E)²/E for the categories to give a χ² statistic. We now find the degrees of freedom. This is the number of categories minus the number of parameters fitted from the data (one in the example) minus one. Thus we have 4 - 1 - 1 = 2 degrees of freedom. From Table 13.3 the observed χ² value of 2.99 has P > 0.10 and the deviation from the Poisson distribution is clearly not significant.

The same test can be used for testing the fit of any distribution. For example, Wroe et al. (1992) studied diurnal variation in onset of strokes. Table 13.15 shows the frequency distribution of times of onset. If the null hypothesis that there is no diurnal variation were true, the time at which strokes occurred would follow a Uniform distribution (§7.2). The expected frequency in each time interval would be the same. There were 554 cases altogether, so the expected frequency for each time is 554/12 = 46.167. We then work out (O - E)²/E for each interval and add to give the chi-squared statistic, in this case equal to 218.8. There is only one constraint, that the frequencies total 554, as no parameters have been estimated. Hence if the null hypothesis were true we would have an observation from the Chi-squared distribution with 12 - 1 = 11 degrees of freedom. The calculated value of 218.8 is very unlikely, P < 0.001 from Table 13.3, and the data are not consistent with the null hypothesis. When we test the equality of a set of frequencies like this the test is also called the Poisson heterogeneity test.
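The goodness of fit calculation for the stroke data takes only a few lines. The following Python sketch is an illustration using the frequencies of Table 13.15, with variable names of my own.

```python
# Frequencies of stroke onset in the twelve two-hour intervals of
# Table 13.15, in order from 00.01-02.00 to 22.01-24.00.
observed = [21, 16, 22, 104, 95, 66, 34, 59, 44, 51, 32, 10]

# Under a Uniform distribution every interval has the same expectation.
expected = sum(observed) / len(observed)
chi_squared = sum((o - expected) ** 2 / expected for o in observed)

# One constraint (the total) and no estimated parameters.
degrees_of_freedom = len(observed) - 1

print(sum(observed), round(expected, 3),
      round(chi_squared, 1), degrees_of_freedom)
```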

Appendices

13A Appendix: Why the chi-squared test works

We noted some of the properties of the Chi-squared distribution in §7A. In particular, it is the sum of the squares of a set of independent Standard Normal variables, and if we look at a subset of values defined by independent linear relationships between these variables we lose one degree of freedom for each constraint. It is on these two properties that the chi-squared test depends.

Suppose we did not have a fixed size to the birth study of Table 13.1, but observed subjects as they delivered over a fixed time. Then the number in a given cell of the table would be from a Poisson distribution and the set of Poisson variables corresponding to the cell frequency would be independent of one another. Our table is one set of samples from these Poisson distributions. However, we do not know the expected values of these distributions under the null hypothesis; we only know their expected values if the table has the row and column totals we observed. We can only consider the subset of outcomes of these variables which has the observed row and column totals. The test is said to be conditional on these row and column totals.

Table 13.16. Symbolic representation of a 2 × 2 table

                       Total
          f11    f12     r1
          f21    f22     r2
Total     c1     c2      n

The mean and variance of a Poisson variable are equal (§6.7). If the null hypothesis is true, the means of these variables will be equal to the expected frequency calculated in §13.1. Thus O, the observed cell frequency, is from a Poisson distribution with mean E, the expected cell frequency, and standard deviation √E. Provided E is large enough, this Poisson distribution will be approximately Normal. Hence (O - E)/√E is from a Normal distribution with mean 0 and variance 1. Hence if we find

$$\sum \left(\frac{O - E}{\sqrt{E}}\right)^2 = \sum \frac{(O - E)^2}{E}$$

this is the sum of the squares of a set of Normally distributed random variables with mean 0 and variance 1, and so is from a Chi-squared distribution (§7A).

We will now find the degrees of freedom. Although the underlying variables are independent, we are only considering a subset defined by the row and column totals. Consider the table as in Table 13.16. Here, f11 to f22 are the observed frequencies, r1, r2 the row totals, c1, c2 the column totals, and n the grand total. Denote the corresponding expected values by e11 to e22. There are three linear constraints on the frequencies:

$$f_{11} + f_{12} + f_{21} + f_{22} = n$$

$$f_{11} + f_{12} = r_1$$

$$f_{11} + f_{21} = c_1$$

Any other constraint can be made up of these. For example, we must have

$$f_{21} + f_{22} = r_2$$

This can be found by subtracting the second equation from the first. Each of these linear constraints on f11 to f22 is also a linear constraint on (f11 - e11)/√e11 to (f22 - e22)/√e22. This is because e11 is fixed and so (f11 - e11)/√e11 is a linear function of f11. There are four observed frequencies and so four (O - E)/√E variables, with three constraints. We lose one degree of freedom for each constraint and so have 4 - 3 = 1 degree of freedom.

If we have r rows and c columns, then we have one constraint that the sum of the frequencies is n. Each row must add up, but when we reach the last row the constraint can be obtained by subtracting the first r - 1 rows from the grand total. The rows contribute only r - 1 further constraints. Similarly the columns contribute c - 1 constraints. Hence, there being rc frequencies, the degrees of freedom are

$$rc - 1 - (r - 1) - (c - 1) = rc - r - c + 1 = (r - 1)(c - 1)$$

So we have degrees of freedom given by the number of rows minus one times the number of columns minus one.

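13B Appendix: The formula for Fisher's exact test

The derivation of Fisher's formula is strictly for the algebraically minded. Remember that the number of ways of choosing r things out of n things (§6A) is n!/r!(n - r)!. Now, suppose we have a 2 by 2 table made up of n as shown in Table 13.16. First, we ask how many ways n individuals can be arranged to give marginal totals, r1, r2, c1 and c2. They can be arranged in columns in n!/c1!c2! ways, since we are choosing c1 objects out of n, and in rows n!/r1!r2! ways. (Remember n - c1 = c2 and n - r1 = r2.) Hence they can be arranged in

$$\frac{n!}{c_1!\,c_2!} \times \frac{n!}{r_1!\,r_2!} = \frac{n!\,n!}{c_1!\,c_2!\,r_1!\,r_2!}$$

ways. For example, the table with totals

                   Total
                     5
                     3
Total     4    4     8

can happen in

$$\frac{8!}{4!\,4!} \times \frac{8!}{5!\,3!} = 70 \times 56 = 3920 \text{ ways.}$$

As we saw in §13.4, the columns can be arranged in 70 ways. Now we ask, of these ways how many make up a particular table? We are now dividing the n into four groups of sizes f11, f12, f21 and f22. We can choose the first group in n!/f11!(n - f11)! ways, as before. We are now left with n - f11 individuals, so we can choose f12 in (n - f11)!/f12!(n - f11 - f12)! ways. We are now left with n - f11 - f12, and so we choose f21 in (n - f11 - f12)!/f21!(n - f11 - f12 - f21)! ways. This leaves n - f11 - f12 - f21, which is, of course, equal to f22 and so f22 can only be chosen in one way. Hence we have altogether:

$$\frac{n!}{f_{11}!\,(n-f_{11})!} \times \frac{(n-f_{11})!}{f_{12}!\,(n-f_{11}-f_{12})!} \times \frac{(n-f_{11}-f_{12})!}{f_{21}!\,(n-f_{11}-f_{12}-f_{21})!} = \frac{n!}{f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$

because n - f11 - f12 - f21 = f22. So out of the

$$\frac{n!\,n!}{c_1!\,c_2!\,r_1!\,r_2!}$$

possible tables, the given table arises in

$$\frac{n!}{f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$

ways. The probability of this table arising by chance is

$$\frac{n!}{f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!} \Big/ \frac{n!\,n!}{c_1!\,c_2!\,r_1!\,r_2!} = \frac{r_1!\,r_2!\,c_1!\,c_2!}{n!\,f_{11}!\,f_{12}!\,f_{21}!\,f_{22}!}$$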

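13C Appendix: Standard error for the log odds ratio

This is for the mathematical reader. We start with a general result concerning log transformations. If X is a random variable with mean µ, the approximate variance of loge(X) is given by

$$VAR(\log_e(X)) = \frac{VAR(X)}{\mu^2}$$

If an event happens a times and does not happen b times, the log odds is loge(a/b) = loge(a) - loge(b). The frequencies a and b are from independent Poisson distributions with means estimated by a and b respectively. Their variances are therefore also estimated by a and b, and hence the variances of loge(a) and loge(b) are estimated by a/a² = 1/a and b/b² = 1/b respectively. The variance of the log odds is given by

$$VAR(\log_e(o)) = \frac{1}{a} + \frac{1}{b}$$

The standard error of the log odds is thus given by

$$SE(\log_e(o)) = \sqrt{\frac{1}{a} + \frac{1}{b}}$$

The log odds ratio is the difference between the log odds:

$$\log_e\left(\frac{o_1}{o_2}\right) = \log_e(o_1) - \log_e(o_2)$$

The variance of the log odds ratio is the sum of the variances of the log odds and for a 2 by 2 table with frequencies a, b, c and d (Table 13.10) we have

$$VAR(\log_e(\mathrm{OR})) = VAR(\log_e(o_1)) + VAR(\log_e(o_2)) = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}$$

The standard error is the square root of this:

$$SE(\log_e(\mathrm{OR})) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$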

13M Multiple choice questions 67 to 73

(Each branch is either true or false)

67. The standard chi-squared test for a 2 by 2 contingency table is valid only if:

(a) all the expected frequencies are greater than five;

(b) both variables are continuous;

(c) at least one variable is from a Normal distribution;

(d) all the observed frequencies are greater than five;

(e) the sample is very large.


68. In a chi-squared test for a 5 by 3 contingency table:

(a) variables must be quantitative;

(b) observed frequencies are compared to expected frequencies;

(c) there are 15 degrees of freedom;

(d) at least 12 cells must have expected values greater than five;

(e) all the observed values must be greater than one.


Table 13.17. Cough first thing in the morning in a group of schoolchildren, as reported by the child and by the child's parents (Bland et al. 1979)

Parents'         Child's report
report           Yes        No        Total
Yes               29       104         133
No               172      5097        5269
Total            201      5201        5402

69. In Table 13.17:
(a) the association between reports by parents and children can be tested by a chi-squared test;
(b) *the difference between symptom prevalence as reported by children and parents can be tested by McNemar's test;
(c) *if McNemar's test is significant, the contingency chi-squared test is not valid;
(d) the contingency chi-squared test has one degree of freedom;
(e) it would be important to use the continuity correction in the contingency chi-squared test.

70. Fisher's exact test for a contingency table:
(a) applies to 2 by 2 tables;
(b) usually gives a larger probability than the ordinary chi-squared test;
(c) usually gives about the same probability as the chi-squared test with Yates' continuity correction;
(d) is suitable when expected frequencies are small;
(e) is difficult to calculate when the expected frequencies are large.

71. When an odds ratio is calculated from a 2 by 2 table:
(a) the odds ratio is a measure of the strength of the relationship between the row and column variables;
(b) if the order of the rows and the order of the columns is reversed, the odds ratio will be unchanged;
(c) the ratio may take any positive value;
(d) the odds ratio will be changed to its reciprocal if the order of the columns is changed;
(e) the odds ratio is the ratio of the proportions of observations in the first row for the two columns.

Table 13.18. Bird attacks on milk bottles reported by cases of Campylobacter jejuni infection and controls (Southern et al. 1990)

Number of days of week          Number of            OR
when attacks took place     Cases     Controls
0                             3          42            1
1–3                          11           3           51
4–5                           5           1           70
6–7                          10           1          140

72. Table 13.18 appeared in the report of a case control study of infection with Campylobacter jejuni (§3E):
(a) *a chi-squared test for trend could be used to test the null hypothesis that risk of disease does not increase with the number of bird attacks;
(b) ‘OR’ means the odds ratio;
(c) *a significant chi-squared test would show that risk of disease increases with increasing numbers of bird attacks;
(d) ‘OR’ provides an estimate of the relative risk of Campylobacter jejuni infection;
(e) *Kendall's rank correlation coefficient, τb, could be used to test the null hypothesis that risk of disease does not increase with the number of bird attacks.

73. *McNemar's test could be used:
(a) to compare the numbers of cigarette smokers among cancer cases and age and sex matched healthy controls;
(b) to examine the change in respiratory symptom prevalence in a group of asthmatics from winter to summer;
(c) to look at the relationship between cigarette smoking and respiratory symptoms in a group of asthmatics;
(d) to examine the change in PEFR in a group of asthmatics from winter to summer;
(e) to compare the number of cigarette smokers among a group of cancer cases and a random sample of the general population.

13E Exercise: Admissions to hospital in a heatwave

In this exercise we shall look at some data assembled to test the hypothesis that there is a considerable increase in the number of admissions to geriatric wards during heatwaves. Table 13.19 shows the number of admissions to geriatric wards in a health district for each week during the summers of 1982, which was cold, and 1983, which was hot. Also shown are the average of the daily peak temperatures for each week.

1. When do you think the heatwave began and ended?

2. How many admissions were there during the heatwave and in the corresponding period of 1982? Would this be sufficient evidence to conclude that heatwaves produce an increase in admissions?

3. We can use the periods before and after the heatwave weeks as controls for changes in other factors between the years. Divide the years into three periods, before, during, and after the heatwave and set up a two-way table showing numbers of admissions by period and year.

Table 13.19. Mean peak daily temperatures for each week from May to September of 1982 and 1983, with geriatric admissions in Wandsworth (Fish 1985)

        Mean peak, °C     Admissions             Mean peak, °C
Week    1982    1983      1982    1983    Week   1982    1983
 1      12.4    15.3       24      20      12    21.7    25.0
 2      18.2    14.4       22      17      13    22.5    27.3
 3      20.4    15.5       21      21      14    25.7    22.9
 4      18.8    15.6       22      17      15    23.6    24.3
 5      25.3    19.6       24      22      16    20.4    26.5
 6      23.2    21.6       15      23      17    19.6    25.0
 7      18.6    18.9       23      20      18    20.2    21.2
 8      19.4    22.0       21      16      19    22.2    19.7
 9      20.6    21.0       18      24      20    23.3    16.6
10      23.4    26.5       21      21      21    18.1    18.4
11      22.8    30.4       17      20      22    17.3    20.7

4. We can use this table to test for a heatwave effect. State the null hypothesis and calculate the frequencies expected if the null hypothesis were true.

5. Test the null hypothesis. What conclusions can you draw?

6. What other information could be used to test the relationship between heatwaves and geriatric admissions?




14 Choosing the statistical method

14.1* Method oriented and problem oriented teaching

The choice of method of analysis for a problem depends on the comparison to be made and the data to be used. In Chapters 8, 9, 10, 11, 12, and 13, statistical methods have been arranged largely by type of data, large samples, Normal, ordinal, categorical, etc., rather than by type of comparison. In this chapter we look at how the appropriate method is chosen for the three most common problems in statistical inference:

comparison of two independent groups, for example, groups of patients given different treatments;

comparison of the response of one group under different conditions, as in a cross-over trial, or of matched pairs of subjects, as in some case–control studies;

investigation of the relationship between two variables measured on the same sample of subjects.

This chapter acts as a map of the methods described in Chapters 8, 9, 10, 11, 12, and 13. Subsequent chapters describe methods for special problems in clinical medicine, population study, dealing with several factors at once, and the choice of sample size.

As was discussed in §12.7, there are often several different approaches to even a simple statistical problem. The methods described here and recommended for particular types of question may not be the only methods, and may not always be universally agreed as the best method. Statisticians are at least as prone to disagree as clinicians. However, these would usually be considered as valid and satisfactory methods for the purposes for which they are suggested here. When there is more than one valid approach to a problem, they will usually be found to give similar answers.

14.2* Types of data

The study design is one factor which determines the method of analysis; the variable being analysed is another. We can classify variables into the following types:

Ratio scales. The ratio of two quantities has a meaning, so we can say that one observation is twice another. Human height is a ratio scale. Ratio scales allow us to carry out power transformations like log or square root.

Interval scales. The interval or distance between points on the scale has precise meaning: a change of one unit at one scale point is the same as a change of one unit at another. For example, temperature in °C is an interval scale, though not a ratio scale because the zero is arbitrary. We can add and subtract on an interval scale. All ratio scales are also interval scales. Interval scales allow us to calculate means and variances, and to find standard errors and confidence intervals for these.

Ordinal scale. The scale enables us to order the subjects, from that with the lowest value to that with the highest. Any ties which cannot be ordered are assumed to be because the measurement is not sufficiently precise. A typical example would be an anxiety score calculated from a questionnaire. A person scoring 10 is more anxious than a person scoring 8, but not necessarily higher by the same amount that a person scoring 4 is higher than a person scoring 2.

Ordered nominal scale. We can group subjects into several categories, which have an order. For example, we can ask patients if their condition is much improved, improved a little, no change, a little worse, much worse.

Nominal scale. We can group subjects into categories which need not be ordered in any way. Eye colour is measured on a nominal scale.

Dichotomous scales. Subjects are grouped into only two categories, for example: survived or died. This is a special case of the nominal scale.

Clearly these classes are not mutually exclusive, and an interval scale is also ordinal. Sometimes it is useful to apply methods appropriate to a lower level of measurement, ignoring some of the information. The combination of the type of comparison and the scale of measurement should direct us to the appropriate method.

14.3* Comparing two groups

The methods used for comparing two groups are summarized in Table 14.1.

Interval data. For large samples, say more than 50 in each group, confidence intervals for the mean can be found by the Normal approximation (§8.5). For smaller samples, confidence intervals for the mean can be found using the t distribution provided the data follow or can be transformed to a Normal distribution (§10.3, §10.4). If not, a significance test of the null hypothesis that the means are equal can be carried out using the Mann–Whitney U test (§12.2). This can be useful when the data are censored, that is, there are values too small or too large to measure. This happens, for example, when concentrations are too small to measure and labelled ‘not detectable’. Provided that data are from Normal distributions, it is possible to compare the variances of the groups using the F test (§10.8).

Ordinal data. The tendency for one group to exceed members of the other is tested by the Mann–Whitney U test (§12.2).

Ordered nominal data. First the data are set out as a two way table, one variable being group and the other the ordered nominal data. A chi-squared test (§13.1) will test the null hypothesis that there is no relationship between group and variable, but takes no account of the ordering. To use the ordering, apply the chi-squared test for trend, which takes the ordering into account and provides a much more powerful test (§13.8).

Table 14.1. Methods for comparing two samples

Type of data | Size of sample | Method
Interval | Large, >50 each sample | Normal distribution for means (§8.5, §9.7)
Interval | Small, <50 each sample, with Normal distribution and uniform variance | Two-sample t method (§10.3)
Interval | Small, <50 each sample, non-Normal | Mann–Whitney U test (§12.2)
Ordinal | Any | Mann–Whitney U test (§12.2)
Nominal, ordered | Large, n>30 | Chi-squared for trend (§13.8)
Nominal, not ordered | Large, most expected frequencies >5 | Chi-squared test (§13.1)
Nominal, not ordered | Small, more than 20% expected frequencies <5 | Reduce number of categories by combining or excluding as appropriate (§13.3)
Dichotomous | Large, all expected frequencies >5 | Comparison of two proportions (§8.6, §9.8), chi-squared test (§13.1), odds ratio (§13.7)
Dichotomous | Small, at least one expected frequency <5 | Chi-squared test with Yates' correction (§13.5), Fisher's exact test (§13.4)

Nominal data. Set the data out as a two way table as described above. The chi-squared test for a two way table is the appropriate test (§13.1). The condition for validity of the test, that at least 80% of the expected frequencies should be greater than 5, must be met by combining or deleting categories as appropriate (§13.3). If the table reduces to a 2 by 2 table without the condition being met, use Fisher's exact test.

Dichotomous data. For large samples, either present the data as two proportions and use the Normal approximation to find the confidence interval for the difference (§8.6), or set the data up as a 2 by 2 table and do a chi-squared test (§13.1). These are equivalent methods. An odds ratio can also be calculated (§13.7). If the sample is small, the fit to the Chi-squared distribution can be improved by using Yates' correction (§13.5). Alternatively, use Fisher's exact test (§13.4).
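The equivalence of the two large-sample approaches for dichotomous data can be seen for the significance test. In the sketch below the counts are hypothetical and the function names are made up for the example. It assumes that the Normal test statistic uses the pooled proportion under the null hypothesis and that the chi-squared statistic is calculated without a continuity correction; under those assumptions the square of the first equals the second.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Large-sample Normal test statistic using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_squared_2x2(x1, n1, x2, n2):
    """Chi-squared statistic (no continuity correction) for the same data as a 2 by 2 table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed[i][j] - expected) ** 2 / expected
    return total


# Hypothetical data: 30 of 100 in one group and 20 of 100 in the other.
z = two_proportion_z(30, 100, 20, 100)
chi2 = chi_squared_2x2(30, 100, 20, 100)
print(z ** 2, chi2)
```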

Table 14.2. Methods for differences in one or paired sample

Type of data | Size of sample | Method
Interval | Large, >100 | Normal distribution (§8.3)
Interval | Small, <100, Normal differences | Paired t method (§10.2)
Interval | Small, <100, non-Normal differences | Wilcoxon matched pairs test (§12.3)
Ordinal | Any | Sign test (§9.2)
Nominal, ordered | Any | Sign test (§9.2)
Nominal | Any | Stuart test (§13.9)
Dichotomous | Any | McNemar's test (§13.9)

14.4* One sample and paired samples

Methods of analysis for paired samples are summarized in Table 14.2.

Interval data. Inferences are on differences between the variable as observed on the two conditions. For large samples, say n > 100, the confidence interval for the mean difference is found using the Normal approximation (§8.3). For small samples, provided the differences are from a Normal distribution, use the paired t test (§10.2). This assumption is often very reasonable, as most of the variation between individuals is removed and random error is largely made up of measurement error. Furthermore, the error is the result of two added measurement errors and so tends to follow a Normal distribution anyway. If not, transformation of the original data will often make differences Normal (§10.4). If no assumption of a Normal distribution can be made, use the Wilcoxon signed-rank matched-pairs test (§12.3).

It is rarely asked whether there is a difference in variability in paired data. This can be tested by finding the differences between the two conditions and their sum. Then if there is no change in variance the correlation between difference and sum has expected value zero (Pitman's test). This is not obvious but it is true.

Ordinal data. If the data do not form an interval scale, as noted in §14.2 the difference between conditions is not meaningful. However, we can say what direction the difference is in, and this can be examined by the sign test (§9.2).

Ordered nominal data. Use the sign test, with changes in one direction being positive, in the other negative, no change as zero (§9.2).

Nominal data. With more than two categories, this is difficult. Use Stuart's generalization to more than two categories of McNemar's test (§13.9).

Dichotomous data. Here we are comparing the proportions of individuals in a given state under the two conditions. The appropriate test is McNemar's test (§13.9).
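A minimal sketch of McNemar's test follows, using the common form of the statistic without a continuity correction: the squared difference between the two discordant counts divided by their sum, referred to the Chi-squared distribution with one degree of freedom. The counts are hypothetical and the function name `mcnemar` is made up for the example; the exact form given in §13.9 may differ, for instance in whether a continuity correction is applied.

```python
from math import erfc, sqrt


def mcnemar(yes_no, no_yes):
    """McNemar's statistic from the two discordant counts, with its P value (1 d.f.)."""
    statistic = (yes_no - no_yes) ** 2 / (yes_no + no_yes)
    # Upper tail of chi-squared with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical paired data: 15 subjects changed one way, 5 the other.
statistic, p_value = mcnemar(15, 5)
print(statistic, p_value)
```

Only the subjects whose state differs between the two conditions contribute to the test; the concordant pairs do not enter the calculation.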

14.5* Relationship between two variables

The methods for studying relationships between variables are summarized in Table 14.3. Relationships with dichotomous variables can be studied as the difference between two groups (§14.3), the groups being defined by the two states of the dichotomous variable. Dichotomous data have been excluded from the text of this section, but are included in Table 14.3.

Interval and interval data. Two methods are used: regression and correlation. Regression (§11.2, §11.5) is usually preferred, as it gives information about the nature of the relationship as well as about its existence. Correlation (§11.9) measures the strength of the relationship. For regression, residuals about the line must follow a Normal distribution with uniform variance. For estimation, the correlation coefficient requires an assumption that both variables follow a Normal distribution, but to test the null hypothesis only one variable needs to follow a Normal distribution. If neither variable can be assumed to follow a Normal distribution or be transformed to it (§11.8), use rank correlation (§12.4, §12.5).

Interval and ordinal data. Rank correlation coefficient (§12.4, §12.5).

Interval and ordered nominal data. This can be approached by rank correlation, using Kendall's τ (§12.5) because it copes with the large number of ties better than does Spearman's ρ, or by analysis of variance as described for interval and nominal data. The latter requires an assumption of Normal distribution and uniform variance for the interval variable. These two approaches are not equivalent.

Interval and nominal data. If the interval scale follows a Normal distribution, use one-way analysis of variance (§10.9). The assumption is that within categories the interval variable is from Normal distributions with uniform variance. If this assumption is not reasonable, use Kruskal–Wallis analysis of variance by ranks (§12.2).

Ordinal and ordinal data. Use a rank correlation coefficient, Spearman's ρ (§12.4) or Kendall's τ (§12.5). Both will give very similar answers for testing the null hypothesis of no relationship in the absence of ties. For data with many ties and for comparing the strengths of different relationships, Kendall's τ is preferable.

Ordinal and ordered nominal data. Use Kendall's rank correlation coefficient, τ (§12.5).

Ordinal and nominal data. Kruskal–Wallis one-way analysis of variance by ranks (§12.2).

Ordered nominal and ordered nominal data. Use chi-squared for trend (§13.8).

Ordered nominal and nominal data. Use the chi-squared test for a two-way table (§13.1).

Nominal and nominal data. Use the chi-squared test for a two-way table (§13.1), provided the expected values are large enough. Otherwise use Yates' correction (§13.5) or Fisher's exact test (§13.4).

Table 14.3. Methods for relationships between variables

 | Interval, Normal | Interval, non-Normal | Ordinal
Interval, Normal | Regression (§11.2), correlation (§11.9) | Regression (§11.2), rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5)
Interval, non-Normal | Regression (§11.2), rank correlation (§12.4, §12.5) | Rank correlation (§12.4, §12.5) |
Ordinal | Rank correlation (§12.4, §12.5) | |
Nominal, ordered | Kendall's rank correlation (§12.5) | |
Nominal | Analysis of variance (§10.9) | Kruskal–Wallis test (§12.2) | Kruskal–Wallis test (§12.2)
Dichotomous | t test (§10.3), Normal test (§8.5, §9.7) | Large sample Normal test (§8.5, §9.7), Mann–Whitney U test (§12.2) | Mann–Whitney U test (§12.2)

 | Nominal, ordered | Nominal | Dichotomous
Interval, Normal | | Analysis of variance (§10.9) | t test (§10.3), Normal test (§8.5, §9.7)
Interval, non-Normal | | Kruskal–Wallis test (§12.2) | Large sample Normal test (§8.5, §9.7), Mann–Whitney U test (§12.2)
Ordinal | Kendall's rank correlation (§12.5) | Kruskal–Wallis test (§12.2) | Mann–Whitney U test (§12.2)
Nominal, ordered | Chi-squared test for trend (§13.8) | | Chi-squared test for trend (§13.8)
Nominal | Chi-squared test (§13.1) | Chi-squared test (§13.1) |
Dichotomous | Chi-squared test for trend (§13.8) | | Chi-squared test (§13.1, §13.5), Fisher's exact test (§13.4)

14M* Multiple choice questions 74 to 80

(Each branch is either true or false)

74. The following variables have interval scales of measurement:
(a) height;
(b) presence or absence of asthma;
(c) Apgar score;
(d) age;
(e) Forced Expiratory Volume.

75. The following methods may be used to investigate a relationship between two continuous variables:
(a) paired t test;
(b) the correlation coefficient, r;
(c) simple linear regression;
(d) Kendall's τ;
(e) Spearman's ρ.

76. When analysing nominal data the following statistical methods may be used:
(a) simple linear regression;
(b) correlation coefficient, r;
(c) paired t test;
(d) Kendall's τ;
(e) chi-squared test.

77. To compare levels of a continuous variable in two groups, possible methods include:
(a) the Mann–Whitney U test;
(b) Fisher's exact test;
(c) a t test;
(d) Wilcoxon matched-pairs signed-rank test;
(e) the sign test.

Table 14.4. Number of rejection episodes over 16 weeks following heart transplant in two groups of patients

Episodes          Group A    Group B    Total
0                   10          8         18
1                   15          6         21
2                    4          0          4
3                    3          0          3
Total patients      32         14         46

78. Table 14.4 shows the number of rejection episodes following heart transplant in two groups of patients:
(a) the rejection rates in the two populations could be compared by a Mann–Whitney U test;
(b) the rejection rates in the two populations could be compared by a two-sample t test;
(c) the rejection rates in the two populations could be compared by a chi-squared test for trend;
(d) the chi-squared test for a 4 by 2 table would not be valid;
(e) the hypothesis that the number of episodes follows a Poisson distribution could be investigated using a chi-squared test for goodness of fit.

79. Twenty arthritis patients were given either a new analgesic or aspirin on successive days in random order. The grip strength of the patients was measured. Methods which could be used to investigate the existence of a treatment effect include:
(a) Mann–Whitney U test;
(b) paired t method;
(c) sign test;
(d) Normal confidence interval for the mean difference;
(e) Wilcoxon matched-pairs signed-rank test.

80. In a study of boxers, computer tomography revealed brain atrophy in 3 of 6 professionals and 1 of 8 amateurs (Kaste et al. 1982). These groups could be compared using:
(a) Fisher's exact test;
(b) the chi-squared test;
(c) the chi-squared test with Yates' correction;
(d) *McNemar's test;
(e) the two-sample t test.

Table 14.5. Gastric pH and urinary nitrite concentrations in 26 subjects (Hall and Northfield, private communication)

pH Nitrite pH Nitrite pH Nitrite pH Nitrite

1.72 1.64 2.64 2.33 5.29 50.6 5.77 48.9

1.93 7.13 2.73 52.0 5.31 43.9 5.86 3.26

1.94 12.1 2.94 6.53 5.50 35.2 5.90 63.4

2.03 15.7 4.07 22.7 5.55 83.8 5.91 81.2

2.11 0.19 4.91 17.8 5.59 52.5 6.03 19.5

2.17 1.48 4.94 55.6 5.59 81.8

2.17 9.36 5.18 0.0 5.17 21.9

14E* Exercise: Choosing a statistical method

1. In a cross-over trial to compare two appliances for ileostomy patients, of 14 patients who received system A first, 5 expressed a preference for A, 9 for system B and none had no preference. Of the patients who received system B first, 7 preferred A, 5 preferred B and 4 had no preference. How would you decide whether one treatment was preferable? How would you decide whether the order of treatment influenced the choice?

2. Burr et al. (1976) tested a procedure to remove house-dust mites from the bedding of adult asthmatics in an attempt to improve subjects' lung function, which they measured by PEFR. The trial was a two period cross-over design, the control or placebo treatment being thorough dust removal from the living room. The means and standard errors for PEFR in the 32 subjects were:

active treatment: 335 litres/min, SE = 19.6 litres/min

placebo treatment: 329 litres/min, SE = 20.8 litres/min

differences within subjects (treatment – placebo): 6.45 litres/min, SE = 5.05 litres/min

How would you decide whether the treatment improves PEFR?

3. In a trial of screening and treatment for mild hypertension (Reader et al. 1980), 1138 patients completed the trial on active treatment, with 9 deaths, and 1080 completed on placebo, with 19 deaths. A further 583 patients allocated to active treatment withdrew, of whom 6 died, and 626 allocated to placebo withdrew, of whom 16 died during the trial period. How would you decide whether screening and treatment for mild hypertension reduces the risk of dying?

4. Table 14.5 shows the pH and nitrite concentrations in samples of gastric fluid from 26 patients. A scatter diagram is shown in Figure 14.1. How would you assess the evidence of a relationship between pH and nitrite concentration?

5. The lung function of 79 children with a history of hospitalization for whooping cough and 178 children without a history of whooping cough, taken from the same school classes, was measured. The mean transit time for the whooping cough cases was 0.49 seconds (s.d. = 0.14 seconds) and for the controls 0.47 seconds (s.d. = 0.11 seconds) (Johnston et al. 1983). How could you analyse the difference in lung function between children who had had whooping cough and those who had not? Each case had two matched controls. If you had all the data, how could you use this information?

Fig. 14.1. Gastric pH and urinary nitrite

Table 14.6. Visual acuity and results of a contrast sensitivity vision test before and after cataract surgery (Wilkins, personal communication)

Case   Visual acuity      Contrast sensitivity test
       Before   After     Before   After

1 6/9 6/9 1.35 1.50

2 6/9 6/9 0.75 1.05

3 6/9 6/9 1.05 1.35

4 6/9 6/9 0.45 0.90

5 6/12 6/6 1.05 1.35

6 6/12 6/9 0.90 1.20

7 6/12 6/9 0.90 1.05

8 6/12 6/12 1.05 1.20

9 6/12 6/12 0.60 1.05

10 6/18 6/6 0.75 1.05

11 6/18 6/12 0.90 1.05

12 6/18 6/12 0.90 1.50

13 6/24 6/18 0.45 0.75

14 6/36 6/18 0.15 0.45

15 6/36 6/36 0.45 0.60

16 6/60 6/9 0.45 1.05

17 6/60 6/12 0.30 1.05

6. Table 14.6 shows some data from a pre- and post-treatment study of cataract patients. The second number in the visual acuity score represents the size of letter which can be read at a distance of six metres, so high numbers represent poor vision. For the contrast sensitivity test, which is a measurement, high numbers represent good vision. What methods could be used to test the difference in visual acuity and in the contrast sensitivity test pre- and post-operation? What method could be used to investigate the relationship between visual acuity and the contrast sensitivity test post-operation?

Table 14.7. Asthma or wheeze by maternal age (Anderson et al. 1986)

Asthma or wheeze reported     Mother's age at child's birth
                              15–19     20–29      30+
Never                          261      4017      2146
Onset by age 7                 103       984       487
Onset from 8 to 11              27       189        95
Onset from 12 to 16             20       157        67

Table 14.8. Colon transit time (hours) in groups of mobile and immobile elderly patients (data of Dr Michael O'Connor)

Mobile patients                         Immobile patients
 8.4   21.6   45.5   62.4   68.4        15.6   38.8   54.0
14.4   25.2   48.0   66.0               24.0   42.0   54.0
19.2   30.0   50.4   66.0               24.0   43.2   57.6
20.4   36.0   60.0   66.0               32.4   47.0   58.8
20.4   38.4   60.0   67.2               34.8   52.8   62.4

n1 = 21, x̄1 = 42.57, s1 = 20.58

n1=21,[xwithbarabove]49.63,s2=16.39

7. Table 14.7 shows the relationship between age of onset of asthma in children and maternal age at the child's birth. How would you test whether these were related? The children were all born in one week in March, 1958. Apart from the possibility that young mothers in general tend to have children prone to asthma, what other possible explanations are there for this finding?

8. In a study of thyroid hormone in premature babies, we wanted to study the relationship of free T3 measured at several time points over seven days with the number of days the babies remained oxygen dependent. Some babies died, mostly within a few days of birth, and some babies went home still oxygen dependent and were not followed any longer by the researchers. How could you reduce the series of T3 measurements on a baby to a single variable? How could you test the relationship with time on oxygen?

9. Table 14.8 shows colon transit times measured in a group of elderly patients who were mobile and in a second group who were unable to move independently. Figure 14.2 shows a scatter diagram and histogram and Normal plot of residuals for these data. What two statistical approaches could be used here? Which would you prefer and why?

Fig. 14.2. Scatter plot, histogram, and Normal plot for the colon transit time data of Table 14.8




15 Clinical measurement

15.1 Making measurements

In this chapter we shall look at a number of problems associated with clinical measurement. These include how precisely we can measure, how different methods of measurement can be compared, how measurements can be used in diagnosis and how to deal with incomplete measurements of survival.

When we make a measurement, particularly a biological measurement, the number we obtain is the result of several things: the true value of the quantity we want to measure, biological variation, the measurement instrument itself, the position of the subject, the skill, experience and expectations of the observer, and even the relationship between observer and subject. Some of these factors, such as the variation within the subject, are outside the control of the observer. Others, such as position, are not, and it is important to standardize these. One which is most under our control is the precision with which we read scales and record the result. When blood pressure is measured, for example, some observers record to the nearest 5 mm Hg, others to the nearest 10 mm Hg. Some observers may record diastolic pressure at Korotkov sound four, others at five. Observers may think that as blood pressure is such a variable quantity, errors in recording of this magnitude are unimportant. In the monitoring of the individual patient, such lack of uniformity may make apparent changes difficult to interpret. In research, imprecise measurement can lead to problems in the analysis and to loss of power.

How precisely should we record data? While this must depend to some extent on the purpose for which the data are to be recorded, any data which are to be subjected to statistical analysis should be recorded as precisely as possible. A study can only be as good as the data, and data are often very costly and time-consuming to collect. The precision to which data are to be recorded and all other procedures to be used in measurement should be decided in advance and stated in the protocol, the written statement of how the study is to be carried out. We should bear in mind that the precision of recording depends on the number of significant figures (§5.2) recorded, not the number of decimal places. The observations 0.15 and 1.66 from Table 4.8, for example, are both recorded to two decimal places, but 0.15 has two significant figures and 1.66 has three. The second observation is recorded more precisely. This becomes very important when we come to analyse the data, for the data of Table 4.8 have a skew distribution which we wish to log transform. The greater imprecision of recording at the lower end of the scale is magnified by the transformation.
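The effect of the log transformation on recording precision can be illustrated with the two observations quoted above. The sketch below assumes that a value recorded to two decimal places could be any true value within half a unit in the last place (0.005) either side; the function name `log_interval_width` is made up for the example. It compares the width of that range after a natural log transformation.

```python
from math import log


def log_interval_width(recorded, half_unit=0.005):
    """Width, on the log scale, of the range of true values that round to `recorded`."""
    return log(recorded + half_unit) - log(recorded - half_unit)


w_low = log_interval_width(0.15)    # observation with two significant figures
w_high = log_interval_width(1.66)   # observation with three significant figures
ratio = w_low / w_high
print(w_low, w_high, ratio)
```

On the original scale both ranges are 0.01 wide, but on the log scale the range for 0.15 is roughly eleven times as wide as that for 1.66, close to the ratio of the two observations.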

In measurement there is usually uncertainty in the last digit. Observers will often have some values for this last digit which they record more often than others. Many observers are more likely to record a terminal zero than a nine or a one, for example. This is known as digit preference. The tendency to read blood pressure to the nearest 5 or 10 mm Hg mentioned above is an example of this. Observer training and awareness of the problem help to minimize digit preference, but if possible readings should be taken to sufficient significant figures for the last digit to be unimportant. Digit preference is particularly important when differences in the last digit are of importance to the outcome, as it might be in Table 15.1, where we are dealing with the difference between two similar numbers. Because of this it is a mistake to have one measurer take readings under one set of conditions and a second under another, as their degree of digit preference may differ. It is also important to agree the precision to which data are to be recorded and to ensure that instruments have sufficiently fine scales for the job in hand.

15.2* Repeatability and measurement error

I have already discussed some factors which may produce bias in measurements (§2.7, §2.8, §3.6). I have not yet considered the natural biological variability, in subject and in measurement method, which may lead to measurement error. ‘Error’ comes from a Latin root meaning ‘to wander’, and its use in statistics is closely related to this, as in §11.2, for example. Thus error in measurement may include the natural continual variation of a biological quantity, when a single observation will be used to characterize the individual. For example, in the measurement of blood pressure we are dealing with a quantity that varies continuously, not only from heart beat to heart beat but from day to day, season to season, and even with the sex of the measurer. The measurer, too, will show variation in the perception of the Korotkov sound and reading of the manometer. Because of this, most clinical measurements cannot be taken at face value without some consideration being given to their error.

The quantification of measurement error is not difficult in principle. To do it we need a set of replicate readings, obtained by measuring each member of a sample of subjects more than once. We can then estimate the standard deviation of repeated measurements on the same subject. Table 15.1 shows some replicated measurements of peak expiratory flow rate, made by the same observer (myself) with a Wright Peak Flow Meter. For each subject, the measured PEFR varies from observation to observation. This variation is the measurement error. We can quantify measurement error in two ways: using the standard deviation for repeated measurements on the same subject and by correlation.

Table 15.1. Pairs of readings made with a Wright Peak Flow Meter on 17 healthy volunteers

           PEFR (litres/min)               PEFR (litres/min)
Subject    First    Second     Subject     First    Second
 1          494      490         10         433      429
 2          395      397         11         417      420
 3          516      512         12         656      633
 4          434      401         13         267      275
 5          476      470         14         478      492
 6          557      611         15         178      165
 7          413      415         16         423      372
 8          442      431         17         427      421
 9          650      638

Table 15.2. Analysis of variance by subject for the PEFR data of Table 15.1

Source of variation | Degrees of freedom | Sum of squares | Mean square | Variance ratio (F) | Probability
Total | 33 | 445581.5 | | |
Between subjects | 16 | 441598.5 | 27599.9 | 117.8 |
Residual (within subjects) | 17 | 3983.0 | 234.3 | |
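The residual (within subjects) line of Table 15.2 can be reproduced directly from the pairs of readings in Table 15.1. The sketch below relies on the fact that, with two readings per subject, the within-subjects sum of squares is the sum of the squared differences divided by two, with one degree of freedom per subject. The variable names are made up for the example, and `s_w` is used here for the within-subjects standard deviation.

```python
from math import sqrt

# First and second PEFR readings (litres/min) for the 17 subjects of Table 15.1.
first = [494, 395, 516, 434, 476, 557, 413, 442, 650,
         433, 417, 656, 267, 478, 178, 423, 427]
second = [490, 397, 512, 401, 470, 611, 415, 431, 638,
          429, 420, 633, 275, 492, 165, 372, 421]

n = len(first)
differences = [x - y for x, y in zip(first, second)]

# With two readings per subject, each subject contributes d^2 / 2 to the
# within-subjects sum of squares, on one degree of freedom.
within_ss = sum(d * d for d in differences) / 2
within_ms = within_ss / n
s_w = sqrt(within_ms)
print(within_ss, round(within_ms, 1), round(s_w, 1))
```

The square root of the residual mean square is the within-subjects standard deviation, in the same units as the measurement (litres/min).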

Table 15.3. Analysis of variance by subject for the log (base e) transformed PEFR data of Table 15.1

Source of variation | Degrees of freedom | Sum of squares | Mean square | Variance ratio (F)
Total | 33 | 3.160104 | |
Subjects | 16 | 3.139249 | 0.196203 | 159.9
Residual (within subjects) | 17 | 0.020855 | 0.001227 |

We should check to see whether the error does depend on the value of the measurement, usually being larger for larger values. We can do this by plotting a scatter diagram of the absolute value of the difference (i.e. ignoring the sign) and the mean of the two observations (Figure 15.1). For the PEFR data, there is no obvious relationship. We can check this by calculating a correlation (§11.9) or rank correlation coefficient (§12.4, §12.5). For Figure 15.1 we have τ = 0.17, P = 0.3, so there is little to suggest that the measurement error is related to the size of the PEFR. Hence the coefficient of variation is not as appropriate as the within subjects standard deviation as a representation of the measurement error. For most medical measurements, the standard deviation is either independent of or proportional to the measurement and so one of these two approaches can be used.

Fig. 15.1. Absolute difference versus sum for 17 pairs of Wright Peak Flow Meter measurements

Measurement error may also be presented as the correlation coefficient between pairs of readings. This is sometimes called the reliability of the measurement, and is often used for psychological measurements using questionnaire scales. However, the correlation depends on the amount of variation between subjects. If we deliberately choose subjects to have a wide spread of possible values, the correlation will be bigger than if we take a random sample of subjects. Thus this method should only be used if we have a representative sample of the subjects in whom we are interested. The intra-class correlation coefficient (§11.13), which does not take into account the order in which observations were taken and which can be used with more than two observations per subject, is preferred for this application. Applying the method of §11.13 to Table 15.1 we get ICC = 0.98. ICC and sw are closely related, because ICC = 1 - sw²/(sb² + sw²). ICC therefore depends also on the variation between subjects, and thus relates to the population of which the subjects can be considered a random sample. Streiner and Norman (1996) give an interesting discussion.
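The relationship between ICC and the within-subjects variance can be illustrated with the mean squares in Table 15.2. The sketch below assumes the usual one-way analysis of variance estimate of the between-subjects variance, (between mean square - within mean square) divided by the number of readings per subject; this may not be exactly the calculation of §11.13, and the variable names are made up for the example.

```python
# Mean squares from the one-way analysis of variance in Table 15.2.
between_ms = 27599.9
within_ms = 234.3
m = 2  # readings per subject

s_w2 = within_ms                      # within-subjects variance
s_b2 = (between_ms - within_ms) / m   # between-subjects variance component (assumed estimator)
icc = 1 - s_w2 / (s_b2 + s_w2)
print(round(icc, 2))
```

Because the between-subjects variance appears in the denominator, the same measurement error would give a smaller ICC in a sample of subjects who were more alike.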

15.3* Comparing two methods of measurement

In clinical measurement, most of the things we want to measure, hearts, lungs, livers and so on, are deep within living bodies and out of reach. This means that many of the methods we use to measure them are indirect and we cannot be sure how closely they are related to what we really want to know. When a new method of measurement is developed, rather than compare its outcome to a set of known values we must often compare it to another method just as indirect. This is a common type of study, and one which is often badly done (Altman and Bland 1983, Bland and Altman 1986).

Table 15.4 shows measurements of PEFR by two different methods, the Wright meter data coming from Table 15.1. For simplicity, I shall use only one measurement by each method here. We could make use of the duplicate data by using the average of each pair first, but this introduces an extra stage in the calculation. Bland and Altman (1986) give details.

Table 15.4. Comparison of two methods of measuring PEFR

Subject      PEFR (litres/min)              Difference
number       Wright meter    Mini meter     Wright - mini
 1             494             512            -18
 2             395             430            -35
 3             516             520             -4
 4             434             428              6
 5             476             500            -24
 6             557             600            -43
 7             413             364             49
 8             442             380             62
 9             650             658             -8
10             433             445            -12
11             417             432            -15
12             656             626             30
13             267             260              7
14             478             477              1
15             178             259            -81
16             423             350             73
17             427             451            -24
Total                                         -36
Mean                                          -2.1
S.d.                                          38.8

The first step in the analysis is to plot the data as a scatter diagram (Figure 15.2). If we draw the line of equality, along which the two measurements would be exactly equal, this gives us an idea of the extent to which the two methods agree. This is not the best way of looking at data of this type, because much of the graph is empty space and the interesting information is clustered along the line. A better approach is to plot the difference between the methods against the sum or average. The sign of the difference is important, as there is a possibility that one method may give higher values than the other and this may be related to the true value we are trying to measure. This plot is also shown in Figure 15.2.

Two methods of measurement agree if the difference between observations on the same subject using both methods is small enough for us to use the methods interchangeably. How small this difference has to be depends on the measurement and the use to which it is to be put. It is a clinical, not a statistical, decision. We quantify the differences by estimating the bias, which is the mean difference, and the limits within which most differences will lie. We estimate these limits from the mean and standard deviation of the differences. If we are to estimate these quantities, we want them to be the same for high values and for low values of the measurement. We can check this from the plot. There is no clear evidence of a relationship between difference and mean in Figure 15.4, and we can check this by a test of significance using the correlation coefficient. We get r = 0.19, P = 0.5.

The mean difference is close to zero, so there is little evidence of overall bias.

We can find a confidence interval for the mean difference as described in §10.2. The differences have a mean $\bar{d} = -2.1$ litres/min, and a standard deviation of 38.8. The standard error of the mean is thus $s/\sqrt{n} = 38.8/\sqrt{17} = 9.41$ litres/min and the corresponding value of t with 16 degrees of freedom is 2.12. The 95% confidence interval for the bias is thus $-2.1 \pm 2.12 \times 9.41$ = -22 to +18 litres/min. Thus on the basis of these data we could have a bias of as much as 22 litres/min, which could be clinically important. The original comparison of these instruments used a much larger sample and found that any bias was very small (Oldham et al. 1979).

Fig. 15.2. PEFR measured by two different instruments, mini meter versus Wright meter and difference versus mean of mini and Wright meters

Fig. 15.3. Distribution of differences between PEFR measured by two methods

The standard deviation of the differences between measurements made by the two methods provides a good index of the comparability of the methods. If we can estimate the mean and standard deviation reliably, with small standard errors, we can then say that the difference between methods will be at most two standard deviations on either side of the mean for 95% of observations. These $\bar{d} \pm 2s$ limits for the difference are called the 95% limits of agreement. For the PEFR data, the standard deviation of the differences is estimated to be 38.8 litres/min and the mean is -2 litres/min. Two standard deviations is therefore 78 litres/min. The reading with the mini meter is expected to be 80 litres below to 76 litres above for most subjects. These limits are shown as horizontal lines in Figure 15.4. The limits depend on the assumption that the distribution of the differences is approximately Normal, which can be checked by histogram and Normal plot (§7.5) (Figure 15.3).
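The bias, its confidence interval and the 95% limits of agreement can be reproduced from Table 15.4 in a few lines. The following is a minimal sketch, not the author's own program: the function name `limits_of_agreement` is made up for illustration, and the t value of 2.12 for 16 degrees of freedom is simply taken from the text rather than computed.

```python
import math
import statistics

# PEFR (litres/min) from Table 15.4, one reading per subject on each meter.
wright = [494, 395, 516, 434, 476, 557, 413, 442, 650,
          433, 417, 656, 267, 478, 178, 423, 427]
mini = [512, 430, 520, 428, 500, 600, 364, 380, 658,
        445, 432, 626, 260, 477, 259, 350, 451]


def limits_of_agreement(a, b, t_value):
    """Summarise the differences a - b (hypothetical helper for illustration)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    bias = statistics.mean(diffs)      # mean difference
    sd = statistics.stdev(diffs)       # standard deviation of the differences
    se = sd / math.sqrt(n)             # standard error of the mean difference
    return {
        "bias": bias,
        "sd": sd,
        "se": se,
        "bias_ci": (bias - t_value * se, bias + t_value * se),
        "limits": (bias - 2 * sd, bias + 2 * sd),  # 95% limits of agreement
    }


result = limits_of_agreement(wright, mini, t_value=2.12)
for name, value in result.items():
    print(name, value)
```

Because the code works with the unrounded mean and standard deviation, its limits differ from the rounded figures in the text by less than one litre/min.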

Fig. 15.4. Difference versus sum for PEFR measured by two methods

On the basis of these data we would not conclude that the two methods are comparable or that the mini meter could reliably replace the Wright peak flow meter. As remarked in §10.2, this meter had received considerable wear.

When there is a relationship between the difference and the mean, we can try to remove it by a transformation. This is usually accomplished by the logarithm, and leads to an interpretation of the limits similar to that described in §15.2. Bland and Altman (1986, 1999) give details.

15.4 Sensitivity and specificity

One of the main reasons for making clinical measurements is to aid in diagnosis. This may be to identify one of several possible diagnoses in a patient, or to find people with a particular disease in an apparently healthy population. The latter is known as screening. In either case the measurement provides a test which enables us to classify subjects into two groups, one group whom we think are likely to have the disease in which we are interested, and another group unlikely to have the disease. When developing such a test, we need to compare the test result with a true diagnosis. The test may be based on a continuous variable and the disease indicated if it is above or below a given level, or it may be a qualitative observation such as carcinoma in situ cells on a cervical smear. In either case I shall call the test positive if it indicates the disease and negative if not, and the disease positive if the disease is later confirmed, negative if not.

How do we measure the effectiveness of the test? Table 15.5 shows three artificial sets of test and disease data. We could take as an index of test effectiveness the proportion giving the correct diagnosis from the test. For Test 1 in the example it is 94%. Now consider Test 2, which always gives a negative result. Test 2 will never detect any cases of the disease. We are now right for 95% of the subjects! However, the first test is useful, in that it detects some cases of the disease, and the second is not, so this is clearly a poor index.

Table 15.5. Some artificial test and diagnosis data

Disease      Test 1         Test 2         Test 3       Total
           +ve   -ve      +ve   -ve      +ve   -ve

Yes          4     1        0     5        2     3         5
No           5    90        0    95        0    95        95
Total        9    91        0   100        2    98       100

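There is no one simple index which enables us to compare different tests in all the ways we would like. This is because there are two things we need to measure: how good the test is at finding disease positives, i.e. those with the condition, and how good the test is at excluding disease negatives, i.e. those who do not have the condition. The indices conventionally employed to do this are:

$$\text{sensitivity} = \frac{\text{number who are both disease positive and test positive}}{\text{number who are disease positive}}$$

$$\text{specificity} = \frac{\text{number who are both disease negative and test negative}}{\text{number who are disease negative}}$$

In other words, the sensitivity is the proportion of disease positives who are test positive, and the specificity is the proportion of disease negatives who are test negative. For our three tests these are: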

Test1 Test2 Test3

Sensitivity 0.80 0.00 0.40

Specificity 0.95 1.00 1.00

Test 2, of course, misses all the disease positives and finds all the disease negatives, by saying all are negative. The difference between Tests 1 and 3 is brought out by the greater sensitivity of 1 and the greater specificity of 3. We are comparing tests in two dimensions. We can see that Test 3 is better than Test 2, because its sensitivity is higher and specificity the same. However, it is more difficult to see whether Test 3 is better than Test 1. We must come to a judgement based on the relative importance of sensitivity and specificity in the particular case.
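The three artificial tests of Table 15.5 make a convenient check on these definitions. The sketch below is illustrative only; the function name and the tuple layout (true positives, false negatives, false positives, true negatives) are choices made here, not notation from the text. It also computes the proportion correctly classified, the index the text rejects as misleading.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity, proportion correct) from the four cell counts."""
    sensitivity = tp / (tp + fn)                    # disease positives who test positive
    specificity = tn / (tn + fp)                    # disease negatives who test negative
    correct = (tp + tn) / (tp + fn + fp + tn)       # overall proportion correct
    return sensitivity, specificity, correct


# Cell counts read from Table 15.5: (tp, fn, fp, tn).
tests = {
    "Test 1": (4, 1, 5, 90),
    "Test 2": (0, 5, 0, 95),
    "Test 3": (2, 3, 0, 95),
}

summary = {name: sensitivity_specificity(*cells) for name, cells in tests.items()}
for name, (sens, spec, correct) in summary.items():
    print(f"{name}: sensitivity {sens:.2f}, specificity {spec:.2f}, correct {correct:.2f}")
```

Test 2 scores highest of the first two on proportion correct while detecting no cases at all, which is the point the text makes against that index.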

Sensitivity and specificity are often multiplied by 100 to give percentages. They are both binomial proportions, so their standard errors and confidence intervals are found as described in §8.4 and §8.8. Because the proportions are often near to 1.0, the large sample approach (§8.4) may not be valid. The exact method using the Binomial probabilities (§8.8) is preferable. Harper and Reeves (1999) point out that confidence intervals are almost always omitted in studies of diagnostic tests reported outside the major general medical journals, and recommend that they should always be given. As the reader might expect, I agree with them! The sample size required for the reliable estimation of sensitivity and specificity can be calculated as described in §18.2.

Sometimes a test is based on a continuous variable. For example, Table 15.6 shows measurements of creatine kinase (CK) in patients with unstable angina and acute myocardial infarction (AMI). Figure 15.5(a) shows a scatter plot. We wish to detect patients with AMI among patients who may have either condition and this measurement is a potential test, AMI patients tending to have high values. How do we choose the cut-off point? The lowest CK in AMI patients is 90, so a cut-off below this will detect all AMI patients. Using 80, for example, we would detect all AMI patients, sensitivity = 1.00, but would also only have 42% of angina patients below 80, so the specificity = 0.42. We can alter the sensitivity and specificity by changing the cut-off point. Raising the cut-off point will mean fewer cases will be detected and so the sensitivity will be decreased. However, there will be fewer false positives, positives on test but who do not in fact have the disease, and the specificity will be increased. For example, if CK ≥ 100 were the criterion for AMI, sensitivity would be 0.96 and specificity 0.62. There is a trade-off between sensitivity and specificity. It can be helpful to plot sensitivity against specificity to examine this trade-off. This is called a receiver operating characteristic or ROC curve. (The name comes from telecommunications.)

We often plot sensitivity against 1 - specificity, as in Figure 15.5(b). We can see from Figure 15.5(b) that we can get both high sensitivity and high specificity if we choose the right cut-off. With 1 - specificity less than 0.1, i.e. specificity greater than 0.9, we can get sensitivity greater than 0.9 also. In fact, a cut-off of 200 would give sensitivity = 0.93 and specificity = 0.91 in this sample. These estimates will be biased, because we are estimating the cut-off and testing it in the same sample. We should check the sensitivity and specificity of this cut-off in a different sample to be sure.

Table 15.6. Creatine kinase in patients with unstable angina and acute myocardial infarction (AMI) (data of Frances Boa)

Unstableangina AMI

23 48 62 83 104 130 307 90 648

33 49 63 84 105 139 351 196 894

36 52 63 85 105 150 360 302 962

37 52 65 86 107 155 311 1015

37 52 65 88 108 157 325 1143

41 53 66 88 109 162 335 1458

41 54 67 88 111 176 347 1955

41 57 71 89 114 180 349 2139

42 57 72 91 116 188 363 2200

42 58 72 94 118 198 377 3044

43 58 73 94 121 226 390 7590

45 58 73 95 121 232 398 11138

47 60 75 97 122 257 545

48 60 80 100 126 257 577

48 60 80 103 130 297 629

Fig. 15.5. Scatter diagram and ROC curve for the data of Table 15.6

The area under the ROC curve is often quoted (here it is 0.9753). It estimates the probability that a member of one population chosen at random will exceed a member of the other population, in the same way as does $U/n_1 n_2$ in the Mann–Whitney U test (§12.2). It can be useful in comparing different tests. In this study another blood test gave us an area under the ROC curve = 0.9825, suggesting that the test may be slightly better than CK.
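The effect of moving the cut-off, and the area under the ROC curve, can be checked directly from Table 15.6. The sketch below is an illustration under stated assumptions: the table is read as 93 unstable angina values followed by 27 AMI values, a test is called positive when CK is at or above the cut-off, and the area is computed by counting pairs in the Mann–Whitney manner (ties counting one half) rather than by any plotting routine. The function names are invented for this example.

```python
# CK values from Table 15.6, read column by column.
angina = [23, 33, 36, 37, 37, 41, 41, 41, 42, 42, 43, 45, 47, 48, 48,
          48, 49, 52, 52, 52, 53, 54, 57, 57, 58, 58, 58, 60, 60, 60,
          62, 63, 63, 65, 65, 66, 67, 71, 72, 72, 73, 73, 75, 80, 80,
          83, 84, 85, 86, 88, 88, 88, 89, 91, 94, 94, 95, 97, 100, 103,
          104, 105, 105, 107, 108, 109, 111, 114, 116, 118, 121, 121, 122, 126, 130,
          130, 139, 150, 155, 157, 162, 176, 180, 188, 198, 226, 232, 257, 257, 297,
          307, 351, 360]
ami = [90, 196, 302, 311, 325, 335, 347, 349, 363, 377, 390, 398, 545, 577, 629,
       648, 894, 962, 1015, 1143, 1458, 1955, 2139, 2200, 3044, 7590, 11138]


def sens_spec(cutoff, diseased, non_diseased):
    """Sensitivity and specificity when 'value >= cutoff' is a positive test."""
    sensitivity = sum(v >= cutoff for v in diseased) / len(diseased)
    specificity = sum(v < cutoff for v in non_diseased) / len(non_diseased)
    return sensitivity, specificity


def roc_area(diseased, non_diseased):
    """Proportion of (diseased, non-diseased) pairs in which the diseased value is higher."""
    wins = 0.0
    for d in diseased:
        for h in non_diseased:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(non_diseased))


for cutoff in (100, 200):
    sens, spec = sens_spec(cutoff, ami, angina)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
area = roc_area(ami, angina)
print(f"area under ROC curve {area:.4f}")
```

Counting pairs avoids building the curve at all, which is why the area can be described in the same terms as the Mann–Whitney statistic.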

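We can also estimate the positive predictive value or PPV, the probability that a subject who is test positive will be a true positive (i.e. has the disease and is correctly classified), and the negative predictive value or NPV, the probability that a subject who is test negative will be a true negative (i.e. does not have the disease and is correctly classified). These depend on the prevalence of the condition, $p_{prev}$, as well as the sensitivity, $p_{sens}$, and the specificity, $p_{spec}$. If the sample is a single group of people, we know the prevalence and can estimate PPV and NPV for this population directly as simple proportions. If we started with a sample of cases and a sample of controls, we do not know the prevalence, but we can estimate PPV and NPV for a population with any given prevalence. As described in §6.8, $p_{sens}$ is the conditional probability of a positive test given the disease, so the probability of being both test positive and disease positive is $p_{sens} \times p_{prev}$. Similarly, the probability of being both test positive and disease negative is $(1 - p_{spec}) \times (1 - p_{prev})$. The probability of being test positive is the sum of these (§6.2): $p_{sens} \times p_{prev} + (1 - p_{spec}) \times (1 - p_{prev})$ and the PPV is

$$\mathrm{PPV} = \frac{p_{sens} \times p_{prev}}{p_{sens} \times p_{prev} + (1 - p_{spec}) \times (1 - p_{prev})}$$

Similarly, the NPV is

$$\mathrm{NPV} = \frac{p_{spec} \times (1 - p_{prev})}{p_{spec} \times (1 - p_{prev}) + (1 - p_{sens}) \times p_{prev}}$$

In screening situations the prevalence is almost always small and the PPV is low. Suppose we have a fairly sensitive and specific test, $p_{sens} = 0.95$ and $p_{spec} = 0.90$, and the disease has prevalence $p_{prev} = 0.01$ (1%). Then

$$\mathrm{PPV} = \frac{0.95 \times 0.01}{0.95 \times 0.01 + (1 - 0.90) \times (1 - 0.01)} = 0.088$$

$$\mathrm{NPV} = \frac{0.90 \times (1 - 0.01)}{0.90 \times (1 - 0.01) + (1 - 0.95) \times 0.01} = 0.999$$

so only 8.8% of test positives would be true positives, but almost all test negatives would be true negatives. Most screening tests are dealing with much smaller prevalences than this, so most test positives are false positives.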
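The dependence of the predictive values on prevalence is easy to explore numerically. The following sketch assumes only the probability argument given above (the probability of a positive test is the sum of the true positive and false positive probabilities); the function name `predictive_values` is chosen here for illustration.

```python
def predictive_values(p_sens, p_spec, p_prev):
    """Return (PPV, NPV) for a given sensitivity, specificity and prevalence."""
    # Probability of testing positive: true positives plus false positives.
    p_test_pos = p_sens * p_prev + (1 - p_spec) * (1 - p_prev)
    ppv = p_sens * p_prev / p_test_pos
    # Probability of testing negative is the complement; true negatives over that.
    npv = p_spec * (1 - p_prev) / (1 - p_test_pos)
    return ppv, npv


# The screening example from the text: sensitivity 0.95, specificity 0.90, prevalence 1%.
ppv, npv = predictive_values(0.95, 0.90, 0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")

# The same test at several prevalences, to show how quickly the PPV falls.
for prevalence in (0.5, 0.1, 0.01, 0.001):
    print(prevalence, predictive_values(0.95, 0.90, prevalence))
```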

15.5 Normal range or reference interval

In §15.4 we were concerned with the diagnosis of particular diseases. In this section we look at it the other way round and ask what values measurements on normal, healthy people are likely to have. There are difficulties in doing this. Who is ‘normal’ anyway? In the UK population almost everyone has hard fatty deposits in their coronary arteries, which result in death for many of them. Very few Africans have this; they die from other causes. So it is normal in the UK to have an abnormality. We usually say that normal people are the apparently healthy members of the local population. We can draw a sample of these as described in Chapter 3 and make the measurement on them.

The next problem is to estimate the set of values. If we use the range of the observations, the difference between the two most extreme values, we can be fairly confident that if we carry on sampling we will eventually find observations outside it, and the range will get bigger and bigger (§4.7). To avoid this we use a range between two quantiles (§4.7), usually the 2.5 centile and the 97.5 centile, which is called the normal range, 95% reference range or 95% reference interval. This leaves 5% of normals outside the ‘normal range’, which is the set of values within which 95% of measurements from apparently healthy individuals will lie.

A third difficulty comes from confusion between ‘normal’ as used in medicine and ‘Normal distribution’ as used in statistics. This has led some people to develop approaches which say that all data which do not fit under a Normal curve are abnormal! Such methods are simply absurd; there is no reason to suppose that all variables follow a Normal distribution (§7.4, §7.5). The term ‘reference interval’, which is becoming widely used, has the advantage of avoiding this confusion. However, the most commonly used method of calculation rests on the assumption that the variable follows a Normal distribution.

We have already seen that in general most observations fall within two standard deviations of the mean, and that for a Normal distribution 95% are within these limits with 2.5% below and 2.5% above. If we estimate the mean and standard deviation of data from a Normal population we can estimate the reference interval as $\bar{x} - 2s$ to $\bar{x} + 2s$.

Consider the FEV1 data of Table 4.5. We will estimate the reference interval for FEV1 in male medical students. We have 57 observations, mean 4.06 and standard deviation 0.67 litres. The reference interval is thus from 2.7 to 5.4 litres. From Table 4.4 we see that in fact only one student (2%) is outside these limits, although the sample is rather small.

Hence, provided Normal assumptions hold, the standard error of the limit of the reference interval is

$$\sqrt{\frac{s^2}{n} + 4 \times \frac{s^2}{2n}} = \sqrt{\frac{3s^2}{n}}$$

Compare the serum triglyceride measurements of Table 4.8. As already noted (§4.4, §7.4), the data are highly skewed, and we cannot use the Normal method directly. If we did, the lower limit would be 0.07, well below any of the observations, and the upper limit would be 0.94, greater than which are 5% of the observations. It is possible for such data to give a negative lower limit.

Because of the obviously unsatisfactory nature of the Normal method for some data, some authors have advocated the estimation of the percentiles directly (§4.5), without any distributional assumptions. This is an attractive idea. We want to know the point below which 2.5% of values will fall. Let us simply rank the observations and find the point below which 2.5% of the observations fall. For the 282 triglycerides, the 2.5 and 97.5 centiles are found as follows. For the 2.5 centile, we find $i = q(n + 1) = 0.025 \times (282 + 1) = 7.08$. The required quantile will be between the 7th and 8th observation. The 7th is 0.21, the 8th is 0.22 so the 2.5 centile would be estimated by $0.21 + (0.22 - 0.21) \times (7.08 - 7) = 0.211$. Similarly the 97.5 centile is 1.039.

This approach gives an unbiassed estimate whatever the distribution. The log transformed triglyceride would give exactly the same results. Note that the Normal theory limits from the log transformed data are very similar. We now look at the confidence interval. The 95% confidence interval for the q quantile, here q being 0.025 or 0.975, estimated directly from the data is found by the Binomial distribution method (§8.9). For the triglyceride data, n = 282 and so for the lower limit, q = 0.025, we have

$$j = nq - 1.96\sqrt{nq(1 - q)} = 282 \times 0.025 - 1.96\sqrt{282 \times 0.025 \times 0.975}$$

$$k = nq + 1.96\sqrt{nq(1 - q)} = 282 \times 0.025 + 1.96\sqrt{282 \times 0.025 \times 0.975}$$

This gives j = 1.9 and k = 12.2, which we round up to j = 2 and k = 13. In the triglyceride data the second observation, corresponding to j = 2, is 0.16 and the 13th is 0.26. Thus the 95% confidence interval for the lower reference limit is 0.16 to 0.26. The corresponding calculation for q = 0.975 gives j = 270 and k = 281. The 270th observation is 0.96 and the 281st is 1.64, giving a 95% confidence interval for the upper reference limit of 0.96 to 1.64. These are wider confidence intervals than those found by the Normal method, those for the long tail particularly so. This method of estimating percentiles in long tails is relatively imprecise.
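The rank calculations for a directly estimated centile can be sketched as follows. This is an illustration, not the text's own algorithm: it assumes that the Binomial method of §8.9 is the large-sample form, with ranks nq ± 1.96√(nq(1 − q)) rounded up, which does reproduce the ranks quoted above. The function names are invented, and since only the 7th and 8th triglyceride values are given in the text, the interpolation step uses just those two.

```python
import math


def quantile_position(q, n):
    """Rank position i = q(n + 1) of the q quantile among n ordered observations."""
    return q * (n + 1)


def quantile_ci_ranks(q, n, z=1.96):
    """Ranks (j, k) of the observations bounding an approximate 95% CI for the q quantile."""
    centre = n * q
    half_width = z * math.sqrt(n * q * (1 - q))
    # Both ranks are rounded up, as in the worked example.
    return math.ceil(centre - half_width), math.ceil(centre + half_width)


n = 282
i = quantile_position(0.025, n)
# Interpolate between the 7th (0.21) and 8th (0.22) ordered triglyceride values.
lower_centile = 0.21 + (0.22 - 0.21) * (i - math.floor(i))
lower_ranks = quantile_ci_ranks(0.025, n)
upper_ranks = quantile_ci_ranks(0.975, n)

print(f"position {i:.3f}, 2.5 centile {lower_centile:.3f}")
print("ranks for lower limit:", lower_ranks)
print("ranks for upper limit:", upper_ranks)
```

The confidence limits themselves are then simply the observations holding those ranks in the sorted data.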

15.6* Survival data

We often have data which represent the time from some event to death, such as time from diagnosis or from entry to a clinical trial, but survival analysis does not have to be about death. In cancer studies we can use survival analysis for the time to metastasis or to local recurrence of a tumour, in a study of medical care we can use it to analyse the time to readmission to hospital, in a study of breast-feeding we could look at the age at which breast-feeding ceased or at which bottle feeding was first introduced, and in a study of the treatment of infertility we can treat the time from treatment to conception as survival data. We usually refer to the terminal event, death, conception, etc., as the endpoint.

Problems arise in the measurement of survival because often we do not know the exact survival times of all subjects. This is because some will still be surviving when we want to analyse the data. When cases have entered the study at different times, some of the recent entrants may be surviving, but only have been observed for a short time. Their observed survival time may be less than those cases admitted early in the study and who have since died. The method of calculating survival curves described below takes this into account. Observations which are known only to be greater than some value are right censored, often shortened to censored. (We get left censored data when the measurement method cannot detect anything below some cut-off value, and observations are recorded as ‘none detectable’. The rank methods in Chapter 12 are useful for such data.)

Table 15.7. Survival time in years of patients after diagnosis of parathyroid cancer

Alive    Deaths

 <1       <1
 <1        2
  1        6
  1        6
  4        7
  5        9
  6        9
  8       11
 10       14
 10
 17

Table 15.7 shows some survival data, for patients with parathyroid cancer. The survival times are recorded in completed years. A patient who survived for 6 years and then died can be taken as having lived for 6 years and then died in the seventh. In the first year from diagnosis, one patient died, two patients were observed for only part of this year, and 17 survived into the next year. The subjects who have only been observed for part of the year are censored, also called lost to follow-up or withdrawn from follow-up. (These are rather misleading names, often wrongly interpreted as meaning that these subjects have dropped out of the study. This may be the case, but most of these subjects are simply still alive and their further survival is unknown.) There is no information about the survival of these subjects after the first year, because it has not happened yet. These patients are only at risk of dying for part of the year and we cannot say that 1 out of 20 died as they may yet contribute another death in the first year. We can say that such patients will contribute half a year of risk, on average, so the number of patient years at risk in the first year is 18 (17 who survived and 1 who died) plus 2 halves for those withdrawn from follow-up, giving 19 altogether. We get an estimate of the probability of dying in the first year of 1/19, and an estimated probability of surviving of 1 - 1/19. We can do this for each year until the limits of the data are reached. We thus trace the survival of these patients estimating the probability of death or survival at each year and the cumulative probability of survival to each year. This set of probabilities is called a life table.

To carry out the calculation, we first set out for each year, $x$, the number alive at the start, $n_x$, the number withdrawn during the year, $w_x$, the number at risk, $r_x$, and the number dying, $d_x$ (Table 15.8). Thus in year 1 the number at the start is 20, the number withdrawn is 2, the number at risk $r_1 = n_1 - \frac{1}{2}w_1 = 20 - \frac{1}{2} \times 2 = 19$ and the number of deaths is 1. As there were 2 withdrawals and 1 death the number at the start of year 2 is 17. For each year we calculate the probability of dying in that year for patients who have reached the beginning of it, $q_x = d_x/r_x$, and hence the probability of surviving to the next year, $p_x = 1 - q_x$. Finally we calculate the cumulative survival probability.

Table 15.8. Life table calculation for parathyroid cancer survival

Year  Number    Withdrawn    At risk  Deaths  Prob. of  Prob. of   Cumulative
      at start  during year                   death     surviving  prob. of
                                                        year       surviving
x     nx        wx           rx       dx      qx        px         Px

 1    20        2            19       1       0.0526    0.9474     0.9474
 2    17        2            16       0       0         1          0.9474
 3    15        0            15       1       0.0667    0.9333     0.8842
 4    14        0            14       0       0         1          0.8842
 5    14        1            13.5     0       0         1          0.8842
 6    13        1            12.5     0       0         1          0.8842
 7    12        1            11.5     2       0.1739    0.8261     0.7304
 8     9        0             9       1       0.1111    0.8889     0.6493
 9     8        1             7.5     0       0         1          0.6493
10     7        0             7       2       0.2857    0.7143     0.4638
11     5        2             4       0       0         1          0.4638
12     3        0             3       1       0.3333    0.6667     0.3092
13     2        0             2       0       0         1          0.3092
14     2        0             2       0       0         1          0.3092
15     2        0             2       1       0.5000    0.5000     0.1546
16     1        0             1       0       0         1          0.1546
17     1        0             1       0       0         1          0.1546
18     1        1             0.5     0       0         1          0.1546

$r_x = n_x - \frac{1}{2}w_x$, $q_x = d_x/r_x$, $p_x = 1 - q_x$, $P_x = p_x P_{x-1}$.

For the first year, this is the probability of surviving that year, $P_1 = p_1$. For the second year, it is the probability of surviving up to the start of the second year, $P_1$, times the probability of surviving that year, $p_2$, to give $P_2 = p_2 P_1$. The probability of surviving for 3 years is similarly $P_3 = p_3 P_2$, and so on. From this life table we can estimate the five year survival rate, a useful measure of prognosis in cancer. For the parathyroid cancer, the five year survival rate is 0.8842, or 88%. We can see that the prognosis for this cancer is quite good. If we know the exact time of death or withdrawal for each subject, then instead of using fixed time intervals we use $x$ as the exact time, with a row of the table for each time when either an endpoint or a withdrawal occurs. Then $r_x = n_x$ and we can omit the $r_x = n_x - \frac{1}{2}w_x$ step.
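The life table arithmetic is a simple running product, which the following sketch reproduces from the first three numerical columns of Table 15.8. The function name and the tuple layout are chosen here for illustration; the half-year allowance for withdrawals is the one described in the text.

```python
# (number at start, withdrawn during year, deaths) for years 1 to 18, from Table 15.8.
rows = [(20, 2, 1), (17, 2, 0), (15, 0, 1), (14, 0, 0), (14, 1, 0), (13, 1, 0),
        (12, 1, 2), (9, 0, 1), (8, 1, 0), (7, 0, 2), (5, 2, 0), (3, 0, 1),
        (2, 0, 0), (2, 0, 0), (2, 0, 1), (1, 0, 0), (1, 0, 0), (1, 1, 0)]


def life_table(rows):
    """Return a list of (year, at risk, q, p, cumulative survival) tuples."""
    cumulative = 1.0
    table = []
    for year, (n_start, withdrawn, deaths) in enumerate(rows, start=1):
        at_risk = n_start - withdrawn / 2      # withdrawals count as half a year at risk
        q = deaths / at_risk                   # probability of dying in the year
        p = 1 - q                              # probability of surviving the year
        cumulative *= p                        # P_x = p_x * P_(x-1)
        table.append((year, at_risk, q, p, cumulative))
    return table


table = life_table(rows)
for year, at_risk, q, p, cumulative in table:
    print(f"{year:2d}  at risk {at_risk:4.1f}  q {q:.4f}  p {p:.4f}  P {cumulative:.4f}")

five_year_survival = table[4][4]
print(f"five year survival rate {five_year_survival:.4f}")
```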

We can draw a graph of the cumulative survival probability, the survival curve. This is usually drawn in steps, with abrupt changes in probability (Figure 15.6). This convention emphasizes the relatively poor estimation at the long survival end of the curve, where the small numbers at risk produce large steps. When the exact times of death and censoring are known, this is called a Kaplan-Meier survival curve. The times at which observations are censored may be marked by small vertical lines above the survival curve (Figure 15.7), and the number remaining at risk may be written at suitable intervals below the time axis.

The standard error and confidence interval for the survival probabilities can be found (see Armitage and Berry 1994). These are useful for estimates such as five year survival rate. They do not provide a good method for comparing survival curves, as they do not include all the data, only using those up to the chosen time. Survival curves start off together at 100% survival, possibly diverge, but eventually come together at zero survival. Thus the comparison would depend on the time chosen. Survival curves can be compared by several significance tests, of which the best known is the logrank test. This is a non-parametric test which makes use of the full survival data without making any assumption about the shape of the survival curve.

Fig. 15.6. Survival curve for parathyroid cancer patients

Table 15.9 shows the time to recurrence of gallstones following dissolution by bile acid treatment or lithotripsy. Here we shall compare the two groups defined by having single or multiple gallstones, using the logrank test. We shall look at the quantitative variables diameter of gallstone and months to dissolve in §17.9. Figure 15.7 shows the time to recurrence for subjects with single primary gallstones and multiple primary gallstones. The null hypothesis is that there is no difference in recurrence-free survival time, the alternative that there is such a difference. The calculation of the logrank test is set out in Table 15.10. For each time at which a recurrence or a censoring occurred, we have the numbers under observation in each group, $n_1$ and $n_2$, the number of recurrences, $d_1$ and $d_2$ (d for death), and the number of censorings, $w_1$ and $w_2$ (w for withdrawal). For each time, we calculate the probability of recurrence, $p_d = (d_1 + d_2)/(n_1 + n_2)$, which each subject would have if the null hypothesis were true. For each group, we calculate the expected number of recurrences, $e_1 = p_d \times n_1$ and $e_2 = p_d \times n_2$. We then calculate the numbers at risk at the next time, $n_1 - d_1 - w_1$ and $n_2 - d_2 - w_2$. We do this for each time. We then add the $d_1$ and $d_2$ columns to get the observed numbers of recurrences, and the $e_1$ and $e_2$ columns to get the numbers of recurrences expected if the null hypothesis were true.

We have observed frequencies of recurrence $d_1$ and $d_2$, and expected frequencies $e_1$ and $e_2$. Of course, $d_1 + d_2 = e_1 + e_2$, so we only need to calculate $e_1$ as in Table 15.10, and hence $e_2$ by subtraction. This shortcut only works for two groups, however, whereas the method of Table 15.10 works for any number of groups.

Table 15.9. Time to recurrence of gallstones following dissolution, whether previous gallstones were multiple, maximum diameter of previous gallstones, and months previous gallstones took to dissolve

Time Rec. Mult. Diam. Dis. Time Rec. Mult. Diam.

3 No Yes 4 10 13 No No 11

3 No No 18 3 13 No No 22

3 No Yes 5 27 13 No No 13

4 No Yes 4 4 13 Yes Yes

5 No No 19 20 14 No Yes

6 No Yes 3 10 14 No No 23

6 No Yes 4 6 14 No No 15


6 Yes Yes 5 8 16 Yes Yes

6 Yes Yes 3 18 16 No No 18

6 Yes Yes 7 9 17 No No

6 No No 25 9 17 No Yes

6 No Yes 4 6 17 No Yes

6 Yes Yes 10 38 17 Yes No

6 Yes Yes 8 15 17 No Yes

6 No Yes 4 13 18 Yes No 10


7 No Yes 3 7 18 No Yes 11

7 Yes Yes 10 48 19 No No 26

8 Yes Yes 14 29 19 No Yes 11

8 Yes No 18 14 19 Yes Yes

8 Yes Yes 6 6 20 No No 11

8 No No 15 1 20 No No 13

8 No Yes 1 12 20 No No

8 No Yes 5 6 21 No Yes 11

9 No Yes 2 15 21 No Yes 13


9 No No 19 8 22 No No 10

10 Yes Yes 14 8 22 No No 20

11 No Yes 8 12 23 No No 16

11 No No 15 15 24 No No 15

11 Yes No 5 8 24 No Yes

11 No Yes 3 6 24 No No 15


11 No Yes 4 6 25 No No 13


11 No Yes 13 18 25 No No

11 Yes No 7 8 26 No No 17


12 Yes Yes 8 12 26 Yes No 16

12 No Yes 4 6 28 No No 20

12 No Yes 4 8 28 Yes No 30

12 Yes Yes 7 19 29 No No 16

12 Yes No 7 3 29 Yes No 12

12 No Yes 5 22 29 Yes Yes 10

12 Yes No 8 1 29 No Yes


12 No No 26 4 30 No No

13 No Yes 5 6 30 Yes Yes 22

13 No No 13 6 30 Yes Yes

31 No Yes 5 6 38 No No 10

31 No No 26 3 38 Yes Yes

31 No No 7 24 38 No No

32 Yes Yes 10 12 40 No No 23

32 No Yes 5 6 41 No No 16

32 No No 4 6 41 No No

32 No No 18 10 42 No No 15

33 No No 13 9 42 No Yes 16

34 No No 15 8 42 No Yes

34 No No 20 30 42 No Yes 14

34 No Yes 15 8 43 Yes No

34 No No 27 8 44 No Yes

35 No No 6 12 44 No Yes 10

36 No No 18 5 45 No No 12


36 No Yes 5 6 48 No No 21

36 No Yes 8 17 48 No No


37 No Yes 5 7 60 Yes No 15

37 No No 19 4 61 No No 10



Fig. 15.7. Gallstone-free survival after the dissolution of single and multiple gallstones

Table 15.10. Calculation for the logrank test

Time n1 d1 w1 n2 d2 w2 pd e1

3 65 0 1 79 0 2 0.000 0.000

4 64 0 0 77 0 1 0.000 0.000

5 64 0 1 76 0 0 0.000 0.000

6 63 0 1 76 5 5 0.036 2.266

7 62 0 0 66 2 1 0.016 0.969

8 62 1 1 63 2 2 0.024 1.488

9 60 0 1 59 1 1 0.008 0.504

10 59 0 0 57 1 0 0.009 0.509

11 59 2 1 56 1 5 0.026 1.539

12 56 2 2 50 3 3 0.047 2.642

13 52 0 4 44 1 1 0.010 0.542

14 48 0 2 42 0 1 0.000 0.000

16 46 0 1 41 2 0 0.023 1.057

17 45 1 1 39 0 3 0.012 0.536

18 43 1 0 36 1 1 0.025 1.089

19 42 0 1 34 1 1 0.013 0.553

20 41 0 3 32 0 0 0.000 0.000

21 38 0 0 32 0 3 0.000 0.000

22 38 0 2 29 0 0 0.000 0.000

23 36 0 1 29 0 0 0.000 0.000

24 35 0 2 29 1 1 0.016 0.547

25 33 0 2 27 1 0 0.017 0.550

26 31 1 1 26 0 1 0.018 0.544

28 29 1 1 25 0 0 0.019 0.537

29 27 1 1 25 1 1 0.038 1.038

30 25 0 1 23 2 1 0.042 1.042

31 24 0 2 20 0 1 0.000 0.000

32 22 0 2 19 1 1 0.024 0.537

33 20 0 1 17 0 0 0.000 0.000

34 19 0 3 17 0 1 0.000 0.000

35 16 0 1 16 0 0 0.000 0.000

36 15 0 2 16 0 3 0.000 0.000

37 13 0 1 13 0 3 0.000 0.000

38 12 0 2 10 1 0 0.045 0.545

40 10 0 1 9 0 0 0.000 0.000

41 9 0 2 9 0 0 0.000 0.000

42 7 0 1 9 0 3 0.000 0.000

43 6 1 0 6 0 0 0.083 0.500

44 5 0 0 4 0 2 0.000 0.000

45 5 0 1 4 0 0 0.000 0.000

47 4 0 0 4 0 1 0.000 0.000

48 4 0 2 3 0 0 0.000 0.000

53 2 0 0 3 0 1 0.000 0.000

60 2 1 0 2 0 0 0.250 0.500

61 1 0 1 2 0 0 0.000 0.000

65 0 0 0 2 0 1 0.000 0.000

70 0 0 0 1 0 1 0.000 0.000

Total 12 27 20.032

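$p_d = (d_1 + d_2)/(n_1 + n_2)$, $e_1 = p_d n_1$, $e_2 = p_d n_2$.

We can test the null hypothesis that the risk of recurrence in any month is equal for the two populations by a chi-squared test:

$$\sum \frac{(d_i - e_i)^2}{e_i} = \frac{(12 - 20.032)^2}{20.032} + \frac{(27 - 18.968)^2}{18.968} = 6.62$$

There is one constraint, that the two observed frequencies add to the sum of the expected (i.e. the total number of recurrences), so we lose one degree of freedom, giving 2 - 1 = 1 degree of freedom. From Table 13.3, this has a probability of 0.01.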
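The logrank arithmetic can be sketched in code as follows. This is an illustration under stated assumptions: the statistic is taken to be the usual sum of (observed − expected)²/expected over the two groups, which gives the probability of 0.01 quoted in the text; the P value for one degree of freedom is obtained from the complementary error function rather than from a table; and the function names are invented here. Only three rows of Table 15.10 (months 6, 7 and 8) are used to show how the expected numbers accumulate, with the published totals used for the test itself.

```python
import math


def expected_group1(rows):
    """Sum of e1 = n1 * (d1 + d2) / (n1 + n2) over rows of (n1, d1, n2, d2)."""
    return sum(n1 * (d1 + d2) / (n1 + n2) for n1, d1, n2, d2 in rows)


def logrank_chi_squared(d1, d2, e1):
    """Chi-squared statistic and P value (1 d.f.) from observed totals and e1."""
    e2 = d1 + d2 - e1                          # expected totals must add to the observed
    chi2 = (d1 - e1) ** 2 / e1 + (d2 - e2) ** 2 / e2
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-squared with 1 d.f.
    return chi2, p_value


# Rows for months 6, 7 and 8 of Table 15.10: (n1, d1, n2, d2).
sample_rows = [(63, 0, 76, 5), (62, 0, 66, 2), (62, 1, 63, 2)]
partial_e1 = expected_group1(sample_rows)
print(f"e1 accumulated over months 6 to 8: {partial_e1:.3f}")

# Totals from the foot of Table 15.10.
chi2, p_value = logrank_chi_squared(d1=12, d2=27, e1=20.032)
print(f"chi-squared = {chi2:.2f}, P = {p_value:.3f}")
```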

Some texts describe this test differently, saying that under the null hypothesis $d_1$ is from a Normal distribution with mean $e_1$ and variance $e_1 e_2/(e_1 + e_2)$. This is algebraically identical to the chi-squared method, but only works for two groups.

The logrank test is non-parametric, because we make no assumptions about either the distribution of survival time or any difference in recurrence rates. It requires the survival or censoring times to be exact. A similar method for grouped data as in Table 15.8 is given by Mantel (1966).

The logrank test is a test of significance and, of course, an estimate of the difference is preferable if we can get one. The logrank test calculation can be used to give us one: the hazard ratio. This is the ratio of the risk of death in group 1 to the risk of death in group 2. For this to make sense, we have to assume that this ratio is the same at all times, otherwise there could not be a single estimate. (Compare the paired t method, §10.2.) The risk of death is the number of deaths divided by the population at risk, but the population keeps changing due to censoring. However, the populations at risk in the two groups are proportional to the numbers of expected deaths, $e_1$ and $e_2$. We can thus calculate the hazard ratio by

$$h = \frac{d_1/e_1}{d_2/e_2}$$

For Table 15.10, we have

$$h = \frac{12/20.032}{27/18.968} = 0.42$$

Thus we estimate the risk of recurrence with single stones to be 0.42 times the risk for multiple stones. The direct calculation of a confidence interval for the hazard ratio is tedious and I shall omit it. Altman (1991) gives details. It can also be done by Cox regression (§17.9).
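As a check on the figure of 0.42, the hazard ratio can be computed from the totals of Table 15.10. The sketch assumes the estimate is the ratio of observed to expected recurrences in group 1 divided by the same ratio in group 2, with the expected number for group 2 obtained by subtraction; the function name is illustrative.

```python
def hazard_ratio(d1, e1, d2, e2):
    """Ratio of observed/expected events in group 1 to observed/expected in group 2."""
    return (d1 / e1) / (d2 / e2)


d1, d2 = 12, 27            # observed recurrences: single and multiple gallstones
e1 = 20.032                # expected recurrences in group 1, from Table 15.10
e2 = d1 + d2 - e1          # expected recurrences in group 2, by subtraction

h = hazard_ratio(d1, e1, d2, e2)
print(f"e2 = {e2:.3f}, hazard ratio = {h:.2f}")
```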

15.7* Computer aided diagnosis

Reference intervals (§15.5) are one area where statistical methods are involved directly in diagnosis; computer aided diagnosis is another. The ‘aided’ is put in to persuade clinicians that the main purpose is not to do them out of a job, but, naturally, they have their doubts. Computer aided diagnosis is partly a statistical exercise. There are two types of computer aided diagnosis: statistical methods, where diagnosis is based on a set of data obtained from past cases, and decision tree methods, which try to imitate the thought processes of an expert in the field. We shall look briefly at each approach.

There are several methods of statistical computer aided diagnosis. One uses discriminant analysis. In this we start with a set of data on subjects whose diagnosis was subsequently confirmed, and calculate one or more discriminant functions. A discriminant function has the form:

$$\text{constant}_1 \times \text{variable}_1 + \text{constant}_2 \times \text{variable}_2 + \dots + \text{constant}_k \times \text{variable}_k$$

The constants are calculated so that the values of the functions are as similar as possible for members of the same group and as different as possible for members of different groups. In the case of only two groups, we have one discriminant function and all the subjects in one group will have high values of the function and all subjects in the other will have low values. For each new subject we evaluate the discriminant function and use it to allocate the subject to a group or diagnosis. We can estimate the probability of the subject falling in that group, and in any other. Many forms of discriminant analysis have been developed to try and improve this form of computer diagnosis, but it does not seem to make much difference which is used. Logistic regression (§17.8) can also be used.

A different approach uses Bayesian analysis. This is based on Bayes' theorem, a result about conditional probability (§6.8) which may be stated in terms of the probability of diagnosis A being true if we have observed data B, as:

$$\mathrm{PROB}(\text{diagnosis A} \mid \text{data B}) = \frac{\mathrm{PROB}(\text{data B} \mid \text{diagnosis A}) \times \mathrm{PROB}(\text{diagnosis A})}{\mathrm{PROB}(\text{data B})}$$

If we have a large data set of known diagnoses and their associated symptoms and signs, we can determine PROB(diagnosis A) easily. It is simply the proportion of times A has been diagnosed. The problem of finding the probability of a particular combination of symptoms and signs is more difficult. If they are all independent, we can say that the probability of a given symptom is the proportion of times it occurs, and the probability of the symptom for each diagnosis is found in the same way. The probability of any combination of symptoms can be found by multiplying their individual probabilities together, as described in §6.2. In practice the assumption that signs and symptoms are independent is most unlikely to be met and a more complicated analysis would be required to deal with this. However, some systems of computer aided diagnosis have been found to work quite well with the simple approach.
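The simple independence approach can be sketched in a few lines. Everything in the example is hypothetical: the diagnoses, the symptoms and all the probabilities are made up for illustration and come from no data set. The sketch also makes one choice that differs slightly from the description above: rather than estimating the probability of the symptom combination from overall symptom frequencies, it obtains the denominator by summing over the candidate diagnoses, which assumes those diagnoses are mutually exclusive and cover every possibility.

```python
def naive_bayes_posterior(priors, symptom_probs, observed):
    """Posterior probability of each diagnosis, treating symptoms as independent.

    priors:        {diagnosis: PROB(diagnosis)}
    symptom_probs: {diagnosis: {symptom: PROB(symptom present | diagnosis)}}
    observed:      {symptom: True if present, False if absent}
    """
    joint = {}
    for diagnosis, prior in priors.items():
        likelihood = 1.0
        for symptom, present in observed.items():
            p = symptom_probs[diagnosis][symptom]
            likelihood *= p if present else 1 - p   # multiply individual probabilities
        joint[diagnosis] = prior * likelihood       # numerator of Bayes' theorem
    total = sum(joint.values())                     # PROB(data), summed over diagnoses
    return {diagnosis: value / total for diagnosis, value in joint.items()}


# Hypothetical figures, invented purely for illustration.
priors = {"A": 0.2, "B": 0.8}
symptom_probs = {
    "A": {"fever": 0.9, "rash": 0.6},
    "B": {"fever": 0.3, "rash": 0.1},
}
posterior = naive_bayes_posterior(priors, symptom_probs, {"fever": True, "rash": True})
print(posterior)
```

With these invented figures the rarer diagnosis A becomes the more probable one once both symptoms are observed, because both are much commoner under A than under B.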

Expert or knowledge-based systems work in a different way. Here the knowledge of a human expert or group of experts in the field is converted into a series of decision rules, e.g. ‘if the patient has high CK then the patient has myocardial infarction, if not then on to the next decision’. These systems can be modified by asking further experts to test the system with cases from their own experience and to suggest further decision rules if the program fails. They also have the advantage that the program can ‘explain’ the reason for its ‘decision’ by listing the series of steps which led to it. Most of Chapter 14 consists of rules of just this type and could be turned into an expert system for statistical analysis.

Although there have been some impressive achievements in the field of computer diagnosis, it has to date made little progress towards acceptance in routine medical practice. As computers become more familiar to clinicians, more common in their surgeries and more powerful in terms of data storage and processing speed, we may expect computer aided diagnosis to become as well established as computer aided statistical analysis is today.

15.8* Number needed to treat

When a clinical trial has a dichotomous outcome measure, such as survival or death, there are several ways in which we can express the difference between the two treatments. These include the difference between proportions of successes, ratio of proportions (risk ratio or relative risk), and the odds ratio. The number needed to treat (NNT) is the number of patients we would need to treat with the new treatment to achieve one more success than we would on the old treatment (Laupacis et al. 1988; Cook and Sackett 1995). It is the reciprocal of the difference between the proportion of success on the new treatment and the proportion on the old treatment. For example, in the MRC streptomycin trial (Table 2.10) the survival rates after 6 months were 93% in the streptomycin group and 73% in the control group. The difference in proportions surviving was thus 0.93 - 0.73 = 0.20 and the number needed to treat to prevent one death over six months was 1/0.20 = 5. The smaller the NNT, the more effective the treatment will be. The smallest possible value for NNT is 1.0, when the proportions successful are 1.0 and 0.0. This would mean that the new treatment was always effective and the old treatment was never effective. The NNT cannot be zero. If the treatment has no effect at all, the NNT will be infinite, because the difference in the proportion of successes will be zero. If the treatment is harmful, so that the success rate is less than on the control treatment, the NNT will be negative. The number is then called the number needed to harm (NNH). This idea has caught on very quickly and has been widely used and developed, for example as the number needed to screen (Rembold 1998).

The NNT is an estimate and should have a confidence interval. This is apparently quite straightforward. We find the confidence interval for the difference in the proportions, and the reciprocals of these limits are the confidence limits for the NNT. For the MRC streptomycin trial the 95% confidence interval for the difference is 0.0578 to 0.3352, reciprocals 17.3 and 3.0. Thus the 95% confidence interval for the NNT is 3 to 17.

This is deceptively simple. As Altman (1998) pointed out, there are problems when the difference is not significant. The confidence interval for the difference between proportions includes zero, so infinity is a possible value for NNT, and negative values are also possible, i.e. the treatment may harm. The confidence interval must allow for this.

For example, Henzi et al. (2000) calculated NNT for several studies, including that of Lopez-Olaondo et al. (1996). This study compared dexamethasone against placebo to prevent postoperative nausea and vomiting. They observed nausea in 5/25 patients on dexamethasone and 10/25 on placebo. Thus the difference in proportions without nausea (success) is 0.80 - 0.60 = 0.20, 95% confidence interval -0.0479 to 0.4479 (§8.6). The number needed to treat is the reciprocal of this difference, 1/0.20 = 5.0. The reciprocals of the confidence limits are 1/(-0.0479) = -20.9 and 1/0.4479 = 2.2. But the confidence interval for the NNT is not -20.9 to 2.2. Zero, which this includes, is not a possible value for the NNT. Since there may be no treatment difference at all, zero difference between proportions, the NNT may be infinite. In fact, the confidence interval for NNT is not the values between -20.9 and 2.2, but the values outside this interval, i.e. 2.2 to infinity (number needed to achieve an extra success, NNT) and minus infinity to -20.9 (number needed to achieve an extra failure, NNH). Thus the NNT is estimated to be anything greater than 2.2, and the NNH to be anything greater than 20.9. The confidence interval is in two parts, -∞ to -20.9 and 2.2 to ∞. (‘∞’ is the symbol for infinity.) Henzi et al. (2000) quote this confidence interval as 2.2 to -21, which they say the reader should interpret as including infinity. Altman (1998) recommends ‘NNTH = 20.9 to ∞ to NNTB 2.2’, where NNTH means ‘number needed to harm’ and NNTB means ‘number needed to benefit’. I prefer ‘-∞ to -20.9, 2.2 to ∞’. Here -∞ and ∞ each tell us that it does not matter which treatment is used.
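The reasoning above can be sketched in a few lines of Python. This is only an illustration, not the method of any particular package: the function names `reciprocal` and `nnt_with_ci` are invented here, and the confidence interval for the difference uses the usual large sample standard error for two independent proportions, which is assumed to be the method of §8.6.

```python
import math

def reciprocal(value):
    """Return 1/value, treating a limit of exactly zero as an infinite NNT."""
    return math.inf if value == 0 else 1 / value

def nnt_with_ci(success_new, n_new, success_old, n_old, z=1.96):
    """NNT, the interval for the difference, and the interval(s) for the NNT."""
    p_new = success_new / n_new
    p_old = success_old / n_old
    diff = p_new - p_old
    # Large sample standard error of a difference between two proportions
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_old * (1 - p_old) / n_old)
    lower, upper = diff - z * se, diff + z * se
    nnt = reciprocal(diff)
    if lower > 0 or upper < 0:
        # The interval for the difference excludes zero: one ordinary interval
        parts = [tuple(sorted((1 / lower, 1 / upper)))]
    else:
        # The interval includes zero: the NNT interval is everything OUTSIDE
        # the reciprocals, i.e. a harm part and a benefit part
        parts = [(-reciprocal(-lower), 1 / lower if lower != 0 else -math.inf),
                 (reciprocal(upper), math.inf)]
        parts[0] = (-math.inf, parts[0][0])
    return nnt, (lower, upper), parts

# Dexamethasone example: 20/25 without nausea on treatment, 15/25 on placebo
nnt, diff_ci, nnt_ci = nnt_with_ci(20, 25, 15, 25)
print(f"NNT = {nnt:.1f}")
print(f"difference CI = {diff_ci[0]:.4f} to {diff_ci[1]:.4f}")
for low, high in nnt_ci:
    print(f"NNT interval part: {low:.1f} to {high:.1f}")
```

With the dexamethasone counts the two parts come out as minus infinity to about -20.9 and about 2.2 to infinity, which is the two-part interval described in the text.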

Two-part confidence intervals are not exactly intuitive and I think that the problems of interpretation of NNT in negative trials limit its value to being a supplementary description of trial results.

15M Multiple choice questions 81 to 86

(Each answer is true or false)

81.* The repeatability or precision of measurements may be measured by:

(a) the coefficient of variation of repeated measurements;
(b) the standard deviation of measurements between subjects;
(c) the standard deviation of the difference between pairs of measurements;
(d) the standard deviation of repeated measurements within subjects;
(e) the difference between the means of two sets of measurements on the same set of subjects.

82. The specificity of a test for a disease:

(a) has a standard error derived from the Binomial distribution;
(b) measures how well the test detects cases of the disease;
(c) measures how well the test excludes subjects without the disease;
(d) measures how often a correct diagnosis is obtained from the test;
(e) is all we need to tell us how good the test is.

83. The level of an enzyme measured in blood is used as a diagnostic test for a disease, the test being positive if the enzyme concentration is above a critical value. The sensitivity of the diagnostic test:

(a) is one minus the specificity;
(b) is a measure of how well the test detects cases of the disease;
(c) is the proportion of people with the disease who are positive on the test;
(d) increases if the critical value is lowered;
(e) measures how well people without the disease are excluded.

84. A 95% reference interval, 95% reference range, or normal range:

(a) may be calculated as two standard deviations on either side of the mean;
(b) may be calculated directly from the frequency distribution;
(c) can only be calculated if the observations follow a Normal distribution;
(d) gets wider as the sample size increases;
(e) may be calculated from the mean and its standard error.

85. If the 95% reference interval for haematocrit in men is 43.2 to 49.2:

(a) any man with haematocrit outside these limits is abnormal;
(b) haematocrits outside these limits are proof of disease;
(c) a man with a haematocrit of 46 must be very healthy;
(d) a woman with a haematocrit of 48 has a haematocrit within normal limits;
(e) a man with a haematocrit of 42 may be ill.

86.* When a survival curve is calculated from censored survival times:

(a) the estimated proportion surviving becomes less reliable as survival time increases;
(b) individuals withdrawn during the first time interval are excluded from the analysis;
(c) survival estimates depend on the assumption that survival rates remain constant over the study period;
(d) it may be that the survival curve will not reach zero survival;
(e) the five year survival rate can be calculated even if some of the subjects were identified less than five years ago.

15E Exercise: A reference interval

In this exercise we shall estimate a reference interval. Mather et al. (1979) measured plasma magnesium in 140 apparently healthy people, to compare with a sample of diabetics. The normal sample was chosen from blood donors and people attending day centres for the elderly in the area of St. George's Hospital, to give 10 male and 10 female subjects in each age decade from 15–24 to 75 years and over. Questionnaires were used to exclude any subject with persistent diarrhoea, excessive alcohol intake or who were on regular drug therapy other than hypnotics and mild analgesics in the elderly. The distribution of plasma magnesium is shown in Figure 15.8. The mean was 0.810 mmol/litre and the standard deviation 0.057 mmol/litre.

Fig. 15.8. Distribution of plasma magnesium in 140 apparently healthy people

1. What do you think of the sampling method? Why use blood donors and elderly people attending day centres?

2. Why were some potential subjects excluded? Was this a good idea? Why were certain drugs allowed for the elderly?

3. Does plasma magnesium appear to follow a Normal distribution?

4. What is the reference interval for plasma magnesium, using the Normal distribution method?

5. Find confidence intervals for the reference limits.

6. Would it matter if mean plasma magnesium in normal people increased with age? What method might be used to improve the estimate of the reference interval in this case?




16

Mortality statistics and population structure

16.1 Mortality rates

Mortality statistics are one of our principal sources of information about the changing pattern of disease within a country and the differences in disease between countries. In most developed countries, any death must be certified by a doctor, who records the cause, date and place of death and some data about the deceased. In Britain, these include the date of birth, area of residence and known occupation. These death certificates form the raw material from which mortality statistics are compiled by a national bureau of censuses, in Britain the Office for National Statistics. The numbers of deaths can be tabulated by cause, sex, age, types of occupation, area of residence, and marital status. Table 5.1 shows one such tabulation, of deaths by cause and sex.

For purposes of comparison we must relate the number of deaths to the number in the population in which they occur. We have this information fairly reliably at 10 year intervals from the decennial census of the country. We can estimate the size and age and sex structure of the population between censuses using registration of births and deaths. Each birth or death is notified to an official registrar, and so we can keep some track of changes in the population. There are other, less well documented changes taking place, such as immigration and emigration, which mean that population size estimates between the census years are only approximations. Some estimates, such as the numbers in different occupations, are so unreliable that mortality data are only tabulated by them for census years.

If we take the number of deaths over a given period of time and divide it by the number in the population and the time period, we get a mortality rate, the number of deaths per unit time per person. We usually take the number of deaths over one calendar year, although when the number of deaths is small we may take deaths over several years, to increase the precision of the numerator. The number in the population is changing continually, and we take as the denominator the estimated population at the mid-point of the time period. Mortality rates are often very small numbers, so we usually multiply them by a constant, such as 1000 or 100000, to avoid strings of zeros after the decimal point.

When we are dealing with deaths in the whole population, irrespective of age, the rate we obtain is called the crude mortality rate or crude death rate.

The terms ‘death rate’ and ‘mortality rate’ are used interchangeably. We calculate the crude mortality rate for a population as:

$$\text{crude mortality rate} = \frac{\text{deaths in period}}{\text{population at mid-period} \times \text{length of period}} \times 1000$$

If the period is in years, this gives the crude mortality rate as deaths per 1000 population per year.

Table 16.1. Age-specific mortality rates and age distribution in adult males, England and Wales, 1901 and 1981

Age group    Age-specific death rate    % adult population
(years)      per 1000 per year          in age group
             1901      1981             1901      1981
15–19          3.5       0.8            15.36     11.09
20–24          4.7       0.8            14.07      9.75
25–34          6.2       0.9            23.76     18.81
35–44         10.6       1.8            18.46     15.99
45–54         18.0       6.1            13.34     14.75
55–64         33.5      17.7             8.68     14.04
65–74         67.8      45.6             4.57     10.65
75–84        139.8     105.2             1.58      4.28
85+          276.5     226.2             0.17      0.64

The crude mortality rate is so called because no allowance is made for the age distribution of the population, so comparisons between populations with different age structures can be misleading. For example, in 1901 the crude mortality rate among adult males (aged over 15 years) in England and Wales was 15.7 per 1000 per year, and in 1981 it was 14.8 per 1000 per year. It seems strange that with all the improvements in medicine, housing and nutrition between these times there has been so little improvement in the crude mortality rate. To see why we must look at the age-specific mortality rates, the mortality rates within narrow age groups. Age-specific mortality rates are usually calculated for one, five or ten year age groups. In 1901 the age-specific mortality rate for men aged 15 to 19 was 3.5 deaths per 1000 per year, whereas in 1981 it was only 0.8. As Table 16.1 shows, the age-specific mortality rate in 1901 was greater than that in 1981 for every age group. However in 1901 there was a much greater proportion of the population in the younger age groups, where mortality was low, than there was in 1981. Correspondingly, there was a smaller proportion of the 1901 population than the 1981 population in the higher mortality older age groups. Although mortality was lower at any given age in 1981, the greater proportion of older people meant that there were almost as many deaths as in 1901.

To eliminate the effects of different age structures in the populations which we want to compare, we can look at the age-specific death rates. But if we are comparing several populations, this is a rather cumbersome procedure, and it is often more convenient to calculate a single summary figure from the age-specific rates. There are many ways of doing this, of which three are frequently used: the direct and indirect methods of age standardization and the life table.

Table 16.2. Calculation of the age standardized mortality rate by the direct method

Age group    Standard proportion    Observed mortality     a × b
(years)      in age group (a)       rate per 1000 (b)
15–19        0.1536                   0.8                  0.1229
20–24        0.1407                   0.8                  0.1126
25–34        0.2376                   0.9                  0.2138
35–44        0.1846                   1.8                  0.3323
45–54        0.1334                   6.1                  0.8137
55–64        0.0868                  17.7                  1.5364
65–74        0.0457                  45.6                  2.0839
75–84        0.0158                 105.2                  1.6622
85+          0.0017                 226.2                  0.3845
Sum                                                        7.2623

16.2 Age standardization using the direct method

I shall describe the direct method first. We use a standard population structure, i.e. a standard age distribution or set of proportions of people in each age group. We then calculate the overall mortality rate which a population with the standard age structure would have if it experienced the age-specific mortality rates of the observed population, the population whose mortality rate is to be adjusted. We shall take the 1901 population as the standard and calculate the mortality rate the 1981 population would have experienced if the age distribution were the same as in 1901. We do this by multiplying each 1981 age-specific mortality rate by the proportion in that age group in the standard 1901 population, and adding. This then gives us an average mortality rate for the whole population, the age-standardized mortality rate. For example, the 1981 mortality rate in age group 15–19 was 0.8 per 1000 per year and the proportion in the standard population in this age group is 15.36% or 0.1536. The contribution of this age group is 0.8 × 0.1536 = 0.1229. The calculation is set out in Table 16.2.

If we used the population's own proportions in each age group in this calculation we would get the crude mortality rate. Since 1901 has been chosen as the standard population, its crude mortality rate of 15.7 is also the age-standardized mortality rate. The age-standardized mortality rate for 1981 was 7.3 per 1000 men per year. We can see that there was a much higher age-standardized mortality in 1901 than 1981, reflecting the difference in age-specific mortality rates.
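The direct method is a weighted average, so it is easy to check by computer. The following minimal sketch uses the figures of Table 16.1; the function name `weighted_rate` is invented for this illustration. Weighting each year's rates by its own age distribution should reproduce the crude rates, and weighting the 1981 rates by the 1901 distribution should give the age-standardized rate.

```python
# Age-specific death rates per 1000 per year, adult males (Table 16.1)
rates_1901 = [3.5, 4.7, 6.2, 10.6, 18.0, 33.5, 67.8, 139.8, 276.5]
rates_1981 = [0.8, 0.8, 0.9, 1.8, 6.1, 17.7, 45.6, 105.2, 226.2]

# Percentage of the adult male population in each age group (Table 16.1)
pct_1901 = [15.36, 14.07, 23.76, 18.46, 13.34, 8.68, 4.57, 1.58, 0.17]
pct_1981 = [11.09, 9.75, 18.81, 15.99, 14.75, 14.04, 10.65, 4.28, 0.64]

def weighted_rate(rates, percentages):
    """Average of age-specific rates, weighted by an age distribution."""
    return sum(r * p / 100 for r, p in zip(rates, percentages))

# Own age distribution as weights gives the crude rate
crude_1901 = weighted_rate(rates_1901, pct_1901)
crude_1981 = weighted_rate(rates_1981, pct_1981)

# 1981 rates with the 1901 (standard) age distribution: direct standardization
standardized_1981 = weighted_rate(rates_1981, pct_1901)

print(f"crude 1901:        {crude_1901:.1f} per 1000 per year")
print(f"crude 1981:        {crude_1981:.1f} per 1000 per year")
print(f"standardized 1981: {standardized_1981:.1f} per 1000 per year")
```

Because the published rates and percentages are rounded, the results agree with the figures quoted in the text only to the precision shown.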

16.3 Age standardization by the indirect method

The direct method relies upon age-specific mortality rates for the observed population. If we have very few deaths, these age-specific rates will be very poorly estimated. This will be particularly so in the younger age groups, where we may even have no deaths at all. Such situations arise when considering mortality due to particular conditions or in relatively small groups, such as those defined by occupation. The indirect method of standardization is used for such data. We calculate the number of deaths we would expect in the observed population if it experienced the age-specific mortality rates of a standard population. We then compare the expected number of deaths with that actually observed.

Table 16.3. Age-specific mortality rates due to cirrhosis of the liver and age distributions of all men and medical practitioners, England and Wales, 1971

Age group    Mortality per million    Number of    Number of
(years)      men per year             men          doctors
15–24          5.859                  3584320       1080
25–34         13.050                  3065100      12860
35–44         46.937                  2876170      11510
45–54        161.503                  2965880      10330
55–64        271.358                  2756510       7790

I shall take as an example the deaths due to cirrhosis of the liver among male qualified medical practitioners in England and Wales, recorded around the 1971 census. There were 14 deaths among 43570 doctors aged below 65, a crude mortality rate of 14/43570 = 321 per million, compared to 1423 out of 15247980 adult males (aged 15–64), or 93 per million. The mortality among doctors appears high, but the medical population may be older than the population of men as a whole, as it will contain relatively few below the age of 25. Also the actual number of deaths among doctors is small and any difference not explained by the age effect may be due to chance. The indirect method enables us to test this. Table 16.3 shows the age-specific mortality rates for cirrhosis of the liver among all men aged 15 to 65, and the number of men estimated in each ten-year age group, for all men and for doctors. We can see that the two age distributions do appear to be different.

The calculation of the expected number of deaths is similar to the direct method, but different populations and rates are used. For each age group, we take the number in the observed population, and multiply it by the standard age-specific mortality rate, which would be the probability of dying if mortality in the observed population were the same as that in the standard population. This gives us the number we would expect to die in this age group in the observed population. We add these over the age groups and obtain the expected number of deaths. The calculation is set out in Table 16.4.

The expected number of deaths is 4.4965, which is considerably less than the 14 observed. We usually express the result of the calculation as the ratio of observed to expected deaths, called the standardized mortality ratio or SMR. Thus the SMR for cirrhosis among doctors is

$$\text{SMR} = \frac{\text{observed deaths}}{\text{expected deaths}} = \frac{14}{4.4965} = 3.11$$

We usually multiply the SMR by 100 to get rid of the decimal point, and report the SMR as 311. If we do not adjust for age at all, the ratio of the crude death rates is 3.44, compared to the age adjusted figure of 3.11, so the adjustment has made some, but not much, difference in this case.
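The indirect calculation can be sketched as follows, using the standard rates and numbers of doctors from Table 16.3. The variable names are invented for this illustration.

```python
# Standard population: cirrhosis deaths per million men per year (Table 16.3)
standard_rate_per_million = [5.859, 13.050, 46.937, 161.503, 271.358]

# Observed population: number of doctors in each age group (Table 16.3)
doctors = [1080, 12860, 11510, 10330, 7790]

observed_deaths = 14

# Expected deaths: standard rate (as a probability) times number at risk,
# summed over the age groups
expected_deaths = sum(
    rate / 1_000_000 * n for rate, n in zip(standard_rate_per_million, doctors)
)

# Standardized mortality ratio, multiplied by 100 as is conventional
smr = 100 * observed_deaths / expected_deaths

print(f"expected deaths = {expected_deaths:.4f}")
print(f"SMR = {smr:.0f}")
```

Summing unrounded products gives an expected number which may differ from the tabulated total in the fourth decimal place, since the table adds rounded entries.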

Table 16.4. Calculation of the expected number of deaths due to cirrhosis of the liver among practitioners, using the indirect method

Age group    Standard mortality    Observed population      a × b
(years)      rate (a)              number of doctors (b)
15–24        0.000005859             1080                   0.0063
25–34        0.000013050            12860                   0.1678
35–44        0.000046937            11510                   0.5402
45–54        0.000161503            10330                   1.6683
55–64        0.000271358             7790                   2.1139
Total                                                       4.4965

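We can calculate a confidence interval for the SMR quite easily. Denote the observed deaths by O and expected by E. It is reasonable to suppose that the deaths are independent of one another and happening randomly in time, so the observed number of deaths is from a Poisson distribution (§6.7). The standard deviation of this Poisson distribution is the square root of its mean and so can be estimated by the square root of the observed deaths, √O. The expected number is calculated from a very much larger sample and is so well estimated it can be treated as a constant, so the standard deviation of 100 × O/E, which is the standard error of the SMR, is estimated by 100 × √O/E. Provided the number of deaths is large enough, say more than 10, an approximate 95% confidence interval is given by

$$100 \times \frac{O}{E} - 1.96 \times 100 \times \frac{\sqrt{O}}{E} \quad \text{to} \quad 100 \times \frac{O}{E} + 1.96 \times 100 \times \frac{\sqrt{O}}{E}$$

For the cirrhosis data the formula gives

$$100 \times \frac{14}{4.4965} - 1.96 \times 100 \times \frac{\sqrt{14}}{4.4965} \quad \text{to} \quad 100 \times \frac{14}{4.4965} + 1.96 \times 100 \times \frac{\sqrt{14}}{4.4965}$$

$$= 311 - 163 \text{ to } 311 + 163 = 148 \text{ to } 474$$

The confidence interval clearly excludes 100 and the high mortality cannot be ascribed to chance.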

For small observed frequencies tables based on the exact probabilities of the Poisson distribution are available (Pearson and Hartley 1970). The calculations are easily done by computer and my free program Clinstat (§1.3) does them. There is also an exact method for comparing two SMRs, which Clinstat does. For the cirrhosis data the exact 95% confidence interval is 170 to 522. This is not quite the same as the large sample approximation. Better approximations and exact methods of calculating confidence intervals are described by Morris and Gardner (1989) and Breslow and Day (1987).
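Both intervals can be sketched with the standard library alone. This is an illustration under stated assumptions, not a description of how Clinstat works: the exact limits are taken to be the Poisson means for which the observed count has a tail probability of 0.025 on each side, found here by simple bisection, and the helper names `poisson_cdf`, `bisect` and `smr_intervals` are invented. The sketch assumes at least one observed death.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson variable with mean mu (0 if k < 0)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def bisect(f, lo, hi, tol=1e-9):
    """Find a root of a monotonic function f lying between lo and hi."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def smr_intervals(observed, expected, z=1.96, alpha=0.05):
    """SMR with large sample and exact Poisson 95% intervals (observed >= 1)."""
    smr = 100 * observed / expected
    se = 100 * math.sqrt(observed) / expected
    approx = (smr - z * se, smr + z * se)
    top = observed + 10 * math.sqrt(observed) + 10  # generous upper bracket
    # Lower limit: mean for which P(X >= observed) = alpha/2
    mu_lo = bisect(lambda m: 1 - poisson_cdf(observed - 1, m) - alpha / 2, 0.0, top)
    # Upper limit: mean for which P(X <= observed) = alpha/2
    mu_hi = bisect(lambda m: poisson_cdf(observed, m) - alpha / 2, 0.0, top)
    exact = (100 * mu_lo / expected, 100 * mu_hi / expected)
    return smr, approx, exact

smr, approx, exact = smr_intervals(14, 4.4965)
print(f"SMR = {smr:.0f}")
print(f"large sample interval: {approx[0]:.0f} to {approx[1]:.0f}")
print(f"exact interval:        {exact[0]:.0f} to {exact[1]:.0f}")
```

The exact interval is not symmetrical about the SMR, which is why it differs from the large sample approximation when the number of deaths is small.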

We can also test the null hypothesis that in the population the SMR = 100. If the null hypothesis is true, O is from a Poisson distribution with mean E and hence standard deviation √E, provided the sample is large enough, say E > 10. Then (O - E)/√E would be an observation from the Standard Normal distribution if the null hypothesis were true. The sample of doctors is too small for this test to be reliable, but if it were, we would have (O - E)/√E = (14 - 4.4965)/√4.4965 = 4.48, P < 0.0001. Again, there is an exact method. This gives P = 0.0005. As so often happens, large sample methods become too liberal and give P values which are too small when used with samples which are too small for the test to be valid.
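The comparison of the two tests can be sketched as below. The exact P value here is taken, as an assumption, to be twice the Poisson probability of observing 14 or more deaths when the mean is E; other definitions of a two-sided exact P value exist and may give slightly different answers. The function name `poisson_upper_tail` is invented for this illustration.

```python
import math

observed, expected = 14, 4.4965

# Large sample test: (O - E)/sqrt(E) referred to the Standard Normal
z = (observed - expected) / math.sqrt(expected)
p_normal = math.erfc(abs(z) / math.sqrt(2))  # two-sided Normal probability

def poisson_upper_tail(k, mu):
    """P(X >= k) for a Poisson variable with mean mu."""
    term = math.exp(-mu)
    below = 0.0
    for i in range(k):
        below += term          # adds P(X = i)
        term *= mu / (i + 1)   # moves on to P(X = i + 1)
    return 1 - below

# Exact test (assumed definition): double the one-sided tail probability
p_exact = min(1.0, 2 * poisson_upper_tail(observed, expected))

print(f"z = {z:.2f}, large sample P = {p_normal:.6f}")
print(f"exact P = {p_exact:.4f}")
```

The large sample P value is much smaller than the exact one, which illustrates the point that the Normal approximation is too liberal here.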

The highly significant difference suggests that doctors are at increased risk of death from cirrhosis of the liver, compared to employed men as a whole. The news is not all bad for medical practitioners, however. Their SMR for cancer of the trachea, bronchus and lung is only 32. Doctors may drink, but they do not smoke!

16.4 Demographic life tables

We have already discussed a use of the life table technique for the analysis of clinical survival data (§15.6). The life table was found by following the survival of a group of subjects from some starting point to death. In demography, which means the study of human populations, this longitudinal method of analysis is impractical, because we could only study people born more than 100 years ago. Demographic life tables are generated in a different way, using a cross-sectional approach. Rather than charting the progress of a group from birth to death, we start with the present age-specific mortality rates. We then calculate what would happen to a cohort of people from birth if these age-specific mortality rates applied unchanged throughout their lives. We denote the probability of dying between ages x and x+1 years (the age-specific mortality rate at age x) by $q_x$. As in Table 15.8, the probability of surviving from age x to x+1 is $p_x = 1 - q_x$. We now suppose that we have a cohort of size $l_0$ at age 0, i.e. at birth. $l_0$ is usually 100000 or 10000. The number who would still be alive after x years is $l_x$. We can see that the number alive after x+1 years is $l_{x+1} = p_x \times l_x$, so given all the $p_x$ from x = 0 onwards we can calculate the $l_x$. The cumulative survival probability to age x is then $P_x = l_x / l_0$.
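The recurrence for the number alive at each age is easily sketched in code. The function name `survivors` is invented for this illustration, and the four probabilities of dying are those printed for ages 0 to 3 in English Life Table Number 11 for males.

```python
def survivors(q, l0=100_000):
    """Numbers alive at ages 0, 1, 2, ... given the probabilities of dying q.

    q[x] is the probability of dying between ages x and x + 1, so the
    number alive at age x + 1 is (1 - q[x]) times the number alive at x.
    """
    alive = [float(l0)]
    for qx in q:
        alive.append(alive[-1] * (1 - qx))
    return alive

# Probabilities of dying at ages 0 to 3, males, 1950-52
q = [0.03266, 0.00241, 0.00141, 0.00102]

l = survivors(q)
for age, lx in enumerate(l):
    # Cumulative survival probability to age x is l_x / l_0
    print(f"age {age}: l_x = {lx:8.0f}, P_x = {lx / l[0]:.5f}")
```

The values should approximately reproduce the published numbers alive at ages 1 to 4, apart from small differences due to the rounding of the published probabilities.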

Table 16.5 shows an extract from Life Table Number 11, 1950–52, for England and Wales. With the exception of 1941, a life table like this has been produced every 10 years since 1871, based on the decennial census year. The life table is based on the census year because only then do we have a good measure of the number of people at each age, the denominator in the calculation of $q_x$. A three year period is used to increase the number of deaths for a year of age and so improve the estimation of $q_x$. Separate tables are produced for males and females because the mortality of the two sexes is very different. Age-specific death rates are higher in males than females at every age. Between census years life tables are still produced but are only published in an abridged form, giving $l_x$ at five year intervals only after age five (Table 16.6).

Table 16.5. Extract from English Life Table Number 11, 1950–52, Males

Age in    Expected number    Probability an individual      Expected life at
years     alive at age x     dies between ages x and x+1    age x years
x         l_x                q_x                            e_x
0         100000             0.03266                        66.42
1          96734             0.00241                        67.66
2          96501             0.00141                        66.82
3          96365             0.00102                        65.91
4          96267             0.00084                        64.98
.         .                  .                              .
.         .                  .                              .
.         .                  .                              .
100           23             0.44045                         1.67
101           13             0.45072                         1.62
102            7             0.46011                         1.58
103            4             0.46864                         1.53
104            2             0.47636                         1.50

The final column in Tables 16.5 and 16.6 is the expected life, expectation of life or life expectancy, $e_x$. This is the average life still to be lived by those reaching age x. We have already calculated this as the expected value of the probability distribution of year of death (§6E). We can do the calculation in a number of other ways. For example, if we add $l_{x+1}$, $l_{x+2}$, $l_{x+3}$, etc. we will get the total number of years to be lived, because the $l_{x+1}$ who survive to x+1 will have added $l_{x+1}$ years to the total, the $l_{x+2}$ of these who survive from x+1 to x+2 will add a further $l_{x+2}$ years, and so on. If we divide this sum by $l_x$ we get the average number of whole years to be lived. If we then remember that people do not die on their birthdays, but scattered throughout the year, we can add half to allow for the average of half a year lived in the year of death. We thus get

$$e_x = \frac{1}{2} + \frac{1}{l_x} \sum_{i=x+1}^{\infty} l_i$$

i.e. summing the $l_i$ from age x+1 to the end of the life table.
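As a rough sketch of this calculation, the function below (its name `expectation_of_life` is invented here) is applied to the numbers alive at ages 100 to 104 printed in the extract from English Life Table Number 11. The extract stops at age 104 whereas the full table continues, so the few survivors beyond 104 are missing from the sum and the result must fall a little below the published figure of 1.67 years.

```python
def expectation_of_life(alive, index=0):
    """Half a year plus the later numbers alive divided by the number now.

    alive[index] is l_x; alive[index + 1:] should run to the end of the
    life table for the answer to be complete.
    """
    return 0.5 + sum(alive[index + 1:]) / alive[index]

# Numbers alive at ages 100, 101, 102, 103 and 104 (truncated extract)
alive_from_100 = [23, 13, 7, 4, 2]

e_100 = expectation_of_life(alive_from_100)
print(f"approximate expectation of life at age 100: {e_100:.2f} years")
```

Given a complete column of numbers alive, the same function would give the expectation of life at any age by changing `index`.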

If many people die in early life, with high age-specific death rates for children, this has a great effect on expectation of life at birth. Table 16.7 shows expectation of life at selected ages from four English Life Tables (Office for National Statistics 1997). In 1991, for example, expectation of life at birth for males was 74 years, compared to only 40 years in 1841, an improvement of 34 years. However expectation of life at age 45 in 1991 was 31 years compared to 23 years in 1841, an improvement of only 8 years. At age 65, male expectation of life was 11 years in 1841 and 14 years in 1991, an even smaller change. Hence the change in life expectancy at birth was due to changes in mortality in early life, not late life.

Table 16.6. Abridged Life Table 1988–90, England and Wales

Age Males Females

x lx ex lx ex

0 10000 73.0 10000 78.5

1 9904 72.7 9928 78.0

2 9898 71.7 9922 77.1

3 9893 70.8 9919 76.1

4 9890 69.8 9916 75.1

5 9888 68.8 9914 74.2

10 9877 63.9 9907 69.2

15 9866 58.9 9899 64.3

20 9832 54.1 9885 59.4

25 9790 49.3 9870 54.4

30 9749 44.5 9852 49.5

35 9702 39.7 9826 44.6

40 9638 35.0 9784 39.8

45 9542 30.3 9718 35.1

50 9375 25.8 9607 30.5

55 9097 21.5 9431 26.0

60 8624 17.5 9135 21.7

65 7836 14.0 8645 17.8

70 6689 11.0 7918 14.2

75 5177 8.4 6869 11.0

80 3451 6.4 5446 8.2

85 1852 4.9 3659 5.9

There is a common misconception that a life expectancy at birth of 40 years, as in 1841, meant that most people died about age 40. For example (Rowe 1992):

Mothers have always provoked rage and resentment in their adult daughters, while the adult daughters have always provoked anguish and guilt in their mothers. In past centuries, however, such matched misery did not last for long. Daughters could bury their rage and resentment under a concern for duty while they cared for their mothers who, turning 40, rapidly aged, grew frail and died. Now mothers turning 40 are strong and healthy, and only halfway through their lives.

This is absurd. As Table 16.7 shows, since life expectancy was first estimated women turning 40 have had average remaining lives of more than 20 years. They did not rapidly age, grow frail, and die.

‘Expectation’ is used in its statistical sense of the average of a distribution. It does not mean that each person can know when they will die. From the most recent life table for England and Wales, for 1994–96 (Office for National Statistics 1998a), a man aged 53 (myself, for example) has a life expectancy of 24 years. This is the average lifetime which all men aged 53 years would have if the present age-specific mortality rates do not change. (These should go down over time, putting life-spans up.) About half of these men will have shorter lives and half longer. If we could calculate life expectancies for men with different combinations of risk factors, we might find that my life expectancy would be decreased because I am short (so unfair I think) and fat and increased because I do not smoke (like almost all medical statisticians) and am of professional social class. However my expectation of life was adjusted, it would remain an average, not a guaranteed figure for me.

Table 16.7. Life expectancy in 1841, 1901, 1951, and 1991, England and Wales

Age Sex Expectation of life in years

1841 1901 1951 1991

Birth Males 40 49 66 74

Females 42 52 72 79

15yrs Males 43 47 54 59

Females 44 50 59 65

45yrs Males 23 23 27 31

Females 24 26 31 36

65yrs Males 11 11 12 14

Females 12 12 14 18

Life tables have a number of uses, both medical and non-medical. Expectation of life provides a useful summary of mortality without the need for a standard population. The table enables us to predict the future size and age structure of a population given its present state, called a population projection. This can be very useful in predicting such things as the future requirement for geriatric beds in a health district. Life tables are also invaluable in non-medical applications, such as the calculation of insurance premiums, pensions and annuities.

The main difficulty with prediction from a life table is finding a table which applies to the populations under consideration. For the general population of, say, a health district, the national life table will usually be adequate, but for special populations this may not be the case. If we want to predict the future need for care of an institutionalized population, such as in a long stay psychiatric hospital or old people's home, the mortality may be considerably greater than that in the general population. Predictions based on the national life table can only be taken as a very rough guide. If possible life tables calculated on that type of population should be used.

16.5 Vital statistics

We have seen a number of occasions where ordinary words have been given quite different meanings in statistics from those they have in common speech; ‘Normal’ and ‘significant’ are good examples. ‘Vital statistics’ is the opposite, a technical term which has acquired a completely unrelated popular meaning. As far as the medical statistician is concerned, vital statistics have nothing to do with the dimensions of female bodies. They are the statistics relating to life and death: birth rates, fertility rates, marriage rates and death rates. I have already mentioned crude mortality rate, age-specific mortality rates, age-standardized mortality rate, standardized mortality ratio, and expectation of life. In this section I shall define a number of other statistics which are often quoted in the medical literature.

The infant mortality rate is the number of deaths under one year of age divided by the number of live births, usually expressed as deaths per 1000 live births. The neonatal mortality rate is the same thing for deaths in the first 4 weeks of life. The stillbirth rate is the number of stillbirths divided by the total number of births, live and still. A stillbirth is a child born dead after 28 weeks gestation. The perinatal mortality rate is the number of stillbirths and deaths in the first week of life divided by the total births, again usually presented per 1000 births. Infant and perinatal mortality rates are regarded as particularly sensitive indicators of the health status of the population. The maternal mortality rate is the number of deaths of mothers ascribed to problems of pregnancy and birth, divided by the total number of births. The birth rate is the number of live births per year divided by the total population. The fertility rate is the number of live births per year divided by the number of women of childbearing age, taken as 15–44 years.

The attack rate for a disease is the proportion of people exposed to infection who develop the disease. The case fatality rate is the proportion of people with the disease who die from it. The prevalence of a disease is the proportion of people who have it at one point in time. The incidence is the number of new cases in one year divided by the number at risk.
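The main practical point in these definitions is that the denominators differ: live births for the infant and neonatal rates, all births for the stillbirth and perinatal rates. A small sketch may make this concrete. The counts below are entirely hypothetical, invented for illustration, and do not describe any real district; the function name `per_1000` is also invented.

```python
def per_1000(numerator, denominator):
    """Express a proportion as a rate per 1000."""
    return 1000 * numerator / denominator

# HYPOTHETICAL counts for one year in an imaginary district
live_births = 5000
stillbirths = 25
deaths_in_first_week = 15
deaths_in_first_4_weeks = 20
deaths_under_one_year = 30

total_births = live_births + stillbirths

# Denominator is live births
infant_mortality = per_1000(deaths_under_one_year, live_births)
neonatal_mortality = per_1000(deaths_in_first_4_weeks, live_births)

# Denominator is all births, live and still
stillbirth_rate = per_1000(stillbirths, total_births)
perinatal_mortality = per_1000(stillbirths + deaths_in_first_week, total_births)

print(f"infant mortality rate:    {infant_mortality:.1f} per 1000 live births")
print(f"neonatal mortality rate:  {neonatal_mortality:.1f} per 1000 live births")
print(f"stillbirth rate:          {stillbirth_rate:.1f} per 1000 total births")
print(f"perinatal mortality rate: {perinatal_mortality:.1f} per 1000 total births")
```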

16.6 The population pyramid

The age distribution of a population can be presented as a histogram, using the methods of §4.3. However, because the mortality of males and females is so different the age distributions for males and females are also different. It is usual to present the age distributions for the two sexes separately. Figure 16.1 shows the age distributions for the male and female populations of England and Wales in 1901. Now, these histograms have the same horizontal scale. The conventional way to display them is with the age scale vertically and the frequency scale horizontally as in Figure 16.2. The frequency scale has zero in the middle and increases to the right for females and to the left for males. This is called a population pyramid, because of its shape.

Figure 16.3 shows the population pyramid for England and Wales in 1991. The shape is quite different. Instead of a triangle we have an irregular figure with almost vertical sides which begin to bend very sharply inwards at about age 65. The post-war and 1960s baby booms can be seen as bulges at ages 25–30 and 40–45. A major change in population structure has taken place, with a vast increase in the proportion of elderly. This has major implications for medicine, as the care of the elderly has become a large proportion of the work of doctors, nurses and their colleagues. It is interesting to see how this has come about.

It is popularly supposed that people are now living much longer as a result of modern medicine, which prevents deaths in middle life. This is only partly true.

Fig. 16.1. Age distributions for the population of England and Wales, by sex, 1901

Fig. 16.2. Population pyramid for England and Wales, 1901

Fig. 16.3. Population pyramid for England and Wales, 1991

As Table 16.7 shows, life expectancy at birth increased dramatically between 1901 and 1991, but the increase in later life is much less. The change is not an extension of every life by 25 years, which would be seen at every age, but mainly a reduction in mortality in childhood and early adulthood. Mortality in later life has changed relatively little. Now, a big reduction in mortality in childhood would result in an increase in the base part of the pyramid, as more children survived, unless there was a corresponding fall in the number of babies being born. In the 19th century, women were having many children and despite the high mortality in childhood the number who survived into adulthood to have children of their own exceeded that of their own parents. The population expanded and this history is embodied in the 1901 population pyramid. In the 20th century, infant mortality fell and people responded to this by having fewer children. In 1841–45, the infant mortality rate was 148 per 1000 live births, 138 in 1901–05, 10 in 1981–85 (OPCS 1992) and only 5.9 in 1997 (Office for National Statistics 1999). The birth rate was 32.2 per 1000 population per year in 1841–45, in 1901–05 it was 28.2, and in 1987–97 it was 13.5 (Office for National Statistics 1998b). The base of the pyramid ceased to expand. As those who were in the base of the 1901 pyramid grew older, the population in the top half of the pyramid increased. The survivors of the 0–4 age group in the 1901 pyramid are the 90+ age group in the 1991 pyramid. Had the birth rate not fallen, the population would have continued to expand and we would have as great or greater a proportion of young people in 1991 as we did in 1901, and a vastly larger population. Thus the increase in the proportion of the elderly is not primarily because adult lives have been extended, although this has a small effect, but because fertility has declined. Life expectancy for the elderly has changed relatively little. Most developed countries have stable population pyramids like Figure 16.3 and those of most developing countries have expanding pyramids like Figure 16.2.


16M Multiple choice questions 87 to 92

(Each answer is true or false)

87. Age-specific mortality rate:

(a) is a ratio of observed to expected deaths;
(b) can be used to compare mortality between different age groups;
(c) is an age adjusted mortality rate;
(d) measures the number of deaths in a year;
(e) measures the age structure of the population.

88. Expectation of life:

(a) is the number of years most people live;
(b) is a way of summarizing age-specific death rates;
(c) is the expected value of a particular probability distribution;
(d) varies with age;
(e) is derived from life tables.

89. In a study of post-natal suicide (Appleby 1991), the SMR for suicide among women who had just had a baby was 17 with a 95% confidence interval 14 to 21 (all women = 100). For women who had had a stillbirth, the SMR was 105 (95% confidence interval 31 to 277). We can conclude that:

(a) women who had just had a baby were less likely to commit suicide than other women of the same age;
(b) women who had just had a stillbirth were less likely to commit suicide than other women of the same age;
(c) women who had just had a live baby were less likely to commit suicide than women of the same age who had had a stillbirth;
(d) it is possible that having a stillbirth increases the risk of suicide;
(e) suicidal women should have babies.

90. In 1971, the SMR for cirrhosis of the liver for men was 773 for publicans and innkeepers and 25 for window cleaners, both being significantly different from 100 (Donnan and Haskey 1977). We can conclude that:

(a) publicans are more than 7 times as likely as the average person to die from cirrhosis of the liver;
(b) the high SMR for publicans may be because they tend to be found in the older age groups;
(c) being a publican causes cirrhosis of the liver;
(d) window cleaning protects men from cirrhosis of the liver;
(e) window cleaners are at high risk of cirrhosis of the liver.

91. The age and sex structure of a population may be described by:

(a) a life table;
(b) a correlation coefficient;
(c) a standardized mortality ratio;
(d) a population pyramid;
(e) a bar chart.

92. The following statistics are adjusted to allow for the age distribution of the population:

(a) age-standardized mortality rate;
(b) fertility rate;
(c) perinatal mortality rate;
(d) crude mortality rate;
(e) expectation of life at birth.

16E Exercise: Deaths from volatile substance abuse

Anderson et al. (1985) studied mortality associated with volatile substance abuse (VSA), often called glue sniffing. In this study all known deaths associated with VSA from 1971 to 1983 inclusive were collected, using sources including three press cuttings agencies and a six-monthly systematic survey of all coroners. Cases were also notified by the Office of Population Censuses and Surveys for England and Wales and by the Crown Office and procurators fiscal in Scotland.

Table 16.8 shows the age distribution of these deaths for Great Britain and for Scotland alone, with the corresponding age distributions at the 1981 decennial census.

1.Calculateage-specificmortalityratesforVSAperyearandforthewholeperiod.Whatisunusualabouttheseage-specificmortalityrates?

2. Calculate the SMR for VSA deaths for Scotland.

3. Calculate the 95% confidence interval for this SMR.

4. Does the number of deaths in Scotland appear particularly high? Apart from a lot of glue sniffing, are there any other factors which should be considered as possible explanations for this finding?

Table 16.8. Volatile substance abuse mortality and population size, Great Britain and Scotland, 1971–83 (Anderson et al. 1985)

Age group     Great Britain                   Scotland
(years)       VSA deaths    Population        VSA deaths    Population
                            (thousands)                     (thousands)
0–9             0            6770               0             653
10–14          44            4271              13             425
15–19         150            4467              29             447
20–24          45            3959               9             394
25–29          15            3616               0             342
30–39           8            7408               0             659
40–49           2            6055               0             574
50–59           7            6242               0             579
60+             4           10769               0             962



17

Multifactorial methods

17.1* Multiple regression

In Chapters 10 and 11 we looked at methods of analysing the relationship between a continuous outcome variable and a predictor. The predictor could be quantitative, as in regression, or qualitative, as in one-way analysis of variance. In this chapter we shall look at the extension of these methods to more than one predictor variable, and describe related methods for use when the outcome is dichotomous or censored survival data. These methods are very difficult to do by hand and computer programs are always used. I shall omit the formulae.

Table 17.1 shows the ages, heights and maximum voluntary contraction of the quadriceps muscle (MVC) in a group of male alcoholics. The outcome variable is MVC. Figure 17.1 shows the relationship between MVC and height. We can fit a regression line of the form MVC = a + b × height (§11.2–3). This enables us to predict what the mean MVC would be for men of any given height. But MVC varies with other things beside height. Figure 17.2 shows the relationship between MVC and age.

Table 17.1. Maximum voluntary contraction (MVC) of quadriceps muscle, age and height, of 41 male alcoholics (Hickish et al. 1989)

Age       Height    MVC          Age       Height    MVC
(years)   (cm)      (newtons)    (years)   (cm)      (newtons)
24        166       466          42        178       417
27        175       304          47        171       294
28        173       343          47        162       270
28        175       404          48        177       368
31        172       147          49        177       441
31        172       294          49        178       392
32        160       392          50        167       294
32        172       147          51        176       368
32        179       270          53        159       216
32        177       412          53        173       294
34        175       402          53        175       392
34        180       368          53        172       466
35        167       491          55        170       304
37        175       196          55        178       324
38        172       343          55        155       196
39        172       319          58        160        98
39        161       387          61        162       216
39        173       441          62        159       196
40        173       441          65        168       137
41        168       343          65        168        74
41        178       540

Fig. 17.1. Muscle strength (MVC) against height

Fig. 17.2. Muscle strength (MVC) against age

We can show the strengths of the linear relationships between all three variables by their correlation matrix. This is a tabular display of the correlation coefficients between each pair of variables, matrix being used in its mathematical sense as a rectangular array of numbers. The correlation matrix for the data of Table 17.1 is shown in Table 17.2. The coefficients of the main diagonal are all 1.0, because they show the correlation of the variable with itself, and the correlation matrix is symmetrical about this diagonal. Because of this symmetry many computer programs print only the part of the matrix below the diagonal. Inspection of Table 17.2 shows that older men were shorter and weaker than younger men, that taller men were stronger than shorter men, and that the magnitudes of all three relationships were similar. Reference to Table 11.2 with 41 - 2 = 39 degrees of freedom shows that all three correlations are significant.

Table 17.2. Correlation matrix for the data of Table 17.1

           Age       Height    MVC
Age        1.000    -0.338    -0.417
Height    -0.338     1.000     0.419
MVC       -0.417     0.419     1.000

We could fit a regression line of the form MVC = a + b × age, from which we could predict the mean MVC for any given age. However, MVC would still vary with height. To investigate the effect of both age and height, we can use multiple regression to fit a regression equation of the form

MVC = b0 + b1 × height + b2 × age

The coefficients are calculated by a least squares procedure, exactly the same in principle as for simple regression. In practice, this is always done using a computer program. For the data of Table 17.1, the multiple regression equation is

MVC = -466 + 5.40 × height - 3.08 × age

From this, we would estimate the mean MVC of men with any given age and height, in the population of which these are a sample.
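To make the use of the fitted equation concrete, here is a minimal sketch in Python. The function name and the two example men are hypothetical, chosen only for illustration; the coefficients are the rounded ones quoted in the text, so predictions are correspondingly approximate.

```python
def predicted_mean_mvc(height_cm, age_years):
    """Estimated mean MVC (newtons) from the fitted equation quoted in the text."""
    return -466 + 5.40 * height_cm - 3.08 * age_years


# Hypothetical men, chosen only for illustration: same height, different ages.
younger = predicted_mean_mvc(175, 30)
older = predicted_mean_mvc(175, 60)

print(f"age 30: {younger:.1f} N, age 60: {older:.1f} N")
# At a fixed height the predictions differ by the age coefficient per year of age.
print(f"difference per year of age: {(younger - older) / 30:.2f} N")
```

The second line printed shows what the age coefficient means: for men of the same height, the estimated mean MVC falls by 3.08 newtons for each additional year of age.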

There are a number of assumptions implicit here. One is that the relationship between MVC and height is the same at each age, that is, that there is no interaction between height and age. Another is that the relationship between MVC and height is linear, that is of the form MVC = a + b × height. Multiple regression analysis enables us to test both of these assumptions.

Multiple regression is not limited to two predictor variables. We can have any number, although the more variables we have the more difficult it becomes to interpret the regression. We must, however, have more points than variables, and as the degrees of freedom for the residual variance are n - 1 - q if q variables are fitted, this should be large enough for satisfactory estimation of confidence intervals and tests of significance. This will become clear after the next section.

17.2* Significance tests and estimation in multiple regression

As we saw in §11.5, the significance of a simple linear regression line can be tested using the t distribution. We can carry out the same test using analysis of variance. For the FEV1 and height data of Table 11.1 the sums of squares and products were calculated in §11.3. The total sum of squares for FEV1 is Syy = 9.43868, with n - 1 = 19 degrees of freedom. The sum of squares due to regression was calculated in §11.5 to be 3.18937. The residual sum of squares, i.e. the sum of squares about the regression line, is found by subtraction as 9.43868 - 3.18937 = 6.24931, and this has n - 2 = 18 degrees of freedom. We can now set up an analysis of variance table as described in §10.9, shown in Table 17.3.

Table 17.3. Analysis of variance for the regression of FEV1 on height

Source of variation            Degrees of   Sum of     Mean      Variance
                               freedom      squares    square    ratio (F)
Total                          19           9.43868
Due to regression               1           3.18937    3.18937   9.19
Residual (about regression)    18           6.24931    0.34718

Table 17.4. Analysis of variance for the regression of MVC on height and age

Source of variation    Degrees of   Sum of     Mean      Variance
                       freedom      squares    square    ratio (F)
Total                  40           503344
Regression              2           131495     65748     6.72
Residual               38           371849      9785

Note that the square root of the variance ratio is 3.03, the value of t found in §11.5. The two tests are equivalent. Note also that the regression sum of squares divided by the total sum of squares = 3.18937/9.43868 = 0.3379 is the square of the correlation coefficient, r = 0.58 (§11.5, §11.10). This ratio, sum of squares due to regression over total sum of squares, is the proportion of the variability accounted for by the regression. The percentage variability accounted for or explained by the regression is 100 times this, i.e. 34%.
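The arithmetic behind Table 17.3 can be retraced in a few lines. This is only a sketch using the sums of squares quoted in the text; the variable names are mine, not the book's.

```python
import math

total_ss, total_df = 9.43868, 19            # S_yy and n - 1
regression_ss, regression_df = 3.18937, 1   # sum of squares due to regression

# Residual row is found by subtraction from the total row.
residual_ss = total_ss - regression_ss
residual_df = total_df - regression_df

regression_ms = regression_ss / regression_df
residual_ms = residual_ss / residual_df

variance_ratio = regression_ms / residual_ms   # the F statistic
t_equivalent = math.sqrt(variance_ratio)       # matches t when there is one predictor
r_squared = regression_ss / total_ss           # proportion of variability accounted for

print(f"residual mean square = {residual_ms:.5f}")
print(f"variance ratio F = {variance_ratio:.2f}, square root = {t_equivalent:.2f}")
print(f"proportion of variability explained = {r_squared:.4f}")
```

The same steps apply to Table 17.4, replacing the regression degrees of freedom by 2; the square root of F then no longer corresponds to a single t statistic.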

Returning to the MVC data, we can test the significance of the regression of MVC on height and age together by analysis of variance. If we fit the regression model in §17.1, the regression sum of squares has two degrees of freedom, because we have fitted two regression coefficients. The analysis of variance for the MVC regression is shown in Table 17.4.

The regression is significant; it is unlikely that this association could have arisen by chance if the null hypothesis were true. The proportion of variability accounted for, denoted by R², is 131495/503344 = 0.26. The square root of this is called the multiple correlation coefficient, R. R² must lie between 0 and 1, and as no meaning can be given to the direction of correlation in the multivariate case, R is also taken as positive. The larger R is, the more closely correlated with the outcome variable the set of predictor variables are. When R = 1 the variables are perfectly correlated in the sense that the outcome variable is a linear combination of the others. When the outcome variable is not linearly related to any of the predictor variables, R will be small, but not zero.

We may wish to know whether both or only one of our variables leads to the association. To do this, we can calculate a standard error for each regression coefficient (Table 17.5). This will be done automatically by the regression program. We can use this to test each coefficient separately by a t test. We can also find a confidence interval for each, using t standard errors on either side of the estimate. For the example, both age and height have P = 0.04 and we can conclude that both age and height are independently associated with MVC.

Table 17.5. Coefficients for the regression of MVC on height and age, with standard errors and confidence intervals

Predictor    Coefficient   Standard   t ratio   P      95% Confidence interval
variable                   error
height           5.40        2.55      2.12     0.04       0.25 to 10.55
age             -3.08        1.47     -2.10     0.04      -6.05 to -0.10
intercept     -465.63      460.33     -1.01     0.3    -1397.52 to 466.27
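The t ratios and confidence intervals in Table 17.5 follow directly from the coefficients and their standard errors. The sketch below assumes a two-sided 5% point of 2.024 for the t distribution with 38 degrees of freedom, a value taken from standard tables rather than from the text, and it starts from the rounded figures printed in the table, so the limits agree with the table only to within rounding.

```python
# Two-sided 5% point of the t distribution with 38 degrees of freedom.
# Assumption: read from standard tables and rounded to three decimal places.
T_5_PERCENT_38_DF = 2.024

# (coefficient, standard error) as printed in Table 17.5
table_17_5 = {"height": (5.40, 2.55), "age": (-3.08, 1.47)}


def t_ratio(coefficient, standard_error):
    """Test statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error


def confidence_interval(coefficient, standard_error, t_point=T_5_PERCENT_38_DF):
    """Estimate plus or minus t standard errors."""
    half_width = t_point * standard_error
    return coefficient - half_width, coefficient + half_width


for name, (coef, se) in table_17_5.items():
    low, high = confidence_interval(coef, se)
    print(f"{name}: t = {t_ratio(coef, se):.2f}, 95% CI {low:.2f} to {high:.2f}")
```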

A difficulty arises when the predictor variables are correlated with one another. This increases the standard error of the estimates, and variables may have a multiple regression coefficient which is not significant despite being related to the outcome variable. We can see that this will be so most clearly by taking an extreme case. Suppose we try to fit

MVC = b0 + b1 × height + b2 × height

For the MVC data

MVC = -908 + 6.20 × height + 1.00 × height

is a regression equation which minimizes the residual sum of squares. However, it is not unique, because

MVC = -908 + 5.20 × height + 2.00 × height

will do so too. The two equations give the same predicted MVC. There is no unique solution, and so no regression equation can be fitted, even though there is a clear relationship between MVC and height. When the predictor variables are highly correlated the individual coefficients will be poorly estimated and have large standard errors. Correlated predictor variables may obscure the relationship of each with the outcome variable.
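The non-uniqueness in this extreme case is easy to check numerically. In the sketch below the first two pairs of coefficients are those quoted in the text; the other two are hypothetical pairs I have added, chosen only so that they also sum to 7.20.

```python
def predicted_mvc(height, b1, b2, b0=-908):
    """Prediction from MVC = b0 + b1 * height + b2 * height."""
    return b0 + b1 * height + b2 * height


# The first two pairs come from the text; the last two are hypothetical.
coefficient_pairs = [(6.20, 1.00), (5.20, 2.00), (7.20, 0.00), (-10.00, 17.20)]

for height in (155, 170, 180):
    predictions = [predicted_mvc(height, b1, b2) for b1, b2 in coefficient_pairs]
    # Every pair gives the same prediction, so the data cannot choose between them.
    assert max(predictions) - min(predictions) < 1e-6
    print(height, round(predictions[0], 1))
```

Only the sum b1 + b2 is determined by the data; the individual coefficients are not, which is the limiting case of the large standard errors seen with highly correlated predictors.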

A different (and equivalent) way of testing the effects of two correlated predictor variables separately is to proceed as follows. We fit three models:

1. MVC on height and age, regression sum of squares = 131495, d.f. = 2

2. MVC on height, regression sum of squares = 88511, d.f. = 1

3. MVC on age, regression sum of squares = 87471, d.f. = 1

Note that 88511 + 87471 = 175982 is greater than 131495. This is because age and height are correlated. We then test the effect of height if age is taken into account, referred to as the effect of height given age. The regression sum of squares for height given age is the regression sum of squares (age and height) minus regression sum of squares (age only), which is 131495 - 87471 = 44024. This has degrees of freedom = 2 - 1 = 1. Similarly, the effect of age allowing for height, i.e. age given height, is tested by regression sum of squares (age and height) minus regression sum of squares (height only) = 131495 - 88511 = 42984, with degrees of freedom = 2 - 1 = 1. We can set all this out in an analysis of variance table (Table 17.6). The third to sixth rows of the table are indented for the source of variation, degrees of freedom and sum of squares columns, to indicate that they are different ways of looking at variation already accounted for in the second row. The indented rows are not included when the degrees of freedom and sums of squares are added to give the total. After adjustment for age there is still evidence of a relationship between MVC and height, and after adjustment for height there is still evidence of a relationship between MVC and age. Note that the P values are the same as those found by a t test for the regression coefficient. This approach is essential for qualitative predictor variables with more than two categories (§17.6), when several t statistics may be printed for the variable.

Table 17.6. Analysis of variance for the regression of MVC on height and age, showing adjusted sums of squares

Source of variation      Degrees of   Sum of     Mean      Variance
                         freedom      squares    square    ratio (F)
Total                    40           503344
Regression                2           131495     65748     6.72
   Age alone                1           87471    87471     8.94
   Height given age         1           44024    44024     4.50
   Height alone             1           88511    88511     9.05
   Age given height         1           42984    42984     4.39
Residual                 38           371849      9785
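The adjusted rows of Table 17.6 are obtained purely by subtraction, which can be sketched as follows. The variable names are illustrative; the sums of squares are those quoted in the text.

```python
import math

total_ss, total_df = 503344, 40
ss_height_and_age = 131495   # model 1, 2 d.f.
ss_height_only = 88511       # model 2, 1 d.f.
ss_age_only = 87471          # model 3, 1 d.f.

# Residual mean square from the model containing both predictors.
residual_ss = total_ss - ss_height_and_age
residual_df = total_df - 2
residual_ms = residual_ss / residual_df

# Adjusted sums of squares, each with 2 - 1 = 1 degree of freedom.
ss_height_given_age = ss_height_and_age - ss_age_only
ss_age_given_height = ss_height_and_age - ss_height_only

f_height_given_age = ss_height_given_age / residual_ms
f_age_given_height = ss_age_given_height / residual_ms

print(f"height given age: SS = {ss_height_given_age}, F = {f_height_given_age:.2f}")
print(f"age given height: SS = {ss_age_given_height}, F = {f_age_given_height:.2f}")
# With one degree of freedom, the square root of F is the size of the t ratio.
print(f"sqrt(F): {math.sqrt(f_height_given_age):.2f}, {math.sqrt(f_age_given_height):.2f}")
```

The square roots of the two adjusted variance ratios reproduce the sizes of the t ratios in Table 17.5, which is why the two approaches give the same P values.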

17.3* Interaction in multiple regression

An interaction between two predictor variables arises when the effect of one on the outcome depends on the value of the other. For example, tall men may be stronger than short men when they are young, but the difference may disappear as they age.

We can test for interaction as follows. We have fitted

MVC = b0 + b1 × height + b2 × age

An interaction may take two simple forms. As height increases, the effect of age may increase so that the difference in MVC between young and old tall men is greater than the difference between young and old short men. Alternatively, as height increases, the effect of age may decrease. More complex interactions are beyond the scope of this discussion. Now, if we fit

MVC = b0 + b1 × height + b2 × age + b3 × height × age

for fixed height the effect of age is b2 + b3 × height. If there is no interaction, the effect of age is the same at all heights, and b3 will be zero. Of course, b3 will not be exactly zero, but only within the limits of random variation. We can fit such a model just as we fitted the first one. We get

MVC = 4661 - 24.7 × height - 112.8 × age + 0.650 × height × age

Table 17.7. Analysis of variance for the interaction of height and age

Source of variation    Degrees of   Sum of     Mean      Variance    Probability
                       freedom      squares    square    ratio (F)
Total                  40           503344
Regression              3           202719     67573     8.32        0.0002
   Height and age         2           131495   65748     8.09        0.001
   Height × age           1            71224   71224     8.77        0.005
Residual               37           300625      8125

The regression is still significant, as we would expect. However, the coefficients of height and age have changed; they have even changed sign. The coefficient of height depends on age. The regression equation can be written

MVC = 4661 + (-24.7 + 0.650 × age) × height - 112.8 × age

The coefficient of height depends on age, the difference in strength between short and tall subjects being greater for older subjects than for younger.
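A short sketch may help to show how the coefficient of height changes with age in the interaction model. The ages and heights used are hypothetical values within the range of Table 17.1, and the coefficients are the rounded ones quoted above, so individual predictions are rough.

```python
def height_coefficient(age):
    """Coefficient of height at a given age in the interaction model."""
    return -24.7 + 0.650 * age


def predicted_mvc(height, age):
    """Prediction from the fitted interaction equation quoted in the text."""
    return 4661 - 24.7 * height - 112.8 * age + 0.650 * height * age


# Hypothetical comparison: men of 160 cm and 180 cm at two ages.
for age in (30, 60):
    difference = predicted_mvc(180, age) - predicted_mvc(160, age)
    print(f"age {age}: height coefficient = {height_coefficient(age):.1f}, "
          f"predicted MVC difference, 180 cm versus 160 cm = {difference:.0f} N")
```

The predicted difference between the two heights is always 20 times the height coefficient at that age, so it is not a single number as it was in the model without interaction.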

The analysis of variance for this regression equation is shown in Table 17.7. The regression sum of squares is divided into two parts: that due to age and height, and that due to the interaction term after the main effects of age and height have been accounted for. The interaction row is the difference between the regression row in Table 17.7, which has 3 degrees of freedom, and the regression row in Table 17.4, which has 2. From this we see that the interaction is highly significant. The effects of height and age on MVC are not additive. Another example of the investigation of a possible interaction is given in §17.7.

17.4* Polynomial regression

So far, we have assumed that all the regression relationships have been linear, i.e. that we are dealing with straight lines. This is not necessarily so. We may have data where the underlying relationship is a curve rather than a straight line. Unless there is a theoretical reason for supposing that a particular form of the equation, such as logarithmic or exponential, is needed, we test for non-linearity by using a polynomial. Clearly, if we can fit a relationship of the form

MVC = b0 + b1 × height + b2 × age

we can also fit one of the form

MVC = b0 + b1 × height + b2 × height²

to give a quadratic equation, and continue adding powers of height to give equations which are cubic, quartic, etc.

Table 17.8. Analysis of variance for polynomial regression of MVC on height

Source of variation    Degrees of   Sum of     Mean      Variance    Probability
                       freedom      squares    square    ratio (F)
Total                  40           503344
Regression              2            89103     44552     4.09        0.02
   Linear                 1            88522   88522     7.03        0.01
   Quadratic              1              581     581     0.05        0.8
Residual               38           414241     12584

Height and height squared are highly correlated, which can lead to problems in estimation. To reduce the correlation, we can subtract a number close to mean height from height before squaring. For the data of Table 17.1, the correlation between height and height squared is 0.9998. Mean height is 170.7 cm, so 170 is a convenient number to subtract. The correlation between height and height minus 170 squared is -0.44, so the correlation has been reduced, though not eliminated. The regression equation is

MVC = -961 + 7.49 × height + 0.092 × (height - 170)²
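The effect of subtracting 170 before squaring can be checked from the heights in Table 17.1. The sketch below uses a small hand-written Pearson correlation function (my own helper, not a routine from any statistics package) so that it needs nothing beyond the standard library.

```python
import math

# Heights (cm) of the 41 men, transcribed from Table 17.1 (left columns, then right).
heights = [166, 175, 173, 175, 172, 172, 160, 172, 179, 177, 175, 180, 167,
           175, 172, 172, 161, 173, 173, 168, 178,
           178, 171, 162, 177, 177, 178, 167, 176, 159, 173, 175, 172, 170,
           178, 155, 160, 162, 159, 168, 168]


def pearson_r(x, y):
    """Product moment correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


mean_height = sum(heights) / len(heights)
raw_squares = [h ** 2 for h in heights]
centred_squares = [(h - 170) ** 2 for h in heights]

r_raw = pearson_r(heights, raw_squares)
r_centred = pearson_r(heights, centred_squares)

print(f"mean height = {mean_height:.1f} cm")
print(f"r(height, height squared) = {r_raw:.4f}")
print(f"r(height, (height - 170) squared) = {r_centred:.2f}")
```

Centring does not change the fitted curve, only the way it is parameterized; its purpose is to make the two terms less nearly interchangeable, so that their coefficients are better estimated.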

To test for non-linearity, we proceed as in §17.2. We fit two regression equations, a linear and a quadratic. The non-linearity is then tested by the difference between the sum of squares due to the quadratic equation and the sum of squares due to the linear. The analysis of variance is shown in Table 17.8. In this case the quadratic term is not significant, so there is no evidence of non-linearity. Were the quadratic term significant, we could fit a cubic equation and test the effect of the cubic term in the same way. Polynomial regression of one variable can be combined with ordinary linear regression of others to give regression equations of the form

MVC = b0 + b1 × height + b2 × height² + b3 × age

and so on. Royston and Altman (1994) have shown that quite complex curves can be fitted with a small number of coefficients if we use log(x) and powers -1, -0.5, 0.5, 1 and 2 in the regression equation.

17.5* Assumptions of multiple regression

For the regression estimates to be optimal and the F tests valid, the residuals (the differences between observed values of the dependent variable and those predicted by the regression equation) should follow a Normal distribution and have the same variance throughout the range. We also assume that the relationships which we are modelling are linear. These assumptions are the same as for simple linear regression (§11.8) and can be checked graphically in the same way, using histograms, Normal plots and scatter diagrams. If the assumptions of Normal distribution and uniform variance are not met, we can use a transformation as described in §10.4 and §11.8. Non-linearity can be dealt with using polynomial regression.

Fig. 17.3. Histogram and Normal plot of residuals of MVC about height and age

Fig. 17.4. Residuals against observed MVC, to check uniformity of variance, and age, to check linearity

The regression equation of strength on height and age is MVC = -466 + 5.40 × height - 3.08 × age and the residuals are given by

residual = MVC - (-466 + 5.40 × height - 3.08 × age)

Figure 17.3 shows a histogram and a Normal plot of the residuals for the MVC data. The distribution looks quite good. Figure 17.4 shows a plot of residuals against MVC. The variability looks uniform. We can also check the linearity by plotting residuals against the predictor variables. Figure 17.4 also shows the residuals against age. There is an indication that the residuals may be related to age. The possibility of a non-linear relationship can be checked by polynomial regression, which, in this case, does not produce a quadratic term which approaches significance.

17.6* Qualitative predictor variables

In §17.1 the predictor variables, height and age, were quantitative. In the study from which these data come, we also recorded whether or not subjects had cirrhosis of the liver. Cirrhosis was recorded as ‘present’ or ‘absent’, so the variable was dichotomous. It is easy to include such variables as predictors in multiple regression. We create a variable which is 0 if the characteristic is absent, 1 if present, and use this in the regression equation just as we did height. The regression coefficient of this dichotomous variable is the difference in the mean of the outcome variable between subjects with the characteristic and subjects without. If the coefficient in this example were negative, it would mean that subjects with cirrhosis were not as strong as subjects without cirrhosis. In the same way, we can use sex as a predictor variable by creating a variable which is 0 for females and 1 for males. The coefficient then represents the difference in mean between male and female. If we use only one, dichotomous predictor variable in the equation, the regression is exactly equivalent to a two-sample t test between the groups defined by the variable (§10.3).

A predictor variable with more than two categories or classes is called a class variable or a factor. We cannot simply use a class variable in the regression equation, unless we can assume that the classes are ordered in the same way as their codes, and that adjoining classes are in some sense the same distance apart. For some variables, such as the diagnosis data of Table 4.1 and the housing data of Table 13.1, this is absurd. For others, such as the AIDS categories of Table 10.7, it is a very strong assumption. What we do instead is to create a set of dichotomous variables to represent the factor. For the AIDS data of Table 10.7, we can create three variables:

hiv1 = 1 if subject has AIDS, 0 otherwise

hiv2 = 1 if subject has ARC, 0 otherwise

hiv3 = 1 if subject is HIV positive but has no symptoms, 0 otherwise

If the subject is HIV negative, all three variables are zero. hiv1, hiv2, and hiv3 are called dummy variables. Some computer programs will calculate the dummy variables automatically if the variable is declared to be a factor; for others the user must define them. We put the three dummy variables into the regression equation. This gives the equation:

mannitol = 11.4 - 0.066 × hiv1 - 2.56 × hiv2 - 1.69 × hiv3

Each coefficient is the difference in mannitol absorption between the class represented by that variable and the class represented by all dummy variables being zero, HIV negative, called the reference class. The analysis of variance for this regression equation is shown in Table 17.9, and the F test shows that there is no significant relationship between mannitol absorption and HIV status. The regression program prints out standard errors and t tests for each dummy variable, but these t tests should be ignored, because we cannot interpret one dummy variable in isolation from the others.
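For programs which do not create dummy variables automatically, the coding can be sketched as below. The category labels are hypothetical strings of my own; only the names hiv1, hiv2 and hiv3 and the coefficients come from the text.

```python
REFERENCE_CLASS = "HIV negative"

# Dummy variable name -> the class it flags (labels are illustrative).
DUMMY_CLASSES = {
    "hiv1": "AIDS",
    "hiv2": "ARC",
    "hiv3": "HIV positive, no symptoms",
}


def dummy_variables(status):
    """Return the three 0/1 dummy variables for one subject's HIV status."""
    if status != REFERENCE_CLASS and status not in DUMMY_CLASSES.values():
        raise ValueError(f"unknown HIV status: {status!r}")
    return {name: int(status == cls) for name, cls in DUMMY_CLASSES.items()}


def predicted_mannitol(status):
    """Predicted mean mannitol absorption from the fitted equation in the text."""
    d = dummy_variables(status)
    return 11.4 - 0.066 * d["hiv1"] - 2.56 * d["hiv2"] - 1.69 * d["hiv3"]


for status in [REFERENCE_CLASS, *DUMMY_CLASSES.values()]:
    print(f"{status}: dummies = {dummy_variables(status)}, "
          f"predicted mean = {predicted_mannitol(status):.3f}")
```

The reference class gets the constant term alone, and each other class differs from it by exactly its own coefficient, which is the interpretation given above.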

Table 17.9. Analysis of variance for the regression of mannitol excretion on HIV status

Source of variation    Degrees of   Sum of      Mean      Variance    Probability
                       freedom      squares     square    ratio (F)
Total                  58           1559.035
Regression              3             49.011    16.337    0.60        0.6
Residual               55           1510.024    27.455

Table 17.10. Two-way analysis of variance for mannitol excretion, with HIV status and diarrhoea as factors

Source of variation    Degrees of   Sum of      Mean      Variance    Probability
                       freedom      squares     square    ratio (F)
Total                  58           1559.035
Model                   4            134.880    33.720    1.28        0.3
   HIV                    3             58.298  19.432    0.74        0.5
   Diarrhoea              1             85.869  85.869    3.26        0.08
Residual               54           1424.155    26.373

17.7* Multi-way analysis of variance

A different approach to the analysis of multifactorial data is provided by the direct calculation of analysis of variance. Table 17.9 is identical to the one-way analysis of variance for the same data in Table 10.8. We can also produce analyses of variance for several factors at once. Table 17.10 shows the two-way analysis of variance for the mannitol data, the factors being HIV status and presence or absence of diarrhoea. This could be produced equally well by multiple regression with two categorical predictor variables. If there were the same number of patients with and without diarrhoea in each HIV group the factors would be balanced. The model sum of squares would then be the sum of the sums of squares for HIV and for diarrhoea, and these could be calculated very simply from the total of the HIV groups and the diarrhoea groups. For balanced data we can assess many categorical factors and their interactions quite easily by manual calculation. See Armitage and Berry (1994) for details. Complex multifactorial balanced experiments are rare in medical research, and they can be analysed by regression anyway to get identical results. Most computer programs in fact use the regression method to calculate analyses of variance.

For another example, consider Table 17.11, which shows the results of a study of the production of Tumour Necrosis Factor (TNF) by cells in vitro. Two different potential stimulating factors, Mycobacterium tuberculosis (MTB) and Fixed Activated T-cells (FAT), have been added, singly and together. Cells from the same 11 donors have been used throughout. Thus we have three factors, MTB, FAT, and donor. Three measurements were made at each combination of factors; Figure 17.5(a) shows the means of these sets of three. Every possible combination of factors is used the same number of times in a perfect three-way factorial arrangement. There are two missing observations. These things happen, even in the best regulated laboratories. There are some negative values of TNF.

Table 17.11. TNF measured under four different conditions using cells from 11 donors (data of Dr. Jan Davies)

No MTB MTB

FAT Donor TNF, 3 replicates FAT Donor

No 1 -0.01 -0.01 -0.13 No 1

No 2 16.13 -9.62 -14.88 No 2

No 3 Missing -0.3 -0.95 No 3

No 4 3.63 47.5 55.2 No 4

No 5 -3.21 -5.64 -5.32 No 5

No 6 16.26 52.21 17.93 No 6

No 7 -12.74 -5.23 -4.06 No 7

No 8 -4.67 20.1 110 No 8

No 9 -5.4 20 10.3 No 9

No 10 -10.94 -5.26 -2.73 No 10

No 11 -4.19 -11.83 -6.29 No 11

Yes 1 88.16 97.58 66.27 Yes 1

Yes 2 196.5 114.1 134.2 Yes 2

Yes 3 6.02 1.19 3.38 Yes 3

Yes 4 935.4 1011 951.2 Yes 4

Yes 5 606 592.7 608.4 Yes 5

Yes 6 1457 1349 1625 Yes 6

Yes 7 1457 1349 1625 Yes 7

Yes 8 196.7 270.8 160.7 Yes 8

Yes 9 135.2 221.5 268 Yes 9

Yes 10 -14.47 79.62 304.1 Yes 10

Yes 11 516.3 585.9 562.6 Yes 11

Fig. 17.5. Tumour Necrosis Factor (TNF) measured in the presence and absence of Fixed Activated T-cells (FAT) and Mycobacterium tuberculosis (MTB), on the natural and a transformed scale

The negative values of TNF do not mean that the cells were sucking TNF in from their environment; they are an artifact of the assay method and represent measurement error.

The subject means are shown in Figure 17.5(a). This suggests several things: there is a strong donor effect (donor 6 is always high, donor 3 is always low, for example), MTB and FAT each increase TNF, both together have a greater effect than either individually, the distribution of TNF is highly skew, the variance of TNF varies greatly from group to group, and increases with the mean. As the mean for MTB and FAT combined is much greater than the sum of their individual means, the researcher thought there was synergy, i.e. that MTB and FAT worked together, the presence of one enhancing the effect of the other. She was seeking statistical support for this conclusion (Jan Davies, personal communication).

Table 17.12. Analysis of variance for the effects of MTB, FAT and donor on transformed TNF

Source of variation    Degrees of   Sum of       Mean       Variance    Probability
                       freedom      squares      square     ratio (F)
Total                  43           194.04030
Donor                  10            38.89000     3.88900    3.72        0.003
MTB                     1            58.49320    58.49320   55.88       <0.0001
FAT                     1            65.24482    65.24482   62.33       <0.0001
MTB × FAT               1             0.00811     0.00811    0.01        0.9
Residual               30            31.40417     1.04681

For statistical analysis, we would like Normal distributions with uniform variances between the groups. A log transformation looks like a good bet, but some observations are negative. As the log (or the square root) will not work for negative numbers, we have to adjust the data further. The easiest approach is to add a constant to all the observations before transformation. I chose 20, which makes all the observations positive but is small compared to most of the observations. I did this by trial and error. As Figure 17.5(b) shows, the transformation has not been totally successful, but the transformed data look much more amenable to a Normal theory analysis than do the raw data.

The repeated measurements give us a more accurate measurement of TNF, but do not contribute anything else. I therefore analysed the mean transformed TNF. The analysis of variance is shown in Table 17.12. Donor is a factor with 11 categories, hence has 10 degrees of freedom. It is not of any importance to the science here, but is what we call a nuisance variable, one we need to allow for but are not interested in. I have included an interaction between MTB and FAT, because looking for this is one of the objectives of the experiment. The main effects of MTB and FAT are highly significant, but the interaction term is not. The estimates of the effects with their confidence intervals are shown in Table 17.13. As the analysis was on a log scale, the antilogs (exponentials) are also shown. The antilog gives us the ratio of the (geometric) mean in the presence of the factor to the mean in the absence of the factor, i.e. the amount by which TNF is multiplied when the factor is present. Strictly speaking, of course, it is the ratio of the geometric means of TNF plus 20, but as 20 is small compared to most TNF measurements the ratio will be approximately the increase in TNF.

The estimated interaction is small and not significant. The confidence interval is wide (the sample is very small), so we cannot exclude the possibility of an interaction, but there is certainly no evidence that one exists. This was not what the researcher expected. This contradiction comes about because the statistical model used is of additive effects on the logarithmic scale, i.e. of multiplicative effects on the natural scale. This is forced on us by the nature of the data. The lack of interaction between the effects shows that the data are consistent with this model, this view of what is happening. The lack of interaction can be seen quite clearly in Figure 17.5(b), as the mean for MTB and FAT looks very similar to the sum of the means for MTB alone and FAT alone.

Table 17.13. Estimated effects on TNF of MTB, FAT and their interaction

              Effect         95% Confidence       Ratio effect       95% Confidence
              (log scale)    interval             (natural scale)    interval
With interaction term
MTB           2.333          (1.442 to 3.224)     10.3               (4.2 to 25.1)
FAT           2.463          (1.572 to 3.354)     11.7               (4.8 to 28.6)
MTB × FAT     0.054          (-1.206 to 1.314)     1.1               (0.3 to 3.7)
Without interaction term
MTB           2.306          (1.687 to 2.925)     10.0               (5.4 to 18.6)
FAT           2.435          (1.816 to 3.054)     11.4               (6.1 to 21.2)
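The ratio effects in Table 17.13 are simply the exponentials of the log-scale estimates and their confidence limits. The sketch below reproduces those with the interaction term; the function names are illustrative, and the transformation shown is log(TNF + 20) as described in the text.

```python
import math

OFFSET = 20  # constant added to every observation before taking logs


def transform(tnf):
    """Log transformation used for the analysis: log of TNF plus the offset."""
    return math.log(tnf + OFFSET)


def ratio_effect(log_effect):
    """Antilog of an effect estimated on the log scale."""
    return math.exp(log_effect)


# (estimate, lower limit, upper limit) on the log scale, from Table 17.13.
with_interaction = {
    "MTB": (2.333, 1.442, 3.224),
    "FAT": (2.463, 1.572, 3.354),
    "MTB x FAT": (0.054, -1.206, 1.314),
}

ratios = {name: tuple(ratio_effect(v) for v in values)
          for name, values in with_interaction.items()}

for name, (estimate, low, high) in ratios.items():
    print(f"{name}: ratio {estimate:.1f} ({low:.1f} to {high:.1f})")

# Even the most negative observation in Table 17.11 can be transformed.
print(f"transformed value for TNF = -14.88: {transform(-14.88):.2f}")
```

An interaction ratio close to 1 means that the effect of MTB multiplies TNF by about the same factor whether or not FAT is present, which is what a purely multiplicative model predicts.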

Multiple regression in which qualitative and quantitative predictor variables are both used is also known as analysis of covariance. For ordinal data, there is a two-way analysis of variance using ranks, the Friedman test (see Conover 1980, Altman 1991).

17.8* Logistic regression

Logistic regression is used when the outcome variable is dichotomous, a ‘yes or no’, whether or not the subject has a particular characteristic such as a symptom. We want a regression equation which will predict the proportion of individuals who have the characteristic, or, equivalently, estimate the probability that an individual will have the symptom. We cannot use an ordinary linear regression equation, because this might predict proportions less than zero or greater than one, which would be meaningless. Instead we use the logit of the proportion as the outcome variable. The logit of a proportion p is the log odds (§13.7):

logit(p) = log(p/(1 - p))

The logit can take any value from minus infinity, when p = 0, to plus infinity, when p = 1. We can fit regression models to the logit which are very similar to the ordinary multiple regression and analysis of variance models found for data from a Normal distribution. We assume that relationships are linear on the logistic scale:

logit(p) = b0 + b1 × x1 + b2 × x2 + … + bm × xm

where x1, …, xm are the predictor variables and p is the proportion to be predicted. The method is called logistic regression, and the calculation is computer intensive. The effects of the predictor variables are found as log odds ratios. We will look at the interpretation in an example.

Fig. 17.6. Body mass index (BMI) in women undergoing trial of scar

Table 17.14. Coefficients in the logistic regression for predicting caesarian section

                   Coef.      Std. Err.   z        P         95% Confidence interval
BMI                 0.0883    0.0200       4.42    <0.001     0.0492 to 0.1275
Induction           0.6471    0.2141       3.02     0.003     0.2276 to 1.0667
Prev. vag. del.    -1.7963    0.2981      -6.03    <0.001    -2.3805 to -1.2120
Intercept          -3.7000    0.5343      -6.93    <0.001    -4.7473 to -2.6528

When giving birth, women who have had a previous caesarian section usually have a trial of scar, that is, they attempt a natural labour with vaginal delivery and only have another caesarian if this is deemed necessary. Several factors may increase the risk of a caesarian, and in this study the factor of interest was obesity, as measured by the body mass index or BMI, defined as weight/height². The distribution of BMI is shown in Figure 17.6 (data of Andreas Papadopoulos). For caesarians the mean BMI was 26.4 kg/m² and for vaginal deliveries the mean was 24.9 kg/m². Two other variables had a strong relationship with a subsequent caesarian. Women who had had a previous vaginal delivery (PVD) were less likely to need a caesarian, odds ratio = 0.18, 95% confidence interval 0.10 to 0.32. Women whose labour was induced had an increased risk of a caesarian, odds ratio = 2.11, 95% confidence interval 1.44 to 3.08. All these relationships were highly significant. The question to be answered was whether the relationship between BMI and caesarian section remained when the effects of induction and previous deliveries were allowed for.

The results of the logistic regression are shown in Table 17.14. We have the coefficients for the equation predicting the log odds of a caesarian:

log(o) = -3.7000 + 0.0883 × BMI + 0.6471 × induction - 1.7963 × PVD

where induction and PVD are 1 if present, 0 if not. Thus for a woman who had BMI = 25 kg/m², had not been induced and had had a previous vaginal delivery the log odds of a caesarian is estimated to be

log(o) = -3.7000 + 0.0883 × 25 + 0.6471 × 0 - 1.7963 × 1 = -3.2888

The odds is exp(-3.2888) = 0.03730 and the probability is given by p = o/(1 + o) = 0.03730/(1 + 0.03730) = 0.036. If labour had been induced, the log odds would rise to

log(o) = -3.7000 + 0.0883 × 25 + 0.6471 × 1 - 1.7963 × 1 = -2.6417

giving odds exp(-2.6417) = 0.07124 and hence probability 0.07124/(1 + 0.07124) = 0.067.

Table 17.15. Odds ratios from the logistic regression for predicting caesarian section

                   Odds ratio   P         95% Confidence interval
BMI                1.092        <0.001    1.050 to 1.136
Induction          1.910         0.003    1.256 to 2.906
Prev. vag. del.    0.166        <0.001    0.096 to 0.298
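The worked example for a woman with BMI = 25 kg/m² and a previous vaginal delivery can be retraced with a few lines of Python. The function and dictionary names are my own; the coefficients are those of Table 17.14.

```python
import math

# Coefficients from Table 17.14 (log odds scale).
COEFFICIENTS = {"intercept": -3.7000, "bmi": 0.0883,
                "induction": 0.6471, "pvd": -1.7963}


def log_odds(bmi, induction, pvd):
    """Predicted log odds of a caesarian; induction and pvd are 1 if present, 0 if not."""
    return (COEFFICIENTS["intercept"] + COEFFICIENTS["bmi"] * bmi
            + COEFFICIENTS["induction"] * induction + COEFFICIENTS["pvd"] * pvd)


def probability_from_log_odds(value):
    """Convert log odds to a probability: p = o / (1 + o)."""
    odds = math.exp(value)
    return odds / (1 + odds)


for induced in (0, 1):
    lo = log_odds(bmi=25, induction=induced, pvd=1)
    print(f"induction = {induced}: log odds = {lo:.4f}, odds = {math.exp(lo):.5f}, "
          f"probability = {probability_from_log_odds(lo):.4f}")
```

Because the conversion from log odds to probability is not linear, the same coefficient changes the probability by different amounts for different women, even though it always multiplies the odds by the same factor.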

Because the logistic regression equation predicts the log odds, the coefficients represent the difference between two log odds, a log odds ratio. The antilog of the coefficients is thus an odds ratio. Some programs will print these odds ratios directly, as in Table 17.15. We can see that induction increases the odds of a caesarian by a factor of 1.910 and a previous vaginal delivery reduces the odds by a factor of 0.166. These are often called adjusted odds ratios. In this example they and their confidence intervals are similar to the unadjusted odds ratios given above, because the three predictor variables happen not to be closely related to each other.

For a continuous predictor variable, such as BMI, the coefficient is the change in log odds for an increase of one unit in the predictor variable. The antilog of the coefficient, the odds ratio, is the factor by which the odds must be multiplied for a unit increase in the predictor. Two units increase in the predictor increases the odds by the square of the odds ratio, and so on. A difference of 5 kg/m² in BMI gives an odds ratio for a caesarian of 1.092⁵ = 1.55, thus the odds of a caesarian are multiplied by 1.55. See §11.8 for a similar interpretation and fuller discussion when a continuous outcome variable is log transformed.
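The scaling of an odds ratio with the size of the difference can be sketched as follows. The function name is illustrative; the coefficient is the BMI coefficient from Table 17.14.

```python
import math

BMI_COEFFICIENT = 0.0883  # change in log odds per kg/m^2, from Table 17.14


def odds_ratio(coefficient, difference=1.0):
    """Odds ratio for a given difference in a continuous predictor."""
    return math.exp(coefficient * difference)


per_unit = odds_ratio(BMI_COEFFICIENT)
per_five_units = odds_ratio(BMI_COEFFICIENT, 5)

print(f"odds ratio per 1 kg/m^2: {per_unit:.3f}")
print(f"odds ratio per 5 kg/m^2: {per_five_units:.2f}")
# Multiplying the coefficient by 5 is the same as raising the odds ratio to the power 5.
print(f"unit odds ratio to the fifth power: {per_unit ** 5:.2f}")
```

Working from the unrounded coefficient or from the rounded odds ratio of 1.092 gives the same answer to two decimal places here, but with larger differences the rounding of the odds ratio would start to matter.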

When we have a case control study, we can analyse the data by using the case or control status as the outcome variable in a logistic regression. The coefficients are then the approximate log relative risks due to the factors (§13.7). There is a variant called conditional logistic regression, which can be used when the cases and controls are in matched pairs, triples, etc.

Logistic regression is a large sample method. A rule of thumb is that there should be at least 10 ‘yes’s and 10 ‘no’s, and preferably 20, for each predictor variable (Peduzzi et al. 1996).

17.9* Survival data using Cox regression

One problem of survival data, the censoring of individuals who have not died at the time of analysis, has been discussed in §15.6. There is another which is important for multifactorial analysis. We often have no suitable mathematical model of the way survival is related to time, i.e. the survival curve. The solution now widely adopted to this problem was proposed by Cox (1972), and is known as Cox regression or the proportional hazards model. In this approach, we say that for subjects who have lived to time t, the probability of an endpoint (e.g. dying) instantaneously at time t is h(t), which is an unknown function of time. We call the probability of an endpoint the hazard, and h(t) is the hazard function. We then assume that anything which affects the hazard does so by the same ratio at all times. Thus, something which doubles the risk of an endpoint on day one will also double the risk of an endpoint on day two, day three and so on. Thus, if h0(t) is the hazard function for subjects with all the predictor variables equal to zero, and h(t) is the hazard function for a subject with some other values for the predictor variables, h(t)/h0(t) depends only on the predictor variables, not on time t. We call h(t)/h0(t) the hazard ratio. It is the relative risk of an endpoint occurring at any given time.

In statistics, it is convenient to work with differences rather than ratios, so we take the logarithm of the ratio (see §5A) and have a regression-like equation:

log(h(t)/h0(t)) = b1 × x1 + b2 × x2 + … + bp × xp

where x1, …, xp are the predictor variables and b1, …, bp are the coefficients which we estimate from the data. This is Cox's proportional hazards model. Cox regression enables us to estimate the values of b1, …, bp which best predict the observed survival. There is no constant term b0, its place being taken by the baseline hazard function h0(t).

Table 15.7 shows the time to recurrence of gallstones, or the time for which patients are known to have been gallstone-free, following dissolution by bile acid treatment or lithotripsy, with the number of previous gallstones, their maximum diameter, and the time required for their dissolution. The difference between patients with a single and with multiple previous gallstones was tested using the logrank test (§15.6). Cox regression enables us to look at continuous predictor variables, such as diameter of gallstone, and to examine several predictor variables at once. Table 17.16 shows the result of the Cox regression. We can carry out an approximate test of significance by dividing the coefficient by its standard error; if the null hypothesis that the coefficient would be zero in the population is true, this ratio follows a Standard Normal distribution. The chi-squared statistic tests the relationship between the time to recurrence and the three variables together. The maximum diameter has no significant relationship to time to recurrence, so we can try a model without it (Table 17.17). As the change in overall chi-squared shows, removing diameter has had very little effect.

The coefficients in Table 17.17 are the log hazard ratios. The coefficient for multiple gallstones is 0.963. If we antilog this, we get exp(0.963) = 2.62. As multiple gallstones is a 0 or 1 variable, the coefficient measures the difference between those with single and multiple stones. A patient with multiple gallstones is 2.62 times as likely to have a recurrence at any time as a patient with a single stone. The 95% confidence interval for this estimate is found from the antilogs of the confidence interval in Table 17.17, 1.30 to 5.26. Note that a positive coefficient means an increased risk of the event, in this case recurrence. The coefficient for months to dissolution is 0.043, which has antilog = 1.04. This is a quantitative variable, and for each month to dissolve the hazard ratio increases by a factor of 1.04. Thus a patient whose stone took two months to dissolve has a risk of recurrence 1.04 times that for a patient whose stone took one month, a patient whose stone took three months has a risk $1.04^2$ times that for a one month patient, and so on.
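The antilog step can be checked with a few lines of Python. This is only an illustrative sketch: the function name `hazard_ratio` and its `units` argument are made up for this example, and the coefficients are simply copied from Table 17.17.

```python
import math

def hazard_ratio(coef, lower=None, upper=None, units=1.0):
    """Antilog a Cox coefficient (a log hazard ratio).

    `units` scales the coefficient first, so units=2 gives the hazard
    ratio for a two-unit difference in the predictor. If confidence
    limits for the coefficient are supplied, they are antilogged too.
    """
    hr = math.exp(coef * units)
    if lower is None or upper is None:
        return hr
    return hr, math.exp(lower * units), math.exp(upper * units)

# Multiple gallstones (0/1 variable): coefficient and limits from Table 17.17
hr, lo, hi = hazard_ratio(0.963, 0.266, 1.661)
print(f"multiple stones: HR = {hr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")

# Months to dissolution: one extra month, then two extra months
one_month = hazard_ratio(0.043)
two_months = hazard_ratio(0.043, units=2)
print(f"one month: {one_month:.3f}, two months: {two_months:.3f}")
```

Because the model is linear on the log scale, the two-month ratio is the square of the one-month ratio.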

Table 17.16. Cox regression of time to recurrence of gallstones on presence of multiple stones, maximum diameter of stone and months to dissolution

Variable             Coef.    Std. Err.    z       P       95% Conf. interval
Mult. gallstones      0.838   0.401         2.09   0.038    0.046 to 1.631
Max. diam.           -0.023   0.036        -0.63   0.532   -0.094 to 0.049
Months to dissol.     0.044   0.017         2.64   0.009    0.011 to 0.078

χ² = 12.57, 3 d.f., P = 0.006.

Table 17.17. Cox regression of time to recurrence of gallstones on presence of multiple stones and months to dissolution

Variable             Coef.    Std. Err.    z       P       95% Conf. interval
Mult. gallstones      0.963   0.353         2.73   0.007    0.266 to 1.661
Months to dissol.     0.043   0.017         2.59   0.011    0.010 to 0.076

χ² = 12.16, 2 d.f., P = 0.002.

If we have only the dichotomous variable multiple gallstones in the Cox model, we get for the overall test statistic χ² = 6.11, 1 degree of freedom. In §15.6 we analysed these data by comparison of two groups using the logrank test which gave χ² = 6.62, 1 degree of freedom. The two methods give similar, but not identical results. The logrank test is non-parametric, making no assumption about the distribution of survival time. The Cox method is said to be semi-parametric, because although it makes no assumption about the shape of the distribution of survival time, it does require assumptions about the hazard ratio.

Likelogisticregression(§17.8),Coxregressionisalargesamplemethod.Aruleofthumbisthatthereshouldbeatleast10,andpreferably20,events(deaths)foreachpredictorvariable.FulleraccountsofCoxregressionaregivenbyAltman(1991),MatthewsandFarewell(1988),ParmarandMachin(1995),andHosmerandLemeshow(1999).

17.10* Stepwise regression

Stepwise regression is a technique for choosing predictor variables from a large set. The stepwise approach can be used with multiple linear, logistic and Cox regression and with other, less often seen, regression techniques (§17.12) too.

There are two basic strategies: step-up and step-down, also called forward and backward. In step-up or forward regression, we fit all possible one-way regression equations. Having found the one which accounts for the greatest variance, all two-way regressions including this variable are fitted. The equation accounting for the most variation is chosen, and all three-way regressions including these are fitted, and so on. This continues until no significant increase in variation accounted for is found. In the step-down or backward method, we first fit the regression with all the predictor variables, and then the variable is removed which reduces the amount of variation accounted for by the least amount, and so on. There are also more complex methods, in which variables can both enter and leave the regression equation.

These methods must be treated with care. Different stepwise techniques may produce different sets of predictor variables in the regression equation. This is especially likely when the predictor variables are correlated with one another. The technique is very useful for selecting a small set of predictor variables for purposes of standardization and prediction. For trying to get an understanding of the underlying system, stepwise methods can be very misleading. When predictor variables are highly correlated, once one has entered the equation in a step-up analysis, the other will not enter, even though it is related to the outcome. Thus it will not appear in the final equation.

17.11*Meta-analysis:DatafromseveralstudiesMeta-analysisisthecombinationofdatafromseveralstudiestoproduceasingleestimate.Fromthestatisticalpointofview,meta-analysisisastraightforwardapplicationofmultifactorialmethods.Wehaveseveralstudiesofthesamething,whichmightbeclinicaltrialsorepidemiologicalstudies,perhapscarriedoutindifferentcountries.Eachtrialgivesusanestimateofaneffect.Weassumethattheseareestimatesofthesameglobalpopulationvalue.Wechecktheassumptionsoftheanalysis,and,iftheseassumptionsaresatisfied,wecombinetheseparatestudyestimatestomakeacommonestimate.Thisisamultifactorialanalysis,wherethetreatmentorriskfactorisonepredictorvariableandthestudyisanother,categorical,predictorvariable.

The main problems of meta-analysis arise before we begin the analysis of the data. First, we must have a clear definition of the question so that we only include studies which address this. For example, if we want to know whether lowering serum cholesterol reduces mortality from coronary artery disease, we would not want to include a study where the attempt to lower cholesterol failed. On the other hand, if we ask whether dietary advice lowers mortality, we would include such a study. Which studies we include may have a profound influence on the conclusions (Thompson 1993). Second, we must have all the relevant studies. A simple literature search is not enough. Not all studies which have been started are published; studies which produce significant differences are more likely to be published than those which do not (e.g. Pocock and Hughes 1990; Easterbrook et al. 1991). Within a study, results which are significant may be emphasized and parts of the data which produce no differences may be ignored by the investigators as uninteresting. Publication of unfavourable results may be discouraged by the sponsors of research. Researchers who are not native English speakers may feel that publication in the English language literature is more prestigious as it will reach a wider audience, and so try there first, only publishing in their own language if they cannot publish in English. The English language literature may thus contain more positive results than do other literatures. The phenomenon by which significant and positive results are more likely to be reported, and reported more prominently, than non-significant and negative ones is called publication bias. Thus we must not only trawl the published literature for studies, but use personal knowledge of ourselves and others to locate all the unpublished studies. Only then should we carry out the meta-analysis.

Whenwehaveallthestudieswhichmeetthedefinition,wecombinethemtogetacommonestimateoftheeffectofthetreatmentorriskfactor.Weregardthestudiesasprovidingseveralobservationsofthesamepopulationvalue.Therearetwostagesinmeta-analysis.Firstwecheckthatthestudiesdoprovideestimatesofthesamething.Second,wecalculatethecommonestimateanditsconfidenceinterval.Todothiswemayhavetheoriginaldatafromallthestudies,whichwecancombineintoonelargedatafilewithstudyasoneofthevariables,orwemayonlyhavesummarystatisticsobtainedfrompublications.

If the outcome measure is continuous, such as mean fall in blood pressure, we can check that subjects are from the same population by analysis of variance, with treatment or risk factor, study, and interaction between them in the model. Multiple regression can also be used, remembering that study is a categorical variable and dummy variables are required. We test the treatment times study interaction in the usual way. If the interaction is significant this indicates that the treatment effect is not the same in all studies, and so we cannot combine the studies. It is the interaction which is important. It does not matter much if the mean blood pressure varies from study to study. What matters is whether the effect of the treatment on blood pressure varies more than we would expect. We may want to examine the studies to see whether any characteristic of the studies explains this variation. This might be a feature of the subjects, the treatment or the data collection. If there is no interaction, then the data are consistent with the treatment or risk factor effect being constant. This is called a fixed effects model (see §10.12). We can drop the interaction term from the model and the treatment or risk factor effect is then the estimate we want. Its standard error and confidence interval are found as described in §17.2. If there is an interaction, we cannot estimate a single treatment effect. We can think of the studies as a random sample of the possible trials and estimate the mean treatment effect for this population. This is called the random effects model (§10.12). The confidence interval is usually much wider than that found using the fixed effects model.

Table 17.18. Odds ratios and confidence intervals in five studies of vitamin A supplementation in infectious disease (Glasziou and Mackerras 1993)

                                       Vitamin A           Controls
Study   Dose regime                    Deaths   Number     Deaths   Number
1       200 000 IU six-monthly         101      12 991     130      12 209
2       200 000 IU six-monthly          39       7 076      41       7 006
3       8 333 IU weekly                 37       7 764      80       7 755
4       200 000 IU four-monthly        152      12 541     210      12 264
5       200 000 IU once                138       3 786     167       3 411

Table 17.19. Odds ratios and confidence intervals in five studies of vitamin A supplementation in infectious disease

Study   Odds ratio   95% Confidence interval
1       0.73         0.56 to 0.95
2       0.94         0.61 to 1.46
3       0.46         0.31 to 0.68
4       0.70         0.57 to 0.87
5       0.73         0.58 to 0.93

If the outcome measure is dichotomous, such as survived or died, the estimate of the treatment or risk factor effect will be in the form of an odds ratio (§13.7). We can proceed in the same way as for a continuous outcome, using logistic regression (§17.8). Several other methods exist for checking the homogeneity of the odds ratios across studies, such as Woolf's test (see Armitage and Berry 1994) or that of Breslow and Day (1980). They all give similar answers, and, since they are based on different large-sample approximations, the larger the study samples the more similar the results will be. Provided the odds ratios are homogeneous across studies, we can then estimate the common odds ratio. This can be done using the Mantel-Haenszel method (see Armitage and Berry 1994) or by logistic regression.

Forexample,GlasziouandMackerras(1993)carriedoutameta-analysisofvitaminAsupplementationininfectiousdisease.TheirdataforfivecommunitystudiesareshowninTable17.18.Wecanobtainoddsratiosandconfidenceintervalsasdescribedin§13.7,showninTable17.19.
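The odds ratios for the individual studies can be reproduced from the counts in Table 17.18 with a short Python sketch. It assumes the usual large-sample confidence interval calculated on the log odds ratio scale, and it also computes a Mantel-Haenszel common odds ratio as one of the alternatives mentioned above. The function and variable names are invented for this illustration.

```python
import math

# (deaths, number) for the vitamin A and control groups, from Table 17.18
studies = [
    (101, 12991, 130, 12209),
    (39, 7076, 41, 7006),
    (37, 7764, 80, 7755),
    (152, 12541, 210, 12264),
    (138, 3786, 167, 3411),
]

def odds_ratio_ci(d1, n1, d0, n0, z=1.96):
    """Odds ratio with a large-sample confidence interval found on the log scale."""
    a, b, c, d = d1, n1 - d1, d0, n0 - d0  # deaths and survivors in each group
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel estimate of the common odds ratio across 2 by 2 tables."""
    num = den = 0.0
    for d1, n1, d0, n0 in tables:
        a, b, c, d = d1, n1 - d1, d0, n0 - d0
        total = n1 + n0
        num += a * d / total
        den += b * c / total
    return num / den

per_study = [odds_ratio_ci(*s) for s in studies]
for i, (or_, lo, hi) in enumerate(per_study, start=1):
    print(f"study {i}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")

common_or = mantel_haenszel_or(studies)
print(f"Mantel-Haenszel common OR = {common_or:.2f}")
```

The Mantel-Haenszel estimate weights each study by its size, so the small, imprecise studies contribute least to the pooled value.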

The common odds ratio can be found in several ways. To use logistic regression, we regress the event of death on vitamin A treatment and study. I shall treat the treatment as a dichotomous variable, set to 1 if treated with vitamin A, 0 if control. Study is a categorical variable, so we create dummy variables study1 to study4, which are set to one for studies 1 to 4 respectively, and to zero otherwise. We test the interaction by creating another set of variables, the products of study1 to study4 and vitamin A. Logistic regression of death on vitamin A, study and interaction gives a chi-squared statistic for the model of 496.99 with 9 degrees of freedom, which is highly significant. Logistic regression without the interaction terms gives 490.33 with 5 degrees of freedom. The difference is 496.99 − 490.33 = 6.66 with 9 − 5 = 4 degrees of freedom, which has P = 0.15, so we can drop the interaction from the model. The adjusted odds ratio for vitamin A is 0.70, 95% confidence interval 0.62 to 0.79, P < 0.0001.
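The P value for the change in chi-squared can be checked without statistical software, because the upper tail of a chi-squared distribution with an even number of degrees of freedom has a simple closed form. The sketch below uses that closed form. The function name is hypothetical and the chi-squared values are those quoted in the text.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x); exact, but for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even number of d.f.")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

full_chi2, full_df = 496.99, 9        # model with the interaction terms
reduced_chi2, reduced_df = 490.33, 5  # model without them

diff = full_chi2 - reduced_chi2
diff_df = full_df - reduced_df
p_interaction = chi2_upper_tail_even_df(diff, diff_df)
print(f"chi-squared = {diff:.2f}, {diff_df} d.f., P = {p_interaction:.2f}")
```

For an odd number of degrees of freedom a table or a statistical library would be needed instead.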

Fig.17.7.Meta-analysisoffivevitaminAtrials(dataofGlasziouandMackerras1993).Theverticallinesaretheconfidenceintervals.

TheoddsratiosandtheirconfidenceintervalsareshowninFigure17.7.Theconfidenceintervalisindicatedbyaline,thepointestimateoftheoddsratiobyacircle.Inthispicturethemostimportanttrialappearstobestudy2,withthewidestconfidenceinterval.Infact,itisthestudywiththeleasteffectonthewholeestimate,becauseitisthestudywheretheoddsratioisleastwellestimated.Inthesecondpicture,theoddsratioisindicatedbythemiddleofasquare.Theareaofthesquareisproportionaltothenumberofsubjectsinthestudy.Thisnowmakesstudy2appearrelativelyunimportant,andmakestheoverallestimatestandout.

There are many variants on this style of graph, which is sometimes called a forest diagram. The graph is often shown with the studies on the vertical axis and the odds ratio or difference in mean on the horizontal axis (Figure 17.8). The combined estimate of the effect may be shown as a lozenge or diamond shape and for odds ratios a logarithmic scale is often employed, as in Figure 17.8.

Fig.17.8.Meta-analysisoffivevitaminAtrials,verticalversion

17.12*OthermultifactorialmethodsThechoiceofmultiple,logisticorCoxregressionisdeterminedbythenatureoftheoutcomevariable:continuous,dichotomous,orsurvivaltimesrespectively.Thereareothertypesofoutcomevariableandcorrespondingmultifactorialtechniques.Ishallnotgointoanydetails,butthislistmayhelpshouldyoucomeacrossanyofthem.Iwouldrecommendyouconsultastatisticianshouldyouactuallyneedtouseoneofthesemethods.Thetechniquesfordealingwithpredictorvariablesdescribedin§17.2–17.4and§17.6applytoallofthem.

Iftheoutcomevariableiscategoricalwithmorethantwocategories,e.g.severaldiagnosticgroups,weuseaprocedurecalledmultinomiallogisticregression.Thisestimatesforasubjectwithgivenvaluesofthepredictorvariabletheprobabilitythatthesubjectwillbeineachcategory.Ifthecategoriesareordered,e.g.tumourstage,wecantaketheorderingintoaccountusingorderedlogisticregression.Boththesetechniquesarecloselyrelatedtologisticregression(§17.8).

If the outcome is a count, such as hospital admissions in a day or deaths related to a specific cause per week or month, we can use Poisson regression. This is particularly useful when we have many time intervals but the number of events per interval is small, so that the assumptions of multiple regression (§17.5) do not apply.

Aslightlydifferentproblemariseswithmulti-waycontingencytableswherethereisnoobviousoutcomevariable.Wecanuseatechniquecalledloglinearmodelling.Thisenablesustotesttherelationshipbetweenanytwoofthevariablesinthetableholdingtheothersconstant.

17M*Multiplechoicequestions93to97(Eachansweristrueorfalse)

93.Inmultipleregression,R2:

(a)isthesquareofthemultiplecorrelationcoefficient;

(b)wouldbeunchangedifweexchangedtheoutcome(dependent)variableandoneofthepredictor(independent)variables;

(c)iscalledtheproportionofvariabilityexplainedbytheregression;

(d)istheratiooftheerrorsumofsquarestothetotalsumofsquares;

(e)wouldincreaseifmorepredictorvariableswereaddedtothemodel.

Table 17.20. Analysis of variance for the effects of age, sex and ethnic group (Afro-Caribbean versus White) on inter-pupil distance (Imafedon, personal communication)

Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 37                   603.586
Age group              2                   124.587           62.293        6.81                0.003
Sex                    1                     1.072            1.072        0.12                0.7
Ethnic group           1                   134.783          134.783       14.74                0.0005
Residual              33                   301.782            9.145

94. The analysis of variance table for a study of the distance between the pupils of the eyes is shown in Table 17.20:

(a) there were 34 observations;

(b) there is good evidence of an ethnic group difference in the population;

(c) we can conclude that there is no difference in inter-pupil distance between men and women;

(d) there were two age groups;

(e) the difference between ethnic groups is likely to be due to a relationship between ethnicity and age in the sample.

Table 17.21. Logistic regression of graft failure after 6 months (Thomas et al. 1993)

Variable            Coef.      Std. Err.   z = coef/se   P         95% Conf.
White cell count      1.238    0.273         4.539       <0.001      0.695
Graft type 1          0.175    0.876         0.200        0.842     -1.570
Graft type 2          0.973    1.030         0.944        0.348     -1.080
Graft type 3          0.038    1.518         0.025        0.980     -2.986
Female               -0.289    0.767        -0.377        0.708     -1.816
Age                   0.022    0.035         0.633        0.528     -0.048
Smoker                0.998    0.754         1.323        0.190     -0.504
Diabetic              1.023    0.709         1.443        0.153     -0.389
Constant            -13.726    3.836        -3.578        0.001    -21.369

Number of observations = 84, chi-squared = 38.05, d.f. = 8, P < 0.0001.

95.Table17.21showsthelogisticregressionofveingraftfailureonsomepotentialexplanatoryvariables.Fromthisanalysis:

(a)patientswithhighwhitecellcountsweremorelikelytohavegraftfailure;

(b)thelogoddsofgraftfailureforadiabeticisbetween0.389lessand2.435greaterthanthatforanon-diabetic;

(c)graftsweremorelikelytofailinfemalesubjects,thoughthisisnotsignificant;

(d)therewerefourtypesofgraft;

(e)anyrelationshipbetweenwhitecellcountandgraftfailuremaybeduetosmokershavinghigherwhitecellcounts.

Fig.17.9.Oralandforeheadtemperaturemeasurementsmadeinagroupofpyrexicpatients

96.ForthedatainFigure17.9:

(a)therelationshipcouldbeinvestigatedbylinearregression;

(b)an‘oralsquared’termcouldbeusedtotestwhetherthereisanyevidencethattherelationshipisnotastraightline;

(c)ifan‘oralsquared’termwereincludedtherewouldbe2degreesoffreedomforthemodel;

(d)thecoefficientsofan‘oral’andan‘oralsquared’termwouldbeuncorrelated;

(e)theestimationofthecoefficientofaquadratictermwouldbeimprovedbysubtractingthemeanfromtheoraltemperaturebeforesquaring.

Table 17.22. Cox regression of time to readmission for asthmatic children following discharge from hospital (Mitchell et al. 1994)

Variable                             Coef.     Std. err.   coef/se    P
Boy                                  -0.197    0.088        -2.234     0.026
Age                                  -0.126    0.017        -7.229    <0.001
Previous admissions (square root)     0.395    0.034        11.695    <0.001
Inpatient i.v. therapy                0.267    0.093         2.876     0.004
Inpatient theophyline                -0.728    0.295        -2.467     0.014

Number of observations = 1024, χ² = 167.15, 5 d.f., P < 0.0001.

97.Table17.22showstheresultsofanobservationalstudyfollowingupasthmaticchildrendischargedfromhospital.Fromthistable:

(a)theanalysiscouldonlyhavebeendoneifallchildrenhadbeenreadmittedtohospital;

(b)theproportionalhazardsmodelwouldhavebeenbetterthanCoxregression;

(c)Boyshaveashorteraveragetimebeforereadmissionthandogirls;

(d)theuseoftheophylinepreventsreadmissiontohospital;

(e)childrenwithseveralpreviousadmissionshaveanincreasedriskofreadmission.

Fig.17.10.Cushionvolumeagainstnumberofpairsofsomitesfortwogroupsofmouseembryos(WebbandBrown,personalcommunication)

Table17.23.Numberofsomitesandcushionvolumeinmouseembryos

Normal Trisomy-16

som. c.vol. som. c.vol. som. c.vol. som.

17 2.674 28 3.704 15 0.919 28

20 3.299 31 6.358 17 2.047 28

21 2.486 32 3.966 18 3.302 28

23 1.202 32 7.184 20 4.667 31

23 4.263 34 8.803 20 4.930 32

23 4.620 35 4.373 23 4.942 34

25 4.644 40 4.465 23 6.500 35

25 4.403 42 10.940 23 7.122 36

27 5.417 43 6.035 25 7.688 40

27 4.395 25 4.230 42

27 8.647

17E*Exercise:AmultipleregressionanalysisTrisomy-16micecanbeusedasananimalmodelforDown'ssyndrome.Thisanalysislooksatthevolumeofaregionoftheheart,theatrioventricularcushion,ofamouseembryo,comparedbetweentrisomicandnormalembryos.Theembryoswereatvaryingstagesofdevelopment,indicatedbythenumberofpairsofsomites(precursorsofvertebrae).Figure17.10andTable17.23showthedata.Thegroupwascoded1=normal,2=trisomy-16.Table17.24showstheresultsofaregressionanalysisandFigure17.11showsresidualplots.

1.Isthereanyevidenceofadifferenceinvolumebetweengroupsforgivenstageofdevelopment?

2.Figure17.11showsresidualplotsfortheanalysisofTable17.24.Arethereanyfeaturesofthedatawhichmightmaketheanalysisinvalid?

Table 17.24. Regression of cushion volume on number of pairs of somites and group in mouse embryos

Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 39                   328.976
Due to regression      2                   197.708          98.854        27.86                P < 0.0001
Residual              37                   131.268           3.548

Variable   Coef.   Std. Err.   t      P        95% Conf. interval
group      2.44    0.60        4.06   <0.001   1.29 to 3.65
somites    0.27    0.04        6.70   <0.001   0.19 to 0.36

Fig.17.11.ResidualagainstnumberofpairsofsomitesandNormalplotofresidualsfortheanalysisofTable17.24

3.ItappearsfromFigure17.10thattherelationshipbetweenvolumeandnumberofpairsofsomitesmaynotbethesameinthetwogroups.Table17.25showstheanalysisofvarianceforregressionanalysisincludinganinteractionterm.CalculatetheF-ratiototesttheevidencethattherelationshipisdifferentinnormalandtrisomy-16embryos.YoucanfindtheprobabilityfromTable10.1,usingthefactthatthesquarerootofFwith1andndegreesoffreedomistwithndegreesoffreedom.

Table 17.25. Analysis of variance for regression with number of pairs of somites × group interaction

Source of variation   Degrees of freedom   Sum of squares   Mean square   Variance ratio (F)   Probability
Total                 39                   328.976
Due to regression      3                   207.139          69.046        20.40                P < 0.0001
Residual              36                   121.837           3.384



18

Determinationofsamplesize

18.1* Estimation of a population mean

One of the questions most frequently asked of a medical statistician is ‘How large a sample should I take?’ In this chapter we shall see how statistical methods for deciding sample sizes can be used in practice as an aid in designing investigations. The methods we shall use are large sample methods, that is, they assume that large sample methods will be used in the analysis and so take no account of degrees of freedom.

Wecanusetheconceptsofstandarderrorandconfidenceintervaltohelpdecidehowmanysubjectsshouldbeincludedinasample.Ifwewanttoestimatesomepopulationquantity,suchasthemean,andweknowhowthestandarderrorisrelatedtothesamplesize,thenwecancalculatethesamplesizerequiredtogiveaconfidenceintervalwiththedesiredwidth.Thedifficultyisthatthestandarderrormayalsodependeitheronthequantitywewishtoestimate,oronsomeotherpropertyofthepopulation,suchasthestandarddeviation.Wemustestimatethesequantitiesfromdataalreadyavailable,orcarryoutapilotstudytoobtainaroughestimate.Thecalculationofsamplesizecanonlybeapproximateanyway,sotheestimatesusedtodoitneednotbeprecise.

If we want to estimate the mean of a population, we can use the formula for the standard error of a mean, s/√n, to estimate the sample size required. For example, suppose we wish to estimate the mean FEV1 in a population of young men. We know that in another study FEV1 had standard deviation s = 0.67 litre (§4.8). We therefore expect the standard error of the mean to be 0.67/√n. We can set the size of standard error we want and choose the sample size to achieve this. We might decide that a standard error of 0.1 litre is what we want, so that we would estimate the mean to within 1.96 × 0.1 = 0.2 litre. Then: SE = 0.67/√n, n = 0.67²/SE² = 0.67²/0.1² = 45. We can also see what the standard error and width of the 95% confidence interval would be for different values of n:

n                         10      20      50      100     200     500
standard error            0.212   0.150   0.095   0.067   0.047   0.030
95% confidence interval   ±0.42   ±0.29   ±0.19   ±0.13   ±0.09   ±0.06

So that if we had a sample size of 200, we would expect the 95% confidence interval to be 0.09 litre on either side of the sample mean (1.96 standard errors) whereas with a sample of 50 the 95% confidence interval would be 0.19 litre on either side of the mean.
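The FEV1 calculation is easy to script. The following Python sketch, with function names chosen only for this illustration, finds the sample size for a target standard error and tabulates the standard error and 95% confidence interval half-width for several sample sizes.

```python
import math

def n_for_standard_error(sd, se):
    """Sample size at which the standard error of the mean equals `se`."""
    return (sd / se) ** 2

def se_of_mean(sd, n):
    """Standard error of a mean from a sample of size n."""
    return sd / math.sqrt(n)

sd = 0.67  # standard deviation of FEV1 (litres), taken from the earlier study
n_required = n_for_standard_error(sd, 0.1)
print(f"n for SE of 0.1 litre: {n_required:.1f}")

rows = []
for n in (10, 20, 50, 100, 200, 500):
    se = se_of_mean(sd, n)
    rows.append((n, se, 1.96 * se))
    print(f"n = {n:4d}  SE = {se:.3f}  95% CI = +/-{1.96 * se:.2f}")
```

In practice the calculated n would be rounded up to a whole number of subjects.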

18.2* Estimation of a population proportion

When we wish to estimate a proportion we have a further problem. The standard error depends on the very quantity which we wish to estimate. We must guess the proportion first. For example, suppose we wish to estimate the prevalence of a disease, which we suspect to be about 2%, to within 5%, i.e. to the nearest 1 per 1000. The unknown proportion, p, is guessed to be 0.02 and we want the 95% confidence interval to be 0.001 on either side, so the standard error must be half this, 0.0005.

$$0.0005 = \sqrt{\frac{0.02(1-0.02)}{n}}, \qquad n = \frac{0.02 \times 0.98}{0.0005^2} = 78\,400$$

The accurate estimation of very small proportions requires very large samples. This is a rather extreme example and we do not usually need to estimate proportions with such accuracy. A wider confidence interval, obtainable with a smaller sample, is usually acceptable. We can also ask ‘If we can only afford a sample size of 1000, what will be the standard error?’

$$\sqrt{\frac{0.02(1-0.02)}{1000}} = 0.0044$$

The 95% confidence limits would be, roughly, p ± 0.009. For example, if the estimate were 0.02, the 95% confidence limits would be 0.011 to 0.029. If this accuracy were sufficient we could proceed.
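Both directions of the prevalence calculation can be sketched in Python: from a target standard error to a sample size, and from an affordable sample size to the standard error. The function names are invented for this example, and the standard error of a proportion is taken to be √(p(1 − p)/n).

```python
import math

def n_for_proportion(p, se):
    """Sample size at which the standard error of a proportion p equals `se`."""
    return p * (1 - p) / se ** 2

def se_of_proportion(p, n):
    """Standard error of a sample proportion when the true proportion is p."""
    return math.sqrt(p * (1 - p) / n)

p_guess = 0.02
n_needed = n_for_proportion(p_guess, 0.0005)
print(f"n for SE of 0.0005: {n_needed:.0f}")

se_1000 = se_of_proportion(p_guess, 1000)
half_width = 1.96 * se_1000
print(f"SE with n = 1000: {se_1000:.4f}, 95% limits: p +/- {half_width:.3f}")
```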

TheseestimatesofsamplesizearebasedontheassumptionthatthesampleislargeenoughtousetheNormaldistribution.Ifaverysmallsampleisindicateditwillbeinadequateandothermethodsmustbeusedwhicharebeyondthescopeofthisbook.

18.3* Sample size for significance tests

We often want to demonstrate the existence of a difference or relationship as well as wanting to estimate its magnitude, as in a clinical trial, for example. We base these sample size calculations on significance tests, using the power of a test (§9.9) to help choose the sample size required to detect a difference if it exists. The power of a test is related to the postulated difference in the population, the standard error of the sample difference (which in turn depends on the sample size), and the significance level, which we usually take to be α = 0.05. These quantities are linked by an equation which enables us to determine any one of them given the others. We can then say what sample size would be required to detect any given difference. We then decide what difference we need to be able to detect. This might be a difference which would have clinical importance or a difference which we think the treatment may produce.

Suppose we have a sample which gives an estimate d of the population difference µd. We assume d comes from a Normal distribution with mean µd and has standard error SE(d). Here d might be the difference between two means, two proportions, or anything else we can calculate from data. We are interested in testing the null hypothesis that there is no difference in the population, i.e. µd = 0. We are going to use a significance test at the α level, and want the power, the probability of detecting a significant difference, to be P.

I shall define $u_\alpha$ to be the value such that the Standard Normal distribution (mean 0 and variance 1) is less than $-u_\alpha$ or greater than $u_\alpha$ with probability α. For example, $u_{0.05} = 1.96$. The probability of lying between $-u_\alpha$ and $u_\alpha$ is 1 − α. Thus $u_\alpha$ is the two sided α probability point of the Standard Normal distribution, as shown in Table 7.2.

If the null hypothesis were true, the test statistic d/SE(d) would be from a Standard Normal distribution. We reject the null hypothesis at the α level if the test statistic is greater than $u_\alpha$ or less than $-u_\alpha$, 1.96 for the usual 5% significance level. For significance we must have:

$$\frac{d}{SE(d)} < -u_\alpha \quad \text{or} \quad \frac{d}{SE(d)} > u_\alpha$$

Let us assume that we are trying to detect a difference such that d will be greater than 0. The first alternative is then extremely unlikely and can be ignored. Thus we must have, for a significant difference: $d/SE(d) > u_\alpha$, so $d > u_\alpha SE(d)$. The critical value which d must exceed is $u_\alpha SE(d)$.

Now, d is a random variable, and for some samples it will be greater than its mean, $\mu_d$, for some it will be less than its mean. d is an observation from a Normal distribution with mean $\mu_d$ and variance $SE(d)^2$. We want d to exceed the critical value with probability P, the chosen power of the test. The value of the Standard Normal distribution which is exceeded with probability P is $-u_{2(1-P)}$ (see Figure 18.1). (1 − P) is often represented as β (beta). This is the probability of failing to obtain a significant difference when the null hypothesis is false and the population difference is $\mu_d$. It is the probability of a Type II error (§9.4). The value which d exceeds with probability P is the mean minus $u_{2(1-P)}$ standard deviations: $\mu_d - u_{2(1-P)}SE(d)$. Hence for significance this must exceed the critical value, $u_\alpha SE(d)$. This gives

$$\mu_d - u_{2(1-P)}SE(d) = u_\alpha SE(d)$$

Putting the correct standard error formula into this will yield the required sample size. We can rearrange it as

$$\mu_d^2 = (u_\alpha + u_{2(1-P)})^2 SE(d)^2$$

This is the condition which must be met if we are to have a probability P of detecting a significant difference at the α level. We shall use the expression $(u_\alpha + u_{2(1-P)})^2$ a lot, so for convenience I shall denote it by f(α,P). Table 18.1 shows the values of the factor f(α,P) for different values of α and P. The usual value used for α is 0.05, and P is usually 0.80, 0.90, or 0.95.

Fig. 18.1. Relationship between P and $u_{2(1-P)}$

Table 18.1. Values of $f(\alpha,P) = (u_\alpha + u_{2(1-P)})^2$ for different P and α

              Significance level, α
Power, P      0.05      0.01
0.50           3.8       6.6
0.70           6.2       9.6
0.80           7.9      11.7
0.90          10.5      14.9
0.95          13.0      17.8
0.99          18.4      24.0
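The factor f(α,P) can be computed directly from the Standard Normal distribution. In the Python sketch below, which uses `statistics.NormalDist` from the standard library, the two sided point $u_\alpha$ is the quantile at 1 − α/2 and $u_{2(1-P)}$ is the quantile at P. The function name `f_factor` is an assumption of this example. The values it gives agree with Table 18.1 to within rounding of the last digit.

```python
from statistics import NormalDist

def f_factor(alpha, power):
    """f(alpha, P) = (u_alpha + u_2(1-P))**2 for a two sided test."""
    z = NormalDist()                        # Standard Normal: mean 0, variance 1
    u_alpha = z.inv_cdf(1 - alpha / 2)      # two sided alpha probability point
    u_power = z.inv_cdf(power)              # value exceeded with probability 1 - P
    return (u_alpha + u_power) ** 2

for power in (0.50, 0.70, 0.80, 0.90, 0.95, 0.99):
    print(f"P = {power:.2f}: "
          f"f(0.05, P) = {f_factor(0.05, power):.2f}, "
          f"f(0.01, P) = {f_factor(0.01, power):.2f}")
```

At P = 0.50 the second term is zero, so f(α,P) is just the square of the significance point, about 3.84 for α = 0.05.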

Sometimeswedonotexpectthenewtreatmenttobebetterthanthestandardtreatment,buthopethatitwillbeasgood.Wewanttotesttreatmentswhichmaybeasgoodastheexistingtreatmentbecausethenewtreatmentmaybecheaper,havefewersideeffects,belessinvasive,orunderourpatent.Wecannotusethepowermethodbasedonthedifferencewewanttobeabletodetect,becausewearenotlookingforadifference.Whatwedoisspecifyhowdifferentthetreatmentsmightbeinthepopulationandstillberegardedasequivalent,anddesignourstudytodetectsuchadifference.Thiscangetrathercomplicatedandspecialised,soIshallleavethedetailstoMachinetal.(1998).

18.4*Comparisonoftwomeans

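When we are comparing the means of two samples, sample sizes n1 and n2, from populations with means µ1 and µ2, with the variance of the measurements being σ², we have µd = µ1 − µ2 and

$$SE(d) = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}$$

so the equation becomes:

$$(\mu_1 - \mu_2)^2 = f(\alpha,P)\,\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$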

For example, suppose we want to compare biceps skinfold in patients with Crohn's disease and coeliac disease, following up the inconclusive comparison of biceps skinfold in Table 10.4 with a larger study. We shall need an estimate of the variability of biceps skinfold in the population we are considering. We can usually get this from the medical literature, or as here from our own data. If not we must do a pilot study, a small preliminary investigation to collect some data and calculate the standard deviation. For the data of Table 10.4, the within-groups standard deviation is 2.3 mm. We must decide what difference we want to detect. In practice this may be difficult. In my small study the mean skinfold thickness in the Crohn's patients was 1 mm greater than in my coeliac patients. I will design my larger study to detect a difference of 0.5 mm. I shall take the usual significance level of 0.05. I want a fairly high power, so that there is a high probability of detecting a difference of the chosen size should it exist. I shall take 0.90, which gives f(α,P) = 10.5 from Table 18.1. The equation becomes:

$$0.5^2 = 10.5 \times 2.3^2 \times \left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

We have one equation with two unknowns, so we must decide on the relationship between n1 and n2. I shall try to recruit equal numbers in the two groups:

$$0.5^2 = 10.5 \times 2.3^2 \times \frac{2}{n}, \qquad n = \frac{10.5 \times 2.3^2 \times 2}{0.5^2} = 444.36$$

and I need 444 subjects in each group.
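With equal groups the calculation reduces to n = 2 f(α,P) σ²/(µ1 − µ2)². The Python sketch below applies this to the skinfold example and also turns the formula round to give the difference detectable with a given n. The function names are illustrative only, and f defaults to the tabulated 10.5 for a 5% significance level and 90% power.

```python
import math

def n_per_group(difference, sd, f=10.5):
    """Subjects needed in each of two equal groups to detect `difference`."""
    return 2 * f * sd ** 2 / difference ** 2

def detectable_difference(n, sd, f=10.5):
    """Difference between means detectable with n subjects in each group."""
    return sd * math.sqrt(2 * f / n)

sd = 2.3  # within-groups standard deviation of biceps skinfold (mm)
n_skinfold = n_per_group(0.5, sd)
print(f"n per group to detect 0.5 mm: {n_skinfold:.2f}")

for n in (10, 20, 50, 100, 200, 500, 1000):
    print(f"n = {n:5d}: detectable difference = {detectable_difference(n, sd):.2f} mm")
```

Halving the difference to be detected multiplies the required sample size by four, which is why small differences are so expensive to demonstrate.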

Itmaybethatwedonotknowexactlywhatsizeofdifferenceweareinterestedin.Ausefulapproachistolookatthesizeofthedifferencewecoulddetectusingdifferentsamplesizes,asinTable18.2.Thisisdonebyputtingdifferentvaluesofninthesamplesizeequation.

Table 18.2. Difference in mean biceps skinfold thickness (mm) detected at the 5% significance level with power 90% for different sample sizes, equal groups

Size of each group, n   Difference detected with probability 0.90
10                      3.33
20                      2.36
50                      1.49
100                     1.05
200                     0.75
500                     0.47
1000                    0.33

Table 18.3. Sample size required in each group to detect a difference between two means at the 5% significance level with power 90%, using equally sized samples

Difference in                  Difference in                  Difference in
standard deviations   n        standard deviations   n        standard deviations   n
0.01                  210000   0.1                   2100     0.6                   58
0.02                  52500    0.2                   525      0.7                   43
0.03                  23333    0.3                   233      0.8                   33
0.04                  13125    0.4                   131      0.9                   26
0.05                  8400     0.5                   84       1.0                   21

Ifwemeasurethedifferenceintermsofstandarddeviations,wecanmakeageneraltable.Table18.3givesthesamplesizerequiredtodetectdifferencesbetweentwoequallysizedgroups.Altman(1982)givesaneatgraphicalmethodofcalculation.

Wedonotneedtohaven1=n2=n.Wecancalculateµ1-µ2fordifferentcombinationsofn1andn2.Thesizeofdifference,intermsofstandarddeviations,whichwouldbedetectedisgiveninTable18.4.Wecanseefromthisthatwhatmattersisthesizeofthesmallersample.Forexample,ifwehave10ingroup1and20ingroup2,wedonotgainverymuchbyincreasingthesizeofgroup2:increasinggroup2from20to100produceslessadvantagethanincreasinggroup1from10to20.Inthiscasetheoptimumisclearlytohavesamplesofequalsize.

Table 18.4. Difference (in standard deviations) detectable at the 5% significance level with power 90% for different sample sizes, unequal groups

         n1
n2       10     20     50     100    200    500    1000
10       1.45   1.25   1.13   1.08   1.05   1.03   1.03
20       1.25   1.03   0.85   0.80   0.75   0.75   0.73
50       1.13   0.85   0.65   0.55   0.50   0.48   0.48
100      1.08   0.80   0.55   0.45   0.40   0.35   0.35
200      1.05   0.75   0.50   0.40   0.33   0.28   0.25
500      1.03   0.75   0.48   0.35   0.28   0.20   0.18
1000     1.03   0.73   0.48   0.35   0.25   0.18   0.15

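18.5* Comparison of two proportions

Using the same approach, we can also calculate the sample sizes for comparing two proportions. If we have two samples with sizes n1 and n2 from Binomial populations with proportions p1 and p2, the difference is µd = p1 − p2, and the standard error of the difference between the sample proportions (§8.6) is:

$$SE(d) = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$

If we put these into the previous formula we have:

$$(p_1 - p_2)^2 = f(\alpha,P)\left(\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}\right)$$

The size of the proportions, p1 and p2, is important, as well as their difference. (The significance test implied here is similar to the chi-squared test for a 2 by 2 table.) When the sample sizes are equal, i.e. n1 = n2 = n, we have

$$n = \frac{f(\alpha,P)\left(p_1(1-p_1) + p_2(1-p_2)\right)}{(p_1 - p_2)^2}$$

There are several slight variations on this formula. Different computer programs may therefore give slightly different sample size estimates.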

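Suppose we wish to compare the survival rate with a new treatment with that with an old treatment, where it is about 60%. What values of n1 and n2 will have 90% chance of giving a significant difference at the 5% level for different values of p2? For P = 0.90 and α = 0.05, f(α,P) = 10.5. Suppose we wish to detect an increase in the survival rate on the new treatment to 80%, so p2 = 0.80, and p1 = 0.60.

$$n = \frac{10.5 \times (0.80 \times 0.20 + 0.60 \times 0.40)}{(0.80 - 0.60)^2} = 105$$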

Table 18.5. Sample size in each group required to detect different proportions p2 when p1 = 0.6 at the 5% significance level with power 90%, equal groups

p2      n
0.90    39
0.80    105
0.70    473
0.65    1964

Table 18.6. n2 for different n1 and p2 when p1 = 0.05 at the 5% significance level with power 90%

        n1
p2      50      100     200     500     1000    2000    5000
0.06    .       .       .       .       .       .       237000
0.07    .       .       .       .       .       4500    2300
0.08    .       .       .       .       1900    1200    970
0.10    .       .       1500    630     472     420     390
0.15    5400    270     180     150     140     140     140
0.20    134     96      84      78      76      76      75

We would require 105 in each group to have a 90% chance of showing a significant difference if the population proportions were 0.6 and 0.8.

When we do not have a clear idea of the value of p2 in which we are interested, we can calculate the sample size required for several proportions, as in Table 18.5. It is immediately apparent that to detect small differences between proportions we need very large samples.
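The equal-groups calculation behind Table 18.5 can be sketched in Python as follows. The sketch assumes the formula n = f(α,P)[p1(1 - p1) + p2(1 - p2)]/(p1 - p2)² with f(α,P) = 10.5 and rounds up to the next whole subject; the function name `n_per_group` is illustrative, and other programs may use slightly different variants of the formula.

```python
import math


def n_per_group(p1, p2, f=10.5):
    """Sample size per group for comparing two proportions with equal groups.

    f is f(alpha, P); 10.5 corresponds to alpha = 0.05 and power 90%.
    """
    n = f * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    # Round up to a whole subject; the small tolerance guards against
    # floating point noise when n is a whole number such as 105.
    return math.ceil(n - 1e-9)


table_18_5 = {p2: n_per_group(0.60, p2) for p2 in (0.90, 0.80, 0.70, 0.65)}
for p2, n in table_18_5.items():
    print(f"p2 = {p2:.2f}  n = {n}")
```

Halving the difference from 0.10 to 0.05 roughly quadruples the required sample, because the difference enters the formula squared.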

The case where samples are of equal size is usual in experimental studies, but not in observational studies. Suppose we wish to compare the prevalence of a certain condition in two populations. We expect that in one population it will be 5% and that it may be more common in the second. We can rearrange the equation:

$$n_2 = \frac{f(\alpha, P)\,p_2(1-p_2)}{(p_1 - p_2)^2 - f(\alpha, P)\,p_1(1-p_1)/n_1}$$

Table 18.6 shows n2 for different n1 and p2. For some values of n1 we get a negative value of n2. This means that no value of n2 is large enough. It is clear that when the proportions themselves are small, the detection of small differences requires very large samples indeed.
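The rearranged calculation for unequal groups can be sketched as below. It assumes the same relation between the difference and its standard error as in the equal-groups case, solved for n2, with f(α,P) = 10.5; the function name `n2_required` is illustrative. When the denominator is zero or negative the function returns `None`, corresponding to the dots in Table 18.6.

```python
def n2_required(p1, p2, n1, f=10.5):
    """Size of the second sample needed, given the first sample size n1.

    Returns None when no n2 is large enough (non-positive denominator).
    """
    denominator = (p1 - p2) ** 2 - f * p1 * (1 - p1) / n1
    if denominator <= 0:
        return None
    return f * p2 * (1 - p2) / denominator


for n1 in (50, 100, 200, 500, 1000, 2000, 5000):
    n2 = n2_required(0.05, 0.20, n1)
    print(n1, None if n2 is None else round(n2))
```

When the denominator is only just above zero the result is extremely sensitive to rounding in f(α,P), so entries of that kind are better read as orders of magnitude than as exact figures.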

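18.6* Detecting a correlation

Investigations are often set up to look for a relationship between two continuous variables. It is convenient to treat this as an estimation of or test of a correlation coefficient. The correlation coefficient has an awkward distribution, which tends only very slowly to the Normal, even when both variables themselves follow a Normal distribution. We can use Fisher's z transformation:

$$z = \frac{1}{2}\log_e\left(\frac{1+r}{1-r}\right)$$

which follows a Normal distribution with mean

$$z_\rho = \frac{1}{2}\log_e\left(\frac{1+\rho}{1-\rho}\right) + \frac{\rho}{2(n-1)}$$

and variance 1/(n - 3) approximately, where ρ is the population correlation coefficient and n is the sample size (§11.10). For sample size calculations we can approximate zρ by

$$z_\rho = \frac{1}{2}\log_e\left(\frac{1+\rho}{1-\rho}\right)$$

Thus we have

$$\left[\frac{1}{2}\log_e\left(\frac{1+\rho}{1-\rho}\right)\right]^2 = \frac{f(\alpha, P)}{n-3}$$

and we can estimate n, ρ or P given the other two. Table 18.7 shows the sample size required to detect a correlation coefficient with a power of P = 0.9 and a significance level α = 0.05.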

Table 18.7. Approximate sample size required to detect a correlation at the 5% significance level with power 90%

ρ       n         ρ      n       ρ      n

0.01    100000    0.1    1000    0.6    25
0.02    26000     0.2    260     0.7    17
0.03    12000     0.3    110     0.8    12
0.04    6600      0.4    62      0.9    8
0.05    4200      0.5    38
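The values in Table 18.7 can be approximated with a short calculation. This sketch assumes the relation n = f(α,P)/zρ² + 3, where zρ is Fisher's z transformation of ρ without the small correction term, and f(α,P) = 10.5; the function name `n_for_correlation` is illustrative. The larger entries in the table appear to be rounded to two significant figures.

```python
import math


def n_for_correlation(rho, f=10.5):
    """Approximate sample size needed to detect a population correlation rho."""
    z_rho = 0.5 * math.log((1 + rho) / (1 - rho))  # Fisher's z transformation
    return f / z_rho ** 2 + 3


for rho in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"rho = {rho:.1f}  n = {math.ceil(n_for_correlation(rho))}")
```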

18.7* Accuracy of the estimated sample size

In this chapter I have assumed that samples are sufficiently large for sampling distributions to be approximately Normal and for estimates of variance to be good estimates. With very small samples this may not be the case. Various more accurate methods exist, but any sample size calculation is approximate and, except for very small samples, say less than 10, the methods described above should be adequate. When the sample is very small, we might need to replace the significance test component of f(α,P) by the corresponding number from the t distribution.

These methods depend on assumptions about the size of difference sought and the variability of the observations. It may be that the population to be studied does not have exactly the same characteristics as those from which the standard deviation or proportions were estimated. The likely effects of changes in these can be examined by putting different values of them in the formula. However, there is always an element of venturing into the unknown when embarking on a study and we can never be sure that the sample and population will be as we expect. The determination of sample size as described above is thus only a guide, and it is probably as well always to err on the side of a larger sample when coming to a final decision.

The choice of power is arbitrary, in that there is no optimum choice of power for a study. I usually recommend 90%, but 80% is often quoted. This gives smaller estimated sample sizes, but, of course, a greater chance of failing to detect effects.
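The effect of choosing 80% rather than 90% power can be illustrated by computing f(α,P) directly. The sketch below assumes that f(α,P) is the square of the sum of the two-sided Normal deviate for α and the one-sided Normal deviate for P; this assumption reproduces the value 10.5 used in this chapter for α = 0.05 and P = 0.90. The function name `f_alpha_power` is illustrative.

```python
from statistics import NormalDist


def f_alpha_power(alpha=0.05, power=0.90):
    """f(alpha, P), assumed here to be (z for two-sided alpha + z for power) squared."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = standard_normal.inv_cdf(power)          # about 1.28 for power 90%
    return (z_alpha + z_power) ** 2


f90 = f_alpha_power(0.05, 0.90)
f80 = f_alpha_power(0.05, 0.80)
print(f"f(0.05, 0.90) = {f90:.2f}")
print(f"f(0.05, 0.80) = {f80:.2f}")
print(f"ratio = {f80 / f90:.2f}")
```

Because every sample size formula in this chapter is proportional to f(α,P), the ratio shows that moving from 90% to 80% power reduces the estimated sample size by about a quarter.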

For a fuller treatment of sample size estimation and fuller tables see Machin et al. (1998) and Lemeshow et al. (1990).

18.8* Trials randomized in clusters

When we randomize by cluster rather than individual (§2.11) we lose power compared to an individually-randomized trial of the same size. Hence to get the power we want, we must increase the sample size from that required for an individually randomized trial. The ratio of the number of patients required for a cluster trial to that for a simply randomized trial is called the design effect of the study. It depends on the number of subjects per cluster. For the purpose of sample size calculations we usually assume this is constant.

If the outcome measurement is continuous, e.g. serum cholesterol, a simple method of analysis is based on the mean of the observations for all subjects in the cluster, and compares these means between the treatment groups (§10.13). We will denote the variance of observations within one cluster by $s_w^2$ and assume that this variance is the same for all clusters. If there are m subjects in each cluster then the variance of a single sample mean is $s_w^2/m$. The true cluster mean (unknown) will vary from cluster to cluster, with variance $s_c^2$ (see §10.12). The observed variance of the cluster means will be the sum of the variance between clusters and the variance within clusters, i.e. variance of outcome = $s_c^2 + s_w^2/m$. Hence the standard error for the difference between means is given by

$$\sqrt{\left(s_c^2 + \frac{s_w^2}{m}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where n1 and n2 are the numbers of clusters in the two groups. For most trials n1 = n2 = n, so

$$\sqrt{\left(s_c^2 + \frac{s_w^2}{m}\right)\frac{2}{n}}$$

Hence, using the general method of §18.3, we can calculate the required number of clusters by

$$n = \frac{f(\alpha, P) \times 2\left(s_c^2 + s_w^2/m\right)}{(\mu_1 - \mu_2)^2}$$

When the outcome is a dichotomous, ‘yes or no’ variable, we replace $s_w^2$ by p(1 - p), where p is the probability of a ‘yes’.

For example, in a proposed study of a behavioural intervention to lower cholesterol in general practice, practices were to be randomised into two groups, one to offer intensive dietary intervention by specially trained practice nurses using a behavioural approach and the other to usual general practice care. The outcome measure would be mean cholesterol levels in patients attending each practice one year later. Estimates of between practice variance and within practice variance were obtained from the MRC thrombosis prevention trial (Meade et al. 1992) and were $s_c^2$ = 0.0046 and $s_w^2$ = 1.28 respectively. The minimum difference considered to be clinically relevant was 0.1 mmol/l. If we recruit 50 patients per practice, we would have $s^2 = s_c^2 + s_w^2/m$ = 0.0046 + 1.28/50 = 0.0302. If we choose power P = 0.90 and significance level α = 0.05, from Table 18.1 f(α,P) = 10.5. The number of practices required to detect a difference of 0.1 mmol/l is given by n = 10.5 × 0.0302 × 2/0.1² = 63 in each group. This would give us 63 × 50 = 3150 patients in each group. A completely randomized trial without clusters would have $s^2$ = 0.0046 + 1.28 = 1.2846 and we would need n = 10.5 × 1.2846 × 2/0.1² = 2698 patients per group. Thus the design effect of having clusters of 50 patients is 3150/2698 = 1.17.
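The arithmetic of the cholesterol example can be checked with a short script. It is a sketch under the assumptions stated in the text (f(α,P) = 10.5, between-practice variance 0.0046, within-practice variance 1.28, 50 patients per practice, difference 0.1 mmol/l); the function and variable names are illustrative. The script keeps the unrounded values, which the text rounds to 63 practices and 2698 patients.

```python
def clusters_per_group(var_between, var_within, m, difference, f=10.5):
    """Number of clusters needed per group, analysing cluster means."""
    variance_of_cluster_mean = var_between + var_within / m
    return f * 2 * variance_of_cluster_mean / difference ** 2


def individuals_per_group(var_between, var_within, difference, f=10.5):
    """Number of patients per group if individuals were randomized instead."""
    return f * 2 * (var_between + var_within) / difference ** 2


n_clusters = clusters_per_group(0.0046, 1.28, m=50, difference=0.1)
n_individuals = individuals_per_group(0.0046, 1.28, difference=0.1)
print(f"practices per group: {n_clusters:.2f}")
print(f"patients per group, cluster trial: {n_clusters * 50:.0f}")
print(f"patients per group, individual trial: {n_individuals:.0f}")
print(f"design effect: {n_clusters * 50 / n_individuals:.2f}")
```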

The equation for the design effect is

$$\mathrm{Deff} = \frac{s_c^2 + s_w^2/m}{(s_c^2 + s_w^2)/m} = \frac{m s_c^2 + s_w^2}{s_c^2 + s_w^2}$$

If we calculate an intra-class correlation coefficient (ICC) for these clusters (§11.13), we have

$$\mathrm{ICC} = \frac{s_c^2}{s_c^2 + s_w^2}$$

In this context, the ICC is called the intra-cluster correlation coefficient. By a bit of algebra we get

$$\mathrm{Deff} = 1 + (m - 1)\,\mathrm{ICC}$$

If there is only one observation per cluster, m = 1 and the design effect is 1.0 and the two designs are the same. Otherwise, the larger the ICC, i.e. the more important the variation between clusters is, the bigger the design effect and the more subjects we will need to get the same power as a simply-randomized study. Even a small ICC will have an impact if the cluster size is large. The X-ray guidelines study (§10.13) had ICC = 0.019. A study with the same ICC and m = 50 referrals per practice would have design effect Deff = 1 + (50 - 1) × 0.019 = 1.93. Thus it would require almost twice as many subjects as a trial where patients were randomized to treatment individually.
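The relation between cluster size, ICC and design effect is easy to explore numerically. The sketch below assumes the formula Deff = 1 + (m - 1) × ICC with ICC taken as the between-cluster variance divided by the total variance; the function names `icc` and `design_effect` are illustrative.

```python
def icc(var_between, var_within):
    """Intra-cluster correlation: between-cluster variance over total variance."""
    return var_between / (var_between + var_within)


def design_effect(m, icc_value):
    """Design effect for clusters of m subjects: 1 + (m - 1) * ICC."""
    return 1 + (m - 1) * icc_value


# X-ray guidelines example from the text: ICC = 0.019, 50 referrals per practice
print(f"{design_effect(50, 0.019):.2f}")

# Cholesterol example: variances 0.0046 (between) and 1.28 (within), m = 50
print(f"{design_effect(50, icc(0.0046, 1.28)):.2f}")

# The same small ICC matters more and more as clusters grow
for m in (1, 10, 50, 200, 750):
    print(m, f"{design_effect(m, 0.019):.2f}")
```

For the cholesterol example the formula gives about 1.18, close to the 1.17 obtained earlier from the rounded sample sizes.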

The main difficulty in calculating sample size for cluster randomized studies is obtaining an estimate of the between cluster variation or ICC. Estimates of variation between individuals can often be obtained from the literature but even studies that use the cluster as the unit of analysis may not publish their results in such a way that the between practice variation can be estimated. Donner et al. (1990), recognizing this problem, recommended that authors publish the cluster-specific event rates observed in their trial. This would enable other workers to use this information to plan further studies.

In some trials, where the intervention is directed at the individual subjects and the number of subjects per cluster is small, we may judge that the design effect can be ignored. On the other hand, where the number of subjects per cluster is large, an estimate of the variability between clusters will be very important. When the number of clusters is very small, we may have to use the small sample adjustments mentioned in §18.7.

18M* Multiple choice questions 98 to 100

(Each answer is true or false)

98.* The power of a two-sample t test:

(a) increases if the sample sizes are increased;

(b) depends on the difference between the population means which we wish to detect;

(c) depends on the difference between the sample means;

(d) is the probability that the test will detect a given population difference;

(e) cannot be zero.

99.* The sample size required for a study to compare two proportions:

(a) depends on the magnitude of the effect we wish to detect;

(b) depends on the significance level we wish to employ;

(c) depends on the power we wish to have;

(d) depends on the anticipated values of the proportions themselves;

(e) should be decided by adding subjects until the difference is significant.

100.* The sample size required for a study to estimate a mean:

(a) depends on the width of the confidence interval which we want;

(b) depends on the variability of the quantity being studied;

(c) depends on the power we wish to have;

(d) depends on the anticipated value of the mean;

(e) depends on the anticipated value of the standard deviation.

18E* Exercise: Estimation of sample sizes

1. What sample size would be required to estimate a 95% reference interval using the Normal distribution method, so that the 95% confidence interval for the reference limits were at most 20% of the reference interval size?

2. How big a sample would be required for an opinion pollster to estimate voter preferences to within two percentage points?

3. Mortality from myocardial infarction after admission to hospital is about 15%. How many patients would be required for a clinical trial to detect a 10% reduction in mortality, i.e. to 13.5%, if the power required was 90%? How many would be needed if the power were only 80%?

4. How many patients would be required in a clinical study to compare an enzyme concentration in patients with a particular disease and controls, if differences of less than one standard deviation would not be clinically important? If there was already a sample of measurements from 100 healthy controls, how many disease cases would be required?

5. In a proposed trial of a health promotion programme, the programme was to be implemented across a whole county. The plan was to use four counties, two counties to be allocated to receive the programme and two counties to act as controls. The programme would be evaluated by a survey of samples of about 750 subjects drawn from the at-risk populations in each county. A conventional sample size calculation, which ignored the clustering, had indicated that 1500 subjects in each treatment group would be required to give power 80% to detect the required difference. The applicants were aware of the problem of cluster randomisation and the need to take it into account in the analysis, e.g. by analysis at the level of the cluster (county). They had an estimate of the intracluster correlation = 0.005, based on a previous study. They argued that this was so small that they could ignore the clustering. Were they correct?




19

Solutions to exercises

Some of the multiple choice questions are quite hard. If you score +1 for a correct answer, -1 for an incorrect answer, and 0 for a part which you omitted, I would regard 40% as the pass level, 50% as good, 60% as very good, and 70% as excellent. These questions are hard to set and some may be ambiguous, so you will not score 100%.

Solution to Exercise 2M: Multiple choice questions 1 to 6

1. FFFFF. Controls should be treated in the same place at the same time, under the same conditions other than the treatment under test (§2.1). All must be willing and eligible to receive either treatment (§2.4).

2. FTFTF. Random allocation is done to achieve comparable groups, allocation being unrelated to the subjects' characteristics (§2.2). The use of random numbers helps to prevent bias in recruitment (§2.3).

3. TFFFT. Patients do not know their treatment, but they usually do know that they are in a trial (§2.9). Not the same as a cross-over trial (§2.6).

4. FFFFF. Vaccinated and refusing children are self-selected (§2.4). We analyse by intention to treat (§2.5). We can compare the effect of a vaccination programme by comparing the whole vaccination group, vaccinated and refusers, to the controls.

5. TFTTT. §2.6. The order is randomized.

6. FFTTT. §2.8, §2.9. The purpose of placebos is to make dissimilar treatments appear similar. Only in randomized trials can we rely on comparability, and then only within the limits of random variation (§2.2).

Solution to Exercise 2E

1. It was hoped that women in the KYM group would be more satisfied with their care. The knowledge that they would receive continuity of care was an important part of the treatment, and so the lack of blindness is essential. More difficult is that KYM women were given a choice and so may have felt more committed to whichever scheme, KYM or standard, they had chosen, than did the control group. We must accept this element of patient control as part of the treatment.

2. The study should be (and was) analysed by intention to treat (§2.5). As often happens, the refusers did worse than did the acceptors of KYM, and worse than the control group. When we compare all those allocated to KYM with those allocated to control, there is very little difference (Table 19.1).

Table 19.1. Method of delivery in the KYM study

Method of delivery    Allocated to KYM    Allocated to control
                      %       n           %       n

Normal                79.7    382         74.8    354
Instrumental          12.5    60          17.8    84
Caesarian             7.7     37          7.4     35

3. Women had booked for hospital antenatal care expecting the standard service. Those allocated to this therefore received what they had requested. Those allocated to the KYM scheme were offered a treatment which they could refuse if they wished, refusers getting the care for which they had originally booked. No extra examinations were carried out for research purposes, the only special data being the questionnaires, which could be refused. There was therefore no need to get the women's permission for the randomization. I thought this was a convincing argument.

Solution to Exercise 3M: Multiple choice questions 7 to 13

7. FTTTT. A population can be anything (§3.3).

8. TFFFT. A census tells us who is there on that day, and only applies to current in-patients. The hospital could be quite unusual. Some diagnoses are less likely than others to lead to admission or to long stay (§3.2).

9. TFFTF. All members and all samples have equal chances of being chosen (§3.4). We must stick to the sample the random process produces. Errors can be estimated using confidence intervals and significance tests. Choice does not depend on the subject's characteristics at all, except for its being in the population.

10. FTTFT. Some populations are unidentifiable and some cannot be listed easily (§3.4).

11. FFFTF. In a case-control study we start with a group with the disease, the cases, and a group without the disease, the controls (§3.8).

12. FTFTT. We must have a cohort or case control study to get enough cases (§3.7, §3.8).

13. TTTTF. This is a random cluster sample (§3.4). Each patient had the same chance of their hospital being chosen and then the same chance of being chosen within the hospital. This would not be so if we chose a fixed number from each hospital rather than a fixed proportion, as those in small hospitals would be more likely to be chosen than those in large hospitals. In part (e), what about a sample with patients in every hospital?

Solution to Exercise 3E

1. Many cases of infection may be unreported, but there is not much that could be done about that. Many organisms produce similar symptoms, hence the need for laboratory confirmation. There are many sources of infection, including direct transmission, hence the exclusion of cases exposed to other water supplies and to infected people.

2. Controls must be matched for age and sex as these may be related to their exposure to risk factors such as handling raw meat. Inclusion of controls who may have had the disease would have weakened any relationships with the cause, and the same exclusion criteria were applied as for the cases, to keep them comparable.

3. Data are obtained by recall. Cases may remember events in relation to the disease more easily than controls in relation to the same time. Cases may have been thinking about possible causes of the disease and so be more likely to recall milk attacks. The lack of positive association with any other risk factors suggests that this is not important here.

4. I was convinced. The relationship is very strong and these scavenging birds are known to carry the organism. There was no relationship with any other risk factor. The only problem is that there was little evidence that these birds had actually attacked the milk. Others have suggested that cats may also remove the tops of milk bottles to drink the milk and may be the real culprits (Balfour 1991).

5. Further studies: testing of attacked milk bottles for Campylobacter (have to wait for the next year). Possibly a cohort study, asking people about history of bird attacks and drinking attacked milk, then follow for future Campylobacter (and other) infections. Possibly an intervention study. Advise people to protect their milk and observe the subsequent pattern of infection.

Solution to Exercise 4M: Multiple choice questions 14 to 19

14. TFFTF. §4.1. Parity is quantitative and discrete, height and blood pressure are continuous.

15. TTFTF. §4.1. Age last birthday is discrete, exact age includes years and fraction of a year.

16. FFTFT. §4.4, §4.6. It could have more than one mode, we cannot say. Standard deviation is less than variance if the variance is greater than one (§4.7, 8).

17. TTTFT. §4.2, 3, 4. Mean and variance only tell us the location and spread of the distribution (§4.6, 7).

18. TFTFT. §4.5, 6, 7. Median = 2, the observations must be ordered before the central one is found, mode = 2, range = 7 - 1 = 6, variance = 22/4 = 5.5.

19. FFFFT. §4.6, 7, 8. There would be more observations below the mean than above, because the median would be less than the mean. Most observations will be within one standard deviation of the mean whatever the shape. The standard deviation measures how widely the blood pressure is spread between people, not for a single person, which would be needed to estimate accuracy. See also §15.2.

Fig. 19.1. Stem and leaf plot of blood glucose

Fig. 19.2. Box and whisker plot of blood glucose

Solution to Exercise 4E

1. The stem and leaf plot is shown in Figure 19.1.

2. Minimum = 2.2, maximum = 6.0. The median is the average of the 20th and 21st ordered observations, since the number of observations is even. These are both 4.0, so the median is 4.0. The first quartile is between the 10th and 11th, which are both 3.6. The third quartile is between the 30th and 31st observations, which are 4.5 and 4.6. We have q = 0.75, i = 0.75 × 41 = 30.75, and the quartile is given by 4.5 + (4.6 - 4.5) × 0.75 = 4.575 (§4.5). The box and whisker plot is shown in Figure 19.2.

Fig. 19.3. Histogram of blood glucose

3. The frequency distribution is derived easily from the stem and leaf plot:

Interval    Frequency

2.0–2.4     1
2.5–2.9     1
3.0–3.4     6
3.5–3.9     10
4.0–4.4     11
4.5–4.9     8
5.0–5.4     2
5.5–5.9     0
6.0–6.4     1
Total       40

4. The histogram is shown in Figure 19.3. The distribution is symmetrical.

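5. The mean is given by

$$\bar{x} = \frac{\sum x_i}{n} = \frac{16.2}{4} = 4.05$$

The deviations and their squares are as follows:

         xi      xi - x̄    (xi - x̄)²

         4.7     0.65      0.4225
         4.2     0.15      0.0225
         3.9     -0.15     0.0225
         3.4     -0.65     0.4225
Total    16.2    0.00      0.8900

There are n - 1 = 4 - 1 = 3 degrees of freedom. The variance is given by

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{0.89}{3} = 0.2967$$

6. As before, the sum is ∑xi = 16.2. The sum of the squared observations is ∑xi² = 66.5, and the sum of squares about the mean is then given by

$$\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = 66.5 - \frac{16.2^2}{4} = 66.5 - 65.61 = 0.89$$

This is the same as found in 5 above, so, as before,

$$s^2 = \frac{0.89}{3} = 0.2967$$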

7. For the mean we have ∑xi = 162.2,

$$\bar{x} = \frac{\sum x_i}{n} = \frac{162.2}{40} = 4.055$$

The sum of squares about the mean is given by:

$$\sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$$

There are n - 1 = 40 - 1 = 39 degrees of freedom. The variance is given by

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

9. For the limits, x̄ - 2s = 4.055 - 2 × 0.698 = 2.659, x̄ - s = 4.055 - 0.698 = 3.357, x̄ = 4.055, x̄ + s = 4.055 + 0.698 = 4.753, and x̄ + 2s = 4.055 + 2 × 0.698 = 5.451. Figure 19.3 shows the mean and standard deviation marked on the histogram. The majority of points fall within one standard deviation of the mean and nearly all within two standard deviations of the mean. Because the distribution is symmetrical, it extends just beyond the x̄ ± 2s points on either side.
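The two routes to the variance used in solutions 5 and 6 of Exercise 4E can be compared directly. This is a small sketch using the four observations listed in the deviations table; the variable names are illustrative.

```python
observations = [4.7, 4.2, 3.9, 3.4]
n = len(observations)

mean = sum(observations) / n

# Route 1 (solution 5): square the deviations from the mean and add them up
ss_from_deviations = sum((x - mean) ** 2 for x in observations)

# Route 2 (solution 6): sum of squared observations minus (sum squared) / n
ss_shortcut = sum(x ** 2 for x in observations) - sum(observations) ** 2 / n

variance = ss_from_deviations / (n - 1)  # n - 1 = 3 degrees of freedom

print(f"mean = {mean:.2f}")
print(f"sum of squares (deviations) = {ss_from_deviations:.4f}")
print(f"sum of squares (shortcut)   = {ss_shortcut:.4f}")
print(f"variance = {variance:.4f}")
```

Both routes give the same sum of squares about the mean; the shortcut avoids computing each deviation, which mattered more for hand calculation than it does here.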

Solution to Exercise 5M: Multiple choice questions 20 to 24

20. FTTTT. §5.1, §5.2. Without a control group we have no idea how many would get better anyway (§2.1). 66.67% is 2/3. We may only have 3 patients.

21. TFFTT. §5.2. To three significant figures, it should be 1730. We round up because of the 9. To six decimal places it is 1729.543710.

22. FTTFT. This is a bar chart showing the relationship between two variables (§5.5). See Figure 19.4. Calendar time has no true zero to show.

23. TTFFT. §5.9, §5A. There is no logarithm of zero.

24. FFTTT. §5.5, 6, 7. A histogram (§4.3) and a pie chart (§5.4) each show the distribution of a single variable.

Fig. 19.4. A dubious graph revised

Table 19.2. Calculations for a pie chart for the Tooting Bec data

Category                  Frequency    Relative frequency    Angle

Schizophrenia             474          0.32311               116
Affective illness         277          0.18882               68
Organic brain syndrome    405          0.27607               99
Subnormality              58           0.03954               14
Alcoholism                57           0.03885               14
Other                     196          0.13361               48
Total                     1467         1.00000               359

Solution to Exercise 5E

1. This is the frequency distribution of a qualitative variable, so a pie chart can be used to display it. The calculations are set out in Table 19.2. Notice that we have lost one degree through rounding errors. We could work to fractions of a degree, but the eye is unlikely to spot the difference. The pie chart is shown in Figure 19.5.

2. See Figure 19.6.

3. There are several possibilities. In the original paper, Doll and Hill used a separate bar chart for each disease, similar to Figure 19.7.

4. Line graphs can be used here, as we have simple time series (Figure 19.8). For an explanation of the difference between years, see §13E.
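The pie chart calculation in Table 19.2 (solution 1 of Exercise 5E) can be reproduced as follows. The sketch takes the frequencies from the table, converts each to a relative frequency and then to an angle rounded to the nearest degree; the variable names are illustrative.

```python
frequencies = {
    "Schizophrenia": 474,
    "Affective illness": 277,
    "Organic brain syndrome": 405,
    "Subnormality": 58,
    "Alcoholism": 57,
    "Other": 196,
}

total = sum(frequencies.values())

# Relative frequency, and the angle of each slice rounded to a whole degree
relative = {name: count / total for name, count in frequencies.items()}
angles = {name: round(360 * share) for name, share in relative.items()}

for name in frequencies:
    print(f"{name:24s} {frequencies[name]:5d} {relative[name]:.5f} {angles[name]:4d}")
print(f"{'Total':24s} {total:5d} {sum(relative.values()):.5f} {sum(angles.values()):4d}")
```

The rounded angles add to 359 rather than 360, which is the lost degree mentioned in the solution.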

Solution to Exercise 6M: Multiple choice questions 25 to 31

25. TTFFF. §6.2. If they are mutually exclusive they cannot both happen. There is no reason why they should be equiprobable or exhaustive, the only events which can happen (§6.3).

26. TFTFT. For both, the probabilities are multiplied, 0.2 × 0.05 = 0.01 (§6.2). Clearly the probability of both must be less than that for each one. The probability of both is 0.01, so the probability of X alone is 0.20 - 0.01 = 0.19 and the probability of Y alone is 0.05 - 0.01 = 0.04. The probability of having X or Y is the probability of X alone + probability of Y alone + probability of X and Y together, because these are three mutually exclusive events. Having X and having Y are not mutually exclusive as she can have both. Having X tells us nothing about whether she has Y. If she has X the probability of having Y is still 0.05, because X and Y are independent.

Fig. 19.5. Pie chart showing the distribution of patients in Tooting Bec Hospital by diagnostic group

Fig. 19.6. Bar chart showing the results of the Salk vaccine trial

27. TFTFF. §6.4. Weight is continuous. Patients respond or not with equal probability, being selected at random from a population where the probability of responding varies. The number of red cells might follow a Poisson distribution (§6.7); there is no set of independent trials. The number of hypertensives follows a Binomial distribution, not the proportion.

Fig. 19.7. Mortality in British doctors by smoking habits, after Doll and Hill (1956)

Fig. 19.8. Line graphs for geriatric admissions in Wandsworth in the summers of 1982 and 1983

28. TTTTF. The probability of clinical disease is 0.5 × 0.5 = 0.25. The probability of carrier status = probability that father passes the gene and mother does not + probability that mother passes the gene and father does not = 0.5 × 0.5 + 0.5 × 0.5 = 0.5. Probability of not inheriting the gene = 0.5 × 0.5 = 0.25. Probability of not having clinical disease = 1 - 0.25 = 0.75. Successive children are independent, so the probabilities for the second child are unaffected by the first (§6.2).

29. FTTFT. §6.3, 4. The expected number is one (§6.6). The spins are independent (§6.2). At least one tail means one tail (PROB = 0.5) or two tails (PROB = 0.25). These are mutually exclusive, so the probability of at least one tail is 0.5 + 0.25 = 0.75.

Table 19.3. Probability of surviving to different ages

Survive to age    Probability    Survive to age    Probability

10                0.959          60                0.758
20                0.952          70                0.524
30                0.938          80                0.211
40                0.920          90                0.022
50                0.876          100               0.000

30. FTTFT. §6.6. E(X + 2) = µ + 2, VAR(2X) = 4σ².

31. TTTFF. §6.6. The variance of a difference is the sum of the variances. Variances cannot be negative. VAR(-X) = (-1)² × VAR(X) = VAR(X).
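The reasoning in answer 26 can be checked numerically. This sketch assumes, as the answer does, that X and Y are independent with probabilities 0.20 and 0.05; the variable names are illustrative.

```python
p_x = 0.20
p_y = 0.05

# Independent events: multiply the probabilities to get both together
p_both = p_x * p_y

# Split into three mutually exclusive events and add them
p_x_alone = p_x - p_both
p_y_alone = p_y - p_both
p_x_or_y = p_x_alone + p_y_alone + p_both

# Cross-check: X or Y is the complement of "neither X nor Y"
p_neither = (1 - p_x) * (1 - p_y)

print(f"both: {p_both:.2f}")
print(f"X alone: {p_x_alone:.2f}, Y alone: {p_y_alone:.2f}")
print(f"X or Y: {p_x_or_y:.2f}")
print(f"1 - neither: {1 - p_neither:.2f}")
```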

Solution to Exercise 6E

1. Probability of survival to age 10. This illustrates the frequency definition of probability. 959 out of 1000 survive, so the probability is 959/1000 = 0.959.

2. Survival and death are mutually exclusive, exhaustive events, so PROB(survives) + PROB(dies) = 1. Hence PROB(dies) = 1 - 0.959 = 0.041.

3. These are the number surviving divided by 1000 (Table 19.3). The events are not mutually exclusive, e.g. a man cannot survive to age 20 if he does not survive to age 10. This does not form a probability distribution.

4. The probability is found by

$$\frac{\text{PROB(survives to 70)}}{\text{PROB(survives to 60)}} = \frac{0.524}{0.758} = 0.691$$

5. Independent events. PROB(survival 60 to 70) = 0.691, PROB(both survive) = 0.691 × 0.691 = 0.477.

6. The proportion surviving on average is the probability of survival = 0.691. So a proportion of 0.691 of the 100 survive. We expect 0.691 × 100 = 69.1 to survive.

7. The probability is found by

8. As in 7, we find probabilities of dying for each decade (Table 19.4). This is a set of mutually exclusive events and they are exhaustive; there is no other decade in which death can take place. The sum of the probabilities is therefore 1.0. The distribution is shown in Figure 19.9.

9. We find the expected value or mean of a probability distribution by summing each value times its probability (§6.4), to give life expectancy at birth = 66.6 years (Table 19.5).

Table 19.4. Probability of dying in each decade

Decade    Probability of dying    Decade    Probability of dying

1st       0.041                   6th       0.118
2nd       0.007                   7th       0.234
3rd       0.014                   8th       0.313
4th       0.018                   9th       0.189
5th       0.044                   10th      0.022

Fig. 19.9. Probability distribution of decade of death
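The steps of Exercise 6E from the survival probabilities to life expectancy can be sketched in Python. The sketch uses the values of Table 19.3, assumes a survival probability of 1.000 at birth (implied by the first-decade probability of dying, 0.041) and, as in Table 19.5, treats each death as occurring at the midpoint of its decade. The variable names are illustrative.

```python
# Probability of surviving to each age (Table 19.3); age 0 is an added assumption
survival = {0: 1.000, 10: 0.959, 20: 0.952, 30: 0.938, 40: 0.920, 50: 0.876,
            60: 0.758, 70: 0.524, 80: 0.211, 90: 0.022, 100: 0.000}

# Probability of dying in each decade: difference of successive survival probabilities
decade_starts = list(range(0, 100, 10))
p_die = [survival[a] - survival[a + 10] for a in decade_starts]

# Expected value: midpoint of each decade times its probability, summed
midpoints = [a + 5 for a in decade_starts]
life_expectancy = sum(m * p for m, p in zip(midpoints, p_die))

# Conditional probability of surviving from 60 to 70
p_60_to_70 = survival[70] / survival[60]

for a, p in zip(decade_starts, p_die):
    print(f"decade starting at {a:2d}: {p:.3f}")
print(f"sum of probabilities = {sum(p_die):.3f}")
print(f"life expectancy = {life_expectancy:.1f} years")
print(f"PROB(survival 60 to 70) = {p_60_to_70:.3f}")
```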

Solution to Exercise 7M: Multiple choice questions 32 to 37

32. TTTFT. §7.2, 3, 4.

33. FFFTT. Symmetrical, µ = 0, σ = 1 (§7.3, §4.6).

34. TTFFF. §7.2. Median = mean. The Normal distribution has nothing to do with normal physiology. 2.5% will be less than 260, 2.5% will be greater than 340 litres/min.

Table 19.5. Calculation of expectation of life

 5 × 0.041 =  0.205
15 × 0.007 =  0.105
25 × 0.014 =  0.350
35 × 0.018 =  0.630
45 × 0.044 =  1.980
55 × 0.118 =  6.490
65 × 0.234 = 15.210
75 × 0.313 = 23.475
85 × 0.189 = 16.065
95 × 0.022 =  2.090
Total        66.600

Fig. 19.10. Histogram of the blood glucose data with the corresponding Normal distribution curve, and Normal plot

35. FTTFF. §4.6, §7.3. The sample size should not affect the mean. The relative sizes of mean, median and standard deviation depend on the shape of the frequency distribution.

36. TFTTF. §7.2, §7.3. Adding, subtracting or multiplying by a constant, or adding or subtracting an independent Normal variable gives a Normal distribution. X² follows a very skew Chi-squared distribution with one degree of freedom and X/Y follows a t distribution with one degree of freedom (§7A).

37. TTTTT. A gentle slope indicates that observations are far apart, a steep slope that there are many observations close together. Hence gentle-steep-gentle (‘S’ shaped) indicates long tails (§7.5).

Solution to Exercise 7E

1. The box and whisker plot shows a very slight degree of skewness, the lower whisker being shorter than the upper and the lower half of the box smaller than the upper. From the histogram it appears that the tails are a little longer than the Normal curve of Figure 7.10 would suggest. Figure 19.10 shows the Normal distribution with the same mean and variance superimposed on the histogram, which also indicates this.

2. We have n = 40. For i = 1 to 40 we want to calculate (i - 0.5)/n = (2i - 1)/2n. This gives us a probability. We use Table 7.1 to find the value of the Normal distribution corresponding to this probability. For example, for i = 1 we have

$$\frac{2i - 1}{2n} = \frac{2 \times 1 - 1}{2 \times 40} = \frac{1}{80} = 0.0125$$

From Table 7.1 we cannot find the value of x corresponding to Φ(x) = 0.0125 directly, but we see that x = -2.3 corresponds to Φ(x) = 0.011 and x = -2.2 to Φ(x) = 0.014. Φ(x) = 0.0125 is mid-way between these probabilities so we can estimate the value of x as mid-way between -2.3 and -2.2, giving -2.25. This corresponds to the lowest blood glucose, 2.2. For i = 2 we have Φ(x) = 0.0375. Referring to the table we have x = -1.8, Φ(x) = 0.036 and x = -1.7, Φ(x) = 0.045. For Φ(x) = 0.0375 we must have x just above -1.8, about -1.78. The corresponding blood glucose is 2.9. We do not have to be very accurate because we are only using this plot for a rough guide. We get a set of probabilities as follows:

i    (2i - 1)/2n = Φ(x)    x        Blood glucose

1    1/80 = 0.0125         -2.25    2.2
2    3/80 = 0.0375         -1.78    2.9
3    5/80 = 0.0625         -1.53    3.3
4    7/80 = 0.0875         -1.36    3.3

and so on. Because of the symmetry of the Normal distribution, from i = 21 onwards the values of x are those corresponding to 40 - i + 1, but with a positive sign. The Normal plot is shown in Figure 19.10.

3. The points do not lie on a straight line. There are pronounced bends near each end. These bends reflect rather long tails of the distribution of blood glucose. If the line showed a steady curve, rising less steeply as the blood glucose value increased, this would show simple skewness which can often be corrected by a log transformation. This would not work here; the bend at the lower end would be made worse.

The deviation from a straight line is not very great, compared, say, to the vitamin D measurements in Figure 7.12. As we see in Chapter 10, such small deviations from the Normal do not usually matter.
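The Normal scores used for the Normal plot in solution 2 can be computed exactly instead of being read from Table 7.1. This is a minimal sketch using `statistics.NormalDist` from the Python standard library and the plotting positions (i - 0.5)/n; the function name `normal_scores` is illustrative. The exact values differ slightly from the interpolated table values, which is of no consequence for a rough plot.

```python
from statistics import NormalDist


def normal_scores(n):
    """Standard Normal quantiles at the probabilities (i - 0.5)/n, i = 1..n."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


scores = normal_scores(40)
for i in range(4):
    probability = (2 * (i + 1) - 1) / 80
    print(f"i = {i + 1}  probability = {probability:.4f}  x = {scores[i]:.2f}")

# Symmetry: the score for i is minus the score for 40 - i + 1
print(f"{scores[0]:.2f} {scores[39]:.2f}")
```

Plotting the ordered blood glucose values against these scores gives the Normal plot; the data themselves are not reproduced in this section, so the plotting step is omitted here.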

Solution to Exercise 8M: Multiple choice questions 38 to 43

39. FTFTF. §8.3. The sample mean is always in the middle of the limits.

41. TTTFF. §8.1, 2, §6.4. Variance is p(1 - p)/n = 0.1 × 0.9/100 = 0.0009. The number in the sample with the condition follows a Binomial distribution, not the proportion.

42. FFTTT. It depends on the variability of FEV1 and the number in the sample (§8.2). The sample should be random (§3.3, 4).

43. FFTTF. §8.3, 4. It is unlikely that we would get these data if the population rate were 10%, but not impossible.

Solution to Exercise 8E

1. The interval will be from 1.96 standard deviations less than the mean to 1.96 standard deviations greater than the mean. The lower limit is 0.810 - 1.96 × 0.057 = 0.698 mmol/litre. The upper limit is 0.810 + 1.96 × 0.057 = 0.922 mmol/litre.

2. For the diabetics, the mean is 0.719 and the standard deviation 0.068, so the lower limit of 0.698 will be (0.698 - 0.719)/0.068 = -0.309 standard deviations from the mean. From Table 7.1, the probability of being below this is 0.38, so the probability of being above is 1 - 0.38 = 0.62. Thus the probability that an insulin-dependent diabetic would be within the reference interval would be 0.62 or 62%. This is the proportion we require.

4. The 95% confidence interval is the mean ± 1.96 standard errors. For the controls, 0.810 - 1.96 × 0.00482 to 0.810 + 1.96 × 0.00482 gives us 0.801 to 0.819 mmol/litre. This is much narrower than the interval of part 1. This is because the confidence interval tells us how far the sample mean might be from the population mean. The 95% reference interval tells us how far an individual observation might be from the population mean.

5. The groups are independent, so the standard error of the difference between means is given by the square root of the sum of the squared standard errors of the two means:

$$\sqrt{\mathrm{SE}_1^2 + \mathrm{SE}_2^2} = 0.00660 \text{ mmol/litre}$$

6. The difference between the means is 0.719 - 0.810 = -0.091 mmol/litre. The 95% confidence interval is thus -0.091 - 1.96 × 0.00660 to -0.091 + 1.96 × 0.00660, giving -0.104 to -0.078. Hence the mean plasma magnesium level for insulin dependent diabetics is between 0.078 and 0.104 mmol/litre below that of non-diabetics.

7. Although the difference is significant, this would not be a good test because the majority of diabetics are within the 95% reference interval.
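The calculations in Exercise 8E can be checked with the Python standard library. The sketch uses the summary figures quoted in the solutions (control mean 0.810, standard deviation 0.057 and standard error 0.00482; diabetic mean 0.719 and standard deviation 0.068; standard error of the difference 0.00660) and assumes the diabetic values are Normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

Z = 1.96  # two-sided 5% point of the standard Normal distribution

control_mean, control_sd, control_se = 0.810, 0.057, 0.00482
diabetic_mean, diabetic_sd = 0.719, 0.068
se_difference = 0.00660

# 1. 95% reference interval for controls: mean +/- 1.96 standard deviations
ref_low = control_mean - Z * control_sd
ref_high = control_mean + Z * control_sd

# 2. Proportion of diabetics inside the reference interval
diabetics = NormalDist(diabetic_mean, diabetic_sd)
p_within = diabetics.cdf(ref_high) - diabetics.cdf(ref_low)

# 4. 95% confidence interval for the control mean: mean +/- 1.96 standard errors
ci_mean = (control_mean - Z * control_se, control_mean + Z * control_se)

# 6. 95% confidence interval for the difference between the means
difference = diabetic_mean - control_mean
ci_difference = (difference - Z * se_difference, difference + Z * se_difference)

print(f"reference interval: {ref_low:.3f} to {ref_high:.3f}")
print(f"proportion of diabetics within it: {p_within:.2f}")
print(f"CI for control mean: {ci_mean[0]:.3f} to {ci_mean[1]:.3f}")
print(f"CI for difference: {ci_difference[0]:.3f} to {ci_difference[1]:.3f}")
```

The proportion is computed between both reference limits; since hardly any diabetics would exceed the upper limit, it agrees with the 62% obtained from the lower limit alone.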

Solution to Exercise 9M: Multiple choice questions 44 to 49

44. FTFFF. There is evidence for a relationship (§9.6), which is not necessarily causal. There may be other differences related to coffee drinking, such as smoking (§3.8).

46. TTFTT. §9.2. It is quite possible for either to be higher and deviations in either direction are important (§9.5). n = 16 because the subject giving the same reading on both gives no information about the difference and is excluded from the test. The order should be random, as in a cross-over trial (§2.6).

47. FFFFT. The trial is small and the difference may be due to chance, but there may also be a large treatment effect. We must do a bigger trial to increase the power (§9.9). Adding cases would completely invalidate the test. If the null hypothesis is true, the test will give a ‘significant’ result one in 20 times. If we keep adding cases and doing many tests we have a very high chance of getting a ‘significant’ result on one of them, even though there is no treatment effect (§9.10).

48. TFTTF. Large sample methods depend on estimates of variance obtained from the data. This estimate gets closer to the population value as the sample size increases (§9.7, §9.8). The chance of an error of the first kind is the significance level set in advance, say 5%. The larger the sample the more likely we are to detect a difference should one exist (§9.9). The null hypothesis depends on the phenomena we are investigating, not on the sample size.

49. FTFFT. We cannot conclude causation in an observational study (§3.6, 7, 8), but we can conclude that there is evidence of a difference (§9.6). 0.001 is the probability of getting so large a difference if the null hypothesis were true (§9.3).

Solution to Exercise 9E

1. Both control groups are drawn from populations which were easy to get to, one being hospital patients without gastro-intestinal symptoms, the other being fracture patients and their relatives. Both are matched for age and sex; Mayberry et al. (1978) also matched for social class and marital status. Apart from the matching factors, we have no way of knowing whether cases and controls are comparable, or any way of knowing whether controls are representative of the general population. This is usual in case control studies and is a major problem with this design.

2. There are two obvious sources of bias: interviews were not blind and information is being recalled by the subject. The latter is particularly a problem for data about the past. In James' study subjects were asked what they used to eat several years in the past. For the cases this was before a definite event, onset of Crohn's disease, for the controls it was not, the time being time of onset of the disease in the matched case.

3. The question in James' study was ‘what did you use to eat in the past?’, that in Mayberry et al. (1978) was ‘what do you eat now?’

4. Of the 100 patients with Crohn's disease, 29 were current eaters of cornflakes. Of 29 cases who knew of the cornflakes association, 12 were ex-eaters of cornflakes, and among the other 71 cases 21 were ex-eaters of cornflakes, giving a total of 33 past but not present eaters of cornflakes. Combining these with the 29 current consumers, we get 62 cases who had at some time been regular eaters of cornflakes. If we carry out the same calculation for the controls, we obtain 3 + 10 = 13 past eaters and with 22 current eaters this gives 35 sometime regular cornflakes eaters. Cases were more likely than controls to have eaten cornflakes regularly at some time, the proportion of cases reporting having eaten cornflakes being almost twice as great as for controls. Compare this to James' data, where 17/68 = 25% of controls and 23/34 = 68% of cases, 2.7 times as many, had eaten cornflakes regularly. The results are similar.

5. The relationship between Crohn's disease and reported consumption of cornflakes had a much smaller probability for the significance test and hence stronger evidence that a relationship existed. Also, only one case had never eaten cornflakes (it was also the most popular cereal among controls).

6. Of the Crohn's cases, 67.6% (i.e. 23/34) reported having eaten cornflakes regularly compared to 25.0% of controls. Thus cases were 67.6/25.0 = 2.7 times as likely as controls to report having eaten cornflakes. The corresponding ratios for the other cereals are: wheat, 2.7; porridge, 1.5; rice, 1.6; bran, 6.1; muesli, 2.7. Cornflakes does not stand out when we look at the data in this way. The small probability simply arises because it is the most popular cereal. The P value is a property of the sample, not of the population.

7. We can conclude that there is no evidence that eating cornflakes is more closely related to Crohn's disease than is consumption of other cereals. The tendency for Crohn's cases to report excessive eating of breakfast foods before onset of the disease may be the result of greater variation in diet than in controls, as they try different foods in response to their symptoms. They may also be more likely to recall what they used to eat, being more aware of the effects of diet because of their disease.

Solution to Exercise 10M: Multiple choice questions 50 to 56

50. FFTFT. §10.2. It is equivalent to the Normal distribution method (§8.3).

51. FTFTF. §10.3. Whether the (population) means are equal is what we are trying to find out. The large sample case is like the Normal test of §9.7, except for the common variance estimate. It is valid for any sample size.

52. FTTFF. The assumption of Normality would not be met for a small sample t test (§10.3) without transformation (§10.4), but for a large sample the distribution followed by the data would not matter (§9.7). The sign test is for paired data. We have measurements, not qualitative data.

53. FTTFF. §10.5. The more different the sample sizes are, the worse is the approximation to the t distribution. When both samples are large, this becomes a large sample Normal distribution test (§9.7). Grouping of data is not a serious problem.

54. TFFTT. A P value conveys more information than a statement that the difference is significant or not significant. A confidence interval would be even better. What is important is how well the diagnostic test discriminates, i.e. by how much the distributions overlap, not any difference in mean. Semen count cannot follow a Normal distribution because two standard deviations exceed the mean and some observations would be negative (§7.4). Approximately equal numbers make the t test very robust but skewness reduces the power (§10.5).

56. FTTFT. §10.9. Sums of squares and degrees of freedom add up, mean squares do not. Three groups gives two degrees of freedom. We can have any sizes of groups.

Solution to Exercise 10E

1. The differences for compliance are shown in Table 19.6. The stem and leaf plot is shown in Figure 19.11.

Table 19.6. Differences and means for static compliance

Patient   Constant   Decelerating   Difference     Mean
   1        65.4         72.9          -7.5        69.15
   2        73.7         94.4         -20.7        84.05
   3        37.4         43.3          -5.9        40.35
   4        26.3         29.0          -2.7        27.65
   5        65.0         66.4          -1.4        65.70
   6        35.2         36.4          -1.2        35.80
   7        24.7         27.7          -3.0        26.20
   8        23.0         27.5          -4.5        25.25
   9       133.2        178.2         -45.0       155.70
  10        38.4         39.3          -0.9        38.85
  11        29.2         31.8          -2.6        30.50
  12        28.3         26.9           1.4        27.60
  13        46.6         45.0           1.6        45.80
  14        61.5         58.2           3.3        59.85
  15        25.7         25.7           0.0        25.70
  16        48.7         42.3           6.4        45.50

Fig. 19.11. Stem and leaf plot for compliance

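2. The plot of difference against mean is Figure 19.12. The distribution is highly skewed and the difference closely related to the mean.

3. The sum and sum of squares of the differences are $\sum d_i = -82.7$ and $\sum d_i^2 = 2648.43$, hence the mean is $\bar{d} = -82.7/16 = -5.16875$. For the sum of squares about the mean we have

$$\sum d_i^2 - \frac{\left(\sum d_i\right)^2}{n} = 2648.43 - \frac{(-82.7)^2}{16} = 2220.974$$

so the variance is $s^2 = 2220.974/15 = 148.065$ and the standard error of the mean difference is $\sqrt{148.065/16} = 3.04205$.

4. We have 15 degrees of freedom and from Table 7.1 the 5% point of the t distribution is 2.13. The 95% confidence interval is -5.16875 - 2.13 × 3.04205 to -5.16875 + 2.13 × 3.04205, giving -11.6 to +1.3.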
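The calculation in parts 3 and 4 can be reproduced from the differences in Table 19.6. The following is a minimal sketch, not the book's own code; the variable names are illustrative, and the critical value 2.13 is the tabulated 5% point quoted in part 4.

```python
import math

# Differences (constant minus decelerating) from Table 19.6.
diffs = [-7.5, -20.7, -5.9, -2.7, -1.4, -1.2, -3.0, -4.5,
         -45.0, -0.9, -2.6, 1.4, 1.6, 3.3, 0.0, 6.4]

n = len(diffs)
total = sum(diffs)                          # sum of differences
total_sq = sum(d * d for d in diffs)        # sum of squared differences

mean_d = total / n
ss_about_mean = total_sq - total ** 2 / n   # sum of squares about the mean
variance = ss_about_mean / (n - 1)
se_mean = math.sqrt(variance / n)           # standard error of the mean difference

t_crit = 2.13                               # 5% point of t, 15 d.f., as quoted
ci = (mean_d - t_crit * se_mean, mean_d + t_crit * se_mean)

print(round(mean_d, 5), round(se_mean, 5))
print(tuple(round(limit, 1) for limit in ci))
```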

Fig. 19.12. Difference versus mean for compliance

6. The 95% confidence interval is -0.028688 - 2.13 × 0.012503 to -0.028688 + 2.13 × 0.012503, which gives -0.055319 to -0.002057. These limits have not been rounded, because we need to transform them first. If we transform these limits back by taking the antilogs we get 0.880 to 0.995. This means that the compliance with a decelerating waveform is between 0.880 and 0.995 times that with a constant waveform. There is some evidence that waveform has an effect, whereas with the untransformed data the confidence interval for the difference included zero. Because of the skewness of the raw data the confidence interval was too wide.

7. We can conclude that there is some evidence of a reduction in mean compliance, which could be up to 12% (from (1 - 0.880) × 100), but could also be negligibly small.
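The back-transformation in parts 6 and 7 is sketched below. The mean and standard error of the log differences and the critical value 2.13 are taken from part 6; base 10 logarithms are used, as in Table 19.7. Variable names are illustrative only.

```python
# Mean and standard error of the log10 differences, as quoted in part 6.
mean_log_diff = -0.028688
se_log_diff = 0.012503
t_crit = 2.13                      # 5% point of t with 15 d.f., as quoted

lower_log = mean_log_diff - t_crit * se_log_diff
upper_log = mean_log_diff + t_crit * se_log_diff

# Antilogs turn limits for a difference of logs into limits for a ratio.
ratio_lower = 10 ** lower_log
ratio_upper = 10 ** upper_log

# Largest reduction in compliance compatible with the interval, in percent.
max_reduction_pct = (1 - ratio_lower) * 100

print(round(ratio_lower, 3), round(ratio_upper, 3), round(max_reduction_pct))
```

Because a difference of logarithms is the logarithm of a ratio, the back-transformed limits are read as multiplicative factors rather than as differences in compliance units.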

Solution to Exercise 11M: Multiple choice questions 57 to 61

57. FFTTF. Outcome and predictor variables are perfectly related but do not lie on a straight line, so r < 1 (§11.9).

Fig. 19.13. Stem and leaf plots for log compliance

Table 19.7. Difference and mean for log transformed compliance (to base 10)

Patient   Constant   Decelerating   Difference     Mean
   1       1.816        1.863         -0.047      1.8395
   2       1.867        1.975         -0.108      1.9210
   3       1.573        1.636         -0.063      1.6045
   4       1.420        1.462         -0.042      1.4410
   5       1.813        1.822         -0.009      1.8175
   6       1.547        1.561         -0.014      1.5540
   7       1.393        1.442         -0.049      1.4175
   8       1.362        1.439         -0.077      1.4005
   9       2.125        2.251         -0.126      2.1880
  10       1.584        1.594         -0.010      1.5890
  11       1.465        1.502         -0.037      1.4835
  12       1.452        1.430          0.022      1.4410
  13       1.668        1.653          0.015      1.6605
  14       1.789        1.765          0.024      1.7770
  15       1.410        1.410          0.000      1.4100
  16       1.688        1.626          0.062      1.6570

Fig. 19.14. Difference versus mean for log compliance

58. FTFFF. Knowledge of the predictor tells us something about the outcome variable (§6.2). This is not a straight line relationship. For part of the scale the outcome variable decreases as the predictor increases, then the outcome variable increases again. The correlation coefficient will be close to zero (§11.9). A logarithmic transformation would work if the outcome increased more and more rapidly as the predictor increased (§5.9).

59. FFFTT. A regression line usually has non-zero intercept and slope, which have dimensions (§11.3). Exchanging X and Y changes the line (§11.4).

60. FTTFF. The predictor variable has no error in the regression model (§11.3). Transformations are only used if needed to meet the assumptions (§11.8). There is a scatter about the line (§11.3).

61. TTFFF. §11.9, 10. There is no distinction between predictor and outcome. r should not be confused with the regression coefficient (§11.3).

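Solution to Exercise 11E

1. The slope is found by

$$b = \frac{\text{sum of products about the mean}}{\text{sum of squares about the mean for } X} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

For females, $b_f = 2.8710$.

For males, $b_m = 3.9477$.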

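2. For the standard error, we first need the variances about the line:

$$s^2 = \frac{1}{n-2}\left(\sum (y_i - \bar{y})^2 - b^2 \sum (x_i - \bar{x})^2\right)$$

then the standard error is

$$SE(b) = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}}$$

For females:

For males: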

3. The standard error of the difference between two independent variables is the square root of the sum of their standard errors squared (§8.5):

$$SE(b_f - b_m) = \sqrt{SE(b_f)^2 + SE(b_m)^2} = 1.7225$$

The sample is reasonably large, almost attaining 50 in each group, so this standard error should be fairly well estimated and we can use a large sample Normal approximation. The 95% confidence interval is thus 1.96 standard errors on either side of the estimate. The observed difference is $b_f - b_m$ = 2.8710 - 3.9477 = -1.0767. The 95% confidence interval is thus -1.0767 - 1.96 × 1.7225 = -4.5 to -1.0767 + 1.96 × 1.7225 = 2.3. If the samples were small, we could do this using the t distribution, but we would need to estimate a common variance. It would be better to use multiple regression, testing the height × sex interaction (§17.3).

4. For the test of significance the test statistic is observed difference over standard error:

$$\frac{b_f - b_m}{SE(b_f - b_m)} = \frac{-1.0767}{1.7225} = -0.63$$

If the null hypothesis were true, this would be an observation from a Standard Normal distribution. From Table 7.2, P > 0.5.
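Parts 3 and 4 can be checked from the quoted slopes and the quoted standard error of their difference. This is a sketch under the large sample Normal approximation used in the text; the two-sided P value is computed here with the complementary error function rather than read from Table 7.2, and the names are illustrative.

```python
import math

b_f, b_m = 2.8710, 3.9477      # slopes for females and males, as quoted
se_diff = 1.7225               # standard error of the difference, as quoted

diff = b_f - b_m
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)

# Test statistic: observed difference over its standard error.
z = diff / se_diff

# Two-sided P value from the Standard Normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(diff, 4), tuple(round(limit, 1) for limit in ci))
print(round(z, 2), p_two_sided > 0.5)
```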

Solution to Exercise 12M: Multiple choice questions 62 to 66

62. TFTFF. §10.3, §12.2. The sign and Wilcoxon tests are for paired data (§9.2, §12.3). Rank correlation looks for the existence of relationships between two ordinal variables, not a comparison between two groups (§12.4, §12.5).

63. TTFFT. §9.2, §12.2, §10.3, §12.5. The Wilcoxon test is for interval data (§12.3).

64. FTFTT. §12.5. There is no predictor variable in correlation. Log transformation would not affect the rank order of the observations.

65. FTFFT. If Normal assumptions are met the methods using them are better (§12.7). Estimation of confidence intervals using rank methods is difficult. Rank methods require the assumption that the scale is ordinal, i.e. that the data can be ranked.

66. TFTTF. We need a paired test: t, sign or Wilcoxon (§10.2, §9.2, §12.3).

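Solution to Exercise 12E

1. The differences are shown in Table 19.6. We have 4 positive, 11 negative and 1 zero. Under the null hypothesis of no difference, the number of positives is from the Binomial distribution with p = 0.5, n = 15. We have n = 15 because the single zero contributes no information about the direction of the difference. For PROB(r ≤ 4) we have

$$\sum_{r=0}^{4} \frac{15!}{r!\,(15-r)!} \times 0.5^{15} = 0.00003 + 0.00046 + 0.00320 + 0.01389 + 0.04166 = 0.05924$$

If we double this for a two-sided test we get 0.11848, again not significant.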
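The sign test probability can be computed directly from the Binomial distribution. The sketch below uses exact arithmetic, so its two-sided value is about 0.1185; a small disagreement with the 0.11848 quoted above in the fifth decimal place would be expected if the individual terms were rounded before summing. Names are illustrative.

```python
from math import comb

n, p = 15, 0.5          # 15 non-zero differences, null probability of a positive
positives = 4

# PROB(r <= 4): sum the Binomial probabilities for r = 0, 1, ..., 4.
p_lower_tail = sum(comb(n, r) * p ** r * (1 - p) ** (n - r)
                   for r in range(positives + 1))

# Double the one-sided probability for a two-sided test.
p_two_sided = 2 * p_lower_tail

print(round(p_lower_tail, 4), round(p_two_sided, 4))
```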

2. Using the Wilcoxon matched pairs test we get

Diff.   -0.9   -1.2   -1.4    1.4    1.6   -2.6   -2.7   -3.0
Rank     1      2      3.5    3.5    5      6      7      8

Diff.    3.3   -4.5   -5.9    6.4   -7.5  -20.7  -45.0
Rank     9     10     11     12     13     14     15

As for the sign test, the zero is omitted. Sum of ranks for positive differences is T = 3.5 + 5 + 9 + 12 = 29.5. From Table 12.5 the 5% point for n = 15 is 25, which T exceeds, so the difference is not significant at the 5% level. The three tests give similar answers.

3. Using the log transformed differences in Table 19.7, we still have 4 positives, 11 negatives and 1 zero, with a sign test probability of 0.11848. The transformation does not alter the direction of the changes and so does not affect the sign test.

4. For the Wilcoxon matched pairs test on the log compliance:

Diff.   -0.009   -0.010   -0.014    0.015    0.022    0.024
Rank     1        2        3        4        5        6

Diff.   -0.037   -0.042   -0.047   -0.049    0.062   -0.063
Rank     7        8        9       10       11       12

Diff.   -0.077   -0.108   -0.126
Rank    13       14       15

Hence T = 4 + 5 + 6 + 11 = 26. This is just above the 5% point of 25 and is different from that in the untransformed data. This is because the transformation has altered the relative size of the differences. This test assumes interval data. By changing to a log scale we have moved to a scale where the differences are more comparable, because the change does depend on the magnitude of the original value. This does not happen with the other rank tests, the Mann–Whitney U test and rank correlation coefficients, which involve no differencing.
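The two Wilcoxon statistics can be reproduced with a small ranking routine. This is an illustrative sketch: `signed_rank_T` is a hypothetical helper written for this example, not a library function. It drops zeros, ranks the absolute differences with tied values sharing the average rank, and sums the ranks of the positive differences.

```python
def signed_rank_T(diffs):
    """Sum of ranks of the positive differences (zeros dropped, ties averaged)."""
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Find the run of tied absolute values starting at position i.
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2      # mean of ranks i+1, ..., j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    return sum(r for r, d in zip(ranks, ordered) if d > 0)


# Differences from Table 19.6 (raw) and Table 19.7 (log to base 10).
raw = [-7.5, -20.7, -5.9, -2.7, -1.4, -1.2, -3.0, -4.5,
       -45.0, -0.9, -2.6, 1.4, 1.6, 3.3, 0.0, 6.4]
logged = [-0.047, -0.108, -0.063, -0.042, -0.009, -0.014, -0.049, -0.077,
          -0.126, -0.010, -0.037, 0.022, 0.015, 0.024, 0.000, 0.062]

print(signed_rank_T(raw), signed_rank_T(logged))
```

The two values differ because the log transformation changes the ordering of the absolute differences, which is the point made in part 4.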

5. Although there is a possibility of a reduction in compliance it does not reach the conventional level of significance.

6. The conclusions are broadly similar, but the effect on compliance is more strongly suggested by the t method. Provided the data can be transformed to approximate Normality the t distribution analysis is more powerful, and as it also gives confidence intervals more easily, I would prefer it.

Solution to Exercise 13M: Multiple choice questions 67 to 73

67. TFFFF. §13.3. 80% of 4 is greater than 3, so all expected frequencies must exceed 5. The sample size can be as small as 20, if all row and column totals are 10.

68. FTFTF. §13.1, §13.3. (5 - 1) × (3 - 1) = 8 degrees of freedom, 80% × 15 = 12 cells must have expected frequencies > 5. It is O.K. for an observed frequency to be zero.

69. TTFTF. §13.1, §13.9. The two tests are independent. There are (2 - 1) × (2 - 1) = 1 degree of freedom. With such large numbers Yates' correction does not make much difference. Without it we get χ² = 124.5, with it we get χ² = 119.4 (§13.5).

70. TTTTT. §13.4, 5. The factorials of large numbers can be difficult to calculate.

71. TTTTF. §13.7.

72. TTFTT. Chi-squared for trend and τb will both test the null hypothesis of no trend in the table, but an ordinary chi-squared test will not (§13.8). The odds ratio (OR) is an estimate of the relative risk for a case-control study (§13.7).

73. TTFFF. The test compares proportions in matched samples (§13.9).

For a relationship, we use the chi-squared test (§13.1). PEFR is a continuous variable, we use the paired t method (§10.2). For two independent samples we use the chi-squared test (§13.1).

Solution to Exercise 13E

1. The heatwave appears to begin in week 10 and continue to include week 17. This period was much hotter than the corresponding period of 1982.

Table 19.8. Cross-tabulation of time period by year for geriatric admissions

                        Period
Year      Before       During       After        Total
          heatwave     heatwave     heatwave
1982        190          110           82          382
1983        180          178          110          468
Total       370          288          192          850

Table 19.9. Expected frequencies for Table 19.8

                        Period
Year      Before       During       After        Total
          heatwave     heatwave     heatwave
1982       166.3        129.4         86.3        382.0
1983       203.7        158.6        105.7        468.0
Total      370.0        288.0        192.0        850.0

2. There were 178 admissions during the heatwave in 1983 and 110 in the corresponding weeks of 1982. We could test the null hypothesis that these came from distributions with the same admission rate and we would get a significant difference. This would not be convincing, however. It could be due to other factors, such as the closure of another hospital with resulting changes in catchment area.

3. The cross-tabulation is shown in Table 19.8.

4. The null hypothesis is that there is no association between year and period, in other words that the distribution of admissions between the periods will be the same for each year. The expected values are shown in Table 19.9.

5. The chi-squared statistic is given by:

$$\sum \frac{(O - E)^2}{E} = \frac{(190 - 166.3)^2}{166.3} + \frac{(110 - 129.4)^2}{129.4} + \frac{(82 - 86.3)^2}{86.3} + \frac{(180 - 203.7)^2}{203.7} + \frac{(178 - 158.6)^2}{158.6} + \frac{(110 - 105.7)^2}{105.7} = 11.8$$

There are 2 rows and 3 columns, giving us (2 - 1) × (3 - 1) = 2 degrees of freedom. Thus we have chi-squared = 11.8 with 2 degrees of freedom. From Table 13.3 we see that this has probability of less than 0.01. The data are not consistent with the null hypothesis. The evidence supports the view that admissions rose by more than could be ascribed to chance during the 1983 heatwave. We cannot be certain that this was due to the heatwave and not some other factor which happened to operate at the same time.
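The expected frequencies of Table 19.9 and the chi-squared statistic can be recomputed from the observed counts in Table 19.8. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
# Observed admissions from Table 19.8: rows are 1982 and 1983,
# columns are before, during and after the heatwave period.
observed = [[190, 110, 82],
            [180, 178, 110]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_squared = sum((o - e) ** 2 / e
                  for obs_row, exp_row in zip(observed, expected)
                  for o, e in zip(obs_row, exp_row))

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print([[round(e, 1) for e in row] for row in expected])
print(round(chi_squared, 1), degrees_of_freedom)
```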

6. We could see whether the same effect occurred in other districts between 1982 and 1983. We could also look at older records to see whether there was a similar increase in admissions, say for the heatwaves of 1975 and 1976.

Solution to Exercise 14M: Multiple choice questions 74 to 80

74. TFFTT. Table 14.2.

75. FTTTT. §14.5.

76. FFFFT. Regression, correlation and paired t methods need continuous data (§11.3, §11.9, §10.2). Kendall's τ can be used for ordered categories.

77. TFTFF. §14.2.

78. TFTTT. A t test could not be used because the data do not follow a Normal distribution (§10.3). The expected frequencies will be too small for a chi-squared test (§13.3), but a trend test would be O.K. (§13.8). A goodness of fit test could be used (§13.10).

79. FTTFT. A small-sample, paired method is needed (Table 14.4).

80. TFTFF. For a two by two table with small expected frequencies we can use Fisher's exact test or Yates' correction (§13.4, 5). McNemar's test is inappropriate because the groups are not matched (§13.9).

Solution to Exercise 14E

1. Overall preference: we have one sample of patients so we use Table 14.2. Of these 12 preferred A, 14 preferred B and 4 did not express a preference. We can use a Binomial or sign test (§9.2), only considering those who expressed a preference. Those for A are positives, those for B are negatives. We get two-sided P = 0.85, not significant.

Preference and order: we have the relationship between two variables (Table 14.3), preference and order, both nominal. We set up a two way table and do a chi-squared test. For the 3 by 2 table we have two expected frequencies less than five, so we must edit the table. There are no obvious combinations, but we can delete those who expressed no preference, leaving a 2 by 2 table, χ² = 1.3, 1 degree of freedom, P > 0.05.

2. The data are paired (Table 14.2) so we use a paired t test (§10.2). The assumption of a Normal distribution for the differences should be met as PEFR itself follows a Normal distribution fairly well. We get t = 6.45/5.05 = 1.3, degrees of freedom = 31, which is not significant. Using t = 2.04 (Table 10.1) we get a 95% confidence interval of -3.85 to 16.75 litres/min.

3. We have two independent samples (Table 14.1). We must use the total number of patients randomized to treatments, in an intention to treat analysis (§2.5). Thus we have 1721 active treatment patients including 15 deaths, and 1706 placebo patients with 35 deaths. A chi-squared test gives us χ² = 8.3, d.f. = 1, P < 0.01. A comparison of two proportions gives a difference of -0.0118 with 95% confidence interval -0.0198 to -0.0038 (§8.6) and test of significance using the Standard Normal distribution gives a value of 2.88, P < 0.01 (§9.8).
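The comparison of two proportions in part 3 can be sketched as follows. The confidence interval uses the separate sample proportions in the standard error and the test uses the pooled proportion, which is the usual large sample approach and is assumed here to be the one behind the quoted figures; names are illustrative.

```python
import math

deaths_active, n_active = 15, 1721
deaths_placebo, n_placebo = 35, 1706

p1 = deaths_active / n_active
p2 = deaths_placebo / n_placebo
diff = p1 - p2

# Standard error for the confidence interval (unpooled proportions).
se_ci = math.sqrt(p1 * (1 - p1) / n_active + p2 * (1 - p2) / n_placebo)
ci = (diff - 1.96 * se_ci, diff + 1.96 * se_ci)

# Standard error for the significance test (pooled proportion under the null).
p_pooled = (deaths_active + deaths_placebo) / (n_active + n_placebo)
se_test = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_active + 1 / n_placebo))
z = diff / se_test

print(round(diff, 4), tuple(round(limit, 4) for limit in ci))
print(round(abs(z), 2), round(z ** 2, 1))
```

For a 2 by 2 table the square of this Normal deviate equals the chi-squared statistic without continuity correction, which is why 2.88 squared gives about 8.3.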

4. We are looking at the relationship between two variables (Table 14.3). Both variables have very non-Normal distributions. Nitrite is highly skew and pH is bimodal. It might be possible to transform the nitrites to a Normal distribution but the transformation would not be a simple one. The zero prevents a simple logarithmic transformation, for example. Because of this, regression and correlation are not appropriate and rank correlation can be used. Spearman's ρ = 0.58 and Kendall's τ = 0.40, both giving a probability of 0.004.

5. We have two independent samples (Table 14.1). We have two large samples and can do the Normal comparison of two means (§8.5). The standard error of the difference is 0.0178 s and the observed difference is 0.02 s, giving a 95% confidence interval of -0.015 to 0.055 for the excess mean transit time in the controls. If we had all the data, for each case we could calculate the mean MTT for the two controls matched to each case, find the difference between case MTT and control mean MTT, and use the one sample method of §8.3.

6. These are paired data, so we refer to Table 14.2. The unequal steps in the visual acuity scale suggest that it is best treated as an ordinal scale, so the sign test is appropriate. Pre minus post, there are 10 positive differences, no negative differences and 7 zeros. Thus we refer 0 to the Binomial distribution with p = 0.5 and n = 10. The probability is given by

$$\frac{10!}{0!\,10!} \times 0.5^{0} \times 0.5^{10} = 0.00098$$

Doubling this for a two-sided test gives P = 0.002.

7. We want to test for the relationship between two variables, which are both presented as categorical (Table 14.3). We use a chi-squared test for a contingency table, χ² = 38.1, d.f. = 6, P < 0.001. One possibility is that some other variable, such as the mother's smoking or poverty, is related to both maternal age and asthma. Another is that there is a cohort effect. All the age 14–19 mothers were born during the second world war, and some common historical experience may have produced the asthma in their children.

8. The serial measurements of thyroid hormone could be summarized using the area under the curve (§10.7). The oxygen dependence is tricky. The babies who died had the worst outcome, but if we took their survival time as the time they were oxygen dependent, we would be treating them as if they had a good outcome. We must also allow for the babies who went home on oxygen having a long but unknown oxygen dependence. My solution was to assign an arbitrary large number of days, larger than any for the babies sent home without oxygen, to the babies sent home on oxygen. I assigned an even larger number of days to the babies who died. I then used Kendall's tau b (§12.5) to assess the relationship with thyroid hormone AUC. Kendall's rank correlation was chosen in preference to Spearman's because of the large number of ties which the arbitrary assignment of large numbers produced.

9. This is a comparison of two independent samples, so we use Table 14.1. The variable is interval and the samples are small. We could either use the two sample t method (§10.3) or the Mann–Whitney U test (§12.2). The groups have similar variances, but the distribution shows a slight negative skewness. As the two sample t method is fairly robust to deviations from the Normal distribution and as I wanted a confidence interval for the difference I chose this option. I did not think that the slight skewness was sufficient to cause any problems.

By the two sample t method we get the difference between the means, immobile - mobile, to be 7.06, standard error = 5.74, t = 1.23, P = 0.23, 95% confidence interval = -4.54 to 18.66 hours. By the Mann–Whitney, we get U = 178.5, z = -1.06, P = 0.29. The two methods give very similar results and lead to similar conclusions, as we expect them to do when both methods are valid.

Solution to Exercise 15M: Multiple choice questions 81 to 86

81. TFTTF. §15.2. Unless the measurement process changes the subject, we would expect the difference in mean to be zero.

82. TFTFF. §15.4. We need the sensitivity as well as specificity. There are other things, dependent on the population studied, which may be important too, like the positive predictive value.

83. FTTTF. §15.4. Specificity, not sensitivity, measures how well people without the disease are eliminated.

84. TTFFF. §15.5. The 95% reference interval should not depend on the sample size.

85. FFFFT. §15.5. We expect 5% of ‘normal’ men to be outside these limits. The patient may have a disease which does not produce an abnormal haematocrit. This reference interval is for men, not women who may have a different distribution of haematocrit. It is dangerous to extrapolate the reference interval to a different population. In fact, for women the reference interval is 35.8 to 45.4, putting a woman with a haematocrit of 48 outside the reference interval. A haematocrit outside the 95% reference interval suggests that the man may be ill, although it does not prove it.

86. TFTTT. §15.6. As time increases, rates are based on fewer potential survivors. Withdrawals during the first interval contribute half an interval at risk. If survival rates change those subjects starting later in calendar time, and so more likely to be withdrawn, will have a different survival to those starting earlier. The first part of the curve will represent a different population to the second. The longest survivor may still be alive and so become a withdrawal.

Solution to Exercise 15E

1. The blood donors were used because it was easy to get the blood. This would produce a sample deficient in older people, so it was supplemented by people attending day centres. This would ensure that these were reasonably active, healthy people for their age. Given the problem of getting blood and the limited resources available, this seems a fairly satisfactory sample for the purpose. The alternative would be to take a random sample from the local population and try to persuade them to give the blood. There might have been so many refusals that volunteer bias would make the sample unrepresentative anyway. The sample is also biased geographically, being drawn from one part of London. In the context of the study, where we wanted to compare diabetics with normals, this did not matter so much, as both groups came from the same place. For a reference interval which would apply nationally, if there were a geographical factor the interval would be biased in other places. To look at this we would have to repeat the study in several places, compare the resulting reference intervals and pool as appropriate.

2. We want normal, healthy people for the sample, so we want to exclude people with obvious pathology and especially those with disease known to affect the quantity being measured. However, if we excluded all elderly people currently receiving drug therapy we would find it very difficult to get a sufficiently large sample. It is indeed ‘normal’ for the elderly to be taking analgesics and hypnotics, so these were permitted.

3. From the shape of the histogram and the Normal plot, the distribution of plasma magnesium does indeed appear Normal.

4. The reference interval, outside which about 5% of normal values are expected to lie, is $\bar{x} - 2s$ to $\bar{x} + 2s$, or 0.810 - 2 × 0.057 to 0.810 + 2 × 0.057, which is 0.696 to 0.924, or 0.70 to 0.92 mmol/litre.

5. As the sample is large and the data Normally distributed the standard error of the limits is approximately

$$\sqrt{\frac{3s^2}{n}} = \sqrt{\frac{3 \times 0.057^2}{140}} = 0.0083439$$

For the 95% confidence interval we take 1.96 standard errors on either side of the limit, 1.96 × 0.0083439 = 0.016. The 95% confidence interval for the lower reference limit is 0.696 - 0.016 to 0.696 + 0.016 = 0.680 to 0.712 or 0.68 to 0.71 mmol/litre. The confidence interval for the upper limit is 0.924 - 0.016 to 0.924 + 0.016 = 0.908 to 0.940 or 0.91 to 0.94 mmol/litre. The reference interval is well estimated as far as sampling errors are concerned.
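The reference interval and the confidence intervals for its limits can be sketched as below. The mean 0.810 and standard deviation 0.057 are quoted in part 4. The sample size is not quoted in this solution; `n = 140` is an assumption, chosen because it reproduces the quoted standard error of 0.0083439 under the large sample formula sqrt(3s²/n). Names are illustrative.

```python
import math

mean, sd = 0.810, 0.057   # plasma magnesium, mmol/litre, as quoted
n = 140                   # ASSUMPTION: inferred from the quoted standard error

# 95% reference interval: mean minus and plus two standard deviations.
lower_limit = mean - 2 * sd
upper_limit = mean + 2 * sd

# Approximate standard error of either limit for Normal data.
se_limit = math.sqrt(3 * sd ** 2 / n)
half_width = 1.96 * se_limit

ci_lower_limit = (lower_limit - half_width, lower_limit + half_width)
ci_upper_limit = (upper_limit - half_width, upper_limit + half_width)

print(round(lower_limit, 3), round(upper_limit, 3), round(se_limit, 7))
print(tuple(round(v, 2) for v in ci_lower_limit),
      tuple(round(v, 2) for v in ci_upper_limit))
```

Both limits share the same standard error, so the two confidence intervals have the same width and differ only in where they are centred.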

6. Plasma magnesium did indeed increase with age. The variability did not. This would mean that for older people the lower limit would be too low and the upper limit too high, as the few above this would all be elderly. We could simply estimate the reference interval separately at different ages. We could do this using separate means but a common estimate of variance, obtained by one-way analysis of variance (§10.9). Or we could use the regression of magnesium on age to get a formula which would predict the reference interval for any age. The method chosen would depend on the nature of the relationship.

Solution to Exercise 16M: Multiple choice questions 87 to 92

87. FTFFF. §16.1. It is for a specific age group, not age adjusted. It measures the number of deaths per person at risk, not the total number. It tells us nothing about age structure.

88. FTTTT. §16.4. The life table is calculated from age specific death rates. Expectation of life is the expected value of the distribution of age at death if these mortality rates apply (§6E). It usually increases with age.

89. TFTTF. The SMR (§16.3) for women who had just had a baby is lower than 100 (all women) and 105 (stillbirth women). The confidence intervals do not overlap so there is good evidence for a difference. Women who had had a stillbirth may be less or more likely than all women to commit suicide, we cannot tell. We cannot conclude that giving birth prevents suicide – it may be that optimists conceive, for example.

90. TFFFF. §16.3. Age effects have been adjusted for. It may also be that heavy drinkers become publicans. It is difficult to infer causation from observational data. Men at high risk of cirrhosis of the liver, i.e. heavy drinkers, may not become window cleaners, or window cleaners who drink may change their occupation, which requires good balance. Window cleaners have low risk. The ‘average’ ratio is 100, not 1.0.

91. FFFTF. §16.6. A life table tells us about mortality, not population structure. A bar chart shows the relationship between two variables, not their frequency distribution (§5.5).

92. TFFFT. §16.1, §16.2, §16.5. Expectation of life does not depend on age distribution (§16.4).

Solution to Exercise 16E

1. We obtain the rates for the whole period by dividing the number of deaths in an age group by the population size. Thus for ages 10–14 we have 44/4271 = 0.01030 cases per thousand population. This is for a 13 year period so the rate per year is 0.01030/13 = 0.00079 per 1000 per year, or 0.79 per million per year. Table 19.10 shows the rates for each age group. The rates are unusual because they are highest among the adolescent group, where mortality rates for most causes are low. Anderson et al. (1985) note that ‘…our results suggest that among adolescent males abuse of volatile substances currently account for 2% of deaths from all causes…’. The rates are also unusual because we have not calculated them separately for each sex. This is partly for simplicity and partly because the number of cases in most age groups is small as it is.

2. The expected number of deaths is found by multiplying the number in the age group in Scotland by the death rate for the period, i.e. per 13 years, for Great Britain. We then add these to get 27.19 deaths expected altogether. We observed 48, so the SMR is 48/27.19 = 1.77, or 177 with Great Britain as 100.

Table 19.10. Age-specific mortality rates for volatile substance abuse, Great Britain, and calculation of SMR for Scotland

             Great Britain ASMRs               Scotland        Scotland
Age group    Per million     Per thousand      population      expected
             per year        per 13 years      (thousands)     deaths
0–9            0.00            0.00000            653           0.00000
10–14          0.79            0.01030            425           4.37750
15–19          2.58            0.03358            447          15.01026
20–24          0.87            0.01137            394           4.47978
25–29          0.32            0.00415            342           1.41930
30–39          0.08            0.00108            659           0.71172
40–49          0.03            0.00033            574           0.18942
50–59          0.09            0.00112            579           0.64848
60+            0.03            0.00037            962           0.35594
Total                                                          27.19240

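3. We find the standard error of the SMR by

$$SE(SMR) = \frac{\sqrt{O}}{E} = \frac{\sqrt{48}}{27.19} = 0.2548$$

The 95% confidence interval is then 1.77 - 1.96 × 0.2548 to 1.77 + 1.96 × 0.2548, or 1.27 to 2.27. Multiplying by 100 as usual, we get 127 to 227. The observed number is quite large enough for the Normal approximation to the Poisson distribution to be used.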
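The indirect standardization in parts 2 and 3 can be reproduced from the columns of Table 19.10. This is an illustrative sketch; the code carries the unrounded SMR through to the confidence interval, so its limits may differ in the last digit from those quoted above, where the SMR is first rounded to 1.77.

```python
import math

# Great Britain death rates per thousand per 13 years, and Scotland
# population in thousands, by age group (Table 19.10).
gb_rate_per_thousand = [0.00000, 0.01030, 0.03358, 0.01137, 0.00415,
                        0.00108, 0.00033, 0.00112, 0.00037]
scotland_pop_thousands = [653, 425, 447, 394, 342, 659, 574, 579, 962]

observed = 48

# Expected deaths: Great Britain rate applied to each Scottish age group.
expected_by_age = [rate * pop for rate, pop
                   in zip(gb_rate_per_thousand, scotland_pop_thousands)]
expected = sum(expected_by_age)

smr = observed / expected

# Standard error from the Poisson variance of the observed count.
se_smr = math.sqrt(observed) / expected
ci = (smr - 1.96 * se_smr, smr + 1.96 * se_smr)

print(round(expected, 2), round(smr, 2), round(se_smr, 4))
print(tuple(round(100 * limit) for limit in ci))
```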

4. Yes, the confidence interval, 127 to 227, is well away from 100, the value expected if Scotland had the same rates as Great Britain. Other factors relate to the data collection, which was from newspapers, coroners, death registrations etc. Scotland has different newspapers and other news media and a different legal system to the rest of Great Britain. It may be that the association of deaths with VSA is more likely to be reported there than in England and Wales.

Solution to Exercise 17M: Multiple choice questions 93 to 97

93. TFTFT. §17.2. It is the ratio of the regression sum of squares to the total sum of squares.

94. FTFFF. §17.2. There were 37 + 1 = 38 observations. There is a highly significant ethnic group effect. The non-significant sex effect does not mean that there is no difference (§9.6). There are three age groups, so two degrees of freedom. If the effect of ethnicity were due entirely to age, it would have disappeared when age was included in the model.

95. TTTTF. §17.8. A four-level factor has three dummy variables (§17.6). If the effect of white cell count were due entirely to smoking, it would have disappeared when smoking was included in the model.

96. TTTFT. §17.4.

97. FFFFT. §17.9. Boys have a lower risk of readmission than girls, shown by the negative coefficient, and hence a longer time before being readmitted. Theophylline is related to a lower risk of readmission but we cannot conclude causation. Treatment may depend on the type and severity of asthma.

Solution to Exercise 17E

1. The difference is highly significant (P < 0.001) and is estimated to be between 1.3 and 3.7, i.e. volumes are higher in group 2, the trisomy-16 group.

2. From both the Normal plot and the plot against number of pairs of somites there appears to be one point which may be rather separate from the rest of the data, an outlier. Inspection of the data showed no reason to suppose that the point was an error, so it was retained. Otherwise the fit to the Normal distribution seems quite good. The plot against number of pairs of somites shows that there may be a relationship between mean and variability, but this is very small and will not affect the analysis too much. There is also a possible non-linear relationship, which should be investigated. (The addition of a quadratic term did not improve the fit significantly.)

3. Model difference in sum of squares = 207.139 - 197.708 = 9.431, residual mean square = 3.384, F ratio = 9.431/3.384 = 2.79 with 1 and 36 degrees of freedom, corresponding to t = 1.67, P > 0.1, not significant.
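The F ratio in part 3 and its equivalent t value can be checked as follows. The two model sums of squares and the divisor 3.384 are the figures quoted above; treating 3.384 as the residual mean square on 36 degrees of freedom is an interpretation made here, and the names are illustrative.

```python
import math

ss_larger_model = 207.139    # model sum of squares with the extra term
ss_smaller_model = 197.708   # model sum of squares without it
residual_mean_square = 3.384 # ASSUMPTION: read as a mean square, 36 d.f.

# Extra sum of squares explained by the added term (1 degree of freedom).
ss_difference = ss_larger_model - ss_smaller_model

f_ratio = ss_difference / residual_mean_square

# With 1 numerator degree of freedom, F is the square of a t statistic.
t_equivalent = math.sqrt(f_ratio)

print(round(ss_difference, 3), round(f_ratio, 2), round(t_equivalent, 2))
```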

Solution to Exercise 18M: Multiple choice questions 98 to 100

98. TTFTT. §9.9. Power is a property of the test, not the sample. It cannot be zero, as even when there is no population difference at all the test may be significant.

99. TTTTF. §18.5. If we keep on adding observations and testing, we are carrying out multiple testing and so invalidate the test (§9.10).

100. TTFFT. §18.1. Power is not involved in estimation.

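Solution to Exercise 18E

3. This is a comparison of two proportions (§18.5). We have $p_1$ = 0.15 and $p_2$ = 0.15 × 0.9 = 0.135, a reduction of 10%. With a power of 90% and a significance level of 5%, we have

$$n = \frac{f(\alpha,P)\,[p_1(1-p_1) + p_2(1-p_2)]}{(p_1 - p_2)^2} = \frac{10.5 \times (0.15 \times 0.85 + 0.135 \times 0.865)}{(0.15 - 0.135)^2} = 11399.5$$

Hence we need 11400 in each group, 22800 patients altogether. With a power of 80% and a significance level of 5%, we have

$$n = \frac{7.9 \times (0.15 \times 0.85 + 0.135 \times 0.865)}{(0.15 - 0.135)^2} = 8576.8$$

Hence we need 8577 in each group, 17154 patients altogether. Lowering the power reduces the required sample size, but, of course, reduces the chance of detecting a difference if there really is one.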
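The sample sizes for comparing two proportions can be sketched with a small function. `n_per_group_proportions` is a hypothetical helper written for this example. The multipliers 10.5 and 7.9 are assumed to be the rounded values of f(0.05, 0.90) and f(0.05, 0.80); the second agrees with the (1.96 + 0.85)² = 7.90 quoted in part 5 of this exercise.

```python
import math


def n_per_group_proportions(p1, p2, f):
    """Number needed in each group to compare proportions p1 and p2,
    where f is the multiplier f(alpha, P) for the chosen level and power."""
    return f * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2


p1 = 0.15
p2 = 0.15 * 0.9              # a 10% reduction

# ASSUMPTION: rounded multipliers for 5% significance, 90% and 80% power.
n_power_90 = math.ceil(n_per_group_proportions(p1, p2, 10.5))
n_power_80 = math.ceil(n_per_group_proportions(p1, p2, 7.9))

print(n_power_90, 2 * n_power_90)
print(n_power_80, 2 * n_power_80)
```

Rounding up with `math.ceil` is a deliberate choice: a fractional subject cannot be recruited, and rounding down would leave the study slightly short of the stated power.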

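4. This is the comparison of two means (§18.4). We estimate the sample size for a difference of one standard deviation, $\mu_1 - \mu_2 = \sigma$. With a power of 90% and a significance level of 5%, the number in each group is given by

$$n = \frac{2 f(\alpha,P)\,\sigma^2}{(\mu_1 - \mu_2)^2} = \frac{2 \times 10.5 \times \sigma^2}{\sigma^2} = 21$$

Hence we need 21 in each group. If we have unequal samples and $n_1$ = 100, $n_2$ is given by

$$f(\alpha,P)\,\sigma^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right) = (\mu_1 - \mu_2)^2, \qquad \frac{1}{n_2} = \frac{1}{10.5} - \frac{1}{100} = 0.0852, \qquad n_2 = 11.7$$

and so we need 12 subjects in the disease group.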
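The two calculations in part 4 can be sketched as below, working in units of the standard deviation so that the difference to be detected is 1. The multiplier 10.5 is assumed to be the rounded f(0.05, 0.90); the names are illustrative.

```python
import math

f = 10.5          # ASSUMPTION: f(alpha, P) for 5% significance, 90% power
sigma = 1.0       # work in standard deviation units
delta = 1.0       # difference to detect: one standard deviation

# Equal groups: n = 2 f sigma^2 / delta^2 in each group.
n_equal = math.ceil(2 * f * sigma ** 2 / delta ** 2)

# Unequal groups: f sigma^2 (1/n1 + 1/n2) = delta^2, solved for n2.
n1 = 100
inverse_n2 = delta ** 2 / (f * sigma ** 2) - 1 / n1
n2 = math.ceil(1 / inverse_n2)

print(n_equal, n2)
```

Fixing one group at 100 lets the other fall well below 21, because the precision of the comparison depends on the sum of the reciprocals of the two sample sizes.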

5. When the number of clusters is very small and the number of individuals within a cluster is large, as in this study, clustering can have a major effect. The design effect, by which the estimated sample size should be multiplied, is DEFF = 1 + (750 - 1) × 0.005 = 4.745. Thus the estimated sample size for any given comparison should be multiplied by 4.745. Looking at it another way, the effective sample size is the actual sample size, 3000, divided by 4.745, about 632. Further, sample size calculations should take into account degrees of freedom. In large sample approximation sample size calculations, power 80% and alpha 5% are embodied in the multiplier f(α, P) = f(0.05, 0.80) = (1.96 + 0.85)² = 7.90. For a small sample calculation using the t test, 1.96 must be replaced by the corresponding 5% point of the t distribution with the appropriate degrees of freedom, here 2 degrees of freedom giving t = 4.30. Hence the multiplier is (4.30 + 0.85)² = 26.52, 3.36 times that for the large sample.

The effect of the small number of clusters would reduce the effective sample size even more, down to 630/3.36 = 188. Thus the 3000 men in two groups of two clusters would give the same power to detect the same difference as 188 men randomized individually. The applicants resubmitted a proposal with many more clusters.
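The design effect argument can be followed step by step in a short sketch. All figures are those quoted in part 5; the code carries unrounded values through, whereas the text rounds the intermediate results (630 and 3.36) before the final division, and both routes give about 188. Names are illustrative.

```python
cluster_size = 750
intracluster_correlation = 0.005
total_men = 3000

# Design effect: 1 + (cluster size - 1) x intracluster correlation.
deff = 1 + (cluster_size - 1) * intracluster_correlation
effective_n = total_men / deff

# Multipliers for 80% power and 5% significance.
large_sample_multiplier = (1.96 + 0.85) ** 2   # Normal 5% point
small_sample_multiplier = (4.30 + 0.85) ** 2   # t 5% point, 2 d.f., as quoted
inflation = small_sample_multiplier / large_sample_multiplier

# Effective sample size after allowing for the few degrees of freedom.
effective_n_small = effective_n / inflation

print(round(deff, 3), round(effective_n))
print(round(inflation, 2), round(effective_n_small))
```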




References

Altman, D.G. (1982). Statistics and ethics in medical research. In Statistics in Practice (ed. S.M. Gore and D.G. Altman). British Medical Association, London.

Altman, D.G. (1991). Practical Statistics for Medical Research, Chapman and Hall, London.

Altman, D.G. (1998). Confidence intervals for the number needed to treat. British Medical Journal, 317, 1309–12.

Altman, D.G. and Bland, J.M. (1983). Measurement in medicine: the analysis of method comparison studies. The Statistician, 32, 307–17.

Altman, D.G. and Matthews, J.N.S. (1996). Statistics Notes: Interaction 1: heterogeneity of effects. British Medical Journal, 313, 486.

Anderson, H.R., Bland, J.M., Patel, S., and Peckham, C. (1986). The natural history of asthma in childhood. Journal of Epidemiology and Community Health, 40, 121–9.

Anderson, H.R., MacNair, R.S., and Ramsey, J.D. (1985). Deaths from abuse of substances, a national epidemiological study. British Medical Journal, 290, 304–7.

Anon (1997). All trials must have informed consent. British Medical Journal, 314, 1134–5.

Appleby, L. (1991). Suicide during pregnancy and in the first postnatal year. British Medical Journal, 302, 137–40.

Armitage, P. and Berry, G. (1994). Statistical Methods in Medical Research, Blackwell, Oxford.

Balfour, R.P. (1991). Birds, milk and campylobacter. Lancet, 337, 176.

Ballard, R.A., Ballard, P.C., Creasy, R.K., Padbury, J., Polk, D.H., Bracken, M., Maya, F.R., and Gross, I. (1992). Respiratory disease in very-low-birthweight infants after prenatal thyrotropin releasing hormone and glucocorticoid. Lancet, 339, 510–5.

Banks, M.H., Bewley, B.R., Bland, J.M., Dean, J.R., and Pollard, V.M. (1978). A long term study of smoking by secondary schoolchildren. Archives of Disease in Childhood, 53, 12–19.

Bewley, B.R. and Bland, J.M. (1976). Academic performance and social factors related to cigarette smoking by schoolchildren. British Journal of Preventive and Social Medicine, 31, 18–24.

Bewley, B.R., Bland, J.M., and Harris, R. (1974). Factors associated with the starting of cigarette smoking by primary school children. British Journal of Preventive and Social Medicine, 28, 37–44.

Bewley, T.H., Bland, J.M., Ilo, M., Walch, E., and Willington, G. (1975). Census of mental hospital patients and life expectancy of those unlikely to be discharged. British Medical Journal, 4, 671–5.

Bewley, T.H., Bland, J.M., Mechen, D., and Walch, E. (1981). ‘New chronic’ patients. British Medical Journal, 283, 1161–4.

Bland,J.M.andAltman,D.G.(1986).Statisticalmethodsforassessingagreementbetweentwomethodsofclinicalmeasurement.Lancet,i,307–10.

Bland,J.M.andAltman,D.G.(1993).Informedconsent.BritishMedicalJournal,306,928.

Bland,J.M.andAltman,D.G.(1998).StatisticsNotes.Bayesiansandfrequentists.BritishMedicalJournal,317,1151.

Bland,J.M.andAltman,D.G.(1999).Measuringagreementinmethodcomparisonstudies.StatisticalMethodsinMedicalResearch,8,135–60.

Bland,J.M.,Bewley,B.R.,Banks,M.H.,andPollard,V.M.(1975).Schoolchildren'sbeliefsaboutsmokinganddisease.HealthEducationJournal,34,71–8.

Bland,J.M.,Bewley,B.R.,Pollard,V.,andBanks,M.H.(1978).Effectofchildren'sandparents'smokingonrespiratorysymptoms.ArchivesofDiseaseinChildhood,53,100–5.

Bland,J.M.,Bewley,B.R.,andBanks,M.H.(1979).Cigarettesmokingandchildren'srespiratorysymptoms:validityofquestionnairemethod.Revued'EpidemiologieetSantéPublique,27,69–76.

Bland,J.M.,Holland,W.W.,andElliott,A.(1974).ThedevelopmentofrespiratorysymptomsinacohortofKentschoolchildren.BulletinPhysio-PathologieRespiratoire,10,699–716.

Bland, J.M. and Kerry, S.M. (1998). Statistics Notes. Weighted comparison of means. British Medical Journal, 316, 129.

Bland, J.M., Mutoka, C., and Hutt, M.S.R. (1977). Kaposi's sarcoma in Tanzania. East African Journal of Medical Research, 4, 47–53.

Bland, J.M. and Peacock, J.L. (2000). Statistical Questions in Evidence-Based Medicine, University Press, Oxford.

Bland, M. (1995). An Introduction to Medical Statistics, 2nd. ed., University Press, Oxford.

Bland, M. (1997). Informed consent in medical research: Let readers judge for themselves. British Medical Journal, 314, 1477–8.

BMJ (1996a). The Declaration of Helsinki. British Medical Journal, 313, 1448.

BMJ (1996b). The Nuremberg code (1947). British Medical Journal, 313, 1448.

Brawley, O.W. (1998). The study of untreated syphilis in the negro male. International Journal of Radiation Oncology, Biology, Physics, 40, 5–8.

Breslow, N.E. and Day, N.E. (1987). Statistical Methods in Cancer Research. Volume II – The Design and Analysis of Cohort Studies, IARC, Lyon.

British Standards Institution (1979). Precision of test methods. 1: Guide for the determination and reproducibility of a standard test method (BS 5497, part 1), BSI, London.

Brooke, O.G., Anderson, H.R., Bland, J.M., Peacock, J., and Stewart, M. (1989). The influence on birthweight of smoking, alcohol, caffeine, psychosocial and socio-economic factors. British Medical Journal, 298, 795–801.

Bryson, M.C. (1976). The Literary Digest poll: making of a statistical myth. The American Statistician, 30, 184–5.

Bulletin of Medical Ethics (1998). News: Lively debate on research ethics in the US. November, 3–4.

Burdick, R.K. and Graybill, F.A. (1992). Confidence Intervals on Variance Components, New York, Dekker.

Burr, M.L., St. Leger, A.S., and Neale, E. (1976). Anti-mite measures in mite-sensitive adult asthma: a controlled trial. Lancet, i, 333–5.

Campbell, M.J. and Gardner, M.J. (1989). Calculating confidence intervals for some non-parametric analyses. In Statistics with Confidence (ed. Gardner, M.J. and Altman, D.G.). British Medical Journal, London.

Carleton, R.A., Sanders, C.A., and Burack, W.R. (1960). Heparin administration after acute myocardial infarction. New England Journal of Medicine, 263, 1002–4.

Casey, A.T.H., Crockard, H.A., Bland, J.M., Stevens, J., Moskovich, R., and Ransford, A. (1996). Predictors of outcome in the quadriparetic nonambulatory myelopathic patient with rheumatoid-arthritis – a prospective study of 55 surgically treated Ranawat class IIIB patients. Journal of Neurosurgery, 85, 574–81.

Christie, D. (1979). Before-and-after comparisons: a cautionary tale. British Medical Journal, 2, 1629–30.

Cochran, W.G. (1977). Sampling Techniques, Wiley, New York.

Colton, T. (1974). Statistics in Medicine, Little Brown, Boston.

Cook, R.J. and Sackett, D.L. (1995). The number needed to treat: a clinically useful measure of treatment effect. British Medical Journal, 310, 452–4.

Conover, W.J. (1980). Practical Nonparametric Statistics, John Wiley and Sons, New York.

Cox, D.R. (1972). Regression models and life tables. Journal of the Royal Statistical Society Series B, 34, 187–220.

Curtis, M.J., Bland, J.M., and Ring, P.A. (1992). The Ring total knee replacement – a comparison of survivorship. Journal of the Royal Society of Medicine, 85, 208–10.

Davies, O.L. and Goldsmith, P.L. (1972). Statistical Methods in Research and Production, Oliver and Boyd, Edinburgh.

Dennis, M. (1997). Commentary: Why we didn't ask patients for their consent. British Medical Journal, 314, 1077.

Dennis, M., O'Rourke, S., Slattery, J., Staniforth, T., and Warlow, C. (1997). Evaluation of a stroke family care worker: results of a randomised controlled trial. British Medical Journal, 314, 1071–11.

DHSS (1976). Prevention and Health: Everybody's Business, HMSO, London.

Doll, R. and Hill, A.B. (1950). Smoking and carcinoma of the lung. British Medical Journal, ii, 739–48.

Doll, R. and Hill, A.B. (1956). Lung cancer and other causes of death in relation to smoking: a second report on the mortality of British doctors. British Medical Journal, ii, 1071–81.

Donnan, S.P.B. and Haskey, J. (1977). Alcoholism and cirrhosis of the liver. Population Trends, 7, 18–24.

Donner, A., Brown, K.S., and Brasher, P. (1990). A methodological review of non-therapeutic intervention trials employing cluster randomisation 1979–1989. International Journal of Epidemiology, 19, 795–800.

Doyal, L. (1997). Informed consent in medical research: Journals should not publish research to which patients have not given fully informed consent – with three exceptions. British Medical Journal, 314, 1107–11.

Easterbrook, P.J., Berlin, J.A., Gopalan, R., and Mathews, D.R. (1991). Publication bias in clinical research. Lancet, 337, 867–72.

Egero, B. and Henin, R.A. (1973). The Population of Tanzania, Bureau of Statistics, Dar es Salaam.

Esmail, A., Warburton, B., Bland, J.M., Anderson, H.R., and Ramsey, J. (1997). Regional variations in deaths from volatile substance abuse in Great Britain. Addiction, 92, 1765–71.

Finney, D.J., Latscha, R., Bennett, B.M., and Hsu, P. (1963). Tables for Testing Significance in a 2×2 Contingency Table, Cambridge University Press, London.

Fish, P.D., Bennett, G.C.J., and Millard, P.H. (1985). Heatwave morbidity and mortality in old age. Age and Ageing, 14, 243–5.

Flint, C. and Poulengeris, P. (1986). The ‘Know Your Midwife’ Report, Caroline Flint, London.

Friedland, J.S., Porter, J.C., Daryanani, S., Bland, J.M., Screaton, N.J., Vesely, M.J.J., Griffin, G.E., Bennett, E.D., and Remick, D.G. (1996). Plasma proinflammatory cytokine concentrations, Acute Physiology and Chronic Health Evaluation (APACHE) III scores and survival in patients in an intensive care unit. Critical Care Medicine, 24, 1775–81.

Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute, 15, 246–63.

Gardner, M.J. and Altman, D.G. (1986). Confidence intervals rather than P values: estimation rather than hypothesis testing. British Medical Journal, 292, 746–50.

Glasziou, P.P. and Mackerras, D.E.M. (1993). Vitamin A supplementation in infectious disease: a meta-analysis. British Medical Journal, 306, 366–70.

Goldstein, H. (1995). Multilevel Statistical Models, Edward Arnold, London.

Harper, R. and Reeves, B. (1999). Reporting of precision of estimates for diagnostic accuracy: a review. British Medical Journal, 318, 1322–3.

Hart, P.D. and Sutherland, I. (1977). BCG and vole bacillus in the prevention of tuberculosis in adolescence and early adult life. British Medical Journal, 2, 293–5.

Healy, M.J.R. (1968). Disciplining medical data. British Medical Bulletin, 24, 210–4.

Hedges, B.M. (1978). Question wording effects: presenting one or both sides of a case. The Statistician, 28, 83–99.

Henzi, I., Walder, B., and Tramèr, M.R. (2000). Dexamethasone for the prevention of postoperative nausea and vomiting: a quantitative systematic review. Anesthesia & Analgesia, 90, 186–94.

Hickish, T., Colston, K., Bland, J.M., and Maxwell, J.D. (1989). Vitamin D deficiency and muscle strength in male alcoholics. Clinical Science, 77, 171–6.

Hill, A.B. (1962). Statistical Methods in Clinical and Preventive Medicine, Churchill Livingstone, Edinburgh.

Hill, A.B. (1977). A Short Textbook of Medical Statistics, Hodder and Stoughton, London.

Holland, W.W., Bailey, P., and Bland, J.M. (1978). Long-term consequences of respiratory disease in infancy. Journal of Epidemiology and Community Health, 32, 256–9.

Holten, C. (1951). Anticoagulants in the treatment of coronary thrombosis. Acta Medica Scandinavica, 140, 340–8.

Hosmer, D.W. and Lemeshow, S. (1999). Applied Survival Analysis, John Wiley and Sons, New York.

Huff, D. (1954). How to Lie with Statistics, Gollancz, London.

Huskisson, E.C. (1974). Simple analgesics for arthritis. British Medical Journal, 4, 196–200.

James, A.H. (1977). Breakfast and Crohn's disease. British Medical Journal, 1, 943–7.

Johnson, F.N. and Johnson, S. (ed.) (1977). Clinical Trials, Blackwell, Oxford.

Johnston, I.D.A., Anderson, H.R., Lambert, H.P., and Patel, S. (1983). Respiratory morbidity and lung function after whooping cough. Lancet, ii, 1104–8.

Jones, B. and Kenward, M.G. (1989). Design and Analysis of Cross-Over Trials, Chapman and Hall, London.

Kaste, M., Kuurne, T., Vilkki, J., Katevuo, K., Sainio, K., and Meurala, H. (1982). Is chronic brain damage in boxing a hazard of the past? Lancet, ii, 1186–8.

Kendall, M.G. (1970). Rank Correlation Methods, Charles Griffin, London.

Kendall, M.G. and Babington Smith, B. (1971). Tables of Random Sampling Numbers, Cambridge University Press, Cambridge.

Kendall, M.G. and Stuart, A. (1969). The Advanced Theory of Statistics, 3rd. ed., vol. 1, Charles Griffin, London.

Kerrigan, D.D., Thevasagayam, R.S., Woods, T.O., McWelch, I., Thomas, W.E.G., Shorthouse, A.J., and Dennison, A.R. (1993). Who's afraid of informed consent? British Medical Journal, 306, 298–300.

Kerry, S.M. and Bland, J.M. (1998). Statistics Notes: Analysis of a trial randomized in clusters. British Medical Journal, 316, 54.

Kiely, P.D.W., Bland, J.M., Joseph, A.E.A., Mortimer, P.S., and Bourke, B.E. (1995). Upper limb lymphatic function in inflammatory arthritis. Journal of Rheumatology, 22, 214–17.

Kish, L. (1994). Survey Sampling, Wiley Classic Library, New York.

Lancet (1980). BCG: bad news from India. Lancet, i, 73–4.

Laupacis, A., Sackett, D.L., and Roberts, R.S. (1988). An assessment of clinically useful measures of the consequences of treatment. New England Journal of Medicine, 318, 1728–33.

Leaning, J. (1996). War crimes and medical science. British Medical Journal, 313, 1413–15.

Lee, K.L., McNeer, J.F., Starmer, F.C., Harris, P.J., and Rosati, R.A. (1980). Clinical judgements and statistics: lessons from a simulated randomized trial in coronary artery disease. Circulation, 61, 508–15.

Lemeshow, S., Hosmer, D.W., Klar, J., and Lwanga, S.K. (1990). Adequacy of Sample Size in Health Studies, John Wiley and Sons, Chichester.

Leonard, J.V., Whitelaw, A.G.L., Wolff, O.H., Lloyd, J.K., and Slack, S. (1977). Diagnosing familial hypercholesterolaemia in childhood by measuring serum cholesterol. British Medical Journal, 1, 1566–8.

Levine, M.I. and Sackett, M.F. (1946). Results of BCG immunization in New York City. American Review of Tuberculosis, 53, 517–32.

Lindley, D.V. and Miller, J.C.P. (1955). Cambridge Elementary Statistical Tables, Cambridge University Press, Cambridge.

Lopez-Olaondo, L., Carrascosa, F., Pueyo, F.J., Monedero, P., Busto, N., and Saez, A. (1996). Combination of ondansetron and dexamethasone in the prophylaxis of postoperative nausea and vomiting. British Journal of Anaesthesia, 76, 835–40.

Lucas, A., Morley, R., Cole, T.J., Lister, G., and Leeson-Payne, C. (1992). Breast milk and subsequent intelligence quotient in children born preterm. Lancet, 339, 510–5.

Luthra, P., Bland, J.M., and Stanton, S.L. (1982). Incidence of pregnancy after laparoscopy and hydrotubation. British Medical Journal, 284, 1013.

Machin, D., Campbell, M.J., Fayers, P., and Pinol, A. (1998). Statistical Tables for the Design of Clinical Studies, Second Edition, Blackwell, Oxford.

Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports, 50, 163–70.

Mather, H.M., Nisbet, J.A., Burton, G.H., Poston, G.J., Bland, J.M., Bailey, P.A., and Pilkington, T.R.E. (1979). Hypomagnesaemia in diabetes. Clinica Chimica Acta, 95, 235–42.

Matthews, D.E. and Farewell, V. (1988). Using and Understanding Medical Statistics, Second Edition, Karger, Basel.

Matthews, J.N.S. and Altman, D.G. (1996a). Statistics Notes: Interaction 2: compare effect sizes not P values. British Medical Journal, 313, 808.

Matthews, J.N.S. and Altman, D.G. (1996b). Statistics Notes: Interaction 3: how to examine heterogeneity. British Medical Journal, 313, 862.

Matthews, J.N.S., Altman, D.G., Campbell, M.J., and Royston, P. (1990). Analysis of serial measurements in medical research. British Medical Journal, 300, 230–5.

Maugdal, D.P., Ang, L., Patel, S., Bland, J.M., and Maxwell, J.D. (1985). Nutritional assessment in patients with chronic gastro-intestinal symptoms: comparison of functional and organic disorders. Human Nutrition: Clinical Nutrition, 39, 203–12.

Maxwell, A.E. (1970). Comparing the classification of subjects by two independent judges. British Journal of Psychiatry, 116, 651–5.

Mayberry, J.F., Rhodes, J., and Newcombe, R.G. (1978). Breakfast and dietary aspects of Crohn's disease. British Medical Journal, 2, 1401.

McKie, D. (1992). Pollsters turn to secret ballot. The Guardian, London, 24 August, p. 20.

McLean, S. (1997). Commentary: No consent means not treating the patient with respect. British Medical Journal, 314, 1076.

Meade, T.W., Roderick, P.J., Brennan, P.J., Wilkes, H.C., and Kelleher, C.C. (1992). Extra-cranial bleeding and other symptoms due to low dose aspirin and low intensity oral anticoagulation. Thrombosis and Haemostasis, 68, 1–6.

Meier, P. (1977). The biggest health experiment ever: the 1954 field trial of the Salk poliomyelitis vaccine. In Statistics: A Guide to the Biological and Health Sciences (ed. J.M. Tanur, et al.). Holden-Day, San Francisco.

Mitchell, E.A., Bland, J.M., and Thompson, J.M.D. (1994). Risk factors for readmission to hospital for asthma. Thorax, 49, 33–6.

Morris, J.A. and Gardner, M.J. (1989). Calculating confidence intervals for relative risks, odds ratios and standardized ratios and rates. In Statistics with Confidence (ed. Gardner, M.J. and Altman, D.G.). British Medical Journal, London.

MRC (1948). Streptomycin treatment of pulmonary tuberculosis. British Medical Journal, 2, 769–82.

Mudur, G. (1997). Indian study of women with cervical lesions called unethical. British Medical Journal, 314, 1065.

Newcombe, R.G. (1992). Confidence intervals: enlightening or mystifying. British Medical Journal, 304, 381–2.

Newnham, J.P., Evans, S.F., Michael, C.A., Stanley, F.J., and Landau, L.I. (1993). Effects of frequent ultrasound during pregnancy: a randomized controlled trial. Lancet, 342, 887–91.

Oakeshott, P., Kerry, S.M., and Williams, J.E. (1994). Randomised controlled trial of the effect of the Royal College of Radiologists' guidelines on general practitioners' referral for radiographic examination. British Journal of General Practice, 44, 197–200.

O'Brien, P.C. and Fleming, T.R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35, 549–56.

Office for National Statistics (1997). 1995, 1996, 1997 Mortality Statistics, General, Series DH1, No. 28, HMSO, London.

Office for National Statistics (1998a). 1998 Mortality Statistics, General, Series DH1, No. 29, HMSO, London.

Office for National Statistics (1998b). 1997 Birth Statistics, Series FM1, No. 26, HMSO, London.

Office for National Statistics (1999). Mortality Statistics, Childhood, Infant and Perinatal, Series DH3, No. 30, HMSO, London.

Oldham, H.G., Bevan, M.M., and McDermott, M. (1979). Comparison of the new miniature Wright peak flow meter with the standard Wright peak flow meter. Thorax, 34, 807–8.

OPCS (1991). Mortality Statistics, Series DH2, No. 16, HMSO, London.

OPCS (1992). Mortality Statistics, Series DH1, No. 24, HMSO, London.

Osborn, J.F. (1979). Statistical Exercises in Medical Research, Blackwell, Oxford.

Paraskevaides, E.C., Pennington, G.W., Naik, S., and Gibbs, A.A. (1991). Prefreeze/post-freeze semen motility ratio. Lancet, 337, 366–7.

Parmar, M. and Machin, D. (1995). Survival Analysis, John Wiley and Sons, Chichester.

Pearson, E.S. and Hartley, H.O. (1970). Biometrika Tables for Statisticians, vol. 1, Cambridge University Press, Cambridge.

Pearson, E.S. and Hartley, H.O. (1972). Biometrika Tables for Statisticians, vol. 2, Cambridge University Press, Cambridge.

Peduzzi, P., Concato, J., Kemper, E., Holford, T.R., and Feinstein, A.R. (1996). A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology, 49, 1373–9.

Pocock, S.J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika, 64, 191–9.

Pocock, S.J. (1982). Interim analyses for randomised clinical trials: the group sequential approach. Biometrics, 38, 153–62.

Pocock, S.J. (1983). Clinical Trials: A Practical Approach, John Wiley and Sons, Chichester.

Pocock, S.J. and Hughes, M.D. (1990). Estimation issues in clinical trials and overviews. Statistics in Medicine, 9, 657–71.

Pritchard, B.N.C., Dickinson, C.J., Alleyne, G.A.O., Hurst, P., Hill, I.D., Rosenheim, M.L., and Laurence, D.R. (1963). Report of a clinical trial from Medical Unit and MRC Statistical Unit, University College Hospital Medical School, London. British Medical Journal, 2, 1226–7.

Radical Statistics Health Group (1976). Whose Priorities?, Radical Statistics, London.

Ramsay, S. (1998). Miss Evers' Boys (review). Lancet, 352, 1075.

Reader, R., et al. (1980). The Australian trial in mild hypertension: report by the management committee. Lancet, i, 1261–7.

Rembold, C. (1998). Number needed to screen: development of a statistic for disease screening. British Medical Journal, 317, 307–12.

Rodin, D.A., Bano, G., Bland, J.M., Taylor, K., and Nussey, S.S. (1998). Polycystic ovaries and associated metabolic abnormalities in Indian subcontinent Asian women. Clinical Endocrinology, 49, 91–9.

Rose, G.A., Holland, W.W., and Crowley, E.A. (1964). A sphygmomanometer for epidemiologists. Lancet, i, 296–300.

Rowe, D. (1992). Mother and daughter aren't doing well. The Guardian, London, 14 July, p. 33.

Royston, P. and Altman, D.G. (1994). Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Applied Statistics, 43, 429–67.

Salvesen, K.A., Bakketeig, L.S., Eik-nes, S.H., Undheim, J.O., and Okland, O. (1992). Routine ultrasonography in utero and school performance at age 8–9 years. Lancet, 339, 85–9.

Samuels, P., Bussel, J.B., Braitman, L.E., Tomaski, A., Druzin, M.L., Mennuti, M.T., and Cines, D.B. (1990). Estimation of the risk of thrombocytopenia in the offspring of pregnant women with presumed immune thrombocytopenia purpura. New England Journal of Medicine, 323, 229–35.

Schapira, K., McClelland, H.A., Griffiths, N.R., and Newell, D.J. (1970). Study on the effects of tablet colour in the treatment of anxiety states. British Medical Journal, 2, 446–9.

Schmid, H. (1973). Kaposi's sarcoma in Tanzania: a statistical study of 220 cases. Tropical Geographical Medicine, 25, 266–76.

Schulz, K.F., Chalmers, I., Hayes, R.J., and Altman, D.G. (1995). Bias due to non-concealment of randomization and non-double-blinding. Journal of the American Medical Association, 273, 408–12.

Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components, Wiley, New York.

Senn, S. (1989). Cross-Over Trials in Clinical Research, Wiley, Chichester.

Shaker, J.L., Brickner, R.C., Findling, J.W., Kelly, T.M., Rapp, R., Rizk, G., Haddad, J.G., Schalch, D.S., and Shenker, Y. (1997). Hypocalcemia and skeletal disease as presenting features of celiac disease. Archives of Internal Medicine, 157, 1013–6.

Siegel, S. (1956). Non-parametric Statistics for the Behavioural Sciences, McGraw-Hill Kagakusha, Tokyo.

Sibbald, B., Addington Hall, J., Brenneman, D., and Freeling, P. (1994). Telephone versus postal surveys of general practitioners. British Journal of General Practice, 44, 297–300.

Snedecor, G.W. and Cochran, W.G. (1980). Statistical Methods, 7th edn., Iowa State University Press, Ames, Iowa.

Snowdon, C., Garcia, J., and Elbourne, D.R. (1997). Making sense of randomisation: Responses of parents of critically ill babies to random allocation of treatment in a clinical trial. Social Science and Medicine, 15, 1337–55.

South-east London Screening Study Group (1977). A controlled trial of multiphasic screening in middle-age: results of the South-East London Screening Study. International Journal of Epidemiology, 6, 357–63.

Southern, J.P., Smith, R.M.M., and Palmer, S.R. (1990). Bird attack on milk bottles: possible mode of transmission of Campylobacter jejuni to man. Lancet, 336, 1425–7.

Streiner, D.L. and Norman, G.R. (1996). Health Measurement Scales: A Practical Guide to Their Development and Use, second edition, Oxford, University Press.

Stuart, A. (1955). A test for homogeneity of the marginal distributions in a two-way classification. Biometrika, 42, 412.

‘Student’ (1908). The probable error of a mean. Biometrika, 6, 1–24.

‘Student’ (1931). The Lanarkshire Milk Experiment. Biometrika, 23, 398–406.

Thomas, P.R.S., Queraishy, M.S., Bowyer, R., Scott, R.A.P., Bland, J.M., and Dormandy, J.A. (1993). Leucocyte count: a predictor of early femoropopliteal graft failure. Cardiovascular Surgery, 1, 369–72.

Thompson, S.G. (1993). Controversies in meta-analysis: the case of the trials of serum cholesterol reduction. Statistical Methods in Medical Research, 2, 173–92.

Todd, G.F. (1972). Statistics of Smoking in the United Kingdom, 6th ed., Tobacco Research Council, London.

Tukey, J.W. (1977). Exploratory Data Analysis, Addison-Wesley, New York.

Turnbull, P.J., Stimson, G.V., and Dolan, K.A. (1992). Prevalence of HIV infection among ex-prisoners. British Medical Journal, 304, 90–1.

Velzeboer, S.C.J.M., Frenkel, J., and de Wolff, F.A. (1997). A hypertensive toddler. Lancet, 349, 1810.

Victora, C.G. (1982). Statistical malpractice in drug promotion: a case-study from Brazil. Social Science and Medicine, 16, 707–9.

White, P.T., Pharoah, C.A., Anderson, H.R., and Freeling, P. (1989). Improving the outcome of chronic asthma in general practice: a randomized controlled trial of small group education. Journal of the Royal College of General Practitioners, 39, 182–6.

Whitehead, J. (1997). The Design and Analysis of Sequential Clinical Trials, revised 2nd. ed., Chichester, Wiley.

Whittington, C. (1977). Safety begins at home. New Scientist, 76, 340–2.

Williams, E.I., Greenwell, J., and Groom, L.M. (1992). The care of people over 75 years old after discharge from hospital: an evaluation of timetabled visiting by Health Visitor Assistants. Journal of Public Health Medicine, 14, 138–44.

Wroe, S.J., Sandercock, P., Bamford, J., Dennis, M., Slattery, J., and Warlow, C. (1992). Diurnal variation in incidence of stroke: Oxfordshire community stroke project. British Medical Journal, 304, 155–7.

Zelen, M. (1979). A new design for clinical trials. New England Journal of Medicine, 300, 1242–5.

Zelen, M. (1992). Randomized consent designs for clinical trials: an update. Statistics in Medicine, 11, 131–2.



Index

A

abridged life table 200–1
absolute difference 271–2
absolute value 239
accepting null hypothesis 140
accidents 53
acute myocardial infarction 277
addition rule 88
adjusted odds ratio 323
admissions to hospital 86, 255–6, 354, 356, 370–1
age 53, 56–7, 267, 308–14, 316, 373
age, gestational 56–7
age in life table see life table
age-specific mortality rate 295–6, 299–300, 302, 307, 376–7
age-standardized mortality rate 74, 296, 302
age-standardized mortality ratio 297–9, 303, 307, 376–7
agreement 272–5
AIDS 58, 77–8, 169–71, 172, 174–8, 317–8
alpha spending 152
albumin 76–7
alcoholics 76–7, 308–17
allocation to treatment 6–13, 15, 20–1, 23
  alterations to 11–13, 21
  alternate 6–7, 11
  alternate dates 11–12
  by general practice 21, 23
  by ward 21
  cheating in 12–13
  known in advance 11
  in clusters 21–2, 179–81, 344–6
  minimization 13
  non-random 11–13, 21–2
  physical randomization 12
  random 7–11, 15, 17, 20–1, 25
  systematic 11–12
  using envelopes 12
  using hospital number 11
alpha error 140
alternate allocation 6–7, 11
alternative hypothesis 137, 139–42
ambiguous questions 40–1
analgesics 15, 18
analysis of covariance 321
analysis of variance 172–9, 261–2, 267–8, 318–21
  assumptions 173, 175–6
  balanced 318
  in estimation of measurement error 271
  fixed effects 177
  Friedman 321
  Kruskal–Wallis 217, 261–2
  in meta-analysis 327
  multi-way 318–21
  one-way 172–9, 261–2
  random effects 177–9
  in regression 310–15, 315
  two-way 318
  using ranks 217, 261–2, 321
angina pectoris 15–16, 138–9, 218–20
animal experiments 5, 16–17, 20–1, 33
anticoagulant therapy 11–12, 19, 142
antidiuretic hormone 196–7
antilogarithm 83
appropriate confidence intervals for comparison 134
appropriate significance tests for comparison 142–3
anxiety 18, 143, 210
ARC 58, 172, 174–7
arcsine square root transformation 165
area under the curve 104–5, 109–11, 169–71, 278, 373–4
  probability 104–5, 109–11
  serial data 169–71, 373–4
  ROC curve 278
arithmetic mean 59
arterial oxygen tension 183–4
arthritis 15, 18, 37, 40
Asian women 35
assessment 19–20
ascertainment bias 38
association 230–2
asthma 21, 265, 267, 332, 372, 373
atrophy of spinal cord 37
attack rate 303
attribute 47
AUC see area under the curve
average see mean
AVP 196–7
AZT (zidovudine) 77–8, 169–71



B

babies 267, 373–4
back-transformation 166–7, 271
backwards regression 326
bar chart 73–5, 354–6
bar notation 59
Bartlett's test 172
base of logarithm 82–4
baseline 79
baseline hazard 324
BASIC 107
Bayesian probability 87
Bayes' theorem 289
BCG vaccine 6–7, 11, 17, 33, 81
beta error 140, 337
between groups sum of squares 174
between cluster variance 345–6
between subjects variance 178–9, 204
bias 6, 11–14, 17–20, 28, 39–42, 283–4, 327, 350, 363
  ascertainment 38
  in allocation 11–13
  in assessment 19–20
  publication 327
  in question wording 40–2
  recall 39, 350, 363
  in reporting 17–19
  response 17–19
  in sampling 28, 31
  volunteer 6, 13–14, 32
biceps skinfold 165–7, 213–15, 339
bimodal distribution 54–5
binary variable see dichotomous variable
Binomial distribution 89–91, 94, 103, 106–8, 110, 128, 130–1, 132–3, 180
  and Normal distribution 91, 106–8
  mean and variance 94
  probability 90–1
  in sign test 138–9, 247
biological variation 269
birds 45–6, 255, 350
birth rate 303, 305
birthweight 150
blind assessment 19–20
blocks 9
blood pressure 19, 28, 117, 191, 268–9
BMI see body mass index
body mass index (BMI) 322–3
Bonferroni method 148–51
box and whisker plot 58, 66, 351, 359
boxers 264
boxes 93–4
breast cancer 37, 216–17
breast feeding 153
breathlessness 74–5
British Standards Institution 270
bronchitis 130–2, 146, 233–4



C

Campylobacter jejuni 44–6, 255, 350
C-T scanner 5–6, 68
caesarian section 25, 349
calculation error 70
calibration 194
cancer 23, 32–9, 41, 69–74, 216–17, 241–3
  breast 37, 216–17
  cervical cancer 23
  lung 32, 35–9, 68–70, 241–3, 299
  oesophagus 74, 78–80
  parathyroid cancer registry 39
capillary density 159–64, 174
cards 7, 12, 50
carry-over effect 15
case-control study 37–40, 45–6, 153–5, 241–3, 248, 323, 349–50, 362–3
case fatality rate 303
case report 33–4
case series 33–4
cataracts 266, 373
categorical data 47–8, 373 see nominal data
cats 350
cause of death 70–3, 75
cell of table 230
censored observations 281, 308, 324–5
census 27, 47–8, 86, 294
  decennial 27, 294
  hospital 27, 47–8, 86
  local 27
  national 27, 294
  years 294, 299
centile 57–8, 279–81
central limit theorem 107–8
cervical cancer 23
cervical smear 275
cervical cytology 22
chart
  bar see bar chart
  pie see pie chart
cheating in allocation 12–13
Chi-squared distribution 118–20, 232–3
  and sample variance 119–20, 132
  contingency tables 231–3, 249–51
  degrees of freedom 118–19, 231–2, 251
  table 233
chi-squared test 230–6, 238–40, 243–51, 249–51, 258–9, 261–2, 371, 372, 373
  contingency table 230–6, 238–40, 243–7, 249–51, 258–9, 261–2, 371, 372, 373
  continuity correction 238–40, 247, 259, 261
  degrees of freedom 231–2, 251
  goodness of fit 248–9
  logrank test 287–8
  sample size 341
  trend 243–5, 259, 261–2
  validity 234–6, 239–40, 245

children see school children
choice of statistical method 257–267
cholesterol 55, 326, 345
cigarette smoking see smoking
cirrhosis 297–9, 306, 317
class interval 49–50
class variable 317
clinical trials 5–25, 32–3, 326–30
  allocation to treatment 6–15, 20–1, 23
  assessment 19–20
  combining results from 326–30
  cluster randomized 21–2, 179–81, 205, 344–6, 380
  consent of subjects 22–4
  cross-over 15–16, 341
  double blind 19–20
  double masked see double blind
  ethics 19, 22–4
  grouped sequential 152
  informed consent 22–4
  intention to treat 14–15, 23, 348, 372
  meta-analysis 326–30
  placebo effect 17–19
  randomized 7–11
  sample size 336–42, 344–6, 347
  selection of subjects 16–17
  sequential 151–2
  volunteer bias 13–14
Clinstat computer program 3, 9, 30, 93, 248, 298
cluster randomization 21–2, 179–81, 205, 344–6, 380
cluster sampling 31, 344–6
Cochran, W.G. 230
coefficient of variation 271
coefficients in regression 189, 191–2, 310–12, 314, 317, 322–3, 325
  Cox 325
  and interaction 314
  logistic 322–3
  multiple 310–12, 314, 317
  simple linear 189, 191–2
coeliac disease 34, 165–7, 213–15, 339
cohort study 36–7, 350
cohort, hypothetical in life table 299
coins 7, 28, 87–92
colds 69, 241–3
colon transit time 267
combinations 97–8
combining data from different studies 326–30
common cold see colds
common estimate 326–30
common odds ratio 328–30
common proportion 145–7
common variance 162–4, 173
comparison
  multiple see multiple comparisons
  of means 12–19, 143–5, 162–4, 170–6, 338–41, 347, 361, 379–80
  of methods of measurement 269–73
  of proportions 130–2, 145–7, 233–4, 245–7, 259, 341–3, 347, 372, 379
  of regression lines 208–9, 367–8
  of two groups 128–32, 143–7, 162–4, 211–17, 233–4, 254, 255–7, 338–43, 344–6, 347, 361, 372, 379–80
  of variances 172, 260
  within one group 159–62, 217–20, 245–7, 257, 260–1, 341

compliance 183–4, 228–9, 363–7, 369–70
computer 2, 8–9, 30, 107, 166, 174, 201, 238, 288–90, 298, 308, 310, 318
  diagnosis 288–90
  random number generation 8–9, 107
  program for confidence interval of proportion 132
  programs for sampling 30
  statistical analysis 2, 174, 201, 298, 308, 310, 318
conception 142
conditional logistic regression 323
conditional odds ratio 248
conditional probability 96–7
conditional test 250
confidence interval 126–34
  appropriate for comparison 134
  centile 133, 280–1
  correlation coefficient 200–1
  difference between two means 128–9, 136, 162–4, 361
  difference between two proportions 130–1, 243
  difference between two regression coefficients 208–9, 368
  hazard ratio 288, 325
  mean 126–7, 136, 159–60, 335–6, 361
  median 133
  number needed to treat 290–1
  odds ratio 241–3, 248
  percentile 133, 280–1
  predicted value in regression 194–5
  proportion 128, 132–3, 336
  quantile 133, 280–1
  ratio of two proportions 131–2
  reference interval 280–1, 290, 375, 378
  regression coefficient 191–2
  regression estimate 192–4
  and sample size 335–6
  or significance test 142, 145, 227
  SMR 298–9, 307, 376–7
  sensitivity 276
  survival probability 283
  transformed data 166–7
  using rank order 216, 220
confidence limits 126–34
confounding 34–5
consent of research subjects 22–4
conservative methods 15
constraint 118–19, 250–1
contingency table 230, 330
continuity correction 225–6, 238–40, 247
  chi-squared test 238–40
  Kendall's rank correlation coefficient 226
  Mann-Whitney U test 225
  McNemar's test 247
continuous variable 47–50, 75, 87–8, 93, 103–6, 276–8, 323
  in diagnostic test 276–8

contrast sensitivity 266, 373
control group
  case control study 37–9, 350, 362–3
  clinical trial 5–7
controlled trial see clinical trial
cornflakes 153–5, 362–3
coronary artery disease 34, 149, 326
coronary thrombosis 11–12, 36
correlation 197–205, 220, 260–2, 309–11
  assumptions 200–1
  between repeated measurements 341
  coefficient 197–204
  confidence interval 200–1
  Fisher's z transformation 201, 339–40, 343
  intra-class 179, 204–5, 272, 346
  intra-cluster 346, 347
  linear relationship 199
  matrix 202, 309–10
  multiple 311
  negative 198
  positive 197
  product moment 198
  r 198–200
  r² 199–200
  rank see rank correlation
  and regression 199–200, 311
  repeated observations 202–4
  sample size 343–4
  significance test 200–1
  table of 200
  table of sample size 344
  zero 198
cough 34–5, 41, 128–32, 144–7, 233–4, 240–1, 254
counselling 41–2
counties 347
covariance analysis 321
Cox regression 324–5
crime 97
Crohn's disease 153–5, 165–7, 213–15, 339, 362–3
cross-classification 230, 370–1
cross-over trial 15–16, 137, 341
cross-sectional study 34–5
cross-tabulation 230, 370–1
crude death rate 294–5
crude mortality rate 294–5, 302
cumulative frequency 48–51, 56
cumulative survival probability 282–3, 299
cushion volume 333–4, 378
cut-off point 277–8, 281



D

death 27, 70–3, 96, 101–2, 281
death certificate 27, 294
death rate see mortality rate
decennial census 27, 294
decimal dice 8
decimal places 70, 268
decimal point 70
decimal system 69–70
decision tree 289–90
Declaration of Helsinki 22
degrees of freedom 61, 67, 118–20, 153–4, 159, 169, 171–2, 191, 231–2, 251, 288, 309, 311, 319, 331
  analysis of variance 173–5
  Chi-squared distribution 118–20
  chi-squared test 231–2, 251
  F distribution 120
  F test 171, 173–5
  goodness of fit test 248–9
  logrank test 288
  regression 191, 310, 313
  sample size calculations 335
  t distribution 120, 157–8
  t method 157–8, 160–4
  variance estimate 61, 67, 94–5, 119, 352–3
delivery 25, 230–1, 322–3, 349
demography 299
denominator 68–9
dependent variable 187
depressive symptoms 18
Derbyshire 128
design effect 344–6, 380
detection, below limit of 281
deviation from assumptions 161–2, 164, 167–8, 175–6, 196–7
deviations from mean 61, 352
deviations from regression line 187–8
dexamethasone 290–1
diabetes 135–6, 360–1
diagnosis 47–8, 86, 275–9, 288–90, 317
diagnostic test 136, 275–9, 361
diagrams 72–82, 85–6
  bar see bar chart
  pie see pie chart
  scatter see scatter diagram

diarrhoea 172, 318
diastolic blood pressure see blood pressure
dice 7–8, 87–9, 122
dichotomous variable 258–62, 308, 317, 321–3, 325, 328
difference against mean plot 161–2, 184, 271–5, 364–5, 367
differences 129–30, 138–9, 159–62, 184, 217–20, 271–5, 341, 364–5, 369–70
differences between two groups 128–31, 136, 143–7, 162–7, 211–17, 258–9, 338–43, 344–6, 347, 362–3
digit preference 269
direct standardization 296
discharge from hospital 48
discrete data 47, 49
discriminant analysis 289
distribution
  Binomial see Binomial distribution
  Chi-squared see Chi-squared distribution
  cumulative frequency see cumulative frequency distribution
  F see F distribution
  frequency see frequency distribution
  Normal see Normal distribution
  Poisson see Poisson distribution
  probability see probability distribution
  Rectangular see Rectangular distribution
  t see t distribution
  Uniform see Uniform distribution
distribution-free methods 210
diurnal variation 249
DNA 97
doctors 36, 68, 86, 297–9, 356
Doppler ultrasound 150
dot plot 77
double blind 19–20
double dummy 18
double masked see double blind
double placebo see double dummy
drug 69
dummy treatment see placebo
dummy variables 317, 328
Duncan's multiple range test 176



E

e, mathematical constant 83–4, 95
ecological fallacy 42–3
ecological studies 42–3
eczema 97
election 28, 32, 41
electoral roll 30, 32
embryos 333–4, 378
enumeration district 27
envelopes 12
enzyme concentration 347, 379–80
epidemiological studies 32, 34–40, 42–3, 45–6, 326
equality, line of 273–4
error 70, 140, 187, 192, 269–72, 337
  alpha 140
  beta 140, 337
  calculation 70
  first kind 140
  measurement 269–72
  second kind 140, 337
  term in regression model 187, 192
  type I 140
  type II 140, 337
estimate 61, 122–36, 326–30
estimation 122–36, 335–6
ethical approval 32
ethics 4, 19, 22–4, 32
evidence-based practice 1
expectation 92–4
  of a distribution 92–3
  of Binomial distribution 94
  of Chi-squared distribution 118
  of life 102, 300–2, 305, 357–8
  of sum of squares 60–4, 98–9, 119
expected frequency 230–31, 26, 250
expected number of deaths 297–9
expected value see expectation, expected frequency
experimental unit 21–2, 180
experiments 5–25
  animal 5, 16–17, 20–1, 33
  clinical see clinical trials
  design of 5–25
  factorial 10–11
  laboratory 5, 16–17, 20–1
expert system 288–90
ex-prisoners 128



F

F distribution 118, 120, 334
F test 171, 173–5, 311, 313–15, 317–18, 320, 334, 378
face-lifts 23
factor 317–18
factorial 90, 97
factorial experiment 10–11
false negative 277–9
false positive 277–9
family of distributions 90, 96
Farr, William 1
FAT see fixed activated T-cells
fat absorption 78, 169–71
fatality rate 303
feet, ulcerated 159–64, 174
fertility 142, 302–3
fertility rate 303
FEV1 49–54, 57–60, 62–3, 125–7, 133, 185–6, 188–95, 197–9, 201, 279–80, 310–11, 335–6
fever tree 26
Fisher 1
Fisher's exact test 236–40, 251–2, 259, 262
Fisher's z transformation 201, 343
five figure summary 58
five year survival rate 283
fixed activated T-cells (FAT) 318–21
fixed effects 177–9, 328
follow-up, lost to or withdrawn from 282
foot ulcers 159–64, 174
forced expiratory volume see FEV1
forest diagram 330
forward regression 326
fourths 57
frequency 48–56, 68–9, 230–1, 250
  cumulative 48–51
  density 52–4, 104–5
  distribution 48–56, 66–7, 103–5, 351–2, 354
  expected 230–1, 250
  per unit 52–4
  polygon 54
  and probability 87, 103–5
  proportion 68
  relative 48–50, 53–4, 104–5
  tally system 50, 54
  in tables 71, 230–1



G

G.P. 41
Gabriel's test 177
gallstones 284–8, 324–5
Galton 186
gastric pH 265–6, 372–3
Gaussian distribution see Normal distribution
gee whiz graph 79–80
geometric mean 113, 167, 320
geriatric admissions 86, 255–6, 354, 356, 370–1
gestational age 196–7
glucose 35, 66–7, 121–2, 351–3, 359–60
glue sniffing see volatile substance abuse
goodness of fit test 248–9
Gossett see Student
gradient 185–6
graft failure 331
graphs 72–82, 85–6
group comparison see comparisons
grouped sequential trials 152
grouping of data 167
guidelines 179–81



Hharmonicmean113hayfever97hazardratio288 324–25health40–1healthcentre220–1healthpromotion347healthypopulation279 292–3hearttransplants264heatwave86 255–6 356 370–1height75–6 87–8 93–4 112 159 185–6 188–95 197–9 201 208–9 308–17 367–9Helsinki,Declarationof22heteroscedasticity175heterogeneitytest249 328–9Hill,Bradford1histogram50–7 67 72 75 103–4 267 303–4 352 354 356 359historicalcontrols6HIV58 128 172 174–7holes93–4homogeneityofoddsratios328–9homogeneityofvarianceseeuniformvariancehomoscedasticity175hospitaladmissions86 255–6 356 370–1hospitalcensus27 47–8 85hospitalcontrols38–9house-dustmite265 372housingtenure230–1 317Huff79 81humanimmunodeficiencyvirusseeHIV

hypercholesterolaemia 55
hypertension 43 91 265 372
hypocalcaemia 34
hypothesis, alternative see alternative hypothesis
hypothesis, null see null hypothesis




I
ICC see intra-class correlation
ICD see International Classification of Disease
ileostomy 265 372
incidence 303
independent events 88 357
independent groups 128–32 143–7 162–4 172–7 211–17
independent random variables 93–4
independent trials 90
independent variable in regression 187
India 17 33
indirect standardization 296–9
induction of labour 322–3
infant mortality rate 303
infinity (∞) 291
inflammatory arthritis 40
informed consent 22–3
instrumental delivery 25 349
intention to treat 14–15 348–9 372
interaction 310 313–14 320–1 327–9 334 378
intercept 185–6
International Classification of Disease 70–72
inter-pupil distance 331
interquartile range 60
interval, class 49
interval estimate 126
interval scale 210 217 258–62 373
intra-class correlation coefficient 179 204–5 272 380
intra-cluster correlation coefficient 272 380




J
jittering in scatter diagrams 77




K
Kaplan-Meier survival curve 283
Kaposi's sarcoma 69 220–1
Kendall's rank correlation coefficient 222–6 245 261–2 373 374
  continuity correction 226
  in contingency tables 245
  τ 222
  table 225
  tau 222
  ties 23–4
  compared to Spearman's 224–5

Kendall's test for two groups 217
Kent 245–7
Know Your Midwife trial 25 348–9
knowledge based system 289–90
Korotkov sounds 268–9
Kruskal-Wallis test 217 261–2




L
labour 322–3 348–9
laboratory experiment 5 16 20–1
lactulose 172 175–7
Lanarkshire milk experiment 12
laparoscopy 142
large sample 126 128–32 143–7 168–9 258–60 335–6
least squares 187–90 205–6 310
left censored data 281
Levene test 172
life expectancy 102 300–2 305 357–8
life table 101–2 282–3 296 299–302
limits of agreement 274–5
line graph 77–80 354 356
line of equality 273–4
linear constraint 118–19 243–5 250–1
linear regression see regression, multiple regression
linear relationship 185–209 243–5
linear trend in contingency table 243–5
Literary Digest 31
lithotripsy 284
log see logarithm, logarithmic
log hazard 324–5
log-linear model 330
log odds 240 252–3 321–3
log odds ratio 241–2 252–3 323
logarithm 82–4 131
  base of 82–4
logarithm of proportion 131
logarithm of ratio 131

logarithmic scale 81–2
logarithmic transformation 113–14 116 164–7 175–6 184
  and coefficient of variation 271
  and confidence interval 167
  geometric mean 113 167
  to equal variance 164–7 175–6 196–7 271
  to Normal distribution 113–14 116 164–7 175–6 184 360 364–5 372
  standard deviation 113–14
  variance of 131 248

logistic regression 289 321–3 326 328–9 330
  conditional 323
  multinomial 330
  ordinal 330

logit transformation 235 248–9 321–3
Lognormal distribution 83 113
logrank test 284 287–9 325
longitudinal study 36–7
loss to follow-up 282
Louis, Pierre-Charles-Alexandre 1
lung cancer 32 35–9 68–70 96 242–3 299
lung function see FEV1, PEFR, mean transit time, vital capacity
lymphatic drainage 40




M
magnesium 135–6 292–3 360–1 375–6
malaria 26
mannitol 58 172 174–7 317–18
Mann–Whitney U test 164 211–17 225–7 258–9 259 278 373–4
  and two-sample t method 211 215–17
  continuity correction 225–6
  Normal approximation 215 225–6
  and ROC curve 278
  table 212
  tables of 217
  ties 213 215

Mantel's method for survival data 288
Mantel-Haenszel
  method for combining 2 by 2 tables 328
  method for trend 245

marginal totals 230–1
matched samples 159–62 217–20 245–7 260 341 363–7 369–70
matching 39 45–6
maternal age 267 373
maternal mortality rate 303
maternity care 25
mathematics 2
matrix 309
maximum 58 65 169 345
maximum voluntary contraction 308–16
McNemar's test 245–7 260
mean transit time 265 368
mean 59–60 67
  arithmetic 59
  comparison of two 128–9 143–5 162–4 338–41 361 378–9
  confidence interval for 126–7 132 335 361
  deviations from 60
  geometric 113 167
  harmonic 113
  of population 126–7 335–6
  of probability distribution 92–4 105–6
  of a sample 56–8 65–6 352–3
  sample size 335–6 338–41
  sampling distribution of 122–5
  standard error of 126–7 136 156 335 361
  sum of squares about 60–65

measurement 268–9
measurement error 269–72
measurement methods 272–5
median 56–9 133 216–7 220 351
  confidence interval for 133 220

Medical Research Council 9
mercury 34
meta-analysis 326–30
methods of measurement 269–73
mice 21 33 333–4 378
midwives 25 342–3
mild hypertension 265 368
milk 12–13 45–6 255 349–50
mini Wright peak flow meter see peak flow meter
minimization 13
minimum 58 66 351
misleading graphs 78–81
missing denominator 69
missing zero 79–80
mites 265 372
MLn 3
MLWin 3
mode 55
modulus 239
Monte Carlo methods 238

mortality 15 36 70–6 86 294–6 302–3 347 356 357–8 376–7
mortality rate 36 294–6 302–3
  age-specific 295–6 299–300 302 307 376–7
  age-standardized 296 302
  crude 294–5 302
  infant 303 305
  neonatal 303
  perinatal 303

mosquitos 26
MTB see mycobacterium tuberculosis
MTT see mean transit time
multifactorial methods 308–34
multi-level modelling 3
multinomial logistic regression 330
multiple comparisons 175–7
multiple regression 308–18 333–4
  analysis of variance for 310–15
  and analysis of variance 318
  assumptions 310 315–16
  backward 326
  class variable 317–18
  coefficients 310–12 314 378
  computer programs 308 310 318
  correlated predictor variables 312
  degrees of freedom 310 312
  dichotomous predictor 317
  dummy variables 317–18
  F test 311 313 317
  factor 317–18
  forward 326
  interaction 310 313–14 333–4 378
  least squares 310
  linear 310 314
  in meta-analysis 327
  non-linear 310 314–15 378
  Normal assumption 315–16
  outcome variable 308
  polynomial 314–15
  predictor variable 308 312–13 316–18
  quadratic term 315 316 378
  qualitative predictors 316–18
  R² 311
  reference class 317
  residual variance 310
  residuals 315–16 333–4 378
  significance tests 310–13
  standard errors 311–12
  stepwise 326
  sum of squares 310 313–14 378
  t tests 310–12 317
  transformations 316
  uniform variance 316
  variance ratio 311
  variation explained 311

multiple significance tests 148–52 169
multiplicative rule 88 90 92–4 96
multi-way analysis of variance 318–21
multi-way contingency tables 330
muscle strength 308–16
mutually exclusive events 88 90 357
mycobacterium tuberculosis (MTB) 318–21
myocardial infarction 277 347 379




N
Napier 83
natural history 26 33
natural logarithm 83
natural scale 81–2
nausea and vomiting 290–1
Nazi death camps 22
negative predictive value 278–9
neonatal mortality rate 300
New York 6–7 10
Newman-Keuls test 176
Nightingale, Florence 1
nitrite 265 372–3
NNH see number needed to harm
NNT see number needed to treat
nodes in breast cancer 216–17
nominal scale 210 258–62
non-parametric methods 210 226–7
non-significant 140–1 142–3 149
none detectable 281
Normal curve 106–9
normal delivery 25 349
Normal distribution 91 101–20
  and Binomial 91 106–8
  in confidence intervals 126–7 258–60 262 373
  in correlation 200–1
  derived distributions 118–20
  independence of sample mean and variance 119–20
  as limit 106–8
  and normal range 279–81 293
  of observations 112–18 156 210 258–62 359–60
  and reference interval 279–81 293 375 378
  in regression 187 192 194 315–16
  in significance tests 143–7 258–60 262 368
  standard error of sample standard deviation 132
  in t method 156–8
  tables 109–10

Normal plot 114–19 121–2 161 163 165–7 170–3 175–6 180–1 267 359–60
Normal probability paper 114
normal range see reference interval
null hypothesis 137 139–42
number needed to harm 290
number needed to treat 290–1
Nuremberg trials 22
nuisance variable 320




O
observational studies 5 26–46
observed and expected frequencies 230–1
occupation 96
odds 240 321–3
odds ratio 240–2 248 252–3 259 323 328–9
oesophageal cancer 74 77–80
Office of National Statistics 294
on treatment analysis 15
one-sided percentage point 110
one-sided test 141–2 237
one-tailed test 141–2 237
opinion poll 29 32 41 347 378–9
ordered nominal scale 258–62
ordinal logistic regression 330
ordinal scale 210 220 258–62 373
outcome variable 187 190 308 321
outliers 58 196 378
overview 326–30
oxygen dependence 267 373–4




P
pa(O2) 183–4
pain 15–16 18
pain relief score 18
paired data 129–30 138–9 159–62 167–8 217–20 245–7 260 341 363–7 369–70 372
  in large sample 129–30
  McNemar's test see McNemar's test
  sample size 341
  sign test see sign test
  t method see t methods
  Wilcoxon test see Wilcoxon test

parameter 90
parametric methods 210 226–7
parathyroid cancer 282–4
parity 49 52–3 248–9
passive smoking 34–5
PCO see polycystic ovary disease
peak expiratory flow rate see PEFR
peak flow meter 269–75
peak value 169
Pearson's correlation coefficient see correlation coefficient
PEFR 54 128–9 144–5 147–8 208–9 265 269–75 363–4 368
percentage 68 71
percentage point 109–10 347 378
percentile 57 279–81
perinatal mortality rate 303
permutation 97–8
pH 265 372–3
phlegm 145 147–8

phosphomycin 69
physical mixing 12
pictogram 80–1
pie chart 72–3 80 354–5
pie diagram see pie chart
pilot study 335 339 341
Pitman's test 260
placebo 17–20 22
point estimate 125
Poisson distribution 95–6 108 165 248–50 252 298–9
Poisson heterogeneity test 249
Poisson regression 330
poliomyelitis 13–14 19 68 86 355
polycystic ovary disease 35
polygon see frequency polygon
polynomial regression 314–15
population 27–34 36 39 87 335–6
  census 27 294
  estimate 294
  mean 126–7 335–6
  national 27 294
  projection 302
  pyramid 303–5
  restricted 33
  standard deviation 124–5
  statistical usage 28
  variance 124–5

positive predictive value 278
power 147–8 337–46
p–p plot 117–18
precision 268–9
predictor variable 187 190 308 312–13 316–18 321 323 324
pregnancy 25 49 348–9
premature babies 267
presenting data 68–86
presenting tables 71–2
prevalence 35 90 278–9 303

probability 87–122
  addition rule 88
  conditional 96–7
  density function 104–6
  distribution 88–9 92–4 103–6 357–8
  of dying 101–2 299–300 357–8
  multiplication rule 88 96
  paper 114
  in significance tests 137–9
  of survival 101–2 357–8
  that null hypothesis is true 140

product moment correlation coefficient see correlation coefficient
pronethalol 15–16 138–9 217–20
proportion 68–9 71 128 130–3 165 321–3
  arcsine square root transformation 165
  confidence interval for 128 132–3 336
  denominator 69
  difference between two 130–1 145–7 233–4 245–7 341–3 347
  as outcome variable 321–3
  ratio of two 131–2 147
  sample size 336 341–3 347
  standard error 128 336
  in tables 71
  of variability explained 191 200

proportional frequency 48
proportional hazards model 324–5
prosecutor's fallacy 97
prospective study 36–7
protocol 268
pseudo-random 8
publication bias 327
pulmonary tuberculosis see tuberculosis
pulse rate 178–9 190–1 204
P value 1 139–41
P value spending 152
pyramid, population 303–5




Q
q–q plot see quantile–quantile plot
quadratic term 315 316 378
qualitative data 47 258–62 316–18
quantile 56–8 116–18 133 279–81
  confidence interval 133 280–1

quantile-quantile plot 116–18
quantitative data 47 49
quartile 57–8 66 351
quasi-random sampling 31
questionnaires 36 40–2
quota sampling 28–29




R
r, correlation coefficient 198–9
r² 199–200 311
rS, Spearman rank correlation 220
R, multiple correlation coefficient 311
R² 311
radiological appearance 20
RAGE 23
random allocation 7–11 15 17 20–3 25
  by general practice 21 23
  by ward 21
  in clusters 21–2 344–6

random blood glucose 66–7
random effects 177–9 328
random numbers 8 10 29–30
random sampling 9 29–32 38 90
random variable 87–118
  addition of a constant 93
  difference between two 94
  expected value of 92–4
  mean of 92–4
  multiplied by a constant 92
  sum of two 92–3
  variance of 92–4

randomization see random allocation
randomized consent 23
randomizing devices 7–8 87 90
range 59–60 279
  interquartile 59–60
  normal see reference interval
  reference see reference interval
rank 211 213–14 218 221 223
rank correlation 220–6 261–2 373 374
  choice of 226 261–2
  Kendall's 222–6 261–2 373 374
  Spearman's 220–2 226 261–2 374

rank order 211 213–14 221
rank sum test 210–20
  one sample see Wilcoxon
  two sample see Mann Whitney

rate 68–9 71
  age specific mortality 295–6 299–300 302 307
  age standardized mortality 296 302
  attack 303
  birth 303 305
  case fatality 303
  crude mortality 294–5 302
  denominator 69
  fertility 303
  five year survival 283
  incidence 303
  infant mortality 303 305
  maternal mortality 303
  mortality 294–6 302–3
  multiplier 68 295
  neonatal mortality 303
  perinatal mortality 303
  prevalence 303
  response 31–2
  stillbirth 303
  survival 283

ratio
  odds see odds ratio
  of proportions 131–2 147
  scale 257–8
  standardized mortality see standardized mortality ratio

rats 20

raw data 167
recall bias 39 350 363
receiver operating characteristic curve see ROC curve
reciprocal transformation 165–7
Rectangular distribution 107–8
reference class 317
reference interval 33 136 279–81 293 361 375 378
  confidence interval 280–1 293 375 378
  by direct estimation 280–1
  sample size 347 378
  using Normal distribution 279–80 293 361 375 378
  using transformation 280

refusing treatment 13–15 25
register of deaths 27
regression 185–9 199–200 205–7 208–9 261–2 308–18 312–30 333–4
  analysis of variance for 310–15
  assumptions 187 191–2 194–5 196–7
  backward 326
  coefficient 189 191–2
  comparing two lines 208–9 367–8
  confidence interval 192
  in contingency table 234–5
  and correlation 199–200
  Cox 324–5
  dependent variable 187
  deviations from 187
  deviations from assumptions 196–7
  equation 189
  error term 187 192
  estimate 192–3
  explanatory variable 187
  forward 326
  gradient 185–6
  independent variable 187
  intercept 185–6
  least squares 187–90 205–6
  line 187
  linear 189
  logistic 321–3 326 328–9
  multinomial logistic 330
  multiple see multiple regression
  ordinal logistic 330
  outcome variable 187 190
  outliers 196
  perpendicular distance from line 187–8
  Poisson 330
  polynomial see polynomial regression
  prediction 192–4
  predictor variable 187 190
  proportional hazards 324–5
  residual sum of squares 191
  residual variance 191
  residuals 194–6
  significance test 192
  simple linear 189
  slope 185–6
  standard error 191–4
  stepwise 326
  sum of products 189
  sum of squares about 191–2 310
  sum of squares due to 191–2
  towards the mean 186–7 191
  variability explained 191 200
  variance about line 191–2 205–6
  X on Y 190–1

rejecting null hypothesis 140–1
relationship between variables 33 73–8 185–209 220–6 230–45 257 261–2 308–34
relative frequency 48–50 53 103–5
relative risk 132 241–3 248 323
reliability 272
repeatability 33 269–72
repeated observations 169–71 202–3
repeated significance tests 151–2 169

replicates 177
representative sample 28–32 34
residual mean square 174 270
residual standard deviation 191–2 270
residual sum of squares 174 310 312
residual variance 173 310
residuals 165–6 175–6 267 315–16 333–4
  about regression line 194–6 315–16
  plots of 162–4 173–4 194–6 315–16 333–4 378
  within groups 165–6 175–6

respiratory disease 32 34–5
respiratory symptoms 32 34–5 41 125–9 142–7 233–4 240–1 243–7 254
response bias 17–19
response rate 31–2
response variable see outcome variable
retrospective study 39
rheumatoid arthritis 37
Richter scale 114
risk 131–2
risk factor 39 326–7 350
RND(X) 107
robustness to deviations from assumptions 167–9
ROC curve 277–8




S
s², symbol for variance 61
saline 13–14
Salk vaccine 13–14 17 19 68 355
salt 43
sample 87
  large 127–31 168–9 258–60 262 335–6
  mean see mean
  size see size of sample
  small 130–1 132–3 156–69 227 258–60 262 344
  variance see variance

sampling 27–34
  in clinical studies 32–4 293 375
  cluster 31
  distribution 122–5 127
  in epidemiological studies 32 34–9
  experiment 63–4 122–5
  frame 29
  multi-stage 30
  quasi-random 31
  quota 29
  random 29–31
  simple random 29–30
  stratified 31
  systematic 31

scanner 5–6
scatter diagram 75–7 185–6
scattergram see scatter diagram
schoolchildren 12–13 17 22 31 34–5 41 43 128–32 143–7 233–4 240–1 243–7 254

schools 22 31 34
screening 15 22 81 216–7 265 275–9
selection of subjects 16–17 32–3 37–9
  in case control studies 37–9
  in clinical trials 16–17
  self 31–2

self selection 31–2
semen analysis 183
semi-parametric 325
sensitivity 276–8
sequential analysis 151–2
sequential trials 151–2
serial measurements 169–71
sex 71–2
sign test 138–9 161 210 217 219–20 228 246–7 260 369–70 372 373
signed-rank test see Wilcoxon
significance and importance 142–3
significance and publication 327
significance level 140–1 147
significance tests 137–55
  multiple 148–52 169
  and sample size 336–8
  in subsets 149–50
  inferior to confidence intervals 142 145

significant difference 140
significant digits see significant figures
significant figures 69–72 268–9
size of sample 32 147–8 335–47
  accuracy of estimation 344
  in cluster randomization 344–6
  correlation coefficient 343–4
  and estimation 335–6
  paired samples 341
  reference interval 347 378
  and significance tests 147–8 336–8
  single mean 335–6
  single proportion 336 378–9
  two means 338–41 379–80
  two proportions 341–3 379

skew distribution 56 59 67 112–14 116–17 165 167–8 360
skinfold thickness 165–7 213–15 335
slope 185–6
small samples 156–67 227 258–60
smoking 22 26 31–2 34–9 41 67 74–5 241–3 356
SMR 297–9 303 307 376–7
Snow, John 1
sodium 116–17
somites 333–4 378
South East London Screening Study 15
Spearman's rank correlation coefficient 220–2 226 261–2 373
  table 219
  ties 219

specificity 276–8
spinal cord atrophy 37
square root transformation 165–7 175–7
squares, sum of see sum of squares
standard age specific mortality rates 297–8
standard deviation 60 62–4 67 92–4 119–21
  degrees of freedom for 63–4 67 119
  of differences 159–62
  of population 123–4
  of probability distribution 92–4 105
  of sample 62–4 67 119 353
  of sampling distribution 123–4
  and transformation 113–14
  and standard error 126
  standard error of 132
  within subjects 269–70

standard error 122–5
  and confidence intervals 126–7
  centile 280
  correlation coefficient 201 343
  difference between two means 128–9 136 338–41 361 379–80
  difference between two proportions 130–1 145–7 341–3 379
  difference between two regression coefficients 208 367–8
  different in significance test and confidence interval 147
  log hazard ratio 325
  log odds ratio 241–2 252–3
  logistic regression coefficient 322
  mean 123–5 136 335–6 361
  percentile 280
  predicted value in regression 192–4
  proportion 128 336 378–9
  quantile 280
  ratio of two proportions 121–2
  reference interval 280 370–1 378
  regression coefficient 191–2 311–12 317
  regression estimate 192–3
  SMR 298–9 377
  standard deviation 132
  survival rate 283–4 341

Standard Normal deviate 114–17 225–6
Standard Normal distribution 108–11 143 156–8 337–8
standard population 296
standardized mortality rate 74 296
standardized mortality ratio 296–9 303 307
standardized Normal probability plot 117–18
Stata 118
StatExact 238
statistic 47 139 302–3 337
  test 139 337
  vital 302–3

Statistics 1
statistical significance see significance test
stem and leaf plot 54 57 66 184 351 364–6
step function 51 283
step-down 326
step-up 326
stepwise regression 326
stillbirth rate 303

stratification 31
strength 308–16
strength of evidence 137 140 362
streptomycin 9–10 17 19–20 81 235–6 290
stroke 5–6 23 249
Stuart test 248 260
Student 12–13 156 158–9
Student's t distribution see t distribution
Studentized range 176
subsets 149–50
success 90
suicide 306
sum of products about mean 189 198–200
sum of squares 60–1 63–5 98–9 119 173–4 310 313–14
  about mean 60–1 63–5 119 352–3
  about regression 191–2 310 313–14
  due to regression 191–2 310 313–14
  expected value of 63–4 98–9 119

summary statistics 169 180–1 327
summation 59
survey 28–9 42 90
survival 10 101–2 281–8 324–5
  analysis 281–8 324–5
  curve 283–4 286 324
  probability 101–2 282–4 287
  rate 283
  time 162 281–8

symmetrical distribution 54 56 59
synergy 320
syphilis 22
systolic blood pressure see blood pressure




T
t distribution 120 156–9
  degrees of freedom 120 153–4 157–8
  and Normal distribution 120 156–8
  shape of 157
  table 158

t methods 114 156–69
  assumptions 161–8 184 365–7
  confidence intervals 159–63 164 167
  deviations from assumptions 161–2 164 167–8
  difference between means in matched sample 159–61 184 260 363–7 370 372
  difference between means in two samples 162–7 258–9 262
  one sample 159–62 184 260 363–7 370 372
  paired 159–62 167–8 184 217 220 260 363–7 370 372
  regression coefficient 191–2 310–12 317
  single mean 159–62 176
  two sample 162–7 217 258–9 262 317 373–4
  unpaired same as two sample

table of probability distribution
  Chi-squared 233
  correlation coefficient 200
  Kendall's τ 225
  Mann–Whitney U 212
  Normal 109–10
  Spearman's ρ 222
  t 158
  Wilcoxon matched pairs 219

table of sample size for correlation coefficient 344
tables of random numbers 8–9 29–30

tables, presentation of 71–2
tables, two way 230–48
tails of distributions 56 359–60
tally system 49–50 54
Tanzania 69 220–4
TB see tuberculosis
telephone survey 42
temperature 10 70 86 210 255–6 332
test, diagnostic 136 275–9 361
test, significance see significance test
test statistic 136 337
three dimensional effect in graphs 80
thrombosis 11–12 36 345
thyroid hormone 267 373–4
ties in rank tests 213 215 218–19 222–4
ties in sign test 138
time 324–5
time series 77–8 169–71 354 356
time to peak 169
time, survival see survival time
TNF see tumour necrosis factor
total sum of squares 174
transformations 112–14 163–7 320
  arcsine square root 165
  and confidence intervals 167
  Fisher's z 201 343
  logarithmic 112–14 116 163–7 170–1 175–6 184 320 364–7 369–70
  logit 240 252–3
  to Normal distribution 112–14 116 164–7 175–6 184
  reciprocal 113 165–7
  and significant figures 269
  square root 165–7 175–7
  to uniform variance 163–7 168 175–6 196–7 271

treated group 5–7
treatment 5–7 326–7
treatment guidelines 179–81
trend in contingency tables 243–5
  chi-squared test 243–5
  Kendall's τb 245
  Mantel–Haenszel 245

trial, clinical see clinical trial
trial of scar 322–3
triglyceride 55–6 58–59 63 112–13 280–2
trisomy-16 333–4 378
true difference 147
true negative 278
true positive 278
tuberculosis 6–7 9–10 17 81–2 290
Tukey 54 58
Tukey's Honestly Significant Difference 176
tumour growth 20
tumour necrosis factor (TNF) 318–21
Tuskegee Study 22
twins 204
two-sample t test see t methods
two-sample trial 16
two-sided percentage point 110
two-sided test 141–2
two-tailed test 141–2
type I error 140
type II error 140 337




U
ulcerated feet 159–64 174
ultrasonography 134
unemployment 42
Uniform distribution 107–8 249
uniform variance 159 162–4 167–8 175–6 187 191 196–7 316 319–20
unimodal distribution 55
unit of analysis 21–2 179–81
urinary infection 69
urinary nitrite 265 372–3




V
vaccine 6–7 11 13–14 17 19
validity of chi-squared test 234–6 239–40 245
variability 59–64 269
variability explained by regression 191 200
variable 47
  categorical 47
  continuous 47 49
  dependent 187
  dichotomous 259–62
  discrete 47 49
  explanatory 187
  independent 187
  nominal 210 259–62
  nuisance 320
  ordinal 210 259–62
  outcome 187 190 308 321
  predictor 187 190 308 312–13 316–18 321 323 324
  qualitative 47 316–18
  quantitative 47
  random see random variable

variance 59–64 67
  about regression line 191–2 205–6
  analysis of see analysis of variance
  between clusters 345–6
  between subjects 178–9 204
  common 162–4 170 173
  comparison in paired data 260
  comparison of several 172
  comparison of two 171 260
  degrees of freedom for 61 63–4 352–3
  estimate 59–64 124–5
  of logarithm 131 252
  population 123–4
  of probability distribution 91–4 105
  of random variable 91–4
  ratio 120 311
  residual 192 205–6 310
  sample 59–64 67 94 98–9 119 352–3
  uniform 162 163–7 168 174–6 187 196–7 316
  within clusters 345–6
  within subjects 178–9 204 269–72

variation, coefficient of 271
visual acuity 266 373
vital capacity 75–6
vital statistics 302–3
vitamin A 328–9
vitamin D 115–16
volatile substance abuse 42 307 376–7
volunteer bias 6 13–14 32
volunteers 5–6 13–14 16–17
VSA see volatile substance abuse




W
Wandsworth Health District 86 255–6 356
website 3 4
weight gain 20–1
wheeze 267
whooping cough 265 373
Wilcoxon test 217–20 260 369–70 373
  matched pairs 217–20 260 369–70 373
  one sample 217–20 260 369–70 373
  signed rank 217–20 260 369–70 373
  table 219
  ties 218–19
  two sample 217 see Mann-Whitney

withdrawn from follow-up 282
within cluster variance 345–6
within group residuals see residuals
within groups sum of squares 173
within groups variance 173
within subjects variance 178–9 204

within subjects variation 178–9 269–72
Woolf's test 328
Wright peak flow meter see peak flow meter




X
x̄ (x with bar above), symbol for mean 59
X-ray 19–20 81 179–81




Y
Yates' correction 238–40 247 259 261




Z
z test 143–7 258–9 262 234
z transformation 201 343
zero, missing 78–80
zidovudine see AZT
% symbol 71
! (symbol for factorial) 90 97
∞ (symbol for infinity) 291
| (symbol for given) 96
| (symbol for absolute value) 239
α (symbol for alpha) 140
β (symbol for beta) 140
χ (symbol for chi) 118–19
µ (symbol for mu) 92–3
φ (symbol for phi) 108
Φ (symbol for Phi) 109
ρ (symbol for rho) 220–2
Σ (symbol for summation) 57
σ (symbol for sigma) 92–3
τ (symbol for tau) 222–5
