Mass Media Research: An Introduction, 7th edition
R. D. Wimmer & J. R. Dominick (2003)
This handout is prepared with copyright by Jonathan Zhu.
Reproduction or distribution is prohibited without prior consent.
2
Table of Contents
I. The Research Process
  1. Science and Research
  2. Elements of Research
  3. Research Ethics
  4. Sampling
II. Research Approaches
  5. Qualitative Research Methods*
  6. Content Analysis
  7. Survey Research
  8. Longitudinal Research
  9. Experimental Research
III. Data Analysis
  10. Introduction to Statistics
  11. Hypothesis Testing
  12. Basic Statistical Procedures
IV. Research Applications*
  13. Research in the Print Media
  14. Research in the Electronic Media
  15. Research in Advertising
  16. Research in Public Relations
  17. Research in Media Effects
  18. Mass Media Research & Internet
* Omitted from this course.
3
Ch 1. Science and Research
Learning Objectives:
What is scientific research?
Why is scientific research the only learning approach that allows for self-correction (i.e., falsification) of existing knowledge or new findings?
  Basic characteristics of scientific research
How to conduct scientific research?
  Basic steps of research procedure
4
What Is Scientific Research?
Research is an attempt to discover something new.
Scientific research is an organized, objective, controlled, qualitative or quantitative empirical analysis of one or more variables.
5
Methods of Knowing
Four approaches for people to find answers to research questions:
1. Method of tenacity (to follow or apply previously established traditions)
2. Method of intuition (to apply common sense, to take it for granted)
3. Method of authority (to follow a trusted source)
4. Method of scientific inquiry (to discover truth through objective evaluations of data)
6
Characteristics of Scientific Method
Scientific method has the following five basic characteristics, as compared with other methods of knowing:
Public (vs. private)
Objective (vs. subjective)
Empirical (vs. speculative)
Systematic and cumulative (vs. anecdotal)
Predictive (vs. ad hoc)
7
Scientific Research Procedures
A typical research process consists of eight steps:
1. Select a problem;
2. Review existing, relevant research and theory;
3. Develop hypotheses or research questions;
4. Determine an appropriate methodology/research design;
5. Collect relevant data;
6. Analyze the data and interpret the results;
7. Present the results in an appropriate form;
8. Replicate the study (when necessary)
8
Steps of the Research Procedure
Selection of problem → Review of existing research & theory → Statement of hypothesis or research questions → Determination of appropriate methodology → Data collection → Analysis & interpretation of data → Presentation of results → Replication
9
Step 1: How to Select a Research Topic?
Issues to consider when selecting a topic:
1. Is the topic too broad?
2. Can the problem really be investigated?
3. Can the data be analyzed?
4. Is the problem significant?
5. Can the results be generalized?
6. What costs and time are involved?
7. Is the planned approach appropriate to the project?
8. Is there any potential harm to the subjects?
10
Step 2. How to Review Existing Research & Theory?
Sources to review:
  Academic journals
  Professional trade publications
  Popular magazines
  The Internet
Questions to focus on when reviewing:
  What type of research has been done in the area?
  What has been found in previous studies?
  What suggestions do other researchers make for future research?
  What has not been investigated?
  What research methods were used in previous studies?
  How can the proposed study add to our knowledge of the area?
11
Step 3. How to State a Hypothesis/Research Question?
A hypothesis is a formal statement about the nature, function, direction, and strength of a testable relationship between variables.
A research question is a less formal statement about relationships between variables that does not specify the nature, direction, or strength of the relationships. (Zhu)
Between the two, a hypothesis is preferred, especially when there is a rich body of existing research on the topic. (Zhu)
12
Step 4. How to Determine the Right Methodology and Research Design? (Zhu)
Basic research design:
  Between-subjects design (cross-sectional, use survey)
  Within-subjects design (longitudinal, see Ch. 8)
  Mixed design (cross-sectional and longitudinal)
Unit of analysis:
  Individuals (micro-level)
  Groups or communities
  Societies (macro-level)
Research purpose:
  Seek internal validity: use experiment
  Seek external validity: use survey
  Seek “thick description”: use case studies, observations, in-depth interviews, focus groups, etc.
13
Step 5. How to Collect Data?
Sampling (Ch. 4)
Instrument
  Questionnaire (survey or experiment) or coding classification (content analysis)
  Stimulus (experimental materials)
Measuring (Ch. 5-9)
14
Step 6. How to Analyze and Interpret Data? (Zhu)
Univariate analysis: characteristics of a variable in the sample and the corresponding population (Ch. 10)
Bivariate analysis: relationship between two variables in the sample and the corresponding population (Ch. 12)
Multivariate analysis: relationship among three or more variables in the sample and the corresponding population (Ch. 12)
15
Step 7. How to Present the Results? (Zhu)
News releases
Business/policy reports
Academic journal articles
Book/book chapter manuscripts
Oral presentations
16
Step 8. How to Replicate the Study?
Replications are necessary to eliminate design-specific, sample-specific, or method-specific results.
Basic types of replication:
  Literal replication (exact duplication)
  Operational replication (to duplicate only sampling or experimental procedure)
  Instrumental replication (to duplicate only dependent measures)
  Constructive replication (to deliberately alter measures and procedures)
17
Ch 2. Elements of Research
Learning Objectives:
What are the similarities and differences between the following pairs of terms?
  Concept vs. construct
  Independent vs. dependent variable
  Concept/construct vs. variable
  Variable vs. constant (Zhu)
  Discrete vs. continuous variable
  Variable vs. measurement
  Measurement vs. scale
  Conceptualization vs. operationalization (Zhu)
  Constitutive vs. operational definition
  Validity vs. reliability
18
Concepts and Constructs
Both are “building blocks” of theory:
  Concept: a term that expresses an abstract idea formed by generalizing from particulars and summarizing related observations.
  Construct: a “meta” concept that consists of lower-level concepts, usually not directly observable.
The difference between a concept and a construct is always arbitrary and relative, as a concept may contain sub-concepts whereas a construct may be part of a larger construct.
19
Independent and Dependent Variables
Variables are an “empirical” or “operational” version of concepts/constructs:
Independent variable (IV): a force or event that acts as the cause of a process
  The IV can be systematically varied in an experiment but can only be observed in a survey or content analysis.
Dependent variable (DV): a force or event that acts as the effect or outcome of a process
  The DV can only be observed in any research setting. The DV is what the researcher wishes to explain.
20
Variables and Constants (Zhu)
A variable has at least two values.
A constant is a special case of a variable that involves a single (i.e., fixed) value.
In general, variables with a fixed value are not worth studying. That is, don’t study a constant.
21
Discrete and Continuous Variables
The values of any variable can be either discrete or continuous:
  Discrete variable: a finite set of values
  Continuous variable: a set of values that can be infinitely broken into subparts
22
Constitutive and Operational Definitions
Constitutive definition: using other words or concepts to define or explain the word/concept being defined (see Ch. 1)
Operational definition: using observable (i.e., measurable and quantifiable) variables to define or explain the word/concept being defined
23
Conceptualization and Operationalization (Zhu)
Conceptualization: a thought process to identify key concepts and formulate their structural relationship, based on existing theory and past research. It involves “translating” concrete events and/or phenomena to abstract symbols and propositions.
Operationalization: a thought process to translate an abstract concept into a concrete variable that can be quantitatively measured by a questionnaire (in survey), coding sheet (in content analysis), physiological instruments (in experiment), and other means of data collection.
Both translation processes inevitably involve errors and distortions. The accuracy of the translation is known as “validity” whereas the precision of the translation is known as “reliability”.
24
Levels of Measurement
Measurement is an empirical (i.e., quantifiable) version of variables:
  Nominal level (discrete): using arbitrary numbers to classify categories of a variable
  Ordinal level (discrete): ranking the order of categories of a variable
  Interval level (continuous): measuring a variable with an equal distance between adjacent points on a scale
  Ratio level (continuous): with all properties of interval level plus a true zero point
Zhu: in practice, interval and ratio levels are treated in the same way. No fine distinction between the two is necessary.
25
Reliability and Validity
Indicators of the quality of a variable (or its measurement):
Reliability: the extent to which a measure consistently gives the same results (“precision”)
  Stability over time
  Internal consistency among constitutive measures
  Equivalency between two parallel forms of a measurement
Validity: the extent to which a measure captures what it is supposed to measure (“accuracy” or “truth”)
  Face validity
  Predictive validity
  Concurrent validity
  Construct validity
27
Ch 3. Research Ethics
Learning Objectives:
What principles, guidelines, and codes of behavior are available to guide researchers when dilemmas arise?
Why are the principles, guidelines, and codes stipulated that way (i.e., theoretical grounds and practical consequences)?
28
Why Be Ethical?
Unethical practices will:
  harm research subjects
  damage the reputation of the research community
  produce misleading results
29
General Ethical Theories
The following three theories help researchers to determine what is ethical and what is not:
  Categorical imperative theory (rule-based): the researcher should act in a way that he or she wants all others to act.
  Balancing theory (utilitarianism): the researcher should maximize good and minimize harm from an action.
  Relativism theory: assuming that there is no absolute right or wrong way of behavior, the researcher should act based on the established norms and, better yet, codes of conduct in the local culture.
30
Ethical Principles
Autonomy (self-determination): the researcher should respect the rights, values, and decisions of research subjects.
Nonmaleficence: the researcher should not intentionally inflict harm on research subjects.
Beneficence: the researcher should remove existing harms and confer benefits on research subjects.
Justice: the researcher should treat all research subjects equally (to enjoy benefits and avoid harms).
31
S. Cook’s Code of Behavior
Don’t involve people in research without their knowledge or consent.
Don’t coerce people to participate.
Don’t withhold from the participant the true nature of the research or actively lie about it.
Don’t lead the participant to commit acts that diminish his or her self-respect.
Don’t violate the right to self-determination.
Don’t expose the participant to physical or mental stress.
Don’t invade the privacy of the participant.
Don’t withhold benefits from participants in control groups.
Don’t fail to treat participants fairly and show them consideration and respect.
Always treat every participant with unconditional human regard.
32
Specific Ethical Problems
Voluntary participation and informed consent: remedied by consent forms
Concealment (hiding information) and deception (providing false information): remedied by post-research debriefings
Protection of privacy: remedied by measures of confidentiality
Federal regulations: human subjects reviews
Data analysis and reporting: preventing data tampering, plagiarism, and denial of authorship
33
AAPOR’s Code of Professional Ethics & Practices
I. Principles of professional practice in conduct
  A. Exercise due care in gathering and processing data, to assure the accuracy of results
  B. Exercise due care in research design and analysis:
    1. using only tools and methods well suited to the research problem;
    2. not using tools and methods that yield misleading conclusions;
    3. not interpreting research results inconsistent with the data;
    4. not implying greater confidence in the results than what the data warrant.
(cont’d on next page)
34
AAPOR’s Code of Professional Ethics (cont’d)
II. Principles of professional responsibility in dealing with people
  A. the public: disclose methods used and correct possible distortions when releasing results
  B. clients and sponsors: protect their confidentiality and accept only assignments possible to accomplish within technical limitations
  C. the profession: share as freely as possible ideas and findings to improve the profession
  D. respondents: shall not lie, abuse, coerce, or humiliate them; protect their confidentiality
35
Ch 4. Sampling
Learning Objectives:
What is probability sampling?
Why is it important and necessary to draw probability samples?
How to draw probability samples?
  How to determine sampling error (in relation to confidence level and confidence interval)?
  How to determine sample size?
36
Population and Sample
Population: the entire set of subjects, variables, concepts, or phenomena under study (i.e., “study population”).
  A study of every member of the study population is known as a “census”. A census involves only measurement errors (i.e., inconsistencies produced by the instrument used).
Sample: a sub-set of the study population that is representative of the population.
  The process of drawing a sample from the study population is known as “sampling”. A sample involves both measurement errors and “sampling errors” (i.e., differences from the population).
37
Probability and Nonprobability Samples
Probability sample: selected according to the random principle, whereby each unit’s chance of selection is known.
  Because the selection probability is known, the sampling errors of a probability sample can be estimated.
Nonprobability sample: selected without following the random principle, so that units are selected with an unknown chance.
  Because the selection probability is unknown, the sampling errors of a nonprobability sample cannot be estimated.
38
When to Use a Nonprobability Sample?
Purpose of the study: when the aim is not to generalize the results to the population (i.e., for inferential purpose) but to investigate the relationships between variables, or to run a pilot study that tests a questionnaire or instrument
Cost availability: when there isn’t adequate money to collect a probability sample
Time availability: when there isn’t adequate time to collect a probability sample
Acceptable errors: when the amount of error is not a prime concern
Zhu: of the above issues, generalizability is often the most important consideration.
39
Nonprobability Samples
Available sample (convenience sample): a collection of readily accessible subjects.
Volunteer sample: a collection of subjects selected not by random principle but by self initiation.
Purposive sample: a collection of subjects selected for specific characteristics, which eliminates all others who fail to meet the criteria.
Quota sample: a collection of subjects selected to meet a predetermined or known percentage (quota).
Snowball sample: a collection of subjects selected based on “referrals”.
40
Probability Samples
Simple random sample (SRS): each subject in the population has an equal chance of being selected.
  In a telephone survey, the random digit dialing (RDD) procedure produces an SRS.
Systematic random sample: every n-th subject is selected from the population.
Stratified sample: selected after the population is divided into demographic strata.
Cluster sample: selected after the population is divided into geographic clusters.
[Note: in practice, multistage sampling is often used, in which two or more of the above are used.]
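The sampling designs above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the population of 1,000 numbered units, the "urban"/"rural" strata, and the fixed random seed are all hypothetical and do not come from the textbook.

```python
import random


def simple_random_sample(population, n, rng):
    """SRS: every unit has the same chance of selection."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th unit (k = N // n)."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction, rng):
    """Draw the same fraction independently from each stratum."""
    sample = []
    for units in strata.values():
        size = round(len(units) * fraction)
        sample.extend(rng.sample(units, size))
    return sample


rng = random.Random(42)  # fixed seed so the illustration is repeatable
population = list(range(1000))  # hypothetical sampling frame

srs = simple_random_sample(population, 50, rng)
systematic = systematic_sample(population, 50, rng)

# Hypothetical strata: 600 "urban" units and 400 "rural" units.
strata = {"urban": list(range(600)), "rural": list(range(600, 1000))}
stratified = stratified_sample(strata, 0.05, rng)

print(len(srs), len(systematic), len(stratified))
```

Note that the stratified draw keeps the 60/40 split of the strata by construction, whereas the SRS only approximates it.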
41
Random Selection of Individuals from Households
Either of the following methods is used to ensure that individuals are selected randomly from the chosen households/organizations:
  Based on predetermined demographic quota (e.g., Kish Random Tables; see Table 4.2)
  Based on the “last-birthday” principle (added): selecting the member of the household/organization whose birthday is the most recent
42
Sampling Error
Sampling error (standard error): the degree to which statistics (i.e., measurements obtained from a sample) differ from parameters (i.e., the same measurements that would be obtained from the population)
Sampling error is a function of sample size
Sampling error is inevitable, but can be estimated
43
Sampling Error, Confidence Interval, & Confidence Level
Sampling error is a relative concept that takes different values for a given sample depending on the “confidence level” arbitrarily chosen by the researcher
  The higher the confidence level, the larger the sampling error
  The positive and negative sampling errors associated with a given confidence level form a range known as the “confidence interval”
In practice, 95% is considered the minimally acceptable confidence level: if 100 samples of the same size were drawn, the confidence interval would contain the population parameter in about 95 of them
44
Calculation of Sampling Error
$$se = \sqrt{\frac{p(1-p)}{n}} \times z$$
where se is the sampling error, n is the sample size, p is the percentage of interest in the population (often assumed to be 50%), and z is the z-value associated with a confidence level, e.g.,
  Confidence level = 68%, z = 1
  Confidence level = 95%, z = 1.96
  Confidence level = 99%, z = 2.58
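As a worked example of the sampling error formula, the following minimal Python sketch computes se for a given sample size. The function name `sampling_error` and the lookup table `Z_VALUES` are illustrative names, not part of the textbook; the 99% entry uses 2.58, the conventional rounding of 2.576.

```python
import math

# z-values for common confidence levels (99% rounded from 2.576).
Z_VALUES = {0.68: 1.0, 0.95: 1.96, 0.99: 2.58}


def sampling_error(n, p=0.5, confidence=0.95):
    """Return se = z * sqrt(p * (1 - p) / n) as a proportion."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    z = Z_VALUES[confidence]
    return z * math.sqrt(p * (1 - p) / n)


# Example: 1,000 respondents, p assumed to be 50%, 95% confidence level.
se = sampling_error(1000)
print(f"sampling error: +/-{se * 100:.1f} percentage points")  # +/-3.1
```

With the same sample, moving from the 95% to the 99% confidence level widens the interval, which is the trade-off described on the previous slide.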
45
Considerations for Sample Size
Project type: focus groups, pilot study, or formal study
Project purpose: descriptive, inferential, or explanatory
Project complexity: univariate, bivariate, or multivariate; single level or multilevel; cross-sectional or longitudinal
Amount of error tolerated: “low incidence” events are particularly sensitive to errors
Time constraints
Financial constraints
Previous research in the area
46
Calculation of Sample Size (Zhu)
The formula for sampling error can also be used to determine sample size given a predetermined, maximally acceptable sampling error (se) and a predetermined, minimally acceptable confidence level (which determines the value of z):
$$n = \frac{p(1-p)}{se^2} z^2$$
Note that if the chosen confidence level = 95%, z = 1.96; if confidence level = 99%, z = 2.58; etc.
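The sample size formula can be checked with a short Python sketch. The function name `required_sample_size` is hypothetical, and rounding the result up to the next whole respondent is an assumption of this sketch rather than something stated in the handout.

```python
import math


def required_sample_size(se, p=0.5, z=1.96):
    """Return n = p * (1 - p) * z**2 / se**2, rounded up to a whole case."""
    if not 0 < se < 1:
        raise ValueError("se must be a proportion between 0 and 1")
    return math.ceil(p * (1 - p) * z ** 2 / se ** 2)


# Example: a maximum sampling error of 3 percentage points at 95% confidence.
print(required_sample_size(0.03))  # 1068
```

Because se enters the formula squared, halving the acceptable sampling error roughly quadruples the required sample size.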
47
Ch 6. Content Analysis
Learning Objectives:
What is content analysis?
Why do we need to use content analysis?
  Limitations of content analysis and over-use of content analysis
How to conduct content analysis?
  How to ensure the systematic and objective nature?
48
What Is Content Analysis?
Kerlinger (2000): content analysis is a method of studying and analyzing communication in a systematic, objective, and quantitative manner.
  Systematic: content is selected and analyzed according to explicit and consistently applied rules: sampling procedure and coding scheme
  Objective: content is analyzed based on explicit operational definition and classification rules
  Quantitative: content is precisely counted
49
Why to Use Content Analysis?
To describe communication content: text, graphics, audio, video, hypertext
To test hypotheses of message characteristics: “If the source has characteristic A, then messages containing elements x and y will be produced; if the source has characteristic B, then messages with elements w and z will be produced.”
To compare media content to the “real world”: portrayal of some group, phenomenon, trait, etc. is assessed against a standard taken from real life.
To assess the image of particular groups in society: a specific application of describing communication content
To establish a starting point for studies of media effects: content as the input of the effects process
50
Limitations of Content Analysis
Content analysis alone cannot serve as a basis for making statements about the effects of content on audiences.
The findings of a particular content analysis are limited to the coding scheme used (e.g., operational definition of TV violence).
It is difficult to content analyze “low incidence” phenomena.
Content analysis is often time-consuming and expensive (i.e., labor intensive).
51
How to Conduct Content Analysis?
Basic steps in content analysis:
1. Formulate research question or hypothesis
2. Define the study population
3. Select a sample from the population
4. Select and define a unit of analysis
5. Construct classification categories
6. Establish a quantification system
7. Train coders and conduct a pilot study
8. Code the content based on established rules
9. Analyze the coding data
10. Draw conclusions and search for indicators
52
Step 1. Formulating a Research Question or Hypothesis
Research question or hypothesis is generated based on:
  Existing theory
  Prior research
  Practical issue
Zhu: research question or hypothesis should be:
  Specific
  Falsifiable
  Insightful
53
Step 2. Defining the Study Population
To specify the boundaries of the body of content:
What should be included and why?
  Topic
  Format (e.g., text, graphic, audio, video, etc.?)
  Time period
What should be excluded and why?
  Topic
  Format (e.g., text, graphic, audio, video, etc.?)
  Time period
54
Step 3. Selecting a Sample
A census of the entire population is sometimes desirable and feasible in content analysis.
Sampling in content analysis often involves a multistage procedure, e.g.,
1. Select appropriate media outlets (purposively or randomly)
2. Select appropriate dates (SRS, systematic, or stratified to create “composite week”)
  12-14 issues per year for print media
  2 days per month for broadcast media
3. Select specific section/program (when necessary or appropriate)
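One common way to build a “composite week” is to stratify the study period by day of the week and draw one date at random from each stratum. The Python sketch below illustrates that idea; the function name `composite_week`, the 2003 study period, and the random seed are assumptions made for the example, not details from the textbook.

```python
import random
from datetime import date, timedelta


def composite_week(start, end, rng):
    """Pick one random date per weekday (Mon..Sun) between start and end."""
    by_weekday = {d: [] for d in range(7)}
    day = start
    while day <= end:
        by_weekday[day.weekday()].append(day)
        day += timedelta(days=1)
    return [rng.choice(by_weekday[d]) for d in range(7)]


rng = random.Random(7)  # fixed seed so the illustration is repeatable
week = composite_week(date(2003, 1, 1), date(2003, 12, 31), rng)
for d in week:
    print(d.strftime("%a %Y-%m-%d"))
```

Drawing two such composite weeks from the same year yields 14 dates, which matches the upper end of the 12-14 issues per year suggested for print media.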
55
Step 4. Selecting a Unit of Analysis
Unit of analysis: the smallest element (which cannot be further divided) of a content analysis
Examples of unit of analysis in content analysis:
  Print media: word/symbol, sentence, paragraph, theme, article, etc.
  Broadcast media: character, act, episode, segment, program, etc.
Unit of analysis should be clearly defined (i.e., by a clear-cut operational definition)
Whenever possible, a lower-level unit of analysis is preferred because it can be combined into a higher-level unit of analysis in the future
56
Step 5. Constructing Coding Categories
Classification categories should be:
Mutually exclusive: a unit of analysis can be placed in one and only one category
  Zhu: when mutual non-exclusivity is either inevitable or desirable, multi-response coding should be used (which creates complications for statistical analysis)
Exhaustive: every unit of analysis can be placed in a category
  When necessary, set up a category of “others” or “miscellaneous” (which should not exceed 10% of the sample)
Reliable: different coders should agree about the proper category for each unit of analysis (see intercoder reliability)
57
Step 6. Establishing a Quantification System
The quantification system can take any of the four levels of measurement:
  Nominal
  Ordinal
  Interval
  Ratio
58
Step 7. Training Coders and Doing a Pilot Study
Number of coders used for the same content: 2-6
Coders should not be informed of the research question/hypothesis (i.e., “blinded coding”)
Training coders:
  To revise definitions and categories
  To identify “opinionated” coders (and replace such coders who fail to conform to the rules)
Pilot study: to check intercoder reliability
59
Step 8. Coding the Content
A standardized coding sheet based on the operational definition and classification rules should be used for coding.
An instruction sheet should be provided as an accompanying instrument for the coding sheet.
60
Step 9. Analyzing the Data
Univariate analysis:
  Percentage or mean (and standard deviation) of each dependent variable
Bivariate analysis:
  Difference in percentage or mean of each dependent variable between/among groups of an independent variable
  Correlation between the dependent and independent variables
Multivariate analysis:
  Difference in percentage or mean of each dependent variable across groups defined by several independent variables
  Correlation of a dependent variable with several independent variables
61
Step 10. Interpreting the Results
For descriptive studies: an additional, independent “benchmark” indicator is needed to help interpret the findings
Zhu: Answers to the “why” question almost always come from data outside content analysis
  Example: Zhu (1991)
62
Reliability of Content Analysis
Reliability: if a content analysis is to be objective, its measurements and procedures must be replicable (i.e., repeated measurement of the same material results in similar conclusions).
Reliability of content analysis is measured by intercoder reliability (i.e., degree of agreement between independent coders).
A poor intercoder reliability indicates problems with:
  coding instrument or instructions
  coder training
  unit of analysis
63
Intercoder Reliability
Holsti’s formula:
$$R = \frac{2M}{N_1 + N_2}$$
where M is the number of coding decisions on which two coders agree, and N_1 and N_2 are the numbers of decisions made by the first and second coders, respectively.
Scott’s pi (which offsets agreement by chance):
$$R = \frac{O - E}{1 - E}$$
where O is the observed agreement between two coders and E is the expected agreement, both expressed as proportions. E is given by:
$$E = \frac{1}{P}$$
where P is the number of categories of the variable.
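Both reliability coefficients can be computed with a few lines of Python. This is a minimal sketch that follows the handout's formulas, including its simplified expected agreement E = 1/P; note that Scott's pi is usually computed with E estimated from the observed category proportions instead, so the second function is named `scott_pi_uniform` to flag that assumption. The function names and the ten coding decisions ("v" for violent, "n" for non-violent) are made up for illustration.

```python
def holsti(coder1, coder2):
    """Holsti's R = 2M / (N1 + N2), where M counts matching decisions."""
    agreements = sum(a == b for a, b in zip(coder1, coder2))
    return 2 * agreements / (len(coder1) + len(coder2))


def scott_pi_uniform(coder1, coder2, n_categories):
    """(O - E) / (1 - E) with E = 1 / n_categories, as in the handout."""
    if len(coder1) != len(coder2):
        raise ValueError("both coders must code the same units")
    observed = sum(a == b for a, b in zip(coder1, coder2)) / len(coder1)
    expected = 1 / n_categories
    return (observed - expected) / (1 - expected)


# Hypothetical coding decisions by two coders on the same ten units.
coder1 = ["v", "v", "n", "n", "v", "n", "v", "n", "v", "n"]
coder2 = ["v", "v", "n", "v", "v", "n", "v", "n", "n", "n"]

print(f"Holsti: {holsti(coder1, coder2):.2f}")                   # 0.80
print(f"Scott's pi: {scott_pi_uniform(coder1, coder2, 2):.2f}")  # 0.60
```

The gap between the two figures shows how much of the raw agreement could be attributed to chance when only two categories are available.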
64
Validity of Content Analysis
The validity of content analysis results is affected by both sampling method and coding method. The validity can be assessed based on:
  Face validity: if the categories are rigidly and satisfactorily defined and followed
  Concurrent validity: if the results are consistent with similar data obtained from an independent source
  Construct validity
  Predictive validity
65
Ch 7. Survey Research
Learning Objectives:
What is survey research?
Why do we need to conduct surveys?
  Advantages and disadvantages of survey
  Differences between survey and experiment (Ch. 9)
How to conduct surveys?
  Questionnaire design
  Sampling
  Data collection
  Data analysis
66
What Is Survey Research?
A survey involves interviewing a sample of respondents by face-to-face, telephone, mail, or other methods to measure their knowledge, attitudes, behavior, and other relevant information, with the aim to project the findings to the study population.
  Descriptive surveys: to describe or document what the current condition is
  Analytical surveys: to explain why the current condition exists
67
Why to Use Survey Research?
Survey research has the following advantages:
  Able to investigate problems in realistic settings
  Reasonable cost given the amount of information gathered
  Able to collect a large amount of information from a variety of people with relative ease
  Not constrained by geographic boundaries
  Zhu: the only valid, reliable, and practical way to gather information representative of the general population
68
Disadvantages of Survey Research
Real causal factors (i.e., independent variables) cannot be manipulated (see Ch. 9 on experimental research), which makes it difficult, if possible at all, to establish the causal relationship under study.
Inappropriate wording or placement of questions within a questionnaire may bias results.
Survey methods other than face-to-face interviews may not be able to screen out wrong respondents.
It has become increasingly difficult to obtain a high response rate.
69
Basic Procedure of Survey Research
1. Select the method of interviews
2. Select the sample
3. Construct the questionnaire
4. Pretest the questionnaire
5. Train interviewers
6. Carry out fieldwork
7. Make necessary callbacks/revisits
8. Verify the results (quality control inspection)
9. Tabulate and analyze the data
10. Zhu: calculate response rate
70
Step 1. Selecting the Method of Interviews
Mail surveys: sending self-administered questionnaires to a sample of respondents with stamped reply envelopes enclosed
Telephone interviews: trained interviewers ask questions and record the answers over the phone; the respondents don’t get to see the questionnaire
  CATI: Computer-Assisted Telephone Interviews
Personal interviews: respondents are interviewed face-to-face at either their home/workplace or a field service location
  CAPI: Computer-Assisted Personal Interviews
Group administered surveys: a group of respondents is gathered together and asked to fill out individual copies of a questionnaire
71
Comparisons of Data Collection Methods
                          Mail        Phone            Personal        Group
Cost                      Cheapest    2nd Cheapest     Most Expensive  2nd Most Expensive
Time                      Slowest     Quickest         2nd Slowest     2nd Quickest
Selection of Respondents  No control  Limited control  Full control    Full control
Interviewer Bias          Lowest      2nd Highest      Highest         2nd Lowest
Assistance Available      Minimal     2nd Minimal      2nd Fullest     Fullest
Response Rate             Lowest      2nd Highest      Highest         2nd Lowest
72
Step 2. Selecting the Sample
Mail surveys: drawn from the appropriate sampling frame (i.e., mailing list with names and addresses of respondents)
Telephone interviews: drawn from a telephone directory (with necessary modifications, e.g., adding a random digit) or from a random digit dialing program
Personal interviews: commonly drawn from a multistage sampling procedure
Group administered surveys: drawn from a mailing list, a telephone directory (or RDD program), or other sampling procedures
73
Step 3. Construct the Questionnaire
Basic rules of questionnaire design:
  Understand the goals of the project so that only relevant questions are included
  Questions should be clear and unambiguous
  Questions must accurately communicate what is required from the respondents
  Don’t assume respondents understand the questions they are asked
  Follow Occam’s Razor (the simpler the better)
See later part of this chapter for more details of questionnaire design
74
Impact of Data Collection Methods on Questionnaire Design
Mail surveys: questions must be easy to read and understand because respondents are unable to seek explanations
Telephone interviews: response options for all questions must be fewer and shorter than in other methods
Personal interviews: the interviewer must tread lightly with sensitive and personal questions because his/her physical presence may make the respondent less willing to answer
75
Step 4. Pretest the Questionnaire
All questionnaires must be tested at least once before being put to formal use.
Pretest can be done among:
  a focus group
  an informal sample of 10-20 people
Particular attention should be paid to how easily the respondents in the pretest understand the questions
76
Step 5. Train Interviewers
All interviewers, experienced or beginner, should be trained.
Training should focus on:
  what the questions are about
  how to ask the questions (i.e., reading out exactly the original wording and instruction without any personal interpretation)
  how to probe follow-up questions (e.g., asking “what else” at least once after the respondent gives an answer to open-ended questions)
  how to take answers from respondents (i.e., writing down exactly the original wording without any personal interpretation)
77
Step 6. Carry out the Fieldwork
Weekends and evenings of weekdays are generally the preferred times for interviews.
Long public holidays should be avoided.
Usual duration of the fieldwork for a sample of 1,000:
  Mail: 2 months (with 3 rounds of mailings)
  Telephone: 1-2 weeks
  Personal: 2-4 weeks
Excessively long fieldwork may introduce unexpected events into the survey
78
Step 7. Make Necessary Callbacks/Revisits
Zhu: Any non-contact respondent should be called back or revisited 3-5 times.
Callbacks/revisits are a necessary and effective means to improve the response rate.
Must call back or revisit:
  Those who were contacted but not available for interview, with or without an appointment made
  Those who have never been contacted
Optional callback or revisit:
  Those who broke off the interview
  Those who refused the interview
79
Step 8. Verify the Results
Supervisor(s) call back/revisit a subsample (10-20%) of the completed cases of each interviewer to ensure the interviews took place.
Zhu: the quality control (QC) verification usually focuses on a few specific details of the survey that are easy to remember, e.g.,
method of the interview (telephone or personal)sex of the interviewer (male or female)general topic of the questionstype of the gifts (if any)
80
Step 9. Analyze the DataData cleaning: run frequencies to identify missing, illegal or illogical values, and recode all necessary valuesDemographic distribution: run frequencies on age, sex (or other key variables) and compare the results with population census data (and weight the sample if necessary)Formal analyses
UnivariateBivariateMultivariate
81
Step 10. Calculate the Response Rate (Zhu)Survey response rate (SRR) is the most important indicator of the quality of the survey.American Association for Public Opinion Research (AAPOR) has published standard formulas to calculate SRR so that different surveys can be comparable (see Ke, Zhu, & Sun, 2003, pp. 130-113). Of the 6 formulas of AAPOR, RR4 represents the best balance between the most conservative (which tends to underestimate the real SRR) and the most liberal (which tends to overestimate the SRR).
82
Calculation of RR4
$$RR4 = \frac{I + P}{(I + P) + (R + NC + O) + e(UH + UO)}$$
where I = completed interviews, P = partially completed interviews, R = refusals and break-off cases, NC = non-contacts, O = other eligible but unsuccessful cases, UH = households with unknown status, UO = other unknown households, and e = estimated proportion of eligible but unknown cases, which is often estimated by:
$$e = \frac{I + P + R + NC + O}{I + P + R + NC + O + NE}$$
where NE is cases known to be non-eligible (e.g., non-residential units, unqualified individuals, etc.).
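The RR4 calculation can be sketched in a few lines of Python. This is only an illustration of the two formulas on this slide: the function names and the case counts below are hypothetical, not taken from any real survey.

```python
def estimate_e(i, p, r, nc, o, ne):
    """Estimated share of unknown-status cases that are eligible (e)."""
    eligible = i + p + r + nc + o          # all cases known to be eligible
    return eligible / (eligible + ne)      # NE = cases known to be non-eligible


def rr4(i, p, r, nc, o, uh, uo, ne):
    """RR4 = (I + P) / ((I + P) + (R + NC + O) + e(UH + UO))."""
    e = estimate_e(i, p, r, nc, o, ne)
    return (i + p) / ((i + p) + (r + nc + o) + e * (uh + uo))


# Hypothetical fieldwork outcome counts, for illustration only
rate = rr4(i=600, p=50, r=200, nc=100, o=50, uh=80, uo=20, ne=250)
print(f"RR4 = {rate:.3f}")
```

With these made-up counts, e works out to 1000/1250 = 0.8, so only 80 of the 100 unknown-status cases are counted in the denominator.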
83
How to Design a Questionnaire?Question wordingQuestion typesQuestion formatsIntroductionScreener/filter questionsInstructions for interviewers and respondentsQuestion orderQuestionnaire length
84
Guidelines for Question Wording
Make questions clear
Keep questions short
Include only relevant questions
Do not ask double-barreled questions (i.e., involving two or more questions in one sentence)
Avoid biased words or terms
Avoid leading questions (i.e., suggesting a certain response or hidden premise)
Do not ask for highly detailed information
Avoid potentially embarrassing questions
85
How to Determine Which Questions to Ask in a Survey? (Zhu)
[Diagram of question blocks: Personal Background (Control), Media Access (IV), Media Exposure (IV), Knowledge (DV), Attitudes (DV), Behavior (DV)]
CV = Control Variable; IV = Independent Variable; DV = Dependent Variable
86
Question TypesOpen-ended questions: requiring respondents to generate their own answers
FlexibleTime consumingNeed to use content analysis to process
Close-ended questions: requiring respondents to select an answer from a list of predetermined options
Easy to quantify
Rigid
Need an “Other” category for unforeseen answers
87
Common Formats of QuestionsMultiple choices
Check listForced choice (from a pair of statements)
Rating scalesLikert scaleSemantic differentialFeeling thermometer
Rank orderingFill in blanks
88
Introduction of QuestionnaireTo inform the respondent about
the survey organization the (general) purpose of the surveythe duration of the survey the anonymity and confidentiality of the respondent
Characteristics of a successful introduction (Backstrom and Hursh-Cesar, 1986):
shortrealistically wordednonthreateningseriousneutralpleasant but firm
89
Screener/Filter QuestionsTo exclude unqualified respondents from the entire or part of the questionnaire
Screener for exclusion from the entire questionnaire: placed right after the introduction sectionScreener for skips over section(s) of the questionnaire: placed before the section(s) to be skipped
90
Instructions To explain how to answer the questions
Instructions for the interviewer (in telephone or personal interviews):
provided as much as possible usually typed in capital letters and enclosed in brackets or boxes to be distinguished from instructions for the respondent
Instructions for the respondent (particularly important for self-administered questionnaire):
used only when necessary to avoid confusionswhenever possible, providing examples for illustration
91
Question OrderGenerally, the following order is recommended:
Start with simple, easy, and general (i.e., “warm-up”) questions
Questions serving as dependent variables generally go before questions serving as independent variables (to avoid “contamination” or priming effects)
Do not ask knowledge questions at the beginning (to avoid embarrassment)
Place demographic, personal, and other sensitive questions at the end of the questionnaire
92
Questionnaire LengthHow long is a questionnaire too long?
when there are 10% or more breakoffs (i.e., respondents who drop out before the end of the interview)
Wimmer and Dominick’s recommended maximum length:
Self-administered mail or group survey: 60 min.Face-to-face interview: 60 min.Telephone interview: 20 min.
Zhu: Questionnaires longer than half of the above generally result in high breakoffs among Chinese respondents.
93
A Checklist for Questionnaire Design (Zhu)
1. Are the answers in multiple choice questions complete and mutually exclusive?
2. Is a mid-ground position provided in the answers of attitudinal questions?
3. Are the questions and answers clearly and unambiguously expressed?
4. Are explanations offered for abstract concepts or technical terms?
5. Are the questions and answers too long?
6. Is there any “double-barreled” question?
7. Are similar questions placed together?
8. Are questions ordered from easy to difficult and from general to specific?
9. Are personal questions asked at the end?
10. Are screener questions and continued questions appropriately linked?
94
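What to Include in a Survey Report?

| Section | News Release | Business Report | Academic Paper |
|---|---|---|---|
| Abstract | No | Yes | Yes |
| Introduction | Yes | Yes | Yes |
| Literature Review | No | Optional | Yes |
| Methods | Yes | Yes | Yes |
| Results | Yes | Yes | Yes |
| (Statistical Test) | No | Optional | Yes |
| Discussion | No | Yes | Yes |
| References | No | Optional | Yes |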
95
Ch 8. Longitudinal ResearchLearning Objectives:
What is longitudinal research?Differences from cross-sectional research
Why do we need longitudinal research?Requirements for causal relationships
How do we conduct longitudinal research?Trend studyCohort analysisPanel study
96
What Is Longitudinal Research?Unlike cross-sectional research that collects data from a representative sample at only one point in time, longitudinal research collects data from the same sample or different samples at different points in time.Longitudinal research is not a new method of data collection or statistical analysis, but a new research design involving existing methods of data collection and analysis.
97
Why to Use Longitudinal Research? (Zhu)
Compared with cross-sectional research, longitudinal research enables the researcher to:
Identify changes over timeTrace the time order of a causal relationship
Requirements for causality:Time order between independent and dependent variablesAssociation of the two variablesExclusion of all other alternative explanations
98
How to Conduct Longitudinal Research?Trend studies: a topic is restudied using different samples drawn from the same population Cohort analysis: tracking specific age cohorts as they change over time Panel studies: the same sample of people is measured at different points in time
Retrospective panel: members of a cross-sectional sample reconstruct their past by recall
Follow-back panel: a cross-sectional sample is compared with corresponding archival data
Catch-up panel: a cross-sectional sample from the past is compared with current data available from other sources
99
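Classification of Longitudinal Research (Zhu)

| Unit of Analysis | Number of Time Points: 2-30 | Number of Time Points: 30+ |
|---|---|---|
| Sample | Trend study | Time series analysis |
| Group | Cohort analysis | Time series analysis |
| Individual | Panel study | Time series analysis |

Source: Ke, Zhu & Sun (2003), Ch. 15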
100
Comparison of Longitudinal Designs

| | Trend Studies | Cohort Analysis | Panel Studies |
|---|---|---|---|
| Advantages | Establish long-term patterns; allow secondary analysis | Detect the effects of maturation and social changes | Able to identify dynamic changes |
| Disadvantages | Vulnerable to changes in sample or measurement | Difficult to separate age, cohort and period; vulnerable to sample mortality | Costly; high attrition; sensitization to research instrument |
101
Ch 9. Experimental ResearchLearning Objectives:
What is experiment?Why do we need experiment?
Requirements for causalityHow to design and conduct experiment?
How to select the appropriate design?How to manipulate the independent variable?How to randomize the subjects?
102
What Is Experiment?
The classic experiment (“controlled laboratory experiment”) is a research procedure in which subjects are randomly assigned to experimental vs. control conditions so that the effects of the experimental stimulus (“manipulation”) can be directly tested. (Zhu)
Quasi-experiment does not involve random assignment of subjects to experimental groups.
Field experiment takes place outside lab settings to mimic real life in natural settings.
103
Why to Use (Lab) Experimental Research?
It helps establish causality (i.e., cause and effect).
It allows control for confounding effects.
It costs relatively less than other methods.
It is easy to replicate.
As the first two merits (establishment of causality and control for confounding effects) are the most central in scientific research, experiment is considered “the most rigorous method”.
104
Procedure of Experimental Research
1. Select the experimental setting
2. Select the experimental design
3. Operationalize the variables
4. Decide how to manipulate the independent variables
5. Select and assign subjects to experimental conditions (i.e., randomization)
6. Conduct a pilot study
7. Administer the experiment
8. Analyze and interpret the results
105
Step 1. Select the Experimental SettingExperimental settings:
Controlled laboratory settings (to ensure internal validity)
Natural settings (to ensure external validity)
106
Step 2. Select the Experimental Design“True” experimental designs
Posttest-only design with control group Pretest-posttest design with control group Solomon design with four groupsFactorial design
Other experimental designsRepeated measures design without control group (similarly, interrupted time series design)Pretest-posttest design with nonequivalent control group (i.e., without randomization)
107
Step 3. Operationalize the Variables
Independent variable: experimental stimulus to apply to the subjects
Dependent variable(s): responses from the subjects (before and) after being exposed to the stimulus
Attention (e.g., secondary tasks such as button push) Knowledge (e.g., recalls)Attitudes or perceptions (e.g., questionnaire)Behavior (e.g., display of imitation actions)
108
Step 4. Manipulate the Independent Variable
Develop a set of specific instructions, events or stimuli for presentation to the subjects:
Straightforward manipulation: written materials, verbal instructions, or other stimuli are presented to the subjects.Staged manipulation: the researcher constructs events or circumstances (e.g., by using a “confederate” who pretends to be a subject) to manipulate the independent variable.
In general, the manipulation should be as strong as possible to maximize potential differences between the experimental groups.
109
Step 5. Select and Assign the SubjectsSelection of experimental subjects: ideally, subjects should be randomly selected from the study population to ensure external validityAssignment of experimental conditions: the chosen subjects should be randomly assigned to experimental condition(s) and control condition to eliminate (or minimize) confounding effects that exist among subjects:
Randomization:Matching:
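Random assignment to conditions can be illustrated with a short Python sketch. The function name, subject IDs, condition labels and seed below are all hypothetical; the sketch covers simple randomization only, not matching.

```python
import random


def assign_randomly(subjects, conditions, seed=None):
    """Shuffle the subjects, then deal them out to conditions in turn,
    so that group sizes differ by at most one."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, subject in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(subject)
    return groups


# 20 hypothetical subjects split between one experimental and one control group
subjects = [f"S{k:02d}" for k in range(1, 21)]
groups = assign_randomly(subjects, ["experimental", "control"], seed=42)
for condition, members in groups.items():
    print(condition, len(members), members)
```

Because every subject has the same chance of landing in either group, pre-existing differences among subjects are spread across conditions by chance rather than by the researcher's choice.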
110
Step 6. Conduct a Pilot StudyA pilot study with a small number of subjects helps reveal problems with stimuli and/or measurement, especially to test whether the manipulation of the independent variable is strong enough to have the intended effects.
111
Step 7. Administer the ExperimentFormally carry out the main phase of the experiment:
Have the subjects read and sign a “consent form” (required by the human subjects review committee)
Randomize the subjects
Apply the manipulation (i.e., stimulus)
Measure subjects’ responses
Debrief the subjects at the end of the experiment to inform them of the real purpose and potential implications of the study
112
Step 8. Analyze and Interpret the Results
Given the specific levels of measurement used (e.g., nominal scale for the independent variable and interval/ratio scale for the dependent variable), experimental data are mostly analyzed with:
t-Test (when there are two groups involved)ANOVA (when there are three or more groups)MANOVA (when there are several parallel or repeatedly measured dependent variables)
113
Ch 10. Introduction to StatisticsLearning Objectives:
What is descriptive statistics?Distributions vs. summary statistics
What is inferential statistics?Sampling distribution
How to calculate descriptive statistics?Sample distributionsCentral tendenciesDispersionsNormal curve z-score
114
What Is Descriptive Statistics?Descriptive statistics: statistical methods and techniques used to reduce data to allow for easier interpretation.
Zhu: statistical indicators used to describe quantitative characteristics of a sample
Key descriptive statistics:Distribution of variables: frequenciesCentral tendency of variables: mean, median, modeDispersion of variables: variance, standard deviation
115
What Is Inferential Statistics (Zhu)?Inferential statistics are indicators used to estimate quantitative characteristics (i.e., parameters) of a population from relevant descriptive statistics of a corresponding sampleKey inferential statistics:
Sampling distributionsSampling errors (also known as “standard errors”)Confidence levelsConfidence intervals
116
Sample DistributionsDistribution: a collection of numbers.
Zhu: a collection of all possible values of a variable observed/measured from a sample
Distribution can be described by a frequency table that contains:
ValuesCounts (frequencies)PercentageValid PercentageCumulative Percentage
Distribution can also be described by a graphic chart:
Bar chart (for nominal or ordinal scale variables)
Histogram (for interval or ratio scale variables)
117
Shapes of Distribution Skewness of a distribution:
Right skewness: the tail of the curve trails off to the right of the distributionLeft skewness: the tail of the curve trails off to the left of the distributionNormal distribution: the two halves of the curve are identical (i.e., symmetrical)
Zhu: normal distribution has desirable mathematical properties for many statistical analyses; some skewed distributions can be transformed to become approximately normal
118
Statistics of Central Tendency
Central tendency uses a single number (i.e., a “statistic”) to describe the “typical” or “average” feature of a sample distribution
Common central tendency statistics:
Mean: the arithmetical average of a distribution (i.e., the sum of all scores divided by N), which is the most frequently used Median: the midpoint of a distribution with half of the scores above and half below itMode: the score(s) that occur(s) most frequently
119
How to Calculate Mean?
Based on original scores:
$$\bar{X} = \frac{\sum X}{n}$$
where $\bar{X}$ (read “x bar”) is the mean, $\sum$ is the summation symbol, X is any score of the distribution, and n is the number of cases (i.e., sample size)
Based on aggregated (i.e., grouped) scores:
$$\bar{X} = \frac{\sum fX}{n}$$
where f is the frequency of each given interval (or group), X is the midpoint of that interval or group
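The two ways of computing the mean described on this slide can be sketched in Python as follows. The function names and the small data sets are hypothetical and serve only as illustration.

```python
def mean_raw(scores):
    """Mean from original scores: sum of X divided by n."""
    return sum(scores) / len(scores)


def mean_grouped(midpoints, frequencies):
    """Mean from grouped scores: sum of f * X divided by n,
    where X is the midpoint of each interval and n is the sum of f."""
    n = sum(frequencies)
    return sum(f * x for f, x in zip(frequencies, midpoints)) / n


# Hypothetical raw scores
scores = [2, 4, 4, 5, 10]
raw_mean = mean_raw(scores)

# Hypothetical grouped data: intervals 0-10, 10-20, 20-30 with their counts
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]
grouped_mean = mean_grouped(midpoints, frequencies)

print(raw_mean, grouped_mean)
```

The grouped version is an approximation: it treats every case in an interval as if it sat exactly at the interval's midpoint.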
120
Statistics of DispersionsDispersion statistics describe the variability, “spread-out,” or deviation from the central tendency of a distributionCommon dispersion statistics:
Range (difference between maximum and minimum)VarianceStandard deviation
121
How to Calculate Variance & Standard Deviation?
Variance ($S^2$):
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
Standard Deviation (S):
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
Zhu: Notice that variance is just the squared standard deviation; standard deviation is the square root of variance. While variance is expressed in squared units of the original scores, standard deviation takes the same unit as the original scores (X).
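A minimal Python sketch of the sample variance and standard deviation, using the n - 1 denominator shown on this slide. The function names and the scores are hypothetical.

```python
import math


def variance(scores):
    """Sample variance: sum of squared deviations from the mean, over n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


def std_dev(scores):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(variance(scores))


scores = [2, 4, 4, 5, 10]   # hypothetical scores with a mean of 5
s2 = variance(scores)
s = std_dev(scores)
print(s2, s)
```

For these scores the squared deviations sum to 36, which is divided by n - 1 = 4.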
122
Standard Scores (z-Scores)
z-scores are transformed values from the original scores based on the mean and standard deviation:
$$z = \frac{X - \bar{X}}{s}$$
z-scores help compare scores obtained from totally different methods (i.e., measurement units) because all z-scores have a mean of 0 and a standard deviation of 1. A particular z-score tells how many standard deviations the original score is above or below the mean of the sample.
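The z-score transformation can be checked with a short Python sketch. The function name and scores are hypothetical; the standard deviation uses the n - 1 denominator, which is an assumption carried over from the variance slide.

```python
import math


def z_scores(scores):
    """Convert each score to (X - mean) / s, with s the sample standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return [(x - mean) / s for x in scores]


scores = [2, 4, 4, 5, 10]   # hypothetical scores
z = z_scores(scores)
print([round(value, 3) for value in z])

# The transformed scores have a mean of 0 and a standard deviation of 1
z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((value - z_mean) ** 2 for value in z) / (len(z) - 1))
print(round(z_mean, 6), round(z_sd, 6))
```

A score of 2 in this sample sits exactly one standard deviation below the mean, so its z-score is -1.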
123
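Standard Normal Curve (i.e., Distribution of z-Scores)
[Figure: standard normal curve, P(z) plotted against z. Each half of the curve contains 50% of cases; 34.1% of cases lie between z = 0 and z = 1 (likewise between 0 and -1), 13.6% between z = 1 and z = 2 (likewise between -1 and -2), and 95.4% between z = -2 and z = 2]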
124
Sampling DistributionDistributions:
Sample distribution: the collection of all values of a sample (actually measured), with a fixed sample size (n)
Population distribution: the collection of all values of a corresponding population (possibly but unlikely to be measured), with a fixed population size (N)
Sampling distribution: the collection of all possible values of a statistic (e.g., $\bar{X}$) that would occur if all possible samples of a fixed size (n) were taken from the population. Sampling distribution is a virtual (i.e., non-existent) distribution, with an infinite number of cases in it. Sampling distribution is the basis to estimate population parameters from sample statistics.
125
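Comparison of Three Distributions (Zhu)

| | Sample Distribution (Statistic) | Population Distribution (Parameter) | Sampling Distribution |
|---|---|---|---|
| Unit | An individual | An individual | A sample |
| Size | n | N | ∞ |
| Mean | $\bar{X}$ | μ | μ |
| Std Dev | s | σ | $se\ (= \sigma / \sqrt{n})$ |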
126
How to Calculate Standard Error?
For nominal variables:
$$se = \sqrt{\frac{p(1 - p)}{n}} \times z$$
which has been presented in Ch. 4 under “sampling error”
For interval/ratio variables:
$$se = \sqrt{\frac{s^2}{n}} \times z$$
where $s^2$ is the variance of the variable in the sample
Zhu: Generalizing from the above formulas, we can conclude that the standard error of a sampling distribution is simply the square root of the sample variance divided by the sample size (i.e., the sample standard deviation divided by the square root of the sample size), multiplied by the z-value for a given confidence level.
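The two formulas on this slide can be sketched in Python as below. The function names, the example figures, and the use of `statistics.NormalDist` to look up the z-value for a confidence level are assumptions made for illustration; the handout itself does not prescribe any software.

```python
import math
from statistics import NormalDist


def z_for_confidence(level):
    """Two-tailed z-value for a confidence level, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def se_proportion(p, n, level=0.95):
    """Nominal variables: sqrt(p(1 - p) / n) multiplied by z."""
    return math.sqrt(p * (1 - p) / n) * z_for_confidence(level)


def se_mean(s2, n, level=0.95):
    """Interval/ratio variables: sqrt(s^2 / n) multiplied by z."""
    return math.sqrt(s2 / n) * z_for_confidence(level)


# Hypothetical examples: a 50% proportion in a sample of 1,000,
# and a variable with sample variance 9 in a sample of 100
error_p = se_proportion(p=0.5, n=1000)
error_m = se_mean(s2=9.0, n=100)
print(round(error_p, 4), round(error_m, 3))
```

The first result is the familiar "plus or minus about 3 percentage points" for a sample of 1,000 at the 95% confidence level.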
127
Ch 11. Hypothesis TestingLearning Objectives:
What is statistical hypothesis (as compared with research question)?Why do we need to set up hypothesis (instead of research question)?How to test hypothesis (the 5-step procedure)?
128
What Is a Statistical Hypothesis?
Unlike a research question (which is informal, general, exploratory, and preliminary), a statistical hypothesis is formal, specific, explanatory, and predictive.
A hypothesis is a tentative generalization about the relationship between two or more variables that predicts an outcome.
Zhu: a good hypothesis needs to specify
the nature of the relationship (correlated or causal)the direction of the relationship (positive or negative)the form of the relationship (linear or nonlinear)the strength of the relationship (strong or weak)
129
Why Do We Need Hypothesis?Major benefits of setting up a hypothesis (instead of a research question):
It provides direction for a study; without it, research lacks focus and clarityIt eliminates trial-and-error research that is time consuming and wastefulIt helps rule out intervening and confounding variablesIt allows for quantification of variables; words that cannot be quantified cannot be included in a hypothesis
130
What Constitutes a Useful Hypothesis?It should be compatible with current knowledge in the area. Any hypothesis that challenges existing knowledge should have a compelling reason. It should be logically consistent (i.e., “if A = B and B = C, then A = C”, or Aristotle’s syllogism).It should be stated conciselyIt should be testable (i.e., falsifiable)
Zhu: the above prescribes a “useful” (i.e., functional) hypothesis; see my previous criteria for a “good” hypothesis
131
Research Hypothesis vs. Null Hypothesis
Null hypothesis (also called “hypothesis of no difference”) asserts that the statistical differences or relationships under study are due to chance or random error (i.e., sampling error).
Null hypothesis (denoted as H0) forms a logical alternative to the research hypothesis (H1) under test
Zhu: null hypothesis aims to prevent existing knowledge from being easily challenged
132
Procedure of Hypothesis Testing (Zhu)
1. Specify the research hypothesis and the corresponding null hypothesis
2. Select the appropriate statistical test
3. Select the minimally acceptable significance level
4. Collect the required data and perform the chosen testing
5. Make a decision on the acceptance (or rejection) of the research hypothesis based on the testing results
133
Step 1. Specifying the Hypotheses (Zhu)
Generally, there are two types of research hypotheses:
Difference between or among groups:
H0: $\bar{X}_1 = \bar{X}_2 = \cdots$
H1: $\bar{X}_1 \neq \bar{X}_2 \neq \cdots$, or better yet, H1: $\bar{X}_1 > \bar{X}_2 > \cdots$
Relationship between two or more variables:
H0: $\beta = 0$
H1: $\beta \neq 0$, or better yet, H1: $\beta > 0$
Note that a difference and a relationship are mathematically equivalent (i.e., a significant difference between groups is the same as a significant relationship between the group variable and the outcome variable)
134
Step 2. Select Statistical Test (Zhu)

| Dependent Variable (DV) | IV: Nominal | IV: Ordinal | IV: Interval/Ratio |
|---|---|---|---|
| Nominal | Crosstabs + Chi-square | Crosstabs + Chi-square | Crosstabs + Chi-square* |
| Ordinal | Crosstabs + Chi-square | Spearman Correlation | Spearman Correlation* |
| Interval/Ratio | t- or F-Test | F-Test | Correlation or Regression |

*After recoding the IV into an ordinal scale
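The test-selection table on this slide is essentially a lookup keyed on the measurement levels of the DV and the IV. The sketch below encodes it as a Python dictionary; the function name and the lower-case level labels are hypothetical conveniences, and the starred entries carry the recoding note from the slide's footnote.

```python
RECODE_NOTE = " (after recoding the IV into an ordinal scale)"

# Keys are (DV level, IV level); values follow the table on this slide
TEST_TABLE = {
    ("nominal", "nominal"): "Crosstabs + Chi-square",
    ("nominal", "ordinal"): "Crosstabs + Chi-square",
    ("nominal", "interval/ratio"): "Crosstabs + Chi-square" + RECODE_NOTE,
    ("ordinal", "nominal"): "Crosstabs + Chi-square",
    ("ordinal", "ordinal"): "Spearman Correlation",
    ("ordinal", "interval/ratio"): "Spearman Correlation" + RECODE_NOTE,
    ("interval/ratio", "nominal"): "t- or F-Test",
    ("interval/ratio", "ordinal"): "F-Test",
    ("interval/ratio", "interval/ratio"): "Correlation or Regression",
}


def choose_test(dv_level, iv_level):
    """Return the suggested test for the given DV and IV measurement levels."""
    return TEST_TABLE[(dv_level.lower(), iv_level.lower())]


# Example: does an interval-level DV differ across groups of a nominal IV?
print(choose_test("interval/ratio", "nominal"))
```

Writing the table out this way makes it easy to see that the choice depends only on the two measurement levels, not on the topic of the study.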
135
Step 3. Determine the Significance Level (Zhu)
The significance level is the minimally acceptable probability of error for the hypothesis testing
Significance level (denoted as α): predetermined by the researcher, commonly set at α = 0.05 (i.e., an error of 5% in rejecting H0), α = 0.01 (an error of 1%), or α = 0.001 (an error of 0.1%)
Probability level (denoted as p): the probability actually obtained in the data analysis, which could be any value from 0 to 1
Confidence level: the complement of the significance level (i.e., when α = 0.05, confidence level = 95%; when α = 0.01, confidence level = 99%; and when α = 0.001, confidence level = 99.9%)
136
Step 4. Collect and Analyze the Data (Zhu)
Data collection (see Ch. 6-9)Statistical analysis based on the choice made in Step 2:
Difference between/among groups: Crosstabulation (i.e., Chi-square analysis)t-test (for two groups)ANOVA (F-test, for three or more groups)
Relationship between variables:Correlation analysis (for non-directional relationships)Regression analysis (for directional or causal relationships)
137
Step 5. Make the Statistical Decision The decision is made by simply comparing the resulting p and the predetermined α:
If p ≥ α, we accept (i.e., fail to reject) H0 (i.e., the null hypothesis); consequently, we reject H1 (the research hypothesis)If p < α, we reject H0; consequently, we accept H1
138
Regions of Rejection for α < .05 (Two-tail)
[Figure: normal curve centered on µ. The region of retention covers .475 (47.5%) on each side of µ; each tail is a region of rejection equal to α/2 = .025 (2.5%); each half of the curve totals .50 (50%)]
139
Errors in Hypothesis TestingWhen we make a decision (i.e., either to accept or reject H0), we run the risk of committing one of the two types of errors:
Type I error: the rejection of a true null hypothesis that should be accepted. Because it happens in the region of rejection (= α), Type I error is called “alpha error.” Type I error is under the direct control of the researcher. To reduce the error, the researcher can simply set α closer to zero.
Type II error: the acceptance of a false null hypothesis that should be rejected. Type II error, called “beta error,” is the reverse of Type I error.
Zhu: between the two, Type I error is generally more serious and thus should be prevented as the first priority
140
Possible Results in Testing an H0

| Reality | Decision Made: Reject H0 | Decision Made: Accept H0 |
|---|---|---|
| H0 is true | Type I error (= α) | Correct (= 1 - α = Confidence Level) |
| H0 is false | Correct (= 1 - β = Power of Analysis) | Type II error (= β) |
141
Possible Results in Testing an H1

| The difference/relationship in the sample is: | In the population it doesn’t exist | In the population it exists |
|---|---|---|
| Significant | Type I error | Correct |
| Non-significant | Correct | Type II error |
142
Power Analysis
Power of a test refers to the probability of rejecting a null hypothesis when it is false
Zhu: the relevant statement on pp. 274-5? is wrong
Zhu: power analysis helps the researcher to determine, given i) the predetermined α level and ii) the possible size of observed difference/relationship:
1. the necessary sample size (to reject a false H0), or
2. the minimal power of the test (to avoid β error)
Power is commonly set to .80
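As a rough illustration of the first use of power analysis (finding the necessary sample size), the sketch below applies a common normal-approximation formula for comparing two group means: n per group = 2 × ((z for α, two-tailed) + (z for power))² / d². This formula, the standardized effect size d, and the function name are assumptions brought in for illustration; they do not come from the handout, and exact t-based calculations give slightly larger numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed comparison of two means.

    effect_size is the expected mean difference in standard deviation units
    (a hypothetical input the researcher must supply).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)           # about 0.84 for power = .80
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# A medium-sized expected difference of half a standard deviation
n_needed = sample_size_per_group(effect_size=0.5)
print(n_needed)
```

The smaller the expected difference, the larger the sample needed: halving the effect size roughly quadruples the required n.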
143
Ch 12. Basic Statistical ProceduresLearning Objectives:
What are the following statistical tests?crosstabulation analysis and Chi-square testt-test and ANOVAcorrelation and regression analysis?
Why should we use any of the above tests (instead of all others)?How can we perform these tests and interpret their results?
144
Why Do We Need Statistical Tests?Statistical tests are necessary with the scientific method of knowing.
In order to obtain valid and reliable results, any data must be analyzed using some type of statistical method.Statistics is how we advance our knowledge of everything.
Otherwise, the data will be analyzed based on the methods of intuition, tenacity, or authority, which generate results that cannot be verified.
145
What Statistical Test Should I Use?
It depends on the measurement level of your variables

| Dependent Variable (DV) | IV: Nominal | IV: Ordinal | IV: Interval/Ratio |
|---|---|---|---|
| Nominal | Crosstabs-Chi-square | Crosstabs-Chi-square | Crosstabs-Chi-square* |
| Ordinal | Crosstabs-Chi-square | Spearman Correlation | Spearman Correlation* |
| Interval/Ratio | t- or F-Test | F-Test | Correlation or Regression |

*After recoding the IV into an ordinal scale
146
Where to Find the Relevant Tool in SPSS?

| Statistical Test | SPSS Procedure |
|---|---|
| Chi-square test | Analyze/Descriptive/Crosstabs |
| t-test | Analyze/Compare Means/Paired Samples Tests |
| One-way ANOVA | Analyze/Compare Means/One-Way ANOVA |
| Multi-way ANOVA | General Linear Model/Univariate |
| Spearman & Pearson correlation | Correlate/Bivariate |
| Simple & multiple regression | Regression/Linear |
147
Crosstabs Analysis & Chi-square TestCrosstabulation analysis displays, in a 2-way table format, either of the following:
the difference in a nominal/ordinal DV between/among groups of a nominal/ordinal IV, orthe relationship between a nominal/ordinal IV and a nominal/ordinal DV
Chi-square (χ²) test provides the significance test of the null hypothesis underlying the crosstabulation that
the observed difference doesn’t exist in the population, orthe observed relationship doesn’t exist in the population
148
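Construction of a 2-Way Crosstable (Zhu)

| DV | IV = 1 | IV = 2 | IV = 3 | Total (optional) |
|---|---|---|---|---|
| 1 | ?% | ?% | ?% | ?% |
| 2 | ?% | ?% | ?% | ?% |
| Total | 100% | 100% | 100% | 100% |
| N | ? | ? | ? | ? |

χ² = ?, df = 2, p < .?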
149
Rules for Cross-tables (Zhu)
Put the IV in columns, with one column for each group of the IV
Put the DV in rows, with one row for each group of the DV
Show “column percent” in each cell (i.e., dividing the number of cases in each of the cells by the total number of cases in the corresponding column)
Show column totals (100% and the number of cases) at the bottom
Optionally, show “sample percent” in the last column
Show Chi-square test results (χ² value, degrees of freedom, and p level) below the table
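The rules for cross-tables can be tried out with a short Python sketch that computes column percentages and the Chi-square statistic from a table of counts. The function names and the counts are hypothetical. The p level is left out on purpose: it requires the Chi-square distribution, which would normally come from a statistics package or a printed table.

```python
def column_percents(table):
    """Column percent: each cell count divided by its column total, times 100.
    The table is a list of rows (DV groups); each row lists counts per IV group."""
    n_cols = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    return [[100 * row[j] / col_totals[j] for j in range(n_cols)] for row in table]


def chi_square(table):
    """Chi-square = sum over cells of (observed - expected)^2 / expected,
    with df = (rows - 1) * (columns - 1)."""
    n_cols = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (n_cols - 1)
    return chi2, df


# Hypothetical counts: 2 DV groups (rows) by 3 IV groups (columns)
counts = [[30, 20, 10],
          [20, 30, 40]]
percents = column_percents(counts)
chi2, df = chi_square(counts)
print(percents)
print(round(chi2, 2), df)
```

Each column of percentages sums to 100%, which is what allows the IV groups to be compared directly even when their sizes differ.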
150
Interpretation of Chi-square Test Results (Zhu)
When the resulting p-level is equal to or greater than the predetermined α-level (e.g., .05), retain the H0 and consequently reject the H1
When p is smaller than α, reject the H0 and consequently accept the H1Remember that the H0 states either of the following:
There is no difference in the DV between/among the groups of the IV in the populationThere is no relationship between the IV and DV in the population
151
t-Test and ANOVA (Zhu)Both tests examine the difference in an interval/ratio DV between/among groups of a nominal/ordinal IV:
t-test is used when the IV involves two groupsANOVA (Analysis of Variance, based on F-test) is used when the IV involves three or more groups
t-test is a special case of ANOVA because the t-statistic is the square root of the corresponding F-statistic; therefore, ANOVA can be used practically for comparisons of any number of groups
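The link between the two tests can be verified numerically. The sketch below computes a pooled-variance independent-samples t and a one-way ANOVA F for the same two groups and shows that t squared equals F. The function names and data are hypothetical, and the equality assumes the pooled-variance (equal variances) form of the t-test.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sum_sq(values):
    """Sum of squared deviations from the group mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


def t_statistic(group1, group2):
    """Independent-samples t with pooled variance."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (sum_sq(group1) + sum_sq(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def f_statistic(groups):
    """One-way ANOVA F: mean square between groups over mean square within."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum_sq(g) for g in groups)
    df1 = len(groups) - 1
    df2 = len(all_values) - len(groups)
    return (ss_between / df1) / (ss_within / df2)


# Hypothetical DV scores for two groups of the IV
group1 = [4, 5, 6, 7, 8]
group2 = [1, 2, 3, 4, 5]
t = t_statistic(group1, group2)
f = f_statistic([group1, group2])
print(t, f, t ** 2)
```

With three or more groups only `f_statistic` applies, which is why ANOVA is the more general tool.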
152
Construction of a t-Test Table (Zhu)

| DV | IV = 1 | IV = 2 | Difference |
|---|---|---|---|
| Mean | ? | ? | ? |
| Std Dev | ? | ? | -- |
| N | ? | ? | -- |
| Std Error (optional) | ? | ? | ? |

t = ?, df = ?, p < .?
153
Rules for t-Test Tables (Zhu)Put the IV in columns, with each column for each group of the IVPut the DV in rows, with one row for mean, standard deviation, standard error (optional because it can be calculated based on the other three pieces of the information presented), and number of cases of each group, respectivelyOptionally, show the difference in the mean of the DV between the two groups in the last columnShow the results of the t-test (including the obtained t-value, df, and p-level) below the tableWhen constrained by the space, t-test table(s) can be replaced by a brief report of the above content in the text; parallel tables can be combined into a multi-panel table
154
Interpretation of t-Test Results (Zhu)When the resulting p-level is equal to or greater than the predetermined α-level (e.g., .05), retain the H0 (i.e., there is no difference in the mean of DV between the groups of the IV in the population) and consequently reject the H1
When p is smaller than α, reject the H0 and consequently accept the H1 (i.e., there is a significant difference in the DV between the groups of the IV in the population)
155
Construction of an ANOVA Table (Zhu)

| DV | IV = 1 | IV = 2 | IV = 3 |
|---|---|---|---|
| Mean | ? | ? | ? |
| Std Dev | ? | ? | ? |
| N | ? | ? | ? |
| Std Error (optional) | ? | ? | ? |

F = ?, df1 = ?, df2 = 2, p < ?
156
An Alternative Table for ANOVA Results: Multi-group Comparisons (Zhu)

| | Mean | Std | N | Group 2 | Group 3 |
|---|---|---|---|---|---|
| Group 1 | ? | ? | ? | Difference* | Difference** |
| Group 2 | ? | ? | ? | | Difference*** |
| Group 3 | ? | ? | ? | | |

* p < .05, ** p < .01, *** p < .001
157
Interpretation of ANOVA Results (Zhu)Overall test:
When p ≥ α (e.g., .05), retain the H0 (i.e., there is no difference in the mean of DV among all groups of the IV in the population) and consequently reject the H1
When p < α, reject the H0 and consequently accept the H1(i.e., there is a significant difference in the DV between at least one pair of the groups of the IV in the population)
Post hoc tests: when the overall test is significant (p < α), a post hoc test is needed to compare each pair of the groups to identify exactly which pair is significantly different. The logic of t-test between two groups applies here.
158
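Two-way ANOVA Data Table*

| IV 1 | IV 2: Value=1 | IV 2: Value=2 | IV 2: Value=3 | Total |
|---|---|---|---|---|
| Value=1 | $x_{11}$ | $x_{12}$ | $x_{13}$ | $x_{1\bullet}$ |
| Value=2 | $x_{21}$ | $x_{22}$ | $x_{23}$ | $x_{2\bullet}$ |
| Total | $x_{\bullet 1}$ | $x_{\bullet 2}$ | $x_{\bullet 3}$ | $x_{\bullet\bullet}$ |

* This and the following slides on 2-way ANOVA are optional.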
159
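Two-way ANOVA Result Table

| Source | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| IV 1 | ? | ? | ? | ? | ? |
| IV 2 | ? | ? | ? | ? | ? |
| Interaction | ? | ? | ? | ? | ? |
| Error | ? | ? | ? | | |
| Total | ? | ? | | | |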
160
Interpretation of 2-way ANOVA Results (Zhu)
Three H0’s are tested:
H0 for IV 1: if p < α, reject the H0 for IV 1 (i.e., IV 1 has a significant effect on the DV)
H0 for IV 2: if p < α, reject the H0 for IV 2 (i.e., IV 2 has a significant effect on the DV)
H0 for the interaction between IV 1 and IV 2: if p < α, reject the H0 (i.e., the effect of IV 1 on the DV varies according to the value of IV 2; or vice versa, the effect of IV 2 on the DV varies according to the value of IV 1)
161
Illustration of Interaction Effect
[Figure: two line plots of DV against IV 1 (values 1 and 2), each with separate lines for IV 2 = 1 and IV 2 = 2. Panel titles: No Interaction; Significant Interaction]
162
Correlation and Regression Analysis
Correlation: the degree of association of 2 or more variables without assuming the causal direction between them, measured in a standardized unit (from -1 to 1):
Spearman’s correlation (ρ, for ordinal variables)Pearson’s correlation (r, for interval/ratio variables)
Regression: the degree of association of a DV with 1 or more IVs, measured in both standardized unit and nonstandardized unit (i.e., the original unit of the DV):
Simple regression (for a DV and an IV)Multiple regression (for a DV and 2+ IVs)
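The two correlation coefficients can be computed by hand, as in the following sketch. The function names are hypothetical; Spearman's ρ is computed here as Pearson's r on ranks, with tied scores given their average rank, which is one common convention and an assumption on my part rather than something the handout states.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r for two equal-length lists of interval/ratio scores."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank scores from 1 to n; tied scores share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
print(spearman_rho([1, 2, 3], [1, 8, 27]))
```

The second call shows why ρ suits ordinal data: Y rises faster than X, yet the rank order is identical, so the rank correlation is perfect.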
163
Possible Correlational Relationships
[Figure: six plots of Y (vertical axis) against X (horizontal axis). Panel titles: Perfectly Positive Correlation (r=1); Imperfect Positive (r=0.5); Weak (probably nonsig.) (r=0.1); Perfectly Zero Correlation (r=0); Imperfect Negative (r=-0.5); Perfectly Negative Correlation (r=-1).]
164
Interpretation of Correlation Results (Zhu)
Significance of the relationship:
If p ≥ α, retain H0 (i.e., the observed ρ or r between X and Y is merely by chance and doesn’t exist in the population)
If p < α, reject H0 (i.e., the ρ or r between X and Y is beyond chance and does hold in the population)
Direction of the relationship:
If ρ or r > 0, there is a positive correlation between X and Y (i.e., the two vary in the same direction)
If ρ or r < 0, there is a negative correlation between X and Y (i.e., the two vary in opposite directions)
Strength of the relationship:
If the absolute value of ρ or r < .3, the correlation is weak
If the absolute value of ρ or r > .3 but < .7, the correlation is medium
If the absolute value of ρ or r > .7, the correlation is strong
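The three-part reading of a correlation result (significance, direction, strength) can be written as a small helper. This is a sketch under assumptions: `interpret_correlation` is a hypothetical name, and because the rules above leave exactly .3 and exactly .7 unclassified, the code assigns them to "medium" and "strong" respectively.

```python
def interpret_correlation(coef, p, alpha=0.05):
    """Summarize a correlation coefficient (rho or r) and its p value."""
    significance = "reject H0" if p < alpha else "retain H0"

    if coef > 0:
        direction = "positive"
    elif coef < 0:
        direction = "negative"
    else:
        direction = "none"

    size = abs(coef)
    if size < 0.3:
        strength = "weak"
    elif size < 0.7:      # assumption: exactly .3 counts as medium
        strength = "medium"
    else:                 # assumption: exactly .7 counts as strong
        strength = "strong"

    return {
        "significance": significance,
        "direction": direction,
        "strength": strength,
    }

print(interpret_correlation(-0.45, 0.003))
```

Significance and strength are separate questions: with a large sample a weak correlation can be significant, and with a small sample a strong one may not be.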
165
Construction of Correlation Matrix Table

       X1                X2                 X3
X1     1.00
X2     ?* (n = ?)        1.00
X3     ?** (n = ?)       ?*** (n = ?)       1.00

* p < .05, ** p < .01, *** p < .001
166
Partial Correlation (optional)
Partial correlation is a multivariate analysis that examines the net bivariate correlation between X and Y by controlling the impact of “third variables” (e.g., Z, W, etc.) on Y.
Zhu: the order of partial correlation is determined by the number of “third variables” involved:
rxy (bivariate correlation between X and Y) is called the “zero-order correlation coefficient”
rxy|z (partial correlation between X and Y with Z controlled) is called the “first-order correlation coefficient”
rxy|zw (partial correlation between X and Y with Z and W controlled) is called the “second-order correlation coefficient”
rxy|zw… (partial correlation between X and Y with k variables controlled) is called the “kth-order correlation coefficient”
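A first-order partial correlation can be obtained from the three zero-order correlations using the standard textbook formula, which removes Z from both X and Y. The handout does not give a formula, so the sketch below is an illustration; `first_order_partial` and the example values are hypothetical.

```python
from math import sqrt

def first_order_partial(r_xy, r_xz, r_yz):
    """Partial correlation between X and Y with Z controlled (rxy|z),
    computed from the three zero-order correlations."""
    numerator = r_xy - r_xz * r_yz
    denominator = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical zero-order correlations
print(first_order_partial(r_xy=0.5, r_xz=0.5, r_yz=0.5))
```

When Z is unrelated to both X and Y (r_xz = r_yz = 0), the partial correlation equals the zero-order rxy, as one would expect. Higher-order coefficients can be built by applying the same formula to first-order partials.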
167
Simple Regression Analysis
A simple regression analysis builds on an assumed causal relationship between two variables, with one (X) as the IV and the other (Y) as the DV.
The relationship between X and Y can be expressed by the equation:
Y = a + bX
where a is the intercept of the regression line and b is the slope of the regression line.
[Figure: the regression line Y = a + bX, with Y on the vertical axis and X on the horizontal axis, and with the intercept a and the slope b marked.]
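The intercept a and slope b are usually estimated by ordinary least squares. A minimal sketch follows, assuming that estimation method (the handout does not name one); `fit_simple_regression` and the data are hypothetical.

```python
def fit_simple_regression(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx              # slope: change in Y per unit of X
    a = y_bar - b * x_bar      # intercept: predicted Y when X = 0
    return a, b

a, b = fit_simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
print(a + b * 5)   # predicted Y for X = 5
```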
168
Construction of Simple Regression Table

             b     Beta    t     p
IV           ?     ?       ?     ?
Constant     ?     ?       ?     ?

N = ?
169
Interpretation of Simple Regression Results (Zhu)
Regression analysis tests the H0 that the effect of X on Y, called the “regression coefficient of X,” is null (i.e., b = 0).
If the resulting p (for b) < α, we reject H0 and conclude that X has a significant effect on Y.
More specifically, we can predict that a one-unit increase in X will lead to a change of b units in Y in the population, within a confidence interval.
However, the above (i.e., rejection of H0) still doesn’t prove the causal direction from X to Y; it only suggests the impact of X on Y if the causal assumption is correct.
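The quantities reported for the IV in a simple regression table (b, Beta, t) can be sketched as follows. This assumes ordinary least squares and the usual standard-error formula for the slope; `slope_test` is a hypothetical name. The p value is not computed here: it would be read from the t distribution with n - 2 degrees of freedom using a table or a statistics package.

```python
from math import sqrt

def slope_test(x, y):
    """Return b, its standard error, t, Beta, and df for the H0 that b = 0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual sum of squares around the fitted line
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(sse / (n - 2) / sxx)
    return {
        "b": b,
        "se_b": se_b,
        "t": b / se_b,
        "Beta": b * sqrt(sxx / syy),   # b rescaled to standard-deviation units
        "df": n - 2,
    }

print(slope_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

In a simple regression, Beta equals Pearson's r between X and Y, which links this table back to the correlation analysis.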
170
Multiple Regression Analysis (optional)
Multiple regression analysis involves two or more IVs:
Y = a + b1X1 + b2X2 + … + bkXk
The test involves:
an H0 for the overall regression equation (i.e., R2 = 0, where R2 is called the “coefficient of determination”)
multiple H0’s, each corresponding to a particular IV and tested separately (i.e., b1 = 0, b2 = 0, …, bk = 0)
171
Table of a Multiple Regression

             b     Beta    t     p
IV 1         ?     ?       ?     ?
IV 2         ?     ?       ?     ?
IV 3         ?     ?       ?     ?
Constant     ?     ?       ?     ?

Adjusted R2 = ?, N = ?, (F = ?, df1 = ?, df2 = ?,) p < ?
* p < .05, ** p < .01, *** p < .001
172
Table of Several Multiple Regressions

             DV1    DV2    DV3
IV 1         ?      ?      ?
IV 2         ?      ?      ?
IV 3         ?      ?      ?
Constant     ?      ?      ?
p            ?      ?      ?
N            ?      ?      ?
Adj. R2      ?      ?      ?
173
Interpretation of Multiple Regression Results (optional)
If the p for the overall equation < α, we reject the overall H0 to conclude that the IVs jointly have a significant effect on the DV
The value of R2 describes the proportion of variance in the DV explained by the IVs together
If the p for a particular b (ranging from b1 to bk) < α, we reject the associated H0 to conclude that the corresponding IV has a significant effect on the DV
The value of the particular b describes the amount of change in the DV caused by a one-unit increase in the IV, when all other IVs are held constant
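The overall-fit figures in a multiple regression table (R2, adjusted R2, F, df1, df2) follow from the observed and predicted values of the DV. The sketch below assumes the predictions come from a least-squares equation that includes a constant; `overall_fit` and the data are hypothetical, and the p for F is left to a table or a statistics package.

```python
def overall_fit(y, y_hat, k):
    """Summarize overall fit given observed y, predicted y_hat, and k IVs."""
    n = len(y)
    y_bar = sum(y) / n
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    r2 = 1 - ss_error / ss_total
    df1, df2 = k, n - k - 1
    # Adjusted R2 penalizes R2 for the number of IVs in the equation
    adj_r2 = 1 - (1 - r2) * (n - 1) / df2
    f = (r2 / df1) / ((1 - r2) / df2)
    return {"R2": r2, "Adj. R2": adj_r2, "F": f, "df1": df1, "df2": df2}

# Hypothetical DV scores and predictions from an equation with one IV
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(overall_fit(y, y_hat, k=1))
```

Adjusted R2 is never larger than R2, and the gap widens as more IVs are added to a small sample, which is presumably why the table reports the adjusted value.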
174
Advanced Techniques for Multiple Regression (Zhu, optional)
Nominal/ordinal scale variables as IVs: recoded by a series of dummy (i.e., binary) variables
Interaction between two IVs:
when both are interval/ratio variables
when one IV is an interval/ratio variable and the other a nominal/ordinal variable
Nonlinear relationship:
Nonlinear transformation of IVs
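These three techniques all amount to building new columns before the regression is run. The sketch below is only an illustration: the helper names, the choice of a product term for the interaction, and the squared term as the nonlinear transformation are assumptions, since the handout does not specify them.

```python
def dummy_code(values, reference):
    """Recode a nominal IV into binary (0/1) variables, one per category
    except the reference category, which is represented by all zeros."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

def interaction_term(x1, x2):
    """Product of two IVs, entered as an extra IV to test their interaction."""
    return [a * b for a, b in zip(x1, x2)]

def squared_term(x):
    """One possible nonlinear transformation of an IV."""
    return [v ** 2 for v in x]

# Hypothetical data: preferred medium (nominal) and hours of use (ratio)
medium = ["tv", "print", "web", "tv"]
hours = [2, 1, 4, 3]

dummies = dummy_code(medium, reference="tv")
print(dummies)
print(interaction_term(dummies["web"], hours))
print(squared_term(hours))
```

A nominal IV with c categories needs c - 1 dummy variables; each b then compares one category with the reference category.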