Mass Media Research (Lecture Notes)

Mass Media Research: An Introduction, 7th edition
R. D. Wimmer & J. R. Dominick (2003)

This handout is prepared with copyright by Jonathan Zhu.

Reproduction or distribution is prohibited without prior consent.

Table of Contents
I. The Research Process
  1. Science and Research
  2. Elements of Research
  3. Research Ethics
  4. Sampling
II. Research Approaches
  5. Qualitative Research Methods*
  6. Content Analysis
  7. Survey Research
  8. Longitudinal Research
  9. Experimental Research
III. Data Analysis
  10. Introduction to Statistics
  11. Hypothesis Testing
  12. Basic Statistical Procedures
IV. Research Applications*
  13. Research in the Print Media
  14. Research in the Electronic Media
  15. Research in Advertising
  16. Research in Public Relations
  17. Research in Media Effects
  18. Mass Media Research & Internet
* Omitted from this course.

Ch 1. Science and Research
Learning Objectives:
What is scientific research?
Why is scientific research the only learning approach that allows for self-correction (i.e., falsification) of existing knowledge or new findings?
  Basic characteristics of scientific research
How to conduct scientific research?
  Basic steps of research procedure

What Is Scientific Research?
Research is an attempt to discover something new.
Scientific research is an organized, objective, controlled, qualitative or quantitative empirical analysis of one or more variables.

Methods of Knowing
Four approaches for people to find answers to research questions:
1. Method of tenacity (to follow or apply previously established traditions)
2. Method of intuition (to apply common sense, to take it for granted)
3. Method of authority (to follow a trusted source)
4. Method of scientific inquiry (to discover truth through objective evaluations of data)

Characteristics of Scientific Method
The scientific method has the following five basic characteristics, as compared with other methods of knowing:
Public (vs. private)
Objective (vs. subjective)
Empirical (vs. speculative)
Systematic and cumulative (vs. anecdotal)
Predictive (vs. ad hoc)

Scientific Research Procedures
A typical research process consists of eight steps:
1. Select a problem;
2. Review existing, relevant research and theory;
3. Develop hypotheses or research questions;
4. Determine an appropriate methodology/research design;
5. Collect relevant data;
6. Analyze the data and interpret the results;
7. Present the results in an appropriate form;
8. Replicate the study (when necessary).

Steps of the Research Procedure (flowchart)
Selection of problem
Review of existing research & theory
Statement of hypothesis or research questions
Determination of appropriate methodology
Data collection
Analysis & interpretation of data
Presentation of results
Replication

Step 1. How to Select a Research Topic?
Issues to consider when selecting a topic:
1. Is the topic too broad?
2. Can the problem really be investigated?
3. Can the data be analyzed?
4. Is the problem significant?
5. Can the results be generalized?
6. What costs and time are involved?
7. Is the planned approach appropriate to the project?
8. Is there any potential harm to the subjects?

Step 2. How to Review Existing Research & Theory?
Sources to review:
  Academic journals
  Professional trade publications
  Popular magazines
  The Internet
Questions to focus on when reviewing:
  What type of research has been done in the area?
  What has been found in previous studies?
  What suggestions do other researchers make for future research?
  What has not been investigated?
  What research methods were used in previous studies?
  How can the proposed study add to our knowledge of the area?

Step 3. How to State a Hypothesis/Research Question?
A hypothesis is a formal statement about the nature, function, direction, and strength of a testable relationship between variables.
A research question is a less formal statement about relationships between variables that does not specify the nature, direction, or strength of the relationships. (Zhu)
Between the two, a hypothesis is preferred, especially when there is a rich body of existing research on the topic. (Zhu)

Step 4. How to Determine the Right Methodology and Research Design? (Zhu)
Basic research design:
  Between-subjects design (cross-sectional, use survey)
  Within-subjects design (longitudinal, see Ch. 8)
  Mixed design (cross-sectional and longitudinal)
Unit of analysis:
  Individuals (micro-level)
  Groups or communities
  Societies (macro-level)
Research purpose:
  Seek internal validity: use experiment
  Seek external validity: use survey
  Seek “thick description”: use case studies, observations, in-depth interviews, focus groups, etc.

Step 5. How to Collect Data?
Sampling (Ch. 4)
Instrument
  Questionnaire (survey or experiment) or coding classification (content analysis)
  Stimulus (experimental materials)
Measuring (Ch. 5-9)

Step 6. How to Analyze and Interpret Data? (Zhu)
Univariate analysis: characteristics of a variable in the sample and the corresponding population (Ch. 10)
Bivariate analysis: relationship between two variables in the sample and the corresponding population (Ch. 12)
Multivariate analysis: relationship among three or more variables in the sample and the corresponding population (Ch. 12)

Step 7. How to Present the Results? (Zhu)
News releases
Business/policy reports
Academic journal articles
Book/book chapter manuscripts
Oral presentations

Step 8. How to Replicate the Study?
Replications are necessary to eliminate design-specific, sample-specific, or method-specific results.
Basic types of replication:
  Literal replication (exact duplication)
  Operational replication (to duplicate only sampling or experimental procedure)
  Instrumental replication (to duplicate only dependent measures)
  Constructive replication (deliberately to alter measures and procedures)

Ch 2. Elements of Research
Learning Objectives:
What are the similarities and differences between the following pairs of terms?
  Concept vs. construct
  Independent vs. dependent variable
  Concept/construct vs. variable
  Variable vs. constant (Zhu)
  Discrete vs. continuous variable
  Variable vs. measurement
  Measurement vs. scale
  Conceptualization vs. operationalization (Zhu)
  Constitutive vs. operational definition
  Validity vs. reliability

Concepts and Constructs
Both are “building blocks” of theory:
  Concept: a term that expresses an abstract idea formed by generalizing from particulars and summarizing related observations.
  Construct: a “meta” concept that consists of lower-level concepts, usually not directly observable.
The difference between a concept and a construct is always arbitrary and relative, as a concept may contain sub-concepts whereas a construct may be part of a larger construct.

Independent and Dependent Variables
Variables are an “empirical” or “operational” version of concepts/constructs:
Independent variable (IV): a force or event that acts as the cause of a process
  The IV can be systematically varied in an experiment but only observed in a survey or content analysis
Dependent variable (DV): a force or event that acts as the effect or outcome of a process
  The DV can only be observed in any research setting. The DV is what the researcher wishes to explain.

Variables and Constants (Zhu)
A variable has at least two values.
A constant is a special case of a variable that takes a single (i.e., fixed) value.
In general, variables with a fixed value are not worth studying. That is, don’t study a constant.

Discrete and Continuous Variables
The values of any variable can be either discrete or continuous:
  Discrete variable: a finite set of values
  Continuous variable: a set of values that can be infinitely broken into subparts.

Constitutive and Operational Definitions
Constitutive definition: using other words or concepts to define or explain the word/concept being defined (see Ch. 1)
Operational definition: using observable (i.e., measurable and quantifiable) variables to define or explain the word/concept being defined

Conceptualization and Operationalization (Zhu)
Conceptualization: a thought process to identify key concepts and formulate their structural relationship, based on existing theory and past research. It involves “translating” concrete events and/or phenomena to abstract symbols and propositions.
Operationalization: a thought process to translate an abstract concept into a concrete variable that can be quantitatively measured by a questionnaire (in survey), coding sheet (in content analysis), physiological instruments (in experiment), and other means of data collection.
Both translation processes inevitably involve errors and distortions. The accuracy of the translation is known as “validity” whereas the precision of the translation is known as “reliability”.

Levels of Measurement
Measurement is an empirical (i.e., quantifiable) version of variables:
  Nominal level (discrete): using arbitrary numbers to classify categories of a variable
  Ordinal level (discrete): ranking the order of categories of a variable
  Interval level (continuous): measuring a variable with an equal distance between adjacent points on a scale
  Ratio level (continuous): with all properties of interval level plus a true zero point
Zhu: in practice, interval and ratio levels are treated in the same way. No fine distinction between the two is necessary.

Reliability and Validity
Indicators of the quality of a variable (or its measurement):
Reliability: the extent to which a measure consistently gives the same results (“precision”)
  Stability over time
  Internal consistency among constitutive measures
  Equivalency between two parallel forms of a measurement
Validity: the extent to which a measure captures what it is supposed to measure (“accuracy” or “truth”)
  Face validity
  Predictive validity
  Concurrent validity
  Construct validity


Ch 3. Research Ethics
Learning Objectives:
What principles, guidelines, and codes of behavior are available to guide researchers when dilemmas arise?
Why are the principles, guidelines, and codes stipulated that way (i.e., theoretical grounds and practical consequences)?

Why Be Ethical?
Unethical practices will
  harm research subjects
  damage the reputation of the research community
  produce misleading results

General Ethical Theories
The following three theories help researchers to determine what is ethical and what is not:
  Categorical imperative theory (rule-based): the researcher should act in a way that he or she wants all others to act.
  Balancing theory (utilitarianism): the researcher should maximize good and minimize harm from an action.
  Relativism theory: assuming that there is no absolute right or wrong way of behavior, the researcher should act based on the established norms and, better yet, codes of conduct in the local culture.

Ethical Principles
Autonomy (self-determination): the researcher should respect the rights, values, and decisions of research subjects.
Nonmaleficence: the researcher should not intentionally inflict harm on research subjects.
Beneficence: the researcher should remove existing harms and confer benefits on research subjects.
Justice: the researcher should treat all research subjects equally (to enjoy benefits and avoid harms).

S. Cook’s Code of Behavior
Don’t involve people in research without their knowledge or consent.
Don’t coerce people to participate.
Don’t withhold from the participant the true nature of the research or actively lie about it.
Don’t lead the participant to commit acts that diminish his or her self-respect.
Don’t violate the right to self-determination.
Don’t expose the participant to physical or mental stress.
Don’t invade the privacy of the participant.
Don’t withhold benefits from participants in control groups.
Don’t fail to treat participants fairly and show them consideration and respect.
Always treat every participant with unconditional human regard.

Specific Ethical Problems
Voluntary participation and informed consent: remedied by consent forms
Concealment (hiding information) and deception (providing false information): remedied by post-research debriefs
Protection of privacy: remedied by measures of confidentiality
Federal regulations: human subjects reviews
Data analysis and reporting: prevention of data tampering, plagiarism, and deprived authorship

AAPOR’s Code of Professional Ethics & Practices
I. Principles of professional practice in conduct
  A. Exercise due care in gathering and processing data, to assure the accuracy of results
  B. Exercise due care in research design and analysis:
    1. using only tools and methods well suited to the research problem;
    2. not using tools and methods that yield misleading conclusions;
    3. not interpreting research results inconsistent with the data;
    4. not implying greater confidence in the results than what the data warrant.

AAPOR’s Code of Professional Ethics (cont’d)
II. Principles of professional responsibility in dealing with people
  A. the public: disclose methods used and correct possible distortions when releasing results
  B. clients and sponsors: protect their confidentiality and accept only assignments possible to accomplish within technical limitations
  C. the profession: share as freely as possible ideas and findings to improve the profession
  D. respondents: shall not lie, abuse, coerce, or humiliate them; protect their confidentiality

Ch 4. Sampling
Learning Objectives:
What is probability sampling?
Why is it important and necessary to draw probability samples?
How to draw probability samples?
  How to determine sampling error (in relation to confidence level and confidence interval)?
  How to determine sample size?

Population and Sample
Population: the entire set of subjects, variables, concepts, or phenomena under study (i.e., “study population”).
  A study of every member of the study population is known as a “census”. A census involves only measurement errors (i.e., inconsistencies produced by the instrument used).
Sample: a sub-set of the study population that is representative of the population.
  The process of drawing a sample from the study population is known as “sampling”. A sample involves both measurement errors and “sampling errors” (i.e., differences from the population).

Probability and Nonprobability Samples
Probability sample: selected according to the random principle, whereby each unit’s chance of selection is known.
  Because the selection probability is known, the sampling errors of a probability sample can be estimated.
Nonprobability sample: selected without following the random principle, so that units are selected with an unknown chance.
  Because the selection probability is unknown, the sampling errors of a nonprobability sample cannot be estimated.

When to Use a Nonprobability Sample?
Purpose of the study: when it is not to generalize the results to the population (i.e., for inferential purpose); instead, it is to investigate the relationships between variables, or to run a pilot study that aims to test a questionnaire or instrument
Cost availability: when there isn’t adequate money to collect a probability sample
Time availability: when there isn’t adequate time to collect a probability sample
Acceptable errors: when the amount of error is not a prime concern.
Zhu: of the above issues, generalizability is often the most important consideration.

Nonprobability Samples
Available sample (convenience sample): a collection of readily accessible subjects.
Volunteer sample: a collection of subjects selected not by random principle but by self initiation.
Purposive sample: a collection of subjects selected for specific characteristics, which eliminates all others who fail to meet the criteria.
Quota sample: a collection of subjects selected to meet a predetermined or known percentage (quota).
Snowball sample: a collection of subjects selected based on “referrals”.

Probability Samples
Simple random sample (SRS): each subject in the population has an equal chance of being selected.
  In telephone surveys, the random digit dialing (RDD) procedure produces an SRS
Systematic random sample: every n-th subject is selected from the population.
Stratified sample: selected after the population is divided into demographic strata.
Cluster sample: selected after the population is divided into geographic clusters.
[Note: in practice, multistage sampling is often used, in which two or more of the above are used.]
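
The first three designs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame is a hypothetical list of 1,000 unit IDs, the strata are defined by an arbitrary function, and the stratified draw is proportionate (the same fraction in every stratum), which the notes do not specify.

```python
import random

def simple_random_sample(population, k, seed=None):
    """SRS: every unit has the same chance of selection (without replacement)."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, k, seed=None):
    """Systematic sample: random start, then every n-th unit (n = N // k)."""
    rng = random.Random(seed)
    interval = len(population) // k
    start = rng.randrange(interval)
    return [population[start + i * interval] for i in range(k)]

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Proportionate stratified sample: an SRS of the same fraction in each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

population = list(range(1, 1001))  # hypothetical sampling frame of unit IDs
srs = simple_random_sample(population, 100, seed=1)
systematic = systematic_sample(population, 100, seed=1)
# Hypothetical strata: odd vs. even IDs stand in for a demographic variable.
stratified = stratified_sample(population, lambda u: u % 2, 0.1, seed=1)
```

A cluster sample would follow the same pattern as the stratified draw, except that whole groups are sampled first and units are then taken only from the selected groups.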

Random Selection of Individuals from Households
Either of the following methods is used to ensure that individuals are selected randomly from the chosen households/organizations:
  Based on predetermined demographic quota (e.g., Kish Random Tables; see Table 4.2)
  Based on the “last-birthday” principle (added): selecting the member of the household/organization whose birthday is the most recent
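
The last-birthday rule can be sketched as follows. The household roster, the names, and the interview date are hypothetical, and treating a February 29 birthday as February 28 in non-leap years is an assumption made only to keep the example complete.

```python
from datetime import date

def last_occurrence(month, day, today):
    """Most recent date on or before `today` on which this birthday fell."""
    for year in (today.year, today.year - 1):
        try:
            candidate = date(year, month, day)
        except ValueError:
            # Assumption: a Feb 29 birthday is observed on Feb 28 in non-leap years.
            candidate = date(year, 2, 28)
        if candidate <= today:
            return candidate

def select_last_birthday(members, today):
    """Return the name of the member whose birthday was most recent."""
    return max(members, key=lambda m: last_occurrence(m[1], m[2], today))[0]

# Hypothetical household: (name, birth month, birth day)
household = [("Ann", 3, 15), ("Bob", 11, 2), ("Cy", 6, 30)]
chosen = select_last_birthday(household, date(2003, 7, 10))
print(chosen)  # Cy (birthday on June 30, ten days before the interview date)
```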

Sampling Error
Sampling error (standard error): the degree to which statistics (i.e., measurements obtained from a sample) differ from parameters (i.e., the same measurements that would be obtained from the population)
Sampling error is a function of sample size
Sampling error is inevitable, but can be estimated

Sampling Error, Confidence Interval, & Confidence Level
Sampling error is a relative concept that takes different values for a given sample depending on the “confidence level” arbitrarily chosen by the researcher
  The higher the confidence level, the larger the sampling error
  The positive and negative sampling errors associated with a given confidence level form a range known as the “confidence interval”
In practice, 95% is considered to be the minimally acceptable confidence level; that is, if 100 samples of the same size were drawn, the population parameter would fall within the confidence intervals of about 95 of them
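
The meaning of a 95% confidence level can be checked with a small simulation. The sketch below assumes a hypothetical population proportion of 0.5 and samples of 1,000, and it builds each interval as the sample proportion plus or minus 1.96 standard errors (the usual normal approximation); roughly 95 of the 100 intervals should contain the population value.

```python
import math
import random

def covers(true_p, n, z, rng):
    """Draw one sample of size n; report whether p_hat +/- z*SE contains true_p."""
    hits = sum(1 for _ in range(n) if rng.random() < true_p)
    p_hat = hits / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= true_p <= p_hat + margin

rng = random.Random(2003)        # fixed seed so the run is repeatable
true_p, n, z = 0.5, 1000, 1.96   # hypothetical population value and sample size
covered = sum(covers(true_p, n, z, rng) for _ in range(100))
print(f"{covered} of 100 intervals contain the population parameter")
```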

Calculation of Sampling Error

$$se = \sqrt{\frac{p(1-p)}{n}} \times z$$

where se is the sampling error, n is the sample size, p is the percentage of interest in the population, expressed as a proportion (often assumed to be 50%, i.e., 0.5), and z is the z-value associated with a confidence level, e.g.,

Confidence level = 68%, z = 1
Confidence level = 95%, z = 1.96
Confidence level = 99%, z = 2.58
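
As a worked example, the sampling error for a sample of 1,000 at the 95% confidence level, with p assumed to be 0.5, comes to about 0.031, i.e., roughly plus or minus 3.1 percentage points. The function name and the lookup table below are illustrative, not part of the textbook.

```python
import math

# z-values for common confidence levels (2.58 is the usual rounding of 2.576).
Z_VALUES = {0.68: 1.0, 0.95: 1.96, 0.99: 2.58}

def sampling_error(n, p=0.5, confidence=0.95):
    """se = sqrt(p * (1 - p) / n) * z, with p expressed as a proportion."""
    z = Z_VALUES[confidence]
    return math.sqrt(p * (1 - p) / n) * z

se = sampling_error(1000)
print(f"{se:.3f}")  # 0.031
```

Because n sits under the square root, quadrupling the sample size only halves the sampling error.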

Considerations for Sample Size
Project type: focus groups, pilot study, or formal study
Project purpose: descriptive, inferential, or explanatory
Project complexity: univariate, bivariate, or multivariate; single level or multilevel; cross-sectional or longitudinal
Amount of error tolerated: “low-incidence” events are particularly sensitive to errors
Time constraints
Financial constraints
Previous research in the area

Calculation of Sample Size (Zhu)
The formula for sampling error can also be used to determine sample size given a predetermined, maximally acceptable sampling error (se) and a predetermined, minimally acceptable confidence level (which determines the value of z):

$$n = \frac{p(1-p)}{se^2} z^2$$

Note that if the chosen confidence level = 95%, z = 1.96; if confidence level = 99%, z = 2.58; etc.
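
A short sketch of the sample size calculation, assuming p = 0.5 and a 95% confidence level by default. Rounding up to the next whole respondent is a choice made here, not something the notes state.

```python
import math

def required_sample_size(se, p=0.5, z=1.96):
    """n = p * (1 - p) * z^2 / se^2, rounded up to a whole respondent."""
    return math.ceil(p * (1 - p) * z ** 2 / se ** 2)

print(required_sample_size(0.03))  # 1068
print(required_sample_size(0.05))  # 385
```

Tightening the acceptable error from 5 to 3 percentage points nearly triples the required sample, which is why the tolerated error is usually the main driver of survey cost.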

Ch 6. Content Analysis
Learning Objectives:
What is content analysis?
Why do we need to use content analysis?
  Limitations of content analysis and over-use of content analysis
How to conduct content analysis?
How to ensure the systematic and objective nature?

What Is Content Analysis?
Kerlinger (2000): content analysis is a method of studying and analyzing communication in a systematic, objective, and quantitative manner.
  Systematic: content is selected and analyzed according to explicit and consistently applied rules: sampling procedure and coding scheme
  Objective: content is analyzed based on explicit operational definition and classification rules
  Quantitative: content is precisely counted

Why Use Content Analysis?
To describe communication content: text, graphics, audio, video, hypertext
To test hypotheses of message characteristics: “If the source has characteristic A, then messages containing elements x and y will be produced; if the source has characteristic B, then messages with elements w and z will be produced.”
To compare media content to the “real world”: portrayal of some group, phenomenon, trait, etc. is assessed against a standard taken from real life.
To assess the image of particular groups in society: a specific application of describing communication content
To establish a starting point for studies of media effects: content as the input of the effects process

Limitations of Content Analysis
Content analysis alone cannot serve as a basis for making statements about the effects of content on audiences.
The findings of a particular content analysis are limited to the coding scheme used (e.g., operational definition of TV violence).
It is difficult to content analyze “low-incidence” phenomena.
Content analysis is often time-consuming and expensive (i.e., labor intensive).

How to Conduct Content Analysis?
Basic steps in content analysis:
1. Formulate research question or hypothesis
2. Define the study population
3. Select a sample from the population
4. Select and define a unit of analysis
5. Construct classification categories
6. Establish a quantification system
7. Train coders and conduct a pilot study
8. Code the content based on established rules
9. Analyze the coding data
10. Draw conclusions and search for indicators

Step 1. Formulating a Research Question or Hypothesis
Research question or hypothesis is generated based on:
  Existing theory
  Prior research
  Practical issue
Zhu: research question or hypothesis should be:
  Specific
  Falsifiable
  Insightful

Step 2. Defining the Study Population
To specify the boundaries of the body of content:
What should be included and why?
  Topic
  Format (e.g., text, graphic, audio, video, etc.)
  Time period
What should be excluded and why?
  Topic
  Format (e.g., text, graphic, audio, video, etc.)
  Time period

Step 3. Selecting a Sample
A census of the entire population is sometimes desirable and feasible in content analysis.
Sampling in content analysis often involves a multistage procedure, e.g.,
1. Select appropriate media outlets (purposively or randomly)
2. Select appropriate dates (SRS, systematic, or stratified to create “composite week”)
  12-14 issues per year for print media
  2 days per month for broadcast media
3. Select specific section/program (when necessary or appropriate)
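
One way to build a “composite week” is to stratify the dates in the study period by day of the week and draw one date at random from each stratum. The sketch below assumes a hypothetical six-month study period; the function name is illustrative, and the period must span at least seven days so that every weekday stratum is non-empty.

```python
import random
from datetime import date, timedelta

def composite_week(start, end, seed=None):
    """Pick one random date for each day of the week within [start, end]."""
    rng = random.Random(seed)
    by_weekday = {d: [] for d in range(7)}  # 0 = Monday ... 6 = Sunday
    current = start
    while current <= end:
        by_weekday[current.weekday()].append(current)
        current += timedelta(days=1)
    # One draw per stratum, so every day of the week is represented exactly once.
    return [rng.choice(by_weekday[d]) for d in range(7)]

# Hypothetical study period: January through June 2003
week = composite_week(date(2003, 1, 1), date(2003, 6, 30), seed=7)
for day in week:
    print(day.strftime("%A %Y-%m-%d"))
```

Drawing two such weeks per year would give 14 issues, which falls in the 12-14 range listed above for print media.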

Step 4. Selecting a Unit of Analysis
Unit of analysis: the smallest element (which cannot be further divided) of a content analysis
Examples of unit of analysis in content analysis:
  Print media: word/symbol, sentence, paragraph, theme, article, etc.
  Broadcast media: character, act, episode, segment, program, etc.
Unit of analysis should be clearly defined (i.e., by a clear-cut operational definition)
Whenever possible, a lower unit of analysis is preferred because it can be combined into a higher unit of analysis in the future

Step 5. Constructing Coding Categories
Classification categories should be:
Mutually exclusive: a unit of analysis can be placed in one and only one category
  Zhu: when non-exclusivity is either inevitable or desirable, multi-response coding should be used (which creates complications for statistical analysis)
Exhaustive: every unit of analysis can be placed in a category
  When necessary, set up a category of “others” or “miscellaneous” (which should not exceed 10% of the sample)
Reliable: different coders should agree about the proper category for each unit of analysis (see intercoder reliability)

Step 6. Establishing a Quantification System
The quantification system can take any of the four levels of measurement:
  Nominal
  Ordinal
  Interval
  Ratio

Step 7. Training Coders and Doing a Pilot Study
Number of coders used for the same content: 2-6
Coders should not be informed of the research question/hypothesis (i.e., “blinded coding”)
Training coders:
  To revise definitions and categories
  To identify “opinionated” coders (and replace such coders who fail to conform to the rules)
Pilot study: to check intercoder reliability

Step 8. Coding the Content
A standardized coding sheet based on the operational definition and classification rules should be used for coding.
An instruction sheet should be provided as an accompanying instrument for the coding sheet.

Step 9. Analyzing the Data
Univariate analysis:
  Percentage or mean (and standard deviation) of each dependent variable
Bivariate analysis:
  Difference in percentage or mean of each dependent variable between/among groups of an independent variable
  Correlation between the dependent and independent variables
Multivariate analysis:
  Difference in percentage or mean of each dependent variable between/among groups across several independent variables
  Correlation of a dependent variable with several independent variables

Step 10. Interpreting the Results
For descriptive studies: an additional, independent “benchmark” indicator is needed to help interpret the findings
Zhu: Answers to the “why” question almost always come from data outside content analysis
  Example: Zhu (1991)

Reliability of Content Analysis
Reliability: if a content analysis is to be objective, its measurements and procedures must be replicable (i.e., repeated measurement of the same material results in similar conclusions).
Reliability of content analysis is measured by intercoder reliability (i.e., degree of agreement between independent coders).
Poor intercoder reliability indicates problems with:
  coding instrument or instructions
  coder training
  unit of analysis

Intercoder Reliability
Holsti’s formula:

$$R = \frac{2M}{N_1 + N_2}$$

where M is the number of coding decisions on which two coders agree, and N1 and N2 are the numbers of decisions made by the first and second coders, respectively.

Scott’s pi (which offsets agreement by chance):

$$R = \frac{O - E}{1 - E}$$

where O is the observed agreement between two coders and E is the expected agreement (both expressed as proportions), with E given by:

$$E = \frac{1}{P}$$

where P is the number of categories of the variable.
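
Both indices can be computed from two coders’ category codes for the same units. The sketch below uses hypothetical codes for 10 units. It provides the notes’ simplified expected agreement (one over the number of categories) and, for comparison, the pooled-proportion version (the sum of squared category proportions across both coders) that is the usual definition of Scott’s pi; the function names are illustrative.

```python
from collections import Counter

def holsti(coder1, coder2):
    """Holsti's R = 2M / (N1 + N2), with M = number of agreeing decisions."""
    agreements = sum(a == b for a, b in zip(coder1, coder2))
    return 2 * agreements / (len(coder1) + len(coder2))

def chance_corrected(observed, expected):
    """R = (O - E) / (1 - E), with O and E expressed as proportions."""
    return (observed - expected) / (1 - expected)

def expected_equal_categories(num_categories):
    """Simplified expected agreement used in these notes: E = 1 / P."""
    return 1 / num_categories

def expected_pooled(coder1, coder2):
    """Usual Scott's pi definition: sum of squared pooled category proportions."""
    counts = Counter(coder1) + Counter(coder2)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())

# Hypothetical codes assigned by two coders to the same 10 units
coder1 = ["A", "A", "B", "B", "A", "B", "A", "A", "B", "B"]
coder2 = ["A", "A", "B", "A", "A", "B", "A", "B", "B", "B"]

observed = holsti(coder1, coder2)  # 8 agreements out of 10 units
pi_simple = chance_corrected(observed, expected_equal_categories(2))
pi_pooled = chance_corrected(observed, expected_pooled(coder1, coder2))
print(round(observed, 2), round(pi_simple, 2), round(pi_pooled, 2))  # 0.8 0.6 0.6
```

The two versions of expected agreement coincide here only because both categories are used equally often; when categories are unevenly used, the pooled version gives a larger E and therefore a lower pi.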

Validity of Content Analysis
The validity of content analysis results is affected by both sampling method and coding method. The validity can be assessed based on:
  Face validity: if the categories are rigidly and satisfactorily defined and followed
  Concurrent validity: if the results are consistent with similar data obtained from an independent source
  Construct validity
  Predictive validity

Ch 7. Survey Research
Learning Objectives:
What is survey research?
Why do we need to conduct surveys?
  Advantages and disadvantages of survey
  Differences between survey and experiment (Ch. 9)
How to conduct surveys?
  Questionnaire design
  Sampling
  Data collection
  Data analysis

What Is Survey Research?
A survey involves interviewing a sample of respondents by face-to-face, telephone, mail, or other methods to measure their knowledge, attitudes, behavior, and other relevant information, with the aim to project the findings to the study population.
  Descriptive surveys: to describe or document what the current condition is
  Analytical surveys: to explain why the current condition exists

Why Use Survey Research?
Survey research has the following advantages:
  Able to investigate problems in realistic settings
  Reasonable cost given the amount of information gathered
  Able to collect a large amount of information from a variety of people with relative ease
  Not constrained by geographic boundaries
  Zhu: the only valid, reliable, and practical way to gather information representative of the general population

Disadvantages of Survey Research
Real causal factors (i.e., independent variables) cannot be manipulated (see Ch. 9 on experimental research), which makes it difficult, if possible at all, to establish the causal relationship under study.
Inappropriate wording or placement of questions within a questionnaire may bias results.
Survey methods other than face-to-face interviews may not be able to screen out wrong respondents.
It has become increasingly difficult to obtain a high response rate.

Basic Procedure of Survey Research
1. Select the method of interviews
2. Select the sample
3. Construct the questionnaire
4. Pretest the questionnaire
5. Train interviewers
6. Carry out fieldwork
7. Make necessary callbacks/revisits
8. Verify the results (quality control inspection)
9. Tabulate and analyze the data
10. Zhu: calculate response rate

Step 1. Selecting the Method of Interviews
Mail surveys: sending self-administered questionnaires to a sample of respondents with stamped reply envelopes enclosed
Telephone interviews: trained interviewers ask questions and record the answers over the phone; the respondents don’t get to see the questionnaire
  CATI: Computer-Assisted Telephone Interviews
Personal interviews: respondents are interviewed face-to-face at either their home/workplace or a field service location
  CAPI: Computer-Assisted Personal Interviews
Group administered surveys: a group of respondents is gathered together and asked to fill in individual copies of a questionnaire

Comparisons of Data Collection Methods

                          Mail         Phone            Personal         Group
Cost                      Cheapest     2nd cheapest     Most expensive   2nd most expensive
Time                      Slowest      Quickest         2nd slowest      2nd quickest
Selection of respondents  No control   Limited control  Full control     Full control
Interviewer bias          Lowest       2nd highest      Highest          2nd lowest
Assistance available      Minimal      2nd minimal      2nd fullest      Fullest
Response rate             Lowest       2nd highest      Highest          2nd lowest

Step 2. Selecting the Sample
Mail surveys: drawn from the appropriate sampling frame (i.e., mailing list with names and addresses of respondents)
Telephone interviews: drawn from a telephone directory (with necessary modifications, e.g., adding a random digit) or from a random digit dialing program
Personal interviews: commonly drawn from a multistage sampling procedure
Group administered surveys: drawn from a mailing list, a telephone directory (or RDD program), or other sampling procedures

Step 3. Construct the Questionnaire
Basic rules of questionnaire design:
  Understand the goals of the project so that only relevant questions are included
  Questions should be clear and unambiguous
  Questions must accurately communicate what is required from the respondents
  Don’t assume respondents understand the questions they are asked
  Follow Occam’s Razor (the simpler the better)
See later part of this chapter for more details of questionnaire design

Impact of Data Collection Methods on Questionnaire Design
Mail surveys: questions must be easy to read and understand because respondents are unable to seek explanations
Telephone interviews: response options for all questions must be fewer and shorter than in other methods
Personal interviews: the interviewers must tread lightly with sensitive and personal questions because their physical presence may make the respondent less willing to answer

Step 4. Pretest the Questionnaire
All questionnaires must be tested at least once before being put into formal use
A pretest can be done among:
  a focus group
  an informal sample of 10-20 people
Particular attention should be paid to how easily the respondents in the pretest understand the questions

Step 5. Train Interviewers
All interviewers, experienced or beginner, should be trained
Training should focus on:
  what the questions are about
  how to ask the questions (i.e., reading out exactly the original wording and instruction without any personal interpretation)
  how to probe follow-up questions (e.g., asking “what else” at least once after the respondent gives an answer to open-ended questions)
  how to take answers from respondents (i.e., writing down exactly the original wording without any personal interpretation)

Step 6. Carry out the Fieldwork
Weekends and evenings of weekdays are generally the preferred times for interviews
Long public holidays should be avoided
Usual duration of the fieldwork for a sample of 1,000:
  Mail: 2 months (with 3 rounds of mails)
  Telephone: 1-2 weeks
  Personal: 2-4 weeks
Excessively long fieldwork may introduce unexpected events into the survey

78

Step 7. Make Necessary Callbacks/Revisits
Zhu: Any non-contact respondent should be called back or revisited 3-5 times
Callbacks/revisits are a necessary and effective means of improving the response rate
Must call back or revisit:

  Those who were contacted but not available for interview, with or without an appointment made
  Those who have never been contacted

Optional callback or revisit:
  Those who broke off the interview
  Those who refused the interview

79

Step 8. Verify the Results
Supervisor(s) call back/revisit a subsample (10-20%) of the completed cases of each interviewer to ensure the interviews took place.
Zhu: the quality control (QC) verification usually focuses on a few specific details of the survey that are easy to remember, e.g.,

  method of the interview (telephone or personal)
  sex of the interviewer (male or female)
  general topic of the questions
  type of the gifts (if any)

80

Step 9. Analyze the Data
Data cleaning: run frequencies to identify missing, illegal, or illogical values, and recode all necessary values
Demographic distribution: run frequencies on age, sex (or other key variables) and compare the results with population census data (and weight the sample if necessary)
Formal analyses:

  Univariate
  Bivariate
  Multivariate

81

Step 10. Calculate the Response Rate (Zhu)
Survey response rate (SRR) is the most important indicator of the quality of a survey.
The American Association for Public Opinion Research (AAPOR) has published standard formulas for calculating SRR so that different surveys can be compared (see Ke, Zhu, & Sun, 2003, pp. 130-113).
Of the 6 AAPOR formulas, RR4 represents the best balance between the most conservative (which tends to underestimate the real SRR) and the most liberal (which tends to overestimate the SRR).

82

Calculation of RR4

$$RR4 = \frac{I + P}{(I + P) + (R + NC + O) + e(UH + UO)}$$

where I = completed interviews, P = partially completed interviews, R = refusals and break-off cases, NC = non-contacts, O = other eligible but unsuccessful cases, UH = households status unknown, UO = other unknown households, and e = estimated proportion of eligible but unknown cases, which is often estimated by:

$$e = \frac{I + P + R + NC + O}{I + P + R + NC + O + NE}$$

where NE is cases known to be non-eligible (e.g., non-residential units, unqualified individuals, etc.).
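As a rough illustration of the RR4 calculation, the sketch below computes e and RR4 from case counts. The function name `rr4` and all the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from any real survey.

```python
def rr4(i, p, r, nc, o, uh, uo, ne):
    """Return (e, RR4) from case counts, following the slide's definitions.

    i = completed, p = partial, r = refusals/break-offs, nc = non-contacts,
    o = other eligible but unsuccessful, uh/uo = unknown cases,
    ne = cases known to be non-eligible.
    """
    eligible = i + p + r + nc + o
    # e: estimated share of eligible cases among the unknown cases
    e = eligible / (eligible + ne)
    rate = (i + p) / ((i + p) + (r + nc + o) + e * (uh + uo))
    return e, rate


# Hypothetical fieldwork outcome (illustrative numbers only)
e, rate = rr4(i=600, p=50, r=200, nc=100, o=50, uh=150, uo=50, ne=250)
print(f"e = {e:.2f}, RR4 = {rate:.3f}")
```

With these made-up counts, 1,000 of the 1,250 cases of known status are eligible, so e = 0.80, and the 200 unknown cases add 160 estimated eligible cases to the denominator.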

83

How to Design a Questionnaire?
  Question wording
  Question types
  Question formats
  Introduction
  Screener/filter questions
  Instructions for interviewers and respondents
  Question order
  Questionnaire length

84

Guidelines for Question Wording
  Make questions clear
  Keep questions short
  Include only relevant questions
  Do not ask double-barreled questions (i.e., involving two or more questions in one sentence)
  Avoid biased words or terms
  Avoid leading questions (i.e., suggesting a certain response or hidden premise)
  Do not ask for highly detailed information
  Avoid potentially embarrassing questions

85

How to Determine Which Questions to Ask in a Survey? (Zhu)

[Diagram of question categories: Personal Background (Control), Media Access (IV), Media Exposure (IV), Knowledge (DV), Attitudes (DV), Behavior (DV)]

CV = Control variable; IV = Independent variable; DV = Dependent variable

86

Question Types
Open-ended questions: requiring respondents to generate their own answers

  Flexible
  Time consuming
  Need to use content analysis to process

Closed-ended questions: requiring respondents to select an answer from a list of predetermined options

  Easy to quantify
  Rigid
  Need an “Other” category for unforeseen answers

87

Common Formats of Questions
Multiple choice
  Check list
  Forced choice (from a pair of statements)
Rating scales
  Likert scale
  Semantic differential
  Feeling thermometer
Rank ordering
Fill in blanks

88

Introduction of Questionnaire
To inform the respondent about:

  the survey organization
  the (general) purpose of the survey
  the duration of the survey
  the anonymity and confidentiality of the respondent

Characteristics of a successful introduction (Backstrom and Hursh-Cesar, 1986):

  short
  realistically worded
  nonthreatening
  serious
  neutral
  pleasant but firm

89

Screener/Filter Questions
To exclude unqualified respondents from all or part of the questionnaire

  Screener for exclusion from the entire questionnaire: placed right after the introduction section
  Screener for skips over section(s) of the questionnaire: placed before the section(s) to be skipped

90

Instructions
To explain how to answer the questions

Instructions for the interviewer (in telephone or personal interviews):

  provided as much as possible
  usually typed in capital letters and enclosed in brackets or boxes to be distinguished from instructions for the respondent

Instructions for the respondent (particularly important for self-administered questionnaires):

  used only when necessary to avoid confusion
  whenever possible, providing examples for illustration

91

Question Order
Generally, the following order is recommended:

  Start with simple, easy, and general (i.e., “warm-up”) questions
  Questions serving as dependent variables generally go before questions serving as independent variables (to avoid “contamination” or priming effects)
  Do not ask knowledge questions at the beginning (to avoid embarrassment)
  Place demographic, personal, and other sensitive questions at the end of the questionnaire

92

Questionnaire Length
How long is too long for a questionnaire?

  when there are 10% or more breakoffs (i.e., respondents who drop out before the end of the interview)

Wimmer and Dominick’s recommended maximum length:

  Self-administered mail or group survey: 60 min.
  Face-to-face interview: 60 min.
  Telephone interview: 20 min.

Zhu: Questionnaires longer than half of the above generally result in high breakoffs among Chinese respondents.

93

A Checklist for Questionnaire Design (Zhu)
1. Are the answers in multiple choice questions complete and mutually exclusive?
2. Is a mid-ground position provided in the answers of attitudinal questions?
3. Are the questions and answers clearly and unambiguously expressed?
4. Are explanations offered for abstract concepts or technical terms?
5. Are the questions and answers too long?
6. Is there any “double-barreled” question?
7. Are similar questions placed together?
8. Are questions ordered from easy to difficult and from general to specific?
9. Are personal questions asked at the end?
10. Are screener questions and continued questions appropriately linked?

94

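What to Include in a Survey Report?

| Section | News Release | Business Report | Academic Paper |
|---|---|---|---|
| Abstract | No | Yes | Yes |
| Introduction | Yes | Yes | Yes |
| Literature Review | No | Optional | Yes |
| Methods | Yes | Yes | Yes |
| Results | Yes | Yes | Yes |
| (Statistical Test) | No | Optional | Yes |
| Discussion | No | Yes | Yes |
| References | No | Optional | Yes |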

95

Ch 8. Longitudinal Research
Learning Objectives:

What is longitudinal research?
  Differences from cross-sectional research
Why do we need longitudinal research?
  Requirements for causal relationships
How do we conduct longitudinal research?
  Trend study
  Cohort analysis
  Panel study

96

What Is Longitudinal Research?
Unlike cross-sectional research, which collects data from a representative sample at only one point in time, longitudinal research collects data from the same sample or different samples at different points in time.
Longitudinal research is not a new method of data collection or statistical analysis, but a new research design involving existing methods of data collection and analysis.

97

Why Use Longitudinal Research? (Zhu)

Compared with cross-sectional research, longitudinal research enables the researcher to:

  Identify changes over time
  Trace the time order of a causal relationship

Requirements for causality:
  Time order between independent and dependent variables
  Association of the two variables
  Exclusion of all other alternative explanations

98

How to Conduct Longitudinal Research?
Trend studies: a topic is restudied using different samples drawn from the same population
Cohort analysis: tracking specific age cohorts as they change over time
Panel studies: the same sample of people is measured at different points in time

  Retrospective panel: members of a cross-sectional sample reconstruct their past by recall
  Follow-back panel: a cross-sectional sample is compared with corresponding archival data
  Catch-up panel: a cross-sectional sample from the past is compared with current data available from other sources

99

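Classification of Longitudinal Research (Zhu)

| Unit of Analysis | Number of Time Points: 2-30 | Number of Time Points: 30+ |
|---|---|---|
| Sample | Trend study | Time series analysis |
| Group | Cohort analysis | Time series analysis |
| Individual | Panel study | Time series analysis |

Source: Ke, Zhu & Sun (2003), Ch. 15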

100

Comparison of Longitudinal Designs

| | Trend Studies | Cohort Analysis | Panel Studies |
|---|---|---|---|
| Advantages | Establish long-term patterns; allow secondary analysis | Detect the effects of maturation and social changes | Able to identify dynamic changes |
| Disadvantages | Vulnerable to changes in sample or measurement | Difficult to separate age, cohort and period; vulnerable to sample mortality | Costly; high attrition; sensitization to research instrument |

101

Ch 9. Experimental Research
Learning Objectives:

What is an experiment?
Why do we need experiments?
  Requirements for causality
How to design and conduct an experiment?
  How to select the appropriate design?
  How to manipulate the independent variable?
  How to randomize the subjects?

102

What Is an Experiment?
The classic experiment (“controlled laboratory experiment”) is a research procedure in which subjects are randomly assigned to experimental vs. control conditions so that the effects of the experimental stimulus (“manipulation”) can be directly tested. (Zhu)
A quasi-experiment does not involve random assignment of subjects to experimental groups.
A field experiment takes place outside lab settings to mimic real life in natural settings.

103

Why Use (Lab) Experimental Research?
It helps establish causality (i.e., cause and effect).
It allows control for confounding effects.
It costs relatively less than other methods.
It is easy to replicate.
As the first two merits (establishment of causality and control for confounding effects) are the most central in scientific research, the experiment is considered “the most rigorous method”.

104

Procedure of Experimental Research
1. Select the experimental setting
2. Select the experimental design
3. Operationalize the variables
4. Decide how to manipulate the independent variables
5. Select and assign subjects to experimental conditions (i.e., randomization)
6. Conduct a pilot study
7. Administer the experiment
8. Analyze and interpret the results

105

Step 1. Select the Experimental Setting
Experimental settings:

  Controlled laboratory settings (to ensure internal validity)
  Natural settings (to ensure external validity)

106

Step 2. Select the Experimental Design
“True” experimental designs

  Posttest-only design with control group
  Pretest-posttest design with control group
  Solomon design with four groups
  Factorial design

Other experimental designs
  Repeated measures design without control group (similarly, interrupted time series design)
  Pretest-posttest design with nonequivalent control group (i.e., without randomization)

107

Step 3. Operationalize the Variables
Independent variable: experimental stimulus to apply to the subjects
Dependent variable(s): responses from the subjects (before and) after being exposed to the stimulus

  Attention (e.g., secondary tasks such as a button push)
  Knowledge (e.g., recall)
  Attitudes or perceptions (e.g., questionnaire)
  Behavior (e.g., display of imitative actions)

108

Step 4. Manipulate the Independent Variable

Develop a set of specific instructions, events or stimuli for presentation to the subjects:

  Straightforward manipulation: written materials, verbal instructions, or other stimuli are presented to the subjects.
  Staged manipulation: the researcher constructs events or circumstances (e.g., by using a “confederate” who pretends to be a subject) to manipulate the independent variable.

In general, the manipulation should be as strong as possible to maximize potential differences between the experimental groups.

109

Step 5. Select and Assign the Subjects
Selection of experimental subjects: ideally, subjects should be randomly selected from the study population to ensure external validity
Assignment of experimental conditions: the chosen subjects should be randomly assigned to the experimental condition(s) and the control condition to eliminate (or minimize) confounding effects that exist among subjects:

  Randomization:
  Matching:
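One simple way to carry out random assignment is to shuffle the subject list and deal subjects out to the conditions in turn. The sketch below assumes 20 hypothetical subject IDs and two conditions; the function name `assign_conditions` and the seed are illustrative only.

```python
import random


def assign_conditions(subjects, conditions, seed=None):
    """Randomly assign subjects to conditions in (nearly) equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # random order removes any systematic pattern
    groups = {condition: [] for condition in conditions}
    for position, subject in enumerate(shuffled):
        # deal subjects out to the conditions in rotation
        groups[conditions[position % len(conditions)]].append(subject)
    return groups


# 20 hypothetical subject IDs, one experimental and one control condition
groups = assign_conditions(range(1, 21), ["experimental", "control"], seed=42)
for condition, members in groups.items():
    print(condition, sorted(members))
```

Fixing the seed makes the assignment reproducible for documentation; in an actual study the seed would simply be recorded, not chosen to produce a preferred split.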

110

Step 6. Conduct a Pilot Study
A pilot study with a small number of subjects helps reveal problems with the stimuli and/or measurement, especially whether the manipulation of the independent variable is strong enough to have the intended effects.

111

Step 7. Administer the Experiment
Formally carry out the main phase of the experiment:

  Have the subjects read and sign a “consent form” (required by the human subjects review committee)
  Randomize the subjects
  Apply the manipulation (i.e., stimulus)
  Measure subjects’ responses
  Debrief the subjects at the end of the experiment to inform them of the real purpose and potential implications of the study

112

Step 8. Analyze and Interpret the Results

Given the specific levels of measurement used (e.g., nominal scale for the independent variable and interval/ratio scale for the dependent variable), experimental data are mostly analyzed with:

  t-Test (when there are two groups involved)
  ANOVA (when there are three or more groups)
  MANOVA (when there are several parallel or repeatedly measured dependent variables)

113

Ch 10. Introduction to Statistics
Learning Objectives:

What is descriptive statistics?
  Distributions vs. summary statistics
What is inferential statistics?
  Sampling distribution
How to calculate descriptive statistics?
  Sample distributions
  Central tendencies
  Dispersions
  Normal curve
  z-score

114

What Is Descriptive Statistics?
Descriptive statistics: statistical methods and techniques used to reduce data to allow for easier interpretation.

Zhu: statistical indicators used to describe quantitative characteristics of a sample

Key descriptive statistics:
  Distribution of variables: frequencies
  Central tendency of variables: mean, median, mode
  Dispersion of variables: variance, standard deviation

115

What Is Inferential Statistics (Zhu)?
Inferential statistics are indicators used to estimate quantitative characteristics (i.e., parameters) of a population from relevant descriptive statistics of a corresponding sample
Key inferential statistics:

  Sampling distributions
  Sampling errors (also known as “standard errors”)
  Confidence levels
  Confidence intervals

116

Sample Distributions
Distribution: a collection of numbers.

Zhu: a collection of all possible values of a variable observed/measured from a sample

A distribution can be described by a frequency table that contains:

  Values
  Counts (frequencies)
  Percentage
  Valid Percentage
  Cumulative Percentage

A distribution can also be described by a graphic chart:
  Bar chart (for nominal or ordinal scale variables)
  Histogram (for interval or ratio scale variables)

117

Shapes of Distribution
Skewness of a distribution:

  Right skewness: the tail of the curve trails off to the right of the distribution
  Left skewness: the tail of the curve trails off to the left of the distribution
  Normal distribution: the two halves of the curve are identical (i.e., symmetrical)

Zhu: normal distribution has desirable mathematical properties for many statistical analyses; some skewed distributions can be transformed to become approximately normal

118

Statistics of Central Tendency
Central tendency uses a single number (i.e., a “statistic”) to describe the “typical” or “average” feature of a sample distribution
Common central tendency statistics:

  Mean: the arithmetical average of a distribution (i.e., the sum of all scores divided by N), which is the most frequently used
  Median: the midpoint of a distribution, with half of the scores above and half below it
  Mode: the score(s) that occur(s) most frequently

119

How to Calculate Mean?
Based on original scores:

$$\bar{X} = \frac{\sum X}{n}$$

where $\bar{X}$ (read “x bar”) is the mean; $\sum$ is the summation symbol, X is any score of the distribution, n is the number of cases (i.e., sample size)

Based on aggregated (i.e., grouped) scores:

$$\bar{X} = \frac{\sum fX}{n}$$

where f is the frequency of each given interval (or group), X is the midpoint of that interval or group
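The two ways of calculating the mean, together with the median and mode, can be sketched in a few lines. The scores and the grouped intervals below are made-up illustrations, not data from the textbook.

```python
from statistics import median, mode

# Mean from original scores: sum of X divided by n
scores = [2, 4, 4, 5, 7, 8]
mean_raw = sum(scores) / len(scores)

# Mean from grouped scores: sum of f * X divided by n,
# where X is the midpoint of each interval and f its frequency
grouped = [(5, 3), (15, 5), (25, 2)]  # (midpoint, frequency), hypothetical
n_grouped = sum(f for _, f in grouped)
mean_grouped = sum(f * x for x, f in grouped) / n_grouped

print(mean_raw)        # 5.0
print(mean_grouped)    # 14.0
print(median(scores))  # 4.5
print(mode(scores))    # 4
```

The grouped mean is only an approximation of the mean of the underlying scores, because every case in an interval is treated as if it sat at the midpoint.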

120

Statistics of Dispersions
Dispersion statistics describe the variability, “spread-out,” or deviation from the central tendency of a distribution
Common dispersion statistics:

  Range (difference between maximum and minimum)
  Variance
  Standard deviation

121

How to Calculate Variance & Standard Deviation?

Variance (S²):

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Standard Deviation (S):

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

Zhu: Notice that variance is just the squared standard deviation; standard deviation is the square root of variance. While variance is expressed in squared units of the original scores, standard deviation takes the same unit as the original scores (X).
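A small worked example may help to see how the two formulas relate. The scores are hypothetical; the hand calculation is checked against Python's `statistics.variance` and `statistics.stdev`, which also use the n - 1 denominator.

```python
import math
from statistics import stdev, variance

scores = [2, 4, 4, 5, 7, 8]  # hypothetical scores
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean, divided by n - 1
sum_sq_dev = sum((x - mean) ** 2 for x in scores)
s2 = sum_sq_dev / (n - 1)  # variance
s = math.sqrt(s2)          # standard deviation

print(s2, round(s, 3))
print(variance(scores), round(stdev(scores), 3))  # same values from the library
```

Here the squared deviations add up to 24, so the variance is 24 / 5 = 4.8 and the standard deviation is its square root.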

122

Standard Scores (z-Scores)
z-scores are transformed values from the original scores based on the mean and standard deviation:

$$z = \frac{X - \bar{X}}{s}$$

z-scores help compare scores obtained from totally different methods (i.e., measurement units) because all z-scores have a mean of 0 and a standard deviation of 1. A particular z-score tells how many standard deviations the original score is above or below the mean of the sample.
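The claim that z-scores always have a mean of 0 and a standard deviation of 1 is easy to check numerically. The sketch below standardizes a hypothetical set of scores; the helper name `z_scores` is made up for this illustration.

```python
from statistics import mean, stdev


def z_scores(scores):
    """Convert raw scores to z-scores using the sample mean and std dev."""
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]


scores = [2, 4, 4, 5, 7, 8]  # hypothetical scores
z = z_scores(scores)

print([round(value, 2) for value in z])
print(round(mean(z), 6), round(stdev(z), 6))  # close to 0 and 1
```

The same transformation applied to a variable measured in minutes and another measured in dollars puts both on a common scale, which is what makes the comparison across measurement units possible.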

123

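Standard Normal Curve (i.e., Distribution of z-Scores)

[Figure: standard normal curve, P(z) plotted against z. Each half of the curve contains 50% of the cases; 34.1% of the cases fall between z = 0 and z = 1 (and likewise between z = 0 and z = -1), 13.6% between z = 1 and z = 2 (and likewise between z = -1 and z = -2), so 95.4% fall between z = -2 and z = 2.]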

124

Sampling Distribution
Distributions:

  Sample distribution: the collection of all values of a sample (actually measured), with a fixed sample size (n)
  Population distribution: the collection of all values of a corresponding population (possibly but unlikely to be measured), with a fixed population size (N)
  Sampling distribution: the collection of all possible values of a statistic (i.e., $\bar{X}$) that would occur if all possible samples of a fixed size (n) were taken from the population. Sampling distribution is a virtual (i.e., non-existent) distribution, with an infinite number of cases in it. Sampling distribution is the basis to estimate population parameters from sample statistics.
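Although a sampling distribution cannot be observed directly, it can be approximated by simulation. The sketch below builds a hypothetical population of 10,000 scores, draws 2,000 random samples of n = 25, and compares the spread of the sample means with σ divided by the square root of n. All numbers (population size, mean of 50, standard deviation of 10, seed) are arbitrary assumptions.

```python
import random
from statistics import mean, pstdev, stdev

rng = random.Random(2003)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 individual scores
population = [rng.gauss(50, 10) for _ in range(10_000)]
sigma = pstdev(population)  # population standard deviation

# Approximate the sampling distribution of the mean with 2,000 samples
n = 25
sample_means = [mean(rng.sample(population, n)) for _ in range(2_000)]

observed_se = stdev(sample_means)  # spread of the sample means
expected_se = sigma / n ** 0.5     # sigma divided by sqrt(n)

print(round(mean(population), 2), round(mean(sample_means), 2))
print(round(observed_se, 2), round(expected_se, 2))
```

The two printed pairs should be close to each other: the sample means center on the population mean, and their standard deviation is roughly one fifth of σ because n = 25.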

125

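Comparison of Three Distributions (Zhu)

| | Sample Distribution (Statistic) | Population Distribution (Parameter) | Sampling Distribution |
|---|---|---|---|
| Unit | An individual | An individual | A sample |
| Size | n | N | ∞ |
| Mean | $\bar{X}$ | μ | μ |
| Std Dev | s | σ | $se = \sigma / \sqrt{n}$ |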

126

How to Calculate Standard Error?
For nominal variables:

$$se = \sqrt{\frac{p(1 - p)}{n}} \times z$$

which has been presented in Ch. 4 under “sampling error”

For interval/ratio variables:

$$se = \sqrt{\frac{s^2}{n}} \times z$$

where s² is the variance of the variable in the sample

Zhu: Generalized from the above formulas, we can conclude that the standard error of a sampling distribution is simply the square root of the sample variance divided by the sample size, multiplied by z as the adjustment for a given confidence level. For a nominal variable, p(1 - p) plays the role of the variance.
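The two standard error formulas can be sketched as follows, with the z adjustment obtained from the standard normal distribution for a chosen confidence level. The function names and the example inputs (p = .5 with n = 1,000; variance of 100 with n = 400) are hypothetical.

```python
import math
from statistics import NormalDist


def z_for_confidence(level):
    """Two-tailed z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def se_proportion(p, n, level=0.95):
    """sqrt(p(1 - p) / n) times z, for a nominal variable."""
    return math.sqrt(p * (1 - p) / n) * z_for_confidence(level)


def se_mean(s2, n, level=0.95):
    """sqrt(s^2 / n) times z, for an interval/ratio variable."""
    return math.sqrt(s2 / n) * z_for_confidence(level)


error_p = se_proportion(p=0.5, n=1000)
error_m = se_mean(s2=100, n=400)
print(round(z_for_confidence(0.95), 2))  # 1.96
print(round(error_p, 3))                 # about 0.031, i.e., +/- 3.1 points
print(round(error_m, 2))                 # about 0.98 in the variable's unit
```

The first result is the familiar "plus or minus 3 percentage points" quoted for polls of about 1,000 respondents at the 95% confidence level.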

127

Ch 11. Hypothesis Testing
Learning Objectives:

What is a statistical hypothesis (as compared with a research question)?
Why do we need to set up a hypothesis (instead of a research question)?
How to test a hypothesis (the 5-step procedure)?

128

What Is a Statistical Hypothesis?
Unlike a research question (which is informal, general, exploratory, and preliminary), a statistical hypothesis is formal, specific, explanatory, and predictive.
A hypothesis is a tentative generalization about the relationship between two or more variables that predicts an outcome.
Zhu: a good hypothesis needs to specify

  the nature of the relationship (correlated or causal)
  the direction of the relationship (positive or negative)
  the form of the relationship (linear or nonlinear)
  the strength of the relationship (strong or weak)

129

Why Do We Need Hypotheses?
Major benefits of setting up a hypothesis (instead of a research question):

  It provides direction for a study; without it, research lacks focus and clarity
  It eliminates trial-and-error research that is time consuming and wasteful
  It helps rule out intervening and confounding variables
  It allows for quantification of variables; words that cannot be quantified cannot be included in a hypothesis

130

What Constitutes a Useful Hypothesis?
It should be compatible with current knowledge in the area. Any hypothesis that challenges existing knowledge should have a compelling reason.
It should be logically consistent (i.e., “if A = B and B = C, then A = C”, or Aristotle’s syllogism).
It should be stated concisely
It should be testable (i.e., falsifiable)

Zhu: the above prescribes a “useful” (i.e., functional) hypothesis; see my previous criteria for a “good” hypothesis

131

Research Hypothesis vs. Null Hypothesis

The null hypothesis (also called the “hypothesis of no difference”) asserts that the statistical differences or relationships under study are due to chance or random error (i.e., sampling error).
The null hypothesis (denoted as H0) forms a logical alternative to the research hypothesis (H1) under test
Zhu: the null hypothesis aims to prevent existing knowledge from being easily challenged

132

Procedure of Hypothesis Testing (Zhu)
1. Specify the research hypothesis and the corresponding null hypothesis
2. Select the appropriate statistical test
3. Select the minimally acceptable significance level
4. Collect the required data and perform the chosen test
5. Make a decision on the acceptance (or rejection) of the research hypothesis based on the test results

133

Step 1. Specifying the Hypotheses (Zhu)
Generally, there are two types of research hypotheses:

Difference between or among groups:
  H0: $\bar{X}_1 = \bar{X}_2 = \cdots$
  H1: $\bar{X}_1 \neq \bar{X}_2 \neq \cdots$, or better yet, H1: $\bar{X}_1 > \bar{X}_2 > \cdots$

Relationship between two or more variables:
  H0: $\beta = 0$
  H1: $\beta \neq 0$, or better yet, H1: $\beta > 0$

Note that a difference and a relationship are mathematically equivalent (i.e., a significant difference between groups is the same as a significant relationship between the group variable and the outcome variable)

134

Step 2. Select Statistical Test (Zhu)

| Dependent Variable (DV) | IV: Nominal | IV: Ordinal | IV: Interval/Ratio |
|---|---|---|---|
| Nominal | Crosstabs + Chi-square | Crosstabs + Chi-square | Crosstabs + Chi-square* |
| Ordinal | Crosstabs + Chi-square | Spearman Correlation | Spearman Correlation* |
| Interval/Ratio | t- or F-Test | F-Test | Correlation or Regression |

IV = Independent Variable
*After recoding the IV into an ordinal scale

135

Step 3. Determine the Significance Level (Zhu)

The significance level is the minimally acceptable probability of error for the hypothesis testing

  Significance level (denoted as α): predetermined by the researcher, commonly set at α = 0.05 (i.e., an error of 5% in rejecting H0), α = 0.01 (an error of 1%), or α = 0.001 (an error of 0.1%)
  Probability level (denoted as p): the probability actually obtained in the data analysis, which could be any value from 0 to 1
  Confidence level: the complement of the significance level (i.e., when α = 0.05, confidence level = 95%; when α = 0.01, confidence level = 99%; and when α = 0.001, confidence level = 99.9%)

136

Step 4. Collect and Analyze the Data (Zhu)

Data collection (see Ch. 6-9)
Statistical analysis based on the choice made in Step 2:

Difference between/among groups:
  Crosstabulation (i.e., Chi-square analysis)
  t-test (for two groups)
  ANOVA (F-test, for three or more groups)

Relationship between variables:
  Correlation analysis (for non-directional relationships)
  Regression analysis (for directional or causal relationships)

137

Step 5. Make the Statistical Decision
The decision is made by simply comparing the resulting p and the predetermined α:

  If p ≥ α, we accept (i.e., fail to reject) H0 (i.e., the null hypothesis); consequently, we reject H1 (the research hypothesis)
  If p < α, we reject H0; consequently, we accept H1
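The decision rule is mechanical enough to write down as a tiny function. The sketch below also derives the two-tailed critical z value for α = .05, which marks where the regions of rejection begin on a standard normal curve. The function name `decide` and the example p values are hypothetical.

```python
from statistics import NormalDist


def decide(p, alpha=0.05):
    """Apply the decision rule to an obtained p and a preset alpha."""
    if p < alpha:
        return "reject H0, accept H1"
    return "retain H0, reject H1"


# Two-tailed test: alpha is split into alpha/2 in each tail
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

print(decide(0.03))          # reject H0, accept H1
print(decide(0.05))          # retain H0, reject H1 (p is not below alpha)
print(round(critical_z, 2))  # 1.96
```

Note the boundary case: a p exactly equal to α falls on the "retain H0" side of the rule as stated above.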

138

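Regions of Rejection for α = .05 (Two-tail)

[Figure: sampling distribution centered on µ. The region of retention lies in the middle, with .475 (47.5%) of the area on each side of µ; a region of rejection (= α/2) of .025 (2.5%) lies in each tail. Each half of the curve contains .50 (50%) of the area.]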

139

Errors in Hypothesis Testing
When we make a decision (i.e., either to accept or reject H0), we run the risk of committing one of two types of errors:

  Type I error: the rejection of a true null hypothesis that should be accepted. Because it happens in the region of rejection (= α), Type I error is called “alpha error.” Type I error is under the direct control of the researcher. To reduce the error, the researcher can simply set α closer to zero.
  Type II error: the acceptance of a false null hypothesis that should be rejected. Type II error, called “beta error,” is the reverse of Type I error.

Zhu: between the two, Type I error is generally more serious and thus should be prevented as the first priority

140

Possible Results in Testing an H0

| Reality | Decision Made: Reject H0 | Decision Made: Accept H0 |
|---|---|---|
| H0 is true | Type I error (= α) | Correct (= 1 - α = Confidence Level) |
| H0 is false | Correct (= 1 - β = Power of Analysis) | Type II error (= β) |

141

Possible Results in Testing an H1

| The difference/relationship in the sample is: | In the population it doesn’t exist | In the population it exists |
|---|---|---|
| Significant | Type I error | Correct |
| Non-significant | Correct | Type II error |

142

Power Analysis
Power of a test refers to the probability of rejecting a null hypothesis when it is false

Zhu: the relevant statement on pp. 274-5? is wrong
Zhu: power analysis helps the researcher to determine, given i) the predetermined α level and ii) the possible size of the observed difference/relationship:
1. the necessary sample size (to reject a false H0), or
2. the minimal power of the test (to avoid β error)
Power is commonly set to .80
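As a rough sketch of the first use of power analysis, the code below estimates the sample size needed per group to detect a standardized mean difference between two groups. It relies on a common normal-approximation formula, n = 2((z for 1 - α/2 + z for power) / d)², which is an assumption brought in for illustration and not a formula from the textbook; exact t-based calculations give slightly larger numbers. The function name `n_per_group` and the effect sizes are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed, two-group comparison.

    effect_size is the expected mean difference in standard deviation units.
    Uses the normal approximation, so treat the result as a lower bound.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # e.g., 1.96 for alpha = .05
    z_power = z.inv_cdf(power)          # e.g., 0.84 for power = .80
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_medium = n_per_group(0.5)  # medium-sized difference
n_large = n_per_group(0.8)   # large difference
print(n_medium, n_large)
```

The smaller the expected difference, the larger the sample needed to reach the same power at the same α.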

143

Ch 12. Basic Statistical Procedures
Learning Objectives:

What are the following statistical tests?
  crosstabulation analysis and Chi-square test
  t-test and ANOVA
  correlation and regression analysis
Why should we use any of the above tests (instead of all others)?
How can we perform these tests and interpret their results?

144

Why Do We Need Statistical Tests?
Statistical tests are necessary with the scientific method of knowing.

  In order to obtain valid and reliable results, any data must be analyzed using some type of statistical method.
  Statistics is how we advance our knowledge of everything.

Otherwise, the data will be analyzed based on the methods of intuition, tenacity, or authority, which generate results that cannot be verified.

145

What Statistical Test Should I Use?
It depends on the measurement level of your variables

| Dependent Variable (DV) | IV: Nominal | IV: Ordinal | IV: Interval/Ratio |
|---|---|---|---|
| Nominal | Crosstabs-Chi-square | Crosstabs-Chi-square | Crosstabs-Chi-square* |
| Ordinal | Crosstabs-Chi-square | Spearman Correlation | Spearman Correlation* |
| Interval/Ratio | t- or F-Test | F-Test | Correlation or Regression |

IV = Independent Variable
*After recoding the IV into an ordinal scale

146

Where to Find the Relevant Tool in SPSS?

| Statistical Test | SPSS Procedure |
|---|---|
| Chi-square test | Analyze/Descriptive/Crosstabs |
| t-test | Analyze/Compare Means/Paired Samples Tests |
| One-way ANOVA | Analyze/Compare Means/One-Way ANOVA |
| Multi-way ANOVA | General Linear Model/Univariate |
| Spearman & Pearson correlation | Correlate/Bivariate |
| Simple & multiple regression | Regression/Linear |

147

Crosstabs Analysis & Chi-square Test
Crosstabulation analysis displays, in a 2-way table format, either of the following:

  the difference in a nominal/ordinal DV between/among groups of a nominal/ordinal IV, or
  the relationship between a nominal/ordinal IV and a nominal/ordinal DV

The Chi-square (χ²) test provides the significance test of the null hypothesis underlying the crosstabulation that

  the observed difference doesn’t exist in the population, or
  the observed relationship doesn’t exist in the population

148

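Construction of a 2-Way Crosstable (Zhu)

| DV | IV = 1 | IV = 2 | IV = 3 | Total (optional) |
|---|---|---|---|---|
| 1 | ?% | ?% | ?% | ?% |
| 2 | ?% | ?% | ?% | ?% |
| Total | 100% | 100% | 100% | 100% |
| N | ? | ? | ? | ? |

χ² = ?, df = 2, p < .?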

149

Rules for Cross-tables (Zhu)

Put the IV in columns, with one column for each group of the IV
Put the DV in rows, with one row for each group of the DV
Show the “column percent” in each cell (i.e., dividing the number of cases in each of the cells by the total number of cases in the corresponding column)
Show column totals (100% and the number of cases) at the bottom
Optionally, show the “sample percent” in the last column
Show Chi-square test results (χ² value, degrees of freedom, and p level) below the table

150

Interpretation of Chi-square Test Results (Zhu)

When the resulting p-level is equal to or greater than the predetermined α-level (e.g., .05), retain the H0 and consequently reject the H1

When p is smaller than α, reject the H0 and consequently accept the H1
Remember that the H0 states either of the following:

  There is no difference in the DV between/among the groups of the IV in the population
  There is no relationship between the IV and DV in the population
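The construction rules and the Chi-square test can be illustrated with a small 2 x 3 table. The cell counts below are invented. The p value uses the fact that, for df = 2 only, the upper-tail probability of the Chi-square distribution equals exp(-χ²/2); for other degrees of freedom a statistical table or package would be needed.

```python
import math

# Hypothetical counts: rows are DV categories, columns are IV groups
observed = [
    [30, 20, 10],  # DV = 1
    [20, 30, 40],  # DV = 2
]

col_totals = [sum(col) for col in zip(*observed)]
row_totals = [sum(row) for row in observed]
n = sum(row_totals)

# Column percent: each cell divided by its column total
col_percent = [
    [100 * cell / col_totals[j] for j, cell in enumerate(row)]
    for row in observed
]

# Chi-square: sum of (observed - expected)^2 / expected over all cells,
# where expected = row total * column total / N
chi2 = 0.0
for i, row in enumerate(observed):
    for j, cell in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (cell - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
p = math.exp(-chi2 / 2)  # valid for df == 2 only

print(col_percent)
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p:.4f}")
```

In this made-up table the share of DV = 1 falls from 60% to 40% to 20% across the three IV groups, and p is far below .05, so the H0 of no difference would be rejected.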

151

t-Test and ANOVA (Zhu)
Both tests examine the difference in an interval/ratio DV between/among groups of a nominal/ordinal IV:

  t-test is used when the IV involves two groups
  ANOVA (Analysis of Variance, based on F-test) is used when the IV involves three or more groups

t-test is a special case of ANOVA because the t-statistic is the square root of the corresponding F-statistic; therefore, ANOVA can be used practically for comparisons of any number of groups
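The relationship between t and F can be verified on a toy example. The sketch below computes a pooled-variance (equal variances assumed) t-statistic and a one-way ANOVA F-statistic for the same two hypothetical groups; the function names and scores are illustrative only, and no p values are computed.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def one_way_f(*groups):
    """One-way ANOVA F; returns (F, df1, df2)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df1 = len(groups) - 1
    df2 = len(all_scores) - len(groups)
    return (ss_between / df1) / (ss_within / df2), df1, df2


group_1 = [4, 5, 6, 7, 8]  # hypothetical DV scores, IV = 1
group_2 = [1, 2, 3, 4, 5]  # hypothetical DV scores, IV = 2

t, df = pooled_t(group_1, group_2)
f, df1, df2 = one_way_f(group_1, group_2)
print(t, df)                  # 3.0 8
print(f, df1, df2)            # 9.0 1 8
print(math.isclose(t ** 2, f))  # True
```

The equality holds for the pooled-variance t-test with two groups; the same `one_way_f` function accepts three or more groups, which is the practical point made above.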

152

Construction of a t-Test Table (Zhu)

| DV | IV = 1 | IV = 2 | Difference |
|---|---|---|---|
| Mean | ? | ? | ? |
| Std Dev | ? | ? | -- |
| N | ? | ? | -- |
| Std Error (optional) | ? | ? | ? |

t = ?, df = ?, p < .?

153

Rules for t-Test Tables (Zhu)
Put the IV in columns, with one column for each group of the IV
Put the DV in rows, with one row each for the mean, standard deviation, standard error (optional because it can be calculated from the other three pieces of information presented), and number of cases of each group
Optionally, show the difference in the mean of the DV between the two groups in the last column
Show the results of the t-test (including the obtained t-value, df, and p-level) below the table
When constrained by space, t-test table(s) can be replaced by a brief report of the above content in the text; parallel tables can be combined into a multi-panel table

154

Interpretation of t-Test Results (Zhu)
When the resulting p-level is equal to or greater than the predetermined α-level (e.g., .05), retain the H0 (i.e., there is no difference in the mean of the DV between the groups of the IV in the population) and consequently reject the H1

When p is smaller than α, reject the H0 and consequently accept the H1 (i.e., there is a significant difference in the DV between the groups of the IV in the population)

155

Construction of an ANOVA Table (Zhu)

| DV | IV = 1 | IV = 2 | IV = 3 |
|---|---|---|---|
| Mean | ? | ? | ? |
| Std Dev | ? | ? | ? |
| N | ? | ? | ? |
| Std Error (optional) | ? | ? | ? |

F = ?, df1 = 2, df2 = ?, p < ?

156

An Alternative Table for ANOVA Results: Multi-group Comparisons (Zhu)

| | Mean | Std | N | Group 2 | Group 3 |
|---|---|---|---|---|---|
| Group 1 | ? | ? | ? | Difference* | Difference** |
| Group 2 | ? | ? | ? | | Difference*** |
| Group 3 | ? | ? | ? | | |

* p < .05, ** p < .01, *** p < .001

157

Interpretation of ANOVA Results (Zhu)
Overall test:

  When p ≥ α (e.g., .05), retain the H0 (i.e., there is no difference in the mean of the DV among all groups of the IV in the population) and consequently reject the H1
  When p < α, reject the H0 and consequently accept the H1 (i.e., there is a significant difference in the DV between at least one pair of the groups of the IV in the population)

Post hoc tests: when the overall test is significant (p < α), a post hoc test is needed to compare each pair of the groups to identify exactly which pair is significantly different. The logic of the t-test between two groups applies here.

158

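Two-way ANOVA Data Table*

| IV 1 | IV 2: Value=1 | IV 2: Value=2 | IV 2: Value=3 | Total |
|---|---|---|---|---|
| Value=1 | $\bar{x}_{11}$ | $\bar{x}_{12}$ | $\bar{x}_{13}$ | $\bar{x}_{\bullet 1}$ |
| Value=2 | $\bar{x}_{21}$ | $\bar{x}_{22}$ | $\bar{x}_{23}$ | $\bar{x}_{\bullet 2}$ |
| Total | $\bar{x}_{1\bullet}$ | $\bar{x}_{2\bullet}$ | $\bar{x}_{3\bullet}$ | $\bar{x}_{\bullet\bullet}$ |

* This and the following slides on 2-way ANOVA are optional.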

159

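Two-way ANOVA Result Table

| Source | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| IV 1 | ? | ? | ? | ? | ? |
| IV 2 | ? | ? | ? | ? | ? |
| Interaction | ? | ? | ? | ? | ? |
| Error | ? | ? | ? | | |
| Total | ? | ? | | | |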

160

Interpretation of 2-way ANOVA Results (Zhu)

Three H0’s are tested:
  H0 for IV 1: if p < α, reject the H0 for IV 1 (i.e., IV 1 has a significant effect on the DV)
  H0 for IV 2: if p < α, reject the H0 for IV 2 (i.e., IV 2 has a significant effect on the DV)
  H0 for the interaction between IV 1 and IV 2: if p < α, reject the H0 (i.e., the effect of IV 1 on the DV varies according to the value of IV 2; or vice versa, the effect of IV 2 on the DV varies according to the value of IV 1)

161

Illustration of Interaction Effect

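[Figure: two panels plotting the DV (vertical axis) against IV 1 (horizontal axis, values 1 and 2), each with one line for IV 2 = 1 and one line for IV 2 = 2. Panel titles: No Interaction; Significant Interaction.]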

162

Correlation and Regression Analysis

Correlation: the degree of association between 2 or more variables without assuming the causal direction between them, measured in a standardized unit (from -1 to 1):

    Spearman’s correlation (ρ, for ordinal variables)
    Pearson’s correlation (r, for interval/ratio variables)

Regression: the degree of association of a DV with 1 or more IVs, measured in both a standardized unit and a nonstandardized unit (i.e., the original unit of the DV):

    Simple regression (for a DV and an IV)
    Multiple regression (for a DV and 2+ IVs)
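As a minimal sketch of the two correlation coefficients named above: Pearson's r is computed from the raw scores, and Spearman's ρ is computed here as Pearson's r on the ranks of the scores, with tied values sharing an average rank. The data and the function names are hypothetical.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson's r for two equally long lists of interval/ratio scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho, computed as Pearson's r on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: Y rises with X, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

In this example the rank order of X and Y agrees perfectly, so ρ reaches its maximum while r stays slightly below it.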

163

Possible Correlational Relationships

[Figure: six scatterplots of Y (vertical axis) against X (horizontal axis). Panel titles: Perfectly Positive Correlation (r=1); Perfectly Negative Correlation (r=-1); Perfectly Zero Correlation (r=0); Imperfect Positive (r=0.5); Imperfect Negative (r=-0.5); Weak (probably nonsig.) (r=0.1).]

164

Interpretation of Correlation Results (Zhu)

Significance of the relationship:
    If p ≥ α, retain H0 (i.e., the observed ρ or r between X and Y is merely by chance and doesn’t exist in the population)
    If p < α, reject H0 (i.e., the ρ or r between X and Y is beyond chance and does hold in the population)

Direction of the relationship:
    If ρ or r > 0, there is a positive correlation between X and Y (i.e., the two vary in the same direction)
    If ρ or r < 0, there is a negative correlation between X and Y (i.e., the two vary in opposite directions)

Strength of the relationship:
    If the absolute value of ρ or r < .3, the correlation is weak
    If the absolute value of ρ or r > .3 but < .7, the correlation is medium
    If the absolute value of ρ or r > .7, the correlation is strong
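The three readings (significance, direction, strength) can be put together in one small function. This is only a sketch: the name `interpret_correlation` is hypothetical, and the treatment of values exactly equal to .3 or .7 is an assumption, since the rules above do not say which band they belong to (here they go to the stronger band).

```python
def interpret_correlation(r, p, alpha=0.05):
    """Return (significance, direction, strength) for a correlation r with p-level p."""
    significance = "reject H0" if p < alpha else "retain H0"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    size = abs(r)
    if size < 0.3:
        strength = "weak"
    elif size < 0.7:   # assumption: exactly .3 counts as medium
        strength = "medium"
    else:              # assumption: exactly .7 counts as strong
        strength = "strong"
    return significance, direction, strength

# Hypothetical results, as they might be reported by software.
print(interpret_correlation(-0.5, 0.01))
print(interpret_correlation(0.1, 0.40))
print(interpret_correlation(0.8, 0.0004))
```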

165

Construction of Correlation Matrix Table

      X1               X2                X3
X1    1.00
X2    ?* (n = ?)       1.00
X3    ?** (n = ?)      ?*** (n = ?)      1.00

* p < .05, ** p < .01, *** p < .001
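A minimal sketch of how the lower triangle of such a matrix could be assembled, one Pearson's r per pair of variables together with the n it is based on. The variables X1 to X3 hold made-up scores and the function names are hypothetical; the stars would be added from the p-levels reported by software.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson's r for two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(variables):
    """Lower-triangular matrix: rows[row][col] = (r, n); the diagonal is 1.00."""
    names = list(variables)
    rows = {}
    for i, row_name in enumerate(names):
        rows[row_name] = {}
        for col_name in names[: i + 1]:
            if col_name == row_name:
                r = 1.0
            else:
                r = pearson_r(variables[row_name], variables[col_name])
            rows[row_name][col_name] = (r, len(variables[row_name]))
    return rows

# Hypothetical scores on three variables (illustration only).
data = {"X1": [1, 2, 3, 4], "X2": [1, 3, 2, 4], "X3": [4, 3, 2, 1]}
matrix = correlation_matrix(data)
for row_name, row in matrix.items():
    print(row_name, {col: round(r, 2) for col, (r, n) in row.items()})
```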

166

Partial Correlation (optional)

Partial correlation is a multivariate analysis that examines the net bivariate correlation between X and Y by controlling for the impact of “third variables” (e.g., Z, W, etc.) on Y.

Zhu: the order of a partial correlation is determined by the number of “third variables” involved:

    rxy (bivariate correlation between X and Y) is called the “zero-order correlation coefficient”
    rxy|z (partial correlation between X and Y with Z controlled) is called the “first-order correlation coefficient”
    rxy|zw (partial correlation between X and Y with Z and W controlled) is called the “second-order correlation coefficient”
    rxy|zw… (partial correlation between X and Y with k variables controlled) is called the “kth-order correlation coefficient”
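A first-order partial correlation can be obtained from the three zero-order correlations among X, Y, and Z. The sketch below uses the standard formula; the function name `partial_r` and the input values are hypothetical. Higher orders can be built by applying the same formula to partial correlations of the next lower order.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation between X and Y with Z controlled."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order correlations (illustration only).
r_xy_given_z = partial_r(r_xy=0.5, r_xz=0.5, r_yz=0.5)
print(round(r_xy_given_z, 3))

# When Z is unrelated to both X and Y, controlling for it changes nothing.
print(partial_r(r_xy=0.5, r_xz=0.0, r_yz=0.0))
```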

167

Simple Regression Analysis

A simple regression analysis builds on an assumed causal relationship between two variables, with one (X) as the IV and the other (Y) as the DV

The relationship between X and Y can be expressed by the equation:

Y = a + bX

where a is the intercept of the regression line and b is the slope of the regression line.

[Figure: the regression line Y = a + bX plotted with X on the horizontal axis and Y on the vertical axis; a marks the intercept and b marks the slope.]
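As an illustration of how a and b are obtained, here is a minimal least-squares sketch. The data are made up and the function name `simple_regression` is hypothetical.

```python
from statistics import mean

def simple_regression(x, y):
    """Return (a, b) of the least-squares line Y = a + bX."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx  # the line passes through the point of means
    return a, b

# Hypothetical scores on the IV (x) and the DV (y).
x = [1, 2, 3, 4]
y = [2, 4, 5, 7]
a, b = simple_regression(x, y)
print(round(a, 3), round(b, 3))

# Predicted Y for a new X value under the fitted line.
predicted = a + b * 5
print(round(predicted, 3))
```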

168

Construction of Simple Regression Table

            b     Beta    t     p
IV          ?     ?       ?     ?
Constant    ?     ?       ?     ?

N = ?
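The b, Beta, and t columns of such a table could be computed along the following lines. This is a sketch with made-up data and a hypothetical function name; Beta is taken to be the standardized slope (b rescaled by the standard deviations of X and Y), and the p column is left to software or a t table with N - 2 degrees of freedom.

```python
import math
from statistics import mean

def regression_table(x, y):
    """Return the b, Beta, and t entries for the IV and the constant."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    # Residual variance, with N - 2 degrees of freedom.
    mse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    se_b = math.sqrt(mse / sxx)
    se_a = math.sqrt(mse * (1 / n + mx ** 2 / sxx))
    return {
        "IV": {"b": b, "Beta": b * math.sqrt(sxx / syy), "t": b / se_b},
        "Constant": {"b": a, "t": a / se_a},
        "N": n,
    }

# Hypothetical scores on the IV (x) and the DV (y).
table = regression_table([1, 2, 3, 4], [2, 4, 5, 7])
for row, entries in table.items():
    print(row, entries)
```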

169

Interpretation of Simple Regression Results (Zhu)

Regression analysis tests the H0 that the effect of X on Y, called the “regression coefficient of X”, is null (i.e., b = 0)

If the resulting p (for b) < α, we reject H0 and conclude that X has a significant effect on Y

More specifically, we can predict that a one-unit increase in X will lead to a change of b in Y in the population, within a confidence interval

However, the above (i.e., rejection of H0) still doesn’t prove the causal direction from X to Y; it only suggests the impact of X on Y if the causal assumption is correct.

170

Multiple Regression Analysis (optional)

Multiple regression analysis involves two or more IVs:

Y = a + b1X1 + b2X2 + … + bkXk

The test involves:
    an H0 for the overall regression equation (i.e., R2 = 0, where R2 is called the “coefficient of determination”)
    multiple H0’s, each corresponding to a particular IV and tested separately (i.e., b1 = 0, b2 = 0, …, bk = 0)
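To show where a, b1, …, bk and R2 come from, here is a rough least-squares sketch using only the Python standard library. It solves the normal equations with a small elimination routine. The data are constructed so that Y = 1 + 2·X1 + 3·X2 exactly, and the function names are hypothetical; in practice statistical software would be used.

```python
def solve(matrix, vector):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(vector)
    aug = [row[:] + [v] for row, v in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def multiple_regression(xs, y):
    """xs: list of IV columns; y: DV scores. Returns ([a, b1, ..., bk], R2)."""
    design = [[1.0] + [col[i] for col in xs] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    coefs = solve(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coefs, row)) for row in design]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return coefs, 1 - ss_res / ss_tot

# Hypothetical data built so that Y = 1 + 2*X1 + 3*X2 with no error.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
y = [9, 8, 19, 18, 26]
coefs, r2 = multiple_regression([x1, x2], y)
print([round(c, 3) for c in coefs], round(r2, 3))
```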

171

Table of a Multiple Regression

            b     Beta    t     p
IV 1        ?     ?       ?     ?
IV 2        ?     ?       ?     ?
IV 3        ?     ?       ?     ?
Constant    ?     ?       ?     ?

Adjusted R2 = ?, N = ?, (F = ?, df1 = ?, df2 = ?,) p < ?

* p < .05, ** p < .01, *** p < .001
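The summary line of the table can be derived from R2, N, and the number of IVs (k). The sketch below uses the usual formulas for the adjusted R2 and the overall F; the input values and function names are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for a regression with k IVs and n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def overall_f(r2, n, k):
    """Return (F, df1, df2) for the overall H0 that R2 = 0."""
    df1, df2 = k, n - k - 1
    return (r2 / df1) / ((1 - r2) / df2), df1, df2

# Hypothetical values: R2 = .50 from 23 cases and 2 IVs.
adj = adjusted_r2(0.5, n=23, k=2)
f_value, df1, df2 = overall_f(0.5, n=23, k=2)
print(round(adj, 3), round(f_value, 3), df1, df2)
```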

172

Table of Several Multiple Regressions

            DV1   DV2   DV3
IV 1        ?     ?     ?
IV 2        ?     ?     ?
IV 3        ?     ?     ?
Constant    ?     ?     ?
Adj. R2     ?     ?     ?
N           ?     ?     ?
p           ?     ?     ?

173

Interpretation of Multiple Regression Results (optional)

If the p for the overall equation < α, we reject the overall H0 to conclude that the IVs jointly have a significant effect on the DV

The value of R2 describes the proportion of variance in the DV explained by the IVs together

If the p for a particular b (ranging from b1 to bk) < α, we reject the associated H0 to conclude that the corresponding IV has a significant effect on the DV

The value of the particular b describes the amount of change in the DV caused by a one-unit increase in the IV, when all other IVs are held constant
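The "held constant" reading of b can be checked with a short sketch: raise one IV by one unit, leave the others unchanged, and the predicted DV changes by exactly that IV's b. The coefficients and the function name `predict` are hypothetical.

```python
def predict(a, bs, xs):
    """Predicted DV from Y = a + b1*X1 + ... + bk*Xk."""
    return a + sum(b * x for b, x in zip(bs, xs))

# Hypothetical fitted equation: Y = 1 + 2*X1 + 3*X2.
a, bs = 1.0, [2.0, 3.0]

baseline = predict(a, bs, [4.0, 5.0])
shifted = predict(a, bs, [5.0, 5.0])  # X1 raised by one unit, X2 held constant
print(baseline, shifted, shifted - baseline)
```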

174

Advanced Techniques for Multiple Regression (Zhu, optional)

Nominal/ordinal scale variables as IVs: recoded by a series of dummy (i.e., binary) variables

Interaction between two IVs:
    when both are interval/ratio variables
    when one IV is an interval/ratio variable and the other a nominal/ordinal variable

Nonlinear relationship:
    Nonlinear transformation of IVs
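The three techniques can be sketched as simple recoding steps applied to the IVs before the regression is run. Everything here is illustrative: the variable names and values are made up, the interaction is represented by a product term (a common choice, assumed here), and the logarithm is only one example of a nonlinear transformation.

```python
import math

def dummy_code(values, reference):
    """One binary (0/1) variable per category, omitting the reference category."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

def interaction_term(x1, x2):
    """Product term representing the interaction between two IVs."""
    return [a * b for a, b in zip(x1, x2)]

def log_transform(x):
    """Natural-log transformation of an IV with positive values."""
    return [math.log(v) for v in x]

# Hypothetical nominal IV (medium) and interval/ratio IV (hours of use).
medium = ["tv", "print", "web", "tv"]
hours = [1.0, 2.0, 4.0, 8.0]

dummies = dummy_code(medium, reference="tv")
web_by_hours = interaction_term(dummies["web"], hours)  # nominal x interval/ratio
log_hours = log_transform(hours)
print(dummies)
print(web_by_hours)
print([round(v, 3) for v in log_hours])
```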
