Research Methods II


Chapter 1: Introduction

What is NOT research?

o Just collecting facts or information with no clear purpose

o Reassembling and reordering facts or information without interpretation

o A term to get your product or idea noticed and respected

What is research?

o Data are collected systematically

o Data are interpreted systematically

o There is a clear purpose = to find things out

Definition: Something that people undertake in order to find out things in a systematic way,

thereby increasing their knowledge.

BUSINESS AND MANAGEMENT RESEARCH

o Transdisciplinary nature

o Development of ideas that are related to practice → Requirement to have some practical consequence

o Personal or commercial advantages related to research

o Theory + practice (?)

In sum, research (also business and management research) should

o Collect data systematically

o Interpret data systematically

o Have a clear purpose: to find things out

FUNDAMENTAL RESEARCH

Purpose:

o Expand knowledge of processes

o Universal principles

o Findings of significance and value to society in general

o Find rules that might explain a theory (E.g.: How students are motivated)

Context:

o Universities (mostly academic context)

o Choice determined by researcher

o Flexible time scales

You develop a model that can be used in other situations.

APPLIED RESEARCH

Purpose:

o Improve understanding of a particular problem

o Results in solution to problem

o New knowledge limited to problem

o Findings of particular relevance

o To solve practical problems (E.g.: lack of motivation in a company)

Context:

o Organisations and universities (E.g.: Consultancy)

o Negotiation with originator

o Tight time scales

Different types of research depending on

its purpose and context

These two types of research can interact.

Question:

A research team will investigate which location is most suited for the establishment of a new

Carrefour supermarket. This research project leans towards:

A) Fundamental research

B) Applied Research

Answer: A Because other companies can use that.

A research study was carried out to see whether and why people notice web addresses on television

adverts. This study leans towards:


A) Fundamental research

B) Applied Research

Answer: A Principle that can be used for other companies

A study was carried out to see how the rise of the Internet has changed consumers buying behavior.

This study leans towards:


A) Fundamental research

B) Applied Research

Answer:

Which of the following statements is wrong?

A) Research is either fundamental or applied

B) Fundamental research might be of practical relevance

C) The key outcomes of applied research are actionable results with practical impact

D) Research might be of both practical and theoretical relevance

Answer: A. Fundamental and applied research are the two extremes; a project can also be a mixture of both.

Wherever your research lies, whether fundamental or applied, or anywhere in between, you should

undertake it with rigour.

Pay careful attention to the research process!

Chapter 2: The research topic

THE RESEARCH PROCESS

o Formulate and clarify the research topic (What: Formulate the question. Make a

good/relevant question)

o Develop the research design. (How am I going to develop the research? Who are my target

group?)

o Gather data

o Analyse and interpret the data

o Write the project report (Write research work with an answer to question)

FORMULATE AND CLARIFY THE RESEARCH

Research topic

Why are you doing research?

Basically, because you are dealing with a certain problem or question; you want to find something

out. There must be a problem in order to conduct research.

Why this particular problem or question?

o Intellectual reasons

o Practical reasons

o Personal reasons

Formulating Research Questions

Research question = The question to which you will try to find an answer by means of your research. This question should give direction to your research.

No good research question = No good answer!!!

E.g.: "Was our recent advertising campaign successful?" is NOT a good question. What do we define as "successful", "recent" and "advertising campaign"? We have to be as concrete as possible by defining all the elements, and we should narrow the question down as much as possible.

Make sure you critically evaluate your research question(s):

o Suitable focus?

o Am I able to investigate my research question(s)?

o Is it feasible to investigate my research question(s)?

Suitable focus?

No simplistic inductive research question

o Entirely separate from previous research or theory

No omnivore-research question

o Involves as many elements as possible; Wants to investigate everything

o So broad that any profundity becomes impossible

No theoretical research question

o Entirely separate from empiricism; Not tuned into social reality

Every research project involves making choices:

o Not everything can be investigated, and what you investigate cannot be known to perfection

Make sure you make this clear:

o As a researcher, you should communicate the theoretical and practical

boundaries/limitations of your research question(s) (and thus research results)

Am I able to investigate my research question(s)?

o Do I have the appropriate research skills?

o Is it ethical? (E.g.: Can young people buy the product?)

Is it feasible to investigate my research question(s)?

Time?

o Longitudinal research (measurements repeated several times over a long period). It enables us to see whether the problem changes over time.

o Cross-sectional research (investigate/compare different population groups at a single point in time)

Money?

Accessibility of the research subjects and their willingness to participate?

o Am I able to get access to the data?

Research Questions: Some examples

Problem statement:

The Flemish movie industry is growing but has still to cope with limited attendance, possibly caused

by the general public not being familiar with such movies. It is not clear how this lack of

familiarity could be countered.

Research questions?

o How many times do Flemish people go to the cinema to watch Flemish movies?

Problem statement:

Company X deals with significant financial problems. To solve these problems, the turnover should

increase by at least 10 percent during the next 18 months.

Research questions?

o What is the cause of the financial problem?

o How can company X increase turnover by 10% during the next 18 months?

Problem statement:

Previous academic research thoroughly investigated the shoplifting phenomenon as it negatively

influences business, other consumers, and society more generally. Although many examined the

socio-demographic profile of shoplifters, research about how to prevent shoplifting among

consumers is rather scarce.

Research questions?

o Why are they shoplifting?

o What is the most effective way to fight shoplifting?

THE NATURE OF YOUR RESEARCH

EXPLORATORY RESEARCH

Used to discover what is happening and to gain insights about a topic of interest. It is particularly useful if

you wish to clarify your understanding of a problem, for example if you are unsure of its precise nature.

Time may be well spent on exploratory research, as it might show that the

research is not worth pursuing! Exploratory research has the advantage that it is flexible and adaptable to change.

Typically qualitative. Suited to cases where little research exists on the topic.

DESCRIPTIVE RESEARCH

o To gain an accurate profile of events, persons or situations

o May be a precursor or extension of exploratory and/or explanatory research

o Means to an end vs. An end in itself

o Descripto-explanatory studies (= precursor/predecessor to explanatory studies)

EXPLANATORY/CAUSAL RESEARCH

To establish relationships between variables

o E.g.: Quantitative study investigating whether certain colours in the shop lay-out result in

higher levels of customer satisfaction

o E.g.: Qualitative study investigating whether Corporate Social Responsibility activities in a

company influence employee involvement

[Figure: Gender (independent variable) → Study results (dependent variable). Hypothesis/expectation: females do better. Discipline is the mediator. Country (Belgium vs. other countries) is the moderator.]

Research projects may serve more than one purpose

E.g.: Cinite, I., Duxbury, L.E., & Higgins, C. (2009). Measurement of perceived organisational readiness for change in the public sector. British Journal of Management, 20(2), 265-277.

o Exploratory phase: To identify behaviours, based on their participants' experiences, of organisational change (interviews)

o Descriptive phase: Used as a forerunner for the next phase (web-based survey)

o Explanatory phase: To explain the relationship between organisational actions and readiness or unreadiness to implement change, based on employees' perceptions (web-based survey)

Question:

Which of the following statements is false?

A) Profiling KUL-students in terms of gender and age is an example of descriptive research.

B) Exploratory research may follow descriptive or causal research.

C) When little is known about the problem situation, it is desirable to start with exploratory

research.

D) Investigating whether and why a decrease in price influences sales and market share results

in descriptive research.

Answer: D

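Statement D describes explanatory (causal) research: price is the independent variable; sales and market share are the dependent variables.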

RESEARCH OBJECTIVES

Might be useful for complex research questions. It comes after the question.

What?

Operationalise how you intend to conduct your research by providing a set of coherent and

connected steps to answer your research question.

Why?

o Likely to lead to greater specificity compared to research questions

o Require more rigorous thinking

What kind of work do I need to do in order to answer my research question? What successive steps

do I need to take in order to answer my research question?

These are statements, not questions and they are numbered in a list.

Example:

As a sales manager, you notice that your sales staff are becoming less and less motivated to sell the

company's products. Therefore, you decide to investigate how you could increase the level

of motivation among your sales staff.

1. To define the concept of motivation

2. To review key literature on the existing measures to motivate sales people

3. To identify the strengths and weaknesses of the identified measures

4. To determine which measures are most relevant to use in the context of my company

5. To carry out primary research in my company to measure the effectiveness of the selected measure

HYPOTHESES

A hypothesis is something you would like to test. It is usually based on a theory, but not always

(e.g., when following an inductive approach).

Directional hypotheses:

o The direction of the relationship between the variables is indicated. E.g.: The greater the

stress experienced in the job, the lower the job satisfaction of employees

o The difference between two groups on a variable is postulated. E.g.: Women are more

motivated than men to lose weight

Non-directional hypotheses:

o Do postulate a relationship or difference, but offer no indication of the direction of these

relationships or differences. E.g.: There is a relationship between age and job satisfaction
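The difference between a directional and a non-directional hypothesis shows up when the hypothesis is tested: a directional hypothesis is checked in one direction only, a non-directional one in both. The following is a minimal sketch using a simple permutation test; the data, the function names and the number of permutations are all hypothetical and chosen only for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_values(xs, ys, n_permutations=2000, seed=0):
    """Return (observed r, one-sided p for a negative relationship, two-sided p)."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    lower = 0    # permutations at least as negative as the observed r
    extreme = 0  # permutations at least as strong in either direction
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        r = pearson_r(xs, shuffled)
        if r <= observed:
            lower += 1
        if abs(r) >= abs(observed):
            extreme += 1
    return observed, lower / n_permutations, extreme / n_permutations


# Hypothetical scores (1-10) for job stress and job satisfaction of ten employees.
stress = [2, 3, 4, 5, 5, 6, 7, 8, 9, 9]
satisfaction = [9, 8, 8, 6, 7, 5, 5, 3, 2, 3]

r, p_directional, p_non_directional = permutation_p_values(stress, satisfaction)
print(f"r = {r:.2f}")
print(f"directional (one-sided) p = {p_directional:.4f}")
print(f"non-directional (two-sided) p = {p_non_directional:.4f}")
```

The directional version only counts shuffles that are at least as negative as the observed relationship, which matches "the greater the stress, the lower the job satisfaction"; the non-directional version counts strong relationships in either direction.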

HYPOTHESES: SOME COMMON MISTAKES

Ambiguous formulation

o E.g.: "Belgians consume much candy, Americans don't." (avoid vague words like "much")

No point of reference

o E.g.: "Adolescents consume much more alcohol." ("much more" than whom?)

Unfounded (not based on literature; Theory as fuel)

o However, this would not be problematic when following the inductive approach (see

afterwards)

RESEARCH QUESTIONS VS. HYPOTHESES

Research questions

o E.g.: Which measure is most effective in preventing consumers from shoplifting?

Hypothesis

o A tentative, yet testable, statement which predicts what you expect to find in your

(empirical) data. E.g.: Financial punishments are more effective in preventing consumers

from shoplifting compared to social punishments.

Question:

Is the following hypothesis well formulated? Explain your answer.

Belgian adolescents have a better self-image compared to French adolescents.

Answer: This is a good directional hypothesis. It is well formulated, contains no

ambiguous words and has a clear point of reference.

Question:

Research topic: Salespeople of Samsung and their preference for payment by commission vs. salary

Formulate a research question in line with the research topic above that results in descriptive

research and a research question that results in causal research.

Answer:

Descriptive Research: What percentage prefer payment by salary versus payment by commission?

Causal Research: How many products are sold by salespeople paid by salary in comparison to

salespeople paid by commission?

Chapter 3: The Research Design

The research design describes how you will conduct your research. At this stage you need to think of all the elements needed to

carry out the research. You should be aware of the advantages and disadvantages of each choice.

Definition of Research Design

A general framework or plan for conducting a research project; It details the procedures

necessary for obtaining the information needed to answer the research question.

Motivate the choices you make!

Exam: Give arguments for choice of design. No right/wrong answer.

RESEARCH PARADIGM

The development of your research design will be influenced by your research paradigm

o A cluster of beliefs and dictates which for scientists in a particular discipline influence what should be studied, how research should be done and how results should be interpreted

o A basic belief system or worldview that guides the investigator, not only in choices of method but in ontologically and epistemologically fundamental ways

The difference between research paradigms is based on assumptions within three domains:

o Ontology: What is reality? What does reality look like? Is there a reality external to humans? If yes, what does it look like?

o Epistemology: How can we build knowledge about that reality? How do we know what we know? What counts as knowledge, and what doesn't? What is the relationship between researcher and subject?

o Methodology: How can the researcher acquire knowledge about his beliefs? (Is limited by ontological and epistemological viewpoints.)

Positivism

o Explaining relationships

o Accumulating data

o Objective process

o Knowledgeable researcher, known subjects

o Theory verification

o Deduction → Testing hypotheses

o Focus on quantitative research

Constructivism

o Understanding subjects' meaning

o Constructing information

o Intersubjective process

o Researcher becomes involved with the subjects

o Theory building

o Induction → Developing hypotheses

o Focus on qualitative research

There are many other research paradigms in between the extremes of positivism and constructivism! Qualitative research can, for example, be positivist.

[Figure: continuum from POSITIVISM to CONSTRUCTIVISM]

Example:

Relationship between CSR activities within a company and employees' involvement.

How would a positivist deal with this topic?

He would test a theory. E.g.: Giving the employees a quantitative questionnaire.

How would a constructivist deal with this topic?

He wouldn't start with theory. Depending on what he finds in the field of study, he would develop a theory.

Question:

Which of the following statements is false?

A) Quantitative research can follow an inductive approach.

B) Qualitative research might be inductive as well as deductive.

C) In positivistic research, the researcher intervenes in the research process.

D) Examining people's motivation for luxury consumption can be done by quantitative as well as qualitative research.

Answer: C

B: Might be true sometimes, but not always; it is not typical. D: Qualitative research is more appropriate, but we

can have a quantitative approach as well (e.g., checking consumption).

RESEARCH APPROACH

The development of your research design will also be influenced by your research approach

Survey: more deductive

Case study: more inductive

However, induction and deduction can be combined within the same research project!

Data → Theory → Data: once you've collected data and built your theory (induction), you can decide to test it again on new data (deduction).

Deductive or Inductive?

1. If it rains, everything outside becomes wet. It rains. The car is outside.

→ The car will become wet.

Answer: Deduction

2. The first duck in the park is brown. The second duck in the park is brown. The third duck in the park is brown.

→ Every duck in the park is brown.

Answer: Induction

Some practical criteria:

o Emphasis of the research and nature of the research topic

o Wealth of literature

o Time available

o Risk

o Audience

The inductive approach is riskier: you might not be able to develop a good theory from your data.

Question:

Which of the following statements is true?

A) With deduction, data are collected and a theory developed as a result of the data analysis.

B) Research projects should include either the deductive or inductive research approach.

C) A research topic about which little literature exists is more likely to result in an inductive research approach than a deductive research approach.

D) The deductive research approach is less strict compared to the inductive research approach.

Answer: C

Question:

Patrick is a member of the Human Relations Research Group of KUL. He read about the large number of adolescents slipping into shoplifting behaviour and wonders how this behaviour could be prevented. Therefore, he runs a study in which he tests whether the Protection Motivation Theory is applicable to this particular issue. Patrick's study leans towards:

a) An inductive research approach

b) A deductive research approach

Answer: b. He develops some expectations/theory. Then he gathers data to test the theory.

In sum, you should be aware of the fact that research paradigm and research approach influence your research design.

Core elements of your research design are:

o Research choice

o Research strategy

o Time horizon

RESEARCH CHOICE: QUANTITATIVE, QUALITATIVE OR MULTIPLE METHODS RESEARCH DESIGN

How will you combine quantitative and qualitative data collection techniques and data analysis procedures?

Quantitative

Often used as a synonym for data collection techniques/data analysis procedures that generate or use numerical data.

Qualitative

Often used as a synonym for data collection techniques/data analysis procedures that generate or use non-numerical data (such as text)

This distinction might be both problematic and narrow.

Why problematic? Many research designs are likely to combine quantitative and qualitative elements

o E.g., Research design using a questionnaire in which respondents also have to answer some open questions in their own words

o E.g., Qualitative research data may be analysed quantitatively (i.e., qualitative data being quantitised)

Why narrow? Quantitative and qualitative methodologies should be reinterpreted through their associations with research paradigms, research approaches and research strategies.

Quantitative research design

o Research paradigm: Positivism

o Research approach: Deduction

o Characteristics: Causal relationships, numbers, statistical analysis techniques, standardised, probability sampling, generalisability, independent researcher

o Research strategies: Experiments, surveys

Qualitative research design

o Research paradigm: Constructivism

o Research approach: Induction

o Characteristics: Meanings, text, interpretation, non-standardised, non-probability sampling, develop conceptual framework, researcher part of the research process

o Research strategies: Case study

Still, it is possible that a quantitative research design is more in line with induction, and that a qualitative research design is more in line with deduction.

Many research designs are thus likely to combine quantitative and qualitative elements. No need to learn the figure by heart.

Triangulation is one of the advantages of using more than one data collection technique and analysis procedure

o Multiple methods may be used in order to combine data and ascertain whether the findings from one method corroborate the findings from the other method (i.e., to see whether we get the same result)

Whatever methods you use to collect and analyse data

o Be explicit about the grounds on which multiple methods research is conducted!

o And do not forget that they must serve your research question!

RESEARCH STRATEGY:

We have different ways of conducting a research. We can combine different strategies within the

same project.

Various research strategies exist:

o Experiment

o Survey

o Archival Research

o Case Study

o Ethnography

o Action research

o Grounded theory

o ...

EXPERIMENT

The only way to investigate causal relationships: to infer whether a change in one or more

independent variables produces a change in one or more dependent variables.

E.g.:

Mood (A; independent variable: negative vs. positive) → Creativity (B; dependent variable)

There might be a problem:

Is it mood that affects creativity? Or is it the other way round?

In an experiment we manipulate A (mood) ourselves and then measure creativity. Any resulting difference in creativity can then be read as an effect of mood on

creativity, and not the other way round.

Choice of research strategy (strategies) is, among others,

guided by research questions, research objectives, research

paradigm, research approach and research purpose, as well as

by more practical concerns (e.g., time resources, access to

potential participants).


CLASSIC EXPERIMENT

o Participants are randomly assigned to either the experimental group or the control group

o Each group should be similar in all aspects relevant to the research other than whether or not they are exposed to the planned intervention or manipulation

Experimental group: Some form of planned intervention/manipulation will be tested

Control group: No such intervention/manipulation is made

Example:

Promotion → Purchasing behaviour

Control group: No promotion

Experimental group: Promotion

Success if the promotion increased purchasing behaviour.

A field experiment is better than a lab experiment in terms of external validity.

Design: Pre-test measurement of purchasing behaviour → "Buy two, get one free" promotion (yes or no) → Post-test measurement of purchasing behaviour
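The logic of the classic experiment (random assignment, pre-test, intervention, post-test) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shoppers, the purchase figures and the helper names `assign_groups` and `mean_change` are all hypothetical.

```python
import random
from statistics import mean


def assign_groups(participants, seed=0):
    """Randomly split participants into a control and an experimental group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(group, pre, post):
    """Average post-test minus pre-test score within one group."""
    return mean(post[p] - pre[p] for p in group)


# Hypothetical participants and hypothetical pre-test purchases (items per week).
shoppers = [f"shopper_{i}" for i in range(20)]
pre_test = {s: 10 for s in shoppers}

control, experimental = assign_groups(shoppers)

# Hypothetical post-test purchases: only the experimental group saw the promotion.
post_test = {s: 10 for s in control}
post_test.update({s: 13 for s in experimental})

# The estimated effect is the change in the experimental group
# minus the change in the control group.
effect = mean_change(experimental, pre_test, post_test) - mean_change(control, pre_test, post_test)
print(f"Estimated effect of the promotion: {effect} items per week")
```

Subtracting the change in the control group removes anything that affected both groups alike (for example a busy shopping week), which is what makes it reasonable to attribute the remaining difference to the promotion.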


Internal Validity: The extent to which the findings can be attributed to the interventions rather than any flaws in your research design.

External Validity: Whether the cause-and-effect relationship(s) found in the experiment can be generalised.

Question:

o As a marketer, you are wondering whether rock versus pop music in supermarkets

influences the time consumers spend in these supermarkets.

o Design an experiment which would enable this marketer to find an answer to this

problem.

Answer:

Music → Time spent in supermarket

Control group: No music

Experimental groups: 1. Pop 2. Rock

The choice of supermarket is important. The supermarket, as well as the days, has to be the same for all groups, because on some days people might be happier or shop more. We have to consider all such elements.

SURVEY

o Involves the structured collection of data from a sizeable population. E.g.: Questionnaire, structured observation, structured interviews

o Usually associated with the deductive research approach

o Popular and common research strategy in business and management research

o Most frequently used to answer "what", "who", "where", "how much" and "how many" questions

Advantages:

o Collection of standardised data from a sizeable population in a highly economical way, allowing easy comparison

o Perceived as authoritative by people in general

o Easy to explain and to understand

o When sampling (see next chapter) is used, it is possible to generate findings that are representative of the whole population at a lower cost than collecting the data for the whole population

Disadvantages:

o Data collected by the survey strategy is unlikely to be as wide-ranging as those collected by other research strategies

o Limited number of questions can be included (in case a questionnaire is used)

o Capacity to do it badly (see later)

ARCHIVAL RESEARCH

o Analysis of administrative records and documents as principal source of data because they are products of day-to-day activities

o Recent as well as historical documents

o Secondary data analysis: Data are part of the reality being studied rather than having been collected originally as data for other (research) purposes

o Allows research questions which focus upon the past and changes over time to be answered

o Disadvantages might be the nature of the records and documents, missing data, and access to data (confidentiality)

CASE STUDY

o Empirical investigation of a particular contemporary phenomenon within its real-life context,

using multiple sources of (data) evidence

o The boundaries between the phenomenon being studied and the context within which it is

being studied are not clearly evident

Experiment: Research undertaken in a highly controlled context

Survey: Ability to explore and understand the context is limited by the

number of variables for which data can be collected

o Relevant strategy if you wish to gain a rich understanding of the context

o Has considerable ability to generate answers to "why", "what" and "how" questions

o Likely to use multiple sources of data (interviews, observation, documentary analysis,

questionnaires, ...)

Example:

o Building high quality interaction and cooperation during organisational change (Grieten & Lambrechts, 2007, 2009)

o Problem definition: Two thirds of change processes fail, although it is known that these failures are often caused by relational aspects

o Research question: What makes relational practices of such a quality that they improve common progress during organisational change?

o Case selection: Two organisations with contrasting change processes in terms of results (best practice and worst practice), but similar in terms of relational approach

o Data collection methods: (participant) observation, in-depth interviews, focus groups, document analysis (Triangulation!)

ETHNOGRAPHY

o Used for studying people in groups, who interact with one another and share the same

space (e.g., street level, work group, organisation, ...)

o Origins in (colonial) anthropology

o Focuses upon describing and interpreting the social world through first-hand field study

o Researchers live amongst those whom they study, to observe and talk to them in

order to produce detailed cultural accounts of their shared beliefs, behaviours,

interactions, language, rituals and the events that shaped their lives

TRIANGULATION

o Ideas about this strategy are not unified!

ACTION RESEARCH

o An emergent and iterative process of inquiry that is designed to develop solutions to real

organisational problems through a participative and collaborative approach, which uses

different forms of knowledge, and which will have implications for participants and the

organisation beyond the research project

o Research in action rather than research about action

o Demanding strategy in terms of the intensity involved and the resources and time required

GROUNDED THEORY

Uses an inductive research approach. We start with the data and develop a theory / a relevant model from them.

o Developed as a response to the extreme positivism of past social research

o Theory is developed through the systematic and simultaneous process of data collection and

analysis involving a mainly inductive approach

to generate theory grounded in your data

o A process of constant comparison moving between inductive and deductive thinking

o Theoretical sampling until theoretical saturation is reached

= Conceptual density

= Conceptual saturation

o Time-consuming, intensive and reflective

o Will something significant emerge?

o Will something emerge that is more than simply descriptive?

Example:

Nyilasy, G., & Reid, L.N. (2009). Agency practitioners' metatheories of advertising. International

Journal of Advertising, 28(4), 639-668.

o What do advertising agency practitioners think about how advertising works? This

study's basic aim was to understand practitioners' thinking about the work of

advertising in their own terms. As there was little substantive research of this

perspective, a grounded theory approach to qualitative research was used.

o Semi-structured, in-depth interviews were used until theoretical saturation was

achieved

TIME HORIZON:

Cross-sectional studies

o The study of a particular phenomenon or phenomena at a particular time, i.e. a

snapshot

o Choice of moment may be important

Longitudinal studies

o The study of a particular phenomenon or phenomena over an extended period of

time (different moments in time)

o Possible to study changes and developments

o Watch out for relevant changes in variables you do not take into account!

o E.g., Consumer Sentiment Index (University of Michigan)

When developing your research design, you should also consider the ethics and the quality of your

research design.

ESTABLISHING THE QUALITY OF THE RESEARCH DESIGN

Reliability: Consistency in research. Do I get the same result when I replicate exactly the same experiment?

Validity: Testing the right variable. Does the research measure what it intends to measure?

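An example by means of a scale as measurement instrument

Measurements: 1. 73 kg 2. 73 kg 3. 73 kg 4. 73 kg

Real weight = 78 kg

RELIABILITY: The scale gives the same result every time, so it is reliable.

VALIDITY: It doesn't measure what it intends to measure (73 kg instead of 78 kg), so it is not valid.

Reliability does not imply validity!!! & Validity does not imply reliability!!!

[Figure: four panels showing the possible combinations: Not reliable / Not valid; Reliable / Not valid; Reliable / Valid; Not reliable / Valid]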
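The scale example can be expressed numerically. The sketch below is a simplification made only for illustration: it treats reliability as the spread of repeated readings and validity as their closeness to the real value, and the tolerance thresholds `max_spread` and `max_bias` are hypothetical.

```python
from statistics import mean, pstdev


def assess_instrument(readings, true_value, max_spread=0.5, max_bias=0.5):
    """Judge repeated readings of one object against its known true value.

    Assumption for this sketch: 'reliable' means the readings barely vary,
    'valid' means their average is close to the true value.
    """
    spread = pstdev(readings)           # how consistent the readings are
    bias = mean(readings) - true_value  # how far off they are on average
    return {
        "spread": spread,
        "bias": bias,
        "reliable": spread <= max_spread,
        "valid": abs(bias) <= max_bias,
    }


# Values from the scale example: four readings of 73 kg, real weight 78 kg.
result = assess_instrument([73, 73, 73, 73], true_value=78)
print(result)
```

For the four identical readings the spread is zero, so the scale counts as reliable, while the average is 5 kg below the real weight, so it does not count as valid.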

Question:

1. The student administration department of HUB examines the extent to which HUB-students

are satisfied with the teaching skills of the HUB-staff. By means of a questionnaire on Time 1,

researcher X finds that the overall satisfaction is equal to 8.7 on 10. Two weeks later (Time

2), researcher X conducts the same research (among the same respondents) and finds that

the overall satisfaction is equal to 8.7 on 10. Consequently, researcher X's results are:

Valid: ? => No information to tell us whether it is valid or not.

Reliable: Yes

2. You developed a measurement instrument to examine employees' level of job autonomy

perception (i.e., the extent to which they experience autonomy in their job). This

measurement instrument seems to be sensitive to social desirability (i.e., respondents'

tendency to give answers that may be desirable from a social standpoint: people

answer according to what they think is expected of them and not according to their own

opinion).

Question: What is the implication of social desirability for the quality of your measurement

instrument?

Answer: Not valid. Instead of measuring job autonomy, the instrument measures social desirability.

3. Which of the following statements is correct?

A) Experiments are more valid compared to surveys

B) If a study is reliable, it means that it measures what we think it should measure

C) External validity is about the extent to which the reliability of a study can be generalised

D) An interviewer who writes down a wrong answer from absent-mindedness threatens the reliability of his study

Answer: D => Just one instance of absent-mindedness will not influence the validity of

the research.

Chapter 4: Sampling

The full set of cases (the population) does not necessarily consist of people!

E.g.: What's the average price of chicken soup in Chinese restaurants located in Brussels? Population: Chinese restaurants located in Brussels.

Sample: A sub-group within a population.

Research question: How many beers do Belgian adults drink on average each week?

You could collect and analyse data from every possible case in the population = CENSUS

However, there might be restrictions in terms of time, money, access, currency, speed, practice, accuracy, detail ...

Therefore, consider data from a subgroup rather than all possible cases or elements of the population = SAMPLE

Sampling is about selecting a number of elements from a population you would like to study, with the intention to derive characteristics of the population from characteristics of the sample.
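The idea of deriving a population characteristic from a sample can be illustrated with a small simulation. Everything in this sketch is hypothetical: the population of 10,000 adults, their weekly beer consumption (random whole numbers from 0 to 14) and the sample size of 200.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical population: weekly number of beers for 10,000 adults.
population = [rng.randint(0, 14) for _ in range(10_000)]

# Census: measure every case in the population.
census_mean = mean(population)

# Sample: measure only a randomly selected subgroup of 200 cases.
sample = rng.sample(population, 200)
sample_mean = mean(sample)

print(f"Census mean: {census_mean:.2f} beers per week")
print(f"Sample mean: {sample_mean:.2f} beers per week (n = {len(sample)})")
```

The sample mean will usually be close to, but not exactly equal to, the census mean; the sample trades some accuracy for a large saving in time and money.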

THE SAMPLING PROCESS

DEFINE THE POPULATION

Depends on your research question! E.g.: How satisfied are HUB-students with the teaching skills of the HUB-professors?

Defining the population is not always that straightforward. E.g.: Research project assessing consumer response to a new brand of men's moisturiser

Watch out for population specification error = Consequence of not studying a specific part of the target group

Question:

Define the population for the following research questions:

o How do employees of Carrefour think the proposed introduction of compulsory Sunday working will affect their working lives? Population: Employees of Carrefour

o What is the normal range in miles that can be travelled by electric cars in everyday use? Population: Electric cars in everyday use

DETERMINE THE SAMPLE FRAME

A list of all elements in the population from which your sample will be drawn.
Examples:

o Telephone book
o Company's customer database
o Membership lists
o ...

In some cases you will have to develop the sample frame yourself!

Sampling frame error

The sampling frame is not a perfect reproduction of the research population = The variation between the population defined by the researcher and the population as implied by the sampling frame used

Examples of causes of sampling frame errors:
o Not up to date
o Elements of sampling frame that are not part of the population
o Elements of population are not in sampling frame
o Elements that are included multiple times

E.g. of sampling frame errors with a telephone book:
o Not up to date
o Not everyone has a telephone
o Companies are in the phone book as well

Checklist:
o Are elements listed in the sampling frame relevant to your research question?
o How recently was the sampling frame compiled, in particular is it up to date?
o Does the sampling frame include all elements, in other words is it complete?
o Does the sampling frame contain the correct information, in other words is it accurate?
o Does the sampling frame exclude irrelevant cases, in other words is it precise?
o For purchased lists and online panels, can you establish and control precisely how the sample will be selected?
o For an online panel, can you establish whether incentives will be used to enhance the likely response and provide an assessment of the impact of this on respondent characteristics and consequently responses?

You should not generalise beyond your sampling frame

E.g.: Sampling frame consists of all employees of an organisation => You can only generalise to employees of that particular organisation

Sometimes not possible (or very hard) to develop a sampling frame!

Question: Which sampling frame is suited for the following research questions?
How do employees of Carrefour think the proposed introduction of compulsory Sunday working will affect their working lives?
Answer:
Which factors influence Belgian lawyers' decision to work in other European countries?
Answer:

SELECT SAMPLING TECHNIQUES

First of all, you need to decide whether you will examine all elements of the population (= census) or draw a sample.
For populations of fewer than 50 cases, it is usually more sensible to collect data from the entire population.
Draw a sample => Conditions:

o Practical constraints
o Budget constraints
o Time constraints
o Access constraints
o Results need to be quickly available
o Testing includes destroying of population (e.g.: Establish the actual duration of long-life batteries)

Two types of sampling:

1. Probability Sampling
2. Non-probability Sampling

Probability Sampling Techniques

o Sampling techniques in which each element of the population has a fixed probabilistic chance (usually an equal chance) of being selected for the sample.
o It becomes possible to answer research questions that require you to estimate statistically the characteristics of the population from the sample (i.e., with a certain level of confidence, you are able to generalise the findings to the population)
o Probability sampling is often associated with survey and experiment research strategies.

Non-Probability Sampling Techniques

o The probability of each case being selected from the total population is not known.
o It is impossible to answer research questions that require you to make statistical inferences about the characteristics of the population.

Note: You may still be able to generalise from non-probability samples about the population, but not on statistical grounds.

Question: Which of the following statements is true?

A) With probability samples the chance, or probability, of each case being selected from the population is unknown.

B) Generalizations about populations from data collected using any probability sample are based on intuition.

C) Sampling provides a valid alternative to a census when it would be impracticable for you to survey the entire population.

D) The sampling frame gives an overview of all the elements which will be included in your final sample.

Answer: C

Probability Sampling Techniques:
o Simple random sampling
o Systematic random sampling
o Stratified random sampling
o Cluster sampling
o Multi-stage sampling

Non-Probability Sampling Techniques:
o Quota sampling
o Judgemental sampling
o Snowball sampling
o Self-selection sampling
o Convenience sampling

Probability Sampling Techniques

Simple Random Sampling

A probability sampling technique in which each element has a known and equal probability of selection. Every element is selected independently of every other element, and the sample is drawn by a random procedure from a sampling frame. E.g.,

o Each element of the sampling frame is assigned a unique identification number (0, 1, 2, ...)
o Random numbers are generated to determine which elements to include in the sample (e.g., by means of a random number table) until the sample size is reached

Example of random number table:

If the same number is read off a second time, it must be disregarded as you need different cases. This means that you are not putting each case's number back into the sampling frame after it has been selected. This is termed sampling without replacement. If a number is selected that is outside the range of those in your sampling frame, you simply ignore it and continue reading off numbers until your sample size is reached.

Disadvantages of this procedure:

o Time-consuming
o Requires an adapted table with sufficient random numbers

Other random procedures:

o Computer-generated random numbers / online random number generator (as an alternative to random number tables)
o Random telephone numbers
  - Often used when doing computer-aided telephone interviewing (CATI)
  - Dialling telephone numbers at random from an existing database
  - Or random digit dialling
    Advantage: does not depend on the telephone book
    Disadvantage: some households have more than one telephone number!

Simple random sampling:

o Sample without (systematic) bias
o Best used when you have an accurate and easily accessible sampling frame that lists the entire population
  Disadvantage: These lists are not always available!
o If your population covers a large geographical area, random selection means that selected cases are likely to be dispersed throughout the area
  Disadvantage: This technique is not suited to collecting data over a large geographical area using a method that requires face-to-face contact (high travel costs)

Example: Jemma was undertaking her work placement at a large supermarket, where 5011 of the supermarket's customers used the supermarket's Internet purchase and delivery scheme. She was asked to interview customers and find out why they used this scheme. As there was insufficient time to interview all of them, she decided to interview a sample using the telephone. Her calculations revealed that to obtain acceptable levels of confidence and accuracy she needed an actual sample size of approximately 360 customers. She decided to select them using simple random sampling. Having obtained a list of Internet customers and their telephone numbers, Jemma gave each of the cases (customers) in this sampling frame a unique number. In order that each number was made up in exactly the same way, she used 5011 four-digit numbers starting with 0000 through 5010. So customer 677 was given the number 0676. She selected at random a first random number in the random number table. After that, she read off the other random numbers in a regular and systematic manner. She continued in this manner until 360 different cases had been selected. These formed her random sample. Numbers selected that were outside the range of those in her sampling frame (such as 8321, 5953 and 7932) were simply ignored.
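Jemma's procedure can be approximated with computer-generated random numbers instead of a random number table. The sketch below is only an illustration: the function name `simple_random_sample` and the fixed seed are assumptions, not part of the course material. It draws 360 different case numbers from 0000 to 5010 without replacement.

```python
import random

def simple_random_sample(frame_size, sample_size, seed=None):
    """Draw `sample_size` different case numbers from 0..frame_size-1.

    random.sample selects without replacement, so no case can be
    picked twice and no number outside the frame can occur.
    """
    rng = random.Random(seed)  # seed only makes the illustration repeatable
    return rng.sample(range(frame_size), sample_size)

# Jemma's case: 5011 customers numbered 0000..5010, sample of 360
sample = simple_random_sample(frame_size=5011, sample_size=360, seed=42)
labels = [f"{case:04d}" for case in sample]  # four-digit labels such as "0676"
```

Because the generator only produces numbers inside the frame, the step of ignoring out-of-range numbers (such as 8321) is no longer needed.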

Systematic Random Sampling

A probability sampling technique in which the sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame.

Selecting the sample at regular intervals from the sampling frame. Similar to simple random sampling, but in a systematic order: we apply an interval for sample selection.

Example:

o Number each of the cases in your sampling frame with a unique number (0, 1, 2, ...)
  1500 patients: number each of these patients (0, 1, 2, ..., 1499); sample of 300 participants
o Calculate the sampling fraction (actual sample size / total population)
  Sampling fraction: 300/1500 = 1/5
o Select the first case using a random number (depends on sampling fraction)
  Random starting point (i.e., random number between 0 and 4)
o Select subsequent cases systematically (until sample size is reached) using the sampling fraction to determine the frequency of selection
  Continue to select every fifth patient until the sample size of 300 patients is reached.

Systematic random sampling:

o Sometimes not necessary to develop a sampling frame (e.g., every tenth visitor of a website)
o Easy to understand and to explain
o Despite these advantages, be careful when using existing lists as sampling frames
  - You need to ensure that the lists do not contain periodic patterns! (See the examples below)
  - Systematic random sampling is suitable for geographically dispersed cases only if you do not require face-to-face contact when collecting data (as with simple random sampling)
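The patient example (1500 patients, sample of 300, sampling fraction 1/5) can be sketched as follows. This is a minimal illustration under the assumption that the frame size is an exact multiple of the sample size; the name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame_size, sample_size, seed=None):
    """Pick a random start within the first interval, then every i-th case."""
    interval = frame_size // sample_size   # 1500 // 300 = 5, i.e. fraction 1/5
    rng = random.Random(seed)
    start = rng.randrange(interval)        # random number between 0 and 4
    return [start + i * interval for i in range(sample_size)]

patients = systematic_sample(frame_size=1500, sample_size=300, seed=1)
```

Only the starting point is random; every later case follows from it, which is why the order of the list matters so much for this technique.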

The impact of periodic patterns on systematic random sampling: Consider the use of systematic random sampling to generate a sample of monthly sales from the Harrods store in London. The sampling frame contains monthly sales for the last 60 years. A sampling interval of 12 is chosen. Every selected observation then falls in the same calendar month, so seasonal variation in sales is missed and the sample is biased.
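The monthly sales case can be checked with a few lines of code. The sketch assumes a frame of 60 years x 12 months listed in chronological order (the month labels and the seed are illustrative) and shows that an interval of 12 only ever reaches one calendar month.

```python
import random

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Sampling frame: one entry per month for 60 years, in chronological order
frame = [(year, month) for year in range(60) for month in MONTHS]

start = random.Random(0).randrange(12)   # random starting point in the first year
sample = frame[start::12]                # sampling interval of 12

selected_months = {month for _, month in sample}
```

Whatever the starting point, `selected_months` contains a single month, so the sample says nothing about the other eleven.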

A high street bank needs you to administer a questionnaire to a sample of individual customers with joint bank accounts.
Sampling fraction = 1/2 => you will need to select every second customer on the list.
The names on the customer list, which you intend to use as the sampling frame, are arranged as depicted below:

Stratified Random Sampling

You divide the population into two or more relevant strata based on one or a number of attributes (e.g., gender, income, region: these attributes are relevant for your research).

In other words, your sampling frame is divided into a number of subsets. A random (simple or systematic) sample is then drawn from each of the strata.

More concrete:

o Choose the stratification variable(s)
  - These variables need to be relevant for the research problem
  - Stratification needs to result in homogeneity within each stratum with regard to the stratification variable(s)
o Divide the sampling frame into the discrete strata
o Number each of the cases within each stratum with a unique number
o Select your sample using either simple random or systematic random sampling

Example: Sarah worked for a major supplier of office supplies to public and private organisations. As part of her research into her organisation's customers, she needed to ensure that both public and private sector organisations were represented correctly. An important stratum was, therefore, the sector of the organisation. Her sampling frame was thus divided into two discrete strata: public sector and private sector. Within each stratum, the individual cases were then numbered.

She decided to select a systematic random sample. A sampling fraction of 1/4 meant that she needed to select every fourth customer on the list. As indicated by the ticks, random numbers were used to select the first case in the public sector (001) and private sector (003) strata. Subsequently, every fourth customer in each stratum was selected.

Stratified random sampling:

o Dividing the population into a series of relevant strata means that the sample is more likely to be representative, as you can ensure that each of the strata is represented proportionally within your sample.
  Proportionate stratified random sampling = the sample sizes drawn from the strata are proportionate to each stratum's share of the total population
o Disproportionate stratified random sampling (oversampling enables separate analyses)
o Despite the advantages of proportionate and disproportionate sampling, there are some disadvantages as well:
  - Only possible if you can easily distinguish significant strata (in your sampling frame)
  - Extra stage of sampling procedure
    More time
    More expensive
    More difficult to explain compared to simple and systematic random sampling
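A proportionate stratified sample along the lines of Sarah's example might be sketched as below. The stratum sizes (120 public, 280 private customers), the customer labels and the function name are invented for illustration; the source only gives the sampling fraction of 1/4. For brevity a simple random sample is drawn within each stratum, whereas Sarah used a systematic one.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction from every stratum (simple random within strata)."""
    rng = random.Random(seed)
    sample = {}
    for name, cases in strata.items():
        k = round(len(cases) * fraction)   # stratum's share of the sample
        sample[name] = rng.sample(cases, k)
    return sample

# Hypothetical sampling frame, already divided into two discrete strata
strata = {
    "public": [f"PUB{i:03d}" for i in range(120)],
    "private": [f"PRI{i:03d}" for i in range(280)],
}
sample = proportionate_stratified_sample(strata, fraction=1 / 4, seed=7)
```

Using a different fraction per stratum would turn this into disproportionate stratified sampling.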

Cluster Sampling

All elements of a number of randomly selected clusters are selected.

More concrete:

o Choose the cluster grouping for your sampling frame
  - Heterogeneity in clusters is important! Cluster = small universe (e.g., Population = football lovers; Cluster = football stadium)
o Number each of the clusters with a unique number (0, 1, ...)
o Select your sample of clusters using some form of random sampling
o Select all elements of the selected clusters

Every cluster has an equal chance to be selected => Random sampling technique

Still, the technique normally results in samples that represent the total population less accurately compared to stratified random sampling (so make sure that clusters are heterogeneous!)

Advantage: Restricting the sample to a few relatively compact geographical sub-areas (clusters) maximises the amount of data you can collect using face-to-face methods within the resources available.

Example:

Abdel needed to select a sample of firms to undertake an interview-based survey about the use of large multiple-purpose digital printer copiers. As he had limited resources with which to pay for travel and other associated data collection costs, he decided to interview firms in four geographical areas selected from a cluster grouping of local administrative areas. A list of all local administrative areas formed his sampling frame. Each of the local administrative areas (clusters) was given a unique number, the first being 0, the second 1 and so on. The four sample clusters were selected from this sampling frame of local administrative areas using simple random sampling.

Abdel's sample was all firms within the selected clusters. He decided that the appropriate telephone directories would probably provide a suitable list of all firms in each cluster.
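Abdel's approach (randomly select four clusters, then take every firm inside them) could look like this in code. The 20 areas with 5 firms each are made-up data and `cluster_sample` is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep ALL elements of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # simple random sample of cluster numbers
    return {cluster: list(clusters[cluster]) for cluster in chosen}

# Hypothetical frame: local administrative areas numbered 0, 1, ... with their firms
areas = {area: [f"firm_{area}_{j}" for j in range(5)] for area in range(20)}
selected = cluster_sample(areas, n_clusters=4, seed=3)
```

Note that the sampling frame here is the list of clusters, not the list of firms.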

Stratified random sampling vs. Cluster sampling

Multi-stage sampling

Select a stage and research within the cluster.

o Modifying a cluster sample by adding at least one more stage of sampling that also involves some form of random sampling
o Procedure:
  - Choose the cluster grouping for your sampling frame (heterogeneity in clusters is important!)
  - Number each of the clusters with a unique number (0, 1, ...)
  - Randomly select a number of clusters
  - Repeat the above steps (e.g., districts => cities => neighbourhoods => streets)
  - Randomly select elements of the most recently selected clusters

Example: Laura worked for a market research organisation that needed her to interview a sample of 400 households in England and Wales. She decided to use the electoral register as a sampling frame. Laura knew that selecting 400 households using either systematic or simple random sampling was likely to result in these 400 households being dispersed throughout England and Wales, resulting in considerable amounts of time spent travelling between interviewees as well as high travel costs. By using multi-stage sampling, Laura felt these problems could be overcome. In her first stage, the geographical area (England and Wales) was split into discrete sub-areas (counties). These formed her sampling frame. After numbering all the counties, Laura selected a small number of counties using simple random sampling. Since each case (household) was located in a county, each had an equal chance of being selected for the final sample. As the counties selected were still too large, each was subdivided into smaller geographically discrete areas (electoral wards). These formed the next sampling frame (stage 2). Laura selected another simple random sample. This time she selected a larger number of wards to allow for likely important variations in the nature of households between wards. A sampling frame of the households in each of these wards was then generated using a combination of the electoral register and the UK Royal Mail's postcode address file. Laura finally selected the actual cases (households) that she would interview using systematic random sampling.

Multi-stage sampling:

Advantages:

o Sampling a geographically dispersed population becomes possible at lower cost.
o Compared to normal cluster sampling, larger clusters with many cases are possible

Disadvantages:

o Selecting smaller and smaller subgroups might impact the representativeness of your sample => Can be solved through applying stratified random sampling techniques as well
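The three stages in Laura's example (counties, wards, households) can be sketched as nested random selections. All sizes, labels and the function name below are invented for illustration; the real frames would come from the electoral register and the postcode address file.

```python
import random

def multi_stage_sample(counties, n_counties, n_wards, household_interval, seed=None):
    """Stage 1: random counties; stage 2: random wards; stage 3: systematic households."""
    rng = random.Random(seed)
    households = []
    for county in rng.sample(sorted(counties), n_counties):        # stage 1
        wards = counties[county]
        for ward in rng.sample(sorted(wards), n_wards):            # stage 2
            frame = wards[ward]                                    # household list of this ward
            start = rng.randrange(household_interval)              # stage 3: random start
            households.extend(frame[start::household_interval])    # then every i-th household
    return households

# Hypothetical data: 8 counties x 10 wards x 50 households
counties = {
    f"county{c}": {
        f"ward{c}-{w}": [f"hh{c}-{w}-{h}" for h in range(50)] for w in range(10)
    }
    for c in range(8)
}
sample = multi_stage_sample(counties, n_counties=2, n_wards=4,
                            household_interval=10, seed=5)
```

All sampled households sit in just two counties, which is what keeps travel costs down.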

Impact of various factors on choice of probability sampling techniques:

o Sampling frame required
o Size of sample needed
o Geographical area to which suited
o Necessity of personal contact with respondents
o Relative cost
o Easy to explain to support workers?
o Advantage compared with simple random sampling
o ...

Question:

o BNP Paribas Fortis has about 400 000 Benelux clients using their credit card. The credit card application form contains common information such as name, address, age, telephone number, educational level, etc.
o BNP Paribas Fortis wants to examine whether there is a relationship between the way in which credit cards are used (e.g., frequency of use) and the socio-economic profile of its users.

Questions: Identify the population and the sampling frame. Consider the suitability of the various probability sampling techniques in this situation.

Answer:

Non-Probability Sampling Techniques

Quota Sampling

Stratified sampling, though the selection of cases within each stratum is not random (often used for structured interviews as part of a survey strategy).

Procedure:

o Divide the population into specific subgroups (quota) based on relevant variables
o Calculate, based on relevant and available data, for each subgroup the number of elements to be selected
o Give each interviewer an assignment which states the number of cases in each quota from which they must collect data
o Combine the data collected by interviewers to provide the full sample

E.g.: 2 quotas, Female and Male. Then within each group we select the elements.

Quotas:

o Are based on relevant and available data
o Are usually relative to the proportions in which they occur in the population* (e.g., 48% females in population => 480 females and 520 males being selected in a sample of 1000 participants)
o Without sensible and relevant quotas, data collected may be biased

*Precision control = Proportions in sample perfectly mirror the proportions in the population

Precision control, example:

o Interest in consumption habits among people aged 16+ in a medium-sized village
o Sample must be representative in terms of residence and age
o Population statistics: 24 420 residents aged 16+
o Sample: 1/12 of population => 2035 sample cases
o 3 districts and 4 age groups => 12 quotas

Frequency Control:

Representative in terms of criterion

Question: An association has 750 members. In the table below, the distribution of these members is given in terms of gender and age.

Draw a quota sample of 125 subjects, taking into account: - Gender - Age - Gender & Age

Answer:

Gender:
Males: 125 * (367/750) = 61
Females: 125 * (383/750) = 64

Age:
18-25: 125 * (173/750) = 29
26-49: 125 * (379/750) = 63
50+: 125 * (198/750) = 33

Gender & Age: (We now consider every single cell)
Males, 18-25: 125 * (98/750) = 16
Males, 26-49: 125 * (191/750) = 32
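The quota calculation in this answer is the same proportional rule applied to each group. A small sketch, using only the group sizes that appear in the answer (the function name `quota_allocation` is an assumption):

```python
def quota_allocation(group_sizes, sample_size):
    """Quota per group = sample size * (group size / population size), rounded."""
    total = sum(group_sizes.values())
    return {group: round(sample_size * size / total)
            for group, size in group_sizes.items()}

gender_quota = quota_allocation({"male": 367, "female": 383}, sample_size=125)
age_quota = quota_allocation({"18-25": 173, "26-49": 379, "50+": 198}, sample_size=125)
```

With independent rounding the quotas do not always add up exactly to the sample size; here they do (61 + 64 and 29 + 63 + 33 both give 125).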

Quota Sampling:

Advantages compared to probability sampling techniques

o Less costly
o Can be set up very quickly
o Does not require a sampling frame

Disadvantages

o Because the interviewer can choose within quota boundaries whom they interview, your quota sample may be subject to bias (e.g., easily accessible respondents who appear to be willing to answer the questions)
o As the sample is not probability based, you cannot measure the level of certainty or margins of error

Judgemental Sampling

o = Purposive sampling
o You need to use your judgement to select cases that will best enable you to answer your research question
o Often used when:
  - Working with very small samples (such as in case study research or when you wish to select cases that are particularly informative), e.g., industrial research among experts
  - Doing qualitative research
  - Doing exploratory research

Those samples cannot be considered to be statistically representative of the total population!

The more common judgemental sampling strategies:

o Extreme case or deviant sampling
o Heterogeneous or maximum variation sampling
o Homogeneous sampling (focus group discussions: stimulate conversation by putting similar people together, as they will then be likely to debate more)
o Critical case sampling
o Typical case sampling
o Theoretical sampling (cf. Grounded Theory)

Snowball Sampling

Commonly used when it is difficult to identify members of the desired population.

Procedure:

o Make contact with one or two cases in the population
o Ask these cases to identify further cases
o Ask these new cases to identify further new cases (and so on)
o Stop when either no new cases are given or the sample is large enough (or when theoretical saturation is reached)

Main problem = Making initial contact
The risk of bias is huge:

o Respondents are most likely to identify other potential respondents who are similar to themselves, resulting in a homogeneous sample

Self-selection Sampling

o It occurs when you allow each case, usually individuals, to identify their desire to take part in the research
o You therefore:
  - Publicise your need for cases either by advertising or by asking them to take part
  - Collect data from those who respond
o Problem = representativeness
  - Cases that self-select often do so because of their feelings or opinions about the research question

E.g.: People posting on Facebook to ask people to participate in a survey. Problem: Bias

Example: Patrick's research was concerned with the impact of student loans on studying habits. He had decided to administer his questionnaire using the Internet. He publicised his research on Facebook on a number of groups' pages, using the associated description to invite people to self-select by clicking on the link to the questionnaire. Those who self-selected by clicking on the hyperlink were automatically taken to the online questionnaire he had developed using the Qualtrics online survey software.

Convenience Sampling

o Involves selecting cases haphazardly only because they are easily available (or most convenient) to obtain for your sample
  - E.g., the person interviewed at random in a shopping centre for a television programme
o Widely used
o Advantages:
  - Cheap
  - Quick (suited for exploratory research)
o Though prone to bias and influences that are beyond your control: cases appear in the sample only because of the ease of obtaining them. Bias decreases as the population becomes more homogeneous

Impact of various factors on choice of non-probability sampling techniques

o Likelihood of the sample being representative
o Types of research in which useful (e.g., non-probability techniques often used in exploratory research)
o Relative costs (Note: Non-probability techniques are often used as they often imply lower costs compared to probability sampling techniques)
o Note: Where it is not possible to construct a sampling frame you will need to use non-probability sampling techniques

Question: For the following research question, it has not been possible for you to obtain a sampling frame. Suggest the most suitable sampling technique to obtain the necessary data, and motivate your choice.
Research question: Would users of the tennis club be prepared to pay a 10 per cent increase in subscriptions to help fund two extra tennis courts? You need the answer by tomorrow morning.
Answer: Convenience sample (not much time). But if we had time, a probability sampling technique.
EXAM: If we have a sampling frame => Probability technique

For many research projects, you will have to combine different sampling techniques.

Question: Is the following statement true or false? Motivate your answer.
Stratified sampling can be seen as random quota sampling.
Answer:

DETERMINE THE SAMPLE SIZE

Probability sampling techniques

The confidence interval approach

Normal Distribution

95% of the values lie between -1.96 * standard deviation and +1.96 * standard deviation around the mean

A z-score of 1.96 corresponds with a confidence level of 95%

Statistical Inference

Important in research is to calculate statistics, such as the sample mean and sample proportion, and use them to estimate the corresponding true population values.
Statistical inference: The process of generalising the sample results to a target population

Confidence Intervals

o We are thus interested in using the sample statistic (e.g., the sample mean) as an estimate of the value in the population

o An approach to assessing the accuracy of the sample mean as an estimate of the mean in the population is to calculate boundaries (Confidence Intervals) within which we believe the true value of the mean will fall

o Typically, we look at 95% confidence intervals
o This means that 95% of the time, the true value of the population will fall within the boundaries of the confidence interval
o In other words, if you collected 100 samples, calculated the mean of each and then calculated a confidence interval for that mean, then for about 95 of these samples, the confidence interval would contain the true value of the mean in the population
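The "95 out of 100 samples" interpretation can be checked by simulation. The sketch below assumes the standard interval for a known population standard deviation, sample mean +/- z * sigma / sqrt(n), which these notes do not spell out; the population values (mean 100, standard deviation 55), the sample size and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def ci_coverage(mu, sigma, n, repetitions, z=1.96, seed=0):
    """Share of repeated samples whose confidence interval contains the true mean."""
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5           # assumed known-sigma interval
    hits = 0
    for _ in range(repetitions):
        sample_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / repetitions

coverage = ci_coverage(mu=100, sigma=55, n=50, repetitions=2000)
```

The resulting share should lie close to 0.95, although any single run will deviate slightly from it.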

o $\bar{X}$ = sample mean
o $\mu$ = population mean
o $\sigma$ = standard deviation of population
o n = sample size
o Confidence level (Z)

The range of a normally distributed variable is approximately equal to +/- 3 standard deviations, and one can thus estimate the standard deviation by dividing the range by 6.

Example:

Suppose a researcher wants to estimate the monthly household savings more precisely so that the estimate will be within +/- 5 of the true population value. What should be the size of the sample?

1) The researcher should specify the level of precision. This is the maximum permissible difference between the sample mean and the population mean.
   D = 5

2) The researcher should specify the level of confidence and determine the z-value associated with this confidence level.
   Confidence level = 95% => z-value = 1.96

3) The researcher should determine the standard deviation of the population.
   Secondary sources indicate a standard deviation of 55 (= $\sigma$)

D = 5, z = 1.96, $\sigma$ = 55

$n = \frac{\sigma^2 z^2}{D^2} = \frac{55^2 \times 1.96^2}{5^2} = 464.8$ => 465 (rounded to next highest integer)

Sample size:
o The larger the variability in the population ($\sigma$), the larger the sample size
o The higher the degree of confidence, the larger the sample size
o The higher the degree of precision, the larger the sample size

The choice of sample size is thus governed by:
o The confidence you need to have in your data: The level of certainty that the characteristics of the data collected will represent the characteristics of the total population
o The margin of error that you can tolerate: The accuracy you require for any estimates made from your sample
o The variability in the population in terms of the variable(s) of interest

Define the level of precision = The maximum permissible difference (D) between the sample mean and the population mean

We already determined the level of precision (D)

But what about Z and $\sigma$?
o Specifying Z is about specifying the level of confidence
  A 95% confidence level is desired => Z = 1.96
o Determine $\sigma$ (= the standard deviation of the population)
  Secondary sources, pilot study or (max value - min value)/6

EXAM question: A big company wants to know how much money (in euro) each of its managers spends on lunches per month. It is known that the maximum amount of money spent is 700 euro while the minimum is 400 euro. The company wants the result to be accurate to within 5 euro and wants to make a prediction with a confidence level of 95%. How large should the sample size be?

Answer:
Level of precision: D = 5
Confidence 95%: z = 1.96
$\sigma = (700 - 400)/6 = 50$
$n = \frac{1.96^2 \times 50^2}{5^2} = 384.2$ => 385
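Both worked examples for means follow the same calculation, n = z^2 * sigma^2 / D^2, rounded up. A short sketch (the function names are assumptions for illustration):

```python
import math

def sample_size_mean(precision, sigma, z=1.96):
    """n = (z * sigma / D)^2, rounded up to the next integer."""
    return math.ceil((z * sigma / precision) ** 2)

def sigma_from_range(max_value, min_value):
    """Rule of thumb: the range covers about 6 standard deviations."""
    return (max_value - min_value) / 6

savings_n = sample_size_mean(precision=5, sigma=55)                        # household savings example
lunch_n = sample_size_mean(precision=5, sigma=sigma_from_range(700, 400))  # exam question
```

Halving D to 2.5 would quadruple the required sample size, because the precision enters the formula squared.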

Sample size determination: Proportions

Example: Suppose a researcher is interested in estimating the proportion of households in a particular region that have bought clothes online. What should be the sample size?

1) The researcher should specify the level of precision. This is the maximum permissible difference between the sample proportion and the population proportion.
   D = 0.05

2) The researcher should specify the level of confidence and determine the z-value associated with this confidence level.
   Confidence level = 95% => z-value = 1.96

3) The researcher should determine the population proportion.
   Secondary sources indicate a population proportion of 0.64

D = 0.05, z = 1.96, $\pi$ = 0.64

$n = \frac{z^2 \, \pi (1 - \pi)}{D^2} = \frac{1.96^2 \times 0.64 \times (1 - 0.64)}{0.05^2} = 354.04$ => 355

EXAM question: A researcher wants to know the percentage of households that has a loyalty card of a certain supermarket. You desire a precision level of 5 percentage points (and a 95% confidence level). How large should the sample size be?

Level of precision: D = 5
Confidence 95%: z = 1.96
$\pi$ = 50 (in per cent)
$n = \frac{1.96^2 \times 50 \times (100 - 50)}{5^2} = 384.2$ => 385

$\pi$? Population proportion? Secondary sources, pilot study, or conservative (= 0.5)
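The proportion examples can be reproduced with the formula n = z^2 * pi * (1 - pi) / D^2, working with proportions between 0 and 1. In this sketch the function name is an assumption, and the default of 0.5 is the conservative choice mentioned in the notes.

```python
import math

def sample_size_proportion(precision, proportion=0.5, z=1.96):
    """n = z^2 * pi * (1 - pi) / D^2, rounded up to the next integer."""
    return math.ceil(z ** 2 * proportion * (1 - proportion) / precision ** 2)

online_n = sample_size_proportion(precision=0.05, proportion=0.64)  # clothes bought online
loyalty_n = sample_size_proportion(precision=0.05)                  # exam question, pi = 0.5
```

The product pi * (1 - pi) is largest at 0.5, which is why 0.5 gives the largest (safest) sample size when the proportion is unknown.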

Other factors influence the determination of the sample size:
o Time resources
o Financial resources
o Type of data analysis
o Access
o Expected response (see later: response rate)
o ...

Non-probability sampling

o Formulas of probability sampling techniques
  - Are based on the assumption that the sample cases are randomly selected
  - Formulas are just guidelines
o Larger sample sizes do not necessarily lead to higher levels of confidence and precision
o However, take into account
  - Variability in the target group
  - Goal of sampling
  - Importance of research for management/client
o Or you could consider sample sizes used in similar studies, for instance

Non-response and response rate

The non-sampling response problem

o In reality, you are likely to have non-responses
o Possible causes of non-response
  - Refusal to participate
  - Ineligibility to respond
  - Inability to locate respondent
  - Respondent located but unable to make contact
o Possible consequences of non-response
  - Lower confidence and precision levels due to smaller sample size
  - Non-response bias: People who refuse differ from actual respondents
o As part of your research report, you will need to include the response rate:

Total response rate = Total number of responses / (Total number in sample - Ineligible)

Active response rate = Total number of responses / (Total number in sample - (Ineligible + Unreachable))

Total and active response rate: Example

Suzan has decided to administer a telephone questionnaire to people who had left her company over the past five years. She obtained a list of the 1034 people who had left over this period (the total population) and selected a 50 per cent sample. Unfortunately, she could obtain current telephone numbers for only 311 of the 517 ex-employees who made up her total sample. Of these 311 people who were potentially reachable, she obtained a response from 147. In addition, her list of people who had left her company was inaccurate, and 9 of those she contacted were ineligible to respond, having left the company over five years earlier.

Total response rate = Total number of responses / (Total number in sample - Ineligible)
Total response rate = 147 / (517 - 9) = 28.9 %

Active response rate = Total number of responses / (Total number in sample - (Ineligible + Unreachable))
Active response rate = 147 / (311 - 9) = 48.7 %

Estimating response rates and actual sample size required

o Non-response = Reality. You should estimate the likely response rate and increase the sample size accordingly
- First of all, determine the minimal sample size (taking into account certain confidence and precision levels)
- Second, estimate the likely response rate
- Third, calculate the actual sample size you require

Example: Peter was a part-time student employed by a large manufacturing company. He had decided to send a questionnaire to the company's customers and calculated that a minimum sample size of 439 was required. From previous questionnaires that his company had used to collect data from customers, Peter knew the likely response rate would be approximately 30 per cent. Using these data he could calculate his actual sample size:
na = 439 x 100 / 30 = 43 900 / 30 = 1463
Peter's actual sample size, therefore, needed to be 1463 customers.

na = n x 100 / re%

na = The actual sample size required
n = The minimum sample size
re% = The estimated response rate expressed as a percentage
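The calculation of the actual sample size can be sketched in Python as follows. The function name is an assumption made for illustration. The worked example in these notes reports 1463 (rounded to the nearest whole number); rounding up instead is shown as a more cautious alternative, which is not prescribed by the notes.

```python
import math


def actual_sample_size(minimum_n, response_rate_pct):
    """na = n x 100 / re%, returned unrounded."""
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response rate must be a percentage between 0 and 100")
    return minimum_n * 100 / response_rate_pct


# Peter's figures: minimum sample size 439, estimated response rate 30 per cent.
na = actual_sample_size(439, 30)
print(round(na))      # nearest whole number, as reported in the example
print(math.ceil(na))  # rounding up, so the expected responses do not fall below n
```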

How to estimate the response rate?

o Consider the response rates achieved for similar research that has already been undertaken
Beware, response rates can vary considerably when collecting primary data!
- E.g., postal questionnaires: often lower than 50% (?)
- E.g., face-to-face contact: often higher (?)
- E.g., online questionnaires: often lower than 30% (?)
o Alternatively, err on the side of caution (35-50 per cent reasonable?)

Possible consequences of non-response: Lower confidence and precision levels due to smaller sample size
Increasing the actual sample size is useful when non-response only results in less confidence and precision. However, increasing the actual sample size is no solution
o When doing longitudinal research in which the same respondents need to be re-examined
o If it is a matter of non-response bias
- Refusers differ on observable characteristics (gender, education, etc.) compared to respondents
- Refusers might also differ on non-observable characteristics!

How to trace non-response bias?

o Comparing characteristics of respondents with refusers
- At the moment of refusal
- Afterwards by means of additional contact
o Comparing characteristics of respondents with population
o Still not a solution when there are differences in terms of non-observable characteristics

How to tackle non-response bias?

o Increasing the number of contacts
o Work with substitutes that are randomly selected, but which match on crucial characteristics (e.g., gender)
- However, this measure is not able to solve the bias completely
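Working with matched substitutes could look roughly like the sketch below. It is only an illustration: the sampling frame, the field names ("id", "gender") and the function name are hypothetical, and matching on observable characteristics still leaves any bias on non-observable characteristics untouched.

```python
import random


def pick_substitute(refuser, frame, already_sampled, match_on=("gender",), rng=None):
    """Randomly pick a not-yet-sampled person who matches the refuser on the given keys."""
    rng = rng or random.Random()
    candidates = [
        person
        for person in frame
        if person["id"] not in already_sampled
        and all(person[key] == refuser[key] for key in match_on)
    ]
    # Return None when no matching substitute is left in the sampling frame.
    return rng.choice(candidates) if candidates else None


# Hypothetical sampling frame of five people; persons 1 and 2 were in the original sample.
frame = [
    {"id": 1, "gender": "f"},
    {"id": 2, "gender": "m"},
    {"id": 3, "gender": "f"},
    {"id": 4, "gender": "m"},
    {"id": 5, "gender": "f"},
]
already_sampled = {1, 2}
refuser = frame[0]  # person 1 refused to participate

substitute = pick_substitute(refuser, frame, already_sampled, rng=random.Random(42))
print(substitute)
```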

EXECUTE SAMPLING PROCESS

VALIDATE SAMPLE

o Once data are collected from a sample, comparisons between the structure of the sample and the structure of the population should be made

o If it is found that the structure of a sample does not match the target population (due to population specification error, sampling frame error, sample selection bias, non-response bias), a weighting scheme can be used

- A statistical procedure that attempts to account for these errors/biases by assigning differential weights to the data depending on the response rates

Weighting

Each case in the database is assigned a weight
The effect of weighting is to increase or decrease the number of cases in the sample that possess certain characteristics
Most widely used to make the sample data more representative of a target population on specific characteristics
Also used to adjust the sample so that greater importance is attached to participants with certain characteristics
Because it destroys the self-weighting nature of the sample design, this procedure should be applied with caution!
Do not forget to report this procedure!
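One simple weighting scheme divides each group's share in the population by its share in the sample. The sketch below assumes this scheme (the notes do not specify which scheme is meant) and uses hypothetical figures and names.

```python
def group_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share (assumes every count > 0)."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / n)
        for group, count in sample_counts.items()
    }


# Hypothetical figures: women are over-represented in the sample.
sample_counts = {"female": 90, "male": 60}
population_shares = {"female": 0.5, "male": 0.5}

weights = group_weights(sample_counts, population_shares)
for group, weight in weights.items():
    weighted_cases = weight * sample_counts[group]
    print(f"{group}: weight {weight:.3f}, weighted cases {weighted_cases:.1f}")
```

Over-represented groups receive a weight below 1 and under-represented groups a weight above 1, so that the weighted sample matches the population structure on the chosen characteristic.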

Chapter 5: Using secondary data

Research questions might be answered using some combination of primary and secondary data. Using secondary data can save money and time.

TYPES OF SECONDARY DATA AND USES IN RESEARCH

May be both quantitative and qualitative data

May be raw data (received little if any processing) or compiled data (received some form of selection or summarising)

Primarily used in descriptive and explanatory research (also possible in exploratory research!)

Within business and management research, secondary data are most frequently used as part of a case study or survey research strategy (also used as part of other research strategies!)

Three main subgroups of secondary data

DOCUMENTARY SECONDARY DATA

Often used in research projects that also collect primary data (But you can also use them on their own or with other sources of secondary data!)

Include text materials and non-text materials (text can be useful for creating a background)
Can be analysed both quantitatively and qualitatively
Can be used to help to triangulate findings based on other data
The documentary sources available to you can depend on access issues as well as on success in locating these sources

SURVEY-BASED SECONDARY DATA

Data collected using a survey strategy (e.g., questionnaires) that have already been analysed for their original purpose

Collected through one of three distinct subtypes of survey strategy:
o Censuses
- Usually carried out by governments. Data are often: clearly defined, well documented, of high quality, easily accessible, widely used
- Are unique as, unlike surveys, participation is obligatory. Therefore, they provide very good coverage of the population surveyed
- E.g., population and housing censuses
o Continuous and regular surveys
- Those surveys, excluding censuses, that are repeated over time
E.g., surveys where data are collected throughout the year (e.g., UK's General Lifestyle Survey (GLF))
E.g., surveys repeated at regular intervals (e.g., EU Labour Force Survey): comparative data
- Also carried out by non-governmental bodies. E.g., market research surveys. Data often costly to obtain
- Also carried out by large organisations. E.g., employee attitude survey. Often difficult to gain access due to sensitive nature
o Ad-hoc surveys (result of one survey / doing the survey just once)
- = A general term normally used to describe the collection of data that only occurs once due to the specificity of focus
- Usually one-off surveys
- Usually far more specific in their subject matter
- Because of their ad hoc nature, it will probably be more difficult to discover relevant surveys

MULTIPLE-SOURCE SECONDARY DATA

= Secondary data created by combining two or more different data sets prior to the data being accessed for the research. These data sets can be based entirely on documentary or on survey data, or can be an amalgam of the two

E.g., Various compilations of company information
o E.g., Europe's 15,000 Largest Companies

Some methods of compilation
o Extract and combine selected comparable variables from a number of surveys or from the same survey that has been repeated a number of times to provide longitudinal data (time-series data)

o Data compiled from the same cases over time using a series of snapshots to form cohort studies

o Secondary data from different sources can be combined, if they have the same geographical basis, to form area-based data sets (E.g., Europe in Figures: Eurostat Yearbook)

Question
The Facebook page of McDonald's is an example of:
a) Documentary secondary data
b) Survey secondary data
c) Multiple source secondary data
d) None of the above types of secondary data
Answer: a) Documentary secondary data

HOW TO LOCATE SECONDARY DATA?

Are the data you need available? Requires you to:

STEP 1: Establish whether the sort of data you require are likely to be available as secondary data
STEP 2: Locate the precise data you require

STEP 1: ESTABLISHING THE LIKELY AVAILABILITY OF SECONDARY DATA

Literature review (Reference list)
Quality national newspapers
Subject-specific textbooks
Tertiary literature (e.g., indexes and catalogues)
Informal discussions

STEP 2: LOCATE SECONDARY DATA

Once you have ascertained that secondary data are likely to exist, you need to find their precise location

o Relatively straightforward for secondary data held in online databases or held by specialist libraries
o Data held by organisations are more difficult to locate (time consuming, quality?, etc.)
o Once you have located a possible secondary data set, you need to be certain that it will meet your needs

Advantages of secondary data
May have fewer resource requirements
Unobtrusive
Longitudinal studies may be feasible
Can provide comparative and contextual data
Can result in unforeseen discoveries
Permanence of data

Disadvantages of secondary data
May be collected for a purpose that does not match your need
Access may be difficult or costly
Aggregations and definitions may be unsuitable
No real control over data quality
Initial purpose may affect how data are presented

EVALUATING SECONDARY DATA SOURCES

We need to make sure that the secondary data are valid.

Secondary data must be viewed with the same caution as any primary data!
You need to be sure that:
o They will enable you to answer your research question
o The benefits associated with their use will be greater than the costs
o You will be allowed access to the data

Most authors suggest a range of validity and reliability criteria against which potential secondary data can be evaluated

These criteria can be incorporated into a three-stage process
1. Overall suitability
2. Precise suitability
3. Costs and benefits

1. OVERALL SUITABILITY

Measurement validity
o Do the measures used match those you need?
o E.g., A manufacturing organisation recording monthly sales whereas you are interested in monthly orders
o E.g., Use minutes of company meetings as a proxy for what actually happened in those meetings

Coverage and unmeasured variables
o Do secondary data cover the population about which you need data, for the time period you need, and contain variables that will enable you to answer the research question?
o Some secondary data sets may not include variables you have identified as necessary for your analysis (i.e., unmeasured variables)

Checklist:
Does the data set contain the information you require to answer your research question(s)?
Do the measures used match those you require?
Is the data set a proxy for the data you really need?
Does the data set cover the population that is the subject of your research?
Does the data set cover the geographical area that is the subject of your research?
Can data about the population that is the subject of your research be separated from unwanted data?
Are the data for the right time period or sufficiently up to date?
Are data available for all the variables you require to answer your research question(s)?
Are the variables defined clearly?

2. PRECISE SUITABILITY

Reliability and validity
o Quick option: Assess the authority or reputation of the source
o In-depth assessment:
- Who is responsible for the data?
- Method used to collect the data? (we can contact the person who conducted the research and ask about the methodology)
- Context in which the data were collected?
- How were data analysed and reported?

Measurement bias: Can occur for two reasons
1) Deliberate or intentional distortion of data
- E.g., Purpose of study is to reach a predetermined conclusion
- E.g., People responding to a structured interview adjusting their responses to please the interviewer
- Triangulation!
2) Changes in the way data were collected
- Particularly important for longitudinal data sets!

Checklist:
How reliable is the data set you are thinking of using?
How credible is the data source?
Is it clear what the source of the data is?
Do the credentials of the source of the data (author, institution or organisation sponsoring the data) suggest it is likely to be reliable?
Do the data have an associated copyright statement?
Do associated published documents exist?
Does the source contain contact details for obtaining further information about the data?
Is the method described clearly?
If sampling was used, what was the procedure and what were the associated sampling errors and response rates?
Who was responsible for collecting or recording the data?
(For surveys) Is a copy of the questionnaire or interview checklist included?
(For compiled data) Are you clear how the data were analysed and compiled?
Are the data likely to contain measurement bias?
What was the original purpose for which the data were collected?
Who was the target audience and what was its relationship to the data collector or compiler (were there any vested interests)?
Have there been any documented changes in the way the data are measured or recorded, including definition changes?
How consistent are the data obtained from this source when compared with data from other sources?
Have the data been recorded accurately?

3. COSTS AND BENEFITS

Checklist:
What are the financial and time costs of obtaining these data?
Can the data be downloaded into a spreadsheet, statistical analysis software or word processor?
Do the overall benefits of using these secondary data sources outweigh the associated costs?

Question: Suppose you are undertaking a research project as part of your research methods course in which you need to investigate the following research question:

How has Belgium's import and export trade with other countries altered since its entry into the European Union?

Give one argument that you could use to convince the project leader of the suitability of using secondary data to answer this research question.
Answer: Gathering primary data is not feasible within the available time, as it is very time consuming.

Question: Which of the following statements is wrong?

A) Primary data become secondary data
B) Primary data are more reliable and valid compared to secondary data
C) Research projects might combine primary and secondary data.
D) Secondary data enable researchers to triangulate their primary research findings

Answer: B) Primary data are more reliable and valid compared to secondary data

Chapter 6: Collecting Primary Data through observation

If your research question is concerned with what people do, an obvious way in which to discover this is to watch them do it. This is essentially what observation involves:

The systematic observation, recording, description, analysis and interpretation of (people's) behaviour

Observation is very often used in marketing research.
E.g.: Loyalty card: observing the buying pattern of customers. Cookies online: browsing behaviour (structured observation)

Two types of observation are examined in this chapter

1. Participant observation
Qualitative
Emphasis is on discovering the meaning that people attach to their actions

2. Structured observation
Quantitative
Emphasis is on the frequency of actions
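Because structured observation focuses on how often actions occur, its raw output can be thought of as a tally. The sketch below is purely illustrative: the observation log and the action labels are hypothetical.

```python
from collections import Counter

# Hypothetical observation log: (time, observed action) for shoppers at one shelf.
observations = [
    ("09:01", "picks up product"),
    ("09:03", "reads label"),
    ("09:04", "picks up product"),
    ("09:07", "puts product back"),
    ("09:09", "picks up product"),
    ("09:12", "reads label"),
]

# Count how often each action was observed.
frequencies = Counter(action for _, action in observations)
for action, count in frequencies.most_common():
    print(f"{action}: {count}")
```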

Observation can be used as either the main method of data collection or to supplement other methods!

What can be observed?
Behaviour and physical actions
Verbal behaviour
Body language
Spatial aspects of relationships
Time patterns
Physical objects (products on shelves)
Activities from the past
...

Dimensions on which observation methods differ
Natural or manipulated
- Natural: observe people in their natural environment
- Manipulated: in a lab setting
Personal or mechanical
- Personal: a human being observing
- Mechanical: done by an eye tracker (to see what attracts people more)
Hidden or not hidden
- Hidden: the identity of the observer is concealed

Question: The sellers of a multimedia store visit competitors' stores and write down their prices. This observation is:
a) Natural - Personal - Not hidden
b) Manipulated - Mechanical - Not hidden
c) Manipulated - Personal - Hidden
d) Natural - Personal - Hidden
Answer: d) Natural - Personal - Hidden

PARTICIPANT OBSERVATION

Qualitative

Emphasis is on discovering the meaning that people attach to their actions.

Observation in which the researcher attempts to participate fully in or closely observe the lives and activities of the research subjects and thus becomes a member of the subjects' group(s), organisation(s) or community

This enables researchers to share their experiences by not merely observing what is happening but also feeling it

E.g., Street Corner Society by W.F. Whyte

TYPOLOGY OF PARTICIPANT OBSERVATION RESEARCHER ROLES

The time you have to conduct the research will determine the role you would take as a researcher.

Complete participant:
Prevents social desirability bias
Raises questions of ethics
Might lose sight of research purpose

Observer as participant:
Able to focus on the researcher role
Lose the emotional involvement

Participant as observer:
Not always easy to gain trust of the group you observe

FACTORS THAT WILL DETERMINE THE CHOICE OF PARTICIPANT OBSERVER ROLE

Purpose of your research
Which role suits your research question?
E.g.: A phenomenon about which the research informants would be naturally defensive is one that lends itself to the complete participant role

The time you have to devote to your research
Some of the roles may be very time consuming
E.g.: A period of attachment might be necessary

The degree to which you feel suited to participant observation

Organisational access

Ethical considerations
The degree to which you reveal your identity as the researcher or adopt a covert stance will be dictated by ethical considerations

Note making and recording data
Note making: Your notes are likely to be composed of different types of data:
o Primary observations: What happened? What was said?
o Secondary observations: Statements by observers about what happened or was said
o Experiential data: Perceptions and feelings as you experience the process you are researching
o Contextual data: Data related to the research setting and organisational structures and communication patterns that will help you to interpret other data

Data Collection

No formal interviews but informal discussions
Recording must take place on the same day as the fieldwork in order not to forget valuable data

Data Analysis

Data from participant observation are analysed like other qualitative data (not part of this course; see BBA3)

Data will start to be analysed at the time you collect them (i.e., data collection and data analysis will be carried out simultaneously)

o Promising lines of enquiry that you wish to follow up in your continued observation will emerge (cfr. analytic induction)

ISSUES RELATED TO RELIABILITY AND VALIDITY

Participant observation has high ecological validity as it involves studying social actors and social phenomena in their natural settings
However, using participant observation may lead to a number of threats to reliability and validity:
o Observer error
o Observer bias
o Observer effect

Observer error
Lack of understanding of or overfamiliarity with setting may lead you to un
