anna-mkrtichian-qlimenko
overview RM
Research Methods II
Chapter 1: Introduction
What is NOT research?
o Just collecting facts or information with no clear purpose
o Reassembling and reordering facts or information without interpretation
o A term to get your product or idea noticed and respected
What is research?
o Data are collected systematically
o Data are interpreted systematically
o There is a clear purpose = to find things out
Definition: Something that people undertake in order to find out things in a systematic way,
thereby increasing their knowledge.
BUSINESS AND MANAGEMENT RESEARCH
o Transdisciplinary nature
o Development of ideas that are related to practice
o Requirement to have some practical consequence
o Personal or commercial advantages related to research
o Theory + practice (?)
In sum, research (also business and management research) should
o Collect data systematically
o Interpret data systematically
o Have a clear purpose: to find things out
FUNDAMENTAL RESEARCH
Purpose:
o Expand knowledge of processes
o Universal principles
o Findings of significance and value to society in general
o Find rules that might explain a theory (E.g.: How students are motivated)
Context:
o Universities (mostly academic context)
o Choice determined by researcher
o Flexible time scales
You develop a model that can be used in other situations.
APPLIED RESEARCH
Purpose:
o Improve understanding of a particular problem
o Results in a solution to the problem
o New knowledge limited to the problem
o Findings of particular relevance
o To solve practical problems (E.g.: lack of motivation in a company)
Context:
o Organisations and universities (E.g.: Consultancy)
o Negotiation with originator
o Tight time scales
These are different types of research, depending on purpose and context.
The two types of research can interact.
Question:
A research team will investigate which location is most suited for the establishment of a new
Carrefour supermarket. This research project leans towards:
A) Fundamental research
B) Applied Research
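Answer: B. The study addresses one company's specific practical problem (where to locate a new store), so the new knowledge is limited to that problem.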
A research study was carried out to see whether and why people notice web addresses on television
adverts. This study leans towards:
A) Fundamental research
B) Applied Research
Answer: A. It yields a general principle that other companies can use as well.
A study was carried out to see how the rise of the Internet has changed consumers' buying behavior.
This study leans towards:
A) Fundamental research
B) Applied Research
Answer:
Which of the following statements is wrong?
A) Research is either fundamental or applied
B) Fundamental research might be of practical relevance
C) The key outcomes of applied research are actionable results with practical impact
D) Research might be of both practical and theoretical relevance
Answer: A. Fundamental and applied research are the two extremes; a project can also be a mixture
of both.
Wherever your research lies, whether fundamental or applied, or anywhere in between, you should
undertake it with rigour.
Pay careful attention to the research process!
Chapter 2: The research topic
THE RESEARCH PROCESS
o Formulate and clarify the research topic (What: Formulate the question. Make a
good/relevant question)
o Develop the research design. (How am I going to develop the research? Who are my target
group?)
o Gather data
o Analyse and interpret the data
o Write the project report (Write research work with an answer to question)
FORMULATE AND CLARIFY THE RESEARCH
Research topic
Why are you doing research?
Basically, because you are dealing with a certain problem or question; you want to find something
out. Without a problem or question, there is nothing to research.
Why this particular problem or question?
o Intellectual reasons
o Practical reasons
o Personal reasons
Formulating Research Questions
Research question = The question that you will try to answer by means of your research. This question should give direction to your research.
No good research question = No good answer!!!
E.g.: "Was our recent advertising campaign successful?" is NOT a good question. What do we define as successful / recent / advertising campaign? We have to be as concrete as possible by defining all the elements, and we should narrow the question down as much as possible.
Make sure you critically evaluate your research question(s):
o Suitable focus?
o Am I able to investigate my research question(s)?
o Is it feasible to investigate my research question(s)?
Suitable Focus?
No simplistic inductive research question
o Entirely separate from previous research or theory
No omnivore-research question
o Involves as many elements as possible; Wants to investigate everything
o So broad that any profundity becomes impossible
No theoretical research question
o Entirely separate from empiricism; Not tuned into social reality
Every research project involves making choices:
o Not everything can be investigated, and what you investigate cannot be known to perfection
Make sure you make this clear:
o As a researcher, you should communicate the theoretical and practical
boundaries/limitations of your research question(s) (and thus research results)
Am I able to investigate my research question(s)?
o Do I have the appropriate research skills?
o Is it ethical? (E.g.: Can young people buy the product?)
Is it feasible to investigate my research question(s)?
Time?
o Longitudinal research (data collected several times over a long period). It enables us to see whether the problem changes over time.
o Cross-sectional research (investigate/compare different population groups at a single point in time)
Money?
Access to, and willingness to participate of, the research subjects?
o Am I able to get access to the data?
Research Questions: Some examples
Problem statement:
The Flemish movie industry is growing but has still to cope with limited attendance, possibly caused
by the general public not being familiar with such movies. It is not clear how this lack of
familiarity could be countered.
Research questions?
o How many times do Flemish people go to the cinema to watch Flemish movies?
Problem statement:
Company X deals with significant financial problems. To solve these problems, the turnover should
increase by at least 10 percent during the next 18 months.
Research questions?
o What is the cause of the financial problem?
o How can company X increase turnover by 10% during the next 18 months?
Problem statement:
Previous academic research thoroughly investigated the shoplifting phenomenon as it negatively
influences business, other consumers, and society more generally. Although many examined the
socio-demographic profile of shoplifters, research about how to prevent shoplifting among
consumers is rather scarce.
Research questions?
o Why are they shoplifting?
o What is the most effective way to fight shoplifting?
THE NATURE OF YOUR RESEARCH
EXPLORATORY RESEARCH
To discover what is happening and gain insights about a topic of interest. It is particularly useful if
you wish to clarify your understanding of a problem, such as if you are unsure of the precise nature
of the problem. It may be that time is well spent on exploratory research, as it might show that the
research is not worth pursuing! Exploratory research has the advantage that it is flexible and adaptable to change.
Mostly used with qualitative research, and when little research exists on the topic.
DESCRIPTIVE RESEARCH
o To gain an accurate profile of events, persons or situations
o May be a precursor or extension of exploratory and/or explanatory research
o Means to an end vs. An end in itself
o Descripto-explanatory studies (= precursor/predecessor to explanatory studies)
EXPLANATORY/CAUSAL RESEARCH
To establish relationships between variables
o E.g.: Quantitative study investigating whether certain colours in the shop lay-out result in
higher levels of customer satisfaction
o E.g.: Qualitative study investigating whether Corporate Social Responsibility activities in a
company influence employee involvement
Figure: Gender (independent variable) -> Study results (dependent variable). Hypothesis/expectation: females do better. Discipline is the mediator; Belgium vs. other countries is the moderator.
Research projects may serve more than one purpose
E.g.: Cinite, I., Duxbury, L.E., & Higgins, C. (2009). Measurement of perceived organisational readiness for
change in the public sector. British Journal of Management, 20(2), 265-277.
o Exploratory phase: To identify behaviours, based on participants' experiences of
organisational change (interviews)
o Descriptive phase: Used as a forerunner for the next phase (web-based survey)
o Explanatory phase: To explain the relationship between organisational actions and readiness
or unreadiness to implement change, based on employees' perceptions (web-based survey)
Question:
Which of the following statements is false?
A) Profiling KUL-students in terms of gender and age is an example of descriptive research.
B) Exploratory research may follow descriptive or causal research.
C) When little is known about the problem situation, it is desirable to start with exploratory
research.
D) Investigating whether and why a decrease in price influences sales and market share results
in descriptive research.
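Answer: D. Investigating whether and why price influences sales is explanatory (causal) research: price is the independent variable, and sales and market share are the dependent variables.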
RESEARCH OBJECTIVES
Might be useful for complex research questions. It comes after the question.
What?
Operationalise how you intend to conduct your research by providing a set of coherent and
connected steps to answer your research question.
Why?
o Likely to lead to greater specificity compared to research questions
o Require more rigorous thinking
What kind of work do I need to do in order to answer my research question? What successive steps
do I need to take in order to answer my research question?
These are statements, not questions and they are numbered in a list.
Example:
As a sales manager, you notice that your sales staff becomes less and less motivated to sell the
company's products. Therefore, you decide to investigate in which way you could increase the level
of motivation among your sales staff.
1. To define the concept of motivation
2. To review key literature on the existing measures to motivate sales people
3. To identify the strengths and weaknesses of the identified measures
4. To determine which measures are most relevant to use in the context of my company
5. To carry out primary research in my company to measure the effectiveness of the selected
measure
HYPOTHESES
A hypothesis is something you would like to test. It is usually based on a theory, but not always
(e.g., when following an inductive approach).
Directional hypotheses:
o The direction of the relationship between the variables is indicated. E.g.: The greater the
stress experienced in the job, the lower the job satisfaction of employees
o The difference between two groups on a variable is postulated. E.g.: Women are more
motivated than men to lose weight
Non-directional hypotheses:
o Do postulate a relationship or difference, but offer no indication of the direction of these
relationships or differences. E.g.: There is a relationship between age and job satisfaction
HYPOTHESES: SOME COMMON MISTAKES
Ambiguous formulation
o E.g.: Belgians consume much candy, Americans don't. (Avoid vague words like "much".)
No point of reference
o E.g.: Adolescents consume much more alcohol. ("Much more" than whom, or than when?)
Unfounded (not based on literature; Theory as fuel)
o However, this would not be problematic when following the inductive approach (see
afterwards)
RESEARCH QUESTIONS VS. HYPOTHESES
Research questions
o E.g.: Which measure is most effective in preventing consumers from shoplifting?
Hypothesis
o A tentative, yet testable, statement which predicts what you expect to find in your
(empirical) data. E.g.: Financial punishments are more effective in preventing consumers
from shoplifting compared to social punishments.
Question:
Is the following hypothesis well formulated? Explain your answer.
Belgian adolescents have a better self-image compared to French adolescents.
Answer: This is a good directional hypothesis. It is well formulated: there is no ambiguous
wording, and there is a clear point of reference (French adolescents).
Question:
Research topic: Salespeople of Samsung and their preference for payment by commission vs. salary
Formulate a research question in line with the research topic above that results in descriptive
research and a research question that results in causal research.
Answer:
Descriptive research: What percentage of Samsung salespeople prefer payment by salary versus payment by commission?
Causal research: Do salespeople paid by commission sell more products than salespeople paid by
salary?
Chapter 3: The Research Design
The research design describes how you will conduct your research. At this stage you need to think of all the elements needed to
carry out the research, and be aware of the advantages and disadvantages of each choice.
Definition of Research Design
A general framework or plan for conducting a research project; It details the procedures
necessary for obtaining the information needed to answer the research question.
Motivate the choices you make!
Exam: Give arguments for choice of design. No right/wrong answer.
RESEARCH PARADIGM
The development of your research design will be influenced by your research paradigm
o A cluster of beliefs and dictates which, for scientists in a particular discipline, influence what should be studied, how research should be done and how results should be interpreted
o A basic belief system or worldview that guides the investigator, not only in choices of method but in ontologically and epistemologically fundamental ways
The difference between research paradigms is based on assumptions within three domains:
o Ontology: What is reality? Is there a reality external to humans? If yes, what does it look like?
o Epistemology: How can we build knowledge about that reality? How do we know what we know? What counts as knowledge, and what doesn't? What is the relationship between researcher and subject?
o Methodology: How can the researcher acquire knowledge about what he believes can be known? (Is limited by ontological and epistemological viewpoints.)
Positivism:
o Explaining relationships
o Accumulating data
o Objective process
o Knowledgeable researcher, known subjects
o Theory verification
o Deduction: testing hypotheses
o Focus on quantitative research
Constructivism:
o Understanding subjects' meaning
o Constructing information
o Intersubjective process
o Researcher becomes involved with the subjects
o Theory building
o Induction: developing hypotheses
o Focus on qualitative research
There are many other research paradigms in between the extremes of positivism and constructivism! For example, qualitative research can also be positivist.
Example:
Relationship between CSR activities within a company and employees involvement.
How would a positivist deal with this topic?
He would test a theory. E.g.: Giving the employees a quantitative questionnaire.
How would a constructivist deal with this topic?
He wouldn't start with theory. Depending on what the field study shows, he would develop a theory.
Question:
Which of the following statements is false?
A) Quantitative research can follow an inductive approach.
B) Qualitative research might be inductive as well as deductive.
C) In positivistic research, the researcher intervenes in the research process.
D) Examining people's motivation for luxury consumption can be done by quantitative as well
as qualitative research.
Answer: C. In positivistic research the researcher is independent of the research process.
B) Might be true sometimes, but not always; it is not typical.
D) Qualitative research is more appropriate, but a quantitative approach is possible as well (e.g., checking consumption).
RESEARCH APPROACH
The development of your research design will also be influenced by your research approach
Survey: more deductive approach
Case study: more inductive approach
However, induction and deduction can be combined within the same research project!
Data -> (induction) -> Theory -> (deduction) -> Data
Once you've collected data and developed your theory, you can decide to test it again with new data.
Deductive or Inductive?
1. If it rains, everything outside becomes wet. It rains. The car is outside.
=> The car will become wet.
Answer: Deduction
2. The first duck in the park is brown. The second duck in the park is brown. The third duck in the park is brown.
=> Every duck in the park is brown.
Answer: Induction
Some practical criteria:
o Emphasis of the research and nature of the research topic
o Wealth of literature
o Time available
o Risk
o Audience
The inductive approach is riskier: you might not be able to develop a good theory from your data.
Question:
Which of the following statements is true?
A) With deduction, data are collected and a theory developed as a result of the data analysis.
B) Research projects should include either the deductive or inductive research approach.
C) A research topic about which little literature exists is more likely to result in an inductive
research approach than a deductive research approach.
D) The deductive research approach is less strict compared to the inductive research approach.
Answer: C
Patrick is a member of the Human Relations Research Group of KUL. He read about the large number of adolescents slipping into shoplifting behaviour and wonders how this behaviour could be prevented. Therefore, he runs a study in which he tests whether the Protection Motivation Theory is applicable to this particular issue. Patrick's study leans towards:
a) An inductive research approach
b) A deductive research approach
Answer: b. He develops some expectations from theory, then gathers data to test them.
In sum, you should be aware that research paradigm and research approach influence your research design.
Core elements of your research design are:
o Research choice
o Research strategy
o Time horizon
RESEARCH CHOICE: QUANTITATIVE, QUALITATIVE OR MULTIPLE METHODS RESEARCH DESIGN
How will you combine quantitative and qualitative data collection techniques and data analysis procedures?
Quantitative
o Often used as a synonym for data collection techniques/data analysis procedures that generate or use numerical data
Qualitative
o Often used as a synonym for data collection techniques/data analysis procedures that generate or use non-numerical data (such as text)
This distinction might be both problematic and narrow.
Why problematic?
Many research designs are likely to combine quantitative and qualitative elements
o E.g., Research design using a questionnaire in which respondents also have to answer some open questions in their own words
o E.g., Qualitative research data may be analysed quantitatively (i.e., qualitative data being quantitised)
Why narrow?
Reinterpret quantitative and qualitative methodologies through their associations with research paradigms, research approaches and research strategies.
Quantitative research design:
o Research paradigm: Positivism
o Research approach: Deduction
o Characteristics: Causal relationships, numbers, statistical analysis techniques, standardised, probability sampling, generalisability, independent researcher
o Research strategies: Experiments, surveys
Qualitative research design:
o Research paradigm: Constructivism
o Research approach: Induction
o Characteristics: Meanings, text, interpretation, non-standardised, non-probability sampling, develop conceptual framework, researcher part of the research process
o Research strategies: Case study
Still, it is possible that a quantitative research design is more in line with induction, and that a qualitative research design is more in line with deduction. Many research designs are thus likely to combine quantitative and qualitative elements.
No need to learn the figure by heart.
Triangulation is one of the advantages of using more than one data collection technique and
analysis procedure
o Multiple methods may be used to ascertain whether the findings from one method
corroborate the findings from the other method (i.e., to see whether we get the
same result)
Whatever methods you use to collect and analyse data:
o Be explicit about the grounds on which multiple methods research is conducted!
o And do not forget that they must serve your research question!
RESEARCH STRATEGY:
We have different ways of conducting a research. We can combine different strategies within the
same project.
Various research strategies exist:
o Experiment
o Survey
o Archival Research
o Case Study
o Ethnography
o Action research
o Grounded theory
o ...
EXPERIMENT
The only way to investigate causal relationships: to infer whether a change in one or more
independent variables produces a change in one or more dependent variables.
E.g.: Mood (A, independent variable: negative vs. positive) -> Creativity (B, dependent variable)
There might be a problem:
Is it mood that affects creativity? Or is it the other way round?
In an experiment we manipulate A (mood) and then measure B (creativity). If creativity differs
between the mood conditions, the effect runs from mood to creativity.
Choice of research strategy (strategies) is, among others,
guided by research questions, research objectives, research
paradigm, research approach and research purpose, as well as
by more practical concerns (e.g., time resources, access to
potential participants).
CLASSIC EXPERIMENT
o Participants are randomly assigned to either the experimental group or the control group
o Each group should be similar in all aspects relevant to the research other than whether or
not they are exposed to the planned intervention or manipulation
o Experimental group: Some form of planned intervention/manipulation will be tested
o Control group: No such intervention/manipulation is made
Example:
Promotion (independent variable) -> Purchasing behaviour (dependent variable)
o Control group: No promotion
o Experimental group: Promotion
o Success if the promotion increased purchasing behaviour.
Field experiment is better than lab experiment in terms of external validity.
Design: Pre-test measurement of purchasing behaviour -> "Buy two, get one free" promotion: yes or no -> Post-test measurement of purchasing behaviour
Internal Validity: The extent to which the findings can be attributed to the interventions rather than any flaws in your research design.
External Validity: Whether the cause-and-effect relationship(s) found in the experiment can be generalised.
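The classic experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `assign_groups` and `mean_change`, the shopper labels and all purchase figures are hypothetical and made up for the example, not taken from the course material.

```python
import random
from statistics import mean


def assign_groups(participants, seed=None):
    """Randomly split participants into a control and an experimental group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_change(pre, post, group):
    """Average post-test minus pre-test score for one group."""
    return mean([post[p] - pre[p] for p in group])


# Hypothetical participants and made-up purchase counts (illustration only).
participants = [f"shopper_{i}" for i in range(1, 9)]
control, experimental = assign_groups(participants, seed=42)

pre = {p: 10 for p in participants}  # pre-test measurement
# Post-test: only the group exposed to the promotion buys more.
post = {p: (13 if p in experimental else 10) for p in participants}

# Effect of the promotion = change in experimental group minus change in control group.
effect = mean_change(pre, post, experimental) - mean_change(pre, post, control)
print("Estimated effect of the promotion:", effect)
```

Random assignment is what makes the two groups comparable, so that a difference in change between them can be attributed to the promotion (internal validity).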
Question:
o As a marketer, you are wondering whether rock versus pop music in supermarkets
influences the time consumers spend in these supermarkets.
o Design an experiment which would enable this marketer to find an answer to this
problem.
Answer:
Music (independent variable) -> Time spent in supermarket (dependent variable)
o Control group: No music
o Experimental groups: 1. Pop 2. Rock
The choice of supermarket is important. The supermarket and the days have to be the same across groups, because on some days people might be happier or shop more. We have to consider all such elements.
SURVEY
o Involves the structured collection of data from a sizeable population. E.g.: Questionnaire, structured observation, structured interviews
o Usually associated with the deductive research approach
o Popular and common research strategy in business and management research
o Most frequently used to answer "what", "who", "where", "how much" and "how many" questions
Advantages:
o Collection of standardised data from a sizeable population in a highly economical way, allowing easy comparison
o Perceived as authoritative by people in general
o Easy to explain and to understand
o When sampling (see next chapter) is used, it is possible to generate findings that are representative of the whole population at a lower cost than collecting the data for the whole population
Disadvantages:
o Data collected by the survey strategy are unlikely to be as wide-ranging as those collected by other research strategies
o Limited number of questions can be included (in case a questionnaire is used)
o Capacity to do it badly (see later)
ARCHIVAL RESEARCH
o Analysis of administrative records and documents as principal source of data because they
are products of day-to-day activities
o Recent as well as historical documents
o Secondary data analysis: the data are part of the reality being studied rather than having been
collected originally as research data
o Allows research questions which focus upon the past and changes over time to be
answered
o Disadvantages might be the nature of the records and documents, missing data, and access
to data (confidentiality)
CASE STUDY
o Empirical investigation of a particular contemporary phenomenon within its real-life context,
using multiple sources of (data) evidence
o The boundaries between the phenomenon being studied and the context within which it is
being studied are not clearly evident
Experiment: Research undertaken in a highly controlled context
Survey: Ability to explore and understand the context is limited by the
number of variables for which data can be collected
o Relevant strategy if you wish to gain a rich understanding of the context
o Has considerable ability to generate answers to "why", "what" and "how" questions
o Likely to use multiple sources of data (interviews, observation, documentary analysis,
questionnaires, ...)
Example:
o Building high quality interaction and cooperation during organisational change (Grieten & Lambrechts, 2007, 2009)
o Problem definition: 2/3 of change processes fail, although it is known that these failures are often caused by relational aspects
o Research question: What makes relational practices of such a quality that they improve common progress during organisational change?
o Case selection: Two organisations with contrasting change processes in terms of results (best practice and worst practice), but similar in terms of relational approach
o Data collection methods: (participant) observation, in-depth interviews, focus groups, document analysis (Triangulation!)
ETHNOGRAPHY
o Used for studying people in groups, who interact with one another and share the same
space (e.g., street level, work group, organisation, ...)
o Origins in (colonial) anthropology
o Focuses upon describing and interpreting the social world through first-hand field study
o Researchers live amongst those whom they study, to observe and talk to them in
order to produce detailed cultural accounts of their shared beliefs, behaviours,
interactions, language, rituals and the events that shaped their lives
o Ideas about this strategy are not unified!
ACTION RESEARCH
o An emergent and iterative process of inquiry that is designed to develop solutions to real
organisational problems through a participative and collaborative approach, which uses
different forms of knowledge, and which will have implications for participants and the
organisation beyond the research project
o Research in action rather than research about action
o Demanding strategy in terms of the intensity involved and the resources and time required
GROUNDED THEORY
Uses an inductive research approach: we start with the data and develop a theory or a relevant model from them.
o Developed as a response to the extreme positivism of past social research
o Theory is developed through the systematic and simultaneous process of data collection and
analysis involving a mainly inductive approach
to generate theory grounded in your data
o A process of constant comparison moving between inductive and deductive thinking
o Theoretical sampling until theoretical saturation is reached
= Conceptual density
= Conceptual saturation
o Time-consuming, intensive and reflective
o Will something significant emerge?
o Will something emerge that is more than simply descriptive?
Example:
Nyilasy, G., & Reid, L.N. (2009). Agency practitioners' metatheories of advertising. International
Journal of Advertising, 28(4), 639-668.
o What do advertising agency practitioners think about how advertising works? This
study's basic aim was to understand practitioners' thinking about the work of
advertising in their own terms. As there was little substantive research of this
perspective, a grounded theory approach to qualitative research was used.
o Semi-structured, in-depth interviews were used until theoretical saturation was
achieved
TIME HORIZON:
Cross-sectional studies
o The study of a particular phenomenon or phenomena at a particular time, i.e. a
snapshot
o Choice of moment may be important
Longitudinal studies
o The study of a particular phenomenon or phenomena over an extended period of
time (different moments in time)
o Possible to study changes and developments
o Watch out for relevant changes in variables you do not take into account!
o E.g., Consumer Sentiment Index (University of Michigan)
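To make the two time horizons concrete, here is a small sketch of how the same data set supports a cross-sectional view (a snapshot of all respondents at one moment) and a longitudinal view (one respondent followed over time). The `panel` data, the respondent labels and the helper names are hypothetical and invented purely for illustration.

```python
# Hypothetical panel: satisfaction score per (respondent, year). Made-up numbers.
panel = {
    ("r1", 2021): 6, ("r1", 2022): 7, ("r1", 2023): 8,
    ("r2", 2021): 5, ("r2", 2022): 5, ("r2", 2023): 4,
}


def cross_section(panel, year):
    """Snapshot: every respondent's score at one particular moment."""
    return {r: score for (r, y), score in panel.items() if y == year}


def change_over_time(panel, respondent):
    """Longitudinal view: last score minus first score for one respondent."""
    series = sorted((y, s) for (r, y), s in panel.items() if r == respondent)
    return series[-1][1] - series[0][1]


print(cross_section(panel, 2022))     # compare respondents at a single moment
print(change_over_time(panel, "r1"))  # follow one respondent over the years
```

A cross-sectional study would only ever collect one such snapshot, which is why the choice of moment matters; the longitudinal view needs repeated measurements of the same cases.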
When developing your research design, you should also consider the ethics and the quality of your
research design.
ESTABLISHING THE QUALITY OF THE RESEARCH DESIGN
Reliability: Consistency in research. Are the results consistent when I replicate exactly the same experiment?
Validity: Testing the right variable. Does the instrument measure what it intends to measure?
An example by means of a scale as measurement instrument
1. 73 kg
2. 73 kg
3. 73 kg
4. 73 kg
o RELIABILITY: The scale is reliable, because every weighing gives the same result.
o VALIDITY: The real weight is 78 kg, so the scale is not valid. It doesn't measure what it intends to measure.
Reliability does not involve validity!!! & Validity does not involve reliability!!!
Figure: the four possible combinations
o Not reliable, not valid
o Reliable, not valid
o Reliable, valid
o Not reliable, valid
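The scale example can be expressed as a small calculation. The sketch below is a simplification under explicit assumptions: reliability is approximated by the spread of repeated measurements, validity by the distance between their average and the true value, and the tolerance thresholds (`max_spread`, `max_bias`) are arbitrary illustrative choices, not standard cut-offs.

```python
from statistics import mean, pstdev


def assess(measurements, true_value, max_spread=0.5, max_bias=0.5):
    """Rough check of reliability (consistency) and validity (closeness to the truth).

    The thresholds are hypothetical and chosen only for this illustration.
    """
    spread = pstdev(measurements)            # how much repeated measurements vary
    bias = mean(measurements) - true_value   # systematic deviation from the real value
    return {
        "spread": spread,
        "bias": bias,
        "reliable": spread <= max_spread,
        "valid": abs(bias) <= max_bias,
    }


# The four weighings from the example, with a real weight of 78 kg.
scale_readings = [73, 73, 73, 73]
result = assess(scale_readings, true_value=78)
print(result)
```

With these readings the scale comes out as reliable (no spread at all) but not valid (it is consistently 5 kg off), which is the "reliable, not valid" combination.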
Question:
1. The student administration department of HUB examines the extent to which HUB-students
are satisfied with the teaching skills of the HUB-staff. By means of a questionnaire on Time 1,
researcher X finds that the overall satisfaction is equal to 8.7 on 10. Two weeks later (Time
2), researcher X conducts the same research (among the same respondents) and finds that
the overall satisfaction is equal to 8.7 on 10. Consequently, researcher X's results are:
Valid: ? => No information to tell us whether they are valid or not.
Reliable: Yes
2. You developed a measurement instrument to examine employees' level of job autonomy
perception (i.e., the extent to which they experience autonomy in their job). This
measurement instrument seems to be sensitive to social desirability (i.e., respondents'
tendency to give answers that may be desirable from a social standpoint: people
answer according to what they think is expected of them and not according to their own
opinion).
Question: What is the implication of social desirability for the quality of your measurement
instrument?
Answer: Not valid, because instead of measuring job autonomy, the instrument measures social desirability.
3. Which of the following statements is correct?
A) Experiments are more valid compared to surveys
B) If a study is reliable, it means that it measures what we think it should measure
C) External validity is about the extent to which the reliability of a study can be generalised
D) An interviewer who writes down a wrong answer from absent-mindedness threatens
the reliability of his study
Answer: D => Just one instance of absent-mindedness will not influence the validity of
the research.
Chapter 4: Sampling
Population: The full set of cases. These are not necessarily people!
E.g.: What's the average price of chicken soup in Chinese restaurants located in Brussels? Population: Chinese restaurants located in Brussels.
Sample: A sub-group within a population.
Research question: How many beers do Belgian adults drink on average each week?
You could collect and analyse data from every possible case in the population = CENSUS
However, there might be restrictions in terms of time, money, access, currency, speed, practice, accuracy, detail ...
Therefore, consider data from a subgroup rather than all possible cases or elements of the population = SAMPLE
Sampling is about selecting a number of elements from a population you would like to study, with the intention to derive characteristics of the population from characteristics of the sample.
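The difference between a census and a sample can be illustrated with a short simulation. Everything here is assumed for the sake of the example: the population of 10,000 adults and their weekly beer counts are randomly generated, and simple random sampling is used only because it is the easiest technique to show.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical population: weekly number of beers for 10,000 adults (made-up data).
population = [rng.randint(0, 14) for _ in range(10_000)]

# Census: measure every case in the population.
census_mean = mean(population)

# Sample: measure only a randomly selected subgroup and use it as an estimate.
sample = rng.sample(population, k=200)
sample_mean = mean(sample)

print("Census mean:", census_mean)
print("Sample mean:", sample_mean)
```

The sample mean will usually be close to, but not exactly equal to, the census mean; that gap is the price paid for measuring 200 cases instead of 10,000.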
THE SAMPLING PROCESS
DEFINE THE POPULATION
Depends on your research question!
o E.g.: How satisfied are HUB-students with the teaching skills of the HUB-professors?
Defining the population is not always that straightforward
o E.g.: Research project assessing consumer response to a new brand of men's moisturiser
Be careful of population specification error = Consequence of not studying a specific part of the target group
Question:
Define the population for the following research questions:
o How do employees of Carrefour think the proposed introduction of compulsory Sunday working will affect their working lives? Population: Employees of Carrefour
o What is the normal range in miles that can be travelled by electric cars in everyday use? Population: Electric cars in everyday use
DETERMINE THE SAMPLE FRAME
A list of all elements in the population from which your sample will be drawn
Examples:
o Telephone book
o Company's customer database
o Membership lists
o ...
In some cases you will have to develop the sample frame yourself!
Sampling frame error
Sampling frame is not a perfect reproduction of the research population = The variation between the population defined by the researcher and the population as implied by the sampling frame used
Examples of causes of sampling frame errors:
o Not up to date
o Elements of sampling frame that are not part of the population
o Elements of population are not in sampling frame
o Elements that are included multiple times
E.g. of sampling frame errors with a telephone book:
o Not up to date
o Not everyone has a telephone
o Companies are in the phone book as well
Checklist:
o Are elements listed in the sampling frame relevant to your research question?
o How recently was the sampling frame compiled, in particular is it up to date?
o Does the sampling frame include all elements, in other words is it complete?
o Does the sampling frame contain the correct information, in other words is it accurate?
o Does the sampling frame exclude irrelevant cases, in other words is it precise?
o For purchased lists and online panels, can you establish and control precisely how the sample will be selected?
o For an online panel, can you establish whether incentives will be used to enhance the likely response and provide an assessment of the impact of this on respondent characteristics and consequently responses?
You should not generalise beyond your sampling frame
E.g.: Sampling frame consists of all employees of an organisation => You can only generalise to employees of that particular organisation
Sometimes not possible (or very hard) to develop a sampling frame!
Question: Which sampling frame is suited for the following research questions?
o How do employees of Carrefour think the proposed introduction of compulsory Sunday working will affect their working lives? Answer:
o Which factors influence Belgian lawyers' decision to work in other European countries? Answer:
SELECT SAMPLING TECHNIQUES
First of all you need to decide whether you will examine all elements of the population (= census) or you will draw a sample
For populations of fewer than 50 cases, it is usually more sensible to collect data from the entire population.
Draw a sample => Conditions:
o Practical constraints
o Budget constraints
o Time constraints
o Access constraints
o Results need to be quickly available
o Testing involves destroying the population (e.g.: establishing the actual duration of long-life batteries)
Two types of sampling:
1. Probability Sampling
2. Non-probability Sampling
Probability Sampling Techniques
o Sampling techniques in which each element of the population has a fixed probabilistic chance (usually an equal chance) of being selected for the sample.
o It becomes possible to answer research questions that require you to estimate statistically the characteristics of the population from the sample (i.e., with a certain level of confidence, you are able to generalise the findings to the population)
o Probability sampling is often associated with survey and experiment research strategies.
Non-Probability Sampling Techniques
o The probability of each case being selected from the total population is not known.
o It is impossible to answer research questions that require you to make statistical inferences about the characteristics of the population.
Note: You may still be able to generalise from non-probability samples about the population, but not on statistical grounds.
Question: Which of the following statements is true?
A) With probability samples the chance, or probability, of each case being selected from the population is unknown.
B) Generalizations about populations from data collected using any probability sample are based on intuition.
C) Sampling provides a valid alternative to a census when it would be impracticable for you to survey the entire population.
D) The sampling frame gives an overview of all the elements which will be included in your final sample.
Answer: C
Probability Sampling Techniques:
o Simple random sampling
o Systematic random sampling
o Stratified random sampling
o Cluster sampling
o Multi-stage sampling
Non-Probability Sampling Techniques:
o Quota sampling
o Judgemental sampling
o Snowball sampling
o Self-selection sampling
o Convenience sampling
Probability Sampling Techniques
Simple Random Sampling
A probability sampling technique in which each element has a known and equal probability of selection. Every element is selected independently of every other element, and the sample is drawn by a random procedure from a sampling frame.
E.g.,
o Each element of the sampling frame is assigned a unique identification number (0, 1, 2, ...)
o Random numbers are generated to determine which elements to include in the sample (e.g., by means of a random number table) until the sample size is reached
Example of random number table:
If the same number is read off a second time, it must be disregarded as you need different cases. This means that you are not putting each case's number back into the sampling frame after it has been selected. This is termed sampling without replacement. If a number is selected that is outside the range of those in your sampling frame, you simply ignore it and continue reading off numbers until your sample size is reached.
Disadvantages of this procedure:
o Time-consuming
o Requires an adapted table with sufficient random numbers
Other random procedures:
o Computer generated random numbers / Online random number generator (instead of random number tables)
o Random telephone numbers
- Often used when doing computer-aided telephone interviewing (CATI)
- Dialling telephone numbers at random from an existing database
- Or random digit dialling
Advantage: Does not depend on the telephone book
Disadvantage: Some households have more than one telephone number!
Simple random sampling:
o Sample without (systematic) bias
o Best used when you have an accurate and easily accessible sampling frame that lists the entire population
Disadvantage: These lists are not always available!
o If your population covers a large geographical area, random selection means that selected cases are likely to be dispersed throughout the area
Disadvantage: This sample is not suited if collecting data over a large geographical area using a method that requires face-to-face contact (high travel costs)
Example: Jemma was undertaking her work placement at a large supermarket, where 5011 of the supermarket's customers used the supermarket's Internet purchase and delivery scheme. She was asked to interview customers and find out why they used this scheme. As there was insufficient time to interview all of them, she decided to interview a sample using the telephone. Her calculations revealed that to obtain acceptable levels of confidence and accuracy she needed an actual sample size of approximately 360 customers. She decided to select them using simple random sampling. Having obtained a list of Internet customers and their telephone numbers, Jemma gave each of the cases (customers) in this sampling frame a unique number. In order that each number was made up in exactly the same way she used 5011 four-digit numbers starting with 0000 through 5010. So customer 677 was given the number 0676. She selected at random a first random number in the random number table. After that, she read off the other random numbers in a regular and systematic manner. She continued in this manner until 360 different cases had been selected. These formed her random sample. Numbers selected that were outside the range of those in her sampling frame (such as 8321, 5953 and 7932) were simply ignored.
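Jemma's procedure can also be carried out with computer generated random numbers instead of a random number table. The sketch below is only an illustration: the function name and the seed are assumptions, not part of the course material. It draws 360 distinct case numbers from 0000 to 5010, i.e. sampling without replacement.

```python
import random

def simple_random_sample(frame_size, sample_size, seed=None):
    """Draw `sample_size` distinct case numbers from 0 .. frame_size - 1."""
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    # random.sample draws without replacement, so no case number can repeat
    return rng.sample(range(frame_size), sample_size)

# Jemma's example: 5011 customers numbered 0000-5010, sample of 360
sample = simple_random_sample(5011, 360, seed=42)
print([f"{case:04d}" for case in sorted(sample)[:5]])
```

Because the numbers are generated within the range of the sampling frame, there are no out-of-range numbers (such as 8321) to ignore.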
Systematic Random Sampling
A probability sampling technique in which the sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame
Selecting the sample at regular intervals from the sampling frame. Similar to simple random sampling, but in a systematic order: we apply an interval for sample selection.
Example:
o Number each of the cases in your sampling frame with a unique number (0, 1, 2, ...)
- 1500 patients: number each of these patients (0, 1, 2, ..., 1499)
- Sample of 300 participants
o Calculate the sampling fraction (actual sample size / total population)
- Sampling fraction: 300/1500 = 1/5
o Select the first case using a random number (depends on sampling fraction)
- Random starting point (i.e., random number between 0 and 4)
o Select subsequent cases systematically (until sample size is reached) using the sampling fraction to determine the frequency of selection.
- Continue to select every fifth patient until the sample size of 300 patients is reached.
Systematic random sampling:
o Sometimes not necessary to develop a sampling frame (e.g., every tenth visitor of a website)
o Easy to understand and to explain
o Despite these advantages, be careful when using existing lists as sampling frames
- You need to ensure that the lists do not contain periodic patterns! (See the two examples that follow)
- Systematic random sampling is suitable for geographically dispersed cases only if you do not require face-to-face contact when collecting data (as with simple random sampling)
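The patient example (1500 patients, sample of 300, sampling fraction 1/5) can be sketched as follows. This is a minimal illustration that assumes the frame size is an exact multiple of the sample size; the function name and seed are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Random start, then every i-th case, where i = frame size / sample size."""
    interval = len(frame) // sample_size             # 1500 / 300 = 5
    start = random.Random(seed).randrange(interval)  # random number between 0 and 4
    return frame[start::interval][:sample_size]

patients = list(range(1500))  # patients numbered 0, 1, 2, ..., 1499
sample = systematic_sample(patients, 300, seed=1)
print(sample[:5])
```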
The impact of periodic patterns on systematic random sampling: Consider the use of systematic random sampling to generate a sample of monthly sales from the Harrods store in London. The sampling frame contains monthly sales for the last 60 years. A sampling interval of 12 is chosen. Every selected case then falls in the same calendar month, so seasonal variation in sales is not represented in the sample.
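The Harrods case can be checked with a few lines of code. The sketch below uses a made-up frame of (year, month) labels rather than real sales figures, and the seed is arbitrary. It shows that an interval of 12 on a monthly list always selects the same calendar month.

```python
import random

# Hypothetical sampling frame: 60 years of monthly sales, labelled (year, month)
frame = [(year, month) for year in range(60) for month in range(1, 13)]

interval = 12
start = random.Random(0).randrange(interval)  # random starting point
selected = frame[start::interval]             # every 12th monthly figure

months_in_sample = {month for _, month in selected}
print(len(selected), months_in_sample)
```

Whatever the starting point, the set of months in the sample contains a single value, which is exactly the periodic pattern the notes warn about.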
A high street bank needs you to administer a questionnaire to a sample of individual customers with joint bank accounts. Sampling fraction = 1/2 => you will need to select every second customer on the list. The names on the customer list, which you intend to use as the sampling frame, are arranged as depicted below:
Stratified Random Sampling
You divide the population into two or more relevant strata based on one or a number of attributes (e.g., gender, income, region: these attributes are relevant for your research).
In other words, your sampling frame is divided into a number of subsets. A random (simple or systematic) sample is then drawn from each of the strata.
More concrete:
o Choose the stratification variable(s)
- These variables need to be relevant for the research problem
- Stratification needs to result in homogeneity within each stratum with regard to the stratification variable(s)
o Divide the sampling frame into the discrete strata
o Number each of the cases within each stratum with a unique number
o Select your sample using either simple random or systematic random sampling
Example: Sarah worked for a major supplier of office supplies to public and private organisations. As part of her research into her organisation's customers, she needed to ensure that both public and private sector organisations were represented correctly. An important stratum was, therefore, the sector of the organisation. Her sampling frame was thus divided into two discrete strata: public sector and private sector. Within each stratum, the individual cases were then numbered.
She decided to select a systematic random sample. A sampling fraction of 1/4 meant that she needed to select every fourth customer on the list. As indicated by the ticks, random numbers were used to select the first case in the public sector (001) and private sector (003) strata. Subsequently, every fourth customer in each stratum was selected.
Stratified random sampling:
o Dividing the population into a series of relevant strata means that the sample is more likely to be representative, as you can ensure that each of the strata is represented proportionally within your sample.
- Proportionate stratified random sampling = the sample sizes drawn from the strata are proportionate to each stratum's share of the total population
o Disproportionate stratified random sampling (oversampling enables separate analyses)
o Despite the advantages of proportionate and disproportionate sampling, there are some disadvantages as well:
- Only possible if you can easily distinguish significant strata (in your sampling frame)
- Extra stage of sampling procedure => More time, more expensive, more difficult to explain compared to simple and systematic random sampling
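Proportionate stratified random sampling can be sketched as below. The stratum sizes (120 public and 280 private sector customers) are invented for the illustration, and a simple random sample is drawn within each stratum, whereas Sarah's example uses systematic random sampling.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw the same sampling fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, cases in strata.items():
        size = round(len(cases) * fraction)     # stratum's share of the sample
        sample[name] = rng.sample(cases, size)  # random selection within the stratum
    return sample

# Hypothetical sampling frame divided into two discrete strata
strata = {
    "public": [f"pub{i:03d}" for i in range(120)],
    "private": [f"priv{i:03d}" for i in range(280)],
}
sample = proportionate_stratified_sample(strata, 1 / 4, seed=7)
print({name: len(cases) for name, cases in sample.items()})
```

With a sampling fraction of 1/4, each stratum contributes in proportion to its size, so the sample keeps the same public/private mix as the sampling frame.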
Cluster Sampling
All elements of a number of randomly selected clusters are selected
More concrete:
o Choose the cluster grouping for your sampling frame
- Heterogeneity in clusters is important! Cluster = small universe (e.g., Population = football lovers; Cluster = football stadium)
o Number each of the clusters with a unique number (0, 1, ...)
o Select your sample of clusters using some form of random sampling
o Select all elements of the selected clusters
Every cluster has an equal chance to be selected => Random sampling technique
Still, the technique normally results in samples that represent the total population less accurately compared to stratified random sampling (so make sure that clusters are heterogeneous!)
Advantage: Restricting the sample to a few relatively compact geographical sub-areas (clusters)
maximises the amount of data you can collect using face-to-face methods within the resources
available.
Example:
Abdel needed to select a sample of firms to undertake an interview-based survey about the use of
large multiple-purpose digital printer copiers. As he had limited resources with which to pay for
travel and other associated data collection costs, he decided to interview firms in four geographical
areas selected from a cluster grouping of local administrative areas. A list of all local administrative
areas formed his sampling frame. Each of the local administrative areas (clusters) was given a unique
number, the first being 0, the second 1 and so on. The four sample clusters were selected from this
sampling frame of local administrative areas using simple random sampling.
Abdel's sample was all firms within the selected clusters. He decided that the appropriate telephone
directories would probably provide a suitable list of all firms in each cluster.
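Abdel's approach can be sketched as follows. The 20 administrative areas and the firms in them are made up for the illustration, and the function name is an assumption. Four clusters are drawn by simple random sampling and every firm in those clusters enters the sample.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select clusters, then keep ALL elements of the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {cluster: list(clusters[cluster]) for cluster in chosen}

# Hypothetical frame: administrative areas numbered 0, 1, 2, ... with their firms
clusters = {
    area: [f"firm-{area}-{k}" for k in range(area % 5 + 2)]
    for area in range(20)
}
selected = cluster_sample(clusters, 4, seed=3)
print({area: len(firms) for area, firms in selected.items()})
```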
Stratified random sampling vs. Cluster sampling
Multi-stage sampling
Select a stage and research within the cluster.
o Modifying a cluster sample by adding at least one more stage of sampling that also involves some form of random sampling
o Procedure:
- Choose the cluster grouping for your sampling frame (Heterogeneity in clusters is important!)
- Number each of the clusters with a unique number (0, 1, ...)
- Randomly select a number of clusters
- Repeat the above steps (e.g., districts => cities => neighbourhoods => streets)
- Randomly select elements of the most recently selected clusters
Example: Laura worked for a market research organisation that needed her to interview a sample of 400 households in England and Wales. She decided to use the electoral register as a sampling frame. Laura knew that selecting 400 households using either systematic or simple random sampling was likely to result in these 400 households being dispersed throughout England and Wales, resulting in considerable amounts of time spent travelling between interviewees as well as high travel costs. By using multi-stage sampling, Laura felt these problems could be overcome. In her first stage, the geographical area (England and Wales) was split into discrete sub-areas (counties). These formed her sampling frame. After numbering all the counties, Laura selected a small number of counties using simple random sampling. Since each case (household) was located in a county, each had an equal chance of being selected for the final sample. As the counties selected were still too large, each was subdivided into smaller geographically discrete areas (electoral wards). These formed the next sampling frame (stage 2). Laura selected another simple random sample. This time she selected a larger number of wards to allow for likely important variations in the nature of households between wards. A sampling frame of the households in each of these wards was then generated using a combination of the electoral register and the UK Royal Mail's postcode address file. Laura finally selected the actual cases (households) that she would interview using systematic random sampling.
Multi-stage sampling:
Advantages:
o Sampling a geographically dispersed population becomes possible at lower cost.
o Compared to normal cluster sampling, larger clusters with many cases are possible
Disadvantages:
o Selecting smaller and smaller subgroups might impact the representativeness of your sample => Can be solved through applying stratified random sampling techniques as well
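A multi-stage design along the lines of Laura's example can be sketched as below. The frame (10 counties, 8 wards per county, 50 households per ward) and all names are invented. The last stage uses simple random sampling for brevity, whereas Laura used systematic random sampling.

```python
import random

def multi_stage_sample(frame, n_counties, n_wards, n_households, seed=None):
    """Counties -> wards -> households, with random selection at every stage."""
    rng = random.Random(seed)
    counties = rng.sample(sorted(frame), n_counties)                 # stage 1
    wards = [(c, w) for c in counties for w in sorted(frame[c])]
    chosen_wards = rng.sample(wards, n_wards)                        # stage 2
    households = [h for c, w in chosen_wards for h in frame[c][w]]
    return rng.sample(households, n_households)                      # final stage

# Hypothetical frame: each household is labelled (county, ward, number)
frame = {
    c: {w: [(c, w, h) for h in range(50)] for w in range(8)}
    for c in range(10)
}
sample = multi_stage_sample(frame, n_counties=3, n_wards=12, n_households=400, seed=5)
print(len(sample), sorted({county for county, _, _ in sample}))
```

All 400 households come from at most three counties, which is what keeps travel time and cost down.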
Impact of various factors on choice of probability sampling techniques:
o Sampling frame required
o Size of sample needed
o Geographical area to which suited
o Necessity of personal contact with respondents
o Relative cost
o Easy to explain to support workers?
o Advantage compared with simple random sampling
o ...
Question:
o BNP Paribas Fortis has about 400 000 Benelux-clients using their credit card. The credit card
application form contains common information such as name, address, age, telephone
number, educational level, etc.
o BNP Paribas Fortis wants to examine whether there is a relationship between the way in
which credit cards are used (e.g., frequency of use) and the socio-economic profile of its
users.
Questions: Identify the population and the sampling frame. Consider the suitability
of the various probability sampling techniques in this situation.
Answer:
Non-Probability Sampling Techniques
Quota Sampling
Comparable to stratified sampling, except that the selection of cases within each group is not random (often used for structured interviews as part of a survey strategy)
Procedure:
o Divide the population into specific subgroups (quota) based on relevant variables
o Calculate, based on relevant and available data, for each subgroup the number of elements to be selected
o Give each interviewer an assignment which states the number of cases in each quota from which they must collect data
o Combine the data collected by interviewers to provide the full sample
E.g.: 2 quotas, Female and Male. Then within each group we select the elements.
Quota:
o Are based on relevant and available data
o *Are usually relative to the proportions in which they occur in the population* (e.g., 48% females in population => 480 females and 520 males being selected in a sample of 1000 participants)
o Without sensible and relevant quotas, data collected may be biased
*Precision control = Proportions in sample perfectly mirror the proportions in the population
Precision control Example:
o Interest in consumption habits among residents aged 16+ in a medium-sized village
o Sample must be representative in terms of residence and age
o Population statistics: 24 420 residents aged 16+
o Sample: 1/12 of population => 2035 sample cases
o 3 districts and 4 age groups => 12 quotas
Frequency Control:
Representative in terms of criterion
Question: An association has 750 members. In the table below, the distribution of these members is given in terms of gender and age.
Draw a quota sample of 125 subjects, taking into account:
- Gender
- Age
- Gender & Age
Answer:
Gender:
o Males: 125 * (367/750) = 61
o Females: 125 * (383/750) = 64
Age:
o 18-25: 125 * (173/750) = 29
o 26-49: 125 * (379/750) = 63
o 50+: 125 * (198/750) = 33
Gender & Age: (We now consider every single cell)
o Males 18-25: 125 * (98/750) = 16
o Males 26-49: 125 * (191/750) = 32
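The quota sizes in this answer follow from one rule: sample size times the group's share of the membership, rounded to a whole number. A small sketch is given below; the function name is an assumption, and only the counts stated in the answer are used.

```python
def quota_sizes(counts, sample_size, total=None):
    """Quota per group = sample_size * group count / total, rounded to whole cases."""
    total = sum(counts.values()) if total is None else total
    return {group: round(sample_size * n / total) for group, n in counts.items()}

gender = quota_sizes({"Males": 367, "Females": 383}, 125)
age = quota_sizes({"18-25": 173, "26-49": 379, "50+": 198}, 125)
# Only two gender-by-age cells are given in the answer, so the total is passed explicitly
cells = quota_sizes({"Males 18-25": 98, "Males 26-49": 191}, 125, total=750)
print(gender, age, cells)
```

Rounding each quota separately does not guarantee that the quotas add up to exactly 125, so the total should be checked after rounding.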
Quota Sampling:
Advantages compared to probability sampling techniques
o Less costly
o Can be set up very quickly
o Does not require a sampling frame
Disadvantages
o Because the interviewers can choose within quota boundaries whom they interview, your quota sample may be subject to bias (e.g., easily accessible respondents who appear to be willing to answer the questions)
o As the sample is not probability based, you cannot measure the level of certainty or margins of error
Judgemental Sampling
o = Purposive sampling
o You need to use your judgement to select cases that will best enable you to answer your research question
o Often used when:
- Working with very small samples (such as in case study research or when you wish to select cases that are particularly informative). E.g., Industrial research among experts
- Doing qualitative research
- Doing exploratory research
Those samples cannot be considered to be statistically representative of the total population!
The more common judgemental sampling strategies:
o Extreme case or deviant sampling
o Heterogeneous or maximum variation sampling
o Homogeneous sampling (focus group discussion: stimulate conversation by putting similar people together, as they will then be likely to debate more)
o Critical case sampling
o Typical case sampling
o Theoretical sampling (cf. Grounded Theory)
Snowball Sampling
Commonly used when it is difficult to identify members of the desired population
Procedure:
o Make contact with one or two cases in the population
o Ask these cases to identify further cases
o Ask these new cases to identify further new cases (and so on)
o Stop when either no new cases are given or the sample is large enough (or when theoretical saturation is reached)
Main problem = Making initial contact
The problem of bias is huge
o Respondents are most likely to identify other potential respondents who are similar to themselves, resulting in a homogeneous sample
Self-selection Sampling
o It occurs when you allow each case, usually individuals, to identify their desire to take part in the research
o You therefore:
- Publicise your need for cases either by advertising or by asking them to take part
- Collect data from those who respond
o Problem = representativeness
- Cases that self-select often do so because of their feelings or opinions about the research question
E.g.: People posting on Facebook to ask people to participate in a survey. Problem: Bias
Example: Patrick's research was concerned with the impact of student loans on studying habits. He had decided to administer his questionnaire using the Internet. He publicised his research on Facebook in a number of groups' pages, using the associated description to invite people to self-select by clicking on the link to the questionnaire. Those who self-selected by clicking on the hyperlink were automatically taken to the online questionnaire he had developed using the Qualtrics online survey software.
Convenience Sampling
o Involves selecting cases haphazardly only because they are easily available (or most convenient) to obtain for your sample
- E.g., the person interviewed at random in a shopping centre for a television programme
o Widely used
o Advantages:
- Cheap
- Quick (Suited for exploratory research)
o Though prone to bias and influences that are beyond your control: cases appear in the sample only because of the ease of obtaining them. Bias decreases as the population becomes more homogeneous
Impact of various factors on choice of non-probability sampling techniques
o Likelihood of sampling being representative
o Types of research in which useful (e.g., non-probability techniques often used in exploratory research)
o Relative costs (Note: Non-probability techniques are often used as they often imply lower costs compared to probability sampling techniques)
o Note: Where it is not possible to construct a sampling frame you will need to use non-probability sampling techniques
Question: For the following research question, it has not been possible for you to obtain a sampling frame. Suggest the most suitable sampling technique to obtain the necessary data, and motivate your choice.
Research question: Would users of the tennis club be prepared to pay a 10 per cent increase in subscriptions to help fund two extra tennis courts? You need the answer by tomorrow morning.
Answer: Convenience sample (not much time). But if we have time, probability sampling technique.
EXAM: If we have a sampling frame => Probability technique
For many research projects, you will have to combine different sampling techniques
Question: Is the following statement true or false? Motivate your answer.
Stratified sampling can be seen as random quota sampling
Answer:
DETERMINE THE SAMPLE SIZE
Probability sampling techniques
The confidence interval approach
Normal Distribution
95% of the values lie between -1.96 * standard deviation and +1.96 * standard deviation from the mean
A z-score of 1.96 corresponds with a confidence level of 95%
Statistical Inference
Important in research is to calculate statistics, such as the sample mean and sample proportion, and use them to estimate the corresponding true population values
Statistical inference: The process of generalising the sample results to a target population
Confidence Intervals
o We are thus interested in using the sample statistic (e.g., the sample mean) as an estimate of the value in the population
o An approach to assessing the accuracy of the sample mean as an estimate of the mean in the population is to calculate boundaries (Confidence Intervals) within which we believe the true value of the mean will fall
o Typically, we look at 95% confidence intervals
o This means that 95% of the time, the true value of the population will fall within the boundaries of the confidence interval
o In other words, if you collected 100 samples, calculated the mean of each and then calculated a confidence interval for that mean, then for about 95 of these samples, the confidence intervals we constructed would contain the true value of the mean in the population
o X̄ = sample mean
o μ = population mean
o σ = standard deviation of population
o n = sample size
o Confidence level (Z)
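The "95 out of 100 samples" interpretation can be checked with a small simulation. The sketch below assumes the usual interval for a mean with known population standard deviation, sample mean +/- Z * σ / sqrt(n). That formula is not written out in these notes, and the population values (mean 100, standard deviation 15, n = 30) are arbitrary.

```python
import math
import random
import statistics

def ci_coverage(mu, sigma, n, z=1.96, trials=2000, seed=0):
    """Share of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)  # Z times the standard error of the mean
    hits = 0
    for _ in range(trials):
        sample_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials

print(ci_coverage(mu=100, sigma=15, n=30))
```

The printed share should be close to 0.95; it will not be exactly 0.95 because the simulation itself is a random sample of intervals.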
The range of a normally distributed variable is approximately equal to +/- 3 standard deviations, and
one can thus estimate the standard deviation by dividing the range by 6.
Example:
Suppose a researcher wants to estimate the monthly household savings more precisely so that the
estimate will be within +/- 5 of the true population value. What should be the size of the sample?
1) The researcher should specify the level of precision. This is the maximum permissible difference between the sample mean and the population mean.
D = 5
2) The researcher should specify the level of confidence and determine the z-value associated with this confidence level
Confidence level = 95% => z-value = 1.96
3) The researcher should determine the standard deviation of the population.
Secondary sources indicate a standard deviation of 55 (σ = 55)
D = 5, z = 1.96, σ = 55
n = (z^2 * σ^2) / D^2 = (1.96^2 * 55^2) / 5^2 = 464.83 => 465 (rounded to next highest integer)
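The calculation for a mean is n = (z * σ / D)^2, rounded up to the next whole number. The helper below is only a sketch with a hypothetical function name. It reproduces the household savings example and also shows the range/6 estimate of σ for a variable ranging from 400 to 700.

```python
import math

def sample_size_for_mean(z, sigma, d):
    """Minimum n so that the sample mean is within +/- d of the population mean."""
    return math.ceil((z * sigma / d) ** 2)  # always round up to the next integer

# Household savings example: D = 5, z = 1.96, sigma = 55
print(sample_size_for_mean(1.96, 55, 5))

# If sigma is unknown, estimate it as range / 6, e.g. values between 400 and 700
sigma_estimate = (700 - 400) / 6
print(sample_size_for_mean(1.96, sigma_estimate, 5))
```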
Sample size:
o The larger the population, the larger the sample size
o The higher the degree of confidence, the larger the sample size
o The higher the degree of precision, the larger the sample size
The choice of sample size is thus governed by:
o The confidence you need to have in your data: The level of certainty that the characteristics of the data collected will represent the characteristics of the total population
o The margin of error that you can tolerate: The accuracy you require for any estimates made from your sample
o The variability in the population in terms of the variable(s) of interest
Define the level of precision = The maximum permissible difference (D) between the sample mean and the population mean
We already determined the level of precision (D)
But what about Z and σ?
o Specifying Z is about specifying the level of confidence
A 95% confidence level is desired => Z = 1.96
o Determine σ (= the standard deviation of the population)
Secondary sources, pilot study or (max value - min value)/6
EXAM question: A big company wants to know how much money (in euro) each of its managers spends on lunches per month. It is known that the maximum amount of money spent is 700 euro while the minimum is 400 euro. The company wants the result to be accurate to within 5 euro and wants to make a prediction with a confidence level of 95%. How large should the sample size be?
Answer:
Level of precision: D = 5
Confidence 95%: z = 1.96
σ = (700 - 400)/6 = 50
n = (1.96^2 * 50^2) / 5^2 = 384.16 => 385
Sample size determination: Proportions
Example: Suppose a researcher is interested in estimating the proportion of households in a particular region that have bought clothes online. What should be the sample size?
1) The researcher should specify the level of precision. This is the maximum permissible difference between the sample proportion and the population proportion.
D = 0.05
2) The researcher should specify the level of confidence and determine the z-value associated with this confidence level.
Confidence level = 95% => z-value = 1.96
3) The researcher should determine the population proportion. Secondary sources indicate a population proportion of 0.64 (π = 0.64)
D = 0.05, z = 1.96, π = 0.64
n = [1.96^2 * 0.64 * (1 - 0.64)] / 0.05^2 = 354.04 => 355
EXAM question: A researcher wants to know the percentage of households that has a loyalty card of a certain supermarket. You desire a precision level of 5 percentage points (and a 95% confidence level). How large should the sample size be?
Level of precision: D = 5 (percentage points)
Confidence 95%: z = 1.96
π = 50 (%)
n = [1.96^2 * 50 * (100 - 50)] / 5^2 = 384.16 => 385
π = population proportion? Secondary sources, pilot study, or conservative (π = 0.5)
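For proportions the calculation is n = z^2 * p * (1 - p) / D^2, rounded up, where p is the assumed population proportion. The sketch below uses a hypothetical function name and works with proportions between 0 and 1 rather than percentages. It reproduces both the online clothes example and the loyalty card question.

```python
import math

def sample_size_for_proportion(z, p, d):
    """Minimum n so that the sample proportion is within +/- d of the population proportion."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)  # always round up

# Online clothes example: proportion 0.64, D = 0.05, z = 1.96
print(sample_size_for_proportion(1.96, 0.64, 0.05))

# No prior information: the conservative choice p = 0.5 gives the largest n
print(sample_size_for_proportion(1.96, 0.5, 0.05))
```

The product p * (1 - p) is largest at p = 0.5, which is why 0.5 is the conservative assumption when nothing is known about the population proportion.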
Other factors influence the determination of the sample size:
o Time resources
o Financial resources
o Type of data analysis
o Access
o Expected response (see response rate later)
o ...
Non-probability sampling
o Formulas of probability sampling techniques
- Are based on the assumption that the sample cases are randomly selected
- Formulas are just guidelines
o Larger sample sizes do not necessarily lead to higher levels of confidence and precision
o However, take into account
- Variability in the target group
- Goal of sampling
- Importance of research for management/client
o Or you could consider sample sizes used in similar studies, for instance
Non-response and response rate
The non-response problem
o In reality, you are likely to have non-responses
o Possible causes of non-response
- Refusal to participate
- Ineligibility to respond
- Inability to locate respondent
- Respondent located but unable to make contact
o Possible consequences of non-response
- Lower confidence and precision levels due to smaller sample size
- Non-response bias: People who refuse differ from actual respondents
o As part of your research report, you will need to include the response rate:
Total response rate = Total number of responses / (Total number in sample - ineligible)
Active response rate = Total number of responses / (Total number in sample - (ineligible + unreachable))
Total and active response rate: Example
Suzan has decided to administer a telephone questionnaire to people who had left her company over the past five years. She obtained a list of the 1034 people who had left over this period (the total population) and selected a 50 per cent sample. Unfortunately, she could obtain current telephone numbers for only 311 of the 517 ex-employees who made up her total sample. Of these 311 people who were potentially reachable, she obtained a response from 147. In addition, her list of people who had left her company was inaccurate, and 9 of those she contacted were ineligible to respond, having left the company over five years earlier.
Total response rate = Total number of responses / (Total number in sample - ineligible)
Total response rate = 147 / (517 - 9) = 28.9 %
Active response rate = Total number of responses / (Total number in sample - (ineligible + unreachable))
Active response rate = 147 / (311 - 9) = 48.7 %
Estimating response rates and actual sample size required
o Non-response = Reality => You should estimate the likely response rate and increase the sample size accordingly
- First of all, determine the minimal sample size (taking into account certain confidence and precision levels)
- Second, estimate the likely response rate
- Third, calculate the actual sample size you require
Example: Peter was a part-time student employed by a large manufacturing company. He had decided to send a questionnaire to the company's customers and calculated that a minimum sample size of 439 was required. From previous questionnaires that his company had used to collect data from customers, Peter knew the likely response rate would be approximately 30 per cent. Using these data he could calculate his actual sample size:
na = 439 x 100 / 30 = 43 900 / 30 = 1463
Peter's actual sample size, therefore, needed to be 1463 customers.
na = The actual sample size required
n = The minimum sample size
re % = The estimated response rate expressed as a percentage
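With the symbols defined above, the calculation na = n x 100 / re% can be sketched as a small Python helper. The function name and the input check are assumptions made for this illustration; the figures are those of the Peter example (n = 439, re% = 30).

```python
def actual_sample_size(n_min, response_rate_pct):
    """Actual sample size required: na = n x 100 / re%."""
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response rate must be a percentage between 0 and 100")
    return n_min * 100 / response_rate_pct


na = actual_sample_size(439, 30)
print(round(na))  # 1463, rounded to the nearest whole customer as in the notes
```

Because the unrounded result is 1463.33, rounding up to 1464 would be the slightly more cautious choice; the notes use 1463.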
How to estimate the response rate?
o Consider the response rates achieved for similar research that has already been undertaken
  Beware, response rates can vary considerably when collecting primary data!
  - E.g., postal questionnaires: often lower than 50% (?)
  - E.g., face-to-face contact: often higher (?)
  - E.g., online questionnaires: often lower than 30% (?)
o Alternatively, err on the side of caution (35-50 per cent reasonable?)
Possible consequences of non-response: Lower confidence and precision levels due to smaller sample size
Increasing the actual sample size is useful in case non-response only results in less confidence and precision. However, increasing the actual sample size is no solution
o When doing longitudinal research in which the same respondents need to be re-examined
o If it is a matter of non-response bias
  - Refusers differ on observable characteristics (gender, education, ...) compared to respondents
  - Refusers might also differ on non-observable characteristics!
How to trace non-response bias?
o Comparing characteristics of respondents with refusers
  - On moment of refusal
  - Afterwards by means of additional contact
o Comparing characteristics of respondents with population
o Still not the solution when there would be differences in terms of non-observable characteristics
How to tackle non-response bias?
o Increasing the number of contacts
o Work with substitutes that are randomly selected, but which match on crucial characteristics (e.g., gender)
  - However, this measure is not able to solve the bias completely
EXECUTE SAMPLING PROCESS
VALIDATE SAMPLE
o Once data are collected from a sample, comparisons between the structure of the sample and the structure of the population should be made
o If it is found that the structure of a sample does not match the target population (due to population specification error, sampling frame error, sample selection bias, non-response bias), a weighting scheme can be used
- A statistical procedure that attempts to account for these errors/biases by assigning differential weights to the data depending on the response rates
Formula for the actual sample size required: na = n x 100 / re%
Weighting
o Each case in the database is assigned a weight
o The effect of weighting is to increase or decrease the number of cases in the sample that possess certain characteristics
o Most widely used to make the sample data more representative of a target population on specific characteristics
o Also used to adjust the sample so that greater importance is attached to participants with certain characteristics
o Because it destroys the self-weighting nature of the sample design, this procedure should be applied with caution!
o Do not forget to report this procedure!
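One simple way to picture a weighting scheme is to give every case in a group the weight "population share / sample share". The sketch below assumes this cell-weighting approach, which is one possible scheme and not necessarily the one meant in the course. The group names and figures are hypothetical.

```python
def cell_weights(population_share, sample_counts):
    """Weight per group = share in the population / share in the sample."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        weights[group] = population_share[group] / sample_share
    return weights


# Hypothetical figures: women are over-represented in the sample.
population_share = {"female": 0.50, "male": 0.50}
sample_counts = {"female": 120, "male": 80}

weights = cell_weights(population_share, sample_counts)

# Weighted number of cases per group: the sample now mirrors the population.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}

for group in sample_counts:
    print(group, round(weights[group], 3), round(weighted_counts[group], 1))
```

Each woman counts for less than one case and each man for more than one, so the weighted sample has the same 50/50 structure as the population while the total number of cases stays at 200.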
Chapter 5: Using secondary data
Research questions might be answered using some combination of primary and secondary data. Secondary data can be more effective in terms of money and time.
TYPES OF SECONDARY DATA AND USES IN RESEARCH
May be both quantitative and qualitative data
May be raw data (received little if any processing) or compiled data (received some form of selection or summarising)
Primarily used in descriptive and explanatory research (also possible in exploratory research!)
Within business and management research, secondary data are most frequently used as part of a case study or survey research strategy (also used as part of other research strategies!)
Three main subgroups of secondary data
DOCUMENTARY SECONDARY DATA
Often used in research projects that also collect primary data (But you can also use them on their own or with other sources of secondary data!)
Include text materials and non-text materials (can be nice to create a background with text)
Can be analysed both quantitatively and qualitatively
Can be used to help to triangulate findings based on other data
Documentary sources you have available can depend on access issues as well as success in locating these sources
SURVEY-BASED SECONDARY DATA
Data collected using a survey strategy (e.g., questionnaires) that have already been analysed for their original purpose
Collected through one of three distinct subtypes of survey strategy:
o Censuses
  - Usually carried out by governments
    Data are often: clearly defined, well documented, of high quality, easily accessible, widely used
  - Are unique as, unlike surveys, participation is obligatory
    Therefore, they provide very good coverage of the population surveyed
  - E.g., population and housing censuses
o Continuous and regular surveys
  - Those surveys, excluding censuses, that are repeated over time
    E.g., surveys where data are collected throughout the year
      E.g., UK's General Lifestyle Survey (GLF)
    E.g., surveys repeated at regular intervals
      E.g., EU Labour Force Survey
      Comparative data
  - Also carried out by non-governmental bodies
    E.g., market research surveys
    Data often costly to obtain
  - Also carried out by large organisations
    E.g., employee attitude survey
    Often difficult to gain access due to sensitive nature
o Ad-hoc surveys (the result of doing a survey just once)
  - = A general term normally used to describe the collection of data that only occurs once due to the specificity of focus
  - Usually one-off surveys
  - Usually far more specific in their subject matter
  - Because of their ad hoc nature, it will probably be more difficult to discover relevant surveys
MULTIPLE-SOURCE SECONDARY DATA
= Secondary data created by combining two or more different data sets prior to the data being accessed for the research. These data sets can be based entirely on documentary or on survey data, or can be an amalgam of the two
E.g., Various compilations of company information
o E.g., Europe's 15,000 Largest Companies
Some methods of compilation
o Extract and combine selected comparable variables from a number of surveys or from the same survey that has been repeated a number of times to provide longitudinal data (time-series data)
o Data compiled from the same cases over time using a series of snapshots to form cohort studies
o Secondary data from different sources can be combined, if they have the same geographical basis, to form area-based data sets (E.g., Europe in Figures: Eurostat Yearbook)
Question: The Facebook page of McDonald's is an example of:
a) Documentary secondary data
b) Survey secondary data
c) Multiple source secondary data
d) None of the above types of secondary data
Answer: a) Documentary secondary data
HOW TO LOCATE SECONDARY DATA?
Are the data you need available? Requires you to:
STEP 1: Establish whether the sort of data you require are likely to be available as secondary data
STEP 2: Locate the precise data you require
STEP 1: ESTABLISHING THE LIKELY AVAILABILITY OF SECONDARY DATA
Literature review (Reference list)
Quality national newspapers
Subject-specific textbooks
Tertiary literature (e.g., indexes and catalogues)
Informal discussions
STEP 2: LOCATE SECONDARY DATA
Once you have ascertained that secondary data are likely to exist, you need to find their precise location
o Relatively straightforward for secondary data held in online databases or held by specialist libraries
o Data held by organisations are more difficult to locate (time consuming, quality?, ...)
o Once you have located a possible secondary data set, you need to be certain that it will meet your needs
Advantages of secondary data
o May have fewer resource requirements
o Unobtrusive
o Longitudinal studies may be feasible
o Can provide comparative and contextual data
o Can result in unforeseen discoveries
o Permanence of data
Disadvantages of secondary data
o May be collected for a purpose that does not match your need
o Access may be difficult or costly
o Aggregations and definitions may be unsuitable
o No real control over data quality
o Initial purpose may affect how data are presented
EVALUATING SECONDARY DATA SOURCES
We need to make sure that the secondary data are valid.
Secondary data must be viewed with the same caution as any primary data! You need to be sure that:
o They will enable you to answer your research question
o The benefits associated with their use will be greater than the costs
o You will be allowed access to the data
Most authors suggest a range of validity and reliability criteria against which potential secondary data can be evaluated
These criteria can be incorporated into a three-stage process
1. Overall suitability
2. Precise suitability
3. Costs and benefits
1. OVERALL SUITABILITY
Measurement validity
o Do the measures used match those you need?
o E.g., A manufacturing organisation recording monthly sales whereas you are interested in monthly orders
o E.g., Use minutes of company meetings as a proxy for what actually happened in those meetings
Coverage and unmeasured variables
o Do secondary data cover the population about which you need data, for the time period you need, and contain variables that will enable you to answer the research question?
o Some secondary data sets may not include variables you have identified as necessary for your analysis (i.e., unmeasured variables)
Checklist:
o Does the data set contain the information you require to answer your research question(s)?
o Do the measures used match those you require?
o Is the data set a proxy for the data you really need?
o Does the data set cover the population that is the subject of your research?
o Does the data set cover the geographical area that is the subject of your research?
o Can data about the population that is the subject of your research be separated from unwanted data?
o Are the data for the right time period or sufficiently up to date?
o Are data available for all the variables you require to answer your research question(s)?
o Are the variables defined clearly?
2. PRECISE SUITABILITY
Reliability and validity
o Quick option: Assess the authority or reputation of the source
o In-depth assessment:
  - Who is responsible for the data?
  - Method used to collect the data? (we can contact the person who conducted the research and ask about the methodology)
  - Context in which the data were collected?
  - How were data analysed and reported?
Measurement bias: Can occur for two reasons
1) Deliberate or intentional distortion of data
  - E.g., Purpose of study is to reach a predetermined conclusion
  - E.g., People responding to a structured interview adjusting their responses to please the interviewer
  - Triangulation!
2) Changes in the way data were collected
  - Particularly important for longitudinal data sets!
Checklist:
o How reliable is the data set you are thinking of using?
o How credible is the data source?
o Is it clear what the source of the data is?
o Do the credentials of the source of the data (author, institution or organisation sponsoring the data) suggest it is likely to be reliable?
o Do the data have an associated copyright statement?
o Do associated published documents exist?
o Does the source contain contact details for obtaining further information about the data?
o Is the method described clearly?
o If sampling was used, what was the procedure and what were the associated sampling errors and response rates?
o Who was responsible for collecting or recording the data?
o (For surveys) Is a copy of the questionnaire or interview checklist included?
o (For compiled data) Are you clear how the data were analysed and compiled?
o Are the data likely to contain measurement bias?
o What was the original purpose for which the data were collected?
o Who was the target audience and what was its relationship to the data collector or compiler (were there any vested interests)?
o Have there been any documented changes in the way the data are measured or recorded including definition changes?
o How consistent are the data obtained from this source when compared with data from other sources?
o Have the data been recorded accurately?
3. COSTS AND BENEFITS
Checklist:
o What are the financial and time costs of obtaining these data?
o Can the data be downloaded into a spreadsheet, statistical analysis software or word processor?
o Do the overall benefits of using these secondary data sources outweigh the associated costs?
Question: Suppose you are undertaking a research project as part of your research methods course in which you need to investigate the following research question:
How has Belgium's import and export trade with other countries altered since its entry into the European Union?
Give one argument that you could use to convince the project leader of the suitability of using secondary data to answer this research question.
Answer: Time-wise, it is not possible to gather primary data, as this would be very time consuming.
Question: Which of the following statements is wrong?
A) Primary data become secondary data
B) Primary data are more reliable and valid compared to secondary data
C) Research projects might combine primary and secondary data.
D) Secondary data enable researchers to triangulate their primary research findings
Answer: B) Primary data are more reliable and valid compared to secondary data
Chapter 6: Collecting Primary Data through observation
If your research question is concerned with what people do, an obvious way in which to discover this is to watch them do it. This is essentially what observation involves:
The systematic observation, recording, description, analysis and interpretation of (people's) behaviour
Observation is very often used in marketing research. E.g.: Loyalty card: observing the buying pattern of customers. Cookies online: browsing behaviour (Structured Observation)
Two types of observation are examined in this chapter
1. Participant observation
  - Qualitative
  - Emphasis is on discovering the meaning that people attach to their actions
2. Structured observation
  - Quantitative
  - Emphasis is on the frequency of actions
Observation can be used as either the main method of data collection or to supplement other methods!
What can be observed?
o Behaviour and physical actions
o Verbal behaviour
o Body language
o Spatial aspects of relationships
o Time patterns
o Physical objects (products on shelves)
o Activities from the past
o ...
Dimensions on which observation methods differ
o Natural or Manipulated
  - Natural: observe people in their natural environment
  - Manipulated: in a lab setting
o Personal or Mechanical
  - Personal: human being observing
  - Mechanical: done by an eye tracker (to see what attracts people more)
o Hidden or Not Hidden
  - Hidden: concealed identity
Question: The sellers of a multimedia store visit competitive stores and write down their prices. This observation is:
a) Natural - Personal - Not hidden
b) Manipulated - Mechanical - Not hidden
c) Manipulated - Personal - Hidden
d) Natural - Personal - Hidden
Answer: d) Natural - Personal - Hidden
PARTICIPANT OBSERVATION
Qualitative
Emphasis is on discovering the meaning that people attach to their actions.
Observation in which the researcher attempts to participate fully in or closely observe the lives and activities of the research subjects and thus becomes a member of the subjects' group(s), organisation(s) or community
This enables researchers to share their experiences by not merely observing what is happening but also feeling it
E.g., Street Corner Society by W.F. Whyte
TYPOLOGY OF PARTICIPANT OBSERVATION RESEARCHER ROLES
The time you have to conduct the research will determine the role you would take as a researcher.
Complete participant:
o Preventing social desirability
o Raises questions of ethics
o Might lose sight of research purpose
Observer as participant:
o Able to focus on the researcher role
o Lose the emotional involvement
Participant as observer:
o Not always easy to gain trust of the group you observe
FACTORS THAT WILL DETERMINE THE CHOICE OF PARTICIPANT OBSERVER ROLE
o Purpose of your research
  - Which role suits your research question?
  - E.g.: A phenomenon about which the research informants would be naturally defensive is one that lends itself to the complete participant role
o The time you have to devote to your research
  - Some of the roles may be very time consuming
  - E.g.: A period of attachment might be necessary
o The degree to which you feel suited to participant observation
o Organisational access
o Ethical considerations
  - The degree to which you reveal your identity as the researcher or adopt a covert stance will be dictated by ethical considerations
Note making and recording data
Note making: Your notes are likely to be composed of different types of data:
o Primary observations
  - What happened? What was said?
o Secondary observations
  - Statements by observers about what happened or was said
o Experiential data
  - Perceptions and feelings as you experience the process you are researching
o Contextual data
  - Data related to the research setting and organisational structures and communication patterns that will help you to interpret other data
Data Collection
No formal interviews but informal discussions
Recording must take place on the same day as the fieldwork in order to not forget valuable data
Data Analysis
Data from participant observation are analysed like other qualitative data (not part of this course; see BBA3)
Data will start to be analysed at the time you collect them (i.e., data collection and data analysis will be carried out simultaneously)
o Promising lines of enquiry that you wish to follow up in your continued observation will emerge (cfr. analytic induction)
ISSUES RELATED TO RELIABILITY AND VALIDITY
Participant observation has high ecological validity as it involves studying social actors and social phenomena in their natural settings
However, using participant observation may lead to a number of threats to reliability and validity:
o Observer error
o Observer bias
o Observer effect
Observer Error Lack of understanding of or overfamiliarity with setting may lead you to un