Data Analysis - Market Research by Prof Sachin Udepurkar
1) Stages of data preparation
2) Data analysis
3) Descriptive statistics
By Prof. Sachin Udepurkar
Data preparation is the conversion of information from a questionnaire into a form that can be transferred to a data warehouse
This process usually follows a four-step approach: data validation, editing and coding, data entry, and data tabulation
Error detection begins in the first phase and continues throughout the process
The purpose of data preparation is to take data in its raw form and convert it into a form that establishes meaning and creates value for the user
[Figure: Data preparation (Validation, Editing & Coding, Data Entry, Data Tabulation), with error detection throughout, followed by Data Analysis (Descriptive Analysis, Uni & Bivariate Analysis, Multivariate Analysis) and Interpretation]
Data validation is the process of determining, to the extent possible, whether a survey's interviews or observations were conducted correctly and are free of fraud or bias
In many data collection approaches it is not always convenient to monitor the data collection process closely. To facilitate accurate data collection, each respondent's name, address and phone number may be recorded
While this information is not used for analysis, it does enable the validation process to be completed
Curbstoning:
A term used in the marketing research industry for the falsification of collected data, for example an interviewer filling in the questionnaire personally instead of interviewing a respondent
The process of data validation covers five areas:
1. FRAUD: Was the person actually interviewed? Did the interviewer contact the respondent simply to get a name and address and then proceed to fabricate responses? Did the interviewer use a friend to obtain the necessary information?
2. SCREENING: Were the data collected according to the prescribed criteria, such as household income level, recent purchase of a specific product or brand, or even gender or age? For example, the interview procedure may require that only female heads of households with an annual household income of Rs 25000 or more be interviewed. In this case a validation callback would verify each of these factors
Data validation areas:
1) Fraud
2) Screening
3) Procedure
4) Completeness
5) Courtesy
3. PROCEDURE: In marketing research, it is critical that the data be collected according to a specific procedure. For example, many customer exit interviews must occur in a designated place as the respondent leaves a certain retail establishment. Here a validation callback may be necessary to ensure that the interview took place in the proper setting and not in some social gathering area such as a party or a park
4. COMPLETENESS: In order to speed through the data collection process, an interviewer may ask the respondent only a few of the requisite questions and then make up answers to the remaining questions
To determine whether the interview is valid, the researcher could recontact a sample of respondents and ask about questions from different parts of the interview form
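The callback idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Interview` record layout, the field names, the 50% callback fraction and the function names are all hypothetical, and the screening rule simply mirrors the example criteria (female head of household, annual household income of Rs 25000 or more). In practice the answers obtained on the callback, not the recorded ones, would be compared against the criteria.

```python
import random
from dataclasses import dataclass


@dataclass
class Interview:
    # Hypothetical record layout; field names are illustrative only.
    respondent_id: int
    gender: str
    head_of_household: bool
    annual_household_income: int  # in rupees


def passes_screening(interview, min_income=25000):
    """Return True if the interview meets the example screening criteria."""
    return (
        interview.gender == "female"
        and interview.head_of_household
        and interview.annual_household_income >= min_income
    )


def callback_sample(interviews, fraction=0.1, seed=None):
    """Draw a random subset of interviews to recontact for validation."""
    items = list(interviews)
    if not items:
        return []
    k = max(1, round(len(items) * fraction))  # always recontact at least one
    return random.Random(seed).sample(items, k)


interviews = [
    Interview(1, "female", True, 40000),
    Interview(2, "female", True, 18000),  # income below the threshold
    Interview(3, "male", True, 60000),    # does not match the gender criterion
    Interview(4, "female", True, 25000),
]

to_recontact = callback_sample(interviews, fraction=0.5, seed=1)
flagged = [i.respondent_id for i in interviews if not passes_screening(i)]
print("Recontact:", [i.respondent_id for i in to_recontact])
print("Failed screening:", flagged)
```

Passing a fixed `seed` makes the callback sample reproducible, which helps if the selection has to be audited later.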
Data editing is the process whereby the raw data are checked for mistakes made by either the interviewer or the respondent
By scanning each completed interview, the researcher can check the following areas of concern:
Asking the proper questions
Accurate recording of answers
Correct screening questions
Responses to open-ended questions
Coding is the grouping of, and assignment of values to, the various responses from the survey instrument
Codes are typically numerals from 0 to 9, because numbers are quick and easy to input and computers work better with numbers than with alphanumeric values
Coding can be tedious if certain issues are not addressed before the data are collected
For example, a well-planned and well-constructed questionnaire can reduce the time spent on coding and increase the accuracy of the process if coding is incorporated into the design of the questionnaire
In questionnaires that do not use such simple coded responses, the researcher will establish a master code on which the assigned numeric values are shown
Researchers typically use a four-step process to develop codes for responses:
1. Generate a list of as many potential responses as possible
2. Consolidate the responses: those having the same meaning are combined into one
3. Assign a numerical value as a code to each consolidated response
4. Assign a coded value to each response
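The four coding steps can be illustrated with a small Python sketch. Everything specific in it is an assumption made for illustration: the open-ended question, the categories, the synonym table, the catch-all code 9 and the function name `code_response` are all hypothetical.

```python
# Hypothetical master code for an open-ended question such as
# "Why did you choose this store?" (categories invented for illustration).
MASTER_CODE = {
    "price": 1,
    "location": 2,
    "service": 3,
}

# Consolidation: differently worded answers with the same meaning
# are mapped to one category.
SYNONYMS = {
    "cheap": "price",
    "low prices": "price",
    "close to home": "location",
    "nearby": "location",
    "friendly staff": "service",
}

OTHER_CODE = 9  # catch-all for responses that are not in the master code


def code_response(raw):
    """Map a raw verbatim answer to its numeric code."""
    text = raw.strip().lower()
    category = SYNONYMS.get(text, text)
    return MASTER_CODE.get(category, OTHER_CODE)


answers = ["Cheap", "nearby", "Price", "good parking"]
coded = [code_response(a) for a in answers]
print(coded)  # [1, 2, 1, 9]
```

Responses that fall into the catch-all code are worth reviewing by hand, since a frequent "other" answer may deserve its own entry in the master code.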
Data entry follows validation, editing and coding
It is the procedure used to enter the data into the computer for subsequent data analysis
It includes those tasks involved with the direct input of the coded data into a specified software package that enables the research analyst to manipulate and transform the raw data into useful information
One critical task of data entry personnel is to ensure that the data entered are correct and error free
The first step in error detection is to determine whether the software used for data entry and tabulation allows the researcher to perform "error edit routines", which identify the wrong type of data. Example: suppose that for a particular field on a given data record only the codes 1 or 2 should appear. An error edit routine can display an error message on the data output if any number other than 1 or 2 has been entered
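A minimal sketch of such an error edit routine in Python, assuming each data record is a dictionary keyed by field name. The field name `q1`, the sample records and the function name `error_edit` are hypothetical; real data entry packages provide their own equivalents.

```python
def error_edit(records, field, valid_codes):
    """Return (row_number, value) pairs whose code is not allowed for the field."""
    errors = []
    for row_number, record in enumerate(records, start=1):
        value = record.get(field)  # None if the field was never entered
        if value not in valid_codes:
            errors.append((row_number, value))
    return errors


records = [
    {"id": 101, "q1": 1},
    {"id": 102, "q1": 2},
    {"id": 103, "q1": 7},     # keying error: only 1 or 2 is allowed
    {"id": 104, "q1": None},  # missing entry
]

problems = error_edit(records, "q1", {1, 2})
for row, value in problems:
    print(f"Row {row}: invalid code {value!r} in field q1")
```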
Another approach to error detection is for the researcher to review a printed representation of the entered data
The final approach to error detection is to produce a data/column list for the entered data. A quick review of this data/column list can indicate to the analyst whether inappropriate codes were entered into data fields
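One way to read the data/column list is as a count of how often each code appears in a given field. The sketch below takes that interpretation, which is an assumption, and uses Python's `collections.Counter`; the field name and records are hypothetical.

```python
from collections import Counter


def column_list(records, field):
    """Count how many times each code was entered in one field."""
    return Counter(record.get(field) for record in records)


entered = [{"q1": 1}, {"q1": 2}, {"q1": 2}, {"q1": 7}, {"q1": 1}]
counts = column_list(entered, "q1")

for code, n in sorted(counts.items()):
    print(f"code {code}: {n} record(s)")
```

If only the codes 1 and 2 are valid for this field, the single record carrying code 7 stands out immediately in the listing.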
Once the data have been collected and prepared for analysis, there are some basic statistical analysis procedures that the marketing researcher will want to perform
An obvious need for these statistics comes from the fact that almost all data sets are disaggregated
Graphics should be used whenever practical, so that the information user can quickly grasp the essence of the information developed in the research project
Charts can also be an effective visual aid that enhances the communication process and adds clarity and impact to research reports, e.g. bar charts, line charts, and pie (round) charts
Data must be accurately scored and systematically organized to facilitate data analysis through descriptive analysis, univariate and bivariate analysis, and multivariate analysis
Descriptive statistics: permit the researcher to describe many pieces of data with a few indices
Statistics: indices calculated by the researcher for a sample drawn from a population
Parameters: indices calculated by the researcher for an entire population
Types of descriptive statistics:
1) Graphs
2) Measures of central tendency
3) Measures of variability
Graphs: Representations of data that enable the researcher to see what the distribution of scores looks like, e.g. bar graph, line graph, and pie (round) chart
Measures of central tendency: Indices enabling the researcher to determine the typical or average score of a group of scores.
They are:
a) Mean - The arithmetic average of the sample. All values of a distribution of responses are summed and divided by the number of valid responses
b) Median - The middle value of a rank-ordered distribution. Exactly half of the responses are above and half are below the median value
c) Mode - The most common value in the set of responses to a question, i.e. the response most often given to a question
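The three measures can be computed with Python's standard `statistics` module. The response values below are made up for illustration, and treating `None` as a missing (invalid) response is an assumption of this sketch.

```python
import statistics

# Hypothetical answers to one rating question; None marks a missing response.
raw_responses = [3, 5, None, 4, 5, 2, None, 5, 4]

# Only valid responses enter the calculations.
valid = [r for r in raw_responses if r is not None]

mean = statistics.mean(valid)      # sum of values / number of valid responses
median = statistics.median(valid)  # middle value of the rank-ordered responses
mode = statistics.mode(valid)      # most frequently given response

print(f"n={len(valid)} mean={mean} median={median} mode={mode}")
```

Here the seven valid responses sum to 28, so the mean is 4; the rank-ordered values are 2, 3, 4, 4, 5, 5, 5, giving a median of 4 and a mode of 5.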
Measures of variability: Indices enabling the researcher to indicate how spread out a group of scores is
They are:
a) Range
b) Quartile deviation
c) Variance
d) Standard deviation
a) Range - The difference between the highest and lowest score in a distribution
b) Variance - A summary statistic indicating the degree of variability among participants for a given variable. It is the average squared deviation about the mean of a distribution of values
c) Standard deviation - The square root of the variance, providing an index of variability in the distribution of scores. It describes the average distance of the distribution values from the mean
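A short Python sketch of these measures, using made-up scores. It follows the definition given above, the average squared deviation about the mean, which corresponds to `statistics.pvariance` and `statistics.pstdev` (division by the number of scores). When estimating from a sample, `statistics.variance` and `statistics.stdev`, which divide by one less than the number of scores, are commonly used instead; which one applies here is an assumption left to the reader.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

value_range = max(scores) - min(scores)  # highest score minus lowest score
variance = statistics.pvariance(scores)  # average squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(f"range={value_range} variance={variance} std_dev={std_dev}")
```

For these scores the squared deviations from the mean sum to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.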