ONE SAMPLE T-TEST
OBJECTIVE
To conduct a one-sample t-test using SPSS.
PROBLEM
A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometers per liter was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic.
During an earlier trial 22 cars were test driven using the additive. The mean number of kilometers per liter was 10.5.
NULL HYPOTHESIS
There is no significant difference in engine efficiency between the present trial and the earlier trial.
ALTERNATE HYPOTHESIS
There is a significant difference in engine efficiency between the present trial and the earlier trial.
PROCEDURE
1. Select the Analyze menu.
2. Click on Compare Means and then One-Sample T Test… to open the One-Sample T Test dialogue box.
3. Select ‘withadd’ and move the variable into the Test Variable(s): box.
4. In the Test Value: box type the mean score (10.5).
5. Click Ok.
OUTPUT
ONE SAMPLE T-TEST
One-Sample Statistics
          N    Mean    Std. Deviation   Std. Error Mean
withadd   22   13.86   2.748            .586
One-Sample Test (Test Value = 10.5)
                                                            95% Confidence Interval of the Difference
          t       df   Sig. (2-tailed)   Mean Difference   Lower   Upper
withadd   5.741   21   .000              3.364             2.15    4.58
INFERENCE
1. Whether the sample mean differs from the hypothesized mean is judged from the t-value, the degrees of freedom (df) and the two-tailed significance.
2. If the value for two-tailed significance is less than .05 (p < .05), then the difference between the means is significant.
3. The cars in the present trial (mean 13.86 km per liter) appear to have greater engine efficiency than those in the earlier trial (mean 10.5 km per liter), t(21) = 5.74, p < .05.
RESULT
The output indicates that there is a significant difference in engine efficiency between the present trial and the earlier trial.
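The t-value and confidence interval in the One-Sample Test table can be checked by hand from the summary statistics. The following is a minimal sketch, not part of the SPSS procedure; the variable names are illustrative, the mean is taken as 10.5 + 3.364 to recover the digit lost in rounding, and the critical value 2.080 is assumed from a standard t table for 21 degrees of freedom.

```python
import math

# Summary statistics copied from the One-Sample Statistics table.
n = 22
sample_sd = 2.748
test_value = 10.5
mean_diff = 3.364                      # Mean Difference reported by SPSS
sample_mean = test_value + mean_diff   # 13.864, printed by SPSS as 13.86

# Standard error of the mean and the t statistic.
std_error = sample_sd / math.sqrt(n)
t_stat = mean_diff / std_error
df = n - 1

# Assumption: two-tailed 5% critical t for 21 df, read from a t table.
T_CRIT_21 = 2.080
ci_lower = mean_diff - T_CRIT_21 * std_error
ci_upper = mean_diff + T_CRIT_21 * std_error

print(f"t({df}) = {t_stat:.3f}")
print(f"95% CI of the difference: {ci_lower:.2f} to {ci_upper:.2f}")
```

The printed values should agree with the SPSS output up to rounding of the inputs.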
INDEPENDENT SAMPLE T-TEST
OBJECTIVE:
To test whether two groups of people differ in opinion, using an independent-samples t-test
in SPSS.
PROBLEM:
As marketers of a brand of jeans, we want to find out whether a set of customers in Delhi and a set
of customers in Mumbai think of our brand in the same way or not. A small survey was
conducted in both cities and the ratings were obtained on an interval scale of 1-7. We want
to find out whether the two sets of ratings are significantly different.
S.NO.  RATING  CITY     S.NO.  RATING  CITY
1      2       1        16     3       2
2      3       1        17     4       2
3      3       1        18     5       2
4      4       1        19     6       2
5      5       1        20     5       2
6      4       1        21     5       2
7      4       1        22     5       2
8      5       1        23     4       2
9      3       1        24     3       2
10     4       1        25     3       2
11     5       1        26     5       2
12     4       1        27     6       2
13     3       1        28     6       2
14     3       1        29     6       2
15     4       1        30     5       2
NULL HYPOTHESIS
There is no significant difference between the ratings given by the customers in Mumbai and
Delhi at the 95% confidence level.
ALTERNATE HYPOTHESIS
There is a significant difference between the ratings given by the customers in Mumbai and
Delhi at the 95% confidence level.
PROCEDURE:
1. The variables are defined in the Variable View of the SPSS Data Editor: city is a
categorical variable with nominal measure and the respondent's rating is a scale variable.
2. In the Values cell for city, enter the value labels 1 = Mumbai and 2 = Delhi.
3. The given data is entered in the Data View.
4. Choose Analyze from the main menu.
5. Then choose Compare Means > Independent-Samples T Test.
6. In the Independent-Samples T Test dialogue box, the rating given by the respondents is
entered as the test variable and the city they belong to is entered as the grouping variable.
7. Click Define Groups and enter the values that identify the two groups (1 and 2).
8. The output is generated and analyzed, and the inference is drawn.
OUTPUT:
Group Statistics
                      Respondent's city   N    Mean   Std. Deviation   Std. Error Mean
Respondent's rating   Mumbai              15   3.73   .884             .228
                      Delhi               15   4.73   1.100            .284
Independent Samples Test
Levene's Test for Equality of Variances (Ratings)
F      Sig.
.727   .401
t-test for Equality of Means (Ratings)
                                                                                                       95% Confidence Interval of the Difference
                              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Equal variances assumed       -2.745   28       .010              -1.000            .364                    -1.746   -.254
Equal variances not assumed   -2.745   26.759   .011              -1.000            .364                    -1.748   -.252
INFERENCE:
1. The independent-samples t-test procedure compares the means of the two groups (Mumbai
and Delhi).
2. The mean values for the two groups are displayed in the Group Statistics table; their
difference is 3.73 - 4.73 = -1.00.
3. One version of the t-test assumes that the variances of the two groups are equal. Levene's
test checks this assumption.
4. The significance value for Levene's test is high (0.401 is greater than 0.10), so equal
variances are assumed for the two groups and the second row of the t-test table
(equal variances not assumed) is ignored.
5. The significance value for the t-test (0.010) is less than 0.05, and the confidence
interval for the mean difference (-1.746 to -.254) does not contain zero.
6. So the null hypothesis is rejected and the alternate hypothesis is accepted. This
indicates that there is a significant difference between the two group means.
RESULT:
There is a significant difference in the ratings on the brand, given by the respondents in the
cities of Mumbai and Delhi.
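The equal-variances t statistic can be reproduced from the survey data in the problem statement. This is a minimal sketch using only the Python standard library rather than SPSS; the list names are illustrative, and the city coding (1 = Mumbai, 2 = Delhi) follows the procedure above.

```python
import math
from statistics import mean, variance

# Ratings from the data table, split by city code.
mumbai = [2, 3, 3, 4, 5, 4, 4, 5, 3, 4, 5, 4, 3, 3, 4]   # city = 1
delhi = [3, 4, 5, 6, 5, 5, 5, 4, 3, 3, 5, 6, 6, 6, 5]    # city = 2

n1, n2 = len(mumbai), len(delhi)
mean_diff = mean(mumbai) - mean(delhi)

# Pooled variance: the two sample variances weighted by their degrees of freedom.
pooled_var = ((n1 - 1) * variance(mumbai) + (n2 - 1) * variance(delhi)) / (n1 + n2 - 2)

# Standard error of the difference and the t statistic (equal variances assumed).
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = mean_diff / std_error
df = n1 + n2 - 2

print(f"mean difference = {mean_diff:.3f}, SE = {std_error:.3f}")
print(f"t({df}) = {t_stat:.3f}")
```

The result can be compared with the "Equal variances assumed" row of the Independent Samples Test table.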
PAIRED SAMPLE T-TEST
OBJECTIVE
To conduct a Paired Sample T-Test using SPSS.
PROBLEM
A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometers per liter was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic.
Does engine efficiency improve when the additive is used? This is a repeated-measures t-test design.
NULL HYPOTHESIS
There is no significant difference between engine efficiency with and without the additive.
ALTERNATE HYPOTHESIS
There is a significant difference between engine efficiency with and without the additive.
PROCEDURE
1. Select the Analyze menu.
2. Click on Compare Means and then Paired-Samples T Test… to open the Paired-Samples T Test dialogue box.
3. Select the variables ‘without’ and ‘withadd’ and move them into the Paired Variables: box.
4. Click Ok.
OUTPUT
PAIRED SAMPLE T-TEST
Paired Samples Statistics
                   Mean    N    Std. Deviation   Std. Error Mean
Pair 1   without   8.50    22   3.335            .711
         withadd   13.86   22   2.748            .586
Paired Samples Correlations
                             N    Correlation   Sig.
Pair 1   without & withadd   22   .559          .007
Paired Samples Test
                             Paired Differences
                                                                 95% Confidence Interval of the Difference
                             Mean     Std. Deviation   Std. Error Mean   Lower    Upper    t        df   Sig. (2-tailed)
Pair 1   without - withadd   -5.364   2.904            .619              -6.651   -4.076   -8.663   21   .000
INFERENCE
1. The test determines whether the two sets of scores come from the same or different populations.
2. The significance is determined by looking at the probability level (p) given under the heading ‘Sig. (2-tailed)’.
3. If the probability value is less than the specified alpha value, then the observed t-value is significant.
4. The 95 percent confidence interval is constructed so that, over repeated samples, 95 percent of such intervals contain the true difference between the population means.
5. The additive significantly improves the number of kilometers per liter, t(21) = -8.66, p < .05. The negative sign arises because the difference is taken as without - withadd (mean 8.50 without the additive, 13.86 with it).
RESULT
The output shows that there is a significant difference between engine efficiency with and without the additive.
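A paired-samples t-test is a one-sample t-test on the per-car differences, so the reported t can be checked from the mean and standard deviation of those differences. The sketch below is illustrative only; it starts from the rounded Paired Differences values printed by SPSS, because the raw per-car readings are not listed in this exercise.

```python
import math

# Paired Differences (without - withadd) copied from the SPSS output.
n = 22
mean_diff = -5.364
sd_diff = 2.904

# Standard error of the mean difference and the paired t statistic.
std_error = sd_diff / math.sqrt(n)
t_stat = mean_diff / std_error
df = n - 1

print(f"SE = {std_error:.3f}, t({df}) = {t_stat:.3f}")
```

Because the inputs are rounded to three decimals, the result should match the SPSS value only to about two decimal places.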
ONE WAY ANOVA
OBJECTIVE:
To test, before the launch of the campaign, whether the target population prefers one version of the ad copy over the others.
PROBLEM:
There are three different versions of advertising copy created by an advertising agency for a
campaign. Let us call these versions of copy as adcopy 1, 2 and 3. A sample of 18
respondents is selected from the target population in the nearby areas of the city. At random,
these 18 respondents are assigned to the 3 versions of ad copy. Each version of ad copy is
thus shown to six of the respondents. The respondents are asked to rate their liking for the ad
copy shown to them on a scale of 1 to 10. (1 = Not liked at all, 10 = Liked a lot, and other
values in between these two).
S. No.  Ad copy  Rating
1       1        6.00
2       1        7.00
3       1        5.00
4       1        8.00
5       1        8.00
6       1        8.00
7       2        4.00
8       2        4.00
9       2        5.00
10      2        7.00
11      2        7.00
12      2        6.00
13      3        5.00
14      3        5.00
15      3        4.00
16      3        7.00
17      3        8.00
18      3        7.00
Null Hypothesis
There is no difference in the ratings between the three versions of the ad copy at 95%
confidence level.
Alternative Hypothesis
There is a significant difference between the three versions of the ad copy at 95%
confidence level.
PROCEDURE:
1. The variables are defined in the Variable View and the given data is then entered in the Data View.
2. Choose Analyze > Compare Means > One-Way ANOVA.
3. In the One-Way ANOVA dialog box, move ‘ratings’ into the Dependent List and ad copy
into Factor.
4. Select other options as required.
5. The output is generated and analyzed, and the inference is drawn.
OUTPUT:
Descriptives (Ratings)
                                                         95% Confidence Interval for Mean
            N    Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Ad copy 1   6    7.0000   1.26491          .51640       5.6726        8.3274        5.00      8.00
Ad copy 2   6    5.5000   1.37840          .56273       4.0535        6.9465        4.00      7.00
Ad copy 3   6    6.0000   1.54919          .63246       4.3742        7.6258        4.00      8.00
Total       18   6.1667   1.46528          .34537       5.4380        6.8953        4.00      8.00
Test of Homogeneity of Variances (Ratings)
Levene Statistic   df1   df2   Sig.
.536               2     15    .596
ANOVA (Ratings)
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   7.000            2    3.500         1.780   .203
Within Groups    29.500           15   1.967
Total            36.500           17
INFERENCE:
1. The descriptives of the ratings are obtained in terms of mean and standard deviation.
2. The mean ratings of the three versions of ad copy are displayed (7.00, 5.50 and 6.00).
3. The significance value for Levene's test of homogeneity of variances is high
(0.596 > 0.05). Hence the variances for the three versions can be treated as equal and the
ANOVA assumption is justified.
4. In the ANOVA table, Sig. represents the significance level of the F-test. Since
F(2, 15) = 1.780 with Sig. = 0.203 > 0.05, the null hypothesis is not rejected and the
alternate hypothesis is not accepted.
RESULT:
There is no significant difference in the preferences over the three versions of the ad
copy by a target population before the launch of its campaign.
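The sums of squares and the F ratio in the ANOVA table can be recomputed from the 18 ratings. The following is a minimal standard-library sketch, not the SPSS routine; the dictionary keys are illustrative names for the three ad copies.

```python
from statistics import mean

# Ratings from the data table, grouped by ad copy version.
ratings = {
    "adcopy1": [6, 7, 5, 8, 8, 8],
    "adcopy2": [4, 4, 5, 7, 7, 6],
    "adcopy3": [5, 5, 4, 7, 8, 7],
}

all_ratings = [r for group in ratings.values() for r in group]
grand_mean = mean(all_ratings)

# Between-groups: squared distance of each group mean from the grand mean, weighted by group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in ratings.values())
# Within-groups: squared distance of each rating from its own group mean.
ss_within = sum((r - mean(g)) ** 2 for g in ratings.values() for r in g)

df_between = len(ratings) - 1
df_within = len(all_ratings) - len(ratings)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The F ratio is the between-groups mean square divided by the within-groups mean square, which is how the F column of the ANOVA table is obtained.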
CORRELATION
OBJECTIVE:
To find the interrelationship between the dependent and the independent variables.
PROBLEM:
A manufacturer and the marketer of electric motors would like to build a regression
model consisting of 5 or 6 independent variables to sales. Past data has been collected for 15
sales territories, on sales and 6 different independent variables. Build a regression model and
recommend whether or not it should be used by the company.
Dependent Variable:
Y Sales (in Rs. Lakhs) in the territory.
Independent Variables:
X1 Market potential of the territory.
X2 No. of dealers of the Company in the territory.
X3 No. of sales person in the territory.
X4 Index of the competitor activity in the territory on a 5 point scale.
X5 No. of service people in the territory.
X6 No. of existing customers in the territory.
S. No.  Sales (Y)  Potential (X1)  Dealers (X2)  People (X3)  Competition (X4)  Service (X5)  Customers (X6)
1       5          25              1             6            5                 2             20
2       60         150             12            30           4                 5             50
3       20         45              5             15           3                 2             25
4       11         30              2             10           3                 2             20
5       45         75              12            20           2                 4             30
6       6          10              3             8            2                 3             16
7       15         29              5             18           4                 5             30
8       22         43              7             16           3                 6             40
9       29         70              4             15           2                 5             39
10      3          40              1             6            5                 2             5
11      16         40              4             11           4                 2             17
12      8          25              2             9            3                 3             10
13      18         32              7             14           3                 4             31
14      23         73              10            10           4                 3             43
15      81         150             15            35           4                 7             70
Null Hypothesis
There is no significant relationship between the independent and the dependent
variables at the 95% confidence level.
Alternative Hypothesis
There is a significant relationship between the independent and dependent variables at
the 95% confidence level.
PROCEDURE:
Let the estimating equation be Y= a1X1+a2X2+a3X3+a4X4+a5X5
1. The variables are defined in the variable view of the SPSS data editor.
2. Enter the data in the data view.
3. Choose Analyze > Correlate > Bivariate from the main menu.
4. In the bivariate correlations dialogue box select all the dependent and independent
variables.
5. Select Pearson’s correlation coefficient with test of significance being one tailed.
6. Also include the statistics for mean and standard deviation.
7. The output chart is generated, analyzed and inference obtained
OUTPUT:
Descriptive Statistics
                                                   Mean    Std. Deviation   N
Sales in Rs. lakhs in the territory                24.13   21.980           15
Market potential in the territory (in Rs. lakhs)   55.80   42.543           15
No. of dealers of the company in the territory     6.00    4.408            15
No. of sales people in the territory               14.87   8.340            15
Index of competitor activity in the territory      3.40    .986             15
No. of service people in the territory             3.67    1.633            15
No. of existing customers in the territory         29.73   16.829           15
Correlations
The columns follow the same order as the rows: Sales, Market potential, Dealers, Sales people, Competitor activity, Service people, Existing customers. N = 15 for every pair.
Pearson Correlation
                      Sales      Potential   Dealers    Sales people   Competitor   Service    Customers
Sales                 1          .945(**)    .908(**)   .953(**)       -.046        .726(**)   .878(**)
Market potential      .945(**)   1           .837(**)   .877(**)       .140         .613(**)   .831(**)
Dealers               .908(**)   .837(**)    1          .855(**)       -.082        .685(**)   .860(**)
Sales people          .953(**)   .877(**)    .855(**)   1              -.036        .794(**)   .854(**)
Competitor activity   -.046      .140        -.082      -.036          1            -.178      -.015
Service people        .726(**)   .613(**)    .685(**)   .794(**)       -.178        1          .818(**)
Existing customers    .878(**)   .831(**)    .860(**)   .854(**)       -.015        .818(**)   1
Sig. (1-tailed)
                      Sales   Potential   Dealers   Sales people   Competitor   Service   Customers
Sales                         .000        .000      .000           .436         .001      .000
Market potential      .000                .000      .000           .309         .008      .000
Dealers               .000    .000                  .000           .385         .002      .000
Sales people          .000    .000        .000                     .449         .000      .000
Competitor activity   .436    .309        .385      .449                        .263      .479
Service people        .001    .008        .002      .000           .263                   .000
Existing customers    .000    .000        .000      .000           .479         .000
INFERENCE:
The correlations table shows Pearson correlation coefficients, significance values, and
the number of cases with non-missing values.
1. The Pearson correlation coefficient measures the linear association between two
variables; its value ranges from -1 to 1.
2. The sign of the correlation coefficient indicates the direction of the relationship.
Hence sales has a negative relationship with the index of competitor activity and a
positive relationship with market potential, number of dealers, number of sales
people, number of service people and number of existing customers.
3. The absolute value of the correlation coefficient indicates the strength, with larger
absolute values indicating stronger relationships.
4. The significance values for market potential (0.000), number of dealers (0.000),
number of sales people (0.000), number of service people (0.001) and number of
existing customers (0.000) are all less than 0.05. So the null hypothesis is rejected and
the alternate hypothesis is accepted for these variables: their correlations with sales
are significant and they are linearly related to sales.
5. The significance value for the index of competitor activity (0.436) is greater than
0.05, so its correlation with sales (-.046) is not significant and the variable is not
linearly related to sales.
6. This indicates that the manufacturer need not consider the index of competitor
activity, since it shows no significant linear relationship with sales in this data.
RESULT:
Hence sales (the dependent variable) are significantly and positively correlated with the
market potential of the territory, number of dealers of the company in the territory, number
of sales people in the territory, number of service people in the territory and number of
existing customers in the territory (independent variables). The index of competitor activity
in the territory and sales are negatively correlated, but the correlation is not significant.
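Any entry of the correlation matrix can be recomputed from the territory data. The sketch below uses only the Python standard library; the helper name pearson_r and the list names are illustrative, and only three of the seven columns are included to keep it short.

```python
import math

# Columns Y, X1 and X4 from the data table (15 sales territories).
sales = [5, 60, 20, 11, 45, 6, 15, 22, 29, 3, 16, 8, 18, 23, 81]
potential = [25, 150, 45, 30, 75, 10, 29, 43, 70, 40, 40, 25, 32, 73, 150]
competition = [5, 4, 3, 3, 2, 2, 4, 3, 2, 5, 4, 3, 3, 4, 4]


def pearson_r(x, y):
    """Pearson correlation: co-deviation of x and y scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_potential = pearson_r(sales, potential)
r_competition = pearson_r(sales, competition)
print(f"sales vs market potential:    r = {r_potential:.3f}")
print(f"sales vs competitor activity: r = {r_competition:.3f}")
```

The first value illustrates a strong positive relationship and the second a correlation close to zero, which is why only the first is flagged as significant in the SPSS output.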
FACTOR ANALYSIS
OBJECTIVE:
To find a smaller number of underlying factors that are linear combinations of the original 10 variables.
PROBLEM:
A two-wheeler manufacturer is interested in determining which variables potential
customers think about when they consider his product. Twenty two-wheeler owners were
surveyed by the manufacturer. They were asked to indicate, on a 7-point scale from
1 (Completely agree) to 7 (Completely disagree), their agreement or disagreement with a set
of 10 statements relating to their perceptions and some attributes of the two-wheeler. Use
factor analysis to find underlying factors which are fewer in number but are linear
combinations of the original 10 variables.
TEN STATEMENTS:
1. I use a two-wheeler because it is affordable.
2. It gives me a sense of freedom to own a two-wheeler.
3. Low maintenance cost makes a two-wheeler very economical in the long run.
4. A two-wheeler is essentially a man’s vehicle.
5. I feel very powerful when I am on my two-wheeler.
6. Some of my friends who don’t have their own vehicle are jealous of me.
7. I feel good whenever I see the ad of my two-wheeler.
8. My vehicle gives me a comfortable ride.
9. I think two-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a two-wheeler.
S.No.  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10
1      1   4   1   6   5   6   5   2   3   2
2      2   3   2   4   3   3   3   5   5   2
3      2   2   2   1   2   1   1   7   6   2
4      5   1   4   2   2   2   2   3   2   3
5      1   2   2   5   4   4   4   1   1   2
6      3   2   3   3   3   3   3   6   5   3
7      2   2   5   1   2   1   2   4   4   5
8      4   4   3   4   4   5   3   2   3   3
9      2   3   2   6   5   6   5   1   4   1
10     1   4   2   2   1   2   1   4   4   1
11     1   5   1   3   2   3   2   2   2   1
12     1   6   1   1   1   1   1   1   2   2
13     3   1   4   4   4   3   3   6   5   3
14     2   2   2   2   2   2   2   1   3   2
15     2   5   1   3   2   3   2   2   1   6
16     5   6   3   2   1   3   2   5   5   4
17     1   4   2   2   1   2   1   1   1   3
18     2   3   1   1   2   2   2   3   2   2
19     3   3   2   3   4   3   4   3   3   3
20     4   3   2   7   6   6   6   2   3   6
PROCEDURE:
1. The variables are defined in the variable view of the SPSS data editor.
2. Enter the given data in the data view.
3. Choose Analyze > Data reduction > Factor analysis from the main menu and enter the
variables.
4. In the factor analysis dialogue box select the analysis variables and check the options
as required.
5. The output chart is generated, analyzed and inference obtained.
OUTPUT:
Descriptive Statistics
                                     Mean   Std. Deviation   Analysis N
It is affordable                     2.35   1.309            20
Gives sense of freedom               3.25   1.482            20
Economical                           2.25   1.118            20
Man's vehicle                        3.10   1.804            20
Feel powerful                        2.80   1.508            20
Friends would be jealous             3.05   1.605            20
Feel good to see ad of my vehicle    2.70   1.455            20
Comfortable driving                  3.05   1.905            20
Safe way to travel                   3.20   1.508            20
3 people should be legally allowed   2.80   1.473            20
Communalities
                                     Initial   Extraction
It is affordable                     1.000     .722
Gives sense of freedom               1.000     .452
Economical                           1.000     .731
Man's vehicle                        1.000     .945
Feel powerful                        1.000     .950
Friends would be jealous             1.000     .914
Feel good to see ad of my vehicle    1.000     .955
Comfortable driving                  1.000     .799
Safe way to travel                   1.000     .777
3 people should be legally allowed   1.000     .789
Extraction Method: Principal Component Analysis.
Component Matrix (a)
                                     Component
                                     1       2       3
It is affordable                     .176    .670    .493
Gives sense of freedom               -.136   -.608   .254
Economical                           -.107   .820    .218
Man's vehicle                        .966    -.036   -.097
Feel powerful                        .951    .166    -.136
Friends would be jealous             .952    -.084   -.025
Feel good to see ad of my vehicle    .971    .096    -.046
Comfortable driving                  -.322   .775    -.308
Safe way to travel                   -.069   .735    -.482
3 people should be legally allowed   .161    .319    .814
Extraction Method: Principal Component Analysis.
a. 3 components extracted.
Component Score Coefficient Matrix
                                     Component
                                     1       2       3
It is affordable                     .004    .023    .434
Gives sense of freedom               -.063   -.278   .043
Economical                           -.041   .176    .283
Man's vehicle                        .256    .003    -.042
Feel powerful                        .257    .081    -.030
Friends would be jealous             .245    -.038   -.007
Feel good to see ad of my vehicle    .253    .026    .014
Comfortable driving                  -.047   .360    -.057
Safe way to travel                   .033    .406    -.166
3 people should be legally allowed   -.032   -.203   .568
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
Component Scores.
Component Score Covariance Matrix
Component   1       2       3
1           1.000   .000    .000
2           .000    1.000   .000
3           .000    .000    1.000
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
Component Scores.
INFERENCE:
Factor analysis is primarily used for data reduction or structure detection.
1. Communalities indicate the amount of variance in each variable that is accounted for
by the extracted components.
2. The Component Matrix table reports the factor loadings for each variable on the
unrotated components or factors.
3. The Rotated Component Matrix table (called the Pattern Matrix for oblique rotations)
reports the factor loadings for each variable on the components or factors after
rotation.
4. Group the variables which have high values on the same component.
5. Here man’s vehicle, feel powerful, friends would be jealous and feel good to see ad of
my vehicle have high values (greater than 0.5) on component 1. So we can group them
into component 1.
6. Similarly economical, comfortable driving and safe way to travel have high values on
component 2 and hence are grouped in component 2.
7. Finally, it is affordable and three people should be legally allowed are grouped into
component 3.
8. Gives sense of freedom has no high positive value on any of the three components (in the
Component Matrix its values are -.136, -.608 and .254), so this variable is eliminated.
RESULT:
The ten variables are reduced to three components.
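The Extraction column of the Communalities table is the sum of a variable's squared loadings across the extracted components. The sketch below checks this for three rows of the Component Matrix; it is an illustration only, the dictionary and function names are hypothetical, and small differences from the SPSS values are expected because the printed loadings are rounded to three decimals.

```python
# Loadings on components 1, 2 and 3, copied from the Component Matrix.
loadings = {
    "It is affordable": (0.176, 0.670, 0.493),
    "Gives sense of freedom": (-0.136, -0.608, 0.254),
    "Man's vehicle": (0.966, -0.036, -0.097),
}


def communality(row):
    """Share of a variable's variance explained by the extracted components."""
    return sum(loading ** 2 for loading in row)


communalities = {name: communality(row) for name, row in loadings.items()}
for name, value in communalities.items():
    print(f"{name:<24} {value:.3f}")
```

A low communality, as for "Gives sense of freedom", means the three components explain comparatively little of that variable's variance.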
DISCRIMINANT ANALYSIS
OBJECTIVE:
To conduct Discriminant Analysis for the given data using SPSS software
PROBLEM:
Conduct a discriminant analysis that predicts membership of two groups, given by the
dependent variable category, and create the discriminant equation from 17
independent variables, selected by a stepwise procedure based on minimization of Wilks'
Lambda at each step.
NULL HYPOTHESIS
There is no discrimination in membership of two groups.
ALTERNATE HYPOTHESIS
There is discrimination in membership of two groups.
PROCEDURE:
1. Select Analyze > Classify > Discriminant from the menu.
2. Select the grouping variable (coded 1, 2).
3. Define the range: Minimum 1 and Maximum 2.
4. Select all the variables as independent variables and select ‘Use stepwise method’.
5. Click Statistics, check Means, Univariate ANOVAs, Box’s M and Unstandardized,
then click Continue.
6. Click Method, select Wilks' lambda and enter the F values (Entry: 1.15 and Removal: 1),
then click Continue.
7. Click Classify, check All groups equal, Casewise results, Summary table and
Combined-groups, then click Continue.
8. Click OK.
OUTPUT:
Analysis Case Processing Summary
Unweighted Cases                                              N    Percent
Valid                                                         50   100.0
Excluded   Missing or out-of-range group codes                0    .0
           At least one missing discriminating variable       0    .0
           Both missing or out-of-range group codes
           and at least one missing discriminating variable   0    .0
           Total                                              0    .0
Total                                                         50   100.0
Box's Test of Equality of Covariance Matrices
Log Determinants
1=COMPLETED PHD, 2=DID NOT COMPLETE PHD   Rank   Log Determinant
FINISH                                    6      3.031
NOT FINISH                                6      2.918
Pooled within-groups                      6      3.800
The ranks and natural logarithms of determinants printed are those of the group covariance matrices.
Test Results
Box's M       39.633
F   Approx.   1.633
    df1       21
    df2       8474.108
    Sig.      .034
Tests null hypothesis of equal population covariance matrices.
Tests of Equality of Group Means
                                                    Wilks' Lambda   F        df1   df2   Sig.
OVERALL COLLEGE GPA                                 .795            12.356   1     48    .001
MAJOR AREA GPA                                      .951            2.493    1     48    .121
GRE SCORE ON SPECIALITY EXAM                        .998            .113     1     48    .738
GRE SCORE ON QUANTITATIVE                           .628            28.393   1     48    .000
GRE SCORE ON VERBAL                                 .974            1.283    1     48    .263
FIRST LETTER OF RECOMMENDATION                      .650            25.889   1     48    .000
SECOND LETTER OF RECOMMENDATION                     .756            15.476   1     48    .000
THIRD LETTER OF RECOMMENDATION                      .534            41.969   1     48    .000
STUDENTS MOTIVATION                                 .679            22.722   1     48    .000
STUDENTS EMOTIONAL STABILITY                        1.000           .007     1     48    .934
FINANCIAL/PERSONAL RESOURCES TO COMPLETE            .993            .337     1     48    .564
AGE IN YEARS AT ENTRY                               .768            14.526   1     48    .000
ABILITY TO INTERACT EASILY                          .904            5.079    1     48    .029
RATING OF STUDENT HOSTILITY                         .787            13.017   1     48    .001
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .972            1.378    1     48    .246
Stepwise Statistics:
Variables Entered/Removed (a,b,c,d)
                                                           Wilks' Lambda                      Exact F
Step   Entered                                             Statistic   df1   df2   df3      Statistic   df1   df2      Sig.
1      THIRD LETTER OF RECOMMENDATION                      .534        1     1     48.000   41.969      1     48.000   .000
2      STUDENTS MOTIVATION                                 .451        2     1     48.000   28.638      2     47.000   .000
3      FIRST LETTER OF RECOMMENDATION                      .415        3     1     48.000   21.602      3     46.000   .000
4      AGE IN YEARS AT ENTRY                               .391        4     1     48.000   17.495      4     45.000   .000
5      MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .367        5     1     48.000   15.188      5     44.000   .000
6      FINANCIAL/PERSONAL RESOURCES TO COMPLETE            .348        6     1     48.000   13.446      6     43.000   .000
At each step, the variable that minimizes the overall Wilks' Lambda is entered.
a. Maximum number of steps is 30.
b. Minimum partial F to enter is 1.15.
c. Maximum partial F to remove is 1.
d. F level, tolerance, or VIN insufficient for further computation.
Wilks' Lambda
                                                        Exact F
Step   Number of Variables   Lambda   df1   df2   df3   Statistic   df1   df2      Sig.
1      1                     .534     1     1     48    41.969      1     48.000   .000
2      2                     .451     2     1     48    28.638      2     47.000   .000
3      3                     .415     3     1     48    21.602      3     46.000   .000
4      4                     .391     4     1     48    17.495      4     45.000   .000
5      5                     .367     5     1     48    15.188      5     44.000   .000
6      6                     .348     6     1     48    13.446      6     43.000   .000
Variables in the Analysis
Step   Variable                                            Tolerance   F to Remove   Wilks' Lambda
1      THIRD LETTER OF RECOMMENDATION                      1.000       41.969
2      THIRD LETTER OF RECOMMENDATION                      .987        23.774        .679
       STUDENTS MOTIVATION                                 .987        8.633         .534
3      THIRD LETTER OF RECOMMENDATION                      .957        15.183        .552
       STUDENTS MOTIVATION                                 .934        4.617         .457
       FIRST LETTER OF RECOMMENDATION                      .909        3.943         .451
4      THIRD LETTER OF RECOMMENDATION                      .955        12.842        .503
       STUDENTS MOTIVATION                                 .913        3.122         .419
       FIRST LETTER OF RECOMMENDATION                      .908        3.418         .421
       AGE IN YEARS AT ENTRY                               .969        2.733         .415
5      THIRD LETTER OF RECOMMENDATION                      .935        13.679        .481
       STUDENTS MOTIVATION                                 .898        3.642         .397
       FIRST LETTER OF RECOMMENDATION                      .897        2.417         .387
       AGE IN YEARS AT ENTRY                               .943        3.452         .396
       MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .926        2.941         .391
6      THIRD LETTER OF RECOMMENDATION                      .935        12.445        .448
       STUDENTS MOTIVATION                                 .882        4.182         .381
       FIRST LETTER OF RECOMMENDATION                      .892        2.580         .369
       AGE IN YEARS AT ENTRY                               .885        4.599         .385
       MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .926        2.755         .370
       FINANCIAL/PERSONAL RESOURCES TO COMPLETE            .897        2.372         .367
Standardized Canonical Discriminant Function Coefficients
                                                    Function 1
FIRST LETTER OF RECOMMENDATION                      .312
THIRD LETTER OF RECOMMENDATION                      .607
STUDENTS MOTIVATION                                 .393
FINANCIAL/PERSONAL RESOURCES TO COMPLETE            .299
AGE IN YEARS AT ENTRY                               .409
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .316
Summary of Canonical Discriminant Functions
Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          1.876(a)     100.0           100.0          .808
a. First 1 canonical discriminant functions were used in the analysis.
Wilks' Lambda
Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .348            47.541       6    .000
Structure Matrix
                                                    Function 1
THIRD LETTER OF RECOMMENDATION                      .683
GRE SCORE ON QUANTITATIVE (a)                       .547
FIRST LETTER OF RECOMMENDATION                      .536
STUDENTS MOTIVATION                                 .502
AGE IN YEARS AT ENTRY                               .402
RATING OF STUDENT HOSTILITY (a)                     -.335
SECOND LETTER OF RECOMMENDATION (a)                 .278
OVERALL COLLEGE GPA (a)                             .237
ABILITY TO INTERACT EASILY (a)                      .178
MAJOR AREA GPA (a)                                  .129
GRE SCORE ON SPECIALITY EXAM (a)                    -.126
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .124
GRE SCORE ON VERBAL (a)                             -.068
FINANCIAL/PERSONAL RESOURCES TO COMPLETE            .061
STUDENTS EMOTIONAL STABILITY (a)                    -.027
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
a. This variable not used in the analysis.
Canonical Discriminant Function Coefficients
                                                    Function 1
FIRST LETTER OF RECOMMENDATION                      .288
THIRD LETTER OF RECOMMENDATION                      .617
STUDENTS MOTIVATION                                 .490
FINANCIAL/PERSONAL RESOURCES TO COMPLETE            .175
AGE IN YEARS AT ENTRY                               .091
MEAN RATING OF SELECTORS IMPRESSION OF APPLICANT    .262
(Constant)                                          -15.564
Unstandardized coefficients
Functions at Group Centroids
1=COMPLETED PHD, 2=DID NOT COMPLETE PHD   Function 1
FINISH                                    1.342
NOT FINISH                                -1.342
Unstandardized canonical discriminant functions evaluated at group means
Casewise Statistics (Original)
Columns: Case Number; Actual Group; Predicted Group; Highest Group: P(D>d | G=g) p and df, P(G=g | D=d), Squared Mahalanobis Distance to Centroid; Second Highest Group: Group, P(G=g | D=d), Squared Mahalanobis Distance to Centroid; Discriminant Scores: Function 1.
Case   Actual   Predicted   p      df   P(G=g|D=d)   Sq. Dist.   2nd Group   P(G=g|D=d)   Sq. Dist.   Function 1
1      1        1           .422   1    .997         .645        2           .003         12.160      2.145
2      1        1           .698   1    .990         .150        2           .010         9.436       1.730
3      1        1           .934   1    .967         .007        2           .033         6.766       1.259
4      1        1           .699   1    .929         .149        2           .071         5.281       .956
5      2        1**         .755   1    .941         .097        2           .059         5.629       1.031
6      1        1           .178   1    .999         1.812       2           .001         16.244      2.688
7      1        1           .947   1    .968         .004        2           .032         6.851       1.275
8      1        1           .765   1    .988         .090        2           .012         8.901       1.641
9      1        1           .439   1    .997         .599        2           .003         11.958      2.116
10     1        1           .795   1    .987         .068        2           .013         8.669       1.602
11     1        1           .345   1    .998         .891        2           .002         13.165      2.286
12     1        1           .620   1    .906         .246        2           .094         4.788       .846
13     1        1           .668   1    .991         .184        2           .009         9.693       1.771
14     2        1**         .381   1    .777         .769        2           .223         3.266       .465
15     1        1           .573   1    .994         .318        2           .006         10.551      1.906
16     1        1           .239   1    .609         1.386       2           .391         2.271       .165
17     1        1           .804   1    .986         .062        2           .014         8.601       1.591
18     1        1           .065   1    1.000        3.393       2           .000         20.487      3.184
19     1        1           .231   1    .595         1.436       2           .405         2.207       .144
20     1        1           .970   1    .976         .001        2           .024         7.408       1.380
21     2        2           .708   1    .931         .141        1           .069         5.332       -.967
22     1        1           .677   1    .991         .173        2           .009         9.611       1.758
23     1        1           .583   1    .994         .301        2           .006         10.450      1.891
24     1        1           .290   1    .998         1.118       2           .002         13.998      2.399
25     1        1           .980   1    .975         .001        2           .025         7.340       1.367
26     2        2           .644   1    .992         .214        1           .008         9.902       -1.805
27     2        2           .854   1    .957         .034        1           .043         6.251       -1.158
28     2        2           .598   1    .993         .278        1           .007         10.316      -1.870
29     2        2           .580   1    .893         .306        1           .107         4.541       -.789
30     1        2**         .277   1    .664         1.184       1           .336         2.547       -.254
31     2        2           .903   1    .964         .015        1           .036         6.564       -1.220
32     2        2           .555   1    .883         .348        1           .117         4.385       -.752
33     2        2           .322   1    .998         .982        1           .002         13.508      -2.333
34     2        2           .999   1    .973         .000        1           .027         7.195       -1.340
35     2        2           .432   1    .997         .617        1           .003         12.037      -2.127
36     2        2           .401   1    .997         .706        1           .003         12.422      -2.182
37     2        2           .345   1    .744         .893        1           .256         3.025       -.397
38     2        2           .649   1    .992         .207        1           .008         9.852       -1.797
39     1        2**         .800   1    .949         .064        1           .051         5.909       -1.089
40     2        2           .701   1    .990         .148        1           .010         9.414       -1.726
41     2        2           .326   1    .724         .966        1           .276         2.894       -.359
42     2        2           .152   1    .999         2.054       1           .001         16.953      -2.775
43     2        2           .986   1    .972         .000        1           .028         7.108       -1.324
44     2        2           .202   1    .999         1.629       1           .001         15.687      -2.619
45     2        2           .973   1    .976         .001        1           .024         7.385       -1.375
46     2        2           .709   1    .990         .140        1           .010         9.350       -1.716
47     1        2**         .892   1    .962         .018        1           .038         6.495       -1.206
48     2        2           .426   1    .812         .634        1           .188         3.565       -.546
49     2        2           .412   1    .997         .673        1           .003         12.280      -2.162
50     2        2           .715   1    .990         .134        1           .010         9.301       -1.708
**. Misclassified case
Classification Results (a)
                                                               Predicted Group Membership
           1=COMPLETED PHD, 2=DID NOT COMPLETE PHD   FINISH   NOT FINISH   Total
Original   Count   FINISH                            22       3            25
                   NOT FINISH                        2        23           25
           %       FINISH                            88.0     12.0         100.0
                   NOT FINISH                        8.0      92.0         100.0
a. 90.0% of original grouped cases correctly classified.
INFERENCE:
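1. The F values and their significance in the Tests of Equality of Group Means table identify
the variables on which the two groups differ significantly.
2. The canonical correlation coefficient is .808, which shows a strong correlation between
the discriminant function and group membership.
3. Wilks' Lambda for the function is .348 (chi-square = 47.541, df = 6) and its significance
value is less than 0.05.
4. The function classifies 90.0% of the original grouped cases correctly.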
RESULT:
There is discrimination in membership between two groups.
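Several figures in the discriminant output are tied together by simple identities, which gives a quick consistency check. The sketch below is illustrative and not part of the SPSS procedure. It assumes a single discriminant function, equal prior probabilities (the All groups equal option) and discriminant scores with unit within-group variance; the function name posterior_finish is hypothetical.

```python
import math

# Values copied from the Eigenvalues and Classification Results tables.
eigenvalue = 1.876
confusion = {
    "FINISH": {"FINISH": 22, "NOT FINISH": 3},
    "NOT FINISH": {"FINISH": 2, "NOT FINISH": 23},
}
centroids = {"FINISH": 1.342, "NOT FINISH": -1.342}

# With one function, both statistics follow directly from the eigenvalue.
canonical_corr = math.sqrt(eigenvalue / (1 + eigenvalue))
wilks_lambda = 1 / (1 + eigenvalue)

# Hit ratio: correctly classified cases divided by all cases.
correct = sum(confusion[group][group] for group in confusion)
total = sum(sum(row.values()) for row in confusion.values())
hit_ratio = correct / total


def posterior_finish(score):
    """P(FINISH | score) under equal priors and unit within-group variance (assumed)."""
    d_finish = (score - centroids["FINISH"]) ** 2
    d_not = (score - centroids["NOT FINISH"]) ** 2
    return 1 / (1 + math.exp(-(d_not - d_finish) / 2))


print(f"canonical correlation = {canonical_corr:.3f}")
print(f"Wilks' lambda = {wilks_lambda:.3f}")
print(f"hit ratio = {hit_ratio:.1%}")
print(f"P(FINISH | score 2.145) = {posterior_finish(2.145):.3f}")
```

Under these assumptions, a case is assigned to FINISH when its discriminant score is above the midpoint of the two centroids (zero here) and to NOT FINISH otherwise.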
CLUSTER ANALYSIS
OBJECTIVE:
To conduct K-means cluster analysis.
PROBLEM:
The attributes of 21 brands of VCR are given. Group the brands into clusters using K-means cluster analysis.
PROCEDURE:
1. Select Analyze > Classify > K-Means Cluster.
2. Select all variables and move them into the Variables box.
3. Label cases by brand.
4. Enter the ‘Number of clusters’ as 3.
5. Select the required options from ‘Save’, then click Continue.
6. Click Options, check Initial cluster centers, click Continue and then OK.
OUTPUT:
Quick Cluster
Iteration History (a)
            Change in Cluster Centers
Iteration   1        2        3
1           52.791   70.267   80.393
2           14.037   12.547   .000
3           .000     .000     .000
a. Convergence achieved due to no or small change in cluster centers. The maximum absolute coordinate change for any center is .000. The current iteration is 3. The minimum distance between initial centers is 335.404.
Number of Cases in each Cluster
Cluster   1   8.000
          2   8.000
          3   5.000
Valid         21.000
Missing       .000
Initial Cluster Centers
           Cluster
           1     2     3
price      200   535   380
pictur1    3     5     2
pictur2    3     5     2
pictur3    3     5     2
pictur4    3     5     2
pictur5    3     5     2
program    3     3     3
recept1    4     4     3
recept3    2     3     3
audio1     4     5     4
audio2     4     4     4
audio3     4     4     4
features   4     4     4
events     8     6     4
days       365   365   30
remote1    3     3     4
remote2    3     4     4
remote3    3     4     4
extras1    3     12    6
extras2    3     12    6
extras3    3     12    6
Final Cluster Centers
           Cluster
           1     2     3
price      239   453   460
pictur1    3     4     4
pictur2    3     4     4
pictur3    3     4     4
pictur4    3     4     4
pictur5    3     4     4
program    3     3     3
recept1    4     4     3
recept3    2     3     3
audio1     4     4     4
audio2     4     4     3
audio3     4     4     4
features   4     4     4
events     8     7     6
days       365   365   30
remote1    3     3     4
remote2    3     3     4
remote3    3     3     4
extras1    3     8     10
extras2    3     8     10
extras3    3     8     10
INFERENCE:
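Three clusters were formed, containing 8, 8 and 5 brands respectively.
RESULT:
Thus K-means cluster analysis was carried out using SPSS and the 21 brands were grouped into three clusters.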
REGRESSION
OBJECTIVE:
To find the dependence of the company's sales on the independent variables.
PROBLEM:
A manufacturer and marketer of electric motors would like to build a regression
model relating sales to 5 or 6 independent variables. Past data have been collected for 15
sales territories, on sales and 6 different independent variables. Build a regression model and
recommend whether or not it should be used by the company.
Dependent Variable:
Y Sales (in Rs. Lakhs) in the territory.
Independent Variables:
X1 Market potential of the territory.
X2 No. of dealers of the Company in the territory.
X3 No. of sales person in the territory.
X4 Index of the competitor activity in the territory on a 5 point scale.
X5 No. of service people in the territory.
X6 No. of existing customers in the territory.
S. No.  Sales (Y)  Potential (X1)  Dealers (X2)  People (X3)  Competition (X4)  Service (X5)  Customers (X6)
1       5          25              1             6            5                 2             20
2       60         150             12            30           4                 5             50
3       20         45              5             15           3                 2             25
4       11         30              2             10           3                 2             20
5       45         75              12            20           2                 4             30
6       6          10              3             8            2                 3             16
7       15         29              5             18           4                 5             30
8       22         43              7             16           3                 6             40
9       29         70              4             15           2                 5             39
10      3          40              1             6            5                 2             5
11      16         40              4             11           4                 2             17
12      8          25              2             9            3                 3             10
13      18         32              7             14           3                 4             31
14      23         73              10            10           4                 3             43
15      81         150             15            35           4                 7             70
Null Hypothesis
There is no dependence between the dependent variable, sales, and the independent
variables (market potential, no. of dealers, no. of sales people, index of competitor activity,
no. of service people and no. of existing customers) at the 95% confidence level.
Alternative Hypothesis
There is dependence between the independent and dependent variables at the 95%
confidence level.
PROCEDURE:
Let the estimating equation be Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6, where a0 is the constant.
1. The variables are defined in the variable view of the SPSS data editor.
2. Enter the data in the data view.
3. Choose Analyze > Regression > Linear from the main menu.
4. In the Linear Regression dialogue box enter sales as the dependent variable and all the
other variables as the independent variables.
5. Click the Statistics button and select the regression coefficient Estimates, Model fit and
Descriptives check boxes.
6. The output is generated, analyzed and the inference is drawn.
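The fit that SPSS performs in the steps above can be sketched without SPSS. The following plain-Python example fits the model by ordinary least squares through the normal equations, using the 15 territories from the problem table. The helper name `solve` is illustrative, and the approach is a sketch of the standard least-squares method rather than a description of SPSS's internal algorithm. If the data table has been transcribed correctly, the printed coefficients and R squared should be close to the B column and R Square of the SPSS output.

```python
# Data from the problem table: (Y, X1, X2, X3, X4, X5, X6) for the 15 territories.
data = [
    (5, 25, 1, 6, 5, 2, 20), (60, 150, 12, 30, 4, 5, 50), (20, 45, 5, 15, 3, 2, 25),
    (11, 30, 2, 10, 3, 2, 20), (45, 75, 12, 20, 2, 4, 30), (6, 10, 3, 8, 2, 3, 16),
    (15, 29, 5, 18, 4, 5, 30), (22, 43, 7, 16, 3, 6, 40), (29, 70, 4, 15, 2, 5, 39),
    (3, 40, 1, 6, 5, 2, 5), (16, 40, 4, 11, 4, 2, 17), (8, 25, 2, 9, 3, 3, 10),
    (18, 32, 7, 14, 3, 4, 31), (23, 73, 10, 10, 4, 3, 43), (81, 150, 15, 35, 4, 7, 70),
]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

y = [row[0] for row in data]
X = [[1.0] + list(row[1:]) for row in data]  # the leading 1.0 gives the constant term

# Normal equations: (X'X) b = X'y
k = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
b = solve(xtx, xty)

fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in X]
mean_y = sum(y) / len(y)
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared = 1 - ss_resid / ss_total

print("coefficients (constant, X1..X6):", [round(v, 3) for v in b])
print("R squared:", round(r_squared, 3))
```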
OUTPUT:
Model Summary (b)
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .989(a)    .977        .960                 4.391
a Predictors: (Constant), No. of existing customers in the territory, Index of
competitor activity in the territory, No. of service people in the territory, Market potential in
the territory (in Rs. lakhs), No. of dealers of the company in the territory, No. of sales people
in the territory
b Dependent Variable: Sales in Rs.lakhs in the territory
ANOVA (b)
Model 1       Sum of Squares    df    Mean Square    F         Sig.
Regression    6609.485          6     1101.581       57.133    .000(a)
Residual      154.249           8     19.281
Total         6763.733          14
a Predictors: (Constant), No. of existing customers in the territory, Index of competitor
activity in the territory, No. of service people in the territory, Market potential in the territory
(in Lakhs), No. of dealers of the company in the territory, No. of sales people in the territory
b Dependent Variable: Sales in Lakhs in the territory
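The remaining figures in the ANOVA and Model Summary tables follow from the sums of squares and degrees of freedom. A short Python sketch of that arithmetic, using the printed (rounded) values, so the last digit may differ slightly from SPSS; the variable names are illustrative:

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 6609.485, 6
ss_residual, df_residual = 154.249, 8
ss_total = ss_regression + ss_residual

ms_regression = ss_regression / df_regression   # Mean Square (Regression)
ms_residual = ss_residual / df_residual         # Mean Square (Residual)
f_value = ms_regression / ms_residual           # F

r_squared = ss_regression / ss_total            # R Square
n = df_regression + df_residual + 1             # 15 territories
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual
std_error_estimate = math.sqrt(ms_residual)     # Std. Error of the Estimate

print(round(f_value, 3), round(r_squared, 3),
      round(adj_r_squared, 3), round(std_error_estimate, 3))
```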
Coefficients (a)
                                                    Unstandardized          Standardized
                                                    Coefficients            Coefficients                     95% Confidence Interval for B
Model 1                                             B         Std. Error    Beta            t        Sig.    Lower Bound    Upper Bound
(Constant)                                          -3.173    5.813                         -.546    .600    -16.579        10.233
Market potential in the territory (in Rs. Lakhs)    .227      .075          .439            3.040    .016    .055           .399
No. of dealers of the company in the territory      .819      .631          .164            1.298    .230    -.636          2.275
No. of sales people in the territory                1.091     .418          .414            2.609    .031    .127           2.055
Index of competitor activity in the territory       -1.893    1.340         -.085           -1.413   .195    -4.982         1.197
No. of service people in the territory              -.549     1.568         -.041           -.350    .735    -4.166         3.067
No. of existing customers in the territory          .066      .195          .050            .338     .744    -.384          .516
a Dependent Variable: Sales in Rs.lakhs in the territory
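Therefore the estimating equation, written with the standardized (Beta) coefficients, is: Y = 0.439X1 + 0.164X2 + 0.414X3 - 0.085X4 - 0.041X5 + 0.050X6
With the unstandardized coefficients (B), which apply to the variables in their original units, it is: Y = -3.173 + 0.227X1 + 0.819X2 + 1.091X3 - 1.893X4 - 0.549X5 + 0.066X6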
INFERENCE:
1. The variables are entered using the Enter method.
2. R ranges from 0 to 1, and larger values indicate a stronger relationship. Here R = .989
and R Square = .977, so the model accounts for about 97.7% of the variation in sales.
3. The significance value (.000) from the ANOVA table is less than 0.05, so the null
hypothesis is rejected and the alternative hypothesis is accepted. Hence the independent
variables, taken together, explain the variation in the dependent variable.
4. The t statistic indicates the relative importance of each variable in the model. As a
rule of thumb, variables with t values below -2 or above +2 are useful predictors.
5. The t statistic and its significance value are used to test the null hypothesis that the
regression coefficient is zero (that is, that there is no linear relationship between the
dependent and independent variable).
6. The significance levels of market potential (0.016) and no. of sales people (0.031)
are less than 0.05, so the null hypothesis is rejected for these coefficients.
Hence these variables are linearly related to sales.
7. The significance levels of the number of dealers (0.230), index of competitor activity (0.195),
no. of service people (0.735) and no. of existing customers (0.744) are greater than 0.05,
so the null hypothesis is not rejected for these coefficients. Hence these variables do not
show a significant linear relationship with sales when the other variables are in the model.
8. Residuals are estimates of the true errors in the model. A residual is the
difference between the observed value of the dependent variable and the value
predicted by the model.
9. Since the residual sum of squares (154.249) is small compared with the regression sum of
squares (6609.485), the estimating equation fits the data well.
10. Since the significance value is less than 0.05, the estimating equation as a whole is statistically significant.
11. A histogram of the residuals indicates that they follow an approximately normal
distribution, which supports the appropriateness of the model for the data.
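The t values and confidence intervals used in the inference can be approximately recovered from the B and Std. Error columns. The sketch below assumes the usual formulas t = B / Std. Error and B ± t_critical × Std. Error, with the two-tailed 5% critical value for 8 degrees of freedom (2.306) taken from standard t tables. Because the table values are rounded to three decimals, the results agree with SPSS only to about two decimals.

```python
# B and Std. Error from the coefficients table (rounded to three decimals there).
coefficients = {
    "Market potential": (0.227, 0.075),
    "No. of dealers": (0.819, 0.631),
    "No. of sales people": (1.091, 0.418),
    "Index of competitor activity": (-1.893, 1.340),
    "No. of service people": (-0.549, 1.568),
    "No. of existing customers": (0.066, 0.195),
}

# Two-tailed 5% critical value of t for 8 residual degrees of freedom.
T_CRITICAL = 2.306

results = {}
for name, (b, se) in coefficients.items():
    t = b / se
    lower, upper = b - T_CRITICAL * se, b + T_CRITICAL * se
    significant = abs(t) > T_CRITICAL  # equivalent to Sig. < 0.05
    results[name] = (t, lower, upper, significant)
    print(f"{name}: t = {t:.2f}, 95% CI = ({lower:.3f}, {upper:.3f}), "
          f"significant = {significant}")
```

A coefficient is significant at the 5% level exactly when its 95% confidence interval excludes zero, which is why only market potential and no. of sales people qualify.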
RESULT
There is dependency between sales (the dependent variable) and the market potential
of the territory and the number of sales people in the territory (independent variables). The other
independent variables, number of dealers, number of service people, index of competitor
activity and number of existing customers in the territory, do not have a significant linear
relationship with sales. The estimating equation, written with the standardized (Beta) coefficients, is:
Y = 0.439X1 + 0.164X2 + 0.414X3 - 0.085X4 - 0.041X5 + 0.050X6
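The coefficients in this equation are the standardized Beta values, which are related to the unstandardized B values by Beta = B × (standard deviation of X) / (standard deviation of Y). The sketch below checks this relation using B from the coefficients table and the standard deviations reported in the Descriptive Statistics table of the correlation exercise, which uses the same data. Agreement is only approximate because all inputs are rounded.

```python
# Standard deviation of sales (Y) from the Descriptive Statistics table.
sd_sales = 21.980

# (unstandardized B, standard deviation of the variable) for each predictor.
variables = {
    "X1 market potential": (0.227, 42.543),
    "X2 dealers": (0.819, 4.408),
    "X3 sales people": (1.091, 8.340),
    "X4 competitor activity": (-1.893, 0.986),
    "X5 service people": (-0.549, 1.633),
    "X6 existing customers": (0.066, 16.829),
}

betas = {name: b * sd_x / sd_sales for name, (b, sd_x) in variables.items()}
for name, beta in betas.items():
    print(f"{name}: Beta = {beta:.3f}")
```

Standardized coefficients are useful for comparing the relative weight of the predictors, while predictions in Rs. lakhs require the unstandardized B values and the constant.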
CORRELATION
OBJECTIVE:
To find the interrelationship between the dependent and the independent variables.
PROBLEM:
A manufacturer and marketer of electric motors would like to build a regression
model relating sales to 5 or 6 independent variables. Past data have been collected for 15
sales territories, on sales and 6 different independent variables. Build a regression model and
recommend whether or not it should be used by the company.
Dependent Variable:
Y Sales (in Rs. Lakhs) in the territory.
Independent Variables:
X1 Market potential of the territory.
X2 No. of dealers of the Company in the territory.
X3 No. of sales person in the territory.
X4 Index of the competitor activity in the territory on a 5 point scale.
X5 No. of service people in the territory.
X6 No. of existing customers in the territory.
S. No.  Sales (Y)  Potential (X1)  Dealers (X2)  People (X3)  Competition (X4)  Service (X5)  Customers (X6)
1       5          25              1             6            5                 2             20
2       60         150             12            30           4                 5             50
3       20         45              5             15           3                 2             25
4       11         30              2             10           3                 2             20
5       45         75              12            20           2                 4             30
6       6          10              3             8            2                 3             16
7       15         29              5             18           4                 5             30
8       22         43              7             16           3                 6             40
9       29         70              4             15           2                 5             39
10      3          40              1             6            5                 2             5
11      16         40              4             11           4                 2             17
12      8          25              2             9            3                 3             10
13      18         32              7             14           3                 4             31
14      23         73              10            10           4                 3             43
15      81         150             15            35           4                 7             70
Null Hypothesis
There is no significant relationship between the independent and the dependent
variables at the 95% confidence level.
Alternative Hypothesis
There is a significant relationship between the independent and dependent variables at
the 95% confidence level.
PROCEDURE:
Let the estimating equation be Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + a5X5 + a6X6, where a0 is the constant.
1. The variables are defined in the variable view of the SPSS data editor.
2. Enter the data in the data view.
3. Choose Analyze > Correlate > Bivariate from the main menu.
4. In the Bivariate Correlations dialogue box select all the dependent and independent
variables.
5. Select Pearson's correlation coefficient with a one-tailed test of significance.
6. Also include the statistics for mean and standard deviation.
7. The output is generated, analyzed and the inference is drawn.
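The statistics requested in the steps above can be sketched in plain Python for one pair of variables, sales (Y) and market potential (X1), using the values from the problem table. The function names are illustrative. The t statistic shown is the standard test of the hypothesis that the correlation is zero, with n - 2 degrees of freedom.

```python
import math

sales = [5, 60, 20, 11, 45, 6, 15, 22, 29, 3, 16, 8, 18, 23, 81]            # Y
potential = [25, 150, 45, 30, 75, 10, 29, 43, 70, 40, 40, 25, 32, 73, 150]  # X1

def mean(values):
    return sum(values) / len(values)

def std_dev(values):
    """Sample standard deviation (divisor n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(potential, sales)
n = len(sales)
t = r * math.sqrt((n - 2) / (1 - r ** 2))  # test statistic for r = 0
print(f"mean = {mean(sales):.2f}, sd = {std_dev(sales):.3f}, r = {r:.3f}, t = {t:.2f}")
```

The same functions can be applied to any other pair of columns to build up the full correlation matrix.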
OUTPUT:
Descriptive Statistics
                                                    Mean     Std. Deviation    N
Sales in Rs.lakhs in the territory                  24.13    21.980            15
Market potential in the territory (in Rs. lakhs)    55.80    42.543            15
No. of dealers of the company in the territory      6.00     4.408             15
No. of sales people in the territory                14.87    8.340             15
Index of competitor activity in the territory       3.40     .986              15
No. of service people in the territory              3.67     1.633             15
No. of existing customers in the territory          29.73    16.829            15
CHI-SQUARE TEST
OBJECTIVE:
To find out whether there is a significant association between the income of the individuals
and intention to purchase.
PROBLEM:
A customer survey was conducted for a brand of detergent. One of the questions dealt
with the income category and the other asked the respondent to rate his purchase intention.
These two variables are listed in the table below. Both variables are coded as follows:
INCOME TABLE
CODE    INCOME IN Rs./month
1       <=5000
2       5001-10000
3       10001-20000
4       >20000
PURCHASE TABLE
Code    Intention
1       None
2       Low
3       High
4       Very high
5       Certain
S. No.  Income         Code    Intent       Intent Code
1       <5000          1       None         1
2       <5000          1       Low          2
3       <5000          1       Low          2
4       <5000          1       None         1
5       <5000          1       High         3
6       5001-10000     2       Low          2
7       5001-10000     2       High         3
8       5001-10000     2       Very high    4
9       5001-10000     2       High         3
10      5001-10000     2       Low          2
11      10001-20000    3       High         3
12      10001-20000    3       Very high    4
13      10001-20000    3       Certain      5
14      10001-20000    3       High         3
15      10001-20000    3       Very high    4
16      >20000         4       High         3
17      >20000         4       Certain      5
18      >20000         4       Very high    4
19      >20000         4       Certain      5
20      >20000         4       Certain      5
Null Hypothesis
There is no significant association between income and purchase intention.
Alternate Hypothesis
There is significant association between income and purchase intention.
PROCEDURE:
1. The field names and the corresponding data types are entered in the variable view
with the income and purchase intention in nominal measure.
2. The given data is entered in the data view.
3. Choose Analyze > Descriptive Statistics > Crosstabs from the main menu, then click Statistics.
4. Select the Chi-square test and check the required cells.
5. Select income of the respondent (the independent variable) into the rows option and
the intention of the respondents (the dependent variable) into the columns option.
6. The value and significance column are compared from the output and the inference is
made.
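The cross tabulation and the Pearson Chi-square statistic produced by these steps can be sketched in plain Python from the coded data. The 5% critical value for 12 degrees of freedom (21.026) is taken from standard Chi-square tables; the variable names are illustrative.

```python
from collections import Counter

# (income code, intention code) for the 20 respondents in the problem table.
responses = [
    (1, 1), (1, 2), (1, 2), (1, 1), (1, 3),
    (2, 2), (2, 3), (2, 4), (2, 3), (2, 2),
    (3, 3), (3, 4), (3, 5), (3, 3), (3, 4),
    (4, 3), (4, 5), (4, 4), (4, 5), (4, 5),
]

observed = Counter(responses)                         # cell counts
row_totals = Counter(income for income, _ in responses)
col_totals = Counter(intent for _, intent in responses)
n = len(responses)

chi_square = 0.0
min_expected = float("inf")
for income in row_totals:
    for intent in col_totals:
        # Expected count under independence: row total x column total / n.
        expected = row_totals[income] * col_totals[intent] / n
        min_expected = min(min_expected, expected)
        chi_square += (observed[(income, intent)] - expected) ** 2 / expected

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper 5% point of the Chi-square distribution with 12 degrees of freedom.
CRITICAL_5_PERCENT_DF_12 = 21.026
reject_null = chi_square > CRITICAL_5_PERCENT_DF_12
print(f"chi-square = {chi_square:.3f}, df = {df}, "
      f"min expected = {min_expected:.2f}, reject = {reject_null}")
```

Comparing the statistic with the tabulated critical value is equivalent to comparing the significance value with 0.05.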
OUTPUT:
CHI-SQUARE TEST
Case Processing Summary
                                                    Cases
                                                    Valid            Missing          Total
                                                    N     Percent    N     Percent    N     Percent
Income of the respondent * Intention to purchase    20    100.0%     0     .0%        20    100.0%

Income of the respondent * Intention to purchase Cross tabulation
Count
                            Intention to purchase
Income of the respondent    None    Low    High    Very high    Certain    Total
<5000                       2       2      1       0            0          5
5001-10000                  0       2      2       1            0          5
10001-20000                 0       0      2       2            1          5
>20000                      0       0      1       1            3          5
Total                       2       4      6       4            4          20

Chi-Square Tests
                                Value        Df    Asymp. Sig. (2-sided)
Pearson Chi-Square              18.667(a)    12    .097
Likelihood Ratio                21.134       12    .048
Linear-by-Linear Association    11.790       1     .001
N of Valid Cases                20
a. 20 cells (100.0%) have expected count less than 5. The minimum expected count is .50.

Symmetric Measures
                                    Value    Approx. Sig.
Nominal by Nominal    Phi           .966     .097
                      Cramer's V    .558     .097
N of Valid Cases                    20
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
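Phi and Cramer's V in the Symmetric Measures table are both derived from the Pearson Chi-square value. A short sketch of the standard formulas, using the figures from the output (the variable names are illustrative):

```python
import math

chi_square = 18.667   # Pearson Chi-Square from the output
n = 20                # N of valid cases
rows, cols = 4, 5     # income categories x intention categories

phi = math.sqrt(chi_square / n)
cramers_v = math.sqrt(chi_square / (n * min(rows - 1, cols - 1)))
print(f"Phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}")
```

Because both measures are functions of the same Chi-square value, they share its significance value (.097).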
[Bar Chart: Count (0 to 3) by Income of the respondent (<5000, 5001-10000, 10001-20000, >20000) and Intention to purchase (none, low, high, very high, certain)]
INFERENCE:
1. The Chi-square test tests the hypothesis that the row and column variables in a cross
tabulation are independent.
2. The significance value of the Pearson Chi-Square (0.097) is greater than 0.05.
3. So the null hypothesis is not rejected. Hence there is no significant association
between the two variables, income and purchase intention.
4. The symmetric measures (Phi and Cramer's V) indicate both the strength and significance of the
relationship between the row and column variables of the cross tabulation.
5. Cramer's V ranges from 0 to 1, with larger values indicating a stronger association.
Here Phi = .966 and Cramer's V = .558.
6. The significance value of these measures (0.097) is also greater than 0.05, indicating that the
relationship between the two variables is not statistically significant.
7. Hence the two attributes are treated as independent. Since all 20 cells have an expected count
of less than 5, the Chi-square approximation should be interpreted with caution.
RESULT:
There is no significant association between the income and the purchase intention of the
individual.