NewGate India, Hyderabad, Andhra Pradesh - 500038
Website: www.newgate.in
Email: [email protected]
Slideshare URL : http://www.slideshare.net/newgateindia
Business Statistics
Descriptive Statistical Analysis
For
Location of PGP Jan-09 students
CONTENTS:
I. Question
II. Solution: Calculation
1. Data Collection
2. Data Generated through MS Excel
3. Data Classified to Class Intervals
4. Sum, Count & Range
5. Mean
6. Variance & Standard Deviation
6.1 Standard Deviation
6.2 Variance
7. Median
8. Quartile
9. Inter Quartile Range
10. Upper Limit & Lower Limit
11. Mode
12. Skewness
III. Analysis:
13. Central Tendencies & Dispersion
13.1 Central Tendency
13.2 Dispersion
13.3 Coefficient of Variation
14. Mode
15. Box & Whisker Plot
15.1 Outliers
15.2 Evidence of Skewness
Appendix – 1
Appendix – 2
Appendix – 3
Bibliography
__________________________________________________________________________
I. QUESTION
__________________________________________________________________________
Collect information on any variable for a group of 30 members. Write a report
summarizing those data, including the following activities.
a. Calculate appropriate measures of central tendency and dispersion.
b. Do these data have a mode?
c. Draw a box-and-whisker plot. Are there any outliers? Does the plot show
any evidence of skewness?
__________________________________________________________________________
II. Solution: Calculation
__________________________________________________________________________
1. DATA COLLECTION:
A survey was conducted on a sample of 30 students from the PGP Jan 2010 batch of Alliance
Business School.
The survey recorded the place each student comes from. The distance of each place
was then calculated with reference to Bangalore.
SAMPLE : 30 Students
POPULATION : 52 Students
NOTE: For students from Bangalore the distance was recorded as 0 Km, since all distances
were measured from Bangalore.
The distances for all 30 students were collected and are tabulated below.
Source for distance calculation : http://maps.google.co.in/
COLLECTED DATA TABLE 1
S.No  Name       Location     Distance (Km)
1     Abhigna    Vijaywada    658
2     Adil       Bangalore    0
3     Akshay     Jaipur       1855
4     Anupriya   Lucknow      1879
5     Bhavya     Kodagui      206
6     Chandreep  Kolkata      1929
7     Deepak     Bokaro       1981
8     Girish     Bangalore    0
9     Irfan      Kolkata      1929
10    Kiran      Alleppy      441
11    Kreetika   Bangalore    0
12    Laxman     Jamshedpur   1420
13    Mohit      Delhi        2079
14    Mridul     Patna        2022
15    Neha       Shimla       2444
16    Pradeeep   Mangalore    298
17    Prajuktri  Kolkata      1929
18    Prashant   Chennai      327
19    Priyank    Agra         1879
20    Rama       Chennai      327
21    Rishab     Delhi        2079
22    Ritesh     Jamshedpur   1420
23    Ritu       Agartala     3304
24    Satish     Chennai      327
25    Shruti     Bangalore    0
26    Soumya     Kanpur       1783
27    Sunam      Rourkela     1708
28    Tanay      Bhopal       1466
29    Yeshwant   Guntur       619
30    Yetin      Chennai      327
2. DATA GENERATED THROUGH MS EXCEL
Descriptive statistics were generated with MS-Excel. All 30 observations listed above
were used to produce the report below.
MS-EXCEL REPORT 1
Mean                       1221.2
Median                     1443
Mode                       0
Standard Deviation         915.7679423
Sample Variance            838630.9241
Skewness                   0.117333504
Range                      3304
Minimum                    0
Maximum                    3304
Sum                        36636
Count                      30
Largest(1)                 3304
Smallest(1)                0
Confidence Level(95.0%)    341.9533643
Q1 (First Quartile)        327
Q2 (Second Quartile)       1443
Q3 (Third Quartile)        1929
IQR                        1602
Upper Limit                4332
Lower Limit                -2076
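The main figures of this report can be cross-checked outside MS-Excel. The following is a minimal sketch using Python's standard `statistics` module on the 30 distances of Table 1; the variable names are illustrative, and the sample (n - 1) forms of variance and standard deviation are assumed, since those are the ones that reproduce the reported values.

```python
import statistics

# Distances in Km from Collected Data Table 1, in survey order
distances = [
    658, 0, 1855, 1879, 206, 1929, 1981, 0, 1929, 441,
    0, 1420, 2079, 2022, 2444, 298, 1929, 327, 1879, 327,
    2079, 1420, 3304, 327, 0, 1783, 1708, 1466, 619, 327,
]

summary = {
    "Count": len(distances),
    "Sum": sum(distances),
    "Mean": statistics.mean(distances),
    "Median": statistics.median(distances),
    "Standard Deviation": statistics.stdev(distances),  # sample (n - 1) form
    "Sample Variance": statistics.variance(distances),  # sample (n - 1) form
    "Minimum": min(distances),
    "Maximum": max(distances),
    "Range": max(distances) - min(distances),
}

for name, value in summary.items():
    print(f"{name:20s} {value}")
```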
3. DATA CLASSIFIED TO CLASS INTERVAL
To better understand the behaviour of the sample data, we grouped the collected
distances into class intervals of width 500 Km.
CLASSIFIED CLASS INTERVAL TABLE : 2
X ( in Km) f
0-500 10
500-1000 2
1000-1500 3
1500-2000 9
2000-2500 5
2500-3000 0
3000-3500 1
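A tally of this kind can be sketched as follows. The sketch assumes half-open classes [lower, lower + 500), a convention the report does not state; no distance in Table 1 falls on a class boundary other than 0, so the choice does not affect the counts. Run on the Table 1 distances, it counts 11 values in 0-500 and 4 in 2000-2500, against 10 and 5 in the table above; the calculations that follow use the table as printed.

```python
# Distances in Km from Collected Data Table 1, in survey order
distances = [
    658, 0, 1855, 1879, 206, 1929, 1981, 0, 1929, 441,
    0, 1420, 2079, 2022, 2444, 298, 1929, 327, 1879, 327,
    2079, 1420, 3304, 327, 0, 1783, 1708, 1466, 619, 327,
]

width = 500                    # class width used in Table 2
edges = range(0, 3500, width)  # lower limits 0, 500, ..., 3000

# Each class is treated as half-open: [lower, lower + width)
tally = {f"{lo}-{lo + width}": 0 for lo in edges}
for d in distances:
    lo = (d // width) * width
    tally[f"{lo}-{lo + width}"] += 1

for label, f in tally.items():
    print(f"{label:10s} {f}")
```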
4. SUM, COUNT & RANGE
ARRANGEMENT (SORTING OF DATA) :
First of all, the data were arranged from top to bottom in increasing order. Each class
is represented by its midpoint Xmid.
X (in Km)    Xmid (in Km)   f    Xmid x f
0-500        250            10   2500
500-1000     750            2    1500
1000-1500    1250           3    3750
1500-2000    1750           9    15750
2000-2500    2250           5    11250
2500-3000    2750           0    0
3000-3500    3250           1    3250
Total                       30   38000
SUM : Σ (Xmid x f) = 38000
COUNT : n = 30
Range :
Range = Maximum value - Minimum value
Max = 3304
Min = 0
Range = 3304 - 0 = 3304
_________________________________________________________________________________________
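5. MEAN:
For grouped data the mean is the frequency-weighted average of the class midpoints:
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$$
where x_i is the midpoint of class i, f_i its frequency, k = 7 the number of classes and n = 30.
$$\bar{X} = \frac{(250 \times 10) + (750 \times 2) + (1250 \times 3) + (1750 \times 9) + (2250 \times 5) + (2750 \times 0) + (3250 \times 1)}{30}$$
= 38000 / 30
X̄ = 1266.66
________________________________________________________________________________________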
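The weighted average above can be checked with a short sketch. The (midpoint, frequency) pairs are copied from the class interval table; the variable names are illustrative.

```python
# (class midpoint in Km, frequency) pairs from the class interval table
classes = [(250, 10), (750, 2), (1250, 3), (1750, 9),
           (2250, 5), (2750, 0), (3250, 1)]

n = sum(f for _, f in classes)                 # number of observations
weighted_sum = sum(x * f for x, f in classes)  # sum of Xmid x f
grouped_mean = weighted_sum / n

print(n, weighted_sum, round(grouped_mean, 2))  # 30 38000 1266.67
```

Rounded to two decimals the quotient is 1266.67; the report truncates it to 1266.66.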
6. VARIANCE & STANDARD DEVIATION:
6.1 Standard deviation
$$S = \frac{\sum f_i \, \lvert x_i - \bar{X} \rvert}{n - 1}$$
$$S = \frac{|250 - 1266.66| \times 10 + |750 - 1266.66| \times 2 + |1250 - 1266.66| \times 3 + |1750 - 1266.66| \times 9 + |2250 - 1266.66| \times 5 + |2750 - 1266.66| \times 0 + |3250 - 1266.66| \times 1}{30 - 1}$$
S = 759.31
6.2 VARIANCE :
$$S = \sqrt{\text{Variance}} \quad\Rightarrow\quad \text{Variance} = S \times S$$
Variance = 759.31 x 759.31 = 576552.2
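The expression in 6.1 averages absolute deviations. The conventional sample standard deviation, which is the quantity MS-Excel reports, squares the deviations and then takes a square root. For comparison, here is a minimal sketch of that conventional grouped calculation, assuming the class midpoints stand in for the individual observations; the variable names are illustrative.

```python
import math

# (class midpoint in Km, frequency) pairs from the class interval table
classes = [(250, 10), (750, 2), (1250, 3), (1750, 9),
           (2250, 5), (2750, 0), (3250, 1)]

n = sum(f for _, f in classes)
mean = sum(x * f for x, f in classes) / n

# Sum of squared deviations from the mean, weighted by class frequency
ss = sum(f * (x - mean) ** 2 for x, f in classes)

sample_variance = ss / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(round(sample_variance, 2), round(sample_sd, 2))  # 749712.64 865.86
```

On the grouped table this gives about 865.86 Km, closer to the MS-Excel figure of 915.77 Km computed from the raw distances.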
_________________________________________________________________________________________
7. MEDIAN:
$$\text{Median} = L + \frac{N/2 - C}{f} \times i$$
L : Lower limit of the median class
N : Number of observations
f : Frequency of the median class
i : Width of the class interval
C : Cumulative frequency of the class preceding the median class
N = 30
N/2 = 15
x            f    Cumulative f
0-500        10   10
500-1000     2    12
1000-1500    3    15
1500-2000    9    24
2000-2500    5    29
2500-3000    0    29
3000-3500    1    30
Total        30
_________________________________________________________________________________________
The median class is 1000-1500, the first class whose cumulative frequency reaches N/2 = 15.
N/2 = 15
C = 12
f = 3
i = 500
L = 1000
Q2 (Median)
= 1000 + [ { (15 - 12) / 3 } x 500 ]
Q2 = 1500
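The steps above (build the cumulative frequencies, locate the median class, interpolate inside it) can be sketched as follows. The lower limits and frequencies are taken from the class interval table; the variable names are illustrative.

```python
from itertools import accumulate

lower_limits = [0, 500, 1000, 1500, 2000, 2500, 3000]
freqs = [10, 2, 3, 9, 5, 0, 1]
width = 500

cumulative = list(accumulate(freqs))  # [10, 12, 15, 24, 29, 29, 30]
half = sum(freqs) / 2                 # N/2 = 15

# Median class: first class whose cumulative frequency reaches N/2
k = next(i for i, c in enumerate(cumulative) if c >= half)
C = cumulative[k - 1] if k > 0 else 0  # cumulative frequency before that class
median = lower_limits[k] + (half - C) / freqs[k] * width

print(k, C, median)  # 2 12 1500.0
```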
8. QUARTILE:
Quartiles Q1, Q2 and Q3 are the percentile values that divide the sorted sample into
four equal parts.
Q1 = First Quartile or 25th Percentile
Q2 (Median) = Second Quartile or 50th Percentile
Q3 = Third Quartile or 75th Percentile
$$Q_1 = L + \frac{N/4 - C}{f} \times i$$
N = 30
N/4 = 7.5
From the cumulative frequency table in Section 7, the first class whose cumulative
frequency reaches 7.5 is 0-500.
C = 0
f = 10
i = 500
L = 0
Q1
= 0 + [ { (7.5 - 0) / 10 } x 500 ]
Q1 = 375
$$Q_3 = L + \frac{3N/4 - C}{f} \times i$$
N = 30
3N/4 = 22.5
From the cumulative frequency table in Section 7, the first class whose cumulative
frequency reaches 22.5 is 1500-2000.
C = 15
f = 9
i = 500
L = 1500
Q3
= 1500 + [ { (22.5 - 15) / 9 } x 500 ]
Q3 = 1916.66
Q1 = 375
Q2 = 1500
Q3 = 1916.66
_________________________________________________________________________________________
9. INTER QUARTILE RANGE (IQR)
IQR = Q3 - Q1
IQR = 1916.66 - 375
IQR = 1541.66
__________________________________________________________________________
10. UPPER LIMIT & LOWER LIMIT
Upper Limit U = Q3 + 1.5 IQR
Lower Limit L = Q1 - 1.5 IQR
U = 1916.66 + 1.5 x 1541.66 = 4229.15
L = 375 - 1.5 x 1541.66 = -1937.49
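The quartile, IQR and limit calculations all reuse one interpolation rule. Below is a minimal sketch of that rule as a single helper; `grouped_quantile` is a hypothetical name introduced here for illustration, and the inputs come from the class interval table.

```python
def grouped_quantile(lower_limits, freqs, width, p):
    """Interpolated p-th quantile (0 < p < 1): L + ((p*N - C) / f) * i."""
    target = p * sum(freqs)
    cumulative = 0  # C: cumulative frequency of the preceding classes
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("no observations")

lower_limits = [0, 500, 1000, 1500, 2000, 2500, 3000]
freqs = [10, 2, 3, 9, 5, 0, 1]

q1 = grouped_quantile(lower_limits, freqs, 500, 0.25)
q3 = grouped_quantile(lower_limits, freqs, 500, 0.75)
iqr = q3 - q1
upper = q3 + 1.5 * iqr
lower = q1 - 1.5 * iqr

print(round(q1, 2), round(q3, 2), round(iqr, 2))  # 375.0 1916.67 1541.67
print(round(upper, 2), round(lower, 2))           # 4229.17 -1937.5
```

The report works with values truncated to two decimals, so its last digits can differ slightly from the unrounded results printed here.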
__________________________________________________________________________
11. MODE:
The mode is the value that occurs with the highest frequency. For the grouped data it
is taken here as the midpoint of the class with the highest frequency.
x            Xmid   f
0-500        250    10
500-1000     750    2
1000-1500    1250   3
1500-2000    1750   9
2000-2500    2250   5
2500-3000    2750   0
3000-3500    3250   1
Total               30
The frequency is highest for the interval 0-500, which occurs 10 times.
Mode = 250
__________________________________________________________________________
12. SKEWNESS:
Pearson's coefficient of skewness:
$$Sk_p = \frac{3\,(\text{Mean} - \text{Median})}{S}$$
Skp = [ 3 (1266.66 - 1500) ] / 759.31
Skp = -0.9219
__________________________________________________________________________
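The calculated skewness (-0.9219) and the MS-Excel skewness (0.1173) come from two different definitions. The sketch below contrasts them. `pearson_skew` follows the Skp formula with the report's own inputs. `sample_skew` is the adjusted Fisher-Pearson moment coefficient, which is the form MS-Excel's SKEW function is documented to use, applied here to the raw Table 1 distances. Both function names are illustrative.

```python
import statistics

def pearson_skew(mean, median, sd):
    """Pearson's second coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd

def sample_skew(data):
    """Adjusted Fisher-Pearson: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample (n - 1) standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

# Distances in Km from Collected Data Table 1, in survey order
distances = [
    658, 0, 1855, 1879, 206, 1929, 1981, 0, 1929, 441,
    0, 1420, 2079, 2022, 2444, 298, 1929, 327, 1879, 327,
    2079, 1420, 3304, 327, 0, 1783, 1708, 1466, 619, 327,
]

print(round(pearson_skew(1266.66, 1500, 759.31), 4))  # -0.9219
print(round(sample_skew(distances), 4))               # 0.1173
```

The two measures can disagree in sign, as they do here: Pearson's coefficient only compares mean and median, while the moment coefficient is pulled upward by the single large distance of 3304 Km.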
III. ANALYSIS:
13. Appropriate measures of central tendency and dispersion
13.1 Central Tendency: Mean, Median, Q1, Q3
Mean 1266.66 Km
Median 1500 Km
Observation:
It can be inferred that students of the PGP Jan-10 batch travel an average distance of
1266.66 Km to come to Alliance Business School.
50 % of the observations lie above 1500 Km.
Q1 (First Quartile) 375 Km
Q3 (Third Quartile) 1916.66 Km
[Figure: bar chart comparing Mean and Median (Km)]
[Figure: bar chart comparing Q1, Q2 and Q3 (Km)]
Observation:
25 % of the observations lie below 375 Km.
50 % of the observations lie between 375 Km and 1916.66 Km.
25 % of the observations lie above 1916.66 Km.
13.2 Dispersion: Standard Deviation, Sample Variance, Range, IQR, Coefficient of Variation
Standard Deviation 915.7679423 Km
Sample Variance 838630.9241 Km²
Observation:
On average, an observation deviates from the mean by about 915.78 Km in either
direction.
Mean + Deviation = 2182.44 Km
Mean – Deviation = 350.88 Km
This corresponds to a spread of 2 X 915.78 = 1831.56 Km,
from a lower value of 350.88 Km to an upper value of 2182.44 Km.
Range 3304 Km
IQR 1602 Km
Observation:
The students' hometowns are spread over a range of 3304 Km.
The middle 50 % of the observations lie within a span of 1602 Km.
[Figure: bar chart comparing Standard Deviation, Range and IQR (Km), axis 0 to 3500]
13.3 Coefficient of variation:
Coefficient of variation = (Standard Deviation / Mean) X 100
= (915.7679423 / 1266.66) X 100
= 72.30 %
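The ratio above combines the MS-Excel standard deviation with the mean calculated from the grouped table. A small sketch makes the arithmetic explicit and, for comparison, also evaluates the ratio with both figures taken from the MS-Excel report; the helper name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100

# Ratio as written in the report: MS-Excel SD over the grouped mean
cv_mixed = coefficient_of_variation(915.7679423, 1266.66)
# Both figures taken from the MS-Excel report
cv_excel = coefficient_of_variation(915.7679423, 1221.2)

print(round(cv_mixed, 2))  # 72.3
print(round(cv_excel, 2))  # 74.99
```

Either way the standard deviation is roughly three quarters of the mean, which indicates a widely dispersed sample.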
_________________________________________________________________________________________
14. Maximum Observed data (Mode)
Observation:
The most frequently observed interval is 0 Km to 500 Km.
The mode is taken as its mid value: (0 + 500) / 2 = 250 Km
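Whether the data have a mode can also be checked on the raw distances. The sketch below uses `statistics.multimode` (Python 3.8+), which returns every value tied for the highest count, and then repeats the modal-class reasoning on the grouped table; the variable names are illustrative.

```python
import statistics

# Distances in Km from Collected Data Table 1, in survey order
distances = [
    658, 0, 1855, 1879, 206, 1929, 1981, 0, 1929, 441,
    0, 1420, 2079, 2022, 2444, 298, 1929, 327, 1879, 327,
    2079, 1420, 3304, 327, 0, 1783, 1708, 1466, 619, 327,
]

# Every value tied for the highest count, in order of first appearance
modes = statistics.multimode(distances)
print(modes)  # [0, 327]

# Modal class of the grouped table and its midpoint
classes = {"0-500": 10, "500-1000": 2, "1000-1500": 3, "1500-2000": 9,
           "2000-2500": 5, "2500-3000": 0, "3000-3500": 1}
modal_class = max(classes, key=classes.get)
lo, hi = map(int, modal_class.split("-"))
modal_midpoint = (lo + hi) / 2
print(modal_class, modal_midpoint)  # 0-500 250.0
```

On the raw data two distances are tied at four occurrences each, 0 Km (Bangalore) and 327 Km (Chennai). The MS-Excel report lists only 0.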
_________________________________________________________________________________________
[Figure: bar chart of frequency f for each class interval, 0-500 to 3000-3500]
15. Box-and-whisker plot
              Calculated   MS-Excel
Q1            375          327 Km
Q2            1500         1443 Km
Q3            1916.66      1929 Km
IQR           1541.66      1602 Km
Upper Limit   4229.15      4332 Km
Lower Limit   -1937.49     -2076 Km
15.1 Outliers : Upper Limit, Lower Limit
Upper Limit 4229.15 Km
Lower Limit -1937.49 Km
Max value : 3304 Km < upper limit : 4229.15 Km
Min value : 0 Km > lower limit : -1937.49 Km
Hence, no outliers.
Observation:
Any observation above 4229.15 Km or below -1937.49 Km would be an outlier.
The maximum observation is 3304 Km < 4229.15 Km and the minimum is 0 Km > -1937.49 Km,
so there are no outliers.
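The same 1.5 x IQR rule can be applied to the raw distances. The sketch below assumes the "inclusive" quartile method of Python's `statistics.quantiles`, chosen because it reproduces the quartiles listed in the MS-Excel report; the variable names are illustrative.

```python
import statistics

# Distances in Km from Collected Data Table 1, in survey order
distances = [
    658, 0, 1855, 1879, 206, 1929, 1981, 0, 1929, 441,
    0, 1420, 2079, 2022, 2444, 298, 1929, 327, 1879, 327,
    2079, 1420, 3304, 327, 0, 1783, 1708, 1466, 619, 327,
]

# method="inclusive" interpolates between sorted values at position (n - 1) * p
q1, q2, q3 = statistics.quantiles(distances, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Observations outside [lower, upper] count as outliers
outliers = [d for d in distances if d < lower or d > upper]

print(q1, q2, q3, iqr)  # 327.0 1443.0 1929.0 1602.0
print(lower, upper)     # -2076.0 4332.0
print(outliers)         # []
```

Both sets of limits, calculated and MS-Excel, lead to the same conclusion: no distance lies outside the whiskers.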
15.2 Evidence of Skewness
Skewness -0.9219
Observation:
The distribution is negatively skewed, as mean < median.
The skewness is -0.9219.
In the box plot the median does not lie exactly midway between Q1 and Q3.
________________________________________________________________________________________
[Figure: box-and-whisker plot showing Upper Limit and Lower Limit, axis -3000 to 5000 Km]
APPENDIX – 1
COLLECTED DATA TABLE 1
S.No  Name       Location     Distance (Km)
1     Abhigna    Vijaywada    658
2     Adil       Bangalore    0
3     Akshay     Jaipur       1855
4     Anupriya   Lucknow      1879
5     Bhavya     Kodagui      206
6     Chandreep  Kolkata      1929
7     Deepak     Bokaro       1981
8     Girish     Bangalore    0
9     Irfan      Kolkata      1929
10    Kiran      Alleppy      441
11    Kreetika   Bangalore    0
12    Laxman     Jamshedpur   1420
13    Mohit      Delhi        2079
14    Mridul     Patna        2022
15    Neha       Shimla       2444
16    Pradeeep   Mangalore    298
17    Prajuktri  Kolkata      1929
18    Prashant   Chennai      327
19    Priyank    Agra         1879
20    Rama       Chennai      327
21    Rishab     Delhi        2079
22    Ritesh     Jamshedpur   1420
23    Ritu       Agartala     3304
24    Satish     Chennai      327
25    Shruti     Bangalore    0
26    Soumya     Kanpur       1783
27    Sunam      Rourkela     1708
28    Tanay      Bhopal       1466
29    Yeshwant   Guntur       619
30    Yetin      Chennai      327
APPENDIX – 2
CLASSIFIED CLASS INTERVAL TABLE : 2
X (in Km)    Xmid (in Km)   f    Cumulative f   Xmid x f
0-500        250            10   10             2500
500-1000     750            2    12             1500
1000-1500    1250           3    15             3750
1500-2000    1750           9    24             15750
2000-2500    2250           5    29             11250
2500-3000    2750           0    29             0
3000-3500    3250           1    30             3250
Total                       30                  38000
MS-EXCEL Vs CALCULATED REPORT
                     Calculated Value   Simulated by MS-Excel
Mean                 1266.66            1221.2
Mode                 250                0
Median               1500               1443
Standard Deviation   759.31             915.7679423
Sample Variance      576552.2           838630.9241
Skewness             -0.9219            0.117333504
Range                3304               3304
Minimum              0                  0
Maximum              3304               3304
Sum                  38000              36636
Count                30                 30
Largest(1)           3304               3304
Smallest(1)          0                  0
Q1                   375                327
Q2                   1500               1443
Q3                   1916.66            1929
IQR                  1541.66            1602
Upper Limit          4229.15            4332
Lower Limit          -1937.49           -2076
APPENDIX – 3
MS-EXCEL Vs CALCULATED REPORT
[Figure: bar chart of Mean, Median, Standard Deviation and Range; series: calculated value, MS-Excel value]
[Figure: bar chart of Q1, Q2, Q3, IQR, Upper Limit and Lower Limit; series: calculated data, MS-Excel value]
BIBLIOGRAPHY
Anderson, Sweeney, Williams, Statistics for Business and Economics, 9th Edition,
Cengage Learning
www.wikipedia.org
www.stats4u.com
www.maps.google.co.in