Statistics and Probability Solved Assignments - Semester Spring 2008


Assignment 1

Question 1

(a) Define the following terms: population, sample, parameter, statistic and variable.

Solution:

Population: The collection of all possible observations regarding the problem under consideration.
Sample: A representative part of a population.
Parameter: Any numerical value computed from a population.
Statistic: Any numerical value computed from a sample.
Variable: A characteristic that varies from individual to individual or from object to object.

(b) Count the number of letters in each word of the following passage, and make a frequency distribution of word length.

The Virtual University of Pakistan delivers education through a judicious combination of broadcast television and the Internet. VU courses are written in meticulous detail by acknowledged experts in the field. Lectures are then recorded in a professional studio environment and after insertion of slides, movie clips and other material, become ready for broadcast. Course lectures are broadcast over television and are also made available in the form of multimedia CDs. The multiple formats allows for a high degree of flexibility for students who may view the lectures at a time of their choosing within a 24 hour period. Additionally, students can use the lectures to review an entire course before their examinations; a facility simply not available in the conventional face to face environment.
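The counting asked for in part (b) can be sketched in Python. This is only an illustration under stated assumptions: the function name `word_length_distribution` is hypothetical, and a "word" is taken to be any unbroken run of letters, so digits such as "24" and all punctuation are ignored. A hand tally that follows a different convention may give slightly different counts.

```python
import re
from collections import Counter


def word_length_distribution(text):
    """Return a Counter mapping word length -> number of words.

    Assumption: a word is a maximal run of ASCII letters, so digits
    and punctuation are not counted as part of any word.
    """
    words = re.findall(r"[A-Za-z]+", text)
    return Counter(len(word) for word in words)


# Small demonstration on the opening words of the passage.
sample = "The Virtual University of Pakistan delivers education"
dist = word_length_distribution(sample)

# Print the distribution in order of word length.
for length in sorted(dist):
    print(length, dist[length])
```

Passing the full passage to the same function gives a frequency distribution that can be compared with the hand tally.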
Solution:

Length of word   Tally                      Frequency
1                |||| |                     6
2                |||| |||| |||| |||         18
3                |||| |||| |||| |||| ||||   25
4                |||| |||| |                11
5                |||| |||                   8
6                |||| |||| ||||             14
7                |||| |                     6
8                |||| |||| ||||             14
9                |||| |||                   8
10               ||||                       5
11               ||||                       4
12               ||||                       5
Total                                       124

Question 2

Find the mean, median and mode from the following data.

Class Interval   Frequency
20-29            6
30-39            15
40-49            21
50-59            29
60-69            25
70-79            22
80-89            11
90-99            9
100-109          3
110-119          1
120-129          2

Solution:

The given data and the required calculations are shown in the following table.

Class Interval   Class Boundaries   Frequency (f)   Mid-Point (x)   fx       Cumulative Freq. (cf)
20-29            19.5-29.5          6               24.5            147      6
30-39            29.5-39.5          15              34.5            517.5    21
40-49            39.5-49.5          21              44.5            934.5    42
50-59            49.5-59.5          29              54.5            1580.5   71     (modal class)
60-69            59.5-69.5          25              64.5            1612.5   96     (median class)
70-79            69.5-79.5          22              74.5            1639     118
80-89            79.5-89.5          11              84.5            929.5    129
90-99            89.5-99.5          9               94.5            850.5    138
100-109          99.5-109.5         3               104.5           313.5    141
110-119          109.5-119.5        1               114.5           114.5    142
120-129          119.5-129.5        2               124.5           249      144
Total                               144                             8888

Mean:

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{8888}{144} = 61.72$$

Median: here $n = 144$, so the median is the $n/2 = 72$nd value, which lies in the class 60-69. With $l = 59.5$, $h = 10$, $f = 25$ and $c = 71$:

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - c\right) = 59.5 + \frac{10}{25}(72 - 71) = 59.5 + 0.4 = 59.9$$

Mode: the modal class is 50-59. With $l = 49.5$, $f_m = 29$, $f_1 = 21$, $f_2 = 25$ and $h = 10$:

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 49.5 + \frac{29 - 21}{(29 - 21) + (29 - 25)} \times 10 = 49.5 + \frac{8}{12} \times 10 = 56.17$$

Assignment 2

Question 1

(a) What is the difference between absolute measures of dispersion and relative measures of dispersion?

(b) The weekly sales of two products A and B were recorded as given below:

Product A   59    75    27    63    27    28    56
Product B   150   200   125   310   330   250   225

Find out which of the two shows greater fluctuation in sales.

Solution (a):

Absolute measures describe, by a single number, the amount of variation among the values in a data set. They are expressed in the same unit of measurement as the data, such as rupees, inches or feet.
Relative measures are the ratio of an absolute measure of dispersion to an average, so their value is independent of any unit of measurement. Such a ratio is also called a coefficient of variation.

Solution (b):

To compare the two products we find the coefficient of variation (CV) of each. The required calculations are shown below.

Product A           Product B
X       X^2         X       X^2
59      3481        150     22500
75      5625        200     40000
27      729         125     15625
63      3969        310     96100
27      729         330     108900
28      784         250     62500
56      3136        225     50625
Total:
335     18453       1590    396250

For Product A:

$$\bar{X} = \frac{\sum X}{n} = \frac{335}{7} = 47.86$$

$$S = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} = \sqrt{\frac{18453}{7} - \left(\frac{335}{7}\right)^2} = \sqrt{2636.14 - 2290.31} = 18.60$$

$$CV = \frac{S}{\bar{X}} \times 100 = \frac{18.60}{47.86} \times 100 = 38.86\%$$

For Product B:

$$\bar{X} = \frac{\sum X}{n} = \frac{1590}{7} = 227.14$$

$$S = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} = \sqrt{\frac{396250}{7} - \left(\frac{1590}{7}\right)^2} = \sqrt{56607.14 - 51593.88} = 70.80$$

$$CV = \frac{S}{\bar{X}} \times 100 = \frac{70.80}{227.14} \times 100 = 31.17\%$$

Conclusion / Interpretation: The CV of Product A (38.86%) is greater than that of Product B (31.17%), so Product A shows the greater fluctuation in sales.

Question 2

(a) What is the empirical rule?

(b) Evaluate an appropriate measure of variation for the following data. Also find the coefficient of that variation.

Farm size (acre)   No. of farms
Below 40           394
41-80              461
81-120             391
121-160            334
161-200            169
201-240            113
241 and above      148

Solution (a):

Empirical Rule: For a data set with a symmetrical, bell-shaped distribution (normal curve), the percentage of values falling within a specified number of standard deviations of the mean is approximately as follows:

$\bar{X} \pm S$ covers approximately 68% of the values in the data set.
$\bar{X} \pm 2S$ covers approximately 95% of the values in the data set.
$\bar{X} \pm 3S$ covers nearly 100% (99.73%) of the values in the data set.

Solution (b):

Since the frequency distribution has open-ended class intervals at both extremes, the quartile deviation (Q.D.) is an appropriate measure of variation.
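The grouped-data quartile formula used for the farm-size data, Q = l + (h/f)(position - c), can be cross-checked with a short Python sketch. The function name `grouped_quantile` is hypothetical, and the class boundaries (40.5, 80.5, and so on) are an assumption taken from the usual half-unit continuity correction; open-ended classes are represented with `None`.

```python
def grouped_quantile(classes, fraction):
    """Interpolate a quantile from a grouped frequency distribution.

    classes:  list of (lower_boundary, upper_boundary, frequency);
              a boundary of None marks an open-ended class.
    fraction: 0.25 for the first quartile, 0.75 for the third, etc.
    """
    n = sum(freq for _, _, freq in classes)
    position = fraction * n          # e.g. n/4 for Q1
    cumulative = 0                   # cumulative frequency before the class
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            if lower is None or upper is None:
                raise ValueError("quantile falls in an open-ended class")
            width = upper - lower
            return lower + (width / freq) * (position - cumulative)
        cumulative += freq
    raise ValueError("fraction must be between 0 and 1")


# Farm-size data; boundaries assumed as described above.
farms = [
    (None, 40.5, 394),
    (40.5, 80.5, 461),
    (80.5, 120.5, 391),
    (120.5, 160.5, 334),
    (160.5, 200.5, 169),
    (200.5, 240.5, 113),
    (240.5, None, 148),
]

q1 = grouped_quantile(farms, 0.25)
q3 = grouped_quantile(farms, 0.75)
qd = (q3 - q1) / 2                   # quartile deviation
coeff_qd = (q3 - q1) / (q3 + q1)     # coefficient of quartile deviation
print(round(q1, 2), round(q3, 2), round(qd, 2), round(coeff_qd, 3))
```

Because neither quartile falls in an open-ended class, the missing outer boundaries never enter the calculation, which is why Q.D. remains usable for this table.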
The computation of Q.D. is shown below.

Farm size (acre)   Class Boundaries   No. of farms (f)   Cumulative frequency (cf)
Below 40           Below 40.5         394                394
41-80              40.5-80.5          461                855     (Q1 class)
81-120             80.5-120.5         391                1246
121-160            120.5-160.5        334                1580    (Q3 class)
161-200            160.5-200.5        169                1749
201-240            200.5-240.5        113                1862
241 and above      240.5 and above    148                2010
Total                                 2010

First we find the first quartile. Here $n/4 = 2010/4 = 502.5$, so $Q_1$ is the 502.5th value, with $f = 461$, $c = 394$, $h = 40$ and $l = 40.5$:

$$Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - c\right) = 40.5 + \frac{40}{461}(502.5 - 394) = 40.5 + 9.41 = 49.91$$

For the third quartile, $3n/4 = 3(2010)/4 = 1507.5$, so $Q_3$ is the 1507.5th value, with $f = 334$, $c = 1246$, $h = 40$ and $l = 120.5$:

$$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - c\right) = 120.5 + \frac{40}{334}(1507.5 - 1246) = 120.5 + 31.31 = 151.81$$

Thus the quartile deviation is

$$Q.D. = \frac{Q_3 - Q_1}{2} = \frac{151.81 - 49.91}{2} = 50.95$$

and the coefficient of Q.D. is

$$\text{Coefficient of } Q.D. = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{151.81 - 49.91}{151.81 + 49.91} = 0.505$$

Assignment 3

Question 1

(a) Define a set and its properties. Also explain the Venn diagram.

(b) The first four moments of a distribution about the origin are 1, 4, 10, and 46 respectively. Obtain the four moments about the mean. Also calculate the moment ratios.

Solution:

a) Set: A set is any well-defined collection or list of distinct objects, e.g. a group of students, the books in a library, the integers between 1 and 100, all human beings on the earth, etc.

Properties of a set: The following are the main properties of a set:
i) Union
ii) Intersection
iii) Difference

Venn Diagram: A diagram used to represent sets in such a way that the universal set (or sample space) is represented by a rectangle, while its subsets are represented by circles.

b) In the usual notation, with origin $A = 0$, we have

$$\mu'_1 = 1, \quad \mu'_2 = 4, \quad \mu'_3 = 10, \quad \mu'_4 = 46$$

where the first moment about the origin is the mean, $\bar{x} = \mu'_1 = 1$. The first moment about the mean is always zero, $\mu_1 = 0$. The remaining moments about the mean are:

$$\mu_2 = \mu'_2 - (\mu'_1)^2 = 4 - (1)^2 = 3 \quad (\text{variance } \sigma^2)$$

$$S.D. = \sigma = \sqrt{\mu_2} = \sqrt{3} = 1.732$$

$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 10 - 3(4)(1) + 2(1)^3 = 0$$

$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 46 - 4(10)(1) + 6(4)(1)^2 - 3(1)^4 = 27$$

The moment ratios are

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{0^2}{3^3} = 0$$

and

$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{27}{3^2} = \frac{27}{9} = 3$$

Question 2

(a) In simple linear regression analysis, interpret a and b.

(b) A company is introducing a job evaluation scheme in which all jobs are graded by points for skill, responsibility, and so on. Monthly pay scales (Rs. in 1000s) are then drawn up according to the number of points allocated and other factors such as experience and local conditions. To date the company has applied this scheme to 9 jobs:

Job:      A     B     C      D     E     F     G     H     I
Points:   5     25    7      19    10    12    15    28    16
Pay:      3.0   5.0   3.25   6.5   5.5   5.6   6.0   7.2   6.1

(i) Find the least squares line linking pay scales to points.
(ii) Estimate the monthly pay for a job graded at 20 points.
(iii) Calculate the standard error of estimate.

Solution:

a) If $y = a + bx$, then

a = the y-intercept, which represents the average value of the dependent variable y when x = 0;
b = the slope of the regression line, which represents the expected change in the value of y (either positive or negative) for a unit change in the value of x.

b) The required calculations are as follows:

x       y       x^2     y^2        xy
5       3       25      9          15
25      5       625     25         125
7       3.25    49      10.5625    22.75
19      6.5     361     42.25      123.5
10      5.5     100     30.25      55
12      5.6     144     31.36      67.2
15      6       225     36         90
28      7.2     784     51.84      201.6
16      6.1     256     37.21      97.6
Total:
137     48.15   2569    273.4725   797.65

(i)

$$\bar{x} = \frac{\sum x}{n} = \frac{137}{9} = 15.22, \qquad \bar{y} = \frac{\sum y}{n} = \frac{48.15}{9} = 5.35$$

( ) ( )( )( )133