Population

Assignment On

Estimating a Population Mean

Course Title: Research Methodology

Course Code : REM - 312

SUBMITTED TO:

Professor Gouranga Ch. Chanda

Lecturer, Faculty of Business Administration

BGC Trust University Bangladesh

SUBMITTED BY:

Md. Hasan Khan

Id No: 130105152

Sec: A

Semester: 6th

Batch: 5th

Faculty of Business Administration

BGC Trust University, Bangladesh.

Date of Submission: 19/12/2015

Population:

A population is the entire pool from which a statistical sample is drawn. The information obtained from the sample allows statisticians to develop hypotheses about the larger population. Researchers gather information from a sample because of the difficulty of studying the entire population. In statistical equations, the population size is usually denoted with a capital 'N', while the sample size is usually denoted with a lowercase 'n'.

In statistics, population refers to the total set of observations that can be made.

For example, if we are studying the weight of adult women, the population is the set of weights of all adult women in the world. If we are studying the grade point average (GPA) of students at Harvard, the population is the set of GPAs of all the students at Harvard.

Mean:

A mean score is an average score, often denoted by $\bar{X}$. It is the sum of individual scores divided by the number of individuals. The mean is the average of the numbers: a calculated "central" value of a set of numbers. To calculate it, add up all the numbers, then divide by how many numbers there are. Thus, if you have a set of N numbers ($X_1, X_2, X_3, \dots, X_N$), the mean of those numbers is defined as:

$\bar{X} = (X_1 + X_2 + X_3 + \dots + X_N) / N = \left[ \sum_{i=1}^{N} X_i \right] / N$

Example: what is the mean of 2, 7 and 9?

Add the numbers: 2 + 7 + 9 = 18

Divide by how many numbers there are (we added 3 numbers): 18 ÷ 3 = 6

So the mean is 6.

The arithmetic mean (also called the simple average) of a range of values or quantities is computed by dividing the total of all values by the number of values. For example, the mean of 1, 2, 3, 4, and 5 is 15 ÷ 5 = 3. It is the most common and best general-purpose measure of the mid-point (around which all other values cluster) of a set of values, but it is prone to distortion by extreme values and may need to be reported alongside a measure of dispersion (such as the mean deviation or standard deviation).
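The effect of an extreme value on the arithmetic mean can be seen in a minimal Python sketch. The value 100 is a hypothetical outlier chosen only for illustration; it does not come from the text.

```python
from statistics import mean

values = [1, 2, 3, 4, 5]
base_mean = mean(values)              # 15 / 5

# Hypothetical extreme value, added only to illustrate distortion.
with_outlier = values + [100]
distorted_mean = mean(with_outlier)   # 115 / 6

print(f"mean without the extreme value: {base_mean}")
print(f"mean with the extreme value:    {distorted_mean:.2f}")
```

A single extreme value pulls the mean well above five of the six observations, which is why a measure of spread is usually reported with it.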

Types of Mean and Their Formulas:

In mathematics, the average of two or more numbers is called the mean. The most commonly used mean is called the arithmetic mean, in which you add up all the values and divide by the number of values. For example, the arithmetic mean of 2, 5, and 14 is (2+5+14)/3 = 7. However, there is more than one kind of mean, including the geometric mean, harmonic mean, root mean square, and several others.

The essential property of any mean is that it must fall between the highest value and the lowest value. When computing means, the type of mean you need to use depends on the type of data you are analyzing. Several formulas and examples are discussed below.

Geometric Mean:

The geometric mean of two numbers x and y is $\sqrt{xy}$. If you have three numbers x, y, and z, their geometric mean is $\sqrt[3]{xyz}$, the cube root of their product. For n numbers, the geometric mean is

$(x_1 \cdot x_2 \cdots x_n)^{1/n}$

Business analysts and scientists use the geometric mean to find the average growth rate of a process. For example, suppose a business's profits grow by 25% one year, and by 45.8% the next year. To find the average yearly percent growth rate, you must take the geometric mean of 1.25 and 1.458.

$\sqrt{(1.25)(1.458)} = \sqrt{1.8225} = 1.35$

Thus, the average growth rate over the two years was 35%. Compare this to the result you would get if you took the arithmetic mean of 25 and 45.8. Since (25+45.8)/2 = 35.4, you would get an answer that is too high.
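The growth-rate calculation can be checked with a short Python sketch. The helper name `geometric_mean` is an illustrative choice, not something defined in the text.

```python
import math

def geometric_mean(values):
    """Return the n-th root of the product of n positive numbers."""
    return math.prod(values) ** (1 / len(values))

# Growth of 25% and 45.8% expressed as growth factors.
growth_factors = [1.25, 1.458]
avg_factor = geometric_mean(growth_factors)

# Two years at the average factor reproduce the actual two-year growth.
two_year_growth = avg_factor ** 2

print(f"average growth factor: {avg_factor:.4f}")
print(f"two-year growth check: {two_year_growth:.4f}")
```

Compounding the average factor over two years gives back the true combined growth of 1.8225, which the arithmetic mean of the two rates would overshoot.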

Harmonic Mean:

In science and business applications, the harmonic mean is used to average ratios and rates. For two numbers x and y, the harmonic mean is $2xy/(x+y)$. For three numbers x, y, and z, the harmonic mean is $3xyz/(xy+xz+yz)$. For n numbers, the harmonic mean is

$n / (1/x_1 + 1/x_2 + \dots + 1/x_n)$

There are many instances where the harmonic mean is the most appropriate way to find the average value.

For example, suppose a man drives at a speed of 80 km/h for 100 kilometers (1.25 hours), and then drives at a speed of 40 km/h for the next 100 km (2.5 hours). The average speed of the car for the entire 200 km trip is total distance divided by total time. Since 200/(1.25+2.5) = 53.33, the average speed is 53.33 km/h. Because the two legs cover equal distances, this is equivalent to the harmonic mean of 80 and 40. Observe:

$2(80)(40)/(80+40) = 6400/120 = 53.33$

In business, investors use the harmonic mean to compute the average price/earnings ratio of a stock portfolio. For example, suppose you have three stocks, and their P/E ratios are 8, 18, and 30. The average P/E ratio of the three stocks is

$3(8)(18)(30)/(144+240+540) = 12960/924 = 14.026$
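Both harmonic-mean examples follow from the general n-number formula, as this minimal Python sketch shows. The function name `harmonic_mean` is an illustrative choice; Python's standard library also provides `statistics.harmonic_mean`.

```python
def harmonic_mean(values):
    """Return n divided by the sum of reciprocals of n positive numbers."""
    return len(values) / sum(1 / v for v in values)

# Two equal-distance legs driven at 80 km/h and 40 km/h.
avg_speed = harmonic_mean([80, 40])

# Total distance over total time, computed directly for comparison.
direct_speed = 200 / (100 / 80 + 100 / 40)

# P/E ratios of the three stocks in the example.
avg_pe = harmonic_mean([8, 18, 30])

print(f"average speed: {avg_speed:.2f} km/h (direct: {direct_speed:.2f})")
print(f"average P/E:   {avg_pe:.3f}")
```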

Root Mean Square (Quadratic Mean):

The root mean square, also known as the quadratic mean, is used in many engineering and statistical applications, especially when there are data points that can be negative. The standard deviation of a set of numbers is an example of the root mean square. (It is the root mean square of the differences between each data point and the arithmetic mean.) If you have two numbers x and y, the quadratic mean is $\sqrt{(x^2 + y^2)/2}$. For n variables, it is

$\sqrt{(x_1^2 + x_2^2 + \dots + x_n^2)/n}$

For example, suppose you have this set of numbers: -10, -5, -4, 1, 6, 7. The root mean square is

$\sqrt{(100+25+16+1+36+49)/6} = \sqrt{227/6} = 6.15$

which can be interpreted as the typical magnitude of the values, regardless of sign.
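A minimal Python sketch of the root mean square for the same data set; the function name `root_mean_square` is an illustrative choice.

```python
import math

def root_mean_square(values):
    """Return the square root of the mean of the squared values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

data = [-10, -5, -4, 1, 6, 7]
rms = root_mean_square(data)

# The arithmetic mean of the same data, for contrast: signs cancel here.
arithmetic = sum(data) / len(data)

print(f"root mean square: {rms:.2f}")
print(f"arithmetic mean:  {arithmetic:.2f}")
```

Because every value is squared before averaging, negative and positive values contribute equally, unlike in the arithmetic mean.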

Contraharmonic Mean:

The contraharmonic mean of x and y is $(x^2 + y^2)/(x + y)$. For n values, the contraharmonic mean is

$(x_1^2 + x_2^2 + \dots + x_n^2)/(x_1 + x_2 + \dots + x_n)$

For example, the contraharmonic mean of 1, 3, 5, and 7 is

$(1+9+25+49)/(1+3+5+7) = 84/16 = 5.25$

Other Means:

$[(x^p + y^p)/2]^{1/p}$ (Power Mean)

$[(x^p - y^p)/(p(x - y))]^{1/(p-1)}$ (Stolarsky Mean)

$\sqrt{(x^2 + xy + y^2)/3}$ (Stolarsky Mean when p = 3)

$(x^p + y^p)/(x^{p-1} + y^{p-1})$ (Lehmer Mean)

$[(x^p + y^p)/(x^r + y^r)]^{1/(p-r)}$

$[(r(x^p - y^p))/(p(x^r - y^r))]^{1/(p-r)}$

$[(x^p y^r + x^r y^p)/2]^{1/(p+r)}$

$(x - y)/(\ln x - \ln y)$ (Log Mean)

$(x \ln x + y \ln y)/(\ln x + \ln y)$

$(x + \sqrt{xy} + y)/3$ (Heronian Mean)

$(1/e)(x^x/y^y)^{1/(x-y)}$, e = 2.718281828... (Identric Mean)

$e\,(x^y/y^x)^{1/(y-x)}$

$(x^x y^y)^{1/(x+y)}$

$(x^y y^x)^{1/(x+y)}$

Mean Inequalities:

Some means are in a constant relationship to one another. If we denote the arithmetic mean of two positive numbers x and y by A, their geometric mean by G, their harmonic mean by H, their root mean square by R, and their contraharmonic mean by C, then the following chain of inequalities is always true:

C ≥ R ≥ A ≥ G ≥ H.
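The chain can be checked numerically for any pair of positive numbers. The sketch below uses the two-number formulas given earlier; the pair (2, 8) and the function name `two_number_means` are arbitrary choices for illustration.

```python
import math

def two_number_means(x, y):
    """Return the five means of two positive numbers, keyed by letter."""
    return {
        "C": (x**2 + y**2) / (x + y),        # contraharmonic
        "R": math.sqrt((x**2 + y**2) / 2),   # root mean square
        "A": (x + y) / 2,                    # arithmetic
        "G": math.sqrt(x * y),               # geometric
        "H": 2 * x * y / (x + y),            # harmonic
    }

m = two_number_means(2, 8)
for name in "CRAGH":
    print(f"{name} = {m[name]:.3f}")

chain_holds = m["C"] >= m["R"] >= m["A"] >= m["G"] >= m["H"]
print("C >= R >= A >= G >= H:", chain_holds)
```

When x equals y, all five means coincide and every inequality becomes an equality.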

Estimating a population mean:

• We use $\bar{y}$ as an estimator of μ. Is it a 'good' estimator?

• An estimator is 'good' if:

– It is unbiased

– It has small standard error.

• An estimator is unbiased if the mean of its sampling distribution equals the parameter we are trying to estimate.

– $\bar{y}$ is unbiased for μ because $E(\bar{y}) = \mu_{\bar{y}} = \mu$.

• In English: if we were to draw 100 samples of size n from some population with mean μ, and were to compute $\bar{y}$ in each of the 100 samples, the average of those 100 $\bar{y}$ values would be close to μ.
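The repeated-sampling idea can be simulated in a few lines of Python. This is a sketch under assumptions that are not in the text: a normal population with μ = 50 and σ = 10, samples of size n = 25, and a fixed random seed so the run is repeatable.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population parameters and sampling plan.
MU, SIGMA = 50.0, 10.0
N, SAMPLES = 25, 100

# Draw 100 samples of size n and compute the sample mean of each.
sample_means = [
    mean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(SAMPLES)
]

avg_of_means = mean(sample_means)
print(f"population mean:           {MU}")
print(f"average of 100 sample means: {avg_of_means:.2f}")
```

Individual sample means scatter around μ, but their average lands close to it, which is what unbiasedness means in practice.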

Population mean (cont’d):

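• Recall that if $y \sim (\mu, \sigma^2)$, then the sampling distribution of $\bar{y}$ is $N(\mu, \sigma^2/n)$ (exactly when y is itself normal, and approximately for large n otherwise).

• As n increases, $\sigma^2/n$ decreases: the larger the sample, the more reliable $\bar{y}$ will be as an estimator of μ.

• The parameter $\sigma/\sqrt{n}$ is called the standard error of the mean and is estimated by $S/\sqrt{n}$.

• If $\bar{y} \sim N(\mu, \sigma^2/n)$, then $\mathrm{Prob}(\bar{y} - 2\sigma/\sqrt{n} < \mu < \bar{y} + 2\sigma/\sqrt{n}) \approx 0.95$ (exactly equal to 0.95 if we use 1.96 instead of 2).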

Confidence intervals:

• Since $\bar{y}$ will fall within $\pm 2\sigma/\sqrt{n}$ of the population mean μ approximately 95% of the time, the interval from $\bar{y} - 2\sigma/\sqrt{n}$ to $\bar{y} + 2\sigma/\sqrt{n}$ will cover μ about 95% of the time in repeated sampling.

• A $100(1-\alpha)\%$ confidence interval for μ is $\bar{y} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, where $z_{\alpha/2}$ is the z value with an area equal to $\alpha/2$ to its right.
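The phrase "in repeated sampling" can be made concrete with a small simulation. This is a sketch with assumed values that are not in the text (normal population, μ = 50, known σ = 10, n = 25, 2000 repetitions, fixed seed); it counts how often the interval built with 1.96 covers μ.

```python
import math
import random
from statistics import mean

random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical population and sampling plan.
MU, SIGMA = 50.0, 10.0
N, REPS = 25, 2000

# Half-width of the 95% interval when sigma is known.
half_width = 1.96 * SIGMA / math.sqrt(N)

covered = 0
for _ in range(REPS):
    ybar = mean([random.gauss(MU, SIGMA) for _ in range(N)])
    if ybar - half_width < MU < ybar + half_width:
        covered += 1

coverage = covered / REPS
print(f"fraction of intervals covering mu: {coverage:.3f}")
```

The observed fraction should sit near 0.95; it is the procedure, not any single interval, that carries the 95% guarantee.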


• We can construct confidence intervals with any confidence coefficient $(1 - \alpha)$.

• For a 90% confidence interval, use 1.64 instead of 2 (or 1.96 to be precise), because:

1. $\alpha = 0.10$ and $\alpha/2 = 0.05$

2. The z-value with 0.05 to its right (or 0.95 to its left) is 1.64 from the standard normal table.

• For a 99% confidence interval, use 2.58 (or $z_{0.005}$) instead of 2.

• Note: the wider the interval, the higher the confidence that it will cover μ. Thus, for the same data, a 99% confidence interval for μ will always be wider than a 90% interval.
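Instead of reading a table, the critical z-values can be computed with Python's standard library. The sketch below uses `statistics.NormalDist().inv_cdf`; the helper name `z_critical` is an illustrative choice. The table values quoted above are these numbers rounded to two decimals.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return the z-value with area alpha/2 to its right."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

z_values = {level: z_critical(level) for level in (0.90, 0.95, 0.99)}

for level, z in z_values.items():
    print(f"{level:.0%} confidence: z = {z:.3f}")
```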

Confidence intervals (cont’d)

• Example: Attention times given by parents to sets of twin boys during one week (Table 1.9, page 36).

• n = 50, $\bar{y} = 20.85$ and S = 13.41.

• A 90% CI for the true mean attention time μ is $\bar{y} \pm 1.64\,S/\sqrt{n} = 20.85 \pm 1.64 \times 13.41/\sqrt{50} \approx 20.85 \pm 3.11$.

• 95% CI: $\bar{y} \pm 2\,S/\sqrt{n} = 20.85 \pm 2 \times 13.41/\sqrt{50} \approx 20.85 \pm 3.80$.

• 99% CI: $\bar{y} \pm 2.57\,S/\sqrt{n} = 20.85 \pm 2.57 \times 13.41/\sqrt{50} \approx 20.85 \pm 4.88$.

• Note that we used the sample standard deviation S in place of the unknown population standard deviation $\sigma$ to compute the CI.

• This is OK only if n is large enough (more than 30).

• If $\sigma$ is unknown (as it usually is) and n < 30, we compute the CI using $t_{\alpha/2}$ instead of $z_{\alpha/2}$ (Student's t-table instead of the z-table).

• The value $t_{\alpha/2}$ is the upper-tail t-value such that an area equal to $\alpha/2$ lies to its right.
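The three large-sample intervals for the attention-time example can be reproduced with a short Python sketch. It uses the summary statistics and z-values quoted in the example; the variable names are illustrative. The margins agree with the quoted figures to within rounding.

```python
import math

# Summary statistics from the attention-time example.
n, ybar, s = 50, 20.85, 13.41

# Estimated standard error of the mean, S / sqrt(n).
standard_error = s / math.sqrt(n)

# z-values as used in the example (2 stands in for 1.96).
z_by_level = {"90%": 1.64, "95%": 2.0, "99%": 2.57}

margins = {level: z * standard_error for level, z in z_by_level.items()}

for level, margin in margins.items():
    low, high = ybar - margin, ybar + margin
    print(f"{level} CI: {ybar} +/- {margin:.2f}  ->  ({low:.2f}, {high:.2f})")
```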

• To get the appropriate value out of a t-table we need:

1. The degrees of freedom, which is n − 1 in this type of application.

2. The desired confidence coefficient $(1 - \alpha)$.

• For small n, a $100(1 - \alpha)\%$ CI for μ is $\bar{y} \pm t_{\alpha/2,\,n-1}\,S/\sqrt{n}$.

• Example: silica concentration in saline water.

• n = 5 (small!), $\bar{y} = 239.2$, S = 29.3 and df = 4.

• For a 95% CI for the true silica concentration: $t_{\alpha/2,\,4} = t_{0.05/2,\,4} = 2.776$.

• Then, the 95% CI for μ is $239.2 \pm 2.776 \times 29.3/\sqrt{5} = 239.2 \pm 36.4$.

• If we had wished to obtain a 90% or a 99% CI for the mean, then the corresponding t-values (from the table) would have been 2.132 and 4.604, respectively (see Table C.2).
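The small-sample intervals for the silica example can be computed as follows. Python's standard library has no t-distribution quantile function, so this sketch takes the three t-values for 4 degrees of freedom directly from the text; the variable names are illustrative.

```python
import math

# Summary statistics from the silica example.
n, ybar, s = 5, 239.2, 29.3
standard_error = s / math.sqrt(n)

# Upper-tail t-values for df = n - 1 = 4, as quoted from the t-table.
t_by_level = {"90%": 2.132, "95%": 2.776, "99%": 4.604}

t_margins = {level: t * standard_error for level, t in t_by_level.items()}

for level, margin in t_margins.items():
    print(f"{level} CI: {ybar} +/- {margin:.1f}")
```

With only five observations, the 99% margin is more than twice the 90% margin, reflecting the heavy tails of the t-distribution at low degrees of freedom.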

Confidence intervals (cont’d)

• Continuing with the same example, we now ask: what sample size would we have needed if we wished to estimate the true mean silica concentration to within 10 ppm with 95% confidence?

• We wish to know what n we would need in order to be able to state that $\mathrm{Prob}(\bar{y} - 10 < \mu < \bar{y} + 10) = 0.95$.

• This means that $t_{0.025,\,n-1}\,S/\sqrt{n} = 10$.

• From the expression above, we need to solve for n.

• We know that the desired n must be larger than 5 because with n = 5 we estimated μ to within 36.4 ppm with 95% confidence.

• We need the following in order to come up with an answer:

– Assume that S would not change with increased n

– Approximate a value of $t_{0.025,\,n-1}$ to be about 2
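Under the two assumptions just listed, the equation $t\,S/\sqrt{n} = 10$ rearranges to $n = (t\,S/10)^2$. A minimal Python sketch of that calculation follows, using S = 29.3 from the silica example and t ≈ 2; the function name `required_sample_size` is an illustrative choice, and the result is only as good as the two approximations.

```python
import math

def required_sample_size(s, margin, t_approx=2.0):
    """Smallest n with t_approx * s / sqrt(n) <= margin."""
    return math.ceil((t_approx * s / margin) ** 2)

# S from the silica example, target margin of 10 ppm.
required_n = required_sample_size(s=29.3, margin=10)

# Margin actually achieved at that n under the same assumptions.
achieved_margin = 2.0 * 29.3 / math.sqrt(required_n)

print(f"required sample size: {required_n}")
print(f"margin at that n:     {achieved_margin:.2f} ppm")
```

Rounding up rather than to the nearest integer ensures the achieved margin does not exceed the 10 ppm target.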
