
Course B1129 - Statistics for Management


8/6/2019 Cource B1129- Statistics for Management

http://slidepdf.com/reader/full/cource-b1129-statistics-for-management 1/18

Course B1129 - Statistics for Management, Sem-1 Assessment 1

Question 1.

(a) What is the difference between a qualitative and a quantitative variable? [5 Marks]

(b) A town has 15 neighbourhoods. If you interviewed everyone living in one particular neighbourhood, would you be interviewing a population or a sample from the town? Would this be a random sample? If you had a list of everyone living in the town, called a frame, and you randomly selected 100 people from all neighbourhoods, would this be a random sample? [5 Marks]

Answer 1:

1. Qualitative data deals with meanings, while quantitative data deals with numbers. Qualitative data describes properties or characteristics that are used to identify things. Quantitative data describes things in terms of quantity, using a numerical figure accompanied by a unit of measurement.

By definition, something that is qualitative concerns or describes a quality. A qualitative variable is descriptive. Qualitative variables are sometimes referred to as categorical. The variable may be the colors in the light spectrum or a comparison between red and green grapes.

Qualitative variables can influence the outcome of an experiment or research because they can influence other factors or parameters. Qualitative variables are frequently used in social research. Qualitative research is considered to be inductive.

By definition, something that is quantitative can be expressed as a quantity or number. Quantitative variables can be measured and are numerical. A quantitative variable can be a percentage of something, a number of units or any other measurement.

Temperature is a quantitative variable, measured by the number of degrees. Speed, area, population, voltage and time are all examples of quantitative variables that can be measured. Quantitative research is most often considered to be deductive in nature.

An example of quantitative variables in an experiment would be testing the change in speed of a turntable as additional weight is applied. The turntable itself is the controlled variable: the experimenter will only use one. The independent quantitative variable is the amount of weight applied for each measurement. The dependent quantitative variable is the resulting speed that is measured.


An example of a qualitative variable in testing would be the drying time required for red and green grapes at a constant temperature. The outcome, time, is measured and is therefore quantitative. The controlled variable, temperature, is also quantitative. The independent variable is qualitative: the difference between red and green grapes. In this particular example the weight of each grape, a quantitative variable, would also need to be consistent or controlled.

1 B) Question:

A town has 15 neighbourhoods. If you interviewed everyone living in one particular neighbourhood, would you be interviewing a population or a sample from the town? Would this be a random sample? If you had a list of everyone living in the town, called a frame, and you randomly selected 100 people from all neighbourhoods, would this be a random sample?

1 B) Answer:

A town has 15 neighbourhoods. If we interviewed everyone living in one particular neighbourhood, we would be interviewing a sample from the town, not the population. This is not a random sample of the town, because people living in the other 14 neighbourhoods have no chance of being selected.

If we have a list of everyone living in the town, called a frame, and we randomly select 100 people from all neighbourhoods, this would be a random sample.

Explanation:

When collecting data on a large group of people (called a "population"), you might want to minimize the impact that the survey will have on the group that you are surveying. It is often not necessary to survey the entire population. Instead, you can select a random sample of people from the population and survey just them. You can then draw conclusions about how the entire population would respond based on the responses from this randomly selected group of people. This is exactly what political pollsters do: they ask a group of people a list of questions and, based on the results, draw conclusions about the population as a whole, with those often-heard disclaimers of "plus or minus 5%."
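The difference between the two sampling approaches in Question 1 (b) can be sketched in a few lines of Python. This is only an illustration: the frame below is made up (15 neighbourhoods with 200 residents each, an assumption not taken from the question), and the names `frame`, `random_sample` and `one_neighbourhood` are hypothetical.

```python
import random

# Hypothetical frame: one (person_id, neighbourhood) entry per resident.
# 15 neighbourhoods with 200 residents each is an assumption for illustration.
frame = [(f"person_{n}_{i}", n) for n in range(1, 16) for i in range(200)]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simple random sample: every resident in the frame has the same chance.
random_sample = rng.sample(frame, 100)

# Interviewing one whole neighbourhood: a sample, but not a random one,
# because residents elsewhere can never be chosen.
one_neighbourhood = [person for person in frame if person[1] == 1]

# Sample size and number of distinct neighbourhoods represented in each case.
print(len(random_sample), len({n for _, n in random_sample}))
print(len(one_neighbourhood), len({n for _, n in one_neighbourhood}))
```

The random sample draws from the whole frame, while the single-neighbourhood group can only ever represent one of the 15 neighbourhoods.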


Question 2.

a) Explain the steps involved in planning a statistical survey.

b) What are the merits and demerits of direct personal observation and indirect oral interview?

 Answer 2a)

The accuracy of data obtained in a survey depends upon the care exercised in planning. A properly planned investigation can lead to the best results with the least cost and time.

The steps are defined below.

1) The nature of the problem to be investigated should be properly defined in an unambiguous manner.

2) The objective of the investigation should be stated at the outset. The objective could be to:

Obtain certain estimates.

Establish a theory.

Verify an existing statement.


Find relationships between characteristics.

3) The scope of the investigation has to be made clear. The scope of the investigation refers to the area to be covered, identification of the units to be studied, the nature of the characteristics to be observed, the accuracy of measurement, analytical methods, and the time, cost and other resources required.

4) Whether to use data collected from a primary or a secondary source should be determined in advance.

5) The organization of the investigation is the final step in the process. It encompasses the determination of the number of investigators required, their training, the supervision work needed, and the funds required.

 Answer 2b)

Direct personal observation is when the investigator collects the data by having direct contact with the units of investigation.

The direct personal method is suitable where the scope of the investigation is narrow, the investigation is confidential and requires the personal attention of the investigator, and the accuracy of the data is important.

However, direct personal observation also has some drawbacks.

Merits and demerits of direct personal observation:

| No. | Merits | Demerits |
|---|---|---|
| 1 | We get the original data, which is more accurate and reliable. | This method costs more. |
| 2 | Satisfactory information can be extracted by the investigator. | This method consumes more time. |
| 3 | Data is homogeneous and comparable. | This method cannot be used when the scope of the work is wide. |
| 4 | Additional information can be gathered. | Most of the data collected by this method is maintained as confidential. Hence, there is a chance of leakage of data. |
| 5 | Misinterpretation of questions can be avoided. | |


Question 3A) Draw ogives from the following data, measure the median value, and verify it by actual calculation.

| Central size | 5 | 15 | 25 | 35 | 45 |
|---|---|---|---|---|---|
| Frequency | 5 | 11 | 21 | 16 | 10 |

 Answer 3
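A minimal sketch of the calculation the question asks for, using the data of Question 3A. It assumes that each central size is the midpoint of a class of width 10 (so the classes are 0-10, 10-20, and so on) and applies the usual grouped-data interpolation formula, median = L + ((N/2 - cf) / f) × h. The variable names are illustrative only.

```python
# Data from Question 3A.
central_sizes = [5, 15, 25, 35, 45]
frequencies = [5, 11, 21, 16, 10]

# Assumption: each central size is the midpoint of a class of width 10.
width = 10
lower_bounds = [c - width / 2 for c in central_sizes]

# "Less than" cumulative frequencies, the points plotted on the ogive.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

total = cumulative[-1]
half = total / 2

# The median class is the first class whose cumulative frequency reaches N/2.
m = next(i for i, cf in enumerate(cumulative) if cf >= half)
cf_before = cumulative[m - 1] if m > 0 else 0

# Median = L + ((N/2 - cf) / f) * h
median = lower_bounds[m] + (half - cf_before) / frequencies[m] * width

print(cumulative)
print(round(median, 2))
```

Under these assumptions N = 63, N/2 = 31.5, the median class is 20-30, and the median is 20 + (15.5 / 21) × 10, about 27.38. On the ogive this is the central size at which the "less than" curve reaches a cumulative frequency of 31.5.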

Question 4.

a) What is the main difference between correlation analysis and regression analysis?

b) In a multiple regression model with 12 independent variables, what are the degrees of freedom for error? Explain.

 Answer 4 a)

Correlation analysis attempts to study the relationship between the two variables 'x' and 'y'. Regression analysis attempts to predict the average 'x' for a given 'y'.

[Figure for Answer 3: ogive chart; horizontal axis 0 to 50, vertical axis 0 to 70; one data series (Series1) built from the frequencies 5, 11, 21, 16, 10]


Correlation analysis deals with the following:

Measuring the relationship between the variables.

Testing the relationship for its significance.

Giving a confidence interval for the population correlation measure.

Regression analysis deals with the following:

Estimating the values of the dependent variable from the values of the independent variables.

Getting a measure of the error involved while using the regression line as a basis for estimation.

The regression coefficients can be used to calculate the correlation coefficient.

Correlation quantifies the degree to which two variables are related. Correlation does not find a best-fit line (that is regression). You simply compute a correlation coefficient (r) that tells you how much one variable tends to change when the other one does.

With correlation you don't have to think about cause and effect. You simply quantify how well two variables relate to each other. With regression, you do have to think about cause and effect, as the regression line is determined as the best way to predict Y from X.

With correlation, it doesn't matter which of the two variables you call "X" and which you call "Y". You'll get the same correlation coefficient if you swap the two. With linear regression, the decision of which variable you call "X" and which you call "Y" matters a lot, as you'll get a different best-fit line if you swap the two. The line that best predicts Y from X is not the same as the line that predicts X from Y.
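The symmetry of correlation and the asymmetry of regression can be checked numerically. The sketch below uses made-up data and two small helper functions (`pearson_r` and `slope`, names chosen here for illustration) written with the standard least-squares formulas. It also illustrates the remark that the regression coefficients can be used to obtain the correlation coefficient, since the product of the two slopes equals r squared.

```python
from math import sqrt

# Made-up illustrative data (not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def mean(values):
    return sum(values) / len(values)


def pearson_r(a, b):
    """Correlation coefficient; symmetric in its two arguments."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def slope(predictor, response):
    """Least-squares slope of the line predicting response from predictor."""
    mp, mr = mean(predictor), mean(response)
    spr = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    spp = sum((p - mp) ** 2 for p in predictor)
    return spr / spp


r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)
b_yx = slope(x, y)  # slope of the line predicting y from x
b_xy = slope(y, x)  # slope of the line predicting x from y

print(r_xy, r_yx)              # identical: swapping the variables changes nothing
print(b_yx, b_xy)              # different: each line answers a different question
print(b_yx * b_xy, r_xy ** 2)  # product of the two slopes equals r squared
```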

Correlation is almost always used when you measure both variables. It is rarely appropriate when one variable is something you experimentally manipulate. With linear regression, the X variable is often something you experimentally manipulate (time, concentration...) and the Y variable is something you measure.

The difference between the correlation coefficient and the regression coefficient.

Answer 4 b)

The degrees of freedom for error in a multiple regression equal N - k - 1, where k is the number of independent variables.


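N is the number of data points.

The more variables you add, the more you erode your ability to test the model.

With 12 independent variables, the degrees of freedom for error are N - 12 - 1 = N - 13: one degree of freedom is used for each of the 12 slope coefficients and one for the intercept.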
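The N - k - 1 rule can be written as a small helper. This is a sketch only: the function name and the sample size of 50 observations are assumptions for illustration, and the rule presumes the model includes an intercept.

```python
def error_degrees_of_freedom(n_observations: int, n_predictors: int) -> int:
    """Residual degrees of freedom, N - k - 1, for a model with an intercept."""
    df = n_observations - n_predictors - 1
    if df <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return df


# Hypothetical sample of 50 observations with the 12 predictors from the question.
print(error_degrees_of_freedom(50, 12))  # 37
```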

Question 5. a) Discuss what is meant by Quality control and quality improvement.

 Answer 5 A)

Quality control:

Statistical quality control refers to the use of statistical methods in the monitoring and maintaining of the quality of products and services. One method, referred to as acceptance sampling, can be used when a decision must be made to accept or reject a group of parts or items based on the quality found in a sample. A second method, referred to as statistical process control, uses graphical...

Quality control refers to the systematic use of methods to ensure that a service or product conforms to a desired standard. Primary emphasis is placed on monitoring processes and/or outcomes. Quality improvement refers to the betterment or enhancement of a product or service. When enhancements are ongoing or occur repeatedly over time, the process is known as continuous quality improvement.

Quality improvement:

Complete control and improvement of any process are achieved by accurate measurements at critical points within the process. In order to gain confidence, the numbers need to be generated often and at various points so that all the variations of the process are detected. The quantity of measurements accumulates over time, and simple tables or listings of these numbers are not enough to evaluate the process. The following statistical tools are used to understand what the numbers mean. The numbers from measurements that represent something in common, rather than a scattering of unrelated numbers, are called a set. When measuring properties of the process that are different, for example, gradation, crush count, or chert count, each property requires a set of


Quality management is the act of overseeing all activities and tasks needed to maintain a desired level of excellence. This includes creating and implementing quality planning and assurance, as well as quality control and quality improvement. It is also referred to as total quality management (TQM).

A major aspect of quality control is the establishment of well-defined controls. These controls help standardize both production and reactions to quality issues. Limiting room for error by specifying which production activities are to be completed by which personnel reduces the chance that employees will be involved in tasks for which they do not have adequate training.


Question 5 b) What are the limitations of quality control charts?

Answer 5 b)

A control chart is a popular statistical tool for monitoring the quality of goods and services, and for detecting when the process goes "out of control" as early as possible. Samples from the process are taken at regular time intervals, and their quality is measured. Control charts are used to track the sample quality over time and detect any unusual behavior.

A control chart is a statistical tool used in quality control to analyze and understand process variables, determine process capabilities, and monitor the effects of the variables on the difference between target and actual performance. Control charts indicate upper and lower control limits, and often include a central (average) line, to help detect trends in the plotted values. If all data points are within the control limits, variations in the values may be due to a common cause and the process is said to be "in control". If data points fall outside the control limits, variations may be due to a special cause and the process is said to be "out of control".
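The idea of a centre line with upper and lower control limits can be sketched as follows. The measurements are made up, and the limits are set at the centre line plus or minus three sample standard deviations of the baseline readings. That estimator is an assumption chosen for brevity; practical charts often estimate the spread from subgroup ranges or moving ranges instead.

```python
from statistics import mean, stdev

# Made-up baseline measurements from a period assumed to be in control.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 10.0, 9.9, 10.0, 10.0]

center = mean(baseline)
sigma = stdev(baseline)   # sample standard deviation; an assumed estimator
ucl = center + 3 * sigma  # upper control limit
lcl = center - 3 * sigma  # lower control limit

# New observations to monitor against the baseline limits.
new_points = [10.1, 9.7, 10.9, 10.0]
out_of_control = [x for x in new_points if x < lcl or x > ucl]

print(round(center, 3), round(lcl, 3), round(ucl, 3))
print(out_of_control)
```

With these numbers only the reading of 10.9 falls outside the limits, so it is the one point that would be investigated for a special cause.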

Traditional statistical tools are subject to certain constraints when they are applied to quality control in industries where the number of faults per working day is limited. An effective quality monitoring and analysis tool is therefore needed to meet the specific requirements of these industrial sectors. One proposal is a so-called "Cause-classified Control Chart".

Types of charts

| Chart | Process observation | Process observations relationships | Process observations type | Size of shift to detect |
|---|---|---|---|---|
| x̄ and R chart | Quality characteristic measurement within one subgroup | Independent | Variables | Large (≥ 1.5σ) |
| x̄ and s chart | Quality characteristic measurement within one subgroup | Independent | Variables | Large (≥ 1.5σ) |
| Shewhart individuals control chart (ImR chart or XmR chart) | Quality characteristic measurement for one observation | Independent | Variables | Large (≥ 1.5σ) |
| Three-way chart | Quality characteristic measurement within one subgroup | Independent | Variables | Large (≥ 1.5σ) |
| p-chart | Fraction nonconforming within one subgroup | Independent | Attributes | Large (≥ 1.5σ) |
| np-chart | Number nonconforming within one subgroup | Independent | Attributes | Large (≥ 1.5σ) |
| c-chart | Number of nonconformances within one subgroup | Independent | Attributes | Large (≥ 1.5σ) |
| u-chart | Nonconformances per unit within one subgroup | Independent | Attributes | Large (≥ 1.5σ) |
| EWMA chart | Exponentially weighted moving average of quality characteristic measurement within one subgroup | Independent | Attributes or variables | Small (< 1.5σ) |
| CUSUM chart | Cumulative sum of quality characteristic measurement within one subgroup | Independent | Attributes or variables | Small (< 1.5σ) |
| Time series model | Quality characteristic measurement within one subgroup | Autocorrelated | Attributes or variables | N/A |
| Regression control chart | Quality characteristic measurement within one subgroup | Dependent of process control variables | Variables | Large (≥ 1.5σ) |
| Real-time contrasts chart | Sliding window of quality characteristic measurement within one subgroup | Independent | Attributes or variables | Small (< 1.5σ) |

Limitations of control charts:

Several authors have criticised the control chart on the following grounds.

1) It violates the likelihood principle. However, the principle is itself controversial, and supporters of control charts further argue that, in general, it is impossible to specify a likelihood function for a process not in statistical control, especially where knowledge about the cause system of the process is weak.

2) Some authors have criticised the use of average run lengths (ARLs) for comparing control chart performance, because the run length usually follows a geometric distribution, which has high variability, so the average alone can be misleading.

3) Some authors have criticised the fact that most control charts focus on numeric data. Nowadays, process data can be much more complex, e.g. non-Gaussian, a mix of numerical and categorical values, or with missing values.

4) Critics argue that control charts should not be used when their underlying assumptions are violated, such as when process data is neither normally distributed nor binomially (or Poisson) distributed. Such processes are not in control and should be improved before the application of control charts. Additionally, application of the charts in the presence of such deviations increases the type I and type II error rates of the control charts, and may make the chart of little practical use.
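The criticism of average run lengths in point 2 can be made concrete with a short calculation. The sketch assumes independent, normally distributed points and 3-sigma limits (both assumptions, not stated in the answer). Under them the number of points until the first false alarm is geometric, so its standard deviation is almost as large as its mean.

```python
from math import erfc, sqrt

# Assumptions: independent, normally distributed points and 3-sigma limits.
k = 3.0

# Probability that a single in-control point falls outside the limits.
p = erfc(k / sqrt(2))

# The run length until the first false alarm is geometric with parameter p.
arl = 1 / p                      # mean of the geometric distribution
run_length_sd = sqrt(1 - p) / p  # its standard deviation

print(round(p, 5), round(arl, 1), round(run_length_sd, 1))
```

An in-control ARL of roughly 370 points therefore comes with a standard deviation of nearly the same size, which is why a single average says little about how an individual run will behave.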

Question 6. a) Suggest a more suitable average in each of the following cases:


(i) Average size of ready-made garments.

(ii) Average marks of a student.

b) State the nature of symmetry in the following cases:

(i) When the median is greater than the mean, and

(ii) When the mean is greater than the median.

 Answer 6 a)

Average size of ready-made garments: mode (the most commonly occurring size)

Average marks: mean

Answer 6 B)

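State the nature of symmetry in the following cases:

(i) When the median is greater than the mean, the distribution is skewed to the left (negatively skewed).

(ii) When the mean is greater than the median, the distribution is skewed to the right (positively skewed).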

Consider the following data set:

4 ; 5 ; 6 ; 6 ; 6 ; 7 ; 7 ; 7 ; 7 ; 7 ; 7 ; 8 ; 8 ; 8 ; 9 ; 10

This data produces the histogram shown below. Each interval has width one and each value is located in the middle of an interval.


The histogram displays a symmetrical distribution of data. A distribution is symmetrical if a vertical line can be drawn at some point in the histogram such that the shape to the left and the right of the vertical line are mirror images of each other. The mean, the median, and the mode are each 7 for these data. In a perfectly symmetrical distribution, the mean and the median are the same, and in a unimodal one the mode coincides with them.

The histogram for the data:

4 ; 5 ; 6 ; 6 ; 6 ; 7 ; 7 ; 7 ; 7 ; 8

is not symmetrical. The right-hand side seems "chopped off" compared to the left side. The shape of the distribution is called skewed to the left because it is pulled out to the left.

The mean is 6.3, the median is 6.5, and the mode is 7. Notice that the mean is less than the median and they are both less than the mode. The mean and the median both reflect the skewing, but the mean more so.

The histogram for the data:

6 ; 7 ; 7 ; 7 ; 7 ; 8 ; 8 ; 8 ; 9 ; 10

is also not symmetrical. It is skewed to the right.


The mean is 7.7, the median is 7.5, and the mode is 7. Notice that the mean is the largest statistic, while the mode is the smallest. Again, the mean reflects the skewing the most.

To summarize, generally if the distribution of data is skewed to the left, the mean is less than the median, which is less than the mode. If the distribution of data is skewed to the right, the mode is less than the median, which is less than the mean.
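The three data sets discussed in this answer can be checked with Python's standard `statistics` module. The dictionary keys are labels chosen here for illustration.

```python
from statistics import mean, median, mode

datasets = {
    "symmetrical": [4, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 9, 10],
    "skewed left": [4, 5, 6, 6, 6, 7, 7, 7, 7, 8],
    "skewed right": [6, 7, 7, 7, 7, 8, 8, 8, 9, 10],
}

# (mean, median, mode) for each data set.
summary = {
    name: (mean(values), median(values), mode(values))
    for name, values in datasets.items()
}

for name, (m, med, mo) in summary.items():
    print(f"{name}: mean={m}, median={med}, mode={mo}")
```

The ordering of the three statistics reproduces the pattern in the summary: mean < median < mode for the left-skewed data and mode < median < mean for the right-skewed data.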

Skewness and symmetry become important when we discuss probability distributions in later chapters.

Part II

1. (a) What are the characteristics of a good measure of central tendency?

(b) What are the uses of averages?

Answer 1 a)

A measure of central tendency is a single value that attempts to describe a set of 

data by identifying the central position within that set of data. As such, measures of 


central tendency are sometimes called measures of central location. They are also classed as summary statistics. The mean (often called the average) is most likely the measure of central tendency that you are most familiar with, but there are others, such as the median and the mode.

The mean, median and mode are all valid measures of central tendency but, under different conditions, some measures of central tendency become more appropriate to use than others.

There can often be a "good" measure of central tendency with regard to the data you are analyzing, but there is no one "good" measure of central tendency. This is because whether you use the median, mean or mode will depend on the type of data you have, such as nominal or continuous data; whether your data has outliers and/or is skewed; and what you are trying to show from your data.

So the good measure of central tendency depends on the type of data.

| Type of variable | Good measure of central tendency |
|---|---|
| Nominal | Mode |
| Ordinal | Median |
| Interval/Ratio (not skewed) | Mean |
| Interval/Ratio (skewed) | Median |

Answer 1 b)

The uses of averages:

The use or application of a particular average depends upon the purpose of the investigation. Some of the uses of different averages are as follows:

Arithmetic Mean

The arithmetic mean is considered an ideal average. It is frequently used in all aspects of life. It possesses many mathematical properties, and due to this it is of immense utility in further statistical analysis. In economic analysis the arithmetic mean is used extensively to calculate average production, average wage, average cost, per capita income, exports,


imports, consumption, prices, etc. When different items of a series have different relative importance, the weighted arithmetic mean is used.

Geometric Mean

The geometric mean is important in a series having items of wide dispersion. It is used in the construction of index numbers. The averages of proportions, percentages and compound rates are computed by the geometric mean. The growth of population is measured with it, as population increases in geometric progression.

Harmonic Mean

The harmonic mean is applied in problems where small items must get more relative importance than large ones. It is useful in cases where time, speed, values given in quantities, rates and prices are involved. But in practice, it has little applicability.

Median and Partition Values

The median and partition values are positional measures of central tendency. They are mainly used in qualitative cases like honesty, intelligence, ability, etc. In distributions which are positively skewed, the median is a more suitable average. These are also suitable for problems of distribution of income, wealth, investment, etc.

Mode

The mode is also a positional average. Its applicability to daily problems is increasing. The mode is used to calculate the 'modal size of a collar', 'modal size of a shoe', or 'modal size of ready-made garments', etc. It is also used in the sciences of Biology and Meteorology, and in Business and Industry.
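A few of the uses listed above can be illustrated with Python's standard `statistics` module. All figures below are made up for illustration: marks weighted by hypothetical credit hours for the weighted arithmetic mean, yearly growth factors for the geometric mean, and speeds over two equal distances for the harmonic mean.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Made-up figures for illustration (not from the source).

# Weighted arithmetic mean: marks weighted by hypothetical credit hours.
marks = [70, 80, 90]
credits = [2, 3, 5]
weighted_mean = sum(m * c for m, c in zip(marks, credits)) / sum(credits)

# Geometric mean: average yearly growth factor over +10%, +20% and +30% years.
growth_factors = [1.10, 1.20, 1.30]
average_growth_factor = geometric_mean(growth_factors)

# Harmonic mean: average speed over two equal distances at 40 and 60 km/h.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)

print(weighted_mean)
print(average_growth_factor)
print(average_speed, mean(speeds))
```

The harmonic mean gives 48 km/h, below the arithmetic mean of 50 km/h, because more time is spent travelling at the lower speed. Compounding the geometric-mean growth factor over three years reproduces the total growth of 1.10 × 1.20 × 1.30.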

Question 2. For each one of the following null hypotheses, determine if it is a left-tailed, a right-tailed, or a two-tailed test. [10 Marks]

a. � 10 =

b. P � 0.5

c. is at least 100.

d. � -20

e. p is exactly 0.22

Answer 2)

One-Tailed and Two-Tailed Tests 

 A test of a statistical hypothesis, where the region of rejection is on only one side of the

sampling distribution, is called a one-tailed test. For example, suppose the null


hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of numbers located on the right side of the sampling distribution; that is, a set of numbers greater than 10.

A test of a statistical hypothesis, where the region of rejection is on both sides of the sampling distribution, is called a two-tailed test. For example, suppose the null hypothesis states that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or greater than 10. The region of rejection would consist of a range of numbers located on both sides of the sampling distribution; that is, the region of rejection would consist partly of numbers that were less than 10 and partly of numbers that were greater than 10.

a. � 10 =

b. P � 0.5

c. is at least 100.

d. � -20

Question 3) Why do we have to know the distribution of a test statistic?

Answer 3)

A statistic is calculated from the sample. To begin with, we assume that the hypothesis about the population parameter is true. We compare the value of the statistic with the hypothetical value of the parameter. If the difference between them is small, the hypothesis is accepted, and if the difference between them is large, the hypothesis is rejected. A statistic on which the decision whether to accept or reject a hypothesis can be based is called a test statistic. It is important to remember that a test statistic does not prove the hypothesis to be correct; it can only furnish evidence against the hypothesis. Some of the test statistics to be discussed later are Z, t and Chi-Square.

Statistics is a diverse subject, and thus the mathematics required depends on the kind of statistics we are studying. A strong background in linear algebra is needed for most multivariate statistics, but is not necessary for introductory statistics. A background in calculus is useful no matter what branch of statistics is


being studied, but is not required for most introductory statistics classes. At a bare minimum the student should have a grasp of basic concepts taught in algebra and be comfortable with "moving things around" and solving for an unknown.

In statistical hypothesis testing, a hypothesis test is typically specified in terms of a test statistic, which is a function of the sample [1]; it is considered a numerical summary of a set of data that reduces the data to one or a small number of values that can be used to perform a hypothesis test. Given a null hypothesis and a test statistic T, we can specify a "null value" T0 such that values of T close to T0 present the strongest evidence in favor of the null hypothesis, whereas values of T far from T0 present the strongest evidence against the null hypothesis. An important property of a test statistic is that we must be able to determine its sampling distribution under the null hypothesis, which allows us to calculate p-values.

For example, suppose we wish to test whether a coin is fair (i.e. has equal probabilities of producing a head or a tail). If we flip the coin 100 times and record the results, the raw data can be represented as a sequence of 100 Heads and Tails. If our interest is in the marginal probability of obtaining a head, we only need to record the number T out of the 100 flips that produced a head, and use T0 = 50 as our null value. In this case, the exact sampling distribution of T is the binomial distribution, but for larger sample sizes the normal approximation can be used. Using one of these sampling distributions, it is possible to compute either a one-tailed or two-tailed p-value for the null hypothesis that the coin is fair. Note that the test statistic in this case reduces a set of 100 numbers to a single numerical summary that can be used for testing.
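The coin example can be worked through with the exact binomial distribution. In this sketch the observed count of 60 heads is a hypothetical value chosen for illustration; the 100 flips and the null value T0 = 50 come from the example above.

```python
from math import comb

n = 100     # number of flips, as in the example
t0 = 50     # null value of the test statistic under a fair coin
t_obs = 60  # hypothetical observed number of heads (an assumption)


def binomial_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips with head probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# One-tailed p-value: probability of at least t_obs heads under the null.
p_one_tailed = sum(binomial_pmf(k, n) for k in range(t_obs, n + 1))

# Two-tailed p-value: outcomes at least as far from t0 in either direction.
distance = abs(t_obs - t0)
p_two_tailed = sum(
    binomial_pmf(k, n) for k in range(n + 1) if abs(k - t0) >= distance
)

print(p_one_tailed, p_two_tailed)
```

Because the binomial distribution with p = 0.5 is symmetric, the two-tailed p-value is twice the one-tailed one. Without knowing this sampling distribution there would be no way to say how unusual 60 heads is under the null hypothesis, which is the point of the question.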

A test statistic shares some of the same qualities of a descriptive statistic, and many statistics can be used as both test statistics and descriptive statistics. However, a test statistic is specifically intended for use in statistical testing, whereas the main quality of a descriptive statistic is that it is easily interpretable. Some informative descriptive statistics, such as the sample range, do not make good test statistics since it is difficult to determine their sampling distribution.