Basic Statistics for Engineers.
Collection, presentation, interpretation and decision making.
Prof. Dudley S. Finch
Statistics
Four steps:
– Data collection, including sampling techniques
– Data presentation
– Data analysis
– Conclusions and decisions based on the analysis
Data types
Discrete
– A variable consisting of separate values; for example, the number of bolts in a packet. There may be 8 or 9, but there cannot be 8.5.
Continuous
– A variable which may have any value; for example, the diameter of steel bars after machining. Any diameter is possible within the allowable tolerance to which the machine is set.
Sampling
It is often not practical to examine every component, so sampling techniques are used.
The sample should be representative of the complete set of values (the population) from which it has been chosen.
Although this cannot be guaranteed, we attempt to choose an unbiased sample.
To be unbiased, every possible sample must have an equal chance of being chosen. This is satisfied if the sample is chosen at random; that is, if there is no order in the way the sample is chosen. This is called a random sample.
Random samples
The larger the random sample the more representative of the population it is likely to be.
Random sampling can be carried out by allocating a number to each member of the population and then drawing numbered balls from a bag or using a random number generator.
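The numbering approach can be sketched in Python. This is a minimal illustration, assuming a hypothetical population of 500 components and a sample of 20 (both figures are made up for the example). The standard-library `random.sample` draws without replacement, so every possible sample of that size is equally likely.

```python
import random

def draw_random_sample(population_size, sample_size, seed=None):
    """Return the ID numbers of a simple random sample drawn without replacement."""
    rng = random.Random(seed)              # a seed only makes the draw repeatable
    ids = range(1, population_size + 1)    # allocate a number to each member
    return sorted(rng.sample(ids, sample_size))

# Hypothetical sizes, chosen only for illustration.
chosen = draw_random_sample(population_size=500, sample_size=20, seed=1)
print(chosen)
```

The components whose numbers appear in `chosen` are the ones to measure.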
Sampling techniques involve probability theory (will be dealt with later).
Data presentation
51.4 55.3 56.1 50.5 55.5
52.8 55.6 55.3 50.2 56.1
52.1 54.8 49.6 57.0 52.0
56.5 55.3 54.0 51.6 52.1
57.3 53.9 53.5 56.1 57.2
54.6 55.4 55.9 56.0 52.9
54.1 55.0 54.2 54.2 54.5
53.0 52.7 54.5 54.7 58.4
56.2 55.8 54.1 56.0 55.1
55.1 54.4 57.2 53.2 55.4
53.9 50.9 54.5 56.9 54.0
56.4 53.1 51.8 52.8 50.5
53.7 52.8 54.0 56.4 55.0
53.8
Measured weights of a casting (lbs).
Frequency distribution
The class interval should be one that emphasizes any pattern in the data. Typically between 8 and 15 class intervals should be chosen.
In the example used, a class interval of 1 lb is chosen, so the 50 lb class includes weights from 49.5 to 50.4 lbs. We can therefore compile a frequency distribution table:

Mass of casting (lbs)               50  51  52  53  54  55  56  57  58
Number of castings (frequency f)     2   4   5   8  13  15  12   6   1
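As a check on the table, the tally can be sketched in Python using the 66 measured weights listed under Data presentation. The helper name `class_of` is an assumption made for this example; it assigns each weight to its 1 lb class using the 49.5 to 50.4 convention described above.

```python
from collections import Counter

# The 66 measured casting weights (lbs) from the data table.
weights = [
    51.4, 55.3, 56.1, 50.5, 55.5, 52.8, 55.6, 55.3, 50.2, 56.1,
    52.1, 54.8, 49.6, 57.0, 52.0, 56.5, 55.3, 54.0, 51.6, 52.1,
    57.3, 53.9, 53.5, 56.1, 57.2, 54.6, 55.4, 55.9, 56.0, 52.9,
    54.1, 55.0, 54.2, 54.2, 54.5, 53.0, 52.7, 54.5, 54.7, 58.4,
    56.2, 55.8, 54.1, 56.0, 55.1, 55.1, 54.4, 57.2, 53.2, 55.4,
    53.9, 50.9, 54.5, 56.9, 54.0, 56.4, 53.1, 51.8, 52.8, 50.5,
    53.7, 52.8, 54.0, 56.4, 55.0, 53.8,
]

def class_of(weight_lbs):
    """Map a weight recorded to 0.1 lb onto its 1 lb class (49.5 to 50.4 -> 50)."""
    tenths = round(weight_lbs * 10)   # integer tenths avoid floating-point edge cases
    return (tenths + 5) // 10         # x.5 and above goes to the next class

frequency = Counter(class_of(w) for w in weights)
for mass in sorted(frequency):
    print(mass, frequency[mass])
```

Working in integer tenths is a design choice for this sketch: it keeps boundary values such as 50.5 from being misplaced by rounding.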
Bar chart
[Figure: bar chart of frequency (f) against variable x (lbs), for classes 50 to 58; frequency axis from 0 to 16.]
Histogram
[Figure: histogram of frequency (f) against variable x (lbs), for classes 50 to 58; frequency axis from 0 to 16.]
Frequency polygon
[Figure: frequency polygon of frequency (f) against variable x (lbs), for classes 50 to 58; frequency axis from 0 to 16.]
Pie chart showing relative frequency
Relative frequency = class frequency / total frequency of the sample; e.g. the relative frequency of the 53 lb class is 8/66, or 0.121.
[Figure: pie chart of relative frequency by class (lbs): 50: 3%, 51: 6%, 52: 8%, 53: 12%, 54: 20%, 55: 22%, 56: 18%, 57: 9%, 58: 2%.]
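The relative frequency calculation can be sketched in Python from the casting frequency table. The variable names are assumptions made for this example.

```python
# Frequency table for the castings: class (lbs) -> number of castings.
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

total = sum(frequency.values())   # total frequency of the sample
relative = {mass: f / total for mass, f in frequency.items()}

for mass, share in relative.items():
    print(f"{mass} lb: {share:.3f}")
```

The relative frequencies sum to 1, which is why they can be drawn as segments of a pie chart.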
Numerical methods of a distribution
A frequency distribution can be represented by two numerical quantities:
– Central tendency, or average value, of the distribution
– Dispersion, or scatter, of the variable about the average value
Numerical measures of central tendency
Mid point of range:
– The value halfway between the largest and smallest values of the variable
Generally a poor measure of central tendency, since it depends only on the extreme values of the variable and is not influenced by the form of the distribution.
Mode:
– The most frequently occurring value of the variable
Easily obtained from the frequency table. For the casting, the mode = 55 lbs.
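Both measures can be read off the casting frequency table, as in the Python sketch below. It takes the mid point of range as the value halfway between the smallest and largest class values, and the variable names are assumptions made for this example.

```python
# Frequency table for the castings: class (lbs) -> number of castings.
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

smallest, largest = min(frequency), max(frequency)
mid_range = (smallest + largest) / 2          # depends only on the two extremes
mode = max(frequency, key=frequency.get)      # class with the highest frequency

print("mid point of range:", mid_range)
print("mode:", mode)
```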
Arithmetic mean
– Determined by adding all the values of the variable and dividing by the total number of values. If the values $x_1, x_2, x_3, \ldots, x_n$ occur with frequencies $f_1, f_2, \ldots, f_n$, then

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n}$$

where $f_1 + f_2 + \cdots + f_n = N$, or

$$\bar{x} = \frac{1}{N} \sum f x$$
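Applied to the casting frequency table, the weighted mean can be sketched in Python as follows. The variable names are assumptions made for this example.

```python
# Frequency table for the castings: class value x (lbs) -> frequency f.
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

n_total = sum(frequency.values())                     # N = f1 + f2 + ... + fn
sum_fx = sum(f * x for x, f in frequency.items())     # f1*x1 + f2*x2 + ... + fn*xn
mean = sum_fx / n_total

print(f"N = {n_total}, sum of fx = {sum_fx}, mean = {mean:.3f} lbs")
```

Because the class values stand in for the individual weights, this is the mean of the grouped data and may differ slightly from the mean of the raw measurements.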
For frequency distribution tables:
Evaluate the deviations:
$(x_1 - \bar{x}),\ (x_2 - \bar{x}),\ \ldots,\ (x_n - \bar{x})$
Evaluate the squares of the deviations:
$(x_1 - \bar{x})^2,\ (x_2 - \bar{x})^2,\ \ldots,\ (x_n - \bar{x})^2$
Evaluate the sum:
$\sum f(x - \bar{x})^2 = f_1(x_1 - \bar{x})^2 + f_2(x_2 - \bar{x})^2 + \cdots + f_n(x_n - \bar{x})^2$
To calculate standard deviation:
Evaluate the average squared deviation:

$$\frac{\sum f(x - \bar{x})^2}{N}$$

Evaluate the standard deviation $s$:

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{N}}$$
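The steps can be followed in order for the casting frequency table, as in the Python sketch below. The variable names are assumptions made for this example, and the divisor is N as in the steps above.

```python
import math

# Frequency table for the castings: class value x (lbs) -> frequency f.
frequency = {50: 2, 51: 4, 52: 5, 53: 8, 54: 13, 55: 15, 56: 12, 57: 6, 58: 1}

n_total = sum(frequency.values())
mean = sum(f * x for x, f in frequency.items()) / n_total

# Deviations, squared and weighted by frequency, then summed.
sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in frequency.items())

variance = sum_sq_dev / n_total    # average squared deviation
s = math.sqrt(variance)            # standard deviation

print(f"sum f(x - mean)^2 = {sum_sq_dev:.3f}")
print(f"s = {s:.3f} lbs")
```

Some texts divide by N - 1 instead of N when the data are a sample; this sketch keeps the divisor used in the slides.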
Estimation
Estimation addresses the difficulty of obtaining information about the population from which the sample was drawn, and of setting up a mathematical model to describe this population.
Two components: estimation and testing of hypotheses about the chosen model.
Two types of estimates:
Point estimate
– Estimate of a population parameter expressed as a single number
This method gives no indication as to the accuracy of the estimate.
Interval estimate
– Estimate of a population parameter expressed as two numbers
This method is preferable, as it gives an indication as to where the population parameter is expected to lie.
Confidence intervals
In practice, the true standard deviation, $\sigma$, is unknown, and the sample standard deviation, $s$, is used to estimate $\sigma$.
If a random sample of size $n$ is drawn, an estimate of the standard error of the sample mean is given by $s/\sqrt{n}$.
We need to determine the confidence interval for the true mean, $\mu$.
For $n > 30$ a good approximation can be obtained. For small samples a wider interval is used.
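A minimal Python sketch of the large-sample case follows. It assumes the approximation intended for n > 30 is the normal distribution, which the slides do not state explicitly. The function name and the 95% default are likewise assumptions. The inputs are approximate values for the 66 castings (mean about 54.35 lbs, s about 1.83 lbs), computed from the frequency table.

```python
import math
from statistics import NormalDist

def large_sample_interval(sample_mean, s, n, confidence=0.95):
    """Approximate confidence interval for the true mean, assuming a normal model."""
    standard_error = s / math.sqrt(n)                     # estimated standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # two-sided normal quantile
    half_width = z * standard_error
    return sample_mean - half_width, sample_mean + half_width

# Approximate casting values (assumed inputs for this example).
low, high = large_sample_interval(sample_mean=54.35, s=1.83, n=66)
print(f"{low:.2f} to {high:.2f} lbs")
```

A smaller sample with the same s gives a larger standard error and therefore a wider interval.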
Use of Student t-distribution tables
Look up the value for $(n-1)$ degrees of freedom at the desired confidence limits (0.01 = 98%, 0.005 = 99%, 0.001 = 99.8%, etc.).
Find $t_{\alpha/2,\,n-1}$.
The true mean = sample mean $\pm\ t_{\alpha/2,\,n-1} \cdot s/\sqrt{n}$
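The small-sample interval can be sketched in Python with the t value supplied by the reader from tables, since the standard library has no t-distribution quantile function. The function name and the sample figures (n = 10, mean 54.2 lbs, s = 1.9 lbs) are hypothetical. The value 2.262 is the tabulated t value for 9 degrees of freedom with 0.025 in each tail (95% confidence).

```python
import math

def t_interval(sample_mean, s, n, t_value):
    """Confidence interval for the true mean.

    t_value is the table entry for (n - 1) degrees of freedom at the
    chosen tail probability, looked up by the caller.
    """
    half_width = t_value * s / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical small sample, for illustration only.
low, high = t_interval(sample_mean=54.2, s=1.9, n=10, t_value=2.262)
print(f"{low:.2f} to {high:.2f} lbs")
```

Passing the table value as an argument mirrors the manual lookup and keeps the sketch free of third-party libraries.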