Experimental Methods for Engineers (MENG203)


Graphical Analysis and Curve Fitting (Chapter 3)

By

Marzieh REZAEI

Graphical Analysis

Statistics is the science of describing, interpreting and analyzing data. Statistics may be:

• Graphical: makes the numbers visible.

• Inferential: makes inferences about populations from sample data.

• Analytical: uses math to model and predict variation.

• Descriptive: describes characteristics of the data (location and spread).

Graphical Analysis

• Engineers are well known for their ability to plot many curves of experimental data and to extract all sorts of significant facts from these curves.

• The person who is usually most successful in analyzing experimental data is the one who understands the physical processes behind the data.

• Blind curve-plotting and cross-plotting usually generate an excess of displays, which confuse management, supervisors and even the experimenter.

Graphical Analysis

Benefits:

• Allows us to learn about the nature of the process.

• Enables clarity of communication.

• Helps in understanding sources of variation in the data.

• Provides focus for further analysis.

Graphical Analysis

Line charts:

• One of the simplest forms of chart.

• Useful for showing trends in quality, cost or other process performance measures.

• They represent the data by connecting the data points with straight lines to highlight trends in the data.

Graphical Analysis

Pie chart:

• Circular charts that make it easy to compare the data.

• They represent each category as a slice of the pie.

• They display the proportion of each category relative to the whole data set.

• Widely used in the business and media worlds for their simplicity and ease of interpretation.

Graphical Analysis

Bar charts:

• Used to display the frequencies of attribute data.

• They focus on the absolute value of data.

• The bars on the chart are presented vertically or horizontally.

Graphical Analysis

Graph selection:

The graph you choose depends on:

• The type of data you have.

• The objective you are trying to achieve.

• There are graphs for continuous data and graphs for count and attribute data.

• The engineer should give considerable thought to the kind of information being looked for before drawing any conclusion.

Linear regression

In statistics, linear regression is a basic and commonly used type of predictive analysis.

In regression we study the association between two variables in order to explain the values of one from the values of the other (i.e., make predictions).

Correlation tells us about the strength and direction of the linear relationship between two quantitative variables.

When there is a linear association between two variables, a straight-line equation can be used to model the relationship.

Linear regression (Cont…)

• A regression line is the line that best describes the linear relationship between the two variables. It is expressed by the equation y = ax + b, where a is the slope and b is the intercept.

• Once the equation of the regression line is established, we can use it to predict the response y for a specific value of the explanatory variable x .

Linear regression (Cont…)

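• The least-squares regression line is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.

For the line y = ax + b and n data points, this sum is $D = \sum_{i=1}^{n} \left[ y_i - (a x_i + b) \right]^2$, and a and b are chosen so that D takes its least possible value.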
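The quantity D can be evaluated directly for any candidate line. The following is a minimal sketch with illustrative data that do not come from the lecture; the function name `sum_sq_vertical` and both candidate lines are hypothetical.

```python
def sum_sq_vertical(xs, ys, a, b):
    """D: sum of squared vertical distances from the points to the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Illustrative data (an assumption, not taken from the source).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

# Two hypothetical candidate lines: the one with the smaller D fits better
# in the least-squares sense.
d_line_1 = sum_sq_vertical(xs, ys, a=2.0, b=1.0)
d_line_2 = sum_sq_vertical(xs, ys, a=1.5, b=2.0)
better = "line 1" if d_line_1 < d_line_2 else "line 2"
```

The least-squares line is the particular choice of a and b for which no other line gives a smaller D.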

Method of Least Squares

• We want a linear model, that is, the equation of the line that best fits the data. Here, "best fit" is defined by the least-squares criterion.

• The equation of the least-squares regression line of y on x is:

ŷ = ax + b

Method of Least Squares (Cont…)

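where a and b are calculated from the following equations, in which n is the number of data points:

$a = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n \sum x_i^2 - \left(\sum x_i\right)^2}$ (3.1)

$b = \frac{\left(\sum y_i\right)\left(\sum x_i^2\right) - \left(\sum x_i y_i\right)\left(\sum x_i\right)}{n \sum x_i^2 - \left(\sum x_i\right)^2}$ (3.2)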
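The slope and intercept can be computed from running sums over the data. The sketch below assumes that Eqs. (3.1) and (3.2) are the usual closed-form least-squares expressions for a line y = ax + b; the function name `least_squares_line` and the data are illustrative only.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b.

    Assumes the standard closed-form solution in terms of sums over
    the n data points; requires at least two distinct x values.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    a = (n * sum_xy - sum_x * sum_y) / denom        # slope
    b = (sum_y * sum_xx - sum_xy * sum_x) / denom   # intercept
    return a, b

# Illustrative data lying exactly on y = 2x + 1 (not the data of Example 3.1).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = least_squares_line(xs, ys)
```

With data that lie exactly on a line, the routine should return that line's slope and intercept, which is a convenient sanity check before applying it to measured data.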

Example 3.1

• LEAST-SQUARES REGRESSION.

• From the following data obtain y as a linear function of x using the method of least squares:

Solution

• We seek an equation of the form

y = ax + b

We first calculate the quantities indicated in the following table:

Solution

• We calculate the value of a and b using Eqs. (3.1) and (3.2) with n = 5:

Solution

• Thus, the desired relation is

• y = 0.540x + 0.879

• A plot of this relation and the data points from which it was derived is shown in the accompanying figure.

The Correlation Coefficient

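• Let us assume that a suitable correlation between y and x has been obtained, by either least-squares analysis or graphical curve fitting. We want to know how good this fit is. The parameter that conveys this information is the correlation coefficient r, defined by

$r = \left[ 1 - \frac{\sigma_{y,x}^2}{\sigma_y^2} \right]^{1/2}$ (3.3)

where σy is the standard deviation of y, given as

$\sigma_y = \left[ \frac{\sum_{i=1}^{n} (y_i - y_m)^2}{n - 1} \right]^{1/2}$ (3.4)

and

$\sigma_{y,x} = \left[ \frac{\sum_{i=1}^{n} (y_i - y_{ic})^2}{n - 2} \right]^{1/2}$ (3.5)

yi = the actual values of y

yic = the values computed from the correlation equation for the same value of x

ym = the arithmetic mean of the yi values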
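The definition can be followed step by step in code. This is a sketch only: it assumes σy uses an n − 1 denominator about the mean of y, σy,x uses an n − 2 denominator about the fitted values, and r = [1 − σy,x²/σy²]^(1/2). The function name, the data and the line y = 2x + 1 are illustrative.

```python
import math

def correlation_from_sigmas(xs, ys, a, b):
    """Return (sigma_y, sigma_yx, r) for the correlation y = a*x + b.

    Assumed forms: sigma_y with n - 1, sigma_yx with n - 2 (needs n >= 3).
    """
    n = len(ys)
    y_mean = sum(ys) / n
    sigma_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys) / (n - 1))
    sigma_yx = math.sqrt(
        sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    )
    # Guard against a slightly negative argument when the fit is very poor.
    r = math.sqrt(max(0.0, 1.0 - sigma_yx ** 2 / sigma_y ** 2))
    return sigma_y, sigma_yx, r

# Illustrative data (not from the source) and an assumed correlation y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
sigma_y, sigma_yx, r = correlation_from_sigmas(xs, ys, a=2.0, b=1.0)
```

Because σy,x is small compared with σy for these points, r comes out close to, but below, 1.0.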

The Correlation Coefficient (Cont…)

• The correlation coefficient may also be written as

$r^2 = \frac{\sigma_y^2 - \sigma_{y,x}^2}{\sigma_y^2}$ (3.6)

where r² is called the coefficient of determination.

We note that for a perfect fit σy,x = 0 because there are no deviations between the data and the correlation. In this case r = 1.0. If σy = σy,x, we obtain r = 0, indicating a poor fit or substantial scatter around the fitted line.

The Correlation Coefficient (Cont…)

• The reader must be cautioned about ascribing too much virtue to values of r close to 1.0. Such values can occur even when the data do not fit the line.

• To be on the safe side, one should never accept a least-squares analysis based only on calculations. One should always plot the data to obtain a visual observation of the behavior. If the data points do indeed hug the least-squares line, then a high value of r will be indicative of a very good correlation. If the data scatter but still appear to follow the fitted relationship, then a small value of r will also be meaningful as a measure of poorer correlation.
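A small numerical experiment illustrates the caution. The sketch below fits a straight line to deliberately curved data, y = x² for x = 1 to 10 (an illustrative choice, not from the source), and computes r with the standard product-moment formula.

```python
import math

xs = [float(x) for x in range(1, 11)]
ys = [x ** 2 for x in xs]  # curved on purpose: the data are NOT linear

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)
syy = sum(y * y for y in ys)

# Least-squares line and correlation coefficient.
a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b = (sy - a * sx) / n
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Residuals (data minus line): a systematic sign pattern reveals the curvature.
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
```

Here r is high even though the residuals are positive at both ends and negative in the middle, which is exactly the kind of misfit that a plot shows at a glance.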

The Correlation Coefficient (Cont…)

• It may be noted that most scientific calculators have built-in routines which calculate the correlation coefficient as well as other statistical functions. In addition, there are many computer software packages which accomplish these calculations.

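• A relationship for the correlation coefficient which may be preferable to Eq. (3.3) for computer calculations is:

$r = \frac{n \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\left[ n \sum x_i^2 - \left(\sum x_i\right)^2 \right]^{1/2} \left[ n \sum y_i^2 - \left(\sum y_i\right)^2 \right]^{1/2}}$ (3.7)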
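A sum-based form is convenient because it needs only one pass over the data. The sketch below assumes Eq. (3.7) is the standard product-moment expression; the function name `correlation_coefficient` and the test data are illustrative.

```python
import math

def correlation_coefficient(xs, ys):
    """Correlation coefficient r from running sums (assumed form of Eq. 3.7)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator

# Illustrative checks: points exactly on a rising or a falling line.
r_rising = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
r_falling = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [9.0, 7.0, 5.0, 3.0])
```

In this form the sign of r follows the sign of the slope, so a perfectly linear falling trend gives r = −1, while r² is 1 in both cases.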

Example 3.2

• Calculate the correlation coefficient for the least-squares correlation of Example 3.1.

Solution

From Example 3.1


Graphical Analysis and Curve Fitting

• The person who is usually most successful in analyzing experimental data is the one who understands the physical processes behind the data.

• When the data may be approximated by a straight line, the analytical relation is easy to obtain; but when almost any other functional variation is present, difficulties are usually encountered.

• The curve could be a polynomial, exponential, or complicated logarithmic function and still present roughly the same appearance to the eye.

Graphical Analysis and Curve Fitting

• Several different types of functions and plotting methods that may be used to produce straight lines on the graph are shown in the following table.

• The graphical measurements which may be made to determine the various constants are also shown. The method of least squares may be applied to all these relations to obtain the best straight line to fit the experimental data.

Methods of plotting various functions to obtain straight lines
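Two of the most common cases, the power relation and the exponential relation used in the later examples, can be reduced to straight lines by taking logarithms. The derivation below is a standard one and is given as a sketch; it does not reproduce the table itself.

```latex
\begin{align*}
y = a x^{b} \;&\Longrightarrow\; \log y = \log a + b \log x
  && \text{(straight line on log-log axes, slope } b\text{)} \\
y = a e^{bx} \;&\Longrightarrow\; \ln y = \ln a + b x
  && \text{(straight line on semilog axes, slope } b\text{)}
\end{align*}
```

In both cases the intercept of the straight line gives the logarithm of a, so a least-squares line fitted to the transformed variables returns both constants.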

Graphical Analysis and Curve Fitting

• There are several variations in plot format, which we illustrate by plotting the simple table of x-y data shown below.

Six formats for plotting the data are shown in Fig. 3.1a through f. The choice of format depends on both the source and type of data as well as the eventual use to be made of the display. The following paragraphs discuss the six alternatives. The computer graphics were generated in Microsoft Excel.

Graphical Analysis and Curve Fitting (Cont…)

• a. This display presents just the raw data points with a data marker for each point. It might be selected as an initial type of display before deciding on a more suitable alternative. It may be employed for either raw experimental data points or for points calculated from an analytical relationship. With computer graphics a wide selection of data marker styles is available.

Graphical Analysis and Curve Fitting (Cont…)

• b. This display presents the points with the same data markers connected by a smooth curve drawn either by hand or by a computer graphics system; in this case, by computer. This display should be used with caution. If employed for presentation of experimental data, it implies that the smooth curve describes the physical phenomena represented by the data points. The engineer may want to avoid such an implication and choose not to use this format.

Graphical Analysis and Curve Fitting (Cont…)

• c. This display is the same as (b) but with the data markers removed. It would almost never be employed for presentation of experimental data because the actual data points are not displayed. It also has the same disadvantage as (b) in the implication that the physical phenomena are represented by the smooth connecting curve. In contrast, this type of display is obviously quite suitable for presenting the results of calculations. The calculated points could be designated with data markers as in (b) or left off as in (c). The computer-generated curve offers the advantage of a smooth curve with a minimum number of calculated points.

Graphical Analysis and Curve Fitting (Cont…)

• d. This display presents the data points connected with straight-line segments instead of a smooth curve, and avoids the implication that the physical situation behaves in a certain "smooth" fashion. The plot is typically employed for calibration curves where linear interpolation will be used between points, or when a numerical integration is to be performed based on the connecting straight-line segments. If used for presentation of experimental data, the implication is the same as in (b) and (c) that the physical system actually behaves as indicated, in this case with a somewhat jerky pattern.
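The two uses named for format (d), linear interpolation on a calibration curve and numerical integration over the straight-line segments, can be sketched as follows. The calibration table and both function names are hypothetical.

```python
def interpolate_linear(xs, ys, x):
    """Linear interpolation between tabulated points; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the tabulated range")
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

def integrate_segments(xs, ys):
    """Area under the connecting straight-line segments (trapezoidal rule)."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )

# Hypothetical calibration table: instrument reading versus true value.
reading = [0.0, 1.0, 2.0, 4.0]
true_value = [0.0, 1.2, 2.1, 4.5]

corrected = interpolate_linear(reading, true_value, 3.0)  # between the last two points
area = integrate_segments(reading, true_value)
```

Both results depend only on the tabulated points, which is why the straight-line display is the honest picture of what such a calculation assumes.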

Graphical Analysis and Curve Fitting (Cont…)

• e. The format in (e) is the same as (d) without the data markers. It might be used for calculation results where the engineer wants to avoid computer smoothing between the calculated points.

Graphical Analysis and Curve Fitting (Cont…)

• f. Finally, the format presented in (f) is one that is frequently selected to present experimental results where uncertainties in the measurements are expected to result in scatter of the data points. A smooth curve is drawn through the data points as the experimentalist’s best estimate of the behavior of the phenomena under study. The smooth curve may be drawn by hand or generated through a least-squares process executed by the computer. A trend line equation may or may not be displayed along with the curve. When experimental uncertainties are expected to contribute significantly to the scatter of data, as they do in many cases, a full discussion of their nature should be offered in the accompanying narrative material.

Example 3.1 CORRELATION OF DATA WITH POWER RELATION

• The data for a series of experiments are shown in the table below.

In this case, the data were collected in two sets, so the tabular listing is not in ascending values of x. Because of the nature of the physical problem, the data are expected to correlate in terms of a power relation

y = ax^b

Example 3.1

• A computer will be used to plot the data and obtain the values of the constants a and b. If the data are plotted sequentially on a point-to-point linear graph, the result shown in Fig. Example 3.1a is obtained. The jagged nature of the lines results from data scatter and the fact that the data are not tabulated with continuously increasing values of x. Obviously, such a graph is inappropriate.

Example 3.1

• A least-squares fit of the data to a power relation is given as the trendline and equation shown in Fig. (b), along with the corresponding value of r² calculated from Eq. (3.7). The trendline and value of r² are calculated here by Excel but may be computed with other programs. The value of r² = 0.9778 indicates a good correlation.
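A power-relation trendline of this kind can be reproduced without a spreadsheet by fitting a straight line to ln y versus ln x. The sketch below uses synthetic data, not the table of this example, and assumes that r² is evaluated on the log-transformed values; the function name `fit_power_law` is illustrative.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (ln x, ln y); needs x > 0 and y > 0.

    Returns (a, b, r2), with r2 computed in the transformed coordinates.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    su, sv = sum(u), sum(v)
    suv = sum(p * q for p, q in zip(u, v))
    suu = sum(p * p for p in u)
    svv = sum(q * q for q in v)
    b = (n * suv - su * sv) / (n * suu - su ** 2)   # slope = exponent
    ln_a = (sv - b * su) / n                        # intercept = ln a
    r = (n * suv - su * sv) / math.sqrt((n * suu - su ** 2) * (n * svv - sv ** 2))
    return math.exp(ln_a), b, r * r

# Synthetic data on y = 3 x^0.5, deliberately listed out of order in x.
xs = [4.0, 1.0, 8.0, 2.0, 16.0]
ys = [3.0 * x ** 0.5 for x in xs]
a, b, r2 = fit_power_law(xs, ys)
```

The order of the points does not affect the fit, so unsorted data are a problem only for point-to-point plotting, not for the least-squares calculation.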

Example 3.2

ALTERNATIVE DISPLAYS AND CORRELATION TRENDLINES FOR EXPONENTIAL FUNCTION

• This example illustrates different ways of graphing and obtaining least-squares correlations for data, as applied to a calculated exponential function. First, the value of the function

y = 2.5e^(−0.2x)

is calculated for a number of values of x from 1 to 20, as shown in the accompanying table.

Example 3.2

• We are dealing with a known exponential function, so some of the correlations that will be examined are obviously designed for illustrative purposes only. In each case where a correlation trendline is presented, the corresponding equation will be displayed on the graph along with a value of r² calculated from Eq. (3.7). The closer the value of r² to unity, the better the correlation. The graphs and correlation trendlines have been generated in Microsoft Excel but could be obtained from other software packages as well.

• The following comments apply to the figures noted.

Example 3.2

• a. The calculated values of the exponential relation are displayed with data markers, but without connecting line segments.

Cont…

• b. The calculated values of the exponential relation are plotted with data markers along with connecting line segments.

Cont…

• c. A smooth curve is plotted through the points with data markers omitted.

Cont…

• d. The data points are displayed without connecting line segments along with a least-squares linear fit to the data points. The linear relation obviously does not work, as evidenced by a low value of r².
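The poor linear fit can be checked numerically. The sketch below regenerates the calculated values of y = 2.5e^(−0.2x) for x = 1 to 20 (integer steps are an assumption) and fits a straight line using the standard sum formulas.

```python
import math

xs = [float(x) for x in range(1, 21)]
ys = [2.5 * math.exp(-0.2 * x) for x in xs]

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)
syy = sum(y * y for y in ys)

# Straight-line least-squares fit y = a*x + b and its r^2.
a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b = (sy - a * sx) / n
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
r2_linear = r * r
```

The slope is negative, as expected for a decaying function, but r² falls well short of unity because a straight line cannot follow the curvature.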

Cont…

• e. As suggested by the third entry of the table, the exponential relation should plot as a straight line on semilog coordinates. This figure gives such a display along with a least-squares fit to an exponential relation and the corresponding value of r² calculated from Eq. (3.7). A perfect fit is obtained, as should be expected from the exact calculations of the exponential function. It should be noted that the line drawn through the data markers is the correlation line, and not a line connecting the points. Note that r² = 1.0.

Cont…

• f. An exponential relation is again fitted to the data with a least-squares analysis, but this time on a linear plot. Again, a perfect fit is obtained. We note from this chart that the least-squares calculation is independent of the type of coordinate system employed for the display.
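The exponential fit of items (e) and (f) can be sketched as a straight-line least-squares fit of ln y against x. The calculation works on the numbers alone, which is why the choice of axes for the display cannot change it. Integer values of x from 1 to 20 are assumed.

```python
import math

xs = [float(x) for x in range(1, 21)]
ys = [2.5 * math.exp(-0.2 * x) for x in xs]

# Fit ln y = ln a + b*x by least squares; no plot is involved at any point.
v = [math.log(y) for y in ys]
n = len(xs)
sx, sv = sum(xs), sum(v)
sxv = sum(x * q for x, q in zip(xs, v))
sxx = sum(x * x for x in xs)

b = (n * sxv - sx * sv) / (n * sxx - sx ** 2)   # exponent
a = math.exp((sv - b * sx) / n)                 # pre-exponential constant
```

Since the data were calculated from the exponential function itself, the fit recovers the constants 2.5 and −0.2 to within rounding error.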

Cont…

• g. This plot displays an attempt at a least-squares fit to a power relation like that in the second entry of the table, along with the plot using linear coordinates. Note the poor fit and low value of r².

Cont…

h. This figure shows a second-order polynomial (quadratic) fit of the data on linear coordinates. The results are fairly good, except at the larger values of x. The least-squares analysis for this type of fit was described in the equations mentioned earlier.

Cont…

i. A cubic polynomial fit is performed in this figure with very good results. Note the almost perfect value of r². In this example we know the functional form of the data points, but with actual experimental results the functional form may be unknown. In such cases a polynomial fit is sometimes tried and frequently works very well.
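The quadratic and cubic fits of items (h) and (i) can be sketched with a small least-squares routine based on the normal equations. This is an illustrative implementation, not the spreadsheet's own; the helper names are hypothetical, and r² is taken here as 1 − SS_res/SS_tot, which is an assumption about how the trendline value is defined.

```python
import math

def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def polyfit_ls(xs, ys, degree):
    """Coefficients c[0] + c[1]*x + ... of the least-squares polynomial."""
    m = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    return solve_linear(A, rhs)

def r_squared(xs, ys, coeffs):
    """1 - SS_res / SS_tot for the fitted polynomial."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [float(x) for x in range(1, 21)]
ys = [2.5 * math.exp(-0.2 * x) for x in xs]
r2_quadratic = r_squared(xs, ys, polyfit_ls(xs, ys, 2))
r2_cubic = r_squared(xs, ys, polyfit_ls(xs, ys, 3))
```

Adding a term can only lower the residual sum of squares, so the cubic value of r² is at least as large as the quadratic one; a high r² from a polynomial still says nothing about the true functional form.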

Cont…

j. This figure presents another failed attempt at a linear fit, this time with semilog coordinates.

Points

• Two general conclusions may be made from the above calculations and displays:

• 1. A least-squares analysis of a set of data points is independent of the type of coordinate system used for the presentation, although the type of plot may suggest the functional form or correlation to be attempted. In this regard, the information in the table can be quite helpful.

• 2. If one can anticipate the functional form of the data, the choice of plot type and the presentation of a correlation trendline are simplified.

Example 3.3 EVOLUTION OF A CORRELATION USING COMPUTER GRAPHICS.

To illustrate how a data correlation may evolve using computer graphics, we consider the set of data shown below.

From the physical nature of the problem, y is expected to behave according to:

y = a + bx^c (a)

where a, b, and c are constants which must be determined from the experimental data. Normally, one would insist upon more data points than shown in the table, but we are considering an abbreviated set to keep the presentation simple.

Cont…

• We consider a sequence of graphics that may be used to correlate the data. Obviously, x takes on a wide range of values, and the linear plot of y vs. x shown in Fig. (a) causes a compression of data markers for the lower range of x. Inspecting Eq. (a), we see that y will approach the value of the constant a for very small values of x. Thus, we should expect to be able to estimate the value of a by inspecting the behavior of the chart for small values of x; however, the compression in Fig. (a) makes that a very difficult task.

Cont…

• The situation is helped considerably by replotting the x axis with a logarithmic scale as shown in Fig. (b). Now, we see that y appears to be approaching a value of about 2 for very small values of x. We therefore add a “y − 2” column to the data table and replot the data as shown in Fig. (c). Equation (a) is now written as

• y − 2 = bx^c

Cont…

• which should plot as a straight line when displayed as log(y − 2) vs. log x. Such a display is shown in Fig. (d). We may suspect that the jagged, non-straight line is the result of scatter in the experimental data, and thus a point-to-point graph is not appropriate. The point-to-point graph is dropped in Fig. (e), and a computer-generated trendline for a power relationship is calculated and displayed on the chart along with a value of r² calculated from Eq. (3.7). The final data correlation is therefore

• y = 2 + 0.3536x^0.2594
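The procedure of this example (estimate a from the small-x behavior, subtract it, then fit a power relation on log-log coordinates) can be sketched as below. The data table of the example is not reproduced here, so the sketch uses synthetic points generated from the final correlation itself; the x values and the function name `fit_offset_power` are assumptions.

```python
import math

def fit_offset_power(xs, ys, a):
    """Given an estimate of a, fit y - a = b * x**c by least squares on log-log axes.

    Requires x > 0 and y > a for every point.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y - a) for y in ys]
    n = len(u)
    su, sv = sum(u), sum(v)
    suv = sum(p * q for p, q in zip(u, v))
    suu = sum(p * p for p in u)
    c = (n * suv - su * sv) / (n * suu - su ** 2)   # slope = exponent c
    b = math.exp((sv - c * su) / n)                 # intercept = ln b
    return b, c

# Synthetic check data spanning a wide range of x (not the example's table).
xs = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
ys = [2.0 + 0.3536 * x ** 0.2594 for x in xs]
b, c = fit_offset_power(xs, ys, a=2.0)
```

With scatter-free synthetic data the routine returns the constants it was built from; with measured data the result also depends on how well a was estimated from the plot.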

Cont…

• Finally, the appearance of the presentation is improved with the addition of major and minor gridlines, along with enlarged data markers and a wider trendline. This results in the graph shown in (f).

Cont…
