
The Excel Guide

to accompany

Practical Business Statistics

by Andrew F. Siegel

PREPARED BY ANDREW F. SIEGEL

Copyright © 2003 Andrew F. Siegel

Published by Irwin/McGraw-Hill

Excel is a registered trademark of Microsoft, Inc.

Preface

You may already be familiar with Excel, the versatile spreadsheet program that is used widely in business and management analysis of nearly everything from accounting and finance to production and marketing. Much of the success of spreadsheets is due to the complete flexibility you have in putting text, numbers, and graphics anywhere on your computer screen (and in having formulas update themselves automatically). In addition, Excel includes a sophisticated set of statistical tools and is therefore a natural computing environment for a business statistics course.

The purpose of this Excel Guide is to help you learn statistics by working through real-data examples from the textbook Practical Business Statistics by Andrew F. Siegel. You don't need to know a lot about computers when you begin. Because the material is presented “from scratch,” once you launch into Excel, you will be able to get results right away. Just follow along and try commands like the ones you see presented and discussed here.

This Excel Guide works with Excel only. If you would like to enhance the statistical capabilities of Excel, we recommend StatPad, an Excel add-in which also provides non-technical explanations of the results of your statistical analysis. Many statistical methods explained here are much easier to do in StatPad. When you use StatPad, it seems as though the added conveniences were built into Excel itself, so there is no need to leave the familiar spreadsheet environment of Excel. StatPad comes with Practical Business Statistics by Andrew F. Siegel, published by Irwin/McGraw-Hill.

The Excel Guide begins with an introductory chapter to tell you about Excel and get you up-and-running with the basics. After that, the chapters here closely follow the same sequence as the chapters of the textbook Practical Business Statistics, beginning with the Histograms chapter (Chapter 3). While this Excel Guide gives enough information for you to see how to use the computer, you may wish to keep Practical Business Statistics handy for reference and further details about the examples because some of them are taken directly from the textbook. Once you've seen how to work the textbook examples, it should be straightforward for you to do homework and projects.

Each chapter contains discussion, examples, explanations, and the results of actual Excel sessions. Don’t forget that many Excel data files are available with Practical Business Statistics, so that there is no need to retype any data from your textbook.

Best wishes to you in this learning adventure!

Introduction and Sample Excel Session

Excel is a powerful computing environment with statistical capabilities. You can type data into the worksheet, analyze and manipulate the data, and write text to identify and explain it. Summaries, charts, and detailed calculations are easily done using Excel’s menu commands and functions. In this introductory chapter we cover some of the basics of Excel with hints and tips including: entering data, doing arithmetic, using functions, selecting and naming cells, using UNDO, formatting, working with data files from Practical Business Statistics, sorting, and making a chart.

If you are new to Excel, remember that the best way to learn is by experimenting. Explore the menu system and try things out to see how they work. Use Help for guidance. This manual gives you step-by-step instructions for many tasks. If you are an experienced Excel user, this manual will show you many ways in which Excel can be used for statistical calculations.

Moving Around and Typing Data into the Worksheet

I enjoy the freedom of working in spreadsheets like Excel. You can click on any cell you want, type anything you want - text or number or function - and it stays there when you hit the Enter key. To move around, you can use the mouse or the cursor (arrow) keys. Here is a worksheet that has some text, some numbers, and a formula. Note that you can see what formula is in the selected cell (in this case, “=C3+C4”) by looking at the Formula Bar near the top.

In this case, the formula starts, as always, with an equals sign “=”. To enter the formula that adds Jim’s and Adrian’s sales together, you might either type the formula directly and hit Enter, or construct it by pointing to cells as follows:

1. Select cell C6 by clicking on it or moving to it with the cursor keys

2. Hit the = key

3. Click on Jim’s sales in cell C3

4. Hit the + key

5. Click on Adrian’s sales in cell C4

6. Hit Enter.

Using Formulas to do Basic Arithmetic

Any cell can contain a formula that uses basic arithmetic with numbers and references to numbers (or formula results) in other cells. Here are some of the rules of basic arithmetic in Excel.

1. Start by selecting the cell where you want the result to go and hitting the = key

2. Use operators “+” for addition, “-” for subtraction, “*” for multiplication, “/” for division, and “^” for exponentiation (raising to a power). Here are some examples (in column B) with the formulas written out (in column C). Note that 2 ^ 3 means three 2s multiplied together, so it’s 2 * 2 * 2 = 8. Note also that the last formula, in cell B7, adds the results of two other formulas to find 18 as 8 + 10.

3. Rules of arithmetic say that these operations are performed in the following order:

a. Exponentiation “^” is done first

b. Multiplication “*” and division “/” happen next. Use parentheses so that expressions with multiplication after division, like 2 / (3 * 4), are evaluated as intended; without them, 2 / 3 * 4 is worked from left to right as (2 / 3) * 4.

c. Addition “+” and subtraction “-” are done last. Thus 6 + 4 * 2 ^ 3 is evaluated as 6 + 4 * 8, which is 6 + 32, which is 38.

d. If you want something to happen first, put it in parentheses. For example, (2 + 3) * 4 makes the addition happen before the multiplication.

e. If you have a minus sign that is not subtracting, be careful! It happens even before exponentiation! Thus -2 ^ 4 is evaluated as (-2) ^ 4 which is 16. If you wanted -(2 ^ 4) you would need to include the parentheses to make the exponentiation happen first and to get -16 as the answer.

4. Percentages are used as if they were already divided by 100. For example, if you enter a percent like “20%” directly into a cell, its value is taken to be 0.20. This makes it easy, for example, to find 20% of a number: you simply multiply the number by 20%.
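If you also compute outside Excel, the ordering rules above can be checked with a short sketch in Python. Python is not part of this guide; it is used here only as an illustration, with `**` standing in for Excel’s “^”. One difference is worth noticing: Python applies exponentiation before a leading minus sign, so Excel’s answer in rule (e) has to be written with explicit parentheses.

```python
# Illustrative only: Python's ** plays the role of Excel's ^ operator.

# Rule (c): exponentiation first, then multiplication, then addition.
step_by_step = 6 + 4 * 2 ** 3          # 6 + 4 * 8 = 6 + 32

# Rule (d): parentheses make the addition happen first.
grouped = (2 + 3) * 4

# Rule (e): Excel reads -2 ^ 4 as (-2) ^ 4, while Python reads -2 ** 4
# as -(2 ** 4), so parentheses are needed to reproduce Excel's answer.
excel_style_negation = (-2) ** 4
python_default = -2 ** 4

# Rule 4: a percentage is already divided by 100, so 20% of 150 is 150 * 20 / 100.
twenty_percent_of_150 = 150 * 20 / 100

print(step_by_step, grouped, excel_style_negation, python_default,
      twenty_percent_of_150)
```

When in doubt in either environment, adding parentheses removes the ambiguity.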

Using Functions to Compute a Number

Excel has a vast collection of useful functions. One easy way to browse them is to select an empty cell and choose Insert/Function from the main menu. Here is how it looks if you select the Statistical Function Category and then the AVERAGE function:

This is a nice way to insert a function into the worksheet because Excel will help you fill in the details in the correct order, so that you don’t have to memorize what goes where. This is especially useful with functions that need more than one piece of information. To insert the AVERAGE function, click OK to see a dialog box that is ready for you to select one or more cells by clicking or dragging the mouse across cells with the numbers you want to average. You may move this dialog box out of the way by dragging almost anywhere on it. Here is how it looks after dragging down cells B2 through B7:

When you click OK, the result is placed into the worksheet in the cell that was selected when you first chose Insert/Function from the menu. Here is the result:

You could achieve exactly the same result by selecting cell B9 and typing “=AVERAGE(B2:B7)” without the quotation marks and then hitting Enter. Another way to do this is to type “=AVERAGE(” without the quotation marks, then use the mouse to drag down cells B2 through B7, then type “)” without the quotation marks and hit Enter.

Selecting a Range of Cells

You will probably want to do many things to cells: put things in them, format them, calculate with them. The way Excel works, you will need to know how to select cells in order to change them or use them.

To select a rectangular range of worksheet cells, simply drag the mouse from one corner diagonally to the opposite corner. The result will look something like this:

Another way to select these cells would be to use the cursor keys to move to one corner, say C70. Then hold the Shift key while you move right twice. Then hit the End key (with or without Shift). Finally, hold the Shift key while you hit the down arrow key. When you use the End key, the next movement (left, right, up, or down) will go to the end of the row or column you are working in. Holding the Shift key expands the selection.

Naming a Range of Cells

It is much more convenient to refer to a list of data using an Excel Range Name like “Sales” instead of an Excel address like “D3:D6”. It is a good idea to also have a column heading like “Sales” in the cell above the data, but this may not be enough. Some versions of Excel will try to figure out which cells you wish to work with, but the best way to be sure that the name is associated with the correct data is to explicitly give the range a name.

Here is one way to create an Excel range name for a column of sales numbers with a label at the top:

1. Begin by selecting the sales numbers (just the numbers, not the label) by dragging the mouse down the column. It should look something like this:

2. Choose Insert/Name/Define from the main menu system. Because the label is at the top and you have selected cells below it, Excel knows what you want to do and proposes to give the range name “Sales” to the data in cells D3 through D6. Here is how it should look:

You can also use this Define Name dialog box to see what other names are defined and to check that they refer to the correct worksheet range.

You cannot just choose any name for a range. The first character must be a letter or the underscore character “_”. The other characters can be letters, numbers, periods, and underscore characters, but not spaces (use underscores instead). A name cannot be the same as a cell reference (e.g., C16, R3C5, R, and C are not allowed). There is no distinction between uppercase and lowercase letters, so “Sales”, “sales”, “SALES”, and “sALeS” all refer to the same worksheet cells.

3. When you choose OK, the range name is assigned. Whenever you select this range, its name (“Sales”) will appear in the name box near the top left corner of the worksheet, at the left end of the formula bar. You can select this range quickly by choosing its name in the name box.
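The naming rules given in step 2 can be written down as a small checker. This is only a sketch in Python (not something Excel provides), and the function names `is_valid_range_name` and `same_range_name` are made up for this illustration. The test for “looks like a cell reference” is a simplification that assumes column labels of one to three letters.

```python
import re

# Character rules from the text: first character a letter or underscore,
# the rest letters, digits, periods, or underscores (so no spaces).
NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")

# Simplified tests for names that look like cell references.
# A1 style: a few letters followed by digits, such as C16 (assumed 1-3 letters).
A1_REFERENCE = re.compile(r"^[A-Za-z]{1,3}[0-9]+$")
# R1C1 style: R and/or C with optional numbers, such as R3C5, R, or C.
R1C1_REFERENCE = re.compile(r"^(R[0-9]*)?(C[0-9]*)?$", re.IGNORECASE)


def is_valid_range_name(name: str) -> bool:
    """Return True if the name passes the rules described in the text."""
    if not NAME_PATTERN.match(name):
        return False
    if A1_REFERENCE.match(name) or R1C1_REFERENCE.match(name):
        return False
    return True


def same_range_name(first: str, second: str) -> bool:
    """Names are not case-sensitive, so compare them in lowercase."""
    return first.lower() == second.lower()


for candidate in ["Sales", "Sales 2003", "Sales_2003", "C16", "R3C5", "R"]:
    print(candidate, is_valid_range_name(candidate))
```

Excel itself remains the final judge: if it rejects a name in the Define Name dialog box, choose a different one.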

The Fill Handle

At the lower right-hand corner of a selection is the fill handle. Here’s one nice thing it can do if you drag it with the mouse: extend a sequence automatically:

Another nice thing the fill handle can do is automatically copy a selected cell’s formula down a column by dragging the fill handle as far as you want. If the cell is next to a column with data in it, then double-clicking the fill handle will automatically copy the cell’s contents down the column!

Copying and Pasting

To copy and paste text, a number, or a formula, you select the source cell(s), choose Edit/Copy from the main menu, select a cell at the destination, and choose Edit/Paste.

To move the contents of a cell or cells, select the source cell(s) and choose Edit/Cut, select a cell at the destination, and choose Edit/Paste (or just hit Enter).

To paste just the numbers but not the formulas, when you paste, choose Edit/PasteSpecial/Values.

If a formula adds the two numbers to its left, then the way it copies depends on how the cell addresses are specified. With relative addressing the formula changes to reflect its new location. Suppose that the formula “=A5+B5” is in cell C5. When this formula is copied to another cell, the resulting formula will change so that it adds the two cells to the left of the destination. For example, if this formula is copied to cell C6, the formula will change to “=A6+B6”. With absolute addressing the formula remains the same. If the formula includes dollar signs to read “=$A$5+$B$5”, then it remains unchanged when it is copied, always adding these two cells.
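To make the difference between relative and absolute addressing concrete, here is a rough imitation, in Python, of what happens to cell references when a formula is copied. It is a sketch for illustration only, not how Excel is implemented; the function `copy_formula` and its helpers are hypothetical names, and the pattern handles only simple formulas made of cell references (it would mistake a function name such as LOG10 for a reference).

```python
import re

# A cell reference: optional $, column letters, optional $, row digits.
CELL_REF = re.compile(r"(\$?)([A-Z]+)(\$?)([0-9]+)")


def column_to_number(letters: str) -> int:
    """Convert column letters to a number: A -> 1, B -> 2, ..., AA -> 27."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def number_to_column(number: int) -> str:
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def copy_formula(formula: str, rows_down: int, cols_right: int) -> str:
    """Shift every part of a reference that is not locked with a dollar sign."""
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:
            col = number_to_column(column_to_number(col) + cols_right)
        if not row_lock:
            row = str(int(row) + rows_down)
        return f"{col_lock}{col}{row_lock}{row}"
    return CELL_REF.sub(shift, formula)


print(copy_formula("=A5+B5", 1, 0))       # copied from C5 to C6
print(copy_formula("=$A$5+$B$5", 1, 0))   # dollar signs keep it unchanged
```

Copying one row down corresponds to `rows_down=1`, which is the move from cell C5 to cell C6 described in the text.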

Using UNDO

Thank goodness for UNDO! No need to worry if you have just erased your precious data by accidentally hitting the delete key, so long as you react reasonably quickly. Just choose Edit/Undo from the main menu, and your valuable data will reappear as if by magic. Excel now has multiple UNDO levels, so that you can undo more than one action.

Formatting a Range of Cells

To make your worksheet look nice, you will need to format cells. Select the cells first, then use Format/Cells from the main menu. You will then have control over how numbers appear (number of decimal places, percentage, dollar signs, dates, etc.), how cells are aligned (left or right, top or bottom), what font is used (including color, size, underline, italic, bold), how cell borders are indicated, and what patterns and colors fill the cells.

To show numbers with 2 decimal places and commas for thousands separation, here is the Format/Cells dialog box with the Number tab chosen. Don’t forget to select the cell(s) first!

To show numbers as percentages with one decimal place, you would use the Percentage category on the Number tab with Decimal places set to 1.

Working with Data Files from Practical Business Statistics

For use with Excel, each chapter of Practical Business Statistics has its own data file that includes the data tables from examples and problems. To access it, use File/Open from Excel’s menu. Each column of numbers is named and ready to use. For example, the data sets from Chapter 3 are in the file named Chapter03.xls, and the employee database from Appendix A of the textbook is in the file named EmployeeDatabase.xls. A list of the names used for each individual data table within a file can be found in the Appendix to this Excel guide.

To work with a column of numbers from a data file, you may use its name in a formula, such as “=AVERAGE(yield)” to place the average of a column of numbers named “yield” into a cell in your worksheet. Alternatively, you may drag the mouse down the numbers in the data set to select them if you wish.

Sorting to Put a Range in Order

When you want to put a column of data in order, smallest to largest or largest to smallest, simply select your data, then choose Data/Sort from the main menu.

If you have a larger database with more than one variable measured on each elementary unit, be sure to select the entire data set before sorting. Here is a small database:

To sort it by revenues, you may either start by selecting A6 through C9, or let Excel do it for you when you choose Data/Sort from the main menu. Here is how it should look as you prepare to sort by Revenues, with both columns of data selected along with the identifying labels.

When you choose OK, the cities are sorted in order by revenues, and their expenses have correctly remained associated with them:
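The reason for selecting the entire data set before sorting can be seen in a small sketch written in Python (used here only for illustration). The city names and amounts are made up and are not the values from the worksheet in this guide.

```python
# Hypothetical rows of (city, revenues, expenses); not the guide's data.
database = [
    ("Springfield", 420, 310),
    ("Riverton", 150, 140),
    ("Lakeside", 980, 760),
    ("Hillview", 610, 450),
]

# Sorting whole rows by the revenues column keeps each city's expenses
# attached to its revenues, like Data/Sort with the entire range selected.
by_revenues = sorted(database, key=lambda row: row[1])

# Sorting the revenues column on its own loses the link to city and expenses.
revenues_only = sorted(row[1] for row in database)

for city, revenues, expenses in by_revenues:
    print(city, revenues, expenses)
```

Each printed row still shows a city together with its own revenues and expenses, which is what you want after a sort.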

Making a Chart

Here is how to create a chart in Excel.

1. Select your data, either one column or multiple columns. In some cases you will want to select the label at the top of the column for Excel to use.

2. Choose Insert/Chart from the main menu or click on the Chart Wizard icon on the toolbar. The dialog box gives you many chart options:

Of particular interest in statistics are the XY (Scatter) chart, used for bivariate and multivariate data, and the Line chart, used in time series analysis. Creating a histogram will require some computation before the chart is created. Details on creating particular types of charts will be covered as situations arise in this Excel Guide.

3. As you click on Next > to go through the sequence of dialog boxes, you will have the option to add titles, as well as to add or take away gridlines or legends. If you choose to put the chart back “As Object in” your worksheet, you will be able to move and size it near the data it came from.

4. In addition, if you don’t like the gray background in a chart, double-click on it and set the Patterns in the Area to None. To change the size of the chart, drag a sizing handle (these appear in the corners and in the middle of the sides when you click just inside the edge of the chart). To move the chart to a different place in the worksheet, drag just inside the edge but not on a sizing handle. To add or change titles, right-click just inside the chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To change the font size, right-click on the item (a title or an axis) and choose Format from the little pop-up menu.

Using the Data Analysis ToolPak

Some statistical methods, such as regression and the analysis of variance, can be performed in Excel by using the “Data Analysis ToolPak” which is part of Excel, but you may need to install it before you can use it.

To find out if the Data Analysis ToolPak is installed on your system, look under the Tools menu for Data Analysis. If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins from the Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was not installed when Excel was installed on your computer, you may need to install it from the Excel CD-ROM.

Hints, Tips, and Troubleshooting

Here are some general comments that fall into the categories of hints, tips, and troubleshooting.

Experiment! Explore the menu system. Try things out to see how they work. And check your work for reasonableness: don’t just believe it has to be correct because you did it on a computer.

Save your work often so that if the computer shuts off unexpectedly you will not be sad. If your work is important, then keep more than one copy of it in more than one place.

Use the help system to learn more about Excel, either from the menu or by hitting the F1 key. Personally, I find that Help/ContentsAndIndex/Index from the main menu is the most useful.

Be familiar with the Tools/Options menu choice, which gives you control over many worksheet features. Here are some highlights:

1. With the View tab of Tools/Options, if something like a formula bar or scroll bar disappears from your worksheet, you will be able to bring it back. If you want to get rid of those gridlines, you can.

2. With the Calculation tab of Tools/Options, you can make sure that the worksheet is set to calculate automatically. If calculation is set to Manual, then you may need to hit the F9 key to see correct up-to-date results.

3. With the Edit tab of Tools/Options, you can control whether the selection moves down, or some other direction, or stays in the cell when you hit Enter. You can also ask Excel not to guess what you mean when you start typing, by un-checking the box at “Enable AutoComplete for cell values”.

4. With the General tab of Tools/Options, you can choose the default font and size.

To widen a column so that you can see all that is in a cell, select the cell and then use Format/Column/AutoFitSelection from the main menu.

There are additional toolbars; in particular, the Drawing toolbar can be useful for placing arrows and other drawing objects on the worksheet. To see them, right-click in the open area near the top of the window.

If you are not sure how to get Excel to do something, try right-clicking or double-clicking on the object. The context-sensitive menu that appears when you right-click with the mouse can be very helpful, by making suggestions that are appropriate to the object you are interested in. Try this on a range or on part of a chart when you are not sure what to do. The Esc key makes this pop-up menu go away if you decide not to use it.

Histograms (Chapter 3)

Here is how to produce a histogram in Excel by first creating a column of bins to hold the frequencies, then using Excel’s COUNTIF function to count how many data values fall into each bin, and finally creating a bar chart of these frequencies with labels and connected bars.

You have two alternatives to these procedures while staying in Excel. First, with StatPad, creating a histogram is quick and easy. Second, with the data analysis add-in (“Analysis ToolPak”), creating a histogram requires more steps and the final result (after eliminating gaps between bars) can be counterintuitive because a data value that falls on a bin boundary may be placed in the bin to its left, instead of the bin to its right (so that, e.g., 60 would be counted as “50 to 60” instead of “60 to 70”).
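The counting idea behind the COUNTIF approach can be sketched in a few lines of Python (used here only for illustration; it is not part of Excel or of this guide). Each bin’s count is the number of values below its upper boundary minus the number below its lower boundary, so a value sitting exactly on a boundary, such as 60, lands in the bin to its right (“60 to 70”). The rates below are made-up numbers, written in percentage points rather than as fractions.

```python
def count_below(data, boundary):
    """Count the values strictly below the boundary, like COUNTIF with "<"."""
    return sum(1 for value in data if value < boundary)


def bin_counts(data, boundaries):
    """Count the values in each bin [lower, upper) between adjacent boundaries."""
    counts = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        counts.append(count_below(data, upper) - count_below(data, lower))
    return counts


# Bin boundaries 30, 35, ..., 70 (percent).
boundaries = list(range(30, 75, 5))

# Hypothetical ownership rates in percent; not the textbook table.
rates = [38.5, 41.5, 48.0, 50.0, 52.3, 55.0, 59.9, 60.0, 64.0]

counts = bin_counts(rates, boundaries)
for lower, count in zip(boundaries, counts):
    print(f"{lower}% to {lower + 5}%: {count}")
```

Note that there is one fewer count than there are boundaries, which is why the last boundary has no count beside it in the worksheet.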

Example: Computer Ownership Rates (Histogram)

Consider the data for rates of computer ownership (Table 3.5.2 of Practical Business Statistics). Here are the steps involved in creating a histogram:

1. Create a column of bin boundaries, in this case from 30% to 70% by 5% (a reasonable choice because the data values range from 37.2% to 66.1%). To do this, you might type “30%” in cell E271, hit Enter, then use Excel’s menu commands Edit/Fill/Series with Series in Columns, Step value 5% and Stop value 70% as shown here:

Note: Typing “30%” in the cell is the same as typing “0.30” in the cell and then using Format/Cells/Number to specify percentage format with two decimal places.

2. Compute the counted frequencies using the COUNTIF function. Select the cell to the right of the first bin boundary amount. We want the number of data values from 30% to 35% (remember that 35% is the same as 0.35 in Excel). Since 30% is in cell E271 and 35% is in cell E272, we can use the formula

=COUNTIF(computer_owners,"<"&E272)-COUNTIF(computer_owners,"<"&E271)

which has been carefully crafted in this form so that all counts can be found by copying down the column, to the next-to-last cell (representing data values from 65% to 70%). For this formula to work, the column of data must have a name such as “computer_owners” here (if your data does not yet have a name, then select the numbers in the data column and use Excel’s menu command Insert/Name/Define to give your data a name). To copy and paste after typing the formula and hitting enter, you may use the menu command Edit/Copy, then select the cells of the column and then use Edit/Paste (or just double-click the little fill handle at the lower right of the selected cell, then delete the last one in the column). Here is the result so far:

3. Prepare for charting by selecting the bin boundaries and the counts, INCLUDING THE BLANK TOP ROW, which will convince Excel to draw the bar chart correctly, using the bin boundaries as the category axis. Here is how it should look as you select Insert/Chart from the menu (or click on the Chart Wizard icon on the toolbar):

4. Use the standard Column Chart Type with first Sub-Type:

5. Click on Next > twice, then eliminate some unnecessary features. Delete the legend by selecting the Legend tab and unselecting the “Show legend” checkbox, and eliminate gridlines by selecting the Gridlines tab and unselecting anything checked there:

6. Click on Finish to place the chart in the worksheet.

7. Eliminate the gaps between the bars by right-clicking on a bar to bring up a little menu from which you choose “Format Data Series”

8. Choose the Options tab, then decrease the Gap Width to 0 to make it into a true histogram:

9. Click OK to complete this task. You now have a histogram in the worksheet!

10. Here are some optional steps. If you don’t like the gray background, double-click on it and set the Patterns in the Area to None. Similarly, by double-clicking inside a bar, you may change or eliminate the color. To change the size of the histogram, drag a sizing handle (these appear in the corners and in the middle of the sides when you click just inside the edge of the chart). To move the chart to a different place in the worksheet, drag just inside the edge but not on a sizing handle. To add titles, right-click just inside the chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To change the font size, right-click on the item (a title or an axis) and choose Format from the little pop-up menu. To format the horizontal axis as percent, double-click on the axis, then choose Number and Percent. Here is one possible result:

[Figure: “Histogram of Computer Ownership.” Horizontal axis: Percent of Households, with bins labeled 30% through 65%. Vertical axis: Number of States, from 0 to 20.]

Example: Assets of Commercial Banks (Transformation)

This example shows how you can transform a data set using logarithms. We use the Excel function =LOG10( ) to find the base 10 logarithm of each data value, but you may use the natural logarithm (base e) instead by using the =LN( ) function.

Consider the assets, in billions, of commercial banks in the Fortune 1000 (Table 3.4.1 of Practical Business Statistics). A histogram of these values, found using the methods explained earlier in this chapter, is very skewed:

To compute the logarithms of the data values, begin by computing the logarithm of the first data value. To do this, select the cell to its right, then use Excel’s Insert/Function menu command. You will find the LOG10 function under the Math & Trig category:

Select OK to see the LOG10 dialog box, then click on the first data value (you may need to drag the dialog box out of the way to see it) to tell Excel which number to take the logarithm of, as follows (in this case, the number in cell E79, which you specify by clicking on it):

Select OK, then double-click on the fill handle to copy this formula down the column of data, resulting in a new column containing the logarithms of the data (if you prefer, you may use Edit/Copy and Edit/Paste instead):

Now give these logarithms a name, for example, logAssets, while they are still selected, by choosing the Insert/Name/Define menu command and typing the name logAssets:

Now we are ready to construct the histogram of logAssets, using the methods explained earlier in this chapter, but this time for the logAssets data. Here is the resulting histogram, which is much less skewed than the original data:
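For comparison, the same transformation can be sketched outside Excel in Python (an illustration only, not part of this guide). The asset figures below are made up and are not the values from Table 3.4.1; `math.log10` plays the role of =LOG10( ) and `math.log` the role of =LN( ).

```python
import math

# Hypothetical bank assets in billions of dollars; not the textbook data.
assets = [1.2, 3.5, 8.0, 25.0, 100.0, 640.0]

# Counterpart of filling =LOG10(cell) down a new column.
log_assets = [math.log10(value) for value in assets]

# Counterpart of using =LN(cell) instead (natural logarithm, base e).
ln_assets = [math.log(value) for value in assets]

# The largest value is hundreds of times the smallest, yet the logarithms
# differ by fewer than 3 units, which is why the histogram becomes less skewed.
raw_ratio = max(assets) / min(assets)
log_range = max(log_assets) - min(log_assets)

print([round(value, 2) for value in log_assets])
print(round(raw_ratio, 1), round(log_range, 2))
```

Either base of logarithm pulls in the long right tail; the two differ only by a constant factor.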

Landmark Summaries (Chapter 4)

Excel can quickly compute many statistical summaries and, with some effort, draw the related graphs. In this chapter we consider the average, median, weighted average, five-number summary, boxplot, and cumulative distribution function.

Example: How Many Defective Parts? (Average, Median)

This example shows how to use Excel to find the average, median, quartiles, and percentiles. Consider the data for defective parts (from the example in Chapter 4 of Practical Business Statistics).

If your data are not yet named, begin by giving a name (such as “Defects” here) to your column of numbers by highlighting the numbers and then using Excel’s menu command Insert/Name/Define. Next, select the cell where you want to put the average. You may either

1. type “=AVERAGE(Defects)” directly into the cell and hit Enter

or

2. select Average from the statistical functions listed under the menu command Insert/Function, hit OK, and then either type “Defects” directly into the dialog box, or drag the mouse down your column of numbers to tell Excel which data set to use. Then select OK.

Either way, the result is the same. After selecting another cell to hold the median and repeating these steps to find the median, the result (average is 5.1, median is 4.5) is as follows:
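If you would like to double-check an average and median outside Excel, a minimal sketch in Python follows (Python is used only for illustration). The eight values are made up, so the results differ from the 5.1 and 4.5 found for the textbook data.

```python
from statistics import mean, median

# Hypothetical counts of defective parts; not the textbook data set.
defects = [3, 8, 2, 5, 0, 7, 4, 11]

average_defects = mean(defects)     # counterpart of =AVERAGE(Defects)
median_defects = median(defects)    # counterpart of =MEDIAN(Defects)

print(average_defects, median_defects)
```

With an even number of values, the median is the average of the two middle values once the data are put in order (here 4 and 5).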

Example: Your Grade Point Average (Weighted Average)

This example shows how to compute a weighted average, given two columns of numbers: one with values and the other with the weights. Consider the data on grades (from the example in Chapter 4 of Practical Business Statistics). A grade point average is the weighted average grade where credits define the weights.

Be sure each column of numbers has a name (select the column of numbers and use Excel’s Insert/Name/Define menu command if needed). The weighted average can then be computed using the expression “=SUMPRODUCT(Credits, Grade)/SUM(Credits)”. The SUMPRODUCT function multiplies credits by grade for each course and adds them up, while the SUM function finds the total credits. Remember always to divide by the sum of the weights (in this example, the credits). The result here is a grade point average of 3.45:
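The SUMPRODUCT-over-SUM calculation can be spelled out step by step in a short Python sketch (for illustration only). The credits and grades here are made up, so the answer is not the 3.45 from the textbook example.

```python
# Hypothetical course credits and grade points; not the textbook data.
course_credits = [5, 3, 4, 2]
course_grades = [3.7, 3.0, 4.0, 2.5]

# SUMPRODUCT: multiply credits by grade for each course and add them up.
sum_product = sum(c * g for c, g in zip(course_credits, course_grades))

# SUM: the total of the weights.
total_credits = sum(course_credits)

# Weighted average = SUMPRODUCT(Credits, Grade) / SUM(Credits).
grade_point_average = sum_product / total_credits

# The ordinary (unweighted) average ignores the credits and gives another answer.
simple_average = sum(course_grades) / len(course_grades)

print(round(grade_point_average, 2), round(simple_average, 2))
```

The two printed numbers differ because the weighted average counts a five-credit course more heavily than a two-credit course.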

Example: How Many Defective Parts? (Quartiles, 5-Number Summary, Percentiles)

To find the quartiles, recall that the rank of the lower quartile is [1 + int((1 + n)/2)]/2, where int( ) discards any fractional part. You can find n, the number of data values, by using Excel’s COUNT function. To convince Excel to find the data value at this rank (and to average two data values if the rank includes a fractional part), we can use Excel’s PERCENTILE function, with a few modifications, as shown below. To find the upper quartile, the formula changes only slightly. You can use these formulas to find the quartiles of any data set by substituting the data set name in place of “Defects”. Here are the results for the Defects data:
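As a cross-check on the rank idea, here is a sketch in Python (for illustration only; it does not reproduce the worksheet formulas). It computes the lower quartile’s rank from n, then reads off the value at that rank, averaging two neighbors when the rank ends in .5. The rank used for the upper quartile, n + 1 minus the lower rank, is an assumption based on symmetry, since the text says only that the formula changes slightly. The data values are made up.

```python
def value_at_rank(sorted_data, rank):
    """Value at a 1-based rank; average the two neighbors if the rank ends in .5."""
    lower_index = int(rank) - 1
    if rank == int(rank):
        return sorted_data[lower_index]
    return (sorted_data[lower_index] + sorted_data[lower_index + 1]) / 2


def quartiles(data):
    """Return (lower quartile, upper quartile) using the rank rule in the text."""
    ordered = sorted(data)
    n = len(ordered)
    lower_rank = (1 + (1 + n) // 2) / 2      # [1 + int((1 + n)/2)] / 2
    upper_rank = n + 1 - lower_rank          # assumed: mirror image of lower rank
    return value_at_rank(ordered, lower_rank), value_at_rank(ordered, upper_rank)


# Hypothetical counts of defective parts; not the textbook data set.
defects = [3, 8, 2, 5, 0, 7, 4, 11]
lower_quartile, upper_quartile = quartiles(defects)

print(lower_quartile, upper_quartile)
```

With n = 8 the lower rank is 2.5, so the lower quartile is the average of the second and third smallest values.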

The 5-number summary consists of the smallest, lower quartile, median, upper quartile, and largest. You can use Excel’s MIN and MAX functions to find the smallest and largest. Here is the 5-number summary:

To find a percentile when you have the percentage, you may use Excel’s PERCENTILE function, which needs to know the data set and the percentage. Here is the 85th percentile for the Defects data:

Given a number (not necessarily a data value, but in the same units as the data values) you may use Excel’s PERCENTRANK function to find the percentage that tells what percentile it is. This example shows that 11 is the 94th percentile. That is, about 94% of the data values are smaller than 11. To get the number 0.944 to show as 94.4%, you may select the cell and format it as a percentage (using the menu command Format/Cells/Number/Percentage).
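To see what a percentile calculation involves, here is a sketch in Python (for illustration only). It assumes the interpolation rule usually described for Excel’s PERCENTILE function, in which the percentage picks a position between the smallest value (0%) and the largest value (100%) and neighboring values are blended; check your own version of Excel if exact agreement matters. The function name `percentile` and the data are hypothetical.

```python
def percentile(data, fraction):
    """Interpolated percentile; fraction is between 0 and 1 (0.85 for 85%)."""
    ordered = sorted(data)
    position = fraction * (len(ordered) - 1)   # zero-based, may be fractional
    lower = int(position)
    weight = position - lower
    if lower + 1 >= len(ordered):
        return ordered[-1]                     # the 100th percentile is the maximum
    return ordered[lower] + weight * (ordered[lower + 1] - ordered[lower])


# Hypothetical counts of defective parts; not the textbook data set.
defects = [3, 8, 2, 5, 0, 7, 4, 11]

p85 = percentile(defects, 0.85)
p50 = percentile(defects, 0.50)

print(round(p85, 2), p50)
```

Under this rule the 50th percentile agrees with the median of the same data.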

Example: CEO Compensation (Boxplot)

This example shows how to draw a box plot, once you have the 5-number summary, which involves a particular arrangement of the five numbers in a table. A simpler alternative is to use StatPad. Consider a data set of CEO compensation, with five-number summary 100,000, 1,000,000, 1,497,500, 2,101,000, and 7,730,000. Here are the steps involved in creating a box plot:

1. Arrange the 5-number summary exactly as follows, repeating some summaries and leaving a space before the median as shown here:

2. To the left of these numbers, type in the numbers 1, 2, 3 in exactly the following sequence. This will tell Excel how to draw the lines to create the box plot (the number 2 is in the middle, while 1 will place it to the left and 3 to the right).


3. Select both columns of numbers all the way down (including the blank line) and choose Insert/Chart from the menu as follows:


4. Choose “XY (Scatter)” as the Chart Type, and choose “Scatter with data points connected by lines without markers” as the Chart sub-type, as follows:


5. Click Next > twice, then eliminate some unnecessary features. Delete the X Axis by selecting the Axes tab and unselecting the “Value (X) Axis” checkbox. Delete the legend by selecting the Legend tab and unselecting the “Show legend” checkbox, and eliminate gridlines by selecting the Gridlines tab and unselecting anything checked there. You may also add titles by clicking on the Titles tab:


6. Click on Finish to place the chart in the worksheet. The chart is selected so you see the sizing handles around it and the data it was made from.


7. Drag the sizing handles to make it larger. In addition, if you don’t like the gray background, double-click on it and set the Patterns in the Area to None. To move the chart to a different place in the worksheet, drag just inside the edge but not on a sizing handle. To add or change titles, right-click just inside the chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To change the font size, right-click on the item (a title or an axis) and choose Format from the little pop-up menu. Here is the result:

Example: Defects Data (Cumulative Distribution Function)

This example shows how to draw a cumulative distribution function (CDF), which involves arranging two copies of the data set together with the percentages in a table. A simpler alternative is to use StatPad. Consider the data for defective parts in production (from an example in Chapter 4 of Practical Business Statistics). Here are the steps involved in creating the CDF:

1. Select all of the numbers in the data column and get ready to make copies of it using Edit/Copy from the main menu. One quick way to select the numbers is to click on the first number, then hit the End key, and then hold the Shift key while you hit the down arrow key.


2. Click on a wide-open area of the worksheet with room for two columns not touching any other data in your worksheet. Paste the data once (using Edit/Paste from the main menu), then select the empty cell under the last data value (one quick way is to hit End, then the down arrow key, and then the down arrow key again) and paste it again. Here is how it looks after pasting once, just before the second pasting:


3. Now sort this double data set as follows. First, select any single data value within the column (Excel should sort the entire column). Then choose Data/Sort from Excel’s main menu and select OK from the dialog box. You will then have two copies, sorted. Here is the worksheet just before sorting:


4. Create the column of percentages. Place the number 0 in the empty cell just to the right of the top cell of your sorted double data set by typing 0, Enter. Just below it, type the formula “=1/COUNT(Defects)” where you would substitute your data set name for “Defects” here. Just below that, type the = key, click on the cell with the 0 you just entered, then type “+1/COUNT(Defects)”, substituting your data set name for “Defects” and hit Enter. Finally, double-click the fill handle to complete the column (or copy this cell to the cells under it to fill out the column). Here is the result just before double-clicking on the fill handle - note that the cell P10 is where the zero was entered.


5. Select both columns of numbers and choose Insert/Chart from the menu as follows:


6. Choose “XY (Scatter)” as the Chart Type, and choose “Scatter with data points connected by lines without markers” as the Chart sub-type, as follows:


7. Click Next > twice, then eliminate some unnecessary features. Delete the legend by selecting the Legend tab and unselecting the “Show legend” checkbox, and eliminate gridlines by selecting the Gridlines tab and unselecting anything checked there. You may also add titles by clicking on the Titles tab:


8. Click on Finish to place the chart in the worksheet. The chart is selected so you see the sizing handles around it and the data it was made from.


9. Drag the sizing handles to make it larger. Then double-click on the Cumulative Percent axis (or on any number on this Y axis), select the Number tab, and choose Percentage with 0 Decimal places as follows:


10. In addition, if you don’t like the gray background, double-click on it and set the Patterns in the Area to None. To move the chart to a different place in the worksheet, drag just inside the edge but not on a sizing handle. To add or change titles, right-click just inside the chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To change the font size, right-click on the item (a title or an axis) and choose Format from the little pop-up menu. Here is the result:
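The table built in steps 1 through 4 (two sorted copies of the data next to the percentages 0, 1/n, 1/n, 2/n, 2/n, and so on) can also be sketched in Python, in case you want to see the arrangement without a worksheet. This is only an illustration with hypothetical data; the function name is made up for this sketch.

```python
def cdf_points(data):
    """Return (value, cumulative fraction) pairs that trace a step-function CDF.

    The data set is doubled and sorted, and the fractions run
    0, 1/n, 1/n, 2/n, 2/n, ..., 1, mirroring the worksheet columns.
    """
    n = len(data)
    doubled = sorted(list(data) * 2)
    fractions = [((i + 1) // 2) / n for i in range(2 * n)]
    return list(zip(doubled, fractions))

# Hypothetical data set with three values
points = cdf_points([4, 1, 3])
for value, fraction in points:
    print(value, round(fraction, 3))
```

Connecting these points with straight lines gives the vertical jumps and horizontal steps that the XY (Scatter) chart draws.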


Variability (Chapter 5)

Excel can quickly compute the basic variability measures. In this chapter we consider the standard deviation, the range, the coefficient of variation, and the variance.

Example: The Advertising Budget (Standard Deviation, Range, Coefficient of Variation, Variance)

This example shows how to find four measures of variability: the standard deviation, range, coefficient of variation, and variance. Consider the data for the advertising budget of firms within an industry group (from the example in Chapter 5 of Practical Business Statistics). For these formulas to work, the column of data should have a name such as “Budget” here (if your data does not yet have a name, then select the numbers in the data column and use Excel’s menu command Insert/Name/Define to give your data a name).

Use Excel’s STDEV function to find the sample standard deviation. To find the range, subtract the smallest from the largest using Excel’s MIN and MAX functions. To find the coefficient of variation, recall that we divide the standard deviation (STDEV function) by the average (AVERAGE function). Finally, to find the variance, use Excel’s VAR function. Here are the results:


If you need the population standard deviation instead of the sample standard deviation, you may use the function STDEVP instead of STDEV.
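For comparison, the same four measures can be sketched with Python's standard statistics module. The numbers below are hypothetical (they are not the advertising budget data), and the variable names are chosen only for this illustration.

```python
import statistics

# Hypothetical data, standing in for a named range such as "Budget"
budget = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(budget)            # like STDEV
data_range = max(budget) - min(budget)          # like MAX minus MIN
coef_variation = sample_sd / statistics.mean(budget)  # STDEV divided by AVERAGE
sample_variance = statistics.variance(budget)   # like VAR
population_sd = statistics.pstdev(budget)       # like STDEVP

print(sample_sd, data_range, coef_variation, sample_variance, population_sd)
```

The sample versions divide by n - 1 while the population version divides by n, which is why the population standard deviation is always the smaller of the two.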


Probability (Chapter 6)

Most of the probability chapter requires thinking, and perhaps a calculator, to get the answers. Of course you can use Excel to do your arithmetic for you - just select a cell, hit the = key, type an expression such as (0.1+0.3)*0.4, and hit Enter to see the answer. Excel can also be used to demonstrate the law of large numbers, to show you how the (random) relative frequency of an event becomes closer to the probability as the number of trials grows larger.

Example: The Law of Large Numbers

Suppose an event has probability 0.4. There is nothing random about this number. The randomness is in whether the event happens or doesn't each time you run the random experiment. If you run it 10 times, the event might happen exactly 4 times, but it also might happen twice, 6 times, or just once. In this example you will see how the relative frequencies, while being random, get closer to the probability as n increases.

1. Start with a new worksheet (File/New) and then type “Probability” in cell A2 and 0.4 in cell B2.

2. With cell B2 still selected, use Insert/Name/Define from the main menu to name it “Probability”.

3. In cell A8, type the formula “=IF(RAND()<Probability,1,0)” and hit Enter.

4. Hit the F9 key (called the “Recalculation key”). Each time you do, a new random number RAND() will be compared to the Probability: if it is smaller, then 1 is displayed and the event “happens”, otherwise you will see 0. Hit F9 over and over to get a sense of how a random event with probability 0.4 might occur. If you wish, select cell B2 and type in a different probability number, hit Enter, then recalculate over and over again with F9. Try it with probability 0.1 and 0.9 and others if you wish.


5. Now select cell A8 and choose Edit/Copy from the main menu. Next, click once with the mouse on cell A9. To select lots of cells from A9 on down, hold the Shift key while you hit Pg Dn over and over. When you have selected a few hundred or a few thousand cells, choose Edit/Paste from the main menu. You now have repeated the random experiment many times, once in each cell starting with A8:

6. Compute the relative frequencies as follows: enter the formula “=SUM($A$8:A8)/COUNT($A$8:A8)” into cell B8, being careful about the $ signs, which will help Excel when you copy the formula down the column by double-clicking on the fill handle, as shown here:


7. Hit the F9 key to see how the relative frequencies might change. Here is one possibility: note that the relative frequencies are 0 for the first two trials because the event has not happened yet. After 3 trials, the relative frequency is 1 out of 3, or 0.333333. After 4 trials it drops to 1 out of 4, or 0.25, and so forth:

8. To create a graph, first select the column of relative frequencies. This might be done by selecting cell B8, hitting End, then holding down Shift while you hit the down arrow key. Then choose Insert/Chart from the main menu and choose a Line Chart with the first Chart sub-type:

9. Click Next > twice, then delete the legend by selecting the Legend tab and unselecting the “Show legend” checkbox. You may also add titles by clicking on the Titles tab:

10. Click on Finish to place the chart in the worksheet, and resize it with the sizing handles. Note how the graph of the relative frequencies hovers fairly near to the probability of 0.4. Hit the recalculation key (F9) a few times to see how else it might have come out, with different randomness each time.

11. You can see what relative frequencies look like with different probabilities. Here is how they might look if you change Probability to 0.9:
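The worksheet experiment above can also be sketched in Python, which may help if you want to try many more trials than fit comfortably in a column. This is an illustration only: the function name and the seed are assumptions, and rng.random() plays the role of RAND().

```python
import random

def relative_frequencies(probability, trials, seed=None):
    """Simulate repeated trials and return the running relative frequency
    of the event after each trial (running SUM divided by running COUNT)."""
    rng = random.Random(seed)
    successes = 0
    frequencies = []
    for trial in range(1, trials + 1):
        if rng.random() < probability:   # the event "happens"
            successes += 1
        frequencies.append(successes / trial)
    return frequencies

# 10,000 trials of an event with probability 0.4 (seed chosen arbitrarily)
freqs = relative_frequencies(0.4, 10000, seed=1)
print(freqs[9], freqs[99], freqs[999], freqs[-1])
```

Changing the seed corresponds to hitting F9: the early relative frequencies move around a lot, while the later ones stay close to the probability.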


Random Variables (Chapter 7)

Excel can be used to perform, or help with, many of the basic calculations involving random variables. This chapter will cover discrete random variables (mean and standard deviation), binomial probabilities, normal probabilities, Poisson probabilities, and exponential probabilities.

Example: Profit and Economic Scenarios (Mean and Standard Deviation of a Discrete Distribution)

This example shows how to use Excel to help with the calculation of the mean and standard deviation of a discrete distribution. Consider the profits example (from the example in Chapter 7 of Practical Business Statistics). For these formulas to work, each column of data should have a name such as “Profit” and “Probability” here (if your data does not yet have a name, then select the numbers in a data column and use Excel’s menu command Insert/Name/Define to give your data a name). The mean, 3.65, is the sum of the products of value times probability, hence the formula is “=SUMPRODUCT(Profit,Probability)”. Give this cell (which now contains the mean) the name “Mean”. The standard deviation, 4.40, is the square root (SQRT function) of the sum of the products of the square of value minus mean times probability, hence the formula is “=SQRT(SUMPRODUCT((Profit-Mean)^2,Probability))”. These formulas give us 3.65 for the mean and 4.40 for the standard deviation:
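The two SUMPRODUCT formulas amount to the following arithmetic, sketched here in Python. The profit values and probabilities are an assumption made for illustration (the worksheet itself is not reproduced in this text); they were chosen because they give the mean of 3.65 and standard deviation of 4.40 quoted above.

```python
import math

# Assumed scenario values and probabilities (illustrative only)
profit = [10, 5, 1, -4]
probability = [0.20, 0.40, 0.25, 0.15]

# Mean: sum of value times probability, like SUMPRODUCT(Profit,Probability)
mean = sum(x * p for x, p in zip(profit, probability))

# Standard deviation: square root of the probability-weighted squared deviations
sd = math.sqrt(sum((x - mean) ** 2 * p for x, p in zip(profit, probability)))

print(round(mean, 2), round(sd, 2))
```

Note that the probabilities must add up to 1 for these formulas to describe a proper discrete distribution.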

Example: How Many of Five Possibilities Will Succeed? (Binomial Probabilities)

This example shows how to find probabilities for the binomial distribution, given the number of trials n and the probability π for each one. Consider the example in which n = 5 and π = 0.8. That is, you have 5 independent possibilities and each one has probability 0.8 of success.

To use Excel to compute binomial probabilities, use the formula “=BINOMDIST(a,n,π,FALSE)” to find the probability P(X = a) of being equal to a, and use the formula “=BINOMDIST(a,n,π,TRUE)” to find the probability P(X ≤ a) of being less than or equal to a, where the “FALSE” and “TRUE” in Excel’s binomial distribution formula refer to whether the probability distribution is cumulative or not.

Here are some results. The probability that exactly 3 succeed is 0.2048, the probability that 3 or fewer succeed is 0.2627, and the probability that 3 or more succeed is 0.9421 (evaluated as “not 2 or less”):
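These three probabilities can be reproduced from the binomial formula itself. The following Python sketch is offered only as a cross-check; the function names are made up for this illustration and are not Excel functions.

```python
from math import comb

def binom_pmf(a, n, p):
    """P(X = a), the non-cumulative (FALSE) case."""
    return comb(n, a) * p ** a * (1 - p) ** (n - a)

def binom_cdf(a, n, p):
    """P(X <= a), the cumulative (TRUE) case."""
    return sum(binom_pmf(k, n, p) for k in range(a + 1))

n, p = 5, 0.8
exactly_3 = binom_pmf(3, n, p)
at_most_3 = binom_cdf(3, n, p)
at_least_3 = 1 - binom_cdf(2, n, p)   # "not 2 or less"

print(round(exactly_3, 4), round(at_most_3, 4), round(at_least_3, 4))
```

The "3 or more" case uses the complement because the cumulative form only answers "less than or equal to" questions.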

Example: Standard Normal Probabilities

This example shows how to find probabilities for the standard normal distribution in Excel. The standard normal probability table (Chapter 7 of Practical Business Statistics) gives the probability that a normal distribution with mean 0 and standard deviation 1 will be less than a given value. For example, the probability that a standard normal is less than 1.38 is 0.9162. These may easily be found using Excel’s NORMSDIST function as follows:


Example: Sales Forecasting (Normal Probabilities)

This example shows how you can solve normal probability problems without standardizing the numbers. Because you tell Excel the mean and standard deviation, you can ask about probabilities concerning the original numbers (no need to subtract the mean and divide by the standard deviation; Excel’s NORMDIST function will do that for you).

Consider the sales forecasting example (from Chapter 7 of Practical Business Statistics). Sales are forecast as having a mean of $20 million and a standard deviation of $3 million. Here you find the probability (0.0478) that sales will be less than $15 million, as well as three other probabilities:

To use Excel to compute these probabilities, we use the function “NORMDIST(value,mean,standardDeviation,TRUE)” to find the probability that a normal distribution with specified mean and standard deviation is less than some value. There is no need to standardize because Excel will do this for you as part of the calculation. The first calculation is straightforward because it is a probability of being less. The second calculation is one minus the NORMDIST function because it is a probability of being greater. The third calculation is the difference of two NORMDIST calculations because it is the probability of being between two values. The fourth calculation is one minus the difference of two NORMDIST calculations because it is the probability of NOT being between two values. Here are the results:
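The four kinds of calculation (less than, greater than, between, and not between) can be sketched with Python's statistics.NormalDist, whose cdf method plays the role of NORMDIST with TRUE. Only the $15 million threshold and the 1.38 standard normal value come from the examples above; the other thresholds (25, 18, and 22) are hypothetical.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1 (like NORMSDIST)
p_standard = NormalDist().cdf(1.38)

# Sales forecast: mean 20, standard deviation 3 (in $ millions)
sales = NormalDist(mu=20, sigma=3)

p_less_15 = sales.cdf(15)                    # probability of being less
p_greater_25 = 1 - sales.cdf(25)             # one minus: probability of being greater
p_between = sales.cdf(22) - sales.cdf(18)    # difference: between two values
p_not_between = 1 - p_between                # one minus the difference

print(round(p_standard, 4), round(p_less_15, 4))
print(p_greater_25, p_between, p_not_between)
```

Because 15 and 25 are equally far from the mean of 20, the first two sales probabilities are equal, which is a handy check on the "one minus" step.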

Example: How Many Warranty Returns (Poisson Probabilities)

This example shows how to find probabilities for the Poisson distribution. Consider the warranty returns example (from Chapter 7 of Practical Business Statistics) where you expect 1.3 of your products to be returned, on average, each day for warranty repairs. Assuming a Poisson distribution, the POISSON function can give you either the probability that a particular number will be returned, or the cumulative probability that a particular number or less will be returned on a particular day.

Here is how to use Excel’s function “POISSON(value,mean,FALSE)” to find the probability that a Poisson random variable is exactly equal to some value, and how to use “POISSON(value,mean,TRUE)” to find the probability that a Poisson random variable is less than or equal to some value. The terms TRUE and FALSE in the function refer to whether the probability is cumulative or not. Here are the results:
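As a cross-check, the two forms of the POISSON function can be written out from the Poisson probability formula. This Python sketch uses the mean of 1.3 from the example; the function names are made up for this illustration.

```python
from math import exp, factorial

def poisson_pmf(value, mean):
    """P(X = value), the non-cumulative (FALSE) case."""
    return exp(-mean) * mean ** value / factorial(value)

def poisson_cdf(value, mean):
    """P(X <= value), the cumulative (TRUE) case."""
    return sum(poisson_pmf(k, mean) for k in range(value + 1))

mean_returns = 1.3
p_none = poisson_pmf(0, mean_returns)        # no returns today
p_exactly_2 = poisson_pmf(2, mean_returns)   # exactly 2 returns
p_at_most_2 = poisson_cdf(2, mean_returns)   # 2 or fewer returns

print(p_none, p_exactly_2, p_at_most_2)
```

The cumulative probability is simply the running total of the individual probabilities from 0 up to the given value.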

Example: Customer Arrivals (Exponential Probabilities)

This example shows how to find probabilities for the exponential distribution. Consider the customer arrivals example (from Chapter 7 of Practical Business Statistics) where customers arrive independently at a constant mean rate of 40 per hour. The random variable is the waiting time until the next customer arrives. The mean waiting time is 1.5 minutes, computed as 60 minutes per hour divided by 40 expected arrivals in that time.

Using Excel’s function EXPONDIST(value,1/mean,TRUE), you can find the probability that an exponential random variable with a given mean is less than or equal to the given value. Note that Excel’s EXPONDIST function uses 1/mean, not the mean itself. Here are two calculations, the probability of waiting 5 minutes or less for the next customer, and the probability of waiting 2 minutes or less:
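The cumulative exponential probability is 1 minus e raised to the power of minus value divided by mean, so the two calculations can be sketched directly in Python. The function name is an assumption for this illustration; it takes the mean itself, whereas EXPONDIST takes 1/mean.

```python
from math import exp

def expon_cdf(value, mean):
    """P(X <= value) for an exponential random variable with the given mean."""
    return 1 - exp(-value / mean)

mean_wait = 60 / 40   # 1.5 minutes between customers, on average

p_within_5 = expon_cdf(5, mean_wait)   # wait 5 minutes or less
p_within_2 = expon_cdf(2, mean_wait)   # wait 2 minutes or less

print(round(p_within_5, 4), round(p_within_2, 4))
```

Passing the mean where the rate is expected (or the reverse) is an easy mistake to make, so it is worth confirming that a longer wait always gives a larger cumulative probability.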


Random Sampling (Chapter 8)

Excel can choose a random sample with or without replacement. The standard error of the average may easily be found using Excel formulas.

Example: Choosing a Random Sample of 3 from a Population of 10

Here is how to use Excel to choose a random sample of size n = 3 from a population of size N = 10 by shuffling the population, using a column of random numbers placed next to the population listing.

1. Create a column of frame numbers, in this case from 1 to 10. To do this quickly (even for much larger N), you might type “1” in cell A3, hit Enter, then use Excel’s menu commands Edit/Fill/Series with Series in Columns, Step value 1 and Stop value 10 as shown here:

2. Insert random numbers by typing “=RAND()” in cell B3, just to the right of the first frame number, hit ENTER, and then copy the result down the column to produce a column of random numbers (this is quickly done by double-clicking the little fill handle at the lower right corner of the selected cell B3).

3. To shuffle the population, first select both columns of numbers (the frame numbers and the random numbers). For a large population, this is easily done by selecting the first frame number (cell A3 here), holding Shift while you hit the right arrow key, hitting End, and holding Shift while you hit the down arrow key. Then use Data/Sort from Excel’s main menu, being sure to sort by the random numbers.


4. After the columns are sorted randomly, you may take the first three frame numbers to obtain your random sample, which results in selection of items 7, 10, and 2 in this example.
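The shuffle-and-take-the-first-n idea in steps 1 through 4 can be sketched in Python as follows. This is an illustration under assumptions: the function name and the seed are made up, and the sample it draws will generally differ from the items 7, 10, and 2 shown in the example.

```python
import random

def sample_by_shuffling(population_size, sample_size, seed=None):
    """Choose a random sample without replacement by sorting the frame
    numbers on a column of random numbers and keeping the first few."""
    rng = random.Random(seed)
    frame = range(1, population_size + 1)               # frame numbers 1..N
    keyed = [(rng.random(), item) for item in frame]    # like =RAND() beside each
    keyed.sort()                                        # sort by the random numbers
    return [item for _, item in keyed[:sample_size]]

chosen = sample_by_shuffling(10, 3, seed=42)
print(chosen)
```

Because each frame number appears exactly once in the shuffled list, no item can be chosen twice, so this is sampling without replacement.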

Example: Shopping Trips (Standard Error of the Average)

This example shows how to find the standard error of the average for a column of data, once you have the standard deviation, by dividing it by the square root of n. Consider the shopping trips example (from Chapter 8 of Practical Business Statistics). Suppose you put the standard deviation, S = 8.63, into cell A15 and the sample size, n = 200, into cell A16. The standard error of 0.610 may then be found using the formula “=A15/SQRT(A16)” as follows:


Alternatively, you can compute the standard error all at once with the formula “=STDEV(rangeName)/SQRT(COUNT(rangeName))”, where “rangeName” is the name of your data.
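Both routes to the standard error can be sketched in Python: one starting from the summaries S and n, and one starting from the data column. The function names and the small data list are hypothetical; only S = 8.63 and n = 200 come from the example.

```python
import math
import statistics

def standard_error(sd, n):
    """Standard error of the average from the summaries, like =A15/SQRT(A16)."""
    return sd / math.sqrt(n)

def standard_error_of(data):
    """All at once, like =STDEV(rangeName)/SQRT(COUNT(rangeName))."""
    return statistics.stdev(data) / math.sqrt(len(data))

se_shopping = standard_error(8.63, 200)
print(round(se_shopping, 3))

# Hypothetical data column
print(standard_error_of([12, 15, 9, 20, 14]))
```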


Confidence Intervals (Chapter 9)

You can use Excel to compute confidence intervals for you, given a sample of data, at any specified confidence level. Excel will even look up the t table value for you.

Example: Controlling the Average Thickness of Paper (Confidence Interval)

This example shows how to construct confidence intervals for a sample of data. Consider the example of paper thickness (Table 9.1.2 of Practical Business Statistics).

Here is how to use Excel to find the confidence interval. First, if needed, give the data column a name (such as “Thickness” here) by selecting the numbers and using Excel’s Insert/Name/Define menu command. Next, use Excel’s AVERAGE, STDEV, and COUNT functions to compute the average, the standard deviation, and the sample size respectively and name the cells so they can be easily used. The 95% confidence interval formula is then computed as average plus or minus t times the standard error, where we use Excel’s TINV function to find the t value. Excel’s TINV function is shown using “1-0.95” because it needs “one minus the confidence level” instead of the confidence level itself. The term n-1 is used because TINV needs the number of degrees of freedom.

To use a confidence level other than 95%, you need only change the 0.95 in the TINV function. For example, for a 99% confidence interval, you would use 0.99 in place of 0.95.


Example: Controlling the Average Thickness of Paper (One-sided Confidence Intervals)

Here is how to find a one-sided confidence interval. Consider the example of paper thickness (Table 9.1.2 of Practical Business Statistics).

In order to find a one-sided 95% confidence interval, the t table value changes to TINV(2*(1-0.95),n-1), placing all the probability of error on one side because the other side extends indefinitely without chance of error. To claim that the population mean paper thickness is at least a certain value, the appropriate calculation is average minus t times standard error (so that the one-sided interval from here to all higher values includes the average). Using the average value of 0.0040147, the standard deviation of 0.0002614, and the sample size of 15, we have:

To use a confidence level other than 95%, you need only change the 0.95 in the TINV function. For example, for a 99% confidence interval, you would use 0.99 in place of 0.95.
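The arithmetic behind both the two-sided and the one-sided interval can be sketched in Python using the paper thickness summaries given above. Python's standard library has no counterpart to TINV, so this sketch assumes the two t values (2.145 and 1.761, for 14 degrees of freedom) have been read from a standard t table; they are not computed here.

```python
import math

# Summaries from the paper thickness example
average = 0.0040147
sd = 0.0002614
n = 15

standard_error = sd / math.sqrt(n)

# Assumed t table values for n - 1 = 14 degrees of freedom
T_TWO_SIDED_95 = 2.145   # corresponds to TINV(1-0.95, 14)
T_ONE_SIDED_95 = 1.761   # corresponds to TINV(2*(1-0.95), 14)

# Two-sided: average plus or minus t times the standard error
two_sided = (average - T_TWO_SIDED_95 * standard_error,
             average + T_TWO_SIDED_95 * standard_error)

# One-sided ("at least"): average minus t times the standard error
one_sided_lower = average - T_ONE_SIDED_95 * standard_error

print(two_sided, one_sided_lower)
```

The one-sided bound sits closer to the average than the two-sided lower limit because all of the 5% error probability is placed on one side.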


Hypothesis Testing (Chapter 10)

Excel can help you perform hypothesis tests for various situations involving population means for which univariate data are available: one- and two-sided tests, various test levels, and two-sample problems (both paired and unpaired).

If you are using the confidence interval approach to hypothesis testing (for example, deciding a two-sided test by seeing whether the reference value is in the interval), please use the confidence intervals explained earlier for Chapter 9.

Instead of having you specify the test level (for example 5%), Excel can give you the p-value (as well as the t value and basic summaries). You may then complete the test at any level by comparing the computed p-value to the test level. For example, if the reported p-value is less than 5%, the test is significant at the 5% level (otherwise it is not significant). You may wish to review the discussion of p-values in Chapter 10 of Practical Business Statistics.

Example: Controlling Paper Thickness (the t Test: Computing the t Statistic and Finding the p-Value)

This example shows how to test a population mean against a known reference value based on a random sample from the population. Consider the data on paper thickness (Table 9.1.2 of Practical Business Statistics), to be tested against the reference value μ0 = 0.00385. If your data are not yet named, please select your column of numbers and use Excel’s menu command Insert/Name/Define. To find the t statistic, we subtract the reference value, 0.00385, from the average and then divide by the standard error (which is standard deviation divided by square root of n). To find the p-value, we use the Excel formula =TDIST(ABS(t),n-1,2) where t is the computed t statistic and n is the sample size (the “2” tells Excel to find a 2-sided p-value). Here, then, are the results of an ordinary two-sided test for this example:


Example: Controlling Paper Thickness (One-sided t Test)

For a one-sided test, the t statistic and sample size n both stay the same as before, but the p-value must be computed differently. These calculations are different depending on the side being tested.

First, consider the case of a one-sided test to see if the sample average is significantly larger than the reference value (that is, the research hypothesis claims that the population mean is larger than the reference value). In this case, the p-value is either =TDIST(ABS(t),n-1,1) or =1-TDIST(ABS(t),n-1,1), depending on whether t is positive or negative respectively. Using the t statistic of 2.4395561 and sample size n = 15 for the paper thickness example, the one-sided p-value is 0.0143, found as follows:

Next, consider the case of a one-sided test to see if the sample average is significantly smaller than the reference value (that is, the research hypothesis claims that the population mean is smaller than the reference value). In this case, the p-value is either =TDIST(ABS(t),n-1,1) or =1-TDIST(ABS(t),n-1,1), depending on whether t is negative or positive respectively. Using the t statistic of 2.4395561 and sample size n = 15 for the paper thickness example, the one-sided p-value is 0.9857, found as follows:


This example has been used to illustrate the calculations. Note that, in real life, you would not compute both of these tests (significantly greater, significantly smaller) on the same data set because you would have to choose the side you wished to test before performing the test.
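To see how the two-sided and one-sided p-values relate to one another, here is a Python sketch of the same calculations. It is only an approximation under stated assumptions: the standard library has no TDIST, so the tail probability is found by numerically integrating the t density with Simpson's rule, and the function names are made up for this illustration. Because the summaries are rounded, the t statistic comes out very slightly different from the 2.4395561 shown above.

```python
import math

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0, like TDIST(t, df, 1),
    using Simpson's rule on the interval from 0 to t."""
    h = t / steps
    total = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3

def t_test(average, reference, sd, n):
    """Return t and the two-sided, 'larger', and 'smaller' p-values."""
    t = (average - reference) / (sd / math.sqrt(n))
    tail = upper_tail(abs(t), n - 1)
    p_two_sided = 2 * tail
    p_larger = tail if t > 0 else 1 - tail
    p_smaller = tail if t < 0 else 1 - tail
    return t, p_two_sided, p_larger, p_smaller

t, p_two, p_larger, p_smaller = t_test(0.0040147, 0.00385, 0.0002614, 15)
print(round(t, 3), round(p_two, 4), round(p_larger, 4), round(p_smaller, 4))
```

The two one-sided p-values always add up to 1, and the two-sided p-value is twice the smaller of them.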

Example: Reactions to Advertising (Paired t Test)

This example shows how to perform a paired t test to see whether two paired columns of data are significantly different or not, on average. This test begins by subtracting the two columns (which is permitted because the situation is paired) and then testing these differences against the reference value 0. Consider the data on reactions to advertising (Table 10.6.1 of Practical Business Statistics).

The differences are calculated by using Excel’s arithmetic formulas. In this case, the formula =D10-C10 was entered into cell E10 to compute After - Before for the first person. This formula was then copied down the column (either using copy and paste from the main menu, or simply double-clicking the little fill handle at the lower right corner of the selected cell E10). The two-sample paired t test then becomes an ordinary one-sample t test of the differences, using the reference value 0. The result is p = 0.03, and since p < 0.05, we conclude that there is a significant difference between the Before and the After scores. Here are the calculations:


Example: Gender Discrimination and Salaries (Two-Sample Unpaired t Test)

This example shows how to perform a two-sample t test for the small-sample situation (see the two formulas for the standard error of the difference in Chapter 10 of Practical Business Statistics).

Consider the data on gender discrimination and salaries (Table 10.6.4 of Practical Business Statistics). The hypothesis test (to see if the average salaries of men and women are different from one another) starts with the basic summaries: the average of each group, the standard deviation of each group, and the sample size of each group. Each of these summaries is given a name to make it easy to use (by selecting the cell and using the menu command Insert/Name/Define).


Then you can find the standard error of the average difference, the t statistic, and the p-value from these summaries. The conclusion is that there is a very highly significant difference between men's and women's salaries (p < 0.001). Here are the Excel results:
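The step from the six summaries to the t statistic can be sketched in Python. This is an illustration under assumptions: the two standard error formulas shown are the commonly used large-sample (unpooled) and small-sample (pooled) versions, which are assumed to be the two formulas the textbook refers to, and the summary values are hypothetical rather than the salary data.

```python
import math

def se_large_sample(s1, n1, s2, n2):
    """Unpooled standard error of the difference between two averages."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def se_small_sample(s1, n1, s2, n2):
    """Pooled standard error, assuming equal variability in the two groups.
    The t statistic then has n1 + n2 - 2 degrees of freedom."""
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

# Hypothetical summaries for two groups
average_1, sd_1, n_1 = 52.0, 6.0, 12
average_2, sd_2, n_2 = 45.0, 5.0, 10

se_difference = se_small_sample(sd_1, n_1, sd_2, n_2)
t_statistic = (average_1 - average_2) / se_difference
degrees_of_freedom = n_1 + n_2 - 2

print(se_difference, t_statistic, degrees_of_freedom)
```

In Excel, the two-sided p-value would then come from TDIST applied to the absolute value of this t statistic with these degrees of freedom.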


Correlation and Regression (Chapter 11)

Excel provides assorted methods for the analysis of bivariate data: correlation, plotting, and regression analysis.

Example: Contacts and Sales (Correlation)

This example shows how to find the correlation in Excel by using the CORREL function after naming your two columns of numbers (for example, by selecting a column of numbers and using the Insert/Name/Define menu command to name it). Here is how to find the correlation of 0.985 between contacts and sales:
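For readers who want to see what CORREL computes, here is a short Python sketch of the correlation coefficient. The two data lists are hypothetical (they are not the contacts and sales data), and the function name is chosen only for this illustration.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient of two equal-length columns of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data
contacts = [10, 20, 30, 40, 50]
sales = [12, 25, 31, 45, 48]

r = correlation(contacts, sales)
print(r)
```

The result is always between -1 and 1, and it does not change if the two columns are swapped.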

Example: Internet Usage Ratings (Plotting the Data)

This example shows how to use Excel to create a scatterplot for a bivariate data set. Consider the data on Internet usage ratings (Table 11.1.3 of Practical Business Statistics). It is easiest if the two columns are next to each other, with the X-axis data to the left of the Y-axis data. We will create a scatterplot of Time (vertical) against Pages (horizontal).

1. Begin by selecting both columns of numbers (with the horizontal X axis data to the left).

2. Choose Insert/Chart from the main menu.


3. Choose XY (Scatter) from the list of chart types, and the first Chart sub-type (“Scatter. Compares pairs of values”).

4. Continuing with Excel’s steps, you can create a scatterplot as an object in the worksheet. Here is how the initial dialog box looks after you select the data and begin to insert a chart, together with the finished chart in the worksheet.

5. In addition, if you don’t like the gray background in the chart, double-click on it and set the Patterns in the Area to None. To eliminate the legend at the right in the chart, right-click on it and clear. To eliminate gridlines, right-click on one and clear. To change the size of the chart, drag one of the sizing handles (which appear in the corners and in the middle of the sides when you click just inside the edge of the chart). To move the chart to a different place in the worksheet, drag just inside the edge but not on a sizing handle. To add or change titles, right-click just inside the chart, select Chart Options from the little pop-up menu, and choose the Titles tab. To change the font size, right-click on the item (a title or an axis) and choose Format from the little pop-up menu. To change the number format of an axis, double-click on it and select Number. Here is one possible result:

[Figure: scatterplot of Time (vertical axis, 0 to 90) against Pages (horizontal axis, 0 to 200)]

Example: Internet Usage Ratings (Plotting the Least-Squares Line)

Here is how to use Excel to add a least-squares line to a scatterplot. We continue with the Internet usage ratings data.

Right-click with the mouse on a data point in the chart, then select Add Trendline from the context-sensitive menu that appears, and finally specify Linear as the Trend/Regression type before clicking OK. The initial step of right-clicking on a data point is shown below, followed by the end result after the line has been added.


Example: The Stock Market (Regression Analysis)

Here is how to perform regression analysis with Excel, using data from Table 11.1.6 on the daily percent change in the S&P500 stock market index, trying to predict today’s market movement from yesterday’s. As an alternative, you may wish to consider using StatPad, which will provide more explanation of the results and give more output and charting options.

1. First give a name to each column of numbers if needed (for example, by selecting a column of numbers and using Excel’s Insert/Name/Define menu command).

2. Look under the Tools menu for Data Analysis, and then select Regression. In the resulting dialog box, you may specify the range name for the Y variable (“Today” in this example) and for the X variable (“Yesterday”). If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins from the Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was not installed when Excel was installed on your computer, you will need to install it from the Excel CD-ROM.

3. Click “Output Range” in the dialog box and specify where in the worksheet you want the results to be placed, then click OK. Here are the dialog box and its results, which include the R² value of 0.0132 (which tells you that only 1.32% of the variation in market performance can be explained by yesterday’s market change), the standard error of estimate Se of 0.0087 (which tells you that, after using the predicted value, the actual performance is typically different by about 0.87 percentage points), as well as the estimated regression coefficient b = 0.1114 (which tells you that, for each percentage point of yesterday’s market performance, we expect today’s market performance to be up about an additional one-tenth of that, on average), its standard error Sb = 0.1522, the t statistic t = 0.732, and its p-value of 0.468 (which is not significant because p > 0.05, telling you that there is no significant relationship between yesterday’s and today’s market performance).

From these results, looking at the last table’s Coefficients and recognizing that “X variable 1” refers to the X variable “Yesterday”, you can see that the least-squares prediction equation is

Today = 0.000398 + 0.111421 Yesterday

Because the R² is 0.0132, or 1.32% (from the first table of Regression Statistics), it is clear that knowing what the market did yesterday does not help you very much to predict what it will do today.

To perform the t test, you may look at the t statistic (“t Stat” for “X Variable 1” in the last table) of 0.732 and its p-value of 0.468. Because p > 0.05, the relationship between Yesterday’s and Today’s stock market movements is not significant.

This is also clear from the 95% confidence interval for the regression coefficient, which extends from -0.196226 to 0.419068 and includes the reference value 0. These numbers are found in the last row of the last table under the headings “Lower 95%” and “Upper 95%”.
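
As a quick cross-check of these numbers, here is a minimal sketch in Python that starts from the values reported above (the coefficient, its standard error, and the 95% interval). The variable names are illustrative assumptions; the numbers are the ones quoted in the text. It shows that the t statistic is the coefficient divided by its standard error, that the confidence interval is centered on the coefficient, and that the interval contains 0.

```python
# Values quoted from the regression output discussed above.
b = 0.111421                          # coefficient for Yesterday
s_b = 0.1522                          # standard error of the coefficient
lower, upper = -0.196226, 0.419068    # reported 95% confidence interval

# The t statistic is the coefficient divided by its standard error.
t_stat = b / s_b

# The confidence interval is symmetric around the coefficient.
midpoint = (lower + upper) / 2

# If the interval contains 0, the relationship is not significant at the 5% level.
contains_zero = lower <= 0.0 <= upper

print(round(t_stat, 3))
print(round(midpoint, 6))
print(contains_zero)
```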

Multiple Regression (Chapter 12)

Excel’s Tools/Data Analysis commands allow you to perform multiple regression analysis and correlation analysis of multivariate data. As an alternative, you may wish to consider using StatPad, which will allow you to pick and choose your X variables even if they are not right next to one another, and will also explain the results and give more output and charting options.

Example: Magazine Ads (Multiple Regression)

Here is how to perform multiple regression analysis on the magazine ad data from Table 12.1.3 in Chapter 12 of Practical Business Statistics, to understand how the cost per page of advertising can be (at least partially) explained by the magazine’s characteristics. Here is the data set we will be working with (there are 55 magazines in all - this is just the top of the database).

1. Look under the Tools menu for Data Analysis, and then select Regression. If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins from the Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was not installed when Excel was installed on your computer, you will need to install it from the Excel CD-ROM.

2. In the resulting dialog box, you may specify the range for the Y variable by selecting the label at the top along with the column of numbers to be predicted, dragging the mouse down the column starting at the label “Page” in cell D9 down to the Page Cost value for the last magazine in cell D64. The X variables must be right next to each other, forming a rectangular range of rows and columns. In this case the X variable range, including labels, is from E9 (the label “Audience”) to G64 (the Income measure for the last magazine), selected by dragging the mouse diagonally from one corner to the other. Here is the resulting dialog box:

What should you do if you do not want to use all of the X variables? For example, to leave one out you should create a copy of the X variables (selecting them, using Edit/Copy, selecting a cell in a different part of the worksheet, then using Edit/Paste), select the column of data to be omitted, delete it with the Del key (this is why we use a copy!), select the columns to its right by dragging the mouse diagonally across from one corner to the other, then use Edit/Cut, move to the empty column, and use Edit/Paste to close the gap. You now have a copy of the X data that omits the column you are not using.

3. Click “Labels” in this dialog box because we have included labels at the top of the data columns. This was done to make the results easier to interpret (so that Excel can use the names of the variables instead of just “X variable 2” for example).

4. Click “Output Range” in this dialog box and specify (by clicking the mouse or typing a cell address) where in the worksheet you want the results to be placed, then click OK. The result is not a pretty sight - it still needs to be tidied up because some cells cannot be read because they are blocked by others and the numbers are not aligned nicely.

5. Now tidy it up and format the results. If there is more in a cell than you can see, select it and use the menu command Format/Column/AutoFit Selection in order to make the column wider so that you can see it all. To control the number of decimal places shown, select the cell(s), then use Format/Cells, then under the Number tab you might choose Number and then specify the number of decimal places. The last two columns have been deleted because they contain no new information (they just repeated the columns before them). Here are the results after tidying up:

The results in the first table of Regression Statistics include the R² value of 0.787 (which tells you that 78.7% of the variation in Page Costs can be explained by the X variables) and the standard error of estimate Se of 21,578 (which tells you that Page Costs can be predicted to within about this many dollars).

The ANOVA table includes the F test, whose p-value 3.81619E-17 is very small (the “E-17” tells you to move the decimal point to the left 17 places, so actually p = 0.0000000000000000381619). In particular, p < 0.001 and the result is very highly significant.

The last table has the Coefficients, including the constant term of 4,042.799 and the regression coefficients: 3.788 for Audience, -123.634 for Male, and 0.903 for Income. The Standard Error column shows standard errors for each of these coefficients. Next are their t statistics and p-values (note that Audience and Income are significant, but Male is not). Finally you have 95% confidence intervals for the regression coefficients - for example, we are 95% sure that the effect of an additional dollar of Income is to increase Page Costs somewhere between $0.161 and $1.645, on average.
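
To see how these coefficients combine into a prediction, here is a minimal sketch in Python. The intercept and coefficients are the ones quoted above; the function name and the example magazine (its values and their units) are hypothetical and invented for illustration. The last lines show that Excel’s “E-17” notation is ordinary scientific notation.

```python
# Coefficients as reported in the Excel output discussed above.
INTERCEPT = 4042.799
COEFFICIENTS = {"Audience": 3.788, "Male": -123.634, "Income": 0.903}


def predict_page_cost(magazine):
    """Predicted page cost: the constant plus each coefficient times its X value."""
    return INTERCEPT + sum(COEFFICIENTS[name] * magazine[name] for name in COEFFICIENTS)


# Hypothetical magazine; these values are invented for illustration only.
example = {"Audience": 10000, "Male": 50, "Income": 40000}
predicted = predict_page_cost(example)
print(f"Predicted page cost: {predicted:,.2f}")

# Excel's "3.81619E-17" is ordinary scientific notation for a very small number.
p_value = float("3.81619E-17")
print(p_value < 0.001)
```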

Example: Magazine Ads (Correlations)

Here is how to find the correlation matrix of a multivariate data set, giving you the correlation of each pair of variables.

1. Look under the Tools menu for Data Analysis, and then select Correlation. In the resulting dialog box, you may select the labels at the top of each column as part of the data range (which must be data columns arranged right next to each other, forming a rectangular range of rows and columns). Also click on “Labels in First Row” so that Excel can use the variable names to help you understand the results. In this case the Input Range is from D9 (the label “Page”) to G64 (the Income measure for the last magazine), selected by dragging the mouse diagonally from one corner to the other. Here is the resulting dialog box:

2. Click on OK. You can see, for example, that the correlation between Page Costs and Audience is the highest, with r = 0.872. The correlation between Audience and Income is negative, with r = -0.353. Here are the results:
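
For readers who want to see what the Correlation tool computes, here is a minimal sketch in Python of a correlation matrix built from the usual correlation coefficient for each pair of columns. The function names and the small data set are hypothetical; they are not the magazine data from Table 12.1.3.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r between two equal-length columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation of every pair of named columns, as a nested dictionary."""
    names = list(columns)
    return {a: {b: correlation(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns, invented for illustration only.
data = {
    "Page": [25000, 60000, 90000, 40000],
    "Audience": [3000, 12000, 20000, 7000],
    "Income": [45000, 38000, 30000, 41000],
}

matrix = correlation_matrix(data)
for name in data:
    print(name, [round(matrix[name][other], 3) for other in data])
```

Each variable has correlation 1 with itself, and the matrix is symmetric, which is why Excel shows only the lower triangle.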

Time Series (Chapter 14)

Excel can be used to perform a trend-seasonal analysis of time series data, accomplished by performing a number of detailed steps one at a time to produce the results. As an alternative, you may wish to consider using StatPad, which will perform this analysis automatically, with many output and charting options.

Example: Ford Automotive Sales (Trend-Seasonal Analysis)

This example shows how to perform a trend-seasonal analysis of quarterly data using Excel. This analysis is built up one basic step at a time by finding the moving average, the ratio-to-moving-average, each seasonal index, the seasonally adjusted series, and the long-term trend. Consider the data for Ford Motor Company’s Automotive Sales from Table 14.2.1 of Practical Business Statistics. Here is the data set (it actually extends through 2000 - this is just the top of the database).

1. To find the moving average for a quarterly series like this one, remember that it starts with the third row (so that we can average a full year’s worth of data, with a half-year before and a half-year after). So we start in the third quarter (cell D6 in this case). Note that if we go back two quarters and ahead two quarters there are two “Quarter 1” values, so they must have weight 0.5 each so that quarters 1 through 4 are treated equally. The easiest way to compute this weighted average is actually to average two overlapping full years’ worth of data: the four quarters of 1994 (cells C4:C7 here) with the full year beginning one quarter later (cells C5:C8). This is why, in this case, you can use the formula

=AVERAGE(C4:C7,C5:C8)

in cell D6 for the first moving-average value. An easy way to enter the formula is to drag down each four-quarter range instead of typing in its address. Here is how it looks so far:

If you have a monthly instead of a quarterly time series, then instead of the “quarter” column with 1, 2, 3, 4, 1, 2, 3, ... you would have a “month” column with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, ... and the moving average would start in the seventh row instead of the third. The formula for the moving average would again be the average of two overlapping full years’ worth of data: (1) the first 12 months and (2) the full year beginning one month later, with months 2 through 13. With monthly data the moving average is also unavailable for the last six months.

2. Double click on the fill handle to copy this formula down the column, then select and delete the last two entries of this column because the moving average is unavailable for the last two quarters. (For monthly data, delete the last six entries). Here is the result so far:

3. Find the ratio-to-moving-average by dividing the Sales value by the Moving Average value (in this case, place the formula =C6/D6 in cell E6), then double-click on the fill handle to copy this formula down the column. Here is the result:

4. The seasonal index can be computed for all quarters, even when the moving average and ratio are unavailable. The seasonal index for a given quarter (1, 2, 3, or 4) is the average of all the ratios for that quarter, averaged over all the years that have a ratio for that quarter. For example, the seasonal index for quarter 1 is the average of the ratio 1.03142 for quarter 1 in 1995 with the ratio 1.000796 for quarter 1 in 1996, and so forth through quarter 1 of 2000. Here is a fairly easy way to compute the seasonal index column by using the SUMIF(RANGE,CRITERIA,SUM_RANGE) function to sum the ratios for the selected quarter, divided by the COUNTIF(RANGE,CRITERIA) function that counts how many there are.

In this case the formula to put in cell F4 is

=SUMIF($B$6:$B$29,B4,$E$6:$E$29)/COUNTIF($B$6:$B$29,B4)

Note carefully the use of dollar signs in the cell addresses: references with $ will not change when the formula is copied. The RANGE is $B$6:$B$29 in both functions (SUMIF and COUNTIF), consisting of those values in the “Quarter” column for which ratios are available, so that the first two and last two rows are excluded. The CRITERIA in both functions (SUMIF and COUNTIF) is simply B4, which refers to the Quarter number, 1, for the first row of data. No dollar signs are used here so that when the formula is copied, the result will be for the appropriate quarter for that row. The SUM_RANGE is $E$6:$E$29 for the SUMIF function, telling it to sum up the ratio values for the specified quarter number, specifying only those rows for which ratios are available.

After entering this formula into cell F4, drag the fill handle down the entire column (or use copy and paste) to find all the seasonal values. Note that they repeat exactly from one year to the next; for example, the quarter 1 seasonal index is always 0.9993252 for all years:

5. The seasonally adjusted values are found by dividing each Sales figure by its Seasonal Index. In this case, the formula is =C4/F4. Enter the formula into the top cell, then copy down the column, perhaps by double-clicking on the fill handle:

6. Before you can find the long-term trend, you need a “time period” column consisting of the numbers 1, 2, 3, ... counting how many time periods have gone by. A quick way to do this is to start with 1 and 2 in the first two rows (H4 and H5 in this example), select both cells, then double-click the little fill handle in the lower right corner of the selected cells.

7. Use this column of time periods to predict the seasonally adjusted column (Y) from the time period (X) using regression analysis. A quick way to do this is with the FORECAST(X,KNOWN_Y’S,KNOWN_X’S) function, using the first value in the time period column for X, using the entire seasonally-adjusted series with absolute $ cell addressing as the KNOWN_Y’S, and using the entire time period column with absolute $ cell addressing as the KNOWN_X’S. In this case, entering the formula into cell I4 using Insert/Function from the main menu looks like this (be careful to omit $ for X but to use $ in the other two ranges):

8. Choose OK to see the resulting long-term trend value in the top cell, then double-click the fill handle to copy the formula down the column:

9. To extend the trend beyond the series and find the seasonally-adjusted forecast values, the quickest way is to select the last two rows of the time period and the trend columns (you need two rows so that Excel will know to keep increasing the time period in the next step) as follows:

and then to drag the little fill handle at the lower right corner of the selected range down as many rows as you want. It’s like magic!

10. To prepare to forecast by seasonalizing the trend, you will need to extend the columns for year, quarter, and seasonal index (columns A, B, and F here). After extending columns A and B, you may select the last seasonal index (cell F31 here) and drag the fill handle down to extend it (if Excel has not already done this for you):

11. You are now ready to create the forecast values by multiplying the trend by the seasonal index. In this example, enter the formula =I4*F4 into cell J4, then double-click the fill handle (or copy and paste) to complete the forecast column. Congratulations! You are done with the calculations!
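
The eleven steps above can be sketched in one place as a short Python function. This is only an illustration under stated assumptions: the function name, its arguments, and the example sales figures are hypothetical (they are not the Ford data), and the series is assumed to be quarterly with no gaps. The comments map each part back to the corresponding worksheet step.

```python
def trend_seasonal_forecast(sales, first_quarter=1, periods_ahead=4):
    """Trend-seasonal analysis of a quarterly series, following steps 1 to 11."""
    n = len(sales)
    # Quarter numbers 1, 2, 3, 4, 1, ... extended past the end of the data (step 10).
    quarters = [(first_quarter - 1 + i) % 4 + 1 for i in range(n + periods_ahead)]

    # Steps 1-2: average two overlapping full years, like =AVERAGE(C4:C7,C5:C8).
    # The moving average is unavailable for the first two and last two quarters.
    moving_avg = [None] * n
    for i in range(2, n - 2):
        moving_avg[i] = (sum(sales[i - 2:i + 2]) + sum(sales[i - 1:i + 3])) / 8

    # Step 3: ratio-to-moving-average, where a moving average exists.
    ratios = [None if m is None else s / m for s, m in zip(sales, moving_avg)]

    # Step 4: seasonal index = average ratio per quarter (the SUMIF/COUNTIF idea).
    seasonal = {}
    for q in range(1, 5):
        values = [r for r, rq in zip(ratios, quarters) if rq == q and r is not None]
        seasonal[q] = sum(values) / len(values)

    # Step 5: seasonally adjusted series = sales divided by seasonal index.
    adjusted = [s / seasonal[q] for s, q in zip(sales, quarters)]

    # Steps 6-8: least-squares trend of the adjusted series on time period 1, 2, 3, ...
    periods = list(range(1, n + 1))
    mean_t = sum(periods) / n
    mean_y = sum(adjusted) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(periods, adjusted))
             / sum((t - mean_t) ** 2 for t in periods))
    intercept = mean_y - slope * mean_t

    # Steps 9-11: extend the trend, then multiply by the seasonal index to forecast.
    trend = [intercept + slope * t for t in range(1, n + periods_ahead + 1)]
    forecast = [tr * seasonal[q] for tr, q in zip(trend, quarters)]
    return {"moving_average": moving_avg, "seasonal_index": seasonal,
            "adjusted": adjusted, "trend": trend, "forecast": forecast}


# Hypothetical quarterly sales (three years), invented for illustration only.
example_sales = [20, 24, 22, 26, 23, 27, 25, 29, 26, 30, 28, 32]
result = trend_seasonal_forecast(example_sales)
print([round(f, 2) for f in result["forecast"][-4:]])  # the four forecast quarters
```

As in the worksheet, the seasonal index here is simply the average ratio for each quarter, with no further adjustment.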

Example: Ford Automotive Sales (Charting the Series and Forecast)

Here is one way to make a chart of one or more of the columns you have created. In this example we create a chart of the original series (sales) together with the forecast values.

1. To begin, select the Sales column including the label at the top (so Excel can use this label), then choose Insert/Chart from the main menu and specify Chart Type as Line and Chart sub-type as either the first choice, or “Line with markers displayed at each data value” as specified here:

2. To list the years along the horizontal axis, click Next >, choose the Series tab, click in the “Category (X) axis labels:” portion of the dialog box and drag with the mouse down the numbers in the Year column in the spreadsheet (in this example, cells A4:A36, excluding the label at the top this time). The dialog box now looks like this:

3. To add the forecasts to this chart, click Add, then click in the Values area of the dialog box, then drag with the mouse down the Forecast values in the worksheet (just the numbers). Next click in the Name area of the dialog box, then click on the cell with the label “Forecast” (in cell J3 here). Your dialog box now looks like this:

4. Click Next >, make any changes you like, then click Finish to place the chart into the worksheet. After resizing the chart and double-clicking on the gray background to make it white, the chart looks like this:

[Figure: line chart of the Sales and Forecast series by year (horizontal axis, 1994 to 2002), with the vertical axis running from 0 to 45.]

ANOVA (Chapter 15)

Excel can perform one-way and two-way ANOVA. Here is an example of each type of analysis.

Example: Supplier Quality Scores (One-way ANOVA)

This example shows how to perform a basic one-way analysis of variance to test for significant differences among several individual columns of data. Consider the data on supplier quality (Table 15.1.1 of Practical Business Statistics).

1. Look under the Tools menu for Data Analysis, select “Anova: Single Factor”, and choose OK. If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins from the Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was not installed when Excel was installed on your computer, you will need to install it from the Excel CD-ROM.

2. In the dialog box that appears, click in “Input Range” and select your data including labels at the top, being sure to extend down to the last row even if you extend past the end of some data columns. Excel requires that your variables be next to one another so that your Input range is a rectangle. Click the check box “Labels in First Row” so that Excel will recognize the names of the columns. Click to the left of “Output Range”, click to the right of “Output Range” and then click in a cell in the worksheet where Excel can put the results. So far, here is how it looks:

3. Click OK to see the results. In this case the p-value of 0.005 tells you that the mean quality scores of these three suppliers are highly significantly different from one another (p < 0.01). That is, you may conclude that there are supplier differences. Also shown are the average quality for each supplier (82.056, 80.667, and 87.684) and each supplier's variance. You also find the between-sample variability of 269.081 and the within-sample variability of 45.631 under the MS column of the ANOVA table (MS stands for Mean Square).

Here are the results, after tidying up by adjusting column widths (try selecting cells that are not displayed properly, then using Format/Column/AutoFit Selection) and by formatting most cells to show three decimal places (using Format/Cells, selecting the Number tab, then using Category Number with 3 decimal places for these cells).

4. To find the suppliers’ standard deviations, you may take the square root of each variance, using the SQRT function as follows:
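
The quantities in the one-way ANOVA output can also be sketched outside Excel. The F statistic is the between-sample variability divided by the within-sample variability (for the values reported above, 269.081 divided by 45.631, or about 5.9). Here is a minimal Python illustration; the function name and the three small groups are hypothetical and are not the supplier quality data. The p-value is not computed here, since that requires the F distribution.

```python
import math


def one_way_anova(groups):
    """Means, standard deviations, mean squares, and F for several data columns."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Sample variance of each column (what Excel lists under "Variance").
    variances = [sum((x - m) ** 2 for x in g) / (len(g) - 1)
                 for g, m in zip(groups, means)]
    # Between-sample and within-sample variability (the MS column).
    ms_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means)) / (k - 1)
    ms_within = sum((len(g) - 1) * v for g, v in zip(groups, variances)) / (n_total - k)
    return {
        "means": means,
        # Standard deviation = square root of the variance, as with SQRT in step 4.
        "std_devs": [math.sqrt(v) for v in variances],
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical quality scores for three suppliers, invented for illustration only.
scores = [[82, 79, 85, 81], [80, 78, 83], [88, 86, 90, 87, 85]]
summary = one_way_anova(scores)
print(round(summary["F"], 3))
```

The columns may have different lengths, just as the Excel input range may extend past the end of some data columns.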

Example: Production Quality by Shift and Supplier (Two-way ANOVA)

This example shows how you might perform a two-way analysis of variance with interaction term. Consider the data on production quality according to shift (day, night, and swing) and supplier (A, B, and C) summarized in Figure 15.4.1, and discussed in Problem 16, with averages listed in Table 15.5.4 of Practical Business Statistics.

1. Look under the Tools menu for Data Analysis, select “Anova: Two-Factor with Replication”, and choose OK. If you cannot find Data Analysis under Excel’s Tools menu, select Add-Ins from the Tools menu and make sure the Analysis ToolPak is checked. If the Analysis ToolPak was not installed when Excel was installed on your computer, you will need to install it from the Excel CD-ROM.

2. In the dialog box that appears, click in “Input Range” and select your data including labels at the top and on the sides. Excel requires that the data be arranged in a table as shown below. In this case there are 5 observations for each combination of shift and supplier, so the “Rows per sample” is set at 5. Click to the left of “Output Range”, click to the right of “Output Range” and then click in a cell in the worksheet where Excel can put the results. So far, here is how it looks:

3. Click OK to see the results, as shown below. First you see summary statistics for each combination of shift and supplier. For example, the average quality for Shift 1 and Supplier 1 is 77.062; the average for Supplier 1 is 82.417 (to the right in the first table, for Supplier 1, under “Total”); and the average for Shift 1 is 80.076 (below, in the table headed “Total”, under the column headed Shift 1 at the very top).

In the ANOVA table are the results of the hypothesis tests, including a p-value of 0.720 for testing whether the suppliers have equal means or not, a p-value listed as 0.000 for testing whether the shifts have equal means or not, and a p-value of 0.014 for the interaction of shift and supplier.

Here are the results, after tidying up by adjusting column widths (try selecting cells that are not displayed properly, then using Format/Column/AutoFit Selection) and by formatting most cells to show three decimal places (using Format/Cells, selecting the Number tab, then using Category Number with 3 decimal places for these cells).

Nonparametrics (Chapter 16)

Excel has functions that can help you with nonparametric testing based on ranks of the data. In this chapter we will illustrate the sign test and the two-sample unpaired nonparametric test (the Mann-Whitney U-test, also called the Wilcoxon rank-sum test).

Example: Local and National Family Income (Sign Test)

The sign test can be used to test whether the median of a random sample differs significantly from a reference value. Consider the example of local family incomes using data from Table 16.1.2 of Practical Business Statistics. The question is: Do these incomes differ significantly from the national median of $27,735?

By using the COUNTIF(RANGE,CRITERIA) function, you can find the modified sample size by using, for the criteria, the condition that the data values are different from the reference value (to say “different” in Excel, you use the less-than and greater-than signs like this: “<> ReferenceValue”). You can also find the number of data values below the reference value by using the less-than sign “< ReferenceValue” in the criteria.

You may then find the p-value for the test by using the BINOMDIST and the MIN functions. If m denotes the modified sample size and #Below denotes the number of data values below the reference value, then the p-value is

=2*MIN(BINOMDIST(#Below,m,0.5,TRUE),BINOMDIST(m-#Below,m,0.5,TRUE))

Here is how you could find the modified sample size and the number of data values below the reference value (m = 25 and #Below = 6) for the income data, together with the p-value. The result is that the observed median income is significantly different from the reference value.
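
The same sign-test calculation can be sketched in Python as a cross-check. This is a minimal illustration: the function names are hypothetical, and the income values are invented so that they reproduce the counts quoted above (m = 25 and 6 values below the reference); they are not the data from Table 16.1.2. The helper binom_cdf plays the role of BINOMDIST(k, n, 0.5, TRUE).

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """Cumulative binomial probability P(X <= k) for n trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def sign_test(data, reference):
    """Return (m, below, p_value) for the two-sided sign test."""
    # Modified sample size: values different from the reference (like COUNTIF with "<>").
    m = sum(1 for x in data if x != reference)
    # Number of values below the reference (like COUNTIF with "<").
    below = sum(1 for x in data if x < reference)
    p_value = 2 * min(binom_cdf(below, m), binom_cdf(m - below, m))
    # Cap at 1; the doubled probability can exceed 1 when the counts are balanced.
    return m, below, min(p_value, 1.0)


# Invented incomes: 6 below, 19 above, and 1 equal to the reference value.
incomes = [20000] * 6 + [30000] * 19 + [27735]
m, below, p_value = sign_test(incomes, 27735)
print(m, below, round(p_value, 4))
```

With these counts the p-value is below 0.05, in agreement with the conclusion stated above.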

Example: Incomes of Mortgage Applicants (Unpaired Two-Sample Test)

The nonparametric test for two unpaired samples is based on the ranks of the overall data set with both samples combined. Both the Mann-Whitney U-test and the Wilcoxon rank-sum test give the same result. Excel can help you perform this procedure.

1. Begin by listing both groups of numbers in a single column, with labels in the column to its left to identify the group of each number. Then, with any data cell selected, use the menu command Data/Sort to sort both columns by data value. Here is the Data/Sort dialog box:

2. Now find the rank of each data value, being careful to average any ties. To do this, create a column (headed “1, 2, 3 ...” below) consisting of the initial ranks (before averaging) of 1, 2, 3, and so forth. Then create a column of ranks with tie averaging by using the formula SUMIF(DataRange,DataValue,123Range)/COUNTIF(DataRange,DataValue), being careful to use absolute $ addressing for DataRange and 123Range but not for DataValue.

Here is the result after copying that formula down the column (for example, by double-clicking on the fill handle after entering the first formula). Note that the averaged rank of 18.5 is used for both income values of 57,000.

3. To find the average rank for each group, you may again use the SUMIF and COUNTIF functions, this time as

SUMIF(GroupLabelRange,"Fixed",RanksRange)/COUNTIF(GroupLabelRange,"Fixed")

for the fixed-rate mortgages, changing “Fixed” to “Variable” for the variable-rate mortgages. Here are the results:

4. Now find the average difference in ranks by subtracting these average ranks. Find the standard error by using the sample size for each group (16 and 14, here). Divide the average difference in ranks by the standard error to find the test statistic. Finally, find the p-value using the function

=2*(1-NORMSDIST(ABS(TestStatistic)))

The results are as follows. Note that these two groups are not significantly different from one another because p > 0.05.
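
The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the income values are hypothetical (not the mortgage applicant data), and the standard error uses the usual large-sample formula for the difference in average ranks, (n1 + n2) * sqrt((n1 + n2 + 1) / (12 * n1 * n2)), which is assumed here because the formula is not written out in this guide. The normal probability uses math.erf in place of NORMSDIST.

```python
import math


def rank_sum_test(group_a, group_b):
    """Return (avg_rank_difference, standard_error, test_statistic, p_value)."""
    combined = sorted(group_a + group_b)

    def averaged_rank(value):
        # Average of the positions 1, 2, 3, ... occupied by this value (ties averaged),
        # which is what SUMIF(...)/COUNTIF(...) does in the worksheet.
        positions = [i + 1 for i, v in enumerate(combined) if v == value]
        return sum(positions) / len(positions)

    n_a, n_b = len(group_a), len(group_b)
    avg_rank_a = sum(averaged_rank(v) for v in group_a) / n_a
    avg_rank_b = sum(averaged_rank(v) for v in group_b) / n_b
    difference = avg_rank_a - avg_rank_b

    # Assumed large-sample standard error of the difference in average ranks.
    n = n_a + n_b
    standard_error = n * math.sqrt((n + 1) / (12 * n_a * n_b))
    test_statistic = difference / standard_error

    # Two-sided p-value, the same idea as =2*(1-NORMSDIST(ABS(TestStatistic))).
    normal_cdf = 0.5 * (1 + math.erf(abs(test_statistic) / math.sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return difference, standard_error, test_statistic, p_value


# Hypothetical incomes (in thousands) for two groups, invented for illustration only.
fixed = [10, 20, 30]
variable = [20, 40, 50, 60]
diff, se, z, p = rank_sum_test(fixed, variable)
print(round(diff, 3), round(se, 3), round(z, 3), round(p, 3))
```

In this made-up example the two values of 20 share the averaged rank 2.5, in the same way that the two incomes of 57,000 share the rank 18.5 above.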

Chi-Squared Analysis (Chapter 17)

Chi-squared analysis is used to test for significance in counted data. You can use Excel to test whether population percentages are equal to known reference values; this is done by performing the appropriate steps (e.g., compute the expected counts, etc.). Excel may also be used to perform a test for independence in a two-way table of counts.

Example: Causes of Quality Problems (Chi-squared Test for Known Percentages)

In this example we solve the problem of testing observed counts against known population reference percentages by executing the steps in Excel. Consider the data on quality problems in Tables 17.2.2 and 17.2.3 of Practical Business Statistics.

1. Find the column of expected counts by multiplying each reference percent by the total observed count. Give a name such as “Observed” to the observed counts, and the name “Expected” to the expected counts, perhaps by selecting a column of numbers and using the menu command Insert/Name/Define.

2. To find the chi-squared statistic, since you have named the Observed and Expected numbers, you may use the following matrix formula:

=SUM((Observed-Expected)^2/Expected)

by typing it in and holding Ctrl and Shift while you press Enter. Because this is a matrix formula (Excel calls it an array formula), it may not give you an answer if you press Enter by itself.

3. To find the p-value, you may use the function CHIDIST(ChiSq,DF) evaluated using the chi-squared statistic and its number of degrees of freedom. Here are the results. Note that the observed counts do not differ significantly from the reference percentages because p > 0.05.
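
Steps 1 and 2 can be sketched in Python as follows. The function name and the counts and percentages are hypothetical; they are not the quality-problem data from Tables 17.2.2 and 17.2.3. The sketch stops at the chi-squared statistic and its degrees of freedom, since the p-value would still come from CHIDIST (or a statistics library).

```python
def chi_squared_known_percentages(observed, reference_proportions):
    """Return (expected_counts, chi_squared_statistic, degrees_of_freedom)."""
    total = sum(observed)
    # Step 1: expected count = reference proportion times the total observed count.
    expected = [p * total for p in reference_proportions]
    # Step 2: the same sum as =SUM((Observed-Expected)^2/Expected).
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    degrees_of_freedom = len(observed) - 1
    return expected, statistic, degrees_of_freedom


# Hypothetical counts and reference proportions, invented for illustration only.
observed = [10, 20, 30]
reference = [0.2, 0.3, 0.5]
expected, chi_sq, df = chi_squared_known_percentages(observed, reference)
print([round(e, 2) for e in expected], round(chi_sq, 4), df)
```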

Example: Is Your Market Segmented? (Chi-squared Test for Independence)

In this example we perform the chi-squared test for independence of two categorical variables with Excel’s CHITEST function, which gives the p-value directly. Consider the data on market segmentation (Table 17.3.1 of Practical Business Statistics) which gives the number of consumers of each type (practical or impulsive) who purchased each type of rowing machine. Remember that the chi-squared test requires counts, not percentages or averages.

1. Excel can help you compute the p-value of the chi-squared test for independence using the CHITEST function, but you have to compute the table of expected counts first. In the example below, to create a formula for expected counts that will copy correctly to fill the entire table, note the use of “absolute addressing” using dollar signs in the formula “=B$6*$D3/$D$6” to find the expected 18.93 purchases of basic machines by practical consumers. This formula can be copied and pasted to fill the table while always taking column totals from row 6 (hence the reference B$6), always taking the row totals from column D (hence the reference $D3), and always taking the overall total from cell D6 (hence the reference $D$6).

2. The results are shown below: first the original table of counts, next the table of expected counts, and finally the CHITEST function, which uses both the original table and the table of expected counts (but not the totals). The resulting CHITEST p-value is 3.07823E-15, which represents the very small number 0.00000000000000307823 because the scientific notation “E-15” tells you to move the decimal point 15 places to the left. Clearly the result is very highly significant because this p-value is less than 0.001.
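
The expected-count formula (column total times row total, divided by the overall total) can be sketched in Python as follows. The function names and the small table of counts are hypothetical; they are not the market segmentation data from Table 17.3.1. The sketch computes the expected counts and the chi-squared statistic only; the p-value itself is what CHITEST provides.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for observed_row, expected_row in zip(table, expected)
               for o, e in zip(observed_row, expected_row))


# Hypothetical two-way table of counts (no totals), invented for illustration only.
counts = [[10, 20],
          [30, 40]]
print(expected_counts(counts))
print(round(chi_squared_statistic(counts), 4))
```

As with CHITEST, the input is the table of counts without the totals; the totals are computed inside the function.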

Quality Control (Chapter 18)

You can produce quality control charts in Excel. You arrange the data for the chart in one column, copy the center line number down another column, and similarly set up one column for each of the control limits, then create the chart. Here are examples for XBar and R charts. In a similar way you can also produce a percentage chart. The procedure is simpler if you use StatPad.

Example: Weights of Boxes of Detergent (XBar and R Charts)

Here is how to use Excel to draw an XBar chart for the detergent data from Table 18.3.4 of Practical Business Statistics. Begin with a column containing a list of the averages (of five observations each in this case). Immediately to its right, create a column containing the average XBarBar=16.093 of these averages repeated down the column. Next to it, create a column for the lower control limit XBarBar-A2*RBar = 15.941 and one for the upper control limit XBarBar+A2*RBar = 16.245. Now select all four of these columns (just the numbers) and use Excel’s menu command Insert/Chart (or click on the Chart Wizard icon) to bring up the Chart Wizard dialog box. Under “Chart type” select Line (with markers displayed at each data value) as shown. As you work your way through the Chart Wizard (by clicking on Next), pause at the Chart Options dialog box. Click on the Gridlines tab to specify whether or not you would like gridlines (which were not used here) and click on the Legend tab to delete the legend (by unselecting the “Show legend” check box). Click on Finish, and the XBar chart appears.
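The four columns described above can also be sketched in Python as a cross-check. The subgroup averages and ranges below are hypothetical (they are not the data of Table 18.3.4), the function name is invented for this sketch, and A2 = 0.577 is the standard table constant for subgroups of five observations. Note that the limits quoted in the text imply A2*RBar = (16.245 - 15.941)/2 = 0.152.

```python
from statistics import mean

# Control chart constant for subgroups of n = 5 observations (standard table value).
A2_FOR_N5 = 0.577

def xbar_chart_columns(averages, ranges, a2=A2_FOR_N5):
    """Return rows of (average, center line, lower limit, upper limit)."""
    x_bar_bar = mean(averages)   # center line: average of the subgroup averages
    r_bar = mean(ranges)         # average of the subgroup ranges
    lower = x_bar_bar - a2 * r_bar
    upper = x_bar_bar + a2 * r_bar
    # The center line and limits are repeated down the column, as in the worksheet.
    return [(x, x_bar_bar, lower, upper) for x in averages]

# HYPOTHETICAL subgroup summaries, not the data of Table 18.3.4.
averages = [16.0, 16.2, 16.1, 16.1]
ranges = [0.2, 0.3, 0.3, 0.2]
rows = xbar_chart_columns(averages, ranges)
```

Each tuple in `rows` corresponds to one worksheet row across the four columns that are selected before starting the Chart Wizard.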


To use Excel to draw an R chart for the detergent data, proceed as for the XBar chart, but use the range values R for the first column, their average RBar for the second column, and the appropriate lower and upper control limits D3*RBar = 0 and D4*RBar = 0.556 for the third and fourth columns.

[Figure: R chart for the detergent data in Excel]
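The R chart columns can be sketched the same way. Again the ranges are hypothetical, the function name is invented, and the constants D3 = 0 and D4 = 2.114 are commonly tabulated values for subgroups of five observations that are assumed here; check them against the table in the textbook before relying on them.

```python
from statistics import mean

# Control chart constants for subgroups of n = 5 (commonly tabulated values; assumed here).
D3_FOR_N5 = 0.0
D4_FOR_N5 = 2.114

def r_chart_columns(ranges, d3=D3_FOR_N5, d4=D4_FOR_N5):
    """Return rows of (range, center line RBar, lower limit, upper limit)."""
    r_bar = mean(ranges)   # center line: average of the subgroup ranges
    lower = d3 * r_bar     # zero whenever D3 is zero
    upper = d4 * r_bar
    return [(r, r_bar, lower, upper) for r in ranges]

# HYPOTHETICAL subgroup ranges, not the data of Table 18.3.4.
ranges = [0.2, 0.3, 0.3, 0.2]
r_rows = r_chart_columns(ranges)
```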


Appendix: Excel Range Names

For use with Excel, each chapter of Practical Business Statistics has its own file on the CD-ROM that includes the data tables from examples and problems. To access it, use File/Open from Excel’s menu. Each column of numbers is named and ready to use. For example, the data sets from Chapter 3 are in the file named Chapter03.xls, and the employee database from Appendix A of the textbook is in the file named EmployeeDatabase.xls.

To work with a column of numbers from a data file, you may use its name in a formula, such as “=AVERAGE(yield)” to place the average of a column of numbers named “yield” into a cell in your worksheet. Alternatively, you may drag the mouse down the numbers in the data set to select them if you wish.

Here are the Excel range names assigned to each individual data table within a file. Note that spaces are not allowed in Excel names, and that the underline character (_) is often used instead.

CHAPTER 2, RANGE NAMES.

Characteristics of Food Services Companies. Profits Return Employees Revenues

Daily Stock Price Information for Home Depot. Open High Low HD_Close Volume HD_Percent_Change

Table 2.6.1. Employment/History Status of Five People. Salary Experience

Table 2.6.2. Selected Product Output of Production Facilities. Employ

Table 2.6.3. Sales and Income. Sales Income

Table 2.6.4. Selected Customers and Purchases. Purchases

Table 2.6.5. Ratings of Cell Phones. Price

Table 2.6.6. Comparison of Upright Vacuum Cleaners. Vacuum_Price Weight

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms. Close Change

Table 2.6.8. Daily DJTA for September 1998. DJIA Net_Change Percent_Change

CHAPTER 3, RANGE NAMES.

Table 3.2.1. Home Mortgage Rates. Interest_Rate

Table 3.2.2. Starting Salaries. Salary

Table 3.4.1. Assets of Commercial Banks in the Fortune 1000. Assets

Table 3.5.1. Yields of Money Market Funds. Taxable_Yield Tax_Exempt_Yield

Table 3.5.2. Rates of Computer Ownership. Computer_Owners

Table 3.6.1. Changes in Spending on Syndicated TV Advertising. Change

Table 3.8.1. The Number of Employees for Food Services Firms. Employees

Table 3.9.1. Yields of Municipal Bonds. Yield

Table 3.9.2. Market Response to Stock Buy-Backs. Price_Change

Table 3.9.3. Active Stock Market Issues. Stock_Change

Table 3.9.4. CREF's Investments. Market_Value

Table 3.9.5. Percent Change in Revenues for Scientific, Photo, and Control Equipment Companies in the Fortune 500. Revenue_Change

Table 3.9.6. Hospital Charges for Heart Failure and Shock. Hospital_Charges

Table 3.9.7. CEO Compensation for Food Processing Firms. CEO_Compensation

Table 3.9.8. Market Share for Seattle Radio Stations. Listeners

Table 3.9.9. Net Income of Selected Firms. Net_Income

Table 3.9.10. Cost of Traditional Funeral Service. Funeral_Cost

Table 3.9.11. Special Exemptions to the Tax Code. Exemption

Problem 3.21. Defective Motors, Per Batch Of 250. Defects

Table 3.9.12. Cost to Rent a Car. Rental

Problem 3.23. Interest Rates. Rate

Problem 3.24. Market Values. Market

Problem 3.25. Executive Salaries. Executive_Salary

Problem 3.26. Order Size. Order

Problem 3.27. Envelope Prices. Envelope_Price

Problem 3.28. Market Share. Share

Table 3.9.13. Percentage Change in Dollar Value. Dollar_Change

Problem 3.30. Tylenol Prices. Tylenol

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms. DJIA_Close DJIA_Change

Table 2.6.8. Daily DJIA for January 2002. DJIA_Net_Change DJIA_Percent_Change

Case. Material Manager Inventory

CHAPTER 4, RANGE NAMES.

Example: How Many Defective Parts? Defects

Example: Your Grade Point Average. Credits Grade

Example: The Firm's Cost of Capital. Market_Value Rate_of_Return

Table 4.1.1. Loss at Opening, Crash of 1987. Loss

Table 4.2.1. CEO Compensation in Technology. Salary

Table 4.2.2. Business Failures by State. Failures

Problem 4.1. Cars Requiring Extra Work. Cars

Table 4.3.1. Last Month's Sales. Sales

Table 4.3.2. Value Added Tax Rates by Country. VAT

Table 4.3.3. Profits for General Merchandisers in the Fortune 500. Profits

Problem 4.7. Beta of Stock Portfolio. Shares Cost_Per_Share Beta

Table 4.3.4. State Population and Taxes. Population State_Taxes

Table 4.3.5. Percent Change in Housing Values over Five Years for U.S. Regions. Percent_Change

Table 4.3.6. Revenues for selected Fortune 500 companies. Revenues

Table 4.3.7. Percent increases of initial public stock offerings. Percent_Increase

Problem 4.16. Paper Mill Problems. Problem

Table 4.3.8. Home Mortgage Loan Fees. Fee

Problem 4.23. Strength of Cotton Yarn. Strength

Problem 4.24. Factory Inventory Level. Inventory

Problem 4.25. Your Products' Share. Share

Problem 4.26. Monthly Sales. Monthly_Sales

Table 4.3.9. Changing Value of the Dollar. Change

Table 3.9.1. Yields of Municipal Bonds. Yield

Table 3.9.2. Market Response to Stock Buy-Backs. Price_Change

Table 3.9.4. CREF's Investments. CREF_Value

Table 4.3.10. Length in minutes for selected films from a video library. Time

Table 3.9.6. Hospital Charges for Heart Failure and Shock. Hospital_Charges

Table 3.9.7. CEO Compensation for Food Processing Firms. CEO_Compensation

Table 3.9.10. Cost of Traditional Funeral Service. Funeral_Cost

Table 4.3.11. Sales of Some 'Light' Foods. Food_Sales

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms. DJIA_Close DJIA_Change

Table 2.6.8. Daily DJIA for January 2002. DJIA_Net_Change DJIA_Percent_Change

Case. Chairs Tables Bookshelves Cabinets Value

CHAPTER 5, RANGE NAMES.

Table 5.1.1. Finding The Deviations From The Average. Dart_Returns


Example: The Advertising Budget. Budget

Table 5.1.3-5.1.4. Closing Stock Prices and Daily Returns. Dow_Jones Dow_Jones_Return

Example: S&P 500 Stock Index Volatility. Standard_Deviation

Table 5.2.1. Employee Salaries. Employee_Salary

Table 5.2.2. Hospital Length of Stay. Days

Table 5.5.1. Advertising Accounts in Play. Ad_Budget

Table 5.5.2. Performance of Pharmaceutical Firms. Stock_Return

Table 5.5.3. Largest Stock Mutual Funds. Return_Mutual_Fund Assets_Mutual_Fund

Problem 5.6. Number of Executives for Seattle Firms. Executives

Table 5.5.4. Weights for Two Samples of Candy Bars. Before After

Table 5.5.5. Cost due to traffic congestion, per registered vehicle. Cost_Traffic

Problem 5.17. Rates of Return. ROR

Problem 5.18. Interest Rates. Rate

Table 5.5.6. Theme Park Admission Prices. Admission

Table 5.5.7. Changing Value of the Dollar. Change

Problem 5.20. Weights of Sinks. Weight

Table 5.5.8. Hotel Room Prices. Price

Table 5.5.9. Gifts Returned. Returned

Problem 5.23. Airline Ticket Prices. Ticket_Cost

Problem 5.24. Productivity Measures. Productivity

Problem 5.25. Sales. Sales

Problem 5.26. Percentage of Gold. Gold

Problem 5.27. Return on Equity. ROE

Table 4.3.2. Value Added Tax Rates by Country. VAT

Table 4.3.10. Length in minutes for selected films from a video library. Time

Table 5.5.10. International taxation. GDP Taxes

Table 3.9.1. Yields of Municipal Bonds. Yield

Table 3.9.2. Market Response to Stock Buy-Backs. Price_Change

Table 3.9.4. CREF's Investments. Market_Value

Table 3.9.10. Cost of Traditional Funeral Service. Funeral_Cost

Problem 5.40. Defective Motors, Per Batch Of 250. Defects

Table 4.3.1. Last Month’s Sales. Last_Month_Sales

Table 4.3.5. Percent Change in Housing Values over Five Years for U.S. Regions. Percent_Change

Table 4.3.8. Home Mortgage Loan Fees. Fee

Table 5.5.11. International Bond Mutual Fund Performance. Performance_Before Performance_After

Table 5.5.12. Age and Cost for Presses. Age Cost_Presses

Table 2.6.7. Closing Price and Monthly Change for DJIA Firms. DJIA_Close DJIA_Change

Table 2.6.8. Daily DJIA for January 2002. DJIA_Net_Change DJIA_Percent_Change

Case. Part_Size

CHAPTER 7, RANGE NAMES.

Example. Profit Under Various Economic Scenarios. Profit Prob_of_Profit

Table 7.6.1. Probability Distribution of Payoff. Payoff Prob_of_Payoff

Table 7.6.2. Probability Distribution of Downtime. Downtime Prob_of_Downtime

Table 7.6.3. Probabilities for Qualified Technical Applicants. Applicants Probability_of_Applicants


Table 7.6.4. Rates of Return and Probabilities. ROR Prob_of_ROR

Table 7.6.5. Quality Control Problems. Prob_of_Rework Rework_Cost

Case. Oil_Price Prob_of_Oil_Price

CHAPTER 8, RANGE NAMES.

Table 8.6.1. Project Analysis. Probability Profit_or_Loss

Table 8.6.2. Industrial Farm Equipment Firms. Profit

Table 8.6.3. Revenue Change for Fortune 500 Soap and Cosmetics Companies. Revenue_Change

Table 8.6.4. Economic Forecasts. Forecast

Problem 8.31. Recent Billings. Billing

Problem 8.33. Quality of Agricultural Produce. Quality

CHAPTER 9, RANGE NAMES.

Table 9.1.2. Thickness of Selected Sheets of Paper. Thickness

Table 9.1.3. Yearly Percentage of Adults Using the Internet. Internet_Usage

Table 9.1.4. Yields of a Chemical Processing Facility. Tons

Problem 9.11. Personal Computer Orders. Computers

Problem 9.14. Weights of Loaves of Bread. Weight

Problem 9.16. Cleaning Cost. Cleaning_Cost

Table 9.6.1. Prices at SuperMall and elsewhere for various items. SuperMall Elsewhere Savings

Problem 9.27. Daily Changes in S&P 500 Stock Market Index. Change

Table 9.6.2. Performance of Recommended Stocks. Performance

Problem 9.34. Computer Speed. Seconds

Problem 9.35. Economic Viability of Mining Operation. ROR


Table 4.3.1. Last Month's Sales. Sales

Problem 9.40. Strength of Cotton Yarn. Strength

Table 5.5.4. Weights for Two Samples of Candy Bars. Before After

Problem 9.44. Quality scores for agricultural produce. Quality

Problem 9.45. Caffeine in Coffee. Caffeine

Case. Order_Amount

CHAPTER 10, RANGE NAMES.

Table 10.6.1. Relaxation Scores. Before After

Table 10.6.4. Salaries Arranged by Gender. Women Men

Problem 10.8. Inventory Level. Inventory

Problem 10.9. Weights of Loaves of Bread. Weight_Bread

Table 4.3.7. Percent increases of initial public stock offerings. Percent_Increase

Problem 10.21. Weight of Frozen Foods. Weight_Food

Problem 10.22. Prices. Price

Problem 10.23. Calorie Content. Calories

Table 10.7.2. Store Returns. Returned

Problem 10.25. Satisfaction Scores. Satisfaction

Problem 10.26. Pollutant Levels. Pollution

Problem 10.27. Component Weights. Weight_Component

Table 10.7.3. Performance of Socially Aware Funds. ROR

Table 10.7.4. World Income Funds One-Year Market Return. Market_Return

Table 10.7.5. Vocal Stress Level. True_Stress False_Stress

Table 10.7.6. Wine Tasting Scores. Chardonnay Sauvignon

Table 10.7.7. Days Until Failure. You Competitor

Table 10.7.8. Monthly Daycare Rates. Laurelhurst Other_Areas

Table 10.7.9. New Product Preferences. Milwaukee Green_Bay

Table 10.7.11. Supplier Quality. Custom_Cases International_Plastics

Table 5.5.4. Weights for Two Samples of Candy Bars. Candy_Before Candy_After

Case. n Avg stdDev stdErr t p

CHAPTER 11, RANGE NAMES.

Table 11.1.1. First Quarter Performance. Contacts Sales_Qtr

Table 11.1.3. Internet Usage Ratings. Audience Reach Pages Time

Table 11.1.4. Top Merger & Acquisition Advisers. Deals Dollars

Table 11.1.5. Mortgage Costs. Fee Interest

Table 11.1.6. Percent Change in Stock Index. Today Yesterday

Table 11.1.7. S&P100 Index Call Options. Strike_Price Call_Price

Table 11.1.8. Temperature and Yield for an Industrial Process. Temperature Yield_Process

Table 11.1.9. Fiber-Optics Long-Distance Communications. Investment Miles

Table 11.1.11. U. S. Treasury Bonds. Coupon_Rate Bid_Price

Table 11.1.12. Weekly Production. Number_Produced Cost_Weekly

Data for Figure 11.1.18. Restaurant and Food Store Expenditures by State, millions. Food_Stores Restaurant

Table 11.2.2. Weekly Production (Outlier Omitted). Produced Cost

Table 11.2.3. Territory and Performance of Salespeople. Territory Sales_Performance

Table 11.3.1. Printing Presses. Age Cost_Printing


Table 11.3.2. Airline On-Time Performance. Month_On_Time Year_On_Time

Table 11.3.3. International Closed-End Bond Funds. NAV Price_Fund

Table 11.3.4. Business Failures By State. Failures Population

Table 11.3.5. Daily Stock Changes. McDonalds Dow_Jones

Table 11.3.6. Expense Ratio and One-Year Rate of Return. WR_Expense_Ratio WR_Return

Table 11.3.7. Votes for Albert Gore, Jr. Nov7 Certified Change

Table 11.3.8. Total U.S. Advertising Spending by Retail Firms. Ads2000 Ads1999

Table 11.3.9. Market Share and 30-Second Advertising Cost. Share Ad_Cost

Table 11.3.10. Gold Coins. Weight Price_Gold

Table 11.3.11. Room for Expansion. Existing_Units Capacity

Table 11.3.12. Gasoline Prices. Price_11_30_90 Price_2_26_91

Table 11.3.13. Salaries and Money Raised Per Capita, Charitable Organizations. President Money_Raised

Table 11.3.14. Mailing Lists. Size Sales

Table 11.3.15. Short-Term Bond Funds. Maturity Return

Table 11.3.16. Production Data. Workers Production

Table 11.3.17. Biotechnical Stocks. EPS Price_Biotech

Table 11.3.18. Newspaper Advertising Rates Per Line. Circulation Open_Line_Rate


Table 11.3.19. Newspaper Ad Rates Adjusted for Readership. Circulation2 Milline_Rate

Table 11.3.20. Defects and Possible Causes. Defect_Rate Temperature_Variability Stoppages

Case. Purifier Yield_Case

CHAPTER 12, RANGE NAMES.

Table 12.1.3. Advertising Costs, Characteristics of Magazines. Page Audience Male Income

Table 12.2.2. Computers and Office Equipment Companies in the Fortune 500. Mkt_Val Assets Employees

Table 12.2.13. Dividends, Sales of Goods. Div Nondur Durable

Table 12.3.3. Temperature and Yield for an Industrial Process. Temperature_Process Yield

Table 12.4.4. Salary, Experience, and Gender for Employees. Salary Experience Gender

Table 12.5.1. Picasso Paintings. Price Area Year

Table 12.5.4. Computer Response Time, Users, and Load. Response_Time Users Load

Table 12.5.5. Performance of International Stocks. US Europe Pacific_Rim

Table 12.5.7. CEO Salaries, Sales and Return on Equity for Northwest Companies. NW_Salary NW_Sales NW_ROE

Table 12.5.8. Brokerage House Asset-Allocation. Performance Stocks Bonds

Table 12.5.11. Staff and Contribution Levels for Charities. Staff Public Govt Other

Table 12.5.14. Fiber-Optics Long-Distance Communications. Invest Miles


Table 12.5.16. Price and Profit in Test Markets. Price_Test Profit

Table 12.5.18. Interest Rates. Fed_Funds T_Bills T_Bonds

Case. Temperature Density Rate AM_PM Defect

CHAPTER 13, RANGE NAMES.

Data from Appendix of Report: Quick Pricing Formula. Components Size Cost

CHAPTER 14, RANGE NAMES.

Table 14.1.1. Radio, TV, and Computer Store Sales. Radio_TV_Computer

Table 14.1.3-14.1.4. U.S. Retail Sales. Unadjusted_Sales Seasonally_Adjusted_Sales

Table 14.1.5. U.S. Treasury Bills. Yield

Table 14.2.1. Ford Motor Company. Automotive_Sales

Table 14.3.1. Civilian Unemployment Rate. Unemployment

Table 14.4.1. Quarterly Revenues for Walt Disney Company and Subsidiaries. Disney

Table 14.4.2. Quarterly Net Sales for PepsiCo. PepsiCo

Table 14.4.3. Quarterly Sales for Deere & Company. Deere_Sales

Table 14.4.4. Quarterly Sales for Castle & Cooke, Inc. Castle_Cooke_Sales

Table 14.4.5. Quarterly Sales for Nordstrom, Inc. Nordstrom_Sales

Table 14.4.6. Quarterly Sales. Sales

Table 14.4.10. Interest Rate Forecasts. Forecast Lower_95 Upper_95

CHAPTER 15, RANGE NAMES.

Table 15.1.1. Quality Scores for Suppliers' Products. Amalgamated Bipolar Consolidated

Table 15.5.3. Lengths of Telephone Calls. Info Sales Service Other

Table 15.5.4. Original Data. Quality Shift Supplier

CHAPTER 16, RANGE NAMES.

Table 16.1.2. Incomes of Sampled Families. Income

Table 16.2.2. Level of Creativity. Ad_1 Ad_2

Table 16.3.2. Income of Mortgage Applicants. Fixed Variable

Table 16.4.1. Building Materials Firm Profits. Building_Profit

Table 16.4.2. Aerospace Firm Profits. Aerospace_Profit

Table 16.4.3. Relaxation Scores. Before After

Table 16.4.4. Stress Levels. True_Answer False_Answer

Table 16.4.5. Gender Salary Data. Women Men

Table 16.4.6. Reliability of Products Under Abuse. Yours Competitor

Table 16.4.7. Prescription Drug Prices. United_States Canada

CHAPTER 17, RANGE NAMES.

Table 17.1.1. Vehicle Desired. Vehicle_Count Vehicle_Percent

Table 17.1.2. Responses to the Question on GM Cars. Boomer Nonboom Overall

Table 17.2.2. Defective Components. Defect_Count

Table 17.3.1. Rowing Machine Purchases. Practical Impulsive

Table 17.4.1. Vehicle Desired: This week's count and last year's percentage. This_Count Last_Percent

Table 17.4.2. Incoming Telephone Calls. Phone_Count Phone_Percent

Table 17.4.3. Survey of Future Business Conditions. Managers Employees

Table 17.4.4. Survey on the Chances of a Stock Market Crash. Stockholders Nonstockholders

Table 17.4.5. Order Rates by Region. East West


Table 17.4.6. Status of Mortgage Applications. Residential Commercial

Table 17.4.7. Household Responses. Satisfied Dissatisfied

Table 17.4.8. Newsletter Interest Level. Customer Potential_customer

CHAPTER 18, RANGE NAMES.

Table 18.1.1. Defect Causes, with Frequency of Occurrence. Number_Defects

Table 18.3.3. Summaries of Measurements for 8 Samples, n=4. Average_Meas Range_Meas

Table 18.3.4. Weights of Sampled Boxes of Detergent. Average_Detergent Range_Detergent

Table 18.4.2. Summaries of Measurements for 12 Samples. Defects_of_500

Table 18.4.3. Errors in Batches of n=300 Purchase Orders. Defects_of_300

Table 18.5.1. Frequency of Problems in Candy Manufacturing. Candy_Problems

Table 18.5.2. Problems in Rebate Processing. Rebate_Problems

Table 18.5.3. Thickness of Protective Coating. Thick_1 Thick_2 Thick_3

Table 18.5.4. Length of Broccoli Trees, n=4. Broc_1 Broc_2 Broc_3 Broc_4

Table 18.5.5. Defective Invoices, n=500. Errors

Table 18.5.6. High Speed Memory Chips. Chip_Number

Table 18.5.7. Baking Oven Temperatures. Mon_Avg Mon_Range Tues_Avg Tues_Range Wed_Avg Wed_Range

APPENDIX A, RANGE NAMES.

Appendix A. Employee Database. Salary Gender Age Experience Level

APPENDIX B, RANGE NAMES.

Appendix B. Donations Database. Note: “_D0” indicates 19,011 non-donors, while “_D1” indicates 989 donors, out of 20,000 overall.


Age Age_D0 Age_D1 Age55_59 Age55_59_D0 Age55_59_D1 Age60_64 Age60_64_D0 Age60_64_D1 AvgGift AvgGift_D0 AvgGift_D1 Cars Cars_D0 Cars_D1 CatalogShopper CatalogShopper_D0 CatalogShopper_D1 Clerical Clerical_D0 Clerical_D1 Donation Donation_D0 Donation_D1 Farmers Farmers_D0 Farmers_D1 Gifts Gifts_D0 Gifts_D1 HomePhone HomePhone_D0 HomePhone_D1 Lifetime Lifetime_D0 Lifetime_D1 MajorDonor MajorDonor_D0 MajorDonor_D1 MedHouseInc MedHouseInc_D0 MedHouseInc_D1 OwnerOccupied OwnerOccupied_D0 OwnerOccupied_D1

PCOwner PCOwner_D0 PCOwner_D1 PerCapIncome PerCapIncome_D0 PerCapIncome_D1 Professional Professional_D0 Professional_D1 Promotions Promotions_D0 Promotions_D1 RecentGifts RecentGifts_D0 RecentGifts_D1 Sales Sales_D0 Sales_D1 School School_D0 School_D1 SelfEmployed SelfEmployed_D0 SelfEmployed_D1 Technical Technical_D0 Technical_D1 YearsSinceFirst YearsSinceFirst_D0 YearsSinceFirst_D1 YearsSinceLast YearsSinceLast_D0 YearsSinceLast_D1