Statistics for Business Parts 3 & 4 (Topics 7-10) Course No. STAT:1030 Spring 2018 Whitten (Price for two-Notebook BUNDLE) Royalty $0.00 Copies $27.02 Binding $6.00 Course-Pak TM Property of

Statistics for Business Parts 3 & 4 (Topics 7-10)homepage.divms.uiowa.edu/~blake/Carl/Whitten_STAT_1030_Notebook2_S2018_master.pdfTWO TUTOR LABS (These supplement o ce hours.) Stats



Statistics for Business

Parts 3 & 4 (Topics 7-10)

Course No.STAT:1030

Spring 2018

Whitten

(Price for two-Notebook BUNDLE)

Royalty $0.00

Copies $27.02

Binding $6.00

Course-Pak TM

Property of


STATS GUIDE

STATS HELP

• WALK-IN TA/Prof Office Hours

◦ STRONGLY recommended! GREAT answers!

◦ See ANY TA — not just your own! (Office hours are interchangeable.)

◦ S-M-O-O-T-H-E-S the way. (Everybody needs help with stats.)

◦ Find and Use Weekly Office Hour Schedule. (See Stats website.)

◦ Come discuss Stats with us! Blake, Carter, Daniel, Haibo, Jeremy,

Liyang, Rebecca, Seung-Wook, Tim, Wenda, Xun, Yunju, Zhiwei

• TWO TUTOR LABS (These supplement office hours.)

◦ Stats Dept. Lab (staffed by graduate students in Library LC)

◦ Tippie College Lab (peer-to-peer undergrad tutoring in Tippie Library)

◦ Find and Use both Tutor Lab Schedules (See inside Notebook and online.)

STATS HOMEWORK

• The real key to success!

◦ HW Quiz scores strongly and positively correlate with course grades.

1. Write Homework answers on SEPARATE paper!

Buy your own notebook to collect and document your homework

— mistakes and all! (Pays dividends on exams.)

2. MARK UP HW Directions (in Notebook) with an Accounting System!

◦ CHECK OFF questions which you answer correctly. (You’ve conquered these questions!)

◦ But CIRCLE questions which you miss.

* Study the HW Solution (online.)

* Search for similar Notebook Examples.

* Get help at Office Hours & Tutor Labs.

◦ Then RETURN to each circled question a day or two later.

* You’ll forget the HW Solution answer quickly!

* Can you answer successfully NOW on a blank sheet of paper?

That's what the quiz and exam are like!

Stats for Business Notes

Part 3: Statistical Inference
Topic 7 Confidence Intervals
Topic 8 Hypothesis Testing

The following textbook pages GREATLY ENHANCE the Part 3 Notes. Reading them together with the Notes provides a competitive advantage and deeper understanding. Highly recommended!

Topic Number   Topic                  Important Textbook Reading
7              Confidence Intervals   334–344, 395–399, 457–461, 465–467
8              Hypothesis Testing     351–367, 382–385, 399–401, 462–464

Where To Quickly Find . . .

• STATS GUIDE: Page 1

• Semester Schedule (when is the next exam, etc.?): Page 10

• Homework (what’s covered this week? when’s the quiz?): Page 11

• Syllabus (including exams and grades): Pages 6–11

Also Included in Notebook:

• Homework Directions

• Discussion Worksheets

• Statistical tables


TABLE A Standard normal probabilities

Table entry for z is the area under the standard normal curve to the left of z.

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
−3.4  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0002
−3.3  .0005  .0005  .0005  .0004  .0004  .0004  .0004  .0004  .0004  .0003
−3.2  .0007  .0007  .0006  .0006  .0006  .0006  .0006  .0005  .0005  .0005
−3.1  .0010  .0009  .0009  .0009  .0008  .0008  .0008  .0008  .0007  .0007
−3.0  .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
−2.9  .0019  .0018  .0018  .0017  .0016  .0016  .0015  .0015  .0014  .0014
−2.8  .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019
−2.7  .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
−2.6  .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
−2.5  .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
−2.4  .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
−2.3  .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
−2.2  .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
−2.1  .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
−2.0  .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
−1.9  .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
−1.8  .0359  .0351  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
−1.7  .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
−1.6  .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
−1.5  .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
−1.4  .0808  .0793  .0778  .0764  .0749  .0735  .0721  .0708  .0694  .0681
−1.3  .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
−1.2  .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
−1.1  .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
−1.0  .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
−0.9  .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
−0.8  .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
−0.7  .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
−0.6  .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
−0.5  .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
−0.4  .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
−0.3  .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
−0.2  .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
−0.1  .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
−0.0  .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641


TABLE A Standard normal probabilities (continued)

Table entry for z is the area under the standard normal curve to the left of z.

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998
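Table A entries can be cross-checked in software. The sketch below is not part of the course materials (the course uses calculators and MINITAB); it assumes Python's standard-library `statistics.NormalDist`, and the helper name `table_a_entry` is made up for illustration.

```python
from statistics import NormalDist

def table_a_entry(z: float) -> float:
    """Area under the standard normal curve to the left of z,
    rounded to four decimal places like the printed table."""
    return round(NormalDist().cdf(z), 4)

# Row -1.9, column .06 of Table A corresponds to z = -1.96.
print(table_a_entry(-1.96))
# Row 1.0, column .00 corresponds to z = 1.00.
print(table_a_entry(1.00))
```

Because the function rounds to four places, a result such as 0.025 corresponds to the printed entry .0250.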


TABLE D t distribution critical values

Table entry for p and C is the critical value t* with probability p lying to its right and probability C lying between −t* and t*.

Upper tail probability p

df     .25    .20    .15    .10    .05    .025   .02    .01    .005   .0025  .001   .0005
1      1.000  1.376  1.963  3.078  6.314  12.71  15.89  31.82  63.66  127.3  318.3  636.6
2      0.816  1.061  1.386  1.886  2.920  4.303  4.849  6.965  9.925  14.09  22.33  31.60
3      0.765  0.978  1.250  1.638  2.353  3.182  3.482  4.541  5.841  7.453  10.21  12.92
4      0.741  0.941  1.190  1.533  2.132  2.776  2.999  3.747  4.604  5.598  7.173  8.610
5      0.727  0.920  1.156  1.476  2.015  2.571  2.757  3.365  4.032  4.773  5.893  6.869
6      0.718  0.906  1.134  1.440  1.943  2.447  2.612  3.143  3.707  4.317  5.208  5.959
7      0.711  0.896  1.119  1.415  1.895  2.365  2.517  2.998  3.499  4.029  4.785  5.408
8      0.706  0.889  1.108  1.397  1.860  2.306  2.449  2.896  3.355  3.833  4.501  5.041
9      0.703  0.883  1.100  1.383  1.833  2.262  2.398  2.821  3.250  3.690  4.297  4.781
10     0.700  0.879  1.093  1.372  1.812  2.228  2.359  2.764  3.169  3.581  4.144  4.587
11     0.697  0.876  1.088  1.363  1.796  2.201  2.328  2.718  3.106  3.497  4.025  4.437
12     0.695  0.873  1.083  1.356  1.782  2.179  2.303  2.681  3.055  3.428  3.930  4.318
13     0.694  0.870  1.079  1.350  1.771  2.160  2.282  2.650  3.012  3.372  3.852  4.221
14     0.692  0.868  1.076  1.345  1.761  2.145  2.264  2.624  2.977  3.326  3.787  4.140
15     0.691  0.866  1.074  1.341  1.753  2.131  2.249  2.602  2.947  3.286  3.733  4.073
16     0.690  0.865  1.071  1.337  1.746  2.120  2.235  2.583  2.921  3.252  3.686  4.015
17     0.689  0.863  1.069  1.333  1.740  2.110  2.224  2.567  2.898  3.222  3.646  3.965
18     0.688  0.862  1.067  1.330  1.734  2.101  2.214  2.552  2.878  3.197  3.611  3.922
19     0.688  0.861  1.066  1.328  1.729  2.093  2.205  2.539  2.861  3.174  3.579  3.883
20     0.687  0.860  1.064  1.325  1.725  2.086  2.197  2.528  2.845  3.153  3.552  3.850
21     0.686  0.859  1.063  1.323  1.721  2.080  2.189  2.518  2.831  3.135  3.527  3.819
22     0.686  0.858  1.061  1.321  1.717  2.074  2.183  2.508  2.819  3.119  3.505  3.792
23     0.685  0.858  1.060  1.319  1.714  2.069  2.177  2.500  2.807  3.104  3.485  3.768
24     0.685  0.857  1.059  1.318  1.711  2.064  2.172  2.492  2.797  3.091  3.467  3.745
25     0.684  0.856  1.058  1.316  1.708  2.060  2.167  2.485  2.787  3.078  3.450  3.725
26     0.684  0.856  1.058  1.315  1.706  2.056  2.162  2.479  2.779  3.067  3.435  3.707
27     0.684  0.855  1.057  1.314  1.703  2.052  2.158  2.473  2.771  3.057  3.421  3.690
28     0.683  0.855  1.056  1.313  1.701  2.048  2.154  2.467  2.763  3.047  3.408  3.674
29     0.683  0.854  1.055  1.311  1.699  2.045  2.150  2.462  2.756  3.038  3.396  3.659
30     0.683  0.854  1.055  1.310  1.697  2.042  2.147  2.457  2.750  3.030  3.385  3.646
40     0.681  0.851  1.050  1.303  1.684  2.021  2.123  2.423  2.704  2.971  3.307  3.551
50     0.679  0.849  1.047  1.299  1.676  2.009  2.109  2.403  2.678  2.937  3.261  3.496
60     0.679  0.848  1.045  1.296  1.671  2.000  2.099  2.390  2.660  2.915  3.232  3.460
80     0.678  0.846  1.043  1.292  1.664  1.990  2.088  2.374  2.639  2.887  3.195  3.416
100    0.677  0.845  1.042  1.290  1.660  1.984  2.081  2.364  2.626  2.871  3.174  3.390
1000   0.675  0.842  1.037  1.282  1.646  1.962  2.056  2.330  2.581  2.813  3.098  3.300

z∗ 0.674 0.841 1.036 1.282 1.645 1.960 2.054 2.326 2.575 2.807 3.091 3.291

50% 60% 70% 80% 90% 95% 96% 98% 99% 99.5% 99.8% 99.9%

Confidence level C


UNIVERSITY OF IOWA
Department of Statistics and Actuarial Science

STAT 1030 Statistics for Business Spring 2018

Course Information

Overview We develop statistical methods of inductive reasoning to make the best-possible business decisions based on available partial (sample) information. We rely on deductive (mathematical) reasoning through Probability as a vital tool to help us achieve that goal.

STAT 1030 provides general education credit for quantitative and formal reasoning and is prerequisite for MSCI 2800 Business Analytics.

Lecture 3:30–4:45 MW in Macbride Hall Auditorium

Stats Course Website (supplements ICON) homepage.divms.uiowa.edu/~blake/stat1030

Instructor Blake Whitten Office: 261 Schaeffer Hall (319) 335-0647 [email protected]

Prof. Whitten’s Office Hours

◦ Regular Weekly Office Hours: Monday 9:30 AM – 12:30 PM

◦ Special “Pre-Exam” Monday evening office hours: 5:30–7:00 PM on Feb. 12, Mar. 19, Apr. 9

Required Materials

• Course Packet from Zephyr Copies, 125 S. Dubuque St., (319) 351-3500 ($35.00 for two-Notebook bundle)

• Textbook: The Practice of Statistics for Business and Economics, 3rd Edition 2011 by Moore. ISBN 978-1429-2425-30, UI Bookstore & Beat The Bookstore (Old Capitol Mall)

• Calculator: Any calculator is acceptable if it can produce one-variable statistics (sample mean and standard deviation) from a set of numbers input as raw data.

∗ The Calculator Help link on the Main Stats Website supports the following models:

TI-83, TI-84, TI-89 Titanium, TI-30X II S, TI-BA II Plus, Casio FX-300 MS Plus, HP-50g.

∗ Many other calculator models work great too, but you may need to google your calculator’s directions for “standard deviation” if not listed above.

∗ The TI-83, TI-84 and TI-89 are graphing calculators, but graphing capability is not used in Business Stats. Many Casio and other TI calculators work fine and are less expensive.

• MINITAB 17 Statistical Software: Available in 41 Schaeffer Hall, Main Library Learning Commons, Tippie College of Business, and other computing locations on campus


Course Features

Course Notes Are Key!

• Buy Course Notebooks from Zephyr Printing, 125 S. Dubuque St. (in the downtown Ped Mall near Weatherdance Fountain.)

• Students complete Notebook Examples (together with Prof. Whitten and TAs) in Macbride.

• If you miss class, get that day’s notes from any TA/Professor office hour or from a classmate.

Stats Homework

• Use the “Accounting Method” described in the STATS GUIDE for success!

• Homework is not collected. Instead, homework answers are posted on the Stats Website so you can check answers and work through challenges/incorrect answers. (This requires discipline.) Students take homework quizzes in Discussion instead of graded homework.

Stats Assistance (Personalize your study routine with help from three sources)

• Stats Dept. Tutor Lab (graduate student tutors)

◦ Location: 1113 Red LIB (Main Library Learning Commons)

◦ Weekly tutoring schedule: http://www.stat.uiowa.edu/resources/tutoring

• Tippie College Peer-To-Peer Tutoring (undergrad student tutors)

◦ Location: Tippie Business Library (4th floor PBB)

◦ Schedule: 5–7 PM Monday, Tuesday, and Thursday nights

• Shared TA Office Hours – See ANY TA, not just your own! (see Stats Website link)

Discussions

• Weekly Quizzes: Quizzes may cover any current or previous HW assignments as well as anything discussed in lectures/Discussions in the course to date.

• Topic Worksheets (included in Notebooks), a boost for the next HW challenge!

• Help with MINITAB statistical software

Quizzes

• Quizzes may cover any previous or current HW assignments as well as anything discussed in lectures/Discussions in the course to date (provides the best-possible exam preparation!)

• Discussion Quizzes are time-sensitive, so as a practical matter makeup quizzes are not given. Instead, and as an explicit allowance for necessary absence (university-sanctioned events, illness, family emergency, etc.), the lowest 2 quiz scores are dropped from the calculation of the course grade.

(See more about quizzes next page)


• If you do miss a quiz, we recommend that you pick up an extra copy from your TA to work as practice for the next exam. Your TA will email your quiz version’s solution to you each week after the quiz has been graded.

• We’ll also use Practice Quizzes (not for credit) in Macbride lectures to help prep for exams.

Exams

• Exams are multiple-choice and closed-book. Three 90-minute midterm exams are taken 6:30–8:00 PM on Thursday evenings (see Course Schedule), in addition to a (comprehensive) final exam.

• You may use one standard sheet of paper (8.5″ by 11″), front and back, of handwritten or word-processed formulas and notes on each midterm exam.

◦ Make your own formula sheets. It’s good practice!

◦ Keep your midterm-exam formula sheets to re-use on the (comprehensive) final exam!

◦ For the final exam you may use four standard sheets of paper, front and back (one sheet for new topics covered after Exam 3; one sheet for each of the three midterms).

• Midterm Exam Assigned Desks: Must use assigned desk and room to earn exam credit.

Exam Rooms (See Your Assigned Room/Desk on Stats Website Exams Page)

◦ SHAM LIB

◦ AUD MH

◦ C20 PC

◦ C31 PC

You must attend your exam location to earn grading credit. (See classroom maps on Exams Page to find your desk!)

• Midterm Exam Score Replacement Policy:

◦ If the final exam percentage score exceeds at least one of the midterm exam percentage scores, then the single lowest midterm score is replaced by the final exam score (at most one replacement) in the calculation of the course grade.

◦ Since the final exam is comprehensive, you have a second chance to score higher if a particular exam doesn’t go well.

• Makeup Exam Policy

◦ Experience shows that staying on schedule is vital to Stats success!

◦ So students are required to take exams as scheduled except in cases of officially university-approved absence such as class conflict with official exam time, illness, religious observance, and NCAA athletic competition.

◦ Makeup exams are not available for other reasons, including student org field trips, club competitions and personal events. You still may choose to miss an exam for personal reasons, in which case the exam score of 0 is automatically replaced by the final exam score, as described by the Midterm Exam Score Replacement Policy.

MINITAB Reports

• Students complete MINITAB assignments (and submit reports) during the semester. (These are independent of weekly HW assignments.) Frequency and due dates to be announced.

• MINITAB greatly simplifies some statistical calculations and graphs. Students who become business majors in the Tippie College and take the subsequent course MSCI 2800 (Business Analytics) greatly benefit from prior computing experience.

Course Grades

• Weights for Course Percentage:

  5%  Discussion
  5%  Minitab
 15%  Quizzes
 15%  Exam 1
 15%  Exam 2
 15%  Exam 3
 30%  Final Exam
100%

• Calculate your own Course Percentage, as follows:

1. Drop your two lowest Quiz scores. Then calculate a Quiz Percentage from the remaining scores.

2. Calculate a MINITAB Percentage by averaging MINITAB Report scores.

3. Replace your lowest Midterm Exam score with your Final Exam score only if such replacement improves your score (at most one such replacement).

4. Now use the following formula:

Course % = (0.05)(Discussion) + (0.05)(Minitab) + (0.15)(Quiz) + (0.15)(Exam 1) + (0.15)(Exam 2)

+ (0.15)(Exam 3) + (0.30)(Final Exam)
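The four steps above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator: the function name, argument names, and sample scores are hypothetical, every score is assumed to be a percentage from 0 to 100, and the Quiz Percentage is assumed to be the plain average of the quizzes kept.

```python
def course_percentage(discussion, minitab_reports, quizzes, midterms, final_exam):
    """Course % following steps 1-4 of the syllabus (all inputs are percentages)."""
    # Step 1: drop the two lowest quiz scores, then average the rest
    # (assumes more than two quizzes were taken and equal quiz weights).
    kept = sorted(quizzes)[2:]
    quiz_pct = sum(kept) / len(kept)

    # Step 2: average the MINITAB report scores.
    minitab_pct = sum(minitab_reports) / len(minitab_reports)

    # Step 3: replace the lowest midterm with the final exam score,
    # but only if that raises the score (at most one replacement).
    exams = list(midterms)
    lowest = exams.index(min(exams))
    if final_exam > exams[lowest]:
        exams[lowest] = final_exam

    # Step 4: apply the weights 5/5/15/15/15/15/30.
    return (0.05 * discussion + 0.05 * minitab_pct + 0.15 * quiz_pct
            + 0.15 * sum(exams) + 0.30 * final_exam)

# Hypothetical student: the 70 on one midterm is replaced by the final exam score of 88.
print(round(course_percentage(100, [90, 100], [80, 90, 100, 60, 70, 90],
                              [70, 85, 90], 88), 2))
```

Since the three midterms carry equal weight, replacing the lowest one is the same as adding 0.15 times the adjusted midterm total.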

• Course grades are earned according to the following minimum Course Percentage:

A   92%    A−  90%    B+  88%
B   82%    B−  80%    C+  78%
C   72%    C−  70%    D+  68%
D   62%    D−  60%
F   Below 60%

For example, a course percentage of 87.9999% earns a grade of B in STAT 1030. So that course grades are meaningful, all students earn grades on the same scale, without exception.
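The grade scale is a lookup against minimum cutoffs with no rounding up. A minimal sketch, assuming the cutoffs listed above (the names `CUTOFFS` and `letter_grade` are illustrative only):

```python
# Minimum Course Percentage for each grade, highest first (from the syllabus table).
CUTOFFS = [
    (92, "A"), (90, "A-"), (88, "B+"),
    (82, "B"), (80, "B-"), (78, "C+"),
    (72, "C"), (70, "C-"), (68, "D+"),
    (62, "D"), (60, "D-"),
]

def letter_grade(course_pct: float) -> str:
    """Return the first grade whose minimum the percentage meets; no rounding is applied."""
    for minimum, grade in CUTOFFS:
        if course_pct >= minimum:
            return grade
    return "F"

print(letter_grade(87.9999))  # the syllabus example: just short of the B+ cutoff
print(letter_grade(88))
```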


Macbride Hall Semester Schedule

Week  Lecture  Day  Date     Subject
1     −        Mon  Jan. 15
      1        Wed  Jan. 17  Topic 1: Six Steps of Inference
2     2        Mon  Jan. 22  Topic 1
      3        Wed  Jan. 24  Topic 2: Describing Sample Data
3     4        Mon  Jan. 29  Topic 2 & Topic 3: Probability
      5        Wed  Jan. 31  Topic 3
4     6        Mon  Feb. 5   Topic 3
      7        Wed  Feb. 7   Topic 3
5     8        Mon  Feb. 12  Exam 1 Practice Questions (Prof. Whitten special evening office hours)
      9        Wed  Feb. 14  Topic 4: Random Variables

Midterm Exam 1: Thursday, Feb. 15 6:30 – 8:00 PM (Covers Topics 1–3)

6     10       Mon  Feb. 19  Topic 4
      11       Wed  Feb. 21  Topic 4 & Topic 5: Continuous Distributions
7     12       Mon  Feb. 26  Topic 5
      13       Wed  Feb. 28  Topic 5 & Topic 6: Sampling Distributions
8     14       Mon  Mar. 5   Topic 6
      15       Wed  Mar. 7   Topic 6
9     −        Mon  Mar. 12  (Spring Break)
      −        Wed  Mar. 14
10    16       Mon  Mar. 19  Exam 2 Prep Worksheet (Prof. Whitten special evening office hours)
      17       Wed  Mar. 21  Exam 2 Practice Questions

Midterm Exam 2: Thursday, March 22 6:30 – 8:00 PM (Covers Topics 4–6)

11    18       Mon  Mar. 26  Topic 7: Confidence Intervals
      19       Wed  Mar. 28  Topic 7 & Topic 8: Hypothesis Testing
12    20       Mon  Apr. 2   Topic 8
      21       Wed  Apr. 4   Topic 8
13    22       Mon  Apr. 9   Exam 3 Practice Questions (Prof. Whitten special evening office hours)
      23       Wed  Apr. 11  Topic 9: Comparing Two Proportions

Midterm Exam 3: Thursday, April 12 6:30 – 8:00 PM (Covers Topics 7–8)

14    24       Mon  Apr. 16  Topic 10: Correlation, Regression, and Stock Portfolios
      25       Wed  Apr. 18  Topic 10
15    26       Mon  Apr. 23  Topic 10
      27       Wed  Apr. 25  Topic 10
16    28       Mon  Apr. 30  Course Overview, Final Exam Practice Sheet
      29       Wed  May 2    Final Exam Q/A Session

Final Exam (Date and location TBA)


Homework Due Dates (Tuesday Discussion Quiz on Current or Past HW)

Homework    Discussion
Assignment  Due Date   Topics Covered                      Homework Tips!
1           Jan. 23    Topic 1: Six Steps of Inference     Write and think carefully! Refer to Six Steps and Notebook 1 diagrams pages 21–23
2           Jan. 30    Topic 1, Topic 2: Describing Data   Use Stats calculator functions in Discussion
3           Feb. 6     Topic 3: Probability                Features . . . Laws of Probability
4           Feb. 13    Topic 3                             Features . . . Conditional Probability
5           Feb. 20    Topic 4: Discrete Random Variables  Includes . . . Mean/Expected Value
6           Feb. 27    Topic 4                             Variance, Standard Deviation, Binomial RV’s
7           March 6    Topic 5: Continuous Distributions   Uniform and Normal Random Variables
8           March 20   Topic 6: Sampling Distributions     Tough homework! Do it before Spring Break!
9           April 3    Topic 7: Confidence Intervals       Long assignment, start early!
10          April 10   Topic 8: Hypothesis Tests           Long assignment, start early!
11          April 24   Topic 10: Correlation and Regression  First regression assignment, start early!
12          May 1      Topic 10                            Second regression assignment, start early!

Administrative Notes

Honor Code Course policies are governed by the College of Liberal Arts and Sciences. Cheating (academic misconduct) in STAT 1030 is unethical and unfairly punishes honest students. Cheating includes, but is not limited to, copying or communicating with other students during quizzes or exams, and unauthorized use of notes, computer files, or other study aids. All academic misconduct will receive the following sanctions:

1. A report is filed with the College. You will be placed on academic probation for 5 years, suspended, or expelled.

2. You will receive a 0 on the quiz/exam on which the academic misconduct occurs.

3. You will not be able to drop the lowest 2 quiz scores (if misconduct occurs on a quiz) and cannot use the Midterm Exam Score Replacement Policy (if misconduct occurs on a midterm exam).

4. Your final course grade will be lowered by 2 full letter grades (e.g. from a B+ to a D+).

Disabilities Please see me as soon as possible (after class or during office hours) if you have any disabilities for which alternative arrangements for lectures or exams should be made. Student Disability Services (located in Burge Hall) directs such arrangements at UI and also welcomes student inquiries.

Sexual Harassment The College of Liberal Arts and Sciences and the University of Iowa are committed to providing students with an environment free from sexual harassment. If you feel that you are being or have been harassed or you are not sure what constitutes sexual harassment, visit the University website http://www.sexualharassment.uiowa.edu/index.php to seek assistance from department chairs, the Dean’s Office, the University Ombuds Office, or the Office of Equal Opportunity and Diversity.

Department Chair Joseph Lang, 241 Schaeffer Hall, 335-0712, [email protected]


Stats for Business
Topic 7: Confidence Intervals

(Textbook Reading: Chapter 6 (334–344), Chapter 7 (395–399), Chapter 8 (457–461, 465–467))

NOW WE RETURN TO A SIX STEPS DIAGRAM FROM THE FIRST WEEK OF CLASS!

We’re finally ready to “seal the circle” of the Six Steps by making inferences (generalizationsbased on inductive reasoning) about populations from samples.

In particular, we use statistics in Step 6 to estimate parameters:

Common (Parameter/Statistic) Pairings:

Variable                   Example Sample Measurements   Statistic                     Parameter
numerical variable x       100, 26, −14, . . .           Sample mean $\bar{x}$         Pop mean $\mu_x$
numerical variable x       100, 26, −14, . . .           Sample variance $s^2$         Pop variance $\sigma_x^2$
numerical variable x       100, 26, −14, . . .           Sample stand. dev. $s$        Pop stand. dev. $\sigma_x$
(Yes/No or 1/0) variable   Yes, No, Yes, . . .           Sample proportion $\hat{p}$   Pop proportion $p$


◦ The two most common and important inferences are:

1. proportions (binomial) problems:

(sample proportion) $\hat{p} \xrightarrow{\text{estimates}} p$ (population proportion)

2. means problems:

(sample mean) $\bar{x} \xrightarrow{\text{estimates}} \mu$ (population mean)

◦ What else do we know?

From Topic 6 Notes (and a mathematical fact known as the Central Limit Theorem, which underpins those notes), two bell curves are in play:

1. proportions problems:

• If the original data collected in Step 4 are (Yes/No) or (1/0) measurements, then a “large” sample size (n ≥ 30) “buys” a bell curve!

• Is There A Snag? YES! WE’RE SURE TO BE WRONG!

Warning: According to the bell curve, there’s a 100% chance that our sample estimate will be wrong!

$P(\text{correct estimate}) = P(\hat{p} = p) = \text{Area of the straight line at center of curve} = 0\%$


2. means problems: There’s a bell curve if either of two conditions holds:

(1) individual numerical measurements x have a normal distribution (bell curve) to begin with, for any sample size n

(2) OR individual measurements x are not normally distributed, but the large sample size (n ≥ 30) “buys” a bell curve for the sample mean $\bar{x}$.

Intuition: small and large values of x cancel, so $\bar{x}$ is likely “in the middle.”
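Condition (2) can be illustrated with a small simulation. The sketch below is an illustration only, not part of the course materials: it draws repeated samples of size n = 30 from an exponential distribution (a skewed, non-normal choice made here purely as an assumption) and looks at how the sample means behave. The function name and the seed are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n=30, repetitions=5000, seed=1030):
    """Draw `repetitions` samples of size n from an exponential distribution
    with mean 1 and return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(repetitions)]

means = simulate_sample_means()
# Topic 6 theory: the sample means center on the population mean (1 here)
# with standard deviation sigma / sqrt(n) = 1 / sqrt(30).
print("average of sample means:", round(mean(means), 3))
print("std. dev. of sample means:", round(stdev(means), 3))
print("theoretical std. dev.:", round(1 / sqrt(30), 3))
```

A histogram of `means` would look close to a bell curve even though the individual measurements are strongly skewed.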

• Is There A Snag? YES, THE SAME AS BEFORE!

Warning: According to the bell curve, there’s a 100% chance that our sample estimate will be wrong!

$P(\text{correct estimate}) = P(\bar{x} = \mu) = \text{Area of the straight line at center of curve} = 0\%$

THIS MEANS WE HAVE 0% CONFIDENCE IN OUR STATS! (NOT GOOD)

HOW CAN WE SOLVE THIS CONFIDENCE CRISIS?

[Figure: bell curves and sketches of exponential, triangular, and uniform distributions for x.]

Example 1 (Halloween party)
Sororities and fraternities at UCSB (University of California, Santa Barbara) throw a traditional Halloween party. The mean number of kegs of beer consumed (each keg contains 7.75 gallons) at the party each Halloween is unknown and is represented by the population mean µ.

(The population is all such Halloween parties, in the past and future.)

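Set x = number of kegs consumed at the party

• Suppose Sample #1 from 5 parties is

$x_1 = 6.42 \quad x_2 = 10.69 \quad x_3 = 12.24 \quad x_4 = 4.76 \quad x_5 = 8.89$

$\Longrightarrow \bar{x} = \dfrac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \dfrac{43.00}{5} = 8.60$

• Suppose Sample #2 from 5 parties is

$x_1 = 7.02 \quad x_2 = 15.92 \quad x_3 = 9.04 \quad x_4 = 6.29 \quad x_5 = 11.34$

$\Longrightarrow \bar{x} = \dfrac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \dfrac{49.61}{5} = 9.922$

• Suppose Sample #3 from 5 parties is

$x_1 = 10.10 \quad x_2 = 8.46 \quad x_3 = 12.12 \quad x_4 = 9.22 \quad x_5 = 8.74$

$\Longrightarrow \bar{x} = \dfrac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \dfrac{48.64}{5} = 9.728$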

Given that the mean µ (for all parties) is a single unknown number, what are the chances that any of the three sample means

$\bar{x} = 8.60 \qquad \bar{x} = 9.922 \qquad \bar{x} = 9.728$

is exactly the correct estimate of µ? ⟹ 0% chance ⟹ 0% confidence
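The three sample means in Example 1 can be reproduced with a short script. This is a sketch for checking the arithmetic only; the variable and function names are illustrative.

```python
# Keg counts for the three samples of 5 parties in Example 1.
samples = {
    "Sample #1": [6.42, 10.69, 12.24, 4.76, 8.89],
    "Sample #2": [7.02, 15.92, 9.04, 6.29, 11.34],
    "Sample #3": [10.10, 8.46, 12.12, 9.22, 8.74],
}

def sample_mean(values):
    """Sample mean: the sum of the measurements divided by the sample size."""
    return sum(values) / len(values)

for name, values in samples.items():
    # Round for display; floating-point sums can carry tiny representation error.
    print(name, "sum =", round(sum(values), 2), "mean =", round(sample_mean(values), 3))
```

Each sample gives a different estimate of the same µ, which is the point of the example: no single sample mean can be trusted to hit µ exactly.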

HOW CAN WE SOLVE THIS CONFIDENCE CRISIS?

(see next page)


Solution: Use the Z Curve to Restore 90%, 95%, 99% Confidence!

One of the beautiful things about the Z Curve is that it acts as “universal translator” for both means problems and proportions problems:

We’ll convert relatively-simple answers from the Z curve and Z table into a formula for proportions problems:

• Restore 95% confidence to proportions problems

1. Find a number z0 from the Z table so that

P (−z0 < Z < z0) = 95% = 0.9500

The Z table shades to the left so the area to “work backward” is 0.0250 .

Answer:
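
One way to check a Z table lookup numerically is sketched below. It assumes Python's standard-library `statistics.NormalDist` as a stand-in for the printed table; the helper name `z_star` is hypothetical.

```python
from statistics import NormalDist

def z_star(confidence):
    """Return z0 such that P(-z0 < Z < z0) equals the given confidence."""
    tail_area = (1 - confidence) / 2            # area in EACH tail, e.g. 0.0250
    # The table shades to the left, so look up the point with
    # cumulative area 1 - tail_area (0.9750 for 95% confidence).
    return NormalDist().inv_cdf(1 - tail_area)

print(f"95% confidence: z0 = {z_star(0.95):.2f}")
```

The same helper can be called with 0.90 or 0.99 to check the other two confidence levels used later in these notes.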

[Figure: the (means) Z formula and the (proportions) Z formula both translate to the Z curve. On the Z curve, the area between −z0 and z0 is 0.4750 + 0.4750 = 0.9500, leaving 0.0250 in each tail.]

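2. Convert to a proportions confidence interval:

\[ P(-1.96 < Z < 1.96) = 0.95 \]

But the Z formula is

\[ Z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} \;\Longrightarrow\; P\left[-1.96 < \frac{\hat{p} - p}{\sigma_{\hat{p}}} < 1.96\right] = 0.95 \]

Multiply all three sides of \( -1.96 < \frac{\hat{p} - p}{\sigma_{\hat{p}}} < 1.96 \) by (−1):

\[ \Longrightarrow\; -1.96 < \frac{p - \hat{p}}{\sigma_{\hat{p}}} < 1.96 \]

\[ \Longrightarrow\; (-1.96)\,\sigma_{\hat{p}} < p - \hat{p} < (1.96)\,\sigma_{\hat{p}} \]

\[ \Longrightarrow\; \hat{p} - (1.96)\,\sigma_{\hat{p}} < p < \hat{p} + (1.96)\,\sigma_{\hat{p}} \]

So

\[ P\left[\hat{p} - (1.96)\,\sigma_{\hat{p}} < p < \hat{p} + (1.96)\,\sigma_{\hat{p}}\right] = 0.95 \]

In other words, a 95% confidence interval for the true (population) proportion p is

\[ \hat{p} \pm (1.96)\,\sigma_{\hat{p}} = \hat{p} \pm (1.96)\sqrt{\frac{p\,(1 - p)}{n}} \approx \hat{p} \pm (1.96)\sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}} \]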

If we already knew the true value of p we wouldn’t be estimating it!

(we’ll apply the formula next page)


Example 2 (President’s approval rating)
A Gallup telephone poll of 1500 adult Americans on April 3, 2013 shows that 618 of those Americans approve of the president’s performance.

(a) Find a 95% confidence interval for

p = true president’s approval rating in the population of all adult Americans

(b) Interpret the confidence interval.

(c) Explain what “95% confidence” really means to the president’s advisors in the White House.
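
For checking hand calculations in part (a), the large-sample interval can be sketched in Python. This is a minimal illustration, not the course's required method: the function name `proportion_ci` is hypothetical, and `NormalDist` replaces the Z table, so z is about 1.96 rather than exactly 1.96.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion p."""
    p_hat = successes / n                                # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    standard_error = sqrt(p_hat * (1 - p_hat) / n)       # estimated std. dev. of p-hat
    margin = z * standard_error                          # the "plus or minus" part
    return p_hat - margin, p_hat + margin

# April 3 poll: 618 of 1500 adults approve.
low, high = proportion_ci(618, 1500)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

Passing `confidence=0.90` or `confidence=0.99` to the same function gives the other intervals requested later in Example 2.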


• Restore 90% confidence to proportions problems

How many standard deviations z0 are needed to restore 90% confidence?

Find a number z0 from the Z table so that

P (−z0 < Z < z0) = 90% = 0.9000

Answer:

Example 2, continued.

(d) Find a 90% confidence interval for

p = true president’s approval rating in the population of all adult Americans

(e) Interpret the confidence interval.

[Figure: Z curve centered at 0 with −z0 and z0 marked and a tail area of 0.05.]

• Restore 99% confidence to proportions problems

How many standard deviations z0 are needed to restore 99% confidence?

Find a number z0 from the Z table so that

P (−z0 < Z < z0) = 99%

Answer:

Example 2, continued.

(f) Find a 99% confidence interval (CI) for

p = true president’s approval rating in the population of all adult Americans

(g) From the White House perspective, name a disadvantage to using a 99% CI rather than a 90% CI to determine the president’s popularity.


Example 2, continued.
Suppose that another Gallup poll of 3000 adult Americans on April 4, 2013 shows that 1236 of those Americans approve of the president’s performance.

(h) Find a 95% CI for p on April 4.

(i) Which 95% CI is more precise, the one for April 3 or for April 4? Explain why.
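
As a hint for part (i), note that both polls give the same sample proportion, 618/1500 = 1236/3000 = 0.412, so the two margins of error differ only through the sample size. A short worked ratio, assuming the large-sample margin-of-error formula from the earlier derivation:

\[ \frac{m_{\text{April 4}}}{m_{\text{April 3}}} = \frac{(1.96)\sqrt{\dfrac{(0.412)(0.588)}{3000}}}{(1.96)\sqrt{\dfrac{(0.412)(0.588)}{1500}}} = \sqrt{\frac{1500}{3000}} = \frac{1}{\sqrt{2}} \approx 0.707 \]

Doubling the sample size therefore shrinks the margin of error by about 29%, not by half.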


More Practice With Proportions Confidence Intervals

The box below shows the textbook’s approach to confidence intervals:

Two important items from the box:

• The standard error SE_p̂ means estimated standard deviation of p̂:

\[ \sigma_{\hat{p}} = \sqrt{\frac{p\,(1 - p)}{n}} \approx \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}} = SE_{\hat{p}} \]

• The margin of error m is the “plus or minus” part of the formula which restores

confidence from 0% to something reasonable such as 90%, 95%, or 99%.

(Notes on the formula: We can't calculate the actual standard deviation! We can calculate this approximation!)

Example 3 (Construction rebound)
By Spring 2013 the construction industry in the U.S. was poised for a rebound from the massive decline in construction spending during the Great Recession of 2007–2009.

As a gauge of future activity, 100 independent homebuilding companies were interviewed in April 2013. Fifty-nine of these companies indicated that they plan to increase building activity in 2013 compared to 2012, while the other 41 do not.

(a) Provide a single number which is the best estimate for the proportion of all independent homebuilding companies in the U.S. which plan to increase building in 2013 compared to 2012.

Answer:

(b) Find the 95% margin of error.

(c) Find a 95% confidence interval for the proportion.

(d) Find a 95% confidence interval for the percentage of independent homebuilders which plan to increase building in 2013.

(e) Two economists interviewed on CNBC offered conflicting views on the construction rebound:

• Economist A predicted that 50% of independent homebuilders will increase building in 2013.

• Economist B predicted that 75% of such homebuilders will increase building.

Does the evidence support either economist, with 95% confidence? That is, is either prediction plausible? (Tip: Draw a quick number line graph.)
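
The "number line" check in part (e) amounts to asking whether each claimed value falls inside the interval. A hedged sketch in Python follows; the helper `plausible` is a hypothetical name, and `NormalDist` stands in for the Z table.

```python
from math import sqrt
from statistics import NormalDist

n = 100
p_hat = 59 / n                                   # part (a): best single estimate
z_star = NormalDist().inv_cdf(0.975)             # about 1.96 for 95% confidence
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)  # part (b): 95% margin of error
low, high = p_hat - margin, p_hat + margin       # part (c): 95% CI

def plausible(claim):
    """A claimed proportion is plausible if it lies inside the interval."""
    return low < claim < high

print(f"95% CI: ({low:.4f}, {high:.4f})")
print("Economist A (50%) plausible:", plausible(0.50))
print("Economist B (75%) plausible:", plausible(0.75))
```

Multiplying both endpoints by 100 gives the percentage interval asked for in part (d).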


Example 4 (Mobile internet)
Sixty of 87 users of mobile internet devices prefer the Apple iPhone over the Verizon Hotspot. If these 87 users represent a random sample of all consumers,

(a) Find a 90% confidence interval for the proportion of all consumers who prefer the iPhone over the Hotspot.

(b) Find a 90% confidence interval for the percentage of all consumers who prefer the iPhone over the Hotspot.

(c) Apple claims that at least 80% of all consumers prefer the iPhone over the Hotspot. Is the claim plausible (with 90% confidence)?

(d) Verizon claims that at most 80% prefer the iPhone. Is that claim plausible?


Confidence Intervals for Means Problems

◦ We’re able to solve the “confidence crisis” for proportions problems as long as we have enough information in the form of statistical currency or money: a large sample size (n ≥ 30).

◦ We can similarly solve the confidence crisis for means problems, though small sample sizes complicate matters a bit. (Small samples, even n ≥ 1, are allowed only if the variable x itself has a bell curve.)

• Recall that z0 = 1.96 standard deviations “buys” 95% confidence:

P (−1.96 < Z < 1.96) = 0.95

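But the Z formula is

\[ Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} \;\Longrightarrow\; P\left[-1.96 < \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < 1.96\right] = 0.95 \]

Multiply all three sides of \( -1.96 < \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < 1.96 \) by (−1):

\[ \Longrightarrow\; -1.96 < \frac{\mu - \bar{x}}{\sigma_{\bar{x}}} < 1.96 \]

\[ \Longrightarrow\; (-1.96)\,\sigma_{\bar{x}} < \mu - \bar{x} < (1.96)\,\sigma_{\bar{x}} \]

\[ \Longrightarrow\; \bar{x} - (1.96)\,\sigma_{\bar{x}} < \mu < \bar{x} + (1.96)\,\sigma_{\bar{x}} \]

So

\[ P\left[\bar{x} - (1.96)\,\sigma_{\bar{x}} < \mu < \bar{x} + (1.96)\,\sigma_{\bar{x}}\right] = 0.95 \]

In other words, a 95% confidence interval for the true (population) mean μ is

\[ \bar{x} \pm (1.96)\,\sigma_{\bar{x}} = \bar{x} \pm (1.96)\frac{\sigma}{\sqrt{n}} \]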

• Unfortunately it’s not practical to expect that we know the pop stand. dev. σ if we’re trying to estimate the pop mean μ!

• We can try to estimate standard deviations (i.e., σ ≈ s) but the estimate is inaccurate for small sample sizes (n < 30). Help! What should we do?


Using the t curve to Compensate for Small Samples in Means Problems

We can compensate for small sample sizes and for imperfect information (using s instead of σ) by substituting the t curve and table for the Z curve:

\[ \bar{x} \pm (z^*)\,\sigma_{\bar{x}} = \bar{x} \pm (z^*)\frac{\sigma}{\sqrt{n}} \approx \bar{x} \pm (t^*)\frac{s}{\sqrt{n}} \]

• The t curve is wider than the Z curve, which reflects less-accurate statistical data (small sample sizes):

[Figure: the N(0, 1) curve and the wider t(5) curve, both centered at 0.]

• Reading the t table requires calculating something called degrees of freedom which is almost the same thing as the sample size:

degrees of freedom df = n− 1

Example 5 (Emerging markets)
The discount online broker Vanguard sells an emerging markets (international) mutual fund which trades under the name VEMAX on stock exchanges.

(a) A sample of 10 stock bids for VEMAX which arrived at the New York Stock Exchange (NYSE) within the past hour average x̄ = $35.76 with standard deviation s = $3.20.

Use the t table to find the number of standard deviations t∗ required to “buy” 95% confidence for the mean price µ of all price bids for VEMAX within the past hour.

Is t∗ more or less than the corresponding number z∗ from the Z table?

Answer:

(continued)


(b) Find a 95% CI for μ based on the 10 bids.

(c) Suppose there’s a correction for part (a): The statistics x̄ = $35.76 and s = $3.20 are actually based on 20 bids instead of 10. Find a (revised) 95% CI for μ based on the 20 bids.

(d) Find a 99% CI for μ based on a sample of five bids for the mutual fund:

x1 = $32.70 x2 = $44.10 x3 = $26.90 x4 = $38.00 x5 = $40.00
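
The t intervals in Example 5 can be checked with the sketch below. It assumes only the standard library, so the critical values are typed in by hand from a standard t table (the dictionary `T_TABLE` and the function `t_interval` are hypothetical names); it is meant as a check on hand work, not a replacement for reading the table.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided critical values copied from a standard t table:
# (degrees of freedom, confidence level) -> t*
T_TABLE = {(9, 0.95): 2.262, (19, 0.95): 2.093, (4, 0.99): 4.604}

def t_interval(x_bar, s, n, confidence):
    """Confidence interval for a mean using the t curve with df = n - 1."""
    t_star = T_TABLE[(n - 1, confidence)]
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

ci_10 = t_interval(35.76, 3.20, 10, 0.95)    # part (b): 10 bids
ci_20 = t_interval(35.76, 3.20, 20, 0.95)    # part (c): 20 bids

bids = [32.70, 44.10, 26.90, 38.00, 40.00]   # part (d): raw data, n = 5
ci_5 = t_interval(mean(bids), stdev(bids), len(bids), 0.99)

for label, (low, high) in [("(b)", ci_10), ("(c)", ci_20), ("(d)", ci_5)]:
    print(f"{label} (${low:.2f}, ${high:.2f})")
```

Note that `stdev` divides by n − 1, which matches the sample standard deviation s used in these notes.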


Example 6 (Student spending)
A survey of UI students showed that spending by those students on nonessential items (other than for food, shelter, basic clothes and transportation) averaged $64.10 per week, with standard deviation $6.26.

(a) Define the random variable. (Attach an English description to a letter.)

(b) Define the population parameter. (Attach an English description to a letter.)

(c) Find a 90% CI for average weekly spending on nonessentials by all UI students if 50 students were surveyed.

(d) Find a 90% CI for average weekly spending on nonessentials by all UI students if 150 students were surveyed.

(e) Find a 90% CI for average weekly spending on nonessentials by all UI students if 1150 students were surveyed.


Choosing the Right Sample Size

We’ve seen that the sample size n represents the wealth of sample information.

A larger sample size can be “spent” in several ways:

• greater confidence (e.g. 99% instead of 90%)

• greater precision (smaller margin of error)

• a combination of both improved confidence and improved precision

It’s also possible to calculate the minimum sample size n necessary to do any particular job, based on how confident and how precise we’d like to be!

Note: This calculation is obviously made in the planning stages, before measuring a sample.

◦ Sample sizes for means problems

A CI for a pop mean μ is

x̄ ± m

where m = margin of error.

Also,

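\[ m = (z^*)\,\sigma_{\bar{x}} = (z^*)\frac{\sigma}{\sqrt{n}} \;\Longrightarrow\; \sqrt{n} = \frac{(z^*)\,\sigma}{m} \;\Longrightarrow\; n = \left[\frac{(z^*)\,\sigma}{m}\right]^2 \]

\[ n = \frac{(z^*)^2\,\sigma^2}{m^2} \]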

To use the formula, we need a good estimate of the pop standard deviation σ before we collect the data.

• Sometimes σ is known from similar studies on the same customer base.

• We can also substitute the sample standard deviation s for σ, where s is calculated from a previous sample.


◦ Sample sizes for proportions problems

A CI for a population proportion p is

p̂ ± m

where m = margin of error.

Also,

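\[ m = (z^*)\,\sigma_{\hat{p}} = (z^*)\sqrt{\frac{p\,(1 - p)}{n}} \;\Longrightarrow\; \frac{m}{(z^*)} = \sqrt{\frac{p\,(1 - p)}{n}} \;\Longrightarrow\; \frac{m^2}{(z^*)^2} = \frac{p\,(1 - p)}{n} \]

\[ n = \frac{p\,(1 - p)\,(z^*)^2}{m^2} \]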

The function f(p) = (p)(1 − p) = p − p² is a downward-facing parabola which is maximized at p = 1/2.

(Alternatively, apply Calculus:)

* First derivative is f ′(p) = 1− 2p = 0 =⇒ p = 1/2

* Second derivative is f ′′(p) = −2 < 0 =⇒ maximum

So the maximum sample size n required to do the job for any true value of p is found by setting p = 1/2 in the previous equation:

\[ n = \frac{p\,(1 - p)\,(z^*)^2}{m^2} = \frac{(z^*)^2}{4m^2} \]

[Figure: downward-facing parabola f(p) plotted for p from 0 to 1, with its peak at p = 1/2.]

Example 7 (Zinc mine)
An investor is considering investing in an Australian zinc mine. The quality of the zinc ore in a “core sample” from the mine is measured as a concentration:

x = grams of zinc per milliliter of ore

The goal is to assess the overall quality of the mine by estimating

μ = mean concentration of all core samples (grams/ml)

It’s known from other zinc mines in the same region of Australia that the standard deviation is about σ = 0.30 grams/ml.

(a) How many core samples are needed to be 95% confident that the estimate of μ is off by less than 0.05 grams/ml?

(b) Interpret the answer to (a).
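
The means sample-size formula can be sketched in Python as a check on part (a). The function name `sample_size_for_mean` is hypothetical, and `NormalDist` is used in place of the Z table; the result is rounded up because a fraction of a core sample cannot be collected.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Minimum n so that the margin of error is at most `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z_star * sigma / margin) ** 2   # n = (z*)^2 sigma^2 / m^2
    return ceil(n_exact)                       # always round UP

n_zinc = sample_size_for_mean(sigma=0.30, margin=0.05, confidence=0.95)
print("Core samples needed:", n_zinc)
```

Rounding down would leave the margin of error slightly larger than the 0.05 grams/ml target, which is why the ceiling is used.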


Example 8 (Approval rating)
The White House wants to estimate the President’s current approval rating (percent of adult Americans who approve of his performance) to within 2% of the correct rating with 99% confidence.

(a) How many adult Americans are needed in the sample?

(b) Interpret the answer to (a).
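
A similar sketch applies to the proportions sample-size formula with the worst case p = 1/2. The function name is hypothetical, and because `NormalDist` gives z∗ ≈ 2.5758 rather than a rounded table value, the result can differ by a person or so from a hand calculation that uses 2.576 or 2.575.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p=0.5):
    """Minimum n for a proportion CI; p = 0.5 is the worst (largest-n) case."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = p * (1 - p) * z_star ** 2 / margin ** 2
    return ceil(n_exact)                       # always round UP

# Example 8: within 2 percentage points (m = 0.02) with 99% confidence.
n_poll = sample_size_for_proportion(margin=0.02, confidence=0.99)
print("Adults needed in the sample:", n_poll)
```

Leaving `p` at its default of 0.5 guarantees the sample is large enough whatever the true approval rating turns out to be.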

(end of Topic 7 Notes)


Stats for Business HOMEWORK 9 (29 problems for Topic 7: Confidence Intervals)

DIRECTIONS:

• For precision, make calculations without rounding. There are several ways to do this:

◦ TI graphing calculators let you reference ANS (the previous answer).

◦ Every calculator has a memory button for which you can google directions.

◦ The HARDEST way is to simply “write down all the numbers” if you’d prefer not to use the calculator memory.

Then write final answers for proportions to four decimal places (e.g. 0.6476 or 64.76%) and for dollar amounts to the nearest cent (e.g. $10.76).

• Directions for some exercises below allow use of either a calculator or MINITAB to calculate x̄ and s, whichever you prefer. (On exams and quizzes, though, you’ll need to use a calculator since MINITAB won’t be available.)

• Drop by either of our two Stats tutor labs (Stats Dept. Lab and Tippie Tutoring) and TA/prof office hours for discussion and consultation!

Initial Exercises Based on Means Problems

• Exercise 6.1 (page 338) Answer these specific questions:

(a) In statistics terminology, what descriptive letter is attached to the number $220?

(b) What descriptive letter is attached to the number 100?

(c) What’s the standard deviation of x̄?

(d) $220 would be the correct value of the standard deviation of x̄ for which sample size?

• Exercise 6.2 (Recall that the textbook calls the Bell Curve Rule the “68-95-99.7 Rule.”)

• Exercise 6.3

• Exercise 6.4

◦ By SRS the textbook means “simple random sample.”

◦ Open a browser to the textbook website:

bcs.whfreeman.com/psbe3e

and click “Statistical Applets” under “Browse By Category”

◦ Click the third applet in the list: “Confidence Interval”

◦ Follow the instructions in parts (a) and (b) of Exercise 6.4.

(continued)


Confidence Intervals for Proportions

• Exercise 8.1 (page 459)

• Exercise 8.3

• Exercise 8.2

• Exercise 8.4

• Exercise 8.13

• Exercise 8.14

• Exercise 8.29 Add parts (b) and (c):

(b) Find an 80% confidence interval for the proportion of all college students who have at least one credit card.

(c) Which is more precise, the 80% CI or the 95% CI? Explain.

• Exercise 8.30

• Exercise 8.37 Add part (b):

(b) Would the data support an advertisement by the company that it ships 95% of orders on time?

• Exercise 8.39 Add part (b):

(b) Are more than 70% of first-year students at this university concerned about being financially well-off?

• Exercise 8.43

Ignore the book’s instructions about p∗. Use the formula in the Topic 7 Notes instead.

• Exercise 8.49

Confidence Intervals for Means

• Exercise 7.1 (page 397)

• Exercise 7.3

• Exercise 7.4

• Exercise 6.5 (page 341)

• Exercise 6.6

(continued)


• Exercise 6.7

• Exercise 6.8

• Exercise 6.9

Exercise 6.9 is a puzzler. If you can’t figure out these answers quickly it’s okay to look at the HW9 key to discover the answers!

• Exercise 6.27 (page 350)

(Assume that the sample standard deviation is s = 3.2 years.)

• Exercise 6.28

◦ Ignore the book’s instructions to “assume the standard deviation is $21.00.” Instead, calculate the sample mean x̄ and sample standard deviation s.

◦ It might save time to stop by computer lab 41 SH to use MINITAB rather than your calculator (though the choice is yours). As a reminder, the MINITAB commands are

Stat > Basic Statistics > Display Descriptive Statistics

(The data set Shoppers is found on the Homework page of the Stats website.)

• Exercise 6.29

◦ Ignore the book’s instructions to “use $22.00 for the standard deviation” and instead use the actual standard deviation s from Exercise 6.28.

• Exercise 7.30 (page 414) Ignore parts (a) and (b). Do part (c) and add part (d):

(d) Nonprofit organizers claim that gross sales averaged at least $1800 in 2008. Is this claim plausible?

• Exercise 7.23

The MINITAB data set Mileage is available if you prefer to calculate x̄ and s with MINITAB instead of using your calculator. Also add parts (b) and (c), both with 95% confidence:

(b) Toyota’s sales literature says that the Highlander averages at least 32 mpg at 60 mph. Is that plausibly true for the car being tested?

(c) Is that definitely true for the car being tested?

(end of assignment)


Stat 1030 Topic 7 Worksheet

1. (Late credit card payments)
Hills Bank and Trust Company in Coralville collected data from a random sample of the bank’s credit card customers who are late on credit card payments. The data are the number of days late on payments.

28 32 25 34 38 26 25 18 30 26 28 13 20
21 17 16 21 23 14 32 25 21 22 20 18 26
16 30 30 20 50 25 26 28 31 38 32 21

(a) Define the random variable x. (Give an English description.)

(b) Define the relevant population parameter.

(c) Find a 95% confidence interval for the average number of days late for all of the bank’s credit card customers who are late on payments.

(d) How many credit card customers (who are late on payments) should be sampled in order to estimate the mean number of days late to within one day of the true mean with 95% confidence?

(continued)


2. (Greek bar)
Sue is an entrepreneur who is considering opening a new bar in downtown Iowa City called The Greek IC. The bar’s target clientele would be UI sorority and fraternity (Greek) members, although the bar would cater to the general public also.

Before investing in the new bar, Sue needs to determine whether such a bar would be sufficiently popular to be a financial success. She figures that if she receives $200,000 in annual revenue from enthusiastic UI Greeks, this amount plus additional revenue from the general (non-Greek) public will be enough to make the bar profitable. Sue believes she’ll be able to attract 30 cents of every dollar spent in downtown bars by Greeks who view her bar enthusiastically.

UI sorority and fraternity members spend an average of $384 per year in downtown Iowa City bars, according to the Iowa City Downtown Merchants Association. Suppose that the UI Interfraternity Council estimates that there are 3280 Greek members in 2019.

Sue conducted a random sample of UI Greeks, with these results:

Attitude Toward Proposed Bar     Number
Strongly Disinterested            4
Mildly Disinterested             19
Indifferent                      12
Mildly Enthusiastic              23
Strongly Enthusiastic            14

(a) Find a 99% confidence interval for the proportion of all UI Greeks whose attitude toward the bar is enthusiastic.

(b) How many UI Greeks should Sue interview if she wants to estimate the percentage of enthusiastic UI Greeks to within 5% of the true percentage with 99% confidence?

(continued)


(c) Find an approximate 99% confidence interval for the total amount of money spent in downtown Iowa City bars in 2019 by UI Greeks who are enthusiastic about the proposed bar.

(d) Do these data indicate that Sue’s bar would plausibly be profitable in 2019? Explain.

(e) Would you recommend that Sue open the bar? Why or why not?

(The third question on the next page is optional, Discussion time permitting.)


3. (Crime worries)
The elderly generally fear crime more than younger people, even though they are less likely to be victims of crime. One study recruited a random sample of 56 black women over the age of 65 from Atlantic City, New Jersey. Of these women, 27 said that they “felt vulnerable” to crime.

(a) Define the population parameter.

(b) Find a 90% confidence interval for the parameter.

(c) Find a 90% confidence interval for the percentage.

(d) Consider the claim that more than half of black women over age 65 from Atlantic City feel vulnerable to crime. Is this claim plausible?

(end of Topic 7 worksheet)


Stats for Business
Topic 8: Hypothesis Testing

(Textbook Reading: Chapter 6 (351–367, 382–385), Chapter 7 (399–401), Chapter 8 (462–464))

SOMETIMES THE GOAL IN STEP 6 (INFERENCE) IS TO DECIDE AN ISSUE CLEARLY, with a Yes or No answer to a well-defined and specific question.

A statistical procedure called hypothesis testing is used to do exactly that:

• Two conflicting hypotheses (points of view) clash, only one of which can be correct!

• Our job is to decide which hypothesis “wins”, based on the sample evidence.

◦ The research or alternative hypothesis (written HR or HA) often represents a new idea, theory, or suspicion to be tested.

◦ The null hypothesis H0 nullifies or opposes the new idea or theory and so often represents conventional wisdom or the status quo.

• New theories such as HA are plentiful and are often wrong! So we’ll protect ourselves against making the mistake of rejecting H0 and accepting HA when we shouldn’t (i.e., when H0 is true and HA is false).

Actually, there are two types of mistakes that we can make:

                              Truth about the population
Decision based on sample      H0 true             HA true
Reject H0                     Type I error        Correct decision
Accept H0                     Correct decision    Type II error

(Note on the diagram: HA is the idea to test!)

But of these two types of mistakes,

Type I Error = REJECT H0 when H0 is TRUE

is usually considered the more serious error, as (medical) Example 1 below shows.

• So we “limit our liability” with small probabilities such as:

(Greek letter alpha) α = P (Type I error) = 0.10, 0.05, or 0.01.

Note: These errors are mirror images of 90%, 95%, and 99% confidence intervals in Topic 7, respectively.

Example 1 (new AIDS drug cocktail)
When the AIDS epidemic began in the mid-1980s, the disease was fatal. But in the 1990s a “cocktail” of several drugs mixed together was shown to provide a powerful counterweight to AIDS so that today the disease can often be managed successfully.

Suppose that Drug Cocktail A is this proven combination and is the therapy that doctors currently prescribe. Scientists recently discovered a new therapy, Drug Cocktail B, which is in clinical trials testing to determine whether it’s safe and effective.

Of course the hope is that the (new) Cocktail B performs better than the (standard) Cocktail A! This theory then becomes the research or alternative hypothesis in the clinical testing:

HA: Cocktail B performs better than Cocktail A

H0: Cocktail B does not perform better than Cocktail A

(a) Describe the Type I error.

(b) Describe the Type II error.

(c) Why is the Type I error considered especially dangerous?

(Note on α: This probability is called the test's significance level.)

More About Hypothesis Tests

• The logic of Hypothesis testing is proof by contradiction:

◦ We assume that the null hypothesis H0 is true.

◦ If evidence is overwhelming against H0 then we Reject H0 and the new idea HA is proven by contradiction.

◦ But if there’s only some and not overwhelming evidence against H0 we Fail to Reject H0 and HA isn’t proven.

• The philosophy of Hypothesis testing is to be cautious! Only strong evidence causes a dramatic conclusion to reject H0 (and confirm HA).

• Here’s the method:

Five-Step Hypothesis Testing

1. Define:

◦ population parameter (µ or p)

◦ hypotheses (HA and H0)

2. Determine the Rejection Region (bell curve)

3. Calculate the test statistic (Z or t)

4. Decide to Reject H0 or not

5. Interpret the decision

Special Notes about the Five Steps:

(1) If we Reject H0 in Step 4, the interpretation in Step 5 is:

There is sufficient evidence to show that . . . HA . . . is true
(except translate HA into English, no math symbols or technical terms!)

If we Fail to Reject H0 in Step 4, the interpretation in Step 5 is:

There is not sufficient evidence to show that . . .HA . . . is true.

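(2) The test statistics (for proportions problems and means problems, respectively) in Step 3 are:

\[ Z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0\,(1 - p_0)}{n}}} \quad \text{where } H_0\colon p = p_0 \qquad\qquad t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \quad \text{where } H_0\colon \mu = \mu_0 \]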


Special Notes about the Five Steps, Continued:

(3) Since we assume that H0 is true, we reject H0 for extreme values directed away from H0 (and toward HA).

Rejection regions for proportions problems are shown below for the three possible pairs of hypotheses:

Hypotheses:                      Rejection Region:

◦ HA: p < p0,  H0: p ≥ p0        Reject H0 if Z < −zα

◦ HA: p > p0,  H0: p ≤ p0        Reject H0 if Z > zα

◦ HA: p ≠ p0,  H0: p = p0        Reject H0 if Z < −zα/2 or Z > zα/2


Example 2 (Political agenda)
From past experience, White House advisors know that 39% is a threshold presidential approval rating: The rating must exceed 39% for the president to have sufficient public support to successfully press his political agenda with Congress.

A Gallup telephone poll of 1500 adult Americans shows that 618 of those Americans approve the president’s performance. Can the president successfully press his political agenda? Test using a 5% significance level.
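
Steps 2 through 4 of the five-step method can be sketched in Python for this example, assuming the hypotheses are HA: p > 0.39 and H0: p ≤ 0.39 (the rating "must exceed 39%"). The function name is hypothetical and `NormalDist` stands in for the Z table; Steps 1 and 5 still need to be written out in English.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_proportion_test(successes, n, p0, alpha):
    """Test H0: p <= p0 against HA: p > p0 with the large-sample Z statistic."""
    p_hat = successes / n
    # Step 3: the standard deviation uses p0 because H0 is assumed true.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Step 2: rejection region is Z > z_alpha (upper tail only).
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Step 4: decide.
    return z, z_alpha, z > z_alpha

z, z_alpha, reject = upper_tail_proportion_test(618, 1500, p0=0.39, alpha=0.05)
print(f"Z = {z:.3f}, critical value = {z_alpha:.3f}, reject H0: {reject}")
```

Calling the same function with `alpha=0.01` raises the critical value, so the same data can lead to a different decision under a more cautious test.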

Solution:

(continued)


Example 2, continued.


Example 3 (Political agenda: a more cautious test)
Use the same data as in Example 2 but apply a more cautious test by reducing

α = P (Type I error) from α = 5% to α = 1%.

(a) What is a Type I error in this situation?

(b) Carry out the Five Steps of Hypothesis Testing. (Label each step.)


Special Note: Why do Examples 2 and 3 Differ?


Example 4 (Water quality)

Most water-treatment facilities monitor the quality of their drinking water on an hourly basis. One variable monitored is pH, which measures the degree of alkalinity or acidity in the water. A pH below 7.0 is acidic, one above 7.0 is alkaline, and a pH of 7.0 is neutral.

One water-treatment plant has a target pH of 8.5 (most plants try to maintain a slightly alkaline level). The mean and standard deviation of one hour’s test results, based on 17 water containers measured during that hour, are x̄ = 8.42 and s = 0.16. Is there sufficient sample evidence to show that the mean pH level in the water differs from 8.5? Use α = 0.05.


Example 5 (Battery testing)

The reputations (and hence sales) of many businesses can be severely damaged by shipments of manufactured items that contain a large percentage of defectives. For example, a manufacturer of alkaline batteries may want to be reasonably certain that fewer than 5% of its batteries are defective.

Suppose that 300 batteries are selected from a very large shipment; each is tested and 10 defective batteries are found. Is there sufficient evidence for the manufacturer to conclude that fewer than 5% of batteries in the entire shipment are defective? Use 1% significance.


Example 6 (Soil contamination)

Environmental Science & Technology (Oct. 1993) reported on a study of contaminated soil in The Netherlands. Seventy-two 400-gram soil specimens were sampled, dried, and analyzed for the contaminant cyanide. The cyanide concentration in milligrams per kilogram (mg/kg) of each soil specimen was determined by use of an infrared spectrometer.

The sample resulted in a mean cyanide level of x̄ = 84 mg/kg and a standard deviation of s = 80 mg/kg. Use this information to test the hypothesis that the true mean cyanide level in soil in The Netherlands exceeds 100 mg/kg. Test at α = 0.05.


Example 7 (Car pollution standards)

Ford Motor Company wants to test a new type of engine to determine whether it meets new air-pollution standards. The mean emission of all engines of this type must be less than 20 parts per million (ppm) of carbon. Ten engines are manufactured for testing purposes and the emission level of each is determined. The data in ppm are listed below.

15.6 16.2 22.5 20.5 16.4 19.4 16.6 17.9 12.7 13.9

Do the data supply sufficient evidence to allow Ford to conclude that this type of engine meets the pollution standard? Assume that Ford is willing to risk a Type I error with probability α = 0.01.
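When raw data are given, as here, the sample mean and standard deviation have to be computed before any test statistic. The following is a minimal sketch of that arithmetic in Python, carried at full precision with no intermediate rounding. The variable names are assumptions made for this illustration; the sketch stops at the test statistic and does not replace the labeled Five Steps.

```python
from math import sqrt
from statistics import mean, stdev

# Emission levels (ppm) of the ten test engines, copied from the example.
emissions = [15.6, 16.2, 22.5, 20.5, 16.4, 19.4, 16.6, 17.9, 12.7, 13.9]
mu0 = 20  # the pollution standard stated in the example

n = len(emissions)
x_bar = mean(emissions)   # sample mean
s = stdev(emissions)      # sample standard deviation (divides by n - 1)

# One-sample t statistic, using unrounded x_bar and s.
t_stat = (x_bar - mu0) / (s / sqrt(n))

print(f"n = {n}, x-bar = {x_bar:.2f}, s = {s:.4f}, t = {t_stat:.4f}")
```

The resulting t statistic would then be compared with a critical value from the t table using n − 1 = 9 degrees of freedom.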


Example 8 (Color preference for washers)

Whirlpool, Inc. manufactures washing machines in three basic colors: white, yellow, and green. Suppose that 388 washers from a sample of 1000 washers sold were yellow.

(a) Would you conclude that more than 1/3 of customers prefer yellow? Use α = 0.05.

(b) Find a 90% confidence interval for the proportion of customers who prefer yellow.

(c) Are the inferences from (a) and (b) consistent? (Tip: Draw the confidence interval.)


P-Values and Hypothesis Testing

Recall Examples 2 and 3 (President’s political agenda). Depending on the significance level, the conclusion from the test

HA: p > 0.39

H0: p ≤ 0.39

differs:

• For significance level α = 0.05:

Reject H0 if Z > 1.645 =⇒ Reject H0 since Z = 1.75 > 1.645 =⇒ Political agenda seems feasible.

• For significance level α = 0.01:

Reject H0 if Z > 2.33 =⇒ Fail to Reject H0 since Z = 1.75 < 2.33 =⇒ Political agenda does not seem feasible.

So the evidence against the null hypothesis H0 is actually measured on a continuum by the Z score

Z = 1.75
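As a check on the value Z = 1.75, the sketch below recomputes the test statistic from the poll data in Example 2 (618 approvals out of 1500, with p0 = 0.39) and compares it with the critical value at each significance level. The variable names are assumptions made for this illustration, and the critical values come from Python's standard library, so the 1% value prints as 2.326 where the Z table rounds to 2.33.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 618, 1500, 0.39      # data and threshold from Example 2

p_hat = x / n                   # sample proportion
# The standard error uses p0 because the test assumes H0 is true.
z_stat = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

decisions = {}
for alpha in (0.05, 0.01):
    crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value z_alpha
    decisions[alpha] = "Reject H0" if z_stat > crit else "Fail to Reject H0"
    print(f"alpha = {alpha}: Z = {z_stat:.2f}, critical value = {crit:.3f} -> {decisions[alpha]}")
```

The same Z statistic leads to opposite decisions only because the cutoff moves with α, which is the motivation for measuring the evidence directly.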

This leads to an alternative way to test the hypothesis:

• Two Definitions of P-value

1. For CALCULATIONS:

P-value = probability that a sample result at least as unusual as ours occurs when the null hypothesis H0 is true.

2. RISK INTERPRETATION:

P-value = risk of error if the null hypothesis H0 is rejected on the basis of the sample (or stronger) evidence, assuming that H0 is true.

• The significance level α (alpha) = RISK TOLERANCE for making an error by rejecting the null hypothesis H0.

Therefore,

◦ P-value ≤ α =⇒ Sample Risk ≤ Risk Tolerance =⇒ Reject H0 (not too risky)

◦ P-value > α =⇒ Sample Risk > Risk Tolerance =⇒ Fail to Reject H0 (too risky)

(Tip: Shade the P-value away from H0 and toward HA!)

• Reconsider Example 2.

(1) The risk of error with sample data = P-value = 4.01% chance.

(2) The experimenter is willing to risk α = 0.05 = 5% chance of error.

Hence, P-value = 0.0401 < 0.05 = α =⇒ Reject H0 (not too risky).

• Reconsider Example 3.

(1) P-value = 4.01% chance of error.

(2) The experimenter is willing to risk α = 0.01 = 1% chance of error.

Hence, P-value = 0.0401 > 0.01 = α =⇒ Fail to Reject H0 (too risky).

• SUMMARY:

◦ The Five Steps of Hypothesis Testing in Stat 1030 provide an understanding of the logic of hypothesis testing.

◦ Printouts from computer packages, however, generally show only P-values, not rejection regions. So with computers:

∗ P-value ≤ α =⇒ Reject H0

∗ P-value > α =⇒ Fail to Reject H0
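The P-value comparisons for Examples 2 and 3 can be sketched as follows. This is an illustration only: the helper names `upper_tail_p_value` and `decide` are assumptions made for this example, the test is upper-tailed because HA is p > 0.39, and the decision rule is the "P-value ≤ α" rule stated with the risk interpretation.

```python
from statistics import NormalDist

def upper_tail_p_value(z: float) -> float:
    """Area under the standard normal curve to the right of z (for HA: p > p0)."""
    return 1 - NormalDist().cdf(z)

def decide(p_value: float, alpha: float) -> str:
    """Compare sample risk (P-value) with risk tolerance (alpha)."""
    return "Reject H0" if p_value <= alpha else "Fail to Reject H0"

p_value = upper_tail_p_value(1.75)   # Z = 1.75 from Examples 2 and 3
print(f"P-value = {p_value:.4f}")

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

A single P-value serves every significance level: the reader only has to compare it with whatever α applies.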

(end of Topic 8 notes)

[Figure: bell curve on the Z axis centered at 0. The area to the left of Z = 1.75 is labeled “Z Table Area = 0.9599”; the shaded area to the right of Z = 1.75 is labeled “P-value = 0.0401”.]

Stats for Business HOMEWORK 10 (20 exercises for Topic 8: Hypothesis Testing)

DIRECTIONS:

• Use and label the Five Steps of Hypothesis Testing from the Topic 8 Notes!

• For precision, calculate the test statistic (Z or t) in Step 3 without rounding. There are several ways to do this!

◦ TI graphing calculators let you reference ANS (the previous answer).

◦ Every calculator has a memory button. Google your calculator model’s directions!

◦ The HARDEST way is to simply “write down all the numbers” if you’d prefer not to use calculator memory.

• Directions for some exercises below allow use of either a calculator or MINITAB to calculate x̄ and s, whichever you prefer. (On exams and quizzes, though, you’ll need to use a calculator since MINITAB won’t be available.)

• Work exercises in the order shown below for max benefit!

• Exercise 6.39 (page 355)

• Exercise 6.40

• Exercise 6.59

• Exercise 6.60

• Exercise 8.45 (page 471) Add part (c):

(c) Are the answers to (a) and (b) consistent? Explain in terms of plausibility.

• Exercise 8.98 (page 488) (Test using 1% significance.)

• Exercise 8.99 Add part (b):

(b) Is the null hypothesis H0 from Exercise 8.98 plausible?

• Exercise 8.101 (Test using 10% significance.)

• Exercise 8.102


• Exercise 8.46 Ignore the textbook’s directions. Answer these questions instead:

(a) Test the claim that a majority of people prefer the taste of fresh-brewed coffee at 5% significance. (Use and label the Five Steps of Hypothesis Testing.)

(b) Suppose that the sample provides different numbers: Actually only 19 of the 60 subjects prefer instant coffee. Repeat the hypothesis test from part (a).

(c) Briefly explain why the conclusions differ between parts (a) and (b).

(d) Turn to the description of “P-Values” near the end of the Topic 8 Notes.

The P-value provides an alternative to the Five Steps for testing hypotheses. (Of course both methods result in the same decision to Reject H0 or not.)

The P-value is calculated from almost the same shading as in the Rejection Region except that we shade from the actual value of the Z statistic (Z = 1.75) instead of from numbers (called critical values) such as Z = 1.645 or Z = 2.33 that we look up in the Z table.

When we reconsider Example 3 in the Notes,

P-value = 0.0401 = (Area to the right of Z = 1.75)

(1) As an alternative to the Five Steps used in part (a), calculate the P-value.

(2) Provide the risk interpretation of the P-value.

(3) Make a decision by comparing the P-value to the significance level.

(e) Let’s try another P-value!

(1) As an alternative to the Five Steps used in part (b), calculate the P-value.

(2) Provide the risk interpretation of the P-value.

(3) Make a decision by comparing the P-value to the significance level.

• Exercise 7.5 (page 401) Ignore the book’s instructions about the P-value. Apply the Five Steps instead.

• Exercise 7.114 (page 447) Do part (a) only. Ignore the book’s instructions about the P-value. Apply the Five Steps instead, at 5% significance.


• Exercise 7.23 (page 413) Ignore textbook instructions for this exercise. Instead answer (a), (b), (c) below.

◦ There’s a MINITAB file on the Homework web page. So you can calculate x̄ and s either on your calculator or with MINITAB Basic Statistics > Display Descriptive Statistics. (On quizzes and exams you’ll need to use your calculator.)

(a) Suppose that Consumer’s Digest magazine is considering writing an article which claims that the Toyota Highlander Hybrid receives fewer than 30 mpg on average when set to 60 mph on cruise control. There’s a danger that Toyota will sue Consumer’s Digest for making a false claim, so test using the Five Steps at a very cautious 1% significance level.

(b) Calculate the P-value for this test. (Refer to the end of the Topic 8 Notes if you need a refresher.)

(c) Interpret the P-value from part (b). How does the P-value reinforce the decision from part (a)?

• Exercise 6.117 (page 390) Do part (c) only. (Use 5% significance.)

• Exercise 6.118 Do part (b) only. (Reason an answer from the CI. You do not need to do the Five Steps.)

• Exercise 6.119

• Exercise 7.32 (page 415) Ignore the textbook’s instructions.

◦ A MINITAB data file is available on the Homework page for optional use.

◦ Is there conclusive evidence that the mean IBI (water quality) differs from an environmental standard of 68 points? Use the Five Steps and 10% significance.

• Exercise 7.40 Do part (a) only.


• Exercise 6.122 (page 390) Ignore textbook instructions. Instead answer (a) and (b):

◦ We’ll test the hypotheses

HA: µ > 0

H0: µ ≤ 0

for µ = mean percentage increase in CEO salaries this year

at 5% significance using the P-value Method instead of the Five Steps.

(a) Calculate the P-value for the test.

(b) Make a decision based on the P-value. Also interpret the test.

• Exercise 6.77 (page 370) Ignore textbook instructions. Instead answer (a) and (b):

(a) Is there significant evidence at the 5% level that the mean reading differs from the true value 105? Use the Five Steps.

(b) Calculate and interpret the P-value for this test.

(end of assignment)


Stat 1030 Topic 8 Discussion Worksheet

1. (Telecommuting and sick days)

Recently many companies have been experimenting with telecommuting, allowing employees to work at home on their computers. Among other things, telecommuting is supposed to reduce the number of sick days taken.

Suppose that at one firm, it is known that over the past few years employees have taken an average of 5.4 sick days. This year, the firm introduces telecommuting. Management chooses a random sample of 80 employees to follow in detail, and at the end of the year these employees average 4.5 sick days with a standard deviation of 2.7 days.

Has telecommuting reduced the use of sick days at this firm? Test using 5% significance.


2. (Checking a juice machine)

A juice machine in a processing plant fills cans with pineapple juice. The cans are labeled “12 ounces.” The machine is initially adjusted so that the mean amount of juice dispensed agrees with the label. Over time, though, the machine can veer out of control so that the mean amount dispensed differs from the label. (In that case the machine must be readjusted.)

Does the machine need to be adjusted if a recent sample of 100 cans from the production line averages 11.98 ounces, with a standard deviation of 0.19 ounces? Use α = 0.05.


3. (Defective components)

Each component in a random sample of 300 electronic components manufactured by a certain company is tested, and 25 are found to be defective. The company’s quality control standard is that at most 5% of all components manufactured should be defective.

(a) Does the evidence indicate that this quality control standard is not being met? Use Five Steps and α = 0.10.

(b) Alternatively, calculate the test’s P-value = risk of error. Should H0 be rejected? Why or why not?


4. (New sports channel)

A cable company provided its subscribers with free access to a new sports channel for one month. It then solicited answers from 400 subscribers to whether they would pay an extra $10 per month for continued access to the channel. Of these, only 25 indicated a willingness to pay.

(a) Can the cable company conclude that more than 5% of all subscribers are willing to pay? Test at a 10% significance level.

(b) Alternatively, calculate the P-value. What’s the decision, based on the P-value?


5. (Octane rating for gasoline)

A particular brand of Shell gasoline is advertised to have a mean octane rating of at least 90%. Consumer Reports magazine is writing a news article to test this claim. Five measurements of the gasoline show these octane ratings, in percents:

90.1 88.8 89.5 91.0 89.1

(a) Should the magazine report that the advertisement is false? There is a danger that Shell will sue the magazine if more extensive testing at a later date shows the magazine’s conclusions to be false. Therefore perform a very cautious test, using a 1% significance level.

(b) How should Consumer Reports modify its hypothesis test if it wishes to maintain the same amount of caution but increase its ability to detect a false advertisement?

A. Increase α.

B. Decrease α.

C. Bribe certain Shell managers not to sue the magazine.

D. Test different hypotheses.

E. Increase the sample size above n = 5.


6. (Production at a chemical plant)

When it is operating properly, a chemical plant has a mean daily production of at least 740 tons. The output is measured on a random sample of 60 days. The sample had a mean of 715 tons/day and standard deviation of 24 tons/day.

(a) Do you believe that it is plausible that the plant is operating properly or are you convinced that the plant is not operating properly? Use α = 0.05.

(b) Calculate the P-value. Is H0 rejected? Why?

(c) Interpret the P-value in (b) in terms of risk. Explain why this makes sense, considering the value of the test statistic t.


(end of Topic 8 worksheet)


Stats for Business Notes

Part 4: Two-Population Comparisons & Statistical Prediction

Topic 9 Comparing Two Proportions

Topic 10 Correlation, Regression, and Stock Portfolios

The following textbook pages GREATLY ENHANCE the Part 4 Notes. Reading them together with the Notes is a smart investment! Business majors will see these topics again (in more depth) in several subsequent courses!

Topic Number   Topic                                           Important Textbook Reading

9              Comparing Two Proportions                       Pages 472–480

10             Correlation, Regression, and Stock Portfolios   Pages 77–85, 92–97, 100–116, 525–560


Stats for Business

Topic 9: Comparing Proportions from Two Populations

(Textbook Reading: Chapter 8: 472–480)

REALISTIC BUSINESS DECISIONS OFTEN INVOLVE MAKING CHOICES AMONG two or more alternatives. In Topic 9 we take the first step in that direction by comparing two options.

Example (Capital One credit cards)

Suppose that managers at Capital One Bank, headquartered in McLean, Virginia, are wrestling with a delinquency problem.

Capital One’s credit card products are generally quite profitable! Capital One earns a small percentage transaction fee from retail stores where the card is used, as well as interest from customers who carry credit card balances (debt) from month to month.

Still, delinquency (defined as a credit card holder who is three or more months late on credit card payments during a calendar year) is a problem for Capital One. The cost of delinquencies (unpaid credit card bills) each year counts as a charge-off against revenue in Capital One’s income statement and, ultimately, as a significant reduction in Capital One’s annual profits.

Suppose that Capital One managers devise two potential solutions to the delinquency problem, both to be tested as “pilot projects” with samples of 500 customers each over the next year:

• Credit Card A (informally known as the STICK among Capital One managers)

◦ Customers who are either one month late on payments or two months late on payments receive a strongly-worded letter which threatens a $200 “penalty and reactivation fee” if they progress to being three months late on payments.

◦ Customers who are three months late on payments have their account “frozen” (the credit card becomes inactive) and are assessed a $200 penalty. The customer’s credit card reactivates only if the penalty and balance are paid in full.

• Credit Card B (informally known as the CARROT among Capital One managers)

◦ Customers receive a “letter of thanks” after six months of consecutive on-time payments and the offer of a $100 bonus (cash back) for making payments on time for 12 consecutive months.

◦ Customers with no late payments for 12 consecutive months receive $100 cash back (credit to their accounts) as a “thank you.”


Data Scenario #1

• Suppose that x1 = 70 Credit Card A customers are delinquent

=⇒ p̂1 = 70/500 = 14%

• Suppose that x2 = 50 Credit Card B customers are delinquent

=⇒ p̂2 = 50/500 = 10%

• In the samples, Credit Card B outperforms Credit Card A by 4 percentage points (10% vs. 14%).

• But what about the population of all Capital One customers?

=⇒ (Use 90% confidence intervals to compare Card A to Card B.)

Statistical Analysis of Scenario #1

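1. Credit Card A

$$\hat{p}_1 \pm 1.645\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}} = 0.14 \pm 1.645\sqrt{\frac{(0.14)(0.86)}{500}} = 0.14 \pm 0.0255 = (0.1145,\ 0.1655) \longrightarrow (11.45\%,\ 16.55\%)$$

2. Credit Card B

$$\hat{p}_2 \pm 1.645\sqrt{\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = 0.10 \pm 1.645\sqrt{\frac{(0.10)(0.90)}{500}} = 0.10 \pm 0.0221 = (0.0779,\ 0.1221) \longrightarrow (7.79\%,\ 12.21\%)$$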
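The two intervals above can be reproduced with a short Python sketch. It is offered only as an arithmetic check: the function name `proportion_ci` is an assumption made for this illustration, and the z multiplier is computed by the standard library (about 1.6449) where the Notes use the table value 1.645.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x: int, n: int, confidence: float = 0.90) -> tuple[float, float]:
    """Large-sample confidence interval for one proportion: p-hat +/- z * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.645 for 90% confidence
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

ci_a = proportion_ci(70, 500)   # Credit Card A, Data Scenario #1
ci_b = proportion_ci(50, 500)   # Credit Card B, Data Scenario #1

print(f"Card A: ({ci_a[0]:.4f}, {ci_a[1]:.4f})")
print(f"Card B: ({ci_b[0]:.4f}, {ci_b[1]:.4f})")
```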


Stat 1030 Topic 9 Discussion Worksheet

DIRECTIONS:

◦ Important Note: You must complete the MINITAB Pre-Assignment over the weekend before Discussion to understand the worksheet! (See ICON Announcement for details.)

◦ Your TA will help you develop answers for the statistical analyses of Data Scenarios #2 and #3 below in your Week 14 Stats Computer Lab Discussion.

◦ Suppose in all scenarios that increased costs make Card B feasible only when its delinquency rate represents a greater than 10% improvement over Card A’s delinquency rate.

Data Scenario #2

• Suppose that x1 = 140 Credit Card A customers are delinquent.

• Suppose that x2 = 60 Credit Card B customers are delinquent.

Statistical Analysis of Scenario #2

1. By how much does Card B outperform Card A in the test market (sample) data?

• Calculate p̂1. Answer:

• Calculate p̂2. Answer:

• Final Answer:

2. Calculate the standard deviation σ(p̂1 − p̂2) for (p̂1 − p̂2):


3. With 90% confidence, by how much does Card B outperform Card A in the population of all Capital One credit card customers?

(Calculate a 90% confidence interval for (p1 − p2) in the space below.)

4. Interpret the confidence interval.

5. Use MINITAB to calculate the 90% confidence interval for (p1 − p2). (Record the exact answer from the MINITAB output.)

Answer:

6. Capital One will choose one of the cards to market nationwide in the U.S., beginning next year. Which card do you recommend, Card A or Card B? Support your recommendation with statistical evidence.


Data Scenario #3

• Suppose that x1 = 120 Credit Card A customers are delinquent.

• Suppose that x2 = 75 Credit Card B customers are delinquent.

Statistical Analysis of Scenario #3

1. By how much does Card B outperform Card A in the test market (sample) data?

Answer:

2. With 90% confidence, by how much does Card B outperform Card A in the population of all Capital One credit card customers?

3. Interpret the CI.

4. Use MINITAB to calculate the 90% confidence interval for (p1 − p2).

Answer:

5. Capital One will choose one of the cards to market nationwide in the U.S., beginning next year. Which card do you recommend, Card A or Card B? Support your recommendation.


Hypothesis Testing for Two Proportions with MINITAB

Reconsider Capital One’s two research questions concerning Credit Cards A and B:

Question 1

Which policy more effectively discourages delinquency, Credit Card A or Credit Card B?

Question 2

Even if Credit Card B more effectively discourages delinquency, is it worth the extra cost to Capital One compared to Credit Card A?

◦ We’ve used confidence intervals to answer these questions under Data Scenarios #1, #2, #3.

◦ But more direct answers are available from hypothesis tests.

◦ There are hand-calculation formulas for hypothesis tests, similar to the ones we used for CIs. But to save time and effort we’ll instead just use MINITAB in the Topic 9 Worksheet.

◦ For all tests define the parameters as

• p1 = proportion of Credit Card A customers who are delinquent

• p2 = proportion of Credit Card B customers who are delinquent

and suppose that increased costs make Card B feasible only when its delinquency rate represents a greater than 10% improvement over Card A’s delinquency rate.

◦ Answer the two research questions by testing whether Card B outperforms Card A, first in terms of delinquency only and then when considering Card B’s extra costs:

Question 1

HA: p1 > p2 or (p1 − p2) > 0

H0: p1 ≤ p2 or (p1 − p2) ≤ 0

Question 2

HA: p2 < p1 − 0.10 or (p1 − p2) > 0.10

H0: p2 ≥ p1 − 0.10 or (p1 − p2) ≤ 0.10

Use 5% significance for all tests.
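Both pairs of hypotheses have the same form, HA: (p1 − p2) > δ0, with δ0 = 0 for Question 1 and δ0 = 0.10 for Question 2. The sketch below shows how such an upper-tailed test could be computed by hand. It is an illustration under stated assumptions, not the course's or MINITAB's procedure: the function name `two_proportion_z` is hypothetical, the standard error is the unpooled one used for the confidence intervals (MINITAB's pooled option for δ0 = 0 gives a slightly different Z), and the counts in the usage lines are made-up numbers, not any of the Data Scenarios.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1: int, n1: int, x2: int, n2: int, delta0: float = 0.0):
    """Upper-tailed Z test of HA: (p1 - p2) > delta0 with an unpooled standard error.

    Returns (z statistic, P-value). Assumes large samples.
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = (p1_hat - p2_hat - delta0) / se
    p_value = 1 - NormalDist().cdf(z)    # area to the right of z
    return z, p_value

# Hypothetical counts for illustration only (NOT Data Scenario #1, #2, or #3).
z_q1, p_q1 = two_proportion_z(150, 500, 80, 500, delta0=0.0)    # Question 1 form
z_q2, p_q2 = two_proportion_z(150, 500, 80, 500, delta0=0.10)   # Question 2 form

print(f"Question 1 form: Z = {z_q1:.2f}, P-value = {p_q1:.4f}")
print(f"Question 2 form: Z = {z_q2:.2f}, P-value = {p_q2:.4f}")
```

Raising δ0 from 0 to 0.10 shifts the same sample difference closer to the null value, so the Question 2 test always has a smaller Z and a larger P-value than the Question 1 test on the same data.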


Data Scenario #1, Question 1

1. Write down hypotheses HA and H0:

2. Find the P-value for the test. Use the same MINITAB steps as for the CI except

Options > (Select Alternative greater than) > (Choose Test difference 0)

> (Click box “Use pooled estimate of p for test”)

> OK > OK

Test and CI for Two Proportions

Sample X N Sample p

1 70 500 0.140000

2 50 500 0.100000

Difference = p (1) - p (2)

Estimate for difference: 0.04

90% lower bound for difference: 0.0137110

Test for difference = 0 (vs > 0): Z = 1.95 P-Value = 0.026

Fisher’s exact test: P-Value = 0.032
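The Z statistic and P-value in the MINITAB output above can be checked by hand. The following is a minimal sketch, assuming the usual pooled two-proportion z-test (the option selected in step 2); the function name `two_prop_z_pooled` is hypothetical and is not part of MINITAB.

```python
import math

def two_prop_z_pooled(x1, n1, x2, n2):
    """One-sided test of H0: p1 - p2 <= 0 vs HA: p1 - p2 > 0 with a pooled estimate of p."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)              # pooled estimate of the common proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail area of the standard normal
    return z, p_value

# Data Scenario #1: 70 of 500 Card A customers and 50 of 500 Card B customers are delinquent.
z, p_value = two_prop_z_pooled(70, 500, 50, 500)
print(f"Z = {z:.2f}, P-value = {p_value:.3f}")
```

The P-value is then compared with the 5% significance level to make the decision in step 3.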

Answer:

3. Make a decision and support it numerically:

4. Interpret:

(continued)


Data Scenario #1, Question 2

1. Write down hypotheses HA and H0:

2. Now change the test difference in MINITAB to find the P -value for the test:

Options > (Select Alternative greater than) > (Choose Test difference 0.10)

> OK > OK
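For a hypothesized difference other than 0, the pooled estimate no longer applies (note that the pooled box is not clicked in the steps above). The sketch below assumes the unpooled standard error, a common choice in that case; whether MINITAB uses exactly this formula is an assumption here, and the function name `two_prop_z_unpooled` is hypothetical.

```python
import math

def two_prop_z_unpooled(x1, n1, x2, n2, diff0):
    """One-sided test of H0: p1 - p2 <= diff0 vs HA: p1 - p2 > diff0 (separate estimates of p1, p2)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = ((p1_hat - p2_hat) - diff0) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail area of the standard normal
    return z, p_value

# Data Scenario #1 with the 0.10 threshold from Question 2.
z, p_value = two_prop_z_unpooled(70, 500, 50, 500, diff0=0.10)
print(f"Z = {z:.2f}, P-value = {p_value:.3f}")
```

Because the observed difference (0.04) falls below the hypothesized 0.10, Z is negative and the upper-tail P-value is close to 1.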

Answer:

3. Make a decision and support it numerically:

4. Interpret:

5. Why is the P -value so large?

(end of Topic 9 worksheet)


Stats for Business

Topic 10: Correlation, Regression, and Stock Portfolios

Textbook Reading: Chapter 2 (77–85, 92–97, 100–116), Chapter 10 (525–560)

IN THIS LAST PART OF THE COURSE WE EXPAND THE ORIGINAL SIX STEPS of Inference to Step 7: Statistical Prediction (also known as Regression).

This involves a “Double Step 4” (measuring two variables) which leads directly to Step 7:

Step 7

Use pairs of measurements (x1, y1), (x2, y2), (x3, y3), . . . (xn, yn) from a sample

of n objects to estimate a regression (forecasting) equation to predict

future values of y from future values of x.

A Closeup View of Step 7

First, measure a sample −→

x1   y1
x2   y2
x3   y3
...  ...
xn   yn

Then, predict the future −→ $\hat{y} = \text{function}(x)$

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

the linear (straight-line) function is relatively easy and effective!

§1 Correlation

WE BEGIN OUR STUDY OF TOPIC 10 WITH AN INTUITIVE CONCEPT CALLED correlation.

Correlation is a number which measures how two variables x and y CO-RELATE:

Correlation is a numerical measurement of the strength and direction of the linear relationship between two variables x and y in a “scatterplot.”

The six plots below taken from the textbook show various degrees of (negative and positive) correlation.

Questions:

• In five of the six scatterplots you should be able to approximately place a prediction line through the middle of the points which best describes how y is related to x. (Draw the lines into your Notebook.)

• But one of the six scatterplots has NO best prediction line. (Identify the plot.)

[Figure: six scatterplots from the textbook, with correlations r = 0, r = 0.5, r = 0.9, r = −0.3, r = −0.7, and r = −0.99]


§1.1 Notes About Correlation

(1) One of the really interesting and useful characteristics of the correlation r is that it is a unitless measurement:

The correlation is unchanged if the units of measurement change!

Example 1 (Fuel efficiency)
In the U.S. we measure fuel efficiency in miles per gallon (mpg). The data below compare

• x = City Driving mpg

• y = Highway Driving mpg

for a sample of 8 cars:

Car   City mpg (x)   Highway mpg (y)
1     18.6           28.7
2     19.2           29.3
3     18.8           29.2
4     19.5           29.0
5     19.1           28.5
6     20.2           33.0
7     19.1           31.0
8     19.5           31.2

[Scatterplot: Highway mpg vs City mpg]

In Europe fuel efficiency might be measured instead in kilometers per liter (kpl), in which case the numbers of course have a completely different meaning!

$x_1 = \text{Car 1 City Efficiency} = 18.6\ \text{mpg} \cdot \dfrac{1.61\ \text{kilometers}}{1\ \text{mile}} \cdot \dfrac{1\ \text{gallon}}{3.78\ \text{liters}} = 7.92\ \text{km/liter}$

Car   City kpl (x)   Highway kpl (y)
1     7.92           12.22
2     8.18           12.48
3     8.01           12.44
4     8.31           12.35
5     8.14           12.14
6     8.60           14.06
7     8.14           13.20
8     8.31           13.29

[Scatterplot: Highway vs City km per liter]

The numbers on the x and y axes differ on the two graphs but the data shape and the correlation are the same!

The correlation is r = 0.772 (moderately strong positive) for both data sets and both graphs!
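The unit-free property can be checked numerically. The sketch below is an illustration only: it assumes the standard sample correlation formula, converts the mpg data with the factor 1.61/3.78 used above, and the helper name `correlation` is hypothetical.

```python
import math

def correlation(xs, ys):
    """Sample correlation r between paired measurements xs and ys."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

city_mpg = [18.6, 19.2, 18.8, 19.5, 19.1, 20.2, 19.1, 19.5]
highway_mpg = [28.7, 29.3, 29.2, 29.0, 28.5, 33.0, 31.0, 31.2]

MPG_TO_KPL = 1.61 / 3.78  # kilometers per mile divided by liters per gallon
city_kpl = [x * MPG_TO_KPL for x in city_mpg]
highway_kpl = [y * MPG_TO_KPL for y in highway_mpg]

r_mpg = correlation(city_mpg, highway_mpg)
r_kpl = correlation(city_kpl, highway_kpl)
print(f"r (mpg) = {r_mpg:.4f}, r (kpl) = {r_kpl:.4f}")
```

Multiplying every measurement by the same positive constant rescales the numerator and the denominator of r equally, so the two values agree (both about 0.77).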


(2) Did you notice that correlation is represented in Statistics by the letter r instead of the letter c?

(This is a historical accident that is explained a bit later in the Topic 10 Notes.)

(Fill-In-The-Blanks Below!)

1. The correlation r is a number which can range anywhere from −1 to 1:

−1 ≤ r ≤ 1

2. A positive linear relationship on a scatterplot produces a positive correlation r.

3. The correlation r = ______ represents a perfect positive line (extreme case).

4. A negative linear relationship on a scatterplot produces a negative correlation r.

5. The correlation r = ______ represents a perfect negative line (extreme case).

6. The correlation r = ______ represents no linear relationship.


§2 Regression

IN CONTRAST TO CORRELATION, THE REGRESSION (PREDICTION) LINE MUST change depending on the units of measurement!

Example 1, continued.

(1) Suppose that both x (City) and y (Highway) fuel efficiencies are measured in mpg.

• Correlation r = 0.772

• The forecast equation is $\hat{y} = -17.74 + 2.48x$

Car   City mpg (x)   Highway mpg (y)
1     18.6           28.7
2     19.2           29.3
3     18.8           29.2
4     19.5           29.0
5     19.1           28.5
6     20.2           33.0
7     19.1           31.0
8     19.5           31.2

(2) Now suppose that we change the way we measure y (Highway) from mpg to kpl.

• Correlation r = 0.772

• The forecast equation changes to $\hat{y} = -7.62 + 1.06x$

Car   City mpg (x)   Highway kpl (y)
1     18.6           12.22
2     19.2           12.48
3     18.8           12.44
4     19.5           12.35
5     19.1           12.14
6     20.2           14.06
7     19.1           13.20
8     19.5           13.29
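The two forecast equations can be reproduced from the two tables. This is a minimal sketch assuming an ordinary least-squares fit; the helper name `least_squares` is hypothetical, and the kpl column is typed in exactly as rounded in the table above.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line for predicting ys from xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

city_mpg = [18.6, 19.2, 18.8, 19.5, 19.1, 20.2, 19.1, 19.5]
highway_mpg = [28.7, 29.3, 29.2, 29.0, 28.5, 33.0, 31.0, 31.2]
highway_kpl = [12.22, 12.48, 12.44, 12.35, 12.14, 14.06, 13.20, 13.29]

b0_mpg, b1_mpg = least_squares(city_mpg, highway_mpg)
b0_kpl, b1_kpl = least_squares(city_mpg, highway_kpl)
print(f"Highway mpg = {b0_mpg:.2f} + {b1_mpg:.2f} x")
print(f"Highway kpl = {b0_kpl:.2f} + {b1_kpl:.2f} x")
```

Only the units of y changed between the two fits, yet both the intercept and the slope change, while the correlation does not.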


Questions for Example 1

(a) The forecast equation is written as

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

• so the y-intercept is $\hat{\beta}_0$

• and the slope is $\hat{\beta}_1$

Interpret the slope $\hat{\beta}_1 = 2.48$ in the first regression equation (Regression #1):

Answer :

(b) Interpret the slope $\hat{\beta}_1 = 1.06$ in the second regression equation (Regression #2):

Answer :


(c) Why are there two different slopes for the same sample of eight cars? We can solve this mystery by translating the first regression slope into the second one:

Answer :

(d) Predict highway mpg for a car that gets x = 15 mpg in city driving.

Answer :

(e) Predict highway kpl for a car that gets x = 15 mpg in city driving.

Answer :


§2.1 Regression Notation

Do the letters used in Regression seem strange? . . . What do they mean, anyway?

In an Algebra class you may have seen a straight-line equation written with different letters:

y = mx+ b (1)

• b = y-intercept

• m = slope

Or possibly like this:

y = a+ bx (2)

• a = y-intercept

• b = slope

As we often do in Stats, we instead use Greek letters β0 and β1 to represent the theoretical or true regression if we were to measure the entire population:

µy = β0 + β1x (3)

(Our forecasts for y based on x are correct on average!)

Realistic business forecasts often use multiple x variables to predict y. For instance with two x variables x1 and x2 there are two slopes, one for each variable:

µy = β0 + β1x1 + β2x2 (4)

Realistically, we almost never measure an entire population! So we’ll be satisfied with an approximate forecast equation calculated from the sample:

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ (5)

• The sample y-intercept $\hat{\beta}_0$ estimates the population y-intercept β0.

• The sample slope $\hat{\beta}_1$ estimates the population slope β1.


§2.2 Calculating Regression Equations

Five ingredients can be mixed together to calculate the regression equation:

• r (correlation)

• $\bar{x}$ and $s_x$ (from the x measurements)

• $\bar{y}$ and $s_y$ (from the y measurements)

1. Finding the slope

$\hat{\beta}_1 = r \cdot \dfrac{s_y}{s_x}$

◦ Notice that the slope $\hat{\beta}_1$ and correlation r have the same sign (are either both positive or both negative).

◦ The standard deviations $s_x$ and $s_y$ provide the units needed for regression (e.g. mpg or kpl in Example 1).

2. Finding the intercept

The best prediction equation naturally goes through the “middle of the points” $(\bar{x}, \bar{y})$ on the scatterplot.

In Example 1:

• $\bar{x} = 19.3$ mpg in city driving

• $\bar{y} = 30.0$ mpg in highway driving

• =⇒ $(\bar{x}, \bar{y}) = (19.3, 30.0)$

[Fitted line plot: Highway mpg vs City mpg. Highway mpg = −17.74 + 2.479 City mpg; S = 1.08459, R-Sq = 59.7%, R-Sq(adj) = 53.0%]

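So substitute $\bar{y}$ for $\hat{y}$ and $\bar{x}$ for x in the equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ and solve for the intercept:

$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} \;\Longrightarrow\; \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$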
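The five-ingredient recipe can be tried on the Example 1 fuel-efficiency data (both variables in mpg). This is a sketch only; it uses Python's standard `statistics` module for the means and sample standard deviations, and the variable names are illustrative.

```python
import statistics

city_mpg = [18.6, 19.2, 18.8, 19.5, 19.1, 20.2, 19.1, 19.5]
highway_mpg = [28.7, 29.3, 29.2, 29.0, 28.5, 33.0, 31.0, 31.2]
n = len(city_mpg)

# Four of the five ingredients: means and sample standard deviations.
x_bar, y_bar = statistics.mean(city_mpg), statistics.mean(highway_mpg)
s_x, s_y = statistics.stdev(city_mpg), statistics.stdev(highway_mpg)

# Fifth ingredient: the sample correlation r.
r = sum((x - x_bar) * (y - y_bar)
        for x, y in zip(city_mpg, highway_mpg)) / ((n - 1) * s_x * s_y)

slope = r * s_y / s_x               # slope = r * s_y / s_x
intercept = y_bar - slope * x_bar   # the line passes through (x_bar, y_bar)
print(f"y-hat = {intercept:.2f} + {slope:.2f} x")
```

Carrying full precision, as here, avoids the rounding drift that appears when the rounded means 19.3 and 30.0 are used in the intercept formula.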


Example 2 (Sales determined by staffing)
The Best Buy store in the Coralville Mall collected data over 10 weeks in the home consumer electronics sections of the store, relating the number of sales people working that week to weekly sales (in thousands of dollars).

The data and scatterplot are shown below. It’s easy to see that the variables are highly positively correlated. In fact, r = 0.9654.

Week   Number of Sales People   Sales (thousands $)
1      2                        10
2      3                        11
3      7                        13
4      9                        14
5      10                       18
6      10                       20
7      12                       20
8      15                       22
9      16                       22
10     20                       26

[Scatterplot of Weekly Retail Sales vs Number of Sales People]

Questions for Example 2

(a) Describe the y variable in English.

(b) Describe the x variable in English.

(c) Find $\bar{x}$ and $s_x$.

(d) Find $\bar{y}$ and $s_y$.

(continued)

Carry at least 4 decimal places in all calculations! (Exact answers are better!)

(e) Find $\hat{\beta}_1$.

(f) Find $\hat{\beta}_0$.

(g) Use the previous two answers to write down the prediction (regression) equation.

(h) Forecast weekly sales when Best Buy staffs 10 sales people.

(i) Forecast weekly sales when Best Buy staffs 15 sales people.


Example 3 (Used car prices)
Used car prices (in dollars) advertised online, together with age (in years) for a sample of Japanese import cars are shown below. The correlation is −0.972.

Age   Price
1     13990
1     13495
3     12999
4     9500
4     10495
5     8995
5     9495
6     6999
7     6950
7     7850
8     6999
8     5995
10    4950
10    4495
13    2850

(a) Find the regression equation to predict Price from Age. Show your work and Box the final answer.

(b) Predict the price of a five-year-old used car.

Carry at least 6 decimal places in calculations. (Exact answers are better!)

(c) Find the regression equation to predict Age from Price. Show your work and Box the final answer.

(d) Predict the age of a used car advertised for $10,000.

Historical note:

According to Wikipedia, Sir Francis Galton introduced the terms correlation and regression around 1888.

He used the letter r to stand for “regression” in the special case that the x and y samples are “standardized” (have mean 0 and standard deviation 1):

• $\bar{x} = 0$, $s_x = 1$

• $\bar{y} = 0$, $s_y = 1$

In that special case the regression slope $\hat{\beta}_1$ and the correlation r are the same!

$\hat{\beta}_1 = r \cdot \dfrac{s_y}{s_x} = r \cdot \dfrac{1}{1} = r$

By historical accident we still use the letter r to represent correlation today!


Stats for Business HOMEWORK 11 (Topic 10: Correlation, Regression and Stock Portfolios)

DIRECTIONS:

• These exercises provide a thorough and intuitive introduction to correlation and regression for business applications such as stock portfolios.

• Some exercises call for MINITAB to be used for computations. Others direct you to use calculators only. Of course, on quizzes and exams you’ll need to rely on your calculators.

Part 1: Scatterplots, Correlation, and Regression

• Exercise 2.42 (page 98) Substitute the following directions for textbook directions:

1. It’s generally easier to use MINITAB to make scatterplots than plotting by hand. There are two ways to access MINITAB:

◦ Use a computing lab on campus where MINITAB is available, for instance 41 Schaeffer Hall, the computer lab C220 in the Tippie College, and the Library Learning Commons (LC).

◦ Use a web browser to access MINITAB remotely using the university’s Virtual Desktop. If you’re using Virtual Desktop for the first time from home on a PC or Mac computer you’ll need to install Citrix software first. Click Virtual Desktop Directions on the Stats Website.

2. Enter the data to the MINITAB worksheet: First enter column titles Price and Deforestation, then enter the column numbers.

3. Make a scatterplot with these steps:

Graph > Scatterplot > Simple > OK > (Enter Deforestation as the y variable, Price as the x variable) > OK

4. Let’s use MINITAB to get the correlation, too:

Stat > Basic Statistics > Correlation > (Select both variables) > OK

Answer these questions:

(a) Provide an English description of the x variable. (Include units of measurement.)

(b) Provide an English description of the y variable. (Include units of measurement.)

(c) Provide the correlation r between x and y.

(continued)


(d) Calculate the other four “ingredients” for the regression equation: $\bar{x}$, $s_x$, $\bar{y}$, $s_y$.

(e) Calculate $\hat{\beta}_1$. (Use the formula from the Topic 10 Notes.)

(f) Calculate $\hat{\beta}_0$. (Use the formula from the Topic 10 Notes.)

(g) Provide the sample regression equation.

(h) Interpret the slope $\hat{\beta}_1$.

(i) Interpret the intercept $\hat{\beta}_0$.

(j) Consider the laws of supply and demand, either from a Microeconomics course that you’re taking, or just from your general knowledge.

Briefly provide an economic reason why the two variables are either positively or negatively correlated.

(k) Suppose that in 2013 the price per pound of coffee is 88¢. Predict deforestation in 2013.

• Exercises 2.35, 2.36, 2.69, 2.70

◦ These data compare the carbs and alcohol in 86 brands of beer. Probably that’s too much data to hand-type into a MINITAB worksheet. So open the Beer data file from the Homework page of the main Stats website.

Depending on whether you’re at a campus computing center or using Virtual Desktop, the method to open the data file differs:

▸ From on campus, navigate to the Homework page of the Stats website and click the Beer file to open.

▸ From Virtual Desktop:

Since the data file resides on the university’s computer (not yours), you’ll need to use different steps on Virtual Desktop. (Otherwise your computer will show gibberish.)

From Virtual Desktop, use:

Web Browsers > Firefox > Go to Stats Website

> Go to Homework page, click the Beer file to open.

◦ An outlier is a data point which is markedly different from the other data points. (To “delete” an outlier just right-click and delete the entry from the spreadsheet.)

◦ Use MINITAB to plot the data as well as to calculate correlations.

(continued)


◦ To save typing (86×2) numbers into your calculator, let’s also use MINITAB to calculate $\bar{x}$, $s_x$, $\bar{y}$, $s_y$ for these data:

Stat > Basic Statistics > Display Descriptive Statistics

(Be sure to use all decimal places from the MINITAB output for accuracy.)

◦ Add parts (c) and (d) to Exercise 2.70:

(c) Describe how carbs change as alcohol content changes in alcoholic beers.

(d) Predict the carbs in a beer which is 6% alcohol.

• Additional Exercise
The Ventura County Star newspaper reported the following election results in California in 1998 for Proposition 227. The proposition declared English to be the official language of California and sought to outlaw bilingual education in public schools.

City            Percent in City Who Voted Yes   Percent of City’s Students With Limited English Speaking Skills
Thousand Oaks   71.5                            7.4
Simi Valley     71.0                            6.0
Oxnard          47.9                            40.4
Ventura         61.7                            11.9
Ojai            57.4                            6.5
Fillmore        48.4                            36.1

The correlation between the variables is r = −0.866.

Questions:

(a) Provide an English description of the x variable.

(b) Provide an English description of the y variable.

(c) Calculate the other four “ingredients” for the regression equation: $\bar{x}$, $s_x$, $\bar{y}$, $s_y$.

(d) Calculate $\hat{\beta}_1$. (Use the formula from the Topic 10 Notes, show your work.)

(e) Calculate $\hat{\beta}_0$. (Use the formula from the Topic 10 Notes, show your work.)

(f) Provide the sample regression equation.

(g) Interpret the slope $\hat{\beta}_1$.

(h) Interpret the intercept $\hat{\beta}_0$.

(i) In 1998, 7.8% of students in the California city of Camarillo had limited English-speaking skills. Predict the voting outcome in Camarillo for Proposition 227.

(continued)


Part 2: Regression and Stock Portfolios

The data below show the percentage monthly return (percentage increase of the stock’s ending monthly price compared to the stock’s beginning monthly price) for two companies (Infosys Technology and Pearson Education) and the overall market (S&P 500) for five selected Januaries:

Month          Infosys Tech   Pearson Education   S&P 500
January 2004   −7.97          1.49                1.70
January 2005   −4.97          −3.58               −2.59
January 2006   −6.09          8.83                2.48
January 2007   5.93           4.25                1.39
January 2008   −9.57          −4.62               −6.51

Set

• x = S&P 500 percentage monthly return

• y1 = Infosys Tech stock percentage monthly return

• y2 = Pearson Education stock percentage monthly return

Financial analysts give the regression slope β1 which relates the return of a specific company’s stock to the overall market a special name and meaning:

β1 is known as the Beta Measure of Financial Risk

and

• β1 < 1 shows the company’s stock to be relatively insensitive to market changes and so likely to be a (relatively) low-risk/low-return investment.

• β1 > 1 shows the company’s stock to be relatively sensitive to market changes and so likely to be a (relatively) high-risk/high-return investment.

Questions

(a) Use the formulas in the Topic 10 Notes to find the regression equation of y1 on x.

(b) Interpret the slope of the regression in (a).

(c) Use the formulas in the Topic 10 Notes to find the regression equation of y2 on x.

(d) Interpret the slope of the regression in (c).

(e) Which stock, if either, is considered a low-risk investment compared to the overall stock market, and why? Which is considered a high-risk investment?

Notes:

◦ Business majors take the course Introductory Financial Management (FIN 3000). In Finance you’ll encounter a theory called the Capital Asset Pricing Model (CAPM) which uses regression to explain and predict stock prices. These exercises provide a brief introduction to that financial theory.

◦ Much larger data sets than these are typically needed to provide realistic measures of stock risk (for example, several years of monthly data, at least).

(end of assignment)

§2.3 Measuring the Impact of Regression: $s_y$, $s_e$ and $R^2$

Example 2, continued.
How effective is the variable

x = number of sales people

in predicting

y = weekly sales

for Best Buy?

• Of course, predictions $\hat{y}$ from a regression line aren’t perfect:

◦ The only perfect prediction is a point exactly on the line.

◦ Prediction errors are measured vertically (on the y-axis for Weekly Sales), as shown in the graph below:

prediction error $e = y\ \text{(actual)} - \hat{y}\ \text{(prediction)}$


Week   Number of Sales People   Sales (thousands $)
1      2                        10
2      3                        11
3      7                        13
4      9                        14
5      10                       18
6      10                       20
7      12                       20
8      15                       22
9      16                       22
10     20                       26

[Fitted line plot: Weekly Retail Sales vs Number of Sales People. Weekly Retail Sales = 8.0975 + 0.9137 Number of Sales People; S = 1.47701, R-Sq = 93.2%, R-Sq(adj) = 92.3%]

• How accurate is the regression for Week 1?

◦ $x = 2 \;\Longrightarrow\; \hat{y} = 8.1006 + 0.9134x = 8.1006 + (0.9134)(2) = 8.1006 + 1.8268 = 9.9274$

◦ Error $e = y - \hat{y} = 10 - 9.9274 = 0.0726$

◦ Squared Error $= (0.0726)^2 = 0.0053$

• We can calculate the error for all 10 weeks:

Week   x    y    ŷ         e         e²
1      2    10   9.9274    0.0726    0.0053
2      3    11   10.8408   0.1592    0.0253
3      7    13   14.4944   −1.4944   2.2332
4      9    14   16.3212   −2.3212   5.3880
5      10   18   17.2346   0.7654    0.5858
6      10   20   17.2346   2.7654    7.6474
7      12   20   19.0614   0.9386    0.8810
8      15   22   21.8016   0.1984    0.0394
9      16   22   22.7150   −0.7150   0.5112
10     20   26   26.3686   −0.3686   0.1359
Total                      0.0004    17.4525

The positive and negative errors cancel but the squared errors don’t!

• Now calculate the regression standard deviation $s_e$:

◦ The regression variance is

$s_e^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / (n-2) = 17.4525/(10-2) = 17.4525/8 = 2.18156$

◦ Then the regression standard deviation is $s_e = \sqrt{s_e^2} = \sqrt{2.18156} = 1.4770$
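The error table and the regression standard deviation can be rebuilt in a few lines. The sketch below is illustrative only; it uses the hand-calculated equation 8.1006 + 0.9134x from the notes, and the variable names are hypothetical.

```python
import math

people = [2, 3, 7, 9, 10, 10, 12, 15, 16, 20]      # x = number of sales people
sales = [10, 11, 13, 14, 18, 20, 20, 22, 22, 26]    # y = weekly sales (thousands $)
n = len(people)

B0, B1 = 8.1006, 0.9134  # intercept and slope as rounded in the notes

predictions = [B0 + B1 * x for x in people]
errors = [y - y_hat for y, y_hat in zip(sales, predictions)]

sum_errors = sum(errors)                 # positive and negative errors nearly cancel
sse = sum(e ** 2 for e in errors)        # sum of squared errors
s_e = math.sqrt(sse / (n - 2))           # divide by n - 2 for the two estimated coefficients

print(f"sum of errors = {sum_errors:.4f}")
print(f"sum of squared errors = {sse:.4f}")
print(f"s_e = {s_e:.4f}")
```

The sum of the errors is not exactly zero only because the intercept and slope were rounded to four decimal places.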


A First Way to Measure the Impact of Regression

• The standard deviation $s_y$ of

y = Weekly Sales

is $s_y = 5.3375$

The standard deviation $s_y = 5.3375$ measures the error in predicting Sales without regression.

• The error in predicting Sales with regression is $s_e = 1.4770$

• Regression is effective since standard deviation drops from $s_y = 5.3375$ to $s_e = 1.4770$.

A Second Way to Measure the Impact of Regression

• From Topic 2, the sample variance formula for a single variable divides by (n − 1):

$s_y^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 / (n-1) = \sum_{i=1}^{n} (y_i - \bar{y})^2 / 9$

(The estimation of a population mean µ “costs” a single sample measurement.)

• But in regression we need to estimate both β0 and β1 which “costs” two sample measurements:

$s_e^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / (n-2) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / 8$

• So to compare the two variances fairly let’s multiply $s_e^2$ by 8/9:

$s_e^2 \times \dfrac{8}{9} = 2.18156 \times \dfrac{8}{9} = 1.939$

Then regression has reduced average squared prediction error from $s_y^2 = 28.49$ to 1.939 for a relative improvement of

$\dfrac{28.49 - 1.94}{28.49} = \dfrac{26.55}{28.49} = 0.932 = 93.2\% = R^2$

This percentage improvement is called $R^2$ since it is, indeed, the squared correlation!

correlation r = 0.9654 =⇒ $r^2 = (0.9654)^2 = 0.932 = 93.2\% = R^2$
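The claim that the relative improvement equals the squared correlation can be verified on the Best Buy data. This is a sketch under the assumption of an ordinary least-squares fit computed at full precision; the variable names are illustrative.

```python
import math

people = [2, 3, 7, 9, 10, 10, 12, 15, 16, 20]
sales = [10, 11, 13, 14, 18, 20, 20, 22, 22, 26]
n = len(people)

x_bar, y_bar = sum(people) / n, sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in people)
syy = sum((y - y_bar) ** 2 for y in sales)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(people, sales))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(people, sales))

s_y2 = syy / (n - 1)   # variance of y without regression
s_e2 = sse / (n - 2)   # regression variance

# Relative improvement after rescaling s_e^2 by (n - 2)/(n - 1), which is 8/9 here.
r_squared = (s_y2 - s_e2 * (n - 2) / (n - 1)) / s_y2
r = sxy / math.sqrt(sxx * syy)

print(f"R^2 from variances = {r_squared:.4f}")
print(f"r^2 from correlation = {r ** 2:.4f}")
```

Algebraically the rescaled comparison is 1 − SSE/SST, where SSE is the sum of squared errors and SST is the sum of squared deviations of y from its mean, which is why the two numbers agree.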


Example 4 (Air flights from Boston)
The following data show round-trip air fares from Boston to several different cities in 2010. The correlation is 0.822.

City       Cost   Miles
NY         179    206
Chicago    179    963
San Fran   397    3095
Denver     274    1940
Dallas     404    1748

[Scatterplot: Cost vs Miles]

(For accuracy carry at least 4 decimal places in the calculations on this page.)

1. Calculate the regression equation.

(a) Find $\bar{x}$, $s_x$, $\bar{y}$, $s_y$

(b) Find the regression equation.

2. Use the regression.

(a) Interpret the regression slope.

(b) Kansas City is 1417 miles from Boston. Predict the cost of a round-trip ticket from Boston to Kansas City. Box the answer.

x = miles from Boston y = flight cost in dollars

3. Measure regression impact.

(a) Use the table below to calculate $s_e$. Then describe regression’s impact by comparing $s_e$ to $s_y$. (Carry at least 2 decimal places accuracy in the table.)

City       Cost (y)   Miles (x)   ŷ   e   e²
NY         179        206
Chicago    179        963
San Fran   397        3095
Denver     274        1940
Dallas     404        1748

(b) Calculate and interpret R2.


§2.4 Confidence and Prediction Intervals

Knowledge of the overall regression standard deviation $s_e$ opens the door to several powerful applications, all backed with 95% certainty.

Confidence Intervals for Slope

The slope $\hat{\beta}_1$ in the sample regression equation is a good but imperfect estimate of the true slope β1 for the entire population. The graph below illustrates the difference between a sample slope and a population slope:

Just as we did for proportions problems:

$\hat{p} \pm m$ (margin of error)

and for means problems:

$\bar{x} \pm m$

we can do the same for regression problems:

$\hat{\beta}_1 \pm m = \hat{\beta}_1 \pm t^*_{n-2} \cdot SE(\hat{\beta}_1)$

where

• Degrees of freedom for the t table are df = n − 2

• $SE(\hat{\beta}_1)$ stands for standard error (estimated standard deviation) for $\hat{\beta}_1$ and is

$SE(\hat{\beta}_1) = \dfrac{s_e}{s_x \sqrt{n-1}}$


Example 2, continued.
Best Buy’s business plan calls for each additional salesperson to generate at least $1500 in extra weekly sales, on average. Is the plan realistic (i.e., plausible), with 95% confidence?

(a) Find and interpret a 95% CI for the slope β1.

Answer :

$SE(\hat{\beta}_1) = \dfrac{s_e}{s_x \sqrt{n-1}} = \dfrac{1.477}{5.64 \sqrt{10-1}} = \dfrac{1.477}{(5.64)(3)} = 0.0873$

$\hat{\beta}_1 \pm t^*_{10-2} \cdot SE(\hat{\beta}_1) = 0.9134 \pm (2.306)(0.0873) = 0.9134 \pm 0.2013 = (0.7121,\ 1.1147)$

Interpret:

We are 95% confident that each extra salesperson generates between $712 and $1115 in extra weekly sales, on average.

(b) Is the business plan realistic?

Answer :

No, since $1500 lies above the interval ($712, $1115) and so isn’t a plausible value for the slope.
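The interval arithmetic in part (a) can be scripted. In this sketch the critical value t* = 2.306 (df = 8) is typed in from the t table, as in the notes, because Python's standard library has no t-distribution quantile function; the variable names are illustrative.

```python
import math

# Ingredients from Example 2 (sales measured in thousands of dollars).
s_e = 1.477      # regression standard deviation
s_x = 5.64       # standard deviation of x = number of sales people
n = 10
slope = 0.9134   # sample slope
T_STAR = 2.306   # t-table value for 95% confidence with df = n - 2 = 8

se_slope = s_e / (s_x * math.sqrt(n - 1))
margin = T_STAR * se_slope
lower, upper = slope - margin, slope + margin
print(f"SE(slope) = {se_slope:.4f}")
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")

plan = 1.5       # the business plan: $1500 = 1.5 thousand dollars per extra salesperson
print("plan is plausible" if lower <= plan <= upper else "plan is not plausible")
```

Since sales are recorded in thousands of dollars, the $1500 target is entered as 1.5 before comparing it with the interval.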


Example 4, continued.
Travel Sure magazine claims that each extra mile distance from Boston costs $0.15, on average.

(a) Find and interpret a 90% CI for the slope β1.

(b) Is the claim plausible, with 90% confidence? Explain.


Prediction and Confidence Intervals for Specific Predictions

Example 2, continued.
Suppose that Best Buy has two very different questions, both concerning the use of 12 salespersons:

(1) To predict long-term profit, what are average weekly sales with 12 salespersons?

(2) To ensure enough cash flow to pay next week’s salaries, what will sales be next week with 12 salespersons?

Theoretically, there’s a separate bell curve (for y = Weekly Sales) for each different value of x = Number of Salespersons, as this graph shows:

Best Buy needs a prediction interval for an individual week (next week) and a confidence interval for mean sales over many weeks.

• A prediction interval for an individual response at a particular value $x_0$ is

$\hat{y} \pm t^*_{n-2} \times SE_{\hat{y}}$

where

$SE_{\hat{y}} = \sqrt{\left(SE(\hat{\beta}_1)\right)^2 \times (x_0 - \bar{x})^2 + s_e^2 \left(1 + \dfrac{1}{n}\right)}$

• A confidence interval for a mean response at a particular value $x_0$ is

$\hat{y} \pm t^*_{n-2} \times SE_{\hat{\mu}}$

where

$SE_{\hat{\mu}} = \sqrt{\left(SE(\hat{\beta}_1)\right)^2 \times (x_0 - \bar{x})^2 + \dfrac{s_e^2}{n}}$


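Example 2, continued.
Answer Questions (1) and (2), both with 95% certainty.

(1)

$SE_{\hat{\mu}} = \sqrt{\left(SE(\hat{\beta}_1)\right)^2 \times (x_0 - \bar{x})^2 + \dfrac{s_e^2}{n}} = \sqrt{(0.0873)^2 \times (12 - 10.40)^2 + \dfrac{2.1816}{10}}$

$= \sqrt{0.00762 \times 2.56 + 0.21816} = \sqrt{0.2377} = 0.488$

$\hat{y} = 8.1006 + (0.9134)x = 8.1006 + (0.9134)(12) = 19.06$

$\hat{y} \pm t^*_{n-2} \times SE_{\hat{\mu}} = 19.06 \pm (2.306)(0.488) = 19.06 \pm 1.13 = (17.93,\ 20.19)$

=⇒ ($17,930 , $20,190)

(2)

$SE_{\hat{y}} = \sqrt{\left(SE(\hat{\beta}_1)\right)^2 \times (x_0 - \bar{x})^2 + s_e^2 \left(1 + \dfrac{1}{n}\right)} = \sqrt{(0.0873)^2 \times (12 - 10.40)^2 + 2.1816 \left(1 + \dfrac{1}{10}\right)}$

$= \sqrt{2.4193} = 1.555$

$\hat{y} \pm t^*_{n-2} \times SE_{\hat{y}} = 19.06 \pm (2.306)(1.555) = 19.06 \pm 3.59 = (15.47,\ 22.65)$

=⇒ ($15,470 , $22,650)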
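Both intervals can be computed side by side to see how they differ. This is a sketch that plugs the rounded ingredients from the notes into the two standard-error formulas; the helper name `interval` is hypothetical, and t* = 2.306 is again taken from the t table.

```python
import math

# Ingredients from Example 2, as rounded in the notes.
B0, B1 = 8.1006, 0.9134   # sample intercept and slope
SE_SLOPE = 0.0873         # standard error of the slope
S_E2 = 2.1816             # regression variance s_e squared
X_BAR = 10.40             # mean number of sales people
N = 10
T_STAR = 2.306            # t-table value, 95% confidence, df = 8

def interval(x0, individual):
    """Return (lower, upper): prediction interval if individual is True, else CI for the mean."""
    y_hat = B0 + B1 * x0
    extra = S_E2 * (1 + 1 / N) if individual else S_E2 / N
    se = math.sqrt(SE_SLOPE ** 2 * (x0 - X_BAR) ** 2 + extra)
    margin = T_STAR * se
    return y_hat - margin, y_hat + margin

ci_low, ci_high = interval(12, individual=False)   # Question (1): mean weekly sales
pi_low, pi_high = interval(12, individual=True)    # Question (2): next week's sales
print(f"95% CI for mean sales:   ({ci_low:.2f}, {ci_high:.2f})")
print(f"95% prediction interval: ({pi_low:.2f}, {pi_high:.2f})")
```

The script keeps full precision between steps, so its endpoints can differ from the hand-rounded ones above in the second decimal place. The only difference between the two formulas is the extra $s_e^2$ term for an individual week, which makes the prediction interval the wider of the two.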

Follow-Up Question: Which is more difficult to predict or estimate statistically, an individual week’s sales or the mean sales over many weeks?

Answer :


Example 4, continued.
Answer Questions (a) and (b) below, both with 95% certainty:

(a) What’s the average ticket cost for destinations which are 500 miles from Boston?

(continued)


(b) What’s the ticket cost of a single flight from Boston to Kansas City?


Example 5 (Political science class)
In a political science class, the instructor gives only one midterm and then a final exam. The maximum score on either exam is 100%. Statistics are available on the percentage scores by 45 students on these two exams:

• Assume that $s_e = 5.0$ for any regression equation considered.

• The correlation between exam scores is 0.70.

Here’s the exam score data:

• Midterm Scores: Mean = 75, Standard Deviation = 10

• Final Exam Scores: Mean = 80, Standard Deviation = 7

Answer these questions:

(a) Find the regression equation to predict the final exam score from the midterm exam score.

(b) Interpret a 95% CI for the slope of the equation from (a).

(c) Describe the effectiveness of the regression from (a). Support your answer numerically.

(d) A student named Mary Smith scores 82 on the midterm exam. Forecast her final exam score, with 95% certainty.

(e) Consider the population of all students who score 70 on the midterm exam. What’s their average final exam score, also with 95% certainty?
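When only summary statistics are available, as in Example 5, the regression line can be built from the "ingredients" alone. The sketch below assumes the standard relations slope = r × (s_y / s_x) and intercept = ȳ − slope × x̄; the variable names and the `predict` helper are illustrative, not taken from the notes. It covers only the point estimates; the intervals still follow the formulas in the notes.

```python
# Summary statistics given in Example 5.
n = 45                   # number of students
r = 0.70                 # correlation between midterm and final scores
x_bar, s_x = 75.0, 10.0  # midterm mean and standard deviation
y_bar, s_y = 80.0, 7.0   # final exam mean and standard deviation

# Assumed standard formulas for the least-squares line.
b1 = r * s_y / s_x       # slope
b0 = y_bar - b1 * x_bar  # intercept
r_squared = r ** 2       # share of variation in y explained by x


def predict(midterm_score):
    """Point estimate of the final exam score for a given midterm score."""
    return b0 + b1 * midterm_score


print(f"y_hat = {b0:.2f} + {b1:.2f} x")
print(f"R^2 = {r_squared:.2f}")
print(f"Point estimate at x = 82: {predict(82):.2f}")
print(f"Point estimate at x = 70: {predict(70):.2f}")
```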

(more space on following pages)


Example 5, continued.

(End of Example 5 and Topic 10 Notes. See Homework 12 next page.)


Stats for Business HOMEWORK 12 (Topic 10: Correlation, Regression and Stock Portfolios)

DIRECTIONS:

• This assignment is a direct counterpart to HW11. (The same exercises and data sets are used in both assignments.)

• You’re asked to calculate and interpret three types of questions (see formulas in Topic 10 Notes):

◦ Confidence interval for the slope $\beta_1$

◦ Confidence interval for mean response $\mu_y$ to a particular value $x_0$

◦ Prediction interval for an individual response $y$ to a particular value $x_0$

• Exercise 2.42
In HW11 you were asked to calculate regression “ingredients” $r$, $\bar{x}$, $s_x$, $\bar{y}$, $s_y$ and use them to calculate the sample regression equation

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

You also predicted deforestation when the price of coffee is 88¢.

(Refer back to your work for HW11 and the HW11 Solution to review.)

Questions:

(a) Calculate the regression standard deviation $s_e$ from a table for these quantities:

◦ actual response $y$

◦ predicted response $\hat{y}$

◦ prediction error $e = y - \hat{y}$

◦ squared prediction error $e^2$

(See Section 2.3 in the Topic 10 Notes to review how we did this in class. Carry calculations to at least 3 decimal places.)

(b) Calculate and interpret a 95% confidence interval for the regression slope $\beta_1$.

(c) Calculate and interpret a 95% prediction interval for deforestation in 2013 if the price of coffee is 88¢ in 2013.

(continued)


(d) Calculate and interpret a 95% confidence interval for mean deforestation in all years for which the price of coffee is 88¢.

(e) Which answer is less precise, the answer to (c) or (d), and why?

• Exercise 2.70

After deleting the (non-alcoholic) O’Doul’s beer, you calculated the regression “ingredients” and the regression equation

$$\hat{y} = -3.54 + 3.03x$$

where

◦ x = percent alcohol content in beer

◦ y = carbs in grams per 12 ounces beer

Given that the regression standard deviation is $s_e = 3.8817$, answer the following questions:

(a) Calculate and interpret a 95% confidence interval for the regression slope $\beta_1$.

(b) Calculate and interpret a 95% prediction interval for the carbs in a brand of beer which contains 6% alcohol.

(c) Calculate and interpret a 95% confidence interval for the mean carbs in all brands of beer which contain 6% alcohol.

• Additional Exercise
Refer to the Additional Exercise in HW11, which concerns votes for Proposition 227 in California.

(a) Calculate and interpret a 95% confidence interval for the regression slope $\beta_1$.

(b) In HW11 Additional Exercise Question (i) you predicted the voting outcome for Prop. 227 in the California city of Camarillo. Repeat that prediction, but this time with 90% certainty. Also interpret the answer.

(end of assignment)


Stat 1030 Topic 10 Discussion Worksheet

DIRECTIONS:

◦ Refer to Topic 10 Notes and formulas to answer worksheet questions.

◦ Carry calculations to at least 3 decimal places.

◦ Use 90% certainty for all prediction and confidence intervals.

◦ The worksheet provides several blank pages toward the end for calculations.

A family owns two identical convenience stores which are located several blocks from each other, both in a residential neighborhood of Cleveland, Ohio. The monthly sales from both stores (in thousands of dollars) for the past 8 months are shown in the table below.

The family’s goal is to predict $y$ = Store 2 Revenue based on either a time trend ($x_1$ = Month) or in relation to the other store ($x_2$ = Store 1 Revenue).

We’ll investigate both regressions in this worksheet. Here are the data:

$x_1$ = Month    $x_2$ = Store 1 Sales    $y$ = Store 2 Sales
1                26.8                     26.5
2                25.4                     27.3
3                28.9                     24.2
4                23.6                     27.1
5                27.7                     23.6
6                23.9                     25.9
7                24.7                     26.3
8                28.1                     22.5

Other facts are:

• The correlation between $x_1$ and $y$ is $r = -0.518$

• The correlation between $x_2$ and $y$ is $r = -0.771$
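The two correlations can be reproduced directly from the eight months of data. This is a minimal sketch using only the Python standard library; the list names and the `correlation` helper are illustrative, not part of the worksheet.

```python
import math

# Worksheet data (sales in thousands of dollars).
month = [1, 2, 3, 4, 5, 6, 7, 8]
store1 = [26.8, 25.4, 28.9, 23.6, 27.7, 23.9, 24.7, 28.1]
store2 = [26.5, 27.3, 24.2, 27.1, 23.6, 25.9, 26.3, 22.5]


def correlation(x, y):
    """Sample correlation r between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r_month = correlation(month, store2)    # x1 versus y
r_store1 = correlation(store1, store2)  # x2 versus y

print(f"r(Month, Store 2 Sales)         = {r_month:.3f}")
print(f"r(Store 1 Sales, Store 2 Sales) = {r_store1:.3f}")
```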

Here are scatterplots for the two potential regressions:

[Figure: Scatterplot of Store 2 Sales vs Month (horizontal axis: Month; vertical axis: Store 2 Sales)]

[Figure: Scatterplot of Store 2 Sales vs Store 1 Sales (horizontal axis: Store 1 Revenue; vertical axis: Store 2 Revenue)]


I. Consider Regression #1: Using $x_1$ to predict $y$

1. Find the necessary “ingredients” for regression.

2. Calculate and write out the sample regression equation.

3. Interpret the regression slope.

4. Calculate the regression standard error $s_e$. (Use a table for the necessary calculations.)

5. Express the benefits from Regression #1 in two different ways: Compare $s_y$ to $s_e$ and note the value of $R^2$.

6. Calculate a 90% CI for the slope.

7. Interpret the 90% CI for the slope.

8. Is it plausible (with 90% confidence) that in fact there is no time trend for Store 2 Sales? Explain.
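Steps 1 through 5 for Regression #1 can be checked with a short script. The sketch below is one possible way to organize the work, not the worksheet's official solution: `fit_line` and its dictionary keys are made-up names, the slope is computed from the "ingredients" as r × (s_y / s_x), and s_e comes from a table of y, ŷ, e and e² divided by n − 2.

```python
import math

# Worksheet data: x = Month, y = Store 2 Sales (thousands of dollars).
month = [1, 2, 3, 4, 5, 6, 7, 8]
store2 = [26.5, 27.3, 24.2, 27.1, 23.6, 25.9, 26.3, 22.5]


def fit_line(x, y):
    """Least-squares line built from the regression ingredients."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_x, s_y = math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1))
    r = sxy / math.sqrt(sxx * syy)
    b1 = r * s_y / s_x       # slope
    b0 = y_bar - b1 * x_bar  # intercept
    # Table rows: (actual y, predicted y, error e, squared error e^2).
    table = []
    for xi, yi in zip(x, y):
        y_hat = b0 + b1 * xi
        e = yi - y_hat
        table.append((yi, y_hat, e, e ** 2))
    s_e = math.sqrt(sum(row[3] for row in table) / (n - 2))
    return {"r": r, "b0": b0, "b1": b1, "s_y": s_y, "s_e": s_e,
            "r_squared": r ** 2, "table": table}


fit1 = fit_line(month, store2)
print(f"y_hat = {fit1['b0']:.3f} + ({fit1['b1']:.3f}) x")
print(f"s_y = {fit1['s_y']:.3f}, s_e = {fit1['s_e']:.3f}, "
      f"R^2 = {fit1['r_squared']:.3f}")
for y_i, y_hat, e, e_sq in fit1["table"]:
    print(f"{y_i:6.1f} {y_hat:8.3f} {e:8.3f} {e_sq:8.3f}")
```

Passing the Store 1 sales column as `x` instead of `month` gives the corresponding quantities for Regression #2.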

II. Consider Regression #2: Using $x_2$ to predict $y$

1. Find the necessary “ingredients” for regression.

2. Calculate and write out the sample regression equation.

3. Interpret the regression slope.

4. Can you provide an explanation, based on economics, for the negative slope and correlation?

5. Calculate the regression standard error $s_e$. (Use a table for the necessary calculations.)

6. Express the benefits from Regression #2 in two different ways: Compare $s_y$ to $s_e$ and note the value of $R^2$.

7. Calculate a 90% CI for the slope.

8. Interpret the 90% CI for the slope.

9. Is it plausible (with 90% confidence) that in fact there is no relationship between Store 2 Sales and Store 1 Sales? Explain.

(continued)


III. Choosing a Regression Model and Using it for Prediction

1. Choose a preferred regression model between Regression #1 and Regression #2. Provide specific reasons for your choice.

2. Use the chosen model to forecast with 90% certainty Store 2 Sales in the 15th week, assuming that Store 1 sells $25,000 that week.

3. Use the chosen model to estimate with 90% certainty average Store 2 Sales for all weeks in which Store 1 sells $25,000.
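Questions 2 and 3 apply the prediction-interval and confidence-interval formulas from the Topic 10 Notes. The sketch below shows the mechanics under stated assumptions: it uses Regression #2 purely for illustration (the choice of model in Question 1 is left to the reader), it assumes the standard formula SE(slope) = s_e / sqrt(Σ(x − x̄)²), and the value t* = 1.943 is typed in from a t table (90%, 6 degrees of freedom). The function name `intervals` is hypothetical.

```python
import math

# Worksheet data (thousands of dollars): x = Store 1 Sales, y = Store 2 Sales.
store1 = [26.8, 25.4, 28.9, 23.6, 27.7, 23.9, 24.7, 28.1]
store2 = [26.5, 27.3, 24.2, 27.1, 23.6, 25.9, 26.3, 22.5]

T_STAR_90_DF6 = 1.943  # from a t table: 90% confidence, n - 2 = 6 df


def intervals(x, y, x0, t_star):
    """Point estimate, confidence interval and prediction interval at x0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e_sq = sse / (n - 2)
    se_slope = math.sqrt(s_e_sq / sxx)  # assumed standard formula
    y_hat = b0 + b1 * x0
    shared = se_slope ** 2 * (x0 - x_bar) ** 2
    se_mu = math.sqrt(shared + s_e_sq / n)         # mean response
    se_y = math.sqrt(shared + s_e_sq * (1 + 1 / n))  # individual response
    ci = (y_hat - t_star * se_mu, y_hat + t_star * se_mu)
    pi = (y_hat - t_star * se_y, y_hat + t_star * se_y)
    return y_hat, ci, pi


# Store 1 sells $25,000, i.e. x0 = 25 in thousands of dollars.
y_hat, ci, pi = intervals(store1, store2, 25, T_STAR_90_DF6)
print(f"y_hat  = {y_hat:.3f}")
print(f"90% PI = ({pi[0]:.2f}, {pi[1]:.2f})  # individual response")
print(f"90% CI = ({ci[0]:.2f}, {ci[1]:.2f})  # mean response")
```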

(more space next several pages)

Topic 10 Discussion worksheet, continued.

(end of Topic 10 Worksheet and Stats Notes)
