
HW Organizing and Describing Datastaff.katyisd.org/sites/0410576/PublishingImages/Pages/documents/… · the data, approximately 1.3 cm. 7. a) Create a dotplot and a stemplot for


Name ____________________________________ Period _____________________________

HW Organizing and Describing Data

1. (a) Gender (female or male) Categorical

(b) Age (years) Quantitative - discrete

(c) Race Categorical

(d) Smoker (yes or no) Categorical

(e) Systolic blood pressure (millimeters of mercury) Quantitative - continuous

(f) Level of calcium in the blood (micrograms per milliliter) Quantitative - continuous

(g) Number of prior surgeries Quantitative - discrete

2. (a) What percent of spam would fall in the “other” category? 7%

(b) Display this data in a bar graph. Use graph paper and a formal, neat presentation with all the required elements.


3. (a) Present these data in a well-labeled bar graph.

(b) Suggest some possible reasons why there are fewer births on the weekends.

Labor is often induced during the week, when doctors are in the office.


4. Based on these results, do you think there was a change in people’s attitudes during the 10 years between these polls? Support your conclusions using a side-by-side bar graph.

There doesn’t appear to be much of a difference in people’s attitudes, as the heights of the bars are approximately the same for each category.


5.

(a) Construct a segmented bar graph to show the different percentages of physical activity for the

three BMI Groups.

(b) Do these data prove that lack of exercise causes obesity? Explain. No. Causation cannot be determined from an observational study.


6. a) Describe each of the four distributions.

Process A: The shape is roughly symmetrical and bell-shaped. The center is at 11.5 and the spread is from about 11.4 to 11.6. There are no unusual features.

Process B: The shape is uniform. The center is around 12 and the spread is from 11.4 to 12.7. There are clusters throughout that are separated by small gaps, almost every .05 in change in diameter length.

Process C: The shape is roughly bimodal. The center is around 11.7 and the data are spread from 11.1 to 12.2. There are many gaps all along the interval.

Process D: The shape is roughly symmetrical and bell-shaped. The center is at 12 and the spread is from 11.8 to 12.2.

a) Which process is the best? Why? Process D because almost all of the dots appear to be in the acceptable

range of 11.8 to 12.2.

b) Which process is the most stable? Why? Process A is the most stable because it has the smallest spread

in the data, approximately .2 cm.

c) Which process is the least stable? Why? Process B is the least stable because it has the largest spread in

the data, approximately 1.3 cm.

7. a) Create a dotplot and a stemplot for these data.

b) Describe the distribution.

Fairly symmetrical bell-shaped with a peak at 35. The center appears to be 35 with a range

from 14 to 54. Looking at the dotplot we can see gaps at 17, 20, 21 and 36-37 which gives it the

appearance of clusters of data scattered throughout.


8. a) Create a dotplot and a stemplot for these data.

b) Describe the distribution.

The distribution is skewed to the right with a center of 10 and a range from 1 to 31. Looking at the dotplot, there are gaps at 15-18, 20, 24, 26 and 28-30. There is a cluster of data around 22, and 31 could be an outlier.

9. a) Create a dotplot of this data.

The distribution of number of hurricanes is skewed to the right with a center at 2 and a range from 0 to 7. There are no unusual features to this distribution.


10.

The distribution of horse-power for autos is slightly skewed to the right with a center at 103 and a range

from 65 to 155. There aren’t any unusual features with this distribution.


The split stem shows the details a bit better where you are able to make out the cluster of data around 86.

The distribution of the amount of money spent by shoppers in a store is skewed to the right with a cluster of amounts around 86. The distribution has a center of 28 and a range from 3 to 94.


The distribution of division times for lung cells exposed to beryllium is skewed to the right with a center around 28 and a spread from 14 to 73. There are no unusual features.


13.

The distribution of zinc intake is roughly symmetric and bell-shaped with a center around 11 and a spread from 5 to 19. There are no unusual features.


14. Complete this frequency distribution.

Class Limits   Class Boundaries   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Freq.
14-20          13.5-20.5          6           .136                 6                      .136
21-27          20.5-27.5          7           .159                 13                     .295
28-34          27.5-34.5          7           .159                 20                     .455
35-41          34.5-41.5          11          .25                  31                     .705
42-48          41.5-48.5          8           .182                 39                     .886
49-55          48.5-55.5          5           .114                 44                     1.000

Create a cumulative relative frequency plot (ogive).
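The columns of the table in problem 14 can be rebuilt from the class limits and frequencies alone. The following is a minimal sketch, assuming whole-number class limits so that each boundary sits 0.5 beyond the limit; the function name `frequency_table` is made up for this illustration.

```python
# Class limits and frequencies copied from the table in problem 14.
classes = [(14, 20), (21, 27), (28, 34), (35, 41), (42, 48), (49, 55)]
freqs = [6, 7, 7, 11, 8, 5]

def frequency_table(classes, freqs):
    """Build one dict per class with the six columns of the table."""
    total = sum(freqs)
    rows, running = [], 0
    for (lo, hi), f in zip(classes, freqs):
        running += f  # cumulative frequency so far
        rows.append({
            "limits": (lo, hi),
            # Assumption: whole-number limits, so boundaries are +/- 0.5.
            "boundaries": (lo - 0.5, hi + 0.5),
            "freq": f,
            "rel": f / total,
            "cum": running,
            "cum_rel": running / total,
        })
    return rows

table = frequency_table(classes, freqs)
for r in table:
    print(r["limits"], r["boundaries"], r["freq"],
          f"{r['rel']:.3f}", r["cum"], f"{r['cum_rel']:.3f}")
```

Plotting each upper class boundary against its cumulative relative frequency gives the points of the ogive.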

15) What DRP score is at the 20th percentile?

23

16) What DRP score is at the 90th percentile?

50

17) What is the median DRP score?

36

18) What is the IQR for the distribution of DRP

scores? 17
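Answers 15 through 18 are read off the ogive by eye. One way to check them is to interpolate linearly between the plotted points. This is a sketch under the assumption that scores are spread evenly within each class; the helper `percentile_from_ogive` is a made-up name, and the data are the class boundaries and cumulative frequencies from problem 14.

```python
# Class boundaries and cumulative frequencies from the problem 14 table.
# The first point (13.5, 0) is the lower boundary of the first class.
boundaries = [13.5, 20.5, 27.5, 34.5, 41.5, 48.5, 55.5]
cum_freq = [0, 6, 13, 20, 31, 39, 44]

def percentile_from_ogive(p, boundaries, cum_freq):
    """Linearly interpolate the score at cumulative proportion p (0 <= p <= 1)."""
    target = p * cum_freq[-1]
    for i in range(1, len(boundaries)):
        if cum_freq[i] >= target:
            x0, x1 = boundaries[i - 1], boundaries[i]
            c0, c1 = cum_freq[i - 1], cum_freq[i]
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("p must be between 0 and 1")

p20 = percentile_from_ogive(0.20, boundaries, cum_freq)
p90 = percentile_from_ogive(0.90, boundaries, cum_freq)
med = percentile_from_ogive(0.50, boundaries, cum_freq)
iqr = (percentile_from_ogive(0.75, boundaries, cum_freq)
       - percentile_from_ogive(0.25, boundaries, cum_freq))
print(round(p20, 1), round(p90, 1), round(med, 1), round(iqr, 1))
```

The interpolated values land close to the answers read from the graph (23, 50, 36 and 17); small differences are expected because reading a plot by eye is approximate.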

The following cumulative relative frequency plot shows the age of people enrolled in school in 1996.

19) What is the median age of school enrollment in 1996?

11

20) What is the interquartile range of school enrollment in

1996?

10

21) At or below what age is the bottom 10% of school

enrollment in 1996?

4

22) At or above what age is the top 20% of school enrollment in

1996?

19

[Figure: cumulative relative frequency plot of age of school enrollment, 1996. x-axis: Age (5 to 40); y-axis: Cumulative relative frequency (.1 to 1).]

[Figure: ogive for problem 14. x-axis: DRP Score (14 to 56).]


The following ogive shows the grade point average (GPA) for students at a certain school.

23) What is the median GPA?

2.4

24) What is the IQR of GPA?

1.0

25) What GPA is at the 85th percentile?

3.0

26) What does the steepness of the line imply?

The cumulative relative frequency increased greatly because there were many values in that

class.

27) What does it mean if the line is horizontal between two points?

There were no values between those two points.

28) (a) Find the mean and median of each medal count (total, gold, silver and bronze).

Total: mean = 31.29, median = 22 Gold: mean = 9.95, median = 7

Silver: mean = 10.1, median = 7 Bronze: mean = 11.24, median = 9

(b) Which is larger, the mean or the median? Is the difference considerably “large”?

The mean is larger for all. It seems considerably large in the total category.

29) (a) Use the formula to calculate the mean. Mean = 85

(b) the fifteenth quiz and he receives a score of zero. Mean = 79.3; the mean is not resistant to extreme values.

30) (a) Find the mean score from the formula for the mean. Mean = 141.06

(b) Find the mean for the 17 observations when you drop the outlier. Mean = 137.59; the

mean was impacted severely by the outlier.
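The arithmetic behind 29(b) and 30(b) can be sketched from the means alone. The quiz count of 14 is an assumption inferred from the phrase "the fifteenth quiz", and 18 observations is inferred from "the 17 observations when you drop the outlier"; both function names are made up for this illustration.

```python
def mean_after_adding(old_mean, old_n, new_value):
    """Mean after one more observation joins a data set with known mean and size."""
    return (old_mean * old_n + new_value) / (old_n + 1)

def removed_value(mean_with, n_with, mean_without):
    """Value that was dropped, recovered from the means before and after."""
    return mean_with * n_with - mean_without * (n_with - 1)

# Problem 29: assumed 14 quizzes averaging 85, then a zero on the fifteenth.
new_mean = mean_after_adding(85, 14, 0)
print(round(new_mean, 1))  # 79.3

# Problem 30: assumed 18 scores averaging 141.06; 17 remain averaging 137.59.
outlier = removed_value(141.06, 18, 137.59)
print(round(outlier, 2))
```

Because the two means in problem 30 are rounded, the recovered outlier is only approximate, but it shows how a single extreme value pulls the mean.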

31) What is the team’s annual payroll for players? 30 million If you only knew the median salary,

would you be able to answer the question? NO

[Figure: ogive of grade point average. x-axis: Grade Point Average (1 to 4); y-axis: Cumulative Proportion (.1 to 1).]


32) Find the median of these scores. Median = 138.5 < Mean because distribution is skewed right

33) What is the mean salary paid at this firm? $60,000. How many employees earn less than the mean? 7 of 8. They could promote the $60,000 as the average even though most employees (7/8) earn significantly less than that.

34) $490,000 (Median) and 1,160,000 (Mean). Salaries are skewed right.

35) a). State the IQR of this data set. 79 84.5 88.5 93 111 IQR = 8.5

b) Find the mean and the median. Mean is 90.7, Median is 88.5

c) Based on the mean and median, describe the distribution. Skewed right

36) a) Construct a boxplot (modified if necessary) of the data. 44 61.5 65.5 71.5 80

b) Find the value of the IQR. 10

c) Are there any outliers? 44; Didn’t study

37) a) Compute the 5 number summary. 1.12 1.88 2.23 2.86 4.69

b) Draw a modified boxplot if you suspect outliers. Are there any? 4.69

c) the shape of the distribution, mean to fall distinctly above the median, skewed right
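The outlier answers in 36 and 37 follow from the 1.5 × IQR rule applied to the five-number summaries quoted above. The following is a minimal sketch; `outlier_fences` is a made-up helper name.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Problem 36 five-number summary: 44 61.5 65.5 71.5 80
low36, high36 = outlier_fences(61.5, 71.5)
print(low36, high36)        # 46.5 86.5
print(44 < low36)           # True: the minimum 44 is a low outlier

# Problem 37 five-number summary: 1.12 1.88 2.23 2.86 4.69
low37, high37 = outlier_fences(1.88, 2.86)
print(round(low37, 2), round(high37, 2))
print(4.69 > high37)        # True: the maximum 4.69 is a high outlier
```

A five-number summary only lets you test the minimum and maximum; checking the other values requires the raw data.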

38) Find the range (58), mean (29.75), variance (613.07) and standard deviation (24.76)

39) Find the range (4.6), mean (6.681), variance (12.91) and standard deviation (3.59) for this data.

40) Find the range (11), mean (8.58), variance (11.72) and standard deviation (3.42).

41) • Choose four numbers that have the smallest possible standard deviation. All same

• Choose four numbers that have the largest possible standard deviation. 0,0,10,10

• Is more than one choice possible in either (a) (YES) or (b) (NO)
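The answers to 41 can be checked by brute force. This sketch assumes the four numbers must be whole numbers from 0 to 10 with repeats allowed (the range is inferred from the answer 0,0,10,10 and is not stated in this key), and it uses the sample standard deviation.

```python
from itertools import combinations_with_replacement
from statistics import stdev

# Assumption: whole numbers 0..10, repeats allowed, order ignored.
choices = list(combinations_with_replacement(range(11), 4))
sds = {c: stdev(c) for c in choices}

largest = max(sds, key=sds.get)                 # choice with the biggest spread
smallest = [c for c, s in sds.items() if s == 0]  # all choices with zero spread

print(largest, round(sds[largest], 2))
print(len(smallest), "choices have standard deviation 0")
```

Every choice with four equal numbers has standard deviation 0, so the smallest case has many answers, while only one choice reaches the maximum.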

42) SAT Verbal scores for a high school’s graduating class

[Figure: side-by-side boxplots of SAT Verbal scores for Males and Females; axis from 300 to 800.]

In general, graduating females at this high school tend to slightly out-perform their male

counterparts in SAT Verbal scores. Both distributions are skewed to the left, but the males

will have outliers with any of the scores that are below 312.5. The median score for females is

625 which is slightly higher than the males at 600. Female scores have a bit more variability in

the IQR (150 vs. 135), but the scores for males are more spread out overall with a range of

490 compared to only 410 for females.


43) Number of C-sections performed in a year by doctors

       Male | Stem | Female
            |  0   | 5 7
            |  1   | 0 4 8 9
  8 7 5 5 0 |  2   | 5 9
  7 6 4 2 1 |  3   | 1 3
          4 |  4   |
        9 0 |  5   |
            |  6   |
            |  7   |
        6 5 |  8   |

44) Agility test performance by 4th - graders

           Male | Stem | Female
                |  1   | 2
          8 7 7 |  *   | 6 9
  4 3 3 2 2 2 1 |  2   | 0 1 2 2 4
              9 |  *   | 5 5 6 7 8

45) Which of the following groups have outliers? Associate and Instructor

46) a) Which group has the individual largest salary of anyone listed; South; 125,000

b) Which group has the largest median salary; South; 90,000

c) Which group has the smallest interquartile range; Northeast; 17,000

d) Which group has the lowest Q3; West; 90,000

e) The top half of the south make the same or more as 75% of the west.

f) What is the spread of the lower 25% of the midwest? 65,000-75,000

g) Which region has the widest middle 50% of their data? Midwest

47) a) If the company decides to give every employee a $5000 raise, how will that affect the

(mean, median, mode) + 5,000, (range, variance and standard deviation) Unaffected

b) Suppose instead the CEO decides to give everyone a 20% raise, how will that affect the

(mean, median, mode, range, and standard deviation) times 1.2; variance times 1.44
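The rules in 47 can be seen on a small example. The salaries below are hypothetical (they are not from the worksheet) and serve only to illustrate that adding a constant shifts measures of center while leaving spread alone, and that multiplying by a constant scales both.

```python
from math import isclose
from statistics import mean, median, stdev, variance

# Hypothetical salaries, used only to illustrate the rule.
salaries = [30000, 42000, 42000, 55000, 91000]

raised = [s + 5000 for s in salaries]  # flat $5000 raise
scaled = [s * 1.2 for s in salaries]   # 20% raise

# Adding a constant: center moves by 5000, spread is unchanged.
print(mean(raised) - mean(salaries), median(raised) - median(salaries))
print(isclose(stdev(raised), stdev(salaries)))

# Multiplying by 1.2: center and standard deviation scale by 1.2,
# variance scales by 1.2 squared = 1.44.
print(isclose(mean(scaled), 1.2 * mean(salaries)))
print(isclose(stdev(scaled), 1.2 * stdev(salaries)))
print(isclose(variance(scaled), 1.44 * variance(salaries)))
```

The variance picks up the factor 1.44 because it is measured in squared units.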

43) Key: 1 | 4 represents 14 c-sections performed.

In general, it appears that male doctors perform more c-sections in a year than female doctors. The shape of the distribution of the number of c-sections performed by male doctors is skewed to the right with a gap from 59 to 85 and two outliers in the 80s, while the shape of the distribution for female doctors is roughly symmetric. The median for the males (34) is much higher than the median for females (18). The males have a larger range of 86 – 20 = 66 in comparison to the range of the females, which is 33 – 5 = 28.
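The summary values in this comparison can be recomputed from the leaves of the back-to-back stemplot in problem 43. The lists below are one reading of that stemplot, and the quartile rule used here (medians of the lower and upper halves, leaving out the overall median when the count is odd) is one common classroom convention, taken as an assumption.

```python
from statistics import median

# Values read from the stemplot for problem 43 (key: 1 | 4 represents 14).
male = [20, 25, 25, 27, 28, 31, 32, 34, 36, 37, 44, 50, 59, 85, 86]
female = [5, 7, 10, 14, 18, 19, 25, 29, 31, 33]

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    s = sorted(data)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

def outliers(data):
    """Values beyond the 1.5 * IQR fences."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(median(male), max(male) - min(male), outliers(male))
print(median(female), max(female) - min(female), outliers(female))
```

With this reading the female median comes out as 18.5, which is close to the 18 quoted in the commentary; the male median, both ranges, and the two male outliers in the 80s agree with it.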

44) Key: 1 | 2 represents an agility test score of 12.

In general, it appears that females outperform males on a 4th-grade agility test. The shape of the distribution of the number of agility test scores by females is skewed to the left while the distribution of male scores is skewed to the right. Neither distribution seems to have any unusual features. The median score for both genders is 22; however, the females have more scores in the upper 20s. The females have a slightly larger range of 28 – 12 = 16 in comparison to the range of the males, which is 29 – 17 = 12.


48) a) Assume all those families were able to use a $1.00 off coupon, how will that affect the (mean, median, mode) minus $1, (range, variance and standard deviation) Unaffected

b) Instead of a $1.00 off coupon they were able to save 20%, how will that affect the (mean, median, mode, range, and standard deviation) times .8; variance times .64

c) It is their lucky week and they can take $1.00 off and then also save 20%; what impact will that have on the (mean, median, mode) .8(x – 1), (range, and standard deviation) times .8 and variance times .64