Descriptive Statistics tasks Iwona Nowakowska
STATISTICS
Task 1: The table below displays the number of days to maturity for 8 short-term investments. Data are from Barron's National Business and Financial Weekly.
Determine the first quartile for the given data set.
.
.
.
Interpretation:
.
.
.
Task 2: The U.S. National Center for Health Statistics published data on weights by age in Vital and Health Statistics. The weights of males aged 18-24 are given in the table:
weight (lb)   frequency
120-140       3
140-160       7
160-180       14
180-220       5
220-260       2
260-280       1
Determine the mean weight for males aged 18-24 years.
.
.
.
Interpretation:
.
.
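A minimal sketch of the grouped-mean calculation for Task 2, assuming each class is represented by its midpoint (the usual convention for grouped data). The function name `grouped_mean` is illustrative and not part of the worksheet.

```python
# Class limits (lb) and frequencies copied from the Task 2 table.
classes = [(120, 140, 3), (140, 160, 7), (160, 180, 14),
           (180, 220, 5), (220, 260, 2), (260, 280, 1)]

def grouped_mean(classes):
    """Mean of grouped data, using class midpoints as representatives."""
    total = sum(f for _, _, f in classes)
    weighted = sum((lo + hi) / 2 * f for lo, hi, f in classes)
    return weighted / total

mean_weight = grouped_mean(classes)
print(round(mean_weight, 2))  # 174.06
```

Under this assumption the mean weight is about 174 lb. Because the classes have unequal widths, the midpoints (not the class indices) must carry the weights.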
Task 3: A city planner collected data on the number of school-age children in each of
12 families. The data are displayed in the table below. Construct a box-and-whisker
plot for this data set.
number of school-age children 0 3 0 2 3 1 0 4 3 1 0 2
.
.
.
.
.
.
.
Graph:
.
.
.
.
.
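The five numbers needed for the box-and-whisker plot in Task 3 can be sketched as follows. This assumes quartiles are taken as the medians of the lower and upper halves of the sorted data; other quartile conventions exist and may give slightly different values. The helper name `five_number_summary` is illustrative.

```python
from statistics import median

# Number of school-age children in each of the 12 families (Task 3).
children = [0, 3, 0, 2, 3, 1, 0, 4, 3, 1, 0, 2]

def five_number_summary(data):
    """Min, Q1, median, Q3, max; quartiles are medians of the two halves."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    # For an odd sample size the middle observation is left out of both halves.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return min(s), median(lower), median(s), median(upper), max(s)

summary = five_number_summary(children)
print(summary)  # (0, 0.0, 1.5, 3.0, 4)
```

With this convention the box spans 0 to 3, the median line sits at 1.5, and the upper whisker ends at 4; the lower whisker has zero length because the minimum equals the first quartile.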
Task 4: Professor Weiss asked his statistics students to state their political party
affiliations as Democratic (D), Republican (R) or Other (O). The responses are
given in the table below. Build the relative frequency distribution for these
data.
Political party affiliations:
D R O R R R R D O R D O R R D R O R R D O O O R D O R R R D O R D O R
.
.
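A short sketch of the relative frequency distribution for Task 4. It uses the 35 responses exactly as they are listed in this worksheet; the variable names are illustrative.

```python
from collections import Counter

# Responses copied from the Task 4 list (35 answers as printed here).
responses = ("D R O R R R R D O R D O R R D R O R R D O O O R D O R R R D "
             "O R D O R").split()

counts = Counter(responses)
n = len(responses)

# Relative frequency = class frequency divided by the number of observations.
relative = {party: counts[party] / n for party in ("D", "R", "O")}

for party, share in relative.items():
    print(party, counts[party], round(share, 3))
```

The relative frequencies must add up to 1, which is a quick check on the tally.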
Task 5: The Bureau of Economic Analysis gathers information on the length of stay in
Europe by U.S. travelers. Data are published in Survey of Current Business. A
sample of 30 residents who traveled to Europe one year yielded the following
data, in days, on length of stay.
length of stay (days)   number of travelers
1 - 5                   2
5 - 10                  3
10 - 20                 7
20 - 30                 12
30 - 40                 4
40 - 60                 2
Determine the middle length of stay in Europe by U.S. travelers.
.
.
.
.
Interpretation:
.
.
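For Task 5, the "middle" length of stay can be read as the median of grouped data. The sketch below assumes linear interpolation inside the class that contains the n/2-th observation; the function name `grouped_quantile` is illustrative.

```python
# Class limits (days) and frequencies copied from the Task 5 table.
classes = [(1, 5, 2), (5, 10, 3), (10, 20, 7),
           (20, 30, 12), (30, 40, 4), (40, 60, 2)]

def grouped_quantile(classes, p):
    """Quantile of order p for grouped data via linear interpolation."""
    n = sum(f for _, _, f in classes)
    target = p * n
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            # Interpolate within the class that contains the target position.
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("p must be between 0 and 1")

median_stay = grouped_quantile(classes, 0.5)
print(median_stay)  # 22.5
```

Under these assumptions half of the travelers stayed no longer than about 22.5 days. The same function with p = 0.25 or p = 0.75 gives the grouped quartiles.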
Task 6: The Energy Information Administration collects data on residential electric
energy consumption. Results are published below.
electric energy consumption   number of homes
2 - 4                         6
4 - 6                         10
6 - 8                         30
8 - 10                        40
10 - 12                       10
12 - 14                       4
Is it true that the mode of electric energy consumption is greater than the
mean value of this variable? Prove your answer.
Determine the kind of skewness for the above data.
.
.
.
.
.
.
Conclusions:
.
.
.
.
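A sketch of the comparison asked for in Task 6, assuming class midpoints for the mean and the usual interpolation formula for the mode of grouped data with equal class widths. Both function names are illustrative.

```python
# Class limits and numbers of homes copied from the Task 6 table.
classes = [(2, 4, 6), (4, 6, 10), (6, 8, 30),
           (8, 10, 40), (10, 12, 10), (12, 14, 4)]

def grouped_mean(classes):
    """Mean of grouped data based on class midpoints."""
    n = sum(f for _, _, f in classes)
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

def grouped_mode(classes):
    """Interpolated mode; assumes equal widths and an interior modal class."""
    k = max(range(len(classes)), key=lambda i: classes[i][2])
    lo, hi, f = classes[k]
    f_prev = classes[k - 1][2]
    f_next = classes[k + 1][2]
    return lo + (f - f_prev) / ((f - f_prev) + (f - f_next)) * (hi - lo)

mean = grouped_mean(classes)
mode = grouped_mode(classes)
print(mean, mode)  # 8.0 8.5
```

Here the mode (8.5) exceeds the mean (8.0), which points to a left-skewed (negatively skewed) distribution.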
Task 7: The Food and Nutrition Board of the National Academy of Sciences states that the recommended daily allowance of iron is 18 mg for adults under the age of 50. The amounts of iron intake (in milligrams) during a 24-hour period for some people are given below.
iron intake (mg)   number of people
10,5               5
12                 9
14                 14
15,5               16
16                 28
17,5               30
18                 42
19,5               26
20                 3
Is it true that 25% of the population has a daily iron intake of at least 14 mg?
Give a proper proof of your answer.
.
.
.
Conclusions:
.
.
.
Task 8: Data on starting salaries for college graduates are provided by The
Northwestern Endicott-Lindquist Report. A population of 1200 Management
graduates yielded the following starting annual salaries (data are in thousands
of dollars):
20 - 22 5
22 - 25 15
25 - 27 35
27 - 29 20
29 - 31 10
31 - 32 10
32 - 35 5
Determine the upper quartile of the starting salaries, that is, the value not
exceeded by 75% of the population.
.
.
.
.
Interpretation:
.
.
.
Task 9: A car saleswoman keeps track of the number of cars she sells per week. The
numbers of cars she sold per week last year are as follows:
1 0 3 3 2 1 0 4 2 3 4 2 0 1 2 3 0 3 5 1 0 2
Determine values for all quartiles and give the proper interpretations.
.
.
.
.
.
.
Interpretation:
.
.
.
.
.
.
Task 10: A research physician conducted a study on the ages of people with diabetes. The following data were obtained for the ages of a sample of some diabetics. Construct a typical area of variation for the given data set.
age (years)   number of diabetics
6 - 10        1
10 - 30       5
30 - 40       10
40 - 50       15
50 - 60       25
60 - 80       10
.
.
.
.
Interpretation:
.
.
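A sketch for Task 10, assuming the typical area of variation is the interval from mean minus one standard deviation to mean plus one standard deviation, with class midpoints as representatives and the variance divided by n (population form). These conventions are assumptions; the course may prescribe a different variant.

```python
from math import sqrt

# Class limits (years) and frequencies copied from the Task 10 table.
classes = [(6, 10, 1), (10, 30, 5), (30, 40, 10),
           (40, 50, 15), (50, 60, 25), (60, 80, 10)]

n = sum(f for _, _, f in classes)
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

# Population variance of the grouped data (divide by n, not n - 1).
variance = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes) / n
sd = sqrt(variance)

lower, upper = mean - sd, mean + sd
print(round(mean, 2), round(sd, 2), round(lower, 2), round(upper, 2))
```

Read this way, the typical diabetic in the sample is roughly between 34 and 63 years old.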
Task 11: A small construction company employs 8 bricklayers. The number of days each employee misses is recorded. Absence records for the past year are as follows:
2 3 6 5 3 2 6 7
Is the employee absence statistically significant or insignificant? Prove your
answer.
.
.
.
.
.
.
Conclusions:
.
.
Task 12: The exam scores for the students in an introductory statistics class are as follows:
scores     number of students
below 30   2
30 - 40    10
40 - 50    25
50 - 60    35
60 - 80    40
above 80   10
Determine the appropriate measure of central tendency for the given variable, taking the variability into account.
.
.
.
.
Conclusions:
.
.
Task 13: A farmer wants to know the middle value of his oat yields. Using his observations displayed on the histogram, help him find this value.
.
.
.
[Histogram: number of farms by oat yield (bushels)]
Conclusions:
.
.
Task 14: Determine the coefficient of variation for the examination scores.
.
.
.
.
.
.
[Histogram: number of students by examination score (class midpoints)]
Conclusions:
.
.
Task 15: The histogram below shows the cholesterol level (in mg per dl) of 200 people. How many people have a cholesterol level between 205 and 210? What can we say about the skewness of the distribution of cholesterol level? Is it true that more people have a cholesterol level above the mean? Prove your answer. Determine the proper measures.
.
.
.
.
.
Conclusions:
.
.
Task 16: In a group of 50 students the exam scores were examined and the following
quartile values were obtained:
What can we say about the variability? Is the dispersion statistically significant
or not?
.
.
.
Conclusions:
.
.
Task 17: Adam and Eve took a statistics test in two different groups. Adam scored 20 points (in his group the average was 17 points and the standard deviation 2 points). Eve scored 10 points (in her group the average was 8,5 points and the standard deviation 1 point). Which of them achieved the better result? Prove your answer.
.
.
.
Conclusions:
.
.
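Task 17 compares results from two different groups, so one reasonable approach is to standardize each score. A minimal sketch, with the helper name `z_score` chosen for illustration:

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above its group mean."""
    return (x - mean) / sd

adam = z_score(20, 17, 2)    # Adam: 20 points, group mean 17, sd 2
eve = z_score(10, 8.5, 1)    # Eve: 10 points, group mean 8.5, sd 1

print(adam, eve)  # 1.5 1.5
```

Both standardized scores equal 1.5, so relative to their own groups Adam and Eve performed equally well even though their raw scores differ.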
Task 18: In the table below, each row represents one person's answers to
questions about annual income and education.
participant of the survey   income (thousand $)   years of education
1 125 19
2 100 20
3 40 16
4 35 16
5 41 18
6 29 12
7 35 14
Determine the direction and strength of the correlation for the given variables.
.
.
.
.
.
Conclusions:
.
.
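The direction and strength of the linear relationship in Task 18 can be sketched with the Pearson correlation coefficient. The function name `pearson_r` is illustrative; the formula is the standard one based on sums of deviations from the means.

```python
from math import sqrt

# Income (thousand $) and years of education for the 7 participants (Task 18).
income = [125, 100, 40, 35, 41, 29, 35]
education = [19, 20, 16, 16, 18, 12, 14]

def pearson_r(x, y):
    """Pearson correlation coefficient from sums of squared deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(income, education)
print(round(r, 2))  # 0.77
```

A value of about 0.77 indicates a positive and fairly strong linear relationship: more years of education tend to go with higher income in this sample.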
Task 19: Given the correlation distribution:
number of study hours 2 4 6 8 10
number of sleeping hours 10 9 8 7 6
what can we say about the relation between study hours and sleeping hours?
.
.
.
.
.
.
Conclusions:
.
. .
.
Task 20: The correlation distribution is given:
years since PhD   number of publications
3                 18
6                 3
3                 2
8                 17
9                 11
6                 6
16                38
10                48
2                 9
5                 22
5                 30
6                 21
7                 10
11                27
18                37
A Pearson correlation coefficient was computed to examine the relationship between a faculty member's number of years of experience and his or her number of peer-reviewed publications. There was a significant positive correlation between the time since a faculty member received his or her Ph.D. and his or her number of publications because the Pearson correlation coefficient value was ... . Indicate the influence of the independent variable on the dependent variable.
.
.
.
.
.
Conclusions:
.
. .
.
Task 21: Having the data below, determine the strength of the correlation for x and y:
x    y    x - x̄   y - ȳ   (x - x̄)²   (y - ȳ)²
1    2    -3,7    -2,3    13,69      5,29
3    5    -1,7    0,7     2,89       ...
5    6    0,3     1,7     ...        2,89
6    6    1,3     1,7     1,69       ...
8    7    3,3     2,7     10,89      7,29
9    7    4,3     2,7     18,49      7,29
6    5    1,3     0,7     1,69       0,49
4    3    -0,7    -1,3    ...        1,69
3    1    -1,7    -3,3    2,89       10,89
2    1    -2,7    -3,3    7,29       ...
47   43
.
.
.
Conclusions:
.
.
Task 22: Having the data below, determine the strength of the correlation for x and y:
.
.
.
Conclusions:
.
.
Task 23: Suppose that 5 students were asked their high school GPA (grade point average) and their college GPA, with the answers as follows:
Student HS GPA College GPA
A 3.8 2.8
B 3.1 2.2
C 4.0 3.5
D 2.5 1.9
E 3.3 2.5
Are high school and college GPA related according to these data? Use the Spearman correlation coefficient to answer the question.
A
B
C
D
E
.
.
.
.
.
Conclusions:
.
.
Task 24: Compute the value of the Spearman correlation coefficient from the following table:
no.   age   glucose level
1 45 99
2 21 65
3 25 79
4 42 75
5 57 85
6 45 85
.
.
.
.
Conclusions:
.
.
Task 25: We have data on height (in inches) and hand span (in centimeters) for 27 students enrolled in a math course. For the correlation distribution, the Pearson correlation coefficient value is 0,746. Determine the coefficient of determination and give the proper interpretation of this measure.
.
.
Conclusions:
.
.
Task 26: At the Big Rock Insurance Company branch office, a pool of six secretaries works under two different managers. At the end of the year, both managers are asked to rank the secretaries. A rank of 1 means best secretary. Determine the measure of correlation for the data:
secretary 1 2 3 4 5 6
Manager A rank 3 5 2 1 6 4
Manager B rank 1 3 6 2 5 4
1
2
3
4
5
6
.
.
.
Conclusions:
.
.
Task 27: An insurance company did a study of per capita income and volume of insurance sales in five cities. The volume of sales in the cities was ranked, with 1 being the largest volume. The per capita income was rounded to the nearest thousand dollars.
city 1 2 3 4 5
Volume of Insurance Sales Rank 5 1 3 2 4
Per Capita Income (in $1000) 17 11 16 12 15
Use a rank of 1 for the highest per capita income to determine the Spearman rank correlation coefficient.
1
2
3
4
5
.
.
Conclusions:
.
.
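A sketch of the Spearman rank correlation for Task 27, assuming no tied values (true for these data) so that the shortcut formula based on squared rank differences applies. The helper names `rank_descending` and `spearman_rs` are illustrative.

```python
# Data for the five cities (Task 27).
sales_rank = [5, 1, 3, 2, 4]          # 1 = largest sales volume
income = [17, 11, 16, 12, 15]         # per capita income in $1000

def rank_descending(values):
    """Rank 1 for the largest value; assumes there are no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_rs(rank_x, rank_y):
    """Spearman coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

income_rank = rank_descending(income)
rs = spearman_rs(sales_rank, income_rank)
print(income_rank, round(rs, 3))  # [1, 5, 2, 4, 3] -0.9
```

The coefficient of about -0.9 indicates a strong negative monotonic relationship between the two rankings in this sample: cities ranked high on income tend to be ranked low on sales volume.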
Task 28: A group of 4 cadets selected at random were given a flying aptitude test before they went to flight training. After graduation from training school, their commanding officer ranked each cadet according to his or her flying ability (a higher rank means greater ability). The results were:
cadet 1 2 3 4
aptitude score 520 390 480 750
performance rank 4 1 2 3
Using a rank of 1 for the lowest aptitude score, find the Spearman rank correlation coefficient.
1
2
3
4
.
.
.
Conclusions:
.
Task 29: Check if there is a correlation between education and the number of cigarettes smoked a day, having the data:
education                           primary   primary   vocational   secondary   higher   higher
number of cigarettes smoked a day   15        12        13           10          8        10
1
2
3
4
5
6
.
.
.
.
Conclusions:
.
.
Task 30: Some students claim they can tell the cost of a textbook just by looking at its
thickness. To test this claim they picked six books of the same height and width
at random. The cost and thickness relation is:
thickness (cm)   1   2   0,5   1,5   3
cost ($)         5   7   4     9     10
Draw a scatter diagram. Find the equation of the best-fitting line to the
empirical data.
.
.
.
.
.
.
thickness (cm)   cost ($)
1 5
2 7
0,5 4
1,5 9
3 10
.
.
.
.
.
.
.
Conclusions:
.
.
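The best-fitting line in Task 30 can be sketched with the ordinary least-squares formulas, using the five (thickness, cost) pairs printed in the task. The function name `least_squares_line` is illustrative.

```python
# Thickness (cm) and cost ($) pairs copied from Task 30.
thickness = [1, 2, 0.5, 1.5, 3]
cost = [5, 7, 4, 9, 10]

def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

a, b = least_squares_line(thickness, cost)
print(round(a, 3), round(b, 3))  # 3.324 2.297
```

Under these data the fitted line is roughly cost = 3.32 + 2.30 * thickness, so each additional centimeter of thickness goes with about $2.30 of extra cost.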
Task 31: A sociologist is interested in the relation between the number of job changes and annual salary (in thousands of dollars) for people living in the USA. A sample of 7 people provided the following information:
number of job changes   4    5    6    1    3
salary (in $1000)       33   34   35   32   33
Determine the equation of the theoretical regression line for the given data and
interpret the slope.
4 33
5 34
6 35
1 32
3 33
.
.
.
.
Conclusions:
.
.
.
Task 32: Modern medical practice tells us not to encourage babies to overeat, so that they do not become too fat. Medical research indicates that there is a positive correlation between the weight (x) of a 1-year-old baby and the weight (y) of a mature adult (30 years old). A random sample of medical files produced the following information for some females:
x   21    25    24    20    15
y   125   125   120   130   120
Using the best-fitting line equation, can you predict the weight of a 30-year-old woman whose weight as a 1-year-old baby was 20 lb? Find the standard error of this estimate.
.
.
.
.
.
Conclusions:
.
.
.
Task 33: Dorothy sells life insurance for the Prudence Insurance Company. She sells insurance by making visits to her clients' homes. Dorothy believes that the number of sales should depend to some degree on the number of visits made. For the past several years she kept careful records of the number of visits (x) she made each week and the number of people (y) who bought insurance that week. For a random sample of 5 such weeks, the values of x and y follow:
x   11   13   15   20   14
y   3    5    5    6    5
On a week in which Dorothy made 18 visits, how many people would you predict would buy insurance from her? Use the least-squares line for prediction.
.
.
.
.
Conclusions:
.
.
Task 34: The following data are based on information from the book Life in America's Small Cities:
x - percentage of those 25 years or older with 4 or more years of college,
y - per capita income in thousands of dollars.
Regarding the x and y values we obtain: % , , ,
In a small city where x = 20 percent of the population 25 years or older have had 4 or more years of college, what would the least-squares equation forecast for y, the per capita income?
.
.
Conclusions:
.
.
.
Task 35: The newspaper reported the following information about the random variables per capita income and per capita retail sales (both in thousands of dollars): , , , Suppose you plan to open a retail store in a city where the per capita income is 9,5 thousand dollars. What does the regression line equation forecast for per capita retail sales?
.
.
Conclusions:
.
.
.
Task 36: The number of workers on an assembly line varies with the level of absenteeism on any given day. In a random sample of production output from several days of work, the following data were obtained:
x - number of workers absent from assembly line: 3 5 0 2 1
y - number of defects coming off the line: 16 20 9 12 10
standard error
On a day when four workers are absent from the assembly line, what would be the number of defects coming off the line?
.
.
Conclusions:
.
.
.
Task 37: Using the data on the graph below, determine the influence of intelligence quotient (IQ) on the test scores. What can you say about the utility of the regression equation for making predictions?
.
Conclusions:
.
.
Task 38:
What life expectancy can we expect for a cat in the eighth generation? .
Conclusions:
.
[Graph for Task 37: scores plotted against IQ]
[Graph for Task 38: cat generation influence on the life expectancy; life expectancy (years) plotted against generation]
Task 39: The correlation table is given:
age \ number of jobs   1    2
24 - 26                10   10
26 - 28                10   20
Check if there is a correlation between the variables age and number of jobs.
.
.
.
deviations
table
10 10
10 20
.
.
.
Conclusions:
.
.
Task 40: The correlation table is given:
car age \ price (100s)   40 - 80   80 - 120
2 - 4                    5         10
4 - 6                    10        -
Check if there is a correlation between the price and car age.
.
.
.
deviations
table
5 10
10 -
.
.
.
Conclusions:
.
.
Task 41: The table with data for the variables X and Y is given:
X \ Y   0    2    4
1       10   -    -
2       -    10   -
3       -    -    20
4       10   -    -
Check if there is a correlation between the variables X and Y, where: X - years of study, Y - number of failed exams
1 0
2 2
3 4
4
.
.
.
deviations
table
.
.
.
Conclusions:
.
.
Task 42: The table gives the annual income value of Brase Company between 1995 and
2000.
t   year   income ($ millions)   index (1995 = 100)   index (1999 = 100)
1   1995   46
2   1996   47
3   1997   48
4   1998   48,5
5   1999   49
6   2000   48,2
Determine all single-base indexes for the bases 1995 = 100 and 1999 = 100.
Give the interpretation of the value for the base 1999.
.
.
Conclusions:
.
.
.
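A minimal sketch of the single-base indexes in Task 42: each value is divided by the value in the base year and multiplied by 100. The function name `single_base_indexes` is illustrative.

```python
# Annual income ($ millions) of the company, copied from Task 42.
income = {1995: 46, 1996: 47, 1997: 48, 1998: 48.5, 1999: 49, 2000: 48.2}

def single_base_indexes(series, base_year):
    """Index of every period relative to one fixed base period (base = 100)."""
    base = series[base_year]
    return {year: 100 * value / base for year, value in series.items()}

idx_1995 = single_base_indexes(income, 1995)
idx_1999 = single_base_indexes(income, 1999)

for year in income:
    print(year, round(idx_1995[year], 1), round(idx_1999[year], 1))
```

For example, the 2000 index with base 1999 is about 98.4, meaning income in 2000 was roughly 1.6% lower than in 1999.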
Task 43: The time series below shows the price of the Super Sandwich between May
and October 2010:
t   month       price (PLN)
1   May         9
2   June        8
3   July        10
4   August      12
5   September   12
6   October     10
Determine all chain indexes.
Give the interpretation of the value .
.
Conclusions:
.
.
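The chain indexes in Task 43 compare each month with the month immediately before it. A short sketch, with illustrative variable names:

```python
# Monthly prices (PLN) of the Super Sandwich, copied from Task 43.
prices = [("May", 9), ("June", 8), ("July", 10),
          ("August", 12), ("September", 12), ("October", 10)]

# Chain index for period t = value in t divided by value in t - 1.
chain = {prices[i][0]: prices[i][1] / prices[i - 1][1]
         for i in range(1, len(prices))}

for month, index in chain.items():
    print(month, round(index, 3))
```

For instance, the July index of 1.25 means the price in July was 25% higher than in June, while an index of 1.0 for September means no change from August.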
Task 44: The time series below shows the price of the Super Sandwich between May and
October 2010. Determine the average sandwich price in the given months.
t   month       price (PLN)
1   May         9
2   June        8
3   July        10
4   August      12
5   September   12
6   October     10
.
.
Task 45: The time series shows the plane ticket prices from Lodz to Paris. Determine the average plane ticket price from Lodz to Paris between 7 October and 12 October 2008.
t   date (2008)   price (PLN)
1   7 October     348
2   8 October     380
3   9 October     420
4   10 October    399
5   11 October    520
6   12 October    490
.
.
.
Task 46: The time series below gives the value of money gathered by celebrities for charity ($ millions) in the years 2002-2010:
t   1      2      3      4      5     6     7      8      9
Y   12,9   12,8   11,3   10,6   9,2   9,8   12,3   12,3   13,6
Determine the average value of the variable Y between 2002 and 2010 and give the interpretations for the values:
.
.
.
.
.
.
.
.
.
Task 47: The dynamic indexes are given:
(year)
in/0
in/0
in/0
in/n-1
2005 245 1 ----
2006 275 1,12 1,12
2007 285 1
2008 250 1
Fill in all gaps with the missing index numbers.
Task 48: The time series below gives the cost of a weekly stay in the clinic (in thousand PLN):
year   cost (thousand PLN)
2006 8
2007 10
2008 12
2009 12,5
2010 14
Determine the average growth rate of the cost of a weekly stay in the clinic.
.
.
.
Conclusions:
.
.
.
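A sketch of the average growth rate for Task 48, assuming it is defined through the geometric mean of the chain indexes, which reduces to the (n - 1)-th root of the ratio of the last value to the first. Variable names are illustrative.

```python
# Cost of a weekly stay (thousand PLN) in 2006-2010, copied from Task 48.
costs = [8, 10, 12, 12.5, 14]

# The product of the chain indexes telescopes to last / first, so the
# geometric mean of the four chain indexes is the fourth root of that ratio.
periods = len(costs) - 1
avg_index = (costs[-1] / costs[0]) ** (1 / periods)
avg_growth_rate = avg_index - 1

print(round(100 * avg_growth_rate, 1))  # 15.0
```

Under this definition the cost grew by about 15% per year on average between 2006 and 2010.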
Task 49: The price of laptops produced by the firm LoNg LiFe increased by 25% in 2012 in
comparison with the year 2008. As a consequence, in the year 2012 the
quantity of laptops sold decreased by 18,8 percent relative to the
year 2008. Determine the percentage change in the production value of
laptops produced by the LoNg LiFe company.
.
.
Conclusions:
.
.
Task 50: The value of coffee makers sold decreased by 3,9% in comparison with the previous month, but the price of coffee makers increased by 5,5%. How did the quantity of coffee makers sold change?
.
.
Conclusions:
.
.
Task 51: The value of goods sold by the company WIND increased by 12,2% in comparison with the previous quarter, but the prices of the goods decreased by 1,5%. How did the quantity of goods sold change?
.
.
Conclusions:
.
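Tasks 49 to 51 all rest on the same identity: value index = price index * quantity index. A minimal sketch, with the two helper names chosen for illustration:

```python
def pct_to_index(pct_change):
    """Convert a percentage change (e.g. -18.8) into an index (0.812)."""
    return 1 + pct_change / 100

def index_to_pct(index):
    """Convert an index back into a percentage change."""
    return (index - 1) * 100

# Task 49: value index = price index * quantity index.
value_49 = index_to_pct(pct_to_index(25) * pct_to_index(-18.8))

# Tasks 50 and 51: quantity index = value index / price index.
quantity_50 = index_to_pct(pct_to_index(-3.9) / pct_to_index(5.5))
quantity_51 = index_to_pct(pct_to_index(12.2) / pct_to_index(-1.5))

print(round(value_49, 1), round(quantity_50, 1), round(quantity_51, 1))
# 1.5 -8.9 13.9
```

Note that percentage changes cannot simply be added or subtracted: in Task 49 a 25% price rise combined with an 18,8% drop in quantity gives a value increase of about 1,5%, not 6,2%.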
Task 52: Make a comprehensive analysis of the time series below:
ice-cream sales (tonnes)
March   April   May    June
9,1     10,6    11,5   13,8
Using the least-squares criterion, estimate the parameters of the proper trend
function. What level of ice-cream sales can we expect in July?
.
.
.
.
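A sketch of the trend estimation in Task 52, assuming a linear trend y = a + b * t with the months numbered t = 1 (March) to t = 4 (June), so that July corresponds to t = 5. The linear form and the time numbering are assumptions; variable names are illustrative.

```python
# Ice-cream sales (tonnes) for March-June, copied from Task 52.
t = [1, 2, 3, 4]
sales = [9.1, 10.6, 11.5, 13.8]

n = len(t)
mean_t = sum(t) / n
mean_y = sum(sales) / n

# Least-squares estimates of the slope and intercept of the linear trend.
slope = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, sales))
         / sum((ti - mean_t) ** 2 for ti in t))
intercept = mean_y - slope * mean_t

forecast_july = intercept + slope * 5
print(round(intercept, 2), round(slope, 2), round(forecast_july, 2))
# 7.5 1.5 15.0
```

Under these assumptions sales rise by about 1,5 tonnes per month, and the extrapolated July value is about 15 tonnes.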
Task 53: Based on the data below, make a comprehensive analysis of the dynamics of price,
quantity and value of all grocery products per capita in Poland in 2011 relative
to the base year 2010.
product       unit        price 2010 (PLN)   price 2011 (PLN)   quantity 2010   quantity 2011
eggs          carton      5,5                6,0                60              50
butter        250 grams   3,9                3,9                70              65
meat /pork/   kilogram    17                 19                 30              35
milk          litre       1,8                2,3                120             140
bread         loaf        2,2                2,0                150             150
Do not forget to draw conclusions taking into account the Fisher ideal index.
.
.
.
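A sketch of the aggregate price indexes for Task 53, using the standard Laspeyres, Paasche and Fisher formulas together with the value index. The data layout and variable names are illustrative; quantity indexes can be obtained analogously by swapping the roles of prices and quantities.

```python
from math import sqrt

# (price 2010, price 2011, quantity 2010, quantity 2011) per product, Task 53.
basket = {
    "eggs":   (5.5, 6.0, 60, 50),
    "butter": (3.9, 3.9, 70, 65),
    "meat":   (17, 19, 30, 35),
    "milk":   (1.8, 2.3, 120, 140),
    "bread":  (2.2, 2.0, 150, 150),
}

p0q0 = sum(p0 * q0 for p0, p1, q0, q1 in basket.values())
p1q0 = sum(p1 * q0 for p0, p1, q0, q1 in basket.values())
p0q1 = sum(p0 * q1 for p0, p1, q0, q1 in basket.values())
p1q1 = sum(p1 * q1 for p0, p1, q0, q1 in basket.values())

laspeyres_price = p1q0 / p0q0      # base-period quantities as weights
paasche_price = p1q1 / p0q1        # current-period quantities as weights
fisher_price = sqrt(laspeyres_price * paasche_price)
value_index = p1q1 / p0q0

print(round(laspeyres_price, 4), round(paasche_price, 4),
      round(fisher_price, 4), round(value_index, 4))
```

The Fisher ideal index is the geometric mean of the Laspeyres and Paasche indexes, so it always lies between them; here all three point to a price rise of roughly 7 to 8%.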
Task 54: The following table reports prices and usage quantities for two items in 2004 and 2006.
Item   Quantity 2004   Quantity 2006   Unit price 2004 ($)   Unit price 2006 ($)
A      1 500           1 800           7,5                   7,75
B      2               1               630                   1 500
Describe the price change for each item in 2006 using 2004 as the base period and
the price change for the two items simultaneously. Use all possible index numbers.
Interpret the Fisher ideal index.
.
.
.
.
.
.
.
Task 55: The probability distribution is given:
Determine all the numerical parameters for the random variable described by the above function with the proper value of the parameter .
.
.
Conclusions:
.
.
Task 56: The probability distribution of X, the age of a randomly selected student, is given:
x (age)   P(X = x)
19 0,250
20 0,375
21 0,250
27 0,125
Express the mean age of the students in terms of probability distribution of the
random variable X.
.
.
.
.
Task 57: A factory manager collected data on the number of equipment breakdowns per day. From those data, he derived the probability distribution shown in the table below. W denotes number of breakdowns on a given day:
W          0      1      2
P(W = w)   0,80   0,15   0,05
On average, how many breakdowns occur a day?
What is the standard deviation value?
.
.
.
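A minimal sketch for Task 57: the expected value of a discrete random variable is the probability-weighted sum of its values, and the standard deviation is the square root of the probability-weighted squared deviations. Variable names are illustrative.

```python
from math import sqrt

# Probability distribution of W, the number of breakdowns per day (Task 57).
distribution = {0: 0.80, 1: 0.15, 2: 0.05}

# E(W) = sum of w * P(W = w)
expected = sum(w * p for w, p in distribution.items())

# Var(W) = sum of (w - E(W))^2 * P(W = w)
variance = sum((w - expected) ** 2 * p for w, p in distribution.items())
std_dev = sqrt(variance)

print(round(expected, 2), round(std_dev, 3))  # 0.25 0.536
```

On average there are 0,25 breakdowns a day, with a standard deviation of about 0,54 breakdowns. The same two formulas give the mean age asked for in Task 56.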