
Descriptive Statistics - Tasks (Example)-1


  • Descriptive Statistics tasks Iwona Nowakowska


    STATISTICS

    Task 1: The table below displays the number of days to maturity for 8 short-term investments. Data are from Barron's National Business and Financial Weekly.

    Determine the first quartile for the given data set.

    .

    .

    .

    Interpretation:

    .

    .

    .

    Task 2: The U.S. National Center for Health Statistics published data on weights by age in Vital and Health Statistics. The weights are in the table:

    Weight (lb), males aged 18-24    Frequency
    120-140                          3
    140-160                          7
    160-180                          14
    180-220                          5
    220-260                          2
    260-280                          1

    Determine the mean weight for the males aged 18-24 years.
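A minimal sketch of the grouped-data mean for Task 2. It assumes, as is usual for class-interval tables, that every observation in a class sits at the class midpoint; the variable names are illustrative only.

```python
# Class limits and frequencies copied from the Task 2 table.
classes = [(120, 140), (140, 160), (160, 180), (180, 220), (220, 260), (260, 280)]
freqs = [3, 7, 14, 5, 2, 1]

# Assumption: each class is represented by its midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

print(f"n = {n}, grouped mean = {grouped_mean:.2f} lb")
```

Because the classes have unequal widths, the midpoints (not the class positions) carry the weighting, so the result is only an approximation of the mean of the raw weights.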

    .

    .

    .


    Interpretation:

    .

    .

    Task 3: A city planner collected data on the number of school-age children in each of 12 families. The data are displayed in the table below. Construct a box-and-whisker plot for this data set.

    number of school-age children: 0 3 0 2 3 1 0 4 3 1 0 2
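The five numbers behind a box-and-whisker plot can be sketched as below. Quartile conventions differ between textbooks; this sketch assumes the median-of-halves convention, and the helper name `five_number_summary` is hypothetical.

```python
from statistics import median

children = [0, 3, 0, 2, 3, 1, 0, 4, 3, 1, 0, 2]

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    # For an odd sample size the middle observation is left out of both halves.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return s[0], median(lower), median(s), median(upper), s[-1]

summary = five_number_summary(children)
print(summary)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers run to the minimum and maximum; other quartile conventions may shift Q1 and Q3 slightly.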

    .

    .

    .

    .

    .

    .

    .

    Graph:

    .

    .

    .

    .

    .

    Task 4: Professor Weiss asked his statistics students to state their political party

    affiliations as Democratic (D), Republican (R) or Other (O). The responses are

    given in the table below. Build the relative frequency distribution for these

    data.


    Political party affiliations:

    D R O R R R R D O R D O R R D R O R R D O O O R D O R R R D O R D O R
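A relative frequency distribution is each category count divided by the number of responses. A short sketch using the responses listed above (variable names are illustrative):

```python
from collections import Counter

responses = "D R O R R R R D O R D O R R D R O R R D O O O R D O R R R D O R D O R".split()

counts = Counter(responses)
n = len(responses)
# Relative frequency = class count / total number of responses.
relative = {party: counts[party] / n for party in ("D", "R", "O")}

for party, share in relative.items():
    print(f"{party}: count = {counts[party]:2d}, relative frequency = {share:.3f}")
```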

    .

    .

    Task 5: The Bureau of Economic Analysis gathers information on the length of stay in Europe by U.S. travelers. Data are published in Survey of Current Business. A sample of 30 residents who traveled to Europe one year yielded the following data, in days, on length of stay.

    Length of stay (days)    Number of travelers
    1-5                      2
    5-10                     3
    10-20                    7
    20-30                    12
    30-40                    4
    40-60                    2

    Determine the median (middle) length of stay in Europe by U.S. travelers.
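For grouped data the median is usually found by linear interpolation inside the class that contains the n/2-th observation. The sketch below assumes observations are spread evenly within each class; `grouped_median` is a hypothetical helper name.

```python
classes = [(1, 5), (5, 10), (10, 20), (20, 30), (30, 40), (40, 60)]
freqs = [2, 3, 7, 12, 4, 2]

def grouped_median(classes, freqs):
    """Interpolate the median inside the class holding the n/2-th observation."""
    n = sum(freqs)
    cumulative = 0
    for (lo, hi), f in zip(classes, freqs):
        if cumulative + f >= n / 2:
            return lo + (n / 2 - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("empty frequency table")

median_stay = grouped_median(classes, freqs)
print(f"median length of stay = {median_stay} days")
```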

    .

    .

    .

    .


    Interpretation:

    .

    .

    Task 6: The Energy Information Administration collects data on residential electric energy consumption. Results are published below.

    Electric energy consumption ( )    Number of homes
    2-4                                6
    4-6                                10
    6-8                                30
    8-10                               40
    10-12                              10
    12-14                              4

    Is it true that the mode of electric energy consumption is greater than the mean value of this variable? Prove your answer.

    Determine the kind of skewness for the above data.
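One way to compare the two measures is to compute the midpoint-based mean and the interpolated mode of the modal class. The interpolation formula used here assumes equal class widths, which holds for this table; the variable names are illustrative.

```python
classes = [(2, 4), (4, 6), (6, 8), (8, 10), (10, 12), (12, 14)]
freqs = [6, 10, 30, 40, 10, 4]

n = sum(freqs)
# Grouped mean, assuming each class is represented by its midpoint.
mean = sum((lo + hi) / 2 * f for (lo, hi), f in zip(classes, freqs)) / n

# Interpolated mode inside the modal class (equal class widths assumed).
k = freqs.index(max(freqs))
lo, hi = classes[k]
prev_f = freqs[k - 1] if k > 0 else 0
next_f = freqs[k + 1] if k < len(freqs) - 1 else 0
mode = lo + (freqs[k] - prev_f) / ((freqs[k] - prev_f) + (freqs[k] - next_f)) * (hi - lo)

print(f"mean = {mean}, mode = {mode}")
print("mean < mode suggests left (negative) skewness" if mean < mode
      else "mean >= mode suggests right (positive) skewness or symmetry")
```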

    .

    .

    .

    .

    .

    .


    Conclusions:

    .

    .

    .

    .

    Task 7: The Food and Nutrition Board of the National Academy of Sciences states that the recommended daily allowance of iron is 18 mg for adults under the age of 50. The amounts of iron intake (in milligrams) during a 24-hour period for some people are below.

    Iron intake (mg)    Number of people
    10,5                5
    12                  9
    14                  14
    15,5                16
    16                  28
    17,5                30
    18                  42
    19,5                26
    20                  3

    Is it true that 25% of the population has a daily iron intake of at least 14 mg? Give a proper proof of your answer.

    .

    .

    .

    Conclusions:

    .

    .

    .


    Task 8: Data on starting salaries for college graduates are provided by The Northwestern Endicott-Lindquist Report. A population of 1200 Management graduates yielded the following starting annual salaries (data are in thousands of dollars):

    20-22    5
    22-25    15
    25-27    35
    27-29    20
    29-31    10
    31-32    10
    32-35    5

    Determine the upper quartile of the starting salaries, that is, the value below which 75% of the population falls.

    .

    .

    .

    .

    Interpretation:

    .

    .

    .

    Task 9: A car saleswoman keeps track of the number of cars she sells per week. The numbers of cars she sold per week last year are as follows:

    1 0 3 3 2 1 0 4 2 3 4 2 0 1 2 3 0 3 5 1 0 2

    Determine the values of all quartiles and give the proper interpretations.


    .

    .

    .

    .

    .

    .

    Interpretation:

    .

    .

    .

    .

    .

    .

    Task 10: A research physician conducted a study on the ages of people with diabetes. The following data were obtained for the ages of a sample of some diabetics. Construct a typical area of variation for the given data set.

    Age (years)    Number of people
    6-10           1
    10-30          5
    30-40          10
    40-50          15
    50-60          25
    60-80          10

    .

    .

    .

    .


    Interpretation:

    .

    .

    Task 11: A small construction company employs 8 bricklayers. The number of days each employee misses is recorded. Absence records for the past year are as follows:

    2 3 6 5 3 2 6 7

    Is the employee absence statistically significant or insignificant? Prove your answer.
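A common way to judge how strong the dispersion is relative to the average is the coefficient of variation. The sketch below treats the 8 bricklayers as the whole population (so it uses the population standard deviation); the threshold that counts as "significant" depends on the course convention and is not stated in the task.

```python
from statistics import mean, pstdev

absences = [2, 3, 6, 5, 3, 2, 6, 7]

avg = mean(absences)
sd = pstdev(absences)   # population standard deviation (assumption: all 8 employees observed)
cv = sd / avg * 100     # coefficient of variation, in percent

print(f"mean = {avg:.2f} days, sd = {sd:.2f} days, CV = {cv:.1f}%")
```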

    .

    .

    .

    .

    .

    .

    Conclusions:

    .

    .

    Task 12: The exam scores for the students in an introductory statistics class are as follows:

    Scores      Number of students
    below 30    2
    30-40       10
    40-50       25
    50-60       35
    60-80       40
    above 80    10

    Determine the appropriate measure of central tendency for the given variable, taking the variability into account.

    .

    .

    .

    .

    Conclusions:

    .

    .

    Task 13: A farmer wants to know the middle value of his oat yields. Using his observations displayed in the histogram, help him find this value.

    .

    .

    .

    [Histogram: number of farms by oat yield (bushels)]


    Conclusions:

    .

    .

    Task 14: Determine the coefficient of variation for the examination scores.

    .

    .

    .

    .

    .

    .

    [Figure: Examinations scores; axes: examination score midpoints, number of students]


    Conclusions

    .

    .

    Task 15: The histogram below shows the level of cholesterol (in mg per dl) of 200 people. How many people have a level of cholesterol between 205 and 210? What can we say about the skewness of the distribution for cholesterol level? Is it true that more people have a cholesterol level above the mean? Prove your answer. Determine the proper measures.

    .

    .

    .

    .

    .


    Conclusions:

    .

    .

    Task 16: In a group of 50 students we tested the exam scores and the values of quartiles were obtained:

    What can we say about the variability? Is the dispersion statistically significant or not?

    .

    .

    .

    Conclusions:

    .

    .

    Task 17: Adam and Eve took a statistics test in two different groups. Adam scored 20 points (in his group the average was 17 points and the standard deviation 2 points). Eve scored 10 points (in her group the average was 8,5 points with a standard deviation of 1 point). Which of them achieved the better result? Prove your answer.

    .
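Results from different groups can be compared by standardizing each score against its own group. A minimal sketch (the helper `z_score` is a hypothetical name):

```python
def z_score(score, group_mean, group_sd):
    """Number of standard deviations a score lies above its group mean."""
    return (score - group_mean) / group_sd

z_adam = z_score(20, 17, 2)
z_eve = z_score(10, 8.5, 1)

print(f"Adam: z = {z_adam}, Eve: z = {z_eve}")
if z_adam > z_eve:
    print("Adam did better relative to his group.")
elif z_eve > z_adam:
    print("Eve did better relative to her group.")
else:
    print("Both results are equally good relative to their groups.")
```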

    .

    .

    Conclusions:

    .

    .


    Task 18: In the table below, each row represents one person's answers to questions about annual income and education.

    Survey participant    Income (thousand $)    Years of education
    1                     125                    19
    2                     100                    20
    3                     40                     16
    4                     35                     16
    5                     41                     18
    6                     29                     12
    7                     35                     14

    Determine the direction and strength of the correlation for the given variables.
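Direction and strength are typically read from the Pearson correlation coefficient. The sketch below computes it from sums of deviations; `pearson_r` is a hypothetical helper name, and verbal labels such as "strong" depend on the scale used in the course.

```python
from math import sqrt

income = [125, 100, 40, 35, 41, 29, 35]      # thousand $
education = [19, 20, 16, 16, 18, 12, 14]     # years

def pearson_r(x, y):
    """Pearson correlation coefficient from sums of deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(income, education)
print(f"r = {r:.3f} ({'positive' if r > 0 else 'negative'} direction)")
```

The same function applies to any of the paired data sets in the later correlation tasks, for example the study hours and sleeping hours of Task 19.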

    .

    .

    .

    .

    .

    Conclusions:

    .

    .


    Task 19: Given the correlation distribution:

    number of study hours       2    4    6    8    10
    number of sleeping hours    10   9    8    7    6

    what can we say about the relation between the study hours and sleeping hours?

    .

    .

    .

    .

    .

    .

    Conclusions:

    .

    . .

    .


    Task 20: The correlation distribution is given:

    years since PhD    number of publications
    3                  18
    6                  3
    3                  2
    8                  17
    9                  11
    6                  6
    16                 38
    10                 48
    2                  9
    5                  22
    5                  30
    6                  21
    7                  10
    11                 27
    18                 37

    A Pearson correlation coefficient was computed to examine the relationship between a faculty member's number of years of experience and his or her number of peer-reviewed publications. There was a significant positive correlation between the time since a faculty member received his or her Ph.D. and his or her number of publications because Pearson correlation coefficient value was . Indicate the influence of the independent variable on the dependent variable.

    .

    .

    .

    .

    .

    Conclusions:

    .

    . .

    .


    Task 21: Using the data below, determine the strength of the correlation for X and Y:

    x     y     x - mean(x)    y - mean(y)    (x - mean(x))^2    (y - mean(y))^2
    1     2     -3,7           -2,3           13,69              5,29
    3     5     -1,7           0,7            2,89               ....
    5     6     0,3            1,7            ....               2,89
    6     6     1,3            1,7            1,69               ....
    8     7     3,3            2,7            10,89              7,29
    9     7     4,3            2,7            18,49              7,29
    6     5     1,3            0,7            1,69               0,49
    4     3     -0,7           -1,3           ....               1,69
    3     1     -1,7           -3,3           2,89               10,89
    2     1     -2,7           -3,3           7,29               ....
    47    43

    .

    .

    .

    Conclusions:

    .

    .

    Task 22: Having the data below determine the strength of the correlation for and :

    .

    .

    .


    Conclusions:

    .

    .

    Task 23: Suppose that 5 students were asked their high school GPA (grade point average) and their college GPA, with the answers as follows:

    Student    HS GPA    College GPA
    A          3.8       2.8
    B          3.1       2.2
    C          4.0       3.5
    D          2.5       1.9
    E          3.3       2.5

    Are high school and college GPA related according to these data? Use the Spearman correlation coefficient to answer the question.

    A

    B

    C

    D

    E

    .

    .

    .

    .

    .

    Conclusions:

    .

    .


    Task 24: Compute the value of the Spearman correlation coefficient from the following table:

    No.    Age    Glucose level
    1      45     99
    2      21     65
    3      25     79
    4      42     75
    5      57     85
    6      45     85
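Both columns of this table contain tied values (two ages of 45, two glucose levels of 85), so the shortcut formula based on squared rank differences is only approximate. The sketch below assumes the usual tie treatment: tied observations share the average of their ranks, and the coefficient is the Pearson correlation of the ranks. The helper names are hypothetical.

```python
from math import sqrt

age = [45, 21, 25, 42, 57, 45]
glucose = [99, 65, 79, 75, 85, 85]

def average_ranks(values):
    """Rank 1 = smallest value; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

rho = spearman_rho(age, glucose)
print(f"age ranks = {average_ranks(age)}")
print(f"rho = {rho:.3f}")
```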

    .

    .

    .

    .

    Conclusions:

    .

    .

    Task 25: We have the data on height (in inches) and hand span (in centimeters) for 27 students enrolled in a math course. For the correlation distribution, the Pearson correlation coefficient value is 0,746. Determine the coefficient of determination and give the proper interpretation of this measure.

    .

    .

    Conclusions:

    .

    .


    Task 26: At the Big Rock Insurance Company branch office, a pool of six secretaries work under two different managers. At the end of the year, both managers are asked to rank the secretaries. A rank of 1 means best secretary. Determine the measure of correlation having data:

    secretary         1    2    3    4    5    6
    Manager A rank    3    5    2    1    6    4
    Manager B rank    1    3    6    2    5    4
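Since both managers already supply ranks without ties, the Spearman rank coefficient can be sketched with the squared-differences formula, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The function name is hypothetical.

```python
rank_a = [3, 5, 2, 1, 6, 4]
rank_b = [1, 3, 6, 2, 5, 4]

def spearman_from_ranks(r1, r2):
    """Spearman coefficient for untied ranks: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(r1)
    d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

rho = spearman_from_ranks(rank_a, rank_b)
print(f"rho = {rho:.3f}")
```

The same helper applies to Tasks 27 and 28 once the raw values there have been converted to ranks in the direction each task prescribes.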

    1

    2

    3

    4

    5

    6

    .

    .

    .

    Conclusions:

    .

    .

    Task 27: An insurance company did a study of per capita income and volume of insurance sales in five cities. The volume of sales in the cities was ranked, with 1 being the largest volume. The per capita income was rounded to the nearest thousand dollars.

    city                              1     2     3     4     5
    Volume of Insurance Sales Rank    5     1     3     2     4
    Per Capita Income (in $1000)      17    11    16    12    15

    Use a rank of 1 for the highest per capita income to determine the Spearman rank coefficient.


    1

    2

    3

    4

    5

    .

    .

    Conclusions:

    .

    .

    Task 28: A group of 4 cadets selected at random were given a flying aptitude test before they went to flight training. After graduation from training school, their commanding officer ranked each cadet according to his or her flying ability (a higher rank means greater ability). The results were:

    cadet               1      2      3      4
    aptitude score      520    390    480    750
    performance rank    4      1      2      3

    Using a rank of 1 for the lowest aptitude score, find the Spearman rank coefficient.

    1

    2

    3

    4

    .

    .

    .

    Conclusions:

    .


    Task 29: Check if there is a correlation between education and the number of cigarettes smoked a day, having the data:

    Education     Number of cigarettes smoked a day
    primary       15
    primary       12
    vocational    13
    secondary     10
    higher        8
    higher        10

    1

    2

    3

    4

    5

    6

    .

    .

    .

    .

    Conclusions:

    .

    .

    Task 30: Some students claim they can tell the cost of a textbook just by looking at its thickness. To test this claim they picked six books of the same height and width at random. The cost and thickness relation is:

    thickness (cm)    1    2    0,5    1,5    3
    cost ($)          5    7    4      9      10

    Draw a scatter diagram. Find the equation of the best-fitting line to the empirical data.
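The best-fitting line in the least-squares sense has slope Sxy / Sxx and passes through the point of means. A minimal sketch with the five listed pairs (the helper name `least_squares_line` is hypothetical):

```python
thickness = [1, 2, 0.5, 1.5, 3]   # cm
cost = [5, 7, 4, 9, 10]           # $

def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

intercept, slope = least_squares_line(thickness, cost)
print(f"cost = {intercept:.3f} + {slope:.3f} * thickness")
```

The slope estimates how many dollars the cost changes per additional centimeter of thickness.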


    .

    .

    .

    .

    .

    .

    (cm)

    ($)

    1 5

    2 7

    0,5 4

    1,5 9

    3 10

    .

    .

    .

    .

    .

    .

    .

    Conclusions:

    .

    .


    Task 31: A sociologist is interested in the relation between number of job changes and annual salary (in thousands of dollars) for people living in the USA. A sample of 7 people provided the following information:

    number of job changes    4     5     6     1     3
    salary (in $1000)        33    34    35    32    33

    Determine the equation of the theoretical regression line for the given data and interpret the slope.

    4 33

    5 34

    6 35

    1 32

    3 33

    .

    .

    .

    .

    Conclusions:

    .

    .

    .

    Task 32: Modern medical practice tells us not to encourage babies to overeat, so that they do not become too fat. Medical research indicates that there is a positive correlation between the weight ( ) of a 1-year-old baby and the weight ( ) of a mature adult (30 years old). A random sample of medical files produced the following information for some females:

    weight at 1 year (lb)      21     25     24     20     15
    weight at 30 years (lb)    125    125    120    130    120

    Using the best-fitting line equation, can you predict the weight at 30 years of a woman whose weight as a 1-year-old baby was 20 lb? Find the standard error of this estimate.
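A sketch of the prediction and its standard error of estimate. It assumes the first row of values is the weight at 1 year and the second the weight at 30 years, and it divides the residual sum of squares by n - 2; some courses divide by n instead, which gives a slightly smaller value.

```python
from math import sqrt

weight_1y = [21, 25, 24, 20, 15]          # lb, at age 1
weight_30y = [125, 125, 120, 130, 120]    # lb, at age 30

n = len(weight_1y)
mx = sum(weight_1y) / n
my = sum(weight_30y) / n
sxx = sum((x - mx) ** 2 for x in weight_1y)
sxy = sum((x - mx) * (y - my) for x, y in zip(weight_1y, weight_30y))

slope = sxy / sxx
intercept = my - slope * mx
predicted = intercept + slope * 20        # forecast for a 20 lb one-year-old

# Standard error of estimate: root of residual sum of squares over (n - 2).
residuals = [y - (intercept + slope * x) for x, y in zip(weight_1y, weight_30y)]
std_error = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"y = {intercept:.3f} + {slope:.3f} x")
print(f"predicted weight at 30 = {predicted:.1f} lb, standard error = {std_error:.2f} lb")
```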

    .

    .

    .

    .

    .

    Conclusions:

    .

    .

    .

    Task 33: Dorothy sells life insurance for the Prudence Insurance Company. She sells insurance by making visits to her clients' homes. Dorothy believes that the number of sales should depend to some degree on the number of visits made. For the past several years she kept careful records of the number of visits ( ) she made each week and the number of people ( ) who bought insurance that week. For a random sample of 5 such weeks, the values are as follows:

    number of visits                        11    13    15    20    14
    number of people who bought insurance   3     5     5     6     5

    On a week in which Dorothy made 18 visits, how many people would you predict would buy insurance from her? Use the least-squares line for prediction.


    .

    .

    .

    .

    Conclusions:

    .

    .

    Task 34: The following data are based on information from the book Life in America's Small Cities: percentage of those 25 years or older with 4 or more years of college; per capita income in thousands of dollars. Regarding the and values we obtain: % , , , In a small city where x=20 percent of the population 25 years or older have had 4 or more years of college, what would the least-squares equation forecast for per capita income?

    .

    .

    Conclusions:

    .

    .

    .


    Task 35: The newspaper reported the following information about the random variables: per capita income and per capita retail sales (both in thousands of dollars): , , , Suppose you plan to open a retail store in a city where the per capita income is 9,5 thousand dollars. What does the regression line equation forecast for per capita retail sales?

    .

    .

    Conclusions:

    .

    .

    .

    Task 36: The number of workers on an assembly line varies with the level of absenteeism on any given day. In a random sample of production output from several days of work, the following data were obtained:

    number of workers absent from assembly line    3     5     0    2     1
    number of defects coming off the line          16    20    9    12    10

    standard error

    On a day when four workers are absent from the assembly line, what would be the number of defects coming off the line?

    .

    .

    Conclusions:

    .

    .

    .


    Task 37: Having the data on the graph below, determine the influence of intelligence quotient (IQ) on the test scores. What can you say about the utility of the regression equation for making predictions ?

    .

    Conclusions:

    .

    .

    Task 38:

    What life expectancy can we expect for a cat in the eighth generation? .

    Conclusions:

    .

    [Figure for Task 37: SCORES plotted against IQ]

    [Figure for Task 38: cat generation influence on the life expectancy; axes: life expectancy (years), generation]


    Task 39: The correlation table is given:

    age \ number of jobs    1     2
    24-26                   10    10
    26-28                   10    20

    Check if there is a correlation between the variables age and number of jobs.

    .

    .

    .

    deviations

    table

    10 10

    10 20

    .

    .

    .

    Conclusions:

    .

    .


    Task 40: The correlation table is given:

    car age \ price ( 100s)    40-80    80-120
    2-4                        5        10
    4-6                        10       -

    Check if there is a correlation between the price and car age.

    .

    .

    .

    deviations

    table

    5 10

    10 -

    .

    .

    .

    Conclusions:

    .

    .


    Task 41: The table with data for the variables X and Y is given:

    X \ Y    0     2     4
    1        10    -     -
    2        -     10    -
    3        -     -     20
    4        10    -     -

    Check if there is a correlation between the variables X and Y, where: X - years of study, Y - number of failed exams.

    1 0

    2 2

    3 4

    4

    .

    .

    .

    deviations

    table


    .

    .

    .

    Conclusions:

    .

    .

    Task 42: The table gives the annual income value of Brase Company between 1995 and 2000.

    t    year    $ (millions)    index (1995 = 100)    index (1999 = 100)
    1    1995    46
    2    1996    47
    3    1997    48
    4    1998    48,5
    5    1999    49
    6    2000    48,2

    Determine all single-base indexes where 1995 = 100 and 1999 = 100.

    Give the interpretation of the value for the base 1999.
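A single-base (fixed-base) index divides every value by the value of the chosen base year and multiplies by 100. A short sketch; the helper name and the rounding to one decimal place are illustrative choices, not part of the task.

```python
years = [1995, 1996, 1997, 1998, 1999, 2000]
income = [46, 47, 48, 48.5, 49, 48.2]   # $ millions

def fixed_base_indexes(values, base_position):
    """Index of each value relative to the base value, with base = 100."""
    base = values[base_position]
    return [round(v / base * 100, 1) for v in values]

index_1995 = fixed_base_indexes(income, years.index(1995))
index_1999 = fixed_base_indexes(income, years.index(1999))

for year, i95, i99 in zip(years, index_1995, index_1999):
    print(f"{year}: 1995=100 -> {i95:6.1f}   1999=100 -> {i99:6.1f}")
```

An index of, say, 98 with base 1999 reads as "income was 2% lower than in 1999".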

    .

    .

    Conclusions:

    .

    .

    .


    Task 43: The time series below shows the price of the Super Sandwich between May and October 2010:

    t    month        price (PLN)
    1    May          9
    2    June         8
    3    July         10
    4    August       12
    5    September    12
    6    October      10

    Determine all chain indexes.

    Give the interpretation of the value .
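A chain index compares each period with the one immediately before it. A minimal sketch for the sandwich prices (variable names are illustrative):

```python
months = ["May", "June", "July", "August", "September", "October"]
price = [9, 8, 10, 12, 12, 10]   # PLN

# Chain index for period t: price in t divided by price in t - 1.
chain = [price[t] / price[t - 1] for t in range(1, len(price))]

for month, index in zip(months[1:], chain):
    print(f"{month}: {index:.3f} ({(index - 1) * 100:+.1f}% vs previous month)")
```

The first month has no chain index because there is no earlier period to compare with.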

    .

    Conclusions:

    .

    .

    Task 44: The time series below shows the price of the Super Sandwich between May and October 2010:

    t    month        price (PLN)
    1    May          9
    2    June         8
    3    July         10
    4    August       12
    5    September    12
    6    October      10

    Determine the average sandwich price in the given months.

    .

    .


    Task 45: The time series shows the plane ticket prices from Lodz to Paris:

    t    date (2008)    price (PLN)
    1    7 October      348
    2    8 October      380
    3    9 October      420
    4    10 October     399
    5    11 October     520
    6    12 October     490

    Determine the average plane ticket price from Lodz to Paris between 7 October and 12 October, 2008.

    .

    .

    .

    Task 46: The time series below gives us the value of money gathered by the celebrities for charity ($ millions) in the years 2002-2010:

    t    1       2       3       4       5      6      7       8       9
    Y    12,9    12,8    11,3    10,6    9,2    9,8    12,3    12,3    13,6

    Determine the average value of the variable Y between 2002 and 2010 and give the interpretations for the values:

    .

    .

    .

    .

    .

    .

    .


    .

    .

    Task 47: The dynamic indexes are given:

    (year)              i n/0    i n/n-1
    2005      245       1        ----
    2006      275       1,12     1,12
    2007      285       1
    2008      250       1

    Fill in all gaps by missing index numbers.

    Task 48: The time series below gives the cost of a weekly stay in the clinic (in thousand PLN):

    year    cost (thousand PLN)
    2006    8
    2007    10
    2008    12
    2009    12,5
    2010    14

    Determine the average growth rate of the cost of a weekly stay in the clinic.
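The average growth rate of a time series is normally taken from the geometric mean of the chain indexes, which reduces to the (n - 1)-th root of the last value over the first. A sketch under that assumption:

```python
cost = [8, 10, 12, 12.5, 14]   # thousand PLN, years 2006-2010

periods = len(cost) - 1        # number of year-to-year changes
# Geometric mean of chain indexes = (last / first) ** (1 / periods).
mean_chain_index = (cost[-1] / cost[0]) ** (1 / periods)
avg_growth_rate = mean_chain_index - 1

print(f"mean chain index = {mean_chain_index:.4f}")
print(f"average growth rate = {avg_growth_rate * 100:.1f}% per year")
```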

    .

    .

    .

    Conclusions:

    .

    .

    .


    Task 49: The price of laptops produced by the firm LoNg LiFe increased by 25% in 2012 in comparison with the year 2008. As a consequence, in the year 2012 the quantity of laptops sold decreased by 18,8 percent relative to the year 2008. Determine the percentage change in the production value for laptops produced by the LoNg LiFe company.

    .

    .

    Conclusions:

    .

    .

    Task 50: The value of coffee makers sold decreased by 3,9% in comparison with the previous month, but the price of coffee makers increased by 5,5%. How did the quantity of coffee makers sold change?

    .

    .

    Conclusions:

    .

    .

    Task 51: The value of goods sold by the company WIND increased by 12,2% in comparison with the previous quarter, but the prices of the goods decreased by 1,5%. How did the quantity of goods sold change?

    .
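Tasks 49 to 51 all rest on the identity value index = price index × quantity index. A minimal sketch that converts percentage changes to indexes and back (the two helper names are hypothetical):

```python
def pct_to_index(pct_change):
    """Convert a percentage change (e.g. -18.8) to an index (e.g. 0.812)."""
    return 1 + pct_change / 100

def index_to_pct(index):
    """Convert an index back to a percentage change."""
    return (index - 1) * 100

# Task 49: value index = price index * quantity index.
value_49 = pct_to_index(25) * pct_to_index(-18.8)
# Tasks 50 and 51: quantity index = value index / price index.
quantity_50 = pct_to_index(-3.9) / pct_to_index(5.5)
quantity_51 = pct_to_index(12.2) / pct_to_index(-1.5)

print(f"Task 49: value changed by {index_to_pct(value_49):+.1f}%")
print(f"Task 50: quantity changed by {index_to_pct(quantity_50):+.1f}%")
print(f"Task 51: quantity changed by {index_to_pct(quantity_51):+.1f}%")
```

Note that percentage changes cannot simply be added or subtracted; the indexes have to be multiplied or divided.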

    .

    Conclusions:

    .


    Task 52: Make a comprehensive analysis of the time series below:

    month                        March    April    May     June
    ice-cream sales (tonnes)     9,1      10,6     11,5    13,8

    Using the least-squares criterion, estimate the parameters of the proper trend function. What level of ice-cream sales can we expect in July?
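A sketch of the trend estimation, assuming a linear trend y = a + b·t with March coded as t = 1 (the task does not say which trend function is "proper", so the linear form is an assumption). July then corresponds to t = 5.

```python
sales = [9.1, 10.6, 11.5, 13.8]        # tonnes, March to June
t = list(range(1, len(sales) + 1))     # March = 1, ..., June = 4

n = len(t)
mt = sum(t) / n
my = sum(sales) / n
stt = sum((ti - mt) ** 2 for ti in t)
sty = sum((ti - mt) * (yi - my) for ti, yi in zip(t, sales))

b = sty / stt          # average monthly change in sales
a = my - b * mt        # trend level at t = 0
july_forecast = a + b * 5

print(f"trend: y = {a:.2f} + {b:.2f} t")
print(f"forecast for July (t = 5): {july_forecast:.1f} tonnes")
```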

    .

    .

    .

    .

    Task 53: Based on the data below, make a comprehensive analysis of the dynamics of price, quantity and value of all grocery products per capita in Poland in 2011 relative to the base year 2010.

    product        unit         price (PLN)         quantity
                                2010      2011      2010      2011
    eggs           carton       5,5       6,0       60        50
    butter         250 grams    3,9       3,9       70        65
    meat (pork)    kilogram     17        19        30        35
    milk           litre        1,8       2,3       120       140
    bread          loaf         2,2       2,0       150       150

    Do not forget to draw conclusions taking into account the Fisher ideal index.
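The aggregate analysis can be sketched with the Laspeyres, Paasche and Fisher formulas for both price and quantity, plus the value index. The variable and helper names below are illustrative; the figures are copied from the table with decimal commas written as points.

```python
from math import sqrt

# Order: eggs, butter, meat (pork), milk, bread.
p0 = [5.5, 3.9, 17, 1.8, 2.2]     # prices 2010 (PLN)
p1 = [6.0, 3.9, 19, 2.3, 2.0]     # prices 2011 (PLN)
q0 = [60, 70, 30, 120, 150]       # quantities 2010
q1 = [50, 65, 35, 140, 150]       # quantities 2011

def aggregate(prices, quantities):
    """Total value of the basket: sum of price * quantity."""
    return sum(p * q for p, q in zip(prices, quantities))

laspeyres_price = aggregate(p1, q0) / aggregate(p0, q0)   # base-year quantities
paasche_price = aggregate(p1, q1) / aggregate(p0, q1)     # current-year quantities
fisher_price = sqrt(laspeyres_price * paasche_price)

laspeyres_qty = aggregate(p0, q1) / aggregate(p0, q0)     # base-year prices
paasche_qty = aggregate(p1, q1) / aggregate(p1, q0)       # current-year prices
fisher_qty = sqrt(laspeyres_qty * paasche_qty)

value_index = aggregate(p1, q1) / aggregate(p0, q0)

print(f"price:    L = {laspeyres_price:.4f}, P = {paasche_price:.4f}, F = {fisher_price:.4f}")
print(f"quantity: L = {laspeyres_qty:.4f}, P = {paasche_qty:.4f}, F = {fisher_qty:.4f}")
print(f"value index = {value_index:.4f}")
```

A useful check is that the Fisher price index times the Fisher quantity index reproduces the value index.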

    .

    .

    .


    Task 54: The following table reports prices and usage quantities for two items in 2004 and 2006.

    Item    Quantity            Unit price ($)
            2004      2006      2004      2006
    A       1 500     1 800     7,5       7,75
    B       2         1         630       1 500

    Describe the price change for each item in 2006 using 2004 as the base period and the price change for the two items simultaneously. Use all possible index numbers. Interpret the Fisher ideal index.

    .

    .

    .

    .

    .

    .

    .

    Task 55: The probability distribution is given:

    Determine all the numerical parameters for the random variable described by the above function with the proper value of the parameter .

    .

    .

    Conclusions:

    .

    .


    Task 56: The probability distribution of X, the age of a randomly selected student, is given:

    x (age)    P(X = x)
    19         0,250
    20         0,375
    21         0,250
    27         0,125

    Express the mean age of the students in terms of the probability distribution of the random variable X.

    .

    .

    .

    .

    Task 57: A factory manager collected data on the number of equipment breakdowns per day. From those data, he derived the probability distribution shown in the table below. W denotes the number of breakdowns on a given day:

    W           0       1       2
    P(W = w)    0,80    0,15    0,05

    On average, how many breakdowns occur a day?

    What is the standard deviation value?
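For a discrete random variable the mean is the probability-weighted sum of the values, and the variance is the probability-weighted sum of squared deviations from that mean. A sketch covering this task and the age distribution of Task 56 (the helper name `mean_and_sd` is hypothetical):

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Expected value and standard deviation of a discrete random variable."""
    mu = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mu) ** 2 * p for v, p in zip(values, probs))
    return mu, sqrt(variance)

# Task 57: number of breakdowns per day.
mean_w, sd_w = mean_and_sd([0, 1, 2], [0.80, 0.15, 0.05])
# Task 56: age of a randomly selected student.
mean_age, sd_age = mean_and_sd([19, 20, 21, 27], [0.250, 0.375, 0.250, 0.125])

print(f"breakdowns: mean = {mean_w:.2f}, sd = {sd_w:.3f}")
print(f"age: mean = {mean_age:.3f} years, sd = {sd_age:.3f} years")
```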

    .

    .

    .