
Year 10 Summary Semester 2

Page 1 of 23

Year 10 Semester 2 Summary

Probability Summary

Set Notation

A set is a collection of elements usually of the same type;

We can represent sets in Venn diagrams.

The sample space, denoted by 𝜀 or 𝑈, is the set of all possible elements. This is called the universal set.

A null or empty set is a set with no elements and is denoted by ∅ or { }.

All elements common to A and B make up the intersection of sets A and B. This is denoted by 𝑨 ∩ 𝑩.

All elements in either A or B or both make up the union of A and B. This is denoted by 𝑨 ∪ 𝑩.

Two sets are said to be mutually exclusive if they have no elements in common. This means 𝑨 ∩ 𝑩 = ∅.

Sets A and B are mutually exclusive.

A’ is called the complement of set A. It is the set of elements that are not in set A.

A’ is shaded in the above diagrams.

𝒏(𝑨) is the number of elements in set A.
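The set notation above maps closely onto Python's built-in set type. The following is a minimal sketch only; the universal set U and the sets A, B and C are hypothetical values chosen for illustration and do not come from these notes.

```python
# Hypothetical universal set and example sets (illustrative values only).
U = set(range(1, 11))        # universal set: 1, 2, ..., 10
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {9, 10}

intersection = A & B         # A ∩ B: elements common to A and B
union = A | B                # A ∪ B: elements in A or B or both
complement_A = U - A         # A': elements of U that are not in A
n_A = len(A)                 # n(A): number of elements in A

# A and C have no elements in common, so A ∩ C is the empty set
# and the two sets are mutually exclusive.
mutually_exclusive = A.isdisjoint(C)

print(intersection, union, complement_A, n_A, mutually_exclusive)
```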



Probability

The probability of an event that must happen is 1.

The probability of an event that can never happen is 0.

The probability of any event lies between 0 and 1 inclusive.



Tree Diagrams



Trigonometry

The hypotenuse is the longest side in a right triangle and is opposite the right angle.



Bearings

Remember: bearings are measured from north in a clockwise direction.

Summary: Measurement

Conversion of Units

1 m = 100 cm = 1000 mm

1 km = 1000 m = 100000 cm = 1000000 mm

1 cm = 10 mm
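The conversions above can be checked with a small helper. This is a sketch under the assumption that every length is first expressed in millimetres; the function name convert_length and the dictionary MM_PER_UNIT are made up for this example.

```python
# Millimetres in one of each unit, taken from the conversions above.
MM_PER_UNIT = {"mm": 1, "cm": 10, "m": 1000, "km": 1_000_000}

def convert_length(value, from_unit, to_unit):
    """Convert a length between mm, cm, m and km by going through millimetres."""
    return value * MM_PER_UNIT[from_unit] / MM_PER_UNIT[to_unit]

print(convert_length(1, "km", "cm"))    # 1 km in centimetres
print(convert_length(2.5, "m", "mm"))   # 2.5 m in millimetres
```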

Perimeter Formula

The perimeter is the distance around the outside of a shape. The perimeter of a circle is called the circumference. The circumference is often denoted by C.



Area Formula

𝑨𝒓𝒆𝒂 = 𝝅𝒓² (area of a circle of radius 𝒓)

Surface Areas of Solids

The total surface area of a solid is the sum of the areas of its faces.

Example 1



Example 2

Surface area of a cylinder



Volumes of Solids

Capacity

The capacity of a container is the quantity of fluid (liquid or gas) it is capable of holding.

1 Litre (L) = 1000 millilitres (mL)

1 kilolitre (kL) = 1000 Litres

Capacity and Volume

1 mL = 1 cm³

1 litre = 1000 cm³

1 kL = 1 m³
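The link between volume and capacity can be illustrated with a short calculation. This sketch assumes a rectangular tank whose volume is length × width × height (a standard formula, not stated in these notes); the function name tank_capacity_litres is hypothetical.

```python
ML_PER_CM3 = 1       # 1 cm^3 holds 1 mL
ML_PER_LITRE = 1000  # 1 L = 1000 mL
LITRES_PER_KL = 1000 # 1 kL = 1000 L

def tank_capacity_litres(length_cm, width_cm, height_cm):
    """Capacity in litres of a rectangular tank with inside dimensions in cm."""
    volume_cm3 = length_cm * width_cm * height_cm   # assumed prism formula
    millilitres = volume_cm3 * ML_PER_CM3
    return millilitres / ML_PER_LITRE

print(tank_capacity_litres(50, 40, 30))             # a 50 cm x 40 cm x 30 cm tank

# A cube with 100 cm (1 m) sides has volume 1 m^3, which should hold 1 kL.
print(tank_capacity_litres(100, 100, 100) / LITRES_PER_KL)
```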



Statistics Summary

Representing Data

Data is often represented in frequency charts, column charts, histograms and dot plots.

Remember: the frequency of an observation is the number of times that observation occurs.

Example 1:

The following frequency distribution table gives the number of days of each weather type for the

month of January. Represent the information using a column chart.

[Column chart "Weather": Frequency (axis from 0 to 16) against Weather Type (Hot, Warm, Mild, Cool)]

Example 2



a.

Notice that in a histogram there are no gaps between the bars, and the numbers of visits are positioned at the centre of each bar.

Example 3

Represent the data in a histogram.

a.

Notice that the numbers are placed at the edges of the bars along the x-axis for grouped data.



Distribution of Data

Histograms give an indication of the distribution of data.

The most common distributions are:

Time Series Data

A line graph is often used to represent the change in data over time.

Example 4.

The approximate population of an outback town is recorded from 1990 to 2005 in the diagram below.

The graph indicates that the population was falling between 1990 and 1999. After that the trend is slightly upwards.



Types of Average

There are three types of average that can represent a set of data. An average is a measure of central tendency.

The Mean

The most common average is the mean.

x̄ is used to denote the mean.

x̄ = (sum of all scores) / (number of scores)

Example 5: The following data gives the number of pets kept in each of 10 different households.

3, 5, 4, 4, 2, 3, 0, 1, 4, 5

The mean number of pets is given by:

x̄ = (3 + 5 + 4 + 4 + 2 + 3 + 0 + 1 + 4 + 5) / 10 = 31 / 10 = 3.1

The mean is sometimes not the best average to use as it is affected by extreme scores or outliers.

The Median

The median of a set of scores is the middle score when the data are arranged in order of size.

The median’s position is given by the (n + 1)/2 th score, where n is the number of scores.

In Example 5 the median’s position is the (10 + 1)/2 th score. This is the 5.5th score, halfway between the 5th and 6th scores, after the scores have been arranged in order of size.

Arranging the data in order of size:

0, 1, 2, 3, 3, 4, 4, 4, 5, 5

Median number of pets is (3 + 4) / 2 = 3.5 (as there are two middle scores, we take their mean).

The median is not affected by extreme values or outliers.

The Mode

The mode of a group of scores is the score that occurs most often, that is, the score with the highest frequency.

In Example 5 the modal number of pets is 4. More than one mode is possible.
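The three averages for the pets data in Example 5 can be checked with Python's standard statistics module. This is a minimal sketch; multimode is used (rather than mode) because it returns every mode when more than one exists.

```python
from statistics import mean, median, multimode

pets = [3, 5, 4, 4, 2, 3, 0, 1, 4, 5]   # number of pets in each of 10 households

mean_pets = mean(pets)        # sum of all scores / number of scores
median_pets = median(pets)    # mean of the two middle scores, since n is even
modes_pets = multimode(pets)  # list of the most frequent score(s)

print(mean_pets, median_pets, modes_pets)
```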



Frequency Tables

Example 6

The table indicates that 6 students made 0 cinema visits, 7 students made 1 cinema visit, 4 students made 2 cinema visits, and so on.

The mean number of visits can be found by adding an extra column to the table and multiplying the number of visits by the frequency.

Number of visits (𝒙)   Frequency (𝒇)   𝒇 × 𝒙
0                      6               0
1                      7               7
2                      4               8
3                      2               6
4                      1               4
Total                  20              25

a. Mean number of visits = (total of f × x) / (total of f) = 25 / 20 = 1.25

b. The median number of visits can be found from the position of the median, as the numbers of visits are in order of size in the table.

The median’s position is the (n + 1)/2 th score = (20 + 1)/2 th = 10.5th position, halfway between the 10th and 11th scores. The first row covers the 1st to 6th scores and the second row covers the 7th to 13th scores, so the median’s position falls within the second row and the median is therefore 1.

c. The mode is the score with the highest frequency. The mode is 1.
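The frequency-table calculations in Example 6 can be reproduced in a few lines. This is a sketch only; the variable names are chosen for this example, and the median is found by expanding the table back into an ordered list of 20 scores.

```python
visits = [0, 1, 2, 3, 4]   # number of visits (x)
freq = [6, 7, 4, 2, 1]     # frequency (f)

total_f = sum(freq)                                   # total of f
total_fx = sum(f * x for x, f in zip(visits, freq))   # total of f × x
mean_visits = total_fx / total_f

# Expand the table into an ordered list of individual scores.
scores = [x for x, f in zip(visits, freq) for _ in range(f)]
n = len(scores)
if n % 2 == 0:
    # Even n: the median is halfway between the two middle scores.
    median_visits = (scores[n // 2 - 1] + scores[n // 2]) / 2
else:
    median_visits = scores[n // 2]

# The mode is the score with the highest frequency.
mode_visits = visits[freq.index(max(freq))]

print(mean_visits, median_visits, mode_visits)
```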

Alternatively, the mean and median can be found using Lists and Spreadsheet in the calculator.

1. Enter the data into Lists and Spreadsheet view.

2. Hit Menu, Statistics, Stat Calculations, One Variable Statistics…

3. Click OK when the Number of Lists prompt appears.

4. In the pop up, click in the X1 List box and select visits from the drop down list. Hit the Tab key to move to the next box and select freq from the drop down list in the Frequency List box. There is no need to enter data into the other boxes.

5. Click OK.

6. The statistical data appears. The mean is given by x̄ = 1.25. The median is 1. n is useful as it gives the frequency total.

Grouped Data

When data is presented in a frequency table within class intervals, and we do not know the actual

values within each class interval, we assume that all values are equal to the midpoint of the class

interval in order to find the mean.

Example 7:

The ages of a group of 30 people attending a superannuation seminar are recorded in the frequency table below. Calculate the mean age.

Age (Class Intervals)   Frequency
20 - 29                 1
30 - 39                 6
40 - 49                 13
50 - 59                 6
60 - 69                 3
70 - 79                 1
Total                   30

To find the mean age, assume all people in the class interval 20 - 29 are 24.5 years of age (this value is obtained by finding the midpoint of 20 - 29), all people in the class interval 30 - 39 are 34.5 years of age, and so on. The midpoints of the class intervals are entered in the table below:



Age (Class Intervals)   Frequency   Midpoint of Class Interval
20 - 29                 1           24.5
30 - 39                 6           34.5
40 - 49                 13          44.5
50 - 59                 6           54.5
60 - 69                 3           64.5
70 - 79                 1           74.5
Total                   30

The mean age can now be found using the calculator.

1. In Lists and Spreadsheet view, enter the data for the midpoints and the frequency into the first two columns. Label the columns as shown.

2. Press Menu, Statistics, Stat Calculations, One Variable Statistics.

3. Leave the Number of Lists as 1 and select OK.

4. In the pop up box, click in the X1 List box and select midpoint from the drop down list. Press the Tab key to move to the Frequency List box and select freq from the drop down list.

5. Press the Tab key to move to the OK button.

The mean x̄ = 46.833, so the mean age = 46.8 years.
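The grouped-data mean in Example 7 can also be checked without the calculator. A minimal sketch, assuming each class interval is stored as a (lowest age, highest age) pair; the variable names are chosen for this example.

```python
intervals = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]
freq = [1, 6, 13, 6, 3, 1]

# Every value in a class interval is assumed to equal the interval's midpoint.
midpoints = [(low + high) / 2 for low, high in intervals]

mean_age = sum(m * f for m, f in zip(midpoints, freq)) / sum(freq)

print(midpoints)
print(round(mean_age, 1))
```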



Measures of Variability or Spread

It is useful to be able to measure the spread or variability of the data. How dispersed is the data?

The Range

The simplest measure of spread is the range. The range is the difference between the smallest score

and the largest.

Example 7

The set of data 3, 5, 4, 4, 2, 3, 0, 1, 4, 5 (which gave the number of pets in each of 10 households) has a range of 5 − 0 = 5.
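As a quick check, the range of the pets data is the largest score minus the smallest. A minimal sketch:

```python
pets = [3, 5, 4, 4, 2, 3, 0, 1, 4, 5]

# Range = largest score - smallest score.
data_range = max(pets) - min(pets)
print(data_range)
```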

The Interquartile Range (IQR)

The lower quartile, 𝑄1, is 1/4 of the way through the set of data.

The upper quartile, 𝑄3, is 3/4 of the way through the set of data.

The 𝐼𝑄𝑅 = 𝑄3 − 𝑄1

Example 8



Stem and Leaf Plots

Example 9

The data below shows the weights in kg of 20 possums arranged in order of size:

0.7 0.9 1.1 1.4 1.5 1.6

1.7 1.7 1.8 1.8 1.9 2.0

2.1 2.1 2.2 2.3 2.3 2.5

3.0 3.2

We can represent this data in a stem and leaf plot as shown below:

Key: 0|7 = 0.7 kg

Stem | Leaf
0    | 7 9
1    | 1 4 5 6 7 7 8 8 9
2    | 0 1 1 2 3 3 5
3    | 0 2

In a stem and leaf plot the numbers are arranged in order of size. The key 0|7 = 0.7 kg means that stem 0 and leaf 7 represent 0.7 kg. You should always include a key in the stem and leaf plot.

When preparing a stem and leaf plot, keep the numbers in neat vertical columns, because a neat plot will show the distribution of the scores. It is like a sideways bar chart or histogram.
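The way each weight splits into a stem and a leaf can be sketched in code. This example assumes data with one decimal place, so the stem is the whole-number part and the leaf is the tenths digit; the function name stem_and_leaf is made up for illustration.

```python
from collections import defaultdict

# Possum weights in kg from Example 9.
weights = [0.7, 0.9, 1.1, 1.4, 1.5, 1.6, 1.7, 1.7, 1.8, 1.8,
           1.9, 2.0, 2.1, 2.1, 2.2, 2.3, 2.3, 2.5, 3.0, 3.2]

def stem_and_leaf(data):
    """Group one-decimal-place values into {stem: [leaves]}, in order of size."""
    plot = defaultdict(list)
    for value in sorted(data):
        tenths = round(value * 10)        # e.g. 1.4 -> 14 (round avoids float error)
        stem, leaf = divmod(tenths, 10)   # 14 -> stem 1, leaf 4
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(weights)

print("Key: 0|7 = 0.7 kg")
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```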

The interquartile range can be found from the stem and leaf plot.

1. Find the median weight. The median weight Q2 is the (20 + 1)/2 th score, i.e. the 10.5th score. The median lies between the 10th and 11th scores. Count through the data to find the position of the median. It can be seen from the plot that the median lies between 1.8 and 1.9. The median weight is (1.8 + 1.9)/2 = 1.85 kg.

2. The lower quartile Q1 will be the (10 + 1)/2 th score in the lower half, i.e. the 5.5th score in the lower half. Count through the data to find the position of the lower quartile. Q1 = (1.5 + 1.6)/2 = 1.55 kg.

3. The upper quartile Q3 will be the 5.5th score in the upper half of the plot. Count through the data to find the position of the upper quartile. Q3 = (2.2 + 2.3)/2 = 2.25 kg.

4. The interquartile range = Q3 − Q1 = 2.25 − 1.55 = 0.7 kg.
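The steps above (find the median, then the median of each half) can be sketched as a short function. For an odd number of scores this sketch leaves the middle score out of both halves; that choice is an assumption, since the examples here only use an even number of scores. The function name quartiles is hypothetical.

```python
from statistics import median

# Possum weights in kg from Example 9.
weights = [0.7, 0.9, 1.1, 1.4, 1.5, 1.6, 1.7, 1.7, 1.8, 1.8,
           1.9, 2.0, 2.1, 2.1, 2.2, 2.3, 2.3, 2.5, 3.0, 3.2]

def quartiles(data):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as the medians of each half."""
    scores = sorted(data)
    n = len(scores)
    half = n // 2
    lower_half = scores[:half]        # scores below the median position
    upper_half = scores[n - half:]    # scores above the median position
    return median(lower_half), median(scores), median(upper_half)

q1, q2, q3 = quartiles(weights)
iqr = q3 - q1

print(round(q1, 2), round(q2, 2), round(q3, 2), round(iqr, 2))
```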



5. See diagram below:

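Key: 0|7 = 0.7 kg

Stem | Leaf
0    | 7 9
1    | 1 4 5 6 7 7 8 8 9
2    | 0 1 1 2 3 3 5
3    | 0 2

In stem 1, Q1 lies between leaves 5 and 6 and the median Q2 lies between the last leaf 8 and leaf 9. In stem 2, Q3 lies between leaf 2 and the first leaf 3.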

Example 10

Find the interquartile range of the data presented in the following stem and leaf plot.

Key: 15|4 = 154

Stem | Leaf
15   | 4 8 8
16   | 1 3 3 6 8
17   | 0 0 1 4 7 9 9 9
18   | 1 2 3 3 5 7 8 8 9
19   | 2 7 8
20   | 0 2

The median is the (30 + 1)/2 th score, i.e. the 15.5th score, which lies between 179 and 179. So the median is 179.

The lower quartile Q1 will be the (15 + 1)/2 th score in the lower half, i.e. the 8th score in the lower half. Q1 = 168.

The upper quartile Q3 will be the 8th score in the upper half of the data, i.e. 188.

The interquartile range = Q3 − Q1 = 188 − 168 = 20.

See the diagram above.

Using CAS.

You could check your answers by entering the data into your CAS calculator to determine the

median, lower and upper quartiles.




Boxplots

Five-number summary

A five-number summary is a list consisting of the lowest score (Xmin), lower quartile (Q1), median (Q2), upper quartile (Q3) and the greatest score (Xmax) of a set of data.

A five-number summary gives information about the spread or variability of a set of data.
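A five-number summary can be assembled from the sorted data. The sketch below uses the 30 scores read from the stem and leaf plot in Example 10 and takes each quartile as the median of its half of the data; the function name five_number_summary is made up for this example.

```python
from statistics import median

# Scores read from the stem and leaf plot in Example 10 (key: 15|4 = 154).
scores = [154, 158, 158, 161, 163, 163, 166, 168,
          170, 170, 171, 174, 177, 179, 179, 179,
          181, 182, 183, 183, 185, 187, 188, 188, 189,
          192, 197, 198, 200, 202]

def five_number_summary(data):
    """Return (Xmin, Q1, Q2, Q3, Xmax), with quartiles as medians of each half."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])                   # median of the lower half
    q3 = median(ordered[len(ordered) - half:])    # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]

xmin, q1, q2, q3, xmax = five_number_summary(scores)
print(xmin, q1, q2, q3, xmax)
```

In a box plot of this data the whiskers would run from Xmin to Xmax and the box from Q1 to Q3.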

Box Plots

A box plot is a graph of the 5-number summary. It is a powerful way of showing the spread of data. A

box plot consists of a central divided box with attached “whiskers”. The box spans the interquartile

range. The median is marked by a vertical line inside the box. The whiskers indicate the range of

scores. Box plots are always drawn to scale and a scale is often attached.

Interpreting a Boxplot

A boxplot divides the data into four sections. 25% of the scores lie between the lowest score and the

lower quartile, 25% between the lower quartile and the median, 25% between the median and the

upper quartile and 25% between the upper quartile and the greatest score.

Extreme Values

Extreme values often make the whiskers appear longer than they should and hence give the appearance that the data is spread over a much greater range than it really is. If an extreme value occurs in a set of data, it can be denoted by a small cross on the boxplot. The whisker is then shortened to the next largest or smallest score.



Comparing Sets of Data

Back to Back Stem and Leaf Plots

Two sets of data can be compared using back to back stem and leaf plots. The data below shows the lifetimes, in hours, of a sample of 40 batteries from each of two brands when fitted into a child’s toy. Some of the toys are fitted with an ordinary battery and some with Brand X. Which battery is better?

Key: 6|9 = 69 hours

Ordinary Brand Leaf | Stem | Brand X Leaf
          8 6 2 0 0 |   6  | 9
    9 9 9 8 8 6 4 0 |   7  | 3 5 8
    8 7 5 3 1 1 1 0 |   8  | 2 4 8
  9 6 6 4 2 2 2 0 0 |   9  | 0 1 4 5 5
    9 8 7 5 3 1 1 1 |  10  | 0 0 2 5 8 8 9 9
                4 2 |  11  | 0 0 1 1 3 3 6 7 9
                    |  12  | 1 4 6 6 6 7 8 8
                    |  13  | 3 5
                    |  14  | 6

The spread of each set of data can be seen graphically from the stem and leaf plot. It can be seen that although Brand X showed a little more variability than the ordinary brand, its batteries generally lasted longer.

Parallel Box Plots

The above data can also be compared by using parallel boxplots. The boxplots share a

common scale. Quantitative comparisons can be made between the sets of data.

The 5-Number Summaries of both types of batteries are given below. You can work them

out from the stem plots or by using your calculator.

Brand            Xmin   Lower Quartile Q1   Median Q2   Upper Quartile Q3   Xmax
Brand X          69     95                  109.5       122.5               146
Ordinary Brand   60     78.5                87.5        97.5                114

The following parallel boxplots can be drawn to compare the data.

[Parallel boxplots: Brand X and Ordinary Brand drawn on a common scale, Time in Hours, from 50 to 150]



From the box plots it can be seen that:

1. Brand X showed more variability in its performance than the ordinary brand. Brand X range = 77, ordinary brand range = 54. Brand X interquartile range = 27.5 and ordinary brand interquartile range = 19.0.

2. The longest lifetime recorded was that of a Brand X battery, at 146 hours.

3. The shortest lifetime recorded was that of an ordinary battery, at 60 hours.

4. The Brand X median lifetime (109.5 hours) was better than that of the ordinary battery (87.5 hours).

5. Over one quarter of Brand X batteries were better performers than the best ordinary brand battery (that is, had longer lifetimes than the longest of the ordinary brand batteries’ lifetimes).
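The comparisons in this list follow directly from the two five-number summaries. A minimal sketch, using the summary values exactly as given in these notes; the dictionary layout and variable names are chosen for this example.

```python
# Five-number summaries (Xmin, Q1, Q2, Q3, Xmax) as given in these notes.
summaries = {
    "Brand X": (69, 95, 109.5, 122.5, 146),
    "Ordinary Brand": (60, 78.5, 87.5, 97.5, 114),
}

spread = {}
for brand, (xmin, q1, q2, q3, xmax) in summaries.items():
    # Range = Xmax - Xmin, interquartile range = Q3 - Q1.
    spread[brand] = {"range": xmax - xmin, "iqr": q3 - q1, "median": q2}
    print(brand, spread[brand])

# Point 5: Brand X's upper quartile is greater than the ordinary brand's
# greatest score, so the top quarter of Brand X batteries all outlasted
# every ordinary battery.
brand_x_q3 = summaries["Brand X"][3]
ordinary_max = summaries["Ordinary Brand"][4]
top_quarter_beats_all = brand_x_q3 > ordinary_max
print(top_quarter_beats_all)
```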

Bivariate Data and Scatterplots

Dependent and Independent Variables

In a relationship involving two variables, if the values of one variable “depend” on the values of another variable, then the former variable is referred to as the dependent variable and the latter variable is referred to as the independent variable. When a relationship between two variables is being examined, it is important to know which one of the two variables depends on the other. Most often we can make a judgement about this, although sometimes it may not be possible.

For example, in the case where the ages of company employees are compared with their annual

salaries, you might reasonably expect that the annual salary of an employee would depend on the

person’s age. In this case, the age of the employee is the independent variable and the salary of the

employee is the dependent variable.

We always place the independent variable on the x-axis and the dependent variable on the y-axis in a scatterplot.



Scatterplots

Example

There is a moderate, negative linear relationship or correlation between the two variables.



Line of Best Fit

We can draw a line of best fit by eye to represent the points in a scatterplot.
