Lecture 1 Notes - Asal Aslemandasalaslemand.weebly.com/uploads/3/1/3/1/31310805/... · Statistics Statistics is the science of data. This involves collecting, classifying, analyzing,

Lecture 1 Notes

➢ Chapter 1. Stats Starts Here

➢ Chapter 2. Displaying and Describing Categorical Data

Data

▪ We can make sense of the world by making sense of data.

▪ Data is plural. Datum is singular.

▪ Data are values along with their context.

▪ Data are any collection of numbers, characters, images, or other items that provides information about something.

▪ Data help us see the underlying truth and pattern.

▪ Nowadays, data mostly come in spreadsheet form (e.g., Excel).

▪ Data are presented in a table (rows and columns) like the example below.

Data Vary

For example, ask Canadians: to what extent do you agree or disagree with the statement “Learning new things is fun”?

• Let’s find out how Canadians responded to this statement in 2008.

• Source: Access and Support to Education and Training Survey, 2008.

Example of Canadian Data Set

CHASS: Computing in the Humanities and Social Sciences, Faculty of Arts & Science, University of Toronto

1. Copy and paste the link below into a new tab in your internet browser: http://www.chass.utoronto.ca/

2. From the left-side menu, click on Data Centre > U. of T. users.

3. Click on SDA@CHASS (SDA: Survey Documentation and Analysis).

4. Click on Continue in English.

5. Click on Access and Support to Education and Training Survey, 2008 (ASETS).

6. Click on Data.

7. Click on Codebooks > SDA codebooks.

8. Click on Sequential Variable List.

9. Click on Attitudes Towards Learning.

10. Click on item: al_g02 Learning new things is fun.

What do you think about Canadians’ responses to this item? Do you think they all agreed, or only some of them?

Do you expect to obtain the same answers (responses) from a different selection of Canadians in 2008?

Do you expect to obtain the same responses from the same selected Canadians in 2018?

Statistics

Statistics is the science of data.

This involves collecting, classifying, analyzing, presenting, interpreting and communicating numerical information.

Statistics helps us:

• make sense of the world in everyday life by seeing past the underlying variation to find patterns and relationships (e.g., in health, politics, economics, education, environment, and social issues);

• become informed citizens by giving us the tools to understand, question, and interpret data;

• understand articles published in research journals and reports from government agencies and private industries.

Role of Statistics

Statistics has important roles in answering real questions like the following:

• How do we assess the risk of genetically engineered foods being considered by the Canadian Food Inspection Agency?

• How do we assess the safety and effectiveness of new drugs submitted to Health Canada for approval?

• How do we determine whether vitamin C really prevents illness?

• Which factors have the greatest impact on student performance in school?

• Which factors affect the quality of people’s health care?

• Which factors affect people’s decision to retire?

Improving Human Welfare in 2013 International Year of Statistics:

http://www.worldofstatistics.org/about-us/

Elements of Statistics

• An individual case (experimental unit) is an object about which we collect data.

Example: student, animal, transaction, event

• The cases are usually a sample selected from some larger population that we would like to understand.

• A population is the set of all individual cases that we are interested in studying.

Example: All students at University of Toronto

• A sample is a subset of the individual cases (units) of a population.

Example: All students in this class

• A representative sample exhibits characteristics typical of those possessed by the population.

• It is a kind of snapshot image of the larger world.

• The most common way to satisfy the representative sample requirement is to select a random sample, as it ensures that every subset of a fixed size in the population has the same chance of being selected.

Example: If we want to understand students’ experience at U of T, we need to randomly select students from the entire population of U of T students.
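Drawing a simple random sample can be sketched in a few lines of Python. This is only an illustration: the population below is a hypothetical list of ID numbers, and the population size (10,000), sample size (100), and seed are assumptions, not figures from the course.

```python
import random

# Hypothetical population: ID numbers standing in for every student at a university.
# The size of 10,000 is an assumption made only for this illustration.
population = list(range(1, 10001))

# A seeded generator makes the sketch repeatable; the seed value is arbitrary.
rng = random.Random(2024)

# Simple random sample without replacement: every subset of 100 IDs
# has the same chance of being selected.
sample = rng.sample(population, k=100)

print(len(sample), "students selected; first five IDs:", sample[:5])
```

Because the sampling is without replacement, no student can appear in the sample twice.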

• Variation is the of statistics.

• Variation is the foundation of sound reasoning about the data.

• Statistical methods help explain the variation in the data; we model the variation in the data.

• A variable is a characteristic of an individual case (experimental or observational unit) in the population.

• A variable can take different values on different cases.

Example: An undergraduate student database.

Individual cases: Students of the university.

Variables: Gender, GPA, Program of Study, Year of Study, etc.

Context of Data Answers the Five W’s

When you plan a statistical study or explore data from someone else’s work, ask yourself the following questions:

• Who will be the cases in my study? How many individuals will be in my study?

• Why conduct this study? What purpose do the data have? Do I hope to answer some specific questions? Do I want to draw conclusions about individuals other than the ones I actually have data for? Are my data reliable?

• What? How many variables do the data contain? What are the exact definitions of these variables? In what unit of measurement is each variable recorded?

• When is an appropriate time to conduct my study?

• Where can I conduct my study?

In addition to the five W’s:

• How can I conduct my study? (e.g., use an instrument such as a validated survey)

Asal’s Example

I study students’ attitudes about statistics.

• Who: Undergraduate students

• Why: By understanding attitudes about statistics I aim to improve teaching and learning of statistics

• What: Students’ attitudes, their prior mathematics related experiences and achievement, their gender, their program of study, their year of study, and their statistics course outcome.

• Where: University of Toronto

• When: At the beginning and at the end of an introductory statistics course

• How: By administering the Survey of Attitude Towards Statistics (SATS-36©) and linking students’ responses to students’ repository records from the Office of the Registrar

At the time of my study in 2016, I was the instructor for the course under study. I had a research assistant who administered the survey at both times and collected the data. She assigned a participant number to each student who participated in the study. The participant number is an “Identifier Variable”, which identifies individual cases. It was not included in the data analysis, but it helped to match students’ information regarding their attitudes toward statistics with their program and year of study.

Types of Data

Classify Variables: Quantitative or Categorical

Quantitative variable:

• The measurement scale has numerical values that describe the amount of something.

• These variables must be accompanied by their units of measurement.

E.g., University GPA (ranges from 1.0 to 4.0)

E.g., Hours of study (0 to infinity!?)

• They may also arise from the process of counting.

E.g., The number of residents in the province of Ontario.

E.g., The number of siblings a person has.

Categorical variable:

• The measurement scale is a set of categories.

• It determines which group or category individuals (cases) belong to.

• Counting is a natural way to summarize and learn about a categorical variable.

• Often called qualitative variables: distinct categories differ in their qualities, not in their numerical magnitude.

E.g., Program of Study: Environmental Sciences, Life Sciences, Social Science, and so on.

E.g., Canadian Provinces: Ontario, British Columbia, Alberta, and so on.

Why Classify Variables as Quantitative or Categorical?

• For application of different statistical methods.

• For obtaining appropriate graphs and summary statistics.

Example of a Quantitative Variable:

• Income of Canadian Citizens (in thousands of dollars).

We might be interested in average income of all Canadian Citizens.

Graphical Display: Histogram or Boxplot of distribution of income.

Example of a Categorical Variable:

• Canadian Provinces.

We might be interested in the number of Canadians living in each province (Count).

Graphical Display: Bar chart or Pie Chart

Convert a Quantitative Variable to a Categorical Variable

Simply break up the range of values into several intervals.

Example: Age

Distribution of on- and off-reserve First Nations people (single identity), by age group, 2011

Source: http://www.statcan.gc.ca/pub/89-653-x/2016010/tbl/tbl01-eng.htm
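Breaking a range of values into intervals can be sketched as follows. The cut points and labels are assumptions chosen for illustration and are not necessarily the age groups used in the Statistics Canada table; the ages are made up.

```python
from bisect import bisect_right

# Assumed cut points: each value is the lower bound of the next interval.
CUT_POINTS = [15, 25, 65]
LABELS = ["0 to 14", "15 to 24", "25 to 64", "65 and over"]

def age_group(age):
    """Return the label of the interval that contains a non-negative age."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many cut points are <= age,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(CUT_POINTS, age)]

ages = [3, 14, 15, 24, 40, 65, 90]  # made-up ages, for illustration only
groups = [age_group(a) for a in ages]

for a, g in zip(ages, groups):
    print(a, "->", g)
```

Once ages are replaced by their groups, the variable is categorical (ordinal) and can be summarized with counts and percentages.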

Scales of Measurement

• Interval Scales:

For quantitative variables, equal intervals represent equal distances.

Example: Annual income (in thousands of dollars).

• The interval (distance) between $30,000 and $40,000 is $10,000.

• Purpose: We can compare outcomes by how much larger or how much smaller one is than the other (e.g., deciding which interval an annual income falls into).

• Nominal Scales:

For categorical variables no level (category) is greater or smaller than any other level (category).

Example: Primary mode of transportation to school.

• Categories: automobile, bus, subway, bicycle, walk.

Scales of Measurement: Ordinal Scales

• An ordinal variable falls between the nominal and interval scales.

• It consists of categories that have a natural ordering.

• The levels form an ordinal scale.

Example: Social class

• Categorical scale: upper, middle, lower.

Example: Political philosophy

• Categorical scale: very liberal, moderately liberal, slightly liberal, slightly conservative, moderately conservative, very conservative.

Quantitative Aspects of Ordinal Scales

• The position of ordinal scales on the quantitative-qualitative (categorical) classification is fuzzy.

• Often, the methods used for their statistical analysis are the same as those for nominal (categorical) variables.

• In some cases, they closely resemble interval scales for quantitative variables.

• Each level has a greater or smaller magnitude than another level.

• We can conduct a sensitivity analysis and check whether conclusions would differ in any significant way for other choices of scores.

• Example: Survey of Attitude Towards Statistics (SATS-36©):

• SATS-36© items are ordinal (e.g., from strongly disagree to strongly agree).

• We might want to treat them as a quantitative variable (1, 2, 3, 4, 5, 6, 7; interval scale: distance is 1) to compute a mean score for an item (e.g., I will like statistics).
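A sensitivity analysis of this kind can be sketched by scoring the same ordinal responses in two ways and checking whether the conclusion changes. Everything below is hypothetical: the responses are made up, and the alternative "stretched" scoring is just one assumed choice of scores, not part of SATS-36©.

```python
from statistics import mean

# Hypothetical responses to one 7-point item
# (1 = strongly disagree, 4 = neutral, 7 = strongly agree).
responses = [5, 6, 7, 4, 6, 7, 5, 6]

# Usual choice: equally spaced scores, distance 1 between levels.
equal_spacing = {level: level for level in range(1, 8)}

# Assumed alternative: the two extreme levels are pushed further from neutral.
stretched = {1: 0, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 8}

def mean_score(values, scores):
    """Mean of the responses after mapping each level to its score."""
    return mean(scores[v] for v in values)

NEUTRAL = 4
mean_equal = mean_score(responses, equal_spacing)
mean_stretched = mean_score(responses, stretched)

# The conclusion of interest: is the average response above neutral?
same_conclusion = (mean_equal > equal_spacing[NEUTRAL]) == (
    mean_stretched > stretched[NEUTRAL]
)

print("Mean with equal spacing:", mean_equal)
print("Mean with stretched scores:", mean_stretched)
print("Same conclusion under both scorings:", same_conclusion)
```

If the conclusion holds under several reasonable scorings, treating the ordinal item as quantitative is less of a concern.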

Asal’s Example

I obtained a statistic (average) of 5.88 for students’ reported willingness to spend a great deal of effort to learn statistics, on a 7-point Likert scale (“1” indicates strong disagreement, “4” neutral, and “7” strong agreement).

Since this value of 5.88 is above 4 (the neutral response), I described that, on average, students reported a great deal of effort to learn statistics in their course.

Variables: Discrete or Continuous

Discrete Variables:

• Any variable with a finite (countable) number of possible values is discrete.

Examples:

• number of siblings for a person.

• number of people living in Ontario.

• ALL categorical variables (nominal or ordinal) are discrete, having a finite set of categories.

• Categories/levels are pre-determined for a categorical variable.

• Example: Social Class (Upper, Middle, Lower)

Continuous Variables:

• Any variable with an infinite continuum (no ending number) of possible real-number values (e.g., a number with decimal points) is continuous.

Examples:

• Time (in minutes) it takes to finish reading a book.

• Age of a person.

In summary:

Quantitative Variables:

• Have interval scales.

• Could be either continuous (e.g., age) or discrete (e.g., number of times dined at l'espresso bar mercurio in the month of April).

Categorical Variables (always discrete):

• Nominal scales (e.g., mode of transportation to school: automobile, bus, subway, bicycle, walk) are always discrete.

• Ordinal scales (e.g., university letter grades: A, B, C, D) are always discrete.

Summarizing and Describing a Single Categorical Variable

• Recall our earlier example, responses to the ASETS (2008) survey item: Learning new things is fun.

Example: Learning new things is fun: A categorical variable.

Frequency Table:

• Count the number of cases corresponding to each category and put them into a table.

• A frequency table records the totals and uses the category names to label each row.

The table on the right describes the distribution of Canadian responses to the statement “Learning new things is fun”, because it names the possible categories and tells how frequently each occurs (how cases are distributed across the categories).

Example: 15,712 participants strongly agreed with the statement.

Relative Frequency:

• Divide the count by the total number of cases. This gives the fraction (proportion) of the whole.

Example: 15712/23519 ≅ 0.668

• Multiply the proportions by 100 to obtain the percentages.

Example: 0.668 x 100 = 66.8%

The majority (66.8%) of the respondents strongly agreed with the statement.
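The relative-frequency calculation can be sketched as below. Only the two figures quoted on the slide are used (15,712 strongly agreed out of 23,519 respondents); the remaining categories are lumped into a single "All other responses" row obtained by subtraction, which is an assumption made to keep the example self-contained.

```python
# Frequency table: category name -> count.
# "All other responses" is the total minus the quoted count, not a real survey category.
counts = {
    "Strongly agree": 15712,
    "All other responses": 23519 - 15712,
}

total = sum(counts.values())

# Relative frequency: each count divided by the total number of cases.
proportions = {category: n / total for category, n in counts.items()}

# Percentages: proportions multiplied by 100, rounded to one decimal place.
percentages = {category: round(100 * p, 1) for category, p in proportions.items()}

for category in counts:
    print(f"{category}: {counts[category]} cases, {percentages[category]}%")
```

The proportions always add up to 1, which is a quick check that no category was missed.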

Bar Chart:

• Displays the distribution of a categorical variable.

• Shows the frequency (count) for each category next to each other for easy comparison.

• The height of each bar shows the count for its category.

• It is better to have spaces between bars to indicate that these are freestanding bars that could be arranged in any order.

• The bars are the same width, so their heights determine their areas.

• These areas are proportional to the counts in each category.

Note: A bar chart stays true to the Area Principle.

Area Principle:

The area occupied by a part of the graph should correspond to the magnitude of the value it represents.

Example: Learning new things is fun: A categorical variable.

Pie Chart:

• Displays the whole group of cases as a circle.

• It slices the circle into pieces whose sizes are proportional to the fraction of the whole in each category.

The majority (66.8%) of the respondents strongly agreed with the statement.

Example: Learning new things is fun: A categorical variable.

Exploring Relationships Between Two Categorical Variables

Contingency Tables:

• Classification with respect to two categorical variables.

• It helps determine whether two categorical variables are related (associated, dependent).

• Idea: Arrange the counts in a two-way table.

Example: A question on the General Social Survey (2009) on Victimization asked a random sample of 9689 Canadians about their opinion regarding Canadian criminal courts: “Are they doing a good job, an average job or a poor job of determining whether the accused or the person charged is guilty or not?”.

The data are summarized in the two-way table below.

Opinion Regarding Criminal Courts at Sentencing, by Sex

Sex       Good   Average   Poor   Total
Male      1664   2237       705    4606
Female    1479   2794       810    5083
Total     3143   5031      1515    9689

• The table above is called a 2 x 3 (read as “2-by-3”) contingency table (two rows and three columns), because it shows how the individuals are distributed along each variable, contingent on the value of the other variable.

• Subjects are classified by both their sex and their opinion regarding Canadian criminal courts at sentencing.

• Each cell of the table gives the count for a combination of values of the two variables.

Example: 1664 represents the number of respondents who are male and think that the Canadian criminal courts are doing a good job at sentencing.

Read Data in R

Bar Plot of Sentencing Opinion by Sex

• Women are more likely than men to think that the criminal court is doing an average job at sentencing.

• There is not much of a difference between the sexes in the likelihood of thinking that the criminal court is doing a poor job at sentencing.

Finding Marginal Distribution

The margins of the table, on the right and at the bottom, give the totals.

Marginal Distribution of Sex:

• The percentage of respondents who are male: (4606/9689) x 100 ≅ 47.54%

• The percentage of respondents who are female: (5083/9689) x 100 ≅ 52.46%

Note: the two proportions add up to 1.

Marginal Distribution of Opinion about Criminal Court:

• The percentage of respondents who think that Canadian criminal courts are doing a good job at sentencing: (3143/9689) x 100 ≅ 32.44%

• The percentage of respondents who think that Canadian criminal courts are doing an average job at sentencing: (5031/9689) x 100 ≅ 51.92%

• The percentage of respondents who think that Canadian criminal courts are doing a poor job at sentencing: (1515/9689) x 100 ≅ 15.64%

Note: the three proportions add up to 1.
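The margins and marginal distributions can be computed from the six cell counts alone. The sketch below stores the contingency table as a nested dictionary; the variable names are illustrative choices, and only the counts come from the table.

```python
# Cell counts from the contingency table: sex -> opinion -> count.
table = {
    "Male": {"Good": 1664, "Average": 2237, "Poor": 705},
    "Female": {"Good": 1479, "Average": 2794, "Poor": 810},
}
opinions = ["Good", "Average", "Poor"]

# Margins: row totals (right margin), column totals (bottom margin), grand total.
row_totals = {sex: sum(row.values()) for sex, row in table.items()}
col_totals = {op: sum(table[sex][op] for sex in table) for op in opinions}
grand_total = sum(row_totals.values())

# Marginal distributions: each margin total divided by the grand total.
marginal_sex = {sex: n / grand_total for sex, n in row_totals.items()}
marginal_opinion = {op: n / grand_total for op, n in col_totals.items()}

for sex, p in marginal_sex.items():
    print(f"{sex}: {100 * p:.2f}%")
for op, p in marginal_opinion.items():
    print(f"{op}: {100 * p:.2f}%")
```

Recomputing the margins from the cells is also a useful check that the totals printed in a table are consistent.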

Finding Joint Distribution: Overall Percentages

The joint distribution of two variables gives the percentage of all cases that belong to each combination of row and column categories.

Example: The percentage of respondents who are male and think that Canadian criminal courts are doing a good job at sentencing: (1664/9689) x 100 ≅ 17.17%

Note: the six proportions add up to 1.
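A minimal sketch of the joint distribution: each of the six cell counts is divided by the grand total. The dictionary layout and names are illustrative; the counts are those of the contingency table.

```python
# Cell counts from the contingency table: sex -> opinion -> count.
table = {
    "Male": {"Good": 1664, "Average": 2237, "Poor": 705},
    "Female": {"Good": 1479, "Average": 2794, "Poor": 810},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Joint distribution: one proportion per (sex, opinion) combination.
joint = {
    (sex, opinion): count / grand_total
    for sex, row in table.items()
    for opinion, count in row.items()
}

for (sex, opinion), p in joint.items():
    print(f"{sex}, {opinion}: {100 * p:.2f}%")
```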

Finding Conditional Distributions: Column Percentages

Describe the conditional distribution of sex for those who think that the criminal court is doing a good job at sentencing.

• The percentage of males among those who think that Canadian criminal courts are doing a good job at sentencing: (1664/3143) x 100 ≅ 52.94%

• The percentage of females among those who think that Canadian criminal courts are doing a good job at sentencing: (1479/3143) x 100 ≅ 47.06%

Note: the column proportions add up to 1.
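Column percentages condition on one opinion category and divide by that column's total. The helper name `column_distribution` is a hypothetical choice for this sketch; the counts are those of the contingency table.

```python
# Cell counts from the contingency table: sex -> opinion -> count.
table = {
    "Male": {"Good": 1664, "Average": 2237, "Poor": 705},
    "Female": {"Good": 1479, "Average": 2794, "Poor": 810},
}

def column_distribution(counts, opinion):
    """Conditional distribution of sex among respondents holding one opinion."""
    column_total = sum(row[opinion] for row in counts.values())
    return {sex: row[opinion] / column_total for sex, row in counts.items()}

sex_given_good = column_distribution(table, "Good")

for sex, p in sex_given_good.items():
    print(f"{sex} among 'Good': {100 * p:.2f}%")
```

Calling the same helper with "Average" or "Poor" gives the other two column distributions.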

Looking for Associations Between Two Variables

1. Describe the conditional distribution of opinion about criminal court at sentencing for males (Row Percentages).

The percentage of male respondents who think that the Canadian criminal courts are doing:

• a good job at sentencing is (1664/4606) x 100 ≅ 36.13%

• an average job at sentencing is (2237/4606) x 100 ≅ 48.57%

• a poor job at sentencing is (705/4606) x 100 ≅ 15.31%

Note: the row proportions add up to 1.

Looking for Associations Between Two Variables

2. Describe the conditional distribution of opinion about criminal court at sentencing for females (Row Percentages).

The percentage of female respondents who think that the Canadian criminal courts are doing:

• a good job at sentencing is (1479/5083) x 100 ≅ 29.10%

• an average job at sentencing is (2794/5083) x 100 ≅ 54.97%

• a poor job at sentencing is (810/5083) x 100 ≅ 15.94%
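Row percentages for both sexes can be computed side by side, which makes the comparison between the two conditional distributions explicit. The helper name `row_distribution` is a hypothetical choice for this sketch; the counts are those of the contingency table.

```python
# Cell counts from the contingency table: sex -> opinion -> count.
table = {
    "Male": {"Good": 1664, "Average": 2237, "Poor": 705},
    "Female": {"Good": 1479, "Average": 2794, "Poor": 810},
}

def row_distribution(counts, sex):
    """Conditional distribution of opinion for one sex, in percent."""
    row_total = sum(counts[sex].values())
    return {op: 100 * n / row_total for op, n in counts[sex].items()}

male = row_distribution(table, "Male")
female = row_distribution(table, "Female")

# Difference (female minus male) in percentage points for each opinion.
difference = {op: female[op] - male[op] for op in male}

for op in male:
    print(f"{op}: male {male[op]:.2f}%, female {female[op]:.2f}%, "
          f"difference {difference[op]:+.2f} points")
```

Large differences between the two rows suggest an association between sex and opinion; similar rows suggest little or none.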

Compare Row Percentages: Associations Between Two Variables

• Women (54.97%) are more likely than men (48.57%) to think that the criminal court is doing an average job at sentencing.

• There is not much of a difference between the sexes in the likelihood of thinking that the criminal court is doing a poor job at sentencing.

Side-by-side Bar Chart of Sentencing Opinion by Sex

Another Example: Distributions of Perceived Health by Sex

Source: Canadian Community Health Survey (CCHS, 2012)

• There were 1500 respondents.

• 832 of the respondents were females. 668 of the respondents were males.

• The most common response among both females (315) and males (240) was “3 = Very Good” for their perceived health.

• There is not much difference in reported perceived health between males and females.

Variable(s): 2

Variable Names: Perceived Health, Sex

Variable Type:

• Perceived Health: Ordinal Categorical Variable

0 = Poor, 1 = Fair, 2 = Good, 3 = Very Good, 4 = Excellent

• Sex: Nominal Categorical Variable

Male, Female

Distributions of Perceived Health by Sex (CCHS, 2012)

Possible Lack of Association Between Two Variables

• There is not much difference in reported perceived health between males and females.

• There is no apparent association between reported perceived health and the sex of the subjects.

• Reported perceived health may be independent of (may not depend on) the sex of the subject.

• That is, the sex of the respondents does not appear to explain reported perceived health.

Distributions of Perceived Health (CCHS, 2012)

The largest share of respondents (37%) perceived their health as very good.

Exploring Relationships Between Two Categorical Variables

• Use either the row or the column percentages to compare the distributions.

• That is, find the conditional distribution of one variable within each level of another variable.

• When the distribution of one variable differs across the categories of another variable, we say that the variables are dependent (the variables are associated; the variables are related).

• When the distribution of one variable is the same for all categories of another variable, we say that the variables are independent (the variables are not associated; the variables are not related).

• Note that the points made above are an informal method of comparing distributions. In STA221, we will see a formal way of checking for independence (Test of Hypothesis regarding the independence of two variables).

Nice to meet you and see you soon ☺

Please bring your laptop to the next class for exploring RStudio.