Statistics for Social Science 2019 Syllabus

__________________________________________________________

Higher School of Economics 2019

Approved by the Academic Council of the educational programme on 30 August 2019, Protocol No. 7

Academic Supervisor of the educational programme: D.A. Shcherbakov (Д.А. Щербаков)

Statistics for Social Science

Part 1: Course Information

Instructor: Denis Burakov

Office: Moscow, 17 M.Ordynka str., office 301

Office Hours: by appointment

E-mail: [email protected]

Course Description

This course serves as an introduction to fundamental and advanced concepts in statistics and probability. It teaches students how to collect and analyze data effectively and how to draw inferences from data, both to answer their own research questions and to understand analyses performed by others. The emphasis is on statistical reasoning, problem solving, computer applications, and interpretation of results. You are advised to brush up on your high school algebra for the problem sets, although most of the complex calculations will be performed on a computer.

Learning Outcomes

This course is designed for students who are interested in developing a set of quantitative skills that are broadly applicable in the job market and useful for improving things in the practical world, be it environmental policies or business strategies. By taking this course you will learn how to make sense of data by applying sound statistical reasoning and how to use data to shape policy and design desired outcomes.

This course is aimed at developing the following knowledge and skills:

• Knowledge and Understanding

- understand fundamental concepts and important terminology in statistics and probability

- develop an understanding of principles of data collection, data analysis, and data visualization

• Skills and Other Attributes

- be able to perform basic statistical operations using R software

- present data in tables and charts, summarize and describe numerical data

- be able to apply statistical reasoning, perform statistical analysis and interpret the results

Textbook & Course Materials

The following textbooks and materials should be consulted for further reading. Additional readings and homework exercises will be assigned on a weekly basis.

PRESCRIBED TEXTS

[SME] Keller, Gerald. Statistics for Management and Economics, 10th Edition. Cengage Learning, 2015.

[OS3] Diez, David M., Christopher D. Barr, and Mine Cetinkaya-Rundel. OpenIntro Statistics, 3rd Edition. OpenIntro, 2012. [1]

REFERENCES (SUGGESTED TEXTS)

Aron, Arthur. Statistics for Psychology, 6th Edition. Pearson, 2012.

Bruce, Peter, and Andrew Bruce. Practical Statistics for Data Scientists: 50 Essential Concepts. O'Reilly, 2017.

Lind, Douglas A., William G. Marchal, and Samuel Adam Wathen. Statistical Techniques in Business and Economics. McGraw-Hill/Irwin, 2012.

Venables, W. N., and D. M. Smith. An Introduction to R: Notes on R: A Programming Environment for Data Analysis and Graphics (Version 3.4.4). 2018.

All necessary materials (lecture PPT files, a list of textbook problems, sample multiple-choice questions, related articles, and related videos) will be provided as electronic files.

[1] For [OS3] materials in home assignments, please consult the OpenIntroOrg YouTube channel for short video lectures: https://www.youtube.com/user/OpenIntroOrg/

LECTURE/SEMINAR/HOMEWORK HOURS

During the weekly lecture you will be presented with a summary of the material and short presentations on each topic from the curriculum. You are expected to come to class with your homework prepared in order to fully comprehend the material. The pace will be very fast; be prepared!

From time to time you will be asked to watch short video clips and to solve and submit exercises that test your comprehension of the material. You will also have access to online resources throughout the course. Successfully completing the course requires an average commitment of approximately 9 hours per week for attending lectures, doing the readings, and completing the assignments.

No   Topic                                    Lecture Hours  Homework Hours  Total
1.   Introduction                             2              2               4
2.   Data Basics                              4              4               8
3.   Graphical Descriptive Techniques         4              6               10
4.   Numerical Descriptive Techniques         4              6               10
5.   Data Collection and Sampling Theory      2              6               8
6.   Probability                              4              8               12
7.   Discrete Probability Distributions       2              4               6
8.   Midterm exam                             2                              2
9.   Continuous Probability Distributions     2              4               6
10.  Sampling Distributions                   4              4               8
11.  Estimation                               4              8               12
12.  Hypothesis Testing Framework             4              8               12
13.  Inference for Numerical Data             2              4               6
14.  Analysis of Variance                     4              6               10
15.  Regression Analysis                      6              10              16
16.  Final exam                               2                              2
     Total                                    52             80              132

Part 2: Grading Policy

Grading Policy Section                                 Weight, %
Block 1. Overall comprehension of statistical theory   50%
  Midterm exam                                         20%
  Final exam                                           30%
Block 2. Coding techniques and data understanding      50%
  Homeworks                                            25%
  Class problem sets                                   25%
Block 3. Individual characteristics                    0%
  Interest in statistics                               0%
  Attendance                                           0%

The grading policy for this course is aimed at assessing the students’ overall comprehension of statistical theory (Block 1) via the Midterm and Final exams, and their coding techniques and understanding of data (Block 2) via Homeworks and Laboratory assignments.

In addition, there is a third block, which assesses individual characteristics that may or may not be counted toward your final grade. For example, attendance has a 0% weight, but if there are three or more absences, a decreasing coefficient will be applied. Similarly, a genuine interest in the material can yield a higher grade through an increasing coefficient.
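
As a rough illustration of how the weights above might combine, the following minimal sketch computes a weighted grade from the four graded components and then applies a single multiplicative coefficient. The function name, the component keys, the example scores, and the coefficient value of 0.9 are all hypothetical; the syllabus does not state the size of the coefficients or the exact way they are applied.

```python
# Weights taken from the grading table (as fractions of the final grade).
WEIGHTS = {
    "midterm": 0.20,
    "final": 0.30,
    "homeworks": 0.25,
    "class_problem_sets": 0.25,
}


def course_grade(scores, coefficient=1.0):
    """Return the weighted average of the component scores.

    `scores` maps every component name in WEIGHTS to a score on a common scale.
    `coefficient` is a hypothetical stand-in for the Block 3 adjustment
    (below 1 for a decreasing coefficient, above 1 for an increasing one);
    its actual value is not given in the syllabus.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return weighted * coefficient


# Made-up example scores on an assumed 10-point scale.
example = {"midterm": 7, "final": 8, "homeworks": 9, "class_problem_sets": 6}
base = course_grade(example)
penalised = course_grade(example, coefficient=0.9)  # hypothetical attendance penalty
print(round(base, 3), round(penalised, 3))
```

Keeping the coefficient as a separate multiplier leaves the four stated weights summing to 100%, which matches the 0% weight assigned to Block 3 in the table.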

Part 3: Topic Outline/Schedule

Weekly Schedule:

Week 1. Introduction

Weeks 2, 3. Data Basics

Weeks 4, 5. Graphical Descriptive Techniques

Weeks 6, 7. Numerical Descriptive Techniques

Week 8. Data Collection and Sampling Theory

Weeks 9, 10. Probability

Weeks 11, 12. Discrete Probability Distributions

Week 13. Midterm Exam

Weeks 14, 15. Continuous Probability Distributions

Weeks 16, 17. Sampling Distributions

Weeks 18, 19. Estimation

Weeks 20, 21. Hypothesis Testing Framework

Week 22. Inference for Numerical Data

Weeks 23, 24. Analysis of Variance

Weeks 25, 26, 27. Regression Analysis

Week 28. Final Exam.

Lecture Outlines:

Week 1. Introduction

(1) Learning Objectives

After this session, students should be able to:

- Understand the focus of statistics as a subject

- Learn key statistical terminology

- Understand statistical applications in social science and business

(2) Session Outline

1. Short history of the development of modern statistics

2. The subject of statistics as a scientific discipline

3. Applications of statistics in the social sciences and business

4. Overview of key terminology in statistics

(3) Required readings and references

[SME] Chapter 1.

Salsburg, David. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. Macmillan, 2001.

Weeks 2, 3. Data Basics

(1) Learning Objectives

After this session, students should be able to:

- Understand the concept of a random variable

- Differentiate between types of variables

- Be able to work with variables in the R environment

(2) Session Outline

1. Concept of a random variable

2. Variable types

3. Variable transformations

4. Observations, variables, and data matrices

5. Variables in the R environment

(3) Required readings and references

[SME] Chapter 2.

[OS3] Chapter 1.

Weeks 4, 5. Graphical Descriptive Techniques

(1) Learning Objectives

After this session, students should be able to:

- Become familiar with a glossary of chart types

- Learn graphical techniques to describe interval and categorical data

- Show a relationship between numerical and nominal variables

- Organize data using a frequency distribution

- Represent data using charts

(2) Session Outline

1. Frequency and relative frequency distributions

2. Shapes of frequency distributions

3. Contingency tables and bar plots

4. Examining numerical data

5. Cross-sectional and time-series data

6. Misleading graphs and charts to be avoided

7. Alternatives to pie charts

(3) Required readings and references

[SME] Chapters 2 and 3.

Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for Business Professionals. John Wiley & Sons, 2015. Introduction.

Wexler, Steve, Jeffrey Shaffer, and Andy Cotgreave. The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios. John Wiley & Sons, 2017. Part I.

[Videos]

- Identifying Misleading Graphs - Konst Math https://www.youtube.com/watch?v=ETbc8GIhfHo

Weeks 6, 7. Numerical Descriptive Techniques

(1) Learning Objectives

After this session, students should be able to:

- Apply numerical techniques for describing and summarizing data

- Identify, compute, and interpret descriptive statistical summary measures

- Differentiate between the measures of central tendency, dispersion, and relative standing

(2) Session Outline

1. Types of distributions (symmetrical, left-skewed, and right-skewed)

2. Measures of central tendency for numerical data

3. Comparing the mean, mode, and median

4. Measures of dispersion

5. Measures of relative standing

6. The Empirical Rule and Chebyshev’s Theorem

(3) Required readings and references

[SME] Chapter 4.

Aron, A. Statistics for Psychology, 6th edition. Pearson, 2012. Chapter 2.

Week 8. Data Collection and Sampling Theory

(1) Learning Objectives

After this session, students should be able to:

- Understand the methodologies underlying data collection

- Become acquainted with the concepts of random sampling and sample bias

- Differentiate between sampling strategies

(2) Session Outline

1. Methods of collecting data

2. The concepts of population and sample

3. Sampling from a population

4. Sampling strategies

5. Sampling bias

(3) Required readings and references

[SME] Chapter 5.

[OS3] Chapter 1 (1.3).

Bruce, Peter, and Andrew Bruce. Practical Statistics for Data Scientists: 50 Essential Concepts. O'Reilly, 2017. Chapter 2.

[Articles]

- Sampling Bias vs. Sampling Error

http://conflict.lshtm.ac.uk/page_40.htm

[Videos]

- Bias and Error in Sampling Statistic (2 minutes to watch) https://www.youtube.com/watch?v=Pb_CrXBRovE

Weeks 9, 10. Probability

(1) Learning Objectives

After this session, students should be able to:

- Learn probability concepts and rules

- Identify components of probability

- Assess probabilities and apply probability formulas

(2) Session Outline

1. Defining probability

2. Joint, marginal, and conditional probability

3. Addition, multiplication, and complement rules

4. Marginal and joint probabilities

5. Defining conditional probabilities

6. Tree diagrams for probability

7. Bayes' Theorem

(3) Required readings and references

[SME] Chapter 6.

[OS3] Chapter 2.

Silver, Nate. The Signal and the Noise: Why so Many Predictions Fail - but Some Don't. Penguin, 2012.

For a short review, please consult:

Cantor, Murray. Filling in the Blanks, The Math behind Nate Silver’s The Signal and the Noise. IBM Developer Works, 2013.

[Articles]

- Bayes’ Law

https://en.wikipedia.org/wiki/Bayes%27_theorem

- Visualization of conditional probability

http://setosa.io/conditional/

[Videos]

- Conditional Probability explained visually (Bayes’ Theorem)

https://youtu.be/Zxm4Xxvzohk

Weeks 11, 12. Discrete Probability Distributions

(1) Learning Objectives

After this session, students should be able to:

- Learn the concept of a random variable

- Become acquainted with discrete distributions of random variables

(2) Session Outline

1. Defining a random variable

2. A family of discrete distributions

3. Bivariate Distributions

4. Binomial Distribution

5. Poisson Distribution

(3) Required readings and references

[SME] Chapter 7.

[OS3] Chapter 3.

[Articles]

- Joint Probability Distribution

https://en.wikipedia.org/wiki/Joint_probability_distribution

[Videos]

- Binomial Distribution Demo http://demonstrations.wolfram.com/BinomialProbabilityDistribution

- The Bivariate Normal Distribution Demo http://demonstrations.wolfram.com/TheBivariateNormalDistribution/

Week 13. Midterm Exam

(1) Midterm exam instructions

- Study Topics 1 through 7

- Calculators are not allowed, but a formula sheet will be provided

- No make-up exam will be given

- Be on time!

Weeks 14, 15. Continuous Probability Distributions

(1) Learning Objectives

After this session, students should be able to:

- Become familiar with continuous distributions

- Learn the 68-95-99.7 rule

- Learn the concept of probability density functions

- Calculate Z-scores and use distribution tables

(2) Session Outline

1. A family of continuous distributions

2. Normal distribution

3. Standardizing with Z-scores

4. Student’s t distribution

5. χ² distribution

(3) Required readings and references

[SME] Chapter 8.

[OS3] Chapter 2 (2.5.1).

[OS3] Chapter 3 (3.1.2).

[OS3] Chapter 6 (6.3.3).

[Videos]

- Joint Probability Distribution

http://demonstrations.wolfram.com/TheBivariateNormalDistribution/

Weeks 16, 17. Sampling Distributions

(1) Learning Objectives

After this session, students should be able to:

- Learn the concept of a sampling distribution of the mean

- Learn the concept of a standard error of the mean

- Apply the Central Limit Theorem

- Use the sampling distribution for inference

(2) Session Outline

1. The logic of parametric estimation

2. Sampling distribution

3. Standard error of an estimate

4. Sampling distribution of the difference between two means

5. Standard error of the difference between two means

6. Principles of A/B tests in business

(3) Required readings and references

[SME] Chapter 9.

[OS3] Chapter 4.

Bruce, Peter, and Andrew Bruce. Practical Statistics for Data Scientists: 50 Essential Concepts. O'Reilly, 2017. Chapter 3.

[Articles]

- Central Limit Theorem Visualized in D3

http://blog.vctr.me/posts/central-limit-theorem.html

- Probability Density Function

https://en.wikipedia.org/wiki/Joint_probability_distribution

[Videos]

- Bunnies, Dragons and the 'Normal' World: Central Limit Theorem | The New York Times

https://youtu.be/jvoxEYmQHNM

- Sampling Distribution Concept and its relation to Inference

https://youtu.be/Zbw-YvELsaM

- Sampling Distribution of the Sample Mean

https://youtu.be/q50GpTdFYyI

Weeks 18, 19. Estimation

(1) Learning Objectives

After this session, students should be able to:

- Develop an understanding of statistical concepts behind parametric estimation

- Calculate point estimates and interpret confidence intervals

- Compute standard error for the sample mean

- Select the sample size

- Use the sampling distribution for inference

(2) Session Outline

1. Basic properties of point estimates

2. Capturing the population parameters

3. Selecting the sample size

4. Calculations of point estimates and standard errors for the sample mean

5. Interpretation of confidence intervals

(3) Required readings and references

[SME] Chapter 10.

[OS3] Chapter 4.

[Articles]

- Things to be considered when choosing the sample sizes https://www.itl.nist.gov/div898/handbook/ppc/section3/ppc333.htm

Weeks 20, 21. Hypothesis Testing Framework

(1) Learning Objectives

After this session, students should be able to:

- Develop an understanding of the hypothesis testing framework

- Conduct one-tailed and two-tailed hypothesis tests

- Define decision errors

- Explain the relationship between classical hypothesis tests and p-values

(2) Session Outline

1. Hypothesis testing methodology

2. Null and alternative hypotheses

3. Significance levels (α) and decision errors (Type I and Type II)

4. One-sided and two-sided hypothesis tests

(3) Required readings and references

[SME] Chapter 11.

[OS3] Chapter 4.

[Articles]

- Type I and Type II Error

https://en.wikipedia.org/wiki/Type_I_and_type_II_errors

[Videos]

- Hypothesis testing basics

https://youtu.be/ZzeXCKd5a18

- Type I and Type II Error

https://youtu.be/a_l991xUAOU

Week 22. Inference for Numerical Data

(1) Learning Objectives

After this session, students should be able to:

- Learn the concept of experimental design

- Define degrees of freedom

- Test and estimate a population variance

- Test the equality of means

- Test the equality of count data and proportions

(2) Session Outline

1. What is experimental design?

2. The meaning of degrees of freedom

3. One-sample mean tests

4. Inference for paired data

5. χ²-tests and goodness-of-fit

(3) Required readings and references

[SME] Chapters 12, 13, and 15.

[OS3] Chapter 5.

Weeks 23, 24. Analysis of Variance

(1) Learning Objectives

After this session, students should be able to:

- Conduct F-tests

- Decompose variance

- Conduct one-way and two-way analysis of variance

- Understand the properties of ANOVA tables

(2) Session Outline

1. Familywise error and the limitations of a t-test

2. The F-test and F-distribution

3. Variance decomposition

4. The within-groups and between-groups estimates of population variance

5. The F-ratio

6. Hypothesis testing with the analysis of variance

7. Diagnostics for ANOVA

(3) Required readings and references

[SME] Chapter 14.

[OS3] Chapter 5 (5.5.1).

[Videos]

- Sources of Variance in an Experiment

https://youtu.be/JPjMUeTMOwg

Weeks 25, 26, 27. Regression Analysis

(1) Learning Objectives

After this session, students should be able to:

- Run a regression model

- Interpret the coefficients of a linear regression

- Compute the coefficient of determination

- Predict values using regression techniques

(2) Session Outline

1. Covariance and correlation

2. Assumptions of ordinary least squares (OLS)

3. Simple bivariate regression

4. Multivariate regression

5. Multicollinearity

6. Control variables in multiple regression

7. Difference between the full model and the best model

8. Prediction using regression

(3) Recommended Readings

Keller, Gerald. Statistics for Management and Economics, Abbreviated. Cengage Learning, 2015. Chapter 16.

Diez, David M., Christopher D. Barr, and Mine Cetinkaya-Rundel. OpenIntro Statistics. OpenIntro, 2012. Chapter 8.

Week 28. Final Exam

(1) Final exam instructions

- Study Topics 1 through 15

- Calculators are not allowed, but a formula sheet will be provided

- No make-up exam will be given

- Be on time!