Statistics for Social Science 2019 Syllabus
__________________________________________________________
Higher School of Economics 2019. Approved by the Academic Council
of the educational programme on August 30, 2019, protocol No. 7
Academic Supervisor of the educational programme
D.A. Shcherbakov (Д.А. Щербаков)
Statistics for Social Science
Part 1: Course Information
Instructor: Denis Burakov
Office: Moscow, 17 M.Ordynka str., office 301
Office Hours: by appointment
E-mail: [email protected]
Course Description
This course serves as an introduction to fundamental and advanced concepts
in statistics and probability and will be instrumental in teaching students how
to effectively collect, analyze, and draw inferences from data in order to
answer their own research questions and understand the analyses by others.
The emphasis will be placed on statistical reasoning, problem solving,
computer applications, and interpretation of the results. You are encouraged to
brush up on your high school algebra in order to solve the problem sets, although
most of the complex calculations will be performed using computers.
Learning Outcomes
This course is designed for students who are interested in developing a set
of quantitative skills that are broadly applicable in the job market
and useful for improving real-world outcomes, be it environmental policies
or business strategies. By taking this course you will learn how to make
sense of data by applying sound statistical reasoning and how to use data to
shape policy and design desired outcomes.
This course is aimed at developing the following knowledge and skills:
• Knowledge and Understanding
understand fundamental concepts and important terminology
in statistics and probability
develop an understanding of principles of data collection,
data analysis, and data visualization
• Skills and Other Attributes
be able to perform basic statistical operations using R software
present data in tables and charts, summarize and describe
numerical data
be able to apply statistical reasoning, perform statistical analysis
and interpret the results
Textbook & Course Materials
The following textbooks and materials should be consulted for further reading.
Additional readings and homework exercises will be assigned on a weekly basis.
PRESCRIBED TEXTS
[SME] Keller, Gerald. Statistics for Management and Economics, 10th Edition.
Cengage Learning, 2015.
[OS3] Diez, David M., Christopher D. Barr, and Mine Cetinkaya-Rundel.
OpenIntro Statistics, 3rd Edition. OpenIntro, 2012. [1]
REFERENCES (SUGGESTED TEXTS)
Aron, Arthur. Statistics for Psychology, 6th Edition. Pearson, 2012.
Bruce, Peter, and Andrew Bruce. Practical Statistics for Data Scientists: 50
Essential Concepts. O'Reilly, 2017.
Lind, Douglas A., William G. Marchal, and Samuel Adam Wathen. Statistical
Techniques in Business and Economics. McGraw-Hill/Irwin, 2012.
Venables, W. N., and D. M. Smith. An Introduction to R: Notes on R: A
Programming Environment for Data Analysis and Graphics (Version 3.4.4). 2018.
All necessary materials (lecture PPT files, a list of Textbook problems, sample multiple choice questions, related articles, and related videos) will be provided in the form of electronic files.
[1] For [OS3] materials in home assignments, please consult the OpenIntroOrg
YouTube channel for short video lectures: https://www.youtube.com/user/OpenIntroOrg/
LECTURE/SEMINAR/HOMEWORK HOURS
During a weekly lecture you will be presented with a summary of the material
and short presentations on each topic from the curriculum. It is expected that you
come to class with prepared homework in order to fully comprehend the material.
The pace will be very fast; be prepared!
From time to time you will be asked to watch short video clips and solve and
submit exercises to test your comprehension of the material. You will also have
access to online resources throughout the course. The average commitment will
be approximately 9 hours per week for attending lectures, doing the readings, and
completing the assignments to successfully complete the course.
No.  Topic                                     Lecture Hours  Homework Hours  Total
1.   Introduction                              2              2               4
2.   Data Basics                               4              4               8
3.   Graphical Descriptive Techniques          4              6               10
4.   Numerical Descriptive Techniques          4              6               10
5.   Data Collection and Sampling Theory       2              6               8
6.   Probability                               4              8               12
7.   Discrete Probability Distributions        2              4               6
8.   Midterm exam                              2              -               2
9.   Continuous Probability Distributions      2              4               6
10.  Sampling Distributions                    4              4               8
11.  Estimation                                4              8               12
12.  Hypothesis Testing Framework              4              8               12
13.  Inference for Numerical Data              2              4               6
14.  Analysis of Variance                      4              6               10
15.  Regression Analysis                       6              10              16
16.  Final exam                                2              -               2
     Total                                     52             80              132
Part 2: Grading Policy
Grading Policy Section Weight, %
Block 1. Overall comprehension of statistical theory 50%
Midterm exam 20%
Final exam 30%
Block 2. Coding techniques and data understanding 50%
Homeworks 25%
Class problem sets 25%
Block 3. Individual characteristics 0%
Interest in statistics 0%
Attendance 0%
The grading policy for this course is aimed at assessing the students’ overall
comprehension of statistical theory (Block 1) via the Midterm and Final exams and their
coding techniques and understanding of data (Block 2) via Homeworks and
Class problem sets.
In addition, there is a third block, which assesses individual characteristics
that may or may not be accounted for in your final grade. For example, attendance has
a 0% weight, but if there are three or more absences, a decreasing coefficient will be
applied. Similarly, a genuine interest in the material can yield a higher grade through
an increasing coefficient.
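The weighting described above can be sketched as simple arithmetic. The following is only an illustration in Python (the course itself uses R): the weights come from the grading table, while the 0 to 10 score scale, the example scores, and the value of the coefficient are assumptions, since the syllabus does not state them.

```python
# Weights (in percent) taken from the grading table in this syllabus.
WEIGHTS = {
    "midterm": 20,
    "final": 30,
    "homeworks": 25,
    "class_problem_sets": 25,
}


def weighted_grade(scores, coefficient=1.0):
    """Weighted average of the component scores, scaled by a coefficient.

    `coefficient` is hypothetical: the syllabus mentions decreasing and
    increasing coefficients but does not give their values.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    return base * coefficient


# Hypothetical scores on an assumed 0-10 scale.
example = {"midterm": 8, "final": 6, "homeworks": 10, "class_problem_sets": 8}
print(weighted_grade(example))                    # no adjustment
print(weighted_grade(example, coefficient=0.9))   # assumed penalty for absences
```

With these assumed scores the unadjusted result is (20·8 + 30·6 + 25·10 + 25·8) / 100 = 7.9.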
Part 3: Topic Outline/Schedule
Weekly Schedule:
Week 1. Introduction
Weeks 2, 3. Data Basics
Weeks 4, 5. Graphical Descriptive Techniques
Weeks 6, 7. Numerical Descriptive Techniques
Week 8. Data Collection and Sampling Theory
Weeks 9, 10. Probability
Weeks 11, 12. Discrete Probability Distributions
Week 13. Midterm Exam
Weeks 14, 15. Continuous Probability Distributions
Weeks 16, 17. Sampling Distributions
Weeks 18, 19. Estimation
Weeks 20, 21. Hypothesis Testing Framework
Week 22. Inference for Numerical Data
Weeks 23, 24. Analysis of Variance
Weeks 25, 26, 27. Regression Analysis
Week 28. Final Exam.
Lecture Outlines:
Week 1. Introduction
(1) Learning Objectives
After this session, students should be able to:
- Understand the focus of statistics as a subject
- Learn key statistical terminology
- Understand statistical applications in social science and business
(2) Session Outline
1. Short history of the development of modern statistics
2. The subject of statistics as a scientific discipline
3. Applications of statistics in the social sciences and business
4. Overview of key terminology in statistics
(3) Required readings and references
[SME] Chapter 1.
Salsburg, David. The Lady Tasting Tea: How Statistics Revolutionized Science in
The Twentieth Century. Macmillan, 2001.
Weeks 2, 3. Data Basics
(1) Learning Objectives
After this session, students should be able to:
- Understand the concept of a random variable
- Differentiate between types of variables
- Be able to work with variables in the R environment
(2) Session Outline
1. Concept of a random variable
2. Variable types
3. Variable transformations
4. Observations, variables, and data matrices
5. Variables in the R environment
(3) Required readings and references
[SME] Chapter 2.
[OS3] Chapter 1.
Weeks 4, 5. Graphical Descriptive Techniques
(1) Learning Objectives
After this session, students should be able to:
- Become familiar with a glossary of chart types
- Learn graphical techniques to describe interval and categorical data
- Show a relationship between numerical and nominal variables
- Organize data using a frequency distribution
- Represent data using charts
(2) Session Outline
1. Frequency and relative frequency distributions
2. Shapes of frequency distributions
3. Contingency tables and bar plots
4. Examining numerical data
5. Cross-sectional and time-series data
6. Misleading graphs and charts to be avoided
7. Alternatives to pie charts
(3) Required readings and references
[SME] Chapters 2 and 3.
Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for
Business Professionals. John Wiley & Sons, 2015. Introduction.
Wexler, Steve, Jeffrey Shaffer, and Andy Cotgreave. The Big Book of
Dashboards: Visualizing Your Data Using Real-World Business Scenarios. John
Wiley & Sons, 2017. Part I.
[Videos]
- Identifying Misleading Graphs - Konst Math
https://www.youtube.com/watch?v=ETbc8GIhfHo
Weeks 6, 7. Numerical Descriptive Techniques
(1) Learning Objectives
After this session, students should be able to:
- Apply numerical techniques for describing and summarizing data
- Identify, compute, and interpret descriptive statistical summary
measures
- Differentiate between the measures of central tendency, dispersion,
and relative standing
(2) Session Outline
1. Types of distributions (symmetrical, left-skewed, and right-skewed)
2. Measures of central tendency for numerical data
3. Comparing the mean, mode, and median
4. Measures of dispersion
5. Measures of relative standing
6. The Empirical Rule and Chebyshev’s Theorem
(3) Required readings and references
[SME] Chapter 4.
Aron, A. Statistics for Psychology, 6th edition. Pearson, 2012. Chapter 2.
Week 8. Data Collection and Sampling Theory
(1) Learning Objectives
After this session, students should be able to:
- Understand the methodologies underlying data collection
- Become acquainted with the concepts of random sampling and sample
bias
- Differentiate between sampling strategies
(2) Session Outline
1. Methods of collecting data
2. The concepts of population and sample
3. Sampling from a population
4. Sampling strategies
5. Sampling bias
(3) Required readings and references
[SME] Chapter 5.
[OS3] Chapter 1 (1.3).
Bruce, Peter, and Andrew Bruce. Practical Statistics for Data Scientists: 50
Essential Concepts. O'Reilly, 2017. Chapter 2.
[Articles]
- Sampling bias Vs. Sampling Error
http://conflict.lshtm.ac.uk/page_40.htm
[Videos]
- Bias and Error in Sampling Statistic (2 minutes to watch)
https://www.youtube.com/watch?v=Pb_CrXBRovE
Weeks 9, 10. Probability
(1) Learning Objectives
After this session, students should be able to:
- Learn probability concepts and rules
- Identify components of probability
- Assess probabilities and apply probability formulas
(2) Session Outline
1. Defining probability
2. Joint, marginal, and conditional probability
3. Addition, multiplication, and complement rules
4. Marginal and joint probabilities
5. Defining conditional probabilities
6. Tree diagrams for probability
7. Bayes' Theorem
(3) Required readings and references
[SME] Chapter 6.
[OS3] Chapter 2.
Silver, Nate. The Signal and the Noise: Why so Many Predictions Fail - but Some
Don't. Penguin, 2012.
Please consult for a short review:
Cantor, Murray. Filling in the Blanks, The Math behind Nate Silver’s The Signal
and the Noise. IBM Developer Works, 2013.
[Articles]
- Bayes’ Law
https://en.wikipedia.org/wiki/Bayes%27_theorem
- Visualization of conditional probability
http://setosa.io/conditional/
[Videos]
- Conditional Probability explained visually (Bayes’ Theorem)
https://youtu.be/Zxm4Xxvzohk
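As a small illustration of Bayes' Theorem from the session outline above, the sketch below computes a posterior probability from a prior and two conditional probabilities. It is written in Python rather than the R used in class, and the screening-test numbers are hypothetical, chosen only to show the arithmetic.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) by Bayes' Theorem, with P(E) from the law of total probability."""
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / p_evidence


# Hypothetical screening test: 1% of people have the condition, the test
# detects it 95% of the time, and gives a false positive 5% of the time.
p = posterior(prior=0.01, p_evidence_given_h=0.95, p_evidence_given_not_h=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Under these assumed numbers the posterior is roughly 0.16, far lower than the 95% detection rate, because the condition is rare.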
Weeks 11, 12. Discrete Probability Distributions
(1) Learning Objectives
After this session, students should be able to:
- Learn the concept of a random variable
- Become acquainted with discrete distributions of random variables
(2) Session Outline
1. Defining a random variable
2. A family of discrete distributions
3. Bivariate Distributions
4. Binomial Distribution
5. Poisson Distribution
(3) Required readings and references
[SME] Chapter 7.
[OS3] Chapter 3.
[Articles]
- Joint Probability Distribution
https://en.wikipedia.org/wiki/Joint_probability_distribution
[Videos]
- Binomial Distribution Demo
http://demonstrations.wolfram.com/BinomialProbabilityDistribution
- The Bivariate Normal Distribution Demo
http://demonstrations.wolfram.com/TheBivariateNormalDistribution/
Week 13. Midterm Exam
(1) Midterm exam instructions
- Study Topics 1 through 7
- Calculators are not allowed, but a formula sheet will be provided
- No make-up exam will be given
- Be on time!
Weeks 14, 15. Continuous Probability Distributions
(1) Learning Objectives
After this session, students should be able to:
- Become familiar with continuous distributions
- Learn the 68-95-99.7 rule
- Learn the concept of probability density functions
- Calculate Z-scores and use distribution tables
(2) Session Outline
1. A family of continuous distributions
2. Normal distribution
3. Standardizing with Z-scores
4. Student’s t distribution
5. χ² distribution
(3) Required readings and references
[SME] Chapter 8.
[OS3] Chapter 2 (2.5.1).
[OS3] Chapter 3 (3.1.2).
[OS3] Chapter 6 (6.3.3).
[Videos]
- Joint Probability Distribution
http://demonstrations.wolfram.com/TheBivariateNormalDistribution/
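The Z-score and the 68-95-99.7 rule listed in the objectives above can be checked numerically. This is a minimal sketch in Python (not the R used in class) that relies on `statistics.NormalDist` from the standard library; the test-score figures are hypothetical.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above the mean."""
    return (x - mean) / sd


standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Probability that a normal variable falls within k sd of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Hypothetical example: a score of 130 when the mean is 100 and the sd is 15.
z = z_score(130, mean=100, sd=15)
print(f"z = {z}")

for k in (1, 2, 3):
    print(f"within {k} sd: {share_within(k):.4f}")
```

The three printed shares are approximately 0.68, 0.95, and 0.997, which is where the rule gets its name.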
Weeks 16, 17. Sampling Distributions
(1) Learning Objectives
After this session, students should be able to:
- Learn the concept of a sampling distribution of the mean
- Learn the concept of a standard error of the mean
- Apply the Central Limit Theorem
- Use the sampling distribution for inference
(2) Session Outline
1. The logic of parametric estimation
2. Sampling distribution
3. Standard error of an estimate
4. Sampling distribution of the difference between two means
5. Standard error of the difference between two means
6. Principles of A/B tests in business
(3) Required readings and references
[SME] Chapter 9.
[OS3] Chapter 4.
Bruce, Peter, and Andrew Bruce. Practical Statistics for Data Scientists: 50
Essential Concepts. O'Reilly, 2017. Chapter 3.
[Articles]
- Central Limit Theorem Visualized in D3
http://blog.vctr.me/posts/central-limit-theorem.html
- Probability Density Function
https://en.wikipedia.org/wiki/Joint_probability_distribution
[Videos]
- Bunnies, Dragons and the 'Normal' World: Central Limit Theorem |
The New York Times
https://youtu.be/jvoxEYmQHNM
- Sampling Distribution Concept and its relation to Inference
https://youtu.be/Zbw-YvELsaM
- Sampling Distribution of the Sample Mean
https://youtu.be/q50GpTdFYyI
Weeks 18, 19. Estimation
(1) Learning Objectives
After this session, students should be able to:
- Develop an understanding of statistical concepts behind parametric
estimation
- Calculate point estimates and interpret confidence intervals
- Compute standard error for the sample mean
- Select the sample size
- Use the sampling distribution for inference
(2) Session Outline
1. Basic properties of point estimates
2. Capturing the population parameters
3. Selecting the sample size
4. Calculations of point estimates and standard errors for the sample mean
5. Interpretation of confidence intervals
(3) Required readings and references
[SME] Chapter 10.
[OS3] Chapter 4.
[Articles]
- Things to be considered when choosing the sample sizes
https://www.itl.nist.gov/div898/handbook/ppc/section3/ppc333.htm
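The point estimate, standard error, and confidence interval named in the outline above fit together as in the following sketch. It is an illustration in Python (the course uses R), the sample values are hypothetical, and it assumes a normal critical value; for a sample this small a Student's t critical value would normally be used instead.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (point estimate, standard error, lower bound, upper bound).

    Assumption: uses the normal critical value rather than Student's t.
    """
    n = len(sample)
    estimate = mean(sample)
    se = stdev(sample) / sqrt(n)  # sample sd divided by sqrt(n)
    critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate, se, estimate - critical * se, estimate + critical * se


# Hypothetical sample of five observations.
data = [4, 8, 6, 5, 7]
estimate, se, lower, upper = mean_confidence_interval(data)
print(f"mean = {estimate}, SE = {se:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Raising `confidence` widens the interval, and a larger sample narrows it through the sqrt(n) term in the standard error.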
Weeks 20, 21. Hypothesis Testing Framework
(1) Learning Objectives
After this session, students should be able to:
- Develop an understanding of hypothesis testing framework
- Conduct one-tailed and two-tailed hypothesis tests
- Define decision errors
- Explain the relationship between classical hypothesis test and p-values
(2) Session Outline
1. Hypothesis testing methodology
2. Null and alternative hypotheses
3. Significance levels (α) and decision errors (Type I and Type II)
4. One-sided and two-sided hypothesis tests
(3) Required readings and references
[SME] Chapter 11.
[OS3] Chapter 4.
[Articles]
- Type I and Type II Error
https://en.wikipedia.org/wiki/Type_I_and_type_II_errors
[Videos]
- Hypothesis testing basics
https://youtu.be/ZzeXCKd5a18
- Type I and Type II Error
https://youtu.be/a_l991xUAOU
Week 22. Inference for Numerical Data
(1) Learning Objectives
After this session, students should be able to:
- Learn the concept of experimental design
- Define degrees of freedom
- Test and estimate a population variance
- Test the equality of means
- Test the equality of count data and proportions
(2) Session Outline
1. What is experimental design?
2. The meaning of degrees of freedom
3. One-sample mean tests
4. Inference for paired data
5. χ²-tests and goodness-of-fit
(3) Required readings and references
[SME] Chapters 12, 13, and 15.
[OS3] Chapter 5.
Weeks 23, 24. Analysis of Variance
(1) Learning Objectives
After this session, students should be able to:
- Conduct F-tests
- Decompose variance
- Conduct one-way and two-way analysis of variance
- Understand the properties of ANOVA tables
(2) Session Outline
1. Familywise error and the limitations of a t-test
2. The F-test and F-distribution
3. Variance decomposition
4. The within-groups and between-groups estimates of population variance
5. The F-ratio
6. Hypothesis testing with the analysis of variance
7. Diagnostics for an ANOVA analysis
(3) Required readings and references
[SME] Chapter 14.
[OS3] Chapter 5 (5.5.1).
[Videos]
- Sources of Variance in an Experiment
https://youtu.be/JPjMUeTMOwg
Weeks 25, 26, 27. Regression Analysis
(1) Learning Objectives
After this session, students should be able to:
- Run a regression model
- Interpret the linear regression’s coefficients
- Compute the coefficient of determination
- Predict values using regression techniques
(2) Session Outline
1. Covariance and correlation
2. Assumptions of ordinary least squares (OLS)
3. Simple bivariate regression
4. Multivariate regression
5. Multicollinearity
6. Control variables in multiple regression
7. Difference between the full model and the best model
8. Prediction using regression
(3) Recommended Readings
Keller, Gerald. Statistics for Management and Economics, Abbreviated. Cengage
Learning, 2015. Chapter 16.
Diez, David M., Christopher D. Barr, and Mine Cetinkaya-Rundel. OpenIntro
Statistics. OpenIntro, 2012. Chapter 8.
Week 28. Final Exam
(1) Final exam instructions
- Study Topics 1 through 15
- Calculators are not allowed, but a formula sheet will be provided
- No make-up exam will be given
- Be on time!