

ECON1203 Statistics

Contents

Chapter 1 What is Statistics?
1 Descriptive statistics vs. inferential statistics
2 Population vs. sample
3 Statistical inference

Chapter 2 Graphical Descriptive Techniques I
4 Variables, values, data
5 Types of data
6 Describing univariate nominal data
7 Comparing multivariate nominal data

Chapter 3 Graphical Descriptive Techniques II
8 Describing univariate interval data
9 Describing time-series data
10 Describing bivariate interval data
11 Graphical excellence
12 Graphical deception

Chapter 4 Numerical Descriptive Techniques
13 Measures of central location
14 Variability
15 Measures of relative standing
16 Measures of linear relationship

Chapter 5 Data Collection and Sampling
17 Methods of collecting data
18 Sampling
19 Sampling plans
20 Sampling error
21 Nonsampling error

Chapter 6 Probability
22 Random experiment
23 Sample space
24 Requirements of probabilities
25 Approaches to assigning probabilities
26 Events
27 Joint, marginal and conditional probability
28 Probability rules

Chapter 7 Discrete Probability Distributions
29 Random variables
30 Discrete probability distributions
31 Bivariate distributions
32 Binomial distributions

Chapter 8 Continuous Probability Distributions
33 Requirements of probability density functions
34 Uniform distributions
35 Normal distributions
36 Exponential distribution
37 Student t distribution
38 Chi-squared distribution
39 F distribution

Chapter 9 Sampling Distributions
40 Central Limit Theorem (CLT)
41 Sampling distribution of the sample mean
42 Normal approximation of binomial distributions
43 Approximating sampling distribution of a sample proportion
44 Sampling distribution of the difference between two means

Chapter 10 Introduction to Estimation
45 Point vs. interval estimators
46 Properties of estimators
47 Estimating population mean from standard deviation
48 Estimating population mean from median
49 Sample size

Chapter 11 Introduction to Hypothesis Testing

Chapter 1 What is Statistics?

Descriptive statistics vs. inferential statistics
Descriptive statistics: Organising, summarising and presenting data
Inferential statistics: Drawing conclusions about populations based on sample data

Population vs. sample
Population: All items of interest to a statistics practitioner (e.g. the shoe sizes of all Australians)
Parameter: A descriptive measure of a population (e.g. the mean shoe size of Australians)
Sample: A subset of a population (e.g. the shoe sizes of UNSW students)
Statistic: A descriptive measure of a sample (e.g. the mean shoe size of UNSW students)

Statistical inference
Statistical inference: Drawing conclusions about populations based on sample data
Confidence level: The proportion of times an estimation procedure will be correct
Significance level: The proportion of times a conclusion will be wrong

Chapter 2 Graphical Descriptive Techniques I

Variables, values, data
Variable (denoted by an uppercase letter): A characteristic of a population or sample (e.g. shoe size)
Values: The possible observations of a variable (e.g. shoe sizes between 1 and 16)
Data (denoted by lowercase letters): The observed values of a variable

Types of data
Hierarchy of data: Moving down the hierarchy of data reduces the number of permissible calculations. Higher-level data can be treated as lower-level data, but not vice versa.
1. Interval/quantitative/numerical data: Real numbers (all calculations are valid)
2. Ordinal data: Data in a ranked order (calculations based on order are valid)
3. Nominal/qualitative/categorical data: Categories, which may be coded with arbitrary numbers (only calculations based on frequencies and percentages are valid)

Describing univariate nominal data
Frequency
1. Frequency distribution: A table that shows the frequency of each outcome. In Excel, =COUNTIF([Input range], [Criteria]) counts the frequency of a particular value.
2. Bar chart: A chart that shows the frequency of each outcome
Relative frequency
3. Relative frequency distribution: A table that shows the relative frequency of each outcome
4. Pie chart: A chart that shows the relative frequency of each outcome

Comparing multivariate nominal data
1. Cross-classification table/cross-tabulation table: A table that shows the frequency of combinations of two variables
2. Relative cross-classification table/cross-tabulation table: A table that shows the relative frequency of combinations of two variables
3. Separate bar charts
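The frequency and relative frequency distributions described in this chapter can be sketched in Python as a rough counterpart to the Excel COUNTIF tip. The data set and the function names below are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical nominal data: preferred transport mode of ten respondents.
responses = ["bus", "train", "car", "bus", "bus",
             "car", "train", "bus", "car", "bus"]


def frequency_distribution(data):
    """Return {outcome: count}, like one COUNTIF per distinct outcome."""
    return dict(Counter(data))


def relative_frequency_distribution(data):
    """Return {outcome: count / n}; the values sum to 1."""
    n = len(data)
    return {outcome: count / n for outcome, count in Counter(data).items()}


freq = frequency_distribution(responses)
rel_freq = relative_frequency_distribution(responses)

# Print a small table: outcome, frequency, relative frequency.
for outcome in sorted(freq):
    print(f"{outcome:<6} {freq[outcome]:>3} {rel_freq[outcome]:>6.2f}")
```

A bar chart would plot the values of `freq`, and a pie chart would plot the values of `rel_freq`.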

Chapter 3 Graphical Descriptive Techniques II

Describing univariate interval data
1. Histogram: A chart with rectangles whose bases are the intervals and whose heights are the frequencies
   Symmetric: Mirrored on either side of the middle
   Positively skewed: With a tail to the right
   Negatively skewed: With a tail to the left
   Unimodal: With one peak
   Bimodal: With two peaks
   Bell-shaped: Symmetric and unimodal
2. Stem-and-leaf display: A table that separates place values
3. Relative frequency distribution: A table that shows the relative frequency of values
4. Cumulative relative frequency distribution: A table that cumulatively adds relative frequencies
5. Ogive: A chart that shows cumulative relative frequency

Describing time-series data
Line chart: A chart that plots a variable over time

Describing bivariate interval data
Scatter diagram: A chart that plots the observed combinations of two variables
   Linearity: linear/nonlinear/no relationship
   Direction: positive/negative
   Strength: strong/medium-strength/weak

Graphical excellence
1. Concise data
2. Clear ideas
3. Multivariate
4. Substance over form
5. No distortion

Graphical deception
1. Graphs without scale
2. Graphs with different captions
3. Stretching and shrinking graphs
4. Bar charts with changing widths
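As a minimal sketch of the histogram, relative frequency and cumulative relative frequency ideas above, the following Python code groups interval data into classes. The marks, the class boundaries and the helper name `class_frequencies` are assumptions made for the example.

```python
from itertools import accumulate

# Hypothetical interval data: ten exam marks.
marks = [52, 58, 61, 64, 67, 70, 73, 75, 81, 94]


def class_frequencies(data, lower, width, n_classes):
    """Count observations in each class [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        i = int((x - lower) // width)
        if 0 <= i < n_classes:  # observations outside the classes are ignored
            counts[i] += 1
    return counts


counts = class_frequencies(marks, lower=50, width=10, n_classes=5)
n = len(marks)
rel = [c / n for c in counts]                    # relative frequencies
cum_rel = [c / n for c in accumulate(counts)]    # points plotted on an ogive

# A text histogram: one row per class, one star per observation.
for i, c in enumerate(counts):
    low = 50 + 10 * i
    print(f"{low}-{low + 9}: {'*' * c}")
```

The last entry of `cum_rel` is always 1 when every observation falls inside one of the classes.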

Chapter 4 Numerical Descriptive Techniques

Measures of central location
1. 2. 3. 4. 5.

Variability
1.
2. Variance a. b. c.
3. Standard deviation a. b.
4.
5. Empirical rule (for bell-shaped distributions)
   a. Within one standard deviation of the mean: approximately 68% of observations
   b. Within two standard deviations of the mean: approximately 95% of observations
   c. Within three standard deviations of the mean: approximately 99.7% of observations
6. 7. 8.

Measures of relative standing
1. 2.
3. Box plots: A graph with a box and whiskers that shows the maximum, minimum, range, median, interquartile range and outliers.
4. Outliers: Unusually large or small observations

Measures of linear relationship
1. Covariance a. b. c.
2. Coefficient of correlation a. b.
3. a. b. c.
4. Coefficient of determination ($R^2$): how much of $y$'s variation is explained by $x$'s variation a. b.
5. Correlation is not causation!
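The measures named in this chapter can be illustrated with the standard textbook definitions. The sketch below assumes the sample versions (dividing by $n - 1$), which may not match every formula the notes originally listed; the data and function names are hypothetical.

```python
import math
import statistics

# Hypothetical paired sample data.
x = [2, 4, 4, 5, 7, 8]
y = [1, 3, 2, 5, 6, 7]


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(data) / len(data)
    return sum((v - mean) ** 2 for v in data) / (len(data) - 1)


def sample_covariance(xs, ys):
    """Sum of products of paired deviations, divided by n - 1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance scaled by both standard deviations; lies in [-1, 1]."""
    sx = math.sqrt(sample_variance(xs))
    sy = math.sqrt(sample_variance(ys))
    return sample_covariance(xs, ys) / (sx * sy)


print("mean:", statistics.mean(x))
print("median:", statistics.median(x))
print("mode:", statistics.mode(x))
print("variance:", sample_variance(x))
print("standard deviation:", math.sqrt(sample_variance(x)))
print("covariance:", sample_covariance(x, y))
r = correlation(x, y)
print("correlation:", r)
print("coefficient of determination:", r ** 2)
```

For a simple linear relationship, squaring the coefficient of correlation gives the coefficient of determination.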

Chapter 5 Data Collection and Sampling

Methods of collecting data
1. Primary data: Collected by the statistics practitioner for the current problem
2. Secondary data: Collected by someone else for another problem
3. Observation: Measuring actual behaviour
4. Experiments: Imposing treatments and measuring resultant behaviour
5. Surveys: Asking questions

Sampling
Target population: The population about which we want to draw inferences
Sampled population: The actual population from which the sample has been taken
Self-selected samples: Samples in which participants choose to participate and are thus more keenly interested in the issue than other members of the population

Sampling plans
1. Simple random sample: A sample selected so that every possible sample with the same number of observations is equally likely to be chosen
2. Stratified random sample: Dividing the population into mutually exclusive strata and then drawing simple random samples from each stratum
3. Cluster sample: Dividing the population into mutually exclusive clusters and then drawing simple random samples only from selected clusters

Sampling error
Sampling error: Differences between the sample and the population because of the observations that happened to be selected for the sample; it can be reduced by increasing the sample size

Nonsampling error
Nonsampling error: Differences between the sample and the population because of mistakes in data acquisition or improper selection of sample observations; it cannot be reduced by increasing the sample size
1. Errors in data acquisition (e.g. faulty equipment, inaccurate responses to sensitive questions)
2. Nonresponse error: When responses are not obtained from some members of the sample
3. Selection bias: When some members of the target population cannot possibly be selected for inclusion in the sample
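The three sampling plans can be sketched with Python's `random` module. This is only an illustration under assumed inputs: the population, the strata, the clusters and all function names are hypothetical.

```python
import random


def simple_random_sample(population, n, rng):
    """Every possible sample of size n is equally likely."""
    return rng.sample(population, n)


def stratified_random_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from EVERY stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly select some clusters, then sample only within those."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

population = list(range(1, 101))  # hypothetical population of 100 ID numbers
strata = {"first_year": list(range(1, 51)), "later_years": list(range(51, 101))}
clusters = {f"tutorial_{i}": list(range(10 * i + 1, 10 * i + 11)) for i in range(10)}

srs = simple_random_sample(population, 10, rng)
stratified = stratified_random_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 3, 4, rng)
print(srs, stratified, clustered, sep="\n")
```

Taking a larger sample with any of these plans reduces sampling error, but it does nothing about nonsampling errors such as selection bias.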

Chapter 6 Probability

Random experiment
Random experiment: An action or process that leads to one of several possible outcomes (e.g. Experiment: flipping a coin. Outcomes: heads or tails.)

Sample space
Sample space: A list of all possible outcomes of an experiment. The outcomes must be mutually exclusive.

Requirements of probabilities
1. The probability of any outcome must lie between 0 and 1: $0 \le P(O_i) \le 1$
2. The sum of the probabilities of all outcomes is 1: $\sum_{i=1}^{k} P(O_i) = 1$

Approaches to assigning probabilities
1. Classical approach: Probabilities in games of chance (e.g. flipping a coin, rolling dice)
2. Relative frequency approach: Probabilities are long-run relative frequencies (e.g. if the relative frequency of getting a distinction is 200/1000 students, $P(\text{distinction}) \approx 200/1000 = 0.2$)
3. Subjective approach: Probabilities are the degree of belief in the occurrence of an event (e.g. the probability that the price of a share will increase)

Events
Simple event: An individual outcome of a sample space (e.g. getting a mark of 80)
Event: A collection or set of one or more simple events in a sample space (e.g. the event of getting a distinction requires a mark of at least 80)
Probability of an event: The sum of the probabilities of the simple events that make up the event

Joint, marginal and conditional probability
1. Joint probability (intersection): The probability that both $A$ and $B$ occur: $P(A \text{ and } B)$
2. Marginal probability: Probabilities computed by adding across rows or down columns of a table of joint probabilities
3. Conditional probability: The probability of $A$ given $B$: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
4. Independent events: $P(A \mid B) = P(A)$ or $P(B \mid A) = P(B)$
5. Union: The probability that either $A$ or $B$ or both occur: $P(A \text{ or } B)$

Probability rules
1. Complement rule: The probability that $A$ does not occur: $P(A^C) = 1 - P(A)$
2. Multiplication rule: The joint probability of $A$ and $B$: $P(A \text{ and } B) = P(A \mid B)\,P(B)$
3. Multiplication rule for independent events: $P(A \text{ and } B) = P(A)\,P(B)$
4. Addition rule: The union of $A$ and $B$: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
5. Addition rule for mutually exclusive events: $P(A \text{ or } B) = P(A) + P(B)$
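Joint, marginal and conditional probabilities, together with the addition and complement rules, can be worked through on a small table of joint probabilities. The table values and variable names below are hypothetical; the rules applied are the standard ones for two events $A$ and $B$.

```python
import math

# Hypothetical joint probabilities for two events A and B.
joint = {
    ("A", "B"): 0.12, ("A", "not B"): 0.28,
    ("not A", "B"): 0.18, ("not A", "not B"): 0.42,
}

# Requirements of probabilities: each lies in [0, 1] and they sum to 1.
assert all(0 <= p <= 1 for p in joint.values())
assert math.isclose(sum(joint.values()), 1.0)

# Marginal probabilities: add across a row or down a column.
p_a = joint[("A", "B")] + joint[("A", "not B")]
p_b = joint[("A", "B")] + joint[("not A", "B")]

# Conditional probability: P(A | B) = P(A and B) / P(B).
p_a_given_b = joint[("A", "B")] / p_b

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - joint[("A", "B")]

# Complement rule: P(not A) = 1 - P(A).
p_not_a = 1 - p_a

# Independence check: P(A | B) equals P(A).
independent = math.isclose(p_a_given_b, p_a)

print(p_a, p_b, p_a_given_b, p_a_or_b, p_not_a, independent)
```

In this made-up table the events happen to be independent, so the joint probability also equals the product of the two marginals.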

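Chapter 7 Discrete Probability Distributions

Random variables
Random variable: A function or rule that assigns a number to each outcome of an experiment (e.g. when flipping a coin, the number of heads $X$)
Discrete random variable: Can assume only a countable number of values (whether finite or infinite)
Continuous random variable: Can assume any value within a specified range (e.g. time)
Probability distribution: A table, formula or graph that shows the probabilities of the values of a random variable

Discrete probability distributions
1. Requirements of discrete probability distributions
   a. $0 \le P(x) \le 1$ for all $x$
   b. $\sum_{\text{all } x} P(x) = 1$
2. 3. 4. 5.
6. Laws of expected value
   a. $E(c) = c$
   b. $E(X + c) = E(X) + c$
   c. $E(cX) = cE(X)$
7. Laws of variance
   a. $V(c) = 0$
   b. $V(X + c) = V(X)$
   c. $V(cX) = c^2 V(X)$

Bivariate distributions
1. Requirements for discrete bivariate distributions
   a. $0 \le P(x, y) \le 1$ for all pairs $(x, y)$
   b. $\sum_{\text{all } x} \sum_{\text{all } y} P(x, y) = 1$
2. 3. 4.
5. Laws of expected value of the sum of two variables
   a. $E(X + Y) = E(X) + E(Y)$
6. Laws of variance of the sum of two variables
   a. $V(X + Y) = V(X) + V(Y) + 2\,COV(X, Y)$
   b. If $X$ and $Y$ are independent, $COV(X, Y) = 0$ and $V(X + Y) = V(X) + V(Y)$
7. 8.

Binomial distributions
Requirements of binomial experiments:
1. Fixed number of trials $n$
2. Two outcomes: success and failure
3. Independent trials: the outcome of one trial does not affect the outcomes of other trials

Binomial probability distribution: $P(X = x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$, where $p$ is the probability of success on each trial
Probability that $X$ is at least $x$: $P(X \ge x) = 1 - P(X \le x - 1)$
Probability that $X$ equals $x$: $P(X = x) = P(X \le x) - P(X \le x - 1)$

Mean, variance and standard deviation:
1. $\mu = np$
2. $\sigma^2 = np(1 - p)$
3. $\sigma = \sqrt{np(1 - p)}$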
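A short Python sketch can tie the binomial distribution back to the general expected value and variance of a discrete distribution. It assumes the standard binomial formula with $n$ trials and success probability $p$; the values of `n` and `p` and the function names are hypothetical.

```python
from math import comb, isclose, sqrt


def binomial_pmf(x, n, p):
    """P(X = x): number of ways to place x successes times their probability."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): cumulative probability, as tabulated in binomial tables."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 10, 0.2  # hypothetical experiment: 10 trials, 20% chance of success

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Expected value and variance computed directly from the distribution.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

# "At least" and "equals" probabilities obtained from cumulative probabilities.
p_at_least_3 = 1 - binomial_cdf(2, n, p)
p_equals_3 = binomial_cdf(3, n, p) - binomial_cdf(2, n, p)

print(mean, variance, sqrt(variance), p_at_least_3, p_equals_3)
```

The directly computed mean and variance agree with the shortcuts $np$ and $np(1 - p)$, which is a useful check on hand calculations.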

Chapter 8 Continuous Probability Distributions

Requirements of probability density functions
1. The function is non-negative: $f(x) \ge 0$ for all $x$ in its range
2. The total area under the function is 1

Uniform distributions

Normal distributions
It is symmetric about the mean $\mu$. Increasing the standard deviation $\sigma$ widens the curve. A normal random variable is standardised by $Z = \frac{X - \mu}{\sigma}$; the standard normal distribution is symmetric about 0.

Exponential distribution
Increasing the parameter $\lambda$ of the distribution steepens the curve.

Student t distribution
It is symmetric about 0. It is flatter (more widely spread) than the standard normal distribution. Increasing the degrees of freedom narrows the curve.

Chi-squared distribution
Increasing the degrees of freedom flattens the curve.

F distribution
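Probabilities for continuous distributions are areas under the density function. As a hedged sketch, the code below assumes the standard forms of three of the distributions in this chapter: a uniform density of $1/(b - a)$ on $[a, b]$, a normal distribution with mean $\mu$ and standard deviation $\sigma$, and an exponential distribution with $P(X \le x) = 1 - e^{-\lambda x}$. The function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def uniform_prob(x1, x2, a, b):
    """P(x1 < X < x2) for X uniform on [a, b]: width of overlap times 1/(b - a)."""
    low, high = max(x1, a), min(x2, b)
    return max(high - low, 0) / (b - a)


def normal_prob(x1, x2, mu, sigma):
    """P(x1 < X < x2) for X normal: difference of two cumulative probabilities."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(x2) - dist.cdf(x1)


def exponential_cdf(x, lam):
    """P(X <= x) for X exponential with parameter lam."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0


print(uniform_prob(2, 5, a=0, b=10))
print(normal_prob(85, 115, mu=100, sigma=15))   # within one standard deviation
print(normal_prob(70, 130, mu=100, sigma=15))   # within two standard deviations
print(exponential_cdf(1, lam=2))
```

The two normal probabilities line up with the empirical rule: roughly 68% within one standard deviation of the mean and roughly 95% within two.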

Chapter 9 Sampling Distributions

Central Limit Theorem (CLT)
The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size.

Sampling distribution of the sample mean
1. 2. 3.

Normal approximation of binomial distributions
1. Binomial distributions are approximately normally distributed if: a. ; and b.
2. 3. 4.
5. The continuity correction factor is needed because binomial random variables are discrete whereas normal random variables are continuous:
   Binomial distribution | Normal distribution

Approximating sampling distribution of a sample proportion
1. $\hat{P}$ is approximately normally distributed if: a. b.
2. 3. 4.

Sampling distribution of the difference between two means
1. 2. 3.
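The effect of the continuity correction can be checked numerically. The sketch below assumes the usual approximation, in which a binomial variable with parameters $n$ and $p$ is approximated by a normal variable with mean $np$ and standard deviation $\sqrt{np(1 - p)}$, and in which $P(X \le x)$ is approximated by the normal area up to $x + 0.5$. The example values and function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(x, n, p):
    """Exact P(X <= x) for a binomial random variable."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_approx_cdf(x, n, p, continuity_correction=True):
    """Approximate P(X <= x) with a normal curve of matching mean and spread."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    cutoff = x + 0.5 if continuity_correction else x
    return NormalDist(mu, sigma).cdf(cutoff)


n, p, x = 20, 0.5, 12  # hypothetical: at most 12 heads in 20 fair coin flips

exact = binomial_cdf(x, n, p)
with_cc = normal_approx_cdf(x, n, p, continuity_correction=True)
without_cc = normal_approx_cdf(x, n, p, continuity_correction=False)

print(f"exact: {exact:.4f}")
print(f"normal approximation with correction: {with_cc:.4f}")
print(f"normal approximation without correction: {without_cc:.4f}")
```

Adding 0.5 treats the discrete value 12 as the interval from 11.5 to 12.5, which is why the corrected approximation lands much closer to the exact probability.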

Chapter 10 Introduction to Estimation

Point vs. interval estimators
1. Point estimators: Estimate a parameter using a single value or point
2. Interval estimators: Estimate a parameter using an interval

Properties of estimators
1. Unbiased: The expected value of the estimator equals the parameter: $E(\hat{\theta}) = \theta$
2. Consistent: As the sample size grows, the difference between the estimator and the parameter falls
3. Relatively efficient: An estimator is relatively more efficient if its variance is lower: $\hat{\theta}_1$ is relatively more efficient than $\hat{\theta}_2$ if $V(\hat{\theta}_1) < V(\hat{\theta}_2)$

Estimating population mean from standard deviation
1. 2. 3.

Estimating population mean from median

Sample size
1. 2.
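Interval estimation and sample size selection can be sketched under one common set of assumptions, which may differ from the formulas originally listed here: the population standard deviation $\sigma$ is known, the interval is $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, and the sample size needed to estimate the mean to within a bound $B$ is $n = (z_{\alpha/2}\,\sigma / B)^2$, rounded up. All names and numbers below are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """z value leaving alpha/2 in each tail of the standard normal distribution."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval estimate of the population mean, assuming sigma is known."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def required_sample_size(sigma, bound, confidence=0.95):
    """Smallest n whose interval half-width is at most `bound`."""
    return ceil((z_critical(confidence) * sigma / bound) ** 2)


lower, upper = mean_confidence_interval(sample_mean=100, sigma=15, n=36)
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
print("n needed for a bound of 3:", required_sample_size(sigma=15, bound=3))
```

Raising the confidence level or tightening the bound both increase the required sample size, since each widens or constrains the same half-width term.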

Chapter 11 Introduction to hypothesis testing

Chapter 12 Inference about a population

Chapter 13 Inference about comparing two populations

Chapter 14 Analysis of variance

Chapter 15 Chi-squared tests

Chapter 16 Simple linear regression and correlation

Chapter 17 Multiple regression
