Introduction to Statistics for Built Environment
Course Code: AED 1222
Compiled by: DEPARTMENT OF ARCHITECTURE AND ENVIRONMENTAL DESIGN (AED)
CENTRE FOR FOUNDATION STUDIES (CFS)
INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA
What is/are statistics?
The term statistics is commonly used in two ways:
1. Used as a plural term that refers to numerical facts or data.
Statistics: as a set of numerical facts or data about the usage of a particular website
Statistics: as a set of numerical facts or data about the availability of lands, and the ownership of minerals in the province of Alberta (Canada).
2. Used as a singular (broader) term that refers to the science of designing studies, gathering data, and then classifying, summarizing, interpreting, and presenting these data to explain and support the decisions that are reached.
In other words, the term statistics used as a singular term refers to a science or a field of study that covers the following activities:
• Designing studies
• Gathering data
• Analyzing the data
• Presenting the data
• Reaching a decision
Statistical Problem-solving
Managers, decision makers and researchers use statistical problem-solving procedures to help them make wise and effective decisions.
Step one: Identifying the problem
Step two: Gathering available facts
Step three: Gathering new data
Step four: Classifying and organizing the data
Step five: Presenting and analyzing data
Step six: Making a decision
Important terms used in the study of statistics
A population: the complete collection of objects or individuals under study.
A sample: a portion or subset taken from a population.
A parameter: a number that describes a population characteristic (such as weight or height…).
A statistic: a number that describes a sample characteristic.
A variable: a characteristic that can be expressed by a number. The value of this characteristic is likely to vary (change) from one item in the data set to the next, for example: age, gender, weight, height…
[Figure: a sample is selected from the population (sample selection); the sample is then used to make a judgment about the unknown population.]
The two essential parts of the subject of statistics
The subject of statistics can be viewed as a process that is broken down into two parts:
1. Descriptive statistics: covers the process of collecting, classifying, summarizing and presenting data.
2. Inferential statistics: refers to the process of arriving at a conclusion about a population based on information obtained from a sample.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc.
1. Descriptive Statistics
• Collect data
– e.g., Survey, Observation, Experiments
• Present data
– e.g., Charts and graphs
• Characterize data
– e.g., Sample mean: $\bar{x} = \frac{\sum x_i}{n}$
The process of collecting, classifying, summarizing and presenting data.
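As a small illustration of characterizing data, the sample mean (the sum of the observed values divided by the number of values, n) can be computed as in the sketch below. The function name `sample_mean` and the five weights are hypothetical values chosen for this example; they do not come from the course material.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    if not values:
        raise ValueError("a sample must contain at least one value")
    return sum(values) / len(values)


# Hypothetical sample of five weights, in kilograms.
weights_kg = [118.0, 121.5, 119.0, 122.5, 119.0]
mean_weight = sample_mean(weights_kg)
print(mean_weight)  # 600.0 / 5 = 120.0
```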
2. Inferential Statistics
• Estimation
– e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis Testing
– e.g., Use sample evidence to test the claim that the population mean weight is 120 kg.
Drawing conclusions and/or making decisions concerning a population based on sample results.
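The estimation idea can be sketched with a simulation. Everything in this example is an assumption made for illustration: the population of 10,000 weights is generated artificially (centred on 120 kg), and the sample size of 200 is arbitrary. In a real study the population mean would be unknown, and only the sample mean would be available.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch gives repeatable results

# Hypothetical population of 10,000 weights (kg); normally this is not observable.
population = [rng.gauss(120, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)  # a parameter

# Draw a random sample of 200 items and describe it.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)  # a statistic

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The sample mean will usually land close to the population mean without matching it exactly, which is why it serves as an estimate rather than an exact answer.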
• Making statements about a population by examining sample results
Sample statistics (known) → Inference → Population parameters (unknown, but can be estimated from sample evidence)
An overview of Descriptive Statistics and Inferential Statistics
[Flowchart: Start of the Study → Classification, summarization and processing of data → Presentation and communication of summarized information (Descriptive Statistics) → Use sample information to make inferences and draw conclusions about the population (Inferential Statistics), or use census data to analyze the population → End of the Study]
The need for sampling
Sampling (or taking samples) is an activity that occurs on a daily basis, and is not used by statisticians alone. Sampling in daily life may not be as sophisticated as sampling done for formal statistical studies. However, it still serves the fundamental purpose of providing information for judgement.
Examples of daily-life sampling include:
1. A chef tasting food to see if it has the desired flavor.
2. A car buyer test-driving a car to compare it with others…
3. Pieces of rock being analyzed to determine the availability of a certain mineral in an area.
Sampling is needed to provide sufficient information so that inferences can be made about the characteristics of a population.
A population can be either finite or infinite:
A finite population is one where the total number of members (items, measurements, etc…) is fixed and could be listed.
An infinite population has an unlimited number of members.
Examples:
• Computers installed in the labs of the Centre for Foundation Studies: finite
• Penguins inhabiting Antarctica: infinite
• Cars assembled in a given day: finite
• Fireflies population in Kuala Selangor: infinite
Here "infinite" is used in the practical sense: the members cannot realistically be counted or listed.
Sample data vs. Census data
Complete information acquired through a census is generally desirable. If every item in a population is examined, we can be confident in describing the population.
However:
• What you want is not necessarily what you can get.
• Census data are a luxury in most situations and are usually not available for studying a population.
Therefore:
• Data gathering by sampling (rather than census taking) is the rule rather than the exception because of the following sampling advantages:
Advantages of sampling
1. Cost: any data-gathering effort incurs costs for such things as mailings, interviews, and data processing. The more data to be handled, the higher the costs are likely to be.
2. Time: speed in decision making is often crucial, and carrying out a census is too time-consuming… an example?
3. Accuracy of sampling: sometimes a small sample provides information that’s almost as accurate as the results obtained from a complete census. How? There are sampling methods that produce samples that are highly representative of the population, and in such cases, larger samples will not produce results that are significantly more accurate.
4. Other advantages: sometimes the resources may be available for a census, but the nature of the population requires a sample.
For example, an environmental protection agency is willing to sponsor a study of the entire population of a certain whale species, but the migration movement, births and deaths can prevent a complete count. One way to solve this problem is to study a small area of the ocean and use the results to make a projection (inference).
In addition to the above, destructive tests are often used to judge product quality. For example, a car manufacturer may want to know the safety of one of their vehicles… what would the manufacturer do?
Sampling Techniques
There are many ways that samples can be selected, but they are categorized into 2 types:
• Non-probability / non-statistical sampling techniques: Judgment, Voluntary, Convenience
• Probability / statistical sampling techniques: Simple Random, Systematic, Stratified, Cluster
Non-probability sampling techniques
A non-probability sample is one in which the judgment of the experimenter, the method in which the data are collected, or other factors could affect the results of the sample.
Items of the sample are not chosen by a known chance mechanism, so their probabilities of selection are unknown.
The interpretation of such samples is always questionable: “Was the discovery from the sample true? Or was it just the result of the way the sample was taken?”
There are 3 common types of non-probability samples:
• Judgment
• Voluntary
• Convenience
Judgment sampling technique
Judgment sample: selection based on the opinion of one or more persons who feel sufficiently qualified to identify items for a sample as being representative of the population. Any sample will be taken based on someone’s expertise about the population.
For example, a politician picks certain voting districts as reliable places to measure the public’s opinion of his/her political party…
A judgment sample is convenient, but it’s difficult to assess how closely it measures reality. However, it can still be useful depending on the expertise of the person(s) involved in determining the sample.
Voluntary sampling technique
Sometimes questions or questionnaires are distributed to the public by publishing them in print media, the internet or radio/television. Such questionnaires or polls produce voluntary samples and attract only those who are interested in the subject matter…
Obviously, results obtained from such samples are unreliable… Why?
Convenience sampling technique
Often people want to take an “easy” sample. Samples chosen for the ease with which they can be taken are called convenience samples. For example, a surveyor will stand in one location and ask passersby their question or questions. A student working on a project will ask an entire class to fill out a survey. Would standing outside a bank asking customers what they thought of the bank’s services give a complete picture? Why?
Probability sampling techniques
A probability sample is one in which the chance of selection of each item in the population is known or calculable before the sample is picked.
Probability samples are more reliable than Non-Probability samples.
There are 4 common types of Probability samples that are used to gather new data:
• Simple Random
• Systematic
• Stratified
• Cluster
Simple random sampling technique
If a probability sample is chosen in such a way that each item in the population has an equal chance of being selected, then the sample is called a simple random sample.
Assume that every item in a population is numbered, and each number is written on a slip of paper. If all the slips are placed in a bowl and mixed, and if a group of slips is then picked, the items represented by the selected slips constitute a simple random sample.
A more practical approach is often to number the items in the population, and use a computer programme to generate a table of random numbers, and then select a sample from that list…
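The computer-based approach can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the helper name `simple_random_sample`, the population of 1000 numbered items and the sample size of 50 are all assumptions made for the example. It relies on `random.Random.sample` from the Python standard library, which draws items without replacement.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every item has the same chance of selection."""
    rng = random.Random(seed)  # pass a seed only when a repeatable draw is wanted
    return rng.sample(population, n)


# Hypothetical population: items numbered 1 to 1000.
items = list(range(1, 1001))
sample = simple_random_sample(items, 50, seed=1)
print(sorted(sample))
```

This plays the same role as mixing numbered slips in a bowl: no item can be drawn twice, and no item is favoured over another.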
Systematic sampling technique
Suppose we have a list of 1000 items in a finite population, and that we want to pick a probability sample of 50 items.
We first have to number the items from 1 to 1000. Then, we can use a random number table to pick one of the first 20 items (1000/50 = 20) on our list.
If the table of random numbers gives us number 16, then the 16th item in the list will be the first to be selected. We would then pick every 20th item after this random start (the 36th item, the 56th item etc…) to produce a systematic sample.
• Decide on sample size: n
• Divide frame of N individuals into groups of k individuals: k = N/n
• Randomly select one individual from the 1st group
• Select every kth individual thereafter
Example: N = 64, n = 8, k = 8 (the first group holds the first 8 individuals)
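The systematic procedure can be sketched as follows. The function `systematic_sample` and its parameters are hypothetical names introduced for this illustration. The sketch assumes the sampling frame is an ordered list, uses integer division for k, and lets the random start be fixed through `start` so that the worked example with a start of 16 can be reproduced.

```python
import random


def systematic_sample(items, n, start=None, seed=None):
    """Pick one item from the first group of k items, then every k-th item after it."""
    population_size = len(items)
    if not 0 < n <= population_size:
        raise ValueError("n must be between 1 and the population size")
    k = population_size // n  # group size, k = N / n
    if start is None:
        start = random.Random(seed).randint(1, k)  # random position in the 1st group
    elif not 1 <= start <= k:
        raise ValueError("start must fall inside the first group")
    # Positions are 1-based in the text, list indexes are 0-based.
    return [items[start - 1 + i * k] for i in range(n)]


# Items numbered 1 to 1000, sample of 50, so k = 20; random start assumed to be 16.
picked = systematic_sample(list(range(1, 1001)), 50, start=16)
print(picked[:3])  # [16, 36, 56]
```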
Stratified sampling technique
If a population is divided into homogeneous groups (or strata), and then a sample is drawn from each group to produce an overall sample, this overall sample is known as a stratified sample.
We can stratify data based on race (in the case of Malaysia), gender, employment category, type of programme (1 year vs. 1.5 years at CFS) and so on…
Some prior knowledge of the structure of the population is necessary for selecting a stratified sample.
• Divide population into subgroups (called strata) according to some common characteristic
• Select a simple random sample from each subgroup
• Combine samples from subgroups into one
[Figure: population divided into 4 strata, with the sample drawn from every stratum]
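The three stratified sampling steps can be sketched as below. The function `stratified_sample`, the student records and the choice of five students per stratum are assumptions made for illustration. The sketch takes the same number of items from each stratum; drawing in proportion to stratum size is another common choice that it does not cover.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_of, per_stratum, seed=None):
    """Group records into strata, then draw a simple random sample from each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)  # step 1: divide into strata
    combined = []
    for name in sorted(strata):
        members = strata[name]
        size = min(per_stratum, len(members))
        combined.extend(rng.sample(members, size))  # step 2: sample each stratum
    return combined  # step 3: one combined sample


# Hypothetical students, stratified by type of programme.
students = [(i, "1 year" if i % 3 else "1.5 years") for i in range(1, 91)]
sample = stratified_sample(students, lambda s: s[1], per_stratum=5, seed=7)
print(sample)
```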
Cluster sampling technique
A cluster sample is one in which the individual units to be sampled are actually groups or clusters of items. It’s assumed that the individual items within each cluster are representative of the population. Consumer surveys of large cities often employ cluster sampling. Usually a city is divided into small blocks, each block containing a cluster of households to be surveyed. A number of clusters are selected for the sample, and all the households in the selected clusters are surveyed. The benefit of this sampling method is a saving in cost and time, since an interviewer will need less energy and money if he/she stays within a specific area rather than travelling across the city…
• Divide population into several “clusters,” each representative of the population
• Select a simple random sample of clusters
– All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique
[Figure: population divided into 16 clusters, with randomly selected clusters making up the sample]
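Cluster sampling can be sketched in the same way. The function `cluster_sample`, the 16 city blocks of five households each, and the choice of four blocks are hypothetical values made up for this example. The sketch surveys every household in the selected blocks, which is the first of the two options listed above.

```python
import random


def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly select whole clusters; every item in a chosen cluster is kept."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}


# Hypothetical city divided into 16 blocks, each holding 5 households.
blocks = {
    f"block-{b:02d}": [f"household-{b:02d}-{h}" for h in range(1, 6)]
    for b in range(1, 17)
}
selected = cluster_sample(blocks, 4, seed=3)
surveyed = [h for households in selected.values() for h in households]
print(sorted(selected))
print(len(surveyed))  # 4 blocks x 5 households = 20
```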