Introduction to Statistics for Built Environment Course Code: AED 1222 Compiled by DEPARTMENT OF ARCHITECTURE AND ENVIRONMENTAL DESIGN (AED) CENTRE FOR FOUNDATION STUDIES (CFS) INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA

Aed1222 lesson 1 and 3

Page 1: Aed1222 lesson 1 and 3

Introduction to Statistics for Built Environment

Course Code: AED 1222

Compiled by DEPARTMENT OF ARCHITECTURE AND ENVIRONMENTAL DESIGN (AED)

CENTRE FOR FOUNDATION STUDIES (CFS) INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA

Page 2: Aed1222 lesson 1 and 3

What is/are statistics?

The term statistics is commonly used in two ways:

1. Used as a plural term that refers to numerical facts or data.

Page 3: Aed1222 lesson 1 and 3

Statistics: as a set of numerical facts or data about the usage of a particular website

Page 4: Aed1222 lesson 1 and 3

Statistics: as a set of numerical facts or data about the availability of lands, and the ownership of minerals in the province of Alberta (Canada).

Page 5: Aed1222 lesson 1 and 3

2. Used as a singular (broader) term that refers to the science of designing studies, gathering data, and then classifying, summarizing, interpreting, and presenting these data to explain and support the decisions that are reached.

In other words, the term statistics used as a singular term refers to a science or a field of study that covers the following activities:

• Designing studies
• Gathering data
• Analyzing the data
• Presenting the data
• Reaching a decision

Page 6: Aed1222 lesson 1 and 3

Statistical Problem-solving

Managers, decision makers and researchers use statistical problem-solving procedures to help them make wise and effective decisions.

Page 7: Aed1222 lesson 1 and 3

Statistical Problem-solving cont.

Step one: Identifying the problem

Step two: Gathering available facts

Step three: Gathering new data

Step four: Classifying and organizing the data

Step five: Presenting and analyzing data

Step six: Making a decision

Page 8: Aed1222 lesson 1 and 3

Important terms used in the study of statistics

A population is the complete collection of objects or individuals under study.

A sample is a portion or subset taken from a population.

A parameter is a number that describes a population characteristic (such as weight or height…).

A statistic is a number that describes a sample characteristic.

A variable is a characteristic that can be expressed by a number. The value of this characteristic is likely to vary (change) from one item in the data set to the next, for example: age, gender, weight, height…
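The terms above can be made concrete with a minimal Python sketch. The heights are made-up numbers used only for illustration, and the names `population` and `sample` are assumptions rather than part of the course material.

```python
import random
from statistics import mean

# Hypothetical population: heights (cm) of all 10 students in a small class.
population = [150, 155, 160, 162, 165, 168, 170, 172, 175, 183]

# Parameter: a number that describes the whole population.
population_mean = mean(population)

# Sample: a subset of the population (here, 4 students picked at random).
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = rng.sample(population, 4)

# Statistic: a number that describes the sample.
sample_mean = mean(sample)

print("Population mean (parameter):", population_mean)
print("Sample:", sample)
print("Sample mean (statistic):", sample_mean)
```

The variable here is height: its value changes from one student to the next.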

Page 9: Aed1222 lesson 1 and 3

[Figure: a sample is selected from the population (sample selection); the sample is then used to make a judgment about the unknown population.]

Page 10: Aed1222 lesson 1 and 3

The two essential parts of the subject of statistics

The subject of statistics can be viewed as a process that is broken down into two parts:

1. Descriptive statistics: covers the process of collecting, classifying, summarizing and presenting data.

2. Inferential statistics: refers to the process of arriving at a conclusion about a population based on information obtained from a sample.

Page 11: Aed1222 lesson 1 and 3

Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc.

1. Descriptive Statistics

The process of collecting, classifying, summarizing and presenting data.

• Collect data (e.g., survey, observation, experiments)
• Present data (e.g., charts and graphs)
• Characterize data (e.g., sample mean $\bar{x} = \frac{\sum x_i}{n}$)
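As a rough illustration of characterizing and presenting data, the sketch below computes a sample mean and a simple frequency display. The room counts are invented for the example, and the text "chart" is only a stand-in for a real graph.

```python
from collections import Counter
from statistics import median

# Hypothetical survey data: number of rooms in 8 sampled houses.
rooms = [3, 4, 4, 5, 3, 4, 6, 3]

n = len(rooms)
sample_mean = sum(rooms) / n   # sample mean = (sum of all x_i) / n
frequency = Counter(rooms)     # classify: how often each value occurs

print("n =", n)
print("sample mean =", sample_mean)
print("median =", median(rooms))
for value in sorted(frequency):
    # Simple text "chart": one star per observation.
    print(value, "*" * frequency[value])
```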

Page 12: Aed1222 lesson 1 and 3

2. Inferential Statistics

Drawing conclusions and/or making decisions concerning a population based on sample results.

• Estimation (e.g., estimate the population mean weight using the sample mean weight)
• Hypothesis testing (e.g., use sample evidence to test the claim that the population mean weight is 120 kg)
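Both ideas can be sketched in a few lines of Python using only the standard library. The nine weights below are invented, and the one-sample t statistic is one common way to test such a claim; the slides do not say which test the course uses.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 9 weights (kg); the numbers are invented.
weights = [112, 118, 121, 115, 119, 117, 123, 114, 123]

claimed_mean = 120      # the claim being tested: population mean = 120 kg
n = len(weights)
x_bar = mean(weights)   # estimation: point estimate of the population mean
s = stdev(weights)      # sample standard deviation (n - 1 divisor)

# One-sample t statistic: how many standard errors x_bar lies from the claim.
t = (x_bar - claimed_mean) / (s / sqrt(n))

print("Estimated population mean:", x_bar)
print("t statistic:", round(t, 2))
```

The t value would then be compared with a critical value from a t table with n - 1 = 8 degrees of freedom; that step is left out of this sketch.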

Page 13: Aed1222 lesson 1 and 3

2. Inferential Statistics

• Making statements about a population by examining sample results

[Figure: sample statistics (known) → inference → population parameters (unknown, but can be estimated from sample evidence)]

Page 14: Aed1222 lesson 1 and 3

An overview of Descriptive Statistics and Inferential Statistics

[Figure: flowchart. Start of the study → classification, summarization and processing of data → presentation and communication of summarized information (descriptive statistics) → either use sample information to make inferences and draw conclusions about the population (inferential statistics), or use census data to analyze the population → end of the study.]

Page 15: Aed1222 lesson 1 and 3

The need for sampling

Sampling (or taking samples) is an activity that occurs on a daily basis, and is not used by statisticians alone. Sampling in daily life may not be as sophisticated as sampling done for formal statistical studies. However, it still serves the fundamental purpose of providing information for judgement.

Examples of daily-life sampling include:
1. A chef tasting food to see if it has the desired flavor.
2. A car buyer test-driving a car to compare it with others…
3. Pieces of rock being analyzed to determine the availability of a certain mineral in an area.

Page 16: Aed1222 lesson 1 and 3

The need for sampling cont.

Sampling is needed to provide sufficient information so that inferences can be made about the characteristics of a population.

A population can be either finite or infinite:

A finite population is one where the total number of members (items, measurements, etc…) is fixed and could be listed.

An infinite population has an unlimited number of members.

Page 17: Aed1222 lesson 1 and 3

Computers installed in the labs of the Centre for Foundation Studies

Finite

Page 18: Aed1222 lesson 1 and 3

Penguins inhabiting Antarctica

Infinite

Page 19: Aed1222 lesson 1 and 3

Cars assembled in a given day

Finite

Page 20: Aed1222 lesson 1 and 3

Fireflies population in Kuala Selangor

Infinite

Page 21: Aed1222 lesson 1 and 3

Sample data vs. Census data

Complete information acquired through a census is generally desirable. If every item in a population is examined, we can be confident in describing the population.

However:
• What you want is not necessarily what you can get.
• Census data are a luxury in most situations and are usually not available for studying a population.

Therefore:
• Data gathering by sampling (rather than census taking) is the rule rather than the exception because of the following sampling advantages:

Page 22: Aed1222 lesson 1 and 3

Advantages of sampling

1. Cost: any data-gathering effort incurs costs for such things as mailings, interviews, and data processing. The more data to be handled, the higher the costs are likely to be.

2. Time: speed in decision making is often crucial, and carrying out a census is too time-consuming… an example?

3. Accuracy of sampling: sometimes a small sample provides information that’s almost as accurate as the results obtained from a complete census. How? There are sampling methods that produce samples that are highly representative of the population, and in such cases, larger samples will not produce results that are significantly more accurate.

Page 23: Aed1222 lesson 1 and 3

Advantages of sampling cont.

4. Other advantages: sometimes the resources may be available for a census, but the nature of the population requires a sample.

For example, an environmental protection agency is willing to sponsor a study of the entire population of a certain whale species, but migration, births and deaths can prevent a complete count. One way to solve this problem is to study a small area of the ocean and use the results to make a projection (inference).

In addition to the above, destructive tests are often used to judge product quality. For example, a car manufacturer may want to know the safety of one of its vehicles… what would the manufacturer do?
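Point 3 above (accuracy of sampling) can be illustrated with the standard error of the sample mean, σ/√n, which applies to simple random samples drawn from a large population. The value σ = 10 is an assumed figure chosen only to keep the arithmetic easy.

```python
from math import sqrt

sigma = 10.0  # assumed population standard deviation (illustrative)


def standard_error(sigma, n):
    """Standard error of the sample mean for a simple random sample of size n."""
    return sigma / sqrt(n)


for n in (100, 400, 1600):
    print(n, standard_error(sigma, n))
```

Under these assumptions, quadrupling the sample size only halves the standard error, so very large samples add cost faster than they add accuracy.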

Page 24: Aed1222 lesson 1 and 3

Sampling Techniques

There are many ways that samples can be selected, but they are categorized into 2 types:

1. Non-probability / non-statistical sampling techniques:
• Judgment
• Voluntary
• Convenience

2. Probability / statistical sampling techniques:
• Simple Random
• Systematic
• Stratified
• Cluster

Page 25: Aed1222 lesson 1 and 3

Non-probability sampling techniques

A non-probability sample is one in which the judgment of the experimenter, the method in which the data are collected, or other factors could affect the results of the sample.

Items in the sample are chosen with unknown probabilities of selection.

The interpretation of such samples is always questionable: “Was the discovery from the sample true? Or was it just the result of the way the sample was taken?”

There are 3 common types of non-probability samples:

Page 26: Aed1222 lesson 1 and 3

Non-Probability sampling techniques cont…

[Figure: non-probability (non-statistical) sampling branches into judgment, voluntary and convenience sampling.]

Page 27: Aed1222 lesson 1 and 3

Judgment sampling technique

In a judgment sample, selection is based on the opinion of one or more persons who feel sufficiently qualified to identify items for a sample as being representative of the population. Any sample will be taken based on someone’s expertise about the population.

For example, a politician picks certain voting districts as reliable places to measure the public’s opinion of his/her political party…

A judgment sample is convenient, but it is difficult to assess how closely it measures reality. However, it can still be useful depending on the expertise of the person(s) involved in determining the sample.

Page 28: Aed1222 lesson 1 and 3

Voluntary sampling technique

Sometimes questions or questionnaires are distributed to the public by publishing them in print media, the internet or radio/television. Such questionnaires or polls produce voluntary samples and attract only those who are interested in the subject matter…

Obviously, results obtained from such samples are unreliable… Why?

Page 29: Aed1222 lesson 1 and 3

Convenience sampling technique

Often people want to take an “easy” sample. Samples chosen for the ease with which they can be taken are called convenience samples.

For example, a surveyor will stand in one location and ask passersby their question or questions. A student working on a project will ask an entire class to fill out a survey.

Would standing outside a bank asking customers what they thought of the bank’s services give a complete picture? Why?

Page 30: Aed1222 lesson 1 and 3

Probability sampling techniques

A probability sample is one in which the chance of selection of each item in the population is known or calculable before the sample is picked.

Probability samples are more reliable than Non-Probability samples.

There are 4 common types of Probability samples that are used to gather new data:

Page 31: Aed1222 lesson 1 and 3

Probability sampling techniques cont…

[Figure: probability (statistical) sampling branches into simple random, systematic, stratified and cluster sampling.]

Page 32: Aed1222 lesson 1 and 3

Simple random sampling technique

If a probability sample is chosen in such a way that each item in the population has an equal chance of being selected, then the sample is called a simple random sample.

Assume that every item in a population is numbered, and each number is written on a slip of paper. If all the slips are placed in a bowl and mixed, and if a group of slips is then picked, the items represented by the selected slips constitute a simple random sample.

A more practical approach is often to number the items in the population, and use a computer programme to generate a table of random numbers, and then select a sample from that list…
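A minimal sketch of the computer-based approach, assuming the population is simply the item numbers 1 to 1000 and that 50 items are wanted (both figures are illustrative). Python's `random.sample` draws without replacement, so no item is picked twice.

```python
import random

# Hypothetical population: items numbered 1 to 1000.
population = list(range(1, 1001))

rng = random.Random(42)              # seeded only to make the run repeatable
sample = rng.sample(population, 50)  # draw 50 items without replacement

print(sorted(sample)[:10])
```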

Page 33: Aed1222 lesson 1 and 3

Systematic sampling technique

Suppose we have a list of 1000 items in a finite population, and that we want to pick a probability sample of 50 items.

We first have to number the items from 1 to 1000. Then, we can use a random number table to pick one of the first 20 items (1000/50 = 20) on our list.

If the table of random numbers gives us number 16, then the 16th item in the list will be the first to be selected. We would then pick every 20th item after this random start (the 36th item, the 56th item etc…) to produce a systematic sample.

Page 34: Aed1222 lesson 1 and 3

Systematic sampling technique

• Decide on sample size: n
• Divide frame of N individuals into groups of k individuals: k = N/n
• Randomly select one individual from the 1st group
• Select every kth individual thereafter

[Figure: N = 64, n = 8, k = 8; one individual is selected from the first group of 8, then every 8th individual thereafter.]
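The steps above can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes N is an exact multiple of n, as in the 1000-item example.

```python
import random


def systematic_sample(items, n, rng=random):
    """Pick a random start in the first group of k items, then every k-th item."""
    k = len(items) // n       # sampling interval k = N / n
    start = rng.randrange(k)  # 0-based position within the first group
    return items[start::k][:n]


items = list(range(1, 1001))  # items numbered 1 to 1000
sample = systematic_sample(items, 50, random.Random(0))
print(sample[:3])
```

If the random start happens to be the 16th item, the same slicing yields items 16, 36, 56 and so on, which matches the worked example with 1000 items.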

Page 35: Aed1222 lesson 1 and 3

Stratified sampling technique

If a population is divided into homogeneous groups (or strata), and then a sample is drawn from each group to produce an overall sample, this overall sample is known as a stratified sample.

We can stratify data based on race (in the case of Malaysia), gender, employment category, type of programme (1 year vs. 1.5 years at CFS) and so on…

Some prior knowledge of the structure of the population is necessary for selecting a stratified sample.

Page 36: Aed1222 lesson 1 and 3

Stratified sampling technique

• Divide population into subgroups (called strata) according to some common characteristic
• Select a simple random sample from each subgroup
• Combine samples from subgroups into one

[Figure: population divided into 4 strata; a sample is drawn from each stratum.]
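A minimal sketch of these three steps, assuming two strata based on programme length and a 10% sample from each stratum. The student IDs, the function name `stratified_sample` and the proportional allocation are all assumptions; the slides do not say how many items to take per stratum.

```python
import random


def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = []
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))  # combine into one sample
    return sample


# Hypothetical strata: student IDs grouped by programme length.
strata = {
    "1 year": [f"A{i}" for i in range(60)],
    "1.5 years": [f"B{i}" for i in range(40)],
}
sample = stratified_sample(strata, 0.10, random.Random(7))
print(sample)
```

With 60 and 40 students in the two strata, the combined sample holds 6 and 4 students respectively, so each group is represented in proportion to its size.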

Page 37: Aed1222 lesson 1 and 3

Cluster sampling technique

A cluster sample is one in which the individual units to be sampled are actually groups or clusters of items. It is assumed that the individual items within each cluster are representative of the population.

Consumer surveys of large cities often employ cluster sampling. Usually a city is divided into small blocks, each block containing a cluster of households to be surveyed. A number of clusters are selected for the sample, and all the households in the selected clusters are surveyed.

The benefit of this sampling method is savings in cost and time, since an interviewer will need less energy and money if he/she stays within a specific area rather than travelling across the city…

Page 38: Aed1222 lesson 1 and 3

Cluster sampling technique

• Divide population into several “clusters,” each representative of the population
• Select a simple random sample of clusters
  – All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique

[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample.]
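The procedure can be sketched in Python as below, assuming a made-up city of 16 blocks with 5 households each and a sample of 4 blocks. The block and household labels are hypothetical, and every household in a selected block is surveyed.

```python
import random

# Hypothetical city: 16 blocks (clusters), each holding 5 household IDs.
clusters = {
    block: [f"block{block}-house{h}" for h in range(1, 6)]
    for block in range(1, 17)
}

rng = random.Random(3)
# Select a simple random sample of 4 of the 16 clusters.
chosen_blocks = rng.sample(sorted(clusters), 4)

# Survey every household in the selected clusters.
sample = [house for block in chosen_blocks for house in clusters[block]]

print(chosen_blocks)
print(len(sample), "households surveyed")
```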

Page 39: Aed1222 lesson 1 and 3