Aed1222 lesson 1 and 3


Introduction to Statistics for Built Environment

Course Code: AED 1222

Compiled by DEPARTMENT OF ARCHITECTURE AND ENVIRONMENTAL DESIGN (AED)

CENTRE FOR FOUNDATION STUDIES (CFS), INTERNATIONAL ISLAMIC UNIVERSITY MALAYSIA

What is/are statistics?

The term statistics is commonly used in two ways:

1. Used as a plural term that refers to numerical facts or data. For example:

Statistics: a set of numerical facts or data about the usage of a particular website.

Statistics: a set of numerical facts or data about the availability of lands and the ownership of minerals in the province of Alberta (Canada).

2. Used as a singular (broader) term that refers to the science of designing studies, gathering data, and then classifying, summarizing, interpreting, and presenting these data to explain and support the decisions that are reached.

In other words, the term statistics used as a singular term refers to a science or a field of study that covers the following activities:

1. Designing studies
2. Gathering data
3. Analyzing the data
4. Presenting the data
5. Reaching a decision

Statistical Problem-solving

Managers, decision makers and researchers use statistical problem-solving procedures to help them make wise and effective decisions. The procedure has six steps:

Step one: Identifying the problem

Step two: Gathering available facts

Step three: Gathering new data

Step four: Classifying and organizing the data

Step five: Presenting and analyzing data

Step six: Making a decision

Important terms used in the study of statistics

A population is the complete collection of objects or individuals under study.

A sample is a portion or subset taken from a population.

A parameter is a number that describes a population characteristic (such as weight or height).

A statistic is a number that describes a sample characteristic.

A variable is a characteristic that can be expressed by a number. The value of this characteristic is likely to vary (change) from one item in the data set to the next, for example: age, gender, weight, height.
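The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of 500 heights is randomly generated, and the names `population`, `sample`, `population_mean` and `sample_mean` are hypothetical, not part of the course material.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: heights (cm) of 500 students, generated for illustration.
population = [rng.gauss(165, 8) for _ in range(500)]

# Parameter: a number that describes the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population (here, 30 members picked at random).
sample = rng.sample(population, 30)

# Statistic: a number that describes the sample.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

In a real study the parameter is usually unknown, because the whole population cannot be measured; only the statistic can be computed.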

[Figure: a sample is selected from the population (sample selection); the sample is then used to make a judgment about the unknown population.]

The two essential parts of the subject of statistics

The subject of statistics can be viewed as a process that is broken down into two parts:

1. Descriptive statistics: covers the process of collecting, classifying, summarizing and presenting data.

2. Inferential statistics: refers to the process of arriving at a conclusion about a population based on information obtained from a sample.

Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc.

1. Descriptive Statistics

The process of collecting, classifying, summarizing and presenting data.

• Collect data, e.g., surveys, observation, experiments

• Present data, e.g., charts and graphs

• Characterize data, e.g., the sample mean $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
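As a worked illustration of the sample mean (the sum of the observations divided by the number of observations), the sketch below uses four made-up weights. The values and variable names are assumptions chosen for the example.

```python
# Hypothetical sample of four weights in kg (illustrative values only).
weights = [60, 72, 68, 80]

n = len(weights)                # number of observations in the sample
sample_mean = sum(weights) / n  # sum of the observations divided by n

print(sample_mean)  # 70.0
```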

2. Inferential Statistics

Drawing conclusions and/or making decisions concerning a population based on sample results.

• Estimation, e.g., estimate the population mean weight using the sample mean weight

• Hypothesis testing, e.g., use sample evidence to test the claim that the population mean weight is 120 kg
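Both ideas can be sketched with a small, made-up sample of weights. The slides do not name a specific test, so the one-sample t statistic used here is an assumption, and the ten sample values are hypothetical.

```python
import math
import statistics

# Hypothetical sample of 10 weights in kg (made up for illustration).
sample = [118, 123, 121, 117, 125, 119, 122, 120, 124, 116]

n = len(sample)
x_bar = statistics.mean(sample)  # estimation: point estimate of the population mean
s = statistics.stdev(sample)     # sample standard deviation (n - 1 in the denominator)

# Hypothesis testing: how far is the sample mean from the claimed 120 kg,
# measured in standard errors? (One-sample t statistic, assumed here.)
claimed_mean = 120
t = (x_bar - claimed_mean) / (s / math.sqrt(n))

print(f"estimated population mean: {x_bar} kg")
print(f"t statistic: {t:.2f}")
```

A value of t close to zero means the sample gives little evidence against the claim. The decision is normally made by comparing |t| with a critical value from a t table (about 2.262 for 9 degrees of freedom at the 5% level, two-sided).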

• Making statements about a population by examining sample results

[Figure: Sample → Population. Sample statistics (known) → inference → population parameters (unknown, but can be estimated from sample evidence).]

An overview of Descriptive Statistics and Inferential Statistics

[Flowchart: Start of the study → classification, summarization and processing of data → presentation and communication of summarized information (descriptive statistics) → either use sample information to make inferences and draw conclusions about the population (inferential statistics), or use census data to analyze the population → End of the study.]

The need for sampling

Sampling (or taking samples) is an activity that occurs on a daily basis, and is not used by statisticians alone. Sampling in daily life may not be as sophisticated as sampling done for formal statistical studies. However, it still serves the fundamental purpose of providing information for judgement.

Examples of daily-life sampling include:

1. A chef tasting food to see if it has the desired flavor.
2. A car buyer test-driving a car to compare it with others.
3. Pieces of rock being analyzed to determine the availability of a certain mineral in an area.

Sampling is needed to provide sufficient information so that inferences can be made about the characteristics of a population.

A population can be either finite or infinite:

A finite population is one where the total number of members (items, measurements, etc.) is fixed and could be listed.

An infinite population has an unlimited number of members.

Examples:

Computers installed in the labs of the Centre for Foundation Studies: finite

Penguins inhabiting Antarctica: infinite

Cars assembled in a given day: finite

Fireflies population in Kuala Selangor: infinite

In practice, a population whose members cannot realistically be counted or listed is treated as infinite.

Sample data vs. Census data

Complete information acquired through a census is generally desirable. If every item in a population is examined, we can be confident in describing the population.

However:

• What you want is not necessarily what you can get.
• Census data are a luxury in most situations and are usually not available for studying a population.

Therefore:

• Data gathering by sampling (rather than census taking) is the rule rather than the exception because of the following sampling advantages:

Advantages of sampling

1. Cost: any data-gathering effort incurs costs for such things as mailings, interviews, and data processing. The more data to be handled, the higher the costs are likely to be.

2. Time: speed in decision making is often crucial, and carrying out a census is too time-consuming… an example?

3. Accuracy of sampling: sometimes a small sample provides information that’s almost as accurate as the results obtained from a complete census. How? There are sampling methods that produce samples that are highly representative of the population, and in such cases, larger samples will not produce results that are significantly more accurate.

4. Other advantages: sometimes the resources may be available for a census, but the nature of the population requires a sample.

For example, an environmental protection agency is willing to sponsor a study of the entire population of a certain whale species, but the migration movement, births and deaths can prevent a complete count. One way to solve this problem is to study a small area of the ocean and use the results to make a projection (inference).

In addition to the above, destructive tests are often used to judge product quality. For example, a car manufacturer may want to know the safety of one of their vehicles… what would the manufacturer do?
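Advantage 3 above (accuracy of sampling) can be explored with a small simulation. This is a sketch under stated assumptions: the population of 100,000 values is randomly generated, simple random sampling is used, and `sampling_error` is a hypothetical helper written for this example.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 measurements (illustrative values only).
population = [rng.uniform(0, 100) for _ in range(100_000)]
census_mean = statistics.fmean(population)  # what a full census would report

def sampling_error(n):
    """Absolute gap between a simple random sample's mean and the census mean."""
    sample = rng.sample(population, n)
    return abs(statistics.fmean(sample) - census_mean)

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: sample mean differs from census mean by {sampling_error(n):.2f}")
```

Running it with different seeds typically shows that a sample of a few thousand items already lands close to the census value, while examining ten times as many items buys only a modest further improvement.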

Sampling Techniques

There are many ways that samples can be selected, but they are categorized into 2 types:

1. Non-probability (non-statistical) sampling techniques: judgment, voluntary, convenience

2. Probability (statistical) sampling techniques: simple random, systematic, stratified, cluster

Non-probability sampling techniques

A non-probability sample is one in which the judgment of the experimenter, the method in which the data are collected, or other factors could affect the results of the sample.

The items in the sample are chosen with unknown probabilities, so the chance that any given item is selected cannot be stated.

The interpretation of such samples is always questionable: “Was the discovery from the sample true? Or was it just the result of the way the sample was taken?”

There are 3 common types of non-probability samples:

1. Judgment
2. Voluntary
3. Convenience

Judgment sampling technique

Judgment sample selection is based on the opinion of one or more persons who feel sufficiently qualified to identify items for a sample as being representative of the population. The sample is chosen based on someone’s expertise about the population.

For example, a politician picks certain voting districts as reliable places to measure the public’s opinion of his/her political party.

A judgment sample is convenient, but it’s difficult to assess how closely it measures reality. However, it can still be useful depending on the expertise of the person(s) involved in determining the sample.

Voluntary sampling technique

Sometimes questions or questionnaires are distributed to the public by publishing them in print media, the internet or radio/television. Such questionnaires or polls produce voluntary samples and attract only those who are interested in the subject matter…

Obviously, results obtained from such samples are unreliable… Why?

Convenience sampling technique

Often people want to take an “easy” sample. Samples chosen for the ease with which they can be taken are called convenience samples.

For example, a surveyor will stand in one location and ask passersby their question or questions. A student working on a project will ask an entire class to fill out a survey.

Would standing outside a bank asking customers what they thought of the bank’s services give a complete picture? Why?

Probability sampling techniques

A probability sample is one in which the chance of selection of each item in the population is known or calculable before the sample is picked.

Probability samples are more reliable than Non-Probability samples.

There are 4 common types of Probability samples that are used to gather new data:

1. Simple random
2. Systematic
3. Stratified
4. Cluster

Simple random sampling technique

If a probability sample is chosen in such a way that each item in the population has an equal chance of being selected, then the sample is called a simple random sample.

Assume that every item in a population is numbered, and each number is written on a slip of paper. If all the slips are placed in a bowl and mixed, and if a group of slips is then picked, the items represented by the selected slips constitute a simple random sample.

A more practical approach is often to number the items in the population, and use a computer programme to generate a table of random numbers, and then select a sample from that list…
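The computer-based approach can be sketched with Python's standard `random` module. The population of 100 numbered items and the sample size of 10 are assumptions chosen for illustration.

```python
import random

# Hypothetical population: items numbered 1 to 100.
population = list(range(1, 101))

rng = random.Random(7)  # fixed seed so the run is repeatable

# Draw 10 items without replacement; every item has the same chance of selection.
sample = rng.sample(population, 10)

print(sorted(sample))
```

This is the software equivalent of mixing numbered slips in a bowl and drawing a group of them.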

Systematic sampling technique

Suppose we have a list of 1000 items in a finite population, and that we want to pick a probability sample of 50 items.

We first have to number the items from 1 to 1000. Then, we can use a random number table to pick one of the first 20 items (1000/50 = 20) on our list.

If the table of random numbers gives us number 16, then the 16th item in the list will be the first to be selected. We would then pick every 20th item after this random start (the 36th item, the 56th item, etc.) to produce a systematic sample.

In general:

• Decide on sample size: n
• Divide the frame of N individuals into groups of k individuals: k = N/n
• Randomly select one individual from the 1st group
• Select every kth individual thereafter

[Figure: N = 64, n = 8, k = 8; one individual is picked from the first group of 8, then every 8th individual after it.]
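The systematic procedure can be sketched as a short function. The function name `systematic_sample` and its parameters are hypothetical, and the sketch assumes that N is an exact multiple of n.

```python
import random

def systematic_sample(items, n, start=None, rng=random):
    """Pick n items: a random start in the first group, then every kth item."""
    k = len(items) // n            # group size, k = N / n
    if start is None:
        start = rng.randint(1, k)  # random position (1..k) within the first group
    return items[start - 1::k][:n]

# The example from the text: 1000 numbered items, sample of 50, random start 16.
items = list(range(1, 1001))
sample = systematic_sample(items, 50, start=16)
print(sample[:3], "...", sample[-1])  # [16, 36, 56] ... 996
```

Leaving out `start` lets the function choose the random start itself, which is how the method would be used in practice.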

Stratified sampling technique

If a population is divided into homogeneous groups (or strata), and then a sample is drawn from each group to produce an overall sample, this overall sample is known as a stratified sample.

We can stratify data based on race (in the case of Malaysia), gender, employment category, type of programme (1 year vs. 1.5 years at CFS) and so on…

Some prior knowledge of the structure of the population is necessary for selecting a stratified sample.

• Divide the population into subgroups (called strata) according to some common characteristic
• Select a simple random sample from each subgroup
• Combine the samples from the subgroups into one

[Figure: a population divided into 4 strata, with a sample drawn from each stratum.]
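A minimal sketch of stratified sampling follows. It assumes proportional allocation (the same fraction is drawn from every stratum), which the slides do not specify. The student list and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, fraction, rng):
    """Draw a simple random sample of the given fraction from every stratum."""
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)  # group the items by stratum
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, size))       # simple random sample within stratum
    return sample

# Hypothetical student list: 60 on the 1-year programme, 40 on the 1.5-year programme.
students = [(i, "1 year") for i in range(60)] + [(i, "1.5 years") for i in range(60, 100)]

sample = stratified_sample(students, lambda s: s[1], 0.10, random.Random(1))
print(len(sample))  # 10 (6 from the 1-year stratum, 4 from the 1.5-year stratum)
```

Because each stratum is sampled separately, every subgroup is guaranteed to appear in the overall sample.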

Cluster sampling technique

A cluster sample is one in which the individual units to be sampled are actually groups or clusters of items. It’s assumed that the individual items within each cluster are representative of the population.

Consumer surveys of large cities often employ cluster sampling. Usually a city is divided into small blocks, each block containing a cluster of households to be surveyed. A number of clusters are selected for the sample, and all the households in the selected clusters are surveyed.

The benefit of this sampling method is savings in cost and time, since an interviewer will need less energy and money if he/she stays within a specific area rather than travelling across the city.

• Divide the population into several “clusters,” each representative of the population
• Select a simple random sample of clusters
  - All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique

[Figure: a population divided into 16 clusters, with randomly selected clusters forming the sample.]
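Cluster sampling can be sketched as follows. The city of 16 blocks with 5 households each, and the choice of 4 clusters, are assumptions made up for illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical city: 16 blocks (clusters), each containing 5 households.
clusters = {
    block: [f"block{block}-house{h}" for h in range(1, 6)]
    for block in range(1, 17)
}

# Select a simple random sample of 4 clusters ...
chosen_blocks = rng.sample(sorted(clusters), 4)

# ... and survey every household in the selected clusters.
sample = [house for block in chosen_blocks for house in clusters[block]]

print(sorted(chosen_blocks))
print(len(sample))  # 20
```

Randomness enters only at the cluster level here. A second stage could instead pick households within each chosen block using another probability sampling technique.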
