Stratification STATISTICS

STRATIFICATION

Stratification is the process of dividing members of the population into homogeneous subgroups before sampling.

In general, stratification is used to gain efficiency. If variability is primarily between strata rather than within strata, fewer samples may be needed to reach the same precision.
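The efficiency gain can be illustrated with a minimal sketch. It compares the approximate variance of the sample mean under simple random sampling with the variance under proportionally allocated stratified sampling. The plot volumes, the stratum names, and the choice to ignore the finite population correction are all assumptions made for illustration.

```python
from statistics import variance

# Hypothetical plot volumes for two strata (illustrative numbers only).
pulp = [10, 12, 11, 9, 13, 10, 11, 12]
sawtimber = [48, 52, 50, 47, 53, 49, 51, 50]

def srs_variance_of_mean(values, n):
    """Approximate variance of the sample mean under simple random sampling.

    The finite population correction is ignored (an assumption).
    """
    return variance(values) / n

def stratified_variance_of_mean(strata, n):
    """Variance of the stratified mean with proportional allocation.

    With n_h = n * W_h this reduces to sum(W_h * S_h^2) / n.
    """
    total = sum(len(s) for s in strata)
    return sum(len(s) / total * variance(s) for s in strata) / n

n = 8  # hypothetical total sample size
srs_var = srs_variance_of_mean(pulp + sawtimber, n)
strat_var = stratified_variance_of_mean([pulp, sawtimber], n)
print(f"SRS: {srs_var:.2f}  stratified: {strat_var:.2f}")
```

Because almost all of the variability in these made-up numbers lies between the two strata, the stratified variance is far smaller for the same sample size.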

BIMODAL DISTRIBUTIONS

Great case for stratification!

STRATIFICATION STRATEGIES

Proportionate allocation uses a sampling fraction in each of the strata that is proportional to that of the total population. For instance, if a population of size X consists of m individuals in the male stratum and f in the female stratum (where m + f = X), then the relative sizes of the two samples (x1 = m/K males, x2 = f/K females, for a common divisor K) should reflect this proportion.

Optimum allocation (or disproportionate allocation): the sampling fraction in each stratum is proportional to the standard deviation of the variable within that stratum. Larger samples are taken in the strata with the greatest variability to generate the least possible sampling variance.
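The two strategies can be sketched as follows. The optimum version assumes equal sampling cost per stratum, in which case the sample size in each stratum is proportional to stratum size times stratum standard deviation (often called Neyman allocation). The stratum names, sizes, standard deviations, and total sample size are hypothetical.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of n in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    return {h: n * size / total for h, size in stratum_sizes.items()}

def optimum_allocation(stratum_sizes, stratum_sds, n):
    """Split n in proportion to (stratum size * standard deviation).

    Assumes equal cost per sample in every stratum.
    """
    weights = {h: stratum_sizes[h] * stratum_sds[h] for h in stratum_sizes}
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}

# Hypothetical strata: number of sampling units and standard deviation of volume.
sizes = {"pulp": 600, "sawtimber": 400}
sds = {"pulp": 5.0, "sawtimber": 20.0}

prop = proportionate_allocation(sizes, 50)
opt = optimum_allocation(sizes, sds, 50)
print(prop)
print(opt)
```

The results are left as fractions; in practice they would be rounded, usually with a minimum sample size per stratum.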

THE NATURE OF RISK

Statements (or inferences) about things are based on the best information at hand.

In a forestry context, statements are made about stand volumes based upon a sample, rather than from all of the trees in the stand. There is risk in making any statement, particularly in today's litigious society.

Decisions are made as well as possible, with consideration given to the probability that the statement is right or wrong and the cost of being wrong. Being proactive in finding potential problems in timber sales is essential to efficient cruising.

A forestry example would be where part of the stand is in low value pulp with the rest of the stand in high value sawtimber. Good information about volumes by product is necessary rather than just a total volume. Accurate representation of what is being sold is important (within reasonable cruising cost guidelines), in fairness to both the purchaser and the seller.


UNITS AND STRATA

Unit refers to a cutting unit (a physical piece of ground).

There are two levels of stratification: the stratum and the sub-stratum, or sample group. Stratification groups similar things together into a population. In forestry, the typical unit of observation is a plot or a tree. Typically, volume is the variable of interest, and its variability affects the coefficient of variation (CV).

STRATA VERSUS ATTRIBUTE

Stratification is used to group similar individuals together into populations. These populations are the basis for statistical calculations, and the error standards in the handbook are written for these strata.

Attributes, or categorical variables, can be used to aggregate total volumes in different ways. While averages and totals are available for these different groupings, it would be statistically invalid to post-stratify and calculate confidence intervals about those numbers. The user always has the option of creating strata using these attributes to place individuals into those populations.

RULES FOR USING

1) Only use one sampling method for a stratum (if point/plot cruising, only one BAF or plot size per stratum).

2) Change the frequency of the sampling method within a stratum by defining sub-strata, or sample groups.

USER DEFINED POPULATIONS

Defining populations is the crux of cruise design. Before a cruise can be designed effectively, the prescriptions must be finalized. It is necessary to know what kind of information is needed in the prospectus in order to design the cruise. For example, if there are big differences in the value of a tree because of size or species, then the cruise probably needs to be stratified on those characteristics. Once the populations are defined, the next task is to decide how best to sample each population.

SAMPLING AND ATTRIBUTES

Each individual (which could be a tree or a plot) needs to be identified by the population (stratum) it is in and where it is located. An individual can belong to only one population and can be located in only one unit. That unit may be used as a stratification variable to place an individual into a population. Membership in a population determines if and when the individual is a measured sample, and the rules for selecting samples vary with the cruise method. A unit may be a stratification variable that is recorded, and this attribute is used to summarize volumes by unit. Other attributes, such as species and logging method, can also be recorded and used to summarize volumes.

The key point is that although averages or totals can be calculated by these other attributes, if an attribute is not a stratification variable, sampling errors and confidence intervals cannot legitimately be calculated for it.

EXPANDING SAMPLES

Each tree sampled represents other trees which were not sampled. Since the sample selection takes place at the population level, the expanded volumes, sampling errors and statistics are also at the population level.

For sample tree cruises, the volume is apportioned to each unit in proportion to the percent of trees (tally by species) in that unit.

For area-based sampling, the population volume per acre is multiplied by the unit acreage. This, of course, results in all units within a stratum having the same species mix and volume per acre.

SIMPLE EXAMPLE

Two units, single stratum. Calculate the expansion factor as the number of counted trees divided by the number sampled: 10 divided by 2 equals 5. Each measured tree therefore represents 5 trees in the population, itself included.

Since stratification was not by unit, the volume needs to be prorated back to the unit. Six out of the 10 trees observed were in unit 1, so 60% of the PP, WF, and total volume would be assumed to be in unit 1. Similarly, 4 out of the 10 trees observed were in unit 2, so 40% of the PP, WF, and total volume would be assumed to be in unit 2.
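The arithmetic of this example can be sketched in a few lines. The tree counts (10 counted, 2 measured, 6 in unit 1 and 4 in unit 2) come from the example; the measured volumes for the PP and WF trees are hypothetical placeholders, since the slide does not give them.

```python
counted_trees = 10                 # all trees tallied in the stratum
trees_by_unit = {1: 6, 2: 4}       # tally by unit, from the example

# Hypothetical measured volumes (CF) for the two sample trees.
measured_volumes = {"PP": 30.0, "WF": 20.0}

# Expansion factor: counted trees divided by measured trees.
expansion_factor = counted_trees / len(measured_volumes)

# Expand each measured tree to the stratum level.
stratum_volume = {sp: vol * expansion_factor for sp, vol in measured_volumes.items()}

# Prorate stratum volume back to units by each unit's share of the tally.
unit_share = {u: n / counted_trees for u, n in trees_by_unit.items()}
unit_volume = {
    u: {sp: share * vol for sp, vol in stratum_volume.items()}
    for u, share in unit_share.items()
}

print(expansion_factor)
print(unit_volume)
```

Every unit receives the same species mix, because the proration uses only the share of tallied trees and not what was actually measured in each unit.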

VOLUME EXPANSION SIDE EFFECTS

PRECAUTION FOR POINT/FIX PLOT SAMPLING

Take a stand with two components, where the S's could represent sawtimber or a species such as spruce, and the P's could represent pulp or pine. Two sample groups, an 'S' and a 'P', could be created and sampled separately on the points/plots.

Looking at the S sample group, the error is based on the plot volume of S's, so the variability would be very high, since some plots have all the volume in S's while others have none. Not only that, but the presence of S's on the plot means there isn't as much room left for P's, and vice versa: the volumes of S's and P's are inversely correlated. This high variability and the resulting high CVs call for more plots to meet the sampling error. However, adding more plots could drive the CVs higher if the variability increases even more, which would indicate that even more plots are needed, and so on.

This type of stand needs to be sampled in a different manner. Either sample minor or highly variable species or products with a separate method in a separate stratum, or split out portions of units which are highly variable and sample them separately.
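A small numeric sketch shows the effect described above. The per-plot volumes are invented so that S and P crowd each other out; the coefficient of variation is then computed for each sample group and for the plot totals.

```python
from statistics import mean, stdev

# Hypothetical per-plot volumes; plots heavy in S are light in P and vice versa.
s_volume = [100, 0, 80, 10, 0, 90, 20, 100]
p_volume = [0, 95, 15, 85, 100, 10, 75, 5]
total_volume = [s + p for s, p in zip(s_volume, p_volume)]

def cv_percent(values):
    """Coefficient of variation: sample standard deviation as a percent of the mean."""
    return 100 * stdev(values) / mean(values)

cv_s = cv_percent(s_volume)
cv_p = cv_percent(p_volume)
cv_total = cv_percent(total_volume)
print(f"CV of S: {cv_s:.0f}%  CV of P: {cv_p:.0f}%  CV of total: {cv_total:.0f}%")
```

With these numbers the total plot volume is nearly constant, so its CV is small, while each sample group on its own swings between almost nothing and almost everything.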

POINT/FIX PLOT EXPANSION AND PRORATION

Remember, if the unit is not used as a stratification variable, then some assumptions are needed to allocate volumes back to the unit level. In this example, a single stratum composed of two units is established. Then plots are placed in the population and the volume per acre is calculated for each plot. Suppose this results in an average volume of 1000 CF per acre. The unit volumes are calculated by multiplying the volume per acre by the unit acres.

In prorating point/plot sample volumes, the number of points/plots is used at the stratum level to calculate volume per acre for the stratum, so the number of points/plots is not considered at the unit level.

One precaution in using sample groups with point/plot sampling: don't use sample groups to try to get unit volumes. The point/plot count and expansion are at the stratum level, so if the units in the previous example were treated as sample groups, sample group 1 would have eight plots with volume and four without, and vice versa for sample group 2. This increases the variability and also results in strange-looking expanded volumes for units.
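The unit-level calculation is a single multiplication, sketched below. The 1000 CF per acre figure is from the example; the unit acreages are hypothetical, since the slide does not state them.

```python
volume_per_acre = 1000.0            # stratum average, CF per acre (from the example)
unit_acres = {1: 12.0, 2: 8.0}      # hypothetical acreages for the two units

# Every unit in the stratum gets the same volume per acre.
unit_volumes = {u: volume_per_acre * acres for u, acres in unit_acres.items()}
stratum_total = sum(unit_volumes.values())

print(unit_volumes)
print(stratum_total)
```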

INFERENTIAL STATISTICS

2 WAYS TO USE INFERENTIAL STATISTICS

1. Estimate a parameter about a population
2. Test a hypothesis

Example: 10 forest stands were thinned. The average increase in CAI was 25% following thinning.

Hypothesis: Forest thinning has no effect on tree growth rate.

You want to show that the thinning did have an effect, so you test the opposite viewpoint, or null hypothesis.

We can test the hypothesis by comparing these stands to 10 similar stands that were not thinned.

PREDICTING THE POPULATION

The sample mean of a continuous variable can be used to estimate the population mean.

A confidence interval helps define the certainty that the true population mean falls within a range of values.

Usually a 95% or 99% confidence interval is expressed.
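As a minimal sketch, a 95% confidence interval for a mean can be computed from the sample mean, the standard error, and a critical t value. The ten plot volumes are hypothetical, and the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is taken from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical volumes (CF per acre) from 10 plots.
plot_volumes = [980, 1050, 1120, 890, 1010, 1075, 940, 1000, 1030, 905]

# Two-sided 95% critical t value for 9 degrees of freedom (from a t table).
T_CRIT_95_DF9 = 2.262

def confidence_interval(sample, t_crit):
    """Return (lower, upper) bounds of the confidence interval for the mean."""
    m = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    return m - t_crit * standard_error, m + t_crit * standard_error

lower, upper = confidence_interval(plot_volumes, T_CRIT_95_DF9)
print(f"mean = {mean(plot_volumes):.0f}, 95% CI = ({lower:.0f}, {upper:.0f})")
```

A different sample size needs a different critical value; a 99% interval uses a larger one and is therefore wider.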

ONE OR TWO TAILED TESTS

If you wanted to test the tree mortality associated with an insect infestation or other damaging event, you would set a prediction level, say greater than 60 percent dead.

After the event you measure the number of dead trees and determine the real percent mortality.

You can do a 1-tailed test to see if your prediction was accurate by testing the data against the hypothesis that mortality was less than or equal to 60 percent.

You can do a 2-tailed test by setting the level to 60% and testing whether the mortality was not equal to 60%. The hypothesis of 60% mortality would be rejected if the measured mortality was much larger or much smaller than 60%.
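The difference between the two tests can be sketched with a one-sample z-test for a proportion. This uses a normal approximation, which is an assumption that is reasonable for a large tally; the counts of dead and total trees are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(dead, total, p0):
    """z statistic for an observed proportion against a hypothesized value p0."""
    p_hat = dead / total
    standard_error = sqrt(p0 * (1 - p0) / total)
    return (p_hat - p0) / standard_error

# Hypothetical tally: 135 dead trees out of 200, tested against 60% mortality.
z = proportion_z(135, 200, 0.60)

# One-tailed: null hypothesis is mortality <= 60%, alternative is > 60%.
p_one_tailed = 1 - NormalDist().cdf(z)

# Two-tailed: null hypothesis is mortality == 60%, alternative is != 60%.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}, two-tailed p = {p_two_tailed:.3f}")
```

When the observed mortality is above 60%, the two-tailed p-value is twice the one-tailed value, so the one-tailed test is more sensitive in the predicted direction.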

COMPARING MEANS OF 2 POPULATIONS

T-test – categorical data OK

In Excel: =TTEST(array1,array2,tails,type)
Array1 – first set of numbers to compare
Array2 – second set of numbers to compare
Tails – 1 or 2 sided test
Type – 1 equals paired, 2 equals two-sample equal variance, 3 equals two-sample unequal variance
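The three test types correspond to three ways of forming the t statistic, sketched below. Excel's TTEST returns a p-value; this sketch stops at the t statistic and its degrees of freedom, because the t distribution is not in the Python standard library. The growth figures for thinned and unthinned stands are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(a, b):
    """Type 1: paired test on the differences between matched observations."""
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1

def pooled_t(a, b):
    """Type 2: two-sample test assuming equal variances (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Type 3: two-sample test with unequal variances (Welch's approximation)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical percent increase in growth for 10 thinned and 10 unthinned stands.
thinned = [22, 28, 25, 30, 21, 27, 24, 26, 23, 24]
unthinned = [10, 14, 9, 12, 15, 11, 13, 8, 12, 16]

print(paired_t(thinned, unthinned))
print(pooled_t(thinned, unthinned))
print(welch_t(thinned, unthinned))
```

The resulting t value is compared with a critical value from a t table for the given degrees of freedom and number of tails. The paired form only makes sense when each thinned stand is matched to a specific unthinned stand.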