Semiconductor Defect Management: Separating the Vital Few From the Trivial Many

November 19, 2009

Stuart L. Riley
[email protected] or [email protected]

Member American Society for Quality


Copyright Statement

Original work by Stuart L. Riley: Copyright 2009

Rights reserved.

This document may be downloaded for personal use.

Users are forbidden to reproduce, republish, redistribute, or resell any materials from this document as their original work.

All references to this document, any quotation, or figures should be made to the author.

Questions or comments can be addressed to Stuart L. Riley, at [email protected] or [email protected]


Definition of Quality Control

Quality control (QC) is a procedure or set of procedures intended to ensure that a manufactured product or performed service adheres to a defined set of quality criteria or meets the requirements of the client or customer. QC is similar to, but not identical with, quality assurance (QA). QA is defined as a procedure or set of procedures intended to ensure that a product or service under development (before work is complete, as opposed to afterwards) meets specified requirements. QA is sometimes expressed together with QC as a single expression, quality assurance and control (QA/QC).

In order to implement an effective QC program, an enterprise must first decide which specific standards the product or service must meet. Then the extent of QC actions must be determined (for example, the percentage of units to be tested from each lot). Next, real-world data must be collected (for example, the percentage of units that fail) and the results reported to management personnel. After this, corrective action must be decided upon and taken (for example, defective units must be repaired or rejected and poor service repeated at no charge until the customer is satisfied). If too many unit failures or instances of poor service occur, a plan must be devised to improve the production or service process and then that plan must be put into action. Finally, the QC process must be ongoing to ensure that remedial efforts, if required, have produced satisfactory results and to immediately detect recurrences or new instances of trouble.

Source: http://whatis.techtarget.com/definition/0,,sid9_gci1127382,00.html

>>> Keep highlighted passages in mind, as you read the rest of this document. <<<


Introduction

• Semiconductor fabs use in-line inspections to
  – Detect what are commonly called “defects”
  – Use “defect” count / density charts to monitor / control the “fab quality”
  – Not focused on the unit product – the individual circuit, or the “die”
• This strategy is misleading at best and catastrophic at worst
  – In-line inspections detect anomalies (the “trivial many”)
  – Defects (the “vital few”) are only a subset of all anomalies
  – The noise of the “trivial many” anomalies can drive the chart trends
    • Too much time wasted reacting to the trivial many
    • Too easy to miss the vital few


Goal

• Apply a strategy that separates the vital few from the trivial many
• Defect component needs to be extracted from all anomalies
• Need to know potential faults – fraction of defects that are harmful
• Determine the probable effect of faults on each die (unit product)
• Need to apply a die-based, defect-limited yield strategy


Define the Unit Product

• Wafers
  – Run in batches of up to 25 wafers per batch
  – Contain die: 10s, 100s, 1000s of die per wafer
  – Are carriers of the die only -- NOT the unit product
• Die
  – Individual circuits sold to customers
  – The unit product produced


Semiconductor In-Line Inspection

• Inspection – sample lots
  – Sample of lots and wafers in lot
• Data Result: Anomalies – anything detected by inspection
  – Inspection tool noise – false positives
  – Cosmetic anomalies
    • Color, grain, etc. from normal process variation
    • No negative effect on yield
  – Defects (the vital few)
    • Abnormal and potentially harmful
    • Particle or process-related
    • Separate from other anomalies using classification (categorization)
• Data Result: Wafer maps – coordinate data of anomalies


Anomaly Counting

Inspection results: Wafer maps and anomaly density (counts).

Anomaly counting cannot distinguish between high points caused by random or clustered anomalies. No distribution or die information on chart.

These high points could be random, affecting many die, or they could be clustered, affecting few die.

[Figure: anomaly density chart with two wafer maps, one showing clusters and one showing a map of random anomalies.]

Note: All wafer maps were produced using the “KlarfView” application, which can be found at: http://www.valaddsoft.com/


Defect Counting

• Defects are a subset of all anomalies
• Requires categorization (classification) of anomalies to separate defects from the rest
• Requires selection process
  – Automatic to reduce human bias
  – Wafer-based random selection: no distribution / die information.


Random Sampling

Wafer-based random sampling tends to over-sample in the clustered regions. Many randomly-distributed anomalies (on many die) are not sampled (lighter spots).

This is OK if the goal is to define the number of defects on the wafer.

But it adds no information regarding the number of die affected (distribution).

So this is not the correct sampling strategy if we want to monitor the quality of the die.

[Figure: wafer map with clusters; selected anomalies are shown as dark spots.]

Sampling was done using the “DBSample” application.


Defect Classification

Examples of defects as seen during classification. Some obviously impact the product, others aren’t as obvious. So defects have a probability of affecting the die circuits.


Defect Count

After classification, the defect classification data can be used to extract the number of defects from the overall population of anomalies.

Assume: Anomaly count = 1000

Classification data:

Type A: nA = 40 (assume this is a cosmetic anomaly)
Type B: nB = 10
Type C: nC = 20
Type D: nD = 30

$$\text{Defect Count} = \frac{10 + 20 + 30}{100} \times 1000 = \frac{60}{100} \times 1000 = 600$$

So it is estimated that 60% of the anomalies are defects that could potentially harm the product. Type “A” is left out, because we assumed it was a cosmetic anomaly.
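
The defect count arithmetic can be sketched in Python. This is a minimal illustration using the numbers assumed on this slide (1000 anomalies, 100 classified, type A treated as cosmetic). The function and variable names are hypothetical and do not come from any tool named in this document.

```python
# Classified sample from the slide: count of classified anomalies per type.
classified = {"A": 40, "B": 10, "C": 20, "D": 30}
cosmetic_types = {"A"}   # assumption from the slide: type A is cosmetic
anomaly_count = 1000     # total anomalies detected on the wafer


def estimate_defect_count(classified, cosmetic_types, anomaly_count):
    """Scale the defect fraction of the classified sample up to all anomalies."""
    n_classified = sum(classified.values())
    n_defects = sum(n for t, n in classified.items() if t not in cosmetic_types)
    return n_defects * anomaly_count / n_classified


defect_count = estimate_defect_count(classified, cosmetic_types, anomaly_count)
print(defect_count)  # 600.0
```

The estimate is only as good as the classified sample: it assumes the 100 classified anomalies are representative of all 1000.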


Defect Density Chart

The noise level is reduced and now reflects the count of defects on the wafer, with the rest of the anomalies removed. We’re now able to see the vital few, but we need to consider the fact that not all defects cause failures.

[Figure: chart of anomaly density (gray) and defect density (black).]


Fault Count

The defect count data can be refined further. We can apply the probability of failure, p_i, for each of the ith defect types to their respective counts to find the overall fault count on the wafer.

Type A: nA = 40, pA = 0.0, fA = 0
Type B: nB = 10, pB = 1.0, fB = 10
Type C: nC = 20, pC = 0.5, fC = 10
Type D: nD = 30, pD = 0.8, fD = 24

Weighted average kill ratio, for M types and N classified anomalies:

$$K = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}$$

$$K = \frac{0 + 10 + 10 + 24}{100} = 0.44$$

Fault count:

$$F = K \times A$$

$$F = 0.44 \times 1000 = 440$$

The fault count is defined as the weighted-average kill ratio multiplied by the number of anomalies.

For individual fault types:

Weighted average kill ratio:

$$K_i = \frac{p_i \times n_i}{N}$$

Fault count:

$$F_i = K_i \times A$$
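
The fault count calculation can be sketched as follows. This is a minimal illustration using the counts and kill ratios assumed on this slide. The names `classified`, `kill_ratio_terms`, and the others are hypothetical, chosen only for this example.

```python
# Per type: (n_i, p_i) = (number classified, probability of failure).
classified = {"A": (40, 0.0), "B": (10, 1.0), "C": (20, 0.5), "D": (30, 0.8)}
anomaly_count = 1000  # A: total anomalies detected on the wafer


def kill_ratio_terms(classified):
    """Return K_i = (p_i * n_i) / N for each anomaly type."""
    n_total = sum(n for n, _ in classified.values())
    return {t: p * n / n_total for t, (n, p) in classified.items()}


k_by_type = kill_ratio_terms(classified)
k = sum(k_by_type.values())            # weighted-average kill ratio K
fault_count = k * anomaly_count        # F = K * A
faults_by_type = {t: k_i * anomaly_count for t, k_i in k_by_type.items()}

print(round(k, 4), round(fault_count, 1))
```

With the slide's numbers, K comes out at approximately 0.44 and F at approximately 440, matching the worked values.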


Fault Density Chart

By applying a probability of failure to each anomaly type, the noise level is reduced even further. The chart now reflects the count of faults on the wafer. But, fluctuations in this chart can still be driven by clusters. We still need to capture distribution information – the number of die (unit product) affected.

[Figure: chart of anomaly density (gray) and fault density (black).]


Defect-Limited Yield

• Yield is the fraction of all die that are good
• Yield can be affected by
  – Process problems and fall-on particles – defects that cause faults
  – Things that may or may not be caught using in-line inspections
• Defect-Limited Yield (DLY)
  – Definition: The yield loss for each defect, or group of defects
  – Other issues may cause yield loss
  – The defect-limited yield will only cap the upper limit to potential yield loss due to detected defects
  – Actual yield may be lower, due to issues that are not detected from in-line inspections
  – So DLY cannot be relied on as a “yield predictor”, but only as a quality metric to identify potential yield issues due to detected defects


DLY: General Form

If we assume all anomalies will cause faults, we can find the pct of die without anomalies (or pct clean die) by dividing the number of die without anomalies, D_O, by the number of die inspected, D_I:

$$\text{Pct Clean Die} = \frac{D_O}{D_I}$$

But if we assume that only a portion of the anomalies have a probability of causing faults, some of the anomalous die have a probability of not failing (or die that can be recovered), D'_A:

$$\text{DLY} = \frac{D'_A + D_O}{D_I}$$

Note: This is analogous to a “yield” number.

The number of anomalous die that may be recovered, D'_A, can be expressed as a probability density function applied to the number of anomalous die, D_A. Assuming all anomalies are random, we can use the Poisson distribution function:

$$D'_A = e^{-f} \times D_A$$

Where,

$$f = K \times a$$

$$a = \frac{A}{D_A}$$

$$K = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}$$

Now the DLY can be expressed as:

$$\text{DLY} = \frac{(e^{-f} \times D_A) + D_O}{D_I}$$
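
The general form of the DLY can be sketched in Python. This is a minimal illustration of the Poisson-based expression on this slide. The function name, parameter names, and the wafer numbers in the usage lines are hypothetical and chosen only for the example.

```python
import math


def defect_limited_yield(d_inspected, d_clean, d_anomalous,
                         anomaly_count, kill_ratio):
    """DLY = (exp(-f) * D_A + D_O) / D_I, with f = K * a and a = A / D_A."""
    if d_anomalous == 0:
        # No anomalous die: DLY reduces to the pct clean die.
        return d_clean / d_inspected
    a = anomaly_count / d_anomalous          # avg anomalies per anomalous die
    f = kill_ratio * a                       # avg faults per anomalous die
    recovered = math.exp(-f) * d_anomalous   # D'_A, anomalous die likely to survive
    return (recovered + d_clean) / d_inspected


# Hypothetical wafer: 500 die inspected, 300 clean, 200 anomalous,
# 1000 anomalies in total, weighted-average kill ratio K = 0.44.
dly = defect_limited_yield(500, 300, 200, 1000, 0.44)
pct_clean = 300 / 500
print(round(pct_clean, 3), round(dly, 3))
```

The DLY is never lower than the pct clean die, because the recovered die are added on top of the clean die.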


DLY: General Form

DLY can also be expressed in terms of individual (or combined) anomaly types. For the ith type:

$$\text{DLY}_i = \frac{(e^{-f_i} \times D_A) + D_O}{D_I}$$

Average number of faults for the ith anomaly type:

$$f_i = K_i \times a$$

Weighted kill ratio for the ith anomaly:

$$K_i = \frac{p_i \times n_i}{N}$$

where p_i is the kill ratio for the ith type, n_i is the number classified for the ith type, and N is the total classified.

Average number of anomalies on anomalous die:

$$a = \frac{A}{D_A}$$


Why Use the Poisson Distribution Function?

From the references: For random distributions, Poisson statistics can be applied.

Poisson Statistics (Random distributions):

$$Y = e^{-f}$$

The average number of faults per die is

$$f = A_C \times D$$

A_C is the critical area, and D is the defect density.

The critical area is the probability of failure, P, times the die area, A:

$$A_C = P \times A$$

So, the average number of faults per die can be expressed as the probability of failure times the average number of defects per die, d:

$$f = P \times (A \times D) = P \times d$$

Note: The reference uses λ instead of f. But the meaning is the same. It is the average number of faults per die.

Sources:

C. Stapper, et al., “Integrated Circuit Yield Statistics”, Proceedings of the IEEE, Vol. 71, No. 4, pp. 453-470, April 1983.

C. Stapper, “On a Composite Model to the IC Yield Problem”, IEEE Journal of Solid State Circuits, Vol. SC-10, pp. 537-539, December 1975.



The average number of faults per die is expressed as the probability of failure times the average number of defects per die:

$$f = P \times d$$

For DLY, we can define the average number of faults per die as the product of the weighted-average of the kill ratios for all classified anomalies (which is analogous to the probability of failure), and the average number of anomalies per die:

$$f = K \times a$$

Where

$$K = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}$$

and

$$a = \frac{A}{D_A}$$

Now we can apply the average number of faults to the Poisson distribution function to find the “yield” of the anomalous die, that is, the number of anomalous die that can be recovered:

$$\text{DLY} = \frac{(e^{-f} \times D_A) + D_O}{D_I}$$

So, armed with nothing more than the data collected from in-line inspections, we can estimate the impact of defects on yield – the defect-limited yield.



• The Poisson function works only for random distributions
• Anomaly maps typically contain mixed distributions
• How can we apply the Poisson function to mixed distributions?
  – Separate die with random anomalies from die with clustered anomalies
  – Treat each die group as a random distribution
    • A lower-density group for random die
    • A higher-density group for clustered die
  – Estimate the number of recovered random and clustered die separately
  – Simply add the number of recovered die to the number of clean die to find the total number of die that likely will not fail


DLY: Mixed-Distribution

If we assume the distributions of anomalies will always be random, the DLY can be expressed as:

$$\text{DLY} = \frac{D'_A + D_O}{D_I}$$

But as we can see from the wafer maps, we have mixed distributions: random and clustered anomalies. So, we need to pull the 2 distributions apart into their random and clustered die components:

$$D'_A = D'_R + D'_C$$

So the DLY can now be expressed as:

$$\text{DLY} = \frac{D'_R + D'_C + D_O}{D_I}$$

[Figure: wafer map with the cluster and random regions labeled.]


DLY: Mixed-Distribution

In order to correctly apply the DLY to the random and clustered distributions, we can express DLY as:

$$\text{DLY} = \frac{D'_R + D'_C + D_O}{D_I}$$

For the random distribution:

$$D'_R = e^{-f_R} \times D_R$$

$$f_R = K_R \times a_R$$

Weighted-average kill ratio for random anomalies only:

$$K_R = \frac{\sum_{i=1}^{M_R} (p_i \times n_i)_R}{N_R}$$

The average number of random anomalies over random anomalous die:

$$a_R = \frac{A_R}{D_R}$$

For the clustered distribution:

$$D'_C = e^{-f_C} \times D_C$$

$$f_C = \{K_C \times (a_C - a_R)\} + (K_R \times a_R)$$

Weighted-average kill ratio for clustered anomalies only:

$$K_C = \frac{\sum_{i=1}^{M_C} (p_i \times n_i)_C}{N_C}$$

The average number of clustered anomalies over clustered anomalous die:

$$a_C = \frac{A_C}{D_C}$$

Note: If K_R = K_C, then

$$f_C = K_C \times a_C$$


Avg Number of Faults Per Clust Die

If we assume the clustered and random anomalies are independent, we can treat the 2 distributions separately on the clustered die.

$$f_C = \{K_C \times (a_C - a_R)\} + (K_R \times a_R)$$

Weighted average kill ratio for the classified anomalies on just the clustered die:

$$K_C = \frac{\sum_{i=1}^{M_C} (p_i \times n_i)_C}{N_C}$$

Average number of clustered anomalies on clustered die:

$$a_C = \frac{A_C}{D_C}$$

Weighted average kill ratio for the classified anomalies on just the random die:

$$K_R = \frac{\sum_{i=1}^{M_R} (p_i \times n_i)_R}{N_R}$$

Average number of random anomalies on random die:

$$a_R = \frac{A_R}{D_R}$$

But, if we want to assume K_R = K_C, then

$$f_C = K_C \times a_C$$
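
The mixed-distribution DLY can be sketched in Python. This is a minimal illustration of the random and clustered terms described on these slides. The function name, parameter names, and the wafer numbers in the usage lines are hypothetical and chosen only for the example.

```python
import math


def mixed_dly(d_inspected, d_clean, d_random, d_clustered,
              anomalies_random, anomalies_clustered, k_random, k_clustered):
    """DLY = (D'_R + D'_C + D_O) / D_I for a wafer with mixed distributions."""
    # Average anomalies per die in each group (0 if the group is empty).
    a_r = anomalies_random / d_random if d_random else 0.0
    a_c = anomalies_clustered / d_clustered if d_clustered else 0.0
    # Faults per random die, and per clustered die (cluster part + random part).
    f_r = k_random * a_r
    f_c = k_clustered * (a_c - a_r) + k_random * a_r
    recovered_r = math.exp(-f_r) * d_random       # D'_R
    recovered_c = math.exp(-f_c) * d_clustered    # D'_C
    return (recovered_r + recovered_c + d_clean) / d_inspected


# Hypothetical wafer: 500 die inspected, 300 clean, 180 random die carrying
# 270 anomalies, 20 clustered die carrying 730 anomalies, K_R = K_C = 0.44.
dly_mixed = mixed_dly(500, 300, 180, 20, 270, 730, 0.44, 0.44)
print(round(dly_mixed, 3))
```

In this example almost none of the clustered die are recovered, so the heavy anomaly count in the clusters costs at most the 20 clustered die rather than dominating the metric.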


DLY: Random Only

At times, it is important to know what the DLY would be on a wafer, if there were no clusters. This information can be used to plot the mixed-distribution data and the “random only” data together on the same chart, to see which wafers are clustered and which are not.

$$\text{DLY}_{RandOnly} = \frac{D'_{AR} + D_O}{D_I}$$

$$D'_{AR} = e^{-f_{all}} \times D_A$$

$$f_{all} = K_{all} \times a_R$$

$$K_{all} = \frac{\sum_{i=1}^{M} (p_i \times n_i)}{N}$$

$$a_R = \frac{A_R}{D_R}$$

The fault density is expressed in terms of the weighted-average kill ratios for all anomaly types, applied to the average number of random anomalies per random die, applied to all anomalous die. This assumes all die only contain random anomalies, and all have a proportional probability of containing the same anomaly types.
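
The random-only DLY can be sketched in Python. This is a minimal illustration of the calculation described on this slide. The function name, parameter names, and the wafer numbers in the usage lines are hypothetical and chosen only for the example.

```python
import math


def random_only_dly(d_inspected, d_clean, d_anomalous,
                    anomalies_random, d_random, k_all):
    """DLY the wafer would have if every anomalous die held only random anomalies."""
    a_r = anomalies_random / d_random if d_random else 0.0   # a_R = A_R / D_R
    f_all = k_all * a_r                                       # f_all = K_all * a_R
    recovered = math.exp(-f_all) * d_anomalous                # D'_AR
    return (recovered + d_clean) / d_inspected


# Hypothetical wafer: 500 die inspected, 300 clean, 200 anomalous die in total,
# of which 180 are random die carrying 270 anomalies; K_all = 0.44.
dly_random_only = random_only_dly(500, 300, 200, 270, 180, 0.44)
print(round(dly_random_only, 3))
```

Plotting this value next to the mixed-distribution DLY for the same wafer is one way to see, as the slide describes, which wafers are clustered and which are not.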


Die-Based Clustering

In order to use the mixed-distribution DLY, we must separate the random die, DR, and clustered die, DC.

We can do this by identifying clustered die as die containing significantly more anomalies, compared to the other anomalous die.

[Figure: wafer map of clustered die. Dark spots: anomalies in clustered die.]

Clustering was defined using the “DBCluster” application.
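
One way to flag die that contain significantly more anomalies than the other anomalous die is sketched below. The threshold rule used here (more than `factor` times the median anomaly count per anomalous die) is an assumption made for illustration only. It is not taken from this document and is not a description of how the “DBCluster” application works.

```python
from collections import Counter
from statistics import median


def split_clustered_die(anomaly_die_ids, factor=3.0):
    """Split anomalous die into (random_die, clustered_die) sets.

    anomaly_die_ids: one die identifier per detected anomaly.
    factor: hypothetical multiplier on the median count (an assumption).
    """
    counts = Counter(anomaly_die_ids)
    if not counts:
        return set(), set()
    threshold = factor * median(counts.values())
    clustered = {die for die, n in counts.items() if n > threshold}
    return set(counts) - clustered, clustered


# Hypothetical data: three lightly hit die and one die with 20 anomalies.
die_ids = [(0, 0)] + [(0, 1)] * 2 + [(1, 0)] + [(1, 1)] * 20
random_die, clustered_die = split_clustered_die(die_ids)
print(sorted(random_die), sorted(clustered_die))
```

The resulting sets give the counts D_R and D_C needed by the mixed-distribution DLY.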


Die-Based Sampling

Now that we can separate the random die from the clustered die, we can apply die-based sampling to ensure we have a fair selection of anomalies over as many die as possible.

[Figure: wafer maps comparing random sampling and die-based sampling.]

Compared to wafer-based random sampling, die-based sampling forces a fair sampling of more anomalous die, while still ensuring we get a fair sampling from clustered die.

Sampling was done using the “DBSample” application.
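
A die-based selection can be sketched as a round-robin over the anomalous die. The round-robin rule is an assumption made for illustration only. It is not taken from this document and is not a description of how the “DBSample” application works. All names are hypothetical.

```python
import random
from collections import defaultdict


def die_based_sample(anomalies, sample_size, seed=0):
    """Pick anomalies one die at a time so that many die are represented.

    anomalies: list of (anomaly_id, die_id) pairs.
    """
    rng = random.Random(seed)
    by_die = defaultdict(list)
    for anomaly_id, die_id in anomalies:
        by_die[die_id].append(anomaly_id)
    for ids in by_die.values():
        rng.shuffle(ids)                 # random pick order within each die
    die_order = list(by_die)
    rng.shuffle(die_order)               # random visiting order of the die
    selected = []
    while len(selected) < sample_size and die_order:
        remaining = []
        for die_id in die_order:
            if len(selected) == sample_size:
                break
            selected.append(by_die[die_id].pop())
            if by_die[die_id]:
                remaining.append(die_id)  # die still has unsampled anomalies
        die_order = remaining
    return selected


# Hypothetical wafer: one clustered die with 50 anomalies, five die with one each.
anomalies = [(f"c{i}", "cluster") for i in range(50)]
anomalies += [(f"r{i}", f"die{i}") for i in range(5)]
sample = die_based_sample(anomalies, sample_size=8)
print(sample)
```

With a budget of 8, the first pass takes one anomaly from each of the six die, and the remaining two picks come from the clustered die, so every anomalous die is represented.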





Mixing Multiple Products

• Many fabs run multiple products that have different numbers of die
• The “native” DLY is modulated by the number of die on the wafer
• Apply a standard set of die to the wafer map to estimate a “normalized” DLY
• Apply normalization after all other steps (inspection, classification, clustering and native DLY est.) have been completed
• Normalization permits a better apples-apples monitoring of processes that span multiple product types
• Normalization also permits application of DLY estimations to bare wafer inspections


Normalize Die

A die-based strategy can be modulated by the number of die on a wafer. In order to apply this strategy to fabs that are running multiple products with different die layouts, we can normalize the die layout to a standard set of die. The normalization allows us to plot the data on one chart for all products. This also allows us to apply a die-based strategy on wafers that have no die (lower right).

[Figure: two pairs of wafer maps, each pair showing native die and normalized die.]

Normalization was done using the “RDie” application.
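
Assigning anomalies to a standard die grid can be sketched as follows. This is an assumption made for illustration only: it supposes anomaly coordinates are given relative to the wafer center, in the same units as the standard die size. It is not a description of how the “RDie” application works, and all names are hypothetical.

```python
import math


def assign_normalized_die(anomalies_xy, die_width, die_height):
    """Map each (x, y) wafer coordinate to a (column, row) index on a standard grid."""
    return [(math.floor(x / die_width), math.floor(y / die_height))
            for x, y in anomalies_xy]


# Hypothetical anomalies (mm from wafer center) on a 10 mm x 10 mm standard die.
points = [(0.5, 0.5), (-0.5, 12.0), (25.0, -3.0)]
die_ids = assign_normalized_die(points, die_width=10.0, die_height=10.0)
print(die_ids)  # [(0, 0), (-1, 1), (2, -1)]
```

The resulting die identifiers can then be used for die-based clustering, sampling, and DLY in the same way as native die.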


Example of DLY Response

Assuming the same die layout (or normalized die), DLY is modulated by:
> The number of die affected
> The probability that anomalies can cause a fault (the kill ratios)
> The density of anomalies per die


DLY Chart

Example of a DLY chart.

The average mixed-distribution DLY for all wafers in a lot is plotted along with the random-only DLY (circles).

The bars indicate the high and low values for each lot.


DLY Compared to Anomaly Density

The DLY data shows numerous low points that were traced to a problem with a process tool.

During this same time-period, the anomaly density chart showed virtually no correlation to the problem.

The DLY data proved to be a superior indicator of the problem.


Level DLY and Defect DLY

[Figure: two charts. Left: level DLY and defect DLY. Right: defect DLY vs. level DLY.]

Example of how one defect type drove the DLY for one level. (The chart on the left shows the same DLY as the previous chart.) The low points (excursions) correlated to a specific tool problem. The same data is plotted on the chart on the right to show how much this defect drove the level DLY.


Cumulative DLY For All Levels

The cumulative yield, DLY_Cum, is expressed as the product of the DLYs for each level.

For N levels, the cumulative DLY is:

$$\text{DLY}_{Cum} = \prod_{i=1}^{N} \text{DLY}_i$$
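
The cumulative DLY, the product of the DLYs of the individual levels, can be sketched in Python. This is a minimal illustration. The function name and the per-level values are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
import math


def cumulative_dly(level_dlys):
    """Multiply the DLYs of the individual inspection levels."""
    return math.prod(level_dlys)


# Hypothetical DLY values for three inspected levels.
levels = [0.98, 0.95, 0.99]
dly_cum = cumulative_dly(levels)
print(round(dly_cum, 4))
```

Because the result is a product, a single level with a low DLY pulls the cumulative value down no matter how good the other levels are.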


Cumulative DLY Chart

The cumulative DLY is driven by more than one level.

One level shown on this chart clearly had an effect on the cumulative DLY due to an excursion (middle of chart).

Because DLY is modulated by the same factors that can affect yield, there is a good chance that the issues pushing the DLY down will affect final yield.

[Figure: cumulative DLY chart with the baseline, excursion, and recovered periods labeled.]


Implementation Steps

• Inspect the wafer to find the anomalies.
• Run die-based clustering to identify clustered die.
• Run die-based sampling to select anomalies to classify.
• Classify the anomalies.
• Apply a pre-defined set of kill ratios to each type of anomaly.
• Calculate the un-normalized DLY using the "native" die layout.
• Normalize the die.
• Re-run die-based clustering.
• Using the classification data already collected, calculate the normalized DLY.


Implementation Steps for Bare Wafer

• Inspect the wafer to find the anomalies.
• Normalize the die to add die information to the data.
• Run die-based clustering to identify clustered die.
• Run die-based sampling to select anomalies to classify.
• Classify the anomalies.
• Apply a pre-defined set of kill ratios to each type of anomaly.
• Calculate the DLY.


Summary

• Wafer-based counting strategies
  – Do not adequately monitor and control the unit product – the die
  – Can be driven by noise – the high points on the chart
  – Waste effort by focusing on the trivial many, while missing the vital few
• Die-based DLY strategy
  – Removes a lot of the noise that can result in missed opportunities
  – Focuses attention on factors that can drive issues affecting the die
    • Number of die affected
    • Probability of anomalies causing faults from extracted defects
    • Number of anomalies (extracted defects) on anomalous die
  – Manages the impact of clustered die on the data
  – Emphasizes the vital few, while minimizing the trivial many


References

Menon, Venu B., "Chapter 27: Yield Management", "Handbook of Semiconductor Manufacturing Technology", Marcel Dekker Inc., 2000, pp. 869-887.

Nurani, R.K., "Effective Defect Management Strategies For Emerging Fab Needs", Statistical Methodology, IEEE International Workshop, 2001, pp. 33-37.

Riley, Stuart, "A Simplified Approach to Die-Based Yield Analysis", Semiconductor International, Vol. 30, No. 8, August 2007, pp. 47-51.

Riley, Stuart L., "Limitations to Estimating Yield Based on In-Line Defect Measurements", 1999 International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT), 1999, p. 46.

Riley, Stuart L., "Estimating the Impact of Defects on Yield from In-Line Defect Measurement Data", Semiconductor International Web Exclusive, December 1999, http://www.semiconductor.net/article/206973-Estimating_the_Impact_of_Defects_on_Yield_from_In_Line_Defect_Measurement_Data.php?rssid=20279

Riley, Stuart L., "Optical Inspection of Wafers Using Large Area Defect Detection and Sampling", Proceedings IEEE International Workshop on VLSI Systems, November, 1992 (pp. 12-21).

Stapper, Charles, et al., "Integrated Circuit Yield Statistics", Proceedings of the IEEE, Vol. 71, No. 4, April 1983, pp. 453-470.