Bayesian Inference Technique for Data mining for Yield Enhancement in Semiconductor Manufacturing Data
Presenter: M. Khakifirooz
Co-authors: C-F Chien, Y-J Chen
National Tsing Hua University
ISMI 2015, 16th-18th Oct., KAIST, Daejeon, Korea

Bayesian Inference Technique for Data mining for Yield ...xs3d.kaist.ac.kr/ismi2015/Presentations/ismi2015... · System Analysis Yield Learning Curve of Semiconductor Manufacturing




Outline

• The Purpose of Bayesian Inference
• Data Structure provided by Data Model
• Data Analysis Approach
    • Bayesian Variable Selection (BVS)
    • Data Clearance
    • Yield Classification
• Conclusive Research Framework
• Final Decision Table
• Conclusion & Path Forward


The Purpose of Bayesian Inference

• Bayesian Inference
• Naïve Bayesian Classifier
• Gaussian Bayesian Classifier
• Bayesian Networks
• Learning Curve


The Purpose of Bayesian Inference

Human Experience + System Analysis

Yield Learning Curve of Semiconductor Manufacturing:
In addition to data analytics, cumulative engineering training and experience significantly enhanced yield improvement.

Effron (1996), Tobin et al. (1999)


Data Structure provided by Data Model

Notation:
• $N$: sample size
• $i = 1, \dots, M$: number of process stages
• $1 \le k_i \le N$: number of specific tools at each stage
• $n_{ij},\ j = 1, \dots, k_i$: frequency of each specific tool
• $1 \le P_{n_{ij}} \le n_{ij}$: number of existing chambers for each tool
• $p_l,\ l = 1, \dots, P_{n_{ij}}$: frequency of each existing chamber

$$N = \sum_{j=1}^{k_i} \sum_{l=1}^{P_{n_{ij}}} p_l, \qquad N \cdot M = \sum_{i=1}^{M} \sum_{j=1}^{k_i} \sum_{l=1}^{P_{n_{ij}}} p_l$$

Response Variable: %Yield (continuous)
Explanatory Variables:
• Stages (tools-chambers) (nominal)
• Stages (process time) (continuous)

Nominal Variables

Obs.  var1  var2
n1    a1    a2
n2    a1    b2
n3    b1    NA

Dummy Variables

Obs.  var1-a1  var1-b1  var2-a2  var2-b2
n1    1        0        1        0
n2    1        0        0        1
n3    0        1        0        0
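The conversion from nominal to dummy variables shown in the two tables can be sketched in a few lines of Python. This is only an illustration on the three observations from the slide; the function name `one_hot` and the convention that a missing level (NA) yields all zeros for that variable are assumptions read off the table, not a description of the authors' code.

```python
from typing import Optional


def one_hot(rows: list[dict[str, Optional[str]]]) -> tuple[list[str], list[list[int]]]:
    """Expand nominal columns into 0/1 dummy columns.

    A missing level (None) produces zeros in every dummy of that variable.
    """
    # Collect the levels of each variable in order of first appearance.
    levels: dict[str, list[str]] = {}
    for row in rows:
        for var, level in row.items():
            seen = levels.setdefault(var, [])
            if level is not None and level not in seen:
                seen.append(level)

    columns = [f"{var}-{level}" for var, seen in levels.items() for level in seen]
    matrix = [
        [int(row.get(var) == level) for var, seen in levels.items() for level in seen]
        for row in rows
    ]
    return columns, matrix


# The three observations from the "Nominal Variables" table.
observations = [
    {"var1": "a1", "var2": "a2"},  # n1
    {"var1": "a1", "var2": "b2"},  # n2
    {"var1": "b1", "var2": None},  # n3, var2 is missing
]
columns, dummies = one_hot(observations)
for name, row in zip(["n1", "n2", "n3"], dummies):
    print(name, row)
```

Each nominal variable with L levels becomes L dummy columns, which is the source of the high dimensionality discussed later in the deck.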


Data Structure provided by Data Model

Yield   stage 1  stage 2
obs. 1  Tool 1   Tool 2
obs. 2  Tool 1   Tool 1
obs. 3  Tool 2   Tool 2

Yield   stage 1    stage 2
obs. 1  Chamber 1  Chamber 2
obs. 2  Chamber 2  Chamber 1
obs. 3  Chamber 1  Chamber 2

Yield   stage 1            stage 2
obs. 1  Tool 1. Chamber 1  Tool 2. Chamber 2
obs. 2  Tool 1. Chamber 2  Tool 1. Chamber 1
obs. 3  Tool 2. Chamber 1  Tool 2. Chamber 2

Yield   stage 1   stage 2
obs. 1  Date 1.1  Date 1.2
obs. 2  Date 2.1  Date 2.2
obs. 3  Date 3.1  Date 3.2

Yield   s1.T1.Ch1  s1.T1.Ch2  s1.T2.Ch1  s2.T2.Ch2  s2.T1.Ch1
obs. 1  1          0          0          1          0
obs. 2  0          1          0          0          1
obs. 3  0          0          1          1          0

Yield   s1.T1.Ch1  s1.T1.Ch2  s1.T2.Ch1  s2.T2.Ch2  s2.T1.Ch1
obs. 1  Date 1.1   0          0          Date 1.2   0
obs. 2  0          Date 2.1   0          0          Date 2.2
obs. 3  0          0          Date 3.1   Date 3.2   0
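As a rough sketch of how the stage, tool, chamber and date tables could be merged into one wide table, the snippet below builds one column per stage.tool.chamber combination and fills it either with a 0/1 indicator or with the process date. The `records` layout and the helper names are hypothetical; the values are taken from the tool, chamber and date tables on this slide.

```python
from typing import Callable

# obs -> stage -> (tool, chamber, process date), copied from the slide's tables.
records = {
    "obs. 1": {"s1": ("T1", "Ch1", "Date 1.1"), "s2": ("T2", "Ch2", "Date 1.2")},
    "obs. 2": {"s1": ("T1", "Ch2", "Date 2.1"), "s2": ("T1", "Ch1", "Date 2.2")},
    "obs. 3": {"s1": ("T2", "Ch1", "Date 3.1"), "s2": ("T2", "Ch2", "Date 3.2")},
}


def combined_columns(data: dict) -> list[str]:
    """List every stage.tool.chamber combination that occurs in the data."""
    names = {
        f"{stage}.{tool}.{chamber}"
        for route in data.values()
        for stage, (tool, chamber, _date) in route.items()
    }
    return sorted(names)


def expand(data: dict, fill: Callable[[str], object]) -> dict[str, dict[str, object]]:
    """Build the wide table; cells of unused combinations stay 0."""
    columns = combined_columns(data)
    table = {}
    for obs, route in data.items():
        row: dict[str, object] = {name: 0 for name in columns}
        for stage, (tool, chamber, date) in route.items():
            row[f"{stage}.{tool}.{chamber}"] = fill(date)
        table[obs] = row
    return table


indicator = expand(records, lambda date: 1)    # nominal (tools-chambers) view
dated = expand(records, lambda date: date)     # process-time view
print(combined_columns(records))
print(dated["obs. 3"])
```

The column order produced by `sorted` is a convenience of this sketch and need not match the order used on the slide.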


Data Structure provided by Data Model

Obs.  var1-a1  var1-b1  var1-c1
n1    1        0        0
n2    0        0        1
n3    0        1        0

Pr(i-th variable selected): 1/3, 1/3, 1/3

selection probability based on engineer experience

$$(\text{var1-}a_1,\ \text{var1-}b_1,\ \text{var1-}c_1) \overset{d}{\sim} \text{Multinomial}\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)$$

[Figure: outcome space with vertices (1,0,0) for var1-a1, (0,1,0) for var1-b1 and (0,0,1) for var1-c1]

To randomly pick a point in this space, we need a continuous distribution.

Distribution over Multinomial (posterior distribution): Dirichlet Distribution
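To make the role of the Dirichlet distribution concrete, here is a minimal sketch that draws a point from the probability simplex spanned by the three dummy levels. It relies on the standard result that a Dirichlet prior combined with multinomial counts gives a Dirichlet posterior. The prior `[1, 1, 1]` is a hypothetical stand-in for "engineer experience", and the counts are the one-observation-per-level pattern of the table; neither is taken from the authors' data.

```python
import random


def sample_dirichlet(alpha: list[float], rng: random.Random) -> list[float]:
    """Draw one point on the probability simplex from Dirichlet(alpha).

    Uses the usual construction: independent Gamma(alpha_i, 1) draws,
    normalised so that they sum to 1.
    """
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(0)

prior = [1.0, 1.0, 1.0]  # hypothetical prior weights for var1-a1, var1-b1, var1-c1
counts = [1, 1, 1]       # n1, n2, n3: each level observed once

# Conjugate update: Dirichlet(prior) + multinomial counts -> Dirichlet(prior + counts).
posterior = [a + c for a, c in zip(prior, counts)]

theta = sample_dirichlet(posterior, rng)  # one random selection-probability vector
posterior_mean = [a / sum(posterior) for a in posterior]
print(theta, posterior_mean)
```

Raising one entry of `prior` would encode a stronger prior belief that the corresponding level matters, which is one way to read "selection probability based on engineer experience".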


Data Analysis Approach

Critical Phenomena:
i. High dimensionality caused by transforming categorical variables into dummies
ii. Multicollinearity caused by the nature of dummy variables
iii. A complicated posterior distribution, which makes direct variable selection hard

Remedy: Approximate Inference with Sampling
Use random sampling (MCMC techniques: Gibbs sampler, Metropolis-Hastings, …) to approximate the distribution and select the significant explanatory variables.


Data Analysis Approach: Gibbs Sampler

Suppose $(x_1, x_2) \sim \Pr(x_1, x_2)$.
Begin with initial values $x_1^{0}, x_2^{0}$.
Sample at iteration $t$ as follows:

Iteration  Sample x1                                 Sample x2
t          $x_1^{t} \sim \Pr(x_1 \mid x_2^{t-1})$    $x_2^{t} \sim \Pr(x_2 \mid x_1^{t})$

Iterate the above step until the sample values have the same distribution as if they were sampled from the true posterior joint distribution.

Based on frequency of visits, select the most probable variables.
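The alternating update can be illustrated on a toy target where both conditionals are known in closed form. The sketch below runs a Gibbs sampler for a standard bivariate normal with correlation `rho`, for which x1 given x2 is N(rho·x2, 1 − rho²) and symmetrically for x2. This target, the burn-in length and the function name are assumptions chosen for illustration; the variable-selection posterior used in the talk is not reproduced here.

```python
import math
import random


def gibbs_bivariate_normal(rho: float, n_iter: int, burn_in: int, seed: int = 0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # std. dev. of each full conditional
    x1, x2 = 0.0, 0.0                     # initial values x1^0, x2^0
    samples = []
    for t in range(n_iter):
        x1 = rng.gauss(rho * x2, cond_sd)  # x1^t ~ Pr(x1 | x2^(t-1))
        x2 = rng.gauss(rho * x1, cond_sd)  # x2^t ~ Pr(x2 | x1^t)
        if t >= burn_in:                   # discard early, non-stationary draws
            samples.append((x1, x2))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_iter=20000, burn_in=1000)

# Check that the retained draws reproduce the target's mean and correlation.
n = len(samples)
mean1 = sum(a for a, _ in samples) / n
mean2 = sum(b for _, b in samples) / n
var1 = sum((a - mean1) ** 2 for a, _ in samples) / n
var2 = sum((b - mean2) ** 2 for _, b in samples) / n
cov = sum((a - mean1) * (b - mean2) for a, b in samples) / n
corr = cov / math.sqrt(var1 * var2)
print(round(mean1, 3), round(mean2, 3), round(corr, 3))
```

In a variable-selection setting the sampled state would instead be a vector of inclusion indicators, and the "frequency of visits" would be the share of retained iterations in which each variable is included.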


Data Analysis Approach: Data Clearance

When X is categorical (dummy var.) & Y is a quantitative variable:
- parametric or non-parametric?
- dependent or independent?
- unbalanced class?

              Yield value              Representative var.
Bad Yield     < 53.12                  1
Middle Yield  53.12 ≤ yield ≤ 57.51    ignore
Good Yield    > 57.51                  0
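The cut points in the table translate directly into a small labelling function. The sketch below uses the two thresholds from the slide; the function name and the sample yields are hypothetical.

```python
from typing import Optional

BAD_CUT = 53.12   # yields below this are "Bad"
GOOD_CUT = 57.51  # yields above this are "Good"


def representative(yield_pct: float) -> Optional[int]:
    """Map a %yield to the binary representative variable.

    Returns 1 for bad yield, 0 for good yield, and None for the
    middle band, which is ignored in the analysis.
    """
    if yield_pct < BAD_CUT:
        return 1
    if yield_pct > GOOD_CUT:
        return 0
    return None


wafers = [50.0, 53.12, 55.0, 57.51, 60.2]  # hypothetical %yield values
labels = [representative(y) for y in wafers]

# Keep only the wafers that fall in the bad or good class.
kept = [(y, lab) for y, lab in zip(wafers, labels) if lab is not None]
print(labels, kept)
```

Both boundary values (53.12 and 57.51) fall in the ignored middle band, matching the "≤" signs in the table.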


Data Analysis Approach: Data Clearance

                       Variable II
                       Level a   Level b
Variable I   Level c   f_ca      f_cb
             Level d   f_da      f_db

If both var. I & var. II are explanatory:
- test the interchangeability of measures
- measure the degree of homogeneity

If var. I is explanatory and var. II is response:
- measure the reliability of the instrument (test/scale)
- measure the objectivity or lack of bias

Measurement of Agreement: W. S. Robinson (1957)

Cohen’s Kappa 𝒦:
𝒦 < 0: "No agreement"
0 ≤ 𝒦 < 0.2: "Slight agreement"
0.2 ≤ 𝒦 < 0.4: "Fair agreement"
0.4 ≤ 𝒦 < 0.6: "Moderate agreement"
0.6 ≤ 𝒦 < 0.8: "Substantial agreement"
0.8 ≤ 𝒦 ≤ 1: "Almost perfect agreement"
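For reference, Cohen's kappa compares the observed agreement p_o (the share of counts on the diagonal of the frequency table) with the agreement p_e expected from the marginals alone: kappa = (p_o − p_e) / (1 − p_e). The sketch below computes it for a square frequency table and maps the value to the labels listed on the slide. The example counts are hypothetical, and the function names are illustrative.

```python
def cohen_kappa(table: list[list[float]]) -> float:
    """Cohen's kappa for a square frequency table (rows: var. I, columns: var. II).

    Undefined (division by zero) when the chance agreement equals 1.
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(size)) / total
    row_marginals = [sum(row) / total for row in table]
    col_marginals = [sum(row[j] for row in table) / total for j in range(size)]
    expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa: float) -> str:
    """Translate kappa into the agreement classes used on the slide."""
    if kappa < 0:
        return "No agreement"
    bands = [
        (0.2, "Slight agreement"),
        (0.4, "Fair agreement"),
        (0.6, "Moderate agreement"),
        (0.8, "Substantial agreement"),
    ]
    for upper, label in bands:
        if kappa < upper:
            return label
    return "Almost perfect agreement"


# Hypothetical 2x2 frequencies: [[f_ca, f_cb], [f_da, f_db]].
table = [[45, 15], [25, 15]]
kappa = cohen_kappa(table)
print(round(kappa, 4), agreement_label(kappa))
```

Here p_o = 0.60 and p_e = 0.54, so kappa is about 0.13, which falls in the "Slight agreement" band.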


Research Framework (I)

A Bayesian Framework for Semiconductor Manufacturing Data

Phases: Problem Definition; Data Preparation; Data Mining & Key Factor Screening

Flowchart boxes:
• Data Integration
• Dummy Variable Construction for Integrated Variables (1460 var.)
• Wrap the associated variables
• Cohen’s Kappa Statistics for each pair of input variables (branches: Agreement / No Agreement)
• Assign Cutting Point & Bad/Middle/Good Wafers

THE CLASS DISTRIBUTION FOR THE KAPPA TEST FOR EACH PAIR OF INPUT VARIABLES

Agreement class            Number of pairs
Almost perfect agreement   3
Substantial agreement      109
Moderate agreement         1,764
Fair agreement             24,539
Slight agreement           280,081
No agreement               758,574
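The six class counts can be cross-checked against the 1460 dummy variables mentioned in the framework: the short calculation below sums the counts and compares the total with the number of unordered pairs of 1460 variables. The two totals agree, which suggests (as an inference from the numbers, not a statement in the slides) that one kappa test was run per unordered pair.

```python
from math import comb

# Counts copied from the class distribution table.
class_counts = {
    "Almost perfect agreement": 3,
    "Substantial agreement": 109,
    "Moderate agreement": 1_764,
    "Fair agreement": 24_539,
    "Slight agreement": 280_081,
    "No agreement": 758_574,
}

total_pairs = sum(class_counts.values())
pairs_from_1460 = comb(1460, 2)  # unordered pairs of 1460 variables

# Share of pairs with at least moderate agreement.
strong = sum(
    class_counts[name]
    for name in ("Almost perfect agreement", "Substantial agreement", "Moderate agreement")
)
print(total_pairs, pairs_from_1460, strong / total_pairs)
```

Fewer than 0.2% of the pairs reach moderate agreement or better.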


Research Framework (II)

Phases: Data Mining & Key Factor Screening; Model Construction, Evaluation & Interpretation

Flowchart boxes:
• Cohen’s Kappa Statistics for each pair of X & Y (branches: Agreement / No Agreement)
• Data Clearance 𝒦 ≤ 0.2
• BVS via Gibbs Sampler
• GLM Construction with Gaussian distribution & Repeated Random Sub-sampling Validation
• A Comparison to the Wrapped Variables
• Define Abnormal Devices & Time

               RMSE                      Adjusted R-squared
Model          Min     Median  Max       Min     Median  Max
Gibbs + GLM    1.842   2.653   2.841     0.046   0.371   0.711
GBM + GLM      2.534   3.051   3.332     0.000   0.053   0.337
RF + GLM       2.268   2.838   3.660     0.016   0.293   0.507
GLM            7.951   34.60   139.8     0.000   0.029   0.214

Number of resamples 20, Number of iterations 2
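Repeated random sub-sampling validation, which produces the Min/Median/Max columns above, can be sketched as follows: split the data at random into training and test parts, fit the model on the training part, score it on the test part, and repeat. The example uses a one-predictor least-squares fit (a Gaussian GLM with identity link reduces to ordinary least squares) on synthetic data. The 80/20 split, the data-generating line and all function names are assumptions; only the 20 resamples come from the slide.

```python
import math
import random
import statistics


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rmse(intercept: float, slope: float, xs: list[float], ys: list[float]) -> float:
    """Root mean squared prediction error on the given points."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(statistics.fmean(errors))


def repeated_subsampling(xs, ys, n_resamples=20, test_fraction=0.2, seed=0):
    """Return (min, median, max) test RMSE over random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(1, int(len(xs) * test_fraction))
    scores = []
    for _ in range(n_resamples):
        rng.shuffle(indices)  # a fresh random split for every resample
        test, train = indices[:n_test], indices[n_test:]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(rmse(b0, b1, [xs[i] for i in test], [ys[i] for i in test]))
    return min(scores), statistics.median(scores), max(scores)


# Synthetic data: y = 2 + 3x + Gaussian noise with standard deviation 1.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0, 1) for x in xs]

rmse_min, rmse_median, rmse_max = repeated_subsampling(xs, ys)
print(rmse_min, rmse_median, rmse_max)
```

Because the splits are drawn independently, a test point may appear in several resamples or in none, which distinguishes this scheme from k-fold cross-validation.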


Decision Graph

[Figure: decision graph with classes High Yield, Middle Yield and Low Yield]


Decision Table

Factors                                   Date (Bad)                                  Date (Good)
Stage10 - Tool2 - Chamber3                before 8/29/2014 2:32                       after 8/29/2014 12:50
Stage12 - Tool2 - Chamber1                between 8/30/2014 3:26 & 8/30/2014 3:43     before 8/29/2014 10:55
Stage12 - Tool2 - Chamber4                after 8/29/2014 7:36 till 8/30/2014 3:44    before 8/29/2014 7:36
Stage13 - Tool5 - Chamber2                -                                           generally affected the high yield
Stage17 - Tool2 - Chamber2                after 8/30/2014 12:21                       before 8/30/2014 10:37
Stage23 - Tool3 - Chamber2                -                                           generally affected the high yield
Stage44 - Tool7 - Chamber2 and Chamber3   at 9/3/2014                                 at 9/1/2014
Stage49 - Tool1 - Chamber4                at 9/3/2014                                 at 9/2/2014
Stage57 - Tool1 - Chamber3                -                                           generally affected the high yield


Conclusion & Path Forward

Based on the empirical results, the proposed approach shows practical viability: adding domain knowledge and experience to the system can improve the results.

Domain knowledge might be used to restrict conjunctions in rules to tools, chambers and steps that occur within a reasonable time frame.

The data are not sampled from a stationary population. Hence the results may change significantly over time, or some empirical answers might be rejected based on the engineers' domain knowledge, which does not mean that the result is incorrect.

The result may be a proxy for one or more events that occur elsewhere or at other periods of time. Hence a simulation study is an essential tool for evaluating the accuracy of the proposed method.
