Design and Analysis of Computer Experiments
Nathan Soderborg
DFSS Master Black Belt
Ford Motor Co.
WCBF DFSS Conference Workshop
Feb 9, 2009
February 2009 N. Soderborg 2
Outline
• Background: Six Sigma Context
• Foundation: Useful Computer Models
• Deterministic vs. Probabilistic Approaches
• Monte Carlo Simulation
• Design of Computer Experiments
• Analysis of Computer Experiments
• Case Studies
Background: Six Sigma Context
Design for Six Sigma
• A scientific PD approach that leverages Six Sigma culture
• A means to re-instill rigorous deductive and inductive reasoning in PD processes…
• Definition of objective engineering metrics with targets correlated to customer needs and desires
• Characterization of product performance using transfer functions to assess risks
• Optimization of designs through transfer function knowledge and identification of counter-measures to avoid potential failure modes
• Verification that designs perform to targets and counter-measures eliminate potential failure modes
Define CTS's → Characterize System → Optimize Product/Process → Verify Results (Ford DCOV)
Definition of a Transfer Function
• A mathematical model that relates an output measure Y to input variables (x's):
  Y = F(y1, …, yn), y1 = f(x1, …, xn), etc.
• Why "transfer" function? ("function" or "equation" would suffice)
• For purposes of today's discussion, transfer functions are computer models
Where Transfer Functions Come From
• Deduction: using first principles to characterize system physics, geometry, or material properties
  • Physics equations that describe function, e.g., V = IR, f = ma, f = kx, k.e. = ½mv²
  • Finite element and other analytic models, e.g., computer models not expressible in closed-form equations
  • Geometric descriptions of parts and systems, e.g., equations from schematics based on reverse engineering, lumped mass models, drawings & prints; variation/tolerance stack-up
• Induction: analyzing experimental, empirical data
  • Directed experimentation, e.g., a response surface or multivariate regression equation from DOE using analytic models or hardware
  • Analysis of existing data, e.g., regression to enhance informed observations
[Figure: spectrum from TFs based on "first principles" to TFs based on empirical data, with increasing degree of approximation]
What Transfer Functions are Used For
• In early phases of a project, a typical goal is to develop or improve transfer functions that
  • Correlate customer needs to objective metrics
  • Provide a formula for system output "y" based on input "x's"
• In later phases, a typical goal is to exploit those transfer functions to identify optimal robust designs, i.e., achieve performance
  • On target
  • With minimal variability
  • At affordable cost
This requires probabilistic capability & analysis, i.e., being able to represent the output of the model as a probability distribution
[Figure: response distribution of the original design (off target, wide) vs. the optimized design (on target, narrow)]
Foundation: Useful Computer Models
“All models are wrong; some are useful.”
--George Box
Characteristics of a Good Model
• Fits Data
  • For a deductive, first-principles model: fits data collected from physical tests
  • For an inductive, statistical model: fits the data sample used to construct the model
• Predicts Well
  • Predicts responses well at points not included in the data sample or regions of space used to construct the model
  • Interpolates well
  • Extrapolates well
Did we do the modeling right?
Characteristics of a Good Model
• Parsimonious (conceptually)
  • Is the simplest of competing models that adequately predicts a phenomenon
  • Note: introducing more terms in a model may improve fit, but over-complicate the model (and impair prediction)
• Parsimonious (from a business perspective)
  • Incurs reasonable development cost compared to the knowledge and results expected
  • Incurs containable computation costs
Did we do the modeling right?
Characteristics of a Good Model
• Interpretable
  • Correctly applies & represents physics, geometry, & material properties
  • Provides engineering insight; answers the desired questions
  • Contains terms that are fundamental (e.g., dimensional analysis)
  • Has clear purpose & boundaries (the domain can be small and still useful)
Did we model the right things?
P-Diagram
• In engineering, computer models should help us simulate or predict performance under real-world conditions
• We would like to account for variability in build, environment, and usage (aka: noise)
• A high-level framework for this is the Parameter Diagram (see Phadke, Davis)
[P-Diagram: signal xS and control factors xC enter the system; noise factors xN act on it; the ideal function is y = f(xS, xC, xN); deviations appear as error states/failure modes]
Challenge of Representing Noise in Models
• Models based on first principles will include factors from physics, such as:
  • Loads, energy transfer
  • Properties of materials
  • Dimensions and geometries
• Often the particular noise factors we identify are not factors in our model, but are there "surrogates"?
• Try to understand and estimate the effect of variability in noise factors on factors included in the model
Typical Noise Factor Types:
• Manufacturing variation
• Deterioration over time
• Customer usage/duty cycles
• External environment
• System interactions
Typical Model Factor Types:
• Load/Energy Transfer
• Material Properties
• Geometry & Dimensions
Translate the effects of variation in the noise factors into variation in the model factors.
Deterministic vs. Probabilistic Approaches
Levels of Design Refinement
• Trial & Error
  • Hand calculations
  • Physical tests as needed
  • Learning from experience
• Planned Physical Experimentation (DOE)
  • Empirical learning
  • Statistical analysis
• Analytic Modeling (Deterministic)
  • Computer calculations
  • "What-if" scenarios
• Analytic Robust Design (Stochastic/Probabilistic)
  • Designed experiments
  • Optimization (single and multiple objective)
Looking for a new design concept? That requires a different set of tools.
Deterministic Analysis
Inputs: nominal or worst-case values of
• Dimensions
• Materials
• Loads
• etc.
Input examples: gages, Young's modulus, cylinder pressure
Computer Model examples: finite element analysis, regression equation, numerical model
Outputs: point estimate of
• Performance
• Life
• Safety factor or design margin
Output examples: deflection, life, voltage
Deterministic Analysis
[Figure: Designs 1 and 2, each showing mean stress, mean strength, and the safety margin between them]
Which design is more reliable?
Probabilistic Analysis
DESIGN 1: smaller safety factor, higher reliability
DESIGN 2: larger safety factor, lower reliability
[Figure: overlapping stress and strength distributions for each design, with the safety margin between the means]
The interference region between stress and strength defines the probability of failure. This determines reliability.
A design with a larger safety factor may have lower reliability depending upon stress and strength variability.
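Under an independent-normal assumption (an assumption made here for illustration; the slide makes no distributional claim), the interference probability has a closed form, and it is easy to construct a design with a larger safety margin but higher failure probability. A minimal sketch:

```python
from math import erf, sqrt

def failure_probability(mu_stress, sd_stress, mu_strength, sd_strength):
    """P(stress > strength) for independent normal stress and strength.
    The safety margin M = strength - stress is normal with mean
    mu_strength - mu_stress and variance sd_stress^2 + sd_strength^2;
    failure is the event M < 0."""
    mu_m = mu_strength - mu_stress
    sd_m = sqrt(sd_stress**2 + sd_strength**2)
    z = -mu_m / sd_m
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Design 1: smaller safety margin (50) but low variability
p1 = failure_probability(mu_stress=400, sd_stress=10, mu_strength=450, sd_strength=10)
# Design 2: larger safety margin (80) but high variability
p2 = failure_probability(mu_stress=400, sd_stress=40, mu_strength=480, sd_strength=40)
```

Despite its larger margin, Design 2 comes out roughly two orders of magnitude less reliable, which is exactly the point of the slide.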
Probabilistic Analysis & Optimization
Inputs: for a given nominal, sample the assumed distribution around the nominal:
• Dimensions
• Material properties
• Loads
• Usage
• Manufacturing
• etc.
Computer Model (iterated over multiple nominal values)
Outputs:
• Performance variability at nominal: dispersion, local sensitivity, reliability assessment
• Performance variability across multiple designs: global sensitivity, robust design direction, robust design optimization
Probabilistic Optimization: Example
• Objective: find the fixture design that minimizes deflection, accounting for manufacturing variation
• Design Variables:
  • Locator Positions (4)
  • Clamp Positions (4)
  • Clamp Force
1. Optimization without variability
2. Optimization including variability
[Chart: deflection (smaller is better) vs. clamp position for the engine block mfg. fixture, showing the range of response variability for designs 1 and 2]
Challenges to Probabilistic Design
• Statistical distributions for input factors may be unknown and costly to ascertain
• Data that is available may be imprecise
• The organization may
  • Lack statistical expertise or training
  • Have difficulty dealing with results that include uncertainty
All of this is OK!
• The goal should not be to predict reliability precisely
• Rather, the goal is to make and demonstrate improvement
  • Learn by using data from similar processes when available
  • Try a variety of assumptions to convey a range of possible risks
  • Use analyses to make comparisons instead of absolute predictions
Monte Carlo Simulation
Monte Carlo Simulation
1. Assign a probability distribution to each input variable, xi
   a. Generate a "random" instance of each xi from its distribution
   b. Calculate and record the value of y obtained by substituting the generated instances into the transfer function
2. Repeat steps a & b many times (e.g., 100-1,000,000)
3. Calculate y statistics, e.g., mean, std. dev., histogram
4. Estimate success or failure probability based on targets/limits

y = f(x1, x2, …, xd)
[Diagram: the PDFs of x1, x2, …, xd feed the transfer function/computer model, producing the PDF of y with a limit marked]
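The steps above can be sketched with Python's standard library; the transfer function, distributions, and limit here are hypothetical stand-ins for a real computer model:

```python
import random
import statistics

def transfer_function(x1, x2):
    # Hypothetical closed-form stand-in for a computer model
    return x1 * x2 / 10.0

random.seed(42)            # a fixed seed repeats the same pseudo-random sequence
N = 10_000
ys = []
for _ in range(N):
    x1 = random.gauss(50.0, 2.0)   # step a: draw each xi from its distribution
    x2 = random.gauss(20.0, 1.0)
    ys.append(transfer_function(x1, x2))   # step b: record y

mean_y = statistics.mean(ys)       # step 3: y statistics
sd_y = statistics.stdev(ys)
limit = 110.0
p_exceed = sum(y > limit for y in ys) / N   # step 4: failure probability vs. a limit
```

With these inputs the simulated response centers near 100 with a standard deviation around 6, and a few percent of runs exceed the limit.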
Example: Door "Drop-off"
• Performance Variable: door drop-off
• Model: finite element analysis
• Design Variables:
  • Number of missing welds
  • Materials: door, hinge, reinforcement, hinge pillar
  • Gages: door, hinge, reinforcement, hinge pillar
  • Central gravity location
  • Trim weight
• Design Requirement: drop-off < 1.5 mm
• Goal of the Study:
  • Check if the drop-off requirement is met when variation in the design variables is considered
  • Explore opportunities for design improvement or cost reduction
Example: Door "Drop-off"
• Conclusions
  • The design meets the drop-off requirement even when variation in gages, material, trim weight, and center of gravity is present
  • The door hinge reinforcement is the most dominant factor for controlling door drop-off
  • Cost may be reduced by downgaging the door hinge reinforcement from 2.4 mm to 2.0 mm (must demonstrate fatigue requirements can still be met)
[Histogram: door drop-off distribution (mm); 99th percentile = 0.9560 mm]
Contribution to variability:
• Door reinforcement gage: 37%
• Trim weight: 14%
• Central gravity: 12%
• Hinge pillar reinforcement gage: 11%
Example: Vehicle Vibration
• Problem: irritating vibration phenomenon
• Response: seat track shake
• Model: vibration analysis tool
• Design Variables:
  • Stiffness of over 30 bushings
  • Stiffness and/or damping of over 20 engine mounts
  • Over 20 others: characteristics of struts, structural mounts, subframe, subframe mounts, etc.
Example: Vehicle Vibration
[Main effects plot: means for Shake@58 vs. engine mount 1 stiffness; the baseline design uses Mount Type A, the robust design uses Mount Type B]
Software for Monte Carlo Simulation
• Dimensional variation analysis tools such as VSA® employ Monte Carlo Simulation (MCS)
• Minitab® facilitates random number generation that can be used for MCS
• Several Excel-based tools are for sale; the most widespread is Crystal Ball®, which provides a custom interface for MCS in Excel
  • Allows the user to identify cells as "assumptions" (x-variables) and "forecasts" (y-variables)
  • Includes automatic generation of y-histograms, real-time updating with simulation, and optional optimization routines
• Excel's built-in random number generation can suffice when supplemental software is not available
Monte Carlo Simulation in Excel
Without supplemental software:
• Generate "random" numbers for the x's
• Calculate y-values with an Excel formula
• Use the data analysis & histogram tools to characterize the y-distribution
Select Tools / Data Analysis (available from the Analysis ToolPak Add-in)
Random Number Generation in Excel
• Number of variables: the number of xi's (columns)
• Number of random numbers: the number of instances of each xi (rows)
• Distributions (PDFs): e.g., Uniform, Normal, Bernoulli, Binomial, Poisson, Patterned, Discrete
• Seed: the same seed repeats the same set of pseudo-random numbers
• Output: the worksheet & cell range where the numbers are stored
Additional Distributions
If the desired distribution is not an automatic selection in Excel (but the inverse CDF can be coded as a function):
• For each xi, generate a set of uniformly distributed random numbers between 0 and 1
• Substitute each of the numbers into the inverse CDF of xi to obtain a set distributed as xi
• Calculate the response y for each element in this new set
• Create a histogram of the set of response values y and calculate statistics
[Figure: uniform draws on the vertical axis are mapped through the CDF of xi to produce the new distribution for xi, which feeds y = f(x) and the frequency distribution of y]
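The same inverse-CDF recipe works outside Excel. Here is a sketch for an exponential distribution, chosen only because its inverse CDF has a simple closed form; the rate parameter is arbitrary:

```python
import math
import random

LAM = 2.0  # example rate parameter

def inverse_cdf_exponential(u):
    # Inverse CDF of Exponential(LAM): F^{-1}(u) = -ln(1 - u) / LAM
    return -math.log(1.0 - u) / LAM

random.seed(1)
n = 20_000
# Uniform(0,1) draws pushed through the inverse CDF follow the target distribution
samples = [inverse_cdf_exponential(random.random()) for _ in range(n)]
sample_mean = sum(samples) / n   # should approach the true mean 1/LAM = 0.5
```

The sample mean converging to 1/LAM is a quick sanity check that the transformation produced the intended distribution.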
Door Latch Example
• A door latch design & production team developed mathematical equations for key customer outputs (outside release effort, outside travel) using the part drawings and applying principles of trigonometry and elementary physics
• These equations were coded into an Excel spreadsheet
• The team had production data (capability, mean, standard deviation) available for the input variables in the equations:
  • Part dimensions
  • Part edge curvature and geometry
  • Spring forces
  • Etc.
Door Latch Example—Spreadsheet Model
Factor data: nominal, spread…
Transfer Function Equation (example):
y = D4*(SQRT(Z4*Z4+AA4*AA4)/SQRT(AB4*AB4+AC4*AC4))
Door Latch Example—Simulation & Results
[Spreadsheet: each column holds 1,000 draws from a separate distribution; y (τO/S, Outside Travel) is calculated for each row using the transfer function]

Variable    d1        d2        d3        d4        d5        d6        τO/S Calculated
NOMINAL     15.8000   25.3000   19.0400   16.2000   17.9500   1.6800    6.171921
Sample 1    15.94288  25.36693  19.10459  16.17876  17.98629  1.626887  6.258756
Sample 2    15.86396  25.34382  19.01017  16.17493  17.92128  1.711719  6.031181
Sample 3    15.80739  25.13663  18.96766  16.17173  17.98257  1.670076  6.009915
…           …         …         …         …         …         …         …

• Estimate the % of product outside specs (LSL*, USL*) based on the variation assumptions for the x's
• Draw a histogram of Outside Travel (bins from about 5.78 to 6.56 mm)
*Limits are examples only
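The spreadsheet's workflow can be mimicked in Python. Everything below except the d1-d6 nominals is assumed for illustration: the standard deviations, the spec limits, and the transfer function itself (the real latch equation was a proprietary spreadsheet formula not reproduced here):

```python
import math
import random

random.seed(7)
# Nominal d1..d6 values from the slide; standard deviations are assumed
nominals = [15.80, 25.30, 19.04, 16.20, 17.95, 1.68]
sigmas   = [0.08,  0.10,  0.06,  0.05,  0.05,  0.04]

def outside_travel(d1, d2, d3, d4, d5, d6):
    # Hypothetical stand-in for the latch transfer function; the real
    # spreadsheet combined several trigonometric and geometric terms
    return d1 * math.sqrt(d2**2 + d3**2) / math.sqrt(d4**2 + d5**2) - 14.5 - 0.05 * d6

# Each column is a separate distribution; generate 1,000 rows, one y per row
rows = [[random.gauss(m, s) for m, s in zip(nominals, sigmas)] for _ in range(1000)]
ys = [outside_travel(*row) for row in rows]

lsl, usl = 5.8, 6.5   # example limits only, as on the slide
frac_outside = sum(not (lsl <= y <= usl) for y in ys) / len(ys)
mean_y = sum(ys) / len(ys)
```

From here, a histogram of `ys` and the fraction outside the limits reproduce the two deliverables the slide calls out.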
Crystal Ball Example
Design of Computer Experiments
Motivation
If you have a computer model already, why do designed experiments to create a model of the model?
• Make design decisions faster and cheaper
  • Some models are computationally intensive, time-consuming, and expensive to set up and run
  • Robust design analysis needs a probabilistic approach that requires many runs
  • You can replace expensive models with approximations (metamodels) for carrying out Monte Carlo simulation, robust design, multi-objective optimization, etc.
• Gain insight into the original model
  • Often the model cannot be expressed explicitly (it is a "black box"), e.g., finite element analysis
  • A metamodel can be used to efficiently understand effects, interactions, and sensitivities
A Flow for Analytic Robust Design
1. Develop & Document System Understanding: understand functions/failures, P-Diagram
2. Design a Computer Experiment: sample for uniformity, orthogonality
3. Run the Experiment: evaluate the model at each sample point
4. Develop a Response Surface Model: apply advanced regression, other methods
5. Analyze Sensitivities: find important factors
6. Assess Reliability: quantify risk
7. Optimize for Robustness: select a robust design
[Diagram: the P-Diagram (signal, control factors, noise factors, response) feeds a run matrix of n runs over factors x1…x40]
Computer-based Experimentation
• The move toward Analytic Robust Design, along with ever-increasing computing power, has fueled the development of a new field of study over the past few decades: Design and Analysis of Computer Experiments (DACE)
• Early computer experimenters realized that traditional experimental designs were sometimes inadequate or inefficient compared to alternatives
• In addition, certain non-parametric techniques for fitting the data may offer more useful models than polynomial regression
Physical vs. Computer Experiments
Physical
• Responses are stochastic (involve random error)
• Replication helps improve precision of results
• Some inputs are unknown
• Randomization is recommended
• Blocking nuisance factors may help
Computer
• Responses are deterministic (no random error)
• Replication has no value
• All inputs are known
• Randomization has no value
• Blocking is irrelevant
Physical vs. Computer Experiments
Physical
• Experiment logistics can be resource-intensive
• Minimizing the number of runs is generally desirable
• Parameter adjustment requires physical work
• The setup is usually available only for a short time period (e.g., an interruption of production)
Computer
• Experiment logistics often require fewer resources
• A relatively large number of runs may be feasible
• Parameter adjustments take place in software
• The setup can be "saved" and returned to; a sequential approach is more feasible
Physical vs. Computer Experiments
Physical
• Logistical requirements limit sampling to 2 or 3 levels per variable
• Thus, models are typically limited to linear or quadratic
• The typical design is a standard orthogonal array, e.g., a full or fractional factorial, or response surface methods
Computer
• Relative logistical ease allows variable sampling over many levels
• Multiple-level sampling allows high-order, nonlinear models
• Flexible alternatives to standard arrays are available, e.g., Latin Hypercube, Uniform designs, etc. (close to orthogonal)
Desirable Computer Experiment Properties
• Is Balanced
  • Each factor has an equal number of runs at each level
  • This weights levels equally in estimating effects
  • (The number of runs will be a multiple of the number of levels of each factor)
• Captures Response Non-linearity if Present
  • Two levels for a factor allow modeling of linear effects
  • Modeling higher-order non-linearity requires a higher number of levels per factor
• Exhibits Good Projective Properties
  • Projections onto significant factor subspaces include no "pseudo-replicates" and avoid significant point-clustering
  • Maximizes information related to significant factor behavior
Desirable Computer Experiment Properties
• Is Orthogonal or Close to Orthogonal
  • Correlation between factors is zero or close to zero (column orthogonality)
  • Allows effects of factors to be distinguished and estimated cleanly
• Fills the Design Space
  • Sample points are spread throughout the design space as evenly or uniformly as possible
  • Helps model the full range of design behavior without any assumptions on factor importance
  • Improves interpolation capability for building a good metamodel
How "space filling" a design is can be measured by various criteria; in practice, seek designs that have relatively good orthogonality and good space-filling properties
Computer Experiment Design
Example Strategies
• Orthogonal Array (traditional approach)
• Response Surface Methods (traditional approach)
• Space-filling designs, sampling based:
  • Random Sample
  • Latin Hypercube Sample
• Space-filling designs, based on optimizing various criteria:
  • Management of minimum and maximum distances between points
  • Minimum "discrepancy" or departure from uniformity
  • Maximum "entropy" or unpredictability
• Low Discrepancy (Quasi Monte Carlo) Sequences
Latin Hypercube Sampling
• Latin Squares
  • An N×N Latin Square has the property that each of N symbols appears exactly once in each row and exactly once in each column
  • Latin Hypercubes are extensions of Latin Squares to higher dimensions
• Latin Hypercubes
  • Latin Hypercube Sampling divides each dimension of the design space into N intervals
  • A set of N points is selected so that when the set is projected onto any dimension, exactly one point falls in each of the intervals for that dimension
  • (Kind of like Sudoku!)
Example 5×5 Latin Square:
B C D E A
C D E A B
D E A B C
E A B C D
A B C D E
[Figure: Latin Hypercube sample points, one per row/column interval]
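A minimal Latin Hypercube sampler is a few lines of standard-library Python; this is a basic sketch (production tools additionally optimize correlation or point spacing):

```python
import random

def latin_hypercube(n, d, rng):
    """n points in [0,1]^d: each axis is cut into n equal intervals and each
    interval receives exactly one point, at a random position within it."""
    columns = []
    for _ in range(d):
        intervals = list(range(n))
        rng.shuffle(intervals)   # which interval each run occupies on this axis
        columns.append([(k + rng.random()) / n for k in intervals])
    return [tuple(col[i] for col in columns) for i in range(n)]

rng = random.Random(0)
design = latin_hypercube(9, 2, rng)   # 9 runs, 2 factors, 9 levels per factor
```

Projecting the design onto either axis lands exactly one point in each of the 9 intervals, which is the defining LHS property.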
Latin Hypercube/Factorial Comparison
9-Run Factorial (typical physical experiment):
• 3 replicates for each variable projection
• 3 levels for each variable: at most quadratic effects can be captured
• If x2 is not significant, there are essentially 3 repeat points for each level of x1 ("pseudo-replicates"), and large regions remain unsampled
9-Run Latin Hypercube (feasible in computer experimentation):
• No replicates for variable projections
• 9 levels for each variable: higher-order nonlinear effects can be captured
[Plots: 9-run 3×3 factorial vs. 9-run Latin Hypercube over the unit square (x1, x2)]
Latin Hypercube Example
• 4-factor, 11-level LH Design

Design Matrix:
x1     x2     x3     x4
 0.8   -0.8   -0.2   -0.6
 0.2   -0.4    0.2    1
-1     -0.6   -0.6    0
-0.2   -1      0.6    0.4
 0.4    1      0.4   -0.4
-0.4    0.6   -1     -0.2
 0      0.2   -0.4   -1
-0.6   -0.2    1     -0.8
 0.6    0     -0.8    0.6
-0.8    0.8    0      0.8
 1      0.4    0.8    0.2

Pearson correlations (p-values in parentheses):
      x2              x3              x4
x1    0.027 (0.937)   0.145 (0.670)   -0.009 (0.979)
x2                    -0.073 (0.832)  -0.036 (0.915)
x3                                    -0.027 (0.937)

[Matrix plots: 2-dimensional projections of the design onto each pair of factors]
Uniform Designs
• A uniform design is a sample of points that minimizes some measure of discrepancy
  • Discrepancy is a metric quantifying "how far" the points are from being uniformly distributed
• Uniform designs allow different numbers of levels for each factor
• An existing design can be "optimized" for uniformity, e.g.,
  • A subset of a full factorial
  • A Latin Hypercube
[Figures: an initial Latin Hypercube vs. the same design optimized for uniformity]
Uniform Mixed-Level Design Example
4-factor, 12-run, mixed-level design (subset of a full factorial design)

Design Matrix:
x1    x2      x3      x4
 0     1       1      -1
 0     0.33   -0.3     0.6
-1     0.33   -1      -0.6
 0    -1       0.33   -0.6
 1     1      -1      -0.2
-1    -0.3     1      -0.2
 1    -0.3    -0.3    -1
 1     0.33    0.33    0.2
 1    -1       1       0.6
-1     1       0.33    1
-1    -1      -0.3     0.2
 0    -0.3    -1       1

Number of levels for each factor: x1: 3, x2: 4, x3: 4, x4: 6

Pearson correlations (p-values in parentheses):
      x2         x3                  x4
x1    0 (1)      0 (1)               -0.1195 (0.71139)
x2               -0.1333 (0.67953)   -0.0436 (0.89287)
x3                                   -0.0873 (0.78736)

[Matrix plots: 2-dimensional projections onto each pair of factors]
Low Discrepancy Sequences
• A sequential approach to identifying experimental points
• Useful when experiments can proceed sequentially, especially if the computer model is slow
  • While waiting for the model to generate the next output, the analyst can do preliminary work to decide if the results are accurate enough
• Sequences are based on Monte Carlo approaches to space-filling sequences used for integration
  • Such sequences may be used as substitutes for sampling from a uniform probability distribution (Quasi-Monte Carlo)
• Some sequences are specifically designed to have low discrepancy
• Roughly speaking, the discrepancy of a sequence is low if the number of points in the sequence falling into an arbitrary set B is close to proportional to the measure of B, as would happen on average in the case of a uniform distribution (Wikipedia)
Low Discrepancy Sequence Example
• Examples of sequences in the literature
  • Sobol Sequence
  • Hammersley Sequence
  • Halton Sequence
[Plots: 100 Monte Carlo samples, showing "open" spaces and point clustering, vs. 100 Halton samples filling the square more evenly]
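A Halton sequence is easy to generate from scratch: dimension j uses the radical inverse of the run index in the j-th prime base. A sketch:

```python
def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point,
    e.g., index 6 = 110 in base 2 maps to 0.011 (binary) = 0.375."""
    result, place = 0.0, 1.0
    while index > 0:
        place /= base
        result += place * (index % base)
        index //= base
    return result

# 2-D Halton points conventionally use the co-prime bases 2 and 3
halton_points = [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, 101)]
```

Because successive indices cycle through all digit patterns, the points fill the unit square far more evenly than pseudo-random draws, which is the behavior the plots above illustrate.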
Design of Computer Experiment Summary
• Latin Hypercube
  • Computationally inexpensive to generate
  • Allows a large number of runs and factors sampled at many levels
  • Good projective properties on low-dimensional subspaces
  • Available in many software sources
  • Number of levels = number of runs; this can be a big constraint
• Uniform Designs
  • Design matrices with good orthogonality and projective properties can be refined to improve uniformity
  • Algorithms apply to any number of levels and factors per level
  • Not as common in software as Latin Hypercube (JMP?)
  • Computation required to optimize designs grows with the number of runs, factors, and levels; can consume some time for big designs
Design of Computer Experiment Summary
• Low Discrepancy Sequences
  • Provide a sequence of points that fills space close to uniformly
  • Allow sequential experimentation
  • Typically, the number of levels is the same as the number of runs
  • Can be used in place of "random" sequences when more uniform sampling is desired
  • Slowly becoming available in commercial software; code can be downloaded from various websites
Analysis of Computer Experiments
Analysis of Computer Experiments
1. Develop & Document System Understanding: understand functions/failures, P-Diagram
2. Design a Computer Experiment: sample for uniformity, orthogonality
3. Run the Experiment: evaluate the model at each sample point
4. Develop a Response Surface Model: apply advanced regression, other methods
5. Analyze Sensitivities: find important factors
6. Assess Reliability: quantify risk
7. Optimize for Robustness: select a robust design
(The same flow as before, now with the analysis steps highlighted.)
Generating Response Surfaces
• A traditional approach is to treat the computer model as an unknown transfer function, f
• Assume that the transfer function has a particular form, e.g.,
  • Polynomial
  • Trigonometric function, etc.
• Find coefficients β that provide the "best fit" of a function of the assumed form to the response data, i.e., y = f(β, x)
• Responses from physical experiments will not match the output of the generated function exactly due to
  • Experimental measurement error
  • Differences in the assumed form of the function vs. the true form
  • Absence of some influential factors in the experiment
  • Etc.
Interpolation
• However, with computer experiments it would be desirable for experimental responses to match the output of the generated function exactly
  • Computer experiments are not subject to experimental error: responses reflect the true output of the analytical model
  • All input factors are known
• The challenge is to generate a metamodel that
  • Matches the response data AND
  • Predicts well the response values at points not used to construct the metamodel
• This is an interpolation problem: a specific case of curve fitting in which the function must pass exactly through the data points
Interpolation Examples
• In the plane, 2 points determine a unique line, 3 points determine a unique 2nd-order polynomial, etc.
• However, if data subject to experimental error is interpolated assuming a polynomial functional form, the result can be severe "over-fitting"
[Plots: 2 points fit by a 1st-order polynomial; 3 points fit by a 2nd-order polynomial; 6 points subject to experimental error fit by a 6th-order polynomial, where a better predictor is a best-fit line]
Metamodel Building
• For any given true model there are many metamodels
• To find a metamodel with good prediction capability, often the best approach is to
  • Try both "best fit" and interpolation methods and combinations
  • Choose a final model based on validation studies
• Generally, the metamodel is constructed as a linear combination of elements from a set of building-block functions called basis functions, i.e.,

  f̂(x) = Σⱼ₌₀ᴹ βⱼ Bⱼ(x) = β₀B₀(x) + β₁B₁(x) + … + β_M B_M(x)

  where the βⱼ are the coefficients and the Bⱼ are the basis functions
0
February 2009 N. Soderborg 61
Types of Basis Functions
� Polynomials
� Splines
� Fourier Functions
� Wavelets
� Radial Basis Functions
� Kriging Functions
� Neural Networks
� Etc.
•Most powerful for low-dimension input
variables (terms grow exponentially with
dimension)
•Results are interpretable in terms of
familiar functions
•May be more natural for high-dimension
input variables
•Results may be difficult to interpret in
terms of familiar functions
Basis Function Examples
• Polynomials (up to 2nd order, d input dimensions):
  B0(x) = 1                                      (constant)
  B1(x) = x1, …, Bd(x) = xd                      (1st-order terms)
  Bd+1(x) = x1², …, B2d(x) = xd²                 (2nd-order terms)
  B2d+1(x) = x1x2, …, B2d+d(d-1)/2(x) = xd-1·xd  (interaction terms)
• Fourier basis (1 dimension, over [0,1]):
  B0(x) = 1
  B1(x) = cos(2πx), B2(x) = sin(2πx)
  …
  B2k-1(x) = cos(2πkx), B2k(x) = sin(2πkx)
Metamodel Building
• For interpolation, find and select sufficiently many (M) basis functions so that y = Bβ can be solved for β, i.e.,

  [ y1 ]   [ B0(x1)  B1(x1)  …  BM(x1) ] [ β0 ]
  [ y2 ] = [ B0(x2)  B1(x2)  …  BM(x2) ] [ β1 ]
  [  ⋮ ]   [   ⋮       ⋮           ⋮  ] [  ⋮ ]
  [ yn ]   [ B0(xn)  B1(xn)  …  BM(xn) ] [ βM ]

  (response vector = matrix of basis functions, each evaluated at the n sample points, times the coefficient vector)

• For "best fit," find basis functions and a coefficient estimate β̂ that minimize

  Σᵢ₌₁ⁿ ( yᵢ - Σⱼ₌₀ᴹ βⱼBⱼ(xᵢ) )²,  i.e.,  β̂ = (BᵀB)⁻¹Bᵀy   (the least-squares estimator)
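The least-squares estimator can be exercised end to end on a small example. The basis set (constant, linear, quadratic) and the data are illustrative; with noise-free data from a known quadratic, the normal equations recover the true coefficients:

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting (fine for small systems)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# Basis functions B0, B1, B2 and noise-free data from a known quadratic
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [2.0 + 3.0 * x - 0.5 * x * x for x in xs]

B = [[bj(x) for bj in basis] for x in xs]   # n rows, one column per basis function
BtB = [[sum(B[i][r] * B[i][c] for i in range(len(xs)))
        for c in range(len(basis))] for r in range(len(basis))]
Bty = [sum(B[i][r] * ys[i] for i in range(len(xs))) for r in range(len(basis))]
beta = solve(BtB, Bty)   # normal equations: beta = (B^T B)^{-1} B^T y
```

In practice a QR or SVD solver is numerically preferable to forming BᵀB; the explicit normal equations are used here only to mirror the formula on the slide.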
Example: Splines, MARS
Multivariate Adaptive Regression Splines
• An automated, adaptive regression method
• Developed by Prof. Jerome Friedman of Stanford University in the early 1990s; available in commercial software from Salford Systems
• Basis functions are built from piece-wise linear "hockey stick" functions of the form
  (x - κ)+ = x - κ if x > κ, 0 otherwise
  (κ - x)+ = κ - x if x < κ, 0 otherwise
  where κ is the "knot"
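The hinge pair can be written directly; the knot and coefficients below are made up purely to show how a MARS-style model is assembled from them:

```python
def hinge_pos(x, knot):
    # (x - knot)+ : equals x - knot when x > knot, otherwise 0
    return x - knot if x > knot else 0.0

def hinge_neg(x, knot):
    # (knot - x)+ : equals knot - x when x < knot, otherwise 0
    return knot - x if x < knot else 0.0

def tiny_mars_model(x):
    # A hypothetical one-variable MARS-style model:
    # constant plus two hinge terms sharing a knot at 3.0
    return 1.0 + 2.0 * hinge_pos(x, 3.0) - 0.5 * hinge_neg(x, 3.0)
```

Each hinge is zero on one side of its knot, so the fitted surface is piece-wise linear with slope changes only at the knots.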
Example: MARS
MARS model components (one-variable example):
  y1 = 5.02196
  y2 = 0.238230*[x1 - 6.035000]
  y3 = -0.977209*[5.100000 - x1]
  y4 = -1.227350*[x1 - 5.100000]
Combined result: y = y1 + y2 + y3 + y4
Notation: [.] is also used to denote (.)+
Example: Gaussian Stochastic Kriging
• Proposed by Matheron for modeling spatial data in geostatistics (1963)
• Systematically introduced to computer experiments by Mitchell (1989)
• Uses continuous basis functions of the form

  exp( -Σⱼ₌₁ᵈ θⱼ (xⱼ - xᵢⱼ)² )

  where xij is the jth dimension of the ith sample point, and the θj are estimated in the process of generating the response surface
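A stripped-down sketch of the idea: center one Gaussian basis function at each sample point and solve R·w = y so the surface passes exactly through the data. This is a simplification with an assumed fixed θ; real kriging estimates θ from the data and adds a regression/trend term:

```python
import math

def gauss_basis(x, xi, theta=4.0):
    # 1-D Gaussian basis exp(-theta * (x - xi)^2); theta is fixed here,
    # whereas kriging would estimate it while fitting the surface
    return math.exp(-theta * (x - xi) ** 2)

def solve(A, b):
    # Gaussian elimination with partial pivoting (adequate for 4 points)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# Four sample points, echoing the one-variable, 4-data-point example
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.5, 1.5, 3.0]   # illustrative responses
R = [[gauss_basis(a, b) for b in xs] for a in xs]
w = solve(R, ys)

def predict(x):
    return sum(wi * gauss_basis(x, xi) for wi, xi in zip(w, xs))
```

Because the weights solve the interpolation system exactly, `predict` reproduces every sample response, which is the defining property contrasted with "best fit" earlier.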
Example: Gaussian Stochastic Kriging
[Plot: GSK one-variable example with 4 data points]
Comparison of Different Methods
� The true function contains a reasonable amount of nonlinearity
� Using the same sampling strategy (30 points, LHS), compare the fits to the true function of different modeling methods:
� Polynomial Regression, RSM
� Polynomial Regression, stepwise
� MARS
� Kriging
Actual function: y = (30+x1*SIN(x1))*(4+EXP(-x2))
[Contour and 3-D surface plots of the actual function y over x1, x2 in [0, 5], with contour bands from y < 110 up to y > 150]
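The sampling setup used in the comparison can be sketched as follows; this is a basic Latin hypercube construction (one stratified draw per interval in each dimension), not the actual DOE software used on the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def latin_hypercube(n, d, lo, hi, rng):
    """n points in d dimensions: each column has exactly one draw per 1/n stratum."""
    strata = np.tile(np.arange(n), (d, 1))                  # stratum indices per dimension
    u = (rng.permuted(strata, axis=1).T + rng.random((n, d))) / n
    return lo + u * (hi - lo)

# 30-run LHS on [0,5]^2, then evaluate the slide's true function
X = latin_hypercube(30, 2, 0.0, 5.0, rng)
y = (30 + X[:, 0] * np.sin(X[:, 0])) * (4 + np.exp(-X[:, 1]))
```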
February 2009 N. Soderborg 69
Polynomial Regression: RSM Fit
Function y sampled with LH DOE, 30 runs: i.e., x1, x2 have 30 levels in [0,5]
y = 142.93 + 9.56x1 - 13.51x2 - 2.87x1² + 1.81x2²
Actual function: y = (30+x1*SIN(x1))*(4+EXP(-x2))
2D Response
Surface Method
(Minitab)
[Contour plots comparing the actual function y with the RSM prediction y1]
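To see how far the quadratic fit sits from the true surface, both can be evaluated on a grid; a quick sketch using the fitted coefficients above (the grid resolution is my choice):

```python
import numpy as np

def y_true(x1, x2):
    """The slide's actual function."""
    return (30 + x1 * np.sin(x1)) * (4 + np.exp(-x2))

def y_rsm(x1, x2):
    """The fitted RSM quadratic from the slide."""
    return 142.93 + 9.56 * x1 - 13.51 * x2 - 2.87 * x1**2 + 1.81 * x2**2

# Worst-case absolute error over a 51x51 grid on [0,5]^2: the quadratic
# cannot follow the x1*sin(x1) nonlinearity, so the error is substantial
g = np.linspace(0, 5, 51)
X1, X2 = np.meshgrid(g, g)
err = np.abs(y_true(X1, X2) - y_rsm(X1, X2))
print(err.max())
```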
February 2009 N. Soderborg 70
Polynomial Regression: Stepwise Fit
y = 144.9 + 14.35x1 - 5.3x1² + 0.326x1³ - 23.2x2 + 6.7x2² - 0.65x2³
Actual function: y = (30+x1*SIN(x1))*(4+EXP(-x2))
2D Stepwise
Regression Method
(Minitab)
[Contour plots comparing the actual function y with the stepwise prediction y2]
Function y sampled with LH DOE, 30 runs: i.e., x1, x2 have 30 levels in [0,5]
February 2009 N. Soderborg 71
MARS Fit
Function y sampled with LH DOE, 30 runs: i.e., x1, x2 have 30 levels in [0,5]
y = 134.05 - 11.88(x1 - 2.24)+ - 4.76(2.24 - x1)+ - 1.50(x2 - 1.55)+ + 13.74(1.55 - x2)+
Actual function: y = (30+x1*SIN(x1))*(4+EXP(-x2))
[Contour plots comparing the actual function y with the MARS prediction y3]
2D MARS
Prediction (Ford
Encore software)
February 2009 N. Soderborg 72
Gaussian Stochastic Kriging Fit
Function y sampled with LH DOE, 30 runs: i.e., x1, x2 have 30 levels in [0,5]
Actual function: y = (30+x1*SIN(x1))*(4+EXP(-x2))
2D GSK Prediction (Ford Encore software)
[Contour plots comparing the actual function y with the Kriging prediction]
February 2009 N. Soderborg 73
MARS/Kriging Comparison
� MARS Strengths
� Non-parametric: no assumption of underlying model required
� Well suited for high-dimensional problems; good for data mining
� Reasonably low computational demand
� MARS Limitations
� While often useful for understanding general trends, models sometimes do not accurately capture local behavior
� GSK Strengths
  � Interpolates data
� GSK Limitations
  � Relatively computationally demanding
  � Can over-fit data
74
Case Studies
February 2009 N. Soderborg 75
Case Study: Piston Slap
REFERENCE
SAE Paper: 2003-01-0148, “Robust Piston Design and Optimization Using Piston Secondary Motion Analysis”
Problem/Opportunity
� Piston slap is an unwanted engine noise that results from piston secondary motion
� A combination of transient forces and certain piston clearances can result in
� Lateral movement of the piston within the cylinder
� Rotation of the piston about the piston pin
� This can cause the piston to impact the cylinder wall at regular intervals
� The design team has developed a CAE model that predicts piston secondary motion, so this phenomenon can be explored analytically
Goals
� Achieve minimal piston friction and minimal piston noise simultaneously
� Reduce customer complaints
“Deep Dive”
February 2009 N. Soderborg 76
Case Study: Steering Wheel Nibble
REFERENCE
SAE Paper: 2005-01-1399, “Using Computer Aided Engineering to Find and Avoid the Steering Wheel ‘Nibble’ Failure Mode”
Problem/Opportunity
� Steering System is highly coupled:
  � Desire efficiency from steering wheel to road wheel
  � Desire inefficiency from road wheel to steering wheel
� Steering wheel nibble (undesired tangential oscillation between 10 and 20 Hz) is a potential failure mode, typically in autos with rack-and-pinion steering
� A result of a chassis system response to wheel-end excitations
� Excitations can result in up to 0.05 mm of steering rack displacement, which gets amplified into undesired steering wheel oscillations of up to 0.2°
Goals
� Use any and all approaches to avoid the Nibble Failure Mode
� Focus on making the designs less sensitive to Noise
“Deep Dive”
February 2009 N. Soderborg 77
Case Study: Side Impact Design Criteria
REFERENCE
SAE Paper: 2005-01-0291, “Model of IIHS Side Impact Torso Response Measures using Transfer Function Equations”
Problem/Opportunity
� IIHS Side Impact Evaluation
� New test mode – side impact
� Develop guidelines to specify minimum targets
� Improve program efficiency by providing vehicle content guidelines
� Currently a variety of vehicle specific solutions are being developed
� Measures to be balanced with existing Regulatory & Company requirements
Goals
� Develop transfer functions
� Develop design guidelines
“Deep Dive”
February 2009 N. Soderborg 78
Case Study: Hybrid Electric Vehicle Motor
Problem/Opportunity
� Ford’s Hybrid Electric Escape is the company’s first production Hybrid Electric Vehicle (HEV)
� The Power Split Transmission incorporates new technologies, including high power permanent magnet motors & torque control
Goals
� Ensure that the Escape’s electric motor meets targets based on comparable gas engine performance for
� Torque Accuracy
� Power Loss
� Traction Motor Noise, Vibration, Harshness
“Deep Dive”
February 2009 N. Soderborg 79
Bibliography (1 of 2)
� T. Davis, “Science, engineering, and statistics,” Applied Stochastic Models in
Business and Industry, Vol. 22, Issue 5-6, pp. 401-430, 2006.
� K.-T. Fang, R. Li, and A. Sudjianto, Design and Modeling for Computer Experiments, Chapman & Hall/CRC, New York, 2006.
� I. Farooq, J. Pinkerton, N. Soderborg, et al., “Model of IIHS Side Impact Torso Response Measures using Transfer Function Equations,” SAE World Congress, April 11-14, 2005, SAE-2005-01-0291.
� R. Hoffman, A. Sudjianto, X. Du, and J. Stout, “Robust Piston Design and Optimization Using Piston Secondary Motion Analysis,” SAE World Congress, March 3-6, 2003, SAE-2003-01-0148.
� J. Lee, et al., “An Approach to Robust Design Employing Computer Experiments,” Proceedings of DETC ’01, ASME Design Automation Conference, Sept 9-12, 2001, Pittsburgh, PA, DETC2001/DAC-21095.
� T. Santner, B. Williams, and W. Notz, The Design and Analysis of Computer Experiments, Springer Verlag, New York, 2003.
� T. Simpson, “Comparison of Response Surface and Kriging Models in the Multidisciplinary Design of an Aerospike Nozzle,” NASA/CR-1998-206935.
� N. Soderborg, “Challenges and Approaches to Design for Six Sigma in the Automotive Industry,” SAE World Congress, April 11-14, 2005, SAE-2005-01-1211.
February 2009 N. Soderborg 80
Bibliography (2 of 2)
� N. Soderborg, “Design for Six Sigma at Ford,” Six Sigma Forum Magazine,
November 2004, 15-22.
� N. Soderborg, “Applications and Challenges in Probabilistic and Robust Design Based on Computer Modeling,” Invited Talk, Proceedings of the American Statistical Association Section on Physical and Engineering Sciences, 1999 Spring Research Conference on Statistics in Industry and Technology, June 2, 1999, Minneapolis, MN, pp. 207-212.
� R. Thomas, N. Soderborg, and S. Borders, “Using CAE to Find and Avoid Failure Modes: A Steering Wheel ‘Nibble’ Case Study,” SAE World Congress, April 11-14, 2005, SAE-2005-01-1399.
� G. Wang and S. Shan, “Review of Metamodeling Techniques in Support of Engineering Design Optimization,” Transactions of the ASME, Vol. 129, Apr. 2007, pp. 370-380.
� S. Wang, Reliability & Robustness Engineering Using Computer Experiments (AR&R), 2000 Spring Research Conference on Statistics in Industry and Technology, Seattle, WA, June 26, 2000.
� G. Wiggs, “Design for Six Sigma (DFSS): The First 10 Years at GE,” SAE 2008, Application of Lean and Six Sigma for the Automotive Industry conference, Dec. 2-3, 2008.