Data-Based Methods in Process Monitoring and Control


  • 7/30/2019 Data-Based Methods in Process Monitoring and Control

    1/43

    DATA-BASED METHODS FOR
    PROCESS ANALYSIS,
    MONITORING AND CONTROL

    John F. MacGregor
    McMaster University
    Canada


    Overview

    System identification is an important area for data analysis in systems engineering
    But there are other equally important areas
    In particular, how can we use the historical databases that are collected routinely by process computers?
    This presentation looks at many different aspects of this problem:
      Difficult nature of historical data
      Latent variable methods
      Problems and industrial applications


    Nature of Historical Process Data

    Very high dimensional
      Hundreds to thousands of variables measured every few seconds for years
    Non-causal
      Not the result of designed experiments
      Identifying the causal effect of one variable on another is not generally possible
    Non-full rank
      Variables are highly correlated with one another
      Statistical rank is very low
      Rank is independent of the number of variables measured
      Depends on the number of independent sources of variation occurring in the process
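    The low statistical rank of correlated process data can be illustrated with a small synthetic sketch. Everything here is assumed for illustration (the sizes n, k, a, the noise level and the 99% variance threshold are not from the slides): 50 measured variables are generated from only 3 independent sources of variation, and the singular values show that 3 components carry nearly all the variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, a = 500, 50, 3                       # observations, measured variables, independent sources (hypothetical)
Z = rng.normal(size=(n, a))                # independent sources of variation
A = rng.normal(size=(a, k))                # how each source appears in each measurement
X = Z @ A + 0.05 * rng.normal(size=(n, k)) # measurements = latent part + small noise

Xc = X - X.mean(axis=0)                    # mean-center each variable
s = np.linalg.svd(Xc, compute_uv=False)    # singular values
explained = np.cumsum(s**2) / np.sum(s**2) # cumulative fraction of variance

# Number of components needed to explain 99% of the variance
n_components_99 = int(np.searchsorted(explained, 0.99) + 1)
print(n_components_99, "components explain 99% of the variance of", k, "variables")
```

    Adding more measured variables to this sketch changes k but not the number of dominant components, which is set by the number of sources.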


    Nature of the Data

    Missing data
      10-20% missing is common
      Analysis methods must be able to trivially handle this
    Low signal-to-noise ratio
      Little information in any one variable
      Need multivariate methods to extract the information from all the variables
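    One way a latent variable model can cope with missing measurements is to estimate the scores of an observation by least squares using only the entries that were actually measured. The sketch below is an illustration under stated assumptions, not the method used in the talk: the data are synthetic, the model is a plain PCA, and the function name scores_with_missing is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, a = 300, 20, 2                              # hypothetical sizes
Z = rng.normal(size=(n, a))
X = Z @ rng.normal(size=(a, k)) + 0.01 * rng.normal(size=(n, k))
mean = X.mean(axis=0)

# PCA loadings P (k x a) from the SVD of the centered data
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:a].T

def scores_with_missing(x, P, mean):
    """Least-squares score estimate from the measured entries of x (NaN marks missing)."""
    obs = ~np.isnan(x)
    t, *_ = np.linalg.lstsq(P[obs], x[obs] - mean[obs], rcond=None)
    return t

x_new = X[0].copy()
t_full = (x_new - mean) @ P                       # scores with all variables present
missing = rng.choice(k, size=4, replace=False)    # drop 20% of the variables
x_new[missing] = np.nan

t_est = scores_with_missing(x_new, P, mean)       # scores from the remaining 80%
x_filled = mean + t_est @ P.T                     # model-based reconstruction of every variable
print(t_full, t_est)
```

    Because the variables are highly correlated, the scores estimated from the remaining measurements stay close to those obtained from the complete observation.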


    Concept of Latent Variables

    Measurements on k variables: $x = [x_1, x_2, \ldots, x_k]$
    Process is actually driven by a small set of a independent latent variables:
    $z = [z_1, z_2, \ldots, z_a]$ ($a \ll k$)


    Latent Variable Regression Models

    Data matrices: X ($n \times k$), Y ($n \times m$)
      $X = T P^T + E$
      $Y = T Q^T + F$
    where $T = X W$ is the ($n \times a$) matrix of LV scores

    Note:
      Symmetric in X and Y
        Both are functions of the LVs
        No assumption of a causal direction
        Both measured with error
      X and Y are decided by the objectives and by what will be available in the future
      Model for the X space as well as Y (very key point)

    Prediction:
      $\hat{Y} = T Q^T = X W Q^T = X B$
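    The relations $X = T P^T + E$, $Y = T Q^T + F$, $T = X W$ and $\hat{Y} = X B$ can be made concrete with a short NIPALS-style PLS sketch. This is a minimal illustration on synthetic data, not the implementation behind the talk; the function name pls_nipals and all sizes are assumptions. The matrix written W on the slide corresponds here to W_star, the weights expressed in terms of the original (undeflated) X.

```python
import numpy as np

def pls_nipals(X, Y, n_lv, tol=1e-12, max_iter=500):
    """NIPALS PLS on centered X (n x k) and Y (n x m). Returns W_star, P, Q."""
    X, Y = X.copy(), Y.copy()
    k, m = X.shape[1], Y.shape[1]
    W, P, Q = np.zeros((k, n_lv)), np.zeros((k, n_lv)), np.zeros((m, n_lv))
    for i in range(n_lv):
        u = Y[:, 0].copy()                      # starting guess for the Y scores
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)              # X weights (unit length)
            t = X @ w                           # X scores
            q = Y.T @ t / (t @ t)               # Y loadings
            u_new = Y @ q / (q @ q)             # updated Y scores
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = X.T @ t / (t @ t)                   # X loadings
        X -= np.outer(t, p)                     # deflate X
        Y -= np.outer(t, q)                     # deflate Y
        W[:, i], P[:, i], Q[:, i] = w, p, q
    W_star = W @ np.linalg.inv(P.T @ W)         # so that T = X @ W_star on the original X
    return W_star, P, Q

rng = np.random.default_rng(2)
n, k, m, a = 200, 10, 2, 2                      # hypothetical sizes
Z = rng.normal(size=(n, a))                     # latent variables driving both X and Y
X = Z @ rng.normal(size=(a, k)) + 0.05 * rng.normal(size=(n, k))
Y = Z @ rng.normal(size=(a, m)) + 0.05 * rng.normal(size=(n, m))
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)

W_star, P, Q = pls_nipals(Xc, Yc, a)
T = Xc @ W_star                                 # LV scores
B = W_star @ Q.T                                # regression coefficients
Y_hat = Xc @ B                                  # same as T @ Q.T
r2 = 1 - np.sum((Yc - Y_hat) ** 2) / np.sum(Yc ** 2)
print("R2 on training data:", round(r2, 3))
```

    The same fit also returns P, the model of the X space, which is what the monitoring and missing-data uses rely on.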


    Latent variable model

    Operating space summarized by a few orthogonal LVs: t1, t2, ...
    The distance of an observation $x_i$ from this space is given by
      $SPE_i = \sum_{j=1}^{K} (x_{ij} - \hat{x}_{ij})^2$


    Estimation of Latent Variable Models

    Many approaches, with different objectives:
      Principal Component Analysis (PCA/SVD) and Principal Component Regression (PCR)
        Max. variance components in X space
      PLS (Projection to Latent Structures)
        Max. covariance
      Reduced Rank Regression (RRR)
        Max. variance of y explained by correlation with X
      Canonical Correlation Analysis (CCA)
        Max. correlation
      Max. Likelihood Methods


    Discussion of LV Methods

    All estimation methods provide a set of orthogonal LVs
    Only PCR, PLS and ML provide a good model for the X-space
      X-space model is the most important part of the model in many applications
    Why do we need a model for the X space?
      In identification, X is full rank by design of experiments
      With process data, X is of very low rank ($a \ll k$)


    Areas of Industrial Application

    Analysis of industrial databases
    Process monitoring and FDI
    Soft sensors / inferential models
    Extracting information from multivariate sensors
    System identification
    Process control in reduced dimensional LV spaces
    Many other interesting areas


    Analysis of Process Databases
    (Troubleshooting process problems)

    Currently a major area of application of these LV methods in industry
    A major justification for every computer system was to collect data for process improvement!
    But little has been done with these databases: data graveyards!
      Massive data sets, missing data, outliers, extreme correlation among variables, non-causal nature of data, data compression algorithms, etc.
    Latent variable models are ideal for analyzing these data
    Two common analysis problems:
      Retrospective analysis using different time scales
        Weekly averages, hourly averages, minute data, second data, ...
      Short-term troubleshooting for immediate problems
        Build local models to detect and diagnose problems


    Tools for Analysis of Process Data

    LV score plots (e.g. t1 vs t2) show the important process behavior in the LV space
    Loading plots (w1, w2) allow interpretation of general movements in the scores ($t_i = X w_i$)
    Contribution plots show the contribution of each variable to local changes in the scores and SPE
      Contribution of $x_j$ to $t_i$ = $x_j w_{ij}$
      Contribution of $x_j$ to $SPE_i$ = $(x_{ij} - \hat{x}_{ij})$
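    A contribution calculation can be sketched as follows. This is an illustration under assumptions rather than the talk's own code: the data are synthetic, a PCA model stands in for the PLS weights (for PCA the weights and loadings coincide), the mixing matrix A and the fault on variable index 5 are hypothetical, and squared residuals are used for the SPE contributions so that they add up to SPE.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, a = 400, 8, 2                                  # hypothetical sizes
A = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
              [1, -1, 1, -1, 1, -1, 1, -1]], dtype=float)  # assumed mixing of 2 latent sources
X = rng.normal(size=(n, a)) @ A + 0.1 * rng.normal(size=(n, k))
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
W = Vt[:a].T                                         # (k x a) weights of the LV model

x_fault = X[10] - mean                               # one centered observation
x_fault[5] += 3.0                                    # hypothetical fault on variable index 5

t = x_fault @ W                                      # scores of the faulty observation
score_contrib = x_fault[:, None] * W                 # entry (j, i): contribution x_j * w_ji to score t_i

residual = x_fault - t @ W.T                         # x_j - x_hat_j for each variable
spe = float(residual @ residual)
spe_contrib = residual ** 2                          # squared residuals sum to SPE

worst = int(np.argmax(spe_contrib))
print("largest SPE contribution from variable index", worst)
```

    Plotting spe_contrib as a bar chart gives the kind of contribution plot used to point at the variables responsible for an alarm.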


    Example: Industrial batch fermentation process

    Nature of batch data
    [Figure: structure of batch data. Initial conditions Z, variable trajectories X and end properties Y; axes: batches, variables, time]
    More than 300,000 observations in the data set


    Nature of the process trajectory data (X)

    Trajectories for some variables during one batch


    [Figure: score plot, t1 vs t2]
    PLS model has only 2 significant components
    Each batch summarized by 2 LV scores (t1, t2)
    Good separation of batches. Good batches have high $t_1 = X w_1$


    Interpretation using PLS loading plot for w1

    Each variable has 350 loading weights associated with the 350 time intervals of the batch
    Good batches have high x1 and x3 during the last 2/3 of the batch, and low x4 values
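    Having one loading weight per variable per time interval is what results from unfolding the three-way trajectory array so that each batch becomes a single row. A minimal sketch, assuming batch-wise unfolding and using random numbers in place of real trajectories (the number of batches and variables is hypothetical; only the 350 time intervals come from the example):

```python
import numpy as np

rng = np.random.default_rng(6)
n_batches, n_vars, n_times = 30, 5, 350          # 350 time intervals; other sizes are hypothetical
X3 = rng.normal(size=(n_batches, n_vars, n_times))  # batches x variables x time

# Batch-wise unfolding: one row per batch, columns ordered variable by variable
X = X3.reshape(n_batches, n_vars * n_times)

# Centering each column removes the average trajectory; scaling gives unit variance
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def variable_block(row_vector, v):
    """Slice the 350 entries (e.g. loading weights) that belong to variable v."""
    return row_vector[v * n_times:(v + 1) * n_times]

print(X.shape)                                   # (30, 1750)
print(variable_block(X[0], 2).shape)             # (350,)
```

    Applied to a weight vector w1 of length n_vars * n_times, variable_block recovers the time profile of weights for one variable, which is what the loading plot displays.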


    Process Monitoring and FDI

    Build a new PLS model from historical data with only acceptable operation
    Any deviation from this model will reveal unacceptable behavior
    Statistics to plot:
      Hotelling's $T^2$: $T^2 = \sum_{l=1}^{a} t_l^2 / s_l^2$
      Residual SPE: $SPE_i = \sum_{j=1}^{k} (x_{ij} - \hat{x}_{ij})^2$
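    The two monitoring statistics can be computed as in the sketch below. It is an illustration only: synthetic data and a PCA model are used in place of the PLS model, the mixing matrix A is hypothetical, and the control limits are taken as empirical 99th percentiles of the training data, which is an assumption of this sketch rather than something stated on the slide.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, a = 500, 8, 2                                   # hypothetical sizes
A = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
              [1, -1, 1, -1, 1, -1, 1, -1]], dtype=float)
X = rng.normal(size=(n, a)) @ A + 0.1 * rng.normal(size=(n, k))  # "acceptable operation" data

mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:a].T                                          # loadings of the LV model
s2 = ((X - mean) @ P).var(axis=0, ddof=1)             # variance of each score, s_l^2

def monitor(x):
    """Return (T2, SPE) for one observation x."""
    xc = x - mean
    t = xc @ P
    t2 = float(np.sum(t ** 2 / s2))                   # distance within the model plane
    e = xc - t @ P.T
    return t2, float(e @ e)                           # squared distance off the plane

train_stats = np.array([monitor(x) for x in X])
t2_lim, spe_lim = np.percentile(train_stats, 99, axis=0)  # empirical limits (assumed)

# Case A: unusually large move along a normal direction of variation
t2_a, spe_a = monitor(mean + 5.0 * A[0])
# Case B: one variable breaks the usual correlation structure
x_b = mean.copy()
x_b[5] += 2.0
t2_b, spe_b = monitor(x_b)

print("A: T2 alarm", t2_a > t2_lim, "| B: SPE alarm", spe_b > spe_lim)
```

    T2 flags observations that are extreme within the model plane, while SPE flags observations that no longer follow the correlation structure of the model.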


    Monitoring Plots: Hotelling's $T^2$ and SPE

    Monitoring of new batch #73
    [Figure: T2 plot and SPE plot]


    Contribution plots to diagnose the problem

    Problem: Variable x6 diverged above its nominal trajectory at time 277


    Soft sensors / Inferential Models

    Soft sensors built from process data using regression, ANNs, PLS
    Advantage of PLS models when:
      Large number of highly correlated measurements
      Missing data
      Occasional outliers in the X measurements
    Adaptive PLS and nonlinear PLS often used
    Key point in building inferential models is the nature of the data used


    Ex. Soft Sensor using large data sets
    Boiler performance prediction from turbulent flame images


    Problems

    Boiler fed with a time-varying mixture of waste hydrocarbon streams and natural gas
      Energy content of the waste stream varies considerably
      Want to estimate energy content of the waste stream in real time
      Want to estimate the steam generation rate
    Pollutant concentrations in the off-gas vary widely due to changing feeds
      Want to monitor pollutants in real time (NOx, SO2)


    Flame images highly variable

    [Figure: sequence of flame images over time]


    Multi-way PCA and PLS to extract Information from Flame Images

    Large 3-dimensional image arrays obtained every second
    Multi-way PCA
      Obtain very stable LV score plots of the highly variable flame images
      Averaging/filtering done in score space
    Extract feature information from the PCA score space
    Relate features to boiler performance via PLS
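    The idea of unfolding an image array and working in the PCA score space can be sketched on a synthetic image. All of this is assumed for illustration: the image is random background plus one bright patch standing in for a flame, and a simple threshold on the first score replaces the mask that would be drawn on the t1-t2 score plot.

```python
import numpy as np

rng = np.random.default_rng(5)
h, w = 40, 60
image = rng.uniform(0.0, 0.1, size=(h, w, 3))        # dark, noisy background (synthetic stand-in)
image[10:30, 20:40] += np.array([0.8, 0.5, 0.1])     # bright patch playing the role of the flame

# Unfold the 3-way array: every pixel becomes a row, every colour channel a column
pixels = image.reshape(-1, 3)
mean = pixels.mean(axis=0)
_, _, Vt = np.linalg.svd(pixels - mean, full_matrices=False)
scores = (pixels - mean) @ Vt[:2].T                  # (t1, t2) for each pixel

t1 = scores[:, 0]
if t1[np.argmax(pixels.sum(axis=1))] < 0:            # fix the arbitrary sign so bright pixels score high
    t1 = -t1

# Mask in score space: keep pixels whose t1 lies in the bright region (threshold is an assumption)
mask = (t1 > 0.5 * t1.max()).reshape(h, w)
area = int(mask.sum())                               # one simple feature: size of the masked region
print("masked area in pixels:", area)
```

    Features such as the masked area or the mean scores inside the mask could then be regressed against boiler performance measurements with PLS.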


    Feature extraction:
    Example: Extraction of flame luminous region

    [Figure: (a) one sample image; (b) score plot and mask; (c) the flame region decided by the mask]


    Comparison of predicted and measured steam flow rates

    [Figure: predicted and measured steam flow rate (kp/hr) versus time; (a) Case I, (b) Case II]


    NOx concentrations in off-gas

    [Figure: prediction (ppm) versus observation (ppm) for the training set and the test set]


    Extracting Information from Novel Sensors

    Revolution in new micro/molecular sensors
    More use of fiber optics spectrometers, imaging, acoustical, etc. sensors
    Characteristics:
      Massive amounts of non-specific data
      Robust
      Inexpensive
    Greatly enhance possibilities for control
    Problem is extracting the information from the large number of highly correlated measurements at each time


    On-line Monitoring and Feedback Control of Snack Food Quality using Digital Imaging

    [Figure: process schematic. Unseasoned product, seasoning, tumbler, conveyor belt, camera, lighting, computer, lab analysis]


    PCA score plot histograms of RGB images

    [Figure: score plot histograms for non-seasoned, low-seasoned and high-seasoned product; on-line image, product image, background image]


    [Figure: image analysis flow diagram. Product image, product mask, seasoning level mask, cumulative histogram, PLS model, predicted seasoning level (model-predicted value vs lab-analysis value, training set and test set); model applied to each small window image gives predicted seasoning variance and seasoning distribution; visual inspection, monitoring and control]


    Prediction Results: seasoning content

    [Figure: model-predicted value versus lab-analysis value for the training set and the test set]


    Closed-loop control of seasoning content and seasoning distribution from digital camera

    [Figure: closed-loop responses. Non-seasoned product weight, predicted seasoning level, seasoning feeder speed, seasoning bias (manipulated variable), seasoning level set point]


    Subspace Identification Methods

    All subspace methods are based on variants of different LV modeling methods
      N4SID algorithms: variants of RRR
      CVA algorithms: CCA
    Both of these involve LV methods that do not model the X-space (no need in this case)
    States are the LVs


    Process Control in Reduced Dimensional LV Spaces

    Control in the low dimensional LV space is useful when:
      CV and/or MV spaces are high dimensional and non-full rank
    Examples where CV space is of low rank:
      Spatial control of sheet and film processes
      Control of distributed properties (MWD, PSD)
    Example where MV space is of low rank:
      MVs are trajectories in batch processes


    Control of MW & Amine Ends in Batch Nylon Polymerization

    Full MV trajectories to be recomputed at several decision times during the batch
      Very high dimensional, elements of trajectories highly correlated (low rank)
    [Figure: manipulated variable trajectories, reactor pressure and jacket pressure versus time (min), with decision points 1, 2, 3]


    Control via MV trajectory manipulation

    Want new MV trajectories at every decision time (i)
      Very high dimensional MV vectors
    But trajectories must respect past operating policies and constraints
      Must remain in the reduced LV space of the model
    Control in the LV space of the PLS model
      From the optimized values of the LVs (t1, t2), compute the entire remaining MV trajectories
      Uses the LV model of the X-space from PLS


    Identification and Control Strategy

    Identification: PLS model using process variable and MV trajectory data from past batch operation, plus a few batches with designed experiments at the control points
    Prediction:
      At each decision period, predict final quality using the PLS model
      Problem: we don't have the trajectory data for the rest of the batch!
      Must use the PLS model of the X-space to impute the process variable trajectories for the remaining part of the batch (missing data)


    Identification and Control Strategy

    Control:
      $\min_{\Delta t} \; (\hat{y} - y_{sp})^T Q_1 (\hat{y} - y_{sp}) + \Delta t^T Q_2 \, \Delta t + \lambda T^2$
      s.t.
        $\hat{y}^T = (\Delta t + t_{present})^T Q^T$
        $T^2 = \sum_{a=1}^{A} (\Delta t + t_{present})_a^2 / s_a^2$
        $\Delta t_{min} \le \Delta t \le \Delta t_{max}$
    Trajectory reconstruction of the full MV trajectories using the X-space model from PLS:
      $x_2^T = (t^T - x_1^T W_1)(P_2^T W_2)^{-1} P_2^T$
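    The trajectory reconstruction step, $x_2^T = (t^T - x_1^T W_1)(P_2^T W_2)^{-1} P_2^T$, can be checked numerically. The sketch below only verifies the algebra: W and P are random matrices standing in for a fitted PLS model, and the sizes, the known part x1 and the target scores are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
k1, k2, a = 6, 9, 2                     # known entries, entries still to be chosen, number of LVs
W = rng.normal(size=(k1 + k2, a))       # stand-in for the PLS weights (not a fitted model)
P = rng.normal(size=(k1 + k2, a))       # stand-in for the PLS X loadings
W1, W2 = W[:k1], W[k1:]                 # rows for the known part and for the remaining part
P2 = P[k1:]

x1 = rng.normal(size=k1)                # measurements and MVs already implemented
t_target = np.array([1.5, -0.5])        # scores requested by the controller (hypothetical)

# x2^T = (t^T - x1^T W1) (P2^T W2)^{-1} P2^T, written with solve() instead of an explicit inverse
rhs = t_target - x1 @ W1
x2 = np.linalg.solve((P2.T @ W2).T, rhs) @ P2.T

# Check: the completed observation [x1, x2] projects exactly onto the requested scores
t_check = np.concatenate([x1, x2]) @ W
print(np.round(t_check, 6))
```

    Because x2 is built as a combination of the columns of P2, the reconstructed part of the trajectory keeps the correlation structure of the loadings while delivering the requested scores.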


    Control of batch trajectories

    PLS model only 2 dimensional: calculate 2 LVs at each decision point
    MV trajectories then re-computed from them using the PLS model
    [Figure: manipulated variable trajectories, reactor pressure and jacket pressure versus time (min), with decision points; curves for nominal condition, -10% in W and +10% in W]


    SUMMARY

    Presented an overview of data-based methods for process analysis, monitoring and control.
    Latent variable models provide the basis for treating these subspace problems
    They naturally handle
      High dimensionality, extreme correlation and reduced rank
      Missing data and outliers
    They provide models for the X space
    Have presented a few areas of application
      Analysis of databases / troubleshooting
      Process monitoring / FDI
      Soft sensors and control from digital images
      Control in reduced dimensional spaces
      Many other areas


    Acknowledgements

    All my excellent graduate students who have contributed to this research
    In particular to
      Honglu Yu
      Salvador Garcia
      Jesus Flores
