Lab for System Informatics and Data Analytics (SIDA) Industrial Big Data Analytics for Quality Improvement in Complex Systems Department of Industrial and Systems Engineering University of Wisconsin-Madison Dr. Kaibo Liu 1


  • Lab for System Informatics and Data Analytics (SIDA)

    Industrial Big Data Analytics for Quality

    Improvement in Complex Systems

    Department of Industrial and Systems Engineering

    University of Wisconsin-Madison

    Dr. Kaibo Liu

    1

  • Lab for System Informatics and Data Analytics (SIDA)

    Background

    • A.P. 2013-present, Department of Industrial and Systems Engineering, UW-Madison

    • Ph.D. 2013, Industrial Engineering (Minor: Machine Learning), Georgia Institute of Technology

    • M.S. 2011, Statistics, Georgia Institute of Technology

    • B.S. 2009, Industrial Engineering and Engineering Management, Hong Kong University of Science and Technology, Hong Kong

    2

  • Lab for System Informatics and Data Analytics (SIDA)

    My Research & Expertise

    Research Interests Expertise

    System Informatics and data analytics:

    • Complex system modeling and performance assessment

    • Data fusion for online process monitoring, diagnosis and prognostics

    • Statistical learning, data mining, and decision making

    Multi-disciplinary Research

    • Spatiotemporal Field Modeling and Prediction

    • Sensor Measurement and Monitoring Strategy

    • System Degradation Analysis and Prognostics

    Multidisciplinary approach: Engineering, Statistics/Machine Learning, Operations Research/Control

    Overall, my research goal is to make sense of big data for better decision making!

  • Lab for System Informatics and Data Analytics (SIDA) 4

    Sensor Measurement and Monitoring Strategy

  • Lab for System Informatics and Data Analytics (SIDA) 5

    Objective-oriented sensor system designs in complex systems

    Objective

    • Obtain an optimal sensor allocation design at minimum cost under different user-specified quality requirements

    Results Summary

    • Ensure customer satisfaction by optimally designing the sensor allocation strategy

    • The average cycle time, cost and inventory level can be greatly reduced

    • Algorithms have been tested in several applications, e.g., the hot forming and the cap alignment processes

    • Supported several students

    Effectively search for optimal sensor system design solutions

    Approaches

    • Best Allocation Subsets by Intelligent Search (BASIS), an algorithm that intelligently searches for the optimal sensor allocation solution

    Features

    • Consider the trade-off among detection speed, fault diagnosis accuracy, and cost savings
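    To make the objective on this slide concrete, the following is a minimal sketch of minimum-cost sensor allocation under a detection requirement. It is a plain exhaustive search, not the BASIS algorithm, and it assumes the quality requirement can be expressed as a set of faults that must be detectable. The sensor names, costs, and fault coverage sets are hypothetical.

```python
from itertools import combinations


def min_cost_allocation(costs, detects, required_faults):
    """Exhaustively find the cheapest sensor subset covering all required faults.

    costs:            dict sensor -> cost (hypothetical values)
    detects:          dict sensor -> set of faults that sensor can detect
    required_faults:  set of faults the design must be able to detect
    Returns (total_cost, sensors) or None if no subset meets the requirement.
    """
    sensors = list(costs)
    best = None
    for k in range(1, len(sensors) + 1):
        for subset in combinations(sensors, k):
            covered = set().union(*(detects[s] for s in subset))
            if required_faults <= covered:
                cost = sum(costs[s] for s in subset)
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


# Hypothetical example: four candidate sensors, three faults to cover.
costs = {"s1": 4, "s2": 2, "s3": 3, "s4": 1}
detects = {
    "s1": {"f1", "f2", "f3"},
    "s2": {"f1"},
    "s3": {"f2", "f3"},
    "s4": {"f3"},
}
print(min_cost_allocation(costs, detects, {"f1", "f2", "f3"}))
```

    Exhaustive search grows exponentially with the number of candidate sensors, which is one reason an intelligent search strategy is attractive for realistic systems.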

  • Lab for System Informatics and Data Analytics (SIDA) 6

    Causation-based monitoring, diagnosis and control

    Objective

    • Transform existing correlation-based techniques into a new causation-based quality control paradigm to achieve effective online quality monitoring and inference, root cause diagnosis, and proactive process control

    Approaches

    Features

    • Engineering knowledge enhanced causal modeling

    • Causation-based online quality monitoring, inference, and diagnosis

    • Causation-based online feed-forward and feed-back process control

    Results Summary

    • Establish a series of causation-based monitoring, diagnosis and control techniques for quality improvement in complex systems

    • Algorithms have been tested in the hot forming, the cap alignment, and the rolling processes

    • Supported several students

    Improved efficiency, yield, and quality

  • Lab for System Informatics and Data Analytics (SIDA) 7

    Online monitoring of Big Data Streams

    Objective

    • Create a new paradigm of dynamic data-driven modeling, sampling and monitoring schemes for Big Data Streams (e.g., video streams)

    Approaches

    • A self-updated statistical model to fully characterize the changing background

    • A dynamic, data-driven sampling strategy subject to practical resource constraints

    • A scalable and robust statistical process control method tailored for Big Data Streams

    Features

    • Scalability: linear complexity that ensures practical implementation

    • Adaptability: automatically localize the anomaly regions without any prior knowledge

    Results Summary

    • Establish a series of real-time monitoring methodologies tailored for Big Data Streams for quick anomaly detection (either cyber or physical) and localization

    • Algorithms have been tested in various applications, e.g., diaper manufacturing, climate monitoring and solar flare detection

    • Supported several students

    Figure: Examples of thermal profiles on the polishing pad during CMP process under different conditions

    Maximize the detection capability with practical resource constraints

  • Lab for System Informatics and Data Analytics (SIDA)

    Dynamic Data-Driven Modeling, Sampling and Monitoring for Real-Time Solar Flare Detection

    Figure: DDDAS framework linking (a) applications (original solar image), (b) applications modeling (updated solar image), (c) application measurement systems and methods (dynamic sampling), and (d) mathematical and statistical algorithms (SPC chart), in a loop of sample data, update model, update SPC, update sampling.

    • A dynamically updated spatial-temporal statistical model to fully characterize the changing background

    • A dynamic sampling algorithm that actively decides which data streams to observe given the resource constraints

    • A scalable and robust SPC to effectively combine the information from significant data streams to produce an overall global monitoring system

  • Lab for System Informatics and Data Analytics (SIDA) 9

    Sensor Measurement and Monitoring Strategy

    • Objective-Oriented Optimal Sensor Allocation Strategy: determine the minimum number of sensors needed given user-specified requirements

    • Adaptive Sensor Allocation Strategy: adaptively adjust sensor allocation in a Bayesian Network to enhance monitoring and diagnosis

    • A Top-r based Adaptive Sampling Strategy: monitor normally distributed big data streams online in the context of limited resources

    • A Nonparametric Adaptive Sampling Strategy: monitor non-normal big data streams online in the context of limited resources

    • Effective Online Data Monitoring and Saving Strategy: intelligently select and record the most informative extreme values in the simulation data

    • A Spatial Adaptive Sampling Procedure: leverage the spatial information and adaptively integrate two seemingly contradictory ideas (wide and deep searches)

    • A Rank-based Sampling Algorithm by Data Augmentation: automatically augment information for unobservable variables based on the online observations
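    As an illustration of monitoring under limited resources, here is a simplified sketch of a top-r style adaptive sampling loop. It is not the lab's published algorithm: it assumes standardized, normally distributed streams, a one-sided CUSUM-type local statistic, and a fixed compensation `delta` added to unobserved streams so that they are eventually revisited. The function name and all parameter values are hypothetical.

```python
import numpy as np


def top_r_monitor(streams, r, shift=1.0, delta=0.1, threshold=20.0):
    """Monitor p streams while observing only r of them at each time step.

    streams: array of shape (T, p); only the entries of the currently
             observed streams are used at each step.
    Returns (alarm_time, top_streams); alarm_time is None if no alarm is raised.
    """
    T, p = streams.shape
    stats = np.zeros(p)        # local monitoring statistic per stream
    observed = np.arange(r)    # assumed initial layout: the first r streams
    for t in range(T):
        mask = np.zeros(p, dtype=bool)
        mask[observed] = True
        # Observed streams: CUSUM update for an upward mean shift of size `shift`.
        llr = shift * streams[t] - shift ** 2 / 2
        # Unobserved streams: add a small compensation instead of data.
        stats = np.where(mask, np.maximum(stats + llr, 0.0), stats + delta)
        top = np.argsort(stats)[-r:]
        # Global statistic: sum of the r largest local statistics.
        if stats[top].sum() > threshold:
            return t, top
        observed = top          # next step, observe the most suspicious streams
    return None, observed


# Synthetic demonstration: stream 7 shifts upward at t = 100.
rng = np.random.default_rng(42)
data = rng.standard_normal((300, 20))
data[100:, 7] += 3.0
alarm_time, suspects = top_r_monitor(data, r=3)
print(alarm_time, suspects)
```

    The per-step cost is dominated by the update and ranking of p local statistics, and the streams in the top-r set at the alarm time give a natural estimate of where the anomaly is located.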

  • Lab for System Informatics and Data Analytics (SIDA) 10

    System Degradation Modeling and Prognostics

  • Lab for System Informatics and Data Analytics (SIDA) 11

    Internet of Things-enabled Condition-based Monitoring, Diagnosis, and Prognostics

    Objective

    • Leverage condition monitoring signals collected from multiple and heterogeneous sensors to better visualize and assess the current system health status and predict its future behavior in real time

    Approaches

    • Novel data fusion methods that select the best sensors and combine their information to construct health indices for system performance assessment and visualization, $h_{i,t} = f(\boldsymbol{x}_{i,\cdot,t})$

    Features

    • Combine data-driven approaches and engineering principles governing the underlying failure mechanism to ensure satisfactory performance

    Results Summary

    • Establish a series of data fusion methodologies tailored for IoT-enabled service systems for health status visualization, characterization and prediction

    • Algorithms have been tested in various applications, e.g., engine health monitoring, Alzheimer's disease and forklift management

    • Supported several students

    Figure: Aircraft engine diagram

    Better health status characterization

    Better fault diagnosis

    Better RUL prediction

  • Lab for System Informatics and Data Analytics (SIDA)

    Case Study – Engine RUL prediction

    Name    T24    T50    P30    Nf     Ps30   phi    NRf    BPR    htBleed  W31    W32
    Value   0.13   0.37   -0.03  -0.05  0.23   -0.21  -0.08  0.16   0.12     -0.05  -0.16

    • Optimal weights $\boldsymbol{w}^*$: $h_i(t) = \boldsymbol{L}_i(t)\,\boldsymbol{w}^*$

    Figure: sensor signals T24, …, W32 are fused into a health index; real-time sensor information, the stochastic degradation models (Gebraeel, 2006), and Bayesian updating methods yield the remaining life prediction.

    • The developed HI-QL improved the RUL prediction accuracy

    o by 64.83% compared with the best single sensor

    o by 20.7% compared with existing HI-based models
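    The linear fusion step $h_i(t) = \boldsymbol{L}_i(t)\,\boldsymbol{w}^*$ can be sketched in a few lines. This sketch assumes that the Name/Value table on this slide lists the entries of $\boldsymbol{w}^*$ in sensor order and that each row of the signal matrix holds the 11 sensor readings of one unit at one time point. The readings below are synthetic placeholders, not engine data, and any preprocessing of the signals is omitted.

```python
import numpy as np

SENSORS = ["T24", "T50", "P30", "Nf", "Ps30", "phi",
           "NRf", "BPR", "htBleed", "W31", "W32"]
# Weights copied from the slide's table (assumed to be w* in SENSORS order).
WEIGHTS = np.array([0.13, 0.37, -0.03, -0.05, 0.23, -0.21,
                    -0.08, 0.16, 0.12, -0.05, -0.16])


def health_index(signals, weights=WEIGHTS):
    """Fuse multi-sensor signals into a one-dimensional health index.

    signals: array of shape (T, 11), one row of sensor readings per time point.
    Returns an array of shape (T,) with h(t) = L(t) @ w.
    """
    signals = np.asarray(signals, dtype=float)
    if signals.shape[-1] != weights.shape[0]:
        raise ValueError("expected one column per sensor")
    return signals @ weights


# Synthetic placeholder readings for one unit over 5 time points.
rng = np.random.default_rng(0)
L_i = rng.standard_normal((5, len(SENSORS)))
print(health_index(L_i))
```

    The resulting one-dimensional index is what a degradation model would then be fitted to for remaining useful life prediction; that step is not shown here.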

  • Lab for System Informatics and Data Analytics (SIDA) 13

    System Degradation Modeling and Prognostics

    • Non-parametric data fusion model: does not need to know the parametric form of the degradation signal

    • Semi-parametric data fusion model: integrate degradation modeling and prognostics in a unified manner

    • SNR-based data fusion model: immune to the heterogeneous sensor challenges in terms of signal scales and measurement units

    • Quantile regression-based data fusion model: recover the underlying degradation status, with estimated fusion coefficients converging to the true values

    • Sensory-based Failure Threshold Estimation: update the failure threshold estimate of the in-field unit online

    • Kernel-trick for nonlinear data fusion model

    • Generic data fusion model with automatic sensor selection

    • Data fusion model for multiple failure modes

    • Data fusion model when there are multiple environmental conditions

    • Generic data fusion model when multisensor signals are asynchronous

    • Dynamic control of degradation speed and RLD via workload adjustment

  • Lab for System Informatics and Data Analytics (SIDA)

    Smart Monitoring of Alzheimer’s Disease via Data Fusion, Personalized Prognostics, and Selective Sensing

    Figure: The model of AD trajectory [3]

    Effectiveness of existing screening approaches and the new methodology:

    • Biomarkers (existing): expensive, e.g., $5000 per scan for PiB-PET

    • Screening tests (existing): passive information collection: burden, and complexity

    • Smart monitoring (new): proactive information collection driven by accurate statistical models

    Figure: Proposed Smart Monitoring Method

  • Lab for System Informatics and Data Analytics (SIDA)

    Data-Driven Failure Predictive Analytics for Internet of Things (IoT) enabled Service Systems

    Establish a core set of data-driven modeling, failure prognosis, and service decision-making methodologies for emerging Internet of Things (IoT) enabled service systems, particularly in the context of TMHNA

    Figure: IoT-enabled service system. Units of equipment in the field send sensing data over a communication network to a back-office processing center, which returns service alerts. Data sources: historical off-line data on multiple units (condition monitoring (CM) data, time-to-failure data with failure and censored cases, failure event data) and real-time on-line CM data on individual units (CM signal versus time for Car #1, Car #2, …, Car #i).

  • Lab for System Informatics and Data Analytics (SIDA)

    Big data analytics solutions to improve nuclear power plant efficiency: Online monitoring, visualization, prognosis, and maintenance decision making

    Advance the ability to assess equipment condition and predict the remaining useful life (RUL) to support optimal maintenance decision making in nuclear power plants.

  • Lab for System Informatics and Data Analytics (SIDA) 17

    Spatiotemporal Field Modeling and Prediction

  • Lab for System Informatics and Data Analytics (SIDA) 18

    Real-time travel demand modeling and prediction in smart and connected cities

    Objective

    • Online prediction of the origin-destination (OD) demand in traffic networks

    • Existing literature models the demand count data separately for different OD pairs without considering spatial correlations or domain knowledge

    Approaches

    • Propose a multivariate Poisson log-normal model with a specific parametrization tailored to the traffic demand problem

    • Capture the spatiotemporal correlations of the traffic demand across different routes and epochs and automatically cluster the routes based on the demand correlations

    • The model is estimated using an Expectation-Maximization (EM) algorithm and applied to predict future demand counts at subsequent epochs

    Results Summary

    • The proposed method integrates traffic network domain knowledge and achieves a sparse estimation based on clusters of routes

    • Estimate the parameters of the model accurately with the developed EM algorithm

    • Has been applied to a real New York yellow taxi dataset

    • Supported several students
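    For readers unfamiliar with the multivariate Poisson log-normal model, the sketch below simulates counts from its generic form: log-rates are drawn from a multivariate normal distribution and counts are Poisson given those rates. It shows only the generative structure with a hypothetical block-diagonal covariance standing in for route clusters. The slide's specific parametrization and the EM estimation are not reproduced, and all numbers are made up.

```python
import numpy as np


def simulate_od_counts(mu, sigma, n_epochs, seed=0):
    """Draw demand counts from a multivariate Poisson log-normal model.

    mu:    mean vector of the log-rates, one entry per OD route
    sigma: covariance matrix of the log-rates (encodes route correlations)
    Returns an integer array of shape (n_epochs, n_routes).
    """
    rng = np.random.default_rng(seed)
    log_rates = rng.multivariate_normal(mu, sigma, size=n_epochs)
    return rng.poisson(np.exp(log_rates))


# Hypothetical setup: 4 routes in 2 clusters; correlation only within a cluster.
mu = np.log([5.0, 5.0, 10.0, 10.0])
sigma = np.array([
    [0.25, 0.20, 0.00, 0.00],
    [0.20, 0.25, 0.00, 0.00],
    [0.00, 0.00, 0.25, 0.20],
    [0.00, 0.00, 0.20, 0.25],
])
counts = simulate_od_counts(mu, sigma, n_epochs=200_000)
print(counts.mean(axis=0))                     # close to exp(mu + diag(sigma) / 2)
print(np.corrcoef(counts, rowvar=False).round(2))
```

    Routes in the same block show positively correlated counts while routes in different blocks are nearly uncorrelated, which is the kind of sparse, cluster-based structure the slide describes.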


  • Lab for System Informatics and Data Analytics (SIDA) 19

    Modeling of dynamic thermal fields via grid-based sensor networks

    Objective

    • Accurate modeling and estimation of the full-scale grain thermal field based on the grid-based sensor networks

    • Challenges:

    o Grid-based but sparse sensor data

    o Spatiotemporal correlation structures

    o Local variability of grain temperature

    Approaches

    • Integrate a physical dynamics model (for the global profile) and spatiotemporal stochastic processes (for the local profile)

    • Develop a spatiotemporal transfer learning technique for 3D field estimation using sensor observations from several homogeneous data sources

    • Estimate time-varying parameters in PDE models from the obtained data to acquire a more accurate description of the dynamics

    Results Summary

    • The proposed methods integrate a physical dynamics model, a spatiotemporal statistical model, and advanced machine learning techniques to achieve an accurate estimation of the 3D thermal fields based on grid-based sensor networks

    • Has been tested and verified on several real datasets for grain storage application

    Figure: thermal field observations $Y(s, t_1), Y(s, t_2), \ldots, Y(s, t_M)$ at times $t_1, t_2, \ldots, t_M$

  • Lab for System Informatics and Data Analytics (SIDA) 20

    Other Research Projects

  • Lab for System Informatics and Data Analytics (SIDA) 21

    Operator activity index development and performance improvement

    Objective

    • Propose a generic approach to develop an effective composite index to identify high-performing operators on multiple dimensions

    Results Summary

    • Developed an operator activity index (OAI) by combining worker metrics information to measure the activity of operators

    • OAI by NPCA meaningfully explains the operator activity and also provides guidance for performance improvement

    • Algorithms have been tested in the forklift operator activity analyses

    • Supported several students

    Approaches

    • A new nonnegative principal component analysis (NPCA) approach with optimal balance

    o Best separation of operators

    o Comply with practical interpretation
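    To show what a nonnegative composite index looks like in practice, here is a generic sketch that finds a nonnegative leading direction of the covariance matrix by projected power iteration. It is a common heuristic for nonnegative PCA and is not the NPCA formulation "with optimal balance" named on this slide. The function names and the synthetic operator metrics are hypothetical.

```python
import numpy as np


def nonnegative_pc(X, n_iter=500, tol=1e-10):
    """Approximate a nonnegative leading principal direction.

    X: array of shape (n_operators, n_metrics).
    Uses projected power iteration: multiply by the covariance matrix,
    clip negative entries to zero, renormalize to unit length.
    """
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    d = cov.shape[0]
    w = np.full(d, 1.0 / np.sqrt(d))   # start from equal weights
    for _ in range(n_iter):
        v = np.maximum(cov @ w, 0.0)
        norm = np.linalg.norm(v)
        if norm == 0.0:
            break
        v /= norm
        converged = np.linalg.norm(v - w) < tol
        w = v
        if converged:
            break
    return w


def activity_index(X, w):
    """Composite index: weighted sum of centered metrics with nonnegative weights."""
    return (X - X.mean(axis=0)) @ w


# Synthetic metrics driven by one latent activity level (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.standard_normal(500)
X = np.outer(latent, [1.0, 0.8, 0.5]) + 0.3 * rng.standard_normal((500, 3))
w = nonnegative_pc(X)
print(w, np.corrcoef(activity_index(X, w), latent)[0, 1])
```

    Because every weight is nonnegative, a higher value on any metric never lowers the index, which keeps the index easy to interpret as an overall activity score.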

  • Lab for System Informatics and Data Analytics (SIDA)

    Obstructive Sleep Apnea Detection

    22

  • Lab for System Informatics and Data Analytics (SIDA) 23

    Retail Site Location Analysis by Business Data Analytics

    Objective

    • Choose an optimal location for the opening of a new retail site

    Results Summary

    • Established a generic guideline on leveraging data analytics tools for resolving business issues when dealing with business big data

    • Algorithms have been tested in a real case study involving choosing an optimal location for the opening of a new retail site

    • Supported several students

    Approaches

    • Estimate the new market shares of the company over the country if the new retail site is tentatively opened at different potential locations

    The company of interest runs a gas station equipment repair and replacement business and provided a dataset containing more than 1 million detailed business transactions, about 8 GB in size, covering the past 5 years.

  • Lab for System Informatics and Data Analytics (SIDA)

    Research Summary

    Industrial Big Data Analytics at the intersection of Engineering, Statistics/Data Mining, and Operations Research/Control

  • Lab for System Informatics and Data Analytics (SIDA) 25

    Acknowledgement

  • Lab for System Informatics and Data Analytics (SIDA)

    Thank you!

    Questions?

    [email protected]

    26