  • Anomaly Detection

    Framework

    Root Cause Analysis

    Anomaly Prediction

    Validate Anomalies

    Feedback loop

    Reports & Dashboards

    PREDICTIVE MAINTENANCE

    Data Sources

    Component Failure Prediction

    System Failure Prevention

    Fault Identification

    Understanding Data Requirements for Incorporating Predictive Maintenance on Naval Systems

  • Outline - Science Behind Data Science

    • Maintenance Strategies in Vogue

    • Predictive Maintenance as Alternative

    • Data Science Approach

    • Business Questions & End Goal Objectives

    • Qualifying problem Criterion

    • Data Requirements and Preparation

    • Data Preprocessing and Feature Engineering

    • Historical Data Usage

    • Training, Validation and Testing

    Assimilating how the Data is Transformed from Numbers to Intelligence

  • Objective - PdM on Fire Control Radar

    • Theoretical Approach vs Practical Application

    • Framework Adopted

    • Multiple Data types

    • Business Questions - Data Driven or SME Validated ??

    • Data Engineering, Model Development and Testing

    • Constraints

    • Lessons Learnt and Seeking the Solutions

  • Maintenance Strategies A Brief Primer

    Strategy Evolution

  • Predictive Maintenance - An Alternative

    • Predictive Maintenance Scenario

    • Data Collected Over time to Monitor state of Equipment

    • To find patterns that can help predict and ultimately prevent failure

    • Remaining Useful Life - to schedule maintenance in advance

    • Flagging irregular behaviour: Anomaly Detection through Time Series Analysis

    • Failure Diagnosis and recommendation of mitigation or maintenance action after the failure

    Use of Cognitive Technology, viz, Data Science and AI/ML models to find m & c in (Y = mx + c)
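    As a minimal sketch of "finding m & c", the slope and intercept of Y = mx + c can be estimated from logged readings by ordinary least squares and then extrapolated to a failure threshold. The readings, the threshold value and the helper names (`fit_line`, `hours_to_threshold`) are hypothetical and not taken from the presentation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimate of slope m and intercept c in y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def hours_to_threshold(m, c, threshold):
    """Operating hour at which the fitted line reaches the threshold."""
    return (threshold - c) / m


# Hypothetical data: hours of operation vs. a degradation indicator.
hours = [0, 1, 2, 3, 4]
wear = [1.0, 3.0, 5.0, 7.0, 9.0]

m, c = fit_line(hours, wear)
crossing = hours_to_threshold(m, c, threshold=21.0)
print(m, c, crossing)
```

    Subtracting the current operating hour from `crossing` gives a crude Remaining Useful Life estimate; real degradation is rarely this linear.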

  • AI is the Answer - What is the Question ??

    The Big Picture - Data is the key to Frame the Correct Question

  • In the Beginning…

    • Business Questions/ Problem Statement

    • Detect Anomalies

    • Predict if Equipment may fail in near Future

    • Estimate Remaining Useful Life (RUL)

    • Identify Failure Causes and Maintenance Actions

    • Problem Qualifying for Predictive Analysis

    • Availability of target/ Outcome to predict

    • Record of operational history with Good/ Bad Outcomes

    • Domain Expertise

  • And then came Data Discovery

    • Predictive Model

    • Learns patterns from Historical Data

    • Predicts Future Outcomes based on the Patterns learnt

    • Accuracy of predictions depends on

    • Relevance

    • Quality

    • Sufficiency

  • Data and its Relevance

    • Relevant Data

    • Failure History

    • Machine Repair History

    • Operating Conditions - Sensor Inputs

    • Time Varying Conditions - Leading to Degradation

    • Equipment Metadata and Static Features

    Two Main Data Types- Temporal/ Time Varying and Static

  • Data and its QC

    • Quality Data

    • Predictor Attribute Value (X) to be Accurate with respect to Target Variable (Y)

    • Exploratory Data Analysis approach towards Quality

    • Use data to suggest Hypothesis

    • Provide insights on Data sets and basis for further data Collection

    • Guide feature Selection and Model Building Process

    Data as a Table of Records - Rows are Training Instances; Columns are Independent Variables (Predictive Features)

  • Feature Engineering

    • Process between EDA and Modelling

    • Create Additional relevant features to increase Predictive Power

    • Feature Selection to eliminate redundant or correlated features

    • Use of Lag features - How far the model has to look back

    • Rolling Aggregates - Overlapping Data

    • Tumbling Aggregates - Distinct Time Segments
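    The three feature types above can be sketched in a few lines. This is an illustrative implementation on a plain list of readings; the function names, the window of 3 and the sample values are assumptions, not the features used in the presented work.

```python
def lag_feature(values, lag):
    """Value observed `lag` steps earlier; None where history is too short."""
    return [values[i - lag] if i >= lag else None for i in range(len(values))]


def rolling_mean(values, window):
    """Mean over overlapping windows, one result per sample once the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            out.append(sum(chunk) / window)
    return out


def tumbling_mean(values, window):
    """Mean over distinct, non-overlapping segments of length `window`."""
    return [
        sum(values[i:i + window]) / window
        for i in range(0, len(values) - window + 1, window)
    ]


# Hypothetical sensor readings at equal time steps.
readings = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]

print(lag_feature(readings, 1))
print(rolling_mean(readings, 3))
print(tumbling_mean(readings, 3))
```

    Rolling windows reuse samples across neighbouring windows, while tumbling windows give one value per distinct segment, so the tumbling output is shorter than the input.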

  • Data Representation

    Asset ID | Time | Features | Label
    A 123 | Day 1 | … | …
    A 123 | Day 2 | … | …
    B 234 | Day 1 | … | …
    B 234 | Day 2 | … | …

  • Modelling Techniques for PdM

    • Label Construction Methods

    • Binary Classification

    1 = Fail ; 0 = Normal

    • Regression Model - Remaining Useful Life (RUL)
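    A minimal sketch of both label types for a single run-to-failure record follows. The prediction horizon of 2 cycles, the cycle numbers and the `build_labels` helper are illustrative assumptions.

```python
def build_labels(cycles, failure_cycle, horizon):
    """Build both label types for one asset.

    rul    : cycles remaining until the recorded failure (regression target)
    binary : 1 if the failure falls within `horizon` cycles, else 0
    """
    rul = [failure_cycle - c for c in cycles]
    binary = [1 if r <= horizon else 0 for r in rul]
    return binary, rul


# Hypothetical asset that failed at cycle 6.
cycles = [1, 2, 3, 4, 5, 6]
binary, rul = build_labels(cycles, failure_cycle=6, horizon=2)

print(binary)  # classification labels: 1 = Fail, 0 = Normal
print(rul)     # regression labels: Remaining Useful Life
```

    The horizon controls how early the classifier is asked to raise a warning; a longer horizon gives more positive examples but a less precise alarm.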

  • Modelling - The Cheat Sheet

    Machine Learning Algorithm Cheat Sheet

    This cheat sheet helps you choose the best machine learning algorithm for your predictive analytics solution. Your decision is driven by both the nature of your data and the goal you want to achieve with your data.

    • Extract information from text - Text Analytics: Derives high-quality information from text. Answers questions like: What info is in this text?

      • Extract N-Gram Features from Text: Creates a dictionary of n-grams from a column of free text

      • Feature Hashing: Converts text data to integer encoded features using the Vowpal Wabbit library

      • Preprocess Text: Performs cleaning operations on text, like removal of stop-words, case normalization

      • Word2Vector: Converts words to values for use in NLP tasks, like recommender, named entity recognition, machine translation

    • Generate recommendations - Recommenders: Predicts what someone will be interested in. Answers the question: What will they be interested in?

      • SVD Recommender: Collaborative filtering, better performance with lower cost by reducing dimensionality

    • Predict values - Regression: Makes forecasts by estimating the relationship between values. Answers questions like: How much or how many?

      • Fast Forest Quantile Regression: Predicts a distribution

      • Poisson Regression: Predicts event counts

      • Linear Regression: Fast training, linear model

      • Bayesian Linear Regression: Linear model, small data sets

      • Decision Forest Regression: Accurate, fast training times

      • Neural Network Regression: Accurate, long training times

      • Boosted Decision Tree Regression: Accurate, fast training times, large memory footprint

    • Discover structure - Clustering: Separates similar data points into intuitive groups. Answers questions like: How is this organized?

      • K-Means: Unsupervised learning

    • Find unusual occurrences - Anomaly Detection: Identifies and predicts rare or unusual data points. Answers the question: Is this weird?

      • One Class SVM: Under 100 features, aggressive boundary

      • PCA-Based Anomaly Detection: Fast training times

    • Classify images - Image Classification: Classifies images with popular networks. Answers questions like: What does this image represent?

      • DenseNet: High accuracy, better efficiency

    • Predict between two categories - Two-Class Classification: Answers simple two-choice questions, like yes or no, true or false. Answers questions like: Is this A or B?

      • Two-Class Support Vector Machine: Under 100 features, linear model

      • Two-Class Averaged Perceptron: Fast training, linear model

      • Two-Class Decision Forest: Accurate, fast training

      • Two-Class Logistic Regression: Fast training, linear model

      • Two-Class Boosted Decision Tree: Accurate, fast training, large memory footprint

      • Two-Class Neural Network: Accurate, long training times

    • Predict between several categories - Multiclass Classification: Answers complex questions with multiple possible answers. Answers questions like: Is this A or B or C or D?

      • Multiclass Logistic Regression: Fast training times, linear model

      • Multiclass Neural Network: Accuracy, long training times

      • Multiclass Decision Forest: Accuracy, fast training times

      • One-vs-All Multiclass: Depends on the two-class classifier

      • Multiclass Boosted Decision Tree: Non-parametric, fast training times and scalable

    © 2019 Microsoft Corporation. All rights reserved. Share this poster: aka.ms/mlcheatsheet

  • Validation - Sea Trials

    • Cross Validation - Test Model in Training Phase

    • Training Data

    • Test Data

    • k-fold cross validation randomly splits the data into k subsets (folds)

    • Runs the algorithm k times

    • Each run uses the current fold as Validation Data and the remaining folds as Training Data
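    The k-fold procedure can be sketched with the standard library alone. The generator name `k_fold_indices`, the fixed seed and the sample count are illustrative assumptions; libraries such as scikit-learn provide equivalent utilities.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # random split, repeatable via seed
    folds = [indices[i::k] for i in range(k)]  # k roughly equal subsets
    for i in range(k):
        validation = folds[i]                  # current fold is held out
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(n_samples=10, k=5))
for training, validation in splits:
    print(sorted(validation), "held out;", len(training), "samples used for training")
```

    Every sample is used for validation exactly once, so the averaged score over the k runs uses all of the available data. For time series from the same equipment run, a random split can leak future information into training, so splitting by run or by time is often safer.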

  • Model Evaluation - REFCOM

    • Performance Metrics

    • Regression - Mean Absolute/ Squared Error

    • Classification - Confusion Matrix

    • Accuracy

    • Recall

    • Precision

    • F Score
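    The metrics listed above follow directly from the confusion-matrix counts and the prediction errors. The sketch below uses made-up labels (1 = Fail, 0 = Normal) and hypothetical helper names to show the arithmetic.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = Fail, 0 = Normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up classification results.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(actual, predicted)

# Made-up regression results (for example, predicted RUL in hours).
mae = mean_absolute_error([3.0, 5.0], [2.0, 7.0])
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])
print(metrics, mae, mse)
```

    Failures are usually rare in maintenance data, so accuracy alone can look high for a model that never predicts a failure; recall and precision expose that case.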

  • Practical Implementation - Dogwatches

    • Application on a Weapon Control Module

    • Control the gun mount laying angle

    • Send engagement commands

    • Receive Tell back commands

    • Stabilisation Data

    • Discrete PCB

    • Synchro PCB

    • System Controller

  • Data Set and Initial Analysis

    Available Parameter | Primary Operating Characteristic | Early Warning Trend feasibility | ML based Prediction feasibility
    Ref Voltg Cmd Azm Coarse_Gun | Measurement | Yes | Yes
    Ref Voltg Cmd Azm Fine_Gun | Measurement | Yes | Yes
    Ref Freq Cmd Azm Coarse_Gun | Measurement | Yes | Yes
    Ref Freq Cmd Azm Fine_Gun | Measurement | Yes | Yes
    Ref Voltg Cmd Elv Coarse_Gun | Measurement | Yes | Yes
    Ref Voltg Cmd Elv Fine_Gun | Measurement | Yes | Yes
    Ref Freq Cmd Elv Coarse_Gun | Measurement | Yes | Yes
    Ref Freq Cmd Elv Fine_Gun | Measurement | Yes | Yes
    Ref Voltg Tlbk Azm_Gun | Measurement | Yes | Yes
    Ref Voltg Tlbk Elv_Gun | Measurement | Yes | Yes
    Ref Freq Tlbk Azm_Gun | Measurement | Yes | Yes
    Ref Freq Tlbk Elv_Gun | Measurement | Yes | Yes

  • Business Questions

    Category | Question | Key Dependency | Predictive/ML Capability
    Health Status | Are all PCBs (including Modules and Channels) in a healthy and functioning state? | List of critical parameters and conditions that denote a healthy state; Early Warning Trends & Anomaly Detection output | Yes
    Health Status | If not, can I know which Channels/Modules/PCBs have gone bad? | Mapping of Parameters to PCB/Module (Data Dictionary); Early Warning Trends & Anomaly Detection | Yes
    Cabinet Health | Is the Board/CPU temperature healthy for FCS to be operated? | Board/CPU temperature threshold for FCS to be operated; Early Warning Trend output | Yes
    Parameter View | Can I view the Navigational, Current, Frequency and Voltage data? | List of Parameters mapped to each (Data Dictionary) | No
    Early Warning Trends | Are there any Spikes in Voltage, Current, Frequency values? | None | No
    Early Warning Trends | Are there any Spikes outside of control limits for Voltage, Current, Frequency values? | Control Limits to be provided for each Parameter | No
    Early Warning Trends | Is the Board/CPU temperature showing an increasing trend beyond expectation? | Expected Trend to be detected from Data, SME to validate | Yes
    Early Warning Trends | Is the Loss parameter showing an increasing trend beyond expectation? | Expected Trend to be detected from Data, SME to validate | Yes
    Early Warning Trends | Is the Voltage, Frequency, Current showing a decreasing trend beyond expectation? | Expected Trend to be detected from Data, SME to validate | Yes
    Early Warning Trends | Is there any deviation in Command and Tellback beyond expectation? | Tolerance for deviation in Command and Tellback to be provided | Yes
    Anomaly Detection | Does the total number of Spikes in Single OR Cumulative runs point to anomalous behaviour? | Normal behaviour to be learnt from Data; the greater the data breadth & depth, the better the Learning | Yes
    Anomaly Detection | Does the total deviation in Command and Tellback in Single OR Cumulative runs point to anomalous behaviour? | Normal behaviour to be learnt from Data; the greater the data breadth & depth, the better the Learning | Yes
    Anomaly Detection | Does the Navigational, Gun Mount parameter variation point to anomalous behaviour? | Normal behaviour to be learnt from Data; the greater the data breadth & depth, the better the Learning | Yes
    Anomaly Detection | Does the total Loss value in Single OR Cumulative runs point to anomalous behaviour? | Normal behaviour to be learnt from Data; the greater the data breadth & depth, the better the Learning | Yes
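    Two of the questions above, spikes outside control limits and Command/Tellback deviation, reduce to simple rule checks once the limits and tolerance are supplied. The sketch below assumes hypothetical limit values, tolerance and readings; none of them come from the presented system.

```python
def find_spikes(values, lcl, ucl):
    """Indices of samples below the lower or above the upper control limit."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


def tellback_deviations(commands, tellbacks, tolerance):
    """Indices where the tellback differs from the command by more than the tolerance."""
    return [
        i
        for i, (cmd, tlbk) in enumerate(zip(commands, tellbacks))
        if abs(cmd - tlbk) > tolerance
    ]


# Hypothetical reference-voltage samples with assumed control limits.
voltage = [26.0, 26.1, 29.5, 25.9, 22.0]
spikes = find_spikes(voltage, lcl=24.0, ucl=28.0)

# Hypothetical commanded vs. reported gun-mount angles, in degrees.
command = [10.0, 20.0, 30.0]
tellback = [10.1, 20.9, 30.0]
deviations = tellback_deviations(command, tellback, tolerance=0.5)

print(spikes, deviations)
```

    Counts of such spikes or deviations per run are the kind of derived quantity that an anomaly detection model can then learn a normal range for.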

  • Data Qualifications

    • Good Quality data with less than 1% missing Data

    • Time stamp formatting, Joining of data

    • Feature Extraction & EDA

    • Identification of Spikes

    • Deviation Features

    • Tumbling window based lag features, viz, aggregate over current usage, Total System usage

    • A total of 120+ features for modelling
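    The first two preparation steps, checking the share of missing values and joining data packets on a parsed time stamp, might look like the sketch below. The field names, the time stamp format and the sample records are assumptions made for illustration.

```python
from datetime import datetime


def missing_fraction(records, fields):
    """Fraction of expected field values that are missing (None or absent)."""
    total = len(records) * len(fields)
    missing = sum(1 for r in records for f in fields if r.get(f) is None)
    return missing / total if total else 0.0


def join_on_timestamp(left, right, fmt="%Y-%m-%d %H:%M:%S"):
    """Inner join of two lists of dicts on their parsed 'timestamp' field."""
    right_by_time = {datetime.strptime(r["timestamp"], fmt): r for r in right}
    joined = []
    for row in left:
        key = datetime.strptime(row["timestamp"], fmt)
        if key in right_by_time:
            merged = {**row, **right_by_time[key]}
            merged["timestamp"] = key  # keep the parsed value, not the string
            joined.append(merged)
    return joined


# Hypothetical records from two data packets.
voltages = [
    {"timestamp": "2020-01-01 10:00:00", "ref_voltage": 26.0},
    {"timestamp": "2020-01-01 10:00:01", "ref_voltage": None},
]
temperatures = [
    {"timestamp": "2020-01-01 10:00:00", "board_temp": 37.0},
]

share_missing = missing_fraction(voltages, ["ref_voltage"])
joined = join_on_timestamp(voltages, temperatures)
print(share_missing, joined)
```

    A data set would meet the stated qualification when `missing_fraction` returns a value below 0.01.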

  • Model Selection and Testing

    Sl | Data Packet | Model | Models Developed | Model Selected | Optimisation
    (a) | OLT OPT | Board Temperature Prediction & CPU Temperature Prediction | Polynomial Regression, Decision Tree Regression | Decision Tree Regression based on accuracy | Optimised for Tree Depth
    (b) | Synchro PCB | Anomaly Detection Model | K-means, DBSCAN, Gaussian Mixture Model, Truncated SVD (PCA) | K-means using PCA based on Elbow Curve, Silhouette Score and Confusion Matrix | Optimised for number of components in PCA; Number of Clusters based on Elbow Curve and Silhouette Score
    (c) | Discrete PCB | Anomaly Detection Model Module 1 | | |
    | | Anomaly Detection Model Module 4 | | |
    | | Anomaly Detection Model Module 5 | | |

    Sl | Data Packet | Model | Test Set | Test Measure
    (a) | OLT OPT | Board Temperature Prediction & CPU Temperature Prediction | 5 Runs of Normal and 13 Runs of Abnormal Operation (Simulated) | Decision Tree Regression based on accuracy
    (b) | Synchro PCB | Anomaly Detection Model | Dataset having mix of UCL Breach, LCL Breach and High Rate of Change of Reference & Line Voltage or Reference Frequency | Confusion Matrix Scores of True Positive, False Positive Rates, Recall and F1 Score
    (c) | Discrete PCB | Anomaly Detection Model Module 1 | Dataset having mix of UCL Breach, LCL Breach, Logic High Breach and High Rate of Change of Instantaneous and Average Current & Voltage | Confusion Matrix Scores of True Positive, False Positive Rates, Recall and F1 Score
    | | Anomaly Detection Model Module 4 | |
    | | Anomaly Detection Model Module 5 | |
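    The Elbow Curve used to pick the number of clusters can be illustrated with a small, dependency-free k-means. This sketch covers only the elbow step; it omits the PCA reduction and Silhouette Score mentioned above, and the sample points, the deterministic initialisation and the function names are assumptions rather than the presented implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns (centroids, inertia).

    Initial centroids are taken at even spacing through the input order,
    an assumption made here so that the example is repeatable.
    """
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            mean_point(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


# Hypothetical 2-D feature vectors forming two tight groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]

inertias = {k: kmeans(points, k)[1] for k in (1, 2, 3)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

    The inertia falls sharply from k = 1 to k = 2 and only slightly afterwards, and that bend is the elbow suggesting two clusters for this toy data.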

  • Lab Results Achieved

    Model Prediction for Start Temperature of 37 °C for over 8 hours of Operation

    2 Clusters correspond to the ‘No Anomaly Cluster’

    • Regression based Temperature Prediction Module with 95% accuracy - System Controller PCB

    • Unsupervised Learning Based Anomaly detection Module

  • Lessons Learnt

    • Incorporation of PdM on complex system with OEM support

    • IoT sensors usage is inescapable

    • Data Loggers as integral part of new acquisitions

    • Learning the Machine Learning - Training the manpower

    • Just learnt - Deep Learning can be Used with Raw data

    • Data labelling/ Annotation

  • THANK YOU