[Figure: Predictive Maintenance overview. Labels: Anomaly Detection Framework, Root Cause Analysis, Anomaly Prediction, Validate Anomalies, Feedback loop, Reports & Dashboards, Data Sources, Component Failure Prediction, System Failure Prevention, Fault Identification]
Understanding Data Requirements for Incorporating Predictive Maintenance on Naval Systems
Outline - Science Behind Data Science
• Maintenance Strategies in Vogue
• Predictive Maintenance as Alternative
• Data Science Approach
• Business Questions & End Goal Objectives
• Problem Qualifying Criteria
• Data Requirements and Preparation
• Data Preprocessing and Feature Engineering
• Historical Data Usage
• Training, Validation and Testing
Assimilating how the Data is Transformed from Numbers to Intelligence
Objective - PdM on Fire Control Radar
• Theoretical Approach vs Practical Application
• Framework Adopted
• Multiple Data types
• Business Questions - Data Driven or SME Validated?
• Data Engineering, Model Development and Testing
• Constraints
• Lessons Learnt and Seeking the Solutions
Maintenance Strategies - A Brief Primer
Strategy Evolution
Predictive Maintenance - An Alternative
• Predictive Maintenance Scenario
• Data Collected Over time to Monitor state of Equipment
• To find patterns that can help predict and ultimately prevent failure
• Remaining Useful Life - to schedule maintenance in advance
• Flagging irregular behaviour: Anomaly Detection through Time Series Analysis
• Failure Diagnosis and recommendation of mitigation or maintenance action after the failure
Use of Cognitive Technology, viz. Data Science and AI/ML models, to find m & c in (Y = mx + c)
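As a rough illustration of "finding m & c", the sketch below fits a straight line by ordinary least squares. The function name `fit_line` and the temperature readings are made up for illustration; they are not taken from the system discussed in these slides.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares estimate of slope m and intercept c in y = m*x + c."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    c = y_bar - m * x_bar
    return m, c


# Hypothetical data: temperature rising 2 degrees per hour from 37 degrees.
hours = [0, 1, 2, 3, 4]
temperature = [37.0, 39.0, 41.0, 43.0, 45.0]
m, c = fit_line(hours, temperature)
print(m, c)
```

Real PdM models have many more parameters than one slope and one intercept, but the principle of learning them from historical data is the same.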
AI is the Answer - What is the Question?
The Big Picture - Data is the key to Frame the Correct Question
In the Beginning….
• Business Questions/ Problem Statement
• Detect Anomalies
• Predict if Equipment may fail in near Future
• Estimate Remaining Useful Life (RUL)
• Identify Failure Causes and Maintenance Actions
• Problem Qualifying for Predictive Analysis
• Availability of target/ Outcome to predict
• Record of operational history with Good/ Bad Outcomes
• Domain Expertise
And then came Data Discovery
• Predictive Model
• Learns patterns from Historical Data
• Predicts Future Outcomes based on the Patterns learnt
• Accuracy of predictions depends on
• Relevance
• Quality
• Sufficiency
Data and its Relevance
• Relevant Data
• Failure History
• Machine Repair History
• Operating Conditions - Sensor Inputs
• Time Varying Conditions - Leading to Degradation
• Equipment Metadata and Static Features
Two Main Data Types- Temporal/ Time Varying and Static
Data and its QC
• Quality Data
• Predictor Attribute Value (X) to be Accurate with respect to Target Variable (Y)
• Exploratory Data Analysis approach towards Quality
• Use data to suggest Hypothesis
• Provide insights on Data sets and basis for further data Collection
• Guide feature Selection and Model Building Process
Data as Table of Records - Rows are Training Instances/ Columns are Independent Variable or Predictive Feature
Feature Engineering
• Process between EDA and Modelling
• Create Additional relevant features to increase Predictive Power
• Feature Selection to eliminate redundant or correlated features
• Use of Lag features - How far the model has to look back
• Rolling Aggregates - Overlapping Data
• Tumbling Aggregates - Distinct Time Segments
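The three feature types above can be sketched in a few lines of plain Python. The helper names (`lag`, `rolling_mean`, `tumbling_mean`) and the readings are hypothetical; a production pipeline would more likely use a dataframe library.

```python
def lag(values, k):
    """Value observed k steps earlier; None where the history is too short."""
    return [None] * k + values[:-k] if k > 0 else list(values)


def rolling_mean(values, window):
    """Overlapping windows: one aggregate per time step once the window is full."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def tumbling_mean(values, window):
    """Distinct segments: one aggregate per non-overlapping block of `window` steps."""
    return [
        sum(values[i : i + window]) / window
        for i in range(0, len(values) - window + 1, window)
    ]


readings = [10, 12, 11, 15, 14, 18]      # made-up sensor readings
lag_1 = lag(readings, 1)                 # look back one step
rolling_3 = rolling_mean(readings, 3)    # 4 overlapping windows
tumbling_3 = tumbling_mean(readings, 3)  # 2 distinct windows
print(lag_1, rolling_3, tumbling_3)
```

With six readings and a window of three, the rolling aggregate yields four values because consecutive windows share data, while the tumbling aggregate yields two because each reading is used once.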
Data Representation
Asset ID | Time | Features | Label
A 123 | Day 1 | ………. |
A 123 | Day 2 | ……….. |
B 234 | Day 1 | ………… |
B 234 | Day 2 | ……….. |
Modelling Techniques for PdM
• Label Construction Methods
• Binary Classification
1 = Fail ; 0 = Normal
• Regression Model - Remaining Useful Life (RUL)
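A minimal sketch of both labelling methods for a single asset follows. The `horizon` parameter (how many days before a failure a record is labelled 1) is an assumption introduced here; the slides only state that 1 = Fail and 0 = Normal. The function name `build_labels` is hypothetical.

```python
def build_labels(days, failure_day, horizon):
    """Return (binary_labels, rul) for one asset.

    binary label: 1 if the asset fails within `horizon` days of the record, else 0.
    rul: days remaining until the recorded failure (regression target).
    """
    rul = [failure_day - d for d in days]
    binary = [1 if 0 <= r <= horizon else 0 for r in rul]
    return binary, rul


# Hypothetical asset observed on days 1..6 that fails on day 6.
days = [1, 2, 3, 4, 5, 6]
binary, rul = build_labels(days, failure_day=6, horizon=2)
print(binary)  # classification target
print(rul)     # regression target
```

The choice of horizon is a business decision: it should be long enough to schedule maintenance in advance, yet short enough that the labelled records actually show signs of degradation.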
Modelling - The Cheat Sheet
[Figure: Machine Learning Algorithm Cheat Sheet. © 2019 Microsoft Corporation. All rights reserved. Share this poster: aka.ms/mlcheatsheet]
This cheat sheet helps you choose the best machine learning algorithm for your predictive analytics solution. Your decision is driven by both the nature of your data and the goal you want to achieve with your data.
• Text Analytics - Extract information from text (What info is in this text?): Extract N-Gram Features from Text, Feature Hashing, Preprocess Text, Word2Vector
• Regression - Predict values (How much or how many?): Poisson Regression, Linear Regression, Bayesian Linear Regression, Decision Forest Regression, Neural Network Regression, Boosted Decision Tree Regression, Fast Forest Quantile Regression
• Recommenders - Generate recommendations (What will they be interested in?): SVD Recommender
• Clustering - Discover structure (How is this organized?): K-Means
• Anomaly Detection - Find unusual occurrences (Is this weird?): PCA-Based Anomaly Detection, One Class SVM
• Image Classification - Classify images (What does this image represent?): DenseNet
• Two-Class Classification - Predict between two categories (Is this A or B?): Two-Class Support Vector Machine, Two-Class Averaged Perceptron, Two-Class Decision Forest, Two-Class Logistic Regression, Two-Class Boosted Decision Tree, Two-Class Neural Network
• Multiclass Classification - Predict between several categories (Is this A or B or C or D?): Multiclass Logistic Regression, Multiclass Neural Network, Multiclass Decision Forest, One-vs-All Multiclass, Multiclass Boosted Decision Tree
Validation - Sea Trials
• Cross Validation - Test Model in Training Phase
• Training Data
• Test Data
• k fold randomly splits the data into k subsets
• Runs the algorithm k times
• Current fold as validation, remaining as Training Data
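The k-fold procedure described above can be sketched as follows. The function names, the toy target values and the "predict the training mean" baseline are all illustrative assumptions, not the models used in this work.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Randomly split indices 0..n_samples-1 into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, fit_and_score, seed=0):
    """Run the algorithm k times; each fold is the validation set exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        scores.append(fit_and_score(training, validation))
    return scores


# Hypothetical target values and a trivial baseline "model".
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]


def mean_baseline_mae(training, validation):
    """Predict the training mean; score with mean absolute error on the fold."""
    prediction = sum(y[i] for i in training) / len(training)
    return sum(abs(y[i] - prediction) for i in validation) / len(validation)


scores = cross_validate(len(y), k=5, fit_and_score=mean_baseline_mae)
print(scores)
```

For time-ordered PdM data a purely random split can leak future information into training, so a time-based split may be more appropriate; the random split is shown only because it is the variant named above.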
Model Evaluation - REFCOM
• Performance Metrics
• Regression - Mean Absolute/ Squared Error
• Classification - Confusion Matrix
• Accuracy
• Recall
• Precision
• F Score
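The metrics listed above can be computed directly from their definitions. The sketch below assumes binary labels with 1 = Fail and 0 = Normal; the label vectors and temperature values are made up.

```python
def confusion_matrix(y_true, y_pred):
    """Counts (tp, tn, fp, fn) for binary labels where 1 = Fail and 0 = Normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 derived from the confusion matrix."""
    tp, tn, fp, fn = confusion_matrix(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
scores = classification_scores(y_true, y_pred)
mae = mean_absolute_error([37.0, 40.0], [38.0, 39.0])
mse = mean_squared_error([37.0, 40.0], [38.0, 39.0])
print(scores, mae, mse)
```

Because failures are rare in PdM data, accuracy alone can look high for a model that never predicts a failure; recall and precision on the Fail class are usually the more telling figures.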
Practical Implementation - Dogwatches
• Application on a Weapon Control Module
• Control the gun mount laying angle
• Send engagement commands
• Receive Tellback commands
• Stabilisation Data
• Discrete PCB
• Synchro PCB
• System Controller
Data Set and Initial Analysis
Available Parameter | Primary Operating Characteristic | Early Warning Trend feasibility | ML based Prediction feasibility
Ref Voltg Cmd Azm Coarse_Gun | Measurement | Yes | Yes
Ref Voltg Cmd Azm Fine_Gun | Measurement | Yes | Yes
Ref Freq Cmd Azm Coarse_Gun | Measurement | Yes | Yes
Ref Freq Cmd Azm Fine_Gun | Measurement | Yes | Yes
Ref Voltg Cmd Elv Coarse_Gun | Measurement | Yes | Yes
Ref Voltg Cmd Elv Fine_Gun | Measurement | Yes | Yes
Ref Freq Cmd Elv Coarse_Gun | Measurement | Yes | Yes
Ref Freq Cmd Elv Fine_Gun | Measurement | Yes | Yes
Ref Voltg Tlbk Azm_Gun | Measurement | Yes | Yes
Ref Voltg Tlbk Elv_Gun | Measurement | Yes | Yes
Ref Freq Tlbk Azm_Gun | Measurement | Yes | Yes
Ref Freq Tlbk Elv_Gun | Measurement | Yes | Yes
Business Questions
Category | Questions | Key Dependency | Predictive/ML Capability
Health Status | Are all PCBs (including Modules and Channels) in a healthy and functioning state? | List of critical parameters and conditions that denote a healthy state, Early Warning Trends & Anomaly detection output | Yes
Health Status | If not, can I know which Channels/Modules/PCBs have gone bad? | Mapping of Parameters to PCB/Module (Data Dictionary), Early Warning Trends & Anomaly detection | Yes
Cabinet Health | Is the Board/CPU temperature healthy for FCS to be operated? | Board/CPU temperature threshold for FCS to be operated, Early Warning Trend output | Yes
Parameter View | Can I view the Navigational, Current, Frequency and Voltage data? | List of Parameters mapped to each (Data Dictionary) | No
Early Warning Trends | Are there any Spikes in Voltage, Current, Frequency values? | None | No
Early Warning Trends | Are there any Spikes outside of control limits for Voltage, Current, Frequency values? | Control Limits to be provided for each Parameter | No
Early Warning Trends | Is the Board/CPU temperature showing an increasing trend beyond expectation? | Expected Trend to be detected from Data, SME to validate | Yes
Early Warning Trends | Is the Loss parameter showing an increasing trend beyond expectation? | Expected Trend to be detected from Data, SME to validate | Yes
Early Warning Trends | Is the Voltage, Frequency, Current showing a Decreasing trend beyond expectation? | Expected Trend to be detected from Data, SME to validate | Yes
Early Warning Trends | Is there any deviation in Command and Tellback beyond expectation? | Tolerance for deviation in Command and Tellback to be provided | Yes
Anomaly Detection | Does the total number of Spikes in Single OR Cumulative runs point to anomalous behavior? | Normal behaviour to be learnt from Data; more the data breadth & depth, better the Learning | Yes
Anomaly Detection | Does the total deviation in Command and Tellback in Single OR Cumulative runs point to anomalous behavior? | Normal behaviour to be learnt from Data; more the data breadth & depth, better the Learning | Yes
Anomaly Detection | Does the Navigational, Gun Mount parameter variation point to Anomalous behavior? | Normal behaviour to be learnt from Data; more the data breadth & depth, better the Learning | Yes
Anomaly Detection | Does the total Loss value in Single OR Cumulative runs point to anomalous behavior? | Normal behaviour to be learnt from Data; more the data breadth & depth, better the Learning | Yes
Data Qualifications
• Good Quality data with less than 1% missing Data
• Time stamp formatting, Joining of data
• Feature Extraction & EDA
• Identification of Spikes
• Deviation Features
• Tumbling window based lag features, viz. aggregate over current usage, Total System usage
• A total of 120+ features for modelling
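A hedged sketch of three of the checks named above: the missing-data fraction, spike identification against control limits, and Command/Tellback deviation features. All function names, limits, tolerances and readings are hypothetical; in practice the control limits and tolerances would be supplied or validated by the SME.

```python
def missing_fraction(values):
    """Share of records with no reading (None)."""
    return sum(v is None for v in values) / len(values)


def flag_spikes(values, lcl, ucl):
    """1 where a reading breaches the lower or upper control limit, else 0."""
    return [1 if v < lcl or v > ucl else 0 for v in values]


def deviation(command, tellback):
    """Absolute difference between commanded and reported (tellback) values."""
    return [abs(c - t) for c, t in zip(command, tellback)]


# Made-up reference voltage readings with assumed control limits of 24 V and 28 V.
ref_voltage = [26.0, 26.1, 31.5, 25.9, 20.2, 26.0]
spikes = flag_spikes(ref_voltage, lcl=24.0, ucl=28.0)
spike_count = sum(spikes)  # candidate feature: total spikes in a run

# Made-up azimuth command and tellback angles with an assumed 1.0 degree tolerance.
cmd_azimuth = [10.0, 20.0, 30.0, 40.0]
tlbk_azimuth = [10.1, 19.8, 33.0, 40.0]
dev = deviation(cmd_azimuth, tlbk_azimuth)
out_of_tolerance = [d > 1.0 for d in dev]

print(missing_fraction([1.0, None, 2.0, 3.0]), spikes, spike_count, out_of_tolerance)
```

Counts and totals such as `spike_count` can then be aggregated per run or cumulatively, which matches the "Single OR Cumulative runs" wording of the business questions.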
Model Selection and Testing
Sl | Data Packet | Model | Models Developed | Model Selected | Model Optimisation
(a) | OLT OPT | Board Temperature Prediction & CPU Temperature Prediction | Polynomial Regression, Decision Tree Regression | Decision Tree Regression based on accuracy | Optimised for Tree Depth
(b) | Synchro PCB | Anomaly Detection Model | K-means, DBSCAN, Gaussian Mixture Model, Truncated SVD (PCA) | K-means using PCA based on Elbow Curve, Silhouette Score and Confusion Matrix | Optimised for number of components in PCA, Number of Clusters based on Elbow Curve and Silhouette Score
(c) | Discrete PCB | Anomaly Detection Model Module 1; Anomaly Detection Model Module 4; Anomaly Detection Model Module 5 | | |
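To illustrate how an elbow curve guides the choice of cluster count, the sketch below runs a plain Lloyd's k-means on made-up 2-D points and records the inertia (within-cluster sum of squared distances) for several values of k. This is a simplified stand-in, not the model built for the PCBs: it uses a deterministic initialisation instead of the randomised seeding found in libraries, and it omits the PCA step and the Silhouette Score.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    # Deterministic initialisation for reproducibility (a simplification;
    # library implementations typically use randomised k-means++ seeding).
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Made-up feature vectors forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
inertia_by_k = {k: kmeans(points, k)[2] for k in (1, 2, 3)}
print(inertia_by_k)
```

On this toy data the inertia drops sharply from k = 1 to k = 2 and only slightly from k = 2 to k = 3, so the "elbow" sits at two clusters.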
Sl | Data Packet | Model | Test Set | Test Measure
(a) | OLT OPT | Board Temperature Prediction & CPU Temperature Prediction | 5 Runs of Normal and 13 Runs of Abnormal Operation (Simulated) | Decision Tree Regression based on accuracy
(b) | Synchro PCB | Anomaly Detection Model | Dataset having mix of UCL Breach, LCL Breach and High Rate of Change of Reference & Line Voltage or Reference Frequency | Confusion Matrix Scores of True Positive, False Positive Rates, Recall and F1 Score
(c) | Discrete PCB | Anomaly Detection Model Module 1 | Dataset having mix of UCL Breach, LCL Breach, Logic High Breach and High Rate of Change of Instantaneous and Average Current & Voltage | Confusion Matrix Scores of True Positive, False Positive Rates, Recall and F1 Score
(c) | Discrete PCB | Anomaly Detection Model Module 4 | |
(c) | Discrete PCB | Anomaly Detection Model Module 5 | |
Lab Results Achieved
[Figure: Model Prediction for Start Temperature of 37 °C for over 8 hours of Operation]
[Figure: 2 Clusters correspond to the ‘No Anomaly Cluster’]
• Regression based Temperature Prediction Module with 95% accuracy - System Controller PCB
• Unsupervised Learning Based Anomaly detection Module
Lessons Learnt
• Incorporation of PdM on complex system with OEM support
• IoT sensors usage is inescapable
• Data Loggers as integral part of new acquisitions
• Learning the Machine Learning - Training the manpower
• Just learnt - Deep Learning can be Used with Raw data
• Data labelling/ Annotation
THANK YOU