This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 766994. It is the property of the PROPHESY consortium and shall not be distributed or reproduced without the formal approval of the PROPHESY Project Coordination Committee.
DELIVERABLE
D2.2 – Specification of Data Assets and Services
D2.2 – Specification of Data Assets and Services, Final – V2.01, 04/06/2018
Dissemination level: (PU) Public Page 2
Project Acronym: PROPHESY
Grant Agreement number: 766994 (H2020‐IND‐CE‐2016‐17/H2020‐FOF‐2017)
Project Full Title: Platform for rapid deployment of self‐configuring and
optimized predictive maintenance services
Project Coordinator: INTRASOFT International SA
DELIVERABLE
D2.2 – Specification of Data Assets and Services
Dissemination level (PU) Public
Type of Document (R) Report
Contractual date of delivery M08, 31/05/2018
Deliverable Leader FHG
Status ‐ version, date Final – V2.01, 04/06/2018
WP / Task responsible FHG
Keywords: Data asset; data service; data flow; data analysis; data analytics; customer needs; functional view; business goals
Executive Summary
PROPHESY's WP2 aims to design an engineering blueprint of the PROPHESY-PdM platform to serve as the baseline for its technological development and deployment. The engineering blueprint consists of a high-level architecture of the system of interest plus the technical infrastructure that supports this architecture.
The document at hand addresses the objectives that predictive maintenance (PdM) in PROPHESY needs to achieve, and derives from them the data (and its in-process and external sources), the data modelling needs, and the required functions. It also reflects the system qualities that most drive the system architecture. The deliverable focusses on what the system of interest shall accomplish; how the system will work and be built will be answered in work package 3.
Task 2.2, together with the remaining tasks of work package 2, sets the context for work packages 3-7. To this end, the document at hand contributes basic concepts for data collection and analytics, presents the data modelling standards and data analytics algorithms to be used, and gives referential advice on the data flows and functions needed for predictive maintenance. Finally, the document depicts the needs of PROPHESY's use cases so that the PROPHESY partners can understand what the systems to be built have to accomplish, and what benefits they offer their users.
Deliverable Leader: FHG
Contributors: ICARE, NOVAID, AIT, FHG, MONDRAGON, SENSAP, TUE, MMS,
PHILIPS, INTRA
Reviewers: PHILIPS, JLR, ICARE, All
Approved by: INTRA
Document History

Version | Date | Contributor(s) | Description
0.0 | 27.02.2018 | FHG, NOVAID | Very first draft.
0.1 | 09.03.2018 | FHG | ToC & contributors.
0.2 | 14.03.2018 | FHG | Initial contributions.
0.3 | 23.03.2018 | ICARE, NOVAID, AIT, FHG, MONDRAGON, SENSAP, TUE, MMS, PHILIPS, JLR | Incorporated feedback. Template revision.
0.4 | 28.03.2018 | FHG | Added to section 2.
0.5 | 29.03.2018 | SENSAP, FHG, JLR, PHILIPS | Added to sections 2, 4 and 5. Included review feedback.
1.0 | 30.03.2018 | FHG | Added to section 3. Updated acronym table. Finalized document (version 1.0).
1.1 | 04.05.2018 | FHG, JLR, PHILIPS | Updated UC function models.
1.2 | 17.05.2018 | AIT | Added new chapter "Specification of Data Collection, Sharing and Analytics" to the ToC and first content.
1.3 | 24.05.2018 | MMS, FHG | Included 3.1.1 (Data Collection Embedded in the Machines).
1.4 | 27.05.2018 | AIT, MONDRAGON, PHILIPS, FHG | Added content to chapter 3. Update of UC2. Updated acronym table.
1.5 | 28.05.2018 | PHILIPS, FHG | Revision of the use case overviews 1-3, of introductory parts, and of the conclusion.
1.6 | 29.05.2018 | JLR, FHG | Revision of the use case overviews 4-6.
2.0 | 30.05.2018 | FHG | Integrated review feedback. Improved picture quality. Finalized document (version 2.0).
2.01 | 04.06.2018 | ICARE, FHG | Integrated final feedback. Minor UC2 overview revision.
Table of Contents

EXECUTIVE SUMMARY ................................................. 3
TABLE OF CONTENTS ................................................. 5
TABLE OF FIGURES .................................................. 6
LIST OF TABLES .................................................... 6
DEFINITIONS, ACRONYMS AND ABBREVIATIONS ........................... 8
INTRODUCTION ...................................................... 10
1.1 THE PROPHESY VISION ........................................... 10
1.2 WP2 OVERVIEW .................................................. 11
1.3 TASK 2.2 OVERVIEW ............................................. 11
1.4 DOCUMENT PURPOSE AND AUDIENCE ................................. 11
1.5 DOCUMENT SCOPE AND APPROACH ................................... 12
1.6 DOCUMENT STRUCTURE ............................................ 14
FOUNDATIONS FOR DATA COLLECTION AND ANALYTICS ..................... 15
2.1 BIG DATA SOLUTION LIFE-CYCLE: CRISP-DM ........................ 15
2.1.1 Introduction ................................................ 15
2.1.2 Cross-Industry Standard Process for Data Mining (CRISP-DM) .. 16
2.1.3 CRISP-DM reference model .................................... 17
2.2 BUSINESS GOALS AND KEY QUALITIES .............................. 24
2.3 ISO 13374 ..................................................... 27
2.4 DATA SEMANTICS ................................................ 28
2.4.1 General Needs ............................................... 28
2.4.2 MIMOSA ...................................................... 30
SPECIFICATION OF DATA COLLECTION, SHARING AND ANALYTICS ........... 33
3.1 DATA COLLECTION SPECIFICATIONS ................................ 33
3.1.1 Data Collection embedded in the Machines .................... 33
3.1.2 Data Collection for Sensors and Field Devices ............... 35
3.1.3 Edge Gateway Data Collection ................................ 35
3.1.4 Data Collection from Maintenance Systems and Databases ...... 36
3.2 DATA ANALYTICS SPECIFICATIONS ................................. 36
3.2.1 Data Pre-processing ......................................... 37
3.2.2 Anomaly Detection ........................................... 38
3.2.3 Root Cause Analysis ......................................... 39
3.2.4 Remaining Useful Life ....................................... 39
3.2.5 Discovery of Rate Conditions ................................ 40
3.3 DATA SHARING AND INTEROPERABILITY SPECIFICATIONS .............. 40
3.3.1 Standards-based Digital Models for PROPHESY ................. 41
3.3.2 Data Sharing and Exchange Specifications .................... 41
3.3.3 Data Persistence Specifications ............................. 41
REFERENCE DATA FLOW AND FUNCTIONS ................................. 42
4.1 INFORMATION MODEL ............................................. 42
4.2 FUNCTION MODEL ................................................ 45
4.3 OVERVIEW TEMPLATE ............................................. 51
PHILIPS USE CASES (GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS) .. 53
5.1 PROPHESY UC1 (PHILIPS) ........................................ 54
5.2 PROPHESY UC2 (PHILIPS) ........................................ 55
5.3 PROPHESY UC3 (PHILIPS) ........................................ 56
JLR USE CASES (GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS) ...... 57
6.1 PROPHESY UC4 (JLR) ............................................ 58
6.2 PROPHESY UC5 (JLR) ............................................ 59
6.3 PROPHESY UC6 (JLR) ............................................ 60
CONCLUSION ........................................................ 61
REFERENCES ........................................................ 62
Table of Figures

FIGURE 1-1: FUNCTIONAL ANALYSIS AND THE ARCHITECTURE DESIGN PHASES [2] .............. 12
FIGURE 2-1: ESTIMATED GROWTH OF THE US DIGITAL UNIVERSE ............................. 15
FIGURE 2-2: PHASES OF THE CRISP-DM REFERENCE MODEL .................................. 17
FIGURE 2-3: CRISP-DM – PHASES, GENERIC TASKS AND OUTPUTS ............................ 18
FIGURE 2-4: HOW ORGANIZATIONS HANDLE DATA FLOW: A GIANT MESS (SOURCE: CONFLUENT) .. 28
FIGURE 2-5: OSA-EAI ARCHITECTURE ................................................... 31
FIGURE 2-6: OSA-CBM FUNCTIONAL BLOCKS CONFORM TO ISO-13374 .......................... 32
FIGURE 3-1: BRANKAMP X7 PROCESS MONITORING SYSTEM ................................... 33
FIGURE 3-2: ARTIS GENIOR MODULAR CPU-02 PROCESS MONITORING SYSTEM ................... 34
FIGURE 3-3: DATA ANALYTICS LEVELS FOR MAINTENANCE ................................... 37
FIGURE 4-1: INFORMATION MODEL ELEMENT CATEGORIES [10] ............................... 42
FIGURE 4-2: DATA STRUCTURES AND KINDS [10] .......................................... 43
FIGURE 4-3: DATA SOURCES AND PROCESSED DATA [10] .................................... 44
FIGURE 4-4: DATA SUBJECTS AND CONTENT [10] .......................................... 44
FIGURE 4-5: DATA CHARACTERISTICS [10] ............................................... 45
FIGURE 4-6: MANTIS FUNCTIONAL MODEL ................................................. 45
FIGURE 4-7: DATA PROCESSING BLOCK DIAGRAM FROM ISO 13374-2 .......................... 46
FIGURE 4-8: TEMPLATE FOR CAPTURING GOALS, FUNCTIONS, DATA, AND DATA SOURCES & FLOWS ... 52
FIGURE 5-1: PROPHESY UC1 (PHILIPS) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS .. 54
FIGURE 5-2: PROPHESY UC2 (PHILIPS) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS .. 55
FIGURE 5-3: PROPHESY UC3 (PHILIPS) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS .. 56
FIGURE 6-1: PROPHESY UC4 (JLR) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS ...... 58
FIGURE 6-2: PROPHESY UC5 (JLR) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS ...... 59
FIGURE 6-3: PROPHESY UC6 (JLR) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS ...... 60
List of Tables

TABLE 2-1: SELECTION OF QUALITIES RELATED TO PDM ........................ 26
TABLE 4-1: KEY MAINTENANCE FUNCTION: DATA ACQUISITION ................... 47
TABLE 4-2: KEY MAINTENANCE FUNCTION: DATA MANIPULATION .................. 47
TABLE 4-3: KEY MAINTENANCE FUNCTION: STATE DETECTION .................... 48
TABLE 4-4: KEY MAINTENANCE FUNCTION: HEALTH ASSESSMENT .................. 49
TABLE 4-5: KEY MAINTENANCE FUNCTION: PROGNOSTIC ASSESSMENT .............. 49
TABLE 4-6: KEY MAINTENANCE FUNCTION: ADVISORY GENERATION ................ 50
TABLE 4-7: KEY MAINTENANCE FUNCTION: MAINTENANCE PLANNING ............... 51
TABLE 4-8: KEY MAINTENANCE FUNCTION: MAINTENANCE EXECUTION .............. 51
Definitions, Acronyms and Abbreviations

Acronym/Abbreviation Title
ADT Anomaly Detection
AG Advisory Generation
B2MML Business To Machine Mark‐up Language
CAEX Computer Aided Engineering Exchange
CBM Condition-Based Maintenance
CCOM Common Conceptual Object Model
CM&D Condition monitoring and diagnostics
CMM Coordinate Measuring Machine
CNC Computer Numerical Control
CPPS Cyber‐Physical Production System
CPS Cyber‐Physical System
CPSoS Cyber‐Physical System of Systems
CRIS Common Relational Information Schema
CRISP‐DM Cross‐Industry Standard Process for Data Mining
CRUD Create, Read, Update, Delete
DA Data Acquisition
DBSCAN Density-Based Spatial Clustering of Applications with Noise
DIKW Data–Information–Knowledge–Wisdom
DM Data Manipulation
DMC Data Matrix Code
DPWS Device Profile for Web Services
DTA Digital Torque Adapter
EOL End Of Life
ETL Extract-Transform-Load
EXAMCE EXAct Method‐based Cluster Ensembles
FIPA Foundation for Intelligent Physical Agents
HA Health Assessment
HFML High Frequency Machine Learning (in the Cloud)
ICA Independent Component Analysis
IIC Industrial Internet Consortium
IIoT Industrial Internet‐of‐Things
IIRA Industrial Internet Reference Architecture
IoT Internet‐of‐Things
IRR Internal Rate of Return
JADE Java Agent Development Framework
JLR Project partner acronym: Jaguar Land Rover Limited
KNN k‐nearest neighbors algorithm
k‐NN k‐nearest neighbors algorithm
KPI Key Performance Indicator
LFML Low Frequency Machine Learning (on the Edge)
LSTM Long short‐term memory
ME Maintenance Execution
MEDA Missing-data methods for Exploratory Data Analysis
MMS Project partner acronym: MARPOSS Monitoring Solutions GmbH
MMT Manage My Tools (Siemens)
MP Maintenance Planning
mRMR Minimum Redundancy Maximum Relevance
MSPC Multivariate Statistical Process Control
MTBF Mean Time Between Failures
MTTR Mean Time to Repair
OEE Overall Equipment Effectiveness
oMEDA Variant of MEDA to connect observations and variables
OPC OLE for Process Control
OPC‐UA OPC Unified Architecture
OPR Offline Process Recorder
OSA-CBM Open System Architecture for Condition Based Maintenance
OSA-EAI Open Systems Architecture for Enterprise Application Integration
P&P Plug and Produce
PA Prognostic Assessment
PCA Principal Component Analysis
PdM Predictive Maintenance
PLM Product Lifecycle Management
PROPHESY‐AR PROPHESY‐Augmented Reality
PROPHESY‐CPS PROPHESY‐Cyber Physical System
PROPHESY‐ML PROPHESY‐Machine Learning
PROPHESY‐PdM PROPHESY‐Predictive Maintenance
PROPHESY‐SOE PROPHESY‐Service Optimization Engine
PRT Predicted Repair Time
QARMA Quantitative Association Rule Mining
QRC Quick Response Code
RAModel Reference Architectural Model
RCA Root Cause Analysis
RFLP Requirements, Functional, Logical, Physical
RPN Risk Priority Number
RUL Remaining Useful Life
SD State Detection
SOA Service-Oriented Architecture
SOAP Simple Object Access Protocol
SoI System of Interest
SoS System‐of‐Systems
TPM Total Productive Maintenance
UML Unified Modelling Language
WP Work Package
WS Web Service
Introduction

1.1 The PROPHESY Vision

Despite the proclaimed benefits of predictive maintenance, the majority of manufacturers still rely on preventive and condition-based maintenance approaches, which result in suboptimal OEE (Overall Equipment Effectiveness). This is mainly due to the challenges of predictive maintenance deployments, including:

- the fragmentation of the various maintenance-related datasets (i.e. data "silos");
- the lack of solutions that combine multiple sensing modalities for maintenance based on advanced predictive analytics;
- the fact that early predictive maintenance solutions do not close the loop to production as part of an integrated approach;
- the limited exploitation of advanced training and visualization modalities for predictive maintenance, such as Augmented Reality (AR) technologies;
- the lack of validated business models for deploying predictive maintenance solutions to the benefit of all stakeholders.

The main goal of PROPHESY is to lower the deployment barriers for advanced and intelligent predictive maintenance solutions by developing and validating (in factories) novel technologies that address these challenges.
In order to alleviate the fragmentation of datasets and to close the loop to the field,
PROPHESY will specify a novel CPS (Cyber Physical System) platform for predictive
maintenance, which shall provide the means for diverse data collection, consolidation and
interoperability, while at the same time supporting digital automation functions that will
close the loop to the field and will enable “autonomous” maintenance functionalities. The
project’s CPS platform is conveniently called PROPHESY‐CPS and is developed in the scope of
WP3 of the project.
In order to exploit multiple sensing modalities for timely and accurate predictions of maintenance parameters (e.g., RUL (Remaining Useful Life)), PROPHESY will employ advanced predictive analytics operating over data collected from multiple sensors, machines, devices, enterprise systems and maintenance-related databases (e.g., asset management databases). Moreover, PROPHESY will provide tools that facilitate the development and deployment of its library of advanced analytics algorithms. The analytics tools and techniques of the project will be bundled together in a toolbox named PROPHESY-ML, developed in WP4 of the project.
In order to leverage the benefits of advanced training and visualization for maintenance, including increased efficiency and safety of human-in-the-loop processes, the project will take advantage of an Augmented Reality (AR) platform. The AR platform will be customized for use in maintenance scenarios, with particular emphasis on remote maintenance. It will also be combined with a number of visualization technologies, such as ergonomic dashboards, as a means of enhancing workers' support and safety. The project's AR platform is conveniently called PROPHESY-AR.
In order to develop and validate viable business models for predictive maintenance deployments, the project will explore optimal deployment configurations of turn-key solutions, notably solutions comprising multiple components and technologies of the PROPHESY project (e.g., data collection, data analytics, data visualization and AR components in an integrated solution). The project will provide the means for evaluating such configurations against various business and maintenance criteria, based on relevant KPIs (Key Performance Indicators). PROPHESY's tools for developing and evaluating alternative deployment configurations form the project's service optimization engine, which we call PROPHESY-SOE.
1.2 WP2 Overview

PROPHESY's WP2 aims to design a technical draft of the platform to serve as a basis for its
technological development and deployment. The engineering blueprint consists of a high‐
level architecture of the system of interest and the technical infrastructure that supports the
architecture. It provides specifications for the main deliverables of the PROPHESY project,
along with the technical architecture of the PROPHESY‐CPS platform. Its main objectives
include the architecture and specifications of the PROPHESY‐CPS platform optimized for PdM
activities, of data collection and analytics mechanisms, and of the services and the service
optimization engine. It also specifies the project’s demonstrators.
1.3 Task 2.2 Overview

Task 2.2 focusses on the specifications of the mechanisms for sensor data collection and analytics to be used in PROPHESY. It specifies the sensors and data collection modalities to be supported and the machine learning and deep learning techniques to be implemented, and considers requirements such as performance, latency, scalability and extensibility. Special emphasis is paid to the specification of the data modelling standards and ontologies that will be used for data representation within PROPHESY. To this end, task 2.2 uses the MIMOSA ontology and extends it appropriately. Furthermore, the task specifies the detailed structure of the data assets comprising the PROPHESY-ML toolkit, including data analytics algorithms and tools. It places special emphasis on the specification of data sharing and interoperability, which will leverage the MIMOSA-based structures, along with data exchange and sharing in the PROPHESY-CPS platform. The task also specifies where, and for what purpose, data visualization takes place in the project.
1.4 Document Purpose and Audience

Task 2.2, together with the remaining tasks of work package 2, sets the context for work packages 3-7. To this end, the document at hand contributes basic concepts for data collection and analytics, and gives referential advice on the data flows and functions needed for predictive maintenance. Finally, the document depicts the needs of PROPHESY's use cases so that the PROPHESY partners can understand what the systems to be built have to accomplish, and what benefits they offer their users.
1.5 Document Scope and Approach

D2.2 specifies the sensor data collection and analytics mechanisms to be used in PROPHESY. To this end, it addresses the objectives that PdM in PROPHESY needs to achieve, and derives from them the data (and its in-process and external sources), the data modelling needs, and the required functions. D2.2 also reflects the system qualities (e.g. performance, latency, scalability and extensibility) that most drive the system architecture. The deliverable focusses on what the system of interest shall accomplish, and on the core logical architecture. How the system will work and be built will be answered in WP3.
Accordingly, D2.2 follows the first two steps of the RFLP approach (Requirements – Functional – Logical – Physical) as the baseline for model-based design with systems engineering, which enables close interaction and collaboration between the different engineering disciplines.
Figure 1-1: Functional analysis and the architecture design phases [2]
Figure 1-1 gives an overview of the approach D2.2 takes. The steps can be described as follows [3]:

Operational analysis / Requirements analysis

This step focuses on analysing the customers' needs, the goals to be reached by the system users, and the activities performed by the users. The output of this step is an "operational architecture" describing and structuring the needs in terms of actors/users with their operational capabilities and activities; operational use scenarios providing dimensioning parameters; and operational constraints including safety, security, system life cycle, and others.
Functional and non-functional need analysis

This step focuses on the system itself in order to define how it can satisfy the operational needs (captured in the operational analysis), along with its expected behaviour and qualities. This includes the definition of functional as well as non-functional requirements of the system, e.g. safety, security, and performance. At this step, requirements are considered at system-boundary level, not at the level of individual system components. Furthermore, role sharing and potential interactions between the system and its operators are of concern here. Outputs of this step mainly consist of the system functional need description; descriptions of interoperability and interaction with the users and probable external systems (functions, non-functional constraints, and interoperation); and system/SW requirements. Note that these two steps, which are a prerequisite for architecture definition, have a high impact on the design developed in the forthcoming steps, and should therefore be agreed with and validated by the customer.
Logical Architecture Analysis

This step identifies the system's parts (hereafter called components), their roles, relationships and properties, while excluding implementation and technical issues. This constitutes the logical architecture of the system. The output of this step is a logical architecture consisting of components, their interface definitions, and the functions allocated to them (along with their functional exchanges). Traceability links to the requirements and operational scenarios are established.
Physical Architecture Analysis

This step defines the "final" architecture of the system at the physical level, ready to be developed (by lower engineering levels). It therefore introduces rationalization, architectural patterns, and new technical services and components, and evolves the logical architecture according to implementation, technical and technological constraints or choices (at this level of engineering). The output of this step is the chosen physical architecture, consisting of the components to be produced and the way they are taken into account in the subsystem design. As with the logical analysis, traceability links towards the requirements and operational scenarios are established.
D2.2 focusses on the 'need understanding' part of Figure 1-1, that is, the goals and the functional view based on the expected system qualities. Hence, it addresses the flow of data, the major requirements, and the core logical architecture. It also provides specifications regarding the data collection, sharing and analytics to be used in PROPHESY.
1.6 Document Structure The document structure is as follows:
Section 1 Introduction details the document context, purpose and intended audience,
as well as, the overall strategy applied in WP2 while underlining the role played by this
document with respect to the whole project.
Section 2 Foundations for Data Collection and Analytics holds basic concepts and
elaborates on the Big Data Solution Life‐Cycle CRISP‐DM, on business goals and
qualities, on the ISO standard 13374 (Condition monitoring and diagnostics of
machines ‐ Data processing, communication and presentation), and on data semantics
including an overview on MIMOSA.
Section 3 Specification of Data Collection, Sharing and Analytics is devoted to
specifications regarding data collection, sharing and analytics, including information
about the data collection interfaces, the data modelling standards and data analytics
algorithms to be used.
Section 4 Reference Data Flow and Functions presents the information and function
model, and introduces a template for recording information of the use cases’ PdM
scenarios from the perspective of goals, functions, data, and data sources & flows.
Section 5 PHILIPS Use Cases uses the template from section 4.3 for detailing the needs
of the three PHILIPS use cases.
Section 6 JLR Use Cases uses the template from section 4.3 for detailing the needs of
the three JLR use cases.
Section 7 Conclusion provides the conclusion of this document.
2 Foundations for Data Collection and Analytics
2.1 Big Data Solution Life‐Cycle: CRISP‐DM
2.1.1 Introduction
Nowadays, the impact of big data on business is unquestionable. More and
more companies generate new knowledge from data, add value, and develop new business
models. Private companies and research institutions capture terabytes of data about
their users’ interactions, business, and social media, as well as from sensors in devices such
as mobile phones. The challenge of this era is to make sense of this data ocean.
Evolution of big data
Over the past ten years, many studies have examined the evolution of big data. For example, Eric Schmidt
of Google stated in 2010 that every two days we create as much information as we did
from the dawn of civilization up until 2003 [12]. In 2012, an IDC and EMC report stated that
the digital universe is doubling every two years and will reach 40,000 exabytes (40 trillion
gigabytes) by 2020 [13].
Figure 2‐1: Estimated growth of the US digital universe
Based on an IDC and EMC report, there are some further recent statistics on data:
90% of the world’s data has been created in the last two years
The big data market is projected to grow from $10.2 billion in 2013 to $53.4 billion by 2017
70% of the digital universe, approx. 900 exabytes, is generated by users
98% of global information is now digital, up from 25% in 2000
A 10% increase in data visibility means an additional $65.7 million for a typical Fortune 1000 company
But what is the reason for this growth? The mushrooming evolution of big data is due to
a) the increasing usage of IoT devices such as mobile devices, wireless sensors, RFID readers, etc.,
b) the continued growth of Internet usage and social networks, c) the falling costs of the
devices and hardware that create, capture, manage, protect, and store
information, and d) the growth of machine‐generated data [12].
Challenges
This evolution forces enterprises to develop effective methodologies and infrastructure to
gather, process, and harvest data. It is critical to create capabilities that can narrow
big data down to what is relevant and important, keeping only the information that matters most to
the business. Data that is not gathered, and moreover not processed, is useless.
Based on an IDC1 report, the percentage of information in the US digital universe
that would be useful if tagged and analyzed will grow considerably, to 40% by 2020 [12],[13].
In conclusion, the Big Data Revolution refers more to the capability of actually doing
something with the data, of making more sense out of it. In order to build a capability that can
achieve beneficial data targets, enterprises need to understand the data lifecycle and its
challenges at different stages. The best‐known methodology for data mining is the CRISP‐DM
methodology, which the next section describes.
2.1.2 Cross‐Industry Standard Process for Data Mining (CRISP‐DM)
The Cross‐Industry Standard Process for Data Mining (CRISP‐DM) provides a structured
approach to planning a data mining project. CRISP‐DM was conceived in 1996, and the next
year it got underway as a European Union project under the ESPRIT funding initiative [14].
The project was led by five companies: SPSS, Teradata, Daimler AG, NCR Corporation, and
OHRA (an insurance company). The project was finally incorporated into SPSS.
The CRISP‐DM methodology defines a hierarchical process model which consists of a set of tasks,
described at four abstraction levels (from general to specific):
Level 1: The data mining process is organized into six major phases, each of which consists of a
set of generic tasks.
Level 2: This level defines the generic tasks, which cover all possible data mining situations.
Level 3: This level defines specialized tasks that describe how the actions of the generic
tasks should be carried out in specific situations.
Level 4: The process instance is a record of the actions, decisions, and results of an actual
data mining engagement. Each process instance is organized according to the tasks
defined at the higher levels, but represents what actually happened in a particular
engagement, rather than what happens in general.
1 https://www.idc.com/
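The four abstraction levels can be pictured as a small hierarchy; the sketch below is purely illustrative (the class names and the example task are not part of the CRISP‐DM specification):

```python
from dataclasses import dataclass, field

@dataclass
class SpecializedTask:            # Level 3: situation-specific refinement
    name: str

@dataclass
class GenericTask:                # Level 2: covers all data mining situations
    name: str
    specializations: list = field(default_factory=list)

@dataclass
class Phase:                      # Level 1: one of the six major phases
    name: str
    generic_tasks: list = field(default_factory=list)

@dataclass
class ProcessInstance:            # Level 4: record of an actual engagement
    phase: str
    task: str
    decision: str

# Levels 1-3: the generic model, specialized for one concrete situation
modelling = Phase("Modelling", [
    GenericTask("Build model",
                [SpecializedTask("Train gradient-boosted trees")]),
])

# Level 4: what actually happened in one particular engagement
log = [ProcessInstance("Modelling", "Build model", "kept 200 trees, depth 4")]
```

The point of the hierarchy is that the process instance (Level 4) is always traceable back to the generic phases and tasks above it.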
Additionally, the CRISP‐DM methodology introduces a horizontal dimension, which distinguishes
the reference model and the user guide. The reference model presents a quick overview of the
phases, tasks, and their outputs; briefly, it describes what to do in a data mining project. The
user guide gives detailed guidance for each phase and each task within a phase, and describes how
to carry out a data mining project.
2.1.3 CRISP‐DM reference model
Figure 2‐2: Phases of the CRISP‐DM reference model
The reference model provides an overview of the data mining life cycle. This life cycle is divided
into six phases, and each phase is divided into tasks. The six phases, as shown in Figure 2‐2, are:
Business understanding
Data understanding
Data preparation
Modelling
Evaluation
Deployment
The sequence of phases is not strict; it is often necessary to move backward, and the output of
each phase determines which phase or task comes next. Figure 2‐3 presents the
six phases with their tasks and outputs.
2 http://crisp-dm.eu/
Figure 2‐3: CRISP‐DM – phases, generic tasks, and outputs
2.1.3.1 Business understanding
This phase deals with the business view of the project. Business understanding initially
focuses on understanding the project objectives and requirements, then converts this
knowledge into a data mining problem definition, and finally produces a preliminary plan designed to
achieve the objectives. The business understanding phase consists of four generic tasks:
Determine business objectives
Assess situation
Determine data mining goals
Produce project plan
3 http://crisp-dm.eu/
Determine business objectives
This task depicts what the customer really wants to accomplish from a business point of view.
Outputs:
Background: record the known information about the business situation.
Business objectives: describe the customer’s primary business objectives.
Business success criteria: from a business point of view, describe the criteria for a successful or useful outcome of the project; these should be specific enough to be measured objectively.
Assess situation
This task involves a more detailed investigation of the resources, constraints, assumptions, and other factors that affect the data analysis goal and the project plan.
Outputs:
Inventory of resources: list the available resources, such as personnel, data, computing resources, and software.
Requirements, assumptions, and constraints: list the project requirements, such as the completion schedule, quality of results, and security and legal issues; make sure that you are allowed to use the data.
Risks and contingencies: list the risks or events that might delay the project or cause it to fail, together with the plans and actions to be taken if these risks materialize.
Terminology: define a glossary of terminology relevant to the project: a glossary of business terminology and a glossary of data mining terminology.
Costs and benefits: construct a cost‐benefit analysis for the project: compare the project costs with the potential benefits to the business.
Determine data mining goals
This task translates the business goals into data mining goals.
Outputs:
Data mining goals: describe the intended outputs of the project that achieve the business objectives.
Data mining success criteria: define the criteria for a successful outcome of the project in technical terms.
Produce project plan
This task defines a plan for achieving the data mining goals, specifying the steps to be performed during the project, including the initial selection of tools and techniques.
Outputs:
Project plan: list the stages to be executed in the project, including their duration, required resources, inputs, outputs, and dependencies; analyse the dependencies between the time schedule and the risks.
Initial assessment of tools and techniques: an initial assessment of the tools and techniques to be used.
2.1.3.2 Data understanding
The data understanding phase starts with an initial data collection and proceeds with
activities that enable you to become familiar with the data, identify data quality problems,
discover first insights into the data, and/or detect interesting subsets to form hypotheses
regarding hidden information.
Collect initial data
This task acquires the data listed in the project resources; it also includes data loading, if necessary for data understanding.
Outputs:
Initial data collection report: list the acquired datasets with their locations, the acquisition methods, the problems encountered, and the resolutions to these problems.
Describe data
This task examines the “gross” or “surface” properties of the acquired data and reports on the results.
Outputs:
Data description report: a description of the acquired data, including the data format, quantity, the identities of the fields, and any other discovered surface features.
Explore data
This task addresses data mining questions using querying, visualization, and reporting techniques. These include the distribution of key attributes, relationships, results of simple aggregations, properties of significant sub‐populations, and simple statistical analyses.
Outputs:
Data exploration report: a description of the results, such as first findings or initial hypotheses; graphs and plots can be included where appropriate.
Verify data quality
This task examines the quality of the data, addressing questions such as: is the data complete and correct? Does it contain errors and, if so, how common are they?
Outputs:
Data quality report: list the results of the data quality verification; if there are quality problems, define possible solutions.
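The ‘Verify data quality’ task can be sketched as a small completeness-and-correctness check over tabular records; the field names, plausibility rule, and sample rows below are invented for illustration only:

```python
def verify_data_quality(records, required_fields):
    """Check completeness and simple correctness of a list of record
    dicts and return a small data quality report."""
    report = {"missing": 0, "errors": 0, "rows": len(records)}
    for row in records:
        # completeness: every required field must be present and non-null
        for field in required_fields:
            if row.get(field) is None:
                report["missing"] += 1
        # illustrative correctness rule: temperatures must be plausible
        t = row.get("temperature_c")
        if t is not None and not (-40 <= t <= 200):
            report["errors"] += 1
    return report

rows = [
    {"machine": "M1", "temperature_c": 72.5},
    {"machine": "M2", "temperature_c": None},   # incomplete
    {"machine": "M3", "temperature_c": 999.0},  # implausible
]
report = verify_data_quality(rows, ["machine", "temperature_c"])
# report -> {'missing': 1, 'errors': 1, 'rows': 3}
```

A real data quality report would also record the proposed resolutions, as required by the output description above.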
2.1.3.3 Data preparation
The data preparation phase covers all activities needed to construct the final dataset from
the initial raw data. The tasks of the preparation phase may be performed multiple times and
in no predefined order. This phase produces a list of datasets and their descriptions.
Select data
This task decides which data will be used for analysis, based on criteria such as relevance to the data mining goals, quality, and technical restrictions (volume or datatype limitations).
Outputs:
Rationale for inclusion/exclusion: list the data to be included or excluded and the reasons for these decisions.
Clean data
This task raises the data quality to the level required by the selected analysis techniques. This can involve the selection of clean subsets, the definition of default values, or other techniques such as the estimation of missing data.
Outputs:
Data cleaning report: describe what actions were taken to address the data quality problems reported during the ‘Verify data quality’ task, including any data transformations made for cleaning purposes; the impact on the analysis results should be considered.
Construct data
This task includes constructive data preparation operations, such as the production of derived attributes, entirely new records, or transformed values for existing attributes.
Outputs:
Derived attributes: new attributes that are constructed from one or more existing attributes in the same record.
Generated records: a description of the creation of completely new records.
Integrate data
This task includes methods for combining multiple datasets into new records.
Outputs:
Merged data: merged tables, aggregations.
Format data
Formatting transformations are syntactic modifications that do not change the meaning of the data but might be required by the modelling tool.
Outputs:
Reformatted data: the reformatted data.
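The ‘Construct data’ and ‘Format data’ tasks can be sketched as small transformation steps; the attribute names (voltage, current, derived power) are made up for illustration:

```python
def construct_data(records):
    """'Construct data': derive a new attribute from existing
    attributes in the same record."""
    for row in records:
        # derived attribute: electrical power from voltage and current
        row["power_w"] = row["voltage_v"] * row["current_a"]
    return records

def format_data(records):
    """'Format data': a purely syntactic change (everything becomes a
    string, e.g. for a tool that expects text); the meaning is untouched."""
    return [{key: str(value) for key, value in row.items()} for row in records]

prepared = format_data(construct_data([{"voltage_v": 230.0, "current_a": 2.0}]))
```

Keeping each preparation step as a separate function mirrors the CRISP‐DM advice that these tasks may be repeated in any order.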
2.1.3.4 Modelling
In this phase, the appropriate modelling techniques are selected and applied, and their
parameters are calibrated to optimal values. Usually, several techniques are available for the
same data mining problem type, and some techniques have specific requirements on the form of the
data. For this reason, moving back to the data preparation phase is often necessary.
Select modelling technique
This task selects the modelling technique to be used. Although a tool may already have been selected during the business understanding phase, this task refers to the specific modelling technique. If multiple techniques are applied, this task is repeated for each of them.
Outputs:
Modelling technique: document the actual modelling technique that is to be used.
Modelling assumptions: record any assumptions made by the modelling technique.
Generate test design
This task generates a procedure or mechanism to test the model’s quality and validity before the model is built.
Outputs:
Test design: define and describe a plan for training, testing, and evaluating the models.
Build model
This task runs the modelling tool on the prepared dataset to create one or more models.
Outputs:
Parameter settings: list the necessary parameters and their chosen values.
Models: the actual models produced by the modelling tool.
Model descriptions: a report on the interpretation of the models, documenting any difficulties encountered with their meanings.
Assess model
The data mining engineer interprets the models according to domain knowledge, the data mining success criteria, and the desired test design, judges the technical success of the application of the modelling and discovery techniques, and ranks and compares the models according to the evaluation criteria.
Outputs:
Model assessment: summarize the results of this task, list the qualities of the generated models (e.g., in terms of accuracy), and rank their quality in relation to each other.
Revised parameter settings: according to the model assessment, revise the parameter settings and tune them for the next run of the ‘Build model’ task.
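The build-assess-revise cycle can be sketched with a deliberately trivial threshold “model”; the data points and candidate parameter values below are invented for illustration and stand in for a real modelling tool:

```python
def build_model(threshold):
    """'Build model': return a one-parameter classifier."""
    return lambda x: x > threshold

def assess_model(model, test_set):
    """'Assess model': score the model by accuracy on the test
    design's labelled data."""
    hits = sum(1 for x, label in test_set if model(x) == label)
    return hits / len(test_set)

# labelled (value, expected class) pairs from the test design
test_set = [(0.2, False), (0.4, False), (0.7, True), (0.9, True)]

# try several parameter settings, rank them, and revise towards the best
scores = {t: assess_model(build_model(t), test_set) for t in (0.3, 0.5, 0.8)}
best_threshold = max(scores, key=scores.get)
# best_threshold == 0.5 with accuracy 1.0
```

The dictionary of scores corresponds to the ‘Model assessment’ output, and the chosen threshold to the ‘Revised parameter settings’ output.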
2.1.3.5 Evaluation
This phase focuses on the evaluation and review of the created model, to ensure that
it achieves the business objectives. A key objective is to determine whether there is some important
business issue that has not been sufficiently considered. At the end of this phase, a decision
on the use of the data mining results should be reached.
Evaluate results
This task assesses the degree to which the model meets the business objectives and seeks to determine whether there is some business reason why the model is deficient. It also assesses the other data mining results generated: models that are necessarily related to the original business objectives, and all other findings that are not, but that might unveil additional challenges, information, or hints for future directions.
Outputs:
Assessment of data mining results with respect to business success criteria: summarize the assessment results in terms of the business success criteria, including a final statement on whether the project already meets the initial business objectives.
Approved models: the generated models that meet the selected criteria become the approved models.
Review process
This task reviews the data mining process to determine whether any important factor or task has somehow been overlooked and whether the quality assurance issues are covered.
Outputs:
Review of process: summarize the process review and highlight activities that have been missed and those that should be repeated.
Determine next steps
This task defines the next steps based on the assessment and the process review. These steps include finishing the project and deploying the results, initiating further iterations, or setting up new data mining projects. The task also includes an analysis of the remaining budget and resources.
Outputs:
List of possible actions: a list of the potential further actions and the reasons for each option.
Decision: a description of the decision on how to proceed.
2.1.3.6 Deployment
Depending on the requirements, the deployment phase can be as simple as generating a
report or as complex as implementing a repeatable data mining process across the enterprise.
In many cases it is the customer, not the data analyst, who carries out the deployment steps. In
any case, it is important for the customer to understand up front what actions need to be
carried out in order to actually make use of the created models.
Plan deployment
This task takes the evaluation results and determines a strategy for deployment. If a general procedure for creating the relevant model(s) has been identified, this procedure is documented here for later deployment.
Outputs:
Deployment plan: summarize the deployment strategy, including the necessary steps and how to perform them.
Plan monitoring and maintenance
This task defines a plan for monitoring and maintenance. The maintenance strategy helps to avoid unnecessary and/or incorrect usage of the data mining results.
Outputs:
Monitoring and maintenance plan: summarize the monitoring and maintenance strategy, including the necessary steps and how to perform them.
Produce final report
In this task, the project team writes the final report. This report may be only a summary or a final and comprehensive presentation of the data mining results.
Outputs:
Final report: the final written report of the data mining engagement; it includes all the previous deliverables, summarizing and organizing the results.
Final presentation: usually a meeting at the end of the project at which the results are presented to the customer.
Review project
This task assesses what went right and what went wrong, what was done well, and what needs to be improved.
Outputs:
Experience documentation: summarize the important experience gained during the project.
2.2 Business Goals and Key Qualities
In general, we can distinguish between four main classes of architecture drivers: business
goals, functional requirements, constraints, and quality requirements. Each of these classes
might have its individual stakeholders that articulate concerns belonging to that particular
class.
Business Goals
Business goals are the first (and most abstract) class of architectural drivers. They are goals
that are important for the overall enterprise that is developing the respective
architecture or has placed an order to build the system. Examples of business goals are time
to market (denoting the strategy in terms of time), the market scope, or costs.
Functional Requirements
Functional requirements are drivers for the architecture as well. However, they differ in their
influence: some drive the architecture and some do not. In some
sense, the functional requirements that make the product unique and worth building are the
ones that influence the architecture development the most.
Quality Requirements
Quality is not only about the correctness of functionality. Successful software systems also have to
assure properties such as performance, security, extensibility, maintainability, and
so forth. In general, we distinguish between run‐time and development‐time quality
attributes. Run‐time quality attributes can be measured by observing the respective system in
operation; examples are performance, security, safety, availability, and reliability.
Development‐time quality attributes can be measured by observing a team in operation;
examples are extensibility, modifiability, and portability.
Constraints
One important but easily overlooked input for software and systems design is the set of constraints
that influence the design decisions of subsequent steps. Constraints can be organizational,
technical, regulatory, or political. Making them explicit provides a solid basis for subsequent
decision making in design.
Typical business goals in the field of maintenance are:
Analyse product performance remotely
Improve quality
Increase equipment lifespan
Increase operational efficiency
Reduce downtime
Reduce idle time
Reduce maintenance cost
Reduce production cost
Reduce rework & scrap
Increase customer satisfaction
Increase worker mobility
Increase worker training efficiency
Reduce accidents
Reduce energy consumption
Reduce start‐up time
Reduce stock
Reduce time to process orders
Reduce time to market
Consolidate customer links
Integrate real‐time customer feedback
Table 2‐1 lists a selection of possibly relevant qualities influencing IoT architectures w.r.t.
maintenance processes.
Table 2‐1: Selection of qualities related to PdM
Quality Category Quality Goal
Adaptability Adaptability for different environments
Adaptability Adaptability for new components
Adaptability Use of preferred technologies
Availability Concurrent access on data
Availability Planned downtimes of system parts
Availability Unplanned downtimes of system parts (software)
Availability Unplanned downtimes of system parts (hardware)
Latency Require real‐time data access for certain scenarios
Performance The performance of PdM networks must support services' criticality
Performance The performance of production processes must not be influenced negatively by PdM
Robustness Robustness against unstable network conditions
Robustness Robustness against no network conditions
Safety Monitored processes and sites are not influenced w.r.t. safety
Scalability Scaling over large number of data generating devices
Security Separation of views for different MANTIS users
Security Overall data usage control
Security Security levels and zones
Upgradeability No data loss during upgrade of system components
… …
For concrete maintenance‐related scenarios, the qualities have to be operationalized in order to be
measurable. To this end, the environment, stimulus, and system response have to be described
in an adequate manner.
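Such an operationalized quality is often written as a scenario with an environment, a stimulus, and a measurable response. The sketch below illustrates the idea; the concrete scenario is hypothetical and not taken from the PROPHESY use cases:

```python
from dataclasses import dataclass

@dataclass
class QualityScenario:
    quality: str
    environment: str   # the situation in which the scenario applies
    stimulus: str      # what happens to the system
    response: str      # the measurable system reaction

scenario = QualityScenario(
    quality="Availability",
    environment="normal production, one edge gateway temporarily offline",
    stimulus="sensor data cannot reach the PdM platform",
    response="data is buffered locally and delivered within 5 minutes "
             "after reconnection; no samples are lost",
)
```

Because the response is phrased as a measurable condition (a time bound and a loss bound), the scenario can be verified by testing rather than by opinion.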
2.3 ISO 13374
ISO 13374 [4] consists of the following parts, under the general title "Condition monitoring
and diagnostics of machines ‐ Data processing, communication and presentation": Part 1:
General guidelines; Part 2: Data processing; Part 3: Communication; Part 4: Presentation.
The various computer software systems written for condition monitoring and diagnostics
(CM&D) of machines that are currently in use cannot easily exchange data or operate in a
plug‐and‐play fashion without an extensive integration effort. This makes it difficult to
integrate systems and provide a unified view of the condition of machinery to users. The
intent of ISO 13374 Parts 1 through 4 is to provide the basic requirements for open CM&D
software architectures which will allow CM&D information to be processed, communicated,
and displayed by various software packages without platform‐specific or hardware‐specific
protocols.
ISO 13374‐1 (General Guidelines) establishes general guidelines for software specifications
related to data processing, communication, and presentation of machine condition
monitoring and diagnostic information.
ISO 13374‐2 (Data Processing) details the requirements for a reference information model
and a reference processing model to which an open CM&D architecture needs to conform.
Software designers require both an information model and a processing model to adequately
describe all data processing requirements. ISO 13374‐2 facilitates the interoperability of
CM&D systems. It standardizes the reference information model and reference‐processing
model for an open CM&D architecture. ISO 13374‐2 describes the required content for the
five layers (L1: semantic definitions, L2: a non‐proprietary conceptual information model or
'schema', L3: implementation data model, L4: reference data library, and L5: data document
definitions) of the open CM&D information architecture, which describes all the data objects,
types, relationships, etc. for a given system. ISO 13374‐2 defines the six key processing blocks,
namely Data Acquisition (DA), Data Manipulation (DM), State Detection (SD), Health
Assessment (HA), Prognostic Assessment (PA), and Advisory Generation (AG). As an
informative annex, ISO 13374‐2 provides compliant UML, XML, and Middleware service
specifications as well.
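The six processing blocks form a pipeline from raw measurements to maintenance advisories. The sketch below only mirrors the block names (DA, DM, SD, HA, PA, AG) defined in ISO 13374‐2; the function bodies, sample values, and thresholds are invented placeholders, not part of the standard:

```python
def data_acquisition():                    # DA: digitized raw sensor output
    return [0.9, 1.1, 1.0, 3.2]

def data_manipulation(samples):            # DM: signal processing / feature extraction
    return sum(samples) / len(samples)

def state_detection(feature, limit=1.5):   # SD: compare against operational limits
    return "abnormal" if feature > limit else "normal"

def health_assessment(state):              # HA: diagnose the current health grade
    return "degraded" if state == "abnormal" else "healthy"

def prognostic_assessment(health):         # PA: project remaining useful life (hours)
    return 100 if health == "healthy" else 20

def advisory_generation(remaining_life):   # AG: recommend maintenance actions
    return "schedule maintenance" if remaining_life < 50 else "no action"

# chaining the blocks yields the DA -> DM -> SD -> HA -> PA -> AG data flow
advice = advisory_generation(
    prognostic_assessment(
        health_assessment(
            state_detection(
                data_manipulation(
                    data_acquisition())))))
```

The split between the data‐oriented blocks (DA, DM, SD) and the analysis‐oriented blocks (HA, PA, AG) is exactly the boundary at which, per ISO 13374‐3, the technologies and software typically differ.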
ISO 13374‐3 (Communication) specifies requirements for data communication for an open
CM&D reference information architecture and for a reference processing architecture.
Software design professionals require communications to be defined for exchange of CM&D
information between software systems. This part of ISO 13374 facilitates the interoperability
of CM&D systems. The technologies and software used by the data‐oriented processing
blocks (DA, DM, and SD) often vary from those used by the analysis‐oriented processing blocks
(HA, PA, and AG) (cf. Figure 4‐7). ISO 13374‐3 states that a UML model, compliant with ISO/IEC
19501 [5], shall support the open CM&D data‐processing communications. It defines the
block processing methods and interface types that should be utilized by each data‐processing
block defined in ISO 13374‐2. ISO 13374‐3 also contains an informative annex of an open
CM&D information architecture.
ISO 13374‐4 (Presentation) details the requirements for presentation of information for
technical analysis and decision support in an open CM&D. Software design professionals need
to present diagnostic/prognostic data, health information, advisories, and recommendations
on computer displays and in written report formats to end‐users. This part of ISO 13374
provides standards for the display of this information in CM&D systems.
2.4 Data Semantics
Most organizations are facing an explosion of data coming from new applications, new
business opportunities, the IoT, and more. The ideal architecture most envision is a clean,
optimized system that allows businesses to capitalize on all that data.
Figure 2‐4: How organizations handle data flow: a giant mess (source: Confluent)4
However, traditional systems used for solving these problems were designed in an era that
predates large‐scale distributed systems, and lack the ability to scale to meet the needs of the
modern data‐driven organization4.
The evaluation of these service gaps in existing big data systems results in the necessity of a
higher‐level semantic representation of data context and taxonomy.
2.4.1 General Needs
In complex systems (such as cyber‐physical systems), merging data from heterogeneous
information sources is one of the biggest challenges. The data is generated in different
formats and across different levels of an ecosystem. It is then stored in distributed
clouds or local storage systems and made available via a variety of protocols from all end
points of the data generation and consumption chain. The integration of subordinate
resources can pose challenges on several levels:
4 https://www.confluent.io/product/confluent-platform/
Interoperability of protocols / connectivity: components use different protocols or
interfaces; a platform must provide the means to integrate them effectively into an
existing ecosystem
Format incompatibilities: there is a variety of data formats, from simple numerical
one‐dimensional streams to complex data structures
Structuring: merging of structured and unstructured data sources
Encoding: proprietary components and software often use different interfaces and
languages
To support data merging and processing, it makes sense to use a semantic network of devices,
instances, and relations. In this context, semantic metadata
facilitates communication across platforms and different architectures (simplifying
mappings and translations),
simplifies the manipulation of information (systematic processes involving semantic
metadata can create synergies in the data analysis and pre‐processing workflows),
decouples data from its description and makes data sources more comparable,
facilitates data visualization and human‐machine interaction services.
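Decoupling a data stream from its semantic description can be sketched very simply: the stream carries only values, while a separate metadata record describes what they mean. The vocabulary below (quantity, unit, sensor type, location) is an illustrative stand‐in for a real ontology, and the identifiers are invented:

```python
# raw stream as it might arrive from a device
stream = {"id": "plc7/ch3", "values": [20.1, 20.4, 19.8]}

# semantic metadata kept separately, decoupling the data from its description
metadata = {
    "plc7/ch3": {
        "quantity": "temperature",
        "unit": "degC",                 # SI-derived unit
        "sensor_type": "PT100",
        "location": "press line 2, hydraulic unit",
    }
}

def describe(stream, metadata):
    """Merge a raw stream with its semantic description so a data
    consumer sees self-describing data."""
    return {**stream, **metadata.get(stream["id"], {})}

enriched = describe(stream, metadata)
```

Because the description lives outside the stream, the same metadata catalogue can annotate many sources and be mapped onto external ontologies without touching the data itself.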
In an industrial scenario, businesses either outsource their data analysis or employ data
experts to explore and exploit the potential of their data. Often these data consumers are
barely, or not at all, familiar with the underlying processes behind the data provided, and the
acquisition of this knowledge requires a costly exchange of information between the data
provider (the industrial business) and the consultant or data analyst. The industry partner may
know what result he is looking for and can provide large amounts of potentially relevant
information. However, this becomes problematic as the collection of all data that
contextualizes the origin of this information becomes a hunt for clues distributed across the
different levels of the organization. In addition, the analyst faces the challenge of selecting
and distinguishing between all available data sources, which can prolong the data mining process or lead
to irrelevant results. For example, maintenance personnel and floor managers have
comprehensive insights into the functioning of machines and causality chains in a production
line. This expertise is rarely grouped or integrated with the original data, and providing
contextual information to the analyst becomes a tedious process. This knowledge covers
various aspects of the data, from its composition (SI units, nominal values, sensor
type, reference values, etc.) to its origin and location in an interlinked process. Therefore,
data analysis can easily take up to six weeks, only a small portion of which is allocated to
developing the final data mining model. To avoid costly and inefficient intermediate processes,
information management strategies must focus on contextual information and metadata. The
integration of metadata networks leads to semantic webs with high‐level resource
representations and can improve interoperability in large data ecosystems [6][7]. Adding
metadata to the available data sources on a network is an effective way to virtualize
connections between devices in cyber‐physical systems. Data management solutions should
not only aim to meet data storage and scalability requirements, as data semantics in complex
data ecosystems becomes more and more relevant and sensitive, especially as the amount of
data in a network increases. In addition to improved human interaction, semantic labelling
offers great added value for the available data.
The mathematical resources acquire a higher representation layer, which can also be
processed to gain insights and uncover relationships that escape common statistical
processes. This means not only a richer connectivity network in terms of ontologies and
dependencies in the production line, but also a more meaningful communication channel
between data providers and consumers. In addition, modern technologies such as cognitive
computing and knowledge discovery in databases point to an increasing trend in which data
analysis relies on underlying semantic networks that improve the discovery of relevant and
high‐quality data sources in smart IoT environments [8][9]. This leads to an urgent need to semantically
enrich IoT environments with labelled data and information maps that can link the resources
available in an industrial network with internal or external knowledge databases and
ontologies.
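As a minimal sketch of this idea, the snippet below attaches contextual metadata (SI unit, sensor type, asset location, link to a shared ontology term) to otherwise cryptic signal tags. All tag names, assets and URIs are illustrative assumptions, not part of any PROPHESY interface:

```python
def describe_source(tag, unit, sensor_type, asset, concept_uri):
    """Wrap a raw signal tag with contextual metadata (hypothetical schema)."""
    return {
        "tag": tag,                # raw identifier in the data historian
        "unit": unit,              # SI unit, so values become self-describing
        "sensorType": sensor_type,
        "asset": asset,            # location of the source in the process chain
        "sameAs": concept_uri,     # link into a shared ontology term
    }

catalog = [
    describe_source("P3.spindle.vib", "mm/s", "vibration",
                    "press-3/spindle", "http://example.org/onto#Vibration"),
    describe_source("P3.motor.tmp", "degC", "temperature",
                    "press-3/motor", "http://example.org/onto#Temperature"),
]

# Consumers can discover sources by meaning rather than by cryptic tag names.
vibration_sources = [s for s in catalog if s["sameAs"].endswith("#Vibration")]
```

With such a catalogue in place, an analyst can query data sources by semantic concept instead of hunting for tag names across the organization.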
2.4.2 MIMOSA
MIMOSA™ [1] is a non‐profit industry association focused on open standards and the
facilitation of full life‐cycle asset management. This rests on digital assets that accurately
reflect the physical entities they represent. Real‐time control and maintenance depend
heavily on the integration of information about such complex physical assets. MIMOSA helps
to establish a basis for a more integrated approach, combining full life‐cycle engineering
with operation and maintenance activities.
The MIMOSA Open System Architecture (OSA) specifications provide a series of interrelated
information standards. Here we introduce the two most relevant:
MIMOSA OSA‐EAI (OSA for Enterprise Application Integration)5 defines data structures for
storing and moving collective information about all aspects of equipment, including platform
health and future capability, into enterprise applications. This includes the physical
configuration of platforms as well as reliability, condition, and maintenance of platforms,
systems, and subsystems.
OSA‐EAI creates an information backbone to facilitate the integration of asset management
by providing (1) an information exchange standard to allow sharing asset registry, condition,
maintenance and reliability information between enterprise systems, and (2) a relational
database model to allow storage of the same asset information (see Figure 2‐5: OSA‐EAI
Architecture).
(1) To accommodate various types of applications and integration scenarios, the OSA‐EAI supports the exchange of XML files over multiple data transport options including files (Tech‐Doc), HTTP (Tech‐XML‐Web), and SOAP Web Services (Tech‐XML‐Services). The
5 http://www.mimosa.org/mimosa-osa-eai
Web Service definitions are sufficiently granular such that they can be used in a Service‐Oriented Architecture (SOA).
(2) The relational database model (Common Relational Information Schema – CRIS) is represented as a logical and physical model, with SQL scripts targeted for Oracle Database and Microsoft SQL Server. The SQL scripts include both creating the database schema and inserting the MIMOSA Reference Data, the latter of which can be extended to support unique project or organizational requirements. Tech‐CDE‐Services provides an efficient mechanism to manage a CRIS database via Web Services.
While CRIS provides a means to store enterprise operation and maintenance information, the
Common Conceptual Object Model (CCOM) provides a foundation for all MIMOSA standards.
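To illustrate the exchange style (not the actual MIMOSA schemas), the sketch below serializes a single condition reading as an XML document that could be shipped as a file or posted over HTTP. The element names and the asset identifier are invented for illustration; real OSA‐EAI Tech‐XML documents follow MIMOSA's published schemas:

```python
import xml.etree.ElementTree as ET

def measurement_to_xml(asset_id, meas_type, value, unit, utc):
    """Serialize one condition reading for file- or HTTP-based exchange.

    Element names here are illustrative only; they do NOT reproduce the
    published OSA-EAI schemas."""
    root = ET.Element("AssetMeasurement")
    ET.SubElement(root, "AssetID").text = asset_id
    ET.SubElement(root, "Type").text = meas_type
    val = ET.SubElement(root, "Value", unit=unit)   # carry the unit as attribute
    val.text = str(value)
    ET.SubElement(root, "TimestampUTC").text = utc
    return ET.tostring(root, encoding="unicode")

doc = measurement_to_xml("PUMP-17", "vibration-rms", 4.2, "mm/s",
                         "2018-05-31T10:15:00Z")
```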
Figure 2‐5: OSA‐EAI Architecture
MIMOSA OSA‐CBM (Open System Architecture for Condition‐Based Maintenance)6
specification is a reference architecture and framework for implementing condition‐based
maintenance systems. The goal is that standardization of information exchange specifications
would ideally facilitate the integration and interchangeability of CBM components from a
variety of sources. In short,
it describes a standardized information delivery system for condition‐based monitoring,
it defines the information that is moved around and how to move it, and
it has built‐in metadata to describe the processing that is occurring.
The OSA‐CBM is defined using the Unified Modelling Language (UML) to separate the
information from the technical interfaces used to communicate the information.
6 http://www.mimosa.org/mimosa-osa-cbm
In addition, MIMOSA OSA‐CBM is compliant with, and forms the informative reference to, the
ISO 13374‐1 standard for machinery diagnostic systems. ISO 13374, Condition Monitoring
and Diagnostics of Machines, defines the six blocks of functionality in a condition monitoring
system, as well as the general inputs and outputs of those six blocks (see Figure 2‐6). OSA‐
CBM is an implementation of the ISO‐13374 functional specification. It adds data structures
and defines interface methods for the functionality blocks defined by the ISO standard.
Figure 2‐6: OSA‐CBM Functional Blocks conform to ISO 13374
In relation to OSA‐EAI, OSA‐CBM reuses many of the data elements that are defined by
OSA‐EAI, and future releases aim to map OSA‐CBM fully into OSA‐EAI.
3 Specification of Data Collection, Sharing and Analytics
In this section, we provide a set of specifications that will drive data collection and analytics
as part of PROPHESY‐ML. Moreover, a set of specifications for data sharing and
interoperability is provided.
3.1 Data Collection Specifications
PROPHESY will support data collection from a variety of systems that contain maintenance‐
related data. To this end, it will provide the means for interfacing to different types of systems
as a means of consolidating previously fragmented data that reside in multiple systems (i.e.
“data silos”).
3.1.1 Data Collection embedded in the Machines
For the PROPHESY demonstrators, the monitoring of process information is one of the main
tasks.
At Philips, two Brankamp MMS X5 systems are already installed at the demonstrator site. To
allow extended process monitoring through the integration of further sensors and of
interfaces to PROPHESY, an upgrade to Brankamp X7 systems is necessary.
Figure 3‐1: Brankamp X7 process monitoring system
Figure 3‐1 shows a Brankamp X7 system as well as some possible process curves. The X7
system provides up to 24 channels for extended process monitoring. The HMI part of the
system runs on a Windows operating system, so it can easily connect to other PROPHESY
components. Furthermore, the X7 Cockpit provides a switchable mask design with flexible
arrangement of the monitoring channels (according to the machine configuration). Binary
input signals can be monitored with up to three monitoring windows to ensure the earliest
possible fault detection. The failure distribution shows machine downtimes and the
frequency of process failures for a quick and easy failure analysis.
The X7 system has different options to transmit the process data to the PROPHESY platform.
For the Philips demonstrator, a standard PC (Edge‐PC) will be used to gather all data from the
X7 and provide the data in a secure way to PROPHESY (see Deliverable 7.1). The LFML as well
as further data processing could be performed at this Edge‐PC.
At JLR, all machines at the demonstrator site are already equipped with Artis Genior
Modular CPU‐01 process monitoring devices. To allow extended process monitoring through
the integration of further sensors and of interfaces to PROPHESY, an upgrade to Artis
Genior Modular CPU‐02 devices is mandatory.
Figure 3‐2: Artis Genior Modular CPU‐02 process monitoring system
Figure 3‐2 shows an Artis Genior Modular CPU‐02 system as well as the modular system
structure of the Genior system. Genior Modular can simultaneously monitor and visualize up
to 24 signals and 10 channels. The Multiview display is ideal for the simultaneous monitoring
of multiple spindles, axles and other equipment values. It shows the entire machining
situation at a glance. Genior Modular is easy to install and to integrate in machine controls.
Depending on the area of application, visualization and operation can be done alternatively
via the control or an external system (Windows or Linux). The central evaluation unit can be
upgraded with various measuring transducers to operate the system with sensors and can be
modularly expanded at any time. Thus, Genior Modular is prepared for a huge range of
requirements.
Similar to the X7 system, Genior Modular also provides different options to transmit the
process data to the PROPHESY platform. For the JLR demonstrator, an OPR‐Edge device will
be used to gather and store all data from the Genior Modular and provide the data in a
secure way to PROPHESY (see Deliverable 7.1). The LFML as well as further data processing
could be performed at this OPR‐Edge device.
Both systems come with an internal binary data format. To transform the binary format into
other formats (e.g., CSV), a converter is provided by MMS.
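A hypothetical sketch of such a conversion is shown below. The record layout (a little‐endian uint32 tick counter followed by a float32 value) is an assumption made for illustration; the real proprietary format is known only to the vendor's converter:

```python
import csv
import io
import struct

# Hypothetical fixed-size record: uint32 tick counter, float32 value.
RECORD = struct.Struct("<If")

def binary_to_csv(blob):
    """Decode fixed-size binary records and render them as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["ticks", "value"])
    for ticks, value in RECORD.iter_unpack(blob):
        writer.writerow([ticks, round(value, 6)])
    return out.getvalue()

# Build a two-record binary blob and convert it.
blob = RECORD.pack(1, 0.5) + RECORD.pack(2, 0.75)
csv_text = binary_to_csv(blob)
```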
3.1.2 Data Collection for Sensors and Field Devices
PROPHESY should support data collection directly from sensor data sources, subject to the
following specifications:
Dynamic sensor registration: The platform should keep track of multiple sensors and
devices in support of a dynamic environment where sensors are likely to join or leave
dynamically.
Support for different types of sensors: The platform shall support multiple sensor types
such as vibration, acoustic, ultrasound, temperature, power consumption and more.
Moreover, support for both wireless and wired sensors/devices should be provided.
Sensor Virtualization: The data collection functionalities should be flexibly customized to
the different sensor types, based on appropriate abstractions and virtualization of
interfaces.
Streaming data support: The platform should support collection of streaming data i.e.
streams with very high data ingestion rates.
Streaming data pre‐processing: The platform should support pre‐processing of data
streams, including their filtering as a means of optimizing network bandwidth and storage.
Publish‐Subscribe and Request‐Reply: The platform should support both push and pull
modes of data collection, based on a combination of publish‐subscribe and request‐reply
collection modalities.
Streaming Data Analytics: The platform should support analytics over data streams,
including the production of new streams that correspond to the results of the analytics.
Field Abstraction: Support for multiple connectivity protocols should be provided, as part
of field abstraction.
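The publish‐subscribe and stream pre‐processing specifications above can be sketched with a minimal in‐memory broker, a stand‐in for a real messaging system such as an MQTT or AMQP broker; the topic name and filtering threshold are illustrative:

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory publish-subscribe hub (sketch, not a real broker)."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, reading):
        for cb in self._subs[topic]:
            cb(reading)

broker = Broker()
kept = []
THRESHOLD = 1.0   # pre-processing: drop low-amplitude samples to save bandwidth

broker.subscribe("press3/vibration",
                 lambda r: kept.append(r) if abs(r) >= THRESHOLD else None)

for sample in [0.2, 1.5, 0.4, 2.1]:
    broker.publish("press3/vibration", sample)
```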
3.1.3 Edge Gateway Data Collection
PROPHESY should be also able to interface and collect data from edge gateways (e.g., IoT
edge devices, sensor data collection gateways), that are deployed in the field. The following
specifications should be supported:
Support for Data Collection from Groups of Sensors: The PROPHESY platform should be
capable of interfacing to gateways towards accessing data from multiple sensors that are
attached/interfaced to the gateway.
Dynamic Registration of Groups of Devices: The PROPHESY platform should support the
dynamic registration of edge gateways and of the sensors that they comprise.
Support Edge Analytics: PROPHESY should support edge analytics based on data from
multiple sensors.
Edge Gateway Data Transformation: PROPHESY should support the transformation of the
data of the edge gateways to formats and semantics supported by the PROPHESY
platform.
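A hedged sketch of such a transformation is given below; the gateway payload fields and the platform schema are assumptions made up for illustration, and each real gateway type would get its own mapping behind the same interface:

```python
import json

def to_platform_record(gateway_payload):
    """Map one hypothetical gateway JSON payload onto a common platform schema."""
    msg = json.loads(gateway_payload)
    return {
        "sensorId": f'{msg["gw"]}/{msg["ch"]}',   # qualify channel by gateway
        "quantity": msg["q"],
        "value": float(msg["v"]),                 # normalize string values
        "unit": msg["u"],
    }

rec = to_platform_record(
    '{"gw": "edge-07", "ch": 3, "q": "temperature", "v": "71.5", "u": "degC"}'
)
```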
3.1.4 Data Collection from Maintenance Systems and Databases
PROPHESY should support the acquisition of datasets from enterprise systems (e.g.,
Enterprise Resource Planning (ERP) systems), maintenance systems (e.g., Computerized
Maintenance Management (CMM) systems) and other databases (e.g., Asset Management (AM)
databases), subject to the following specifications:
Abstraction and Virtualization of different types of systems: PROPHESY should support
general interfaces for data collection for each one of the different system types (e.g., ERP,
AM, CMM).
Service‐Oriented and Platform Agnostic Protocols: Platform neutral interfaces (e.g.,
REST/HTTP) to all the different systems shall be supported, including interfaces for
distributed service‐based access.
Publish Subscribe and Request Reply: Both push and pull modalities to data acquisition
should be supported, based on publish‐subscribe and request‐reply interfaces.
Filtering and (pre)processing: The platform should support filtering and pre‐processing of
data stemming from enterprise and maintenance systems.
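As an illustration of the pull (request‐reply) modality with filtering, the sketch below pulls work orders through a platform‐neutral client interface; the path, field names and stub client are hypothetical, and in production the client would wrap HTTP calls to the CMM system:

```python
def fetch_work_orders(client, asset_id):
    """Pull maintenance records via a request-reply interface and pre-filter.

    `client` is any object exposing get(path) -> list of dicts."""
    orders = client.get(f"/assets/{asset_id}/workorders")
    # pre-processing: keep only corrective interventions for the analysis
    return [o for o in orders if o["type"] == "corrective"]

class StubClient:
    """Stand-in for a REST client; returns canned data for the sketch."""
    def get(self, path):
        return [{"id": 1, "type": "corrective"},
                {"id": 2, "type": "preventive"}]

corrective = fetch_work_orders(StubClient(), "PUMP-17")
```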
3.2 Data Analytics Specifications
In data analytics for maintenance purposes, there are several levels of action, depending on
the outcome of the analysis itself. These levels can be divided into three main blocks: Anomaly
Detection (ADT), Root Cause Analysis (RCA) and Remaining Useful Life (RUL).
The first level, Anomaly Detection, aims to find anomalies in the datasets in order to launch
an alarm. The second level, Root Cause Analysis, tries to find the cause of that alarm, and
is only launched when an anomaly is detected. Finally, the Remaining Useful Life block is
responsible for calculating the remaining life of the asset, depending on the type of damage
found in the RCA phase. The higher we climb in the pyramid, the more data is needed to feed
the models; with the correct amount of data, the outcome is reproducible and reliable.
Every data analysis problem has one main objective: to answer a question. As the type of
question to be answered differs at each level, the analysis of the data must also differ for
each block; this means that the type of algorithms for each level must be different.
In this section of the document, the types of algorithms that are going to be considered for
the PROPHESY‐ML toolbox are specified, dividing them into the different levels for
maintenance analytics.
The PROPHESY‐ML toolbox will implement the most suitable algorithms among the ones
listed in the subsections below. From the implemented algorithms, the use cases can select
one or more to use in their final product.
Figure 3‐3: Data Analytics levels for Maintenance
3.2.1 Data Pre‐processing
Data pre‐processing is one of the most important parts of the analysis. The pre‐processing
step should not be confused with the term “data cleaning”.
Data cleaning is a step that should be done before the analysis starts, which consists of
cleaning incorrect data, and transforming the data into a structured format so that the
analytics algorithms can make use of them. Usually the engineer performs this previous step
manually.
On the other hand, the data pre‐processing step consists of utility functions and transformers
that change raw feature vectors into a representation that is more suitable for the
downstream algorithms. Among the families for data pre‐processing we can find the ones
listed here:
1. Classical pre‐processing methods:
a. Standardization
b. Scaling
c. Normalization
d. Quantile transform
e. Encoding of categorical features
f. Polynomial Features
2. Dimensionality Reduction methods
a. Feature Selection methods (mRMR, Random Forest feature scores, …)
b. Feature Extraction methods (PCA, ICA, …)
3. Time Series Feature extraction methods (only for time series)
a. Basic features (max, min, mean, std, …)
b. Subspace modelling family
c. Auto‐Regressive modelling family
d. Stochastic Subspace identification features
e. Wavelet Transform featurization
f. Deep Learning Features
g. Principal Component Analysis
As the pre‐processing step transforms the raw data into a form more suitable for the
analytics algorithm, the multi‐level approach proposed here may require the pre‐processing
step to be redefined at each level. This means that the anomaly detection method can use
one type of pre‐processing while the RCA method uses a totally different pre‐processing
step, and the same holds for the RUL method.
3.2.2 Anomaly Detection
This is the first step in the pyramid of maintenance data analytics, also called novelty
detection or outlier detection. It basically consists of detecting whether the input
dataset is still within the limits of a statistical model. The statistical model is
constructed using only data from the healthy asset, so the amount of data needed to deploy
this kind of solution is modest; still, the more data available, the better the model that
can be constructed.
Within this context, several approaches can be used. The approaches can be univariate or
multivariate. The univariate solutions do not take the relations between different variables
into account, whereas the multivariate ones do. The problem can be treated as a
classification problem or as a clustering problem. Among the different kinds of algorithms,
in PROPHESY‐ML we will look into the ones listed here:
1. OneClass classification
a. Null‐Space Anomaly detection
b. MSPC anomaly detection
c. One‐class Support Vector Machines
d. Isolation forest
e. Local Outlier factor
2. Anomaly detection as classification problem
a. Random Forest
b. Least Squares Anomaly Detection
c. Deep Neural networks
3. Anomaly detection as clustering problem
a. Correlation based anomaly detection
b. K‐means anomaly detection
c. Deep Neural networks
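To make the one‐class idea concrete, the following stdlib‐only sketch fits a univariate detector on healthy data and flags readings outside k standard deviations. The algorithms listed above generalize this to multivariate and non‐Gaussian settings; all numbers are illustrative:

```python
from statistics import mean, stdev

class ZScoreDetector:
    """Univariate anomaly detector fitted on healthy data only (one-class idea)."""
    def __init__(self, k=3.0):
        self.k = k                     # alarm beyond k standard deviations

    def fit(self, healthy):
        self.mu = mean(healthy)        # baseline learned from healthy samples
        self.sigma = stdev(healthy)
        return self

    def is_anomaly(self, x):
        return abs(x - self.mu) > self.k * self.sigma

det = ZScoreDetector().fit([1.0, 1.1, 0.9, 1.05, 0.95])
```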
3.2.3 Root Cause Analysis
Once damage has been detected, the reason for the fault must be found. This step has two
sub‐steps: the anomaly designation, and the definition of the root cause of the anomaly.
The anomaly designation part can be covered with classification or clustering solutions,
while the definition of the root cause needs domain expert information. The root cause can
be defined by simply labelling the clusters resulting from the anomaly designation part, or
more advanced algorithms can be used.
Usually, in order to jump from the anomaly detection step to the RCA step, data concerning
the faulty state is needed. Again, the algorithms used for this purpose can be based on
different approaches. Among the different kinds of algorithms, in PROPHESY‐ML we will look
into the ones listed here:
1. Classification Algorithms
a. Ensemble methods
b. Bayesian networks
c. Support Vector Machines
d. KNN
e. (Deep) Neural Networks
2. Clustering Algorithms
a. Gaussian mixture models
b. K‐means
c. PCA ‐ OMEDA
d. DBSCAN Family of Clustering Algorithms
e. EXAMCE
f. (Deep) Neural Networks
3. Descriptors
a. Association Rule Mining (QARMA)
b. Attribute Oriented Induction
c. Sequential Pattern Mining (time domain)
d. (Deep) Neural Networks
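As a toy illustration of the clustering route to anomaly designation, the sketch below groups one‐dimensional readings with a tiny k‐means. Real RCA would cluster multivariate feature vectors, and a domain expert would then label each resulting cluster (e.g. "bearing wear" vs "tool chipping"):

```python
from statistics import mean

def kmeans_1d(xs, k, iters=20):
    """Tiny 1-D k-means: group readings into k fault families (sketch only)."""
    # seed centers by picking evenly spaced sorted values
    centers = sorted(xs)[:: max(1, len(xs) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            # assign each point to its nearest center
            groups[min(range(k), key=lambda i: abs(x - centers[i]))].append(x)
        # move each center to the mean of its group
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

centers, groups = kmeans_1d([0.1, 0.2, 0.15, 5.0, 5.2, 4.9], k=2)
```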
3.2.4 Remaining Useful Life
Finally, at the top of the pyramid lies the Remaining Useful Life. This should be calculated
once it is known that damage exists and the type of damage is known.
As mentioned in the previous sections, the higher we climb in the pyramid, the more data is
needed. This time, in order to have an accurate degradation model, apart from data from a
damaged state, we also need data on the wear we are trying to model. This divides the RUL
block again into two parts: the quantification and the wear modelling. The quantification
model tells how damaged the asset under evaluation is, while the wear model takes the
quantification and predicts how it will evolve in time.
The algorithms that make the RUL calculation possible can be separated into different
families. In the PROPHESY‐ML toolbox, the following will be studied.
1. RUL as Regression problem
a. Logistic Regression
b. Support Vector Regression
c. Random Forest Regressor
2. RUL as Time Series Analysis
a. Auto‐Regressive Family
b. Deep LSTM Networks
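To make the regression view of RUL concrete, the deliberately naive sketch below fits a straight‐line wear trend by ordinary least squares and extrapolates it to a failure threshold. The toolbox algorithms listed above replace this line with SVR, random forests or LSTM forecasts; all numbers are illustrative:

```python
def fit_line(ts, ys):
    """Ordinary least squares for y = a*t + b (simplest degradation model)."""
    n = len(ts)
    tbar, ybar = sum(ts) / n, sum(ys) / n
    a = (sum((t - tbar) * (y - ybar) for t, y in zip(ts, ys))
         / sum((t - tbar) ** 2 for t in ts))
    return a, ybar - a * tbar

def remaining_useful_life(ts, wear, failure_level):
    """Extrapolate the fitted wear trend to the failure threshold."""
    a, b = fit_line(ts, wear)
    # time at which the trend crosses the threshold, minus current time
    return (failure_level - b) / a - ts[-1]

rul = remaining_useful_life([0, 10, 20, 30], [0.0, 0.1, 0.2, 0.3],
                            failure_level=1.0)
```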
3.2.5 Discovery of Rare Conditions
In order to discover rare conditions and associated quantitative rules, PROPHESY will also
explore the following novel set of unsupervised learning algorithms that are contributed by
one of the partners (AIT). Note that details about these algorithms and their use in PROPHESY
will be provided as part of the WP4 deliverables that deal with the detailed specifications
of the PROPHESY‐ML analytics algorithms.
3.2.5.1 QARMA (Parallel Quantitative Association Rule Mining)
Parallel Quantitative Association Rule Mining is a highly parallel/distributed algorithm that
can mine all “interesting” quantitative association rules in multi‐dimensional datasets [15].
The algorithm can be used to discover (rare) conditions under which certain controls must be
applied to bring a system back into a desired state.
3.2.5.2 EXAMCE
EXAMCE addresses clustering into large numbers of clusters. When large numbers of small
clusters are sought (e.g. to discover rare cases that best fit together), standard
algorithms such as K‐Means break down; newer algorithms are a much better fit for such
purposes [16].
3.2.5.3 DBSCAN Family of Clustering Algorithms
The advantage of the DBSCAN family is that incremental, online variants exist that only
need to see the data once and can thus be used in streaming applications.
3.3 Data Sharing and Interoperability Specifications
The PROPHESY‐CPS platform should also offer data sharing and interoperability
functionalities, as part of enabling the unified and consolidated processing of datasets that
reside in fragmented systems. To this end, PROPHESY‐CPS should adhere to the following
specifications about data models, data exchanges and interoperability.
3.3.1 Standards‐based Digital Models for PROPHESY
The PROPHESY platform should support standards‐based digital models for the
representation of maintenance and automation information. In particular:
Standards‐based digital representation of maintenance data: PROPHESY should support
standards‐based digital models, which will ensure data and semantic interoperability
across datasets stemming from diverse data sources.
Standards‐based digital representation of automation‐related data: PROPHESY should
support digital models for the representation of automation operations as a means of
“hiding” the diversity of the various automation systems and devices that might engage
in maintenance workflows.
API Support: PROPHESY should provide APIs for CRUD (Create, Read, Update, Delete)
operations over entities/instances of the data models.
Support for multiple bindings and formats: PROPHESY should support multiple bindings
(i.e. in different languages and formats) for access and use of data models instances.
3.3.2 Data Sharing and Exchange Specifications
PROPHESY should provide support for sharing and exchanging data across different
maintenance‐related systems, based on the following modalities:
On‐line Data Sharing, based on appropriate platform agnostic services (e.g., Web
Services).
Batch and off‐line data sharing, based on appropriate ETL (Extract Transform Load)
processes for transferring maintenance datasets across different systems.
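A minimal sketch of such an ETL pass, using an in‐memory SQLite store; the CSV columns, unit conversion and table layout are illustrative assumptions:

```python
import csv
import io
import sqlite3

def etl(csv_text, conn):
    """Minimal Extract-Transform-Load pass for batch data sharing.

    Extract rows from a CSV export, transform units (minutes -> hours),
    and load them into a relational store."""
    conn.execute("CREATE TABLE IF NOT EXISTS downtime (machine TEXT, hours REAL)")
    for row in csv.DictReader(io.StringIO(csv_text)):          # extract
        hours = float(row["downtime_min"]) / 60.0              # transform
        conn.execute("INSERT INTO downtime VALUES (?, ?)",     # load
                     (row["machine"], hours))
    conn.commit()

conn = sqlite3.connect(":memory:")
etl("machine,downtime_min\npress-3,90\nmill-1,30\n", conn)
rows = conn.execute(
    "SELECT machine, hours FROM downtime ORDER BY machine").fetchall()
```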
3.3.3 Data Persistence Specifications
PROPHESY should provide a variety of options for storing, managing and persisting
maintenance datasets, which shall ensure scalability and support different types of data. In
particular:
Support for structured and unstructured data: PROPHESY should provide the means for
storing and managing structured, unstructured and semi‐structured datasets.
Big Data Support: The PROPHESY data infrastructure should support the four Vs of Big
Data: Volume, Variety, Velocity, and Veracity.
Cost‐Effective Scalability: The PROPHESY infrastructure shall provide cost‐effective
scalability, including the ability to scale‐up and/or scale‐out as needed.
4 Reference Data Flow and Functions
4.1 Information Model
An Information Model defines the structure (e.g. relations, attributes, services) of all the
information on a conceptual level. The term information is used in line with the definitions
of the DIKW hierarchy (see [11]), where data is defined as pure values without relevant or
usable context. Information adds the right context to data and offers answers to typical
questions such as who, what, where and when. The description of the representation of the
information (e.g. binary, XML, RDF, etc.) and concrete implementations are not part of the
Information Model depicted here.
The following figure gives an overview of the relevant elements in the information model, as
used in MANTIS [10]:
Figure 4‐1: Information Model Element Categories [10]
Data can be differentiated into the following kinds:
Data: syntactically treated signals or signs
Metadata: data that provides information about other data
Information: semantically enriched data, regularly structured, aggregated and
augmented
Knowledge: observations that have been meaningfully organized, accumulated and
embedded in a context through experience, communication, or inference. It is used to
interpret situations and to generate activities, behaviour and solutions.
The data will be structured in single values, value containers, or data streams.
Figure 4‐2: Data structures and kinds [10]
The data consists, on the one hand, of sensed data for determining the condition of the
monitored assets and, on the other hand, of manual data from human actors in the
maintenance process. Within the process of predicting maintenance issues, the data is
captured as raw data and successively enriched with syntactic, semantic and
experience‐based increments.
Figure 4‐3: Data sources and processed data [10]
Data covers many differing kinds of content, ranging from simple to complex, and relates to
four different subjects:
Figure 4‐4: Data subjects and content [10]
Depending on its origin, the data handled can be assigned a meaningful set of
characteristics, which are shown in the figure below:
Figure 4‐5: Data Characteristics [10]
4.2 Function Model
D2.2 follows the first two steps of the RFLP approach (Requirements – Functional – Logical –
Physical) as the baseline for model‐based design with systems engineering, which enables
close interaction and collaboration between the different engineering disciplines.
Therefore, we concentrate on customers' needs, qualities and the functions a PdM system has
to provide. Figure 4‐6 depicts the functional view as developed in MANTIS [10], to which
PROPHESY will adapt.
Figure 4‐6: MANTIS Functional Model
Within this functional perspective, this section explains the key maintenance functions
defined in ISO 13374: Data Acquisition (DA), Data Manipulation (DM), State Detection (SD),
Health Assessment (HA), Prognostic Assessment (PA), and Advisory Generation (AG). Figure
4‐7 denotes the overview on these processing blocks:
Figure 4‐7: Data processing block diagram from ISO 13374‐2
In more detail, the processing blocks can be described as follows [4]:
Table 4‐1: Key Maintenance Function: Data Acquisition
Data Acquisition (DA)
The output obtained from the sensor is converted into a digital parameter, which represents
a physical quantity, plus information related to e.g. time, calibration, the quality of the
data, the data collector utilized, and the sensor configuration.
Functionality The Data Acquisition (DA) function provides system access to digitized data entered automatically or manually.
Inputs The DA function may represent a specialized data acquisition function that has analogue feeds (e.g. from legacy sensors), or it may collect and consolidate sensor signals from a data bus. Alternatively, it might represent the software interface to a smart sensor:
Analogue, manual, and digital data
Control, synchronization, and configuration data
Historical DA outputs
Outputs The DA function basically is a server of calibrated/scaled digitized sensor data records. The output of all DA function blocks shall contain the following:
Digitized data
Time‐order/time‐reference data, normally referenced with UTC and local time zone
Data quality indicator (e.g. "bad", "good", "unknown", "under review", etc.)
Table 4‐2: Key Maintenance Function: Data Manipulation
Data Manipulation (DM)
The data manipulation unit performs signal analysis and calculates the relevant descriptors.
The end result is a virtual sensor reading derived from raw data.
Functionality The Data Manipulation (DM) function processes the digital data from the DA function to convert it to a desired form, which characterizes specific descriptors (features) of interest in the machine condition monitoring and diagnostic process. Often the functionality within this layer consists of some signal processing algorithms. DM calculates descriptors (features) from sampled sensor data, other descriptors, or the output of computations. The computation may be characterized as an input‐output mapping.
Inputs Sampled digital data from the DA function, and cascaded data from other DM instances.
Outputs Descriptors (features) from sampled sensor data, other descriptors, or the output of computations. The computation may be characterized as an input‐output mapping.
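The descriptor computation of the DM block can be illustrated with a few classical condition‐monitoring features (RMS, peak, crest factor) computed over one window of sampled data; the window here is a made‐up example:

```python
import math

def descriptors(window):
    """Compute simple condition-monitoring descriptors (features) from one
    window of sampled sensor data: the input-output mapping of the DM block."""
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    peak = max(abs(x) for x in window)
    return {"rms": rms, "peak": peak, "crest": peak / rms}

feats = descriptors([0.0, 1.0, 0.0, -1.0])
```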
Table 4‐3: Key Maintenance Function: State Detection
State Detection (SD)
State Detection (SD) facilitates the creation and maintenance of normal baseline "profiles",
searches for abnormalities whenever new data is collected, and determines the abnormality
zone of the data, if any, for example an alert or alarm.
Functionality The purpose of the State Detection (SD) function (sometimes referred to as "state awareness") is to compare DM and/or DA outputs against expected baseline profile values or operational limits, in order to generate enumerated state indicators for the respective boundary exceedances. The SD function generates indicators, which may be utilized by the Health Assessment function to generate alerts and alarms. When appropriate data are available, the SD block should generate assessments based on the operational context, sensitive to the current operational state or operational environment.
Inputs Current DA and DM outputs; cascaded SD output
Historical DA and DM outputs
Operational data (context, environment, state, external systems data)
Configuration data
Outputs DA control signals and scheduling commands (for optimizing functional interplay)
DM control signals (for optimizing functional interplay)
Data which will contribute to a diagnosis in the health assessment function
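The comparison of a descriptor against baseline-derived limits, yielding an enumerated state indicator, can be sketched as follows (the zone names and the symmetric-deviation rule are assumptions for illustration):

```python
from enum import Enum


class Zone(Enum):
    """Enumerated state indicators generated by the SD block."""
    NORMAL = 0
    ALERT = 1
    ALARM = 2


def detect_state(descriptor, baseline, alert_limit, alarm_limit):
    """Compare a DM descriptor against baseline-derived operational
    limits and return the corresponding abnormality zone."""
    deviation = abs(descriptor - baseline)
    if deviation >= alarm_limit:
        return Zone.ALARM
    if deviation >= alert_limit:
        return Zone.ALERT
    return Zone.NORMAL
```

In a context-sensitive variant, `baseline`, `alert_limit`, and `alarm_limit` would be looked up per operational state (e.g. per machine speed or load regime) rather than fixed.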
Table 4‐4: Key Maintenance Function: Health Assessment
Health Assessment (HA)
The Health Assessment (HA) block diagnoses any faults, and rates the current health of the
equipment or process, considering all state information.
Functionality The Health Assessment (HA) function utilizes expertise from a human or automated agent to determine the current health of the equipment and to diagnose existing fault conditions. It determines the state of health and potential failures by fusing the outputs of the DA, DM, SD and other HA function blocks. HA performs agent‐specific assessments of a component's or system's current health state with the associated diagnoses of discovered abnormal states in the associated operational context. HA results may also include evidence and explanation information.
Inputs DA, DM, SD outputs
Operational data
Configuration data
Human expertise
Automated agent expertise
Outputs Component/system's current health grade
Diagnosed faults and failures with associated likelihood probability
Calculation of the current risk priority number (RPN)
Modelling of ambiguity groups and multiple hypotheses may be included in the output data structures
Explanation detailing the evidence for a diagnosis or health grade
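A simple fusion rule illustrating how per-descriptor state indicators could be combined into a health grade with ranked fault hypotheses (the grading scheme is a hypothetical example, not the PROPHESY method):

```python
def assess_health(zones):
    """Fuse per-descriptor state indicators (0=normal, 1=alert, 2=alarm)
    into a health grade in [0, 1] plus ranked fault hypotheses.

    zones: mapping from descriptor name to its SD zone value.
    Returns (grade, hypotheses) where hypotheses is a list of
    (descriptor, likelihood) pairs, worst evidence first.
    """
    worst = max(zones.values(), default=0)
    grade = 1.0 - worst / 2.0  # 1.0 = fully healthy, 0.0 = alarm
    # Rank descriptors in abnormal zones as candidate fault evidence,
    # which doubles as the "explanation" output of the HA block.
    hypotheses = sorted(
        ((name, z / 2.0) for name, z in zones.items() if z > 0),
        key=lambda item: item[1], reverse=True)
    return grade, hypotheses
```

Returning the ranked evidence alongside the grade mirrors the table's requirement that HA results include explanation information, and the list of competing hypotheses is a crude stand-in for an ambiguity group.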
Table 4‐5: Key Maintenance Function: Prognostic Assessment
Prognostic Assessment (PA)
The Prognostic Assessment (PA) block determines the future health states and failure modes of the equipment and/or process, as well as remaining useful life predictions, based on the current health assessment and projected usage loads.
Functionality Prognostic Assessment (PA) performs agent‐specific assessments of a component's or system's future health state with the associated predicted abnormal states and remaining life for a projected operational context. It may also include evidence and explanation information. It uses a combination of prognostic models and their algorithms, including future operational usage model(s).
Inputs DA, DM, SD, HA and (cascaded) PA outputs
Historical failure data and operational history
Projected failure rates related to operational utilization
Outputs Health grade at a future time
Estimation of the remaining life of an asset given its projected usage profile
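One elementary PA strategy is to extrapolate a degradation trend until it crosses a failure threshold; the sketch below fits a linear trend by least squares, as a deliberately simple stand-in for the prognostic models mentioned above:

```python
def estimate_rul(health_history, failure_threshold=0.0):
    """Extrapolate a linear degradation trend over equally spaced
    health-grade observations and return the remaining useful life
    in observation periods. Returns inf for a flat or improving trend.
    Requires at least two observations.
    """
    n = len(health_history)
    mean_t = (n - 1) / 2.0
    mean_h = sum(health_history) / n
    # Least-squares slope of health grade over time index.
    slope = sum((t - mean_t) * (h - mean_h)
                for t, h in enumerate(health_history))
    slope /= sum((t - mean_t) ** 2 for t in range(n))
    if slope >= 0:
        return float("inf")
    return (failure_threshold - health_history[-1]) / slope
```

For example, a health grade declining by 0.1 per period from 0.7 yields a remaining life of 7 periods before reaching the threshold 0.0; a real PA block would replace this line with a projected-usage-dependent model and attach confidence bounds.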
Table 4‐6: Key Maintenance Function: Advisory Generation
Advisory Generation (AG)
The Advisory Generation (AG) block offers actionable information regarding the maintenance or operation of equipment and/or processes so that their lifetime can be optimized.
Functionality Advisory Generation (AG) integrates information (including safety, environmental, operational goals, financial incentives, etc.) to generate advisories to operations and maintenance and to respond to capability forecast assessment requests.
Inputs DA, DM, SD, HA, PA and (cascaded) AG outputs
Operational data
Configuration data
External constraints (safety, environmental, budgetary, etc.)
Operational history (including usage and maintenance)
Current and future mission profiles
High‐level unit objectives
Resource constraints
Outputs Recommendations, such as
o Prioritized operational and maintenance actions
o Capability forecast assessments
o Modified operational profiles to allow mission completion
Maintenance advisories (expressed as structured work requests)
o Verification of monitoring data
o Performance of additional monitoring
o …
Operational advisories
o Immediate (e.g. notification of alerts and action steps)
o Strategic (e.g. notification of (high) risk of failure)
o Capability forecast
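A toy rule-based AG illustrating how HA and PA results could be mapped to immediate, strategic, and capability-forecast advisories (all thresholds and advisory texts are assumed for illustration):

```python
def generate_advisories(health_grade, rul_periods, planning_horizon):
    """Map HA/PA results to prioritized advisories.

    health_grade:     current health in [0, 1] from the HA block
    rul_periods:      remaining useful life estimate from the PA block
    planning_horizon: periods covered by the current mission profile
    """
    advisories = []
    if health_grade <= 0.25:
        # Immediate operational advisory: alert notification plus action.
        advisories.append("immediate: stop equipment, notify maintenance")
    elif rul_periods < planning_horizon:
        # Strategic advisory: high risk of failure within the horizon.
        advisories.append("strategic: schedule maintenance before RUL expires")
    else:
        # Capability forecast: asset fit for the planned mission profile.
        advisories.append("capability forecast: fit for planned mission profile")
    return advisories
```

A production AG would weigh the external constraints listed above (safety, environmental, budgetary) instead of hard-coded thresholds, and emit structured work requests rather than strings.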
Complementary to the ISO 13374 standard, MANTIS [10] defines two additional maintenance functions: Maintenance Planning (MP) and Maintenance Execution (ME):
Table 4‐7: Key Maintenance Function: Maintenance Planning
Maintenance Planning (MP)
The Maintenance Planning (MP) block generates a work plan on how and when to maintain the
system of interest.
Functionality Maintenance Planning (MP) prepares an optimized plan for immediate and strategic maintenance based on current and historical health data, as well as resource planning data.
Inputs AG outputs
Resource planning data
Outputs Work plan
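A minimal MP sketch pairing prioritized advisories with available resource slots (the first-come scheduling rule is a placeholder for a real resource optimization):

```python
def plan_maintenance(advisories, technician_slots):
    """Pair prioritized AG advisories with available resource slots,
    producing a simple work plan of (advisory, slot) pairs.
    Advisories beyond the available slots are marked 'unscheduled'
    so they can be revisited in the next planning cycle.
    """
    plan = []
    for i, advisory in enumerate(advisories):
        slot = technician_slots[i] if i < len(technician_slots) else "unscheduled"
        plan.append((advisory, slot))
    return plan
```

Because the advisories arrive already prioritized from AG, assigning slots in order preserves that priority; an optimized MP would additionally weigh production schedules and spare-part availability.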
Table 4‐8: Key Maintenance Function: Maintenance Execution
Maintenance Execution (ME)
The Maintenance Execution (ME) block executes the plan produced by MP – in most cases this is done by maintenance staff. However, from the functional viewpoint, a human maintenance worker should be seen as one possible technological solution to accomplish this function.
Functionality Maintenance Execution (ME) executes the maintenance plan.
Inputs MP outputs
Templates
Guidelines
Constraints
Outputs Report data on the outcome
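How the eight blocks chain together can be sketched end to end, with each block reduced to a one-line, purely illustrative stand-in (all thresholds and the RUL model are assumptions):

```python
def run_pdm_pipeline(raw_samples):
    """End-to-end sketch of the DA -> DM -> SD -> HA -> PA -> AG -> MP -> ME
    chain, each block collapsed to a single illustrative step."""
    digitized = [0.01 * s for s in raw_samples]                        # DA: scale
    descriptor = max(abs(v) for v in digitized)                        # DM: peak feature
    zone = 2 if descriptor > 1.0 else (1 if descriptor > 0.5 else 0)   # SD: zone
    health_grade = 1.0 - zone / 2.0                                    # HA: grade
    rul = float("inf") if zone == 0 else 10.0 / zone                   # PA: toy RUL
    advisory = ("repair now" if health_grade <= 0.0
                else "schedule maintenance" if rul < 8.0
                else "no action")                                      # AG: advisory
    work_plan = f"{advisory} @ next available slot"                    # MP: plan
    return f"executed: {work_plan}"                                    # ME: report
```

The point of the sketch is the strict layering: each block consumes only the outputs of earlier blocks, which is what allows the functions to be distributed across edge devices and the PdM platform.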
4.3 Overview Template
To describe the needs of maintenance scenarios, PROPHESY has prepared a template for capturing the relevant information in a structured way, along the aspects depicted in the previous sections.
Figure 4‐8 shows a generic example of this template, exemplifying an overview of the interplay between goals, data sources (internal and external), and maintenance‐related functions. The overview also indicates the data flows in a given maintenance scenario.
Figure 4‐8: Template for capturing goals, functions, data, and data sources & flows
5 PHILIPS Use Cases (Goals, functions, data, data sources & flows)
This section describes, based on the PHILIPS use cases, what the PROPHESY PdM users need to accomplish and what the system has to accomplish for its users. The analysis focuses on the customers’ needs and the goals that shall be reached. It also depicts how the operational needs can be satisfied, considering the existing infrastructure.
Sections 5.1‐5.3 depict the elaboration of the three PHILIPS use cases concerning
“ML to predict RUL on wear‐part level: Cold‐forming Tool” (UC1),
“ML to predict RUL on wear‐part level: 5‐fold Cut‐out Tool” (UC2), and
“AR to Assist Tool Maintenance Operations on wear‐part level: Cold‐forming Tool & 5‐fold Cut‐out Tool” (UC3).
All data and data flows elicited in the PHILIPS use cases initially remain in the realm of PHILIPS. However, some data, not yet clearly identified, go to the PdM platform for the training and development of machine learning algorithms and for KPI extraction.
5.1 PROPHESY UC1 (PHILIPS)
Figure 5‐1: PROPHESY UC1 (PHILIPS) – Goals, functions, data, data sources & flows
5.2 PROPHESY UC2 (PHILIPS)
Figure 5‐2: PROPHESY UC2 (PHILIPS) – Goals, functions, data, data sources & flows
5.3 PROPHESY UC3 (PHILIPS)
Figure 5‐3: PROPHESY UC3 (PHILIPS) – Goals, functions, data, data sources & flows
6 JLR Use Cases (Goals, functions, data, data sources & flows)
This section describes, based on the JLR use cases, what the PROPHESY PdM users need to accomplish and what the system has to accomplish for its users. The analysis focuses on the customers’ needs and the goals that shall be reached. It also depicts how the operational needs can be satisfied, considering the existing infrastructure.
Sections 6.1‐6.3 depict the elaboration of the three JLR use cases concerning
“Condition prediction for MAG Specht 600 Ballscrew” (UC4),
“Condition prediction for hole generation tool on OP90 Cylinder Head” (UC5), and
“AR for technical instructions & training for key components replacement/assembly, and for RUL information” (UC6).
All data and data flows elicited in the JLR use cases initially remain in the realm of JLR. However, some data, not yet clearly identified, go to the PdM platform for the training and development of machine learning algorithms and for KPI extraction (UC4 & UC5). UC6 streams audio and video data via the OCULAVIS SHARE7 platform.
7 https://www.share-platform.de/
6.1 PROPHESY UC4 (JLR)
Figure 6‐1: PROPHESY UC4 (JLR) – Goals, functions, data, data sources & flows
6.2 PROPHESY UC5 (JLR)
Figure 6‐2: PROPHESY UC5 (JLR) – Goals, functions, data, data sources & flows
6.3 PROPHESY UC6 (JLR)
Figure 6‐3: PROPHESY UC6 (JLR) – Goals, functions, data, data sources & flows
7 Conclusion
PROPHESY’s WP2 aims to design a technical draft of the platform to serve as a basis for its technological development and deployment. The engineering blueprint consists of a high‐level architecture of the system of interest and the technical infrastructure that supports the architecture.
This document presents the objectives to be achieved in predictive maintenance and derives the data (and its in‐process and external sources), the data modelling needs, and the functions thereof. It also reflects the system qualities that drive the system architecture most. The work answers the question of what the system of interest should accomplish. The answers to the questions of how the system should function and be built will be given in work package 3.
D2.2 lays the foundations for data collection and analytics, addressing big data solutions,
business goals and key qualities, and considers basic standards like ISO 133748 and MIMOSA.
It also re‐uses results from the MANTIS9 project, provides specifications for data sharing and
interoperability, as well as for data collection and analytics as part of the PROPHESY‐ML. It
additionally extends the reference data flow and functions with a template for recording
information of the PROPHESY use cases’ predictive maintenance scenarios. The document
closes by detailing the needs of the six PROPHESY use cases using the template designed.
8 Condition monitoring and diagnostics of machines – Data processing, communication and presentation
9 http://www.mantis-project.eu/
References
[1] "MIMOSA ‐ An Operation and Maintenance Information Open System Alliance". [Online]. Available: http://www.mimosa.org/. [Accessed: 28‐Mar‐2018].
[2] Pascal Roques, "Systems Architecture Modeling with the Arcadia Method ‐ A Practical Guide to Capella", 1st Edition, ISTE Press – Elsevier, 2017
[3] "Cost Efficient Methods and Processes for Safety Relevant Embedded Systems (CESAR) ". [Online], Available: https://artemis‐ia.eu/project/1‐cesar.html. [Accessed: 28‐Mar‐2018].
[4] ISO 13374, "Condition monitoring and diagnostics of machines ‐ Data processing, communication and presentation", Parts 1‐4
[5] ISO/IEC 19501, "Information technology – Open Distributed Processing – Unified Modeling Language (UML) Version 1.4.2"
[6] Pratikkumar Desai, Amit Sheth and Pramod Anantharam. "Semantic Gateway as a Service Architecture for IoT Interoperability". In: Proceedings of IEEE International Conference on Mobile Services, 2015, pp. 313‐319.
[7] Martín Serrano et al. "Internet of Things ‐ IoT Semantic interoperability: Research Challenges, Best Practices, Recommendations and Next Steps". In: European Research Cluster on the Internet of Things, 2015.
[8] Amit Sheth. "Internet of Things to Smart IoT through semantic, cognitive, and perceptual Computing". In: IEEE Intelligent Systems, Volume 31, Issue 2, 2016, pp. 108‐112.
[9] Riccardo Petrolo et al. "The design of the gateway for the Cloud of Things". In: Annals of Telecommunications, Volume 72, Issue 1‐2, 2016, pp. 31‐40.
[10] "MANTIS: Cyber Physical System based Proactive Collaborative Maintenance". [Online]. Available: http://www.mantis‐project.eu/. [Accessed: 14‐Mar‐2018].
[11] J. Rowley. "The wisdom hierarchy: Representations of the DIKW hierarchy". In: Journal of Information Science, 2007, Volume 33, Issue 2, pp. 163‐180
[12] Anand Veeramani. "Big Data 101 ‐ Creating Real Value from the Data‐Lifecycle". 2014. [Online], Available: https://www.happiestminds.com/whitepapers/Big‐Data‐101‐Creating‐Real‐Value‐from‐the‐Data‐Lifecycle.pdf [Accessed: 30‐Mar‐2018]
[13] CRISP‐DM. [Online], Available: http://crisp‐dm.eu/ [Accessed: 30‐Mar‐2018]
[14] John Gantz and David Reinsel. "The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East". IDC Country Brief 2013. [Online], Available: https://www.emc.com/collateral/analyst‐reports/idc‐digital‐universe‐united‐states.pdf [Accessed: 30‐Mar‐2018]
[15] I.T. Christou, E. Amolochitis, Z.‐H. Tan. “QARMA: A Parallel Algorithm for Mining All Quantitative Association Rules and Some of its Applications”. Knowledge & Information Systems, 2017, Accepted with Minor Revisions.
[16] Ioannis T. Christou. “Coordination of Cluster Ensembles via Exact Methods”. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, Volume 33, Issue 2, pp. 279‐293