This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 766994. It is the property of the PROPHESY consortium and shall not be distributed or reproduced without the formal approval of the PROPHESY Project Coordination Committee.
DELIVERABLE
D2.2 – Specification of Data Assets and Services
D2.2 – Specification of Data Assets and Services, Final – V2.01, 04/06/2018
Dissemination level: (PU) Public Page 2
Project Acronym: PROPHESY
Grant Agreement number: 766994 (H2020‐IND‐CE‐2016‐17/H2020‐FOF‐2017)
Project Full Title: Platform for rapid deployment of self‐configuring and
optimized predictive maintenance services
Project Coordinator: INTRASOFT International SA
DELIVERABLE
D2.2 – Specification of Data Assets and Services
Dissemination level (PU) Public
Type of Document (R) Report
Contractual date of delivery M08, 31/05/2018
Deliverable Leader FHG
Status ‐ version, date Final – V2.01, 04/06/2018
WP / Task responsible FHG
Keywords: Data asset; data service; data flow; data analysis; data analytics; customer needs; functional view; business goals
Executive Summary
PROPHESY's WP2 aims to design an engineering blueprint of the PROPHESY-PdM platform to serve as the baseline for its technological development and deployment. The engineering blueprint consists of a high-level architecture of the system of interest plus the technical infrastructure that supports this architecture.
The document at hand addresses the objectives that predictive maintenance (PdM) in PROPHESY needs to achieve, and derives from them the data (and its in-process and external sources), the data modelling needs, and the required functions. It also reflects the system qualities that most drive the system architecture. The deliverable focusses on what the system of interest shall accomplish; how the system will work and be built will be answered in work package 3.
Task 2.2, together with the remaining tasks of work package 2, sets the context for work packages 3-7. To this end, the document at hand contributes basic concepts for data collection and analytics, presents the data modelling standards and data analytics algorithms to be used, and gives referential advice on the data flows and functions needed for predictive maintenance. Finally, the document depicts the needs of PROPHESY's use cases so that the PROPHESY partners can understand what the systems to be built have to accomplish, and what benefits they offer their users.
Deliverable Leader: FHG
Contributors: ICARE, NOVAID, AIT, FHG, MONDRAGON, SENSAP, TUE, MMS,
PHILIPS, INTRA
Reviewers: PHILIPS, JLR, ICARE, All
Approved by: INTRA
Document History

Version | Date | Contributor(s) | Description
0.0 | 27.02.2018 | FHG, NOVAID | Very first draft.
0.1 | 09.03.2018 | FHG | ToC & contributors.
0.2 | 14.03.2018 | FHG | Initial contributions.
0.3 | 23.03.2018 | ICARE, NOVAID, AIT, FHG, MONDRAGON, SENSAP, TUE, MMS, PHILIPS, JLR | Incorporated feedback. Template revision.
0.4 | 28.03.2018 | FHG | Added to section 2.
0.5 | 29.03.2018 | SENSAP, FHG, JLR, PHILIPS | Added to sections 2, 4 and 5. Included review feedback.
1.0 | 30.03.2018 | FHG | Added to section 3. Updated acronym table. Finalized document (version 1.0).
1.1 | 04.05.2018 | FHG, JLR, PHILIPS | Updated UC function models.
1.2 | 17.05.2018 | AIT | Added new chapter "Specification of Data Collection, Sharing and Analytics" to the ToC and first content.
1.3 | 24.05.2018 | MMS, FHG | Included 3.1.1 (Data Collection Embedded in the Machines).
1.4 | 27.05.2018 | AIT, MONDRAGON, PHILIPS, FHG | Added content to chapter 3. Update of UC2. Updated acronym table.
1.5 | 28.05.2018 | PHILIPS, FHG | Revision of the use case overviews 1-3, of introductory parts, and of the conclusion.
1.6 | 29.05.2018 | JLR, FHG | Revision of the use case overviews 4-6.
2.0 | 30.05.2018 | FHG | Integrated review feedback. Improved picture quality. Finalized document (version 2.0).
2.01 | 04.06.2018 | ICARE, FHG | Integrated final feedback. Minor UC2 overview revision.
Table of Contents

EXECUTIVE SUMMARY ................................................. 3
TABLE OF CONTENTS ................................................. 5
TABLE OF FIGURES .................................................. 6
LIST OF TABLES .................................................... 6
DEFINITIONS, ACRONYMS AND ABBREVIATIONS ........................... 8
INTRODUCTION ...................................................... 10
1.1 THE PROPHESY VISION ........................................... 10
1.2 WP2 OVERVIEW .................................................. 11
1.3 TASK 2.2 OVERVIEW ............................................. 11
1.4 DOCUMENT PURPOSE AND AUDIENCE ................................. 11
1.5 DOCUMENT SCOPE AND APPROACH ................................... 12
1.6 DOCUMENT STRUCTURE ............................................ 14
FOUNDATIONS FOR DATA COLLECTION AND ANALYTICS ..................... 15
2.1 BIG DATA SOLUTION LIFE-CYCLE: CRISP-DM ........................ 15
2.1.1 Introduction ................................................ 15
2.1.2 Cross-Industry Standard Process for Data Mining (CRISP-DM) .. 16
2.1.3 CRISP-DM reference model .................................... 17
2.2 BUSINESS GOALS AND KEY QUALITIES .............................. 24
2.3 ISO 13374 ..................................................... 27
2.4 DATA SEMANTICS ................................................ 28
2.4.1 General Needs ............................................... 28
2.4.2 MIMOSA ...................................................... 30
SPECIFICATION OF DATA COLLECTION, SHARING AND ANALYTICS ........... 33
3.1 DATA COLLECTION SPECIFICATIONS ................................ 33
3.1.1 Data Collection embedded in the Machines .................... 33
3.1.2 Data Collection for Sensors and Field Devices ............... 35
3.1.3 Edge Gateway Data Collection ................................ 35
3.1.4 Data Collection from Maintenance Systems and Databases ...... 36
3.2 DATA ANALYTICS SPECIFICATIONS ................................. 36
3.2.1 Data Pre-processing ......................................... 37
3.2.2 Anomaly Detection ........................................... 38
3.2.3 Root Cause Analysis ......................................... 39
3.2.4 Remaining Useful Life ....................................... 39
3.2.5 Discovery of Rate Conditions ................................ 40
3.3 DATA SHARING AND INTEROPERABILITY SPECIFICATIONS .............. 40
3.3.1 Standards-based Digital Models for PROPHESY ................. 41
3.3.2 Data Sharing and Exchange Specifications .................... 41
3.3.3 Data Persistence Specifications ............................. 41
REFERENCE DATA FLOW AND FUNCTIONS ................................. 42
4.1 INFORMATION MODEL ............................................. 42
4.2 FUNCTION MODEL ................................................ 45
4.3 OVERVIEW TEMPLATE ............................................. 51
PHILIPS USE CASES (GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS) .. 53
5.1 PROPHESY UC1 (PHILIPS) ........................................ 54
5.2 PROPHESY UC2 (PHILIPS) ........................................ 55
5.3 PROPHESY UC3 (PHILIPS) ........................................ 56
JLR USE CASES (GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS) ...... 57
6.1 PROPHESY UC4 (JLR) ............................................ 58
6.2 PROPHESY UC5 (JLR) ............................................ 59
6.3 PROPHESY UC6 (JLR) ............................................ 60
CONCLUSION ........................................................ 61
REFERENCES ........................................................ 62
Table of Figures

FIGURE 1-1: FUNCTIONAL ANALYSIS AND THE ARCHITECTURE DESIGN PHASES [2] .............. 12
FIGURE 2-1: ESTIMATED GROWTH OF THE US DIGITAL UNIVERSE ............................. 15
FIGURE 2-2: PHASES OF THE CRISP-DM REFERENCE MODEL .................................. 17
FIGURE 2-3: CRISP-DM – PHASES, GENERIC TASKS AND OUTPUTS ............................ 18
FIGURE 2-4: HOW ORGANIZATIONS HANDLE DATA FLOW: A GIANT MESS (SOURCE: CONFLUENT) .. 28
FIGURE 2-5: OSA-EAI ARCHITECTURE ................................................... 31
FIGURE 2-6: OSA-CBM FUNCTIONAL BLOCKS CONFORM TO ISO-13374 .......................... 32
FIGURE 3-1: BRANKAMP X7 PROCESS MONITORING SYSTEM ................................... 33
FIGURE 3-2: ARTIS GENIOR MODULAR CPU-02 PROCESS MONITORING SYSTEM ................... 34
FIGURE 3-3: DATA ANALYTICS LEVELS FOR MAINTENANCE ................................... 37
FIGURE 4-1: INFORMATION MODEL ELEMENT CATEGORIES [10] ............................... 42
FIGURE 4-2: DATA STRUCTURES AND KINDS [10] .......................................... 43
FIGURE 4-3: DATA SOURCES AND PROCESSED DATA [10] .................................... 44
FIGURE 4-4: DATA SUBJECTS AND CONTENT [10] .......................................... 44
FIGURE 4-5: DATA CHARACTERISTICS [10] ............................................... 45
FIGURE 4-6: MANTIS FUNCTIONAL MODEL ................................................. 45
FIGURE 4-7: DATA PROCESSING BLOCK DIAGRAM FROM ISO 13374-2 .......................... 46
FIGURE 4-8: TEMPLATE FOR CAPTURING GOALS, FUNCTIONS, DATA, AND DATA SOURCES & FLOWS ... 52
FIGURE 5-1: PROPHESY UC1 (PHILIPS) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS .. 54
FIGURE 5-2: PROPHESY UC2 (PHILIPS) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS .. 55
FIGURE 5-3: PROPHESY UC3 (PHILIPS) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS .. 56
FIGURE 6-1: PROPHESY UC4 (JLR) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS ...... 58
FIGURE 6-2: PROPHESY UC5 (JLR) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS ...... 59
FIGURE 6-3: PROPHESY UC6 (JLR) – GOALS, FUNCTIONS, DATA, DATA SOURCES & FLOWS ...... 60
List of Tables

TABLE 2-1: SELECTION OF QUALITIES RELATED TO PDM ........................ 26
TABLE 4-1: KEY MAINTENANCE FUNCTION: DATA ACQUISITION ................... 47
TABLE 4-2: KEY MAINTENANCE FUNCTION: DATA MANIPULATION .................. 47
TABLE 4-3: KEY MAINTENANCE FUNCTION: STATE DETECTION .................... 48
TABLE 4-4: KEY MAINTENANCE FUNCTION: HEALTH ASSESSMENT .................. 49
TABLE 4-5: KEY MAINTENANCE FUNCTION: PROGNOSTIC ASSESSMENT .............. 49
TABLE 4-6: KEY MAINTENANCE FUNCTION: ADVISORY GENERATION ................ 50
TABLE 4-7: KEY MAINTENANCE FUNCTION: MAINTENANCE PLANNING ............... 51
TABLE 4-8: KEY MAINTENANCE FUNCTION: MAINTENANCE EXECUTION .............. 51
Definitions, Acronyms and Abbreviations

Acronym/Abbreviation Title
ADT Anomaly Detection
AG Advisory Generation
B2MML Business To Machine Mark‐up Language
CAEX Computer Aided Engineering Exchange
CBM Condition-Based Maintenance
CCOM Common Conceptual Object Model
CM&D Condition monitoring and diagnostics
CMM Coordinate Measuring Machine
CNC Computer Numerical Control
CPPS Cyber‐Physical Production System
CPS Cyber‐Physical System
CPSoS Cyber‐Physical System of Systems
CRIS Common Relational Information Schema
CRISP‐DM Cross‐Industry Standard Process for Data Mining
CRUD Create, Read, Update, Delete
DA Data Acquisition
DBSCAN Density-Based Spatial Clustering of Applications with Noise
DIKW Data–Information–Knowledge–Wisdom
DM Data Manipulation
DMC Data Matrix Code
DPWS Device Profile for Web Services
DTA Digital Torque Adapter
EOL End Of Life
ETL Extract-Transform-Load
EXAMCE EXAct Method‐based Cluster Ensembles
FIPA Foundation for Intelligent Physical Agents
HA Health Assessment
HFML High Frequency Machine Learning (in the Cloud)
ICA Independent Component Analysis
IIC Industrial Internet Consortium
IIoT Industrial Internet‐of‐Things
IIRA Industrial Internet Reference Architecture
IoT Internet‐of‐Things
IRR Internal Rate of Return
JADE Java Agent Development Framework
JLR Project partner acronym: Jaguar Land Rover Limited
KNN k‐nearest neighbors algorithm
k‐NN k‐nearest neighbors algorithm
KPI Key Performance Indicator
LFML Low Frequency Machine Learning (on the Edge)
LSTM Long short‐term memory
ME Maintenance Execution
MEDA Missing-data methods for Exploratory Data Analysis
MMS Project partner acronym: MARPOSS Monitoring Solutions GmbH
MMT Manage My Tools (Siemens)
MP Maintenance Planning
mRMR Minimum Redundancy Maximum Relevance
MSPC Multivariate Statistical Process Control
MTBF Mean Time Between Failures
MTTR Mean Time to Repair
OEE Overall Equipment Effectiveness
oMEDA Variant of MEDA to connect observations and variables
OPC OLE for Process Control
OPC‐UA OPC Unified Architecture
OPR Offline Process Recorder
OSA-CBM Open System Architecture for Condition Based Maintenance
OSA-EAI Open Systems Architecture for Enterprise Application Integration
P&P Plug and Produce
PA Prognostic Assessment
PCA Principal Component Analysis
PdM Predictive Maintenance
PLM Product Lifecycle Management
PROPHESY‐AR PROPHESY‐Augmented Reality
PROPHESY‐CPS PROPHESY‐Cyber Physical System
PROPHESY‐ML PROPHESY‐Machine Learning
PROPHESY‐PdM PROPHESY‐Predictive Maintenance
PROPHESY‐SOE PROPHESY‐Service Optimization Engine
PRT Predicted Repair Time
QARMA Quantitative Association Rule Mining
QRC Quick Response Code
RAModel Reference Architectural Model
RCA Root Cause Analysis
RFLP Requirements, Functional, Logical, Physical
RPN Risk Priority Number
RUL Remaining Useful Life
SD State Detection
SOA Service-Oriented Architecture
SOAP Simple Object Access Protocol
SoI System of Interest
SoS System‐of‐Systems
TPM Total Productive Maintenance
UML Unified Modelling Language
WP Work Package
WS Web Service
Introduction

1.1 The PROPHESY Vision

Despite the proclaimed benefits of predictive maintenance, the majority of manufacturers still rely on preventive and condition-based maintenance approaches, which result in suboptimal OEE (Overall Equipment Effectiveness). This is mainly due to the challenges of predictive maintenance deployments, including:

- the fragmentation of the various maintenance-related datasets (i.e. data "silos");
- the lack of solutions that combine multiple sensing modalities for maintenance based on advanced predictive analytics;
- the fact that early predictive maintenance solutions do not close the loop to production as part of an integrated approach;
- the limited exploitation of advanced training and visualization modalities for predictive maintenance, such as Augmented Reality (AR) technologies;
- the lack of validated business models for deploying predictive maintenance solutions to the benefit of all stakeholders.

The main goal of PROPHESY is to lower the deployment barriers for advanced and intelligent predictive maintenance solutions by developing and validating (in factories) novel technologies that address these challenges.
In order to alleviate the fragmentation of datasets and to close the loop to the field,
PROPHESY will specify a novel CPS (Cyber Physical System) platform for predictive
maintenance, which shall provide the means for diverse data collection, consolidation and
interoperability, while at the same time supporting digital automation functions that will
close the loop to the field and will enable “autonomous” maintenance functionalities. The
project’s CPS platform is conveniently called PROPHESY‐CPS and is developed in the scope of
WP3 of the project.
In order to exploit multiple sensing modalities for timely and accurate predictions of maintenance parameters (e.g., RUL (Remaining Useful Life)), PROPHESY will employ advanced predictive analytics operating over data collected from multiple sensors, machines, devices, enterprise systems and maintenance-related databases (e.g., asset management databases). Moreover, PROPHESY will provide tools that facilitate the development and deployment of its library of advanced analytics algorithms. The analytics tools and techniques of the project will be bundled together in a toolbox named PROPHESY-ML, developed in WP4 of the project.
In order to leverage the benefits of advanced training and visualization for maintenance, including increased efficiency and safety of human-in-the-loop processes, the project will take advantage of an Augmented Reality (AR) platform. The AR platform will be customized for use in maintenance scenarios, with particular emphasis on remote maintenance. It will also be combined with a number of visualization technologies, such as ergonomic dashboards, as a means of enhancing workers' support and safety. The project's AR platform is conveniently called PROPHESY-AR.
In order to develop and validate viable business models for predictive maintenance deployments, the project will explore optimal deployment configurations of turn-key solutions, notably solutions comprising multiple components and technologies of the PROPHESY project (e.g., data collection, data analytics, data visualization and AR components in an integrated solution). The project will provide the means for evaluating such configurations against various business and maintenance criteria, based on relevant KPIs (Key Performance Indicators). PROPHESY's tools for developing and evaluating alternative deployment configurations form the project's service optimization engine, which we call PROPHESY-SOE.
1.2 WP2 Overview

PROPHESY's WP2 aims to design a technical draft of the platform to serve as a basis for its
technological development and deployment. The engineering blueprint consists of a high‐
level architecture of the system of interest and the technical infrastructure that supports the
architecture. It provides specifications for the main deliverables of the PROPHESY project,
along with the technical architecture of the PROPHESY‐CPS platform. Its main objectives
include the architecture and specifications of the PROPHESY‐CPS platform optimized for PdM
activities, of data collection and analytics mechanisms, and of the services and the service
optimization engine. It also specifies the project’s demonstrators.
1.3 Task 2.2 Overview

Task 2.2 focusses on the specifications of the mechanisms for sensor data collection and analytics to be used in PROPHESY. It specifies the sensors and data collection modalities to be supported and the machine learning and deep learning techniques to be implemented, and considers requirements such as performance, latency, scalability and extensibility. Special emphasis is paid to the specification of the data modelling standards and ontologies that will be used for data representation within PROPHESY. To this end, task 2.2 uses the MIMOSA ontology and extends it appropriately. Furthermore, the task specifies the detailed structure of the data assets comprising the PROPHESY-ML toolkit, including data analytics algorithms and tools. It places special emphasis on the specification of data sharing and interoperability, which will leverage the MIMOSA-based structures, along with data exchange and sharing in the PROPHESY-CPS platform. The task also specifies where, and for what purpose, data visualization takes place in the project.
1.4 Document Purpose and Audience

Task 2.2, together with the remaining tasks of work package 2, sets the context for work packages 3-7. To this end, the document at hand contributes basic concepts for data collection and analytics, and gives referential advice on the data flows and functions needed for predictive maintenance. Finally, the document depicts the needs of PROPHESY's use cases so that the PROPHESY partners can understand what the systems to be built have to accomplish, and what benefits they offer their users.
1.5 Document Scope and Approach

D2.2 specifies the sensor data collection and analytics mechanisms to be used in PROPHESY. To this end, it addresses the objectives that PdM in PROPHESY needs to achieve, and derives from them the data (and its in-process and external sources), the data modelling needs, and the required functions. D2.2 also reflects the system qualities (e.g. performance, latency, scalability and extensibility) that most drive the system architecture. The deliverable focusses on what the system of interest shall accomplish, and on the core logical architecture. How the system will work and be built will be answered in WP3.
Accordingly, D2.2 follows the first two steps of the RFLP approach (Requirements – Functional – Logical – Physical) as the baseline for model-based design with systems engineering, which enables close interaction and collaboration between the different engineering disciplines.
Figure 1-1: Functional analysis and the architecture design phases [2]
Figure 1-1 gives an overview of the approach D2.2 takes. The steps can be described as follows [3]:

Operational analysis / Requirements analysis

This step focuses on analysing the customers' needs, the goals to be reached by the system users, and the activities performed by the users. The output of this step is an "operational architecture" describing and structuring the needs in terms of actors/users with their operational capabilities and activities; operational use scenarios providing dimensioning parameters; and operational constraints including safety, security, system life cycle, and others.
Functional and non-functional need analysis

This step focuses on the system itself in order to define how it can satisfy the operational needs (captured in the operational analysis), along with its expected behaviour and qualities. This includes the definition of functional as well as non-functional requirements of the system, e.g. safety, security, and performance. At this step, requirements are considered at system-boundary level, not at the level of individual system components. Furthermore, role sharing and potential interactions between the system and its operators are of concern here. Outputs of this step mainly consist of the system functional need description; descriptions of interoperability and interaction with the users and probable external systems (functions, non-functional constraints, and interoperation); and system/SW requirements. Note that these two steps, which are a prerequisite for architecture definition, have a high impact on the design developed in the forthcoming steps, and should therefore be agreed with and validated by the customer.
Logical Architecture Analysis

This step identifies the system's parts (hereafter called components), their roles, relationships and properties, while excluding implementation and technical issues. This constitutes the logical architecture of the system. The output of this step is a logical architecture consisting of components, their interface definitions, and the functions allocated to them (along with their functional exchanges). Traceability links to the requirements and operational scenarios are established.
Physical Architecture Analysis

This step defines the "final" architecture of the system at the physical level, ready to be developed (by lower engineering levels). It therefore introduces rationalization, architectural patterns, and new technical services and components, and evolves the logical architecture according to implementation, technical and technological constraints or choices (at this level of engineering). The output of this step is the chosen physical architecture, consisting of the components to be produced and the way they are taken into account in the subsystem design. As with the logical analysis, traceability links towards the requirements and operational scenarios are established.
D2.2 focusses on the 'need understanding' part of Figure 1-1, that is, the goals and the functional view based on the expected system qualities. Hence, it addresses the flow of data, the major requirements, and the core logical architecture. It also provides specifications regarding the data collection, sharing and analytics to be used in PROPHESY.
1.6 Document Structure The document structure is as follows:
Section 1 Introduction details the document context, purpose and intended audience,
as well as, the overall strategy applied in WP2 while underlining the role played by this
document with respect to the whole project.
Section 2 Foundations for Data Collection and Analytics holds basic concepts and
elaborates on the Big Data Solution Life‐Cycle CRISP‐DM, on business goals and
qualities, on the ISO standard 13374 (Condition monitoring and diagnostics of
machines ‐ Data processing, communication and presentation), and on data semantics
including an overview on MIMOSA.
Section 3 Specification of Data Collection, Sharing and Analytics is devoted to
specifications regarding data collection, sharing and analytics, including information
about the data collection interfaces, the data modelling standards and data analytics
algorithms to be used.
Section 4 Reference Data Flow and Functions presents the information and function
model, and introduces a template for recording information of the use cases’ PdM
scenarios from the perspective of goals, functions, data, and data sources & flows.
Section 5 PHILIPS Use Cases uses the template from section 4.3 for detailing the needs
of the three PHILIPS use cases.
Section 6 JLR Use Cases uses the template from section 4.3 for detailing the needs of
the three JLR use cases.
Section 7 Conclusion provides the conclusion of this document.
2 Foundations for Data Collection and Analytics
2.1 Big Data Solution Life‐Cycle: CRISP‐DM
2.1.1 Introduction
Nowadays, the impact of big data on business is unquestionable. More and
more companies generate new knowledge from data, add value, and develop new business
models. Private companies and research institutions capture terabytes of data about
their users’ interactions, business, and social media, as well as from sensors in devices such
as mobile phones. The challenge of this era is to make sense of this data ocean.
Evolution of big data
Over the past ten years, many studies have examined the evolution of big data. For example, Eric Schmidt
of Google stated in 2010 that every two days we create as much information as we did
from the dawn of civilization up until 2003 [12]. In 2012, an IDC and EMC report stated that
the digital universe is doubling every two years and will reach 40,000 exabytes (40 trillion
gigabytes) by 2020 [13].
Figure 2‐1: Estimated growth of the US digital universe
Based on an IDC and EMC report, there are some further recent statistics on data:
90% of the world’s data has been created in the last two years
The big data market is projected to grow from $10.2 billion in 2013 to $53.4 billion by 2017
70% of the digital universe, approx. 900 exabytes, is generated by users
98% of global information is now digital, up from 25% in 2000
A 10% increase in data visibility means an additional $65.7 million for a typical Fortune 1000 company
But what is the reason for this growth? The mushrooming evolution of big data is due to
a) the increasing usage of IoT devices such as mobile devices, wireless sensors, RFID readers, etc.,
b) the continued growth of Internet usage and social networks, c) the falling costs of the
devices and hardware that create, capture, manage, protect, and store
information, and d) the growth of machine‐generated data [12].
Challenges
This evolution forces enterprises to develop effective methodologies and infrastructure to
gather, process, and harvest data. It is critical to create capabilities that can narrow
big data down to what is relevant and important, keeping only the information that matters most to
the business. Data that is not gathered, and moreover not processed, is useless.
Based on an IDC1 report, the percentage of information in the US digital universe
that would be useful if tagged and analyzed will grow considerably, to 40% by 2020 [12],[13].
In conclusion, the Big Data Revolution refers more to the capability of actually doing
something with the data, of making more sense out of it. In order to build a capability that can
achieve beneficial data targets, enterprises need to understand the data lifecycle and its
challenges at different stages. The best‐known methodology for data mining is the CRISP‐DM
methodology, which the next section describes.
2.1.2 Cross‐Industry Standard Process for Data Mining (CRISP‐DM)
The Cross‐Industry Standard Process for Data Mining (CRISP‐DM) provides a structured
approach to planning a data mining project. CRISP‐DM was conceived in 1996, and the next
year it got underway as a European Union project under the ESPRIT funding initiative [14].
The project was led by five companies: SPSS, Teradata, Daimler AG, NCR Corporation, and
OHRA (an insurance company). The project was finally incorporated into SPSS.
The CRISP‐DM methodology defines a hierarchical process model which consists of a set of tasks,
described at four abstraction levels (from general to specific):
Level 1: The data mining process is organized into six major phases, each of which consists of a
set of generic tasks.
Level 2: This level defines the generic tasks, which cover all possible data mining situations.
Level 3: This level defines specialized tasks that describe how the actions of the generic
tasks should be carried out in specific situations.
Level 4: The process instance is a record of the actions, decisions, and results of an actual
data mining engagement. Each process instance is organized according to the tasks
defined at the higher levels, but represents what actually happened in a particular
engagement, rather than what happens in general.
1 https://www.idc.com/
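The four abstraction levels can be pictured as a small hierarchy; the sketch below is purely illustrative (the class names and the example task are not part of the CRISP‐DM specification):

```python
from dataclasses import dataclass, field

@dataclass
class SpecializedTask:            # Level 3: situation-specific refinement
    name: str

@dataclass
class GenericTask:                # Level 2: covers all data mining situations
    name: str
    specializations: list = field(default_factory=list)

@dataclass
class Phase:                      # Level 1: one of the six major phases
    name: str
    generic_tasks: list = field(default_factory=list)

@dataclass
class ProcessInstance:            # Level 4: record of an actual engagement
    phase: str
    task: str
    decision: str

# Levels 1-3: the generic model, specialized for one concrete situation
modelling = Phase("Modelling", [
    GenericTask("Build model",
                [SpecializedTask("Train gradient-boosted trees")]),
])

# Level 4: what actually happened in one particular engagement
log = [ProcessInstance("Modelling", "Build model", "kept 200 trees, depth 4")]
```

The point of the hierarchy is that the process instance (Level 4) is always traceable back to the generic phases and tasks above it.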
Additionally, the CRISP‐DM methodology introduces a horizontal dimension, which distinguishes
the reference model and the user guide. The reference model presents a quick overview of the
phases, tasks, and their outputs; briefly, it describes what to do in a data mining project. The
user guide gives detailed guidance for each phase and each task within a phase, and describes how
to carry out a data mining project.
2.1.3 CRISP‐DM reference model
Figure 2‐2: Phases of the CRISP‐DM reference model
The reference model provides an overview of the data mining life cycle. This life cycle is divided
into six phases, and each phase is divided into tasks. The six phases, as shown in Figure 2‐2, are:
Business understanding
Data understanding
Data preparation
Modelling
Evaluation
Deployment
The sequence of phases is not strict; it is often necessary to move backward, and the output of
each phase determines which phase or task comes next. Figure 2‐3 presents the
six phases with their tasks and outputs.
2 http://crisp-dm.eu/
Figure 2‐3: CRISP‐DM – phases, generic tasks, and outputs
2.1.3.1 Business understanding
This phase deals with the business view of the project. Business understanding initially
focuses on understanding the project objectives and requirements, then converts this
knowledge into a data mining problem definition, and finally produces a preliminary plan designed to
achieve the objectives. The business understanding phase consists of four generic tasks:
Determine business objectives
Assess situation
Determine data mining goals
Produce project plan
3 http://crisp-dm.eu/
Determine business objectives
This task depicts what the customer really wants to accomplish from a business point of view.
Outputs:
Background: record the known information about the business situation.
Business objectives: describe the customer’s primary business objectives.
Business success criteria: from a business point of view, describe the criteria for a successful or useful outcome of the project; these should be specific enough to be measured objectively.
Assess situation
This task involves a more detailed investigation of the resources, constraints, assumptions, and other factors that affect the data analysis goal and the project plan.
Outputs:
Inventory of resources: list the available resources, such as personnel, data, computing resources, and software.
Requirements, assumptions, and constraints: list the project requirements, such as the completion schedule, quality of results, and security and legal issues; make sure that you are allowed to use the data.
Risks and contingencies: list the risks or events that might delay the project or cause it to fail, together with the plans and actions to be taken if these risks materialize.
Terminology: define a glossary of terminology relevant to the project: a glossary of business terminology and a glossary of data mining terminology.
Costs and benefits: construct a cost‐benefit analysis for the project: compare the project costs with the potential benefits to the business.
Determine data mining goals
This task translates the business goals into data mining goals.
Outputs:
Data mining goals: describe the intended outputs of the project that achieve the business objectives.
Data mining success criteria: define the criteria for a successful outcome of the project in technical terms.
Produce project plan
This task defines a plan for achieving the data mining goals, specifying the steps to be performed during the project, including the initial selection of tools and techniques.
Outputs:
Project plan: list the stages to be executed in the project, including their duration, required resources, inputs, outputs, and dependencies; analyse the dependencies between the time schedule and the risks.
Initial assessment of tools and techniques: an initial assessment of the tools and techniques to be used.
2.1.3.2 Data understanding
The data understanding phase starts with an initial data collection and proceeds with
activities that enable you to become familiar with the data, identify data quality problems,
discover first insights into the data, and/or detect interesting subsets to form hypotheses
regarding hidden information.
Collect initial data
This task acquires the data listed in the project resources; it also includes data loading, if necessary for data understanding.
Outputs:
Initial data collection report: list the acquired datasets with their locations, the acquisition methods, the problems encountered, and the resolutions to these problems.
Describe data
This task examines the “gross” or “surface” properties of the acquired data and reports on the results.
Outputs:
Data description report: a description of the acquired data, including the data format, quantity, the identities of the fields, and any other discovered surface features.
Explore data
This task addresses data mining questions using querying, visualization, and reporting techniques. These include the distribution of key attributes, relationships, results of simple aggregations, properties of significant sub‐populations, and simple statistical analyses.
Outputs:
Data exploration report: a description of the results, such as first findings or initial hypotheses; graphs and plots can be included where appropriate.
Verify data quality
This task examines the quality of the data, addressing questions such as: is the data complete and correct? Does it contain errors and, if so, how common are they?
Outputs:
Data quality report: list the results of the data quality verification; if there are quality problems, define possible solutions.
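The ‘Verify data quality’ task can be sketched as a small completeness-and-correctness check over tabular records; the field names, plausibility rule, and sample rows below are invented for illustration only:

```python
def verify_data_quality(records, required_fields):
    """Check completeness and simple correctness of a list of record
    dicts and return a small data quality report."""
    report = {"missing": 0, "errors": 0, "rows": len(records)}
    for row in records:
        # completeness: every required field must be present and non-null
        for field in required_fields:
            if row.get(field) is None:
                report["missing"] += 1
        # illustrative correctness rule: temperatures must be plausible
        t = row.get("temperature_c")
        if t is not None and not (-40 <= t <= 200):
            report["errors"] += 1
    return report

rows = [
    {"machine": "M1", "temperature_c": 72.5},
    {"machine": "M2", "temperature_c": None},   # incomplete
    {"machine": "M3", "temperature_c": 999.0},  # implausible
]
report = verify_data_quality(rows, ["machine", "temperature_c"])
# report -> {'missing': 1, 'errors': 1, 'rows': 3}
```

A real data quality report would also record the proposed resolutions, as required by the output description above.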
2.1.3.3 Data preparation
The data preparation phase covers all activities needed to construct the final dataset from
the initial raw data. The tasks of the preparation phase may be performed multiple times and
in no predefined order. This phase produces a list of datasets and their descriptions.
Select data
This task decides which data will be used for analysis, based on criteria such as relevance to the data mining goals, quality, and technical restrictions (volume or datatype limitations).
Outputs:
Rationale for inclusion/exclusion: list the data to be included or excluded and the reasons for these decisions.
Clean data
This task raises the data quality to the level required by the selected analysis techniques. This can involve the selection of clean subsets, the definition of default values, or other techniques such as the estimation of missing data.
Outputs:
Data cleaning report: describe what actions were taken to address the data quality problems reported during the ‘Verify data quality’ task, including any data transformations made for cleaning purposes; the impact on the analysis results should be considered.
Construct data
This task includes constructive data preparation operations, such as the production of derived attributes, entirely new records, or transformed values for existing attributes.
Outputs:
Derived attributes: new attributes that are constructed from one or more existing attributes in the same record.
Generated records: a description of the creation of completely new records.
Integrate data
This task includes methods for combining multiple datasets into new records.
Outputs:
Merged data: merged tables, aggregations.
Format data
Formatting transformations are syntactic modifications that do not change the meaning of the data but might be required by the modelling tool.
Outputs:
Reformatted data: the reformatted data.
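The ‘Construct data’ and ‘Format data’ tasks can be sketched as small transformation steps; the attribute names (voltage, current, derived power) are made up for illustration:

```python
def construct_data(records):
    """'Construct data': derive a new attribute from existing
    attributes in the same record."""
    for row in records:
        # derived attribute: electrical power from voltage and current
        row["power_w"] = row["voltage_v"] * row["current_a"]
    return records

def format_data(records):
    """'Format data': a purely syntactic change (everything becomes a
    string, e.g. for a tool that expects text); the meaning is untouched."""
    return [{key: str(value) for key, value in row.items()} for row in records]

prepared = format_data(construct_data([{"voltage_v": 230.0, "current_a": 2.0}]))
```

Keeping each preparation step as a separate function mirrors the CRISP‐DM advice that these tasks may be repeated in any order.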
2.1.3.4 Modelling
In this phase, the appropriate modelling techniques are selected and applied, and their
parameters are calibrated to optimal values. Usually, several techniques are available for the
same data mining problem type, and some techniques have specific requirements on the form of the
data. For this reason, moving back to the data preparation phase is often necessary.
Select modelling technique
This task selects the modelling technique to be used. Although a tool may already have been selected during the business understanding phase, this task refers to the specific modelling technique. If multiple techniques are applied, this task is repeated for each of them.
Outputs:
Modelling technique: document the actual modelling technique that is to be used.
Modelling assumptions: record any assumptions made by the modelling technique.
Generate test design
This task generates a procedure or mechanism to test the model’s quality and validity before the model is built.
Outputs:
Test design: define and describe a plan for training, testing, and evaluating the models.
Build model
This task runs the modelling tool on the prepared dataset to create one or more models.
Outputs:
Parameter settings: list the necessary parameters and their chosen values.
Models: the actual models produced by the modelling tool.
Model descriptions: a report on the interpretation of the models, documenting any difficulties encountered with their meanings.
Assess model
The data mining engineer interprets the models according to domain knowledge, the data mining success criteria, and the desired test design, judges the technical success of the application of the modelling and discovery techniques, and ranks and compares the models according to the evaluation criteria.
Outputs:
Model assessment: summarize the results of this task, list the qualities of the generated models (e.g., in terms of accuracy), and rank their quality in relation to each other.
Revised parameter settings: according to the model assessment, revise the parameter settings and tune them for the next run of the ‘Build model’ task.
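The build-assess-revise cycle can be sketched with a deliberately trivial threshold “model”; the data points and candidate parameter values below are invented for illustration and stand in for a real modelling tool:

```python
def build_model(threshold):
    """'Build model': return a one-parameter classifier."""
    return lambda x: x > threshold

def assess_model(model, test_set):
    """'Assess model': score the model by accuracy on the test
    design's labelled data."""
    hits = sum(1 for x, label in test_set if model(x) == label)
    return hits / len(test_set)

# labelled (value, expected class) pairs from the test design
test_set = [(0.2, False), (0.4, False), (0.7, True), (0.9, True)]

# try several parameter settings, rank them, and revise towards the best
scores = {t: assess_model(build_model(t), test_set) for t in (0.3, 0.5, 0.8)}
best_threshold = max(scores, key=scores.get)
# best_threshold == 0.5 with accuracy 1.0
```

The dictionary of scores corresponds to the ‘Model assessment’ output, and the chosen threshold to the ‘Revised parameter settings’ output.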
2.1.3.5 Evaluation
This phase focuses on the evaluation and review of the created model, to ensure that
it achieves the business objectives. A key objective is to determine whether there is some important
business issue that has not been sufficiently considered. At the end of this phase, a decision
on the use of the data mining results should be reached.
Evaluate results
This task assesses the degree to which the model meets the business objectives and seeks to determine whether there is some business reason why the model is deficient. It also assesses the other data mining results generated: models that are necessarily related to the original business objectives, and all other findings that are not, but that might unveil additional challenges, information, or hints for future directions.
Outputs:
Assessment of data mining results with respect to business success criteria: summarize the assessment results in terms of the business success criteria, including a final statement on whether the project already meets the initial business objectives.
Approved models: the generated models that meet the selected criteria become the approved models.
Review process
This task reviews the data mining process to determine whether any important factor or task has somehow been overlooked and whether the quality assurance issues are covered.
Outputs:
Review of process: summarize the process review and highlight activities that have been missed and those that should be repeated.
Determine next steps
This task defines the next steps based on the assessment and the process review. These steps include finishing the project and deploying the results, initiating further iterations, or setting up new data mining projects. The task also includes an analysis of the remaining budget and resources.
Outputs:
List of possible actions: a list of the potential further actions and the reasons for each option.
Decision: a description of the decision on how to proceed.
2.1.3.6 Deployment
Depending on the requirements, the deployment phase can be as simple as generating a
report or as complex as implementing a repeatable data mining process across the enterprise.
In many cases it is the customer, not the data analyst, who carries out the deployment steps. In
any case, it is important for the customer to understand up front what actions need to be
carried out in order to actually make use of the created models.
Plan deployment
This task takes the evaluation results and determines a strategy for deployment. If a general procedure for creating the relevant model(s) has been identified, this procedure is documented here for later deployment.
Outputs:
Deployment plan: summarize the deployment strategy, including the necessary steps and how to perform them.
Plan monitoring and maintenance
This task defines a plan for monitoring and maintenance. The maintenance strategy helps to avoid unnecessary and/or incorrect usage of the data mining results.
Outputs:
Monitoring and maintenance plan: summarize the monitoring and maintenance strategy, including the necessary steps and how to perform them.
Produce final report
In this task, the project team writes the final report. This report may be only a summary or a final and comprehensive presentation of the data mining results.
Outputs:
Final report: the final written report of the data mining engagement; it includes all the previous deliverables, summarizing and organizing the results.
Final presentation: usually a meeting at the end of the project at which the results are presented to the customer.
Review project
This task assesses what went right and what went wrong, what was done well, and what needs to be improved.
Outputs:
Experience documentation: summarize the important experience gained during the project.
2.2 Business Goals and Key Qualities
In general, we can distinguish between four main classes of architecture drivers: business
goals, functional requirements, constraints, and quality requirements. Each of these classes
might have its individual stakeholders that articulate concerns belonging to that particular
class.
Business Goals
Business goals are the first (and most abstract) class of architectural drivers. They are goals
that are important for the overall enterprise that is developing the respective
architecture or has placed an order to build the system. Examples of business goals are time
to market (denoting the strategy in terms of time), the market scope, or costs.
Functional Requirements
Functional requirements are drivers for the architecture as well. However, they differ in their
influence: some drive the architecture and some do not. In some
sense, the functional requirements that make the product unique and worth building are the
ones that influence the architecture development the most.
Quality Requirements
Quality is not only about the correctness of functionality. Successful software systems also have to
assure properties such as performance, security, extensibility, maintainability, and
so forth. In general, we distinguish between run‐time and development‐time quality
attributes. Run‐time quality attributes can be measured by observing the respective system in
operation; examples are performance, security, safety, availability, and reliability.
Development‐time quality attributes can be measured by observing a team in operation;
examples are extensibility, modifiability, and portability.
Constraints
One important but easily overlooked input for software and systems design is the set of constraints
that influence the design decisions of subsequent steps. Constraints can be organizational,
technical, regulatory, or political. Making them explicit provides a solid basis for subsequent
decision making in design.
Typical business goals in the field of maintenance are:
Analyse product performance remotely
Improve quality
Increase equipment lifespan
Increase operational efficiency
Reduce downtime
Reduce idle time
Reduce maintenance cost
Reduce production cost
Reduce rework & scrap
Increase customer satisfaction
Increase worker mobility
Increase worker training efficiency
Reduce accidents
Reduce energy consumption
Reduce start‐up time
Reduce stock
Reduce time to process orders
Reduce time to market
Consolidate customer links
Integrate real‐time customer feedback
Table 2‐1 lists a selection of possibly relevant qualities influencing IoT architectures w.r.t.
maintenance processes.
Table 2‐1: Selection of qualities related to PdM
Quality Category Quality Goal
Adaptability Adaptability for different environments
Adaptability Adaptability for new components
Adaptability Use of preferred technologies
Availability Concurrent access on data
Availability Planned downtimes of system parts
Availability Unplanned downtimes of system parts (software)
Availability Unplanned downtimes of system parts (hardware)
Latency Require real‐time data access for certain scenarios
Performance The performance of PdM networks must support services' criticality
Performance The performance of production processes must not be influenced negatively by PdM
Robustness Robustness against unstable network conditions
Robustness Robustness against no network conditions
Safety Monitored processes and sites are not influenced w.r.t. safety
Scalability Scaling over large number of data generating devices
Security Separation of views for different MANTIS users
Security Overall data usage control
Security Security levels and zones
Upgradeability No data loss during upgrade of system components
… …
For concrete maintenance‐related scenarios, the qualities have to be operationalized in order to be
measurable. To this end, the environment, stimulus, and system response have to be described
in an adequate manner.
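Such an operationalized quality is often written as a scenario with an environment, a stimulus, and a measurable response. The sketch below illustrates the idea; the concrete scenario is hypothetical and not taken from the PROPHESY use cases:

```python
from dataclasses import dataclass

@dataclass
class QualityScenario:
    quality: str
    environment: str   # the situation in which the scenario applies
    stimulus: str      # what happens to the system
    response: str      # the measurable system reaction

scenario = QualityScenario(
    quality="Availability",
    environment="normal production, one edge gateway temporarily offline",
    stimulus="sensor data cannot reach the PdM platform",
    response="data is buffered locally and delivered within 5 minutes "
             "after reconnection; no samples are lost",
)
```

Because the response is phrased as a measurable condition (a time bound and a loss bound), the scenario can be verified by testing rather than by opinion.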
2.3 ISO 13374
ISO 13374 [4] consists of the following parts, under the general title "Condition monitoring
and diagnostics of machines ‐ Data processing, communication and presentation": Part 1:
General guidelines; Part 2: Data processing; Part 3: Communication; Part 4: Presentation.
The various computer software systems written for condition monitoring and diagnostics
(CM&D) of machines that are currently in use cannot easily exchange data or operate in a
plug‐and‐play fashion without an extensive integration effort. This makes it difficult to
integrate systems and provide a unified view of the condition of machinery to users. The
intent of ISO 13374 Parts 1 through 4 is to provide the basic requirements for open CM&D
software architectures which will allow CM&D information to be processed, communicated,
and displayed by various software packages without platform‐specific or hardware‐specific
protocols.
ISO 13374‐1 (General Guidelines) establishes general guidelines for software specifications
related to data processing, communication, and presentation of machine condition
monitoring and diagnostic information.
ISO 13374‐2 (Data Processing) details the requirements for a reference information model
and a reference processing model to which an open CM&D architecture needs to conform.
Software designers require both an information model and a processing model to adequately
describe all data processing requirements. ISO 13374‐2 facilitates the interoperability of
CM&D systems. It standardizes the reference information model and reference‐processing
model for an open CM&D architecture. ISO 13374‐2 describes the required content for the
five layers (L1: semantic definitions, L2: a non‐proprietary conceptual information model or
'schema', L3: implementation data model, L4: reference data library, and L5: data document
definitions) of the open CM&D information architecture, which describes all the data objects,
types, relationships, etc. for a given system. ISO 13374‐2 defines the six key processing blocks,
namely Data Acquisition (DA), Data Manipulation (DM), State Detection (SD), Health
Assessment (HA), Prognostic Assessment (PA), and Advisory Generation (AG). As an
informative annex, ISO 13374‐2 provides compliant UML, XML, and Middleware service
specifications as well.
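The six processing blocks form a pipeline from raw measurements to maintenance advisories. The sketch below only mirrors the block names (DA, DM, SD, HA, PA, AG) defined in ISO 13374‐2; the function bodies, sample values, and thresholds are invented placeholders, not part of the standard:

```python
def data_acquisition():                    # DA: digitized raw sensor output
    return [0.9, 1.1, 1.0, 3.2]

def data_manipulation(samples):            # DM: signal processing / feature extraction
    return sum(samples) / len(samples)

def state_detection(feature, limit=1.5):   # SD: compare against operational limits
    return "abnormal" if feature > limit else "normal"

def health_assessment(state):              # HA: diagnose the current health grade
    return "degraded" if state == "abnormal" else "healthy"

def prognostic_assessment(health):         # PA: project remaining useful life (hours)
    return 100 if health == "healthy" else 20

def advisory_generation(remaining_life):   # AG: recommend maintenance actions
    return "schedule maintenance" if remaining_life < 50 else "no action"

# chaining the blocks yields the DA -> DM -> SD -> HA -> PA -> AG data flow
advice = advisory_generation(
    prognostic_assessment(
        health_assessment(
            state_detection(
                data_manipulation(
                    data_acquisition())))))
```

The split between the data‐oriented blocks (DA, DM, SD) and the analysis‐oriented blocks (HA, PA, AG) is exactly the boundary at which, per ISO 13374‐3, the technologies and software typically differ.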
ISO 13374‐3 (Communication) specifies requirements for data communication for an open
CM&D reference information architecture and for a reference processing architecture.
Software design professionals require communications to be defined for exchange of CM&D
information between software systems. This part of ISO 13374 facilitates the interoperability
of CM&D systems. The technologies and software used by the data‐oriented processing
blocks (DA, DM, and SD) often vary from those used by the analysis‐oriented processing blocks
(HA, PA, and AG) (cf. Figure 4‐7). ISO 13374‐3 states that a UML model, compliant with ISO/IEC
19501 [5], shall support the open CM&D data‐processing communications. It defines the
block processing methods and interface types that should be utilized by each data‐processing
block defined in ISO 13374‐2. ISO 13374‐3 also contains an informative annex of an open
CM&D information architecture.
ISO 13374‐4 (Presentation) details the requirements for presentation of information for
technical analysis and decision support in an open CM&D. Software design professionals need
to present diagnostic/prognostic data, health information, advisories, and recommendations
on computer displays and in written report formats to end‐users. This part of ISO 13374
provides standards for the display of this information in CM&D systems.
2.4 Data Semantics
Most organizations are facing an explosion of data coming from new applications, new
business opportunities, the IoT, and more. The ideal architecture most envision is a clean,
optimized system that allows businesses to capitalize on all that data.
Figure 2‐4: How organizations handle data flow: a giant mess (source: Confluent)4
However, traditional systems used for solving these problems were designed in an era that
predates large‐scale distributed systems, and lack the ability to scale to meet the needs of the
modern data‐driven organization4.
The evaluation of these service gaps in existing big data systems results in the necessity of a
higher‐level semantic representation of data context and taxonomy.
2.4.1 General Needs
In complex systems (such as cyber‐physical systems), merging data from heterogeneous
information sources is one of the biggest challenges. The data is generated in different
formats and across different levels of an ecosystem. It is then stored in distributed
clouds or local storage systems and made available via a variety of protocols from all end
points of the data generation and consumption chain. The integration of subordinate
resources can pose challenges on several levels:
4 https://www.confluent.io/product/confluent-platform/
Interoperability of protocols / connectivity: components use different protocols or
interfaces; a platform must provide the means to integrate them effectively into an
existing ecosystem
Format incompatibilities: there is a variety of data formats, from simple numerical
one‐dimensional streams to complex data structures
Structuring: merging of structured and unstructured data sources
Encoding: proprietary components and software often use different interfaces and
languages
To support data merging and processing, it makes sense to use a semantic network of devices,
instances, and relations. In this context, semantic metadata
facilitates communication across platforms and different architectures (simplifying
mappings and translations),
simplifies the manipulation of information (systematic processes involving semantic
metadata can create synergies in the data analysis and pre‐processing workflows),
decouples data from its description and makes data sources more comparable,
facilitates data visualization and human‐machine interaction services.
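Decoupling a data stream from its semantic description can be sketched very simply: the stream carries only values, while a separate metadata record describes what they mean. The vocabulary below (quantity, unit, sensor type, location) is an illustrative stand‐in for a real ontology, and the identifiers are invented:

```python
# raw stream as it might arrive from a device
stream = {"id": "plc7/ch3", "values": [20.1, 20.4, 19.8]}

# semantic metadata kept separately, decoupling the data from its description
metadata = {
    "plc7/ch3": {
        "quantity": "temperature",
        "unit": "degC",                 # SI-derived unit
        "sensor_type": "PT100",
        "location": "press line 2, hydraulic unit",
    }
}

def describe(stream, metadata):
    """Merge a raw stream with its semantic description so a data
    consumer sees self-describing data."""
    return {**stream, **metadata.get(stream["id"], {})}

enriched = describe(stream, metadata)
```

Because the description lives outside the stream, the same metadata catalogue can annotate many sources and be mapped onto external ontologies without touching the data itself.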
In an industrial scenario, businesses either outsource their data analysis or employ data
experts to explore and exploit the potential of their data. Often these data consumers are
barely, or not at all, familiar with the underlying processes behind the data provided, and the
acquisition of this knowledge requires a costly exchange of information between the data
provider (the industrial business) and the consultant or data analyst. The industry partner may
know what result he is looking for and can provide large amounts of potentially relevant
information. However, this becomes problematic as the collection of all data that
contextualizes the origin of this information becomes a hunt for clues distributed across the
different levels of the organization. In addition, the analyst faces the challenge of selecting
and distinguishing between all available data sources, which can prolong the data mining process or lead
to irrelevant results. For example, maintenance personnel and floor managers have
comprehensive insights into the functioning of machines and causality chains in a production
line. This expertise is rarely grouped or integrated with the original data, and providing
contextual information to the analyst becomes a tedious process. This knowledge covers
various aspects of the data, from its composition (SI units, nominal values, sensor
type, reference values, etc.) to its origin and location in an interlinked process. Therefore,
data analysis can easily take up to six weeks, only a small portion of which is allocated to
developing the final data mining model. To avoid costly and inefficient intermediate processes,
information management strategies must focus on contextual information and metadata. The
integration of metadata networks leads to semantic webs with high‐level resource
representations and can improve interoperability in large data ecosystems [6][7]. Adding
metadata to the available data sources on a network is an effective way to virtualize
connections between devices in cyber‐physical systems. Data management solutions should
not only aim to meet data storage and scalability requirements, as data semantics in complex
data ecosystems becomes more and more relevant and sensitive, especially as the amount of
data in a network increases. In addition to improved human interaction, semantic labelling
offers great added value for the available data.
The mathematical resources acquire a higher representation layer, which can also be
processed to gain insights and uncover relationships that escape common statistical
processes. This means not only a richer connectivity network in terms of ontologies and
dependencies in the production line, but also a more meaningful communication channel
between data providers and consumers. In addition, modern technologies such as cognitive
computing and knowledge discovery in databases point to an increasing trend in which data
analysis relies on underlying semantic networks that improve the discovery of relevant and
high‐quality data sources in smart IoT environments [8][9]. This leads to an urgent need to semantically
enrich IoT environments with labelled data and information maps that can link the resources
available in an industrial network with internal or external knowledge databases and
ontologies.
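As a minimal sketch of this idea, the snippet below attaches contextual metadata (SI unit, sensor type, asset location, link to a shared ontology term) to otherwise cryptic signal tags. All tag names, assets and URIs are illustrative assumptions, not part of any PROPHESY interface:

```python
def describe_source(tag, unit, sensor_type, asset, concept_uri):
    """Wrap a raw signal tag with contextual metadata (hypothetical schema)."""
    return {
        "tag": tag,                # raw identifier in the data historian
        "unit": unit,              # SI unit, so values become self-describing
        "sensorType": sensor_type,
        "asset": asset,            # location of the source in the process chain
        "sameAs": concept_uri,     # link into a shared ontology term
    }

catalog = [
    describe_source("P3.spindle.vib", "mm/s", "vibration",
                    "press-3/spindle", "http://example.org/onto#Vibration"),
    describe_source("P3.motor.tmp", "degC", "temperature",
                    "press-3/motor", "http://example.org/onto#Temperature"),
]

# Consumers can discover sources by meaning rather than by cryptic tag names.
vibration_sources = [s for s in catalog if s["sameAs"].endswith("#Vibration")]
```

With such a catalogue in place, an analyst can query data sources by semantic concept instead of hunting for tag names across the organization.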
2.4.2 MIMOSA
MIMOSA™ [1] is a non‐profit industry association focused on open standards and the
facilitation of full life‐cycle asset management. This rests on digital assets that accurately
reflect the physical entities they represent. Real‐time control and maintenance depend
heavily on the integration of information about such complex physical assets. MIMOSA helps
to establish a basis for a more integrated approach, combining full life‐cycle engineering
with operation and maintenance activities.
The MIMOSA Open System Architecture (OSA) specifications provide a series of interrelated
information standards. Here we introduce the two most relevant:
MIMOSA OSA‐EAI (OSA for Enterprise Application Integration)5 defines data structures for
storing and moving collective information about all aspects of equipment, including platform
health and future capability, into enterprise applications. This includes the physical
configuration of platforms as well as reliability, condition, and maintenance of platforms,
systems, and subsystems.
OSA‐EAI creates an information backbone to facilitate the integration of asset management
by providing (1) an information exchange standard to allow sharing asset registry, condition,
maintenance and reliability information between enterprise systems, and (2) a relational
database model to allow storage of the same asset information (see Figure 2‐5: OSA‐EAI
Architecture).
(1) To accommodate various types of applications and integration scenarios, the OSA‐EAI supports the exchange of XML files over multiple data transport options including files (Tech‐Doc), HTTP (Tech‐XML‐Web), and SOAP Web Services (Tech‐XML‐Services). The
5 http://www.mimosa.org/mimosa-osa-eai
Web Service definitions are sufficiently granular such that they can be used in a Service‐Oriented Architecture (SOA).
(2) The relational database model (Common Relational Information Schema – CRIS) is represented as a logical and physical model, with SQL scripts targeted for Oracle Database and Microsoft SQL Server. The SQL scripts include both creating the database schema and inserting the MIMOSA Reference Data, the latter of which can be extended to support unique project or organizational requirements. Tech‐CDE‐Services provides an efficient mechanism to manage a CRIS database via Web Services.
While CRIS provides a means to store enterprise operation and maintenance information, the
Common Conceptual Object Model (CCOM) provides a foundation for all MIMOSA standards.
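To illustrate the exchange style (not the actual MIMOSA schemas), the sketch below serializes a single condition reading as an XML document that could be shipped as a file or posted over HTTP. The element names and the asset identifier are invented for illustration; real OSA‐EAI Tech‐XML documents follow MIMOSA's published schemas:

```python
import xml.etree.ElementTree as ET

def measurement_to_xml(asset_id, meas_type, value, unit, utc):
    """Serialize one condition reading for file- or HTTP-based exchange.

    Element names here are illustrative only; they do NOT reproduce the
    published OSA-EAI schemas."""
    root = ET.Element("AssetMeasurement")
    ET.SubElement(root, "AssetID").text = asset_id
    ET.SubElement(root, "Type").text = meas_type
    val = ET.SubElement(root, "Value", unit=unit)   # carry the unit as attribute
    val.text = str(value)
    ET.SubElement(root, "TimestampUTC").text = utc
    return ET.tostring(root, encoding="unicode")

doc = measurement_to_xml("PUMP-17", "vibration-rms", 4.2, "mm/s",
                         "2018-05-31T10:15:00Z")
```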
Figure 2‐5: OSA‐EAI Architecture
MIMOSA OSA‐CBM (Open System Architecture for Condition‐Based Maintenance)6
specification is a reference architecture and framework for implementing condition‐based
maintenance systems. The goal is that standardization of information exchange specifications
would ideally facilitate the integration and interchangeability of CBM components from a
variety of sources. In short,
it describes a standardized information delivery system for condition‐based monitoring,
it defines the information that is moved around and how to move it, and
it has built‐in metadata to describe the processing that is occurring.
The OSA‐CBM is defined using the Unified Modelling Language (UML) to separate the
information from the technical interfaces used to communicate the information.
6 http://www.mimosa.org/mimosa-osa-cbm
In addition, MIMOSA OSA‐CBM is compliant with, and forms the informative reference to, the
ISO 13374‐1 standard for machinery diagnostic systems. ISO 13374, Condition Monitoring
and Diagnostics of Machines, defines the six blocks of functionality in a condition monitoring
system, as well as the general inputs and outputs of those six blocks (see Figure 2‐6). OSA‐
CBM is an implementation of the ISO‐13374 functional specification. It adds data structures
and defines interface methods for the functionality blocks defined by the ISO standard.
Figure 2‐6: OSA‐CBM Functional Blocks conform to ISO 13374
In relation to OSA‐EAI, OSA‐CBM reuses many of the data elements that are defined by
OSA‐EAI, and future releases aim to map OSA‐CBM fully into OSA‐EAI.
3 Specification of Data Collection, Sharing and Analytics
In this section, we provide a set of specifications that will drive data collection and analytics
as part of PROPHESY‐ML. Moreover, a set of specifications for data sharing and
interoperability is provided.
3.1 Data Collection Specifications
PROPHESY will support data collection from a variety of systems that contain maintenance‐
related data. To this end, it will provide the means for interfacing to different types of systems
as a means of consolidating previously fragmented data that reside in multiple systems (i.e.
“data silos”).
3.1.1 Data Collection embedded in the Machines
For the PROPHESY demonstrators, the monitoring of process information is one of the main
tasks.
At Philips, two Brankamp MMS X5 systems are already installed at the demonstrator site. To
allow extended process monitoring through the integration of further sensors and of
interfaces to PROPHESY, an upgrade to Brankamp X7 systems is necessary.
Figure 3‐1: Brankamp X7 process monitoring system
Figure 3‐1 shows a Brankamp X7 system as well as some possible process curves. The X7
system provides up to 24 channels for extended process monitoring. The HMI part of the
system runs on a Windows operating system, so it can easily connect to other PROPHESY
components. Furthermore, the X7 Cockpit provides a switchable mask design with flexible
arrangement of the monitoring channels (according to the machine configuration). Binary
input signals can be monitored with up to three monitoring windows to ensure the earliest
possible fault detection. The failure distribution shows machine downtimes and the
frequency of process failures for a quick and easy failure analysis.
The X7 system has different options to transmit the process data to the PROPHESY platform.
For the Philips demonstrator, a standard PC (Edge‐PC) will be used to gather all data from the
X7 and provide the data in a secure way to PROPHESY (see Deliverable 7.1). The LFML as well
as further data processing could be performed at this Edge‐PC.
At JLR, all machines at the demonstrator site are already equipped with Artis Genior
Modular CPU‐01 process monitoring devices. To allow extended process monitoring through
the integration of further sensors and of interfaces to PROPHESY, an upgrade to Artis
Genior Modular CPU‐02 devices is mandatory.
Figure 3‐2: Artis Genior Modular CPU‐02 process monitoring system
Figure 3‐2 shows an Artis Genior Modular CPU‐02 system as well as the modular system
structure of the Genior system. Genior Modular can simultaneously monitor and visualize up
to 24 signals and 10 channels. The Multiview display is ideal for the simultaneous monitoring
of multiple spindles, axles and other equipment values. It shows the entire machining
situation at a glance. Genior Modular is easy to install and to integrate in machine controls.
Depending on the area of application, visualization and operation can be done alternatively
via the control or an external system (Windows or Linux). The central evaluation unit can be
upgraded with various measuring transducers to operate the system with sensors and can be
modularly expanded at any time. Thus, Genior Modular is prepared for a huge range of
requirements.
Similar to the X7 system, Genior Modular also provides different options to transmit the
process data to the PROPHESY platform. For the JLR demonstrator, an OPR‐Edge device will
be used to gather and store all data from the Genior Modular and provide the data in a
secure way to PROPHESY (see Deliverable 7.1). The LFML as well as further data processing
could be performed at this OPR‐Edge device.
Both systems come with an internal binary data format. To transform the binary format into
other formats (e.g., CSV), a converter is provided by MMS.
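A hypothetical sketch of such a conversion is shown below. The record layout (a little‐endian uint32 tick counter followed by a float32 value) is an assumption made for illustration; the real proprietary format is known only to the vendor's converter:

```python
import csv
import io
import struct

# Hypothetical fixed-size record: uint32 tick counter, float32 value.
RECORD = struct.Struct("<If")

def binary_to_csv(blob):
    """Decode fixed-size binary records and render them as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["ticks", "value"])
    for ticks, value in RECORD.iter_unpack(blob):
        writer.writerow([ticks, round(value, 6)])
    return out.getvalue()

# Build a two-record binary blob and convert it.
blob = RECORD.pack(1, 0.5) + RECORD.pack(2, 0.75)
csv_text = binary_to_csv(blob)
```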
3.1.2 Data Collection for Sensors and Field Devices
PROPHESY should support data collection directly from sensor data sources, subject to the
following specifications:
Dynamic sensor registration: The platform should keep track of multiple sensors and
devices in support of a dynamic environment where sensors are likely to join or leave
dynamically.
Support for different types of sensors: The platform shall support multiple sensor types
such as vibration, acoustic, ultrasound, temperature, power consumption and more.
Moreover, support for both wireless and wired sensors/devices should be provided.
Sensor Virtualization: The data collection functionalities should be flexibly customized to
the different sensor types, based on appropriate abstractions and virtualization of
interfaces.
Streaming data support: The platform should support collection of streaming data i.e.
streams with very high data ingestion rates.
Streaming data pre‐processing: The platform should support pre‐processing of data
streams, including their filtering as a means of optimizing network bandwidth and storage.
Publish‐Subscribe and Request‐Reply: The platform should support both push and pull
modes of data collection, based on a combination of publish‐subscribe and request‐reply
collection modalities.
Streaming Data Analytics: The platform should support analytics over data streams,
including the production of new streams that correspond to the results of the analytics.
Field Abstraction: Support for multiple connectivity protocols should be provided, as part
of field abstraction.
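The publish‐subscribe and stream pre‐processing specifications above can be sketched with a minimal in‐memory broker, a stand‐in for a real messaging system such as an MQTT or AMQP broker; the topic name and filtering threshold are illustrative:

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory publish-subscribe hub (sketch, not a real broker)."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, reading):
        for cb in self._subs[topic]:
            cb(reading)

broker = Broker()
kept = []
THRESHOLD = 1.0   # pre-processing: drop low-amplitude samples to save bandwidth

broker.subscribe("press3/vibration",
                 lambda r: kept.append(r) if abs(r) >= THRESHOLD else None)

for sample in [0.2, 1.5, 0.4, 2.1]:
    broker.publish("press3/vibration", sample)
```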
3.1.3 Edge Gateway Data Collection
PROPHESY should be also able to interface and collect data from edge gateways (e.g., IoT
edge devices, sensor data collection gateways), that are deployed in the field. The following
specifications should be supported:
Support for Data Collection from Groups of Sensors: The PROPHESY platform should be
capable of interfacing to gateways towards accessing data from multiple sensors that are
attached/interfaced to the gateway.
Dynamic Registration of Groups of Devices: The PROPHESY platform should support the
dynamic registration of edge gateways and of the sensors that they comprise.
Support Edge Analytics: PROPHESY should support edge analytics based on data from
multiple sensors.
Edge Gateway Data Transformation: PROPHESY should support the transformation of the
data of the edge gateways to formats and semantics supported by the PROPHESY
platform.
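A hedged sketch of such a transformation is given below; the gateway payload fields and the platform schema are assumptions made up for illustration, and each real gateway type would get its own mapping behind the same interface:

```python
import json

def to_platform_record(gateway_payload):
    """Map one hypothetical gateway JSON payload onto a common platform schema."""
    msg = json.loads(gateway_payload)
    return {
        "sensorId": f'{msg["gw"]}/{msg["ch"]}',   # qualify channel by gateway
        "quantity": msg["q"],
        "value": float(msg["v"]),                 # normalize string values
        "unit": msg["u"],
    }

rec = to_platform_record(
    '{"gw": "edge-07", "ch": 3, "q": "temperature", "v": "71.5", "u": "degC"}'
)
```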
3.1.4 Data Collection from Maintenance Systems and Databases
PROPHESY should support the acquisition of datasets from enterprise systems (e.g.,
Enterprise Resource Planning (ERP) systems), maintenance systems (e.g., Computerized
Maintenance Management (CMM) systems) and other databases (e.g., Asset Management (AM)
databases), subject to the following specifications:
Abstraction and Virtualization of different types of systems: PROPHESY should support
general interfaces for data collection for each one of the different system types (e.g., ERP,
AM, CMM).
Service‐Oriented and Platform Agnostic Protocols: Platform neutral interfaces (e.g.,
REST/HTTP) to all the different systems shall be supported, including interfaces for
distributed service‐based access.
Publish Subscribe and Request Reply: Both push and pull modalities to data acquisition
should be supported, based on publish‐subscribe and request‐reply interfaces.
Filtering and (pre)processing: The platform should support filtering and pre‐processing of
data stemming from enterprise and maintenance systems.
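As an illustration of the pull (request‐reply) modality with filtering, the sketch below pulls work orders through a platform‐neutral client interface; the path, field names and stub client are hypothetical, and in production the client would wrap HTTP calls to the CMM system:

```python
def fetch_work_orders(client, asset_id):
    """Pull maintenance records via a request-reply interface and pre-filter.

    `client` is any object exposing get(path) -> list of dicts."""
    orders = client.get(f"/assets/{asset_id}/workorders")
    # pre-processing: keep only corrective interventions for the analysis
    return [o for o in orders if o["type"] == "corrective"]

class StubClient:
    """Stand-in for a REST client; returns canned data for the sketch."""
    def get(self, path):
        return [{"id": 1, "type": "corrective"},
                {"id": 2, "type": "preventive"}]

corrective = fetch_work_orders(StubClient(), "PUMP-17")
```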
3.2 Data Analytics Specifications
In data analytics for maintenance purposes, there are several levels of action, depending on
the outcome of the analysis itself. These levels can be divided into three main blocks: Anomaly
Detection (ADT), Root Cause Analysis (RCA) and Remaining Useful Life (RUL).
The first level, Anomaly Detection, aims to find anomalies in the datasets in order to launch
an alarm. The second level, Root Cause Analysis, tries to find the cause of that alarm, and
is only launched when an anomaly is detected. Finally, the Remaining Useful Life block is
responsible for calculating the remaining life of the asset, depending on the type of damage
found in the RCA phase. The higher we climb in the pyramid, the more data is needed to feed
the models; with the correct amount of data, the outcome is reproducible and reliable.
Every data analysis problem has one main objective: to answer a question. As the type of
question to be answered differs at each level, the analysis of the data must also differ for
each block; this means that the type of algorithms for each level must be different.
In this section of the document, the types of algorithms that are going to be considered for
the PROPHESY‐ML toolbox are specified, dividing them into the different levels for
maintenance analytics.
The PROPHESY‐ML toolbox will implement the most suitable algorithms among the ones
listed in the subsections below. From the implemented algorithms, the use cases can select
one or more to use in their final product.
Figure 3‐3: Data Analytics levels for Maintenance
3.2.1 Data Pre‐processing
Data pre‐processing is one of the most important parts of the analysis. The pre‐processing
step should not be confused with the term “data cleaning”.
Data cleaning is a step that should be done before the analysis starts, which consists of
cleaning incorrect data, and transforming the data into a structured format so that the
analytics algorithms can make use of them. Usually the engineer performs this previous step
manually.
On the other hand, the data pre‐processing step consists of utility functions and transformers
that change raw feature vectors into a representation that is more suitable for the
downstream algorithms. Among the families for data pre‐processing we can find the ones
listed here:
1. Classical pre‐processing methods:
a. Standardization
b. Scaling
c. Normalization
d. Quantile transform
e. Encoding of categorical features
f. Polynomial Features
2. Dimensionality Reduction methods
a. Feature Selection methods (mRMR, Random Forest feature scores, …)
b. Feature Extraction methods (PCA, ICA, …)
3. Time Series Feature extraction methods (only for time series)
a. Basic features (max, min, mean, std, …)
b. Subspace modelling family
c. Auto‐Regressive modelling family
d. Stochastic Subspace identification features
e. Wavelet Transform featurization
f. Deep Learning Features
g. Principal Component Analysis
As the pre‐processing step transforms the raw data into a form more suitable for the
analytics algorithm, the multi‐level approach proposed here may require the pre‐processing
step to be redefined at each level. This means that the anomaly detection method can use
one type of pre‐processing while the RCA method uses a totally different pre‐processing
step, and the same holds for the RUL method.
3.2.2 Anomaly Detection
This is the first step in the pyramid of maintenance data analytics, also called novelty
detection or outlier detection. It basically consists of detecting whether the input
dataset is still within the limits of a statistical model. The statistical model is
constructed using only data from the healthy asset, so the amount of data needed to deploy
this kind of solution is modest; still, the more data available, the better the model that
can be constructed.
Within this context, several approaches can be used. The approaches can be univariate or
multivariate. The univariate solutions do not take the relations between different variables
into account, whereas the multivariate ones do. The problem can be treated as a
classification problem or as a clustering problem. Among the different kinds of algorithms,
in PROPHESY‐ML we will look into the ones listed here:
1. OneClass classification
a. Null‐Space Anomaly detection
b. MSPC anomaly detection
c. One‐class Support Vector Machines
d. Isolation forest
e. Local Outlier factor
2. Anomaly detection as classification problem
a. Random Forest
b. Least Squares Anomaly Detection
c. Deep Neural networks
3. Anomaly detection as clustering problem
a. Correlation based anomaly detection
b. K‐means anomaly detection
c. Deep Neural networks
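To make the one‐class idea concrete, the following stdlib‐only sketch fits a univariate detector on healthy data and flags readings outside k standard deviations. The algorithms listed above generalize this to multivariate and non‐Gaussian settings; all numbers are illustrative:

```python
from statistics import mean, stdev

class ZScoreDetector:
    """Univariate anomaly detector fitted on healthy data only (one-class idea)."""
    def __init__(self, k=3.0):
        self.k = k                     # alarm beyond k standard deviations

    def fit(self, healthy):
        self.mu = mean(healthy)        # baseline learned from healthy samples
        self.sigma = stdev(healthy)
        return self

    def is_anomaly(self, x):
        return abs(x - self.mu) > self.k * self.sigma

det = ZScoreDetector().fit([1.0, 1.1, 0.9, 1.05, 0.95])
```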
3.2.3 Root Cause Analysis
Once damage has been detected, the reason for the fault must be found. This step has two
sub‐steps: the anomaly designation, and the definition of the root cause of the anomaly.
The anomaly designation part can be covered with classification or clustering solutions,
while the definition of the root cause needs domain expert information. The root cause can
be defined by simply labelling the clusters resulting from the anomaly designation part, or
more advanced algorithms can be used.
Usually, in order to jump from the anomaly detection step to the RCA step, data concerning
the faulty state is needed. Again, the algorithms used for this purpose can be based on
different approaches. Among the different kinds of algorithms, in PROPHESY‐ML we will look
into the ones listed here:
1. Classification Algorithms
a. Ensemble methods
b. Bayesian networks
c. Support Vector Machines
d. KNN
e. (Deep) Neural Networks
2. Clustering Algorithms
a. Gaussian mixture models
b. K‐means
c. PCA ‐ OMEDA
d. DBSCAN Family of Clustering Algorithms
e. EXAMCE
f. (Deep) Neural Networks
3. Descriptors
a. Association Rule Mining (QARMA)
b. Attribute Oriented Induction
c. Sequential Pattern Mining (time domain)
d. (Deep) Neural Networks
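As a toy illustration of the clustering route to anomaly designation, the sketch below groups one‐dimensional readings with a tiny k‐means. Real RCA would cluster multivariate feature vectors, and a domain expert would then label each resulting cluster (e.g. "bearing wear" vs "tool chipping"):

```python
from statistics import mean

def kmeans_1d(xs, k, iters=20):
    """Tiny 1-D k-means: group readings into k fault families (sketch only)."""
    # seed centers by picking evenly spaced sorted values
    centers = sorted(xs)[:: max(1, len(xs) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            # assign each point to its nearest center
            groups[min(range(k), key=lambda i: abs(x - centers[i]))].append(x)
        # move each center to the mean of its group
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

centers, groups = kmeans_1d([0.1, 0.2, 0.15, 5.0, 5.2, 4.9], k=2)
```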
3.2.4 Remaining Useful Life
Finally, at the top of the pyramid lies the Remaining Useful Life. This should be calculated
once it is known that damage exists and the type of damage is known.
As mentioned in the previous sections, the higher we climb in the pyramid, the more data is
needed. This time, in order to have an accurate degradation model, apart from data from a
damaged state, we also need data on the wear we are trying to model. This divides the RUL
block again into two parts: the quantification and the wear modelling. The quantification
model tells how damaged the asset under evaluation is, while the wear model takes the
quantification and predicts how it will evolve in time.
The algorithms that make the RUL calculation possible can be separated into different
families. In the PROPHESY‐ML toolbox, the following will be studied.
1. RUL as Regression problem
a. Logistic Regression
b. Support Vector Regression
c. Random Forest Regressor
2. RUL as Time Series Analysis
a. Auto‐Regressive Family
b. Deep LSTM Networks
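To make the regression view of RUL concrete, the deliberately naive sketch below fits a straight‐line wear trend by ordinary least squares and extrapolates it to a failure threshold. The toolbox algorithms listed above replace this line with SVR, random forests or LSTM forecasts; all numbers are illustrative:

```python
def fit_line(ts, ys):
    """Ordinary least squares for y = a*t + b (simplest degradation model)."""
    n = len(ts)
    tbar, ybar = sum(ts) / n, sum(ys) / n
    a = (sum((t - tbar) * (y - ybar) for t, y in zip(ts, ys))
         / sum((t - tbar) ** 2 for t in ts))
    return a, ybar - a * tbar

def remaining_useful_life(ts, wear, failure_level):
    """Extrapolate the fitted wear trend to the failure threshold."""
    a, b = fit_line(ts, wear)
    # time at which the trend crosses the threshold, minus current time
    return (failure_level - b) / a - ts[-1]

rul = remaining_useful_life([0, 10, 20, 30], [0.0, 0.1, 0.2, 0.3],
                            failure_level=1.0)
```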
3.2.5 Discovery of Rare Conditions
In order to discover rare conditions and associated quantitative rules, PROPHESY will also
explore the following novel set of unsupervised learning algorithms that are contributed by
one of the partners (AIT). Note that details about these algorithms and their use in PROPHESY
will be provided as part of the WP4 deliverables that deal with the detailed specifications
of the PROPHESY‐ML analytics algorithms.
3.2.5.1 QARMA (Parallel Quantitative Association Rule Mining)
Parallel Quantitative Association Rule Mining is a highly parallel/distributed algorithm that
can mine all “interesting” quantitative association rules in multi‐dimensional datasets [15].
The algorithm can be used to discover (rare) conditions under which certain controls must be
applied to bring a system back into a desired state.
3.2.5.2 EXAMCE
EXAMCE addresses clustering into large numbers of clusters. When large numbers of small
clusters are sought (e.g. to discover rare cases that best fit together), standard
algorithms such as K‐Means break down; newer algorithms are a much better fit for such
purposes [16].
3.2.5.3 DBSCAN Family of Clustering Algorithms
The advantage of the DBSCAN family is that incremental, online variants exist that only
need to see the data once and can thus be used in streaming applications.
3.3 Data Sharing and Interoperability Specifications
The PROPHESY‐CPS platform should also offer data sharing and interoperability
functionalities, as part of enabling the unified and consolidated processing of datasets that
reside in fragmented systems. To this end, PROPHESY‐CPS should adhere to the following
specifications about data models, data exchanges and interoperability.
3.3.1 Standards‐based Digital Models for PROPHESY
The PROPHESY platform should support standards‐based digital models for the
representation of maintenance and automation information. In particular:
Standards‐based digital representation of maintenance data: PROPHESY should support
standards‐based digital models, which will ensure data and semantic interoperability
across datasets stemming from diverse data sources.
Standards‐based digital representation of automation‐related data: PROPHESY should
support digital models for the representation of automation operations as a means of
“hiding” the diversity of the various automation systems and devices that might engage
in maintenance workflows.
API Support: PROPHESY should provide APIs for CRUD (Create, Read, Update, Delete)
operations over entities/instances of the data models.
Support for multiple bindings and formats: PROPHESY should support multiple bindings
(i.e. in different languages and formats) for access and use of data models instances.
3.3.2 Data Sharing and Exchange Specifications
PROPHESY should provide support for sharing and exchanging data across different
maintenance‐related systems, based on the following modalities:
On‐line Data Sharing, based on appropriate platform agnostic services (e.g., Web
Services).
Batch and off‐line data sharing, based on appropriate ETL (Extract Transform Load)
processes for transferring maintenance datasets across different systems.
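A minimal sketch of such an ETL pass, using an in‐memory SQLite store; the CSV columns, unit conversion and table layout are illustrative assumptions:

```python
import csv
import io
import sqlite3

def etl(csv_text, conn):
    """Minimal Extract-Transform-Load pass for batch data sharing.

    Extract rows from a CSV export, transform units (minutes -> hours),
    and load them into a relational store."""
    conn.execute("CREATE TABLE IF NOT EXISTS downtime (machine TEXT, hours REAL)")
    for row in csv.DictReader(io.StringIO(csv_text)):          # extract
        hours = float(row["downtime_min"]) / 60.0              # transform
        conn.execute("INSERT INTO downtime VALUES (?, ?)",     # load
                     (row["machine"], hours))
    conn.commit()

conn = sqlite3.connect(":memory:")
etl("machine,downtime_min\npress-3,90\nmill-1,30\n", conn)
rows = conn.execute(
    "SELECT machine, hours FROM downtime ORDER BY machine").fetchall()
```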
3.3.3 Data Persistence Specifications
PROPHESY should provide a variety of options for storing, managing and persisting
maintenance datasets, which shall ensure scalability and support different types of data. In
particular:
Support for structured and unstructured data: PROPHESY should provide the means for
storing and managing structured, unstructured and semi‐structured datasets.
Big Data Support: The PROPHESY data infrastructure should support the four Vs of Big
Data: Volume, Variety, Velocity, and Veracity.
Cost‐Effective Scalability: The PROPHESY infrastructure shall provide cost‐effective
scalability, including the ability to scale‐up and/or scale‐out as needed.
4 Reference Data Flow and Functions
4.1 Information Model
An Information Model defines the structure (e.g. relations, attributes, services) of all the
information on a conceptual level. The term information is used in line with the definitions
of the DIKW hierarchy (see [11]), where data is defined as pure values without relevant or
usable context. Information adds the right context to data and offers answers to typical
questions such as who, what, where and when. The description of the representation of the
information (e.g. binary, XML, RDF, etc.) and concrete implementations are not part of the
Information Model depicted here.
The following figure gives an overview of the relevant elements in the information model, as
used in MANTIS [10]:
Figure 4‐1: Information Model Element Categories [10]
Data can be differentiated into the following kinds:
Data: syntactically treated signals or signs
Metadata: data that provides information about other data
Information: semantically enriched data, regularly structured, aggregated and
augmented
Knowledge: observations that have been meaningfully organized, accumulated and
embedded in a context through experience, communication, or inference. It is used to
interpret situations and to generate activities, behaviour and solutions.
The data will be structured in single values, value containers, or data streams.
Figure 4‐2: Data structures and kinds [10]
The data consists, on the one hand, of sensed data for determining the condition of the
monitored assets and, on the other hand, of manual data from human actors in the
maintenance process. Within the process of predicting maintenance issues, the data is
captured as raw data and successively enriched with syntactic, semantic and
experience‐based increments.
Figure 4‐3: Data sources and processed data [10]
Data covers many differing kinds of content, ranging from simple to complex, and relates to
four different subjects:
Figure 4‐4: Data subjects and content [10]
Depending on its origin, the data handled can be assigned a meaningful set of
characteristics, which are shown in the figure below:
Figure 4‐5: Data Characteristics [10]
4.2 Function Model
D2.2 follows the first two steps of the RFLP approach (Requirements – Functional – Logical –
Physical) as the baseline for model‐based design with systems engineering, which enables
close interaction and collaboration between the different engineering disciplines.
Therefore, we concentrate on customers' needs, qualities and the functions a PdM system has
to provide. Figure 4‐6 depicts the functional view as developed in MANTIS [10], to which
PROPHESY will adapt.
Figure 4‐6: MANTIS Functional Model
Within this functional perspective, this section explains the key maintenance functions
defined in ISO 13374: Data Acquisition (DA), Data Manipulation (DM), State Detection (SD),
Health Assessment (HA), Prognostic Assessment (PA), and Advisory Generation (AG). Figure
4‐7 denotes the overview on these processing blocks:
Figure 4‐7: Data processing block diagram from ISO 13374‐2
In more detail, the processing blocks can be described as follows [4]:
Table 4‐1: Key Maintenance Function: Data Acquisition
Data Acquisition (DA)
The output obtained from the sensor is converted into a digital parameter, which represents
a physical quantity, plus information related to e.g. time, calibration, the quality of the
data, the data collector utilized, and the sensor configuration.
Functionality The Data Acquisition (DA) function provides system access to digitized data entered automatically or manually.
Inputs The DA function may represent a specialized data acquisition function that has analogue feeds (e.g. from legacy sensors), or it may collect and consolidate sensor signals from a data bus. Alternatively, it might represent the software interface to a smart sensor:
Analogue, manual, and digital data
Control, synchronization, and configuration data
Historical DA outputs
Outputs The DA function basically is a server of calibrated/scaled digitized sensor data records. The output of all DA function blocks shall contain the following:
Digitized data
Time‐order/time‐reference data, normally referenced with UTC and local time zone
Data quality indicator (e.g. "bad", "good", "unknown", "under review", etc.)
Table 4‐2: Key Maintenance Function: Data Manipulation
Data Manipulation (DM)
The data manipulation unit performs signal analysis and calculates the relevant descriptors.
The end result is a virtual sensor reading derived from raw data.
Functionality The Data Manipulation (DM) function processes the digital data from the DA function to convert it to a desired form, which characterizes specific descriptors (features) of interest in the machine condition monitoring and diagnostic process. Often the functionality within this layer consists of some signal processing algorithms. DM calculates descriptors (features) from sampled sensor data, other descriptors, or the output of computations. The computation may be characterized as an input‐output mapping.
Inputs Sampled digital data from the DA function, and cascaded data from other DM instances.
Outputs Descriptors (features) from sampled sensor data, other descriptors, or the output of computations. The computation may be characterized as an input‐output mapping.
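The descriptor computation of the DM block can be illustrated with a few classical condition‐monitoring features (RMS, peak, crest factor) computed over one window of sampled data; the window here is a made‐up example:

```python
import math

def descriptors(window):
    """Compute simple condition-monitoring descriptors (features) from one
    window of sampled sensor data: the input-output mapping of the DM block."""
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    peak = max(abs(x) for x in window)
    return {"rms": rms, "peak": peak, "crest": peak / rms}

feats = descriptors([0.0, 1.0, 0.0, -1.0])
```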
Table 4‐3: Key Maintenance Function: State Detection
State Detection (SD)
State Detection (SD) facilitates the creation and maintenance of normal baseline "profiles",
searches for abnormalities whenever new data is collected, and determines the abnormality
zone of the data, if any, for example an alert or alarm.
Functionality The purpose of the State Detection (SD) function (sometimes referred to as "state awareness") is to compare DM and/or DA outputs against expected baseline profile values or operational limits, in order to generate enumerated state indicators for the respective boundary exceedances. The SD function generates indicators, which may be utilized by the Health Assessment function to generate alerts and alarms. When appropriate data are available, the SD block should generate assessments based on the operational context, sensitive to the current operational state or operational environment.
Inputs Current DA and DM outputs; cascaded SD output
Historical DA and DM outputs
Operational data (context, environment, state, external systems data)
Configuration data
Outputs DA control signals and scheduling commands (for optimizing functional interplay)
DM control signals (for optimizing functional interplay)
Data which will contribute to a diagnosis in the health assessment function
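The comparison of a descriptor against baseline-derived limits, yielding an enumerated state indicator, can be sketched as follows (the zone names and the symmetric-deviation rule are assumptions for illustration):

```python
from enum import Enum


class Zone(Enum):
    """Enumerated state indicators generated by the SD block."""
    NORMAL = 0
    ALERT = 1
    ALARM = 2


def detect_state(descriptor, baseline, alert_limit, alarm_limit):
    """Compare a DM descriptor against baseline-derived operational
    limits and return the corresponding abnormality zone."""
    deviation = abs(descriptor - baseline)
    if deviation >= alarm_limit:
        return Zone.ALARM
    if deviation >= alert_limit:
        return Zone.ALERT
    return Zone.NORMAL
```

In a context-sensitive variant, `baseline`, `alert_limit`, and `alarm_limit` would be looked up per operational state (e.g. per machine speed or load regime) rather than fixed.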
Table 4‐4: Key Maintenance Function: Health Assessment
Health Assessment (HA)
The Health Assessment (HA) block diagnoses any faults, and rates the current health of the
equipment or process, considering all state information.
Functionality The Health Assessment (HA) function utilizes expertise from a human or automated agent to determine the current health of the equipment and to diagnose existing fault conditions. It determines the state of health and potential failures by fusing the outputs of the DA, DM, SD and other HA function blocks. HA performs agent‐specific assessments of a component's or system's current health state with the associated diagnoses of discovered abnormal states in the associated operational context. HA results may also include evidence and explanation information.
Inputs DA, DM, SD outputs
Operational data
Configuration data
Human expertise
Automated agent expertise
Outputs Component/system's current health grade
Diagnosed faults and failures with associated likelihood probability
Calculation of the current risk priority number (RPN)
Modelling of ambiguity groups and multiple hypotheses may be included in the output data structures
Explanation detailing the evidence for a diagnosis or health grade
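A simple fusion rule illustrating how per-descriptor state indicators could be combined into a health grade with ranked fault hypotheses (the grading scheme is a hypothetical example, not the PROPHESY method):

```python
def assess_health(zones):
    """Fuse per-descriptor state indicators (0=normal, 1=alert, 2=alarm)
    into a health grade in [0, 1] plus ranked fault hypotheses.

    zones: mapping from descriptor name to its SD zone value.
    Returns (grade, hypotheses) where hypotheses is a list of
    (descriptor, likelihood) pairs, worst evidence first.
    """
    worst = max(zones.values(), default=0)
    grade = 1.0 - worst / 2.0  # 1.0 = fully healthy, 0.0 = alarm
    # Rank descriptors in abnormal zones as candidate fault evidence,
    # which doubles as the "explanation" output of the HA block.
    hypotheses = sorted(
        ((name, z / 2.0) for name, z in zones.items() if z > 0),
        key=lambda item: item[1], reverse=True)
    return grade, hypotheses
```

Returning the ranked evidence alongside the grade mirrors the table's requirement that HA results include explanation information, and the list of competing hypotheses is a crude stand-in for an ambiguity group.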
Table 4‐5: Key Maintenance Function: Prognostic Assessment
Prognostic Assessment (PA)
The Prognostic Assessment (PA) block determines the future health states and failure modes of the equipment and/or process, as well as remaining useful life predictions, based on the current health assessment and projected usage loads.
Functionality Prognostic Assessment (PA) performs agent‐specific assessments of a component's or system's future health state with the associated predicted abnormal states and remaining life for a projected operational context. It may also include evidence and explanation information. It uses a combination of prognostic models and their algorithms, including future operational usage model(s).
Inputs DA, DM, SD, HA and (cascaded) PA outputs
Historical failure data and operational history
Projected failure rates related to operational utilization
Outputs Health grade at a future time
Estimation of the remaining life of an asset given its projected usage profile
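One elementary PA strategy is to extrapolate a degradation trend until it crosses a failure threshold; the sketch below fits a linear trend by least squares, as a deliberately simple stand-in for the prognostic models mentioned above:

```python
def estimate_rul(health_history, failure_threshold=0.0):
    """Extrapolate a linear degradation trend over equally spaced
    health-grade observations and return the remaining useful life
    in observation periods. Returns inf for a flat or improving trend.
    Requires at least two observations.
    """
    n = len(health_history)
    mean_t = (n - 1) / 2.0
    mean_h = sum(health_history) / n
    # Least-squares slope of health grade over time index.
    slope = sum((t - mean_t) * (h - mean_h)
                for t, h in enumerate(health_history))
    slope /= sum((t - mean_t) ** 2 for t in range(n))
    if slope >= 0:
        return float("inf")
    return (failure_threshold - health_history[-1]) / slope
```

For example, a health grade declining by 0.1 per period from 0.7 yields a remaining life of 7 periods before reaching the threshold 0.0; a real PA block would replace this line with a projected-usage-dependent model and attach confidence bounds.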
Table 4‐6: Key Maintenance Function: Advisory Generation
Advisory Generation (AG)
The Advisory Generation (AG) block offers actionable information regarding the maintenance or operation of equipment and/or processes so that their lifetime can be optimized.
Functionality Advisory Generation (AG) integrates information (including safety, environmental, operational goals, financial incentives, etc.) to generate advisories to operations and maintenance and to respond to capability forecast assessment requests.
Inputs DA, DM, SD, HA, PA and (cascaded) AG outputs
Operational data
Configuration data
External constraints (safety, environmental, budgetary, etc.)
Operational history (including usage and maintenance)
Current and future mission profiles
High‐level unit objectives
Resource constraints
Outputs Recommendations, such as
o Prioritized operational and maintenance actions
o Capability forecast assessments
o Modified operational profiles to allow mission completion
Maintenance advisories (expressed as structured work requests)
o Verification of monitoring data
o Performance of additional monitoring
o …
Operational advisories
o Immediate (e.g. notification of alerts and action steps)
o Strategic (e.g. notification of (high) risk of failure)
o Capability forecast
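A toy rule-based AG illustrating how HA and PA results could be mapped to immediate, strategic, and capability-forecast advisories (all thresholds and advisory texts are assumed for illustration):

```python
def generate_advisories(health_grade, rul_periods, planning_horizon):
    """Map HA/PA results to prioritized advisories.

    health_grade:     current health in [0, 1] from the HA block
    rul_periods:      remaining useful life estimate from the PA block
    planning_horizon: periods covered by the current mission profile
    """
    advisories = []
    if health_grade <= 0.25:
        # Immediate operational advisory: alert notification plus action.
        advisories.append("immediate: stop equipment, notify maintenance")
    elif rul_periods < planning_horizon:
        # Strategic advisory: high risk of failure within the horizon.
        advisories.append("strategic: schedule maintenance before RUL expires")
    else:
        # Capability forecast: asset fit for the planned mission profile.
        advisories.append("capability forecast: fit for planned mission profile")
    return advisories
```

A production AG would weigh the external constraints listed above (safety, environmental, budgetary) instead of hard-coded thresholds, and emit structured work requests rather than strings.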
Complementary to the ISO 13374 standard, MANTIS [10] defines two additional maintenance functions: Maintenance Planning (MP) and Maintenance Execution (ME):
Table 4‐7: Key Maintenance Function: Maintenance Planning
Maintenance Planning (MP)
The Maintenance Planning (MP) block generates a work plan on how and when to maintain the
system of interest.
Functionality Maintenance Planning (MP) prepares an optimized plan for immediate and strategic maintenance based on current and historical health data, as well as resource planning data.
Inputs AG outputs
Resource planning data
Outputs Work plan
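A minimal MP sketch pairing prioritized advisories with available resource slots (the first-come scheduling rule is a placeholder for a real resource optimization):

```python
def plan_maintenance(advisories, technician_slots):
    """Pair prioritized AG advisories with available resource slots,
    producing a simple work plan of (advisory, slot) pairs.
    Advisories beyond the available slots are marked 'unscheduled'
    so they can be revisited in the next planning cycle.
    """
    plan = []
    for i, advisory in enumerate(advisories):
        slot = technician_slots[i] if i < len(technician_slots) else "unscheduled"
        plan.append((advisory, slot))
    return plan
```

Because the advisories arrive already prioritized from AG, assigning slots in order preserves that priority; an optimized MP would additionally weigh production schedules and spare-part availability.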
Table 4‐8: Key Maintenance Function: Maintenance Execution
Maintenance Execution (ME)
The Maintenance Execution (ME) block executes the plan produced by MP – in most cases this is done by maintenance staff. However, from the functional viewpoint, a human maintenance worker should be seen as one possible technological solution to accomplish this function.
Functionality Maintenance Execution (ME) executes the maintenance plan.
Inputs MP outputs
Templates
Guidelines
Constraints
Outputs Report data on the outcome
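How the eight blocks chain together can be sketched end to end, with each block reduced to a one-line, purely illustrative stand-in (all thresholds and the RUL model are assumptions):

```python
def run_pdm_pipeline(raw_samples):
    """End-to-end sketch of the DA -> DM -> SD -> HA -> PA -> AG -> MP -> ME
    chain, each block collapsed to a single illustrative step."""
    digitized = [0.01 * s for s in raw_samples]                        # DA: scale
    descriptor = max(abs(v) for v in digitized)                        # DM: peak feature
    zone = 2 if descriptor > 1.0 else (1 if descriptor > 0.5 else 0)   # SD: zone
    health_grade = 1.0 - zone / 2.0                                    # HA: grade
    rul = float("inf") if zone == 0 else 10.0 / zone                   # PA: toy RUL
    advisory = ("repair now" if health_grade <= 0.0
                else "schedule maintenance" if rul < 8.0
                else "no action")                                      # AG: advisory
    work_plan = f"{advisory} @ next available slot"                    # MP: plan
    return f"executed: {work_plan}"                                    # ME: report
```

The point of the sketch is the strict layering: each block consumes only the outputs of earlier blocks, which is what allows the functions to be distributed across edge devices and the PdM platform.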
4.3 Overview Template
To describe the needs of maintenance scenarios, PROPHESY has prepared a template for capturing the relevant information in a structured way, along the aspects depicted in the previous sections.
Figure 4‐8 shows a generic example of this template, exemplifying an overview of the interplay between goals, data sources (internal and external), and maintenance‐related functions. The overview also indicates the data flows in a given maintenance scenario.
Figure 4‐8: Template for capturing goals, functions, data, and data sources & flows
5 PHILIPS Use Cases (Goals, functions, data, data sources & flows)
This section describes, based on the PHILIPS use cases, what the PROPHESY PdM users need to accomplish and what the system has to accomplish for its users. The analysis focuses on the customers’ needs and the goals that shall be reached. It also depicts how the operational needs can be satisfied, considering the existing infrastructure.
Sections 5.1‐5.3 depict the elaboration of the three PHILIPS use cases concerning
“ML to predict RUL on wear‐part level: Cold‐forming Tool” (UC1),
“ML to predict RUL on wear‐part level: 5‐fold Cut‐out Tool” (UC2), and
“AR to Assist Tool Maintenance Operations on wear‐part level: Cold‐forming Tool & 5‐fold Cut‐out Tool” (UC3).
All data and data flows elicited in the PHILIPS use cases initially remain in the realm of PHILIPS. However, some data, not yet clearly identified, go to the PdM platform for the training and development of machine learning algorithms and for KPI extraction.
5.1 PROPHESY UC1 (PHILIPS)
Figure 5‐1: PROPHESY UC1 (PHILIPS) – Goals, functions, data, data sources & flows
5.2 PROPHESY UC2 (PHILIPS)
Figure 5‐2: PROPHESY UC2 (PHILIPS) – Goals, functions, data, data sources & flows
5.3 PROPHESY UC3 (PHILIPS)
Figure 5‐3: PROPHESY UC3 (PHILIPS) – Goals, functions, data, data sources & flows
6 JLR Use Cases (Goals, functions, data, data sources & flows)
This section describes, based on the JLR use cases, what the PROPHESY PdM users need to accomplish and what the system has to accomplish for its users. The analysis focuses on the customers’ needs and the goals that shall be reached. It also depicts how the operational needs can be satisfied, considering the existing infrastructure.
Sections 6.1‐6.3 depict the elaboration of the three JLR use cases concerning
“Condition prediction for MAG Specht 600 Ballscrew” (UC4),
“Condition prediction for hole generation tool on OP90 Cylinder Head” (UC5), and
“AR for technical instructions & training for key components replacement/assembly, and for RUL information” (UC6).
All data and data flows elicited in the JLR use cases initially remain in the realm of JLR. However, some data, not yet clearly identified, go to the PdM platform for the training and development of machine learning algorithms and for KPI extraction (UC4 & UC5). UC6 streams audio and video data via the OCULAVIS SHARE7 platform.
7 https://www.share-platform.de/
6.1 PROPHESY UC4 (JLR)
Figure 6‐1: PROPHESY UC4 (JLR) – Goals, functions, data, data sources & flows
6.2 PROPHESY UC5 (JLR)
Figure 6‐2: PROPHESY UC5 (JLR) – Goals, functions, data, data sources & flows
6.3 PROPHESY UC6 (JLR)
Figure 6‐3: PROPHESY UC6 (JLR) – Goals, functions, data, data sources & flows
7 Conclusion
PROPHESY’s WP2 aims to design a technical draft of the platform to serve as a basis for its technological development and deployment. The engineering blueprint consists of a high‐level architecture of the system of interest and the technical infrastructure that supports the architecture.
This document presents the objectives to be achieved in predictive maintenance and derives the data (and its in‐process and external sources), the data modelling needs, and the functions thereof. It also reflects the system qualities that drive the system architecture most. The work answers the question of what the system of interest should accomplish. The answers to the questions of how the system should function and be built will be given in work package 3.
D2.2 lays the foundations for data collection and analytics, addressing big data solutions,
business goals and key qualities, and considers basic standards like ISO 133748 and MIMOSA.
It also re‐uses results from the MANTIS9 project, provides specifications for data sharing and
interoperability, as well as for data collection and analytics as part of the PROPHESY‐ML. It
additionally extends the reference data flow and functions with a template for recording
information of the PROPHESY use cases’ predictive maintenance scenarios. The document
closes by detailing the needs of the six PROPHESY use cases using the template designed.
8 Condition monitoring and diagnostics of machines – Data processing, communication and presentation
9 http://www.mantis-project.eu/
References
[1] "MIMOSA ‐ An Operation and Maintenance Information Open System Alliance". [Online]. Available: http://www.mimosa.org/. [Accessed: 28‐Mar‐2018].
[2] Pascal Roques, "Systems Architecture Modeling with the Arcadia Method ‐ A Practical Guide to Capella", 1st Edition, ISTE Press – Elsevier, 2017
[3] "Cost Efficient Methods and Processes for Safety Relevant Embedded Systems (CESAR) ". [Online], Available: https://artemis‐ia.eu/project/1‐cesar.html. [Accessed: 28‐Mar‐2018].
[4] ISO 13374, "Condition monitoring and diagnostics of machines ‐ Data processing, communication and presentation", Parts 1‐4
[5] ISO/IEC 19501, "Information technology – Open Distributed Processing – Unified Modeling Language (UML) Version 1.4.2"
[6] Pratikkumar Desai, Amit Sheth and Pramod Anantharam. "Semantic Gateway as a Service Architecture for IoT Interoperability". In: Proceedings of IEEE International Conference on Mobile Services, 2015, pp. 313‐319.
[7] Martín Serrano et al. "Internet of Things ‐ IoT Semantic interoperability: Research Challenges, Best Practices, Recommendations and Next Steps". In: European Research Cluster on the Internet of Things, 2015.
[8] Amit Sheth. "Internet of Things to Smart IoT through semantic, cognitive, and perceptual Computing". In: IEEE Intelligent Systems, Volume 31, Issue 2, 2016, pp. 108‐112.
[9] Riccardo Petrolo et al. "The design of the gateway for the Cloud of Things". In: Annals of Telecommunications, Volume 72, Issue 1‐2, 2016, pp. 31‐40.
[10] "MANTIS: Cyber Physical System based Proactive Collaborative Maintenance". [Online]. Available: http://www.mantis‐project.eu/. [Accessed: 14‐Mar‐2018].
[11] J. Rowley. "The wisdom hierarchy: Representations of the DIKW hierarchy". In: Journal of Information Science, 2007, Volume 33, Issue 2, pp. 163‐180
[12] Anand Veeramani. "Big Data 101 ‐ Creating Real Value from the Data‐Lifecycle". 2014. [Online], Available: https://www.happiestminds.com/whitepapers/Big‐Data‐101‐Creating‐Real‐Value‐from‐the‐Data‐Lifecycle.pdf [Accessed: 30‐Mar‐2018]
[13] CRISP‐DM. [Online], Available: http://crisp‐dm.eu/ [Accessed: 30‐Mar‐2018]
[14] John Gantz and David Reinsel. "The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East". IDC Country Brief 2013. [Online], Available: https://www.emc.com/collateral/analyst‐reports/idc‐digital‐universe‐united‐states.pdf [Accessed: 30‐Mar‐2018]
[15] I.T. Christou, E. Amolochitis, Z.‐H. Tan. “QARMA: A Parallel Algorithm for Mining All Quantitative Association Rules and Some of its Applications”. Knowledge & Information Systems, 2017, Accepted with Minor Revisions.
[16] Ioannis T. Christou. “Coordination of Cluster Ensembles via Exact Methods”. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, Volume 33, Issue 2, pp. 279‐293