
Evaluation of Distributed Query-Based Monitoring over Data Distribution Service

Márton Búr∗, Dániel Varró∗†‡
∗McGill University, Montreal, Canada
[email protected], [email protected]
†MTA-BME Lendület Cyber-Physical Systems Research Group, Budapest, Hungary
‡Budapest University of Technology and Economics, Budapest, Hungary

Abstract—Safety assurance in critical IoT systems is increasingly important as those systems are becoming ubiquitous in various application areas, such as health care and transportation. Recently, novel runtime monitoring approaches started to adapt expressive rule- or query-based languages to capture safety properties, which are evaluated over runtime models. In this paper, we define two distributed query-based runtime monitoring strategies for IoT and cyber-physical systems built on top of a distributed runtime model deployed over the Data Distribution Service as standard communication middleware. We provide a detailed scalability evaluation of our prototype implementation over a case study motivated by an open-source IoT demonstrator.

Index Terms—Runtime models, Distributed queries, Runtime Monitoring, Data Distribution Service (DDS)

    I. INTRODUCTION

Motivation: Critical Internet-of-Things and cyber-physical systems (CPS) [1]–[3] frequently need to provide autonomous behavior with complex interactions with their uncertain environment using intelligent data processing techniques over a heterogeneous computation platform. In such systems, runtime verification (RV) [4], [5] is frequently used to ensure safe operation by monitoring. Recent RV approaches have started to exploit rule-based [6], [7] or query-based [7] techniques over a richer (relational or graph-based) information model.

Runtime models (also known as models@run.time [8], [9]) provide such a rich knowledge representation to capture the runtime state of the application data, services and platforms in the form of typed and attributed graphs [10] to serve as a unifying semantic basis for various kinds of analysis techniques. For example, runtime models have been used for the assurance of self-adaptive systems (SAS) in [11], [12].

In a distributed CPS over a resource-constrained platform, the underlying runtime model also needs to be distributed, and updated with high frequency driven by the incoming sensor information and the changes in network topology.

Problem statement: In our previous work [7], we demonstrated the feasibility of query-based runtime monitoring, i.e., defining runtime monitors captured in a highly expressive declarative graph query language executed over a distributed runtime model deployed to a physical CPS platform. However, this approach used an ad hoc runtime model management layer (using low-level sockets for communication), while only an initial performance evaluation was carried out with few participants and medium-sized models. In [13], we adapted distributed runtime models over a standard middleware of (real-time) Data Distribution Service (DDS) [14].

Objectives: Our current work provides a scalability evaluation of different query evaluation strategies used for runtime monitoring over a distributed runtime model which can be deployed over a resource-constrained fog computing [15] environment. Our experimental evaluation is carried out on a case study motivated by the MoDeS3 CPS demonstrator (Model-Based Demonstrator for Smart and Safe Cyber-Physical Systems) [16], which is an open-source educational platform. In particular, this paper provides the following contributions:

• We define single executor (coordinator-driven) and multiple executor (decentralized) monitoring strategies for evaluating graph queries over distributed runtime models.

• We provide a prototype implementation of both strategies over a standard industrial DDS middleware.

• We carry out a detailed scalability evaluation of this prototype implementation for increasing model size, number of participants and certain configuration parameters.

    II. PRELIMINARIES: DISTRIBUTED RUNTIME MODELS

This section provides the modeling foundations for query-based runtime monitoring. We revisit definitions related to distributed runtime models based on our previous work [7].

    A. Domain-specific modeling languages

In domain-specific (modeling) languages (DSLs), a domain is often defined by a metamodel including main concepts such as classes, attributes, and relations captured in the form of graph models, and a set of structural consistency constraints. A metamodel can be formalized as a vocabulary Σ = {C1, ..., Cn1, A1, ..., An2, R1, ..., Rn3} with a unary predicate symbol Ci for each class, a binary predicate symbol Aj for each attribute, and a binary predicate symbol Rk for each relation in the metamodel.
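The vocabulary formalization above can be sketched in code. The following is an illustrative Python sketch (not the authors' C++ implementation): it stores a model as plain fact sets and exposes the three kinds of predicate symbols of Σ as membership tests. The class and attribute names follow the MoDeS3 metamodel; the fact-set storage scheme itself is an assumption made for this example.

```python
# Sketch: a metamodel vocabulary Sigma over a simple fact store.
class Model:
    def __init__(self):
        self.classes = set()   # unary facts   (ClassName, obj)
        self.attrs = set()     # binary facts  (AttrName, obj, value)
        self.refs = set()      # binary facts  (RefName, src, trg)

    # C_i(o_p) -- unary class predicate
    def C(self, cls, obj):
        return (cls, obj) in self.classes

    # A_j(o_p, a_r) -- binary attribute predicate
    def A(self, attr, obj, val):
        return (attr, obj, val) in self.attrs

    # R_k(o_p, o_q) -- binary reference predicate
    def R(self, ref, src, trg):
        return (ref, src, trg) in self.refs

m = Model()
m.classes |= {("Train", "tr1"), ("Segment", "s1")}
m.attrs.add(("speed", "tr1", 30))
m.refs.add(("on", "tr1", "s1"))

assert m.C("Train", "tr1") and not m.C("Train", "s1")
assert m.A("speed", "tr1", 30)
assert m.R("on", "tr1", "s1")
```

Each predicate evaluates to 1 (True) exactly when the corresponding fact is present in the model, mirroring the 2-valued interpretation defined in the next subsection.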

Example: Figure 1 shows a metamodel for the MoDeS3 demonstrator with Participants (identified in the network by the hostID attribute) which host the DomainElements. A DomainElement is either a Train or a RailroadElement. A Train has a speed attribute, and the train is always located on

978-1-5386-4980-0/19/$31.00 ©2019 IEEE

Fig. 1: Metamodel for MoDeS3 (classes: ModelRoot, Participant (hostID : EInt), DomainElement, Train (speed : EInt), RailroadElement, Segment (isEnabled : EBoolean), Turnout; references: participants [0..*], hosts [0..*], domainElements [0..*], on [1..1], left [0..1], right [0..1], straight [0..1], divergent [0..1])

Fig. 2: Distributed runtime model for MoDeS3

a RailroadElement. Turnouts and Segments are RailroadElements with links to the left and right side RailroadElements. These left and right references are used to describe the actual connections between the different RailroadElements. A Turnout has references to RailroadElements that represent the straight and divergent directions.

    B. Distributed runtime (graph) models

We carry out runtime monitoring over runtime models [7] which serve as a rich knowledge representation of the observations, internal state and environment of the system in operation. The runtime model is distributed among the participants of the physical system and connected via the network [13]. These participants process the data provided by their corresponding sensors, and they are able to perform edge- or cloud-based computations on the data. The runtime model management components are deployed and executed on the platform elements, thus resource constraints need to be respected during allocation. The communication middleware needs to ensure fast and reliable communication between the components.

Formally, a runtime model M = ⟨DomM, IM⟩ is a logic structure over Σ, as in [17], where DomM is a finite set of objects and elementary data values (integers, strings, etc.) in the model. IM is a 2-valued interpretation of predicate symbols in Σ defined as follows (where op, oq represent objects and ar represents an attribute value):

• Class predicates: If object op is an instance of class Ci, then the interpretation of Ci in M evaluates to 1, denoted by [[Ci(op)]]^M = 1, and it evaluates to 0 otherwise.

• Attribute predicates: If there is an attribute of type Aj in op with value ar in M, then [[Aj(op, ar)]]^M = 1, and 0 otherwise.

• Reference predicates: If there is a link of type Rk from op to oq in M, then [[Rk(op, oq)]]^M = 1, otherwise 0.

In our distributed runtime model [7], each participant has up-to-date but incomplete knowledge about the distributed system. Moreover, we assume that each model object is exclusively managed by a single participant, referred to as the host of that element, which serves as the single source of truth. This way, each participant can make calculations (e.g. execute a monitor locally) based on its own view of the system, and it is able to modify the mutable properties of its hosted model elements. Formally, for a predicate P with parameters v1, ..., vn, [[P(v1, ..., vn)]]^{M@p} denotes its value over the distributed runtime model M as stored by host p.

Example: Figure 2 shows a snapshot of the distributed runtime model M for the MoDeS3 system. Participants deployed to three different physical computing units manage different parts of the system. The model captures the three participants (Participant 1 – Participant 3) deployed to the computing units, the domain elements (s1–s8, tu1, tu2, tr1, and tr2) as well as the links between them. Each participant hosts the model elements contained within it in the figure, e.g. Participant 2 is responsible for storing the attributes and outgoing references of objects s3, s4, s5, and tr1.
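The "single source of truth" partitioning described above can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: each object is hosted by exactly one participant, and a predicate over a hosted object is answered from that participant's local view, in the spirit of [[P(...)]]^{M@p}. The participant and object names mirror Fig. 2; the concrete link (tr1 located on s3) and the API are assumptions made for this example.

```python
# Sketch: host-partitioned predicate evaluation over a local view.
class Participant:
    def __init__(self, pid):
        self.pid = pid
        self.on = {}            # hosted Train -> RailroadElement links

    def hosts(self, obj):
        # True iff this participant is the single source of truth for obj
        return obj in self.on

    def On(self, train, element):
        # [[On(train, element)]]^{M@p}: evaluated over this host's view
        return self.on.get(train) == element

p2 = Participant(2)
p2.on["tr1"] = "s3"             # Participant 2 hosts tr1 (cf. Fig. 2)

assert p2.hosts("tr1")
assert p2.On("tr1", "s3") and not p2.On("tr1", "s4")
```

Because only the host may mutate its objects, local monitor execution never races with remote writers; reads of remote objects must go through the middleware, which is exactly what the two strategies in Section III-B organize differently.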

    C. Real-time Data Distribution Service

Both model update and monitoring messages are sent between the components over a middleware using a publish-subscribe protocol that implements the real-time data distribution service (RDDS [18]). RDDS is an extension of the DDS standard [14] of the Object Management Group (OMG) to unify common practices concerning data-centric communication using a publish-subscribe architecture. Furthermore, the middleware also abstracts from the platform- and network-specific details, and it can provide quality of service (QoS) and quality of data (QoD) guarantees. Thus, in the current paper, we assume reliable and timely delivery of messages by the underlying middleware.

    III. QUERY-BASED MONITORING

    A. Graph queries for specifying safety monitors

As proposed in [7], we rely on the VIATRA Query Language (VQL) [19] to capture the monitored properties. This declarative graph query language allows capturing safety properties at a high level of abstraction over the runtime model, which eases the definition and comprehension of safety monitors for engineers and avoids accidental complexity caused by additional platform-specific or deployment details. While graph queries can be extended to express temporal behavior [20], this work is restricted to safety properties where the violation of a property is expressible by graph queries.

The expressiveness of VQL converges to first-order logic with transitive closure, thus it provides a rich language for capturing a variety of complex structural conditions and dependencies. Any match (result) of a graph query over the runtime model highlights a violation of the safety property. Formally, a graph query φ(v1, ..., vn) can be evaluated over a runtime model M (denoted by [[φ(v1, ..., vn)]]^M_Z) along a variable binding from variables to objects and data values in M (i.e., Z : {v1, ..., vn} → DomM) in accordance with the semantic rules defined in [17]. A variable binding Z is called a match if query φ evaluates to 1 over M, i.e. [[φ(v1, ..., vn)]]^M_Z = 1.

Example: In the railway domain, safety standards prescribe a minimum distance between trains on track [21], [22]. The closeTrains monitor definition captures a (simplified) description of the minimum headway distance to identify violating situations where trains have only limited space between each other. Technically, one needs to detect if there are two different trains on two different railroad elements which are connected by a third railroad element. Any match of this pattern highlights track elements where passing trains need to be stopped immediately. Listing 1 shows the monitoring query closeTrains in textual VQL syntax and Listing 2 displays it as a graph formula.

pattern closeTrains(start : RailroadElement, end : RailroadElement) {
    Train.on(train_1, start);
    Train.on(train_2, end);
    train_1 != train_2;
    RailroadElement.connectedTo(start, middle);
    RailroadElement.connectedTo(middle, end);
    start != end;
}

Listing 1: Monitoring goal in the VIATRA Query Language

CloseTrains(start, end) =
    RailroadElement(start) ∧ RailroadElement(end)
    ∧ ∃train1 : Train(train1) ∧ On(train1, start)
    ∧ ∃train2 : Train(train2) ∧ On(train2, end)
    ∧ ¬(train1 = train2)
    ∧ ∃middle : RailroadElement(middle) ∧ ConnectedTo(start, middle)
        ∧ ConnectedTo(middle, end)
    ∧ ¬(start = end)

Listing 2: Monitoring goal as a formula

    B. Monitor execution

Our system-level runtime monitoring framework is hierarchical and distributed. Monitors may observe the local runtime model of a participant, and they can collect information from the runtime models of different participants. Moreover, one monitor may request information from other monitors, thus yielding a hierarchical network.

Monitors compute matches of a graph query φ(v1, ..., vn) along a search plan by assigning model objects to variables v1, ..., vn and evaluating the predicate of the query. A search plan is an ordered list of search operations (e.g. checking the type of objects, navigating along references) that traverses the runtime graph model in order to find all complete variable bindings satisfying the query condition. The search plan for φ(v1, ..., vn) also depends on the initial binding information for the input parameters (provided by the caller), as a variable with a fixed value can greatly reduce the search space.

Example: The pseudocode in Algorithm 1 sketches the implementation for finding all matches of closeTrains if the input parameters (start and end in L1) are unbound. Unbound input parameters and variables that are not assigned a value are represented by NULL. In L2 the set matches serves as the container of the results and is initialized as empty. In the following line (L3), the if-statement ensures that both input parameters are unbound. The search algorithm is shown in L4–L13. It starts with a loop that binds objects of type RailroadElement to variable start. Lines 5–6 show navigation along the train reference starting from start. If an object of type Train is present on start, the search continues by navigating twice along the connectedTo reference in each possible way, again from start, binding variable end (L7–L9). In L10–L11 the presence of a train is checked on end. Finally, if both train1 and train2 are bound to different train objects (L12), a match is found and ⟨start, end⟩ is added to matches in L13.

Algorithm 1 Compute results of closeTrains
 1: input: start, end
 2: matches ← ∅
 3: if start = NULL and end = NULL then
 4:   for start in {all RailroadElement objects} do
 5:     train1 ← start.train()
 6:     if train1 ≠ NULL then
 7:       for middle in start.connectedTo() do
 8:         for end in middle.connectedTo() do
 9:           if start ≠ end then
10:             train2 ← end.train()
11:             if train2 ≠ NULL then
12:               if train1 ≠ train2 then
13:                 matches.add(⟨start, end⟩)

The complete matching algorithm of closeTrains needs to handle initial variable bindings (e.g. end ≠ NULL), which is omitted here for space considerations.
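Algorithm 1 (unbound-parameter case) can be made executable as a short sketch. The following Python version is an illustration over an in-memory adjacency representation; the dict-based model encoding is an assumption made for this example, not the authors' C++ data structures.

```python
# Runnable sketch of Algorithm 1: local search for closeTrains matches.
def close_trains(model):
    """model: dict obj -> {"train": train_obj or None, "connectedTo": [objs]}"""
    matches = set()
    for start in model:                                  # L4: all RailroadElements
        train1 = model[start]["train"]                   # L5: train on start?
        if train1 is not None:                           # L6
            for middle in model[start]["connectedTo"]:   # L7: first hop
                for end in model[middle]["connectedTo"]: # L8: second hop
                    if start != end:                     # L9
                        train2 = model[end]["train"]     # L10: train on end?
                        if train2 is not None and train1 != train2:  # L11-L12
                            matches.add((start, end))    # L13: record match
    return matches

# Two trains separated by a single empty segment: a safety violation.
model = {
    "s1": {"train": "tr1", "connectedTo": ["s2"]},
    "s2": {"train": None,  "connectedTo": ["s1", "s3"]},
    "s3": {"train": "tr2", "connectedTo": ["s2"]},
}
assert close_trains(model) == {("s1", "s3"), ("s3", "s1")}
```

Note that the result is symmetric: the pattern matches once per direction, so each violating pair of segments is reported as both ⟨s1, s3⟩ and ⟨s3, s1⟩.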

In distributed query-based monitoring, a significant challenge in executing search plans is to manage navigation along references and reading attribute values of objects hosted by other participants. Additionally, we wish to allow that monitors initiated by an arbitrary participant can return all violations in the whole system. We have identified two possible approaches to support these requirements:

• Single executor: a single participant executes all steps of the search plan while accessing information stored at other participants, which can be achieved by a synchronous remote procedure call implementation over DDS. The request message conveys i) what reference/attribute value of ii) what object is requested and iii) by which participant, while the reply message encapsulates i) the object identifier and ii) an array of values.

• Multiple executors: the participant which initiates the execution of a monitor will not directly request information from other participants. Instead, if a remote object is encountered and its reference/attribute value is queried, the partial variable bindings are asynchronously passed to the other participant, which continues the execution of the search plan. The request message needs to contain i) an auto-generated unique request identifier, ii) the partial variable binding, iii) the next step in the search plan and iv) the requester participant ID. Once the other participant receives this message, it continues the matching and, finally, it returns all found matches to the requester in a reply message with i) the set of matches and ii) the request identifier.
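The message payloads of the two strategies can be sketched as plain record types. The field names below are assumptions chosen to mirror the enumerations i)–iv) in the text; they are not the actual DDS topic types of the prototype.

```python
# Sketch: message payloads for the two execution strategies.
from dataclasses import dataclass, field

# Single executor: synchronous remote read of one object's slot.
@dataclass
class ReadRequest:
    slot: str            # i) which reference/attribute value
    object_id: int       # ii) which object
    requester_id: int    # iii) which participant requests it

@dataclass
class ReadReply:
    object_id: int       # i) echoed object identifier
    values: list         # ii) array of values

# Multiple executors: asynchronous hand-over of a partial search.
@dataclass
class ContinueRequest:
    request_id: str      # i) auto-generated unique request identifier
    bindings: dict       # ii) partial variable binding
    next_step: int       # iii) next step in the search plan
    requester_id: int    # iv) requester participant ID

@dataclass
class ContinueReply:
    matches: set = field(default_factory=set)  # i) set of found matches
    request_id: str = ""                       # ii) echoed request id

req = ContinueRequest("r-42", {"start": "s1"}, 3, 1)
assert req.bindings["start"] == "s1" and req.next_step == 3
```

The key design difference is visible in the types: the single executor ships data (values of one slot) and blocks, while the multiple-executor scheme ships work (a binding plus a search-plan position) and proceeds asynchronously.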

    IV. EVALUATION

We evaluated the scalability of query-based runtime monitoring using these execution strategies with respect to increasing (a) model size, (b) number of participants, and (c) certain configuration parameters of the DDS middleware.

    A. Benchmark setup

1) CPS monitoring benchmark: To assess the distributed runtime verification framework, we used the MoDeS3 railway CPS demonstrator [16] where multiple safety properties are monitored. They are all based on important aspects of the domain. The monitoring goals of interest are the following:

• CloseTrains: see the example in subsection III-A.
• Derailment: a train approaches a turnout that is set to the opposite direction (causing the train to derail).
• EndOfSiding: a train approaches an end of the track.
• TrainLocations: highlight segments with a train.

Since the original runtime model of the CPS demonstrator has only a total of 49 objects, we scaled up the model by replicating the original elements (except for the computing units). This way we obtained models with 140 – 1.4M objects having similar structural properties as the original one.

2) Technical details: We synthesized the C++ monitors for the MoDeS3 case study. As the DDS implementation, we built upon RTI Connext Professional, which is currently considered to be the leading Industrial IoT connectivity framework.

We used a quality of service profile called High Throughput defined in the RTI Connext DDS library, which provides an initial configuration designed for applications that require reliable data streaming at a high data rate. This profile improves performance by allowing the middleware to buffer application messages and send them together in a single transport layer message, which is a UDPv4 datagram in our case. We slightly tailored this profile to better fit monitoring purposes, namely i) we added a deadline to force flushing the content of the buffer to the network at least every 100 ms to avoid "stuck" messages, and ii) we disabled the shared memory transport to achieve more realistic network throughput with communication over UDPv4.
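The batching policy just described (flush on a size threshold or on a deadline since the first buffered message) can be modeled in a few lines. This is a toy illustration of the policy only, under the assumption of a simple append-and-flush buffer; it is not the RTI Connext API, and the class and method names are invented for this sketch.

```python
# Toy model of size-or-deadline message batching (cf. High Throughput profile).
class BatchingWriter:
    def __init__(self, max_bytes=30_000, deadline_s=0.1):
        self.max_bytes, self.deadline_s = max_bytes, deadline_s
        self.buffer, self.size, self.first_ts = [], 0, None
        self.flushed = []        # each entry stands for one UDPv4 datagram

    def write(self, payload, now):
        if self.first_ts is None:
            self.first_ts = now  # deadline counts from the first buffered message
        self.buffer.append(payload)
        self.size += len(payload)
        self._maybe_flush(now)

    def tick(self, now):         # periodic check, as a middleware would do
        self._maybe_flush(now)

    def _maybe_flush(self, now):
        if not self.buffer:
            return
        if self.size >= self.max_bytes or now - self.first_ts >= self.deadline_s:
            self.flushed.append(b"".join(self.buffer))
            self.buffer, self.size, self.first_ts = [], 0, None

w = BatchingWriter(max_bytes=10, deadline_s=0.1)
w.write(b"aaaa", now=0.00)       # 4 bytes: buffered
w.write(b"bbbbbb", now=0.01)     # 10 bytes total: size-triggered flush
w.write(b"cc", now=0.02)         # buffered again
w.tick(now=0.13)                 # > 100 ms since first message: deadline flush
assert w.flushed == [b"aaaabbbbbb", b"cc"]
```

This also explains the trade-off measured in Section IV-B3: small models pay the deadline latency for half-empty buffers, while large models amortize the per-datagram overhead across many batched messages.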

We ran our distributed monitors in isolation but within the same physical host. We used Docker¹ to manage application containers, and assigned a dedicated virtual network interface to allow communication between the containers. The hosting computer was a single server machine with 32 CPU cores.

Every single data point in the diagrams of this section displays the results of 20 consecutive runs. A data point is obtained by removing the lowest and highest measured value, then taking the average of the remaining 18.
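The per-data-point aggregation described above amounts to a trimmed mean: drop the single lowest and single highest of the 20 runs, then average the remaining 18. A minimal helper, assuming exactly 20 measurements per data point as stated in the text:

```python
# Trimmed mean over 20 runs: discard min and max, average the rest.
def data_point(runs):
    assert len(runs) == 20
    trimmed = sorted(runs)[1:-1]          # remove lowest and highest value
    return sum(trimmed) / len(trimmed)    # mean of the remaining 18

runs = [0.5] * 18 + [0.1, 9.9]            # the two outliers are discarded
assert data_point(runs) == 0.5
```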

¹ https://www.docker.com/

Fig. 3: Query execution times in a network of 10 participants

    B. Experiment results

We only provide an excerpt of our findings in this section, while a complete export of the conducted measurement data is available online².

1) Scalability wrt. model sizes: In our previous work, we conducted an initial evaluation of our query-based monitoring approach deployed to a network of six physical single board computers [7] with 43K objects. Figure 3 summarizes our results over a DDS middleware with 10 participants and multiple executors. It shows scalability up to 1.4 million objects with similar runtimes as reported in [7], which highlights that graph pattern matching scales well when deployed over an industrial DDS middleware. However, the single executor strategy failed to scale, providing 3–4 orders of magnitude slower execution times (for only 2 participants and 14K objects).

2) Scalability wrt. number of participants: The next scenario investigates the impact of increasing the number of participants on query evaluation time. The results for the CloseTrains and Derailment monitors are depicted in Figures 4a and 4b, respectively. The x-axis shows the increasing model sizes while the y-axis depicts execution time for 2, 5, 10 and 20 participants.

The results show a negligible difference in execution times with an increasing number of participants in case of the selected two monitors. This leads us to the conclusion that the system scales well for higher degrees of distribution.

3) Impact of batching DDS messages: The default setting in the High Throughput DDS profile allows application message buffering. This buffer is packed into a single UDPv4 datagram once it reaches 30 kB and is sent over the network. With this set of experiments, we investigated how this buffer size impacts query evaluation performance. The results are shown in Figure 5a for CloseTrains and in Figure 5b for Derailment with two and five participants.

Buffering messages slows down query execution for models with less than 14K objects. However, buffering can improve execution run time for certain monitors over models with more than 14K elements (e.g. in case of Derailment) and will not degrade run time in others (like in case of CloseTrains).

² https://imbur.github.io/cps-query/

(a) CloseTrains monitor execution times
(b) Derailment monitor execution times

Fig. 4: Monitor execution times with multiple participants

(a) CloseTrains monitor execution times
(b) Derailment monitor execution times

Fig. 5: Impact of varying DDS message buffer sizes

    V. RELATED WORK

We overview related papers covering distributed model queries, runtime verification of distributed systems, and IoT monitoring frameworks.

Distributed model queries: The authors of [23] provide a toolset called GreyCat for maintaining and analyzing temporal graphs over arbitrary storage technologies, such as in-memory stores or NoSQL databases. The IncQuery-D framework [24] provides support for distributed incremental model queries. It builds an in-memory middleware layer on top of a distributed model storage system, and uses the Rete algorithm [25] for incremental maintenance of query results.

As a key difference, these distributed graph query evaluation techniques use a cloud-based execution environment, thus they are not directly applicable for a heterogeneous IoT execution platform with low-memory computation units.

Runtime verification of distributed systems: Classical runtime verification approaches supporting distributed systems [26]–[28] frequently use temporal logic for specifying safety properties and expected behavior. These excel in temporal properties over atomic predicates, but they typically fail to express complex structural requirements. Evaluating such complex structural properties over a distributed data model is, in fact, a main added value of our query-based approach.

Brace [29] is a framework for distributed runtime verification of CPSs that imposes low runtime overhead on the system under monitor. However, global properties are monitored by a dedicated central entity, which is different from our solution, where any of the participants can initiate and collect monitoring results from the whole system.

IoT monitoring frameworks: A cloud-centric vision for the Internet of Things is presented in [30], which is a dominant approach in healthcare IoT applications. Typically, data collection from the environment is carried out at the edge of the network, then intelligent processing is performed on a cloud platform [31], [32]. However, our focus differs from this in two ways, namely i) all the processing is done on the network edge, and ii) the monitoring focus is on the properties of the system itself, and not on the environment.

VI. CONCLUSION AND FUTURE WORK

In this paper, we defined two strategies for distributed query-based runtime monitoring over a distributed runtime model used in the context of critical CPS and IoT applications. We used the standard Data Distribution Service as an underlying communication middleware for runtime monitors, which provides QoS and QoD guarantees. We carried out detailed scalability evaluations using our prototype (implemented in C++) in the context of an open educational IoT demonstrator, which showed favorable results both with increasing number of participants (up to 20 participants) and increasing model size (with 1.4 million objects).

As future work, we are planning to enhance the single executor strategy by adding support to asynchronously request information about the runtime model of other participants. This would enable parallel search execution by utilizing the time that is now spent waiting for remote calls to complete.

    ACKNOWLEDGMENT

This paper is partially supported by the MTA-BME Lendület Cyber-Physical Systems Research Group, the NSERC RGPIN-04573-16 project, and the McConnell Memorial Fellowship (as part of the MEDA program). The authors would like to thank Gábor Szárnyas for his help in providing the evaluation environment.

    REFERENCES

    [1] C. Krupitzer, F. M. Roth, S. VanSyckel, G. Schiele, and C. Becker, “A survey on engineering approaches for self-adaptive systems,” Perv. Mob. Comput., vol. 17, pp. 184–206, Feb 2015.

    [2] C. B. Nielsen, P. G. Larsen, J. S. Fitzgerald, J. Woodcock, and J. Peleska, “Systems of systems engineering: Basic concepts, model-based techniques, and research directions,” ACM Comput. Surv., vol. 48, no. 2, p. 18, 2015.

    [3] J. Sztipanovits, X. Koutsoukos, G. Karsai, N. Kottenstette, P. Antsaklis, V. Gupta, B. Goodwine, and J. Baras, “Toward a science of cyber-physical system integration,” Proceedings of the IEEE, vol. 100, no. 1, pp. 29–44, 2012.

    [4] M. Leucker and C. Schallhart, “A brief account of runtime verification,” J. Log. Algebr. Program., vol. 78, no. 5, pp. 293–303, 2009.

    [5] S. Mitsch and A. Platzer, “ModelPlex: Verified runtime validation of verified cyber-physical system models,” in Intl. Conference on Runtime Verification, 2014.

    [6] K. Havelund, “Rule-based runtime verification revisited,” Int. J. Softw. Tools Technol. Transfer, vol. 17, no. 2, pp. 143–170, 2015.

    [7] M. Búr, G. Szilágyi, A. Vörös, and D. Varró, “Distributed graph queries for runtime monitoring of cyber-physical systems,” in 21st International Conference on Fundamental Approaches to Software Engineering (FASE), 2018, pp. 111–128.

    [8] G. S. Blair, N. Bencomo, and R. B. France, “[email protected],” IEEE Computer, vol. 42, no. 10, pp. 22–27, 2009.

    [9] M. Szvetits and U. Zdun, “Systematic literature review of the objectives, techniques, kinds, and architectures of models at runtime,” Softw. Syst. Model., vol. 15, no. 1, 2013.

    [10] H. Ehrig, K. Ehrig, U. Prange, and G. Taentzer, Fundamentals of Algebraic Graph Transformation (Monographs in Theoretical Computer Science. An EATCS Series). Secaucus: Springer, 2006.

    [11] B. H. C. Cheng, K. I. Eder, M. Gogolla, L. Grunske, M. Litoiu, H. A. Müller, P. Pelliccione, A. Perini, N. A. Qureshi, B. Rumpe, D. Schneider, F. Trollmann, and N. M. Villegas, “Using models at runtime to address assurance for self-adaptive systems,” in [email protected], 2011, pp. 101–136.

    [12] T. Vogel and H. Giese, “Model-driven engineering of self-adaptive software with EUREMA,” ACM Transactions on Autonomous and Adaptive Systems (TAAS), vol. 8, no. 4, p. 18, 2014.

    [13] M. Búr, G. Szilágyi, A. Vörös, and D. Varró, “[email protected] for runtime verification of cyber-physical systems,” preprint. [Online]. Available: https://imbur.github.io/cps-query/sttt18-runtime-models.pdf

    [14] G. Pardo-Castellote, “OMG Data-Distribution Service: Architectural overview,” in Proc. 23rd Int. Conf. Distrib. Comput. Syst. Workshops, 2003.

    [15] F. Bonomi and R. Milito, “Fog computing and its role in the internet of things,” in Proceedings of the MCC Workshop on Mobile Cloud Computing, Aug 2012.

    [16] A. Vörös et al., “MoDeS3: Model-Based Demonstrator for Smart and Safe Cyber-Physical Systems,” in Lecture Notes in Computer Science, vol. 10811 LNCS, 2018, pp. 460–467.

    [17] D. Varró, O. Semeráth, G. Szárnyas, and Á. Horváth, Towards the Automated Generation of Consistent, Diverse, Scalable and Realistic Graph Models. Springer, 2018, no. 10800.

    [18] W. Kang, K. Kapitanova, and S. Son, “RDDS: A real-time data distribution service for cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 8, no. 2, pp. 393–405, 2012.

    [19] G. Bergmann, Z. Ujhelyi, I. Ráth, and D. Varró, “A graph query language for EMF models,” in Theory and Practice of Model Transformations - 4th International Conference, ICMT 2011, Zurich, Switzerland, June 27-28, 2011. Proceedings, 2011, pp. 167–182.

    [20] I. Dávid, I. Ráth, and D. Varró, “Foundations for streaming model transformations by complex event processing,” Software and System Modeling, vol. 17, no. 1, pp. 135–162, 2018.

    [21] M. Abril et al., “An assessment of railway capacity,” Transportation Research Part E: Logistics and Transportation Review, vol. 44, no. 5, pp. 774–806, 2008.

    [22] D. Emery, “Headways on high speed lines,” in 9th World Congress on Railway Research, 2011, pp. 22–26.

    [23] T. Hartmann, F. Fouquet, M. Jimenez, R. Rouvoy, and Y. Le Traon, “Analyzing complex data in motion at scale with temporal graphs,” in The 29th International Conference on Software Engineering & Knowledge Engineering (SEKE’17). KSI Research, 2017, p. 6.

    [24] B. Izsó, G. Szárnyas, I. Ráth, and D. Varró, “IncQuery-D: Incremental graph search in the cloud,” in Proceedings of the Workshop on Scalability in Model Driven Engineering, ser. BigMDE ’13. New York, NY, USA: ACM, 2013, pp. 4:1–4:4.

    [25] C. L. Forgy, “Rete: A fast algorithm for the many pattern/many object pattern match problem,” Artificial Intelligence, vol. 19, no. 1, pp. 17–37, Sep 1982.

    [26] A. El-Hokayem and Y. Falcone, “Monitoring decentralized specifications,” in Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2017. New York, NY, USA: ACM, 2017, pp. 125–135.

    [27] P. S. Duggirala, T. T. Johnson, A. Zimmerman, and S. Mitra, “Static and dynamic analysis of timed distributed traces,” in 2012 IEEE 33rd Real-Time Systems Symposium, Dec 2012, pp. 173–182.

    [28] V. K. Garg, “Observation of global properties in distributed systems,” in SEKE. Citeseer, 1996, pp. 418–425.

    [29] X. Zheng, C. Julien, R. Podorozhny, F. Cassez, and T. Rakotoarivelo.

    [30] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, “Internet of Things (IoT): A vision, architectural elements, and future directions,” Future Generation Computer Systems, vol. 29, no. 7, pp. 1645–1660, 2013.

    [31] M. Hassanalieragh, A. Page, T. Soyata, G. Sharma, M. Aktas, G. Mateos, B. Kantarci, and S. Andreescu, “Health monitoring and management using internet-of-things (IoT) sensing with cloud-based processing: Opportunities and challenges,” in 2015 IEEE International Conference on Services Computing, June 2015, pp. 285–292.

    [32] S. Fang, L. Da Xu, Y. Zhu, J. Ahati, H. Pei, J. Yan, Z. Liu et al., “An integrated system for regional environmental monitoring and management based on internet of things,” IEEE Trans. Industrial Informatics, vol. 10, no. 2, pp. 1596–1605, 2014.