14
UvA-DARE is a service provided by the library of the University of Amsterdam (http://dare.uva.nl) UvA-DARE (Digital Academic Repository) A semantic-web approach for modeling computing infrastructures Ghijsen, M.; van der Ham, J.; Grosso, P.; Dumitru, C.; Zhu, H.; Zhao, Z.; de Laat, C. Published in: Computers & Electrical Engineering DOI: 10.1016/j.compeleceng.2013.08.011 Link to publication Citation for published version (APA): Ghijsen, M., van der Ham, J., Grosso, P., Dumitru, C., Zhu, H., Zhao, Z., & de Laat, C. (2013). A semantic-web approach for modeling computing infrastructures. Computers & Electrical Engineering, 39(8), 2553-2565. https://doi.org/10.1016/j.compeleceng.2013.08.011 General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. Download date: 27 Nov 2020

A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

UvA-DARE is a service provided by the library of the University of Amsterdam (http://dare.uva.nl)

UvA-DARE (Digital Academic Repository)

A semantic-web approach for modeling computing infrastructures

Ghijsen, M.; van der Ham, J.; Grosso, P.; Dumitru, C.; Zhu, H.; Zhao, Z.; de Laat, C.

Published in:Computers & Electrical Engineering

DOI:10.1016/j.compeleceng.2013.08.011

Link to publication

Citation for published version (APA):Ghijsen, M., van der Ham, J., Grosso, P., Dumitru, C., Zhu, H., Zhao, Z., & de Laat, C. (2013). A semantic-webapproach for modeling computing infrastructures. Computers & Electrical Engineering, 39(8), 2553-2565.https://doi.org/10.1016/j.compeleceng.2013.08.011

General rightsIt is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s),other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

Disclaimer/Complaints regulationsIf you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, statingyour reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Askthe Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam,The Netherlands. You will be contacted as soon as possible.

Download date: 27 Nov 2020

Page 2: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

Computers and Electrical Engineering 39 (2013) 2553–2565

Contents lists available at ScienceDirect

Computers and Electrical Engineering

journal homepage: www.elsevier .com/ locate/compeleceng

A semantic-web approach for modeling computinginfrastructures q

0045-7906/$ - see front matter � 2013 Elsevier Ltd. All rights reserved.http://dx.doi.org/10.1016/j.compeleceng.2013.08.011

q Reviews processed and recommended for publication to Editor-in-Chief by Guest Editor Prof. Jesus Carretero.⇑ Corresponding author.

E-mail address: [email protected] (M. Ghijsen).

Mattijs Ghijsen ⇑, Jeroen van der Ham, Paola Grosso, Cosmin Dumitru, Hao Zhu, Zhiming Zhao,Cees de LaatSystem and Network Engineering Research Group, Informatics Institute, University of Amsterdam, The Netherlands

a r t i c l e i n f o

Article history:Available online 25 September 2013

a b s t r a c t

This paper describes our approach to modeling computing infrastructures. Our main con-tribution is the Infrastructure and Network Description Language (INDL) ontology. The aimof INDL is to provide technology independent descriptions of computing infrastructures,consisting of processing and storage resources and the network topology that connectsthese resources. INDL also provides descriptions for virtualization of resources and the ser-vices offered by resources. We build our infrastructure model upon the Network MarkupLanguage (NML). Although INDL is a stand-alone model, it can be easily connected withthe NML model. In this paper we show how INDL and NML are used as a basis for modelsused in three different applications: the CineGrid infrastructure, the Logical InfrastructureComposition Layer in the GEYSERS EU-FP7 project and the NOVI federation platform. Fur-thermore, we show the use of INDL for monitoring energy aspects of computing infrastruc-tures and its application for workflow planning on computing infrastructures.

� 2013 Elsevier Ltd. All rights reserved.

1. Introduction

One of the main ingredients in the design, implementation and operation of computing infrastructures is the informationmodel. This information model must describe both the physical infrastructure and its virtualization aspects. In this paper wedescribe the fundamentals of such an information model: the Infrastructure and Network Description Language (INDL). INDLdraws on earlier work on modeling computer networks [1] and on the application of these models in the context of digitalcinema [2]. Furthermore, INDL relates to ongoing efforts in the OGF Network Mark-up Language Working Group (NML-WG),and two European projects: GEYSERS [3] and NOVI [4].

As argued in [5], an infrastructure modeling framework provides the basis for virtualization and management of infra-structure resources. This framework should facilitate the description, modeling, discovery, composition, and monitoringof those resources and is therefore one of the key components of computing and cloud infrastructures. In this paper we focusmainly on how to describe computing and cloud infrastructures in such a way that the resulting model is technology inde-pendent, reusable, easily extensible and linkable to other existing models. To meet these demands we base our modelingapproach and the models itself on Semantic Web technologies [6].

Current research on modeling computer networks has already lead to the development of the Network Markup Language(NML). Due to its Semantic-Web approach we can easily re-use the NML model to provide the network model and combinethis with an earlier stand-alone version of INLD [7]. This earlier version of INDL contained only a limited vocabulary for

Page 3: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

2554 M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565

describing network topologies. In this paper we have improved INDL such that it is no longer a stand-alone model but in-stead it imports and utilizes the full network model offered by NML.

The structure of this paper is as follows: first we describe related work on infrastructure modeling. Next we motivate ourmodeling approach using Semantic Web technology in Section 3. Then, we introduce the main components of our infrastruc-ture model, the Network Markup Language in Section 4 and the Infrastructure and Network Description Language (INDL) inSection 5. In Section 6 we demonstrate the applicability and re-usability of INDL by three case studies in which INDL is ap-plied for modeling specific computing infrastructures. In Section 7 we discuss the usage of INDL for an energy monitoringontology and a QoS and workflow ontology. We conclude our paper in Section 8.

2. Related work

The Open Cloud Computing Interface (OCCI) [8] is an API developed by the OCCI working group within the OGF. OCCI pro-vides a number of UML diagrams to model computing infrastructures including network, storage and computing resourcesbut explicit models for virtualization are lacking.

The Common Information Model (CIM) [9] was developed within the Distributed Management Task Force (DMTF) [10]and it provides a network device information model commonly used in enterprise settings. It is an object-oriented informa-tion model using the Unified Modeling Language (UML) for its descriptions and it attempts to capture descriptions of com-puter systems, operating systems, networks and other related diagnostic information. Although CIM allows for networkdescriptions, they are not very well supported compared to other more network centric models. A successor to CIM is theDEN-ng model, Directory Enabled Networking next generation [11], which extends the CIM model mostly with descriptionsof business rules.

The Virtual private eXecution infrastructure Description Language (VXDL) is being developed by INRIA [12] and Lyatiss[13] and it is mostly focussed on modeling requests for virtual infrastructures. It uses an XML syntax to describe (requestsfor) infrastructures in varying levels of detail. Such a request consists of four parts: a general description, a description ofnon-network resources, a network topology, and the time interval for this reservation. Similar to CIM, VXDL does not providea comprehensive network model.

In [14], the Semantic Resource Description Language (SRDL) is presented. This ontology is aimed at service oriented opti-cal connection services. However this ontology does not contain any properties for network resources which makes themsemantically indistinguishable and as such, unusable within a semantic context. Furthermore, SRDL does not provide anymechanisms to distinguish between physical resources, virtualized resources, and the services that are offered by theseresources.

A comparable approach is the Media Applications Description Language (MADL) [15] which is an RDF ontology with asimilar purpose as the CineGrid Description language [2]. The goal of these ontologies is to describe computing infrastruc-tures for the storage, transport and display of high definition media content. MADL however suffers from the same deficien-cies as SRDL such as a limited ability to model connectivity. For example, MADL lacks a link concept which would allow forthe description of bandwidth between nodes. MADL also does not contain any concepts for describing switches and routers.

Within the GENI project [16], the SFA (Slice-based Facility Architecture) format was developed for PlanetLab [17] to pro-vide infrastructure and request descriptions. The first version of this format was defined in RSpec (Resource SPECification)[18]. This later evolved into ProtoGENI RSpec v2 [19], which has been chosen to be the standard interchange format for allGENI platforms. The RSpec v2 format is a simple XML based format aimed at virtual environments. It allows platforms andusers to describe nodes, their virtualization properties, and a very limited form of network connectivity. The format worksvery well with PlanetLab and compatible systems, but it is difficult to apply at other networks or computing infrastructures.

NDL–OWL is a model developed in the ORCA-BEN [20] project, also within the GENI initiative. Their Semantic-Web basedmodel includes network topologies, layers, utilities and technologies (PC, Ethernet, DTN, fiber switch) as well as cloud com-puting, and in particular software and virtual machine, substrate measurement capabilities and service procedures andprotocols.

3. Semantic-web framework

The Semantic Web [6] was first proposed as a way for machines to comprehend web pages and data. It uses the WebOntology Language (OWL) [21], which is a knowledge representation language used to describe ontologies. OWL uses theResource Description Language (RDF) to describe data in the form of triples. These triples consist of subject, predicate andobject, meaning you provide some information about a certain subject (e.g. a book). The predicate (e.g. has-author) is usedto describe a relation between subject and object (e.g. a person). The subject and object can also be used as the subject or objectof other triples which results in a graph structure where the subjects and objects are the nodes and the predicates are theedges. Ontologies in OWL provide vocabularies for these triplets, defining what kind of predicates there are, which standardtypes are available, and so on. The vocabulary that is defined in an ontology is defined within a single namespace to ensureunique URI’s for the concepts in the ontology.

Fig. 1 shows the import relations between the models covered in this paper. We distinguish between two types of im-ports; a full import in which the full vocabulary of an ontology is imported in another ontology, and a selective import in

Page 4: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

nml.owl

geysers.owl novi.owl cdl.owl

edl.owl

qosawf.owl

qosawf_mapping.owl

indl.owl

full import

selective import

Fig. 1. Connecting and extending models.

M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565 2555

which only specific concepts of an ontology are imported into the other. As can be seen, for the three infrastructure models(CDL, NOVI and GEYSERS) we fully import the vocabularies defined in NML and INDL. For EDL, only a few concepts from NMLand INDL are imported (see Section 7.1) and the QOSAWF workflow ontology does not import any NML or INDL concepts atall. Instead, a separate mapping ontology is created where concepts from NML, INDL and QOSAWF ontologies are alignedusing owl:sameAs or owl:equivalentClass relations. Thus, using the different forms of imports and mappings offered byRDF and OWL, we allow (parts of) models to be easily re-used in other models.

Additionally we see two more advantages of using OWL for describing computing infrastructures. First, because of itsgraph structure, RDF provides a good match to computing infrastructures that can also be seen as large graphs of connectedresources.

Second, OWL provides explicit separation between semantics and syntax. An OWL schema defines the ontology, i.e. theset of classes and the relations that can exist between those classes. Instances of these classes and their properties are thendefined using an OWL syntax. One of the most popular syntaxes for OWL is XML/RDF which uses an XML notation to describeRDF triples, but other terser notations also exist. The clear separation of ontology and syntax also allow users to mix differentontologies without being hindered by the syntax in which these models are described.

4. Network Markup Language

In the past years network architectures, especially in research and educational networks, have seen a gradual shift in thetype of services offered to end users and applications. They have moved from pure packet-switched data delivery services toa mixture of packet-switched and circuit-switched services. These hybrid networks [22] use optical and photonic devices tocreate circuits in a natural manner, e.g. DWDM devices. These circuits are nowadays an essential component in providingintegrated network-computing services in cloud infrastructures. The extensive use of circuits in hybrid networks showedthe need for interchangeable network models that could support the operation of control planes protocols, e.g. GMLPS.

The Network Markup Language Working Group – NML-WG – within the Open Grid Forum has gathered experts in thearea of network topology descriptions and is working towards a first standard. The modeling effort done in NDL [1] has lar-gely been incorporated in the NML schemas and expanded by adopting concepts and models from the PerfSONAR commu-nity [23].

Building on the experience of the different groups, the NML group has created a very generic network description schema[24]. This schema contains the necessary components to build a high-level domain network topology, but also go down tothe technical details and describe all technological capabilities of a network. Fig. 2 shows the basic elements of NML.

Node is a machine part of the network, this can be a router, or just a regular PC. Port describes how a Node is connectedto the network. Link is the connection between two Ports and Service defines the capabilities of a Node or Port, exam-ples are a SwitchingService, or an AdaptationService respectively.

An important difference with previous models is that NML is a completely unidirectional model. A single network port ona physical machine is modeled by two unidirectional NML Ports. Thus, a flexible model is created to describe uni- and bi-directional network concepts.

One of the strengths of NML is that it provides natural support for distribution of information. Independent network oper-ators can create topology descriptions for their infrastructure based on NML; they can publish them (on the Web) and con-trol plane software can independently gather the information needed to create circuits across domains.

5. The Infrastructure and Network Description Language

The goal of the Infrastructure and Network Description Language (INDL) is to capture the concept of virtualization in com-puting infrastructures and to describe the storage and computing capabilities of the resources. The INDL ontology is built

Page 5: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

Ability to create a given Deadaptation Service

group of Portsencoding: URI hasLabelGroup: <LabelGroup>

Port Group

rencoding: URIhasLabelGroup: <LabelGroup>

group of LinksLink Group

encoding: URIhasLabel: <Label>

Logical (virtual) directed data transport between Ports

Link

A device, or partition of a device

Node

encoding: URIhasLabel: <Label>

Logical (virtual) directed interface at a certain layer

Port

Bidirectional Link

Bidirectional Port

version: serialconnected graph

Topology

ordered list of Network Objects

Ordered List

hasTopology

hasNode

implementedBy

hasService

providesPort

providesLink

hasOutboundPort

hasInboundPortis

Sour

ce

kniS

si

isSerialCompoundLink2

2

hasInboundPort

Ability to create a given adaptation

Adaptation Service

Ability to create a Link (cross connect)

Switching Service

hasService

hasLinkha

sLin

kha

sPor

t

hasOutboundPort

Fig. 2. Network Markup Language main classes and properties.

2556 M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565

upon the NML ontology and it uses the nml:Node concept as the basic entity for describing a resource in a computing infra-structure. INDL can be used as a stand-alone model (i.e. without any network descriptions), or it can be used in combinationwith NML by importing the NML ontology into the INDL definition. In the latter case, all NML concepts will become availableto the user of INDL.

Virtualization is modeled using the VirtualNode concept, which is modeled as a subclass of nml:Node (i.e. virtual nodeinherits all properties of node). A virtual node is also implemented on a node (see Fig. 3). The implementing node itself can beeither a physical node or another virtual node. This allows us to create layers of virtualization stacked on top of each other.

Fig. 4 shows how the internal components of a node are modeled by defining nml:Node to consist of a number of Node-Component. The NodeComponent is an abstract class which describes the following essential components of machineswhich are of interest to the user: MemoryComponent shows how much memory is available at a node in GB, Processor-Component to describe how many cores a node has, their speed in GHz and architecture, and StorageComponent to definethe space in GB available for local storage.

The key feature of INDL which makes it reusable and easy to extend is that we have decoupled virtualization, function-ality and connectivity. This allows us to add new functionality (e.g. adding a new type of NodeComponent) without impact-ing how we model its connectivity with other devices or how we model virtualization of the new resource. Furthermore,connectivity and functionality is modeled the same for physical nodes and virtual nodes which allows INDL to describe phys-ical computing infrastructures as well as virtual infrastructures.

nml:Node VirtualNode

implementedBy

implements

rdfs:subClassOf

Fig. 3. INDL: Modeling virtualization of nodes.

Page 6: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

nml:Node NodeComponent

hasComponent

partOf

MemoryComponent

ProcessorComponent

StorageComponent

rdfs:subClassOf

rdfs:subClassOf

rdfs:subClassOf

size

speedcores

architecture

size

Double

Double

Double

Integer

String

Fig. 4. INDL: Modeling internal node components.

M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565 2557

6. INDL extensions

To demonstrate the extensibility, applicability and flexibility of INDL, we will discuss a number of examples on how INDLhas been used to meet the specific challenges imposed by its application to different computing infrastructures: the CineGridinfrastructure, the GEYSERS architecture and NOVI federation.

6.1. Cinegrid Description Language

CineGrid [25] is a multidisciplinary community that exploits the recent advances in computing, network infrastructuresand adapts them to the digital cinema world. The CineGrid collaboration operates a distributed testbed spanning over multi-ple continents. The current nodes are located in the US, Europe, Asia and South America. All the sites are connected usinghigh-speed dynamic photonic networks provided by the Global Lambda Integrated Facility community [26]. Each site oper-ates a number of high capacity storage nodes while some sites also operate visualization and computing facilities that areable to access, display and process the stored content. These resources can be shared by the participants.

The CineGrid Description Language [2]. is an ontology that covers all the types of services and devices along with theirproperties, specific to the CineGrid infrastructure: storage services, video processing, streaming and transcoding services,screens, projectors, tiled displays, etc. In CineGrid exchangeable service and infrastructure descriptions for the supportingdevices are the basis to run applications across multiple domains.

An earlier version of CDL defined concepts for CineGrid nodes (collections of elements under a single administration),hosts (compute resources that provide various services) and clusters (collections of hosts). Using the owl:sameAs propertya connection was made to similar concepts in NDL. To facilitate a better integration between NML and CDL we have adjustedCDL such that it directly imports concepts from INDL and NML. Thus CDL Node has been replaced by NML Topology, CDLCluster is replaced by NML Group and CDL Host is replaced by NML Node. In addition to compute resources CDL also pro-vides concepts for high resolution displays, i.e. Display and Projector.

As mentioned above, CDL provides concepts for the various services encountered in the CineGrid ecosystem: Storage,Visualizer, Transcoder, Streamer. Each concept is further subclassed and extended with properties to accommodatea specific technology. For example a Transcoder service can support only a limited range of codecs. This is expressed throughthe supportsCodec relation. Codecs are also expressed as concepts that inherit the base class Codec.

Fig. 5 describes a small part of the infrastructure available in the CineGrid test bed. It shows one of the CineGrid Amster-dam Exchange storage nodes connected to one of the sites of the DAS4 distributed cluster. These two resources can be usedfor the storage, streaming and processing (compression, decompression, image transformation) of ultra high quality mediathat is typically used in digital cinema. The CGEX node provides two services: storage and visualization (sink of a videostream). The compute cluster also provides two types of services: storage and processing. In order to efficiently representthe capabilities of the compute cluster, it’s nodes compute and storage resources are represented as two aggregated values.The cluster provides also a transcoding service that can encode and decode to and from three different types of image for-mats widely used for high resolution video.

6.2. The NOVI information model

NOVI (Networking innovations Over Virtualized Infrastructures) [4] researches methods, algorithms and information sys-tems that can enable composition and management of isolated (virtual) resources provided by different federated FutureInternet (FI) platforms. The NOVI Information Model provides abstractions and semantics of federated virtualized resources,enabling ontology-based tools and algorithms used by all the various services operating in the architecture. The informationmodel developed in NOVI has two main objectives. First, to model abstractions supporting the federation of the FEDERICAand PlanetLab Europe platforms, and second to define the necessary modeling concepts needed by other Future Internet plat-forms to join the NOVI innovation cloud at a later stage.

Page 7: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

ex:CGEXnml:Node

cdl:LocalStorageService

cdl:SAGEVisualizer

nml:providesServicenml:providesService

cdl:PixlesX

4096

cdl:PixlesY

2160

ex:CGEX_eth0nml:Port

nml:hasInboundPort

ex:DAS4UvAnml:Node

ex:Force10nml:Node

ex:Force10_0/0nml:Port

nml:hasOutboundPort

ex:CGEX_to_Force10

nml:Link

nml:isSource

nml:isSink

ex:SwitchForce10nml:SwitchingServ

ice

nml:providesService

ex:DAS4_eth0nml:Port

ex:Force10_0/1nml:Port

ex:DAS4_to_Force10nml:Link

cdl:DAS4UvA_storageindl:StorageComponent

cdl:DAS4UvA_computeindl:Processing

Component

2000

460800

indl:storageCapacity

indl:CPUFrequeny

indl:hasComponent

indl:hasComponent

cdl:NFSStorageService

nml:providesService

nml:hasOutboundPort

cdl:TranscodeService

nml:providesService

cdl:Jpeg2000_NTTcdl:Jpeg2000

cdl:Jpeg2000_Intopixcdl:Codec

cdl:TIFFcdl:Codec

cdl:supportsCodec

cdl:supportsCodec

cdl:supportsCodec

cdl:CGEX_storageindl:StorageComponent

indl:hasComponent

2000

ex:SwitchLinknml:Link

nml:providesLink

isSource

isSink

isSourceisSink

nml:hasInboundPort

indl:storageCapacity

Fig. 5. CDL example.

2558 M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565

The federation of different platforms goes further than just providing access to resources. Several services from the dif-ferent platforms have to be combined. An example is authentication and policy, so that the user can log in directly to theNOVI layer and use resources that he is entitled to. Another example is a federated monitoring service so as to provide a userwith a single access point for monitoring his virtual infrastructure. Both the requesting and the monitoring services requirean information model describing resources from the different platforms. In NOVI we use a general ontology based on INDL todescribe resources and infrastructure.

Fig. 6 shows the classes that are currently defined in the NOVI Information Model (IM).

� Platform is introduced as a subclass of the NML Group and it describes a particular platform in the NOVI federation.Resources can be linked to the class to denote membership of that platform, and it can provide pointers to other infor-mation such as the management service of that platform.� The NML Service concept is extended with four additional subclasses to describe the services which can be imple-

mented by the NOVI resources. For example requesting a total of X GB of storage. We have different services to describeCPU Processing power, Memory size, Storage size, and network Switching capabilities.

nml:Service

nml:Group

Platform

ProcessingService

MemoryService

StorageService

SwitchingService

rdfs:subClassOfrdfs:subClassOf

rdfs:subClassOfrdfs:subClassOf

rdfs:subClassOf

nml:Node

isContainedIn

Fig. 6. The classes defined in the NOVI resource ontology.

Page 8: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565 2559

Fig. 7 shows a possible request for NOVI. The request contains a very simple topology with two nodes that are connectedto each other, starting April 19th 2012 at 11:00:00 CEST and lasts 48 h from that time. An added constraint here is that thenodes must be on different platforms, in this case PlanetLab and FEDERICA respectively. Furthermore, the node in PlanetLabis requested to be in Amsterdam, and the node in FEDERICA should have a storage service of at least 50 GB. This request canof course be extended with more resources, nodes in other platforms, et cetera.

6.3. The GEYSERS information modeling framework

The information model that was developed in the GEYSERS EU-FP7 project [3] focusses on one specific component in theGEYSERS architecture, the Logical Infrastructure Composition Layer (LICL) [27]. The LICL uses virtualization to decouple thephysical infrastructure from its associated control-plane and enables on-demand provisioning of virtual infrastructures. Akey aspect of the LICL is that it enables composing resources that belong to different infrastructure providers. This aspectwas a main motivation for adopting the semantic approach also in this project. The LICL information modeling frameworkenables the different LICL components – possibly residing at different infrastructure providers – to interact using a commonvocabulary. The initial required ingredients for this information model are: physical resources (IT and network resources),virtual infrastructures and virtual infrastructure requests, energy related aspects, quality of service, and security aspects.The GEYSERS information model imports the INDL ontology including all network related concepts from NML.

6.3.1. Modeling optical switchesOne of the requirements for the GEYSERS information model is to describe optical switches. Because INDL nor NML con-

tain any concepts specifically for describing optical switches, the INDL NodeComponent is extended with a new subclassOpticalSwitchComponent as shown in Fig. 8.

The OpticalSwitchComponent has the following properties:

� inputPower and outputPower describe the input and output range of the power that an optical switch is able to handleand provide.� fiberType describes whether the type of fiber supported by the optical switch is single-mode or multi-mode.� insertionLoss and returnLoss describe the optical loss in power of the optical switch.� configurationTime is the time needed to configure the switch by adding or removing internal switchedTo links between

the Interfaces.� impairment describes the linear and nonlinear effects such as polarization-dependent loss (PDL), polarization-mode dis-

persion (PMD), chromatic dispersion, crosstalk, etc.

To model a ROADM and OXC, specific components for these types of devices have been added as subclasses ofOpticalSwitchComponent.

� addWavelengths describes the number of wavelengths added by a ROADM.� dropWavelengths describes the number of wavelengths dropped by a ROADM.� bypassWavelengths describes the number of wavelengths that are bypassed by a ROADM.

ex:RequestAnml:Topology

ex:NodeBindl:VirtualNode

ex:NodeA indl:VitualNode

ex:FEDERICA novi:Platform ex:PlanetLab

novi:Platform

ex:Amsterdamnml:Location

ex:Storage novi:StorageService

ex:TopLifetimenml:Lifetime

nml:hasNodenml:hasNode

nml:hasService

novi:isContainedIn

nml:existsDuring

nml:locatedAt

novi:isContainedIn

nml:startTime 19 April 2012, 11:00:00 CEST

nml:endTime21 April 2012, 11:00:00 CEST

novi:hasStorageSize

50 GB

ex:NodeA_out nml:Port

ex:NodeB_in nml:Port

nml:hasInboundPortex:NodeA_to_NodeB

nml:Link

nml:isSource nml:isSinknml:hasOutboundPort

ex:NodeA_in nml:Port

ex:NodeB_out nml:Port

ex:NodeB_to_NodeA nml:Link

nml:isSink nml:isSource

nml:hasInboundPort nml:hasOutboundPort

Fig. 7. Example request showing two nodes on federated platforms in NOVI.

Page 9: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

indl:NodeComponent

OpticalSwitchComponent

ROADMComponent

OXCComponent

RangeoutputPowerinputPower

insertionLoss

returnLoss

Duration

gurationTime

FiberType berType

integer

addWavelengths

bypassWavelengthsdropWavelengths

wavelengthsPerFibernumberOfWavelengths

rdfs:subClassOf

rdfs:subClassOf

rdfs:subClassOf

double

Fig. 8. Concepts for modeling optical switches in the GEYSERS information modeling framework.

2560 M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565

� wavelengthsPerFiber to describe the number of wavelengths an OXC is able to put on a single fiber.� numberOfFibers the total amount of (input plus output) fibers an OXC has.

A more elaborate description of these concepts and their properties can be found in [28].

6.3.2. Modeling virtualizationThe GEYSERS LICL acts as a middleware for the decoupling of the physical substrate and the provisioning of a virtual infra-

structure as a service. In order to accomplish this, a complex layer of virtualization needs to be modeled in which we need todistinguish between different types of virtual nodes. Therefore, the INDL VirtualNode concept has been extended withthree new subclasses as shown in Fig. 9.

� LogicalResource represents the aggregation of a number of physical IT resources (i.e. Nodes) into a single resource.Systems like OpenNebula or OpenStack can be used to manage such a cluster of IT nodes and offer its capacity to theupper layer in the LICL system. The LogicalResource also describes the hypervisor that is being used.� ResourcePool is used to indicate a reservation of (part of) a LogicalResources capacity for future use.� VirtualResource represents an instantiated virtual machine or virtual switch. A VirtualResource can support dif-

ferent types of disk-images and also point to a URI where the VM image is located.

Fig. 10 gives a simplified example on how these three concepts are used to represent a virtual infrastructure that isembedded on the physical substrate. It shows two storage nodes and two compute nodes that are both aggregated into aLogicalResource. For the sake of the example, the OXC resource is partitioned into two virtual OXCs, both representedas a VirtualResource. On both of the logical resources, a ResourcePool is implemented to reserve some capacity for fu-ture use. Two VMs, each represented as VirtualResource, are also instantiated and connected via the two virtual OXCs.

indl:VirtualNode

LogicalResource ResourcePool Virtual

Resource

rdfs:subClassOfrdfs:subClassOf

rdfs:subClassOf

Fig. 9. GEYSERS extension of the VirtualNode concept.

Page 10: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

indl:implements

ex:LR1, gey:LogicalResource

ex:Storage1 nml:Node

ex:Storage2nml:Node

indl:implements indl:implements

ex:LR3gey:LogicalResource

ex:Compute1 nml:Node

ex:Compute2 nml:Node

indl:implements

ex:OXC1 nml:Node

ex:StoragePoolgey:ResourcePool

indl:implements

ex:VirtualOXC1gey:VirtualResource

ex:ComputePool gey:ResourcePool

indl:implements indl:implements

ex:VM1 gey:VirtualResource

indl:implements

ex:VM2 gey:ResourcePool

indl:implements

ex:VirtualOXC2, gey:VirtualResource

indl:implements

ex:Storage1out nml:Port

ex:LR2, gey:LogicalResource

indl:implements

ex:OXCin nml:Port

ex:OXCout nml:Port

ex:Compute1in nml:Port

nml:hasOutboundPort nml:hasInboundPort nml:hasOutboundPort nml:hasInboundPort

ex:Storage1_to_OXC nml:Link

ex:OXC_to_Comput1 nml:Link

nml:isSource nml:isSinknml:isSourcenml:isSink

ex:OXC_switch nml:Link

nml:isSinknml:isSource

ex:OXCSwitchService nml:SwitchingService

nml:providesService

nml:providesLink

ex: OXCCompgey:OXC

Component

indl:hasComponent

Fig. 10. GEYSERS virtualization model.

M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565 2561

The example also shows how the NML concepts are used to define a unidirectional connection from one of the storage nodesvia the OXC to one of the computing nodes.

7. INDL usage

In the previous section we have shown how NML and INDL are used to define the models for three different computinginfrastructures. In this section we show how NML and INDL is used by the Energy Description Language (EDL). EDL is a modelaimed at monitoring energy related aspects of computing infrastructures. By connecting EDL to NML and INDL, it can be eas-ily used for monitoring the previously discussed computing infrastructures.

Another usage of NML and INDL is its application for modeling workflows on large scale computing infrastructures. In thiscase, a quality of service and workflow ontology (QOSAWF) is defined that is completely separate from NML and INDL. In-stead of importing NML or INDL concepts, a separate mapping ontology defines the relations between the QOSAWF ontologyand the NML and INDL ontologies.

7.1. EDL – Energy Description Language

The total energy consumption, the energy sources used and pragmatically the cost of power are becoming increasinglyimportant factor in the planning, design and operation of ICT infrastructures. The use of virtualization and the possibilityto rely onto IaaS offerings provide new dimension when trying to manage power and increase greenness of an infrastructure.

The Energy Description Language (EDL) [29] is an OWL ontology that describes all the energy related parameters that areneeded for power management in such large scale (virtual) infrastructures. EDL supports several power management scenar-ios; among others for example in green path searches, where only resources powered by green energy sources can be used;low-power resource selection, where resources with low power characteristics, e.g. solid state disks, low-power processors,are preferred; peak power management, where it is important to track the maximum power the resource consumes to judgewhether this is above a predefined upper bound.

Fig. 11 shows the classes and the main properties of this ontology.The figure shows the relation between NML, INDL and EDL: at the top left of the figure we can see a nml:NetworkObject

and an indl:NodeComponent belong to an edl:Resource, which is monitoredBy an edl:MonitorComponent and has a correspond-ing edl:ResourceEnergyDescr. All NML and INDL resources can be monitored for their power consumption and their energycharacteristics can be described accurately so that power management systems can use all this extra additional data.

7.2. Planning workflow over large scale infrastructure

Data intensive applications, such as collaborative digital media processing [30], have very high requirements for theunderlying services and the network connections, and the resources are often provided by different domains. Reserving re-sources from the infrastructure is a way to create a dedicated environment for executing workflows with very high perfor-mance requirements. The logic of workflow applications can be modeled from different abstractions; from high leveldescription to underlying concrete services that require mapping between different abstractions. Fig. 12 roughly shows

Page 11: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

Fig. 11. The Energy Description Language – EDL and its three main parts: the EnergyMetric class and subclasses (in yellow), the Monitor Component classand its subclasses (in orange) and the ResourceEnergyDescription and it subclasses (in gray). (For interpretation of the references to color in this figurelegend, the reader is referred to the web version of this article.)

Fig. 12. The network service interface and resource selection in workflow applications.

2562 M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565

Page 12: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

Fig. 13. The concepts defined in the QOSAWF schema.

M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565 2563

the mapping between workflow process descriptions to the underlying network that connects devices for hosting servicesand data for supporting the processes.

Developing optimal searching strategies that can efficiently search and reserve network resources to connect the comput-ing and storage services required by the workflow is the core problem. The INDL provides a suitable mechanism to describethe resources in the computing and storage infrastructure, and network connectivity among those resources. NEtWork QoSPlanner (NEWQoSPlanner) is a workflow planning system which is able to select network resources in the context of work-flow composition and scheduling [31]. In the NEWQoSPlanner system, a schema for describing abstract workflows pro-cess(qosawf.owl) was proposed to define the basic concepts of workflow processes, pre/post/execution conditions of aprocess, data, and quality attributes, as shown in Fig. 13.

The original application context of NEWQoSPlanner is CineGrid, and the mapping mechanism between the qosawf.owland the CineGrid is via a separate file called qosawf-ontmap-cdl.owl [31]. The information model of INDL promotes a bettermapping between services, storage and computing elements, and the network topologies compared to the early semanticdescription stack used by the NEWQoSPlanner.

8. Conclusions

We have shown how the Infrastructure and Network Description Language (INDL) is defined as an extension of NetworkMarkup Language (NML), thus creating a extensible, technology independent model of computing infrastructures. The exten-sibility and applicability is demonstrated by using INDL as a basis for modeling three different infrastructure: the CineGridinfrastructure, the NOVI federated platforms and the GEYSERS architecture. The use of Semantic-Web technology in our ap-proach facilitates the creation of models that can be easily connected, stacked and extended by other models.

An interesting comparison can be made between INDL and the NDL–OWL model developed in the ORCA-BEN [20] project.NDL–OWL extends NDL and uses the Web Ontology Language (OWL) instead of RDF. Their ontology models networks topol-ogy, layers, utilities and technologies (PC, Ethernet, DTN, fiber switch) and it is based on NDL. This is also the main differencebetween NDL–OWL and INDL. The approach for modeling network topologies in NDL–OWL is based on NDL while INDL usesthe latest developments in the OGF NML-WG. Furthermore, NDL–OWL covers cloud computing, and in particular softwareand virtual machine, substrate measurement capabilities and service procedures and protocols. In this respect it is a neces-sary next step to try to align INDL and NDL–OWL as much as possible, and we see this as an opportunity to be pursued withina standardization working group.

The focus of this paper is mainly on modeling computing infrastructures. However for future work we foresee a completemodeling suite, including infrastructure descriptions, monitoring models and access policy models. The use of the EnergyDescription Language (EDL) as a monitoring ontology can be seen as a first step in this direction.

Page 13: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

2564 M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565

Acknowledgments

This work was partially supported by the European Commission, 7th Framework Programme for Research and Techno-logical Development, Capacities, Grant No. 257867 – NOVI and Grant No. 248657 – GEYSERS. Furthermore, this publicationwas supported by the Dutch national program COMMIT and the GigaPort 2012 Research on Networks project.

References

[1] van der Ham JJ, Dijkstra F, Travostino F, Andree HMA, de Laat CTAM. Using RDF to describe networks. Future Gener Comput Syst – IGrid 2005: GlobalLambda Integr Facility 2006;22(8):862–7.

[2] Koning R, Grosso P, de Laat CTAM. Using ontologies for resource description in the CineGrid exchange. Future Gener Comput Syst 2011;27(7):960–5.[3] Escalona E, Peng S, Nejabati R, Simeonidou D, Garcia-Espin JA, Ferrer Riera J, et al. GEYSERS : a novel architecture for virtualization and co-provisioning

of dynamic optical networks and IT services. In: Proceedings future network and mobile summit; 2011.[4] NOVI – Networking innovations over virtualized infrastructures. <http://www.fp7-novi.eu/>.[5] Demchenko Y, van der Ham JJ, Ghijsen M, Cristea M, Yakovenko V, de Laat CTAM. On-demand provisioning of cloud and grid based infrastructure

services for collaborative projects and groups. In: The 2011 international conference on collaboration technologies and systems (CTS 2011); 2011.[6] Berners-Lee T, Hendler J, Lassila O. The semantic web. Sci Am 2001:29–37.[7] Ghijsen M, van der Ham J, Grosso P, de Laat CTAM. Towards an Infrastructure description language for modeling computing infrastructures. In: 2012

IEEE 10th international symposium on parallel and distributed processing with applications (ISPA). IEEE; 2012. p. 207–14.[8] Open cloud computing interface (OCCI). <http://occi-wg.org/>.[9] Common information model (CIM). <http://dmtf.org/standards/cim>.

[10] Distributed management task force (DMTF). <http://www.dmtf.org/>.[11] Strassner J. DEN-ng: achieving business-driven network management. In: Network operations and management symposium, 2002. NOMS 2002. 2002

IEEE/IFIP; 2002. p. 753–66.[12] Koslovski GP, Primet PV-B, Charão AS, Vicat-Blanc Primet P, Schwertner Charão A. VXDL: virtual resources and interconnection networks description

language. In: Networks for grid applications lecture notes of the institute for computer sciences. Social informatics and telecommunicationsengineering, 2. Berlin, Heidelberg: Springer; 2009. p. 138–54.

[13] Lyatiss resources. <http://www.lyatiss.com/resources/>.[14] Abosi CE, Nejabati R, Simeonidou D. Design and development of a semantic information modelling framework for a service oriented optical internet. J

Netw 2010;5(11):1300–9.[15] Ntofon O-D, Hunter DK, Simeonidou D. Towards semantic modeling framework for future service oriented networked media infrastructures. In: 2012

4th Computer science and electronic engineering conference (CEEC); 2012. p. 200-5.[16] GENI project. <http://www.geni.net/>.[17] Chun B, Culler D, Roscoe T, Bavier A, Peterson L, Wawrzoniak M, et al. PlanetLab: an overlay testbed for broad-coverage services. SIGCOMM Comput

Commun Rev 2003;33(3):3–12.[18] ProtoGENI RSpec. <http://www.protogeni.net/trac/protogeni/wiki/RSpec>.[19] Slice federation architecture (SFA). <http://groups.geni.net/geni/attachment/wiki/SliceFedArch>.[20] Baldine I, Xin Y, Mandal A, Heermann C, Chase J, Marupadi V, et al. Networked cloud orchestration: a geni perspective. In: Workshop on management

of emerging networks and services; 2010.[21] Web ontology language (OWL). <http://www.w3.org/2004/OWL/>.[22] de Laat CTAM, Radius E, Wallace S. The rationale of the current optical networking initiatives. Future Gener Comput Syst 2003;19(6):999–1008.[23] Hanemann A, Boote JW, Boyd EL, Durand J, Kudarimoti L, Lapacz R, et al. PerfSONAR: a service oriented architecture for multi-domain network

monitoring. In: Benatallah B, Casati F, Traverso P, editors. Service-oriented computing – ICSOC 2005. Lecture notes in computer science, 3826. Berlin/Heidelberg: Springer; 2005. p. 241–54.

[24] Network-markup language schema diagram. <https://forge.ogf.org/sf/go/doc15481>.[25] Grosso P, Herr L, Ohta N, Hearty P, de Laat CTAM. Editorial: CineGrid: super high definition media over optical networks. Future Gener Comput Syst

2011;27(7):881–5.[26] Global lambda integrated facility (GLIF). <http://www.glif.is>.[27] Garcia-Espin JA, Ferrer Riera J, Figuerola S, Ghijsen M, Demchenko Y, Buysse J, et al. Logical infrastructure composition layer, the GEYSERS holistic

approach for infrastructure virtualisation. In: Networking to services. The 28th trans european research and education networking conference, 21–24May, 2012. Keykjavik, Iceland, Selected papers; 2012.

[28] Nejabati R, Escalona E, Peng S, Simeonidou D. Optical network virtualization. In: 2011 15th international conference on optical network design andmodeling (ONDM); 2011. p. 1–5.

[29] Zhu H, van der Veldt K, Grosso P, Zhao Z, Liao X, de Laat C. Energy-aware semantic modeling in large scale infrastructures. In: The IEEE internationalconference on green computing and communications, GreenCom 2012; 2012.

[30] Fortino G, Palau CE, editors. Next generation content delivery infrastructures: emerging paradigms and technologies. IGI; 2012.[31] Zhao Z, Grosso P, van der Ham J, Koning R, de Laat C. An agent based network resource planner for workflow applications. Int J Multiagent Grid Syst 7/6

2011:187–202.

Mattijs Ghijsen received his M.Sc. in Computer Science from the University of Twente in 2004. He has a background in Artificial Intelligence with a focus ondesigning distributed, intelligent and adaptive systems. He has applied his research in the domains of crisis management, smart electricity grids, grid andcloud computing, and the maintainability of distributed software systems.

Jeroen van der Ham is a researcher in the SNE group at the UvA. His research interests are in semantic descriptions of multi-layer and multi-domainnetworks and (virtualised) resources, as well as associated algorithms and architectures. He is also the editor of the NML Schema document and an activecontributor to the NSI group.

Paola Grosso is assistant professor in the SNE group at the UvA. Her research interests are in the area of Future Internet architectures and the underlyingmodels that can support their operations. In particular she studies how hardware and network infrastructures can contribute to the creation of ‘greener’ andsustainable ICT services.

Cosmin Dumitru is a Ph.D. student in the SNE group at the UvA. In 2010 he received his master’s degree in system and network engineering at the sameuniversity with the ‘Cum Laude’ distinction. Cosmin’s research is focused on high-performance networking and distribution and processing of high-definition media over optical networks. Currently he is involved in the Cinegrid project.

Page 14: A semantic-web approach for modeling computing infrastructures · A semantic-web approach for modeling computing infrastructuresq Mattijs Ghijsen , Jeroen van der Ham, Paola Grosso,

M. Ghijsen et al. / Computers and Electrical Engineering 39 (2013) 2553–2565 2565

Hao Zhu has been a Ph.D. student in the SNE group at the UvA since 2011. His research interests are in the area of Green IT. To be particular, he studiesenergy-aware state monitoring system in large scale distributed HPC system as well as energy efficient resource discovery.

Zhiming Zhao studied Computer Science in Nanjing Normal University (NJNU) (1989–1993) and East China Normal University (ECNU) (1993–1996) inChina. He obtained a Ph.D. in computer science from University of Amsterdam (UvA) in 2004. He is currently a researcher at UvA. His research interestincludes scientific workflow management systems, multi agent systems, e-Science, advanced network and Cloud computing.

Cees de Laat is chair of the System and Network Engineering research section at the University of Amsterdam. Research in the section includes optical/switched data-transport in TeraScale eScience applications, Semantic web to describe e-Science infrastructure, distributed authorization architectures,virtualized cyber-infrastructure and Security& Privacy. http://delaat.net/.