[ACM Press the 2003 ACM workshop - Fairfax, Virginia (2003.10.31-2003.10.31)] Proceedings of the 2003 ACM workshop on XML security - XMLSEC '03 - Concept-level access control for the

Concept-level Access Control for the Semantic Web

Li Qin, Vijayalakshmi Atluri

MSIS Department and Center for Information Management, Integration and Connectivity (CIMIC)

180 University Ave. Newark, NJ 07102

{liqin,atluri}@cimic.rutgers.edu

ABSTRACT Recently, the notion of the Semantic Web has been introduced to define a machine-interpretable web targeted for automation, integration and reuse of data across different applications. Under the Semantic Web, web pages are annotated by concepts that are formally defined in ontologies along with the relationships among them. Because information pertaining to different concepts has varying access control requirements, we propose an access control model for the Semantic Web that is capable of specifying authorizations over concepts defined in ontologies and enforcing them upon data instances annotated by the concepts. Semantic relationships among concepts play a key role in making access control decisions because, based on a relationship, one may infer information contained in one concept node from that of another. Therefore, we first identify the important domain-independent relationships among concepts, categorize them and propose propagation policies based on these categories of relationships. In particular, we allow propagation of authorizations based on the semantic relationships among concepts to prevent illegal inferences. We then show how concept-level security policies can be represented in an OWL-based access control language. Finally, we demonstrate how users’ requests can be handled under our access control model. Our concept-level model is especially suitable for the specification and administration of access control over semantically related web data under the Semantic Web even if they conform to different DTDs or use different tag names.

Categories and Subject Descriptors D.4.6 [Operating Systems]: Security and Protection – access controls.

General Terms Security

Keywords Semantic Web, access control, ontology, concept, propagation

1. INTRODUCTION Using machines to interpret and process information on the World Wide Web is extremely difficult because the semantics of web data are expressed for human consumption only. As an extension of the current web, the Semantic Web [4] aims to build web pages annotated by concepts that are formally defined in ontologies, along with the relationships among those concepts, so that machines can understand them.

As an important component of the Semantic Web, ontologies bring structure to the meaningful contents of web pages. Not only are links between web pages semantic, but concepts defined in ontologies are also semantically linked. Ontologies, along with the relationships among them, form a web of ontologies. As a result, the functioning of web services can be enhanced in various ways. For example, the accuracy of web searches can be greatly improved if only web pages that refer to a precise concept are checked. Data objects can be clustered by attaching themselves as instances to concepts.

Figure 1 shows part of an ontology for weapons as a directed labeled graph. It represents concepts such as ‘weapon’, ‘conventional weapon’, etc. in ovals and the relationship from a concept to another concept (or to a primitive data type, shown in a rectangle) as the label on the outgoing edge of the concept. We will use this as an example for our discussions throughout the paper.

Due to the varied sensitivities of information, appropriate access control mechanisms for the Semantic Web must be in place in order to ensure subjects can access only and all the information authorized to them. Towards this end, we propose an access control model for the semantic web that is capable of specifying authorizations over concepts defined in ontologies and enforcing them upon data instances annotated by the concepts.

There has been research on access control to XML documents [7, 9, 11], where access authorizations can be specified at fine granularity of elements in an XML document or a DTD. The XML document is considered as the instance to its DTD, the schema. We claim DTD as the structural schema to the document since DTD basically specifies the legal elements and attribute lists allowed and how they are structured or contained in the document. Access control to XML documents is the foundation based on which access control for the Semantic Web should be built since the Semantic Web is built upon XML. However, when applied to the Semantic Web, the existing access control models to XML have to be extended to take into consideration the higher layers of the Semantic Web architecture [3] above XML (see Figure 2), such as RDF and ontologies. More importantly, in

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM Workshop on XML Security, October 31, 2003, Fairfax, VA, USA. Copyright 2003 ACM 1-58113-777-X/03/0010…$5.00.


Figure 1. Part of an Ontology for Weapons.

practice, we need not only to enforce access control at the element, document or DTD level, but also to control access at the concept level. For example, one may naturally want to restrict web data on ‘sex’ to people aged 18 and above, or to deny information on ‘chemical weapons’ to people from certain countries. Instead of specifying authorizations over the elements in each related document or DTD, we propose specifying access control over concepts like ‘sex’ and ‘chemical weapons’, and enforcing them upon all their data instances. These concepts are defined in ontologies, which constitute an important component of the Semantic Web. We define ontologies as the semantic schema for web data that are annotated by the concepts in ontologies and attached to them as instances. Note that each valid XML document conforms structurally to one particular DTD (one-to-one). However, it may use concepts from multiple ontologies for annotation (one-to-many). Even if web data are spread across documents, conform to different structures (DTDs), or use different tag names for annotation, as long as they are semantically related through concepts in ontologies, access to them can be regulated and enforced in a consistent manner through our proposed concept-level access control model. This model is a better alternative to, or at least a good complement to, element-level access control under the infrastructure of the Semantic Web.

Access control to XML documents allows specifications on how the access to one element can be propagated to other elements, which are usually structurally related to the specified element. In particular, the hierarchical structure of XML documents and DTDs leads to propagation policies on whether an authorization specified at a given level can propagate to lower levels [9]. For example, if a user has access to an element, he/she may also be able to access its sub-elements if access to the element has been specified to allow propagation to its sub-elements. Access control for the Semantic Web requires that propagations be based on the semantic relationships among concepts or ontologies. This is essential for the following compelling reasons:

(1) Security can be violated if access control to each concept is considered separately, ignoring the interrelationships among concepts. Illegal inferences about the instances of a concept can be made from knowledge of the instances of its related concepts, along with the relationships between them. For instance, suppose a subject is denied access to ‘special weapon’ but is granted access to ‘missile’. Since ‘missile’ is defined as a kind of ‘special weapon’, instances of ‘special weapon’ can be illegally inferred from those of ‘missile’.

(2) Information may be inaccessible to authorized subjects if relationships among concepts are not considered. While traditional access control mechanisms take a conservative approach and ensure that only the information authorized to be viewed should be revealed to a subject, access control to the semantic web needs to additionally ensure that all the information authorized to be viewed should be revealed to a subject. Ignoring the latter requirement may be detrimental if wrong decisions are made based on incomplete information. For example, a military officer needs to check the information on ‘special weapon’, to which he is granted the access. However, his decision may be biased if he is unaware that some information he needs is labeled by the equivalent concept ‘non-conventional weapon’ whose access is denied to him. As another example, a doctor needs to check the medical records related to some disease, which may be labeled with different but equivalent names. The doctor’s diagnosis may be affected if he is given access to those related to only some of these equivalent concepts.

[Figure 1 content: concept nodes Weapon, Conventional Weapon, Special Weapon, Non-Conventional Weapon, Nuclear Weapon, Biochemical Weapon, Biological Weapon, Chemical Weapon, Missile, Nuclear Missile, Rifle, Trigger, Fuel and Range; data type Positive Integer; edge labels Is_A, Equivalent_To, Part_Of, Union_Of, Intersection_Of, Complement_Of, Consumes, Has_Range and Range_Value.]


(3) Propagation can reduce the number of explicit access control specifications and help the security administrator derive secure and consistent authorizations.

(4) Propagation eases access control administration as ontologies evolve. Since ontologies evolve over time, if the relationship between concepts is changed or new concepts are added, the authorization propagations can simply be re-evaluated so that appropriate authorizations are applied accordingly. For example, if a concept equivalent to an existing concept is added to an ontology, the security administrator does not need to explicitly specify authorizations for the new concept: because equivalent concepts have the same security requirements, the authorizations on the existing concept are propagated to it when the propagations are re-evaluated.

We will elaborate on the relationships among concepts and define corresponding propagation policies for our access control model. Controlling access at the concept level and propagating authorizations based on the interrelationships among concepts provide great flexibility and expressive power for controlling access to semantically related data under the Semantic Web. When data instances are under independent security administration, our access control model can help identify inconsistencies and prevent security violations.

The objective of our work is to define a concept-level access control model, with support for propagation based on the semantic relationships among concepts, to regulate access to data instances distributed over the Semantic Web. With the architecture of the Semantic Web as its basis, our access control model can greatly simplify the specification and administration of access control for the Semantic Web by exploiting the semantic relationships among concepts. In this paper, we show how concept-level security policies can be represented in an OWL-based access control language. In addition, we demonstrate how users’ requests can be handled under our access control model. Our concept-level model is especially suitable for the specification and administration of access control over semantically related web data under the Semantic Web even if they conform to different DTDs or use different tag names.

The remainder of this paper is organized as follows. We discuss the preliminaries relevant to the Semantic Web in Section 2. Then, we identify and classify the relationships among concepts in Section 3. In Section 4, we present our access control model and the rules under which authorizations can be propagated. We discuss how different types of users’ requests are handled under our model in Section 5. In Section 6 we briefly present the related work and present conclusions and future research in Section 7.

2. PRELIMINARIES The Semantic Web The World Wide Web has become one of the major sources of information and services. It is decentralized, dynamic, large and growing at an accelerating pace. However, the full potential of the current web remains untapped because the 'meaning' of the information is expressed for human consumption only. To overcome this, the notion of the Semantic Web has been introduced to define a machine-interpretable web targeted for automation, integration and reuse of data across different

applications. “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [4] The architecture of the Semantic Web is shown in Figure 2 [3]. The functioning of the Semantic Web will depend on a number of technologies that are in place or under development. Some important ones include XML, RDF and ontologies.

Figure 2. Architecture of the Semantic Web.

Extensible Markup Language (XML) Unlike HTML, which has a predefined vocabulary of tags for displaying data, XML is designed to describe data by enabling XML authors to invent self-descriptive tags to annotate web pages or sections of text on a web page and apply arbitrary structure to their web documents. In other words, XML is all about the description of data while HTML describes data and its presentation. Basically, XML documents consist of elements, attributes and texts. An XML document is valid if it conforms to a Document Type Definition (DTD), which defines the legal elements of an XML document and the number of occurrences for these elements. DTD can be incorporated within XML data or exist as an external document.

Resource Description Framework (RDF) Using XML as the interchange format, RDF [14] was developed as a model for expressing the metadata of resources, where a resource can be anything that has identity. RDF statements are triples consisting of a subject (the resource being described), a predicate (a property associated with the resource) and an object (the value of the property). The subject and predicate are resources, while the object is either a resource or a literal; each resource is identified by a Uniform Resource Identifier (URI).
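The triple structure described above can be illustrated with a minimal Python sketch. The `Triple` and `Literal` types, the instance name `m1` and the property `rangeValue` are hypothetical names chosen for illustration; only the weapons namespace and the standard `rdf:type` URI come from existing sources.

```python
from typing import NamedTuple, Union


class Literal(NamedTuple):
    """A literal value such as a number or string (not a resource)."""
    value: Union[str, int]


class Triple(NamedTuple):
    subject: str               # URI of the resource being described
    predicate: str             # URI of the property
    obj: Union[str, Literal]   # URI of a resource, or a literal


WPN = "http://cimic.rutgers.edu/ontology/weapon#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical statements about an instance named "m1".
triples = [
    Triple(WPN + "m1", RDF_TYPE, WPN + "Missile"),
    Triple(WPN + "m1", WPN + "rangeValue", Literal(300)),
]


def objects_of(subject, predicate, store):
    """Return the objects of all triples matching subject and predicate."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]
```

Here the first statement has a resource as its object, while the second has a literal, matching the two kinds of object that RDF allows.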

Ontologies According to the W3C Web Ontology Working Group charter, "An ontology defines the terms used to describe and represent an area of knowledge." Generally, an ontology contains a description of important concepts in a domain, crucial properties of each concept and restrictions on properties such as property cardinality, property value type, domain and range of a property.

Ontologies have become a basic component of the Semantic Web since they define and relate concepts used to describe the data on the web, thus providing interpretations for the content of web pages. There is a pointer between a semantic web page and an ontology or ontologies so that terms existing in the web page can be clearly specified and inferences based on the ontology definitions can be made. Ontologies are distributed, reusable and


web accessible for semantic interoperability among different applications. OWL [13] is believed to be the most promising ontology language based on the syntax of RDF. Part of the weapons ontology may be represented in OWL as shown in Figure 3.

<?xml version="1.0"?>
<!DOCTYPE owl [
  <!ENTITY wpn "http://cimic.rutgers.edu/ontology/weapon#">
  <!ENTITY ful "http://cimic.rutgers.edu/ontology/fuel#">
]>
<rdf:RDF
  xmlns      = "&wpn;"
  xmlns:wpn  = "&wpn;"
  xmlns:ful  = "&ful;"
  xmlns:owl  = "http://www.w3.org/2002/07/owl#"
  xmlns:rdf  = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs = "http://www.w3.org/2000/01/rdf-schema#"
  xmlns:xsd  = "http://www.w3.org/2000/10/XMLSchema#">
  …
  <owl:Class rdf:ID="Missile">
    <rdfs:subClassOf rdf:resource = "#SpecialWeapon" />
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource = "#hasRange" />
        <owl:allValuesFrom rdf:resource = "#Range" />
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource = "#consumes" />
        <owl:someValuesFrom rdf:resource = "&ful;fuel" />
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
  …

Figure 3. Part of Weapons Ontology represented in OWL.

3. CONCEPTS AND THEIR RELATIONSHIPS Our access control model for the Semantic Web specifies access authorizations based on concepts and the relationships among them, and enforces these authorizations upon the data annotated by these concepts. Each concept under the Semantic Web is defined in an ontology. We denote the set of ontologies as O = {o1,o2,…,on} and the set of concepts in O as C = {c1,c2,…,cn}. The semantics of a concept c is expressed in O by its relationships with other concepts or primitive data types. We call the relationships that organize concepts into a hierarchical structure taxonomies, T = {t1,t2,…,tn}, and the relationships other than taxonomies properties, P = {p1,p2,…,pn}. Each property pi conforms to certain restrictions, such as restrictions on its cardinality, with the set of restrictions denoted as R = {r1,r2,…,rn}, and takes related concepts or primitive data types as its values, with the set of values denoted as V = {v1,v2,…,vn}. Each concept c has members that are grouped together because they share the common properties, P, of c. These members are called instances of c, and we write I = {i1,i2,…,in} for the set of instances. We define concepts below.

Definition 1: Concept. A concept is defined as a tuple c = (O, T, (P, R, V), I), where O is the ontology in which c is defined; T is the set of taxonomies; P is a set of properties; R is a set of restrictions on P; V is a set of values for P, each of which is either a concept or a primitive data type; and I is a set of instances of c.

For example, considering the ontology of weapons in Figure 1, the concept ‘missile’ is defined as a kind of special weapon consuming fuel (which is defined in the ontology http://cimic.rutgers.edu/ontology/fuel) and characterized by a number as its range in the ontology http://cimic.rutgers.edu/ontology/weapon. Therefore, ‘missile’ is a subclass of ‘special weapon’ with two properties defined here: Consumes, taking ‘fuel’ as its value and Has_Range, taking ‘range’ as its value. An instance of the concept ‘missile’ can be Iraq’s missile. Also referring to the OWL representation in Figure 3, we denote this concept as a tuple, missile = (http://cimic.rutgers.edu/ontology/weapon, (subclassOf, special weapon), (Has_Range, allValuesFrom, range), (Consumes, someValuesFrom, http://cimic.rutgers.edu/ontology/fuel#fuel), Iraq’s missile).
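The tuple of Definition 1 can be sketched as a small Python record. This is only an illustration under stated assumptions: the class name `Concept`, its field names and the helper `values_of` are hypothetical, and each (P, R, V) entry is stored as one (property, restriction, value) triple. The values are those of the ‘missile’ example above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Concept:
    ontology: str                            # O: ontology in which the concept is defined
    taxonomies: List[Tuple[str, str]]        # T: (taxonomic relation, related concept)
    properties: List[Tuple[str, str, str]]   # (P, R, V): (property, restriction, value)
    instances: List[str] = field(default_factory=list)  # I: instances of the concept


missile = Concept(
    ontology="http://cimic.rutgers.edu/ontology/weapon",
    taxonomies=[("subclassOf", "special weapon")],
    properties=[
        ("Has_Range", "allValuesFrom", "range"),
        ("Consumes", "someValuesFrom",
         "http://cimic.rutgers.edu/ontology/fuel#fuel"),
    ],
    instances=["Iraq's missile"],
)


def values_of(concept, prop):
    """Return the values V that the given property takes for this concept."""
    return [v for (p, _restriction, v) in concept.properties if p == prop]
```

For example, `values_of(missile, "Consumes")` returns the URI of the ‘fuel’ concept, which is defined in a different ontology from ‘missile’.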

The data annotated by a concept are the instances of it. In this paper, we define the relationships among concepts in terms of their instances. Relationships among concepts in the same ontology are explicitly specified in the ontology. It is worth noting that these relationships may also exist among concepts from different ontologies, which may be used by independent data sources. Though articulating the relationships among concepts from different ontologies is out of the scope of this paper, our relationship classification can also be applied to relationships among concepts across ontologies. The inference risk increases greatly when ontologies overlap with each other and a large set of documents is aggregated. On the other hand, the advantages of authorization propagation are even greater given the relationships among concepts across ontologies. Below, we identify the basic domain-independent relationships among concepts in Section 3.1 and further classify them in Section 3.2.

3.1 Basic Relationships Since we apply the access authorization specified on a concept to all its instances, we identify some important relationships among concepts in terms of the relationships among their instances.

Definition 2: Superclass/Subclass. Concept ci is a subclass of cj and cj is a superclass of ci iff each instance of concept ci is also an instance of concept cj, but not vice versa. We denote this as ci⊂cj.

Both the superclass and subclass relationships are transitive. These are the typical relationships in taxonomies. For example, ‘conventional weapon’ is a superclass of ‘rifle’ and ‘rifle’ is a subclass of ‘conventional weapon’, i.e. rifle⊂conventional weapon.

Definition 3: Equivalence. Concept ci is equivalent to concept cj, denoted as ci≡cj, iff each instance of concept ci is also an instance of concept cj and vice versa.

This relationship is reflexive, symmetric and transitive. For example, ‘special weapon’ is an equivalent concept to ‘non-conventional weapon’, i.e. special weapon≡non-conventional weapon.

Definition 4: Part/Whole. Concept ci is part of concept cj, denoted as ci∈{cj}, iff each instance of cj has an instance of ci.

Both the part and the whole relationships are transitive. For example, ‘trigger’ is part of ‘rifle’, i.e. trigger∈{rifle}.


Definition 5: Overlap/Intersection. Concepts ci, i = 1,…,k, have an intersection cj iff each instance of cj is an instance of each ci, i = 1,…,k and each instance shared by all cis, i = 1,…,k, is an instance of cj. We denote this as cj = c1∩c2∩…∩ck. Concepts ci, i = 1,…,k, are overlapping iff c1∩c2∩…∩ck≠φ.

For example, the concept ‘nuclear missile’ is defined as the intersection of ‘nuclear weapon’ and ‘missile’, i.e. nuclear missile=nuclear weapon∩missile, which means that each instance of ‘nuclear missile’ has to be an instance of both ‘nuclear weapon’ and ‘missile’, and each common instance of ‘nuclear weapon’ and ‘missile’ is also an instance of ‘nuclear missile’.

Definition 6: Sub-concept/Union. A concept ci is the union of its sub-concepts cj, j = 1,…,k, iff each instance of cj, j = 1,…,k, is an instance of ci and each instance of ci is an instance of at least one of the concepts cj, j = 1,…,k. We denote this as ci = c1∪c2∪…∪ck.

For example, ‘biochemical weapon’ is defined as the union of ‘biological weapon’ and ‘chemical weapon’, i.e. biochemical weapon=biological weapon∪chemical weapon. Therefore, each instance of ‘biological weapon’ or ‘chemical weapon’ is also an instance of ‘biochemical weapon’, and each instance of ‘biochemical weapon’ has to be an instance of at least one of the sub-concepts ‘biological weapon’ and ‘chemical weapon’.

Definition 7: Complement. Concept ci is the complement of concept cj iff each instance of ci is not an instance of cj and vice versa, i.e. ci∩cj=φ, and the union of ci and cj constitutes the universe. We denote this as ci=~cj.

This relationship is symmetric. For example, ‘conventional weapon’ is defined as the complement of ‘special weapon’, i.e. conventional weapon=~special weapon.

These relationships are domain independent. Users of our access control model can also define domain-specific relationships that should be considered for propagation.
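Because Definitions 2, 3, 5, 6 and 7 are stated in terms of instances, they can be checked directly on instance sets. The following Python sketch does so; the function names and the instance identifiers (`r1`, `m1`, and so on) are hypothetical, and the part/whole relationship of Definition 4 is omitted because it is not a set-inclusion relationship.

```python
def is_subclass(ci, cj):
    """Definition 2: every instance of ci is in cj, but not vice versa."""
    return ci < cj  # proper subset


def is_equivalent(ci, cj):
    """Definition 3: ci and cj have exactly the same instances."""
    return ci == cj


def is_intersection(cj, concepts):
    """Definition 5: cj holds exactly the instances shared by all concepts."""
    return cj == set.intersection(*concepts)


def is_union(ci, sub_concepts):
    """Definition 6: ci holds exactly the instances of its sub-concepts."""
    return ci == set.union(*sub_concepts)


def is_complement(ci, cj, universe):
    """Definition 7: ci and cj are disjoint and together cover the universe."""
    return ci.isdisjoint(cj) and (ci | cj) == universe


# Hypothetical instance sets for some concepts of the weapons ontology.
rifle = {"r1"}
conventional = {"r1", "k1"}
missile = {"m1", "m2"}
nuclear_weapon = {"m2", "n1"}
nuclear_missile = {"m2"}
biological = {"b1"}
chemical = {"c1"}
biochemical = {"b1", "c1"}
special = missile | nuclear_weapon | biochemical
non_conventional = set(special)
universe = conventional | special
```

With these sets, `is_subclass(rifle, conventional)`, `is_equivalent(special, non_conventional)`, `is_intersection(nuclear_missile, [nuclear_weapon, missile])`, `is_union(biochemical, [biological, chemical])` and `is_complement(conventional, special, universe)` all hold, mirroring the examples given with the definitions.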

3.2 Relationship Classification Given the relationships among concepts, instances of one concept can be inferred from the instances of another, with or without the help of other related concepts. We write ci⇒cj if instances of cj can be inferred from those of ci, and ci!⇒cj otherwise. Based on whether and how an inference can be made, we classify relationships among concepts into three categories: inferable, partially inferable and non-inferable.

Definition 8: Inferable Relationship (IR). The relationship from concept ci to cj is of IR if ci⇒cj. We say the inferable relationship is weak, ci⇒wcj, if instances of ci may not expose all the properties of cj. Otherwise, it is strong, ci⇒scj.

Inferable Relationships (IRs) may include the following:

(i) Equivalence relationship between two concepts. That is, if ci≡cj, then ci⇒scj (and cj⇒sci).

(ii) Relationship from the whole concept to the part concept. That is, if cj∈{ci}, then ci⇒scj.

(iii) Relationship from the subclass concept to the superclass concept. That is, if ci⊂cj, then ci⇒scj. This relationship also includes: (1) Relationship from a sub-concept to the union concept. That is, if cj = c1∪c2∪…∪ck, then ci⇒scj, i = 1,…,k. (2) Relationship from the intersection concept to any of the overlapping concepts. That is, if ci = c1∩c2∩…∩ck, then ci⇒scj, j = 1,…,k.

For example, ‘nuclear missile’ is the intersection of ‘nuclear weapon’ and ‘missile’. Instances of ‘nuclear weapon’ and ‘missile’ can be inferred from those of ‘nuclear missile’.

An example of the weak inferable relationship can be the relationship from the superclass concept to the subclass concept if instances of the subclass concept can be inferred from those of the superclass concept.

Definition 9: Partially Inferable Relationship (PIR). The relationship from concept ci to cj is of PIR if ci∧ck1∧…∧ckn⇒cj, where k1,…,kn ≠ i,j and ck1,…,ckn ≠ φ.

Partially inferable relationships include the relationships from any overlapping concept to the intersection concept, that is, if ci = c1∩c2∩…∩ck, c1∧c2∧…∧ck⇒ci.

Definition 10: Non-Inferable Relationship (NIR). Concepts ci and cj are related by NIR if ci∧ck1∧…∧ckn!⇒cj and cj∧ck1∧…∧ckn!⇒ci, where k1,…,kn ≠ i,j.

The relationship from a concept to its complement belongs to NIR. An example of NIR is the relationship between ‘conventional weapon’ and ‘special weapon’ since they contain no common instances.
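The classification in this section amounts to a lookup from the role that the source concept plays toward the destination concept to an inference category. The Python sketch below records that mapping; the enum, the role names and the function `classify` are hypothetical. Treating superclass-to-subclass as weak follows the example given above and holds only when subclass instances can in fact be inferred from superclass instances.

```python
from enum import Enum


class Category(Enum):
    IR_STRONG = "inferable (strong)"
    IR_WEAK = "inferable (weak)"
    PIR = "partially inferable"
    NIR = "non-inferable"


# Each key reads as "the source concept is <role> the destination concept".
CATEGORY_BY_ROLE = {
    "equivalent_to": Category.IR_STRONG,          # IR (i)
    "whole_of": Category.IR_STRONG,               # IR (ii): whole to part
    "subclass_of": Category.IR_STRONG,            # IR (iii)
    "sub_concept_of_union": Category.IR_STRONG,   # IR (iii)(1)
    "intersection_of": Category.IR_STRONG,        # IR (iii)(2)
    # Assumption: weak only if subclass instances are actually inferable.
    "superclass_of": Category.IR_WEAK,
    "overlapping_concept_of_intersection": Category.PIR,
    "complement_of": Category.NIR,
}


def classify(role):
    """Return the inference category for a source-to-destination role."""
    return CATEGORY_BY_ROLE[role]
```

For instance, ‘missile’ is a subclass of ‘special weapon’, so `classify("subclass_of")` yields the strong inferable category for the relationship from ‘missile’ to ‘special weapon’.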

4. CONCEPT-LEVEL ACCESS CONTROL An access authorization specifies whether a subject can perform certain actions on an object. In this section, we first present subjects, privileges with their signs, and objects, and then propose our model for concept-level access control as well as methodologies for propagation.

4.1 Authorization Subjects Authorization subjects can be referred to by their identities or IP addresses, credentials, privileged groups or roles [15], and may be organized hierarchically. Typically, a higher level subject in the hierarchy implicitly has all the privileges of subjects at lower levels via inheritance. We denote a set of subjects as SUB = {sub1, sub2,…,subn}.

4.2 Privileges We assume that the set of privilege modes includes read, write, create and delete, each of which can be associated with a sign, either positive or negative, to specify permissions and denials, respectively. We assume that the set of privilege modes is partially ordered. For example, ‘write’ is at a higher level than ‘read’ because if a subject has write access to an object, it also has read access to that object. The opposite holds for privileges with a negative sign: if a subject is denied read access to an object, then it is also denied write access to that object.
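The effect of the partial order on signed privileges can be sketched in Python as follows. Only the ordering stated above (write above read) is encoded; how create and delete are ordered is not specified here, so the sketch assumes they are incomparable. The names `HIGHER_THAN` and `implied_privileges` are hypothetical.

```python
# Privilege -> privileges directly below it in the partial order.
# Assumption: only "write above read" is known; create and delete
# are treated as incomparable with everything else.
HIGHER_THAN = {"write": {"read"}}


def implied_privileges(privilege, sign):
    """Return all privileges implied by holding `privilege` with `sign`.

    A permission (+) implies every lower privilege; a denial (-)
    implies denial of every higher privilege.
    """
    if sign not in ("+", "-"):
        raise ValueError("sign must be '+' or '-'")
    result = {privilege}
    stack = [privilege]
    while stack:
        p = stack.pop()
        if sign == "+":
            neighbours = HIGHER_THAN.get(p, set())
        else:
            neighbours = {h for h, lows in HIGHER_THAN.items() if p in lows}
        for q in neighbours:
            if q not in result:
                result.add(q)
                stack.append(q)
    return result
```

Under these assumptions, a positive ‘write’ implies ‘read’ as well, while a negative ‘read’ implies a negative ‘write’.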

4.3 Authorization Objects The authorization object can be an ontology itself, a concept or a set of concepts within an ontology by extending the ontology’s URI with path expressions. Since ontologies are based on RDF,


the path expressions can make use of RDFPath [10], which represents the path between any two nodes in an RDF graph, much as XPath does for an XML document tree. The authorization for an ontology or any concept within an ontology applies to all the data instances annotated by the concept(s). For example, if the object is the concept ‘rifle’ in the ontology http://cimic.rutgers.edu/ontology/weapon.owl, then the object is <http://cimic.rutgers.edu/ontology/weapon.owl#rifle>, using an RDFPath expression.

4.4 Access Authorization Based on the above discussion, we define a concept-level access control model with access authorization defined below:

Definition 11: (Concept-level Authorization) The concept-level access authorization for the Semantic Web is represented as a 4-tuple ca = {obj, a, s, sub}, where

- obj can be an ontology, o, or a concept, c, or a set of concepts in o, identified by the ontology URI appended with a path expression;

- a is the privilege, one of read, write, create and delete;

- s is the sign∈{+,-}, with positive for permission and negative for denial;

- sub is the subject to whom the authorization is applied.

We use ca (obj), ca (a), ca (s) and ca (sub), respectively, to refer to the object, privilege, sign and subject in each authorization ca. For instance, the authorization that Mary has read access to the concept ‘special weapon’ in the ontology http://cimic.rutgers.edu/ontology/weapon.owl can be specified as ca = {<http://cimic.rutgers.edu/ontology/weapon.owl#special weapon>, read, +, Mary}, where ca (obj) = <http://cimic.rutgers.edu/ontology/weapon.owl#special weapon>, ca (a) = read, ca (s) = + and ca (sub) = Mary.
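A concept-level authorization can be sketched in Python as a named 4-tuple whose fields follow the order of Definition 11 (obj, a, s, sub). The type name `Authorization` and the validating helper `make_authorization` are hypothetical and serve only to illustrate the structure.

```python
from typing import NamedTuple

PRIVILEGES = {"read", "write", "create", "delete"}
SIGNS = {"+", "-"}


class Authorization(NamedTuple):
    obj: str   # ontology URI, optionally extended with a path to a concept
    a: str     # privilege
    s: str     # sign: "+" for permission, "-" for denial
    sub: str   # subject to whom the authorization applies


def make_authorization(obj, a, s, sub):
    """Build an authorization after checking the privilege and the sign."""
    if a not in PRIVILEGES:
        raise ValueError(f"unknown privilege: {a!r}")
    if s not in SIGNS:
        raise ValueError(f"unknown sign: {s!r}")
    return Authorization(obj, a, s, sub)


# Mary has read access to the concept 'special weapon'.
ca = make_authorization(
    "http://cimic.rutgers.edu/ontology/weapon.owl#special weapon",
    "read", "+", "Mary",
)
```

The accessors `ca.obj`, `ca.a`, `ca.s` and `ca.sub` play the role of ca (obj), ca (a), ca (s) and ca (sub).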

We call the set of explicitly specified authorizations the explicit authorization base. Besides being specified explicitly, concept-level authorizations can also be generated through propagation. We call the set of authorizations derived through propagation the propagated authorization base. We discuss propagation policies below, which specify the rules that allow such propagation.

4.5 Propagation Policies The security administrator can explicitly specify the authorizations to some concepts, and then propagations are performed to extend these authorizations to other concepts based on the relationships among concepts and propagation policies. We propagate positive and negative authorizations for different purposes: The negative propagations are to prevent any possible direct or indirect unauthorized access while the positive authorizations are propagated to ensure subjects have access to all the information to which they are authorized.

In our discussion below, we use → to indicate the direction of propagation if the propagation is allowed and use !→ to indicate that no propagation can be done. We call the concept(s) at which the propagation originates the source concept(s) and the concept(s) at which it terminates the destination concept(s). The +/- sign associated with → indicates the sign of the authorization being propagated. We use the domain-independent relationships defined earlier as examples and, for simplicity, show only the intersection of two concepts for partially inferable relationships. In the following, we define the different propagation policies allowed in our model. These propagation policies are enforced when deriving propagation rules.

There is exactly one source concept for inferable relationships. Unlike partially inferable relationships, there is no need to check the authorization of any concept other than the source concept. The propagation policies depend on whether the inferable relationship is strong or weak, as detailed below.

Propagation Policy 1: If the inferable relationship from ci to cj is strong, i.e. ci⇒scj, then

• Given cai = {ci, a, s, sub} and cai (s) = +, then cai can be propagated from ci to cj, thereby deriving a new authorization caj = {cj, a, +, sub}. We denote this propagation rule as cai+ → caj+.

• Given caj = {cj, a, s, sub} and caj (s) = -, then caj can be propagated from cj to ci, thereby deriving a new authorization cai = {ci, a, -, sub}. We denote this propagation rule as caj- → cai-.

Figure 4 shows the relationship ci⇒scj and Figure 5 shows the corresponding propagation rules cai+ → caj+ and caj- → cai- for the relationship in Figure 4.

Figure 4. ci⇒scj.

Figure 5. cai+ → caj+ and caj- → cai-.

Propagation Policy 2: If the inferable relationship from ci to cj is weak, i.e. ci⇒wcj, then

• Given caj = {cj, a, s, sub} and caj (s) = -, then caj can be propagated from cj to ci, thereby deriving a new authorization cai = {ci, a, -, sub}. We denote this propagation rule as caj- → cai-.

Figure 6 shows the relationship ci⇒wcj and Figure 7 shows the corresponding propagation rule caj- → cai- for the relationship in Figure 6.

Figure 6. ci⇒wcj.


Figure 7. caj- → cai-.

If concept ci is related to cj by a partially inferable relationship, e.g. ci∧ck1∧...∧ckn=>cj, where k1,…,kn ≠ i,j and ck1,...,ckn ≠ φ, then propagation depends on the authorizations to ck1,…,ckn as well as to ci. A PIR therefore has more than one source concept.

Propagation Policy 3: If the relationship from ci and cki, i = 1,…,n, to cj is PIR, then

• Given cai = {ci, a, s, sub} and cai (s) = +, caki = {cki, a, s, sub} and caki (s) = +, i = 1,…,n, then cai and caki can be propagated from ci and cki to cj, deriving a new authorization caj = {cj, a, +, sub}. We denote this propagation rule as cai+∧cak1+∧…∧cakn+ → caj+.

Figure 8 shows the relationship ci∧ck1∧...∧ckn=>cj where k1,…,kn ≠ i,j and ck1,...,ckn ≠ φ and Figure 9 shows the corresponding propagation rule cai+∧cak1+∧…∧cakn+ → caj+ for the relationship in Figure 8.

Figure 8. ci∧ck1∧...∧ckn=>cj, k1,…,kn ≠ i,j and ck1,...,ckn ≠ φ.

Figure 9. cai+∧cak1+∧…∧cakn+ → caj+.

Propagation Policy 4: If ci and cj are related by a Non-Inferable Relationship, i.e. ci∧ck1∧...∧ckn!=>cj and cj∧ck1∧...∧ckn!=>ci, where k1,…,kn ≠ i,j, then cai !→ caj and caj !→ cai; that is, no propagation can be done among concepts related by non-inferable relationships.
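The four policies can be sketched as a single derivation step over a set of authorizations. The following is a minimal illustration, not the authors' implementation: the class names `Authorization` and `Relationship`, the kind labels "IRs", "IRw" and "PIR", and the function `propagate_once` are all assumptions introduced here. Non-inferable relationships (Policy 4) are simply not represented, so nothing is ever propagated along them.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Authorization:
    concept: str
    action: str
    sign: str      # "+" or "-"
    subject: str


@dataclass(frozen=True)
class Relationship:
    kind: str                 # "IRs" (strong), "IRw" (weak) or "PIR" (hypothetical labels)
    sources: Tuple[str, ...]  # one source for IRs/IRw; ci plus ck1..ckn for PIR
    destination: str


def propagate_once(auths: Set[Authorization],
                   rels: Iterable[Relationship]) -> Set[Authorization]:
    """Return the new authorizations derivable from `auths` in one step."""
    derived = set()
    for rel in rels:
        for ca in auths:
            # Policy 1: a positive sign flows source -> destination (strong IR only).
            if rel.kind == "IRs" and ca.sign == "+" and ca.concept == rel.sources[0]:
                derived.add(Authorization(rel.destination, ca.action, "+", ca.subject))
            # Policies 1 and 2: a negative sign flows destination -> source.
            if rel.kind in ("IRs", "IRw") and ca.sign == "-" and ca.concept == rel.destination:
                derived.add(Authorization(rel.sources[0], ca.action, "-", ca.subject))
            # Policy 3: all PIR sources must be positive for the same action and subject.
            if rel.kind == "PIR" and ca.sign == "+" and ca.concept == rel.sources[0]:
                needed = {Authorization(c, ca.action, "+", ca.subject) for c in rel.sources}
                if needed <= auths:
                    derived.add(Authorization(rel.destination, ca.action, "+", ca.subject))
    return derived - auths
```

With a single positive authorization on ci, a strong relationship to cj yields a positive authorization on cj, while a weak relationship or a PIR with an unauthorized second source yields nothing.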

4.6 Authorization Conflict Resolutions

After the Security Administrator explicitly specifies authorizations to some concepts, propagation can proceed along either IRs or PIRs. The authorization to a particular concept can be propagated from any concept that is directly or indirectly related to it. For example, if the authorization to concept A is explicitly specified, and A propagates it to concept B, which in turn propagates it to concept C, then the authorization to C is directly propagated from B and indirectly propagated from A. Conflicts between authorizations may therefore arise at some concepts. However, such conflicts do not result from the propagation policies themselves, which are conflict-free.

Propagation rules may conflict with each other. For a set of authorizations cai∧caki, i = 1,…,n, we use (cai∧caki)- to denote any set of authorizations obtained by reversing the sign of some or all of the authorizations in cai∧caki. Since cai∧caki contains (n+1) authorizations, (cai∧caki)- has (2^(n+1) - 1) possibilities. Similarly, for a single authorization ca, we use ca- to denote the authorization that has the same components as ca but the opposite sign. We define the conflicting propagation rules of a propagation rule formally as follows:

Definition 12: Conflicting Propagation Rule. Given a propagation rule cai∧caki → caj, i = 1,…,n, its conflicting propagation rules include:

• (cai∧caki)- → caj

• cai∧caki → caj-

• caj → (cai∧caki)-

For example, for cai+ → caj+, its conflicting propagation rules include:

- cai- → caj+

- cai+ → caj-

- caj+ → cai-

For each relationship, we denote the set of propagation rules as PR and the set of conflicting propagation rules as CPR.

Definition 13: Conflict-free Propagation Rule. Propagation rules for a particular relationship are conflict-free if PR∩CPR = φ.

The following proposition states that our propagation rules themselves do not lead to conflicts, even when propagation is bi-directional, because the source concept(s) and the destination concept always share the same sign.

Proposition: If all the propagation rules are derived adhering to the propagation policies 1, 2, 3 and 4, then the propagation rules are conflict-free.

Proof sketch: In the following, we provide a proof sketch for some propagation policies. Since Policy 4 does not allow propagation, it need not be considered.

For ck = ci∩cj,

PR = {

cak+ → cai+ (policy 1),

cak+ → caj+ (policy 1),

cai- → cak- (policy 1),

caj- → cak- (policy 1),

cai+∧caj+ → cak+ (policy 3)}.


Figure 10. Process of Access Control Specification and Propagation.

We identify the conflicting propagation rules for each rule (written in the square brackets following the rule) below:

cak+ → cai+ [cak+ → cai-, cak- → cai+, cai+ → cak-],

cak+ → caj+ [cak+ → caj-, cak- → caj+, caj+ → cak-],

cai- → cak- [cai- → cak+, cai+ → cak-, cak- → cai+],

caj- → cak- [caj- → cak+, caj+ → cak-, cak- → caj+],

cai+∧caj+ → cak+ [cai+∧caj- → cak+, cai-∧caj+ → cak+, cai-∧caj- → cak+, cak+ → cai+∧caj-, cak+ → cai-∧caj+].

Therefore, CPR = {

cak+ → cai-, cak- → cai+, cai+ → cak-, cai- → cak+,

cak+ → caj-, cak- → caj+, caj+ → cak-, caj- → cak+,

cai+∧caj- → cak+, cai-∧caj+ → cak+, cai-∧caj- → cak+,

cak+ → cai+∧caj-, cak+ → cai-∧caj+}.

We can see that PR∩CPR = φ.
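The check PR∩CPR = φ can also be carried out mechanically. The sketch below is an illustration under our own encoding, not part of the model: a rule is a pair of frozensets of (concept, sign) pairs, and the helper names `reversals`, `conflicting_rules` and `R` are hypothetical. It enumerates every sign reversal allowed by Definition 12 (2^(n+1) - 1 for n+1 authorizations), so the CPR it builds may contain more entries than a hand-written listing; the disjointness test is unaffected.

```python
from itertools import product


def flip(sign):
    return "-" if sign == "+" else "+"


def reversals(signed):
    """All results of reversing the sign of a non-empty subset of `signed`."""
    items = sorted(signed)
    result = []
    for mask in product([False, True], repeat=len(items)):
        if any(mask):
            result.append(frozenset(
                (c, flip(s)) if m else (c, s) for (c, s), m in zip(items, mask)))
    return result


def conflicting_rules(rule):
    """Conflicting propagation rules of `rule`, following Definition 12."""
    lhs, rhs = rule
    cpr = set()
    for rev in reversals(lhs):
        cpr.add((rev, rhs))   # (cai∧caki)- → caj
        cpr.add((rhs, rev))   # caj → (cai∧caki)-
    for rev in reversals(rhs):
        cpr.add((lhs, rev))   # cai∧caki → caj-
    return cpr


def R(lhs, rhs):
    """Build a rule from two lists of (concept, sign) pairs."""
    return (frozenset(lhs), frozenset(rhs))


# Propagation rules for ck = ci ∩ cj, as in the proof sketch.
PR = {
    R([("ck", "+")], [("ci", "+")]),
    R([("ck", "+")], [("cj", "+")]),
    R([("ci", "-")], [("ck", "-")]),
    R([("cj", "-")], [("ck", "-")]),
    R([("ci", "+"), ("cj", "+")], [("ck", "+")]),
}
CPR = set().union(*(conflicting_rules(r) for r in PR))
conflict_free = PR.isdisjoint(CPR)
```

For the single-source rule cai+ → caj+ the function returns exactly the three rules listed in the example after Definition 12.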

Since our propagation policies are conflict-free, authorization conflict to a concept ci can occur in one of the following two cases:

• conflict between explicit authorization to ci and propagated authorization to ci;

• conflict between two separate propagated authorizations to ci.

When a conflict occurs, we let the negative authorization take precedence. This negative authorization may then be propagated further to other concepts, overriding positive explicit authorizations to those concepts.

By a consistent authorization base, we mean the combination of the explicit authorization base and the propagated authorization base that does not contain any authorization conflicts. Tools should be developed to help the security administrator derive such a base. Explicit specification, propagation and conflict resolution are therefore not a one-time procedure but need to be repeated until a consistent authorization base is derived. Moreover, whenever the security administrator changes the authorizations, propagation should be performed again based on the new specifications. We show this process in Figure 10. The manager for relationships among ontologies manages the relationships among concepts across ontologies, so that propagation can be carried across ontologies to help derive consistent authorizations.
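The repeat-until-consistent process can be sketched as a fixpoint loop. This is a minimal illustration under stated assumptions: propagation rules are given as hypothetical triples (sources, destination, sign), the base maps (concept, action, subject) to a sign, and the function name `consistent_base` is ours. Conflicts are resolved by letting the negative sign prevail. Because a sign can only move from absent to present or from + to -, the loop terminates.

```python
def consistent_base(explicit, rules):
    """Propagate until nothing changes; negative authorizations take precedence.

    explicit: dict mapping (concept, action, subject) -> "+" or "-"
    rules:    iterable of (sources, destination, sign) triples, where the rule
              fires when every source concept currently carries `sign`
    """
    base = dict(explicit)
    # Propagation never introduces new (action, subject) pairs.
    pairs = {(action, subject) for (_, action, subject) in base}
    changed = True
    while changed:
        changed = False
        for sources, dest, sign in rules:
            for action, subject in pairs:
                if all(base.get((c, action, subject)) == sign for c in sources):
                    key = (dest, action, subject)
                    current = base.get(key)
                    # Conflict resolution: "-" wins over "+".
                    new = "-" if "-" in (current, sign) else "+"
                    if new != current:
                        base[key] = new
                        changed = True
    return base


# Chain A ⇒s B ⇒s C: positive flows forward, negative flows backward.
rules = [
    (("A",), "B", "+"), (("B",), "A", "-"),
    (("B",), "C", "+"), (("C",), "B", "-"),
]
explicit = {("A", "read", "eng"): "+", ("C", "read", "eng"): "-"}
result = consistent_base(explicit, rules)
```

In this example the explicit negative authorization on C propagates back through B to A and overrides the explicit positive authorization on A, which is the overriding behavior described above.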

4.7 Semantic Access Control Language

We propose the Semantic Access Control Language (SACL) to express our concept-level access authorizations. Since OWL is the most promising web ontology language proposed by W3C, SACL uses the syntax and vocabulary of OWL and introduces new vocabulary for access control, such as SACL:higherLevelThan (and SACL:lowerLevelThan) to specify the ordering between subjects or privileges, and ‘canRead’ or ‘readBy’ to specify the privileges of subjects over certain objects. Using OWL has two advantages: tools developed for OWL can be reused, and a further level of authorizations can be specified over the authorizations expressed in SACL.

The objects of concept-level security policies are already defined as concepts in ontologies. When subjects are identified by roles or privileged groups, our idea is that subjects should also be formally defined in some ontology so that they can be used for annotation. These subject concepts can be specified as subclasses of the concept ‘subject’. When subjects are identified by users’ identities, they can be defined as individuals of ‘subject’. Each access authorization connects a subject and an object through an action and a sign, and is specified as an ObjectProperty between the subject concept and the object concept.

For example, we have ‘Missile Design Engineer’ defined as a subject concept at http://cimic.rutgers.edu/ontology/wpnsubject. The missile design engineer should have read access to information on missiles. We specify this authorization as follows:

<owl:Class rdf:about="http://cimic.rutgers.edu/ontology/wpnsubject#MissileDesignEngineer">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#canRead" />
      <owl:someValuesFrom rdf:resource="http://cimic.rutgers.edu/ontology/weapon#missile" />
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>

[Figure 10 diagram. Components: Ontologies; Manager for Relationships among Ontologies; Propagation Policies; Propagation Rules; Explicit Authorization Base; Propagation; Propagated Authorization Base; Conflict Detection & Resolution; Consistent Authorization Base.]

The ordering between subjects can be specified as the ObjectProperty of one subject concept related to another. The ordering between privileges is specified as the relationship between two ObjectProperties. We can express the relationship of privilege ‘canWrite’ with ‘canRead’ and ‘canNotWrite’ as follows:

<owl:ObjectProperty rdf:ID="canWrite">
  <owl:inverseOf rdf:resource="#canNotWrite" />
  <SACL:higherLevelThan rdf:resource="#canRead" />
</owl:ObjectProperty>
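To illustrate the point that existing tooling can process such specifications, the sketch below reads a class restriction like the ‘Missile Design Engineer’ authorization earlier in this section back into a (subject, privilege, object) triple using only Python's standard XML parser. It is an illustration, not an OWL reasoner. The embedded document is our own well-formed variant: the enclosing rdf:RDF element, the rdfs:subClassOf wrapper, the use of rdf:about, and the fragment name without spaces are assumptions of this sketch, as is the function name `authorizations`.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF = "{%s}" % NS["rdf"]

# Hypothetical, well-formed variant of the authorization example.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://cimic.rutgers.edu/ontology/wpnsubject#MissileDesignEngineer">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#canRead"/>
        <owl:someValuesFrom rdf:resource="http://cimic.rutgers.edu/ontology/weapon#missile"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def authorizations(rdf_xml):
    """Extract (subject concept, privilege property, object concept) triples."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for cls in root.findall("owl:Class", NS):
        subject = cls.get(RDF + "about")
        for restriction in cls.findall(".//owl:Restriction", NS):
            prop = restriction.find("owl:onProperty", NS).get(RDF + "resource")
            obj = restriction.find("owl:someValuesFrom", NS).get(RDF + "resource")
            triples.append((subject, prop, obj))
    return triples
```

A production system would more likely use an RDF or OWL toolkit so that equivalent serializations are handled uniformly; plain XML parsing is used here only to keep the sketch self-contained.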

5. ACCESS TO SEMANTIC CONTENTS

In this section, we briefly discuss how subjects can access the semantic contents stored in a repository when they request them by information pull (especially for the read action) under our access control model. A user's request can be based on either concepts or documents.

Request for concept. The user may request all the semantic contents related to a particular concept, much like today's keyword-based search. In this case, an index over the concepts should be built in advance to record which contents are annotated by each concept, similar to how a search engine indexes keywords. Each data instance carries a pointer to its annotation concept. To build the index, we go through each document concept by concept and append each data instance, identified by its document URI and a path expression, to the entry of its annotation concept. The index entry for each concept has a format similar to the one below:

[ontology URI for concept ci][Path to concept ci] : [Document URI1][Path to the data instance];[Document URI2][Path to the data instance];…[Document URIk][Path to the data instance].

This index needs to be built in advance and then either updated incrementally or re-created from scratch when concepts or data instances change. If it is refreshed only periodically, it may be stale between refreshes. When a request is submitted for a particular concept, the access authorizations to the requested concept, along with those to certain related concepts, are checked to see whether the subject has the privilege to perform the action. If the request is a read and evaluates to positive, the semantic contents annotated by the concept, its equivalent concepts and possibly its subclass concepts are composed and rendered to the subject.
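The index construction and lookup described above can be sketched as follows. The input shape (a mapping from document URI to a list of annotated instances), the function names, and the example.org URIs are all assumptions made for illustration; the authorization check is passed in as a callable because its evaluation is defined elsewhere in the model.

```python
from collections import defaultdict


def build_index(documents):
    """documents: {doc_uri: [(instance_path, ontology_uri, concept_path), ...]}"""
    index = defaultdict(list)
    for doc_uri, instances in documents.items():
        for instance_path, ontology_uri, concept_path in instances:
            # Append the data instance to the entry of its annotation concept.
            index[(ontology_uri, concept_path)].append((doc_uri, instance_path))
    return index


def format_entry(concept, postings):
    """Render one index entry in the bracketed format shown above."""
    ontology_uri, concept_path = concept
    body = ";".join(f"[{doc}][{path}]" for doc, path in postings)
    return f"[{ontology_uri}][{concept_path}] : {body}."


def answer_request(index, concept, is_authorized):
    """Return the postings for `concept` only if the authorization check passes."""
    return list(index.get(concept, [])) if is_authorized(concept) else []


WEAPON = "http://cimic.rutgers.edu/ontology/weapon"
docs = {
    "http://example.org/doc1.xml": [("/report/spec[1]", WEAPON, "#missile")],
    "http://example.org/doc2.xml": [("/memo/item[2]", WEAPON, "#missile")],
}
index = build_index(docs)
entry = format_entry((WEAPON, "#missile"), index[(WEAPON, "#missile")])
```

Rebuilding the whole index with `build_index` corresponds to re-creation from scratch; an incremental update would append or remove individual postings instead.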

Request for document. The user can also request a document, whose elements may be annotated by concepts from more than one ontology. Since each element includes a pointer to its annotation concept, the access authorization for that concept is checked to see whether the user can access the annotated web data. This check should be made for each concept involved in the document and can be done either on the fly or in advance to improve efficiency. A customized view of the document is then generated based on the authorization evaluation.
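Generating a customized view can be sketched as pruning the document tree. The example below makes several assumptions that the model does not fix: the annotation pointer is a hypothetical `concept` attribute on each element, the outcome of authorization evaluation is supplied as a set of concept identifiers the subject may read, unannotated elements are kept, and the root element is not itself checked.

```python
import xml.etree.ElementTree as ET


def customized_view(xml_text, allowed_concepts, attr="concept"):
    """Return the document with every unauthorized annotated element removed."""
    root = ET.fromstring(xml_text)

    def prune(elem):
        # Iterate over a copy because children are removed during the walk.
        for child in list(elem):
            concept = child.get(attr)
            if concept is not None and concept not in allowed_concepts:
                elem.remove(child)   # drops the element and its whole subtree
            else:
                prune(child)

    prune(root)
    return ET.tostring(root, encoding="unicode")


doc = ('<report>'
       '<title concept="w#weapon">Overview</title>'
       '<spec concept="w#missile">Range</spec>'
       '</report>')
view = customized_view(doc, {"w#weapon"})
```

Doing the check in advance, as mentioned above, would amount to caching `allowed_concepts` per subject rather than recomputing it for each request.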

6. RELATED WORK

Damiani et al. proposed an access control model for XML documents and DTDs in [9]. Kudo et al. proposed a provisional authorization model for XML documents and an XML access control language (XACL) [11]. [5] introduced an XML-based language for specifying subject credentials and security policies for web documents.

Content-dependent access control, such as [1], tries to enforce access control to objects in digital libraries based on the content of objects, described as concepts associated with authorization objects. However, it has to have some mechanism in place to extract concepts from documents and build a conceptual hierarchy. Our concept-level access control model is the natural result of the Semantic Web with concepts defined in ontologies.

[2] proposed an authorization model for information portals which provides tools to prevent indirect unauthorized access and a framework to derive authorizations in a consistent and safe manner based on the relationships between the base and derived data.

In [17], Stoica et al. proposed using ontologies to detect security violations among distributed XML documents and specifically designed Oxsegin, an Ontology guided XML Security Engine, to detect replicated information under different security classifications. Our access control model can help tackle the same problem from the perspective of access control.

More recent work includes [12], which introduces static analysis for XML access control by using automata for representing and comparing queries, access control and schema with the goal of easing the burden of checking access control policies for XML documents. In [6], Bertino et al. propose a distributed infrastructure that enables subjects in collaboration to verify whether the update operations on a document have been performed based on the stated access control policies usually without interacting with the document server. [8] presented an access control model for XML documents by exploiting XML’s own capabilities and a language for the specification of access restrictions along with a description of a system architecture for access control enforcement.

7. CONCLUSIONS AND FUTURE WORK

In this paper, we have proposed a concept-level access control model for the Semantic Web by specifying authorizations over concepts defined in ontologies and enforcing them upon their data instances, with support for propagation based on the relationships among concepts. We discussed some domain-independent relationships among concepts and how these relationships affect propagation policies. We also presented how to specify security policies using the Semantic Access Control Language, which is mainly OWL-based, and how a user's read request for a concept or a document can be handled by our access control model.

Although we focus on propagation based on the relationships among concepts, propagation can also be performed based on the relationships between ontologies, even though those relationships may not be semantic. For example, if one ontology imports another ontology, authorizations to the concepts in the imported ontology can be propagated to those in the importing ontology.

As part of our future research, we will explore the complexity of propagation and conflict resolution. For example, once propagations are done and the resulting conflicts are resolved, authorizations to some other concepts may need to be re-evaluated, and new conflicts may arise. We will identify approaches to avoid this chaining overhead and make propagation more efficient. Moreover, in this paper we have resolved conflicts in a simple way, by letting negative authorizations always prevail. However, this may not always be appropriate, especially because the Semantic Web is distributed and concepts and their corresponding authorizations are specified by different entities. In such cases, one may want to adopt policies such as letting explicit authorizations prevail over derived authorizations. Supporting such conflict resolution is not trivial, since it requires further investigation into the behavior of authorization propagation and the relationships among concepts.

8. REFERENCES

[1] N. R. Adam, V. Atluri, E. Bertino and E. Ferrari. A Content-based Authorization Model for Digital Libraries. IEEE Transactions on Knowledge and Data Engineering, Volume 14, Number 2, 2002, pages 296-315.

[2] V. Atluri and A. Gal. An Authorization Model for Temporal and Derived Data: Securing Information Portals. ACM Transactions on Information and System Security, Volume 5, Number 1, 2002.

[3] Tim Berners-Lee, Semantic Web-XML2000. http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html. 2000.

[4] Tim Berners-Lee, James Hendler and Ora Lassila. The Semantic Web. Scientific American, May 2001.

[5] Elisa Bertino, Silvana Castano and Elena Ferrari. On Specifying Security Policies for Web Documents with an XML-based Language. 6th ACM Symposium on Access Control Models and Technologies (SACMAT 2001), 57-65.

[6] Elisa Bertino, Gianluca Correndo, Elena Ferrari and Giovanni Mella. An Infrastructure for Managing Secure Update Operations on XML data. Proceedings of 8th ACM Symposium on Access Control Models and Technologies, 110-122.

[7] Elisa Bertino, Silvana Castano, Elena Ferrari and Marco Mesiti. Controlled Access and Dissemination of XML Documents. Workshop on Web Information and Data Management 1999, 22-27.

[8] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, P. Samarati. A Fine-Grained Access Control System for XML Documents. ACM Transactions on Information and System Security (TISSEC), vol. 5, n. 2, May 2002, pp. 169-202.

[9] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi and P. Samarati. Securing XML Documents. In Proc. of the 2000 International Conference on Extending Database Technology (EDBT2000), Konstanz, Germany, March 27-31, 2000.

[10] Stefan Kokkelink. Quick introduction to RDFPath. Available at http://zoe.mathematik.uni-osnabrueck.de/QAT/RDFPath/Quick/Quick.html.

[11] Michiharu Kudo and Satoshi Hada. XML Document Security based on Provisional Authorization. In Proc. Of ACM Conference on Computer and Communication Security (CCS 2000), Nov 2000.

[12] Makoto Murata, Akihiko Tozawa, Michiharu Kudo and Satoshi Hada. XML Access Control Using Static Analysis. In Proc. Of 10th ACM Conference on Computer and Communication Security (CCS 2003).

[13] OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/.

[14] Resource Description Framework. http://www.w3.org/RDF/.

[15] Ravi Sandhu, Edward Coyne, Hal Feinstein, Charles Youman. Role-Based Access Control Models. IEEE Computer, 1996.

[16] Larry M. Stephens and Michael N. Huhns. Consensus Ontologies: Reconciling the Semantics of Web Pages and Agents. IEEE Internet Computing 5(5): 92-95 (2001).

[17] Andrei G. Stoica and Csilla Farkas. Ontology guided Security Engine, Submitted to Journal of Intelligent Information Systems, 2002. Available at http://www.cse.sc.edu/research/isl/isl_ont_sec.shtml.
