INFORMATION SPACES AS A BASIS FOR PERSONALISING THE SEMANTIC WEB

INFORMATION SPACES AS A BASIS FOR PERSONALISING THESEMANTIC WEB

Ian OliverNokia Research Center, Itamerenkatu 11-13, Helsinki, Finland

[email protected]

Keywords: Space based computing, Semantic Web, Mobile devices

Abstract: The future of the Semantic Web lies not in the ubiquity, addressability and global sharing of information butrather in localised, information spaces and their interactions. These information spaces will be made at a muchmore personal level and not necessarily adhere to globally agreed semantics and structures but rely more uponad hoc and evolving semantic structures.

1 Introduction

In this paper we position our vision of the contin-uation of the development or evolution of the Seman-tic Web (Berners-Lee et al., 2001). This is best visu-alised as the Giant Global Graph concept popularisedby Tim Berners-Lee1.

Most information however is not ubiquitous butpersonalised, hidden, private and interpreted locally -this information tends to be the personal, highly dy-namic information that one stores about oneself: con-tact lists, friends, media files, ‘my’ current context,‘my’ family, ‘my’ home etc and the interweaving andlinking between these entities through ad hoc personalstructures.

We elaborate on the ideas of ubiquitous informa-tion, the role of reasoning and knowledge, the lo-cation of the information with relation to its ubiq-uity through the concept of projections from the Gi-ant Global Graph called spaces. We then describe animplementation of an environment supporting theseideas in a mobile and personal context as well as manyof the issues that this directly brings up with regardsto what are semantics and how information is goingto be dealt with in this context.

In the following sections we outline our positionand areas of research relating to notions of personali-sation, Semantic Web, information and its meaing andsemantics as well as our implementation.

1htt p : //en.wikipedia.org/wiki/Giant Global Graph

2 Personalisation and Spaces

The Semantic Web is succeeding in relativelysmall-scale, specific situations which are restricted toa given domain. If we expand the notion of a do-main in a more orthogonal sense to encompass per-sonal level then this suggests that we have a notion ofa ‘Personal Semantic Web’ in which one can organisetheir own information according to these principles.The advantages of a Semantic Web based approach isthat certain structures, schemata and semantics can befixed enabling some - and this is an important point,we should not (and can not?) try to attempt everything- meaningful communication, reasoning and interop-erability to take place.

Mobile devices with various methods of connec-tivity which now constitute for many as being the pri-mary gateway to the internet and also being a ma-jor storage point for much personal information (Ide-hen and Erling, 2008; Lassila and Adler, 2003). Thisis in addition to the normal range of personal com-puters and furthermore sensor devices plus ‘internet’based providers. Combining these devices togetherand lately the applications and the information storedby those applications is a major challenge of interop-erability (Tolk and Muguira, 2003; Turnitsa, 2005).

This is achieved through numerous, individual andpersonalspacesin which persons, groups of personsetc can place, share, interact and manipulate webs ofinformation (Krummenacher, 2008; Khushraj et al.,2004) with their own locally agreed semantics with-

out necessarily conforming to an unobtainable, globalwhole. These spaces areprojectionsof the ‘GiantGlobal Graph’ in which one can apply semantics andreasoning at a local level. A detailed survey of suchspace-based systems is given in (Nixon et al., 2008).

This approach we feel addresses at least two of thecounter-arguments against the Semantic Web vision:feasibility and privacy by directly addressing notionsof locality or ubiquity and ownership. Feasibility be-cause we are changing the problem to address muchsmaller-scale structures through setting clear bound-aries in terms of computation.

In order to apply reasoning and other manipula-tions (such as sharing) of that information we are re-quired to construct processes which have access tothat information - typically these are known as agentsin the ‘traditional’ sense of the word although wetend towards the classification given in (Haag et al.,2003). Agents are either personal in that they per-form tasks either directly decided upon by the user orautonomously for or on behalf of the user, monitorparticular situations or reason/data-mine the existinginformation.

Personalisation is achieved through explicitly de-marking a space in which information is stored andagents have access. Within each space information isorganised according to the owner (or owners) of thatspace. For an agent to obtain entry to that space thenit is made on the terms of that space. Similarly for twospaces to interact directly similar contracts must alsohold. Interactions with spaces is described in section3.

The kinds of information that are stored in a per-sonal space vary but initially contacts lists, media files(links to media), personal information managementdata (calendars etc), email and other personal com-munication etc. This easily expands to informationfeeds such as those provided through RSS and evenWWW interfaces, family or community information,social networking and so on. These kinds of informa-tion can be then further augumented by tagging, inter-nal links and more sophisticated equivalence relation-ships such as might be seen between social network-ing contacts, contacts lists, calendars etc. In additionmore static and thus more externalisable informationcan be stored or referenced in the same manner - suchinformation might be census records, telephone direc-tories or even cultural information (Hyvonen et al.,2008).

Of course there are issues regarding the interpre-tation of information and how the meaning or seman-tics is preserved across spaces and agents; this alsoincludes deciding whether two independent structuresactually represent the same piece of information andcan be merged or coalesced. Furthermore issues re-garding trust and security need to be addressed - wedo not specifically discuss this problem in this paper.

3 Interaction and Sharing withSpaces

Consider the follow scenarios. In figure 1 Aliceinteracts with her personal space - through ‘agents’2

running on a multitude of devices. This space con-tains a corpus of information A which through localreasoning and deductive closure algorithms - a featureof our spaces - provides her with the corpus R(A).Information is represented using Semantic Web stan-dards, ie: RDF, RDFS, OWL, FOAF etc, rule sets inRuleML etc.

Alice

A

R(A)

Alice’s "agents"executing on

interacting with

uses

...and others...

Figure 1: Alice’s Agents, Devices, Spaces and Information

In this case the boundary of R(A) is the limit ofAlice’s personal space. If Alice has two spaces orcorpii of information she might bind these together toproduce a much larger space. These individual corpiiof information may be overlapping in terms of theircontent. Alice can interact simultaneously with manydiscrete spaces. These situations are visualised in fig-ure 2; for simplicity we only show the spaces and clo-sures. Here the total information available for a givenspace is the union of the deductive closure over all theindividual corpii.

A

B

C Alice

D

R(C)

R(D)

R(A u B)

Figure 2: Alice and Alice’s Spaces

Alice can decide to break and reconfigure her cur-rent spaces into many smaller spaces. This may bemade in any manner including removing all informa-tion and creating multiple individual smaller or evenempty spaces to making a complete copies of the cur-rent space.

2Agent is rather a loaded word, but alternatives aren’tnumerous: executives, nodes, UIs, programs, etc

We now introduce Bob, who has a space of hisown that is constructed from a single corpus of infor-mation. At this point we can say nothing about therelationship between the contents of the corpii A andB with C; they may potentially all contain the sameinformation.

Interaction between Alice and Bob can be made inthree different forms. The first is simple in that Bobonly needs to give Alice’s ‘agents’ access to his spaceas visualised in figure 3. Bob of course has his ownideas about privacy and grants Alice access to onlya portion of his space. Alice has direct access to asubset of Bob’s space - if she has write access thenpotentially this could have an effect on the space as awhole.

Privacy is asymmetric - it is on the sharer’s termsonly thus precluding the need for a globally agreedprivacy mechanism. If Alice just so happens tobe able to satisfy the criteria for accessing Bob’sspace then Alice is granted access at whatever levelBob’s privacy mechanisms allow. Alice’s own privacymechanisms do not affect Bob’s mechanisms and viceversa.

Alice

A

B

Bob

C

Figure 3: Alice Interaction with Bob’s Space

The second form of interaction is a variation ofthis: Bob to partitions off a given subspace to whichAlice has access, then any changes can be kept lo-cal and the merge back controlled by Bob explicitly.In both of these cases access policy including trustmechanisms are local to Bob - there is no need forAlice to know about what mechanisms are in place.

The fist two forms we expect to be the most com-mon methods of interaction, the third offers a set ofdifferent possibilities based around the merging ofthe spaces. This is more complicated as Alice andBob must both agree to the merge (fig. 4) both interms of personally agreeing through their trust mech-anisms (might just be personal trust) but also throughthe shared semantics of the information.

This then constitutes how spaces are related buthas not yet addressed certain specific ideas about in-formation and the semantics of that information andthese are discussed in section 4.

Alice

A

B

CBob

Figure 4: Alice’s and Bob’s (Merged) Space

4 Information and Semantic Issues

We can classify the issues with this approach as:

• Non-monotonicity of Deductive Closure andRules

• Graph Provenance

• Semantics of the Information

• Uncertainty, Incompleteness, Inconsistency andUndefinedness of Information

Given a single spaces, the information containedin that space isi(s), the rulesρ(s) and the deductiveclosure calculated asR(i(s),ρ(s)). A merge of twospacess1 ands2 results in a single spacesm wheresmis calculated asR(i(s1)∪ i(s2),δ(ρ(s1),ρ(s2)). Thefunction δ determines the set of rules to apply andis constructed from a mechanism which prioritisesthe rules somehow - the exact mechanism would ofcourse vary but we envisage would be similar to thosefound in certain kinds of non-monotonic calculi, eg.(Mueller, 2006).

Graph provenance and the related semanticsproblem arethe major problems in any SemanticWeb/interoperability system as there, beyond pre-agreed addressing and strict, standardised ontologies,no definitive method for relating structures represent-ing the same information together exists.

While information about typing etc is carriedwithin the space, deepersemanticsis not. Identifi-cation of larger meaningful structures such as RDFmolecules (Ding et al., 2005) provide at this timethe strongest basis for provenance analysis and ad-dressing deeper semantically meaningful structures.In the most part we must rely upon local conventionbetween any two merging or interacting spaces withthe hope that this leads to a more global convention(Afraz Jaffri, 2008; Shafiq et al., 2008).

The meaning of the information to Alice might bevery different to the meaning ascribed by Bob (seealso provenance above). The meaning or semanticsof the information can only be made by the reader(Burcea et al., 2003); the writer of the informationonly gives hints through typing and tagging and otherrelationships to what the intended meaning might be.

This hints to the question how do we guaran-tee that two agents understand the information in the

same way, our response to this is that we don’t care;at least to the point where there is no internal mech-anism for this (Kim and Anseo-Dong, 2002). As itis, the meaning of some ontological structure evolvesover time and there appears to be no reasonable mech-anism for communicating deeper level semantic struc-tures. For true, reliable communication how manysemantic levels are required is unknown. However,since that if access to a space is granted or a mergemade then this is implicit agreement between the par-ties and particularly the readers over the intended se-mantics of the information.

In (Gardenfors, 2000) is provided a detailed dis-cussion of the style ofintentional semantics(Kim andAnseo-Dong, 2002) we propose here that it is not just-ing typing but the whole construct of properties of anobject - and then the scope over which we define anobject - that must be taken into consideration whendeciding how an agent interprets a given structure.This also applies when dealing with the aforemen-tioned graph provenance issue.

To complicate matters further, the notion of se-mantics embodied within these kinds of informationstructures is little more than meta-data whereas onereally needs to describe a further relationship intothe ‘real-world’ (Smith, 1996). Solutions based uponlexical and semantic analysis through resources suchas Wordnet(Fellbaum, 1998) appear to be the mostpromising here with regard to issues surrounding se-mantics similarity (Jan Wielemaker and Wielinga,2003; Ruotsalo and Hyvonen, 2007). However,despite this we can (an no current computer sys-tem) never be sure that any two agents actually actupon their respective interpretations information inthe same way.

Currently we are seeing two mechanisms for se-mantic agreement: the first is through standardisa-tion of ontologies (W3C) and the second is throughfolksonomic evolution of initially personal and infor-mal structures into ad hoc ontologies which becomemore concretised as social agreements form. Even inthe strict, formal ontology development scenario evo-lution of the ontology takes place as usage changesand develops (Ruotsalo et al., 2008), however this ismuch slower than the folksonomic cases. A fairlycommon mechanism for agreeing on semantics is theupper ontology approach, but again this suffers fromthe problem ofmanyupper ontologies (for example:(Hyvonen, 2008))- an interesting discussion of this ismade in (Wheeler, 2004)

Finally the problem of uncertain, incomplete,inconsistent and undefined information requires amuch more formal approach within the various agentand reasoning structures (Lassila, 2008). Withinthe Semantic Web we must further explore notionsof undefinedness, modality, probability and non-monotonicity. This is left for future work though

mechanisms are already present within our imple-mentation for attaching and modality properties.

5 Implementation

Supporting these ideas we already have the com-puting and networking infrastructure. Our particularsolution builds a space-based computing framework(Oliver and Honkola, 2008; Oliver et al., 2008) basedupon the Piglet/Wilbur (Lassila, 2007) RDF++ en-gine (Lassila, 2002). The notion of space being con-structed out of a number of individual, linked (totallyroutable) brokers. Interaction with the space is via aagents which may reside on any suitable device withsuitable connectivity and computing capabilities; sim-ilarly the brokers. Figure 5 shows one possible con-figuration.

Agents may connect to one or more spaces at atime and to which spaces may vary over the lifetimeof an agent. Mobility in this sense is provided ina ‘pi-calculus’ manner (Milner, 1999) in that linksmay move rather than the physical running process.Agents can save state and become ‘mobile’ when an-other agent restores that state. We have not addressedcode mobility in the current implementation and thisremains low priority at this moment.

Agents themselves are anonymous and indepen-dent of each other - there is no explicit control flowother than that provided through preconditions toagent actions. A coordination model based around ex-pressing coordination structures as first-order entitiesis being investigated, however we are more focussedon collecting and reasoning over context. Controlflow can be made outside of the space through agentsexplicitly sharing details of their external interfacesthrough the space - this has been successfully used incoordinating media streaming and storage devices viaUPnP3 and NoTA4 for example.

The brokers each contain a corpus of informationand when linked together to form a space distributethe information in an asymmetric manner - some in-formation is not replicated because of computationalresources, connectivity, storage reasons and even le-gal issues such as copyright. This distribution layeris also responsible for the efficient distributed com-putation of queries and marshalling the resultant re-sponses such that calculating the deductive closurecan be made. In this manner we can distribute com-plex computation away from the original agent overthe whole space.

This distribution can be used in a number of waysand one of the most interesting for us is giving the

3Universal Plug and Play http://upnp.org4Network on Terminal Architecture

http://www.notaworld.org/

Space "A" Space "B"

Individual Brokers +

Information Stores

connections to spaces

Agents +

Control Flow Mechanisms

explicitPotential

Figure 5: An Example Implementation Configuration

user temporary access to a larger body of informationwhich can be used to selectively enhance their currentcorpus of information, for example, a body oflinkinginformationbased around theowl:sameAs construct(Passant, 2008). Other examples include personal ac-cess to more dynamic corpii such as news, weather orsimilar services.

We make no attempt to enforce consistency of theinformation rather letting the writer of the informa-tion have freedom to express what they want and leavethe interpretation to the reader. The semantics of theinformation is merely intensional in that the writerprovides ‘clues’ through typing, tagging and othermeans. Repair of information according to schemataor other criterion can be enforced within the space andfor some kinds of information would even be desir-able. If consistency of particular structures is requiredthen this can be achieved through agent implementa-tion and specific belief revision models.

6 Conclusion

In summary we believe that the Semantic Web willmove from being a global information corpus to mul-tiple, individual, linked and personal corpii. Seman-tics, reasoning and processing about the informationwill be localised and personalised within these corpii.As corpii are shared, linked, merged and split, certainschemata will coalesce and evolute into fixed or stan-dard structures in an evolutionary form.

There are still questions regarding what preciselyis semantics and how this is preserved across informa-tion structures but techniques do exist for reasoningabout this and the related graph/information prove-nance problems and these are currently being imple-

mented and trialled within the framework we have de-scribed.

REFERENCES

Afraz Jaffri, Hugh Glaser, I. M. (2008). Uri disam-biguation in the context of linked data. InLinkedData on the Web (LDOW 2008), Beijing, China.

Berners-Lee, T., Hendler, J., and Lassila, O.(2001). The semantic web.Scientific American,284(5):34–43.

Burcea, I., Petrovic, M., and Jacobsen, H.-A. (2003).I know what you mean: semantic issues ininternet-scale publish/subscribe systems. InCruz, I. F., Kashyap, V., Decker, S., and Eck-stein, R., editors,SWDB, pages 51–62.

Ding, L., Finin, T., Peng, Y., Silva, P. P. D., and L, D.(2005). Tracking rdf graph provenance using rdfmolecules. Technical report, 2005, Proceedingsof the Fourth International Semantic Web Con-ference.

Fellbaum, C. (1998). Wordnet: An electronic lexicaldatabase.

Gardenfors, P. (2000).Conceptual Spaces: The ge-ometry of thought. The MIT Press. 0-262-57219-2.

Haag, S., Cummings, M., and McCubbrey, D. J.(2003). Management Information Systems forthe Information Age. McGraw-Hill Companies.4th edition. 978-0072819472.

Hyvonen, E. (2008). Finnonto-malli kansallisen se-manttisen webin sisaltoinfrastruktuurin perus-taksi - visio ja sen toteus (finnonto model as the

basis for a national semantic web infrastructure- vision and its implementation). InPaper pre-sented at the publication event of the Finnish On-tology Library Service ONKI.

Hyvonen, E., Makela, E., Kauppinen, T., Alm, O.,Kurki, J., Ruotsalo, T., Seppala, K., Takala, J.,Puputti, K., Kuittinen, H., Viljanen, K., Tuomi-nen, J., Palonen, T., Frosterus, M., Sinkkila,R., Paakkarinen, P., Laitio, J., and Nyberg, K.(2008). Culturesampo – a collective memory offinnish cultural heritage on the semantic web 2.0.

Idehen, K. and Erling, O. (2008). Linked data spacesand data portability. InLinked Data on the Web(LDOW 2008), Beijing, China.

Jan Wielemaker, G. S. and Wielinga, B. (2003).Prolog-based infrastructure for rdf: performanceand scalability. InIn: D. Fensel, K. Sycara andJ. Mylopoulos (eds.) The Semantic Web - Pro-ceedings ISWC’03, Sanibel Island, Florida. Lec-ture Notes in Computer Science, volume 2870,Springer-Verlag, pages 644–658.

Khushraj, D., Lassila, O., and Finin, T. (22-26 Aug.2004). stuples: semantic tuple spaces.Mo-bile and Ubiquitous Systems: Networking andServices, 2004. MOBIQUITOUS 2004. The FirstAnnual International Conference on, pages 268–277.

Kim, H.-G. and Anseo-Dong (2002). Pragmatics ofthe semantic web. InSemantic Web Workshop.Hawaii, USA.

Krummenacher, R. (2008). The smartest space ofall: A global space of (machine-understandable)knowledge. In 1st Russian Conference onSmartSpaces, ruSmart 2008, St.Petersburg, Rus-sia.

Lassila, O. (2002). Taking the rdf model theory outfor a spin. InISWC ’02: Proceedings of the FirstInternational Semantic Web Conference on TheSemantic Web, pages 307–317, London, UK.Springer-Verlag.

Lassila, O. (2007).Programming Semantic Web Ap-plications: A Synthesis of Knowledge Represen-tation and Semi-Structured Data. PhD thesis,Helsinki University of Technology.

Lassila, O. (2008). Some personal thoughts on se-mantic web and non-symbolic ai. In4th Interna-tional Workshop on Uncertainty Reasoning forthe Semantic Web.

Lassila, O. and Adler, M. (2003). Semantic gadgets:Ubiquitous computing meets the semantic web.In Fensel, D., Hendler, J. A., Lieberman, H.,and Wahlster, W., editors,Spinning the SemanticWeb: Bringing the World Wide Web to Its Full

Potential, pages 363–376. MIT Press. 0-26206-232-1.

Milner, R. (1999). Communicating and Mobile Sys-tems: the Pi-Calculus. Cambridge UniversityPress. 0-521-65869-1.

Mueller, E. T. (2006). Commonsense Reasoning.Morgan Kaufmann. 0-12-369388-8.

Nixon, L. J. B., Simperl, E., Krummenacher, R., andMartin-Recuerda, F. (2008). Tuplespace-basedcomputing for the semantic web: a survey of thestate-of-the-art.The Knowledge Engineering Re-view, 23(2):181–212.

Oliver, I. and Honkola, J. (2008). Personal semanticweb through a space based computing environ-ment. In Middleware for Semantic Web 08 atICSC’08, Santa Clara, CA, USA.

Oliver, I., Honkola, J., and Ziegler, J. (2008). Dy-namic, localised space based semantic webs. InWWW/Internet Conference, Freiburg, Germany.

Passant, A. (2008). :me owl:sameasflickr:33669349@n00. InLinked Data onthe Web (LDOW 2008), Beijing, China.

Ruotsalo, T. and Hyvonen, E. (2007). A method fordetermining ontology-based semantic relevance.In Proceedings of the International Conferenceon Database and Expert Systems ApplicationsDEXA 2007, Regensburg, Germany. Springer.

Ruotsalo, T., Seppala, K., Viljanen, K., Makela, E.,Kurki, J., Alm, O., Kauppinen, T., Tuominen,J., Frosterus, M., Sinkkila, R., and Hyvonen, E.(2008). Ontology-based approach for interoper-ability of digital collections.Signum, (5).

Shafiq, O., Scharffe, F., Wutke, D., and del Valle,G. T. (2008). Resolving data heterogeneity is-sues in open distributed communication middle-ware. InProceedings of 3rd International Con-ference on Internet and Web Applications andServices (ICIW 2008), Athens, Greece.

Smith, B. C. (1996).On the Origin of Objects. TheMIT Press. 0-262-69209-0.

Tolk, A. and Muguira, J. A. (2003). The levels ofconceptual interoperability model (lcim). InPro-ceedings IEEE Fall Simulation InteroperabilityWorkshop, IEEE CS Press.

Turnitsa, C. D. (2005). Extending the levels ofconceptual interoperability model. InProceed-ings IEEE Summer Computer Simulation Con-ference, IEEE CS Press.

Wheeler, Q. D. (2004). Taxonomic triage and thepoverty of phylogeny.