15
Social Network and Data Portability using Semantic Web Technologies Uldis Boj¯ ars 1 , Alexandre Passant 2,3 , John G. Breslin 1 , Stefan Decker 1 1 DERI, National University of Ireland, Galway, Ireland [email protected] 2 LaLIC, Universit´ e Paris-Sorbonne, Paris, France [email protected] 3 Electricit´ e de France R&D, Clamart, France [email protected] Abstract. Social network and data portability has recently gained a lot of interest as one of the issues for social media sites on the Web. In this paper, we will show how Semantic Web technologies and especially the FOAF and SIOC vocabularies can be used to model user information and user-generated content in a machine-readable way. Thus, we will see how data and network information can be reused among various services and applications, at almost zero-cost for developers of such tools. Key words: Social Media, Semantic Web, Web 2.0, Data Portability, FOAF, SIOC 1 Introduction Social media sites, including social networking services, have captured the at- tention of millions of users as well as billions of dollars in investment and ac- quisition. To better enable a user’s access to multiple sites, portability between social media sites is required in terms of (1) identification, personal profiles and friend networks and (2) user’s content expressed on each site, whether it is about blog posts, pictures, bookmarks or any type of data. Such portability would al- low users to easily exchange content between services, or merge and share their social network between various websites. This requires representation mecha- nisms to interconnect both people and objects on the Web in an interoperable, machine-understandable, and extensible way. The Semantic Web, which is an extension of the current Web in which information is given well-defined mean- ing, better enabling computers and people to work in cooperation [3], provides those required representation mechanisms for portability between social media sites: it links people and objects to record and represent the heterogeneous ties that bind each to the other. The FOAF 1 initiative [8] provides a solution to the first requirement (1), while the SIOC 2 project [7] can address the latter (2). By 1 Friend-of-a-Friend - http://www.foaf-project.org 2 Semantically-Interlinked Online Communities - http://sioc-project.org 5

Social Network and Data Portability using Semantic Web

  • Upload
    lycong

  • View
    218

  • Download
    1

Embed Size (px)

Citation preview

Social Network and Data Portability using

Semantic Web Technologies

Uldis Bojars1, Alexandre Passant2,3, John G. Breslin1, Stefan Decker1

1 DERI, National University of Ireland, Galway, [email protected]

2 LaLIC, Universite Paris-Sorbonne, Paris, [email protected]

3 Electricite de France R&D, Clamart, [email protected]

Abstract. Social network and data portability has recently gained a lotof interest as one of the issues for social media sites on the Web. In thispaper, we will show how Semantic Web technologies and especially theFOAF and SIOC vocabularies can be used to model user informationand user-generated content in a machine-readable way. Thus, we will seehow data and network information can be reused among various servicesand applications, at almost zero-cost for developers of such tools.

Key words: Social Media, Semantic Web, Web 2.0, Data Portability,FOAF, SIOC

1 Introduction

Social media sites, including social networking services, have captured the at-tention of millions of users as well as billions of dollars in investment and ac-quisition. To better enable a user’s access to multiple sites, portability betweensocial media sites is required in terms of (1) identification, personal profiles andfriend networks and (2) user’s content expressed on each site, whether it is aboutblog posts, pictures, bookmarks or any type of data. Such portability would al-low users to easily exchange content between services, or merge and share theirsocial network between various websites. This requires representation mecha-nisms to interconnect both people and objects on the Web in an interoperable,machine-understandable, and extensible way. The Semantic Web, which is anextension of the current Web in which information is given well-defined mean-ing, better enabling computers and people to work in cooperation [3], providesthose required representation mechanisms for portability between social mediasites: it links people and objects to record and represent the heterogeneous tiesthat bind each to the other. The FOAF1 initiative [8] provides a solution to thefirst requirement (1), while the SIOC2 project [7] can address the latter (2). By

1 Friend-of-a-Friend - http://www.foaf-project.org2 Semantically-Interlinked Online Communities - http://sioc-project.org

5

using agreed-upon Semantic Web formats like FOAF and SIOC to describe peo-ple, content objects, and their connections, social media sites can interoperateand provide portable data by appealing to some common semantics. Moreover,the combination of OpenID and FOAF can be used as a backbone for uniqueidentification and profile definition on social media sites, which can in turn belinked to a user’s created content via SIOC.

In this paper, we will discuss the application of these technologies to enhancecurrent social media sites with semantics and to address issues with portabilitybetween such services. We will show how FOAF and SIOC can provide smartsolutions for data portability amongst various social media sites, allowing oneto reuse their data and friends networks from one service on other services, aswell as interlinking content from one site to another. We will present theoreticalaspects as well as scenarios and implementations of such solutions.

2 Overview of Social Network Portability

2.1 Data Portability History

”Social network portability” is the term used to describe the ability to reuse one’sown profile across various social networking sites. Brad Fitzpatrick3 spoke froma developer’s point of view about forming a ”decentralised social graph” [9] anddiscussed some ideas for social network portability and aggregating one’s friendsacross sites. However, it is not just friends that may need to be ported across so-cial networking sites (and across social media sites in general), but identity andcontent items as well. Soon afterwards, ”A Bill of Rights for Users of the SocialWeb” [11] was authored for social websites who wish to guarantee ownership andcontrol over one’s own personal information. As part of this bill, the authors as-serted that participating sites should provide social network portability, but thatthey should also guarantee users ”ownership of their own personal information,including the activity stream of content they create”, and also stated that ”sitessupporting these rights shall allow their users to syndicate their own stream ofactivity outside the site”. The Social Graph API4 from Google is another relatedeffort that provides methods to query aggregated social graph information fromthe Web. It currently uses formats like XFN and FOAF, which we will talk aboutlater. More recently, the temporary removal of prominent blogger Robert Scoblefrom Facebook5 relaunched interest in data ownership and portability amongstdifferent social media sites. The DataPortability project6 was launched in 2007,with members from various organisations including Facebook, Google and Mi-crosoft coming together to discuss portability issues from technical and legalstandpoints. The OpenSocial foundation, recently proposed by Google, Yahoo!3 Founder of the LiveJournal blogging community4 http://code.google.com/apis/socialgraph/5 http://scobleizer.com/2008/01/03/ive-been-kicked-off-of-facebook/6 http://dataportability.org

6

and MySpace also aims to provide APIs to let developers write social applica-tions that access data and networks from various social media websites.

However, to enable a person’s transition and / or migration across social me-dia sites, there are significant challenges associated with achieving such porta-bility both in terms of the person-to-person networks and the content objectsexpressed on each site. Social media sites should be able to collect a person’srelevant content items and objects of interest and provide some limited dataportability (at the very least, for their most highly used or rated items). We willrefer to these items as one’s social media contributions, or SMCs. Through suchportability, the interactions and actions of a person with other users and objects(on systems they are already using) can be used to create new person or contentassociations when they register for a new social media site. Rather than requiringproprietary APIs to access this data from each service, we think that uniformrepresentation mechanisms are needed to represent and interconnect people andobjects on the Web in an interoperable, extensible way.

2.2 The Semantic Web and Data Portability

The Semantic Web provides such representation mechanisms: it links peopleand objects to record and represent the heterogeneous ties that bind us to eachother. By using agreed-upon Semantic Web formats, like RDF with existing ornew ontologies, to describe people, content objects, and the connections thatlink them together, social media sites can interoperate by appealing to commonsemantics. Developers are already using Semantic Web technologies to augmentthe ways in which they create, reuse, and link content on social media sites,and some of them already provide exports from social networking sites in suchmachine-readable formats. In the other direction, social media sites can serve asrich data sources for Semantic Web applications. As Tim Berners-Lee said in theISWC 2005 podcast, Semantic Web technologies can support online communitieseven as ”online communities ... support Semantic Web data by being the sourcesof people voluntarily connecting things together”7. Such semantically-linked datacan provide an enhanced view of individual or community activity across socialmedia sites (for example, ”show me all the content that Alice has acted on in thepast three months”). Thus, we do not consider Web 2.0 and Semantic Web asopposing candidates, but rather we believe that they can be combined with eachother to provide a Social Web where data can be exchanged and interlinked nomatter where it comes from[1].

In the next section, we will describe how the Semantic Web, and especiallyFOAF, can be used to define one’s profile and can act as a unique entry point forpersonal data across different social media sites. We will also place emphasis onhow it can be used to define not only personal information, but also decentralisedsocial networks, and how a user could re-use this information within SemanticWeb compliant social media websites. The second part of this paper will overviewthe data portability aspect, thanks to the SIOC ontology that provides a way to7 http://esw.w3.org/topic/IswcPodcast

7

describe all data entries for a given user wherever they come from. We will seethrough an example how it helps to move data from one platform to another.Finally, we will conclude with various thoughts regarding links between socialmedia sites, the Semantic Web, social networks and data portability.

3 Social Network Representation with FOAF

3.1 Identity and Networking Management Across Social Media Sites

While many social media sites allow people to define their social networks, onlya few of them permit users to export their networks so that they can be reusedacross other applications. Moreover, when this is the case, users have to rely onsome specific APIs, which means writing ad-hoc tools for each data provider.The FOAF project provides a way to represent social network data in a sharedand machine-readable way, since it defines an ontology for representing peopleand the relationships that they share. While some sites already offer FOAF ex-port, such as LiveJournal8, MyBlogLog9 and Hi5.com10, there are many othersocial media sites that do not directly expose their data in RDF. However, devel-opers have created different tools to achieve this goal. For example, user profileinformation is available in RDF thanks to exporters for Flickr11, Facebook12

or Twitter13. In the latter, this complements machine-readable social networkdescriptions already embedded via microformats in their pages.

Using FOAF, people and relationships can be modeled using these principles:

– each person is represented as a foaf:Person instance and may be assignedURI(s), their unique identifiers on the (Semantic) Web;

– each person has various properties, such as a name (foaf:name), nickname(foaf:nickname) or birthdate (foaf:birthday);

– people can be related to each other using the foaf:knows property.

For example, the following snippet of code represents one of the author’sprofiles created from Flickr using FOAF:

flickr:33669349@N00 a foaf:Person ;foaf:name "Alexandre Passant" ;foaf:mbox_sha1sum "528b95cc44060ceea571d7498a9fd2c7e3ca8a4c" .foaf:knows flickr:32233977@N00 .

Leveraging Semantic Web representations of people and social networks usingwidely-adopted ontologies such as FOAF allows us to use generic RDF parsers8 http://livejournal.com9 http://www.mybloglog.com/

10 http://hi5.com11 http://apassant.net/blog/2007/12/18/rdf-export-of-flickr-profiles-with-foaf-and-

sioc/12 http://www.dcs.shef.ac.uk/ mrowe/foafgenerator.html13 http://sioc-project.org/node/262

8

and SPARQL[10] (a RDF query language which recently became a W3C recom-mendation) to browse and reuse data. Thus, end users can use the same toolsto parse their network wherever it comes from. To that extent, FOAF simpli-fies the process of writing tools for developers of social-networking frameworks,especially since many open-source tools are available for most platforms14 15.

3.2 Merging and Querying Social Networks

As introduced previously, FOAF allows us to describe personal profiles, but itcan also be used to represent relationships between people. Since various sitescan export a FOAF representation of users and social networks using their ownURIs schemes as shown in the previous RDF snippet, there is still a need tomerge and consolidate distributed profiles. In order to consolidate URIs, net-work owners may rely on Semantic Web best practices that suggest the useof the following properties to represent the identity of existing objects [4] (1)owl:sameAs is used to identify that two resources are the same in spite of dif-ferent URIs and (2) rdfs:seeAlso is used to let crawlers and Semantic Webbrowsers such as Tabulator [2] know where to find additional RDF statementsabout the resource. Many Semantic Web tools also follow Linked Data guide-lines16 and try to dereference instance URIs, thus providing another way to findadditional RDF data.

Using these properties in a distributed and open multi social-network contextallows people to interlink and unify the various URIs that represent themselves.To do so, people can reference a main FOAF URI which can be described viaa hand-crafted or automatically generated FOAF profile which links to otherexisting profiles (and also to interlink distributed social networks from variousplatforms), as the following snippet and Fig.1 describes:

:me owl:sameAs flickr:33669349@N00 ;owl:sameAs twitter:terraces ;owl:sameAs facebook:foaf-607513040.rdf#me .

Providing such an entry point allows any RDF-compliant tool to browseone’s complete social network in a simple way, i.e. retrieving relationships fromFlickr, Twitter or Facebook (1) with standard libraries and SPARQL queries and(2) without having to crawl the Web for data since everything can be accessedfrom one FOAF file. As an example, we provide a simple script that rendersa users’ complete social network in a user-friendly Flash interface17. This toolonly requires the main URI of the user, and thanks to the interlinkage propertiesdescribed before, it retrieves other URIs and related social networks to render it,as shown on Fig.2. This application, which requires only a few lines of Pythonand SPARQL queries to parse the complete network, clearly shows the benefitsof using common semantics to describe networks on social media websites.14 http://www.mkbergman.com/?page id=34615 http://www.w3.org/2001/sw/SW-FAQ#tools16 http://www.w3.org/DesignIssues/LinkedData.html17 http://apassant.net/home/2008/01/foafgear

9

flickr:2233977@N00

flickr:33669349@N00

flickr: 43184127@N00

flickr: 24266175@N00

foaf:knowsfoaf:knows

foaf:knows

twitter:CaptSolo

twitter:terraces

twitter:Wikier

twitter:CharlesNepote

foaf:knowsfoaf:knowsfoaf:knows

twitter:potiontv

foaf:knows

myblog:a30

myblog:a2

myblog:a19myblog:a26

foaf:knows

foaf:knowsfoaf:knows

myuri:me

owl:sameAs

owl:sameAsowl:sameAs

http://tools.opiumfield.com/twitter/terraces/rdf

http://myblog/foaf-export

http://apassant.net/home/2007/12/flickrdf/data/people/33669349N00

Fig. 1. Interlinking social networks with the Semantic Web

Fig. 2. Browsing a complete social network with a single entry point

Finally, another way to identify uniqueness of someone across various net-works is to rely on properties that can uniquely identify him. This is especiallyuseful to merge people among various social networks when they did not ex-plicitly use owl:sameAs links. When writing ontologies, OWL offers the abilityto define properties as being inverse functional in order to indicate that twoRDF descriptions using the same value for this property are ”talking” aboutthe same entity. OWL axioms (i.e., InverseFunctionalProperty) tell us which

10

properties can be used in this way, i.e. as indirect identifiers - implementing ”ref-erence by description”. FOAF uses several properties of this kind, for example,foaf:mbox sha1sum (i.e. scrambled e-mail) and foaf:openid (i.e. an OpenIDURL). Thus, even if people use different screen names on various websites, assoon as they register with the same email or with the same OpenID URL, theycan be uniquely identified across distributed social networks. As an example,the following SPARQL query will retrieve for one user the information about allthe people that he or she knows on Flickr and on their weblog (if a contact leftsome comment there), merging their identities thanks to their mbox sha1sum,whatever their username may be on those platforms.

SELECT ?friend ?emailWHERE {GRAPH <http://my_flickr_export> {:me foaf:knows ?friend .?friend foaf:mbox_sha1sum ?email

}GRAPH <http://my_blog_export> {:me foaf:knows ?friend .?friend foaf:mbox_sha1sum ?email

}}

4 Social Network Portability with FOAF

4.1 Social Networks in Personal Applications

Such semantically-powered social network descriptions can be re-used in existingpersonal desktop- or web-based tools. For example, Knowee18 is a web-basedapplication that allows a user to list all of their FOAF URIs, and also includes amicroformat parser (which can be used, for example, with Twitter) that features”smushing” capabilities (i.e., identity reasoning), based on user-defined rules aswell as on pre-defined ones to uniquely identify people among various socialnetworks descriptions. The network is then browsable using an AJAX-ified userinterface. From the desktop point of view, Beatnik19 provides a semantic addressbook where you can browse various FOAF profiles and see connections betweenpeople. A future development reusing those aspects is the SPARQLPress20 plug-in for WordPress, that may be used as a personal social network aggregator basedon semantic technologies within this popular blogging tool.18 http://knowee.org19 https://sommer.dev.java.net/source/browse/sommer/trunk/misc/AddressBook/www/20 http://wiki.foaf-project.org/SparqlPress

11

4.2 Reusing Social Networks Across Social Media Site

In order to explain the benefits of such an approach from a portability-between-sites point of view, we will describe the use case of a FOAF-aware social mediawebsite.

Bob, a new user, wants to join a (fictional) Networkr service in order to sharesome pictures and posts with his friends. To do so, he creates an account usinghis OpenID URL. While the primary advantage with OpenID is that he does notneed a new login and password to connect, it allows the site to easily discoverhis FOAF profile and URI. Indeed, Bob has delegated his OpenID to his owndomain name, and added an autodiscovery link in his homepage HTML headerto let software agents discover the location of his profile with a single line ofcode21:

<head><link rel="meta"type="application/rdf+xml"title="FOAF" href="bob_foaf.rdf" />

</head>

The system will then retrieve Bob’s profile as well as his URI (thanks to thefoaf:openid property) and read Bob’s social networks to check if any people inone of his existing networks are already registered on Networkr. The service willthen ask Bob if he wants to consider all those people as friends on Networkr.Since Bob does not want to grant access to everyone about his activities onthis website, he decides to check himself who to add from his existing friends.He then adds photos, and decides to restrict access to only those people he hadpreviously added in Flickr. All of these steps have been efficiently achieved by thewebsite since it just has to query a single profile and the related RDF descriptionof Bob’s social graph. Moreover, each time he logs in, Networkr again browsesBob’s complete network to retrieve updates and change local access rights ifneeded. Thus, as soon as Bob adds someone as a friend on Flickr, he will gainaccess to his new picture gallery. Finally, since Networkr is completely open andconsider that the data and social graph belongs to the user, it allows Bob toexport his new restricted network, which he can then reuse on other websites.Privacy issues should be considered in those uses cases, to allow more complexaccess rights definition or restrictions, for example if a user wants certain kind ofpictures to be seen only by a subgroup of his Flickr network. Identification andtrust may also be a problem to display only relevant information when requestingRDF data from Networkr and the recent RDF authentication discussion22 couldbe considered here.21 http://wiki.foaf-project.org/Autodiscovery22 http://blogs.sun.com/bblfish/entry/rdfauth_sketch_of_a_buzzword

12

5 Data Portability with SIOC

5.1 Describing Social Media Contributions using SIOC

The SIOC initiative was initially established to describe and link discussion poststaking place on online community forums such as blogs, message boards, andmailing lists. As discussions begin to move beyond simple text-based conversa-tions to include audio and video content, SIOC has evolved to describe not onlyconventional discussion platforms but also new Web-based communication andcontent-sharing mechanisms [5].

In combination with the FOAF vocabulary for describing people and theirfriends, and the Simple Knowledge Organization System (SKOS) model fororganising thesaurus-like data, SIOC lets developers link user-created contentitems to other related items, to people (via their associated user accounts), andto topics (using specific ”tags” or hierarchical categories). Through its SIOCType module, SIOC can represent various types of containers (i.e. Wiki, Blog,MessageBoard) and content items (i.e. WikiArticle, BlogPost, BoardPost).Moreover, this is not limited to textual content because SIOC can be also usedto represent content such as ImageGallery in the example of the Flickr exporter.Finally, as a good Semantic Web citizen, SIOC reuses and extends existing on-tologies such as Dublin Core and FOAF in order to be compatible with RDFdata modeled using other existing vocabularies23.

Various tools, exporters and services have been created to expose SIOC datafrom existing online communities24. These include APIs for PHP, Perl, Javaand Ruby, data exporters for systems like WordPress, Drupal, phpBB and Blo-gEngine.NET, data producers for RFC 4155 mailboxes, SIOC converters for Web2.0 services like Twitter and Jaiku, and usage in commercial products includingTalis Engage and OpenLink Virtuoso.

All of these data sources provide accurate structured descriptions of socialmedia contributions (SMCs) that can be aggregated from different sites (e.g.by person via their user accounts, by co-occurring topics, etc.). Fig.3 shows theprocess of porting SIOC data from various sources to SIOC import mechanismsfor WordPress and future applications. We will now describe the SIOC importplugin for WordPress.

5.2 Importing SIOC Data, with a WordPress Example

The SIOC import plugin25 for WordPress blog engine is an initial demonstratorfor social media portability using SIOC. SIOC import panel in the WordPressadministrator user interface (Fig.4) allows a weblog maintainer to import user-created content from another website (described in the form of SIOC data) totheir weblog.23 SIOC Ontology: Related Ontologies and Vocabularies -

http://www.w3.org/Submission/sioc-related/24 http://rdfs.org/sioc/applications/25 http://wiki.sioc-project.org/w/SIOC Import Plugin

13

Fig. 3. Porting social media contributions from data providers to import services

Fig. 4. Importing SIOC data in WordPress

Data to be imported can be created from a number of different social mediasites using SIOC export tools (as described above) which are at the data creationside of the SIOC ”food chain” [6]. Since the only requirement for the importeris to ”understand” data modeled with SIOC, it makes no difference wether thedata comes from another WordPress blog, a vBulletin message board or yourlatest updates on Twitter.

14

For example, a SIOC exporter plugin for a blog engine would create a SIOCRDF representation of every blog post and comment, including informationabout:

– the content of a post (sioc:content)– the author (sioc:has creator)– the creation / update date (dct:created / dct:updated)– tags and categories (sioc:topic)– all comments on the post (sioc:has reply)– information about the container blog (sioc:has container)

The use of RDF for data representation enables us to easily extend this datamodel with new properties when they become necessary, even with needs thatare not covered by SIOC itself but that one social media site would require toachieve certain task. We thus benefit from a model which is well-formalised butcompletely extensible.

The import process implemented by the WordPress SIOC import plugin isthe following:

– Parse RDF data (using the open-source ARC26 RDF parser)– Find all posts - instance(s) ofsioc:Post - which exhibit all of the properties

required by the target site– For each post found, it creates a new post using WordPress API calls

The ”proof of concept” implementation of SIOC imported worked with asingle SIOC file and imports all the posts contained within it. Fig.5 shows anexample post imported into WordPress.

Since SIOC is a universal data format and is not specific to any particular site,this pilot implementation already allows us to move content between differentblog engines or even between different kinds of social media sites. However, morefunctionality is needed for data portability and the next iteration of WordPressSIOC importer uses the sioc:has reply property to identify, retrieve (i.e. fetchadditional SIOC RDF files describing these comments) and re-create all thecomments associated with the blog posts imported. This approach can be furtherextended by processing all other kinds of objects and information described insource SIOC data.

5.3 Data Portability for a Complete Social Media Site

We will now describe how a SIOC import tool can be extended to port alluser-created content from one social media site to another. By starting froma site’s main SIOC profile, we retrieve machine-readable information about allthe content of this site - starting with the forums hosted therein, and thenretrieving the contained posts, comments, and associated users. This extendedSIOC import tool retrieves all SIOC data pages (possibly limited by user-definedfilters) and to re-create all the data found in this SIOC page on the target social

26 http://arc.semsol.org

15

Fig. 5. Imported post in WordPress

media site. Just as can be done with FOAF for social networks, the completeRDF description of a social media site can be found by pointing to a single entrypoint - a SIOC site profile.

A result of importing the SIOC data for a whole site will be a replica ofthe original site, including links between objects (e.g. between posts and theircomments). Often, a part of the content that a user wants to port is not forpublic consumption (e.g. if a user is porting some personal information betweenhis accounts). SIOC can be used in this case, but the user will first need toauthenticate at the source site and ensure that they have enough privileges toaccess all the data that need to be migrated.

Another step in social media portability is keeping two sites synchronised (ifrequired): having the same set of users, posts, comments, category hierarchies,etc. In principle, this can be achieved by importing a full SIOC dataset and thenmonitoring SIOC data feeds for new items added (some SIOC export tools mayneed to be extended to do this). Implementing this in practice will undoubtedlyunfold some interesting challenges, such as real-time synchronisation in two di-rections, as well as the choice of pushing data or letting services find the dataon the Web and update it themselves.

5.4 Perspectives of SIOC Data Portability

An interesting use case for SIOC data portability would be migration betweendifferent platforms. For example, this could occur if a person has been using a

16

mailing list for a particular community, and they then decide that the extendedfunctionality offered to them by a Web-based bulletin board platform is required.Once again, since SIOC can be used to represent various types of containers anditems, but using the same format and based on the same high level concepts inthe SIOC ontology, moving a mailing-list content to a bulletin board, or a blogpost comment to a wiki page can be easily achievable.

While existing web feeds can offer some data portability, the ”sliding win-dow” metaphor (providing only last 10-15 items) underlying RSS/Atom con-trasts somewhat with the goal of creating a complete archive. SIOC goes furtherby providing a complete format for representing social media site’s data and po-tentially offering a full archive of site’s content. At the same time there remainsan overlap in scope here between SIOC and Atom in particular, given that Atomalso offers a rich data access and editing protocol.

The discussion-type content items are not the only kind of items that can beported. The SIOC Types module27 extends SIOC to be able to describe variousWeb 2.0 and social media content types. Different types of content items (Sound,MovingImage, Event, Bookmarks, etc) can be organised in sioc:Container(s)and ported in the same way. While the example in the previous section wasbased on a WordPress implementation, any social media web site could use thesame process to import data from one service to another. For example, using acommon data format, a user can post on his blog, others can reply in comments,and then the whole discussion can be moved to a bulletin board or importedas a discussion thread associated with a photo on an image sharing site such asFlickr.

In some use cases, data portability may require selective importing of aspecific kind of objects. E.g. a user may decide to port information about allvideos (sioc t:MovingImages) from his website to Youtube, while moving hissioc t:Bookmarks to a bookmark manager such as del.icio.us. What should bekept in mind is that the latter (bookmarks) can be fully expressed in RDF whilethe former (videos) have a ”payload” that will be separate from RDF data andmay require specific handling in order to archive it or to transfer from one siteto another.

We mainly discussed porting data from one location or site to another, butwe can also consider a wider context - personal ”life-stream” or activity streaminformation. Such streams describe all kinds of activities performed by users,including (but not limited to) creation of content items and social network rela-tions. One interesting application using such streams (and of SMC data in SIOCas one kind of activity data) can be ”life-stream” archiving where a person re-trieves and keeps an archive of all her activities on different social media sites,which can both keep a permanent personal archive and allow different kinds ofpersonal data applications built on top of it.27 http://rdfs.org/sioc/types

17

6 Conclusions and Future Work

In this paper we began by introducing the various challenges regarding socialnetwork and data portability and how the Semantic Web can be used to help withachieving this goal. We first introduced the use of FOAF to represent identityand distributed social networks, as well as ways to reuse it across various personalapplications or various social media sites. We then demonstrated how SIOC datacan be used to represent and port the diverse social media contributions beingmade by users on various sites. Consequently, both approaches can be mergedto retrieve and identify all content objects produced by a single user on varioussocial media sites and services, as well as their friends or social network data,as shown on Fig.6, thus leveraging the transfer of both people and content databetween Web 2.0 sites using the Semantic Web.

Fig. 6. FOAF, SIOC and Data Portability

For future work, an important issue is who should be allowed to reuse cer-tain data in other sites (as spam blogs are often duplicating other people’s con-tent without authorisation for SEO purposes), and what information do people”own” on a social media site and are allowed to port elsewhere. As well as col-lecting a person’s relevant content objects, social media sites may need to verifythat a person is allowed to reuse data / metadata from these objects in externalsystems. This could be achieved by using SIOC and FOAF as representation for-mats, aggregating content items created by a person (through her user accounts)from various sites, and combining this with some authentication, signature and

18

trust mechanisms to verify that these items can be reused by the authenticatedindividual on whatever new sites they choose. When porting content created byothers the information about content licenses (e.g. Creative Commons licensesin RDF28) will need to be added to SIOC RDF data and taken into accountwhen porting data.

7 Acknowledgments

This material is based upon work supported by Science Foundation Ireland undergrant number SFI/02/CE1/I131. Authors would like to thank Dan Brickley, oneof the creators of the FOAF vocabulary, for his valuable feedback and suggestionsfor this paper.

References

1. Anupriya Ankolekar, Markus Krotzsch, Duc Thanh Tran, and Denny Vrandecic.The Two Cultures: Mashing up Web 2.0 and the Semantic Web. Journal of WebSemantics, 6(1), FEB 2008.

2. Tim Berners-Lee, Yuhsin Chen, Lydia Chilton, Dan Connolly, Ruth Dhanaraj,James Hollenbach, Adam Lerer, and David Sheets. Tabulator: Exploring and an-alyzing linked data on the semantic web. In Proceedings of the 3rd InternationalSemantic Web User Interaction Workshop, 2006.

3. Tim Berners-Lee, James Hendler, and Ora Lassila. The Semantic Web. ScientificAmerican, 284(5):34–44, May 2001.

4. Chris Bizer, Richard Cyganiak, and Tom Heath. How to Publish Linked Data onthe Web. http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/, 20July 2007.

5. Uldis Bojars, John Breslin, Aidan Finn, and Stefan Decker. Using the SemanticWeb for Linking and Reusing Data Across Web 2.0 Communities. The Journal ofWeb Semantics, Special Issue on the Semantic Web and Web 2.0 (Forthcoming),2008.

6. Uldis Bojars, John G. Breslin, Vassilios Peristeras, Giovanni Tummarello, and Ste-fan Decker. Interlinking the Social Web with Semantics. IEEE Intelligent Systems,23(3), May/June 2008.

7. J.G. Breslin, A. Harth, U. Bojars, and S. Decker. Towards Semantically-InterlinkedOnline Communities. In Proceedings of the Second European Semantic Web Con-ference, ESWC 2005, May 29–June 1, 2005, Heraklion, Crete, Greece, 2005.

8. Dan Brickley and Libby Miller. FOAF Vocabulary Specification. NamespaceDocument 2 Sept 2004, FOAF Project, 2004. http://xmlns.com/foaf/0.1/.

9. Brad Fitzpatrick and David Recordon. Thoughts on the social graph.http://bradfitz.com/social-graph-problem/, 08 2007.

10. Eric Prud’hommeaux and Andy Seaborne. SPARQL query language for RDF.W3C Recommentation, W3C, January 2008.

11. Joseph Smarr, Marc Canter, Robert Scoble, and Michael Arrington. A bill of rightsfor users of the social web. http://opensocialweb.org/2007/09/05/bill-of-rights/,4 September 2007.

28 http://creativecommons.org/ns

19