SiocLog: Providing IRC Discussion Logs as Linked Data

Social Data on the Web Workshop at the International Semantic Web Conference / Washington, DC / 26th October 2009

School of Engineering and Informatics

Tuukka Hastrup (1), Uldis Bojars (2) and John G. Breslin (2, 3)

(1) University of Jyväskylä, Finland

(2) DERI, NUI Galway, Ireland

(3) School of Engineering and Informatics, NUI Galway, Ireland

Motivation

• IRC conversations are quite disconnected from the Web and even from other IRC channels and networks

• Often there is valuable and needed information in an IRC chat that cannot be linked to people, topics or events, or in general referenced from elsewhere

• Being able to reference this information would help people who do not use IRC, those on other networks, or simply people who leave and rejoin a channel

Motivation (2)

• SIOC provides a framework for linking social media contributions to other content and Linked Data resources, and IRC can become part of that framework

• We also need mechanisms to link the IRC contributions to the people who made them, hence the use of Web ID

Background

• We will begin by introducing the various areas relevant to this system:

– IRC

– Linked Data

– SIOC

– Web ID

Internet Relay Chat (IRC)

• Instant messaging / internet chat is a major form of social interaction online

• It is often disconnected from the Web:

– Due to the different protocols involved

– Due to its real-time nature / lack of persistent storage

• IRC was one of the earliest chat systems

• It has an important role amongst open-source communities, web communities, and even geeks!

– Hundreds of thousands of users online at any time

Linked Data

• Building a “Web of Data” to enhance the current Web

• Exposing, sharing and connecting data about things via dereferenceable URIs

• Linking datasets together that were not previously connected, for example:

– Music and people

– Real-world things and places

• The Linking Open Data (LOD) effort aims to link various open datasets together (DBpedia, GeoNames, etc.)
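As a minimal sketch of what "dereferenceable" means in practice, the snippet below prepares (without sending) an HTTP request for the RDF description behind a hash URI. The URI pattern is the one used later in these slides; the use of an Accept header for content negotiation and the helper name build_rdf_request are assumptions made for illustration, not something the slides confirm about the server.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_rdf_request(uri: str) -> Request:
    """Prepare (but do not send) a request for the document describing `uri`.

    Hypothetical helper: assumes the server honours the Accept header.
    """
    # The fragment (#user) is never sent over HTTP, so a client
    # dereferences the document part of a hash URI only.
    document_url, _fragment = urldefrag(uri)
    # Prefer RDF/XML, fall back to HTML for human readers.
    accept = "application/rdf+xml, text/html;q=0.5"
    return Request(document_url, headers={"Accept": accept})


request = build_rdf_request("http://irc.sioc-project.org/users/persona#user")
print(request.full_url)
print(request.get_header("Accept"))
```

The thing identified (the persona) and the document describing it stay distinct: the full URI names the persona, while the URL without the fragment is what actually gets fetched.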

Semantically-Interlinked Online Communities (SIOC)

• An effort from DERI, NUI Galway to discover how we can create / establish ontologies on the Semantic Web

• Goal of the SIOC ontology is to address interoperability issues on the (Social) Web

• http://sioc-project.org/

• SIOC has been adopted in a framework of 50 applications or modules deployed on over 400 sites

• Various domains: Web 2.0, enterprise information integration, HCLS, e-government

Some of the SIOC core ontology classes and properties

Some examples of where SIOC is already in use (about 50 implementations / applications)

Web ID

• A Web ID is a web address that identifies a person as a Linked Data item

• A Web ID should also lead to a document with more information about that person (e.g. FOAF, other RDF)

• For more information, see the definition in this paper:

– Ching-Man Au Yeung, Ilaria Liccardi, Kanghao Lu, Oshani Seneviratne, Tim Berners-Lee, “Decentralization: The Future of Online Social Networking”, W3C Workshop on Future of Social Networking

Design

Mapping IRC identifiers to URIs on the Web

• IRC Network: irc://freenode → http://irc.sioc-project.org/#freenode

• Channel: irc://freenode/%23channel → http://irc.sioc-project.org/channel#channel

• Message: no IRC identifier → http://irc.sioc-project.org/channel/0000-00-00#00:00:00.00

• Chat Persona: irc://freenode/persona,isuser → http://irc.sioc-project.org/users/persona#user
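The mapping on this slide can be sketched as a handful of small functions. This is an illustrative reading of the URI patterns, not the SiocLog source: the function names are hypothetical, and it is assumed that the path segment "channel" stands for the channel name without its leading "#", that "#channel" and "#user" are literal fragments, and that message times carry hundredths of a second.

```python
from datetime import datetime
from urllib.parse import quote

BASE = "http://irc.sioc-project.org"  # Web base URI shown on the slide


def irc_channel_uri(network: str, channel: str) -> str:
    """IRC-side identifier; the leading '#' is percent-encoded as %23."""
    return f"irc://{network}/{quote(channel, safe='')}"


def irc_persona_uri(network: str, nick: str) -> str:
    """IRC-side identifier for a chat persona (',isuser' as on the slide)."""
    return f"irc://{network}/{quote(nick, safe='')},isuser"


def web_network_uri(network: str) -> str:
    return f"{BASE}/#{network}"


def web_channel_uri(channel: str) -> str:
    # Assumption: the path uses the channel name without its leading '#'.
    return f"{BASE}/{quote(channel.lstrip('#'), safe='')}#channel"


def web_message_uri(channel: str, sent_at: datetime) -> str:
    # Messages have no IRC identifier, so date and time identify them:
    # one document per channel and day, one fragment per message time.
    hundredths = sent_at.microsecond // 10_000
    return (f"{BASE}/{quote(channel.lstrip('#'), safe='')}/"
            f"{sent_at:%Y-%m-%d}#{sent_at:%H:%M:%S}.{hundredths:02d}")


def web_persona_uri(nick: str) -> str:
    return f"{BASE}/users/{quote(nick, safe='')}#user"


print(irc_channel_uri("freenode", "#sioc"))
print(web_message_uri("#sioc", datetime(2009, 10, 26, 14, 5, 9, 120000)))
print(web_persona_uri("melvster"))
```

Using the date as a path segment and the time as a fragment means that one retrievable document can describe a whole day of discussion on a channel, with each message addressable inside it.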

Some of the internal and external links

Browsing the Linked Data

Creating a link between a user account on IRC and a personal profile

• Claiming a Web ID creates a link [black] between a user account (a sioc:User that created a sioc:Post in a sioct:ChatChannel) and a person (foaf:Person)

• The person can manually verify this:

– By pointing back to the sioc:User from their foaf:Person definition [grey]
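A minimal sketch of the verification step, assuming the back-link is expressed with foaf:holdsAccount (the property used in the SPARQL example in these slides). Triples are modelled as plain Python tuples rather than with an RDF library; the function name and the example.org profile are hypothetical.

```python
FOAF_HOLDS_ACCOUNT = "http://xmlns.com/foaf/0.1/holdsAccount"


def is_claim_verified(user_uri: str, web_id: str, profile_triples: set) -> bool:
    """Return True if the person's own profile points back to the IRC account.

    `profile_triples` holds (subject, predicate, object) tuples taken from
    the document retrieved for `web_id`.
    """
    return (web_id, FOAF_HOLDS_ACCOUNT, user_uri) in profile_triples


# Hypothetical data: a claim made on IRC and the profile it points to.
user = "http://irc.sioc-project.org/users/persona#user"
web_id = "http://example.org/people/alice#me"
profile = {(web_id, FOAF_HOLDS_ACCOUNT, user)}

print(is_claim_verified(user, web_id, profile))   # back-link present
print(is_claim_verified(user, web_id, set()))     # claim only, no back-link
```

The claim alone only shows that whoever controls the IRC account named a Web ID; the back-link shows that whoever controls the profile agrees.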

Web IDs in SiocLog

• A Web ID can be claimed using mttlbot

• A Web ID can also be claimed using standard IRC services:

/msg nickserv set property webid SomeWebID

Implementation

• 2000 lines of Python source code

• 1000 lines of Zope/TAL HTML templates

• Twisted, SimpleTAL and Redland libraries

• Four major components:

– IRC interface, data analysis, data integration, Web

Implementation (2)

• IRC interface:

– Discussion logger / persona monitor on Twisted

• Data analysis:

– Log processing through a pipeline of filters, with sinks for statistics / output

• Data integration:

– Queries for external Linked Data (personal profiles)

• Web interface:

– Handles requests via CGI, publishes the data as HTML and RDF
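The data analysis component described above (log processing, a pipeline of filters, sinks for statistics and output) could be organised roughly as follows. This is a sketch of the general pattern only: the log line format, the Event type, the drop_bots filter and the run function are all assumptions for illustration and are not taken from the SiocLog code.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    time: str
    nick: str
    text: str


def parse(lines: Iterable[str]) -> Iterator[Event]:
    """Parse lines of the assumed form 'HH:MM:SS <nick> message text'."""
    for line in lines:
        time, nick, text = line.split(" ", 2)
        yield Event(time, nick.strip("<>"), text)


def drop_bots(events: Iterable[Event]) -> Iterator[Event]:
    """Hypothetical filter: skip messages written by known bots."""
    return (e for e in events if e.nick not in {"mttlbot"})


def run(lines: Iterable[str],
        filters: list[Callable[[Iterable[Event]], Iterator[Event]]],
        sinks: list[Callable[[Event], None]]) -> None:
    events: Iterable[Event] = parse(lines)
    for apply_filter in filters:      # chain the filters lazily
        events = apply_filter(events)
    for event in events:              # fan each surviving event out to all sinks
        for sink in sinks:
            sink(event)


LOG = [
    "14:05:09 <alice> hello",
    "14:05:12 <mttlbot> alice: noted",
    "14:06:00 <bob> hi alice",
    "14:06:30 <alice> any news on SIOC?",
]

stats: Counter = Counter()   # statistics sink: messages per persona
output: list = []            # output sink: events to publish
run(LOG, [drop_bots], [lambda e: stats.update([e.nick]), output.append])
print(stats)
print(len(output))
```

Because the filters are generators, each log line flows through the whole chain before the next one is read, so a long log does not need to be held in memory.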

Finding the names of friends of an IRC persona with SPARQL

semwebquery -sparql "PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
  ?person foaf:holdsAccount <http://irc.sioc-project.org/users/melvster#user> .
  ?person foaf:knows ?friend .
  ?friend foaf:name ?name .
}"
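To make the three triple patterns in the query concrete, here is a small sketch that evaluates the same join over an in-memory set of triples in plain Python. It is not how semwebquery works internally; the data under example.org, the names, and the function friend_names are invented for illustration, and each matching name is returned once per friend rather than once per SPARQL solution.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
ACCOUNT = "http://irc.sioc-project.org/users/melvster#user"

# Hypothetical triples, as they might be gathered from linked profiles.
TRIPLES = {
    ("http://example.org/p/owner#me", FOAF + "holdsAccount", ACCOUNT),
    ("http://example.org/p/owner#me", FOAF + "knows", "http://example.org/p/alice#me"),
    ("http://example.org/p/owner#me", FOAF + "knows", "http://example.org/p/bob#me"),
    ("http://example.org/p/alice#me", FOAF + "name", "Alice Example"),
    ("http://example.org/p/bob#me", FOAF + "name", "Bob Example"),
    ("http://example.org/p/carol#me", FOAF + "name", "Carol Example"),
}


def friend_names(triples: set, account: str) -> list:
    # ?person foaf:holdsAccount <account>
    persons = {s for s, p, o in triples
               if p == FOAF + "holdsAccount" and o == account}
    # ?person foaf:knows ?friend
    friends = {o for s, p, o in triples
               if p == FOAF + "knows" and s in persons}
    # ?friend foaf:name ?name
    return sorted(o for s, p, o in triples
                  if p == FOAF + "name" and s in friends)


print(friend_names(TRIPLES, ACCOUNT))
```

The query only works because the IRC persona has a URI that a personal profile can point to: the first pattern crosses from the IRC data into the person's own FOAF data, and the other two stay within it.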

Validation

• 291 chat personas on five channels

• 22,418 chat messages

• 51 chat personas have associated Web IDs claimed using mttlbot (2/3) or nickserv (1/3)

– 44 of those have a valid associated RDF document

• Scalable (projected 4 million triples in 10 years)

• SiocLog data being consumed by the “Towards linked sensor data for Hackystat” project

• SiocLog interfaces to FOAF Me for new profile creation

Future work

• Extend to instant messaging and private messaging

• Study of IRC communities where users and content are distributed across channels and networks

Acknowledgements

• We would like to thank Science Foundation Ireland for their support under grant SFI/08/CE/I1380 (Líon 2)

• Thanks also to Benja Fallenstein and Dan Brickley for their insights

Summary

• IRC conversations are quite disconnected from the Web and even from other IRC channels and networks

• Often there is valuable and needed information in an IRC chat that cannot be linked to people, topics or events, or in general referenced from elsewhere

• SIOC provides a framework for interlinking social media to other content and Linked Data, and IRC has been integrated as a part of that framework

• We also used mechanisms to link IRC contributions to the people who made them via Web ID and FOAF
