SemChat: Extracting Personal Information from Chat Conversations (EKAW 2010)

DESCRIPTION

This paper was presented at the 1st Workshop on Personal Semantic Data (PSD 2010: http://semanticweb.org/wiki/Personal_Semantic_Data), held at EKAW 2010 (http://ekaw2010.inesc-id.pt/), the Conference on Knowledge Engineering and Knowledge Management by the Masses, in Lisbon, Portugal, on 11 October 2010. The full paper can be found at: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-629/psd2010_paper2.pdf

By Keith Cortis & Charlie Abela

Instant Messaging (IM) - communication in real time where messages are transferred in a seemingly peer-to-peer manner

Increase in the fragmentation of personal information

Several tools developed to aid users in the management of their personal information space

Vision behind Semantic Desktop (SD) - tackling the difficulties when managing personal information

Research - towards this area & extraction of semantics from chat conversations

Improve PIM by linking the different content found on the desktop with the extracted semantics

Exploiting and extending NEPOMUK’s Social Semantic Desktop framework with a semantic chat client component, ‘SemChat’

Extraction and annotation of important concepts from a chat conversation

Storage of any concepts that were not annotated, for reference in future SemChat sessions

Semantic search for specific concepts (incl. events) in different ways, for example by date

Use of this plug-in from different chat clients is achievable by using a client that can handle multiple protocols

General Architecture

NEPOMUK – allows user to manage all data found on her desktop and to link the documents within the PIMO

Spark IM – XMPP chat client that satisfied our needs

Spark IM – enhanced with multiprotocol functionality via the availability of an XMPP server

End of chat session - non-intrusive system

Cost of interruptions: on average, users take 10-15 minutes to return their focus to the disrupted task

Context menus used to represent the operations that a user can perform on each extracted concept

JAPE rules implemented – to recognize possible events within a chat conversation using regular expressions in annotations

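Rule: EventRule
// LHS: match any gazetteer Lookup annotation whose majorType is event_trigger
(
  { Lookup.majorType == event_trigger }
):eventTrigger
-->
{
  // RHS (Java): fetch the annotations bound to the eventTrigger label
  AnnotationSet matchedAnns = (AnnotationSet) bindings.get("eventTrigger");
  // Record which rule produced the new annotation
  FeatureMap newFeatures = Factory.newFeatureMap();
  newFeatures.put("rule", "EventRule");
  // Add an EventTrigger annotation spanning the matched text
  outputAS.add(matchedAnns.firstNode(), matchedAnns.lastNode(),
               "EventTrigger", newFeatures);
}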
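
The rule above relies on a gazetteer having already marked trigger words as Lookup annotations. As a rough illustration of the same idea outside GATE, the plain-Java sketch below uses a regular expression to find trigger words and report their offsets. The trigger list and the class and method names are assumptions made for this example; the slides do not give the actual gazetteer entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Plain-Java stand-in for gazetteer lookup plus EventRule; no GATE classes are used. */
final class EventTriggerSketch {

    // Hypothetical trigger words; the real gazetteer list is not given in the slides.
    private static final Pattern TRIGGER = Pattern.compile(
            "\\b(meet|meeting|lunch|dinner|party|appointment)\\b",
            Pattern.CASE_INSENSITIVE);

    /** A matched span, loosely analogous to an EventTrigger annotation. */
    static final class Span {
        final int start;
        final int end;
        final String text;

        Span(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + ")";
        }
    }

    /** Returns every trigger word found in the message, in order of appearance. */
    static List<Span> findTriggers(String message) {
        List<Span> spans = new ArrayList<>();
        Matcher m = TRIGGER.matcher(message);
        while (m.find()) {
            spans.add(new Span(m.start(), m.end(), m.group()));
        }
        return spans;
    }

    public static void main(String[] args) {
        for (Span s : findTriggers("Shall we meet for lunch on Friday?")) {
            System.out.println(s);
        }
    }
}
```

In GATE the offsets would come from annotation nodes rather than from a regex matcher, so this only mirrors the matching step, not the annotation pipeline.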

Title and prospective date of the extracted event can be edited by the user

Annotated event will automatically be saved within Spark’s Task List

User can filter a search by several criteria, for example by date
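
To make the date criterion concrete, here is a minimal sketch of filtering stored events by a date range. It assumes a simple in-memory list; the ExtractedEvent class, its fields, and the sample data are hypothetical and do not reflect how SemChat or the PIMO actually stores events.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative date filter over extracted events; not SemChat's actual search code. */
final class EventSearchSketch {

    /** Hypothetical record of an event extracted from a chat conversation. */
    static final class ExtractedEvent {
        final String title;
        final LocalDate date;

        ExtractedEvent(String title, LocalDate date) {
            this.title = title;
            this.date = date;
        }
    }

    /** Keeps only the events whose date lies between from and to, both inclusive. */
    static List<ExtractedEvent> between(List<ExtractedEvent> events,
                                        LocalDate from, LocalDate to) {
        List<ExtractedEvent> hits = new ArrayList<>();
        for (ExtractedEvent e : events) {
            if (!e.date.isBefore(from) && !e.date.isAfter(to)) {
                hits.add(e);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample data invented for the example.
        List<ExtractedEvent> events = Arrays.asList(
                new ExtractedEvent("Lunch", LocalDate.of(2010, 10, 11)),
                new ExtractedEvent("Demo", LocalDate.of(2010, 10, 20)));
        for (ExtractedEvent e : between(events,
                LocalDate.of(2010, 10, 10), LocalDate.of(2010, 10, 12))) {
            System.out.println(e.title + " on " + e.date);
        }
    }
}
```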

No formal evaluation was performed on any of the semantic chat client projects that we considered in the related work section

A session was organized where 8 users tried out SemChat

6-12 participants are enough to test the usability of a system (Dumas and Redish)

Feature of extracting concepts from chat conversations – proved a popular choice

Semantic search feature proved to be less popular with several users

Majority of users experienced the extraction of concepts and/or events from their chat conversation

All extracted concepts/events annotated by users were successfully stored in the PIMO and Task List respectively

In some cases important concepts flagged within a conversation were not extracted

Problem – XtraK4Me selects the most important key phrases ordered by occurrence rate

Problem can be addressed by improving XtraK4Me or possibly by using a better key phrase extractor
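
The sketch below illustrates why ranking by occurrence rate can miss an important concept: a term mentioned only once competes with every other single-mention term and may fall outside the top results. This is not XtraK4Me's algorithm, only a simplified word-count ranking written for this example; the stop-word list and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified occurrence-based ranking; an illustration only, not XtraK4Me. */
final class FrequencyRankSketch {

    // Hypothetical, deliberately tiny stop-word list.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "the", "a", "is", "to", "in", "on", "and", "of", "we", "i", "it", "at", "from"));

    /** Returns the k most frequent non-stop-word terms; ties are broken alphabetically. */
    static List<String> topTerms(String text, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < k; i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        String chat = "football tonight? football at the stadium, football tickets from Maria";
        // "football" occurs three times; every other term occurs once.
        System.out.println(topTerms(chat, 2));
    }
}
```

With k set to 2, only one of the four single-mention terms survives, and which one survives depends on the tie-break rather than on importance to the user.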

Limitation – some events not extracted since they didn’t conform to the structure that SemChat was implemented to recognize

Possible solution – further extend ANNIE NER to recognize all possible types of events that can be present within a chat conversation

Context-aware chat program

Tries to solve semantic conflicts which occur between chatting users through the tagging of ambiguous chat messages

Solves part of this problem and is a step forward towards eliminating semantic conflicts which occur in chat sessions

Morphological analysis used to extract proper nouns from the dialogue text

Online images and articles from Wikipedia related to the extracted nouns are simultaneously displayed alongside the dialogue text

Helps in reducing the elements of ambiguity like searching

Identify and address problems that IM systems encounter when moving towards the Networked Semantic Desktop

Chat window offers a taxonomy panel where annotation of messages is permitted whilst a user is chatting

Semantic Querying - search for wanted messages by specifying a particular attribute

System uses existing email transport technology

Is integrated with NEPOMUK

Handles and keeps track of action items within email messages

Extracts tasks and appointments found within email messages which are then added to the email client’s scheduler

Prototype system

Automatically identifies action items (tasks) in email messages

Presents user with a task-focused summary of a message

User can add action items to their “to do” list

Integration of SemChat with popular applications such as an email client like Thunderbird

Extracted events would be logged automatically into the client’s event scheduler

Extend ANNIE NER through JAPE so that other entities could be extracted from conversations, such as emails, products, etc.

Semantic search feature – further optimize the searching process

Semantic search feature – further enhanced to display part of chat transcript satisfying the search criteria

Semantic annotations generated by SemChat – quantitatively evaluated in the future

Investigate slang language in IM in more depth so that SemChat can be adapted to handle it

Ex.: “mt b4 lunch @11.30am nxt tue”

We can further extend ANNIE NER with JAPE to be able to recognize such an event, reading ‘mt b4’ as ‘meet before’ and ‘nxt tue’ as ‘next Tuesday’
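
One possible way to approach this is a dictionary-based pre-processing step that expands slang tokens before the text reaches the NER stage. The sketch below is only that: an assumed pre-processing step in plain Java, not the ANNIE/JAPE extension proposed above. The four mappings come from the example on the slide; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Dictionary-based slang expansion; a hypothetical pre-processing step, not part of ANNIE. */
final class SlangNormalizerSketch {

    // Mappings taken from the slide's example; a real list would be far larger.
    private static final Map<String, String> SLANG = new HashMap<>();
    static {
        SLANG.put("mt", "meet");
        SLANG.put("b4", "before");
        SLANG.put("nxt", "next");
        SLANG.put("tue", "Tuesday");
    }

    /** Replaces each whitespace-separated token that has a known expansion. */
    static String normalize(String message) {
        StringBuilder out = new StringBuilder();
        for (String token : message.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            // Unknown tokens, such as "@11.30am", pass through unchanged.
            out.append(SLANG.getOrDefault(token.toLowerCase(), token));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: meet before lunch @11.30am next Tuesday
        System.out.println(normalize("mt b4 lunch @11.30am nxt tue"));
    }
}
```

A plain lookup like this cannot resolve abbreviations whose meaning depends on context, which is one reason a rule-based or learned approach may still be needed.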

We have presented a semantic chat component, SemChat, which was integrated with an SSD application – NEPOMUK

SemChat contributes further to the area of PIM through the integration of concepts in the user’s PIMO and the integration of events within an events scheduler

SemChat also reflects the research being done in the area of the SD in relation to Semantic Chat

Thank you for your attention!

Any Questions?