Miha Grčar (Department of Knowledge Technologies, Jožef Stefan Institute) & FIRST Consortium M12 scenario: Early prototype demo Luxembourg, Nov 2011


Page 1: M12  scenario: Early prototype  demo

Miha Grčar (Department of Knowledge Technologies, Jožef Stefan Institute)

& FIRST Consortium

M12 scenario: Early prototype demo

Luxembourg, Nov 2011

Page 2: M12  scenario: Early prototype  demo

Outline

We will show two integrated prototypes:
- Twitter sentiment analysis prototype
- Sentiment extraction prototype

The aim is to:
- Give a better idea of the overall FIRST process
- Give a hint of the final “product” from the technological perspective
- Demonstrate the collaboration between partners (integration efforts)

Luxembourg, Nov 2011, FIRST Y1 Review Meeting

Page 3: M12  scenario: Early prototype  demo

Twitter Sentiment Demo

[Figure: FIRST architecture diagram. Labels: Management (WP10); Architecture, Integration & Scaling Strategy (WP2 & WP7); Dissemination & Exploitation (WP9); WP3, WP4, WP6; Data Acquisition; Ontology Infrastructure; Information Extraction; Sentiment Analysis; Decision Support Infrastructure; Domain-independent GUI (Open Source); Information Integration; Data, Information & Knowledge Base (WP5); WP1 & WP8; UC#1 Market Surveillance; UC#2 Reputational Risk Management; UC#3 Online Retail Brokerage]

[Figure overlay for this demo. Architecture blocks: Sentiment Analysis; Decision Support Infrastructure; Domain-independent GUI (Open Source); Information Integration. Demo components: Acquisition of Tweets; Sentiment Classification; Active Learning & Visualization; Database of Tweets; Web-based Infrastructure]

Page 4: M12  scenario: Early prototype  demo

Basic Concepts

Sentiment classification

Active learning


Page 5: M12  scenario: Early prototype  demo

Sentiment Classification

Labeled examples

Build model (train classifier)

Classify unlabeled examples

Labeled examples:

POS Financial markets are now officially open :)
POS market intelligence GMI Interactive and Mintel Win ARF Great Minds Award for Quality in Research
POS $AAPL : trust me -- AAPL will soar tomorrow
NEG Oh how I miss the days with GBP was at least 2 times the AUD. Sterling forecast to hit all-time lows soon
NEG omg! did you know BORDERS closed?! they went bankrupt last month and closed!! awww, too bad! i love borders!!
NEG @aekins that's just too bad

[Figure: Labeled Examples -> Training Algorithm -> Classification Model; Unlabeled Examples -> Classification Algorithm (with Classification Model) -> Predictions (Labels)]

Unlabeled example: So Nickelodeon filed for bankruptcy and announced that the next Kids Choice Awards will be it's last.
Prediction: NEG
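The three steps on this slide (labeled examples, build model, classify unlabeled examples) can be sketched in a few lines. The slides do not name the training algorithm used in the prototype, so the sketch below swaps in a multinomial Naive Bayes classifier with add-one smoothing purely for illustration. The training set is a subset of the labeled tweets above; the names `tokenize`, `train`, and `classify` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# A subset of the labeled examples shown on the slide.
TRAIN = [
    ("POS", "Financial markets are now officially open :)"),
    ("POS", "$AAPL : trust me -- AAPL will soar tomorrow"),
    ("NEG", "omg! did you know BORDERS closed?! they went bankrupt last month "
            "and closed!! awww, too bad! i love borders!!"),
    ("NEG", "@aekins that's just too bad"),
]

# The unlabeled example from the slide.
UNLABELED = ("So Nickelodeon filed for bankruptcy and announced that the next "
             "Kids Choice Awards will be it's last.")


def tokenize(text):
    """Lowercase the text and keep word-like tokens (including $ and @ prefixes)."""
    return re.findall(r"[a-z$@']+", text.lower())


def train(examples):
    """Build the model: per-label word counts, per-label document counts, vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for label, text in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest smoothed log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # tokens never seen in training are ignored
                count = word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
prediction = classify(model, UNLABELED)
print(prediction)
```

With only four training tweets the model is far too small to be reliable; the point is the shape of the train-then-classify flow, not the accuracy.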

Page 6: M12  scenario: Early prototype  demo

Active Learning

Labeling examples manually is expensive

Active learning reduces this cost

Experts provide labels only for a (small) subset of examples

Examples in this subset are carefully chosen to produce a classifier that is “as accurate as possible”
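One common way to choose which examples to label is uncertainty sampling: ask the expert about the examples the current classifier is least sure of. The slides do not say which selection strategy the prototype uses, so this is an assumption. In the sketch, `scores` is a hypothetical mapping from tweet id to the classifier's signed score (positive values lean POS, negative values lean NEG), and `select_queries` is a hypothetical helper.

```python
def select_queries(scores, k):
    """Return the ids of the k unlabeled examples whose score is closest to zero.

    A score near zero means the example lies close to the decision boundary,
    so a label for it is expected to be the most informative.
    """
    ranked = sorted(scores, key=lambda doc_id: abs(scores[doc_id]))
    return ranked[:k]


# Hypothetical classifier scores for five unlabeled tweets.
pool_scores = {"t1": 2.3, "t2": -0.1, "t3": 0.4, "t4": -1.7, "t5": 0.05}

# The two tweets nearest the boundary are sent to the expert for labeling.
queries = select_queries(pool_scores, 2)
print(queries)
```

After the expert labels the selected tweets, they are added to the labeled set, the classifier is retrained, and the selection step repeats.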


Page 7: M12  scenario: Early prototype  demo

Document space (tweets)

Active Learning


Page 8: M12  scenario: Early prototype  demo

Document space (reality)

Active Learning

[Figure labels: Negative sentiment; Positive sentiment; Optimal hyperplane]

Page 9: M12  scenario: Early prototype  demo

Document space (initial guess)

Active Learning


Page 10: M12  scenario: Early prototype  demo

Document space (refinement)

Active Learning


Page 11: M12  scenario: Early prototype  demo

Document space (refinement)

Active Learning


Page 12: M12  scenario: Early prototype  demo

Document space (almost there…)

Active Learning


Page 13: M12  scenario: Early prototype  demo

Active Learning Workflow

[Figure: active learning workflow.
Model building: Twitter API; Acquired Tweets; Language Detector; Near-Duplicate Remover; Part of Speech Tagger; Active Learning; Labeled Dataset; Training Algorithm; Classification Model.
Query answering: User Query; Twitter API; Tweets; Preprocessing; Classifier (with Classification Model); Twitter Sentiment Results; Clients]

Demo video (3:00)
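The workflow includes a near-duplicate remover, which matters on Twitter because retweets and reposts would otherwise flood the labeling pool with copies. The slides do not describe how the prototype detects near-duplicates; the sketch below assumes Jaccard similarity over word sets with a hypothetical threshold of 0.8. All function names are illustrative.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def remove_near_duplicates(tweets, threshold=0.8):
    """Keep a tweet only if it is below the threshold against every kept tweet."""
    kept = []  # (text, token set) pairs, in arrival order
    for text in tweets:
        toks = tokens(text)
        if all(jaccard(toks, seen) < threshold for _, seen in kept):
            kept.append((text, toks))
    return [text for text, _ in kept]


stream = [
    "trust me AAPL will soar tomorrow",
    "RT trust me AAPL will soar tomorrow",  # retweet: 6 of 7 tokens shared
    "Sterling forecast to hit all-time lows soon",
]
unique = remove_near_duplicates(stream)
print(unique)
```

Comparing every new tweet against every kept tweet is quadratic, so a real stream would need an index or hashing scheme; the pairwise loop is only meant to make the idea concrete.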

Page 14: M12  scenario: Early prototype  demo

Early Integrated Prototype

[Figure: FIRST architecture diagram. Labels: Management (WP10); Architecture, Integration & Scaling Strategy (WP2 & WP7); Dissemination & Exploitation (WP9); WP3, WP4, WP6; Data Acquisition; Ontology Infrastructure; Information Extraction; Sentiment Analysis; Decision Support Infrastructure; Information Integration; Data, Information & Knowledge Base (WP5); WP1 & WP8; UC#1 Market Surveillance; UC#2 Reputational Risk Management; UC#3 Online Retail Brokerage]

[Figure overlay for this prototype. Architecture blocks: Sentiment Analysis; Information Integration; Information Extraction; Ontology Infrastructure; Decision Support Infrastructure; Domain-independent GUI (Open Source). Prototype components: Historical Data; Stream Simulator; ZeroMQ Channel; Visualization; Web-based Infrastructure]

Page 15: M12  scenario: Early prototype  demo

Early Integrated Prototype

[Figure: prototype pipeline. Historical Data (Documents) -> Data Stream Simulator -> ZeroMQ Channel -> Sentiment Extractor -> ZeroMQ Channel -> Web Server -> HTTP Push -> Client (Browser) x3.
Data exchanged: Documents (XML); Sentiment Index (Numbers).
Implementation languages shown: Java, Java, C#, JavaScript.
Work packages shown: WP3, WP4, WP2/7, WP5, WP2/7, WP6]

Demo video (1:00)
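The pipeline on this slide passes XML documents from a stream simulator to a sentiment extractor, which emits numbers for the Web tier. The sketch below mimics that shape using only the Python standard library: `queue.Queue` stands in for the ZeroMQ channels, and the final loop stands in for the Web server. The XML layout (`<document><text>`), the word lists, and all function names are assumptions made for illustration, not the prototype's actual format or method.

```python
import queue
import threading
import xml.etree.ElementTree as ET

STOP = None  # sentinel marking the end of the simulated stream

# Hypothetical sentiment word lists, for illustration only.
POSITIVE = {"soar", "win", "open"}
NEGATIVE = {"bankrupt", "closed", "lows"}


def stream_simulator(documents, out_channel):
    """Replay historical documents onto the channel, one at a time."""
    for doc in documents:
        out_channel.put(doc)
    out_channel.put(STOP)


def sentiment_extractor(in_channel, out_channel):
    """Turn each XML document into a number: positive minus negative word count."""
    while True:
        doc = in_channel.get()
        if doc is STOP:
            out_channel.put(STOP)
            break
        text = ET.fromstring(doc).findtext("text", default="")
        words = text.lower().split()
        index = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        out_channel.put(index)


def run_pipeline(documents):
    """Wire the stages together and collect the sentiment index values in order."""
    docs_channel = queue.Queue()   # stand-in for the first ZeroMQ channel
    index_channel = queue.Queue()  # stand-in for the second ZeroMQ channel
    workers = [
        threading.Thread(target=stream_simulator, args=(documents, docs_channel)),
        threading.Thread(target=sentiment_extractor, args=(docs_channel, index_channel)),
    ]
    for worker in workers:
        worker.start()
    results = []
    while True:
        item = index_channel.get()
        if item is STOP:
            break
        results.append(item)  # a Web server would push this value to browsers
    for worker in workers:
        worker.join()
    return results


historical = [
    "<document><text>AAPL will soar tomorrow</text></document>",
    "<document><text>they went bankrupt and closed</text></document>",
]
sentiment_index = run_pipeline(historical)
print(sentiment_index)
```

Because each stage only sees a channel, a stage can be replaced or moved to another process or language without touching its neighbors, which is the property the slide's mixed Java, C#, and JavaScript components rely on.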

Page 16: M12  scenario: Early prototype  demo

Concluding Remarks

Effortless integration of data acquisition, sentiment extraction, and Web-based interface

First “signs of usefulness” for the financial domain

Relationship to UC#2 (reputation) and UC#3 (retail brokerage)
