Miha Grčar (Department of Knowledge Technologies, Jožef Stefan Institute)
& FIRST Consortium
M12 scenario: Early prototype demo
Luxembourg, Nov 2011
Outline
We will show two integrated prototypes:
  - Twitter sentiment analysis prototype
  - Sentiment extraction prototype
The aim is to:
  - Give a better idea of the overall FIRST process
  - Give a hint of the final “product” from the technological perspective
  - Demonstrate the collaboration between partners (integration efforts)
FIRST Y1 Review Meeting, Luxembourg, Nov 2011
Twitter Sentiment Demo
[Figure: FIRST work-package architecture diagram. Labels shown:]
  - Management (WP10)
  - Architecture, Integration & Scaling Strategy (WP2 & WP7)
  - Dissemination & Exploitation (WP9)
  - WP3, WP4, WP6: Data Acquisition, Ontology Infrastructure, Information Extraction, Sentiment Analysis, Decision Support Infrastructure, Domain-independent GUI (Open Source)
  - WP5: Information Integration; Data, Information & Knowledge Base
  - WP1 & WP8: UC#1 Market Surveillance, UC#2 Reputational Risk Management, UC#3 Online Retail Brokerage
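[Figure: architecture components involved in the Twitter sentiment demo. Labels shown:]
  - Architecture blocks: Sentiment Analysis, Decision Support Infrastructure, Domain-independent GUI (Open Source), Information Integration
  - Demo components: Acquisition of Tweets, Sentiment Classification, Active Learning & Visualization, Database of Tweets, Web-based Infrastructure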
Basic Concepts
Sentiment classification
Active learning
Sentiment Classification
Labeled examples
Build model (train classifier)
Classify unlabeled examples
Labeled examples:
  POS  Financial markets are now officially open :)
  POS  market intelligence GMI Interactive and Mintel Win ARF Great Minds Award for Quality in Research
  POS  $AAPL : trust me -- AAPL will soar tomorrow
  NEG  Oh how I miss the days with GBP was at least 2 times the AUD. Sterling forecast to hit all-time lows soon
  NEG  omg! did you know BORDERS closed?! they went bankrupt last month and closed!! awww, too bad! i love borders!!
  NEG  @aekins that's just too bad
[Figure: classification workflow]
  Training: Labeled Examples -> Training Algorithm -> Classification Model
  Classification: Unlabeled Examples -> Classification Algorithm (using the Classification Model) -> Predictions (Labels)
  Example prediction: "So Nickelodeon filed for bankruptcy and announced that the next Kids Choice Awards will be it's last." -> NEG
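The three steps on this slide (labeled examples, build model, classify unlabeled examples) can be sketched in a few lines. The slides do not say which training algorithm the prototype uses, so this sketch assumes a multinomial Naive Bayes classifier with add-one smoothing; the tokenizer and the function names `tokenize`, `train`, and `classify` are illustrative, not part of the FIRST software. The training data are the six labeled tweets shown on the slide.

```python
import math
import re
from collections import Counter

# Labeled examples copied from the slide.
LABELED = [
    ("POS", "Financial markets are now officially open :)"),
    ("POS", "market intelligence GMI Interactive and Mintel Win ARF "
            "Great Minds Award for Quality in Research"),
    ("POS", "$AAPL : trust me -- AAPL will soar tomorrow"),
    ("NEG", "Oh how I miss the days with GBP was at least 2 times the AUD. "
            "Sterling forecast to hit all-time lows soon"),
    ("NEG", "omg! did you know BORDERS closed?! they went bankrupt last month "
            "and closed!! awww, too bad! i love borders!!"),
    ("NEG", "@aekins that's just too bad"),
]


def tokenize(text):
    """Lowercase and split into word-like tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9$@#']+", text.lower())


def train(examples):
    """Build the model: per-class word counts, class counts, vocabulary."""
    word_counts = {}
    class_counts = Counter()
    for label, text in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = set().union(*word_counts.values())
    return word_counts, class_counts, vocabulary


def classify(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, class_counts, vocabulary = model
    total_examples = sum(class_counts.values())
    # Tokens never seen in training carry no evidence, so they are skipped.
    tokens = [t for t in tokenize(text) if t in vocabulary]
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        denominator = sum(counts.values()) + len(vocabulary)
        score = math.log(class_counts[label] / total_examples)
        for token in tokens:
            score += math.log((counts[token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(LABELED)
print(classify(model, "that's just too bad"))
```

Six examples are far too few for a reliable model; the sketch only shows the shape of the train-then-classify workflow.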
Active Learning
Labeling examples manually is expensive
Active learning reduces this cost
Experts provide labels only for a (small) subset of examples
Examples in this subset are carefully chosen to produce a classifier that is “as accurate as possible”
Active Learning
[Figure sequence: the document space of tweets, shown in six steps]
  1. Document space (tweets)
  2. Document space (reality): regions of negative and positive sentiment, separated by the optimal hyperplane
  3. Document space (initial guess)
  4. Document space (refinement)
  5. Document space (refinement)
  6. Document space (almost there…)
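One common way to choose the "carefully chosen" subset is to ask the expert about the examples the current model is least sure of, that is, the ones closest to the current hyperplane. The slides do not name the selection strategy used in the prototype, so the sketch below assumes this margin-based uncertainty sampling with a linear model; `distance_to_hyperplane` and `select_queries` are hypothetical names, and the feature vectors are toy two-dimensional points.

```python
def distance_to_hyperplane(weights, bias, point):
    """Unnormalized distance of a point from the hyperplane w.x + b = 0."""
    return abs(sum(w * x for w, x in zip(weights, point)) + bias)


def select_queries(weights, bias, unlabeled, k):
    """Pick the k unlabeled points the current model is least certain about."""
    ranked = sorted(
        unlabeled,
        key=lambda point: distance_to_hyperplane(weights, bias, point),
    )
    return ranked[:k]


# Current guess of the hyperplane: x1 - x2 = 0 (assumed for illustration).
weights, bias = (1, -1), 0
unlabeled = [(5, 0), (1, 1), (0, 4), (2, 1)]
print(select_queries(weights, bias, unlabeled, k=2))
```

In a full loop, the expert labels the selected points, the classifier is retrained on the enlarged labeled set, and the selection is repeated, which corresponds to the refinement steps in the illustration.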
Active Learning Workflow
[Figure: workflow diagram. Components shown:]
  - Acquisition: Twitter API, Language Detector, Near-Duplicate Remover, Part of Speech Tagger, Acquired Tweets
  - Training: Tweets, Labeled Dataset, Active Learning, Training Algorithm, Classification Model
  - Querying: Client (three shown), User Query, Twitter API, Preprocessing, Classifier, Classification Model, Twitter Sentiment Results
Demo video (3:00)
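To make one of the preprocessing stages concrete, here is a minimal sketch of a near-duplicate remover. The slides name the component but not its method, so the approach is an assumption: tweets are normalized (URLs, mentions, and the "RT" marker dropped) and a tweet is discarded when the Jaccard similarity of its token set with an already kept tweet reaches a threshold. The names `normalize`, `jaccard`, and `remove_near_duplicates` and the threshold of 0.8 are illustrative.

```python
import re


def normalize(tweet):
    """Reduce a tweet to a set of content tokens (illustrative rules)."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    tokens = re.findall(r"[a-z0-9$#']+", text)
    return frozenset(t for t in tokens if t != "rt")  # drop retweet marker


def jaccard(a, b):
    """Jaccard similarity of two token sets; two empty sets count as equal."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def remove_near_duplicates(tweets, threshold=0.8):
    """Keep a tweet only if it is not too similar to any tweet kept so far."""
    kept, kept_token_sets = [], []
    for tweet in tweets:
        tokens = normalize(tweet)
        if all(jaccard(tokens, seen) < threshold for seen in kept_token_sets):
            kept.append(tweet)
            kept_token_sets.append(tokens)
    return kept


tweets = [
    "AAPL will soar tomorrow",
    "RT @bob: AAPL will soar tomorrow!",
    "Borders closed last month",
]
print(remove_near_duplicates(tweets))
```

The pairwise comparison is quadratic in the number of kept tweets, which is acceptable for a sketch but would need hashing or indexing on a live stream.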
Early Integrated Prototype
[Figure: FIRST work-package architecture diagram. Labels shown:]
  - Management (WP10)
  - Architecture, Integration & Scaling Strategy (WP2 & WP7)
  - Dissemination & Exploitation (WP9)
  - WP3, WP4, WP6: Data Acquisition, Ontology Infrastructure, Information Extraction, Sentiment Analysis, Decision Support Infrastructure
  - WP5: Information Integration; Data, Information & Knowledge Base
  - WP1 & WP8: UC#1 Market Surveillance, UC#2 Reputational Risk Management, UC#3 Online Retail Brokerage
[Figure: architecture components involved in the early integrated prototype. Labels shown:]
  - Architecture blocks: Ontology Infrastructure, Information Extraction, Sentiment Analysis, Information Integration, Decision Support Infrastructure, Domain-independent GUI (Open Source)
  - Prototype components: Historical Data, Stream Simulator, ZeroMQ Channel, Visualization, Web-based Infrastructure
Early Integrated Prototype
[Figure: prototype pipeline. Components shown, in order:]
  Historical Data (Documents) -> Data Stream Simulator -> ZeroMQ Channel, carrying Documents (XML) -> Sentiment Extractor -> ZeroMQ Channel, carrying Sentiment Index (Numbers) -> Web Server -> HTTP Push -> Client (Browser), three shown
  Implementation languages shown: Java, Java, C#, JavaScript
  Work packages shown: WP3, WP4, WP2/7, WP5, WP2/7, WP6
Demo video (1:00)
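The staged design in this diagram can be sketched as producers and consumers connected by channels. This is not the prototype's code: the prototype uses ZeroMQ channels between Java and C# components, whereas the sketch below stands in Python's `queue.Queue` for each channel so that it stays self-contained. The word lists and the scoring rule (positive words minus negative words) are hypothetical placeholders for the real sentiment extractor.

```python
import queue
import threading

# Hypothetical word lists; the real extractor is not described in the slides.
POSITIVE = {"soar", "win", "gain"}
NEGATIVE = {"bankrupt", "loss", "fall"}
STOP = None  # sentinel marking the end of the stream


def stream_simulator(documents, out_channel):
    """Replay historical documents into the channel, one at a time."""
    for document in documents:
        out_channel.put(document)
    out_channel.put(STOP)


def sentiment_extractor(in_channel, out_channel):
    """Turn each incoming document into a number (toy sentiment index)."""
    for document in iter(in_channel.get, STOP):
        words = document.lower().split()
        index = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        out_channel.put(index)
    out_channel.put(STOP)


def run_pipeline(documents):
    """Wire the stages together and collect what a web server would push."""
    documents_channel = queue.Queue()  # stands in for the first ZeroMQ channel
    index_channel = queue.Queue()      # stands in for the second ZeroMQ channel
    workers = [
        threading.Thread(target=stream_simulator, args=(documents, documents_channel)),
        threading.Thread(target=sentiment_extractor, args=(documents_channel, index_channel)),
    ]
    for worker in workers:
        worker.start()
    results = list(iter(index_channel.get, STOP))
    for worker in workers:
        worker.join()
    return results


print(run_pipeline(["shares soar after win", "retailer went bankrupt", "markets open"]))
```

Because each stage only reads from and writes to a channel, stages can be written in different languages and run on different machines, which is the property the diagram highlights.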
Concluding Remarks
Effortless integration of data acquisition, sentiment extraction, and Web-based interface
First “signs of usefulness” for the financial domain
Relationship to UC#2 (reputation) and UC#3 (retail brokerage)