Text Mining for Online Reputation Monitoring
SSIM 2016
Pedro Saleiro
POPSTAR
Media Coverage
Reputation
[Van Riel et al., 2007] define reputation as “overall assessments of organizations by their stakeholders”.
Social Media & Online News
Online Reputation Monitoring
Tracking what is said about a given entity on Social Media
Early ORM systems focused on counting entity mentions on Social Media
Implies collecting, cleaning, filtering, mining, exploring and analysing large streams of unstructured text data
Current systems focus on NER, NED, Polarity Classification and Visualisation (Social Media Analytics)
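To make the "counting entity mentions" baseline concrete, here is a minimal sketch in Python. The entity name `acme`, its aliases, and the function `count_mentions` are hypothetical and chosen only for illustration; the slides do not describe how early systems matched mentions.

```python
import re
from collections import Counter


def count_mentions(posts, aliases):
    """Count surface-form mentions of each entity in a list of posts.

    `aliases` maps a (hypothetical) entity id to the names it may appear under.
    """
    patterns = {
        entity: re.compile(
            r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
            re.IGNORECASE,
        )
        for entity, names in aliases.items()
    }
    counts = Counter()
    for post in posts:
        for entity, pattern in patterns.items():
            counts[entity] += len(pattern.findall(post))
    return counts


# Illustrative input only, not data from the talk.
posts = ["Acme launches a new phone", "I love ACME and Acme Corp", "nothing here"]
counts = count_mentions(posts, {"acme": ["Acme Corp", "Acme"]})
```

Plain string matching like this cannot tell whether a mention really refers to the target entity, which is one reason current systems add NER and NED on top of counting.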
Online Reputation Monitoring
[Diagram labels: Social; Entity ∑ +/-]
Vision
[Diagram labels: News; Social; Entity ∑ +/-; Pred.; Rel.; Retr.]
Framework for ORM
[Diagram components: Online News; Social Media; News Crawling; Soc. Media Crawling; Data Aggregation; Mentions Extraction; NER; NED; Relations Extraction; Polarity Classification; Active Learning; Entities Warehouse; Entity-Rel. Indexing; Entity-Relation Indexes; Entity-Rel. Ranking; Prediction Tasks; Services & Applications]
Applications
Reputation Management
Fraud Detection
Computational Journalism
Political Science
Social Media Marketing
Preliminary Work
Extraction
NED approach for Twitter (binary classification)
Context similarity with Wikipedia, Freebase
Dataset: 140K tweets (96K test set)
1st place at RepLab 2013 (0.90 accuracy)
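The idea of treating NED as a binary decision driven by context similarity can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which features or classifier were used, so the bag-of-words cosine similarity, the fixed `threshold`, and the function names here are hypothetical stand-ins. The entity context string stands in for text that would come from a knowledge base such as Wikipedia or Freebase.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def is_related(tweet, entity_context, threshold=0.2):
    """Binary decision: does the tweet refer to the target entity?

    The threshold is an assumption; a trained classifier would replace it.
    """
    return cosine_similarity(tweet, entity_context) >= threshold


# Illustrative inputs only, not taken from the RepLab data.
context = ("Apple Inc. is a technology company that designs "
           "the iPhone, iPad and Mac computers")
related = is_related("The new iPhone from Apple is great", context)
unrelated = is_related("I ate an apple pie for lunch", context)
```

In practice the similarity score would be one feature among several fed to a supervised classifier rather than compared against a hand-picked threshold.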
Planned Work
Extraction
Active Learning for NED and Sentiment Analysis
Preliminary Work
Retrieval
Entity-Relation Retrieval using MdT dataset (11M news articles)
Mentions and Relations Extraction using ClueWeb 09 (50M web pages)
Planned Work
Retrieval
Scalable indexing approach for entities and relations
Learning to rank/data fusion approach
Strategy for automatic collection of relevance judgments (Wikipedia Tables)
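As one possible reading of the "data fusion" item above, the sketch below combines several ranked lists with reciprocal rank fusion (RRF). RRF is swapped in here purely as a well-known fusion baseline; the slides do not name the method planned, and the function name and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by item name.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Three hypothetical rankers ordering the same candidate entities.
fused = reciprocal_rank_fusion([
    ["x", "y", "z"],
    ["y", "z", "x"],
    ["y", "x", "z"],
])
```

A learning-to-rank approach would instead learn how to weight each ranker's evidence from relevance judgments, which is where automatically collected judgments would be used.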
Preliminary Work
Prediction
Classification approach for Twitter entity popularity prediction based on news
Regression approach for Portuguese polls prediction based on Twitter sentiment aggregate functions
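The polls regression idea can be sketched as two steps: aggregate per-period sentiment counts into features, then regress poll results on a feature. The specific aggregate functions below (`net_share`, `subjectivity`, `log_ratio`), the single-feature least-squares fit, and all numbers are assumptions made for illustration; the slides do not list the aggregates or the regression model actually used.

```python
import math


def sentiment_aggregates(pos, neg, neu):
    """Turn per-period mention counts into candidate regression features."""
    total = pos + neg + neu
    return {
        "net_share": (pos - neg) / total if total else 0.0,
        "subjectivity": (pos + neg) / total if total else 0.0,
        # Add-one smoothing keeps the ratio defined when a count is zero.
        "log_ratio": math.log((pos + 1) / (neg + 1)),
    }


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy (positive, negative, neutral) counts and toy poll shares, not real data.
periods = [(30, 10, 60), (20, 20, 60), (10, 30, 60)]
polls = [36.0, 30.0, 24.0]
features = [sentiment_aggregates(*p)["net_share"] for p in periods]
slope, intercept = fit_line(features, polls)
predicted = slope * sentiment_aggregates(25, 15, 60)["net_share"] + intercept
```

With real data, several aggregates would typically be used together and the model evaluated on held-out time periods.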
Planned Work
Prediction
Extend number of entities for popularity prediction
Deal with bursts
Different learning approach/experimental setting
Use different entity-centric knowledge for polls prediction
Evaluate predictive approaches at different periods in time