This presentation compares four tools for analysing the sentiment of free-text survey responses about a healthcare information website. It was completed by Despo Georgiou as part of her internship at UXLabs (http://uxlabs.co.uk).
Sentiment Analysis in Healthcare: A case study using survey responses
Focus on Healthcare
1) Difficult field: biomedical text
2) Potential improvements
Relevant research:
- NLP procedure: FHF prediction (Roy et al., 2013)
- TPA: Who is Sick, Google Flu Trends (Maged et al., 2010)
- BioTeKS: analysing biomedical text (Mack et al., 2004)
Sentiment Analysis
- Opinions, thoughts, feelings
- Used to extract information from raw data
Sentiment Analysis Examples
- Surveys: analyse open-ended questions
- Businesses & governments: assist in the decision-making process & monitor negative communication
- Consumer feedback: analyse reviews
- Health: analyse biomedical text
Aims & Objectives
- Can existing sentiment analysis tools respond to the needs of any healthcare-related matter?
- Is it possible to accurately replicate human language using machines?
The Case Study
- 8 survey questions (open- and closed-ended)
- Analysed 137 responses to the question: "What is your feedback?"
- Commercial tools: Semantria & TheySay
- Non-commercial tools: Google Prediction API & WEKA
Introducing a Baseline
Example response: "CG 102 not available"
- Polarity: the wording ("not available") suggests Negative
- Classification: but this is a factual statement, so is it really positive or negative? Final label: Neutral
Question scores:
Q.1  Q.2  Q.3  Q.6  Q.8  Avg.
3    5    4    5    5    4.4
Google Prediction API
1) Pre-process the data: remove punctuation and capitals, account for negation (see the sketch after this list)
2) Separate into training and testing sets
3) Insert pre-labelled data
4) Train the model
5) Test the model
6) Cross-validation: 4-fold
7) Compare with the baseline
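The slides do not show step 1 in any detail; the following Java sketch illustrates one plausible pre-processing pass (lower-casing, punctuation stripping, and a simple negation marker). The method name `clean` and the exact rules are hypothetical, not the ones used in the study.

```java
import java.util.Locale;

public class Preprocess {
    // A minimal pre-processing sketch: lower-case, strip punctuation, and
    // fold a negation word into the following token so the classifier sees
    // "not_helpful" rather than separate "not" and "helpful" features.
    static String clean(String response) {
        String text = response.toLowerCase(Locale.ROOT)
                              .replaceAll("\\bcan't\\b", "can not")
                              .replaceAll("n't\\b", " not")   // wasn't -> was not
                              .replaceAll("[^a-z\\s]", " ");  // drop punctuation
        StringBuilder out = new StringBuilder();
        boolean negate = false;
        for (String token : text.trim().split("\\s+")) {
            if (token.equals("not") || token.equals("no") || token.equals("never")) {
                negate = true;            // mark the next token instead of emitting
                continue;
            }
            out.append(negate ? "not_" + token : token).append(' ');
            negate = false;
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("The site wasn't helpful."));
        // -> the site was not_helpful
    }
}
```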
Google Prediction API Results
Classification results (137 responses): Neutral 5, Negative 122, Positive 10
WEKA
1) Separate into training and testing sets
2) Choose a graphical user interface: the Explorer
3) Insert pre-labelled data
4) Pre-process the data: remove punctuation, capitals & stopwords; tokenise alphabetically
WEKA
5) Consider resampling: whether a balanced dataset is preferred
6) Choose a classifier: Naïve Bayes
7) Classify using cross-validation: 4-fold
A scripted version of these steps is sketched below.
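The same workflow can be driven through the WEKA Java API rather than the Explorer. A minimal sketch, assuming WEKA 3.8 and a pre-labelled ARFF file (the file name responses.arff is hypothetical):

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.core.stopwords.Rainbow;
import weka.core.tokenizers.AlphabeticTokenizer;
import weka.filters.Filter;
import weka.filters.supervised.instance.Resample;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class SurveySentiment {
    public static void main(String[] args) throws Exception {
        // Load the pre-labelled responses; the sentiment label is assumed
        // to be the last attribute.
        Instances data = DataSource.read("responses.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Step 4: lower-case, drop stopwords, tokenise alphabetically.
        StringToWordVector s2wv = new StringToWordVector();
        s2wv.setLowerCaseTokens(true);
        s2wv.setStopwordsHandler(new Rainbow());     // built-in stopword list
        s2wv.setTokenizer(new AlphabeticTokenizer());
        s2wv.setInputFormat(data);
        Instances vectors = Filter.useFilter(data, s2wv);

        // Step 5: optional resampling towards a balanced class distribution.
        Resample resample = new Resample();
        resample.setBiasToUniformClass(1.0);
        resample.setInputFormat(vectors);
        vectors = Filter.useFilter(vectors, resample);

        // Steps 6-7: Naïve Bayes evaluated with 4-fold cross-validation.
        Evaluation eval = new Evaluation(vectors);
        eval.crossValidateModel(new NaiveBayes(), vectors, 4, new Random(1));
        System.out.println(eval.toSummaryString());       // accuracy, kappa
        System.out.println(eval.toClassDetailsString());  // per-class P/R/F
    }
}
```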
WEKA Results
- Resampling: 10% increase in precision, 6% increase in accuracy
- Overall, 82% of responses correctly classified
The Tools' Outputs
- Semantria: a score ranging between -2 and 2
- TheySay: three percentages, for negative, positive & neutral
- Google Prediction API: three values, for negative, positive & neutral
- WEKA: percentage of correctly classified responses
To compare the tools on an equal footing, each raw output must be collapsed into a single class label; a sketch of such a mapping follows.
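The slides do not give the mapping actually used, so this Java sketch shows one plausible scheme; the ±0.2 neutral band is an assumption, not a figure from the study.

```java
enum Label { NEGATIVE, NEUTRAL, POSITIVE }

final class LabelMapper {
    // Semantria-style score in [-2, 2]: values near zero count as neutral.
    // The +/-0.2 band is an assumed threshold.
    static Label fromScore(double score) {
        if (score < -0.2) return Label.NEGATIVE;
        if (score >  0.2) return Label.POSITIVE;
        return Label.NEUTRAL;
    }

    // TheySay / Google Prediction API style: three class weights; pick the
    // largest, breaking ties with Neutral conservatively towards Neutral.
    static Label fromTriple(double neg, double neu, double pos) {
        if (neg > neu && neg >= pos) return Label.NEGATIVE;
        if (pos > neu && pos > neg)  return Label.POSITIVE;
        return Label.NEUTRAL;
    }
}
```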
Evaluation
Tool                       Accuracy
Commercial tools
  Semantria                51.09%
  TheySay                  68.61%
Non-commercial tools
  Google Prediction API    72.25%
  WEKA                     82.35%
Evaluation
Tool                       Kappa statistic    F-measure
Semantria                  0.2692             0.550
TheySay                    0.3886             0.678
Google Prediction API      0.2199             0.628
WEKA                       0.5735             0.809
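The two measures in this table have their standard definitions (not restated in the slides): Cohen's kappa corrects the observed agreement with the manual labels for agreement expected by chance, and the F-measure is the harmonic mean of precision and recall.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\qquad
F = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
```

Here p_o is the observed agreement and p_e the chance agreement; on the usual reading of kappa, WEKA's 0.5735 is moderate agreement beyond chance, while Semantria's 0.2692 and the Google Prediction API's 0.2199 are only fair.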
Evaluation
[Chart: "Comparison of Precision", precision value (0 to 1) per class (Negative, Neutral, Positive) for Semantria, TheySay, Google API and WEKA]
Evaluation
[Chart: "Comparison of Recall", recall value (0 to 1) per class (Negative, Neutral, Positive) for Semantria, TheySay, Google API and WEKA]
Evaluation: Single-sentence responses
Accuracy based on correct classification:
Tool                       All responses    Single-sentence responses
Commercial tools
  Semantria                51.09%           53.49%
  TheySay                  68.61%           72.09%
Non-commercial tools
  Google Prediction API    72.25%           54%
  WEKA                     82.35%           70%
Conclusions
- Semantria: business use
- TheySay: prepared for competition & academic research
- Google Prediction API: classification
- WEKA: extraction & classification in healthcare
Conclusions
- Commercial tools: easy to use and provide results quickly
- Non-commercial tools: time-consuming but more reliable
Conclusions
Is it possible to accurately replicate human language using machines?
- Approx. 70% accuracy for all tools (except Semantria)
- WEKA: the most powerful tool
Conclusions
Can existing SA tools respond to the needs of any healthcare-related matter?
- Commercial tools cannot respond
- Non-commercial tools can be trained to do so
Limitations
- Only four tools examined
- Small dataset
- Potential errors in the manual classification
- Detailed analysis of single-sentence responses was omitted
Recommendations
- Examine the reliability of other commercial tools
- Investigate other non-commercial tools, especially NLTK and GATE
- Examine other classifiers (SVM & MaxEnt)
- Investigate all of WEKA's GUIs
Recommendations
- Verify labels using more people
- Label individual sentences as well as the whole response
- Investigate whether negativity is associated with long responses