
CoSACT: A Collaborative Tool for Fine-Grained Sentiment Annotation and Consolidation of Text

Tobias Daudert1*, Manel Zarrouk1 and Brian Davis2
1 Insight Centre for Data Analytics, National University of Ireland, Galway

2 Department of Computer Science, Maynooth University
{tobias.daudert, manel.zarrouk, brian.davis}@insight-centre.org

Abstract

Recently, machine learning and in particular deep neural network approaches have become progressively popular, playing an increasingly important role in the financial domain. This results in an increased demand for large volumes of high-quality labeled data for training and testing. While annotation tools are available to support text analysis tasks such as entity recognition and sentiment classification for generic content, there is an absence of annotation tools purposely built for the financial domain. Hand in hand with this, there are also no existing best practices for sentiment annotation in the financial domain. To address this issue, we suggest fundamental practices for the creation of new datasets for this domain and integrate these into our annotation tool. We present CoSACT, a server-based tool which supports the collaborative annotation and consolidation of datasets purposely built for the financial domain.

1 Introduction

The increase in popularity and adoption of social media in recent years has generated a deluge of valuable, high-volume and volatile user-generated content [Derczynski et al., 2015]. Microblogs have become a central medium of communication and are used in almost all areas of our daily life. They are mainly characterized by their short length; nonetheless, they can be rich in content and highly opinionated [Sinha, 2014]. Microblogs are used by services such as Stocktwits1, Twitter2, or Facebook3. The growth of Twitter, by 1,006% between 2010 and 2015, clearly shows the high importance of short messages.4 Opinion Mining from short texts, such as microblogs, has become an increasingly important task [Derczynski et al., 2015; Derczynski et al., 2013]. Data generated from online communication acts as a potential goldmine for discovering knowledge [Dey and Haque, 2009]. In the area of finance, sentiment acquired from text can positively influence businesses in many ways. For example, it can be used as an indicator for trading or provide knowledge about the public perception of a company. Sentiment classification and opinion mining tools increasingly employ machine learning approaches [Missen et al., 2013; Pang and Lee, 2008], requiring access to labeled data for training, testing, and benchmarking learning systems. The annotation and subsequent consolidation of microblogs into a high-quality gold standard is a challenging, time-consuming and labor-intensive task. Annotation tools aim to reduce the burden associated with this task; however, there are few publicly available annotation tools dedicated to capturing the sentiment in short texts [Trakultaweekoon and Klaithin, 2016] and often, unlike our tool, they do not go beyond the document level towards fine-grained annotation of sentiment at the entity/target level.

* Contact Author
1 https://stocktwits.com/
2 https://twitter.com/
3 https://www.facebook.com/
4 https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/

In this paper, we present CoSACT5, the first annotation tool incorporating a consolidation mode purposely built for the financial domain.

2 Annotation Design

Sentiment in the financial area is driven by multiple factors; hence, the quality and information content of datasets are crucial to the success of sentiment analysis. The results of sentiment analysis can have significant implications in the financial domain, positive as well as negative. One example is algorithmic trading: on the one hand, it can be very profitable, and some companies (e.g. Renaissance Technologies6) are very successful with computational analysis; on the other hand, algorithmic trading can lead to big losses when computers misinterpret information. Incidents such as the Lululemon Athletica Inc. case highlight this risk; their stocks dropped due to problems with the company's product - sports trousers - which was sarcastically discussed on Twitter. Automated trading systems misinterpreted these tweets as positive and long investors lost money7. The more diverse information is available, with a high level of detail, the better a classifier

5 https://github.com/TDaudert/CoSACT
6 https://www.rentec.com
7 https://www.theepochtimes.com/how-hedge-funds-use-twitter-to-gain-an-edge-in-trading2227349.html

Proceedings of the First Workshop on Financial Technology and Natural Language Processing (FinNLP@IJCAI 2019), pages 34-39, Macao, China, August 12, 2019.


is able to learn the underlying data characteristics. Prior to developing our annotation tool, we considered which short financial text parameters are necessary to conduct high-quality sentiment analysis; hence, our tool was developed keeping this in mind. We detail these characteristics below:

1. Sentiment granularity - Currently, the majority of sentiment datasets are annotated in a categorical fashion with polarity (positive, negative, neutral) [Saif et al., 2013]. Although this simplifies the annotation process and leads to higher levels of inter-annotator agreement, a sentiment classifier trained on a polarity dataset will not be able to learn how to classify data with a higher granularity (i.e. in more than 3 classes). However, in the financial domain, this poses a severe drawback since the ability to capture the degree to which one text is more positive/negative than another is highly desirable. Considering a case in which one text is strongly positive and another one slightly negative, the question arises which text has a stronger impact on an asset price. This case applies to all periods in which multiple data points with different sentiments exist. Comparing the sentiment polarity (positive, negative, neutral) with trading actions (buy, sell, hold), polarity classification seems to be an obvious choice; however, in the real world, there is seldom a scenario with only one data artifact to be classified. To address this, data annotated with CoSACT receives a continuous sentiment score between -1 and +1 with 0 as neutral (e.g. 0.472). However, the tool retains the ability to later divide the annotated data categorically into classes without limiting the granularity at the time of annotation.

2. Multi-entity annotation - Sentiment at document level may not be sufficiently accurate when dealing with scenarios requiring the consideration of single entities. Many documents (i.e. twits, tweets, posts) mention more than one entity; hence, a good dataset should aim to include one sentiment annotation for each contained entity. The benefit is at hand: some texts might contain positive sentiment towards some entities and neutral or negative sentiment towards others. To accommodate this, each cashtag in CoSACT receives a sentiment annotation. As shown in Figure 1, the displayed text contains two entities represented by $DD and $MON; both receive an individual sentiment score.

3. Spans - In addition to specifying a sentiment for each entity, CoSACT also requests the span in which this sentiment is contained. This feature provides high-quality information for a sentiment classifier since this fine-grained annotation allows us to aim for an entity-level classification (i.e. determining one sentiment score for each entity in a text).

4. Contextual dependency - A contextual dependency is represented by additional information (e.g. an image or URL) apart from the text the sentiment is referring to. From the sentiment analysis point of view, this is relevant as not all information a sentiment is based on can be retrieved solely from the annotated text. A more sophisticated sentiment classifier may wish to leverage this additional external contextual information.

5. Justification - To facilitate the consolidation, CoSACT collects the annotator's reasoning behind his/her annotation. Moreover, the automatic evaluation of the justifications can be used for the generation of a confidence score which provides further information to a sentiment classifier. It becomes possible to give more weight to some annotations than to those about whose judgment the annotator was unsure.
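To make points 1-3 and 5 concrete, the sketch below shows one possible shape of a fine-grained annotation, together with a post-hoc discretization of the continuous score into polarity classes. The field names, the neutral-band boundary, and the JavaScript representation are our own illustrative assumptions, not CoSACT's actual schema.

```javascript
// Hypothetical annotation record (not CoSACT's actual schema): one
// continuous sentiment score in [-1, +1] and one or more character spans
// per entity, plus a justification and contextual-dependency flags.
const annotation = {
  text: "$DD up on earnings, $MON still looks weak",
  entities: [
    { cashtag: "$DD",  sentiment: 0.6,  spans: [[0, 18]] },
    { cashtag: "$MON", sentiment: -0.4, spans: [[20, 41]] }
  ],
  justification: "earnings beat for $DD; bearish wording for $MON",
  context: { image: false, external: false }
};

// Post-hoc discretization into polarity classes (point 1): the continuous
// score is kept at annotation time and binned only later. The neutral band
// of +/-0.1 is an illustrative choice, not a value prescribed by the tool.
function toPolarity(score, neutralBand = 0.1) {
  if (score > neutralBand) return "positive";
  if (score < -neutralBand) return "negative";
  return "neutral";
}
```

For example, `toPolarity(0.472)` yields `"positive"`, while the span `[0, 18]` recovers the substring `"$DD up on earnings"` from the text.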

3 The Tool: CoSACT

CoSACT is a server-based tool which requires only a browser (e.g. Chrome8, Firefox9). It is developed using JavaScript10 (particularly NodeJS11), a SQL database12, HTML13 and CSS14. As it utilizes a server-client architecture, multiple people can collaborate online in the annotation of the same dataset, at any point in time, without requiring the storage of large amounts of data on their local machines. It also gives the flexibility to outsource the annotation task, as only a browser is required and the annotators never come in contact with the raw dataset itself. Access to the tool and, hence, the data is protected by a username and password combination. Therefore, the tool's availability on the web carries a reduced security risk.

CoSACT evolved from two separate tools, an annotation and a consolidation tool, which were tested in real-world scenarios. These were employed in the Social Sentiment Indexes Project (SSIX)15 to create gold standards in different languages to monitor sentiment related to the financial markets. Additionally, the data used in the SemEval 2017 Task 5 competition on Fine-Grained Sentiment Analysis on Financial Microblogs and News was generated with this annotation and consolidation tool [Cortis et al., 2017]. The gold standard dataset used by Byrne et al. [2016] to predict the outcome of the Brexit poll based on tweets was also created using CoSACT.

To better detail the tool's interface, we will use numbers enclosed in parentheses, e.g. (1), which point to specific locations in the following figures.

3.1 The Annotation Mode

The CoSACT interface for the annotation mode is shown in Figure 1. The microblog text as presented to the annotator is shown in box (3). To its right, (4) highlights the different submission options. Here, the annotator can choose between submitting his/her annotation (green button), labeling the presented microblog as spam (red button), labeling it as irrelevant (blue button), or clicking on one of

8 https://www.google.com/chrome/index.html
9 https://www.mozilla.org/de/firefox/
10 https://en.wikipedia.org/wiki/JavaScript
11 https://nodejs.org/en/
12 https://en.wikipedia.org/wiki/SQL
13 https://en.wikipedia.org/wiki/HTML
14 https://en.wikipedia.org/wiki/Cascading_Style_Sheets
15 https://ssix-project.eu/


Figure 1: The interface of CoSACT in the annotation mode.

the two grey buttons. In this case, the annotator chooses to label the data as "I don't know", in case he/she is unsure about the correct annotation, or as "Not enough information", in case the uncertainty clearly derives from the presented data itself. This differentiation provides insights about the information content of the data, as it becomes clear whether a text might have been too concise for a good interpretation or the annotator has simply not been able to interpret it. Below the presented tweet, (5) labels a field for storing a justification for the annotator's choice. This justification might provide relevant value in addition to the previously assigned feature. On its right, the two boxes (6) "Image" and "External" (e.g. URL) can be ticked in case the presented message is based on additional information and the content alone does not represent the sentiment fully. Location (10) in Figure 1 is the sentiment slider which the annotators use to assign a continuous sentiment between -1 (Negative) and +1 (Positive) to an entity. The text area at location (11) allows the annotators to assign one or multiple spans to the sentiment and, therefore, to specify the part of the microblog which contains the sentiment. In case an annotator is unable to assign a span, the box on top of the text area can be ticked. Finally, the text area (12) in Figure 1 presents the selected spans; the button above it is for resetting spans in case an annotator wants to revise his/her selection.

Note that the master selectors presented with a cyan background in (7)-(9) of Figure 1 affect all the other fields. For example, a change in the slider (7) also moves the two sliders below. This functionality gives the annotators the capability to select spans for microblogs with multiple entities without the necessity to select them independently.

We chose to use a slider to determine the sentiment (see (7)/(10) in Figure 1) to prevent the annotators from using some sentiments disproportionately. If the sentiment range presented classes, annotators would tend to move the slider towards one of the categories (e.g. 1, 2, 3). Similarly, annotators requested to give numbers for a sentiment might be unable to go into a high level of detail and would tend to annotate with rounded numbers (e.g. -1.0, -0.5, 0.0, 0.5, 1.0).

3.2 The Consolidation Mode

In consolidation mode, CoSACT is able to smoothly consolidate a previously annotated dataset. The first time the user tasked with consolidation logs in, he/she is requested to specify the number of required annotations per microblog as well as to set an acceptable deviation. The deviation defines the permitted degree of variation between the annotations of a text. These two values are then used to automatically consolidate all microblogs which fulfill the given criteria (Figure 2). A manual consolidation is required for the remaining microblogs; automatic consolidations can be amended by the consolidator. As shown in Figure 3, the example does not meet the criteria; thus, it needs to be consolidated. In box (5), microblogs to be consolidated have a gray background, auto-consolidated ones are highlighted cyan, and consolidated microblogs are shown in blue. The microblog currently in use is underlaid in green. The consolidator can use the three buttons in (4) to either classify a microblog as spam or irrelevant, or to save his/her consolidation. Location (6) shows the scores from the annotators. The "ALL" area (7)-(8) is similar to areas (7)-(9) of the annotation mode, and the mechanism of assigning a sentiment and spans is similar to the process described in Section 3.1. The consolidator is requested to set the final sentiment as well


Figure 2: The interface of CoSACT in the consolidation mode presenting an auto-consolidated microblog.

Figure 3: The interface of CoSACT in the consolidation mode presenting a microblog to be consolidated.

as the spans, taking into consideration the annotations given and shown.
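The auto-consolidation rule described above can be sketched as follows. The paper does not state how the final value is derived when the criteria are met, so taking the arithmetic mean of the annotators' scores is our assumption:

```javascript
// Sketch of the auto-consolidation criteria: a microblog is consolidated
// automatically only when it has at least the required number of
// annotations AND the scores agree within the acceptable deviation;
// otherwise it is left for manual consolidation (returned as null).
function autoConsolidate(scores, minAnnotations, maxDeviation) {
  if (scores.length < minAnnotations) return null;
  const spread = Math.max(...scores) - Math.min(...scores);
  if (spread > maxDeviation) return null;
  // Assumed aggregation: the arithmetic mean of the annotators' scores.
  return scores.reduce((a, b) => a + b, 0) / scores.length;
}
```

With three required annotations and a deviation of 0.2, `[0.4, 0.5, 0.45]` would be auto-consolidated, while `[0.9, -0.5, 0.1]` would be routed to the consolidator.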

4 Related Work

Due to the current high interest in microblogs and sentiment analysis, tools focusing on sentiment annotation of short text have gained relevance. Some examples of service-based, general-purpose collaborative linguistic and semantic annotation tools include WebAnno [de Castilho et al., 2014], GATE Teamware [Bontcheva et al., 2013], which is integrated into the AnnoMarket Text Mining Service [Tablan et al., 2013], TURKSENT [Eryigit et al., 2013], and SenseTag [Trakultaweekoon and Klaithin, 2016]. Due to our interest in sentiment annotation, we focus on two of these publicly available tools which also focus on this particular task: TURKSENT and SenseTag. However, it is important to clarify that these tools are neither purposely built for the financial domain nor were they applied in any financial context. We chose to address these, as they are the tools most closely related to our work.

Eryigit et al. [2013] were the first to create a publicly available sentiment annotation tool. It incorporates a manual or semi-automatic linguistic annotation layer which includes text normalization, named entity recognition, morphology, and syntax tasks. TURKSENT is provided as software as a service which does not require a specific platform, and it is also accessible by multiple users. For the annotation, categorical labels supported by images (i.e. emoticons) are used; these range from 1-3 and include an additional ambivalent label. Two tasks are assigned during the annotation process. In the first task, the annotator is required to provide a general sentiment label for the text and select an appropriate comment category and sentence type. In the second step, a target-based annotation is performed, where tuples of brand, product/service and features are provided. The second analysed tool, SenseTag, consists of three components: data collection, data annotation and administrative operation. The data is automatically collected from microblogs and classified into four domains. This information is then pre-processed and stored in a database. Non-annotated messages are selected randomly and are then manually tagged word by word. The words are classified into four categories: positive, negative, feature, and entity. Finally, the last component allows the management of the tags and the tagging tool, as well as data handling.

Considering these tools, we identified generic tool characteristics and shortcomings, which we address with CoSACT:

1. Complex pre-processing. While these tools rely on an additional annotator task to deal with entity recognition, CoSACT applies a simple approach relying on a prior input of entities or regular expressions (i.e. regexes) to extract them. Either the entities to be found are specified beforehand, or a regex is given to identify them. This removes an additional laborious task for the annotators. Our tool accepts free text, hence not demanding the pre-identification of categories or features.

2. Categorical sentiment. Both TURKSENT and SenseTag utilize a small, categorical range of sentiment. In CoSACT, we use a wide, continuous span from -1 to +1, which allows for the extraction of a fine-grained sentiment score. CoSACT is the only tool providing a fine-grained sentiment scale.

3. Overall score. Our tool provides a sentiment score for the text as a whole as well as for the entities (e.g. $Volkswagen) and/or tickers (e.g. VOW). This is not present in SenseTag.

4. Contextual Dependencies. Short messages may contain additional information conveyed by an image or URL. CoSACT takes this into account by allowing the user to identify messages where this situation occurs.

5. Consolidation. Our tool implements a consolidation mode which allows a user, preferably a domain expert, to consolidate already annotated datasets. This is useful in cases where the minimum number of annotations is not met and/or the deviation is exceeded; in the remaining scenarios, the microblog is auto-consolidated. This is a vital step for the development of quality gold standards and is not yet implemented in other tools.

6. Usability features. CoSACT's interface indicates the number of messages in the annotation and consolidation tasks and the overall progress. It also has the advantage of easily allowing the user to review already consolidated text. Annotated text cannot be reviewed; this is a deliberate design feature. We believe the annotator should focus on the current text and not revise the assigned sentiment in light of other messages seen before.
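The regex-driven entity pre-identification from point 1 above can be sketched as follows; the cashtag pattern is an illustrative example of a user-supplied expression, not the one CoSACT ships with:

```javascript
// Entities are either listed beforehand or matched by a supplied regex.
// Here, a hypothetical cashtag pattern: "$" followed by 1-6 letters.
const CASHTAG = /\$[A-Za-z]{1,6}\b/g;

// Returns every entity mention found in the text, or an empty array.
function extractEntities(text, pattern = CASHTAG) {
  return text.match(pattern) || [];
}
```

Applied to the example from Figure 1, `extractEntities("$DD up on earnings, $MON still looks weak")` returns `["$DD", "$MON"]`, so both entities receive their own sentiment slider without any extra annotator effort.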

5 Conclusion

In this paper, we addressed the growing need for quality labeled data for training supervised sentiment classification tools for the financial domain. We established fundamental practices we believe to be essential to improving dataset quality and incorporated them in an annotation and consolidation tool designed for the financial domain. Although it collects many different parameters from an annotator, CoSACT simplifies the process of annotation by providing an automatic consolidation as well as a manual consolidation interface. It combines the benefit of collecting a wealth of information while keeping the manual effort low. Entities can be annotated with a sentiment, one or multiple spans expressing it, a justification for the annotation, and contextual information. In addition, data can be classified as spam or irrelevant, and annotators can click on "I don't know" in case they are overwhelmed, or can mark texts with "Not enough information" in case they perceive them as not revealing enough information. All these parameters contribute meaningfully to automated sentiment analysis in a financial setting. Since this tool is server-based, it only needs to be set up once and allows annotators and consolidators to work on data with a minimum of effort, requiring no more than a standard browser. Our tool encourages the efficient engineering of quality fine-grained labeled datasets for developing, testing and benchmarking (semi-)supervised learning systems for entity-oriented sentiment analysis.

Acknowledgments

This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289, co-funded by the European Regional Development Fund.


References

[Bontcheva et al., 2013] Kalina Bontcheva, Hamish Cunningham, Ian Roberts, Angus Roberts, Valentin Tablan, Niraj Aswani, and Genevieve Gorrell. GATE Teamware: a web-based, collaborative text annotation framework. Language Resources and Evaluation, 47(4):1007–1029, 2013.

[Byrne et al., 2016] David Byrne, Angelo Cavallini, Ross McDermott, Manuela Hurlimann, Frederico Caroli, Malek Ben Khaled, Andre Freitas, Manel Zarrouk, Laurentiu Vasiliu, Brian Davis, et al. In or out? Real-time monitoring of Brexit sentiment on Twitter. SEMANTiCS 2016, 2016.

[Cortis et al., 2017] Keith Cortis, Andre Freitas, Tobias Daudert, Manuela Huerlimann, Manel Zarrouk, Siegfried Handschuh, and Brian Davis. SemEval-2017 task 5: Fine-grained sentiment analysis on financial microblogs and news. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 519–535, 2017.

[de Castilho et al., 2014] Richard Eckart de Castilho, Chris Biemann, Iryna Gurevych, and Seid Muhie Yimam. WebAnno: a flexible, web-based annotation tool for CLARIN. In Proceedings of the CLARIN Annual Conference (CAC), 2014.

[Derczynski et al., 2013] Leon RA Derczynski, Bin Yang, and Christian S Jensen. Towards context-aware search and analysis on social media data. In Proceedings of the 16th International Conference on Extending Database Technology, pages 137–142. ACM, 2013.

[Derczynski et al., 2015] Leon Derczynski, Diana Maynard, Giuseppe Rizzo, Marieke van Erp, Genevieve Gorrell, Raphael Troncy, Johann Petrak, and Kalina Bontcheva. Analysis of named entity recognition and linking for tweets. Information Processing & Management, 51(2):32–49, 2015.

[Dey and Haque, 2009] Lipika Dey and Sk Mirajul Haque. Opinion mining from noisy text data. International Journal on Document Analysis and Recognition, 12(3):205–226, 2009.

[Eryigit et al., 2013] Gulsen Eryigit, Fatih Samet Cetin, Meltem Yanik, Tanel Temel, and Ilyas Cicekli. TURKSENT: A sentiment annotation tool for social media. In LAW@ACL, pages 131–134, 2013.

[Missen et al., 2013] Malik Muhammad Saad Missen, Mohand Boughanem, and Guillaume Cabanac. Opinion mining: reviewed from word to document level. Social Network Analysis and Mining, 3(1):107–125, 2013.

[Pang and Lee, 2008] Bo Pang and Lillian Lee. Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval, 2(1-2):1–135, 2008.

[Saif et al., 2013] Hassan Saif, Miriam Fernandez, Yulan He, and Harith Alani. Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold. 2013.

[Sinha, 2014] Nitish Sinha. Using big data in finance: Example of sentiment extraction from news articles. Technical report, Board of Governors of the Federal Reserve System (US), 2014.

[Tablan et al., 2013] Valentin Tablan, Kalina Bontcheva, Ian Roberts, Hamish Cunningham, Marin Dimitrov, and AD Ontotext. AnnoMarket: An open cloud platform for NLP. In ACL (Conference System Demonstrations), pages 19–24, 2013.

[Trakultaweekoon and Klaithin, 2016] Kanokorn Trakultaweekoon and Supon Klaithin. SenseTag: A tagging tool for constructing Thai sentiment lexicon. In 2016 13th International Joint Conference on Computer Science and Software Engineering (JCSSE), pages 1–4. IEEE, 2016.
