
Project Acronym: FIRST

Project Title: Large scale information extraction and integration infrastructure for supporting financial decision making

Project Number: 257928

Instrument: STREP

Thematic Priority: ICT-2009-4.3 Information and Communication Technology

D7.3 Early Prototype of Integrated Financial Market Information System GUI

Work Package: WP7 – Integrated Financial Market Information System

Due Date: 30/09/2012

Submission Date: 01/10/2012

Start Date of Project: 01/10/2010

Duration of Project: 36 Months

Organisation Responsible for Deliverable: ATOS

Version: 1.0

Status: Final

Author(s): Mateusz Radzimski (ATOS), Miha Grčar (JSI)

Reviewer(s): Achim Klein (UHOH)

Nature: R – Report, P – Prototype, D – Demonstrator, O – Other

Dissemination level: PU - Public; CO - Confidential, only for members of the consortium (including the Commission); RE - Restricted to a group specified by the consortium (including the Commission Services)

Project co-funded by the European Commission within the Seventh Framework Programme (2007-2013)

D7.3

Revision history

Version | Date | Modified by | Comments
0.1 | 10/09/2012 | Mateusz Radzimski (ATOS) | First version of ToC provided
0.2 | 21/09/2012 | Mateusz Radzimski (ATOS) | Contribution to section 2
0.3 | 24/09/2012 | Miha Grčar (JSI) | Visualisations description
0.4 | 25/09/2012 | Mateusz Radzimski (ATOS) | GUI guidelines
0.5 | 26/09/2012 | Mateusz Radzimski (ATOS) | GUI hosting platform
0.6 | 28/09/2012 | Mateusz Radzimski (ATOS) | Addressing reviewer’s comments
0.7 | 28/09/2012 | Mateusz Radzimski (ATOS) | Preparation of the final version and last corrections
1.0 | 30/09/2012 | Tomás Pariente (ATOS) | Final editing, conclusion and ready for submission


Copyright © 2012, FIRST Consortium

The FIRST Consortium (www.project-first.eu) grants third parties the right to use and distribute all or parts of this document, provided that the FIRST project and the document are properly referenced.

THIS DOCUMENT IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENT, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Executive Summary

This report describes the development and setup of the first prototype of the Integrated Financial Market Information System GUI (IFMIS GUI, or Sentify portal), produced within WP7, task T7.3.

The development follows the design described in (FIRST D2.2 Conceptual and technical integrated architecture design, 2011), conforms to the devised FIRST architecture, and is built on top of the infrastructure being developed within WP7: (FIRST D7.1 Integration Infrastructure Release, 2012) and (FIRST D7.2 Early prototype of Integrated Financial Market Information System, 2012).

The Sentify portal presents sentiment data computed by WP4 (FIRST D4.2 Semantic information extraction components, addressing noise and uncertainty, 2012). It takes the form of a web application that also integrates the visualisation prototypes developed within (FIRST D6.4 Interactive visualisation of textual streams v1, 2012) and makes use of data stored in the FIRST knowledge base (FIRST D5.3 Large-scale integrated knowledge base, 2012).

This report accompanies the early prototype of the FIRST Sentify portal, which implements some of the aforementioned functionalities. Although the portal is still at an early stage of development, it already provides a general development framework and the first functionalities.


Table of Contents

Executive Summary
Abbreviations and acronyms
1. Introduction
2. Integrated Financial Market Information System GUI: The origin of the Sentify portal
  2.1. Overview of GUI integration
  2.2. GUI hosting platform
  2.3. GUI portal elements and structure
  2.4. Guidelines for GUI widgets
  2.5. Next steps and future development
3. Integrated WP6 visualisations
  3.1. Document Space Visualisation
  3.2. Tag Clouds
  3.3. Canyon Flow Visualization
  3.4. Visualizing Co-occurrence Statistics
    3.4.1 Co-occurrence circle
    3.4.2 Occurrence and co-occurrences timeline
  3.5. Twitter sentiment timeline
4. Conclusion
References
Annex 1. Specification of WP4 sentiment drill-down GUIs for WP7
  Annex a. Sentiment Retrieval
  Annex b. Sentiment Inspection (Drill-down)
    Annex i. Sentiment Index Inspection
    Annex ii. Sentiment Document Inspection
    Annex iii. Sentiment Sentence Inspection
  Annex c. Aggregated Sentiment Time Series Visualization


Index of Figures

Figure 1: Deployment of the Sentify web application
Figure 2: Screenshot of the early mock-up of the Sentify main page.
Figure 3: Document space visualization
Figure 4: Tag cloud visualization (mockup)
Figure 5: Canyon flow visualization (mockup)
Figure 6: Co-occurrence circle visualization (mockup)
Figure 7: Visualization of occurrence and co-occurrence statistics.
Figure 8: Twitter sentiment timeline

Index of Tables

Table 1: Initial website structure
Table 2: Guidelines for widgets development


© FIRST consortium Page 7 of 28

Abbreviations and acronyms

IFMIS Integrated Financial Market Information System

WP Work Package

ID Identifier

DB Database

WCF Windows Communication Foundation

GUI Graphical User Interface

SOAP Simple Object Access Protocol – method for exchanging data in Web Services

WSDL Web Service Description Language

WS Web Service

px Pixels (unit of width and height dimensions)

HTML HyperText Markup Language

URL Uniform Resource Locator

CSS Cascading Style Sheets

HTTP Hypertext Transfer Protocol

GWT Google Web Toolkit


1. Introduction

This document describes the M24 release of the early prototype of the Integrated Financial Market Information System GUI, also called IFMIS GUI or Sentify portal.

The aim of the Sentify portal is to showcase the core functionalities of the FIRST system and the results of the sentiment computation. These are:

  • Visualising sentiments related to sentiment objects
    o including the aggregated sentiment index (that is, aggregated from multiple documents, multiple instruments, etc.).

  • Inspecting and drilling down with respect to:
    o documents and annotations,
    o sentiments on all levels and associated data (textual representations, indicators).

  • Showcasing the technical results provided by other components:
    o WP6 textual-stream visualisations,
    o other visualisations developed within the FIRST project, demonstrating the added value of the sentiment data.

The drill-down functionality is one of the core parts of the Sentify portal. It is the means of inspecting sentiment results: it offers progressively more detailed views, going from summary information (such as aggregated sentiments) to specifics (the concrete related documents, or even the phrases expressing a certain sentiment).
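The drill-down idea can be pictured as a small tree of sentiment values. The sketch below is illustrative only: the class name `SentimentNode`, the level labels, the sample scores and the use of a plain mean for aggregation are all assumptions made for this example, not the FIRST data model or the WP4 aggregation method.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class SentimentNode:
    """One level of a drill-down hierarchy (hypothetical structure)."""
    level: str                      # e.g. "index", "document", "sentence"
    label: str
    score: Optional[float] = None   # set on leaves only
    children: List["SentimentNode"] = field(default_factory=list)

    def aggregated(self) -> float:
        """Leaf score, or the plain mean of the children's aggregated scores."""
        if not self.children:
            if self.score is None:
                raise ValueError(f"leaf {self.label!r} has no score")
            return self.score
        return mean(child.aggregated() for child in self.children)

    def drill_down(self) -> List["SentimentNode"]:
        """Return the more detailed view one level below this node."""
        return self.children


# Made-up sample data: one instrument, two documents, three sentences.
doc_a = SentimentNode("document", "doc-a", children=[
    SentimentNode("sentence", "s1", score=1.0),
    SentimentNode("sentence", "s2", score=0.0),
])
doc_b = SentimentNode("document", "doc-b", children=[
    SentimentNode("sentence", "s3", score=-1.0),
])
index = SentimentNode("index", "example-instrument", children=[doc_a, doc_b])
```

Calling `index.drill_down()` yields the two documents, and calling it again on a document yields its sentences, which mirrors the summary-to-detail navigation described above.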

All Sentify portal functionalities rely on the data stored in the WP5 knowledge base and provided by the WP4 sentiment extraction components. The database schema and data access services were designed with WP7 in mind, which means that the data services provide data at the granularity needed to implement, for example, the drill-down mechanism. On the other hand, the early work of task T7.3 allowed for the necessary refinement and validation of those services. For details on the data provided by WP4, see (FIRST D4.2 Semantic information extraction components, addressing noise and uncertainty, 2012); for details on the WP5 services used by the IFMIS GUI, see (FIRST D5.3 Large-scale integrated knowledge base, 2012). The draft definition of the WP4 data and services for implementing the WP7 drill-down mechanism is presented in Annex 1 of this deliverable.

The scope of development of the IFMIS GUI also covers the integration of “external”, separately developed visualisations and GUIs from WP6. However, this deliverable does not cover the development details of those modules. They provide their own environment, and task T7.3 performs only the visual integration, so that from the end user's point of view they are coherent with the rest of the IFMIS portal.

Note that the Sentify portal is still in an early phase of development and most of the work is ongoing. Many features are therefore not yet implemented and will be added to the portal incrementally once they are ready. However, it already provides the first features and the necessary framework for embedding visual GUIs. At the time of writing, the drill-down functionality is limited to the aggregated sentiment level and the individual financial instrument/company level. Details on the current state of the IFMIS GUI and technical documentation are presented in the following chapters.


2. Integrated Financial Market Information System GUI: The origin of the Sentify portal

The Integrated Financial Market Information System GUI is the common access point to the functionality offered by the core FIRST infrastructure. As devised in previous work (FIRST D2.2 Conceptual and technical integrated architecture design, 2011), the IFMIS GUI has been developed as a web application that end users can access without installing any software or additional dependencies; only a web browser and an internet connection are required.

The following sections describe details of the implementation of the first prototype of the IFMIS GUI and the integration of various components developed outside of the scope of WP7.

As stated before, in the remainder of this document and throughout the further lifecycle of this prototype, the IFMIS GUI is also called “Sentify”, to give the FIRST results website a recognisable name.

2.1. Overview of GUI integration

The Y1 requirements studies revealed that some visual components would be developed in different technologies (e.g. WP7 in Java, WP6 in .NET). This problem was addressed in the architecture design deliverable (FIRST D2.2 Conceptual and technical integrated architecture design, 2011), and some extra considerations have been made in order to accommodate the different technologies and integrate them into one coherent web application. The preferred solution embraces the technological diversity and assumes a decoupled, web-based technical integration of the FIRST GUIs.

This means that, on the one hand, there are visual web components (widgets and WP6 visualisations) developed separately within WP6 and in principle independent of each other, and, on the other hand, an IFMIS web portal providing a common ground for showcasing the basic functionalities of the FIRST project.

The way to unify both worlds is to parameterise the widgets and embed them within the Sentify portal. The architectural overview and technical details of such an implementation have already been presented in D2.2. On the conceptual side, the IFMIS portal needs to provide a clear content structure and guidelines on how widgets should be placed, which information (parameters) they need, and how the widgets and the IFMIS GUI interact.

The general idea of GUI-based integration is therefore that all parts (widgets, visualisations) are seamlessly embedded into the overall portal, so that from the user's perspective everything looks like one integrated web application/website.

The following subsections will define details on the widget and visualisations integration, and also introduce the first prototype of the Sentify portal.

2.2. GUI hosting platform

The Sentify portal is deployed as a Java web application and is currently hosted on a Jetty server. The visual template was designed using traditional HTML and CSS, while the application logic, the widget embedding and some of the widgets themselves (see section 2.3) have been developed using Vaadin (http://www.vaadin.com/), an open source web application framework based on the Google Web Toolkit (GWT) [1]. The choice of technology was explained in (FIRST D2.2 Conceptual and technical integrated architecture design, 2011); however, instead of using raw GWT we opted for the Vaadin framework, because it provides custom components on top of GWT that needed less work to be adapted to the needs of the IFMIS GUI.

The deployment of the first prototype of the Sentify portal is shown in Figure 1. The main application is deployed on an application server that is not directly reachable by the client. Communication goes through a proxy server that rewrites URLs and forwards traffic. To the end user, everything therefore appears to be on one server and one website. The proxy also helps to manage the URLs of pages, which can be redirected to different application modules.

This setup allows different parts of the web application (different websites or modules) to be hosted on different application server instances or even on different machines.
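The forwarding role of the proxy can be sketched as a prefix-based routing table. This is a minimal illustration, not the actual proxy configuration: the backend addresses are the ones labelled in Figure 1, while the `/wp6/` path prefix and the function name `resolve_backend` are hypothetical.

```python
from typing import List, Tuple

# Backend addresses follow Figure 1; the path prefixes are hypothetical.
ROUTES: List[Tuple[str, str]] = [
    ("/wp6/", "http://first.ijs.si:80"),     # IIS: .NET widgets and visualisations
    ("/", "http://first-vm1.ijs.si:8088"),   # Jetty: main Sentify website
]


def resolve_backend(path: str) -> str:
    """Map a public request path to the internal URL it is forwarded to.

    The first matching prefix wins, so more specific prefixes are listed first.
    """
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend + path
    raise LookupError(f"no route for {path!r}")
```

Because the client only ever sees the public host name, a module can be moved to another server instance by changing one entry in the table.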

Figure 1: Deployment of the Sentify web application

The current version of the Sentify portal is reachable at the primary URL http://first-vm1.ijs.si/ and at the alias URL http://sentify.project-first.eu/.

From the technical point of view, the main application is served from the main Jetty application server. It serves the whole graphical user interface and handles the general IFMIS functionalities (see section 2.3). It also embeds all external widgets (see section 3), which are incorporated as HTML inline frames (iframes) within the IFMIS web portal. URL parameters are passed to each iframe as well, so the IFMIS GUI is able to control what the embedded widgets display.
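A minimal sketch of this embedding step is given below: it builds the iframe markup and appends the parameters to the widget URL as a query string. The function name, the widget URL and the parameter name used in the usage note are assumptions for illustration; the portal itself is built with Vaadin, not with this code.

```python
from html import escape
from urllib.parse import urlencode


def iframe_for_widget(widget_url: str, params: dict, width_px: int, height_px: int) -> str:
    """Build the <iframe> markup that embeds an external widget.

    The parameters are appended to the widget URL as a query string, which is
    how the hosting page controls what the embedded widget displays.
    """
    src = widget_url
    if params:
        src += "?" + urlencode(params)
    # Escape the URL so that '&' and quotes are safe inside the attribute.
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width_px}" height="{height_px}"></iframe>'
    )
```

For example, `iframe_for_widget("http://first.ijs.si/widget", {"companyID": "42"}, 583, 300)` produces an iframe whose `src` carries the hypothetical `companyID` parameter.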

2.3. GUI portal elements and structure

At the time of writing, the current version of the Sentify portal is structured as presented in Table 1.

1 https://developers.google.com/web-toolkit/

[Figure 1 diagram labels: client machine; Apache reverse HTTP proxy server (first-vm1.ijs.si); Jetty server, main Sentify website (first-vm1.ijs.si:8088); IIS server, .NET widgets and visualisations (first.ijs.si:80); other visual or GUI components]


Relative URL: /
Description: Main page, displaying:
  • latest news widget (no parameters, showing all latest processed documents),
  • trending topics visualisation widget,
  • documents processed widget,
  • list of monitored sources.
Parameters: None

Relative URL: /companies/{companyID}
Description: Company resumé page, displaying:
  • company sentiment chart widget,
  • company aggregated sentiment,
  • latest news widget (with company parameter, showing documents relevant only to the currently displayed company),
  • company info box widget (company abstract).
Parameters: Company ID

Relative URL: /instrument/{instrumentID}
Description: Financial instrument page, displaying:
  • financial instrument sentiment chart widget,
  • company aggregated sentiment,
  • latest news widget (with instrument ID parameter, showing documents relevant only to the currently displayed instrument ID).
Parameters: Instrument ID

Relative URL: /document/{documentID}
Description: Single document analysis page, displaying:
  • annotated document widget,
  • document summary widget.
Parameters: Document ID

Relative URL: /visualisations/{visualisationID}
Description: A placeholder for all WP6 visualisations. Each full-screen visualisation is shown on a dedicated page. For details see section 3.
Parameters: Visualisation ID

Table 1: Initial website structure
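The URL scheme of Table 1 can be expressed as a small route matcher. This is only a sketch: the patterns mirror the relative URLs in the table, while the page keys and the function name `match_page` are made-up labels, and the real portal resolves pages inside its Java web application.

```python
import re
from typing import Optional, Tuple

# Patterns mirror the relative URLs in Table 1; the page keys are made-up labels.
PAGES = [
    ("main", re.compile(r"^/$")),
    ("company", re.compile(r"^/companies/(?P<id>[^/]+)$")),
    ("instrument", re.compile(r"^/instrument/(?P<id>[^/]+)$")),
    ("document", re.compile(r"^/document/(?P<id>[^/]+)$")),
    ("visualisation", re.compile(r"^/visualisations/(?P<id>[^/]+)$")),
]


def match_page(path: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (page key, ID parameter) for a relative URL, or None if unknown."""
    for key, pattern in PAGES:
        m = pattern.match(path)
        if m:
            # The main page has no ID group, so its parameter is None.
            return key, m.groupdict().get("id")
    return None
```

The ID extracted here is the single parameter that each page passes on to its widgets.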


Navigation between those pages is realized by setting HTML links on the relevant elements. In the current prototype those are:

  • companies’ names: pointing to the “Company resumé” page (/company/{companyID}),
  • financial instruments’ names: pointing to the “Financial instrument” page (/instrument/{instrumentID}),
  • news titles: pointing to the “Single document analysis” page (/document/{documentID}).

The widgets listed in Table 1 are parameterized, so they change their content according to the parameter they are invoked with. Some of them can therefore be reused across pages. An example screenshot of the main page mock-up, showing the idea of the news list, monitored sources, trending topics and processed documents widgets, is shown in Figure 2.

Figure 2: Screenshot of the early mock-up of the Sentify main page.

[Figure 2 callouts: Sentify menu; visualisation of the trending topics; latest news list, with aggregated sentiment at the document level; number of processed documents; list of monitored sources; embeddable widget]

2.4. Guidelines for GUI widgets

Because of the heterogeneous architecture of the GUI components, a list of guidelines has been defined to bring them together into a unified web application. While each separately developed GUI component can also be showcased separately (outside of the Sentify portal), the integrated version needs additional effort to ensure that the GUIs fit the rest of the web application.

The level of technical conformance depends on the requirements of the particular widget or GUI, and widgets can be divided into 3 groups:

(1) Display-only widgets: widgets providing display-only features, such as info boxes and side panels that complement the primary data displayed on a specific page. These widgets provide static information and do not react to user interactions (such as mouse clicks). They are, however, usually parameterised and display different information depending on the page in which they are embedded.

(2) Widgets with internal interactions: widgets providing interactive features (such as dynamic visualisations, data filters, etc.). These widgets are not static and can fetch the necessary data dynamically from their hosting environment. The results of the interaction are local to the widget.

(3) Widgets with external interactions: widgets that can affect the global website (for instance, by reloading the global website or submitting a form to another URL). This can be used when a link in one widget opens another page where further details are shown.

Please note that while technical conformance depends on the widget's purpose, visual conformance is common to all GUIs, so that they fit the Sentify website, navigation problems are avoided and the user experience is improved.

For ensuring technical soundness of GUI integration, the following guidelines have been defined:

ID: Description

Guidelines concerning design

D.1: Use colours and CSS styles of the main IFMIS (Sentify) portal, such as style of the text, headings, links, background and foreground colours, content panel size, etc.

D.2: Don’t repeat common elements such as page header, footer and other “boilerplate” information.

D.3: For main-screen widgets don’t use additional borders.

D.4: For side-panels, use the predefined style (div.block) that sets default borders.

D.5: The main-screen content placeholder is resizable, so it can be any embedded component. The width varies from a maximum of 737px to a minimum of 583px in the full-page mode, with a minimum of 138px in the “mobile” mode.

D.6: The side-panels are also resizable. The width varies from a maximum of 333px to a minimum of 252px in the full-page mode, with a minimum of 138px in the “mobile” mode.

D.7: For full-screen visualisations the width shall not exceed 1160px, and it is advisable that the content can be resized.

Guidelines concerning Sentify navigation

T.1: Use clickable links on active elements (e.g. sentiment objects, such as countries, companies, stocks) to enable navigation. The landing sites are defined in section 2.3. Please note that this list is evolving, and new pages are to be introduced.

T.2: When navigating across webpages, the widgets should always use <a target="_top"> for the links and use URLs of the Sentify portal (see section 2.3), not of the widget itself.

T.3: Establish and conform to the URL parameters of the widgets.

Table 2: Guidelines for widgets development
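Two of these guidelines lend themselves to a short sketch: the width ranges of D.5 to D.7 and the link form required by T.2. The helper names and dictionary keys below are illustrative assumptions, and the "mobile" mode is left out because Table 2 gives only its minimum width.

```python
from html import escape

# (min, max) widths in px for the full-page mode, from D.5 and D.6;
# the dictionary keys are illustrative names.
FULL_PAGE_WIDTHS = {"main": (583, 737), "side": (252, 333)}
FULLSCREEN_MAX_WIDTH = 1160  # from D.7


def fits_placeholder(kind: str, width_px: int) -> bool:
    """Check a widget width against the full-page range of its placeholder."""
    low, high = FULL_PAGE_WIDTHS[kind]
    return low <= width_px <= high


def fits_fullscreen(width_px: int) -> bool:
    """Check a full-screen visualisation width against the D.7 limit."""
    return width_px <= FULLSCREEN_MAX_WIDTH


def portal_link(portal_path: str, text: str) -> str:
    """Render a navigation link as T.2 asks: a portal URL with target="_top"."""
    return f'<a target="_top" href="{escape(portal_path, quote=True)}">{escape(text)}</a>'
```

Using `target="_top"` makes a link clicked inside an embedded frame replace the whole portal page rather than only the frame, which is why T.2 also asks for portal URLs instead of widget URLs.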


While the Sentify portal is still at an early stage of development, these rules aim to facilitate the incorporation and integration of different visual components into the overall system.

2.5. Next steps and future development

Future development is mostly oriented towards refining the modules that are already under development and improving the overall drill-down functionality. The latter aspect is crucial for the overall demonstration purpose of the FIRST project. We plan to cover all levels of the drill-down, in order to provide sentiment details from the aggregated level down to the individual sentence level. As the sentiment data computed within WP4 becomes more mature and accurate with time, this task becomes feasible in the upcoming months.

Another aspect of future development is improving the integration and navigation between pages, widgets and visualisations. While the work before M24 provided the technical framework and the necessary feedback on integration issues, the post-M24 tasks will focus on concrete details, such as the final implementation of the mock-ups and the enhancement of the existing ones (e.g. by adding search functionalities or including date ranges in the drill-down).

The last aspect is the completeness of the Sentify portal: we will analyse what other functionalities are required in order to showcase the full potential of the FIRST infrastructure.


3. Integrated WP6 visualisations

The goal of the IFMIS GUI is also to incorporate the visualisations developed in other tasks of WP6. Those visualisations are currently “self-contained” and independent of the Sentify portal, and work has started on providing features for integrating them with other visualisations.

Note that because the WP6 visualisations are being developed in parallel with the production of this deliverable, the following sections do not necessarily capture the latest M24 state of the visualisations. Some of them are in the mock-up or early prototype phase.

This, however, does not influence the overall integration process, which has already started and will be reflected in its final version in the M33 and M36 deliverables.

The following sections present the list of visualisations currently planned to be included in the first prototype of the Sentify portal. For details on those visualisations and the usage scenario, see the WP6 deliverables, especially (FIRST D6.4 Interactive visualisation of textual streams v1, 2012).

3.1. Document Space Visualisation

A document space or topic space is a high-dimensional bag-of-words space in which documents are represented as points. To visualise a topic space, we need to project documents onto a 2-dimensional canvas so that the distances between the points reflect the similarities between the corresponding documents.

Description: Real-time visualization of document space, transparency used to give info about age of documents, color coding employed to denote document-level sentiment, frequent terms and/or recognized entities are overlaid to “describe” different parts of the space

Input Data: Bag-of-words vectors (optionally also document-level sentiment and/or discovered entities)

Visualized data: Documents, document-level sentiment, document-document similarity, terms or entities, topics (implicitly)

Visualisation mode: Full-screen
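The similarity that the projected distances are meant to reflect can be illustrated on bag-of-words vectors. The sketch below uses cosine similarity, which is a common choice but an assumption here, since the text does not name the measure used in FIRST; the whitespace tokenizer is likewise a simplification, and the 2-dimensional projection step itself is not shown.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Very small tokenizer: lower-case, split on whitespace, count terms."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Documents that share most of their terms score close to 1 and would be drawn near each other, while documents with no terms in common score 0.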


Figure 3: Document space visualization

3.2. Tag Clouds

A tag cloud (word cloud) is a visual representation for text data, typically used to depict keyword metadata (tags) on Web sites, or to visualize free form text. Tags are usually single words, and the importance of each tag is shown with font size or color. This format is useful for quickly perceiving the most prominent terms and for locating a term alphabetically to determine its relative prominence. When used as Web site navigation aids, the terms are hyperlinked to items associated with the tag. [1]

Description: Real-time visualization of the most frequent terms (also possible: topics, entities, co-occurrences), transparency used to give info about age of terms, font size denotes relative importance of terms, color coding employed to denote sentiment attributed to terms and/or entities

Input Data: Bag-of-words vectors, discovered entities, entity-related sentiment

Visualized data: Terms, entities and entity-related sentiment, co-occurrences

Visualisation mode: Embedded widget
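The three visual encodings named in the description (font size for importance, transparency for age, colour for sentiment) can be sketched as simple mapping functions. The pixel range, the linear scaling, the 24-hour half-life and the colour names are all assumptions chosen for illustration; the text does not specify how the WP6 widget computes them.

```python
def font_size(count: int, min_count: int, max_count: int,
              min_px: int = 12, max_px: int = 48) -> float:
    """Scale a term's frequency linearly into a font size in pixels."""
    if max_count == min_count:
        return float(max_px)
    share = (count - min_count) / (max_count - min_count)
    return min_px + share * (max_px - min_px)


def opacity(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Fade older terms: opacity halves every `half_life_hours`."""
    return 0.5 ** (age_hours / half_life_hours)


def sentiment_colour(sentiment: float) -> str:
    """Map a sentiment score to a colour name (threshold at zero is assumed)."""
    if sentiment > 0:
        return "green"
    if sentiment < 0:
        return "red"
    return "grey"
```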

1 Taken from Wikipedia, http://en.wikipedia.org/wiki/Tag_cloud


Figure 4: Tag cloud visualization (mockup)

3.3. Canyon Flow Visualization

While topic spaces give insights into large amounts of documents with respect to topic coverage, temporal visualizations provide valuable insights into how topics evolved through time.

Description: Real-time visualization of topics through time, allows the user to navigate to more specific or more general topics, colour coding used to denote sentiment related to topics, detected topics are described by characteristic terms or discovered entities, entities can be visualized instead of topics

Input Data: Bag-of-words vectors, arranged into a bisecting cluster hierarchy, discovered entities

Visualized data: Topics/entities through time, sentiment related to topics/entities, topics are described by characteristic terms/entities

Visualisation mode: Embedded widget


Figure 5: Canyon flow visualization (mockup)

3.4. Visualizing Co-occurrence Statistics

Co-occurrence can either mean concurrence / coincidence or, in a more specific sense, the above-chance frequent occurrence of two terms from a text corpus alongside each other in a certain order. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or an idiomatic expression. [1]

3.4.1 Co-occurrence circle

The co-occurrence circle provides a static view of the co-occurrence statistics computed from document streams.

Description: Real-time visualization of co-occurrence statistics of discovered entities, colour coding used to denote sentiment related to entities, transparency used to denote entity occurrence age

Input Data: Discovered entities, co-occurrence statistics, entity-related sentiment

Visualized data: Discovered entities, co-occurrence statistics, entity-related sentiment

Visualisation mode: Full screen
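A minimal sketch of how such co-occurrence statistics could be gathered is shown below. It simply counts, for each unordered pair of entities, the number of documents that mention both; this document-level counting is an assumed simplification that ignores word order and the above-chance test mentioned in the definition, and it is not the FIRST pipeline.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List


def cooccurrence_counts(docs: Iterable[List[str]]) -> Counter:
    """Count, per unordered entity pair, the documents mentioning both."""
    counts: Counter = Counter()
    for entities in docs:
        unique = sorted(set(entities))          # ignore repeats within a document
        counts.update(combinations(unique, 2))  # pairs come out in sorted order
    return counts
```

Each key is a tuple of two entity names in alphabetical order, so a pair is counted once regardless of the order in which its members appear.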

1 Taken from Wikipedia, http://en.wikipedia.org/wiki/Co-occurrence


Figure 6: Co-occurrence circle visualization (mockup)

3.4.2 Occurrence and co-occurrences timeline

The occurrence and co-occurrences timeline provides a dynamic view (i.e., a view through time) of the occurrence and co-occurrence statistics computed from document streams.

Description: Visualization of entity co-occurrence statistics through time, with the functionality to define co-occurrence pairs of interest, show frequency chart, apply moving average smoothing, inspect the underlying articles/blog posts, overlay MACD to identify differences between short term and long term trends

Input Data: Stream of annotated documents

Visualized data: Entity occurrence and co-occurrence statistics through time, moving average time series, MACD time series

Visualisation mode: Full screen
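The two smoothing operations mentioned in the description can be sketched on a plain list of frequencies. The MACD line is the difference between a fast and a slow exponential moving average; the 12 and 26 period defaults below are the conventional ones and an assumption here, since the text does not state the periods used by the visualisation.

```python
from typing import List


def simple_moving_average(values: List[float], window: int) -> List[float]:
    """Mean of the last `window` values at each position (shorter at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out: List[float] = []
    for v in values:
        # The first value seeds the average; later values are blended in.
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def macd(values: List[float], fast: int = 12, slow: int = 26) -> List[float]:
    """MACD line: fast EMA minus slow EMA of the same series."""
    return [f - s for f, s in zip(ema(values, fast), ema(values, slow))]
```

A positive MACD value means the short-term trend is above the long-term trend, which is the kind of difference the overlay is meant to highlight.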


Figure 7: Visualization of occurrence and co-occurrence statistics.

3.5. Twitter sentiment timeline

In FIRST, we collect tweets about roughly 1,800 US stocks (such as Google, Cisco, Microsoft, Apple) and analyse their vocabulary to determine whether the sentiment towards a particular financial instrument is positive or negative. We can then observe sentiment and volume (i.e., tweeting frequency) trends through time by visualizing this data along a timeline. This kind of visualization can alert us to important events (volume peaks) and sentiment cross-overs.

Description: Visualize stock closing price, relevant events, positive tweets, negative tweets, absolute difference between positive and negative tweets

Input Data: Stream of tweets about a particular stock, price time series, events

Visualized data: Stock closing price (grey line), relevant events (green dots), positive tweets (blue), negative tweets (red), absolute difference between positive and negative tweets (yellow)

Visualisation mode: Full screen
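
The derived series in this view can be sketched as follows. The sketch assumes per-period counts of positive and negative tweets as input and treats a sentiment cross-over as a sign change of (positive minus negative) between two consecutive periods; that definition and the function name are assumptions, not taken from the deliverable.

```python
def sentiment_series(positive, negative):
    """Return the per-period absolute difference and the indices of cross-overs."""
    if len(positive) != len(negative):
        raise ValueError("series must have equal length")
    abs_diff = [abs(p - n) for p, n in zip(positive, negative)]
    crossovers = []
    for i in range(1, len(positive)):
        prev = positive[i - 1] - negative[i - 1]
        curr = positive[i] - negative[i]
        # Assumed definition: a cross-over is a sign change of (positive - negative).
        if prev * curr < 0:
            crossovers.append(i)
    return abs_diff, crossovers


# Hypothetical daily tweet counts for one stock.
pos = [10, 12, 8, 5, 9]
neg = [6, 7, 9, 11, 4]
abs_diff, crossovers = sentiment_series(pos, neg)
```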

Figure 8: Twitter sentiment timeline

4. Conclusion

This document presented the state of development, as of month M24, of the early prototype of the Integrated Financial Market Information System GUI, also known as the Sentify portal.

The prototype already implements a first set of functionalities: the deployment infrastructure, the first version of the Sentify portal, the first version of the data presentation widgets, and an early prototype of the drill-down. It also provides the general structure and guidelines for further development.

The Sentify portal is intended as a showcase of how end-users interact with the FIRST results. These results come from different tools developed for different purposes, in different programming languages and deployment environments. Integrating those GUIs into a common framework without redesigning them is not an easy task. To achieve this, a lightweight GUI integration process, requiring only a minimal agreement among the parties, has been established with the developers of the FIRST GUIs. This document therefore provides developers with a set of guidelines that ease the integration of new GUIs and widgets into the Sentify portal.

As the work is ongoing, much exciting work remains to be done in the next project period.

References

FIRST D2.2 Conceptual and technical integrated architecture design. (2011).

FIRST D2.3 Scaling Strategy. (2011).

FIRST D4.2 Semantic information extraction components, addressing noise and uncertainty. (2012).

FIRST D5.1 Specification of the information-integration model. (2011).

FIRST D5.3 Large-scale integrated knowledge base. (2012).

FIRST D6.4 Interactive visualisation of textual streams v1. (2012).

FIRST D7.1 Integration Infrastructure Release. (2012).

FIRST D7.2 Early prototype of Integrated Financial Market Information System. (2012).

Annex 1. Specification of WP4 sentiment drill-down GUIs for WP7

The GUIs for visualizing the results of sentiment extraction, classification, and analysis in WP4 can be categorized as: (1) retrieval of sentiment with respect to specific objects, (2) inspection of all sentiments of a document or index (drill-down), and (3) visualization of sentiment time series.

Annex a. Sentiment Retrieval

The objective of the retrieval functionality is to look up and display sentiments with respect to specific sentiment objects.

The user needs to specify the following sentiment attributes for the retrieval:

- Name/URI of the sentiment object (mandatory)

- Name/URI of the feature of the sentiment object (optional)

- Name/URI of the subfeature of the sentiment object (optional)

- startdatetime – start of the time period to retrieve sentiments

- enddatetime – end of the time period to retrieve sentiments
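
The attributes above could be collected in a small query object before being passed to a service. The following is a minimal sketch under stated assumptions: the class name SentimentQuery and its field names are hypothetical, the date fields follow the startdatetime/enddatetime spelling used for the time series visualization, and treating both dates as required is an assumption, since only the sentiment object is explicitly marked as mandatory.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class SentimentQuery:
    """Hypothetical container for the retrieval attributes listed above."""

    sentiment_object: str             # name or URI, mandatory
    startdatetime: datetime           # start of the retrieval period
    enddatetime: datetime             # end of the retrieval period
    feature: Optional[str] = None     # name or URI, optional
    subfeature: Optional[str] = None  # name or URI, optional

    def __post_init__(self):
        if not self.sentiment_object:
            raise ValueError("the sentiment object is mandatory")
        if self.startdatetime > self.enddatetime:
            raise ValueError("startdatetime must not be after enddatetime")


query = SentimentQuery(
    sentiment_object="IBM",
    startdatetime=datetime(2012, 1, 1),
    enddatetime=datetime(2012, 2, 1),
    feature="price",
)
```

Validating the query once, at construction time, keeps the same checks out of every GUI component that issues a retrieval.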

The retrieval can address (1) the document level, or (2) an aggregated sentiment index over several documents.

Document level

At the document level, the result of the query is a list of documents, retrieved with the getSentimentDocumentObject service. Each document contains sentiment that matches the constraints specified by the user. Along with each document, the appropriate document metadata (getDocumentMetadata service) could be displayed, for instance:

- title,

- publication date,

- author,

- url of the original article.

Further, the document level sentiment data should be displayed:

- sentiment object

- sentiment feature

- sentiment subfeature (if applicable, i.e., with the reputation use case)

- sentiment score in [-1,1], possibly accompanied by a graphical display showing the direction (>0 positive, <=0 negative) and intensity () of the sentiment. This is redundant with the degrees of membership that will be introduced in 2012.

- sentiment positive degree of membership interval label (e.g., small amount)

- sentiment negative degree of membership interval label

- sentiment holder’s certainty degree of membership interval label (that is, how certain was the author about the sentiment) [no guarantee on that]

Along with the sentiment attributes, the plain document should be available on request for each of the documents in the list.

By means of the sentiment inspection services, the user can request a drill-down (to sentiment on the sentence level, etc.) for each document in the result list.

Sentiment Index

The query specification is the same as for the document level. However, instead of a set of documents, the result is an aggregated sentiment: the average of the document-level sentiments of all matched documents that refer to the same feature of the sentiment object. The appropriate services would be:

- getSentimentIndexScoreObject

- getSentimentIndexDOMObject
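
The aggregation described above can be illustrated with a short sketch. It assumes an unweighted arithmetic mean (the text only says "average") and a hypothetical record layout with object, feature, and score keys; it does not call the actual services, whose signatures are not specified here.

```python
from collections import defaultdict
from statistics import mean


def sentiment_index(document_sentiments):
    """Average document-level scores per (object, feature) pair."""
    grouped = defaultdict(list)
    for item in document_sentiments:
        grouped[(item["object"], item["feature"])].append(item["score"])
    return {key: mean(scores) for key, scores in grouped.items()}


# Hypothetical document-level sentiments matched by one query.
matched = [
    {"object": "IBM", "feature": "price", "score": 0.5},
    {"object": "IBM", "feature": "price", "score": -0.25},
    {"object": "Deutsche Bank", "feature": "reputation", "score": 0.75},
]
index = sentiment_index(matched)
```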

Sentiment on the index level is displayed in the same way as on the document level:

- sentiment object

- sentiment feature

- sentiment subfeature (if applicable, i.e., with the reputation use case)

- sentiment score in [-1,1], possibly accompanied by a graphical display showing the direction (>0 positive, <=0 negative) and intensity () of the sentiment. This is redundant with the degrees of membership that will be introduced in 2012.

- sentiment positive degree of membership interval label (e.g., small amount)

- sentiment negative degree of membership interval label

- sentiment holder’s certainty degree of membership interval label (that is, how certain was the author about the sentiment) [no guarantee on that]

By means of the sentiment inspection services, the user can request a drill-down to sentiment on the document level.

Annex b. Sentiment Inspection (Drill-down)

The sentiment inspection should be a universal functionality that can be used in several places of the GUI(s) to drill down and inspect the components of each sentiment on different levels. It proceeds top-down: from the sentiment index to the documents that make it up, and from each document to its sentences.

Annex i. Sentiment Index Inspection

The user would select a sentiment index, possibly provided by the sentiment retrieval part of the GUI. The inspection would list all documents that were used for aggregating sentiments that refer to the same object and feature. The appropriate service is getSentimentDocumentObject; this would require specifying the same query attributes as for retrieving the sentiment index. Alternatively, some caching of the document-level sentiments that make up the index might be appropriate.

The attributes and metadata to be shown per document are specified in chapter 2.1. Further, in the inspection part, the different versions of each document could be made available. These versions are created by the information processing pipeline in WP3 and WP4 and contain the output of different steps (e.g., HTML, plain text, pre-processed GATE XML, sentiment classification results). These document versions would be available via a network-attached file storage.

For each of the documents, a further drill down to the sentence level should be possible on request of the user.

Annex ii. Sentiment Document Inspection

The user would select a document (returned by sentiment retrieval) containing different sentiments with respect to different sentiment objects and different sentiment features. This inspection functionality should work on the basis of the plain text of a document. The user would be able to select from a list of all available combinations of sentiment object, sentiment feature, and sentiment subfeature for which at least one sentiment was found in the document (e.g., a document could contain sentiment on (1) the price of IBM stock and (2) the reputation of Deutsche Bank with respect to business behaviour, a subfeature of reputation). The appropriate service to be utilized would be getSentimentDocument.

Based on the selection of the user, all sentiments on the sentence level that refer to the selected sentiment object and sentiment feature (and subfeature, if applicable) should be marked (in colour) in the document. When the user selects one of these sentences, its sentiment data (polarity, score, ?) would be displayed.

On request, a further drill-down per document or per sentence would be available; see the next subsection.

Annex iii. Sentiment Sentence Inspection

The user would select either a whole document or a sentence containing sentiment retrieved by one of the steps above. The deep sentence inspection works on the same basis as the sentiment document inspection described above. In addition, it would display the text parts (phrases) of each sentence in the document currently being looked at. These phrases were extracted for sentiment analysis and allow the user to trace the sentiment polarity classifications. The types of phrases to be displayed are:

- Sentiment object

- Sentiment feature

- Sentiment subfeature (if applicable)

- Sentiment orientation

- Indicator

- Linguistic modifiers (e.g., intensifiers, diminishers, neutralizers etc.)

Further, if a sentiment on the sentence level is an indirect one, the indicator that was used to infer the sentiment and its associated data can be displayed, for instance the name and URI of the indicator and the polarity of the indicator.

The appropriate services to be used would be getPhrases and getIndicator.

Annex c. Aggregated Sentiment Time Series Visualization

The premium functionality of sentiment visualization is to display time series of the sentiment index that refer to a specific (sub-)feature of a sentiment object.

The user needs to specify the following sentiment attributes for the retrieval:

- Name/URI of the sentiment object (mandatory)

- Name/URI of the feature of the sentiment object (optional)

- Name/URI of the subfeature of the sentiment object (optional)

- startdatetime – start of the time period to retrieve sentiments

- enddatetime – end of the time period to retrieve sentiments

- aggregationinterval – e.g., daily or weekly

For each interval (e.g., daily) between startdatetime and enddatetime (which could span, for instance, one year), one query to the getSentimentIndexScoreObject or getSentimentIndexDOMObject service has to be issued. Each query delivers one sentiment index instance for the aggregationinterval. Together, the index instances make up the sentiment index time series.
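
Splitting the requested period into per-query intervals could look like the sketch below. The function name is hypothetical, and half-open intervals with a shortened final interval are assumptions; the service call itself is omitted because its signature is not specified here.

```python
from datetime import datetime, timedelta


def aggregation_intervals(startdatetime, enddatetime, step):
    """Yield consecutive [start, end) intervals covering the requested period."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    current = startdatetime
    while current < enddatetime:
        # The last interval is cut short if the period is not a multiple of step.
        upper = min(current + step, enddatetime)
        yield current, upper
        current = upper


# One week at daily aggregation gives seven intervals, i.e., seven queries.
intervals = list(
    aggregation_intervals(
        datetime(2012, 1, 1), datetime(2012, 1, 8), timedelta(days=1)
    )
)
```

Each yielded pair would serve as the time boundaries of one index query, and the results, in order, form the time series.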

For the score, the time series consists of values in [-1,1], which would be displayed graphically. Useful extensions would be scrolling and zooming within the time series. It should be possible to apply a moving average noise filter (with a length specified by the user) to the time series. The sentiment index time series can be accompanied by other time series, such as the price of a financial instrument over the same time period.
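
A moving average filter with a user-specified length might be sketched as follows. This version is a trailing simple moving average; averaging the first points over the values seen so far is an assumption made to keep the output the same length as the input.

```python
def moving_average(series, length):
    """Trailing simple moving average over at most `length` points."""
    if length < 1:
        raise ValueError("length must be at least 1")
    smoothed = []
    window_sum = 0.0
    for i, value in enumerate(series):
        window_sum += value
        if i >= length:
            # Drop the value that has just left the window.
            window_sum -= series[i - length]
        smoothed.append(window_sum / min(i + 1, length))
    return smoothed


# Hypothetical sentiment index scores in [-1, 1].
scores = [0.5, -0.5, 0.25, 0.75, -0.25]
smoothed = moving_average(scores, 2)
```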

For the degree of membership intervals, a transformation to a metric scale is required. An ad-hoc solution could be:

- no amount => 0

- small amount => 1

- medium amount => 2

- large amount => 3

- maximum amount => 4
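
The ad-hoc mapping above translates directly into a lookup table. In this sketch the names MEMBERSHIP_SCALE and to_metric are hypothetical, and normalising the label's case and surrounding whitespace is an added convenience, not part of the specification.

```python
# Ad-hoc metric scale for the degree of membership interval labels.
MEMBERSHIP_SCALE = {
    "no amount": 0,
    "small amount": 1,
    "medium amount": 2,
    "large amount": 3,
    "maximum amount": 4,
}


def to_metric(labels):
    """Translate membership interval labels into the 0-4 scale."""
    return [MEMBERSHIP_SCALE[label.strip().lower()] for label in labels]


series = to_metric(["small amount", "Large amount", "no amount"])
```

An unknown label raises a KeyError, which surfaces unexpected service output instead of silently plotting a wrong value.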