15
Towards a Benchmarking Framework for Financial Text Mining Caslav Bozic 1 , Ryan Riordan 2 , Detlef Seese 1 , and Christof Weinhardt 2 1 Institute of Applied Informatics and Formal Description Methods, KIT {bozic, detlef.seese}@kit.edu 2 Institute of Information Systems and Management, KIT {ryan.riordan, weinhardt}@kit.edu Summary. Different data mining methods for financial texts and various sentiment mea- sures are described in the existing literature, without common benchmarks for comparing these approaches. The framework proposed in this paper and the corresponding imple- mented system facilitate combining more sources of financial data into comprehensive integral dataset. The use of the dataset is then illustrated by analyzing the candidate measures by estimating parameters of regression on different returns and other financial indicators that can be defined using system’s novel data transformation approach. 1 Introduction Different data mining methods for financial texts are described in the literature, and systems that apply these particular methods have been implemented. For an evaluation of the proposed methods authors use different approaches, some of them incorporating assumptions that do not conform to real financial markets’ conditions. This paper presents the current status of a project attempting to offer a possibility to compare performance of these systems by offering a framework, a base dataset, and an implementation of the system according to design science guidelines from Hevner et al. (2004). The research is a part of the project FINDS (Financial News and Data Services) initiated within Information Management and Market Engineering Graduate School at the Karlsruhe Institute of Technology. Project FINDS has the goal to conduct innovative research on the analysis of quantitative and qualitative information related to financial markets and to provide services that can help both researchers in financial field, and also professional traders (Bozic, 2009). The paper is organized as follows: in section 2 we give an overview of the existing literature in the field. Section 3 describes data sources for the comprehensive research set, preprocessing methods used on data and gets the reader acquainted with the dataset by presenting some descriptive statistics. Section 4 illustrates the usage of the framework and the system by comparing four sentiment measures. Section 5 concludes the paper and describes the ideas for future research.

Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Frameworkfor Financial Text Mining

Caslav Bozic1, Ryan Riordan2, Detlef Seese1, and Christof Weinhardt2

1 Institute of Applied Informatics and Formal Description Methods, KIT{bozic, detlef.seese}@kit.edu

2 Institute of Information Systems and Management, KIT{ryan.riordan, weinhardt}@kit.edu

Summary. Different data mining methods for financial texts and various sentiment mea-sures are described in the existing literature, without common benchmarks for comparingthese approaches. The framework proposed in this paper and the corresponding imple-mented system facilitate combining more sources of financial data into comprehensiveintegral dataset. The use of the dataset is then illustrated by analyzing the candidatemeasures by estimating parameters of regression on different returns and other financialindicators that can be defined using system’s novel data transformation approach.

1 Introduction

Different data mining methods for financial texts are described in the literature,and systems that apply these particular methods have been implemented. For anevaluation of the proposed methods authors use different approaches, some of themincorporating assumptions that do not conform to real financial markets’ conditions.This paper presents the current status of a project attempting to offer a possibilityto compare performance of these systems by offering a framework, a base dataset,and an implementation of the system according to design science guidelines fromHevner et al. (2004).

The research is a part of the project FINDS (Financial News and Data Services)initiated within Information Management and Market Engineering Graduate Schoolat the Karlsruhe Institute of Technology. Project FINDS has the goal to conductinnovative research on the analysis of quantitative and qualitative informationrelated to financial markets and to provide services that can help both researchersin financial field, and also professional traders (Bozic, 2009).

The paper is organized as follows: in section 2 we give an overview of the existingliterature in the field. Section 3 describes data sources for the comprehensiveresearch set, preprocessing methods used on data and gets the reader acquaintedwith the dataset by presenting some descriptive statistics. Section 4 illustrates theusage of the framework and the system by comparing four sentiment measures.Section 5 concludes the paper and describes the ideas for future research.

Page 2: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

22 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof Weinhardt

2 Related Work

Many researchers have studied how published news influence market reactions.The methodologies used range from a fairly simple content analysis using classicalstatistical tools, to complex machine-learning methods. The approaches vary froman engineering approach which focuses on implementation and proving economicrelevance, to chiefly theoretical approaches whose goal is to describe underlyingeconomic phenomena.

Mittermayer and Knolmayer (2006a) compare eight text-mining systems, includ-ing their own. Since more technical performance criteria are often missing, it isnot possible to draw clear conclusions about relative performance. Wuthrich et al.(1998) classify news articles published overnight on web portals into three categoriesdepending on their influence on the one of five equity indices: Dow Jones, Nikkei,FTSE, Hang Seng, and Straits Times. With this system they attempt to forecast thetrend of the index daily value one day ahead. They use Naıve Bayes, Nearest Neigh-bour and Neural Net classifier, and a hand-crafted underlying dictionary. Lavrenkoet al. (2000) use Naıve Bayes classifier to classify news articles from Yahoo!Financeinto five groups, according to the influence on particular U.S. stocks. The featureswere determined automatically and the forecast horizon was from five to ten hours.Gidofalvi and Elkan (2003) use again naıve Bayes classifier with three categories torecognize articles which have bigger positive or negative influence on constituentsof Dow Jones index. With features defined using mutual information measurethey work on ten minutes aggregated intraday data. Fung, Yu, and Lam (2003)partially use commercially available text mining systems to predict a price trend forintraday market movements of some of the stocks listed on the Hong Kong StockExchange. For classification purposes they use support vector machines. Finally,Mittermayer and Knolmayer (2006b) propose a high frequency forecast system thatclassifies press releases of publicly traded companies in the U.S. using a dictionarythat combines automatically selected features and a hand-crafted thesaurus. Forclassification the authors use the polynomial version of SVM.

While most of the works focus on predicting price trends of single stock or index,there are works that aim at determining influence of news releases to volatility.Thomas (2003) improves risk-return profile by exiting the market in case of newsthat are predicting high volatility, while Schulz, Spiliopoulou, and Winkler (2003)attempt to classify press releases of German public companies according theirinfluence on volatility of stock prices.

Another group of publications not included in the survey by Mittermayer andKnolmayer (2006a) contains works that do not primary attempt to prove economicalrelevance of published text by evaluating specifically tailored trading strategies,but rather to find statistically relevant relations between financial indicators andsentiment extracted from the text.

Antweiler and Frank (2004) use Naıve Bayes and SVM classifiers to classifymessages posted to Yahoo!Finance and Raging Bull and determine their sentiment.

Page 3: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 23

They do not find statistically significant correlation with stock prices, but they findsentiment and volume of messages significantly correlated to trade volumes andvolatility. In their methodological paper Das and Chen (2007) offer a variety ofclassifiers, as well as composed sentiment measure as a result of voting amongclassifiers. In the illustrative example they analyze Yahoo stock boards and stockprices of 8 technology companies, but they do not find clear evidence that thesentiment index can be predictive for stock prices.

There are two pivotal articles published in the Journal of Finance. Tetlock (2007)observes Wall Street Journal’s column ”Abreast of the Market”, uses content analysissoftware General Inquirer together with Principal Component Analysis approachand finds that high pessimism in published media predicts downward pressure onmarket prices. Authors of Tetlock, Saar-Tsechansky, and Macskassy (2008) succeededto find that rate of negative words in news stories about certain company predictslow earnings of the company.

If we observe text mining methodologies as a transformations that assign numer-ical value to every textual string, we can refer to that numerical value as sentimentindex. All publications from the former group have at least implicit statementsabout the predictive power of the specific sentiment index on e.g. returns or volatil-ity. Following the evaluation approach from the latter group of publication, we aimat providing financial text mining research community with a framework and a toolthat can be used for proving their statements using statistical significance criteria.

3 Data

3.1 Data Sources

We use Thomson Reuters TickHistory data and the output from the Reuters NewsS-cope Sentiment Engine as a source dataset. These data sources are convenientbecause they provide access to trading data over period of more than 10 years, andextensive amount of sentiment data related to financial news stories.

Data constituting the output of the Reuters NewsScope Sentiment Engine rep-resents the author’s sentiment measure for every English-language news itempublished via NewsScope in years 2003-2008. The measure classifies a news iteminto one of three categories: positive, negative, or neutral. The probability of thenews item falling into each of the categories is also given.

Thomson Reuters TickHistory data is available through the DataScope platform.We use daily data for the period equivalent to the NewsScope Sentiment Enginedataset. This provides us with data on opening and closing prices for the particularproduct, bid and ask, as well as volume data. Other types of data about companiescoming from this source are the total amount of shares, used for calculating marketcapitalization, and paid dividends, that can be used for adjusting the returns.Additionally, it is the source of the data about daily values of the MSCI Indices forindividual countries, as well as MCSI World Index.

Page 4: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

24 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof Weinhardt

Field Name Description Sourceric Reuters Instrument Code THd Date THOpen Opening daily price THHigh Maximal price within the day THLow Minimal price within the day THvol Daily trading volume THLast Closing daily price THBid Average bid THAsk Average ask THspread Average spread THcc Close to close daily return of the equity Doc Open to close daily return of the equity Doo Open to open daily return of the equity Dco Close to open daily return of the equity Dcnt Number of news items mentioning company RNSEpnaccnt Number of news stories mentioning company RNSEnsent pos Number of positive news items mentioning company RNSEnsent neut Number of neutral news items mentioning company RNSEnsent neg Number of negative news items mentioning company RNSEavgsent Average sentiment RNSEasent pos Average probability that news items are positive RNSEasent neut Average probability that news items are neutral RNSEasent neg Average probability that news items are negative RNSEnet sent Net sentiment: asent pos - asent neg Dnet sent std Standard deviation of net sentiment Dric market Abbreviation of company’s home market THcountry Country Dicc Close to close daily return of the country index Dioc Open to close daily return of the country index Dioo Open to open daily return of the country index Dico Close to open daily return of the country index Dcnt country Number of news items related to country RNSEavgsent item Average sentiment (country level) RNSEeccc Excess daily close to close return w/r to country index Decoc Excess daily open to close return w/r to country index Decco Excess daily close to open return w/r to country index Decoo Excess daily open to open return w/r to country index D

Table 1: Main fields and sources (TH - TickHistory,RNSE - Sentiment Engine, D - derived)

Page 5: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 25

Reuters NewsScope Sentiment Engine data has two main properties - the times-tamp of the news item publishing, and the related company mentioned in the newsitem. It is preprocessed in a way that the records are aggregated on the level of eachcompany, and also calendar day - according to local time in force at the location ofcompany’s home market. In that way we get an average sentiment for a companyfor each day in 6 years period. The sentiments are expressed by different calculatedvalues. The first aggregated sentiment measure is average sentiment class - wherethe classes are represented by values 1, 0, and -1 for positive, neutral, and negativeclass, respectively. Further measures are average probabilities that each news itemfalls into positive, negative or neutral class. As the Reuters sentiment index usedfor the evaluation later we adopted a deducted value defined as average probabilityof positive class minus average probability of negative class within one day.

From the data on daily prices we derive data on different returns: namelyclose-to-open, close-to-close, open-to-open, and open-to close returns. This is doneconsidering only days with trades for particular equity or index. The generic way ofdefining calculated fields derived from source data is described in later section. Thecalculated returns are adjusted for paid dividends by increasing the price of shareon the trading day following the dividend payment by the amount of dividendspaid per share.

3.2 Descriptive Statistics

Reuters NewsScope Sentiment Engine data consists of author’s sentiment measuresof English-language news items published through Reuters NewsScope in theperiod from 2003 to 2006 inclusive. Each record represents a unique mention ofthe specific company, with a possibility of one news item relating to more thanone company. In our dataset there are 6,127,190 records about 10,665 differentcompanies. Figure 1 shows the top twenty companies with greatest number ofrecords related to that company. According to Reuters news production process,several news items can make one single news story, e.g. short alert item is publishedimmediately, and after some time extension of the same story is published as a newitem, or in the case of corrections. In the available data average number of newsitems per story is 1.995, saying that in average two items make one news story.

Increase in data volume over the years is obvious and yearly record numberdoubles over the period of 6 years, as shown on Figure 2. Besides, on the samefigure, one can follow increase in the partial volume of records related to differenthome markets. The information about a company’s home market is an integralpart of the Reuters code uniquely identifying each instrument. Figure 2 separatelyshows data about the ten markets with greatest changes over the years. It can beseen that the fraction of the two biggest US markets, NYSE and NASDAQ, in thetotal data volume has grown from something over 40% to little less than 60% in this6 years period.

Each news item can have multiple tags called topic codes, and topic codes aregrouped into categories. One of the categories represents countries, and can be used

Page 6: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

26 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof Weinhardt

0

5000

10000

15000

20000

25000

30000

35000

40000

45000

GE

NE

RA

L M

OT

OR

S

CIT

IGR

OU

P

FO

RD

MO

TO

R C

O

GE

NE

RA

L E

LE

C C

O

EX

XO

N M

OB

IL

JPM

OR

GA

N C

HA

SE

BP

MIC

RO

SO

FT

CP

BO

EIN

G C

O

GO

LD

M S

AC

HS

GR

P

FR

ED

DIE

MA

C

SO

NY

CO

RP

MA

RT

ST

OR

ES

ME

RR

ILL L

YN

CH

BA

NK

OF

AM

ER

ICA

FA

NN

IE M

AE

EA

DS

TO

YO

TA

MO

TO

R C

O

DE

UT

SC

HE

BA

NK

N

HS

BC

HO

LD

ING

S

0

5000

10000

15000

20000

25000

30000

35000

40000

45000

GE

NE

RA

L M

OT

OR

S

CIT

IGR

OU

P

FO

RD

MO

TO

R C

O

GE

NE

RA

L E

LE

C C

O

EX

XO

N M

OB

IL

JPM

OR

GA

N C

HA

SE

BP

MIC

RO

SO

FT

CP

BO

EIN

G C

O

GO

LD

M S

AC

HS

GR

P

FR

ED

DIE

MA

C

SO

NY

CO

RP

WA

L-M

AR

T S

TO

RE

S

ME

RR

ILL L

YN

CH

BA

NK

OF

AM

ER

ICA

FA

NN

IE M

AE

EA

DS

TO

YO

TA

MO

TO

R C

O

DE

UT

SC

HE

BA

NK

N

HS

BC

HO

LD

ING

S

Figure 1: Top twenty companies by number of records

400000

600000

800000

1000000

1200000

1400000

1600000 rest

T

VX

MI

AX

PA

TO

DE

0

200000

400000

600000

800000

1000000

1200000

1400000

1600000

2003 2004 2005 2006 2007 2008

rest

T

VX

MI

AX

PA

TO

DE

L

O

N

Figure 2: Data volume change over years

for determining what country is mentioned in the particular news item. Using thisauthor tagging feature, average sentiment per country can be calculated. The Figure3 is showing counties with greatest and lowest average sentiment, when countrytags with less than 1000 mentions are excluded.

During the week, only 2.19% of all news items are published on weekends.Figure 4 shows the distribution of records per days of week. Our dataset comprisesof data spanning 2192 days in total, considering GMT time zone, and there are even13 companies that are mentioned at least once in more than 80% of all days, asshown in Table 2. One more interesting property of the data is noticeable sentimentdecrease for news items published on Saturdays, as shown on Figure 5.

Page 7: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 27

Figure 3: Best and worst average sentiment for countries with over 1000 mentions

Company Days with news PercentGENERAL ELEC CO 1911 87.18SONY CORP 1891 86.27CITIGROUP 1863 84.99BP 1851 84.44WALT DISNEY CO 1841 83.99GENERAL MOTORS 1837 83.8EXXON MOBIL 1822 83.12WAL-MART STORES 1802 82.21BOEING CO 1798 82.03MICROSOFT CP 1792 81.75HSBC HOLDINGS 1791 81.71GOLDM SACHS GRP 1782 81.3FORD MOTOR CO 1754 80.02EADS 1752 79.93TOYOTA MOTOR CO 1729 78.88DEUTSCHE BANK N 1729 78.88VOLKSWAGEN AG 1720 78.47VODAFONE GROUP 1711 78.06CHEVRON 1704 77.74TOTAL FINA 1693 77.24Table 2: The 20 companies mentioned on most days

Page 8: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

28 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof Weinhardt

400000

600000

800000

1000000

1200000

1400000

1600000

0

200000

400000

600000

800000

1000000

1200000

1400000

1600000

MON TUE WED THU FRI SAT SUN

Figure 4: Number of records per days of week

0

0.02

0.04

0.06

0.08

0.1

-0.06

-0.04

-0.02

0

0.02

0.04

0.06

0.08

0.1

MON TUE WED THU FRI SAT SUN

Figure 5: Average sentiment per days of week

3.3 Preprocessing

To enable easy extending of the existing dataset, generic interfacing is one of thesystem’s features. It is generic in that aspect that it allows any format of theDataScope platform output to be loaded into the system automatically. New sourceto be added is placed in a comma separated values (CSV) file. The procedure ofgeneric load then starts, making new data source available for the next steps ofpreprocessing. The names of the new columns that can be later used for referencingare extracted from header row of the comma separated input file. The necessarytype transformations on the added data are conducted by means of metadatadictionary, which holds datatypes and formats related to the column names. In thisway new data is transformed into the representation that can be used as input ofthe successive steps.

Page 9: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 29

As our goal is to explore dependencies of various indicators and sentimentmeasures, another feature of the system is a novel approach to defining calculatedindicators. Any of the columns from the source datasets can be used as a basefor calculations. Besides full range of arithmetic operators, lagging operators canbe defined and used as well. The definition of the calculated fields is given inthe notation of simple grammar we defined. The definition is saved in controlfile, which is parsed and used for code generation. It produces PL/SQL code thatapplied to the source datasets produces additional datasets containing calculatedfields.

Final result is a clean dataset with derived fields that were not part of the sourcedatasets, designed for particular benchmarking purpose. The overview of the dataflow can be seen on Figure 6.Data Processing

10

Thomson ReutersTickHistory

Reuters NewsScopeSentiment Engine

Calculation &Aggregationaccording to

formal definition

Grammar AggregationTransformation

Clean Data

Benchmarking

Caslav Bozic - Towards Benchmarking Framework for Financial Text Mining

Figure 6: Overview of the Preprocessing system

Page 10: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

30 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof Weinhardt

4 First Results

As an illustration of framework application we performed comparison of foursentiment measures. Three of them are produced using classifiers that are imple-mented in the scope of FINDS Project. The first one is using Bayes-Fisher filter,second Support Vector Machines, while third implements artificial Neural Networksmethodology. The classifiers are trained on high frequency intraday data on trades,aggregated to 1 minute average prices. The news items are grouped into classesdepending on their influence on the price movements. Observing the period oftwo hours immediately following news item publication, we determine class of thenews item. If the price stabilized on the level higher than the level at the momentof news item publication, news item was classified as positive. If the price stabi-lized on lower level, news item was classified as negative. If the price stabilizationcouldn’t be observed, news item is considered neutral. More details about classifierimplementation can be found in Pfrommer et al. (2010). Training period was firstnine months of 2003, and news stories about five big technology companies (SAP,Oracle, Microsoft, IBM, and Apple) were used.

The last three months of 2003 represent the evaluation period, with total of 729news stories analyzed. The process of news transformation shown on Figure 7consists of text tokenizing, stemming, and finally text classification using selectedmethodology. Using the dataset and the framework we calculated parameters forregression of lagged dependent variables against each sentiment measure. Thevariables were four different daily returns, four daily excess returns, and averagedaily spread, all lagged up to ten days. The regression formula is given by idt =

α(id,i) + β(id,i)St−i + εt where id represents dependent variable, and i size of thelag. The results are presented in Table 3 and Table 4. The column title determinesdependent variable, and the row represents days of lag between sentiment measureS and dependent variable id. Each cell shows two values: estimated coefficientvalue β first, and then p value according to t-statistics.

The results show that just for 14 variables could be proven statistically relevantdependence on lagged sentiment values. Although RNSE data shows most of thestatistically significant results, seven, and all of them have positive coefficients, it isnot possible to give clear statement about comparative performance of sentimentmeasures. This issue possibly arises from the low data volume, causing the statisticaldependence not to be observed. To overcome the problem, regression parameterestimation for one of the measures is repeated with greater number of data points.RNSE measure is chosen because of the extensive volume of the data. Table 5presents the results from regression parameters estimation on data sample includingall six years of processed news stories in period 2003 to 2008 for two major U.S.markets contained in the system. All tested dependent variables show statisticallysignificant dependence on sentiment value with lag of at least one day. The increasein number of statistically relevant results shows that greater amount of data offers apossibility to draw more confident conclusions.

Page 11: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 31

Bay

es-F

ish

er

la

g cc

oo

co

oc

sp

read

eccc

ecoo

ecco

ecoc

1 -0

.008

24

0.11

-0

.003

16

0.54

-0

.006

41

0.15

-0

.001

82

0.48

0.

0001

3175

0.

45

-0.0

0646

0.

23

-0.0

0439

0.

40

-0.0

0386

0.

43

-0.0

026

0.24

2

-0.0

0212

0.

72

-0.0

0557

0.

19

-0.0

0046

567

0.93

-0

.001

65

0.54

-0

.000

1933

70.

18

-0.0

039

0.53

-0

.002

63

0.47

-0

.002

52

0.65

-0

.001

47

0.54

3

-0.0

0056

663

0.91

-0

.004

66

0.29

-0

.002

43

0.54

0.

0018

7 0.

49

-0.0

0063

3 0.

01

-0.0

028

0.53

-0

.000

9659

20.

79

-0.0

0301

0.

44

0.00

0298

81

0.90

4

-0.0

0646

0.

15

0.00

26

0.51

-0

.006

67

0.10

0.

0002

0725

0.

93

-0.0

0049

77

0.07

-0

.003

57

0.40

0.

0008

996

0.77

-0

.003

3 0.

40

-0.0

0020

749

0.91

5

-0.0

0032

704

0.94

-0

.000

1323

2 0.

97

0.00

193

0.61

-0

.002

25

0.32

-0

.000

1992

80.

30

0.00

223

0.60

-0

.001

48

0.64

0.

0024

2 0.

51

-0.0

0016

307

0.93

6

0.00

34

0.42

-0

.001

63

0.67

0.

0026

0.

43

0.00

0797

63

0.73

-0

.000

5966

0.

04

0.00

0362

64

0.92

-0

.000

8124

10.

83

0.00

0379

01

0.89

-0

.000

0045

5 1.

00

7 0.

0013

9 0.

75

0.00

642

0.12

-0

.002

34

0.47

0.

0037

3 0.

17

-0.0

0034

695

0.26

0.

0032

6 0.

41

0.00

584

0.17

-0

.000

9226

20.

73

0.00

424

0.11

8

0.00

172

0.69

-0

.000

9565

8 0.

81

0.00

291

0.36

-0

.001

19

0.65

-0

.000

121

0.54

0.

0016

5 0.

68

0.00

262

0.50

0.

0010

6 0.

70

0.00

0620

9 0.

80

9 -0

.003

43

0.27

-0

.005

7 0.

11

0.00

0230

65

0.93

-0

.003

66

0.08

-0

.000

4424

0.

01

-0.0

0181

0.

53

-0.0

0642

0.

09

0.00

206

0.42

-0

.003

85

0.05

10

-0

.004

11

0.19

-0

.003

8 0.

25

-0.0

0204

0.

49

-0.0

0207

0.

21

-0.0

0007

904

0.64

-0

.002

31

0.41

-0

.006

19

0.05

-0

.000

1007

40.

97

-0.0

0218

0.

12

S

V

M

lag

cc

oo

co

oc

spre

ad

ec

cc

ec

oo

ec

co

ec

oc

1

0.00

102

0.83

-0

.003

39

0.48

-0

.001

01

0.81

0.

0020

3 0.

40

-0.0

0003

017

0.76

-0

.001

04

0.85

0.

0012

7 0.

81

-0.0

027

0.59

0.

0016

9 0.

46

2 -0

.003

3 0.

56

-0.0

0377

0.

36

-0.0

0499

0.

31

0.00

169

0.52

-0

.000

0437

60.

65

-0.0

0535

0.

38

0.00

371

0.30

-0

.006

37

0.24

0.

0010

5 0.

65

3 -0

.003

25

0.48

-0

.003

13

0.46

-0

.005

68

0.14

0.

0024

3 0.

36

-0.0

0009

86

0.59

-0

.003

53

0.43

-0

.001

49

0.68

-0

.004

78

0.21

0.

0012

8 0.

58

4 -0

.002

81

0.54

-0

.001

86

0.64

-0

.004

5 0.

28

0.00

169

0.45

-0

.000

0466

70.

83

-0.0

0236

0.

58

-0.0

0109

0.

73

-0.0

0413

0.

30

0.00

179

0.34

5

-0.0

0022

28

0.96

-0

.000

5497

9 0.

89

-0.0

0288

0.

46

0.00

266

0.25

0.

0000

0991

0.

96

0.00

0273

64

0.95

0.

0011

9 0.

72

-0.0

0215

0.

56

0.00

239

0.22

6

0.00

218

0.62

0.

0015

6 0.

69

-0.0

0253

0.

47

0.00

471

0.05

-0

.000

2602

20.

41

0.00

148

0.72

0.

0039

0.

35

-0.0

0121

0.

70

0.00

27

0.28

7

-0.0

0014

802

0.97

0.

0021

0.

63

-0.0

0358

0.

29

0.00

344

0.22

-0

.000

0072

10.

99

0.00

29

0.48

0.

0049

2 0.

26

-0.0

0128

0.

66

0.00

416

0.13

8

-0.0

0127

0.

78

-0.0

0473

0.

25

-0.0

0147

0.

66

0.00

0199

47

0.94

0.

0004

5594

0.

13

0.00

114

0.80

-0

.004

54

0.29

0.

0004

2362

0.

89

0.00

0720

89

0.79

9

-0.0

0316

0.

33

-0.0

0257

0.

49

-0.0

0493

0.

09

0.00

177

0.42

-0

.000

1396

70.

57

-0.0

0421

0.

16

-0.0

0391

0.

31

-0.0

043

0.10

0.

0000

8207

0.

97

10

-0.0

0112

0.

73

0.00

43

0.21

-0

.004

34

0.16

0.

0032

2 0.

06

0.00

0465

03

0.13

-0

.000

3814

0.

90

0.00

532

0.12

-0

.003

34

0.23

0.

0029

7 0.

05

Tab

le

3:

Para

met

er

estim

ate

and

p

valu

e fo

r la

gged

se

ntim

ent

mea

sure

s pro

duce

d

by

Bay

es-F

isch

er

and

SVM

cl

assi

fier

sTa

ble

3:Pa

ram

eter

esti

mat

ean

dp

valu

efo

rla

gged

sent

imen

tm

easu

res

prod

uced

byBa

yes-

Fish

eran

dSV

Mcl

assi

fiers

Page 12: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

32 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof Weinhardt N

N

lag

cc

oo

co

oc

spre

ad

ec

cc

ec

oo

ec

co

ec

oc

1

0.00

0891

98

0.85

-0

.000

0506

2 0.

99

-0.0

0268

0.

52

0.00

357

0.12

0.

0000

888

0.65

-0

.001

34

0.78

0.

0025

6 0.

59

-0.0

035

0.43

0.

0021

5 0.

28

2 -0

.000

1854

40.

97

0.00

534

0.16

-0

.002

72

0.56

0.

0025

3 0.

30

-0.0

0010

813

0.53

0.

0019

0.

73

0.00

566

0.08

-0

.000

4352

90.

93

0.00

238

0.26

3

0.00

143

0.74

-0

.003

12

0.44

0.

0027

1 0.

45

-0.0

0128

0.

60

0.00

0144

92

0.67

0.

0011

0.

79

-0.0

0034

131

0.92

0.

0017

9 0.

61

-0.0

0067

498

0.75

4

0.00

0272

33

0.95

0.

0019

7 0.

58

-0.0

0103

0.

78

0.00

13

0.52

-0

.000

3276

50.

32

0.00

419

0.28

0.

0014

2 0.

62

0.00

282

0.44

0.

0014

6 0.

39

5 0.

002

0.62

-0

.000

1865

4 0.

96

0.00

0402

67

0.91

0.

0015

9 0.

43

-0.0

0022

739

0.30

0.

0063

2 0.

10

0.00

273

0.35

0.

0029

2 0.

38

0.00

333

0.05

6

-0.0

0051

28

0.89

0.

0037

7 0.

27

-0.0

0178

0.

55

0.00

126

0.55

-0

.000

2800

30.

44

0.00

258

0.43

0.

0029

8 0.

37

0.00

0482

3 0.

85

0.00

216

0.27

7

0.00

208

0.61

-0

.002

18

0.56

0.

0016

2 0.

58

0.00

0455

96

0.85

-0

.000

2035

20.

65

0.00

236

0.50

-0

.000

8988

70.

81

0.00

0923

07

0.71

0.

0014

5 0.

53

8 0.

0019

8 0.

61

0.00

0620

15

0.86

-0

.001

63

0.57

0.

0036

1 0.

12

-0.0

0043

306

0.19

0.

0009

4201

0.

80

0.00

0632

96

0.86

-0

.002

07

0.43

0.

0030

4 0.

17

9 -0

.002

06

0.47

-0

.001

65

0.61

-0

.002

99

0.23

0.

0009

3048

0.

62

-0.0

0031

224

0.24

-0

.001

76

0.49

-0

.001

82

0.58

-0

.002

54

0.25

0.

0007

9024

0.

65

10

-0.0

0188

0.

51

-0.0

0003

904

0.99

-0

.002

58

0.33

0.

0006

9155

0.

64

-0.0

0042

125

0.21

-0

.002

63

0.31

-0

.001

52

0.61

-0

.002

35

0.33

-0

.000

2421

5 0.

85

R

NS

E D

a

ta

lag

cc

oo

co

oc

spre

ad

ec

cc

ec

oo

ec

co

ec

oc

1

0.00

364

0.22

0.

0007

7126

0.

79

0.00

31

0.24

0.

0005

3878

0.

73

0.00

0469

26

0.04

0.

0033

8 0.

24

0.00

0766

88

0.79

0.

0030

2 0.

25

0.00

0368

11

0.79

2

0.00

346

0.27

0.

0008

1511

0.

75

0.00

0017

46

0.99

0.

0034

5 0.

03

0.00

0056

51

0.71

0.

0037

6 0.

22

0.00

0471

86

0.84

0.

0012

5 0.

64

0.00

251

0.07

3

-0.0

0237

0.

37

0.00

0501

75

0.87

-0

.002

87

0.21

0.

0005

086

0.74

0.

0003

7629

0.

10

-0.0

0106

0.

66

0.00

0985

42

0.74

-0

.002

33

0.27

0.

0012

9 0.

35

4 -0

.001

94

0.52

0.

0020

9 0.

48

-0.0

0107

0.

70

-0.0

0087

09

0.55

0.

0001

7053

0.

44

-0.0

0167

0.

58

0.00

106

0.72

-0

.000

5078

30.

86

-0.0

0117

0.

39

5 0.

0070

3 0.

02

0.00

419

0.11

0.

0032

2 0.

24

0.00

382

0.01

-0

.000

0289

60.

87

0.00

491

0.11

0.

0044

4 0.

07

0.00

134

0.63

0.

0035

8 0.

01

6 0.

004

0.14

0.

0036

6 0.

20

-0.0

0012

613

0.96

0.

0041

3 0.

01

-0.0

0010

191

0.63

0.

0029

0.

23

0.00

449

0.13

-0

.001

13

0.58

0.

0040

2 <0

.01

7 0.

0015

5 0.

61

0.00

255

0.34

-0

.000

0374

90.

99

0.00

159

0.30

0.

0002

2742

0.

31

-0.0

0065

236

0.83

0.

0003

3338

0.

89

-0.0

019

0.48

0.

0012

7 0.

37

8 0.

0012

6 0.

65

0.00

0688

39

0.81

0.

0010

3 0.

65

0.00

0232

26

0.88

-0

.000

1254

20.

54

-0.0

0267

0.

30

-0.0

0051

524

0.86

-0

.002

02

0.35

-0

.000

6356

3 0.

65

9 0.

0025

0.

37

-0.0

0054

481

0.83

0.

0016

2 0.

52

0.00

0878

12

0.53

-0

.000

1835

20.

38

0.00

208

0.46

-0

.002

7 0.

27

0.00

147

0.57

0.

0005

8709

0.

65

10

0.00

102

0.70

0.

0035

5 0.

16

-0.0

0107

0.

64

0.00

209

0.13

0.

0000

6798

0.

71

-0.0

0010

018

0.97

0.

0012

4 0.

60

-0.0

0204

0.

35

0.00

194

0.11

Tab

le 4

: Pa

ram

eter

est

imat

e an

d p

val

ue

for

lagged

sen

tim

ent

mea

sure

s pro

duce

d b

y N

N c

lass

ifie

r an

d f

rom

RN

SE d

ata

Ta

ble

4:Pa

ram

eter

esti

mat

ean

dp

valu

efo

rla

gged

sent

imen

tm

easu

res

prod

uced

byN

Ncl

assi

fier

and

from

RN

SEda

ta

Page 13: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 33

RN

SE

Dat

a 20

03-2

008

lag

cc

oo

co

oc

spre

ad

ec

c

eoo

ec

o

eoc

1 0.

0019

2 *

0.00

422

* 0.

0006

9627

* 0.

0012

2 *

-0.0

0022

379

0.00

13

0.00

159

* 0.

0031

5 *

0.00

0492

24

* 0.

0010

4 *

2 0.

0007

9088

*

0.00

0947

52

* 0.

0005

3439

0.00

010.

0002

565

0.03

23-0

.000

1892

80.

0061

0.

0003

2129

0.

0376

0.00

0627

96

0.00

030.

0001

8202

0.

1377

0.00

0104

81

0.35

82

3 0.

0005

4479

0.

0023

0.00

0470

96

0.01

03

0.00

0591

87*

-0.0

0004

708

0.69

09-0

.000

1986

30.

0040

0.

0001

3246

0.

4290

0.00

0088

23

0.62

460.

0002

2776

0.

0621

-0.0

0011

461

0.38

05

4 0.

0006

4604

0.

0002

0.00

0634

94

0.00

05

0.00

0537

41*

0.00

0108

63

0.34

76-0

.000

2188

30.

0016

0.

0001

5744

0.

3067

0.00

0147

38

0.38

370.

0001

4664

0.

2254

-0.0

0002

051

0.85

86

5 0.

0002

2031

0.

2161

0.00

0299

27

0.10

03

0.00

0407

970.

0031

-0.0

0018

766

0.11

16-0

.000

2129

70.

0022

-0

.000

1218

50.

4556

-0.0

0019

69

0.26

300.

0001

2828

0.

2879

-0.0

0028

489

0.02

34

6 0.

0002

0479

0.

2540

0.00

0482

02

0.00

86

0.00

0105

780.

4423

0.00

0099

01

0.41

02-0

.000

2263

30.

0012

-0

.000

1574

0.

3664

0.00

0034

79

0.85

18-0

.000

1108

90.

3569

-0.0

0006

389

0.64

92

7 0.

0002

5166

0.

1574

0.00

0297

64

0.10

07

0.00

0076

580.

5771

0.00

0175

09

0.13

47-0

.000

2703

80.

0001

0.

0001

9783

0.

1838

-0.0

0008

992

0.58

090.

0000

9277

0.

4398

0.00

0060

7 0.

5710

8

0.00

0254

73

0.15

29-0

.000

0316

4 0.

8623

0.

0003

9427

0.00

42-0

.000

1395

40.

2397

-0.0

0025

098

0.00

03

0.00

0017

77

0.92

07-0

.000

1344

0.

4789

0.00

0227

77

0.05

96-0

.000

2376

1 0.

1003

9

0.00

0172

0.

3371

0.00

0407

52

0.02

63

0.00

0114

370.

4064

0.00

0057

63

0.63

07-0

.000

2348

0.

0006

0.

0000

3302

0.

8616

0.00

0152

86

0.43

490.

0000

3762

0.

7555

-0.0

0002

812

0.85

24

10

0.00

0379

59

0.03

290.

0001

7503

0.

3333

0.

0002

8868

0.03

580.

0000

9091

0.

4365

-0.0

0024

143

0.00

06

0.00

0012

63

0.92

89-0

.000

0140

90.

9276

-0.0

0000

507

0.96

66-0

.000

0181

2 0.

8507

11

0.

0003

2344

0.

0748

0.00

0254

58

0.16

83

0.00

0375

990.

0060

-0.0

0005

256

0.66

95-0

.000

2185

80.

0018

-0

.000

0369

70.

8519

-0.0

0038

506

0.05

780.

0003

613

0.00

26-0

.000

4045

9 0.

0120

12

0.

0002

2131

0.

2169

0.00

0446

02

0.01

47

0.00

0180

890.

1862

0.00

0040

42

0.73

73-0

.000

2448

90.

0005

-0

.000

1286

10.

4582

0.00

0151

26

0.41

260.

0000

7943

0.

5070

-0.0

0024

499

0.07

97

13

0.00

0414

67

0.01

870.

0002

5657

0.

1517

0.

0003

6354

0.00

760.

0000

5112

0.

6588

-0.0

0024

455

0.00

06

0.00

0107

97

0.47

860.

0000

1232

0.

9405

0.00

0164

77

0.16

67-0

.000

0751

8 0.

5089

14

0.

0002

2718

0.

2052

0.00

0505

54

0.00

55

0.00

0136

180.

3213

0.00

0091

01

0.45

04-0

.000

2542

50.

0003

0.

0000

9126

0.

5439

0.00

0148

53

0.36

070.

0000

9409

0.

4330

-0.0

0004

179

0.70

59

15

0.00

0038

22

0.82

86-0

.000

0029

5 0.

9869

0.

0002

1168

0.12

18-0

.000

1734

60.

1348

-0.0

0021

112

0.00

27

-0.0

0009

376

0.50

45-0

.000

1868

20.

2264

0.00

0176

99

0.13

95-0

.000

2966

9 0.

0023

16

0.

0004

305

0.01

700.

0002

9846

0.

1048

0.

0003

6374

0.00

810.

0000

6676

0.

5825

-0.0

0021

967

0.00

17

0.00

0108

26

0.51

070.

0001

3612

0.

4411

0.00

0122

93

0.30

58-0

.000

0392

2 0.

7607

17

0.

0002

8462

0.

1203

0.00

0355

85

0.05

68

0.00

0253

240.

0647

0.00

0031

38

0.80

26-0

.000

2281

60.

0012

0.

0002

1946

0.

2327

0.00

0020

64

0.91

530.

0002

9143

0.

0153

-0.0

0009

058

0.55

10

18

0.00

0276

66

0.12

360.

0002

8981

0.

1121

0.

0002

1893

0.11

090.

0000

5773

0.

6304

-0.0

0023

1 0.

0008

0.

0000

9887

0.

5301

0.00

0228

95

0.17

900.

0001

3912

0.

2442

-0.0

0005

783

0.63

31

19

0.00

0161

56

0.36

940.

0001

5372

0.

4024

0.

0002

2874

0.09

29-0

.000

0671

70.

5804

-0.0

0018

249

0.00

84

-0.0

0007

894

0.61

60-0

.000

0632

60.

7096

0.00

0088

99

0.45

79-0

.000

1752

4 0.

1478

20

0.

0004

1741

0.

0200

0.00

0284

56

0.11

62

0.00

0367

310.

0077

0.00

0050

11

0.67

39-0

.000

1935

0.

0054

0.

0000

0844

0.

9614

-0.0

0001

209

0.94

650.

0000

9293

0.

4407

-0.0

0009

454

0.47

53

T abl

e5:

Para

met

eres

tim

ate

and

pva

lue

for

lagg

edse

ntim

ent

mea

sure

sfr

omR

NSE

data

inpe

riod

2003

to20

08(*

pva

lue

¡0.0

001)

Page 14: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

34 Caslav Bozic, Ryan Riordan, Detlef Seese, and Christof WeinhardtFirst Results

12 Caslav Bozic - Towards Benchmarking Framework for Financial Text Mining

Tokenizer

Stemmer

Classification

Reuters TakesNews Stories

Full Text of News Stories

3 classifiersBayes – FisherSVMNeural Network

Figure 7: Transformations of news data

5 Conclusion

In an attempt to define a benchmark for performance of sentiment measures forfinancial texts, we propose the framework and present implementation of the systemthat enables easy deriving of financial indicators to be used in performance measure.As an illustration we offer a use-case analysis of four different approaches, althoughnot giving any general conclusions about adequacy of particular approaches. Itis shown that due to the inadequate data volume conclusions are vague, but theincrease in the volume of analyzed data offers us a possibility to draw more robustconclusions.

Some of the future directions of the research would be extending the source base,including Compustat data on market capitalization and earnings of companies, andalso extending calculated dataset, to offer wider variety of indicators which couldbe compared against candidate sentiment measure. More sophisticated statisticalanalysis should be performed on the final dataset to be able to give more decisivestatements about general adequacy of text mining approaches in financial news.These could include using panel regression to control for inter-firm dependenciesand applying significance test that would account for possible heteroscedasticity inthe data.

Page 15: Towards a Benchmarking Framework for Financial Text Mining...partially use commercially available text mining systems to predict a price trend for intraday market movements of some

Towards a Benchmarking Framework for Financial Text Mining 35

References

Antweiler, W. and M. Z. Frank (2004): “Is All That Talk Just Noise? The InformationContent of InternetStock Message Boards,” The Journal of Finance, 59(3), pp. 1259–1294.

Bozic, C. (2009): “FINDS - Integrative services,” in: Computer Systems and Applications,2009. AICCSA 2009. IEEE/ACS International Conference on, pp. 61–62.

Das, S. and M. Chen (2007): “Yahoo! for Amazon: Sentiment extraction from smalltalk on the web,” Management Science, 53(9), pp. 1375–1388.

Fung, G. P. C., J. X. Yu, and W. Lam (2003): “Stock prediction: Integrating textmining approach using real-time news,” in: Computational Intelligence for FinancialEngineering, 2003. Proceedings. 2003 IEEE International Conference on, pp. 395 – 402.

Gidofalvi, G. and C. Elkan (2003): “Using news articles to predict stock price move-ments,” Department of Computer Science and Engineering, University of California,San Diego.

Hevner, A. R., S. T. March, J. Park, and S. Ram (2004): “Design Science in InformationSystems Research,” MIS Quarterly, 28(1), pp. 75–105.

Lavrenko, V., M. Schmill, D. Lawrie, P. Ogilvie, D. Jensen, and J. Allan (2000):“Mining of Concurrent Text and Time-Series,” in: Sixth ACM SIGKDD InternationalConference on Knowledge Discovery and Data Mining.

Mittermayer, M. and G. Knolmayer (2006a): “Text mining systems for marketresponse to news: A survey,” Institute of Informations Systems, University of Bern.

Mittermayer, M.-A. and G. F. Knolmayer (2006b): “NewsCATS: A News Categoriza-tion and Trading System,” Data Mining, IEEE International Conference on, 0, pp.1002–1007.

Pfrommer, J., C. Hubschneider, and S. Wenzel (2010): “Sentiment Analysis on StockNews using Historical Data and Machine Learning Algorithms,” Term Paper,Karlsruhe Institute of Technology (KIT).

Schulz, A., M. Spiliopoulou, and K. Winkler (2003): “Kursrelevanzprognose vonAd-hoc-Meldungen: Text Mining wider die Informationsuberlastung im MobileBanking,” Wirtschaftsinformatik, 2, pp. 181–200.

Tetlock, P. (2007): “Giving Content to Investor Sentiment: The Role of Media in theStock Market,” Journal of Finance, 62(3).

Tetlock, P., M. Saar-Tsechansky, and S. Macskassy (2008): “More Than Words:Quantifying Language to Measure Firms’ Fundamentals,” Journal of Finance, 63(3),pp. 1437–1467.

Thomas, J. (2003): News and trading rules, Ph.D. thesis, Carnegie Mellon University.Wuthrich, B., D. Permunetilleke, S. Leung, V. Cho, J. Zhang, and W. Lam (1998):

“Daily prediction of major stock indices from textual www data,” 1998 IEEEInternational Conference on Systems, Man, and Cybernetics.