

Department of Computing, University of Surrey

Department of Accounting, Finance & Management, University of Essex

SURREY: K. Ahmad (Principal Investigator), T. Taskaya-Temizel, D. Cheng, L. Gillam, S. Ahmad, M. Casey, H. Traboulsi, and ESSEX: J. Nankervis

Supported by the ESRC, Project Number: RES-149-25-0028

Web page: http://www.computing.surrey.ac.uk/grid/fingrid/

Financial INformation GRID An ESRC e-Social Science Pilot

UniS

Introduction

Social science research requires the capture and analysis of data that is quantitative (numerical data) and data that is qualitative (opinions expressed in language or other sign systems).

The fusion of multi-modal information is critical to social science research.

Large volumes of such data are now being made available. Decision making in financial and political economics, both by researchers and by financial traders, now involves analysis of streaming time series data and financial and political news (c. 2 GB per year). The confirmation or rejection of theories related to efficient markets, for example, or the generation of buy or sell signals, involves statistical and linguistic analysis of the streaming data.

Financial and political analysis requires data over short time periods (daily) or longer time periods (5-10 years). This is a large volume of data that requires instant processing, much like the data emerging from particle or gene factories, except that in our case the data is in two or more modalities.

Achievements

A three-tier architecture has been implemented using Globus Toolkit 3.0 and the Java CoG Kit (GRAM, GASS, GridFTP) on a 24-machine cluster. The cluster is connected to the Reuters Financial Services streamer.
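The pattern of splitting a job into tasks, running them on separate workers and merging the results can be sketched on a single machine. The example below is only an illustration under stated assumptions: Python threads stand in for cluster nodes, the function names `bootstrap_means` and `distributed_bootstrap` are hypothetical, and none of this uses the Globus or CoG Kit interfaces. Because it runs threads in one interpreter, it shows the task split and merge only and will not by itself reproduce any speed-up.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def bootstrap_means(data, reps, seed):
    """One worker's task: `reps` bootstrap replications of the sample mean."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)]


def distributed_bootstrap(data, total_reps=1000, workers=4):
    """Split the replications into per-worker tasks, run them, merge results."""
    base, extra = divmod(total_reps, workers)
    # Each task gets its share of replications and a distinct seed
    # (hypothetical scheme: the worker index is used as the seed).
    tasks = [(base + (1 if i < extra else 0), i) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(bootstrap_means, data, reps, seed)
                   for reps, seed in tasks]
        # Gather the partial results in task order.
        return [m for f in futures for m in f.result()]


# Synthetic data, for illustration only.
sample = [random.Random(1).gauss(0.0, 1.0) for _ in range(50)]
means = distributed_bootstrap(sample, total_reps=1000, workers=4)
print(len(means), "replications; standard error of the mean:",
      statistics.stdev(means))
```

On a real cluster each task would be submitted to a remote node by the grid middleware; only the split and merge logic carries over from this sketch.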

SATISFI: Fusing Numeric & Textual Information on the Grid

SATISFI, SUPPORTED BY SURREY GRID, PROVIDES THREE SERVICES:

1. NEWS ANALYSIS service for extracting MARKET SENTIMENT.

2. MARKET SENTIMENT correlation with FINANCIAL TIME SERIES.

3. BOOTSTRAPPING service for computing standard errors and confidence intervals, and for hypothesis testing, by simulation of the TIME SERIES or MARKET SENTIMENT SERIES.
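Services 2 and 3 can be illustrated together with a minimal sketch: correlate a sentiment series with a return series, then bootstrap the correlation to obtain a standard error and a confidence interval. Everything here is an assumption made for illustration, not the SATISFI implementation: the data is synthetic, the function names are hypothetical, and a simple pairs bootstrap is used, which resamples observations independently and so ignores serial dependence in the series.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # Degenerate (constant) sample: treated as zero correlation,
        # an assumption made to keep the sketch simple.
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def bootstrap_correlation(sentiment, returns, reps=500, seed=0):
    """Pairs bootstrap of the correlation: resample (sentiment, return)
    pairs with replacement and recompute the statistic each time."""
    rng = random.Random(seed)
    n = len(sentiment)
    stats = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([sentiment[i] for i in idx],
                             [returns[i] for i in idx]))
    stats.sort()
    std_error = statistics.stdev(stats)
    # 95% percentile interval from the sorted bootstrap statistics.
    lower = stats[int(0.025 * reps)]
    upper = stats[int(0.975 * reps) - 1]
    return pearson(sentiment, returns), std_error, (lower, upper)


# Synthetic series, for illustration only.
gen = random.Random(42)
sentiment = [gen.gauss(0.0, 1.0) for _ in range(60)]
returns = [0.5 * s + gen.gauss(0.0, 1.0) for s in sentiment]

r, se, (low, high) = bootstrap_correlation(sentiment, returns, reps=500)
print(f"correlation={r:.3f}  std. error={se:.3f}  95% CI=({low:.3f}, {high:.3f})")
```

For serially dependent data a block bootstrap would be a more defensible choice; the pairs bootstrap is used here only because it is the shortest correct instance of the resampling idea.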

Reuters Financial Services Streaming Data and News Service

Research Objectives

The FINGRID project is a collaboration between financial economists and econometricians at Essex and computing academics, particularly in grid computing and artificial intelligence, at Surrey. The objectives are to:

• Create a Grid environment based on Open Grid Services Architecture.

• Provide a demonstrable software application, on the Grid, for analysing financial information in the form of quantitative and qualitative data.

• Evaluate the benefits of the Grid approach.

Next Steps

Investigate and evaluate Condor-G, MPICH2 and OGSA-DAI for effective job management, parallel processing and database management, respectively.

Towards a knowledge grid

PARALLEL and DISTRIBUTED KNOWLEDGE DISCOVERY:

Continual analysis and fusion of text and numerical data, both real-time and historical.

KNOWLEDGE GRID SERVICES:

KNOWLEDGE RETRIEVAL: Adapt information extraction methods and systems (e.g. Surrey’s SYSTEM QUIRK) to a Grid architecture for extended semantic analysis.

KNOWLEDGE MODELLING: Representation of non-stationary time series using Wavelet Analysis, Neural Networks and Fuzzy Logic, such that the system learns from its past experience.

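[Figure: system architecture. Streaming Textual Data and Streaming Numeric Data feed the Main Cluster, which hosts the Text and Time Series Service; a Client and a GRID Cluster of 24 Slaves are connected to it. Numbered steps (1-4) are labelled: Send Service Request, Distribute Tasks, Receive Results, Notify user about results.]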

[Figure (Surrey Grid): Speed-up for (a) bootstrapping and (b) text processing. Panel "Simple Bootstrapping": time in seconds (axis 0-2500) against number of machines (1, 2, 4, 8), with series Bootstrap rep=500 and Bootstrap rep=1000. Panel "Text Analysis": time in seconds (axis 0-600) against number of machines (1, 2, 4, 8), with series "Text Analysis (process time in ms)".]