Session 4 : Model Development and Stylized Facts

Agenda

Education

Session 1: Industry Introduction and Derivatives Overview
Session 2: Overview of Market Microstructure
Session 3: Prerequisites for Algorithmic Trading System (ATS) Development and Selecting a Platform
Session 4: Model Development and Stylized Facts
Session 5: Review of the Scientific Method and the ATS Development Process
Session 6: Formulation and Specification of a Strategy
Session 7: Backtesting, Optimization, Implementation, Risk Management

Research

Session 1: Workshop
Session 2: Workshop

Competition - 2 weeks (10 days)

Introducing Econometrics

In summary:

You can build a model around a hypothesis to test how financial market variables relate to one another.

To test the model, you need either sample or population data.

Introducing Econometrics

With the data, you run a regression to analyze the relationship between the dependent variable and the independent variable(s).

These relationships are assessed using the regression's statistical report, which contains statistics such as the p-value and R-squared.

Introducing Econometrics

The p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true, where the null hypothesis says there is no relationship in your model (or that any apparent relationship is due only to chance). It is not the probability that the null hypothesis is correct.

To set the threshold for rejecting the null, you have to state a significance level (for example 5%, which corresponds to a 95% confidence level). The critical values that go with that level depend on the distribution of your test statistic: chi-squared, normal, t, and so on. Which distribution applies depends on how much data and how many variables you have.
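As a rough illustration of how a threshold and a p-value fit together, the sketch below assumes a test statistic that follows a standard normal distribution under the null. The 5% level and the statistic of 2.5 are hypothetical values chosen for the example, not figures from the session.

```python
from statistics import NormalDist


def two_sided_p_value(z_stat: float) -> float:
    """Probability, under the null, of a statistic at least as extreme as z_stat."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z_stat)))


def critical_value(significance: float) -> float:
    """Two-sided critical value for a standard normal test statistic."""
    return NormalDist().inv_cdf(1.0 - significance / 2.0)


significance_level = 0.05   # hypothetical choice: 5% significance (95% confidence)
z = 2.5                     # hypothetical test statistic

p = two_sided_p_value(z)
print(f"critical value: {critical_value(significance_level):.2f}")
print(f"p-value: {p:.4f}, reject null: {p < significance_level}")
```

With a t or chi-squared statistic the same logic applies, but the critical value comes from that distribution instead of the normal.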

Introducing Econometrics

Normal Distribution Curve

Introducing Econometrics

Simple Regression

Econometric model example: what is the effect of education on income?

A regression with a single independent variable is known as a simple regression.

Introducing Econometrics

I = α + β1(E) + ε

where
α = a constant amount (what one earns with zero education);
β1 = the effect in dollars of an additional year of education on income, hypothesized to be positive; and
ε = the “noise” term reflecting other factors that influence earnings.

Introducing Econometrics

I = α + β1(E) + ε

Remember what is observable and what is not: The data set contains observations for I and E. The noise component ε consists of factors that are unobservable, or at least unobserved. The parameters α and β1 are also unobservable. The task of regression analysis is to produce an estimate of these two parameters, based upon the information contained in the data set and, as shall be seen, upon some assumptions about the characteristics of ε.

Introducing Econometrics

When we run a linear regression, we want the line that best describes the sample data, because we want to use it to describe the population.

Many lines can be drawn to describe the data using different mathematical techniques, but the standard technique in regression analysis is Ordinary Least Squares (OLS).

Introducing Econometrics

Ordinary Least Squares: the line for which the sum of the squared residuals (the vertical distances between each observation and the line) is at its minimum.
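A minimal sketch of the least-squares idea for the single-variable model I = α + β1(E) + ε, using the standard closed-form estimates. The education and income figures are hypothetical, and the helper names are made up for this example.

```python
def ols_simple(x, y):
    """Return (alpha, beta) minimizing the sum of squared residuals of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def sum_squared_residuals(x, y, alpha, beta):
    """Sum of squared vertical distances between the data and the line."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: years of education and income in thousands of dollars.
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [28, 35, 31, 42, 50, 47, 58, 66]

alpha_hat, beta_hat = ols_simple(education, income)
best = sum_squared_residuals(education, income, alpha_hat, beta_hat)
# Nudging the slope away from the OLS estimate can only increase the sum.
worse = sum_squared_residuals(education, income, alpha_hat, beta_hat + 0.5)

print(f"alpha = {alpha_hat:.2f}, beta1 = {beta_hat:.2f}")
print(f"SSR at OLS line = {best:.2f}, SSR at perturbed line = {worse:.2f}")
```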

Introducing Econometrics

Multiple Regression

Of course, income is affected by a variety of factors in addition to years of education - factors that were aggregated into the noise term before.

Multiple regression is a technique that allows additional factors to enter the analysis separately so that the effect of each can be estimated. It is valuable for quantifying the impact of various simultaneous influences upon a single dependent variable.

Introducing Econometrics

I = α + β1(E) + β2(A) + ε

where
β2 = the effect of an additional year of age on income
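One way to estimate α, β1 and β2 together is to solve the OLS normal equations. The sketch below does this in plain Python; the (education, age) pairs are hypothetical and the incomes are generated without noise from assumed coefficients, so the estimates should simply recover them.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_multiple(rows, y):
    """OLS estimates [alpha, beta1, beta2, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]          # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations of (years of education E, age A).
observations = [(12, 25), (16, 30), (12, 40), (18, 35), (14, 50), (20, 45)]
# Noise-free incomes built from assumed values alpha=5, beta1=2.5, beta2=0.4.
incomes = [5.0 + 2.5 * e + 0.4 * a for e, a in observations]

alpha_hat, beta1_hat, beta2_hat = ols_multiple(observations, incomes)
print(f"alpha = {alpha_hat:.3f}, beta1 = {beta1_hat:.3f}, beta2 = {beta2_hat:.3f}")
```

With real data the noise term is not zero, so the estimates will differ from the true parameters, and a statistics package would normally be used instead of hand-rolled elimination.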

Introducing Econometrics

I = α + β1(E) + β2(A) + ε

After running a regression on this model and looking at the statistical summary report, we look at two things for each variable in our simple analysis: the coefficient and its p-value.

The size of the p-value for a coefficient says nothing about the size of the effect that the variable has on your dependent variable. It is possible to have a highly significant result (very small p-value) for a minuscule effect.
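The difference between significance and effect size can be seen in a small simulation. The sketch below is built on hypothetical data: a true slope of only 0.05, a large sample, and a normal approximation for the p-value. None of these numbers come from the session.

```python
import math
import random


def slope_and_p_value(x, y):
    """OLS slope of y on x and its two-sided p-value (normal approximation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    alpha = mean_y - beta * mean_x
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    t_stat = beta / std_err
    # erfc keeps precision for very small tail probabilities.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return beta, p_value


rng = random.Random(0)                                # fixed seed, hypothetical data
x = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
y = [0.05 * xi + rng.gauss(0.0, 1.0) for xi in x]     # the true effect is tiny

beta_hat, p = slope_and_p_value(x, y)
print(f"slope = {beta_hat:.4f}, p-value = {p:.2e}")
```

The estimated slope stays small while the p-value is far below any usual threshold, because the large sample makes even a tiny effect easy to detect.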

Introducing Econometrics

So that’s a crude introduction to linear regressions.

We can also use nonlinear regressions to model relationships that are not linear.

Their statistical properties are more complex and the techniques are more sophisticated. Move forward with it if you dare!

Introducing Econometrics

Since income is a function of education and age [I(E,A)], these data points are plotted in three dimensions.

We apply ordinary least squares and find a plane of best fit (the three-dimensional counterpart of the line of best fit).

Stylized Facts

Previous econometric research on financial markets is already available (academic papers, regulatory studies, governmental studies). There is a great deal of it!

We learn stylized facts from this research (simplified presentations of empirical findings made in the social sciences).

Stylized Facts

Gathering basic stylized facts on the behavior of financial assets and their returns is an important research activity.

Knowing these facts can make building a model much easier or allow one to advance more quickly in their model development.

Stylized Facts

Autocorrelation

A time series sometimes repeats patterns, whereby earlier values have some relation to later values.

Autocorrelation is the statistic that measures the degree of that similarity.

The higher the autocorrelation, the more persistently patterns re-emerge later.
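A minimal sketch of the usual sample autocorrelation at a given lag. The two short series are hypothetical and chosen only to show a positive and a negative case.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((v - mean) ** 2 for v in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance_sum / variance_sum


trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]            # each value resembles the last
alternating = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]     # each value reverses the last

print(f"trending, lag 1:    {autocorrelation(trending, 1):+.2f}")
print(f"alternating, lag 1: {autocorrelation(alternating, 1):+.2f}")
```

A positive value means moves tend to continue, while a negative value means they tend to reverse.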

Stylized Facts

Goodhart (1989) and Goodhart and Figliuoli (1991) first reported the existence of negative autocorrelation of returns at high frequencies.

They based their study on the USD-DEM and their sampling period was from January 5, 1987 to January 5, 1993.

Stylized Facts

The autocorrelation function of returns measured at 1-minute intervals was plotted against its lags.

They found autocorrelation that was significant at the 95% confidence level for lags of up to 4 minutes.

The negative autocorrelations would last for up to 60 minutes at a time.
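To illustrate what "significant at 95%" can mean for an autocorrelation function, the sketch below flags lags whose sample autocorrelation falls outside the common approximate band of ±1.96/√n for uncorrelated data. The band, the simulated returns, and the reversal coefficient of 0.5 are assumptions for the example. The cited studies may have used a different test.

```python
import math
import random


def acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag, z=1.96):
    """Return (lag, autocorrelation) pairs outside the band +/- z / sqrt(n)."""
    band = z / math.sqrt(len(series))
    values = acf(series, max_lag)
    return [(k, r) for k, r in enumerate(values, start=1) if abs(r) > band]


# Hypothetical 1-minute returns in which each shock is partly reversed a minute later.
rng = random.Random(1)
shocks = [rng.gauss(0.0, 1.0) for _ in range(2001)]
returns = [shocks[t] - 0.5 * shocks[t - 1] for t in range(1, 2001)]

for lag, value in significant_lags(returns, max_lag=5):
    print(f"lag {lag}: autocorrelation {value:+.3f} is outside the 95% band")
```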

Stylized Facts

Goodhart demonstrated that the autocorrelation was not affected by news or the absence of major announcements. Goodhart and Figliuoli found that the pattern was not caused by prices bouncing between different areas with different information sets.

This negative autocorrelation was also found in some eurofutures contracts (Ballocchi 1999).

Stylized Facts

Extreme Risk in Financial Markets

Are there theoretical processes that can model and evaluate the type of fat tails that come out of our empirical analysis?

Yes. You can compute a tail index. This is a demanding task, but with the help of high-frequency data it is possible to achieve reasonable accuracy (Pictet 1998, Dacorogna 2001).

Stylized Facts

Extreme Risk in Financial Markets

They prove a theorem showing that more data improves the estimation of the tail index.

It's based on Extreme Value Theory. It's good to attempt these sorts of models because, when extreme events occur, no one will take the other side of your hedge. So you want to have contingency plans for certain kinds of extreme events.
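The slides do not say which tail-index estimator is meant. As one common choice from Extreme Value Theory, the sketch below uses the Hill estimator on the k largest observations. The sample is hypothetical: exact quantiles of a Pareto distribution with tail index 3, so the estimate should land close to 3. The choice of k is also an assumption, and in practice it strongly affects the result.

```python
import math


def hill_tail_index(values, k):
    """Hill estimate of the tail index from the k largest positive observations."""
    ordered = sorted(values, reverse=True)
    threshold = ordered[k]                          # the (k+1)-th largest value
    mean_log_excess = sum(math.log(v / threshold) for v in ordered[:k]) / k
    return 1.0 / mean_log_excess


# Hypothetical sample: evenly spaced quantiles of a Pareto law with tail index 3.
n = 10_000
sample = [((i + 0.5) / n) ** (-1.0 / 3.0) for i in range(n)]

estimate = hill_tail_index(sample, k=200)
print(f"estimated tail index: {estimate:.2f}")
```

A lower tail index means fatter tails, so extreme moves are more likely than a normal distribution would suggest.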

Stylized Facts

From Nanex.com:

“On October 19, 2011, the stock ENV went $9.48 to $0.02 in 1 second and then quickly recovered in pre-market trading. This is something the public has been told can no longer happen with the improvements made since the Flash Crash of May 6, 2010. Apparently, it can happen and as shown below, it was clearly an algorithm that took the price to almost zero in just one split second.”

Stylized Facts

Flash Equity Failures in 2006, 2007, 2008, 2009, 2010, and 2011

We have analyzed all listed equities for 2006, 2007, 2008, 2009 and 2010 for potential "mini crashes" in individual stocks. We were surprised at the number of incidents we found. 

Parameters used: To qualify as a down-draft candidate, the stock had to tick down at least 10 times before ticking up -- all within 1.5 seconds and the price change had to exceed 0.8%.

To qualify as an up-draft candidate, the stock had to tick up at least 10 times before ticking down -- all within 1.5 seconds and the price change had to exceed 0.8%.
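The parameters above can be turned into a simple screen. The sketch below is one possible reading of them, not Nanex's actual code: ticks are hypothetical (time in seconds, price) pairs, unchanged prices neither extend nor end a run, and the 1.5-second window is measured from the tick before the first move to the last move of the run.

```python
def find_drafts(ticks, direction, min_ticks=10, max_seconds=1.5, min_move=0.008):
    """Find runs of same-direction ticks that meet the quoted thresholds.

    ticks: list of (time_in_seconds, price); direction: -1 for down, +1 for up.
    Returns a list of (start_time, end_time, start_price, end_price).
    """
    if len(ticks) < 2:
        return []
    events = []
    start = end = None
    count = 0
    # A sentinel tick in the opposite direction flushes the final run.
    extended = list(ticks) + [(ticks[-1][0], ticks[-1][1] - direction)]
    for i in range(1, len(extended)):
        change = (extended[i][1] - extended[i - 1][1]) * direction
        if change > 0:                   # tick in the direction of interest
            if count == 0:
                start = i - 1
            count += 1
            end = i
        elif change < 0:                 # opposite tick closes the run
            if count >= min_ticks:
                (t0, p0), (t1, p1) = extended[start], extended[end]
                if t1 - t0 <= max_seconds and abs(p1 - p0) / p0 > min_move:
                    events.append((t0, t1, p0, p1))
            count = 0
    return events


# Hypothetical ticks: 11 down-ticks of one cent in 1.1 seconds, then an up-tick.
fast = [(round(0.1 * j, 1), round(10.00 - 0.01 * j, 2)) for j in range(12)]
fast.append((1.2, 9.95))

print(find_drafts(fast, direction=-1))   # down-draft candidates
print(find_drafts(fast, direction=+1))   # up-draft candidates
```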

Stylized Facts

Stub Quoting

Washington, D.C., Nov. 8, 2010 — The Securities and Exchange Commission approved new rules proposed by the exchanges and FINRA to strengthen the minimum quoting standards for market makers and effectively prohibit "stub quotes" in the U.S. equity markets.

A stub quote is an offer to buy or sell a stock at a price so far away from the prevailing market that it is not intended to be executed, such as an order to buy at a penny or an offer to sell at $100,000. A market maker may enter stub quotes to nominally comply with its obligation to maintain a two-sided quotation at those times when it does not wish to actively provide liquidity. Executions against stub quotes represented a significant proportion of the trades that were executed at extreme prices on May 6, and subsequently broken.

Stylized Facts

Stub Quoting

The SEC document states for securities in the S&P 500 or Russell 1000 indexes, market makers must enter quotes that are not more than 9.5% away from the NBBO. During the first 15 and last 25 minutes of trading, entered quotes must be no further than 21.5% away from the NBBO. For all other exchange-listed equities, it's 31.5%*.

Nanex processed all quotes for each of the last 4 days and printed out the time, symbol, and offending quote that was outside a generous 32% band. Here are the results (note how often a bid is at a penny!):

08/05/2011 - 1,021,115 stub quotes.
08/08/2011 - 1,056,795 stub quotes.
08/09/2011 - 1,315,940 stub quotes.
08/10/2011 - 942,894 stub quotes.
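The band test described above amounts to a percentage distance check against the NBBO. The sketch below uses the percentages quoted in the text. The function names, the flags, and the example prices are hypothetical, and the caller is assumed to pass the relevant side of the NBBO (best bid for a bid, best offer for an offer).

```python
def allowed_band(in_major_index, near_open_or_close):
    """Maximum allowed distance from the NBBO, as a fraction, per the quoted limits."""
    if not in_major_index:
        return 0.315
    return 0.215 if near_open_or_close else 0.095


def is_outside_band(quote_price, nbbo_price, band):
    """True when the quote is further from the NBBO than the band allows."""
    return abs(quote_price - nbbo_price) / nbbo_price > band


# Hypothetical quotes against a best bid of $25.00.
print(is_outside_band(0.01, 25.00, band=0.32))                          # penny bid
print(is_outside_band(24.00, 25.00, band=allowed_band(True, False)))    # 4% away
```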

Stylized Facts

“The charts below shows message traffic for each of the 12 multicast lines plus the total for CQS. Each multicast line carries quotes for a certain range of symbols alphabetically. For example, symbols beginning with letters A and B are transmitted on line1, C and D on line 2 and so forth.

On April 26, 2010, quote message rates shot up to capacity at 9:29:10. That means other legitimate quotes would incur queuing delays. Between 250 and 500 quote updates per symbol were blasted per second. Symbols were chosen in a way that each multicast line filled to capacity. This was not some random fluke, but a well timed sophisticated algorithm involving a precise count of quotes from multiple exchanges. We think this might be one of the earliest examples of a full feed quote stuffing algorithm either running in production or undergoing a test. Note that the Flash Crash occurs the following week.”
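A simple way to look for bursts like the one described is to count quote updates per symbol per second. The sketch below assumes a hypothetical feed of (timestamp in seconds, symbol) messages and borrows the lower figure in the quotation, 250 updates per symbol per second, as a flagging threshold. That threshold is an assumption for illustration, not a rule stated by Nanex.

```python
from collections import Counter


def quote_rates(messages):
    """Count quote updates per (symbol, whole second)."""
    return Counter((symbol, int(timestamp)) for timestamp, symbol in messages)


def flag_bursts(messages, threshold=250):
    """Return the (symbol, second) pairs with at least `threshold` updates."""
    rates = quote_rates(messages)
    return sorted(key for key, count in rates.items() if count >= threshold)


# Hypothetical feed: symbol AAA is updated 300 times within one second, BBB 5 times.
feed = [(10 + i / 300, "AAA") for i in range(300)]
feed += [(10 + i / 5, "BBB") for i in range(5)]

print(flag_bursts(feed))
```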

Stylized Facts

http://www.nanex.net/FlashCrash/OngoingResearch.html

Stylized Facts

The eMini 9:54:58 Event.

The following images show CME's eMini future (S&P 500) depth of book and trades. The images are rainbow (ROYGBIV) color coded by the relative size at each depth level. Red indicates a lot of size, violet indicates size approaching 0. Note that a full minute before each event, the depth starts cooling rapidly. The volume of contracts traded is represented at the bottom of the chart. Note the spike in volume at precisely 9:54:58.

This event is caused by the release of the Michigan Consumer Sentiment number at 9:55am twice a month. Apparently a premium subscription (approx $5,000) is available which provides a 2 second early release of this number.
