Transcript
Page 1: Using MongoDB As a Tick Database

Sr. Solution Architect, MongoDB

Matt Kalan

How Capital Markets Firms Use MongoDB as a Tick Database

Page 2: Using MongoDB As a Tick Database

Agenda

• MongoDB One Slide Overview

• FS Use Cases

• Writing/Capturing Market Data

• Reading/Analyzing Market Data

• Performance, Scalability, & High Availability

• Q&A

Page 3: Using MongoDB As a Tick Database

MongoDB Technical Benefits

Horizontally Scalable: Sharding

Agile & Flexible

High Performance: Indexes, RAM

Application

Highly Available: Replica Sets

{ name: "John Smith", date: "2013-08-01", address: "10 3rd St.", phone: [ { home: 1234567890 }, { mobile: 1234568138 } ] }

db.cust.insert({…})

db.cust.find({ name: "John Smith" })

Page 4: Using MongoDB As a Tick Database

Most Common FS Use Cases

1. Tick Data Capture & Analysis

2. Reference Data Management

3. Risk Analysis & Reporting

4. Trade Repository

5. Portfolio Reporting

Page 5: Using MongoDB As a Tick Database

Writing and Capturing Tick Data

Page 6: Using MongoDB As a Tick Database

Tick Data Capture & Analysis Requirements

• Capture real-time market data (multi-asset, top of book, depth of book, even news)

• Load historical data

• Aggregate data into bars, daily, monthly intervals

• Enable queries & analysis on raw ticks or aggregates

• Drive backtesting or automated signals

Page 7: Using MongoDB As a Tick Database

Tick Data Capture & Analysis: Why MongoDB?

• High throughput => can capture real-time feeds for all products/asset classes needed

• High scalability => all data and depth for all historical time periods can be captured

• Flexible & Range-based indexing => fast querying on time ranges and any fields

• Aggregation Framework => can shape raw data into aggregates (e.g. ticks to bars)

• Map-reduce capability (Native MR or Hadoop Connector) => batch analysis looking for patterns and opportunities

• Easy to use => native language drivers and JSON expressions that you can apply for most operational database needs as well

• Low TCO => Low software license cost and commodity hardware

Page 8: Using MongoDB As a Tick Database

Trades/metrics

High Level Trading Architecture

Feed Handler

Exchanges/Markets/Brokers

Capturing Application

Low Latency Applications

Higher Latency Trading Applications

Backtesting and Analysis Applications

Market Data

Cached Static & Aggregated Data

News & social networking sources

Orders

Orders

Page 9: Using MongoDB As a Tick Database

Trades/metrics

High Level Trading Architecture

Feed Handler

Exchanges/Markets/Brokers

Capturing Application

Low Latency Applications

Higher Latency Trading Applications

Backtesting and Analysis Applications

Market Data

Cached Static & Aggregated Data

News & social networking sources

Orders

Orders

Data Types

• Top of book

• Depth of book

• Multi-asset

• Derivatives (e.g. strips)

• News (text, video)

• Social Networking

Page 10: Using MongoDB As a Tick Database

{
  _id: ObjectId("4e2e3f92268cdda473b628f6"),
  symbol: "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  bidPrice: 55.37,
  offerPrice: 55.58,
  bidQuantity: 500,
  offerQuantity: 700
}

> db.ticks.find( { symbol: "DIS", bidPrice: {$gt: 55.36} } )

Top of Book [e.g. equities]

Page 11: Using MongoDB As a Tick Database

{
  _id: ObjectId("4e2e3f92268cdda473b628f6"),
  symbol: "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  bidPrices: [55.37, 55.36, 55.35],
  offerPrices: [55.58, 55.59, 55.60],
  bidQuantities: [500, 1000, 2000],
  offerQuantities: [1000, 2000, 3000]
}

> db.ticks.find( { bidPrices: {$gt: 55.36} } )

Depth of Book
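The query on this slide compares a number against an array field. MongoDB treats such a comparison as a match when at least one array element satisfies it. The sketch below imitates that rule in plain JavaScript on in-memory data; `matchesAnyGreater` and the second sample tick are illustrative assumptions, not part of MongoDB or of the original slides.

```javascript
// Illustrative stand-in for { bidPrices: { $gt: threshold } } on in-memory data.
// matchesAnyGreater is a hypothetical helper, not a MongoDB API.
function matchesAnyGreater(values, threshold) {
  // An array field matches when at least one element passes the comparison.
  return values.some((v) => v > threshold);
}

const depthTicks = [
  { symbol: "DIS", bidPrices: [55.37, 55.36, 55.35] }, // best bid is above 55.36
  { symbol: "DIS", bidPrices: [55.36, 55.35, 55.34] }, // no level is above 55.36
];

const hits = depthTicks.filter((t) => matchesAnyGreater(t.bidPrices, 55.36));
console.log(hits.length);
```

Only the first document is selected, because a single qualifying price level is enough for the whole document to match.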

Page 12: Using MongoDB As a Tick Database

{
  _id: ObjectId("4e2e3f92268cdda473b628f6"),
  symbol: "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  bids: [
    {price: 55.37, amount: 500},
    {price: 55.37, amount: 1000},
    {price: 55.37, amount: 2000} ],
  offers: [
    {price: 55.58, amount: 1000},
    {price: 55.58, amount: 2000},
    {price: 55.59, amount: 3000} ]
}

> db.ticks.find( { "bids.price": {$gt: 55.36} } )

Or However Your App Uses It

Page 13: Using MongoDB As a Tick Database

{
  _id: ObjectId("4e2e3f92268cdda473b628f6"),
  symbol: "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  spreadPrice: 0.58,
  leg1: {symbol: "CLM13", price: 97.34},
  leg2: {symbol: "CLK13", price: 96.92}
}

// All conditions go in one query document; the legs are matched on their nested symbol field
> db.ticks.find( { "leg1.symbol": "CLM13",
                   "leg2.symbol": "CLK13",
                   spreadPrice: {$gt: 0.50} } )

Synthetic Spreads

Page 14: Using MongoDB As a Tick Database

{

_id : ObjectId("4e2e3f92268cdda473b628f6"),

symbol : "DIS",

timestamp: ISODate("2013-02-15 10:00"),

title: "Disney Earnings…",

body: "Walt Disney Company reported…",

tags: ["earnings", "media", "walt disney"]

}

News

Page 15: Using MongoDB As a Tick Database

{

_id : ObjectId("4e2e3f92268cdda473b628f6"),

timestamp: ISODate("2013-02-15 10:00"),

twitterHandle: "jdoe",

tweet: "Heard @DisneyPictures is releasing…",

usernamesIncluded: ["DisneyPictures"],

hashTags: ["movierumors", "disney"]

}

Social Networking

Page 16: Using MongoDB As a Tick Database

{
  _id: ObjectId("4e2e3f92268cdda473b628f6"),
  symbol: "DIS",
  openTS: ISODate("2013-02-15 10:00"),
  closeTS: ISODate("2013-02-15 10:05"),
  open: 55.36,
  high: 55.80,
  low: 55.20,
  close: 55.70
}

Aggregates (bars, daily, etc)

Page 17: Using MongoDB As a Tick Database

Querying/Analyzing Tick Data

Page 18: Using MongoDB As a Tick Database

Architecture for Querying Data

Higher Latency Trading Applications

Backtesting Applications

• Ticks

• Bars

• Other analysis

Research & Analysis Applications

Page 19: Using MongoDB As a Tick Database

// Compound indexes

> db.ticks.ensureIndex({symbol: 1, timestamp:1})

// Index on arrays

> db.ticks.ensureIndex( {bidPrices: -1} )

// Index on any depth

> db.ticks.ensureIndex( {"bids.price": 1} )

// Full text search

> db.ticks.ensureIndex( {tweet: "text"} )

Index Any Fields: Arrays, Nested, etc.

Page 20: Using MongoDB As a Tick Database

// Ticks for last month for media companies

> db.ticks.find({
    symbol: {$in: ["DIS", "VIA", "CBS"]},
    timestamp: {$gte: ISODate("2013-01-01"), $lt: ISODate("2013-02-01")}
  })

// Ticks when Disney’s bid breached 55.50 this month

> db.ticks.find({
    symbol: "DIS",
    bidPrice: {$gt: 55.50},
    timestamp: {$gt: ISODate("2013-02-01")}
  })

Query for ticks by time; price threshold
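A time-range filter needs both bounds inside one `timestamp` condition, because a JavaScript object literal keeps only the last value for a repeated key. The sketch below builds such a filter for a calendar month. `monthRangeQuery` is a hypothetical helper introduced here for illustration, and the choice of UTC month boundaries is an assumption.

```javascript
// monthRangeQuery is a hypothetical helper; field names follow the slides.
function monthRangeQuery(symbols, year, month) {
  // month is 1-based. Date.UTC takes a 0-based month and rolls over past December.
  const start = new Date(Date.UTC(year, month - 1, 1));
  const end = new Date(Date.UTC(year, month, 1));
  return {
    symbol: { $in: symbols },
    // Both bounds sit in ONE sub-document. Writing the timestamp key twice
    // would silently drop the first condition.
    timestamp: { $gte: start, $lt: end },
  };
}

const januaryQuery = monthRangeQuery(["DIS", "VIA", "CBS"], 2013, 1);
console.log(JSON.stringify(januaryQuery));
```

The resulting object could be passed to a `find` call as the filter. The half-open range (`$gte` start, `$lt` next month) covers every tick in the month without having to name its last instant.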

Page 21: Using MongoDB As a Tick Database

Analyzing/Aggregating Options

• Custom application code
  – Run your queries, compute your results

• Aggregation framework
  – Declarative, pipeline-based approach

• Native Map/Reduce in MongoDB
  – JavaScript functions distributed across cluster

• Hadoop Connector
  – Offline batch processing/computation
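As a rough illustration of the first option (custom application code), the sketch below rolls raw ticks into one-minute open/high/low/close bars in plain JavaScript. `toMinuteBars` is a hypothetical helper, and the tick shape (a `Date` in `timestamp`, a number in `price`) is an assumption modelled on the documents shown earlier.

```javascript
// toMinuteBars is an illustrative helper, not part of any MongoDB driver.
// Assumes each tick has a Date `timestamp` and a numeric `price`.
function toMinuteBars(ticks) {
  // Sort a copy by time so "open" and "close" are well defined.
  const sorted = [...ticks].sort((a, b) => a.timestamp - b.timestamp);
  const bars = new Map();
  for (const { timestamp, price } of sorted) {
    // Truncate to the start of the minute (60,000 ms) to get the bar key.
    const minute = Math.floor(timestamp.getTime() / 60000) * 60000;
    const bar = bars.get(minute);
    if (!bar) {
      bars.set(minute, {
        openTS: new Date(minute),
        open: price, high: price, low: price, close: price,
      });
    } else {
      bar.high = Math.max(bar.high, price);
      bar.low = Math.min(bar.low, price);
      bar.close = price; // ticks are in time order, so the latest one wins
    }
  }
  return [...bars.values()];
}

const minuteBars = toMinuteBars([
  { timestamp: new Date("2013-02-15T10:00:05Z"), price: 55.36 },
  { timestamp: new Date("2013-02-15T10:00:40Z"), price: 55.8 },
  { timestamp: new Date("2013-02-15T10:00:59Z"), price: 55.7 },
  { timestamp: new Date("2013-02-15T10:01:10Z"), price: 55.2 },
]);
console.log(minuteBars);
```

This approach moves every raw tick to the client, which is the trade-off against the aggregation framework and map/reduce options that compute inside the database.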

Page 22: Using MongoDB As a Tick Database

//Aggregate minute bars for Disney for February

db.ticks.aggregate(
  { $match: { symbol: "DIS", timestamp: {$gt: new ISODate("2013-02-01")} } },
  { $project: {
      year: {$year: "$timestamp"},
      month: {$month: "$timestamp"},
      day: {$dayOfMonth: "$timestamp"},
      hour: {$hour: "$timestamp"},
      minute: {$minute: "$timestamp"},
      second: {$second: "$timestamp"},
      timestamp: 1,
      price: 1 } },
  { $sort: { timestamp: 1 } },
  { $group: {
      _id: { year: "$year", month: "$month", day: "$day", hour: "$hour", minute: "$minute" },
      open: {$first: "$price"},
      high: {$max: "$price"},
      low: {$min: "$price"},
      close: {$last: "$price"} } }
)

Aggregate into min bars

Page 23: Using MongoDB As a Tick Database

//then count the number of down bars (these stages are appended to the minute-bar pipeline)

  { $project: { downBar: {$lt: ["$close", "$open"]}, timestamp: 1, open: 1, high: 1, low: 1, close: 1 } },
  { $group: { _id: "$downBar", sum: {$sum: 1} } }
)

Add Analysis on the Bars
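For comparison, the same down-bar count can be sketched in application code. A bar counts as "down" when its close is below its open, which is the comparison the `$lt` expression on this slide makes. `countDownBars` and the sample bars are illustrative assumptions.

```javascript
// countDownBars is a hypothetical helper mirroring the $project + $group stages:
// flag each bar as down or not, then count the bars in each bucket.
function countDownBars(bars) {
  const counts = { down: 0, notDown: 0 };
  for (const bar of bars) {
    if (bar.close < bar.open) {
      counts.down += 1;
    } else {
      counts.notDown += 1; // flat bars are not down bars
    }
  }
  return counts;
}

const sampleBars = [
  { open: 55.36, close: 55.7 }, // up
  { open: 55.7, close: 55.2 },  // down
  { open: 55.2, close: 55.2 },  // flat
];
const downCounts = countDownBars(sampleBars);
console.log(downCounts);
```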

Page 24: Using MongoDB As a Tick Database

var mapFunction = function () {

emit(this.symbol, this.bidPrice);

}

var reduceFunction = function (symbol, priceList) {

return Array.sum(priceList);

}

> db.ticks.mapReduce(mapFunction, reduceFunction, {out: "tickSums"})

MapReduce Example: Sum
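To make the data flow of the example concrete, here is a small in-memory imitation of the map and reduce phases in plain JavaScript. `runMapReduce` is a hypothetical stand-in, not the MongoDB implementation. Two deliberate differences from the shell: `emit` is passed in as an argument rather than being a global, and the sum uses `Array.prototype.reduce` because `Array.sum` is a mongo shell helper that is not available in standard JavaScript.

```javascript
// runMapReduce is a hypothetical in-memory stand-in for db.collection.mapReduce.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  // The map phase emits (key, value) pairs, collected here per key.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map.call(doc, emit); // `this` is the current document

  const out = {};
  for (const [key, values] of groups) {
    // As in MongoDB, reduce is skipped for a key that has a single value.
    out[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return out;
}

const mapFn = function (emit) { emit(this.symbol, this.bidPrice); };
const reduceFn = (symbol, priceList) => priceList.reduce((a, b) => a + b, 0);

const tickSums = runMapReduce(
  [
    { symbol: "DIS", bidPrice: 55.5 },
    { symbol: "DIS", bidPrice: 55.25 },
    { symbol: "CBS", bidPrice: 43.5 },
  ],
  mapFn,
  reduceFn
);
console.log(tickSums);
```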

Page 25: Using MongoDB As a Tick Database

Process Data in Hadoop

• MongoDB’s Hadoop Connector

• Supports Map/Reduce, Streaming, Pig

• MongoDB as input/output storage for Hadoop jobs
  – No need to go through HDFS

• Leverage power of Hadoop ecosystem against operational data in MongoDB

Page 26: Using MongoDB As a Tick Database

Performance, Scalability, and High Availability

Page 27: Using MongoDB As a Tick Database

Why MongoDB Is Fast and Scalable

Better data locality

Relational MongoDB

In-Memory Caching

Auto-Sharding

Read/write scaling

Page 28: Using MongoDB As a Tick Database

Auto-sharding for Horizontal Scale

mongod

Read/Write Scalability

Key Range Symbol: A…Z

Page 29: Using MongoDB As a Tick Database

Auto-sharding for Horizontal Scale

Read/Write Scalability

mongod mongod

Key Range Symbol: A…J

Key Range Symbol: K…Z

Page 30: Using MongoDB As a Tick Database

Sharding

mongod mongod mongod mongod

Read/Write Scalability

Key Range Symbol: A…F

Key Range Symbol: G…J

Key Range Symbol: K…O

Key Range Symbol: P…Z
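The key-range idea on these slides can be sketched as a lookup from a symbol to the shard that owns its range. This is a deliberately simplified model: the shard names and the `shardFor` helper are assumptions, and it routes on the first letter only, whereas a real cluster compares full shard key values against chunk boundaries and performs the routing for you.

```javascript
// Hypothetical range table mirroring the four key ranges on the slide.
const keyRanges = [
  { min: "A", max: "F", shard: "shard0" },
  { min: "G", max: "J", shard: "shard1" },
  { min: "K", max: "O", shard: "shard2" },
  { min: "P", max: "Z", shard: "shard3" },
];

// shardFor is an illustrative helper: pick the range containing the first letter.
function shardFor(symbol) {
  const first = symbol[0].toUpperCase();
  const range = keyRanges.find((r) => first >= r.min && first <= r.max);
  return range ? range.shard : null; // null when no range covers the symbol
}

console.log(shardFor("DIS"), shardFor("IBM"), shardFor("MSFT"), shardFor("VIA"));
```

Because each symbol maps to exactly one range, writes for different symbols spread across the shards, and a query that names a symbol can be sent to a single shard.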

Page 31: Using MongoDB As a Tick Database

[Diagram: the Application connects through three MongoS routers to four shards. Each shard is a replica set with one Primary and two Secondaries.]

Key Range Symbol: A…F, Time

Key Range Symbol: G…J, Time

Key Range Symbol: K…O, Time

Key Range Symbol: P…Z, Time

Page 32: Using MongoDB As a Tick Database

Summary

• MongoDB delivers high performance for tick data

• Scales horizontally through auto-sharding

• Fast, flexible querying, analysis, & aggregation

• Dynamic schema can handle any data types

• MongoDB has all these features with low TCO

• We can support you with anything discussed

Page 33: Using MongoDB As a Tick Database

Questions?

Page 34: Using MongoDB As a Tick Database

Sr. Solution Architect, MongoDB

Matt Kalan

#ConferenceHashtag

Thank You

