Sr. Solution Architect, MongoDB Matt Kalan How Capital Markets Firms Use MongoDB as a Tick Database

Using MongoDB As a Tick Database

DESCRIPTION

Learn how you can enjoy the developer productivity, low TCO, and unlimited scale of MongoDB as a tick database for capturing, analyzing, and taking advantage of opportunities in tick data. This presentation illustrates how MongoDB can easily and quickly store variable data formats, such as top and depth of book, multiple asset classes, and even news and social networking feeds. It explores aggregating and analyzing tick data in real time for automated trading or in batch for research and analysis, and shows how auto-sharding enables MongoDB to scale on commodity hardware to satisfy unlimited storage and performance requirements.


Page 1: Using MongoDB As a Tick Database

Sr. Solution Architect, MongoDB

Matt Kalan

How Capital Markets Firms Use MongoDB as a Tick Database

Page 2: Using MongoDB As a Tick Database

Agenda

• MongoDB One Slide Overview

• FS Use Cases

• Writing/Capturing Market Data

• Reading/Analyzing Market Data

• Performance, Scalability, & High Availability

• Q&A

Page 3: Using MongoDB As a Tick Database

MongoDB Technical Benefits

Horizontally Scalable: Sharding

Agile & Flexible

High Performance: Indexes, RAM

Application

Highly Available: Replica Sets

{ name: "John Smith", date: "2013-08-01", address: "10 3rd St.", phone: [ { home: 1234567890 }, { mobile: 1234568138 } ] }

db.cust.insert({…})
db.cust.find({ name: "John Smith" })

Page 4: Using MongoDB As a Tick Database

Most Common FS Use Cases

1. Tick Data Capture & Analysis

2. Reference Data Management

3. Risk Analysis & Reporting

4. Trade Repository

5. Portfolio Reporting

Page 5: Using MongoDB As a Tick Database

Writing and Capturing Tick Data

Page 6: Using MongoDB As a Tick Database

Tick Data Capture & Analysis Requirements

• Capture real-time market data (multi-asset, top of book, depth of book, even news)

• Load historical data

• Aggregate data into bars, daily, monthly intervals

• Enable queries & analysis on raw ticks or aggregates

• Drive backtesting or automated signals

Page 7: Using MongoDB As a Tick Database

Tick Data Capture & Analysis: Why MongoDB?

• High throughput => can capture real-time feeds for all products/asset classes needed

• High scalability => all data and depth for all historical time periods can be captured

• Flexible & range-based indexing => fast querying on time ranges and any fields

• Aggregation Framework => can shape raw data into aggregates (e.g. ticks to bars)

• Map-reduce capability (native MR or Hadoop Connector) => batch analysis looking for patterns and opportunities

• Easy to use => native language drivers and JSON expressions that you can apply for most operational database needs as well

• Low TCO => low software license cost and commodity hardware

Page 8: Using MongoDB As a Tick Database

High Level Trading Architecture

[Architecture diagram. Labels: Exchanges/Markets/Brokers; News & social networking sources; Feed Handler; Capturing Application; Low Latency Applications; Higher Latency Trading Applications; Backtesting and Analysis Applications; Market Data; Cached Static & Aggregated Data; Orders; Trades/metrics]

Page 9: Using MongoDB As a Tick Database

High Level Trading Architecture

[Architecture diagram with the same labels as on the previous page, plus a Data Types callout]

Data Types

• Top of book

• Depth of book

• Multi-asset

• Derivatives (e.g. strips)

• News (text, video)

• Social Networking

Page 10: Using MongoDB As a Tick Database

Top of Book [e.g. equities]

{ _id : ObjectId("4e2e3f92268cdda473b628f6"),
  symbol : "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  bidPrice: 55.37,
  offerPrice: 55.58,
  bidQuantity: 500,
  offerQuantity: 700
}

> db.ticks.find( { symbol: "DIS", bidPrice: {$gt: 55.36} } )
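A capturing application would typically normalize each feed message into this document shape before inserting it. The following is a minimal sketch in plain JavaScript, assuming a hypothetical pipe-delimited quote format (`symbol|time|bid|bidQty|offer|offerQty`) that does not come from the presentation; `parseQuote` is an illustrative name, not part of any MongoDB API.

```javascript
// Hypothetical feed line: "DIS|2013-02-15T10:00:00Z|55.37|500|55.58|700"
// The field order and the "|" delimiter are assumptions for illustration only.
function parseQuote(line) {
  const [symbol, time, bid, bidQty, offer, offerQty] = line.split("|");
  const tick = {
    symbol,
    timestamp: new Date(time), // a JavaScript Date is stored as a BSON date
    bidPrice: Number(bid),
    offerPrice: Number(offer),
    bidQuantity: Number(bidQty),
    offerQuantity: Number(offerQty),
  };
  // Reject malformed messages instead of storing NaN values or invalid dates.
  const numbers = [tick.bidPrice, tick.offerPrice, tick.bidQuantity, tick.offerQuantity];
  if (!symbol || Number.isNaN(tick.timestamp.getTime()) || numbers.some(Number.isNaN)) {
    throw new Error("Malformed quote: " + line);
  }
  return tick;
}

const tick = parseQuote("DIS|2013-02-15T10:00:00Z|55.37|500|55.58|700");
console.log(tick);
```

In the mongo shell, the resulting object could then be stored with `db.ticks.insert(tick)`.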

Page 11: Using MongoDB As a Tick Database

Depth of Book

{ _id : ObjectId("4e2e3f92268cdda473b628f6"),
  symbol : "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  bidPrices: [55.37, 55.36, 55.35],
  offerPrices: [55.58, 55.59, 55.60],
  bidQuantities: [500, 1000, 2000],
  offerQuantities: [1000, 2000, 3000]
}

> db.ticks.find( { bidPrices: {$gt: 55.36} } )

Page 12: Using MongoDB As a Tick Database

Or However Your App Uses It

{ _id : ObjectId("4e2e3f92268cdda473b628f6"),
  symbol : "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  bids: [ {price: 55.37, amount: 500},
          {price: 55.37, amount: 1000},
          {price: 55.37, amount: 2000} ],
  offers: [ {price: 55.58, amount: 1000},
            {price: 55.58, amount: 2000},
            {price: 55.59, amount: 3000} ]
}

> db.ticks.find( { "bids.price": {$gt: 55.36} } )
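The two depth-of-book layouts carry the same information, so an application can convert between them. The sketch below, in plain JavaScript, turns the parallel-array layout into the array-of-subdocuments layout and mimics the "at least one element matches" behaviour of a query on an array field. The function names `toLevels`, `convertDepthTick`, and `anyBidAbove` are hypothetical helpers, not MongoDB functions.

```javascript
// Pair up prices and quantities: [p0, p1], [q0, q1] -> [{price: p0, amount: q0}, ...]
function toLevels(prices, quantities) {
  if (prices.length !== quantities.length) {
    throw new Error("prices and quantities must have the same length");
  }
  return prices.map((price, i) => ({ price, amount: quantities[i] }));
}

// Convert a parallel-array depth tick into the bids/offers subdocument layout.
function convertDepthTick(doc) {
  return {
    symbol: doc.symbol,
    timestamp: doc.timestamp,
    bids: toLevels(doc.bidPrices, doc.bidQuantities),
    offers: toLevels(doc.offerPrices, doc.offerQuantities),
  };
}

// A query such as {"bids.price": {$gt: x}} matches a document when at least
// one array element satisfies the condition; Array.prototype.some mirrors that.
function anyBidAbove(doc, threshold) {
  return doc.bids.some((level) => level.price > threshold);
}

const depthTick = convertDepthTick({
  symbol: "DIS",
  timestamp: new Date("2013-02-15T10:00:00Z"),
  bidPrices: [55.37, 55.36, 55.35],
  offerPrices: [55.58, 55.59, 55.60],
  bidQuantities: [500, 1000, 2000],
  offerQuantities: [1000, 2000, 3000],
});
console.log(depthTick.bids);
console.log(anyBidAbove(depthTick, 55.36));
```

Which layout is preferable depends on how the application reads the book: parallel arrays are compact, while subdocuments keep each price next to its quantity.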

Page 13: Using MongoDB As a Tick Database

Synthetic Spreads

{ _id : ObjectId("4e2e3f92268cdda473b628f6"),
  symbol : "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  spreadPrice: 0.58,
  leg1: {symbol: "CLM13", price: 97.34},
  leg2: {symbol: "CLK13", price: 96.92}
}

// All conditions go in one query document; the legs are matched on their nested symbol field
> db.ticks.find( { "leg1.symbol": "CLM13",
                   "leg2.symbol": "CLK13",
                   spreadPrice: {$gt: 0.50} } )

Page 14: Using MongoDB As a Tick Database

News

{
  _id : ObjectId("4e2e3f92268cdda473b628f6"),
  symbol : "DIS",
  timestamp: ISODate("2013-02-15 10:00"),
  title: "Disney Earnings…",
  body: "Walt Disney Company reported…",
  tags: ["earnings", "media", "walt disney"]
}

Page 15: Using MongoDB As a Tick Database

Social Networking

{
  _id : ObjectId("4e2e3f92268cdda473b628f6"),
  timestamp: ISODate("2013-02-15 10:00"),
  twitterHandle: "jdoe",
  tweet: "Heard @DisneyPictures is releasing…",
  usernamesIncluded: ["DisneyPictures"],
  hashTags: ["movierumors", "disney"]
}
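Fields such as `usernamesIncluded` and `hashTags` would normally be derived from the tweet text when the document is captured, so that they can be indexed and queried directly. A rough sketch in plain JavaScript follows; the helper name `toTweetDoc`, the sample tweet text, and the simplified regular expressions are all assumptions for illustration, not part of the presentation.

```javascript
// Build a document in the shape of the social networking example.
// The patterns are a simplification: real handles and hashtags follow
// more rules than "word characters after @ or #".
function toTweetDoc(twitterHandle, tweet, timestamp) {
  const collect = (pattern) => Array.from(tweet.matchAll(pattern), (m) => m[1]);
  return {
    timestamp,
    twitterHandle,
    tweet,
    usernamesIncluded: collect(/@(\w+)/g),
    // Lowercase the tags so that queries do not depend on capitalization.
    hashTags: collect(/#(\w+)/g).map((tag) => tag.toLowerCase()),
  };
}

// Hypothetical tweet text, invented for this example.
const tweetDoc = toTweetDoc(
  "jdoe",
  "Heard @DisneyPictures has a sequel planned #MovieRumors #Disney",
  new Date("2013-02-15T10:00:00Z")
);
console.log(tweetDoc.usernamesIncluded, tweetDoc.hashTags);
```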

Page 16: Using MongoDB As a Tick Database

Aggregates (bars, daily, etc)

{ _id : ObjectId("4e2e3f92268cdda473b628f6"),
  symbol : "DIS",
  openTS: ISODate("2013-02-15 10:00"),
  closeTS: ISODate("2013-02-15 10:05"),
  open: 55.36,
  high: 55.80,
  low: 55.20,
  close: 55.70
}

Page 17: Using MongoDB As a Tick Database

Querying/Analyzing Tick Data

Page 18: Using MongoDB As a Tick Database

Architecture for Querying Data

[Diagram labels: Higher Latency Trading Applications; Backtesting Applications; Research & Analysis Applications]

• Ticks

• Bars

• Other analysis

Page 19: Using MongoDB As a Tick Database

Index Any Fields: Arrays, Nested, etc.

// Compound indexes
> db.ticks.ensureIndex( {symbol: 1, timestamp: 1} )

// Index on arrays
> db.ticks.ensureIndex( {bidPrices: -1} )

// Index on any depth
> db.ticks.ensureIndex( {"bids.price": 1} )

// Full text search
> db.ticks.ensureIndex( {tweet: "text"} )

Page 20: Using MongoDB As a Tick Database

Query for ticks by time; price threshold

// Ticks for last month for media companies
// (both bounds go in a single timestamp condition; a repeated key would overwrite the first)
> db.ticks.find({ symbol: {$in: ["DIS", "VIA", "CBS"]},
                  timestamp: {$gte: new ISODate("2013-01-01"), $lt: new ISODate("2013-02-01")} })

// Ticks when Disney's bid breached 55.50 this month
> db.ticks.find({ symbol: "DIS",
                  bidPrice: {$gt: 55.50},
                  timestamp: {$gte: new ISODate("2013-02-01")} })
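When the date range is computed rather than typed in, the filter document can be built in application code. This is a small sketch in plain JavaScript, assuming timestamps are stored in UTC; `monthFilter` is a hypothetical helper name, and the object it returns is meant to be passed as the first argument of `find`.

```javascript
// Build a filter for all ticks of the given symbols in one calendar month (UTC).
// month is 1-based (1 = January).
function monthFilter(symbols, year, month) {
  const start = new Date(Date.UTC(year, month - 1, 1));
  // Date.UTC rolls month 12 over into January of the next year.
  const end = new Date(Date.UTC(year, month, 1));
  return {
    symbol: { $in: symbols },
    // Both bounds live in ONE condition object: repeating the "timestamp" key
    // in a JavaScript object literal would silently keep only the last one.
    timestamp: { $gte: start, $lt: end },
  };
}

const januaryMedia = monthFilter(["DIS", "VIA", "CBS"], 2013, 1);
console.log(JSON.stringify(januaryMedia));
```

A half-open range (`$gte` start, `$lt` start of the next month) avoids having to know the last day or last millisecond of the month.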

Page 21: Using MongoDB As a Tick Database

Analyzing/Aggregating Options

• Custom application code: run your queries, compute your results

• Aggregation framework: declarative, pipeline-based approach

• Native Map/Reduce in MongoDB: JavaScript functions distributed across the cluster

• Hadoop Connector: offline batch processing/computation

Page 22: Using MongoDB As a Tick Database

Aggregate into min bars

// Aggregate minute bars for Disney for February
db.ticks.aggregate(
  { $match: { symbol: "DIS", timestamp: {$gt: new ISODate("2013-02-01")} } },
  { $project: { year: {$year: "$timestamp"},
                month: {$month: "$timestamp"},
                day: {$dayOfMonth: "$timestamp"},
                hour: {$hour: "$timestamp"},
                minute: {$minute: "$timestamp"},
                second: {$second: "$timestamp"},
                timestamp: 1,
                price: 1 } },
  { $sort: { timestamp: 1 } },
  { $group: { _id: {year: "$year", month: "$month", day: "$day", hour: "$hour", minute: "$minute"},
              open: {$first: "$price"},
              high: {$max: "$price"},
              low: {$min: "$price"},
              close: {$last: "$price"} } }
)
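To make the grouping logic concrete, here is what the pipeline computes, written as plain JavaScript over an in-memory array. It is only a reference sketch of the technique (sort by time, group by minute, take first/max/min/last), not how MongoDB executes the aggregation; `toMinuteBars` and the sample prices are invented for illustration.

```javascript
// Group ticks into one-minute OHLC bars, mirroring the pipeline's
// $sort followed by $group with $first / $max / $min / $last.
function toMinuteBars(ticks) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...ticks].sort((a, b) => a.timestamp - b.timestamp);
  const bars = new Map(); // minute start (ms since epoch) -> bar
  for (const { timestamp, price } of sorted) {
    // Truncate to the start of the minute (UTC) to form the group key.
    const minuteStart = Math.floor(timestamp.getTime() / 60000) * 60000;
    const bar = bars.get(minuteStart);
    if (!bar) {
      bars.set(minuteStart, {
        openTS: new Date(minuteStart),
        open: price, high: price, low: price, close: price,
      });
    } else {
      bar.high = Math.max(bar.high, price);
      bar.low = Math.min(bar.low, price);
      bar.close = price; // ticks are sorted, so the last one seen is the close
    }
  }
  // A Map iterates in insertion order, so bars come out in time order.
  return [...bars.values()];
}

const at = (iso, price) => ({ timestamp: new Date(iso), price });
const sampleTicks = [
  at("2013-02-15T10:00:05Z", 55.36),
  at("2013-02-15T10:00:20Z", 55.80),
  at("2013-02-15T10:00:40Z", 55.20),
  at("2013-02-15T10:00:59Z", 55.70),
  at("2013-02-15T10:01:10Z", 55.65),
];
console.log(toMinuteBars(sampleTicks));
```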

Page 23: Using MongoDB As a Tick Database

Add Analysis on the Bars

// ...then count the number of down bars: two more stages appended after the $group stage
  { $project: { downBar: {$lt: ["$close", "$open"]},
                timestamp: 1, open: 1, high: 1, low: 1, close: 1 } },
  { $group: { _id: "$downBar",
              sum: {$sum: 1} } }
)
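The same down-bar count can be expressed in application code, which is the "custom application code" option applied to bars that have already been computed. A minimal sketch in plain JavaScript; `countByDownBar` and the sample bars are hypothetical.

```javascript
// Count bars by whether they closed below their open, like grouping on
// the computed downBar flag. Unlike $group, this always reports both
// buckets, even when one of them is empty.
function countByDownBar(bars) {
  const counts = { down: 0, notDown: 0 };
  for (const bar of bars) {
    if (bar.close < bar.open) {
      counts.down += 1;
    } else {
      counts.notDown += 1; // flat bars (close === open) are not down bars
    }
  }
  return counts;
}

const sampleBars = [
  { open: 55.36, close: 55.70 },
  { open: 55.70, close: 55.65 },
  { open: 55.65, close: 55.65 },
];
console.log(countByDownBar(sampleBars));
```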

Page 24: Using MongoDB As a Tick Database

MapReduce Example: Sum

var mapFunction = function () {
  emit(this.symbol, this.bidPrice);
};

var reduceFunction = function (symbol, priceList) {
  return Array.sum(priceList);
};

> db.ticks.mapReduce(mapFunction, reduceFunction, {out: "tickSums"})
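The map/reduce flow can be emulated in a few lines of plain JavaScript to show what ends up in the output collection. This is a simplified, single-process sketch: `runMapReduce` is a hypothetical helper, and it passes `emit` to the map function as an argument, whereas in MongoDB `emit` is provided by the server and the map function takes no arguments.

```javascript
// Minimal emulation of map/reduce: collect emitted values per key,
// then reduce each key's list to a single value.
function runMapReduce(docs, map, reduce) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // As in MongoDB, `this` inside map refers to the current document.
  for (const doc of docs) map.call(doc, emit);

  const results = [];
  for (const [key, values] of emitted) {
    // MongoDB does not call reduce for a key with a single value.
    const value = values.length === 1 ? values[0] : reduce(key, values);
    results.push({ _id: key, value });
  }
  return results;
}

const mapBidPrice = function (emit) { emit(this.symbol, this.bidPrice); };
const sumPrices = (symbol, priceList) => priceList.reduce((a, b) => a + b, 0);

const quotes = [
  { symbol: "DIS", bidPrice: 55.5 },
  { symbol: "DIS", bidPrice: 55.25 },
  { symbol: "CBS", bidPrice: 44.0 },
];
console.log(runMapReduce(quotes, mapBidPrice, sumPrices));
```

Because the server may call the reduce function more than once for the same key, a real reduce function has to return a value of the same shape as the values it receives; a plain sum satisfies that.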

Page 25: Using MongoDB As a Tick Database

Process Data in Hadoop

• MongoDB’s Hadoop Connector

• Supports Map/Reduce, Streaming, Pig

• MongoDB as input/output storage for Hadoop jobs: no need to go through HDFS

• Leverage power of Hadoop ecosystem against operational data in MongoDB

Page 26: Using MongoDB As a Tick Database

Performance, Scalability, and High Availability

Page 27: Using MongoDB As a Tick Database

Why MongoDB Is Fast and Scalable

• Better data locality [diagram labels: Relational, MongoDB]

• In-Memory Caching

• Auto-Sharding: read/write scaling

Page 28: Using MongoDB As a Tick Database

Auto-sharding for Horizontal Scale

Read/Write Scalability

[Diagram: one mongod]

Key Range Symbol: A…Z

Page 29: Using MongoDB As a Tick Database

Auto-sharding for Horizontal Scale

Read/Write Scalability

[Diagram: two mongod instances]

Key Range Symbol: A…J

Key Range Symbol: K…Z

Page 30: Using MongoDB As a Tick Database

Sharding

Read/Write Scalability

[Diagram: four mongod instances]

Key Range Symbol: A…F

Key Range Symbol: G…J

Key Range Symbol: K…O

Key Range Symbol: P…Z
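Range-based routing on the symbol can be pictured as a lookup from key range to shard. The sketch below, in plain JavaScript, uses the four ranges from the slide; the shard names and the `shardFor` helper are hypothetical, and routing on the first letter only is a deliberate simplification. In a real cluster the routing is done by MongoDB itself against the full shard key value, not by application code.

```javascript
// Key ranges from the slide: A…F, G…J, K…O, P…Z. Shard names are made up.
const chunks = [
  { min: "A", max: "F", shard: "shard0" },
  { min: "G", max: "J", shard: "shard1" },
  { min: "K", max: "O", shard: "shard2" },
  { min: "P", max: "Z", shard: "shard3" },
];

// Return the shard whose key range contains the symbol's first letter.
function shardFor(symbol) {
  if (!symbol) throw new Error("symbol is required");
  const first = symbol[0].toUpperCase();
  const chunk = chunks.find((c) => first >= c.min && first <= c.max);
  if (!chunk) throw new Error("No key range covers symbol " + symbol);
  return chunk.shard;
}

for (const symbol of ["DIS", "IBM", "MSFT", "VIA"]) {
  console.log(symbol, "->", shardFor(symbol));
}
```

Because each symbol lives in exactly one range, a query that specifies the symbol can be sent to a single shard, while writes for different symbols spread across all of them.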

Page 31: Using MongoDB As a Tick Database

[Diagram: the Application connects through three MongoS routers to four shards; each shard is a replica set with one Primary and two Secondaries]

Key Range Symbol: A…F, Time

Key Range Symbol: G…J, Time

Key Range Symbol: K…O, Time

Key Range Symbol: P…Z, Time

Page 32: Using MongoDB As a Tick Database

Summary

• MongoDB is high performance for tick data

• Scales horizontally automatically by auto-sharding

• Fast, flexible querying, analysis, & aggregation

• Dynamic schema can handle any data types

• MongoDB has all these features with low TCO

• We can support you with anything discussed

Page 33: Using MongoDB As a Tick Database

Questions?

Page 34: Using MongoDB As a Tick Database

Sr. Solution Architect, MongoDB

Matt Kalan

Thank You