How a Hedge Fund Uses MongoDB (1)

  • 8/3/2019 How a Hedge Fund Uses MongoDB (1)

    How a hedge fund uses MongoDB

    Roman Shtylman

    Athena Capital Research

    Making money in the stock market.

    1. Listen to market data
    2. ????
    3. Profit!

    Agenda

    About Athena Capital Research
    3 uses of MongoDB at Athena
      Dropcopy
      BSON Logging
      Realtime Monitoring
    Wrap-Up
    Questions

    Athena Capital Research

    Strong focus on technical talent and technology
      90% of employees come from engineering, math, or hard science backgrounds

    Quantitative investment manager

    math

    Automated trading robots

    C++

    speed

    Open source stack

    freedom

    MongoDB at Athena

    Lots of unstructured data
    Many sources of data
    Want to be able to query quickly
    Not everything goes into a database
    Avoid creating schema after schema

    Dropcopy

    Third parties require near-real-time reporting of trading activity
      Accounting
      Risk management
      Compliance
    Exchanges provide a "drop-copy" FIX protocol
    Scrub the messages and forward to said third party
    MongoDB for message passing

    FIX Protocol

    Financial Information eXchange
    Key/value based ASCII
      Header + body + trailer
      Key is numeric (maps to some "standard" name)
      Value is string
    Good fit for MongoDB
      Key / value
      Flexible document sizes
      Easier to query than SQL alternatives
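To make the key/value point concrete, here is a minimal sketch of turning one raw FIX message into a dictionary that could be stored as a document. The function names and the sample message are hypothetical, and the tag table is only a small hand-picked subset of the standard FIX dictionary.

```python
SOH = "\x01"  # FIX separates fields with the ASCII SOH control character

# Small hand-picked subset of standard FIX tag names, for illustration only.
TAG_NAMES = {"8": "BeginString", "35": "MsgType", "55": "Symbol", "38": "OrderQty"}


def parse_fix(raw):
    """Split a raw FIX message into a {tag: value} dict; values stay strings."""
    fields = {}
    for pair in raw.split(SOH):
        if not pair:
            continue  # skip the empty piece after the final SOH
        tag, _, value = pair.partition("=")
        fields[tag] = value  # note: a repeated tag (repeating group) would overwrite
    return fields


def with_names(fields):
    """Replace known numeric tags with readable names; unknown tags pass through."""
    return {TAG_NAMES.get(tag, tag): value for tag, value in fields.items()}


# Hand-written sample, not a complete or valid session message.
raw_message = SOH.join(["8=FIX.4.2", "35=8", "55=IBM", "38=100"]) + SOH
document = with_names(parse_fix(raw_message))
print(document)
```

Because every message becomes a flat set of keys, messages with different field sets can sit side by side in one collection without a schema change.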

    Architecture

    We have an incoming FIX session (drop copy)

    Need to have an outgoing FIX session

    MongoDB acts as the glue (message passing layer)

    1. Incoming drop copy -> FIX log file
    2. fix2json
    3. MongoDB
    4. Tail cursor
    5. Client

    Drop side

    C++ client application for the drop copy connection
      Known system and can be kept database free
      QuickFix
    fix2json
      Tail reading of output FIX log files
      Easy to represent FIX as JSON and subsequently BSON
      Keep db inserts independent of FIX connection
    Downsides of combining
      Re-population
      Data will not be resent
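The fix2json step is the speaker's own tool and its source is not shown, so the following is only a sketch of the idea: read whatever complete lines have been appended to a FIX log since the last offset and emit one JSON document per message. It assumes one raw message per line, which real FIX engine logs may not follow exactly; all names are hypothetical.

```python
import json
import os
import tempfile

SOH = "\x01"


def fix_line_to_json(line):
    """Convert one SOH-delimited FIX line into a JSON object string."""
    pairs = line.rstrip("\n").split(SOH)
    fields = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
    return json.dumps(fields, sort_keys=True)


def read_new_messages(path, offset):
    """Return (json_docs, new_offset) for complete lines appended since offset."""
    docs = []
    with open(path, "rb") as log:
        log.seek(offset)
        while True:
            line = log.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            offset = log.tell()
            docs.append(fix_line_to_json(line.decode("ascii")))
    return docs, offset


# Demo: the second message is only half written during the first read.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "drop.log")
    with open(log_path, "wb") as log:
        log.write(b"35=8\x0155=IBM\x01\n")
        log.write(b"35=8\x0155=MS")
    first_batch, position = read_new_messages(log_path, 0)
    with open(log_path, "ab") as log:
        log.write(b"FT\x01\n")
    second_batch, position = read_new_messages(log_path, position)
```

Keeping the reader separate from the FIX connection means the log file can be re-read from offset 0 to repopulate the database, which matters because the exchange will not resend the data.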

    MongoDB setup

    Capped collection
      Natural index
    Data is purged daily using a simple MongoDB shell script
    Important to keep tabs on the data size if your data requirements change often
      Mitigated intraday if you are constantly reading
      Critical if you want full replay
    Easy to reconcile with Drop FIX logs
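The sizing concern can be illustrated without a database. The class below is an in-memory analogy, not MongoDB code: like a capped collection it keeps insertion order and drops the oldest entries once full, but it caps by document count, whereas a real capped collection is sized in bytes. The class and method names are hypothetical.

```python
from collections import deque


class CappedLog:
    """In-memory stand-in: fixed capacity, insertion order, oldest dropped first."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)
        self._next_seq = 0

    def insert(self, doc):
        self._docs.append((self._next_seq, doc))
        self._next_seq += 1

    def read_after(self, last_seq):
        """Return (seq, doc) pairs newer than last_seq, in insertion order."""
        return [(seq, doc) for seq, doc in self._docs if seq > last_seq]

    def oldest_seq(self):
        """Sequence number of the oldest surviving entry, or None if empty."""
        return self._docs[0][0] if self._docs else None


# Five inserts into a log that only holds three: entries 0 and 1 are gone.
log = CappedLog(max_docs=3)
for n in range(5):
    log.insert({"n": n})
survivors = [seq for seq, _ in log.read_after(-1)]
```

A reader that keeps up intraday never notices the overwrite, but a full replay from the start of the day is only possible while the cap is larger than the day's data.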

    Outgoing side

    C++ FIX application
      QuickFix
    Tail cursor
      Handling restarts
      Select only required fields
    Filter and alter any field before sending
    Outgoing message log in FIX
    Easily handle different clients
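As a rough sketch of "select only required fields" and "filter and alter", the code below takes a stored document, keeps a per-client whitelist of tags, rewrites one field, and serializes the result back to FIX with BodyLength (tag 9) and CheckSum (tag 10) computed the standard way. The whitelist, the rewritten field, and the function names are assumptions; in the real system a FIX engine would normally handle session-level header fields.

```python
SOH = "\x01"

# Hypothetical per-client selection; tag 35 (MsgType) must come first in the body.
REQUIRED_TAGS = ["35", "49", "56", "55", "38"]


def build_fix(begin_string, body_fields):
    """Assemble a FIX message, computing BodyLength (9) and CheckSum (10)."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in body_fields)
    # BodyLength counts the bytes after the tag 9 field up to the CheckSum field.
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    # CheckSum is the byte sum of everything before "10=", modulo 256, 3 digits.
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"


def to_outgoing(doc, sender_comp_id):
    """Keep only whitelisted tags and overwrite SenderCompID (49) before sending."""
    fields = [(tag, doc[tag]) for tag in REQUIRED_TAGS if tag in doc]
    fields = [(tag, sender_comp_id if tag == "49" else value) for tag, value in fields]
    return build_fix("FIX.4.2", fields)


# Tag 58 (Text) is not whitelisted, so it never reaches the client.
stored = {"35": "8", "49": "EXCHANGE", "56": "FIRM", "55": "IBM", "38": "100",
          "58": "internal note"}
outgoing = to_outgoing(stored, sender_comp_id="RELAY")
print(outgoing.replace(SOH, "|"))
```

Supporting a different client then comes down to a different whitelist and rewrite rule over the same stored documents.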

    Benefits

    Full copy of incoming data for querying
      Aggregation queries
    Easy replay
      Client disconnects
    Easy verification

    BSON Logging

    Event logging
      Independent of std::cout
    Relevant for tracking down problems and keeping records
    Logging time is "wasted" time
    Previous logging solution was slow
      XML based
      String conversions
      XML is easy to read after logging

    BSON Benefits

    Binary with loose document format
      Defined by the app during logging
    Internal data format for MongoDB
      mongorestore
    Exists sequentially in flat files
    Easily rendered as JSON
    Numbers:
      original XML implementation: 1k ops/s
      improved XML implementation: 3k ops/s
      first pass BSON implementation: ~20k ops/s
      current BSON implementation: ~30k ops/s
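"Exists sequentially in flat files" works because every BSON document starts with its own total length as a little-endian int32, so documents can be written back to back and split apart later. The sketch below hand-encodes only two element types (UTF-8 string and int64) per the BSON specification and then walks a stream of concatenated documents. It is not the speaker's logger, which used the MongoDB C++ driver, and the function names are hypothetical.

```python
import io
import struct


def encode_document(fields):
    """Encode a flat dict of str -> (str | int) as BSON (strings and int64 only)."""
    body = b""
    for name, value in fields.items():
        key = name.encode("utf-8") + b"\x00"
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # type 0x02: int32 byte length (including the trailing null), then bytes
            body += b"\x02" + key + struct.pack("<i", len(data)) + data
        elif isinstance(value, int):
            body += b"\x12" + key + struct.pack("<q", value)  # type 0x12: int64
        else:
            raise TypeError(f"unsupported type for {name!r}")
    # total length = 4-byte prefix + elements + 1-byte terminator
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


def iter_documents(stream):
    """Yield raw BSON documents from a stream of back-to-back documents."""
    while True:
        prefix = stream.read(4)
        if len(prefix) < 4:
            return
        (length,) = struct.unpack("<i", prefix)
        rest = stream.read(length - 4)
        if len(rest) < length - 4:
            return  # truncated final record, e.g. the writer died mid-append
        yield prefix + rest


# Two log events of different shapes, appended to the same "file".
flat_file = io.BytesIO()
flat_file.write(encode_document({"event": "order_sent", "qty": 100}))
flat_file.write(encode_document({"event": "heartbeat"}))
flat_file.seek(0)
records = list(iter_documents(flat_file))
```

Appending a log event is then a single binary write with no string formatting, which is consistent with the throughput gap over XML reported above.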

    BSON Gotchas

    BSON date type (UTC datetime) is int64_t milliseconds since the Unix epoch
    BSON not a standalone library
      Highly coupled to MongoDB C++ driver
    Like MongoDB, schema-less
      Just something to remember if creating post-processing tools
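One practical consequence of the millisecond representation, offered here as an inference rather than something the slides state, is that any finer-grained time is lost on the way in. A small sketch, with hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_bson_millis(moment):
    """Whole milliseconds since the Unix epoch, as a BSON UTC datetime stores them."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_bson_millis(millis):
    """Rebuild a timezone-aware datetime from stored milliseconds."""
    return EPOCH + timedelta(milliseconds=millis)


# An event stamped with microsecond precision.
event_time = datetime(2012, 1, 2, 3, 4, 5, 678901, tzinfo=timezone.utc)
stored = to_bson_millis(event_time)
restored = from_bson_millis(stored)
lost = event_time - restored  # the sub-millisecond part that did not survive
```

If ordering within a millisecond matters, one option is to log a separate integer field (for example a microsecond or nanosecond counter) alongside the date.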

    Realtime Monitoring

    Log entries are similar to one another
      Some can have extra fields
    Each machine contains independent logs
      Each log could be a different format
    Daemon to read and insert into MongoDB
      Central location, no hunting when problems happen
    Real-time monitoring and alerting
      Human intervention required
    Web based tools to "tail" view log entries
      WebSockets
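A minimal sketch of what the collecting daemon might do with each entry before inserting it: tag it with its origin, leave any extra fields untouched, and flag the ones that need a human. The field names ("level", "host", "source"), the alert rule, and the sample entries are all hypothetical; the slides do not describe the actual log layout.

```python
ALERT_LEVELS = {"ERROR", "FATAL"}  # hypothetical alerting rule


def to_document(host, source, entry):
    """Wrap one parsed log entry with its origin; extra fields are kept as-is."""
    document = dict(entry)
    document["host"] = host
    document["source"] = source
    return document


def needs_alert(document):
    """True when the entry's level calls for human attention."""
    return document.get("level") in ALERT_LEVELS


# Entries from two machines; the second carries an extra "session" field.
entries = [
    ("box1", "strategy.log", {"level": "INFO", "msg": "started"}),
    ("box2", "gateway.log", {"level": "ERROR", "msg": "session dropped",
                             "session": "DROP1"}),
]
documents = [to_document(host, source, entry) for host, source, entry in entries]
alerts = [doc for doc in documents if needs_alert(doc)]
```

Since the documents do not have to share a schema, a new log format only needs a parser on the daemon side, not a change to the central store.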

    Wrap-Up

    "Realtime" is relative
      Benchmark to meet your needs
    Disjoint pieces can be less prone to failure
    Other MongoDB uses
      Contribute to LuaMongo driver
      BSON code contributions
      Bugfixes

    Questions?

    [email protected]

    Reference: http://www.mongodb.org/display/DOCS/Tailable+Cursors

    FIX:
      http://en.wikipedia.org/wiki/Financial_Information_eXchange
      http://www.quickfixengine.org/
      http://www.onixs.biz/tools/fixdictionary/

    BSON: http://bsonspec.org/
