New Horizons for Complex Event Processing

Daniel Chait

Complex event processing is already viewed by many as an integral part of today’s auto/algo trading platforms. Using market data handling as an example, Daniel Chait, Managing Director, Lab49, a financial markets technology consultancy, asserts that CEP has a much wider range of applications.

Over the past few years, complex event processing (CEP) technology has grown from a niche category rooted in academia to one of the central pillars in the modern financial services technology stack. In this article, I will discuss the reasons why this new technology has become so widely adopted, demonstrate one of its main new uses, and preview where I see this industry headed.

History and trends

To understand CEP, and the central role it has come to play in financial services, it is useful to consider the broader market and technology environments within which this development has taken place. Over the past decade, data volumes have skyrocketed, new sources of liquidity have mushroomed, sell-side firms have created internal crossing networks, trading has become increasingly sophisticated and trading volumes in complex derivatives have spiked, collectively far outstripping the stock markets.

For these reasons, the old, batch-based technology model no longer suffices. In that model, firms would record their activity throughout the day, typically using a relational database, and then run a batch of programmes at the end of the day to process this data. For example, a fund would send orders to its broker throughout the trading day, storing these in a position-keeping database along the way. Then, at the close of the day, they (or their prime broker on their behalf) would re-compute their value at risk (VaR) as well as their net exposures to various factors, and use these closing values the next day to inform their trading decisions. The problem is that those nightly numbers are quickly out of date. As markets sped up and volumes rose, traders began demanding these numbers again in the middle of the day, then several times per day. These intraday batches came to be requested so frequently that systems designed to handle a single nightly run simply could not scale up to support the business needs.

This model has yielded to a new paradigm of continuous, event-driven systems. In this design, rather than store data for processing at a later stage, we set up the processes and then, as data arrives, act upon it immediately. For example, we might create a system to continuously re-price off-the-run US Treasury bonds using a model dictating that each bond should be priced according to a calculated spread from a corresponding on-the-run bond. Then, with each new tick in the price of the on-the-run bond, we trigger a cascade of analytic code to re-calculate the new prices for all the related off-the-run bonds. CEP engines are designed to enable this continuous, event-driven model.
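To make the event-driven model concrete, here is a minimal sketch in Python (the article names no particular product or language): a static spread model maps each off-the-run issue to a spread over its on-the-run benchmark, and every benchmark tick triggers the re-pricing cascade. The instrument identifiers, spreads and handler wiring are all illustrative assumptions, not any vendor’s API.

```python
# A toy event-driven re-pricer. Illustrative only: the instruments and
# spreads are invented, and a real CEP engine would supply the event plumbing.

# Off-the-run issues quoted as a yield spread (basis points) over their
# on-the-run benchmark.
SPREADS_BPS = {
    "UST-10Y-OTR": {"OFFRUN-A": 3.5, "OFFRUN-B": 5.0},
}

def publish(instrument, the_yield):
    print(f"re-priced {instrument}: {the_yield:.3f}%")

def on_tick(benchmark, benchmark_yield):
    """Each benchmark tick immediately triggers the re-pricing cascade."""
    for instrument, spread_bps in SPREADS_BPS.get(benchmark, {}).items():
        publish(instrument, benchmark_yield + spread_bps / 100.0)

on_tick("UST-10Y-OTR", 4.125)   # one incoming event, acted on at once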

Buy side takes control

Another interesting effect of the acceleration in the markets is that buy-side firms have been pressured to adopt cutting-edge technology, which for a long time was the purview only of the large sell-side institutions. And as buy-side firms look to take more control over their trading functions (for example direct market access, high-speed algorithmic trading, real-time risk), they are increasingly developing tighter technology integration with their broker/dealers. These two effects have demanded increasing sophistication of their IT operations. CEP, as one of the key new technologies enabling real-time financial services, is seeing rapid adoption within buy-side firms as part of this overall trend.

CEP first gained widespread adoption in automated and algorithmic trading. Auto/algo trading systems are ideally suited to CEP for several reasons: they often require low latency and high throughput; they employ complex logic in response to outside market events; and they frequently require connections to several outside trading venues. Before the advent of CEP engines, firms had to write much of this code themselves, adding to the cost, complexity and risk of building algorithmic trading solutions. Now, using a CEP platform, developers can focus on creating the business logic to implement a particular algorithm and let the platform handle details such as connecting to exchanges, triggering trading decision logic and maintaining consistency even in the face of network outages, hardware failures, etc.

Additionally, certain features of CEP products are specifically geared towards automated and algorithmic traders. For example, firms need a way to test out their trading strategies and execution algorithms prior to deployment. Most CEP products provide sophisticated features for feeding in simulated data, recording and playing back live data, and debugging applications, providing a rich environment for teams to develop, test and refine their strategies.
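As a rough illustration of the record-and-playback idea (not any vendor’s actual tooling), the sketch below replays captured ticks through strategy logic while compressing the original inter-arrival gaps; every name in it is hypothetical.

```python
import time

def replay(recorded_ticks, on_event, speed=10.0):
    """Feed captured (timestamp, payload) events into a strategy callback,
    compressing the original gaps by `speed` (10.0 = ten times faster)."""
    prev_ts = None
    for ts, payload in recorded_ticks:
        if prev_ts is not None:
            time.sleep(max(0.0, (ts - prev_ts) / speed))
        prev_ts = ts
        on_event(payload)

# Two captured ticks, one second apart, replayed at 10x.
ticks = [(0.0, {"sym": "VOD.L", "px": 152.10}),
         (1.0, {"sym": "VOD.L", "px": 152.15})]
replay(ticks, lambda event: print("strategy sees:", event))
```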

Expanding CEP use cases

Because CEP was first adopted for algorithmic trading systems, it is still often thought of primarily in those terms. In fact, however, there are many other interesting applications of CEP across the financial markets, and firms are extending their use of the technology to many new areas.

To demonstrate how and why CEP engines are being deployed to meet new challenges, we’ll take the example – drawn from recent Lab49 projects – of a system for handling market data within a hedge fund. Firms engaged in trading activity must connect to several sources of market data, run checks on that data to ensure that it is clean and valid, and convert it to standard formats. Additionally, they would like notifications when problems occur, such as a feed becoming unavailable. By architecting a market data system around a CEP engine, firms can achieve these goals faster and better than they otherwise could (Figure 1 illustrates such a system). Here’s why:

• Feed handling – CEP platforms all come with many pre-built feed handlers for common market data formats, wherein the vendors have already done much of the heavy lifting involved in connecting to the sources, parsing their proprietary data formats, handling control messages and re-connects, etc. In cases where clients want to connect to a data source for which the vendor does not already have a feed handler, the platform will provide a framework for creating new input adapters to plug in additional market data sources.

• Data standardisation – CEP engines can easily enable canonicalisation (the process of standardising data about the same entity from different sources within a common reference schema). Examples of data that may need canonicalisation include order size fields represented in lots of 1,000 versus single units, different date formats, various ways to represent complex structured products, etc. Through their event models and schema definition languages, CEP engines allow developers to easily define their desired canonical format and then to build sophisticated transformation code that converts each incoming message as it arrives (the first sketch following this list illustrates the idea).

• Cleansing and validation – The central programming paradigm enabled by CEP platforms is very well suited to the cleansing and validation of market data. Essentially, cleansing and validating market data comes down to recognising patterns in the data and filtering out anomalous events which fall outside recognised parameters. A classic technique is to keep a trailing average over a window of recent values and to reject any new events which fall too far outside the expected distribution (see the second sketch following this list). These bad values can result from any number of errors, including fat-finger problems and transmission glitches. The programming languages and design patterns incorporated into CEP engines are ideally suited to solving this problem, as they include the inherent capability for recognising patterns, creating rolling time-based data sets and computing statistics based upon streams of messages.

• Alerts – A market data system should be able to send notifications when problems occur. CEP engines can help with this by using their ability to monitor streams and detect patterns, and their features for hooking into email servers, posting messages to web services and other such integration points for notifications and alerts.

• Data delivery – Having accepted market data feeds and performed the relevant cleansing and normalisation, a market data system must handle delivery of the data to other systems, which will then consume it (these may include P&L systems, auto/algo trading systems, analytic spreadsheets and more). This brings two main challenges which CEP engines can solve. First, delivery poses an integration problem, resulting from the fact that there are a host of downstream systems that may consume the data, each with their own application programming interfaces (APIs). Just as CEP engines include both pre-built input adapters and adapter frameworks which help when connecting incoming data streams, they also include output adapters for popular systems (including message buses, relational databases and distributed data caches) as well as toolkits for easily creating custom output adapters to proprietary systems. Second, market data delivery presents challenges in producing data for different consumption patterns. For example, if the data is being delivered to a high-speed trading platform which is using it to make automated trading decisions based on sub-second arbitrage opportunities, the delivery system needs to produce data as fast as possible. On the other hand, if that same market data is updating a spreadsheet that a trader watches to help make investment decisions, it may not need to be updated more than once every few minutes or so, since humans cannot process and react to data that fast. CEP engines support the capability to throttle output streams according to these types of constraints (see the third sketch following this list), allowing customers to readily access very high performance when needed and to limit message traffic, and thus network load, for less sensitive applications.
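Picking up the data standardisation bullet, here is a minimal canonicalisation sketch. The canonical schema, the source names feedA/feedB and their quirks (sizes in lots of 1,000, differing date formats) are invented for illustration.

```python
from datetime import datetime

# Per-source quirks (illustrative): feedA sizes are in lots of 1,000 and its
# dates are day-first; feedB uses single units and compact ISO dates.
LOT_SIZE = {"feedA": 1000, "feedB": 1}
DATE_FMT = {"feedA": "%d/%m/%Y", "feedB": "%Y%m%d"}

def canonicalise(source, msg):
    """Map a source-specific message onto one common reference schema."""
    return {
        "symbol": msg["sym"].upper(),
        "size": msg["size"] * LOT_SIZE[source],    # always single units
        "price": float(msg["px"]),
        "trade_date": datetime.strptime(msg["date"], DATE_FMT[source]).date(),
    }

print(canonicalise("feedA",
                   {"sym": "vod.l", "size": 5, "px": "152.1", "date": "14/01/2008"}))
```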
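The cleansing and validation bullet describes keeping a trailing window of values and rejecting outliers. A minimal sketch of that technique follows, assuming a 50-tick window and a three-sigma rejection threshold (both arbitrary choices).

```python
from collections import deque
from statistics import mean, stdev

class TrailingFilter:
    """Reject ticks that fall too far outside a rolling window of recent values."""
    def __init__(self, window=50, n_sigmas=3.0):
        self.values = deque(maxlen=window)
        self.n_sigmas = n_sigmas

    def accept(self, price):
        if len(self.values) >= 2:
            mu, sigma = mean(self.values), stdev(self.values)
            if sigma > 0 and abs(price - mu) > self.n_sigmas * sigma:
                return False            # likely fat-finger or transmission glitch
        self.values.append(price)       # only clean values extend the window
        return True

f = TrailingFilter()
for px in [100.0, 100.1, 99.9, 100.2, 1000.0, 100.1]:
    print(px, "accepted" if f.accept(px) else "rejected")
```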
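Finally, the data delivery bullet mentions throttling output streams to match consumption patterns. The sketch below conflates updates per symbol, passing every tick to a latency-critical consumer while limiting a spreadsheet feed to one update per interval; the class and sink are hypothetical, not any product’s API.

```python
import time

class ThrottledPublisher:
    """Emit at most one update per symbol every `interval` seconds;
    interval=0.0 passes every tick straight through."""
    def __init__(self, sink, interval):
        self.sink = sink
        self.interval = interval
        self.last_sent = {}

    def update(self, symbol, price):
        now = time.monotonic()
        if now - self.last_sent.get(symbol, float("-inf")) >= self.interval:
            self.last_sent[symbol] = now
            self.sink(symbol, price)
        # otherwise drop the tick; the consumer sees the next one that
        # arrives after the interval has elapsed

fast = ThrottledPublisher(print, interval=0.0)    # algo engine: every tick
slow = ThrottledPublisher(print, interval=120.0)  # trader screen: every 2 min
for px in (152.10, 152.15, 152.20):
    fast.update("VOD.L", px)
    slow.update("VOD.L", px)
```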

Other use case categories, including pricing, real-time risk and P&L calculation, and portfolio management, follow many of the same patterns and are equally compelling.


Figure 1: Architecture of a CEP-based market data system


CEP’s inflection point

Prior to 2007, very few in the finance sector were aware of CEP products, and even fewer had actually begun to implement such systems within a few specialised areas. 2007 was a breakout year in which the volume of articles, conferences and events on CEP was matched by the arrival of new vendors and use cases, confirming its status as a mainstream technology. Based on an understanding of the sector developed through Lab49’s partnerships with vendors and implementations with clients, I sense that CEP is at a major inflection point right now.

Looking out to 2008 and beyond, I expect CEP technologies to mature from workgroup- or desk-level tools to enterprise-wide platforms. Tools have started to evolve to support the demands placed on other enterprise technologies, such as: high availability; fault tolerance; better deployment, management and configuration tools; and tighter integration with entitlement, authentication and security infrastructure.

At the same time, customers are pushing for more sophisticated end-user tools (customisable dashboards, Excel integration, ad hoc query and business intelligence tools, etc.) to lessen their reliance on IT departments every time a business user needs to see some data. For instance, the most recent version upgrades to StreamBase include features for better error handling, finer-grained security and support for several different fault tolerant deployment scenarios. Likewise, Coral8 has recently added 64-bit Windows support, an updated portal product, real-time Excel integration and enhanced security capabilities.

Lastly, the emergence of related technologies such as low-latency messaging systems, grid computing platforms and data fabrics has led to a confusing patchwork of different systems. Vendors will increasingly look for tighter integration of the various overlapping feature sets these products provide, yielding simpler and more comprehensive suites of platforms from fewer vendors. For instance, BEA has added closer integration with distributed cache products as well as publish/subscribe messaging features to their event server products.

A Brief Buyer’s Guide to CEP

Selecting a CEP product today can be a complicated matter. The various vendors in the market have taken vastly different approaches to several key aspects of their offerings, and it can be hard to sort it all out. Here are a few tips:

Performance – Performance matters, of course. Bear in mind, though, that the performance of a CEP product is composed of several aspects, including plug-in performance, scalability as queries and users are added, and effective monitoring and reporting of performance statistics in real time. Don’t take the vendor’s word for it that their product can handle a certain number of messages per second. Take a test drive under real-world scenarios to see how it actually stacks up.

Features – Feature sets vary greatly. For instance, some products focus on small, fast and lightweight runtime engines, others on massive scalability and high availability. Some include many adapters for plugging in all kinds of outside data, while others leave the coding to you. Other variables include: security; platform-independence; and richness of integrated development environments. Deciding which of these features is most important to you should guide your evaluation process.

Design – From SQL-like to rules-based, CEP products differ in their approach to creating applications in their systems. Some include rich visual programming environments which may or may not be equivalent in power and expressiveness to their text-based programming model. Other differences include: persistent model vs. purely transient; supporting updates and deletes vs. insert-only; and multi- vs. single-threaded.

Platform or solution? – Some products provide significant business-focused applications out of the box, others provide ready-to-use templates to use as starting points for building applications, while others focus on the platform and leave the application development to you. Usually I advise firms to buy the functionality where they do not feel they have a compelling competitive advantage and to build the pieces where they feel they can gain an edge. If purchasing a CEP platform with the intention of using the business-specific modules it provides (e.g. electronic trading, market liquidity, pricing and many more), then take into account the ramifications of using logic that every other customer will also have access to.

