Eagle6 Enterprise Situational Awareness

Confidential & Proprietary


MongoDC 2014

14 Oct 2014


Adam Bell, Director, Product Management
◦ 15+ years enterprise software solutions experience
◦ 10+ years enterprise architecture
◦ Healthcare
◦ Technology
◦ MongoDB user since 2012

Introduction

Meghan Gill

This is a lot of text for a slide. How about a few bullets?


About Rivera Group

Service-Disabled Veteran-Owned Small Business
Minority-Owned Business

Primary NAICS Codes:
◦ 541511
◦ 541512

Established: 2002*

Subject Matter Expertise:

Enterprise Software Development

Business Process Reengineering (BPR)

Proprietary Software: Eagle6 Modeling & Simulation

*Over 30% of Rivera Group Employees Are Veterans


Eagle6 is a modeling tool that automatically collects system data (application code, database schemas, log files, etc.) and provides an ability to continuously monitor for unwanted system states (bugs) that may result in system degradation and/or system outages.

Eagle6 – Enterprise Situational Awareness

Meghan Gill

Would love an image of the product

Meghan Gill

What is the key message or take away you want people to get from this talk? This is a good place to set the stage.


Large sets of multi-dimensional data
Heavy read and write
Heavy audit requirements
Fast, near real-time analytics
Analytics are user driven

Our Use Case


About Our Documents
◦ 1000s of leaves per document
◦ Document sizes (BSON): 2,300 bytes (2.3 KB) to 729,699 bytes (729 KB)
◦ Need to quickly add new data structures

Multi-Dimensional Data


Sample Document (Slice)

"network" : {
    "host" : [
        {
            "host" : {
                "network" : null,
                "ips" : [ "6PNfL9bV7BJO" ],
                "names" : [ ],
                "mx" : [ ],
                "txt" : [ ],
                "srv" : [ ],
                "ns" : [ ]
            }
        },
        {
            "host" : {
                "network" : null,
                "ips" : [ "yqb6q7er3DvWf" ],
                "names" : [ "tQypbmzVrEZHtWG1n" ],
                "mx" : [ ],
                "txt" : [ ],
                "srv" : [ ],
                "ns" : [ ]
            }
        }
    ],
}


Do not drop transactions
Capturing large volumes of real-time data (web access logs, database transactions, etc.)

Read & Write Heavy

Greg Steinbruner

Shard keys are such an important topic when it comes to scaling. Perhaps we could see an architecture slide here, or code relevant to the shard key. It would be great to hear about what you decided to use and specifically why.

Meghan Gill

This is a list of bullets but I'm not entirely sure what you are trying to get across. How about framing it as "this is the problem we encountered" --> how we solved it?


Indexing has been a challenge
MongoDB supports at most 64 indexes per collection
Too many indexes defeat the purpose of indexing
Scenarios exist where we will not know which fields the user needs until they request them

Indexing - Challenges


Key-Value Approach
TODO: Need an example of KV indexes

Indexing - Solution
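
One way to picture the key-value approach is the common "attribute" pattern: every leaf of a nested document is flattened into a {k, v} pair held in a single array, so one compound index on that array can serve queries on fields nobody anticipated. The sketch below is only an illustration under that assumption. The names to_kv, kv_query and the attrs field, as well as the dotted-path convention, are hypothetical and not taken from Eagle6.

```python
from typing import Any, Dict, Iterator, List, Tuple


def _walk(value: Any, path: str) -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, leaf_value) for every leaf found under value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from _walk(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        # Array elements keep the parent's path, mirroring how dot-notation
        # queries see through arrays. Empty arrays produce no pairs.
        for child in value:
            yield from _walk(child, path)
    else:
        yield path, value


def to_kv(doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a nested document into a list of {"k": path, "v": leaf} pairs."""
    return [{"k": k, "v": v} for k, v in _walk(doc, "")]


def kv_query(field: str, value: Any) -> Dict[str, Any]:
    """Build a filter matching documents whose (hypothetical) attrs array
    holds the given pair. $elemMatch keeps k and v tied to the same element."""
    return {"attrs": {"$elemMatch": {"k": field, "v": value}}}


# Assumed index, in the (field, direction) form most drivers accept:
# one compound index covers every flattened field.
KV_INDEX = [("attrs.k", 1), ("attrs.v", 1)]

sample = {"network": {"host": [{"host": {"network": None,
                                          "ips": ["6PNfL9bV7BJO"],
                                          "names": []}}]}}
attrs = to_kv(sample)
```

With this layout, adding a new data structure adds new k values rather than new indexes, which is why the pattern is often used when the queried fields are not known ahead of time.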


Deployment Architecture

Greg Steinbruner

A picture is worth a thousand words here.

Meghan Gill

Agree picture would be better. Also in terms of flow, is this the right place? maybe talk about development considerations and then deployment.


Shard key is an object hash
Goal is equal distribution of data across shards
◦ Example
{
    hash: '00003820efcff8b669b055606813bcd360ace3f43fbf9c129845b3028992eacabcaef8cd13796dc7a96b7a5f38b0efaceaadecfd537c72eaec8a8f9c10a00a1e',
    offset: -1
}

Shard Key
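
The slide does not say how the object hash is computed or what offset means, so the following is only a sketch of the general idea: hash a canonical serialization of the object and use the digest as the key, so that keys spread evenly regardless of the data's natural ordering. SHA-512 is an assumption chosen because its hex digest is about as long as the example value. The function names and the modulo-based shard simulation are hypothetical; a real cluster assigns key ranges to chunks rather than taking a modulus.

```python
import hashlib
import json
from collections import Counter
from typing import Any, Dict, Iterable, List


def object_hash(obj: Any) -> str:
    """Hex digest of a canonical JSON form of obj (SHA-512 is an assumption)."""
    # sort_keys makes the digest independent of dict insertion order.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha512(canonical.encode("utf-8")).hexdigest()


def shard_key(obj: Any, offset: int = -1) -> Dict[str, Any]:
    """Build a key shaped like the slide's example.
    The meaning of offset is not explained in the deck; it is passed through."""
    return {"hash": object_hash(obj), "offset": offset}


def simulate_distribution(docs: Iterable[Any], shard_count: int) -> List[int]:
    """Toy model: bucket documents by hash modulo shard_count and count them."""
    counts = Counter(int(object_hash(d), 16) % shard_count for d in docs)
    return [counts.get(i, 0) for i in range(shard_count)]


docs = [{"client": f"host-{i}", "seq": i} for i in range(4000)]
per_shard = simulate_distribution(docs, 4)
```

Even though the input documents are sequential, the per-shard counts come out close to equal, which is the property the slide is after.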


Needed a flexible way to aggregate
Needed a way to represent aggregations to end users without writing functions
Needed a way to cache frequently run analytics

Real-time User Driven Analytics

Greg Steinbruner

too proprietary to share the code of a particular aggregation?

Meghan Gill

What are you aggregating?


Provides a rich set of operations for aggregating data

We have been using the Aggregation Framework since MongoDB 2.2

Aggregation Pipelines have enabled us to do smart caching

JSON versions of the pipeline allow end users flexibility without writing Map/Reduce code

Analytics - Aggregation Framework
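
Because a pipeline is plain JSON, its serialized form can double as a cache key. The sketch below shows one possible shape for such a cache; it is not the Eagle6 implementation. The PipelineCache class is hypothetical, the run callable stands in for whatever actually executes the pipeline (for example a driver's aggregate call), and invalidation when the underlying data changes is deliberately left out.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

Pipeline = List[Dict[str, Any]]


class PipelineCache:
    """Memoize aggregation results by the JSON text of the pipeline."""

    def __init__(self, run: Callable[[Pipeline], List[Any]]) -> None:
        self._run = run  # stand-in for the real aggregation call
        self._results: Dict[str, List[Any]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(pipeline: Pipeline) -> str:
        # Key order is preserved on purpose: it can be significant
        # (for example in a multi-field $sort), so equivalent pipelines
        # written in a different order simply miss the cache.
        text = json.dumps(pipeline, separators=(",", ":"))
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def aggregate(self, pipeline: Pipeline) -> List[Any]:
        k = self.key(pipeline)
        if k in self._results:
            self.hits += 1
        else:
            self.misses += 1
            self._results[k] = self._run(pipeline)
        return self._results[k]


calls: List[Pipeline] = []


def fake_run(pipeline: Pipeline) -> List[Any]:
    """Placeholder executor that records each call and returns a dummy result."""
    calls.append(pipeline)
    return [{"stages": len(pipeline)}]


cache = PipelineCache(fake_run)
pipeline = [{"$match": {"servers": {"$exists": True}}}]
first = cache.aggregate(pipeline)
second = cache.aggregate(pipeline)
```

The second call is served from the cache without invoking the executor again. A production version would also need an expiry or invalidation policy, which the deck does not describe.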


Example Pipeline

[
    { "$match": { "servers": { "$exists": true } } },
    { "$group": { "_id": "$servers", "clients": { "$addToSet": "$client.names" } } },
    { "$project": { "title": { "$join": "$_id.names" }, "_id": { "id": { "$join": "$_id.ips" } }, "clients": 1, "server": "$_id" } },
    { "$group": { "_id": "$_id", "server": { "$addToSet": "$server" }, "clients": { "$addToSet": "$clients" }, "title": { "$addToSet": "$title" } } },
    { "$unwind": "$clients" },
    { "$unwind": "$clients" },
    { "$project": { "clients": { "$join": "$clients" }, "server": "$server", "title": "$title" } },
    { "$group": { "_id": "$_id", "server": { "$addToSet": "$server" }, "clients": { "$addToSet": "$clients" }, "title": { "$addToSet": "$title" } } },
    { "$unwind": "$title" },
    { "$unwind": "$title" }
]


Ported Aggregation Framework to JavaScript
◦ Custom-built extensions
    Accumulators: $stdDev
    Expressions: $regex, $slice
    Document Sources: $projectPrevious, $split

Pipelines for Everyone
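
To give a feel for what a custom accumulator such as $stdDev involves, here is a minimal sketch of a streaming standard deviation using Welford's online algorithm, which suits a $group stage because it sees one document at a time. The deck does not show the actual extension (which was written in JavaScript), so the class name, the add/result interface and the choice between population and sample variance are all assumptions.

```python
import math
from typing import Optional


class StdDevAccumulator:
    """Running standard deviation via Welford's online algorithm."""

    def __init__(self, sample: bool = False) -> None:
        self.sample = sample  # False: population (divide by n); True: sample (n - 1)
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value: float) -> None:
        """Fold one value into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def result(self) -> Optional[float]:
        """Return the standard deviation, or None when it is undefined."""
        divisor = self.count - 1 if self.sample else self.count
        if divisor <= 0:
            return None
        return math.sqrt(self._m2 / divisor)


acc = StdDevAccumulator()
for v in [2, 4, 4, 4, 5, 5, 7, 9]:
    acc.add(v)
population_sd = acc.result()
```

Keeping only count, mean and the squared-deviation sum means the accumulator needs constant memory per group, however many documents flow through it.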


Questions

Greg Steinbruner

It's not so clear from the deck what the summary message is for the talk. Do you have next steps? Is there a future you can point to? What are you challenging the audience to do or think about having heard the talk?