Full-stack Web Development with MongoDB, Node.js and AWS

DESCRIPTION

Akira Technologies will share its experience of building a universal, scalable, high-performance platform for conducting surveys. Using MongoDB made it possible to replace dozens of unique survey systems with a single flexible solution, improve data and questionnaire reusability, and simplify data analysis. We will also cover full-stack development and integration with Node.js and Hadoop, deployment to the AWS cloud, offline caching, and stress-testing the entire system with Tsung. A working prototype will be demonstrated, including multiple surveys, a dynamically rebuilt interface, geolocation, data analysis and visualization.

Page 1: Full-stack Web Development with MongoDB, Node.js and AWS

Andrey Mikhalchuk

Chief Architect

Akira Technologies

FULL-STACK WEB DEVELOPMENT WITH MONGODB, NODE.JS AND AWS

Page 2: Full-stack Web Development with MongoDB, Node.js and AWS

ABOUT AKIRA

  • Founded in 2003
  • Technology and management consulting company
  • 80+ employees
  • HUBZone, SBA 8(a), SDB
  • HITSP and NIEM Contributing Member
  • ISO 9001, CMMI Level 3
  • Top Secret Facility Clearance
  • More on http://www.akira-tech.com

Page 3: Full-stack Web Development with MongoDB, Node.js and AWS

ABOUT THIS PRESENTATION

  • Relational databases -> MongoDB
  • Integrating with MongoDB
  • Stress-testing
  • Complications and solutions

Page 4: Full-stack Web Development with MongoDB, Node.js and AWS

ABOUT THE CLIENT

US Census Bureau

“Leading source of quality data about the nation’s people and economy”

Page 5: Full-stack Web Development with MongoDB, Node.js and AWS

PROBLEM TO SOLVE

Page 6: Full-stack Web Development with MongoDB, Node.js and AWS

WHERE DOES THE QUALITY DATA COME FROM?

Surveys:

  • Decennial Survey
  • American Community Survey
  • Survey of Income and Program Participation
  • Current Population Survey
  • Consumer Expenditure Survey
  • National Health Interview Survey
  • American Housing Survey
  • American Time Use Survey
  • Beginning Teacher Longitudinal Study
  • Consumer Expenditure Survey
  • Current Population Survey
  • Housing Vacancy Survey
  • Identity Theft Supplement
  • National Ambulatory Medical Care Survey

Page 7: Full-stack Web Development with MongoDB, Node.js and AWS

PROBLEM

Most of these surveys use unique software to collect and process the data

Page 8: Full-stack Web Development with MongoDB, Node.js and AWS

PROBLEM

  • Different data storage:
      • Text files of different formats
      • Binary data files
      • Relational databases
  • Incompatible data structures
  • Dozens of programming languages
  • Incompatible technologies
  • Closed architectures

Page 9: Full-stack Web Development with MongoDB, Node.js and AWS

AKIRA GOAL

  • Single solution for all surveys
  • Scalable
  • Inexpensive
  • Reusable cross-survey data
  • Works on all devices
  • Flat learning curve for developers

Page 10: Full-stack Web Development with MongoDB, Node.js and AWS

© Akira Technologies, 2013

SURVEY DSL

How to define a new survey

Page 11: Full-stack Web Development with MongoDB, Node.js and AWS

HOW TO DEFINE SURVEY

  • Akira has developed a prototype of an engine that takes a JSON specification and turns it into a feature-rich survey
  • JSON is a simple language that analysts can use to define questions
  • Developers can later enrich the user experience with custom bells and whistles
  • Both analysts and developers can reuse existing functionality in future surveys

Page 12: Full-stack Web Development with MongoDB, Node.js and AWS

DEFINE SURVEY

    survey = {
      name: "decennial",
      type: 'survey',
      title: "Decennial Survey",
      stylesheet: "decennial.css",
      header: "MongoDB PoC Demo",
      footer: "Measuring America …",
      intro: "The Census must count every …",
      questions: []
    }

Page 13: Full-stack Web Development with MongoDB, Node.js and AWS

QUESTIONS

    {
      name: "num_people_1",
      question: "How many people were living or staying …",
      details: "<B>INCLUDE</B> in this number:<UL><LI>…",
      type: "int",
      required: true,
    },{
      name: "additional_people_2",
      question: "Were there any additional people staying …",
      details: "Mark all that apply",
      type: "checkbox",
      options: [
        "Children, such as newborn babies or foster children",
        "Relatives, such as adult children, cousins …",
        "Nonrelatives",
        "People staying here temporarily",
        "No additional people"
      ]
    }
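The slides do not show the engine that turns these definitions into a page, so the following is only a minimal sketch of the idea. The function names (`escapeHtml`, `renderQuestion`, `renderSurvey`), the generated markup and the `question_` id prefix are assumptions made for illustration, not Akira's actual implementation.

```javascript
// Hypothetical renderer; the real Akira engine is not shown in the slides.

// Escape text so that question wording cannot break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn one question definition into an HTML fragment.
function renderQuestion(q) {
  var parts = ['<div id="question_' + q.name + '">'];
  parts.push('<p>' + escapeHtml(q.question) + '</p>');
  if (q.details) {
    // "details" already contains markup in the slide example, so it is not escaped.
    parts.push('<div class="details">' + q.details + '</div>');
  }
  if (q.type === 'checkbox') {
    (q.options || []).forEach(function (option) {
      parts.push('<label><input type="checkbox" name="' + q.name +
        '" value="' + escapeHtml(option) + '"> ' + escapeHtml(option) + '</label>');
    });
  } else {
    // "int" becomes a numeric field; any other type falls back to plain text.
    var inputType = q.type === 'int' ? 'number' : 'text';
    parts.push('<input type="' + inputType + '" id="' + q.name + '" name="' + q.name + '"' +
      (q.required ? ' required' : '') + '>');
  }
  parts.push('</div>');
  return parts.join('\n');
}

// Render the survey title followed by all of its questions.
function renderSurvey(survey) {
  return ['<h1>' + escapeHtml(survey.title) + '</h1>']
    .concat(survey.questions.map(renderQuestion))
    .join('\n');
}

console.log(renderSurvey({
  title: 'Decennial Survey',
  questions: [{ name: 'num_people_1', question: 'How many people …', type: 'int', required: true }]
}));
```

Because the definition is plain data, adding a question type in a sketch like this means adding one branch to `renderQuestion` rather than building a new page.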

Page 14: Full-stack Web Development with MongoDB, Node.js and AWS

BELLS AND WHISTLES

    validator: function(s) {
      // Accept only household sizes from 1 to 4
      var n = parseInt(s, 10);
      return n > 0 && n < 5 ? null : "Enter value in [1,4] range";
    },

    decorator: function(q) {
      // Show one person block per household member and hide the rest
      var count = parseInt($("input#num_people_1").val(), 10);
      for (var i = 1; i <= 4; i++) {
        if (i > count) {
          $("div#question_person_" + i).hide();
        } else {
          $("div#question_person_" + i).show();
        }
      }
    }

Page 15: Full-stack Web Development with MongoDB, Node.js and AWS

ACTION!

Let’s see how this code works: http://census.akira-tech.com/survey/8

  • Contains all Decennial questions
  • Written in <1 hour
  • Validates input in real time
  • Updates interface dynamically

Page 16: Full-stack Web Development with MongoDB, Node.js and AWS

Page 17: Full-stack Web Development with MongoDB, Node.js and AWS

ACTION!

Take a look at another example: http://census.akira-tech.com/survey/9

  • Contains most ACS questions
  • Written in <1 hour
  • Supports geolocation

Page 18: Full-stack Web Development with MongoDB, Node.js and AWS

Page 19: Full-stack Web Development with MongoDB, Node.js and AWS

MORE CUSTOMIZATIONS

  • We didn’t have to create a new database schema
  • We didn’t have to create a new web interface
  • We can customize all aspects of the survey process
  • Here is another great example:
      http://census.akira-tech.com/test/10000001
      http://census.akira-tech.com/test_dashboard/sat_math

Page 20: Full-stack Web Development with MongoDB, Node.js and AWS

Page 21: Full-stack Web Development with MongoDB, Node.js and AWS

MOBILE VERSION

Page 22: Full-stack Web Development with MongoDB, Node.js and AWS

MOBILE VERSION

  • Speaking of geolocation … Let’s take a look at the iPad version.
  • Android/Windows versions will look similar
  • It looks basically the same, not surprisingly
  • Surprisingly, the iPad is not even connected to the internet!
  • Let’s complete the survey anyway and submit the results
  • Enjoy the result!

Page 23: Full-stack Web Development with MongoDB, Node.js and AWS

Page 24: Full-stack Web Development with MongoDB, Node.js and AWS

OFFLINE MODE

  • Let’s shut down the browser, so it’s not in RAM anymore
  • Let’s connect the iPad to the internet
  • Get back to the page
  • Isn’t what you see awesome?!
  • This is complete offline mode out of the box
  • The interface is the same ->
      • No re-learning
      • Single interface for CAPI, WAPI, CATI and more
      • Less code to maintain
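The slides do not say how offline mode is implemented. One common approach, sketched here purely as an assumption, is to keep completed responses in persistent browser storage and send them once a connection is available. `createOfflineQueue`, `createMemoryStorage` and the synchronous `send` callback are hypothetical names; a real upload would be asynchronous.

```javascript
// Hypothetical offline queue. "storage" is anything with getItem/setItem,
// for example window.localStorage in a browser.
function createOfflineQueue(storage, key) {
  function load() {
    return JSON.parse(storage.getItem(key) || '[]');
  }
  function save(items) {
    storage.setItem(key, JSON.stringify(items));
  }
  return {
    // Persist a completed response until it can be sent.
    enqueue: function (response) {
      var items = load();
      items.push(response);
      save(items);
    },
    // Try to send every stored response; keep the ones "send" could not deliver.
    // "send" returns true on success. Returns the number of responses still waiting.
    flush: function (send) {
      var remaining = load().filter(function (item) {
        return !send(item);
      });
      save(remaining);
      return remaining.length;
    },
    size: function () {
      return load().length;
    }
  };
}

// In-memory stand-in for localStorage, so the sketch also runs outside a browser.
function createMemoryStorage() {
  var data = {};
  return {
    getItem: function (k) {
      return Object.prototype.hasOwnProperty.call(data, k) ? data[k] : null;
    },
    setItem: function (k, v) {
      data[k] = String(v);
    }
  };
}

var queue = createOfflineQueue(createMemoryStorage(), 'pending_responses');
queue.enqueue({ survey: 'decennial', answers: { num_people_1: 3 } });
console.log(queue.size());
```

Because the queue lives in storage rather than in page memory, a stored response would still be there after the browser is closed and reopened, which matches the behaviour shown in the demo.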

Page 25: Full-stack Web Development with MongoDB, Node.js and AWS

Page 26: Full-stack Web Development with MongoDB, Node.js and AWS

ANALYTICS

After some time this system could collect billions of records. How do we process them?

Page 27: Full-stack Web Development with MongoDB, Node.js and AWS

DATA PROCESSING

  • Hadoop allows distributed data processing on thousands of servers
  • Cloudera Manager and AWS CLI allow deploying hundreds of servers in minutes
  • We have deployed a cluster of 3 nodes in AWS. Let’s see how it can be reconfigured

Page 28: Full-stack Web Development with MongoDB, Node.js and AWS

LET’S PROCESS SOME DATA

  • There are a lot of options for processing data in Hadoop:
      • Hive – data warehouse infrastructure for data query and analysis
      • MapReduce – programming model for large-scale data processing
      • Pig – high-level platform for creating MapReduce programs
  • In this demo we chose Pig as the simplest way to demonstrate the power of Hadoop
  • This script loads data from MongoDB, calculates statistics and pushes them back to MongoDB

Page 29: Full-stack Web Development with MongoDB, Node.js and AWS

SOME PIG LATIN HERE

    -- Register the MongoDB Java driver and the mongo-hadoop connector
    REGISTER /home/ec2-user/Distr/mongo-2.10.1.jar;
    REGISTER /usr/lib/hadoop-0.20/lib/mongo-hadoop-core_0.20.205.0-1.1.0.jar;
    REGISTER /usr/lib/hadoop-0.20/lib/mongo-hadoop-pig_0.20.205.0-1.1.0.jar;

    -- Load the invites collection with an explicit schema
    raw = LOAD 'mongodb://master:27017/mongodb_poc.invites'
      USING com.mongodb.hadoop.pig.MongoLoader('u__id:chararray, token:chararray, email:chararray, survey:chararray, first_name:chararray, last_name:chararray, address:chararray, interview_mode:chararray, fr_id:int, processed:chararray, state:chararray')
      AS (id, token, email, survey, first, last, address, mode, fr, processed, state);

    raw_limited = LIMIT raw 3000;
    DUMP raw_limited;

    -- Keep processed invites only
    raw_filtered = FILTER raw BY processed == 'true';
    -- COUNT operates on a bag, so group all rows before counting
    all_processed = GROUP raw_filtered ALL;
    total_processed = FOREACH all_processed GENERATE COUNT(raw_filtered);
    total_by_state = GROUP raw_filtered BY state;
    DUMP total_by_state;
    DUMP total_processed;

    -- Push the per-state groups back to MongoDB
    STORE total_by_state INTO 'mongodb://master:27017/mongodb_poc.statistics'
      USING com.mongodb.hadoop.pig.MongoStorage();
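For readers who do not know Pig Latin, the filter, group and count steps of the script can be mirrored in a few lines of plain JavaScript over an in-memory array. This is only an illustration of what the script computes, not how Hadoop executes it; `totalsByState` and the sample records are made up for the example.

```javascript
// In-memory illustration of the Pig steps; not a replacement for Hadoop.
function totalsByState(invites) {
  // FILTER raw BY processed == 'true'
  var processed = invites.filter(function (r) {
    return r.processed === 'true';
  });
  // GROUP ... BY state, reduced here to one counter per state
  var byState = {};
  processed.forEach(function (r) {
    byState[r.state] = (byState[r.state] || 0) + 1;
  });
  return { totalProcessed: processed.length, byState: byState };
}

// Hypothetical sample records with only the two fields the steps use.
console.log(totalsByState([
  { state: 'VA', processed: 'true' },
  { state: 'VA', processed: 'true' },
  { state: 'MD', processed: 'false' }
]));
```

The difference in practice is scale: the array version needs all records in one process, while the Pig script is compiled into MapReduce jobs that run across the cluster.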

Page 30: Full-stack Web Development with MongoDB, Node.js and AWS

STATISTICS

  • Both MongoDB and Hadoop provide aggregation frameworks.
      • Hadoop works best for slowly crunching humongous quantities of data
      • MongoDB is good for quick calculations on reasonably large (tens of millions of records) scopes
  • Processed data is pushed back to Mongo for storage and future visualization
  • We used Highcharts (a JavaScript charting library) for data visualization
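As a rough idea of what a "quick calculation" looks like on the MongoDB side, here is a sketch of an aggregation pipeline that counts processed invites per state. The stages `$match`, `$group` and `$sort` are standard aggregation operators; the field names are taken from the Pig schema on the earlier slide, and the function name and the choice of collection are assumptions.

```javascript
// Hypothetical pipeline; "processed" and "state" follow the Pig schema shown earlier.
function stateTotalsPipeline() {
  return [
    { $match: { processed: 'true' } },                  // keep processed invites only
    { $group: { _id: '$state', total: { $sum: 1 } } },  // one counter per state
    { $sort: { total: -1 } }                            // largest states first
  ];
}

// In the mongo shell the pipeline could be run as:
//   db.invites.aggregate(stateTotalsPipeline())
console.log(JSON.stringify(stateTotalsPipeline()));
```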

Page 31: Full-stack Web Development with MongoDB, Node.js and AWS

Page 32: Full-stack Web Development with MongoDB, Node.js and AWS

ARCHITECTURE

Page 33: Full-stack Web Development with MongoDB, Node.js and AWS

HOW ARE MULTIPLE SURVEYS POSSIBLE? MONGODB!

  • We use MongoDB
  • Instead of creating a complex database, we store both surveys and responses as documents
  • All surveys are stored in the same collection.
  • If survey questions change, you can still query all versions of the survey in a single query.
  • You can even query totally different survey results on core parameters, like DOB
  • The Mongo cluster is deployed in the cloud
  • We use Amazon Web Services (AWS) as the cloud platform, but can use any other solution as well or build our own
  • Let’s take a look at the deployment
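To make the "one collection, many surveys" point concrete, here is a small sketch with made-up response documents and a tiny in-memory matcher that understands MongoDB-style equality, `$lt` and `$gte` filters on dotted paths. The document layout (`survey`, `version`, `answers.dob`) is an assumption; the slides do not show the real schema.

```javascript
// Hypothetical documents: two versions of one survey plus a different survey.
const responses = [
  { survey: 'decennial', version: 1, answers: { dob: '1980-05-17', num_people_1: 2 } },
  { survey: 'decennial', version: 2, answers: { dob: '1992-11-02', num_people_1: 4, tenure: 'rent' } },
  { survey: 'acs', version: 1, answers: { dob: '1975-01-30', commute_minutes: 35 } }
];

// Resolve a dotted path such as "answers.dob" inside a document.
function getPath(doc, path) {
  return path.split('.').reduce(function (obj, key) {
    return obj == null ? undefined : obj[key];
  }, doc);
}

// Minimal matcher for equality, $lt and $gte conditions (a small subset of MongoDB queries).
function find(docs, filter) {
  return docs.filter(function (doc) {
    return Object.keys(filter).every(function (path) {
      const cond = filter[path];
      const value = getPath(doc, path);
      if (cond !== null && typeof cond === 'object') {
        if ('$lt' in cond && !(value < cond.$lt)) return false;
        if ('$gte' in cond && !(value >= cond.$gte)) return false;
        return true;
      }
      return value === cond;
    });
  });
}

// All versions of one survey in a single query.
console.log(find(responses, { survey: 'decennial' }).length);
// Different surveys queried together on a shared core field.
console.log(find(responses, { 'answers.dob': { $lt: '1985-01-01' } }).length);
```

The second query works across surveys only because both happen to store the date of birth under the same field name, which is the kind of convention the "core parameters" bullet relies on.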

Page 34: Full-stack Web Development with MongoDB, Node.js and AWS

Page 35: Full-stack Web Development with MongoDB, Node.js and AWS

THE POC HARDWARE

In the PoC we use very basic computers:

  • Realtime data collection: 0.613 GB RAM, 30 GB HDD, 1 core
      • For comparison, iPhone 5: 1 GB RAM, 32 GB SSD, dual-core
  • Data processing: 1.7 GB RAM, 6 GB storage, 1 core
      • For comparison, Samsung Galaxy S4: 2 GB RAM, 16 GB storage, quad-core

Page 36: Full-stack Web Development with MongoDB, Node.js and AWS

WEBSERVER(S)

  • We use Node.js as the web server
  • All code is written in JavaScript
  • Node.js + MongoDB + Akira Survey DSL = everything is written in JavaScript. The learning curve is almost flat
  • Validators in the survey definition can be used on both the client side and the server side, so there is no need to duplicate code
  • Node.js is extremely fast, but Nginx is faster for static content. We use it as a surrogate Content Delivery Network (CDN)
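A minimal sketch of how one validation routine could serve both sides: the same file can be loaded in the browser with a script tag and in Node.js with `require`. Only the validator body comes from the earlier slide; `validateResponse`, the error message for missing answers and the export guard are assumptions for illustration.

```javascript
// Hypothetical survey fragment; only the validator mirrors the earlier slide.
var survey = {
  name: 'decennial',
  questions: [{
    name: 'num_people_1',
    required: true,
    validator: function (s) {
      var n = parseInt(s, 10);
      return n > 0 && n < 5 ? null : 'Enter value in [1,4] range';
    }
  }]
};

// Run every validator and return { questionName: errorMessage }.
// An empty object means the answers passed all checks.
function validateResponse(surveyDef, answers) {
  var errors = {};
  surveyDef.questions.forEach(function (q) {
    var value = answers[q.name];
    var missing = value === undefined || value === null || value === '';
    if (missing) {
      if (q.required) errors[q.name] = 'This question is required';
      return;
    }
    if (typeof q.validator === 'function') {
      var message = q.validator(value);
      if (message) errors[q.name] = message;
    }
  });
  return errors;
}

// Export for Node.js; in a browser the function is simply a global.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateResponse: validateResponse };
}

console.log(validateResponse(survey, { num_people_1: '7' }));
```

In the browser the result would drive inline error messages; on the server the same call would decide whether a submitted response is stored or rejected.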

Page 37: Full-stack Web Development with MongoDB, Node.js and AWS

ACTION!

Let’s go to the AWS console:

https://console.aws.amazon.com/ec2/v2/home?region=us-west-2#Instances

  • Master, Node1, Node2 all run the same configuration: Mongo Config, Mongo Router, MongoDB + Node.js
  • Node3 is identical to the other nodes except that it typically doesn’t run a Mongo config server
  • All nodes are load-balanced with Elastic Load Balancer (ELB)
  • If we turn on Node3, it will be automatically included in load balancing
  • MongoDB content is partitioned across all nodes; this is called sharding

Page 38: Full-stack Web Development with MongoDB, Node.js and AWS

PRODUCTION CONFIGURATION

PoC is different from the production configuration:

  • Slow/cheap servers
  • Only a few nodes in clusters
  • Runtime and data warehouse in the same cloud
  • Simplified security
  • No real CDN
  • No Mongo replicas

Let’s take a look at how this could work in production

Page 39: Full-stack Web Development with MongoDB, Node.js and AWS

Page 40: Full-stack Web Development with MongoDB, Node.js and AWS

MONITORING

How can we make sure the system works properly, predict failures and avoid them?

Page 41: Full-stack Web Development with MongoDB, Node.js and AWS

AWS CONSOLE

  • Provides status of all your servers
  • Allows shutting them down when you don’t need them and bringing them back in minutes
  • You can take software running on a server and move it to a more powerful computer in minutes
  • Also provides vital statistics about all your servers

Page 42: Full-stack Web Development with MongoDB, Node.js and AWS

Page 43: Full-stack Web Development with MongoDB, Node.js and AWS

Page 44: Full-stack Web Development with MongoDB, Node.js and AWS

Page 45: Full-stack Web Development with MongoDB, Node.js and AWS

MONGODB

  • Many free open-source and commercial tools
  • Here is just one example
  • Provides comprehensive statistics on all aspects of MongoDB cluster performance

Page 46: Full-stack Web Development with MongoDB, Node.js and AWS

Page 47: Full-stack Web Development with MongoDB, Node.js and AWS

HADOOP

  • Multiple commercial and open-source solutions for monitoring and managing Hadoop clusters
  • Cloudera Manager – deploys nodes in EC2 in bulk
  • Here is just the standard out-of-the-box Hadoop web interface for monitoring cluster health

Page 48: Full-stack Web Development with MongoDB, Node.js and AWS

Page 49: Full-stack Web Development with MongoDB, Node.js and AWS

Page 50: Full-stack Web Development with MongoDB, Node.js and AWS

Page 51: Full-stack Web Development with MongoDB, Node.js and AWS

Page 52: Full-stack Web Development with MongoDB, Node.js and AWS

PERFORMANCE

How can we guarantee the system will withstand real-life load?

Page 53: Full-stack Web Development with MongoDB, Node.js and AWS

STRESS TESTING

  • We use Tsung to stress-test the system, creating a load of up to 40,000 simultaneous users.
  • Even on a single laptop the system was serving 250+ responses per second with an average response time < 1/3 sec
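As a back-of-the-envelope reading of the laptop figures, Little's law (average requests in flight = throughput × average response time) gives an idea of how many requests were being handled at once. The calculation below uses the slide's numbers as round values, so the result is only an estimate; `requestsInFlight` is a name made up for this sketch.

```javascript
// Little's law: average requests in flight = throughput x average response time.
function requestsInFlight(throughputPerSec, avgResponseSec) {
  return throughputPerSec * avgResponseSec;
}

// 250 responses per second at 1/3 second each, as quoted on the slide.
var inFlight = requestsInFlight(250, 1 / 3);
console.log(inFlight.toFixed(1)); // 83.3
```

The gap between roughly 83 requests in flight and tens of thousands of simulated users is expected, since most simulated users are idle between requests at any given moment.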

Page 54: Full-stack Web Development with MongoDB, Node.js and AWS

SCALABILITY

  • What if you need even better performance?
  • Scale vertically. Every node can be shut down and restarted on more powerful hardware, up to 32 cores, 117 GB RAM, 2 TB SSD
  • Scale horizontally. Hundreds of copies can be deployed in <1 hr. Akira has already built the infrastructure that supports plugging in hundreds of nodes
  • All scaling operations are either automated or could be automated with AWS CLI scripts
  • Add reliability by adding replicas to MongoDB

[Figure: scaling diagram with axes labeled "Nodes" and "CPU Cores"]

Page 55: Full-stack Web Development with MongoDB, Node.js and AWS

Page 56: Full-stack Web Development with MongoDB, Node.js and AWS

INTEGRATION

Even the best system has limited use if we can’t integrate it with other systems.

Page 57: Full-stack Web Development with MongoDB, Node.js and AWS

INTEGRATION

  • Our PoC code provides a complete REST API for manipulating surveys and responses
  • As we have demonstrated before, we can easily integrate it with Oracle Service Bus and SOAP-based services
  • Data can be extracted into an Oracle database from both MongoDB and Hadoop
  • SAS and R can be used to process data; both integrate with Hadoop and MongoDB
  • OPA:
      • Hadoop can write data into .csv files for OPA batch processing
      • OPA batch processor can output to .csv for Hadoop/MongoDB consumption
      • OPA Determinations engine can be queried from Hadoop MR tasks
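Several of these integrations exchange data as .csv files. As a small sketch of that hand-off, the function below turns an array of records into CSV text with RFC 4180 style quoting. The column layout that OPA or any other consumer expects is not given in the slides, so `toCsv`, `csvField` and the sample columns are illustrative assumptions.

```javascript
// Quote one CSV field: wrap it in double quotes when it contains a comma,
// a quote or a line break, and double any embedded quotes.
function csvField(value) {
  var text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Build CSV text with a header row followed by one line per record.
function toCsv(records, columns) {
  var lines = [columns.map(csvField).join(',')];
  records.forEach(function (record) {
    lines.push(columns.map(function (column) {
      return csvField(record[column]);
    }).join(','));
  });
  return lines.join('\r\n') + '\r\n';
}

// Hypothetical per-state totals, similar in shape to the statistics computed earlier.
console.log(toCsv(
  [{ state: 'VA', total: 12 }, { state: 'Washington, DC', total: 3 }],
  ['state', 'total']
));
```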

Page 58: Full-stack Web Development with MongoDB, Node.js and AWS

COST

How much does this solution cost?

Page 59: Full-stack Web Development with MongoDB, Node.js and AWS

SOFTWARE

  • Only open source software is used in this PoC
  • Most software can be used absolutely for free
  • Some has a nominal fee, typically within the $1000 range
  • All software has commercial licenses providing support.

Page 60: Full-stack Web Development with MongoDB, Node.js and AWS

CLOUD SERVICES

  • If you don’t use it, you don’t pay for it
  • Hundreds of nodes can be deployed from the images we have prepared
  • Hot-plug nodes can be preconfigured and shut down for future use
  • This is how much we paid for our servers for 3 weeks:
      https://portal.aws.amazon.com/gp/aws/developer/account?ie=UTF8&action=activity-summary

Page 61: Full-stack Web Development with MongoDB, Node.js and AWS

Page 62: Full-stack Web Development with MongoDB, Node.js and AWS

SUMMARY

  • Effortlessly supports multiple surveys
  • Extremely scalable, leverages cloud
  • Low-cost open source
  • Easily integrates with Oracle DB, Services, OPA
  • Tested to handle stress loads
  • Supports online and offline interview modes

Page 63: Full-stack Web Development with MongoDB, Node.js and AWS

CONTACT

Akira Technologies, Inc
1747 Penn Ave NW #600
Washington, DC 20006
P: 202.517.7187
F: 800.589.3129
E: [email protected]
W: www.akira-tech.com