Big Data Pipeline Lambda Architecture - Batch Layer with AngularJS Java Restful Web Services Apache Hadoop Apache Spark Apache Cassandra on Amazon Web Services Cloud Platform

Big data Lambda Architecture - Batch Layer Hands On

Page 1: Big data Lambda Architecture - Batch Layer Hands On

Big Data Pipeline
Lambda Architecture - Batch Layer

with AngularJS, Java Restful Web Services, Apache Hadoop, Apache Spark, Apache Cassandra

on Amazon Web Services Cloud Platform

Page 2: Big data Lambda Architecture - Batch Layer Hands On

BIG Data Pipeline

Stages: INGEST, STORE, PROCESS, VISUALIZE

Page 3: Big data Lambda Architecture - Batch Layer Hands On

BIG Data Batch Layer Pipeline

Diagram stages: INGEST, STORE, PROCESS, VISUALIZE, with a second STORE step for results.

Components shown: AngularJS Web App, Rest Web Services, Apache Web Logs, Log/Data File, S3, HDFS, Apache Cassandra, Spark Cluster, Spark Engine, Spark SQL, Interactive Queries, and an AngularJS Web App chart (April to July).

Page 4: Big data Lambda Architecture - Batch Layer Hands On

BIG Data Real-Time Layer Pipeline

Diagram stages: INGEST, STREAM, PROCESS, VISUALIZE, with a STORE step for results.

Components shown: AngularJS Web App, ClickStream Data, Apache Web Logs, Log/Data File, TCP Sockets, Apache Kafka, Spark Cluster, Spark Streaming, Spark SQL, S3, HDFS, Apache Cassandra, Interactive Queries, and an AngularJS Web App chart (April to July).

Page 5: Big data Lambda Architecture - Batch Layer Hands On

Install Web Server

Page 6: Big data Lambda Architecture - Batch Layer Hands On

EC2 instance for Web Server

Page 7: Big data Lambda Architecture - Batch Layer Hands On

cat /etc/*-release

sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java8-installer

java -version

mkdir webserver

cd webserver

wget http://www-eu.apache.org/dist/tomcat/tomcat-8/v8.0.36/bin/apache-tomcat-8.0.36.tar.gz

tar xvzf apache-tomcat-8.0.36.tar.gz

ubuntu@ip-172-31-59-137:~/webserver/apache-tomcat-8.0.36/bin$ ./startup.sh

Commands to set up Apache Tomcat 8.0

Page 8: Big data Lambda Architecture - Batch Layer Hands On

Apache Tomcat 8.0 running on EC2 Instance

Page 9: Big data Lambda Architecture - Batch Layer Hands On

Install Apache Cassandra - 3 Node Cluster on AWS

Page 10: Big data Lambda Architecture - Batch Layer Hands On

3 EC2 instance for Cassandra Cluster

Page 11: Big data Lambda Architecture - Batch Layer Hands On

cat /etc/*-release

sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java8-installer

java -version

mkdir db

cd db

wget http://www-eu.apache.org/dist/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz

tar xvzf apache-cassandra-3.0.7-bin.tar.gz

cd apache-cassandra-3.0.7/

bin/cassandra -f

bin/cqlsh

cassandra1 -> 52.87.183.121

cassandra2 -> 52.207.239.229

cassandra3 -> 54.174.185.29

Commands to set up Apache Cassandra 3.0.7. Repeat for all 3 EC2 instances.

Change the following in conf/cassandra.yaml on each node:

cluster_name: 'Test Cluster'

listen_address:

broadcast_address: 54.174.185.29   # this node's own public IP (the value shown is for cassandra3)

seeds: "52.87.183.121,52.207.239.229"

rpc_address:
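The per-node part of this configuration can be sketched in a few lines of Python. This is only an illustration, not part of the original deck: the helper name `yaml_overrides` and the `NODES` / `SEED_NODES` tables are hypothetical, and the sketch assumes, as the slide values suggest, that cassandra1 and cassandra2 act as seeds and that each node broadcasts its own public IP.

```python
# Public IPs of the three nodes, as listed on the slide.
NODES = {
    "cassandra1": "52.87.183.121",
    "cassandra2": "52.207.239.229",
    "cassandra3": "54.174.185.29",
}

# Assumption: the first two nodes are the seeds, matching the slide's seeds value.
SEED_NODES = ("cassandra1", "cassandra2")


def yaml_overrides(node_name, cluster_name="Test Cluster"):
    """Return the cassandra.yaml values to set on one node (hypothetical helper)."""
    if node_name not in NODES:
        raise KeyError(f"unknown node: {node_name}")
    seeds = ",".join(NODES[name] for name in SEED_NODES)
    return {
        "cluster_name": cluster_name,            # same on every node
        "broadcast_address": NODES[node_name],   # differs per node
        "seeds": seeds,                          # same on every node
    }


if __name__ == "__main__":
    for name in NODES:
        print(name, yaml_overrides(name))
```

Only `broadcast_address` changes from node to node; the cluster name and the seed list stay identical across all three instances.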

Page 12: Big data Lambda Architecture - Batch Layer Hands On

3 Node Cassandra Server running on AWS EC2 Instances

Page 13: Big data Lambda Architecture - Batch Layer Hands On

3 Node Cassandra Server running

CREATE KEYSPACE users WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3};

USE users;

CREATE TABLE user( id int PRIMARY KEY, name text );

select * from user;

Page 14: Big data Lambda Architecture - Batch Layer Hands On

AngularJS - Java Restful WebServices Deployed on AWS Cloud

Page 15: Big data Lambda Architecture - Batch Layer Hands On

AngularJS - Java Restful WebServices

Page 16: Big data Lambda Architecture - Batch Layer Hands On

AngularJS - Java Restful WebServices

Page 17: Big data Lambda Architecture - Batch Layer Hands On

AngularJS - Java Restful WebServices

Page 18: Big data Lambda Architecture - Batch Layer Hands On

Tomcat web server log that we will process with Apache Hadoop/Spark
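To make the processing step concrete, here is a minimal sketch of parsing such a log locally in plain Python. The deck does not show the log format, so this assumes the Common Log Format (Tomcat's `common` access log pattern). The function names and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def count_by_status(lines):
    """Count requests per HTTP status code, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is not None:
            counts[fields["status"]] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [01/Jul/2016:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Jul/2016:10:00:01 +0000] "GET /missing HTTP/1.1" 404 -',
    '10.0.0.1 - - [01/Jul/2016:10:00:02 +0000] "POST /api/users HTTP/1.1" 200 57',
    'not a log line',
]

if __name__ == "__main__":
    print(dict(count_by_status(SAMPLE)))
```

Lines that do not match the pattern are skipped rather than raising an error, which is usually the safer default for batch jobs over raw logs.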

Page 19: Big data Lambda Architecture - Batch Layer Hands On

Web log and Python application deployed to an AWS S3 bucket

Page 20: Big data Lambda Architecture - Batch Layer Hands On

Spark job executed on AWS EMR - Spark Cluster

Page 21: Big data Lambda Architecture - Batch Layer Hands On

Results stored in Cassandra Database

Page 22: Big data Lambda Architecture - Batch Layer Hands On

Results stored in AWS S3 Bucket

Page 23: Big data Lambda Architecture - Batch Layer Hands On

Python Application BatchLogAnalyzer.py executed on AWS Spark Cluster
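The deck does not include the source of BatchLogAnalyzer.py, so the following is not the author's code. It is a hedged sketch of what a small PySpark batch job over an access log could look like, assuming Common Log Format input and a per-status request count as the result. The names `status_of`, `run`, and the application name `BatchLogSketch` are hypothetical.

```python
import sys


def status_of(line):
    """Return the HTTP status field of a Common Log Format line, or None."""
    # The status code is the first token after the closing quote of the request.
    parts = line.rsplit('"', 1)
    if len(parts) != 2:
        return None
    tail = parts[1].split()
    return tail[0] if tail and tail[0].isdigit() else None


def run(input_path):
    """Count requests per status code with Spark; returns sorted (status, count) pairs."""
    # Imported here so status_of can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("BatchLogSketch").getOrCreate()
    try:
        lines = spark.sparkContext.textFile(input_path)
        counts = (
            lines.map(status_of)
            .filter(lambda status: status is not None)
            .map(lambda status: (status, 1))
            .reduceByKey(lambda a, b: a + b)
        )
        return sorted(counts.collect())
    finally:
        spark.stop()


if __name__ == "__main__" and len(sys.argv) == 2:
    # The single argument is the input path of the log file.
    for status, count in run(sys.argv[1]):
        print(status, count)
```

A script like this would typically be launched with `spark-submit` and the log location as its only argument. Writing the results to Cassandra or S3, as the earlier slides show, would need an additional connector or output step that this sketch leaves out.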

Page 24: Big data Lambda Architecture - Batch Layer Hands On

Results compared in console and Cassandra Database