A Head Start on Cloud-native Event Driven Applications S. Suhothayan (Suho) Director WSO2 @suhothayan

Event Driven Applications A Head ... - Big Data Days 2019 · Keep data in memory. (most states are short lived) Periodically perform state snapshots to the databases. Use cache to

A Head Start on Cloud-native Event Driven Applications

S. Suhothayan (Suho)

Director, WSO2 (@suhothayan)

Increasing demand is causing disaggregation

Types of System Integration

Notion of Event Driven Applications

Applications that process data arriving via events and streams.
● Respond to actions generated by both users and systems.
● Messages are passed via event handlers or message queues, within the application or across applications.

Application relationships using queues and topics:
● One to One
● One to Many
● Many to One
● Many to Many

Characteristics of Cloud Nativeness

● Developed as microservices.
● Packaged in containers.
● Deployed via continuous delivery workflows.
● Managed on elastic container-based infrastructure.
● Follow agile DevOps processes.

Other Key Characteristics
● Designed as loosely coupled microservices.
● Developed with best-of-breed languages and frameworks.
● Architected with a clean separation of stateless and stateful services.

Source: https://medium.com/walmartlabs/cloud-native-application-architecture-a84ddf378f82

Stateless Event Driven Applications

Processing done based on incoming event content, and if needed, by calling databases and other services.

● Can have multiple instances at the same time.
● Apps can be started and stopped whenever needed.
● Highly scalable!

Can run on Kubernetes with multiple replicas, or even in serverless frameworks.

Deployment of Stateless Apps

When microservices use HTTP/TCP transports

Deployment of Stateless Apps

When microservices use messaging systems (JMS/Kafka/NATS)

Not all apps are stateless!

Some Stateful Use Cases!

● Number of server errors over the last 20 min.
● Same credit card being used in Russia and then in the USA within 1 hour. (Potential fraud)
● Purchasing 3 diamond rings consecutively (high-value transactions) within one hour. (Potential fraud)
● Item delivered but payment not received within 15 min.
● Continuous stock price increase followed by the first drop.

Building Stateful Applications

Apps need to maintain states (remember previous events).

Things to remember
● State should be preserved during system failure.
● State should be shared with new nodes when scaled.

Options
● Use databases to store the state.
● Use a distributed cache to store and replicate state.
● Keep the data in memory.

Using Databases to Store the State

Advantages
● Data can be shared with all the applications.
● Supports transactions.
● Supports rich data definition and query languages.

Disadvantages
● Introduces high latency in the decision-making process.
● Not the best way to implement a state machine.

Using Distributed Cache to Share State

Advantages
● Not limited to the memory of the computing node.
● Fast reads.
● Data is preserved during node failures.

Disadvantages
● Only eventual consistency.
● Not the best for transactions.
● Limited querying support compared to databases.

Keeping Data In-Memory

Advantages
● Very fast read and write access.
● Best for implementing state-machine-like data structures.

Disadvantages
● Memory is limited to the computing node.
● Cannot scale.
● Data is not preserved during system failures.

Best Approach for Stateful Apps

[Table: Database, Cache, and Memory compared on High Performance, Scalability, and Fault Tolerance; the cell values did not survive extraction.]

There is no one tool for all cases!

Solution to High Performance, Scalable, and Fault Tolerant Apps

Use a combination of database, cache, and in-memory state:
● Keep data in memory (most state is short-lived).
● Periodically snapshot state to the database.
● Use a cache to preload data (to improve read performance).

Solve by following the MapReduce philosophy:
1. Map the incoming data.
2. Partition data by key.
3. Process/summarize the data in parallel.
4. If needed, combine the data at the end.
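The four MapReduce-style steps can be sketched in a few lines of Python. This is only an illustration under assumptions, not how any particular framework implements it: the event fields (`custId`, `amount`), the byte-sum partitioner, and the sequential loop standing in for parallel workers are all hypothetical choices.

```python
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Route a key to a partition deterministically (a toy partitioner)."""
    return sum(key.encode("utf-8")) % num_partitions


def process(events, num_partitions=2):
    # 1. Map: extract (key, value) pairs from the raw events.
    mapped = [(e["custId"], e["amount"]) for e in events]

    # 2. Partition by key: each partition owns a disjoint set of keys.
    partitions = defaultdict(list)
    for key, value in mapped:
        partitions[partition_for(key, num_partitions)].append((key, value))

    # 3. Summarize each partition independently; these loops could run
    #    on separate workers because no state is shared between them.
    partials = []
    for _, items in sorted(partitions.items()):
        totals = defaultdict(int)
        for key, value in items:
            totals[key] += value
        partials.append(dict(totals))

    # 4. Combine the partial results at the end.
    combined = {}
    for partial in partials:
        combined.update(partial)  # safe: keys are disjoint across partitions
    return combined


events = [
    {"custId": "a", "amount": 2},
    {"custId": "b", "amount": 5},
    {"custId": "a", "amount": 3},
]
totals = process(events)
```

Because every key lives in exactly one partition, each worker can keep its share of the state in memory without coordinating with the others.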

Deployment of Stateful Apps

When microservices use messaging systems (Kafka/NATS)

How Checkpointing Works

The system takes snapshots periodically and replays data from the source upon failure.
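A minimal sketch of snapshot-and-replay, assuming the source is replayable by offset (as a log-based messaging system would be). The `CountingApp` class and the in-memory `source_log` list are hypothetical stand-ins, not part of any framework.

```python
import copy


class CountingApp:
    """Keeps per-key counts in memory and tracks how many events were applied."""

    def __init__(self):
        self.state = {}
        self.offset = 0  # number of source events already applied

    def apply(self, event):
        self.state[event] = self.state.get(event, 0) + 1
        self.offset += 1

    def snapshot(self):
        # Persist the state together with the source offset it corresponds to.
        return {"state": copy.deepcopy(self.state), "offset": self.offset}

    @classmethod
    def restore(cls, snap, source_log):
        app = cls()
        app.state = copy.deepcopy(snap["state"])
        app.offset = snap["offset"]
        # Replay everything the source delivered after the snapshot.
        for event in source_log[app.offset:]:
            app.apply(event)
        return app


source_log = ["a", "b", "a", "c", "a"]  # stands in for a replayable source

app = CountingApp()
for event in source_log[:3]:
    app.apply(event)
checkpoint = app.snapshot()  # periodic snapshot taken after 3 events
for event in source_log[3:]:
    app.apply(event)  # two more events are processed, then the node fails

recovered = CountingApp.restore(checkpoint, source_log)
```

Storing the offset with the snapshot is what lets the recovered instance resume from the right point without double counting.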

How to implement all this?

Streaming and Event Processing Frameworks

Use frameworks to build high performance, scalable, and fault tolerant event driven applications.

Supported frameworks:

Characteristics Of Typical Big Data Frameworks

● They are massive (need 5-6 nodes for a minimum deployment).
● They need a specialized team to manage them.
● Developers need to go through an approval process to access the framework (center of excellence).

Issues:
● Reduced speed of innovation and productivity.
● Less autonomy for the team.
● High maintenance cost.

Choosing The Best Tool For Microservices Environment?

● Microservices are written, deployed, and managed by two-pizza-sized teams.
● Teams have full autonomy in the way their project runs.
● Focus on agile and fast development.
● Use the best tool for the task.

Each service can be written in the best language for its task; there is no need to force all services to be in Java, Python, or Go.

Introducing A Cloud Native Stream Processor

● Lightweight (low memory footprint and quick startup).
● 100% open source (no commercial features).
● Native support for Docker and Kubernetes.
● Supports agile DevOps workflows and a full CI/CD pipeline.
● Allows event processing logic to be written in an SQL-like query language and via a graphical tool.
● Used by:

Working with Siddhi

● Develop apps using Siddhi Editor.
● CI/CD with build integration and Siddhi Test Framework.
● Running modes
  ○ Embedded in Java/Python apps.
  ○ Microservice on bare metal/VM.
  ○ Microservice in Docker.
  ○ Microservice in Kubernetes (distributed deployment with NATS).

We are happy to announce Siddhi 5.1

With new features such as:

● Export to Docker and Kubernetes.
● New connectors such as gRPC (with Protobuf), S3, and Google Cloud Storage.
● Database caching.
● Support for removing duplicate events.
● Support for complex data transformations with List and Map extensions.
● Unified configuration model.
● Sandbox support to run tests.
● Improved error handling (log/wait/fault-stream).

Streaming SQL

Source/Sink & Streams:

@app:name('Alert-Processor')

@source(type='kafka', ..., @map(type='json'))
define stream TemperatureStream(roomNo string, temp double);

@sink(type='email', ..., @map(type='text'))
define stream AlertStream(roomNo string, avgTemp double);

Window Query with Rate Limiting:

@info(name='AlertQuery')
from TemperatureStream#window.time(5 min)
select roomNo, avg(temp) as avgTemp
group by roomNo
output first every 15 min
insert into AlertStream;

Web Based Graphical Editor

CI/CD Pipeline of Siddhi

Patterns For Event Driven Data Processing

1. Consume and publish events with various data formats.
2. Data filtering and preprocessing.
3. Data transformation.
4. Database integration and caching.
5. Service integration and error handling.
6. Rule processing.
7. Serving online and predefined ML models.
8. Data summarization.
9. Scatter-gather and data pipelining.
10. Realtime decisions as a service (on-demand processing).

Scenario: Order Processing

● Customers place orders.
● Shipments are made.
● Customers pay for the order.

Tasks:
● Process order fulfillment.
● Send alerts on abnormal conditions.
● Send recommendations.
● Throttle order requests when the limit is exceeded.
● Provide order analytics over time.

Consume and Publish Events With Various Data Formats

Supported transports
● NATS, Kafka, RabbitMQ, JMS, Amazon SQS, MQTT, IBM MQ
● HTTP, gRPC, TCP, Email, WebSocket
● Change Data Capture (CDC)
● File, S3, Google Cloud Storage

Supported data formats
● JSON, XML, Avro, Protobuf, Text, Binary, Key-value, CSV

Default JSON mapping

@source(type='mqtt', ..., @map(type='json'))
define stream OrderStream(custId string, item string, amount int);

{"event":{"custId":"15","item":"shirt","amount":2}}

Custom JSON mapping

@source(type='mqtt', ..., @map(type='json', @attributes("$.id", "$.itm", "$.count")))
define stream OrderStream(custId string, item string, amount int);

{"id":"15","itm":"shirt","count":2}
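To make the difference between the two mappings concrete, here is a small Python sketch of what each one does conceptually. It is an illustration only: the helper names are hypothetical, and plain dictionary lookups stand in for the JSONPath expressions (`$.id`, `$.itm`, `$.count`).

```python
import json

# Attribute order of the OrderStream definition.
ATTRIBUTES = ("custId", "item", "amount")


def map_default(payload: str) -> tuple:
    """Default mapping: read attributes by name from the "event" wrapper object."""
    event = json.loads(payload)["event"]
    return tuple(event[name] for name in ATTRIBUTES)


def map_custom(payload: str, keys=("id", "itm", "count")) -> tuple:
    """Custom mapping: read each stream attribute, in order, from a named field."""
    doc = json.loads(payload)
    return tuple(doc[key] for key in keys)


default_event = map_default('{"event":{"custId":"15","item":"shirt","amount":2}}')
custom_event = map_custom('{"id":"15","itm":"shirt","count":2}')
```

Both payloads end up as the same `(custId, item, amount)` tuple, which is the point of a custom mapping: the stream definition stays fixed while the wire format varies.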

Data Filtering and Preprocessing

Filtering
● Value ranges
● String matching
● Regex

Setting Defaults
● Null checks
● Default function
● If-then-else function

define stream OrderStream (custId string, item string, amount int);

from OrderStream[item != "unknown"]
select default(custId, "internal") as custId,
       item,
       ifThenElse(amount < 0, 0, amount) as amount
insert into CleansedOrderStream;
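For readers less familiar with the query language, the same cleansing logic can be written as a plain Python function. This is a sketch of the semantics only (filter, default for nulls, if-then-else), with `None` standing in for a null `custId`; it is not how the engine evaluates the query.

```python
def cleanse(orders):
    """Drop unknown items, default a missing custId, clamp negative amounts to 0."""
    cleansed = []
    for order in orders:
        if order["item"] == "unknown":  # filter: [item != "unknown"]
            continue
        # default(custId, "internal")
        cust_id = order["custId"] if order["custId"] is not None else "internal"
        # ifThenElse(amount < 0, 0, amount)
        amount = 0 if order["amount"] < 0 else order["amount"]
        cleansed.append({"custId": cust_id, "item": order["item"], "amount": amount})
    return cleansed


orders = [
    {"custId": "15", "item": "shirt", "amount": 2},
    {"custId": None, "item": "hat", "amount": -1},
    {"custId": "16", "item": "unknown", "amount": 4},
]
cleansed = cleanse(orders)
```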

Data Transformation

Data extraction
● JSON, Text
    json:getDouble(json, "$.amount") as amount

Reconstruct messages
● JSON, Text
    str:concat('Hello ', name) as greeting

Inline operations
● Math, Logical operations
    amount * price as cost

Inbuilt function calls
● 60+ extensions
    time:extract('DAY', datetime) as day

Custom function calls
● Java, JS
    myFunction(item, price) as discount

Database Integration and Caching

Supported Databases:
● RDBMS (MySQL, Oracle, DB2, PostgreSQL, H2), Redis, Hazelcast
● MongoDB, HBase, Cassandra, Solr, Elasticsearch

Joining table with cache (preloads data for high read performance).

define stream CleansedOrderStream (custId string, item string, amount int);

Table with Cache:

@store(type='rdbms', ..., @cache(cache.policy='LRU', ...))
@primaryKey('name')
@index('unitPrice')
define table ItemPriceTable(name string, unitPrice double);

Join Query:

from CleansedOrderStream as O join ItemPriceTable as T
    on O.item == T.name
select O.custId, O.item, O.amount * T.unitPrice as price
insert into EnrichedOrderStream;
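The effect of putting an LRU cache in front of the table can be sketched with Python's standard `functools.lru_cache`. The `PRICE_TABLE` dictionary and the `db_reads` counter are hypothetical stand-ins for the RDBMS, and unlike the cache described above this sketch loads entries on first use rather than preloading them.

```python
from functools import lru_cache

PRICE_TABLE = {"shirt": 20.0, "shoes": 55.0}  # hypothetical stand-in for the RDBMS table
db_reads = 0  # counts how often the "database" is actually hit


@lru_cache(maxsize=128)  # least-recently-used eviction once 128 items are cached
def unit_price(item: str):
    global db_reads
    db_reads += 1
    return PRICE_TABLE.get(item)


def enrich(orders):
    enriched = []
    for order in orders:
        price = unit_price(order["item"])
        if price is None:  # inner join: orders without a matching row are dropped
            continue
        enriched.append(
            {"custId": order["custId"], "item": order["item"], "price": order["amount"] * price}
        )
    return enriched


orders = [
    {"custId": "15", "item": "shirt", "amount": 2},
    {"custId": "16", "item": "shirt", "amount": 1},
    {"custId": "17", "item": "hat", "amount": 1},
]
enriched = enrich(orders)
```

Three orders trigger only two table reads here, because the second "shirt" lookup is served from the cache.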

Service Integration and Error Handling

Enriching data with HTTP and gRPC service calls
● Non-blocking.
● Handle responses based on status codes (e.g. 200, 4**).
● Handle error conditions
  ○ Send events to an error stream.
  ○ Log failed events.
  ○ Block events until the endpoint becomes available.

SQL for HTTP Service Integration

Calling an external HTTP service and consuming the response.

Call service:

@sink(type='http-call', publisher.url="http://mystore.com/discount",
      sink.id="discount", @map(type='json'))
define stream EnrichedOrderStream (custId string, item string, price double);

Consume Response:

@source(type='http-call-response', http.status.code="200", sink.id="discount",
        @map(type='json', @attributes(custId="trp:custId", ..., price="$.discountedPrice")))
define stream DiscountedOrderStream (custId string, item string, price double);

Rule Processing

Types of predefined rules

● Rules on a single event
  ○ Filter, If-then-else, Match, etc.
● Rules on a collection of events
  ○ Summarization
  ○ Join with window or table
● Rules based on event occurrence order
  ○ Pattern detection
  ○ Trend (sequence) detection
  ○ Non-occurrence of event

Alert Based On Single Event

Use a filter to identify abnormal events.

define stream DiscountedOrderStream (custId string, item string, price double);

Filter Query:

from DiscountedOrderStream[price > 2000]
select custId, item, price, 'High Price Purchase' as reason
insert into AlertStream;

Alert Based On Collection Of Events

Use a stateful window query to aggregate orders over time for each customer, and alert at most once every 5 minutes.

define stream DiscountedOrderStream (custId string, item string, price double);

Window query with aggregation and rate limiting:

from DiscountedOrderStream#window.time(30 min)
select custId, sum(price) as totalPrice
group by custId
having totalPrice > 5000
output first every 5 min
insert into AlertStream;
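The state such a query has to keep can be sketched in Python: the events still inside the window, a running sum per customer, and the time of the last alert. This is a simplified approximation under assumptions: timestamps are plain seconds supplied with each event, and the rate limit is measured from the previous alert per customer, which may differ from the engine's exact `output first every` semantics.

```python
from collections import defaultdict, deque

WINDOW = 30 * 60     # sliding window length in seconds
RATE_LIMIT = 5 * 60  # minimum gap between alerts per customer
THRESHOLD = 5000


def alerts(events):
    """events: time-ordered iterable of (timestamp_seconds, custId, price)."""
    window = deque()             # events still inside the sliding window
    totals = defaultdict(float)  # running sum of price per customer
    last_alert = {}              # custId -> timestamp of the last emitted alert
    out = []
    for ts, cust, price in events:
        # Expire events that have fallen out of the 30-minute window.
        while window and window[0][0] <= ts - WINDOW:
            _, old_cust, old_price = window.popleft()
            totals[old_cust] -= old_price
        window.append((ts, cust, price))
        totals[cust] += price
        if totals[cust] > THRESHOLD:
            # Rate limiting: suppress alerts that follow too soon after the last one.
            if cust not in last_alert or ts - last_alert[cust] >= RATE_LIMIT:
                last_alert[cust] = ts
                out.append((ts, cust, totals[cust]))
    return out


events = [
    (0, "c1", 3000.0),
    (60, "c1", 2500.0),
    (120, "c1", 100.0),
    (400, "c1", 50.0),
    (2000, "c1", 10.0),
]
result = alerts(events)
```

The alert at 120 s is suppressed by the rate limit, and by 2000 s the early orders have expired, so the total drops back below the threshold.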

Alert Based On Event Occurrence Order

Use a stateful pattern query to detect event occurrence order and non-occurrence.

define stream OrderStream (custId string, orderId string, ...);
define stream PaymentStream (orderId string, ...);

Non-occurrence of event:

from every (e1=OrderStream) -> not PaymentStream[e1.orderId == orderId] for 15 min
select e1.custId, e1.orderId, ...
insert into PaymentDelayedStream;
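Detecting that something did not happen means remembering every open order until either its payment arrives or its deadline passes. The Python sketch below shows that bookkeeping under simplifying assumptions: events carry their own timestamps in seconds, and deadlines are only checked when the next event arrives, whereas a real engine would use timers.

```python
def delayed_payments(events, timeout=15 * 60):
    """events: time-ordered (timestamp_seconds, kind, orderId), kind is "order" or "payment".

    Returns (delayed, pending): orderIds whose payment missed the deadline,
    and orders that are still waiting for a payment.
    """
    pending = {}  # orderId -> timestamp at which the order was placed
    delayed = []
    for ts, kind, order_id in events:
        # Any pending order older than the timeout has missed its deadline.
        for oid, placed in list(pending.items()):
            if ts - placed > timeout:
                delayed.append(oid)
                del pending[oid]
        if kind == "order":
            pending[order_id] = ts
        elif kind == "payment":
            pending.pop(order_id, None)  # payment arrived in time, or too late
    return delayed, pending


events = [
    (0, "order", "o1"),
    (100, "order", "o2"),
    (200, "payment", "o1"),
    (1100, "payment", "o2"),
    (1200, "order", "o3"),
]
delayed, pending = delayed_payments(events)
```

Order `o2` is reported because its payment arrives 1000 s after the order, past the 900 s deadline; `o1` was paid in time and `o3` is still open.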

Serving Online and Predefined ML Models

Types of Machine Learning and Artificial Intelligence processing
● Anomaly detection
  ○ Markov model
● Serving pre-created ML models
  ○ PMML (built from Python, R, Spark, H2O.ai, etc.)
  ○ TensorFlow
● Online machine learning
  ○ Clustering
  ○ Classification
  ○ Regression

Find recommendations:

from OrderStream#pmml:predict("/home/user/ml.model", custId, itemId)
insert into RecommendationStream;

Data Summarization

Types of data summarization
● Time based
  ○ Sliding time window
  ○ Tumbling time window
  ○ On time granularities (secs to years)
● Event count based
  ○ Sliding length window
  ○ Tumbling length window
● Session based
● Frequency based

Types of aggregations
● Sum
● Count
● Avg
● Min
● Max
● DistinctCount
● StdDev

Aggregation Over Multiple Time Granularities

Aggregation on every second, minute, hour, ..., year.
Built using the λ (lambda) architecture:
● In-memory real-time data
● RDBMS-based historical data

Query:

define aggregation OrderAggregation
from OrderStream
select custId, itemId, sum(price) as total, avg(price) as avgPrice
group by custId, itemId
aggregate every sec ... year;

[Figure labels: Speed Layer & Serving Layer; Batch Layer]

Data Retrieval from Aggregations

Query for the relevant time interval and granularity.

Data is retrieved from both memory and the DB with millisecond accuracy.

from OrderAggregation
within "2019-10-06 00:00:00", "2019-10-30 00:00:00"
per "days"
select total as orders;
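The idea of maintaining one running total per time bucket at several granularities, then reading back a range at the granularity asked for, can be sketched as follows. This is an illustration only: the class and its methods are hypothetical, only three granularities are kept, everything lives in memory (the memory/database split is not modelled), and the end of the range is treated as exclusive.

```python
from collections import defaultdict
from datetime import datetime

# Bucket key format per granularity; zero-padded so keys sort chronologically.
FORMATS = {"hours": "%Y-%m-%d %H", "days": "%Y-%m-%d", "months": "%Y-%m"}


class OrderAggregation:
    def __init__(self):
        self.totals = {g: defaultdict(float) for g in FORMATS}

    def add(self, when: datetime, price: float):
        # Update the running total of every granularity for this event.
        for granularity, fmt in FORMATS.items():
            self.totals[granularity][when.strftime(fmt)] += price

    def query(self, start: datetime, end: datetime, per: str):
        """Return {bucket: total} for buckets in [start, end) at granularity `per`."""
        fmt = FORMATS[per]
        lo, hi = start.strftime(fmt), end.strftime(fmt)
        return {b: t for b, t in sorted(self.totals[per].items()) if lo <= b < hi}


agg = OrderAggregation()
agg.add(datetime(2019, 10, 6, 9, 0), 10.0)
agg.add(datetime(2019, 10, 6, 17, 30), 5.0)
agg.add(datetime(2019, 10, 7, 8, 0), 7.0)
agg.add(datetime(2019, 11, 1, 0, 0), 100.0)

per_day = agg.query(datetime(2019, 10, 6), datetime(2019, 10, 30), "days")
per_month = agg.query(datetime(2019, 10, 1), datetime(2019, 12, 1), "months")
```

Answering a "per days" query then costs one lookup per day in the range, regardless of how many raw events arrived.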

Scatter-gather and Data Pipelining

Divide into sub-elements, process each, and combine the results.

Examples:
● json:tokenize() -> process -> window.batch() -> json:group()
● str:tokenize() -> process -> window.batch() -> str:groupConcat()

{x,x,x} -> {x},{x},{x} -> {y},{y},{y} -> {y,y,y}
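The tokenize, process, group flow reduces to three steps, shown here as a small Python sketch. The `scatter_gather` function and the comma separator are hypothetical choices for illustration; in the pipeline above the batching window is what collects the processed pieces before they are regrouped.

```python
def scatter_gather(message: str, process, sep=","):
    parts = message.split(sep)                   # scatter: tokenize into sub-elements
    results = [process(part) for part in parts]  # process each element independently
    return sep.join(results)                     # gather: combine the batch of results


result = scatter_gather("x,x,x", lambda token: token.replace("x", "y"))
```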

Modularization

● Create a Siddhi App per use case (collection of queries).
● Connect multiple Siddhi Apps using in-memory sources and sinks.
● Allow rule addition and deletion at runtime.

[Figure: a Siddhi Runtime containing a Siddhi App for data capture and preprocessing, Siddhi Apps for each use case, and a Siddhi App for common data publishing logic]

Realtime Decisions As A Service

Query data stores using REST APIs
● Database-backed stores (RDBMS, NoSQL)
● Named aggregations
● In-memory windows and tables

Call HTTP and gRPC services using REST APIs
● Use Service and Service-Response loopbacks.
● Process the Siddhi query chain and send the response synchronously.

Kubernetes Deployment

Steps On Deploying Siddhi App In Kubernetes

Siddhi Custom Resource Definition

apiVersion: siddhi.io/v1alpha2
kind: SiddhiProcess
metadata:
  # Siddhi process name
  name: <name>
spec:
  # Specify Siddhi Apps as inline text or ConfigMaps
  apps:
    - script: |
        <siddhi app>
  container:
    # Necessary environment variables
    env:
      - name: <key>
        value: <value>
    # Siddhi Runner with extensions
    image: "siddhiio/siddhi-runner-ubuntu:5.1.0"
  # Distributed processing configs with NATS
  messagingSystem:
    type: nats
  persistentVolumeClaim: <PVC config>
  # Periodic incremental persistence, and other Siddhi Runner configs
  runner: |
    state.persistence:
      enabled: true

Kubernetes Deployment

$ kubectl get siddhi
NAME         STATUS    READY   AGE
Sample-app   Running   2/2     5m

What We Looked At

● Characteristics of cloud native apps.
● Problems in implementing event driven stateful applications.
● Siddhi: Cloud Native Stream Processor.
● Patterns for implementing event driven applications.
● Deploying on Kubernetes with Siddhi and NATS.

For more info visit https://siddhi.io

Thank You