Handling the Extremes: Scaling and Streaming in Finance
© 2016 MapR Technologies

Agenda
• History
  – Past, present, future
• Messaging platforms
  – Defining the extremes
• Use cases
  – Email, fraud
• Resources
• Q&A

Legacy Business Platforms
[Diagram: Operational Applications (Message Bus, J2EE AppServer, Relational Database, Specialized Storage) alongside Analytical Applications (Analytic Database, ETL Tool, BI Tool, Specialized Storage)]
• IT must integrate all the products
• Inability to operationalize the insight rapidly
• Can’t deal with high speed data ingestion and processing
• Scale up architecture leads to high cost

Converged Data Platform
Converged Applications: Complete Access to Real-time and Historical Data in One Platform
• Developers Creating Database and Event Based Applications
• Analysts Creating BI Reports and KPIs on Data Warehouse
[Diagram labels: Analytical Applications; Operational Applications; Historical Data; Current Data; (Bottom Line Initiatives); (Top Line Initiatives)]

Application Development and Deployment
[Diagram: Customer Insights, 2005 vs. Today]
2005: SQL Server; IIS, ASP.NET; Microsoft Reporting Service; Desktop Browser (JavaScript, jQuery); SQL; HTML, CSS, JS
Today: Oracle; SQL Server; NoSQL; Graph DB; Insights DB; Data Lake; Bulk Load; Events (Kafka); Machine Learning; Predictive Modeling; BI / Reporting; Microservice (.NET); Microservice (NodeJS); Microservice (Java); Backend for Frontend (Java); Desktop Browser (JavaScript, 20+ Frameworks); Tablet; Native Android; Native iOS; JSON; JSON, CSS, HTML, JS

MapR Platform Services: Open API Architecture
Assures Interoperability, Avoids Lock-in
• Web-Scale Storage: MapR-FS
• Database: MapR-DB
• Event Streaming: MapR Streams
• Open APIs: HDFS API; POSIX, NFS; SQL, HBase API; JSON API; Kafka API
• Platform features: Real Time; Unified Security; Multi-tenancy; Disaster Recovery; Global Namespace; High Availability

Converged Application Benefits
• Consumers scale horizontally with partitions
  • 1:1 mapping between consumer and partition
  • Enables predictable scaling as production needs grow
• Data can be seamlessly replicated to another cluster
  • Enables HA with zero code changes
• Data is indexed dynamically according to receivers, senders
  • Scales beyond the capabilities of Kafka
• Snapshots can be taken to capture state
  • Enables faster testing and deployment of applications
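
The 1:1 mapping between consumer and partition can be pictured with a small in-memory sketch. This is not MapR Streams or Kafka client code: the class `PartitionAssignment`, its methods, the hash-based key placement, and the round-robin assignment are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names, hashing, and round-robin assignment are illustrative assumptions.
final class PartitionAssignment {

    // Pick a partition for a key; the same key always lands in the same partition.
    static int partitionFor(String key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Assign partitions to consumers round-robin. With as many consumers as
    // partitions, each consumer owns exactly one partition (the 1:1 case).
    static List<List<Integer>> assign(int partitionCount, int consumerCount) {
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) {
            owned.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            owned.get(p % consumerCount).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        System.out.println(assign(4, 4)); // four consumers, one partition each
        System.out.println(assign(4, 2)); // two consumers, two partitions each
        System.out.println(partitionFor("account-42", 4));
    }
}
```

Under these assumptions, adding partitions and a matching number of consumers spreads the load evenly, which is one way to read "predictable scaling" on the slide.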

What’s a Stream?
A stream is an unbounded sequence of events carried from a set of producers to a set of consumers.
Producers and consumers don’t have to be aware of each other; instead they participate in shared topics. This is called publish/subscribe.
[Diagram: Producers, /Events:Topic, Consumers]
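
The publish/subscribe idea can be sketched in a few lines of plain Java. This is a toy, in-memory model and not the MapR Streams or Kafka API: the class `TinyStream`, its `publish` and `poll` methods, and the consumer names are hypothetical. It only shows that producers name a topic, never a consumer, and that each consumer keeps its own read position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of publish/subscribe; not the MapR Streams or Kafka API.
final class TinyStream {
    // Each topic is an append-only list of events.
    private final Map<String, List<String>> topics = new HashMap<>();
    // Each (consumer, topic) pair tracks its own read position.
    private final Map<String, Integer> offsets = new HashMap<>();

    // Producers only name a topic; they never reference a consumer.
    void publish(String topic, String event) {
        topics.computeIfAbsent(topic, t -> new ArrayList<>()).add(event);
    }

    // Return the events this consumer has not seen yet and advance its offset.
    List<String> poll(String consumerId, String topic) {
        List<String> log = topics.getOrDefault(topic, new ArrayList<>());
        String key = consumerId + "@" + topic;
        int from = offsets.getOrDefault(key, 0);
        List<String> unseen = new ArrayList<>(log.subList(from, log.size()));
        offsets.put(key, log.size());
        return unseen;
    }

    public static void main(String[] args) {
        TinyStream stream = new TinyStream();
        stream.publish("/Events:Topic", "MyEvent");
        System.out.println(stream.poll("dashboard", "/Events:Topic")); // sees the event
        System.out.println(stream.poll("alerting", "/Events:Topic"));  // sees it independently
        System.out.println(stream.poll("dashboard", "/Events:Topic")); // nothing new
    }
}
```

Because the event log is shared and offsets are per consumer, a new consumer can be added without touching any producer.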

Ability to Handle the “Extreme”
• 1+ Trillion Events
  – per day
• Millions of Producers
  – Billions of events per second
• Multiple Consumers
  – Potentially for every event
• Multiple Data Centers
  – Plan for success
  – Plan for drastic failure
Think that is crazy? Consider having 100 servers and performing:
Monitoring and Application logs…
  – 100 metrics per server
  – 60 samples per minute
  – 50 metrics per request
  – 1,000 log entries per request (abnormally small, depends on level)
  – 1 million requests per day
~ 2 billion events per day, for one small (ish) use case
Extreme Average Reality
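
The "~ 2 billion events per day" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes that every one of the 100 metrics on each server is sampled 60 times a minute around the clock, and that each request contributes its 50 metrics plus 1,000 log entries; the slide does not spell out how the numbers combine, so that reading is an assumption, as are the class and method names.

```java
// Back-of-the-envelope check of the slide's numbers. How the figures combine
// (every metric sampled 60 times a minute, all day) is an assumption.
final class EventEstimate {

    static long monitoringEventsPerDay(long servers, long metricsPerServer, long samplesPerMinute) {
        long minutesPerDay = 24 * 60;
        return servers * metricsPerServer * samplesPerMinute * minutesPerDay;
    }

    static long requestEventsPerDay(long requests, long metricsPerRequest, long logEntriesPerRequest) {
        return requests * (metricsPerRequest + logEntriesPerRequest);
    }

    public static void main(String[] args) {
        long monitoring = monitoringEventsPerDay(100, 100, 60);     // 864,000,000
        long requests = requestEventsPerDay(1_000_000, 50, 1_000);  // 1,050,000,000
        System.out.println(monitoring + requests);                  // prints 1914000000
    }
}
```

Under that reading the total is about 1.9 billion events per day, which matches the slide's rounded figure.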
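
Producing and Consuming is Easy
// producerProps and consumerProps are java.util.Properties holding the
// serializer/deserializer settings (not shown on the slide).

// Producer: publish one event to a topic
KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
ProducerRecord<String, String> event = new ProducerRecord<>("/Events:Topic", "MyEvent");
producer.send(event);

// Consumer: subscribe to the same topic and print events as they arrive
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
consumer.subscribe(Arrays.asList("/Events:Topic"));
while (true) {
    ConsumerRecords<String, String> events = consumer.poll(1000);
    Iterator<ConsumerRecord<String, String>> newEvents = events.iterator();
    while (newEvents.hasNext()) {
        System.out.println(newEvents.next().toString());
    }
}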
/Events:Topic

Producers and Consumers
[Diagram: producers publish to /Events:Topic; consumers read from it]
Producers: Social Platforms; Servers (Logs, Metrics); Sensors; Mobile Apps; Other Apps & Microservices
Consumers: Alerting Systems; Stream Processing Frameworks; Databases & Search Engines; Dashboards; Other Apps & Microservices
Other labels: Analytics Consumers; Stream Processors

Considering a Messaging Platform
• 50-100k messages per second used to be good
  – Not really good to handle decoupled communication between services
• Kafka model is BLAZING fast
  – Kafka 0.9 API with message sizes at 200 bytes
  – MapR Streams on a 5 node cluster sustained 18 million events / sec
  – Throughput of 3.5 GB/s and over 1.5 trillion events / day
• Manual sharding is not a “great” solution
  – Adding more servers should be easy and foolproof, not painful
  – Yes, I have lived through this
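
The benchmark figures on this slide can be cross-checked against each other with simple arithmetic. The sketch below multiplies the quoted 18 million events per second by the quoted 200-byte message size and by the number of seconds in a day. The class and method names are hypothetical, and the calculation ignores any per-message overhead, which the slide does not describe.

```java
// Consistency check of the quoted benchmark numbers; ignores per-message overhead.
final class ThroughputCheck {

    static long bytesPerSecond(long eventsPerSecond, long bytesPerEvent) {
        return eventsPerSecond * bytesPerEvent;
    }

    static long eventsPerDay(long eventsPerSecond) {
        long secondsPerDay = 24L * 60 * 60;
        return eventsPerSecond * secondsPerDay;
    }

    public static void main(String[] args) {
        long eventsPerSecond = 18_000_000L;
        System.out.println(bytesPerSecond(eventsPerSecond, 200)); // 3600000000 bytes/s
        System.out.println(eventsPerDay(eventsPerSecond));        // 1555200000000 events/day
    }
}
```

The product comes to roughly 3.6 × 10^9 bytes per second, close to the 3.5 GB/s quoted, and about 1.56 trillion events per day, in line with "over 1.5 trillion". The small gap in the byte rate presumably comes from rounding or unit conventions in the original benchmark.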

Event-based Data Drives Applications
• Failure Alerts
• Real-time application & network monitoring
• Trending now
• Web Personalized Offers
• Real-time Fraud Detection
• Ad optimization
• Supply Chain Optimization

Prevention Options
• Train people to not click random links in emails
  – This will NEVER happen (Honestly!)
• E-mail appliances to prevent users from seeing emails
  – Most typically require users to intervene
  – Costly

Constructing an E-Mail Management Pipeline
[Diagram: a Postfix Mail Server and MTA on each side of an E-Mail Stream, with Spam Filters, Phishing Classification, Internal Affairs, and Legal Archive attached to the stream]

Benefits of Approach
• Customizable pipeline
• Can learn and apply new policies
  – Spam
  – Phishing classification
  – Fraud attempts
• Retention policies
  – Auditable
  – Simple search and discovery
  – Litigation hold

Fighting Fraudulent Web Traffic
[Diagram labels: Activity Stream; Click Stream; User Activity Profile; Deviation from Normal; Blacklist Activities; Whitelist Activities; Classifiers (Known Bad Classifier, All OK Classifier); Session Alteration Stream; Notify Security]
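
One way to read the classifier diagram is as a short chain of checks applied to each activity event. The sketch below is only an illustration of that reading: the class `ActivityClassifier`, the order of the rules, the use of a mean and standard deviation as the "user activity profile", and the threshold of three standard deviations are all assumptions, not details taken from the slide.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: class, rule order, and threshold are illustrative assumptions.
final class ActivityClassifier {
    enum Verdict { KNOWN_BAD, DEVIATION, ALL_OK, UNKNOWN }

    private final Set<String> blacklist;
    private final Set<String> whitelist;
    private final double maxDeviations; // allowed distance from normal, in standard deviations

    ActivityClassifier(Set<String> blacklist, Set<String> whitelist, double maxDeviations) {
        this.blacklist = blacklist;
        this.whitelist = whitelist;
        this.maxDeviations = maxDeviations;
    }

    // observedRate is compared with the user's profile (mean and standard deviation).
    Verdict classify(String activity, double observedRate, double profileMean, double profileStdDev) {
        if (blacklist.contains(activity)) {
            return Verdict.KNOWN_BAD;
        }
        if (profileStdDev > 0 && Math.abs(observedRate - profileMean) / profileStdDev > maxDeviations) {
            return Verdict.DEVIATION;
        }
        return whitelist.contains(activity) ? Verdict.ALL_OK : Verdict.UNKNOWN;
    }

    public static void main(String[] args) {
        ActivityClassifier classifier = new ActivityClassifier(
                new HashSet<>(Arrays.asList("password-spray")),
                new HashSet<>(Arrays.asList("view-statement")),
                3.0);
        System.out.println(classifier.classify("password-spray", 1, 1, 1)); // KNOWN_BAD
        System.out.println(classifier.classify("view-statement", 2, 2, 1)); // ALL_OK
        System.out.println(classifier.classify("view-statement", 90, 2, 1)); // DEVIATION
    }
}
```

In a streaming setup, verdicts other than `ALL_OK` would plausibly be published to the session alteration stream or trigger a security notification, but that wiring is left out here.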

Similarities between Marketing and Fraud?
Customer 360
• Build a user profile
  – What type of content do they like
• Build “segmented” profiles
  – Company affiliation
• Dynamically alter website
  – Show alternate content
• Kick-off external workflows
  – Nurture emails
Website Fraud
• Build a user profile
  – What are their normal usage patterns
• Build “segmented” profiles
  – What do real users normally do
• Dynamically alter website
  – Prevent user functionality
• Kick-off external workflows
  – Notify security team

Learn More about Converged Applications
Check out our Converged Application Blueprint
Visit www.mapr.com/appblueprint