How to enable the Lean Enterprise
Johann Romefort, co-founder @ Rainbow
My Background
• Seesmic - Co-founder & CTO
Video conversation platform
Social media clients…lots of pivots :)
• Rainbow - Co-founder & CTO
Enterprise App Store
Goal of this presentation
• Understand what the Lean Enterprise is, and how it relates to big data and the software architecture you build
• Have a basic understanding of the technologies and tools involved
What is the Lean Enterprise?
http://en.wikipedia.org/wiki/Lean_enterprise
“Lean enterprise is a practice focused on value creation for the end customer with minimal waste and processes.”
Enabling the OODA Loop
“Get inside your adversaries' OODA loop to disorient them”
OBSERVE ORIENT DECIDE ACT
USAF Colonel John Boyd on Combat:
OODA Loop
Enabling the OODA Loop
The OODA Loop for software
image credit: Adrian Cockcroft
OODA Loop
• Observe (innovation) and Decide (culture) are mainly human-driven
• Orient (big data) and Act (cloud) can be automated
ORIENT
What is Big Data?
• It’s data at the intersection of the 3 Vs:
• Velocity (Batch / Real time / Streaming)
• Volume (Terabytes/Petabytes)
• Variety (structured / semi-structured / unstructured)
Why is everybody talking about it?
• The cost of generating data has gone down
• By 2015, 3B people will be online, pushing data volume created to 8 zettabytes
• More data = More insights = Better decisions
• Ease and cost of processing is falling thanks to cloud platforms
Data flow and constraints
Generate
Ingest / Store
Process
Visualize / Share
The 3 Vs introduce heterogeneity and make these steps hard to achieve
What is AWS?
• AWS is a cloud computing platform
• On-demand delivery of IT resources
• Pay-as-you-go pricing model
Cloud Computing
Compute + Storage + Networking
Adapts dynamically to ever-changing needs, closely tracking the user’s infrastructure and application requirements
How does AWS help with Big Data?
• Removes constraints on the ingestion, storage, and processing layers, and adapts closely to demand
• Provides a collection of integrated tools to adapt to the 3 V’s of Big Data
• Unlimited storage capacity and processing power fit changing data storage and analysis requirements well
Computing Solutions for Big Data on AWS
Kinesis
EC2 EMR
Redshift
Computing Solutions for Big Data on AWS
EC2
All-purpose computing instances
Dynamic provisioning and resizing
Lets you scale your infrastructure at low cost
Use Case: Well suited for running custom or proprietary applications (e.g. SAP HANA, Tableau…)
Computing Solutions for Big Data on AWS
EMR
‘Hadoop in the cloud’
Adapt to complexity of the analysis and volume of data to process
Use Case: Offline processing of very large volumes of data, possibly unstructured (high Variety)
Computing Solutions for Big Data on AWS
Kinesis
Stream Processing
Real-time data
Scale to adapt to the flow of inbound data
Use Case: Complex Event Processing, click streams, sensor data, computations over a window of time
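The “computation over a window of time” use case can be sketched in plain Python. Kinesis itself is omitted here; the class name, threshold, and window size are made-up values for illustration:

```python
from collections import deque

class SlidingWindowRule:
    """Keeps the last `window_seconds` of readings and fires
    when their average exceeds a threshold."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.readings = deque()  # (timestamp, value) pairs

    def add(self, timestamp, value):
        self.readings.append((timestamp, value))
        # Drop readings that have fallen out of the time window
        while self.readings and self.readings[0][0] <= timestamp - self.window_seconds:
            self.readings.popleft()
        avg = sum(v for _, v in self.readings) / len(self.readings)
        return avg > self.threshold  # True = rule fires (e.g. raise an alert)

rule = SlidingWindowRule(window_seconds=60, threshold=90.0)
print(rule.add(0, 70.0))    # False – average 70 is under the threshold
print(rule.add(30, 95.0))   # False – average 82.5, still under
print(rule.add(61, 120.0))  # True – the t=0 reading fell out; average is 107.5
```

In a real deployment the stream processor behind Kinesis would run this kind of rule per shard as records arrive.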
Computing Solutions for Big Data on AWS
Redshift
Data Warehouse in the cloud
Scales to Petabytes
Supports SQL Querying
Start small for just $0.25/h
Use Case: BI Analysis, Use of ODBC/JDBC legacy software to analyze or visualize data
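The kind of SQL a BI tool would send to Redshift over ODBC/JDBC can be illustrated locally; sqlite3 stands in for the warehouse here, and the table name and numbers are made up:

```python
import sqlite3

# sqlite3 stands in for Redshift so the query is runnable locally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (page TEXT, views INTEGER)")
conn.executemany("INSERT INTO page_views VALUES (?, ?)",
                 [("home", 120), ("pricing", 45), ("home", 30)])

# A typical BI-style aggregation a warehouse would run over billions of rows
rows = conn.execute(
    "SELECT page, SUM(views) AS total FROM page_views "
    "GROUP BY page ORDER BY total DESC"
).fetchall()
print(rows)  # [('home', 150), ('pricing', 45)]
```

On Redshift the same statement would run unchanged; the point of the service is that such queries keep working as the table grows to petabytes.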
Storage Solution for Big Data on AWS
DynamoDB RedShift
S3 Glacier
Storage Solution for Big Data on AWS
DynamoDB
NoSQL Database
Consistent low-latency access
Column-based flexible data model
Use Case: Low-latency reads and writes of records at scale (e.g. user sessions, device state)
Storage Solution for Big Data on AWS
S3
Use Case: Backups and Disaster recovery, Media storage, Storage for data analysis
Versatile storage system
Low-cost
Fast data retrieval
Storage Solution for Big Data on AWS
Glacier
Use Case: Storing raw logs of data. Storing media archives. Magnetic tape replacement
Archive storage of cold data
Extremely low-cost
Optimized for infrequently accessed data
What makes AWS different when it comes to big data?
Given the 3 Vs, a collection of tools is usually needed for your data processing and storage.
Integrated Environment for Big Data
AWS Big Data solutions already come integrated with each other
They also integrate with the whole AWS ecosystem (Security, Identity Management, Logging, Backups, Management Console…)
Example of products interacting with each other.
Tightly integrated rich environment of tools
+
On-demand scaling that tracks processing requirements
= Extremely cost-effective and easy-to-deploy solution for big data needs
Use Case: Real-time IoT Analytics
Gathering data in real time from sensors deployed in a factory and sending it for immediate processing
• Error Detection: real-time detection of hardware problems
• Optimization and energy management
First Version of the infrastructure
• Sensor data is aggregated on the customer site
• A Node.js stream processor evaluates rules over a time window
• Raw data is written to an in-house Hadoop cluster for further processing, which feeds the algorithm
• MongoDB is used for backup
Version of the infrastructure ported to AWS
• Sensor data is aggregated on the customer site
• Kinesis evaluates rules over a time window
• Redshift handles BI analysis
• Raw data is written to Glacier for archiving
ACT
Cloud and Lean Enterprise
Let’s start with a personal example
First year @seesmic
• Prototype becomes production
• Monolithic architecture
• No analytics/metrics
• Little monitoring
• Little automated testing
I built a monolith
or…at least I tried
Early days at Seesmic
Everybody loves a good horror story
We crashed TechCrunch
What did we do?
Add a QA Manager
Add bearded SysAdmin
We added tons of process so nothing could go wrong
Impact on dev team
• Frustration of slow release process
• Lots of back and forth due to bugs and the need to re-test the whole app each time
• Chain of command too long
• Feeling of having no power in the process
• Low trust
Impact on product team
• Frustration of not executing fast enough
• Frustration of having to ask for everything (like metrics)
• Feeling that engineers always have the last word
Impact on Management
What can you do?
• Break down software into smaller autonomous units
• Break down teams into smaller autonomous units
• Automating and tooling, CI / CD
• Plan for the worst
Break down software into smaller autonomous units
Introduction to Microservices
Monolith vs Microservices - 10000ft view -
Monolith vs Microservices - databases -
Monolith vs Microservices - servers -
Microservices - example -
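As a rough sketch of the idea, here is a single-purpose service exposed over HTTP using only the Python standard library. The “greeting” capability, handler name, and port handling are illustrative, not Seesmic’s actual design:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A tiny, single-purpose service: it owns one capability and exposes it
# over HTTP, independently deployable from every other service.
class GreetingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"greeting": "hello", "path": self.path}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), GreetingHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Another service (or client) talks to it only through the HTTP contract
url = f"http://127.0.0.1:{server.server_port}/users/42"
with urllib.request.urlopen(url) as resp:
    payload = json.loads(resp.read())
server.shutdown()
print(payload["greeting"])  # hello
```

The key property is the narrow network contract: the caller depends on the JSON shape, never on the service’s internal code or database.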
Break down team into smaller units
Amazon’s “two-pizza teams”
• 6 to 10 people; you can feed them with two pizzas.
• It’s not about size, but about accountability and autonomy
• Each team has its own fitness function
• Full devops model: good tooling needed
Friction points
• Still need to be designed for resiliency
• Harder to test
Continuous Integration
(CI) is the practice, in software engineering, of merging all developer working copies into a shared mainline several times a day
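What a CI server automates on each merge can be sketched as a tiny check runner; the check commands below are stand-ins for a real test suite and linter:

```python
import subprocess
import sys

# Hypothetical pipeline: each merge to the mainline triggers these checks.
CHECKS = [
    ("unit tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("lint",       [sys.executable, "-c", "print('lint ok')"]),
]

def run_pipeline(checks):
    """Run each check in order; stop at the first failure.

    Returns (failed_step_name, False) on failure, (None, True) on success.
    """
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return (name, False)
    return (None, True)

failed_step, ok = run_pipeline(CHECKS)
print("build passed" if ok else f"build failed at: {failed_step}")
```

Real CI tools (Jenkins, Travis CI) add the trigger-on-push, isolation, and reporting around exactly this loop.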
Continuous Deployment
Tools for Continuous Integration
• Jenkins (Open Source, Lot of plugins, hard to configure)
• Travis CI (Looks nicer, fewer plugins)
Tools for Continuous Deployment
• GO.cd (Open-Source)
• shippable.com (SaaS, Docker support)
• CodeDeploy (AWS)
+ Puppet, Chef, Ansible, Salt, Docker…
Impact on dev
• Autonomy
• Not afraid to try new things
• More confident in codebase
• Don’t have to live with old bugs until the next release
Impact on product team
• Iterate faster on features
• Can make, bake and break hypotheses faster
• Product gets improved incrementally every day
Impact on Management
• Enabling Microservices architecture
• Enabling better testing
• Enabling devops model
• Come talk to the Docker team tomorrow!
Thank You
follow me: @romefort