Benefiting From a Shared Test System
Lakshminarayanan Vasudevan
PayPal Production Footprint
©2015 PayPal Inc. Confidential and proprietary. 3
• Operate our own datacenters in 3 locations
• Embarking on Public Cloud strategy
• Cloud environment is approximately 150,000 VMs
• Production environment is a 60/40 split between OpenStack & VMware
• 20K payments processed per minute
• 1.9 million web hits per minute
• Frameworks: Java, Node, & C++
• 2800+ services
The Problem
• Complex and excessive dependencies
• Rapidly growing code base
• 3 Payment stacks handling PayPal transactions
• Slow release cycles
• Inordinate amount of time required for test prep
How can a developer effectively operate in this scenario?
Silo Test Environments
PayPal in a Box – Stage2
2002: 30 Services, 1 MLOC
2010: 560 Services, 15 MLOC
2014: 1400 Services, 50 MLOC
2017: 2800 Services, 100 MLOC
• In 2015 there were 4100+ Stage2 environments
• Hardware cost for 2014 was 20 million
• 2017 planned expenditure was 30 million
Hardware was NOT the largest cost!
Challenges with Silo Test Environments
Code
Deploy on Stage2
Learn, Debug & Troubleshoot Failing Services
Up Rev Dependent Services + DB Schema
Test
Keep your Dependent Services UP
Secure Stage2
Configure Stage2
30% - 50% of time wasted every sprint maintaining stages:
• Deploying all the components
• Managing environment
• Identifying transitive dependencies
• Test topology is not the same as prod
• Unhappy engineers
• Unproductive engineers
• Poor quality
• Longer time to market (TTM)
• Integration testing was a nightmare
The Solution – Managed Stage
Our answer to the stage2 problem is Managed Stage:
• A Production Like Environment
• Cluster of machines running ALL services
• Ability to scale each service based on traffic
• Zero code drift, code refresh in minutes
• Easy to test against
• Centrally Managed
PayPal’s Shared and Integrated Test Environment
Managed Stage Architecture
[Architecture diagram. Labels: CC; Mesos Master (one active, two standby); Aurora Scheduler (one active, two standby); Zookeeper 1, 2, 3; DB; Mesos slaves; Router Pool, Front Pool, Mid Pool, Back Pool, Other pools]
Managed Stage Environments
MSMaster (Live)
MS Release (N+1)
MS (LnP)
[Diagram: each environment is drawn as three stacks labeled N, R, S, H, C]
PDLC Using Managed Stage
Code
Deploy Your Service On User Stage
Test
o Happy engineers
o More time to
• Build
• Ship
• Think
• Play
Self-Service
User Stage
User Stage A1, C1
Ex. Testing A1, C1
Flow Example: A -> B -> C -> D
Bidirectional Routing via PPFE & haproxy
[Diagram. Labels: services A', C', A, B, C, D; Managed Stage; PPFE; User Stage Services; DB; three stacks labeled N, R, S, H, C]
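The flow example above (A -> B -> C -> D, with A and C under test on the User Stage) can be sketched as a per-hop lookup. This is a minimal illustration and not PayPal's routing implementation: it assumes that a hop is served by the developer's build when one exists on the User Stage and otherwise falls back to the shared Managed Stage instance. The function name, the environment labels, and the override mapping are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical environment labels, used only for this illustration.
MANAGED_STAGE = "managed-stage"
USER_STAGE = "user-stage"


def resolve_hops(
    flow: List[str], user_stage_builds: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return (build, environment) for each hop in the flow.

    Assumption: a hop is served from the User Stage when the developer has
    deployed their own build of that service there; every other hop falls
    back to the shared Managed Stage instance.
    """
    hops = []
    for service in flow:
        if service in user_stage_builds:
            hops.append((user_stage_builds[service], USER_STAGE))
        else:
            hops.append((service, MANAGED_STAGE))
    return hops


flow = ["A", "B", "C", "D"]
overrides = {"A": "A'", "C": "C'"}  # developer builds of A and C
for build, env in resolve_hops(flow, overrides):
    print(f"{build} -> {env}")
```

Under these assumptions only the two services being changed run on the User Stage, while B and D are reused from Managed Stage, which is why the routing has to work in both directions.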
Developer Transformation
• Paradigm shift from silo testing to shared environment:
• Modified test frameworks and test cases
• New patterns for test execution and triage
• Education of PD teams to leverage centralized logging and monitoring
• Need to move away from anti-patterns
o Configurations in code
o Hard coded dependencies
o Custom test configurations that assume a silo test environment
o Flawed deployment patterns
• Move away from stage2 testing
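As an illustration of the "configurations in code" and "hard coded dependencies" anti-patterns, the sketch below contrasts a baked-in endpoint with one resolved from the environment at run time. The URL, the service name, and the `<NAME>_URL` variable naming scheme are hypothetical and chosen only for this example; the slides do not say how PayPal externalized configuration.

```python
import os

# Anti-pattern: the dependency's location is baked into the code, so the
# service can only be tested against one specific silo environment.
HARDCODED_RISK_URL = "http://stage2-example.invalid:8080/risk"  # hypothetical


def dependency_url(name: str, default: str = "") -> str:
    """Look up a dependency endpoint from the environment at run time.

    The <NAME>_URL naming scheme is an assumption made for illustration.
    """
    key = f"{name.upper()}_URL"
    url = os.environ.get(key, default)
    if not url:
        raise KeyError(f"{key} is not set and no default was provided")
    return url


# The same code can now point at a shared environment without a rebuild.
os.environ["RISK_URL"] = "http://managed-stage.invalid/risk"  # hypothetical
print(dependency_url("risk"))
```

With the endpoint supplied from outside the code, the same build can be tested against a silo stage or a shared one.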
Core Operating Principles
• Irrational optimism - whatever it takes to make PD teams successful
• Drive transformation - silo environment to an integrated and shared environment
• Engineering solutions to fix the problem for good - DRY, automate everything, No-SSH policy
• Small incremental changes - Contain risk, quick restoration
• Restore first – rollback, wire off
• Operational Excellence
Five parts to the puzzle
Monitoring & Alerting
Self Healing
Empower Customer (Self Service)
Continuous Deployment
Standard Operating Procedures (SOP)
January 2016 Availability
January 2017 Availability
What Did We Gain
• Eliminated stage2 hardware cost (Stage2 count: 744 as compared to 4100)
• Improved developer productivity by 30%
• Improved application stability index
• Gained visibility into test case execution gaps
• Test case quality
• Created a path for improving engineering hygiene and product quality
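The Stage2 counts quoted above can be turned into a percentage with a few lines of arithmetic. The helper name is hypothetical, and because the earlier slide gives the starting count as "4100+", the result should be read as a lower bound on the reduction.

```python
def percent_reduction(before: int, after: int) -> float:
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100


stage2_before = 4100  # the slides say "4100+", so this is a lower bound
stage2_after = 744
print(f"{percent_reduction(stage2_before, stage2_after):.1f}%")  # prints 81.9%
```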
Learnings
• Managed Stage availability issues impact ALL PayPal development teams
• Foundation changes require significant cultural shifts
• Cultural inertia was/is a persistent challenge
• Illusion of control
• Success requires tremendous tenacity and absolute resolve
• Test Environment is a direct reflection of Engineering hygiene
• Standardized automated operations are a MUST
o Monitoring
o Alerting
o Self healing
• Significant investment needed for education
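The monitoring, alerting, and self-healing items above can be pictured as one reconciliation pass over the services in the environment. The sketch below is an assumption about how such a loop might be structured, not a description of PayPal's tooling: the `heal` function, its callbacks, and the outcome labels are all hypothetical, and the health check and restart are passed in as plain functions so the example stays self-contained.

```python
from typing import Callable, Dict, List


def heal(
    services: List[str],
    is_healthy: Callable[[str], bool],
    restart: Callable[[str], None],
    alert: Callable[[str], None],
) -> Dict[str, str]:
    """Run one monitoring pass and return the outcome per service."""
    outcome = {}
    for name in services:
        if is_healthy(name):  # monitoring
            outcome[name] = "healthy"
            continue
        restart(name)  # self healing: try an automated fix first
        if is_healthy(name):
            outcome[name] = "restarted"
        else:
            alert(name)  # alerting: escalate only when the fix fails
            outcome[name] = "alerted"
    return outcome


# Fake environment: "b" recovers after a restart, "c" does not.
state = {"a": True, "b": False, "c": False}
fixable = {"b"}
alerts: List[str] = []


def fake_restart(name: str) -> None:
    if name in fixable:
        state[name] = True


result = heal(list(state), lambda n: state[n], fake_restart, alerts.append)
print(result)
print(alerts)
```

Escalating only after the automated fix has failed keeps alerts limited to cases that need a person.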
Developer Productivity is a Continuous Journey
• Altus – internally developed PaaS platform
• Docker
• ECD
• Parallel test execution
• MMI – Mother May I
• Production Auto-remediation
• Quality Guardrails
Q & A