Migrating and Running Continuous Integration Systems at Scale in AWS
Peter Caron
AWS Loft, München | November 16, 2016
Agenda
1. Transition to Cloud
   Lift and Shift
   Crawl, Walk, Run, Sprint
2. CI / CD Overview
3. Challenges
HERE’s Business
HERE is one of the world’s leading map data companies and is now able to deliver the next generation of mobility and location-based services.
HERE Products
HERE software serves map, traffic and location data to a variety of target platforms:
• HERE Open Location Platform
• Embedded Automobile Navigation
• Enterprise Extensions
• Mobile Apps
HERE’s Challenge
HERE needed a CI system that could scale and could handle complex, heterogeneous deployments and releases.
Before the Cloud
Region #x: Homemade Build Tools
Region #2: Jenkins Under the Desk
Region #1: Jenkins In the Data Centre ?

Build Systems
• ~5+ Builds per month
• 40T+ Tests cycle

CD Platform Pipelines
• 17 Services
• 2 Unique pipelines
• 1000+ VMs on VMware
• ~1 Deployments/month
• 100s Acceptance tests/month
• ~10 Build runs/month
AWS Services in Production
Region #1: security group, EC2 instances, Jenkins Master
Region #2: security group, EC2 instances, Jenkins Master
Services shown: Amazon EFS, Amazon S3, Spot Instances, AWS Device Farm, Amazon CloudWatch, Amazon VPC

Common CI Systems – CCI (Jenkins / Electric Flow)
• 110K+ Builds per day
• 25M+ Tests per day

CI for Micro-services - JaaS (Jenkins as a Service)
• 130 Products and services

CD Platform Pipelines (go as a Service)
• 36 Services
• 668 Unique pipelines
• 600+ VMs on AWS
• 40+ Deployments/month
• 100s Acceptance tests/day
• 1400+ Build runs/month
What kind(s) of integration and testing to use?
Jenkins Unit testing
go integration testing
go deployment
Mesos / Marathon deployment orchestration
Real Device testing
Customer Integration
Static Data Services | Micro Services | Real-time Data Services | Embedded and Downloaded Applications
Moving our CI / CD infrastructure to AWS …
• Git
• Gerrit
• Jenkins
• Go
• Splunk
It was a simple lift and shift from our local infrastructure.
Plan to Grow
• Get your Workflow right
  • i.e. Get your CI act together first
• Know your Capacity and Limits
• Focus on Testing
• Set Expectations Internally
• Know your Fallback options
• Monitor changes (costs)
Create a Culture Change
1. Start small, iterate
   • A single developer group before your flagship product
2. Understand your changes
   • There is infrastructure outside the control of your developers.
   • Don’t let it become Expensive Hosting 2.0
3. Infrastructure as Code is not just a buzzword
   • Apply it if you have one or more people using CI
4. Measure Results and Adapt WoW
   • Only react to verifiable metrics
What did we learn?
• Don’t trust the plugins
• Capacity is always underestimated
• Costs will be high
• Plan fallback
• Trust the developers – just enough
• Moving to the Cloud will help nothing
• People will use it
… and what could we have done better?
[Diagram: Mainline verification flow, shown once for Client Pipelines and once for Service Pipelines. Stages shown: Pre-submit Verification, Submit Verification, Full Verification, Baseline Sanity Tests, Artifact, Release candidate, E2E (Manual Tests). Timing labels shown: "Every 5 min, Duration < 20 min"; "Runs on each successful SV, Duration < 20 min"; "Runs every 3 hrs, Duration: 3 hrs"; "Runs every day ?, Duration: ?".]
Other Challenges
EC2 Instance types and plugins
• Build Rotator
• Fluentd
• BFA Plugin
• Hierarchy Killer Plugin
• Timestamper Plugin

Special Challenges (Peaks and Valleys)
• CO2 (Choose your AWS region first)
• Performance (Watch your Queues)
• Use Containers (Duh!)
The Advantages of CI in the Cloud
• Security in our infrastructure
• Stability: automated tests run reliably in a consistent infrastructure
• Rapid scaling: slaves come online fast
• Cost control: slaves go off-line fast
• Common AWS tools are known to Engineers
• High availability: Master servers are always available
• Multi-regional presence reduces latency
• Parallel builds and testing will reduce time and costs
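
The "rapid scaling" and "cost control" points, together with the earlier advice to watch your queues, can be illustrated with a small sizing rule. This is only a sketch and not HERE's actual scaling logic: the function name `desired_agents`, its parameters, and the default limits are all hypothetical.

```python
import math


def desired_agents(queued_builds: int, busy_agents: int,
                   builds_per_agent: int = 1,
                   min_agents: int = 0, max_agents: int = 50) -> int:
    """Return how many build agents (slaves) should be running.

    Hypothetical rule: keep every busy agent, add enough agents to drain
    the queue, and clamp the result to a configured range so that cost
    stays bounded.
    """
    needed = busy_agents + math.ceil(queued_builds / builds_per_agent)
    return max(min_agents, min(max_agents, needed))


# An empty queue with no busy agents scales down to the minimum;
# a long queue is capped at the maximum.
print(desired_agents(queued_builds=0, busy_agents=0))
print(desired_agents(queued_builds=100, busy_agents=10))
```

The upper bound matters as much as the lower one: without it, a burst of queued builds translates directly into instance spend.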
Thank you

Contact
Peter Caron
Service Automation and Continuous Integration
HERE
Invalidenstrasse 116
10115 Berlin
Plugins

Plugin name: BuildRotator Plugin
Version affected: ---
Issue: The LogRotator that comes with Jenkins tries to be much smarter than needed. It loads the entire job history at least twice to work out what can be removed and what cannot.
Action: Update plugin. Replace "LogRotator" with "BuildRotator" as the build discard mechanism everywhere.
Download: BuildRotator.hpi

Plugin name: Fluentd
Version affected: ---
Issue: Send data to Fluentd
Action: Install and enjoy.
Download: fluentd.hpi

Plugin name: BFA Plugin
Version affected: 1.13.0 and earlier
Issue: When there is a huge number of aborted builds, BFA needs to process all of them, which creates a queue and slows down Jenkins and its feedback.
Action: The Build failure analyzer => Advanced => "Ignore aborted builds" option should be enabled in the Jenkins configuration.
Download: Available in Jenkins
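
To make the effect of the "Ignore aborted builds" option concrete, here is a minimal sketch of the filtering idea. It is not the plugin's code: the `Build` type and `builds_to_analyze` function are hypothetical, and the sketch assumes that only non-successful builds are candidates for failure analysis. The result names are the standard Jenkins build results.

```python
from typing import Iterable, List, NamedTuple


class Build(NamedTuple):
    number: int
    result: str  # Jenkins result name, e.g. "SUCCESS", "FAILURE", "ABORTED"


def builds_to_analyze(builds: Iterable[Build],
                      ignore_aborted: bool = True) -> List[Build]:
    """Select the builds worth sending to failure analysis.

    Successful builds are skipped (an assumption of this sketch), and
    aborted builds are skipped when ignore_aborted is set, so a flood of
    aborted builds no longer queues up for analysis.
    """
    selected = []
    for build in builds:
        if build.result == "SUCCESS":
            continue
        if ignore_aborted and build.result == "ABORTED":
            continue
        selected.append(build)
    return selected


history = [Build(1, "SUCCESS"), Build(2, "FAILURE"),
           Build(3, "ABORTED"), Build(4, "UNSTABLE")]
print([b.number for b in builds_to_analyze(history)])
```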
Plugins (continued)

Plugin name: HierarchyKillerPlugin
Version affected: 0.98 and earlier
Issue: When the plugin goes to kill an item from the queue, it kills the first job in the queue instead of the job that was connected to the upstream build.
FIX: The correct API call is now used.
Action: Update plugin and have fun.
Download: build-hierarchy-killer.hpi

Plugin name: Timestamper Plugin
Version affected: 1.8.4 and earlier
Issue: Even though Jenkins needs only the last 150 KB, the plugin reads the entire log (up to 3 GB in our case, because of the number of users) to calculate timestamps for the last X lines. The main problem is that the plugin stores timestamps in an encoded format, VarInt.
FIX: Read only the last 150 KB of logs for finished builds.
Action: Update plugin and enjoy.
Download: Available in Jenkins
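
The idea behind the Timestamper fix, reading only the tail of a large log instead of the whole file, can be sketched as follows. This is an illustration in Python, not the plugin's Java implementation; `read_log_tail` and `TAIL_BYTES` are hypothetical names, and the 150 KB figure is taken from the slide.

```python
import os

TAIL_BYTES = 150 * 1024  # 150 KB, the figure quoted in the slide


def read_log_tail(path: str, max_bytes: int = TAIL_BYTES) -> bytes:
    """Return at most the last max_bytes of the file at path.

    Seeking to the tail avoids reading a multi-gigabyte log from the
    start just to display its final lines.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as log:
        log.seek(max(0, size - max_bytes))
        return log.read()
```

The cost of the read then depends on `max_bytes` rather than on the total log size, which is what matters when logs grow to several gigabytes.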