Continuous integration at scale

Vivek Singh, ThoughtWorks

CI Design, more than technology

• application architecture
• development process
• human behavior
• art of compromise

What creates scale for CI?

• number of committers
• frequency of commits
• size of codebase
• frequency of releases

Experience from a project

• 100 committers
• ~2 commits a day per committer
• distributed
• 45-day release cycle

Some context

• C#, .NET
• inherited (non-buildable) codebase
• one of the busiest websites in the UK and its complete backend
• distributed team (Bangalore, London, Pune)
• used Go (formerly Cruise)

Best way to understand

• starting point
• witness the evolution
• what worked and what didn’t

Server under the desk

• small team
• lots of external dependencies
• environment set up carefully and painfully

And soon long build times

• but before that……

CI Users

• developers
• analysts
• project manager

Developers want

• fast
• reliable
• and it always passes

QA

• want it to provide good builds

…so that they can test new things and verify known issues

Project Manager

• should catch important bugs

…so that the software is closer to being shipped

CI can be quick, cheap and useful

pick two

Continuous…

• …integration
• …(automated) testing
• …deployment

Continuous Integration

• meaningful handover to next stage of delivery process

• running only unit tests misses the point

Back to long build times

• lots of code and tests
• multiple teams working on different parts of the codebase

Multiple Single Jobs Build?

[Diagram: Source Control(s) supply materials to Job A, Job B and Job C; each job produces its own output]

e.g. Hudson Slaves

Which is green build?

• material (x, y) => (a, b) Green
• material (x) => (c) Green
• material (y) => (c) Red

Multiple Single Jobs Build

• provides wrong build to downstream (e.g. QA)
• reason: no synchronization on materials
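The synchronization problem above can be sketched in a few lines. This is a minimal illustration, not how Go or Hudson implement it: `JobRun` and `synchronized_green` are hypothetical names, and the rule assumed here is that a build only counts as green for downstream users when every job passed and every job that consumed a given material built from the same revision of it.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    """One finished job (hypothetical structure, for illustration only)."""
    job: str
    materials: dict  # material name -> revision the job was built from
    passed: bool


def synchronized_green(runs):
    """Return True only if all jobs passed on matching material revisions."""
    if not runs:
        return False  # nothing ran, so there is nothing to hand downstream
    seen = {}  # material name -> first revision observed
    for run in runs:
        if not run.passed:
            return False
        for name, revision in run.materials.items():
            # setdefault records the first revision; later runs must match it.
            if seen.setdefault(name, revision) != revision:
                return False
    return True


if __name__ == "__main__":
    runs = [
        JobRun("A", {"x": 10, "y": 7}, passed=True),
        JobRun("B", {"x": 11}, passed=True),  # built from a newer x
    ]
    # Both jobs are green on their own, but not on the same revision of x.
    print(synchronized_green(runs))
```

Each job in the example is green in isolation, yet the combination is rejected because the jobs disagree on the revision of `x`. That is the situation in which independent jobs hand a wrong build to QA.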

Pipelined Builds

[Diagram: pipelines chained from Source Control; one pipeline runs Job A and Job B, another runs Job D and Job E; each pipeline takes materials and produces output(s)]

Pipelined Builds, Why

• mimics component dependency, hence feels right

• no unnecessary builds, optimum use of resources

Pipelined Builds, Why Not?

• material sync issue
• complex to understand
• longer build time
• difficult to track material flow
• different from developer build

Staged Team Commit

[Diagram: each team's committers work against a local source control with its own continuous integration, local testing and local output; manual periodic merges move changes into the main source control, whose continuous integration produces the release deliverable]

Staged Team Commit, Why?

• provides isolation
• no need to build everything

Staged Team Commit, Why Not?

• huge merge problems
• increase in testing effort

(we tried this with SVN; it might be better with Git)

Parallel Jobs Build

[Diagram: the developer commits to Source Control(s); a Material Synchronizer hands materials to Job A, Job B and Job C, which run in parallel; job labels shown: Regression, Firefox, Chrome]

Dependency Build

[Diagram: dependency tree with A at the top, B and C below it, and E and F at the bottom]

A => a.compile, a.test
B => a.compile, b.compile, b.test
E => a.compile, b.compile, e.compile, e.test
F => a.compile, b.compile, c.compile, f.compile, f.test
Smoke => all.compile, smoke
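The job-to-steps mappings for the dependency build can be derived mechanically from a dependency graph. The sketch below is one way to do that, not a description of how Go computes it. The edges in `DEPENDS_ON` are an assumption read off the tree and the listed steps (B and C depend on A, E on B, F on B and C), and `build_steps` is a hypothetical helper that assumes the graph has no cycles.

```python
# Assumed dependency graph: module -> modules it depends on directly.
DEPENDS_ON = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "e": ["b"],
    "f": ["b", "c"],
}


def build_steps(module, depends_on=DEPENDS_ON):
    """List the compile steps a job needs, dependencies first, then its test."""
    ordered = []
    seen = set()

    def visit(name):
        # Depth-first walk: emit every dependency before the module itself.
        if name in seen:
            return
        seen.add(name)
        for dep in depends_on[name]:
            visit(dep)
        ordered.append(name)

    visit(module)
    return [f"{name}.compile" for name in ordered] + [f"{module}.test"]


if __name__ == "__main__":
    for job in ("a", "b", "e", "f"):
        print(job.upper(), "=>", ", ".join(build_steps(job)))
```

Each job compiles only what it transitively depends on and runs only its own tests, which is what lets the jobs run in parallel without building everything.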

Parallel Jobs Build

• not every CI issue can be solved without changing the architecture
– modularization
– testability without external dependencies
• cannot do this with any tool other than Go

Continuous integration and virtualization

• clean build
• Subversion
• Git

I am a developer

• I want to do the right thing
• I don’t understand the CI design
• I also forget to check the build status before committing/pushing
• I don’t want to delay fixing the build

Commit Gate

[Diagram: committers go through a pre-commit hook that checks the Continuous Integration build status (Green, Yellow) before the change reaches Source Control]
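A commit gate of this kind can be sketched as a small hook script. This is an illustration under stated assumptions: the slides do not say which statuses let a commit through or what yellow means, so the default policy below (only green passes) is a guess, and `fetch_status` is a hypothetical callable standing in for whatever query the CI server offers.

```python
import sys

# Assumed policy: only a green build lets a commit through.
ALLOWED_STATUSES = {"green"}


def commit_allowed(status, allowed=ALLOWED_STATUSES):
    """Decide whether a commit may proceed for the given build status."""
    return status.strip().lower() in allowed


def run_hook(fetch_status):
    """Return a hook exit code: 0 accepts the commit, 1 rejects it.

    fetch_status is a hypothetical callable returning the current build
    status as a string, e.g. "green", "yellow" or "red".
    """
    status = fetch_status()
    if commit_allowed(status):
        return 0
    print(f"Commit rejected: build status is {status!r}.", file=sys.stderr)
    return 1


if __name__ == "__main__":
    # Stand-in status source for demonstration purposes only.
    sys.exit(run_hook(lambda: "green"))
```

Putting the check in a hook means a developer who forgets to look at the build status is still stopped from committing on top of a broken build.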
