Co-evolution of Infrastructure and Source Code: An Empirical Study
Yujuan Jiang, Bram Adams
MCIS lab, Polytechnique Montreal, Canada
Infrastructure Code Automates Environment Setup
[Figure: Commit, Build, then Deploy to Server 1 (Ubuntu) and Server 2 (centOS)]
Automate Instantiation of Web Server with Puppet & Chef
# Chef snippet
case node[:platform]
when "ubuntu"
  package "httpd-v1" do
    version "2.4.12"
    action :install
  end
when "centOS"
  package "httpd-v2" do
    version "2.2.29"
    action :install
  end
end

# Puppet snippet
case $platform {
  'ubuntu': {
    package { 'httpd-v1': ensure => "2.4.12" }
  }
  'centOS': {
    package { 'httpd-v2': ensure => "2.2.29" }
  }
}
Infrastructure Code Widely Used by Large Companies
Uses Both Chef & Puppet + Large Data Set of 262 Repos
Preliminary & Research Questions
PQ1: How many infrastructure files does a project have?
PQ2: How many infrastructure files change per month?
PQ3: How large are infrastructure system changes?
RQ1: How tight is the coupling between infrastructure code and other kinds of code?
RQ2: Who changes infrastructure code?
[Figure: file categories (Inf, Bld, Prod, Test) and the corresponding roles (infrastructure developer, build developer, production developer, tester)]
Approach
1. Collect all files from repos.
2. Classify files into 5 groups (Inf, Bld, Prod, Test, Other; "Other" is discarded) for each project.
3. Split projects into 2 groups: Multi & Single.
4. Preliminary analysis: statistical visualization of monthly change ratio and average churn.
5. RQs (coupling relation): commit co-change and ownership coupling.
[Figure: roles behind each file category: infrastructure developer (?), build developer, production developer, tester]
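The file classification step could be sketched roughly as follows. The path and extension rules are illustrative assumptions only (the slides do not give the study's actual rules), and `classify_file` and `classify_project` are hypothetical helper names.

```python
from collections import Counter

# Illustrative rules only: the study's real classification rules are not
# given on the slides. Each rule is (category, predicate on a lowercase path).
RULES = [
    ("Inf", lambda p: p.endswith(".pp") or "/recipes/" in p or "/manifests/" in p),
    ("Bld", lambda p: p.endswith(("makefile", "pom.xml", "setup.py", "build.xml"))),
    ("Test", lambda p: "/test/" in p or "/tests/" in p
        or p.rsplit("/", 1)[-1].startswith("test_")),
    ("Prod", lambda p: p.endswith((".py", ".java", ".c", ".rb"))),
]


def classify_file(path: str) -> str:
    """Return the first matching category, or "Other" if no rule matches."""
    p = "/" + path.lower()  # leading slash so top-level directories match too
    for category, matches in RULES:
        if matches(p):
            return category
    return "Other"


def classify_project(paths):
    """Count files per category, dropping "Other" as the approach does."""
    counts = Counter(classify_file(p) for p in paths)
    counts.pop("Other", None)
    return dict(counts)
```

Rule order matters in this sketch: a Chef recipe ending in `.rb` is matched as Inf before the generic Prod rule is tried.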
Case Study Results
PQ1: Infrastructure files are almost as large as source code and test files.
[Figure: two plots over the categories Infrastructure, Build, Production and Test, one labelled File Size (LOC); values shown on the slide: 1100, 10000, 2,486, 54, 2991, 2768]
PQ2: The monthly change ratio of infrastructure files has a median value of 0.28, comparable to production files and higher than build and test files.
[Figure: proportion of files changed per month, panels "Infrastructure & Build" and "Production & Test"; values shown on the slide: 0.28, 0.28, 0.18, 0.21]
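A minimal sketch of how a monthly change ratio of this kind could be computed. It assumes the ratio is the number of distinct files of a category changed in a month divided by the total number of files in that category; the function name and the example data are hypothetical.

```python
from statistics import median


def monthly_change_ratios(changes_by_month, total_files):
    """Map each month to (#distinct files changed that month) / total_files.

    changes_by_month: dict of month label -> iterable of changed file paths.
    total_files: number of files in the category (assumed constant here).
    """
    if total_files <= 0:
        raise ValueError("total_files must be positive")
    return {
        month: len(set(paths)) / total_files
        for month, paths in changes_by_month.items()
    }


# Hypothetical data for one category with 4 files in total.
example = {
    "2014-01": ["a.pp", "b.pp", "a.pp"],  # 2 distinct files
    "2014-02": ["c.pp"],                  # 1 distinct file
    "2014-03": [],                        # no changes
}
ratios = monthly_change_ratios(example, total_files=4)
median_ratio = median(ratios.values())
```

A real analysis would also have to track how the number of files in each category changes over time, which this sketch ignores.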
PQ3: The average churn per infrastructure file is the highest across all file categories.
[Figure: average MCF (Monthly Churn/File), panels "Infrastructure & Build" and "Production & Test"]
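One way to read the MCF metric is sketched below. It assumes churn means lines added plus lines deleted and that the monthly churn is divided by the number of files in the category; the slides define neither detail, and `average_mcf` is a hypothetical name.

```python
def average_mcf(monthly_churn, file_count):
    """Average of (monthly churn / file count) over all months.

    monthly_churn: list of (lines_added, lines_deleted) totals, one per month.
    file_count: number of files in the category (assumed constant).
    """
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    if not monthly_churn:
        return 0.0
    per_month = [(added + deleted) / file_count for added, deleted in monthly_churn]
    return sum(per_month) / len(per_month)


# Hypothetical numbers: 3 months of churn for a category of 10 files.
mcf = average_mcf([(120, 30), (40, 10), (0, 0)], file_count=10)
```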
RQ1: Changes to infrastructure files are tightly coupled with changes to test and production files.
[Figure: "Implication of Infrastructure to other Category Code Change"; co-change probability (axis 0.0 to 0.5) for Infrastructure <=> Build, Infrastructure <=> Production and Infrastructure <=> Test. Legend: probability that left requires changes to right, and vice versa. Values shown on the slide: 0.2637, 0.4583, 0.0347, 0.1085, 0.0885, 0.2578]
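The "probability that left requires changes to right" can be estimated as a conditional frequency over commits. The sketch below assumes each commit has already been reduced to the set of file categories it touches; the function name and the commit data are hypothetical.

```python
def cochange_confidence(commits, left, right):
    """Estimate P(right changes | left changes) over a list of commits.

    commits: iterable of sets of file categories touched by each commit.
    Returns None when no commit touches `left`.
    """
    with_left = [c for c in commits if left in c]
    if not with_left:
        return None
    return sum(1 for c in with_left if right in c) / len(with_left)


# Hypothetical commit history, each commit reduced to the categories it touches.
commits = [
    {"Inf", "Test"},
    {"Inf"},
    {"Test"},
    {"Inf", "Test", "Prod"},
    {"Prod"},
    {"Test", "Prod"},
]
inf_to_test = cochange_confidence(commits, "Inf", "Test")  # 2 of 3 Inf commits
test_to_inf = cochange_confidence(commits, "Test", "Inf")  # 2 of 4 Test commits
```

The measure is asymmetric, which is why the figure reports a value for each direction of every pair.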
RQ1: The most common reasons for the coupling between Infrastructure and Test are "Integration" and "Update".
INTEGRATION (e.g., enabling new test modules or integrating new test cases)
UPDATE (e.g., changing the value of a global variable)
Check out our paper, please! :)