
COSMO/COSMO-CLM SVN to git migration

Katie Osterried, Carlos Osuna

C2SM, Center for Climate Systems Modeling

December 7, 2015


Contents

1 Introduction

2 Glossary of version control terms

3 Git vs. SVN
  3.1 Why are we switching?
  3.2 Differences between SVN and git

4 Structure of the C2SM git repositories
  4.1 Mapping of the SVN repositories to git repositories
  4.2 Permissions

5 Working with the C2SM-RCM git repositories
  5.1 Central repository workflow
  5.2 Collaborative development
  5.3 Issue Tracking
  5.4 Usage examples
    5.4.1 User wants to develop new feature
    5.4.2 User wants to incorporate new cosmo version
    5.4.3 Two users want to share a bug fix
  5.5 Best Practices for using git

6 The migration from SVN to git
  6.1 Migration plan
  6.2 How to migrate your personal branches

7 Resources and Help


1 Introduction

This document describes the migration of the COSMO model and related codes from the Subversion (SVN) version control system to the git version control system. The COSMO codes hosted both at CSCS (cosmo.cscs.ch) and on the HPCForge website (scm.hpcforge.org/var/lib/gforge/chroot/scmrepos/svn/cclm-dev/) will be migrated to the git version control system in December of 2015. This document is intended to provide all the required information about the migration to the users of these two SVN repositories.

The Subversion COSMO repositories have proved to be a useful tool for the C2SM community, as there are several important functionalities that a code versioning system provides. The code repositories are used to track changes to the code and the reasons for those changes as the code develops through time. They are also used by groups of users to collaboratively develop codes. The administrators of the repositories use them to provide the latest versions of codes to the users and distribute bug fixes to the whole community. The goal of this migration from SVN to the git version control system is to upgrade to a system that will better support those useful functionalities.

This document is divided into four main sections. The first main section, Section 3, describes the reasons for the migration and some important differences between SVN and git. This is followed by Section 4, which describes how all of the codes currently in the two SVN repositories will be hosted in git. The permissions settings for access to the code are also discussed in this section.

The next main section, Section 5, describes how the users will work with the git repositories. This section includes descriptions of a workflow for use with the code hosted in git, a workflow for collaborative development of software, and some detailed step-by-step examples of how to effectively use the git version control system.

In the final main section, Section 6, the details of the migration of the code from SVN to git are provided. This includes the timeline for the migration and detailed instructions for users to migrate their own personal code history to the new git system.

Directly following this introduction, Section 2 is a glossary of terms related to version control systems that may prove useful to readers not yet familiar with git and version control system terminology.

Section 7 at the end of the document lists resources that the users of the new git repositories may find useful. This includes resources for using git, using the web hosting service Github, accessing and using the C2SM repositories, and support during the migration.


2 Glossary of version control terms

branch: an independent line of development.

clone: copy a repository into a new directory.

commit: a snapshot of the code.

fork: a copy of a repository made on a web host.

master: the main line of development in git.

merge: join two or more code development histories.

pull request: a request to merge your code changes into a central repository or branch.

remote: any repository linked to the local repository.

SVN: Subversion version control system.

tag: a frozen reference to a particular commit.

trunk: the main line of development in SVN.

3 Git vs. SVN

3.1 Why are we switching?

Git is a distributed version control system that was written in 2005. Since then, it has become a very popular and widely used tool. Git was designed for collaborative, open source development of code, and that design means that it has some inherent advantages over SVN. Because of the way that git interacts with remote servers, it tends to be faster than SVN. Git repositories are smaller than SVN ones, because git stores the code change history in a more efficient way than SVN. Git was designed to make integration of code changes from multiple developers simple. This means the merging of code branches can be accomplished with fewer commands and fewer extraneous changes to the commit history than in SVN.

In addition to git’s inherent advantages, another advantage of using git is the tools and interfaces that have developed due to its popularity. The web interfaces that are used with git are user-friendly and well-supported. They allow you to easily visualize the history and current status of your code. You can track the contributions to your code made by collaborators and communicate with collaborators easily.

Because git was designed to be used for heavily collaborative projects, there are now some well established tools for the reviewing of code with git. This feature is one of the main reasons for the migration of the COSMO code from SVN to git. Code review is an important part of any collaborative software development process. Code review helps users to find and fix bugs, makes sure that code follows established coding standards, and ensures that there are readable and useful comments in the code and that the code adheres to the overall software design. All of the C2SM repository users are encouraged to incorporate code review with git into their collaborative code development projects.

3.2 Differences between SVN and git

There are some differences between SVN and git that will influence the structure of the code storage and the workflow of the users of the repositories. Some differences between SVN and git and the implications of those differences for the users of the repositories are listed next.

1. SVN is a centralized version control system and git is a distributed version control system. In a distributed version control system, when a user copies the code from wherever it resides, they receive the entire repository, including all of the history. This is different from SVN, where the user would copy only a part of the central repository and work with that.

Implications for the user:

• There will be many small git repositories instead of a few large ones. That way, the user can copy and work with only the parts of the code that they need to.

• Personal development branches will not be stored in the central git repositories, so users will not have to download development branches from users that are totally unrelated to their own work.

2. In SVN, branches and tags are simply copies of a folder containing the code from a certain commit. SVN doesn’t recognize the difference between a commit of code in the trunk or a tag or a branch. They are all just commits. However, in git, there are very clear definitions of the main line of development (master), branches, and tags. Branches and tags are not copies of the code, but pointers in the history to a certain commit. A tag is a fixed reference and is not meant to be changed after it is created. Additionally, there is no distinction between branch tags and trunk tags, as we have currently been using in the SVN repository.

Implications for the user:

• The COSMO git repositories will follow the standard version control system layout. This means branches and tags have to be directly related to the trunk that they are associated with.

• Users will have a repository for each code that they work with. These repositories will contain only the master branch and their personal branches.
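
The statement in item 2 that git branches and tags are pointers to a commit, not copies of the code, can be checked in a scratch repository. The following is a minimal sketch only; the directory, file, tag, and branch names (demo, code.txt, v1.0, feature) are invented for illustration and do not refer to any C2SM repository.

```shell
# Sketch only: every name below is hypothetical.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

WORK=$(mktemp -d)
git init -q "$WORK/demo"
cd "$WORK/demo"
git symbolic-ref HEAD refs/heads/master   # call the main branch master, as in this document

echo "first version" > code.txt
git add code.txt
git commit -q -m "Add first version of the code"

git tag v1.0              # a tag: a fixed name for this commit
git branch feature v1.0   # a branch: a movable pointer that starts at the tag

# All three names resolve to the same commit; no code was copied.
TAG_COMMIT=$(git rev-parse "v1.0^{commit}")
BRANCH_COMMIT=$(git rev-parse feature)
HEAD_COMMIT=$(git rev-parse HEAD)
echo "tag:    $TAG_COMMIT"
echo "branch: $BRANCH_COMMIT"
echo "HEAD:   $HEAD_COMMIT"
```

New commits made on the feature branch would move the branch pointer forward, while the tag would keep pointing at the original commit.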

4 Structure of the C2SM git repositories

Because of the distributed nature of git, most of the codes contained in the COSMO SVN repositories will be migrated into individual git repositories. These individual git repositories will be stored in a central location on a website where the C2SM community can access them. These centrally located git repositories will be used to release new versions, features, and bug fixes in the codes to the C2SM community. They will not be used for active code development.

Some of the smaller codes in the SVN repositories will be combined together into one git repository. (The exact mapping of the content of the SVN repositories to git repositories is described in section 4.1.) Breaking up the large SVN repositories into smaller git ones will allow users to obtain only the codes that they want to work with and not have to download a lot of unrelated data. These git repositories will contain the master and tags (where appropriate). They will not contain any other branches. (Note: in git, the main branch of the code is known as the master branch, which is the same as the trunk in SVN. In the rest of this document, the main branch of a git repository will be referred to as master to be consistent with git terminology.) This setup of the centrally located git repositories is depicted in the schematic in Figure 1. The version numbering defined for the codes in the SVN repositories will be maintained in the git repositories. Following that system, a tag will be created whenever there is a new numbered version of a code available.

Figure 1: Schematic of structure of the git repositories

The central repositories will be bare repositories, which means that they do not have a working copy associated with them. It is standard for shared repositories in git to be bare, as this prevents accidental editing and committing of code to shared repositories. Bare repositories are designed this way because it is not intended that they themselves should be used for development, but only to pass code on to users. Therefore, all of the code development will take place in individual copies of these central repositories, using a workflow that is described in more detail in section 5.1.
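
As a small illustration of what "bare" means, the sketch below creates a bare repository in a temporary directory and asks git whether it is bare. The path central.git is a hypothetical stand-in; the real central repositories are created and hosted on Github, not with these commands.

```shell
# Sketch only: central.git is a hypothetical local stand-in for a hosted repository.
set -e
WORK=$(mktemp -d)

# --bare creates only the repository data, with no working copy to edit.
git init -q --bare "$WORK/central.git"

# Reports "true" for a bare repository and "false" for an ordinary clone.
IS_BARE=$(git -C "$WORK/central.git" rev-parse --is-bare-repository)
echo "bare: $IS_BARE"
```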

The central repositories will be placed on the git hosting website Github (github.com/C2SM-RCM) where they will be available to the C2SM RCM community. These repositories on Github will be used for new users to access the code, new versions of code to be released to users, and bug fixes and other changes to be shared with users. Github was chosen as the host for the repositories because it offers a web interface for uploading and maintaining repositories as well as tools for collaboration, issue tracking, and visualization of repositories.


The repositories will be hosted on Github using an organization called C2SM-RCM. An organization on Github is a tool for allowing a large group of people to access the same repositories, while keeping the repositories private and setting different permissions levels for different repositories. Members of the C2SM community will be able to create individual accounts on Github and be granted access to the central repositories at C2SM-RCM by sending an email request to the owners of the C2SM-RCM organization. When users want to develop the code, they will copy the central repository into their own individual space on Github and work from there.

Most of the central repositories will be private, because much of the software contained in the repositories is licensed and cannot be publicly shared. When the user makes a copy of a private repository, their copy on Github will also remain private. Once the user has a copy of the repository on the development computer, they must take care not to post the code on a public server or in a public repository on a git web host. Posting a licensed code publicly or providing code access to an unlicensed user would be a clear violation of the C2SM Agreement on the Acceptable Use of Informatics Resources that each user must sign before gaining access to the repositories.

We have purchased a plan on Github in order to create private repositories. The price of these plans is dependent on the number of private repositories we will have. Based on the structure of the current SVN repository, we will require between 25 and 50 private repositories. This includes cosmo, cclm, and int2lm, as well as all the other tools and scripts. On Github, we will pay approximately 100 CHF a month to have up to 50 private repositories. The annual cost to host the code that is currently in the two SVN repositories will therefore be approximately 1200 CHF.

4.1 Mapping of the SVN repositories to git repositories

The codes contained in the two large SVN repositories (cosmo.cscs.ch and HPCForge) will be mapped to approximately 30 smaller git repositories that will be hosted on Github (github.com/C2SM-RCM). The mapping of the SVN repositories to git repositories can be found in Tables 1 and 2 at the end of this document. For each path in the SVN repositories to a particular code, the name of the target git repository is listed. For the most part, each of the new git repositories will contain only one code. However, some of the smaller codes have been grouped together where it was appropriate, and some of the codes which are obsolete will not be migrated. The SVN repositories will be kept intact in a read-only state after the migration, so all the information contained there will still be accessible even if it was not migrated to git. The location of the read-only SVN repositories may change after the migration. In this case, the new location will be communicated to the users of the repositories by email.

Some codes in the cosmo.cscs.ch SVN repository have a tags folder, which contains copies of the code at certain release points. In git, tags are not separate directories, but are simply pointers to a certain snapshot of the code. The trunk tags for most of those codes that have them will be migrated to tags in the central code repositories. The cosmo.cscs.ch SVN repository also has a vendor directory at the same directory level as the trunk directory for several different codes. This vendor directory contains the code as it comes directly from the group that releases it; for example, the COSMO vendor directory contains the official DWD version of the code. In the git repositories, there will be separate vendor repositories for most of the codes that have a vendor directory. These repositories will contain the vendor code and vendor tags only. The vendor copies of the code will be named as <codename>-vendor.

4.2 Permissions

Access to the central repositories hosted on Github is controlled through the use of teams within the C2SM-RCM organization. All users must first sign the C2SM Agreement on the Acceptable Use of Informatics Resources before they are given access to any of the git repositories. The permissions system in the C2SM-RCM organization is based on the current permissions setup for the SVN repositories. There are three levels of permissions:

1. Owners: the Owners team consists of a few people who have complete control over the repositories in the C2SM-RCM organization. They have the ability to create and delete repositories, create new teams, add people to teams, and have read and write access to every repository.

2. Admins: for each individual repository in the C2SM-RCM organization, there is a team called admin-<repository name>. These people have write access to their assigned repository. They are responsible for adding new versions of the code to the repository, tagging new versions of the code, and dealing with bug fixes for their assigned repository. The people on these teams are those who are currently acting as "superusers" in the SVN repository for each specific code.

3. Users: the members of this team have read access to all of the git repositories in the C2SM-RCM organization. This means they can copy the central repositories into their own account on Github and also generate requests to the code owners to have their work integrated into a central repository. They do not have permission to write to any of the central repositories.

5 Working with the C2SM-RCM git repositories

5.1 Central repository workflow

The members of the C2SM-RCM organization on Github will use the centrally located git repositories in what’s known as a forking workflow. In this workflow, the users fork, or copy, the repository from the C2SM-RCM Github account to their personal Github account. They can then clone, or copy, the forked repository from their personal Github account to the desired computer and work with it there. They will use their fork on Github to store their personal changes to the code. A schematic of this workflow follows here.


The user will have a separate repository in their personal Github account for each different code that they are developing and using. It is recommended that when people fork a repository and begin to develop a code, they start from a specific version number, or tag (see Section 5.4 for details on how to accomplish this). This is better for reproducibility than starting with the master branch of one of the central repositories.

A typical step-by-step forking development workflow is described next. This short example assumes that a user is new to the repositories and would like to begin working on the COSMO model.

1. The user must create a Github account and email an owner of the C2SM-RCM organization to ask for access to the central repositories. After signing the C2SM Acceptable Use Agreement, the user is added to the C2SM-RCM organization on Github and assigned to the Users team, granting them read-only access to all of the central repositories.

2. The user forks the cosmo repository by clicking the fork button in the C2SM-RCM/cosmo repository on Github. A copy of the cosmo repository will appear in their personal Github account, titled <username>/cosmo.

3. The user clones the forked repository to the computer they would like to develop on.

4. The user creates a branch for a new feature from a tag and develops changes in that branch.

5. The user saves the changes to the code in the feature branch to the local repository.

6. The user saves the changes to their fork of the code in their Github account.


A full description of this workflow and other usage scenarios, including git commands, can be found in section 5.4 of this document.

5.2 Collaborative development

One of the strengths of git and Github is the support available for collaborative development of software codes. This section describes a recommended workflow for people who would like to work together on code development, for example when multiple new features are being added to a code for a project.

Before a collaborative software development project begins, the people involved should meet and discuss the workflow they intend to use. They need to decide several things, including: how to distinguish individual lines of development, what the naming conventions will be, what the permissions will be for the central repository, and how the merging together of new code features will be accomplished. What follows here is a recommendation for a project workflow using Github.

It is recommended that the project members should work together on the same centralized repository. This can easily be accomplished on Github by creating an organization for the project and adding the project members to the project organization. The code required to start development can then be forked from the C2SM-RCM central repository into the organization account. Each project member can then copy this organization repository to their development machine and work from there.

It is also recommended that the development should be done in branches, and the master branch should be kept stable and only used for the merging in of feature branches when the code features are finished and tested. How to divide the work up into branches is one of the questions that should be decided before the development starts. Ideally, each branch would correspond to a new feature independent of the changes in other branches.

The code developed by one person should be reviewed by at least one other person before it is incorporated in the master code branch. On Github, the mechanism provided for code review is called a pull request. A pull request is simply a request from one developer to have their branch merged back into the master branch of the code. Once a project member has finished developing and testing their code, they can create a pull request in the project repository on Github. Other members of the project organization can comment on the pull request, and follow-up commits can be made to the code based on the pull request. When the code reviewers are satisfied that the code in the pull request is up to the standards of the project, the pull request can be granted and the branch merged into the master branch of the repository. If there are no conflicts between the feature branch and the master branch, then the merging can be done from the Github web interface. If there are conflicts, then the merging must be done by command line on a development computer.

Here is a step by step description of the recommended workflow for projects:

1. The project members should meet and decide on a workflow for the project.

2. One of the project members should create a new organization on Github for the project.

3. A project member then forks the desired code from the C2SM-RCM central repository into the project organization on Github.


4. Each of the project members then copies the project repository to their desired computers for development, using git clone.

5. Each project member develops the code and saves the changes to their local repository, using feature branches as decided on in the workflow meeting.

6. The feature branches should also be stored in the central project repository, so that members that are collaborating to develop a feature can share their code changes.

7. Once a feature branch is complete, one of the developers creates a pull request on Github to merge their branch back into the master branch.

8. After the code in the feature branch has been reviewed by the designated code reviewers using the Github interface, the changes are merged back into the master branch.
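
The git side of steps 4 to 8 can be sketched on a single machine. In the sketch below, a local bare repository (project.git) stands in for the project repository on Github, and a local merge stands in for accepting a pull request, since the review itself happens in the Github web interface. The repository, branch, and file names are hypothetical.

```shell
# Sketch only: project.git, new-feature and the file names are hypothetical stand-ins.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
WORK=$(mktemp -d)

# Stand-in for the project repository on Github: a seed commit, published as a bare copy.
git init -q "$WORK/seed"
cd "$WORK/seed"
git symbolic-ref HEAD refs/heads/master
echo "model code" > model.txt
git add model.txt
git commit -q -m "Initial code"
git clone -q --bare "$WORK/seed" "$WORK/project.git"

# Step 4: a project member clones the project repository.
git clone -q "$WORK/project.git" "$WORK/dev"
cd "$WORK/dev"

# Step 5: develop in a feature branch and commit locally.
git checkout -q -b new-feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"

# Step 6: store the feature branch in the central project repository.
git push -q origin new-feature

# Steps 7 and 8: on Github this would be a pull request and its review.
# Here the reviewed branch is merged locally and master is pushed back.
git checkout -q master
git merge -q --no-ff --no-edit new-feature
git push -q origin master
```

The --no-ff option is a design choice in this sketch: it records an explicit merge commit, so the history shows where the feature branch was integrated.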

5.3 Issue Tracking

Github has a built-in issue tracking feature that will be used to keep track of bugs and problems with the code in the central repositories in the C2SM-RCM organization on Github. This feature can be used by any member of the C2SM-RCM organization to report issues or bugs that are occurring with the code in a repository. When a user encounters a problem with the code, they should first check the issue tracker for the central repository to see if someone has already reported the problem they are experiencing. If not, then they can enter an issue using the Github interface. They do this by clicking on the issues link on the right hand side of the repository page on Github. They can then enter a new issue by clicking the new issue button. They should give a detailed description of the problem as they experienced it. The administrators of the code repository can then add comments and labels to the issue and close the issue when it is resolved. Once an issue has been reported, users can sign up for notifications about the issue, which will update them as the status of the issue changes.

If the user has spotted a bug in the code and figured out how to resolve it, they can first report the issue, and then create a pull request containing the necessary bug fix. The user should reference the issue number in the pull request description together with a closing keyword (for example, "Fixes #<issue number>"). Github will then automatically close the issue when the pull request is accepted and the bug fix is merged into the code in the central repository.

5.4 Usage examples

In this section, some examples are given of possible use cases of the C2SM-RCM git repositories. For each example, the scenario is described and then step-by-step instructions are given and illustrated. The colored rectangles in the illustrations each represent a different git repository, as illustrated in the following legend:


The rectangles are color-coded to represent their location: blue rectangles represent repositories in the C2SM-RCM organization on Github, turquoise rectangles represent repositories in the user accounts on Github, and purple rectangles represent the repositories located on development computers.

5.4.1 User wants to develop new feature

First, consider a new user who would simply like to develop a new feature in the COSMO model.

1. The user must create a Github account and email the owners of the C2SM-RCM organization to ask for access to the central repositories. After signing the C2SM Acceptable Use Agreement, the user is added to the C2SM-RCM organization on Github and assigned to the Users team, granting them read-only access to all of the central repositories.

2. Once the user has access to the C2SM-RCM organization, they should use the web interface to fork the cosmo repository from the C2SM-RCM organization on Github to their own account. This is done by navigating in Github to the desired code repository and clicking on the fork button in the upper right-hand corner of the screen.


The forked repository will be named by Github as <username>/cosmo.

3. Next, the user should copy the forked repository to the computer where they want to work on the development of the code. This is done by navigating to the folder on the development machine where the repository should be stored and typing: git clone fork-url. The fork-url can be found on Github by navigating to the forked repository in the user’s account and looking for the clone URL on the lower right hand side of the screen. Once the clone is complete, the user now has an identical copy of the code repository in their Github account and on their development computer.


4. In the code repository on the development machine, the user should make a new branch for development. This branch should be based on a tagged release of the code, to help with reproducibility of the work. The command to do this is: git branch <feature> <tagname>. For example, if the user would like to make a feature branch called convection starting with the COSMO 5.0.1 version, the command would be git branch convection cosmo5.0.1. Note that git branch only creates the branch; the user then switches to it with git checkout convection.

5. Next, the user will make their code modifications in the feature branch.

6. The user can then use the git add and git commit commands to save the changes to the local copy of the repository. For more information about using git locally, see the Git tutorial slides and the Resources section at the end of this document.

7. Finally, the user should send the changes from their code repository on the development machine to their repository on Github. Because the repository on the development machine was cloned from the one on Github, they are already linked together as remotes. Each remote repository that a given git repository is linked to is given a name to identify it. The code repository on Github is automatically given the remote name of origin during the cloning process. So, the command to save the feature branch from the development repository to the one on Github is: git push origin <feature>
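
The git commands from steps 3 to 7 can be tried end to end without Github access. In this sketch, two local bare repositories stand in for the hosted ones: central.git for C2SM-RCM/cosmo and fork.git for <username>/cosmo (the fork itself is made with the fork button on Github, which a bare copy only imitates). The file names and commit messages are invented for illustration; the branch and tag names follow the example in step 4.

```shell
# Sketch only: central.git and fork.git are hypothetical local stand-ins for Github repositories.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
WORK=$(mktemp -d)

# Stand-in for C2SM-RCM/cosmo: one commit tagged cosmo5.0.1, published as a bare repository.
git init -q "$WORK/seed"
cd "$WORK/seed"
git symbolic-ref HEAD refs/heads/master
echo "cosmo 5.0.1" > version.txt
git add version.txt
git commit -q -m "COSMO 5.0.1"
git tag cosmo5.0.1
git clone -q --bare "$WORK/seed" "$WORK/central.git"

# Step 2 (imitated): the user's fork is a separate copy of the central repository.
git clone -q --bare "$WORK/central.git" "$WORK/fork.git"

# Step 3: clone the fork to the development machine.
git clone -q "$WORK/fork.git" "$WORK/dev"
cd "$WORK/dev"

# Step 4: create the feature branch from the tagged release and switch to it.
git branch convection cosmo5.0.1
git checkout -q convection

# Steps 5 and 6: modify the code and commit the changes locally.
echo "new convection scheme" > convection.txt
git add convection.txt
git commit -q -m "Add convection feature"

# Step 7: push the feature branch to the fork (remote name: origin).
git push -q origin convection
```

After the push, the convection branch exists in the fork but not in the central repository, which matches the rule that personal branches are not stored centrally.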


5.4.2 User wants to incorporate new cosmo version

In this scenario, a user has already forked the cosmo model from the C2SM-RCM organization and cloned the fork to their development computer. They have created their own feature branch, feature, and made some modifications in it. There is now a new version 5.0.2 of COSMO in the tags of the cosmo repository in the C2SM-RCM organization on Github and they would like to add these changes to their feature branch.

1. The user must first link the cosmo repository on their development computer with the one in the C2SM-RCM organization space on Github. This is done by setting the C2SM-RCM Github repository as a remote for the repository on the development computer. Each remote repository that a given git repository is linked to is given a name to identify it. By git convention, the central repository which is the source of the code is named upstream. So, the user must set the C2SM-RCM cosmo repository on Github as a remote called upstream. This is accomplished by executing the following command in the repository on the development computer: git remote add upstream https://github.com/C2SM-RCM/cosmo


2. The user should then copy the 5.0.2 version of cosmo from the tags in the C2SM-RCM cosmo repository on Github to their cosmo repository on the development computer: git fetch --tags upstream. This command will copy all of the new tags from the C2SM-RCM cosmo repository to the development computer.


3. Next, the user should merge the changes into the feature branch. From the feature branch the command to do this is: git merge cosmo5.0.2, where cosmo5.0.2 is the name of the tag. The names of all the tags in a git repository can be displayed with the git tag command.

4. Finally, the user can save these changes to their personal cosmo repository on Github using git push origin feature.
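The four steps above can be rehearsed without touching Github using the sketch below. It rests on several assumptions: local repositories in a temporary directory stand in for the C2SM-RCM cosmo repository (upstream.git) and for the user's fork (fork.git), the file names and contents are invented, and --no-edit is added to git merge only so that the demo runs without opening an editor.

```shell
set -e
demo=$(mktemp -d)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Assumption: local stand-ins for the central repository and the user's fork.
git init -q "$demo/seed"
cd "$demo/seed"
echo "version 5.0.1" > version.txt
git add version.txt
git commit -q -m "COSMO 5.0.1"
git clone -q --bare "$demo/seed" "$demo/upstream.git"
git clone -q --bare "$demo/upstream.git" "$demo/fork.git"

# The user's clone of the fork, with a feature branch that has its own commit.
git clone -q "$demo/fork.git" "$demo/cosmo"
cd "$demo/cosmo"
git checkout -q -b feature
echo "my change" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile a new version is tagged in the central repository.
cd "$demo/seed"
echo "version 5.0.2" > version.txt
git commit -q -am "COSMO 5.0.2"
git tag cosmo5.0.2
git push -q "$demo/upstream.git" HEAD cosmo5.0.2

# Steps 1 to 4 from the text, run in the user's repository.
cd "$demo/cosmo"
git remote add upstream "$demo/upstream.git"   # step 1 (the real URL is given in the text)
git fetch -q --tags upstream                   # step 2
git tag                                        # lists the available tags
git merge -q --no-edit cosmo5.0.2              # step 3
git push -q origin feature                     # step 4
```

In this toy setup the merge cannot conflict, because the two sides change different files. With real code the merge may stop and ask the user to resolve conflicts before committing.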

5.4.3 Two users want to share a bug fix

In this scenario, two users have been making changes to the cosmo code in their individual code repositories. User 2 has committed a bug fix that User 1 would like to obtain. The bug fix was a single commit made by User 2 amidst the other code developments they have made in their feature branch. User 1 does not want to obtain the other code developments made by User 2, but just the bug fix.


1. First, User 1 should link their cosmo repository with User 2’s cosmo repository by adding it as a remote using the command: git remote add user2 user2path, where user2path is the path to User 2’s cosmo repository.

2. Now, User 1 can get the code changes from User 2’s repository using git fetch user2.

3. User 1 can find the desired commit using: git log remotes/user2/feature2, where feature2 is the name of User 2’s feature branch. This will generate a list of all the commits in that branch and their identifying checksums.


In this case, User 1 would like to have only commit B in their branch, because that is the one that contains the bug fix.

4. Finally, User 1 can apply the single commit to their branch: git cherry-pick <commit B>, where <commit B> is the git commit identification checksum (the long sequence of numbers and letters associated with each git commit).

Note: User 2 should also use the issue tracker on the C2SM-RCM cosmo repository to report the bug that they have identified. They should then generate a pull request on Github with the bug fix. This should be done with a branch that only contains the bug fix and no other code changes. This will allow the administrators of the C2SM-RCM cosmo repository to incorporate the bug fix into the repository and then distribute it to the whole community.
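The sequence of steps 1 to 4 can be tried with the sketch below. Everything in it is illustrative: the two users' repositories are throwaway local directories, the branch name feature1 and all file names are invented, and the bug fix is located by its commit message rather than by reading the log by eye.

```shell
set -e
demo=$(mktemp -d)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Assumption: a small local repository is the common starting point of both users.
git init -q "$demo/base"
cd "$demo/base"
echo "model code" > model.txt
git add model.txt
git commit -q -m "Initial version"

# User 2 makes three commits (A, B, C) in feature2; only B is the bug fix.
git clone -q "$demo/base" "$demo/user2"
cd "$demo/user2"
git checkout -q -b feature2
echo "development A" > a.txt && git add a.txt && git commit -q -m "Commit A: new development"
echo "bug fixed" > bugfix.txt && git add bugfix.txt && git commit -q -m "Commit B: bug fix"
echo "development C" > c.txt && git add c.txt && git commit -q -m "Commit C: more development"

# User 1 follows steps 1 to 4 from the text.
git clone -q "$demo/base" "$demo/user1"
cd "$demo/user1"
git checkout -q -b feature1
git remote add user2 "$demo/user2"             # step 1: user2path is a local path here
git fetch -q user2                             # step 2
git log --oneline remotes/user2/feature2       # step 3: list commits and checksums
commit_b=$(git log --format=%H --grep="Commit B" remotes/user2/feature2)
git cherry-pick "$commit_b"                    # step 4: apply only commit B
```

The cherry-pick creates a new commit in User 1's branch with the same changes as commit B but a different checksum, so the two branches do not share that commit in their history.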


5.5 Best Practices for using git

Although users have complete control over their personal repositories, there are some best practice rules that everyone should follow. These will ensure that the work is clearly organized and transparent. Here is a list of recommendations for working with git:

1. Each new feature or functionality should be developed in a unique branch that has a logical name.

2. Commits should be made for each small change or set of related changes.

3. Commit messages should be understandable by everyone. By git convention, the firstline of the commit message should be a summary, followed by a detailed explanation.

4. Repositories should be kept clean and up to date. Branches that are no longer in use should be removed. This is also true for forked repositories and organizations on Github that are no longer being used.

5. Projects:

(a) Choose a workflow at the beginning of the project

(b) Use pull requests and code review to integrate new code features.
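Recommendations 1 to 4 could look as follows in practice. This is a minimal sketch in a throwaway local repository. The branch name, file name and commit texts are invented for illustration, and the local git merge is used only to keep the demo self-contained (in a project, recommendation 5(b) suggests a pull request instead).

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.com"
echo "model code" > model.txt
git add model.txt
git commit -q -m "Initial version"
main_branch=$(git rev-parse --abbrev-ref HEAD) # name of the starting branch

# 1. One branch per feature, with a logical (here hypothetical) name.
git checkout -q -b fix-radiation-timestep

# 2. and 3. A small commit: the first -m is the summary line,
#           the second -m becomes the detailed explanation.
echo "corrected time step" > radiation.txt
git add radiation.txt
git commit -q -m "Fix radiation time step" \
              -m "Explain here what was wrong and how the change fixes it."

# 4. Once the work is integrated, remove the branch that is no longer in use.
git checkout -q "$main_branch"
git merge -q fix-radiation-timestep
git branch -d fix-radiation-timestep
```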

6 The migration from SVN to git

6.1 Migration plan

The migration of the content of the repositories from SVN to git will occur on Monday, December 7th, 2015, except for the COSMO, CCLM, INT2LM, and INT2LM-CLM codes, which will be migrated on Monday, December 14th, 2015. The migration will result in the creation of two kinds of repositories: the central code repositories in the organization space on Github (github.com/C2SM-RCM) and individual repositories for the users including their own personal branches. C2SM will create the central code repositories and post them to the C2SM-RCM organization space on Github. Users will be responsible for migrating information from their own branch(es) of the SVN repositories into an individual git repository. The migrated repositories will contain all of the code history from the SVN repositories, except for a few special cases that have been discussed with the affected users.

On the migration day, the paths to the codes to be migrated in the SVN repositories will be frozen (changed to read-only access). No one will be able to save changes to the code repositories at this time. The migration of the central repositories to the C2SM-RCM organization on Github will then occur. As soon as this is complete and the migration has been tested, users will be notified by email. They can then begin to migrate their personal branches and work with git. After the migration day, everyone will be working solely with git and the Github repositories and there will be no more changes saved to the SVN repositories. After the migration, the SVN repositories will be kept intact in a read-only state so that it will be possible to retrieve information from them at a later date if required. The location of these read-only SVN repositories may change at some point in the future, but users will be notified by email before this happens.

6.2 How to migrate your personal branches

Each user will be responsible for the migration of their own branches of code from the SVN repositories. That is, any information that is found in the https://cosmo.cscs.ch/<codename>/branch or scm.hpcforge.org/branches directories will not be migrated automatically. The user should migrate only those branches that they are still actively using. There is no reason to move branches that are no longer in development.

Here are the steps the user should follow to migrate their own branches. The user will need to follow these steps for each code for which they have branches to migrate.

1. Sign up for an account on Github, if you don’t already have one. This is done by navigating to github.com and clicking the Sign Up button on the upper right hand side of the page.

2. Email the owners of the C2SM-RCM organization at [email protected] to ask for access to the central repositories. Be sure to include your Github username in the email. This can be done well ahead of the migration to prepare.

3. Once you have access to the C2SM-RCM organization and the migration of the central repositories has finished, you should fork the desired code into your personal webspace. This is done by navigating to https://github.com/C2SM-RCM/ and selecting the desired code repository. You can then click the fork button on the upper right hand corner of the screen to generate the fork in your personal account.

4. Now, you should download the prepared migration materials. These consist of a migration script, a README file, and an authors file. The migration materials are provided in a repository on Github called migration. Go to github.com/C2SM-RCM/migration and click on the Download ZIP button. After the file has downloaded, unpack the ZIP file in the location where the migration should take place. Make sure to download the files as a ZIP instead of cloning them as a git repository. This will ensure that you don’t generate a nested git repository when you run the migration script.

5. Next, you need to convert your personal branches from SVN to git. This is done by running the provided migration script called c2sm svn2git.sh. The Ruby utility svn2git is needed to run this script. This utility can be found at: github.com/nirvdrum/svn2git. The svn2git utility is already installed on all the machines at IAC, and available to MeteoSwiss users on Kesch at CSCS. To run the script on Kesch, you must first load two modules: Ruby and git. In order to run the bash script, you will need to fill in variables pertaining to the branches you wish to migrate, as described in the README file. When the script is run, it will do the following:

(a) Generate a git repository on the computer in the location where the script is run. This repository will consist of the master branch of the code and all of the other personal branches you migrated.


(b) Make a link between the local git repository and your fork of the code that is hosted on Github. This is done by setting the forked repository in your personal account on Github as a remote named origin.

(c) Push all of the information from the migrated branches to the fork on Github.

(d) Synchronize the master branches of the local and Github repositories.

When these steps are completed, the git repository on your development computer and the forked repository in your Github account will be identical. You can then start working with git. All of the code commits that you made in SVN now have been converted to git and stored in your local and forked repositories.
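The contents of the migration script are not reproduced in this document, so the following is not the script itself. It is only a rough sketch of what steps (b) to (d) amount to in plain git commands, under the assumption that a local bare repository stands in for the fork on Github and that a small hand-made repository stands in for the result of step (a). The branch and file names are invented.

```shell
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/fork.git"            # assumption: stand-in for the fork on Github

# Assumption: stand-in for the repository generated in step (a).
git init -q "$demo/migrated"
cd "$demo/migrated"
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.com"
echo "trunk code" > code.txt
git add code.txt
git commit -q -m "History imported from the SVN trunk"
git checkout -q -b my_branch                   # hypothetical personal branch
echo "branch work" >> code.txt
git commit -q -am "History imported from a personal SVN branch"

git remote add origin "$demo/fork.git"         # (b) set the fork as the remote origin
git push -q origin --all                       # (c), (d) push every local branch
git push -q origin --tags                      # push tags too, if any were migrated
```

Afterwards the branches in the local repository and in the stand-in fork point to the same commits, which is the state described in the paragraph above.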

7 Resources and Help

Git and Github are widely used and there are many resources for help with these tools. Here, a few are listed for your information. For help with:

1. git

• git help: Displays general information about git and man pages for specific commands.

• http://git-scm.com/doc: Contains a comprehensive git reference manual

• http://gitref.org: A quick reference for commonly used git commands

2. Github

• https://help.github.com/: Help pages for Github

• https://guides.github.com/: Step by step guides to basic Github features.

3. C2SM-RCM repositories: for help with the C2SM-RCM repositories please contact the organization owners at [email protected].

4. Migrating your branches: for questions about the migration please contact either Katherine Osterried ([email protected]) or Carlos Osuna ([email protected]).


Table 1: Mapping of the cosmo.cscs.ch repository to git repositories

SVN repository  Directory   Sub-directories  Sub-sub-directories  Git repository
cosmo.cscs.ch   cosmo       trunk            cclm c2sm            cclm
                                             cosmo c2sm           cosmo
                            vendor           cclm                 cclm-vendor
                                             cosmo                cosmo-vendor
                fieldextra  trunk                                 fieldextra
                flexpart    trunk                                 flexpart
                int2lm      trunk            int2lm c2sm          int2lm
                                             int2lm c2sm clm      int2lm-clm
                            vendor           cclm                 int2lm-clm-vendor
                                             cosmo                int2lm-vendor
                lagranto    trunk            lagranto cosmo iac   lagranto
                lib         libgrib1         trunk                libgrib1
                            libgrib1 idl     trunk                libgrib1
                            libgrib api      trunk                libgrib-api
                                             vendor               libgrib-api-vendor
                            rttov10                               rttov
                            rttov7                                rttov
                mch-tools   LMscripts        trunk                lmscripts
                            bufins                                mchtools
                            dispersion                            oprtools
                            fix2free                              mchtools
                            fourDverif                            oprtools
                            idl                                   mchtools
                            movero           trunk                movero
                            radar                                 oprtools
                            realpath                              oprtools
                            rubylib                               oprtools
                            snowanalysis                          oprtools
                            stat pp                               oprtools
                            surface verif    trunk                surface verif
                            utilities                             oprtools
                            xtract skyguide                       oprtools
                tools       cip                                   mchtools
                            extpar           trunk                extpar
                            ncl cosmolib                          mch-ncl
                            ncl testsuite                         mch-ncl


Table 2: Mapping of the HPCForge repository to git repositories

SVN repository  Directory  Sub-directories     Git repository
HPCForge        COSMO      trunk               cosmo-pompa
                STELLA     trunk               stella
                trunk      INTERNODE           internode
                           PHYSICS STANDALONE  physics-standalone
                env                            env-cscs
                libjasper                      libjasper
