Understanding Git—
How not to lose your HEAD
Sebastian Neuser
HaSi e.V.
2013-09-26
Sebastian Neuser Understanding Git
Who am I?
I am a student of applied computer science (subject: electrical engineering) at the University of Siegen.
I work for os-cillation GmbH, where most projects use Git in conjunction with GitLab for version control and project management (issue tracking, wikis, ...).
I have been using Unix-like operating systems for about ten years now; currently Debian is my system of choice.
I am trying to use as little proprietary software as possible, and you should too!
Although Saint IGNUcius said "vi vi vi is the editor of the beast", I use vim extensively and with passion.
Terms and conditions
This presentation aims to provide an overview of the (imho) most important parts of Git; it is just scratching the surface!
Shell code is highlighted
$ echo like this.
like this.
Lines beginning with a $ are input, others are output.
The presentation is based on Git version 1.7.
I tend to talk fast! Please complain if it’s too fast.
Feel free to ask questions any time.
What is Git?
$ whatis git
git (1) - the stupid content tracker
Git is more than a simple version control system – its designers use the term source code management.
Git is free software.
Git is used in some of the biggest (free and) open source projects (Linux, KDE, Qt, ...).
Git was initially written by Linus Torvalds himself, because Linux needed a new VCS and he felt he could write something better than anything in existence in less than two weeks.
What is Git not?
Git is very unlike centralized VCSs (CVS, Subversion, ...).
Git is (against popular belief) neither very complex nor especially hard to learn.
Git is not well suited for huge repositories!
Git is not an excuse for not making any backups.
Object types
There are four elementary object types in Git:
blob: a file.
tree: a directory.
commit: a particular state of the working directory.
tag: an annotated tag (we will ignore this one for now).
[Diagram: two commits, each referencing a tree; the trees in turn reference blobs.]
Object types – detailed
A blob is simply the content of a particular file plus some meta-data (a short header with the object type and size; the file name and mode are stored in the tree).
A tree contains a list of blobs and/or trees with their corresponding file modes and names. It is stored in a compact binary format, which git cat-file -p prints as text.
A commit is a plain text file containing information about the author of the commit, a timestamp and references to the parent commit(s) and the corresponding tree.
All objects are compressed with the DEFLATE algorithm and stored in the Git object database under .git/objects.
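The three object types can be looked at with the plumbing command git cat-file. The following is only a sketch and not part of the original slides: it builds a throwaway repository in a temporary directory, and the identity, file name and commit message are made up for illustration.

```shell
set -e
# Throwaway repository; all names below are illustrative assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "File 0" > file0
git add file0
git commit -q -m "Add file0"

# Resolve the three objects that make up this commit.
commit=$(git rev-parse HEAD)
tree=$(git rev-parse 'HEAD^{tree}')
blob=$(git rev-parse HEAD:file0)

# cat-file -t prints the object type, -p pretty-prints the content.
commit_type=$(git cat-file -t "$commit")
tree_type=$(git cat-file -t "$tree")
blob_type=$(git cat-file -t "$blob")
echo "$commit_type $tree_type $blob_type"

git cat-file -p "$commit"   # tree, author, committer, message
git cat-file -p "$tree"     # mode, type, hash and name of each entry
git cat-file -p "$blob"     # the file content itself
```

Reading the three outputs from top to bottom follows the same references as the diagram: commit, then tree, then blob.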
The SHA-1
SHA-1 (Secure Hash Algorithm 1) is a 160-bit cryptographic hash function used in TLS, SSH, PGP, ...
Every object is identified and referenced by its SHA-1 hash.
Every time Git accesses an object, it validates the hash.
Linus Torvalds: "Git uses SHA-1 in a way which has nothing at all to do with security. [...] It’s about the ability to trust your data."
If you change only a single character in a single file, all hashes up to the commit change!
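To make the hashing less abstract: for a blob, Git hashes a small header ("blob", the size in bytes, a NUL byte) followed by the file content. The sketch below recomputes such a hash by hand. It assumes the sha1sum tool from GNU coreutils is installed; the sample content is arbitrary.

```shell
set -e
content='File 2'

# What Git computes for this content (no repository needed).
git_hash=$(echo "$content" | git hash-object --stdin)

# The same by hand: header "blob <size in bytes>\0", then the content.
size=$(echo "$content" | wc -c | tr -d ' ')
manual_hash=$( { printf 'blob %s\0' "$size"; echo "$content"; } | sha1sum | cut -d' ' -f1)

echo "$git_hash"
echo "$manual_hash"

# Changing a single character yields a different hash.
changed_hash=$(echo 'File 3' | git hash-object --stdin)
echo "$changed_hash"
```

Because a tree stores the hashes of its blobs and a commit stores the hash of its tree, a changed blob hash propagates upwards in the same way.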
Combining hashes and objects
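[Diagram: commits 9967ccb and 57be356, trees a910656 and 41fe540, blobs 80146af, 50fcd26, d542920 and 50fcd26. Each commit references a tree and each tree references its blobs; the blob 50fcd26 appears under both trees.]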
References
master (as well as every other ref) is a text file which contains nothing more than the hash of the latest commit made on the branch:
$ cat .git/refs/heads/master
57be35615e5782705321e5025577828a0ebed13d
HEAD is also a text file and contains only a pointer to the last object that was checked out:
$ cat .git/HEAD
ref: refs/heads/master
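The same relationship can be checked with commands instead of cat. This is a sketch in a throwaway repository (identity and file name are made up). It assumes that refs are stored as loose files under .git/refs, which is the case in a fresh repository with default settings, and it reads the branch name from HEAD rather than assuming it is called master.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "File 0" > file0
git add file0
git commit -q -m "Add file0"

# HEAD is a symbolic ref: it names the branch that is checked out.
head_target=$(git symbolic-ref HEAD)     # e.g. refs/heads/master
cat .git/HEAD

# The branch ref itself holds the hash of the branch's latest commit.
ref_hash=$(cat ".git/$head_target")
head_hash=$(git rev-parse HEAD)
echo "$head_target -> $ref_hash"
```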
References – visualized
[Diagram: the same object graph (commits 9967ccb and 57be356, trees a910656 and 41fe540, blobs 80146af, 50fcd26, d542920 and 50fcd26), now with the ref master pointing at commit 57be356 and HEAD pointing at master.]
The staging area
The staging area is kind of an intermediate layer between the working directory and the repository, also called the index.
It is a binary file containing a virtual working tree state used to compose commits.
Upon commit, the (virtual) tree of the staging area is tagged with meta data (author name, timestamp, parent, ...) and thus turned into an actual commit.
It is possible to circumvent the staging area and not use it at all, but doing so neglects some of Git’s greatest features.
The index – Shell example
The Git plumbing command git ls-files parses the index file and shows its contents:
$ git ls-files --stage
100644 d542920062e17fd9e20b3b85fd2d0733b1fb7e77 0 file0
100644 50fcd26d6ce3000f9d5f12904e80eccdc5685dd1 0 file1
$ echo File 2 > file2
$ git add file2
$ git ls-files --stage
100644 d542920062e17fd9e20b3b85fd2d0733b1fb7e77 0 file0
100644 50fcd26d6ce3000f9d5f12904e80eccdc5685dd1 0 file1
100644 4475433e279a71203927cbe80125208a3b5db560 0 file2
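How the index becomes a commit can be sketched with another plumbing command, git write-tree, which writes the current index out as a tree object. The example below is an illustration in a throwaway repository (identity and file names are made up); it shows that the tree of the resulting commit is exactly the tree of the index.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "File 0" > file0
echo "File 1" > file1
git add file0            # only file0 is staged; file1 stays untracked

# Turn the current index into a tree object without committing.
index_tree=$(git write-tree)

git commit -q -m "Add file0"
commit_tree=$(git rev-parse 'HEAD^{tree}')
echo "$index_tree"
echo "$commit_tree"

# The commit contains exactly what was staged, not the whole directory.
committed_files=$(git ls-tree --name-only HEAD)
echo "$committed_files"
```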
Creating or cloning a repository – Creating
Initializing a repository:
$ git init demo
Initialized empty Git repository in /home/haggl/demo/.git/
This command automatically creates a working directory and within it a repository (i.e. the .git directory) with a branch called master, which is checked out.
Initializing a bare repository (for server-side operation):
$ git init --bare bare.git
Initialized empty Git repository in /home/haggl/bare.git/
This command creates only the repository itself and the master-branch.
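The difference between the two kinds of repository can be checked as follows. This is a small sketch in a temporary directory; the directory names mirror the ones used above.

```shell
set -e
base=$(mktemp -d)
cd "$base"
git init -q demo
git init -q --bare bare.git

# A normal repository keeps its data in a .git directory inside the
# working directory; a bare one has no working directory at all.
demo_bare=$(cd demo && git rev-parse --is-bare-repository)         # false
server_bare=$(cd bare.git && git rev-parse --is-bare-repository)   # true
echo "$demo_bare $server_bare"
ls demo/.git
ls bare.git
```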
Creating or cloning a repository – Cloning
$ git clone [email protected]:haggl/dotfiles.git
Cloning into dotfiles...
remote: Counting objects: 488, done.
remote: Compressing objects: 100% (251/251), done.
remote: Total 488 (delta 249), reused 434 (delta 202)
Receiving objects: 100% (488/488), 86.36 KiB, done.
Resolving deltas: 100% (249/249), done.
This command creates a working directory, makes a full copy of the original repository and checks out the master-branch. It also sets up the remote repository you cloned from and names it origin.
Staging and committing files – Staging
Staging a new or changed file or folder:
$ git stage <path>
Staging parts of a changed file or folder:
$ git stage -p <path>
Staging all changed files (not adding new ones!):
$ git stage -u
Removing a file from the index (i.e. un-staging it):
$ git reset HEAD <path>
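The effect of these commands on the index can be followed step by step. The sketch below uses a throwaway repository with made-up file names and lists the staged paths after each step (git stage is a synonym for git add).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > tracked
git add tracked
git commit -q -m "Add tracked"

echo two >> tracked      # change to an already tracked file
echo new > untracked     # brand-new file

git stage -u             # stages the change, ignores the new file
staged_after_u=$(git diff --cached --name-only)

git stage untracked      # new files have to be staged explicitly
staged_after_add=$(git diff --cached --name-only)

git reset -q HEAD untracked   # un-stage it again; the file itself stays
staged_after_reset=$(git diff --cached --name-only)

echo "$staged_after_u"
echo "$staged_after_add"
echo "$staged_after_reset"
```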
Adding and committing files – Manipulating files
Discarding all unstaged changes to a file or directory (restoring it from the index):
$ git checkout -- <path>
Renaming a file or directory:
$ git mv <old name> <new name>
Removing a file:
$ git rm <file>
Adding and committing files – Committing
Simple commit, directly specifying the message:
$ git commit -m '<message>'
Showing a full diff in the commit message editor:
$ git commit -v
Overwriting the last commit (i.e. HEAD) in case you messed up or forgot something:
$ git commit --amend
Committing all changes (but not new files!):
$ git commit -a
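What "overwriting" means for git commit --amend can be seen in a short sketch (throwaway repository, made-up names): the branch still has a single commit afterwards, but that commit is a new object with a new hash.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file
git add file
git commit -q -m "Add fiel"            # typo in the message
before=$(git rev-parse HEAD)

git commit -q --amend -m "Add file"    # replace the last commit
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)     # number of commits on the branch
subject=$(git log -1 --format=%s)
echo "$before -> $after ($count commit, subject: $subject)"
```

The changed hash is the reason why amending should happen before a commit has been pushed.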
Displaying information – Index and working directory
Showing a summary of the working directory and the index:
$ git status
Showing a diff between the working directory and the index:
$ git diff
Showing a diff between the index and HEAD:
$ git diff --cached
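The two diff variants compare different pairs of states. A sketch with three versions of one file (throwaway repository, made-up names) makes this visible: v1 is committed, v2 is staged, v3 exists only in the working directory.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo v1 > file
git add file
git commit -q -m "Add file"

echo v2 > file
git add file          # v2 is now staged
echo v3 > file        # v3 exists only in the working directory

unstaged=$(git diff)            # index (v2) vs. working directory (v3)
staged=$(git diff --cached)     # HEAD (v1) vs. index (v2)
echo "$unstaged"
echo "$staged"
```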
Displaying information – Commits
Showing the diff of a specific commit:
$ git show <commit>
Showing the commit log of the current branch:
$ git log
Commit log with a summary of each commit’s changed files:
$ git log --stat
Drawing a nice graph of all commits in all branches:
$ git log --graph --oneline --decorate --all
Handling branches
Creating a branch pointing to HEAD:
$ git branch <branch name>
Creating a branch pointing to a specific commit:
$ git branch <branch name> <commit>
Checking out a branch:
$ git checkout <branch name>
Deleting a branch (-D deletes even unmerged branches; -d refuses to do so):
$ git branch -D <branch name>
Handling branches – continued
Renaming a branch:
$ git branch -m <old name> <new name>
Letting a branch point to a specific commit:
$ git branch -f <branch name> <commit>
Merging a branch into HEAD:
$ git merge <branch name>
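Since a branch is just a ref, all of these commands only move or rename pointers. The following sketch walks through them in a throwaway repository; the branch names topic and finished are made up, and the name of the initial branch is read from HEAD instead of being assumed.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > file
git add file
git commit -q -m "Add file"
base=$(git rev-parse HEAD)
main_branch=$(git symbolic-ref --short HEAD)

git branch topic                 # new branch pointing to HEAD
git checkout -q topic
echo two >> file
git commit -q -a -m "Change file on topic"

git checkout -q "$main_branch"
git merge -q topic               # fast-forward: the ref simply moves
merged=$(git rev-parse HEAD)
topic_tip=$(git rev-parse topic)

git branch -m topic finished     # rename the branch
git branch -f finished "$base"   # let it point to the first commit
moved=$(git rev-parse finished)
git branch -D finished           # delete it
git branch
```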
Stashing
Saving the current working directory and index state into commits and reverting the changes to match HEAD:
$ git stash
Listing all available stashes:
$ git stash list
Showing the diff of a particular stash:
$ git stash show <stash>
Applying and deleting a particular stash:
$ git stash pop <stash>
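A complete stash round trip might look like this. It is a sketch in a throwaway repository with made-up names; the output of the stash commands is discarded to keep the transcript short.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo v1 > file
git add file
git commit -q -m "Add file"

echo v2 > file                    # uncommitted change
git stash > /dev/null             # save it and revert to HEAD
during=$(cat file)
stash_count=$(git stash list | wc -l | tr -d ' ')

git stash pop > /dev/null         # re-apply the change, drop the stash
after=$(cat file)
remaining=$(git stash list | wc -l | tr -d ' ')
echo "$during $stash_count $after $remaining"
```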
Working with remote repositories – Management
Adding a remote repository:
$ git remote add <name> <url>
Renaming a remote repository:
$ git remote rename <old name> <new name>
Removing a remote repository:
$ git remote rm <name>
Working with remote repositories – Downstream data flow
Updating all refs from all remote repositories, downloading the necessary objects and deleting obsolete remote refs:
$ git remote update --prune
Updating all refs from a specific remote repository and downloading necessary objects:
$ git fetch <remote>
Downloading objects and refs from a remote repository and merging a branch into HEAD:
$ git pull <remote> <branch>
Working with remote repositories – Upstream data flow
Pushing a branch to a remote repository:
$ git push <remote> <branch>
Deleting a branch from a remote repository:
$ git push <remote> --delete <branch>
Overwriting a branch on a remote repository:
$ git push -f <remote> <branch>
WARNING: When you are collaborating with others you should not use this command, because it messes up their repositories due to changing hashes!
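The remote commands can be tried without a server, because a remote URL may also be a local path. In the sketch below a bare repository in a temporary directory stands in for the server; the directory names, the identity and the branch name topic are made up.

```shell
set -e
base=$(mktemp -d)
cd "$base"
git init -q --bare server.git          # stands in for the remote server
git init -q work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > file
git add file
git commit -q -m "Add file"
branch=$(git symbolic-ref --short HEAD)

git remote add origin "$base/server.git"
git push -q origin "$branch"           # upstream data flow
git fetch -q origin                    # downstream data flow
remote_tip=$(git rev-parse "origin/$branch")
local_tip=$(git rev-parse HEAD)

git branch topic
git push -q origin topic
git push -q origin --delete topic      # remove the branch on the remote
remote_branches=$(git ls-remote --heads origin | wc -l | tr -d ' ')
echo "$remote_tip $local_tip $remote_branches"
```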
Commit guidelines – Do
Conventions regarding the commit message:
1. The first line is a short (≤ 50 characters) description of what the commit is doing in imperative present tense.
2. Second line is blank.
3. Following lines describe the commit in more detail.
Make small commits! The shorter your commit-diffs, the easier it will be to fix things later on.
Commit early! Once you have committed a change, it’s safe. You can hide the sausage making later (before pushing, of course!) by amending the commit(s).
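The message conventions are simple enough to check mechanically. The function below is a hypothetical helper, not something Git ships: it reads a message from a file and checks the length of the first line and the blank second line. It cannot judge the wording, and in some shells the length is counted in bytes rather than characters.

```shell
# check_message FILE: succeed if the message in FILE has a non-empty
# subject of at most 50 characters and a blank second line.
check_message() {
    subject=$(sed -n '1p' "$1")
    second=$(sed -n '2p' "$1")
    [ -n "$subject" ] || return 1
    [ "${#subject}" -le 50 ] || return 1
    [ -z "$second" ] || return 1
    return 0
}

msg=$(mktemp)

# A message that follows the conventions.
printf '%s\n' 'Fix off-by-one in pager' '' 'The last line was cut off.' > "$msg"
if check_message "$msg"; then good=ok; else good=rejected; fi

# A message without the blank second line.
printf '%s\n' 'Fix off-by-one in pager' 'No blank line here.' > "$msg"
if check_message "$msg"; then bad=ok; else bad=rejected; fi

echo "$good $bad"
```

Git passes the path of the message file as the first argument to a commit-msg hook, so a check along these lines could be wired in there.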
Commit guidelines – Do not
Don’t make commits which leave the working directory in a broken state!
Don’t commit files that can be generated!
Don’t commit configuration files that have to be adjusted on a per-workstation basis!
Don’t commit large binaries unless it is absolutely necessary!
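One common way to keep generated and per-workstation files out of commits is a .gitignore file. The sketch below uses hypothetical file names (main.c, main.o, config.local) in a throwaway repository and shows that git add . skips everything matching the ignore patterns.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ignore object files and a local configuration file.
printf '%s\n' '*.o' 'config.local' > .gitignore

touch main.c main.o config.local
git add .

# Only the source file and .gitignore itself end up in the index.
tracked=$(git ls-files)
echo "$tracked"
```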
Topic branches
Aside from really small changes like cosmetics in comments, whitespace issues and so on you should normally never work on the master-branch but create a special so-called topic-branch.
If you have an issue tracking system like the one on GitHub / in GitLab, it is reasonable to name the topic branches after the issue / ticket – perhaps with a token that identifies you:
$ git branch sn/issue42 origin/master
Rebasing
A rebase essentially does the following:
1. Save all commits that are in the current branch but not in the target branch as diffs.
2. Rewind HEAD to the common ancestor of the two branches.
3. Apply all commits from the target branch.
4. Apply all saved diffs on the target branch in order.
The resulting tree is exactly the same as if the two branches had been merged, only the commit history looks different.
With an interactive rebase you can amend commits further down in the history, change commit messages and even rearrange commits. This is where the fun begins! ;-)
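The claim that merge and rebase lead to the same tree can be tested directly. The sketch below is an illustration in a throwaway repository; the branch names topic, merged and rebased and all file names are made up, and the changes are chosen so that no conflicts occur.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo base > file1
git add file1
git commit -q -m "Base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo topic > file2
git add file2
git commit -q -m "Topic work"

git checkout -q "$main"
echo main > file3
git add file3
git commit -q -m "Main work"

# Variant A: merge topic into a copy of the main branch.
git checkout -q -b merged "$main"
git merge -q -m "Merge topic" topic
merge_tree=$(git rev-parse 'HEAD^{tree}')

# Variant B: rebase a copy of topic onto the main branch.
git checkout -q -b rebased topic
git rebase -q "$main"
rebase_tree=$(git rev-parse 'HEAD^{tree}')

# Same content, different history: one merge commit vs. none.
merges_in_merged=$(git rev-list --merges merged | wc -l | tr -d ' ')
merges_in_rebased=$(git rev-list --merges rebased | wc -l | tr -d ' ')
echo "$merge_tree $rebase_tree"
```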
Rebasing – visualized (starting point)
[Diagram: commit graph before the merge. Commits shown: dcc1a52, 0e74e93, 89f8555, a5edb85. Branch labels: mergeme, master (with HEAD), rebaseme.]
$ git merge mergeme
Merge made by recursive.
file7 | 2 +-
file8 | 2 +-
file9 | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
Rebasing – visualized (after merge)
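[Diagram: commit graph after the merge. Commits shown: dcc1a52, 0e74e93, 89f8555, f9ee4d3 (not present before the merge), a5edb85. Branch labels: mergeme, master (with HEAD), rebaseme.]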
$ git checkout rebaseme
Switched to branch 'rebaseme'
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: Change contents of files 4 through 6.
Rebasing – visualized (after rebase)
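[Diagram: commit graph after the rebase. Commits shown: dcc1a52, 0e74e93, 89f8555, f9ee4d3, 0f2eb73 (a5edb85 no longer appears). Branch labels: mergeme, master, rebaseme (with HEAD).]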
Rebasing – onto master
When you are done with your work on a topic branch, but the master-branch was updated in the meantime, it is best (and in some projects required) to rebase your work on top of the updated branch.
$ git fetch origin
$ git rebase origin/master
It is generally a good idea to do this from time to time even if your work is not yet finished, because if merge conflicts occur, you can fix them while they are still easily manageable.
Pull/Merge requests
In most cases you will not be allowed to push to the main repository (in GitLab: the master branch) directly.
To contribute to a project, you have to
1. clone the main repository,
2. create a topic branch,
3. do your work on it,
4. if necessary rebase your work on the current branch head,
5. test your changes (!),
6. push the changes to your own public repository (in GitLab: the main repository) and
7. submit a so-called pull-request (in GitLab: merge-request) to let the maintainer know that you have changes in your public repository that you want her/him to merge.
Typical workflow – Clone or pull
1 Clone or update the refs of the remote repository:
Clone:
$ git clone <remote url>
$ git branch <topic branch name>
$ git checkout <topic branch name>
Pull:
$ git checkout master
$ git pull origin master
$ git branch <topic branch name>
$ git checkout <topic branch name>
Typical workflow – Commit and push
2 Work on your topic branch:
[... some vimmin’ ...]
$ git add <changed files>
$ git status
$ git commit -v
Repeat . . .
3 Push:
$ git fetch origin
$ git rebase origin/master
[... resolve conflicts if necessary ...]
$ git push origin <topic branch name>
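The whole workflow can be rehearsed locally. In the sketch below a bare repository in a temporary directory plays the server and a second working copy (seed) plays a maintainer who updates the main branch in the meantime. All directory names, identities and file names are made up; the topic branch name follows the sn/issue42 example from the topic branch slide.

```shell
set -e
base=$(mktemp -d)
cd "$base"

# A local bare repository seeded with one commit stands in for the server.
git init -q --bare server.git
git init -q seed
cd seed
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo one > file
git add file
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)
git push -q "$base/server.git" "$branch"

# 1. Clone and create a topic branch.
cd "$base"
git clone -q "$base/server.git" work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b sn/issue42

# 2. Work on the topic branch and commit.
echo two >> file
git add file
git commit -q -m "Extend file"

# Meanwhile the main branch on the server moves on.
(cd "$base/seed" && echo other > other && git add other &&
 git commit -q -m "Add other" && git push -q "$base/server.git" "$branch")

# 3. Rebase onto the updated branch and push the topic branch.
git fetch -q origin
git rebase -q "origin/$branch"
git push -q origin sn/issue42

upstream_tip=$(git rev-parse "origin/$branch")
fork_point=$(git merge-base "origin/$branch" HEAD)
pushed=$(git ls-remote --heads origin sn/issue42 | wc -l | tr -d ' ')
echo "$upstream_tip $fork_point $pushed"
```

After the rebase the topic branch sits directly on top of the updated main branch, which is why the merge base equals the upstream tip.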
Pros
Very fast
Creating a new repository is trivial
The repository uses little space
Branching and merging is easy
Rebase (!)
SHA-1 ensures data integrity
Cons
Sometimes strange behaviour when working with different line endings (CRLF, LF)
Inability to check out only parts of a repository
Documentation (i.e. man-pages) is partly quite technical and thus might not be overly helpful
Possibility of data loss, if you do not know what you are doing
Final thoughts
You should definitely cherish personal ~/.gitconfig and ~/.gitignore configuration files.
Git aliases are a really helpful (and powerful!) thing. They are worth looking into!
git add -p gives you the opportunity to commit code that is not tested! So after doing that, you should git stash and test the tree you actually committed!
When you git reset --hard or git rebase -i, at some point you will need git reflog, so you might as well look into it right away.
As always: Think, before you hit Enter !
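As a small illustration of why the reflog matters, the sketch below throws a commit away with git reset --hard and then recovers it. It runs in a throwaway repository with made-up names and relies on the reflog being enabled, which is the default for repositories with a working directory.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > file
git add file
git commit -q -m "First"
echo two > file
git commit -q -a -m "Second"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1        # "Second" is no longer on the branch
after_reset=$(git rev-parse HEAD)

# The reflog still remembers where HEAD pointed one move ago.
git reflog
previous=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$previous"   # bring the commit back
restored=$(git rev-parse HEAD)
echo "$lost $restored"
```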
Further information
Though it may not be that easy to read, man is still your friend.
The interactive Git cheatsheet.
The Git book.
My personal dotfile repository containing (among other things you might find useful) my global Git configuration files.
Linus’ talk about the advantages of a distributed VCS and why SVN users are stupid and ugly.
https://startpage.com is an interface for the Google search engine, which doesn’t store information about you.
Thank you for your attention!
Questions?