Reference - http://wiki.eclipse.org/EGit/User_Guide#Rebase_Conflicts


Introduction to Rebase

Suppose you are working on branch B, which was branched off from branch A, and in the meantime someone commits to branch A.

        B'    A'
--o--o--o--o--o     Branch A
        |
         --o--o--o  Branch B



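Later, when you want to bring the work on branch B back into branch A, you have to merge branch B into branch A.

But assume the work on B is not finished yet, so you do not merge it.

Instead, you may want to bring branch A into B so that your current work on B can use the changes in commit A'.

Or perhaps you are worried about a flood of conflicts at merge time and want to keep bringing the latest branch A into branch B as you work.

In these cases we choose Rebase instead of Merge. The rebase command works as follows:

  1. Set aside the commits made on branch B after commit B', and reset branch B to commit B'.
  2. Apply to branch B the commits made on branch A after commit B', that is, commit A'.
  3. Reapply the commits set aside in step 1 on top of branch B.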

After the rebase, we get the result we wanted, as shown below.

        B'    A'
--o--o--o--o--o            Branch A
              |
               --o--o--o   Branch B
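
The same sequence can be tried on the command line. The following is a minimal sketch in a throwaway repository, not the EGit procedure itself; the branch names A and B follow the diagrams, while the file names, commit messages, and the demo identity are assumptions made up for illustration.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/A        # name the first branch "A"
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "B' (branch point)"
git checkout -q -b B                      # branch B starts at B'
echo work > b.txt && git add b.txt && git commit -q -m "work on B"
git checkout -q A
echo new > a.txt && git add a.txt && git commit -q -m "A' (new commit on A)"

git checkout -q B
git rebase -q A                           # replay B's commits on top of A'

git log --format=%s                       # newest commit first
```

After the rebase, the log of B lists the commit made on B first, followed by A' and then B', which corresponds to the second diagram.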




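Rebasing in Eclipse

Image:EGit-0.10-StartRebaseFromRepoView.png

EGit for Eclipse supports rebase as described above. There are two ways to rebase:

  1. In the Git Repositories view or from the project's Team menu, select Rebase, then choose the branch, commit, or tag you want to rebase onto.
  2. In the Git Repositories view, directly select the branch you want to rebase onto.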



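Resolving Rebase Conflicts in Eclipse

In real work, however, conflicts often occur during a rebase like the one above. As with a merge, the conflicts have to be resolved, but I once spent a long time stuck because problems kept occurring even after I had resolved the conflicts in Eclipse. It turned out that although I had edited the conflicting files, I had not done the step that records those edits in Git's index. -_- This is not specific to Eclipse; the same step is needed in a Linux environment: after resolving a conflict, you must run git add so that the resolution is recorded in the index. Only then can the merge or rebase proceed. When a conflict occurs during a rebase in Eclipse, the following dialog appears.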


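As the dialog's Action to perform options show, you can skip just the commit that causes the problem and continue the rebase, or abort the rebase.

To resolve the conflicts, use the Merge Tool just as you would for a merge. (Or choose Do nothing and find and fix the problem spots yourself.)

The important point here is that once all edits are done, you must run Git Add! In Eclipse, follow these steps:

  1. Choose Start MergeTool to resolve conflicts and fix the files in the MergeTool, or find the conflicting files manually and edit them directly to resolve the conflict.
  2. Select Team -> Add to Index from the project's menu.
  3. Select Team -> Rebase -> Continue Rebase from the project's menu.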
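
On the command line the three steps map to editing the file, `git add`, and `git rebase --continue`. The sketch below provokes a conflict on purpose in a throwaway repository; the file name, its contents, and the demo identity are assumptions for illustration, and overwriting the file stands in for a real manual resolution.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/A
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"

echo "original" > file.txt && git add file.txt && git commit -q -m "base"
git checkout -q -b B
echo "change from B" > file.txt && git commit -q -am "edit on B"
git checkout -q A
echo "change from A" > file.txt && git commit -q -am "edit on A"

git checkout -q B
git rebase A >/dev/null 2>&1 || echo "rebase stopped on a conflict"

# 1. Resolve the conflict by editing the file (here: simply overwrite it).
echo "change from A and B" > file.txt
# 2. Record the resolution in the index; without this, --continue refuses to proceed.
git add file.txt
# 3. Continue the rebase; GIT_EDITOR=true accepts the commit message unchanged.
GIT_EDITOR=true git rebase --continue >/dev/null

git log --format=%s
```

Skipping step 2 is exactly the mistake described above: the file looks fixed in the working tree, but Git still considers the path unmerged.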

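Not knowing this, I wasted two hours of work yesterday and ended up going through all sorts of pointless trouble, such as backing up the conflicting files, deleting them, and merging again. If I had known step 2 above, I would not have had to resort to backing up files during a merge either. A pity.

Personally, I dislike seeing the merge commits that Merge produces, so I tend to prefer Rebase. Merge commits can be hidden, of course, but passing extra options is a hassle, and the change history also feels cleaner with Rebase.

I should refer to this often in future projects!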

Posted by 라판

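When using git, changes are sometimes lost because someone did not pull before pushing.

Consider the following history.

                    c
          ==========o <-- mywork
        /            \
 o--o--o--o--o--o--o--o--o  <-- master
       a     b        m

Assume that master has been updated up to commit a through a pull.

You then create commit c from the changes in mywork and try to push it.

Meanwhile, however, someone has pushed commit b and several other commits to the master branch.

Since master now contains commits that mywork lacks, the push of mywork would normally be rejected with the following error:

[rejected - non-fast-forward ]

However, due to a user mistake (I could not figure out how this came about), the push went through, merge commit m was created, and the changes along the existing commit path a-b-m disappeared.

To recover in this case, what we can do by hand is save the changes from before commit m locally, copy them back, and then commit and push again.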


Git, however, provides the git revert command, which undoes an earlier commit cleanly without this tedious work.

In particular, to revert a merge commit as in the case above, use the command as follows:

git revert -m <parent number> <merge commit>

A merge commit has two parents, and the parent number decides which of the two is treated as the mainline when reverting. In the commit details, the parent listed first is 1, the second is 2, and so on.

Once git revert has restored the state from before the merge, you can commit the mywork changes again and push to recover properly.
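
As a minimal sketch of the command in action, the following builds a small history with a merge commit m and then reverts it. The file names and the demo identity are assumptions, and parent 1 is chosen as the mainline only for this example; in a real repository, check which parent is which before picking the number.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"

echo a > a.txt && git add a.txt && git commit -q -m "a"
git checkout -q -b mywork
echo c > c.txt && git add c.txt && git commit -q -m "c"
git checkout -q master
echo b > b.txt && git add b.txt && git commit -q -m "b"
git merge -q --no-ff -m "m" mywork        # merge commit m: parent 1 = master side, parent 2 = mywork

git log -1 --format="parents of m: %p"    # lists parent 1 first, then parent 2
git revert --no-edit -m 1 HEAD >/dev/null # keep parent 1 as the mainline, undo what mywork brought in

ls                                        # c.txt is gone again, a.txt and b.txt remain
```

The revert is recorded as a new commit on top of m, so the history itself is not rewritten and the result can be pushed normally.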


Reference - http://namhyung.springnote.com/pages/3132772#toc_6




Git keeps its settings in the gitconfig file, where you can set the author's name, e-mail, and so on.

In a Linux command-line environment, you can change them as follows:

git config --global user.name "<name of user>"
git config --global user.email "<email of user>"

In Eclipse you ultimately have to modify the gitconfig file as well. To do so, go to

Window->Preferences->Team->Git->Configuration->User Settings and click New Entry to add a new entry.



  • E-mail
    • Key : user.email
    • Value : your@email.address
  • Name
    • Key : user.name
    • Value : yourname 
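
To check from the command line which identity Git will actually record, something like the following sketch can be used. It sets the values repo-locally in a throwaway repository so that the global gitconfig stays untouched; the placeholder name and address are the ones from the list above, and the file name is made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Repo-local settings (stored in .git/config) override the --global ones for this repository.
git config user.name "yourname"
git config user.email "your@email.address"

git config --get user.name                # prints the effective name
git config --get user.email               # prints the effective e-mail

echo hello > hello.txt && git add hello.txt && git commit -q -m "check author"
git log -1 --format="%an <%ae>"           # author recorded in the commit
```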


Reference

http://wiki.eclipse.org/EGit/User_Guide


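Git usage tips

Tools/Git 2012.01.27 11:36

1. Adding an empty folder to git

   When setting up a project we usually run "git add .", but empty folders are not added this way.
As a result, path-related linking errors can show up at compile time. Unfortunately, git currently has
no mechanism for storing empty folders. Git stores only files as objects, and when a repository is cloned
it merely creates the folders named in those files' paths.
  Working around this takes a small trick: create a placeholder file in the empty folder.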
 
$ ls -a empty
$ touch empty/.gitignore
$ git add empty
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
# (use "git rm --cached <file>..." to unstage)
#
#  new file: empty/.gitignore
#
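
To see the effect of the placeholder file, the following sketch commits one folder with a placeholder and one that is truly empty, then clones the repository. The directory and file names and the demo identity are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"

mkdir empty really-empty
touch empty/.gitignore                    # placeholder file so git has something to track
echo "int main(void){return 0;}" > main.c
git add .
git commit -q -m "initial commit"

cd "$work"
git clone -q project copy
ls copy                                   # "empty" is present, "really-empty" is not
```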
 
Source: http://ruby.about.com/od/git/f/Can-I-Add-Empty-Directories-To-Git-Repositories.htm

Source - http://nvie.com/posts/a-successful-git-branching-model/

A successful Git branching model

In this post I present the development model that I’ve introduced for all of my projects (both at work and private) about a year ago, and which has turned out to be very successful. I’ve been meaning to write about it for a while now, but I’ve never really found the time to do so thoroughly, until now. I won’t talk about any of the projects’ details, merely about the branching strategy and release management.

It focuses around Git as the tool for the versioning of all of our source code.

Why git?

For a thorough discussion on the pros and cons of Git compared to centralized source code control systems, see the web. There are plenty of flame wars going on there. As a developer, I prefer Git above all other tools around today. Git really changed the way developers think of merging and branching. From the classic CVS/Subversion world I came from, merging/branching has always been considered a bit scary (“beware of merge conflicts, they bite you!”) and something you only do every once in a while.

But with Git, these actions are extremely cheap and simple, and they are considered one of the core parts of your daily workflow, really. For example, in CVS/Subversion books, branching and merging is first discussed in the later chapters (for advanced users), while in every Git book, it’s already covered in chapter 3 (basics).

As a consequence of its simplicity and repetitive nature, branching and merging are no longer something to be afraid of. Version control tools are supposed to assist in branching/merging more than anything else.

Enough about the tools, let’s head onto the development model. The model that I’m going to present here is essentially no more than a set of procedures that every team member has to follow in order to come to a managed software development process.

Decentralized but centralized

The repository setup that we use and that works well with this branching model, is that with a central “truth” repo. Note that this repo is only considered to be the central one (since Git is a DVCS, there is no such thing as a central repo at a technical level). We will refer to this repo as origin, since this name is familiar to all Git users.

Each developer pulls and pushes to origin. But besides the centralized push-pull relationships, each developer may also pull changes from other peers to form sub teams. For example, this might be useful to work together with two or more developers on a big new feature, before pushing the work in progress to origin prematurely. In the figure above, there are subteams of Alice and Bob, Alice and David, and Clair and David.

Technically, this means nothing more than that Alice has defined a Git remote, named bob, pointing to Bob’s repository, and vice versa.
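
As a rough sketch of what such a remote looks like in practice, the following sets up two throwaway repositories on the local disk and lets Alice fetch from Bob. The paths, file name, and identities are assumptions for illustration; in a real team the remote URL would point at Bob's machine or hosted repository.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Bob's repository with one work-in-progress commit.
git init -q bob
( cd bob
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Bob"
  git config user.email "bob@example.com"
  echo feature > feature.txt
  git add feature.txt
  git commit -q -m "Bob's work in progress" )

# Alice's repository: define a remote named "bob" and fetch from it.
git init -q alice
cd alice
git remote add bob "$work/bob"            # Alice defines a remote named "bob"
git fetch -q bob                          # download Bob's commits; origin is not involved
git branch -r                             # lists bob/master
```

Bob would do the same in the other direction with a remote named after Alice.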

The main branches

At the core, the development model is greatly inspired by existing models out there. The central repo holds two main branches with an infinite lifetime:

  • master
  • develop

The master branch at origin should be familiar to every Git user. Parallel to the master branch, another branch exists called develop.

We consider origin/master to be the main branch where the source code of HEAD always reflects a production-ready state.

We consider origin/develop to be the main branch where the source code of HEAD always reflects a state with the latest delivered development changes for the next release. Some would call this the “integration branch”. This is where any automatic nightly builds are built from.

When the source code in the develop branch reaches a stable point and is ready to be released, all of the changes should be merged back into master somehow and then tagged with a release number. How this is done in detail will be discussed further on.

Therefore, each time when changes are merged back into master, this is a new production release by definition. We tend to be very strict at this, so that theoretically, we could use a Git hook script to automatically build and roll-out our software to our production servers every time there was a commit on master.

Supporting branches

Next to the main branches master and develop, our development model uses a variety of supporting branches to aid parallel development between team members, ease tracking of features, prepare for production releases and to assist in quickly fixing live production problems. Unlike the main branches, these branches always have a limited life time, since they will be removed eventually.

The different types of branches we may use are:

  • Feature branches
  • Release branches
  • Hotfix branches

Each of these branches has a specific purpose and is bound to strict rules as to which branches may be its originating branch and which branches must be its merge targets. We will walk through them in a minute.

By no means are these branches “special” from a technical perspective. The branch types are categorized by how we use them. They are of course plain old Git branches.

Feature branches

May branch off from: develop
Must merge back into: develop
Branch naming convention: anything except master, develop, release-*, or hotfix-*

Feature branches (or sometimes called topic branches) are used to develop new features for the upcoming or a distant future release. When starting development of a feature, the target release in which this feature will be incorporated may well be unknown at that point. The essence of a feature branch is that it exists as long as the feature is in development, but will eventually be merged back into develop (to definitely add the new feature to the upcoming release) or discarded (in case of a disappointing experiment).

Feature branches typically exist in developer repos only, not in origin.

Creating a feature branch

When starting work on a new feature, branch off from the develop branch.

$ git checkout -b myfeature develop
Switched to a new branch "myfeature"

Incorporating a finished feature on develop

Finished features may be merged into the develop branch to definitely add them to the upcoming release:

$ git checkout develop
Switched to branch 'develop'
$ git merge --no-ff myfeature
Updating ea1b82a..05e9557
(Summary of changes)
$ git branch -d myfeature
Deleted branch myfeature (was 05e9557).
$ git push origin develop

The --no-ff flag causes the merge to always create a new commit object, even if the merge could be performed with a fast-forward. This avoids losing information about the historical existence of a feature branch and groups together all commits that together added the feature. Compare:

In the latter case, it is impossible to see from the Git history which of the commit objects together have implemented a feature—you would have to manually read all the log messages. Reverting a whole feature (i.e. a group of commits), is a true headache in the latter situation, whereas it is easily done if the --no-ff flag was used.
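
Since the comparison figure is not reproduced here, the difference can be sketched in a throwaway repository. The branch names feature-ff and feature-noff, the file names, and the demo identity are assumptions made up for this illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"
echo 0 > base.txt && git add base.txt && git commit -q -m "start"

# Feature merged with a fast-forward: develop just moves forward, no merge commit.
git checkout -q -b feature-ff
echo 1 > ff.txt && git add ff.txt && git commit -q -m "feature-ff work"
git checkout -q develop
git merge -q --ff-only feature-ff

# Feature merged with --no-ff: a merge commit with two parents is always created.
git checkout -q -b feature-noff
echo 2 > noff.txt && git add noff.txt && git commit -q -m "feature-noff work"
git checkout -q develop
git merge -q --no-ff -m "Merge branch 'feature-noff' into develop" feature-noff

git log --graph --oneline
```

In the graph, the fast-forwarded feature is indistinguishable from ordinary commits on develop, while the --no-ff feature shows up as a side line closed by its merge commit.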

Yes, it will create a few more (empty) commit objects, but the gain is much bigger than that cost.

Unfortunately, I have not found a way to make --no-ff the default behaviour of git merge yet, but it really should be.
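
For what it is worth, current Git versions offer a merge.ff configuration setting that can serve this purpose. The sketch below sets it repo-locally in a throwaway repository and merges without passing --no-ff; the file names and demo identity are assumptions, and this is not part of the original article's workflow.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"
git config merge.ff false                 # repo-local: behave as if --no-ff were always given

echo 0 > base.txt && git add base.txt && git commit -q -m "start"
git checkout -q -b myfeature
echo 1 > f.txt && git add f.txt && git commit -q -m "feature work"
git checkout -q develop
git merge -q --no-edit myfeature          # no --no-ff on the command line

git log -1 --format=%p | wc -w            # number of parents of the newest commit
```

Adding --global to the git config call would apply the setting to every repository of the user, which may or may not be desirable.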

Release branches

May branch off from: develop
Must merge back into: develop and master
Branch naming convention: release-*

Release branches support preparation of a new production release. They allow for last-minute dotting of i’s and crossing t’s. Furthermore, they allow for minor bug fixes and preparing meta-data for a release (version number, build dates, etc.). By doing all of this work on a release branch, the develop branch is cleared to receive features for the next big release.

The key moment to branch off a new release branch from develop is when develop (almost) reflects the desired state of the new release. At least all features that are targeted for the release-to-be-built must be merged in to develop at this point in time. All features targeted at future releases may not—they must wait until after the release branch is branched off.

It is exactly at the start of a release branch that the upcoming release gets assigned a version number—not any earlier. Up until that moment, the develop branch reflected changes for the “next release”, but it is unclear whether that “next release” will eventually become 0.3 or 1.0, until the release branch is started. That decision is made on the start of the release branch and is carried out by the project’s rules on version number bumping.

Creating a release branch

Release branches are created from the develop branch. For example, say version 1.1.5 is the current production release and we have a big release coming up. The state of develop is ready for the “next release” and we have decided that this will become version 1.2 (rather than 1.1.6 or 2.0). So we branch off and give the release branch a name reflecting the new version number:

$ git checkout -b release-1.2 develop
Switched to a new branch "release-1.2"
$ ./bump-version.sh 1.2
Files modified successfully, version bumped to 1.2.
$ git commit -a -m "Bumped version number to 1.2"
[release-1.2 74d9424] Bumped version number to 1.2
1 files changed, 1 insertions(+), 1 deletions(-)

After creating a new branch and switching to it, we bump the version number. Here, bump-version.sh is a fictional shell script that changes some files in the working copy to reflect the new version. (This can of course be a manual change—the point being that some files change.) Then, the bumped version number is committed.

This new branch may exist there for a while, until the release may be rolled out definitely. During that time, bug fixes may be applied in this branch (rather than on the develop branch). Adding large new features here is strictly prohibited. They must be merged into develop, and therefore, wait for the next big release.

Finishing a release branch

When the state of the release branch is ready to become a real release, some actions need to be carried out. First, the release branch is merged into master (since every commit on master is a new release by definition, remember). Next, that commit on master must be tagged for easy future reference to this historical version. Finally, the changes made on the release branch need to be merged back into develop, so that future releases also contain these bug fixes.

The first two steps in Git:

$ git checkout master
Switched to branch 'master'
$ git merge --no-ff release-1.2
Merge made by recursive.
(Summary of changes)
$ git tag -a 1.2

The release is now done, and tagged for future reference.
Edit: You might as well want to use the -s or -u <key> flags to sign your tag cryptographically.

To keep the changes made in the release branch, we need to merge those back into develop, though. In Git:

$ git checkout develop
Switched to branch 'develop'
$ git merge --no-ff release-1.2
Merge made by recursive.
(Summary of changes)

This step may well lead to a merge conflict (probably even, since we have changed the version number). If so, fix it and commit.

Now we are really done and the release branch may be removed, since we don’t need it anymore:

$ git branch -d release-1.2
Deleted branch release-1.2 (was ff452fe).

Hotfix branches

May branch off from: master
Must merge back into: develop and master
Branch naming convention: hotfix-*

Hotfix branches are very much like release branches in that they are also meant to prepare for a new production release, albeit unplanned. They arise from the necessity to act immediately upon an undesired state of a live production version. When a critical bug in a production version must be resolved immediately, a hotfix branch may be branched off from the corresponding tag on the master branch that marks the production version.

The essence is that work of team members (on the develop branch) can continue, while another person is preparing a quick production fix.

Creating the hotfix branch

Hotfix branches are created from the master branch. For example, say version 1.2 is the current production release running live and causing troubles due to a severe bug. But changes on develop are yet unstable. We may then branch off a hotfix branch and start fixing the problem:

$ git checkout -b hotfix-1.2.1 master
Switched to a new branch "hotfix-1.2.1"
$ ./bump-version.sh 1.2.1
Files modified successfully, version bumped to 1.2.1.
$ git commit -a -m "Bumped version number to 1.2.1"
[hotfix-1.2.1 41e61bb] Bumped version number to 1.2.1
1 files changed, 1 insertions(+), 1 deletions(-)

Don’t forget to bump the version number after branching off!

Then, fix the bug and commit the fix in one or more separate commits.

$ git commit -m "Fixed severe production problem"
[hotfix-1.2.1 abbe5d6] Fixed severe production problem
5 files changed, 32 insertions(+), 17 deletions(-)

Finishing a hotfix branch

When finished, the bugfix needs to be merged back into master, but also needs to be merged back into develop, in order to safeguard that the bugfix is included in the next release as well. This is completely similar to how release branches are finished.

First, update master and tag the release.

$ git checkout master
Switched to branch 'master'
$ git merge --no-ff hotfix-1.2.1
Merge made by recursive.
(Summary of changes)
$ git tag -a 1.2.1

Edit: You might as well want to use the -s or -u <key> flags to sign your tag cryptographically.

Next, include the bugfix in develop, too:

$ git checkout develop
Switched to branch 'develop'
$ git merge --no-ff hotfix-1.2.1
Merge made by recursive.
(Summary of changes)

The one exception to the rule here is that, when a release branch currently exists, the hotfix changes need to be merged into that release branch, instead of develop. Back-merging the bugfix into the release branch will eventually result in the bugfix being merged into develop too, when the release branch is finished. (If work in develop immediately requires this bugfix and cannot wait for the release branch to be finished, you may safely merge the bugfix into develop now already as well.)
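
A minimal sketch of that exception, assuming an open release branch with the hypothetical name release-1.3; the VERSION and fix.txt files and the demo identity are likewise made up. The hotfix is finished on master as usual, and the second merge goes to the release branch when one exists, otherwise to develop.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"               # repo-local identity (hypothetical)
git config user.email "demo@example.com"
echo 1.2 > VERSION && git add VERSION && git commit -q -m "Release 1.2"
git branch develop
git branch release-1.3 develop            # a release branch is currently open (hypothetical name)

git checkout -q -b hotfix-1.2.1 master
echo fix > fix.txt && git add fix.txt && git commit -q -m "Fixed severe production problem"

# Finish the hotfix on master and tag it.
git checkout -q master
git merge -q --no-ff -m "Merge hotfix-1.2.1" hotfix-1.2.1
git tag -a 1.2.1 -m "Version 1.2.1"

# A release branch exists, so merge the hotfix there instead of into develop.
if git rev-parse -q --verify refs/heads/release-1.3 >/dev/null; then
  target=release-1.3
else
  target=develop
fi
git checkout -q "$target"
git merge -q --no-ff -m "Merge hotfix-1.2.1 into $target" hotfix-1.2.1
git branch -d hotfix-1.2.1 >/dev/null
```

The fix reaches develop later, when release-1.3 is finished and merged back.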

Finally, remove the temporary branch:

$ git branch -d hotfix-1.2.1
Deleted branch hotfix-1.2.1 (was abbe5d6).

Summary

While there is nothing really shocking new to this branching model, the “big picture” figure that this post began with has turned out to be tremendously useful in our projects. It forms an elegant mental model that is easy to comprehend and allows team members to develop a shared understanding of the branching and releasing processes.

A high-quality PDF version of the figure is provided here. Go ahead and hang it on the wall for quick reference at any time.

Update: And for anyone who requested it: here’s the gitflow-model.src.key of the main diagram image (Apple Keynote).


An article that explains distributed version control systems well.
I should translate it later.
Source: http://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/

In addition, a reference on git, a representative DVCS, is below.
http://namhyung.springnote.com/pages/3132772

Intro to Distributed Version Control (Illustrated)
 

Traditional version control helps you backup, track and synchronize files. Distributed version control makes it easy to share changes. Done right, you can get the best of both worlds: simple merging and centralized releases.

Distributed? What’s wrong with regular version control?

Nothing; read a visual guide to version control if you want a quick refresher. Sure, some people will deride you for using an “ancient” system. But you’re still OK in my book: using any VCS is a positive step forward for a project.

Centralized VCS emerged from the 1970s, when programmers had thin clients and admired “big iron” mainframes (how can you not like a machine with a then-gluttonous 8 bits to a byte?).

Centralized is simple, and what you’d first invent: a single place everyone can check in and check out. It’s like a library where you get to scribble in the books.

This model works for backup, undo and synchronization but isn’t great for merging and branching changes people make. As projects grow, you want to split features into chunks, developing and testing in isolation and slowly merging changes into the main line. In reality, branching is cumbersome, so new features may come as a giant checkin, making changes difficult to manage and untangle if they go awry.

Sure, merging is always “possible” in a centralized system, but it’s not easy: you often need to track the merge yourself to avoid making the same change twice. Distributed systems make branching and merging painless because they rely on it.

A Few Diagrams, Please

Other tutorials have plenty of nitty-gritty text commands; here’s a visual look. To refresh, developers use a central repo in a typical VCS:

Everyone syncs and checks into the main trunk: Sue adds soup, Joe adds juice, and Eve adds eggs.

Sue’s change must go into main before it can be seen by others. Yes, theoretically Sue could make a new branch for other people to try out her changes, but this is a pain in a regular VCS.

Distributed Version Control Systems (DVCS)

In a distributed model, every developer has their own repo. Sue’s changes live in her local repo, which she can share with Joe or Eve:

But will it be a circus with no ringleader? Nope. If desired, everyone can push changes into a common repo, suspiciously like the centralized model above. This franken-repo contains the changes of Sue, Joe and Eve.

I wish distributed version control had a different name, such as “independent”, “federated” or “peer-to-peer.” The term “distributed” evokes thoughts of distributed computing, where work is split among a grid of machines (like searching for signals with SETI@home or doing protein folding).

DVCS is not like SETI@home: each node is completely independent and sharing is optional (in SETI you must phone back your results).

Key Concepts In 5 Minutes

Here’s the basics; there’s a book on patch theory if you’re interested.

Core Concepts

  • Centralized version control focuses on synchronizing, tracking, and backing up files.
  • Distributed version control focuses on sharing changes; every change has a guid or unique id.
  • Recording/Downloading and applying a change are separate steps (in a centralized system, they happen together).
  • Distributed systems have no forced structure. You can create “centrally administered” locations or keep everyone as peers.

New Terminology

  • push: send a change to another repository (may require permission)
  • pull: grab a change from a repository

Key Advantages

  • Everyone has a local sandbox. You can make changes and roll back, all on your local machine. No more giant checkins; your incremental history is in your repo.
  • It works offline. You only need to be online to share changes. Otherwise, you can happily stay on your local machine, checking in and undoing, no matter if the “server” is down or you’re on an airplane.
  • It’s fast. Diffs, commits and reverts are all done locally. There’s no shaky network or server to ask for old revisions from a year ago.
  • It handles changes well. Distributed version control systems were built around sharing changes. Every change has a guid which makes it easy to track.
  • Branching and merging is easy. Because every developer “has their own branch”, every shared change is like reverse integration. But the guids make it easy to automatically combine changes and avoid duplicates.
  • Less management. Distributed VCSes are easy to get running; there’s no “always-running” server software to install. Also, DVCSes may not require you to “add” new users; you just pick what URLs to pull from. This can avoid political headaches in large projects.

Key Disadvantages

  • You still need a backup. Some claim your “backup” is the other machines that have your changes. I don’t buy it — what if they didn’t accept them all? What if they’re offline and you have new changes? With a DVCS, you still want a machine to push changes to “just in case”. (In Subversion, you usually dedicate a machine to store the main repo; do the same for a DVCS).
  • There’s not really a “latest version”. If there’s no central location, you don’t immediately know whether to see Sue, Joe or Eve for the latest version. Again, a central location helps clarify what the latest “stable” release is.
  • There aren’t really revision numbers. Every repo has its own revision numbers depending on the changes. Instead, people refer to change numbers: Pardon me, do you have change fa33e7b? (Remember, the id is an ugly guid). Thankfully, you can tag releases with meaningful names.

Mercurial Quickstart

Mercurial is a fast, simple DVCS. The nickname is hg, like the element Mercury.


cd project
hg init                                (create repo here)
hg add list.txt                        (start tracking file)
hg commit -m "Added file"              (check file into local repo)
hg log                                 (see history; notice guid)

changeset:   0:55bbcb7a4c24
user:        Kalid@kazad-laptop
date:        Sun Oct 14 21:36:18 2007 -0400
summary:     Added file

[edit file]
hg revert list.txt                 (revert to previous version)

hg tag v1.0                        (tag this version)
[edit file]
hg update -C v1.0                  ("update" to the older tagged version; -C forces overwrite of local copy)

Once Mercurial has initialized a directory, it looks like this:

You have:

  • A working copy. The files you are currently editing.
  • A repository. A directory (.hg in Mercurial) containing all patches and metadata (comments, guids, dates, etc.). There’s no central server so the data stays with you.

In our distributed example, Sue, Joe and Eve have their own repos, with independent revision histories.

Understanding Updates and Merging

There’s a few items that confused me when learning about DVCS. First, updates happen in several steps:

  • Getting the change into a repo (pushing or pulling)
  • Applying the change to the files (update or merge)
  • Saving the new version (commit)

Second, depending on the change, you can update or merge:

  • Updates happen when there’s no ambiguity. For example, I pull changes to a file that only you’ve been editing. The file just jumps to the latest revision, since there’s no overlapping changes.
  • Merges are needed when we have conflicting changes. If we both edit a file, we end up with two “branches” (i.e. alternate universes). One world has my changes, the other world has yours. In this case we (probably) want to merge the changes together into a single universe.

I’m still wrapping my head around how easily branches spring up and collapse in a DVCS:

In this case, a merge is needed because (+Soup) and (+Juice) are changes to a common parent: the list with just “Milk”. After Joe merges the files, Sue can do a regular “pull and update” to get the combined file from Joe. She doesn’t have to merge again on her own.

In Mercurial you can run:


hg incoming ../another-dir  (see pending changes)
hg pull ../another-dir      (download changes)

hg update                   (actually apply changes...)
hg merge                    (... or merge if needed)

hg commit                   (check in merged file; unite branches)

Yep, the “pull-merge-commit” cycle is long. Luckily, Mercurial has shortcuts to combine commands into a single one. Though it seems complex, it’s much easier than handling merges manually in Subversion.

Most merges are automatic. When conflicts come up, they are typically resolved quickly. Mercurial keeps track of the parent/child relationship for every change (our merged list has two parents), as well as the “heads” or latest changes in each branch. Before the merge we have two heads; afterwards, one.

Organizing a Distributed Project

Here’s one way to organize a distributed project:

Sue, Joe and Eve check changes into a common branch. They can trade patches with each other to do simple “buddy builds”: Hey buddy, can you try out these patches? I need to see if it works before I push to the experimental branch.

Later, a maintainer can review and pull changes from the experimental branch into a stable branch, which has the latest release. A distributed VCS helps isolate changes but still provide the “single source” of a centralized system. There are many models of development, from “pull only” (where maintainers decide what to take, and is used when developing Linux) to “shared push” (which acts like a centralized system). A distributed VCS gives you flexibility in how a project is maintained.

Practice And Scathing Ridicule Makes Perfect

I’m a DVCS newbie, but am happy with what I’ve learned so far. I enjoy SVN, but it’s “fun” seeing how easy a merge can be. My suggestion is to start with Subversion, get a grasp for team collaboration, then experiment with a distributed model. With the proper layout a DVCS can do anything a centralized system can, with the added benefit of easy merging.

Online Resources

Notable Quotes:

  • “How many have done a branch and merged it? How many of you enjoyed it?”
  • “When you do a merge, you plan ahead for a week, then set aside a day to do it.”
  • “Some people have 5, 10, 15 branches”. One branch is experimental. One branch is maintenance, etc.
  • “CVS: you don’t commit. You make changes without committing. You never commit until it passes a giant test suite. People make 1-liner changes, knowing it can’t possibly break.”

So good luck, and watch out for the holy wars. Feel free to share any tips or suggestions below.

130 THOUGHTS ON “INTRO TO DISTRIBUTED VERSION CONTROL (ILLUSTRATED)”

  1. Your git link goes to darcs.net instead of git.or.cz

    In any case, that was a good article. I’ll give it to my friend who I have thus far been unable to articulate the benefits of dvcs and git to.

  2. Thanks for the guide. I’ve been putting off developing something very similar to this for a couple of months and now I don’t need to.

    What did you use for your diagrams? They look cool :)

  3. But isnt this in clearcase ucm / multisite for years? each developer has his own developer streams on which he can work independantly and there is the integration stream. each project has an integration stream and all streams can deliver or mergecopy to each other. on each stream there can be multiple views.

  4. Hi Cogmios, thanks for the info. I’ve never used Clearcase, but from Wikipedia it appears as both client/server and distributed, which is pretty interesting (along with its view model). Nope, these ideas are not brand new — just wanted to explain them:)

  5. ClearCase UCM is not a distributed version control system, and neither is ClearCase multi-site.

    ClearCase UCM is hand-holding for branch creation and merging of changes between streams, ticket tracking integration, etc. It’s the Higher-Order Perl of ClearCase. Underneath it’s just a centralised VCS with branches.

    ClearCase multi-site is a form of VCS replication. Multiple sites can be defined, along with a replication schedule. Each branch is mastered at one site, i.e. writable in one place, and a read-only copy is replicated to the other sites. Mastering can be changed. And one can merge from a read-only (remote) branch to a local branch with no problems. The merge tracking is good, if slow, but does have some bugs that can surface if you have a slow network.

    So ClearCase multi-site is kind of like a distributed VCS; however, only the centralised ‘sites’ are the nodes of the web, unlike the contemporary distributed VCSs in which every machine or even working copy is a node in the web.
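
    The mastering rule described above can be sketched as a toy Python model. This is only an illustration under assumptions, not ClearCase’s actual API: the `Branch` class, its methods, and the site names are all made up.

```python
class Branch:
    """Toy model: a branch that is writable at exactly one site (its master)."""

    def __init__(self, name: str, master_site: str) -> None:
        self.name = name
        self.master_site = master_site
        self.versions: list[str] = []

    def commit(self, site: str, change: str) -> None:
        # Only the mastering site may write; other sites hold read-only replicas.
        if site != self.master_site:
            raise PermissionError(f"{self.name} is read-only at {site}")
        self.versions.append(change)

    def transfer_mastership(self, new_site: str) -> None:
        # "Mastering can be changed": hand write access to another site.
        self.master_site = new_site


main = Branch("main", master_site="london")  # hypothetical site names
main.commit("london", "fix build")

try:
    main.commit("tokyo", "add feature")  # tokyo only has a read-only replica
    rejected = False
except PermissionError:
    rejected = True

main.transfer_mastership("tokyo")
main.commit("tokyo", "add feature")  # allowed once tokyo masters the branch
```

    In a DVCS, by contrast, every clone is writable, so there is no mastership to transfer.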

  6. Hi David, thanks for the details! From your description, it seems like ClearCase is somewhere between a regular VCS and a DVCS, since it allows merging from multiple sites (even if not every working copy is independent). Appreciate the info.

  7. Awesome. Thanks. You could get a job as a professional communicator.

    Explaining the concepts with those nice visual graphics is great. Pretty hard to explain the details of configuring Eclipse that way, though.

  8. Thanks for the comment, maybe I’ll try getting into that eventually :) Yep, I agree you’ll always need the manual for certain things (but nearly anything can be made clearer with a simple diagram!).

  9. Hi Kalid. Thank you for this wonderful write-up. You recommended starting with Subversion; is that because it’s simpler? What do you think of tools like Trac? I just need a good tool for my website projects. Thank you again!

  10. Hi Ildar, thanks for the comment. Yes, I’ve found regular VCS a good way to start. Subversion is pretty popular and works with many tools, like Trac. You can get subversion working first, then use Trac if you need project management. Personally I don’t use Trac for my projects.

    Here’s an article on getting Subversion up and running with TortoiseSVN, a nice GUI client: http://lifehacker.com/software/subversion/hack-attack-using-subversion-with-tortoisesvn-192367.php

    It’s nice to look at a directory and see what’s up to date and what isn’t.

  11. Thanks, Kalid! TortoiseSVN looks very interesting, especially for people who don’t want to use the command line.

  12. Hi Kalid,

    Very well written article. Cleared out the pros and cons about DVCS for me!

    Maybe you can help me with this one. I’m currently looking for a version control tool for our division. We support Oracle software, mostly custom code.

    Sometimes code needs to be changed to fix errors or for further development, so it needs version control. We are used to having no more than one person work on the same object at the same time, because merging afterwards is very complex.

    So we developed a tool where we can claim objects. Once an object is claimed, another person isn’t allowed to develop that specific object.

    Now we need a more sophisticated tool, like Subversion or Mercurial. But what would be your advice? In a centralized system we are still able to claim a source (by locking), but a distributed one can’t do that, can it?

    My question is, which one should I pick? I really like DVCS (particularly HG with Tortoise), but sources can’t be claimed. I ask myself, how bad is that actually?

    Hope you can give some advice on this. I don’t expect a plug-and-play answer, of course!

  13. Hi Johan, glad you enjoyed the article! Great question — it’s hard to say, but if just starting out I’d begin with Subversion.

    Centralized systems are easier to understand and get going with, and the locking feature is pretty nice.

    I believe there are ways to export Subversion to a distributed system later if you find you need to do that (article here: http://ww2.samhart.com/node/49).

    Hope this helps!

  14. easy to read and easy to be read
    haha…..great article to teach many ppl out there
    >.^

  15. Nice visual representation and I learned a lot. But I’m still not seeing how merges are easier than in Subversion. Can you compare/contrast a little more, maybe with an example of how SVN is different (for the merge process, not the tracking)?

  16. Hi taleswapper, great question. The primary difference is that DVCS keeps track of each change individually, so it can tell what changes need to be applied and what have already been applied.

    With SVN, you need to manually track whether a change has been applied. So when merging branches, you aren’t quite sure what revisions to start and end from, and whether you’ve already done it.

    With a DVCS, you can just “pull” from a repository and be confident you are getting just the changes you need.

    The actual merge command is pretty straightforward in SVN (svn merge -r5:6 http://path/to/branch), but again, you need to know what revisions to start and end from, and whether it’s been applied.
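
    To make the “tracks each change individually” point concrete, here is a minimal Python sketch. It is a toy model, not how Git or Mercurial actually store history: a repository is assumed to be a plain dictionary from a unique change ID to a description, and the `pull` function and the IDs are invented for illustration.

```python
def pull(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Copy into `local` only the changes it does not already have.

    Returns the IDs that were actually transferred, in sorted order.
    """
    # Because every change has its own ID, "what is missing" is a set difference.
    missing = sorted(set(remote) - set(local))
    for change_id in missing:
        local[change_id] = remote[change_id]
    return missing


# Hypothetical repositories that share one change ("a1") and each have one of their own.
alice = {"a1": "add parser", "a2": "fix typo"}
bob = {"a1": "add parser", "b1": "add tests"}

first = pull(bob, alice)   # transfers only the change bob lacks
second = pull(bob, alice)  # pulling again transfers nothing
print(first, second)
```

    The second pull is a no-op, which is the property the manual revision ranges in `svn merge` do not give you for free: you cannot accidentally apply the same change twice.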

  17. A very good read… really gives a _visual_ of the DVCS. You also cleared up a few doubts that I had regarding DVCS implementation in an Enterprise Dev Arena. I have a few queries:

    * How do you make available the latest stable code?

    * Doesn’t having a *private* dev copy with each developer lead to a scenario wherein huge merges may be required due to the divergence?

    * Again making private changes, I think, is to some extent, depriving a community from the changes available.

    Thanks again for the post.

  18. Hi Kishore, those are great questions. I don’t have any experience managing DVCS projects, so I’m making a guess based on what I’ve read.

    1) Latest code. One drawback with a distributed model is that there’s no central location. One alternative is to have a maintainer who pulls code from various contributors on some schedule.

    2) Giant merges. Although giant merges may happen, they are easier than merging in centralized systems because changes are tracked better. But you’re right, unless there is some synchronization the code could diverge.

    3) Yes, it’s hard for the community to see the changes since everyone has their own copy. Having a maintainer put together a “central” release may help with this too (everything else being considered experimental).

    Again, these are just guesses based on the descriptions I’ve seen. Some projects use a hybrid centralized/distributed model.

  19. Hi Kishore,
    Though I do not manage any large dVCS repository, I am quite actively following related topics on the Emacs development lists.
    Attempting to answer your questions:

    1> There is ONE official repository from which the releases are made. It is up to the individual developers to convince the owner/maintainer to pull in their changes. There are sites that list developer repositories with a brief description. Someone wanting to try code that is not yet in the official repository can pull changes from the developer repositories if they are published (Mercurial has stable and crew repositories).

    2> Giant merges do not often happen (well it did happen twice recently on the Emacs branch – multi-tty and unicode). But, it would still suffer from the same fate as in a centralized VCS. If some developer had merged some of the changes, those merges are remembered and you do not have to solve them again in dVCS.

    3> Each developer can publish his/her official latest repository in the published list. If they are not disciplined in keeping it up to date, the same can happen in centralized VCS if they just edit and never commit.

    Overall, I would say you can do anything in dVCS that you can do in centralized VCS. dVCS is a superset.

  20. Hmmm… I wonder if a polling system could work instead of a “maintainer” to make decisions about what gets placed in the main trunk? Say, have a control panel that lets you set up each user with a weighted vote (maybe you don’t want the new untested guy to have the same weight as the rest of the team who have been battle tested), and that sets the trigger for acceptance of new code into the main. Hmmmm…
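
    One way such a weighted poll could look, as a rough Python sketch. No DVCS ships a feature like this as far as this thread goes; the function name, the developer names, the weights, and the 50% trigger are all assumptions chosen for the example.

```python
def is_accepted(votes: dict[str, bool], weights: dict[str, float], threshold: float) -> bool:
    """Return True when the weighted share of approving votes reaches `threshold`.

    Developers who have not voted count as not approving.
    """
    total = sum(weights.values())
    if total <= 0:
        return False
    approving = sum(w for dev, w in weights.items() if votes.get(dev, False))
    return approving / total >= threshold


# Hypothetical team: two battle-tested developers and one newcomer.
weights = {"veteran_a": 3.0, "veteran_b": 3.0, "newcomer": 1.0}

with_veteran = is_accepted({"veteran_a": True, "newcomer": True}, weights, 0.5)
newcomer_only = is_accepted({"newcomer": True}, weights, 0.5)
print(with_veteran, newcomer_only)
```

    With these weights, a veteran plus the newcomer carry 4 of 7 points and pass the 50% trigger, while the newcomer alone (1 of 7) does not.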

  21. Hi Scott, that’s an interesting idea. You’re right, currently DVCS require a maintainer to pull, but there may be cool ways to automate the process.

  22. Just delete this bit, it’s wrong.

    “Centralized VCS emerged from the 1970s, when programmers had thin clients and admired “big iron” mainframes (how can you not like a machine with a then-gluttonous 8 bits to a byte?).”

  23. I don’t see how this can possibly work. Yes, I know it works for the Linux kernel and some other projects — I just do not see how. With a central repository, all of the developers are basing their work on the same code, and everyone’s changes are incorporated, so that everyone has everyone else’s code. You keep things separate by branching and merging. It’s all straightforward. I don’t know why Torvalds has such hate for branching and merging — it’s really no big deal, in a central VCS.

    But in this “distributed” system… if there are ten developers, every developer needs to individually merge the changes from the other ten? So instead of 10 merges, there are 100 merges, all being merged into different branches which may or may not be based on the same trunk code? To me, this looks like a nightmare.

    Again: yes, I know, it works for some projects. I just don’t see how.

  24. … unless, of course, you have one person who acts as the “maintainer”, whose repository becomes the de facto central repository. In which case, what is the point of all this extra added complexity?

    I guess I just don’t get it.

  25. Hi Scartis, great question. Regarding organization, you get a few benefits:

    * Choice of decentralized (default) or ‘centralized’ system (with a maintainer who pulls changes)
    * Very simple change management. Although branching/merging is “possible” in subversion, it’s difficult in practice because the system doesn’t keep track of the changes. So you don’t know if you’re applying the same patch twice.
    * Speed. Because it’s decentralized, you can make checkins/checkouts locally, which is very fast (no need to go over the network). In fact, you only need the network to sync so it works great in disconnected scenarios.

    Personally, I’m happy with subversion for my projects (any VCS is better than nothing) but DVCS is a step above.

  26. Hi Kalid,

    Thank you for such an impressive and informative article.
    I have gone through your other article as well; that is also equally impressive.

    Your articles have helped me understand the basics of SCM as well as DSCM.

    Thanks once again.

  27. If anyone is interested in a DVCS system check out Plastic SCM. We use a replication system that uses branches to synchronize data between individual servers which means users can always check in to a local server (individual or shared at an office) and sync up with other servers and even work concurrently in the same branches.

    Want to know how we handle conflicts and other scenarios? Check us out here:

    http://www.codicesoftware.com/xpproducts/xpcore/xpdistributed.aspx

    We can even share changes through email. Imagine two users working in total isolation from one another with the exception of emailing changes back and forth. We have a blog post here that shows off the email approach.

    http://codicesoftware.blogspot.com/2008/09/check-in-changes-using-email.html

  28. Dear Kalid,

    Thank you for your agreement on translation.
    I have translated this article into Chinese and posted it with a link to your original article URL. The translated article is posted on http://scmteamwork.blogspot.com/
    Since this article is an introduction to version control and is very helpful for understanding what version control is, we translated it as well. Of course, we will also post the link to your original article with the translated one.
    Thank you for your good illustration of the VCS concept; we believe more people will enjoy it.

    Best regards,
    Yu-Hsiu Lin
    ESAST CO LTD
    http://www.esast.com

  29. Hi Kalid,

    Don’t forget that you can use a version control system for non-source-code files and personal files too.

    Obviously, any developer will use one for source code control, but time after time my fellow developers and co-workers are astounded that I use a personal repo for my other personal files too. Why keep 17 different backups of old versions of my resume/CV when I can put it into my local repo on my hard drive or my NAS at home? It makes life so much easier, and isn’t that what high tech is all about?

    I finally convinced my boss to keep all our design docs etc. under source control too. Finally, no more meetings where someone doesn’t have the latest copy of the design spec for review, etc.

    Love the site, keep up the good work
    Mike

  30. @Mike: Great point! Yes, source control systems are very nice for keeping any type of file in sync. Unfortunately binary formats don’t get the benefit of change tracking, but for keeping files up-to-date I’m sure it can be a lifesaver! Thanks for the comment.

  31. I tend to work on two or three person microprojects, and I tend to be the maintainer. I definitely find hg to be easier for small fast projects. I don’t know why, precisely, but it just seems simpler and more straightforward, everyone runs their own branch, they commit, combine, and push changes whenever they want, I get all their changes and branches whenever I pull, I weave the branches together into the current stable version. You can do the same things in svn, but it just feels more natural to me in hg.

  32. @J Paul: Thanks for the note — I don’t have experience as a maintainer, but I can see how the distributed model makes sense for multi-person projects. Everyone has their own copy which is combined / weaved together as necessary. Appreciate the insight!

  33. Hi Kalid,

    I found this post very useful.

    Your diagrams are also very good. Did you use Photoshop, or any other tool to make them?

  34. This article is a clear and useful introduction. Thank you.

    As a self-taught web programmer I do not have a background in CS. Version control for the web sites I have been involved with consisted of someone telling me this is my page to update :) Today I am learning the Drupal framework application and it’s more complicated. I have to learn some system.

    The issue of locking was mentioned in comment #21 above. I imagine locking to be important for inter-meshed projects with a high degree of file dependencies (I’m thinking OO classes and objects). From what I know of OO, it could be possible to over-work your area and cause overlap. In the case of Drupal, the framework is divided at the meta-logic level of its design into modules, themes, etc. Do the conceptual divisions keep programmers on track? And is there a similar conceptual division of labor, built in from the get-go by open source contribution, that makes git usable for Linux development?

    Then I would ask the programmers who do not see dVCS’s advantages, are you involved in conceptually complicated projects? Is it more difficult to have a sense of ownership where a centralized locking system or contract enforcement is necessary?

  35. @Xtian: Great question. In general, version control systems can only protect you from so many conflicts. Git is really good at managing changes, but ideally you isolate components into different modules / files that can be worked on separately. If two people are working on different components, a DVCS like Git is definitely the easiest way to manage these different branches. Rather than locking, you have a set master branch that everyone can pull from (assumed to be the authoritative version), and the maintainer can examine the individual branches and merge them into master as needed. This way the master is always stable.

    Try taking a look at the branch diagram on http://betterexplained.com/articles/aha-moments-when-learning-git/ for more ideas on branch strategies. Hope this helps!

Posted by 라판