
An introduction to git-svn for Subversion/SVK users and deser...


Bookmark History

Saved by 15 people (4 private), first by anonymous user on 2007-08-19


Public Sticky notes

This article is aimed at people who want to contribute to projects which are using Subversion as their code-wiki

Highlighted by danieljomphe

Subversion users can skip SVK and move straight onto git-svn with this tutorial.

Highlighted by danieljomphe

People who are responsible for Subversion servers and are converting them to git in order to lay them down to die are advised to consider the one-off git-svnimport, which is useful for bespoke conversions where you don't necessarily want to leave SVN/CVS/etc breadcrumbs behind. I'll mention bespoke conversions at the end of the tutorial, and the sort of thing that you end up doing with them.

Highlighted by danieljomphe

A lot of this tutorial is dedicated to advocacy, sadly necessary. Those who would rather just cut to the chase will probably want to skip straight to

Highlighted by danieljomphe

Another way of looking at it is to say that it's really a content-addressable filesystem, used to track directory trees.
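As a rough illustration of what "content-addressable" means here, the sketch below stores a blob in a scratch repository and reads it back by the name git derived from its content. The temporary directory and the sample text are placeholders chosen for this example, not part of the original article.

```shell
# Scratch repository, created only for this demonstration.
repo=$(mktemp -d)
git init -q "$repo"

# Ask git what name the content would get, without storing anything yet.
id1=$(printf 'hello\n' | git -C "$repo" hash-object --stdin)

# Now store the same content; -w writes the object into the repository.
stored=$(printf 'hello\n' | git -C "$repo" hash-object -w --stdin)

# The name depends only on the content, so it can be used to read it back.
content=$(git -C "$repo" cat-file -p "$stored")
echo "$stored -> $content"

rm -rf "$repo"
```

Identical content always gets the same name, which is why git only ever needs to store it once.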

Highlighted by danieljomphe

we've got a simple and efficient filesystem which competes with RevML but is XML free

Highlighted by danieljomphe

Subversion added nothing to CVS' development model.

Highlighted by danieljomphe

Yes, it's a bunch of small programs that do one thing and do it well, get over it, they're being unified

Highlighted by danieljomphe

There's also a pure Java

Highlighted by danieljomphe

I used to push strongly for SVK, but got brow-beaten by people who were getting far more out of their version control system than I knew possible until I saw what they were talking about.

Highlighted by danieljomphe

SVK could easily use git as a backing filesystem and drop the dependency on Subversion altogether. So could bzr or hg.

Highlighted by danieljomphe

The repository model (see right) is also simple enough that there are complete git re-implementations you can draw upon, in a variety of languages.

Highlighted by danieljomphe

git is first and foremost a toolkit for writing VCS systems

Highlighted by danieljomphe

Writing a tool to do something that you want is often quite a simple matter of plugging together a few core commands.

It's simple enough that once a few basic concepts are there, you begin to feel comfortable knowing that the repository just can't wedge, changes can be discarded yet not lost unless you request them to be cleaned up, etc.

Highlighted by danieljomphe

I really haven't seen a nicer tool than gitk for browsing a repository.

Highlighted by danieljomphe

gitk does some really cool things but is most useful when looking at projects that have cottoned onto feature branches (see feature branches, below). If you're looking at a project where everyone commits largely unrelated changes to one branch it just ends up a straight line, and not very interesting.

Highlighted by danieljomphe

You can easily publish your changes for others who are switched on to git to pull. At a stretch, you can just throw the .git directory on an HTTP server somewhere and publish the path.

Highlighted by danieljomphe

There's the git-daemon for more efficient serving of repositories (at least, in terms of network use), and gitweb.cgi to provide a visualisation of a git repository.

Highlighted by danieljomphe

With Subversion, everyone has to commit their changes back to the central wiki, I mean repository, to share them.

Highlighted by danieljomphe

With Git (actually this is completely true for other distributed systems), it's trivial to push and pull changes between each other. If what you're pulling has common history then git will just pull the differences.

Highlighted by danieljomphe

If the person publishes their repository as described above, using the git-daemon(1), http or anything else that you can get your kernel to map to its VFS, then you can set it up as a "remote" and pull from it
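A minimal sketch of that workflow, using two local directories in place of a published repository. The remote name "peer", the directory names, and the placeholder identity are all assumptions made for this example; a real remote would more likely be a git://, http or ssh URL.

```shell
# Placeholder identity so the example commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

work=$(mktemp -d)

# "theirs" stands in for a repository somebody else has published.
git init -q "$work/theirs"
echo one > "$work/theirs/notes.txt"
git -C "$work/theirs" add notes.txt
git -C "$work/theirs" commit -q -m "first change"
branch=$(git -C "$work/theirs" symbolic-ref --short HEAD)

# "mine" starts out sharing their history.
git clone -q "$work/theirs" "$work/mine"

# They publish another change...
echo two >> "$work/theirs/notes.txt"
git -C "$work/theirs" commit -q -a -m "second change"

# ...so I register their repository as a remote and pull from it.
git -C "$work/mine" remote add peer "$work/theirs"
git -C "$work/mine" pull -q --ff-only peer "$branch"

mine_head=$(git -C "$work/mine" rev-parse HEAD)
their_head=$(git -C "$work/theirs" rev-parse HEAD)
rm -rf "$work"
```

Because the two repositories share history, only the second commit has to be transferred by the pull.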

Highlighted by danieljomphe

Most people say "but I don't want branches". But users of darcs report that they never knew how much they really did want branches until darcs made it so easy. In essence every change can behave as a branch, and this isn't painful.

Highlighted by danieljomphe

Because you can easily separate your repositories into stable branches, temporary branches, etc, then you can easily set up programs that only let commits through if they meet criteria of your choosing.

Highlighted by danieljomphe

Because you can readily work on branches without affecting the stable branch, it is perfectly acceptable for a stable branch to be updated by a single maintainer only

Highlighted by danieljomphe

Some repositories, for instance the Linux kernel, run a policy that no commit may break the build. What this means is that if you have a problem, you can use bisection to work out which patch introduced the bug.

Highlighted by danieljomphe

You might use a continuous integration server that is responsible for promoting branches to trunk should they pass the strictures that you set.

Highlighted by danieljomphe

There is an awful lot less to keep in your head, and you don't have to do things like plan branching in advance.

Highlighted by danieljomphe

Good feature branches mean you end up prototyping well-developed changes; the emphasis shifts away from making atomic commits. If you forgot to add a file, or made some other little mistake, it's easy to go back and change it. If you haven't even pushed your changes anywhere, that's not only fine, but appreciated by everyone involved. Review and revise before you push is the counter-balance to frequent commits.

Highlighted by danieljomphe

Not only is the implementation fast locally, it's very network efficient, and the protocol for exchanging revisions is also very good at figuring out what needs to be transferred quickly. This is a huge difference - one repository hosted on Debian's Alioth SVN server took 2 days to synchronise because the protocol is so chatty. Now it fits in 3 megs and would not take that long to synchronise over a 150 baud modem.

Highlighted by danieljomphe

Disk might be cheap, but my /home is always full - git has a separate step for compacting repositories, which means that delta compression can be far more effective. If you're a compression buff, think of it as having an arbitrarily sized window, because when delta compressing, git is able to match strings anywhere else in the repository - not just in the file which is the notional ancestor of the new revision.

This space efficiency affects everything - the virtual memory footprint in your buffercache while mining information from the repository, how much data needs to be transferred during "push" and "pull" operations, and so on. Compare that to Subversion, which even when merging between branches is incapable of using the same space for the changes hitting the target branch. The results speak for themselves - I have observed an average of 10 to 1 space savings going from Subversion FSFS to git.

Highlighted by danieljomphe

Perhaps somebody has already made a conversion of the project and put it somewhere

Highlighted by danieljomphe

git-svn fetch

Highlighted by danieljomphe

But people who use git are used to treating their repositories as a revision data warehouse which they use to mine useful information when they are trying to understand a codebase.

Highlighted by danieljomphe

importing the whole repository from Subversion

Highlighted by danieljomphe

git svn init

Highlighted by danieljomphe

git svn fetch

Highlighted by danieljomphe

If you like, you can skip early revisions using the -r option to git svn fetch.
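The import steps highlighted above might be put together roughly as follows. The URL is a made-up placeholder, and the -s flag assumes the project uses the standard trunk/branches/tags layout; neither comes from the original article. The commands are wrapped in functions so that nothing contacts a server until you call one yourself.

```shell
# Hypothetical URL: substitute the root of your project's Subversion repository.
SVN_URL="https://svn.example.invalid/project"

# Import the complete history.
# $1: directory in which to create the git repository
import_from_svn() {
  git svn init -s "$SVN_URL" "$1" &&
  ( cd "$1" && git svn fetch )
}

# Import only from a given Subversion revision onwards, skipping early history.
# $1: directory in which to create the git repository
# $2: first Subversion revision to fetch
import_recent_from_svn() {
  git svn init -s "$SVN_URL" "$1" &&
  ( cd "$1" && git svn fetch -r "$2:HEAD" )
}
```

For example, `import_recent_from_svn project 1000` would fetch revision 1000 and everything after it. A full fetch of a large repository can take a long time, since each revision is retrieved in turn.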

Highlighted by danieljomphe

make a local branch for development

Highlighted by danieljomphe

The name "foo" is completely private; it's just a local name you're assigning to the piece of work you're doing. Eventually you will learn to group related commits onto branches, called "topic branches", as described in the introduction.

Highlighted by danieljomphe

Say you want to take a project and work on it somewhere else, in a different direction: you can just make a copy using cp or your favourite file manager. Contrast this with Subversion, where you have to fiddle around with branches/ paths, svn cp, svn switch, etc.

Highlighted by danieljomphe

Each of those copies is fully independent, and can diverge freely. You can easily push and pull changes between them without tearing your hair out.

Highlighted by danieljomphe

Each time you have a new idea, make a new branch and work in that.

Highlighted by danieljomphe

But anyway, that copying was too slow and heavy. We don't want to copy 70MB each time we want to work on a new idea. We want to create new branches at the drop of a hat. Maybe you don't want to copy the actual repository, just make another checkout. We can use git-clone again

Highlighted by danieljomphe

The -l option to git-clone told git to hardlink the objects together, so the two checkouts share the same object storage yet can still be moved around independently. Cool. I now have two checkouts I can work with, build software in, etc.
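A small sketch of such a local clone, using scratch directories. The directory names and the placeholder identity are assumptions for this example.

```shell
# Placeholder identity so the example commit works on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

work=$(mktemp -d)
git init -q "$work/project"
echo hello > "$work/project/README"
git -C "$work/project" add README
git -C "$work/project" commit -q -m "initial import"

# -l asks for a local clone whose object files are hardlinked where possible,
# so the second checkout costs very little extra disk space.
git clone -q -l "$work/project" "$work/project-idea"

src_head=$(git -C "$work/project" rev-parse HEAD)
copy_head=$(git -C "$work/project-idea" rev-parse HEAD)
rm -rf "$work"
```

Hardlinks only work within one filesystem; across filesystems git falls back to copying the objects.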

Highlighted by danieljomphe

But all that's a lot of work and most of the time I don't care to create lots of different directories for all my branches. I can just make a new branch and switch to it immediately with git-checkout:
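For instance, creating a branch called "foo" and switching to it in one step could look like the sketch below. The scratch repository and placeholder identity exist only for this example.

```shell
# Placeholder identity so the example commit works on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
echo hello > "$repo/README"
git -C "$repo" add README
git -C "$repo" commit -q -m "initial import"

# -b creates the branch at the current commit and switches to it immediately.
git -C "$repo" checkout -q -b foo

current=$(git -C "$repo" symbolic-ref --short HEAD)
echo "now on branch: $current"
rm -rf "$repo"
```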

Highlighted by danieljomphe

Once you have some edits you want to commit, you can use git-commit to commit them. Nothing (not even file changes) gets committed by default; you'll probably find yourself using git-commit -a to get similar semantics to svn commit.
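The difference between a plain commit and commit -a can be seen in a scratch repository like the one sketched here (file names and identity are placeholders for this example).

```shell
# Placeholder identity so the example commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
echo v1 > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m "first version"

# Edit a tracked file without staging the change.
echo v2 > "$repo/file.txt"

# A plain commit refuses: nothing has been added to the index.
if git -C "$repo" commit -q -m "second version" >/dev/null 2>&1; then
  plain=committed
else
  plain=nothing
fi

# With -a, changes to tracked files are staged automatically, as with svn commit.
git -C "$repo" commit -q -a -m "second version"

count=$(git -C "$repo" rev-list --count HEAD)
rm -rf "$repo"
```

Note that -a only picks up files git already tracks; brand-new files still need an explicit git add.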

Highlighted by danieljomphe

There is also a GUI for preparing commits in early (but entirely functional) stages of development.

Highlighted by danieljomphe

People used to darcs or SVK's interactive commit will like to try git add -i

Highlighted by danieljomphe

correcting changes in your local branch

Highlighted by danieljomphe

If it's the top commit, you can just add --amend to your regular git-commit command to, well, amend the last commit. If you explored the git-gui interface, you might have noticed the "Amend Last Commit" switch as well.
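A sketch of the "forgot to add a file" case, with invented file names and a placeholder identity:

```shell
# Placeholder identity so the example commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
echo main > "$repo/main.c"
echo helper > "$repo/helper.c"

# Oops: the commit only includes main.c.
git -C "$repo" add main.c
git -C "$repo" commit -q -m "add feature"

# Stage the forgotten file and fold it into the top commit.
# --no-edit keeps the existing commit message.
git -C "$repo" add helper.c
git -C "$repo" commit -q --amend --no-edit

count=$(git -C "$repo" rev-list --count HEAD)
has_helper=$(git -C "$repo" ls-tree --name-only HEAD helper.c)
rm -rf "$repo"
```

There is still only one commit afterwards; the amended commit replaces the original rather than adding to it.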

Highlighted by danieljomphe

You can also uncommit. The command for this is git-reset

Highlighted by danieljomphe

HEAD~1 is a special syntax that means "one commit before the reference called HEAD". HEAD^ is a slightly shorter shorthand for the same thing. I could have also put a complete revision number, a partial (non-ambiguous) revision number, or something like remotes/trunk. See git-rev-parse(1) for the full list of ways in which you can specify revisions.
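The following sketch checks that HEAD~1 and HEAD^ name the same commit, then uncommits the top commit with git reset. The repository contents are made up for this example.

```shell
# Placeholder identity so the example commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
echo one > "$repo/a.txt"
git -C "$repo" add a.txt
git -C "$repo" commit -q -m "first"
echo two > "$repo/a.txt"
git -C "$repo" commit -q -a -m "second"

# Two spellings of "the commit before HEAD".
tilde=$(git -C "$repo" rev-parse HEAD~1)
caret=$(git -C "$repo" rev-parse HEAD^)

# Uncommit: move the branch back one commit. Without --hard, the edits
# from the undone commit stay in the working tree.
git -C "$repo" reset -q HEAD~1

after=$(git -C "$repo" rev-parse HEAD)
pending=$(git -C "$repo" status --porcelain)
rm -rf "$repo"
```

After the reset, the change from the second commit shows up as an uncommitted modification, ready to be revised and committed again.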

Highlighted by danieljomphe

I sometimes write commands like `gitk --all $(git fsck | awk '/dangling commit/ {print $3}')` to see all the commits in the repository, not just the ones with "post-it notes" (aka references) stuck to them.

Highlighted by danieljomphe

"Another" way to revise commits is to make a branch from the point a few commits ago, then make a new series of commits that is revised in the way that you want. This is the same scenario as before.

Highlighted by danieljomphe

I've introduced a new command there - git-cherry-pick. This takes a commit and tries to copy its changes to the branch you've currently got checked out. This technique is called rebasing commits. There is also a git-rebase command which probably would have been fewer commands than the above. But that's my way.
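The original command sequence is not reproduced in these highlights, so the following is only a sketch of the general idea: branch from an earlier commit, then copy across the one commit worth keeping. Branch and file names are invented for this example.

```shell
# Placeholder identity so the example commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"

# History: base, then a mistake, then a good change in a different file.
echo base > "$repo/base.txt"
git -C "$repo" add base.txt
git -C "$repo" commit -q -m "base"
echo oops > "$repo/mistake.txt"
git -C "$repo" add mistake.txt
git -C "$repo" commit -q -m "mistake"
echo good > "$repo/good.txt"
git -C "$repo" add good.txt
git -C "$repo" commit -q -m "good change"

base=$(git -C "$repo" rev-parse HEAD~2)
keep=$(git -C "$repo" rev-parse HEAD)

# Start a revised branch before the mistake and copy only the good commit onto it.
git -C "$repo" checkout -q -b revised "$base"
git -C "$repo" cherry-pick "$keep" >/dev/null

count=$(git -C "$repo" rev-list --count HEAD)
kept=$(git -C "$repo" ls-tree --name-only HEAD good.txt)
dropped=$(git -C "$repo" ls-tree --name-only HEAD mistake.txt)
rm -rf "$repo"
```

The revised branch ends up with two commits, the base and a copy of the good change, while the original branch is left untouched.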

Highlighted by danieljomphe

Using Git opens the door to a bazaar of VCS tools rather than sacrificing your projects at the altar of one.

Highlighted by danieljomphe

keep your local branch up to date with Subversion

Highlighted by danieljomphe

The recommended way to do this for people familiar with Subversion is to use git-svn rebase.

Highlighted by danieljomphe

Note: before you do this, you should have a "clean" working tree - no local uncommitted changes. You can use git-stash (git 1.5.3+) to hide away local uncommitted changes for later.
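A sketch of the stash step on its own, in a scratch repository with no Subversion remote. The comment marks where git svn rebase would run in a real git-svn checkout; the file names and identity are placeholders.

```shell
# Placeholder identity so the example works on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
echo v1 > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m "first version"

# An uncommitted local edit that would get in the way of a rebase.
echo work-in-progress > "$repo/file.txt"

# Hide the edit; the working tree is now clean.
git -C "$repo" stash -q
clean=$(git -C "$repo" status --porcelain)

# In a git-svn checkout, this is where `git svn rebase` would run.

# Bring the edit back afterwards.
git -C "$repo" stash pop -q
dirty=$(git -C "$repo" status --porcelain)
rm -rf "$repo"
```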

Highlighted by danieljomphe

This command is doing something similar to the above commands that used git-cherry-pick; it's copying the changes from one point on the revision tree to another

Highlighted by danieljomphe

Better still is to bunch up your in-progress working copy changes into a set of unfinished commits, using git add -i (or git-gui / git-citool). Then try the rebase. You'll end up this time with more commits on top of the SVN tree than just one, so using Stacked Git you can "stg uncommit -n 4" (if you broke your changes into 4 commits), then use "stg pop" / "stg push" to wind around the stack (as well as "stg refresh" when finished making changes) to finish them - see

Highlighted by danieljomphe

Once you grok that, you'll only need to use stg and git-svn fetch.

Highlighted by danieljomphe

in my experience stg is the best tool for rebasing

Highlighted by danieljomphe

Ok, so you've already gone and made the commits locally that you wanted to publish back to the Subversion server. Perhaps you've even made a collection of changes, revising each change to be clearly understandable, making a single small change well such that the entire series of changes can be easily reviewed by your fellow project contributors. It is now time to publish your changes back to Subversion.

The command to use is git svn dcommit. The d stands for delta

Highlighted by danieljomphe

git-svn won't let the server merge revisions on the fly; if there were updates since you fetched / rebased, you'll have to do that again.
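One way to combine the rebase and publish steps is sketched below. The function name is invented for this example, and it is only defined, not called, because it needs a git-svn checkout and a reachable Subversion server to do anything.

```shell
# Hypothetical helper: run it from inside a git-svn checkout with a clean working tree.
publish_to_svn() {
  # Pick up revisions committed to Subversion since the last fetch,
  # replaying the local commits on top of them.
  git svn rebase &&
  # Show what would be sent, without changing anything on the server.
  git svn dcommit --dry-run &&
  # Send each local commit to Subversion as its own revision.
  git svn dcommit
}
```

If the rebase stops on a conflict, the chain ends there and nothing is sent; resolve the conflict, finish the rebase, and run the function again.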

Highlighted by danieljomphe

People are not used to this. They somehow think that if somebody commits something to file A, and then somebody else commits something to file B, the server should make a merged version with both changes, despite neither of the committers actually having a working tree with both changes. This suffers from the same fundamental problem that darcs' patch calculus does: just because patches apply 'cleanly' does not imply that they make sense. Such a decision can only be made automatically with a dedicated continuous integration (smoke) server.

Highlighted by danieljomphe

This is normally what I use in preference to rebase.

Highlighted by danieljomphe

This will merge all the commits that aren't in your ancestry, but are in the ancestry of the branch trunk (try setting rightmost drop-down in gitk to 'ancestor' and clicking around to get a feel for what this means), and make a new commit which has two parents - your old HEAD, and whatever commit trunk is up to.
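The two-parent result can be seen in a scratch repository. Here a local branch named "trunk" stands in for the Subversion trunk that git-svn would track; that substitution, along with the file names and identity, is an assumption made for this example.

```shell
# Placeholder identity so the example commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.invalid"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.invalid"

repo=$(mktemp -d)
git init -q "$repo"
echo base > "$repo/base.txt"
git -C "$repo" add base.txt
git -C "$repo" commit -q -m "base"

# "trunk" and "foo" both start at the base commit, then diverge.
git -C "$repo" branch trunk
git -C "$repo" checkout -q -b foo
echo local > "$repo/local.txt"
git -C "$repo" add local.txt
git -C "$repo" commit -q -m "local work"
old_head=$(git -C "$repo" rev-parse HEAD)

git -C "$repo" checkout -q trunk
echo upstream > "$repo/upstream.txt"
git -C "$repo" add upstream.txt
git -C "$repo" commit -q -m "upstream work"
trunk_head=$(git -C "$repo" rev-parse HEAD)

# Merge trunk into foo; --no-edit accepts the default merge message.
git -C "$repo" checkout -q foo
git -C "$repo" merge -q --no-edit trunk

# rev-list --parents prints the commit followed by each of its parents.
set -- $(git -C "$repo" rev-list --parents -n 1 HEAD)
parent_count=$(($# - 1))
first_parent=$2
second_parent=$3
rm -rf "$repo"
```

The first parent is the old HEAD of foo and the second is whatever commit trunk was at, matching the description above.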

Highlighted by danieljomphe

there are many shortcomings in git.

Highlighted by danieljomphe

Sadly, this model is in use by virtually every Subversion hosted project out there. And that is going to be hard to undo.

Highlighted by danieljomphe

It is possible to use git in this way (see the figure to the right) - but it's not trivial, and not default. In fact git itself is developed in this way, using feature branches, aka topic branches.

Highlighted by danieljomphe

Left: what darcs thinks when you start committing without marking tag points.

Highlighted by danieljomphe

Right: Subversion has a somewhat smaller brain...

Highlighted by danieljomphe

bzr comes with some great utilities like the Patch Queue Manager which helps show you your feature branches. With PQM, you just create a branch with a description of what you're trying to do, make it work against the version that you branched off, and then you're done. The branch can be updated to reflect changes in trunk, and eventually merged and closed.

Highlighted by danieljomphe

Windows support is good. Consistent implementation. Experience with the distributed development model. Friendly and approachable author and core team.

Highlighted by danieljomphe

Actually the models of git and bzr are similar enough that bzr could be fitted atop the git repository model

Highlighted by danieljomphe

Mercurial is missing the lightweight branches that make git so powerful, and there is no content hashing, so it doesn't really do the whole "revision protocol" thing like git.

Highlighted by danieljomphe

If you're on Windows it's probably a lot easier to get going.

Highlighted by danieljomphe