Trey Smith's blog

Posted Sun 2013-03-03 21:39

Migrating a repository from CVS and Subversion to Git with history

Today I migrated my old ZMDP planning software from cvs and svn repositories to git. I got pretty deep into some undocumented stuff, so here are my notes.

My basic plan was to migrate from cvs to svn using cvs2svn, then migrate from svn to git using git-svn. But there were some other problems I also needed to fix.

First install the tools. I'm using MacPorts [1].

$ sudo port -vn install cvs2svn
# git-core does not include git-svn by default
$ sudo port -vn install git-core +svn

Then migrate the cvs repo to svn.

$ mkdir -p ~/sandbox/cvs2svn && cd ~/sandbox/cvs2svn
$ cp -a ~/projects/zmdp/repository cvsrepo
$ cvs2svn -s svnrepo cvsrepo

Notice I made a backup of the cvs repository first in case cvs2svn did something bad. But no problem, that worked great and preserved the cvs history.

Here's where my first problem comes in. Several years ago I needed to migrate this codebase from cvs to svn in a hurry, so I did it the quick and dirty way: I created a fresh svn repository from a cvs checkout, which lost the cvs history. Since then I've made about ten commits in the old svn repo, and now I want to tack those commits onto the end of the cvs history that I migrated into the new svn repo.

Long story short, svn makes it easy to merge the content of two branches but hard to merge their commit history. A straightforward svn merge squashes the entire history of whatever you merge in into a single commit.

Time for a work-around. Apply each revision of the old svn repo individually to the new repo, commit it with the correct comment, then set the svn:date property of the revision to match the original commit time.

But first I had to configure the new svn repository to allow editing revision properties. Apparently svn considers that kind of thing suspicious by default. To allow it, the repository's pre-revprop-change hook has to be a valid executable that exits with a successful status of 0. The web is short on examples of how to do this.

$ cd ~/sandbox/cvs2svn/svnrepo/hooks
$ cat <<"EOF" > pre-revprop-change
#!/bin/bash
exit 0
EOF
$ chmod +x pre-revprop-change
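
The hook above approves every revision property change. If that feels too permissive, a stricter hook is possible. Subversion calls the hook with five arguments (REPOS REV USER PROPNAME ACTION), so it can inspect what is being changed. Here is a minimal sketch, assuming you only ever need to modify svn:date. The function name check_revprop_change is my own, not something svn provides.

```shell
#!/bin/sh
# Sketch of a stricter hooks/pre-revprop-change.
# Subversion invokes the hook as:
#   pre-revprop-change REPOS REV USER PROPNAME ACTION
# where ACTION is A (added), M (modified), or D (deleted).

# check_revprop_change is a hypothetical helper name.
# It returns 0 only when an existing svn:date is being modified.
check_revprop_change() {
    propname=$4
    action=$5
    if [ "$action" = "M" ] && [ "$propname" = "svn:date" ]; then
        return 0
    fi
    # Anything written to stderr is reported back to the svn client.
    echo "only modifications of svn:date are allowed" >&2
    return 1
}

# Subversion always passes five arguments; the guard only keeps the
# check from firing when the file is sourced with none.
if [ "$#" -eq 5 ]; then
    check_revprop_change "$@"
    exit $?
fi
```

Once the dates are fixed you could remove the hook again, which puts the repository back to refusing revision property edits.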

Now apply, commit, and edit the old commits one at a time. If there were more than ten I'd have to script it, but this was just barely ok to do manually.

$ cd ~/projects/zmdp/svnRepository/src
$ svn log
------------------------------------------------------------------------
r10 | mfsmith3 | 2010-08-16 14:48:04 -0700 (Mon, 16 Aug 2010) | 1 line

updated requirements section of README, new tested compilers
...
# /tmp/newsvncheckout is a working copy checked out from the new svnrepo
$ cd /tmp/newsvncheckout
$ svn merge -r 9:10 file:///Users/mfsmith3/projects/zmdp/svnRepository/src
$ svn commit -m 'updated requirements section of README, new tested compilers'
$ svn propset svn:date '2010-08-16T21:48:04.0Z' --revprop -r HEAD \
    file:///Users/mfsmith3/sandbox/cvs2svn/svnrepo
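
With more than a handful of revisions, the same steps could be scripted. The following is only a sketch that I have not run against a real repository. The function name replay_revisions and its argument order are made up for illustration. It reads each log message and timestamp from the old repository with svn propget --revprop, which returns the date already in the UTC form that svn:date expects, so there is no time zone arithmetic to do by hand.

```shell
#!/bin/sh
# Hypothetical helper: replay revisions FIRST..LAST of the old repository
# onto a working copy of the new one, preserving log messages and dates.
#   $1 = URL in the old repository to merge from
#   $2 = URL of the new repository (for setting revision properties)
#   $3 = path to a working copy of the new repository
#   $4 = first old revision to replay, $5 = last old revision to replay
replay_revisions() {
    old_url=$1
    new_repo_url=$2
    wc=$3
    rev=$4
    last=$5
    while [ "$rev" -le "$last" ]; do
        prev=$((rev - 1))
        # Read the original log message and commit time from the old repo.
        msg=$(svn propget svn:log --revprop -r "$rev" "$old_url") || return 1
        when=$(svn propget svn:date --revprop -r "$rev" "$old_url") || return 1
        (
            cd "$wc" &&
            # Newer svn clients may refuse to merge into a working copy
            # that is at mixed revisions, so update first.
            svn update -q &&
            svn merge -r "$prev:$rev" "$old_url" &&
            svn commit -m "$msg"
        ) || return 1
        # Needs the pre-revprop-change hook to allow the change.
        svn propset svn:date "$when" --revprop -r HEAD "$new_repo_url" || return 1
        rev=$((rev + 1))
    done
}

# Example call, using the paths from this post:
#   replay_revisions \
#       file:///Users/mfsmith3/projects/zmdp/svnRepository/src \
#       file:///Users/mfsmith3/sandbox/cvs2svn/svnrepo \
#       /tmp/newsvncheckout 1 10
```

The loop stops at the first failing step, so a merge conflict leaves the working copy in place for you to inspect before continuing by hand.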

Now transition everything to git.

$ git svn clone file:///Users/mfsmith3/sandbox/cvs2svn/svnrepo/trunk/src
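
A possible alternative to the authorship cleanup described next: git svn clone accepts an --authors-file option that maps svn usernames to full names and email addresses at clone time. Each line of that file has the form "login = Full Name <email>". Here is a sketch for generating a starting template from svn log -q output. The function name authors_template and the example.com addresses are placeholders of mine, and the generated file needs editing by hand before use.

```shell
#!/bin/sh
# Read `svn log -q` output on stdin and print one authors-file line per
# distinct username, in the "login = Full Name <email>" form that
# git svn's --authors-file option expects.
# The name and address on the right-hand side are placeholders.
authors_template() {
    # Header lines look like: r10 | mfsmith3 | 2010-08-16 14:48:04 -0700 (...)
    awk -F' [|] ' '/^r[0-9]+ [|] / { print $2 }' | sort -u |
    while read -r login; do
        printf '%s = %s <%s@example.com>\n' "$login" "$login" "$login"
    done
}

# Example use, with the paths from this post:
#   svn log -q file:///Users/mfsmith3/sandbox/cvs2svn/svnrepo \
#       | authors_template > authors.txt
#   (edit authors.txt to fill in real names and addresses)
#   git svn clone --authors-file=authors.txt \
#       file:///Users/mfsmith3/sandbox/cvs2svn/svnrepo/trunk/src
```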

Things were looking good, but I made a last pass through the git commit log and noticed that the authorship was messed up--I prefer a clean name and email address in the git log but these commits just had an old UNIX username. Luckily, git has much better tools for editing the commit history than svn does.

$ cat > filter.sh <<"EOF"
#!/bin/sh
# Rewrite the author of every commit made under the old UNIX username.
git filter-branch --commit-filter '
      if [ "$GIT_AUTHOR_NAME" = "trey" ];
      then
              GIT_AUTHOR_NAME="Trey Smith";
              GIT_AUTHOR_EMAIL="trey.smith@gmail.com";
              git commit-tree "$@";
      else
              git commit-tree "$@";
      fi' HEAD
EOF
$ chmod +x filter.sh
$ ./filter.sh
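
The script above only touches the author fields. If the committer fields carry the same old username, an --env-filter can rewrite both in one pass. Here is a minimal sketch of that variant. The helper name fix_identity and its three arguments are hypothetical, and it is an alternative to the commit filter above rather than what I actually ran.

```shell
#!/bin/sh
# Hypothetical helper: rewrite author and committer identity for every
# commit on the current branch whose name matches the old username.
#   $1 = old username, $2 = new full name, $3 = new email address
fix_identity() {
    OLD_NAME=$1
    NEW_NAME=$2
    NEW_EMAIL=$3
    # Export so the filter, which git evaluates for each commit, sees them.
    export OLD_NAME NEW_NAME NEW_EMAIL
    git filter-branch --env-filter '
        if [ "$GIT_AUTHOR_NAME" = "$OLD_NAME" ]; then
            GIT_AUTHOR_NAME=$NEW_NAME
            GIT_AUTHOR_EMAIL=$NEW_EMAIL
            export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
        fi
        if [ "$GIT_COMMITTER_NAME" = "$OLD_NAME" ]; then
            GIT_COMMITTER_NAME=$NEW_NAME
            GIT_COMMITTER_EMAIL=$NEW_EMAIL
            export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
        fi
    ' HEAD
}

# Example call, run inside the cloned repository:
#   fix_identity trey "Trey Smith" trey.smith@gmail.com
```

Either way, git log --format='%an <%ae>' | sort -u is a quick check that only the identities you expect remain. Note that git filter-branch refuses to run a second time while the backup refs from an earlier run still exist, unless you pass -f.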

All done...

[1] The -n option to port is one I almost always use--it keeps port from aggressively trying to upgrade everything the package you're installing depends on. There's nothing like installing a minor package through port and having it download and recompile a dozen other packages, including new Python and Perl interpreters, because somebody bumped the patchlevel. If you're not careful, those upgrades can also orphan and break other packages. If you want to stay up to date, you're probably better off running port upgrade outdated on a weekly basis.
Category: tech
Tags: git