git svn --preserve-empty-dirs - the performance-killer switch

I have a task to migrate an SVN repository into a new GIT repository.

The source SVN repository is a very big one - it has more than 120000 revisions (and growing…) with many branches and tags. It is basically a Java project (Maven build) that contains (too) many empty folders which - for build reasons - must also remain after the migration to GIT.

As I usually do for such migrations, I’m using the git svn command for the migration.

But a test run of the migration for the current task gave an estimate of about 2 weeks of run time(!!!), which I can’t afford.

After discussing this with the owners of the repository, we settled on requirements that consist of keeping the commit history and preserving at least the main development branch. With that, I managed to reduce the migration time to 18 hours on a very powerful server.

Still, this was not acceptable for us, and then I discovered that the cause of the long migration is the usage of the --preserve-empty-dirs flag. I noticed that running the same migration (only one branch with its full history) - without that flag - took only 80 minutes on the same powerful machine!!!
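The two runs being compared can be sketched roughly as follows. This is only an illustration: the repository URL, the target folder names and the trunk layout (--trunk=trunk) are assumptions, not the real project values, and the GIT variable is a hypothetical hook that lets you do a dry run by setting GIT=echo.

```shell
#!/bin/sh
# Sketch: migrate only the main branch with its full history.
# Usage: migrate_trunk <svn-url> <target-dir> [extra git-svn options]
# GIT defaults to the real git binary; GIT=echo only prints the command.
migrate_trunk() {
    url=$1
    dest=$2
    shift 2
    "${GIT:-git}" svn clone --trunk=trunk "$@" "$url" "$dest"
}

# Slow variant (placeholder URL and folder):
# migrate_trunk http://svn.example.invalid/repo with-empty-dirs --preserve-empty-dirs
# Fast variant, the same run without the flag:
# migrate_trunk http://svn.example.invalid/repo without-empty-dirs
```

Prefixing each call with time is an easy way to compare the two on your own repository.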

Still, even though that was a big discovery (for me at least), it was not good enough for the project, which needs to preserve those empty folders.

To solve this, I realized that while the empty folders are needed for the build process at the HEAD revision, they are not needed for any of the historical revisions.

Therefore I’ve created the following migration process:

1. migrate the SVN repository without the --preserve-empty-dirs flag

2. SVN checkout the HEAD revision of the branch into a separate folder

3. put an empty .gitignore file (or whatever dummy file you want) into each of the empty folders

 (the following command will do the job: find . -type d -empty -exec touch {}/.gitignore \;)

4. copy the above folder over the migrated folder, overwriting it (leave out the .svn metadata folder that the checkout creates)

5. commit the changes as a new GIT commit
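Steps 2 to 5 can be sketched as a small shell script. Treat it as a rough outline under assumptions: the function names, SVN_URL, the svn-head and git-repo folders and the commit message are all made up for the example, and copying through tar is just one way to overwrite the migrated folder while skipping .svn.

```shell
#!/bin/sh
# Step 3: put an empty .gitignore into every empty directory of a checkout,
# skipping Subversion's own .svn metadata directory.
add_placeholders() {
    find "$1" -name .svn -prune -o -type d -empty \
        -exec sh -c 'touch "$1/.gitignore"' _ {} \;
}

# Step 4: copy the checkout over the migrated working tree, leaving .svn behind.
overlay_tree() {
    (cd "$1" && tar cf - --exclude=.svn .) | (cd "$2" && tar xf -)
}

# Example run (hypothetical names, adjust before use):
# svn checkout "$SVN_URL" svn-head        # step 2
# add_placeholders svn-head               # step 3
# overlay_tree svn-head git-repo          # step 4
# git -C git-repo add -A                  # step 5
# git -C git-repo commit -m "Add placeholders for empty folders needed by the build"
```

Using svn export instead of svn checkout in step 2 would give a tree without .svn at all, in which case the prune and exclude parts are simply no-ops.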

 

This results in a GIT repository whose last commit holds a version of the code that passes the build, and that still has the full history of the code on the main development branch.

 

Enjoy,

 

Yoram Michaeli

yorammi@tikalk.com #yorammi
