6

I'm working with a large svn repository (30,000+ revisions). I am using git-svn with limited success.

My major problem is that the svn repository contains frequent updates to large binary files (~30MB). I do not care about the history of these files but I do care about the current versions of these files.

git svn rebase runs very slowly if there have been multiple updates to the large binary files since my last svn rebase (which is common). My git database also grows very quickly. I'm looking to resolve these two key issues.

Ideally, what I would like to do is completely ignore these large files from svn and then run a script that fetches only the latest version which I would then block with my .gitignore. I'm very open to other options though.

2
  • Git has no special handling for versioning binary files; it stores each version of them, so your repository keeps growing. Make sure you use git-svn on a Unix platform; it is known to be awfully slow on Windows. I would suggest you ignore the path for those files and add them by amending the last commit. Commented Apr 9, 2013 at 21:24
  • You say that you would suggest I ignore the path for those files. I'd love to, how can I do that? Commented Apr 9, 2013 at 21:32

3 Answers

10

You can ignore some files using the --ignore-paths option of git svn:

 --ignore-paths=<regex>
               This allows one to specify a Perl regular expression that will
               cause skipping of all matching paths from checkout from SVN.
               The --ignore-paths option should match for every fetch
               (including automatic fetches due to clone, dcommit, rebase,
               etc) on a given repository.

                   config key: svn-remote.<name>.ignore-paths

               If the ignore-paths config key is set and the command line
               option is also given, both regular expressions will be used.

               Examples:

               Skip "doc*" directory for every fetch

                       --ignore-paths="^doc"

               Skip "branches" and "tags" of first level directories

                       --ignore-paths="^[^/]+/(?:branches|tags)"
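
For the situation in the question (an existing clone, large binaries whose history is not needed), the option can be combined with a one-off export of the current files. The following is only a rough sketch: the URL and the trunk/assets/bin path are made-up placeholders, and it assumes the default git-svn remote name "svn". The pattern is matched against paths as git svn sees them, so the prefix may need adjusting for your layout. Revisions that were already fetched are not rewritten by this.

```shell
# Placeholder values: replace with your repository URL and binary directory.
SVN_URL="https://svn.example.com/repo"
BIN_PATH="trunk/assets/bin"

# Store the pattern in the config key named in the man page, so that every
# later fetch (including the one done by rebase) skips the same paths.
git config svn-remote.svn.ignore-paths "^${BIN_PATH}/"

# Subsequent rebases should no longer download the ignored files.
git svn rebase

# Fetch only the latest version of the binaries, outside of git history.
svn export --force "${SVN_URL}/${BIN_PATH}" assets/bin

# Keep git from tracking the exported copies.
echo '/assets/bin/' >> .gitignore
```

Keeping the pattern in the config key rather than passing it on each command line avoids the case where one fetch runs without it and pulls the binaries in after all.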

1 Comment

Looks like that works perfectly. Thanks for pointing me in the right direction.
2

The git svn option --ignore-paths=<regex> is useful for excluding unwanted binary files when converting an SVN repository to git.

If you know the filename extensions of the binary files, you can write a regular expression to exclude them. For example, to skip .jar files:

    --ignore-paths="\.jar$"
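
One thing to watch with extension patterns is that an unescaped "." matches any character, so it can be worth trying a pattern against a few sample paths before starting a long fetch. Below is a small sketch that uses grep -E as a stand-in for the Perl regex engine git svn uses (for simple patterns like these the two should behave the same). The sample paths are invented for illustration.

```shell
# Hypothetical paths, as they might appear in an SVN repository.
paths='trunk/lib/foo.jar
trunk/src/Main.java
trunk/docs/notajar
trunk/lib/bar.jar.txt'

# Unescaped dot: any character may precede "jar" at the end of the path.
loose_matches=$(printf '%s\n' "$paths" | grep -E '.*.jar$')

# Escaped dot: only paths that really end in ".jar" match.
strict_matches=$(printf '%s\n' "$paths" | grep -E '\.jar$')

echo "loose pattern would ignore:"
echo "$loose_matches"
echo "strict pattern would ignore:"
echo "$strict_matches"
```

The loose pattern also catches a file named notajar, which the escaped version leaves alone.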

0

From what I know, I don't think it's possible to do this using git-svn, since git-svn fetches the whole SVN history and converts it to git. Check your pack file and you will see that it is huge.

What I would do is add the huge files or folders to .gitignore so that git does not process them at all.

A nice solution is published here: http://alblue.bandlem.com/2011/11/git-tip-of-week-git-bigjobbies.html

1 Comment

Adding the files/folder to the .gitignore does not prevent 'git svn rebase' from pulling the files in.
