I’ve needed to move files or directories (along with their histories) from one Git repository into a new repository often enough now that I’m annoyed with myself each time I can’t remember how to do it. Hence, here are my notes on how to accomplish this.
I don’t take any credit for the actual commands mentioned here; everything has been gleaned from the amazing knowledge resource that is Stack Overflow. In particular, these answers were used when working out the solution presented below:
- How to split a git repository while preserving subdirectories?
- Splitting a set of files within a git repo into their own repository, preserving relevant history
- How to move a file from one git repository to another while preserving history
When one project is really two
Imagine you have a repository which has been growing and growing, and at some point you realise that a part of the repository is really a project on its own. How do you take this part (be it a file, a set of files, or an entire subdirectory) and create a new repository containing only these files and their respective histories (and no others)?
The trick is to think of the new repository as being the old repository, but with the files (and their histories) that you don’t want to keep removed from it.
This is the process to use:
- clone the original repository locally
- enter the clone and remove all files from git that aren’t wanted
Moving files and directories
First, clone the original repo.
Now remove the `origin` remote reference (we want to detach the new repository from the history of the original one):
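A minimal sketch of these two steps, run against a throwaway repository so that it is safe to try out. The names `original-repo` and `subproject-clone` are placeholders for this example only:

```shell
# Throwaway identity and sandbox so the sketch can be run anywhere.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)" && cd "$work"

# Stand-in for the original repository (placeholder name).
git init -q original-repo
( cd original-repo && echo hello > README && git add README \
    && git commit -q -m "Initial commit" )

# Clone it locally, then detach the clone from where it came from.
git clone -q original-repo subproject-clone
cd subproject-clone
git remote rm origin
git remote -v    # prints nothing: no remotes are left
```

With a real project you would of course clone from its actual URL or path instead of creating a stand-in first.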
Then it’s a simple matter of uttering the following incantation:
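A sketch of such an incantation, again on a throwaway repository. It assumes that the files to keep all live under a placeholder directory called `keep/`, that GNU `xargs` is available (for `-r`), and that no path contains whitespace; adjust the `grep` pattern to match whatever you want to keep:

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer Git versions print a warning and pause before filtering; this skips it.
export FILTER_BRANCH_SQUELCH_WARNING=1
work="$(mktemp -d)" && cd "$work"

# Stand-in repository: two commits touch keep/, one touches other/.
git init -q original-repo
(
    cd original-repo
    mkdir keep other
    echo one > keep/a.txt    && git add . && git commit -q -m "Add keep/a.txt"
    echo two > other/b.txt   && git add . && git commit -q -m "Add other/b.txt"
    echo three >> keep/a.txt && git add . && git commit -q -m "Update keep/a.txt"
)
git clone -q original-repo subproject-clone
cd subproject-clone
git remote rm origin

# For every commit: list the indexed files, drop the ones to keep from that
# list (grep -v), and remove everything else from the index. --prune-empty
# discards commits that end up changing nothing.
git filter-branch --prune-empty --index-filter '
    git ls-files | grep -v "^keep/" | xargs -r git rm -q --cached --ignore-unmatch
' -- --all

git log --oneline    # only the two commits that touched keep/ remain
```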
This goes through all commits in the clone of the original repository, looking for files which don’t match the files you want to keep, and removes their entries from the index. Afterwards you’re left with only the commits that touch the files you’re interested in.
To make sure that everything is cleaned up, you can also run Git’s garbage collector explicitly so that everything that isn’t required really has been purged:
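One way to do this is sketched below. The demo first orphans a commit in a throwaway repository (mimicking what a history rewrite leaves behind) and then purges it. The last three commands are the actual clean-up; the `-r` flag again assumes GNU `xargs`:

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)"
git init -q "$work/repo" && cd "$work/repo"
echo one > a.txt && git add a.txt && git commit -q -m "First"

# Create a commit on a scratch branch, then delete the branch to orphan it.
git checkout -q -b scratch
echo two > b.txt && git add b.txt && git commit -q -m "Second"
dropped="$(git rev-parse HEAD)"
git checkout -q -
git branch -q -D scratch

# filter-branch keeps backups of the old refs under refs/original/;
# delete them if there are any.
git for-each-ref --format='%(refname)' refs/original/ | xargs -r -n 1 git update-ref -d
# Expire all reflog entries right away, then prune everything unreachable.
git reflog expire --expire=now --all
git gc -q --prune=now --aggressive

git cat-file -e "$dropped" 2>/dev/null || echo "orphaned commit has been purged"
```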
Now rename the directory to something more appropriate for the subproject that has been created and reassign the `origin` remote pointer (assuming, of course, that the remote bare repository has already been created):
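As a sketch, with a local bare repository standing in for the real remote (`new-project` and `new-project.git` are placeholder names; in practice the URL would point at your Git server):

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)" && cd "$work"

# Stand-ins: the filtered clone and the already-created bare repository.
git init -q subproject-clone
( cd subproject-clone && echo hello > README && git add README \
    && git commit -q -m "Initial commit" )
git init -q --bare new-project.git

# Rename the directory and point origin at the new home.
mv subproject-clone new-project
cd new-project
git remote add origin "$work/new-project.git"
git push -q -u origin HEAD    # push the current branch and track it
```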
Of course, the moved files need to be removed from the original repository, and a commit message indicating where they ended up would be very helpful for possible repository archaeology in the future.
git filter-repo over git filter-branch
A reader made me aware of `filter-repo`, which is a more powerful tool for history rewriting than `git filter-branch`. In fact, the `filter-branch` documentation explicitly warns against its use and recommends that users prefer `git filter-repo` instead.
The reason for the warning is that `git filter-branch` has several safety and performance pitfalls which make using it potentially dangerous for the casual user. Some people won’t be able to use `git filter-repo` yet because it requires `git >= 2.22.0` in order to work. If you’re in that situation, you’ll have to fall back to the `git filter-branch` solution.
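If you’re not sure which side of that version boundary you’re on, a quick check along these lines might help. It is only a sketch: `version_ge` is a helper name made up for this example, and it assumes a `sort` that understands `-V` (GNU coreutils does):

```shell
# Succeed if version $1 is at least version $2 (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# `git --version` prints something like "git version 2.39.2"; take the third field.
installed="$(git --version 2>/dev/null | awk '{print $3}')"

if [ -n "$installed" ] && version_ge "$installed" 2.22.0; then
    echo "git $installed is new enough for git filter-repo"
else
    echo "git ${installed:-(not found)} is too old; fall back to git filter-branch"
fi
```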
`git filter-repo` isn’t part of the standard suite of Git tools, hence it’s necessary to install it before you can use it. Unfortunately, it hasn’t yet been packaged for Debian (but is packaged by some other Linux distributions), hence it will be necessary for Debian users to install it from the source code. Since this comes as a single Python script, the installation is very simple: just put the file somewhere in your `$PATH`.
To install the script, grab and unpack the latest tarball, then copy the `git-filter-repo` script into a directory in your `$PATH`. The `filter-repo` git subcommand will then be available; in other words, you can run the command as `git filter-repo`.
So how do we use `filter-repo` to filter files as described in the `filter-branch` example above? Again, clone the repo and remove the origin, then tell `git filter-repo` which paths to keep, which is significantly easier than the previous solution. Note that you’ll need to specify full paths to the files you want to keep; the `filter-branch` solution used a `grep`, hence the paths weren’t as relevant in that case.
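A sketch of what this can look like, keeping a placeholder directory `src/lib/` and a placeholder file `README.md`. Two assumptions to flag: the sketch only runs the filter if `git filter-repo` is actually installed, and it leaves `origin` in place, because as far as I know `filter-repo`’s fresh-clone safety check can otherwise insist on `--force` for a real (packed) clone:

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)" && cd "$work"

# Stand-in for the original repository.
git init -q original-repo
(
    cd original-repo
    mkdir -p src/lib docs
    echo a > src/lib/a.txt; echo b > docs/b.txt; echo r > README.md
    git add . && git commit -q -m "Add files"
)

git clone -q original-repo subproject-clone
cd subproject-clone

if git filter-repo --version >/dev/null 2>&1; then
    # Keep only the named paths (full paths from the repository root).
    git filter-repo --path src/lib/ --path README.md
else
    echo "git filter-repo is not installed; skipping the filter step" >&2
fi
```

Each `--path` option names one file or directory to keep; everything else disappears from the history.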
Much more information, including several examples, is available in the `git filter-repo` documentation.
Moving just a directory
The definitive guide to moving a subdirectory is in the answer to this question on Stack Overflow: Detach (move) subdirectory into separate Git repository
To paraphrase that answer, here is how to extract just the given directory, pulling in all branches and tags.
If you don’t want all tags and branches you can just rewrite the current `HEAD` by using this version of the command:
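Both variants can be sketched side by side on a throwaway repository. The directory name `subdir` and the tag `v1.0` are placeholders for this example; the flags are standard `git filter-branch` options, but the commands are my reconstruction rather than a copy of the Stack Overflow answer:

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning and pause in newer Git
work="$(mktemp -d)" && cd "$work"

# Stand-in repository with a tagged commit.
git init -q original-repo
(
    cd original-repo
    mkdir subdir other
    echo a > subdir/a.txt; echo b > other/b.txt
    git add . && git commit -q -m "Add files"
    git tag v1.0
)

# Variant 1: rewrite every branch and tag so that subdir/ becomes the root.
git clone -q original-repo all-refs
( cd all-refs && git remote rm origin \
    && git filter-branch --prune-empty --tag-name-filter cat \
           --subdirectory-filter subdir -- --all )

# Variant 2: rewrite only the branch that HEAD currently points at.
git clone -q original-repo head-only
( cd head-only \
    && git filter-branch --prune-empty --subdirectory-filter subdir HEAD )
```

In the first clone the tag `v1.0` now points at the rewritten history; in the second it still points at the original commit.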
Making the complex simple: git subtree
It turns out that splitting a subdirectory of a project out into a new project is sufficiently common that there is also a Git command especially for it: `git subtree`.
Again, clone the original repo. This is effectively a backup of your repository, which is a good idea before any history surgery, even though `git subtree split` only creates new commits and leaves your existing branches untouched. As I saw on a T-shirt recently: “No backup? No pity!”.
Now we split a subdirectory of the repository (called the “prefix” in `subtree` terminology) into its own “project” and create a new branch with just this subdirectory and its history.
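A sketch of the split, assuming a placeholder subdirectory `subdir` and a placeholder branch name `subdir-only`, and assuming that your Git installation ships the `subtree` command (most distribution packages do):

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)" && cd "$work"

# Stand-in for the original repository.
git init -q original-repo
(
    cd original-repo
    mkdir subdir other
    echo a > subdir/a.txt && git add . && git commit -q -m "Add subdir/a.txt"
    echo b > other/b.txt  && git add . && git commit -q -m "Add other/b.txt"
)

# Work in a clone, which doubles as the backup mentioned above.
git clone -q original-repo split-clone
cd split-clone

# Create a branch containing only subdir/ and its history.
git subtree split --prefix=subdir -b subdir-only

# Have a look at the result.
git checkout -q subdir-only
ls    # only the former contents of subdir/ are here
```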
If you check out the new branch, you’ll find only the files from the subdirectory that you just split from the original project. Assuming that you’ve already made a bare repository for the new project, you can now add the bare repository as an upstream reference and push this branch to the new project’s master branch.
The nice, clean, shiny project repository can now be cloned from upstream:
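The push and the final clone can be sketched together. Here a local bare repository stands in for the real upstream, and `subdir-only`, `new-project.git`, and `new-project` are placeholder names:

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work="$(mktemp -d)" && cd "$work"

# Stand-in repository with the split branch already created.
git init -q original-repo
(
    cd original-repo
    mkdir subdir other
    echo a > subdir/a.txt; echo b > other/b.txt
    git add . && git commit -q -m "Add files"
    git subtree split --prefix=subdir -b subdir-only
)

# Stand-in for the bare repository of the new project. Its HEAD is pointed
# at master explicitly, in case the local default branch name differs.
git init -q --bare new-project.git
git --git-dir=new-project.git symbolic-ref HEAD refs/heads/master

# Add the bare repository as "upstream" and push the split branch to master.
( cd original-repo \
    && git remote add upstream "$work/new-project.git" \
    && git push -q upstream subdir-only:master )

# Clone the shiny new project.
git clone -q new-project.git new-project
ls new-project    # only the former contents of subdir/
```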
And that’s it! I hope that helped someone and that it helps my forgetful future self :-)
Since publishing this post, a reader pointed out a tool he is working on called Git X-Modules to solve the problem outlined above. He describes it like so:
The tool is designed to help with migration to monorepo and as a replacement to Git submodules/subtree. Unlike one-time convertion described in your article, it continuously syncs old and new repository, this makes the migration smooth.
Note that I haven’t used Git X-Modules and hence can’t endorse it; however, it might be of interest to people looking for a higher-level solution to the problem of moving files to a new Git repository.