Merging Git repositories into subdirectories of another

Preparing to write invoices for this month made me realise that I’d split some Git repositories up too far. But how to put them back together again? Here are two ways: one using git merge with a bit of preparation and rework, and one using git subtree.

[Figure: Three Git repository icons on the left, with arrows from each leading to a single Git repository icon on the right. Caption: Merging many repositories into one. Image credits: Jason Long]

It’s that time of the month again: invoice-writing time! Fun![1] To be honest, I’d rather be hacking code and providing value for users of that code.[2] Or riding my bike. One or the other.

Anyway, around the time that one month morphs into another, I need to generate invoices for hours worked within that month. Standard stuff. Now, because I’m a geek, I use LaTeX to make my invoices. Naturally, because LaTeX input is plain text, I keep these files in a Git repository. That’s the way things were until the middle of last year when I realised that the Git repo I was using was waaay too general for the specific task of keeping track of client invoices.

How did I let it get this far? Well, at the time it wasn’t clear to me that I was going to go freelance, and so things developed[3] organically. Consequently, client invoices ended up in sub-folders of a much larger Git repository containing a lot of other stuff.

The situation looked sort of like this:[4]

.
├── business-plan
├── cat-videos
├── clients
│   ├── client-a
│   │   └── invoices
│   │       ├── dapper-invoice.cls
│   │       ├── invoice-client-a-2025-01.tex
│   │       ├── invoice-client-a-2025-02.tex
│   │       ├── invoice-client-a-2025-03.tex
│   │       └── Makefile
│   ├── client-b
│   │   ├── invoices
│   │   │   ├── dapper-invoice.cls
│   │   │   ├── invoice-client-b-2025-01.tex
│   │   │   └── Makefile
│   │   └── quotations
│   │       ├── Makefile
│   │       └── quotation-client-b-2025-01.tex
│   └── client-c
│       └── invoices
│           ├── dapper-invoice.cls
│           ├── invoice-client-c-2025-01.tex
│           ├── invoice-client-c-2025-02.tex
│           └── Makefile
├── finances
├── tax
├── timesheets
└── stuff

Now, I’m one of those people who likes small, focused projects. This way, irrelevant clutter is reduced and it’s possible to concentrate on only that topic within a given project repository. This propensity does tend to cause a proliferation of Git repositories. However, I find this much better than lumping everything together in some mega-repo. Swings and roundabouts, I guess.

So, due to organic development, I’d managed to create sub-projects that needed to detach from the mother ship and go off and have lives of their own. To separate the sub-projects, I used git filter-repo to move their files into their own repositories. Great! Problem solved!

Well, not really. I’ve now realised (several months later) that I went too far and split things at too fine a granularity. In other words, I’ve got Git repos that are too focused and that I should really collect under a single project umbrella, one level of abstraction up. Having the repositories so separate meant that I couldn’t share common files and I ended up repeating myself. To be honest, I don’t know why I split things up quite that drastically. It seemed like a good idea at the time, I guess. Anyway, to make things nice and DRY, I need to merge the repositories into one. That’s the process I’m going to discuss here.

So, what we want is this:[5]

.
├── client-a
│   └── invoices                               .
│       └── *                                  └── clients
                                                   ├── client-a
.                                                  │   └── invoices
├── client-b                 SMOOSH                │       └── *
│   ├── invoices             ----->                ├── client-b
│   │   └── *                                      │   ├── invoices
│   └── quotations                                 │   │   └── *
│       └── *                                      │   └── quotations
                                                   │       └── *
.                                                  └── client-c
└── client-c                                           └── invoices
    └── invoices                                           └── *
        └── *

The solution is to create a new Git repository, add the overly-specific repos as remotes and then merge the remotes into the new repo while allowing for unrelated histories. The inspiration for this solution came from a recent comment in the r/git/ subreddit. Let’s give it a go!

Smooshing multiple Git repositories into subdirectories of one

First, create a directory for the new Git repository and initialise it:

$ mkdir clients
$ cd clients
$ git init
Initialized empty Git repository in <base-dir>/clients/.git/

Now we add the repos we want to merge as remotes of the new repository:

$ git remote add client-a /path/to/client-a/
$ git remote add client-b /path/to/client-b/
$ git remote add client-c /path/to/client-c/

To be able to merge these repositories into our new one, we need to fetch the upstream information from the remotes:

$ git fetch --all
Fetching client-a
remote: Enumerating objects: 16, done.
remote: Counting objects: 100% (16/16), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 16 (delta 3), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (16/16), 1.35 KiB | 345.00 KiB/s, done.
From /path/to/client-a
 * [new branch]      main       -> client-a/main
Fetching client-b
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 13 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (13/13), 1.26 KiB | 429.00 KiB/s, done.
From /path/to/client-b
 * [new branch]      main       -> client-b/main
Fetching client-c
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 12 (delta 2), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (12/12), 1.03 KiB | 528.00 KiB/s, done.
From /path/to/client-c
 * [new branch]      main       -> client-c/main

Anyone reading carefully will realise that the file sizes here are very small. This is because I used empty files for the example. I’m trying to get across the idea of the process, after all, not show details of my clients’ invoices.

To get us prepared for the repos we’re about to merge in the new repository, let’s create subdirectories for the individual projects:

$ mkdir client-a client-b client-c

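Now we give the merge a starting point (arbitrarily chosen to be from the client-a remote repository). Because the new repository has no commits yet, a hard reset simply points its current branch at the tip of client-a’s history, which effectively merges the client-a repository into the new clients repo: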

$ git reset --hard client-a/main
HEAD is now at fa21293 Add 2025-03 invoice for client a

There’s a small problem here though: the client-a history assumes that its files live in the project’s root directory. That is, the .gitignore file and the invoices directory from the client-a project aren’t in the client-a subdirectory of our bright, shiny new clients repository (.gitignore sits at the top level too, but ls hides dotfiles by default):

$ ls
client-a  client-b  client-c  invoices

The solution is simple. Move the files into the subdirectory where we want them to be and commit that change:

$ git mv .gitignore invoices/ client-a/
$ git commit -m "Move client a files into client-a subdir"
[main 0a8b1b2] Move client a files into client-a subdir
 6 files changed, 0 insertions(+), 0 deletions(-)
 rename .gitignore => client-a/.gitignore (100%)
 rename {invoices => client-a/invoices}/Makefile (100%)
 rename {invoices => client-a/invoices}/dapper-invoice.cls (100%)
 rename {invoices => client-a/invoices}/invoice-client-a-2025-01.tex (100%)
 rename {invoices => client-a/invoices}/invoice-client-a-2025-02.tex (100%)
 rename {invoices => client-a/invoices}/invoice-client-a-2025-03.tex (100%)

Although the solution is simple, a word of warning: this doesn’t scale. With only three repositories to merge, as in the current example, it’s manageable. Should you have tens of repositories (or more) to merge,[6] you might need to look for a more elegant solution.

An alternative solution would be to create the desired directory structure in the remote repository before merging it into the common one. This is the approach used in this blog post.
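For more than a handful of repositories, one option is to wrap the per-repository steps (fetch, merge, move, commit) in a shell loop. The following is only a sketch under a few assumptions: it builds three throwaway stand-in repositories with made-up contents in a temporary directory, it starts the combined repository from an empty root commit rather than from git reset --hard so that every repository goes through the same merge-and-move code, and it assumes each source repository has a main branch and no top-level entry named like its target subdirectory.

```shell
#!/bin/sh
# Sketch: merge several repositories into subdirectories of one, in a loop.
set -eu

# Throwaway identity so that commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

base=$(mktemp -d)

# Build three tiny stand-in repositories (hypothetical content).
for name in client-a client-b client-c; do
    git init -q "$base/$name"
    (
        cd "$base/$name"
        git symbolic-ref HEAD refs/heads/main
        mkdir invoices
        touch .gitignore invoices/Makefile "invoices/invoice-$name-2025-01.tex"
        git add .
        git commit -q -m "Add 2025-01 invoice for $name"
    )
done

# Create the combined repository with an empty root commit (an assumption of
# this sketch, so that the first repository needs no special treatment).
git init -q "$base/clients"
cd "$base/clients"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "Initial commit"

for name in client-a client-b client-c; do
    git remote add "$name" "$base/$name"
    git fetch -q "$name"
    git merge -q --allow-unrelated-histories -m "Merge $name" "$name/main"
    mkdir "$name"
    # Move every top-level entry of the merged repository into its subdirectory.
    git ls-tree --name-only "$name/main" | while IFS= read -r entry; do
        git mv -- "$entry" "$name/"
    done
    git commit -q -m "Move $name files into $name subdir"
done

# Show the resulting layout.
git ls-files
```

Passing the merge message with -m keeps Git from opening an editor for each merge commit, which matters once the loop runs unattended.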

We now need to merge the remaining remote repositories into the new, common one, moving the merged files into their new locations after each merge. In our example here, we start with the client-b repository:

$ git merge client-b/main --allow-unrelated-histories
Merge made by the 'recursive' strategy.
 .gitignore                                | 3 +++
 invoices/Makefile                         | 0
 invoices/dapper-invoice.cls               | 0
 invoices/invoice-client-b-2025-01.tex     | 0
 quotations/Makefile                       | 0
 quotations/quotation-client-b-2025-01.tex | 0
 6 files changed, 3 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 invoices/Makefile
 create mode 100644 invoices/dapper-invoice.cls
 create mode 100644 invoices/invoice-client-b-2025-01.tex
 create mode 100644 quotations/Makefile
 create mode 100644 quotations/quotation-client-b-2025-01.tex
$ git mv .gitignore invoices/ quotations/ client-b/
$ git commit -m "Move client b files into client-b subdir"
[main 13921cd] Move client b files into client-b subdir
 6 files changed, 0 insertions(+), 0 deletions(-)
 rename .gitignore => client-b/.gitignore (100%)
 rename {invoices => client-b/invoices}/Makefile (100%)
 rename {invoices => client-b/invoices}/dapper-invoice.cls (100%)
 rename {invoices => client-b/invoices}/invoice-client-b-2025-01.tex (100%)
 rename {quotations => client-b/quotations}/Makefile (100%)
 rename {quotations => client-b/quotations}/quotation-client-b-2025-01.tex (100%)

Note that the git merge step will create a merge commit requiring its own commit message.

We finish off with the client-c repository:

$ git merge client-c/main --allow-unrelated-histories
Merge made by the 'recursive' strategy.
 .gitignore                            | 0
 invoices/Makefile                     | 0
 invoices/dapper-invoice.cls           | 0
 invoices/invoice-client-c-2025-01.tex | 0
 invoices/invoice-client-c-2025-02.tex | 0
 5 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 .gitignore
 create mode 100644 invoices/Makefile
 create mode 100644 invoices/dapper-invoice.cls
 create mode 100644 invoices/invoice-client-c-2025-01.tex
 create mode 100644 invoices/invoice-client-c-2025-02.tex
$ git mv .gitignore invoices/ client-c/
$ git commit -m "Move client c files into client-c subdir"
[main 9a909f6] Move client c files into client-c subdir
 5 files changed, 0 insertions(+), 0 deletions(-)
 rename .gitignore => client-c/.gitignore (100%)
 rename {invoices => client-c/invoices}/Makefile (100%)
 rename {invoices => client-c/invoices}/dapper-invoice.cls (100%)
 rename {invoices => client-c/invoices}/invoice-client-c-2025-01.tex (100%)
 rename {invoices => client-c/invoices}/invoice-client-c-2025-02.tex (100%)

And that’s it! Listing the directory will show you the desired structure:

.
├── client-a
│   ├── .gitignore
│   └── invoices
│       ├── dapper-invoice.cls
│       ├── invoice-client-a-2025-01.tex
│       ├── invoice-client-a-2025-02.tex
│       ├── invoice-client-a-2025-03.tex
│       └── Makefile
├── client-b
│   ├── .gitignore
│   ├── invoices
│   │   ├── dapper-invoice.cls
│   │   ├── invoice-client-b-2025-01.tex
│   │   └── Makefile
│   └── quotations
│       ├── Makefile
│       └── quotation-client-b-2025-01.tex
└── client-c
    ├── .gitignore
    └── invoices
        ├── dapper-invoice.cls
        ├── invoice-client-c-2025-01.tex
        ├── invoice-client-c-2025-02.tex
        └── Makefile

Now I can do some cleanup and merge the contents of the per-client .gitignore and dapper-invoice.cls files into single files residing in the repository’s root directory.

You can also see that the histories of the constituent repositories are still intact by using the awesome git grog command:[7]

[Figure: Colourful output of merged repositories via `git grog`]

That’s great! We’ve merged the repositories into subdirectories of a new, single repository, while retaining the histories of the original repos. Nice!
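One thing worth knowing about the move commits: a path-limited git log stops at the commit that moved a file unless it is told to follow renames. A minimal sketch, using a throwaway repository with a single made-up Makefile, shows the difference:

```shell
#!/bin/sh
# Sketch: compare path-limited history with and without --follow after a move.
set -eu

# Throwaway identity so that commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# First commit: the file lives in the repository root layout.
mkdir invoices
printf 'all:\n\tlatexmk -pdf invoice.tex\n' > invoices/Makefile
git add invoices
git commit -q -m "Add invoice Makefile"

# Second commit: move it into the client subdirectory.
mkdir client-a
git mv invoices client-a/
git commit -q -m "Move client a files into client-a subdir"

# Count the commits each form of git log reports for the moved file.
plain=$(git log --oneline -- client-a/invoices/Makefile | wc -l | tr -d ' ')
followed=$(git log --oneline --follow -- client-a/invoices/Makefile | wc -l | tr -d ' ')
echo "without --follow: $plain commit(s); with --follow: $followed commit(s)"
```

Here the plain log should report only the move commit, while --follow should also report the commit that originally added the file. Note that --follow works on a single file at a time.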

Alternative: use git subtree

There’s always more than one way to do it. In this case, we can use git subtree to do the same thing.[8] Let’s try it out.

We create the project’s root directory and initialise the new repository in it as before:

$ mkdir clients
$ cd clients
$ git init
Initialized empty Git repository in /path/to/clients/.git/

This time we don’t need to add any remotes and merge them in; git subtree will do it for us as part of the git subtree add command. But first, we need to ensure that our repo has at least one commit in it. Otherwise, we’ll get an error, e.g.:

$ git subtree add --prefix=client-a /path/to/client-a/ main
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Working tree has modifications.  Cannot add.

What the ambiguous argument 'HEAD' bit is trying to say (as far as I can tell) is that there’s no HEAD commit yet in this repository for git subtree to add to. So, let’s add one to start the ball rolling:

$ touch .gitignore
$ git add .gitignore
$ git commit -m "Initial import"  # I know it's a boring message; what else could I say?
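If an empty placeholder file feels like clutter, an empty commit should serve the same purpose, since all that’s needed is for HEAD to exist. A small sketch, run in a throwaway repository and with a made-up commit message:

```shell
#!/bin/sh
# Sketch: give a fresh repository a HEAD commit without adding any files.
set -eu

# Throwaway identity so that commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# rev-parse --verify fails quietly while the branch is still unborn.
if ! git rev-parse --verify --quiet HEAD >/dev/null; then
    git commit -q --allow-empty -m "Initial commit"
fi

# The repository now has one commit and no tracked files.
git rev-list --count HEAD
```

The guard makes the snippet safe to run in a repository that already has commits, in which case it does nothing.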

Now the git subtree add command will work as expected:

$ git subtree add --prefix=client-a /path/to/client-a/ main
git fetch /path/to/client-a/ main
remote: Enumerating objects: 16, done.
remote: Counting objects: 100% (16/16), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 16 (delta 3), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (16/16), 1.35 KiB | 197.00 KiB/s, done.
From /path/to/client-a
 * branch            main       -> FETCH_HEAD
Added dir 'client-a'

Note that I specifically don’t want to use the --squash option as mentioned in some HOWTOs. I like to keep my histories intact.

Listing the directory, we’ll find that git subtree has created a directory called client-a for us. Also, the files from the original client-a repository are present there. This is handy as it means we don’t have to create these directories ourselves, nor do we have to move the files around after merging in the remote.

Checking git log, you’ll find a kind of merge commit that Git automatically created, giving detailed info about the client-a directory’s provenance:

$ git log
commit 00149fda4c50121c208c5ed806cdcad30ac36255 (HEAD -> main)
Merge: 1d95319 fa21293
Author: Paul Cochrane <paul@peateasea.de>
Date:   Thu Mar 27 15:50:48 2025 +0100

    Add 'client-a/' from commit 'fa212936ea333a70c9839ca410720ea72585c33f'

    git-subtree-dir: client-a
    git-subtree-mainline: 1d95319060126d90594d94fb193f41f97563a598
    git-subtree-split: fa212936ea333a70c9839ca410720ea72585c33f

We add the other two repositories in the same way:

$ git subtree add --prefix=client-b /path/to/client-b/ main
git fetch /path/to/client-b/ main
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 13 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (13/13), 1.26 KiB | 644.00 KiB/s, done.
From /path/to/client-b
 * branch            main       -> FETCH_HEAD
Added dir 'client-b'
$ git subtree add --prefix=client-c /path/to/client-c/ main
git fetch /path/to/client-c/ main
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 12 (delta 2), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (12/12), 1.03 KiB | 528.00 KiB/s, done.
From /path/to/client-c
 * branch            main       -> FETCH_HEAD
Added dir 'client-c'
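As with the merge approach, the three git subtree add calls could be folded into a loop. This sketch again builds made-up stand-in repositories in a temporary directory. It also checks first that git subtree is installed at all, since it lives in Git’s contrib area and some distributions package it separately:

```shell
#!/bin/sh
# Sketch: add several repositories as subtrees of one, in a loop.
set -eu

# Throwaway identity so that commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

base=$(mktemp -d)

# Build three tiny stand-in repositories (hypothetical content).
for name in client-a client-b client-c; do
    git init -q "$base/$name"
    (
        cd "$base/$name"
        git symbolic-ref HEAD refs/heads/main
        mkdir invoices
        touch .gitignore invoices/Makefile "invoices/invoice-$name-2025-01.tex"
        git add .
        git commit -q -m "Add 2025-01 invoice for $name"
    )
done

# The combined repository needs at least one commit before subtrees are added.
git init -q "$base/clients"
cd "$base/clients"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "Initial import"

if [ -x "$(git --exec-path)/git-subtree" ]; then
    for name in client-a client-b client-c; do
        # No --squash, so each repository's full history is kept.
        git subtree add --prefix="$name" "$base/$name" main
    done
    git ls-files
else
    echo "git subtree does not appear to be installed" >&2
fi
```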

We have the files we expect and they’re in their proper locations:

.
├── client-a
│   ├── .gitignore
│   └── invoices
│       ├── dapper-invoice.cls
│       ├── invoice-client-a-2025-01.tex
│       ├── invoice-client-a-2025-02.tex
│       ├── invoice-client-a-2025-03.tex
│       └── Makefile
├── client-b
│   ├── .gitignore
│   ├── invoices
│   │   ├── dapper-invoice.cls
│   │   ├── invoice-client-b-2025-01.tex
│   │   └── Makefile
│   └── quotations
│       ├── Makefile
│       └── quotation-client-b-2025-01.tex
├── client-c
│   ├── .gitignore
│   └── invoices
│       ├── dapper-invoice.cls
│       ├── invoice-client-c-2025-01.tex
│       ├── invoice-client-c-2025-02.tex
│       └── Makefile
└── .gitignore

Also, the git grog output is very similar to before:

[Figure: Colourful output of `git subtree` merged repositories via `git grog`]

which means that our history is intact.

We’ve merged the repos into one again! Yay! 🎉

Wrapping up

So there you have it. From “it seemed like a good idea” to “oops” and finally to a more sensible repository structure, all thanks to the power available in Git.

Now back to writing those invoices…

  1. Yes, I mean this ironically. 

  2. If you need someone to hack on code and provide value to your users, give me a yell! I’m available for freelance Python/Perl backend development and maintenance work. Contact me at paul@peateasea.de and let’s discuss how I can help solve your business’ hairiest problems. 

  3. Pun not intended! 

  4. This is only an example layout to get across the idea and only vaguely reflects the situation I had at the time. 

  5. Yes, “smoosh” is the technical term. 😉 

  6. Dunno how likely that is… 

  7. I’ve forgotten where I originally found this alias, but it’s pretty cool, uses lots of colour and is a nice complement to tig. IIRC grog stands for “graphical log” as it’s a modified version of git log. I also find the name amusing. Here’s the code split over two lines for more readability:

    alias.grog=log --graph --abbrev-commit --decorate --all \
    --format=format:"%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(dim white) - %an%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n %C(white)%s%C(reset)"
    

  8. One could even use the more advanced technique of subtree merging, described in the Git book, for the same purpose. 
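
For reference, one way to register the grog alias from footnote 7 is with git config. The sketch below sets it only in a throwaway repository (with a made-up commit) so as not to touch any existing configuration; adding --global to the git config call would make the alias available everywhere:

```shell
#!/bin/sh
# Sketch: register the grog alias in a scratch repository and try it out.
set -eu

# Throwaway identity so that commits work on an unconfigured machine.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git commit -q --allow-empty -m "Add 2025-01 invoice for client a"

# Single quotes keep the inner double quotes as part of the alias value.
git config alias.grog 'log --graph --abbrev-commit --decorate --all --format=format:"%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(dim white) - %an%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n %C(white)%s%C(reset)"'

# Run the alias against the scratch repository.
git grog
```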
