little cubes

Why you should stop using git pull

Zach Posten - 2022 May 26

...in its default form

If you ask most developers what git pull does, they'll usually say something along the lines of "it pulls down the latest code". But it actually does more than that. From the docs:

Pulling is actually a two-step process:

  1. git fetch - This is what most people think pull does: it goes out to the remote and updates your local repository with all of the changes that have happened on the remote since you last communicated with it.
  2. git merge - This is the dark side of pull that we want to avoid. If your local branch and the remote branch have each gained commits the other doesn't have, git (by default) will (secretly 🥷) create a merge commit that you probably don't want.
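To make the two steps concrete, here is a rough sketch that runs them by hand in a throwaway sandbox. Everything in it is made up for illustration: the alice and bob clones, the file names, and the commit_file helper. It merges origin/main directly, which approximates what a default pull does rather than reproducing it exactly.

```shell
#!/bin/sh
# Sketch: do by hand what a default `git pull` does, in a temp sandbox.
set -eu
sandbox=$(mktemp -d)

# Ignore personal git config so the demo does not depend on your settings.
export GIT_CONFIG_GLOBAL=/dev/null GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write a file and commit it.
commit_file() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$2"; }

# A bare "remote" plus a first clone (alice) that seeds it.
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main
git init -q "$sandbox/alice"
cd "$sandbox/alice"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
commit_file base "base"
git push -q origin main

# bob clones, then alice pushes something new while bob commits locally.
git clone -q "$sandbox/remote.git" "$sandbox/bob"
commit_file alice "alice's change"
git push -q origin main
cd "$sandbox/bob"
commit_file bob "bob's change"

# Step 1: fetch only updates origin/main; bob's branch is untouched.
git fetch -q origin
# Step 2: merging the fetched branch creates the merge commit.
git merge -q --no-edit origin/main

merge_commits=$(git rev-list --merges --count HEAD)
total_commits=$(git rev-list --count HEAD)
echo "merge commits: $merge_commits, total commits: $total_commits"
git log --oneline --graph
```

Two real changes went in, but the history now carries an extra commit whose only job is to join them.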

The Problem

If someone else has updated your remote branch, you probably want to know about that and make a decision about how to deal with it, rather than just have git assume that you want to create a merge commit. For example, if someone has force-pushed to the main branch and you run git pull on main, you are going to create two sets of almost duplicate commits when you (secretly 🥷) merge the old version of history with the new version.

In practice, using git pull as part of your workflow will bloat your repository with many unnecessary merge commits and sometimes duplicate commits, especially if you're in the habit of running git pull on branches that are shared between multiple teammates.

The Solution

Okay so you're hopefully convinced that running git pull can have some unintended consequences, but what do we do about it!? Run this:

git config --global pull.ff only

That git config setting essentially adds the --ff-only flag to the git pull command by default, which makes it safe to run! From the docs:

Another way of phrasing that would be, "if the history of this branch has changed in such a way that a merge commit (or a rebase) would be necessary, then do nothing."
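If you'd like to watch the "do nothing" part happen, the sketch below sets up a diverged branch in a temp directory and pulls with pull.ff set to only. The alice and bob clones, the file names, and the commit_file helper are invented for the demo, and the setting is applied to the sandbox repository only (no --global) so your own config is left alone.

```shell
#!/bin/sh
# Sketch: with pull.ff=only, pulling a diverged branch is refused
# and HEAD stays where it was.
set -eu
sandbox=$(mktemp -d)

# Ignore personal git config so the demo does not depend on your settings.
export GIT_CONFIG_GLOBAL=/dev/null GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write a file and commit it.
commit_file() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$2"; }

# Bare "remote", seeded from a first clone (alice).
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main
git init -q "$sandbox/alice"
cd "$sandbox/alice"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
commit_file base "base"
git push -q origin main

# bob clones; alice pushes again; bob commits locally, so the branches diverge.
git clone -q "$sandbox/remote.git" "$sandbox/bob"
commit_file alice "alice's change"
git push -q origin main
cd "$sandbox/bob"
commit_file bob "bob's change"

# Same setting as in the article, scoped to this sandbox repository.
git config pull.ff only

before=$(git rev-parse HEAD)
message=$(git pull -q 2>&1) && status=pulled || status=refused
after=$(git rev-parse HEAD)

echo "pull was $status"
echo "$message"
[ "$before" = "$after" ] && echo "HEAD did not move"
```

The fetch half still runs, so origin/main is up to date afterwards; only the merge half is refused.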

What happens when it breaks

In the case where you do have a "divergent local history", Git will tell you that something went wrong and you'll have the freedom to choose how to deal with the situation. Specifically, it will tell you fatal: Not possible to fast-forward, aborting.

When that happens, you've got a decision to make:

  • Have you just forgotten to update your remote lately? git push
  • Has someone else updated your remote?
    • Do you have un-pushed changes to this same branch? git cherry-pick or git rebase
    • Do you just want to accept their version of the branch? git reset
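As a rough illustration of the "someone else updated the remote and I have un-pushed changes" case, this sketch inspects both sides of the divergence and then rebases. The sandbox layout, the alice and bob clones, and the commit_file helper are assumptions made for the demo; it is one way through the decision list above, not the only one.

```shell
#!/bin/sh
# Sketch: look at a diverged branch, then replay local work on top of the remote.
set -eu
sandbox=$(mktemp -d)

# Ignore personal git config so the demo does not depend on your settings.
export GIT_CONFIG_GLOBAL=/dev/null GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write a file and commit it.
commit_file() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$2"; }

# Bare "remote", seeded from a first clone (alice).
git init -q --bare "$sandbox/remote.git"
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/main
git init -q "$sandbox/alice"
cd "$sandbox/alice"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/remote.git"
commit_file base "base"
git push -q origin main

# bob clones; alice pushes again; bob commits locally, so the branches diverge.
git clone -q "$sandbox/remote.git" "$sandbox/bob"
commit_file alice "alice's change"
git push -q origin main
cd "$sandbox/bob"
commit_file bob "bob's change"

# Fetch first, then count what each side has that the other lacks.
git fetch -q origin
incoming=$(git rev-list --count HEAD..origin/main)
outgoing=$(git rev-list --count origin/main..HEAD)
echo "incoming: $incoming, outgoing: $outgoing"

# Un-pushed work worth keeping: replay it on top of the remote branch.
git rebase -q origin/main
# (To accept the remote version instead and DISCARD local commits,
#  the alternative would be: git reset --hard origin/main)

merges=$(git rev-list --merges --count HEAD)
total=$(git rev-list --count HEAD)
echo "merge commits: $merges, total commits: $total"

# History is linear again, so this push is a plain fast-forward.
git push -q origin main
```

Same two changes as before, but this time the history is a straight line with no merge commit in it.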

But this way is harder

git pull works the way it does by default because it's easy. You don't have to make any decisions, and git can "fix" it for you automatically.

But that simplicity has a big tradeoff, as Git itself acknowledges. In recent versions of Git, when you run git pull there is some warning text that is easily ignored:

hint: Pulling without specifying how to reconcile divergent branches is
hint: discouraged. You can squelch this message by running one of the following
hint: commands sometime before your next pull:
hint:
hint:   git config pull.rebase false  # merge (the default strategy)
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.

The tradeoff is a much more confusing history of your repo. Adding all those extraneous merge commits makes it a lot harder to read the git history and understand what was done, by whom, and in what order.