Debug with Git Binary Search

I don’t think I have to spend too many words on Git. Every programmer who was not on the moon in the last 5 years should already be a proficient Git user. Git is an amazing, flexible and powerful version control system. Sure, as Mercurial fans often claim, its command syntax is often really unclear (git reset HEAD <file> anyone?) and some operations are really unintuitive and hard to remember (e.g., removing a remote branch?). But Git is the most popular, the most successful and probably the most powerful VCS tool available. That’s a fact.

Many of you use Git daily, I’m sure of this. You use it to manage projects, to track versions of your software or personal documents, or to collaborate with colleagues and on open source software. But I’m pretty sure that many of you don’t know that you can use Git for debugging! Yes, Git can be one of your debugging tools too! Let’s see how!

Suppose you have released a project and at some point you receive a bug report similar to this:

Hi! With the last version 10.2 of AwesomeSoftware I have this <nasty-bug>! No problem with 10.1!

Now, the user says that version 10.1 of the software worked fine but that the newly released version 10.2 has introduced a regression. The bad news is that 50 commits separate version 10.1 from 10.2! How can you locate the bug in your gigantic codebase? If you could identify which commit introduced the bug it would be awesome: you would see exactly which lines of code are to blame! So you think of a simple strategy:

  1. You temporarily check out the commit 25 commits back (half of the commits between the bad current version and the good 10.1 version).
  2. You run a test to check whether the bug is already there.
  3. If yes, the bug was introduced at or before this commit, so go about 12 commits further back (half of 25) and try again.
  4. If no, the bug was introduced after this commit, so go about 12 commits forward and try again.
  5. Repeat, halving the range each time, until the evil commit is found.
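The steps above are just a binary search. Here is a minimal sketch in plain shell, with no Git involved: commits are numbered 0 (the good release) to 50 (the bad one), and is_bad is a hypothetical stand-in for "check out that commit and run the test". FIRST_BAD is an assumption made up for the demo; in real life it is exactly what you are looking for.

```shell
#!/usr/bin/env bash
# Made-up position of the evil commit, used only to fake the test result.
FIRST_BAD=37

# Stand-in for "check out commit $1 and run the test":
# succeeds (exit 0) when the bug is present in that commit.
is_bad() { [ "$1" -ge "$FIRST_BAD" ]; }

# Binary search. Invariant: $good is known good, $bad is known bad.
# Sets RESULT to the first bad commit and STEPS to the number of tests run.
find_first_bad() {
  local good=$1 bad=$2 mid
  STEPS=0
  while [ $((bad - good)) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    STEPS=$((STEPS + 1))
    if is_bad "$mid"; then
      bad=$mid    # bug already there: look in the older half
    else
      good=$mid   # bug not there yet: look in the newer half
    fi
  done
  RESULT=$bad
}

find_first_bad 0 50
echo "first bad commit: #$RESULT (found in $STEPS tests)"
```

With these made-up numbers the search needs 6 tests instead of up to 50.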

Cool. With this simple strategy you can find your evil commit in about log_2(50) iterations, that is, roughly 6 tests! This is awesome but… it is boring as hell! Is it possible to avoid this repetitive nightmare? Cheer up! The solution is right inside Git: git bisect!

How do you use it? Simple. First we initialize the tool by running git bisect start, git bisect bad and git bisect good v10.1, in that order.

We start the tool with start (obviously) and then we tell it two important things: 1) that the current version is bad and 2) that the version tagged v10.1 is good. If you don’t have a tag (and this is bad) you can use the SHA-1 ID of the good commit.

At this point Git tells you how many revisions are left to test and roughly how many steps that will take.

Git has automatically checked out a commit in the middle between your good and bad commits. Now you can run your test script, or test suite, or anything else, to check whether the bug is present in this commit or not. Suppose this commit is good. We can tell Git with git bisect good.

Now Git will move forward by half of the remaining 25 commits (12, more or less). Then you can repeat the test. Suppose this commit is bad. We can tell Git with git bisect bad.

And so on. You continue like this until Git announces the first bad commit. Then you have to remember to run git bisect reset to bring the repository back to where it was before the bisect started. This is important: until you reset, you are still sitting on whatever commit the bisect checked out, and you can really do weird stuff if you keep working from there.

Ok. Cool. This makes the whole “find-evil-commit” algorithm way easier but it is still boring. Don’t worry! Git has a solution for this too!

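Suppose you have an automated script for the specific problem. This script returns 0 if there is no bug and a code between 1 and 127 if there is a problem, with the exception of 125, which tells Git to skip a commit that cannot be tested (you can use all the traditional test suites and unit testing tools). Then you can fully automate the algorithm! Just run git bisect start HEAD v10.1, followed by git bisect run and the command that launches your script.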
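To make the exit code convention concrete, here is a minimal sketch of such a test. Everything in it is hypothetical: the function name check_bug, the file app.txt, and the idea that the "bug" is simply the word BUG appearing in that file. Your real test would build the project and run whatever reproduces the problem.

```shell
# check_bug: hypothetical test for this example. The "bug" is simply the
# word BUG appearing in app.txt (both names are made up for illustration).
check_bug() {
  # 125 asks git bisect to skip a commit that cannot be tested at all.
  [ -f app.txt ] || return 125
  # Bug present: any code from 1 to 127 (except 125) marks the commit as bad.
  if grep -q BUG app.txt; then
    return 1
  fi
  # Bug absent: 0 marks the commit as good.
  return 0
}
```

In a standalone script you would call check_bug as the last command, so that its return code becomes the exit code of the script.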

Wow! We first tell Git to start a binary search from HEAD (bad) to v10.1 (good). Then we use the run command to execute the given script at each step! Just press return, and you will be teleported to the bad commit! Extra-quick! :D
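If you want to try this without touching a real project, here is a self-contained sketch that builds a throwaway repository and lets Git hunt for the bug on its own. The history is entirely made up: 20 commits, a v10.1 tag on commit 5, and a "bug" (the word BUG in app.txt) introduced in commit 13. It assumes git and seq are installed.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository; file names, commit count and the position of the
# bug are all made up for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# 20 commits: commit 5 is tagged v10.1 (good), commit 13 introduces the bug.
for i in $(seq 1 20); do
  echo "change $i" >> app.txt
  if [ "$i" -eq 13 ]; then echo "BUG" >> app.txt; fi
  git add app.txt
  git commit -q -m "commit $i"
  if [ "$i" -eq 5 ]; then git tag v10.1; fi
done

# Test script kept outside the repository so every checkout can use it.
# grep succeeds when BUG is found; "!" turns that into a non-zero (bad) exit.
check=$(mktemp)
printf '%s\n' '! grep -q BUG app.txt' > "$check"

# HEAD is bad, v10.1 is good; then let Git run the test at every step.
git bisect start HEAD v10.1 > /dev/null
git bisect run sh "$check" > /dev/null

# After a finished run, refs/bisect/bad points at the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1
echo "first bad commit: $first_bad_subject"
```

The script prints "first bad commit: commit 13". Drop the redirections to /dev/null if you want to watch Git narrow down the range step by step.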

This can save a lot of time.