Forking a project to make your own changes lets you integrate your own contributions easily. However, if you do not send those changes back upstream (that is, back to the parent repository), you risk losing track of them, and your fork can diverge from the original. To make sure all contributors draw from the same place, you need to understand how git forking and git upstream interact. In this blog post, I will introduce the basics, cover the common pitfalls, and even leave you with a cool tip to stay ahead of the curve.
Git Upstream: Keeping Up to Date
Let me first explain the common settings and basic workflow for interacting with the upstream repository.
In a standard setup, you usually have an origin and an upstream remote - the latter being the gatekeeper of the project or the true source you want to contribute to.
First, check which remotes are already configured. You should at least see an origin, and ideally an upstream as well:
$ git remote -v
origin [email protected]:my-user/some-project.git (fetch)
origin [email protected]:my-user/some-project.git (push)
If you don't have an upstream, you can easily add it using the remote command.
git remote add upstream [email protected]:some-gatekeeper-maintainer/some-project.git
Check that the remote has been added successfully:
git remote -v
origin [email protected]:my-user/some-project.git (fetch)
origin [email protected]:my-user/some-project.git (push)
upstream [email protected]:some-gatekeeper-maintainer/some-project.git (fetch)
upstream [email protected]:some-gatekeeper-maintainer/some-project.git (push)
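The check-then-add steps above can be folded into one small helper. This is only a sketch: the function name ensure_upstream and its URL argument are made up for illustration, and it assumes a Git version recent enough to have git remote get-url.

```shell
# Add an "upstream" remote only when it is missing, then list all remotes.
# ensure_upstream is an illustrative name, not part of git.
ensure_upstream() {
    local url="$1"
    if git remote get-url upstream >/dev/null 2>&1; then
        # An upstream remote already exists: leave it untouched.
        echo "upstream already configured: $(git remote get-url upstream)"
    else
        git remote add upstream "$url"
        echo "upstream added: $url"
    fi
    git remote -v
}
```

Run it from inside your clone, passing the URL of the repository you forked from. Calling it a second time leaves the existing remote as it is.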
Now you can fetch the latest changes from the upstream repository. Repeat this whenever you want to pick up updates:
git fetch upstream
(If the project has tags that have not been merged into master, also run git fetch upstream --tags.)
In general, you want to keep your local master branch as a close mirror of the upstream master and do any work in feature branches, as they may later become pull requests.
At this point, it does not matter whether you use merge or rebase: as long as your local master has no commits of its own, both simply fast-forward it to upstream/master. Let's use merge:
git checkout master
git merge upstream/master
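If you want Git to enforce the "master is only a mirror" rule, the fetch and merge can be wrapped as below. This is a sketch, not the only way to do it: sync_master is a hypothetical name, and it assumes the remote is called upstream and the branch master, as in the commands above. The --ff-only flag makes the merge fail instead of creating a merge commit when the local master has picked up commits of its own.

```shell
# Bring local master up to date with upstream/master.
# sync_master is an illustrative name; it assumes a remote called "upstream"
# and a branch called "master".
sync_master() {
    git fetch upstream &&
    git checkout master &&
    # Refuse anything that is not a plain fast-forward.
    git merge --ff-only upstream/master
}
```

A non-zero exit status means master has diverged from upstream and needs a closer look before you continue.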
When you want to share some work with the upstream maintainers, you can create a feature branch from the master branch. Once you're satisfied, push it to your remote repository.
You can also use rebase instead of merge so that upstream gets a clean set of commits (ideally one) to evaluate:
git checkout -b feature-x
# some work and some commits happen
# some time passes
git fetch upstream
git rebase upstream/master
If you need to squash several commits into one, you can use the powerful interactive rebase (git rebase -i) at this point.
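Interactive rebase opens an editor, which is awkward to show in a snippet. As a rough non-interactive stand-in (a different technique from the interactive rebase mentioned above), the sketch below collapses everything on the branch since it left upstream/master into a single commit. The name squash_branch is hypothetical, and it assumes upstream has already been fetched.

```shell
# Collapse every commit made since the branch left upstream/master into one.
# squash_branch is an illustrative name; this is a stand-in for an
# interactive rebase, not the same command.
squash_branch() {
    local message="$1"
    local base
    # Commit where the current branch and upstream/master last agreed.
    base=$(git merge-base HEAD upstream/master) || return 1
    # Move the branch pointer back but keep all changes staged...
    git reset --soft "$base" &&
    # ...then record them as one commit.
    git commit -q -m "$message"
}
```

For example, squash_branch "Add feature X" on feature-x leaves one commit on top of the point where the branch diverged. Reach for git rebase -i when you want to keep, reorder, or reword individual commits instead.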
Publish with git fork
After going through the above steps, publish your work in the remote fork with a simple push.
git push origin feature-x
If you need to update the remote branch feature-x after publishing it, for example because of feedback from the upstream maintainers, you have a few options:
1. Create a new branch that includes your updates and the updates from the upstream repository.
2. Merge the updates from upstream into your local branch, which records a merge commit. This clutters the upstream history.
3. Rebase your local branch on top of the updates from upstream and force push to your remote branch:
git push -f origin feature-x
Personally, I prefer to keep the history as clean as possible and go for option three, but different teams have different workflows. Note: Only do this when working with your own fork. Rewriting the history of shared repositories and branches is something you should never do.
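If you want a bit more safety when force pushing, Git's --force-with-lease option is worth knowing: it refuses to overwrite the remote branch if it has moved since your last fetch. The wrapper below is only a sketch. publish_rebased is a made-up name, and the default branch name simply follows the feature-x example.

```shell
# Publish a rebased branch to the fork.
# publish_rebased is an illustrative name; the branch defaults to feature-x.
publish_rebased() {
    local branch="${1:-feature-x}"
    # Unlike a plain -f, this fails if origin/<branch> no longer matches
    # what was last fetched or pushed from this clone.
    git push --force-with-lease origin "$branch"
}
```

If the push is rejected, fetch from origin, look at what changed on the remote branch, and rebase again before retrying.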
Tip of the day: Ahead/Behind number in the prompt
After a fetch, git status shows you how many commits you are ahead of or behind the remote branch you are tracking. Wouldn't it be nicer to see this information right in your faithful command prompt? I thought so too, so I put my bash skills to work and made it happen.
Here's what it looks like on your prompt after you configure it:
nick-macbook-air:~/dev/projects/stash[1|94]$
This is what you need to add to your .bashrc or equivalent, just a function:
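function ahead_behind {
    # Name of the branch that is currently checked out
    curr_branch=$(git rev-parse --abbrev-ref HEAD);
    # Remote that this branch tracks
    curr_remote=$(git config branch.$curr_branch.remote);
    # Tracked branch with the leading refs/heads/ stripped (3- keeps names such as feature/x whole)
    curr_merge_branch=$(git config branch.$curr_branch.merge | cut -d / -f 3-);
    # Count the commits on each side and turn the TAB between them into |
    git rev-list --left-right --count $curr_branch...$curr_remote/$curr_merge_branch | tr -s '\t' '|';
}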
You can enrich your bash prompt with this new function ahead_behind to achieve the desired effect. I'll leave the coloring work to the reader.
Sample prompt:
export PS1="\h:\w[\$(ahead_behind)]$"
Internal Structure
For those who like details and explanations, here's how it works.
We get the symbolic name of the current HEAD, that is, the current branch:
curr_branch=$(git rev-parse --abbrev-ref HEAD);
We get the remote that the current branch points to.
curr_remote=$(git config branch.$curr_branch.remote);
We get the branch that this remote should be merged into (with a cheap Unix trick that drops the leading refs/heads/ by keeping everything from the third /-separated field onward):
curr_merge_branch=$(git config branch.$curr_branch.merge | cut -d / -f 3-);
Now we have what we need to collect the number of commits ahead and behind.
git rev-list --left-right --count $curr_branch...$curr_remote/$curr_merge_branch | tr -s '\t' '|';
We use the old Unix tr to convert TABs into separators |.
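The same numbers can also be obtained by letting Git resolve the tracked branch itself through the @{upstream} shorthand. The variant below is a sketch under that assumption. The name ahead_behind_upstream is made up, and unlike the function above it prints nothing when the branch has no upstream configured.

```shell
# Variant of ahead_behind that lets Git resolve the tracked branch.
# Prints "ahead|behind", or nothing when no upstream is configured.
# ahead_behind_upstream is an illustrative name.
ahead_behind_upstream() {
    local counts
    # Left count: commits only on HEAD; right count: commits only on upstream.
    counts=$(git rev-list --left-right --count 'HEAD...@{upstream}' 2>/dev/null) || return 0
    printf '%s' "$counts" | tr -s '\t' '|'
}
```

It can be dropped into PS1 in the same way as ahead_behind, and staying silent on errors keeps the prompt clean in directories that are not repositories.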
Getting started with git upstream
That's the basic walkthrough of git upstream: how to set it up, create new branches, collect changes, publish with git fork, and a handy tip to see how many commits you are ahead of or behind your remote branch.