Mavericks

Sean Blanton’s Blog on Software Management

Archive for July, 2008

Git for Services

I’ve been considering managing our services code under Git. Its distributed development model seems to fit well with sharing and developing code, mostly Perl, across multiple sites (consultants, customers, or both). It lets us keep a primary repository under our own control, while an on-site consultant can clone that repository and enhance it, customize it, or both. After the consultant leaves, the customer can choose whether or not to receive updates from our on-line repository (on GitHub, for example). They can also contribute enhancements, and we can decide whether to accept the changes they offer, reject them, or futz around with them first.

A consultant could make both enhancements and customizations, and as long as the two kinds of change are in separate commits, we can cherry-pick just the enhancement commits into our master branch. Pretty cool stuff.
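Here is a minimal sketch of that cherry-pick flow, run in a throwaway repository. The repository name, the file names, the "onsite" branch and the commit messages are all made up for illustration; only the Git commands themselves are real.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for our primary repository (hypothetical name).
work=$(mktemp -d)
git init -q "$work/services"
cd "$work/services"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo"
git config user.email "demo@example.com"

echo 'print "report\n";' > report.pl
git add report.pl
git commit -q -m "Initial services code"

# The consultant works on site, keeping customization and enhancement
# in separate commits.
git checkout -q -b onsite
echo '# customer-specific paths' > site.conf
git add site.conf
git commit -q -m "Customization: customer paths"

echo 'print "summary\n";' >> report.pl
git add report.pl
git commit -q -m "Enhancement: add summary line"
enhancement=$(git rev-parse HEAD)

# Back on master, take only the enhancement commit.
git checkout -q master
git cherry-pick "$enhancement" > /dev/null

# master now has the enhancement but not the customer's site.conf.
git log --oneline master
```

The cherry-pick only applies cleanly here because the enhancement does not depend on anything the customization commit changed, which is the real reason to keep the two apart.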

Some of our customers have strict controls over which executables may be installed on their machines, and they may not allow a Git client. However, one can clone a repository onto a USB drive and modify the work tree there. On the customer’s machine this looks no different from editing files outside of version control. After the edits are done, the USB drive can be returned to a machine with a Git client, where the changes are added and committed to the repository on the drive. Those changes can then be pushed to the on-line repository. So, a sort of open source development can be done without violating the customer’s security policies.
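The round trip can be sketched as follows. This is only an illustration under stated assumptions: a temporary directory stands in for the mounted USB drive, a local bare repository stands in for the on-line one, and the file name and commit messages are invented.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)

# Stand-in for the on-line repository (hypothetical; really a hosted URL).
git init -q --bare "$base/online.git"
git --git-dir="$base/online.git" symbolic-ref HEAD refs/heads/master

# Seed it with one commit so there is something to clone.
git init -q "$base/seed"
cd "$base/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo 'print "hello\n";' > tool.pl
git add tool.pl
git commit -q -m "Initial tool"
git push -q "$base/online.git" master

# 1. On a machine with Git: clone onto the "USB drive".
mkdir -p "$base/usb"
git clone -q "$base/online.git" "$base/usb/services"

# 2. On the customer's machine (no Git needed): plain file editing.
echo 'print "customer fix\n";' >> "$base/usb/services/tool.pl"

# 3. Back on a machine with Git: record the edits and publish them.
cd "$base/usb/services"
git config user.name "Demo"
git config user.email "demo@example.com"
git add tool.pl
git commit -q -m "Fix made on site"
git push -q origin master
```

Step 2 never invokes Git at all, which is the whole point: the repository just rides along in the ".git" directory on the drive.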

With the Web 2.0 evolution, information flow between people has changed from a “push” paradigm (I send you an email) to a “pull” paradigm (I follow you on Twitter). How could this possibly relate to code management such as branching, merging and history? Well, the way Git’s distributed repository model obtains code updates from “friend” repositories is similar to the way Twitter gives you status updates from the people you choose to follow. Instead of micro-blog entries or status updates, Git communicates source code branch updates.

Just as Facebook or Twitter lets you refer to a person by name rather than by a protocol identifier (an email address or web page), Git lets you define short aliases for long repository locations, which gives what you are doing a more direct, human feel: “git fetch linus” fetches changes from Linus’ repository, whose location you only had to define once.

Here is a scenario where Steve and I are working on a part of the Linux file system to provide information useful for build management and dependency tracking, which Meister and other tools can take advantage of. Steve started by cloning the master Linux repository and making changes. He then asked me to work on another part of the project, so I cloned his repository, which picked up all his changes. Because I started my repository by cloning his, I am now automatically following (Git calls it remote-tracking) the “master” branch of Steve’s repository. The “master” branch is what other systems call the “trunk” code stream. I can pick up his updates periodically with:

$ git pull

Now, I may also want to get updates directly from the master Linux repository, but it has a complicated URL that I won’t remember and only want to look up once. So, as a one-time command I do:

$ git remote add linux-nfs git://linux-nfs.org/pub/nfs-2.6.git

Forever after:

$ git fetch linux-nfs
* refs/remotes/linux-nfs/master: storing branch 'master' ...
commit: bf81b46

The “fetch” command doesn’t put the master Linux changes directly into my workspace; it stores them off to the side, in the remote-tracking branch “linux-nfs/master”, for me to examine first (very nice). If I want, I can then merge the changes into my local work tree. To see which remote branches I am following (which friends), I do:
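To make the "examine first, then accept" step concrete, here is a small sketch that runs entirely on local throwaway repositories. The "upstream" directory is only a stand-in for the real linux-nfs repository, and the file name and commit messages are invented; the remote is named "linux-nfs" so the branch names match the ones above.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)

# Local stand-in for the remote repository (hypothetical content).
git init -q "$base/upstream"
cd "$base/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
echo one > notes.txt
git add notes.txt
git commit -q -m "First upstream commit"

# My repository; -o names the remote "linux-nfs" instead of "origin".
git clone -q -o linux-nfs "$base/upstream" "$base/mine"

# Upstream moves on after I cloned.
echo two >> notes.txt
git commit -q -am "Second upstream commit"

cd "$base/mine"
git fetch -q linux-nfs

# Examine what arrived without touching my work tree:
git log --oneline master..linux-nfs/master   # commits I do not have yet
git diff --stat master linux-nfs/master      # files those commits touch

# Accept the changes into my local branch and work tree.
git merge -q linux-nfs/master
```

Until the final merge, my own "master" and my files are untouched; the fetch only moved "linux-nfs/master".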

$ git branch -r
linux-nfs/master
steve/master
origin/master

“origin/master” tracks the trunk of the repository I originally cloned from; my own trunk is the local “master” branch, which “git branch” without “-r” lists. I could also get the full repository locations behind the short names (“git remote -v” lists them), but as long as it works, I don’t want to know what they are. For me, this type of friendly and fluid interaction with repositories is one of the major advantages over CVS and Subversion.
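For the curious, a minimal sketch of looking up what a short name stands for. It uses an empty throwaway repository and the URL from the example above; adding a remote only records the name and URL, so nothing is contacted over the network.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Define the short name once (this does not contact the server).
git remote add linux-nfs git://linux-nfs.org/pub/nfs-2.6.git

# List every short name with the location it stands for.
git remote -v

# Or ask for a single one directly from the repository configuration.
git config --get remote.linux-nfs.url
```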