Best Practices and Technology in Software Delivery
17 Feb
It’s very common to have a code check-out step be part of an integration build. Far better it is to not check out code before a build. What? How is that possible?
Let me explain, Fred. The simple approach most of us take (and have to take when getting things started) has developers commit, commit, commit, and when it is time to deploy, check out the code, do a build, and then deploy the application. There is room here for both problems and optimizations. Doing a full check out of the code tree is more costly in terms of time than checking out only what has changed. Updating the code tree with a single commit is less costly than updating with a large number of commits.
You may be limited by the technology in-hand and how much you’ve invested in learning the technology and possibly customizing it. For example, if your file control tool can only do a full check out of a source tree, or that’s the only command you had time to implement in order to meet the deadline, or you don’t trust your tool to do incremental updates, then you are basically running the longest builds possible.
On the other hand, if you could update the code tree every time a developer does a commit with only the changed files, then you are ready to execute a build at any moment. This requires some deft manipulation of your file control tool, and that’s why you don’t see it more often.
You might think “continuous integration” will take care of this. Developer commits, update checked out, build is run. However, you may end up with a build, test execution and deployment that takes longer than the typical time between developer commits. You still have to do incremental updates and it only solves the problem in cases with very low developer activity.
I’d like to point out one tool that does an excellent job of post-commit code checkout, CA Software Change Manager for Distributed. CA SCM (for short) is the tool, formerly known as Harvest, from the company formerly known as Computer Associates. CA SCM is a highly scalable (1000’s of developers) file control tool with a great lifecycle process model. We at OpenMake Software still have our very first customer still using OpenMake/Meister with CA Harvest/SCM after 11 years. While we have a reseller arrangement with CA, our partnership with CA in services has extended to 14 years.
About 10 years ago, OpenMake Software developed an integration with the then, Platinum Technologies’ Harvest product, modeled after the now dead Computer Associates product, Endevor Workstation, that had an excellent post-action code tree update. (Endevor for z/OS, a.k.a CA SCM for Z/OS is still very popular and has a similar functionality called ‘output libraries’ - following all this?) Our integration had the horrific name, ‘Har-refresh’.
As product partners, we finally transitioned Har-refresh from an external add-on to CA who have turned it into a core functionality of the product, called Hrefresh (a better name.) Rather than simply a post-commit check out, HRefresh updates the code tree after any action that updates a dynamic code view. This includes, renames, deletions, commits and code promotions and demotions. We like this because CA SCM does all the work and we cherry-pick sets of up-to-date code trees to build up an application source code stack for a build. We align Meister dependency directories with HRefresh-managed file system directories for a tight SCM (software configuration management) build.
This mechanism distributes the resource load for checking out code to times when builds are not required. It’s true that often times people want to build as soon as their code is checked in (or promoted), but on average it is a very big net win reducing build times.
This is just an example of the type of sophistication that is out there to prevent pre-build code check outs and save time on your builds.
9 Feb
One of OpenMake Software’s product strategies is to keep things simple. Build management is one of the most complex operations in all of the IT world, and one of our key benefits is to simplify, organize and automate the build process for development, testing and production.
We’ve seen a trend among our customers to simplify their build management infrastructure by going to fewer build machines with more CPU cores. Builds in particular use relatively more CPU resources than other resources as code is interpreted and compiled in memory and then finally written to disk. By reducing the total number of machines, rack space, procurement, administration and other IT overhead costs are reduced at great cost savings per machine eliminated.
Recently, I was at one of the big chip makers where they used dual quad-core CPU Linux machines for their development and builds. They had two machines and were able to control access to allow separate areas for development, testing and release builds in keeping with best practices. Having all the horsepower of 8 CPU cores on a single machine kept them from needing more machines.
Another customer does 6000 builds per month with Meister on just two build machines.
IBM, when selling BuildForge, likes to talk about big build server farms, because their tool does remote execution on multiple machines, as does Meister. However, BuildForge does not do builds at all. It can remotely execute your existing build scripts, but there is little real value add to that. BuildForge is also famously expensive. What happens over the next few years to the high investment in multi-machine remote execution software as the number of machines declines, perhaps dramatically?
A similar argument can be for Electric Cloud’s Electric Accelerator product. It’s possible in some cases, for C/C++ builds to gain an edge by pushing a compile operation to another machine, and then bringing it back. You would only do this to gain access to additional CPU resources. In the past, you might have 8 build machines that Electric Accelerator would farm operations out to. Now, you can pull all those operations into a single machine and there is no need for that functionality. Also, you are stuck with converting your GNU makefiles into other GNU makefiles.
Meister is optimized for multi-core CPU build machines and offers multi-threaded capability to both build events and non-build workflow events. You know where your build is and there are fewer dependencies on network resources. Both BuildForge and Electric Accelerator add additional overhead to build administration to coordinate across multiple machines - a dying practice, that no organization wants to invest in. Meister is the best bet for a future with fewer build machines with more horsepower.
5 Feb
I’ve recently been learning the Ruby on Rails framework for web development. It’s become a quite popular framework for getting database connected websites up and running relatively quickly. One way it is easier to get a site started than with other frameworks is because of the Convention over Configuration mantra that it lives by. Instead of requiring loads of configuration files to build a basic site (that can grow and become quite complex by the way) it has a feature called scaffolding which automatically builds your Model, View and Controller classes based on tables it finds in the database. It can do this by making assumptions based on standard conventions about interacting with a database from a website and naming and using classes in a standard way.
Although I am still a rookie when it comes to understanding the many facets of Ruby on Rails, I have really been trying to emulate the Convention over Configuration way of doing things in my various build/release projects at customer sites. One problem I inevitably encounter in most organizations is that the development of build and release methodologies has been left to the various development teams and not been thought about holistically using a centrally managed approach - this leads to little to no standards and Convention over Configuration is chucked aside. Not only is this inefficient from an organizational standpoint - why reinvent the wheel over and over again for each team when they are essentially tackling the same sets of problems, but it also makes for a nightmarish audit trail that could get you into trouble.
One reason this happens so frequently is that the managers that are supposed to be in charge of standards for building and releasing applications are often not privy to the kind of technical requirements that the various development units have when it comes to putting together and delivering their applications. And when developers try to explain the requirements, the standards people may get lost because they can’t possibly understand the nuts and bolts of every application.
To pull off real centralized management of builds and deployments, the standards people need to take a deep breath and rethink their objectives - start looking for the commonalities, not the differences between applications. In doing this, they will find that that problem that that developer told you was so unique and must be solved a certain way is probably very similar to the problem the other developer told you about last week - or just look on the web and see how many thousands of external developers have this same “unique” problem. It turns out that most applications can be constructed in the same type of way. Just because the source files are different between applications doesn’t mean that the paths they take to their target executable, dll, Jar, War or Ear file is very different at all. And when those paths are essentially the same - create a reusable process that the various teams can share. Use Convention over Configuration as your guide - standardize and centralize the common processes and externalize the technical specifications using highly modularized control files.
Here’s a simple task for you to try. This assumes all of your application teams have their code checked into a central repository - if not, you have bigger problems than standardizing builds and releases and should address those first. Look at your various technologies, whether it’s .Net, Java or some other and try to identify where the code tree’s start under the root of the project. You’d be amazed at how many teams check their .Net solutions into different levels of a code tree for no good reason, or Java teams that have their their source packages buried some place in the code tree. Next, look for the common root starting point for all these application types and try to come up with a simple standard based on this information. Finally, notify those that are not following that standard that you would like to move their code up and over to this new location (its usually up and not down) - it should actually be pretty easy to do. After this has been done, you can now have all build and deploy scripts use a standard root variable to find dependencies (think something like SOURCE_ROOT).
It always amazes me how many teams don’t standardize simple things like code tree start points in their source projects - it equally amazes me how much mileage you can get just out of making simple path standardization adjustments. After you’ve worked on the source tree, try doing the same with your common libraries. This isn’t rocket science - just remember, Convention over Configuration makes everything easier.
Adam
5 Feb
Recently on an assignment I come across a problem that seemingly should have been easy to solve but turned out to be a bit of a stumper. I was trying to execute a remote process using an agent that had to start as a service under the local system account. My requirement was that the process started by that service had to execute a build using a specific user’s profile. It was not enough just start the build process as another user, because I needed all of the user’s environment variables, such as APPDATA and USERPROFILE and its registry settings - the equivalent of logging in that user. In Unix, I would just ’su’ to the new user and source the profile - not so easy, as it turned out, on Windows.
After a bit of research I learned about the Windows RunAs program that allows you to do essentially what I needed. The only problem was that it couldn’t be executed in batch mode since it opened another command prompt to ask for the user’s password. Looking online, I found some people who had written all sorts of wrapper processes to try to get RunAs to work programatically. This included such circuitous methods as using a VB Script wrapper that detected typed keystrokes to pass them to the password login command prompt. Not a glamorous solution to be sure and not one that I could trust to be reliable for widespread enterprise use. I knew there had to be a Windows API for this - after some trial and error I finally found the perfect one to do what I wanted and its supposed to be compatible with 2000 - Vista versions of Windows. It involved using the C# ProcessStartInfo class under the System.Diagnostics package. And, like anytime you find the right tool for the job, once I figured out how to use it, the problem actually became a very quick fix. The class requires five arguments to Start a new process: executable file name, arguments to executable, user name, domain name and password. Note, the password needs to be converted into a SecureString for PSI to work. For my purposes, I didn’t want to pass in an encrypted password, so I just converted it in my script.
Here is the critical code snippet to handle command line arguments passed into the Main method.
ProcessStartInfo psi = new ProcessStartInfo();
psi.FileName = args[0];
Console.WriteLine(”Execution File: ” + args[0]);
psi.Arguments = args[1];
Console.WriteLine(”Execution Arguments: ” + args[1]);
psi.UseShellExecute = false;
psi.Domain = args[2];
Console.WriteLine(”Domain: ” + args[2]);
psi.UserName = args[3];
Console.WriteLine(”User: ” + args[3]);
String pw = args[4];
Char[] pwChars = pw.ToCharArray();
SecureString ss = new SecureString();
foreach (Char pwChar in pwChars)
{
ss.AppendChar(pwChar);
}
psi.Password = ss;
psi.WorkingDirectory = Directory.GetCurrentDirectory(); //start the process in the current working dir
psi.LoadUserProfile = true;
Process p = new Process();
p.StartInfo = psi;
p.Start();
That’s it! I actually wrapped the Start() method in a try, catch and attempted to login the user name to a local machine account of the same name in cases where the domain login failed in order to handle some machines that were not on the same domain, but you probably won’t need that. Hope this helps!
Adam
4 Feb
OM Services Senior Consultant Adam Gabor will be contributing to this blog this week. I’ve been the primary blogger for awhile now and am eager to have our other world-class consultants contributing their expertise to the community.
I took over the name of this blog, but now that we will be expecting more contributions from the others, we may have to reorg the blog a bit. I’ll keep you posted.