Sean Blanton

Best Practices and Technology in Software Delivery


Archive for the ‘SCM Strategy’ Category

It is just plain fun to run parallel workflows and builds and watch the activities and build steps light up the workflow monitor in real time like a Christmas tree. See this flash demo to see what I mean.

As customers go to machines with more and more cores, fewer machines are needed in the application lifecycle infrastructure, particularly for builds and code retrievals - the most resource intensive functions. This is helping to simplify the infrastructure, reduce maintenance and administration and drive down costs.

Several of our customers are running around 5000 builds and non-build workflows per month on two machines. The primary reason for two machines, in fact, is for disaster recovery, and the goal is to run both machines at less than half capacity so that in the event that one machine (or datacenter) fails, all the current capacity can be run as a contingency on the one machine that’s left.

Thread control is very simple with Meister and Mojo. Both use the omsubmit dependency manager program to handle this. Meister’s om program translates build events into workflow steps using omsubmit. The OMSUBMIT_MAX_USER_PROC value sets the maximum allowed number of threads.

You might think that if you are running dual quad-core build machines you should set the max threads at 8. However, Meister posts build operations to one thread and the associated logging operation to another. Compile operations notoriously use a lot of memory and CPU resources, but the logging operation just posts to a server and waits for the operation to complete. There is really no disadvantage to setting the max threads higher than 16 in this case, so go ahead and do it.

As a non-build workflow example, I worked on JBoss deployments to 48 Linux machines. The workflow was parallelized into 48 activities each of which deployed to a single machine in parallel. The deployment activity was largely a remote execute operation that extracted archives on the remote machine. The extraction took about a second for a medium sized application. Again, this is a waiting situation where machine resources are essentially idle while the thread is in use, so use more threads. The machine was a dual, dual-core build machine and we set OMSUBMIT_MAX_USER_PROC to 50.

Watching the workflow monitor as the deployment ran, we could see roughly half of the machines light up (meaning actively running) at any one time and the entire deployment process synchronized all 48 machines in a little over two seconds.
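The same effect is easy to reproduce outside of Meister. Here is a minimal Python sketch, assuming each deployment is a pure wait on a remote machine (simulated here with a sleep). The function names, host names and timings are made up for illustration, and the thread limit only plays the role that OMSUBMIT_MAX_USER_PROC plays for omsubmit; this is not how omsubmit is implemented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def deploy(host, wait_seconds=0.05):
    """Stand-in for a remote extract: the local thread only waits."""
    time.sleep(wait_seconds)
    return host


def deploy_all(hosts, max_threads, wait_seconds=0.05):
    """Deploy to every host with at most max_threads threads.

    Returns the list of finished hosts and the elapsed wall-clock time.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        done = list(pool.map(lambda h: deploy(h, wait_seconds), hosts))
    return done, time.perf_counter() - start


if __name__ == "__main__":
    hosts = [f"linux{n:02d}" for n in range(1, 49)]  # 48 hypothetical targets
    for threads in (4, 50):
        done, elapsed = deploy_all(hosts, threads)
        print(f"{threads:>2} threads: {len(done)} hosts in {elapsed:.2f}s")
```

With the thread limit matched to four cores the waits queue up behind each other; with the limit at 50 they overlap, even though the machine has no more cores than before.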

So, don’t simply match your machine’s CPU threading capabilities - overclock! Aim high for max threads and try to determine where your performance is optimized. I’d love to provide you with some metrics as a function of thread count, but usually once something is working it’s on to the next project. I barely have enough time to blog!

Check Out Code Post-Commit - Not Pre-Build

It’s very common to have a code check-out step be part of an integration build. Far better it is to not check out code before a build. What? How is that possible?

Let me explain, Fred. The simple approach most of us take (and have to take when getting things started) has developers commit, commit, commit, and when it is time to deploy, check out the code, do a build, and then deploy the application. There is room here for both problems and optimizations. Doing a full check out of the code tree is more costly in terms of time than checking out only what has changed. Updating the code tree with a single commit is less costly than updating with a large number of commits.

You may be limited by the technology in-hand and how much you’ve invested in learning the technology and possibly customizing it. For example, if your file control tool can only do a full check out of a source tree, or that’s the only command you had time to implement in order to meet the deadline, or you don’t trust your tool to do incremental updates, then you are basically running the longest builds possible.

On the other hand, if you could update the code tree every time a developer does a commit with only the changed files, then you are ready to execute a build at any moment. This requires some deft manipulation of your file control tool, and that’s why you don’t see it more often.
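To make the idea concrete, here is a minimal Python sketch of a post-commit update. It assumes your file control tool can hand a hook the list of paths that a commit changed and deleted, plus a directory holding the committed versions; `apply_commit` and all the paths are hypothetical names for illustration, not part of any real tool.

```python
import shutil
import tempfile
from pathlib import Path


def apply_commit(committed_dir, build_tree, changed, deleted=()):
    """Bring build_tree up to date with one commit, touching only the listed paths."""
    src_root, dst_root = Path(committed_dir), Path(build_tree)
    for rel in changed:
        target = dst_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_root / rel, target)  # copy only what the commit changed
    for rel in deleted:
        (dst_root / rel).unlink(missing_ok=True)  # mirror deletions as well
    return len(changed) + len(deleted)


if __name__ == "__main__":
    # Throwaway directories stand in for the repository export and the build tree.
    with tempfile.TemporaryDirectory() as tmp:
        commit, tree = Path(tmp, "commit"), Path(tmp, "tree")
        (commit / "src").mkdir(parents=True)
        (commit / "src" / "Main.java").write_text("class Main {}")
        touched = apply_commit(commit, tree, changed=["src/Main.java"])
        print(f"{touched} path(s) updated; the build tree is ready")
```

The cost of each update is proportional to the size of the commit, not the size of the source tree, which is the whole point.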

You might think “continuous integration” will take care of this. Developer commits, update checked out, build is run. However, you may end up with a build, test execution and deployment that takes longer than the typical time between developer commits. You still have to do incremental updates and it only solves the problem in cases with very low developer activity.

I’d like to point out one tool that does an excellent job of post-commit code checkout: CA Software Change Manager for Distributed. CA SCM (for short) is the tool formerly known as Harvest, from the company formerly known as Computer Associates. CA SCM is a highly scalable (1000’s of developers) file control tool with a great lifecycle process model. Our very first customer at OpenMake Software is still using OpenMake/Meister with CA Harvest/SCM after 11 years. While we have a reseller arrangement with CA, our partnership with CA in services has extended to 14 years.

About 10 years ago, OpenMake Software developed an integration with the Harvest product, then owned by Platinum Technologies, modeled after the now dead Computer Associates product, Endevor Workstation, which had an excellent post-action code tree update. (Endevor for z/OS, a.k.a. CA SCM for z/OS, is still very popular and has a similar functionality called ‘output libraries’ - following all this?) Our integration had the horrific name, ‘Har-refresh’.

As product partners, we finally transitioned Har-refresh from an external add-on to CA, who have turned it into a core functionality of the product called HRefresh (a better name). Rather than simply a post-commit check out, HRefresh updates the code tree after any action that updates a dynamic code view. This includes renames, deletions, commits and code promotions and demotions. We like this because CA SCM does all the work and we cherry-pick sets of up-to-date code trees to build up an application source code stack for a build. We align Meister dependency directories with HRefresh-managed file system directories for a tight SCM (software configuration management) build.

This mechanism distributes the resource load for checking out code to times when builds are not required. It’s true that often times people want to build as soon as their code is checked in (or promoted), but on average it is a very big net win reducing build times.

This is just an example of the type of sophistication that is out there to prevent pre-build code check outs and save time on your builds.

One of OpenMake Software’s product strategies is to keep things simple. Build management is one of the most complex operations in all of the IT world, and one of our key benefits is to simplify, organize and automate the build process for development, testing and production.

We’ve seen a trend among our customers to simplify their build management infrastructure by going to fewer build machines with more CPU cores. Builds in particular use relatively more CPU resources than other resources as code is interpreted and compiled in memory and then finally written to disk. By reducing the total number of machines, rack space, procurement, administration and other IT overhead costs are reduced at great cost savings per machine eliminated.

Recently, I was at one of the big chip makers where they used dual quad-core CPU Linux machines for their development and builds. They had two machines and were able to control access to allow separate areas for development, testing and release builds in keeping with best practices. Having all the horsepower of 8 CPU cores on a single machine kept them from needing more machines.

Another customer does 6000 builds per month with Meister on just two build machines.

IBM, when selling BuildForge, likes to talk about big build server farms, because their tool does remote execution on multiple machines, as does Meister. However, BuildForge does not do builds at all. It can remotely execute your existing build scripts, but there is little real value add to that. BuildForge is also famously expensive. What happens over the next few years to the high investment in multi-machine remote execution software as the number of machines declines, perhaps dramatically?

A similar argument can be made for Electric Cloud’s Electric Accelerator product. It’s possible in some cases for C/C++ builds to gain an edge by pushing a compile operation to another machine and then bringing it back. You would only do this to gain access to additional CPU resources. In the past, you might have 8 build machines that Electric Accelerator would farm operations out to. Now, you can pull all those operations into a single machine and there is no need for that functionality. Also, you are stuck with converting your GNU makefiles into other GNU makefiles.

Meister is optimized for multi-core CPU build machines and offers multi-threaded capability to both build events and non-build workflow events. You know where your build is and there are fewer dependencies on network resources. Both BuildForge and Electric Accelerator add additional overhead to build administration to coordinate across multiple machines - a dying practice, that no organization wants to invest in. Meister is the best bet for a future with fewer build machines with more horsepower.

I’ve recently been learning the Ruby on Rails framework for web development. It’s become a quite popular framework for getting database connected websites up and running relatively quickly. One way it is easier to get a site started than with other frameworks is because of the Convention over Configuration mantra that it lives by. Instead of requiring loads of configuration files to build a basic site (that can grow and become quite complex by the way) it has a feature called scaffolding which automatically builds your Model, View and Controller classes based on tables it finds in the database. It can do this by making assumptions based on standard conventions about interacting with a database from a website and naming and using classes in a standard way.

Although I am still a rookie when it comes to understanding the many facets of Ruby on Rails, I have really been trying to emulate the Convention over Configuration way of doing things in my various build/release projects at customer sites. One problem I inevitably encounter in most organizations is that the development of build and release methodologies has been left to the various development teams and not been thought about holistically using a centrally managed approach - this leads to little to no standards and Convention over Configuration is chucked aside. Not only is this inefficient from an organizational standpoint - why reinvent the wheel over and over again for each team when they are essentially tackling the same sets of problems, but it also makes for a nightmarish audit trail that could get you into trouble.

One reason this happens so frequently is that the managers that are supposed to be in charge of standards for building and releasing applications are often not privy to the kind of technical requirements that the various development units have when it comes to putting together and delivering their applications. And when developers try to explain the requirements, the standards people may get lost because they can’t possibly understand the nuts and bolts of every application.

To pull off real centralized management of builds and deployments, the standards people need to take a deep breath and rethink their objectives - start looking for the commonalities, not the differences between applications. In doing this, they will find that the problem one developer insisted was so unique and had to be solved a certain way is probably very similar to the problem another developer described last week - or just look on the web and see how many thousands of external developers have this same “unique” problem. It turns out that most applications can be constructed in the same type of way. Just because the source files are different between applications doesn’t mean that the paths they take to their target executable, dll, Jar, War or Ear file are very different at all. And when those paths are essentially the same - create a reusable process that the various teams can share. Use Convention over Configuration as your guide - standardize and centralize the common processes and externalize the technical specifications using highly modularized control files.

Here’s a simple task for you to try. This assumes all of your application teams have their code checked into a central repository - if not, you have bigger problems than standardizing builds and releases and should address those first. Look at your various technologies, whether it’s .Net, Java or some other, and try to identify where the code trees start under the root of the project. You’d be amazed at how many teams check their .Net solutions into different levels of a code tree for no good reason, or Java teams that have their source packages buried some place in the code tree. Next, look for the common root starting point for all these application types and try to come up with a simple standard based on this information. Finally, notify those that are not following that standard that you would like to move their code up and over to this new location (it’s usually up and not down) - it should actually be pretty easy to do. After this has been done, you can have all build and deploy scripts use a standard root variable to find dependencies (think something like SOURCE_ROOT).
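The survey step can be scripted. Here is a minimal Python sketch, assuming a code tree "starts" at the shallowest directory that holds a recognizable marker file; the marker list, the function name and the sample layout are my own illustrative choices, not a standard.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical markers of where a code tree starts; adjust for your technologies.
MARKERS = ("*.sln", "pom.xml", "build.xml")


def code_tree_starts(project_root, markers=MARKERS):
    """Return the relative paths of the shallowest directories holding a marker file."""
    root = Path(project_root)
    starts = []
    for dirpath, dirnames, _ in os.walk(root):
        here = Path(dirpath)
        if any(any(here.glob(pattern)) for pattern in markers):
            starts.append(here.relative_to(root).as_posix())
            dirnames.clear()  # do not look below a start point
    return sorted(starts)


if __name__ == "__main__":
    # Two made-up projects: one starts at the project root, one is buried.
    with tempfile.TemporaryDirectory() as tmp:
        tidy, buried = Path(tmp, "billing"), Path(tmp, "claims")
        tidy.mkdir()
        (tidy / "Billing.sln").write_text("")
        (buried / "trunk" / "code").mkdir(parents=True)
        (buried / "trunk" / "code" / "Claims.sln").write_text("")
        for project in (tidy, buried):
            print(project.name, "->", code_tree_starts(project))
```

A start point of "." means the project already begins at its root. Anything deeper is a candidate for moving up, after which scripts can resolve everything relative to one SOURCE_ROOT-style variable.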

It always amazes me how many teams don’t standardize simple things like code tree start points in their source projects - it equally amazes me how much mileage you can get just out of making simple path standardization adjustments. After you’ve worked on the source tree, try doing the same with your common libraries. This isn’t rocket science - just remember, Convention over Configuration makes everything easier.

Adam

Best Practices Production Build Control

Finding the blog Enterprise Maven made me decide to go back to the basics, today. This blog is from 2006, but the best practices of production control ignored here go back decades. I’d like to point out that Oleg Gusakov, the author, wrote the blog in a very good spirit and seems like a nice guy. He just seems to be a bit naive about what’s been happening with software development in the enterprise.

In the first section, he assumes that the only enterprise build and deploy solution is one that is customized, while OpenMake Meister has been serving that role now for 12 years. He does correctly conclude that all the enterprises in the world should not be independently investing in the same type of build and deploy solution. It is a costly investment and this functionality should be productized. That’s exactly why we did it and why that is still one of our chief selling points.

He is right that developing a product that should be commoditized is a drain on the business. However, the converse, having a commercial product provide the functionality at a greatly reduced cost compared with one homegrown, provides a competitive advantage over those companies who don’t have such a product.

Through the middle of the article, again, I think Oleg is unaware of the heavy horse SCM products out there that provide a lot of the expected functionality. Tools like CA Harvest, Serena Dimensions and others are very complex and sophisticated n-tier products. They nevertheless do not provide build support, so by combining an enterprise file control tool with an enterprise build and workflow tool, Meister, you cover the required functionality.

Lastly, regarding the enterprise development lifecycle, he is right it is an oversimplification. I like his phrase that he hopes to “grow the meat.” At OM Services, we have “fully grown meat” and the enterprise lifecycle documents that we develop with our customers and clients are typically 50-80 pages in length. Here is where I review the generally accepted best practices, going back to the seventies with mainframe development. (NO, distributed platforms are not somehow different in the high level process!)

  1. It all starts with production control. Developers do not have access to production due to a fundamental conflict of interest. Maintaining business continuity trumps developers’ ease of delivery to production.
  2. Since someone else puts the code into production (or operates a tool which does so), this is the basis for separation of roles and responsibilities in the enterprise software development lifecycle.
  3. To ensure integrity of the production environment, the production build must be done by a group representing the business, not development. Developers do not do production builds.
  4. Working backwards, if you want your test environment to be as close as possible to production, you lock this down and prevent access to developers. This is usually the QA testing environment.
  5. Again, to avoid a conflict of interest, the QA testers should be working for the business, not the application development team.
  6. And it follows that the build for the QA environment is done by the business.
  7. The developer’s job from this perspective is delivering source code to the business, which wants to retain the source code and the ability to use it (meaning they can build it).
  8. The process of developers transferring source code to the QA build people was called “throwing it over the wall”. Now, the heavy horse SCM tools and Meister workflow make it easy to do this and allow variations of iterative development involving the QA environment.

Any type of continuous integration or agile development practice typically happens before the QA environment. Any development methodology for the enterprise must either take into account the fundamental conflict of interest between software change delivery and business continuity, or ignore it and remain entirely in front of QA.

If you are a developer, you can think of this as a loss of privilege, or you can be elated that other people are doing the dirty work for you and you can focus on the art and science of engineering business solutions. If you are really depressed, maybe you should be on the other side of the wall!

How to Improve Your RFP Process

I’ve been involved with software procurements that involve RFP’s (Request-for-Proposal) on both sides of the fence - as part of the purchasing organization, and more frequently as a software vendor. I’ve seen the mistakes people make in sending out an RFP and then making a purchasing decision based on the results, and I’m filing those items away should the time come for me to head up a software purchase myself.

RFP’s work best when you need a product that is strongly commoditized. For example, you need a software package to manage your purchasing. Or, you need software to perform all of your HR (Human Resource) functions. HR functions are pretty much the same at most companies - sure, larger companies might have needs for scalability and breadth of functionality that smaller companies don’t, but it’s all HR.

Before I get too far into my experience, let me mention that this is not a sour grapes article. Meister does great in sales with RFP’s and we invariably win or come in a close second. However, in some of the “close second’s” we’ve lost to a product that we would not regard as a competitor and we can see that the purchaser has not satisfied some of their stated key requirements at the beginning of the process that got us involved. It has made me wonder and this article is the result.

The first way to screw up an RFP is to make it a democracy - keep it an oligarchy of stakeholders. If you start out with a need for an HR solution and you solicit requirements from everyone in your company, you might get requirements like “needs to do purchasing” and “needs to do supply-chain management”. If you then invite more people from the purchasing department to participate and then have everyone score according to the sum-total requirements, you may very well end up with a purchasing solution when your original goal was an HR solution. In this case, there was a lack of weighting for HR requirements and HR stakeholders’ votes.

Some old fashioned leadership can work here, where the key stakeholder makes the final decision and is accountable for it, taking into account everyone’s scorecard. Sure, this is an extreme example, but it illustrates my point about requirement dilution clearly.
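The dilution is just arithmetic. Here is a small Python sketch with made-up scores for two hypothetical products; the requirement names, the numbers and the weight of 3 are all invented for illustration, not taken from a real RFP.

```python
def total_score(scores, weights=None):
    """Add up one product's scores, multiplying each requirement by its weight (default 1)."""
    weights = weights or {}
    return sum(value * weights.get(req, 1.0) for req, value in scores.items())


def rank(products, weights=None):
    """Return product names ordered from best to worst total score."""
    return sorted(products, key=lambda name: total_score(products[name], weights), reverse=True)


# Hypothetical scorecard, 0-10 per requirement, after everyone has added requirements.
PRODUCTS = {
    "HR suite": {"payroll": 9, "benefits": 8, "purchasing": 2, "supply chain": 1},
    "Purchasing suite": {"payroll": 3, "benefits": 2, "purchasing": 9, "supply chain": 9},
}

# The original goal was HR, so the HR requirements count three times as much.
HR_WEIGHTS = {"payroll": 3.0, "benefits": 3.0}

if __name__ == "__main__":
    print("Unweighted winner:", rank(PRODUCTS)[0])
    print("HR-weighted winner:", rank(PRODUCTS, HR_WEIGHTS)[0])
```

With every requirement counting equally, the purchasing product totals 23 against 20 for the HR product and wins an RFP that was supposed to buy HR software. Once the HR requirements are weighted, the HR product comes out ahead, 54 to 33.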

The second way to screw up an RFP is to limit the value you can get from a procurement. Let’s say you send out an RFP for an HR solution and one of the vendors says they can do financials as well. You could say, well, we are only looking for an HR solution (that could very well be the case, but let’s say there is no solution for financials in place). You could talk to the guy in charge of financials and let him know there is an opportunity. My first point is relevant here because really the software products are no longer commodities. Either you should open the RFP up to vendors who can do HR AND financials or make a decision based on thorough investigation of the functionality with management consensus about the overall benefit of each product to the organization. In this case, it’s more opportunity lost, but finding opportunities and bringing them forward is how people win leadership awards (or keep their jobs in a rough economy).

To summarize, have clearly defined business needs and requirements and stick to those when making your decision. If you find you are trying to choose between apples and oranges, step back and regroup with management to determine the overall value of each product to the organization. In our industry, single tools that provide functionality in SCM, development and IT infrastructure are hardly commoditized today and have small overlap with one another.

I suppose it boils down to having clear requirements, stakeholder involvement and effective leadership. But, isn’t that always the case?

Mojo and Meister 7.2.1 On The Way

We’ve been testing Mojo and Meister 7.2.1 and getting ready for their release on December 15. This is a maintenance release with bug fixes from users who’ve started running builds and workflows with Meister 7.2 and using the free workflow automation of Mojo and putting those releases through their paces. It also contains a lot of UI and documentation improvements.

Existing users of the 7.2 version of either Mojo or Meister will be able to upgrade via the update sites, http://www.openmakesoftware.com/mojo/update_site and http://www.openmakesoftware.com/meister/update_site, respectively.

Users interested in getting Mojo, the free workflow automation tool, and Meister, the industry leading build automation tool, can find download instructions on our website, http://www.openmakesoftware.com.

With the recent financial meltdown, I couldn’t help but notice a trend among my clients. I’ve worked with over one hundred companies in one capacity or another, which has given me insight into how they develop software.

Among these companies, there were two particularly frustrating companies where I was on site and a third that I assisted with a very difficult proof-of-concept. I actually compiled software applications for each of these companies. Each of them failed to implement the enterprise software process and automation improvements I was helping them with, and none of the three exists today - victims of risky investment practices and high-profile failures in the 2008 financial crisis.

To be sure, I’ve had many frustrations at many other companies because implementing centralized software development management practices is extremely difficult, involving almost every department in IT. But the other companies ultimately gained the consensus, management backing and financial support to implement real change to lower the risk of software delivery and improve business continuity.

The three software management failures that ultimately turned business failures were particularly sore spots for me. And, as a trained physicist, when I see three software management failures and three business failures and they are the SAME three out of a hundred, well, I know there’s a very high probability for a relationship.
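For the curious, here is the back-of-the-envelope version in Python. It rests on simplifying assumptions of my own: exactly three of the hundred companies failed at software management, exactly three failed as businesses, and if the two were unrelated, every possible group of three would be equally likely. It is a sanity check on the coincidence, not a proper statistical study.

```python
from math import comb


def chance_same_group(population, k):
    """Probability that two independently chosen groups of k out of population are identical."""
    return 1 / comb(population, k)


if __name__ == "__main__":
    groups = comb(100, 3)  # number of distinct groups of three among 100 companies
    print(f"{groups:,} possible groups of three")
    print(f"chance of an exact match by luck: {chance_same_group(100, 3):.1e}")
```

There are 161,700 ways to pick three companies out of a hundred, so under those assumptions the odds of the two lists matching by pure chance are 1 in 161,700.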

An anecdote: one of the three, a super large bank, liked to grow by acquiring other banks. Word on the street is that the OCC stepped in and said you have to improve your software management practices before you can acquire more banks. So, the bank implemented a software management improvement program including first centralized version control and later more proper configuration management (always surprising to me that they can deliver binaries and manage versions but not know if the two are related in any way). I was brought in for the centralized build management part. Then the OCC said something like “We see you’ve implemented some version control. OK, you can go ahead and buy more banks.” POOF! The software improvement projects were all massively scaled back and there was no more enterprise build management to work on. How about that?

When a company implements software development and delivery improvements they are lowering the risk of proprietary software changes which in turn lowers business risk by decreasing interruptions of services and ensuring on-time delivery of new features. At this stage of industry maturity, a company that does not have control over software delivery is accepting a business risk that fewer and fewer competitors accept.

So it’s reasonable to believe that a company that takes large financial risks will take risks across the board - even with their software management practices.

    File Control Madness in Eclipse

    I found myself actually using four different file control tool plug-ins in a single Eclipse 3.4 workspace. This is not to show off, but for legitimate needs. Before proceeding, let me disclaim that I am reorganizing my Perl development on a new machine and I have everything somewhat haphazardly in a single workspace. Ideally I will have different workspaces for different projects, but until I build a standard set of preferences, particularly for EPIC Perl templates, that I can export and import into different workspaces, I’m locked into a single workspace for now.

    Image

    If you are not familiar with Eclipse and version control (or as I call it generically “file control”) you have to install plug-ins that provide the functionality to interface with different tools. I have an EPIC plug-in that provides Perl tools, and I’ve installed EGIT for Git integration and plug-ins for Subversion and Bazaar. The CVS plug-in actually comes as part of the base Eclipse install, though that status is questionable given the popularity of Subversion and the rapid rise of Git.

    These plug-ins provide the capability to create a new project from the contents of the file control repository, or attach an existing Eclipse project to a new project under file control. You do this by right-clicking on the project and going to the “Team” menu and the “Share” item.

    Here is a quick explanation of the screen shot above. “om64Perl” comes out of our OpenMake CVS repository. The ones attached to Git are pretty obvious, with the word “Git” clearly to the right of the project name. Because Git is a distributed repository tool, the Git repository that the projects are attached to is actually in the workspace. Then, I have an anemic open source project on SourceForge to which the “PerlSCM” project is attached via Subversion. And, finally, there is the Perl VCI project “vci” that uses Bazaar.

    There you go. Because I’m involved with three open source projects that use different file control tools, and regular work that uses another, I end up with four.

    People as Glue

    Perhaps you’ve heard of “glue” scripting, which is scripting designed to pull together various tools and processes into an integrated process automation. For example, suppose you have both ClearQuest from IBM and CA Harvest. Neither IBM nor CA has a real stake in integrating with the competitor’s tool, but you do. So what do you do? You create some nifty Perl scripts (because there are no other real scripting languages) to associate Harvest package promotion with ClearQuest record changes.

    Well, we in OM Services have become sort of a people form of the same thing. We’ve had multiple engagements with the same companies, providing a consistent level of expertise and proprietary knowledge of each company’s software processes and automation technology.

    Sometimes we even smooth out transitions from staff turnover and that’s where I think we act as “glue” in time. OK, so I have a physics background and can’t distinguish between time and space, but providing connectivity between two points is sort of what we do.

    The goal of an SCM team charged with build management is to NOT have proprietary knowledge outside the team, but, well, that can be expensive and sometimes impossible even when the funds are there. So, while we strive to meet the needs of our customers through services, we think the tool should provide that bridge, keeping the proprietary knowledge in house, and we in services are constantly feeding back our input into product development to achieve that goal.

    Foray into Bazaar

    I’m going to contribute my CA Harvest knowledge to the Perl VCI module. Max Kanat-Alexander, who runs that project, uses the Bazaar code control tool for it. So far I haven’t used that one, but I’m all up for it.

    I was wondering how many code control tools I’ve used. Here is a list and a tally:

    SCCS, RCS, PVCS/Version Manager, Endevor Workstation (RIP), Endevor for UNIX (RIP), Endevor mainframe, CVS, Subversion, CA Harvest, MKS Source Integrity, Perforce, Git, Microsoft Visual Source Safe, ClearCase, StarTeam, Serena ChangeMan for Distributed Platforms (RIP), Serena Dimensions. Total 17 - only 17?

    There are a couple more tools that I saw or downloaded but did not actually use, like Microsoft’s Team Foundation Server, IBM’s CMVC (nearly RIP) and Aldon’s Lifecycle Manager for AS/400.

    Git a Popular Topic at BarCamp Milwaukee

    I led a session at BarCamp Milwaukee this weekend on the Git code control tool. I prepared a look-at-my-laptop presentation for the 4 people who signed up by Friday. At the appointed time about 30 people (about 1/4th of the conference attendees) showed up to a room with no projector. Now, that’s the kind of thing to keep you on your toes!

    Several of the developers knew the tool better than I did and so I became the discussion leader. We talked about the basics, distributed development, branching, the Eclipse plug-in and suitability for the enterprise (the verdict was “yes, it is”).

    In general, a lot of the attendees believe Git is superior to CVS, Subversion and even ClearCase. Git has advantages in checkout speed and branch support, and is better for supporting builds. It is fundamentally different in that it supports a distributed development model. But it is similar to CVS and Subversion in that it is basically a command-line tool with little GUI support (compared with tools like Perforce, StarTeam and AccuRev), and it lacks the enterprise integration and reporting capabilities of high-end SCM tools like Team Foundation Server, Serena Dimensions, IBM Jazz and CA Harvest.

    There was also forklift driving and a build-and-take-home your own robot sessions there in addition to functional programming and PostgreSQL.

    Building WebSphere EJB Client JAR Projects

    How do you set up an automated build for EJB client JARs from the IBM Rational Application Developer (RAD) 7 development environment for WebSphere 6?

    This question came up recently in my work for a major insurance company. When one extends the EJB client class, that is all a developer has to do as far as RAD 7 is concerned. When the developer deploys the JAR to the server, RAD 7 quietly generates stub source Java classes, compiles them and includes them in the JAR file.

    An automated build in this context means that all the code the developer created in RAD 7 and checked into version control is checked out of version control without RAD 7 and built exactly the way the developer intended. This is what OpenMake Meister is for.

    One developer I was working with was concerned about how to generate those same source files in the automated build, which in his case used OpenMake. He was familiar with how OpenMake uses the ejbdeploy command to build EARs with EJB server-side code and expected some equivalent for the EJB client.

    Mercifully, RAD 7 actually leaves the generated source files behind in the Eclipse project, in the standard source location. This means that we get the source code for free and there is really no need to regenerate it. All one has to do is check in the generated source to version control along with the developer-coded source and build a normal JAR file in the automated build.

    For the developer, this means:

    1. Check in all the Java source code in the project. This is easy, and better than picking and choosing which Java source to check in, as the developer thought he had to do. Picking and choosing would carry an ultra-high risk of making a mistake and breaking the team’s automated build.
    2. Assign the “Default Java JAR” configuration mapping for the EJB client project using the OpenMake Target Generator Eclipse plug-in.

    A lesson to learn from this is that not all technologies or technology variants will have an impact on the build process. The developer was considering an idealistic approach of reproducing every minute step of RAD 7, but the best solution was something practical and simple. Build management is part art and part dirty science. Having a “generate” step for the EJB client Java classes in the automated build only introduces an additional point of possible failure, and we build-meisters know we don’t need any more of those!

    With the Web 2.0 evolution, information flow between people has changed from a ‘push’ paradigm (I send you an email) to a ‘pull’ paradigm (I follow you on Twitter). How could this possibly relate to code management such as branching, merging and history? Well, Git’s distributed repository model, and the way one obtains code updates from “friend” repositories, is similar to Twitter and the way you obtain status updates on the people you choose to follow. Instead of communicating micro-blog entries or status updates, Git is communicating source code branch updates.

    Just as Facebook or Twitter lets you specify a person’s name in lieu of the communication protocol identifier (email address or web page), Git uses aliases for long repository locations, so what you are doing has a more direct, natural-language, human feel: “git fetch linus” will fetch changes from Linus’ repository, whose location you only have to define once.

    Here is a scenario where Steve and I are working on a part of the Linux file system to provide information useful for build management and dependency tracking, which Meister and other tools can take advantage of. Steve started by cloning the master Linux repository and started working away making changes. Steve asked me to work on another part of this project, so I cloned his repository, allowing me to pick up all his changes. I am now automatically following (Git calls it remote-tracking) Steve’s “master” branch of his repository since I started my repository by cloning his. The “master” branch is a.k.a. the “trunk” code stream. I can pick up his updates periodically with:

    $ git pull

    Now, I may also want to get updates directly from the master Linux repository, but it has a complicated URL that I won’t remember and only want to look up once. So, as a one-time command I do:

    $ git remote add linux-nfs git://linux-nfs.org/pub/nfs-2.6.git

    Forever after:

    $ git fetch linux-nfs
    * refs/remotes/linux-nfs/master: storing branch 'master' …
    commit: bf81b46

    The “fetch” command doesn’t put the master Linux changes directly into my workspace, but off to the side for me to examine first (very nice). If I want, I can accept the changes into my local work tree. To tell me which repositories I am following (which friends), I do:

    $ git branch -r
    linux-nfs/master
    steve/master
    origin/master

    “origin/master” is the remote-tracking branch for the trunk of the repository I originally cloned. I could also get the full repository information associated with the short names, but as long as it works, I don’t want to know what it is. For me, this type of friendly and fluid interaction with repositories is one of the major advantages over CVS and Subversion.
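
    The alias and fetch behaviour described above can be pictured with a toy model. This is only a sketch in Python, not how Git is implemented: the ToyRepo class, its method names, the local path and the local commit id are all invented for illustration, and "merge" is simplified to a fast-forward. It shows two things: a short name is registered once, and a fetch only updates alias/branch references off to the side, leaving my own branch alone until I accept the changes.

```python
"""Toy model of Git remotes: short aliases plus fetch-without-merge."""


class ToyRepo:
    def __init__(self, url, branches=None):
        self.url = url
        # Local branches: branch name -> commit id.
        self.branches = dict(branches or {})
        # Alias -> repository, the analogue of "git remote add".
        self.remotes = {}
        # Remote-tracking refs such as "linux-nfs/master" -> commit id.
        self.remote_refs = {}

    def remote_add(self, alias, repo):
        """Register a short name once, so the long URL is never typed again."""
        self.remotes[alias] = repo

    def fetch(self, alias):
        """Copy the remote's branch heads into alias/branch refs only."""
        for name, commit in self.remotes[alias].branches.items():
            self.remote_refs[f"{alias}/{name}"] = commit

    def merge(self, tracking_ref, into="master"):
        """Accept fetched work; simplified here to a fast-forward."""
        self.branches[into] = self.remote_refs[tracking_ref]

    def remote_branches(self):
        """Rough analogue of "git branch -r"."""
        return sorted(self.remote_refs)


linux = ToyRepo("git://linux-nfs.org/pub/nfs-2.6.git", {"master": "bf81b46"})
# Hypothetical local repository path and commit id.
mine = ToyRepo("/home/me/work", {"master": "a1b2c3d"})
mine.remote_add("linux-nfs", linux)
mine.fetch("linux-nfs")
print(mine.remote_branches())   # remote-tracking refs now known locally
print(mine.branches["master"])  # unchanged until merge() is called
```

    Calling mine.merge("linux-nfs/master") afterwards is the step where the fetched changes are actually accepted into the local branch.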

    Here Comes Git for Code Change Management

    If you are a hard-core open source programmer, you probably use Git for project code change management instead of Subversion (I chose those words carefully). There is a lot of passion from Git advocates and, while it is not a very mature solution, it has a lot of momentum to push it forward. Merely being conceived of and written by Linus Torvalds and being used on a few large open source projects, such as the very Linux kernel itself, is enough to garner wide support.

    A great place to learn about Git is Sam Vilain’s Tutorial. He goes into a lot of detail on the benefits and how-to’s of using Git. Some of the highlights include repository space savings of over 90% and local-to-repository sync times dropping from hours in Subversion to minutes with Git. The real power of Git is in the highly distributed repositories and the ease and control of moving and accepting changes between repositories. For an open source project with a large number of developers it seems Git will really shine. Git has fine control over branching, merging and accepting or not accepting project changes according to various criteria.

    A popular way to use Git is to pull from a public Subversion or CVS repository into a local Git repository, using Git’s convenient integration with those tools, and work from there. Friends working on the same project can easily pass changes between each other with Git and later commit back to the centralized CVS or Subversion repository. GitHub provides a simple Git repository hosting service. Doing a lot of Java work with JBoss and WebSphere, I am naturally interested in an Eclipse plug-in for Git, and indeed one exists. It looks like a newborn infant, but I will check it out.

    I also have a Perl open source project that is currently pretty anemic, but I hope to revitalize it soon. I really hate the fact that I’m locked into using Subversion on SourceForge and I never came to like Subversion. I’m eager to explore moving the project to GitHub, even though I’ll probably be the only committer for a while. Since I’m a hardcore software management person and robust Perl developer, I think Git might be my tool. I’ll let you know.

    YAJBT – Yet another Java Build Tool

    Here comes buildr: yet another Java build tool. Hopefully I, or one of my cohorts, will check this out in detail soon. But, with my experience working with all manner of build tools, with 100 companies and many more development teams, I can already make a few observations.

    First of all, why another build tool for Java? I am occasionally told that Maven or Ant is a perfect tool, but clearly the people behind buildr don’t think so. The choice of JRuby as the vehicle for delivering this tool is, I think, probably a good one. JRuby is a scripting language in the same vein as Perl, which is used by Meister.

    Doing software builds is an ugly business involving lots of file and operating system interaction. This is not where Java shines, but scripting languages can. As long as operating systems are written in C and not Java, C-like tools will be better and faster at interacting with them. Plain Ruby itself is C-based, and JRuby no doubt inherits C-like operating traits. Calling out to a Java compiler from Perl or JRuby, though it has its own JVM, does not represent a significant overhead compared with the file system operations and the compilation/translation itself.

    Both Maven and Ant are relatively difficult to extend compared with most other build tools, and I’ll bet buildr beats them here. If you have all the Maven plug-ins and Ant tasks you need, then good for you. If not, then you have to start developing in Java and it becomes too much of an investment to sink into a build system. It is much cheaper to extend in JRuby or Perl. My frequently cited example is the XMLBeans compile step in Meister, written in Perl, which is only 40 lines of real code. The Maven plug-in is 60 pages of Java code and no one can tell me really what it is doing (I asked on all the forums). Less code is usually more transparent, which is also good for build audits.

    I am a little disappointed to see them try to placate the Maven and Ant users by promising it is a drop-in replacement for Maven and they have all the Ant tasks covered. Both tools have their drawbacks and I don’t want to see another tool with the same deficiencies. They should have the cajones (or coñejos) to apply all their resources to what they think is a better tool (with its own unique benefits and deficiencies). I imagine offering Ant task equivalents is pretty easy because of the ease of coding in JRuby compared with Java.

    They also don’t mention who is supposed to use the tool. Is it for individuals, small development teams, the enterprise? Maven falls short because it is only appropriate for development teams and not for stable, controlled, enterprise builds. Ant is not even a tool, but a means to create some tools for small teams. I don’t think Meister will fear buildr either.

    Well, since buildr is only in incubation status with Apache, I’m not sure how much time I’ll be able to spend on it, but I am curious and I’ll let you know if I find out more.

    JBoss checks for certain watch files when deploying or undeploying an application. The watch files are key files germane to the object you are deploying. For an EAR, the watch files are application.xml and the optional jboss-app.xml. For a web application archive, the watch files are web.xml and jboss-web.xml. For single-file XML resources, such as datasources, the watch file is the XML file itself. In this article, I am dealing with archives that are deployed in extracted (unzipped) form.

    The first check is made for the existence or non-existence of a watch file. If a previously unknown watch file is found, the appropriate deployer is started and the file modification timestamp is stored in memory. If a known watch file is found to be missing, the appropriate undeployer is launched.

    If a known watch file is found on a subsequent pass of checking watch files, its timestamp is checked against the time that was stored in memory by the deploy process. If the deployed watch file is newer, the appropriate deployer is launched which apparently first dumps the associated resources and then reloads the object as if it were newly found.
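
    The checks described above can be sketched in a few lines of Python. This is a simplified model of the behaviour as I observed it, not JBoss source code: the WatchScanner class, its event names and the demo function are invented for illustration, and a real deployer does far more than record an event.

```python
"""Sketch of a timestamp-based watch-file scanner (not JBoss source code)."""
import os
import tempfile


class WatchScanner:
    def __init__(self):
        # Watch file path -> modification time recorded at (re)deploy.
        self.known = {}
        # Log of (event, path) tuples standing in for real deployer calls.
        self.events = []

    def scan(self, watch_files):
        """Run one pass over the candidate watch files."""
        # A known watch file that has disappeared triggers an undeploy.
        for path in list(self.known):
            if not os.path.exists(path):
                del self.known[path]
                self.events.append(("undeploy", path))
        for path in watch_files:
            if not os.path.exists(path):
                continue
            mtime = os.path.getmtime(path)
            if path not in self.known:
                # Previously unknown watch file: deploy, remember its time.
                self.known[path] = mtime
                self.events.append(("deploy", path))
            elif mtime > self.known[path]:
                # Newer than what was deployed: dump and reload.
                self.known[path] = mtime
                self.events.append(("redeploy", path))
            # Otherwise nothing happens, whatever else changed on disk.


def demo():
    """Deploy, leave untouched, bump the timestamp, then delete."""
    scanner = WatchScanner()
    with tempfile.TemporaryDirectory() as root:
        watch = os.path.join(root, "application.xml")
        with open(watch, "w") as handle:
            handle.write("<application/>")
        scanner.scan([watch])                 # new file: deploy
        scanner.scan([watch])                 # unchanged: no event
        newer = os.path.getmtime(watch) + 10  # simulate a later edit
        os.utime(watch, (newer, newer))
        scanner.scan([watch])                 # newer: redeploy
        os.remove(watch)
        scanner.scan([watch])                 # missing: undeploy
    return [kind for kind, _ in scanner.events]


print(demo())
```

    Note that the scanner only ever looks at the watch file between passes; anything that happens to the rest of the archive is invisible to it.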

    This leaves a hole that can lead to the horrifying result of having files deployed to the server, but not having the changes reflected in the running application.

    The issue has to do with completely replacing a running application with a new version. You might first delete the application completely from the runtime area leaving the server to undeploy it. Then you replace the object with a new version of itself. The window of time between checks of the watch files is finite and I’ve found it is possible to remove and replace the archive within that window so that the JBoss server does not detect that the watch file was missing and so it is not unloaded from memory. The server does check the watch file timestamps, but if you have changed files other than the watch files and have not updated the timestamps of the watch files themselves, the server will happily ignore the new version of the archive while running the old one.

    If you use this deployment strategy, then this issue is essentially a random process, and a deployment failure for this reason happened in our case on only a few percent of all deployments. When you are running a few hundred deployments a week, or when it happens on a production deployment, it becomes a big problem, especially when people don’t know what the problem is. A simple resolution is to always update the timestamps of the watch files when changing anything for a deployed application. This will take care of everything but possibly compiled JSPs. (Possibly more on that later.)

    This also points to a “restart” mechanism for JBoss – simply ‘touch’ the watch files of a running application to change their timestamps to the current time. This will trigger the dump-and-reload on the next watch file check. This can be useful when the application has not changed, but an associated XML resource has.
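
    As a minimal sketch of that ‘touch’ step, assuming a deploy script written in Python, something like the following would do. The function name and the example path in the note below are hypothetical; the only real calls are os.utime, os.path.isfile and time.time from the standard library.

```python
"""Sketch: 'touch' watch files so the next scan sees them as newer."""
import os
import time


def touch_watch_files(paths, now=None):
    """Set access and modification times of each existing file to `now`.

    Returns the paths that were actually touched. Missing files are skipped
    rather than created, since creating a descriptor would look like a new
    deployment instead of a restart.
    """
    stamp = time.time() if now is None else now
    touched = []
    for path in paths:
        if os.path.isfile(path):
            os.utime(path, (stamp, stamp))
            touched.append(path)
    return touched
```

    A deploy script could call it as the last step after copying files, for example touch_watch_files(["deploy/myapp.ear/META-INF/application.xml"]), where the path is made up for this example.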

    When you work with a locking-type version control tool like CA Harvest, your Meister build project will appear in your Eclipse workspace as read-only when you check out an existing workspace. I’ve been using Eclipse for WebSphere development (WebSphere Studio Application Developer) and for JBoss via MyEclipse IDE. If you want to regenerate your Java targets, you first have to check out the Meister build project so that the files are writable.

    Since this can lock the targets exclusively and prevent others from updating the target, you may not want to check out the build project, but you may still want to develop freely and update your local targets for Meister to build it. For this situation I recommend creating a separate build project that you may never check in to version control. It will be writable and it allows you great freedom for a maximally agile development environment. The ‘official’ build project may reference all the built archives in the workspace, but having your own local build project can allow you to focus for a unit build. For example, my workspace may contain an EAR project, a WAR project and one or more JAR projects. If I am principally working only on one of the JAR projects, my local build project can reference only that one JAR project.

    When it’s time to release your JAR code updates to the system build and test environments, synchronize your workspace and check out the VC build project. Generate your targets, do a local system build and then check everything in. Your team system build will work fine!

    I wanted to share a specific benefit I enjoyed while using Meister for Java development. As part of my role to help develop an automated JBoss build and deploy system, I ended up taking on a developer role for a web services security project for both JBoss and WebSphere. While the project involved about 1000 lines of Perl, it also got me writing simple web services and consumers for JBoss and WebSphere and building them using Meister and its Eclipse plug-in.

    Believe it or not, I am still using WebSphere Studio Application Developer 5.1. While my specific tale involves that IDE, it is equally applicable to MyEclipse and the Rational Application Developer set of Eclipse IDEs. In my environment, CA Harvest is the version control/SCM tool and Meister is the build tool. After code is checked in from my desktop using the CA Harvest Eclipse plug-in, the code is replicated out to a Linux server, where Meister performs the official system build that is sanctioned for deployment to the application server. There is also a Meister Eclipse plug-in that scans the WSAD workspace for build targets and dependencies. Meister stores this information in one XML file per build target, and those files are also checked in to CA Harvest right alongside the source code.

    Working intensely within the WSAD Eclipse environment as the project manager cracked the whip, I worked with a consumer application and updated it according to the changes in the service WSDL and service endpoint URLs. One thing I learned is that if one of the parameters for the consumer is tweaked, don’t bother tweaking the XML or generated code; just regenerate the whole client. WSAD will even check out the files first if they need to be. So everything looked good on my desktop with the service and consumer deployed to two separate WebSphere servers on ports 9080 and 9081. Now to get it into the enterprise ‘dev’ environment…

    Using the ‘Generate Target Definitions’ feature of the Meister plug-in I updated the Meister build target XML definition files and checked in all my code. I then promoted the code in CA Harvest which automatically kicked off a ‘dev’ build in the Linux environment. I got an error back from Meister saying ‘jdmpview.jar’ doesn’t exist.

    Since I knew my consumer app and its elementary nature, I knew that jdmpview.jar wasn’t one of my JARs and that it must be one of WebSphere’s. Given that 200 other Java apps use the same build environment with the same standards, I probably hadn’t used some new feature of WebSphere that no one else was using. Therefore, it must be a problem on my local desktop with the version of the JVM I was using.

    Sure enough, the consumer app was using the base_v51 WebSphere runtime instead of the ee_v51. (I did inherit the initial version of the app from someone else!) And, oddly enough, there is an extra JAR in the base that is missing in the more fully featured Enterprise Edition. Meister correctly forced the runtime environment to be EE for the Linux build, overriding the developer selection. I switched the runtime in the Java build path properties, regenerated the Meister target definitions, checked them in and promoted them to a successful ‘dev’ build. Regenerating the target definitions had the effect of switching out the list of JAR files in the library path from the base_v51 set to the ee_v51 set. The whole thing, including one bad and one good build, took about 4 minutes.

    The great benefit for me was the balance between developer and SCM functions. We could have applied more controls at the desktop level, but from my perspective, I prefer an Agile environment with more freedom even if it means occasionally hanging myself with my own rope. In this scenario I let the tools dot the I’s and cross the T’s and it took no more time than say, waiting for Outlook over VPN.

    In developing Java applications for multiple server environments (e.g. dev, test and prod) there is a common pain-point of having to manage deployment descriptor or configuration files specific to each server. For example, you may have an XML log4j configuration file with some parameters different for different server environments. You may want to turn on debug messaging for the development server, but turn it off for production. At the same time, the Java source code will (eventually) be the same in production as it was in development. A similar situation applies for .NET application development.
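
    To make the problem concrete, here is a small Python sketch of one working copy of a configuration file being rendered for three environments. It is only an illustration under stated assumptions, not a recommended tool: the XML is a simplified, hypothetical logging configuration rather than a real log4j file, and the parameter table, paths and values are all invented.

```python
"""Sketch: derive per-environment config files from one working copy."""
import xml.etree.ElementTree as ET

# Hypothetical, simplified logging config; real log4j files carry more detail.
WORKING_COPY = """<configuration>
  <root><level value="debug"/></root>
  <appender name="file"><param name="File" value="/tmp/app.log"/></appender>
</configuration>"""

# One row per parameter: an ElementTree path, the attribute to set, and the
# value for each environment. All values here are invented for illustration.
PARAMETERS = [
    ("./root/level", "value",
     {"dev": "debug", "test": "info", "prod": "warn"}),
    ("./appender[@name='file']/param[@name='File']", "value",
     {"dev": "/tmp/app.log",
      "test": "/var/log/app/test.log",
      "prod": "/var/log/app/prod.log"}),
]


def render(xml_text, parameters, environment):
    """Return the XML with every parameter set for one environment."""
    tree = ET.fromstring(xml_text)
    for path, attribute, values in parameters:
        node = tree.find(path)
        if node is None:
            raise KeyError(f"no element matches {path}")
        node.set(attribute, values[environment])
    return ET.tostring(tree, encoding="unicode")


for env in ("dev", "test", "prod"):
    rendered = ET.fromstring(render(WORKING_COPY, PARAMETERS, env))
    print(env, rendered.find("./root/level").get("value"))
```

    The point of keeping the values in a table separate from the file is that the table, not the XML, becomes the thing different teams have to agree on.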

    Like many build management tasks, managing these environment-specific files is generally left to either manual effort or some type of scripting. This is really something that needs a high level of automation. Particularly in larger environments, existing tactics fall short, much as scripted build management solutions do. This situation is in a far worse state than even the compile part of build management. It is not enough to simply have a script that can spit out some files. One of the biggest problems is information management and the fact that parameter values in the configuration files may be determined by different teams! How do a production engineering team and an application developer both feed inputs into the same XML file?

    I’ve worked on this problem for several years and with a number of companies. The critical functionality can be broken down into two different items: information management and a processing engine. In an effort to come up with something better, I’ve done a review of what’s out there and here is what I found:

    • Ant ‘filter’ task: As with many Ant tasks, this works great if you are an individual with a few items that need updating. It is a nightmare if you are working in a multi-team enterprise with multiple server environments. The main problem is that you have to constantly take working copies of XML files and insert a token for Ant to later re-replace. This leads to a management nightmare to synchronize parameterized copies of XML files with their working copies from the desktop environment. The advantage is that it works for any file type so you can use it for properties files as well as XML files.
    • OOPS Consultancy Ant ‘xmltask’: This is a good engine for specifying and performing changes to the XML and has a full feature set. In fact, we use this in some of the Meister build services. The problem is that it is only for Ant and therefore you have all the reuse, standardization and hard coding issues. Xmltask can provide part of the solution we are looking for, but we still have an information management problem to deal with.
    • Maven: Maven has what is essentially the Ant filter task. The specifications are abstracted in the pom files, which is better than Ant, but it encourages templating of configuration files, leading to all the problems associated with that (synchronizing templates with working files, testing templates, etc.).
    • XML:DB XUpdate: This is a working draft of a specification to encode XML update instructions into an XML document. There is a Java implementation of XUpdate listed on the site called ‘Lexus’, but I couldn’t find anything on it. Since the build management task requires us to generate XML files, I’m not keen on generating XML files using xupdate tags that will allow me to generate other XML files.
    • Perl XML::Twig: This has worked wonderfully for me on a back-end web services security effort and I could not be happier with such a precise, elegant and brief XML library, which includes XPath. This is not a solution for Java or .NET developers, but it could serve as an engine to mimic xmltask or implement the XUpdate specification.
    • Excel: Yes, I’ve seen Excel used effectively as the information management front-end for updating the XML. It is a convenient format to share among teams, it is a centralized source of information, it can be checked into version control and it can be saved as an XML file itself for processing by another engine. In a large environment, you may have 5 or more server environments and lots of different components to configure, so you could have literally hundreds of parameters to manage. Excel gives you a nicely transparent way to view those values.