News from the Field
14 Aug
How do you set up an automated build for EJB client JAR’s from the IBM Rational Software Delivery 7 development environment for WebSphere 6?
This question came up recently in my work for a major insurance company. When one extends the EJB client class, that is all a developer has to do as far as RAD 7 is concerned. When the developer deploys the JAR to the server, RAD 7 quietly generates stub source Java classes, compiles them and includes them in the JAR file.
An automated build in this context means that all the code the developer created in RAD 7 and checked into version control is checked out of version control without RAD 7 and built exactly the way the developer intended. This is what OpenMake Meister is for.
One developer I was working with was concerned with how to generate those same source files in the automated build, which in his case was using OpenMake. He was familiar with how OpenMake uses the ejbdeploy command for building EAR’s with EJB server-side code and expected some equivalent for the EJB client.
Mercifully RAD 7 actually leaves the generated source files behind in the Eclipse project, in the standard source location. This means that we get the source code for free and there is really no need to regenerate it. All one has to do is check in the generated source to version control along with the developer coded source and build a normal JAR file in the automated build.
For the developer, this means:
A lesson to learn from this is that not all technologies or technology variants will have an impact on the build process. The developer was considering an idealistic approach of reproducing every minute step of RAD 7, but the best solution was something practical and simple. Build management is part art and part dirty science. Having a “generate” step for the EJB client Java classes in the automated build only introduces an additional point of possible failure, and we build-meisters know we don’t need any more of those!
3 Jul
With the Web 2.0 evolution, information flow between people has changed from a ‘push’ paradigm (I send you an email) to a pull paradigm (I follow you on Twitter). How could this possibly relate to code management such as branching, merging and history? Well, Git’s distributed repository model and how one obtains code updates from “friend” repositories is similar to Twitter and how you obtain status updates on the people you choose to follow. Instead of communicating micro-blog entries or status updates, Git is communicating source code branch updates.
Also like how Facebook or Twitter allows you to specify a person’s name in lieu of the communication protocol identifier (email address or web page), Git uses aliases for long repository locations so you have a more direct, natural language and human feel to what you are doing: “git fetch linus” will pull changes from Linus’ repository, which you have only had to define once.
Here is a scenario where Steve and I are working on a part of the Linux file system to provide information useful for build management and dependency tracking, which Meister and other tools can take advantage of. Steve started by cloning the master Linux repository and started working away making changes. Steve asked me to work on another part of this project, so I cloned his repository, allowing me to pick up all his changes. I am now automatically following (Git calls it remote-tracking) Steve’s “master” branch of his repository since I started my repository by cloning his. The “master” branch is a.k.a. the “trunk” code stream. I can pick up his updates periodically with:
Now, I may also want to get updates directly from the master Linux repository, but it has a complicated URL that I won’t remember and only want to look up once. So, as a one-time command I do:
Forever after:
The “fetch” command doesn’t put the master Linux changes directly into my workspace, but off to the side for me to examine first (very nice). If I want, I can accept the changes into my local work tree. To tell me which repositories I am following (which friends), I do:
“origin/master” is my own trunk. I could also get the full repository information associated with the short names, but as long as it works, I don’t want to know what it is. For me, this type of friendly and fluid interaction with repositories is one of the major advantages over CVS and Subversion.
26 May
Here comes buildr: yet another Java build tool. Hopefully I, or one of my other cohorts will check this out in detail soon. But, with my experience working with all manner of build tools, with 100 companies and many more development teams, I can already make a few observations.
First of all, why another build tool for Java? I am occasionally told that Maven or Ant is a perfect tool, but clearly the people behind buildr don’t think so. The choice of JRuby as the vehicle for delivering this tool is, I think, probably a good one. JRuby is a scripting language in the same vein as Perl, which is used by Meister.
Doing software builds is an ugly business involving lots of file and operating system interaction. This is not where Java shines, but scripting languages can. As long as operating systems are written in C and not Java, C-like tools will be better and faster at interacting with them. Plain Ruby itself is C-based, and JRuby no doubt inherits C-like operating traits. Calling out to a Java compiler from Perl or JRuby, though it has its own JVM, does not represent a significant overhead compared with the file system operations and the compilation/translation itself.
Both Maven and Ant are relatively difficult to extend compared with most other build tools, and I’ll bet buildr beats them here. If you have all the Maven plug-ins and Ant tasks you need, then good for you. If not, then you have to start developing in Java and it becomes too much of an investment to sink into a build system. It is much cheaper to extend in JRuby or Perl. My frequently cited example is the XMLBeans compile step in Meister, written in Perl, which is only 40 lines of real code. The Maven plug-in is 60 pages of Java code and no one can tell me really what it is doing (I asked on all the forums). Less code is usually more transparent, which is also good for build audits.
I am a little disappointed to see them try to placate the Maven and Ant users by promising it is a drop-in replacement for Maven and they have all the Ant tasks covered. Both tools have their drawbacks and I don’t want to see another tool with the same deficiencies. They should have the cajones (or coñejos) to apply all their resources to what they think is a better tool (with its own unique benefits and deficiencies). I imagine offering Ant task equivalents is pretty easy because of the ease of coding in JRuby compared with Java.
They also don’t mention who is supposed to use the tool. Is it for individuals, small development teams, the enterprise? Maven falls short because it is only appropriate for development teams and not for stable, controlled, enterprise builds. Ant is not even a tool, but a means to create some tools for small teams. I don’t think Meister will fear buildr either.
Well, since buildr is only in incubation status with Apache, I’m not sure how much time I’ll be able to spend on it, but I am curious and I’ll let you know if I find out more.
9 May
First, let me say how nice it is to have the Mojo workflow engine that allows us to manage the compliance checks, deploy to multiple machines in parallel and validate deployment. This makes our lives a lot easier and provides clear benefits for deployment via the parallelization, dependency management, scalability, logging and reporting. Underneath the covers, and for those of you who don’t have the luxury to use this almost-free product, there are some important low-level tools that are critical to the development, testing and operation of the Mojo JBoss deployment system on Linux.
With the most important listed first, they are:
JBoss support wins hands down due to the number of bugs and critically important undocumented features. On a scale of 1 to 10 where 10 is the best documentation, I give JBoss about a 3 or 4. Googling doesn’t even help that much for deployment issues.
You may be surprised at the prominence of Perl, but if you think about what you are really doing and what the best tool for the job is, it makes sense. You are really moving an archive (a ZIP format file), copying XML files, creating directories, changing permissions, extracting the archive to the file system perhaps. Where did I mention Java? Nowhere. The twiddle.sh command comes in handy if you get the secret commands from JBoss support that tell you if the application you deployed has actually started correctly. Notice that this is a shell script suggesting we’re not the first to use non-Java tools to manage deployment.
Particularly on the testing side, I can’t think of a viable alternative to Perl testing. We need to test that we created this directory, changed that permission, updated that file timestamp, etc. We have about 300 test cases encoded in Perl that are run with every change to the deployment system. It takes about 20 seconds to write and run a simple test case in Perl.
Lessons? Use JBoss support early and often and use Perl.
26 Mar
When you work with a locking-type version control tool like CA Harvest, your Meister build project will appear in your Eclipse workspace as read-only when you check out an existing workspace. I’ve been using Eclipse for WebSphere development (WebSphere Studio Application Developer) and for JBoss via MyEclipse IDE. If you want to regenerate your Java targets, you first have to check out the Meister build project so that the files are writable.
Since this can lock the targets exclusively and prevent others from updating the target, you may not want to check out the build project, but you may still want to develop freely and update your local targets for Meister to build it. For this situation I recommend creating a separate build project that you may never check in to version control. It will be writable and it allows you great freedom for a maximally agile development environment. The ‘official’ build project may reference all the built archives in the workspace, but having your own local build project can allow you to focus for a unit build. For example, my workspace may contain an EAR project, a WAR project and one or more JAR projects. If I am principally working only on one of the JAR projects, my local build project can reference only that one JAR project.
When it’s time to release your JAR code updates to the system build and test environments, synchronize your workspace and check out the VC build project. Generate your targets, do a local system build and then check everything in. Your team system build will work fine!
23 Mar
I wanted to share a specific benefit I enjoyed while using Meister for Java development. As part of my role to help develop an automated JBoss build and deploy system, I ended up taking on a developer role for a web services security project for both JBoss and WebSphere. While the project involved about 1000 lines of Perl, it also got me writing simple web services and consumers for JBoss and WebSphere and building them using Meister and its Eclipse plug-in.
Believe it or not, I am still using WebSphere Studio Application Developer 5.1. While my specific tale involves that IDE, it is equally applicable to MyEclipse and the Rational Application Developer set of Eclipse IDE’s. In my environment, CA Harvest is the version control/SCM tool and Meister is the build tool. After code is checked in from my desktop using the CA Harvest Eclipse plug-in, the code is replicated out to a Linux server, where Meister performs the official system build that is sanctioned for deployment to the application server. There is also a Meister Eclipse plug-in that scans the WSAD workspace for build targets and dependencies. Meister stores this information in one XML file per build target and those files are also checked in to CA Harvest right alongside the source code.
Working intensely within the WSAD Eclipse environment as the project manager cracked the whip, I worked with a consumer application and updated it according to the changes in the service WSDL and service endpoint URL’s. One thing I learned is that if one of the parameters for the consumer is tweaked, don’t bother tweaking the XML or generated code, just regenerate the whole client. WSAD will even check out the files before if they need to be. So everything looked good on my desktop with the service and consumer deployed to two separate WebSphere servers on ports 9080 and 9081. Now to get it into the enterprise ‘dev’ environment…
Using the ‘Generate Target Definitions’ feature of the Meister plug-in I updated the Meister build target XML definition files and checked in all my code. I then promoted the code in CA Harvest which automatically kicked off a ‘dev’ build in the Linux environment. I got an error back from Meister saying ‘jdmpview.jar’ doesn’t exist.
Since I knew my consumer app and its elementary nature, I knew that jdmpview.jar wasn’t one of my JAR’s and it must be one of WebSphere’s. Given that 200 other Java apps use the same build environment with the same standards, I probably didn’t use some new feature of WebSphere that no one else is using. Therefore, it must be a problem on my local desktop with the version of JVM I was using.
Sure enough, the consumer app was using the base_v51 WebSphere runtime instead of the ee_v51. (I did inherit the initial version of the app from someone else!) And, oddly enough, there is an extra JAR in the base that is missing in the more fully featured Enterprise Edition. Meister correctly forced the runtime environment to be EE for the Linux build, overriding the developer selection. I switched the runtime in the Java build path properties, regenerated the Meister target definitions, checked them in and promoted them to a successful ‘dev’ build. Regenerating the target definitions had the effect of switching out the list of JAR files in the library path from the base_v51 set to the ee_v51 set. The whole thing including one bad and one good build took about 4 minutes.
The great benefit for me was the balance between developer and SCM functions. We could have applied more controls at the desktop level, but from my perspective, I prefer an Agile environment with more freedom even if it means occasionally hanging myself with my own rope. In this scenario I let the tools dot the I’s and cross the T’s and it took no more time than say, waiting for Outlook over VPN.
23 Mar
In developing Java applications for multiple server environments (e.g. dev, test and prod) there is a common pain-point of having to manage deployment descriptor or configuration files specific to each server. For example, you may have an XML log4j configuration file with some parameters different for different server environments. You may want to turn on debug messaging for the development server, but turn it off for production. At the same time, the Java source code will (eventually) be the same in production as it was in development. A similar situation applies for .NET application development.
Like many build management tasks, managing these environment-specific files is generally left to either manual effort or some type of scripting. This is really something that needs to have a high level of automation applied. Particularly in larger environments, much like scripted build management solutions, existing tactics fall short. This situation is in a far worse state than even the compile part of build management. It is not enough to simply have a script that can spit out some files. One of the biggest problems is information management and the fact that parameter values in the configuration files may be determined by different teams! How do a production engineering team and an application developer both feed inputs into the same XML file?
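To make the two-teams problem concrete, here is a minimal sketch in Perl, not a description of any existing tool. It assumes one shared template with @NAME@ placeholders and keeps each team's values in its own table. The logger name, the placeholder syntax, the environment names and all values are made up for illustration.

```perl
use strict;
use warnings;

# One template shared by every environment. Placeholders look like @NAME@.
my $template = <<'END_XML';
<logger name="com.example.app">
  <level value="@LOG_LEVEL@"/>
  <appender-ref ref="@APPENDER@"/>
</logger>
END_XML

# Values owned by the application developer, per environment (illustrative).
my %developer_values = (
    dev  => { LOG_LEVEL => 'DEBUG' },
    prod => { LOG_LEVEL => 'WARN' },
);

# Values owned by production engineering, per environment (illustrative).
my %engineering_values = (
    dev  => { APPENDER => 'console' },
    prod => { APPENDER => 'rollingFile' },
);

# Merge both teams' values for one environment and fill in the template.
# Dies if a placeholder has no value, so a gap is caught at build time.
sub render_config {
    my ($environment) = @_;
    my %values = (
        %{ $developer_values{$environment}   || {} },
        %{ $engineering_values{$environment} || {} },
    );
    my $output = $template;
    $output =~ s{\@(\w+)\@}{
        defined $values{$1} ? $values{$1} : die "No value for $1 in $environment\n"
    }ge;
    return $output;
}

print render_config('prod');
```

The point of the split tables is that each team edits only its own inputs, while the processing step is the single place where they meet.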
I’ve worked on this problem for several years and with a number of companies. The critical functionality can be broken down into two different items – information management and a processing engine. In an effort to come up with something better, I’ve done a review of what’s out there and here is what I came up with:
22 Feb
I’ve worked out a good system for putting Meister build management metadata under version control. Why put it under version control when you can do a backup, you may ask? Well, there are times when you may want to develop a reusable workflow or update a Java build method script and test in an isolated test environment. For this you would use a separate test instance of the Meister server. Putting at least some of these files under control will help ensure that you move the known version of your tested server metadata file into your production environment.
There are a few other challenges to be aware of. You may want projects and dependency directories to be able to be created and edited in production while you simultaneously modify a workflow, for example. You don’t want to overwrite a dependency directory definition in production with an older one you are migrating from the test environment.
I’ve got a few other challenges as I’m using CA Harvest for version control and its excellent control over changes extends version locking to the file system by removing write access to files that are not locked by someone for change. Meister requires that most metadata files be writable, so if you check in files to CA Harvest directly, you will leave Meister metadata files read-only and that will cause problems.
So, here are a few tips from my now slick Meister metadata version control system:
15 Feb
I’ve found the multi-threaded capabilities of Mojo and Meister workflows to be very valuable for builds and deployment. The chief benefit I’ve received is in saving time as you might expect. I’ve been working with a workflow that deploys a Java application to up to 24 servers. Let’s ignore the sequential part of the workflow and examine the time difference of running parallel deployments versus one where each of the 24 machines is updated in sequence. The deployment process takes about 5 seconds per machine. Sequentially, that’s 24 x 5 seconds, or 2 minutes. In parallel, well it’s not quite 5 seconds, but closer to about 20 seconds because of limitations of the Linux machine it is running on. Still, that’s a tremendous 100 second savings.
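The arithmetic above (24 x 5 seconds in sequence versus roughly one deployment's worth of time in parallel) can be tried out with a small sketch. This uses plain Perl fork rather than the Mojo workflow engine, and the server names and the deploy_to routine are hypothetical stand-ins: the "deployment" here just sleeps for a fraction of a second.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical stand-in for the real per-machine deployment step.
sub deploy_to {
    my ($server) = @_;
    sleep 0.2;      # pretend the deploy takes a fixed amount of time
    return 0;       # 0 means success, like a shell exit code
}

# Deploy to every server at once, one child process per server.
# Returns the number of servers whose deployment failed.
sub deploy_in_parallel {
    my (@servers) = @_;
    my %server_for_pid;
    for my $server (@servers) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {      # child: do one deploy and exit
            exit deploy_to($server);
        }
        $server_for_pid{$pid} = $server;
    }
    my $failures = 0;
    while ( keys %server_for_pid ) {
        my $pid = wait();
        last if $pid == -1;     # no children left to wait for
        $failures++ if $? != 0;
        delete $server_for_pid{$pid};
    }
    return $failures;
}

my @servers  = map { sprintf 'app%02d', $_ } 1 .. 24;    # hypothetical names
my $start    = time;
my $failures = deploy_in_parallel(@servers);
printf "%d servers, %d failures, %.1f seconds elapsed\n",
    scalar @servers, $failures, time - $start;
```

Because every child starts at nearly the same moment, the cluster members are all hit close together without any ordering logic, which is the second benefit described in the next paragraph.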
In addition to using the parallel workflow to cater to impatience and improve productivity, I want the Java application to hit all of the servers in the cluster close to the same time. In this particular strategy, only 3 machines out of the 24 are in the cluster. The rest are to support dynamic resource allocation and disaster recovery. Running the deploys in parallel allows me to hit all machines, and therefore all the machines in a cluster at close to the same time without having to figure out some ordering so that the cluster servers are hit first and then the rest. This ends up saving a lot of coding, testing and possibly debugging. Great stuff.
28 Jan
There are times in build management that you need to encrypt something – often a password. In the last blog, I gave an overview of the encryption process. Now, I’ll show how you can actually do it in Perl.
Besides just having an encryption algorithm, there are a number of important details to be minded: key, block management algorithm, initialization vector, binary-to-text encoding. Here is what I ended up doing. The encrypted text ended up in the text field of an element in XML and it was successfully decrypted on the other end in pure Java.
First, you need your basic cipher. We’ll use the Rijndael algorithm specified by AES. I used a 128-bit key generated with help from the Crypt::Random module:
use Crypt::Rijndael;
my $base_cipher = Crypt::Rijndael->new(
    $key,
    Crypt::Rijndael::MODE_CBC()
);
Next you use this cipher within a block algorithm:
use Crypt::CBC;    #--Cipher block chaining
my $block_cipher = Crypt::CBC->new(
    -cipher  => $base_cipher,
    -header  => 'none',
    -iv      => $iv,
    -padding => 'space'
);
Even though the initialization vector, $iv, does not need to be secret, I enjoyed making it “randomy” with the ultra-cool Data::Random module. Also note that the padding strategy, adding spaces, is not binary safe so it works for encrypting text, but not for binary format files. Now you just encrypt:
my $encrypted_raw_binary =
$block_cipher->encrypt( $plain_text );
use MIME::Base64;
my $encrypted_text_string = encode_base64(
$encrypted_raw_binary,
''
);
#-- empty 2nd arg means “don’t break up long lines”
The last step is necessary to give you something you can easily manipulate as a string to read and write into files.
So that’s it. To decrypt you just do the reverse.
23 Jan
I wanted to pass along what I learned about a new area for me: encryption. I’m working on a build management project for securing Java web services and I’ve enjoyed learning about encryption methods. There are a couple of key concepts to learn and Wikipedia has some informative and entertaining pages. I recommend “The Code Book” by Simon Singh for a great history of the subject.
One concept strange to newbies is that the encryption algorithm should be widely known and public. The key (**ahem**) is that the encryption key remains secret. If there is a problem discovered with the algorithm, you want to be the first to know. The commonly used algorithms are so robust that there is little advantage to be gained by understanding how they work, as long as the encryption key remains secret.
The U.S. government held a competition for an encryption algorithm to be the Advanced Encryption Standard (AES). The algorithm chosen for this is called Rijndael and it replaced the Data Encryption Algorithm (DEA) of the Data Encryption Standard (DES). The triple form of DEA is still commonly used and is incorrectly but widely known as Triple DES. So yes, the encryption algorithm for the U.S. government’s most top secret data is widely known.
AES specifies not only that Rijndael be used, but that it be used with a key of 128, 192 or 256 bits and a fixed block size of 128 bits. That means Rijndael encrypts only 16 bytes at a time. What?! Yes, so basically you have to chop up your message into 16 byte blocks and encrypt each one separately.
If you were encrypting a long message or a lot of messages, you would be encrypting similar words over and over and a lot of your 16 byte blocks might look similar or even identical. This makes you susceptible to a form of frequency analysis attack (see “The Code Book”). So, another algorithm, called a block mode, is layered on top to tie the 16 byte blocks together. A commonly used one, Cipher Block Chaining (CBC), combines (XORs) each block of plain text with the encrypted text of the preceding block before encrypting it, so identical plain text blocks no longer produce identical encrypted blocks. The first block in the message is seeded with an initialization vector (starter value of 16 bytes) that interestingly does NOT need to be secret. That doesn’t quite make sense to me, but I trust the experts.
If your message is not exactly a multiple of 16 bytes, you will have to pad it with something that you agree on with the decrypter. The padding characters have implications for what is “binary safe” so be careful. (See Crypt::CBC for a great rundown of commonly used padding techniques.)
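To see why the choice of padding matters for “binary safe”, here is a small sketch of space padding in plain Perl. The routine names are made up for illustration, and this is only an approximation of the idea, not the code Crypt::CBC runs internally.

```perl
use strict;
use warnings;

my $BLOCK_SIZE = 16;    # Rijndael/AES block size in bytes

# Pad with spaces up to the next multiple of the block size.
sub pad_with_spaces {
    my ($text) = @_;
    my $remainder = length($text) % $BLOCK_SIZE;
    return $text if $remainder == 0;
    return $text . ( ' ' x ( $BLOCK_SIZE - $remainder ) );
}

# Strip the padding again after decryption.
sub strip_space_padding {
    my ($padded) = @_;
    ( my $text = $padded ) =~ s/ +\z//;
    return $text;
}

my $password = 'my secret';
my $padded   = pad_with_spaces($password);
printf "padded length: %d\n", length $padded;
printf "round trip ok: %s\n",
    strip_space_padding($padded) eq $password ? 'yes' : 'no';

# Data that legitimately ends in a space loses it on the way back.
my $tricky = 'ends with space ';
printf "tricky survives: %s\n",
    strip_space_padding( pad_with_spaces($tricky) ) eq $tricky ? 'yes' : 'no';
```

The decrypter cannot tell padding spaces from real trailing spaces, which is harmless for most passwords but not for arbitrary binary data.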
The last thing you need to know is that when you encrypt your blocks and obfuscate with a good block algorithm, you end up with raw binary data. I certainly don’t recommend it for XML. This encrypted data, however, is commonly encoded using the MIME format Base 64. This, from the early Internet e-mail days, converts raw binary into alphanumeric characters plus ‘+’, ‘/’ and ‘=’. And, yes, that’s 65 characters. You will also need to decide if you will break up the lines with a line break after so many characters or not.
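Here is a short sketch of the Base 64 step on its own, using the MIME::Base64 module that ships with Perl. The input bytes are made up to stand in for two 16 byte blocks of real cipher output. Standard Base 64 output contains only letters, digits, ‘+’ and ‘/’, with ‘=’ used as padding at the end.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Made-up bytes standing in for real cipher output (two 16 byte blocks).
my $raw_binary = join '', map { chr( ( $_ * 37 + 11 ) % 256 ) } 0 .. 31;

# Default behaviour: lines are broken and a newline is appended.
my $wrapped = encode_base64($raw_binary);

# An empty second argument means "no line breaks at all".
my $one_line = encode_base64( $raw_binary, '' );

print "one line: $one_line\n";
print "decoded matches: ",
    ( decode_base64($one_line) eq $raw_binary ? 'yes' : 'no' ), "\n";
```

The single-line form is usually the easier one to drop into the text of an XML element, as long as the decrypting side agrees on it.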
So, to get your encrypted value into XML, you can 1) choose a known encryption algorithm, 2) generate a key, 3) use a block management algorithm, 4) decide how to pad your last block, 5) generate an initialization vector (for CBC), and 6) convert it to Base 64 for suitability for text files. To decrypt, do the reverse. Enjoy!
26 Dec
OK, this topic might be a snoozer, but if we’re going to do build management for our RDBMS (Oracle, SQL Server, etc.) in a revolutionary new way, we need to look at what’s going on with database source code changes and builds and compare that with what we already know.
We said that when we make a runtime change to a database, we are only applying changes on top of what we already have, but in J2EE for Java, we replace the entire running application with another instance of the entire application. This is not an incremental deployment in any sense.
If we compare with the case for C/C++, our source code change might result in replacing one of the application’s executables with a new one. OK, this is more incremental, but maybe I changed one C source file, resulting in one object file change. I still have to rebuild a new executable with possibly many additional unchanged object files.
For both Java and C, traditional build management technologies allow for incremental builds. That means, if I only change a subset of the source code, a build can be done that re-compiles the minimum number of files, taking into account the full impact of each file change. (Many applications have lost the ability to do incremental builds, but Meister can get it back.) So, for Java and C, there should be the ability to do an incremental build, but when that build step is complete, the runtime environment remains unchanged. A separate deployment step needs to happen which is less incremental to some degree.
For the database changes, there is only a single step combining both build and “deployment” and it is always incremental. When you do the build, the runtime environment is changed immediately. So, as I mentioned earlier, there are differences in build management for RDBMS’s and operating system/JVM applications. Again, let that not deter us from bringing those changes under the umbrella of a common build management system.
20 Dec
The majority of the time database changes (regardless of whether you are using Oracle, SQLServer, Sybase, DB2, MySQL or something else) coincide with application changes in a typical business application. For example, an application change request indicates that the Java application needs to start using a new column in an existing database. So, in order for the new version of the application to function correctly, you need to alter the existing database table to add the new column. There are some fundamental differences in how the two changes are manifested.
For the Java application, deployed to WebSphere, JBoss or other app server, let’s say you started out with a single Java class, foo.java, and this application change requires you to add a second class, bar.java. Typically, I would recompile both classes and bundle them together in an archive (a ZIP file) and ship the archive out to the runtime environment. This has the effect of replacing foo.java, whether or not it has changed along with adding bar.java. For the database change, however, you can only apply the ALTER statement to add the column to the table. You do not re-issue a command to recreate that table. Even if you issued commands to drop the table and recreate it, before applying the ALTER, you would lose all the data in the table (unless you dumped the data before hand and re-imported it).
Following J2EE standards for Java development, the whole application is completely replaced every single time it is changed, while for the database, only the changes needed are applied to the existing configuration. In other words, for Java, the existing runtime configuration is completely replaced, while for the database only a configuration delta is applied.
So, there are some significant differences between Java and database changes, but that doesn’t stop you from managing database changes effectively.
18 Dec
In many ways, databases are runtime systems comparable to other programmable environments such as operating systems and application servers based on Java virtual machines. All environments typically have their own specialty engineering support teams in larger companies and their own preferred programming languages for making changes: Java for application server environments such as WebSphere and JBoss, C/C++/.NET for the Linux, UNIX and Windows operating systems; and, PL/SQL/DDL for database changes.
While there are all sorts of tools that allow directly changing databases through an IDE (Integrated Development Environment), and even versioning changes, one trend has become clear in corporate environments over the last decade. That is the desire to manage database changes within their existing change and configuration management infrastructure. Going forward in this blog, I’ll address some of the challenges both organizationally and technically with doing this. The good news is that, yes it can be and is being done successfully. I will also address how Meister contributes to database change control and integrates database changes with the organization’s overall build management practice.
2 Nov
In my previous blog I showed how to test that a subroutine fails properly when given incorrect arguments. I generally use a similar method to test command line Perl programs that are run as Meister activities.
use Test::More qw(no_plan);
my $program = 'actvity_01.pl';
my $rc = eval {`$program`} ;
ok( !$?, "Runs fine or not: $rc");
I take it a bit farther and test calling the script with different inputs to make sure it fails with missing args and succeeds in a situation when it should. When running scripts with Meister and Mojo, it is important to make sure you end with the correct return code so you have the proper handling in the workflow.
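As a sketch of what testing with different inputs can look like, the example below writes a tiny stand-in activity script to a temporary file and checks its return code with and without arguments. The script body, its usage message and the exit code 2 are all made up for illustration. A real Meister activity would be tested in place rather than generated.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Test::More tests => 2;

# Write a tiny stand-in activity script: it fails unless given one argument.
my ( $fh, $script ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$fh} <<'END_SCRIPT';
if ( @ARGV != 1 ) {
    print STDERR "usage: activity <target>\n";
    exit 2;
}
print "deploying $ARGV[0]\n";
exit 0;
END_SCRIPT
close $fh or die "Cannot close $script: $!";

# Run the script with the current perl and return its exit code.
sub exit_code_of {
    my (@args) = @_;
    system( $^X, $script, @args );
    return $? >> 8;    # the high byte of $? holds the exit code
}

is( exit_code_of(),         2, 'fails with missing args' );
is( exit_code_of('my.ear'), 0, 'succeeds when given a target' );
```

Checking the specific code, and not just zero versus non-zero, helps when the workflow branches on different kinds of failure.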
I am aware of the Test::Exception module and just now really looked at it. It has some attractive functions like ‘throws_ok’. This function tests for a failure and that the error message matches a regex. That is nice and compact. Still, for what I am doing these days, I don’t have standard error messages so it is overkill. I’ll keep it in mind for the future.
1 Nov
A common problem in the UNIX and Linux worlds is managing the shell profile. The profile is used by the shell to set basic parameters such as environment variables that all programs running under the shell would have access to. An example is the OPENMAKE_SERVER environment variable that must have as its value the URL of the KB server servlet.
Profiles tend to grow and may contain a number of different settings important to different programs. It is very common to hear about changes to the profile causing some unintended effect to one or more settings, which in turn causes some program or other to stop working. For example, if I have an OpenMake client installed on a Linux box, and then I want to use a COBOL compiler, I may need to set a number of new variables like COBDIR in my profile. A common practice would be to copy the profile from one of the developers who regularly compiles COBOL, on top of your profile. This way you ensure your compiles work. However, by copying you just overwrote all the settings you had before.
Now, of course, you wouldn’t intentionally lose all the settings you carefully defined, but maybe the sys admin copied it for you, trying to be helpful. This very simplistic example is easy to handle, but with lots of settings and lots of programs depending on them, you could easily have a situation where some parameter change is not detected until something crashes in production. So, what do you do? You want to test your changes, but the profiles are usually in some shell script language and are outside the common application lifecycle processes.
It turns out Perl is ideal for this task. All you have to do (for a login-type profile) is to check the environment variables. This effectively gives you both functional and regression tests for your profile.
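use strict;
use warnings;
use Test::More;
ok( exists $ENV{COBDIR}, "COBDIR exists" );
is( $ENV{COBDIR}, '/usr/cob', "COBDIR set correctly" );
done_testing();    # no test count was planned, so declare the end of the tests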
Because it is so easy to write these tests (vi: yy j p dw <your change>), I like to make them pretty granular. In this example, I will know from the two tests if COBDIR is set incorrectly, or not set at all. That can tell me a lot about what caused the problem. This is an incredibly small investment to ensure your profile changes don’t cause unintended production failures.
5 Sep
I think it is funny that Electric Cloud is saying that Perl is an old, outdated language (presumably to position themselves against OpenMake and Meister) when they are based on make, which truly was designed for C and is not very extensible (but works for ad hoc build scripting). I can add FTP or email support to a Meister build method in fewer than 10 lines of Perl code, which is not so easy to do with make alone. I can even update my Facebook page with the latest build result through a Perl API.
Most developers know Perl has a recognized place in most companies.
Java developers often think Java is the only thing in the world, but when I compared Maven (an all-Java build tool for Java builds only) with OpenMake, it took 60 pages of Java code to do the same thing as 1.5 pages of perl code in OpenMake (see my previous blog on XMLBeans).
Perl works great at moving files around and working with the operating system; Java is not so good at that. Meister and Mojo use Java for the server, which works well, but Meister uses Perl where it can be used best (parsing text and interacting with the operating system), and C is in turn used where it is best (C can’t be beat for speed). We also use make for what it does well: the rules engine and dependency ordering. Combining the C-based make engine with Perl for executing commands really gives enormous power to Meister.
2 Aug
How did Cruise Control and Maven lead to Mojo and Meister? We listened to developers’ struggles with CC and Maven and the result so far is Mojo and Meister. Read on for details…
I was working with some developers on standardizing their builds on OpenMake (precursor to Meister) and asked them about their experiences with Cruise Control and Maven for Java builds. They had been using Cruise Control to call Maven. While there are a lot of different bits of functionality in these tools, I want to focus on the repeatable processes they are set up to control. Cruise Control (CC from now on) was set up to check out files from CVS and initiate Maven, which did the build and ran the JUnit tests.
So CC had a repeatable build loop of 1. execute according to schedule, 2. run a CVS check out, 3. initiate Maven, 4. report results.
Maven itself had its own, more specialized build loop of 3a. compile, 3b. run the JUnit tests, 3c. “package” the Java, or create the deployable archive. (There were a few more steps, but I’ll skip them to maintain focus.)
The resulting integration involved nested build loops. CC ran one build loop, and one of the steps in the CC build loop was a multistep Maven build loop. This cumbersome “integration” resulted in a typical build log. If you’ve never read a build log, then maybe you are reading the wrong blog - they are typically very long and somewhat cryptic, and it is usually tedious to identify specifics of commands and errors when there is a problem.
So, how can we do better? Well, we can “flatten” the nested build loop into a single repeatable build loop where every step is reported equally and then some. Mojo is roughly the equivalent of CC, with a graphical build monitor that flags failed steps in bright red in an event summary view in Eclipse or in a standalone Eclipse RCP client. Meister adds the Maven bit to Mojo. The difference is that Meister includes Mojo, as if CC and Maven were one fully seamless product. With Meister, all steps in the build loop are treated equally. This includes the CVS check out as well as the compile step for a particular JAR. When an error is flagged in red on the Workflow Monitor, the developer clicks on the red line and the (HTML formatted) build log is pulled up with the focus precisely on the error location. This provides the transparency the developers were looking for in the build loop, and it saved them a lot of time poring through tedious logs.
This is what we mean by community-developed. We work closely with developers in the field and find out what is a productivity killer (and annoying) and listen to what they say about what they would like to see improved. We take their feedback and directly put it back into product development. For me it is really rewarding to help improve the build process for developers. Developers tell me they appreciate someone listening and working to help them be more productive. I wonder if it is a kind of social work for builds?
31 Jul
Since there was some interest in this topic, I’m diving deeper and showing some code snippets.
As I mentioned before, I’ve found the Archive::Zip and XML::Twig modules invaluable in validating Java deployment descriptors. Post-build validation is important to ensure company standards are adhered to and to prevent mishaps, such as deploying an app with a context root already used by another app, or one with no context root defined. It is far easier to fail prior to attempting deployment than to allow deployment to continue, waste valuable time and resources, and possibly collide with or upset the JVM, costing more time for more people.
I’ve pared down the code I’ve used to the bare essentials for just taking an archive, finding an XML file and extracting it to memory. Next time I’ll get to the XML parsing bit with XML::Twig. Enjoy
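The listing itself is not reproduced here, so what follows is only a minimal sketch of the kind of extraction described above, not the code from the post. It assumes the CPAN module Archive::Zip is installed; the helper name extract_member, the file name myapp.ear and the member path META-INF/application.xml are illustrative assumptions.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);

# Hypothetical helper: return the contents of one archive member as a string,
# or undef if the archive has no member with that name.
sub extract_member {
    my ($archive_file, $member_name) = @_;

    my $zip = Archive::Zip->new();
    $zip->read($archive_file) == AZ_OK
        or die "Cannot read archive $archive_file\n";

    my $member = $zip->memberNamed($member_name);
    return undef unless defined $member;

    # contents() decompresses the member into memory; nothing is written to disk.
    # scalar() is used because in list context it also returns a status code.
    return scalar $member->contents();
}

# Example use with a hypothetical EAR file:
# my $xml = extract_member('myapp.ear', 'META-INF/application.xml');
# die "No application.xml found\n" unless defined $xml;
```

Returning undef for a missing member, rather than dying, lets the calling validation script decide whether a missing descriptor is an error for that type of archive.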
30 Jul
In the first part of this series, I gave an overview of what the series is about. Now I want to step back and describe a problem in my development process that is conveniently resolved by a Mojo/Meister workflow using perl scripts. The result is continuous integration for perl development.
My environment currently is a build and deployment system that uses Meister for build and workflow management, CA Harvest for version control and high level application lifecycle workflow management and about 100 perl scripts used for deploying all sorts of Java applications and performing validations of various sorts. This system supports several hundred Java applications. Because of the large numbers of applications, automation can only be achieved by adhering to standards and naming conventions, much to the dismay of developers.
I was planning to dive in with the deployment and validation scripts, but I’ve been slowly building up a change control and testing system for the perl scripts themselves, and I was confronted with a simpler problem that is a perfect starting point for this series. I’ve been improving the testing of the perl scripts by implementing unit test scripts using the Test::More perl module and relatives. We have three environments for perl development, testing and runtime. I’ll call them dev, qa and prod, for short.
CA Harvest is a big database storing every version ever created in the company. Although it is mostly used for Java code, we the build team are using it to manage our perl scripts. At each state in the lifecycle (dev, qa and prod) there is an associated file system area that is synchronized with what is in the database. This synchronization is done via a perl script, so that when we ‘promote’ code in Harvest from dev to qa, the qa file system area is synchronized with the qa view of our source code in Harvest. (One of the nice things about Harvest is its ability to trigger back end automation of your favorite script on certain actions, like ‘promote’.) My goal is to trigger the automated testing of those perl scripts after synchronization with a script using Test::Harness.
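A trigger script of that kind might look roughly like the sketch below. It is written under several assumptions and is not the script actually in use: it calls TAP::Harness, the engine that ships with Test::Harness 3.x, so that the overall result can be inspected instead of the run dying on failure; it assumes the unit tests are files ending in .t; and the directory path, the QA_SCRIPT_AREA variable and the helper names are hypothetical.

```perl
use strict;
use warnings;
use File::Find;
use TAP::Harness;

# Collect every *.t file under the given directory, sorted for a stable order.
sub find_test_scripts {
    my ($dir) = @_;
    my @tests;
    find( sub { push @tests, $File::Find::name if -f && /\.t\z/ }, $dir );
    return sort @tests;
}

# Run all test scripts under $dir; return 1 only if every one of them passed.
# $verbosity follows TAP::Harness: 0 is normal output, -3 is silent.
sub run_unit_tests {
    my ($dir, $verbosity) = @_;
    my @tests = find_test_scripts($dir);
    return 0 unless @tests;    # treat "nothing to run" as a failure

    my $harness   = TAP::Harness->new( { verbosity => $verbosity // 0 } );
    my $aggregate = $harness->runtests(@tests);
    return $aggregate->all_passed ? 1 : 0;
}

# Hypothetical location of the synchronized qa area.
my $qa_area = $ENV{QA_SCRIPT_AREA} // '/path/to/qa/scripts';
if ( -d $qa_area ) {
    print run_unit_tests($qa_area) ? "PASS\n" : "FAIL\n";
}
```

Treating an empty test directory as a failure is a deliberate choice in this sketch: a synchronization that silently removed the tests should not look like a clean run.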
All perl scripts executed by Harvest are executed on the Harvest server itself. The first problem I had is that some of our perl scripts are executed on the Harvest server, and another set is executed on the physically different build server. I needed to run two sets of perl scripts on two different machines. This is just for the qa environment. I also wanted to do post-production deployment implementation verification (testing what you just put into production) and there are currently two build servers, meaning the scripts need to run on two build machines plus the server. Since the build machines are geographically separate (different network connections) and they could have slightly different shell profiles, I don’t want to assume the test results would be the same.
The problem I have with Harvest is that Harvest will execute a series of post action scripts, but 1) it will execute them in series, 2) if any preceding step fails, it will not execute the following steps.
Currently my synchronization script actually fails, due to a few files that shouldn’t be overwritten. These files are configuration files that must be different between dev, qa and prod, and it is on my TODO list to clean that up. Rather than tweak the perl synchronization script to make it pass, I need to reorganize the files in Harvest - no small task. So I want the synchronization script to fail, but I still want the unit tests to run. That is problem A: Harvest will detect the failure in the synch and exit without running the tests.
The second problem (B) is that I am extremely impatient. Why shouldn’t I execute my perl tests simultaneously on the two or three different machines instead of waiting for them to finish one by one? The Meister workflow (same as Mojo) allows me to A) ignore the return code of the synchronization script and B) execute all of the unit tests in parallel on the different machines (via ssh if necessary). For gravy, I had the workflow wait until all the tests were complete on all the machines and then send an email with the HTML log URL.
So, I had Harvest call a single perl script that launched a Meister workflow, which in turn executed the synchronization, ignored its return code, split into parallel unit tests, waited for all the unit tests to complete on all the machines and then fired off an email.
This really saved a lot of tedious repetition in my development process and I am excited about using it. We are likely to grow both the number of build machines and the number of perl scripts as time goes on, and this system is nicely extensible.
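The control flow of the workflow described above can be sketched in plain Perl. This is an illustration only and not how Meister implements workflows: the helper names and machine labels are hypothetical, the commands are local stand-ins for the real synchronization script and the ssh calls, and the final print is a placeholder for the email step.

```perl
use strict;
use warnings;

# Run one command, wait for it, and return its exit status (0 means success).
sub run_step {
    my (@cmd) = @_;
    system(@cmd);
    return $? == -1 ? 255 : $? >> 8;
}

# Start every command in its own child process, then wait for all of them.
# Takes label => [command, args...] pairs; returns label => exit status.
sub run_parallel {
    my (%jobs) = @_;
    my %label_for_pid;
    for my $label ( sort keys %jobs ) {
        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: replace this process with the command.
            my @cmd = @{ $jobs{$label} };
            exec(@cmd) or die "cannot exec @cmd: $!\n";
        }
        $label_for_pid{$pid} = $label;
    }

    my %status;
    while (%label_for_pid) {
        my $pid = wait();
        last if $pid == -1;    # no children left
        my $label = delete $label_for_pid{$pid};
        $status{$label} = $? >> 8 if defined $label;
    }
    return %status;
}

# Step A: run the sync and deliberately ignore its return code.
my $sync_status = run_step( $^X, '-e', 'exit 1' );    # stand-in for the failing sync
warn "sync exited with $sync_status, continuing anyway\n" if $sync_status;

# Step B: run the unit tests on all machines at once and wait for every one.
my %results = run_parallel(
    harvest_server => [ $^X, '-e', 'exit 0' ],        # stand-in for a local test run
    build_server_1 => [ $^X, '-e', 'exit 0' ],        # stand-in for an ssh test run
);

# Placeholder for the notification email.
my @failed = grep { $results{$_} != 0 } sort keys %results;
print @failed ? "FAILED on: @failed\n" : "all test runs passed\n";
```

Keying the results by machine label rather than by process ID keeps the summary readable when the number of build machines grows.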