News from the Field
24 Apr
As a follow-up to my article on automating XML updates, I’d like to report that I did use Excel and Perl’s XML::Twig to successfully generate XML descriptors for my web service consumer, and it was a lot easier than I thought. I’m using the XFire 1.2.6 web services stack running under JBoss, with the MyEclipse 5.0 IDE. I’m happy to say I went from a blank spreadsheet and no plan to XML files generated from spreadsheet values in one and a half hours. The implementation is of course expandable and reusable, and it should work for WebSphere and .NET as well.
I needed to create different configurations for my web application so that the service request went to different endpoints for different environments. The endpoint is at an enterprise service bus (ESB) and there is a different ESB for each environment. I need to have my ‘dev’ instance of the consumer hit the ‘dev’ instance of the ESB, the ‘qa’ instance of my web app hit the ‘qa’ instance of the ESB, etc. We’ve set up Meister to pick up the correct XML file for the target environment for the build of the WAR.
I started by setting up the spreadsheet as follows. I had an unnecessary column for Host indicating JBoss, but I hope to include WebSphere and maybe .NET as well some day. My web app actually connects to two services a.k.a. providers, so there is a column there. And, next is the configuration label for my web app with the name corresponding to the environment it is designed for. So, the first three columns of the spreadsheet look like:
| Host | Provider | Configuration |
| --- | --- | --- |
| JBoss | helloworld_service | dev |
| | | int |
| | | perf |
| | | qa |
| | | prod |
| JBoss | foobar_service | dev |
| | | int |
| | | perf |
| | | qa |
| | | prod |
Then I needed a way to indicate the resource that would change. Right now I only have XML files, but I chose to stick with a generic URL for that. Unlike Maven or Ant generators, we start with an XML file that actually works and has been tested - not some hacked up parameterized version that takes additional effort to create. The fourth column of the spreadsheet looks like the following (with repeated entries omitted):
| Document URL |
| --- |
| file://consumerWeb/src/com/company/consumer/HelloWorldConsumer.xml |
| file://consumerWeb/src/com/company/consumer/FooBarConsumer.xml |
Next, I needed a way to specify a target location to change within the XML file. Now, I know I’m going to use XPath, but I’ll want this to one day work for properties files as well, so I came up with a URL-like thing called a Universal Datum Locator (UDL), which prepends the method of locating the datum to a method-specific locator. The locator could be a property name, an XPath or a Perl regex, for example. In this case it is an XPath, and the last column contains the replacement value for the datum indicated by the UDL. XPath is also very intuitive and easier to construct than it may look.
The value for the UDL looks like:
xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value
So the fifth column contains the UDLs, which in my case are always the same XPath expression. The final column of the spreadsheet contains the replacement value of the datum indicated by the UDL:
| Value |
| --- |
| http://devesb/esb/helloworld_service/services/HelloWorldJBossService |
| http://intesb/esb/helloworld_service/services/HelloWorldJBossService |
| http://peresb/esb/helloworld_service/services/HelloWorldJBossService |
| http://accesb/esb/helloworld_service/services/HelloWorldJBossService |
| http://prdesb/esb/helloworld_service/services/HelloWorldJBossService |
| http://devesb/esb/foobar_service/services/FooBarJBossService |
| http://intesb/esb/foobar_service/services/FooBarJBossService |
| http://peresb/esb/foobar_service/services/FooBarJBossService |
| http://accesb/esb/foobar_service/services/FooBarJBossService |
| http://prdesb/esb/foobar_service/services/FooBarJBossService |
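The UDL idea described above can be sketched in a few lines of core Perl. This is only an illustration, not the code from my script: the function name `parse_udl` is made up here, and the `property://` form is an assumption about how a non-XPath UDL might look, since only the `xpath://` form is actually in use.

```perl
use strict;
use warnings;

# Hypothetical helper: split a UDL into the locating method (the part
# before "://") and the method-specific locator (everything after it).
sub parse_udl {
    my ($udl) = @_;
    my ($method, $locator) = $udl =~ m{\A([a-z]+)://(.+)\z}s
        or die "Not a UDL: $udl\n";
    return ($method, $locator);
}

my ($method, $locator) = parse_udl(
    q{xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value}
);
print "method:  $method\n";
print "locator: $locator\n";
```

Keeping the method separate from the locator means a later version could dispatch on `$method` to an XPath handler, a properties-file handler or a regex handler without changing the spreadsheet layout.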
My nifty Perl script is only about 80 lines of real code and because XML::Twig is nearly the best thing in the world, I pass the entire XPath in as a hash key to modify the source XML file:
my $twig = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => {
        "$xpath" => sub {
            $_->set_text($new_datum);
        }
    }
);
Here, “$xpath” is directly from the “UDL” column of the spreadsheet with only the ‘xpath://’ stripped off and “$new_datum” is directly from the “Value” column. That’s a pretty useful one-line subroutine if you ask me. I had the new XML files each generated into a different folder (dev/, int/, etc.). Then, I checked them into version control (CA Harvest) and built each of them with Meister. If you want the full code, let me know and I’ll post it somewhere.
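To make the handler above a little more concrete, here is a minimal, self-contained sketch of the same technique run against an in-memory document. It is not the full script: the function name `replace_datum` and the sample bean XML (including the `placeholder-endpoint` URL) are invented for illustration, and it assumes an XML::Twig version that accepts attribute predicates in handler paths, as the expression from the spreadsheet requires.

```perl
use strict;
use warnings;
use XML::Twig;

# Return a copy of $xml in which the text of every element matching
# $xpath has been replaced with $new_datum.
sub replace_datum {
    my ($xml, $xpath, $new_datum) = @_;
    my $twig = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            $xpath => sub { $_->set_text($new_datum) },
        },
    );
    $twig->parse($xml);
    return $twig->sprint;
}

# Hypothetical descriptor, shaped only to match the XPath in the UDL.
my $xml = <<'XML';
<beans>
  <bean id="helloWorld" factory-bean="xfireProxyFactory">
    <constructor-arg index="0"><value>com.company.consumer.HelloWorld</value></constructor-arg>
    <constructor-arg index="1"><value>http://localhost/placeholder-endpoint</value></constructor-arg>
  </bean>
</beans>
XML

# Strip the 'xpath://' prefix from the UDL to get the handler path.
my $udl = q{xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value};
(my $xpath = $udl) =~ s{\Axpath://}{};

print replace_datum(
    $xml, $xpath,
    'http://devesb/esb/helloworld_service/services/HelloWorldJBossService',
);
```

To write a file per environment instead of printing, the same twig could be parsed with `parsefile` and written out with `print_to_file`.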
I did find working with the Excel 2003 XML Spreadsheet format a tiny bit awkward. You have to keep track of the column and row indices, but not bad other than that. I see Microsoft Word 2007 allows you to save as an XML document directly, but you apparently have to define bindings. I’ll have to check that out.
15 Apr
JBoss checks for certain watch files when deploying or undeploying an application. The watch files are key files germane to the object you are deploying. For an EAR, the watch files are the application.xml and the optional jboss-app.xml files. For a web application archive, the watch files are the web.xml and jboss-web.xml files. For single-file XML resources, such as datasources, the watch file is the XML file itself. In this article, I am dealing with archives that are deployed in unextracted (unzipped) form.
The first check is made for the existence or non-existence of a watch file. If a previously unknown watch file is found, the appropriate deployer is started and the file modification timestamp is stored in memory. If a known watch file is found to be missing, the appropriate undeployer is launched.
If a known watch file is found on a subsequent pass of checking watch files, its timestamp is checked against the time that was stored in memory by the deploy process. If the deployed watch file is newer, the appropriate deployer is launched which apparently first dumps the associated resources and then reloads the object as if it were newly found.
This leaves a hole that can lead to the horrifying result of having files deployed to the server, but not having the changes reflected in the running application.
The issue has to do with completely replacing a running application with a new version. You might first delete the application completely from the runtime area leaving the server to undeploy it. Then you replace the object with a new version of itself. The window of time between checks of the watch files is finite and I’ve found it is possible to remove and replace the archive within that window so that the JBoss server does not detect that the watch file was missing and so it is not unloaded from memory. The server does check the watch file timestamps, but if you have changed files other than the watch files and have not updated the timestamps of the watch files themselves, the server will happily ignore the new version of the archive while running the old one.
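The checking logic described above, and the hole in it, can be modelled in a few lines of core Perl. This is a toy model of the behavior as I observed it, not JBoss code: `scan_once`, the action strings and the sample path are all invented for the illustration, and timestamps are plain integers.

```perl
use strict;
use warnings;

# One pass of the scanner. $known maps watch-file path => timestamp
# remembered at deploy time; $on_disk maps path => current timestamp
# for the watch files that exist right now. Returns the actions taken.
sub scan_once {
    my ($known, $on_disk) = @_;
    my @actions;
    for my $path (sort keys %$on_disk) {
        my $mtime = $on_disk->{$path};
        if (!exists $known->{$path}) {
            push @actions, "deploy $path";      # previously unknown watch file
            $known->{$path} = $mtime;
        }
        elsif ($mtime > $known->{$path}) {
            push @actions, "redeploy $path";    # watch file is newer: dump and reload
            $known->{$path} = $mtime;
        }
    }
    for my $path (sort keys %$known) {
        next if exists $on_disk->{$path};
        push @actions, "undeploy $path";        # known watch file has gone missing
        delete $known->{$path};
    }
    return @actions;
}

my %known;
my $watch = 'myapp.war/WEB-INF/web.xml';        # hypothetical path
print join(', ', scan_once(\%known, { $watch => 1000 })), "\n";

# Between two passes the archive is deleted and replaced, but the new
# web.xml carries the old timestamp, so the scanner sees no change.
my @missed = scan_once(\%known, { $watch => 1000 });
print scalar(@missed), " actions on the second pass\n";
```

The second pass reports nothing because neither condition the scanner tests for (a missing file or a newer timestamp) was true at the moment it looked.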
If you use this deployment strategy, the issue is essentially random: in our case a deployment failed for this reason on only a few percent of all deployments. When you are running a few hundred deployments a week, or when it happens on a production deployment, it becomes a big problem – especially when people don’t know what the problem is. A simple resolution is to always update the timestamps of the watch files when changing anything for a deployed application. This will take care of everything but possibly compiled JSPs. (Possibly more on that later.)
This also points to a “restart” mechanism for JBoss – simply ‘touch’ the watch files of a running application to change their timestamps to the current time. This will trigger the dump-and-reload on the next watch file check. This can be useful when the application has not changed, but an associated XML resource has.
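Touching the watch files needs nothing beyond core Perl. A minimal sketch, assuming the caller already knows where the watch files live (the function name and the path in the usage note are made up):

```perl
use strict;
use warnings;

# Set the access and modification times of each existing watch file to
# the current time. Returns the number of files actually updated.
sub touch_watch_files {
    my @files = grep { -e $_ } @_;
    return 0 unless @files;
    # With both times undef, utime uses the current time.
    return utime(undef, undef, @files);
}
```

A call might look like `touch_watch_files('/path/to/deploy/myapp.war/WEB-INF/web.xml')`, run as the last step of every deployment so the server always sees a newer timestamp.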
23 Mar
I wanted to share a specific benefit I enjoyed while using Meister for Java development. As part of my role to help develop an automated JBoss build and deploy system, I ended up taking on a developer role for a web services security project for both JBoss and WebSphere. While the project involved about 1000 lines of Perl, it also got me writing simple web services and consumers for JBoss and WebSphere and building them using Meister and its Eclipse plug-in.
Believe it or not, I am still using WebSphere Studio Application Developer 5.1. While my specific tale involves that IDE, it is equally applicable to the MyEclipse and Rational Application Developer Eclipse-based IDEs. In my environment, CA Harvest is the version control/SCM tool and Meister is the build tool. After code is checked in from my desktop using the CA Harvest Eclipse plug-in, the code is replicated out to a Linux server, where Meister performs the official system build that is sanctioned for deployment to the application server. There is also a Meister Eclipse plug-in that scans the WSAD workspace for build targets and dependencies. Meister stores this information in one XML file per build target and those files are also checked in to CA Harvest right alongside the source code.
Working intensely within the WSAD Eclipse environment as the project manager cracked the whip, I worked with a consumer application and updated it according to the changes in the service WSDL and service endpoint URLs. One thing I learned is that if one of the parameters for the consumer is tweaked, don’t bother tweaking the XML or generated code, just regenerate the whole client. WSAD will even check out the files first if they need to be. So everything looked good on my desktop with the service and consumer deployed to two separate WebSphere servers on ports 9080 and 9081. Now to get it into the enterprise ‘dev’ environment…
Using the ‘Generate Target Definitions’ feature of the Meister plug-in I updated the Meister build target XML definition files and checked in all my code. I then promoted the code in CA Harvest which automatically kicked off a ‘dev’ build in the Linux environment. I got an error back from Meister saying ‘jdmpview.jar’ doesn’t exist.
Since I knew my consumer app and its elementary nature, I knew that jdmpview.jar wasn’t one of my JARs and it must be one of WebSphere’s. Given that 200 other Java apps use the same build environment with the same standards, I probably didn’t use some new feature of WebSphere that no one else is using. Therefore, it must be a problem on my local desktop with the version of JVM I was using.
Sure enough, the consumer app was using the base_v51 WebSphere runtime instead of the ee_v51. (I did inherit the initial version of the app from someone else!) And, oddly enough, there is an extra JAR in the base that is missing in the more fully featured Enterprise Edition. Meister correctly forced the runtime environment to be EE for the Linux build, overriding the developer selection. I switched the runtime in the Java build path properties, regenerated the Meister target definitions, checked them in and promoted them to a successful ‘dev’ build. Regenerating the target definitions had the effect of switching out the list of JAR files in the library path from the base_v51 set to the ee_v51 set. The whole thing including one bad and one good build took about 4 minutes.
The great benefit for me was the balance between developer and SCM functions. We could have applied more controls at the desktop level, but from my perspective, I prefer an Agile environment with more freedom even if it means occasionally hanging myself with my own rope. In this scenario I let the tools dot the I’s and cross the T’s and it took no more time than say, waiting for Outlook over VPN.
28 Jan
There are times in build management that you need to encrypt something – often a password. In the last blog, I gave an overview of the encryption process. Now, I’ll show how you can actually do it in Perl.
Besides just having an encryption algorithm, there are a number of important details to be minded: key, block management algorithm, initialization vector, binary-to-text encoding. Here is what I ended up doing. The encrypted text ended up in the text field of an element in XML and it was successfully decrypted on the other end in pure Java.
First, you need your basic cipher. We’ll use the Rijndael algorithm specified by AES. I used a 128-bit key generated with help from the Crypt::Random module:
use Crypt::Rijndael;
my $base_cipher = Crypt::Rijndael->new(
    $key,
    Crypt::Rijndael::MODE_CBC()
);
Next you use this cipher within a block algorithm:
use Crypt::CBC;  #-- Cipher block chaining
my $block_cipher = Crypt::CBC->new(
    -cipher  => $base_cipher,
    -header  => 'none',
    -iv      => $iv,
    -padding => 'space'
);
Even though the initialization vector, $iv, does not need to be secret, I enjoyed making it “randomy” with the ultra-cool Data::Random module. Also note that the padding strategy, adding spaces, is not binary safe so it works for encrypting text, but not for binary format files. Now you just encrypt:
my $encrypted_raw_binary =
    $block_cipher->encrypt( $plain_text );
use MIME::Base64;
my $encrypted_text_string = encode_base64(
    $encrypted_raw_binary,
    ''
);
#-- empty 2nd arg means “don’t break up long lines”
The last step is necessary to give you something you can easily manipulate as a string to read and write into files.
So that’s it. To decrypt you just do the reverse.
23 Jan
I wanted to pass along what I learned about a new area for me: encryption. I’m working on a build management project for securing Java web services and I’ve enjoyed learning about encryption methods. There are a couple of key concepts to learn and Wikipedia has some informative and entertaining pages. I recommend “The Code Book” by Simon Singh for a great history of the subject.
One concept strange to newbies is that the encryption algorithm should be widely known and public. The key (**ahem**) is that the encryption key remains secret. If there is a problem discovered with the algorithm, you want to be the first to know. The commonly used algorithms are so robust that there is little advantage to be gained by understanding how they work, as long as the encryption key remains secret.
The U.S. government held a competition for an encryption algorithm to be the Advanced Encryption Standard (AES). The algorithm chosen for this is called Rijndael and it replaced the Data Encryption Algorithm (DEA) of the Data Encryption Standard (DES). The triple form of DEA is still commonly used and is incorrectly but widely known as Triple DES. So yes, the encryption algorithm for the U.S. government’s most top secret data is widely known.
AES specifies not only that Rijndael be used, but that it be used with a 128-bit (16-byte) block and a key of 128, 192 or 256 bits. That means the cipher itself encrypts only 16 bytes at a time. What?! Yes, so basically you have to chop up your message into 16 byte blocks and encrypt each one separately.
If you were encrypting a long message or a lot of messages, you would be encrypting similar words over and over and a lot of your 16 byte blocks might look similar or even identical. This makes you susceptible to a form of frequency analysis attack (see “The Code Book”). So, another algorithm is tacked on to mix the 16 byte blocks together. A commonly used block algorithm, Cipher Block Chaining (CBC), combines (XORs) each block of plain text with the encrypted text of the preceding block before encrypting it, so every encrypted block depends on all the blocks before it. The first block in the message is seeded with an initialization vector (a starter value of 16 bytes) that interestingly does NOT need to be secret. That doesn’t quite make sense to me, but I trust the experts.
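Here is a toy sketch of the chaining step only, in core Perl. In CBC as usually defined, each plain text block is XORed with the previous encrypted block (or with the initialization vector for the first block) before it goes through the cipher. The block cipher below is a stand-in that just XORs with the key; it is NOT secure and is there only so the example runs without Rijndael. All names, keys and messages are made up.

```perl
use strict;
use warnings;

my $BLOCK = 16;    # block size in bytes

# Stand-in for a real block cipher such as Rijndael: XOR with the key.
# Insecure, used only to show the structure of the chaining.
sub toy_encrypt_block {
    my ($key, $block) = @_;
    return $block ^ $key;
}

# Encrypt each block on its own, with no chaining.
sub unchained_encrypt {
    my ($key, $plain) = @_;
    return join '', map { toy_encrypt_block($key, $_) } unpack "(a$BLOCK)*", $plain;
}

# CBC: XOR each plain text block with the previous encrypted block
# (the IV for the first block), then encrypt the result.
sub cbc_encrypt {
    my ($key, $iv, $plain) = @_;
    die "key and IV must be $BLOCK bytes\n"
        unless length($key) == $BLOCK && length($iv) == $BLOCK;
    die "message length must be a multiple of $BLOCK bytes\n"
        if length($plain) % $BLOCK;
    my ($prev, $out) = ($iv, '');
    for my $block (unpack "(a$BLOCK)*", $plain) {
        $prev = toy_encrypt_block($key, $block ^ $prev);
        $out .= $prev;
    }
    return $out;
}

my $key     = 'k' x $BLOCK;
my $iv      = 'i' x $BLOCK;
my $message = ('A' x $BLOCK) x 2;    # two identical plain text blocks

my $plain_blocks = unchained_encrypt($key, $message);
my $chained      = cbc_encrypt($key, $iv, $message);
print "without chaining, blocks match: ",
    (substr($plain_blocks, 0, $BLOCK) eq substr($plain_blocks, $BLOCK) ? "yes" : "no"), "\n";
print "with CBC, blocks match:         ",
    (substr($chained, 0, $BLOCK) eq substr($chained, $BLOCK) ? "yes" : "no"), "\n";
```

With a real cipher in place of `toy_encrypt_block`, the same loop is what keeps repeated plain text from showing up as repeated encrypted blocks.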
If your message is not exactly a multiple of 16 bytes, you will have to pad it with something that you agree on with the decrypter. The padding characters have implications for what is “binary safe” so be careful. (See Crypt::CBC for a great rundown of commonly used padding techniques.)
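As a rough illustration of the padding issue, here is space padding written out by hand in core Perl. This is a sketch of the idea, not the code inside Crypt::CBC, and the function names are invented. It assumes the text is a plain byte string.

```perl
use strict;
use warnings;

# Pad $text with spaces up to the next multiple of $block bytes.
sub pad_with_spaces {
    my ($text, $block) = @_;
    my $remainder = length($text) % $block;
    return $text if $remainder == 0;
    return $text . (' ' x ($block - $remainder));
}

# Remove the padding again by stripping all trailing spaces.
sub strip_space_padding {
    my ($text) = @_;
    $text =~ s/ +\z//;
    return $text;
}

my $padded = pad_with_spaces('s3cret', 16);
print length($padded), " bytes after padding\n";
print "[", strip_space_padding($padded), "]\n";
```

The catch is visible in `strip_space_padding`: it cannot tell padding from data, so any message that legitimately ends in spaces comes back shortened. That is fine for a password and unsafe for arbitrary binary content.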
The last thing you need to know is that when you encrypt your blocks and chain them with a good block algorithm, you end up with raw binary data. I certainly don’t recommend it for XML. This encrypted data, however, is commonly encoded using the MIME format Base 64. This, from the early Usenet days, converts raw binary into alphanumeric characters plus ‘+’, ‘/’ and ‘=’ (which is used only as padding at the end). And, yes, that’s 65 characters. You will also need to decide if you will break up the lines with a line break after so many characters or not.
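The Base 64 step needs only the MIME::Base64 module that ships with Perl. A small sketch with made-up bytes standing in for the encrypted data, showing both the single-line form and the default line-broken form:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Made-up raw bytes standing in for encrypted output.
my $raw = pack('C*', 0, 255, 16, 128, 7) x 20;    # 100 bytes

# An empty second argument means no line breaks at all.
my $one_line = encode_base64($raw, '');

# By default a newline is added after every 76 characters.
my $wrapped = encode_base64($raw);

print length($one_line), " characters on one line\n";
print scalar(() = $wrapped =~ /\n/g), " line breaks in the wrapped form\n";

# Decoding ignores the line breaks, so both forms give back the bytes.
print "round trip ok\n"
    if decode_base64($one_line) eq $raw && decode_base64($wrapped) eq $raw;
```

For a value that ends up in the text of an XML element, the single-line form is usually the easier one to read back.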
So, to get your encrypted value into XML, you can 1) choose a known encryption algorithm, 2) generate a key, 3) use a block management algorithm, 4) decide how to pad your last block, 5) generate an initialization vector (for CBC), and 6) convert it to Base 64 for suitability for text files. To decrypt, do the reverse. Enjoy!
27 Jul
An important part of a build system is transparency. How easy is it for someone to tell how the gory mechanics of a build are done? In part 2, I continue with the comparison of the perl build method of Meister and the Java plug-in of Maven to compile XML schemas to Java classes via the XMLBeans compiler. The lack of transparency in the Maven build was costly to developers.
As I mentioned in part 1, creating the perl stub, or build method, was pretty straightforward for Meister. Using the predefined perl objects from the Openmake::File and Openmake::FileList classes allowed a straightforward call of the XMLBeans compiler via the XMLBeans scomp wrapper, and the whole thing took less than a page and a half of perl code. There was one parameter that was required to generate Java 1.5 source code, and we got it working, but there was some question about extra classes in the straightforward compile by Meister compared with the JAR file created with Maven using the Maven XMLBeans compile plug-in. The straight Meister build also had some classes in different folders with names that looked suspiciously like signatures. This made some people nervous about the integrity of the Meister build.
So, I asked how Maven was doing the compile. I had set the Meister build method to just call the compiler with one parameter and the source files. It’s not like it was the GNU C++ compiler with its 400 or so flags! There was literally no other way to do the build. The Meister build was done exactly the way Apache said to do it. I didn’t get any answers.
I should also point out that my developer contact was worried about differences in the byte counts for like named classes. But, in most modern compilers, including Java, source file system paths and/or time stamps are often embedded in the binary or byte code, so it is actually more common than not that byte counts will not match when doing builds with two different build systems.
I had the fortune of working with Selvi at the time, and she looked at the extra classes in the Meister scomp compile and said they looked like temporary template classes, because of their naming convention. They all ended in “List” – also common in C++. A little research indicated that was likely the case. Was Maven cleaning these up?
OK, so I couldn’t get any answers from anyone. I went and downloaded the source code for the Maven plug-in for XMLBeans. I printed it out and got a big shock: it was 60 pages of Java source code printed! I went through the code, and while I understand Java and could basically understand what it was doing, the details were not obvious. Running Maven through a Java debugger was not going to happen. I then posted a question on the XMLBeans developers’ forum (can’t remember the URL) and got no response, despite the fact that there were a number of build questions answered there.
By this time, the developers had already deployed the Meister-built JAR and it had tested fine, and there was no more nervousness about differences in the fine details of the differently built JAR files. It was time to move on.
Apparently none of the developers either in house or responsible for XMLBeans at Apache could explain the differences, which means they could not explain how Maven was doing the build. I’m sure the one guy somewhere in the world who wrote the Maven plug-in probably knows, but I didn’t try too hard to find him.
The lack of transparency in the Maven build process probably cost the organization about 20 hours of work through unnecessary research and validation. Usually the lack of transparency in the build process is mainly the concern of auditors, but this is a case where it was costly for the developers.
26 Jul
An important part of a build system is transparency. How easy is it for someone to tell how the gory mechanics of a build are done? Sean looks at his experience writing a Meister build service and method to compile XMLBeans and compares it with Maven.
I was working on a build team that managed QA and Production builds using OpenMake (previous name for Meister). In this particular shop, the developers were free to use what they wanted on their desktops, but OpenMake was used for the QA and Production builds.
It was determined that XMLBeans was going to be used as part of one application build. XMLBeans (http://xmlbeans.apache.org/) binds Java objects to XML schema designs (XSD). It has a simple compile process that takes XSD and an XSD config file and produces a JAR. The developers were using Maven with a plug-in to compile the XMLBeans.
How a company can effectively adopt new technologies deserves a book in itself, and I am not suggesting this is the best process to take, but this is how it happened. I took on the task of creating the build service (rules) and build method (perl stub) to allow OpenMake to build the XMLBeans JAR. This JAR was then deployed in a WAR under WEB-INF/lib. That was pretty much the only specs for this build service I had.
The simple approach is to find out from the vendor, Apache in this case, their recommended approach for compiling and stick closely to that. The documentation was sparse, but there was a command line wrapper, called scomp, for the Java compiler class that loaded the necessary libraries and constructed the arguments for the Java class. My impression looking at the wrapper (ksh for *NIX, bat for Windows) was that it was a little too simple, but it met all the specs. So I went forward with having the build method perl stub call the scomp wrapper, using OpenMake’s perl object methods to supply both the dependencies and the output JAR file name in the scomp command line. The perl object for dependencies simply contains a list of the XSD and XSD config files as determined by the C engine by searching for dependencies along the search path. The name of the JAR file is determined by the target definition.
Using these objects is pretty simple and familiar to most Perl programmers. If I want to use the name of the expected JAR file name, the target name, I use $Target->get. That’s it. (It’s not a proprietary language as some competitors suggest!) It was pretty straightforward, and I turned it back to the developers.
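To give a feel for how little there is to it, here is a rough sketch of assembling the scomp command line in plain Perl. It is not the actual build method: it leaves out the OpenMake objects entirely and takes the JAR name and source list as ordinary arguments (in the real stub the JAR name would come from `$Target->get`). The function name and file names are made up, and the sketch assumes scomp’s `-out` and `-javasource` options.

```perl
use strict;
use warnings;

# Build the argument list for the scomp wrapper: the output JAR, the
# Java source level, then the XSD and XSD config files.
sub build_scomp_command {
    my ($target_jar, @sources) = @_;
    die "no schema files given\n" unless @sources;
    return ('scomp', '-javasource', '1.5', '-out', $target_jar, @sources);
}

my @cmd = build_scomp_command(
    'consumer-schemas.jar',      # hypothetical target name
    'schemas/order.xsd',         # hypothetical dependencies
    'schemas/order.xsdconfig',
);
print "@cmd\n";

# To actually run it, scomp must be on the PATH:
# system(@cmd) == 0 or die "scomp failed: $?\n";
```

Passing the command to `system` as a list rather than one string avoids any shell quoting problems with file names.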
22 Jul
I just wanted to mention how much I love the XML::Twig and Archive::Zip perl modules. And, why perl, anyway?
I’ve been working on some custom perl scripts to validate deployment descriptors and I’ve found that using Archive::Zip to identify and extract XML files and then using XML::Twig to parse them is vastly superior to having perl call shell commands (yuck!). I’m a serious minimalist when it comes to designing and writing scripts and I love the simplicity and brevity these modules allow. These scripts are called as workflow activities in Meister and it is a pretty sweet setup.
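A minimal sketch of that combination, assuming all you want to know is whether each XML file inside an archive is well-formed. The function name `check_descriptors` is made up, and real descriptor validation would of course check much more than this.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use XML::Twig;

# Return a hash of XML member name => 1 (well-formed) or 0 (parse
# failed) for every .xml file inside the archive.
sub check_descriptors {
    my ($archive_path) = @_;
    my $zip = Archive::Zip->new();
    $zip->read($archive_path) == AZ_OK
        or die "Cannot read $archive_path\n";

    my %result;
    for my $member ($zip->membersMatching('\.xml$')) {
        # Pull the file contents straight out of the archive in memory.
        my $xml = $zip->contents($member);
        # safe_parse wraps the parse in an eval and returns false on error.
        $result{ $member->fileName } = XML::Twig->new->safe_parse($xml) ? 1 : 0;
    }
    return %result;
}
```

Called as `my %ok = check_descriptors('myapp.war');` (a hypothetical file name), it never writes anything to disk and never shells out to unzip.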
Some developers I’ve worked with have asked “Why not Java instead of perl?” Briefly, perl lends itself very well to doing the simple, dirty work of pushing files around and reading, writing and parsing text. Since perl is compiled at runtime, the scripts are transparent and easy to change, and you don’t have to compile them and deploy them somewhere. This translates into much shorter development times for the build system. This is important because it’s the software engineers writing the business applications that get the big bucks - no one wants to pay for infrastructure improvements. Java has its place too in Mojo and Meister and so does C/C++. All three languages are cross-platform and are used for what they are best at in the tools.