Mavericks

News from the Field

Archive for the ‘Perl’ Category

Here Comes Git for Code Change Management

If you are a hard-core open source programmer, you probably use Git for project code change management instead of Subversion (I chose those words carefully). There is a lot of passion from Git advocates and, while it is not a very mature solution, it has a lot of momentum to push it forward. Merely being conceived of and written by Linus Torvalds and being used on a few large open source projects, such as the very Linux kernel itself, is enough to garner wide support.

A great place to learn about Git is Sam Vilain’s Tutorial. He goes into a lot of detail on the benefits and how-to’s of using Git. Some of the highlights include repository space savings of over 90% and local-to-repository sync times dropping from hours in Subversion to minutes with Git. The real power of Git is in the highly distributed repositories and the ease and control of moving and accepting changes between repositories. For an open source project with a large number of developers it seems Git will really shine. Git has fine control over branching, merging and accepting or not accepting project changes according to various criteria.

A popular way to use Git is to have Git pull from a public Subversion or CVS repository with convenient integration with those tools to a local Git repository and work from there. Friends working on the same project can easily pass changes between each other with Git and later commit back to the centralized CVS or Subversion repository. GitHub provides a simple Git repository hosting service. Doing a lot of Java work with JBoss and WebSphere, I am naturally interested in an Eclipse plug-in for Git and indeed one exists. It looks like a newborn infant, but I will check it out.

I also have a Perl open source project that is currently pretty anemic, but I hope to revitalize it soon. I really hate the fact that I’m locked into using Subversion on SourceForge and I never came to like Subversion. I’m eager to explore moving the project to GitHub, even though I’ll probably be the only committer for a while. Since I’m a hardcore software management person and robust Perl developer, I think Git might be my tool. I’ll let you know.

YAJBT – Yet another Java Build Tool

Here comes buildr: yet another Java build tool. Hopefully I, or one of my cohorts, will check this out in detail soon. But, with my experience working with all manner of build tools, with 100 companies and many more development teams, I can already make a few observations.

First of all, why another build tool for Java? I am occasionally told that Maven or Ant is a perfect tool, but clearly the people behind buildr don’t think so. The choice of JRuby as the vehicle for delivering this tool is, I think, probably a good one. JRuby is a scripting language in the same vein as Perl, which is used by Meister.

Doing software builds is an ugly business involving lots of file and operating system interaction. This is not where Java shines, but scripting languages can. As long as operating systems are written in C and not Java, C-like tools will be better and faster at interacting with them. Plain Ruby itself is C-based, and JRuby no doubt inherits C-like operating traits. Calling out to a Java compiler from Perl or JRuby, though it has its own JVM, does not represent a significant overhead compared with the file system operations and the compilation/translation itself.

Both Maven and Ant are relatively difficult to extend compared with most other build tools, and I’ll bet buildr beats them here. If you have all the Maven plug-ins and Ant tasks you need, then good for you. If not, then you have to start developing in Java and it becomes too much of an investment to sink into a build system. It is much cheaper to extend in JRuby or Perl. My frequently cited example is the XMLBeans compile step in Meister, written in Perl, which is only 40 lines of real code. The Maven plug-in is 60 pages of Java code and no one can tell me really what it is doing (I asked on all the forums). Less code is usually more transparent, which is also good for build audits.

I am a little disappointed to see them try to placate the Maven and Ant users by promising it is a drop-in replacement for Maven and they have all the Ant tasks covered. Both tools have their drawbacks and I don’t want to see another tool with the same deficiencies. They should have the cajones (or coñejos) to apply all their resources to what they think is a better tool (with its own unique benefits and deficiencies). I imagine offering Ant task equivalents is pretty easy because of the ease of coding in JRuby compared with Java.

They also don’t mention who is supposed to use the tool. Is it for individuals, small development teams, the enterprise? Maven falls short because it is only appropriate for development teams and not for stable, controlled, enterprise builds. Ant is not even a tool, but a means to create some tools for small teams. I don’t think Meister will fear buildr either.

Well, since buildr is only in incubation status with Apache, I’m not sure how much time I’ll be able to spend on it, but I am curious and I’ll let you know if I find out more.

The first rule for Bash/C/Korn shell scripts in a Perl program environment is to re-write them all in Perl. If your Perl environment has any sophistication, you will have common code, standardized logging (perhaps with Log::Log4perl), testing with Test::More, etc. and your shell scripts just can’t keep pace.

If you share the environment with any non-Perl applications, however, you will still have to deal with the environment profile(s). I also have some legacy shell scripts that we can’t justify converting to Perl unless they have another reason to change. (Don’t change tested code in my house ~~ head bobble + finger wave ~~, nuh-uh!)

There are two ways I know of that you can extend the benefits of your Perl implementation towards your legacy and profile shell scripts. The first is through Bahut’s excellent tip on embedding POD documentation in shell script. This solves my problem of generating HTML documentation from POD in Perl scripts and having upsetting holes where the shell scripts are. I also have some controls for the Perl scripts that run podchecker before committing to version control, which fails if no documentation is found. Now, I can extend this control to the shell scripts.
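The documentation control described above can be sketched with the core Pod::Checker module. This is only a sketch under assumptions: the sample script, the no-op heredoc used to hide the POD from the shell, and the helper names has_valid_pod and write_temp are made up for illustration and are not necessarily what Bahut's tip or my own commit check looks like.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Pod::Checker qw(podchecker);

# Returns true when the file contains POD and podchecker finds no errors.
# podchecker() returns the error count, or -1 when there is no POD at all.
sub has_valid_pod {
    my ($path) = @_;
    open my $sink, '>', \my $report or die "cannot open in-memory report: $!";
    my $errors = podchecker($path, $sink, -warnings => 0);
    return $errors == 0;
}

# Write a string to a temporary .sh file and return its name.
sub write_temp {
    my ($content) = @_;
    my ($fh, $name) = tempfile(SUFFIX => '.sh', UNLINK => 1);
    print {$fh} $content;
    close $fh or die "close $name: $!";
    return $name;
}

# Hypothetical shell script: the ": <<'=cut'" no-op heredoc hides the POD from sh.
my $documented = <<'END_SCRIPT';
#!/bin/sh
: <<'=cut'

=pod

=head1 NAME

legacy_script.sh - example legacy script

=cut

echo "doing legacy work"
END_SCRIPT

my $undocumented = "#!/bin/sh\necho \"no docs here\"\n";

for my $case ([documented => $documented], [undocumented => $undocumented]) {
    my ($label, $content) = @$case;
    my $verdict = has_valid_pod(write_temp($content)) ? 'ok to commit' : 'rejected';
    print "$label: $verdict\n";
}
```

The same function works unchanged for Perl scripts, so one check can guard both kinds of file.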

The second Perl tool you can extend is the testing functionality. I’ve found the functionality in Test::More to be useful for validating that the changes to the shell environment profiles are correct and do not introduce defects. Profiles can be notoriously tricky to change when they get fat and you have variables depending on other variables. Mostly the profiles in my case are used to set environment variables that control the version control and build system, and these can be easily validated in a test script called profiles.t via checks like:

ok( $ENV{CODE_ROOT} eq '/opt/code', "CODE_ROOT set to '/opt/code'" );

You then just rattle off tests for all the variables that are set and you have a great way to validate that everything will still work after the profile change. For a legacy script, you may not be able to have a crack at the internals, but you can at least check the return code and maybe some external effect it has somewhere, such as a file timestamp change.

eval { `legacy_script.sh` };

ok( !$?, "legacy_script" ); #-- $? is zero if script executes successfully

The profiles.t script and any other test scripts used to test legacy shell code can be bundled with all the other Perl tests via Test::Harness for a single test suite that really tests everything, shell and Perl.
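Here is a rough sketch of what a profiles.t along these lines might look like. It assumes a POSIX sh is available and that the profile exports its variables; the temporary profile, the BUILD_HOME variable and the env_after_profile helper are invented for the example, and "true" stands in for a real legacy script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Test::More;

# Source a profile in a child sh and return the exported environment as a hash.
# Values containing newlines are not handled; good enough for simple profiles.
sub env_after_profile {
    my ($profile) = @_;
    my %env;
    for my $line (`sh -c '. "\$1" >/dev/null 2>&1; env' sh "$profile"`) {
        chomp $line;
        my ($name, $value) = split /=/, $line, 2;
        $env{$name} = $value if defined $value;
    }
    return %env;
}

# Stand-in profile; a real profiles.t would point at the actual profile file.
my ($fh, $profile) = tempfile(SUFFIX => '.profile', UNLINK => 1);
print {$fh} "CODE_ROOT=/opt/code\n",
            "BUILD_HOME=\$CODE_ROOT/build\n",
            "export CODE_ROOT BUILD_HOME\n";
close $fh or die "close $profile: $!";

my %env = env_after_profile($profile);
is($env{CODE_ROOT},  '/opt/code',       'CODE_ROOT set to /opt/code');
is($env{BUILD_HOME}, '/opt/code/build', 'BUILD_HOME derived from CODE_ROOT');

# For a legacy script only the exit status is checked here.
is(system('true'), 0, 'legacy script exits with status 0');

done_testing();
```

Sourcing the profile in a child shell keeps the test from depending on whatever environment the test runner happens to have.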

First, let me say how nice it is to have the Mojo workflow engine that allows us to manage the compliance checks, deploy to multiple machines in parallel and validate deployment. This makes our lives a lot easier and provides clear benefits for deployment via the parallelization, dependency management, scalability, logging and reporting. Underneath the covers, and for those of you who don’t have the luxury to use this almost-free product, there are some important low-level tools that are critical to the development, testing and operation of the Mojo JBoss deployment system on Linux.

With the most important listed first, they are:

  1. JBoss support
  2. The Perl executable (5.6-5.10) and base language
  3. Perl’s Test::Simple or Test::More modules
  4. Perl’s Test::Harness module
  5. The JBoss twiddle.sh script or command equivalent
  6. Perl’s XML::Twig
  7. Perl’s Archive::Zip
  8. vi
  9. ssh
  10. xterm

JBoss support wins hands down due to the number of bugs and critically important undocumented features. On a scale of 1 to 10 where 10 is the best documentation, I give JBoss about a 3 or 4. Googling doesn’t even help that much for deployment issues.

You may be surprised at the prominence of Perl, but if you think about what you are really doing and what the best tool for the job is, it makes sense. You are really moving an archive (a ZIP format file), copying XML files, creating directories, changing permissions, extracting the archive to the file system perhaps. Where did I mention Java? Nowhere. The twiddle.sh command comes in handy if you get the secret commands from JBoss support that tell you if the application you deployed has actually started correctly. Notice that this is a shell script suggesting we’re not the first to use non-Java tools to manage deployment.

Particularly on the testing side, I can’t think of a viable alternative to Perl testing. We need to test that we created this directory, changed that permission, updated that file timestamp, etc. We have about 300 test cases encoded in Perl that are run with every change to the deployment system. It takes about 20 seconds to write and run a simple test case in Perl.
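To give a feel for how small such a test case is, here is a minimal sketch using only core modules. The deploy_step function, the directory name and the app.war file are stand-ins made up for the example; the real deployment system does far more than this.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use Test::More;

# Stand-in for one deployment step: create an application directory,
# restrict its permissions and drop an (empty) archive into it.
sub deploy_step {
    my ($root) = @_;
    my $app_dir = File::Spec->catdir($root, 'deploy');
    mkdir $app_dir       or die "mkdir $app_dir: $!";
    chmod 0750, $app_dir or die "chmod $app_dir: $!";
    my $archive = File::Spec->catfile($app_dir, 'app.war');
    open my $fh, '>', $archive or die "open $archive: $!";
    close $fh                  or die "close $archive: $!";
    return ($app_dir, $archive);
}

my $started = time;
my ($app_dir, $archive) = deploy_step(tempdir(CLEANUP => 1));

ok(-d $app_dir, 'application directory created');
is((stat $app_dir)[2] & 07777, 0750, 'directory permissions are 0750');
ok(-f $archive, 'archive is in place');
# Allow one second of slack for coarse file system timestamps.
cmp_ok((stat $archive)[9], '>=', $started - 1, 'archive timestamp updated');

done_testing();
```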

Lessons? Use JBoss support early and often and use Perl.

Automating XML Updates for Web Services

As a follow up to my article on automating XML updates, I’d like to report that I did use Excel and Perl’s XML::Twig to successfully generate XML descriptors for my web service consumer, and it was a lot easier than I thought. I’m using XFire 1.2.6 web services stack running under JBoss and using MyEclipse IDE 5.0. I’m happy to say I went from blank spreadsheet and no plan to generated XML files from spreadsheet values in one and a half hours. The implementation is of course expandable and reusable. This implementation should work for WebSphere and .NET as well.

I needed to create different configurations for my web application so that the service request went to different endpoints for different environments. The endpoint is at an enterprise service bus (ESB) and there is a different ESB for each environment. I need to have my ‘dev’ instance of the consumer hit the ‘dev’ instance of the ESB, the ‘qa’ instance of my web app hit the ‘qa’ instance of the ESB, etc. We’ve set up Meister to pick up the correct XML file for the target environment for the build of the WAR.

I started by setting up the spreadsheet as follows. I had an unnecessary column for Host indicating JBoss, but I hope to include WebSphere and maybe .NET as well some day. My web app actually connects to two services a.k.a. providers, so there is a column there. And, next is the configuration label for my web app with the name corresponding to the environment it is designed for. So, the first three columns of the spreadsheet look like:

Host    Provider             Configuration
JBoss   helloworld_service   dev
                             int
                             perf
                             qa
                             prod
JBoss   foobar_service       dev
                             int
                             perf
                             qa
                             prod

Then I needed a way to indicate the resource that would change. Right now I only have XML files, but I chose to stick with a generic URL for that. Unlike Maven or Ant generators, we start with an XML file that actually works and has been tested - not some hacked up parameterized version that takes additional effort to create. The fourth column of the spreadsheet looks like the following (with repeated entries omitted):

 

Next, I needed a way to specify a target location to change within the XML file. Now, I know I’m going to use XPath, but I’ll want this to one day work for properties files as well, so I came up with a URL-like thing called a Universal Datum Locator (UDL) which pre-pends the method of locating the datum to change on to a method-specific locator. It could be a property name, an XPath or a Perl regex, for example. In this case it is XPath and then the last column contains the replacement value for the datum indicated by the UDL. XPath is also very intuitive and easier to construct than it may look.

The value for the UDL looks like:

xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value
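Splitting a UDL into its two parts takes very little code. The following is a sketch only: the function name parse_udl and the exact rule for what counts as a method name are my assumptions here, not a specification of the UDL format.

```perl
use strict;
use warnings;

# Split a UDL into its locating method and the method-specific locator.
sub parse_udl {
    my ($udl) = @_;
    my ($method, $locator) = $udl =~ m{\A([A-Za-z][\w.+-]*)://(.+)\z}s
        or die "not a UDL: $udl\n";
    return (lc $method, $locator);
}

my ($method, $locator) = parse_udl(
    q{xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value}
);
print "method:  $method\n";
print "locator: $locator\n";
```

A dispatcher can then pick the right engine for each method, for example an XPath handler for 'xpath' and a plain key lookup for a properties file.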

So the fifth column contains the UDL’s, which in my case is always the same XPath expression. The final column of the spreadsheet contains the replacement value of the datum indicated by the UDL:

Value
http://devesb/esb/helloworld_service/services/HelloWorldJBossService
http://intesb/esb/helloworld_service/services/HelloWorldJBossService
http://peresb/esb/helloworld_service/services/HelloWorldJBossService
http://accesb/esb/helloworld_service/services/HelloWorldJBossService
http://prdesb/esb/helloworld_service/services/HelloWorldJBossService

http://devesb/esb/foobar_service/services/FooBarJBossService
http://intesb/esb/foobar_service/services/FooBarJBossService
http://peresb/esb/foobar_service/services/FooBarJBossService
http://accesb/esb/foobar_service/services/FooBarJBossService
http://prdesb/esb/foobar_service/services/FooBarJBossService

My nifty Perl script is only about 80 lines of real code and because XML::Twig is nearly the best thing in the world, I pass the entire XPath in as a hash key to modify the source XML file:

use XML::Twig;

my $twig = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => {
        # the handler fires for every element matching the XPath
        $xpath => sub {
            $_->set_text($new_datum);
        },
    },
);

Here, “$xpath” is directly from the “UDL” column of the spreadsheet with only the ‘xpath://’ stripped off and “$new_datum” is directly from the “Value” column. That’s a pretty useful one line subroutine if you ask me. I had the new XML files each generated into a different folder (dev/,int/, etc). Then, I checked them into version control (CA Harvest) and built each of them with Meister. If you want the full code, let me know and I’ll post it somewhere.

I did find working with the Excel 2003 XML Spreadsheet format a tiny bit awkward. You have to keep track of the column and row indices, but not bad other than that. I see Microsoft Word 2007 allows you to save as an XML document directly, but you apparently have to define bindings. I’ll have to check that out.
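One piece of that row and column bookkeeping is filling in the cells left blank for repeated Host and Provider values. Here is a small sketch of that step; it assumes the spreadsheet rows have already been read into hashes keyed by column name (reading the Excel XML itself is not shown), and fill_down is a name I made up.

```perl
use strict;
use warnings;

# Fill blank cells in the named columns with the value from the row above.
sub fill_down {
    my ($rows, @columns) = @_;
    my (%last, @filled);
    for my $row (@$rows) {
        my %copy = %$row;
        for my $col (@columns) {
            if (defined $copy{$col} && length $copy{$col}) {
                $last{$col} = $copy{$col};    # remember the most recent value
            }
            else {
                $copy{$col} = $last{$col};    # inherit it for a blank cell
            }
        }
        push @filled, \%copy;
    }
    return @filled;
}

my @rows = (
    { Host => 'JBoss', Provider => 'helloworld_service', Configuration => 'dev' },
    { Host => '',      Provider => '',                   Configuration => 'int' },
    { Host => 'JBoss', Provider => 'foobar_service',     Configuration => 'dev' },
    { Host => '',      Provider => '',                   Configuration => 'qa'  },
);

for my $row (fill_down(\@rows, qw(Host Provider))) {
    print join(' ', @{$row}{qw(Host Provider Configuration)}), "\n";
}
```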

I wanted to share a specific benefit I enjoyed while using Meister for Java development. As part of my role to help develop an automated JBoss build and deploy system, I ended up taking on a developer role for a web services security project for both JBoss and WebSphere. While the project involved about 1000 lines of Perl, it also got me writing simple web services and consumers for JBoss and WebSphere and building them using Meister and its Eclipse plug-in.

Believe it or not, I am still using WebSphere Studio Application Developer 5.1. While my specific tale involves that IDE, it is equally applicable to the MyEclipse and Rational Application Developer set of Eclipse IDEs. In my environment, CA Harvest is the version control/SCM tool and Meister is the build tool. After code is checked in from my desktop using the CA Harvest Eclipse plug-in, the code is replicated out to a Linux server, where Meister performs the official system build that is sanctioned for deployment to the application server. There is also a Meister Eclipse plug-in that scans the WSAD workspace for build targets and dependencies. Meister stores this information in one XML file per build target and those files are also checked in to CA Harvest right alongside the source code.

Working intensely within the WSAD Eclipse environment as the project manager cracked the whip, I worked with a consumer application and updated it according to the changes in the service WSDL and service endpoint URL’s. One thing I learned is that if one of the parameters for the consumer is tweaked, don’t bother tweaking the XML or generated code, just regenerate the whole client. WSAD will even check out the files before if they need to be. So everything looked good on my desktop with the service and consumer deployed to two separate WebSphere servers on ports 9080 and 9081. Now to get it into the enterprise ‘dev’ environment…

Using the ‘Generate Target Definitions’ feature of the Meister plug-in I updated the Meister build target XML definition files and checked in all my code. I then promoted the code in CA Harvest which automatically kicked off a ‘dev’ build in the Linux environment. I got an error back from Meister saying ‘jdmpview.jar’ doesn’t exist.

Since I knew my consumer app and its elementary nature, I knew that jdmpview.jar wasn’t one of my JARs and it must be one of WebSphere’s. Given that 200 other Java apps use the same build environment with the same standards, I probably didn’t use some new feature of WebSphere that no one else is using. Therefore, it must be a problem on my local desktop with the version of JVM I was using.

Sure enough, the consumer app was using the base_v51 WebSphere runtime instead of the ee_v51. (I did inherit the initial version of the app from someone else!) And, oddly enough, there is an extra JAR in the base that is missing in the more fully featured Enterprise Edition. Meister correctly forced the runtime environment to be EE for the Linux build, overriding the developer selection. I switched the runtime in the Java build path properties, regenerated the Meister target definitions, checked them in and promoted them to a successful ‘dev’ build. Regenerating the target definitions had the effect of switching out the list of JAR files in the library path from the base_v51 set to the ee_v51 set. The whole thing including one bad and one good build took about 4 minutes.

The great benefit for me was the balance between developer and SCM functions. We could have applied more controls at the desktop level, but from my perspective, I prefer an Agile environment with more freedom even if it means occasionally hanging myself with my own rope. In this scenario I let the tools dot the I’s and cross the T’s and it took no more time than say, waiting for Outlook over VPN.

In developing Java applications for multiple server environments (e.g. dev, test and prod) there is a common pain-point of having to manage deployment descriptor or configuration files specific to each server. For example, you may have an XML log4j configuration file with some parameters different for different server environments. You may want to turn on debug messaging for the development server, but turn it off for production. At the same time, the Java source code will (eventually) be the same in production as it was in development. A similar situation applies for .NET application development.

Like many build management tasks, managing these environment-specific files is generally left to either manual or some type of scripting. This is really something that needs to have a high level of automation applied. Particularly in larger environments, much like scripted build management solutions, existing tactics fall short. This situation is in a far worse state than even the compile part of build management. It is not enough to simply have a script that can spit out some files. One of the biggest problems is information management and the fact that parameter values in the configuration files may be determined by different teams! How do a production engineering team and an application developer both feed inputs into the same XML file?

I’ve worked on this problem for several years and with a number of companies. The critical functionality can be broken down into two different items: information management and a processing engine. In an effort to come up with something better, I’ve done a review of what’s out there and here is what I came up with:

  • Ant ‘filter‘ task: As with many Ant tasks, this works great if you are an individual with a few items that need updating. It is a nightmare if you are working in a multi-team enterprise with multiple server environments. The main problem is that you have to constantly take working copies of XML files and insert a token for Ant to later re-replace. This leads to a management nightmare to synchronize parameterized copies of XML files with their working copies from the desktop environment. The advantage is that it works for any file type so you can use it for properties files as well as XML files.
  • OOPS Consultancy Ant ‘xmltask‘: This is a good engine for specifying and performing changes to the XML and has a full feature set. In fact, we use this in some of the Meister build services. The problem is that it is only for Ant and therefore you have all the reuse, standardization and hard coding issues. Xmltask can provide part of the solution we are looking for, but we still have an information management problem to deal with.
  • Maven: Maven has what is essentially the Ant filter task. The specifications are abstracted in the pom files, which is better than Ant, but it encourages templating of configuration files leading to all the problems associated with that (synchronizing templates with working files, testing templates, etc.)
  • XML:DB XUpdate: This is a working draft of a specification to encode XML update instructions into an XML document. There is a Java implementation of XUpdate listed on the site called ‘Lexus’, but I couldn’t find anything on it. Since the build management task requires us to generate XML files, I’m not keen on generating XML files using xupdate tags that will allow me to generate other XML files.
  • Perl XML::Twig: This has worked wonderfully for me on a back-end web services security effort and I could not be more happy with such a precise, elegant and brief XML library, which includes XPath. This is not a solution for Java or .NET developers, but it could serve as an engine to mimic xmltask or implement the XUpdate specification.
  • Excel. Yes, I’ve seen Excel used effectively as the information management front-end to updating the XML. It is a convenient format to share among teams, it is a centralized source of information, it can be checked into version control and it can be saved as an XML file itself for processing by another engine. In a large environment, you may have 5 or more server environments and lots of different components to configure, so you could have literally hundreds of parameters to manage. Excel gives you a nicely transparent way to view those values.

I’ve worked out a good system putting Meister build management metadata under version control. Why put it under version control when you can do a backup, you may ask? Well, there are times when you may want to develop a reusable workflow or update a Java build method script and test in an isolated test environment. For this you would use a separate test instance of the Meister server. Putting at least some of these files under control will help ensure that you move the known version of your tested server metadata file into your production environment.

There are a few other challenges to be aware of. You may want projects and dependency directories to be able to be created and edited in production while you simultaneously modify a workflow, for example. You don’t want to overwrite a dependency directory definition in production with an older one you are migrating from the test environment.

I’ve got a few other challenges as I’m using CA Harvest for version control and its excellent control over changes extends version locking to the file system by removing write access to files that are not locked by someone for change. Meister requires that most metadata files be writable, so if you check in files to CA Harvest directly, you will leave Meister metadata files read-only and that will cause problems.

So, here are a few tips from my now slick Meister metadata version control system:

  • Use the …/meister/kbserver/tomcat/webapps/openmake.ear/openmake.war/ directory as the root of the file tree under version control. Don’t version anything outside of that tree, but do version everything under it for a simple boundary to your project.
  • Before checking in/committing changes, copy files from the Meister runtime directory to an alternate workspace or temporary directory tree. Check in/commit from there, not the Meister runtime directory. This ensures that you don’t write into the runtime environment, possibly corrupting something.
  • Before taking updates from version control, check them out to an alternate workspace, reference directory or other temporary directory. This is again to avoid corrupting the runtime environment. For CA Harvest, you can use the automated reference directories to copy from. Be careful not to preserve the read-only attribute when you copy from there. I used File::Copy::Recursive and its option not to preserve permissions for a simple, streamlined copy.
  • Manage the check in’s and deploys (copy’s) according to the major subdivisions of the Meister metadata. My check in and deploy commands work by taking a directory relative to the openmake.war/ directory as an argument.
  • Use the awesome Perl SCM project by yours truly to reduce your coding by 90% if you are using CA Harvest. CA Harvest also requires a number of other steps, such as creating a change package, locking files you plan to change to the package before checking them in, and breaking up the locking and check in commands into manageable chunks. The Perl SCM project helps with all that.
  • If you can manage your changes incrementally, all the better. OpenMake Software’s HarRefresh and CA Harvest’s HRefresh manage reference directories with incremental changes. This gives you a workspace or reference directory with only the changes relative to a baseline, not the full baseline. This means if you changed only one workflow, then it is easy to copy just the one changed file to the production environment, instead of the whole baseline with only one changed file. You will have a very simple and transparent production change procedure with this – very nice.
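The copy step from the tips above can be sketched with core modules only. To be clear about the assumptions: my own version used File::Copy::Recursive, whereas this sketch swaps in File::Find and File::Copy so it runs anywhere; copy_tree_writable, the 0644 mode, and the temporary directories standing in for the reference directory and staging area are all made up for illustration.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Find qw(find);
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Copy every file under $from into $to, leaving the copies owner-writable
# even when the originals are read-only. Returns the relative paths copied.
sub copy_tree_writable {
    my ($from, $to) = @_;
    my @copied;
    make_path($to);
    find({
        no_chdir => 1,
        wanted   => sub {
            my $source = $File::Find::name;
            my $rel    = File::Spec->abs2rel($source, $from);
            return if $rel eq '.';
            my $dest = File::Spec->catfile($to, $rel);
            if (-d $source) {
                make_path($dest);
                return;
            }
            copy($source, $dest) or die "copy $source -> $dest: $!";
            chmod 0644, $dest    or die "chmod $dest: $!";
            push @copied, $rel;
        },
    }, $from);
    return sort @copied;
}

# Demo with throwaway directories.
my $reference = tempdir(CLEANUP => 1);
my $staging   = File::Spec->catdir(tempdir(CLEANUP => 1), 'openmake.war');
make_path(File::Spec->catdir($reference, 'workflows'));
my $original = File::Spec->catfile($reference, 'workflows', 'deploy.xml');
open my $fh, '>', $original or die "open $original: $!";
print {$fh} "<workflow/>\n";
close $fh or die "close $original: $!";
chmod 0444, $original or die "chmod $original: $!";   # read-only, like a checked-in file

print "$_\n" for copy_tree_writable($reference, $staging);
```

Taking the directory relative to the root as an argument, as in the tips above, would just mean appending it to both $from and $to before calling the function.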

Multi-Threaded Workflows and Java Deployment

I’ve found the multi-threaded capabilities of Mojo and Meister workflows to be very valuable for builds and deployment. The chief benefit I’ve received is in saving time as you might expect. I’ve been working with a workflow that deploys a Java application to up to 24 servers. Let’s ignore the sequential part of the workflow and examine the time difference of running parallel deployments versus one where each of the 24 machines is updated in sequence. The deployment process takes about 5 seconds per machine. Sequentially, that’s 24 x 5 seconds, or 2 minutes. In parallel, well it’s not quite 5 seconds, but closer to about 20 seconds because of limitations of the Linux machine it is running on. Still, that’s a tremendous 100 second savings.

In addition to using the parallel workflow to cater to impatience and improve productivity, I want the Java application to hit all of the servers in the cluster close to the same time. In this particular strategy, only 3 machines out of the 24 are in the cluster. The rest are to support dynamic resource allocation and disaster recovery. Running the deploys in parallel allows me to hit all machines, and therefore all the machines in a cluster at close to the same time without having to figure out some ordering so that the cluster servers are hit first and then the rest. This ends up saving a lot of coding, testing and possibly debugging. Great stuff.
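For those without a workflow engine, the basic idea of hitting every machine at once can be sketched with plain fork and wait. This is not how Mojo or Meister do it internally; the host names are hypothetical, and a short sleep stands in for the real deployment.

```perl
use strict;
use warnings;
use POSIX ();
use Time::HiRes qw(sleep time);

# Run $deploy->($host) for every host in its own child process and
# return a hash of host => exit status once all children have finished.
sub deploy_in_parallel {
    my ($deploy, @hosts) = @_;
    my %host_for_pid;
    for my $host (@hosts) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: report success as exit status 0, skipping END blocks.
            POSIX::_exit($deploy->($host) ? 0 : 1);
        }
        $host_for_pid{$pid} = $host;
    }
    my %status;
    while ((my $pid = wait()) != -1) {
        $status{ $host_for_pid{$pid} } = $? >> 8 if exists $host_for_pid{$pid};
    }
    return %status;
}

# Stand-in deployment: sleep briefly instead of copying an archive over ssh.
my @hosts  = map { sprintf 'server%02d', $_ } 1 .. 24;
my $start  = time;
my %status = deploy_in_parallel(sub { sleep 0.2; 1 }, @hosts);
my $failed = grep { $_ != 0 } values %status;
printf "%d hosts, %d failed, %.1f seconds elapsed\n",
    scalar @hosts, $failed, time - $start;
```

Because every child starts before any of them is waited on, no ordering of cluster members versus spare machines has to be worked out.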

There are times in build management that you need to encrypt something – often a password. In the last blog, I gave an overview of the encryption process. Now, I’ll show how you can accomplish something.

Besides just having an encryption algorithm, there are a number of important details to be minded: key, block management algorithm, initialization vector, binary-to-text encoding. Here is what I ended up doing. The encrypted text ended up in the text field of an element in XML and it was successfully decrypted on the other end in pure Java.

First, you need your basic cipher. We’ll use the Rijndael algorithm specified by AES. I used a 128-bit key generated with help from the Crypt::Random module:

use Crypt::Rijndael;

my $base_cipher = Crypt::Rijndael->new(
 $key,
 Crypt::Rijndael::MODE_CBC( )
);

Next you use this cipher within a block algorithm:

use Crypt::CBC;  #--Cipher block chaining

my $block_cipher = Crypt::CBC->new(
  -cipher => $base_cipher,
  -header => 'none',
  -iv    =>    $iv,
  -padding => 'space'
);

Even though the initialization vector, $iv, does not need to be secret, I enjoyed making it “randomy” with the ultra-cool Data::Random module. Also note that the padding strategy, adding spaces, is not binary safe so it works for encrypting text, but not for binary format files. Now you just encrypt:

my $encrypted_raw_binary =
               $block_cipher->encrypt( $plain_text );

use MIME::Base64;

my $encrypted_text_string  = encode_base64(
  $encrypted_raw_binary,
  ''
);

#-- empty 2nd arg means "don't break up long lines"

The last step is necessary to give you something you can easily manipulate as a string to read and write into files.

So that’s it. To decrypt you just do the reverse.

Encryption Primer for XML

I wanted to pass along what I learned about a new area for me: encryption. I’m working on a build management project for securing Java web services and I’ve enjoyed learning about encryption methods. There are a couple of key concepts to learn and Wikipedia has some informative and entertaining pages. I recommend the “The Code Book” by Simon Singh for a great history of the subject.

One concept strange to newbies is that the encryption algorithm should be widely known and public. The key (**ahem**) is that the encryption key remains secret. If there is a problem discovered with the algorithm, you want to be the first to know. The commonly used algorithms are so robust that there is little advantage to be gained by understanding how they work, as long as the encryption key remains secret.

The U.S. government held a competition for an encryption algorithm to be the Advanced Encryption Standard (AES). The algorithm chosen for this is called Rijndael and it replaced the Data Encryption Algorithm (DEA) of the Data Encryption Standard (DES). The triple form of DEA is still commonly used and is incorrectly but widely known as Triple DES. So yes, the encryption algorithm for the U.S. government’s most top secret data is widely known.

AES specifies not only that Rijndael be used, but that it be used with a 128-bit block and a key of 128, 192 or 256 bits. That means it encrypts only 16 bytes at a time. What?! Yes, so basically you have to chop up your message into 16 byte blocks and encrypt each one separately.

If you were encrypting a long message or a lot of messages, you would be encrypting similar words over and over and a lot of your 16 byte blocks might look similar or even identical. This makes you susceptible to a form of frequency analysis attack (see “The Code Book”). So, another algorithm is tacked on to disguise the repetition among the 16 byte blocks. A commonly used block algorithm, Cipher Block Chaining (CBC), combines (XORs) each plain text block with the encrypted text of the preceding block before encrypting it, so every encrypted block depends on all the blocks before it. The first block in the message is seeded with an initialization vector (starter value of 16 bytes of text) that interestingly does NOT need to be secret. That doesn’t quite make sense to me, but I trust the experts.
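To see the chaining effect without installing any crypto modules, here is a toy sketch. Loud warning: the "cipher" below is a 4-round Feistel construction I threw together from the core Digest::SHA module purely for illustration. It is NOT Rijndael and NOT secure, and the key, initialization vector and message are example values. Only the chaining logic around it mirrors what CBC does.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256);

my $BLOCK = 16;

# Toy 4-round Feistel cipher over one 16-byte block (illustration only).
sub toy_encrypt_block {
    my ($key, $block) = @_;
    my ($left, $right) = unpack 'a8 a8', $block;
    for my $round (0 .. 3) {
        my $f = substr sha256($key . $round . $right), 0, 8;
        ($left, $right) = ($right, $left ^ $f);
    }
    return $left . $right;
}

sub toy_decrypt_block {
    my ($key, $block) = @_;
    my ($left, $right) = unpack 'a8 a8', $block;
    for my $round (reverse 0 .. 3) {
        my $f = substr sha256($key . $round . $left), 0, 8;
        ($left, $right) = ($right ^ $f, $left);
    }
    return $left . $right;
}

# Pad with spaces up to a multiple of the block size (not binary safe).
sub pad { my ($text) = @_; return $text . ' ' x (-length($text) % $BLOCK) }

# No chaining: every block is encrypted on its own.
sub ecb_encrypt {
    my ($key, $plain) = @_;
    return join '', map { toy_encrypt_block($key, $_) } unpack "(a$BLOCK)*", pad($plain);
}

# CBC: XOR each plain block with the previous encrypted block, then encrypt.
sub cbc_encrypt {
    my ($key, $iv, $plain) = @_;
    my ($prev, $out) = ($iv, '');
    for my $block (unpack "(a$BLOCK)*", pad($plain)) {
        $prev = toy_encrypt_block($key, $block ^ $prev);
        $out .= $prev;
    }
    return $out;
}

sub cbc_decrypt {
    my ($key, $iv, $cipher) = @_;
    my ($prev, $out) = ($iv, '');
    for my $block (unpack "(a$BLOCK)*", $cipher) {
        $out .= toy_decrypt_block($key, $block) ^ $prev;
        $prev = $block;
    }
    return $out;
}

my $key   = 'an example key';
my $iv    = 'sixteen byte iv!';
my $plain = 'ATTACK AT DAWN!!' x 2;    # two identical 16-byte blocks

my @ecb = unpack "(a$BLOCK)*", ecb_encrypt($key, $plain);
my @cbc = unpack "(a$BLOCK)*", cbc_encrypt($key, $iv, $plain);
printf "unchained blocks identical: %s\n", $ecb[0] eq $ecb[1] ? 'yes' : 'no';
printf "CBC blocks identical:       %s\n", $cbc[0] eq $cbc[1] ? 'yes' : 'no';
```

The decrypter needs the same initialization vector to undo the first XOR, which is why it gets passed along with the message rather than kept secret.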

If your message is not exactly a multiple of 16 bytes, you will have to pad it with something that you agree on with the decrypter. The padding characters have implications for what is “binary safe” so be careful. (See Crypt::CBC for a great rundown of commonly used padding techniques.)
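As a rough sketch of one widely used padding scheme (often called PKCS#7), here is a pure-Perl version. The subroutine names are made up for illustration, and a real project would normally let its encryption module do the padding rather than roll its own.

```perl
use strict;
use warnings;

my $BLOCK_SIZE = 16;    # Rijndael/AES block size in bytes

# Append N bytes, each with the value N, so the length becomes a multiple
# of 16. A message that already fits gets a full extra block, which keeps
# unpadding unambiguous.
sub pad_pkcs7 {
    my ($data) = @_;
    my $n = $BLOCK_SIZE - ( length($data) % $BLOCK_SIZE );
    return $data . ( chr($n) x $n );
}

# Read the last byte to learn how many padding bytes to strip.
sub unpad_pkcs7 {
    my ($data) = @_;
    die "bad padding\n" unless length $data;
    my $n = ord( substr( $data, -1 ) );
    die "bad padding\n" if $n < 1 || $n > $BLOCK_SIZE || $n > length($data);
    die "bad padding\n" unless substr( $data, -$n ) eq chr($n) x $n;
    return substr( $data, 0, length($data) - $n );
}

my $padded = pad_pkcs7('attack at dawn');    # 14 bytes of text
printf "%d bytes after padding\n", length $padded;
print unpad_pkcs7($padded), "\n";
```

Because the padding bytes describe their own length, this scheme is safe for arbitrary binary data, not just text.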

The last thing you need to know is that when you encrypt your blocks and obfuscate with a good block algorithm, you end up with raw binary data. I certainly don’t recommend putting that directly into XML. This encrypted data, however, is commonly encoded using the MIME format Base64. This encoding, from the early days of Internet mail, converts raw binary into alphanumeric characters plus ‘+’ and ‘/’, with ‘=’ used as padding at the end. And, yes, that’s 65 characters. You will also need to decide whether or not to break up the output with a line break after so many characters.
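Here is a small sketch of the encoding step using the core MIME::Base64 module. The 16 bytes are just a stand-in for real encrypted output.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Stand-in for encrypted output: 16 bytes of arbitrary binary data.
my $raw = pack( 'C*', 0 .. 15 );

# Default behavior: a newline is added after every 76 characters of output
# and at the very end.
my $wrapped = encode_base64($raw);

# Passing an empty string as the line ending keeps everything on one line,
# which is usually more convenient inside an XML element.
my $one_line = encode_base64( $raw, '' );

print "$one_line\n";
print "round trip ok\n" if decode_base64($one_line) eq $raw;
```

Whichever line-break choice you make, the decoder ignores the line breaks, so the main concern is what the consumer of your XML expects.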

So, to get your encrypted value into XML, you can 1) choose a known encryption algorithm, 2) generate a key, 3) use a block management algorithm, 4) decide how to pad your last block, 5) generate an initialization vector (for CBC), and 6) convert it to Base 64 for suitability for text files. To decrypt, do the reverse. Enjoy!

Perl VCI and SCM Projects

Jim Graham pointed me to Max Kanat’s VCI Project that is providing an abstracted interface to version control functions. This is similar to my Perl SCM project. VCI has support for CVS, Subversion and a number of version control tools I confess I’ve never heard of: Mercurial, Git and Bazaar.

I haven’t had proper time to devote to Perl SCM and so it is withering in its quasi-alpha state. VCI appears to work, so I’ve been thinking of contributing to it. One thing perl-scm does have is adapters to Harvest, Openmake and the Lawson ERP tool. The perl-scm project and its uncommitted updates ** ahem ** do have a robust interface to CA Harvest, so I would probably start there to contribute to VCI. (A customer is paying for work that includes updates to the perl-scm module and they own the code so I can’t commit it ‘as-is’ and need to re-shape it to commit it back to the repository)

Using the Perl SCM module for this client has reduced coding on build management utilities by 90% so I know the VCI project is a useful one. CA Harvest, for example, does not have a persistent command line connection or context and particularly benefits from the abstraction. It allows me to do $hctx->get(@files) instead of constructing a very long command line: “hco -b myserver -eh usrpwd -en myproject -st dev -vp \myproject\source -cp /home/sean -p mychange_001 -r -uk -op pc @files” (no guarantees that command is even right). With Harvest you tend to want to use a number of different sequential commands, each of which reuses two-thirds of the same arguments. This was an object waiting to happen.
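To illustrate the idea, here is a minimal sketch of such a context object. The class name, method names and hash keys are hypothetical (this is not the actual perl-scm or VCI interface), the flags are copied from the unverified command line above, and the sketch only builds the argument list rather than running Harvest.

```perl
use strict;
use warnings;

# Hypothetical context class; not the real perl-scm or VCI API.
package HarvestContext;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Arguments shared by every command run in this context.
# Flags come from the unverified hco example in the post.
sub common_args {
    my ($self) = @_;
    return (
        '-b'  => $self->{server},
        '-eh' => $self->{auth_file},
        '-en' => $self->{project},
        '-st' => $self->{state},
    );
}

# Build (but do not run) the argument list for a checkout.
sub get_command {
    my ( $self, @files ) = @_;
    return (
        'hco', $self->common_args,
        '-vp' => $self->{view_path},
        '-cp' => $self->{client_path},
        @files,
    );
}

package main;

my $hctx = HarvestContext->new(
    server      => 'myserver',
    auth_file   => 'usrpwd',
    project     => 'myproject',
    state       => 'dev',
    view_path   => '\myproject\source',
    client_path => '/home/sean',
);

my @cmd = $hctx->get_command( 'Foo.java', 'Bar.java' );
print "@cmd\n";    # a real get() would hand @cmd to system()
```

The shared arguments are stated once in the constructor, and each additional command only needs a small method of its own.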

Perl Has an Awesome Open Source Community

I am continually impressed with the Perl community and how well it supports users both operationally and personally. Unlike Java or even communities surrounding a single product like Maven, Perl is completely centralized. See CPAN, the Comprehensive Perl Archive Network.

On CPAN, nearly all the world’s Perl modules (functional extensions) can be found. This is not just a place to download stuff, it has a standardized interface for organizing not only the modules themselves, but all the documentation, history, bug reports and author/maintainer information about each module. (Authors and maintainers get a prestigious ‘cpan.org’ email address)

Typically, if you know the module name you want, you can just type it into Google and it will take you right to the module’s CPAN page. I recently did that for ‘Archive Zip‘. Some of the most popular modules may have their own websites that include additional tutorials, like two others I’ve accessed recently: XML::Twig and Net::DNS.

This centralization has enormous advantages. When installing a new Perl module in your local environment (a fully automated procedure), the “transitive dependencies” are automatically determined and you are given the option to install them as well. (These are the modules that the module I am interested in needs in order to run and that I don’t yet have in my environment.)

  • Filed under: Perl
  • Chicago Ignited

    I had a great time at the Ignite Chicago talks last night. I was actually the first speaker of eighteen with no idea what to expect. It was pretty challenging to write a 5 minute talk at 15 seconds per slide. It forced me to focus on the subject matter and also helped me hone my rehearsal skills. Feedback was that it was entertaining but a bit of a tour de force as I explained the problem of build management in corporate America and also presented a solution with a particular technical architecture involving tools and Perl scripting. I used some examples from this blog.

    I met some neat people who are now friends on Facebook. There was a big U. Chicago contingent and I’m pretty sure I must have met Erin McKean before. Her talk on her work with the Oxford American Dictionary gets my nod for the best talk.

    Hats off to Jesper Andersen and Sean Harper for organizing it. I liked it so much I volunteered to help organize the next one.

    “Ignite” Lightning Talks

    I’ll also be attending Ignite - Chicago. This is not the music festival, but a series of geeky 5 minute talks with about 20 slides. This is just my style, the length of the talk matching the length of my patience. My talk is entitled “Controlling Java with Perl”, and I’ll be covering many of the same topics related to build management I cover in my blog. I see O’Reilly Publishing’s name on the web page, so maybe they have something to do with it. Already there are 98 people planning to attend. Like the Windy City Hackathon, if you are planning to attend, please drop me a line – I’ll buy you a beer.

    Windy City Hackathon (Perl)

    I’ll be attending the Windy City Hackathon in Chicago on Saturday, December 15. The Hackathon is for Perl programmers to get together and help each other solve problems, learn new techniques and get to know one another. The full Hackathon lasts from Friday the 14th through Sunday, the 16th. This is yet another great benefit of the Perl community (YAGBPC). If you didn’t get the parenthetical humor, then you are not close enough to the Perl community and you need to attend.

    If you will be attending, drop me a line – I’d love to meet you.

    In my previous blog I showed how to test that a subroutine fails properly when given incorrect arguments. I generally use a similar method to test command line Perl programs that are run as Meister activities.

    use Test::More qw(no_plan);

    my $program = 'activity_01.pl';   #-- must be executable and on the PATH
    my $output  = `$program`;         #-- backticks capture STDOUT and set $?

    ok( !$?, "Runs fine or not: $output" );

    I take it a bit further and test calling the script with different inputs to make sure it fails with missing args and succeeds when it should. When running scripts with Meister and Mojo, it is important to make sure you end with the correct return code so you have the proper handling in the workflow.
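    A sketch of what testing with different inputs can look like follows. The stand-in “script” is a Perl one-liner so the example runs anywhere, and the helper name exit_code_of is made up for illustration.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Run a command and return its exit code (the high byte of $?).
sub exit_code_of {
    my @cmd = @_;
    system(@cmd);
    return $? >> 8;
}

# Stand-in for an activity script: exits 2 when called without arguments.
my @script = ( $^X, '-e', 'exit( @ARGV ? 0 : 2 )' );

is( exit_code_of(@script), 2, 'fails when the required arg is missing' );
is( exit_code_of( @script, 'myproject' ), 0, 'succeeds with an arg' );
```

    Checking the exact exit code, rather than just success or failure, matters when the workflow branches on specific return codes.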

    I have been aware of Test::Exception for a while and only just now took a real look at it. It has some attractive functions like ‘throws_ok’. This function tests for a failure and that the error message matches a regex. That is nice and compact. Still, for what I am doing these days, I don’t have standard error messages so it is overkill. I’ll keep it in mind for the future.

    Testing *NIX Profiles with Perl

    A common problem in the UNIX and Linux worlds is managing the shell profile. The profile is used by the shell to set basic parameters such as environment variables that all programs running under the shell would have access to. An example is the OPENMAKE_SERVER environment variable that must have as its value the URL of the KB server servlet.

    Profiles tend to grow and may contain a number of different settings important to different programs. It is very common to hear about changes to the profile causing some unintended effect to one or more settings, which in turn causes some program or other to stop working. For example, if I have an OpenMake client installed on a Linux box, and then I want to use a COBOL compiler, I may need to set a number of new variables like COBDIR in my profile. A common practice would be to copy the profile from one of the developers who regularly compiles COBOL, on top of your profile. This way you ensure your compiles work. However, by copying you just overwrote all the settings you had before.

    Now, of course, you wouldn’t intentionally lose all the settings you carefully defined, but maybe the sys admin copied it for you, trying to be helpful. This very simplistic example is easy to handle, but with lots of settings and lots of programs depending on them, you could easily have a situation where some parameter change is not detected until something crashes in production. So, what do you do? You want to test your changes, but the profiles are usually in some shell script language and are outside the common application lifecycle processes.

    It turns out Perl is ideal for this task. All you have to do (for a login-type profile) is to check the environment variables. This effectively gives you both functional and regression tests for your profile.

    use Test::More tests => 2;

    ok( exists $ENV{COBDIR}, "COBDIR exists" );

    is( $ENV{COBDIR}, '/usr/cob', "COBDIR set correctly" );

    Because it is so easy to write these tests (vi: yy j p dw <your change>), I like to make them pretty granular. In this example, I will know from the two tests if COBDIR is set incorrectly, or not set at all. That can tell me a lot about what caused the problem. This is an incredibly small investment to ensure your profile changes don’t cause unintended production failures.
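    A possible variation, offered only as a sketch and assuming a POSIX sh is available: test a candidate profile before anyone logs in with it by sourcing it in a fresh shell and capturing the resulting environment. The temporary file below stands in for the real profile.

```perl
use strict;
use warnings;
use Test::More tests => 2;
use File::Temp qw(tempfile);

# Stand-in profile written to a temp file so the sketch is self-contained.
my ( $fh, $profile ) = tempfile( UNLINK => 1 );
print {$fh} "COBDIR=/usr/cob\nexport COBDIR\n";
close $fh;

# Source the profile in a fresh shell and capture the environment it leaves.
my %env;
for my $line (`sh -c '. "$profile"; env'`) {
    chomp $line;
    my ( $name, $value ) = split /=/, $line, 2;
    $env{$name} = $value if defined $value;
}

ok( exists $env{COBDIR}, 'COBDIR exists' );
is( $env{COBDIR}, '/usr/cob', 'COBDIR set correctly' );
```

    This simple parser assumes one variable per line, so values containing newlines would need more careful handling.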

    Testing Failures in Perl

    One of the things I find annoying when coding is having a subroutine that you’ve called with wrong arguments merrily process away with inappropriate inputs. I try to check for values for all the args at the beginning of the subroutine:

    use Carp;

    sub my_sub {
      my $partition  = shift;
      confess "partition not passed"   #-- $partition is empty here, so name it literally
        unless $partition;
    }

    Something like that, depending on the exact requirements. Since my subroutine is nearly always in a Perl module to make it easy to test, I use ‘confess’ instead of ‘die’ to get the full stack trace in case of an error. This lets you know which code called the subroutine incorrectly, rather than simply the line of the die command in your Perl module. Duh – not useful.
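    To see the difference the stack trace makes, here is a small self-contained sketch. The subroutine names are invented for the example.

```perl
use strict;
use warnings;
use Carp qw(confess);

sub checked {
    my $partition = shift;
    confess "partition not passed" unless $partition;
    return 1;
}

# The buggy caller: it forgets to pass the argument.
sub caller_with_bug { return checked() }

eval { caller_with_bug() };
print $@;    # the message plus one line for each frame of the call stack
```

    With ‘die’ the message would name only the line inside checked(); with ‘confess’ the trace also names caller_with_bug, which is where the actual mistake is.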

    And now, since I want to make sure the subroutine fails if I call it wrong, I write a test script that goes something like this:

    use Test::More tests => 2;

    use_ok('MyModule');  #- tests that 'use' works

    eval { MyModule::my_sub('') };
    ok( $@, "Failed as it's s'posed to: $@" );

    There are a couple of important things to note here. When you test for an error, make sure you print the error back out. While it would be nice to then do an ok to check the error message, that is really diminishing returns in terms of how much coding you do for your testing. I find it is enough to print out the error message. This lets you know if your script is failing for an entirely different reason than the one you intended. Usually this is related to simply getting the test script working correctly, and once you get that working, you don’t have to worry about it. If you print the error message, though, you will have to get used to seeing error messages whiz by as your tests are run, but as long as you run all your tests through Test::Harness, you only have to be concerned about seeing the ‘All tests successful’ message at the very end.
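    If you do decide the extra check is worth it, Test::More’s ‘like’ keeps it to one more line. In this sketch the local my_sub is a stand-in for MyModule::my_sub so the example runs on its own.

```perl
use strict;
use warnings;
use Test::More tests => 2;
use Carp qw(confess);

# Stand-in for MyModule::my_sub.
sub my_sub {
    my $partition = shift;
    confess "partition not passed" unless $partition;
    return 1;
}

eval { my_sub('') };
ok( $@, 'dies on an empty argument' );
like( $@, qr/partition not passed/, 'and for the reason we intended' );
```

    When ‘like’ fails, it prints the actual error text in its diagnostics, so you still get to see the message without printing it on every successful run.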

  • Filed under: Testing, Coding, Perl
  • Perl Testing

    I do a fair amount of Perl development for our clients and I’ve become a big fan of the Test::More package for testing my Perl code. I’ve found it very useful for unit testing the Perl code itself and also for running integration tests that involve commercial product command line APIs. To get started you have to look at Michael Schwern’s tutorials that also explain the philosophy and best practices of testing. (Michael G. Schwern’s Perl Testing Tutorial)

    I found I had to change my coding architecture slightly to pull subroutines in Perl scripts out into Perl modules. This makes sense, of course, because you can load the Perl module from a test script and test the handling of arguments and returns of each subroutine. Schwern says “test the manual” and that would be great if in my environment someone would pay for the documentation. Well, there should always be the POD (Plain Old Documentation), but I actually use the test script as my list of requirements and specs. I simultaneously write and run the test scripts as I write the code for the functionality of interest. When I finish my last line of coding and I run the test script and it completes successfully, I know I’m done with development.

    Well, those are the basics. I’ll share some more of my experiences writing and running tests in Perl in future blogs.

  • Filed under: Testing, Coding, Perl