Sean Blanton

Agile Build, CI and Testing Automation

Archive for the ‘Perl’ Category

I attended a lunchtime set of lightning talks at ThoughtWorks yesterday (11/11/09) in downtown Chicago. It was great, but surprisingly they had trouble nailing down compelling arguments for adopting Agile practices. I wanted to help them by adding a business perspective to “trust me, you’ll like it better,” and by summarizing a few other topics of the day, interpreted in terms of my own experience.

They also dispelled the belief by some audience members that Agile is a radical new way of doing things, and it correctly came out that many practices were developed previously, even on the mainframe. I am not taking anything away from the leaders in the Agile community who have popularized many best practices, encouraged collective adoption and developed specific ways of incorporating them into coherent software engineering frameworks.

I realized I myself ended up doing a variant of extreme programming (XP) without knowing anything about it through application of common sense optimizations and best practices. Hopefully my notes here will help people understand why certain engineering practices are advocated by XP. One day I ended up being a development lead with no knowledge and a blank slate and this is what happened…

Code Quality Beyond – “It Works”

Erik Dornenberg mentioned that code should be easy to change and ThoughtWorks CEO Becky Parsons mentioned that you should pay attention to testing when writing code. My own mantra is “good quality code should be easy to change, easy to test and easy to debug.” The relative priorities of these are determined by business needs.

Test Driven Development

In my environment, I often was given requirements that were simple technical specifications. For example, I had to ensure that some value followed a naming convention. The best way to test that the code met the specification was to encode the specifications into the unit tests and then write the code so that it passes the tests. How else would you do it? The optimization here is that I didn’t want to waste my time writing code that didn’t pass a unit test, so I wrote the test first. Possibly I found this out by trial-and-error.
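
Here is a minimal sketch of what “encoding the specification into the unit tests” can look like with Test::More. The naming rule (lowercase start, then lowercase letters, digits or underscores, 30 characters at most) and the sub name is_valid_name are invented for illustration; they are not the convention from that project. The table of cases is the specification, written first, and the sub is the smallest thing that passes it.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical convention (an assumption, not the original project's rule):
# a name starts with a lowercase letter, followed by lowercase letters,
# digits or underscores, 30 characters at most.
sub is_valid_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return $name =~ /\A[a-z][a-z0-9_]{0,29}\z/ ? 1 : 0;
}

# The specification, written down as test cases before the sub existed.
my @spec = (
    [ 'build_id', 1, 'lowercase with underscore is accepted' ],
    [ 'Build_id', 0, 'uppercase is rejected' ],
    [ '9lives',   0, 'leading digit is rejected' ],
    [ 'a' x 31,   0, 'more than 30 characters is rejected' ],
    [ '',         0, 'empty string is rejected' ],
);

is( is_valid_name( $_->[0] ), $_->[1], $_->[2] ) for @spec;

done_testing();
```

When the specification changes, the table changes first and the failing rows show what the code still has to do.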

Automated Testing

Duh – I had 300 unit tests for my codebase. Am I going to run each of them manually? We did have sufficient complexity that I worried that one change might break something seemingly unrelated, so I made sure all unit tests were run before promoting the code for deployment. So, no changes could be deployed without all unit tests being run against the code.
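
As a sketch of such a gate (not the script I actually used), the core TAP::Harness module can run a list of test files and report whether every one of them passed. The sub name all_tests_pass and the idea of passing the test files on the command line are assumptions made for this example.

```perl
use strict;
use warnings;
use TAP::Harness;

# Run every given test file and say whether promotion may proceed.
# An empty list counts as failure: no tests, no promotion.
sub all_tests_pass {
    my (@test_files) = @_;
    return 0 unless @test_files;
    my $harness    = TAP::Harness->new( { verbosity => -1 } );    # -1 = quiet
    my $aggregator = $harness->runtests(@test_files);
    return $aggregator->all_passed ? 1 : 0;
}

# When called with test files as arguments, exit non-zero on any failure
# so a promotion script can stop right there.
if (@ARGV) {
    exit( all_tests_pass(@ARGV) ? 0 : 1 );
}
```

A promotion script could call this with something like t/*.t (a conventional but assumed layout) and refuse to continue on a non-zero exit code.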

Continuous Integration

I’ve also done a lot of consulting in the application lifecycle management space, and we didn’t get too deep into developer practices. But, since we were implementing version control with automated build, we always introduced continuous integration – we just didn’t have that nice tidy term. One day, I hope to thank Martin Fowler (who gave one of the lightning talks) for popularizing/coining the term so I don’t have to fumble with a longer explanation. Unfortunately, when I now say “continuous integration,” I’m met with a blank or embarrassed stare (“I feel I should know what that is, but don’t”) so I still have to go through the full explanation.

On my own team, I wanted to know about integration problems as soon as possible and I wanted to ensure unit tests were written for code changes by other developers. So, I had unit tests automatically run prior to code check-in. A failing or missing unit test meant the code could not be checked in. Well, I lost that battle and had to relax the enforcement to mere warnings. I even had a unit test generator to make it easy to pass.
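
The “missing unit test” half of that check can be sketched in a few lines. The mapping from lib/Foo/Bar.pm to t/Foo-Bar.t is an assumed convention, and both sub names are made up for this example; how the list of changed files is obtained depends on the version control tool and is left out.

```perl
use strict;
use warnings;

# Map a module path to the test file expected for it.
# The lib/Foo/Bar.pm -> t/Foo-Bar.t convention is an assumption.
sub expected_test_for {
    my ($module_path) = @_;
    return unless $module_path =~ m{\Alib/(.+)\.pm\z};
    ( my $name = $1 ) =~ s{/}{-}g;
    return "t/$name.t";
}

# Return the changed modules that have no test file. $exists is a code
# ref so the check can be exercised without touching the disk.
sub modules_missing_tests {
    my ( $changed, $exists ) = @_;
    $exists ||= sub { -e $_[0] };
    return grep {
        my $test = expected_test_for($_);
        defined $test && !$exists->($test);
    } @$changed;
}

# Relaxed enforcement: warn about each gap instead of rejecting the check-in.
my @missing = modules_missing_tests( \@ARGV );
warn "warning: no unit test found for $_\n" for @missing;
```

Turning the warning back into a hard failure is a one-line change (exit non-zero when @missing is not empty), which is exactly the battle described above.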

BTW, this was all Perl and you can have integration issues there just as in C# or Java. Integration problems often have to do with namespace and interface changes, and those issues come up in Perl also. Perl also gets compiled prior to runtime, so you can also break the Perl build. There is just not a separately executed build step (ok, ‘perl -c’ – maybe I’ll try that next time).
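
For what it’s worth, a “Perl build” step along those lines could be sketched like this: run perl -c on each file and collect the ones that fail to compile. The sub name compiles_ok is made up, and a real setup would probably need -I options for the project’s library paths. Keep in mind that perl -c still executes BEGIN blocks and use statements, so it is a compile check and not a sandbox.

```perl
use strict;
use warnings;

# Compile-check one Perl file with "perl -c", using the same interpreter
# that is running this script ($^X). Returns true if the file compiles.
sub compiles_ok {
    my ($file) = @_;
    my $output = qx{"$^X" -c "$file" 2>&1};    # output captured only to keep it off the screen
    return $? == 0;
}

# Usage: perl check_build.pl lib/Foo.pm bin/deploy.pl   (names are hypothetical)
my @broken = grep { !compiles_ok($_) } @ARGV;
print "does not compile: $_\n" for @broken;
```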

Pair Programming

We sort of half-implemented pair programming and code reviews because the engineers were not familiar with Perl best practice coding. They were also not familiar with the architecture and standard patterns I invented and had in my head alone. Yes, another nasty real world situation. Had I introduced actual “pair programming” it would have been better than me looking over someone’s shoulder and trying to convince them to use best practices to no avail.

Had I introduced actual pair programming, I would have had to immediately answer to the project manager who would want to know why it was taking two people to make each change instead of one (sure there are twice as many changes, but…). This was the hottest topic of the day at ThoughtWorks, but no one in the audience articulated the question that succinctly.

Neal Ford, in his talk, told us that, with pair programming, it took 15% longer to code the changes, but it resulted in 15% fewer defects. Then came “trust us, that’s better.” You were almost there, Neal. We need cost and timeline justifications for the management to accept pair programming. The one thing I’ll add to the below analysis is that probably most of the 15% defects would get caught in pre-release testing for non-pair programming, but let’s say one would get through to production.

Cost Justification for Pair Programming

Find out the cost of a production defect. At the low end, it is a few $10k’s, at the high end it’s millions. Let’s say with pair programming, we prevent one defect that would cost $50k plus a lot of stress.

Then you have on top of that the cost of finding the other defects in QA, repairing them and possibly impacting your timeline. That’s got to be worth at least $50k for a typical business app.

Now compare with the 15% extra time it takes to make the changes in pair programming. That’s 15% extra time for two programmers over, let’s say, a two-week iteration. One of our customers uses $65/hr to estimate the cost of salary AND all the IT resources consumed by a developer, so let’s use that. The cost of extra development time is:

80 hrs/developer * 2 developers * 15% * $65/hr = $1560

That is nothing compared with $100k worth of problems and still worth it if I’ve wildly overestimated the cost of the defects.
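
If you want to plug in your own numbers, here is the same arithmetic as a small script. All figures are the ones used above; the $100k is simply the $50k production defect plus the $50k of QA and repair work. The variable names are mine.

```perl
use strict;
use warnings;

# Figures from the text above; the percentage is kept as an integer so
# the arithmetic stays exact.
my $hours_per_dev  = 80;         # one two-week iteration
my $developers     = 2;          # one pair
my $overhead_pct   = 15;         # extra coding time with pairing
my $rate_per_hour  = 65;         # loaded cost per developer hour, in dollars
my $defect_savings = 100_000;    # $50k production defect + $50k QA and repair

my $extra_hours = $hours_per_dev * $developers * $overhead_pct / 100;
my $extra_cost  = $extra_hours * $rate_per_hour;

printf "Extra pairing time: %d hours\n", $extra_hours;
printf "Extra pairing cost: \$%d\n",     $extra_cost;
printf "Savings cover the cost about %.0f times over\n", $defect_savings / $extra_cost;
```

With these inputs the extra time is 24 hours and the extra cost $1560, so the defect estimate could be off by a factor of 60 and pairing would still pay for itself.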

Timeline Justification for Pair Programming

To calculate the cost of the extra test and repair work, you really need to know how long all that would take and what the extra effort is, because time is money. All you have to do is take 15% of 80 hours (12) and compare that to the time it takes to test, repair and re-test. It is probably worth investing the extra time for pair programming.

Interleaving Design and Coding

When I became a dev lead, I was supposed to do design and let the junior members code, but I could not do one without the other. I couldn’t articulate why, but I did both and it worked out. Now, we hear that is a best practice, and to wait as long as possible before doing the design and design while coding. I’d still like a well-articulated argument about why it works.

Effective Communication

Communication breakdowns are a common reason for project failures. Features of Agile methodologies such as the stand-up meeting, well-defined artifacts and having testers and business analysts interacting very closely with developers are ways to facilitate communication. Discover (the credit card company) has a Discover Lean initiative that puts business analysts and developers in the same room, accomplishing the same thing.

Summary

I was also happy to hear the experienced ThoughtWorks folks espouse an adopt-what-works methodology rather than follow a specific methodology religiously. That is very good advice in my view. After all, I was very successful with IXP – Ignorant Extreme Programming.

Windy City Perl Mongers

I enjoyed the Windy City Perl Mongers meeting last night. I met a lot of neat people and heard a great talk on CouchDB and one on an AI aggregator. A diverse group of people showed up with some serious technical knowledge and skills. Iraq veteran and author of many Perl books, Brian D Foy was there. They were appreciative of OpenMake Software’s sponsorship of YAPC NA last year.

Josh McAdams organized and hosted the meeting at the Chicago Google offices, where free beer, wine and milk were available in the fridge. Afterwards, we went to Rock Bottom Brewery and lasted until after midnight.

  • Filed under: Perl, RDBMS
  • As one who has done many version control tool A to version control tool B conversions, I know how difficult such a task is. That’s why I am all the more impressed that 20+ years of Perl history from multiple repositories have been converted to a single Git repository.

    I can’t add much more about the benefits than the announcement itself:

    http://use.perl.org/article.pl?sid=08/12/22/0830205

  • Filed under: Git, OpenMake, Perl
  • File Control Madness in Eclipse

    I found myself actually using four different file control tool plug-ins in a single Eclipse 3.4 workspace. This is not to show off; the needs are legitimate. Before proceeding, let me disclaim that I am reorganizing my Perl development on a new machine and I have everything somewhat haphazardly in a single workspace. Ideally I will have different workspaces for different projects, but until I build a standard set of preferences, particularly for EPIC Perl templates, that I can export and import into different workspaces, I’m locked into a single workspace for now.

    [Screenshot: Eclipse workspace with projects attached to CVS, Git, Subversion and Bazaar]

    If you are not familiar with Eclipse and version control (or as I call it generically “file control”) you have to install plug-ins that provide the functionality to interface with different tools. I have an EPIC plug-in that provides Perl tools, and I’ve installed EGIT for Git integration and plug-ins for Subversion and Bazaar. The CVS plug-in actually comes as part of the base Eclipse install, though that status is questionable given the popularity of Subversion and the rapid rise of Git.

    These plug-ins provide the capability to create a new project from the contents of the file control repository, or attach an existing Eclipse project to a new project under file control. You do this by right-clicking on the project and going to the “Team” menu and the “Share” item.

    Here is a quick explanation of the screen shot above. “om64Perl” comes out of our OpenMake CVS repository. The ones attached to Git are pretty obvious, with the word “Git” clearly to the right of the project name. Being a distributed repository tool, the Git repository that the projects are attached to is actually in the workspace. Then, I have an anemic open source project on SourceForge to which the “PerlSCM” project is attached via Subversion. And, finally, there is the Perl VCI project “vci” that uses Bazaar.

    There you go. Because I’m involved with three open source projects that use different file control tools, and regular work that uses another, I end up with four.

    I’ll take that last one about ensuring a drive is mapped and make a Mojo activity template out of it so you only have to enter the drive letter and the share name. Here’s another one…

    I have a workflow that can only run on a specific machine because it assumes certain software is installed (Git in this case). To fail a workflow that is not running on the correct machine, run this shell command at the beginning of the workflow:

    if not "%COMPUTERNAME%"=="ASLAN" exit 1

    I’ll make an activity template out of that one also, so then you only have to enter the computername. Note that this is case sensitive!

    It’s true we can make one out of Perl that is platform independent.
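
    A platform-independent version might look something like the following sketch, which uses the core Sys::Hostname module in place of %COMPUTERNAME%. The sub name on_expected_host is made up, and unlike the shell one-liner this version deliberately ignores case and any domain suffix; drop the lc calls if you want the strict behavior.

    ```perl
    use strict;
    use warnings;
    use Sys::Hostname;

    # True if this machine's short host name matches the expected one.
    # The comparison ignores case and any domain suffix. The second
    # argument is optional and defaults to the real host name.
    sub on_expected_host {
        my ( $expected, $actual ) = @_;
        $actual = hostname() unless defined $actual;
        ( my $short = lc $actual ) =~ s/\..*\z//s;
        return $short eq lc $expected;
    }

    # Usage: perl check_host.pl ASLAN   (script name is hypothetical)
    # Exits 1 on the wrong machine so the workflow fails at its first step.
    if (@ARGV) {
        exit( on_expected_host( $ARGV[0] ) ? 0 : 1 );
    }
    ```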

  • Filed under: Meister, Mojo, Perl
  • If you are a hard-core open source programmer, you probably use Git for project code change management instead of Subversion (I chose those words carefully). There is a lot of passion from Git advocates and, while it is not a very mature solution, it has a lot of momentum to push it forward. Merely being conceived of and written by Linus Torvalds and being used on a few large open source projects, such as the very Linux kernel itself, is enough to garner wide support.

    A great place to learn about Git is Sam Vilain’s Tutorial. He goes into a lot of detail on the benefits and how-to’s of using Git. Some of the highlights include repository space savings of over 90% and local-to-repository sync times dropping from hours in Subversion to minutes with Git. The real power of Git is in the highly distributed repositories and the ease and control of moving and accepting changes between repositories. For an open source project with a large number of developers it seems Git will really shine. Git has fine control over branching, merging and accepting or not accepting project changes according to various criteria.

    A popular way to use Git is to have Git pull from a public Subversion or CVS repository with convenient integration with those tools to a local Git repository and work from there. Friends working on the same project can easily pass changes between each other with Git and later commit back to the centralized CVS or Subversion repository. GitHub provides a simple Git repository hosting service. Doing a lot of Java work with JBoss and WebSphere, I am naturally interested in an Eclipse plug-in for Git and indeed one exists. It looks like a newborn infant, but I will check it out.

    I also have a Perl open source project that is currently pretty anemic, but I hope to revitalize it soon. I really hate the fact that I’m locked into using Subversion on SourceForge and I never came to like Subversion. I’m eager to explore moving the project to GitHub, even though I’ll probably be the only committer for awhile. Since I’m a hardcore software management person and robust Perl developer, I think Git might be my tool. I’ll let you know.

    Here comes buildr: yet another Java build tool. Hopefully I, or one of my other cohorts will check this out in detail soon. But, with my experience working with all manner of build tools, with 100 companies and many more development teams, I can already make a few observations.

    First of all, why another build tool for Java? I am occasionally told that Maven or Ant is a perfect tool, but clearly the people behind buildr don’t think so. The choice of JRuby as the vehicle for delivering this tool is, I think, probably a good one. JRuby is a scripting language in the same vein as Perl, which is used by Meister.

    Doing software builds is an ugly business involving lots of file and operating system interaction. This is not where Java shines, but scripting languages can. As long as operating systems are written in C and not Java, C-like tools will be better and faster at interacting with them. Plain Ruby itself is C-based, and JRuby no doubt inherits C-like operating traits. Calling out to a Java compiler from Perl or JRuby, though it has its own JVM, does not represent a significant overhead compared with the file system operations and the compilation/translation itself.

    Both Maven and Ant are relatively difficult to extend compared with most other build tools, and I’ll bet buildr beats them here. If you have all the Maven plug-ins and Ant tasks you need, then good for you. If not, then you have to start developing in Java and it becomes too much of an investment to sink into a build system. It is much cheaper to extend in JRuby or Perl. My frequently cited example is the XMLBeans compile step in Meister, written in Perl, which is only 40 lines of real code. The Maven plug-in is 60 pages of Java code and no one can tell me really what it is doing (I asked on all the forums). Less code is usually more transparent, which is also good for build audits.

    I am a little disappointed to see them try to placate the Maven and Ant users by promising it is a drop-in replacement for Maven and they have all the Ant tasks covered. Both tools have their drawbacks and I don’t want to see another tool with the same deficiencies. They should have the cajones (or coñejos) to apply all their resources to what they think is a better tool (with its own unique benefits and deficiencies). I imagine offering Ant task equivalents is pretty easy because of the ease of coding in JRuby compared with Java.

    They also don’t mention who is supposed to use the tool. Is it for individuals, small development teams, the enterprise? Maven falls short because it is only appropriate for development teams and not for stable, controlled, enterprise builds. Ant is not even a tool, but a means to create some tools for small teams. I don’t think Meister will fear buildr either.

    Well, since buildr is only in incubation status with Apache, I’m not sure how much time I’ll be able to spend on it, but I am curious and I’ll let you know if I find out more.

    The first rule for Bash/C/Korn shell scripts in a Perl program environment is to re-write them all in Perl. If your Perl environment has any sophistication, you will have common code, standardized logging (perhaps with Log::Log4perl), testing with Test::More, etc. and your shell scripts just can’t keep pace.

    If you share the environment with any non-Perl applications, however, you will still have to deal with the environment profile(s). I also have some legacy shell scripts that we can’t justify converting to Perl unless they have another reason to change. (Don’t change tested code in my house ~~ head bobble + finger wave ~~, nuh-uh!)

    There are two ways I know of that you can extend the benefits of your Perl implementation towards your legacy and profile shell scripts. The first is through Bahut’s excellent tip on embedding POD documentation in shell script. This solves my problem of generating HTML documentation from POD in Perl scripts and having upsetting holes where the shell scripts are. I also have some controls for the Perl scripts that run podchecker before committing to version control, which fails if no documentation is found. Now, I can extend this control to the shell scripts.
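
    To make the first idea concrete, here is a sketch that writes a small shell script with embedded POD and runs the core Pod::Checker module over it. The here-document trick shown (feeding the POD to the no-op ":" command) is one common way to hide POD from the shell; I am not claiming it is exactly the tip referred to above. The script content and the sub name pod_errors are invented for the example.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);
    use Pod::Checker;

    # A shell script whose POD sits inside a here-document fed to the no-op
    # ":" command, so the shell skips it while POD tools can still read it.
    my $shell_script = join "\n",
        '#!/bin/sh',
        q{: <<'=cut'},
        '',
        '=head1 NAME',
        '',
        'rotate_logs.sh - rotate the application logs',
        '',
        '=cut',
        '',
        'echo "rotating logs"',
        '';

    # podchecker() returns the number of POD errors, or -1 if the file
    # contains no POD at all. The report goes to an in-memory buffer.
    sub pod_errors {
        my ($path) = @_;
        open my $report, '>', \my $buffer or die "cannot open buffer: $!";
        return podchecker( $path, $report );
    }

    my ( $fh, $path ) = tempfile( SUFFIX => '.sh', UNLINK => 1 );
    print {$fh} $shell_script;
    close $fh;

    my $errors = pod_errors($path);
    print $errors == 0 ? "POD OK\n" : "POD check failed ($errors)\n";
    ```

    A pre-commit control only needs the return value: anything other than 0 means the documentation is either broken or missing.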

    The second Perl tool you can extend is the testing functionality. I’ve found the functionality in Test::More to be useful for validating that the changes to the shell environment profiles are correct and do not introduce defects. Profiles can be notoriously tricky to change when they get fat and you have variables depending on other variables. Mostly the profiles in my case are used to set environment variables that control the version control and build system, and these can be easily validated in a test script called profiles.t via checks like:

    is( $ENV{CODE_ROOT}, '/opt/code', "CODE_ROOT set to '/opt/code'" );

    You then just rattle off tests for all the variables that are set and you have a great way to validate that everything will still work after the profile change. For a legacy script, you may not be able to have a crack at the internals, but you can at least check the return code and maybe some external effect it has somewhere, such as a file timestamp change.

    eval { `legacy_script.sh` };
    ok( !$?, "legacy_script" );    # $? is zero if the script executes successfully

    profiles.t and any other test scripts used to test legacy shell code can be bundled with all the other Perl tests via Test::Harness for a single test suite that really tests everything, shell and Perl.
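
    One detail the checks above gloss over is how the profile’s variables get into %ENV in the first place. A sketch of one option: source the profile in a child shell and ask that shell for each value. This assumes a POSIX sh on the PATH; the sub name profile_value, the BUILD_HOME variable and the throwaway profile written to a temp file are all made up for the example.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);
    use Test::More;

    # Source a shell profile in a child shell and return the value it gives
    # one variable; returns undef if the shell reports an error.
    sub profile_value {
        my ( $profile, $var ) = @_;
        my @cmd = ( 'sh', '-c', '. "$1" >/dev/null && eval "printf %s \"\$$2\""',
                    'sh', $profile, $var );
        open my $out, '-|', @cmd or return;
        local $/;
        my $value = <$out>;
        close $out or return;
        return $value;
    }

    # A throwaway profile in which one variable depends on another.
    my ( $fh, $profile ) = tempfile( UNLINK => 1 );
    print {$fh} "CODE_ROOT=/opt/code\nBUILD_HOME=\$CODE_ROOT/build\nexport CODE_ROOT BUILD_HOME\n";
    close $fh;

    my %expected = ( CODE_ROOT => '/opt/code', BUILD_HOME => '/opt/code/build' );
    is( profile_value( $profile, $_ ), $expected{$_}, "$_ set to '$expected{$_}'" )
        for sort keys %expected;

    done_testing();
    ```

    Because the profile is sourced in a child process, the test never pollutes its own environment, and the same table of expected values can be run against every profile you have.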

    First, let me say how nice it is to have the Mojo workflow engine that allows us to manage the compliance checks, deploy to multiple machines in parallel and validate deployment. This makes our lives a lot easier and provides clear benefits for deployment via the parallelization, dependency management, scalability, logging and reporting. Underneath the covers, and for those of you who don’t have the luxury to use this almost-free product, there are some important low-level tools that are critical to the development, testing and operation of the Mojo JBoss deployment system on Linux.

    With the most important listed first, they are:

    1. JBoss support
    2. The Perl executable (5.6-5.10) and base language
    3. Perl’s Test::Simple or Test::More modules
    4. Perl’s Test::Harness module
    5. The JBoss twiddle.sh script or command equivalent
    6. Perl’s XML::Twig
    7. Perl’s Archive::Zip
    8. vi
    9. ssh
    10. xterm

    JBoss support wins hands down due to the number of bugs and critically important undocumented features. On a scale of 1 to 10 where 10 is the best documentation, I give JBoss about a 3 or 4. Googling doesn’t even help that much for deployment issues.

    You may be surprised at the prominence of Perl, but if you think about what you are really doing and what the best tool for the job is, it makes sense. You are really moving an archive (a ZIP format file), copying XML files, creating directories, changing permissions, extracting the archive to the file system perhaps. Where did I mention Java? Nowhere. The twiddle.sh command comes in handy if you get the secret commands from JBoss support that tell you if the application you deployed has actually started correctly. Notice that this is a shell script suggesting we’re not the first to use non-Java tools to manage deployment.

    Particularly on the testing side, I can’t think of a viable alternative to Perl testing. We need to test that we created this directory, changed that permission, updated that file timestamp, etc. We have about 300 test cases encoded in Perl that are run with every change to the deployment system. It takes about 20 seconds to write and run a simple test case in Perl.
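
    To give a flavor of those test cases, here is a self-contained sketch using only core modules. The sub deploy_step is a stand-in for a real deployment step, and the directory name, permission and marker file are invented; the three assertions are the kinds of checks described above.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempdir);
    use File::Path qw(make_path);
    use Test::More;

    # Stand-in for a deployment step: create a directory, restrict its
    # permissions and write a marker file. All names are hypothetical.
    sub deploy_step {
        my ($root) = @_;
        my $dir = "$root/deploy/conf";
        make_path($dir);
        chmod 0750, $dir or die "chmod $dir: $!";
        open my $fh, '>', "$dir/deployed.marker" or die "open marker: $!";
        print {$fh} scalar(localtime) . "\n";
        close $fh;
        return $dir;
    }

    my $root   = tempdir( CLEANUP => 1 );
    my $before = time;
    my $dir    = deploy_step($root);

    ok( -d $dir, 'configuration directory was created' );
    is( ( stat($dir) )[2] & 07777, 0750, 'directory permissions are 0750' );
    ok( ( stat("$dir/deployed.marker") )[9] >= $before - 1, 'marker timestamp was updated' );

    done_testing();
    ```

    The permission check assumes a Unix-style file system, which matches the Linux deployment described here.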

    Lessons? Use JBoss support early and often and use Perl.

    As a follow up to my article on automating XML updates, I’d like to report that I did use Excel and Perl’s XML::Twig to successfully generate XML descriptors for my web service consumer, and it was a lot easier than I thought. I’m using XFire 1.2.6 web services stack running under JBoss and using MyEclipse IDE 5.0. I’m happy to say I went from blank spreadsheet and no plan to generated XML files from spreadsheet values in one and a half hours. The implementation is of course expandable and reusable. This implementation should work for WebSphere and .NET as well.

    I needed to create different configurations for my web application so that the service request went to different endpoints for different environments. The endpoint is at an enterprise service bus (ESB) and there is a different ESB for each environment. I need to have my ‘dev’ instance of the consumer hit the ‘dev’ instance of the ESB, the ‘qa’ instance of my web app hit the ‘qa’ instance of the ESB, etc. We’ve set up Meister to pick up the correct XML file for the target environment for the build of the WAR.

    I started by setting up the spreadsheet as follows. I had an unnecessary column for Host indicating JBoss, but I hope to include WebSphere and maybe .NET as well some day. My web app actually connects to two services a.k.a. providers, so there is a column there. And, next is the configuration label for my web app with the name corresponding to the environment it is designed for. So, the first three columns of the spreadsheet look like:

    Host     Provider             Configuration
    JBoss    helloworld_service   dev
                                  int
                                  perf
                                  qa
                                  prod
    JBoss    foobar_service       dev
                                  int
                                  perf
                                  qa
                                  prod
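
    Since Host and Provider are only filled in on the first row of each group, whatever reads the sheet has to carry those values down into the blank cells. The sketch below shows one way to do that; it is not taken from my script, the sub name fill_down is made up, and the array stands in for whatever actually reads the rows out of Excel.

    ```perl
    use strict;
    use warnings;

    # Rows as they come out of the sheet: Host and Provider are blank on
    # every row after the first one in a group.
    my @rows = (
        [ 'JBoss', 'helloworld_service', 'dev' ],
        [ '',      '',                   'int' ],
        [ '',      '',                   'perf' ],
        [ '',      '',                   'qa' ],
        [ '',      '',                   'prod' ],
        [ 'JBoss', 'foobar_service',     'dev' ],
        [ '',      '',                   'int' ],
        [ '',      '',                   'perf' ],
        [ '',      '',                   'qa' ],
        [ '',      '',                   'prod' ],
    );

    # Copy the last non-empty Host and Provider down into the blank cells.
    # Returns new rows and leaves the input untouched.
    sub fill_down {
        my (@in) = @_;
        my @last = ( '', '' );
        my @out;
        for my $row (@in) {
            my @cells = @$row;
            for my $col ( 0 .. 1 ) {
                if ( defined $cells[$col] && length $cells[$col] ) {
                    $last[$col] = $cells[$col];
                }
                else {
                    $cells[$col] = $last[$col];
                }
            }
            push @out, \@cells;
        }
        return @out;
    }

    print join( "\t", @$_ ), "\n" for fill_down(@rows);
    ```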

     

    Then I needed a way to indicate the resource that would change. Right now I only have XML files, but I chose to stick with a generic URL for that. Unlike Maven or Ant generators, we start with an XML file that actually works and has been tested – not some hacked up parameterized version that takes additional effort to create. The fourth column of the spreadsheet looks like the following (with repeated entries omitted):

     

    Next, I needed a way to specify a target location to change within the XML file. Now, I know I’m going to use XPath, but I’ll want this to one day work for properties files as well, so I came up with a URL-like thing called a Universal Datum Locator (UDL), which prepends the method of locating the datum to a method-specific locator. The locator could be a property name, an XPath or a Perl regex, for example. In this case it is XPath, and the last column contains the replacement value for the datum indicated by the UDL. XPath is also very intuitive and easier to construct than it may look.

    The value for the UDL looks like:

    xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value

    So the fifth column contains the UDLs, which in my case are always the same XPath expression. The final column of the spreadsheet contains the replacement value of the datum indicated by the UDL:

    Value
    http://devesb/esb/helloworld_service/services/HelloWorldJBossService
    http://intesb/esb/helloworld_service/services/HelloWorldJBossService
    http://peresb/esb/helloworld_service/services/HelloWorldJBossService
    http://accesb/esb/helloworld_service/services/HelloWorldJBossService
    http://prdesb/esb/helloworld_service/services/HelloWorldJBossService

    http://devesb/esb/foobar_service/services/FooBarJBossService
    http://intesb/esb/foobar_service/services/FooBarJBossService
    http://peresb/esb/foobar_service/services/FooBarJBossService
    http://accesb/esb/foobar_service/services/FooBarJBossService
    http://prdesb/esb/foobar_service/services/FooBarJBossService

     

    My nifty Perl script is only about 80 lines of real code and because XML::Twig is nearly the best thing in the world, I pass the entire XPath in as a hash key to modify the source XML file:

    use XML::Twig;

    my $twig = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            # called for every element matching the XPath from the UDL column
            "$xpath" => sub {
                $_->set_text($new_datum);
            },
        },
    );

    Here, “$xpath” is directly from the “UDL” column of the spreadsheet with only the ‘xpath://’ stripped off and “$new_datum” is directly from the “Value” column. That’s a pretty useful one-line subroutine if you ask me. I had the new XML files each generated into a different folder (dev/, int/, etc.). Then, I checked them into version control (CA Harvest) and built each of them with Meister. If you want the full code, let me know and I’ll post it somewhere.
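
    The stripping step generalizes naturally into a small UDL parser. The sketch below is illustrative and not lifted from my script: the sub name parse_udl is made up, and only the xpath method is actually used here; property and regex are accepted simply because they are the other locator types mentioned earlier.

    ```perl
    use strict;
    use warnings;

    # Locating methods this sketch accepts. Only "xpath" is used in practice.
    my %known_method = map { $_ => 1 } qw(xpath property regex);

    # Split a UDL such as "xpath://beans/bean/value" into the locating
    # method and the method-specific locator. Dies on anything else.
    sub parse_udl {
        my ($udl) = @_;
        my ( $method, $locator ) = $udl =~ m{\A([a-z]+)://(.+)\z}s
            or die "not a UDL: $udl\n";
        die "unknown UDL method: $method\n" unless $known_method{$method};
        return ( $method, $locator );
    }

    my ( $method, $locator ) = parse_udl(
        q{xpath://beans/bean[@factory-bean='xfireProxyFactory']/constructor-arg[@index='1']/value}
    );
    print "$method\n$locator\n";
    ```

    The locator that comes back is what would be handed to XML::Twig as the handler key, and a properties-file handler could later be dispatched on the method name.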

    I did find working with the Excel 2003 XML Spreadsheet format a tiny bit awkward. You have to keep track of the column and row indices, but not bad other than that. I see Microsoft Word 2007 allows you to save as an XML document directly, but you apparently have to define bindings. I’ll have to check that out.