Nov 22, 2005
Age Old Problem - Regression Testing of Print Output
By Al Cooper, COPI
Problem Background
Most companies have faced the issue of how to manage changes to output application programs by implementing very strict Change Management procedures. Such procedures require extensive testing of the modifications as well as regression testing of new modules against previous cycles of test data. Many procedures even require extended periods of parallel processing of production data against both the existing production code as well as the newly modified code to ensure that oversights don’t slip through the cracks.
During my years at COPI as a Senior Output Analyst, I have experienced a large variety of methods for verifying that changes to output programs did not inadvertently lead to unwanted changes. One thing that was common to all of the methods I have seen is that they couldn’t guarantee 100% accuracy.
As we all know, the first full production run on the new programs is when you are most likely to encounter problems. Having a major batch application reach a successful end of job on the first day of full production is often thought of as a major milestone. For transaction output, this is only the beginning of the waiting game to find out if anything was actually wrong with the output.
Why does it take so long to find out if something went wrong with transaction output? Because the customers, as recipients, are often the final judges regarding unexpected errors. Fortunately for most application developers, they don’t have to face the customers directly. However, they do have to face the internal business managers who represent customers, and these managers tend not only to pass along the customer’s ire but also to add a bit of their own.
Using Eyeballs to Solve the Problem
In order to prevent this type of scenario, companies that produce external customer documents often spend a significant amount of time and money in managing application program changes.
When I started as a programmer in the mid-60s, I was often asked to make a quick change to a report before it was run with no more than the user’s verbal instructions. Today that type of change is rarely allowed, and most often small changes must be rolled into regularly scheduled maintenance releases. Many application development organizations only allow releases to be installed 2 to 3 times per year, which causes a lot of frustration among their users, but these schedules are mandated by the need to perform extensive regression testing on all changes.
Regression testing, for those who may not be familiar with the term, is the process of rerunning data from previous production or test cycles through the modified application program(s) and comparing the new output to the old to make sure that the intended changes worked and that nothing else changed. With printed output this almost always involves printing seemingly duplicate piles of paper output, sitting down in front of the two piles and comparing the old to the new.
In the 60s and 70s, when line mode print output was the norm, we used a simple utility called “Tape Compare” which would compare 2 tape files (one containing the old print file and the other containing the new print file) and it would highlight any differences. This utility was of limited use, so most of the time we sat down with two piles of paper and “eyeballed” them.
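To make the idea of a line mode comparison concrete, here is a minimal sketch in Python. It is not the “Tape Compare” utility itself; the function name, the sample records and the choice to ignore trailing blanks are all assumptions made for illustration.

```python
from itertools import zip_longest


def compare_print_lines(old_lines, new_lines):
    """Return (line_number, old, new) for every position where two print files differ.

    Hypothetical helper: a missing line in the shorter file is reported as None.
    """
    differences = []
    for number, (old, new) in enumerate(zip_longest(old_lines, new_lines), start=1):
        # Assumption: trailing blanks are not significant, since line mode
        # records are often padded with spaces to a fixed length.
        old_text = None if old is None else old.rstrip()
        new_text = None if new is None else new.rstrip()
        if old_text != new_text:
            differences.append((number, old_text, new_text))
    return differences


# Made-up sample data standing in for the old and new print files.
old_run = ["ACCOUNT 1001   BALANCE   250.00", "ACCOUNT 1002   BALANCE    75.10"]
new_run = ["ACCOUNT 1001   BALANCE   250.00", "ACCOUNT 1002   BALANCE    57.10"]

for number, old, new in compare_print_lines(old_run, new_run):
    print(f"line {number}: old={old!r} new={new!r}")
```

A comparison like this sees only the text records, which is why it stops being useful once forms and graphics carry part of the page content.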
With the advent of page printing protocols, including Xerox Metacode & AFP, which support the inclusion of electronic forms as well as graphics, a simple data comparison became next to useless as many of the changes being made were in these “non-data” entities within the print file that wouldn’t show up in a data comparison utility.
As such, until recently, the onus for verifying the content and format of printed output has fallen on the human eye. As we all know, eyeballing two large stacks of documents is not only time consuming but leads to eye strain and eventually to a flippant and often careless attitude toward the comparison. Often, one of the biggest mistakes made during this type of comparison is in using the developers themselves to make the comparison. They’ll look at the output and see “what they expect to see” and often miss small discrepancies that would be easily noticed by an independent user.
The challenges noted have led many to ask over the years, “How can we improve on the eyeball methodology?”
Using Technology to Solve the Problem
In the past few months, I have been exposed to a few different software products that can digitally compare one print file to another without ever putting toner or ink to paper. The level of comparison sophistication varies from product to product. Some products compare only the variable data while others go so far as to compare raster images and identify very subtle differences in such things as logos and signatures.
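As a rough illustration of what a raster comparison means, the sketch below compares two tiny bitmaps pixel by pixel. It does not describe how any particular product works; the function name and the representation of a bitmap as nested lists of 0 and 1 values are assumptions, and real tools work on rendered pages at printer resolution.

```python
def differing_pixels(benchmark, test):
    """Return (row, column) for every pixel that differs between two bitmaps.

    Hypothetical helper: each bitmap is a list of rows, each row a list of 0/1 values.
    """
    same_shape = len(benchmark) == len(test) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(benchmark, test)
    )
    if not same_shape:
        raise ValueError("bitmaps must have the same dimensions")
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(benchmark, test))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if a != b
    ]


# Made-up 3x4 "logo" in which a single pixel has changed.
logo_old = [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0]]
logo_new = [[0, 1, 1, 0], [1, 0, 1, 1], [0, 1, 1, 0]]

print(differing_pixels(logo_old, logo_new))
```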
Some of these products involve sophisticated Graphical User Interfaces (GUIs) that show a benchmark or control file on one side of the screen and the test version on the other side, with other windows for highlighting discrepancies. Also, most of these products support a method for selecting differences identified in one scan and allowing subsequent scans to ignore the previously identified differences. Some products run batch-only comparisons and log the differences while others run in an interactive mode on a workstation and display the differences immediately.
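The idea of letting later scans ignore differences that were already reviewed can be sketched as follows. This is only an illustration under stated assumptions: the function name, the made-up statement lines and the choice to key an accepted difference on its line number plus old and new text are mine, not a description of any product.

```python
from itertools import zip_longest


def new_differences(benchmark, test, accepted):
    """Report differing lines, skipping any difference already in the accepted set.

    Hypothetical helper: `accepted` holds (line_number, old, new) tuples that a
    reviewer marked as expected in an earlier scan.
    """
    report = []
    for number, (old, new) in enumerate(zip_longest(benchmark, test), start=1):
        if old == new:
            continue
        if (number, old, new) in accepted:
            continue  # already reviewed and approved
        report.append((number, old, new))
    return report


# Made-up benchmark and test output.
benchmark = ["STATEMENT DATE 10/31/2005", "TOTAL DUE 120.00"]
test = ["STATEMENT DATE 11/30/2005", "TOTAL DUE 210.00"]

first_scan = new_differences(benchmark, test, set())
accepted = {first_scan[0]}  # the reviewer approves the expected date change
second_scan = new_differences(benchmark, test, accepted)
print(second_scan)
```

After the date change is accepted, the second scan reports only the remaining difference, so the reviewer’s attention goes to the line that still needs an explanation.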
The number of platforms supported by these products varies greatly from mainframe, UNIX & AIX to Windows. These products also support a variety of print protocols including AFP, Xerox, PDF and more.
Why should someone consider one of these automated change management products? I have heard a number of marketing phrases used, but the ones that I think are most applicable to these products include:
- “Reduced Time to Market”
- “Reduction in Customer Complaints about Erroneous Output”
- “Reduction in Testing Costs”
- “Improved Change Management Control”
Reduced Time to Market
Most application development projects involve extended periods of time for regression testing in order to identify modifications that didn’t get applied in all appropriate circumstances. These project extensions also help to ensure that those little, seemingly unrelated problems caused by inadvertent logic changes to some other area of the program don’t sneak through, resulting in erroneous data going to a customer. A lot of this time is spent in physically comparing the output.
In many output centers today, the physical output is not produced in the same building as the developers or testing organization charged with performing the regression test. As such, a major factor in this testing effort is the turnaround and delivery time needed to get the physical output to the people who are going to do the actual comparison.
All of this time can be significantly reduced through the use of an automated change management tool resulting in shorter project extensions with faster implementation of application changes.
Reduction in Customer Complaints about Erroneous Output
At one point or another in our careers we face the unpleasant task of receiving communications from upset customers claiming that their output was incorrect. Since it is a good idea to keep your customers happy, it is important to eliminate, or at a minimum reduce, the number of errors that are allowed to reach the customer base. In order to accomplish this, effective regression testing is a necessity - not a luxury.
Being able to electronically compare an old version of a customer print file to the latest test version will help ensure that the expected changes reach the customer and that the unexpected “gotchas” don’t.
Reduction in Testing Costs
Most people don’t understand how the purchase of yet another product can actually reduce the cost of testing and, as a result, the overall cost of a project. There are two major areas of direct cost in the old-fashioned method of regression testing of print output that an automated tool can reduce: print consumables and operator overtime.
First, during normal regression testing, complete files are printed. Some of these files can be huge, and both the benchmark and the test files must usually be printed, which results in even more paper, toner and “click charges” to print these files. Second, in cases where these files can’t be produced during the normal work day, overtime charges are assessed for the operators to stick around to print this output. These expenses are often considered to be “corporate overhead” but they are, in fact, “corporate expenses” that can be eliminated through the use of automated change management tools.
Improved Change Management Control
This marketing catch phrase is really a combination of the other advantages. Reduced time to market, fewer customer complaints and lower testing costs are all indicators of an improved Change Management System. I expect the savings and advantages brought by automated change management tools to increase their use dramatically over the next few years.
At this point I would have to say that some of these products are still in their infancy, and they can all use improvements, but they have made a major step forward in the fight against erroneous data being sent to our customers.
If you currently rely on the old “eyeball” method of regression testing for printed output, I strongly recommend that you start investigating these products. I’m sure you’ll be seeing many announcements of new products and improvements to these products over the next months in this publication and others. Take the time to check them out.
If you have ideas on this topic or would like to discuss this age old problem further, please drop me an e-mail at ac@888999copi.com.