One of my springtime projects at work was to button up a recently-refreshed process to transmit some billing data between systems. By 'recently-refreshed,' I mean 'finally made a process whereby a human doesn't have to manually generate a file which had been the de-facto process for the better part of twelve years. Due to the cascading effects of some staff turnover in the unit in question, IT was brought into the loop to help find a better way to work this process.
In January we did a soft-run of the transport process with a version of the file that I'd pseudo-automatically created. The key improvement in January's run was to ensure the transport was seamless between the source system, our automation environment, and the destination systems, so other than ensuring the file met the proper specification, very little proper automation was involved in its manipulation. Since it was a simple text file, the modifications necessary were super easy to do with some creative find/replace functions in Visual Studio Code (you are using VSCode for your everyday work, right?).
This One's Different
I have a lot of processes in production for manipulating and generating import/transport files. Oddly enough, though, for all of those things in production I didn't have one that used something super old school (to me): fixed width fields. The destination system in this particular transport process is incredibly picky about its character spacing, so we have to ensure it's character-perfect. In our testing during the January process, we discovered a few spacing issues that needed to be addressed, and I am glad we discovered those at the time because it made the buttoning-up process go much more smoothly.
Three Steps
If we ignore the steps of reading an input file and writing an output file, the process boils down to three relatively simple steps:
- Trim up and sanitize our input fields;
- Add to some running calculations (record count and sum data); and
- Stitch together a set of strings for writing to file.
That Weird Third Step
In PHP, I'm actually combining part of steps 1 and 3 in one line, which has the added 'benefit' of making me feel super gross about myself, but it works (and I don't have to worry about incoming data that unexpectedly went sideways, along the lines of this:
$outputText .= str_pad(trim($row[0]), 11) . str_pad(trim($row[1]), 9) . trim($row[2]) . "\r\n";
It just feels a bit wrong to be doing so many string operations in succession, especially gross because we're padding out the very strings we're trimming. But it's a necessary step: undoubtedly somewhere along the line we'll wind up with an input that contains some leading spaces or something and that'll inject all sorts of weird behavior in the output.
Ultimately, the output above has three fields for transport. The first field is commonly seven characters, but the second field must begin at character 12, so we pad the first string to 11 characters. The third field starts at character 21, so we pad the second field to 9 characters. The third field is actually comprised of several same-width data points concatenated (in a different step) into one "string," so we simply append it and close with a carriage return.
Seems Archaic, But It Works
It would be easy for me to criticize the destination system for forcing something so 'archaic' as a fixed-width import file, but at the end of the day once I knew all of the data points (and their starting characters), the process including the above snippet really only took about a half-hour to pull together and test. Thanks to diff
, I could verify that the output of my automated process generated a file that was identical to that which was successfully transported in January.
Come the next time we have to schedule a transport, we will have removed about 90% of the manual effort involved in this process. I will take that as a win for all involved!
Headline image via giphy