New Patterns for Redfish-Codegen
I haven’t spent very much time working on the Redfish-Codegen project lately, unfortunately. There’s an open PR out there that’s festering, and I’m long overdue for a release. I’ve been fixating on more important (read: vain) issues that I haven’t found a path forward on, and I’m experiencing some executive dysfunction.
The first problem is the technical complexity of the code generator, relative
to the size of the application. The constructor for the main class of the code
generator application is 113 lines long, and while I am pleased that we managed
to keep expert information localized in this area of the codebase for so
long, it’s become quite unwieldy. Mustache templates for code generation have
become cumbersome and bug-prone to maintain, and there’s a monstrous 8.81 MiB
patch file under version control that performs a simple transformation on all
of the input schemas. Naturally, this patch is applied using quilt(1), and
it’s generated by a Python script, which is called from a shell script. There
are a lot of skills that a contributor needs to be effective.
Contributors to this codebase know Rust. We can count on that, or else they wouldn’t be considering this solution. Anything else is extra. So, I’d like to gradually rewrite the code generator in Rust.
The ultimate pattern for gradual replacement of legacy systems is the Strangler Fig pattern. In this pattern, rewrites can proceed as long as they don’t cross the seam between two components. Rewriting a component is done wholesale, but as long as components are isolated, we are saved from the trap of the rewrite spiraling out of control. In order to apply this pattern, though, we need seams. That’s where my next pattern comes in.
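To make the idea of a seam concrete, here is a minimal sketch in Rust. Every name in it (`SchemaTransform`, `LegacyTransform`, `RustTransform`, `select_transform`) is hypothetical and invented for illustration; none of it comes from the Redfish-Codegen codebase, and the line-ending normalization is only a stand-in for whatever a real component does.

```rust
/// The seam: callers depend only on this trait, never on a concrete
/// component. (Hypothetical name, for illustration only.)
trait SchemaTransform {
    fn apply(&self, schema: &str) -> String;
}

/// Stand-in for the legacy component. In a real migration this might
/// wrap the existing tooling; here it just normalizes line endings.
struct LegacyTransform;

impl SchemaTransform for LegacyTransform {
    fn apply(&self, schema: &str) -> String {
        schema.replace("\r\n", "\n")
    }
}

/// Stand-in for the rewritten component. It is replaced wholesale, but
/// it honors the same contract, so nothing on the other side of the
/// seam has to change.
struct RustTransform;

impl SchemaTransform for RustTransform {
    fn apply(&self, schema: &str) -> String {
        schema.split("\r\n").collect::<Vec<_>>().join("\n")
    }
}

/// The only place that knows which implementation is live. Flipping the
/// flag swaps one component without touching its neighbors.
fn select_transform(use_rewrite: bool) -> Box<dyn SchemaTransform> {
    if use_rewrite {
        Box::new(RustTransform)
    } else {
        Box::new(LegacyTransform)
    }
}
```

Because both implementations sit behind the same trait, the rewrite can be checked against the legacy behavior on the same inputs before the old component is retired.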
Since the beginning, the code generator has employed a kind of batch-processing technique, similar to the batch-sequential processing architecture pattern. In this pattern, connectors pass data between stages, which transform the data from one form to the next. Ideally, stages are decoupled from the implementation of adjacent stages, coupled only to the representation of the data that they receive. Each stage receives its input in its entirety and produces its entire output before the next stage begins. This is a common architectural style for compilers and code generators of all kinds. In the canonical implementation, stages are allowed (or expected) to terminate when they have completed their processing.
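As a rough illustration of stages and connectors, the sketch below models a two-stage batch-sequential pipeline in Rust. The `Stage` trait, the `Then` connector, and the two toy stages are assumptions made up for this example; they are not the types used in Redfish-Codegen.

```rust
/// A stage consumes its entire input and returns its entire output.
/// (Hypothetical trait, for illustration only.)
trait Stage {
    type Input;
    type Output;
    fn run(&self, input: Self::Input) -> Self::Output;
}

/// Toy first stage: split raw text into one schema name per line.
struct ParseNames;

impl Stage for ParseNames {
    type Input = String;
    type Output = Vec<String>;

    fn run(&self, input: String) -> Vec<String> {
        input
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect()
    }
}

/// Toy second stage: render each name as a unit struct declaration.
struct EmitStructs;

impl Stage for EmitStructs {
    type Input = Vec<String>;
    type Output = String;

    fn run(&self, input: Vec<String>) -> String {
        input
            .iter()
            .map(|name| format!("pub struct {name};\n"))
            .collect()
    }
}

/// Connector: runs `first` to completion, then hands its whole output
/// to `second`. The stages only agree on the intermediate data type.
struct Then<A, B> {
    first: A,
    second: B,
}

impl<A, B> Stage for Then<A, B>
where
    A: Stage,
    B: Stage<Input = A::Output>,
{
    type Input = A::Input;
    type Output = B::Output;

    fn run(&self, input: A::Input) -> B::Output {
        let intermediate = self.first.run(input);
        self.second.run(intermediate)
    }
}

/// Assemble the pipeline and run it over a block of schema names.
fn generate(schema_names: &str) -> String {
    let pipeline = Then {
        first: ParseNames,
        second: EmitStructs,
    };
    pipeline.run(schema_names.to_string())
}
```

In this sketch the connector, rather than either stage, decides when each transformation runs, and the compiler rejects a pipeline whose adjacent stages disagree about the data passed between them.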
This isn’t formalized in the architecture, however, so code for one “stage” is mixed with code from another stage. When reading the code, one has to stare for a long time to figure out when a given transformation might be applied during code generation.
This sucks.
Keeping all of the expert information localized to the main class was very useful for discovering this, however, since it made the problem painfully obvious as soon as it appeared.
So, I implemented a few types in Rust that will allow us to begin reconstructing the pipeline, with first-class abstractions that represent our architecture pattern, which I’ll describe in my next post!