Repeated validation could mean it’s time for an object rather than a primitive
It’s very easy to take “Don’t Repeat Yourself” (DRY) to an illogical extreme. It’s also possible to take primitive avoidance in object-oriented computer programming languages (like C++ and Java) to an illogical extreme, something that I’ve written about before.
However, there are times when DRY can lead you to a more elegant solution, such as when it helps you see that restricting validation to object constructors is more elegant than repeatedly checking primitives for validity.
Elementary math will be enough to follow along with the examples in this article, even though they deal with a few unfamiliar mathematical concepts (which are not necessarily difficult).
I suppose I could have tried to come up with a toy example. Instead, I'm drawing the examples from a real program, and I will make sure to explain the less familiar mathematical concepts that are necessary to understand its entry validation.
I’m working on a program that draws diagrams of prime numbers in various sets of numbers called “rings.” My first version of the program dealt with “imaginary quadratic” rings only.
To get started, the program needs a negative squarefree integer, like −2 or −47. The default choice is −1.
A squarefree number is an integer that is not divisible by any perfect square greater than 1. It is easy to see that 2, 3, 5, 6, 7, 10 are all squarefree, while 4, 8 and 9 are not: 4 and 8 are both divisible by 4 = 2², and 9 = 3².
I assume that I don’t need to explain what negative numbers are. Anyone who studies object-oriented programming languages should have come across negative numbers when learning about the primitive data types.
So −2, −3, −5, −6, −7, −10 are also squarefree, but −4, −8 and −9 are not, since both −4 and −8 are divisible by 4, and −9 is divisible by 9.
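The definition of squarefree translates fairly directly into a trial-division check. What follows is a minimal sketch, not the program's actual code: the class name SquarefreeCheck and the method name isSquarefree are hypothetical, and treating 0 as not squarefree is an assumption.

```java
// Hypothetical helper for illustration; not taken from the program in this article.
final class SquarefreeCheck {

    private SquarefreeCheck() {
    }

    // Returns true if no perfect square greater than 1 divides n.
    // Assumption: 0 counts as not squarefree, since every square divides it.
    static boolean isSquarefree(int n) {
        if (n == 0) {
            return false;
        }
        // Work in long so that the absolute value of Integer.MIN_VALUE fits.
        long abs = Math.abs((long) n);
        for (long root = 2; root * root <= abs; root++) {
            if (abs % (root * root) == 0) {
                return false;
            }
        }
        return true;
    }
}
```

Because only the absolute value matters, the same check covers both the positive examples and their negative counterparts.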
What was originally called RingWindowDisplay receives an int primitive, which is checked for being negative, being squarefree, and not being below an arbitrary limit of −8191 (I had an even lower number as the original arbitrary limit).
Once RingWindowDisplay verifies the int primitive satisfies these requirements, it passes the int along to the ImaginaryQuadraticRing constructor, which also checks that the int is negative and squarefree.
The arbitrary limit of −8191 does not apply to the ImaginaryQuadraticRing constructor. The only reason you can’t use −2147483648 is that it’s divisible by 4, 16, 64, etc.
But you can use −2147483647 (which is Integer.MIN_VALUE + 1) for the ImaginaryQuadraticRing constructor, though that wouldn’t lead to very interesting diagrams in RingWindowDisplay; I would need to come up with a different drawing algorithm, but that’s outside the scope of this article.
It feels redundant to have both the RingWindowDisplay constructor and the ImaginaryQuadraticRing constructor check that a given int is negative and squarefree.
For at least a couple of reasons it would be wrong to assume that the ImaginaryQuadraticRing constructor will always be given a proper initialization parameter.
So the ImaginaryQuadraticRing constructor must do some validation. Then is it necessary for the RingWindowDisplay constructor to do any validation at all? Perhaps only if I care to enforce the arbitrary −8191 limit.
Eventually it occurred to me that it just made a lot more sense for the RingWindowDisplay constructor to receive an ImaginaryQuadraticRing object and not do the same validation that the ImaginaryQuadraticRing constructor does.
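To make the idea concrete, here is a rough sketch of validation living only in the ring's constructor. The class names come from this article, but the bodies, the field names and the getRadicand and getRing methods are assumptions made for illustration, and the display class is stripped of all drawing code.

```java
// Sketch only: class names are from the article, everything else is assumed.
final class ImaginaryQuadraticRing {

    private final int radicand;

    // The one place where "negative and squarefree" is enforced.
    ImaginaryQuadraticRing(int d) {
        if (d >= 0) {
            throw new IllegalArgumentException(d + " is not negative");
        }
        if (!isSquarefree(d)) {
            throw new IllegalArgumentException(d + " is not squarefree");
        }
        this.radicand = d;
    }

    int getRadicand() {
        return radicand;
    }

    private static boolean isSquarefree(int n) {
        long abs = Math.abs((long) n);
        for (long root = 2; root * root <= abs; root++) {
            if (abs % (root * root) == 0) {
                return false;
            }
        }
        return abs != 0;
    }
}

final class RingWindowDisplay {

    private final ImaginaryQuadraticRing ring;

    // No sign or squarefree check here: any ring object that exists has
    // already passed both checks in its own constructor.
    RingWindowDisplay(ImaginaryQuadraticRing ring) {
        if (ring == null) {
            throw new NullPointerException("ring must not be null");
        }
        this.ring = ring;
    }

    ImaginaryQuadraticRing getRing() {
        return ring;
    }
}
```

With this shape, the only thing left for the display constructor to guard against is a null reference.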
I’m not going to bore you with a complete history of the changes. The pertinent detail here is that the program needs to construct an ImaginaryQuadraticRing object in order to construct a RingWindowDisplay object.
However, there is still another bit of repeated validation elsewhere in the program: the end user can type a new negative squarefree number into a dialog box to select a different imaginary quadratic ring diagram to look at.
There is a big difference between the RingWindowDisplay constructor and the dialog box for the user to enter the number: the former is normally called within the program, while the latter normally responds to the action of an end user who may not know and does not need to know how any of this works under the hood.
I wrote the dialog box subroutine to make a lot of quiet substitutions, like multiplying positive numbers by −1. Also, if the user enters a number that is not squarefree, like −162, the program will substitute the next lower squarefree number, like −163 (provided it’s not below the arbitrary limit of −8191, in which case it will substitute the arbitrary limit).
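The quiet substitutions described above could look something like the following sketch. It is not the program's actual dialog box code: the class name RadicandInput, the method name normalize and the constant name MINIMUM_RADICAND are hypothetical, and mapping an entry of 0 to the default of −1 is an assumption, since the article does not say what happens to 0.

```java
// Hypothetical sketch of the dialog box's quiet substitutions.
final class RadicandInput {

    // The arbitrary lower limit mentioned in the article.
    static final int MINIMUM_RADICAND = -8191;

    private RadicandInput() {
    }

    static int normalize(int entered) {
        // Positive entries are quietly multiplied by -1.
        int d = (entered > 0) ? -entered : entered;
        // Assumption: 0 falls back to the default choice of -1.
        if (d == 0) {
            d = -1;
        }
        // Anything below the limit is replaced by the limit itself.
        if (d < MINIMUM_RADICAND) {
            return MINIMUM_RADICAND;
        }
        // Step down to the next lower squarefree number. The loop cannot
        // pass the limit because 8191 is prime, hence -8191 is squarefree.
        while (!isSquarefree(d)) {
            d--;
        }
        return d;
    }

    private static boolean isSquarefree(int n) {
        long abs = Math.abs((long) n);
        for (long root = 2; root * root <= abs; root++) {
            if (abs % (root * root) == 0) {
                return false;
            }
        }
        return abs != 0;
    }
}
```

Under these assumptions, normalize(162) and normalize(-162) both give -163, and an entry such as -10000 gives -8191.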
Like I said, the end user does not need to know how this works under the hood, and might not want to. It makes little difference to them if there is repeated validation or not, as long as it does not affect program performance in a way they can notice.
The dialog box subroutine could be rewritten so that instead of checking whether the user’s number is squarefree or not, it passes the number along to the ImaginaryQuadraticRing constructor and catches an IllegalArgumentException if that arises.
This could potentially mean trying to construct an ImaginaryQuadraticRing object several times, and several IllegalArgumentException objects would be constructed along the way.
The actual performance hit wouldn’t even be a hiccup. The worst case scenario, with the arbitrary limit of −8191, would be having to skip over five numbers, according to entry A045882 in the On-Line Encyclopedia of Integer Sequences (OEIS).
For example, if the user enters −844 (which is divisible by 4), the validation would skip over −844, −845 (divisible by 13²), −846 (divisible by 9), −847 (divisible by 11²) and −848 (divisible by 16) to finally land on −849, which is thrice −283.
That would be five failed attempts to create an ImaginaryQuadraticRing object, and five successful creations of IllegalArgumentException objects.
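For comparison, the construct-and-catch alternative might be sketched as follows. Everything here is hypothetical: ConstructAndCatch, its nested Ring stand-in for ImaginaryQuadraticRing, the nextLowerRing method and the lastFailureCount field exist only to make the failed attempts countable. The sketch assumes the starting number is already negative and no lower than −8191.

```java
// Hypothetical sketch of relying on the constructor's exceptions.
final class ConstructAndCatch {

    // Minimal stand-in for ImaginaryQuadraticRing; the body is an assumption.
    static final class Ring {

        final int radicand;

        Ring(int d) {
            if (d >= 0 || !isSquarefree(d)) {
                throw new IllegalArgumentException(d + " is not negative and squarefree");
            }
            this.radicand = d;
        }
    }

    // How many IllegalArgumentException objects the last search caught.
    static int lastFailureCount;

    private ConstructAndCatch() {
    }

    // Assumes d is negative and not below -8191, as the dialog box ensures.
    static Ring nextLowerRing(int d) {
        int failures = 0;
        while (true) {
            try {
                Ring ring = new Ring(d);
                lastFailureCount = failures;
                return ring;
            } catch (IllegalArgumentException iae) {
                // Each failed attempt costs one exception object.
                failures++;
                d--;
            }
        }
    }

    private static boolean isSquarefree(int n) {
        long abs = Math.abs((long) n);
        for (long root = 2; root * root <= abs; root++) {
            if (abs % (root * root) == 0) {
                return false;
            }
        }
        return abs != 0;
    }
}
```

Starting from -844, this lands on a ring with radicand -849 after catching five exceptions, matching the count worked out above.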
The program would display Z[√−849] at whatever zoom level happens to be set a lot faster than it would respond to an input of −3 at a zoom level of 2 pixels per unit interval, which takes a noticeable fraction of a second.
In this particular instance, I decided it was better to repeat the validation rather than rely on a constructor throwing exceptions, even though the performance hit is minuscule.
Even with the repeated validation of the user input elsewhere in the program, having one constructor rely on another constructor for validation is a more adaptable approach as I strive to explore algebraic integers in purely real quadratic rings, or at higher degrees, like in simple cubic real rings.
So I created the interface IntegerRing, which QuadraticRing implements. In turn, both RealQuadraticRing and ImaginaryQuadraticRing extend QuadraticRing (a textbook inheritance hierarchy).
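One way that hierarchy could divide up the validation is sketched below. The four type names are from this article; the method getMaxAlgebraicDegree, the radicand field, and the decision to check squarefreeness in QuadraticRing and the sign in each subclass are assumptions made for illustration. The sketch also rejects 1 in RealQuadraticRing, since the square root of 1 is rational.

```java
// Sketch only: type names are from the article, members are assumed.
interface IntegerRing {

    // Hypothetical method: the highest algebraic degree of numbers in the ring.
    int getMaxAlgebraicDegree();
}

abstract class QuadraticRing implements IntegerRing {

    protected final int radicand;

    // Shared check: every quadratic ring here needs a squarefree parameter.
    protected QuadraticRing(int d) {
        if (!isSquarefree(d)) {
            throw new IllegalArgumentException(d + " is not squarefree");
        }
        this.radicand = d;
    }

    @Override
    public int getMaxAlgebraicDegree() {
        return 2;
    }

    public int getRadicand() {
        return radicand;
    }

    private static boolean isSquarefree(int n) {
        long abs = Math.abs((long) n);
        for (long root = 2; root * root <= abs; root++) {
            if (abs % (root * root) == 0) {
                return false;
            }
        }
        return abs != 0;
    }
}

final class ImaginaryQuadraticRing extends QuadraticRing {

    ImaginaryQuadraticRing(int d) {
        super(d);
        if (d >= 0) {
            throw new IllegalArgumentException(d + " is not negative");
        }
    }
}

final class RealQuadraticRing extends QuadraticRing {

    RealQuadraticRing(int d) {
        super(d);
        if (d < 2) {
            throw new IllegalArgumentException(d + " is not greater than 1");
        }
    }
}
```

Each check then lives in exactly one constructor, and anything that accepts an IntegerRing can trust that the object it receives is valid.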
Perhaps later on, as I figure out the math, I will write SimpleRealCubicRing and SimpleImaginaryCubicRing, both of which will extend CubicRing, an abstract class I’ve already started to write.
Along similar lines, I renamed RingWindowDisplay as ImagQuadRingDisplay and made it extend the abstract RingDisplay class. There will definitely be RealQuadRingDisplay, and maybe there will also be SimpleRealCubicRingDisplay and SimpleImagCubicRingDisplay, all of which extend RingDisplay.
The RingDisplay constructor requires an implementation of IntegerRing. The classes that extend RingDisplay can narrow that as needed, e.g., the ImagQuadRingDisplay constructor takes an ImaginaryQuadraticRing object which it then passes up to the RingDisplay constructor (by invoking super).
Likewise the RealQuadRingDisplay constructor would take a RealQuadraticRing object which it passes up to the RingDisplay constructor.
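The narrowing can be sketched like this, with all drawing code left out. The type names are from this article, while the stand-in ring class, the ring field, the getRing method and the null check are assumptions for illustration only.

```java
// Sketch only: type names are from the article, members are assumed.
interface IntegerRing {
}

// Minimal stand-in; the real class would do more than validate.
final class ImaginaryQuadraticRing implements IntegerRing {

    final int radicand;

    ImaginaryQuadraticRing(int d) {
        if (d >= 0 || !isSquarefree(d)) {
            throw new IllegalArgumentException(d + " is not negative and squarefree");
        }
        this.radicand = d;
    }

    private static boolean isSquarefree(int n) {
        long abs = Math.abs((long) n);
        for (long root = 2; root * root <= abs; root++) {
            if (abs % (root * root) == 0) {
                return false;
            }
        }
        return abs != 0;
    }
}

abstract class RingDisplay {

    private final IntegerRing ring;

    // Accepts any ring; the ring's own constructor did the validation.
    protected RingDisplay(IntegerRing ring) {
        if (ring == null) {
            throw new NullPointerException("ring must not be null");
        }
        this.ring = ring;
    }

    IntegerRing getRing() {
        return ring;
    }
}

final class ImagQuadRingDisplay extends RingDisplay {

    // The parameter type is narrowed, then handed up unchanged via super.
    ImagQuadRingDisplay(ImaginaryQuadraticRing ring) {
        super(ring);
    }

    // Covariant return type; the cast is safe because the only
    // constructor accepts nothing but ImaginaryQuadraticRing.
    @Override
    ImaginaryQuadraticRing getRing() {
        return (ImaginaryQuadraticRing) super.getRing();
    }
}
```

A RealQuadRingDisplay would follow the same pattern with a RealQuadraticRing parameter.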
Since the end user won’t generally be calling any of these constructors directly, the constructors can simply rely on the constructors of other objects to take care of the validation that is relevant to them instead of repeating it.
It’s different in the case of something that depends on user input. You certainly don’t want the program to just suddenly crash for an invalid input. If the program is not run from the command line, the end user might not see any exception messages.
But to repeatedly catch exceptions instead of just doing the necessary validation might be inelegant, even if the performance hit is negligible.
In summary, when you find the same kind of validation happening in different places in your program, you should seriously consider whether that validation might be better off restricted to a particular constructor.
Except perhaps in a toy example, the answer will be a judgement call that depends on your good taste.