Wednesday, February 20, 2013

When should I use an int[] ?

Our team was participating in a code review (via ReviewBoard – pretty good tool) and some folks noticed that an int[] was being used to represent a version number (e.g., 5.3.2 would be int[]{5, 3, 2}). It had always been that way, and this code was just continuing to use that representation. Comments were made that using an actual domain-specific class (e.g., VersionNumber) with constituent parts (getMajor(), getMinor(), etc.) would be better. Discussions erupted over whether to change it now, since it had been working just fine for years, but that’s another blog post.
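For the curious, here is roughly what the reviewers were asking for. This is only a sketch: the class name and the getMajor()/getMinor() getters come from the review comments, while the third component (I call it "patch"), the parse() helper and the dotted-string format are my own assumptions for illustration.

```java
// Hypothetical sketch of a domain-specific version class.
// Assumes exactly three non-negative components: major.minor.patch.
final class VersionNumber implements Comparable<VersionNumber> {
    private final int major;
    private final int minor;
    private final int patch;

    VersionNumber(int major, int minor, int patch) {
        if (major < 0 || minor < 0 || patch < 0) {
            throw new IllegalArgumentException("version components must be non-negative");
        }
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    // Assumed input format: "5.3.2" (three dot-separated integers).
    static VersionNumber parse(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch but got: " + text);
        }
        return new VersionNumber(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    int getMajor() { return major; }
    int getMinor() { return minor; }
    int getPatch() { return patch; }

    // Numeric, component-by-component ordering, so 5.10.0 sorts after 5.3.2.
    @Override
    public int compareTo(VersionNumber other) {
        if (major != other.major) return Integer.compare(major, other.major);
        if (minor != other.minor) return Integer.compare(minor, other.minor);
        return Integer.compare(patch, other.patch);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof VersionNumber)) return false;
        VersionNumber v = (VersionNumber) o;
        return major == v.major && minor == v.minor && patch == v.patch;
    }

    @Override
    public int hashCode() {
        return 31 * (31 * major + minor) + patch;
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + patch;
    }
}
```

A caller writes version.getMajor() instead of version[0], and the rule for comparing two versions lives in exactly one place.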

Let me be blunt – I hated this use of int[]. Fundamentally, the author is writing in a high-level, object-oriented language but using the constructs provided by assembly language. This is not an uncommon problem – software engineers have often spent time in school learning all about how “it’s all just ones and zeroes” and that memory is “just one long array of 32- or 64-bit numbers”. And, yes, it is important to know that stuff. And I’ve got nothing against assembly as a language and nothing but respect for the guys who write it (I’m not one of them – but if I were writing assembly, it’s quite possible I’d forgo the array and represent a version number with just 16 bits, using one nibble for each component, and supporting four-component version numbers whose values can reach 15.15.15.15! Pack the bits!).
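Just to show I’m not bluffing about the nibbles, here is a minimal sketch of that packing scheme, written in Java rather than assembly. PackedVersion, pack() and component() are names I made up for the illustration; the layout (most significant nibble first) is also just one possible choice.

```java
// Hypothetical illustration: four version components, one nibble (4 bits) each,
// packed into the low 16 bits of an int.
final class PackedVersion {
    private PackedVersion() {}

    static int pack(int a, int b, int c, int d) {
        checkNibble(a);
        checkNibble(b);
        checkNibble(c);
        checkNibble(d);
        // The first component occupies the most significant nibble.
        return (a << 12) | (b << 8) | (c << 4) | d;
    }

    // index 0 is the first (most significant) component, index 3 the last.
    static int component(int packed, int index) {
        if (index < 0 || index > 3) {
            throw new IllegalArgumentException("index must be 0..3");
        }
        int shift = (3 - index) * 4;
        return (packed >>> shift) & 0xF;
    }

    private static void checkNibble(int value) {
        if (value < 0 || value > 15) {
            throw new IllegalArgumentException("component must fit in 4 bits (0..15): " + value);
        }
    }
}
```

Compact, sure. Note what the caller now has to remember: the shift amounts, the mask, the component order, and the hard ceiling of 15.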

And yet…

It’s even more important to understand when to completely forget (or at least ignore) those underlying truths. When people don’t ignore or forget them, well, you see int[] crop up in places. You see bitmasks used to represent a set of Boolean options. You see an int used to represent the result of comparing two objects. (Seriously, what’s up with that? Can I get an enum in 21st-century Java?)
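Here is a small sketch of what I mean, assuming a made-up set of three options. EnumSet and Comparable.compareTo are real parts of the JDK; Permission, Flags and Ordering are hypothetical names invented for this example, and Ordering is just one way to wrap the int that compareTo hands back.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical options, used only for this illustration.
enum Permission { READ, WRITE, EXECUTE }

// Hypothetical enum that names the three possible outcomes of a comparison.
enum Ordering {
    LESS, EQUAL, GREATER;

    static <T extends Comparable<? super T>> Ordering of(T left, T right) {
        int result = left.compareTo(right);
        if (result < 0) return LESS;
        if (result > 0) return GREATER;
        return EQUAL;
    }
}

final class Flags {
    private Flags() {}

    // The low-level style: three Boolean options squeezed into one int.
    static final int READ = 1;
    static final int WRITE = 2;
    static final int EXECUTE = 4;

    static boolean canWrite(int mask) {
        return (mask & WRITE) != 0;
    }

    // The same question asked of a set of enum constants.
    static boolean canWrite(Set<Permission> permissions) {
        return permissions.contains(Permission.WRITE);
    }
}
```

With the enum version, a caller builds EnumSet.of(Permission.READ, Permission.WRITE) and the compiler refuses anything that isn’t a Permission. With the int version, canWrite(42) compiles without complaint.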

Oh, wait, did your buddy just say it’s more efficient to use the int[] over the object? I very much doubt that the bottleneck of your Java application’s performance is the creation time or memory footprint of simple objects like VersionNumber, but I stand ready to see the supporting data. Now, I’m not saying it never is… I haven’t seen your code. I’m just saying that about 99.99% of the time it isn’t, so let’s not commit the cardinal sin of optimization – optimizing without evidence.

At the end of the day, object-oriented languages like Java really only have one basic design tool – classes. Classes are what allow for objects/instances, inheritance, polymorphism and encapsulation. Choosing to use an int[] is choosing not to use the one tool Java offers, and certainly calls into question why you’re writing in Java at all. And have some empathy for the people who come after you, who maintain your code or use your code in a new feature. Nobody wants to call a method that returns an int[] and recreate the logic that gives semantics to that nonsense.

But, hey, if you really want an int[], I won’t fight you. Just do me a favor, and make it a private member on the VersionNumber class. Use encapsulation to hide the truth and let me continue to think in my object-oriented language.
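Something like the following would keep us both happy. It’s a sketch under my own assumptions: three components, and a hypothetical fromLegacyArray() factory for code that still hands around the old int[]. The point is only that the array never escapes the class.

```java
// Hypothetical sketch: the int[] survives, but only as a private detail.
final class VersionNumber {
    // Layout assumption: {major, minor, patch}. Nobody outside this class can see it.
    private final int[] components;

    VersionNumber(int major, int minor, int patch) {
        this.components = new int[] {major, minor, patch};
    }

    // Bridge for existing callers that still produce the old representation.
    // The values are copied, so later changes to the caller's array have no effect.
    static VersionNumber fromLegacyArray(int[] legacy) {
        if (legacy == null || legacy.length != 3) {
            throw new IllegalArgumentException("expected {major, minor, patch}");
        }
        return new VersionNumber(legacy[0], legacy[1], legacy[2]);
    }

    int getMajor() { return components[0]; }
    int getMinor() { return components[1]; }
    int getPatch() { return components[2]; }

    @Override
    public String toString() {
        return getMajor() + "." + getMinor() + "." + getPatch();
    }
}
```

If someone later proves that three int fields (or sixteen packed bits) are better, the swap happens inside this class and no caller has to change.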