In Java, mind the difference between primitives and their wrappers
Knowing and understanding the difference between objects and primitives in Java is one of those things that we should expect of every Java programmer, even though in practice it often won’t make a noticeable difference in program performance.
In some contexts, you can’t use a primitive without causing an error; you must use the wrapper. But in other contexts, you can use either a primitive or a wrapper and it doesn’t seem to make a difference: your program runs, it passes all the tests and delivers the right results.
However, at code review, if anyone asks why you used a wrapper instead of a primitive on such and such line, hopefully you have a better answer than that your finger just happened to press the Shift key.
One of the benefits of using an integrated development environment (IDE) like NetBeans or IntelliJ is that it will alert you to some of the cases where it’s preferable to use a primitive but it’s not an error to use a wrapper.
And of course it’ll also alert you when using a primitive is an error that prevents program compilation and execution.
I expect that everyone reading this knows that Java has eight primitive data types: the integral types byte, short, int and long; the floating point types float and double; the kind of integral type char; and the Boolean type boolean.
Note that the reserved keywords for the primitive data types in Java start with lowercase letters and consist of lowercase letters only. Indeed all reserved keywords in Java use lowercase letters only.
Although “boolean” is flagged as a misspelling by most spell checkers, and is sometimes “corrected” to “Boolean,” both are correct in Java, though they mean slightly different things which a Java beginner might not have learned just yet, or is supposed to have learned already but doesn’t understand yet.
Also, it should be noted that although String gets a lot of special treatment on the Java platform and feels like a primitive, it’s not really a primitive type. And primitives are not objects.
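As a minimal sketch of how the lowercase and capitalized spellings differ (the class and field names here are my own, purely for illustration): a field of a primitive type always holds a value, while a field of a wrapper type is an object reference that starts out as null.

```java
final class WrapperDefaults {
    // Primitive fields get the default value of their type
    static boolean primitiveFlag;   // false
    static int primitiveCount;      // 0

    // Wrapper fields are object references, so they default to null
    static Boolean wrapperFlag;     // null
    static Integer wrapperCount;    // null

    public static void main(String[] args) {
        System.out.println(primitiveFlag + " " + primitiveCount); // prints "false 0"
        System.out.println(wrapperFlag + " " + wrapperCount);     // prints "null null"
    }
}
```

That a wrapper can be null is sometimes exactly what you want and sometimes an unpleasant surprise, which is one more reason to choose deliberately.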
I also expect that everyone reading this knows about arrays in Java. Arrays have very little overhead. They’re great when you know in advance how many elements you’re going to need, and you don’t expect to be moving the array elements around too much.
Here are some examples of arrays, each with exactly four elements:
char[] letters = {'a', 'b', 'c', 'd'};
int[] primes = {2, 3, 5, 7};
String[] muskNames = {"Aramis", "Athos", "D'Artagnan", "Porthos"};
// InputMismatchException needs import java.util.InputMismatchException;
Exception[] exceptions = {new IndexOutOfBoundsException(),
        new InputMismatchException(),
        new NullPointerException(),
        new RuntimeException()};
I can’t think of any reason why you would want to put exceptions into an array, other than to show that it can be done. The point is that an array can hold any element of the defined type.
An array’s length is fixed once it’s created. You can get the effect of resizing an array at runtime, or of inserting elements into it, by copying everything into a new array, but it’s kind of a pain in the neck. So Java provides ArrayList<E>, where E is some reference type.
ArrayList<String> muskNames = new ArrayList<>();
muskNames.add("Aramis");
muskNames.add("Athos");
muskNames.add("Porthos");
And then we can add D’Artagnan and however many more musketeers we want. But we can’t create an ArrayList<int>. Well, we can try, but it won’t compile or run. In an IDE, you should see bright, red squiggly lines.
We have to use Integer, the object wrapper for int, instead of int. The wrapper for char is Character. For the other primitives, the wrapper is named the same as the primitive, but capitalized: Byte, Short, Long, Float, Double and Boolean.
With these wrappers, we can use primitives in collections like ArrayList<E>. Try this:
ArrayList<Integer> primes = new ArrayList<>();
primes.add(new Integer(2));
primes.add(new Integer(3));
primes.add(new Integer(5));
primes.add(new Integer(7));
This should cause five warnings in your IDE. First, the collection is only added to, never read. That’s fine; we’ll presumably get around to that later.
The next four lines have “unnecessary boxing” warnings. The Java compiler takes care of “auto-boxing” in cases like this. (Since Java 9 the Integer constructors are also deprecated in favor of Integer.valueOf(), so you may see deprecation warnings as well.) You can have your IDE replace the unnecessary instantiations with simple integer literals.
ArrayList<Integer> primes = new ArrayList<>();
primes.add(2);
primes.add(3);
primes.add(5);
primes.add(7);
It works in the other direction, too, as we’ll see soon.
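As a quick preview, here is a minimal sketch of the round trip. The class name BoxingRoundTrip and the method sum() are hypothetical, made up just for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingRoundTrip {
    // Adds up a list of Integer objects using only int arithmetic
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) { // each Integer element is unboxed to an int
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> primes = new ArrayList<>();
        primes.add(2); // each int literal is boxed to an Integer
        primes.add(3);
        primes.add(5);
        primes.add(7);
        System.out.println(sum(primes)); // prints 17
    }
}
```

Neither direction needs an explicit conversion call; the compiler inserts the boxing and unboxing for us.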
Let’s do something a little more involved now: a class with a static method that returns an ArrayList<Integer> of prime numbers up to a specified threshold.
package org.oeis.primes;

import java.util.ArrayList;

public class PrimeLister {

    private static final ArrayList<Integer> PRIMES =
            new ArrayList<>();

    private static int currThresh;

    static {
        PRIMES.add(2);
        PRIMES.add(3);
        PRIMES.add(5);
        PRIMES.add(7);
        currThresh = 10;
    }

    // STUB TO FAIL THE FIRST TEST
    public static ArrayList<Integer> listPrimes(int threshold) {
        ArrayList<Integer> selPrimes = new ArrayList<>();
        return selPrimes;
    }

}
As simple and elementary as this concept might seem, it can benefit from test-driven development (TDD). Here’s our test class:
package org.oeis.primes;

import java.util.ArrayList;
import java.util.Arrays;

import org.junit.Test;
import static org.junit.Assert.*;

public class PrimeListerTest {

    @Test
    public void testListPrimes() {
        System.out.println("listPrimes");
        int threshold = 10;
        Integer[] smallPrimes = {2, 3, 5, 7};
        ArrayList<Integer> expResult =
                new ArrayList<>(Arrays.asList(smallPrimes));
        ArrayList<Integer> result =
                PrimeLister.listPrimes(threshold);
        assertEquals(expResult, result);
    }

}
Notice the line where the array smallPrimes is declared. My first instinct was to declare it as int[], but that causes an error on the following line, because Arrays.asList() expects an array of objects, not an array of primitives: given an int[], it produces a single-element list of int[] rather than a list of Integer.
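To make that concrete, here is a small sketch. The class AsListSketch and its methods are hypothetical names of my own; the boxed() stream call is one way, among others, to get from an int[] to a list of Integer if you really do start with primitives.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class AsListSketch {
    // With an int[] argument, asList() sees ONE object: the array itself
    static int sizeViaAsList(int[] values) {
        List<int[]> wrapped = Arrays.asList(values);
        return wrapped.size();
    }

    // With an Integer[] argument, asList() sees each element
    static int sizeViaAsList(Integer[] values) {
        List<Integer> wrapped = Arrays.asList(values);
        return wrapped.size();
    }

    // One way to box every element of an int[] into a List<Integer>
    static List<Integer> boxAll(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sizeViaAsList(new int[]{2, 3, 5, 7}));     // prints 1
        System.out.println(sizeViaAsList(new Integer[]{2, 3, 5, 7})); // prints 4
        System.out.println(boxAll(new int[]{2, 3, 5, 7}));            // prints [2, 3, 5, 7]
    }
}
```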
Once again we can just use the integer literals for 2, 3, 5 and 7 rather than the instantiations new Integer(2), new Integer(3), new Integer(5) and new Integer(7); that would be unnecessary boxing here just like in the other class.
Notice also that threshold is declared as an int. There is no need whatsoever for it to be an Integer.
The test fails, because right now listPrimes() just returns an empty ArrayList<Integer>. The simplest way to make this test pass would be to just pass it the very list it expects.
public static ArrayList<Integer> listPrimes(int threshold) {
    ArrayList<Integer> selPrimes = new ArrayList<>(PRIMES);
    return selPrimes;
}
It would be a mistake to return PRIMES, but that’s a topic for another day.
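For the curious, here is a minimal sketch of what can go wrong. The class SharedListSketch and its methods are hypothetical stand-ins, not part of PrimeLister; the point is only that a caller holding the original list can change it for everyone else.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SharedListSketch {
    private static final List<Integer> CACHE =
            new ArrayList<>(Arrays.asList(2, 3, 5, 7));

    // Hands out the cache itself: callers can modify our private state
    static List<Integer> leak() {
        return CACHE;
    }

    // Hands out a copy: callers can do whatever they like with it
    static List<Integer> copy() {
        return new ArrayList<>(CACHE);
    }

    static int cacheSize() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        copy().clear();
        System.out.println(cacheSize()); // prints 4, the cache is untouched
        leak().clear();
        System.out.println(cacheSize()); // prints 0, the cache has been emptied
    }
}
```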
Our next test is a lot like the previous test:
    @Test
    public void testListPrimesTo100() {
        int threshold = 100;
        Integer[] smallPrimes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
                31, 37, 41, 43, 47, 53, 59, 61, 67,
                71, 73, 79, 83, 89, 97};
        ArrayList<Integer> expResult =
                new ArrayList<>(Arrays.asList(smallPrimes));
        ArrayList<Integer> result =
                PrimeLister.listPrimes(threshold);
        assertEquals(expResult, result);
    }
The test fails: it expects the primes between 1 and 100 but only gets the primes between 1 and 10 (see A000040 in Sloane’s OEIS for longer lists of primes).
We’ll get this new test to pass, and in the process further demonstrate auto-boxing.
public static ArrayList<Integer> listPrimes(int threshold) {
    int thresh = Math.abs(threshold);
    if (thresh > currThresh) {
        // Extend the cached list by trial division of each new candidate
        for (int n = currThresh + 1; n <= thresh; n++) {
            double root = Math.sqrt(n);
            boolean noDivisorFound = true;
            int index = 0;
            int p;
            do {
                p = PRIMES.get(index); // Integer is unboxed to int here
                noDivisorFound = (n % p != 0);
                index++;
            } while (p <= root && noDivisorFound);
            if (noDivisorFound) PRIMES.add(n); // int is boxed to Integer here
        }
        currThresh = thresh;
    }
    return new ArrayList<>(PRIMES);
}
This uses six int variables: the threshold parameter, the static field currThresh and the local variables thresh, n, index and p. There is no need for p to be of type Integer even though PRIMES.get(index) is of that type. That’s automatic “unboxing” at work.
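One caveat about unboxing is worth a small sketch (the class and method names below are hypothetical, mine and not part of PrimeLister): a wrapper reference can be null, and unboxing null throws a NullPointerException. That can’t happen with PRIMES, since we only ever add int values to it, but it’s good to know where the risk lies.

```java
import java.util.List;

final class UnboxingSketch {
    // Returns true if unboxing a null Integer throws, which it does
    static boolean unboxingNullThrows() {
        Integer missing = null;
        try {
            int n = missing; // compiles to missing.intValue(), which fails on null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // A null-safe way to get an int out of a possibly absent Integer
    static int firstOrDefault(List<Integer> numbers, int fallback) {
        Integer first = numbers.isEmpty() ? null : numbers.get(0);
        return first == null ? fallback : first;
    }

    public static void main(String[] args) {
        System.out.println(unboxingNullThrows()); // prints true
    }
}
```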
By the way, in refactoring, I decided there isn’t much point in declaring selPrimes and then returning that in the very next line, given that the copy of PRIMES can be instantiated anonymously right on the return line.
The next test should be one that tests whether listPrimes() can trim a list. Even if there is a way to specify a running order for the tests, we should not rely on that. The test should ensure that PrimeLister has a list of primes that is longer than the one we’ll ask for.
    @Test
    public void testPrimeListerCanTrim() {
        int threshold = 80;
        ArrayList<Integer> result =
                PrimeLister.listPrimes(threshold);
        System.out.println("PrimeLister reports " + result.size()
                + " primes between 1 and " + threshold);
        threshold = 20;
        Integer[] smallPrimes = {2, 3, 5, 7, 11, 13, 17, 19};
        ArrayList<Integer> expResult =
                new ArrayList<>(Arrays.asList(smallPrimes));
        result = PrimeLister.listPrimes(threshold);
        assertEquals(expResult, result);
    }
As before, we declare smallPrimes as Integer[] rather than int[] because asList() requires an array of objects.
When I ran this, the tests ran in source order. That happens sometimes, when there are too few tests to create different random test orders.
The last test failed because listPrimes() gave a list of primes up to 100 rather than just to 20. But we also see that it erroneously reported 25 primes between 1 and 80. If this test had run first, perhaps listPrimes() would have given a list with the first 22 primes rather than the first 25 primes.
My first thought on how to make this test pass was to copy PRIMES to a new ArrayList<Integer>, then remove the primes greater than threshold, and then return that list.
Here we get to a potentially awkward aspect of ArrayList<E>: we can call remove() according to an index of type int, or we can call remove() on an object of type E, which in this use case happens to be Integer.
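Here is a minimal sketch of that awkwardness; RemoveOverloads and its two methods are hypothetical names of my own. With a list holding 2, 3, 5 and 7, an int argument of 2 removes whatever sits at index 2, while an Integer argument of 2 removes the element equal to 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveOverloads {
    // Calls remove(int index): the argument is a position in the list
    static List<Integer> removeAtIndex(List<Integer> source, int index) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(index);
        return copy;
    }

    // Calls remove(Object o): the argument is the element to look for
    static List<Integer> removeValue(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> primes = Arrays.asList(2, 3, 5, 7);
        System.out.println(removeAtIndex(primes, 2)); // prints [2, 3, 7]
        System.out.println(removeValue(primes, 2));   // prints [3, 5, 7]
    }
}
```

This is one of the few places where the compiler will not box an int for us: an int argument always selects the index overload, so the type we declare for a variable changes what the call does.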
However, as I thought more about it, I decided it would be more efficient to somehow make a selection of the list and then return that. As I looked through the ArrayList<E> documentation, I decided subList() is probably the best option.
We can certainly try. Worst thing that can happen is that the test still doesn’t pass. Let’s add this into listPrimes(), preferably right after the initialization of thresh:
    if (thresh < currThresh) {
        int trimIndex = PRIMES.size();
        int p;
        do {
            trimIndex--;
            p = PRIMES.get(trimIndex);
        } while (p > thresh);
        return new ArrayList<>(PRIMES.subList(0, trimIndex));
    }
Oops, that’s not quite right. PrimeLister reports 21 primes between 1 and 80, which is wrong, and the list of primes between 1 and 20 is missing 19.
According to the Javadoc for subList(), it “returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive.” So my trimIndex is short by 1. A small tweak corrects the problem:
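    if (thresh < currThresh) {
        // There are no primes below 2; without this guard the loop
        // below would step past index 0 for a threshold of 0 or 1
        if (thresh < 2) return new ArrayList<>();
        int trimIndex = PRIMES.size();
        int p;
        do {
            trimIndex--;
            p = PRIMES.get(trimIndex);
        } while (p > thresh);
        // toIndex is exclusive, hence the + 1
        return new ArrayList<>(PRIMES.subList(0,
                trimIndex + 1));
    }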
The test should pass now.
The performance of PrimeLister can be optimized. For one thing, it’s wasteful to test even numbers for primality, because almost all of them are composite. Odd multiples of 3 can also be skipped.
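A rough sketch of that idea follows. It is a standalone illustration, not a drop-in change to PrimeLister, and the names WheelSketch and primesUpTo() are my own. After 2 and 3, every candidate is of the form 6k - 1 or 6k + 1, which we can generate by alternating steps of 2 and 4 starting from 5.

```java
import java.util.ArrayList;
import java.util.List;

final class WheelSketch {
    // Lists primes up to limit, testing only candidates not divisible by 2 or 3
    static List<Integer> primesUpTo(int limit) {
        List<Integer> primes = new ArrayList<>();
        if (limit >= 2) primes.add(2);
        if (limit >= 3) primes.add(3);
        int step = 2;
        // Candidates run 5, 7, 11, 13, 17, 19, 23, 25, ...
        for (int n = 5; n <= limit; n += step, step = 6 - step) {
            boolean noDivisorFound = true;
            // Start at index 2: no candidate is divisible by 2 or 3
            for (int i = 2; i < primes.size(); i++) {
                int p = primes.get(i);
                if ((long) p * p > n) break; // long avoids int overflow
                if (n % p == 0) {
                    noDivisorFound = false;
                    break;
                }
            }
            if (noDivisorFound) primes.add(n);
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesUpTo(30)); // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    }
}
```

This tests only a third of the integers, at the cost of a slightly less obvious loop header.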
For another, the thresh < currThresh branch can be optimized to take advantage of the prime number theorem to find trimIndex more quickly. Say thresh is 100 and currThresh is 10,000.
As listPrimes() stands now, this would cause trimIndex to get iterated from 1,229 down to 24, whereas starting from the estimate 100/log 100 (natural logarithm, about 21.7) would cause trimIndex to get iterated from 21 or 22 up to 24.
This would increase the cyclomatic complexity of listPrimes(), but the performance gain is worthwhile, I think.
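Here is one way that might look, as a sketch under my own assumptions rather than a finished implementation. The class PntTrimSketch and the method countUpTo() are hypothetical names. The estimate x/log x is used only as a starting guess, and the two loops correct it in either direction, so the result does not depend on how good the guess is.

```java
import java.util.List;

final class PntTrimSketch {
    // Returns how many entries of the sorted list primes are <= thresh,
    // which is the value to pass as subList()'s exclusive toIndex
    static int countUpTo(List<Integer> primes, int thresh) {
        // For tiny thresholds skip the estimate (log of 0 or 1 is no use)
        int guess = thresh < 17 ? 0 : (int) (thresh / Math.log(thresh));
        int count = Math.min(guess, primes.size());
        // Walk up while the next prime still fits under the threshold
        while (count < primes.size() && primes.get(count) <= thresh) {
            count++;
        }
        // Walk down in case the starting guess overshot
        while (count > 0 && primes.get(count - 1) > thresh) {
            count--;
        }
        return count;
    }
}
```

With a list of the primes up to 10,000 and a threshold of 100, the starting guess is 21 and the first loop takes four steps to reach 25, instead of roughly 1,200 steps down from the end of the list.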
Using wrappers where primitives will do usually has little noticeable effect on performance in a program like this one, though boxing and unboxing are not entirely free. But it could have an effect on people who review the programs you write, potentially confusing them.
Elsewhere here on Medium, you might have read that you should prefer objects to primitives. But that’s mainly in cases where the objects bundle information that the primitives can’t, as for example objects that represent spans of time or amounts of money.
Although the object wrappers provide some useful functions, they don’t really carry too much more information than a primitive, like what time zone a span of time is reckoned from, or what currency an amount of money refers to.
Therefore, wrappers should only be used where primitives can’t be, and for their useful static functions, not arbitrarily where a primitive would accomplish the same goal.