3 Good Reasons to Avoid Arrays in Java Interfaces

If you still find yourself defining methods like this

public String[] getParameters();

in an interface, you should think again. Arrays are not just old-fashioned; there are good reasons to avoid exposing them. In this article, I’ll try to summarize the main drawbacks of using arrays in Java APIs.


Let me start with perhaps the most unexpected thing:

Arrays can lead to poor performance


Update: It has been pointed out in the comments that iterating over an ArrayList is significantly slower than iterating over an array. I was surprised to find that the extra costs of the list iterator (mainly caused by checks for concurrent modification) can outweigh the savings I’ve explained here. This counters my point that arrays inhibit performance, so I’ve adjusted the title. Thanks go to Peter Drake for bringing in this aspect.
However, the benchmark presented below is still valid. It’s true that interfaces with arrays are not necessarily faster. Depending on the consumers of an interface, lists may result in better overall performance.

You may think that working with arrays is the fastest option because arrays are the low-level data structure used in most Collection implementations. How can using a plain array be slower than using an object that contains an array?

Let’s start with this common idiom that certainly looks familiar to you:

public String[] getNames() {
  return namesList.toArray( new String[ namesList.size() ] );
}

This method creates an array from a modifiable collection used to keep the data internally. It tries to optimize the array creation by providing an array of the correct size. Interestingly, this “optimization” makes it slower than the simpler version below (see the green vs. the orange bar in the chart):

public String[] getNames() {
  return namesList.toArray( new String[ 0 ] );
}

However, if the method returns a List, creating the defensive copy is even faster (the red bar):

public List<String> getNames() {
  return new ArrayList<>( namesList );
}

The difference is that an ArrayList stores its items in an Object[] array and uses the untyped toArray method, which is a lot faster (the blue bar) than the typed one. This is typesafe because the untyped array is wrapped in the generic type ArrayList<T>, which is checked by the compiler.
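To make the effect of the defensive copy concrete, here is a minimal sketch. The class NameRegistry and its addName method are hypothetical names chosen for illustration; they are not part of the benchmark. It only shows that changes to the returned list do not reach the internal one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only, not taken from the benchmark.
class NameRegistry {

  private final List<String> namesList = new ArrayList<>();

  void addName( String name ) {
    namesList.add( name );
  }

  // Defensive copy: every caller receives its own independent list.
  List<String> getNames() {
    return new ArrayList<>( namesList );
  }

  public static void main( String[] args ) {
    NameRegistry registry = new NameRegistry();
    registry.addName( "foo" );

    List<String> names = registry.getNames();
    names.add( "bar" ); // modifies the copy only

    System.out.println( names );               // prints [foo, bar]
    System.out.println( registry.getNames() ); // prints [foo]
  }
}
```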

[Chart: toArray benchmark]

This chart shows a benchmark with n = 5 on Java 7. However, the picture does not change much with more items or another VM. The CPU overhead might not seem drastic, but it adds up. Chances are that consumers of an array have to convert it into a collection in order to do anything with it, then convert the result back to an array to feed it into another interface method, and so on.
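The round trip described above might look like the following sketch. The class RoundTripDemo and the method name withoutName are assumptions made up for this example. The array variant copies the data twice, the list variant only once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
class RoundTripDemo {

  // Array-based API: copy into a list, modify it, copy back into an array.
  static String[] withoutName( String[] names, String name ) {
    List<String> list = new ArrayList<>( Arrays.asList( names ) );
    list.remove( name );
    return list.toArray( new String[ 0 ] );
  }

  // List-based API: a single copy is enough.
  static List<String> withoutName( List<String> names, String name ) {
    List<String> result = new ArrayList<>( names );
    result.remove( name );
    return result;
  }

  public static void main( String[] args ) {
    String[] array = { "foo", "bar" };
    System.out.println( Arrays.toString( withoutName( array, "foo" ) ) ); // prints [bar]
    System.out.println( withoutName( Arrays.asList( array ), "foo" ) );   // prints [bar]
  }
}
```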

Using a simple ArrayList instead of an array improves performance, without adding much footprint. ArrayList adds a constant overhead of 32 bytes to the wrapped array. For example, an array with ten objects requires 104 bytes, an ArrayList 136 bytes.

With Collections, you may even decide to return an unmodifiable version of the internal list:

public List<String> getNames() {
  return Collections.unmodifiableList( namesList );
}

This operation performs in constant time, so it’s much faster than any of the above (yellow bar). This is not the same as a defensive copy. An unmodifiable collection will change when your internal data changes. If this happens, clients can run into a ConcurrentModificationException while iterating over the items. It can be considered bad design that an interface provides methods that throw an UnsupportedOperationException at runtime. However, at least for internal use, this method can be a high-performance alternative to a defensive copy – something that is not possible with arrays.
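The three behaviors mentioned above can be checked with a small sketch. The class and method names are hypothetical and exist only for this illustration. It assumes an ArrayList as the internal list, as in the examples above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class for illustration only.
class UnmodifiableViewDemo {

  // The view is not a copy: it reflects later changes to the internal list.
  static boolean viewSeesLaterChanges() {
    List<String> internal = new ArrayList<>( Arrays.asList( "foo" ) );
    List<String> view = Collections.unmodifiableList( internal );
    internal.add( "bar" );
    return view.size() == 2;
  }

  // Write attempts through the view fail at runtime.
  static boolean viewRejectsWrites() {
    List<String> view = Collections.unmodifiableList( new ArrayList<String>() );
    try {
      view.add( "foo" );
      return false;
    } catch( UnsupportedOperationException expected ) {
      return true;
    }
  }

  // Changing the internal list during iteration breaks the client's iterator.
  static boolean iterationFailsOnChange() {
    List<String> internal = new ArrayList<>( Arrays.asList( "foo", "bar" ) );
    Iterator<String> iterator = Collections.unmodifiableList( internal ).iterator();
    iterator.next();
    internal.add( "baz" );
    try {
      iterator.next();
      return false;
    } catch( ConcurrentModificationException expected ) {
      return true;
    }
  }

  public static void main( String[] args ) {
    System.out.println( viewSeesLaterChanges() );   // prints true
    System.out.println( viewRejectsWrites() );      // prints true
    System.out.println( iterationFailsOnChange() ); // prints true
  }
}
```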

Arrays define a structure, not an interface

Java is an object oriented language. The central idea of object orientation is that objects provide a set of methods to access and manipulate their data fields instead of manipulating the data fields directly. These methods make up an interface that explains what you can do with the object.

Because Java has been designed for performance, primitive types and arrays have been mixed into the type system. Objects use arrays internally to store data efficiently. However, even though arrays represent a modifiable collection of elements, they do not provide any methods to access and manipulate these elements. In fact, there’s not much you can do with an array except access and replace its elements directly. Arrays don’t even implement toString and equals in a meaningful way, while collections do:

String[] array = { "foo", "bar" };
List<String> list = Arrays.asList( array );

System.out.println( list );
// -> [foo, bar]
System.out.println( array );
// -> [Ljava.lang.String;@6f548414

list.equals( Arrays.asList( "foo", "bar" ) );
// -> true
array.equals( new String[] { "foo", "bar" } );
// -> false
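Code that still has to handle arrays can fall back on the static helpers in java.util.Arrays. The sketch below is not from the original post and the class name is made up. It shows that the meaningful toString and equals behavior lives in a utility class rather than on the array itself.

```java
import java.util.Arrays;

// Hypothetical demo class for illustration only.
class ArrayHelpersDemo {

  public static void main( String[] args ) {
    String[] array = { "foo", "bar" };
    String[] other = { "foo", "bar" };

    // Content-based helpers from java.util.Arrays
    System.out.println( Arrays.toString( array ) );      // prints [foo, bar]
    System.out.println( Arrays.equals( array, other ) ); // prints true

    // The array's own equals compares identity, not content
    System.out.println( array.equals( other ) );         // prints false
  }
}
```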

7 Responses to “3 Good Reasons to Avoid Arrays in Java Interfaces”

  1. Yann says:

    Please provide the performance testing code you used.

  2. Peter Drake says:

    Very interesting!

    What about iteration? Is it faster to iterate over an array (which involves no object creation or method calls) or an ArrayList? How much?

  3. Peter Drake says:

    Very interesting!

    Two counterpoints:

    - Iterating through an ArrayList takes much longer than iterating through an array.

    - You can’t put fast, small primitives in an ArrayList.

  4. Ian Bull says:

    @Peter
    Do you have any metrics on iteration through an array vs a collection? This would certainly be valuable to see. Another advantage is that you could potentially back the iterator with some other datastream (say a DB) and instead of loading all the results up front, you could page them. I’m not sure I would design an API around this hypothetical case though.

    As for primitives, yes, collections and primitives are really awkward. I’ve found with Java 8, things can get even more subtle (and not in a good way).

  5. Michal Chmielarz says:

    Interesting chart you’ve put. Could you provide code of the performance tests, please?

  6. Ralf Sternberg says:

    I’ve uploaded the benchmark code to [1]. It’s based on caliper 0.5. The chart has been created with d3 [2] using a little tool that I’ll share after some polishing in the next days.

    [1] https://gist.github.com/ralfstx/10641850#file-arrayvslistbenchmark
    [2] http://d3js.org/

  7. Ralf Sternberg says:

    @Peter, thanks for this hint. I’ve run some benchmarks for array vs. list iteration and I was surprised how big the difference is. I guess that’s the price for the concurrent modification checks. This kind of counters my point that arrays lead to poor performance. I’ll run some more tests and add an update to the post.


Published in categories: EclipseSource News, Editors choice