About those Enums, Java…

October 9, 2012 Steve Hawley

As you’ve seen, we’ve released a new product – JoltImage.  This is our foray into the Java world of image processing, something that we’ve been contemplating for some time.  I will be writing about that experience so that engineers faced with similar problems (porting .NET code to Java) might have a reference.

When I started working in Java in 1996, it was a brand new language and working in it was freeing to a certain extent.  I found that compared with C or C++, I was able to write far more code with far fewer bugs than I had in the past.  Having checked arrays, immutable strings, and garbage collection removed three personal sources of bugs.

One thing that it lacked was enumerations.  This is something that is, IMHO, vital.  The workaround was either to declare public final values or to make singletons.  The problem with public finals is the lack of type safety, and singletons are more heavyweight than you’d like.
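To make the type-safety complaint concrete, here is a minimal sketch of the pre-enum workaround.  The class, constant, and method names (LegacyPorts, PORT_A, COLOR_RED, describePort) are invented for illustration and do not come from any real code base.

```java
// Hypothetical pre-enum style: plain int constants standing in for an enumeration.
final class LegacyPorts {
    static final int PORT_A = 0;
    static final int PORT_B = 1;

    // An unrelated family of constants that happens to share the same values.
    static final int COLOR_RED = 0;
    static final int COLOR_GREEN = 1;

    private LegacyPorts() { }

    // The parameter is just an int, so the compiler accepts any int at all.
    static String describePort(int port) {
        switch (port) {
        case PORT_A: return "port A";
        case PORT_B: return "port B";
        default: return "unknown port " + port;
        }
    }
}
```

A call such as LegacyPorts.describePort(LegacyPorts.COLOR_GREEN) compiles without complaint and quietly answers "port B"; nothing stops a color from being passed where a port was meant.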

Before I take a hammer to Java, let’s step back and examine the use cases for enumerations:

Cardinal values

Ordinal values

Set elements

Cardinal values are effectively atoms – symbolic names that have no particular meaningful value.  Ordinals are symbols that have specific values that have meaning and may or may not have operations on those values.  Set elements are either cardinal or ordinal values that need to be contained in a set and have specific set operations (intersection, union, difference, complement, member, etc.).

In Java, enums are easiest to use for Cardinal values and appear to be optimized for that.  I can do the following declaration:

public enum IOPortID {
   A, B, C, D;
}

In this case, I can refer to an IO port very easily by name.  Terrific.  Now suppose that each of the ports has slightly more meaning.

Let’s say that each of those ports is associated with a hardware address.  You’d LIKE to be able to just associate the address with the symbol.  That should be easy, but you can’t do that.  At least not really.  Instead, you have two choices, code that looks like this:

public enum IOPortID {
   A, B, C, D;

   // Package-private: maps each symbol to its hardware address.
   final int toHardwareAddress() {
      switch (this) {
      case A: return 0xaba;
      case B: return 0xabb;
      case C: return 0xabc;
      case D: return 0xabd;
      default: throw new RuntimeException("Illegal IOPortID!");
      }
   }
}

or code that looks like this:

 

public enum IOPortID {
   A(0xaba), B(0xabb), C(0xabc), D(0xabd);

   private int _value;

   private IOPortID(int value) { _value = value; }

   // Reverse lookup: one comparison per constant.
   public static IOPortID fromValue(int value)
   {
      if (value == A._value) return A;
      if (value == B._value) return B;
      if (value == C._value) return C;
      if (value == D._value) return D;
      throw new IllegalArgumentException("unknown hardware address " + value);
   }
}

Either implementation is bad.  The first one might be slightly better in that I’ve decoupled the port ID value from the port ID symbol and made that particular method accessible only within the package.  In both cases, I’ve had to put in a stack of boilerplate code, which is a code smell.  I will say that because Java does not dictate how many members you put into an enum, you could very well put in any number of scalar dimensions or references to other classes.  For example, you could make an enum that encapsulates longitude and latitude and represents major cities.
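One common way to keep the second variant from growing with every new constant is to build the reverse lookup once from values().  It is still boilerplate that the language makes you write, but it no longer needs one line per constant.  The following is a sketch rather than the code we shipped: the enum is renamed MappedPortID so it does not clash with the examples above, and getHardwareAddress(), fromHardwareAddress(), and the BY_ADDRESS map are names introduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: same ports and addresses as above, with a table-driven reverse lookup.
enum MappedPortID {
    A(0xaba), B(0xabb), C(0xabc), D(0xabd);

    // Filled once during class initialization, after all constants exist.
    private static final Map<Integer, MappedPortID> BY_ADDRESS = new HashMap<>();
    static {
        for (MappedPortID id : values()) {
            BY_ADDRESS.put(id.hardwareAddress, id);
        }
    }

    private final int hardwareAddress;

    MappedPortID(int hardwareAddress) {
        this.hardwareAddress = hardwareAddress;
    }

    int getHardwareAddress() {
        return hardwareAddress;
    }

    // Throws for addresses that no constant claims.
    static MappedPortID fromHardwareAddress(int address) {
        MappedPortID id = BY_ADDRESS.get(address);
        if (id == null) {
            throw new IllegalArgumentException("unknown hardware address " + address);
        }
        return id;
    }
}
```

Adding a port E then means touching only the constant list; the lookup picks it up on its own.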

And herein is the problem with Java enumerated values: they are not enumerated values.  They are singleton objects.  When I declare an enum as in the very first example, I end up with four objects constructed, A, B, C, and D, each an instance of the IOPortID class.  Don’t believe me?  Add a setValue() method to the version above that stores _value and you can mutate your enum.  Yikes.  I can think of a few reasons why you might want to do this, but they’re all hacks on top of a bad base model.
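A small sketch of what that mutation looks like, assuming a hypothetical MutablePortID enum with a non-final field plus getValue() and setValue() methods (names introduced here for illustration).  It also shows that the constants are objects of one enum class rather than scalars.

```java
// Sketch: an enum whose constants carry mutable state. Legal Java, but hazardous.
enum MutablePortID {
    A(0xaba), B(0xabb);

    private int value; // deliberately not final

    MutablePortID(int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }

    void setValue(int value) {
        this.value = value;
    }
}

final class MutablePortDemo {
    public static void main(String[] args) {
        MutablePortID first = MutablePortID.A;
        MutablePortID second = MutablePortID.A;

        // Both variables refer to the one and only A object.
        System.out.println(first == second); // prints true

        // Changing it through one reference changes it for every user of A.
        first.setValue(0x123);
        System.out.println(Integer.toHexString(second.getValue())); // prints 123

        // A and B are two instances of the same class.
        System.out.println(MutablePortID.A.getClass() == MutablePortID.B.getClass()); // prints true
    }
}
```

Declaring the field final is enough to rule this out, but the language leaves that up to you.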

One of the other handy use cases for enums is to represent bit flags for settings.  This is frequently used in software: for example, the permissions in a PDF file are represented by a 32-bit integer holding a set of bit flags.  The prescribed method in Java is to use EnumSet.  EnumSet has two main problems which are intimately entwined: it is neither blittable nor immutable.

Let me explain.  A blittable type is any type that is bit-for-bit copied by an assignment.  Scalars (int, char, float, etc.) are blittable.  Class instances are not: assignment copies only the reference, so both variables end up pointing at the same object.  Since Enums and EnumSets are classes, they are not blittable.  This means that if they are mutable (and Enums can be and EnumSets are), then we have to be very careful about how they are used.

For example, if I design an abstract class that needs to return an EnumSet of all operations that it supports, I need to either prescribe that the implementation always return a new EnumSet (bad idea: I can’t trust all my implementors) or wrap each call to the abstract method in another method that does an EnumSet.copyOf().  This is cumbersome and error prone and is necessary because EnumSet is fundamentally broken.  remove() and removeAll() mutate the set, and this is bad.  Instead, there should be the methods intersect(other), union(other), difference(other), and complement(), which do the actual set operations and return a new EnumSet of the appropriate type, and there should be no mutators.  This way instead of writing this code:

public final boolean has1DOtherThanCode39(EnumSet<Symbologies> sym)
{
   EnumSet<Symbologies> set = EnumSet.copyOf(sym); // have to copy!
   set.removeAll(Symbologies.all2D());
   set.remove(Symbologies.CODE_39);
   return !set.isEmpty();
}

I could write this:

return !sym.difference(Symbologies.all2D().union(Symbologies.CODE_39)).isEmpty();

or the equivalent in .NET

return (sym & ~(Symbologies.All2D | Symbologies.Code39)) != 0;

The copyOf() is required to prevent us from side-effecting the input, since what gets passed in is a reference to the caller’s set, not a copy of it.  This is an easy thing to forget, which makes it a bug waiting to happen.  If we left out the copyOf() call, we would be side-effecting the caller’s copy, which in the case of this particular operation will not always be noticeable.  Not cool.  In addition, if EnumSet were immutable, then we would be allocating 3 EnumSet objects that would get thrown away immediately (probably their reason for making them mutable in the first place).
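The non-mutating operations wished for above can be approximated today with static helpers that always copy before they modify.  This is only a sketch: EnumSets, union, and difference are hypothetical names that are not part of java.util, and the Symbology enum is a three-member stand-in for the real symbology list.

```java
import java.util.EnumSet;
import java.util.Set;

// Stand-in enum for illustration; the real symbology list is much longer.
enum Symbology { CODE_39, CODE_128, QR }

// Hypothetical helpers that never modify their arguments.
final class EnumSets {
    private EnumSets() { }

    static <E extends Enum<E>> EnumSet<E> union(EnumSet<E> a, Set<E> b) {
        EnumSet<E> result = EnumSet.copyOf(a); // copy first, then mutate the copy
        result.addAll(b);
        return result;
    }

    static <E extends Enum<E>> EnumSet<E> difference(EnumSet<E> a, Set<E> b) {
        EnumSet<E> result = EnumSet.copyOf(a);
        result.removeAll(b);
        return result;
    }
}

final class SymbologyChecks {
    private SymbologyChecks() { }

    // Same question as has1DOtherThanCode39, without touching the caller's set.
    // QR plays the role of "all 2D symbologies" in this tiny stand-in enum.
    static boolean has1DOtherThanCode39(EnumSet<Symbology> sym) {
        EnumSet<Symbology> excluded = EnumSet.of(Symbology.QR, Symbology.CODE_39);
        return !EnumSets.difference(sym, excluded).isEmpty();
    }
}
```

The copy is still there; it has just moved into one place where it cannot be forgotten.  Each call allocates a fresh set, which is exactly the cost described above.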

So we can see that the Java Enum is a class and not actually an enum.  When any case beyond simple cardinals is needed (which in my experience is about 3/4 of the time), the programmer is required to write boilerplate code, which is error prone.  Finally, EnumSets are cumbersome to use and require the client to understand the consequences of having a mutable class, although to its credit, EnumSet will scale well beyond 64 members.  By contrast, the .NET enum is a scalar (and blittable), is trivial to use in all the common cases, and only starts to be a leaky abstraction when the use case is inappropriate for an enum and more appropriate for an actual class.
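When an EnumSet has to round-trip through an integer of bit flags, as with the PDF permissions mentioned earlier, the conversion has to be written by hand.  A minimal sketch follows; the Permission enum, its members, and its bit positions are illustrative only and should not be read as the PDF specification’s layout.

```java
import java.util.EnumSet;

// Hypothetical flags; names and bit positions are illustrative only.
enum Permission {
    PRINT(1 << 2), MODIFY(1 << 3), COPY(1 << 4);

    private final int mask;

    Permission(int mask) {
        this.mask = mask;
    }

    // Pack a set of flags into a single int.
    static int toBits(EnumSet<Permission> set) {
        int bits = 0;
        for (Permission p : set) {
            bits |= p.mask;
        }
        return bits;
    }

    // Unpack an int into a fresh set; bits with no matching constant are ignored.
    static EnumSet<Permission> fromBits(int bits) {
        EnumSet<Permission> set = EnumSet.noneOf(Permission.class);
        for (Permission p : values()) {
            if ((bits & p.mask) != 0) {
                set.add(p);
            }
        }
        return set;
    }
}
```

This is the glue that a scalar enum gives you for free, and it is one more place where the mask values and the constant list have to be kept in step by hand.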

About the Author

Steve Hawley

Steve was with Atalasoft from 2005 until 2015. He was responsible for the architecture and development of DotImage, and one of the masterminds behind Bacon Day. Steve has over 20 years of experience with companies like Bell Communications Research, Adobe Systems, Newfire, Presto Technologies.
