2012-05-19

Immutability

Today I had a pretty long discussion concerning immutability, value objects and encapsulation in the Java programming language. All these are very important topics, but let me focus now on just one thing - immutability.

Defaults

Everybody has habits in the way they program, knit or cook. I'm only an amateur when it comes to cycling, but I once heard that it is a good habit to keep your buttocks on the saddle as you cycle. Standing on the pedals, by contrast, would be a bad habit. So how about (im)mutability?
One of the Java authorities, Joshua Bloch, suggests in his awesome Effective Java that one should minimize mutability. There also seems to be a trend on the JVM nowadays towards the functional programming paradigm (using languages like Clojure and Scala or tools like Guava), which strongly encourages immutability.
Unfortunately, it seems to me that there are lots of Software Engineers with an Object Oriented background for whom mutability is the default, hence they do not feel comfortable with the concept of immutability.

The good

OK. So what can you gain by introducing the habit of using immutable data structures in your project? Well, there are a few things I can think of.

Reasoning

When objects are immutable they are easy to reason about. There is simply one state that they can be in. Well, that seems abstract, so let me give you an example. The best one I can think of is inspired by a case study from Domain Driven Design by Eric Evans (which I highly recommend reading). It's about paint mixing domain logic. There is a paint bucket class which has an amount of paint and a colour. There is a method that pours one bucket into another.
Since the objects are mutable, there are at least two possible semantics of pouring. The first one really pours the paint from one bucket into another, so that the source bucket ends up empty. This seems awkward. Consider the following code:
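A minimal sketch of that situation, assuming a hypothetical MutableBucket class whose pour method moves all the paint into the target. The class name, the fields and the litre amounts are made up for illustration, and colour blending is left out to keep it short:

```java
// Hypothetical mutable bucket; names and fields are assumptions for illustration.
class MutableBucket {
    private final String colour; // colour blending is left out of this sketch
    private double litres;

    MutableBucket(String colour, double litres) {
        this.colour = colour;
        this.litres = litres;
    }

    // "Move" semantics: all the paint leaves this bucket and ends up in the target.
    void pour(MutableBucket target) {
        target.litres += this.litres;
        this.litres = 0;
    }

    double litres() { return litres; }
    String colour() { return colour; }
}

class MovePourDemo {
    public static void main(String[] args) {
        MutableBucket yellow = new MutableBucket("yellow", 5.0);
        MutableBucket red = new MutableBucket("red", 5.0);
        MutableBucket blue = new MutableBucket("blue", 5.0);

        yellow.pour(red);  // red now holds 10 litres, yellow is silently emptied
        yellow.pour(blue); // looks like it should make green, but pours nothing

        System.out.println(blue.litres()); // 5.0
    }
}
```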
First of all, the empty bucket of yellow paint is useless (which already makes these semantics questionable), but the worse thing is that it is really easy for a Software Engineer to assume that yellow wasn't emptied by pouring it into the red bucket, which will presumably cause errors.
But maybe the semantics are wrong. Let's consider the second option: pouring copies the paint, so that the source bucket stays the same. But wait. Now, after pouring, the amount of paint in the world increases, right? To make things worse, let's expand the interface so that we can pour part of the bucket. Example:
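Again only a sketch, assuming a hypothetical CopyingBucket whose pour method takes the fraction to pour as a second parameter. That signature and all the amounts are my assumptions:

```java
// Hypothetical mutable bucket with "copy" semantics; all names are assumptions.
class CopyingBucket {
    private final String colour; // colour blending is left out of this sketch
    private double litres;

    CopyingBucket(String colour, double litres) {
        this.colour = colour;
        this.litres = litres;
    }

    // Adds the given fraction of this bucket's paint to the target,
    // but leaves this bucket untouched.
    void pour(CopyingBucket target, double fraction) {
        target.litres += this.litres * fraction;
    }

    double litres() { return litres; }
    String colour() { return colour; }
}

class CopyPourDemo {
    public static void main(String[] args) {
        CopyingBucket blue = new CopyingBucket("blue", 10.0);
        CopyingBucket yellow = new CopyingBucket("yellow", 5.0);
        CopyingBucket red = new CopyingBucket("red", 5.0);

        blue.pour(yellow, 0.5); // yellow now holds 10 litres
        blue.pour(red, 0.5);    // red now holds 10 litres, blue still holds 10

        double total = blue.litres() + yellow.litres() + red.litres();
        System.out.println(total); // 30.0, although we started with 20.0
    }
}
```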
Here the developer assumed that they could pour half of the bucket into the one with yellow paint and use the other half for mixing into the red one. They basically assumed that the total amount of paint stays the same when pouring.
Now let's look at how immutability can help here. The Bucket instances are immutable and the #pour method has been replaced by #mix, which returns a new instance.
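A minimal sketch of such an immutable Bucket. How the colour is represented is my assumption: here a bucket stores whole millilitres of three primary pigments, and the colour is implied by their proportions, so that mixing boils down to plain addition:

```java
import java.util.Objects;

// Immutable value object: all fields are final and there are no setters.
final class Bucket {
    // Millilitres of each primary pigment (an assumed representation).
    private final long redMl;
    private final long yellowMl;
    private final long blueMl;

    Bucket(long redMl, long yellowMl, long blueMl) {
        this.redMl = redMl;
        this.yellowMl = yellowMl;
        this.blueMl = blueMl;
    }

    // Returns a new bucket; neither this bucket nor the other one is modified.
    Bucket mix(Bucket other) {
        return new Bucket(redMl + other.redMl,
                          yellowMl + other.yellowMl,
                          blueMl + other.blueMl);
    }

    long totalMl() { return redMl + yellowMl + blueMl; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Bucket)) return false;
        Bucket b = (Bucket) o;
        return redMl == b.redMl && yellowMl == b.yellowMl && blueMl == b.blueMl;
    }

    @Override
    public int hashCode() { return Objects.hash(redMl, yellowMl, blueMl); }
}

class MixDemo {
    public static void main(String[] args) {
        Bucket yellow = new Bucket(0, 5000, 0);
        Bucket blue = new Bucket(0, 0, 5000);

        Bucket green = yellow.mix(blue); // yellow and blue are left as they were

        System.out.println(green.equals(blue.mix(yellow))); // true
    }
}
```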
Now mixing colours seems pretty obvious. Actually, we can see that it works similarly to adding integers with +. This is because #mix became a commutative and associative binary operator, which is a good thing because we are used to these. Everybody has a feel for how adding two integers works, so mixing paint shouldn't be much more difficult.

Sharing

Many people think that making things immutable will increase their memory footprint. I mean, obviously we are producing new objects all the time, so we should consume more memory, right? Wrong. I mean, partially wrong. First of all, when we really replace one object with another, we're probably abandoning the first one, so the garbage collector can reclaim it and it will not consume memory for long. Besides, there are many places where we can save a lot of memory.
The awesome thing about immutable objects is that they are immutable (thanks Captain Obvious!), so we can share them with any part of the system. No fear that someone at some point changes objects we depend on without our knowledge. No need for making copies (especially deep copies).
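To make the difference concrete, here is a small sketch with a hypothetical Meeting class (the name and fields are made up). It holds one mutable java.util.Date, which has to be copied on the way in and on the way out, and one immutable String, which can simply be shared:

```java
import java.util.Date;

// Hypothetical class mixing a mutable and an immutable field type.
final class Meeting {
    private final Date start;   // java.util.Date is mutable
    private final String title; // String is immutable

    Meeting(Date start, String title) {
        this.start = new Date(start.getTime()); // defensive copy on the way in
        this.title = title;                     // safe to share as is
    }

    Date start() { return new Date(start.getTime()); } // defensive copy on the way out
    String title() { return title; }                   // no copy needed
}

class SharingDemo {
    public static void main(String[] args) {
        Date when = new Date(1337414400000L);
        String title = "Immutability talk";
        Meeting meeting = new Meeting(when, title);

        when.setTime(0L); // the caller mutates its own Date afterwards

        System.out.println(meeting.start().getTime() == 1337414400000L); // true
        System.out.println(meeting.title() == title);                    // true
    }
}
```

Without the two copies, the setTime call above would have silently moved the meeting. The String needs no such protection.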

Concurrency

Concurrency is difficult. I think that pretty much anyone who has written a concurrent piece of code will agree. One of the worst things in concurrency is locking, which is expensive and can easily cause a piece of concurrent code to hang in a deadlock. But guess what? Immutable data structures are thread safe by definition: because they cannot be changed, reading them does not require any kind of locking.
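As a rough sketch of what that looks like in practice, assume a hypothetical immutable Paint class with only final fields. Several tasks read the same instance from a thread pool without a single lock, and each of them derives its own new value instead of changing the shared one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical immutable value: all fields are final and there are no setters.
final class Paint {
    private final String colour;
    private final long millilitres;

    Paint(String colour, long millilitres) {
        this.colour = colour;
        this.millilitres = millilitres;
    }

    // Returns a new Paint instead of modifying this one.
    Paint plus(long extraMl) { return new Paint(colour, millilitres + extraMl); }

    long millilitres() { return millilitres; }
    String colour() { return colour; }
}

class SharedPaintDemo {
    // Every task reads the same Paint instance; no locks or synchronized blocks.
    static long totalAcrossThreads(Paint shared, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                long extra = i;
                Callable<Long> task = () -> shared.plus(extra).millilitres();
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Paint base = new Paint("white", 1000);
        System.out.println(totalAcrossThreads(base, 4)); // 4006
        System.out.println(base.millilitres());          // 1000
    }
}
```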

The bad

By now you have probably become suspicious. Is it really that awesome? The answer is: yes, in most cases. There are some algorithms that perform significantly better when implemented with mutable data structures. But the thing is that performance bottlenecks in your application are more likely to be caused by deadlocks or thread starvation than by the garbage collector cleaning up your unused immutable instances, so in the end it should be a worthwhile investment.