Part One: Things Java Got Right (evanjones.ca)

[ 2006-January-08 16:48 ]

I love programming languages, probably because I spend a lot of time using them to express my thoughts. Currently, Java is probably the most widely used language for new software. I am not in love with Java, but I do think that its design got a number of things right. In my opinion, Java has been successful because its designers tried to address C++'s biggest flaws, without changing much else. This is a great example of a "worse is better" design, since they just patched the flaws in the most popular language of the time, instead of reinventing everything. This article describes the things that I think make Java a good language. Part Two is about Java's bad decisions.

Garbage Collection

Memory management is hard. Garbage collection is an elegant solution that eliminates most of the problem: just let the computer take care of it for you. Not having to remember to explicitly free objects makes writing programs easier and more reliable. The cost is that garbage collection can be less efficient, both in the amount of memory consumed and in the CPU cycles spent running the collection algorithm, although modern garbage collectors can actually be faster for some applications. In any case, that hardly matters, because the vast majority of code is not performance critical. The program will be written faster with garbage collection, which means you can spend more time optimizing the 1% where the performance really matters. There are some rare exceptions where garbage collection does not make sense, but for those cases we have the C programming language.

Memory Safety

A second class of common bugs is invalid memory accesses. These are also the kind of problem that commonly creates exploitable security flaws. Java requires all memory accesses to be checked to ensure they are in fact valid. This is more expensive than C's "assume the programmer is right" model. The best example is that every array access includes a check to ensure that the index is within the array's bounds. Like garbage collection, this trades some runtime performance in order to make writing software easier.
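As a small illustration of the array check described above, here is a minimal sketch. The class and method names (BoundsCheckDemo, readOrDefault) are made up for this example; the only thing it relies on is that the JVM throws ArrayIndexOutOfBoundsException for an out-of-range index instead of reading whatever memory happens to be there.

```java
class BoundsCheckDemo {
    // Returns values[index], or fallback when the JVM rejects the access.
    // readOrDefault is a hypothetical helper, not part of any standard API.
    static int readOrDefault(int[] values, int index, int fallback) {
        try {
            return values[index]; // the JVM checks index against values.length
        } catch (ArrayIndexOutOfBoundsException e) {
            // In C, the same read would silently touch adjacent memory.
            return fallback;
        }
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30};
        System.out.println(readOrDefault(values, 1, -1)); // valid index
        System.out.println(readOrDefault(values, 3, -1)); // one past the end
    }
}
```

In real code you would normally let the exception propagate rather than swallow it; catching it here only makes the check visible.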

Standard Library

A well-defined standard library makes it much easier for separately defined APIs to interact. In the C/C++ world, every API defines its own string and container types, which makes life miserable. Java, on the other hand, was defined at the beginning with a nice set of standard types: Unicode strings, containers and operating system objects like files and sockets. These types are critical, since nearly every application and API makes use of them. This makes code reuse much easier. There are other important parts of the standard library, but they are all somewhat application specific.
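To make the point about shared types concrete, here is a rough sketch of a small API that takes and returns only standard library types. WordCountDemo and countWords are hypothetical names invented for this example; the point is just that any caller can pass in a List of String and use the resulting Map without converting between library-specific types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WordCountDemo {
    // Counts whitespace-separated words across all lines.
    // Both the parameter and the return value are standard library types,
    // so independently written code can call this without adapters.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank line yields one empty token
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(java.util.Arrays.asList("a b", "b c")));
    }
}
```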

Runtime Dynamic Linking

Java is completely dynamically linked. The advantage is that when a class changes, only that class needs to be recompiled, provided its interface stays compatible. This is very unlike C++, where nearly all the code that uses a class must be recompiled when the class changes. This means that hacks are required to put C++ classes into shared libraries. With Java, this just works. This makes it easier to reuse code because shared libraries are easier to create. The cost, again, is performance. Certain operations are more expensive because they must do look-ups at runtime, instead of using compile-time information.
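One visible consequence of runtime linking is that a program can look up a class by name while it is running. The sketch below is only an illustration: DynamicLoadDemo and instantiate are hypothetical names, and it assumes the named class has a public no-argument constructor. Class.forName and getDeclaredConstructor().newInstance() are standard reflection calls.

```java
class DynamicLoadDemo {
    // Resolves className at runtime and creates an instance of it.
    // The class only needs to be on the classpath when this line runs,
    // not when DynamicLoadDemo itself was compiled.
    static Object instantiate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        Object list = instantiate("java.util.ArrayList");
        System.out.println(list.getClass().getName());
    }
}
```

The name look-up here is the kind of runtime work that a statically linked C++ program would have resolved at compile or link time.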

Platform Independent Bytecode

Java programs are not distributed as binary executables compiled for a specific CPU and operating system, but rather as platform independent bytecode. This is another decision that has performance implications, as it requires the end user's system to have a virtual machine that interprets the bytecode or compiles it into native code at run time. However, the advantage has been that people are more willing to invest in Java tools because they will be able to easily port them to any other platform.

Binary Standards

Many people in the open-source community have criticized Sun for not being "open" enough with what can carry the "Java" brand. I do not wish to defend or criticize either position here, but there has been one advantage to their strict requirements: Java implementations are binary compatible. Code compiled by one Java implementation can be used by another. This may seem trivial, but it is not. Using code compiled by different C++ compilers is basically impossible. This leads to the horrible situation where companies that wish to provide APIs need to ship multiple versions of their libraries compiled with all the popular compilers. The reason is that the C++ standard only specifies the language, whereas the Java standard also specifies the binary format for the class files.
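Because the class file format is specified, any tool can inspect a compiled class regardless of which compiler produced it. As a minimal sketch: every class file starts with the four-byte magic number 0xCAFEBABE, followed by a two-byte minor version and a two-byte major version, all big-endian. The class and method names below (ClassFileHeaderDemo, hasClassFileMagic, majorVersion) are invented for this example, and main assumes the JDK exposes String.class as a readable resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ClassFileHeaderDemo {
    // True if the first four bytes are the class file magic number.
    static boolean hasClassFileMagic(byte[] header) {
        return header.length >= 4 && ByteBuffer.wrap(header).getInt(0) == 0xCAFEBABE;
    }

    // Reads the unsigned 16-bit major version stored at bytes 6-7.
    static int majorVersion(byte[] header) {
        return ByteBuffer.wrap(header).getShort(6) & 0xFFFF;
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[8];
        // Assumption: the runtime makes String.class available as a resource.
        try (InputStream in = String.class.getResourceAsStream("String.class")) {
            if (in == null) {
                System.out.println("class file not available as a resource");
                return;
            }
            new DataInputStream(in).readFully(header);
        }
        System.out.println("magic ok: " + hasClassFileMagic(header)
                + ", major version: " + majorVersion(header));
    }
}
```

There is no equivalent of this for C++ object files in the language standard, which is why the same trick does not work across C++ compilers.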

Conclusion

I could list a number of more minor decisions that I believe were good ones, but these are the ones that have had a very significant, positive impact. If you look at this list, all the decisions make it easier to write correct programs and to reuse code. Most of these decisions hurt performance, in a general sense. However, the result is that programmers are more productive in Java than they are in C or C++, and that is the reason that Java is more popular. Many of these major decisions have positive interactions. For example, the strict binary standards and the standard library help make the platform-independent bytecode easily portable. This type of synergy is a common characteristic of good design.