Java 1.5 on Linux


[ 2004-December-01 23:45 ]

Java on Linux is a little entertaining, mostly because licence restrictions prevent distributions from shipping Sun's JDK directly. So here is what you need to do to install Java on Debian and RedHat, and how to keep Java 1.5 from randomly hanging on RedHat 9. As an added bonus, I've included the tips I've learned for working out whether a problem lies in the JVM or in your own code.

Debian

RedHat

Sun distributes an RPM package, so installation is easier. However, there is an incompatibility between NPTL (the Native POSIX Thread Library, Linux's new thread library) as distributed with RedHat 9 and the Java 1.5 JVM. This bug causes the JVM to hang randomly and eat up 100% CPU. If you attach to it with gdb (gdb -p [process id]) and get a back trace (bt), you will see it is waiting on some mutex. If you attach to it with strace (strace -p [process id]), you will see it is stuck waiting on a futex. Either way, to work around it you can set LD_ASSUME_KERNEL=2.4.1 in the environment before running the JVM. This makes the dynamic linker fall back to the older LinuxThreads implementation, which disables the new thread library. Rumour has it that Fedora does not have this problem, and I know for a fact that Debian sarge does not have this problem. I recommend Debian.
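From a shell, the workaround is just a variable set in front of the java command. If the JVM is started from another Java program instead, the same thing can be done with ProcessBuilder (new in Java 1.5). The following is only a sketch: the class name LinuxThreadsLauncher and the method withLinuxThreads are made up for this example, and the launcher itself still runs under whatever thread library the parent JVM was started with.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the JDK: starts a command with
// LD_ASSUME_KERNEL=2.4.1 in its environment, as described above.
class LinuxThreadsLauncher {

    // Builds (but does not start) a child process with the workaround set.
    static ProcessBuilder withLinuxThreads(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.environment().put("LD_ASSUME_KERNEL", "2.4.1");
        // Merge stderr into stdout so a single loop can copy all output.
        builder.redirectErrorStream(true);
        return builder;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length == 0) {
            System.err.println("usage: java LinuxThreadsLauncher <command> [args...]");
            return;
        }
        Process child = withLinuxThreads(args).start();
        // Copy the child's output to our own stdout until it exits.
        InputStream in = child.getInputStream();
        byte[] buffer = new byte[4096];
        int count;
        while ((count = in.read(buffer)) != -1) {
            System.out.write(buffer, 0, count);
        }
        System.out.flush();
        System.exit(child.waitFor());
    }
}
```

For example, java LinuxThreadsLauncher java -jar myapp.jar would start the second JVM with the variable set (myapp.jar is a placeholder). Only use this on systems that actually show the hang, since newer distributions may not ship the old thread library at all.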

Notes on Debugging Hard Java Problems

In tracking this problem down, I learned about a few Java debugging tools that I hadn't used before. Here is what I did, roughly in the order I would try things again:

  1. Send the process SIGQUIT (use kill -SIGQUIT [process id]), and it should print a thread trace: a stack trace for every thread in the JVM. If it doesn't, the JVM has hung.
  2. Enable debugging with java -agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n (see Sun's Java 1.5 debugging documentation for details), even if you don't want to debug right now. It isn't that slow, and it lets you attach to the process if it hangs or does something strange (jdb -attach 8000). If your problem disappears, it's most likely a JVM issue. The experimental Java 1.5 debug tools are also worth a look because they can attach to a process that wasn't started with debugging enabled.
  3. It can help to get the output from the statistical profiler (hprof) and see where your process is spending its time. See java -agentlib:hprof=help for more info.
  4. As a last resort, try attaching to the JVM process with gdb (gdb -p [process id]) and getting a back trace (bt). You might be able to figure out where in your program the problem lies from the native code that is currently executing. Alternatively, you can get a log of the system calls with strace (strace -p [process id]).
  5. If all else fails, search the web and see if anyone else has had your problem.
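
If the JVM is still responsive but sending it a signal is awkward (say, the output goes somewhere you can't see), a similar thread trace to the one in step 1 can be produced from inside the program using Thread.getAllStackTraces(), which is new in Java 1.5. This is a minimal sketch, not a replacement for SIGQUIT: the class name ThreadDump is made up for this example, the output format is my own, and it won't help at all once the JVM has truly hung.

```java
import java.util.Map;

// Hypothetical helper: builds a text dump of every live thread's stack,
// roughly what the JVM prints when it receives SIGQUIT.
class ThreadDump {

    static String dump() {
        StringBuilder out = new StringBuilder();
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
            Thread thread = entry.getKey();
            // Header line: thread name in quotes, then its current state.
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            // One line per stack frame, innermost frame first.
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling ThreadDump.dump() from a watchdog thread or a debug hook and writing the result to your log gives you something to compare against the gdb back trace from step 4.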