DSLs

Jon Udell on DSLs. I think he has it right. DSLs are the way of the future. You only need to look at the proliferation of XML flavors used in the Java world to see that basically everyone has decided that using a DSL is better than writing Java code. I think this is especially telling in that XML is a horrible programming language, and still using XML is easier/better than hand coding the same thing in the general purpose language. Just imagine what could happen if these languages were designed to be easy to understand.

One of the reasons I really like Ruby is that it is easy to implement stuff that looks and feels like syntax but is really just normal code. This allows you to extend the language in ways that make your code more obvious. Just take the member access decorators as an example. If you want to make a method private you do the following:

private
def my_method
  #do something
end

In most languages “private” is a keyword that the compiler understands, but in Ruby it is just a class method that says “future methods defined on this class are private until otherwise specified”. The power of this is pretty amazing when it is used correctly.

Another example is Rake (I stole these examples from Jim Weirich’s Rake tutorial). Rake is yet another build system like Make and Ant, but it is implemented as a set of extensions to Ruby so that your build script is straight Ruby while the common operations, like task dependencies, are expressed succinctly and in a way that is easy to understand. For example

file 'main.o' => ["main.c", "greet.h"] do
  sh "cc -c -o main.o main.c"
end

‘file’ defines a task that creates a file. File tasks know things like: if my file does not exist, or any of the files on which I depend are newer than my file, I need to execute; otherwise I am a no-op. Just think how much less obvious it would be if you were to write that out in a general purpose language, or in an XML dialect. Even better, you have a full strength language at your disposal if you need to do something that the build system’s developers didn’t anticipate, which you will.
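
To make that comparison concrete, here is a rough sketch of just the up-to-date check from the ‘file’ task, written out by hand in Java. The class and method names (FileTaskSketch, needsRebuild) are my own invention for illustration and say nothing about how Rake is actually implemented. It assumes Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;

// Hypothetical helper: a hand-written version of the rule a Rake file task applies.
final class FileTaskSketch {

    /** Returns true if the target is missing or any dependency is newer than it. */
    static boolean needsRebuild(Path target, List<Path> dependencies) throws IOException {
        if (!Files.exists(target)) {
            return true;
        }
        FileTime targetTime = Files.getLastModifiedTime(target);
        for (Path dependency : dependencies) {
            if (Files.getLastModifiedTime(dependency).compareTo(targetTime) > 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        Path target = Path.of("main.o");
        List<Path> dependencies = List.of(Path.of("main.c"), Path.of("greet.h"));
        if (needsRebuild(target, dependencies)) {
            // Same command as the sh call in the Rake example above.
            int exitCode = new ProcessBuilder("cc", "-c", "-o", "main.o", "main.c")
                    .inheritIO()
                    .start()
                    .waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException("cc failed with exit code " + exitCode);
            }
        }
    }
}
```

That is a whole class for one rule and one task, before any notion of tasks depending on other tasks, which is the part Rake expresses in a single line.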

Back to Jon’s article. I am not sure if there will be a consolidation of environments, in the near future at least. It would be really nice if this were true but there are problems with all the obvious contenders. Most of the open source community will not accept a VM that is not fork-able (by which I mean that they cannot fork the code base if they do not like the direction it is moving). The Sun JVM is proprietary and the only open source JVMs are incomplete. The .NET CLR is an option but everyone is afraid of MS. The CLR is probably safe because of its status as an ECMA standard, but there does not seem to be much movement to port existing languages to the CLR even though Mono claims to be in pretty good shape these days. Then there is Parrot (the Perl 6 VM), but it is not complete yet and it is not clear when it will be ready or how well it will support non-Perl languages.

You may have noticed the above is mostly about what the open source community will accept. I think most of the innovation in programming languages and DSLs is coming out of the open source community right now. There has been some movement toward more inclusion in the commercial offerings. Sun has been adding support for dynamic languages with BSF and Coyote, and the .NET CLR has always supported multiple languages. However, I think that if there is an environment consolidation it will be because the open source community comes to a consensus that there are one or two platforms that are good enough for all their needs. I know both Parrot and Mono want to be this platform but neither of them is there yet, nor are any of the commercial VMs. It will be interesting to see what happens.

{Update: Fixed the description of the private access modifier in Ruby. Thanks to obsolete rubyist for pointing out that I had gotten it wrong.}

Continuations (or How my Head Exploded)

Occasionally I find a new idea and wonder how I lived so long without encountering it before. Continuations are one of those ideas. I have been hearing about them for a few months but only recently have I started to understand them. They have been around for a long time and can solve problems that are difficult or impossible to solve otherwise. Of course, continuations are not supported in the in-vogue languages so that is probably how I missed them.

Anyway, here are some links to a couple of tutorials (thanks Charlie). A gentle introduction and a not-so-gentle description of external iterators, called generators, in Ruby. If those articles do not humble you a little I am impressed.

XP as Over-Reaction (Redux)

In an earlier entry I said, “XP is an over-reaction to water fall development methodologies.” I was wrong. I still think XP is an over-reaction but I think it is reacting to heavy-weight methodologies rather than waterfall methodologies.

The primary difference between XP and other methodologies is that XP urges you to not do a lot of things you have been told to do in the past, such as design documents and anticipated features, because “you are not going to need it”. This mentality definitely has some benefit (it is very easy to over-engineer software) but I think that most implementations of XP take it too far. It is easy to just decide not to do anything you do not need at this very moment. I think this sets you up for trouble in the future.

For example, I have been told that it is uncommon for Java code implemented using XP to have javadoc comments. This follows from the basic principles of XP, if you apply them with little thought. You do not need the comments when you are writing the method (you just wrote the test and you know what the method is supposed to do), so writing a comment is a waste of time. However, I think that method and class comments are invaluable: they allow for much easier maintenance and refactoring in the future.

I suspect that most of my issues with XP come from poor implementations of the process. On the other hand, it almost does not matter. XP is intended to produce better software more reliably and if it is difficult to use correctly then it is unlikely to achieve that goal.

Interview with Alan Kay

This is an interesting interview with Alan Kay.

Most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves.

He also has some choice words for early binding

If you’re using early-binding languages as most people do, rather than late-binding languages, then you really start getting locked in to stuff that you’ve already done. You can’t reformulate things that easily.

and the common languages today

so both Lisp and Smalltalk can do their things and are viable today. But both of them are quite obsolete, of course. The stuff that is in vogue today is only about “one- half” of those languages.

and for computer science education

but I fear—as far as I can tell—that most undergraduate degrees in computer science these days are basically Java vocational training.

{Update: Fixed the link to the interview.}

Java Daemon

Lately I have been writing a Java program that needs to run in the background (like a daemon). I found a couple of neat little tricks that can make this easier. These ideas probably only work in a Unix environment but they have been tested on Linux and Solaris.

So you have your program and you want to start it such that it will not be killed when you log out of the shell in which you start it. You could use nohup, but nohup redirects the standard out and error to files, which is annoying because you are writing all output to a log file anyway. You could do java -cp your_class_path com.domain.main_class <&- 1>/dev/null 2>&1 & which runs the program in the background, closes standard in, and redirects standard out and error to the bit bucket. Because standard in is closed, the shell will not kill the program when it exits, and because the program is running in the background it will not interfere with other actions we might want to perform with this shell.

However, it would be nice if you could print errors that occur during startup to the prompt — for example if the config file were missing. This is nice because it is generally appropriate to sanity check the configuration so that if the program starts there is a significant chance it will actually work correctly. So let’s not redirect standard out and error. That leaves you with
java -cp your_class_path com.domain.main_class <&- &, which works but a shell will not exit while there are still programs attached to its standard out and error — as this one is.

The solution is to have the Java program detach from standard out and error once its startup is complete. So we will create a method called daemonize() that we will call just before entering the infinite loop that is the main program logic.

static public void daemonize()
{
   System.out.close();
   System.err.close();
}

So now we have a main method which looks like

static public void main(String[] args)
{
   try
   {
       // do sanity checks and startup actions
       daemonize();
   }
   catch (Throwable e)
   {
       System.err.println("Startup failed.");
       e.printStackTrace();
       return; // startup failed: do not enter the main loop
   }

   // do infinite loop
}

Now when we start our program we get to see if it started correctly and if it does start it will not be killed when the shell exits nor will it prevent the shell from exiting.

Now that the program is completely detached from the shell the only way to stop it is by killing the process. However, to do that you need to know the pid. Java has no way for a program to figure out its pid directly (it is too system dependent). So we will create a shell script to launch our daemon and record its pid for future use.

#!/bin/sh
java -cp your_class_path com.domain.main_class <&- &
pid=$!
echo ${pid} > mydaemon.pid

This script launches the program and then writes the pid to the file ‘mydaemon.pid’. Now when you want to kill the program you can do

kill `cat mydaemon.pid`

There are a couple of problems with this. One is that the pid file gets written even if the daemon failed to start successfully. Another is that if the daemon crashes the pid file will still exist, which might lead you to believe it is still working.

We can solve the problem of the pid file surviving a daemon crash by passing the pid file name into the Java program, like the following

java -Ddaemon.pidfile=mydaemon.pid -cp your_class_path com.domain.main_class <&- &

And then update the daemonize() method to the following

static public void daemonize()
{
   getPidFile().deleteOnExit();
   System.out.close();
   System.err.close();
}

where getPidFile() is a method which returns a File object for the file specified by the system property “daemon.pidfile”. That way the pid file will be deleted when the VM exits normally.
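
I did not show getPidFile() above, so here is a minimal sketch of what it might look like. The class name PidFileSketch and the choice to fail fast when the property is missing are assumptions of mine; the only thing taken from the text is the “daemon.pidfile” property name.

```java
import java.io.File;

// Hypothetical holder class; in the daemon this method would live in the main class.
final class PidFileSketch {

    /** System property set by the launch script via -Ddaemon.pidfile=mydaemon.pid */
    static final String PID_FILE_PROPERTY = "daemon.pidfile";

    /** Returns the pid file named by the system property, or fails if it is not set. */
    static File getPidFile() {
        String path = System.getProperty(PID_FILE_PROPERTY);
        if (path == null || path.isEmpty()) {
            // Assumption: a missing property is treated as a startup error.
            throw new IllegalStateException(
                    "System property " + PID_FILE_PROPERTY + " is not set");
        }
        return new File(path);
    }

    public static void main(String[] args) {
        System.out.println("pid file: " + getPidFile().getAbsolutePath());
    }
}
```

Failing here fits with the idea of sanity checking during startup: the exception is thrown before the program detaches from the console, so whoever launched it gets to see it.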

For the overly eager creation of the pid file you could add a delay to the shell script and then check to make sure the process is still running before writing the pid file. But how long should the delay be? A better way is to take advantage of the fact that a shell will not exit while a program is attached to its standard out or error, even if the process is running in the background. We know that our program will not detach from those until the startup process is complete. The following shell script achieves this

#!/bin/sh

launch_daemon()
{
  /bin/sh <<EOF
     java -Ddaemon.pidfile=mydaemon.pid -cp your_class_path com.domain.main_class <&- &
     pid=\$!
     echo \${pid}
EOF
}

daemon_pid=`launch_daemon`
if ps -p "${daemon_pid}" >/dev/null 2>&1
then
  # daemon is running.
  echo ${daemon_pid} > mydaemon.pid
else
  echo "Daemon did not start."
fi

This script starts a sub-shell and launches the daemon (in the launch_daemon() function). The sub-shell will only return once the Java program has detached from the console; for our program that means it has completed its startup or died. After the launch_daemon() function returns we check to see if the pid it started is still running. If so, it means that the daemon started correctly and we write the daemon’s pid to the pid file. Remember that whenever the daemon’s VM shuts down the pid file will be deleted, so you can treat the existence of the pid file as an indication that the process is running.
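
One caveat: deleteOnExit() only runs when the VM terminates normally, so a daemon that is killed hard can leave a stale pid file behind. If you want a stricter check than “the file exists”, something like the following sketch might do. It is not part of the daemon above; the names PidFileStatus and isDaemonRunning are made up, and it assumes Java 11 or later (ProcessHandle arrived in Java 9, Files.readString in Java 11).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical status check that could back a "status" command for the daemon.
final class PidFileStatus {

    /** Returns true only if the pid file exists and names a process that is alive. */
    static boolean isDaemonRunning(Path pidFile) throws IOException {
        String text;
        try {
            text = Files.readString(pidFile).trim();
        } catch (NoSuchFileException e) {
            return false; // no pid file: never started, or shut down cleanly
        }
        long pid;
        try {
            pid = Long.parseLong(text);
        } catch (NumberFormatException e) {
            return false; // pid file does not contain a number
        }
        // ProcessHandle.of returns an empty Optional if no such process exists.
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    public static void main(String[] args) throws IOException {
        Path pidFile = Path.of(args.length > 0 ? args[0] : "mydaemon.pid");
        System.out.println(isDaemonRunning(pidFile) ? "running" : "not running");
    }
}
```

This still cannot tell the daemon apart from an unrelated process that happens to have been given the same pid later, so treat it as a better hint rather than a guarantee.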

Now it occurs to you that if a problem occurs during startup you really would like to log it to both the log file and the console. Since you are using log4j this is pretty straightforward. Just update your main method like the following

static public void main(String[] args)
{
   Appender startupAppender = new ConsoleAppender(new SimpleLayout(), "System.err");
   try
   {
       logger.addAppender(startupAppender);
       // do sanity checks and startup actions
       daemonize();
   }
   catch (Throwable e)
   {
       logger.fatal("Startup failed.",e);
       return; // startup failed: do not enter the main loop
   }
   finally
   {
      logger.removeAppender(startupAppender);
   }

   // do infinite loop
}

where “logger” is a static member variable that contains a Logger object. The nice thing about this is that you can log messages anywhere in your startup code and know that someone will see them, even if they occur before you have configured the logging based on the application configuration. If the normal application logging is configured, the messages will go both to the console and to the log file for future debugging.

So all that works pretty well. There is another problem though. There is not a clean way to shut this daemon down. We need a graceful way to handle shutdown. So we add the following code to our main class

// volatile: the flag is set by one thread and read by the main loop in another
static protected volatile boolean shutdownRequested = false;

static public void shutdown()
{
   shutdownRequested = true;
}

static public boolean isShutdownRequested()
{
   return shutdownRequested;
}

Then we update our application so that it occasionally checks ‘isShutdownRequested()’ and, if shutdown has been requested, leaves the main loop. Now our main method looks like

static public void main(String[] args)
{
   Appender startupAppender = new ConsoleAppender(new SimpleLayout(), "System.err");
   try
   {
       logger.addAppender(startupAppender);
       // do sanity checks and startup actions
       daemonize();
   }
   catch (Throwable e)
   {
       logger.fatal("Startup failed.",e);
       return; // startup failed: do not enter the main loop
   }
   finally
   {
      logger.removeAppender(startupAppender);
   }

   while(!isShutdownRequested())
   {
      // wait for stimuli
      // process stimulus
   }
}

This looks pretty good but you can still only shut down from inside the program. The solution is a VM shutdown hook. This is a bit of code that the VM runs when it is shutting down. We create the following method

static protected void addDaemonShutdownHook()
{
   Runtime.getRuntime().addShutdownHook( new Thread() { public void run() { MainClass.shutdown(); }});
}

and update the shutdown method as follows

static public void shutdown()
{
   shutdownRequested = true;

   try
   {
       getMainDaemonThread().join();
   }
   catch(InterruptedException e)
   {
       logger.error("Interrupted while waiting on main daemon thread to complete.");
   }
}

Note that we now wait for the main daemon thread to die. This is because the VM waits for the VM shutdown hooks to complete before exiting but it does not wait for other threads to complete. This join allows the main daemon thread to complete in a controlled way rather than being killed by the VM. Then we update the main method to call this new addDaemonShutdownHook() method

static public void main(String[] args)
{
   Appender startupAppender = new ConsoleAppender(new SimpleLayout(), "System.err");
   try
   {
       logger.addAppender(startupAppender);
       // do sanity checks and startup actions
       daemonize();
       addDaemonShutdownHook();
   }
   catch (Throwable e)
   {
       logger.fatal("Startup failed.",e);
       return; // startup failed: do not enter the main loop
   }
   finally
   {
      logger.removeAppender(startupAppender);
   }

   while(!isShutdownRequested())
   {
      // wait for stimuli
      // process stimulus
   }

   // do shutdown actions
}

Now you can kill the process using kill `cat mydaemon.pid` but shutdown will be orderly and controlled.
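
Pulling the shutdown pieces together, here is a self-contained sketch you can run and then stop with kill or Ctrl-C. It leaves out log4j and the pid file to stay short. The class name DaemonShutdownSketch, the runMainLoop() method and the mainDaemonThread field (standing in for getMainDaemonThread()) are assumptions for illustration, and the sleep is only a stand-in for waiting for stimuli.

```java
import java.util.concurrent.TimeUnit;

final class DaemonShutdownSketch {

    // volatile: written by the shutdown hook thread, read by the main loop thread
    private static volatile boolean shutdownRequested = false;
    private static volatile Thread mainDaemonThread;

    static boolean isShutdownRequested() {
        return shutdownRequested;
    }

    /** Requests shutdown and waits for the main loop thread to finish. */
    static void shutdown() {
        shutdownRequested = true;
        Thread main = mainDaemonThread;
        if (main == null || main == Thread.currentThread()) {
            return; // nothing to wait for; joining ourselves would block forever
        }
        try {
            main.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** The main daemon loop; returns once shutdown has been requested. */
    static void runMainLoop() {
        mainDaemonThread = Thread.currentThread();
        while (!isShutdownRequested()) {
            try {
                TimeUnit.MILLISECONDS.sleep(50); // stand-in for "wait for stimuli"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        // do shutdown actions
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(DaemonShutdownSketch::shutdown));
        runMainLoop();
    }
}
```

The check against Thread.currentThread() is there because shutdown() may also be called from the main loop itself, in which case there is nothing to join.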

So there you have it. A fairly safe and full featured way of creating a Unix daemon with Java. Of course, if you do not need the extra control and do not mind having native binaries it might be easier to use Jakarta Daemon.

{Update: corrected a variable name in one of the shell scripts and added a needed closing brace to one of the bits of Java code}

Quote for the Day

From The State of the Scripting Universe

Compilation will eventually come to be seen for what it is: merely an optimization tool, and one whose use is almost always premature.

This interview is really interesting, especially once you realize that this was done via email and that none of the interviewees were aware of the others’ answers. The consensus is amazing.

Stop Over-Reacting

One of my pet peeves with Java is the member access modifiers. I despise the semantics of private. My problem with private is that it prevents subclasses from accessing the member. This is almost always a bad idea. Basically you are saying, “I know how my code should be used, and all other developers are too stupid to be trusted to use this member in a reasonable way”. If there is one thing you can rely on, it is that any code you write will someday be used in a way you have not yet anticipated.

I think the semantics of private are an example of over-reaction. I can hear the reasoning now, “Global variables make a program hard to debug because they are not encapsulated and therefore get accidentally modified. We should solve that problem by having complete encapsulation.” The real problem, however, is that global variables get accidentally changed, not the lack of encapsulation. Encapsulation is merely a tool that might help prevent accidental use/change of state.

The inappropriate use and change happens because it is not clear who owns the state and in what contexts that state should be used and changed. In practice, it seems that encapsulation does help solve this problem, but the encapsulation need only be obvious, not enforced. With OO we have a way to provide obvious logical encapsulation in the form of members. The addition of these obvious boundaries of use effectively solves the accidental misuse problem, regardless of whether those boundaries are enforced or not, by making it clear where it is generally appropriate to use/modify an item. The strictness of Java’s private modifier is not necessary or helpful.

Once I started thinking about this, I noticed that this sort of over-reaction is rampant in the software industry. Static typing is an over-reaction to weakly typed languages. Java/C# inheritance models are an over-reaction to the complexities of multiple inheritance in C++ (interfaces exist because, as it turns out, you really do need multiple inheritance to make OO work and these languages have a broken inheritance model). XP is an over-reaction to waterfall development methodologies. (I think. I have not totally convinced myself of this one yet.)

For example, I think manifest (or static) typing is an over-reaction to weakly typed systems. Everyone who has worked with C has a story about how they accidentally overwrote some random memory by doing pointer arithmetic on something that was not really a pointer, or something similar, and it caused their program to fail much later in a completely different part of the code. This sort of thing is very difficult to debug because the code which is incorrect is not where the failure occurs. So C++ was introduced with strong and manifest types, and it was better. But it was the strong typing (that is, if you attempt to use an item in a way that its type does not support, the code fails in an obvious way) that made it better, not the fact that the types are manifest in the source code. But everyone got the “manifest typing == strong typing == good; anything else == bad” meme anyway.
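
A small Java sketch of what I mean by failing in an obvious way. The declared type here is just Object, so the manifest type tells the compiler nothing useful; it is the run-time type check that catches the misuse, right at the line where it happens. The class and method names are invented for the example.

```java
final class StrongTypingSketch {

    /** Tries to use the value as an Integer and reports where a failure surfaces. */
    static String misuse(Object value) {
        try {
            Integer number = (Integer) value; // fails right here if value is not an Integer
            return "ok: " + (number + 1);
        } catch (ClassCastException e) {
            return "failed at the point of misuse: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(misuse(41));          // prints "ok: 42"
        System.out.println(misuse("forty-one")); // prints "failed at the point of misuse: ClassCastException"
    }
}
```

Compare that with the C story: there is no silent corruption and no failure in some unrelated part of the program later on.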

As for multiple inheritance I can only say that the C++ implementation of multiple inheritance is complex. But it is an implementation problem, not a conceptual problem. Multiple inheritance is not conceptually difficult and it is useful.

This tendency to over-react has been noticed before, of course. There is a well known pattern in software systems called second system syndrome. Second system syndrome is, at its core, an over-reaction. It usually goes like this. You build a system and it works basically as intended. People use it and want it to do something you had not anticipated and it takes a lot of work to implement that functionality. So you say, “well I am not going to have that problem again” and you make the 2.0 version super flexible, extensible and any other -able you can think of. The thing about all those -ables is that they make the base system more complex and mostly you will not be using them. You over-reacted and you pay the price in much more difficult maintenance.

XP attempts to mitigate this tendency by saying, “assume simplicity”, which seems to have devolved into “don’t crystal-ball”. I think the “don’t crystal-ball” form is an over-reaction in itself. We have been burned in the past by unneeded complexity so instead we preclude all functionality that is not needed at this exact moment (even if you will probably need it tomorrow). Assuming simplicity is not a bad approach but developers should try to guess what is going to happen in the future. Then they should examine those predictions with an extremely critical eye. If a predicted functionality is not likely to be needed it should be ignored. If the predicted functionality can be added easily at a later date it should be ignored. If the predicted functionality is likely to be needed and would be difficult to add in the future it is something that should be implemented now, or at least enough of it should be implemented so the rest can be implemented easily in the future. The thing to remember is that most of the functionality you can imagine will never be needed so you need to be brutal when evaluating your predictions.

I think our industry would be a lot better off if we could learn to solve the real problem instead of over-reacting to spurious issues that previous solutions introduced.