Hiring Should be Obsolete

Paul Graham has a new essay, Hiring is Obsolete. As usual it is quite insightful, though perhaps just a little self-serving in light of his latest venture.

I feel just a bit disappointed when I read about other people’s experiences founding startups. Now that I have a family, the risk of founding a startup is too high. I wonder how much wealth would be generated if I did not have to risk my family’s health and well-being to be involved in a startup. Until that changes, founding startups will remain, primarily at least, the domain of the young (or rich).

Regardless of what your emotional reaction is to universal health care and a social safety net, there is no denying that they would reduce the risk of founding a startup. I think that risk reduction would be a massive boost to the economy. It would allow anyone with a good idea to pursue it without having to worry about what would happen if their child got sick, etc. Unfortunately, it would come at the short-term expense of entrenched interests (read: owners of congresspeople), so things are not likely to change anytime soon.

Hungarian Good, Exceptions Bad?

I think Joel has got a good idea in Joel on Software – Making Wrong Code Look Wrong. Unfortunately, his implementation sucks.

I see his point about Apps Hungarian, in which you tag names with an indication of how the application uses them, but I think it clutters the code significantly. Fundamentally, Apps Hungarian is a kludge, and a pretty vile one at that, to get around the fact that common languages lack the really important feature of type aliasing. “rwMax” is fine but so is “maxRow”, and I find the latter a bit easier on the eyes (camel case gripes aside). It would be nice if you could have a type named RowNumber, which was really just another name for an integer, but even then you would probably want to include row or col(umn) in the variable name anyway, just to make it obvious. In my experience this is accepted practice, though perhaps not followed as often as it should be. Either way, you end up with names that look like “rwMy” or “MyRow”.
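As a rough illustration of the RowNumber idea, here is a minimal Python sketch. It assumes Python's typing.NewType as a stand-in for type aliasing: at runtime the values are plain integers, and only a static checker such as mypy treats them as distinct. The names ColNumber and cell_label are made up for this example.

```python
from typing import NewType

# Hypothetical names for illustration; only RowNumber appears in the post.
RowNumber = NewType("RowNumber", int)
ColNumber = NewType("ColNumber", int)


def cell_label(row: RowNumber, col: ColNumber) -> str:
    """Return a label such as 'R3C7' for a cell position."""
    return f"R{row}C{col}"


max_row = RowNumber(3)
max_col = ColNumber(7)

print(cell_label(max_row, max_col))

# cell_label(max_col, max_row) would still run, because both values are
# ordinary ints at runtime, but a static type checker flags the swap.
```

Note that the variable names still carry "row" and "col", so the code reads clearly even where no checker is run.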

As for his example web app, I have a much better solution than his. Rather than crufting up the names in your code, the interfaces you use should be domain specific. The goal of making it obvious when text provided by the user is being returned without first being HTML encoded can easily be achieved by creating an HTML writer class (or module). At its most basic this class would have two methods: write(), which HTML encodes the string before writing it, and writeUnencoded(), which writes the string with no modifications. With this design, using writeUnencoded() with constants or literals is always okay, but using writeUnencoded() with a variable is suspicious. So every time you see

htmlOut.writeUnencoded("<br />")

It is correct.

htmlOut.write("<br />")

It is probably wrong.

htmlOut.write(name)

It is correct.

htmlOut.writeUnencoded(name)

It is probably very wrong.

This solution is better in several respects: bad code smells bad, doing a dangerous thing is more difficult than doing the safe thing, and it is not subject to breakage by newbies who have not learned your naming conventions yet. Apps Hungarian also makes bad code smell bad (but only if both the reader and writer are intimately familiar with the prefixes being used), but it does not encourage you to do the right thing. One of the rules I use is that dangerous operations should be marked as such, and the equivalent safe operation should be easier to find and/or use.
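For the curious, the writer described above could be sketched roughly like this in Python. This is only one possible shape, not a definitive implementation: the class name HtmlWriter is my own, the method names are kept as in the post rather than converted to Python's usual snake_case, and the encoding is delegated to the standard library's html.escape.

```python
import html
import io


class HtmlWriter:
    """Sketch of the two-method writer: safe by default, unsafe on request."""

    def __init__(self, out):
        self._out = out  # any object with a write(str) method

    def write(self, text):
        # The short, easy-to-find method HTML encodes before writing.
        self._out.write(html.escape(text))

    def writeUnencoded(self, markup):
        # The longer, louder name writes the string with no modifications.
        self._out.write(markup)


buffer = io.StringIO()
htmlOut = HtmlWriter(buffer)
name = "<b>Bob</b>"  # pretend this came from a user

htmlOut.writeUnencoded("<p>Hello, ")  # literal markup: fine
htmlOut.write(name)                   # variable: encoded
htmlOut.writeUnencoded("</p>")

print(buffer.getvalue())
```

The safe call is the shorter one to type, so doing the right thing is also the path of least resistance.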

As for exceptions, Joel is not completely wrong that throws are a bit like GOTOs. But they are still a better way to program. I think most of this ground has been covered, but having to check the result code of every method or function you call does not produce understandable code, and failing to check them in a non-exception environment is a sure way to end up with unstable code. Fortunately, Joel has already lost this argument: most programmers use exceptions and have found that, while they may occasionally cause unexpected behavior, they are a big win with regard to both code understandability and stability.
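To make the trade-off concrete, here is a small Python sketch of the two styles side by side. The port-parsing functions are hypothetical and exist only for this comparison.

```python
def parse_port_with_code(text):
    """Result-code style: returns (ok, value); the caller must check ok."""
    if not text.isdecimal():
        return False, 0
    return True, int(text)


def parse_port(text):
    """Exception style: returns the value or raises ValueError."""
    if not text.isdecimal():
        raise ValueError(f"not a port number: {text!r}")
    return int(text)


# Result-code style: skip the check and the bad value flows onward quietly.
ok, port = parse_port_with_code("80a")
unchecked_port = port  # nothing forces anyone to look at ok

# Exception style: the failure cannot be ignored by accident.
try:
    parse_port("80a")
    failed_loudly = False
except ValueError:
    failed_loudly = True
```

In the first style every call site needs its own check to be correct. In the second, code that does not care about the error simply lets it propagate to someone who does.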

Advertising Gone Too Far

This morning there was a story about movie marketing on Morning Edition. One thing they noted is that a recent sit-com contained a paid reference to the movie ET. A sub-plot was created exclusively to advertise that movie. I already knew that almost every product you see in television shows is paid advertising (I mean “product placement”), but it had never occurred to me that the actual plots of television shows were for sale. Maybe I am just too naive.

What Is A “Unit” Test

I think that one reason unit testing is such a big win is that it integrates design verification into the design (read: coding) process. (See Jack Reeves’ Code As Design.) This connection has made me start to pay a little more attention to the unit tests I see around me, and I have found some odd ideas about what a unit test is versus a functional test. There is quite a lot of room for interpretation here because the name is, probably intentionally, vague. These ideas, which I find odd, have caused me to refine my own idea of what a unit test is.

One of the odd ideas is that a unit test should only test a single class. In some ways this seems reasonable. It is basically the unit test orthodoxy that I originally learned, though perhaps taken to an extreme. However, I think this view of unit tests is not internally consistent. Even simple unit tests test how different classes fit together. When was the last time you mocked String, List or the other basic classes that come with the language you are using? No one does that because it is not worth the effort. But by not doing that you are inherently not testing a single class; you are testing that a set of classes work together correctly.

The other odd idea is that tests that interact with the outside world (the file system, network, etc.) should be called “functional” tests rather than unit tests. I am a firm believer that names are, almost always, important. The impact of calling the tests functional, rather than unit, is that they do not get run nearly as often. This happens because the “functional” tests often require significant setup. So everyone basically avoids running the functional tests that are not directly related to the code on which they are working, even if some portion of them are easy to run.

I dislike both of the above ideas. In a perfect world, I think unit tests would test every possible use of every bit of code in the system and they would be run before every check-in. This is obviously unrealistic, but I think you end up at a better place if you start at what you would really like and then remove only the parts that are not feasible. There are two main obstacles to my perfect-world unit tests. First, they would take a long time for any non-trivial application. Worse yet, some tests are by definition long running, and you do not want to run those tests every time you check in. Second, some of the tests you would like require an environment that is difficult to recreate programmatically, for example if you need a DB in a known state, middleware that only your test is using, etc.

TestNG has a nice solution for long testing times. (There is a lot about TestNG that concerns me but this feature is cool.) You can annotate your tests in such a way that you can exclude tests which take a long time from any particular test run. Those long running tests can still be unit tests, just like all the other unit tests, but they can be excluded when it is important for the testing to complete quickly. In other frameworks you can achieve the same behavior by sequestering your long tests in particular fixtures which are not included in the “normal” test suite, but are run by your automated build system.
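If I recall correctly, TestNG does this with named groups on its test annotation. As a sketch of the same idea in another framework, here is one way to mark slow tests with Python's unittest so they can be left out of a quick run. The RUN_SLOW_TESTS environment variable, the slow marker and the test names are all assumptions made up for this example.

```python
import os
import time
import unittest

# Assumption: RUN_SLOW_TESTS is a hypothetical switch invented for this sketch.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"
slow = unittest.skipUnless(RUN_SLOW, "slow test; set RUN_SLOW_TESTS=1 to run")


class ParserTests(unittest.TestCase):
    def test_quick(self):
        # Always part of the normal run.
        self.assertEqual(int("42"), 42)

    @slow
    def test_long_running(self):
        # Still a unit test, just excluded when speed matters.
        time.sleep(1)  # stands in for genuinely slow work
        self.assertTrue(True)


def run_suite():
    """Run ParserTests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParserTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() on a developer machine skips the slow test, while the automated build can set the variable and run everything.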

The second issue is the one that concerns me the most. This is because running these tests is not just a matter of waiting for them to finish, but rather of knowing how to set up the environment such that they will work at all. Normally these tests are offloaded to a QA or test team. This means that you have moved the verification of the design to a team outside of your control. This strikes me as sub-optimal because it lengthens the time between defect introduction and detection. Unfortunately, I do not have a good solution for this problem. In most environments it is more acceptable to have a QA team run these sorts of tests than to have the developers spend a lot of time setting up and maintaining multiple independent environments for testing purposes. Joel points out that in some situations VMware may be a good solution to this problem.

For me, the difference between a unit test and a functional test has nothing to do with what is tested, but rather when the test is run. If it is primarily run as part of coding it is a unit test. If it is primarily run outside of the coding process it is a functional test. The sooner you can find a flaw in the design the better, so it is better to do as much testing as possible as part of coding. That is why I think all levels of software should be tested in unit tests. There should be fixtures that test individual classes, the interactions of classes in individual packages and the interactions of classes across packages. These fixtures should do whatever is necessary to support the tests, including writing to the file system or network. All of those tests go into the main test suite unless one of the following is demonstrably true.

  • It is not feasible to set up the appropriate environment programmatically
  • The test would take an unreasonable amount of time to execute

Those two criteria are a bit vague, and intentionally so. I think that every project has to decide for itself what feasible and reasonable mean. I can say that for the projects on which I have worked, the execution time of the tests has been completely immaterial, with the exception of load tests.
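As an example of a test that touches the outside world and still belongs in the main suite, here is a hedged Python sketch of a fixture that builds its own environment on the file system. The save_report function and the file name are hypothetical, invented only to have something to test.

```python
import tempfile
import unittest
from pathlib import Path


def save_report(path, lines):
    """Hypothetical code under test: writes one entry per line to a file."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


class SaveReportTests(unittest.TestCase):
    def setUp(self):
        # The environment is created programmatically, so the first
        # criterion above does not apply and the test stays in the suite.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.report = Path(self._tmp.name) / "report.txt"

    def test_writes_one_line_per_entry(self):
        save_report(self.report, ["alpha", "beta"])
        self.assertEqual(
            self.report.read_text(encoding="utf-8"), "alpha\nbeta\n"
        )


def run_file_tests():
    """Run SaveReportTests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SaveReportTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The test writes to a real directory, but it needs no manual setup and runs quickly, so by my definition it is a unit test.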