How REST Can Relieve Your (Lack of) Documentation Guilt

A couple of months ago we hired a contractor to write a reporting interface for our high volume monitoring system. Our system exposes all of its data via RESTful web services, and his job has been to take that data and allow users to create reports based on it.

This morning a couple of my teammates and I asked him if he thought our documentation was sufficient to allow supplementary applications, like the one he was finishing up, to be written without direct access to the developers. He replied to this effect:

To be honest, I did not really look at the documentation. I just fetched the URLs you gave me, and the ones I found in those documents, and so on. It did not take me very long to get a pretty good idea of what kind of data was available and where.

That ability to understand a large system simply by exploring it is one of the most powerful features of RESTful architectures. But you only get it if you follow all the precepts of REST, including that resources are represented by documents with links [1] (or, hypermedia is the engine of application state, if you prefer a more traditional phrasing).

A RESTful architecture will let you scale and distribute your application beyond all reasonable expectations. But even better, since you know that anyone who cares can just go exploring, it will also let you feel less guilty about all that documentation you never quite get around to writing.

  1. Hat tip to Stefan Tilkov for either reporting that Sanjiva Weerawarana used this phrase in his QCon 07 presentation, or for coining it himself (I cannot tell for sure which it was).

When To Use Exceptions

Marty Alchin recently posted about the “evils” of returning None (or nil or null, depending on your language of choice). I think he has it basically right. Sure, there are situations where returning nil [1] is appropriate, but they are pretty rare. For example, if a method actually does what the client asked and there is nothing meaningful to return, then by all means return nil. If, however, the method was not actually able to do what the client asked, raising an exception is the appropriate thing to do. Unfortunately, I think that most methods that return nil in modern libraries and programs actually do so as a way to indicate a failure condition, and that is evil.

Cédric Beust responded to Mr Alchin saying, basically, a) “problems caused by returning null are easy to debug” and b) “programmers are all-knowing, about now and the future, so they can decide when to return nil and when to raise an exception.” (I am, as you might have guessed, taking significant liberties in my paraphrasing. You should go read Mr Beust’s post if you want to know what he actually said.)

Ease of debugging

On the debugging point I would say that Mr Beust is generally correct. It is usually the case that nil returns are fairly easy to debug. However, this is not always true. In fact, it is not uncommon, in my experience, to find a nil where it is not supposed to be and then to spend a fair bit of time tracking down where that value became nil. That time is usually spent walking up the call stack trying to find the subtle bug that results in a nil return from a method in only some odd situations.

That sort of debugging is not the end of the world, but it is annoying, particularly because it is so easily avoidable.

All-knowing programmers

To be fair, Mr Beust did not actually say that he believes programmers are all-knowing. But he did describe a way of using exceptions that would only make sense if they were. From that, one might infer that he does believe programmers are in fact omniscient. In my experience this misconception is fairly common in the Java community (actually, this mentality exists to varying degrees in most programming communities). Java itself includes many decisions that seem to make sense only in the presence of this assumption. But I digress.

The statement to which I am referring is:

Here is a scoop: exceptions should only be thrown for exceptional situations. Not finding a configuration value is not exceptional. Not finding a word in a document is not exceptional: it can happen, it’s even expected to happen, and it’s perfectly okay if it does.

My response to this is: who exactly are you to decide that not finding a configuration value is a non-exceptional event for my application? Library programmers [2] should not be deciding what is and is not an “exceptional” event. That is a value judgement they cannot possibly make correctly unless they understand every way that every program will make use of that library (or class) in perpetuity.

Exceptions are not about “exceptional situations”, whatever that means. They are a way for a method to tell its caller that it was not able to do what the caller asked. If I say, “dictionary, give me the value associated with this key” and the key does not exist, the appropriate response is a NoSuchKey exception. Returning nil in that situation is a lie. A lie with real, and negative, consequences. Consider this: I ask a dictionary for the value associated with a key and it returns nil. Does that mean the key does not exist, or that the key does exist and its value is nil? Those two cases are very different, but a nil-returning method conflates them, requiring, at the very least, an additional method call to figure out which is actually the case.
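
Ruby’s Hash makes the ambiguity easy to see (a small sketch; note that #fetch raises IndexError in Ruby 1.8 and KeyError in later versions):

    h = { :a => nil }

    h[:a]            # => nil
    h[:b]            # => nil, indistinguishable from the line above
    h.has_key?(:b)   # => false; the extra call needed to tell them apart
    h.fetch(:b)      # raises an exception; no ambiguity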

If having methods actually inform you of their inability to perform the desired action is complicated or hard to understand, it is time to upgrade your language or write a better interface. For example, if catching an exception from a dictionary lookup is complicated in the cases where you just want a default value, that exception catching and default value behavior could easily be put in a get_with_default(key, default_value) method that returns either the value from the dictionary or the default value. That would certainly be clearer than returning nil and having every consumer add an ugly if block after the get. Or you could switch to a language (such as Ruby) with a compact single-line exception handling syntax.
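
In fact, Ruby’s Hash already provides both remedies (a quick sketch):

    h = { :timeout => 30 }

    # fetch with a default is get_with_default under another name
    h.fetch(:retries, 5)          # => 5

    # and the compact single-line exception handling mentioned above
    h.fetch(:retries) rescue 5    # => 5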

Either way, my advice is: do not use nil as a way to indicate that the object was unable to perform the requested operation; that is the job of exceptions. If you see a method that returns nil, demand an affirmative defense of that behavior, because it is often incorrect.

  1. nil is the Ruby equivalent of None in Python and null in Java. Since all other programming languages are but pale shadows in comparison to Ruby, I shall henceforth be using nil to describe this concept.

  2. By library programmer I mean someone who is writing code that will be used by someone else at some point in the future. If that does not include you, it is because a) you are not a programmer or b) you are writing completely unmaintainable spaghetti code.

ActiveRecord race conditions

Ara Howard has discovered that the ActiveRecord validation mechanism does not ensure data integrity [1]. Validations feel a bit like database constraints, but it turns out they are really only useful for producing human-friendly error messages.

This is because the assertions they define are tested by reading from the database before the changes are written to it. As you will no doubt recall, phantom reads are not prevented by any isolation mode other than serializable. So unless you are running your database in serializable isolation mode (and you aren’t, because nobody does), using ActiveRecord validations sets up a classic race condition.
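
The canonical example is validates_uniqueness_of (a sketch of the failure mode, not of any particular application):

    class User < ActiveRecord::Base
      # Implemented as a SELECT before the INSERT. Two concurrent saves
      # can both see that no matching row exists and then both insert,
      # leaving duplicate emails in the table.
      validates_uniqueness_of :email
    end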

On my work project we found this out the hard way. The appearance of multiple records that should have been blocked by the validations was a bit surprising. In our case the impacted models happened to be immutable, so we only had to solve this for the #find_or_create case. We ended up reimplementing #find_or_create so that it does the following (a sketch in code follows the list):

  1. do a find
    1. if we found a matching record, return the model object
    2. if it does not exist, create a savepoint
  2. insert the record
    1. if the insert succeeds, return the new model object
    2. if the insert fails, roll back to the savepoint
  3. re-run the find and return the result
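
Here is a minimal sketch of that reimplementation, assuming a unique index on the relevant column and a database that supports savepoints (the names are illustrative, not our actual code):

    class Role < ActiveRecord::Base
      def self.find_or_create_by_name(name)
        found = find_by_name(name)
        return found if found

        connection.execute("SAVEPOINT find_or_create")
        begin
          create!(:name => name)
        rescue ActiveRecord::StatementInvalid
          # A concurrent process won the race and the unique index
          # rejected our insert. Roll back to the savepoint so any
          # enclosing transaction remains usable, then re-run the find.
          connection.execute("ROLLBACK TO SAVEPOINT find_or_create")
          find_by_name(name)
        end
      end
    end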

This approach does require the use of database constraints, but having your data integrity constraints separated from the data model definition has always felt a bit awkward to me anyway. So I think this is more of a feature than a bug.

It would be really nice if this behavior were included by default in ActiveRecord. A similar approach could be used to remove the race conditions in regular creates and saves by simply detecting insert or update failures and re-executing the validations. This would not even require that the validations/constraints be duplicated. The validations could, in most cases, be generated mechanically from the database constraints; DrySQL already does this, for example.

Such an approach would provide the pretty error messages Rails users expect, neatly combined with the data integrity guarantees that users of modern databases expect.

  1. You simply must love any sample code that has a method Fubar.hork_the_db.

HTTP Authentication with shared identities

Authentication has been the bane of my existence lately. By which I mean, it is complicated and interesting and I am loving every minute of it (but, as you can see, I am not going to let that stop me from complaining about it). Tonight, however, I have run into an authentication problem that I am not sure how to solve. I am hoping someone out there can point me toward a solution.

So, Dear Lazyweb, here is my question: is there a mechanism available that allows an HTTP client to have a single identity for several applications and to be able to authenticate itself to each of those applications in such a way that even a malicious application would be unable to impersonate the actor to the other applications in the system?

Oh yeah, and it would be really nice if this were already implemented for libcurl and in Ruby.

Some background

I have a system composed of a set of applications which communicate with one another using RESTful web services. This system supports the addition of arbitrary new applications to the system. However, some of these applications may be written by (relatively) untrusted parties.

All actors, both end users and components of the system, have a single system-wide identity. This identity is managed by the one trusted component in the system. This component is responsible for, amongst other things, authentication of actors.

We settled on OpenID as the mechanism for end user authentication. Other than having one of the worst specs I have ever read, OpenID is really nice. It solves this problem by forwarding the user’s browser to the identity provider (our trusted component), which verifies the user’s identity claim. The application that requested the authentication is then notified of the success or failure of the authentication process. This approach has the advantage that the user’s password, even in an encrypted form, never passes through the untrusted components of the system.

Unfortunately, end user authentication is only a subset of the authentication required for this system. There are many automated actors that also make use of the resources exposed by components in the system. These actors need to be authenticated too, but OpenID is rather unsatisfactory for this purpose. So another solution for delegated authentication is required.

My initial thought was to use MD5-sess based HTTP Digest auth. The spec explicitly mentions that it could be used to implement authentication using a third party identity provider. Upon further study, however, it only works if the application doing the authentication is trusted. This is because, to verify the requester’s identity, the application must have the hash of the user’s account, realm, and password. With that bit of information it would be quite easy for the application to impersonate the original requester. In my environment of limited trust that is unacceptable.
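
To see why, consider how the MD5-sess session key is derived (a sketch of the RFC 2617 computation, with illustrative values):

    require 'digest/md5'

    server_nonce = 'nonce-from-server'
    client_nonce = 'cnonce-from-client'

    # The verifier never needs the password itself, only
    # MD5(username:realm:password). Anything holding that hash can
    # compute valid responses for any nonce, i.e. impersonate the user.
    password_hash = Digest::MD5.hexdigest('alice:example-realm:secret')
    session_key   = Digest::MD5.hexdigest(
      "#{password_hash}:#{server_nonce}:#{client_nonce}")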

Another potential, if naive, option is to use HTTP Digest auth but pass the authentication credentials through to the identity provider. The identity provider could then respond with an indication of whether the requester proved that they knew the password. Unfortunately, the additional load placed on the identity provider by having to verify the requester’s identity for every single request handled by any part of the system is just too great. Not to mention the additional lag this would impose on response times.

Now, the astute reader will by now be fairly yelling something about how this problem was solved by Kerberos years ago. Not only is this true, but theoretically the Negotiate HTTP auth scheme supports Kerberos based authentication. However, I have yet to find any Ruby libraries that support that scheme. Tomorrow I will probably dive into the RFC to determine if I can implement support myself. If you know of a library that implements this scheme, please let me know.

I have also looked at OpenID HTTP authentication. It looks a bit simpler than the Kerberos based Negotiate auth scheme, but it seems a bit undercooked for a production system. On the other hand, it does have potential. If there are no other options it might be workable. It would be pretty easy to implement on the Ruby side of the house, particularly since I have spent the last couple of days coming to terms with OpenID, but on the C++ side it might be a bit more of a problem.

Anyway, it is late now so I am going to go to sleep and await your answers.

Hierarchical Resources in Rails

Consider a situation where you have a type of resource which always belongs to a resource of another type. How do you model the URI space using Rails? For example, say you have an address resource type. An address is always associated with exactly one user, but a user may have several addresses (work, home, etc).

The simple approach

The simplest approach from a Rails implementation perspective is to just have a flat URI space. In this scenario the URIs for the collection of addresses associated with a user and for a particular address would be, respectively:

    /addresses?user_id={user_id}
    /addresses/{address_id}

From a REST/web arch standpoint there is absolutely no problem with these URIs. They are a bit ugly for the humans around, though. Worse yet, one might reasonably infer from them that /addresses references the collection of all the addresses known to the system. While that might be nice from an information modeling point of view, in reality that collection is probably going to be too large to return as a single document. To be fair, it would be perfectly legal to respond to /addresses with a 404 or 403, but it would be a bit surprising to get that result if you were exploring the system.

The fully hierarchical approach

Edge Rails contains some improvements to the resource oriented route generators. One of the changes adds support for sub-resources. Sub-resources are supported via the :has_many and :has_one options to ActionController::Routing::Map#resources. These options produce fully hierarchical URIs for the resources. For example:

    /users/{user_id}/addresses
    /users/{user_id}/addresses/{address_id}
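
In routes.rb that looks something like this (a sketch using the edge Rails syntax of the time):

    ActionController::Routing::Routes.draw do |map|
      # generates /users/{user_id}/addresses and
      # /users/{user_id}/addresses/{address_id}, among others
      map.resources :users, :has_many => :addresses
    end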

The first URI references the collection of all the addresses for the specified user. The second URI references a particular address that belongs to the specified user. These URIs are very pretty, but they add some complexity to the controllers that fulfill them.

The additional complexity stems from the fact that address_id is unique among all addresses (in most cases it would be an automatically generated surrogate key). This leads to the potential for the address_id to be valid but for the address it identifies to not belong to the user identified by user_id. In such cases the most responsible thing to do is to return a 404, but doing so takes a couple of extra lines in each of the actions that deal with individual addresses.
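
For example, each such action ends up looking something like this (a sketch, with the error handling kept deliberately minimal):

    def show
      user = User.find(params[:user_id])
      # Scoping the lookup through the association means a valid
      # address id that belongs to some other user is treated as
      # not found, rather than leaking another user's address.
      @address = user.addresses.find(params[:id])
    rescue ActiveRecord::RecordNotFound
      render :text => 'Not Found', :status => 404
    end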

The semi-hierarchical approach

After trying both of the previous approaches and finding them not entirely satisfactory, I have started using a hybrid approach. The collection resources are defined below the resources to which they belong, but the collection member resources are referenced without an intermediate. For example:

    /users/{user_id}/addresses
    /addresses/{address_id}

This has the advantage of producing fairly attractive URIs across the board. It also provides an obvious location to add a collection resource containing all the child resources, should you ever need to look at all of them without the parent resource being involved. And it does not require any extraneous code in the controllers to deal with the possibility of the specified parent and child resources being unrelated.

On the downside, it does require some changes to the routing system to make defining such routes simple and maintainable. Also, it might be a bit surprising if you are exploring the system. For example, if you request /addresses you will get a 404, which is probably not what you would expect.

Even with the disadvantages mentioned above, I am quite pleased with how the URIs and controllers turn out using this technique. If you are looking for a way to deal with hierarchical resources, you should give it a try.

JSON Schema Definition Languages

We recently settled on using JSON as the preferred format for the REST-based distributed application on which I am working. We don’t need the expressiveness of XML and JSON is a lot cheaper to generate and parse, particularly in Ruby. Now we are busy defining dialects to encode the data we have, which is happy work. The only problem is there is not a widely accepted schema language for describing JSON documents.

I am not entirely sure a schema language for JSON is necessary in any strict sense. I think that validating documents against a schema is overrated. And, people do seem to be getting along just fine with examples and prose descriptions. Still, the formalist in me is screaming out for a concise way to describe the JSON documents we accept and emit.

I have a weakness for formal grammars. I often write ABNF grammars describing the inputs and outputs of my programs, even in situations where most people would just use a couple of examples. I learned XML Schema very early in its life and have had a love/hate relationship with it for years. Even though it is ugly and complicated, I continued to use it because it let me write formal descriptions of the XML variants I created [1].

There are a couple of relatively obscure schema languages for JSON. Unfortunately, I don’t find either of them satisfactory.

The Cerny schema validator seems quite functional and, while it is not intended as a JSON document schema language, it could be used as one [2]. Unfortunately, CERNY.schema requires a complete Javascript interpreter and run-time to perform validation. This requirement stems from the fact that CERNY.schema allows parts of the schema to be defined as procedural Javascript code. That approach is completely unacceptable for a language independent data transfer language like JSON.

Such procedural checking is misguided, even beyond the practical problem of requiring a Javascript run-time. Procedural validation code is a powerful technique for validating documents, but it greatly reduces the usefulness of schema documents as an informational tool. Schema languages should be designed to communicate the structure of documents to humans, and only incidentally to validator programs.

Another JSON schema language I have tried is Kwalify. Kwalify seems reasonably capable too, but it has some warts that really bother me. My main issue with Kwalify is that it is super verbose, primarily because Kwalify schema documents are written in YAML (or JSON). Schema definitions can be encoded in a generic hierarchical data language, but that does not make it a good idea. I find both XSD and the XML variant of RelaxNG excessively noisy, and Kwalify shows a similarly poor signal-to-noise ratio. Schema language designers should look to RelaxNG’s compact syntax for inspiration and forget trying to encode the schema in the language being described.
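
To give a flavor of the verbosity, a Kwalify schema for a trivial list of people looks roughly like this (a sketch from memory, so treat the details as approximate):

    type: seq
    sequence:
      - type: map
        mapping:
          name:  { type: str, required: yes }
          email: { type: str }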

I think JSON could benefit from a schema language that is completely declarative and has a compact and readable syntax. If anyone is working on such a thing I would love to know about it. I could roll my own but I would really rather not unless it is absolutely necessary.

  1. Later I learned of RelaxNG. Now I am able to have my cake and eat it too. RelaxNG is a much simpler and more elegant way to describe XML documents. And if you really need XML Schema for some part of your tool chain, you can mechanically convert the RelaxNG into XML Schema.

  2. Updated to clarify that CERNY.schema is not intended as a JSON validator. I originally made the incorrect logical leap that it was. It is an easy step from “validator for JavaScript objects” to JSON validator, because JSON documents are just serialized JavaScript objects. However, the author of CERNY.schema informed me that JSON validation is not its intended use.

REST vs WS-* War

David Chappell declares the REST vs WS-* war over

To anybody who’s paying attention and who’s not a hopeless partisan, the war between REST and WS-* is over. The war ended in a truce rather than crushing victory for one side–it’s Korea, not World War II. The now-obvious truth is that both technologies have value, and both will be used going forward.

In this conflict I am, undeniably, a REST partisan. I know this colors my perceptions, but it is not obvious to me that the war is over. It has become obvious that WS-* will not prevail, but that does not mean the fighting has stopped. Those who have invested a great deal of time and thought in WS-* may hope that it remains a viable technology, at least for certain problems, but that does not mean it will. I think the situations for which WS-* is the best available technology are vanishingly rare, perhaps even nonexistent. As Mark Baker puts it in his response to Mr Chappell:

Perhaps David – or anybody else – could point me towards a data oriented application which can’t fit (well) into such a model (not REST, just the uniform interface part).

I expect that when all is said and done WS-* will still be around. But rather than as a vibrant technology platform, the way Mr Chappell seems to anticipate, I think it will survive in a way far more like the way Cobol is still around today: as a zombie, unkillable and ready to eat the brains of anyone who wanders too close to the legacy systems.

Test Doubles

Mr Whitney recently posted an article in which he described mock objects as “bug aggregators”. I once held a similar point of view. Back then my belief was that test doubles (mocks, stubs, etc.) should only be used when real objects would not work, either because they were too hard to set up or because they were too slow. Recently, however, my thinking about mocks has become a bit more nuanced [1].

A couple of months ago my team switched from Test::Unit to RSpec [2] for our system validation needs. As part of that switch I read more deeply into the Behavior Driven Development (BDD) movement, which is where tools such as RSpec originate. BDD is, in many ways, just a collection of developer testing best practices combined with much better terminology and some wicked tools (like RSpec). While that may not sound like much, it is the biggest improvement in the testing ecosystem since the emergence of systematic automated unit testing.

One of the things BDD promotes is the heavy use of test doubles to isolate the tested class from the rest of the system. BDD takes this to such an extreme that Martin Fowler refers to practitioners of BDD as “mockists”. So when we switched to RSpec I decided to set my previously held notions aside and give full-fledged BDD a try [3].

I have been mocking and stubbing my heart out for a couple of months. During that time I have reached the conclusion that test doubles can be very useful as design tools. At the same time they can, as Mr Whitney points out, hide some really nasty bugs.

What are tests for, anyway?

The tension I feel around test doubles comes from the fact that automated unit tests, or executable specifications, serve multiple purposes, and those purposes are almost never aligned. Improving a test for one purpose will often reduce its usefulness for the others. At a bare minimum, tests serve these purposes:

* tests verify that the code does what you intended
* tests provide a set of examples for how to use a class
* tests allow for comfortable evolution of the system by protecting against incompatible changes in the future
* tests act as a proving ground for the design

While it is the least obvious, experienced practitioners of TDD/BDD often cite that last purpose as one of the most important. Tests have an amazing way of surfacing poor design choices, and test doubles magnify this. You quickly notice interfaces that are excessively complex when you have to mock and stub all the method calls involved. Hard-to-mock interfaces are a design smell.

Extensive use of test doubles radically enhances the quality of tests as a tool for validating the design. At the same time, it degrades the usefulness of the tests for protecting you from future incompatible changes. I don’t think test doubles aggregate bugs, but they are great at hiding them. If you use test doubles extensively you will, quite regularly, end up with two classes that do not work correctly with one another in the real system, but whose tests pass. On the other hand, those classes will be better designed than if you were just using real objects.

In some languages (Java springs to mind) heavy use of test doubles might also degrade the quality of the tests as examples. If you are using a mocking system that requires a lot of code to mock an object, it might tip the balance against mocking except in extreme cases. However, in Ruby this is not really an issue. The mock/stub library that comes with RSpec has such a clean and pretty syntax that it takes almost nothing away from the readability of the tests.
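
For instance, a double and its expectations read almost like a prose description of the behavior (a sketch; Order and the warehouse interface are hypothetical):

    describe Order do
      it "reserves stock with the warehouse when placed" do
        warehouse = mock("warehouse")
        warehouse.should_receive(:reserve).with(:widget, 2).and_return(true)
        Order.new(warehouse).place(:widget, 2)
      end
    end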

To counteract the degradation of tests as a barrier against inadvertently breaking the code in the future, mockists usually suggest adding a higher level of tests that are more about testing the system as a whole than the individual parts, i.e. acceptance testing. These sorts of tests are usually a good idea in their own right, but they also free you up to make more aggressive use of test doubles to improve the design of your code.

It’s all about tradeoffs

As with most engineering decisions, how much to use test doubles boils down to a set of trade-offs. You have to decide which of the uses of tests are most important for your project and then try to optimize along those dimensions. My current feeling is that the design benefits of using test doubles usually outweigh the costs. Only you and your team can decide if that is true for your project, but you will never know until you give them a real try.

  1. In this case, as in most, “more nuanced” is just a euphemism for “I am not sure what the best course of action is”.

  2. If you are doing Ruby development you should be using RSpec. It is what Test::Unit, and all the other xUnit frameworks, should have been.

  3. For me, the best way to find the limits of something new is just to use it as aggressively as possible until I start finding situations where it does not work. Once it starts breaking down in real life you usually end up with a really good feel for the limits.

Hypermedia as the Engine of Application State

One of the least well understood core tenets of the REST architectural style is that “hypermedia is the engine of application state”. This basically means that responses from the server will be documents that include URIs for everything you can do next. For example, if you GET a blog post, the response document will have URIs embedded in it that allow you to create a comment, edit the post, and take any other action you might want. This allows you to think of your application as a state machine, with every page representing a state and links representing every possible transition from the current state.

With this approach, the only things a client needs to know to correctly access your application are a) a well-known starting point URI and b) how to parse one of the document formats (representations) your application supports. For human-facing web applications this approach is the one that is always used. Browsers understand how to parse HTML and extract the links to the next action, and the user provides the starting point URIs. This has become so ingrained in the industry’s culture that most people never really even think about it in these explicit terms.

However, I am now working on a system in which there are many independent automated processes which interact with each other. It was not immediately obvious to me, or my colleagues, that we should be following the same pattern even when there are no humans involved. After all, a program can easily remember how to construct a URI from a couple of pieces of information that it knows. In fact, URI construction is almost always easier to implement than the REST approach of requesting a well-known resource, parsing the returned representation, and extracting the URI you want [1].
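
The difference in mindset is easy to see in code (a sketch; the entry point document and its users_uri field are hypothetical):

    require 'net/http'
    require 'uri'
    require 'json'

    user_id = 42

    # URI construction: the client bakes the server's URI scheme into
    # its own code, coupling itself to that scheme forever.
    addresses_uri = "http://example.com/users/#{user_id}/addresses"

    # Hypermedia: the client knows only a well-known starting point and
    # the document format, and extracts each next URI as it goes.
    entry     = JSON.parse(Net::HTTP.get(URI.parse('http://example.com/')))
    users_uri = entry['users_uri']
    # ...fetch the users document, find the user, follow its addresses
    # link, and so on.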

After pondering this for a while I reached the, rather unsurprising, conclusion that Roy Fielding is correct and that hypermedia should be at the core of our distributed application. My epiphany came when I decided that I really wanted to change the shape of a URI. I realized that if the system were only slightly bigger (and it will get there soon), there would be a strong probability that I would not know all the places that accessed the resources whose URIs I wanted to change. Therefore, I would not be able to change the shape of those URIs.

A URI that is constructed by a client constitutes a permanent, potentially huge, commitment by the server. Any resource that may be addressed by the constructed URIs must forever live on that particular server (or set of servers), and the URI patterns must be supported forever [2]. Effectively, you are trading a small one-time development cost on the client side for an ongoing, and ever increasing, maintenance cost on the server side. Stated like that, it becomes obvious that URI construction introduces an almost absurd level of coupling between the client and server.

With a truly REST based architecture you are free to change just about anything about the disposition and naming of resources in the system, except for a few well-known starting point URIs. You can change the URI shapes (for example, if you decide that you really hate the current scheme). You can relocate resources to different servers (for example, if you need to partition your data). Best of all, you can do these things without asking anyone’s permission, because clients will not even notice the difference.

Using hypermedia as the engine of application state has the effect of preserving reversibility for a huge number of decisions that are irreversible in most other architectural styles. As section 9 of “The Pragmatic Programmer” points out, reversibility is a goal any design should strive for. No one knows for sure what will be needed in the future, so having the ability to easily change your mind is invaluable. Particularly when it can be had so cheaply.

  1. It can get even more off-putting. Just imagine a situation where you would need to request several intermediate resources before you get to the URI of the resource for which you are looking.

  2. You could set up redirects/rewrites or proxy the requests if you really needed to move a resource, but for high volume URIs having to redirect or proxy would probably eat a significant part of the benefits you would get from moving the resource. Worse, though, those remedies increase the maintenance requirements significantly, because then you have to administer both the real resources and the redirection/rewriting or proxying rules.

Rake is Sweet

Rake is a really excellent build tool. It is basically Make on steroids (and minus a few of the annoying inconveniences of make). If you build software of any sort you owe it to yourself to check out Rake.

The source of my Rake related euphoria today is that I just used a feature of Rake that is not available in any other build tool I know of. Namely, I added an action to an existing task [1]. This feature allows you to extend the behavior of a task that you do not directly own, say, for example, one defined by the framework you are using.

My particular situation was this. I have some data that is absolutely required for the application to function (permissions data in this case). Changes to this data don’t happen at run-time, and the code explicitly references these records. This means that while the information is stored in the database, it is more akin to code and the data model than to the data managed by the application.

Given that this data is referenced explicitly by the code, it must reside in source control. Rails migrations are an excellent way to manage changes to the data model of an application and, as it turns out, to the foundation data too. If you need to add or change some of this foundation data you can just write a migration to add, update, or delete the appropriate records.

There is one slight issue with using migrations to manage foundation data, though. Only the structure of the development database gets automatically copied to the test database, so the code that requires the foundation data will fail its tests because that data does not exist. I have run into this problem before. That time I solved it by changing the way Rails creates the test database so that it used the migrations rather than copying the development database’s structure. It is a very nice approach, but unfortunately it does not work for my current project.

To solve my problem this time I simply added an action to the db:test:prepare task to copy the data from the roles table. The standard db:test:prepare task provided by Rails dumps the development database’s structure and then creates a clean test database using that dump. For our project it still does that, but then it follows up by dumping the data from the roles table and loading it into the test database as well.
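
Rake makes this trivial because defining a task with an existing name appends actions to it rather than replacing it. A minimal sketch (the actual dump-and-load step is elided):

    # lib/tasks/foundation_data.rake (a hypothetical file name)
    namespace :db do
      namespace :test do
        # Rails has already defined db:test:prepare; this block is
        # appended to it and runs after the structure has been copied.
        task :prepare do
          # dump the roles table from the development database and
          # load it into the test database here
        end
      end
    end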

Extending the db:test:prepare task means that all the tasks get an appropriate test database when they need it, without me having to go around adding a dependency to each of them. I love it when my tools let me solve my problems easily.

  1. A Rake task is equivalent to a target in Make or Ant.