Unobtrusive link info

Mr Amundsen’s recent post regarding the design of “semantic machine media types” got me thinking about media type design. One of the commonly encouraged practices, particularly on the REST discuss group, is the use of link elements.

I really dislike this idea. It sets my teeth on edge because it treats links – which are possibly the most important bits of data in existence – as second-class citizens. It is easiest to show what I mean with a bit of extrapolation:

<complexElement rel="entry">
  <string rel="id">234132</string>
  <string rel="displayName">Peter Williams</string>
  <complexElement rel="name">
    <string rel="familyName">Williams</string>
    <string rel="givenName">Peter</string>
  </complexElement>
  <complexElement rel="emails">
    <link rel="email" href="mailto:pezra@barelyenough.org"/>
    <string rel="type">personal</string>
  </complexElement>
</complexElement>

That is what a portable contact might look like if we treated all data the way link elements work. That example looks pretty ugly to me, as I suspect it does to most people. It is ugly because very important information about the role of each element is relegated to a subsidiary position in favor of fairly unimportant information about its type. Yet link elements do to links exactly what my example does to all the data. The effect is that properties whose values happen to be independently addressable resources are obfuscated.

The revealed preference of the world is against link elements. Just look at pretty much any format that embeds application-specific semantics. As far as I know, there is not a single widely used format that actually represents its links as link elements. Even Atom uses properly named elements for most of its links. The link element it defines is largely relegated to the backwater of extensibility.

One benefit that link elements have, or at least could have if they were more widely used, is the facilitation of standard link processing tools. Fortunately, we do not have to give up the expressiveness and clarity of intention-revealing names to achieve this result. Rather than obscuring the links, we could just treat them as normal data. The additional information needed to support standard tools could be added in a relatively unobtrusive way. Consider the following:

<entry xmlns:link="http://unobtrusive-generic-linking.org/">
  ...
  <emails>
    <value link:hrefDisposition="elementContent" 
           link:rel="foo">
      mailto:pezra@barelyenough.org</value>
    <type>personal</type>
  </emails>
</entry>

This idea is similar to XLink but more flexible and simpler to use.
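To make the "standard link processing tools" claim concrete, here is a minimal sketch of a generic link extractor for XML tagged this way. It assumes the namespace URI from the example above and handles only the "elementContent" disposition, since that is the only value the example shows; the function name extract_links is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example; nothing else about it is standardized.
LINK_NS = "http://unobtrusive-generic-linking.org/"


def extract_links(xml_text):
    """Return (rel, href) pairs for every element carrying link info."""
    links = []
    for elem in ET.fromstring(xml_text).iter():
        rel = elem.get(f"{{{LINK_NS}}}rel")
        disposition = elem.get(f"{{{LINK_NS}}}hrefDisposition")
        if rel is None or disposition is None:
            continue  # ordinary data, not a link
        if disposition == "elementContent":
            # The URI is the element's text; trim layout whitespace.
            links.append((rel, (elem.text or "").strip()))
        # Other dispositions are not defined in the example, so skip them.
    return links


doc = """
<entry xmlns:link="http://unobtrusive-generic-linking.org/">
  <emails>
    <value link:hrefDisposition="elementContent" link:rel="foo">
      mailto:pezra@barelyenough.org</value>
    <type>personal</type>
  </emails>
</entry>
"""

print(extract_links(doc))
```

The extractor knows nothing about contacts or emails. It only looks for the two link attributes, which is the point: the format keeps its own element names and the tool still finds the links.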

You could extend the idea to JSON with relative ease. Consider the following expansion of the Portable Contacts JSON format, tagged with some unobtrusive link info.

{"entry": [
  {"id": "42",
  "emails":
    [{"address"   : "mailto:pezra@barelyenough.org",
      "type"      : "personal",
      "_linkInfo" : {"hrefDisposition" : "address", 
                     "rel" : "foo"}}]}]}

Unobtrusive link info makes links visible to and usable by generic link processing tools while protecting the use of intention-revealing names that format designers, and users, want. This is important because it allows new formats to reuse “standard” link semantics more easily and uniformly.
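A generic tool for the JSON variant could be equally small. The sketch below walks any JSON structure and, wherever it finds a "_linkInfo" member, reads the URI from the sibling property that "hrefDisposition" names. The function name extract_json_links is hypothetical, and the interpretation of "hrefDisposition" as a sibling property name is inferred from the example above.

```python
import json


def extract_json_links(node):
    """Recursively collect (rel, href) pairs from objects tagged with _linkInfo."""
    links = []
    if isinstance(node, dict):
        info = node.get("_linkInfo")
        if isinstance(info, dict):
            # hrefDisposition names the sibling property that holds the URI.
            key = info.get("hrefDisposition")
            if key in node:
                links.append((info.get("rel"), node[key]))
        for name, value in node.items():
            if name != "_linkInfo":
                links.extend(extract_json_links(value))
    elif isinstance(node, list):
        for item in node:
            links.extend(extract_json_links(item))
    return links


contact = json.loads("""
{"entry": [
  {"id": "42",
   "emails": [{"address": "mailto:pezra@barelyenough.org",
               "type": "personal",
               "_linkInfo": {"hrefDisposition": "address", "rel": "foo"}}]}]}
""")

print(extract_json_links(contact))
```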

Novelty value

Babies eating in front of the TV

Eating and watching television at the same time is seldom done at my house. It must be fun, though, because Audrey lets her stuffed animals do it.

In defense of link storage

It seems that more and more people are beginning to grasp the hypermedia constraint of REST. This is an unmitigated Good Thing. However, once you get hypermedia, the idea of a client persisting links that it has found starts to seem a little odd. For example, Kirk Wylie describes clients that store links as “not well behaved” in his excellent presentation on REST in financial systems. Even on the rest-discuss mailing list there is no consensus on the matter.

The idea of an application as a set of states (read: representations) with transitions (read: links) to other states seems to go against the idea of storing links. Transitions from one application state to another are surely transient. Any change in the application state, either by this client or some completely unrelated client, could easily invalidate those transitions. In that context, a client that stored links for later use would surely be doomed to dereferencing dead links for the rest of its days.

Further, the idea that clients might store links is a frightening specter for maintainers of services. If clients store links, and you prefer not to break those clients, you must continue supporting any links you have ever included in any representation in perpetuity. Talk about limiting your design freedom. Such a strict requirement would surely raise the cost of maintaining the service over time.

Reality sets in

Those are scary thoughts. Some of these issues are even real. But in the end it doesn’t matter. Almost all non-trivial systems are going to require that URIs be stored in places other than the origin server. Sometimes these stored URIs will merely be caches. Other times they will be data that cannot be recalculated mechanically.

For example, say you have an order-taking system and an inventory system. When placing an order the user goes to the web site, searches for “coffee”, selects the third item in the results and places an order for 1 of that item. An order is a set of line items, each of which references a product. Once payment is received, the order system is going to need to be able to tell the shipping department which items from inventory to send to the customer.

The inventory system has, of course, a URI for every type of product that is for sale. So the simplest and most effective way for an order to reference a product is to use the inventory URI for that product. URIs were originally called universal resource identifiers for a reason; we might as well use them as such.

In this example, we have a situation where the product references in the order are not merely caches of URIs. Many things may change the ordering of search results – a new product being added, an old one being discontinued, even a small change to a description of some product. So at any moment the third item in the search results for “coffee” might be different. Once the user has made their selection, no automaton can reliably retrace those steps.

The implications of this are significant. The inventory must continue to support the product URIs used in orders until such time as the order system would never care to dereference those URIs again. If a month from now the user comes back and wants to see their order history, those product URIs had better still work.

Fortunately, HTTP provides us with a ready solution. Behold the awesomeness that is HTTP redirection. HTTP redirection is your best friend when it comes to gracefully changing REST/HTTP applications. Clients get what they need – URIs continue to work as identifiers indefinitely – and servers get what they need – a lot of freedom to change the names and dispositions of resources.

We are still faced with this issue of the transient nature of links. Certainly, many links encode transitions which may be transient. The client has no general way of distinguishing between links which represent transiently available state transitions, and those that represent more permanent transitions.

In our example, immediately after an order is created, its representation probably provides some links to pay for the order. After the user has provided payment, those transitions would no longer be valid. However, the link to the inventory product is a more permanent part of the order resource.

The only tractable way I see to deal with this issue is to document the lifespan of the various links found in a representation. Once client implementers understand the semantics of the links, they will often be able to infer the likely lifespan of the links without further input. However, guidance can be provided in situations where precision is required or the lifespan is ambiguous. A transient link is, by definition, an optional part of the representation, so documenting the conditions that cause it to be present is likely to be required anyway.

Best practices: Server

REST/HTTP application developers should assume that clients will store links and dereference them after indeterminate periods of time. When resources are relocated or renamed, requests to the resource’s obsolete URI should be redirected to the canonical URI using a 301 Moved Permanently response.
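One way to picture this on the server side is a table of retired URIs consulted before normal dispatch. The following is only a sketch, not tied to any framework: the paths, the MOVED and RESOURCES tables, and the respond function are all made up for illustration.

```python
# Hypothetical record of every rename: obsolete path -> the path that replaced it.
MOVED = {
    "/products/1234": "/catalog/coffee/1234",
    "/catalog/coffee/1234": "/catalog/beverages/coffee/1234",
}

# Hypothetical live resources, keyed by canonical path.
RESOURCES = {"/catalog/beverages/coffee/1234": "Dark roast, 1 lb"}


def canonical(path):
    """Follow the rename history so one redirect lands on the current URI."""
    seen = set()
    while path in MOVED and path not in seen:
        seen.add(path)  # guard against an accidental rename cycle
        path = MOVED[path]
    return path


def respond(path):
    """Return (status, headers, body) for a GET of path."""
    if path in MOVED:
        return 301, {"Location": canonical(path)}, ""
    if path in RESOURCES:
        return 200, {}, RESOURCES[path]
    return 404, {}, ""


print(respond("/products/1234"))
```

Keeping the whole rename history, rather than only the latest move, is what lets a URI stored two renames ago still resolve in a single hop.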

For links whose validity has a bounded lifespan, the documentation of the representations (the media type) should explicitly lay out that the link is transient and optional. If possible, the documentation should also describe the conditions of the link’s existence.

Remind client developers early and often that clients must follow any and all redirects from the server.

Best practices: Client

Clients should follow redirects. Fastidiously.

Clients should update their internal storage upon receiving a 301 Moved Permanently response by replacing the URI they requested with the newly provided location.
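A rough sketch of that client behavior follows. To keep it self-contained, the HTTP call is passed in as a fetch function returning (status, headers, body); the dereference function, the store layout, and the fake server are all assumptions for illustration. It also treats 308 Permanent Redirect like 301, which goes slightly beyond the advice above.

```python
PERMANENT = (301, 308)
REDIRECTS = (301, 302, 303, 307, 308)


def dereference(store, key, fetch, max_hops=5):
    """Fetch the URI saved under key, following redirects.

    The stored URI is rewritten only while every hop so far has been a
    permanent redirect; a temporary redirect must not overwrite it.
    """
    uri = store[key]
    permanent_so_far = True
    for _ in range(max_hops):
        status, headers, body = fetch(uri)
        if status in REDIRECTS and "Location" in headers:
            permanent_so_far = permanent_so_far and status in PERMANENT
            uri = headers["Location"]
            if permanent_so_far:
                store[key] = uri  # remember the new name for next time
            continue
        return status, body
    raise RuntimeError("too many redirects")


# A stand-in for the network: hypothetical responses keyed by URI.
FAKE_SERVER = {
    "/products/1234": (301, {"Location": "/catalog/coffee/1234"}, ""),
    "/catalog/coffee/1234": (200, {}, "Dark roast, 1 lb"),
}


def fake_fetch(uri):
    return FAKE_SERVER.get(uri, (404, {}, ""))


store = {"order-17/line-1": "/products/1234"}
result = dereference(store, "order-17/line-1", fake_fetch)
print(result, store)
```

After the call the order's product reference points at the new location, so the next lookup skips the redirect entirely.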

Client developers should be aware of transient links in the representations being dealt with. Either do not store these URIs or ensure that attempts to use these URIs handle failure in ways that make sense for the application.
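If a client does store transient links, it helps to record which ones are transient so that a failure can be treated as expected rather than as an error. A minimal sketch, in which the StoredLink record, the use_link function, the rel names, and the choice of 404 and 410 as "the link has expired" are all assumptions:

```python
from dataclasses import dataclass


@dataclass
class StoredLink:
    rel: str
    uri: str
    transient: bool  # taken from the media type's documentation


def use_link(link, fetch):
    """Return the body, or None when a transient link is no longer available.

    fetch is any callable that returns (status, body) for a URI.
    """
    status, body = fetch(link.uri)
    if status == 200:
        return body
    if link.transient and status in (404, 410):
        return None  # expected: the transition has gone away
    raise RuntimeError(f"{link.rel} link failed with status {status}")


pay = StoredLink(rel="payment", uri="/orders/17/payment", transient=True)
product = StoredLink(rel="product", uri="/catalog/coffee/1234", transient=False)


def fake_fetch(uri):
    # Hypothetical server state after the order has been paid.
    return (200, "Dark roast, 1 lb") if uri == product.uri else (404, "")


print(use_link(pay, fake_fetch), use_link(product, fake_fetch))
```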

Believe and follow the redirections the server sends to you. Seriously.

Elliot’s 6th Birthday

Elliot is six today!

Elliot blowing out candles

It has been a big year. Probably the biggest thing is kindergarten which he loves. He is getting to be a big kid but he definitely still has his little kid moments.