Announcing HalClient (for ruby)

HalClient is yet another ruby client library for HAL based web APIs. The goal is to provide an easy to use set of abstractions on top of HAL without completely hiding the HAL based API underneath. The areas of complication that HalClient seeks to simplify are:

  • CURIE links
  • regular vs embedded links
  • templated links
  • working with RFC6573 collections

Unlike many other ruby HAL libraries, HalClient does not attempt to abstract HAL away in favor of domain objects. Domain objects are great but HalClient leaves that to the application code.


CURIE links

CURIEd links are often misunderstood by new users of HAL. Dealing with them is not hard but it requires care to do correctly. Failure to implement CURIE support correctly will result in future breakage as services make minor syntactic changes to how they encode links. HalClient’s approach is to treat CURIEs as a purely over-the-wire encoding choice. Looking up links in HalClient is always done using the full link relation. This insulates clients from future changes by the server to the namespaces in the HAL representations.
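HalClient performs this expansion for you. The sketch below is not HalClient’s actual internals; it just illustrates, with a hypothetical "ex" curie and example URLs, the translation from a CURIEd rel to the full link relation that a correct client must perform before lookup:

```ruby
require "json"

doc = JSON.parse(<<~HAL)
  {
    "_links": {
      "curies": [{ "name": "ex",
                   "href": "http://example.com/rels/{rel}",
                   "templated": true }],
      "ex:widgets": { "href": "http://example.com/widgets" }
    }
  }
HAL

# Expand a CURIEd rel ("ex:widgets") into the full link relation it
# encodes, using the curie templates declared in _links.
def expand_curie(doc, curied_rel)
  prefix, local = curied_rel.split(":", 2)
  curie = doc.fetch("_links").fetch("curies")
             .find { |c| c["name"] == prefix }
  return curied_rel unless curie
  curie["href"].sub("{rel}", local) # crude stand-in for RFC 6570 expansion
end

full_rel = expand_curie(doc, "ex:widgets")
# full_rel == "http://example.com/rels/widgets"
```

Because lookups always use the full relation, the server is free to rename the "ex" prefix tomorrow without breaking any client.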

Regular vs embedded links

From the client perspective there is very little difference between embedded resources and remote links. The only difference is that dereferencing a remote link will take a lot longer. Servers are allowed to move links from the _links section to the _embedded section with impunity. Servers are also allowed to put half of the targets of a particular rel in the _links section and the other half in the _embedded section. These choices are all semantically equivalent and therefore should not affect a client’s ability to function.

HalClient facilitates this by providing a single way to navigate links. The #related(rel) method provides a set of representations for all the resources linked via the specified relationship, regardless of which section the links are specified in. Clients don’t have to worry about the details or what idiosyncratic choices the server may be making today.
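Conceptually, resolving a rel means collecting targets from both sections and presenting them uniformly. A minimal, self-contained sketch of that merge (hypothetical URLs; note that real HAL also permits a single object instead of an array under a rel, which production code must handle):

```ruby
require "json"

doc = JSON.parse(<<~HAL)
  {
    "_links":    { "item": [{ "href": "http://example.com/items/1" }] },
    "_embedded": { "item": [{ "_links": { "self": { "href": "http://example.com/items/2" } },
                              "name": "second" }] }
  }
HAL

# Gather the target URLs for a rel from both _links and _embedded so
# the caller never cares which section the server happened to use.
def related_hrefs(doc, rel)
  linked   = Array(doc.dig("_links", rel)).map { |l| l["href"] }
  embedded = Array(doc.dig("_embedded", rel))
               .map { |r| r.dig("_links", "self", "href") }
  linked + embedded
end

related_hrefs(doc, "item")
# => ["http://example.com/items/1", "http://example.com/items/2"]
```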

Templated links

Templated links are a powerful feature of HAL but they can be a little challenging to work with in a uniform way. HalClient’s philosophy is that the template itself is rarely of interest. Therefore the #related method takes, as a second argument, a set of options with which to expand the template. The resulting full URL is used to instantiate a new representation. This removes the burden of template management from the client and allows clients to treat templated links very similarly to normal links.
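A sketch of that expand-then-dereference step. This handles only the simple {?name} query form of URI templates (real RFC 6570 is much richer, and this is not HalClient’s implementation); the search URL is hypothetical:

```ruby
require "cgi"

# Expand the {?name,...} query-expansion form of a URI template using
# the supplied options, producing a concrete URL to dereference.
def expand(template, opts)
  template.gsub(/\{\?([^}]+)\}/) do
    "?" + Regexp.last_match(1).split(",")
                .map { |k| "#{k}=#{CGI.escape(opts.fetch(k.to_sym).to_s)}" }
                .join("&")
  end
end

expand("http://example.com/search{?q}", q: "hal client")
# => "http://example.com/search?q=hal+client"
```

With the expansion hidden behind #related, a client calls something like repr.related("search", q: "hal client") and receives representations, exactly as it would for an untemplated link.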

RFC6573 collections

Collections are a part of almost every application. HalClient provides built in support for collections implemented using the standard item, next, and prev link relationships. The result is a Ruby Enumerable that can be used just like your favorite collection. The collection is lazily evaluated so it can be used even for very large collections.
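The paging behavior can be sketched with a lazy Enumerator. The PAGES hash below is entirely hypothetical and stands in for HTTP fetches; the point is that each page yields its item entries and the next link is only followed on demand:

```ruby
# Fake "server": two pages of an RFC 6573 collection, keyed by URL.
PAGES = {
  "/orders?page=1" => { "items" => [1, 2], "next" => "/orders?page=2" },
  "/orders?page=2" => { "items" => [3],    "next" => nil }
}.freeze

# Expose the paged collection as a lazy Enumerable: yield each page's
# items, then follow "next" until it runs out.
def collection(start_url)
  Enumerator.new do |y|
    url = start_url
    while url
      page = PAGES.fetch(url) # a real client would GET this URL
      page["items"].each { |i| y << i }
      url = page["next"]
    end
  end.lazy
end

collection("/orders?page=1").first(3) # => [1, 2, 3]
```

Because the enumerator is lazy, taking the first two items never fetches the second page.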


If you are using HAL based web APIs I strongly encourage you to use a client library of some sort. The amount of resilience you will gain, and the amount of minutiae you will save yourself from, will be well worth it. The Ruby community has a nice suite of HAL libraries whose level of abstraction ranges from ActiveRecord style ORM to very thin veneers over JSON parsing. HalClient tries to be somewhere in the middle, exposing the API as designed while providing the helpful, commonly needed functionality that makes client applications simpler and easier to implement and understand.

HalClient is under active development so expect to see even more functionality over time. Feedback and pull requests are, of course, greatly desired. We’d love to have your help and insight.

HTML is domain specific

The partisans of generic media types sometimes hold up HTML as an example of how much can be accomplished without domain specific media types. HTML doesn’t have application/business specific semantics and the whole human facing web uses it, so machine clients should be able to use a generic media type too. There is just one flaw with this logic. HTML is domain specific in the extreme. HTML provides strong semantics for defining document oriented user interfaces. There is nothing generic about HTML.

In the HTML ecosystem, the generic format is SGML. Nobody uses SGML out of the box because it is too generic. Instead, various SGML applications, such as HTML, are created with the appropriate domain semantics to be useful. HTML would not have been very successful if it had just defined links via the a element (which is all you need to have hypermedia semantics) and left it up to individual web sites to define what various other elements meant.

The programs we use on the WWW almost exclusively use the strongly domain specific semantics of HTML. Browsers, for example, render HTML to the screen based on its specified semantics. We have web readers which adapt HTML — which is fundamentally visually oriented — for use by the visually impaired. We have search engines which analyze link patterns and human readable text to provide good indexing. We have super smart browsers which can often fill in forms for us. They can do these things because of the clear, domain specific semantics of HTML.

Programs don’t, generally, try to drive the human facing web to accomplish specific application/business goals because the business semantics are hidden in the prose, lists and labels. Anyone who has tried is familiar with the fragility of web scraping. These semantics, and therefore any capabilities based on them, are unavailable to machine clients of the HTML based web because the media type does not specify those semantics. Media types which target machine clients should bear this in mind.

Bookmarks and URI based versioning

Threads about how to version hypermedia (or REST) APIs are legion. I certainly have made my opinion known in the past. That being said, the most common approach being used in the wild is putting a version number in the URI of the resources which are part of the API. For example, /v1/orders/42.

That approach has the advantage of being simple and easy to understand. Its main downside is that it makes it difficult for existing clients to switch to a newer version of the API if one becomes available. The difficulty arises because most existing clients will have bookmarked certain resources that are needed to accomplish their goals. Such bookmarks complicate the upgrade quite significantly. Clients who want to use an upgraded API must choose to rewrite those bookmarks based on some out of band knowledge, support both the old and new versions of the API, or force the user to start over from scratch.

None of these are good options. The simplest, most attractive approach is the first. However, forcing clients to mangle saved URIs reduces the freedom of the server to evolve. The translation between the two versions of the API will have to be obvious and simple, which means you are going to have to preserve key parts of the URI in the new structure. You cannot switch from a numeric surrogate key to a slug to improve your SEO. Likewise, you cannot move from a slug to a numeric surrogate key to prevent name collisions. You never know when the upgrade script will be executed. It could be years from now, so you will also need to maintain those URIs forever. And since some clients have probably bookmarked resources that you do not think of as entry points, you will need to be this careful for every resource in your system.

The second option, forcing clients to support both versions of the API, is even worse than the first. It means that once a particular instance of a client has used the API it is permanently locked into that version of the API. This is horrible because it means that early users cannot take advantage of new functionality in the API. It also means that deprecated versions of the API must be maintained much longer than would otherwise be necessary.

The third option, forcing users to start over from scratch, is what client writers must do if they want to use functionality which is not available in the obsolete version when there is no clear upgrade path between API versions. This is not much work for the client or server implementers but it seriously sucks for the users. Any configuration, and maybe even previous work, is lost and they are forced to recreate it.

A way forward

Given that this style of versioning is the most common, we need a solution. The link header provides one possible solution. We can introduce a link relating the old and new versions of logically equivalent resources. When introducing a breaking API change, the server bumps the API version and changes the URIs in any way it likes. In the old version of the API, a link header is added to responses to indicate the equivalent resource in the new API, eg

GET /v1/orders/42 HTTP/1.1

HTTP/1.1 200 OK
link: <>; rel="alternate"

Older clients will happily ignore this addition and continue to work correctly. Newer clients will check every response involving a stored URI for the presence of such a link and will treat it as a redirect. That is, they will follow the link and use the most modern variant they support.
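A client-side sketch of that check. The Link header parsing below is deliberately minimal, just enough for this shape of header; a real client should use a proper RFC 8288 parser, and the /v2 URI in the usage line is hypothetical:

```ruby
# Scan a Link header value for an "alternate" link and return its
# target URI, or nil if none is present. Treat a hit as a redirect.
def alternate_uri(link_header)
  return nil unless link_header
  link_header.split(",").each do |l|
    uri = l[/<([^>]*)>/, 1]
    rel = l[/rel="([^"]*)"/, 1]
    return uri if rel == "alternate" && !uri.to_s.empty?
  end
  nil
end

alternate_uri('</v2/orders/42>; rel="alternate"') # => "/v2/orders/42"
alternate_uri(nil)                                # => nil (no upgrade offered)
```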

If you are really bad at API design you can stack these links. For example, the v1 variants might have links to both the v2 and v3 variants. Chaining might also work, but it would require clients to at least be aware of the upgrade link relations of any intermediate versions so that they could follow the chain to the version they prefer.

You could also add links to the obsolescent variant’s body. This would be almost equivalent except that it requires clients to be able to parse older responses enough to search for the presence of such a link. Using the HTTP link header field nicely removes that requirement by moving the link from the arbitrarily formatted body to the HTTP header which will be supported by all reasonable HTTP clients.

Using URIs to version APIs may not be the cleanest way to implement versioning but the power of hypermedia allows us to work around its most obvious deficiencies. This is good given the prevalence of that approach to versioning.