Media types and profiles

Opponents of API versioning using media types often suggest that media type proliferation is a cause for serious concern. The implication is that the more media types that exist, the more different formats intermediates and tools will need to understand in order to be useful. Fortunately, this is just not true. Having lots of media types does not imply having a lot of incompatible formats. Nor does it imply requiring tooling and intermediates to be a lot more complex.

This is difficult to come to terms with, in part, because most media types are not just one thing. The Atom feed for this blog is simultaneously an article syndication document, an XML document processable by any compliant XML parser, a UTF-8 encoded plain text document and an octet stream. All of those are media types. It would be perfectly legal to return an Atom document with the Content-Type set to text/plain, but we generally choose to request and identify Atom feeds using the Atom media type because that provides the most value to the client making the request. Notice that we choose the most specific media type, not the least.

A downside of using the most specific media type available is that some intermediates are put at a disadvantage. If some intermediate is able to do something useful with XML but does not understand that Atom is XML, it might not do that useful thing with our request. On the other hand, using a less specific media type might have the same disadvantage. If we call our Atom document an octet stream, intermediates are going to pretty much ignore it. We have a stack of formats, each of which is a compatible extension of all the ones below it, but we are only allowed to give it one name. This is bound to leave some components unable to work optimally.

Only being able to specify a single name is the root of the problem, not having lots of compatibly layered formats. Fortunately, the profile link relation provides a solution. You just have to use it in a slightly different way than its proponents currently suggest.

Very specific media types

The client constructing the requests needs to be able to tell the server what it needs to accomplish its goal. If it can work with any old octet stream it can put */* in the Accept header field. If, on the other hand, it is expecting specific information to be provided in an element with a specific id then it needs to be able to let the server know that. A very specific media type combined with content negotiation is a great way to provide this while still allowing substantial flexibility to servers.

Use profile link header for more generic format information

Rather than prevent clients from asking for what they need, servers should decorate responses with profile link headers that provide hints about alternate ways a representation could be processed. This provides intermediates and tooling a way to identify representations they can work with, regardless of what the Content-Type header field says.

Consider an API that uses a media type based on Atom but with extensions. It could register a very specific media type in the vendor tree for that particular flavor of Atom and use that in the Content-Type header field. In addition, it could provide link headers pointing to http://tools.ietf.org/html/rfc4287 (Atom), http://www.w3.org/TR/2006/REC-xml11-20060816/ (XML), and some URI representing plain text. Clients that need the specific extensions to work can make that known. Clients that can work with any old Atom document can request Atom documents and get them with or without the extensions. Intermediates that work with any Atom document can easily detect that responses with the very specific media type are, in fact, Atom so they can do their job. And if at some later date a standard way to represent this data emerges, the API can add support for it without breaking any of its existing clients, direct or implicit.
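
A minimal sketch of what the headers of such a response might look like, written in Python. The vendor media type application/vnd.example.feed+xml and the function names are made up for illustration; only the Atom and XML URIs come from the paragraph above. No plain text profile is included because no URI is named for it.

```python
# Hypothetical vendor media type for an Atom-based format with extensions.
SPECIFIC_MEDIA_TYPE = "application/vnd.example.feed+xml"

# Profile URIs for the more generic formats the representation also conforms to.
GENERIC_PROFILES = [
    "http://tools.ietf.org/html/rfc4287",             # Atom
    "http://www.w3.org/TR/2006/REC-xml11-20060816/",  # XML
]


def response_headers(media_type=SPECIFIC_MEDIA_TYPE, profiles=GENERIC_PROFILES):
    """Return (name, value) pairs: one Content-Type header plus one
    Link header with rel="profile" per generic format."""
    headers = [("Content-Type", media_type)]
    for uri in profiles:
        headers.append(("Link", '<%s>; rel="profile"' % uri))
    return headers


def conforms_to(headers, profile_uri):
    """True if any Link header advertises profile_uri as a profile.

    This is the check an intermediate could make without knowing
    anything about the very specific Content-Type."""
    wanted = '<%s>; rel="profile"' % profile_uri
    return any(name.lower() == "link" and value == wanted
               for name, value in headers)
```

An intermediate that only understands Atom would call conforms_to with the Atom URI and never need to recognize the vendor media type at all.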

Background

This particular line of thinking was prompted by Peter Janes pointing out that profiles and media type based versioning might be complementary.

Something has been nagging at me about the approaches to REST API versioning presented by Peter Williams and Mark Nottingham. I’m sure they’re complementary, but I’m not quite grokking how.

That insight really got me thinking. The impact of media type versioning on intermediates has been a nagging issue for me for a while now and I am happy to finally have a solution.

Bookmarks and URI based versioning

Threads about how to version hypermedia (or REST) APIs are numerous. I certainly have made my opinion known in the past. That being said, the most common approach being used in the wild is putting a version number in the URI of the resources which are part of the API. For example, http://api.example.com/v1/products/42.

That approach has the advantage of being simple and easy to understand. Its main downside is that it makes it difficult for existing clients to switch to a newer version of the API if one becomes available. The difficulty arises because most existing clients will have bookmarked certain resources that are needed to accomplish their goals. Such bookmarks complicate the upgrade quite significantly. Clients that want to use an upgraded API must choose to rewrite those bookmarks based on some out of band knowledge, support both the old and new version of the API, or force the user to start over from scratch.

None of these are good options. The simplest, most attractive approach is the first. However, forcing clients to mangle saved URIs reduces the freedom of the server to evolve. The translation between the two versions of the API will have to be obvious and simple. That means you are going to have to preserve key parts of the URI in the new structure. You cannot switch from a numeric surrogate key to a slug to improve your SEO. Likewise, you cannot move from a slug to a numeric surrogate key to prevent name collisions. You never know when the upgrade script will be executed. It could be years from now, so you will also need to maintain those URIs forever. Some clients have probably bookmarked some resources that you do not think of as entry points, so you will need to be this careful for every resource in your system.

The second option, forcing clients to support both versions of the API, is even worse than the first. This means that once a particular instance of a client has used the API it is permanently locked into that version of that API. This is horrible because it means that early users cannot take advantage of new functionality in the API. It also means that deprecated versions of the API must be maintained much longer than would otherwise be necessary.

The third option, forcing users to start over from scratch, is what client writers must do if they want to use functionality which is not available in the obsolete version when there is no clear upgrade path between API versions. This is not much work for the client or server implementers but it seriously sucks for the users. Any configuration, and maybe even previous work, is lost and they are forced to recreate it.

A way forward

Given that this style of versioning is the most common, we need a solution. The link header provides one possible solution. We can introduce a link to relate the old and new versions of logically equivalent resources. When introducing a breaking API change the server bumps the API version and changes the URIs in any way it likes, e.g. the new URI might be http://example.com/v2/products/super-widget. In the old version of the API a link header is added to responses to indicate the equivalent resource in the new API, using a dedicated link relation, e.g. http://example.com/v2/rels/v2-equivalent.

>>>
GET /v1/products/42 HTTP/1.1
...

<<<
HTTP/1.1 200 OK
link: <http://example.com/v2/products/super-widget>; rel="alternate http://example.com/v2/rels/v2-equivalent"
...

Older clients will happily ignore this addition and continue to work correctly. Newer clients will check every response involving a stored URI for the presence of such a link and will treat it as a redirect. That is, they will follow the link and use the most modern variant they support.
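
The check a newer client performs could be sketched roughly as follows, in Python. The function name is hypothetical and the parsing is deliberately simplified (it handles the common `<target>; rel="..."` shape, not every corner of the Link header grammar), so treat it as an illustration rather than a complete parser.

```python
import re

# The link relation used to mark the equivalent resource in the new API.
V2_EQUIVALENT = "http://example.com/v2/rels/v2-equivalent"


def upgrade_target(link_header, rel=V2_EQUIVALENT):
    """Return the target URI of the first link carrying `rel`, or None.

    `link_header` is the raw value of a Link header field, which may
    contain several comma-separated links.
    """
    # Split only on commas that start a new "<target>" so commas
    # inside a URI do not break the link apart.
    for part in re.split(r',\s*(?=<)', link_header):
        match = re.match(r'\s*<([^>]*)>(.*)$', part, re.S)
        if not match:
            continue
        target, params = match.groups()
        rel_match = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.I)
        # A rel parameter may hold several space-separated relation types.
        if rel_match and rel in rel_match.group(1).split():
            return target
    return None
```

A client would call this on every response to a request for a stored URI; when it returns a target the client replaces its bookmark with that URI and repeats the request there, much as it would for a redirect.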

If you are really bad at API design you can stack these links. For example, the v1 variants might have links to both the v2 and v3 variants. Chaining might also work but it would require clients to be aware, at least, of any intermediate version upgrade link relations so that they could follow the chain to the version they prefer.

You could also add links to the obsolescent variant’s body. This would be almost equivalent except that it requires clients to be able to parse older responses enough to search for the presence of such a link. Using the HTTP link header field nicely removes that requirement by moving the link from the arbitrarily formatted body to the HTTP header which will be supported by all reasonable HTTP clients.

Using URIs to version APIs may not be the cleanest way to implement versioning but the power of hypermedia allows us to work around its most obvious deficiencies. This is good given the prevalence of that approach to versioning.

REST/HTTP Service Versioning (Response to Jean-Jacques Dubray)

Jean-Jacques Dubray takes issue with my approach of using content negotiation to manage service versioning in HTTP. I actually hesitate to respond to Mr. Dubray because the overall tone of his piece is rather off-putting. On the other hand, he raises a couple of interesting questions which I have been looking for an excuse to talk about. So I will give it a go.

Handling obsolescent service providers

Mr. Dubray asks how we deal with version skew between the client and server.

Backwards compatibility is when the consumer comes in with a “newer” request version than the service provider can provide. This is common when a consumer uses different providers for the same type of service. So ultimately, you need to provide some room to define the version of both the consumer and the version of the service provider that it is targeting. Your mechanism only supports “one version”.

Not true; the versioning mechanism I describe easily handles multiple versions. First, let’s be clear: a service provider cannot provide capabilities that were not conceived of until after it was written. So what Mr. Dubray must be interested in is the ability of a single consumer to successfully communicate with multiple versions of the service provider. I agree with him that this is an absolutely vital feature of any versioning mechanism.

Fortunately, content negotiation deals with this issue quite handily. I left this out of the original post for simplicity’s sake but it is well worth talking about. HTTP allows user agents – or service consumers, if you prefer – to specify more than one acceptable response format. For example, the following is a perfectly legal HTTP conversation.

===>
GET /accounts/42
Accept: application/vnd.myapp-v2+xml, application/vnd.myapp-v1+xml;q=0.8

<===
200 OK
Content-Type: application/vnd.myapp-v1+xml

<account>
  <name>Inigo Montoya</name>
</account>

The Accept header field in the request indicates that the consumer can operate using either version 1 or 2 of the API but it prefers version 2. Accept headers can include any number of MIME media types along with preference indicators (the q=number part). This allows consumers to inform the server of all acceptable dialects of the API with which it can work. In the example, the server obviously did not support version 2 of the API and therefore responded using version 1.
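
The server side of that negotiation might be sketched like this in Python. The function names are made up for illustration, and the sketch assumes exact media type matches only: it ignores wildcards such as */* and any media type parameters other than q.

```python
def parse_accept(header):
    """Parse an Accept header value into a list of (media_type, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # a media type without a q parameter is fully preferred
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def negotiate(accept_header, supported):
    """Return the supported media type the client prefers most.

    Returns None when nothing matches, in which case the server
    would answer 406 Not Acceptable.
    """
    best, best_q = None, 0.0
    for media_type, q in parse_accept(accept_header):
        if media_type in supported and q > best_q:
            best, best_q = media_type, q
    return best
```

Given the Accept header from the conversation above, a server that supports only application/vnd.myapp-v1+xml ends up choosing version 1, while a server that supports both chooses version 2.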

Resource deprecation

Further along Mr. Dubray asks this question,

Another flaw of your versioning strategy is that URIs are by default part of the versioning strategy. I have often pointed out that “Query-by-examples” are encoded by members of the REST community (MORCs) in a URI syntax, for instance:

/customer/{id}/PurchaseOrders ...

Peter, how do you express that a particular QBE belongs to one version and not to the other?

I don’t. The set of purchase orders associated with a particular customer is not version specific. The customer has agreed to purchase the same things regardless of which version of the service you are talking to.

Perhaps the question Mr. Dubray is really trying to ask is, what happens if you want to deprecate such a resource?

(One reason to do so might be that the purchase order collections become too big to reasonably render in a single response. There are other, better ways to solve that particular problem but it is a nice concrete use case for resource deprecation.)

Resource deprecation is easily handled in REST using media types to handle versioning. First, some ground rules: user agents should never be constructing such a URI. Doing so would be a gross violation of the HATEOAS constraint of REST. Rather, they would be extracting that URI from the representation of the customer provided by the server. In such a case, an HTTP conversation getting the purchase orders for a customer might look like this.

===>
GET /customer/42
Accept: application/vnd.myapp-v1+xml
<===
200 OK
Content-Type: application/vnd.myapp-v1+xml

<customer>
  <purchase-orders href="http://service.example/customer/42/purchase-orders"/>
</customer>


===>
GET /customer/42/purchase-orders
Accept: application/vnd.myapp-v1+xml
<===
200 OK    
Content-Type: application/vnd.myapp-v1+xml

<purchase-orders>
  ...
</purchase-orders>

At version 2 of the API we deprecate the all-purchase-orders-for-customer resource – removing all references to it in the customer representations – and replace it with a purchase-orders-by-month-by-customer resource. A similar HTTP conversation with a client capable of handling version 2 of the API would look like this.

===>
GET /customer/42
Accept: application/vnd.myapp-v2+xml
<===
200 OK
Content-Type: application/vnd.myapp-v2+xml

<customer>
  <purchase-orders-by-month href-template="http://service.example/customer/42/purchase-orders?in_month={xmlschema-gYearMonth}"/>
</customer>


===>
GET /customer/42/purchase-orders?in_month=2008-05
Accept: application/vnd.myapp-v2+xml
<===
200 OK    
Content-Type: application/vnd.myapp-v2+xml

<purchase-orders>
  ...
</purchase-orders>

Notice that in version 2 of the API the all-purchase-orders-for-customer resource is no longer exposed in any way. As a human you might guess that it still exists, and indeed it would need to in order to handle requests to version 1 of the API. However, a version 2 consumer will never make a request to that resource because it is not mentioned in the version 2 representations. Indeed, any requests for the all-purchase-orders-for-customer by a version 2 consumer would be met with a 406 Not Acceptable response because it is not part of the version 2 API.
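
A consumer that follows this rule never builds the purchase orders URI itself; it only reads what the customer representation offers. A rough Python sketch, assuming the element and attribute names used in the conversations above (the function name is hypothetical, and the template expansion is a plain string substitution rather than a full URI template implementation):

```python
import xml.etree.ElementTree as ET


def purchase_orders_uri(customer_xml, month=None):
    """Return the purchase orders URI advertised by a customer document.

    A version 1 document carries a plain href. A version 2 document
    carries an href-template that needs a month such as "2008-05".
    Returns None when the document offers nothing usable.
    """
    root = ET.fromstring(customer_xml)

    link = root.find("purchase-orders")
    if link is not None:
        return link.get("href")

    templated = root.find("purchase-orders-by-month")
    if templated is not None and month is not None:
        template = templated.get("href-template")
        return template.replace("{xmlschema-gYearMonth}", month)

    return None
```

Because the version 2 representation no longer mentions the all-purchase-orders-for-customer resource, a consumer written this way has no means of reaching it, which is exactly the point.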

Wrap up

Toward the end Mr. Dubray gets into full rant mode with these bits,

You will soon start realizing that resources do have a state that is independent of the “application” since by definition a resource can participate in multiple “applications”. This is the essence of SOA, i.e. the essence of reuse.

Resources certainly may participate in multiple “applications”. There is nothing in the REST principles that prevent that. I don’t really claim to be an SOA expert. I just make systems work using REST principles. So far I have not found a problem reusing my resources in multiple applications. In fact, REST seems to excel at that very thing.

At least, some of the MORCs (Members Of the REST Community) could have the courtesy to acknowledge that they are indeed building a programming model on top of REST, that this programming model needs clear semantics and that these semantics are not intrinsically part of REST (nor always RESTful).

I, for one, will readily acknowledge that we have built, and are continuing to build, programming models on top of REST. REST is merely a set of principles, articulated as constraints, that facilitate the creation of useful network based architectures. I would be very surprised if many in the REST community would disagree with me. These programming models do, for the most part, adhere to the REST principles.

Building REST/HTTP web services is certainly not fully understood yet. That does not make it special; hardly any sort of system design or architecture is fully understood. However, REST seems, to me at least, to be a better fit for today’s applications and technologies than any of the alternatives.


If you’re interested in REST/HTTP service versioning be sure not to miss the rest of the series.

Versioning REST Web Services (Tricks and Tips)

In my previous post on this subject I described an approach to versioning the API of a REST/HTTP web service. This approach has significant advantages over the approach that is currently most common (i.e. embedding a version token in the URL). However, it does have some downsides. This post is an attempt to outline those and to present some ways to mitigate the negative impacts.

Nonstandard MIME media types

Using content negotiation to manage versions requires, by definition, the introduction of nonstandard media types. There is really no way around this. I personally don’t feel this is a Bad Thing. The new, nonstandard, media types do a much better job describing the sort of media the client is requesting. It does, however, mean that browsers – and perhaps some HTTP tools – will work less well with the web service.

The browser not working is a pretty big issue. Browsers are almost certainly not the target consumers of the service, but having the browser not work raises the level of effort for exploring the API. If you have created a cool new service you want as few barriers to entry as possible. Personally, I always use curl when I am exploring but I know several people who would prefer to use a browser.

Unfortunately, I don’t really have a great general solution for browsers. That being said, in many situations much can be done to make life better. For example, if the resources in question do not have HTML representations you could serve the current preferred format to browsers with a generic content type that they can render, e.g. text/plain or application/xml.
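
One way that fallback might be sketched in Python. Treating any request whose Accept header lists text/html as coming from a browser is an assumption made for this sketch, not a reliable detection method, and the function name is hypothetical.

```python
def content_type_for(accept_header, preferred, generic="application/xml"):
    """Choose the Content-Type label for a response in the preferred format.

    The body is the same either way; only the label changes. Clients
    that look like browsers (assumption: they list text/html and do not
    ask for the preferred type by name) get the generic label so the
    document can be rendered.
    """
    requested = [item.split(";")[0].strip().lower()
                 for item in accept_header.split(",")]
    looks_like_browser = ("text/html" in requested
                          and preferred.lower() not in requested)
    return generic if looks_like_browser else preferred
```

A request sent with Accept: */* is not affected by this rule and still gets the specific media type.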

Curl

One advantage of having the version token directly in the URL is that it makes it really easy to use curl against the service. By default curl makes requests with the Accept header field set to */*. For a reasonably designed service this would result in a response in the current preferred format. If you want to change the Accept header you need to invoke curl like this

curl --header 'Accept: application/vnd.foo.myformat-v1+xml' http://api.example/hello-world

That is not too horrible, really. It is a bit much to type all the time, but I have curl rc files for all the formats I deal with on a daily basis. If your service is implemented in Rails there is an even easier way. With Rails you give each format you support a short name that may be used as an “extension” for URLs. For example, if we define the short name for application/vnd.foo.myformat-v1+xml to be mf1 we can say this

curl http://api.example/hello-world.mf1

That is equivalent, from the point of view of a Rails based service, to the previous example. I imagine similar functionality could be implemented in most web frameworks. This effectively puts you back to having the version embedded in the URL, which is convenient for debugging and exploration. (It is still unsuitable for production use, though, for all the same reasons as other approaches to embedding the version in the URL.)
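
For a framework without such a feature, the mapping could be approximated along these lines in Python. This is a sketch of the idea only and not a description of how Rails implements it; the table contents beyond the mf1 example and the function name are hypothetical.

```python
# Short names that may be used as URL "extensions", mapped to media types.
SHORT_NAMES = {
    "mf1": "application/vnd.foo.myformat-v1+xml",
}


def effective_request(path, accept_header, short_names=SHORT_NAMES):
    """Return (resource_path, accept) for an incoming request.

    A recognised extension on the last path segment is stripped and
    overrides whatever the Accept header said. Anything else passes
    through untouched.
    """
    base, dot, ext = path.rpartition(".")
    # Require the dot to be in the last segment and the name to be known.
    if dot and "/" not in ext and ext in short_names:
        return base, short_names[ext]
    return path, accept_header
```

With this in place a request for /hello-world.mf1 is handled exactly like a request for /hello-world with the Accept header set to application/vnd.foo.myformat-v1+xml.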

Nonobviousness

Another potential downside of using content negotiated versioning is that the various versions may be less discoverable, compared to a version-in-the-URL approach. I am not entirely sure this is true – after all there is a version token in the media type – but if it is true it would be a Good Thing.

Do you really want people “discovering” a version of the API that was deprecated a year ago? I think it might be better, in either approach, to use version tokens that are not readily guessable. Obviously, previous versions of an API will be documented and remain accessible, but raising some barriers to entry on deprecated parts of a system seems appropriate to me.

Unfamiliarity

This may be the biggest issue of all. People are just not very familiar, and therefore comfortable, with content negotiation. This is in spite of the fact that it has been a fundamental part of HTTP since forever. I think this feature’s obscurity is waning now, though, because it is such a powerful feature.

Two years ago Rails got content negotiation support. (That link seems to be broken at the moment. You can see part of the post I am talking about by going here and searching for “The Accept header”.) As frameworks like Rails keep adding and improving their support for this powerful feature the community of developers will become more familiar and comfortable with it. What is needed now is more education in the community on how best to utilize this feature.


Versioning REST Web Services

Managing changes to APIs is hard. That is no surprise to anyone who has ever maintained an API of any sort. Web services, being a special case of API, are susceptible to many of the same versioning difficulties as other types of APIs. For HTTP based REST style web services the combination of resources and content negotiation can be used to mitigate most of the issues surrounding API versioning.

Let’s assume you have a REST/HTTP web service that has some account resources. Say you can make a request like this:

===>
GET /accounts/3 HTTP/1.1
Accept: application/vnd.mycompany.myapp+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.mycompany.myapp+xml

<account>
  <name>Inigo Montoya</name>
</account>

First, you probably noticed that my example uses a vendor MIME media type to describe the representation. Using a more generic MIME media type like application/xml is much more common, at least in my experience. Using generic media types is perfectly legal but a bit silly. You are not really asking for any old XML document, but rather an XML document that has a quite specific schema. Aside from my idealistic rantings, using a specific media type has some strong practical benefits which are at the core of this post.

Backwards compatible changes

Often changes will need to be made to expose new behavior of the system in ways that do not negatively impact correctly implemented clients. Say, for example, you want to start tracking email addresses for accounts. If the application/vnd.mycompany.myapp+xml format documentation is clear that elements that are not recognized should be ignored, you can simply add an email-address element to the account representation.

<account>
  <name>Inigo Montoya</name>
  <email-address>mailto:prepare-to-die@youkilledmyfather.example</email-address>
</account>

Any client that was created before the addition of the email-address element will simply ignore its presence. Problem solved.
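
What a correctly implemented client looks like here can be sketched in a few lines of Python. The function name and the set of known fields are assumptions for illustration; the point is only that the client picks out the elements it understands and skips the rest.

```python
import xml.etree.ElementTree as ET

# The elements this (older) client was written to understand.
KNOWN_FIELDS = ("name",)


def read_account(xml_text, known_fields=KNOWN_FIELDS):
    """Return a dict of the known child elements of an account document.

    Unrecognized elements are ignored rather than treated as errors,
    which is what lets the server add new ones later.
    """
    root = ET.fromstring(xml_text)
    return {child.tag: child.text
            for child in root
            if child.tag in known_fields}
```

Such a client produces the same result for the account representation before and after the email address was added.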

Incompatible changes

Unfortunately, not all changes can be implemented in a way that is backwards compatible. For example, a couple of months after adding email to accounts the sales team signs a deal for 1 bazillion dollars. But the new customer demands that each account be allowed to have more than one email address. After thinking for a while, you decide that the best way to expose that is by changing the account representation as follows.

<account>
  <name>Inigo Montoya</name>
  <email-addresses>
    <email-address priority='1'>mailto:prepare-to-die@youkilledmyfather.example</email-address>
    <email-address priority='2'>mailto:vengeance@youkilledmyfather.example</email-address>
  </email-addresses>
</account>

That, of course, will break any clients that are expecting the old format – so pretty much all of them. This is a place where we can bring content negotiation to bear. You can simply define a new media type – say application/vnd.mycompany.myapp-v2+xml – and associate the new multi-email format with it. Clients can then request whichever format they want. Older clients don’t know the new media type so they get served the older single email format.

===>
GET /accounts/3 HTTP/1.1
Accept: application/vnd.mycompany.myapp+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.mycompany.myapp+xml

<account>
  <name>Inigo Montoya</name>
  <email-address>mailto:prepare-to-die@youkilledmyfather.example</email-address>
</account>

Newer clients do know the new media type so they can have access to the new functionality.

===>
GET /accounts/3 HTTP/1.1
Accept: application/vnd.mycompany.myapp-v2+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.mycompany.myapp-v2+xml

<account>
  <name>Inigo Montoya</name>
  <email-addresses>
    <email-address priority='1'>mailto:prepare-to-die@youkilledmyfather.example</email-address>
    <email-address priority='2'>mailto:vengeance@youkilledmyfather.example</email-address>
  </email-addresses>
</account>

Everyone gets what they need. Easy as pie.
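
On the server, both formats can be rendered from the same account resource, with the negotiated media type deciding which one goes out. A minimal Python sketch under stated assumptions: the account is a plain dict with hypothetical keys "name" and "emails", the function name is made up, and the version 1 format shows only the highest priority address.

```python
from xml.sax.saxutils import escape

V1 = "application/vnd.mycompany.myapp+xml"
V2 = "application/vnd.mycompany.myapp-v2+xml"


def render_account(account, media_type):
    """Render one account in the format named by media_type."""
    name = escape(account["name"])
    emails = account["emails"]  # ordered from highest to lowest priority

    if media_type == V2:
        # Multi-email format: every address, with its priority.
        items = "".join(
            "    <email-address priority='%d'>%s</email-address>\n"
            % (priority, escape(address))
            for priority, address in enumerate(emails, start=1))
        body = "  <email-addresses>\n%s  </email-addresses>\n" % items
    elif media_type == V1:
        # Single-email format: only the first address is exposed.
        body = "".join("  <email-address>%s</email-address>\n" % escape(address)
                       for address in emails[:1])
    else:
        raise ValueError("unsupported media type: %s" % media_type)

    return "<account>\n  <name>%s</name>\n%s</account>" % (name, body)
```

The account itself, and its URI, stay the same; only the representation differs between the two media types.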

Alternate approaches

The most commonly proposed approach for versioning REST/HTTP web service interfaces today seems to be to mutilate the URIs by inserting a version. For example,

http://foo.example/api/v1/accounts/3

I really hate this approach as it implies that an account in one version of the API is really a different resource than the account in a different version of the API.

It also forces clients into a nasty choice: either support multiple versions of the API simultaneously or break one of the core constraints of REST. For example, say a client exists for the v1 API that saves references (URIs that include the version indicator) to accounts. Some time later the client is updated to support the new version of the API. In this situation the client must either support both versions of the API simultaneously, because all the previously stored URIs point at the old version of the API, or munge all the URIs it has stored to point at the new API. Munging all the URIs breaks the HATEOAS constraint of REST and supporting multiple versions of the API is a maintenance nightmare.

Conclusion

Changes to REST/HTTP web service interfaces come in three basic flavors: changes to the properties associated with a type of resource, additions of new types of resources and deprecation of obsolete types of resources. If you are following the HATEOAS constraint of REST the approach described here can be used to safely handle all three scenarios.

This approach does lead to media types being created, but media types are cheap so we can – and should – have as many as we need. Used properly, content negotiation can be used to solve the problems related to versioning a REST/HTTP web service interface.
