BarCamp Boulder
For anyone in the Denver/Boulder metro area, BarCampBoulder happening tonight and tomorrow (March 30 & 31).
For anyone in the Denver/Boulder metro area, BarCampBoulder happening tonight and tomorrow (March 30 & 31).
Benjamin Carlyle has posted a followup about using PUT to create new resources in which he brings up some interesting issues.
First, it seems I miss understood his original idea slightly. My misunderstanding does not affect how I feel about his approach much. I don’t like the idea of PUT-ting to a “factory” resource with a GUID in the query string any more than I like putting to a GUID based URI that the server might actually be able to use. In fact, I think I might like it less. On the other hand, PUT-ting to a “factory” is really the the same same thing I proposed in response to Mr Carlyle’s original post, I just left off the GUID bit. I find GUIDs to be slightly repulsive and I really don’t see any need for them in the approaches being discussed.
Mr Carlyle also points out that the HTTP spec demands a 301 (Moved Permanently) redirect be used if the server wants a PUT applied to a different URI. Unfortunately that does not really match semantics of redirecting from a factory/URL generator resource. This occured to me when I was writing my proposal for safe resource creation (which is really just Mr Carlyle’s proposal without the GUID) but I punted and did not even mention the issue. A possible solution might be to use an extension code in the 300 series to mean “URI Reserved”. The would mean a PUT request to a factory/URL generation resource would response with
HTTP/1.1 372 URI Reserved
Location: http://example.com/yourNewResource
The semantics are nice and clean but it has the disadvantage of being non-standard. This sort of “extension code” is explicitly supported in the HTTP spec but it does require that clients be customized to understand it.
Stefan Tilkov proposes an alternate approach. His idea involves a POST request to URI generation service. This would then return the new URI in the Location:
header. Totally workable. It requires a significant, though not disastrous, level of coupling between the server and client. The approach is loses some of it’s tidiness if the server would like to use the a natural key mechanism for the URIs (and it is positively messy if the server would like to transition from a generated key to a natural key). For the naturally keyed URIs to be generated the POST request to the URI generation service would have to include the the complete representation you want to store. This is fine in practice it makes it look even more like the PUT based approaches.
One of the reasons Mr Tilkov does not like the redirected PUT approach to resource creation is that
it violates the original purpose of PUT, though — if I PUT to a > URI, I don’t expect it to have different results each time I do so
There is certainly one way to look at the redirected PUT request that is a little out of sync with the canonical PUT semantics but I don’t think it has any thing to do with the results or the request. The semantic problem I see is that a PUT is a request to store an entity at a particular URI. In the context of redirected PUT based resource creation is that the new entity will never be stored at the initial URI to which it is PUT.
This is not as big an issue as it seems at first glance, however. If you think of the initial URI used for resource creation as pointing to the next unused slot in a collection of resources, rather than it being a resource factory, the semantics line up much more cleanly. From this point of view, PUT-ting an entity to the “next empty slot” URI and being redirected to a permanent URI for that slot fits rather nicely with normal PUT semantics. The redirect is necessary because once a slot is spoken for the “next slot” URI, by definition, points someplace else.
This way of thinking about a the new resource is similar to a “latest” URI. No one would quibble about a resource with a URI like http://example.com/blog/latest
. The response to a GET of that URI like this would change often and those changes would result from the change in state of some other resource (namely the posts collection resource). The important thing to keep in mind is that the resource in question is just most recent post. Similarly, a “next slot” resource always points to the next new member of a resource collection.
If you choose to use this way of thinking about resource creation perhaps a URI whose purpose is slightly harder to confuse would be helpful. Say something like http://example.com/purchase-order/next-new
, though I am not sure if that is really better.
It seems that I was a little unclear in my post about using the HTTP PUT method for resource creation, so let me try again.
In this post Benjamin Carlyle points out that using POST for resource creation has some serious flaws. To see the main problem with using POST for resource creation consider the following scenario.
You make a POST request to create a new purchase order resource but you don’t get a response. One of two things may have happened.
the server got your request and create the purchase order
the server never received the request, or crashed before it was able to create the new purchase order
If 1 happened you don’t want to re-issue the new purchase order request because that would double the order you just made. If 2 happened you do want to re-issue the new purchase order request otherwise you are not going to get the stuff you are trying to order. Unfortunately, with POST base resource creation there is no way to tell in which way the request failed, and therefore no way to decide what to do to resolve the problem without some outside (read: human) input.
It should be noted that when a human is already involved (such as in a user facing web app) this is really not much of a problem. If you try to create a new blog post but don’t get a response you just go look at your blog to see if actually worked or not and then do the appropriate thing. However, when you get into computer-to-computer interactions this issue is a big deal, and in many situation it borders on completely unacceptable.
One way to handle resource creation that does not suffer from these issues is to use PUT instead of POST. PUT request are idempotent, by definition, so anytime you make PUT request and don’t get a response you can just re-issue the request until you do. Using that characteristic to implement resource creation would look something like this:
A client needing to create a new purchase order makes a PUT request, containing a representation of the purchase order to create, to a well know new resource URI (something like http://example.com/purchase-orders/new
)
The server generates a new URI to reference this new purchase order, using whatever mechanism it chooses, and responses with redirect to this new URI.
The client re-issues the PUT request containing the purchase order representation to the newly generated URI.
The server stores the new purchase order and responses with a “201 Created”
In this scenario there are two points at which you could get no response from the server, but at each of those points there is a exactly one correct thing to do to resolve the failure. If you don’t get a response to the initial PUT to the “new resource” URI you can just re-issue it until you do get a response. Because the server never actually creates a resource at this point it does not matter if the server processed your request an the response got lost, or if your request never made it to the server. By re-issuing the request the worst thing that can happen is that a few URIs are generated that will never get used but URIs free so that is no problem at all. Once you have the redirect from the initial “new resource” request you can re-issue the PUT against the redirect URI. If no response is received you can simply re-issue the request until you do get one. The idempotence guarantee of PUT requests means that making the same request multiple times has the same effect as making it once.
This approach is adapted from a very similar approach suggested by Benjamin Carlyle. My primary concerns with the approach Mr. Carlyle suggested is that it forces the client of a RESTful web service to understand the URI generation scheme used by the server or the server to understand a URI generation scheme it has no intention of actually using. The GUID based URIs Mr. Carlyle suggests is workable but I think it forces far too much knowledge about the implementations of the client and server into each other. This knowledge would cause the server, clients or both to become more complicated than necessary.
It seems that if you tilt your head just the right way software engineering has a lot in common with manufacturing. One thing I like about this idea is that it does a pretty good job of explaining why agile software development practices work.
Benjamin Carlyle has an interesting bit about the possibility of deprecating the HTTP POST method. I think most people who have thought deeply about RESTful architectures have had similar thoughts. GET, PUT and DELETE are all nicely idempotent, but POST is not. GET, PUT, and DELETE have clean, well defined semantics, but POST does not. POST generally seems a little out of place next to it’s cleaner cut cohorts. However, there is an absolute requirement for the “process this” semantics of POST so deprecating it is completely out of the question. On the other hand, it does get used in some situations where there are better approaches.
POSTs lack of idempotence has some nasty side effects particularly for what is probably the most common use of POST today, new resource creation. Consider the following scenario, you POST a request to create a new resource but you don’t get a response. It is impossible to automatically recover from this scenario. You cannot resend the request because the new resource may have been created and you just did not get the response and you cannot check to see if the resource was created because you don’t know the URI it would have been assigned if it had, in fact, been created.
I think Mr. Carlyle is correct that using POST for resource creation is sub-optimal. He suggest the following approach1, suppose you have a new purchase order resource you want to make known to the server. The client generates a GUID and the issues the following request
PUT /purchaseOrders/76fd9473-a270-4aac-8a06-e5265048cbbc HTTP/1.1
Host: example.com
Content-Length: ....
<PurchaseOrder>
...
</PurchaseOrder>
The server thinks “I do not know of a purchase order with an id of 76fd9473-a270-4aac-8a06-e5265048cbbc so this request is regarding a never before seen purchase order” then the go about storing that brand new purchase order and returns a “201 Created” response. From are RESTful point of view this is a reasonable approach. It relies on only standard PUT semantics, is resource focused and is idempotent so you can safely keep repeating the request until you get a response.
While relatively straight forward the approach does requires a lot of out of band information because the client has to understand the servers id generation scheme and be able to reliably generate a new id that is unique. If you are using GUIDs as the resource id’s you can be reasonably assured that clients can, in fact, generate globally unique ids so there are not really a practical short term problem.
Some potential problems arise when you have a server that uses a scheme for which ids cannot be reliably generated on the client, such as monotonically-increasing numbers, or where there would be a different way to figure the id for each type of resource, such as natural keys. It also tightly couples the clients to the server implementation in ways I think are dangerous. For example, what happens if you decide you would like to move to an URL scheme that is less ugly than GUIDs?
For all those situations Mr. Carlyle proposes continuing to use a GUID based URI for resource creation PUTs but rather than actually creating the resource, having the server response with a permanent redirect to a new URI at which the resource should exist. The client would then re-issue the PUT request to the new URI and only then would server create the new resource.
I really like the general arc of this proposal but I despise the GUID part. GUIDs are ugly and they offend my sense of elegance. However, this proposed approach can be easily tweaked into something I really like. Suppose that rather using GUID based URIs you instead used a “new resource” URI. For example you could say
PUT /purchaseOrders/new HTTP/1.1
Host: example.com
Content-Length: ....
<PurchaseOrder>
...
</PurchaseOrder>
the server would respond with
HTTP/1.1 303 See Other
Location: http://example.com/purchaseOrders/any_random_sort_of_id_the_server_wants
...
and the client would re-issue the PUT to the URI specified in the redirect. This pushes URI generation back to the server, where it belongs, while still retaining the general goodness of having all interactions between the client and server be idempotent.
It’s just too bad this approach does not work in a browser2.
I rephrase it here to make sure I really understand it and because I found the examples in the original a little hard to follow.
↩</li>
According to RFC 2616 (section 10.3) a redirection
MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD.
So while technically you could do this in a browser the user experience would suck.
↩</li> </ol> </div>