HTTP Authentication with shared identities

Authentication has been the bane of my existence lately. By which I mean, it is complicated and interesting and I am loving every minute of it (but, as you can see, I am not going to let that stop me from complaining about it). However, tonight I have run into an authentication problem that I am not sure how to solve. I am hoping someone out there can point me toward a solution.

So, Dear Lazyweb, here is my question: is there a mechanism available that allows an HTTP client to have a single identity for several applications and to be able to authenticate itself to each of those applications in such a way that even a malicious application would be unable to impersonate the actor to the other applications in the system?

Oh yeah, and it would be really nice if this were already implemented for libcurl and in Ruby.

Some background

I have a system composed of a set of applications which communicate with one another using RESTful web services. This system supports the addition of arbitrary new applications to the system. However, some of these applications may be written by (relatively) untrusted parties.

All actors, both end users and components of the system, have a single system wide identity. This identity is managed by the one trusted component in the system. This component is responsible for, amongst other things, authentication of actors.

We settled on OpenID as the mechanism for end user authentication. Other than having one of the worst specs I have ever read, OpenID is really nice. OpenID solves this problem by forwarding the user’s browser to the identity provider (our trusted component), which verifies the user’s identity claim. The application that requested the authentication is then notified of the success or failure of the authentication process. This approach has the advantage that the user’s password, even in an encrypted form, never passes through the untrusted components of the system.

Unfortunately, end user authentication is only a subset of the authentication required for this system. There are many automated actors that also make use of the resources exposed by components in the system. These actors need to be authenticated too, but OpenID is rather unsatisfactory for this purpose. So another solution to delegated authentication is required.

My initial thought was to use MD5-sess based HTTP Digest auth. The spec explicitly mentions that it could be used to implement authentication using a third party identity provider. Upon further study, however, it only works if the application doing the authentication is trusted. This is because, to verify the requester’s identity, the application must have the hash of the user’s account, realm and password. With that bit of information it would be quite easy for the application to impersonate the original requester. In my environment of limited trust that is unacceptable.
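To make the problem concrete, here is roughly the MD5-sess arithmetic (in Ruby, with made-up values). The identity provider can hand the application the H(A1) hash instead of the password, but that hash alone is enough to compute valid responses, and therefore to impersonate the user:

    require 'digest/md5'

    # what the identity provider would share with the application
    ha1 = Digest::MD5.hexdigest('alice:example-realm:secret')

    nonce, cnonce, nc = 'server-nonce', 'client-nonce', '00000001'

    # RFC 2617 MD5-sess: a session key derived from H(A1) and the nonces
    session_key = Digest::MD5.hexdigest("#{ha1}:#{nonce}:#{cnonce}")
    ha2 = Digest::MD5.hexdigest('GET:/some/protected/resource')

    # a valid Digest response, computed without ever knowing the password
    response = Digest::MD5.hexdigest(
      "#{session_key}:#{nonce}:#{nc}:#{cnonce}:auth:#{ha2}")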

Another potential, if naive, option is to use HTTP Digest auth but to pass the authentication credentials through to the identity provider. The identity provider could then respond with an indication of whether the requester proved that they knew the password. Unfortunately, the additional load placed on the identity provider by having to verify the requester’s identity for every single request handled by any part of the system is just too great. Not to mention the additional lag this would impose on response times.

Now, the astute reader will by now be fairly yelling something about how this problem was solved by Kerberos years ago. Not only is this true but, theoretically, the Negotiate HTTP auth scheme supports Kerberos based authentication. However, I have yet to find any Ruby libraries that support that scheme. Tomorrow, I will probably dive into the RFC to determine if I can implement support myself. If you know of a library that implements this scheme please let me know.

I have also looked at OpenID HTTP authentication. It looks a bit simpler than the Kerberos based Negotiate auth scheme, but it seems a bit undercooked for a production system. On the other hand, it does have potential. If there are no other options it might be workable. It would be pretty easy to implement on the Ruby side of the house, particularly since I have spent the last couple of days coming to terms with OpenID, but on the C++ side it might be a bit more of a problem.

Anyway, it is late now so I am going to go to sleep and await your answers.

The power of hypermedia remains non-obvious

Patrick Mueller contemplates whether or not we really need URIs in our documents1. This is a pretty common question in my experience, and it comes up because it is not always immediately obvious just how powerful embedding links in documents is.

What Mr. Mueller suggests is that if you have a client that needs account information for a particular person, it could simply take the account numbers found in the person representations and, based on some out of band information, construct the account URIs. For example, say you got a person representation that looked something like
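
    <person>
      <account number="3242"/>
      <account number="5523"/>
    </person>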


The client could then make GET requests to http://bank.example/accounts/3242 and http://bank.example/accounts/5523 to get the person’s account information. The client would have constructed those URIs based on some configuration or compile time information about the structure of account URIs. This is a very common approach. Hell, it is even the one used by the ActiveResource library in Rails. But common does not make it good.

Magically creating URIs out of the ether would work at first, but say this bank we work for buys another bank. There are some people that have accounts at both banks. Now, if a person’s accounts were referenced by URI, rather than just by number, you could just add them to the list like this:

    <account href="http://bank.example/accounts/3242"/>
    <account href="http://bank.example/accounts/5523"/>
    <account href="http://other-bank.example/accounts/9823"/>

The fact that some accounts are served by the original system and some are served by the other bank’s system is unimportant. However, if the client is constructing URIs based on out of band information this approach fails completely. This is just one example of the sort of problems that disappear when you reference resources by URI, rather than by some disembodied id.

One of the potential advantages of using bare ids, rather than URIs, is that it requires less work on the server to generate the document. I suppose ids are less costly, in a strict sense, if the server generating the document also serves the account resources. But how much cheaper? Building a URI like the ones above could be as cheap as a single string concatenation. As far as I am concerned, that is not enough work to spend any time worrying about. On the other hand, if the server generating the document does not also serve the account resources, then the accounts should be referenced by URI internally anyway, so using the URI should be cheaper (not to mention safer).

Mr Mueller suggests, as a proof point, that Google Maps must work by URI construction based on a priori knowledge of the shape of tile URIs. It may well, for all I know, but it certainly would not have to. For example, the server could pass the client a tile URI template and the client could then calculate the x and y offsets of the required tiles based on the x and y values of the tiles it already has. Or each tile could include links to the tiles that touch it (which would allow arbitrary partitioning of the tiles, which would be nice). No doubt there are other reasonable RESTful choices too.
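A URI template version might look something like this (the template and coordinate scheme are invented for illustration):

    # hypothetical tile URI template handed to the client by the server
    template = 'http://maps.example/tiles/{zoom}/{x}/{y}.png'

    # the client fills in coordinates it computed from the tiles it
    # already has; it never needs to know the URI shape ahead of time
    def fill_template(template, values)
      template.gsub(/\{(\w+)\}/) { values[$1.to_sym].to_s }
    end

    fill_template(template, :zoom => 12, :x => 654, :y => 1583)
    # => "http://maps.example/tiles/12/654/1583.png"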

The more I work with REST based architectures, the more enamored of hypermedia I become. Links make your representations brightly lit, well connected spaces, and that will benefit your application in ways you probably have not even imagined yet.

  1. BTW, Mr Mueller, I was unable to post a comment on your blog. After pressing Post I was brought back to the comments page with the message “Comment authentication failed!” below the Comments text area.

Hierarchical Resources in Rails

Consider a situation where you have a type of resource which always belongs to a resource of another type. How do you model the URI space using Rails? For example, say you have an address resource type. An address is always associated with exactly one user, but a user may have several addresses (work, home, etc).

The simple approach

The simplest approach from a Rails implementation perspective is to just have a flat URI space. In this scenario the URIs for the collection of addresses associated with a user and for a particular address would be, respectively:

    http://example.com/addresses?user_id={user_id}
    http://example.com/addresses/{address_id}

From a REST/Web arch standpoint there is absolutely no problem with these URIs. They are a bit ugly for the humans around, though. Worse yet, one might reasonably infer that http://example.com/addresses references the collection of all the addresses known to the system. While that might be nice from an information modeling point of view, in reality that collection is probably going to be too large to return as a single document. To be fair, it would be perfectly legal to respond to /addresses with a 404 or 403, but it would be a bit surprising to get that result if you were exploring the system.

The fully hierarchical approach

Edge Rails contains some improvements to the resource oriented route generators. One of the changes adds support for sub-resources. Sub-resources are supported via the :has_many and :has_one options to ActionController::Routing::Map#resources. These options produce fully hierarchical URIs for the resources. For example

    http://example.com/users/{user_id}/addresses
    http://example.com/users/{user_id}/addresses/{address_id}

The first URI references the collection of all the addresses for the specified user. The second URI references a particular address that belongs to the specified user. These URIs are very pretty, but they add some complexity to the controllers that fulfill them.
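The route definitions for this look something like the following sketch (from memory, so treat the details as approximate):

    # config/routes.rb
    ActionController::Routing::Routes.draw do |map|
      # produces /users/{user_id}/addresses and
      # /users/{user_id}/addresses/{address_id}
      map.resources :users, :has_many => :addresses
    end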

The additional complexity stems from the fact that address_id is unique among all addresses (in most cases it would be an automatically generated surrogate key). This leads to the potential for the address_id to be valid while the address it identifies does not belong to the user identified by user_id. In such cases the most responsible thing to do is to return a 404, but doing so takes a couple of extra lines in each of the actions that deal with individual addresses.
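Those extra lines look something like this in each member action (a sketch assuming the usual ActiveRecord associations):

    class AddressesController < ApplicationController
      def show
        user = User.find(params[:user_id])
        # scoping the find through the association means an address id
        # that belongs to some other user raises RecordNotFound...
        @address = user.addresses.find(params[:id])
      rescue ActiveRecord::RecordNotFound
        # ...which we translate into the 404 the client deserves
        head :not_found
      end
    end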

The semi-hierarchical approach

After trying both of the previous approaches and finding them not entirely satisfactory, I have started using a hybrid approach. The collection resources are defined below the resources to which they belong, but the collection member resources are referenced without an intermediate. For example

    http://example.com/users/{user_id}/addresses
    http://example.com/addresses/{address_id}

This has the advantage of producing fairly attractive URIs across the board. It also provides an obvious location to add a collection resource containing all the child resources, if you ever need to look at all of them without the parent resource being involved. And it does not require any extraneous code in the controllers to deal with the possibility of the specified parent and child resources being unrelated.

On the downside, it does require some changes to the routing system to make defining such routes simple and maintainable (a hand-rolled version is sketched below). Also, it might be a bit surprising if you are exploring the system. For example, if you request http://example.com/users/{user_id}/addresses/{address_id} you will get a 404, which is probably not what you would expect.
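In the meantime, the routes can be declared by hand, along these lines (again a sketch; the route names are mine):

    # the nested collection resource
    map.user_addresses 'users/:user_id/addresses',
      :controller => 'addresses', :action => 'index',
      :conditions => { :method => :get }

    # the flat member resource
    map.address 'addresses/:id',
      :controller => 'addresses', :action => 'show',
      :conditions => { :method => :get }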

Even with the disadvantages mentioned above, I am quite pleased with how the URIs and controllers turn out using this technique. If you are looking for a way to deal with hierarchical resources you should give it a try.

JSON Schema Definition Languages

We recently settled on using JSON as the preferred format for the REST-based distributed application on which I am working. We don’t need the expressiveness of XML, and JSON is a lot cheaper to generate and parse, particularly in Ruby. Now we are busy defining dialects to encode the data we have, which is happy work. The only problem is that there is no widely accepted schema language for describing JSON documents.
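Just how cheap is easy to see in Ruby, with the json gem:

    require 'rubygems'
    require 'json'

    # generating a document is one method call on a plain data structure
    doc = { 'name' => 'John Doe', 'accounts' => [3242, 5523] }.to_json
    # => '{"name":"John Doe","accounts":[3242,5523]}'

    # and parsing it back is just as easy
    JSON.parse(doc)['accounts']   # => [3242, 5523]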

I am not entirely sure a schema language for JSON is necessary in any strict sense. I think that validating documents against a schema is overrated, and people do seem to be getting along just fine with examples and prose descriptions. Still, the formalist in me is screaming out for a concise way to describe the JSON documents we accept and emit.

I have a weakness for formal grammars. I often write ABNF grammars describing inputs and outputs of my programs, even in situations where most people would just use a couple of examples. I learned XML Schema very early in its life and I have had a love/hate relationship with it for years. Even though it is ugly and complicated, I continued to use it because it let me write formal descriptions of the XML variants I created.1

There are a couple of relatively obscure schema languages for JSON. Unfortunately, I don’t find either of them satisfactory.


The Cerny schema validator seems quite functional and, while it is not intended as a JSON document schema language, it could be used as one.2 Unfortunately, CERNY.schema requires a complete Javascript interpreter and run-time to perform validation. This requirement stems from the fact that CERNY.schema allows parts of the schema to be defined as procedural Javascript code. That approach is completely unacceptable for a language independent data transfer format like JSON.

Such procedural checking is misguided even beyond the practical problem of requiring a Javascript run-time. Procedural code is a powerful validation technique, but it greatly reduces the usefulness of schema documents as an informational tool. Schema languages should be designed to communicate the structure of documents to humans first, and only incidentally to validator programs.


Another JSON schema language I have tried is Kwalify. Kwalify seems reasonably capable, but it has some warts that really bother me. My main issue is that it is extremely verbose, primarily because Kwalify schema documents are written in YAML (or JSON). Schema definitions can be encoded in a generic hierarchical data language, but it is not a very good idea. I find both XSD and the XML variant of RelaxNG excessively noisy, and Kwalify shows a similarly poor signal-to-noise ratio. Schema language designers should look to RelaxNG’s compact syntax for inspiration and forget trying to encode the schema in the language being described.


I think JSON could benefit from a schema language that is completely declarative and has a compact and readable syntax. If anyone is working on such a thing I would love to know about it. I could roll my own but I would really rather not unless it is absolutely necessary.

  1. Later I learned of RelaxNG, and now I am able to have my cake and eat it too. RelaxNG is a much simpler and more elegant way to describe XML documents. And if you really need XML Schema for some part of your tool chain, you can mechanically convert the RelaxNG into XML Schema.

  2. Updated to clarify that CERNY.schema is not intended as a JSON validator. I originally made the incorrect logical leap that it was. It is an easy step from “validator for JavaScript objects” to JSON validator, because JSON documents are just serialized JavaScript objects. However, the author of CERNY.schema informed me that JSON validation is not the intended use of CERNY.schema.

REST vs WS-* War

David Chappell declares the REST vs WS-* war over

To anybody who’s paying attention and who’s not a hopeless partisan, the war between REST and WS-* is over. The war ended in a truce rather than crushing victory for one side–it’s Korea, not World War II. The now-obvious truth is that both technologies have value, and both will be used going forward.

In this conflict I am, undeniably, a REST partisan. I know this colors my perceptions, but it is not obvious to me that the war is over. It has become obvious that WS-* will not prevail, but that does not mean it is over. Those who have invested a great deal of time and thought into WS-* may hope that it remains a viable technology, at least for certain problems, but that does not mean it will. I think the situations for which WS-* is the best available technology are vanishingly rare, perhaps even nonexistent. As Mark Baker puts it in his response to Mr Chappell

Perhaps David – or anybody else – could point me towards a data oriented application which can’t fit (well) into such a model (not REST, just the uniform interface part).

I expect that when all is said and done WS-* will still be around. But rather than as a vibrant technology platform, the way Mr Chappell seems to anticipate, I think it will survive in a way far more like the way Cobol is still around today: as a zombie, unkillable and ready to eat the brain of anyone who wanders too close to the legacy systems.

Hypermedia as the Engine of Application State

One of the least well understood core tenets of the REST architectural style is that “hypermedia is the engine of application state”. This basically means that responses from the server are documents that include URIs to everything you can do next. For example, if you GET a blog post, the response document will have URIs embedded in it that allow you to create a comment, edit the post, and take any other action you might want. This allows you to think of your application as a state machine, with every page representing a state and links representing every possible transition from the current state.

This approach means that to access your application correctly the only things a client needs to know are a) a well known starting point URI and b) how to parse one of the document formats (representations) your application supports. For human facing web applications this approach is the one that is always used. Browsers understand how to parse HTML and extract the links to the next action, and the user provides the starting point URIs. This has become so ingrained in the industry’s culture that most people never really think about it in these explicit terms.

However, I am now working with a system in which there are many independent automated processes which interact with each other. It was not immediately obvious to me, or my colleagues, that we should be following the same pattern even when there are no humans involved. After all, a program can easily remember how to construct a URI from a couple of pieces of information that it knows. In fact, URI construction is almost always easier to implement than the REST approach of requesting a well known resource, parsing the returned representation and extracting the URI you want.1
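Even so, the hypermedia version is hardly rocket science. A client can do something like this (the document shape and URIs are invented for illustration):

    require 'net/http'
    require 'uri'
    require 'rexml/document'

    # start from the one well known URI...
    person_uri = URI.parse('http://bank.example/people/john-doe')
    person_doc = REXML::Document.new(Net::HTTP.get(person_uri))

    # ...and follow the links embedded in the representation, rather
    # than constructing account URIs from disembodied ids
    person_doc.elements.each('person/account') do |account|
      account_uri = URI.parse(account.attributes['href'])
      account_doc = Net::HTTP.get(account_uri)
      # do something useful with account_doc here
    end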

After pondering this for a while I did reach the, rather unsurprising, conclusion that Roy Fielding is correct and that hypermedia should be the core of our distributed application. My epiphany came when I decided that I really wanted to change the shape of a URI. I realized that if the system were only slightly bigger (and it will get there soon) there would be a strong probability that I would not know all the places that accessed the resources whose URIs I wanted to change. Therefore, I would not be able to change the shape of those URIs.

A URI that is constructed by a client constitutes a permanent, potentially huge, commitment by the server. Any resource that may be addressed by the constructed URIs must forever live on that particular server (or set of servers) and the URI patterns must be supported forever.2 Effectively, you are trading a small one time development cost on the client side for an ongoing, and ever increasing, maintenance cost on the server side. When it is stated like that it becomes obvious that URI construction introduces an almost absurd level of coupling between the client and server.

With a truly REST based architecture you are free to change just about anything about the disposition and naming of resources in the system, except for a few well known start point URIs. You can change the URI shapes (for example, if you decide that you really hate the current scheme). You can relocate resources to different servers (for example, if you need to partition your data). Best of all, you can do those things without asking anyone’s permission, because clients will not even notice the difference.

Using hypermedia as the engine of application state has the effect of preserving reversibility for a huge number of decisions that are irreversible in most other architectural styles. As section 9 of “The Pragmatic Programmer” points out, reversibility is one of the goals any design should strive for. No one knows for sure what will be needed in the future, so having the ability to easily change your mind is invaluable. Particularly when it can be had so cheaply.

  1. It can get even more off-putting. Just imagine a situation where you would need to request several intermediate resources before you get to the URI of the resource for which you are looking.

  2. You could set up redirects/rewrites or proxy the requests if you really needed to move the resources, but for high volume URIs having to redirect or proxy would probably eat a significant part of the benefits you would get from moving the resource. Worse, though, is the fact that those remedies increase the maintenance requirements significantly, because then you have to administer both the real resources and the redirection/rewriting or proxying rules.


GRDDL

I have been watching the Semantic Web efforts with guarded interest for the last few years. I really like the idea. However, I have always thought it was probably a pipe dream. The Semantic Web is a chicken and egg problem: it takes a lot of published data to attract the general developer population, but it takes the general developer population to get a lot of data published.

RDF, SPARQL and the other Semantic Web technologies are pretty uniformly wicked cool. Unfortunately, they are also rather unlike the technologies with which most developers are familiar. It has never been obvious to me how we, as an industry, could get to the Semantic Web from here. But today I became aware of GRDDL1, which is the path to the Semantic Web.

As I understand it, GRDDL amounts to this: publish your data in whatever format you like, but include a link to an XSLT transform that will convert your published format into an RDF document. So you can continue to publish your microformatted HTML documents and be part of the Semantic Web just by adding a link element.
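If I am reading the spec correctly, for an XHTML document this amounts to something like the following (the transform URI is a placeholder):

    <head profile="">
      <!-- points at an XSLT transform that turns this page into RDF -->
      <link rel="transformation" href="" />
    </head>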

My initial reaction to GRDDL is an exquisite combination of “man, there are some really smart people in the world” and “duh, why did I not see that”. That set of feelings is usually a strong indication of a good idea.

Backing up to S3

I recently set up an automated backup system for my (and my wife’s) blog.1 Based on the recommendation of Mr O’Grady (and my belief that RESTful architectures are a good way to solve most problems) I decided to use Amazon’s S3 as the off site storage. I did not take the same approach as RedMonk, however, because I wanted to play with S3 a bit more directly.

After playing with it I have to say that I am very impressed. S3’s RESTful API is powerful while being simple enough to get started with right away. The Ruby AWS::S3 library makes it even easier to get started by providing a nice, idiomatic wrapper around S3’s functionality.

My backup solution ended up being a 20 line Ruby script2 that dumps a database, compresses the dump and then pushes it to S3. Combine that with a couple of crontab entries and I was done.
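The gist of the script is something like the following sketch (not the script itself; the bucket, database and paths are placeholders):

    #!/usr/bin/env ruby
    require 'rubygems'
    require 'aws/s3'

    # dump the database and compress the dump
    dump = "/tmp/blog-#{'%Y%m%d')}.sql.gz"
    system("mysqldump blog | gzip > #{dump}") or abort('dump failed')

    # push the compressed dump to S3
    AWS::S3::Base.establish_connection!(
      :access_key_id     => ENV['AMAZON_ACCESS_KEY_ID'],
      :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY'])
    AWS::S3::S3Object.store(File.basename(dump),, 'blog-backups')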

It gets better, though. I got my first bill today:

Greetings from Amazon Web Services,

This e-mail confirms that your latest billing statement is available on the AWS web site. Your account will be charged the following:

Total: $0.02

Please see the Account Activity area of the AWS web site for detailed account information:

So there you go, a secure remote backup for only 2 cents (and a couple of hours of my time). I think these web service things may be around to stay.

  1. I cannot believe it took me so long to get around to that.

  2. That 20 lines includes nice command line argument parsing, too, thanks to Main (maybe when it grows up it will get a website of its own).

RESTful resource creation (redux)

Benjamin Carlyle has posted a followup about using PUT to create new resources in which he brings up some interesting issues.

First, it seems I misunderstood his original idea slightly. My misunderstanding does not affect how I feel about his approach much. I don’t like the idea of PUT-ting to a “factory” resource with a GUID in the query string any more than I like PUT-ting to a GUID based URI that the server might actually be able to use. In fact, I think I might like it less. On the other hand, PUT-ting to a “factory” is really the same thing I proposed in response to Mr Carlyle’s original post, I just left off the GUID bit. I find GUIDs slightly repulsive and I really don’t see any need for them in the approaches being discussed.

Response codes

Mr Carlyle also points out that the HTTP spec demands a 301 (Moved Permanently) redirect be used if the server wants a PUT applied to a different URI. Unfortunately, that does not really match the semantics of redirecting from a factory/URI generator resource. This occurred to me when I was writing my proposal for safe resource creation (which is really just Mr Carlyle’s proposal without the GUID) but I punted and did not even mention the issue. A possible solution might be to use an extension code in the 300 series to mean “URI Reserved”. That would mean a PUT request to a factory/URI generation resource would respond with

HTTP/1.1 372 URI Reserved

The semantics are nice and clean, but it has the disadvantage of being non-standard. This sort of “extension code” is explicitly supported by the HTTP spec, but it does require that clients be customized to understand it.
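With such a code, the full exchange would run something like this (URIs invented for the example):

    PUT /posts/next HTTP/1.1
    Host: example.com

    HTTP/1.1 372 URI Reserved
    Location: http://example.com/posts/42

    PUT /posts/42 HTTP/1.1
    Host: example.com

    HTTP/1.1 201 Created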

Leveraging POST

Stefan Tilkov proposes an alternate approach. His idea involves a POST request to a URI generation service, which would then return the new URI in the Location: header. Totally workable. It requires a significant, though not disastrous, level of coupling between the server and client. The approach loses some of its tidiness if the server would like to use a natural key mechanism for the URIs (and it is positively messy if the server would like to transition from a generated key to a natural key). For naturally keyed URIs to be generated, the POST request to the URI generation service would have to include the complete representation you want to store. This is fine, but in practice it makes the approach look even more like the PUT based ones.

What does PUT mean

One of the reasons Mr Tilkov does not like the redirected PUT approach to resource creation is that

it violates the original purpose of PUT, though — if I PUT to a URI, I don’t expect it to have different results each time I do so

There is certainly one way to look at the redirected PUT request that is a little out of sync with the canonical PUT semantics, but I don’t think it has anything to do with the results of the request. The semantic problem I see is that a PUT is a request to store an entity at a particular URI, and in redirected PUT based resource creation the new entity will never be stored at the initial URI to which it is PUT.

This is not as big an issue as it seems at first glance, however. If you think of the initial URI used for resource creation as pointing to the next unused slot in a collection of resources, rather than as a resource factory, the semantics line up much more cleanly. From this point of view, PUT-ting an entity to the “next empty slot” URI and being redirected to a permanent URI for that slot fits rather nicely with normal PUT semantics. The redirect is necessary because once a slot is spoken for, the “next slot” URI, by definition, points someplace else.

This way of thinking about the new resource is similar to a “latest” URI. No one would quibble about a resource with a URI like http://example.com/posts/latest. The response to a GET of that URI would change often, and those changes would result from the change in state of some other resource (namely the posts collection resource). The important thing to keep in mind is that the resource in question is just the most recent post. Similarly, a “next slot” resource always points to the next new member of a resource collection.

If you choose to use this way of thinking about resource creation, perhaps a URI whose purpose is harder to confuse would be helpful, say one that explicitly names the next empty slot, though I am not sure that is really better.