Load Testing and Virtualization Tools

I really enjoy finding and using good tools. There are a couple of tools I have been using lately that give me that warm fuzzy feeling in spades, so I thought I would share.

curl-loader

The first one is curl-loader. This is a really nice tool for load testing web applications. It is based on cURL[1], which, as we all know, is one of the best ways to test web applications. Being based on libcurl means that you can load test applications that require HTTP authentication[2], redirection following, SSL or practically any of the myriad other features supported by cURL.

The only issue I have run into is that test scripts must be explicit about the exact URIs to operate against. This is not really surprising, but if you have multiple setups, all with different host names, it can get a little tedious modifying the scripts for each environment. Fortunately, the script is a simple text format that is amenable to being generated. I knocked together a couple of Ruby scripts to generate various scripts for our application in just a couple of hours – and that included learning curl-loader's script format.

VirtualBox

The other tool I have been having some success with is VirtualBox. VirtualBox is an open source virtual machine monitor. It is available via apt-get in Ubuntu and works really well[3]. I mostly use virtualization to verify that our software really does function correctly when there are several servers in a cluster. You cannot really extract much information about performance or scalability from a completely virtualized environment (at least not when all the pieces are running on my laptop), but you can tell if it is going to work or not. VirtualBox seems quite fast and it is dead simple to set up new virtual machine instances, both of which make it a joy to work with.


  1. To be precise it uses libcurl, which is the functionality of the curl command made available as a native library.


  2. The ability to handle HTTP authentication is surprisingly uncommon in load testing frameworks.


  3. The only problem I have had so far is that I put my machine to sleep while VirtualBox had control of the keyboard, and when I resumed, the keyboard was completely non-functional.


We Must Be Doing Something Right

Yesterday, while Elliot, Audrey and I were walking through DFW[1], Elliot – he is 4 years old – asked me, “Where are we going?”

“Concourse C, but I am not sure which gate yet[2],” I said.

“Daddy, is it C dot com or C and some other letters?” Elliot asked in reply.


  1. Which, I must say, is really a pretty nice airport these days. I have not been there since late last century, when I was traveling a lot for work. At that time, it was the airport in the US which I loathed the most.


  2. A two-hour layover has its disadvantages, but when one adult is traveling with two children it is well worth it.


Concurrency and System Architecture

Mr. Dekorte's take on concurrency in shared memory systems:

If you’re looking for languages or concurrency tools that will scale to the high core count desktop machines of the near future, I wouldn’t put stock in MISD oriented solutions such as transactional memory or elaborate functional programming compiler techniques. Shared memory systems simply won’t survive the exponential rise in core counts.

He is right: what we have now is not going to scale in the long run. I am not sure we will see much change on the ground any time soon, though. People, and industries, have a strong inclination to hang onto the status quo, even when there are better alternatives available. On the other hand, I would not be surprised if the future is largely populated by virtual shared memory systems running on top of physical MIMD machines.

RESTful Service Discovery and Description

There has been a great deal of discussion regarding RESTful web service description languages this week. The debate is great for the community, but I think Steve Vinoski has it basically right:

never once — not even once — have I seen anyone develop a consuming application without relying on some form of human-oriented documentation for the service being consumed

When you start writing an application that makes use of some services you are not writing some sort of generic web services consumer. You are writing a consumer of one very specific web service and the semantics of a service, as with everything else, turn out to be a lot more complicated, subtle and interesting than the syntax.

Human-oriented documentation is necessary because only humans can understand the really interesting parts of a service description. Based on my experience, it also seems to be sufficient. Sure, we could all jump on the full-fledged service description language bandwagon, but I don’t think that service consumers would get much, if any, value out of it.[1]

Discoverability

Discoverability is the most important capability that interface definition languages bring to the table. However, most service description languages provide discoverability almost as a side effect, rather than it being their primary purpose.

I think it would be better to promote discoverability by working on a more focused capabilities publishing mechanism. To that end, I want to describe what my team has done on this front. It is not entirely suitable for general use, but useful standards often emerge from extracting the common parts of many bespoke solutions.

First, I want to be clear about the terminology I am using, just to make sure we all understand one another:

service
A cohesive set of resources that are exposed via HTTP.
resource
A single entity which exposes a RESTful interface via HTTP.
service provider
A process or set of processes that implement one or more services.
container
Another name for a service provider.

Background

We needed the ability to discover the actual URIs of resources at runtime from very early in our project because of our basic architecture. Our system is composed of at least four services[2]. The containers that provide these services may be deployed in various (and arbitrary) ways. Maintaining the list of top level resources of other services in configuration files became unmanageable long before we ever actually deployed the system in production.

We needed a way for any component in the system to discover the URIs of resources exposed by other components in the system. We handled this by providing a system wide registry of all the services that are available, and a description resource for each service that provides links to the resources contained within that service.

Service Description

Containers that provide a service are responsible for exposing a “service description” for that service.

A service description is a resource that provides links to all the top level resources in the service. Currently we support just one type of representation (format) for service descriptions, a JSON format that looks like this:

{
  "_type":        "ServiceDescriptor",
  "service_type": "http://mydomain.example/services/something-interesting",
  "resources": [
    {
      "_type": "ResourceDescriptor",
      "name":  "OpenIdProvider",
      "href":  "http://core.ssbe.example/openid"
    },
    {
      "_type": "ResourceDescriptor",
      "name":  "AllAccounts",
      "href":  "http://core.ssbe.example/accounts"
    }
  ]
}

service_type is the globally unique name for the type of service that is being described. It should be a URI that is owned by the creator of the service. Each top level resource that is exposed as part of this service has a resource descriptor in the resources set.

If you wanted to know about all the accounts in the system you would:

  1. GET the service descriptor resource

  2. iterate over the resources collection until you found the AllAccounts resource descriptor

  3. GET the URI found in the href pair of the resource descriptor (http://core.ssbe.example/accounts in this example)
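As a rough illustration of the second and third steps, here is a minimal Python sketch. It assumes the service descriptor has already been fetched and parsed; the document is the example shown earlier, and the helper name find_resource_href is made up for this post rather than taken from our code.

```python
import json

# The example service descriptor from above, as it might arrive over HTTP.
SERVICE_DESCRIPTOR = json.loads("""
{
  "_type":        "ServiceDescriptor",
  "service_type": "http://mydomain.example/services/something-interesting",
  "resources": [
    {"_type": "ResourceDescriptor", "name": "OpenIdProvider",
     "href": "http://core.ssbe.example/openid"},
    {"_type": "ResourceDescriptor", "name": "AllAccounts",
     "href": "http://core.ssbe.example/accounts"}
  ]
}
""")


def find_resource_href(service_descriptor, name):
    """Return the href of the named top level resource, or None if absent.

    Hypothetical helper: walks the "resources" collection until it finds a
    resource descriptor whose "name" matches.
    """
    for resource in service_descriptor.get("resources", []):
        if resource.get("name") == name:
            return resource.get("href")
    return None


# The URI a consumer would GET next to retrieve the accounts collection.
accounts_uri = find_resource_href(SERVICE_DESCRIPTOR, "AllAccounts")
print(accounts_uri)
```

Returning None for an unknown name is just one choice; raising an error might suit a consumer that cannot proceed without the resource.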

One important thing to note is that each resource is really exactly one resource, and not a type of resource. If you are looking for a particular account you have to get the AllAccounts collection and find the account you are looking for in that set.

Capabilities

The Capabilities resource is the only well known entry point for our system. If a program wants to interact with our system it always starts with the capabilities resource and then works its way down, using the links in documents, to the resource it actually cares about.

The JSON representation we support looks like this:

{  
  "_type": "SystemCapabilities",
  "services": [
    {
      "_type":        "ServiceDescriptor",
      "href":         "http://alarm.ssbe.example/service_descriptors/escalations",
      "service_type": "http://mydomain.example/services/something-interesting"
    }
  ]
}

To discover the URI of a particular top level resource a consumer must:

  1. GET the capabilities document

  2. iterate through the objects in services until it finds the one with the correct service_type

  3. GET the full service descriptor using the URI in the href pair

  4. iterate over the resources until it finds the one with the correct name

  5. extract the URI from its href pair
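The whole walk can be sketched in a few lines of Python. This is only a sketch under some assumptions: the capabilities URI (http://capabilities.ssbe.example/) is invented, the function names are made up, and the HTTP GET is passed in as a callable so the example can run against canned copies of the documents shown above instead of a live system.

```python
def discover_resource_uri(fetch_json, capabilities_uri, service_type, resource_name):
    """Follow capabilities -> service descriptor -> resource descriptor.

    fetch_json is any callable that GETs a URI and returns the parsed JSON
    document. Returns the href of the matching resource, or None.
    """
    capabilities = fetch_json(capabilities_uri)               # step 1
    for service in capabilities.get("services", []):          # step 2
        if service.get("service_type") != service_type:
            continue
        descriptor = fetch_json(service["href"])               # step 3
        for resource in descriptor.get("resources", []):       # step 4
            if resource.get("name") == resource_name:
                return resource.get("href")                    # step 5
    return None


# Canned documents standing in for the real system. Pairing this descriptor
# URI with these resources is an assumption made for the example.
CAPABILITIES_URI = "http://capabilities.ssbe.example/"
DESCRIPTOR_URI = "http://alarm.ssbe.example/service_descriptors/escalations"
SERVICE_TYPE = "http://mydomain.example/services/something-interesting"

DOCUMENTS = {
    CAPABILITIES_URI: {
        "_type": "SystemCapabilities",
        "services": [
            {"_type": "ServiceDescriptor",
             "href": DESCRIPTOR_URI,
             "service_type": SERVICE_TYPE},
        ],
    },
    DESCRIPTOR_URI: {
        "_type": "ServiceDescriptor",
        "service_type": SERVICE_TYPE,
        "resources": [
            {"_type": "ResourceDescriptor", "name": "AllAccounts",
             "href": "http://core.ssbe.example/accounts"},
        ],
    },
}

uri = discover_resource_uri(DOCUMENTS.__getitem__, CAPABILITIES_URI,
                            SERVICE_TYPE, "AllAccounts")
print(uri)
```

Against a real deployment, fetch_json would be replaced with something that performs an actual HTTP GET and parses the response body.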

When the containers that provide services are started, they register those services by POSTing a service description to the capabilities resource.
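For illustration, registration at container start-up might look roughly like the following Python sketch. The capabilities URI, the JSON content type header and the function name are all assumptions made for this example; the request is only built here, not sent.

```python
import json
import urllib.request


def build_registration_request(capabilities_uri, service_description):
    """Build (but do not send) the POST that registers a service.

    Hypothetical helper: the content type header is an assumption, not
    something our system is documented to require.
    """
    body = json.dumps(service_description).encode("utf-8")
    return urllib.request.Request(
        capabilities_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_registration_request(
    "http://capabilities.ssbe.example/",  # made-up capabilities URI
    {
        "_type": "ServiceDescriptor",
        "service_type": "http://mydomain.example/services/something-interesting",
        "resources": [
            {"_type": "ResourceDescriptor", "name": "AllAccounts",
             "href": "http://core.ssbe.example/accounts"},
        ],
    },
)

# A container would send this with urllib.request.urlopen(request) on start-up.
print(request.get_method(), request.full_url)
```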

Issues

No supported methods or format information

This approach only provides a way to discover the URIs of top level resources. It makes no attempt to describe the representations (formats) or methods those resources support. That sort of thing would not be hard to add, but so far I have had absolutely no need for it. That information is provided by the human-oriented documentation, and since it does not change from one deployment to the next, there is no need for it to be included in the dynamic resource discovery mechanism.

Non-top level resources are not represented

Resources that are not top level – by which I mean resources that are not listed in a service description document – are not represented at all. This is a feature, really, but it makes extending this format to include method and data format information less compelling because only a relatively minor subset of the resources in the system are surfaced in the service descriptions.

Encourages large representations

The fact that only singleton resources are supported can lead to top level documents that are excessively large. In fact, we have already had to deal with this issue. We have basically punted on the issue but I think the correct approach would be to introduce a ResourceTypeDescriptor that would operate much like a ResourceDescriptor except that the link would be a URI template rather than a concrete URI.
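To make the idea concrete, here is a small Python sketch of how a consumer might use such a descriptor. Nothing like this exists in our system yet: the descriptor contents, the account_id variable and the expand_template helper are all hypothetical, and the helper only handles the simplest {name} form of a URI template.

```python
import re
from urllib.parse import quote

# Hypothetical descriptor: like a ResourceDescriptor, but href is a template.
ACCOUNT_TYPE_DESCRIPTOR = {
    "_type": "ResourceTypeDescriptor",
    "name":  "Account",
    "href":  "http://core.ssbe.example/accounts/{account_id}",
}


def expand_template(template, **values):
    """Replace each {name} in the template with its percent-encoded value.

    Only simple {name} expressions are supported; a missing value raises
    KeyError rather than being silently dropped.
    """
    def substitute(match):
        return quote(str(values[match.group(1)]), safe="")

    return re.sub(r"\{(\w+)\}", substitute, template)


# Go straight to one account instead of fetching the whole AllAccounts set.
account_uri = expand_template(ACCOUNT_TYPE_DESCRIPTOR["href"], account_id="42")
print(account_uri)
```

The consumer would still need the human-oriented documentation to know what account_id means, which fits the rest of this approach.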


  1. On the other hand, service providers might get some value. Something like WADL does give you a way to declaratively define a suite of regression tests. Then again, you might be better off using a tool specifically built for that purpose.


  2. That is the base number of services. Additional functionality is added to the system in the form of additional services, so the actual number of services varies based on what you need the system to do.
