Services Questions

I recently had a colleague ask me several questions about service oriented architectures and breaking monoliths apart. This is an area in which i have a good deal of experience so i decided to publish my answers here.

What is a “service”?

A “service” is a discrete bit of functionality exposed via well defined interface (usually a standardized format over a standardized network protocol) that can be utilized by clients that are unknown and/or unanticipated at the time of the service’s implementation. Due to the well defined interface, clients of a service do not need to understand how the service is implemented. This style of software architecture exists to overcome the diseconomies of scale suffered by software software

How has the services landscape changed in the last 5-10 years?

In the mid-2000s it became clear that WS-*, the dominate service technology at the time, was dramatically over complicated and led to expensive to maintain systems. WS-*’s protocol independence combined with the RPC style adopted by most practitioners meant that clients generally ended up being very tightly coupled with a particular service implementation.

As WS-*’s deficiencies became clear, REST style architectures gained popularity. They reduce complexity and coupling by utilizing an application protocol such as HTTP, and often using simpler message formats such as JSON. The application protocol provides a uniform interface for all services which reduces the coupling between client and service.

Microservices are a relatively recent variant of service oriented architectures. As the name suggest the main thrust of microservices is the size of the components that implement them. The services themselves could be message queue bases or REST style APIs. The rise of devops, automation around deploy and operations, raises the practicality of deploy a large number of very small components.

Message queue based architectures are experiencing a bit of a resurgence in recent years. Similar architectures where popular in the early 2000’s but where largely abandoned in favor of WS-*. Queue based architectures often provide throughput and latency advantages over REST style architecture at the expense of visibility and testability.

What do modern production services architectures look like?

It depends on the application’s goals. Systems that need high throughput, low latency and extreme scalability tend to be message queue based event driven architectures. Systems that are looking for ease of integration with external clients (such as those developed by third parties) tend to be resource oriented REST APIs with fixed URL structures with limited runtime discoverability. Systems that are seeking long term maintainability and evolvability tend to be hypermedia oriented REST APIs. Each style has certain strengths and weaknesses. It is important to pick the right one for the application.

How granular is a service?

I would distinguish a service from a component. Services should be small, encapsulating a single, discrete bit of functionality. If a client wants to perform an action, that action is an excellent candidate for being a service. A component, on the other hand, is a pile of code that implements one or more services. A component should be small enough that you could imagine re-writing it from scratch in a different technology. However, there is a certain fixed overhead for each component. Finding the balance between component size and the number of components is an important part of designing service architectures.

What is the process to start breaking down a large existing application?

Generally organizations start by extracting authentication and authorization. It is an area that is fairly well standardize (oauth, SAML, etc) and also necessary to support the development of other services. Once authentication and authorization are extracted from the monolith, another bit of functionality is chosen for implementation outside of the monolith. This process is repeated until the monolith is modularized enough to meet the organizational goals. The features implemented outside of the monolith are often new functionality but to really break apart a monolith you eventually have to target some of the existing features.

Starting small and preceding incrementally are the keys to success. Small successes make it easier to build consensus around the approach. Once consensus is reached existing, customer facing features can be extracted more readily.

What are the organization / team structure impacts?

It is generally superior to construct a (rough, high level) design of the services needed and to form vertically integrated teams to implement business features across the set of components. Empowering all teams to create new components and, where appropriate, new services, increases the likelihood of success and shortens the time to implement a service oriented architecture.

Alternatively, component structures can follow the organizational structure (Conway’s law), resulting in one component per group in the organization. This approach can be practical in organizations where vertically integrated teams are politically unacceptable.

What are needed / helpful tools? To run services? To manage services?

  • Service discovery. Without it an inordinate amount of configuration is required for each component to interact with the ecosystem of services.
  • Instrumentation and monitoring. Without this it is impossible to detect and remediate issues that arise from the integration of an ecosystem of services.

How do companies document their services, interfaces and versions?

There are not any popular tools for creating documentation for hypermedia and message queue based APIs. However, general guidance can be found for creating documentation, but in general you are own your own.

For more resource oriented APIs tools like swagger and api blueprint can be helpful.

How can we speed developer ramp up in a service architecture?

To speed developer ramp up it is important to have well maintained scripts to build a sandbox environment including all services. Preferably a single script that can deploy components to virtual machines or docker containers on the developers machine. Additionally it is important to maintain searchable documentation about the API so that developers can find information about existing services.

How to deploy new versions of the ecosystem and its components?

Envisioning an ecosystem of services as a single deployable unit is contrary to the goals of service oriented architectures. Rather each component should be versioned and deployed independently. This allows the components to change, organically, as needed. This approach increases costs on the marketing, product management and configuration management fronts. The benefits gained on the development side by avoiding the diseconomies of scale are worth it.

How to manage API compatibility changes?

A service oriented architecture makes breaking API changes more damaging. It is often difficult to know all the clients using a particular API. The more successful an API the harder, and more time consuming, it is to find and fix all clients. This difficultly leads to the conclusion that all changes should be made in a backwards compatible way. When breaking changes are unavoidable (which is rare) server-driven content negotiation can enable components to fulfill requests for both the old API and the new one. Once all clients are transitioned to the new API the old API can be removed. Analytics in the old API can help identify clients in need of transitions and to determine when it is no longer needed.

How to version the ecosystem and its components?

This is a marketing and project management problem more than anything. The ecosystem will not have a single version number. However, when a certain set of, business meaningful, features have been completed it is convenient for marketing to declare that a new “version”. Such declarations are of little consequence on the development side so they should be done whenever desirable.

RESTful Service Discovery and Description

There has been a great deal of discussion regarding RESTful web service description languages this week. The debate is great for the community but I think Steve Vinoski has it basically right

never once — not even once — have I seen anyone develop a consuming application without relying on some form of human-oriented documentation for the service being consumed

When you start writing an application that makes use of some services you are not writing some sort of generic web services consumer. You are writing a consumer of one very specific web service and the semantics of a service, as with everything else, turn out to be a lot more complicated, subtle and interesting than the syntax.

Human-oriented documentation necessary because only human can understand the really interesting parts of a service description. Based on my experience, it also seems to be sufficient. Sure we could all jump on the full fledged service description language band wagon but I don’t think that service consumers would get much, if any, value out of it.1

Discoverability

Discoverability is the most important capability that interface definition languages bring to the table. However, most service description languages provide discoverability almost as a side effect, rather than it being their primary purpose.

I think it would be better to promote discoverability by working on a more focused capabilities publishing mechanism. To that end, I want to describe what my team has done on this front. It is not entirely suitable general use, but useful standards often emerge from extracting the common parts of many bespoke solutions.

First I want to be clear about the terminology I am using just to make sure we all understand one another

service
A cohesive set of resources that are exposed via HTTP.
resource
An single entity which exposes a RESTful interface via HTTP.
service provider
A process or set of processes that implement one or more services.
container
Another name for a service provider.

Background

We needed the ability to discover the actual URIs of resources at runtime from very early in our project because of our basic architecture. Our system is composed of at least four services2. The containers that provide these services may be deploy in various (and arbitrary) ways. Maintaining the list of top level resources of other services in configuration files became unmanageable long before we every actually deployed the system in production.

We need a way that any component in the system could discover the URIs of resources exposed by other components in the system. We handled this by providing a system wide registry of all the services that are available and a description resource for each service that provides link to the resources contained with that service.

Service Description

Containers that provide a service are responsible for exposing a “service description” for that service.

A service description is a resource that provides links to all the top level resources in service. Currently we support just one type of representation (format) for service description, a JSON format that looks like this

{
  "_type":        "ServiceDescriptor",
  "service_type": "http://mydomain.example/services/something-interesting",
  "resources": [
    {
      "_type": "ResourceDescriptor",
      "name":  "OpenIdProvider",
      "href":  "http://core.ssbe.example/openid"
    },
    {
      "_type": "ResourceDescriptor",
      "name":  "AllAccounts",
      "href":  "http://core.ssbe.example/accounts"
    }
  ]
}

service_type is the globally unique name for the type service that is being described. It should be a URI that is owned by the creator of the service. Each top level resource that is exposed as part of this service has a resource descriptor in the resources set.

If you wanted know about all the accounts of the system you would

1. GET the service descriptor resource 2. iterate over the resources collection until you found the AllAccounts resource descriptor 3. GET the URI found in the href pair of the resource descriptor (http://core.ssbe.example/accounts in this example)

One important thing to note is that each resource is really exactly one resource, and not a type of resource. If you are looking for a particular account you have to get the AllAccounts collection and find the account you are looking for in that set.

Capabilities

The Capabilities resource is the only well known entry point for our system. If a program wants to interact with our system it always starts with the capabilities service and the works it’s way down, using the links in documents, to the resource it actually cares about.

The JSON representation we support looks like

{  
  "_type": "SystemCapabilities",
  "services": [
    {
      "_type":        "ServiceDescriptor",
      "href":         "http://alarm.ssbe.example/service_descriptors/escalations",
      "service_type": "http://mydomain.example/services/something-interesting"
    }
  ]
}

To discover the URI of a particular top level resource a consumer must

1. GET the capabilities document 2. iterate though the objects in services until if it finds the one of the correct service_type 3. GET the full service descriptor using the URI in the href pair 4. iterate of the resources until it finds the one with the correct name 5. extract the URI from it’s href pair

Services are registered with the capabilities resource, by POSTing a service description to it, when the containers that provide those services are started.

Issues

No supported methods or format information

This approach only provides a way to discover the URIs of top level resources. It makes no attempt to describe the representations (formats) or methods those resources support. That sort of thing would not be hard add but so far I have had absolutely not need for it. That information is provided by the human-oriented documentation and since it does not change in each deployment there is no need for it included in the dynamic resource discovery mechanism.

Non-top level resources are not represented

Resources that are not top level – by which I mean resources that not listed in a service description document – are not represented at all. This is a feature, really, but it makes extending this format to include method and data format information less compelling becayse only a relatively minor subset of the resources in the system are surfaced in the service descriptions.

Encourages large representations

The fact that only singleton resources are supported can lead to top level documents that are excessively large. In fact, we have already had to deal with this issue. We have basically punted on the issue but I think the correct approach would be to introduce a ResourceTypeDescriptor that would operate much like a ResourceDescriptor except that the link would be a URI template rather than a concrete URI.


  1. On the other hand service providers might get some value. Something like WADL does give you a way to declaratively define a suite of regression tests. On the other hand, you might be better off using a tool specifically built for that purpose.

  2. That is the base number of services. Additionally functionality is added to the system in the form of additional services so the actually number of services varies based on what you need the system to do.