Why i don’t love schemas

Once, long ago, i thought that schemas were great and that they could provide much value. Schemas hold the promise of declaratively and unambiguously defining message formats. Many improvements would be “easy” once such a definition was available: automated low-code validations, automagically created high quality, verifiably correct documentation, clearer understanding of inputs, etc. I tried fiercely to create good schemas, using ever better schema languages (RelaxNG compact syntax, anyone?) and ever more “sophisticated” tools to capture the potential benefits of schemas.

At every turn i found the returns on investment disappointing. Schema languages are hard for humans to write and even harder to interpret. They are usually unable to express many real world constraints (you can only have that radio with this trim package, you can only have that engine if you are not in California, etc). High quality documentation is super hard even with a schema. Generation from a schema solves the easy part of documentation; explaining why the reader should care is the hard part. Optimizing a non-dominant factor in a process doesn’t actually move the needle all that much. Automated low-code validations often turned out to hamper evolution far more than they caught real errors.

I wasn’t alone in my disappointment. The whole industry noticed that XML, with its schemas and general complexity, was holding us back. Schemas ossified API clients and servers to the point that evolving systems was challenging, bordering on impossible. Understanding the implications of all the code generated from schemas was unrealistic for mere mortals. Schemas became so complex that it was impossible to generate human interpretable documentation from them. Instead people just passed around megs of XSD files and called it “documentation”.

JSON emerged primarily as a reaction to the complexity of XML. However, XML didn’t start out complex; it accreted the complexity over time. Unsurprisingly, the cycle is repeating. JSON schema is a thing and seems to be gaining popularity. It is probably a fool’s errand to try stemming the tide but i’m going to try anyway.

doomed to watch everyone else repeat history

What would it take for schemas to be net positive? Only a couple of really hard things.

Humans first

Schema languages should be human readable first, and computer readable second. Modern parser generators make purpose built languages easy to implement. A human centered language turns schemas from something that is only understandable by an anointed few into a powerful, empowering tool for the entire industry. RelaxNG compact syntax (mentioned above) is a good place to look for inspiration. The XML community never adopted a human readable syntax and it contributed to the ultimate failure of XML as a technology. Hopefully we in the JSON community can do better.

Avoid the One True Schema cul de sac

This one is more of a cultural and educational issue than a technical one. This principle hinges on two realizations:

  1. A message is only valuable if at least one consumer can achieve some goal with that message.
  2. Consumers want to maximize the number of goals achieved.

However, consumers don’t necessarily have the same goals as each other. In fact, any two consumers are likely to have highly divergent goals. Therefore, they are likely to have highly divergent needs in messages. A message with which one consumer can achieve a goal may be useless to another. Therefore, designing a schema to be used by more than one consumer is an infinite-variable optimization problem. Your task is to minimize the difference between the set of valid messages and the set of actually processable messages for every possible consumer! (See appendix A for more details.) A losing proposition if there ever was one.

To mitigate this, schema languages should provide first class support for “personalizing” existing schemas. A consumer should be able to declare a) that it only cares about some subset of the properties defined in a producer’s schema and b) that it will accept and use properties not declared in a producer’s schema. This would allow consumers to finely tune their individual schemas to their specific needs. This would improve evolvability by reducing incidental coupling, increase the clarity of documentation by hiding all irrelevant parts of the producer’s schema, and improve automated validation by ignoring irrelevant parts of messages.
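
A minimal sketch of what such personalization could look like, written in Elixir. ConsumerView, project/3 and the message properties are hypothetical names invented for this example; they are not part of any existing schema tool. The consumer names the properties it requires and the ones it will use if present, and everything else in the message is ignored rather than rejected.

```elixir
defmodule ConsumerView do
  @moduledoc """
  Hypothetical sketch of a consumer-side view over a producer's message.
  """

  # Returns {:ok, view} holding only the properties this consumer named,
  # or {:error, {:missing, keys}} when a required property is absent.
  # Properties the consumer did not name are ignored, never rejected.
  def project(message, required, optional \\ []) when is_map(message) do
    missing = Enum.reject(required, &Map.has_key?(message, &1))

    case missing do
      [] -> {:ok, Map.take(message, required ++ optional)}
      _ -> {:error, {:missing, missing}}
    end
  end
end

# A producer's message with more properties than this consumer cares about.
message = %{"id" => 42, "status" => "shipped", "warehouse" => "DEN", "x-trace" => "abc"}

# This consumer needs only id and status, and will use x-trace when present,
# whether or not the producer's schema declares it.
{:ok, view} = ConsumerView.project(message, ["id", "status"], ["x-trace"])
IO.inspect(view)
```

A second consumer with different goals would call project/3 with its own lists, so neither consumer is coupled to properties it never reads.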

We as a community should also educate people in the dangers of The One True Schema pattern.

Conclusion

Designing schemas for humans and avoiding the One True Schema are both really hard. And, unfortunately, i doubt our ability to reach consensus and execute on them. Given that, i think most message producers and basically all message consumers are better off avoiding schemas for anything other than documents.


Appendix A: Wherein i get sorta mathy about why consumers shouldn’t share schemas unless they have the same goals

I don’t know if this will help other people but it did help me clarify my thinking on this issue.

M = set of all messages

mC = {m | m ∈ M, m contains info needed by consumer C}

A particular consumer, C, needs messages to contain certain information to achieve its goal.

mV = {m | m ∈ M, m is valid against schema S}

For any particular schema there is some subset of all messages that are valid.

mC = lim mV as S->perfectSchemaFor(C)

As the schema approaches perfection for consumer C the set of valid messages approaches the set of messages actually processable by the consumer.

mC ≠ mV
mC ⊄ mV
mC ⊅ mV

In practice, however, there is always some mis-match between the set of valid messages and the set of messages actually processable by the consumer. Some technically invalid messages contain enough information for the consumer to achieve its goal. Some technically valid messages will contain insufficient information. The mis-match may be due to bugs, a lack of expressiveness in the schema language or just poor design. The job of a schema designer is to minimize the mis-match.

Now consider a second consumer of these messages.

mD = {m | m ∈ M, m contains info needed by consumer D}

A particular consumer, D, needs messages to contain certain information to achieve its goal.

mC ≠ mD (in general)

The information needed by consumer D will, in general, be different from the information needed by consumer C. Therefore, the set of messages processable by C will, in general, not equal the set of messages processable by D.

perfectSchemaFor(C) ≠ perfectSchemaFor(D)

This is the kicker. The perfect schema for consumer C is, in general, different from the perfect schema for any other consumer. Minimizing the difference between mV and mC will tend to increase the difference between mV and mD.

supervisor child startup in elixir

Fact: Supervisors fully serialize the startup of their children. Each child’s init/1 function runs to completion before the supervisor starts the next child.

For example, given modules ModuleA and ModuleB that both implement the init/1 callback, and this supervisor:

Supervisor.start_link([
  ModuleA,
  ModuleB
], strategy: :one_for_one)

The ModuleA child process will be started and its init/1 function will run to completion before the ModuleB child process starts. This is true regardless of the strategy used.
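
One way to observe this ordering is a small script like the sketch below. SlowChild and the Agent used as an event log are hypothetical names made up for the demonstration. Each child records when its init/1 starts and finishes, with a short sleep in between to make any overlap visible.

```elixir
defmodule SlowChild do
  use GenServer

  # `log` is the pid of an Agent that collects events, newest first.
  def start_link({name, log}) do
    GenServer.start_link(__MODULE__, {name, log})
  end

  @impl true
  def init({name, log}) do
    Agent.update(log, &[{:init_started, name} | &1])
    # Pretend to do slow startup work.
    Process.sleep(50)
    Agent.update(log, &[{:init_finished, name} | &1])
    {:ok, name}
  end
end

{:ok, log} = Agent.start_link(fn -> [] end)

# Two children of the same module need distinct ids.
children = [
  Supervisor.child_spec({SlowChild, {:a, log}}, id: :a),
  Supervisor.child_spec({SlowChild, {:b, log}}, id: :b)
]

# start_link/2 returns only after every child has been started.
{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

events = log |> Agent.get(& &1) |> Enum.reverse()
IO.inspect(events)
```

The recorded events show child :a finishing its init/1 before child :b begins, even though :a spends most of that time sleeping.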

so what?

This fact can often simplify application startup and error handling scenarios. For example, consider a situation where several processes need a token to interact with an API. The application must acquire this token on startup. If the token expires or the server revokes the token the application must acquire a brand new token and distribute it to all the processes.

We can implement that behavior simply and easily using the serialized startup and an appropriate restart strategy of a supervisor.

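defmodule MyApi.TokenManager do
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Client function used by the thing doers.
  def get_token do
    GenServer.call(__MODULE__, :get_token)
  end

  @impl true
  def init(_) do
    # fetch_token_from_api!/0 is not shown here. The supervisor waits
    # for this call to return before starting the next child.
    token = fetch_token_from_api!()
    {:ok, token}
  end

  @impl true
  def handle_call(:get_token, _from, token) do
    {:reply, token, token}
  end
end

defmodule MyApi.ThingDoer do
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts), do: {:ok, opts}

  @impl true
  def handle_call(:do_thing, _from, state) do
    token = MyApi.TokenManager.get_token()

    # do stuff with token; crash if it doesn't work
    {:reply, :ok, state}
  end
end

defmodule MyApi.OtherThingDoer do
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts), do: {:ok, opts}

  @impl true
  def handle_call(:do_other_thing, _from, state) do
    token = MyApi.TokenManager.get_token()

    # do stuff with token; crash if it doesn't work
    {:reply, :ok, state}
  end
end

Supervisor.start_link([
  MyApi.TokenManager,
  MyApi.ThingDoer,
  MyApi.OtherThingDoer
], strategy: :one_for_all)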

In this example MyApi.TokenManager.init/1 acquires the token before returning. That means the token is ready by the time MyApi.ThingDoer and MyApi.OtherThingDoer start. If at any point the API server revokes the token, or it expires, the next thing doer to try to use it can just crash. That crash will cause the supervisor to shut down the remaining children and restart them all, beginning with MyApi.TokenManager, which will acquire a new, working token.

With this approach MyApi.ThingDoer and MyApi.OtherThingDoer don’t need any specific error handling code around token management. The removal of situation-specific error handling logic makes them simpler and more reliable.
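
The recovery path can be exercised without a real API. The sketch below is only a simulation under stated assumptions: the Demo modules are hypothetical, the "token" is a counter kept in an Agent outside the supervision tree, and a cast stands in for the API rejecting the token. Expect an error report in the log when the worker crashes.

```elixir
defmodule Demo.TokenManager do
  use GenServer

  def start_link(counter), do: GenServer.start_link(__MODULE__, counter, name: __MODULE__)
  def get_token, do: GenServer.call(__MODULE__, :get_token)

  @impl true
  def init(counter) do
    # Stand-in for a real token request: every start yields a new token.
    n = Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end)
    {:ok, "token-#{n}"}
  end

  @impl true
  def handle_call(:get_token, _from, token), do: {:reply, token, token}
end

defmodule Demo.Worker do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(_), do: {:ok, nil}

  @impl true
  def handle_cast(:token_rejected, _state) do
    # No token-specific error handling: just crash.
    raise "token rejected"
  end
end

defmodule Demo do
  # Polls until the supervisor has registered a token manager again.
  def await_token_manager do
    case Process.whereis(Demo.TokenManager) do
      nil ->
        Process.sleep(10)
        await_token_manager()

      pid ->
        pid
    end
  end
end

{:ok, counter} = Agent.start_link(fn -> 0 end)

{:ok, _sup} =
  Supervisor.start_link([{Demo.TokenManager, counter}, Demo.Worker], strategy: :one_for_all)

first = Demo.TokenManager.get_token()
ref = Process.monitor(Process.whereis(Demo.TokenManager))

# Simulate the API rejecting the token inside the worker.
GenServer.cast(Demo.Worker, :token_rejected)

# With :one_for_all the supervisor also stops the token manager...
receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
after
  1_000 -> raise "token manager was not stopped"
end

# ...and restarts it first, which acquires a fresh token.
Demo.await_token_manager()
second = Demo.TokenManager.get_token()
IO.inspect({first, second})
```

The worker contains no code for refreshing tokens, yet the token read after the crash differs from the one read before it.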

hypermedia format manifesto

Through my work developing and consuming APIs i have come to value:

  • evolvability over message and implementation simplicity
  • self describing messages over reduced message sizes
  • standards over bespoke solutions
  • human readability over client simplicity
  • uniformity over flexibility

I value the things on the right, but i value the things on the left more.

evolvability over message and implementation simplicity

APIs and the producers and consumers of APIs must be able to evolve over time. Evolvability inherently means that messages and implementations will be more complex. Designers and implementers must have forward compatibility in mind at all times. This forward compatibility mindset produces features that add value only after months, years or even decades of life. Having those features is more complex than not, but the return on those investments is worth the cost.

self describing messages over reduced message sizes

Embedding all the information needed to interpret a message simplifies client implementation and improves evolvability. However, embedding all that information necessarily increases the size of the message. For most APIs the additional data transfer is just not important enough to give up the benefits of self-describing messages.

standards over bespoke solutions

Standards allow reuse of code and of knowledge. Standards often encode hard-won practical experience about what works and what doesn’t. However, standard solutions often don’t fit as well as purpose-designed solutions to specific problems.

human readability over client simplicity

It is important that APIs be understandable by mere mortals. An average developer should be able to easily understand and explore an API without specialized tools. Achieving human readability while also honoring the other values often means that clients must become more complicated.

uniformity over flexibility

There should be a small number of ways to express a particular message. This makes consumer and producer implementations simpler. However, this means that existing APIs will likely be non-conformant. It also means that some messages will be less intuitive and human readable.

why now

There has been a fair bit of discussion in the HTTP APIs hypermedia channel (get an invite) lately about hypermedia formats (particularly those of the JSON variety). Personally, i find all of the existing options wanting. I’m not sure the world needs yet another JSON based hypermedia format but the discussion did prompt me to try to articulate what i value in a format. The format of this post is blatantly stolen from the agile manifesto.

How i judge software engineers

My kid’s teachers routinely provide rubrics for assignments. At first blush, rubrics are tools to make grading an assessment easier. They are effective in that role. They turn out to be at least as effective at communicating expectations. Readers of a rubric can quickly and easily determine what is important.

Recently i developed a rubric to help me judge the performance of engineers (it was review season yet again). The engineers on my team have appreciated the transparency and clarity this tool provides. Hopefully, it will be helpful to others.

The rubric is divided into two sections: one about results and the other about behavior. Both of these are important. Good, pro-social behavior is just as important in engineers as strong results.

Results

Scores: 3, 2, 1, 0

Continuous improvement
  3:
    • always leaves code substantially better than they found it
    • manages scope of refactors to match time available
  2:
    • often improves code as part of normal work
  1:
    • sometimes improves code as part of normal work
    • often has to abandon refactors due to time constraints
  0:
    • rarely improves existing code
Communication
  3:
    • communicates intentions early and clearly
    • often provides material support to teammates (developers, QA, on-call person, etc)
    • regularly creates & maintains runbook (particularly when on call)
    • keeps stakeholders informed (particularly when on call)
  2:
    • communicates intentions after it is hard to change course
    • regularly supports teammates
    • occasionally maintains runbook
  1:
    • rarely communicates intentions
    • sometimes supports teammates
    • doesn’t maintain runbook
  0:
    • never communicates intentions
    • never supports teammates
    • never creates or improves runbook entries
Production support
  3:
    • prioritizes concurrent incidents correctly
    • uses critical thinking and problem-solving skills to resolve issues quickly
    • handles lower priority issues (eg, jenkins nodes down) when there are no higher priority incidents
  2:
    • prioritizes concurrent incidents correctly
    • uses critical thinking and problem-solving skills to resolve issues
    • handles lower priority issues (eg, jenkins nodes down) when there are no higher priority incidents
  1:
    • prioritizes concurrent incidents correctly
    • ignores lower priority issues (eg, jenkins nodes down) even when there are no high priority incidents (works stories while on call)
    • relies too heavily on others (rather than using critical thinking and problem-solving skills)
  0:
    • bad attitude
    • incorrectly prioritizes concurrent incidents
    • always ignores lower priority issues (eg, jenkins nodes down)
    • doesn’t communicate incident status to stakeholders
    • relies on others to resolve issues (throws it over the wall)
Code quality
  3:
    • functional
    • well factored
    • documentation on classes/modules and public methods/functions
    • PRs require trivial changes
  2:
    • functional
    • well factored
    • poorly documented
    • PRs require minor changes
  1:
    • functional
    • poorly factored
    • undocumented
    • PRs require some rework
  0:
    • buggy
    • poorly factored
    • undocumented
    • PRs require substantial rework
Productivity
  3:
    • usually delivers more stories per sprint than the average engineer
    • usually delivers more points per sprint than the average engineer
  2:
    • occasionally delivers more stories/points per sprint than the average engineer
  1:
    • usually delivers fewer stories/points per sprint than the average engineer
  0:
    • delivers substantially fewer stories/points per sprint than the average engineer
Testing
  3:
    • public contracts well tested
    • key scenarios have acceptance tests
    • tests are independent of current implementation
  2:
    • public contract partially tested
    • key scenarios have acceptance tests
    • tests are independent of current implementation
  1:
    • public contract partially tested
    • tests dependent on current implementation
    • acceptance tests check too many edge cases
  0:
    • no unit or functional tests
Feedback
  3:
    • often reviews PRs
    • feedback is substantive and useful
    • reviews show understanding of the PR’s intent and the code that interacts with it
  2:
    • regularly reviews PRs
    • advice would materially improve PRs
    • reviews show understanding of the PR’s intent
  1:
    • sometimes reviews PRs
    • reviews are superficial
    • reviews are hard to understand
  0:
    • never reviews PRs
Product & domain knowledge
  3:
    • understands the domain
    • understands most of the supported features of the product
    • understands some of the historical features of the product
  2:
    • understands the domain
    • understands many of the supported features of the product
  1:
    • some knowledge of the domains
    • limited knowledge of product features
  0:
    • no knowledge of utility and grid edge domain
    • no knowledge of product
Personal goals
  3:
    • goals are SMART
    • goals drive achievement of team and corporate goals in material ways
    • achieves goals on time
  2:
    • goals are SMART
    • goals weakly support team and corporate goals
    • achieves goals
  1:
    • goals have no relation to team and corporate goals
    • achieves goals
  0:
    • goals are vague or unattainable
    • goals work against team and corporate goals
    • doesn’t achieve goals

Behavior

Attitude
  3:
    • polite and engaging even when under stress (eg, when on call)
    • accepts setbacks and moves forward
  2:
    • normally polite but brusque when under stress
    • accepts setbacks and moves forward
  1:
    • normally polite but rude when under stress
  0:
    • rude
    • dismissive
Courage
  3:
    • courageous in all aspects of work every day
    • strives for greatness even when difficult
    • occasionally fails spectacularly
  2:
    • often courageous in most aspects of work
    • occasionally fails
  1:
    • sometimes courageous in some aspects of work
    • rarely fails
  0:
    • often timid
    • usually takes least risky (and rewarding) approach
Motivation
  3:
    • highly motivated to succeed
    • accepts challenges and new responsibilities
  2:
    • motivated to succeed
    • grudgingly accepts new challenges and responsibilities
  1:
    • resists new challenges and responsibilities
  0:
    • lacks motivation
    • rejects all new challenges and responsibilities
Strategic thinking
  3:
    • plans for 6 month – 2 year horizon
  2:
    • plans for 3 – 6 month horizon
  1:
    • plans for 1 – 3 month horizon
  0:
    • no planning
Trustworthiness
  3:
    • earns trust of others
    • reliably meets commitments
    • honest
  2:
    • occasionally fails to meet commitments
    • sometimes fails to gain the trust of others
  1:
    • fails to meet commitments
    • occasionally misconstrues the facts
  0:
    • often misconstrues the facts
    • widely distrusted
Learning
  3:
    • seeks out learning opportunities
    • applies lessons learned to enhance success
    • elicits relevant experiences from others
  2:
    • interested in learning
    • applies lessons learned to enhance success
  1:
    • learns when pushed
    • resists change
  0:
    • uninterested in learning
    • resists change
Pairing*
  3:
    • improves productivity and morale of pairs
    • pairs most of the time
  2:
    • pairs effectively
    • pairs most of the time
  1:
    • pairs ineffectively
    • pairs some of the time
  0:
    • reduces productivity and morale of pairs
    • rarely pairs

* Optional. If your team pairs as a matter of course this is very important. If your team works as individuals then ignore this row.

Skiing a beautiful spring day at Loveland ski area.

Eclipse 2017

10 minutes to totality

Sunset at 12:56

And the light is back

Sometimes Audrey is still just a kid playing at the park.

Doing our touristly duty (Great Platte River Road Archway Monument) while waiting on the eclipse. 

Marian Hill is awesome!

Grand canyon rafting — day 6

The hike out. 8.6 miles horizontally, 1 mile vertically, 108 degrees in the sun (which is where most of the trail is). We are each carrying about 25 pounds of gear and water.

View from camp in the early morning light.

Leaving the boats behind.

We’ve come a long way.

 

We made it!

Total hike time: 7 hours. Not bad.