A spec-backed protocol for inter-service communication #9

vemv · 2019-06-13T04:24:09Z

Brief

Given a microservices architecture where services communicate via e.g. Kafka, I advocate that those Kafka messages should be:

'speced' (backed by clojure.spec)
- enables an inter-service contract
- note that 'message specs' are decoupled from 'model specs'
  - 'model specs' refers to the write model / read model of each service's entities.
    - Those should be private (encapsulated)
- also enables unit-testing each service without depending on other services' implementation details or private specs
versioned
- each attribute and each message (namely, a composition of attributes) is namespaced and versioned
- attribute example: (spec/def :messaging.blue-app.v1/age integer?)
- message example: (spec/def :messaging.blue-app.v1/user (spec/keys :req [:messaging.blue-app.v1/age]))
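
As a minimal sketch of the two examples above, assuming clojure.spec.alpha is required under the `spec` alias (the sample payloads are made up for illustration):

```clojure
(ns example.brief
  (:require [clojure.spec.alpha :as spec]))

;; Attribute spec: namespaced and versioned.
(spec/def :messaging.blue-app.v1/age integer?)

;; Message spec: a composition of attributes.
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

;; Hypothetical payloads, as a consumer might receive them.
(def good-message {:messaging.blue-app.v1/age 42})
(def bad-message  {:messaging.blue-app.v1/age "42"})

(spec/valid? :messaging.blue-app.v1/user good-message) ;; => true
(spec/valid? :messaging.blue-app.v1/user bad-message)  ;; => false
```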

Rationale

Specs

The usual reasons apply: contracts reduce the chance for misunderstandings, which often translate into bugs.

This particularly applies to ever-evolving, distributed systems.

Namespaces

I advocate that both message names and those messages' attributes be heavily namespaced.

Basic namespacing (depth 1) enables using clojure.spec at all, since specs are registered under namespace-qualified keywords
Heavy namespacing (using the app name, and possibly the name of one of its modules) enables an unmistakable mapping from spec to code
- e.g. I should be able to hit a shortcut in my IDE for a given keyword and jump straight to the spec definition.
- Centralized/unmistakable ownership == fewer chances for misunderstandings

Versioning

As part of the namespacing strategy, version numbers should be part of each message/attribute namespace.

Systems evolve
- Data semantics evolve
Not reifying that evolution is a source of complexity
- Because a single thing shouldn't have N meanings depending on who watches the thing / when the thing is watched

An additional benefit is that unit-testing has guaranteed correctness:

when ProjectA consumes ProjectB's specs for data generation, it does it at a specific, immutable version
- ProjectB is free to evolve, without breaking other projects' tests
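
A rough sketch of what pinning to a version could look like in a consumer's test. The `:messaging.project-b.*` names, the v2 shape and the `adult?` function are all hypothetical. A hand-written fixture is used here rather than generated data, so that the snippet only needs clojure.spec.alpha; with test.check on the classpath, `spec/gen` could produce the values instead.

```clojure
(ns example.pinned-version
  (:require [clojure.spec.alpha :as spec]))

;; ProjectB's published specs. v1 is frozen; v2 is a later, different shape.
(spec/def :messaging.project-b.v1/age integer?)
(spec/def :messaging.project-b.v1/user
  (spec/keys :req [:messaging.project-b.v1/age]))

(spec/def :messaging.project-b.v2/age string?)
(spec/def :messaging.project-b.v2/user
  (spec/keys :req [:messaging.project-b.v2/age]))

;; ProjectA's (hypothetical) consumer, written against v1 only.
(defn adult? [user]
  {:pre [(spec/valid? :messaging.project-b.v1/user user)]}
  (>= (:messaging.project-b.v1/age user) 18))

;; Fixture conforming to the pinned v1 spec.
(def v1-fixture {:messaging.project-b.v1/age 30})

(adult? v1-fixture) ;; => true
```

The existence of v2 has no effect on this test: it keeps validating against the v1 names it was written for.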

How

Monotonically increasing, natural version numbers

i.e. v1 -> v2 -> v3 -> ...

There's no semver-like concept of breakage
- One man's non-breaking is another's breaking
- A new version is simply 'different' without judgement or guarantee whatsoever, and consumers should unit-test against it
It can be trivially enforced that all messages/attributes have a version as part of their names.
"But I'll end up having v16 which seems ridiculously high"
- There are infinite numbers, you won't run out of them
It is perfectly fine that a message at v1 contains an attribute at v2
- e.g. I can create a new type of message (hence the v1) leveraging existing attributes (that may be at v2 by now)
And it is perfectly fine that a message at v2 contains an attribute at v1
- Else you'd have to constantly bump attributes' versions, which would duplicate things tremendously.
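
One possible way to enforce the naming rule, sketched under the assumption that messaging specs live under `messaging.<app>[.<module>].v<N>` namespaces. The regex and both function names are illustrative, not an existing tool. The last two definitions show a v1 message reusing a v2 attribute.

```clojure
(ns example.version-check
  (:require [clojure.spec.alpha :as spec]))

;; Assumed convention: messaging.<app>[.<module>].v<N>, N a natural number.
(def versioned-ns-pattern #"messaging(?:\.[a-z0-9-]+)+\.v[1-9][0-9]*")

(defn versioned-name?
  "True when the keyword's namespace follows the assumed convention."
  [k]
  (boolean (and (qualified-keyword? k)
                (re-matches versioned-ns-pattern (namespace k)))))

(defn unversioned-messaging-specs
  "Registered spec names under messaging.* that lack a version segment."
  []
  (->> (keys (spec/registry))
       (filter qualified-keyword?)
       (filter #(re-find #"^messaging\." (namespace %)))
       (remove versioned-name?)))

;; A v1 message may freely reference a v2 attribute.
(spec/def :messaging.blue-app.v2/age pos-int?)
(spec/def :messaging.blue-app.v1/signup
  (spec/keys :req [:messaging.blue-app.v2/age]))
```

A check like `(empty? (unversioned-messaging-specs))` could then run as part of the test suite or CI.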

Message specs are decoupled from model specs

One might have a (spec/def :blue-app/user ...) spec serving as a 'write model', i.e. a schema you follow before persisting to a DB
Then, when creating (spec/def :messaging.blue-app.v1/user) one should not leverage the :blue-app/user spec at all
- It couples the messaging spec to arbitrary implementation details (business-specific predicates)
- Would make the messaging spec non-consumable from external projects.
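
To illustrate the separation, here is a sketch in which the model spec carries business-specific predicates while the message spec is defined from scratch. The `:blue-app/email` attribute, the age bounds and `user->message` are invented for the example.

```clojure
(ns example.decoupling
  (:require [clojure.spec.alpha :as spec]
            [clojure.string :as string]))

;; Private write model: free to use business-specific predicates.
(spec/def :blue-app/email (spec/and string? #(string/includes? % "@")))
(spec/def :blue-app/age (spec/and integer? #(<= 18 % 120)))
(spec/def :blue-app/user (spec/keys :req [:blue-app/email :blue-app/age]))

;; Public message spec: plain predicates only, no reference to :blue-app/user.
(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

(defn user->message
  "Translates a persisted user into its v1 message representation."
  [user]
  {:post [(spec/valid? :messaging.blue-app.v1/user %)]}
  {:messaging.blue-app.v1/age (:blue-app/age user)})
```

The translation function is the only place where both worlds meet, so the model can change without touching the published message spec.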

Immutable, centralized message schemas

All specs are immutable
- messages ones, attributes ones
Want to mutate a spec? inc the message schema version instead
For enforcing this:
- Messaging specs aren't inlined into their owner's project, but instead reside in a centralized repo
  - stricter code review, checklist, tooling
Maybe use https://github.com/metosin/spec-tools/blob/master/docs/02_data_specs.md for defining messaging specs
- data (unlike code) is more serializable, therefore more easily comparable (which is necessary for the immutability requirement)
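
Even without spec-tools, plain clojure.spec exposes each spec's definition as data through `spec/form`, which is enough for a rough immutability check. The sketch below is only an assumption about how such tooling might work; `snapshot`, `mutated-specs` and the idea of committing the recorded map to the centralized repo are hypothetical.

```clojure
(ns example.immutability
  (:require [clojure.spec.alpha :as spec]))

(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

(defn snapshot
  "Maps each given spec name to its form, which is plain data."
  [spec-names]
  (into {} (map (juxt identity spec/form)) spec-names))

(defn mutated-specs
  "Names whose current form differs from the previously recorded one."
  [recorded]
  (for [[spec-name old-form] recorded
        :when (not= old-form (spec/form spec-name))]
    spec-name))

;; Recorded once, e.g. committed alongside the specs (hypothetical workflow).
(def recorded
  (snapshot [:messaging.blue-app.v1/age :messaging.blue-app.v1/user]))

(mutated-specs recorded) ;; => ()
```

If someone later redefines `:messaging.blue-app.v1/age` with a different predicate, `mutated-specs` reports it, and the review can ask for a new version instead.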

Closing thoughts

Specs in production

It would be unfortunate to go to all this effort, and then disable spec checking in production.

I have created nedap/speced.def#70 accordingly.

Zero-downtime + exactly-once processing

A usual problem in event-driven architectures is:

how to deal with different versions of a message?
- Example 1: you want to replay a very old log, containing heterogeneous ('evolved') shapes of the same data
- Example 2: you have a deployed service, which is only aware of an old message schema. Now you want to deploy a new version of the app, aware of the new schema
  - For a (more-or-less) brief period of time, you have two app versions processing the new message
    - old one will fail due to outdated expectations
    - and your message will be processed twice
      - maybe the old service performed some side-effects. Buggily or not, this was undesirable.

With my recipe, this problem disappears altogether:

Services are configured (production.edn) to only process a finite, known set of message versions
- e.g. RedApp--Deployment1 is configured to observe #{:messaging.blue-app.v1/user}
  - v2 is not processed at all
- and RedApp--Deployment2 is configured to observe #{:messaging.blue-app.v2/user}
  - v1 is not processed at all
- Both deployments can coexist for an arbitrarily long time, without processing unexpected schemas, and ensuring exactly-once processing.
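
A sketch of how that configuration might drive message handling. The `{:type ... :payload ...}` envelope, the `:observed-messages` key and the `process` function are assumptions made for illustration; the v2 shape is invented too.

```clojure
(ns example.observed-versions
  (:require [clojure.spec.alpha :as spec]))

(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

(spec/def :messaging.blue-app.v2/age pos-int?)
(spec/def :messaging.blue-app.v2/user
  (spec/keys :req [:messaging.blue-app.v2/age]))

;; Hypothetical per-deployment config, as it might be read from production.edn.
(def deployment-1 {:observed-messages #{:messaging.blue-app.v1/user}})
(def deployment-2 {:observed-messages #{:messaging.blue-app.v2/user}})

(defn process
  "Handles a message only if this deployment observes its type.
  Returns :ignored, :processed or :rejected."
  [config {message-type :type payload :payload}]
  (cond
    (not (contains? (:observed-messages config) message-type)) :ignored
    (spec/valid? message-type payload)                         :processed
    :else                                                      :rejected))

(def v1-message
  {:type :messaging.blue-app.v1/user
   :payload {:messaging.blue-app.v1/age 42}})

(process deployment-1 v1-message) ;; => :processed
(process deployment-2 v1-message) ;; => :ignored
```

Each message version is picked up by exactly one of the two coexisting deployments.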

jwkoelewijn · 2019-08-06T14:48:15Z

I see the appeal of this approach. I have one minor challenge: it seems to be Clojure-oriented, in that it relies heavily on namespaced keywords and spec. Not saying that this is a problem; however, ideally we would like to have the same advantages for our Ruby consumers. Any thoughts on that?

For reference, at the moment we use https://github.com/nedap/postman/blob/master/bin/generators/ruby_generator.rb (which uses xsd2ruby under the hood) to build Plain Old Ruby Objects for the different messages we have defined using an XSD (see https://github.com/nedap/postman and https://github.com/nedap/postman/blob/master/models/Message.xsd).
This approach does not benefit from any checks and specs; still, being able to generate the classes automatically is a nice feature.

Do you see any possibilities to generate the same using the approach you described for how to best deal with our messaging formats?

I know it is a lot to ask; however, I would like to think of an approach to migrate our current Postman messages to incorporate some of the ideas you mention as well, so consider these questions more as a hammock-time-trigger :)

vemv · 2019-08-07T05:08:51Z

> Do you see any possibilities to generate the same using the approach you described for how to best deal with our messaging formats?

Yes. I'd keep using XSD in a quite similar manner, while adding:

versioning support both at model- and attribute-level.
slightly stronger types, e.g. asserting that strings are not-blank

That versioning could be achieved either using some official XSD feature, or in an ad-hoc manner by convention.

Out of this XSD, we'd emit both Ruby classes, and Clojure specs.

For Ruby classes, we'd need to tweak https://github.com/nedap/postman/blob/master/bin/generators/ruby_generator.rb so it emits e.g. Pep::Postman::UserEvent::V1, with attributes called e.g. age_v1
- i.e. both classes and attributes will become versioned
- if not done already, we could start emitting some Ruby preconditions (fail unless foo) out of the XSD types.
We can automatically emit Clojure specs out of XSD
- (not done atm?)
- Accordingly, specs cannot be arbitrarily strong
  - Which is in fact desirable: read "It couples the messaging spec to arbitrary implementation details" and "Would make the messaging spec non-consumable from external projects" in the original issue
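
As a very rough sketch of the Clojure side of that emission: the snippet below turns an attribute description into a `spec/def` form. It does not parse XSD; the `{:app :version :name :type}` map stands in for whatever a real XSD parser would yield, and the type-to-predicate table is an assumption.

```clojure
(ns example.xsd-emit
  (:require [clojure.spec.alpha :as spec]))

;; Assumed mapping from XSD built-in types to plain predicates.
(def xsd-type->predicate
  {"xs:integer" `integer?
   "xs:string"  `string?
   "xs:boolean" `boolean?})

(defn attribute->spec-form
  "Builds a (spec/def ...) form for one attribute description.
  The input shape is hypothetical, standing in for parsed XSD."
  [{:keys [app version] attr-name :name xsd-type :type}]
  (let [spec-name (keyword (str "messaging." app ".v" version) attr-name)
        predicate (or (xsd-type->predicate xsd-type)
                      (throw (ex-info "Unsupported XSD type" {:type xsd-type})))]
    `(spec/def ~spec-name ~predicate)))

(def age-form
  (attribute->spec-form
   {:app "blue-app" :version 1 :name "age" :type "xs:integer"}))

;; The emitted form can be written to a source file or evaluated directly.
(eval age-form)
```

Because the predicates come from a fixed table, the emitted specs stay simple, in line with the point that they cannot be arbitrarily strong.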

So, XSD remains canonical, while the goals of this issue seem still met.

One could also do it in the opposite direction (making specs canonical, emitting Ruby out of those), but that approach seems bound to cause issues.
