A spec-backed protocol for inter-service communication #9

Open
vemv opened this issue Jun 13, 2019 · 2 comments

vemv commented Jun 13, 2019

Brief

Given a microservices architecture where services communicate via e.g. Kafka, I advocate that those Kafka messages should be:

  • 'speced' (backed by clojure.spec)
    • enables an inter-service contract
    • note that 'message specs' are decoupled from 'model specs'
      • 'model specs' refers to the write model / read model of each service's entities.
        • Those should be private (encapsulated)
    • also enables unit-testing each service without depending on other services' implementation details or private specs
  • versioned
    • each attribute and each message (namely, a composition of attributes) is namespaced and versioned
    • attribute example: (spec/def :messaging.blue-app.v1/age integer?)
    • message example: (spec/def :messaging.blue-app.v1/user (spec/keys :req [:messaging.blue-app.v1/age]))
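
The two definitions above can be tried at a REPL. A minimal sketch, assuming `clojure.spec.alpha` is aliased as `spec` (as the examples in this issue imply); the namespace name `example.messaging.brief` is made up for illustration:

```clojure
(ns example.messaging.brief
  (:require [clojure.spec.alpha :as spec]))

;; Attribute spec: namespaced by app and version.
(spec/def :messaging.blue-app.v1/age integer?)

;; Message spec: a composition of (versioned) attributes.
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

(spec/valid? :messaging.blue-app.v1/user {:messaging.blue-app.v1/age 42})   ;; => true
(spec/valid? :messaging.blue-app.v1/user {:messaging.blue-app.v1/age "42"}) ;; => false
(spec/valid? :messaging.blue-app.v1/user {})                                 ;; => false
```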

Rationale

Specs

The usual reasons apply: contracts reduce the chance for misunderstandings, which often translate into bugs.

This particularly applies to ever-evolving, distributed systems.

Namespaces

I advocate that both message names and those messages' attributes be heavily namespaced.

  • Basic namespacing (depth 1) enables clojure.specing
  • Heavy namespacing (using the app name, and possibly the name of one of its modules) enables an unmistakable mapping from spec to code
    • e.g. I should be able to hit a shortcut in my IDE for a given keyword and jump straight to the spec definition.
    • Centralized/unmistakable ownership == fewer chances for misunderstandings

Versioning

As part of the namespacing strategy, version numbers should be part of each message/attribute namespace.

  • Systems evolve
    • Data semantics evolve
  • Not reifying that evolution is a source of complexity
    • Because a single thing shouldn't have N meanings depending on who watches the thing / when the thing is watched

An additional benefit is that unit-testing has guaranteed correctness:

  • when ProjectA consumes ProjectB's specs for data generation, it does it at a specific, immutable version
    • ProjectB is free to evolve, without breaking other projects' tests
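
As a sketch of what "consuming ProjectB's specs for data generation" could look like in a ProjectA test: the spec definitions below stand in for ProjectB's published v1 specs, `user->summary` is a hypothetical piece of ProjectA logic, and the generators assume `org.clojure/test.check` is on the test classpath.

```clojure
(ns example.project-a.consumer-test
  (:require [clojure.spec.alpha :as spec]
            [clojure.spec.gen.alpha :as gen]
            [clojure.test :refer [deftest is]]))

;; Stand-in for ProjectB's published specs, pinned at v1.
(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

;; Hypothetical ProjectA logic that consumes a v1 user message.
(defn user->summary [user]
  {:age (:messaging.blue-app.v1/age user)})

(deftest consumes-v1-users
  ;; Generate 20 sample messages from the v1 spec and feed them to our code.
  (doseq [user (gen/sample (spec/gen :messaging.blue-app.v1/user) 20)]
    (is (spec/valid? :messaging.blue-app.v1/user user))
    (is (integer? (:age (user->summary user))))))
```

Because the test only ever names `v1`, a later `v2` published by ProjectB cannot change what this test generates.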

How

Monotonically increasing, natural version numbers

i.e. v1 -> v2 -> v3 -> ...

  • There's no semver-like concept of breakage
    • One man's non-breaking is another's breaking
    • A new version is simply 'different', with no judgement or guarantee whatsoever, and consumers should unit-test against it
  • It can be trivially enforced that all messages/attributes have a version as part of their names.
  • "But I'll end up having v16 which seems ridiculously high"
    • There are infinitely many numbers; you won't run out of them
  • It is perfectly fine that a message at v1 contains an attribute at v2
    • e.g. I can create a new type of message (hence the v1) leveraging existing attributes (that may be at v2 by now)
  • And it is perfectly fine that a message at v2 contains an attribute at v1
    • Else you'd have to constantly bump attributes' versions, which would duplicate things tremendously.
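
A sketch of two of the points above: a message at v1 reusing an attribute at v2, and a check that every messaging spec carries a version. The `:messaging.blue-app.v1/signup` message, the 0..150 bound on the v2 attribute, the `messaging.` prefix convention and both helper functions are assumptions made up for illustration.

```clojure
(ns example.messaging.versions
  (:require [clojure.spec.alpha :as spec]
            [clojure.string :as str]))

(spec/def :messaging.blue-app.v1/age integer?)
;; v2 changes the meaning of the attribute (hypothetical bound).
(spec/def :messaging.blue-app.v2/age (spec/and integer? #(<= 0 % 150)))

;; A brand-new message type starts at v1, yet reuses the attribute at v2.
(spec/def :messaging.blue-app.v1/signup
  (spec/keys :req [:messaging.blue-app.v2/age]))

(defn versioned?
  "True when the keyword's namespace ends in a .vN segment (N >= 1)."
  [k]
  (boolean (some->> (namespace k) (re-find #"\.v[1-9]\d*$"))))

(defn unversioned-messaging-specs
  "Registered spec keywords under messaging.* that lack a version segment."
  []
  (->> (keys (spec/registry))
       (filter keyword?)
       (filter #(some-> (namespace %) (str/starts-with? "messaging.")))
       (remove versioned?)))
```

A CI step or a unit test could assert that `(unversioned-messaging-specs)` is empty, which is one way the naming rule can be enforced mechanically.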

Message specs are decoupled from model specs

  • One might have a (spec/def :blue-app/user ...) spec serving as a 'write model', i.e. a schema you follow before persisting to a DB
  • Then, when creating (spec/def :messaging.blue-app.v1/user) one should not leverage the :blue-app/user spec at all
    • It couples the messaging spec to arbitrary implementation details (business-specific predicates)
    • Would make the messaging spec non-consumable from external projects.
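
To illustrate the decoupling, here is a sketch in which the private write model carries a business rule while the message spec is defined from scratch. The "18 or older" rule and the `user->message` function are hypothetical.

```clojure
(ns example.blue-app.specs
  (:require [clojure.spec.alpha :as spec]))

;; Private write model: free to carry business-specific predicates.
;; The 18+ rule is a made-up example of such a predicate.
(spec/def :blue-app/age (spec/and integer? #(>= % 18)))
(spec/def :blue-app/user (spec/keys :req [:blue-app/age]))

;; Public message spec: no reference to any :blue-app/* spec.
(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

;; The translation happens explicitly at the service boundary.
(defn user->message [user]
  {:messaging.blue-app.v1/age (:blue-app/age user)})
```

External projects only need the two `:messaging.blue-app.v1/*` definitions; the business rule can change without touching them.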

Immutable, centralized message schemas

  • All specs are immutable
    • both message specs and attribute specs
  • Want to mutate a spec? inc the message schema version instead
  • For enforcing this:
    • Messaging specs aren't inlined into their owner's project, but instead reside in a centralized repo
      • stricter code review, checklist, tooling
  • Maybe use https://github.com/metosin/spec-tools/blob/master/docs/02_data_specs.md for defining messaging specs
    • data (unlike code) is more serializable, therefore more easily comparable (which is necessary for the immutability requirement)
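
One possible way to check immutability, sketched with plain `clojure.spec.alpha` rather than spec-tools: record each spec's form as data, then compare later. The `snapshot` and `changed-specs` helpers are hypothetical; in practice the snapshot would presumably live in a file in the centralized repo.

```clojure
(ns example.messaging.immutability
  (:require [clojure.spec.alpha :as spec]))

(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

(defn snapshot
  "Map of spec keyword -> its form (plain data), for the given keywords.
   Unregistered keywords map to nil."
  [spec-keys]
  (into {}
        (map (fn [k] [k (some-> (spec/get-spec k) spec/form)]))
        spec-keys))

(defn changed-specs
  "Keywords whose current form differs from the recorded one
   (including specs that have since been removed)."
  [old-snapshot]
  (for [[k old-form] old-snapshot
        :when (not= old-form (some-> (spec/get-spec k) spec/form))]
    k))
```

A review tool or CI job could fail whenever `changed-specs` returns anything, nudging the author to add a new version instead of editing an existing one.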

Closing thoughts

Specs in production

It would be unfortunate to go to all this effort, and then disable spec checking in production.

I have created nedap/speced.def#70 accordingly.

Zero-downtime + exactly-once processing

A usual problem in event-driven architectures is:

  • how to deal with different versions of a message?
    • Example 1: you want to replay a very old log, containing heterogeneous ('evolved') shapes of the same data
    • Example 2: you have a deployed service, which is only aware of an old message schema. Now you want to deploy a new version of the app, aware of the new schema
      • For a (more-or-less) brief period of time, you have two app versions processing the new message
        • the old one will fail due to outdated expectations
        • and your message will be processed twice
          • maybe the old service performed some side-effects. Buggy or not, that is undesirable.

With my recipe, this problem disappears altogether:

  • Services are configured (production.edn) to only process a finite, known set of message versions
    • e.g. RedApp--Deployment1 is configured to observe #{:messaging.blue-app.v1/user}
      • v2 is not processed at all
    • and RedApp--Deployment2 is configured to observe #{:messaging.blue-app.v2/user}
      • v1 is not processed at all
    • Both deployments can coexist for an arbitrarily long time, without processing unexpected schemas, and ensuring exactly-once processing.
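
A sketch of the version filter described above. The `:observed-messages` config key, the `{:type ... :payload ...}` envelope and the `process` function are assumptions for illustration; nothing here is tied to a particular Kafka client.

```clojure
(ns example.red-app.consumer
  (:require [clojure.spec.alpha :as spec]))

(spec/def :messaging.blue-app.v1/age integer?)
(spec/def :messaging.blue-app.v1/user
  (spec/keys :req [:messaging.blue-app.v1/age]))

;; What RedApp--Deployment1 might read from production.edn (hypothetical key).
(def config {:observed-messages #{:messaging.blue-app.v1/user}})

(defn process
  "Returns [:skipped type], [:invalid explain-data] or [:handled result].
   Only message types listed in the config are validated and handled."
  [{:keys [observed-messages]} handler {:keys [type payload]}]
  (cond
    (not (contains? observed-messages type)) [:skipped type]
    (not (spec/valid? type payload))         [:invalid (spec/explain-data type payload)]
    :else                                    [:handled (handler payload)]))
```

With this shape, Deployment1 ignores `:messaging.blue-app.v2/user` entirely, and a Deployment2 configured with `#{:messaging.blue-app.v2/user}` ignores v1, so each message version has exactly one deployment acting on it.
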
@jwkoelewijn

I see the appeal of this approach. My one minor concern is that it seems Clojure-oriented, in that it relies heavily on namespaced keywords and on spec. That is not necessarily a problem, but ideally we would have the same advantages for our Ruby consumers. Any thoughts on that?

For reference, at the moment we use https://github.com/nedap/postman/blob/master/bin/generators/ruby_generator.rb (which uses xsd2ruby under the hood) to build Plain Old Ruby Objects for the different messages we have defined using an XSD (see https://github.com/nedap/postman and https://github.com/nedap/postman/blob/master/models/Message.xsd).
This approach does not benefit from any checks or specs; being able to generate the classes automatically is a nice feature, though.

Do you see any possibilities to generate the same using the approach you described for how to best deal with our messaging formats?

I know it is a lot to ask, but I would like to find an approach for migrating our current Postman messages so that they incorporate some of the ideas you mention, so consider these questions more of a hammock-time trigger :)


vemv commented Aug 7, 2019

> Do you see any possibilities to generate the same using the approach you described for how to best deal with our messaging formats?

Yes. I'd keep using XSD in a quite similar manner, while adding:

  • versioning support both at model- and attribute-level.
  • slightly stronger types, e.g. asserting that strings are not-blank

That versioning could be achieved either through some official XSD feature, or in an ad-hoc manner by convention.

Out of this XSD, we'd emit both Ruby classes, and Clojure specs.

  • For Ruby classes, we'd need to tweak https://github.com/nedap/postman/blob/master/bin/generators/ruby_generator.rb so it emits e.g. Pep::Postman::UserEvent::V1, with attributes called e.g. age_v1
    • i.e. both classes and attributes will become versioned
    • if not done already, we could start emitting some Ruby preconditions (fail unless foo) out of the XSD types.
  • We can automatically emit Clojure specs out of XSD
    • (not done atm?)
    • Accordingly, specs cannot be arbitrarily strong
      • Which is in fact desirable: see "It couples the messaging spec to arbitrary implementation details" and "Would make the messaging spec non-consumable from external projects" in the original issue

So, XSD remains canonical, while the goals of this issue seem still met.

One could also do it in the opposite direction (making specs canonical, and emitting Ruby out of those), but that approach seems bound to cause issues.
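
As a rough sketch of the XSD-to-spec direction: assume the XSD has already been parsed into plain maps carrying a name, an XSD type and a version (how the version is encoded in the XSD is left open above, so that input shape is an assumption, as are all function names below). The emitter then only produces `spec/def` forms as data:

```clojure
(ns example.postman.spec-emitter)

;; Deliberately small mapping: only simple XSD types become predicates.
(def xsd-type->predicate
  {"xs:integer" 'integer?
   "xs:string"  'string?
   "xs:boolean" 'boolean?})

(defn versioned-key
  "Builds e.g. :messaging.blue-app.v1/age from app name, version and name."
  [app version local-name]
  (keyword (str "messaging." app ".v" version) local-name))

(defn attribute-form
  "Emits a (clojure.spec.alpha/def ...) form for one parsed attribute."
  [app {:keys [name type version]}]
  (if-let [pred (xsd-type->predicate type)]
    (list 'clojure.spec.alpha/def (versioned-key app version name) pred)
    (throw (ex-info "Unmapped XSD type" {:type type}))))

(defn message-form
  "Emits a spec/keys definition for one parsed message; each attribute
   keeps its own version, independent of the message's version."
  [app {:keys [name version attributes]}]
  (list 'clojure.spec.alpha/def
        (versioned-key app version name)
        (list 'clojure.spec.alpha/keys
              :req (mapv #(versioned-key app (:version %) (:name %)) attributes))))
```

For example, `(attribute-form "blue-app" {:name "age" :type "xs:integer" :version 1})` yields the form `(clojure.spec.alpha/def :messaging.blue-app.v1/age integer?)`, which could be written to a file in the centralized repo. Since only a fixed set of XSD types is mapped, the emitted specs cannot pick up business-specific predicates.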
