diff --git a/spnl/semantics.md b/spnl/semantics.md
new file mode 100644
index 00000000..27e6c2db
--- /dev/null
+++ b/spnl/semantics.md
@@ -0,0 +1,53 @@
+# Details of Execution
+
+The lifecycle of a span query includes:
+
+- Input: what the client sends to the generate call
+- Output: what the generate call returns to the client
+- By-products: what the model server caches as a result of that generate call
+
+## Input Concerns
+
+What the client sends to the generate call.
+
+### Messages
+
+```
+(system m): a message with role "system" and content m
+(user m): a message with role "user" and content m
+(assistant m): a message with role "assistant" and content m
+```
+
+### Terminology
+
+In the notation below, uppercase letters denote strings and lowercase
+letters denote token sequences. We assume that mapping `A` to `a`
+applies the chat template first and then the tokenizer.
+
+```
+A, B, C: these represent messages
+a, b, c: these represent corresponding token sequences, with chat template applied
+_: ensure that the preceding sequence both starts and ends on a block boundary
++: special token for begin span
+x: special token for restoring cross attention
+```
+
+### Rules
+
+```
+(seq A B C) -> abc
+(plus A B C) -> (+a)_(+b)_(+c)_   meaning: prefix each element with + and ensure each starts and ends on a block boundary
+(cross A B C) -> ab(xc)_          meaning: add x before the last element and ensure (xc) starts and ends on a block boundary
+```
+
+### Examples
+
+```
+(cross A (plus B C) D) -> a(+b)_(+c)_(xd)_
+```
+
+## By-products of Generate
+
+What the model server caches as a result of a generate call.
+
+TODO