Commit 1e28ef4

Add documentation for KeySpace structure (#3687)
Co-authored-by: Scott Dugas <[email protected]>
1 parent 3392b57 commit 1e28ef4

1 file changed

+111
-0
lines changed
  • fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/keyspace

fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/keyspace/package-info.java

Lines changed: 111 additions & 0 deletions
@@ -27,6 +27,117 @@
2727
* separate pieces of identifying information about the record store.
2828
* A {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} is a symbolic representation of this layout, giving names -- or constant values -- to each component.
2929
* A {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpacePath} is a concrete instance of a {@code KeySpace}, somewhat like a directory pathname in a file system.
30+
* A {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} is a template, or definition, for a type and (potentially) a value that can be assigned to a node in the tree.
31+
* </p>
32+
* <p>
33+
* The purpose of the {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} model is to uniquely
34+
* define the structure and format of the data layout for the Record Layer. As a consumer of the Record Layer may find out, data
35+
* layout complexity may grow over time and various parts of the data model may become tangled and hard to keep consistent
36+
* and separate. The {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} (and through it, the
37+
* {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory}) are there to define and enforce
38+
* type safety and consistency, allowing data to be uniquely interpreted.
39+
* Each {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} has a {@code name}, a {@code type}
40+
* and potentially a {@code value}.
41+
* The invariants enforced by the system are:
42+
* <ul>
43+
* <li>
44+
* For each {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory}, there can be only
45+
* a single sub-directory of a given type. For example, a directory node named "database" can have one
46+
* child of type {@code long}, one child of type {@code String}, one child of type {@code NULL}, etc.
47+
* </li>
48+
* <li>
49+
* All child names within a parent directory are unique.
50+
* </li>
51+
* </ul>
52+
* The benefits provided by the KeySpace structures and these invariants are:
53+
* <ul>
54+
* <li>
55+
* A well-defined and type-safe structure for the raw {@link com.apple.foundationdb.tuple.Tuple}s that define the
56+
* way in which a key is structured. The {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace}
57+
* API can be used to generate Tuples for the keys that are guaranteed to follow the defined structure.
58+
* </li>
59+
* <li>
60+
* A way to parse the raw keys from the database and an API to assign semantics to them. For example, the key tuple
61+
* {@code ["sys", 8, 15, "user data"]} can be unambiguously parsed, assigning {@code "sys"} to "database name",
62+
* {@code 8} to "user ID", etc.
63+
* </li>
64+
* <li>
65+
* An extensible structure that can add functionality to the keyspace structure. One such example is the directory layer
66+
* encoding, where (potentially long) string constituents of the key are replaced by {@code long} values, thus reducing the
67+
* storage required for representation. This is done through subclassing {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory}
68+
* (see {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.DirectoryLayerDirectory}) where the encoding and decoding is managed.
69+
* </li>
70+
* </ul>
71+
* </p>
72+
* <p>
73+
* To illustrate the issues that this is meant to solve, imagine a structure of
74+
* <pre>
75+
*            permissions
76+
*                 |
77+
*       ---------------------
78+
*       |                   |
79+
*   users(id)           groups(id)
80+
* </pre>
81+
* One way to represent this structure is by defining the method:
82+
* <pre>
83+
* Subspace userSubspace(long id) { return new Subspace(permissionsTuple.add(id)); }
84+
* </pre>
85+
* and then, in some other part of the code, define this method:
86+
* <pre>
87+
* Subspace groupSubspace(long id) { return new Subspace(permissionsTuple.add(id)); }
88+
* </pre>
89+
* but then the key tuple {@code ["permissions", 8]} is ambiguous: there is no way to determine whether it refers to userId 8 or groupId 8.
90+
* This can cause data corruption where data meant to be written to one area of the model overwrites other pieces of data from
91+
* another part of the model.
92+
* The {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} will prevent this from happening. A properly
93+
* formed (and allowed) structure would look like
94+
* <pre>
95+
*           "permissions"
95+
*                 |
96+
*       ---------------------
97+
*       |                   |
98+
*    "users"            "groups"
99+
*       |                   |
100+
* userId(long)        groupId(long)
102+
* </pre>
103+
* and the tuple {@code ["permissions", "users", 8]} is unambiguous.
104+
* </p>
105+
* <p>
106+
* In reality, the implementation is slightly more nuanced: Each {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory}
107+
* can have either many children of a given type, as long as they are all defined as {@code Constants} (that is, they have a predefined value),
108+
* or a single child of that type with the {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory#ANY_VALUE} value.
109+
* For example, the node for "database" can have children (with {@code String} type)
110+
* for "schema", "permissions" and "records", where these values are known ahead of time. One cannot then add another child of type {@code String}
111+
* that accepts arbitrary values, such as "user-442233".
112+
* </p>
113+
* <p>
114+
* For example, the following structure shows a hypothetical database data layout definition, followed by example keys that
115+
* conform to it. Each entry is an instance of a {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory},
116+
* written in the format name/type/value:
117+
*
118+
* <pre>
119+
*                                                  root/String/"root"
119+
*                                                           |
120+
*                                        ---------------------------------------
121+
*                                        |                                     |
122+
*                           database/String/"database"               users/String/"users"
123+
*                                        |                                     |
124+
*                           --------------------------               userId/long/AnyValue
125+
*                           |                        |
126+
*                  dbId/long/AnyValue     schema/String/"schema"
127+
*                           |
128+
*            ------------------------------
129+
*            |                            |
130+
* dbName/String/AnyValue      dbSchema/String/AnyValue
132+
* </pre>
133+
* In the above example, the FDB key of {@code ["root", "database", 8, "my-db"]} would be interpreted as follows:
134+
* <ul>
135+
* <li>"root" -> root</li>
136+
* <li>"database" -> database</li>
137+
* <li>8 -> dbId</li>
138+
* <li>"my-db" -> dbName</li>
139+
* </ul>
140+
* and {@code ["root", "database", "my-db"]} would be considered illegal, as no directory node under "database" can match "my-db": its only children are dbId (a {@code long}) and the constant "schema".
30141
* </p>
31142
*/
32143
package com.apple.foundationdb.record.provider.foundationdb.keyspace;
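
The invariants and the key-parsing behavior described in the Javadoc above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: `KeySpaceSketch`, `Dir`, `ANY`, `resolve` and `exampleTree` are hypothetical names invented for illustration and are not the Record Layer API, and the tree is a trimmed-down version of the example layout (it omits `dbSchema`). It shows one plausible way to reject ambiguous siblings when the tree is built and to map each key element to a directory name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Toy model of the invariants described above; NOT the Record Layer implementation. */
final class KeySpaceSketch {
    /** Hypothetical stand-in for KeySpaceDirectory.ANY_VALUE. */
    static final Object ANY = new Object();

    /** A directory node: a name, a value type, and either a constant value or ANY. */
    static final class Dir {
        final String name;
        final Class<?> type;
        final Object value;
        final List<Dir> children = new ArrayList<>();

        Dir(String name, Class<?> type, Object value) {
            this.name = name;
            this.type = type;
            this.value = value;
        }

        /** Adds a child to this node, rejecting duplicate names and ambiguous siblings. */
        Dir add(Dir child) {
            for (Dir sibling : children) {
                if (sibling.name.equals(child.name)) {
                    throw new IllegalArgumentException("Duplicate name: " + child.name);
                }
                // Two siblings of the same type are allowed only if both are
                // constants with different values.
                boolean sameType = sibling.type == child.type;
                boolean distinctConstants = sibling.value != ANY && child.value != ANY
                        && !Objects.equals(sibling.value, child.value);
                if (sameType && !distinctConstants) {
                    throw new IllegalArgumentException(
                            child.name + " is ambiguous with " + sibling.name);
                }
            }
            children.add(child);
            return this; // returns the parent so that siblings can be chained
        }

        /** True if the key element has this node's type and (for constants) its value. */
        boolean accepts(Object element) {
            return type.isInstance(element) && (value == ANY || value.equals(element));
        }
    }

    /** Maps each element of a key to the name of the directory that accepts it. */
    static List<String> resolve(Dir root, List<Object> key) {
        List<String> names = new ArrayList<>();
        List<Dir> candidates = List.of(root);
        for (Object element : key) {
            Dir match = null;
            for (Dir candidate : candidates) {
                if (candidate.accepts(element)) {
                    match = candidate; // the invariants guarantee at most one match
                    break;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("No directory matches " + element);
            }
            names.add(match.name);
            candidates = match.children;
        }
        return names;
    }

    /** A trimmed-down version of the example layout (dbSchema is omitted). */
    static Dir exampleTree() {
        return new Dir("root", String.class, "root")
                .add(new Dir("database", String.class, "database")
                        .add(new Dir("dbId", Long.class, ANY)
                                .add(new Dir("dbName", String.class, ANY)))
                        .add(new Dir("schema", String.class, "schema")))
                .add(new Dir("users", String.class, "users")
                        .add(new Dir("userId", Long.class, ANY)));
    }

    public static void main(String[] args) {
        Dir root = exampleTree();
        // prints [root, database, dbId, dbName]
        System.out.println(resolve(root, List.of("root", "database", 8L, "my-db")));
        try {
            resolve(root, List.of("root", "database", "my-db"));
        } catch (IllegalArgumentException e) {
            // "my-db" is neither a long (dbId) nor the constant "schema"
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the ambiguity check happens when the tree is defined rather than when a key is read, so an ambiguous layout such as the `users(id)` / `groups(id)` example fails at definition time and never reaches the data.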
