27 | 27 | * separate pieces of identifying information about the record store. |
28 | 28 | * A {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} is a symbolic representation of this layout, giving names -- or constant values -- to each component. |
29 | 29 | * A {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpacePath} is a concrete instance of a {@code KeySpace}, somewhat like a directory pathname in a file system. |
| 30 | + * A {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} is a template, or definition, for a type (and potentially a value) that can be assigned to a node in the tree. |
| 31 | + * </p> |
| 32 | + * <p> |
| 33 | + * The purpose of the {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} model is to uniquely |
| 34 | + * define the structure and format of the data layout for the Record Layer. As consumers of the Record Layer may find, data |
| 35 | + * layout complexity tends to grow over time, and parts of the data model can become tangled and hard to keep consistent |
| 36 | + * and separate. The {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} (and, through it, the |
| 37 | + * {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory}) exists to define and enforce |
| 38 | + * type safety and consistency, allowing data to be interpreted unambiguously. |
| 39 | + * Each {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} has a {@code name}, a {@code type} |
| 40 | + * and potentially a {@code value}. |
| 41 | + * The invariants enforced by the system are: |
| 42 | + * <ul> |
| 43 | + * <li> |
| 44 | + * For each {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory}, there can be only |
| 45 | + * one subdirectory of a given type. For example, a directory node named "database" can have one |
| 46 | + * child of type {@code long}, one child of type {@code String}, one child of type {@code NULL}, etc. |
| 47 | + * </li> |
| 48 | + * <li> |
| 49 | + * All child names within a parent directory are unique. |
| 50 | + * </li> |
| 51 | + * </ul> |
| 52 | + * The benefits provided by the KeySpace structures and these invariants are: |
| 53 | + * <ul> |
| 54 | + * <li> |
| 55 | + * A well-defined and type-safe structure for the raw {@link com.apple.foundationdb.tuple.Tuple}s that define the |
| 56 | + * way in which a key is structured. The {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} |
| 57 | + * API can be used to generate Tuples for the keys that are guaranteed to follow the defined structure. |
| 58 | + * </li> |
| 59 | + * <li> |
| 60 | + * A way to parse the raw keys from the database and an API to assign semantics to them. For example, the key tuple |
| 61 | + * {@code ["sys", 8, 15, "user data"]} can be unambiguously parsed, assigning {@code "sys"} to "database name", |
| 62 | + * {@code 8} to "user ID", etc. |
| 63 | + * </li> |
| 64 | + * <li> |
| 65 | + * An extensible structure to which functionality can be added. One such example is the directory layer |
| 66 | + * encoding, where (potentially long) string constituents of the key are replaced by {@code long} values, thereby reducing the |
| 67 | + * storage required for representation. This is done by subclassing {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} |
| 68 | + * (see {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.DirectoryLayerDirectory}), where the encoding and decoding are managed. |
| 69 | + * </li> |
| 70 | + * </ul> |
| 71 | + * </p> |
| 72 | + * <p> |
| 73 | + * To illustrate the issues that this is meant to solve, imagine a structure of |
| 74 | + * <pre> |
| 75 | + *          permissions |
| 76 | + *               | |
| 77 | + *     --------------------- |
| 78 | + *     |                   | |
| 79 | + * users(id)          groups(id) |
| 80 | + * </pre> |
| 81 | + * One way to represent this structure is by defining the method: |
| 82 | + * <pre> |
| 83 | + * Subspace userSubspace(long id) { return new Subspace(permissionsTuple.add(id)); } |
| 84 | + * </pre> |
| 85 | + * and then, in some other part of the code, defining this method: |
| 86 | + * <pre> |
| 87 | + * Subspace groupSubspace(long id) { return new Subspace(permissionsTuple.add(id)); } |
| 88 | + * </pre> |
| 89 | + * but then the key tuple {@code ["permissions", 8]} is ambiguous: there is no way to determine whether it refers to userId 8 or groupId 8. |
| 90 | + * This can cause data corruption where data meant to be written to one area of the model overwrites other pieces of data from |
| 91 | + * another part of the model. |
| 92 | + * The {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace} prevents this from happening. A properly |
| 93 | + * formed (and allowed) structure would look like: |
| 94 | + * <pre> |
| 95 | + *           "permissions" |
| 96 | + *                 | |
| 97 | + *       --------------------- |
| 98 | + *       |                   | |
| 99 | + *    "users"            "groups" |
| 100 | + *       |                   | |
| 101 | + * userId(long)        groupId(long) |
| 102 | + * </pre> |
| 103 | + * and the tuple {@code ["permissions", "users", 8]} is unambiguous. |
| 104 | + * </p> |
| 105 | + * <p> |
| 106 | + * In reality, the implementation is slightly more nuanced: each {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} |
| 107 | + * can have many children of a given type as long as they are all defined as constants (that is, each has a predefined value), |
| 108 | + * or it can have a single child of that type with the value {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory#ANY_VALUE}. |
| 109 | + * For example, the node for "database" can have children (of type {@code String}) |
| 110 | + * for "schema", "permissions" and "records", where these values are known ahead of time. One cannot then also add a child of type {@code String} |
| 111 | + * that accepts arbitrary values, such as "user-442233". |
| 112 | + * </p> |
| 113 | + * <p> |
| 114 | + * For example, the following structure shows a hypothetical database data layout definition, followed by example keys that |
| 115 | + * conform to it. Each entry is an instance of a {@link com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpaceDirectory} |
| 116 | + * in the format name/type/value: |
| 117 | + * |
| 118 | + * <pre> |
| 119 | + * root/String/"root" |
| 120 | + * | |
| 121 | + * --------------------------------------- |
| 122 | + * | | |
| 123 | + * database/String/"database" users/String/"users" |
| 124 | + * | | |
| 125 | + * -------------------------- userId/long/AnyValue |
| 126 | + * | | |
| 127 | + * dbId/long/AnyValue schema/String/"schema" |
| 128 | + * | |
| 129 | + * ------------------------------ |
| 130 | + * | | |
| 131 | + * dbName/String/AnyValue dbSchema/String/AnyValue |
| 132 | + * </pre> |
| 133 | + * In the above example, the FDB key of {@code ["root", "database", 8, "my-db"]} would be interpreted as follows: |
| 134 | + * <ul> |
| 135 | + * <li>"root" -> root</li> |
| 136 | + * <li>"database" -> database</li> |
| 137 | + * <li>8 -> dbId</li> |
| 138 | + * <li>"my-db" -> dbName</li> |
| 139 | + * </ul> |
| 140 | + * and {@code ["root", "database", "my-db"]} would be considered illegal: the only children of "database" are a {@code long} (dbId) and the constant "schema", so there is no directory node that "my-db" can match. |
30 | 141 | * </p> |
31 | 142 | */ |
32 | 143 | package com.apple.foundationdb.record.provider.foundationdb.keyspace; |
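The two invariants and the key interpretation described in the Javadoc above can be sketched in plain Java. This is a toy model written only for illustration, not the Record Layer implementation or its API: the names `KeySpaceSketch`, `Dir`, `ANY`, `resolve` and `exampleTree` are hypothetical, and the tree contains only the nodes whose placement the Javadoc states explicitly (root, database, dbId, dbName, schema, users, userId).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the invariants described above; NOT the Record Layer implementation. */
public class KeySpaceSketch {
    /** Hypothetical marker meaning "any value of this node's type". */
    static final Object ANY = new Object();

    /** A directory node with a name, a type, and either a constant value or ANY. */
    static final class Dir {
        final String name;
        final Class<?> type;
        final Object value;
        final Map<String, Dir> children = new LinkedHashMap<>();

        Dir(String name, Class<?> type, Object value) {
            this.name = name;
            this.type = type;
            this.value = value;
        }

        /** Adds a child, rejecting duplicate names and ambiguous type/value combinations. */
        Dir add(Dir child) {
            if (children.containsKey(child.name)) {
                throw new IllegalArgumentException("duplicate child name: " + child.name);
            }
            for (Dir sibling : children.values()) {
                if (sibling.type != child.type) {
                    continue;
                }
                // Same type is allowed only when both are constants with different values.
                boolean clash = sibling.value == ANY || child.value == ANY
                        || sibling.value.equals(child.value);
                if (clash) {
                    throw new IllegalArgumentException(
                            child.name + " is ambiguous with sibling " + sibling.name);
                }
            }
            children.put(child.name, child);
            return this;
        }

        /** True if the key element has this node's type and (for constants) its value. */
        boolean matches(Object element) {
            return type.isInstance(element) && (value == ANY || value.equals(element));
        }
    }

    /** Maps each element of a key to the name of the directory node it matches. */
    static List<String> resolve(Dir root, List<Object> key) {
        if (key.isEmpty() || !root.matches(key.get(0))) {
            throw new IllegalArgumentException("key does not start at " + root.name);
        }
        List<String> names = new ArrayList<>();
        names.add(root.name);
        Dir current = root;
        for (Object element : key.subList(1, key.size())) {
            Dir next = null;
            for (Dir child : current.children.values()) {
                if (child.matches(element)) {
                    next = child;
                    break;
                }
            }
            if (next == null) {
                throw new IllegalArgumentException(
                        "no directory under " + current.name + " matches " + element);
            }
            names.add(next.name);
            current = next;
        }
        return names;
    }

    /** Builds the part of the example layout whose structure the Javadoc spells out. */
    static Dir exampleTree() {
        Dir dbId = new Dir("dbId", Long.class, ANY)
                .add(new Dir("dbName", String.class, ANY));
        Dir database = new Dir("database", String.class, "database")
                .add(dbId)
                .add(new Dir("schema", String.class, "schema"));
        Dir users = new Dir("users", String.class, "users")
                .add(new Dir("userId", Long.class, ANY));
        return new Dir("root", String.class, "root").add(database).add(users);
    }

    public static void main(String[] args) {
        List<Object> key = Arrays.<Object>asList("root", "database", 8L, "my-db");
        System.out.println(resolve(exampleTree(), key)); // prints [root, database, dbId, dbName]
    }
}
```

In this sketch, resolving `["root", "database", "my-db"]` throws, because no child of `database` matches a free-form string. Adding a second `ANY` child of type `String` next to `schema` is rejected at definition time, which is the kind of early failure the invariants are meant to provide.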