cedar-policy · cdisselkoen · Jun 7, 2024 · Jun 10, 2024 · Jun 10, 2024 · blubben
diff --git a/text/0069-schema-libraries.md b/text/0069-schema-libraries.md
@@ -0,0 +1,178 @@
+# Schema libraries
+
+## Related issues and PRs
+
+- Reference Issues: Inspired by [RFC 58], but includes new material not present in RFC 58
+- Implementation PR(s):
+
+## Timeline
+
+- Started: 2024-06-07
+
+## Summary
+
+Allow Cedar schemas to include/import "libraries" of definitions from remote
+URLs.
+
+This RFC does not propose that the Cedar team would build or maintain any such
+libraries; it only proposes the mechanism for importing libraries from URLs.
+
+## Basic example
+
+Human schema format:
+```
+import "https://raw.githubusercontent.com/cedar-policy/cedar-examples/release/3.2.x/cedar-example-use-cases/document_cloud/document_cloud.cedarschema"
+import "https://example.com/cedar_schemas/oidc.cedarschema"
+import "https://example.com/cedar_schemas_json/foobar.cedarschema.json"
+
+namespace MyApp {
+  ...
+}
+```
+
+JSON schema format:
+```
+{
+  "imports" : [
+    "https://raw.githubusercontent.com/cedar-policy/cedar-examples/release/3.2.x/cedar-example-use-cases/document_cloud/document_cloud.cedarschema",
+    "https://example.com/cedar_schemas/oidc.cedarschema",
+    "https://example.com/cedar_schemas_json/foobar.cedarschema.json"
+  ],
+  "MyApp": {
+    ...
+  }
+}
+```
+
+## Motivation
+
+Some data sources are common across many applications and useful to many Cedar
+users, either within the same organization or even across organizations.
+
+### 1. Within the same organization
+
+This RFC would allow an organization to define its own libraries of schema
+definitions, which could be reused across many different schemas (say, for
+different webapps owned by the organization).
+For instance, the organization might have common definitions of `User` or
+`Account` that apply in many different applications, and although those
+applications may not want to share entire Cedar schemas, with this RFC they
+could share just the definitions of `User` or `Account`, which could be defined
+once in a central location (in the same or separate libraries).
+
+### 2. Across organizations
+
+For another motivating example, consider identity providers (IdPs) which comply
+with the OpenID Connect standard (OIDC).
+The OIDC standard includes a list of attributes that exist on a user type; this
+is naturally declared as a Cedar entity type.
+With this RFC, anyone could provide a "library" representing Cedar definitions for
+OIDC types, and provide that library as a Cedar schema file at some URL; and then
+other Cedar users could use those definitions simply by importing the file from that URL.
+This would allow the Cedar community to gradually converge on the "best" way to
+represent an OIDC user in Cedar.
+
+### Motivations common to both scenarios
+
+In both of the above scenarios (within-organization and cross-organization),
+we obtain three key benefits:
+1. Saving each Cedar user the effort of writing common declarations themselves.
+  This facilitates code reuse in schemas, and makes it easier to get started
+  with Cedar.
+2. Providing a way to define common types and actions in a centralized way,
+  which ensures many schemas agree on the "correct" or "best practices"
+  definitions, and provides a single place to make updates if updates are
+  required.
+3. Facilitating code reuse for Cedar authorization calls, not just schemas.
+  When everyone shares a common definition of `OIDC::User`, the community could
+  conceivably converge on a reusable library function for, e.g., converting an
+  OIDC token into Cedar entity data.
+  This would further make it easier to get started with Cedar.
+  (Note that this RFC does not propose the Cedar team writing or maintaining
+  either library definitions for use in schemas or library functions for use in
+  Cedar authorization calls. It only points out that the community could
+  converge on these things.)
+
+## Detailed design
+
+Import statements are only allowed at the top level, outside of all namespace
+declarations.
+(In the future, another RFC could propose allowing imports in other positions.)
+
+The target of the import must be a raw file containing a valid Cedar schema.
+This schema may contain any definitions that are valid today in Cedar schemas,
+including namespaces, entity type definitions, common type definitions, and
+action declarations.
+The validator will essentially concatenate all of these definitions into the
+schema at the location of the `import` statement.
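+
+As a rough illustration of this "concatenate at the import site" behavior, the
+following Rust sketch expands `import "URL"` lines in a human-format schema.
+The names `expand_imports` and `parse_import` are hypothetical, and the
+`fetched` map is an assumed stand-in for the network layer. The sketch is
+line-based, so it does not check that imports appear only at the top level,
+does not follow imports inside imported files, and does not handle the JSON
+format.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical helper: replaces each `import "URL"` line with the schema text
+/// that `fetched` holds for that URL. `fetched` stands in for the download
+/// step, which this sketch does not model.
+fn expand_imports(schema: &str, fetched: &HashMap<&str, &str>) -> Result<String, String> {
+    let mut out = String::new();
+    for line in schema.lines() {
+        match parse_import(line) {
+            Some(url) => {
+                let body = fetched
+                    .get(url)
+                    .ok_or_else(|| format!("import target not available: {}", url))?;
+                // Splice the imported definitions in at the import's location.
+                out.push_str(body.trim_end());
+                out.push('\n');
+            }
+            None => {
+                out.push_str(line);
+                out.push('\n');
+            }
+        }
+    }
+    Ok(out)
+}
+
+/// Returns the URL if `line` has the form `import "URL"`, otherwise `None`.
+fn parse_import(line: &str) -> Option<&str> {
+    let rest = line.trim().strip_prefix("import")?.trim_start();
+    rest.strip_prefix('"')?.strip_suffix('"')
+}
+```
+
+A utility built along these lines could also serve as the "show the schema with
+all imports expanded" tool mentioned under Drawbacks.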
+
+Cedar will autodetect whether the imported schema is a human-format or
+JSON-format schema.
+(Today, there are no strings that are both valid human-format and valid
+JSON-format schemas; this RFC proposes encoding that as a design principle in
+perpetuity.)
+In particular, all valid JSON-format schemas must have `{` as their first
+non-whitespace character, and no valid human-format schemas have `{` as their
+first non-whitespace character.
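+
+The autodetection rule above can be sketched in a few lines of Rust. The names
+`SchemaFormat` and `detect_format` are hypothetical, and treating an empty or
+all-whitespace input as human-format is an assumption of this sketch, not
+something the RFC specifies.
+
+```rust
+/// The two schema formats an imported file may use.
+#[derive(Debug, PartialEq, Eq)]
+enum SchemaFormat {
+    Human,
+    Json,
+}
+
+/// Classifies schema text by its first non-whitespace character:
+/// `{` means JSON format, anything else means human format.
+fn detect_format(src: &str) -> SchemaFormat {
+    match src.trim_start().chars().next() {
+        Some('{') => SchemaFormat::Json,
+        // Assumption: empty input is treated as an (empty) human-format schema.
+        _ => SchemaFormat::Human,
+    }
+}
+```
+
+The check only selects a parser; whether the text is actually a valid schema
+in the detected format is still decided by that parser.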
+
+This RFC does not propose any mechanism for versioning libraries.
+Instead, it proposes that versioning would be done _above_ the Cedar layer,
+i.e., it would be the responsibility of library authors.
+For instance, library authors could publish each version of their library at
+a different URL, and never change the contents served at the URL of an
+existing version.
+This RFC doesn't preclude later adding a versioning feature, in which case the
+syntax proposed in this RFC would be interpreted as "import the latest version
+of this library".
+
+In the JSON format, we do not need to reserve the namespace named `"imports"`:
+if `"imports"` maps to a JSON object, it represents the namespace `"imports"`,
+while if `"imports"` maps to a JSON array, it represents import statements as
+defined in this RFC.
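+
+A minimal Rust sketch of this disambiguation follows. The `Json` enum is a
+deliberately tiny stand-in for a real JSON library's value type, and
+`ImportsKey` and `classify_imports_key` are hypothetical names. Rejecting
+arrays with non-string elements, and rejecting values that are neither objects
+nor arrays, are assumptions of the sketch.
+
+```rust
+/// Minimal stand-in for a parsed JSON value (assumption: a real
+/// implementation would use a JSON library's value type instead).
+enum Json {
+    Array(Vec<Json>),
+    Object(Vec<(String, Json)>),
+    String(String),
+    Other,
+}
+
+/// What the top-level `"imports"` key turned out to mean.
+#[derive(Debug, PartialEq)]
+enum ImportsKey {
+    /// A JSON array of URL strings: import statements.
+    ImportList(Vec<String>),
+    /// A JSON object: an ordinary namespace that happens to be named `imports`.
+    Namespace,
+    /// Anything else is rejected in this sketch.
+    Invalid,
+}
+
+fn classify_imports_key(value: &Json) -> ImportsKey {
+    match value {
+        Json::Object(_) => ImportsKey::Namespace,
+        Json::Array(items) => {
+            let mut urls = Vec::new();
+            for item in items {
+                match item {
+                    Json::String(url) => urls.push(url.clone()),
+                    _ => return ImportsKey::Invalid,
+                }
+            }
+            ImportsKey::ImportList(urls)
+        }
+        _ => ImportsKey::Invalid,
+    }
+}
+```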
+
+## Drawbacks
+
+1. The Cedar validator, and other tools that rely on schemas, will have to make
+network calls in order to perform their jobs. This has availability and latency
+implications which may not be acceptable for some users. Of course, those users
+could simply not use this feature.
+2. Cedar schemas would no longer be self-contained: a single (hopefully
+readable) file would no longer contain all of the relevant definitions. To
+mitigate this, we could provide a utility that displays the schema with all
+imports expanded.
+3. The Cedar Rust code would have to bring in substantial new dependencies, so
+that it could download libraries from remote URLs. To mitigate this for users
+who are concerned about this and don't need/want this feature (e.g., in
+resource-constrained environments, offline environments, Wasm, etc.), we could
+put this RFC's functionality behind a Cargo feature, so that it and its
+dependencies could be opted into or out of at compile time. (This RFC
+proposes that the feature be enabled by default, with the Cargo feature
+allowing users to disable it at compile time.)
+4. Implementation complexity for the Cedar validator and other tools that rely
+on schemas.
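+
+To make the Cargo-feature mitigation in drawback 3 concrete, here is a hedged
+Rust sketch of compile-time gating. The feature name `remote-imports` and the
+function `fetch_import` are hypothetical; the RFC does not name either, and the
+enabled branch only marks where an HTTP client call would go.
+
+```rust
+/// Compiled only when the hypothetical `remote-imports` feature is enabled.
+#[cfg(feature = "remote-imports")]
+fn fetch_import(url: &str) -> Result<String, String> {
+    // A real implementation would call into an HTTP client here; this sketch
+    // does not pull in such a dependency.
+    Err(format!("fetching {} is not implemented in this sketch", url))
+}
+
+/// Fallback when the feature is disabled: imports fail with a clear error,
+/// and no networking dependencies are compiled in.
+#[cfg(not(feature = "remote-imports"))]
+fn fetch_import(url: &str) -> Result<String, String> {
+    Err(format!(
+        "cannot import {}: built without the `remote-imports` feature",
+        url
+    ))
+}
+```
+
+With this shape, callers use `fetch_import` unconditionally, and a build that
+disables the feature reports an error only when a schema actually contains an
+import.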
+
+This is not a breaking change for any existing Cedar users.
+All existing valid Cedar schemas remain valid.
+
+## Alternatives
+
+### Alternative A: Distribute libraries without the `import` mechanism
+
+Cedar already supports schemas spread over multiple files, in APIs like
+[`Schema::from_schema_fragments()`].
+So, users could reasonably easily distribute and use libraries today, without
+any `import` mechanism.
+When calling Cedar APIs, they would provide library schema files in addition to
+the rest of their schema.
+
+### Alternative B: Explicit declaration of human/JSON format, not autodetection
+
+Instead of the autodetection mechanism described above, we could require schema
+authors to explicitly indicate whether they are importing a human-format or
+JSON-format library.
+For instance, in the human schema format, this could look like
+```
+import [json] "https://..."
+```
+(where the absence of `[json]` would indicate the human format, since Cedar
+positions that as the default format).
+
+[RFC 58]: https://github.com/cedar-policy/rfcs/pull/58
+[`Schema::from_schema_fragments()`]: https://docs.rs/cedar-policy/latest/cedar_policy/struct.Schema.html#method.from_schema_fragments