Schema libraries #69
base: main
Changes from 2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,178 @@ | ||
| # Schema libraries | ||
|
|
||
| ## Related issues and PRs | ||
|
|
||
| - Reference Issues: Inspired by [RFC 58], but includes new material not present in RFC 58 | ||
| - Implementation PR(s): | ||
|
|
||
| ## Timeline | ||
|
|
||
| - Started: 2024-06-07 | ||
|
|
||
| ## Summary | ||
|
|
||
| Allow Cedar schemas to include/import "libraries" of definitions from remote | ||
| URLs. | ||
|
|
||
| This RFC does not propose that the Cedar team would build or maintain any such | ||
| libraries; it only proposes the mechanism for importing libraries from URLs. | ||
|
|
||
| ## Basic example | ||
|
|
||
| Human schema format: | ||
| ``` | ||
| import "https://raw.githubusercontent.com/cedar-policy/cedar-examples/release/3.2.x/cedar-example-use-cases/document_cloud/document_cloud.cedarschema" | ||
| import "https://example.com/cedar_schemas/oidc.cedarschema" | ||
| import "https://example.com/cedar_schemas_json/foobar.cedarschema.json" | ||
|
|
||
| namespace "MyApp" { | ||
| ... | ||
| } | ||
| ``` | ||
|
|
||
| JSON schema format: | ||
| ``` | ||
| { | ||
| "imports" : [ | ||
| "https://raw.githubusercontent.com/cedar-policy/cedar-examples/release/3.2.x/cedar-example-use-cases/document_cloud/document_cloud.cedarschema", | ||
| "https://example.com/cedar_schemas/oidc.cedarschema", | ||
| "https://example.com/cedar_schemas_json/foobar.cedarschema.json" | ||
| ], | ||
| "MyApp": { | ||
| ... | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ## Motivation | ||
|
|
||
| Some data sources are common across many applications and useful to many Cedar | ||
| users, either within the same organization or even across organizations. | ||
|
|
||
| ### 1. Within the same organization | ||
|
|
||
| This RFC would allow an organization to define its own libraries of schema | ||
| definitions, which could be reused across many different schemas (say, for | ||
| different webapps owned by the organization). | ||
| For instance, the organization might have common definitions of `User` or | ||
| `Account` that apply in many different applications, and although those | ||
| applications may not want to share entire Cedar schemas, with this RFC they | ||
| could share just the definitions of `User` or `Account`, which could be defined | ||
| once in a central location (in the same or separate libraries). | ||
|
|
||
| ### 2. Across organizations | ||
|
|
||
| For another motivating example, consider identity providers (IdPs) which comply | ||
| with the OpenID Connect standard (OIDC). | ||
| The OIDC standard includes a list of attributes that exist on a user type; this | ||
| is naturally declared as a Cedar entity type. | ||
| With this RFC, anyone could provide a "library" representing Cedar definitions for | ||
| OIDC types, and provide that library as a Cedar schema file at some URL; and then | ||
| other Cedar users could use those definitions simply by importing the file from that URL. | ||
| This would allow the Cedar community to gradually converge on the "best" way to | ||
| represent an OIDC user in Cedar. | ||
|
|
||
| ### Motivations common to both scenarios | ||
|
|
||
| In both of the above scenarios (within-organization and cross-organization), | ||
| we obtain three key benefits: | ||
| 1. Saving each Cedar user the effort of writing common declarations themselves. | ||
| This facilitates code reuse in schemas, and makes it easier to get started | ||
| with Cedar. | ||
| 2. Providing a way to define common types and actions in a centralized way, | ||
| which ensures many schemas agree on the "correct" or "best practices" | ||
| definitions, and provides a single place to make updates if updates are | ||
| required. | ||
| 3. Facilitating code reuse for Cedar authorization calls, not just schemas. | ||
| When everyone shares a common definition of `OIDC::User`, the community could | ||
| conceivably converge on a reusable library function for, e.g., converting an | ||
| OIDC token into Cedar entity data. | ||
| This would further make it easier to get started with Cedar. | ||
| (Note that this RFC does not propose the Cedar team writing or maintaining | ||
| either library definitions for use in schemas or library functions for use in | ||
| Cedar authorization calls. It only points out that the community could | ||
| converge on these things.) | ||
|
|
||
| ## Detailed design | ||
|
|
||
| Import statements are only allowed at the top level, outside of all namespace | ||
| declarations. | ||
| (In the future, another RFC could propose allowing imports in other positions.) | ||
|
|
||
| The target of the import must be a raw file containing a valid Cedar schema. | ||
|
Should they instead be a namespace? And then we leave providing the schemas from a location up to an upstream process?

Sort of like how Smithy does it: https://smithy.io/1.0/spec/core/idl.html

I might be misunderstanding something -- this RFC as-written allows importing items in a namespace; in fact, all declared items in the library (import target) must be in a (possibly empty) namespace, as otherwise it wouldn't be a valid Cedar schema. You might be looking for a mechanism for "opening" a namespace, i.e., to use items from a namespace unqualified in the rest of the schema/policies? (Akin to Rust's (I realize that Java's

I was thinking about how I would use this for a specific use-case I have now. But after thinking a bit more, I think the answer is: I wouldn't. In my case, I want to define my schema dependencies in "packages" and then use a package/dependency manager tool to declare my dependencies and deal with retrieving the artifacts. All I really need from the Cedar library is the ability to load a schema from a set of files (you can already mostly do this, but there are some sharp edges). I do think the "opening" a namespace feature would be useful, but I agree that's a separate RFC. But that raises the question: do we want "import" in the schema spec at all? Are we bringing unnecessary complexity into Cedar that is best left to some upstream system more suited to deal with it (e.g. a package manager)? Thoughts?

Can you talk more about the sharp edges? We'd like to track these as issues :)
Valid point; this kind of question was one reason for posting the RFC. Interested to see if a majority of folks feel this way, or if there are folks who feel that

The biggest sharp edge was that I wanted to load each schema file as a SchemaFragment and then produce the final schema and validate. However, there were cases where it would fail to validate the fragment since it referenced things in other fragments. So I ended up having to convert all the schemas to the human-readable format and append them together, then load and validate that.

Interesting. In my understanding, referencing things in other fragments should "just work", and if it doesn't, that is a bug. If you have any specific examples, it would be great if we could get reproducers so we can fix the bugs.
||
| This schema may contain any definitions that are valid today in Cedar schemas, | ||
| including namespaces, entity type definitions, common type definitions, and | ||
| action declarations. | ||
|
Since the actions in a schema can impact authorization, a schema library adding or removing an action from a group could result in unexpected authorization decisions.

Would you propose that libraries cannot include action declarations, and can only include entity type definitions and common type definitions?

I don't think that's worth it. Presumably, if you're using a schema library, you're consuming entity data from the IdP as well. The issue is just that you might not expect a schema to impact authorization. Sub-resource integrity plus documentation should mitigate this concern.
||
| The validator will essentially concatenate all of these definitions into the | ||
| schema at the location of the `import` statement. | ||
|
|
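As a rough illustration of the concatenation step described above, here is a minimal Rust sketch. It is not the proposed implementation: `import_target` and `expand_imports` are hypothetical names, the matching is naive line-based string handling rather than a real parser, every library is assumed to be in the human format, and the library texts are assumed to have been fetched already (the sketch does no network access).

```rust
use std::collections::HashMap;

/// Hypothetical helper: if `line` looks like a top-level `import "<url>"`
/// statement, return the URL. Naive string matching, not a real parser.
fn import_target(line: &str) -> Option<&str> {
    let rest = line.trim().strip_prefix("import")?.trim_start();
    rest.strip_prefix('"')?.strip_suffix('"')
}

/// Replace each import line with the text of the library it names.
/// `libraries` maps URL -> already-fetched human-format schema text
/// (an assumption of this sketch; nothing is downloaded here).
fn expand_imports(schema: &str, libraries: &HashMap<&str, &str>) -> Result<String, String> {
    let mut out = String::new();
    for line in schema.lines() {
        match import_target(line) {
            Some(url) => {
                // Splice the library text in at the location of the import.
                let lib = libraries
                    .get(url)
                    .ok_or_else(|| format!("unknown import: {}", url))?;
                out.push_str(lib.trim_end());
                out.push('\n');
            }
            None => {
                // Any other line is copied through unchanged.
                out.push_str(line);
                out.push('\n');
            }
        }
    }
    Ok(out)
}
```

A helper along these lines could also back the "display the schema with all imports expanded" utility mentioned under Drawbacks, although mixing human-format and JSON-format libraries would need more than textual splicing.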
||
| Cedar will autodetect whether the imported schema is a human-format or | ||
| JSON-format schema. | ||
| (Today, there are no strings that are both valid human-format and valid | ||
| JSON-format schemas; this RFC proposes encoding that as a design principle in | ||
| perpetuity.) | ||
| In particular, all valid JSON-format schemas must have `{` as their first | ||
| non-whitespace character, and no valid human-format schemas have `{` as their | ||
| first non-whitespace character. | ||
|
|
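The autodetection rule above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `SchemaFormat` and `detect_format` are hypothetical names, not part of the Cedar API, and treating empty input as human format is a choice made for this sketch rather than something the RFC specifies.

```rust
/// The two schema syntaxes discussed in this RFC.
#[derive(Debug, PartialEq, Eq)]
enum SchemaFormat {
    Human,
    Json,
}

/// Hypothetical helper: classify schema text by its first non-whitespace
/// character. `{` means JSON format; anything else means human format.
fn detect_format(src: &str) -> SchemaFormat {
    // Note: `trim_start` skips Unicode whitespace, which is a superset of
    // the whitespace JSON allows; a stricter check may be preferable.
    match src.trim_start().chars().next() {
        Some('{') => SchemaFormat::Json,
        // Assumption of this sketch: empty input counts as human format.
        _ => SchemaFormat::Human,
    }
}
```

Because the check only inspects one character, it does not validate the schema; the result would just select which parser to run on the imported text.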
||
| This RFC does not propose any mechanism for versioning libraries. | ||
| Instead, it proposes that versioning would be done _above_ the Cedar layer, | ||
| i.e., should be the responsibility of library authors. | ||
| For instance, library authors could provide a different URL for different | ||
| versions of their library, avoiding changing the contents of the URL for the | ||
|
We should provide something akin to sub resource integrity so users can guard against a changing import.

Bringing here the summary of an offline conversation -- it makes a lot of sense to have some feature like this, but since including hashes directly in the schema file is a little ugly, maybe we'd want a separate lockfile, and at that point we might want to consider some separate Cedar.toml file (analogous to Cargo.toml) where you could declare all your dependencies, and possibly other global configuration. Does the long-term path see Cedar having something more and more equivalent to
||
| existing versions of the library. | ||
| This RFC doesn't preclude later adding a versioning feature, in which case the | ||
| syntax proposed in this RFC would be interpreted as "import the latest version | ||
| of this library". | ||
|
|
||
| In the JSON format, we do not need to reserve the namespace named `"imports"`: | ||
| if `"imports"` maps to a JSON object, it represents the namespace `"imports"`, | ||
| while if `"imports"` maps to a JSON array, it represents import statements as | ||
| defined in this RFC. | ||
|
|
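The disambiguation of the `"imports"` key can be sketched as a match on the shape of its JSON value. The code below is a hedged illustration only: `Json` is a minimal stand-in for a parsed JSON value (a real implementation would use a JSON library), `ImportsKey` and `classify_imports` are hypothetical names, and rejecting non-string array elements or other value types is an assumption, since the RFC text does not say how those cases are handled.

```rust
/// Minimal stand-in for a parsed JSON value; only the shapes needed for
/// this sketch are modeled.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// How a top-level `"imports"` key is interpreted (hypothetical type).
#[derive(Debug, PartialEq)]
enum ImportsKey {
    /// A JSON array: the list of URLs to import.
    ImportList(Vec<String>),
    /// A JSON object: an ordinary namespace that happens to be named "imports".
    Namespace,
    /// Anything else; rejecting it is an assumption of this sketch.
    Invalid,
}

fn classify_imports(value: &Json) -> ImportsKey {
    match value {
        Json::Object(_) => ImportsKey::Namespace,
        Json::Array(items) => {
            let mut urls = Vec::new();
            for item in items {
                match item {
                    Json::Str(s) => urls.push(s.clone()),
                    // Assumption: every array element must be a string URL.
                    _ => return ImportsKey::Invalid,
                }
            }
            ImportsKey::ImportList(urls)
        }
        Json::Str(_) => ImportsKey::Invalid,
    }
}
```

Since namespaces are always JSON objects, the array case cannot collide with an existing schema, which is why no name needs to be reserved.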
||
| ## Drawbacks | ||
|
|
||
| 1. The Cedar validator, and other tools that rely on schemas, will have to make | ||
| network calls in order to perform their jobs. This has availability and latency | ||
| implications which may not be acceptable for some users. Of course, those users | ||
| could simply not use this feature. | ||
| 2. Cedar schemas would no longer be self-contained: a single (hopefully | ||
| readable) file would no longer contain all of the relevant definitions. To | ||
| mitigate this, we could provide a utility that displays the schema with all | ||
| imports expanded. | ||
| 3. The Cedar Rust code would have to bring in substantial new dependencies, so | ||
| that it could download libraries from remote URLs. To mitigate this for users | ||
| who are concerned about this and don't need/want this feature (e.g., in | ||
| resource-constrained environments, offline environments, Wasm, etc), we could | ||
| put this RFC's functionality behind a Cargo feature, so that it and its | ||
|
||
| dependencies could be opted-into / opted-out-of at compile time. (This RFC | ||
| proposes it would be enabled by default, but the Cargo feature would allow users | ||
| to compile-time disable it.) | ||
| 4. Implementation complexity for the Cedar validator and other tools that rely | ||
| on schemas. | ||
|
|
||
| This is not a breaking change for any existing Cedar users. | ||
| All existing valid Cedar schemas remain valid. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| ### Alternative A: Distribute libraries without the `import` mechanism | ||
|
|
||
| Cedar already supports schemas spread over multiple files, in APIs like | ||
| [`Schema::from_schema_fragments()`]. | ||
| So, users could reasonably easily distribute and use libraries today, without | ||
| any `import` mechanism. | ||
| When calling Cedar APIs, they would provide library schema files in addition to | ||
| the rest of their schema. | ||
|
|
||
| ### Alternative B: Explicit declaration of human/JSON format, not autodetection | ||
|
|
||
| Instead of the autodetection mechanism described above, we could require schema | ||
| authors to explicitly indicate whether they are importing a human-format or | ||
| JSON-format library. | ||
| For instance, in the human schema format, this could look like | ||
| ```import [json] "https://..."``` | ||
| (where the absence of `[json]` would indicate the human format, since Cedar | ||
| positions that as the default format). | ||
|
|
||
| [RFC 58]: https://github.com/cedar-policy/rfcs/pull/58 | ||
| [`Schema::from_schema_fragments()`]: https://docs.rs/cedar-policy/latest/cedar_policy/struct.Schema.html#method.from_schema_fragments | ||
Could we also have a mechanism to `use` a particular namespace? For instance, say I'm building an app where everything is under the AWS namespace. It would be nice if I could just `use AWS::IdentityCenter` and then policy authors only need to specify the sub-entity, e.g. `User` vs `AWS::IdentityCenter::User`.

I can totally see this being useful, but I'm torn whether it should be part of this RFC or a separate RFC.
It would be harder to envision what a `use` would look like for policies (as opposed to schemas); I'm not sure if users will be happy with a `use` mechanism for schemas but not having one for policies?

Yeah, it's most useful for policies IMO. Perhaps a separate RFC is better.