Skip to content

Conversation

jenn-le
Copy link
Contributor

@jenn-le jenn-le commented May 15, 2025

Description

Addsstaged API to SchemaFactoryAlpha. This creates an allowed type that can be read from the location where it's defined in a tree from the document but not written or moved into that location prior to upgrading.

@Copilot Copilot AI review requested due to automatic review settings May 15, 2025 05:41
@jenn-le jenn-le requested a review from a team as a code owner May 15, 2025 05:41
// schema C: number or string, both fully allowed
const schemaC = factory.optional([SchemaFactoryAlpha.number, SchemaFactoryAlpha.string]);

const provider = new TestTreeProviderLite(3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Advertising to customers that how you write simple examples for how to use our public API surface is to use a non exported private test utility from the tree package is not great.

I think we should try and do this using the actual public API surface. If that means this version of the code has to be in the end-to-end tests instead of the tree package, or that it has to use independentView, then it can be updated accordingly.


In the future, SharedTree will add an API that allows staged allowed types to be upgraded via a runtime schema upgrade so that the type can be more easily deployed using a configuration flag change rather than a code change.

Below is a full example of how the schema migration process works. This can also be found in our tests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any app which claims to really be a "full" example here needs to subscribe to schema change events and handle the associated invalidation from a schema upgrade which might make it no longer viewable. Maybe worth linking TreeViewComponent,

function TreeViewComponent<TSchema extends ImplicitFieldSchema>({
as a reference for how to do that if this examples doesn't.

policy: this.schemaPolicy,
},
this,
true, // if isInitialization is true, schema validation does not occur
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we validate in the initialize case?

When documenting code, it's more important to put in a comment for why it's doing something instead of just what it's doing. A comment about what is hard to know if it's out of date. A comment about why provides new useful information, which can be used to understand if the code is doing the right thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The why is in the prepareForInsertionContextless doc. I'm pretty sure this stuff is needs to be changed so just removed this comment for now since it just makes things more confusing.

* If true, will walk the staged allowed types of the schema in both the node callback and the allowedTypes callback.
* If undefined, will skip any staged allowed types in the node callback but will include them in the allowedTypes callback.
*/
walkStagedAllowedTypes?: true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want something like this, if even to just document that one needs to be concerned about it.

I don't think this option is really enough though.

For example, if you want to walk the portion of the view schema which is enabled (currently supported by the stored schema), passing undefined here might missing some things which have been staged and upgraded already, but passing true might include too much.

Maybe we need an allowedTypeFilter?: (context: undefined | {node: TreeNodeSchema, field?: string}, type: AnnodatedAllowedType) => boolean so it would be possible to implement traversing all types currently allowed according to the stored schema.

// 1. node schemas that are declared as normal allowed types
// 2. node schemas that are declared as staged allowed types that also exist in the stored schema
//
// TODO:#38722 When runtime schema upgrades are implemented, this will need to be updated to include
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to include staged schema types which have been upgraded right now since another client could have made the stored schema, and it could include some of the staged types as enabled already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the scenario you're describing. Upgrading a staged schema means removing the staged annotation in the view schema. So this will include upgraded staged schemas since they will essentially be schemas that are declared as normal types so this is covered by #1. Clients that have not upgraded should have these schemas in the stored schema so that case is covered by #2.

continue;
}
if (metadata.stagedSchemaUpgrade === undefined || stored.nodeSchema.has(identifier)) {
viewNodeSchema.set(identifier, type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also need to recurse into the types under this schema to validate them for compatibility. Currently the traversal is not going to cover those in the case where metadata.stagedSchemaUpgrade !== undefined but stored.nodeSchema.has(identifier)

see https://github.com/microsoft/FluidFramework/pull/24631/files#r2220267693

jenn-le and others added 8 commits July 21, 2025 14:06
);
});

it("initialize doesn't run schema validation", () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not good. We are supposed to have more validation now, not less. Initialize should validate unconditionally. If it doesn't, we could get documents corrupted from the very moment of creation.

assert(child instanceof UnhydratedFlexTreeNode);

// Modify the tree so that it is out of schema.
// The public API is supposed to prevent out of schema trees,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer true: staged allowed types allow the public APi to create out of schema trees! Therefor we must absolutely detect this with validation. The lack of validation (no exception) shown below is a blocker for this change as it allows normal looking code using this feature to corrupt documents.

assert.equal(compatibility.canUpgrade, false);
assert.equal(compatibility.canInitialize, true);

view.initialize([5, "test"]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this corrupts the document, writing something into it that it out of schema with the stored schema.

When working with staged allowed types, new documents must not allow the staged types in the stored schema, and lso must not allow them in the data. Allow them in the stored schema break the app version compat with the previous app versions, and allowing it in the data corrupts the document.

});

it("additional asserts validates schema after edit", () => {
it.skip("additional asserts validates schema after edit", () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not disable document corruption detection nor tests that document corruption detection is working.

I'm pretty sure this PR corrupts documents, and we want to detect and fix that, not ignore it.

@CraigMacomber
Copy link
Contributor

Superseded by #25116

CraigMacomber added a commit that referenced this pull request Aug 7, 2025
## Description

Adds `staged` API to `SchemaFactoryAlpha`. This creates an allowed type
in the view schema that may or may not be included int he stored schema.

Continuation of #24631,
with a new PR from my fork so I can have push permissions.

Major changes:

- Adds  SchemaFactoryAlpha.staged
- There is no longer a single choice for how to derive a stored schema
from a view schema:
- Different use cases have been split into different APIs for example
(toInitalSchema, toUpgradeSchema)
- The underlying toStoredSchema now takes options to control how each
staged schema is handled.
- Unhydrated content always permits staged content (with the exception
of clone): this ensures that export/import round tripping of staged
content works.
- testDocuments test suite has been expanded so existing round trip
tests cover the above.

Known issues/limitations:

- Recursive types are not supported. Tracked by
https://dev.azure.com/fluidframework/internal/_workitems/edit/45711
- Clone should produce nodes with a union of source context and staged
types so that both unknown optional fields work, and new staged types
can be inserted. Tracked by
https://dev.azure.com/fluidframework/internal/_workitems/edit/45725 but
also partially hidden by
https://dev.azure.com/fluidframework/internal/_workitems/edit/45723
which covers how inserting the out of schema types does not error in
unhydrated context currently. This is ok, as we catch them when
inserting into hydrated documents and thus the two pugs combine to make
the desired behavior, but itn't great, and could cause other issue due
to violating internal invariants. Some of the new clone tests cover
this.
- Some places, mainly those not using alpha typer, can't accept
annotated allowed types and thus don't accept staged types. Examples
include the root of the tree view config, recursiveObject fields, and
likely more. Many of these can be worked around using dedicated alpha
APIs and/or wrapping the implicit field schema in an explicit one using
SchemaFactoryAlpha.required. Some cases, like recursiveArray do not have
a viable workaround, and can be addressed in future work, possibly after
stabilizing annotated allowed types.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: dds: tree area: dds Issues related to distributed data structures area: framework Framework is a tag for issues involving the developer framework. Eg Aqueduct base: main PRs targeted against main branch changeset-present public api change Changes to a public API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants