Add support for patching, merging, replacing metadata #688

Merged
merged 35 commits into from
Jun 4, 2024

Conversation

hyperrealist
Contributor

@hyperrealist hyperrealist commented Mar 11, 2024

  • patch_metadata uses an HTTP PATCH request with the application/json-patch+json content type
  • merge_metadata is triggered by an HTTP PATCH request with the application/merge-patch+json content type
  • update_metadata takes a similar approach to merge_metadata, but constructs a JSON patch on the client side
  • add jsonpatch and json-merge-patch to requirements
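
For readers unfamiliar with the two formats, here is a minimal standard-library sketch of how they differ. The `merge_patch` helper is a hypothetical illustration of the RFC 7386 algorithm, not the code in this PR (which relies on the jsonpatch and json-merge-patch packages), and the metadata values are made up.

```python
import copy


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) without mutating `target`."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return copy.deepcopy(patch)
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # In a merge patch, null means "delete this key".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


metadata = {"sample": {"name": "A", "color": "red"}, "temp": 300}

# Merge patch: a partial document that mirrors the target's shape.
merged = merge_patch(metadata, {"sample": {"color": None}, "temp": 310})

# The same change written as a JSON Patch (RFC 6902): an explicit list of operations.
equivalent_json_patch = [
    {"op": "remove", "path": "/sample/color"},
    {"op": "replace", "path": "/temp", "value": 310},
]
```

A merge patch is shorter to write by hand, but it cannot set a value to null or edit individual list elements; a JSON Patch can do both.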

Member

@danielballan danielballan left a comment

Looks good. I think the main thing that's missing is implementing the new adapters methods on the SQL-backed adapter in Tiled itself.

async def update_metadata(self, metadata=None, specs=None):

In a later PR, we could even look into pushing the patch representation all the way into SQL, but to start I think applying the patch in Python and doing a full replace in SQL is the way to go.
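
As a rough sketch of that approach (apply the patch in Python, then write the whole document back), consider the following. `apply_json_patch` is a deliberately small stand-in that handles only add, replace and remove on object members (no arrays, no move/copy/test), and the `rows` dict is a hypothetical stand-in for the SQL table. None of this is Tiled's adapter code.

```python
import copy


def _split_pointer(path):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if not path.startswith("/"):
        raise ValueError(f"JSON Pointer must start with '/': {path!r}")
    return [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]


def apply_json_patch(doc, patch):
    """Return a patched deep copy of `doc`; the input is left untouched."""
    result = copy.deepcopy(doc)
    for op in patch:
        *parents, last = _split_pointer(op["path"])
        node = result
        for token in parents:
            node = node[token]
        if op["op"] == "add":
            node[last] = op["value"]
        elif op["op"] == "replace":
            if last not in node:
                raise KeyError(f"cannot replace missing key {last!r}")
            node[last] = op["value"]
        elif op["op"] == "remove":
            del node[last]
        else:
            raise NotImplementedError(op["op"])
    return result


# Hypothetical stand-in for a database row keyed by node path.
rows = {"/a/b": {"metadata": {"test": "old", "keep": 1}}}

patch = [{"op": "replace", "path": "/test", "value": "testing"}]
new_metadata = apply_json_patch(rows["/a/b"]["metadata"], patch)
# Full replace: overwrite the stored metadata with the patched document.
rows["/a/b"]["metadata"] = new_metadata
```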

tiled/client/metadata_update.py Outdated Show resolved Hide resolved
tiled/server/router.py Outdated Show resolved Hide resolved
tiled/client/base.py Show resolved Hide resolved
Contributor

@Kezzsim Kezzsim left a comment

Looked at this earlier on a call with Dan, not seeing anything too out of the ordinary that he hasn't already touched on.

@hyperrealist hyperrealist force-pushed the patch-api-new branch 2 times, most recently from 2f1a0a7 to 709812f Compare March 21, 2024 09:58
@hyperrealist
Contributor Author

@danielballan, @Kezzsim this PR is ready for another review. Changes since your last review are minimal, though it took me an obscene amount of time to figure this out.

@hyperrealist hyperrealist changed the title [WIP] add support for patching, merging, replacing metadata Add support for patching, merging, replacing metadata Mar 21, 2024
tiled/client/base.py Outdated Show resolved Hide resolved
tiled/server/router.py Outdated Show resolved Hide resolved
tiled/server/router.py Outdated Show resolved Hide resolved
tiled/client/base.py Outdated Show resolved Hide resolved
tiled/client/base.py Show resolved Hide resolved
tiled/client/base.py Outdated Show resolved Hide resolved
specs: Optional[Specs]

# Wait for fix https://github.com/pydantic/pydantic/issues/3957
# to do this with `unique_items` parameters to `pydantic.constr`.
Contributor

Looks like they've fixed it already in ver 1.10.12 (we use 1.10.13). Would it be worthwhile just uncommenting line 86?

Contributor

or maybe in a separate PR?

Contributor Author

I marked this one as resolved by mistake. This came from a part of code I copied from @danielballan. Not exactly sure what needs to be done here, but I agree maybe it should be a separate PR.

https://github.com/bluesky/tiled/blame/e056aa9694375d2b034d69a0143d40845b6ee2bb/tiled/server/schemas.py#L436-L437

tiled/client/base.py Show resolved Hide resolved
@hyperrealist
Contributor Author

hyperrealist commented Mar 22, 2024

I pushed a couple of commits addressing suggestions from @danielballan and @genematx.

Also added an option in update_metadata for retrieving a JSON patch without applying it. May be useful for setting up a batch update via patch_metadata (fast).
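
To illustrate what "retrieving a JSON patch without applying it" could produce, here is a minimal sketch of computing a patch by diffing the current and desired metadata. `diff_to_json_patch` is a hypothetical helper written for this example; it recurses only into nested dicts and treats lists and scalars as atomic values. The client in this PR uses the jsonpatch package for this step.

```python
def _escape(token):
    """Escape a key for use in an RFC 6901 JSON Pointer."""
    return token.replace("~", "~0").replace("/", "~1")


def diff_to_json_patch(old, new, prefix=""):
    """Build a list of RFC 6902 operations that turns `old` into `new`."""
    ops = []
    for key in old:
        path = f"{prefix}/{_escape(key)}"
        if key not in new:
            ops.append({"op": "remove", "path": path})
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Recurse so that only the changed leaves are patched.
            ops.extend(diff_to_json_patch(old[key], new[key], path))
        elif old[key] != new[key]:
            ops.append({"op": "replace", "path": path, "value": new[key]})
    for key in new:
        if key not in old:
            path = f"{prefix}/{_escape(key)}"
            ops.append({"op": "add", "path": path, "value": new[key]})
    return ops


current = {"test": "old", "sample": {"name": "A"}, "stale": True}
desired = {"test": "testing", "sample": {"name": "A", "color": "red"}}

# Nothing is sent anywhere; the patch is just data that can be reused later.
patch = diff_to_json_patch(current, desired)
```

A patch computed once this way could then be sent to many nodes, which is the batch-update use case mentioned above.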

Member

@danielballan danielballan left a comment

One note in line on the latest changes.

Also:

  • @skarakuzu recently added a CHANGELOG. Can you add a line item for this?
  • Can these methods be added to the API reference? It looks like update_metadata was never added, and the new methods should be added too.

tiled/client/base.py Outdated Show resolved Hide resolved
@danielballan
Member

I’ve asked @Kezzsim to test drive this locally a bit before we merge. Merge away when you are satisfied, Kari!

@danielballan
Member

Rebased and force-pushed to resolve merge conflicts with #695

@danielballan
Member

Tests failed due to an incompatibility with pydantic 2.x, introduced by #695. I have pushed a commit with a fix.

@Kezzsim
Contributor

Kezzsim commented Mar 26, 2024

This looks good; however, I'm noticing a break from convention in how HTTP is used. Pulling up the Swagger docs at http://127.0.0.1:8000/docs reveals that application/json is the only MIME type in the FastAPI dropdown.
[Screenshot: jsonpatchproblem1]
Technically this is correct, as the body in @hyperrealist's spec is a JSON object which contains valid JSON patches, but the overall request body itself is not a JSON patch.

Example of current request body [dict]:

{
 "patch": [
   {
     "op": "replace",
     "path": "/test",
     "value": "testing"
   }
 ],
 "specs": null
}

Proposed json patch implementation of body [list]:
(conforming to RFC6902)

[
   {
     "op": "replace",
     "path": "/metadata/test",
     "value": "testing"
   }
 ]

Fixing this to have it be within spec would require one of two potential changes:

  1. Split the endpoint into PATCH /api/v1/metadata/{path} and PATCH /api/v1/specs/{path} which both take valid json-patch objects that alter their respective properties at the root path.
  2. (As seen in the example above) change the body to be a valid json-patch but make the RFC6902 path roots be metadata and specs respectively.
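
To make option 2 concrete: the server would receive a single RFC 6902 list and route each operation by the first token of its path. Below is a minimal sketch of that routing. `split_by_root` is a hypothetical helper, paths are written with the leading `/` that JSON Pointers require, the spec value is made up, and operations carrying a `from` field (move, copy) are not handled.

```python
def split_by_root(patch, roots=("metadata", "specs")):
    """Group operations by the first path token and make paths root-relative."""
    groups = {root: [] for root in roots}
    for op in patch:
        path = op["path"]
        if not path.startswith("/"):
            raise ValueError(f"JSON Pointer must start with '/': {path!r}")
        # "/metadata/test" -> root "metadata", remainder "/test"
        root, sep, rest = path[1:].partition("/")
        if root not in groups:
            raise ValueError(f"unsupported root {root!r}")
        groups[root].append({**op, "path": sep + rest})
    return groups


body = [
    {"op": "replace", "path": "/metadata/test", "value": "testing"},
    {"op": "add", "path": "/specs/0", "value": "example-spec"},
]
groups = split_by_root(body)
```

Each group could then be applied to its own document with an ordinary JSON Patch library.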

@danielballan mentioned pulling in @dylanmcreynolds since he has a good eye for making similar API design decisions that are both conforming to spec and aesthetically pleasing. I'd like to see what his suggestion is.

@hyperrealist
Contributor Author

Thanks @Kezzsim, this is something that had been bothering me in the background at times. I appreciate your attention to detail!

  1. Split the endpoint into PATCH /api/v1/metadata/{path} and PATCH /api/v1/specs/{path} which both take valid json-patch objects that alter their respective properties at the root path.

  2. (As seen in the example above) change the body to be a valid json-patch but make the RFC6902 path roots be metadata and specs respectively.

  1. If it is typical to have to update both metadata and specs together this would necessitate extra http requests. If that is not the case I think this is a very much viable solution.
  2. This is a creative way to conform to the http spec but I don't like how arguments are mixed together, especially since how they are mixed together would change depending on the mimetype (we support both application/json-patch+json and application/merge-patch+json. See here).
  3. A third solution might be to let the payload be of application/json mimetype and split the endpoint into something like PATCH /api/v1/patch_metadata/{path} and PATCH /api/v1/merge_metadata/{path}, each requiring a valid json with an embedded json patch or a merge patch respectively. But I think this is achieving little at the expense of polluting the API and adding to the complexity.
  4. A fourth solution is to enforce a sub-mimetype at the content level:
{
 "type": "application/json-patch+json",  # could default to this. may also support aliases  "json-patch"
 "metadata": [
   {
     "op": "replace",
     "path": "/test",
     "value": "testing"
   }
 ],
 "specs": [],
}

and

{
 "type": "application/merge-patch+json",  # or "merge-patch" / "merge"
 "metadata": {"test": "testing"},
 "specs": [],
}

All these solutions would conform to http spec, but I am leaning toward solutions 1 or 4 for their conceptual simplicity and ease of implementation. Anyway, I am completely open to other ideas and design considerations.
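
As a sketch of what solution 4 asks of the server: the `type` key decides which shape the `metadata` payload must have. The `check_body` function below is hypothetical and only validates shape; it is not the schema from this PR. Defaulting to JSON Patch follows the suggestion in the example above, and requiring a merge patch to be an object is an assumption that fits metadata (RFC 7386 itself allows any JSON value).

```python
JSON_PATCH = "application/json-patch+json"
MERGE_PATCH = "application/merge-patch+json"


def check_body(body):
    """Validate that body["metadata"] has the shape implied by body["type"]."""
    patch_type = body.get("type", JSON_PATCH)  # default floated in the proposal
    payload = body.get("metadata")
    if patch_type == JSON_PATCH:
        # A JSON Patch is a list of operation objects.
        ok = isinstance(payload, list) and all(
            isinstance(op, dict) and "op" in op and "path" in op for op in payload
        )
    elif patch_type == MERGE_PATCH:
        # Assumption: metadata merge patches are always JSON objects.
        ok = isinstance(payload, dict)
    else:
        raise ValueError(f"unsupported patch type: {patch_type!r}")
    if not ok:
        raise ValueError(f"metadata does not look like {patch_type}")
    return patch_type


json_patch_body = {
    "type": JSON_PATCH,
    "metadata": [{"op": "replace", "path": "/test", "value": "testing"}],
    "specs": [],
}
merge_patch_body = {"type": MERGE_PATCH, "metadata": {"test": "testing"}, "specs": []}

checked = [check_body(json_patch_body), check_body(merge_patch_body)]
```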

@danielballan
Member

Yes, good catch @Kezzsim! And thanks for presenting some additional options @hyperrealist.

Some additional design considerations:

Option (1) would break symmetry between GET and PATCH, as GET /metadata/{path} currently returns all "metadata" about a node, broadly defined, including structure family, structure, specs, data sources, and metadata. Making separate routes for PATCH seems asymmetrical in this sense, in addition to introducing overhead from separate requests for some use cases.

I can foresee adding support for patching data sources, alongside metadata and specs, which would make the downsides with (2) and (3) even more salient.

I can see no problems with (4). There was something appealing about specifying the patch format in the content-type header, but there is also something clean about making the request plain JSON (everybody knows JSON!) with the patch format specified in the body. It seems to keep our options open.

If we go that way, it is worth considering options for what to name the key. Is it content-type, format, mimetype, type?

@hyperrealist
Contributor Author

I very much intentionally wanted to avoid calling it content-type. I wouldn't envy whoever is documenting content-type the http header and content-type the subtype for this endpoint. mimetype sounds too restrictive to me as I would hesitate to support aliases under that name. Between format and type I like type because it hints at this being a mimetype without being overly restrictive. If adding a bit to the boilerplate is not an issue I would go with patch-type, metadata-patch, and specs-patch.

@danielballan
Member

Seeing 👍s from @Kezzsim and myself, I think going ahead with (4) with name type makes sense.

@dylanmcreynolds
Contributor

I very much intentionally wanted to avoid calling it content-type. I wouldn't envy whoever is documenting content-type the http header and content-type the subtype for this endpoint. mimetype sounds too restrictive to me as I would hesitate to support aliases under that name.

I'm not sure I understand the hesitance about calling it content-type. There's precedent here. For example, it normally appears in both the header and the body of multi-part form posts: https://swagger.io/docs/specification/describing-request-body/multipart-requests/

@hyperrealist
Contributor Author

hyperrealist commented Apr 17, 2024

CI errors for Python 3.11 are coming from dask's incompatibility with 3.11.9 (fixed in 2024.4.1). GitHub CI uses 3.11.9, so it is broken at the moment.

I pinned python<3.11.9 for now [on second thought, let me actually remove the pin so that you can see the errors], but I am not sure if I should attempt to resolve this issue in this PR. We should probably update to dask 2024.4.1.

With that note, I think this is ready for another review, @danielballan @Kezzsim @genematx. I did not implement patch for specs, but just for metadata. If you think patching specs makes sense I can add that.

detail="This node does not support update of metadata.",
)

if body.content_type in [
Member

@danielballan danielballan Apr 18, 2024

In the spirit of, "Adhere to standards wherever possible," I would prefer to start with strictly accepting application/json-patch+json and application/merge-patch+json here. The use case for aliases seems narrow to me. The MIME types are unambiguous, Google-able, and easy enough to copy/paste from examples, or achieve through tooling like the Tiled Python client.

At least, I would wait to add aliases until/unless there is demonstrated demand for it that cannot be easily addressed another way. It's easy to add this later if we need to, but hard to remove once supported.
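
One way to express that strictness is an enumeration with exactly two members, so that any alias fails at parse time. This is a sketch only; `PatchType` and `parse_patch_type` are hypothetical names, not identifiers from this PR.

```python
from enum import Enum


class PatchType(str, Enum):
    """The only two patch formats accepted, spelled as full MIME types."""

    JSON_PATCH = "application/json-patch+json"
    MERGE_PATCH = "application/merge-patch+json"


def parse_patch_type(value):
    """Return the matching PatchType, or raise ValueError for anything else."""
    try:
        return PatchType(value)
    except ValueError:
        allowed = ", ".join(t.value for t in PatchType)
        raise ValueError(f"content-type must be one of: {allowed}") from None


accepted = parse_patch_type("application/merge-patch+json")
```

Adding an alias later would mean adding a member or a lookup table, which is a backward-compatible change; removing one would not be.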


class HyphenizedBaseModel(pydantic.BaseModel):
    # This model configuration allows aliases like "content-type"
    model_config = ConfigDict(alias_generator=lambda f: f.replace("_", "-"))
Member

Clever!

@danielballan
Member

FYI, @hyperrealist, you may have seen this go by already but in case not, the CI failures on Python 3.11 are not the fault of this PR and can be ignored. See #715.

@danielballan
Member

I played with this locally. It works as advertised. I really like the ease of use of update_metadata...no more manually making deep copies in user code!

I took the liberty of rebasing on main and then renaming md_patch to metadata_patch. (I appreciate the case for brevity, but we don't abbreviate it anywhere else in the codebase, and I suspect it will be more often read than typed, so I am privileging clarity.)

Will merge when CI passes.

@danielballan danielballan merged commit 10677a4 into bluesky:main Jun 4, 2024
9 checks passed