Skip to content

feat: Add writer settings to enable collection sorting using a comparer #2363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 2, 2025

Conversation

MaggieKimani1
Copy link
Contributor

  • Changes the API surface area back to IDictionary to allow future implementations to use OrderedDictionary without it being a breaking change and continue allowing Swashbuckle customers to provide custom ordering.
  • IDictionary will require customers to explicitly provide the type until .NET adds implicit support
  • Adds OpenApiWriterSettings options to meet the need for deterministic ordering as per this comment:
    1. EnableSorting to enable sorting dictionary keys using a default comparer
    2. KeyComparer to allow consumers to plugin a custom comparer to sort dictionary entries

Fixes #2329

@MaggieKimani1 MaggieKimani1 requested a review from a team as a code owner May 21, 2025 08:19
@MaggieKimani1 MaggieKimani1 changed the title feat: Add writer settings to enable sorting using a comparer feat: Add writer settings to enable collection sorting using a comparer May 21, 2025
baywet
baywet previously requested changes May 21, 2025
@bkoelman
Copy link
Contributor

I'm pretty sure that ASP.NET does not want all dictionaries to be sorted when a comparer is provided, according to this comment. A similar concern applies to Swashbuckle.

Once collections (and not just dictionaries) are interfaces again, ASP.NET can initialize with a sorted collection at the places where sorting is desired. And in case SortedDictionary doesn't meet their performance requirements, they can provide their own IDictionary implementation that lazily sorts when enumerated (which is what the serializer does). Because everyone needs to initialize every collection anyway, now that they are made nullable, I don't see how providing a comparer in options would help anyone.

darrelmiller
darrelmiller previously approved these changes May 27, 2025
@darrelmiller
Copy link
Member

@bkoelman The primary goal of sorting the collections is to provide a deterministic order. Not sorting all of them would risk non-deterministic ordering. Obviously this approach is not going to suit everyone. Ordering using the writer settings is opt-in and now that everything is back to interfaces, based on your feedback, developers are free to provide implementations with specific ordering characteristics.

@bkoelman
Copy link
Contributor

First of all, thanks for reverting back to collection interfaces. This unblocks mine and other projects.

The primary goal of sorting the collections is to provide a deterministic order.

But the order is already deterministic, right? To the best of my knowledge, it has always been in this library, without the need for sorting support. This is because List<>, HashSet<> and Dictionary<,> preserve insertion order. So reading or writing the same DOM twice results in the exact same ordering. Can you provide a use case where ordering is not deterministic already? The only thing I can think of is parallel processing, but that requires collections to be thread-safe, so I don't think that happens in this library.

Ordering using the writer settings is opt-in and now that everything is back to interfaces, based on your feedback, developers are free to provide implementations with specific ordering characteristics.

I understand that using the writer settings is opt-in, and I'm happy with that. It's just that I can't think of a scenario where anyone would ever want to use it, given the way it is implemented in this PR: it's an all-or-nothing switch. Once a comparer is provided, then all dictionaries everywhere get sorted, which is what truly nobody wants. I'm just alerting that likely a useless feature is being added, possibly due to miscommunication with others. This won't help ASP.NET or Swashbuckle in any way, because they want to preserve declaration order for C# properties. They do want to sort the list of component schemas, but with the all-or-nothing comparer support provided here, they can't use that.

This is because the methods WriteCollectionInternal and WriteMapInternal (which do the sorting) are called from everywhere, not just for the list of component schemas.

@mikekistler
Copy link
Member

@bkoelman Here is a concrete example where we need to be able to sort path objects by path:

dotnet/eShop#680

Specifically:

https://github.com/dotnet/eShop/pull/680/files#diff-37fbe077a16c7f3aa77bef886e92d1cab152057067f1e21bf54442489892cd27

In this PR, the Catalog service adds a new API version while the previous version should be unchanged. But the diff shows 602 lines changed.

image

This is almost entirely because of reorderings, e.g. the "/api/catalog/items/by" path was originally ordered after "/api/catalog/items", but the changes cause it to now be ordered before "/api/catalog/items".

ASP.NET does not document the order in which endpoints are added to the document and developers should not depend on this ordering. So to achieve a deterministic ordering we want to emit the paths objects ordered by path.

@MaggieKimani1 MaggieKimani1 enabled auto-merge May 28, 2025 12:31
@bkoelman
Copy link
Contributor

@mikekistler Thanks for providing a use case. Again, this is about sorting paths only. Not about sorting everything in the document. For example, the CatalogBrand schema has properties id and brand. It wouldn't be correct if their ordering changed too.

This example essentially reinforces my point that providing a comparer won't solve the problem.

@MaggieKimani1 MaggieKimani1 dismissed baywet’s stale review May 29, 2025 09:14

Dismissing this as all feedback is implemented and this needs to be merged.

@darrelmiller
Copy link
Member

@bkoelman Appreciate you pushing on this. I understand this is an edge case but C# dictionaries are not deterministic. From the docs:

The order in which the items are returned is undefined.
https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.dictionary-2?view=net-9.0&redirectedfrom=MSDN

@mikekistler If the properties dictionary in schemas are sorted is that going to be concern for you?
I don't really want to only sort some hardcoded subset of dictionaries. Do we need to consider passing some set of names of properties to sort into the writer settings?

@bkoelman
Copy link
Contributor

bkoelman commented May 30, 2025

Oh yeah, I'm well aware of that. In my early days I implemented my own collection types with stronger guarantees. Until after a few years I realized almost every .NET app indirectly depends on this implementation detail. Dictionary<,> internally maintains an array exactly for this, because it's so extremely useful, at the cost of requiring extra memory. The whole .NET ecosystem would break if this ever changes, so it's a theoretical concern.

The same applies to the CLR memory model, whose implementation has way stronger guarantees than documented, and systems have taken a dependency on it. A few years back, those docs were updated to match the implementation, after ARM support was added that initially didn't provide the same guarantees, and everything broke.

And let's not forget that various improvements in .NET Core were reverted after a few releases when customers started reporting their .NET Framework code didn't work anymore. For example, that List<> implements IReadOnlyCollection<>, which is odd, but it was too late to change.

So I wouldn't worry about this for a second. This library was been deterministic for many years. If that weren't the case, it would have surfaced, because OAS files are typically under source control in consuming systems. That is, because Swashbuckle uses sorted collections in some places, which aspnet can also do already.

@bkoelman
Copy link
Contributor

Assuming that Mike's code relies on aspnet doing the sorting, confirm with @captainsafia, who has already indicated that sorting will not be applied to properties, but will only be applied to components here:

This isn't about reorder properties on schemas. It's about ordering in the for top-level items under the components property in the OpenAPI document.

@mikekistler
Copy link
Member

@darrelmiller I'm afraid it will be a problem if the properties in schemas are sorted lexicographically.

There may just a few other cases where we don't want lexicographic sorting. Others I can think of:

  • enum values
  • the tag names in the tags array of an operation
  • security schemes in the security of an operation -- maybe, unsure on this one.

In short, I think we would like a solution that lets us sort the things we want sorted, but leave alone things we'd prefer to leave in their "natural order".

For example, if we could provide a set of comparers on OpenAPI object types, e.g. Comparer<PathObject>, and the writer would sort those particular object types with that comparer if it was provided and otherwise leave the existing order. Could something like that work?

@baywet
Copy link
Member

baywet commented Jun 2, 2025

@mikekistler @bkoelman would that work for you if instead of having a single comparer, we had a bunch of named comparers in the settings? (e.g. PathSortingComparer, ComponentIdsSortingComparer, SchemaPropertiesSortingComparer, etc...) this way everyone gets granular control over what they want to sort? And if new collection arise from the standard, adding new comparers is additive (no breaking changes)

Copy link
Member

@baywet baywet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides the ongoing discussion about the desired behaviour

@MaggieKimani1 MaggieKimani1 merged commit d7eaf47 into main Jun 2, 2025
12 of 15 checks passed
@MaggieKimani1 MaggieKimani1 deleted the fix/revert-to-IDictionary-and-allow-sorting branch June 2, 2025 16:53
Copy link

sonarqubecloud bot commented Jun 2, 2025

Quality Gate Failed Quality Gate failed

Failed conditions
73.6% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@mikekistler
Copy link
Member

would that work for you if instead of having a single comparer, we had a bunch of named comparers in the settings? (e.g. PathSortingComparer, ComponentIdsSortingComparer, SchemaPropertiesSortingComparer, etc...) this way everyone gets granular control over what they want to sort? And if new collection arise from the standard, adding new comparers is additive (no breaking changes)

I think something like this will work for ASP.NET. I'm not sure why these have to be named comparers, as I think it could be just a set of comparers that holds generic Comparer<T>, but named comparers are fine. If they are going to be named, I think a standard naming convention like "Comparer" should be used. I think "SortingComparer" is a bit redundant and "Comparer" is preferable.

@bkoelman
Copy link
Contributor

bkoelman commented Jun 2, 2025

would that work for you if instead of having a single comparer, we had a bunch of named comparers in the settings?

Unfortunately, it's not that simple. Let's take an enum, which can be defined at various depths in the tree, ie, inside a not, oneOf, allOf, if/then/else, etc. If that enum originates from a user-defined model, it should not be sorted. But other enums that exist only for technical reasons should generally be sorted. For example, an enum for the discriminator in inheritance, because the ordering of CLR types in an assembly is not something OpenAPI consumers care about.

Besides that, only string comparers make sense (used for dictionary keys). A comparer on an IList<IOpenApiSchema> is impossible to implement in a generic way. How could someone ever sort a list that consists of different schema types? The only reasonable way to sort such a list is to implement a visitor/walker that recursively inspects subtrees.

I'm happy to see that this PR has been merged, because the interfaces unblock all use cases, as far as I'm aware. I recommend hiding the comparer support for the next .NET 10 preview release. As far as I understand it, ASP.NET won't benefit from their presence, but the good news is that they can now do whatever's needed themselves, since interfaces are back.

We're merely guessing at what the true concerns are for ASP.NET, so I think a conversation with the team should happen first, before releasing something that hasn't been fully thought through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow for deterministic sorting while serializing collections
5 participants