Merged
Changes from 1 commit
83 changes: 69 additions & 14 deletions azure/ConsiderationsForServiceDesign.md
Expand Up @@ -293,20 +293,20 @@ The operation is initiated with a POST operation and the operation path ends in

```text
POST /<service-or-resource-url>:<action>?api-version=2022-05-01
Operation-Id: 22
{
"arg1": 123
"arg2": "abc"
}
Operation-Id: 22

{
"arg1": 123
"arg2": "abc"
}
```

The response is a `202 Accepted` as described above.

```text
HTTP/1.1 202 Accepted
Operation-Location: https://<status-monitor-endpoint>/22

{
"id": "22",
"status": "NotStarted"
Expand All @@ -323,7 +323,7 @@ When the operation completes successfully, the result (if there is one) will be

```text
HTTP/1.1 200 OK

{
"id": "22",
"status": "Succeeded",
Expand All @@ -344,7 +344,7 @@ PUT /items/FooBar&api-version=2022-05-01
Operation-Id: 22

{
"prop1": 555,
"prop1": 555,
"prop2": "something"
}
```
Expand All @@ -358,13 +358,13 @@ The response may also include an `Operation-Location` header for backward compat
If the resource supports ETags, the response may contain an `etag` header and possibly an `etag` property in the resource.

```text
HTTP/1.1 201 Created
HTTP/1.1 201 Created
Operation-Id: 22
Operation-Location: https://items/operations/22
etag: "123abc"

{
"id": "FooBar",
"id": "FooBar",
"etag": "123abc",
"prop1": 555,
"prop2": "something"
Expand All @@ -381,7 +381,7 @@ When the additional processing completes, the status monitor will indicate if it

```text
HTTP/1.1 200 OK

{
"id": "22",
"status": "Succeeded"
Expand Down Expand Up @@ -412,8 +412,8 @@ POST /<status-monitor-url>:cancel?api-version=2022-05-01
A successful response to a control operation should be a `200 OK` with a representation of the status monitor.

```text
HTTP/1.1 200 OK
HTTP/1.1 200 OK

{
"id": "22",
"status": "Canceled"
Expand Down Expand Up @@ -515,6 +515,61 @@ For example, the client can specify an `If-Match` header with the last ETag valu
The service processes the update only if the ETag value in the header matches the ETag of the current resource on the server.
By computing and returning ETags for your resources, you enable clients to avoid using a strategy where the "last write always wins."
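
The conditional update described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory `ItemStore` whose `put` method returns a status code and the new ETag; the hashing scheme and all names are assumptions for the example, not part of the guidelines.

```python
import hashlib
import json


class ItemStore:
    """Hypothetical in-memory store that honors an If-Match precondition."""

    def __init__(self):
        self._items = {}

    @staticmethod
    def _etag(resource):
        # Assumption: derive the ETag from a hash of the canonical JSON body.
        body = json.dumps(resource, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

    def put(self, item_id, resource, if_match=None):
        """Return (status_code, etag); etag is None when the precondition fails."""
        current = self._items.get(item_id)
        if if_match is not None:
            # Process the update only if the caller's ETag matches the stored resource.
            if current is None or self._etag(current) != if_match:
                return 412, None  # Precondition Failed
        self._items[item_id] = dict(resource)
        return (201 if current is None else 200), self._etag(resource)


store = ItemStore()
created_status, etag1 = store.put("FooBar", {"prop1": 555})
updated_status, etag2 = store.put("FooBar", {"prop1": 556}, if_match=etag1)
# A second writer still holding the old ETag is rejected instead of overwriting.
stale_status, _ = store.put("FooBar", {"prop1": 557}, if_match=etag1)
```

The stale writer receives `412` and must re-read the resource before retrying, which is what prevents "last write always wins."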

## Returning String Offsets & Lengths (Substrings)

Some Azure services return substring offset & length values within a string. For example, the offset & length within a string to a name, email address, or phone #.
Contributor

nit phone # seems too informal? Just phone number?

When a service response includes a string, the client's programming language deserializes that string into that language's internal string encoding. Below are the possible encodings and examples of languages that use each encoding:

| Encoding | Example languages |
| -------- | ------- |
| UTF-8 | Go, Rust, Ruby, PHP |
| UTF-16 | JavaScript, Java, C# |
| CodePoint (UTF-32) | Python |
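
To make the table concrete, the sketch below measures one string in each of the three encodings. The sample text is an assumption chosen for illustration; it does not come from the guidelines.

```python
# Illustrative sample (an assumption): "José" followed by a space and a thumbs-up emoji.
text = "Jos\u00e9 \U0001F44D"

# UTF-8 length: number of bytes in the UTF-8 encoding.
utf8_len = len(text.encode("utf-8"))

# UTF-16 length: number of 16-bit code units (2 bytes each, no byte-order mark).
utf16_len = len(text.encode("utf-16-le")) // 2

# Code point length: Python's str is indexed by code point.
code_point_len = len(text)

print(utf8_len, utf16_len, code_point_len)  # prints: 10 7 6
```

The same six characters occupy 10, 7, or 6 units depending on the encoding, so an offset that is correct in one language indexes the wrong position in another.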

Because the service doesn't know what language a client is written in and what string encoding that language uses, the service can't return UTF-agnostic offset and length values that the client can use to index within the string. To address this, the service response must include offset & length values for all 3 possible encodings and then the client code must select the encoding it required by its language's internal string encoding.
Member

Grammar nit:

Suggested change
Because the service doesn't know what language a client is written in and what string encoding that language uses, the service can't return UTF-agnostic offset and length values that the client can use to index within the string. To address this, the service response must include offset & length values for all 3 possible encodings and then the client code must select the encoding it required by its language's internal string encoding.
Because the service doesn't know in what language a client is written and what string encoding that language uses, the service can't return UTF-agnostic offset and length values that the client can use to index within the string. To address this, the service response must include offset & length values for all 3 possible encodings and then the client code must select the encoding required by its language's internal string encoding.


For example, if a service response needed to identify offset & length values for "name" and "email" substrings, the JSON response would look like this:

```
{
  (... other properties not shown...)
  "fullString": "(...some string containing a name and an email address...)",
  "name": {
    "offset": {
      "utf8": 12,
      "utf16": 10,
      "codePoint": 4
Contributor

nit, we seems got 2 spaces here "codePoint": 4

    },
    "length": {
      "utf8": 10,
      "utf16": 8,
      "codePoint": 2
    }
  },
  "email": {
    "offset": {
      "utf8": 12,
      "utf16": 10,
      "codePoint": 4
    },
    "length": {
      "utf8": 10,
      "utf16": 8,
      "codePoint": 4
    }
  }
}
```

Then, the Go developer, for example, would get the substring containing the name using code like this:

```
response := client.SomeMethodReturningJSONShownAbove(...)
// Go strings are UTF-8 byte sequences, so slice with the utf8 offset and length.
start := response.name.offset.utf8
name := response.fullString[start : start+response.name.length.utf8]
```

The service must calculate the offset & length for all 3 encodings and return them because client developers find it difficult to work with Unicode encodings and to convert from one encoding to another. In other words, we do this to simplify client development and ensure customer success when isolating a substring.
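
A service-side calculation along these lines could look like the sketch below. It is only an illustration: the helper names `measure` and `locate` and the sample string are assumptions, and a real service would locate substrings by its own means rather than by a plain text search.

```python
def measure(s):
    """Return the size of s in UTF-8 bytes, UTF-16 code units, and code points."""
    return {
        "utf8": len(s.encode("utf-8")),
        "utf16": len(s.encode("utf-16-le")) // 2,
        "codePoint": len(s),
    }


def locate(full_string, substring):
    """Build the offset/length object for the first occurrence of substring."""
    start = full_string.index(substring)  # code point index; ValueError if absent
    return {
        # The offset in each encoding is the size of everything before the match.
        "offset": measure(full_string[:start]),
        "length": measure(substring),
    }


# Hypothetical sample: an emoji, a space, then a name and an email address.
full_string = "\U0001F44D Jos\u00e9 <jose@example.com>"
name = locate(full_string, "Jos\u00e9")

# A code point client (Python) slices the string directly.
cp_start = name["offset"]["codePoint"]
by_code_point = full_string[cp_start:cp_start + name["length"]["codePoint"]]

# A UTF-8 client (such as Go) slices the underlying bytes instead.
b_start = name["offset"]["utf8"]
raw = full_string.encode("utf-8")
by_utf8 = raw[b_start:b_start + name["length"]["utf8"]].decode("utf-8")
```

Both clients recover the same substring even though their offset values differ (2 code points versus 5 bytes here), because each one picks the values that match its own string encoding.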
Member

Should we also mention that it makes pass-through requests easier as well? That was the thing that really won me over. I think the same was true for @JeffreyRichter, IIRC.


## Getting Help: The Azure REST API Stewardship Board
The Azure REST API Stewardship board is a collection of dedicated architects that are passionate about helping Azure service teams build interfaces that are intuitive, maintainable, consistent, and most importantly, delight our customers. Because APIs affect nearly all downstream decisions, you are encouraged to reach out to the Stewardship board early in the development process. These architects will work with you to apply these guidelines and identify any hidden pitfalls in your design.

13 changes: 11 additions & 2 deletions azure/Guidelines.md
Expand Up @@ -16,6 +16,7 @@ Please ensure that you add an anchor tag to any new guidelines that you add and

| Date | Notes |
| ----------- | -------------------------------------------------------------- |
| 2024-Jan-17 | Added guidelines on returning string offsets & lengths |
| 2023-May-12 | Explain service response for missing/unsupported `api-version` |
| 2023-Apr-21 | Update/clarify guidelines on POST method repeatability |
| 2023-Apr-07 | Update/clarify guidelines on polymorphism |
Expand Down Expand Up @@ -438,7 +439,7 @@ This indicates to client libraries and customers that values of the enumeration

Polymorphism types in REST APIs refers to the possibility to use the same property of a request or response to have similar but different shapes. This is commonly expressed as a `oneOf` in JsonSchema or OpenAPI. In order to simplify how to determine which specific type a given request or response payload corresponds to, Azure requires the use of an explicit discriminator field.

Note: Polymorphic types can make your service more difficult for nominally typed languages to consume. See the corresponding section in the [Considerations for service design](./ConsiderationsForServiceDesign.md#avoid-surprises) for more information.
Note: Polymorphic types can make your service more difficult for nominally typed languages to consume. See the corresponding section in the [Considerations for service design](./ConsiderationsForServiceDesign.md#avoid-surprises) for more information.

<a href="#json-use-discriminator-for-polymorphism" name="json-use-discriminator-for-polymorphism">:white_check_mark:</a> **DO** define a discriminator field indicating the kind of the resource and include any kind-specific fields in the body.

Expand Down Expand Up @@ -838,7 +839,7 @@ For example:
### Repeatability of requests

Fault tolerant applications require that clients retry requests for which they never got a response, and services must handle these retried requests idempotently. In Azure, all HTTP operations are naturally idempotent except for POST used to create a resource and [POST when used to invoke an action](
https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#performing-an-action).
https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#performing-an-action).

<a href="#repeatability-headers" name="repeatability-headers">:ballot_box_with_check:</a> **YOU SHOULD** support repeatable requests as defined in [OASIS Repeatable Requests Version 1.0](https://docs.oasis-open.org/odata/repeatable-requests/v1.0/repeatable-requests-v1.0.html) for POST operations to make them retriable.
- The tracked time window (difference between the `Repeatability-First-Sent` value and the current time) **MUST** be at least 5 minutes.
Expand Down Expand Up @@ -1098,6 +1099,14 @@ While it may be tempting to use a revision/version number for the resource as th

<a href="#condreq-etag-depends-on-encoding" name="condreq-etag-depends-on-encoding">:white_check_mark:</a> **DO**, when supporting multiple representations (e.g. Content-Encodings) for the same resource, generate different ETag values for the different representations.

<a href="#substrings" name="substrings"></a>
### Returning String Offsets & Lengths (Substrings)

All string values in JSON are inherently Unicode and UTF-8 encoded, but clients written in a high-level programming language must work with strings in that language's string encoding, which may be UTF-8, UTF-16, or CodePoints (UTF-32).
When a service response includes a string offset or length value, it should specify these values in all 3 encodings to simplify client development and ensure customer success when isolating a substring.

<a href="#substrings-return-value-for-each-encoding" name="substrings-return-value-for-each-encoding">:white_check_mark:</a> **DO** include all 3 encodings (UTF-8, UTF-16, and CodePoint) for every string offset or length value in a service response.
Member

I think we should document here in this doc the exact format we want e.g., {"utf8": 2, "utf16": 1, "codePoint":1}. We document formats for LROs, pageables, and errors. How you expanded on that in "Considerations" is perfect, but you should also link to that section e.g.,

Suggested change
<a href="#substrings-return-value-for-each-encoding" name="substrings-return-value-for-each-encoding">:white_check_mark:</a> **DO** include all 3 encodings (UTF-8, UTF-16, and CodePoint) for every string offset or length value in a service response.
<a href="#substrings-return-value-for-each-encoding" name="substrings-return-value-for-each-encoding">:white_check_mark:</a> **DO** include all 3 encodings (UTF-8, UTF-16, and CodePoint) for every string offset or length value in a service response using the schema below. See [considerations](ConsiderationsForServiceDesign.md#{actual-stub-here}) for more information.
```json
{
  "length": {
    "utf8": 2,
    "utf16": 1,
    "codePoint": 1
  }
}
```


<a href="#telemetry" name="telemetry"></a>
### Distributed Tracing & Telemetry
Azure SDK client guidelines specify that client libraries must send telemetry data through the `User-Agent` header, `X-MS-UserAgent` header, and Open Telemetry.