Commit c7d483c

JeffreyRichter, mikekistler, and weidongxu-microsoft authored
Updated LRO guidelines (#517)
* Added LRO guidelines
* Restore all prior named guidelines with some edits where needed
* Address PR review feedback

Co-authored-by: Mike Kistler <[email protected]>
Co-authored-by: Weidong Xu <[email protected]>
1 parent 011dd3c commit c7d483c

2 files changed

+370
-127
lines changed

azure/ConsiderationsForServiceDesign.md

+166 -74
@@ -7,6 +7,7 @@
77

88
| Date | Notes |
99
| ----------- | -------------------------------------------------------------- |
10+
| 2024-Mar-17 | Updated LRO guidelines |
1011
| 2024-Jan-17 | Added guidelines on returning string offsets & lengths |
1112
| 2022-Jul-15 | Update guidance on long-running operations |
1213
| 2022-Feb-01 | Updated error guidance |
@@ -208,7 +209,7 @@ It is good practice to define the path for action operations that is easily dist
208209
2) use a special character not in the set of valid characters for resource names to distinguish the "action" in the path.
209210

210211
In Azure we recommend distinguishing action operations by appending a ':' followed by an action verb to the final path segment. E.g.
211-
```http
212+
```text
212213
https://.../<resource-collection>/<resource-id>:<action>?<input parameters>
213214
```
214215

@@ -217,7 +218,7 @@ cannot collide with a resource path that contains user-specified resource ids.
217218

218219
## Long-Running Operations
219220

220-
Long-running operations are an API design pattern that should be used when the processing of
221+
Long-running operations (LROs) are an API design pattern that should be used when the processing of
221222
an operation may take a significant amount of time -- longer than a client will want to block
222223
waiting for the result.
223224

@@ -226,33 +227,111 @@ a _status monitor_, which is an ephemeral resource that will track the status an
226227
The status monitor resource is distinct from the target resource (if any) and specific to the individual
227228
operation request.
228229

229-
A POST or DELETE operation returns a `202 Accepted` response with the status monitor in the response body.
230-
A long-running POST should not be used for resource create -- use PUT as described below.
231-
PATCH must never be used for long-running operations -- it should be reserved for simple resource updates.
232-
If a long-running update is required it should be implemented with POST.
230+
There are four types of LROs allowed in Azure REST APIs:
231+
232+
1. An LRO to create or replace a resource that involves additional long-running processing.
233+
2. An LRO to delete a resource.
234+
3. An LRO to perform an action on or with an existing resource (or resource collection).
235+
4. An LRO to perform an action not related to an existing resource (or resource collection).
236+
237+
The following sections describe these patterns in detail.
238+
239+
### Create or replace a resource requiring additional long-running processing
240+
<a href="#put-with-additional-long-running-processing"></a> <!-- Preserve anchor of previous heading -->
241+
242+
A special case of long-running operations that occurs often is a PUT operation to create or replace a resource
243+
that involves some additional long-running processing.
244+
One example is a resource that requires physical resources (e.g. servers) to be "provisioned" to make the resource functional.
245+
246+
In this case:
247+
- The operation must use the PUT method (NOTE: PATCH is never allowed here)
248+
- The URL identifies the resource being created or replaced.
249+
- The request and response body have identical schemas & represent the resource.
250+
- The request may contain an `Operation-Id` header that the service will use as
251+
the ID of the status monitor created for the operation.
252+
- If the `Operation-Id` matches an existing operation and the request content is the same,
253+
treat as a retry and return the same response as the earlier request.
254+
Otherwise fail the request with a `409-Conflict`.
255+
256+
```text
257+
PUT /items/FooBar?api-version=2022-05-01
258+
Operation-Id: 22
259+
260+
{
261+
"prop1": 555,
262+
"prop2": "something"
263+
}
264+
```
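
The `Operation-Id` retry rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `OperationStore`, `handle_put`, and the `process` callback are hypothetical names, the store is an in-memory dictionary, and the shape of the `409` error body is assumed rather than taken from the guidelines.

```python
from dataclasses import dataclass, field


@dataclass
class OperationStore:
    """Hypothetical in-memory record of the operations a service has seen."""

    # Maps an Operation-Id to (request body, response returned for it).
    seen: dict = field(default_factory=dict)

    def handle_put(self, operation_id, body, process):
        """Return (status_code, response_body) for a PUT carrying an Operation-Id.

        `process` is a hypothetical callback that performs the create or
        replace and returns (status_code, response_body).
        """
        if operation_id in self.seen:
            earlier_body, earlier_response = self.seen[operation_id]
            if earlier_body == body:
                # Same ID and same content: treat as a retry and replay
                # the response given to the earlier request.
                return earlier_response
            # Same ID but different content: fail with 409 Conflict.
            # The error body shape here is an assumption.
            return (409, {"error": {"code": "Conflict",
                                    "message": "Operation-Id is already in use."}})
        response = process(body)
        self.seen[operation_id] = (body, response)
        return response
```

A retry with the same ID and body returns the original `201` response unchanged, while reusing the ID with a different body yields a `409`.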
233265

234-
There is a special form of long-running operation initiated with PUT that is described
235-
in [Create (PUT) with additional long-running processing](./Guidelines.md#put-operation-with-additional-long-running-processing).
236-
The remainder of this section describes the pattern for long-running POST and DELETE operations.
266+
In this case the response to the initial request is a `201 Created` to indicate that
267+
the resource has been created or `200 OK` when the resource was replaced.
268+
The response body should be a representation of the resource that was created,
269+
and should include a `status` field indicating the current status of the resource.
270+
A status monitor is created to track the additional processing and the ID of the status monitor
271+
is returned in the `Operation-Id` header of the response.
272+
The response must also include an `Operation-Location` header for backward compatibility.
273+
If the resource supports ETags, the response may contain an `etag` header and possibly an `etag` property in the resource.
274+
275+
```text
276+
HTTP/1.1 201 Created
277+
Operation-Id: 22
278+
Operation-Location: https://items/operations/22
279+
etag: "123abc"
280+
281+
{
282+
"id": "FooBar",
283+
"status": "Provisioning",
284+
"prop1": 555,
285+
"prop2": "something",
286+
"etag": "123abc"
287+
}
288+
```
237289

238-
This diagram illustrates how a long-running operation with a status monitor is initiated and then how the client
290+
The client will issue a GET to the status monitor to obtain the status of the operation performing the additional processing.
291+
292+
```text
293+
GET https://items/operations/22?api-version=2022-05-01
294+
```
295+
296+
When the additional processing completes, the status monitor indicates if it succeeded or failed.
297+
298+
```text
299+
HTTP/1.1 200 OK
300+
301+
{
302+
"id": "22",
303+
"status": "Succeeded"
304+
}
305+
```
306+
307+
If the additional processing failed, the service may delete the original resource if it is not usable in this state,
308+
but should clearly document this behavior.
309+
310+
### Long-running delete operation
311+
312+
A long-running delete operation returns a `202 Accepted` with a status monitor which the client uses to determine the outcome of the delete.
313+
314+
The resource being deleted should remain visible (returned from a GET) until the delete operation completes successfully.
315+
316+
When the delete operation completes successfully, a client must be able to create a new resource with the same name without conflicts.
317+
318+
This diagram illustrates how a long-running DELETE operation is initiated and then how the client
239319
determines it has completed and obtains its results:
240320

241321
```mermaid
242322
sequenceDiagram
243323
participant Client
244324
participant API Endpoint
245325
participant Status Monitor
246-
Client->>API Endpoint: POST/DELETE
326+
Client->>API Endpoint: DELETE
247327
API Endpoint->>Client: HTTP/1.1 202 Accepted<br/>{ "id": "22", "status": "NotStarted" }
248328
Client->>Status Monitor: GET
249329
Status Monitor->>Client: HTTP/1.1 200 OK<br/>Retry-After: 5<br/>{ "id": "22", "status": "Running" }
250330
Client->>Status Monitor: GET
251331
Status Monitor->>Client: HTTP/1.1 200 OK<br/>{ "id": "22", "status": "Succeeded" }
252332
```
253333

254-
1. The client sends the request to initiate the long-running operation.
255-
The initial request could be a POST or DELETE method.
334+
1. The client sends the request to initiate the long-running DELETE operation.
256335
The request may contain an `Operation-Id` header that the service uses as the ID of the status monitor created for the operation.
257336

258337
2. The service validates the request and initiates the operation processing.
@@ -261,8 +340,8 @@ Otherwise the service responds with a `202-Accepted` HTTP status code.
261340
The response body is the status monitor for the operation including the ID, either from the request header or generated by the service.
262341
When returning a status monitor whose status is not in a terminal state, the response must also include a `retry-after` header indicating the minimum number of seconds the client should wait
263342
before polling (GETing) the status monitor URL again for an update.
264-
For backward compatibility, the response may also include an `Operation-Location` header containing the absolute URL
265-
of the status monitor resource (without an api-version query parameter).
343+
For backward compatibility, the response must also include an `Operation-Location` header containing the absolute URL
344+
of the status monitor resource, including an api-version query parameter.
266345

267346
3. After waiting at least the amount of time specified by the previous response's `Retry-After` header,
268347
the client issues a GET request to the status monitor using the ID in the body of the initial response.
@@ -275,14 +354,11 @@ If the operation is still being processed, the status field will contain a "non-
275354

276355
5. After the operation processing completes, a GET request to the status monitor returns the status monitor with a status field set to a terminal value -- `Succeeded`, `Failed`, or `Canceled` -- that indicates the result of the operation.
277356
If the status is `Failed`, the status monitor resource contains an `error` field with a `code` and `message` that describes the failure.
278-
If the status is `Succeeded` and the LRO is an Action operation, the operation results will be returned in the `result` field of the status monitor.
279-
If the status is `Succeeded` and the LRO is an operation on a resource, the client can perform a GET on the resource
280-
to observe the result of the operation if desired.
281357

282-
6. There may be some cases where a long-running operation can be completed before the response to the initial request.
358+
6. There may be some cases where a long-running DELETE operation can be completed before the response to the initial request.
283359
In these cases, the operation should still return a `202 Accepted` with the `status` property set to the appropriate terminal state.
284360

285-
7. The service is responsible for purging the status-monitor resource.
361+
7. The service is responsible for purging the status monitor resource.
286362
It should auto-purge the status monitor resource no earlier than 24 hours after the operation completes.
287363
The service may offer DELETE of the status monitor resource due to GDPR/privacy.
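
From the client's side, steps 3 through 6 above amount to a polling loop. The following is a minimal sketch, not a definitive client: `wait_for_completion`, `get_status`, and `default_delay` are hypothetical names, `get_status` stands in for whatever HTTP call fetches the status monitor by ID, and `Retry-After` is assumed to carry a number of seconds.

```python
import time

# Terminal status values named in the guidelines.
TERMINAL_STATES = {"Succeeded", "Failed", "Canceled"}


def wait_for_completion(initial, get_status, sleep=time.sleep, default_delay=1.0):
    """Poll a status monitor until it reaches a terminal state.

    `initial` is the (headers, body) pair from the 202 response.
    `get_status` is a hypothetical callable that takes the operation ID
    and returns the (headers, body) pair of a GET on the status monitor.
    """
    headers, monitor = initial
    while monitor["status"] not in TERMINAL_STATES:
        # Header names are case-insensitive, so normalize before lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        # Wait at least as long as the previous response asked for.
        sleep(float(lowered.get("retry-after", default_delay)))
        headers, monitor = get_status(monitor["id"])
    return monitor
```

If the initial response already carries a terminal status (step 6), the loop body never runs and no GET is issued.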
288364

@@ -292,6 +368,9 @@ An action operation that is also long-running combines the [Action Operations](#
292368
with the [Long Running Operations](#long-running-operations) pattern.
293369

294370
The operation is initiated with a POST operation and the operation path ends in `:<action>`.
371+
A long-running POST should not be used for resource create: use PUT as described above.
372+
PATCH must never be used for long-running operations: it should be reserved for simple resource updates.
373+
If a long-running update is required it should be implemented with POST.
295374

296375
```text
297376
POST /<service-or-resource-url>:<action>?api-version=2022-05-01
@@ -303,7 +382,7 @@ Operation-Id: 22
303382
}
304383
```
305384

306-
The response is a `202 Accepted` as described above.
385+
A long-running action operation returns a `202 Accepted` response with the status monitor in the response body.
307386

308387
```text
309388
HTTP/1.1 202 Accepted
@@ -333,82 +412,95 @@ HTTP/1.1 200 OK
333412
}
334413
```
335414

336-
### PUT with additional long-running processing
415+
This diagram illustrates how a long-running action operation is initiated and then how the client
416+
determines it has completed and obtains its results:
337417

338-
A special case of long-running operation that occurs often is a PUT operation to create or replace a resource
339-
that involves some additional long-running processing.
340-
One example is a resource requires physical resources (e.g. servers) to be "provisioned" to make the resource functional.
341-
In this case, the request may contain an `Operation-Id` header that the service will use as
342-
the ID of the status monitor created for the operation.
418+
```mermaid
419+
sequenceDiagram
420+
participant Client
421+
participant API Endpoint
422+
participant Status Monitor
423+
Client->>API Endpoint: POST
424+
API Endpoint->>Client: HTTP/1.1 202 Accepted<br/>{ "id": "22", "status": "NotStarted" }
425+
Client->>Status Monitor: GET
426+
Status Monitor->>Client: HTTP/1.1 200 OK<br/>Retry-After: 5<br/>{ "id": "22", "status": "Running" }
427+
Client->>Status Monitor: GET
428+
Status Monitor->>Client: HTTP/1.1 200 OK<br/>{ "id": "22", "status": "Succeeded", "result": { ... } }
429+
```
343430

344-
```text
345-
PUT /items/FooBar&api-version=2022-05-01
346-
Operation-Id: 22
431+
1. The client sends the request to initiate the long-running action operation.
432+
The request may contain an `Operation-Id` header that the service uses as the ID of the status monitor created for the operation.
347433

348-
{
349-
"prop1": 555,
350-
"prop2": "something"
351-
}
352-
```
434+
2. The service validates the request and initiates the operation processing.
435+
If there are any problems with the request, the service responds with a `4xx` status code and error response body.
436+
Otherwise the service responds with a `202-Accepted` HTTP status code.
437+
The response body is the status monitor for the operation including the ID, either from the request header or generated by the service.
438+
When returning a status monitor whose status is not in a terminal state, the response must also include a `retry-after` header indicating the minimum number of seconds the client should wait
439+
before polling (GETing) the status monitor URL again for an update.
440+
For backward compatibility, the response may also include an `Operation-Location` header containing the absolute URL
441+
of the status monitor resource, including an api-version query parameter.
353442

354-
In this case the response to the initial request is a `201 Created` to indicate that the resource has been created
355-
or `200 OK` when the resource was replaced.
356-
The response body contains a representation of the created resource, which is the standard pattern for a create operation.
357-
A status monitor is created to track the additional processing and the ID of the status monitor
358-
is returned in the `Operation-Id` header of the response.
359-
The response may also include an `Operation-Location` header for backward compatibility.
360-
If the resource supports ETags, the response may contain an `etag` header and possibly an `etag` property in the resource.
443+
3. After waiting at least the amount of time specified by the previous response's `Retry-After` header,
444+
the client issues a GET request to the status monitor using the ID in the body of the initial response.
445+
The GET operation for the status monitor is documented in the REST API definition and the ID
446+
is the last URL path segment.
361447

362-
```text
363-
HTTP/1.1 201 Created
364-
Operation-Id: 22
365-
Operation-Location: https://items/operations/22
366-
etag: "123abc"
448+
4. The status monitor responds with information about the operation including its current status,
449+
which should be represented as one of a fixed set of string values in a field named `status`.
450+
If the operation is still being processed, the status field will contain a "non-terminal" value, like `NotStarted` or `Running`.
367451

368-
{
369-
"id": "FooBar",
370-
"etag": "123abc",
371-
"prop1": 555,
372-
"prop2": "something"
373-
}
374-
```
452+
5. After the operation processing completes, a GET request to the status monitor returns the status monitor with a status field set to a terminal value -- `Succeeded`, `Failed`, or `Canceled` -- that indicates the result of the operation.
453+
If the status is `Failed`, the status monitor resource contains an `error` field with a `code` and `message` that describes the failure.
454+
If the status is `Succeeded`, the operation results (if any) are returned in the `result` field of the status monitor.
375455

376-
The client will issue a GET to the status monitor to obtain the status of the operation performing the additional processing.
456+
6. There may be some cases where a long-running action operation can be completed before the response to the initial request.
457+
In these cases, the operation should still return a `202 Accepted` with the `status` property set to the appropriate terminal state.
377458

378-
```text
379-
GET https://items/operations/22?api-version=2022-05-01
380-
```
459+
7. The service is responsible for purging the status monitor resource.
460+
It should auto-purge the status monitor resource no earlier than 24 hours after the operation completes.
461+
The service may offer DELETE of the status monitor resource due to GDPR/privacy.
381462

382-
When the additional processing completes, the status monitor will indicate if it succeeded or failed.
463+
### Long-running action operation not related to a resource
383464

384-
```text
385-
HTTP/1.1 200 OK
465+
When a long-running action operation is not related to a specific resource (a batch operation is one example),
466+
another approach is needed.
386467

387-
{
388-
"id": "22",
389-
"status": "Succeeded"
390-
}
391-
```
468+
This type of LRO should be initiated with a PUT method on a URL that represents the operation to be performed,
469+
and includes a final path parameter for the user-specified operation ID.
470+
The response of the PUT includes a response body containing a representation of the status monitor for the operation
471+
and an `Operation-Location` response header that contains the absolute URL of the status monitor.
472+
In this type of LRO, the status monitor should include any information from the request used to initiate the operation,
473+
so that a failed operation could be reissued if necessary.
392474

393-
If the additional processing failed, the service may delete the original resource if it is not usable in this state,
394-
but would have to clearly document this behavior.
475+
Clients will use a GET on the status monitor URL to obtain the status and results of the operation.
476+
Since the HTTP semantic for PUT is to create a resource, the same schema should be used for the PUT request body,
477+
the PUT response body, and the response body of the GET for the status monitor for the operation.
478+
For this type of LRO, the status monitor URL should be the same URL as the PUT operation.
395479

396-
### Long-running delete operation
480+
The following examples illustrate this pattern.
397481

398-
A long-running delete operation follows the general pattern of a long-running operation --
399-
it returns a `202 Accepted` with a status monitor which the client uses to determine the outcome of the delete.
482+
```text
483+
PUT /translate-operations/<operation-id>?api-version=2022-05-01
400484
401-
The resource being deleted should remain visible (returned from a GET) until the delete operation completes successfully.
485+
<JSON body with parameters for the operation>
486+
```
487+
488+
Note that the client specifies the operation id in the URL path.
402489

403-
When the delete operation completes successfully, a client must be able to create new resource with same name without conflicts.
490+
A successful response to the PUT operation should have a `201 Created` status and response body
491+
that contains a representation of the status monitor _and_ any information from the request used to initiate the operation.
492+
493+
The service is responsible for purging the status monitor after some period of time,
494+
but no earlier than 24 hours after the completion of the operation.
495+
The service may offer DELETE of the status monitor resource due to GDPR/privacy.
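
As a rough illustration of this pattern, the sketch below builds the initial response for such a PUT. It is an assumption-laden example rather than a prescribed implementation: `create_operation`, `base_url`, and the request field names are hypothetical, and the initial status value `NotStarted` is borrowed from the earlier examples.

```python
def create_operation(base_url, operation_id, request_body, api_version="2022-05-01"):
    """Build (status_code, headers, body) for the PUT that starts the operation.

    The status monitor lives at the same URL as the PUT, and its body
    echoes the request so a failed operation could be reissued.
    """
    monitor_url = (
        f"{base_url}/translate-operations/{operation_id}"
        f"?api-version={api_version}"
    )
    # Request fields first, so the service-owned fields always win.
    body = {**request_body, "id": operation_id, "status": "NotStarted"}
    headers = {"Operation-Location": monitor_url}
    return 201, headers, body
```

The same body schema would then be returned by a GET on `monitor_url`, with `status` advancing as the operation progresses.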
404496

405497
### Controlling a long-running operation
406498

407499
It might be necessary to support some control action on a long-running operation, such as cancel.
408500
This is implemented as a POST on the status monitor endpoint with `:<action>` added.
409501

410502
```text
411-
POST /<status-monitor-url>:cancel?api-version=2022-05-01
503+
POST /<status-monitor-endpoint>:cancel?api-version=2022-05-01
412504
```
413505

414506
A successful response to a control operation should be a `200 OK` with a representation of the status monitor.
