[PMM-287] OpenAPI: Retries (#26)

philsturgeon · chailandau · web-flow · commit ef3afb8a9f0a · 2025-05-19T11:19:12.000-04:00
Co-authored-by: Chai Landau &lt;112015853+chailandau@users.noreply.github.com&gt;
diff --git a/openapi/responses/rate-limiting.mdx b/openapi/responses/rate-limiting.mdx
@@ -103,6 +103,8 @@ Retry-After: 3600
 }
 ```
 
+*Learn more about retries [over here](/openapi/retries).*
+
 ## Rate limiting headers
 
 Rate limiting headers are HTTP headers that provide information about the rate
@@ -153,43 +155,6 @@ Content-Type: application/json
 }
 ```
 
-## Examples or constants
-
-When describing some of these headers it feels important to provide a bit of
-context about what the headers contain, either with an exact value, or if that's
-not possible then some examples of what values might appear.
-
-Here is an example of how to do that with the `Retry-After` header, using the Header Object `example` field:
-
-```yaml
-headers:
-  Retry-After:
-    description: The number of seconds to wait before making another request.
-    schema:
-      type: integer
-    example: 3600
-```
-
-Using an example here is important because the value of `Retry-After` can vary
-depending on the billing plan the user is on. For example, a free plan might
-have a `Retry-After` value of 60 seconds, while a paid plan might have a
-`Retry-After` value of 10 seconds.
-
-If the value of `Retry-After` is always the same, then using `const` inside the
-`schema` object is a good way to indicate that.
-
-```yaml
-headers:
-  Retry-After:
-    description: The number of seconds to wait before making another request.
-    schema:
-      type: integer
-      const: 3600
-```
-
-This indicates that the `Retry-After` value is always 3600 seconds, regardless of
-the billing plan the user is on.
-
 ## Rate limiting headers draft RFC
 
 The `X-RateLimit-*` headers are not standard HTTP headers, and they are not
@@ -206,7 +171,7 @@ requests allowed.
 The following example shows a `RateLimit` header with a policy named "default",
 which has another 50 requests allowed in the next 30 seconds.
 
-```
+```http
 RateLimit: "default";r=50;t=30
 ```
 
diff --git a/openapi/responses/retries.mdx b/openapi/responses/retries.mdx
@@ -0,0 +1,232 @@
+---
+title: Retries
+description: Communicate to API consumers when they should retry requests, and how
+  long to wait before retrying. This can help reduce the number of requests being made to an API 
+  when its having a rough time, and can also help consumers avoid tripping over on rate limits.
+---
+
+Retries are a common pattern in API design, especially when dealing with
+transient errors or rate limits, or connection issues that could have just been
+a bit getting flipped somewhere in the ether, but a second attempt would likely
+work.
+
+API consumers may or may not automatically retry failed requests, and if they do, they
+may not know how long to wait before retrying. This is especially true for rate
+limits, where the API may be able to handle a burst of requests, but then
+throttle the consumer for a period of time to avoid overwhelming the system.
+
+To avoid confusing consumers who just want to work with an API, it's a good idea
+to communicate to them how retries should be used in the API, with that
+information clearly explained in API documentation and automatically handled by
+SDKs.
+
+# Retries in OpenAPI
+
+Whilst OpenAPI does not having any dedicated functionality describing retries,
+it does not need to, as HTTP itself has retries covered.
+
+The `Retry-After` header is defined in RFC 9110 as the standard way to
+communicate that a client or server failure has happened, but that a retry is
+possible after a certain amount of time. 
+
+```yaml
+responses:
+  "429":
+    description: Too Many Requests
+    headers:
+      Retry-After:
+        description: The number of seconds to wait before retrying the request.
+        schema:
+          type: integer
+          example: 120
+```
+
+The most common usage of Retry-After is using the number of seconds to wait
+before trying again, but it can also be used to communicate a date and time when
+the request can be retried. 
+
+```yaml
+responses:
+  "429":
+    description: Too Many Requests
+    headers:
+      Retry-After:
+        description: The number of seconds to wait before retrying the request.
+        schema:
+          type: string
+          format: date-time
+          example: 2025-07-01T12:00:00Z
+```
+
+This header is used in a number of different scenarios, including:
+
+- 404 Not Found - The server MAY send a Retry-After header field to indicate that the resource is temporarily unavailable, or could exist in the near future.
+- 408 Request Timeout - The server MAY send a Retry-After header field to indicate that it is temporary and after what time the client MAY try again.
+- 409 Conflict - The server MAY send a Retry-After header field to indicate that the conflict could be temporary and suggest a time where it may have been resolved.
+- 413 Content Too Large - If the condition is temporary, the server could send a Retry-After header field to indicate that it is temporary and the client could try again.
+- 429 Too Many Requests - The client has sent too many requests in a given amount of time, and the server is asking the client to wait before sending more requests.
+- 503 Service Unavailable - A service may be unavailable temporarily, so a Retry-After header field could suggest an appropriate amount of time for the client to wait before checking if the service is back.
+
+Some other 5XX errors could well be retryable, such as 504 Gateway Timeout, and
+a 507 Insufficient Storage might resolve itself, but it's possible to go too
+far. For example, 501 Not Implemented could be seen as temporary because it
+could be implemented soon, but nobody wants API consumers retrying to API
+forever waiting for a feature to be deployed which is never going to be
+deployed.
+
+Describing every single instance of anything that could ever happen in OpenAPI
+is not the goal of any API documentation because that is impossible. Focusing on
+key areas where retries are likely to be needed is a good balance, and from
+there, API consumers can use their own judgement on whether to retry or not.
+
+## Examples or constants
+
+When describing some of these headers it feels important to provide a bit of
+context about what the headers contain, either with an exact value, or if that's
+not possible then some examples of what values might appear.
+
+Here is an example of how to do that with the `Retry-After` header, using the
+Header Object `example` field:
+
+```yaml
+headers:
+  Retry-After:
+    description: The number of seconds to wait before making another request.
+    schema:
+      type: integer
+    example: 3600
+```
+
+Using an example here is important because the value of `Retry-After` can vary
+depending on the billing plan the user is on. For example, a free plan might
+have a `Retry-After` value of 60 seconds, while a paid plan might have a
+`Retry-After` value of 10 seconds.
+
+If the value of `Retry-After` is always the same, then using `const` inside the
+`schema` object is a good way to indicate that (available from OpenAPI v3.1).
+
+```yaml
+headers:
+  Retry-After:
+    description: The number of seconds to wait before making another request.
+    schema:
+      type: integer
+      const: 3600
+```
+
+This indicates that the `Retry-After` value is always 3600 seconds, regardless of
+the billing plan the user is on.
+
+## x-speakeasy-retries
+
+Another ways to communicate how retries should work, is using the Speakeasy
+vendor extension `x-speakeasy-retries`. Using this extension, OpenAPI can
+communicate retry for a particular operation, or the API as a whole, to API
+consumers. 
+
+This extension can be used to specify the number of retries, the delay between
+retries, and the maximum delay.
+
+```yaml
+/webhooks/subscribe:
+  post:
+    operationId: subscribeToWebhooks
+    servers:
+      - url: https://speakeasy.bar
+    x-speakeasy-usage-example:
+      tags:
+        - server
+        - retries
+    x-speakeasy-retries:
+      strategy: backoff
+      backoff:
+        initialInterval: 10
+        maxInterval: 200
+        maxElapsedTime: 1000
+        exponent: 1.15
+```
+
+This example shows how to use the `x-speakeasy-retries` extension to specify a
+"backoff" strategy for retries, which is a common pattern for retrying failed
+requests with longer waits between retries. The `initialInterval` is the time to
+wait before the first retry, and the `maxInterval` is the maximum time to wait
+between retries. The `maxElapsedTime` is the maximum time to wait before giving
+up on retries, and the `exponent` is the exponent to use for the backoff
+calculation.
+
+This approach could be used as well as, or instead of, the `Retry-After` header.
+It would be a good way to add automatic retries to SDKs which means the client
+will be automatically retrying things without needing to learn everything about
+which status codes are retryable, and how long to wait before retrying.
+
+## Rate Limits
+
+Rate limits are a common use case for retries, and the `Retry-After` header is a
+great way to communicate to API consumers when they should retry requests, and
+how long to wait before retrying.
+
+Learn more about [rate limiting with OpenAPI](/openapi/rate-limiting).
+
+## Best practices
+
+### Examples over constants
+
+When describing the `Retry-After` header, it's a good idea to use examples instead of
+constants to allow for flexibility in the API. Some APIs will push up the
+`Retry-After` value during times of high load, and some APIs will have different
+`Retry-After` values depending on the billing plan the user is on. Using
+examples allows for this flexibility, whilst still letting API consumers know
+what sort of values they can expect (e.g: integer vs date-time).
+
+### Expand Retry-After in periods of high load
+
+When the API is under heavy load, it's a good idea to expand the `Retry-After`
+header to allow for longer wait times between retries. This can help reduce the
+number of requests being made to the API, and can also help consumers avoid
+tripping over on rate limits.
+
+### Use exponential backoff
+
+When retrying requests, it's a good idea to suggest an exponential backoff
+approach, as a server which is struggling to keep up with requests does not need
+to be hammered with even more requests. Sometimes it needs a moment to recover,
+and clients giving larger gaps between retries can help with that.
+
+### Document alternative solutions
+
+Once the maximum number of retries has been reached, it's is likely to be an
+actual outage instead of simply a blip in availability. The API client and their
+end-users should be informed of this, and given alternative solutions to the
+problem. This could be a link to a status page, or a support email address.
+
+For example, at a coworking space, when the conference room booking system was
+down, the client would replace the "Submit Booking" button with a message saying
+"The booking system is down, please email somebody@example.org". OpenAPI is a
+great way to communicate this sort of information to API clients so they can use
+it to inform their users.
+
+```yaml
+responses:
+  "503":
+    description: Service Unavailable
+    headers:
+      Retry-After:
+        description: The number of seconds to wait before making another request.
+        schema:
+          type: integer
+        example: 3600
+    content:
+      application/problem+json:
+        schema:
+          type: object
+          properties:
+            type:
+              type: string
+              const: "https://example.com/probs/service-unavailable"
+            title:
+              type: string
+              const: Service Unavailable
+            detail:
+              type: string
+              const: The service is currently unavailable, please try again later, or contact support@example.org to report the issue.
+```