hhow09's Blog

Idempotency

Definition [1] #

Definition in Wikipedia

Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application.

idempotent elements are the functions f: E → E […] such that for all x in E, f(f(x)) = f(x)

In computer science, the term idempotence may have a different meaning depending on the context in which it is applied
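The mathematical definition above can be checked directly on small functions. The following is a minimal sketch; `abs`, `increment`, and `isIdempotentAt` are illustrative names, not taken from any library.

```go
package main

// abs is idempotent: abs(abs(x)) == abs(x).
func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// increment is not idempotent: applying it twice differs from applying it once.
func increment(x int) int {
	return x + 1
}

// isIdempotentAt reports whether f(f(x)) == f(x) holds at the sample point x.
// It checks a single point only; it does not prove the property for all x.
func isIdempotentAt(f func(int) int, x int) bool {
	return f(f(x)) == f(x)
}
```

Calling `isIdempotentAt(abs, -3)` yields `true`, while `isIdempotentAt(increment, 1)` yields `false`.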

In HTTP protocol #

GET, PUT, and DELETE should be implemented in an idempotent manner according to the standard, but POST doesn't need to be.
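To make the difference between the methods concrete, here is a rough sketch with a hypothetical in-memory `store`. The methods `put`, `post`, and `del` only mimic the semantics of PUT, POST, and DELETE; no real HTTP is involved.

```go
package main

// store is a hypothetical in-memory resource collection.
type store struct {
	nextID int
	items  map[int]string
}

func newStore() *store {
	return &store{nextID: 1, items: map[int]string{}}
}

// put mimics PUT /items/{id}: it sets the resource at a client-chosen ID.
// Repeating the same call leaves the store in the same state.
func (s *store) put(id int, v string) {
	s.items[id] = v
}

// post mimics POST /items: the server picks a fresh ID on every call,
// so repeating the call creates one more resource each time.
func (s *store) post(v string) int {
	id := s.nextID
	s.nextID++
	s.items[id] = v
	return id
}

// del mimics DELETE /items/{id}: deleting an absent item changes nothing.
func (s *store) del(id int) {
	delete(s.items, id)
}
```

Calling `put(7, "a")` twice leaves one item, whereas calling `post("b")` twice leaves two.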

In event-driven systems #

In event stream processing, idempotence refers to the ability of a system to produce the same outcome, even if the same file, event or message is received more than once.

Why do we need idempotency in distributed systems? #

Request retries are inevitable #

Failures Happen [2] #

Many kinds of failures become apparent as requests taking longer than usual, and potentially never completing. When a client is waiting longer than usual for a request to complete, it also holds on to the resources it was using for that request for a longer time. When a number of requests hold on to resources for a long time, the server can run out of those resources. These resources can include memory, threads, connections, ephemeral ports, or anything else that is limited.

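With idempotency, a retry produces the same side effect as the original request instead of repeating it.

E.g. for a POST that creates a record: the first time, the record is inserted and returned. The second time, the server finds the stored result and simply returns it again.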
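The retry scenario can be sketched as follows. Everything here is hypothetical: `retryWithKey` is an illustrative client helper (with plain exponential backoff and no jitter), and `server` is an in-memory stand-in whose reply gets lost a configurable number of times after the work has already been done.

```go
package main

import (
	"errors"
	"time"
)

// errTimeout stands in for a transient failure such as a lost reply.
var errTimeout = errors.New("timeout")

// retryWithKey calls send up to attempts times (attempts >= 1), always with
// the same key, waiting backoff, 2*backoff, 4*backoff, ... between attempts.
func retryWithKey(key string, attempts int, backoff time.Duration,
	send func(key string) (string, error)) (string, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(backoff << (i - 1))
		}
		res, err := send(key)
		if err == nil {
			return res, nil
		}
		lastErr = err
	}
	return "", lastErr
}

// server is a hypothetical in-memory server keyed by idempotency key.
type server struct {
	records     map[string]string
	inserts     int // how many records were actually inserted
	lostReplies int // how many upcoming replies get lost
}

// createRecord inserts the record on first sight of key and otherwise
// returns the stored one, so a retry never inserts twice.
func (s *server) createRecord(key string) (string, error) {
	rec, ok := s.records[key]
	if !ok {
		s.inserts++
		rec = "record-for-" + key
		s.records[key] = rec
	}
	if s.lostReplies > 0 {
		s.lostReplies--
		return "", errTimeout // the work was done, but the client never hears back
	}
	return rec, nil
}
```

With two lost replies and three attempts, the client eventually receives the record while the server has inserted it only once.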

Duplicate Messages are Inevitable. [3] [4] #

I'm coining the phrase "effectively-once" for message processing with at-least-once + idempotent operations. [5]

In an event-driven system, duplicate messages can arise on both the producer side and the consumer side.

On the producer side, duplicate messages happen when the message broker fails to acknowledge a message due to a temporary error, so the producer sends it again.

On the consumer side, at-least-once delivery is the most common setting, since exactly-once delivery comes with much more complexity, extra configuration, and a performance cost.

An application typically uses a message broker, such as Apache Kafka or RabbitMQ, that implements at-least once delivery. At-least once delivery ensures that messages will be delivered. It does mean, however, that the message broker can invoke a message handler repeatedly for the same message. You must use the Idempotent Consumer pattern to ensure that your message handlers correctly handle duplicate messages.
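A minimal sketch of the Idempotent Consumer pattern described above, assuming every message carries a unique ID. The `message` and `consumer` types are hypothetical and keep everything in memory; in a real system the processed-ID record and the state change would typically be written in the same database transaction.

```go
package main

import "sync"

// message is a hypothetical broker message with a unique ID.
type message struct {
	ID     string
	Amount int
}

// consumer applies each message at most once by remembering processed IDs.
type consumer struct {
	mu        sync.Mutex
	processed map[string]bool
	balance   int
}

func newConsumer() *consumer {
	return &consumer{processed: map[string]bool{}}
}

// handle applies m and returns true, or returns false if m.ID was
// already processed and the duplicate was skipped.
func (c *consumer) handle(m message) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.processed[m.ID] {
		return false
	}
	c.balance += m.Amount
	c.processed[m.ID] = true
	return true
}
```

Delivering the same message twice changes `balance` only once, which is the "effectively-once" behaviour mentioned earlier.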

Determine whether the operation is naturally idempotent #

Some business logic naturally has properties that make it idempotent by default.

For Example #

The request included an external API call #

Idempotency key #

RFC #

Definition [6] #

An idempotency key is a unique value that’s generated by a client and sent to an API along with a request. The server stores the key to use for bookkeeping the status of that request on its end. If a request should fail partway through, the client retries with the same idempotency key value, and the server uses it to look up the request’s state and continue from where it left off. The name “idempotency key” comes from Stripe’s API.

Key Generation #
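One common choice for a client-generated key is a random UUID. The sketch below builds a version 4 UUID from `crypto/rand` using only the standard library; `newIdempotencyKey` is an illustrative name, and any sufficiently random unique string would serve the same purpose.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newIdempotencyKey returns a random version 4 UUID string such as
// "xxxxxxxx-xxxx-4xxx-xxxx-xxxxxxxxxxxx".
func newIdempotencyKey() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}
```

The client generates the key once per logical operation and reuses it for every retry of that operation.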

Implementations #


Idempotent Middleware #

Idempotency middleware allows for fault-tolerant APIs where duplicate requests (for example, client retries) do not cause duplicate side effects.

The business logic is not aware of the idempotency key.

Flow #

sequenceDiagram
    title: idempotent middleware flow;
    actor C as Client;
    participant server;
    participant R as request repo;
    participant LS as lock store;
    
    
    C ->> server: request (idempotent_key)
    note over server: verify key
    note over server: check key expiry
    server ->> R: GET Response

    alt Not Exist (first pass)
        R -->> server: Not Exist
        server ->>+ LS: lock request
        LS -->>- server: success
        note over server: handle request
        server ->>+ R: SET Response
        R -->>- server: Success
        server ->>+ LS: unlock request
        LS -->>- server: success
        server ->> C: Response
    else Exist
        R -->> server: Response
        server ->> C: Response
    end
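The flow in the diagram could be sketched as a `net/http` middleware along these lines. This is a simplified illustration rather than a production implementation: the request repo and lock store are in-memory maps, key expiry and response headers are ignored, `httptest.NewRecorder` is used as a shortcut to capture the handler's output, and the 400 and 409 status codes are assumptions. `counter` and `send` are hypothetical helpers for trying it out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// savedResponse is what the hypothetical request repo stores per key.
type savedResponse struct {
	status int
	body   []byte
}

// idempotency runs next at most once per Idempotency-Key and replays the
// stored response for repeats.
type idempotency struct {
	mu    sync.Mutex
	repo  map[string]savedResponse // "request repo" in the diagram
	locks map[string]bool          // "lock store" in the diagram
	next  http.Handler
}

func newIdempotency(next http.Handler) *idempotency {
	return &idempotency{
		repo:  map[string]savedResponse{},
		locks: map[string]bool{},
		next:  next,
	}
}

func (m *idempotency) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	key := r.Header.Get("Idempotency-Key")
	if key == "" { // verify key
		http.Error(w, "missing Idempotency-Key", http.StatusBadRequest)
		return
	}

	m.mu.Lock()
	if saved, ok := m.repo[key]; ok { // Exist: replay the stored response
		m.mu.Unlock()
		w.WriteHeader(saved.status)
		w.Write(saved.body)
		return
	}
	if m.locks[key] { // the same key is still being handled
		m.mu.Unlock()
		http.Error(w, "request in progress", http.StatusConflict)
		return
	}
	m.locks[key] = true // lock request
	m.mu.Unlock()

	// Handle request: capture what the business handler writes.
	rec := httptest.NewRecorder()
	m.next.ServeHTTP(rec, r)
	saved := savedResponse{status: rec.Code, body: rec.Body.Bytes()}

	m.mu.Lock()
	m.repo[key] = saved  // SET Response
	delete(m.locks, key) // unlock request
	m.mu.Unlock()

	w.WriteHeader(saved.status)
	w.Write(saved.body)
}

// counter is a hypothetical business handler that counts its invocations.
type counter struct{ calls int }

func (c *counter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	c.calls++
	w.WriteHeader(http.StatusCreated)
	fmt.Fprintf(w, "created #%d", c.calls)
}

// send issues a POST with the given key against h and returns status and body.
func send(h http.Handler, key string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/records", nil)
	req.Header.Set("Idempotency-Key", key)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}
```

Sending the same key twice through `newIdempotency(&counter{})` invokes the business handler once and returns the identical response both times. Note that `counter` never reads the key, which matches the point that the business logic stays unaware of it.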

Idempotency Fingerprint [8] #

An idempotency fingerprint MAY be used in conjunction with an idempotency key to determine the uniqueness of a request. Such a fingerprint is generated from request payload data by the resource server.

For Example:

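import (
	"bytes"
	"crypto/sha1"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// fingerPrint hashes the idempotency key, request path, and body together.
func fingerPrint(r *http.Request) ([]byte, error) {
	key := r.Header.Get("Idempotency-Key")
	body, err := io.ReadAll(r.Body)
	if err != nil {
		return nil, fmt.Errorf("fail to read body: %w", err)
	}
	// Restore the body so that downstream handlers can still read it.
	r.Body = io.NopCloser(bytes.NewReader(body))
	fields := map[string]any{
		"body": string(body),
		"path": r.URL.Path,
		"key":  key,
	}
	// json.Marshal sorts map keys, so the encoding is deterministic.
	b, err := json.Marshal(fields)
	if err != nil {
		return nil, fmt.Errorf("fail to marshal fields: %w", err)
	}
	sum := sha1.Sum(b)
	return sum[:], nil
}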
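A fingerprint becomes useful once it is compared against the one stored for the first request with the same key. The sketch below is an assumption about how such a check might look; `fingerprintOf` is a simplified stand-in (path and body only, hashed with SHA-256), and `fingerprints` is a hypothetical in-memory store.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
)

// errKeyReuse signals that a key was reused with a different payload.
var errKeyReuse = errors.New("idempotency key reused with a different request")

// fingerprintOf is a simplified stand-in for a request fingerprint.
func fingerprintOf(path string, body []byte) []byte {
	h := sha256.New()
	h.Write([]byte(path))
	h.Write([]byte{0}) // separator keeps the path/body boundary unambiguous
	h.Write(body)
	return h.Sum(nil)
}

// fingerprints maps an idempotency key to the fingerprint of its first request.
type fingerprints map[string][]byte

// check records fp on first sight of key. On later sightings it returns nil
// if fp matches the stored fingerprint and errKeyReuse otherwise.
func (f fingerprints) check(key string, fp []byte) error {
	stored, seen := f[key]
	if !seen {
		f[key] = fp
		return nil
	}
	if !bytes.Equal(stored, fp) {
		return errKeyReuse
	}
	return nil
}
```

A genuine retry passes the check, while a different payload under the same key is rejected. How the server reports that rejection (for instance with a 4xx status) is a policy decision left out of this sketch.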

Key Expiration #

Does the server store idempotency keys forever? [9]

Stripe [10] #

Clients can safely retry requests that include an idempotency key as long as the second request occurs within 24 hours from when you first receive the key (keys expire out of the system after 24 hours).

RFC [11] [12] #

The resource MAY enforce time based idempotency keys, thus, be able to purge or delete a key upon its expiry. The resource server SHOULD define such expiration policy and publish it in the documentation.

Resource server MUST publish idempotency related specification. This specification MUST include expiration related policy if applicable. Server is responsible for managing the lifecycle of the idempotency key.
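A time-based expiry policy like the ones quoted above might be sketched as a store that forgets keys after a fixed TTL. The types below are hypothetical; the 24-hour default mirrors the Stripe quote, and `fakeClock` exists only so the behaviour can be exercised without waiting.

```go
package main

import "time"

// defaultTTL mirrors the 24-hour window from the Stripe quote above.
const defaultTTL = 24 * time.Hour

type entry struct {
	response  string
	expiresAt time.Time
}

// keyStore keeps one response per idempotency key and forgets it after ttl.
type keyStore struct {
	ttl     time.Duration
	now     func() time.Time // injectable clock
	entries map[string]entry
}

func newKeyStore(ttl time.Duration, now func() time.Time) *keyStore {
	return &keyStore{ttl: ttl, now: now, entries: map[string]entry{}}
}

// put stores response under key, valid for ttl from the current time.
func (s *keyStore) put(key, response string) {
	s.entries[key] = entry{response: response, expiresAt: s.now().Add(s.ttl)}
}

// get returns the stored response unless the key is unknown or expired.
// Expired entries are purged lazily on access.
func (s *keyStore) get(key string) (string, bool) {
	e, ok := s.entries[key]
	if !ok {
		return "", false
	}
	if !s.now().Before(e.expiresAt) {
		delete(s.entries, key)
		return "", false
	}
	return e.response, true
}

// fakeClock is a manually advanced clock for demonstrations.
type fakeClock struct{ t time.Time }

func (c *fakeClock) now() time.Time { return c.t }

func (c *fakeClock) advanceHours(h int) {
	c.t = c.t.Add(time.Duration(h) * time.Hour)
}
```

After expiry, a request with the old key is treated as new, which is why the expiration policy should be documented for clients.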

Frameworks #


Integrating the idempotency key into business logic #

The blog post Implementing Stripe-like Idempotency Keys in Postgres walks through an example in detail.

TLDR:

Concept #

Foreign state mutations [6] #

To shore up our backend, it’s key to identify where we’re making foreign state mutations; that is, calling out and manipulating data on another system. This might be creating a charge on Stripe, adding a DNS record, or sending an email.

Some foreign state mutations are idempotent by nature (e.g. adding a DNS record), some are not idempotent but can be made idempotent with the help of an idempotency key (e.g. charge on Stripe, sending an email), and some operations are not idempotent, most often because a foreign service hasn’t designed them that way and doesn’t provide a mechanism like an idempotency key.

Atomic phase [6] #

An atomic phase is a set of local state mutations that occur in transactions between foreign state mutations.

Atomic phases should be safely committed before initiating any foreign state mutation. If the call fails, our local state will still have a record of it happening that we can use to retry the operation.
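The interplay of atomic phases and foreign state mutations can be sketched as a small state machine that stores a recovery point per idempotency key. Everything below is illustrative: the recovery point names, the `app` type, and the in-memory "foreign" `charge` call are assumptions loosely modelled on a ride-plus-charge scenario, and each commented atomic phase would be one database transaction in a real system.

```go
package main

import "errors"

// Hypothetical recovery point names.
const (
	started       = "started"
	rideCreated   = "ride_created"
	chargeCreated = "charge_created"
	finished      = "finished"
)

var errCharge = errors.New("charge failed")

// app holds local state plus a stand-in for a foreign charge API.
type app struct {
	recoveryPoints map[string]string // idempotency key -> recovery point
	rides          int
	charges        int
	chargeFailures int // how many upcoming charge calls fail
}

func newApp() *app {
	return &app{recoveryPoints: map[string]string{}}
}

// charge is the foreign state mutation.
func (a *app) charge() error {
	if a.chargeFailures > 0 {
		a.chargeFailures--
		return errCharge
	}
	a.charges++
	return nil
}

// process resumes from the stored recovery point for key.
func (a *app) process(key string) error {
	if _, ok := a.recoveryPoints[key]; !ok {
		a.recoveryPoints[key] = started
	}
	for {
		switch a.recoveryPoints[key] {
		case started:
			// Atomic phase: create the ride and advance the recovery point together.
			a.rides++
			a.recoveryPoints[key] = rideCreated
		case rideCreated:
			// Foreign state mutation, made only after the phase above is committed.
			if err := a.charge(); err != nil {
				return err // recovery point stays at ride_created for the retry
			}
			// Atomic phase: record that the charge exists.
			a.recoveryPoints[key] = chargeCreated
		case chargeCreated:
			a.recoveryPoints[key] = finished
		case finished:
			return nil
		default:
			return errors.New("unknown recovery point")
		}
	}
}
```

If the charge fails on the first attempt, a retry with the same key skips ride creation and resumes at the charge, so neither the ride nor the charge is duplicated.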

Example: Rocket Ride #

Atomic Phases

Example: Medusa [13] #

Idempotency Key implementation in Expressjs

What if the external API is not idempotent? [6] #

Unfortunately, not every service will make this guarantee. If we try to make a non-idempotent foreign state mutation and we see a failure, we may have to persist this operation as permanently errored. In many cases we won’t know whether it’s safe to retry or not, and we’ll have to take the conservative route and fail the operation.
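The conservative route described above might look like the following sketch. The `operation` type and its status values are hypothetical; the only point is that a failed call to a non-idempotent foreign service is recorded as permanently errored and never retried automatically.

```go
package main

import "errors"

type status string

const (
	pending            status = "pending"
	succeeded          status = "succeeded"
	erroredPermanently status = "errored_permanently"
)

var (
	errForeign  = errors.New("foreign call failed")
	errNotRetry = errors.New("operation is no longer pending")
)

// operation tracks one foreign state mutation.
type operation struct {
	status            status
	foreignIdempotent bool // whether the foreign service tolerates retries
}

// run performs the foreign call if the operation is still pending.
// After a failure, an idempotent call stays pending and may be retried,
// while a non-idempotent call is persisted as permanently errored because
// it is unknown whether the mutation took effect.
func (op *operation) run(call func() error) error {
	if op.status != pending {
		return errNotRetry
	}
	if err := call(); err != nil {
		if !op.foreignIdempotent {
			op.status = erroredPermanently
		}
		return err
	}
	op.status = succeeded
	return nil
}
```

A permanently errored operation then needs manual reconciliation or some out-of-band check against the foreign service.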

Companies Posts #

Reference #


  1. Idempotency Key:原理與實測 (Idempotency Key: Principles and Hands-on Testing)

  2. AWS Timeouts, retries, and backoff with jitter

  3. Microservice Architecture: Idempotent consumer pattern

  4. Idempotent Processing with Kafka

  5. Viktor Klang's Twitter

  6. Implementing Stripe-like Idempotency Keys in Postgres

  7. RFC Idempotency-Key 2.2

  8. RFC Idempotency-Key 2.4

  9. My Thoughts on Idempotency

  10. Stripe Doc

  11. RFC Idempotency-Key 2.3

  12. RFC Idempotency-Key 2.5

  13. An open-source implementation of idempotency keys in NodeJS with Express