Skip to main content
hhow09's Blog

Session guarantees for weakly consistent replicated data

Introduction #

Eventual Consistency is not enough.

The term “eventually” is deliberately vague: in general, there is no limit to how far a replica can fall behind. [1]

Unfortunately, the lack of guarantees concerning the ordering of read and write operations in weakly consistent systems can confuse users and applications, as reported in experiences with Grapevine [21]. A user may read some value for a data item and then later read an older value. Similarly, a user may update some data item based on reading some other data, while others read the updated item without seeing the data on which it is based. [2]

Consistency Models [3] #

Consistency Models by JEPSEN

PRAM Consistency [3:1] [4] #

Defition #

all processes see write operations issued by a given process in the same order as they were invoked by that process.

On the other hand, processes may observe writes issued by different processes in different orders.

Notations #

  1. DB(S): the current contents of the server's database.
  2. DB(S,t): the ordered sequence of Writes that have been received by server S at or before time t.
  3. WriteOrder(W1,W2): a boolean predicate indicating whether Write W1 should be ordered before Write W2

Session Guarantees #

Sessions are not intended to correspond to atomic transactions that ensure atomicity and serializability. Instead, the intent is to present individual applications with a view of the database that is consistent with their own actions, even if they read and write from various, potentially inconsistent servers. We want the results of operations performed in a session to be consistent with the model of a single centralized server, possibly being read and updated concurrently by multiple clients. [2:1]

Read Your Writes #

read operations reflect previous writes.

If Read R follows Write W in a session and R is performed at server S at time t, then W is included in DB(S,t).

Examples [2:2] #

  1. Shortly after changing password, user would occasionally type the new password and receive an "invalid password" response. The RYW-guarantee ensures that the login process will always read the most recent password.

  2. As a user reads and deletes messages, those messages are removed from the displayed "new mail" folder. If the user stops reading mail and returns sometime later, she should not see deleted messages reappear simply because the mail reader refreshed its display from a different copy of the database.

Monotonic Reads #

successive reads must reflect a non-decreasing set of writes.

Namely, if a process has read a certain value v from an object, any successive read operation will not return any value written before v.

Examples [2:3] #

  1. The user's calendar program periodically refreshes its display by reading all of today's calendar appointments from the database. If it accesses servers with inconsistent copies of the database, recently added (or deleted) meetings may appear to come and go. The MR-guarantee can effectively prevent this since it disallows access to copies of the database that are less current than the previously read copy.

Monotonic Writes #

If Write W1 precedes Write W2 in a session, then, for any server S2, if W2 in DB(S2) then W1 is also in DB(S2) and WriteOrder(W1,W2).

If a process performs write w1, then w2, then all processes observe w1 before w2.

Examples [2:4] #

  1. A text editor when editing replicated files to ensure that if the user saves version N of the file and later saves version N+1 then version N+1 will replace version N at all servers. In particular, it avoids the situation in which version N is written to some server and version N+1 to a different server and the versions get propagated such that version N is applied after N+1.

Writes Follow Reads #

writes are propagated after reads on which they depend.

if a process reads a value v, which came from a write w1, and later performs write w2, then w2 must be visible after w1. Once you’ve read something, you can’t change that read’s past.

Examples #

  1. User sees reactions (W) to posted articles only if he/she have read the original posting.

Database Configurations #

MongoDB[5] #

Read Concern Write Concern Read own writes Monotonic reads Monotonic writes Writes follow reads
"majority" "majority"
"majority" { w: 1 }
"local" { w: 1 }
"local" "majority"

Read Concern Majority [6] #

Majority does a timestamped read at the stable timestamp (also called the last committed snapshot in the code, for legacy reasons). The data read only reflects the oplog entries that have been replicated to a majority of nodes in the replica set. Any data seen in majority reads cannot roll back in the future. Thus majority reads prevent dirty reads, though they often are stale reads.

Read concern majority reads do not wait for anything to be committed; they just use different snapshots from local reads. Read concern majority reads usually return as fast as local reads, but sometimes will block. For example, right after startup or rollback when we do not have a committed snapshot, majority reads will be blocked. Also, when some of the secondaries are unavailable or lagging, majority reads could slow down or block.

PostgreSQL [7] [8] #

PostgreSQL streaming replication is asynchronous by default. If the primary server crashes then some transactions that were committed may not have been replicated to the standby server, causing data loss. The amount of data loss is proportional to the replication delay at the time of failover.

Reference #


  1. Designing Data Intensive Applications Chapter 5: Replication ↩︎

  2. Session Guarantees for Weakly Consistent Replicated Data ↩︎ ↩︎ ↩︎ ↩︎ ↩︎

  3. JEPSEN - Consistency Models ↩︎ ↩︎

  4. Consistency in Non-Transactional Distributed Storage Systems ↩︎

  5. MongoDB: Causal Consistency and Read and Write Concerns ↩︎

  6. MongoDB: Replication Internals ↩︎

  7. PostgreSQL synchronous_commit ↩︎

  8. PostgreSQL: Synchronous Replication ↩︎