Database Replication
- Lets you keep data geographically close to users
- Allow for failover
- Lets you process more queries than a single machine could handle
- Horizontally scale read queries
Leader-Follower (Master-Slave)
- One Replica is the Leader - all writes go through the leader
- This sends each write to the followers
Synchronous vs Asynchronous
Synchronous: The leader waits for the follower's response that it received the data
This is disadvantageous if the follower crashes, the leader has to block all writes until the ok is received But it's advantageous that the followers are guaranteed to be up to date with the leader
Semi-Synchronous: One of the followers will be synchronous, and the others will be async
- If the synchronous follower gets slow, one of the async followers will be made synchronous.
- Thus, an up-to-date copy of the data will be available on at least two nodes
Fully Asynchronous
- If the leader fails, any writes that haven't yet gone to followers will be lost
- But the advantage is that it'll be much faster
Setting up New Followers
- Take consistent snapshots of the leader's Database
- Copy the snapshot to the new follower
- The follower connects to the leader and requests new data changes (PostgreSQL log sequence number)
Handling Node Outages
Follower Failure
- The follower that crashes will know it's last transaction before the fault, thus will connect to the leader and take new writes
Leader Failure - Failover
- Need to: Promote a follower, configure clients to send their writes to the new leader, inform followers of the new leader
- Identify that the leader failed - if it doesn't respond for 30s
- Choose a new leader - the most up to date replica
Failover
- Async - usually discard writes
Replication Log Strategies
Statement-Based - store the leader's write statements (SQL queries)
- This breaks down if nondeterministic functions such as NOW() or RAND() are used
- Auto-incrementing columns thta get out of date
- Triggers and stored procedures can have side effects and lead to nondeterministic behavior
The leader can replace nondeterministic function calls with a fixed return value when the statement is logged
WAL Shipping
- Log is append-only
- When the followers process the log, they create the same data structure as the leader
- This makes failover and rolling upgrades easy since the leader will never need to recreate the B-Trees
- Used by PostgreSQL
Logical Log
- The replication log is in a different format than the leader's
- Aims to contain as little information as possible to replicate
Because it's decoupled from internals, it can more easily be backwards compatible
Trigger-Based Replication
- Lets application code run when a database is updated
- This approach has high overhead to other replication methods
Handling Reads - Replication Lag
Read-After-Write Consistency
- After a user writes data, they need to be able to refresh and see it uploaded
- 1st technique - When reading something the user may have modified, read it from the leader
- Always read the user's own profile from the leader
- 2 - Allocate a time-period after writes to read from the leader
Monotonic Reads
- A user's query goes to a replica forward in time, then another query to a replica that's lagging
- So they see data moving back in time
Monotonic Reads guarantee that this won't happen - it's a stronger guarantee than eventual consistency
Consistent Prefix Reads - Readers will see writes in the order they came in
- Always make sure a user's reads come from the same replica