From 84ccde60554dbe581f6083ea6a705ac7f997a637 Mon Sep 17 00:00:00 2001
From: Vasileios Zois
Date: Wed, 26 Feb 2025 14:26:34 -0800
Subject: [PATCH 1/4] change mentions of MMR to FAT

---
 website/docs/cluster/replication.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/website/docs/cluster/replication.md b/website/docs/cluster/replication.md
index 2d68db2687..3b23b604d1 100644
--- a/website/docs/cluster/replication.md
+++ b/website/docs/cluster/replication.md
@@ -48,10 +48,10 @@ In addition, newly configured replicas, added to the cluster, could face longer
 The cluster operator can choose between various replication options to achieve a trade-off between performance and durability.
 A summary of these options is shown below:
 
-- Main Memory Replication (MMR)
+- Fast AOF Truncation (FAT)
 This option forces the primary to aggressively truncate the AOF so it does not spill into disk. It can be used in combination with aof-memory option which determines the maximum AOF memory buffer size.
-When a replica attaches to a primary with MMR turned on, the AOF is not guaranteed to be truncated which may result in writes being lost.
-To overcome this issue MMR should be used with ODC.
+When a replica attaches to a primary with FAT turned on, the AOF is not guaranteed to be truncated which may result in writes being lost.
+To overcome this issue FAT should be used with ODC.
 - On Demand Checkpoint (ODC)
 This option forces the primary to take a checkpoint if no checkpoint is available when replica tries to attach and recover.
 If a checkpoint becomes or was availalbe and the CCRO has not been truncated, then the primary will lock it to prevent truncation while a replica is recovering. In this case, they AOF log could spill to disk as the AOF in memory buffer becomes full.

From 7ec37985f4ace9d4715ad294574ccd40129e3f59 Mon Sep 17 00:00:00 2001
From: Vasileios Zois
Date: Wed, 26 Feb 2025 15:39:44 -0800
Subject: [PATCH 2/4] add diskless replication description

---
 website/docs/cluster/replication.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/website/docs/cluster/replication.md b/website/docs/cluster/replication.md
index 3b23b604d1..6ae7e6771e 100644
--- a/website/docs/cluster/replication.md
+++ b/website/docs/cluster/replication.md
@@ -226,6 +226,25 @@ replica_announced:1
 192.168.1.26:7001>
 ```
+# Diskless Replication
+When AOF gets truncated, full synchronization requires taking a checkpoint and sending that checkpoint over to the attaching replica.
+This operation can be expensive because it involves multiple I/O operations at the primary and replica.
+For this reason, we added a variant of full synchronization called diskless replication.
+This is implemented using a streaming checkpoint that allows clients to continue issuing read and writes at the primary while attaching replicas synchronize.
+To enable diskless replication the server needs to be started with the following flags
+--repl-diskless-sync=true
+This is used to enable diskless replication
+
+--repl-diskless-sync-delay=\.
+This is used to determine how many seconds to wait before starting the full sync, in order to give the opportunity to multiple replicas to attach and receive the streaming checkpoint.
+
+There is no additional requirements to that of using the aforementioned flags in order to leverage diskless replication.
+The APIs for mapping replicas remains the same (i.e. CLUSTER REPLICATE, REPLICAOF etc.).
+
+Note that streaming replication does not take a checkpoint thus the AOF is not automatically truncated (unless FAT flag is sued) every time a full sync is performed.
+This happens to ensure durability in the event of a failure which will not be possible if the AOF gets truncated without a persitent checkpoint.
+However, the store version gets incremented to ensure consistency accross different instances that may be fully synced at different times.
+Users can still utilize SAVE/BGSAVE commands to take a manual checkpoint which safely truncates the AOF.

From 5dda9cb8354f5895910aba7a78f993a497c1ce32 Mon Sep 17 00:00:00 2001
From: Vasileios Zois
Date: Thu, 6 Mar 2025 14:32:32 -0800
Subject: [PATCH 3/4] fix typos

---
 website/docs/cluster/replication.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/website/docs/cluster/replication.md b/website/docs/cluster/replication.md
index 6ae7e6771e..38ba93ca0e 100644
--- a/website/docs/cluster/replication.md
+++ b/website/docs/cluster/replication.md
@@ -243,8 +243,8 @@ This is used to determine how many seconds to wait before starting the full sync
 There is no additional requirements to that of using the aforementioned flags in order to leverage diskless replication.
 The APIs for mapping replicas remains the same (i.e. CLUSTER REPLICATE, REPLICAOF etc.).
 
-Note that streaming replication does not take a checkpoint thus the AOF is not automatically truncated (unless FAT flag is sued) every time a full sync is performed.
+Note that diskless replication does not take an actual checkpoint.
+Hence every time a full sync is performed, the AOF is not automatically truncated (unless FAT flag is used).
 This happens to ensure durability in the event of a failure which will not be possible if the AOF gets truncated without a persitent checkpoint.
-However, the store version gets incremented to ensure consistency accross different instances that may be fully synced at different times.
-Users can still utilize SAVE/BGSAVE commands to take a manual checkpoint which safely truncates the AOF.
-
+However, the store version gets incremented to ensure consistency across different instances that may be fully synced at different times.
+Users can still utilize SAVE/BGSAVE commands or --aof-size-limit to periodically take a checkpoint and safely truncates the AOF.
\ No newline at end of file

From e6775a2de26c418f5d00447eb2b89f7e97de933f Mon Sep 17 00:00:00 2001
From: Badrish Chandramouli
Date: Thu, 6 Mar 2025 17:32:48 -0800
Subject: [PATCH 4/4] Fix typo in replication documentation

---
 website/docs/cluster/replication.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/cluster/replication.md b/website/docs/cluster/replication.md
index 38ba93ca0e..decff5b418 100644
--- a/website/docs/cluster/replication.md
+++ b/website/docs/cluster/replication.md
@@ -247,4 +247,4 @@ Note that diskless replication does not take an actual checkpoint.
 Hence every time a full sync is performed, the AOF is not automatically truncated (unless FAT flag is used).
 This happens to ensure durability in the event of a failure which will not be possible if the AOF gets truncated without a persitent checkpoint.
 However, the store version gets incremented to ensure consistency across different instances that may be fully synced at different times.
-Users can still utilize SAVE/BGSAVE commands or --aof-size-limit to periodically take a checkpoint and safely truncate the AOF.
\ No newline at end of file
+Users can still utilize SAVE/BGSAVE commands or --aof-size-limit to periodically take a checkpoint and safely truncate the AOF.
\ No newline at end of file
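
Review note: the new docs say that --repl-diskless-sync-delay makes the primary wait some seconds before starting the full sync so that several replicas can attach and receive the same streaming checkpoint. The sketch below is only a toy model of that batching idea, not the server's implementation. The function name `group_into_sync_rounds` is hypothetical, and two behaviors are assumptions not stated in the patch: that the wait window opens when the first replica attaches, and that a replica attaching after the window closes starts a new round.

```python
from typing import List, Optional


def group_into_sync_rounds(attach_times: List[float],
                           delay_seconds: float) -> List[List[float]]:
    """Group replica attach times (in seconds) into full-sync rounds.

    Assumed model: the first replica to attach opens a window of
    `delay_seconds`; every replica attaching within that window shares
    the same streaming checkpoint. A later replica opens a new window.
    """
    rounds: List[List[float]] = []
    window_end: Optional[float] = None
    for t in sorted(attach_times):
        if window_end is None or t > window_end:
            # No open window, or the previous one has closed: new round.
            rounds.append([t])
            window_end = t + delay_seconds
        else:
            # Still inside the open window: join the current round.
            rounds[-1].append(t)
    return rounds


if __name__ == "__main__":
    # Three replicas attach within 5 seconds of the first; one attaches later.
    print(group_into_sync_rounds([0.0, 1.5, 4.0, 12.0], delay_seconds=5.0))
```

Under this model a larger delay trades a slower start of synchronization for fewer checkpoint streams, which matches the motivation given in the added text.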