I have a multi-region, bare-metal Kubernetes cluster that uses Tailscale and flannel for cross-node networking. For storage, I’m using the OpenEBS LocalPV ZFS CSI driver on two specific worker nodes, each configured with a 100GB ZFS zpool. PersistentVolumeClaims (PVCs) and their Kubernetes-native VolumeSnapshots are provisioned on one of these two nodes, depending on where the workloads (Pods) are scheduled.
The setup looks like this:
- Workernode1: 100GB zpool hosting PVCs & snapshots
- Workernode2: 100GB zpool also hosting PVCs & snapshots
I’ve ensured proper node labeling so that PVCs bind to the intended nodes. Now I’m considering a replication strategy. In the event that one node fails or becomes unavailable, I would like the other node to have an up-to-date copy of the PVC data and any associated snapshots. This would allow me to quickly recover workloads by pointing them at the replicated data on the surviving node.
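For context, the node pinning is done through the StorageClass. This is a simplified sketch of what I'm using (pool and node names are placeholders for my actual setup):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zfs-localpv
provisioner: zfs.csi.openebs.io
parameters:
  poolname: "zpool1"   # ZFS pool that exists on both worker nodes
  fstype: "zfs"
# WaitForFirstConsumer so the PVC binds on the node where the Pod lands
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: kubernetes.io/hostname
        values:
          - workernode1
          - workernode2
```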
I’m currently looking at using zrepl to perform incremental, periodic ZFS-level snapshots and replication, effectively maintaining zpool1/replicated/… datasets on each node. This would happen out-of-band from Kubernetes; zrepl would run at the host level and handle snapshot creation, replication, and retention according to a schedule.
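Roughly, the zrepl setup I have in mind is a push job on each node and a sink job on its peer, running over the Tailscale mesh. This is only a sketch based on my reading of the zrepl docs, not a tested config; addresses, intervals, and retention are placeholders:

```yaml
# workernode1: push local PVC datasets to workernode2
jobs:
  - name: push_to_node2
    type: push
    connect:
      type: tcp
      address: "workernode2.tailnet:8888"  # Tailscale address, placeholder
    filesystems:
      "zpool1/pvc<": true   # all datasets the CSI driver creates under zpool1
    snapshotting:
      type: periodic
      interval: 10m
      prefix: zrepl_
    pruning:
      keep_sender:
        - type: not_replicated
        - type: last_n
          count: 10
      keep_receiver:
        - type: last_n
          count: 10

# workernode2: receive into zpool1/replicated
  - name: sink_from_node1
    type: sink
    serve:
      type: tcp
      listen: ":8888"
    root_fs: "zpool1/replicated"
```

The mirror-image jobs would run in the other direction, so each node ends up holding a `zpool1/replicated/…` copy of its peer's datasets. Since the traffic stays inside the tailnet, plain TCP transport seems acceptable, though zrepl also supports TLS.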
My question: Is using zrepl (or a similar ZFS-native replication tool) outside of Kubernetes the recommended approach for this scenario? Are there known best practices, patterns, or alternative solutions that integrate more deeply with Kubernetes, or leverage the CSI snapshots directly for automated cross-node replication?
I haven’t found a built-in replication feature in the OpenEBS LocalPV ZFS driver, and I’m not aware of a Kubernetes-native operator or controller that seamlessly handles node-to-node replication of ZFS-based volumes. Any guidance on how others have approached this problem, or suggestions on a more integrated solution, would be greatly appreciated.
The design decision to rely on LocalPV comes down to latency: with nodes spread across regions, synchronous replication isn't viable for me, which is why I'm looking at asynchronous, snapshot-based approaches instead.