I stumbled on dRAID recently while reading the OpenZFS on Linux docs, and I came away from the chart there with a conclusion that I want to double-check.
This is a brand-new system that I'm building from scratch, and I'm trying to work out how to use my 5x12TB HDDs. My priorities:
- Data Integrity
- Storage capacity
- I/O Performance
- Uptime (home lab use)
I was happy with the capacity and write performance of RAIDz1, but with drives this large I can't risk a second drive failure while rebuilding a failed one (my impression was that resilvering a 12TB drive could take a day or so, and the linked chart seems to agree). So I felt forced to choose RAIDz2, which cuts into my overall capacity, but I'm fine with that. What I'm worried about is the write performance hit (I might just have to run tests... for another post).
Anyway, the chart in the dRAID docs linked above suggests that even for 5 drives, parity 1 (equivalent to RAIDz1) plus a distributed spare shortens the rebuild time to about 4 hours and greatly reduces the load on the single new drive. A faster rebuild and less write load on any one drive would seem to greatly diminish the risk of a second drive of this size failing while the first is rebuilding, since the burden is shared.
Question 1: I know that dRAID is intended for pools with many more disks, but other than the complexity of setup (which I'll figure out), I only see upsides to choosing the dRAID route over RAIDz2.
Would you agree? Am I missing something about dRAID?
Question 2: More specifically, is the risk of a second drive failure reduced enough with dRAID that I can get back the capacity and performance of having only 1 parity? Can I have my cake and eat it too?
Update: If anyone has pointers on how to test this setup with the two methods, I'm down to try so that we can get some real data. I'm new to ZFS, so I'd need some help.
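To sanity-check the capacity side of Question 2, here is a rough back-of-envelope sketch in Python. It is only an estimate under my own assumptions: the function names are made up for illustration, the dRAID layout is assumed to be 1 parity, 3 data, 5 children and 1 distributed spare (what I believe would be written as draid1:3d:5c:1s), and it ignores padding, metadata and other allocation overhead.

```python
# Rough usable-capacity estimates for 5 x 12 TB drives.
# Assumption: overhead (padding, metadata, slop space) is ignored entirely.

def raidz_usable_tb(disks: int, parity: int, disk_tb: float) -> float:
    """Approximate RAIDZ usable space: every disk except the parity share."""
    if parity >= disks:
        raise ValueError("parity must be smaller than the number of disks")
    return (disks - parity) * disk_tb


def draid_usable_tb(children: int, parity: int, data: int,
                    spares: int, disk_tb: float) -> float:
    """Approximate dRAID usable space.

    The distributed spare reserves one disk's worth of space spread over
    all children, and the remaining space is split data : parity.
    """
    if data + parity > children - spares:
        raise ValueError("data + parity must fit in children - spares")
    return (children - spares) * disk_tb * data / (data + parity)


if __name__ == "__main__":
    disk_tb = 12.0
    print("RAIDz1          :", raidz_usable_tb(5, 1, disk_tb), "TB")
    print("RAIDz2          :", raidz_usable_tb(5, 2, disk_tb), "TB")
    # Assumed layout: 1 parity, 3 data, 5 children, 1 distributed spare.
    print("dRAID1 + 1 spare:", draid_usable_tb(5, 1, 3, 1, disk_tb), "TB")
```

Under these assumptions the distributed spare still costs about one drive's worth of space, so dRAID1 with one spare comes out at roughly the same usable capacity as RAIDz2 (about 36 TB) rather than RAIDz1 (about 48 TB). If that holds, the trade-off is less about capacity and more about rebuild time versus tolerating two simultaneous failures.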