In Part One we started running the numbers on various RAID levels, and we covered read and write speeds. In this part we’ll take it further and study how the number of drives affects:
- Actual capacity
- Cost per terabyte
- Drive and array failure rates
Which RAID level gives the best capacity?
Here are the various RAID levels compared on total capacity, in TB, according to the number of drives (assuming 4 TB per drive):
| Drives | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10 |
|--------|--------|--------|--------|--------|---------|
| 2 | 8 | 4 | x | x | x |
| 3 | 12 | 4 | 8 | x | x |
| 4 | 16 | 4 | 12 | 8 | 8 |
| 5 | 20 | 4 | 16 | 12 | x |
| 6 | 24 | 4 | 20 | 16 | 12 |
| 7 | 28 | 4 | 24 | 20 | x |
| 8 | 32 | 4 | 28 | 24 | 16 |
| 9 | 36 | 4 | 32 | 28 | x |
| 10 | 40 | 4 | 36 | 32 | 20 |
| 11 | 44 | 4 | 40 | 36 | x |
| 12 | 48 | 4 | 44 | 40 | 24 |
| 13 | 52 | 4 | 48 | 44 | x |
| 14 | 56 | 4 | 52 | 48 | 28 |
| 15 | 60 | 4 | 56 | 52 | x |
| 16 | 64 | 4 | 60 | 56 | 32 |
| 32 | 128 | 4 | 124 | 120 | 64 |
| 64 | 256 | 4 | 252 | 248 | 128 |
And here’s the graph:
RAID 1 is the biggest loser here. Once you cross two drives, you are just duplicating the same data to every additional drive. The total capacity of a RAID 1 array is always equal to the capacity of one drive.
RAID 10 has a 50% drive penalty, because in effect it behaves like a RAID 1 that can scale up. RAID 5 and RAID 6 do much better, because they don’t have to duplicate data; they only store parity, which costs one drive’s worth of capacity for RAID 5 and two drives’ worth for RAID 6.
RAID 0 is the best, because it gives you the full capacity of every drive. Note, though, that if a drive is rated at X TB, its actual formatted capacity is always a bit smaller.
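The capacity numbers above follow from a handful of simple formulas. Here’s a quick Python sketch (the function name and defaults are mine, chosen to match the 4 TB drives in the table, not part of any RAID standard or library):

```python
def usable_capacity(level, n, drive_tb=4):
    """Usable capacity (TB) of an n-drive array built from drive_tb-sized drives."""
    if level == 0:                        # striping only: every byte is usable
        return n * drive_tb
    if level == 1:                        # mirroring: n copies of one drive
        return drive_tb
    if level == 5 and n >= 3:             # one drive's worth of parity
        return (n - 1) * drive_tb
    if level == 6 and n >= 4:             # two drives' worth of parity
        return (n - 2) * drive_tb
    if level == 10 and n >= 4 and n % 2 == 0:  # mirrored pairs, striped
        return (n // 2) * drive_tb
    raise ValueError("not enough drives (or an odd count) for this RAID level")
```

For example, `usable_capacity(5, 16)` gives 60 TB, matching the 16-drive RAID 5 row above.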
If you’re short on cash and absolutely need to make the most out of every drive, the above chart tells you which options to go for. This is why RAID 5 is very popular. It gives you a lot of space, and offers you some protection (redundancy). But it comes at a price, which we’ll see soon.
Which RAID level is the cheapest on a cost-per-terabyte basis?
Obviously, if an array cannot use up all its capacity, then the cost per terabyte goes up. Let’s see by how much:
| Drives | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10 |
|--------|--------|--------|--------|--------|---------|
| 2 | $75 | $150 | x | x | x |
| 3 | $75 | $225 | $113 | x | x |
| 4 | $75 | $300 | $100 | $150 | $150 |
| 5 | $75 | $375 | $94 | $125 | x |
| 6 | $75 | $450 | $90 | $113 | $150 |
| 7 | $75 | $525 | $88 | $105 | x |
| 8 | $75 | $600 | $86 | $100 | $150 |
| 9 | $75 | $675 | $84 | $96 | x |
| 10 | $75 | $750 | $83 | $94 | $150 |
| 11 | $75 | $825 | $83 | $92 | x |
| 12 | $75 | $900 | $82 | $90 | $150 |
| 13 | $75 | $975 | $81 | $89 | x |
| 14 | $75 | $1,050 | $81 | $88 | $150 |
| 15 | $75 | $1,125 | $80 | $87 | x |
| 16 | $75 | $1,200 | $80 | $86 | $150 |
| 32 | $75 | $2,400 | $77 | $80 | $150 |
| 64 | $75 | $4,800 | $76 | $77 | $150 |
And here’s the graph:
Note that with RAID 0 and RAID 10, the cost per TB stays constant no matter how many drives you buy. Scaling up brings no cost advantage unless the drive vendor gives you a discount.
On the other hand, RAID 5 and RAID 6 get cheaper as you increase the number of drives, regardless of any discounts you get on the drives themselves. They do level off after a point, though – around the 32-drive mark. Interestingly, once you increase the number of drives to 64 or more, RAID 6 nearly catches up with RAID 5 in cost per TB, and both approach RAID 0. But 64 is a large number of drives for one array, and you’ll be lucky to find a controller (or even stacked controllers) with that many ports!
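These numbers are just total drive cost divided by usable capacity. A minimal sketch, assuming the $300-per-drive price implied by the RAID 0 column ($75/TB × 4 TB – my inference, not a price quoted in the article):

```python
def cost_per_tb(level, n, drive_tb=4, drive_price=300):
    """Dollars per usable TB for an n-drive array at a given RAID level."""
    # Usable drive-count per level: stripe, mirror, 1 parity, 2 parity, half mirrored
    usable_drives = {0: n, 1: 1, 5: n - 1, 6: n - 2, 10: n // 2}[level]
    return (n * drive_price) / (usable_drives * drive_tb)
```

For example, `cost_per_tb(5, 4)` gives $100, and `cost_per_tb(6, 8)` also gives $100, matching the table.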
RAID 1, though, is a total waste of money beyond two drives, but we knew that already.
Drive and array failure rates for various RAID levels
There are two things to consider when looking at the failure rate of a RAID array:
- How many drives can fail?
- What are the chances of failure?
The number of drives that can fail for different RAID levels
Here’s how they stack up:
| Drives | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10* |
|--------|--------|--------|--------|--------|----------|
| 2 | 0 | 1 | x | x | x |
| 3 | 0 | 2 | 1 | x | x |
| 4 | 0 | 3 | 1 | 2 | 2 |
| 5 | 0 | 4 | 1 | 2 | x |
| 6 | 0 | 5 | 1 | 2 | 3 |
| 7 | 0 | 6 | 1 | 2 | x |
| 8 | 0 | 7 | 1 | 2 | 4 |
| 9 | 0 | 8 | 1 | 2 | x |
| 10 | 0 | 9 | 1 | 2 | 5 |
| 11 | 0 | 10 | 1 | 2 | x |
| 12 | 0 | 11 | 1 | 2 | 6 |
| 13 | 0 | 12 | 1 | 2 | x |
| 14 | 0 | 13 | 1 | 2 | 7 |
| 15 | 0 | 14 | 1 | 2 | x |
| 16 | 0 | 15 | 1 | 2 | 8 |
| 32 | 0 | 31 | 1 | 2 | 16 |
| 64 | 0 | 63 | 1 | 2 | 32 |
*Up to half the drives in a RAID 10 array can fail, but only one from each mirrored span. E.g., if you divide your 16-bay array into 8 spans, then only one drive per span can fail (or 8 drives total) – and that’s the best case. The actual calculation of the probability of failure for RAID 10 is far more complicated.
And here’s the graph:
Obviously, RAID 0 performs poorly. No matter how many drives you add, a RAID 0 array will not tolerate even a single drive failure. RAID 1 performs best, because it just duplicates data onto as many drives as you can add. RAID 6 offers better protection than RAID 5, and I guesstimate that RAID 10 offers better odds than RAID 6.
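The fault-tolerance column reduces to a one-line lookup. A sketch (function name mine; the RAID 10 value is the best case from the footnote above, where each failure lands in a different span):

```python
def max_drive_failures(level, n):
    """Most drive failures an n-drive array can survive.
    RAID 10's n // 2 is the best case: one failure per mirrored span.
    Two failures in the same span still kill the array."""
    return {0: 0, 1: n - 1, 5: 1, 6: 2, 10: n // 2}[level]
```

So a 16-drive RAID 1 array survives 15 failures, while a 16-drive RAID 5 array survives exactly one.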
The failure rate of a RAID array
How does the number of drives affect the failure rate of an entire array? Here are the stats:
| Drives | RAID 0 | RAID 1 | RAID 5 | RAID 6 | RAID 10** |
|--------|--------|--------|--------|--------|-----------|
| 2 | 10% | 0.25% | x | x | x |
| 3 | 14% | 0.01% | 3% | x | x |
| 4 | 19% | 0.00% | 5% | 3% | 0.186% |
| 5 | 23% | 0.00% | 7% | 5% | x |
| 6 | 26% | 0.00% | 10% | 7% | 0.397% |
| 7 | 30% | 0.00% | 12% | 10% | x |
| 8 | 34% | 0.00% | 14% | 12% | 0.673% |
| 9 | 37% | 0.00% | 16% | 14% | x |
| 10 | 40% | 0.00% | 19% | 16% | 1.003% |
| 11 | 43% | 0.00% | 21% | 19% | x |
| 12 | 46% | 0.00% | 23% | 21% | 1.379% |
| 13 | 49% | 0.00% | 25% | 23% | x |
| 14 | 51% | 0.00% | 26% | 25% | 1.793% |
| 15 | 54% | 0.00% | 28% | 26% | x |
| 16 | 56% | 0.00% | 30% | 28% | 2.239% |
| 32 | 81% | 0.00% | 54% | 52% | 6.450% |
| 64 | 96% | 0.00% | 80% | 79% | 15.4% |
**Calculating the failure rate of a RAID 10 array isn’t easy, and I’ve used a simple formula that might be totally incorrect. Each RAID 10 array is split into n/2 RAID 1 spans, where n is the total number of drives. The failure rate of each span equals 0.25%, the failure rate of a two-drive RAID 1 array. All these spans are striped (RAID 0), so as the number of spans increases, the chance of failure rises (just like RAID 0). The formula I’ve used for RAID 10 is: rate of RAID 1 (0.25%) × number of spans (n/2) × rate of RAID 0 for n/2 drives.
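For comparison, here is the textbook independent-failure model in Python – a sketch of the standard binomial approach, not the author’s exact spreadsheet. A 5% per-drive annual failure rate (my inference) reproduces the RAID 0 and RAID 1 columns above; the RAID 5, 6 and 10 columns in the table use simpler approximations, so this model’s numbers for those levels won’t match exactly:

```python
from math import comb

P = 0.05  # assumed per-drive annual failure rate; fits the RAID 0/1 columns

def p_at_least(k, n, p=P):
    """Probability that at least k of n independent drives fail."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

raid0 = lambda n: p_at_least(1, n)   # any single failure is fatal
raid1 = lambda n: p_at_least(n, n)   # every drive must fail
raid5 = lambda n: p_at_least(2, n)   # two failures are fatal
raid6 = lambda n: p_at_least(3, n)   # three failures are fatal

def raid10(n):
    """Standard model: the array dies if both drives in any mirrored span fail."""
    span_failure = raid1(2)          # 0.25% per two-drive span
    return 1 - (1 - span_failure) ** (n // 2)
```

For example, `raid0(12)` comes out at about 46% and `raid0(64)` at about 96%, matching the table.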
Here’s the graph:
Obviously, RAID 0 is dangerous even with two drives, and as you add drives it becomes an unacceptable risk. E.g., with about 12 drives in RAID 0, the chance of failure is 50:50 – the array could die at any time. With 64 drives, it’s almost 100% – which means failure is imminent. What’s the difference? With 12 drives, you get to flip a coin. With 64 drives, you still get to flip a coin, but it’s handed to you by Two-Face.
RAID 1 is the simplest way to go if all you need is two drives. Add one more to a RAID 1 array and you’ll be about as safe as possible. The same applies to RAID 10 – the sheer number of mirrored drives makes real-world failure a minimal possibility, even for smaller arrays. But it is an uncomfortable peace.
But look at RAID 5 and RAID 6: both become unacceptably dangerous as you increase the size of your array. Remember what I said in Part One about not being able to carry all the treasure out of the cave in one go? So far RAID 5 and RAID 6 have seemed like great solutions, but not for large drive arrays. There’s always a trade-off with RAID somewhere.
Okay, we have all this data. In Part Three we’ll see if we can combine it all. Why would we want to do that? Who knows? It might shed light on a few rules of thumb that’ll make selecting the right RAID level for video editing or post-production easier.
Let’s see.