Which is the Best RAID level for Video Editing and Post Production? (Part Three): Number Soup for the Soul

In Part One and Part Two we ran the numbers on each RAID level to find out how well they performed on speed, capacity and failure rates as the drive sizes increased. In this final part we’ll try to put together a few rules of thumb to help guide us in selecting the right RAID level for video editing or post production.

Important: There are many things I have not included. Small things make a big difference, so please don’t make any decision based solely on these calculations. The aim of this article is to help you boil down your choices to a manageable number, not to make the choice for you. You are responsible for your actions.


Bringing everything down to one number

What are the things we’re worried about most when building a RAID array? Here are some important points:

  • Total capacity
  • Value for money
  • Redundancy
  • Speed

There are others, too:

  • Ease of setup
  • Cost and availability of hardware RAID controllers
  • Number of drives and size of chassis and heating solutions
  • Compatibility of hardware and software
  • CPU usage
  • Battery Backup for the controller
  • Caching
  • Buying drives from various sources to better the odds

Let’s focus on the first four for now. That offers us a theoretical starting point. After all, if a RAID level doesn’t work for us theoretically, there’s no point worrying about its technicalities.

The objective is to bring each important point to a common number. Luckily, RAID offers us simple formulas that can be scaled equally among all levels. E.g., if a drive is slow or if its failure rate is high, every RAID level suffers. If the price changes, everyone’s affected equally.

Mathematically, there are an infinite number of ways in which such numbers can be produced. I’m going for the simplest method, possibly the most error-prone, but who cares as long as it makes some sense? The two types of numbers I’m going to be reducing our data to are:

  • Factors
  • Value

Factors

A factor mustn’t have a unit. It’s just a number that should ideally be between 0 and 1, with 1 being the best and 0 the worst.

Value

You will spend X amount of money on your RAID array. But how much value do you get out of it? Obviously this is a subjective matter because each individual perceives value differently. Just for fun, I multiply the total cost of the array by the factor I’ve obtained to get the value.

Let’s see how this works in practice.

Speed factor

We have seen that on average the ability of a RAID array to read quickly is three times more important than its ability to write fast. The ‘Speed Factor’ (S) should include both read and write speeds (you can’t buy them separately you know). I used this formula:

Speed Factor (S) = (3 x Read speed + 1 x Write Speed) / (4 x Maximum speed of the array)

Here’s what we get:

        Speed factor (S)
Drives  RAID 0  RAID 1  RAID 5  RAID 6  RAID 10
2       1.000   0.875   x       x       x
3       1.000   0.833   0.583   x       x
4       1.000   0.813   0.656   0.438   0.875
5       1.000   0.800   0.700   0.525   x
6       1.000   0.792   0.729   0.583   0.875
7       1.000   0.786   0.750   0.625   x
8       1.000   0.781   0.766   0.656   0.875
9       1.000   0.778   0.778   0.681   x
10      1.000   0.775   0.788   0.700   0.875
11      1.000   0.773   0.795   0.716   x
12      1.000   0.771   0.802   0.729   0.875
13      1.000   0.769   0.808   0.740   x
14      1.000   0.768   0.813   0.750   0.875
15      1.000   0.767   0.817   0.758   x
16      1.000   0.766   0.820   0.766   0.875
32      1.000   0.758   0.848   0.820   0.875
64      1.000   0.754   0.861   0.848   0.875

And here’s the graph:

RAID Speed Factor

For simplicity’s sake, I’m not showing you the value chart, but you can multiply the cost of the array by the speed factor to get the Value for Speed (ValueS) for each array. Here’s what it looks like on a graph:

RAID Speed Value

If all you care about is speed, it is pretty obvious that RAID 0 is the way to go, followed closely by RAID 10. The same holds for value for money on speed: RAID 0 is ideal, followed by RAID 10.
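The speed factor formula can be sketched in a few lines of Python. The read and write multipliers below are not stated in this part; they are assumptions worked backwards from the table (for example, RAID 5 reading at N-1 drives and writing at half that), and the maximum speed is assumed to be all N drives running flat out. Treat it as a rough check, not the exact spreadsheet behind the numbers.

```python
def speed_factor(level: str, n: int) -> float:
    """Speed factor S = (3 * read + 1 * write) / (4 * max speed).

    Speeds are expressed as multiples of a single drive's speed.
    """
    # Assumed (read, write) multipliers, inferred from the table values.
    multipliers = {
        "RAID 0": (n, n),
        "RAID 1": (n, 1),
        "RAID 5": (n - 1, (n - 1) / 2),
        "RAID 6": (n - 2, (n - 2) / 2),
        "RAID 10": (n, n / 2),
    }
    read, write = multipliers[level]
    max_speed = n  # assumed: the best case is every drive streaming at once
    return (3 * read + write) / (4 * max_speed)


# Print a few rows to compare against the table.
for drives in (4, 8, 16):
    row = [round(speed_factor(level, drives), 3)
           for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6", "RAID 10")]
    print(drives, row)
```

The results should agree with the table to within rounding in the third decimal place.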

Capacity factor

The capacity factor is easier to calculate. It is:

Capacity Factor (C) = Actual capacity of the array / Maximum possible capacity

Here is the chart:

        Capacity Factor (C)
Drives  RAID 0  RAID 1  RAID 5  RAID 6  RAID 10
2       1.000   0.500   x       x       x
3       1.000   0.333   0.667   x       x
4       1.000   0.250   0.750   0.500   0.500
5       1.000   0.200   0.800   0.600   x
6       1.000   0.167   0.833   0.667   0.500
7       1.000   0.143   0.857   0.714   x
8       1.000   0.125   0.875   0.750   0.500
9       1.000   0.111   0.889   0.778   x
10      1.000   0.100   0.900   0.800   0.500
11      1.000   0.091   0.909   0.818   x
12      1.000   0.083   0.917   0.833   0.500
13      1.000   0.077   0.923   0.846   x
14      1.000   0.071   0.929   0.857   0.500
15      1.000   0.067   0.933   0.867   x
16      1.000   0.063   0.938   0.875   0.500
32      1.000   0.031   0.969   0.938   0.500
64      1.000   0.016   0.984   0.969   0.500

And here is the graph:

RAID Capacity Factor

The value for money for capacity (ValueC) is as follows:

RAID Value for Capacity

RAID 0 is the best if capacity is your only concern. However, RAID 5 and RAID 6 both offer excellent value as the number of drives goes up. RAID 10 offers half the value.
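As a quick sketch, the capacity factor reduces to usable drives divided by total drives. The per-level usable counts below are the usual ones (one drive lost to parity in RAID 5, two in RAID 6, half to mirroring in RAID 10, and RAID 1 treated as a single drive mirrored N times), which is what the table appears to assume. The function name is just for illustration.

```python
def capacity_factor(level: str, n: int) -> float:
    """Capacity factor C = actual capacity / maximum possible capacity."""
    # Usable capacity in units of one drive (equal-sized drives assumed).
    usable = {
        "RAID 0": n,        # pure striping, nothing lost
        "RAID 1": 1,        # every drive holds the same data
        "RAID 5": n - 1,    # one drive's worth of parity
        "RAID 6": n - 2,    # two drives' worth of parity
        "RAID 10": n / 2,   # mirrored pairs
    }[level]
    return usable / n


print(round(capacity_factor("RAID 5", 8), 3))
print(round(capacity_factor("RAID 6", 6), 3))
```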

Failure factor

The failure factor is complicated, and is possibly the most error-prone part of my calculations. Array failure rates depend on the failure rate of a drive (constant) and the number of drives. However, each RAID type also differs in the number of drives that can fail. I feel both these variables must be accounted for in the Failure Factor (F).

The first step is to find the drive failure factor, which is the number of drives that are allowed to fail divided by the total number of drives. Let’s call this ‘d’. To keep things simple I’m not showing you that chart. You should know that I changed the RAID 0 value (0) to 0.001 just so we don’t encounter a ‘divided by zero’ scenario. We have already calculated the array failure rate earlier.

Resistance to Failure Factor (F) = (1-Array Failure Rate+d) / 2

I wanted to create the factor such that the higher number offered greater ‘protection’. This makes all factors behave commonly – the higher the better. All I’ve done is average the two, by giving both factors equal importance.

Here’s the chart:

        Resistance to Failure Factor (F)
Drives  RAID 0  RAID 1  RAID 5  RAID 6  RAID 10
2       0.450   0.749   x       x       x
3       0.430   0.833   0.652   x       x
4       0.405   0.875   0.600   0.735   0.749
5       0.385   0.900   0.565   0.675   x
6       0.370   0.917   0.533   0.632   0.748
7       0.350   0.929   0.511   0.593   x
8       0.330   0.937   0.493   0.565   0.747
9       0.315   0.944   0.476   0.541   x
10      0.300   0.950   0.455   0.520   0.745
11      0.285   0.955   0.440   0.496   x
12      0.270   0.958   0.427   0.478   0.743
13      0.255   0.962   0.413   0.462   x
14      0.245   0.964   0.406   0.446   0.741
15      0.230   0.967   0.393   0.437   x
16      0.220   0.969   0.381   0.423   0.739
32      0.095   0.984   0.246   0.271   0.718
64      0.020   0.992   0.108   0.121   0.673

Here’s the graph:

RAID Resistance Factor

And here’s the value for money for resistance (ValueF) graph:

RAID Value for Failure

RAID 1, in pure protection terms, is always the best, regardless of the number of drives. RAID 10 is the next best thing. RAID 0 is the worst.
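Here is a minimal sketch of the failure factor, assuming the array failure rate is supplied from the earlier parts of this series. The "drives allowed to fail" counts per level are my reading of the table (N-1 for RAID 1, one for RAID 5, two for RAID 6, half the drives for RAID 10), and the 0.03 failure rate in the example is a hypothetical input chosen to match the three-drive RAID 5 row.

```python
def drive_failure_factor(level: str, n: int) -> float:
    """d = drives allowed to fail / total drives."""
    if level == "RAID 0":
        return 0.001  # stand-in for zero, as described in the text
    # Assumed tolerated failures per level (inferred from the table).
    allowed = {
        "RAID 1": n - 1,
        "RAID 5": 1,
        "RAID 6": 2,
        "RAID 10": n / 2,
    }[level]
    return allowed / n


def resistance_factor(array_failure_rate: float, d: float) -> float:
    """F = (1 - array failure rate + d) / 2, the average of the two terms."""
    return (1 - array_failure_rate + d) / 2


# Three-drive RAID 5 with a hypothetical 3% yearly array failure rate.
d = drive_failure_factor("RAID 5", 3)
print(round(resistance_factor(0.03, d), 3))  # 0.652
```

Because both terms are weighted equally, a level that tolerates many failures (RAID 1) scores high even when its array failure rate is similar to another level's.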

Results

The R Factor

The idea behind creating three factors is to multiply them to form one final factor – R.

R = S x C x F

This offers some flexibility, though. What if you didn’t care about capacity, for example? In that case, ignore it, and R = S x F. You can ignore two factors and just focus on one thing. Choose any combination you want.
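A small sketch of how the factors combine, and how dropping one changes the ranking. The numbers are the 8-drive values from the S, C and F tables; the function names and the `use` parameter are illustrative only.

```python
from math import prod

# Factors for an 8-drive array, copied from the S, C and F tables.
EIGHT_DRIVES = {
    "RAID 0": {"S": 1.000, "C": 1.000, "F": 0.330},
    "RAID 1": {"S": 0.781, "C": 0.125, "F": 0.937},
    "RAID 5": {"S": 0.766, "C": 0.875, "F": 0.493},
    "RAID 6": {"S": 0.656, "C": 0.750, "F": 0.565},
    "RAID 10": {"S": 0.875, "C": 0.500, "F": 0.747},
}


def r_factor(factors: dict, use=("S", "C", "F")) -> float:
    """Multiply only the factors you care about; ignored ones drop out."""
    return prod(factors[name] for name in use)


def rank(table: dict, use=("S", "C", "F")) -> list:
    """RAID levels sorted from highest to lowest combined factor."""
    return sorted(table, key=lambda level: r_factor(table[level], use),
                  reverse=True)


print(rank(EIGHT_DRIVES))                   # all three factors
print(rank(EIGHT_DRIVES, use=("S", "F")))   # capacity ignored
```

Ignoring capacity lifts RAID 1 from last place to first in this row, which shows how sensitive the result is to the factors you choose.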

I’m going to leave the permutations and combinations to you, and will only provide the results of combining all three factors. Here’s the chart:

        R Factor
Drives  RAID 0  RAID 1  RAID 5  RAID 6  RAID 10
2       0.450   0.328   x       x       x
3       0.430   0.231   0.253   x       x
4       0.405   0.178   0.295   0.161   0.328
5       0.385   0.144   0.316   0.213   x
6       0.370   0.121   0.324   0.246   0.327
7       0.350   0.104   0.329   0.265   x
8       0.330   0.092   0.330   0.278   0.327
9       0.315   0.082   0.329   0.286   x
10      0.300   0.074   0.322   0.291   0.326
11      0.285   0.067   0.319   0.290   x
12      0.270   0.062   0.314   0.291   0.325
13      0.255   0.057   0.308   0.289   x
14      0.245   0.053   0.306   0.287   0.324
15      0.230   0.049   0.300   0.287   x
16      0.220   0.046   0.293   0.283   0.323
32      0.095   0.023   0.202   0.209   0.314
64      0.020   0.012   0.091   0.099   0.295

And here’s the graph:

RAID Factor R

What does this tell us? In no uncertain terms:

  • If your RAID array has 2 to 8 drives, RAID 0 is best (RAID 5 ties it at 8).
  • If your RAID array has 8 or 9 drives, RAID 5 is best.
  • If your RAID array has 10 or more drives, RAID 10 is best.

What?? How can RAID 0 be best, even though we all know a single drive failure will ruin our work? We have already calculated the array failure rate for each RAID array. To find how many years the array will survive based on those rates, look at this table:

        Life Expectancy (years)
Drives  RAID 0  RAID 1  RAID 5  RAID 6  RAID 10
2       10      400     x       x       x
3       7       -       33      x       x
4       5       -       20      33      526
5       4       -       14      20      x
6       4       -       10      14      256
7       3       -       8       10      x
8       3       -       7       8       147
9       3       -       6       7       x
10      3       -       5       6       100
11      2       -       5       5       x
12      2       -       4       5       72
13      2       -       4       4       x
14      2       -       4       4       56
15      2       -       4       4       x
16      2       -       3       4       45
32      1       -       2       2       15
64      1       -       1       1       7

(- = RAID 1 values beyond two drives are not listed.)

What is the typical warranty period of a hard drive? Three years? So, one can safely say that a RAID array must ‘live’ for three years before it fails (which it will – all arrays will fail at some point). Going by that, a RAID 0 array with 10 drives will live for three years.

A RAID 5 or 6 array with 16 drives will live up to three years, and a RAID 10 array with 64 drives will live longer than three years. RAID 1, of course, lives up to 400 years even with two drives. Remember, RAID is NOT backup. Regardless of what RAID you choose, you must always keep backups of your data. When you know you have sufficient backups, an array failure is no longer such a fearful thing, is it?
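The life expectancy figures appear to be the reciprocal of the yearly array failure rate; that is an assumption on my part here, but it matches the table. The sketch below also recovers the failure rate from the F table by rearranging the F formula, so the result is only as precise as the rounded F values. The function names are illustrative.

```python
def array_failure_rate(f: float, d: float) -> float:
    """Invert F = (1 - AFR + d) / 2 to get the yearly array failure rate."""
    return 1 - 2 * f + d


def life_expectancy_years(afr: float) -> float:
    """Assumed relationship: expected life is 1 / yearly failure rate."""
    if afr <= 0:
        raise ValueError("failure rate must be positive")
    return 1 / afr


def outlives_warranty(afr: float, warranty_years: float = 3) -> bool:
    """True if the array is expected to last at least the warranty period."""
    return life_expectancy_years(afr) >= warranty_years


# RAID 10 with 16 drives: F = 0.739 from the table, d = 0.5 (half may fail).
afr = array_failure_rate(0.739, 0.5)
print(round(life_expectancy_years(afr)), outlives_warranty(afr))
```

Small failure rates are sensitive to rounding in F, so expect the large life expectancy values to differ from the table by a few years.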

And don’t forget, your RAID array can be brought to its knees by the controller, the motherboard, the CPU, RAM, the OS, or just human error.

We are bombarded with information on how dangerous RAID 0 is, but the numbers don’t support that FUD. But what about bad luck, you might be wondering. Bad luck can happen to anybody at any time. There is no mathematical basis for luck, only chance. You either believe in luck (didn’t I tell you you could select a RAID array with just your gut feeling?) or believe in chance. It’s not for me to decide for you.

Go make your own luck.

Value for money

What about value for money? Here’s the formula:

Value for Money (V) = (ValueS + ValueC + ValueF ) / 3

Here’s the chart:

        Value for money (V)
Drives  RAID 0    RAID 1    RAID 5    RAID 6    RAID 10
2       $490      $425      x         x         x
3       $729      $600      $571      x         x
4       $962      $775      $803      $669      $850
5       $1,193    $950      $1,033    $900      x
6       $1,422    $1,125    $1,258    $1,129    $1,274
7       $1,645    $1,300    $1,483    $1,353    x
8       $1,864    $1,475    $1,707    $1,577    $1,697
9       $2,084    $1,650    $1,928    $1,800    x
10      $2,300    $1,825    $2,143    $2,020    $2,120
11      $2,514    $2,000    $2,360    $2,233    x
12      $2,724    $2,175    $2,575    $2,449    $2,542
13      $2,932    $2,350    $2,788    $2,663    x
14      $3,143    $2,525    $3,006    $2,875    $2,963
15      $3,345    $2,700    $3,215    $3,093    x
16      $3,552    $2,875    $3,423    $3,301    $3,382
32      $6,704    $5,675    $6,599    $6,493    $6,696
64      $12,928   $11,275   $12,503   $12,397   $13,108

Here’s the graph:

RAID Value for Money Final

Only RAID 1 falls behind as the number of drives increases, but generally, all RAID levels are good value for money! RAID 0 is the best until a drive count of 64, where RAID 10 takes over.
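The value-for-money formula can be sketched as follows. The cost per drive is not stated in this part; $300 is an assumption that happens to reproduce the table (for example, two drives in RAID 0 come to $490), so swap in your own price. Each Value term is the total array cost multiplied by the matching factor.

```python
COST_PER_DRIVE = 300  # assumption: inferred from the table, not stated


def value_for_money(n_drives: int, s: float, c: float, f: float,
                    cost_per_drive: float = COST_PER_DRIVE) -> float:
    """V = (ValueS + ValueC + ValueF) / 3, each Value = total cost * factor."""
    total_cost = n_drives * cost_per_drive
    value_s = total_cost * s  # value for speed
    value_c = total_cost * c  # value for capacity
    value_f = total_cost * f  # value for resistance to failure
    return (value_s + value_c + value_f) / 3


# Eight drives in RAID 5, using S, C and F from the tables.
print(round(value_for_money(8, 0.766, 0.875, 0.493)))
```

Since V scales directly with the total cost, adding drives always raises it, so it is most useful for comparing RAID levels at the same drive count.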

Takeaways

The numbers, as far as I can tell, don’t lie:

  • For 2 to 8 drives, you’ll be safe with RAID 0.
  • For 8 or 9 drives, you’ll be happy with RAID 5.
  • For 10 or more drives, you’ll be thrilled to have RAID 10.

What should you pick? It’s not always this difficult. The total size of your array, your data rate and budget will limit your possibilities. Within that, you can use my methodology to come to a final decision. You could either give all factors equal importance, or leave out the ones you don’t care about.

You could assume that you will not accept anything less than five years life-expectancy for your drive array. In that case RAID 0 works great for up to 4 drives, and RAID 10 beyond that point. Heck, even if you chose each factor one at a time, it is difficult to argue against RAID 0 or RAID 10.

Going by our charts, RAID 5 and RAID 6 don’t look good at all for the kind of work we are into! That’s actually a relief, because you really don’t need to spend extra on RAID controllers when you can use software RAID. Look at the operating systems that support RAID 0, 1 and 10:

  • Windows 8**
  • Linux+mdadm
  • Mac OS X

**Windows 8 has a feature called Storage Spaces that lets you create something similar to RAIDs 0, 1 and 5, but not RAID 10. However, Windows also has a software RAID option for striping (RAID 0). You can combine the two to create a RAID-10-like array.

Let’s consider some of the other factors we listed earlier:

  • Ease of setup – RAIDs 0 and 1 are the easiest to set up, followed by RAID 10.
  • Cost and availability of hardware RAID controllers – the problem with RAID controllers is that if one fails, you will have to find a ‘matching’ model that supports your existing array.
  • Number of drives and size of chassis and heating solutions – RAID 0 will keep your drives to a minimum. RAID 10 takes the most drives.
  • Compatibility of hardware and software – See above.
  • CPU usage – If no parity calculations are required, you don’t need to tax your CPU.
  • Battery Backup for the controller – No controller means no Battery Backup Module (BBM) required. Here, too, compatibility is a problem.
  • Caching – Hardware caching is always a good thing.
  • Buying drives from various sources to better the odds – is common to all RAID arrays, except maybe RAID 1.

You do the math.

But spell it out dammit! Which is the best RAID level for video?

I’m not mincing words: RAID 0 is the best RAID level for video up to 8 drives. After that point RAID 10 is champion.


3 replies on “Which is the Best RAID level for Video Editing and Post Production? (Part Three): Number Soup for the Soul”

  1. All wrong. Read speed doesn’t increase in a linear way for all drives. And the failure rate for raid6 is much less.

  2. Caveat: RAID 5 is error-prone with 1 TB or larger drives. Been there, done that. Dell had massive problems and had to stop using RAID 5 on 1 TB drives. I have a custom-made array that has errored twice in a year, with drives that test out perfect after the failure. Not fun when you’re in the middle of a job.

  3. Thank you very much for such an extensive guide, this was just what I needed, great statistics, great graphs, great information – thanks again! And never mind my previous question, – we will be going with 20-disk RAID6 and a tape backup set-up.

    Cheers!
