There are three ways to make a decision on the best RAID level for your video editing and post production needs:
- Gut feeling or instinct – also called an impulse buy.
- Years of first-hand experience running various RAID levels and systems.
- Analyzing available data to find the best odds for each RAID level.
The first-time RAID buyer should attempt to understand the complexity that goes into selecting the ‘right’ option. Too often something that looks good today will prove totally inadequate tomorrow. Those who cannot fathom or don’t want to deal with the complexity will rely on opinions or their gut feeling (If you read too many third-party opinions you’ll be more confused than ever!). You could get lucky or unlucky. The sad part is you don’t have a say in the matter either way.
If you have years of first-hand experience you wouldn’t be looking for an answer anyway. You’re already convinced about what’s best for you. This article explores the last option.
Who’s it for?
I will focus on the small business post facility or single video editor. If you want to know the real-world numbers behind each RAID configuration, and are not afraid to get a bit technical to arrive at the best solution for your needs, then you will enjoy reading this article. You will ultimately do as your personality and situation dictate. I couldn’t find a similar resource to the one I’m about to show you, so I put it together.
If you’re absolutely new to RAID, you must first read AFRAID, a RAID primer.
Which RAID levels should we look at?
The RAID levels that I’ll be looking at are 0, 1, 5, 6 and 10.
These are not the only available RAID levels. There are nested RAID levels (50, 100, etc.) and proprietary/non-standard RAID levels (Z, Diff-RAID, etc.) that I won’t look at. The choices are complex as it is. The smart thing to do is begin with the easy ones.
If the RAID levels we’re covering here are inadequate for your unique needs, then you should look further.
The biggest benefit that RAID brings to the table is ‘Redundancy’ – in short, it is the ability of the system to let you continue working without delays even if one or more drives fail.
The first question to ask yourself, then, is: Do you need redundancy at all? Imagine this situation:
You are editing on one 4 TB 7,200 rpm drive that sits in your computer or laptop. You have a backup on an external drive somewhere. Suddenly, your 4 TB drive fails or gets erased. Here is the sequence of events that must take place to get you back to ‘as you were’:
- Format your drive (4 TB drives can take many hours, even a whole day). If it’s dead, you will replace it (Order and delivery means it’ll take days).
- Connect your external backup to your computer via USB 2.0, 3.0, eSATA or Thunderbolt (Fifteen minutes, if the drive is nearby. Maybe a day or more if the drive is in another location).
- Copy your footage to the formatted or new drive. To copy it via USB 2.0, it’ll take about 6 hours per terabyte. Via USB 3.0, Thunderbolt or eSATA, it’ll take about 3 hours per terabyte.
- Restart your NLE and hope everything links automatically (Ten minutes if everything goes well).
If your total data size is low, you might experience the loss of half a day. If your data size is significant, you will lose at least two days. If your drive dies, you might lose a week.
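To run these numbers for your own project, here is a rough sketch in Python. The function name and the half-hour overhead are my own placeholders, and the hours-per-terabyte figures are the estimates from the list above. It only covers the copy itself, not formatting or waiting for a replacement drive.

```python
# Hours needed to copy one terabyte, using the rough figures quoted above.
HOURS_PER_TB = {
    "usb2": 6.0,  # USB 2.0
    "fast": 3.0,  # USB 3.0, Thunderbolt or eSATA
}

def restore_hours(data_tb, link="fast", fixed_overhead_hours=0.5):
    """Estimate the hours spent copying a backup to a working drive.

    fixed_overhead_hours is an assumed allowance for connecting the
    backup drive and relinking media in the NLE.
    """
    return data_tb * HOURS_PER_TB[link] + fixed_overhead_hours

print(restore_hours(1))          # 3.5
print(restore_hours(4, "usb2"))  # 24.5
```

Swap in speeds measured on your own hardware if you have them.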
Of course, the smarter way to work is by having two 4 TB internal drives. This way, if one crashes or dies, you can link to the other one and continue working (assuming you were alert enough to duplicate your data beforehand). The downtime in this case is about an hour, not more. However, not everybody has space for two internal drives.
Can you afford to take this loss in productivity? Run the numbers based on your own data needs. If you are okay with the down time, then you don’t need RAID for redundancy.
Then, the only reason you might want RAID is speed. We’ll get to that.
The ideal drive size for a small business post facility or editor
Let’s assume you have determined that redundancy or speed is essential to your editing needs. There are many variables that go into estimating the right RAID requirements. But, there are three factors you’ll need to know right at the beginning:
- What is the total size of your data?
- What is the data rate you’ll be needing on a regular basis?
- How many layers do you typically have on your timeline overlapping each other?
The first determines the total capacity of your RAID array. The market forces you to choose the size of one drive. From the two numbers, you will arrive at the total number of drives needed to make up your required capacity. This, and the second and third factors will tell you how fast it has to be.
Working backwards to find your budget
Among all the available RAID levels that give you the capacity and speed you need, find both the cheapest and the most expensive. Does your budget lie somewhere in between? For post facilities wanting to deploy a SAN on site, read How to get SAN storage for video production.
In either case, you are limited to drive sizes that cap at 4 TB. If speed is critical, you will want to invest in 7,200 rpm drives (though 5,400 rpm or similar drives should work okay too). The cost of a 4 TB 7,200 rpm drive is about $300 or more (or whatever the price is). More drives mean a linear increase in the total price. There’s no way of getting around it (unless you are buying hundreds of drives).
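As a starting point for the budget, the raw drive count and price can be sketched like this. The function name is hypothetical, and the 4 TB and $300 defaults are just the figures assumed in this article. It counts raw capacity only and ignores the extra drives that redundancy needs.

```python
import math

def drives_and_cost(data_tb, drive_tb=4, price_per_drive=300):
    """Return (drive count, total price) for raw capacity only."""
    drives = math.ceil(data_tb / drive_tb)
    return drives, drives * price_per_drive

print(drives_and_cost(10))  # (3, 900)
```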
You are also limited by the connectivity options, so you cannot go about multiplying your speed. E.g., what’s the point in designing an 8-bay RAID 0 array filled with SSDs that can read at 500 MB/s? The read speed you’ll get from such a beast is about 4,000 MB/s, or roughly 30 Gbps. There isn’t any cheap external connectivity option that delivers data at this rate. The closest I can think of is 32G Fibre Channel.
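The arithmetic behind that figure is simple enough to check. This is a minimal sketch that assumes perfect RAID 0 scaling and no controller or protocol overhead; the function name is my own.

```python
def array_read_gbps(drives, drive_mb_per_s):
    """Theoretical RAID 0 read speed in gigabits per second (1 byte = 8 bits)."""
    return drives * drive_mb_per_s * 8 / 1000

print(array_read_gbps(8, 500))  # 32.0
```

Compare the result against the rated speed of whatever connection you plan to use before paying for more drives.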
See? You can’t aim high, and you can’t live with the low. It’s like walking into a large cave filled with treasure. There’s only so much you can carry in one trip. Finding the right RAID solution is a balancing act.
Every RAID solution offers speed. The more drives you add, the faster your array becomes. Many people automatically assume they need RAID 0 if speed is their goal. That’s rubbish. As I’ve mentioned above, you cannot multiply drives for speed without limits (just like speed limits on our roads). That means, there are options where you can have your redundancy as well as your speed – all under budget.
Which speed is the most important – write or read?
Footage recorded on set is sacrosanct. You would almost never write over it. You will only read from it.
While editing, you might render out certain scenes with effects. Or, you might create motion graphics or 3D elements that are baked in as your edit proceeds. This means, there is some writing that happens. One cannot categorically say that you’ll never write to your array. You might.
Let’s see how this affects speed. There are three major data rate ‘ranges’ in video today:
- 50 Mbps, or about 6 MB/s (DSLRs, Canon C300, etc.)
- 220 Mbps, or about 28 MB/s (ProRes HQ, DNxHD 220)
- 150 MB/s, or 1,200 Mbps (R3D, Cinema DNG from the BMCC, etc.)
Here’s the data rate (in MB/s) for the number of streams you need to read at one go:
As you can see, at 50 Mbps, you can easily read 10 streams of 1080p content even with one 7,200 rpm drive (they deliver 100 MB/s or better). However, reading even one stream of R3D footage in full quality is tough if not impossible.
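These figures can be recomputed for any number of streams from the three data rates listed earlier. Here is a small sketch; the dictionary keys and the function name are my own labels, and the only fact it relies on is that 1 MB/s equals 8 Mbps.

```python
# Per-stream data rates in MB/s (1 MB/s = 8 Mbps).
RATES_MB_S = {
    "50 Mbps": 50 / 8,    # 6.25 MB/s
    "220 Mbps": 220 / 8,  # 27.5 MB/s
    "150 MB/s": 150.0,
}

def stream_table(max_streams=5):
    """Map each format to the MB/s needed for 1..max_streams streams."""
    return {
        name: [rate * n for n in range(1, max_streams + 1)]
        for name, rate in RATES_MB_S.items()
    }

for name, row in stream_table().items():
    print(name, row)
```

Any value above 100 MB/s is more than the single 7,200 rpm drive assumed here can deliver.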
Writing offers the same problems. Leaving aside the time it’ll take your CPU or GPU to compute the new pixels, writing to the same format will essentially mean you need the same speeds, maybe more. Why more? If you’re writing to the same array being used to read the source material, your speeds will be effectively cut down by a minimum of half. E.g., if your editing timeline has four streams of footage, and you’re reading from them and rendering the composited version at the same time, your array will need enough bandwidth to manage 5 (4 read + 1 write) streams of data. That means, whatever your desired write speed is, multiply it by 5! If it’s one stream per read and one stream per write, multiply it by 2.
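The multiplication above can be written down as a one-liner. It is a simplification that assumes every stream, read or written, uses the same data rate; the function name is hypothetical.

```python
def array_bandwidth_mb_s(read_streams, write_streams, stream_mb_s):
    """Bandwidth the array must sustain when reads and writes share it."""
    return (read_streams + write_streams) * stream_mb_s

# Four 220 Mbps sources (27.5 MB/s each) plus one render to the same format.
print(array_bandwidth_mb_s(4, 1, 27.5))  # 137.5
```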
You see, because you are always reading from your RAID array, your RAID array will have to work extra hard to also write at the same time. For this simple reason, I strongly advise against writing to a RAID array while you’re working. The best way to manage this is to perform your renders at the end of the day or a break and come back when it’s done. If this isn’t workable, get two arrays.
Reading is the primary activity of a RAID array containing source material for editing. Reading is more important, and therefore must take absolute priority. If anything interferes with this (even if that’s writing) then that’s a bad thing.
Secondly, you might read multiple streams at the same time (that’s the nature of editing), but you’ll hardly ever write more than one stream at the same time. Most batch rendering applications perform one render at a time. You can safely say that no matter what you do, it’ll be safe to design a RAID array in this way:
- Desired read speed = Number of read streams x data rate. It would be very rare for the number of streams to exceed 5.
- Desired write speed = data rate of rendered format.
What does that tell you? The read speed is at least twice as important as the write speed. On average, it is three times more important.
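The two design formulas are easy to play with in code. This is only a sketch; the function name is my own, and the example uses three streams of 220 Mbps (27.5 MB/s) footage rendered back to the same format.

```python
def design_speeds(read_streams, source_mb_s, render_mb_s):
    """Return (desired read speed, desired write speed) in MB/s."""
    return read_streams * source_mb_s, render_mb_s

read, write = design_speeds(3, 27.5, 27.5)
print(read, write, read / write)  # 82.5 27.5 3.0
```

When the render format matches the source format, the ratio is simply the number of streams you read at once.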
Read and Write speeds of various RAID levels
Let’s look at the theoretical read and write speeds offered by various RAID levels.
Notes: I’m assuming the following:
- Drive speed (read or write) is 100 MB/s, though you have faster platter drives.
- Drive size is 4 TB.
- Price of a 4 TB drive is $300.
- Failure rate of a 4 TB drive is 5% (buy 100 drives and about five will fail within the first year).
On the left is the number of drives in the array. Speeds are in MB/s.
[Table: Read Speed (MB/s) by number of drives, for RAID 0, 1, 5, 6 and 10]
x – RAID 5 needs a minimum of 3 drives. RAID 6 and 10 need a minimum of 4 drives. RAID 10 also needs an even number of drives.
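If you’d like to generate theoretical read speeds yourself, here is a sketch under textbook assumptions that are mine, not measurements: RAID 0, 1 and 10 read from every drive at once, RAID 5 loses one drive’s worth of speed to parity, and RAID 6 loses two. The function name is hypothetical, and real arrays will come in lower.

```python
DRIVE_MB_S = 100  # per-drive speed assumed in the notes above

def read_speed(level, drives):
    """Idealized read speed in MB/s, or None if the layout is invalid."""
    if level in (0, 1):
        return drives * DRIVE_MB_S if drives >= 2 else None
    if level == 5:
        return (drives - 1) * DRIVE_MB_S if drives >= 3 else None
    if level == 6:
        return (drives - 2) * DRIVE_MB_S if drives >= 4 else None
    if level == 10:
        ok = drives >= 4 and drives % 2 == 0
        return drives * DRIVE_MB_S if ok else None
    raise ValueError(f"unsupported RAID level: {level}")

# One row per drive count; columns are RAID 0, 1, 5, 6 and 10.
for n in range(2, 9):
    print(n, [read_speed(level, n) for level in (0, 1, 5, 6, 10)])
```

A `None` in the output plays the same role as the x in the note above: that RAID level cannot be built with that many drives.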
If looking at the table is giving you headaches, here’s a simple graph of the same table:
It doesn’t take a rocket scientist to realize that all RAID levels offer a linear increase in read speed as you increase the number of drives. RAID 10 and RAID 1 are as fast as RAID 0. RAID 5 and 6 are no slouches either.
See? Speed does not categorically mean RAID 0. Get that out of your head.
Write speeds are a bit weirder. From this point on we enter the twilight zone. Brace yourselves.
Most drives offer a slightly lower write speed when compared to read speeds. It’s not because of any physical thing but simply because the drive needs to find the right spot to write data in before it can write. This additional calculation makes it slower than reading. For simplicity’s sake, I’m going to assume that the write speed is equal to the read speed. When you make calculations in the real world, you’ll never be at the limit. You must always account for some wastage or loss. I’m assuming your total data rate is well below both the read and write limits of your drive.
So, if we assume the write speed is the same as the read speed, then, in an ideal world, the write graph should look like this:
However, this doesn’t happen in practice. For RAID 5 and 6, data isn’t duplicated. Instead, parity data is calculated and written alongside the actual write data. E.g., if your write data rate is 100 MB/s, in reality a RAID 5 or RAID 6 array will write 100+x MB/s (x is parity). It gets even more complicated. If you are overwriting data (rendering a new version of a composition or chroma key, for example) then the old data and the old parity need to be read first. Only then can the new parity be calculated, and only after that does the writing take place.
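To make the parity idea concrete, here is a toy sketch of single parity as used by RAID 5, where the parity block is the XOR of the data blocks in a stripe. The function names and the tiny two-byte blocks are my own illustration. It shows one common way to update parity on an overwrite (read the old data and old parity, then XOR in the new data), and it leaves out RAID 6’s second parity, caching and everything else a real controller does.

```python
from functools import reduce

def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity(blocks):
    """Parity block for a stripe: XOR of all its data blocks."""
    return reduce(xor_blocks, blocks)

def updated_parity(old_parity, old_data, new_data):
    """New parity after one block changes, without re-reading the stripe."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)

stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
p = parity(stripe)

# Overwrite the middle block and update parity from the old values.
new_block = b"\xaa\xbb"
p_new = updated_parity(p, stripe[1], new_block)
stripe[1] = new_block
print(p_new == parity(stripe))  # True
```

Notice that a single overwrite costs two reads and two writes, which is where the extra work comes from.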
RAID 5 and 6 perform worse for sequential data (video is sequential data) because that many more calculations have to happen per second.
There are various factors that affect the overall speed of writing parity data. The most important is the RAID Controller that must calculate the parity data before writing it. The really good (expensive) controllers have correct caching to eliminate a lot of this time wastage. The cheaper hardware RAID controllers don’t do this so well. For this reason, there is no universal formula for calculating the average write speeds for RAID 5 and RAID 6 arrays, but they’ll look something like this in the best case:
[Table: Write Speed (MB/s) by number of drives]
And here’s the graph:
RAID 1 is good with two drives. Adding drives to a RAID 1 does not increase write speed in any way. RAID 0 is best, and every other RAID is similar (Don’t forget, RAID 5 and 6 only perform this way with really good hardware RAID Controllers, otherwise the performance is going to be very poor). As a rule of thumb, you can estimate the write speed of a RAID 5 or 6 array at 25% of the read speed, or lower. I’ve estimated it at 50% here because that’s the best that you can get.
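The rule of thumb turns into a one-line estimate. This is a sketch with a hypothetical function name: the fraction is 0.25 for the rule of thumb and 0.5 for the best case, and the read speed you feed in is whatever your array actually delivers.

```python
def parity_write_estimate(read_mb_s, fraction=0.25):
    """Estimated RAID 5/6 write speed as a fraction of read speed."""
    return read_mb_s * fraction

print(parity_write_estimate(300))       # 75.0
print(parity_write_estimate(300, 0.5))  # 150.0
```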
In Part Two we’ll look at factors other than speed and run the numbers on them!