Before they started investigations, the detectives huddled together to mull over how to tackle the insane RAID problem, and to have some wonderful pizza served fresh from Hotel Network’s famous kitchen.
The pizza came in three sizes, and it was difficult estimating whether an increase in size (which also increased the price) gave a proportionate amount of toppings.
Each pizza slice is called a Block. A block is just data grouped together so it can be read or written easily. Think shoveling. It would take an insane amount of time to move sand if we had to do it one particle at a time, but thanks to a shovel it can be done faster.
Data is bits of 0s and 1s, as we have seen, and a block is a group of 0s and 1s. The size of the block is decided by the file system.
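To make the slicing idea concrete, here is a minimal Python sketch. The function name `split_into_blocks` and the toy block size of 4 bytes are assumptions made purely for illustration; a real file system picks its own, much larger, block size.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive blocks of block_size bytes.

    The last block may be shorter if the data does not divide evenly.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# Toy example: a 9-byte "pizza" cut into 4-byte slices.
blocks = split_into_blocks(b"PEPPERONI", 4)
print(blocks)  # [b'PEPP', b'ERON', b'I']
```

Reading or writing then happens one block at a time, like one shovelful, rather than one bit at a time.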
The pizzas were delicious. Some detectives could swallow an entire slice in one go. Others had to take tiny nibbles. It was a constant battle between saving time vs –
One detective almost choked. Contrary to popular belief, it wasn’t the guy who swallowed a whole slice. Not this time, anyway.
With so many pizzas leaving the kitchen how did the chefs keep track of the toppings? How did they ensure each slice had the right number of toppings?
The answer to error detection is an elegant solution called a Parity Check. How does it work?
Simply put, a bit (0 or 1) is added at the end of each block so that the total number of 1s in the block comes out either even or odd. A given parity system is fixed to one or the other, even or odd, never both.
One of the detectives screamed: “Are you friggin’ crazy? There are enough 0s and 1s flying around anyway, and you want to add one more?” This detective was promptly calmed down with a can of beer, with compliments from the chef.
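For the curious, that one extra bit can be sketched in a few lines of Python. This is only an illustration: the helper names `parity_bit`, `add_parity` and `check_parity` are made up for this example, and blocks are written as strings of "0" and "1" so they are easy to read.

```python
def parity_bit(bits: str, even: bool = True) -> str:
    """Return the bit that makes the total count of 1s even (or odd)."""
    ones = bits.count("1")
    if even:
        return "1" if ones % 2 else "0"
    return "0" if ones % 2 else "1"


def add_parity(bits: str, even: bool = True) -> str:
    """Append the parity bit to the end of the block."""
    return bits + parity_bit(bits, even)


def check_parity(block_with_parity: str, even: bool = True) -> bool:
    """True if the count of 1s still matches the agreed parity."""
    ones = block_with_parity.count("1")
    return ones % 2 == (0 if even else 1)


block = "1011001"                 # four 1s, already an even count
sent = add_parity(block)          # even parity, so the extra bit is 0
print(sent)                       # 10110010
print(check_parity(sent))         # True

damaged = "00110010"              # same block with the first bit flipped
print(check_parity(damaged))      # False
```

Note that a single parity bit only reveals that something is off. It does not say which bit went wrong, and two flipped bits cancel each other out and slip through unnoticed.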
Initially, things were a mess, the chef explained. Pizzas rolled out of the kitchen as if on an assembly line. The chef responsible for adding toppings just looked at the pizza already done, and repeated the same on the next pizza, and so on.
Very soon they realized all it took was one mistake, and every pizza henceforth would have that error. If the chef added an extra slice of pepperoni on one pizza, then every subsequent pizza had the extra pepperoni.
He corrected this problem by asking the chef to look at the last pizza and compare it to a standard pizza chart. From then on, things went perfectly. Here’s an example of how a parity check works in general:
If we have two drives A and B that need to be filled with information, we add a third drive C as a parity drive. Its data is calculated from the other two using a predetermined formula, typically a bitwise XOR (exclusive or): A XOR B = C.
Now, if one drive fails or has an error, its information can be recreated from the other two drives by using the same formula.
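Here is a small Python sketch of that recovery trick. It assumes the predetermined formula is a bitwise XOR, which is what parity-based RAID levels commonly use. The drive contents and the helper name `xor_blocks` are invented for the example.

```python
def xor_blocks(x: bytes, y: bytes) -> bytes:
    """XOR two equally sized blocks byte by byte."""
    if len(x) != len(y):
        raise ValueError("blocks must be the same size")
    return bytes(a ^ b for a, b in zip(x, y))


drive_a = b"PIZZA"
drive_b = b"CRUST"

# The parity drive holds A XOR B.
drive_c = xor_blocks(drive_a, drive_b)

# Drive A dies: rebuild it from the two survivors.
rebuilt_a = xor_blocks(drive_b, drive_c)
print(rebuilt_a)  # b'PIZZA'

# The same formula works if drive B is the one that fails.
rebuilt_b = xor_blocks(drive_a, drive_c)
print(rebuilt_b)  # b'CRUST'
```

XOR fits the job because applying it twice with the same value undoes it, so any one of the three drives can be recomputed from the other two.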
What if we just backed up the data in the traditional way? If we had to back up the information in both drives A and B, we’d need two more drives, C and D. By using the parity method, we have a way to keep data ‘backed up’ with only three drives. Neat.
This, in theory, is a simple process. In reality, though, it’s anything but. Like all solutions, it has its pros and cons. However, it is one of the bedrocks of many RAID systems.
All the math behind a RAID system is governed by the RAID Controller. With some controllers, once a RAID system is built around one, a controller failure might mean having to purchase the same model, or at least a similar controller from the same manufacturer.
It’s like having a librarian for a large library organizing books in alphabetical order according to the last name of the author. When he moves on, the guy coming after cannot find anything if his system is to sort books by the first letter of the title, and so on. There are infinite ways in which data can be organized.
Which means there is more than one kind of RAID system. It’s a whodunit for most professionals. Probably the most frequently asked question about RAID is not “What is RAID?” but “Which RAID should I use?” Each system has its own unique properties – just like there are special forces units in the army, navy and the air force. They are generally classified as the same thing, but are built and organized separately, with different goals in mind.
For better or for worse RAID systems are designated with numbers:
- RAID 0
- RAID 1
- RAID 2
- RAID 3
- RAID 4
- RAID 5
- RAID 6
RAID 0 isn’t even a true RAID, because there’s no redundancy. But it has an ace up its sleeve that no other RAID has. We’ll take a look at each one soon.
There are many situations in which one objective might be better served with two kinds of RAIDs. Can we have RAIDs of RAIDs? Yes, and they are designated thus:
- RAID 0+1
- RAID 10
- RAID 50
- RAID 53
- RAID 100
And so on. You are only limited by your imagination (and some technology). This kind of structure is called a Nested RAID.
The key thing to remember is that once a bunch of disks is configured in a RAID system, the RAID system behaves like a single disk to an outsider. It’s exactly like a hard drive, except with super powers.
How do we make sense of the format RAID XYZ? Simple, just remember this:
RAID X is a box of drives. If you’re having trouble imagining, use DVDs instead. RAID X is a box of DVDs.
RAID XY is a RAID Y truck filled with RAID X boxes.
RAID XYZ is a RAID Z ship filled with RAID Y trucks, each truck filled with RAID X boxes.
So if you’re faced with the dilemma of remembering the difference between, say, RAID 01 and RAID 10, then just follow the above format. The first number is the lowest level, and the last number is the highest.
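The box, truck and ship rule can be written down as a tiny Python sketch. The function name `describe_nested_raid` is hypothetical, and it does nothing more than apply the reading rule above. Treat it as a memory aid, since names used in the wild do not always follow the rule strictly.

```python
def describe_nested_raid(name: str) -> list[str]:
    """Split a nested RAID name into its levels, lowest level first."""
    digits = name.upper().replace("RAID", "").replace("+", "").strip()
    if not digits.isdigit():
        raise ValueError(f"cannot read RAID levels from {name!r}")
    return [f"RAID {d}" for d in digits]


print(describe_nested_raid("RAID 0+1"))  # ['RAID 0', 'RAID 1']
print(describe_nested_raid("RAID 10"))   # ['RAID 1', 'RAID 0']
print(describe_nested_raid("RAID 100"))  # ['RAID 1', 'RAID 0', 'RAID 0']
```

So RAID 0+1 is a RAID 1 truck filled with RAID 0 boxes, while RAID 10 is a RAID 0 truck filled with RAID 1 boxes.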
In the next chapter, we’ll take a look at the two gods of the RAID mythology – RAID 0 and RAID 1.