Admin's RAID Data Recovery Guide

RAID Storage Techniques

Each RAID level uses one or more of the following techniques to write and read data from an array:

Striping

Striping is a technique which offers the best performance of any RAID configuration. In a striped array, data is interleaved across all the drives in the array.

An analogy may be helpful in understanding how striping works.

Imagine you asked a friend to write down all the numbers from 0 to 99. It would probably take them a few minutes to jot them all down. Now imagine that instead of asking just one friend to write down the numbers, you asked ten friends to divide the numbers up equally amongst themselves so that one writes down 0 to 9, another 10 to 19, and so on until every range has been assigned. It would take a fraction of the time. This is how striping works. By splitting up the data and distributing it across multiple drives, you increase performance.

Performance in a striped array is dependent on the stripe width (the number of drives in the array) and the stripe size (the size of the chunks of data being written across the array). Striping can occur at two different levels: byte level and block level. Byte-level striping involves breaking up the data into bytes and storing them sequentially across the hard drives. Block-level striping involves breaking up the data into blocks of a given size. These blocks are then distributed across the array in the same way as in byte-level striping.
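To make block-level striping concrete, here is a minimal sketch that deals chunks of data across drives in round-robin order and then reads them back. The function names `stripe_blocks` and `unstripe_blocks` are made up for this illustration, and each "drive" is just a Python list; a real controller works on disk sectors, not in-memory lists.

```python
def stripe_blocks(data, num_drives, stripe_size):
    """Split data into stripe_size chunks and deal them round-robin across drives."""
    drives = [[] for _ in range(num_drives)]
    for start in range(0, len(data), stripe_size):
        chunk_index = start // stripe_size
        # Chunk 0 goes to drive 0, chunk 1 to drive 1, and so on, wrapping around.
        drives[chunk_index % num_drives].append(data[start:start + stripe_size])
    return drives


def unstripe_blocks(drives):
    """Reassemble the original data by reading one chunk from each drive in turn."""
    chunks = []
    rows = max(len(drive) for drive in drives)
    for row in range(rows):
        for drive in drives:
            if row < len(drive):  # the last row may not fill every drive
                chunks.append(drive[row])
    return b"".join(chunks)


drives = stripe_blocks(b"ABCDEFGHIJ", num_drives=3, stripe_size=2)
restored = unstripe_blocks(drives)
```

With three drives and a stripe size of 2 bytes, the chunks AB, CD, EF, GH, IJ land on drives 0, 1, 2, 0, 1 respectively, which is the interleaving described above.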

So, what stripe size should you use to wring the most performance out of your RAID? Well, that depends on what type of application you're using it for.

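Larger stripes mean fewer accesses to the disk, because a typical request fits inside a single stripe and touches fewer drives, which leaves the other drives free to service other requests. For this reason, larger stripes are useful for I/O-intensive applications such as database servers, which issue many small, independent reads and writes. Smaller stripes, on the other hand, spread even a single file across more drives, so more drives can work on each transfer in parallel. Consequently, smaller stripes are better suited to throughput-intensive applications such as video production and editing.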
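The trade-off can be sketched with a little arithmetic. The hypothetical helper below (`drives_touched` is a name invented for this example) counts how many drives a single request spans for a given stripe size, assuming simple round-robin block-level striping. It is only a rough model and ignores caching, seek times, and controller behaviour.

```python
def drives_touched(offset, length, stripe_size, num_drives):
    """Return how many drives one request of `length` bytes at `offset` spans."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_chunk = offset // stripe_size
    last_chunk = (offset + length - 1) // stripe_size
    chunks = last_chunk - first_chunk + 1
    # A request can never involve more drives than the array contains.
    return min(chunks, num_drives)


KIB = 1024

# A small 4 KiB database-style read on a 4-drive array:
small_on_large_stripe = drives_touched(0, 4 * KIB, 64 * KIB, 4)
small_on_small_stripe = drives_touched(0, 4 * KIB, 1 * KIB, 4)

# A large 1 MiB video-style transfer on the same array:
big_on_large_stripe = drives_touched(0, 1024 * KIB, 1024 * KIB, 4)
big_on_small_stripe = drives_touched(0, 1024 * KIB, 64 * KIB, 4)
```

Under these assumptions the small read ties up one drive with a 64 KiB stripe but all four with a 1 KiB stripe, while the large transfer uses one drive with a 1 MiB stripe and all four with a 64 KiB stripe.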

Mirroring

Although a striped array may offer the best performance of any RAID configuration, it provides no redundancy. If one drive in the array fails, all of your data will be lost and you may need to consider RAID data recovery options.

That's where mirroring comes in. With mirroring, whatever you write to one drive gets written simultaneously to another. Thus, you always have an exact duplicate of your data on the second drive. This is one of the two data redundancy techniques used in RAID to protect you from data loss. The advantage of this technique is that when one hard drive in the array fails, the system can continue to operate from the surviving copy of the data. Downtime is minimal and rebuilding the data from the good copy is relatively easy.

Mirroring also provides a small read performance boost over a single non-arrayed drive. Since the mirrored pairs contain the same data, the RAID controller can read data from one drive while simultaneously requesting data from the other. Of course, write speeds will be slower than with striping because data must be written twice, once on each drive.
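As a rough illustration of the behaviour described above, the following toy model keeps two in-memory "drives" in sync, alternates reads between them, and rebuilds a replaced drive from the good copy. The class `MirroredPair` and its methods are invented for this sketch and do not correspond to any real controller's interface.

```python
class MirroredPair:
    """Toy model of a two-drive mirror; each drive is a dict of block -> data."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]
        self._next_read = 0  # which drive to try first on the next read

    def write(self, block, data):
        # Every write goes to both drives (this is why writes cost double).
        for i in range(2):
            if not self.failed[i]:
                self.drives[i][block] = data

    def read(self, block):
        # Alternate between healthy drives so reads are shared between them.
        for attempt in range(2):
            i = (self._next_read + attempt) % 2
            if not self.failed[i]:
                self._next_read = (i + 1) % 2
                return self.drives[i][block]
        raise IOError("both drives have failed")

    def fail(self, i):
        # Simulate a dead drive: its contents are gone.
        self.failed[i] = True
        self.drives[i] = {}

    def rebuild(self, i):
        # Copy everything from the surviving drive onto the replacement.
        good = 1 - i
        if self.failed[good]:
            raise IOError("no good copy to rebuild from")
        self.drives[i] = dict(self.drives[good])
        self.failed[i] = False


mirror = MirroredPair()
mirror.write(0, b"payroll")
mirror.fail(0)
still_readable = mirror.read(0)   # served by the surviving drive
mirror.rebuild(0)
```

After `rebuild`, both drives hold identical contents again, which mirrors the "rebuild from the good copy" step in the text.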

Parity

Parity is an error correction technique commonly used in certain RAID levels. It is used to reconstruct data on a drive that has failed in an array.

Here's how it works: for each stripe of data written to the array, your RAID controller calculates an extra block of parity data from the data blocks and stores it alongside them. Basically, each parity bit is chosen so that the corresponding bits across the drives add up to an even number (some schemes use an odd number instead). If a drive fails, the controller looks at the bits on the surviving drives and works out what the missing bits must have been to produce that total. In this way it can regenerate the lost data automatically, without needing a second full copy of it.

You may be wondering how the parity data is created in the first place. Well, typically it's done using a logical operation called eXclusive OR (XOR). Basically, the controller lines up the series of 0s and 1s which make up the data on each drive and, for each bit position, XOR returns 0 (FALSE) when the number of 1s is even and 1 (TRUE) when it is odd. By using this data, it can "fill in the blanks". It's like being back in your high school algebra class. You know that 3 + 6 = 9. If you see the equation 3 + _ = 9, you know the blank is supposed to be a 6. The XOR logic is used in this way to rebuild the data from a failed drive, thus maintaining integrity.
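The "fill in the blanks" idea can be shown in a few lines. This is a minimal sketch, assuming three equally sized data blocks and a single parity block; the helper name `xor_blocks` and the sample byte values are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data drive (sample values chosen arbitrarily).
data_blocks = [b"\x03\x10", b"\x06\xf0", b"\x09\x0f"]

# The parity block is the XOR of all the data blocks.
parity = xor_blocks(data_blocks)

# Suppose the drive holding data_blocks[1] fails. XOR-ing the surviving
# data blocks with the parity block yields the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

This works because XOR-ing a value with itself cancels it out, so combining everything except the missing block leaves exactly the missing block, much like solving 3 + _ = 9 for the blank.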
