RAID Storage Techniques
Each RAID level uses one or more of
the following techniques to write data to and read data from an array:
striping, mirroring, and parity.
Striping is a technique which offers the best performance of any
RAID configuration. In a striped array, data is interleaved across
all the drives in the array.
An analogy may be helpful in understanding how striping works.
Imagine you asked a friend to write down all the numbers from
0 to 99. It would probably take him a few minutes to jot them
all down. Now imagine that instead of asking just one friend to
write down the numbers, you asked ten friends to divide the numbers
up equally amongst themselves so that one writes down 0 to 9, another
10 to 19, and so on and so forth until all were assigned a task.
It would take a fraction of the time. This is how striping works.
By splitting up the data and distributing it across multiple drives,
you increase performance.
Performance in a striped array is dependent on the stripe
width (the number of drives in the array) and the stripe size
(the size of the chunks of data being written across the array).
Striping can occur at two different levels: byte level and block
level. Byte-level striping breaks the data into individual bytes
and stores them on the drives in turn: the first byte on the first
drive, the second byte on the second drive, and so on. Block-level
striping breaks the data into blocks of a given size. These blocks
are then distributed across the array in the same way.
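The round-robin distribution described above can be sketched in a few
lines of Python. This is only an illustration, not how any particular
controller works: the drives are modelled as in-memory lists, and the
names stripe_blocks and read_striped are made up for this example.
Setting block_size to 1 gives byte-level striping.

```python
def stripe_blocks(data: bytes, num_drives: int, block_size: int) -> list:
    """Split data into fixed-size blocks and deal them out to the drives in turn."""
    drives = [[] for _ in range(num_drives)]  # each "drive" is just a list of blocks
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        drive_index = (offset // block_size) % num_drives
        drives[drive_index].append(block)
    return drives


def read_striped(drives: list) -> bytes:
    """Reassemble the data by visiting the drives in the same round-robin order."""
    out = bytearray()
    longest = max(len(drive) for drive in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                out += drive[row]
    return bytes(out)


drives = stripe_blocks(b"ABCDEFGHIJKL", num_drives=3, block_size=2)
print(drives)                # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF', b'KL']]
print(read_striped(drives))  # b'ABCDEFGHIJKL'
```

In a real array the three drives would work on their blocks at the same
time, which is where the speed-up comes from.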
So, what stripe size should you use to wring the most performance
out of your RAID? Well, that depends on what type of application
you're using it for.
With larger stripes, a typical request fits inside a single stripe,
so it is served by one drive while the other drives stay free to handle
other requests. For this reason, larger stripes are useful for
I/O-intensive (input/output) applications such as database servers,
which issue many small requests at once. Smaller stripes, on the other
hand, spread each file across more drives, so a single large transfer
is read from or written to several drives at the same time.
Consequently, smaller stripes are better suited to throughput-intensive
applications such as video production and editing.
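To see how stripe size changes the way a request is spread over the
array, here is a rough Python sketch. It is a simplified model that
assumes round-robin block-level striping and ignores caching and
controller behaviour; the function name drives_touched and the sizes
used are chosen purely for illustration.

```python
KIB = 1024


def drives_touched(offset: int, length: int, stripe_size: int, num_drives: int) -> set:
    """Return the indexes of the drives that hold some part of a request.

    offset and length are in bytes; length must be greater than zero.
    """
    first_chunk = offset // stripe_size
    last_chunk = (offset + length - 1) // stripe_size
    return {chunk % num_drives for chunk in range(first_chunk, last_chunk + 1)}


# The same 64 KiB request on a four-drive array with two different stripe sizes.
print(drives_touched(0, 64 * KIB, stripe_size=16 * KIB, num_drives=4))   # {0, 1, 2, 3}
print(drives_touched(0, 64 * KIB, stripe_size=256 * KIB, num_drives=4))  # {0}
```

With the small stripe, all four drives cooperate on the one request.
With the large stripe, a single drive handles it and the other three
remain available for other requests.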
Although a striped array may offer the best performance of any
RAID configuration, it provides no redundancy. If one drive in the
array fails, all of your data will be lost and you may need to consider
RAID data recovery options.
That's where mirroring comes in. With mirroring, whatever you
write to one drive is written simultaneously to another. Thus,
you always have an exact duplicate of your data on the second drive.
This is one of the two data redundancy techniques used in RAID to
protect you from data loss. The advantage of this technique is that
when one hard drive in the array fails, the system can still continue
to operate since there are two copies of the data. Downtime is minimal
and rebuilding the data from the good
copy is relatively easy.
Mirroring also provides a small read performance boost over a single
non-arrayed drive. Since the mirrored pairs contain the same data,
the RAID controller can read data from one drive while simultaneously
requesting data from the other. Writes see no such benefit, and are
slower than in a striped array, because all data must be written twice,
once on each drive.
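A mirrored pair can be modelled with a short Python sketch. This is a
toy model under stated assumptions, not a real controller: each drive
is a dictionary mapping block numbers to data, reads simply alternate
between the two drives, and the class and method names (MirroredPair,
fail, rebuild) are invented for this example.

```python
class MirroredPair:
    """Toy model of two mirrored drives."""

    def __init__(self):
        self.drives = [{}, {}]         # block number -> data, one dict per drive
        self.failed = [False, False]
        self._next_read = 0            # which drive gets the next read

    def write(self, block_no, data):
        # Every write goes to both drives (or to the survivor if one has failed).
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block_no] = data

    def read(self, block_no):
        # Alternate between the drives, skipping one that has failed.
        for _ in range(2):
            i = self._next_read
            self._next_read = 1 - i
            if not self.failed[i]:
                return self.drives[i][block_no]
        raise IOError("both drives have failed")

    def fail(self, i):
        self.failed[i] = True
        self.drives[i] = {}            # its contents are gone

    def rebuild(self, i):
        # Copy everything from the good drive onto the replacement.
        if self.failed[1 - i]:
            raise IOError("no good copy left to rebuild from")
        self.drives[i] = dict(self.drives[1 - i])
        self.failed[i] = False


pair = MirroredPair()
pair.write(0, b"payroll")
pair.fail(0)
print(pair.read(0))                      # b'payroll', served by the surviving drive
pair.rebuild(0)
print(pair.drives[0] == pair.drives[1])  # True
```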
Parity is an error correction technique commonly used in certain
RAID levels. It is used to reconstruct
data on a drive that has failed in an array.
Here's how it works: for each set of data blocks written across the
array, your RAID controller calculates an extra block of parity
information and stores it alongside the actual data. Each parity bit
is chosen so that the bits in the same position across the data and
the parity always add up to an even number (or, in an odd-parity
scheme, an odd number). By checking this value, the controller can
detect when the information has been compromised. If a drive has
failed, the controller can work out the missing data from the
surviving data and the parity.
You may be wondering how the parity data is created in the first
place. Well, typically it's done using a logical operation called
eXclusive OR (XOR). Basically, the controller compares the 0's and
1's in the same position on each drive and returns a 1 if there is an
odd number of 1's, or a 0 if there is an even number. By using
this data, it can "fill in the blanks". It's like being
back in your high school algebra class. You know that 3 + 6 = 9.
If you see the equation 3 + _ = 9, you know the blank is supposed
to be a 6. The XOR logic is used in this way to rebuild corrupted
data on the array, thus maintaining integrity.
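The "fill in the blanks" idea can be tried out in Python. The sketch
below assumes a simple layout with three data blocks and one parity
block; the helper name xor_blocks and the sample bytes are made up for
illustration, and a real controller does this work in hardware or in
the driver.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each of which would live on a different drive.
data_blocks = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]

# The parity block is the XOR of all the data blocks.
parity = xor_blocks(data_blocks)

# Pretend the drive holding the second block has failed.
survivors = [data_blocks[0], data_blocks[2], parity]

# XOR-ing the surviving blocks with the parity gives back the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])  # True
```

This works because XOR-ing a value with itself cancels it out, so
everything except the missing block disappears from the result.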