Introduction to RAID
RAID (Redundant Array of Independent Disks) is a storage technology
that is revolutionizing on-line data storage in computers.
Spanning the entire spectrum from personal computers to mainframes, RAID
offers significant improvements in availability, reliability, and
maintainability of information storage, along with higher performance than
today's conventional magnetic disk drives. Yet the concept behind this
revolutionary technology is relatively simple.
Background:
RAID is a type of disk array which follows a formalized definition of
how data is stored and retrieved. A disk array is simply a set of two or
more identical disk drives interfaced to a host computer in such a manner as
to appear as a single disk unit to that host. The disk array's hardware (or
software) manages the data distribution such that the host computer is
unaware that data is actually being written to and/or read from multiple
disk drives. RAID technology expands upon this simple disk array approach to
define several methods to provide data redundancy and higher performance.
Unlike discrete disk drives connected directly to a host computer
system, RAID systems spread data across all of the disk drives within the
RAID system. Redundant data or error correction information, such as
parity data, is stored separately from the data within the RAID array, so
the failure of any single drive does not result in the loss of data. The
system continues to function without downtime, and once a new drive is
installed, the data that was stored on the failed disk is reconstructed
from the remaining functioning drives. This is what makes RAID systems
fault tolerant. The same redundancy can be extended to other components
within the system to further increase data availability.
What is Parity Data?
Parity is additional information stored with the original data to ensure
that the original data is recoverable in the event of an error. Consider the
following example. Suppose I ask you to remember (or store) the following
four numbers for me: 1, 8, 3, and 4. You dutifully write the numbers down.
The following week I need the numbers and ask you to retrieve them. As you
read them off the paper, you find that the fourth number is illegible and
you cannot decipher it. Part of the data has now become unrecoverable, a
disastrous situation for a computer. Suppose now that when I gave you the
original four numbers, you also wrote down a fifth number (parity data), 16,
which is the sum of the four. This time, if the fourth number were
illegible, you could simply subtract 12 (the sum of the three known numbers)
from 16 (the fifth number) to obtain 4 (the missing number). Thus, the data
has been recovered. RAID systems store parity information in much the same
manner.
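
The arithmetic above can be sketched in a few lines of Python. This is only
an illustration, not how any particular RAID product is implemented. The
function names are hypothetical, and the XOR variant is an addition that
the example above does not mention: parity-based RAID levels commonly use
bitwise XOR rather than a sum, because the parity then never grows larger
than the data it protects.

```python
from functools import reduce
from operator import xor


def sum_parity(values):
    """Parity as in the example above: the sum of all stored numbers."""
    return sum(values)


def recover_with_sum(known_values, parity):
    """Recover one missing number by subtracting the known ones from the parity."""
    return parity - sum(known_values)


def xor_parity(values):
    """Assumed alternative: parity as the bitwise XOR of all stored numbers."""
    return reduce(xor, values, 0)


def recover_with_xor(known_values, parity):
    """Recover one missing number by XOR-ing the parity with the known ones."""
    return reduce(xor, known_values, parity)


data = [1, 8, 3, 4]
surviving = data[:3]  # the fourth number has become illegible

parity = sum_parity(data)                        # 16
print(recover_with_sum(surviving, parity))       # 16 - 12 = 4

x_parity = xor_parity(data)                      # 1 ^ 8 ^ 3 ^ 4 = 14
print(recover_with_xor(surviving, x_parity))     # 14 ^ 1 ^ 8 ^ 3 = 4
```

Either way, a single parity value can recover only one missing number. If
two of the four numbers were illegible, neither method could tell them apart.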