Dichov`s blog: Redundant Array of Independent Disks

One of the most unreliable components of the hardware on which our precious information systems run is the good old hard drive.
RAID is an acronym for Redundant Array of Inexpensive (or Independent) Disks. It is a storage technology that combines multiple disk drives into a single logical unit. A RAID array is basically a group of disks that uses mirroring, parity, striping, or a combination of them. Mirroring and parity store redundant information so that data can be rebuilt after a drive fails; striping spreads data across the drives so that their individual throughput adds up, giving you more performance than a single disk could provide.
RAID is mainly a form of fault tolerance. A RAID array is a collection of drives that collectively act as a single storage system and, with the exception of RAID 0, can tolerate the failure of a drive without losing data.
There are five key RAID solutions:

RAID 0 (striping)
RAID 1 (mirroring)
RAID 5 (striping with parity)
RAID 6 (striping with double parity)
RAID 10 (mirroring with striping)

RAID 0 - Striping

RAID 0 is a striped disk array without fault tolerance. It provides high I/O performance at low inherent cost, but the failure of any single drive destroys the whole array. Despite the name, RAID 0 is not actually redundant unless it is combined with another RAID level that provides data redundancy and rebuilding.
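
To make striping concrete, here is a minimal sketch of how a logical block number could be mapped to a physical location. It assumes a simple round-robin layout with a stripe unit of one block; the function name locate_block and its parameters are made up for illustration and do not come from any particular RAID implementation.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    disk = logical_block % num_disks      # which drive holds the block
    offset = logical_block // num_disks   # position of the block on that drive
    return disk, offset


if __name__ == "__main__":
    # Consecutive logical blocks land on different drives,
    # so a large sequential transfer keeps every drive busy.
    for block in range(6):
        print(block, locate_block(block, num_disks=3))
```

Because consecutive blocks land on different drives, a large transfer can use all drives at once, which is where the performance gain comes from.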

RAID 1 - Mirroring

Each disk in a mirrored array holds an identical image of user data.
Mirroring is extremely stable because the process is so simple, but it requires you to purchase twice as many drives as you would need without RAID, since the second drive is dedicated to redundancy. Write speeds are equivalent to those of a non-RAID system, while read speeds can be almost twice as fast in most situations, because during read operations the drives can be accessed in parallel to increase throughput.
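
The behaviour described above can be sketched with a toy in-memory mirror. This is only an illustration under simplifying assumptions: the class name MirroredPair and its methods are hypothetical, the disks are plain Python lists, and reads simply alternate between healthy members to imitate parallel access.

```python
class MirroredPair:
    """Toy mirror: every write goes to all healthy disks, reads alternate."""

    def __init__(self, num_blocks: int, copies: int = 2) -> None:
        self.disks = [[b""] * num_blocks for _ in range(copies)]
        self.failed: set[int] = set()
        self._next_reader = 0

    def _healthy(self) -> list[int]:
        return [i for i in range(len(self.disks)) if i not in self.failed]

    def write(self, block: int, data: bytes) -> None:
        # Each healthy member must perform the same write.
        for i in self._healthy():
            self.disks[i][block] = data

    def read(self, block: int) -> bytes:
        healthy = self._healthy()
        if not healthy:
            raise IOError("all mirror members have failed")
        # Alternate between members so reads are spread across drives.
        disk = healthy[self._next_reader % len(healthy)]
        self._next_reader += 1
        return self.disks[disk][block]

    def fail(self, disk: int) -> None:
        """Mark one member as failed."""
        self.failed.add(disk)


if __name__ == "__main__":
    mirror = MirroredPair(num_blocks=4)
    mirror.write(1, b"payload")
    mirror.fail(0)                  # lose one drive
    print(mirror.read(1))           # data is still served by the other drive
```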

RAID 5 - Striping with Single Parity

In this RAID type data is written in a complex stripe across all drives in the array with a distributed parity block that exists across all of the drives. By doing this RAID 5 is able to use an arbitrarily sized array of three or more disks and only loses the storage capacity equivalent to a single disk to parity. A RAID 5 array can survive the loss of any single disk in the array.
RAID 5 is often used because of its cost effectiveness: in large arrays only a small fraction of the total capacity goes to parity. Unlike mirroring, striping with parity requires a parity calculation for each stripe written across the disks, and this creates some overhead. Throughput is therefore not easy to predict and depends heavily on the computational power of the system doing the parity calculation.
Advantages: high read transaction rate and medium write transaction rate.
Disadvantages: RAID 5 can survive the loss of only a single drive.
When capacity is at a premium, RAID 5 is a popular choice because it loses the least storage capacity of the redundant array types described here.
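
Single parity is commonly computed as a bytewise XOR of the data blocks in a stripe, which is why any one missing block can be rebuilt from the others. The sketch below shows this for a single stripe; it does not model how the parity block rotates across the drives, and the helper name xor_blocks is made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Return the bytewise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # One stripe on a four-disk array: three data blocks plus one parity block.
    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(data)

    # Simulate losing the drive that held data[1], then rebuild its block
    # from the surviving data blocks and the parity block.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

The same XOR has to be recomputed whenever a stripe is written, which is the overhead mentioned above.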

RAID 6 - Redundant Striping with Double Parity

RAID 6 uses independent data disks with two independent distributed parity schemes. It is practically identical to RAID 5 but uses two parity blocks per stripe rather than one, to allow for additional protection against disk failure. If a disk fails, the array is still redundant. Even if a second drive fails, the array continues to operate.
RAID 6 is special in that it allows for the failure of any two drives within an array without suffering data loss. To accommodate the additional level of redundancy, a RAID 6 array loses the storage capacity equivalent to two drives and requires a minimum of four drives.
RAID 6, used in a large array, introduces a very small loss of storage capacity while providing the assurance of being able to lose any two drives. It provides extremely high fault tolerance and can sustain two simultaneous drive failures.
RAID 6 has poor write performance compared with the other levels, because two parity blocks must be calculated and written whenever a stripe is updated.
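
The capacity figures quoted for the different levels can be checked with a little arithmetic. The sketch below assumes that all drives are the same size; the function name usable_disks is hypothetical, and the minimum of four drives for RAID 10 is a commonly cited figure rather than something stated in this post.

```python
def usable_disks(level: int, num_disks: int) -> int:
    """Number of drives' worth of usable capacity for RAID 5, 6 or 10."""
    if level == 5:
        minimum, usable = 3, num_disks - 1    # one drive's worth of parity
    elif level == 6:
        minimum, usable = 4, num_disks - 2    # two drives' worth of parity
    elif level == 10:
        if num_disks % 2:
            raise ValueError("RAID 10 needs an even number of drives")
        minimum, usable = 4, num_disks // 2   # half the drives hold mirror copies
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return usable


if __name__ == "__main__":
    for n in (4, 8, 12):
        for level in (5, 6, 10):
            share = usable_disks(level, n) / n
            print(f"{n:2d} drives, RAID {level:2d}: {share:.0%} usable")
```

The share lost to parity shrinks as the array grows, while RAID 10 stays at one half regardless of size.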

RAID 10 - Mirroring plus Striping

Technically RAID 10 is a hybrid RAID type encompassing a set of RAID 1 mirrors existing in a non-parity stripe (RAID 0). RAID 10 is implemented as a striped array whose segments are RAID 1 arrays. With RAID 10, drives must be added in pairs so only an even number of drives can exist in an array.
RAID 10 can survive the loss of up to half of the total set of drives, as long as no more than one drive is lost from each pair. RAID 10 does not involve a parity calculation, giving it a performance advantage over RAID 5 or RAID 6 and requiring less computational power to drive the array. RAID 10 delivers the greatest read performance of any common RAID type, as all drives in the array can be used simultaneously in read operations. Transactional performance with RAID 10 is good because either disk in a mirror can respond to read requests. No parity information needs to be calculated, so disk writes are handled efficiently, although each disk in a mirrored set must perform the same write.

If a disk fails in a RAID 10 array, write performance is not affected because the remaining member of the mirror can still accept writes. Reads are moderately affected because only one physical disk in that mirror can now respond to read requests. When the failed disk is replaced, the mirror is re-established and the data must be copied to the new disk.

RAID 10 can tolerate the loss of up to 50% of its drives, provided that no more than one member of each pair fails. It provides high reliability combined with high performance.
Disadvantages: very expensive, because half of the raw capacity is used for mirror copies.
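
The survival rule for RAID 10 can be written down as a short check. This sketch assumes two-way mirrors in which drives 2k and 2k+1 form pair k; that numbering, and the function name raid10_survives, are illustrative assumptions rather than how any particular controller labels its drives.

```python
def raid10_survives(num_disks: int, failed: set[int]) -> bool:
    """Return True if a RAID 10 array still holds all data.

    Assumes two-way mirrors where drives 2k and 2k+1 form pair k.
    """
    if num_disks < 4 or num_disks % 2:
        raise ValueError("expected an even number of drives, at least four")
    if any(d < 0 or d >= num_disks for d in failed):
        raise ValueError("failed drive index out of range")
    for pair in range(num_disks // 2):
        first, second = 2 * pair, 2 * pair + 1
        if first in failed and second in failed:
            return False   # both copies of this pair's data are gone
    return True


if __name__ == "__main__":
    print(raid10_survives(6, {0, 2, 4}))   # one drive lost from every pair
    print(raid10_survives(6, {0, 1}))      # both drives of the same pair lost
```

Losing half of the drives is survivable only in the first case, where every pair keeps one working member.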

Conclusion

1. In RAID 10, if you have bad luck and lose both drives in one of the mirrors (quite possible, since the two drives of a pair do the same workload and may wear out at about the same time), you lose the whole array. On the other hand, rebuilding a RAID 6 array takes much longer than the disk-to-disk copy needed to rebuild a RAID 10 mirror.
2. In a RAID 10 array, usable disk capacity is cut in half, but driving the array requires less computational power than a RAID 6 array.
3. RAID 10 provides very high read/write performance combined with high reliability. RAID 6 provides very high reliability with high read performance, but low write performance.
4. RAID 10 arrays are typically used in environments that require uncompromising availability coupled with exceptionally high throughput for the delivery of data located in secondary storage. RAID 6 arrays are a good solution for critical applications.
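
To put a rough number on point 1, the sketch below enumerates every possible pair of failed drives and counts how many of them take out a whole mirror. It assumes two-way mirrors (drives 2k and 2k+1 paired) and that every pair of failures is equally likely, which is optimistic if the drives of a mirror really do wear out together; the function name is made up for illustration. RAID 6 survives any two failures, so its corresponding fraction is zero.

```python
from fractions import Fraction
from itertools import combinations


def fatal_pair_fraction_raid10(num_disks: int) -> Fraction:
    """Fraction of two-drive failure combinations that destroy a RAID 10 array.

    Assumes two-way mirrors (drives 2k and 2k+1 form a pair) and that
    every pair of failed drives is equally likely.
    """
    if num_disks < 4 or num_disks % 2:
        raise ValueError("expected an even number of drives, at least four")
    total = 0
    fatal = 0
    for a, b in combinations(range(num_disks), 2):
        total += 1
        if a // 2 == b // 2:   # both failed drives belong to the same mirror
            fatal += 1
    return Fraction(fatal, total)


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, "drives:", fatal_pair_fraction_raid10(n))
```

Under these assumptions the fraction works out to 1/(n-1) for n drives, so the risk from a second failure falls as the array grows.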

Approximate Read/Write speed correlation
RAID level	Approximate read speed relative to write speed
RAID 0 (striping)	Read speed = 1 x (write speed)
RAID 1 (mirroring)	Read speed = 2 x (write speed)
RAID 5 (striping with parity)	Read speed = ~4 x (write speed)
RAID 6 (striping with double parity)	Read speed = ~6 x (write speed)
RAID 10 (mirroring with striping)	Read speed = 2 x (write speed)

Friday, April 5, 2013

Redundant Array of Independent Disks - RAID 6 vs. RAID 10 (or 1+0)