Learn about RAID

In the late 1980s and early 1990s, IT service providers faced a massive increase in the amount of data they had to store. Fitting servers with large numbers of high-capacity hard drives was becoming very expensive. RAID was created to solve this problem.

How is RAID defined? RAID stands for Redundant Array of Inexpensive Disks. It is a system that connects a series of low-cost hard drives together to form a single large-capacity storage device offering better performance and reliability than earlier solutions. RAID was first deployed as a storage method in businesses and servers, but over the past five years it has become popular with all kinds of users.

Advantages of RAID

There are three main reasons for applying RAID:

  1. Redundancy
  2. Higher performance
  3. Low cost

Redundancy is the most important factor in the development of RAID for server environments. It keeps a backup copy of the stored data available when something goes wrong. If a drive in the array fails, it can be swapped for a new drive without shutting down the system, or a spare drive can take over. The method of redundancy depends on the RAID level used.

With the more capable RAID levels, the performance gain is clearly noticeable. Performance also depends on the number of drives linked together and on the controller circuitry.

Every IT manager wants to keep costs down. When the RAID standard was created, cost was a key issue. The goal of a RAID array is to provide the same or better storage for the system compared with using individual high-capacity drives.

Three RAID levels are used in desktop systems: RAID 0, RAID 1 and RAID 5. In many cases only two of the three are available, and one of those two is not, strictly speaking, a RAID level at all.

RAID 0

RAID 0 is not really a true RAID level, because it provides no redundancy for the stored data. If one drive fails, all the data is at risk.

RAID 0 uses a technique called 'striping'. Striping divides a single piece of data into blocks and spreads them across the hard drives. The effect is to increase performance: two blocks of data can be written simultaneously to two drives, rather than one after another to a single drive.

Below is an example of how data is written to a RAID 0 array. Each row in the table represents a physical block on the drives and each column represents a different hard drive. The numbers in the table are the data blocks; a repeated number indicates a duplicated data block.

|         | Hard drive 1 | Hard drive 2 |
|---------|--------------|--------------|
| Block 1 | 1            | 2            |
| Block 2 | 3            | 4            |
| Block 3 | 5            | 6            |

Therefore, if all six data blocks in the table are combined into a single data file, it can be read and written much faster than on a single drive. Each drive, working in parallel, only has to read three blocks, whereas a single drive would take twice as long to read all six. The drawback of this technique is that if one drive fails, the data becomes unusable: all six blocks are needed to read the file, but only three are accessible.
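The striping layout in the table above can be sketched in a few lines of Python. This is only a simplified illustration, not how a real controller works: the function names `stripe` and `read_back` are hypothetical, and each "drive" is just a Python list (or `None` for a failed drive).

```python
def stripe(blocks, num_drives=2):
    """Distribute blocks round-robin across drives, RAID 0 style."""
    drives = [[] for _ in range(num_drives)]
    for i, block in enumerate(blocks):
        drives[i % num_drives].append(block)
    return drives


def read_back(drives):
    """Reassemble the original block order; fail if any drive is missing."""
    if any(drive is None for drive in drives):
        raise OSError("a drive has failed: RAID 0 has no redundancy")
    total = sum(len(drive) for drive in drives)
    return [drives[i % len(drives)][i // len(drives)] for i in range(total)]


drives = stripe([1, 2, 3, 4, 5, 6])
print(drives)             # [[1, 3, 5], [2, 4, 6]]
print(read_back(drives))  # [1, 2, 3, 4, 5, 6]
```

Replacing either list with `None` makes `read_back` raise an error, which mirrors the drawback described above: with one drive gone, only three of the six blocks are reachable.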

Advantages:

  1. Increased storage performance.
  2. No loss of data capacity.

Disadvantages:

  1. No data redundancy.

RAID 1

RAID 1 is the first true RAID level. It provides simple data redundancy through a technique called 'mirroring'. This technique requires two separate hard drives of the same capacity. One is the active drive and the other is the backup drive. Whenever data is written to the active drive, it is also written to the backup drive.

This is an example of how data is written to a RAID 1 array. Each row in the table represents a physical block on the drives and each column represents a different hard drive. The numbers in the table are the data blocks; a repeated number indicates a duplicated data block.

|         | Hard drive 1 | Hard drive 2 |
|---------|--------------|--------------|
| Block 1 | 1            | 1            |
| Block 2 | 2            | 2            |
| Block 3 | 3            | 3            |

RAID 1 provides a complete backup copy of the data. If one drive fails, the other remains active. The drawback of this technique is that the capacity of the array is only equal to that of the smaller of the two drives, effectively halving the storage that would be available if the two drives were used independently.
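As a rough sketch of mirroring, the toy class below writes every block to both drives and keeps serving reads after one of them fails. The class name `Mirror` and its methods are hypothetical and exist only for illustration; each drive is modelled as a dictionary.

```python
class Mirror:
    """Toy RAID 1 pair: every write goes to both drives."""

    def __init__(self):
        # One dict per drive, mapping block number -> data.
        self.drives = [{}, {}]

    def write(self, block_no, data):
        for drive in self.drives:
            if drive is not None:
                drive[block_no] = data

    def fail(self, index):
        self.drives[index] = None  # simulate a dead drive

    def read(self, block_no):
        for drive in self.drives:
            if drive is not None:
                return drive[block_no]
        raise OSError("both drives have failed")


pair = Mirror()
for n in (1, 2, 3):
    pair.write(n, f"data-{n}")
pair.fail(0)
print(pair.read(2))  # data-2, served by the surviving drive
```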

Advantages:

  1. Provides full data redundancy.

Disadvantages:

  1. Storage capacity is only as large as the smallest drive.
  2. No increase in performance.
  3. Some downtime to change the active drive when a failure occurs.

RAID 0 + 1

This is a RAID combination that some manufacturers have implemented to combine the benefits of the two levels. It only works on systems with at least four hard drives. The 'mirroring' and 'striping' techniques are combined to provide both performance and redundancy. The first set of drives is active and has the data striped across it, while the second set of drives mirrors the data on the first set.

The following example shows how data is written to a RAID 0 + 1 array. Each row in the table represents a physical block on the drives and each column represents a different hard drive. The numbers in the table are the data blocks; a repeated number indicates a duplicated data block.

|         | Hard drive 1 | Hard drive 2 | Hard drive 3 | Hard drive 4 |
|---------|--------------|--------------|--------------|--------------|
| Block 1 | 1            | 2            | 1            | 2            |
| Block 2 | 3            | 4            | 3            | 4            |
| Block 3 | 5            | 6            | 5            | 6            |

In this case, the data blocks are striped across the drives within each set and mirrored between the two sets. This gives the increased performance of RAID 0, because each drive takes only about half the time a single drive would, while still providing redundancy. The main drawback of this method is cost, since it requires at least four hard drives.

Advantages:

  1. Increased performance.
  2. Data is fully redundant.

Disadvantages:

  1. Requires a large number of hard drives.
  2. Usable data capacity is effectively halved.

RAID 10 or 1 + 0

RAID 10 is similar to RAID 0 + 1. Instead of striping data across a set of drives and then mirroring the set, the first two hard drives are mirrored together, and so are the second two. This is a nested RAID setup: drives 1 and 2 form one mirrored pair, and drives 3 and 4 form another. The two pairs are then combined into a striped array.

Here is an example of how data is written to a RAID 10 array. Each row in the table represents a physical block on the drives and each column represents a different hard drive. The numbers in the table are the data blocks; a repeated number indicates a duplicated data block.

|         | Hard drive 1 | Hard drive 2 | Hard drive 3 | Hard drive 4 |
|---------|--------------|--------------|--------------|--------------|
| Block 1 | 1            | 1            | 2            | 2            |
| Block 2 | 3            | 3            | 4            | 4            |
| Block 3 | 5            | 5            | 6            | 6            |

Like RAID 0 + 1, RAID 10 needs a minimum of four hard drives. However, data protected with RAID 10 is safer than with RAID 0 + 1, because the array keeps working as long as each mirrored pair retains one working drive.
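One way to see why RAID 10 is considered safer is to count which two-drive failures each layout survives. The sketch below assumes the simplified four-drive layouts from the two tables above (sets or pairs made of drives 1 and 2, and drives 3 and 4); the function names are hypothetical, and real controllers may behave differently.

```python
from itertools import combinations

DRIVES = (1, 2, 3, 4)


def raid01_survives(failed):
    """RAID 0+1: two striped sets (1,2) and (3,4) that mirror each other.
    The array works only if at least one striped set is fully intact."""
    return not ({1, 2} & failed) or not ({3, 4} & failed)


def raid10_survives(failed):
    """RAID 10: mirrored pairs (1,2) and (3,4), striped together.
    The array works if every mirrored pair still has one live drive."""
    return not {1, 2} <= failed and not {3, 4} <= failed


two_drive_failures = [set(c) for c in combinations(DRIVES, 2)]
print(sum(raid01_survives(f) for f in two_drive_failures), "of 6")  # 2 of 6
print(sum(raid10_survives(f) for f in two_drive_failures), "of 6")  # 4 of 6
```

Under these assumptions both layouts survive any single drive failure, but RAID 10 survives four of the six possible two-drive failures while RAID 0 + 1 survives only two.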

Advantages:

  1. Increased performance.
  2. Data is fully redundant.

Disadvantages:

  1. Requires a large number of hard drives.
  2. Usable data capacity is effectively halved.

RAID 5

RAID 5 is the most powerful RAID level for desktop systems. It typically requires a hardware controller to manage the drive array, although some operating systems can do this in software. This method uses striping with 'parity' to maintain data redundancy. At least three hard drives, ideally of the same capacity, are needed for RAID 5.

'Parity' is essentially a binary operation that takes two data blocks and produces a third block derived from them. The simplest way to explain it is in terms of even and odd. If the sum of the two bits is even, the parity bit is even (0); if the sum is odd, the parity bit is odd (1). So 0 + 0 and 1 + 1 both give 0, while 0 + 1 and 1 + 0 both give 1. Thanks to this operation, if one drive in the array fails, the parity bits allow the missing data to be rebuilt once the drive is replaced.
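The even/odd rule described above is the exclusive-or (XOR) operation. As a minimal sketch, assuming two equal-length data blocks and a hypothetical helper named `xor_blocks`, parity can be computed and then used to rebuild a lost block like this:

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


block1 = b"RAID"
block2 = b"five"
parity = xor_blocks(block1, block2)  # would be stored on the third drive

# Suppose the drive holding block2 fails: XOR the survivors to rebuild it.
rebuilt = xor_blocks(block1, parity)
print(rebuilt == block2)  # True
```

The same call rebuilds `block1` from `block2` and `parity`, which is why the array can lose any one of the three drives.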

The following is an example of how data is written to a RAID 5 array. Each row in the table represents a physical block on the drives and each column represents a different hard drive. The numbers in the table are the data blocks; a repeated number indicates a duplicated data block. 'P' marks the parity block for the two data blocks in the same row.

|         | Hard drive 1 | Hard drive 2 | Hard drive 3 |
|---------|--------------|--------------|--------------|
| Block 1 | 1            | 2            | P            |
| Block 2 | 3            | P            | 4            |
| Block 3 | P            | 5            | 6            |

Rotating the parity blocks between the drives improves the efficiency and reliability of the array. The array still outperforms a single drive, because several drives can write data faster than one. The data is also fully redundant thanks to the parity blocks: if drive 2 fails, its contents can be rebuilt from the data and parity on the other two drives. The cost is reduced capacity, because some space is taken up by the parity blocks. If n is the number of drives and z is the capacity of each drive, the usable capacity is given by the following formula:

(n-1) * z = Capacity

For example, three 500 GB drives give a total capacity of (3-1) * 500 GB = 1000 GB.

Advantages:

  1. Increased storage performance
  2. Data is fully redundant
  3. 24x7 operation with hot-swap capability

Disadvantages:

  1. High cost
  2. Reduced performance during rebuilds

Software and hardware RAID

To use RAID, either software in the operating system or dedicated hardware is needed to control the flow of data between the computer and the hard drives. This matters most for RAID 5, because the parity calculations require a significant amount of computation.

With software RAID, CPU cycles are used to perform the necessary RAID tasks. The price is low, because all that is needed is the hard drives. The only problem with software RAID is the loss of system performance. In general, the performance hit can be 5% or more, depending on the processor, memory, hard drives and RAID level in use. Many people no longer use software RAID because the price of hardware RAID controllers has fallen in recent years.

Hardware RAID has the advantage of using dedicated circuitry to handle all RAID calculations outside the processor, which gives high storage performance. The problem with hardware RAID is cost. RAID 0 and RAID 1 controllers cost very little, because many motherboard chipsets already have the capability built in. RAID 5 hardware, on the other hand, requires additional circuitry.

Choosing hard drives

Many people are unaware that the performance and capacity of a RAID array depend greatly on the type of hard drive used. For the best results, all the drives in the array should be the same brand and model, with the same capacity and performance. Matching drives is not a strict requirement, but mismatched drives can degrade the array.

The capacity of a RAID array depends on its level. For RAID 0, striping can only use an equal amount of space on each drive, so with an 80 GB and a 100 GB drive the array will hold 160 GB. Similarly, for RAID 1 the drives can only mirror data up to the size of the smallest drive, so the final capacity will be just 80 GB. RAID 5 is a little more complicated and uses the formula (n-1) * z given earlier, with z being the capacity of the smallest drive. With three drives of 80 GB, 100 GB and 120 GB, the capacity will be 160 GB.
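The capacity rules above can be collected into one small function. This is a sketch under the assumption stated in the text, namely that every drive contributes only as much space as the smallest drive in the array; the name `usable_capacity` is hypothetical.

```python
def usable_capacity(level, drive_sizes_gb):
    """Usable capacity in GB, assuming every drive is limited to the
    size of the smallest drive in the array."""
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if level == 0:
        return n * smallest            # striping: no space lost
    if level == 1:
        return smallest                # mirroring: one drive's worth
    if level == 5:
        if n < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (n - 1) * smallest      # one drive's worth goes to parity
    raise ValueError(f"unsupported RAID level: {level}")


print(usable_capacity(0, [80, 100]))        # 160
print(usable_capacity(1, [80, 100]))        # 80
print(usable_capacity(5, [80, 100, 120]))   # 160
print(usable_capacity(5, [500, 500, 500]))  # 1000
```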

The performance of the array also depends on the hard drives. To complete an operation, the controller has to wait for data to be written to every drive before it can move on. In the RAID examples above, this means the controller must wait until block 1 has been written on all the drives in the array before it can continue with the next block. It also means that if one drive has only half the performance of the others, it will slow the other drives down to its speed.
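A very rough way to model this bottleneck is to treat the time for one stripe as the time taken by the slowest drive. The timings below are made-up figures chosen only for illustration, and `stripe_write_time` is a hypothetical name.

```python
def stripe_write_time(block_times_ms):
    """Time to finish one stripe when all drives write in parallel:
    the controller has to wait for the slowest drive."""
    return max(block_times_ms)


matched = [10, 10]     # two identical drives, 10 ms per block
mismatched = [10, 20]  # the second drive is half as fast

print(stripe_write_time(matched))     # 10
print(stripe_write_time(mismatched))  # 20
```

In this simplified model the mismatched array is no faster than two of the slow drives would be, which is the reason for recommending matched drives.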

Conclusion

RAID offers systems a range of benefits depending on the level used. Most consumers will choose RAID 0 to gain speed without losing storage space, mainly because redundancy is not a major concern for the average user. In fact, most computer systems only offer RAID 0 or RAID 1. RAID 0 + 1 and RAID 5 are too expensive for the average consumer and are only found in high-end workstations and server systems.

Mark Kyrnin
