I’ve been using my two 500G drives in a mixed RAID1/RAID5 array. The drives are partitioned in three — 100M, 10G, and 460G partitions. The first partition of every drive is in a RAID1 — this is my boot partition. The 10G partitions are also in a RAID1 (the root partition). And the big partitions are assembled in RAID5. All is fine, and testing each RAID device with hdparm revealed reading performance equal to that of a single hard disk — about 60MB/s.

Today I added another disk. Same partitioning — I grew the raid5 array to become double the size and I also added another device to the raid1 arrays so that there are three mirrors. Curious as I am, I again tested the RAID devices with hdparm. The results were intriguing — the RAID 5 array read at 120MB/s while the RAID 1 arrays were still stuck at 60MB/s. It is also interesting that when there were only two drives in the RAID 5 array hdparm -t reported 60 MB/s.
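
The doubling follows from RAID 5 spending one member's worth of space on parity. Here is a rough sketch of that arithmetic, assuming equal-sized members and ignoring metadata overhead. The function name `raid5_usable` is made up for illustration.

```shell
# Usable size of a RAID 5 array: one member's worth of space goes to parity.
# Assumes equal-sized members and ignores md superblock overhead.
raid5_usable() {
    members=$1      # number of drives in the array
    size_gb=$2      # size of each member partition, in GB
    echo $(( (members - 1) * size_gb ))
}

raid5_usable 2 460   # two 460G partitions   -> 460
raid5_usable 3 460   # after the third drive -> 920
```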

I figure that the 64k chunks on the RAID5 array force reading from different drives, while when the kernel is allowed to choose where to read from in the RAID 1, it simply goes for a single non-busy drive. Interleaving the reads on RAID 1 would have been nice, but I guess I’ll figure it out some other time.
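
To see why consecutive chunks end up on different drives, here is a small sketch of the chunk-to-disk mapping. It assumes md's left-symmetric layout (reported as "algorithm 2" in /proc/mdstat) and numbers the disks from 0. The helper name `chunk_disk` is hypothetical.

```shell
# Which member disk holds a given data chunk, assuming md's left-symmetric
# RAID 5 layout ("algorithm 2" in /proc/mdstat). Disks are numbered from 0.
chunk_disk() {
    chunk=$1                              # index of the 64k data chunk
    disks=$2                              # number of member disks
    data_per_stripe=$(( disks - 1 ))      # one chunk per stripe is parity
    stripe=$(( chunk / data_per_stripe ))
    slot=$(( chunk % data_per_stripe ))   # position among the stripe's data chunks
    parity=$(( data_per_stripe - stripe % disks ))
    echo $(( (parity + 1 + slot) % disks ))
}

# Walk the first six data chunks of a three-disk array.
for c in 0 1 2 3 4 5; do
    printf 'chunk %d -> disk %d\n' "$c" "$(chunk_disk "$c" 3)"
done
```

Under that assumption the six chunks land on disks 0, 1, 2, 0, 1, 2, so a sequential read keeps all three drives busy at once.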

[Figure: Reading performance of hard disks and RAID arrays]

This figure illustrates the read performance when reading from the first, second and third hard disk and the three RAID arrays in turn. When reading from md1 (the root partition, a RAID1 array), the reads switch sequentially between drives: it starts reading from sdb and then switches to sda. When reading from md2 (the RAID5 array), however, we see concurrent reading at a constant speed from all drives.

I’m building a new server, but since I am a little short on cash I could afford only two 500GB Western Digital hard drives. The logical solution would be to put them in a RAID-1. RAID-0 offers no redundancy in case of a crash, and RAID-5 is not possible if you believe what everyone is saying, including this Wikipedia article.

The truth is that RAID-5 requires only two drives to operate properly. RAID-5 works by storing the parity of all data blocks in a stripe in a parity block on the same stripe. With only two disks, each stripe holds one data block and one parity block, and the parity of a single block is simply a copy of that block. So the data on one disk ends up duplicated on the other, with the data and parity roles alternating between the disks as per the specification. In other words, by joining my two disks in a RAID-5 I’ll get almost the equivalent of a RAID-1. And if you don’t believe me, here is how to try it for yourselves.
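
The checksumming block described above is, in practice, the XOR parity of the stripe's data blocks. Here is a minimal sketch in which single bytes stand in for whole blocks. The values are arbitrary and the function name `parity` is only illustrative.

```shell
# RAID-5 parity is the XOR of the data blocks in a stripe.
# Bytes stand in for whole blocks here; the values are arbitrary.
parity() {
    p=0
    for block in "$@"; do
        p=$(( p ^ block ))
    done
    echo "$p"
}

parity 165        # two-disk array: one data block per stripe   -> 165
parity 165 60     # three-disk array: two data blocks per stripe -> 153
parity 153 60     # rebuild a lost block from parity + survivor  -> 165
```

With a single data block the parity equals the data, which is why the two-disk case behaves like a mirror.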

First, create two empty files of about 100MB in size:
dd if=/dev/zero bs=100M count=1 of=diskA
dd if=/dev/zero bs=100M count=1 of=diskB

Bind them to a loopback device:
losetup /dev/loop/0 $PWD/diskA
losetup /dev/loop/1 $PWD/diskB

Create a RAID-5 on the two loop devices:
mdadm -C -n 2 -l 5 -a yes /dev/md/0 /dev/loop/[01]

Check the progress, it should be recovering one disk… wait until it is complete (or not, your choice).
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 loop1[1] loop0[0]
102336 blocks level 5, 64k chunk, algorithm 2 [2/2] [UU]
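
If you would rather not eyeball the output, the status field can be checked with a small helper. This is only a sketch: `array_healthy` is a made-up name, it reads mdstat-style text on stdin, and it looks only at the first array it finds.

```shell
# Report whether every member of an md array is up, given mdstat-style text
# on stdin. A healthy status looks like [UU]; a failed member shows as "_".
# Only the first status field found is examined.
array_healthy() {
    status=$(grep -o '\[[U_][U_]*\]' | head -n 1)
    case "$status" in
        "")   echo "no status found"; return 2 ;;
        *_*)  echo "degraded $status"; return 1 ;;
        *)    echo "healthy $status" ;;
    esac
}

# On a live system: array_healthy < /proc/mdstat
printf '102336 blocks level 5, 64k chunk, algorithm 2 [2/2] [UU]\n' | array_healthy
```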

Now make a filesystem and copy some files over:
mkreiserfs /dev/md/0
mount /dev/md/0 /mnt
rsync -avP /usr/portage/distfiles/ /mnt/

The rsync command will fail if you are copying more than 100MB but don’t sweat it too much. Get the md5 checksums of the original and the copied files:
cd /mnt; md5sum * > ~/md5sum-copied
cd /usr/portage/distfiles; ls /mnt | xargs md5sum > ~/md5sum-original

Except for the last file, the checksums should match. Now go ahead and fail a drive:
mdadm --fail /dev/md/0 /dev/loop/0

Repeat the md5summing process. Did the checksums match? They did. Try remounting the file system; it will still work.
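
Comparing the two checksum lists by hand gets tedious, so here is one possible helper. It is a sketch that assumes both files are in md5sum's "hash  filename" format. `compare_sums` is a hypothetical name, and it relies on `mktemp` being available.

```shell
# Compare two md5sum listings by file name, ignoring line order.
# Prints the lines that differ; returns 0 when the listings agree.
compare_sums() {
    a=$(mktemp) && b=$(mktemp) || return 2
    sort -k 2 "$1" > "$a"
    sort -k 2 "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return "$rc"
}

# Example: compare_sums ~/md5sum-original ~/md5sum-copied
```

Any file that rsync could not copy completely will show up as a differing line.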

But why would one want to go to all the trouble of using RAID-5 on two disks if it offers no advantages over RAID-1? Well, with recent kernels it is possible to grow existing RAID-5 arrays by adding new disks. I am planning on getting at least one more hard drive, possibly two. I could subsequently add them to the current two-disk setup and get a real RAID-5 with three or four disks. Here, try it out for yourselves:

dd if=/dev/zero of=disk2 bs=100M count=1
losetup /dev/loop/2 $PWD/disk2
mdadm --add /dev/md/0 /dev/loop/2
mdadm --grow -n 3 /dev/md/0

Now check /proc/mdstat for the reshaping process. Eventually it will finish and all you have to do then is resize your file systems and maybe partitions if you created partitionable arrays, and you’re done.

From an expandability point of view, RAID-5 on two disks does make a lot of sense.