RAID5 on two disks only

I'm building a new server, but since I am a little short on cash I could only afford two 500GB Western Digital hard drives. The logical solution would be to put them in a RAID-1: RAID-0 offers no redundancy in case of a crash, and RAID-5 is not possible if you believe what everyone is saying, including this Wikipedia article.

The truth is that RAID-5 requires only two drives to operate properly. For every stripe, RAID-5 computes a parity block (the XOR of all data blocks in that stripe) and stores it on the same stripe, rotating which disk holds the parity from one stripe to the next. With only two disks, each stripe holds one data block and one parity block, and the parity of a single block is just a copy of that block. So the data on one disk ends up duplicated on the other, with the two roles alternating as per the specification. In other words, by joining my two disks in a RAID-5 I'll get almost the equivalent of a RAID-1. And if you don't believe me, here is how to try it for yourselves.
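To see why the two-disk case collapses into a mirror, here is a toy sketch of the arithmetic. It assumes the redundancy block is a plain XOR parity and shrinks every "block" to a single byte; the parity function and the sample values are made up for illustration and have nothing to do with how the md driver is actually written.

```shell
# Toy model: each "block" is one byte, parity is the XOR of the data blocks.
parity() {
    local p=0 b
    for b in "$@"; do
        p=$(( p ^ b ))
    done
    echo "$p"
}

# Two-disk stripe: one data block plus one parity block.
d0=$(( 0xA7 ))
p2=$(parity "$d0")
printf 'two disks:   data=0x%02X parity=0x%02X\n' "$d0" "$p2"

# Three-disk stripe: two data blocks plus one parity block.
d1=$(( 0x3C ))
p3=$(parity "$d0" "$d1")

# Pretend the disk holding d0 died and rebuild it from the survivors.
rebuilt=$(parity "$d1" "$p3")
printf 'three disks: parity=0x%02X rebuilt d0=0x%02X\n' "$p3" "$rebuilt"
```

With a single data block the parity comes out identical to the data, which is the mirror. With two data blocks the parity is a genuinely different value, yet XORing the survivors still brings the lost block back.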

First, create two empty files of about 100MB in size:
dd if=/dev/zero bs=100M count=1 of=diskA
dd if=/dev/zero bs=100M count=1 of=diskB


Bind them to a loopback device:
losetup /dev/loop/0 $PWD/diskA
losetup /dev/loop/1 $PWD/diskB


Create a RAID-5 on the two loop devices:
mdadm -C -n 2 -l 5 -a yes /dev/md/0 /dev/loop/[01]


Check the progress; it should be recovering one disk. Wait until it is complete (or not, your choice):
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 loop1[1] loop0[0]
      102336 blocks level 5, 64k chunk, algorithm 2 [2/2] [UU]
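If you would rather script the waiting than keep re-reading the file, something along these lines might do. It is only a sketch: md_status and md_is_healthy are helper names I made up, and they assume the status line looks like the output above, with one U per working member and an underscore for a missing or rebuilding one.

```shell
# Print the member-status field (e.g. "[UU]" or "[U_]") for one array.
# $1: array name as shown in mdstat (e.g. md0)
# $2: file to read, defaults to /proc/mdstat
md_status() {
    awk -v dev="$1" '
        $1 == dev && $2 == ":" { found = 1; next }   # the "md0 : active ..." line
        found && $2 == ":"     { exit }              # reached the next array, give up
        found && match($0, /\[[U_]+\]/) {
            print substr($0, RSTART, RLENGTH)
            exit
        }
    ' "${2:-/proc/mdstat}"
}

# Succeed only if the array was found and no member is marked "_".
md_is_healthy() {
    case "$(md_status "$@")" in
        ""|*_*) return 1 ;;
        *)      return 0 ;;
    esac
}

# Example (commented out, since it loops until the array is healthy):
# until md_is_healthy md0; do sleep 5; done
```

mdadm also has a --wait option that blocks until any resync, recovery or reshape on the given array has finished, which is probably the simpler route if all you want is to pause a script.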


Now make a filesystem and copy some files over:
mkreiserfs /dev/md/0
mount /dev/md/0 /mnt
rsync -avP /usr/portage/distfiles/ /mnt/


The rsync command will fail if you are copying more than 100MB but don't sweat it too much. Get the md5 checksums of the original and the copied files:
cd /mnt; md5sum * > ~/md5sum-copied
cd /usr/portage/distfiles; ls /mnt | xargs md5sum > ~/md5sum-original


Except for the last file, the checksums should match. Now go ahead and fail a drive:
mdadm --fail /dev/md/0 /dev/loop/0


Repeat the md5summing process. Did it match? It did. Try unmounting and remounting the filesystem; it will still work.
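A shortcut for the repeat check is to let md5sum verify the files against the list saved earlier instead of generating a new list and comparing by eye. The sketch below assumes GNU md5sum (for the --quiet option); verify_against is just a name I picked for the wrapper.

```shell
# Re-check the files in a directory against a previously saved md5sum list.
# $1: absolute path to the checksum list (e.g. ~/md5sum-copied)
# $2: directory the list was generated in (e.g. /mnt)
# Prints only the files that fail; the exit status is non-zero on any mismatch.
verify_against() {
    ( cd "$2" && md5sum -c --quiet "$1" )
}

# Example, after failing a drive:
# verify_against ~/md5sum-copied /mnt && echo "degraded array still returns the same data"
```

Since ~/md5sum-copied was generated from /mnt itself, even the truncated last file should verify here. The point of this check is that the degraded array hands back exactly what it held before the drive was failed.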

But why would one want to go to all the trouble of using RAID-5 on two disks if it offers no advantages over RAID-1? Well, with recent kernels it is possible to grow existing RAID-5 arrays by adding new disks. I am planning on getting at least one more hard drive, possibly two. I could subsequently add them to the current two-disk setup and get a real RAID-5 with three or four disks. Here, try it out for yourselves:

dd if=/dev/zero of=disk2 bs=100M count=1
losetup /dev/loop/2 $PWD/disk2
mdadm --add /dev/md/0 /dev/loop/2
mdadm --grow -n 3 /dev/md/0


Now check /proc/mdstat for the reshaping process. Eventually it will finish and all you have to do then is resize your file systems and maybe partitions if you created partitionable arrays, and you're done.

From an expandability point of view, RAID-5 on two disks does make a lot of sense.
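To put rough numbers on that, RAID-5 gives you the capacity of all members minus one disk's worth of parity. The little calculation below is a back-of-the-envelope sketch for equally sized disks; raid5_usable_gb is a made-up helper, and it ignores the small amount of space md reserves for its own metadata.

```shell
# Usable capacity of a RAID-5 built from equally sized disks.
# $1: number of member disks, $2: size of each disk in GB
raid5_usable_gb() {
    if [ "$1" -lt 2 ]; then
        echo "RAID-5 needs at least two disks" >&2
        return 1
    fi
    # One disk's worth of space goes to parity, the rest holds data.
    echo $(( ($1 - 1) * $2 ))
}

for n in 2 3 4; do
    printf '%d x 500GB -> %dGB usable\n' "$n" "$(raid5_usable_gb "$n" 500)"
done
```

So my two 500GB drives give the same 500GB a RAID-1 would, and each drive added later contributes its full capacity to the array.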

Comments

  1. The Wikipedia article was corrected soon after I posted this article. The article that I was referring to looked like this.

  2. As the linked article below shows, RAID5 on two disks is exactly the same as RAID1.

    http://www.n8gray.org/blog/2006/09/05/stupid-raid-tricks-with-evms-and-mdadm/

  3. RAID5 on two disks is NOT the same as RAID1 on two disks. Functionally, yes, but there are performance differences, somewhat dependent on implementation. On two disks, RAID5 can never be faster than RAID1 and is generally slower, due to computational overhead if nothing else.

  4. @Anonymous: I agree with what you're saying and I would expect RAID5 to be slower when writing, though I would also expect the difference to be hardly noticeable.

    The case with reading is quite different and RAID5 is much faster.
