[PVE-User] need Help (Hardware Raid)

Matthew Caron Matt.Caron at sixnet.com
Tue May 24 15:47:42 CEST 2011

On 05/24/2011 09:10 AM, Muhammad Yousuf Khan wrote:
> tell any one :P . have u tested mdam fail over

Yes - it works the same as it does in all the other boxes. If I yank out 
a drive, it just keeps running.

All that said, I *don't* recommend doing software RAID for anything 
other than RAID1. In this respect, I agree with them.

The argument basically boils down like this (and I am paraphrasing from 
forum conversations):

Q: Do you support software RAID?
A: No. Software RAID is bad because if you lose power it can corrupt 
your disks. Get a battery backed hardware RAID controller.
Q: But if you are just doing a RAID1 mirror and lose power, it's the 
same as if you're using a single disk.
A: But if you are doing software RAID5, it's worse.
Q: I'm not talking RAID5, I'm talking RAID1.
A: But if we put in support for RAID1, we'd want to support all 
software RAID levels, otherwise it's a halfway job. So, we're not going 
to support any software RAID.

However, it's still Linux, and it's still Debian, and it's still based 
off the same tools we're used to using, so we can do what we want.

There is even a mostly correct post here:


Here is my (somewhat revised) procedure pulled from my system setup notes:

1. Log in as root (ssh or console, doesn't matter. ssh is likely 
easier). Note that at this point there is only the root account. 
Password is what you set up in the install.

2. Get the tools you need:

apt-get install mdadm initramfs-tools

3. In the mdadm config screen, choose "all" (the default)

4. Edit the initramfs module list in /etc/initramfs-tools/modules and 
add raid1 at the bottom

5. sfdisk the drives. Note that these drives have a funky layout, hence 
the --force parameter

sfdisk -d /dev/sda | sfdisk --force /dev/sdb

6. Create the two metadisks, level 1 (mirrored), 2 devices, first one 
missing. Also bootstraps config.

mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2
mdadm --detail --scan >> /etc/mdadm/mdadm.conf

7. Create a new LVM physical volume, extend the existing volume group 
onto it, move the data over, then remove the old physical volume from 
the volume group. (Note that the move takes a while.)

pvcreate /dev/md1
vgextend pve /dev/md1
pvmove /dev/sda2 /dev/md1
vgreduce pve /dev/sda2

8. Add the partition you just removed from the volume group to the 
metadisk as the missing component and then watch the rebuild until it 
completes

mdadm --add /dev/md1 /dev/sda2
watch -n1 "cat /proc/mdstat"

9. Make a filesystem on the smaller metadisk and copy /boot to it

mkfs.ext3 /dev/md0
mkdir /mnt/md0
mount /dev/md0 /mnt/md0
cp -a /boot/* /mnt/md0/.
umount /mnt/md0
rmdir /mnt/md0
umount /boot

10. edit /etc/fstab and change the line mounting /dev/sda1 as /boot to 
mount /dev/md0 as /boot

11. Remount our boot drive

mount /boot

12. Change the partition id of /dev/sda1 to fd (linux raid autodetect)

sfdisk --change-id /dev/sda 1 fd

13. Add the old /boot drive to the metadisk

mdadm --add /dev/md0 /dev/sda1

The rebuild will be basically instant, so you shouldn't have too much of 
a problem. I'd cat /proc/mdstat just to be sure.

14. Set up grub on the new drive. These are typed at the prompt of the 
grub shell (run grub to get there):

root (hd1,0)
setup (hd1)

15. Update the initial ramdisk to reflect all the changes you made:

update-initramfs -t -u

16. Reboot. If it breaks, you get to keep the pieces.
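
The file edits in steps 4 and 10 and the rebuild check in step 8 could 
also be scripted. What follows is a minimal sketch, not part of the 
original notes: the function names are made up for illustration, the 
fstab helper assumes /boot is listed in the usual whitespace-separated 
form, and each function takes an optional file argument so it can be 
tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helpers for steps 4, 8 and 10. Names are illustrative only.

# Step 4: append a module name to the initramfs module list, once only.
add_initramfs_module() {
    module=$1
    modfile=${2:-/etc/initramfs-tools/modules}
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF "$module" "$modfile"; then
        printf '%s\n' "$module" >> "$modfile"
    fi
}

# Step 8: succeed (exit status 0) while any md array is still rebuilding.
md_rebuilding() {
    grep -Eq '(recovery|resync) *=' "${1:-/proc/mdstat}"
}

# Succeed if any array is missing a member, i.e. its status map
# (for example [_U]) contains an underscore.
md_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}

# Step 10: point the /boot entry of an fstab-style file at a new device.
# Keeps a copy of the original as <file>.bak. Note that awk rewrites the
# changed line with single spaces between fields.
set_boot_device() {
    newdev=$1
    fstab=${2:-/etc/fstab}
    cp -p "$fstab" "$fstab.bak" || return 1
    awk -v dev="$newdev" \
        '$2 == "/boot" && $1 !~ /^#/ { $1 = dev } { print }' \
        "$fstab.bak" > "$fstab"
}
```

On the real system these would be called as add_initramfs_module raid1, 
set_boot_device /dev/md0, and, in place of watching /proc/mdstat by 
hand, something like: while md_rebuilding; do sleep 10; done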

> and proxmox up-gradation?

I've just been doing standard upgrades with:
sudo aptitude update
sudo aptitude full-upgrade

The only bump is that sometimes the initramfs doesn't get set up with 
the correct bits, so you end up having to do:

update-initramfs -k <version> -v -u -t

But this was only a problem in older kernels - I haven't had a problem 

> is it working fine?

It has been working fine for about a year and a half.

> if i go for the software RAID "mdam" how stable it would be

As stable as a single disk. I have no concerns about RAID1 in this 
regard. Writes are simultaneous, reads are round-robin. So, if the power 
gets cut, your disks will be in the same shape as they would have been 
if there was only one - with all the good and bad which that entails.

I have a bit more concern about other RAID levels - since there are 
parity calculations, etc. a battery-backed setup is really the way to go 

> what you say about the performance?

Slightly faster than a single disk on read, the same on write, and low 
overhead. Again, there are no calculations to be done, so there is just 
a bit more additional traffic on the bus. However, the PCIe bus is SO 
much faster than the individual SATA channels that I'm unconcerned. This 
used to be more of an issue back when you could easily saturate a 
straight PCI bus, but bus speeds have so far outstripped the performance 
of rotational media that I wouldn't even consider it an issue unless 
you're using SSDs, and even then you'd have to run the numbers and see.

> are you running this in production?

Depends on your definition. We have it as a process-critical server 
hosting a variety of development, build, and testing machines (nightly builds 
and testing, etc.). That said, it is not customer facing - it is 
internal to engineering. So, folks would notice if it blew up, and it 
would make life difficult, but if it took me 48 hours to get it back, I 

As an aside, it's a fairly small rig - Core i7 @ 2.93GHz, 12GB RAM, 
mirrored 500GB HDDs. There are 8 machines provisioned, with 5 active. 
Most of them are only allocated 1GB of memory, one has 2GB, and the 
aggregate Physical memory usage is currently 5.41GB used. CPU Usage is 
about 3%, with a load of about 0.3.

So, the machine is bored most of the time.

> please share some of your experience regarding mdam in proxmox as it is not
> recommended.

Yes, and in some cases, they are correct, and I have no argument. In 
other cases, I respectfully disagree. This is one of those cases.

> i am sending this email only to you because PVE community might
> get angry if we try to legalize  some thing they already declared
> as unsuitable strategy for proxmox :)

I appreciate that, but I'm putting it back on-list, because I think 
there are more folks out there doing this than many realize. I think 
there is a valid use case here, and the project maintainers should be 
aware of it.

However, they may also have a different opinion, and you're better off 
getting both sides of the story. Someone could call me a fool and cite 
some data of which I was not aware, which could help prevent you from 
making a mistake.

Besides - we are part of the community.
Matthew Caron
Build Engineer
Sixnet | www.sixnet.com
O +1 518 877 5173 Ext. 138
F +1 518 602 9209
matt.caron at sixnet.com
