[PVE-User] need Help (Hardware Raid)
Matt.Caron at sixnet.com
Tue May 24 15:47:42 CEST 2011
On 05/24/2011 09:10 AM, Muhammad Yousuf Khan wrote:
> tell any one :P . have u tested mdam fail over
Yes - it works the same as it does in all the other boxes. If I yank out
a drive, it just keeps running.
All that said, I *don't* recommend doing software RAID for anything
other than RAID1. In this respect, I agree with them.
The argument basically boils down like this (and I am paraphrasing from
Q: Do you support software RAID?
A: No. Software RAID is bad because if you lose power it can corrupt
your disks. Get a battery backed hardware RAID controller.
Q: But if you are just doing a RAID1 mirror and lose power, it's the
same as if you're using a single disk.
A: But if you are doing software RAID5, it's worse.
Q: I'm not talking RAID5, I'm talking RAID1.
A: But if we put in support for RAID1, we'd want to support all
software RAID levels, otherwise it's a halfway job. So, we're not going
to support any software RAID.
However, it's still Linux, and it's still Debian, and it's still based
off the same tools we're used to using, so we can do what we want.
There is even a mostly correct post here:
Here is my (somewhat revised) procedure pulled from my system setup notes:
1. Log in as root (ssh or console, doesn't matter. ssh is likely
easier). Note that at this point there is only the root account.
Password is what you set up in the install.
2. Get the tools you need:
apt-get install mdadm initramfs-tools
3. In the mdadm config screen, choose "all" (the default)
4. Edit the initramfs module list in /etc/initramfs-tools/modules and
add raid1 at the bottom
5. sfdisk the drives. Note that these drives have a funky layout, hence
the --force parameter
sfdisk -d /dev/sda | sfdisk --force /dev/sdb
6. Create the two metadisks, level 1 (mirrored), 2 devices, first one
missing. Also bootstraps config.
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
7. Create a new LVM physical volume on the metadisk, extend the existing
volume group onto it, move the data over, then remove the old partition
from the volume group. (Note that the move takes a while.)
pvcreate /dev/md1
vgextend pve /dev/md1
pvmove /dev/sda2 /dev/md1
vgreduce pve /dev/sda2
8. Add the partition you just removed from the volume group to the
metadisk as the missing component, then watch the rebuild until it completes
mdadm --add /dev/md1 /dev/sda2
watch -n1 "cat /proc/mdstat"
9. Make a filesystem on the smaller metadisk and copy /boot to it
mount /dev/md0 /mnt/md0
cp -a /boot/* /mnt/md0/.
10. Edit /etc/fstab and change the line mounting /dev/sda1 as /boot to
mount /dev/md0 as /boot
11. Remount our boot drive
12. Change the partition id of /dev/sda1 to fd (linux raid autodetect)
sfdisk --change-id /dev/sda 1 fd
13. Add the old /boot drive to the metadisk
mdadm --add /dev/md0 /dev/sda1
The rebuild will be basically instant, so you shouldn't have too much of
a problem. I'd cat /proc/mdstat just to be sure.
14. Set up grub on the new drive
15. Update the initial ramdisk to reflect all the changes you made:
update-initramfs -t -u
16. Reboot. If it breaks, you get to keep the pieces.
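Before the reboot in step 16, it can be worth confirming that neither metadisk is still degraded or rebuilding. The following is a minimal sketch, not taken from the notes above. The function name degraded_arrays is made up for illustration, and it assumes the usual /proc/mdstat layout, where an array line such as "md0 : active raid1 ..." is followed by a status line containing a member map like [UU], with an underscore marking a missing member.

```shell
# Print the name of every md array whose member map (e.g. [_U]) contains
# an underscore, i.e. an array that is degraded or still rebuilding.
# Reads /proc/mdstat-style text from stdin.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # The status line carries the member map, e.g. [UU] or [_U]
        /\[[U_]+\]/ && name != "" {
            if (match($0, /\[[U_]+\]/) && index(substr($0, RSTART, RLENGTH), "_"))
                print name
            name = ""
        }
    '
}
```

Typical use would be "degraded_arrays < /proc/mdstat"; empty output means every array reports all members present.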
> and proxmox up-gradation?
I've just been doing standard upgrades with:
sudo aptitude update
sudo aptitude full-upgrade
The only bump is that the initramfs sometimes doesn't get set up with
the correct bits, so you end up having to do:
update-initramfs -k <version> -v -u -t
But this was only a problem in older kernels - I haven't had a problem
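One way to tell whether a rebuilt initramfs picked up the RAID bits is to list its contents and look for the raid1 module. This is a small sketch under assumptions: it presumes lsinitramfs from initramfs-tools is available on your version, and the helper name has_raid1_module is hypothetical. The helper only scans a file listing on stdin.

```shell
# Succeeds (exit status 0) if the file listing on stdin contains the
# raid1 kernel module, compressed or not (raid1.ko, raid1.ko.xz, ...).
has_raid1_module() {
    grep -q '/raid1\.ko'
}

# Example use against the running kernel's initramfs:
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | has_raid1_module \
#       && echo "raid1 present" || echo "raid1 missing"
```

If the module is missing, rerunning the update-initramfs command above for that kernel version is the obvious next step.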
> is it working fine?
It has been working fine for about a year and a half.
> if i go for the software RAID "mdam" how stable it would be
As stable as a single disk. I have no concerns about RAID1 in this
regard. Writes are simultaneous, reads are round-robin. So, if the power
gets cut, your disks will be in the same shape as they would have been
if there was only one - with all the good and bad which that entails.
I have a bit more concern about other RAID levels - since there are
parity calculations, etc., a battery-backed setup is really the way to go.
> what you say about the performance?
Slightly faster than a single disk on read, the same on write, and low
overhead. Again, there are no calculations to be done, so there is just
a bit more additional traffic on the bus. However, the PCIe bus is SO
much faster than the individual SATA channels that I'm unconcerned. This
used to be more of an issue back when you could easily saturate a
straight PCI bus, but bus speeds have so far outstripped the performance
of rotational media that I wouldn't even consider it an issue unless
you're using SSDs, and even then you'd have to run the
numbers and see.
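As a rough way to run those numbers: a RAID1 write sends each block to both disks, so the bus carries about twice the single-disk write rate. The sketch below is back-of-the-envelope only. The function name and the 100 MB/s per-disk figure are assumptions for illustration; 133 MB/s is the theoretical peak of a 32-bit/33 MHz PCI bus, and 500 MB/s is the nominal per-lane, per-direction rate of PCIe 2.0.

```shell
# Percentage of a bus consumed by mirrored writes (integer arithmetic).
# $1 = per-disk write rate in MB/s (assumed figure)
# $2 = number of mirrors
# $3 = bus rate in MB/s
mirror_bus_load_pct() {
    echo $(( $1 * $2 * 100 / $3 ))
}

mirror_bus_load_pct 100 2 133   # classic shared PCI bus: prints 150
mirror_bus_load_pct 100 2 500   # a single PCIe 2.0 lane: prints 40
```

Anything over 100 means the bus, not the disks, would be the bottleneck. Substitute measured throughput for your own drives before drawing conclusions.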
> are you running this in production?
Depends on your definition. We have it as a process-critical server hosting
a variety of development, build, and testing machines (nightly builds
and testing, etc.). That said, it is not customer facing - it is
internal to engineering. So, folks would notice if it blew up, and it
would make life difficult, but if it took me 48 hours to get it back, I
As an aside, it's a fairly small rig - Core i7 @ 2.93GHz, 12GB RAM,
mirrored 500GB HDDs. There are 8 machines provisioned, with 5 active.
Most of them are only allocated 1GB of memory, one has 2GB, and the
aggregate physical memory usage is currently 5.41GB used. CPU usage is
about 3%, with a load of about 0.3.
So, the machine is bored most of the time.
> please share some of your experience regarding mdam in proxmox as it is not
Yes, and in some cases, they are correct, and I have no argument. In
other cases, I respectfully disagree. This is one of those cases.
> i am sending this email only to you because PVE community might
> get angry if we try to legalize some thing they already declared
> as unsuitable strategy for proxmox :)
I appreciate that, but I'm putting it back on-list, because I think
there are more folks out there doing this than many realize. I think
there is a valid use case here, and the project maintainers should be
aware of it.
However, they may also have a different opinion, and you're better off
getting both sides of the story. Someone could call me a fool and cite
some data of which I was not aware, which could help prevent you from
making a mistake.
Besides - we are part of the community.
Sixnet | www.sixnet.com
O +1 518 877 5173 Ext. 138
F +1 518 602 9209
matt.caron at sixnet.com