[PVE-User] shared LVM on host-based mirrored iSCSI LUNs
Stefan Sänger
stsaenger at googlemail.com
Wed Apr 25 17:52:59 CEST 2012
Hi,
I did some more research and here is what I found out...
On 23.04.2012 18:10, Flavio Stanchina wrote:
>
> Not safe, as far as I know.
That was my first guess right away. My iSCSI setup is just a test
environment to see what is possible. Unfortunately there is also a
production system that uses FC with two FC SAN boxes, and host-based
mirroring is supposed to be implemented there.
BTW: That hardware was not my decision, and right now I am basically
trying to figure out what will be the best way to go on with this problem...
> It would be just like using a
> non-distributed filesystem such as extX on shared storage: md is not
> meant to be used in this way, there is no locking between multiple
> nodes.
Well - it is not really like using a file system on shared storage.
As long as everything is working and the RAID was synced once, it is not
really a problem to connect the RAID-LUNs to another host. The other
host will only discover a clean RAID, will find the lvm information and
go along with that.
The interesting part is that LVM in fact is kind of a locking mechanism
here: every logical volume can only be used by a single VM, and that
single VM can only run on one host at a time. So there is a clear
mapping of physical extents to virtual machines and hence there is no
data corruption as every host system is only writing to extents it is
allowed to.
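A minimal sketch of that argument: if every logical volume owns a disjoint range of physical extents and is active on exactly one host, no two hosts ever write to the same extent. The LV names, extent ranges and node names below are made up for illustration and do not come from a real volume group.

```python
def overlaps(a, b):
    """True if two half-open extent ranges (start, end) share an extent."""
    return a[0] < b[1] and b[0] < a[1]

# Hypothetical allocation: LV name -> (first PE, one past the last PE)
extents = {"vm-101-disk-1": (0, 2560), "vm-102-disk-1": (2560, 5120)}
# Hypothetical placement: each LV is active on exactly one host
active_on = {"vm-101-disk-1": "node1", "vm-102-disk-1": "node2"}

def conflicting_pairs(extents, active_on):
    """Pairs of LVs that live on different hosts but share extents."""
    names = sorted(extents)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if active_on[a] != active_on[b] and overlaps(extents[a], extents[b])
    ]

conflicts = conflicting_pairs(extents, active_on)
print(conflicts)
```

This only models the data extents. It says nothing about the RAID metadata underneath, which is where the problem described next comes from.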
But in case of a failover when one of the hosts goes down, the other
hosts are not aware of the RAID-state, since every host keeps its own
RAID-metadata.
And a write command issued by a VM to the logical volume will mean that
md has to issue two write commands - one to each LUN. Since there is no
communication about the RAID state between hosts, there is no way to
even get a consistent state.
What is more, reading from a clean, synced RAID1 is supposed to be done
round-robin just like RAID0 - without checking the mirrored block.
So if something has been written to only one RAID member it is a
coin-flip if you will read that or not. And that means that even if the
fsck of the VM will think everything is fine it is not :(
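The failure mode can be illustrated with a toy model. This is only a sketch: `SharedMirror` and `HostMd` are hypothetical classes, not md itself, and the strictly alternating read is the simplifying assumption made above, not a statement about how md actually balances reads.

```python
import itertools

class SharedMirror:
    """Two LUNs that every host can reach; stands in for the mirrored pair."""
    def __init__(self, blocks):
        self.luns = [[0] * blocks, [0] * blocks]

class HostMd:
    """One host's private view of the RAID1 (toy model, no shared state)."""
    def __init__(self, mirror):
        self.mirror = mirror
        self._next = itertools.cycle([0, 1])  # assumed round-robin reads

    def write(self, block, value, crash_after_first=False):
        self.mirror.luns[0][block] = value
        if crash_after_first:
            return  # host dies before the second write is issued
        self.mirror.luns[1][block] = value

    def read(self, block):
        # No comparison with the other mirror half is done on read
        return self.mirror.luns[next(self._next)][block]

mirror = SharedMirror(blocks=4)
host_a = HostMd(mirror)
host_a.write(2, 111)                          # both halves updated
host_a.write(2, 222, crash_after_first=True)  # only the first half updated

host_b = HostMd(mirror)  # failover: knows nothing about host_a's RAID state
reads = [host_b.read(2) for _ in range(4)]
print(reads)
```

In this model the second host alternates between the old and the new value of the same block, and nothing tells it that the two halves have diverged.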
> While I can't think of a sure way to break it, I wouldn't feel
> safe to use it in production.
Well, I think I have described a plausible way for it to break.
And that leads me to the next question:
Instead of using RAID to do the mirroring, LVM should be able to take
care of this. I will do some tests, but maybe you guys around here have
a good idea about it.
So my next test will be:
- deleting the RAID
- disconnecting the iSCSI-Targets from all nodes but one
- creating single physical volumes on each LUN
- creating the volume group using -cy (--clustered=yes) with both LUNs
- probably the tricky point: creating the logical volume manually
using lvcreate -m 1
- adding that volume to the virtual machine
I am not sure about some lvcreate options like --mirrorlog yet, and not
sure if it will work anyway. But I think I should give it a try...
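The test plan above could be written down as a dry run that only prints the commands. This is a sketch under assumptions: the device paths, VG name, LV name and size are placeholders, and `core` for `--mirrorlog` is just one possible value, since the right choice is exactly what is still open. Disconnecting the iSCSI targets and attaching the volume to the VM are left as manual steps.

```python
import shlex

def build_plan(md_dev, luns, vg, lv, size, mirrorlog=None):
    """Return the test plan as a list of argument lists (nothing is executed)."""
    plan = [["mdadm", "--stop", md_dev]]
    # Destructive: wipes the md metadata from both former RAID members
    plan += [["mdadm", "--zero-superblock", lun] for lun in luns]
    # (manual step: disconnect the iSCSI targets from all nodes but one)
    plan += [["pvcreate", lun] for lun in luns]
    plan.append(["vgcreate", "--clustered", "y", vg, *luns])
    lvcreate = ["lvcreate", "-m", "1"]
    if mirrorlog:
        lvcreate += ["--mirrorlog", mirrorlog]
    lvcreate += ["-L", size, "-n", lv, vg]
    plan.append(lvcreate)
    # (manual step: add the new logical volume to the virtual machine)
    return plan

# All names below are placeholders, not taken from a real setup
plan = build_plan("/dev/md0", ["/dev/sdb", "/dev/sdc"],
                  "vg_mirror", "vm-100-disk-1", "32G", mirrorlog="core")
for cmd in plan:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

As far as I know, the default on-disk mirror log wants space on a device other than the two mirror legs, so with only two LUNs the `--mirrorlog` setting will probably matter. Whether a mirrored LV in a clustered VG behaves well here is what the test has to show.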
> Use DRBD between the two NAS boxes -- or whatever kind of realtime
> mirroring OpenFiler has to offer -- to mirror the disks, then use
> multipath to expose both ends to the VM hosts.
As mentioned above, this is basically some research on how to implement
host-based mirroring. I did not come up with this requirement, but since
I have been using Proxmox VE for some time now I would prefer using it here as well.
Stefan