[pve-devel] [PATCH storage 0/2] Fix #2046 and disksize-mismatch with shared LVM

Stoiko Ivanov s.ivanov at proxmox.com
Fri Jan 4 16:44:06 CET 2019


On Fri, 4 Jan 2019 16:06:49 +0100
Fabian Grünbichler <f.gruenbichler at proxmox.com> wrote:

> On Fri, Jan 04, 2019 at 02:41:00PM +0100, Stoiko Ivanov wrote:
> > On Fri, 4 Jan 2019 14:12:23 +0100
> > Alwin Antreich <a.antreich at proxmox.com> wrote:
> >   
> > > On Fri, Jan 04, 2019 at 02:06:23PM +0100, Stoiko Ivanov wrote:  
> > > > The issue was observed recently and can lead to data loss.
> > > > When using a shared LVM storage (e.g. over iSCSI) in a
> > > > clustered setup, only the node where a guest is active notices
> > > > the size change upon disk-resize (lvextend/lvreduce).
> > > > 
> > > > LVM's metadata gets updated on all nodes eventually (at the
> > > > latest when pvestatd runs and lists all LVM volumes, since
> > > > lvs/vgs re-read the metadata). However, the device files
> > > > (/dev/$vg/$lv) on the nodes where the guest is not actively
> > > > running do not pick up the size change.
> > > > 
> > > > Steps to reproduce an I/O error:
> > > > * create a qemu-guest with a disk backed by a shared LVM storage
> > > > * create a filesystem on that disk and fill it to 100%
> > > > * resize the disk/filesystem
> > > > * put some more data on the filesystem
> > > > * migrate the guest to another node
> > > > * try reading past the initial disksize
> > > > 
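For illustration, the mismatch can be seen directly on a node where the
guest is not running (the VG/LV names below are just examples):

    # after the resize, on a node where the guest is NOT running
    # (but where the LV is active, e.g. after a fresh boot):
    lvs --noheadings --units b -o lv_size shared-vg/vm-100-disk-0
    # -> shows the new size (read from the updated LVM metadata)
    blockdev --getsize64 /dev/shared-vg/vm-100-disk-0
    # -> still shows the old size (stale device-mapper table)
    lvchange --refresh shared-vg/vm-100-disk-0
    # -> reloads the device-mapper table; both sizes agree again
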
> > > > The second patch addresses the size mismatch by running
> > > > `lvchange --refresh` whenever we activate an LVM volume; this
> > > > should fix the critical issue.
> > > > 
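Roughly the shell equivalent of what the activation path does with the
second patch applied (not the actual Perl code in LVMPlugin.pm; VG/LV
names are examples again):

    lvchange -ay shared-vg/vm-100-disk-0        # activate as before
    lvchange --refresh shared-vg/vm-100-disk-0  # reload the device-mapper
                                                # table to match the metadata
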
> > > > The first patch introduces a direct implementation of
> > > > volume_size_info in LVMPlugin.pm, reading the volume size via
> > > > `lvs` instead of falling back to `qemu-img info` from
> > > > Plugin.pm. While this should always yield the same result on
> > > > the node where a guest is currently running once the second
> > > > patch is applied, there might still be a mismatch when the LV
> > > > is active on a node (e.g. after a fresh boot) and gets resized
> > > > on another node.
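
The idea is to ask LVM itself for the size, roughly like this (the
exact options used in the patch may differ):

    # read the LV size in bytes from the LVM metadata instead of
    # running `qemu-img info` against the (possibly stale) device file:
    lvs --noheadings --units b --nosuffix -o lv_size shared-vg/vm-100-disk-0
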
> > > I faintly recall that there was an offline discussion about
> > > changing the activation of LVs, especially for booting -
> > > something along the lines of 'we activate LVs only if we need
> > > them on the specific node'.
> > 
> > Could make sense in general. However, the volumes might still get
> > activated by some external invocation (e.g. running `vgchange
> > -ay`) - in that case the refresh remains necessary.
> 
> specifically, we discussed setting the 'k' attribute
> ('activationskip') on PVE-managed LVs - these will then be ignored
> by the usual activation on boot, since that does not pass
> '--ignoreactivationskip' / '-K', for obvious reasons ;)
That sounds like a good idea (yet another new thing learned about
LVM)!
(We should just keep in mind to update the docs on how to get to the
image of a stopped VM.)
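
For the docs, manually getting at the image of a stopped VM would then
look roughly like this (VG/LV names are just examples):

    # mark a PVE-managed LV so the usual (boot-time) activation skips it:
    lvchange --setactivationskip y shared-vg/vm-100-disk-0
    # plain `vgchange -ay` / `lvchange -ay` now leave it inactive;
    # to access the image of a stopped VM, activate it explicitly:
    lvchange -ay -K shared-vg/vm-100-disk-0
    # ... and deactivate it again when done:
    lvchange -an shared-vg/vm-100-disk-0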

> 
> having the refresh here is a good idea anyway, but maybe we want to
> re-evaluate the above mechanism for 6.x?
Hm - as far as I can see we could start setting the bit upon
activation right away. Since we do not (yet) rely on it being
present, but do activate the LVs upon guest start, it shouldn't break
anything?
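
Something along these lines on every activation (untested sketch,
example VG/LV name):

    lvchange --setactivationskip y shared-vg/vm-100-disk-0  # set the 'k' bit if missing
    lvchange -ay -K shared-vg/vm-100-disk-0                 # activate despite the flag
    lvchange --refresh shared-vg/vm-100-disk-0              # refresh from patch 2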
