[PVE-User] New Kernel - 2.6.32-16 (2.6.32-79) - pvetest

Michael Rasmussen mir at miras.org
Thu Oct 18 20:08:22 CEST 2012


On Thu, 18 Oct 2012 15:46:14 +0000
Martin Maurer <martin at proxmox.com> wrote:

> 
> OpenVZ live migration with the CTs on NFS storage now appears to work!  Several live migrates back and forth between two nodes running the new kernel completed without errors - first time I've been able to do that.
> 
The problem we are facing is described here:
http://www.freesoft.org/CIE/RFC/1813/32.htm

"[..]Thus, if a client needs to be able to continue to access a file
after using REMOVE to remove it, the client should take steps to make
sure that the file will still be accessible. The usual mechanism used
is to use RENAME to rename the file from its old name to a new hidden
name."
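
The behaviour the RFC describes can be observed from user space. Below is a minimal Python sketch (not part of the original mail; the function name read_after_unlink is hypothetical) that removes a file while still holding it open. On a local filesystem the data simply stays reachable through the open descriptor. On an NFS mount the Linux client typically performs the hidden rename itself, which is where the .nfsxxxxxx entries come from; they disappear again once the descriptor is closed.

```python
import os


def read_after_unlink(directory):
    """Remove a file while it is still open, then read it back.

    Returns (data, leftovers), where leftovers lists any .nfs* entries
    seen in the directory while the removed file was still held open.
    """
    path = os.path.join(directory, "lockfile")
    with open(path, "w") as f:
        f.write("held open")

    with open(path) as f:
        os.remove(path)  # the name is gone, the descriptor stays valid
        leftovers = [n for n in os.listdir(directory) if n.startswith(".nfs")]
        data = f.read()
    return data, leftovers
```

Run against a directory on an NFS mount, leftovers would be expected to contain one hidden name; against a local directory it should be empty.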

What does this mean?

1) The migration routine suspends the running processes.
2) Locks shared by more than one process lead to a race condition that
causes .nfsxxxxxx files to be created.

This has one of two consequences, depending on how VZ migration is
implemented:
1) When the migration routine restores state on the other node, the
process will recreate the lock file, and subsequent processes will
use this lock file and release their old lock file. The filesystem
will therefore remove the .nfsxxxxxx file. Migration will fail due to
an inconsistency in the filesystem.
2) The NFS server will, when restarted, ensure that clients reclaim
their locks, leading to the same problem as 1).

How can this be solved?
Either disregard .nfsxxxxxx files in some way when saving state, or
configure a higher NLM grace period before saving state (while NLM is
in an active grace period, new locks cannot be granted).
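
To make the first option concrete: a pre-check could list the .nfs* files below a container's private area before state is saved. The following Python sketch is only an assumption about how such a check might look; it is not how the VZ migration code works, and find_silly_renames is a hypothetical helper name.

```python
import os


def find_silly_renames(root):
    """Return the sorted paths of all .nfs* files found below root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(".nfs"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

It would be called with the container's private directory on the NFS storage; an empty list means there is nothing to disregard.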

Disregarding .nfsxxxxxx files could cause processes to crash, while
honoring the NLM grace period will extend the transition time, but it is
the safe solution. So if the safe path is to be followed, two things
could reduce the transition time:
1) Configure the NLM grace period to a higher value so that file locks
are not granted until the filesystem is completely restored.
Kernel: lockd.nlm_grace_period (assigns a grace period to the lock
manager). Userland: lockd -g n, where n is the number of seconds [0-240].
2) Avoid NLM file locking altogether by mounting the NFS share with
nolock. (Theoretically this should suffice, but I have no test to prove
it.)
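
Whether a share is currently mounted with nolock can be checked from the mount table. The sketch below parses text in /proc/mounts format and reports NFS mounts whose option list lacks nolock. It assumes that the option shows up literally in the options field and that NFSv2/v3 mounts carry the filesystem type nfs; the function name is hypothetical.

```python
def nfs_mounts_with_locking(mounts_text):
    """Return mount points of NFS mounts whose options lack 'nolock'.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, fstype, options = fields[:4]
        # Assumption: only type "nfs" (v2/v3) relies on NLM locking.
        if fstype == "nfs" and "nolock" not in options.split(","):
            result.append(mountpoint)
    return result
```

Feeding it the contents of /proc/mounts on a node lists the shares that would still go through NLM.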

Proxmox kernel:
cat /proc/sys/fs/nfs/nlm_grace_period
0
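
The same value can be read programmatically, for instance to compare nodes before a migration. This is a small sketch under the assumption that the proc file shown above exists on the node; read_nlm_grace_period is a hypothetical helper and returns None when the file is missing or does not contain a number.

```python
def read_nlm_grace_period(path="/proc/sys/fs/nfs/nlm_grace_period"):
    """Return the configured NLM grace period in seconds, or None."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File absent (lockd not loaded) or unexpected content.
        return None
```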



-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael <at> rasmussen <dot> cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir <at> datanom <dot> net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir <at> miras <dot> org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------