[PVE-User] Problems with backup process and NFS
f.gruenbichler at proxmox.com
Tue May 23 08:48:46 CEST 2017
On Mon, May 22, 2017 at 02:52:13PM +0200, Uwe Sauter wrote:
> >>> perl -e 'use strict; use warnings; use PVE::ProcFSTools; use Data::Dumper; print Dumper(PVE::ProcFSTools::parse_proc_mounts());'
> >> $VAR1 = [
> >> ....
> >> [
> >> '<hostname of NFS server>:/backup/proxmox-infra',
> >> '/mnt/pve/aurel',
> >> 'nfs',
> >> 'rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=<if of
> >> NFS server>,mountvers=3,mountport=892,mountproto=tcp,local_lock=none,addr=<ip of NFS server>',
> >> '0',
> >> '0'
> >> ],
> >> .....
> >> ];
> > the culprit is likely that your storage.cfg contains the IP, but your
> > /proc/mounts contains the hostname (with a reverse lookup in between?).
> I was following https://pve.proxmox.com/wiki/Storage:_NFS , quote: "To avoid DNS lookup delays, it is usually preferable to use an
> IP address instead of a DNS name". But yes, the DNS in our environment is configured to allow reverse lookups.
which - AFAIK - is still true, especially since failing DNS means
failing NFS storage if you put the host name there. I think for NFSv4
the situation is slightly different, as reverse lookups are part of the
authentication process, but I haven't played around with that yet.
I cannot reproduce the behaviour you report with an NFS server with
working reverse lookup (proto and mountproto set to tcp, so the
resulting options string looks identical to yours modulo the addresses).
/proc/mounts contains the IP address as source if I put the IP address
into storage.cfg, and the hostname if I put the hostname in storage.cfg
(both on 4.4 and 5.0 Beta).
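for reference, the source field can be inspected directly without going
through PVE::ProcFSTools; the sketch below works on a sample line in the
same shape as the dump above (hostname, export and mountpoint are
placeholders, not taken from your setup):

```shell
# /proc/mounts fields: source mountpoint fstype options dump pass
# Sample line in the same shape as the Dumper output (placeholder names):
line='aurel.example.de:/backup/proxmox-infra /mnt/pve/aurel nfs rw,relatime,vers=3 0 0'

# Print the source for a given mountpoint. PVE compares this string against
# what storage.cfg says, so "hostname:/export" vs "IP:/export" is presumably
# the mismatch being discussed here.
echo "$line" | awk '$2 == "/mnt/pve/aurel" { print $1 }'
```

on a live node you would run the same awk against /proc/mounts itself
instead of the sample line.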
is there anything else in your setup/environment that might cause this
behaviour? what OS is the NFS server on? any entries in /etc/hosts
relating to the NFS server?
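a quick way to check for such a mapping; the address and names below are
placeholders for illustration, substitute your NFS server's:

```shell
# A reverse mapping for the server's IP (via a DNS PTR record or an
# /etc/hosts entry) is what could turn "IP:/export" into
# "hostname:/export" in /proc/mounts.
# Sample /etc/hosts-style line; 192.0.2.10 / aurel.example.de are placeholders.
hosts_sample='192.0.2.10 aurel.example.de aurel'
echo "$hosts_sample" | awk -v ip='192.0.2.10' '$1 == ip { print $2 }'

# On a live node, the equivalent checks would be:
#   awk -v ip='<NFS server IP>' '$1 == ip' /etc/hosts
#   getent hosts <NFS server IP>        # also follows a DNS PTR record
```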
> > can you test using the hostname in your storage.cfg instead of the IP?
> I removed the former definition and unmounted the NFS share on all nodes. BTW, why is a storage not unmounted when it is deleted
> from the WebUI?
because storage deactivation in PVE happens mostly on a volume level,
and only when needed. deactivating something that is (potentially) still
needed is more dangerous than leaving something activated that is not ;)
> Now storage definition looks like:
> nfs: aurel
> export /backup/proxmox-infra
> path /mnt/pve/aurel
> server aurel.XXXXX.de
> content backup
> maxfiles 30
> options vers=3
> With this definition, the backup succeeded (and I got mails back from each host).
I suspected as much.
> So it seems that the recommendation from the wiki prevents PVE's mechanism from working properly (when being used in an
> environment where reverse name lookups are correctly configured).
... on your machine in your specific environment. Your report is the
first showing this behaviour that I know of, so until we get more
information I am inclined not to blame our instructions here :P
Running with IP addresses instead of host names has been shown to be
more robust with NFSv3 (as in, we've had multiple cases where people
experienced NFS storage outages because of DNS problems).
> >> I tested the backup job with a local storage and then I got emails. So it is definitely something related to NFS and backups, not
> >> the mailing mechanism.
> > yes and no - nothing special about NFS here, would be triggered by any
> > storage where storage_info (or the sub call to activate_storage) fails.
> > see my proposed patch for #1389 on pve-devel:
> > https://pve.proxmox.com/pipermail/pve-devel/2017-May/026511.html
> I'm not familiar enough with Perl to be able to comment whether this is enough…
was just intended as a heads up that this specific part of the problem
should be fixed soon (once the patch has been reviewed, applied, and
updated packages have trickled down through the repositories).