[PVE-User] PVECluster Issue

Patrick pk at argonius.de
Thu Nov 21 15:07:21 CET 2013


Hi,

I've also run into this problem with NFS storage. On other systems the
NFS share works fine as a backup volume; only on one PVE cluster does it
not work.

I've built a workaround script that just checks whether the backup gets a
lock timeout. If this occurs, the backup is retried, up to 3 times in total.
Not really a good solution, but it works for me.
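
For anyone who wants to try the same approach, here is a minimal sketch of
such a retry wrapper. It is not my actual script. The function names, the
"timeout" marker string, the 60 second delay and the assumption that vzdump
exits non-zero on a failed job are all illustrative guesses; check them
against your own backup logs before relying on this.

```python
import subprocess
import time


def run_command(cmd):
    """Run cmd and return (exit_code, combined stdout/stderr text)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.returncode, proc.stdout


def backup_with_retry(cmd, max_attempts=3, marker="timeout", delay=60,
                      runner=run_command, sleep=time.sleep):
    """Run a backup command, retrying only when its output contains marker.

    marker is an assumed substring of the lock-timeout message.
    Returns (succeeded, attempts_made).
    """
    for attempt in range(1, max_attempts + 1):
        code, output = runner(cmd)
        if code == 0:
            return True, attempt
        if marker not in output:
            # A different kind of failure: retrying is unlikely to help.
            return False, attempt
        if attempt < max_attempts:
            # Give the lock a chance to clear before the next attempt.
            sleep(delay)
    return False, max_attempts
```

It could be called with the same arguments that appear in the job log
quoted below, for example
backup_with_retry(["vzdump", "106", "--mode", "snapshot", "--compress",
"lzo", "--storage", "nfsData"]). Failures that do not look like a lock
timeout are deliberately not retried, so real errors still show up
right away.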

I don't think the problem is NFS itself, because on other systems
the NFS share used for backups is doing fine.

greetz,
patrick

On 18.11.2013 14:05, David Thompson wrote:
> Hi everyone,
>
> I have an issue with an Intel Modular Server with 5 nodes. Whenever the nodes run a backup, some of the backups fail and the VMs often become locked. These VMs are all QEMU-based virtual servers, not OpenVZ, as they are Windows servers.
>
> Almost every day I need to restart the cluster after something like this appears in the backup logs. I do this by going to one of the nodes that is red and restarting the PVECluster, which I believe is the correct way to do this, so please correct me if I'm wrong.
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> INFO: starting new backup job: vzdump 103 101 --quiet 1 --mailto snip at snip --mode snapshot --compress lzo --storage nfsData --node node2
> INFO: Starting Backup of VM 101 (qemu)
> INFO: status = running
> INFO: unable to open file '/etc/pve/nodes/node2/qemu-server/101.conf.tmp.666400' - Software caused connection abort
> INFO: update VM 101: -lock backup
> ERROR: Backup of VM 101 failed - command 'qm set 101 --lock backup' failed: exit code 107
> INFO: Starting Backup of VM 103 (qemu)
> ERROR: Backup of VM 103 failed - unable to find configuration file for VM 103 - no such machine
> INFO: Backup job finished with errors
> TASK ERROR: job errors
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> INFO: starting new backup job: vzdump 106 --quiet 1 --mailto snip at snip --mode snapshot --compress lzo --storage nfsData --node node4
> INFO: Starting Backup of VM 106 (qemu)
> INFO: status = running
> INFO: unable to open file '/etc/pve/nodes/node4/qemu-server/106.conf.tmp.641046' - File exists
> INFO: update VM 106: -lock backup
> ERROR: Backup of VM 106 failed - command 'qm set 106 --lock backup' failed: exit code 17
> INFO: Backup job finished with errors
> TASK ERROR: job errors
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> The backups all go to an NFS share mounted on a private network over gigabit Ethernet.
> Any ideas as to why this is happening? The backup jobs on the hosts are all staggered, so they do not write to the share at the same time.
>
> Thanks for any insight.
>
> _____________________________
> David Thompson 
