[PVE-User] Crash after Upgrade PVE2.3

Martin Schuchmann ms at city-pc.de
Thu Mar 21 12:46:56 CET 2013


Hi there,

Yesterday I upgraded from 2.2 to 2.3 (pveversion output below) on 
all three nodes of our cluster (no HA).
At 23:00 the usual backup of a KVM machine (801) started via vzdump.cron 
on Node 3 and ended with errors (see syslog below).

After this crash, the VMs on Node 3 and the web interface were no longer 
reachable.

We restarted pvedaemon and pvestatd and were then able to reach the 
web interface again.

We tried to stop the VMs, but the "vzctl stop xxx" processes remained in 
the process list; even kill -9 did not remove them.
"reboot" via ssh also failed, so we had to run "echo b > 
/proc/sysrq-trigger" to restart the host.
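
Processes that survive kill -9 are usually stuck in uninterruptible sleep 
(state "D"), which matches the "D" flag on kvm and lvcreate in the kernel 
messages below. A minimal sketch for listing such processes, assuming the 
procps version of ps; the function name list_blocked is only illustrative:

```shell
# Hypothetical helper: print the ps header plus every process whose
# state starts with "D" (uninterruptible sleep, usually waiting on I/O).
list_blocked() {
    # wchan shows the kernel function the process is sleeping in, if known
    ps -eo pid,stat,wchan:32,args | awk 'NR == 1 || $2 ~ /^D/'
}

list_blocked
```

If only the header line is printed, nothing is blocked at that moment.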

After the reboot everything was fine and the VMs started again.

On the two other nodes (not rebooted) we still see this message in syslog:

Mar 21 12:09:18 promo2 pvestatd[101835]: WARNING: command 'df -P -B 1 
/mnt/pve/p3_storage' failed: got timeout

But run manually in a bash shell, "df -P -B 1 /mnt/pve/p3_storage" works 
fine on all three hosts.
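
To see whether df is merely slow rather than failing, the same command can 
be run under a hard time limit. This is a minimal sketch, assuming the GNU 
coreutils "timeout" command is available; the helper name check_mount and 
the 2-second default are assumptions for illustration, not the values 
pvestatd actually uses:

```shell
# Hypothetical helper: run df on a mount point under a time limit and
# report whether it finished, plus the elapsed whole seconds.
check_mount() {
    mount_point="$1"
    limit="${2:-2}"   # seconds; assumed value, not pvestatd's real timeout
    start=$(date +%s)
    if timeout "$limit" df -P -B 1 "$mount_point" >/dev/null 2>&1; then
        status="ok"
    else
        status="failed or timed out"
    fi
    end=$(date +%s)
    echo "$mount_point: $status after $((end - start))s"
}
```

For example, "check_mount /mnt/pve/p3_storage 2" could be repeated in a 
loop on each node to catch intermittent stalls.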


Has this severe backup issue been reported before?
Any hints on how to prevent it?

Regards, Martin


pve-manager: 2.3-13 (pve-manager/2.3/7946f1f1)
running kernel: 2.6.32-19-pve
proxmox-ve-2.6.32: 2.3-93
pve-kernel-2.6.32-10-pve: 2.6.32-63
pve-kernel-2.6.32-19-pve: 2.6.32-93
pve-kernel-2.6.32-17-pve: 2.6.32-83
lvm2: 2.02.95-1pve2
clvm: 2.02.95-1pve2
corosync-pve: 1.4.4-4
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.93-2
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.9-1
pve-cluster: 1.0-36
qemu-server: 2.3-18
pve-firmware: 1.0-21
libpve-common-perl: 1.0-49
libpve-access-control: 1.0-26
libpve-storage-perl: 2.3-6
vncterm: 1.0-3
vzctl: 4.0-1pve2
vzprocps: 2.0.11-2
vzquota: 3.1-1
pve-qemu-kvm: 1.4-8
ksm-control-daemon: 1.1-1


Mar 20 23:00:01 promo3 /USR/SBIN/CRON[150583]: (root) CMD (vzdump 801 
306 --quiet 1 --mode snapshot --compress lzo --storage p2_storage)
Mar 20 23:00:02 promo3 vzdump[150584]: <root at pam> starting task 
UPID:promo3:00024C3A:00785E0A:514A3162:vzdump::root at pam:
Mar 20 23:00:02 promo3 vzdump[150586]: INFO: starting new backup job: 
vzdump 801 306 --quiet 1 --mode snapshot --compress lzo --storage p2_storage
Mar 20 23:00:02 promo3 vzdump[150586]: INFO: Starting Backup of VM 306 
(openvz)
Mar 20 23:00:31 promo3 pvestatd[2328]: WARNING: unable to connect to VM 
801 socket - timeout after 31 retries
...
Mar 20 23:03:11 promo3 pvestatd[2328]: WARNING: unable to connect to VM 
801 socket - timeout after 31 retries
Mar 20 23:03:18 promo3 kernel: INFO: task kvm:2585 blocked for more than 
120 seconds.
Mar 20 23:03:18 promo3 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 20 23:03:18 promo3 kernel: kvm           D ffff88107a480da0 0  
2585      1    0 0x00000000
Mar 20 23:03:18 promo3 kernel: ffff88107a92fd08 0000000000000082 
0000000000000000 ffff880879df35c8
Mar 20 23:03:18 promo3 kernel: ffff880878cc08c0 00000000000000db 
ffff88107c415810 ffff88107a92fab8
Mar 20 23:03:18 promo3 kernel: ffff88107c415800 0000000104af1976 
ffff88107a481368 000000000001e9c0
Mar 20 23:03:18 promo3 kernel: Call Trace:
Mar 20 23:03:18 promo3 kernel: [<ffffffff8119ad69>] 
__sb_start_write+0x169/0x1a0
Mar 20 23:03:18 promo3 kernel: [<ffffffff81097200>] ? 
autoremove_wake_function+0x0/0x40
Mar 20 23:03:18 promo3 kernel: [<ffffffff81127489>] 
generic_file_aio_write+0x69/0x100
Mar 20 23:03:18 promo3 kernel: [<ffffffff811e325b>] 
aio_rw_vect_retry+0xbb/0x220
Mar 20 23:03:18 promo3 kernel: [<ffffffff811e4bc4>] aio_run_iocb+0x64/0x170
Mar 20 23:03:18 promo3 kernel: [<ffffffff811e614c>] do_io_submit+0x2bc/0x670
Mar 20 23:03:18 promo3 kernel: [<ffffffff811e6510>] sys_io_submit+0x10/0x20
Mar 20 23:03:18 promo3 kernel: [<ffffffff8100b102>] 
system_call_fastpath+0x16/0x1b
Mar 20 23:03:18 promo3 kernel: INFO: task lvcreate:150596 blocked for 
more than 120 seconds.
Mar 20 23:03:18 promo3 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 20 23:03:18 promo3 kernel: lvcreate      D ffff88087aae6d20 0 150596 
150595    0 0x00000000
Mar 20 23:03:18 promo3 kernel: ffff8802fc5bbc48 0000000000000082 
0000000000000000 00000000000000d2
Mar 20 23:03:18 promo3 kernel: ffffe8ffffffffff ffff88087bec5760 
ffffffff81ac37d0 ffffffff8141c110
Mar 20 23:03:18 promo3 kernel: 0000000000000000 0000000104af1b10 
ffff88087aae72e8 000000000001e9c0
Mar 20 23:03:18 promo3 kernel: Call Trace:
Mar 20 23:03:18 promo3 kernel: [<ffffffff8141c110>] ? copy_params+0x90/0x110
Mar 20 23:03:18 promo3 kernel: [<ffffffff8119ab6d>] sb_wait_write+0x9d/0xb0
Mar 20 23:03:18 promo3 kernel: [<ffffffff81097200>] ? 
autoremove_wake_function+0x0/0x40
Mar 20 23:03:18 promo3 kernel: [<ffffffff8119c2d0>] freeze_super+0x60/0x140
Mar 20 23:03:18 promo3 kernel: [<ffffffff811d5ad8>] freeze_bdev+0x98/0xe0
Mar 20 23:03:18 promo3 kernel: [<ffffffff81415697>] dm_suspend+0x97/0x270
Mar 20 23:03:18 promo3 kernel: [<ffffffff8141a1dc>] ? 
__find_device_hash_cell+0xac/0x170
Mar 20 23:03:18 promo3 kernel: [<ffffffff8141b4a6>] dev_suspend+0x76/0x250
Mar 20 23:03:18 promo3 kernel: [<ffffffff8141c344>] ctl_ioctl+0x1b4/0x270
Mar 20 23:03:18 promo3 kernel: [<ffffffff8141b430>] ? dev_suspend+0x0/0x250
Mar 20 23:03:18 promo3 kernel: [<ffffffff8141c413>] dm_ctl_ioctl+0x13/0x20
Mar 20 23:03:18 promo3 kernel: [<ffffffff811ac622>] vfs_ioctl+0x22/0xa0
Mar 20 23:03:18 promo3 kernel: [<ffffffff81061bcf>] ? 
pick_next_task_fair+0x16f/0x1f0
Mar 20 23:03:18 promo3 kernel: [<ffffffff8109e52d>] ? 
sched_clock_cpu+0xcd/0x110
Mar 20 23:03:18 promo3 kernel: [<ffffffff811ac7ca>] do_vfs_ioctl+0x8a/0x590
Mar 20 23:03:18 promo3 kernel: [<ffffffff8151dc50>] ? 
thread_return+0xbe/0x88e
Mar 20 23:03:18 promo3 kernel: [<ffffffff8108e675>] ? set_one_prio+0x75/0xd0
Mar 20 23:03:18 promo3 kernel: [<ffffffff811acd1f>] sys_ioctl+0x4f/0x80
Mar 20 23:03:18 promo3 kernel: [<ffffffff8100b102>] 
system_call_fastpath+0x16/0x1b
Mar 20 23:03:21 promo3 pvestatd[2328]: WARNING: unable to connect to VM 
801 socket - timeout after 31 retries
...





