[PVE-User] CPU soft lockup

Lars Wilke l.wilke at it-betrieb.de
Sat Feb 26 14:30:10 CET 2011


Hi,

I am experiencing reproducible KVM VM crashes/hangs (and once a lost network config) when doing backups
via vzdump on the hypervisor. Most of the time the VM just gets stuck and I have to shut it down via
qm stop. Note that the problem only occurs with the VM I am backing up, and sometimes with this VM when
copying large files on the HV node. The VM serves NFS and some databases and has around 150GB of data
which needs to be backed up every time. The other 3 VMs have never crashed, but once in a while I find the
same warning in their logs. I guess the reason they do not crash might be that they are considerably
smaller in terms of used disk space.

I found this bug report https://bugs.launchpad.net/ubuntu/+source/linux/+bug/579276
and it contains some links to reports from Red Hat.
I am not exactly sure whether the proposed patches fix my problem, but these fixes are all in newer kernel
branches. My question now is whether it would be worth trying the 2.6.35 kernel on the HV. And what about
the VMs, do I need a newer/patched kernel there too?

Feb 26 13:25:17 be01 kernel: BUG: soft lockup - CPU#2 stuck for 10s! [swapper:0]
Feb 26 13:25:17 be01 kernel: CPU 2:
Feb 26 13:25:17 be01 kernel: Modules linked in: nfsd exportfs nfs_acl auth_rpcgss ipv6 xfrm_nalgo crypto_api act_police cls_fw cls_u32 sch_htb sch_hfsc sch_ingress sch_sfq xt_connlimit xt_realm iptable_raw xt_comment xt_policy ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos ipt_TCPMSS ipt_SAME ipt_REJECT ipt_REDIRECT ipt_recent ipt_owner ipt_NETMAP ipt_MASQUERADE ipt_iprange ipt_hashlimit ipt_ECN ipt_ecn ipt_DSCP ipt_dscp ipt_CLUSTERIP ipt_ah ipt_addrtype ip_nat_tftp ip_nat_snmp_basic ip_nat_sip ip_nat_pptp ip_nat_irc ip_nat_h323 ip_nat_ftp ip_nat_amanda ip_conntrack_tftp ip_conntrack_sip ip_conntrack_pptp ip_conntrack_netbios_ns ip_conntrack_irc ip_conntrack_h323 ip_conntrack_ftp ts_kmp ip_conntrack_amanda xt_tcpmss xt_pkttype xt_physdev bridge xt_NFQUEUE xt_multiport xt_MARK xt_mark xt_mac xt_limit xt_length xt_helper xt_DSCP xt_dccp xt_conntrack xt_CONNMARK xt_connmark xt_CLASSIFY ipt_LOG xt_tcpudp xt_state iptable_nat ip_nat ip_conntrack iptable_mangle nfnetlink iptable_filter ip_tables x_tables lockd sunrpc xfs
Feb 26 13:25:17 be01 kernel: dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy joydev virtio_blk virtio_balloon virtio_net i2c_piix4 virtio_pci i2c_core virtio_ring ide_cd serio_raw virtio pcspkr cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Feb 26 13:25:17 be01 kernel: Pid: 0, comm: swapper Not tainted 2.6.18-194.32.1.el5 #1
Feb 26 13:25:17 be01 kernel: RIP: 0010:[<ffffffff8006b36b>]  [<ffffffff8006b36b>] default_idle+0x29/0x50
Feb 26 13:25:17 be01 kernel: RSP: 0018:ffff81021fc67ef0  EFLAGS: 00000246
Feb 26 13:25:17 be01 kernel: RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
Feb 26 13:25:17 be01 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff8030a718
Feb 26 13:25:17 be01 kernel: RBP: ffff81021fc1c270 R08: ffff81021fc66000 R09: 000000000000003e
Feb 26 13:25:17 be01 kernel: R10: ffff81021fcc0038 R11: 0000000000000000 R12: 00000000000fc133
Feb 26 13:25:17 be01 kernel: R13: 000022062c42fc61 R14: ffff8101639ff080 R15: ffff81021fc1c080
Feb 26 13:25:17 be01 kernel: FS:  0000000000000000(0000) GS:ffff81021fc1be40(0000) knlGS:0000000000000000
Feb 26 13:25:17 be01 kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Feb 26 13:25:17 be01 kernel: CR2: 00002b8470339000 CR3: 000000020ce1a000 CR4: 00000000000006e0
Feb 26 13:25:17 be01 kernel:
Feb 26 13:25:17 be01 kernel: Call Trace:
Feb 26 13:25:17 be01 kernel:  [<ffffffff800492c4>] cpu_idle+0x95/0xb8
Feb 26 13:25:17 be01 kernel:  [<ffffffff80077991>] start_secondary+0x498/0x4a7
Feb 26 13:25:17 be01 kernel:

The VMs are all CentOS 5.5.
Each HV node runs 2 KVM VMs, which are more or less identical.
No OpenVZ is used; the two HV nodes share the storage via an LSI SAS HBA with
15K RPM disks. The VMs use the deadline I/O scheduler and the HVs use the default CFQ one.

# pveversion -v
pve-manager: 1.7-11 (pve-manager/1.7/5470)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.7-30
pve-kernel-2.6.32-4-pve: 2.6.32-30
pve-kernel-2.6.18-2-pve: 2.6.18-5
qemu-server: 1.1-28
pve-firmware: 1.0-10
libpve-storage-perl: 1.0-16
vncterm: 0.9-2
vzctl: 3.0.24-1pve4
vzdump: 1.2-10
vzprocps: 2.0.11-1dso2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.13.0-3
ksm-control-daemon: 1.0-4

Debian Version: 5.0.8

VM configuration:
name: be01
ide2: none,media=cdrom
bootdisk: ide0
ostype: l26
ide0: kvm-share1:vm-102-disk-1,cache=none
memory: 8192
sockets: 2
onboot: 1
description:
cores: 2
vlan2: virtio=AA:A0:9F:11:67:E1
virtio0: kvm-share1:vm-102-disk-2,cache=none
boot: c
freeze: 0
cpuunits: 200000
acpi: 1
kvm: 1
vlan1: virtio=BE:30:52:BF:27:36
virtio1: data-share1:vm-102-disk-1,cache=none
virtio2: kvm-share1:vm-102-disk-3,cache=none
args: -balloon virtio

The backup is done like this:
nice -n 14 vzdump --snapshot --size 2048 --compress --stdexcludes --ionice 7 --bwlimit 6148 --dumpdir /mnt 102

I first tried without nice, bwlimit and ionice; this got me into trouble really fast.
Repeated experiments showed these to be usable values for the moment, but I still sometimes get the kernel
warnings shown above.

When copying large files I use the following, otherwise I sometimes run into the same problem as when doing backups:
nice -n 14 cstream -i <input> -t 6148000 -o <output> &
ionice -c 2 -n 7 -p "$!"
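
It might be slightly cleaner to start cstream under ionice right away instead of re-prioritising the
already-running process, e.g.:

nice -n 14 ionice -c 2 -n 7 cstream -i <input> -t 6148000 -o <output>

but I have not tested whether that behaves any differently in practice.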

Btw, how can I apply IO limits to the VMs? I would like to limit the allowed network and disk resource usage,
especially disk usage, since shared storage is used.
IIUC I could use cgroups to limit block IO bandwidth and network usage, roughly along the lines of the sketch below.
Is there anybody here who would not mind sharing their experiences with doing so?
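
For the record, what I had in mind for the block IO side is roughly the following untested sketch. It assumes
a kernel that exposes the blkio throttle controller (upstream 2.6.37+, so probably not the current 2.6.32 pve
kernel), and the mount point, device numbers, limit and pid are only placeholders:

# make the blkio controller available and create a group for VM 102
mkdir -p /cgroup/blkio
mount -t cgroup -o blkio none /cgroup/blkio
mkdir /cgroup/blkio/vm102
# throttle reads and writes on the shared volume (example major:minor 253:2) to ~6 MB/s
echo "253:2 6291456" > /cgroup/blkio/vm102/blkio.throttle.read_bps_device
echo "253:2 6291456" > /cgroup/blkio/vm102/blkio.throttle.write_bps_device
# move the kvm process of VM 102 into the new cgroup (<kvm-pid> is a placeholder)
echo <kvm-pid> > /cgroup/blkio/vm102/tasks

For the network side I suspect tc on the VM's tap interface (or on the bridge) is the more direct route, since
the bridged traffic does not originate from local sockets and so would not be tagged by net_cls.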

thanks
   --lars


