[PVE-User] Guest sudden stop

Alessandro Briosi ab1 at metalit.com
Tue Feb 14 09:59:13 CET 2017


Hi all,
I saw some strange behavior yesterday on a new cluster.

A Windows 2008 guest was suddenly powered off, and I could not find any
clue in the logs as to why.
It's a VM which was migrated from a physical machine over the weekend.
It had been working fine the whole time since.

I then thought it had something to do with the file system (it's on
gluster), but found no clue there either: no errors were reported at
that time.

The server is also a gluster server and has a dedicated bond for
gluster, a dedicated bond for the cluster, and a dedicated bond for the
LAN, all with balance-alb.
The ethernet interfaces for cluster and gluster have jumbo frames
enabled (also enabled on the switches).
It's a dual E5-2630 server with 128GB RAM and is not overallocated;
server load is very low.
The VM has many CPUs assigned but is not using much of them right now.

The VM went down around 16:20 (local time).

Here is some information:

proxmox-ve: 4.4-79 (running kernel: 4.4.35-2-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.35-2-pve: 4.4.35-79
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-108
pve-firmware: 1.1-10
libpve-common-perl: 4.0-91
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-73
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-1
pve-container: 1.0-93
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.7-1
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve14~bpo80


These are the only entries in the syslog around that time:

Feb 13 16:18:35 srvpve1 systemd-timesyncd[3187]: interval/delta/delay/jitter/drift 2048s/+0.003s/0.037s/0.022s/+5ppm
Feb 13 16:20:08 srvpve1 kernel: [492671.687688] vmbr0: port 3(tap101i0) entered disabled state
Feb 13 16:20:08 srvpve1 kernel: [492671.689734] vmbr0: port 3(tap101i0) entered disabled state
Feb 13 16:25:37 srvpve1 pvedaemon[101918]: <root@pam> successful auth for user 'root@pam'
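
In case it helps anyone searching their own logs the same way, here is a
minimal Python sketch of how the entries around the event time can be
pulled out of syslog-style text. The function name entries_near, the
window size, and the hard-coded year are assumptions made for
illustration (classic syslog timestamps carry no year); the sample lines
are the ones quoted above.

```python
from datetime import datetime, timedelta


def entries_near(log_text, event, window_minutes=5, year=2017):
    """Return syslog lines stamped within +/- window_minutes of `event`.

    Hypothetical helper: assumes the classic "Mon DD HH:MM:SS" prefix
    and that all lines belong to the given year.
    """
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in log_text.splitlines():
        try:
            # The first 15 characters hold the timestamp, e.g. "Feb 13 16:20:08".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # wrapped continuation line or text without a timestamp
        if abs(stamp - event) <= window:
            hits.append(line)
    return hits


SAMPLE = """\
Feb 13 16:18:35 srvpve1 systemd-timesyncd[3187]: interval/delta/delay/jitter/drift 2048s/+0.003s/0.037s/0.022s/+5ppm
Feb 13 16:20:08 srvpve1 kernel: [492671.687688] vmbr0: port 3(tap101i0) entered disabled state
Feb 13 16:20:08 srvpve1 kernel: [492671.689734] vmbr0: port 3(tap101i0) entered disabled state
Feb 13 16:25:37 srvpve1 pvedaemon[101918]: <root@pam> successful auth for user 'root@pam'
"""

if __name__ == "__main__":
    # Show everything logged within two minutes of 16:20.
    for entry in entries_near(SAMPLE, datetime(2017, 2, 13, 16, 20), window_minutes=2):
        print(entry)
```

The same function could be pointed at the contents of any other log file
that uses this timestamp prefix, to compare what each one recorded
around 16:20.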

I have checked basically all the logs but found no clue as to what
happened.
Within Windows, the event viewer does not report anything either, except
that at the next boot (which I had to start manually) it reported that
Windows had not been shut down cleanly.

The installed virtio disk driver is version 0.1.126.

The VM is NOT configured for HA, and right now it's the only VM which is
heavily used. The other VMs see little use, but they did not stop, and
they all use the same glusterfs filesystem.

This is the VM configuration:

boot: cdn
bootdisk: virtio0
cores: 8
ide2: none,media=cdrom
memory: 65536
name: scavb
net0: e1000=1C:C1:DE:E9:2A:06,bridge=vmbr0
numa: 0
onboot: 1
ostype: win7
scsihw: virtio-scsi-pci
smbios1: uuid=90477e47-4357-42f8-9c1d-797d4111a604
sockets: 2
virtio0: datastore1:101/vm-101-disk-1.qcow2,size=500G
virtio1: datastore1:101/vm-101-disk-2.qcow2,size=900G
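
To put numbers on "many CPUs assigned", here is a small Python sketch
that reads the configuration above and works out the totals. The helper
names parse_vm_config and allocation are made up for illustration, and
the sketch assumes the memory value is in MiB, as Proxmox VE documents
it.

```python
def parse_vm_config(text):
    """Parse 'key: value' lines into a dict (hypothetical helper).

    Splits on the first colon only, so values such as MAC addresses
    or storage paths keep their own colons.
    """
    config = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def allocation(config):
    """Return (total vCPUs, memory in GiB) for a parsed VM config."""
    vcpus = int(config["sockets"]) * int(config["cores"])
    memory_gib = int(config["memory"]) / 1024  # assumes memory is in MiB
    return vcpus, memory_gib


VM_CONF = """\
boot: cdn
bootdisk: virtio0
cores: 8
memory: 65536
net0: e1000=1C:C1:DE:E9:2A:06,bridge=vmbr0
numa: 0
sockets: 2
virtio0: datastore1:101/vm-101-disk-1.qcow2,size=500G
"""

if __name__ == "__main__":
    vcpus, memory_gib = allocation(parse_vm_config(VM_CONF))
    print(f"vCPUs: {vcpus}, memory: {memory_gib:.0f} GiB")
```

With 2 sockets of 8 cores each, this VM has 16 vCPUs and 64 GiB of
memory, against the 128GB of RAM in the host.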


Any hints on how to proceed?
If any other information is needed for debugging, please let me know.

Alessandro




