[PVE-User] inconsistency between rgmanager & pve status

Dhaussy Alexandre ADhaussy at voyages-sncf.com
Mon Oct 27 11:03:39 CET 2014


I think there is a good chance that this problem ("pvevm status" 
hanging) was also caused by an insufficient open files limit.
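
For anyone who wants to check this on their own node, here is a minimal
sketch (not something taken from pvevm or rgmanager) that compares the number
of file descriptors a process holds with its soft "Max open files" limit. It
assumes a Linux /proc filesystem; the function names parse_soft_nofile and
fd_usage are made up for this example, and the PID is whatever process you
suspect.

```python
import os
import sys


def parse_soft_nofile(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: 'Max' 'open' 'files' <soft> <hard> <units>
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for a PID (hypothetical helper).

    Reading another user's process usually requires root.
    """
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as fh:
        soft = parse_soft_nofile(fh.read())
    return open_fds, soft


if __name__ == "__main__":
    # Default to our own PID when none is given on the command line.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    used, limit = fd_usage(pid)
    print(f"pid {pid}: {used} open fds, soft limit {limit}")
```

If the open count sits close to the soft limit while the hang occurs, that
would be consistent with the open files theory, though it does not prove it.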

On 22/10/2014 11:37, Dhaussy Alexandre wrote:
> Hello,
>
> This problem has not shown up for two weeks.
>
> Last time it hung, I captured a quick strace, and it seems there was a
> timeout on a file descriptor.
> Unfortunately I killed the process before I thought to look in
> /proc/pid/fd, so I am not sure whether it helps.
>
> root@proxmoxt2:~# strace -s 8192 -p 45388
> Process 45388 attached - interrupt to quit
> restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
> poll([{fd=5, events=POLLIN}], 1, 0)     = 0 (Timeout)
> futex(0x7f15bac8d010, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {1412859311, 894453861}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> poll([{fd=5, events=POLLIN}], 1, 0)     = 0 (Timeout)
> futex(0x7f15bac8d010, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {1412859313, 893486725}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> poll([{fd=5, events=POLLIN}], 1, 0)     = 0 (Timeout)
> futex(0x7f15bac8d010, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {1412859315, 892519589}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> poll([{fd=5, events=POLLIN}], 1, 0)     = 0 (Timeout)
> futex(0x7f15bac8d010, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {1412859317, 891552453}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> poll([{fd=5, events=POLLIN}], 1, 0)     = 0 (Timeout)
> futex(0x7f15bac8d010, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {1412859319, 890585317}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
> poll([{fd=5, events=POLLIN}], 1, 0)     = 0 (Timeout)
> futex(0x7f15bac8d010, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {1412859321, 890618197}, ffffffff^C <unfinished ...>
> Process 45388 detached
>
> Regards,
> Alexandre.
>
> On 07/10/2014 08:16, Dietmar Maurer wrote:
>>> I don't see what my problem has to do with a mount point failure.
>>> All filesystems are monitored with nagios every two minutes and I have had no errors so far.
>>>
>>> Maybe I am missing something, but doesn't pvevm status only check whether the kvm
>>> process is running?
>> Indeed, that should not cause a hang. Maybe you can instrument pvevm to see where it hangs?
>>