[PVE-User] High load on cpu0 with 3.10 pve kernel on proxmox 3.4 [SOLVED]

Falko Trojahn trojahn+proxmox at pluspol.info
Mon Sep 14 10:24:11 CEST 2015


On 11.09.2015 at 11:31, Falko Trojahn wrote:
> On 01.09.2015 at 16:15, Thomas Lamprecht wrote:
>>>> There is a huge number of interrupts from the network card and also
>>>> from your RAID.
>>>>
>>>> I'd guess that the main problem is your network card; make sure the
>>>> drivers for eth0 and eth3 are configured and installed correctly.
>>> Ok, I'll check'em again.
>>>
>>> What do you think about compiling a newer kernel, since there seems
>>> to be a patch specifically for this problem?
>> Hmm, but then it's quite strange that it happens on only one node as
>> they all have the same kernel and hardware, I guess?
> Yes, that's the strange thing.
In the meantime, a second server also had this problem for some time.

> 
>>
>> You will need the <version>-pve kernel for Proxmox VE, and the options
>> are 2.6.32.x (OpenVZ kernel) and the 3.10.x (RHEL 6 kernel).
> I'll try compiling from https://git.proxmox.com/?p=pve-kernel-3.10.0.git;
> we nevertheless need the iscsitarget module, as there seems to be a
> bug in the dkms module which we had to fix manually.
After adding the patch from commit e9ba61f0ddacaa5efd7dd4619d828d2466638913
(see https://www.kernel.org/pub/linux/kernel/v3.0/ChangeLog-3.10.33)
and making some adjustments to the Makefile (gcc version, reactivating the
iscsi target), we are now running the recompiled pve-kernel-3.10.0-11.
ksoftirqd load and CPU temperature are back to normal.
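
For anyone who wants to check whether interrupts are piling up on cpu0 on
their own node, here is a minimal sketch, assuming the usual
/proc/interrupts layout (a header row of CPU names, then one row per
interrupt source with one counter per CPU). The function names and the
sample data (including the device names in it) are made up for
illustration and are not taken from our servers.

```python
import os


def parse_interrupts(text):
    """Parse /proc/interrupts-style text into (cpu_names, rows).

    Each row is (label, [count per CPU], description).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    rows = []
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for tok in fields[:len(cpus)]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        if len(counts) != len(cpus):
            # Lines such as ERR: or MIS: carry one global counter only.
            continue
        desc = " ".join(fields[len(cpus):])
        rows.append((label.strip(), counts, desc))
    return cpus, rows


def per_cpu_totals(text):
    """Sum all interrupt counters per CPU."""
    cpus, rows = parse_interrupts(text)
    totals = [0] * len(cpus)
    for _, counts, _ in rows:
        for i, c in enumerate(counts):
            totals[i] += c
    return dict(zip(cpus, totals))


def top_sources(text, cpu="CPU0", n=5):
    """Return the n interrupt sources with the highest count on one CPU."""
    cpus, rows = parse_interrupts(text)
    idx = cpus.index(cpu)
    ranked = sorted(rows, key=lambda r: r[1][idx], reverse=True)
    return [(label, counts[idx], desc) for label, counts, desc in ranked[:n]]


# Hypothetical sample data, used when /proc/interrupts is not available.
SAMPLE = """\
           CPU0       CPU1
  0:         46          0   IO-APIC-edge      timer
 45:     900000         12   PCI-MSI-edge      eth0-TxRx-0
 46:     300000          8   PCI-MSI-edge      megasas
NMI:         10         10   Non-maskable interrupts
ERR:          0
"""

if __name__ == "__main__":
    path = "/proc/interrupts"
    if os.path.exists(path):
        with open(path) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    print(per_cpu_totals(text))
    for label, count, desc in top_sources(text):
        print(f"{label:>5} {count:>12} {desc}")
```

The counters are cumulative since boot, so running this twice a few
seconds apart and comparing the numbers gives a better picture of the
current interrupt rate than a single snapshot.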

Thanks for your support!
Best regards
Falko

> 
>>
>> PVE4 beta has the 4.1.x kernel at the moment, maybe this could solve the
>> problem if it's kernel related.
> 
> I'm not sure, but wouldn't it make sense to give the 3.19.8 pve kernel a
> try?
> 
> Best regards,
> Falko
> 
>>
>>> ---------------->8-------------------------------8<----------------------
>>> Since commit 77873803363c "net_dma: mark broken" we no longer pin dma
>>> engines active for the network-receive-offload use case. As a result
>>> the ->free_chan_resources() that occurs after the driver self test no
>>> longer has a NET_DMA induced ->alloc_chan_resources() to back it up. A
>>> late firing irq can lead to ksoftirqd spinning indefinitely due to the
>>> tasklet_disable() performed by ->free_chan_resources(). Only
>>> ->alloc_chan_resources() can clear this condition in affected kernels.
>>>
>>> This problem has been present since commit 3e037454bcfa "I/OAT: Add
>>> support for MSI and MSI-X" in 2.6.24, but is now exposed. Given the
>>> NET_DMA use case is deprecated we can revisit moving the driver to use
>>> threaded irqs. ...
>>> ---------------->8-------------------------------8<----------------------
>>>
>>> from:
>>>
>>> https://www.kernel.org/pub/linux/kernel/v3.0/ChangeLog-3.10.33
>>>
>>> Cheers,
>>> Falko
>>>
>>
>>
> 
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 



