[PVE-User] A less aggressive OOM?
Victor Rodriguez
vrodriguez at soltecsis.com
Mon Jul 7 23:39:34 CEST 2025
Hi,
I would start by analyzing the memory status at the time of the OOM.
There should be some lines in the journal/syslog where the kernel records
what the memory looked like, and from those you can figure out why it had
to kill a process.
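As a starting point for that analysis, here is a minimal Python sketch that pulls the OOM killer lines out of a saved kernel log. The two regular expressions assume the message formats printed by recent Linux kernels ("... invoked oom-killer: ..." and "Out of memory: Killed process ..."), and the function name summarize_oom is only illustrative; adjust the patterns if your log lines look different.

```python
import re
import sys

# Assumed message formats (recent Linux kernels); adjust if yours differ.
# The optional prefix covers journalctl's default "... kernel: " line header.
INVOKED = re.compile(
    r"(?:.*? kernel: )?(?P<comm>.+?) invoked oom-killer: .*?order=(?P<order>-?\d+)"
)
KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]*)\)"
    r".*?anon-rss:(?P<anon_kb>\d+)kB"
)


def summarize_oom(lines):
    """Return one dict per OOM-related kernel log line found in `lines`."""
    events = []
    for line in lines:
        m = INVOKED.match(line)
        if m:
            # Task whose allocation triggered the OOM killer, and the
            # allocation order (log2 of the contiguous pages requested).
            events.append(
                {"kind": "invoked", "comm": m["comm"], "order": int(m["order"])}
            )
            continue
        m = KILLED.search(line)
        if m:
            # The victim and its anonymous resident memory, in MiB.
            events.append(
                {
                    "kind": "killed",
                    "pid": int(m["pid"]),
                    "comm": m["comm"],
                    "anon_rss_mib": int(m["anon_kb"]) // 1024,
                }
            )
    return events


if __name__ == "__main__":
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
            for event in summarize_oom(log):
                print(event)
    else:
        print("usage: pass the path of a saved kernel log as the first argument")
```

To use it, save the kernel messages to a file first (for example with "journalctl -k > kernel.log") and pass that file as the first argument. An "order" greater than 0 means the failed request was for several contiguous pages, which is worth noting if you suspect fragmentation rather than plain exhaustion.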
It makes little sense for the OOM killer to trigger on 64GB hosts with just
24GB configured in VMs and, probably, less real usage. IMHO it's not the VMs
that fill your memory up to the point of OOM, but some other process, the ZFS
ARC, maybe even some memory leak. Maybe some process is causing severe
memory fragmentation.
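To check the ZFS ARC suspicion, a rough Python sketch like the following compares the current ARC size with total RAM. It assumes a Linux host with OpenZFS, where the ARC counters are exposed in /proc/spl/kstat/zfs/arcstats and the "size" counter is in bytes; the function names are illustrative only. It does not look at fragmentation or at leaks in other processes.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (kB where a unit is given)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info


def parse_arcstats(text):
    """Map arcstats counter names to integer values."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "<name> <type> <value>"; header rows do not.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_share(meminfo, arcstats):
    """Fraction of total RAM currently held by the ARC (0.0 if no 'size' counter)."""
    total_bytes = meminfo["MemTotal"] * 1024
    return arcstats.get("size", 0) / total_bytes


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            meminfo = parse_meminfo(f.read())
        with open("/proc/spl/kstat/zfs/arcstats") as f:
            arcstats = parse_arcstats(f.read())
    except OSError as exc:
        # No /proc, or ZFS is not loaded on this host.
        print(f"could not read memory statistics: {exc}")
    else:
        print(f"MemAvailable: {meminfo.get('MemAvailable', 0) // 1024} MiB")
        print(f"ARC size:     {arcstats.get('size', 0) // 2**20} MiB")
        print(f"ARC share of RAM: {arc_share(meminfo, arcstats):.1%}")
```

If the ARC turns out to hold a large share of RAM around the time of the kills, that would point at ARC sizing rather than at the VMs themselves.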
Regards,
On 7/7/25 11:26, Marco Gaiarin wrote:
> We have upgraded a set of clusters from PVE6 to PVE8, and we have found that
> in newer kernels, the OOM killer is a bit more 'aggressive' and sometimes kills a VM.
>
> Nodes have plenty of RAM (64GB; 2-3 VMs, each with 8GB RAM), VMs have the qemu
> agent installed and ballooning enabled, but an OOM still sometimes happens.
> Clearly, if the OOM killer hits the main VM that hosts the local DNS, we get some
> trouble.
>
>
> I've looked in the PVE wiki, but found nothing. Is there some way to relax the
> OOM killer, or control its behaviour?
>
> The nodes have no swap, so probably the best thing to do (but the hardest
> one ;-) is to set up some swap with a low swappiness, but I'm seeking
> feedback.
>
>
> Thanks.
>