[PVE-User] A less aggressive OOM?

Roland devzero at web.de
Thu Jul 10 11:08:31 CEST 2025


If OOM kicks in because half of the RAM is being used for
caches/buffers, I would blame the OOM killer or ZFS for that. The
problem should be resolved at the ZFS or memory-management level.

Why kill processes instead of reclaiming the ARC? I think that's
totally wrong behaviour.

I will watch out for an appropriate ZFS GitHub issue, or we should
consider opening one.
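
For what it's worth, OpenZFS does throttle how much ARC the kernel's
shrinker may reclaim per invocation via the zfs_arc_shrinker_limit
module parameter, which would explain ARC not shrinking fast enough
under sudden pressure. A sketch (the default of 10000 pages and the
exact semantics should be verified against your OpenZFS version's
module-parameters documentation):

```shell
# Show how many pages the kernel shrinker may take from the ARC per
# call (0 means unlimited; small values make the ARC slow to shrink).
cat /sys/module/zfs/parameters/zfs_arc_shrinker_limit

# Allow unrestricted ARC reclaim under memory pressure (runtime only;
# add "options zfs zfs_arc_shrinker_limit=0" to /etc/modprobe.d/zfs.conf
# to persist it across reboots).
echo 0 > /sys/module/zfs/parameters/zfs_arc_shrinker_limit
```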

roland

On 10.07.25 at 10:56, Victor Rodriguez wrote:
> Hi,
>
> I checked the OOM log and for me the conclusion is clear (disclaimer:
> the numbers might not be exact):
>
> - You had around 26.7G of memory used by processes + 2.3G of shared memory:
>
> active_anon:17033436kB
> inactive_anon:10633368kB
> shmem:2325708kB
> mapped:2285988kB
> unevictable:158204kB
>
> - It seems you are also using ZFS (some zd* disks appear in the log)
> and, given that you were doing backups at the time of the OOM, I will
> suppose that your ARC size is set to 50% of the host's memory (check
> with arc_summary), so another 32G of used memory. The ARC is
> reclaimable by the host, but usually ZFS does not return that memory
> fast enough, especially during heavy use of the ARC (i.e. reading for
> a backup), so you can't really count on that memory.
>
> - Memory was quite fragmented and only small pages were available:
>
> Node 0 Normal:
>   16080*4kB
>   36886*8kB
>   22890*16kB
>   4687*32kB
>   159*64kB
>   10*128kB
>   0*256kB
>   0*512kB
>   0*1024kB
>   0*2048kB
>   0*4096kB
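>

The free memory implied by that buddy-allocator excerpt can be
recomputed from the counts alone; a quick sketch (figures taken
verbatim from the kernel log quoted further down):

```python
# Recompute Node 0 Normal free memory from the buddy-allocator counts:
# each entry maps block_size_kB -> number_of_free_blocks of that size.
counts = {4: 16080, 8: 36886, 16: 22890, 32: 4687, 64: 159,
          128: 10, 256: 0, 512: 0, 1024: 0, 2048: 0, 4096: 0}

total_kb = sum(size * n for size, n in counts.items())
small_kb = sum(size * n for size, n in counts.items() if size <= 32)

print(total_kb)                       # 887088 kB, matching "= 887088kB"
print(round(small_kb / total_kb, 3))  # ~0.987: nearly all free memory
                                      # sits in blocks of 32 kB or less
```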
>
>
> Conclusions:
>
> You had 32+26.7+2.3 ≃ 61G of used memory, with the ~3G available being
> small blocks that can't be used for the typically large allocations
> that VMs do. Your host had no choice but to trigger the OOM killer.
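>

That arithmetic can be reproduced from the kB figures in the log; a
sketch (note the 32G ARC term is an assumption about arc_max, not a
measured value):

```python
KB_TO_GIB = 1 / 1024**2  # kB -> GiB

anon  = (17033436 + 10633368) * KB_TO_GIB  # active_anon + inactive_anon
shmem = 2325708 * KB_TO_GIB                # shmem from the Node 0 line
arc   = 32.0                               # assumed ARC: 50% of 64 GiB

print(round(anon, 1), round(shmem, 1))  # ~26.4 and ~2.2 GiB
print(round(anon + shmem + arc, 1))     # ~60.6 GiB on a 64 GiB host
```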
>
>
> What I would do:
>
> - Lower ARC size [1]
> - Add some swap (never place it on a ZFS disk!). Even some ZRAM could
> help.
> - Lower your VMs' memory: either the total, the minimum memory
> (balloon), or both. Check that the VirtIO drivers and the balloon
> driver are installed and working so the host can reclaim memory from
> the guests.
> - Get more RAM :)
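
For the first point, the wiki page in [1] boils down to setting the
zfs_arc_max module parameter; a sketch with an example value (8 GiB is
only illustrative, size it for your own host):

```shell
# Persistently cap the ARC at 8 GiB (example value; pick one that
# leaves headroom for your VMs), per the Proxmox wiki linked in [1]:
echo "options zfs zfs_arc_max=$((8 * 1024*1024*1024))" \
    > /etc/modprobe.d/zfs.conf
update-initramfs -u -k all   # rebuild initramfs so it applies at boot

# Or change it at runtime, effective immediately:
echo $((8 * 1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max
```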
>
>
> Regards
>
>
> [1] 
> https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
>
>
>
> On 7/8/25 18:31, Marco Gaiarin wrote:
>> Mandi! Victor Rodriguez
>>    In chel di` si favelave...
>>
>>> I would start by analyzing the memory status at the time of the OOM. 
>>> There
>>> should be a some lines in journal/syslog were the kernel writes what 
>>> the
>>> memory looked like and you can figure out why it had to kill a process.
>> This is the full OOM log:
>>
>> Jul  4 20:00:12 pppve1 kernel: [3375931.660119] kvm invoked 
>> oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.669158] CPU: 1 PID: 4088 
>> Comm: kvm Tainted: P           O       6.8.12-10-pve #1
>> Jul  4 20:00:12 pppve1 kernel: [3375931.677778] Hardware name: Dell 
>> Inc. PowerEdge T440/021KCD, BIOS 2.24.0 04/02/2025
>> Jul  4 20:00:12 pppve1 kernel: [3375931.686211] Call Trace:
>> Jul  4 20:00:12 pppve1 kernel: [3375931.689504]  <TASK>
>> Jul  4 20:00:12 pppve1 kernel: [3375931.692428] dump_stack_lvl+0x76/0xa0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.696915] dump_stack+0x10/0x20
>> Jul  4 20:00:12 pppve1 kernel: [3375931.701057] dump_header+0x47/0x1f0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.705358] 
>> oom_kill_process+0x110/0x240
>> Jul  4 20:00:12 pppve1 kernel: [3375931.710169] 
>> out_of_memory+0x26e/0x560
>> Jul  4 20:00:12 pppve1 kernel: [3375931.714707] 
>> __alloc_pages+0x10ce/0x1320
>> Jul  4 20:00:12 pppve1 kernel: [3375931.719422] 
>> alloc_pages_mpol+0x91/0x1f0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.724136] alloc_pages+0x54/0xb0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.728320] 
>> __get_free_pages+0x11/0x50
>> Jul  4 20:00:12 pppve1 kernel: [3375931.732938] __pollwait+0x9e/0xe0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.737015] eventfd_poll+0x2c/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.741261] do_sys_poll+0x2f4/0x610
>> Jul  4 20:00:12 pppve1 kernel: [3375931.745587]  ? 
>> __pfx___pollwait+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.750332]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.754900]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.759463]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.764011]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.768617]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.773165]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.777688]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.782156]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.786622]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.791111] 
>> __x64_sys_ppoll+0xde/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.795656] 
>> x64_sys_call+0x1818/0x2480
>> Jul  4 20:00:12 pppve1 kernel: [3375931.800193] do_syscall_64+0x81/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.804485]  ? 
>> __x64_sys_ppoll+0xf2/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.809100]  ? 
>> syscall_exit_to_user_mode+0x86/0x260
>> Jul  4 20:00:12 pppve1 kernel: [3375931.814566]  ? 
>> do_syscall_64+0x8d/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.818979]  ? 
>> syscall_exit_to_user_mode+0x86/0x260
>> Jul  4 20:00:12 pppve1 kernel: [3375931.824425]  ? 
>> do_syscall_64+0x8d/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.828825]  ? 
>> clear_bhb_loop+0x15/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.833211]  ? 
>> clear_bhb_loop+0x15/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.837579]  ? 
>> clear_bhb_loop+0x15/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.841928] 
>> entry_SYSCALL_64_after_hwframe+0x78/0x80
>> Jul  4 20:00:12 pppve1 kernel: [3375931.847482] RIP: 0033:0x765bb1ce8316
>> Jul  4 20:00:12 pppve1 kernel: [3375931.851577] Code: 7c 24 08 e8 2c 
>> 95 f8 ff 4c 8b 54 24 18 48 8b 74 24 10 41 b8 08 00 00 00 41 89 c1 48 
>> 8b 7c 24 08 4c 89 e2 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 
>> 44 89 cf 89 44 24 08 e8 76 95 f8 ff 8b 44
>> Jul  4 20:00:12 pppve1 kernel: [3375931.871194] RSP: 
>> 002b:00007fff2d39ea20 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
>> Jul  4 20:00:12 pppve1 kernel: [3375931.879298] RAX: ffffffffffffffda 
>> RBX: 00006045d3e68470 RCX: 0000765bb1ce8316
>> Jul  4 20:00:12 pppve1 kernel: [3375931.886963] RDX: 00007fff2d39ea40 
>> RSI: 0000000000000010 RDI: 00006045d4de5f20
>> Jul  4 20:00:12 pppve1 kernel: [3375931.894630] RBP: 00007fff2d39eaac 
>> R08: 0000000000000008 R09: 0000000000000000
>> Jul  4 20:00:12 pppve1 kernel: [3375931.902299] R10: 0000000000000000 
>> R11: 0000000000000293 R12: 00007fff2d39ea40
>> Jul  4 20:00:12 pppve1 kernel: [3375931.909951] R13: 00006045d3e68470 
>> R14: 00006045b014d570 R15: 00007fff2d39eab0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.917656]  </TASK>
>> Jul  4 20:00:12 pppve1 kernel: [3375931.920515] Mem-Info:
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] active_anon:4467063 
>> inactive_anon:2449638 isolated_anon:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  active_file:611 
>> inactive_file:303 isolated_file:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] unevictable:39551 
>> dirty:83 writeback:237
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] 
>> slab_reclaimable:434580 slab_unreclaimable:1792355
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  mapped:571491 
>> shmem:581427 pagetables:26365
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] sec_pagetables:11751 
>> bounce:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] 
>> kernel_misc_reclaimable:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  free:234516 
>> free_pcp:5874 free_cma:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.969518] Node 0 
>> active_anon:17033436kB inactive_anon:10633368kB active_file:64kB 
>> inactive_file:3196kB unevictable:158204kB isolated(anon):0kB 
>> isolated(file):0kB mapped:2285988kB dirty:356kB writeback:948kB 
>> shmem:2325708kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:866304kB 
>> writeback_tmp:0kB kernel_stack:11520kB pagetables:105460kB 
>> sec_pagetables:47004kB all_unreclaimable? no
>> Jul  4 20:00:12 pppve1 kernel: [3375932.004977] Node 0 DMA 
>> free:11264kB boost:0kB min:12kB low:24kB high:36kB 
>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB 
>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB 
>> present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB 
>> local_pcp:0kB free_cma:0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.032646] lowmem_reserve[]: 0 
>> 1527 63844 63844 63844
>> Jul  4 20:00:12 pppve1 kernel: [3375932.038675] Node 0 DMA32 
>> free:252428kB boost:0kB min:1616kB low:3176kB high:4736kB 
>> reserved_highatomic:2048KB active_anon:310080kB 
>> inactive_anon:986436kB active_file:216kB inactive_file:0kB 
>> unevictable:0kB writepending:0kB present:1690624kB managed:1623508kB 
>> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.069110] lowmem_reserve[]: 0 0 
>> 62317 62317 62317
>> Jul  4 20:00:12 pppve1 kernel: [3375932.074979] Node 0 Normal 
>> free:814396kB boost:290356kB min:356304kB low:420116kB high:483928kB 
>> reserved_highatomic:346112KB active_anon:11258684kB 
>> inactive_anon:15111580kB active_file:0kB inactive_file:2316kB 
>> unevictable:158204kB writepending:1304kB present:65011712kB 
>> managed:63820796kB mlocked:155132kB bounce:0kB free_pcp:12728kB 
>> local_pcp:0kB free_cma:0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.109188] lowmem_reserve[]: 0 0 
>> 0 0 0
>> Jul  4 20:00:12 pppve1 kernel: [3375932.114119] Node 0 DMA: 0*4kB 
>> 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 
>> 1*2048kB (M) 2*4096kB (M) = 11264kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.127796] Node 0 DMA32: 
>> 5689*4kB (UMH) 1658*8kB (UMH) 381*16kB (UM) 114*32kB (UME) 97*64kB 
>> (UME) 123*128kB (UMEH) 87*256kB (MEH) 96*512kB (UMEH) 58*1024kB (UME) 
>> 5*2048kB (UME) 11*4096kB (ME) = 253828kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.148050] Node 0 Normal: 
>> 16080*4kB (UMEH) 36886*8kB (UMEH) 22890*16kB (UMEH) 4687*32kB (MEH) 
>> 159*64kB (UMEH) 10*128kB (UE) 0*256kB 0*512kB 0*1024kB 0*2048kB 
>> 0*4096kB = 887088kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.165899] Node 0 
>> hugepages_total=0 hugepages_free=0 hugepages_surp=0 
>> hugepages_size=1048576kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.175876] Node 0 
>> hugepages_total=0 hugepages_free=0 hugepages_surp=0 
>> hugepages_size=2048kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.185569] 586677 total 
>> pagecache pages
>> Jul  4 20:00:12 pppve1 kernel: [3375932.190737] 0 pages in swap cache
>> Jul  4 20:00:12 pppve1 kernel: [3375932.195285] Free swap  = 0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.199404] Total swap = 0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.203513] 16679583 pages RAM
>> Jul  4 20:00:12 pppve1 kernel: [3375932.207787] 0 pages 
>> HighMem/MovableOnly
>> Jul  4 20:00:12 pppve1 kernel: [3375932.212819] 314667 pages reserved
>> Jul  4 20:00:12 pppve1 kernel: [3375932.217321] 0 pages hwpoisoned
>> Jul  4 20:00:12 pppve1 kernel: [3375932.221525] Tasks state (memory 
>> values in pages):
>> Jul  4 20:00:12 pppve1 kernel: [3375932.227400] [  pid  ]   uid tgid 
>> total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents 
>> oom_score_adj name
>> Jul  4 20:00:12 pppve1 kernel: [3375932.239680] [   1959]   106 
>> 1959     1971      544       96      448         0 61440        
>> 0             0 rpcbind
>> Jul  4 20:00:12 pppve1 kernel: [3375932.251672] [   1982]   104 
>> 1982     2350      672      160      512         0 57344        
>> 0          -900 dbus-daemon
>> Jul  4 20:00:12 pppve1 kernel: [3375932.264020] [   1991]     0 
>> 1991     1767      275       83      192         0 57344        
>> 0             0 ksmtuned
>> Jul  4 20:00:12 pppve1 kernel: [3375932.276094] [   1995]     0 
>> 1995    69541      480       64      416         0 86016        
>> 0             0 pve-lxc-syscall
>> Jul  4 20:00:12 pppve1 kernel: [3375932.289686] [   2002]     0 
>> 2002     1330      384       32      352         0 53248        
>> 0             0 qmeventd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.301811] [   2003]     0 
>> 2003    55449      727      247      480         0 86016        
>> 0             0 rsyslogd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.313919] [   2004]     0 
>> 2004     3008      928      448      480         0 69632        
>> 0             0 smartd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.325834] [   2009]     0 
>> 2009     6386      992      224      768         0 77824        
>> 0             0 systemd-logind
>> Jul  4 20:00:12 pppve1 kernel: [3375932.339438] [   2010]     0 
>> 2010      584      256        0      256         0 40960        
>> 0         -1000 watchdog-mux
>> Jul  4 20:00:12 pppve1 kernel: [3375932.352936] [   2021]     0 
>> 2021    60174      928      256      672         0 90112        
>> 0             0 zed
>> Jul  4 20:00:12 pppve1 kernel: [3375932.364626] [   2136]     0 
>> 2136    75573      256       64      192         0 86016        
>> 0         -1000 lxcfs
>> Jul  4 20:00:12 pppve1 kernel: [3375932.376485] [   2397]     0 
>> 2397     2208      480       64      416         0 61440        
>> 0             0 lxc-monitord
>> Jul  4 20:00:12 pppve1 kernel: [3375932.389169] [   2421]     0 
>> 2421    40673      454       70      384         0 73728        
>> 0             0 apcupsd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.400685] [   2426]     0 
>> 2426     3338      428      172      256         0 69632        
>> 0             0 iscsid
>> Jul  4 20:00:12 pppve1 kernel: [3375932.412121] [   2427]     0 
>> 2427     3464     3343      431     2912         0 77824        
>> 0           -17 iscsid
>> Jul  4 20:00:12 pppve1 kernel: [3375932.423754] [   2433]     0 
>> 2433     3860     1792      320     1472         0 77824        
>> 0         -1000 sshd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.435208] [   2461]     0 
>> 2461   189627     2688     1344     1344         0 155648        
>> 0             0 dsm_ism_srvmgrd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.448290] [   2490]   113 
>> 2490     4721      750      142      608         0 61440        
>> 0             0 chronyd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.459988] [   2492]   113 
>> 2492     2639      502      118      384         0 61440        
>> 0             0 chronyd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.471684] [   2531]     0 
>> 2531     1469      448       32      416         0 49152        
>> 0             0 agetty
>> Jul  4 20:00:12 pppve1 kernel: [3375932.483269] [   2555]     0 
>> 2555   126545      673      244      429         0 147456        
>> 0             0 rrdcached
>> Jul  4 20:00:12 pppve1 kernel: [3375932.483275] [   2582]     0 
>> 2582   155008    15334     3093      864     11377 434176        
>> 0             0 pmxcfs
>> Jul  4 20:00:12 pppve1 kernel: [3375932.506653] [   2654]     0 
>> 2654    10667      614      134      480         0 77824        
>> 0             0 master
>> Jul  4 20:00:12 pppve1 kernel: [3375932.517986] [   2656]   107 
>> 2656    10812      704      160      544         0 73728        
>> 0             0 qmgr
>> Jul  4 20:00:12 pppve1 kernel: [3375932.529118] [   2661]     0 
>> 2661   139892    41669    28417     2980     10272 405504        
>> 0             0 corosync
>> Jul  4 20:00:12 pppve1 kernel: [3375932.540553] [   2662]     0 
>> 2662     1653      576       32      544         0 53248        
>> 0             0 cron
>> Jul  4 20:00:12 pppve1 kernel: [3375932.551657] [   2664]     0 
>> 2664     1621      480       96      384         0 57344        
>> 0             0 proxmox-firewal
>> Jul  4 20:00:12 pppve1 kernel: [3375932.564093] [   3164]     0 
>> 3164    83332    26227    25203      768       256 360448        
>> 0             0 pve-firewall
>> Jul  4 20:00:12 pppve1 kernel: [3375932.576192] [   3233]     0 
>> 3233    85947    28810    27242     1216       352 385024        
>> 0             0 pvestatd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.587638] [   3417]     0 
>> 3417    93674    36011    35531      480         0 438272        
>> 0             0 pvedaemon
>> Jul  4 20:00:12 pppve1 kernel: [3375932.599167] [   3421]     0 
>> 3421    95913    37068    35884     1120        64 454656        
>> 0             0 pvedaemon worke
>> Jul  4 20:00:12 pppve1 kernel: [3375932.611536] [   3424]     0 
>> 3424    96072    36972    35852     1088        32 454656        
>> 0             0 pvedaemon worke
>> Jul  4 20:00:12 pppve1 kernel: [3375932.623977] [   3426]     0 
>> 3426    96167    37068    35948     1056        64 458752        
>> 0             0 pvedaemon worke
>> Jul  4 20:00:12 pppve1 kernel: [3375932.636698] [   3558]     0 
>> 3558    90342    29540    28676      608       256 385024        
>> 0             0 pve-ha-crm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.648477] [   3948]    33 
>> 3948    94022    37705    35849     1856         0 471040        
>> 0             0 pveproxy
>> Jul  4 20:00:12 pppve1 kernel: [3375932.660083] [   3954]    33 
>> 3954    21688    14368    12736     1632         0 221184        
>> 0             0 spiceproxy
>> Jul  4 20:00:12 pppve1 kernel: [3375932.671862] [   3956]     0 
>> 3956    90222    29321    28521      544       256 397312        
>> 0             0 pve-ha-lrm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.683484] [   3994]     0 3994  
>> 1290140   706601   705993      608         0 6389760        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.694551] [   4088]     0 4088  
>> 1271416  1040767  1040223      544         0 8994816        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.705624] [   4160]     0 
>> 4160    89394    30149    29541      608         0 380928        
>> 0             0 pvescheduler
>> Jul  4 20:00:12 pppve1 kernel: [3375932.717864] [   4710]     0 
>> 4710     1375      480       32      448         0 57344        
>> 0             0 agetty
>> Jul  4 20:00:12 pppve1 kernel: [3375932.729183] [   5531]     0 
>> 5531   993913   567351   566647      704         0 5611520        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.740212] [   6368]     0 6368  
>> 5512483  4229046  4228342      704         0 34951168        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.751255] [   9796]     0 
>> 9796     1941      768       64      704         0 57344        
>> 0             0 lxc-start
>> Jul  4 20:00:12 pppve1 kernel: [3375932.762840] [   9808] 100000  
>> 9808     3875      160       32      128         0 77824        
>> 0             0 init
>> Jul  4 20:00:12 pppve1 kernel: [3375932.774063] [  11447] 100000 
>> 11447     9272      192       64      128         0 118784        
>> 0             0 rpcbind
>> Jul  4 20:00:12 pppve1 kernel: [3375932.785534] [  11620] 100000 
>> 11620    45718      240      112      128         0 126976        
>> 0             0 rsyslogd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.797241] [  11673] 100000 
>> 11673     4758      195       35      160         0 81920        
>> 0             0 atd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.808516] [  11748] 100000 
>> 11748     6878      228       36      192         0 98304        
>> 0             0 cron
>> Jul  4 20:00:12 pppve1 kernel: [3375932.819868] [  11759] 100102 
>> 11759    10533      257       65      192         0 122880        
>> 0             0 dbus-daemon
>> Jul  4 20:00:12 pppve1 kernel: [3375932.832328] [  11765] 100000 
>> 11765    13797      315      155      160         0 143360        
>> 0             0 sshd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.843547] [  11989] 100104 
>> 11989   565602    19744      288      160     19296 372736        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.855266] [  12169] 100104 
>> 12169   565938   537254      678      192    536384 4517888        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.866950] [  12170] 100104 
>> 12170   565859   199654      550      224    198880 4296704        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.878525] [  12171] 100104 
>> 12171   565859     4710      358      224      4128 241664        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.890252] [  12172] 100104 
>> 12172   565962     7654      518      192      6944 827392        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.901845] [  12173] 100104 
>> 12173    20982      742      518      224         0 200704        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375932.913421] [  13520] 100000 
>> 13520     9045      192      128       64         0 114688        
>> 0             0 master
>> Jul  4 20:00:13 pppve1 kernel: [3375932.924809] [  13536] 100100 
>> 13536     9601      320      128      192         0 126976        
>> 0             0 qmgr
>> Jul  4 20:00:13 pppve1 kernel: [3375932.936088] [  13547] 100000 
>> 13547     3168      192       32      160         0 73728        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375932.947424] [  13548] 100000 
>> 13548     3168      160       32      128         0 73728        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375932.958761] [1302486]     0 
>> 1302486     1941      768       96      672         0 53248        
>> 0             0 lxc-start
>> Jul  4 20:00:13 pppve1 kernel: [3375932.970490] [1302506] 100000 
>> 1302506     2115      128       32       96         0 65536        
>> 0             0 init
>> Jul  4 20:00:13 pppve1 kernel: [3375932.981999] [1302829] 100001 
>> 1302829     2081      128        0      128         0 61440        
>> 0             0 portmap
>> Jul  4 20:00:13 pppve1 kernel: [3375932.993763] [1302902] 100000 
>> 1302902    27413      160       64       96         0 122880        
>> 0             0 rsyslogd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.005719] [1302953] 100000 
>> 1302953   117996     1654     1366      227        61 450560        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.017459] [1302989] 100000 
>> 1302989     4736       97       33       64         0 81920        
>> 0             0 atd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.028905] [1303004] 100104 
>> 1303004     5843       64       32       32         0 94208        
>> 0             0 dbus-daemon
>> Jul  4 20:00:13 pppve1 kernel: [3375933.041272] [1303030] 100000 
>> 1303030    12322      334      110      224         0 139264        
>> 0             0 sshd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.052755] [1303048] 100000 
>> 1303048     5664       64       32       32         0 94208        
>> 0             0 cron
>> Jul  4 20:00:13 pppve1 kernel: [3375933.064220] [1303255] 100000 
>> 1303255     9322      224       96      128         0 118784        
>> 0             0 master
>> Jul  4 20:00:13 pppve1 kernel: [3375933.075896] [1303284] 100101 
>> 1303284     9878      352      128      224         0 122880        
>> 0             0 qmgr
>> Jul  4 20:00:13 pppve1 kernel: [3375933.087405] [1303285] 100000 
>> 1303285     1509       32        0       32         0 61440        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375933.099008] [1303286] 100000 
>> 1303286     1509       64        0       64         0 61440        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375933.110571] [1420994]    33 
>> 1420994    21749    13271    12759      512         0 204800        
>> 0             0 spiceproxy work
>> Jul  4 20:00:13 pppve1 kernel: [3375933.123378] [1421001]    33 
>> 1421001    94055    37044    35892     1152         0 434176        
>> 0             0 pveproxy worker
>> Jul  4 20:00:13 pppve1 kernel: [3375933.136284] [1421002]    33 
>> 1421002    94055    36980    35860     1120         0 434176        
>> 0             0 pveproxy worker
>> Jul  4 20:00:13 pppve1 kernel: [3375933.149173] [1421003]    33 
>> 1421003    94055    37044    35892     1152         0 434176        
>> 0             0 pveproxy worker
>> Jul  4 20:00:13 pppve1 kernel: [3375933.162040] [2316827]     0 
>> 2316827     6820     1088      224      864         0 69632        
>> 0         -1000 systemd-udevd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.174778] [2316923]     0 
>> 2316923    51282     2240      224     2016         0 438272        
>> 0          -250 systemd-journal
>> Jul  4 20:00:13 pppve1 kernel: [3375933.187768] [3148356]     0 
>> 3148356    32681    21120    19232     1888         0 249856        
>> 0             0 glpi-agent (tag
>> Jul  4 20:00:13 pppve1 kernel: [3375933.200481] [3053571]     0 
>> 3053571    19798      480       32      448         0 57344        
>> 0             0 pvefw-logger
>> Jul  4 20:00:13 pppve1 kernel: [3375933.212970] [3498513] 100033 
>> 3498513   119792     7207     2632      223      4352 516096        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.224713] [3498820] 100104 
>> 3498820   575918   235975     9351      160    226464 3424256        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.236579] [3500997] 100033 
>> 3500997   119889     7202     2594      192      4416 524288        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.248240] [3501657] 100104 
>> 3501657   571325   199025     6001      160    192864 2945024        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.260100] [3502514] 100033 
>> 3502514   119119     5907     2004      191      3712 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.271772] [3503679] 100104 
>> 3503679   575295   211508     6612      192    204704 2953216        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.283619] [3515234] 100033 
>> 3515234   119042     6568     1960      192      4416 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.295362] [3515420] 100104 
>> 3515420   569839    97579     4491      160     92928 2293760        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.307155] [3520282] 100033 
>> 3520282   119129     5416     2056      192      3168 495616        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.318923] [3520287] 100033 
>> 3520287   119015     5709     1894      167      3648 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.330805] [3520288] 100033 
>> 3520288   119876     5961     2729      224      3008 507904        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.342648] [3521057] 100104 
>> 3521057   573824    46069     8341      128     37600 1830912        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.354567] [3521067] 100104 
>> 3521067   574768    99734     7446       96     92192 2134016        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.366512] [3521301] 100104 
>> 3521301   569500   174722     4194      160    170368 2482176        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.378484] [3532810] 100033 
>> 3532810   118740     4127     1727      160      2240 479232        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.390140] [3532933] 100033 
>> 3532933   118971     5064     1864      160      3040 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.401854] [3534151] 100104 
>> 3534151   567344   168822     1686      160    166976 2408448        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.413852] [3535832] 100104 
>> 3535832   569005    41042     2578      128     38336 1150976        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.425919] [3550993] 100033 
>> 3550993   118029     1768     1544      224         0 425984        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.437868] [3560475]   107 
>> 3560475    10767      928      160      768         0 77824        
>> 0             0 pickup
>> Jul  4 20:00:13 pppve1 kernel: [3375933.449513] [3563017] 100101 
>> 3563017     9838      256       96      160         0 122880        
>> 0             0 pickup
>> Jul  4 20:00:13 pppve1 kernel: [3375933.461255] [3575085] 100100 
>> 3575085     9561      288      128      160         0 118784        
>> 0             0 pickup
>> Jul  4 20:00:13 pppve1 kernel: [3375933.473119] [3579986]     0 
>> 3579986     1367      384        0      384         0 49152        
>> 0             0 sleep
>> Jul  4 20:00:13 pppve1 kernel: [3375933.484646] [3579996] 100104 
>> 3579996   566249     5031      615      128      4288 450560        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.496645] [3580020]     0 
>> 3580020    91269    30310    29606      704         0 409600        
>> 0             0 pvescheduler
>> Jul  4 20:00:13 pppve1 kernel: [3375933.509585] [3580041]     0 
>> 3580041     5005     1920      640     1280         0 81920        
>> 0           100 systemd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.521297] [3580044]     0 
>> 3580044    42685     1538     1218      320         0 102400        
>> 0           100 (sd-pam)
>> Jul  4 20:00:13 pppve1 kernel: [3375933.533226] [3580125] 100104 
>> 3580125   566119     5607      583      704      4320 446464        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.545245] [3580193]     0 
>> 3580193     4403     2368      384     1984         0 81920        
>> 0             0 sshd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.556849] 
>> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=qemu.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/121.scope,task=kvm,pid=6368,uid=0
>> Jul  4 20:00:13 pppve1 kernel: [3375933.573133] Out of memory: Killed 
>> process 6368 (kvm) total-vm:22049932kB, anon-rss:16913368kB, 
>> file-rss:2944kB, shmem-rss:0kB, UID:0 pgtables:34132kB oom_score_adj:0
>> Jul  4 20:00:15 pppve1 kernel: [3375935.378441]  zd16: p1 p2 p3 < p5 
>> p6 >
>> Jul  4 20:00:16 pppve1 kernel: [3375936.735383] oom_reaper: reaped 
>> process 6368 (kvm), now anon-rss:0kB, file-rss:32kB, shmem-rss:0kB
>> Jul  4 20:01:11 pppve1 kernel: [3375991.767379] vmbr0: port 
>> 5(tap121i0) entered disabled state
>> Jul  4 20:01:11 pppve1 kernel: [3375991.778143] tap121i0 
>> (unregistering): left allmulticast mode
>> Jul  4 20:01:11 pppve1 kernel: [3375991.785976] vmbr0: port 
>> 5(tap121i0) entered disabled state
>> Jul  4 20:01:11 pppve1 kernel: [3375991.791555]  zd128: p1
>> Jul  4 20:01:13 pppve1 kernel: [3375993.594688]  zd176: p1 p2
>>
>>
>>> It makes little sense that OOM triggers on a 64GB host with just
>>> 24GB configured in VMs and, probably, less real usage. IMHO it's not
>>> the VMs that fill your memory up to the point of OOM, but some other
>>> process, the ZFS ARC, maybe even some memory leak. Maybe some
>>> process is producing severe memory fragmentation.
>> I can confirm the server was doing some heavy I/O (a backup), but
>> AFAIK nothing more.
>>
>>
>> Mandi! Roland
>>
>>> It's a little bit weird that OOM kicks in with VMs <32GB RAM when
>>> you have 64GB. Take a closer look at why this happens, i.e. why OOM
>>> thinks there is RAM pressure.
>> Indeed, the server was running:
>>   + vm 100, 2GB
>>   + vm 120, 4GB
>>   + vm 121, 16GB
>>   + vm 127, 4GB
>>   + lxc 124, 2GB
>>   + lxc 125, 4GB
>>
>> So exactly 32GB of RAM. But most of the VMs/LXCs barely reached half
>> of their allocated RAM...
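>>

The configured totals above do add up; a trivial check (GiB per guest,
as listed):

```python
# Sum of configured guest memory from the list above (GiB).
guests = {"vm 100": 2, "vm 120": 4, "vm 121": 16,
          "vm 127": 4, "lxc 124": 2, "lxc 125": 4}
print(sum(guests.values()))  # 32
```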
>>
>>
>>
>> Thanks.
>>

