[PVE-User] Kernel Memory Leak on PVE6?
Chris Hofstaedtler | Deduktiva
chris.hofstaedtler at deduktiva.com
Fri Sep 20 14:31:17 CEST 2019
Hi,
I'm seeing a very interesting problem on PVE6: one of our machines
appears to leak kernel memory over time, up to the point where only
a reboot helps. Shutting down all KVM VMs does not release this
memory.
I'll attach some information below, because I just couldn't figure
out what this memory is used for: one set captured before shutting
down the VMs, and one after. I had to reboot the PVE host now, but I
guess in a few days it will be at least noticeable again.
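To watch the growth between reboots, the slab counters in /proc/meminfo could be sampled periodically. The following is only a minimal sketch, not something that was run on this host: the function names are made up for illustration, and it assumes the standard Slab, SReclaimable and SUnreclaim fields are present.

```python
def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields.

    Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_summary(fields):
    """Pick the memory and slab counters and convert them from kB to MiB."""
    keys = ("MemTotal", "MemAvailable", "Slab", "SReclaimable", "SUnreclaim")
    return {key: fields[key] // 1024 for key in keys if key in fields}


def sample(path="/proc/meminfo"):
    """Read the given meminfo file once and return the slab summary."""
    with open(path) as handle:
        return slab_summary(parse_meminfo(handle.read()))
```

Calling sample() from a cron job or a simple loop and logging the result with a timestamp would show whether SUnreclaim is the part that keeps growing.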
This machine has the same (except CPU) hardware as the box next to
it; however this one was freshly installed with PVE6, the other one
is an upgrade from PVE5 and doesn't exhibit this problem. It's quite
puzzling because I haven't seen this symptom at any of our customer
installations.
Here are some graphs showing the memory consumption over time:
http://zeha.at/~ch/T/20190920-pve6_meminfo_0.png
http://zeha.at/~ch/T/20190920-pve6_meminfo_1.png
Looking forward to any debug help, suggestions, ...
Chris
** Almost out of memory, before VM shutdown: **
top - 10:24:19 up 22 days, 22:29, 1 user, load average: 1.85, 1.57, 1.32
Tasks: 530 total, 1 running, 529 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.8 us, 0.4 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 80413.1 total, 509.9 free, 70879.7 used, 9023.5 buff/cache
MiB Swap: 20480.0 total, 6516.6 free, 13963.4 used. 8699.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3183 root 20 0 10.6g 6.0g 2960 S 8.7 7.6 5861:52 /usr/bin/kvm -id 103 -name puppet -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+
3349 root 20 0 9266032 4.3g 2972 S 6.8 5.4 3834:41 /usr/bin/kvm -id 2017 -name go-test-srv01 -chardev socket,id=qmp,path=/var/run/qemu-server/2017.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=+
3068 root 20 0 5060928 3.7g 2900 S 6.8 4.7 3110:01 /usr/bin/kvm -id 101 -name backup -chardev socket,id=qmp,path=/var/run/qemu-server/101.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event+
3399 root 20 0 5094772 2.3g 2944 S 50.5 2.9 10780:07 /usr/bin/kvm -id 3002 -name monitor01 -chardev socket,id=qmp,path=/var/run/qemu-server/3002.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-+
3254 root 20 0 32.8g 1.9g 3040 S 1.0 2.4 490:39.29 /usr/bin/kvm -id 2005 -name debbuild -chardev socket,id=qmp,path=/var/run/qemu-server/2005.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-e+
2994 root 20 0 2656268 658428 2980 S 9.7 0.8 2895:15 /usr/bin/kvm -id 100 -name pbx -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+
2927 root 20 0 2664232 479372 2944 S 6.8 0.6 2343:43 /usr/bin/kvm -id 102 -name ns1 -chardev socket,id=qmp,path=/var/run/qemu-server/102.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,pa+
2417 root rt 0 606912 211336 51444 S 1.9 0.3 613:27.87 /usr/sbin/corosync -f
2023020 root 20 0 246556 98020 97044 S 0.0 0.1 15:47.80 /lib/systemd/systemd-journald
1806 root 20 0 967944 32724 23612 S 0.0 0.0 53:49.62 /usr/bin/pmxcfs
2801 root 20 0 314488 32428 6464 S 0.0 0.0 322:58.23 pvestatd +
3771741 root 20 0 150776 31728 3700 S 0.0 0.0 0:12.81 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize
2799 root 20 0 316056 27452 5656 S 0.0 0.0 95:49.25 pve-firewall +
2909 root 20 0 325248 12684 5268 S 1.0 0.0 7:03.91 pve-ha-lrm +
868033 ch 20 0 21660 9104 7280 S 0.0 0.0 0:00.12 /lib/systemd/systemd --user
868009 root 20 0 16912 7988 6856 S 0.0 0.0 0:00.03 sshd: ch [priv]
1 root 20 0 171820 7640 5032 S 0.0 0.0 19:58.80 /lib/systemd/systemd --system --deserialize 37
2876 root 20 0 325544 7124 4988 S 0.0 0.0 4:18.16 pve-ha-crm +
1654 Debian-+ 20 0 40488 7096 2864 S 0.0 0.0 77:37.18 /usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf -f -p /run/snmpd.pid
868045 ch 20 0 10240 5404 3996 S 0.0 0.0 0:00.11 -zsh
868044 ch 20 0 16912 4636 3492 S 0.0 0.0 0:00.02 sshd: ch@pts/0
1644 root 20 0 29608 4520 3496 S 0.0 0.0 4:59.62 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
868336 root 20 0 7716 4372 3092 S 0.0 0.0 0:00.03 -bash
1761096 root 20 0 351564 4180 3336 S 0.0 0.0 1:12.83 pvedaemon worker +
1776171 root 20 0 351696 4076 3352 S 0.0 0.0 1:18.27 pvedaemon worker +
868370 root 20 0 11680 4016 2964 R 2.9 0.0 0:00.68 top
1780591 root 20 0 351696 4008 3248 S 0.0 0.0 1:11.73 pvedaemon worker +
1086 root 20 0 19540 3984 3720 S 0.0 0.0 3:11.21 /lib/systemd/systemd-logind
868335 root 20 0 10156 3788 3364 S 0.0 0.0 0:00.01 sudo -i
2899 www-data 20 0 121256 3412 3080 S 0.0 0.0 0:33.99 spiceproxy +
2000791 www-data 20 0 344932 3412 2604 S 0.0 0.0 1:16.39 pveproxy worker +
2000792 www-data 20 0 344932 3348 2604 S 0.0 0.0 1:07.07 pveproxy worker +
1251 root 20 0 225816 3296 2424 S 0.0 0.0 9:47.44 /usr/sbin/rsyslogd -n -iNONE
1258 message+ 20 0 9212 3268 2820 S 0.0 0.0 6:41.36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root@vn03:~# uname -a
Linux vn03 5.0.21-1-pve #1 SMP PVE 5.0.21-1 (Tue, 20 Aug 2019 17:16:32 +0200) x86_64 GNU/Linux
root@vn03:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          80413       70877         515         101        9019        8708
Swap:         20479       13963        6516
root@vn03:~# dpkg -l pve\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=======================-============-============-======================================================
ii pve-cluster 6.0-5 amd64 Cluster Infrastructure for Proxmox Virtual Environment
ii pve-container 3.0-5 all Proxmox VE Container management tool
ii pve-docs 6.0-4 all Proxmox VE Documentation
ii pve-edk2-firmware 2.20190614-1 all edk2 based firmware modules for virtual machines
ii pve-firewall 4.0-7 amd64 Proxmox VE Firewall
ii pve-firmware 3.0-2 all Binary firmware code for the pve-kernel
ii pve-ha-manager 3.0-2 amd64 Proxmox VE HA Manager
ii pve-i18n 2.0-2 all Internationalization support for Proxmox VE
un pve-kernel <none> <none> (no description available)
ii pve-kernel-5.0 6.0-7 all Latest Proxmox VE Kernel Image
ii pve-kernel-5.0.15-1-pve 5.0.15-1 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-5.0.18-1-pve 5.0.18-3 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-5.0.21-1-pve 5.0.21-1 amd64 The Proxmox PVE Kernel Image
ii pve-kernel-helper 6.0-7 all Function for various kernel maintenance tasks.
un pve-kvm <none> <none> (no description available)
ii pve-manager 6.0-6 amd64 Proxmox Virtual Environment Management Tools
ii pve-qemu-kvm 4.0.0-5 amd64 Full virtualization on x86 hardware
un pve-qemu-kvm-2.6.18 <none> <none> (no description available)
ii pve-xtermjs 3.13.2-1 all HTML/JS Shell client
root@vn03:~# slabtop -o | head -50
Active / Total Objects (% used) : 205425461 / 212231433 (96.8%)
Active / Total Slabs (% used) : 4949759 / 4949759 (100.0%)
Active / Total Caches (% used) : 114 / 161 (70.8%)
Active / Total Size (% used) : 60112896.56K / 60714678.54K (99.0%)
Minimum / Average / Maximum Object : 0.01K / 0.29K / 16.62K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
43583592 43542487 99% 0.20K 1117528 39 8940224K vm_area_struct
26520256 26518592 99% 0.06K 414379 64 1657516K anon_vma_chain
16788000 16434450 97% 0.25K 524625 32 4197000K filp
13079680 13078464 99% 0.03K 102185 128 408740K kmalloc-32
11544320 5261058 45% 0.06K 180380 64 721520K dmaengine-unmap-2
10128740 10127452 99% 0.09K 220190 46 880760K anon_vma
9602484 9602484 100% 0.04K 94142 102 376568K pde_opener
7442736 7442572 99% 0.19K 177208 42 1417664K cred_jar
7213200 7209695 99% 0.13K 240440 30 961760K kernfs_node_cache
6023850 5992341 99% 0.19K 143425 42 1147400K dentry
5704350 5704350 100% 0.08K 111850 51 447400K task_delay_info
5054066 5054066 100% 0.69K 109871 46 3515872K files_cache
4664512 4664481 99% 0.12K 145766 32 583064K pid
4591440 4591440 100% 1.06K 153048 30 4897536K mm_struct
4207445 4203908 99% 0.58K 76499 55 2447968K inode_cache
4104480 4104291 99% 0.62K 80480 51 2575360K sock_inode_cache
3901440 3900588 99% 0.06K 60960 64 243840K kmalloc-64
3856230 3856160 99% 1.06K 128541 30 4113312K signal_cache
3423826 3417982 99% 0.65K 69874 49 2235968K proc_inode_cache
3139584 3138382 99% 0.01K 6132 512 24528K kmalloc-8
2983344 2983255 99% 0.19K 71032 42 568256K kmalloc-192
2426976 2426413 99% 1.00K 75843 32 2426976K kmalloc-1k
1939854 1931355 99% 0.09K 46187 42 184748K kmalloc-96
1649895 1649895 100% 2.06K 109993 15 3519776K sighand_cache
1280544 1280544 100% 1.00K 40017 32 1280544K UNIX
1052928 1050819 99% 0.50K 32904 32 526464K kmalloc-512
1029792 1029312 99% 0.25K 32181 32 257448K skbuff_head_cache
940624 940559 99% 4.00K 117578 8 3762496K kmalloc-4k
799895 787069 98% 5.75K 159979 5 5119328K task_struct
735696 724643 98% 0.10K 18864 39 75456K buffer_head
525504 525378 99% 2.00K 32844 16 1051008K kmalloc-2k
433024 426780 98% 0.06K 6766 64 27064K kmem_cache_node
310710 301758 97% 1.05K 10357 30 331424K ext4_inode_cache
292340 290078 99% 0.68K 6220 47 199040K shmem_inode_cache
215250 214814 99% 0.38K 5125 42 82000K kmem_cache
212296 196761 92% 0.57K 7582 28 121312K radix_tree_node
158464 158464 100% 0.02K 619 256 2476K kmalloc-16
149925 149925 100% 1.25K 5997 25 191904K UDPv6
71424 71140 99% 0.12K 2232 32 8928K kmalloc-128
70020 70020 100% 0.16K 1376 51 11008K kvm_mmu_page_header
40032 40009 99% 0.25K 1251 32 10008K kmalloc-256
34944 33823 96% 0.09K 832 42 3328K kmalloc-rcl-96
34816 32567 93% 0.06K 544 64 2176K kmalloc-rcl-64
root@vn03:~# pct list
root@vn03:~# qm list
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
       100 pbx                  running    2048              16.00 2994
       101 backup               running    4096              32.00 3068
       102 ns1                  running    2048              32.00 2927
       103 puppet               running    10240             16.00 3183
      2005 debbuild             running    32768             40.00 3254
      2017 go-test-srv01        running    8192              20.00 3349
      3002 monitor01            running    4096              32.00 3399
      5001 salsa-runner-01      stopped    16384             32.00 0
      6001 deduktiva-runner-01  stopped    32768             32.00 0
      6901 mac                  stopped    4096               0.25 0
root@vn03:~# sysctl -a | grep hugepages
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
** After shutdown of all VMs: **
top - 10:39:56 up 22 days, 22:44, 2 users, load average: 0.83, 1.84, 1.88
Tasks: 491 total, 1 running, 490 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 80413.1 total, 18276.4 free, 52704.9 used, 9431.8 buff/cache
MiB Swap: 20480.0 total, 19393.6 free, 1086.4 used. 26801.1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2417 root rt 0 606908 211332 51444 S 1.0 0.3 613:46.50 /usr/sbin/corosync -f
2878 www-data 20 0 344800 133424 21784 S 0.0 0.2 0:36.09 pveproxy +
883317 www-data 20 0 361776 133084 11056 S 0.0 0.2 0:01.04 pveproxy worker +
2836 root 20 0 343228 132060 21764 S 0.0 0.2 0:38.88 pvedaemon +
883319 www-data 20 0 360688 130992 11148 S 1.0 0.2 0:01.26 pveproxy worker +
883318 www-data 20 0 358056 128864 11148 S 0.0 0.2 0:01.75 pveproxy worker +
883166 root 20 0 351912 121884 10220 S 0.0 0.1 0:00.96 pvedaemon worker +
883165 root 20 0 351848 121584 9952 S 0.0 0.1 0:00.40 pvedaemon worker +
883164 root 20 0 351712 121560 10060 S 0.0 0.1 0:00.65 pvedaemon worker +
2801 root 20 0 307252 92952 20996 S 0.0 0.1 323:07.31 pvestatd +
2023020 root 20 0 267408 90508 89344 S 0.0 0.1 15:48.85 /lib/systemd/systemd-journald
2899 www-data 20 0 121260 59804 12212 S 0.0 0.1 0:34.77 spiceproxy +
883544 www-data 20 0 121500 51260 3448 S 0.0 0.1 0:00.05 spiceproxy worker +
876236 root 20 0 524564 50188 37612 S 0.0 0.1 0:01.90 /usr/bin/pmxcfs
3771741 root 20 0 150776 30880 3264 S 0.0 0.0 0:12.86 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent --no-daemonize
2799 root 20 0 316112 28352 5840 S 0.0 0.0 95:51.91 pve-firewall +
2909 root 20 0 325212 14196 5404 S 0.0 0.0 7:04.14 pve-ha-lrm +
2876 root 20 0 325564 9600 5224 S 0.0 0.0 4:18.33 pve-ha-crm +
868033 ch 20 0 21660 8844 7020 S 0.0 0.0 0:00.14 /lib/systemd/systemd --user
root@vn03:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          80413       52700       18281         115        9431       26805
Swap:         20479        1086       19393
root@vn03:~# slabtop -o | head -50
Active / Total Objects (% used) : 199865696 / 200976971 (99.4%)
Active / Total Slabs (% used) : 4771440 / 4771440 (100.0%)
Active / Total Caches (% used) : 114 / 161 (70.8%)
Active / Total Size (% used) : 59688763.91K / 59945034.02K (99.6%)
Minimum / Average / Maximum Object : 0.01K / 0.30K / 16.62K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
43540380 43499279 99% 0.20K 1116420 39 8931360K vm_area_struct
26459776 26457217 99% 0.06K 413434 64 1653736K anon_vma_chain
16782720 16429406 97% 0.25K 524460 32 4195680K filp
13075712 13074728 99% 0.03K 102154 128 408616K kmalloc-32
10104728 10103625 99% 0.09K 219668 46 878672K anon_vma
9599628 9599628 100% 0.04K 94114 102 376456K pde_opener
7442106 7442024 99% 0.19K 177193 42 1417544K cred_jar
7211280 7207550 99% 0.13K 240376 30 961504K kernfs_node_cache
5999322 5970370 99% 0.19K 142841 42 1142728K dentry
5691447 5691447 100% 0.08K 111597 51 446388K task_delay_info
5052594 5052594 100% 0.69K 109839 46 3514848K files_cache
4657408 4657315 99% 0.12K 145544 32 582176K pid
4590750 4590721 99% 1.06K 153025 30 4896800K mm_struct
4206400 4202839 99% 0.58K 76480 55 2447360K inode_cache
4091424 4091235 99% 0.62K 80224 51 2567168K sock_inode_cache
3903104 3901440 99% 0.06K 60986 64 243944K kmalloc-64
3855600 3855530 99% 1.06K 128520 30 4112640K signal_cache
3416133 3410170 99% 0.65K 69717 49 2230944K proc_inode_cache
3124224 3123017 99% 0.01K 6102 512 24408K kmalloc-8
2982840 2982826 99% 0.19K 71020 42 568160K kmalloc-192
2425760 2424977 99% 1.00K 75805 32 2425760K kmalloc-1k
1940694 1932266 99% 0.09K 46207 42 184828K kmalloc-96
1649415 1649346 99% 2.06K 109961 15 3518752K sighand_cache
1279520 1279520 100% 1.00K 39985 32 1279520K UNIX
1043392 1040142 99% 0.50K 32606 32 521696K kmalloc-512
1021152 1020672 99% 0.25K 31911 32 255288K skbuff_head_cache
938880 938777 99% 4.00K 117360 8 3755520K kmalloc-4k
797715 784886 98% 5.75K 159543 5 5105376K task_struct
713388 699031 97% 0.10K 18292 39 73168K buffer_head
643008 73139 11% 0.06K 10047 64 40188K dmaengine-unmap-2
525520 525326 99% 2.00K 32845 16 1051040K kmalloc-2k
432768 426806 98% 0.06K 6762 64 27048K kmem_cache_node
308100 298326 96% 1.05K 10270 30 328640K ext4_inode_cache
292387 289915 99% 0.68K 6221 47 199072K shmem_inode_cache
215250 214971 99% 0.38K 5125 42 82000K kmem_cache
212380 180327 84% 0.57K 7585 28 121360K radix_tree_node
157952 157952 100% 0.02K 617 256 2468K kmalloc-16
150150 150150 100% 1.25K 6006 25 192192K UDPv6
71008 70660 99% 0.12K 2219 32 8876K kmalloc-128
40064 40056 99% 0.25K 1252 32 10016K kmalloc-256
34986 34259 97% 0.09K 833 42 3332K kmalloc-rcl-96
34368 32733 95% 0.06K 537 64 2148K kmalloc-rcl-64
33660 33300 98% 0.05K 396 85 1584K ftrace_event_field
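The two slabtop captures above can be compared mechanically. The total slab size only drops from 60714678.54K to 59945034.02K after the VM shutdown, so a per-cache diff may help show which caches do not shrink. This is a rough sketch under assumptions: the helper names are hypothetical, and it assumes every data row of `slabtop -o` has exactly eight whitespace-separated columns with the cache size in the seventh, as in the output above.

```python
def parse_slabtop(text):
    """Map cache name -> cache size in K from `slabtop -o` output."""
    caches = {}
    for line in text.splitlines():
        cols = line.split()
        # Data rows: OBJS ACTIVE USE OBJ-SIZE SLABS OBJ/SLAB CACHE-SIZE NAME.
        # Header, summary and prompt lines do not match this shape.
        if len(cols) != 8 or not cols[0].isdigit():
            continue
        size = cols[6]
        if not (size.endswith("K") and size[:-1].isdigit()):
            continue
        caches[cols[7]] = int(size[:-1])
    return caches


def top_caches(caches, n=5):
    """Return the n largest caches as (name, size in K), biggest first."""
    return sorted(caches.items(), key=lambda item: item[1], reverse=True)[:n]


def diff_caches(before, after):
    """Size change in K (after minus before) for caches present in both."""
    return {name: after[name] - before[name]
            for name in before if name in after}
```

Feeding the two captures through parse_slabtop() and then diff_caches() gives the change per cache; for example, vm_area_struct changes by only -8864K (8940224K to 8931360K), while dmaengine-unmap-2 drops by 681332K.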
typical VM config:
balloon: 0
bootdisk: virtio0
cores: 2
cpu: Haswell-noTSX
ide2: none,media=cdrom
memory: 4096
name: backup
net0: virtio=52:54:00:b7:e0:ba,bridge=vmbr100
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=39d362a5-6bae-41b7-9803-b76279e2280f
sockets: 1
virtio0: datastore:vm-101-disk-1,cache=writeback,size=32G
virtio1: datastore:vm-101-disk-2,cache=writeback,size=100G
--
Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien)
www.deduktiva.com / +43 1 353 1707