[PVE-User] Problem with Centos 5.X virtio ethernet drivers and last PVE updates
Eneko Lacunza
elacunza at binovo.es
Tue Sep 22 11:01:07 CEST 2020
Hi Richard,
On 22/9/20 at 10:19, richard lucassen wrote:
> On Mon, 3 Aug 2020 13:54:54 +0200
> Eneko Lacunza via pve-user <pve-user at lists.proxmox.com> wrote:
>
>> As reported 10 days ago, we have found a e1000e driver hang recently,
>> after upgrading from PVE 5.4 to 6.2, in an otherwise stable server.
>>
>> It could be a driver issue and not a virtio network issue, but we
>> haven't seen another hang since the one reported.
> I just moved the images to a new proxmox 6.2.11 environment and
> the problem remains. An RTL8169 NIC works well
>
We had a new fence on 7 September on that cluster. I can't confirm that it
was an e1000e hang, but it is likely.
The cluster has 3 nodes; all 3 have integrated e1000e interfaces, and
we're seeing random down/up events on the Intel physical interfaces:
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2)
I219-LM (rev 31)
I also found a trace (not linked to a fence) in the logs of the first
and third nodes; it may not be related:
Sep 8 08:57:14 proxmox1 kernel: [35054.564849] ------------[ cut here
]------------
Sep 8 08:57:14 proxmox1 kernel: [35054.564856] NETDEV WATCHDOG:
enp0s31f6 (e1000e): transmit queue 0 timed out
Sep 8 08:57:14 proxmox1 kernel: [35054.564867] WARNING: CPU: 1 PID: 0
at net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
Sep 8 08:57:14 proxmox1 kernel: [35054.564868] Modules linked in:
rpcsec_gss_krb5 auth_rpcgss nfsv4 nfsv3 nfs_acl nfs lockd grace fscache
ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6
ip6table_filter ip6_tables iptable_raw ipt_REJECT nf_reject_ipv4 xt_mark xt_set
xt_physdev xt_addrtype xt_comment xt_multiport xt_conntrack nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp ip_set_hash_net ip_set
sctp iptable_filter bpfilter xfs softdog nfnetlink_log nfnetlink
intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel kvm irqbypass crct10dif_pclmul snd_hda_codec_hdmi
crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic
ledtrig_audio aesni_intel crypto_simd cryptd glue_helper mei_hdcp i915
drm_kms_helper snd_hda_intel snd_intel_dspcfg intel_cstate snd_hda_codec
snd_hda_core snd_hwdep snd_pcm snd_timer mei_me snd mei soundcore drm
i2c_algo_bit intel_pch_thermal intel_rapl_perf fb_sys_fops syscopyarea
sysfillrect sysimgblt ie31200_edac dell_wmi
Sep 8 08:57:14 proxmox1 kernel: [35054.564888] dell_smbios serio_raw
dcdbas pcspkr sparse_keymap intel_wmi_thunderbolt wmi_bmof
dell_wmi_descriptor mac_hid acpi_pad zfs(PO) zunicode(PO) zlua(PO)
zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser
rdma_cm
iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq
dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c e1000e
xhci_pci psmouse i2c_i801 xhci_hcd ahci tg3 libahci wmi video
Sep 8 08:57:14 proxmox1 kernel: [35054.564915] CPU: 1 PID: 0 Comm:
swapper/1 Tainted: P O 5.4.44-2-pve #1
Sep 8 08:57:14 proxmox1 kernel: [35054.564915] Hardware name: Dell Inc.
PowerEdge T30/07T4MC, BIOS 1.0.7 07/30/2017
Sep 8 08:57:14 proxmox1 kernel: [35054.564917] RIP:
0010:dev_watchdog+0x264/0x270
Sep 8 08:57:14 proxmox1 kernel: [35054.564918] Code: 48 85 c0 75 e6 eb
a0 4c 89 ef c6 05 81 1a eb 00 01 e8 80 b1 fa ff 89 d9 4c 89 ee 48 c7 c7
70 2f 63 bb 48 89 c2 e8 cd 7a 74 ff <0f> 0b eb 82 0f 1f 84 00 00 00
00 00 0f 1f 44 00 00 55 48 89 e5 41
Sep 8 08:57:14 proxmox1 kernel: [35054.564918] RSP:
0018:ffffb352c003ce58 EFLAGS: 00010282
Sep 8 08:57:14 proxmox1 kernel: [35054.564919] RAX: 0000000000000000
RBX: 0000000000000000 RCX: 0000000000000000
Sep 8 08:57:14 proxmox1 kernel: [35054.564920] RDX: ffff9b79bdaa7740
RSI: 00000000000000f6 RDI: 0000000000000300
Sep 8 08:57:14 proxmox1 kernel: [35054.564920] RBP: ffffb352c003ce88
R08: 00000000000003d9 R09: 0000000000000004
Sep 8 08:57:14 proxmox1 kernel: [35054.564921] R10: 0000000000000000
R11: 0000000000000001 R12: 0000000000000001
Sep 8 08:57:14 proxmox1 kernel: [35054.564921] R13: ffff9b79ac3f0000
R14: ffff9b79ac3f0480 R15: ffff9b79accd0080
Sep 8 08:57:14 proxmox1 kernel: [35054.564922] FS:
0000000000000000(0000) GS:ffff9b79bda80000(0000) knlGS:0000000000000000
Sep 8 08:57:14 proxmox1 kernel: [35054.564922] CS: 0010 DS: 0000 ES:
0000 CR0: 0000000080050033
Sep 8 08:57:14 proxmox1 kernel: [35054.564923] CR2: 00007fff22cdde9c
CR3: 0000000709c0a004 CR4: 00000000003626e0
Sep 8 08:57:14 proxmox1 kernel: [35054.564923] DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
Sep 8 08:57:14 proxmox1 kernel: [35054.564924] DR3: 0000000000000000
DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 8 08:57:14 proxmox1 kernel: [35054.564924] Call Trace:
Sep 8 08:57:14 proxmox1 kernel: [35054.564925] <IRQ>
Sep 8 08:57:14 proxmox1 kernel: [35054.564928] ?
pfifo_fast_enqueue+0x160/0x160
Sep 8 08:57:14 proxmox1 kernel: [35054.564930] call_timer_fn+0x32/0x130
Sep 8 08:57:14 proxmox1 kernel: [35054.564931]
run_timer_softirq+0x1a5/0x430
Sep 8 08:57:14 proxmox1 kernel: [35054.564933] ? ktime_get+0x3c/0xa0
Sep 8 08:57:14 proxmox1 kernel: [35054.564935] ?
lapic_next_deadline+0x26/0x30
Sep 8 08:57:14 proxmox1 kernel: [35054.564936] ?
clockevents_program_event+0x93/0xf0
Sep 8 08:57:14 proxmox1 kernel: [35054.564938] __do_softirq+0xdc/0x2d4
Sep 8 08:57:14 proxmox1 kernel: [35054.564940] irq_exit+0xa9/0xb0
Sep 8 08:57:14 proxmox1 kernel: [35054.564941]
smp_apic_timer_interrupt+0x79/0x130
Sep 8 08:57:14 proxmox1 kernel: [35054.564942]
apic_timer_interrupt+0xf/0x20
Sep 8 08:57:14 proxmox1 kernel: [35054.564943] </IRQ>
Sep 8 08:57:14 proxmox1 kernel: [35054.564945] RIP:
0010:cpuidle_enter_state+0xbd/0x450
Sep 8 08:57:14 proxmox1 kernel: [35054.564946] Code: ff e8 a7 b4 84 ff
80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 63 03 00 00 31 ff
e8 ca 22 8b ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8d 02 00 00 49 63
cd 48 8b 75 d0 48 2b 75 c8 48 8d
Sep 8 08:57:14 proxmox1 kernel: [35054.564946] RSP:
0018:ffffb352c00c3e48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Sep 8 08:57:14 proxmox1 kernel: [35054.564947] RAX: ffff9b79bdaaad40
RBX: ffffffffbb957a00 RCX: 000000000000001f
Sep 8 08:57:14 proxmox1 kernel: [35054.564948] RDX: 00001fe1c6e2a49d
RSI: 0000000026a5b845 RDI: 0000000000000000
Sep 8 08:57:14 proxmox1 kernel: [35054.564948] RBP: ffffb352c00c3e88
R08: 0000000000000002 R09: 000000000002a5c0
Sep 8 08:57:14 proxmox1 kernel: [35054.564949] R10: 000069f45edea306
R11: ffff9b79bdaa99e0 R12: ffff9b79bdab6600
Sep 8 08:57:14 proxmox1 kernel: [35054.564949] R13: 0000000000000006
R14: ffffffffbb957c58 R15: ffffffffbb957c40
Sep 8 08:57:14 proxmox1 kernel: [35054.564951] ?
cpuidle_enter_state+0x99/0x450
Sep 8 08:57:14 proxmox1 kernel: [35054.564952] cpuidle_enter+0x2e/0x40
Sep 8 08:57:14 proxmox1 kernel: [35054.564954] call_cpuidle+0x23/0x40
Sep 8 08:57:14 proxmox1 kernel: [35054.564955] do_idle+0x22c/0x270
Sep 8 08:57:14 proxmox1 kernel: [35054.564957] cpu_startup_entry+0x1d/0x20
Sep 8 08:57:14 proxmox1 kernel: [35054.564958] start_secondary+0x166/0x1c0
Sep 8 08:57:14 proxmox1 kernel: [35054.564960]
secondary_startup_64+0xa4/0xb0
Sep 8 08:57:14 proxmox1 kernel: [35054.564961] ---[ end trace
3a481687c9259238 ]---
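
For anyone wanting to check whether the same watchdog timeout shows up on
other nodes, here is a minimal sketch in Python that pulls these events out
of a kernel log. It assumes only the message format shown in the trace
above; the function name find_watchdog_timeouts and the regex are
illustrative, not part of any PVE or kernel tooling.

```python
import re

# Matches the watchdog message as it appears in the trace above, e.g.
# "NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out".
# Assumption: other kernel versions may word this message differently.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+\((?P<driver>[^)]+)\):\s+"
    r"transmit queue\s+(?P<queue>\d+)\s+timed out"
)


def find_watchdog_timeouts(log_text):
    """Return a list of (interface, driver, queue) tuples, one per timeout."""
    # Logs pasted into mail are often wrapped mid-message, so collapse all
    # whitespace (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m["iface"], m["driver"], int(m["queue"]))
        for m in WATCHDOG_RE.finditer(flat)
    ]
```

It could be fed the contents of a kernel log file (for example the output
of "journalctl -k" saved to a file) and the results counted per interface
to see which node is affected most often.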
Cheers
--
Eneko Lacunza | Tel. 943 569 206
| Email elacunza at binovo.es
Director Técnico | Site. https://www.binovo.es
BINOVO IT HUMAN PROJECT S.L | Dir. Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun