[PVE-User] Problem with Centos 5.X virtio ethernet drivers and last PVE updates

Eneko Lacunza elacunza at binovo.es
Tue Sep 22 11:01:07 CEST 2020


Hi Richard,

El 22/9/20 a las 10:19, richard lucassen escribió:
> On Mon, 3 Aug 2020 13:54:54 +0200
> Eneko Lacunza via pve-user <pve-user at lists.proxmox.com> wrote:
>
>> As reported 10 days ago, we have found a e1000e driver hang recently,
>> after upgrading from PVE 5.4 to 6.2, in an otherwise stable server.
>>
>> It could be a driver issue and not a virtio network issue, but we
>> haven't seen another hang since the one reported.
> [note] I just moved the images to a new proxmox 6.2.11 environment and
> the problem remains. An RTL8169 NIC works well
>
We had another fence on that cluster on 7 September. I can't confirm it
was an e1000e hang, but it is likely.

The cluster has 3 nodes; all 3 have integrated e1000e interfaces, and
we're seeing random down/up events on the Intel physical interfaces:

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)

I also found a trace (not linked to a fence) in the logs of the first
and third nodes; it may not be related:

Sep  8 08:57:14 proxmox1 kernel: [35054.564849] ------------[ cut here 
]------------
Sep  8 08:57:14 proxmox1 kernel: [35054.564856] NETDEV WATCHDOG: 
enp0s31f6 (e1000e): transmit queue 0 timed out
Sep  8 08:57:14 proxmox1 kernel: [35054.564867] WARNING: CPU: 1 PID: 0 
at net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
Sep  8 08:57:14 proxmox1 kernel: [35054.564868] Modules linked in:
rpcsec_gss_krb5 auth_rpcgss nfsv4 nfsv3 nfs_acl nfs lockd grace fscache
ebtable_filter ebtables ip6table_raw ip6t_REJECT nf_reject_ipv6
ip6table_filter ip6_tables iptable_raw ipt_REJECT nf_reject_ipv4 xt_mark
xt_set xt_physdev xt_addrtype xt_comment xt_multiport xt_conntrack
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp ip_set_hash_net
ip_set sctp iptable_filter bpfilter xfs softdog nfnetlink_log nfnetlink
intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel kvm irqbypass crct10dif_pclmul snd_hda_codec_hdmi
crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek
snd_hda_codec_generic ledtrig_audio aesni_intel crypto_simd cryptd
glue_helper mei_hdcp i915 drm_kms_helper snd_hda_intel snd_intel_dspcfg
intel_cstate snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer
mei_me snd mei soundcore drm i2c_algo_bit intel_pch_thermal
intel_rapl_perf fb_sys_fops syscopyarea sysfillrect sysimgblt
ie31200_edac dell_wmi
Sep  8 08:57:14 proxmox1 kernel: [35054.564888]  dell_smbios serio_raw
dcdbas pcspkr sparse_keymap intel_wmi_thunderbolt wmi_bmof
dell_wmi_descriptor mac_hid acpi_pad zfs(PO) zunicode(PO) zlua(PO)
zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap
ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs xor
zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison
dm_bufio libcrc32c e1000e xhci_pci psmouse i2c_i801 xhci_hcd ahci tg3
libahci wmi video
Sep  8 08:57:14 proxmox1 kernel: [35054.564915] CPU: 1 PID: 0 Comm: 
swapper/1 Tainted: P           O      5.4.44-2-pve #1
Sep  8 08:57:14 proxmox1 kernel: [35054.564915] Hardware name: Dell Inc. 
PowerEdge T30/07T4MC, BIOS 1.0.7 07/30/2017
Sep  8 08:57:14 proxmox1 kernel: [35054.564917] RIP: 
0010:dev_watchdog+0x264/0x270
Sep  8 08:57:14 proxmox1 kernel: [35054.564918] Code: 48 85 c0 75 e6 eb 
a0 4c 89 ef c6 05 81 1a eb 00 01 e8 80 b1 fa ff 89 d9 4c 89 ee 48 c7 c7 
70 2f 63 bb 48 89 c2 e8 cd 7a 74 ff <0f> 0b eb 82 0f 1f 84 00 00 00
  00 00 0f 1f 44 00 00 55 48 89 e5 41
Sep  8 08:57:14 proxmox1 kernel: [35054.564918] RSP: 
0018:ffffb352c003ce58 EFLAGS: 00010282
Sep  8 08:57:14 proxmox1 kernel: [35054.564919] RAX: 0000000000000000 
RBX: 0000000000000000 RCX: 0000000000000000
Sep  8 08:57:14 proxmox1 kernel: [35054.564920] RDX: ffff9b79bdaa7740 
RSI: 00000000000000f6 RDI: 0000000000000300
Sep  8 08:57:14 proxmox1 kernel: [35054.564920] RBP: ffffb352c003ce88 
R08: 00000000000003d9 R09: 0000000000000004
Sep  8 08:57:14 proxmox1 kernel: [35054.564921] R10: 0000000000000000 
R11: 0000000000000001 R12: 0000000000000001
Sep  8 08:57:14 proxmox1 kernel: [35054.564921] R13: ffff9b79ac3f0000 
R14: ffff9b79ac3f0480 R15: ffff9b79accd0080
Sep  8 08:57:14 proxmox1 kernel: [35054.564922] FS: 
0000000000000000(0000) GS:ffff9b79bda80000(0000) knlGS:0000000000000000
Sep  8 08:57:14 proxmox1 kernel: [35054.564922] CS:  0010 DS: 0000 ES: 
0000 CR0: 0000000080050033
Sep  8 08:57:14 proxmox1 kernel: [35054.564923] CR2: 00007fff22cdde9c 
CR3: 0000000709c0a004 CR4: 00000000003626e0
Sep  8 08:57:14 proxmox1 kernel: [35054.564923] DR0: 0000000000000000 
DR1: 0000000000000000 DR2: 0000000000000000
Sep  8 08:57:14 proxmox1 kernel: [35054.564924] DR3: 0000000000000000 
DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep  8 08:57:14 proxmox1 kernel: [35054.564924] Call Trace:
Sep  8 08:57:14 proxmox1 kernel: [35054.564925]  <IRQ>
Sep  8 08:57:14 proxmox1 kernel: [35054.564928]  ? 
pfifo_fast_enqueue+0x160/0x160
Sep  8 08:57:14 proxmox1 kernel: [35054.564930] call_timer_fn+0x32/0x130
Sep  8 08:57:14 proxmox1 kernel: [35054.564931] 
run_timer_softirq+0x1a5/0x430
Sep  8 08:57:14 proxmox1 kernel: [35054.564933]  ? ktime_get+0x3c/0xa0
Sep  8 08:57:14 proxmox1 kernel: [35054.564935]  ? 
lapic_next_deadline+0x26/0x30
Sep  8 08:57:14 proxmox1 kernel: [35054.564936]  ? 
clockevents_program_event+0x93/0xf0
Sep  8 08:57:14 proxmox1 kernel: [35054.564938] __do_softirq+0xdc/0x2d4
Sep  8 08:57:14 proxmox1 kernel: [35054.564940]  irq_exit+0xa9/0xb0
Sep  8 08:57:14 proxmox1 kernel: [35054.564941] 
smp_apic_timer_interrupt+0x79/0x130
Sep  8 08:57:14 proxmox1 kernel: [35054.564942] 
apic_timer_interrupt+0xf/0x20
Sep  8 08:57:14 proxmox1 kernel: [35054.564943]  </IRQ>
Sep  8 08:57:14 proxmox1 kernel: [35054.564945] RIP: 
0010:cpuidle_enter_state+0xbd/0x450
Sep  8 08:57:14 proxmox1 kernel: [35054.564946] Code: ff e8 a7 b4 84 ff 
80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 63 03 00 00 31 ff 
e8 ca 22 8b ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 8d 02 00 00 49 63 
cd 48 8b 75 d0 48 2b 75 c8 48 8d
Sep  8 08:57:14 proxmox1 kernel: [35054.564946] RSP: 
0018:ffffb352c00c3e48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Sep  8 08:57:14 proxmox1 kernel: [35054.564947] RAX: ffff9b79bdaaad40 
RBX: ffffffffbb957a00 RCX: 000000000000001f
Sep  8 08:57:14 proxmox1 kernel: [35054.564948] RDX: 00001fe1c6e2a49d 
RSI: 0000000026a5b845 RDI: 0000000000000000
Sep  8 08:57:14 proxmox1 kernel: [35054.564948] RBP: ffffb352c00c3e88 
R08: 0000000000000002 R09: 000000000002a5c0
Sep  8 08:57:14 proxmox1 kernel: [35054.564949] R10: 000069f45edea306 
R11: ffff9b79bdaa99e0 R12: ffff9b79bdab6600
Sep  8 08:57:14 proxmox1 kernel: [35054.564949] R13: 0000000000000006 
R14: ffffffffbb957c58 R15: ffffffffbb957c40
Sep  8 08:57:14 proxmox1 kernel: [35054.564951]  ? 
cpuidle_enter_state+0x99/0x450
Sep  8 08:57:14 proxmox1 kernel: [35054.564952] cpuidle_enter+0x2e/0x40
Sep  8 08:57:14 proxmox1 kernel: [35054.564954] call_cpuidle+0x23/0x40
Sep  8 08:57:14 proxmox1 kernel: [35054.564955]  do_idle+0x22c/0x270
Sep  8 08:57:14 proxmox1 kernel: [35054.564957] cpu_startup_entry+0x1d/0x20
Sep  8 08:57:14 proxmox1 kernel: [35054.564958] start_secondary+0x166/0x1c0
Sep  8 08:57:14 proxmox1 kernel: [35054.564960] 
secondary_startup_64+0xa4/0xb0
Sep  8 08:57:14 proxmox1 kernel: [35054.564961] ---[ end trace 
3a481687c9259238 ]---
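
For anyone who wants to check other nodes for the same warning, here is
a minimal sketch in Python that picks the "NETDEV WATCHDOG" lines out of
a syslog-style kernel log and reports the interface, driver and queue.
The function name find_watchdog_events and the SAMPLE list are
illustrative only; the regular expression assumes the unwrapped line
format seen in the trace above (the mail client wrapped the lines here),
so other log formats would need adjusting.

```python
import re

# Assumption: classic syslog prefix ("Sep  8 08:57:14 host kernel: ...")
# followed by the kernel's transmit-timeout message, all on one line.
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+kernel:.*"
    r"NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+\((?P<driver>[^)]+)\):\s+"
    r"transmit queue\s+(?P<queue>\d+)\s+timed out"
)


def find_watchdog_events(lines):
    """Return one dict per transmit-timeout warning found in `lines`."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            event = match.groupdict()
            event["queue"] = int(event["queue"])  # queue index as a number
            events.append(event)
    return events


# Two lines from the trace above, rejoined as they would appear in the log.
SAMPLE = [
    "Sep  8 08:57:14 proxmox1 kernel: [35054.564849] "
    "------------[ cut here ]------------",
    "Sep  8 08:57:14 proxmox1 kernel: [35054.564856] "
    "NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out",
]

for ev in find_watchdog_events(SAMPLE):
    print(ev["stamp"], ev["host"], ev["iface"], ev["driver"], ev["queue"])
```

The same function can be given the lines of a log file from each node
(any iterable of strings works) to compare when the timeouts happened
against the fence times.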


Cheers



-- 
Eneko Lacunza                   | Tel.  943 569 206
                                 | Email elacunza at binovo.es
Director Técnico                | Site. https://www.binovo.es
BINOVO IT HUMAN PROJECT S.L     | Dir.  Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun



