<p> Hi,</p>
<p>With the recent PVE kernel, the network controller crashes and does not come back to life until I power-cycle the system.</p>
<p>This happens on an interface with almost no traffic, after 1-2 days.</p>
<p>I found the following discussions online that describe exactly the same situation:</p>
<p>https://bugzilla.redhat.com/show_bug.cgi?id=625776<br />
https://lkml.org/lkml/2012/3/17/48<br />
http://lists.centos.org/pipermail/centos/2011-September/118027.html<br />
http://sourceforge.net/p/e1000/bugs/358/<br />
</p>
<p>Background info:</p>
<ul>
<li>latest Proxmox 3.1, fresh install, up to date</li>
<li>no heavy traffic, the interface is almost idle</li>
<li>6 x gigabit LAN cards (lspci -v output for one of them):<br />
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection<br />
Subsystem: Intel Corporation Device 0000<br />
Flags: bus master, fast devsel, latency 0, IRQ 40<br />
Memory at e8500000 (32-bit, non-prefetchable) [size=128K]<br />
I/O ports at 3000 [size=32]<br />
Memory at e8520000 (32-bit, non-prefetchable) [size=16K]<br />
Expansion ROM at e8d00000 [disabled] [size=2K]<br />
Capabilities: [c8] Power Management version 2<br />
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+<br />
Capabilities: [e0] Express Endpoint, MSI 00<br />
Capabilities: [a0] MSI-X: Enable- Count=1 Masked-<br />
Capabilities: [100] Advanced Error Reporting<br />
Capabilities: [140] Device Serial Number 00-03-1d-ff-ff-0b-8a-e7<br />
Kernel driver in use: e1000e</li>
<li>ethtool -i eth1<br />
driver: e1000e<br />
version: 2.4.14-NAPI<br />
firmware-version: 3.1-1<br />
bus-info: 0000:03:00.0<br />
supports-statistics: yes<br />
supports-test: yes<br />
supports-eeprom-access: yes<br />
supports-register-dump: yes<br />
supports-priv-flags: no</li>
<li>ifconfig eth1<br />
eth1 Link encap:Ethernet HWaddr 00:03:1d:0b:8a:e3 <br />
inet6 addr: fe80::203:1dff:fe0b:8ae3/64 Scope:Link<br />
UP BROADCAST MULTICAST MTU:1500 Metric:1<br />
<span style="color: rgb(255, 0, 0);">RX packets:70200 errors:354515190463890 dropped:59085865077315 overruns:0 frame:236343460309260</span><br />
TX packets:13254 errors:0 dropped:0 overruns:0 carrier:0<br />
collisions:0 txqueuelen:1000 <br />
RX bytes:28083621 (26.7 MiB) TX bytes:2211516 (2.1 MiB)<br />
Interrupt:17 Memory:e8900000-e8920000</li>
<li>Kernel Command line:<br />
BOOT_IMAGE=/vmlinuz-2.6.32-23-pve root=UUID=dd4c475c-b71d-4497-aeb1-c36a06a8c46f ro quiet</li>
<li>dmesg report:<br />
------------[ cut here ]------------<br />
WARNING: at net/sched/sch_generic.c:267 dev_watchdog+0x28a/0x2a0() (Tainted: P --------------- )<br />
Hardware name: HuronRiver Platform<br />
NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out<br />
Modules linked in: fuse vzethdev vznetdev pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vzrst nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 vzcpt nf_conntrack vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables vhost_net tun macvtap macvlan kvm_intel kvm vzevent ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc ipv6 ext3 jbd snd_hda_codec_realtek i915 snd_pcsp snd_hda_intel snd_hda_codec snd_hwdep zfs(P) zunicode(P) snd_pcm snd_page_alloc drm_kms_helper zavl(P) zcommon(P) parport_pc i2c_i801 video snd_timer iTCO_wdt iTCO_vendor_support drm i2c_algo_bit serio_raw parport shpchp snd i2c_core output soundcore ext4 jbd2 mbcache znvpair(P) spl zlib_deflate sg e1000e ahci [last unloaded: scsi_wait_scan]<br />
Pid: 0, comm: swapper veid: 0 Tainted: P --------------- 2.6.32-23-pve #1<br />
Call Trace:<br />
&lt;IRQ&gt; [&lt;ffffffff8106f667&gt;] ? warn_slowpath_common+0x87/0xe0<br />
[&lt;ffffffff8106f776&gt;] ? warn_slowpath_fmt+0x46/0x50<br />
[&lt;ffffffff8148a08a&gt;] ? dev_watchdog+0x28a/0x2a0<br />
[&lt;ffffffff8108065b&gt;] ? internal_add_timer+0xcb/0x130<br />
[&lt;ffffffff81489e00&gt;] ? dev_watchdog+0x0/0x2a0<br />
[&lt;ffffffff81083be6&gt;] ? run_timer_softirq+0x176/0x370<br />
[&lt;ffffffff81033755&gt;] ? native_apic_msr_write+0x35/0x40<br />
[&lt;ffffffff810793cb&gt;] ? __do_softirq+0x11b/0x260<br />
[&lt;ffffffff810ac015&gt;] ? tick_dev_program_event+0x65/0xc0<br />
[&lt;ffffffff810ac09a&gt;] ? tick_program_event+0x2a/0x30<br />
[&lt;ffffffff8100c32c&gt;] ? call_softirq+0x1c/0x30<br />
[&lt;ffffffff8100de95&gt;] ? do_softirq+0x75/0xb0<br />
[&lt;ffffffff810796a5&gt;] ? irq_exit+0xc5/0xd0<br />
[&lt;ffffffff81549a10&gt;] ? smp_apic_timer_interrupt+0x70/0x9b<br />
[&lt;ffffffff8100bcd3&gt;] ? apic_timer_interrupt+0x13/0x20<br />
&lt;EOI&gt; [&lt;ffffffff812db91b&gt;] ? intel_idle+0xdb/0x160<br />
[&lt;ffffffff812db8f9&gt;] ? intel_idle+0xb9/0x160<br />
[&lt;ffffffff814352a4&gt;] ? cpuidle_idle_call+0x94/0x130<br />
[&lt;ffffffff81009219&gt;] ? cpu_idle+0xa9/0x100<br />
[&lt;ffffffff8151bd21&gt;] ? rest_init+0x85/0x94<br />
[&lt;ffffffff81c34cd6&gt;] ? start_kernel+0x40b/0x417<br />
[&lt;ffffffff81c3433b&gt;] ? x86_64_start_reservations+0x126/0x12a<br />
[&lt;ffffffff81c34436&gt;] ? x86_64_start_kernel+0xf7/0x106<br />
---[ end trace c5f8a6b8504af481 ]---<br />
e1000e 0000:03:00.0: eth1: Reset adapter unexpectedly<br />
e1000e 0000:03:00.0: eth1: Timesync Tx Control register not set as expected<br />
e1000e 0000:03:00.0: eth1: Error reading PHY register<br />
e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx<br />
....<br />
e1000e 0000:03:00.0: eth1: Reset adapter unexpectedly<br />
e1000e 0000:03:00.0: eth1: Timesync Tx Control register not set as expected<br />
vmbr1: port 1(eth1) entering disabled state</li>
</ul>
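<p>In case someone wants to check whether their own box shows the same symptoms, here is a rough sketch that counts the two messages from the dmesg excerpt above per interface. The function name <code>count_hang_events</code> and the <code>SAMPLE</code> text are only illustrative, and the patterns are taken from the log lines in this post, so the wording may differ on other kernel or driver versions.</p>

```python
import re
from collections import Counter

# Patterns copied from the dmesg lines quoted in this post; other
# kernel/driver versions may word these messages differently.
WATCHDOG = re.compile(
    r"NETDEV WATCHDOG: (\S+) \((\S+)\): transmit queue \d+ timed out"
)
RESET = re.compile(r"(\S+): Reset adapter unexpectedly")


def count_hang_events(log_text):
    """Return (timeouts, resets): two Counters keyed by interface name."""
    timeouts, resets = Counter(), Counter()
    for line in log_text.splitlines():
        match = WATCHDOG.search(line)
        if match:
            timeouts[match.group(1)] += 1
        match = RESET.search(line)
        if match:
            resets[match.group(1)] += 1
    return timeouts, resets


# A few lines from the log above, used as sample input.
SAMPLE = """\
NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
e1000e 0000:03:00.0: eth1: Reset adapter unexpectedly
e1000e 0000:03:00.0: eth1: Error reading PHY register
e1000e 0000:03:00.0: eth1: Reset adapter unexpectedly
"""

timeouts, resets = count_hang_events(SAMPLE)
print("watchdog timeouts:", dict(timeouts))
print("unexpected resets:", dict(resets))
```

<p>To run it against a real machine, the saved output of <code>dmesg</code> can be passed to <code>count_hang_events</code> in place of <code>SAMPLE</code>.</p>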
<p>Any ideas?</p>
<p>Thanks, István</p>