2006-01-05 20:30:18

by Lukasz Trabinski

Subject: e1000_watchdog_task: NIC Link is Up/Down on kernels 2.4.15 2.4.14

Hello

I have a machine with a built-in Ethernet interface; the system is Fedora
Core 4 Linux.
Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet
Controller. It is connected to a Cisco Catalyst 4000 switch:

#show interface GigabitEthernet3/5

GigabitEthernet3/5 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet Port, address is xxx
  Description: xxx
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, link type is auto, media type is 10/100/1000-TX
  input flow-control is off, output flow-control is off


Configuration of the switch interface:

interface GigabitEthernet3/5
 switchport mode access
 switchport nonegotiate
 logging event link-status
 no cdp enable
end



On the Linux machine:

[root@w3cache ~]# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: umbg
        Wake-on: g
        Current message level: 0x00000007 (7)
        Link detected: yes
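
To compare these settings before and after a flap, the ethtool output can be turned into key/value pairs and logged periodically. The following is only a minimal sketch: the function name parse_ethtool is hypothetical, and the parser assumes the plain "Key: value" layout shown above, with lines that lack a colon treated as continuations of the previous value.

```python
def parse_ethtool(text):
    """Parse `ethtool <iface>` text into a dict of setting -> value.

    Assumption: every setting is "Key: value"; a line without a colon
    continues the previous value (as with the link-mode lists).
    """
    settings = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Settings for"):
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            settings[key] = value.strip()
        elif key is not None:
            # Continuation line: append to the value of the last key seen.
            settings[key] = (settings[key] + " " + line).strip()
    return settings


# Excerpt of the output quoted in this message.
SAMPLE_ETHTOOL = """\
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: on
Link detected: yes
"""

if __name__ == "__main__":
    parsed = parse_ethtool(SAMPLE_ETHTOOL)
    for name in ("Speed", "Duplex", "Auto-negotiation", "Link detected"):
        print(name, "=", parsed[name])
```

Logging the parsed "Speed", "Duplex" and "Link detected" values at intervals would show whether the negotiated mode changes around a flap.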

Sometimes I observe the eth0 interface flapping:

Jan 5 18:31:19 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Down
Jan 5 18:31:21 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
Jan 5 18:31:25 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Down
Jan 5 18:31:27 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
Jan 5 18:56:30 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Down
Jan 5 18:56:33 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
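
To quantify the flapping, the watchdog messages above can be paired into Down/Up intervals. This is a rough sketch rather than a finished tool: the names parse_link_events and down_durations are hypothetical, the year 2006 is assumed because syslog timestamps carry no year, and month abbreviations are assumed to be English.

```python
import re
from datetime import datetime

# Matches the e1000 watchdog messages quoted above. \s+ is used between
# fields so that entries wrapped by a mail client still match.
LINK_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"e1000:\s+(?P<iface>\S+):\s+e1000_watchdog_task:\s+"
    r"NIC Link\s+is\s+(?P<state>Up|Down)"
)


def parse_link_events(text, year=2006):
    """Return a list of (timestamp, interface, state) tuples."""
    events = []
    for m in LINK_RE.finditer(text):
        stamp = " ".join(m["ts"].split())  # normalise spacing
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["iface"], m["state"]))
    return events


def down_durations(events):
    """Pair each Down with the next Up on the same interface.

    Returns the outage lengths in seconds, in log order.
    """
    down_since = {}
    outages = []
    for ts, iface, state in events:
        if state == "Down":
            down_since[iface] = ts
        elif iface in down_since:
            outages.append((ts - down_since.pop(iface)).total_seconds())
    return outages


SAMPLE_LOG = """\
Jan 5 18:31:19 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link
is Down
Jan 5 18:31:21 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link
is Up 1000 Mbps Full Duplex
Jan 5 18:31:25 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link
is Down
Jan 5 18:31:27 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link
is Up 1000 Mbps Full Duplex
Jan 5 18:56:30 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link
is Down
Jan 5 18:56:33 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link
is Up 1000 Mbps Full Duplex
"""

if __name__ == "__main__":
    for seconds in down_durations(parse_link_events(SAMPLE_LOG)):
        print(f"link down for {seconds:.0f} s")
```

Run over a full /var/log/messages, the same functions would show how often the link drops and for how long, which is useful to include in a bug report.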

Patch cords and the rest of the physical layer are OK. Tested kernels:
kernel-2.6.14-1.1653_FC4 and latest vanilla 2.4.15. I have tried forcing full
duplex and speed 1000 on the switch, but it happens again.
On the Cisco switch I have not seen any flaps on this interface or any errors.

Any ideas? Thank you.

I have also reported this problem in Red Hat Bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177043


--
ŁT


2006-01-09 15:29:05

by Roger Heflin

Subject: RE: e1000_watchdog_task: NIC Link is Up/Down on kernels 2.4.15 2.4.14




LT,

What kind of motherboard are you using?

If an e1000 chip overheats, it will look as if the link is being disconnected,
and this seems to happen only when the chip is under load.

Things that make it more likely to overheat are:
cases with bad airflow
certain motherboards with no heat sink on the e1000 chip
higher elevations (thinner air).

We have fixed a number of machines with this issue by adding a heatsink/fan
near the Ethernet chip, and we have gotten at least one motherboard manufacturer
to reproduce the issue and add a heat sink to their motherboards to correct it.

Roger

2006-01-25 21:53:07

by Lukasz Trabinski

Subject: RE: e1000_watchdog_task: NIC Link is Up/Down on kernels 2.4.15 2.4.14

On Mon, 9 Jan 2006, Roger Heflin wrote:

about:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177043

> We have fixed a number of machines with this issue by adding a heatsink/fan
> near the ethernet chip, and we have got at least one MB manufacturer to
> duplicate the issue, and add a heat sink on their motherboards to correct
> the issue.

Thank you for the suggestions. I will try to move this machine to another rack,
because I have found reports of high CPU temperature in the log files.

Jan 1 04:03:21 w3cache kernel: CPU2: Running in modulated clock mode
Jan 1 04:03:21 w3cache kernel: CPU3: Running in modulated clock mode
Jan 1 04:03:26 w3cache kernel: CPU3: Temperature above threshold
Jan 1 04:03:26 w3cache kernel: CPU2: Temperature above threshold
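
One way to test the overheating theory against the logs is to check whether link drops fall close in time to thermal warnings. The sketch below is only illustrative: the function name drops_near_thermal and the 10-minute window are arbitrary assumptions, the year 2006 is assumed because syslog omits it, and month abbreviations are assumed to be English. Note that the excerpts quoted in this thread are days apart (Jan 1 versus Jan 5), so on those lines alone it reports no overlap; that says nothing about the full log.

```python
import re
from datetime import datetime, timedelta

# Common prefix: syslog timestamp, host name, "kernel:".
STAMP = r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
THERMAL_RE = re.compile(STAMP + r"CPU\d+:\s+Temperature above threshold")
LINK_DOWN_RE = re.compile(
    STAMP + r"e1000:\s+\S+:\s+e1000_watchdog_task:\s+NIC Link\s+is\s+Down"
)


def _timestamps(pattern, text, year):
    """Collect the timestamps of all messages matching `pattern`."""
    found = []
    for m in pattern.finditer(text):
        stamp = " ".join(m["ts"].split())
        found.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return found


def drops_near_thermal(text, window=timedelta(minutes=10), year=2006):
    """Return link-down times that lie within `window` of a thermal warning."""
    thermal = _timestamps(THERMAL_RE, text, year)
    drops = _timestamps(LINK_DOWN_RE, text, year)
    return [d for d in drops if any(abs(d - t) <= window for t in thermal)]


# Lines taken from the excerpts quoted in this thread.
THREAD_LOG = """\
Jan 1 04:03:26 w3cache kernel: CPU3: Temperature above threshold
Jan 1 04:03:26 w3cache kernel: CPU2: Temperature above threshold
Jan 5 18:31:19 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Down
Jan 5 18:56:30 w3cache kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Down
"""

if __name__ == "__main__":
    hits = drops_near_thermal(THREAD_LOG)
    print(f"{len(hits)} link drop(s) within 10 minutes of a thermal warning")
```

CPU temperature is only a proxy here, since the kernel messages report the processors and not the Ethernet chip itself.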



--
ŁT