2005-03-24 02:08:35

by Ben Greear

Subject: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

00:00.0 Host bridge: Intel Corp. Server Memory Controller Hub (rev 0c)
00:02.0 PCI bridge: Intel Corp. Memory Controller Hub PCI Express Port A0 (rev 0c)
00:03.0 PCI bridge: Intel Corp. Memory Controller Hub PCI Express Port A1 (rev 0c)
00:1c.0 PCI bridge: Intel Corp. 6300ESB 64-bit PCI-X Bridge (rev 02)
00:1d.0 USB Controller: Intel Corp. 6300ESB USB Universal Host Controller (rev 02)
00:1d.4 System peripheral: Intel Corp. 6300ESB Watchdog Timer (rev 02)
00:1d.5 PIC: Intel Corp. 6300ESB I/O Advanced Programmable Interrupt Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev 0a)
00:1f.0 ISA bridge: Intel Corp. 6300ESB LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corp. 6300ESB SATA Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corp. 6300ESB SMBus Controller (rev 02)
01:00.0 PCI bridge: Intel Corp. PCI Bridge Hub A (rev 09)
01:00.2 PCI bridge: Intel Corp. PCI Bridge Hub B (rev 09)
02:01.0 PCI bridge: IBM PCI-X to PCI-X Bridge (rev 03)
03:04.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
03:04.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
03:06.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
03:06.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
04:01.0 PCI bridge: IBM PCI-X to PCI-X Bridge (rev 02)
05:04.0 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01)
05:04.1 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01)
05:06.0 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01)
05:06.1 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01)
07:01.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller
07:02.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller
08:01.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 10)
08:02.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)


Attachments:
dmesg.txt (15.08 kB)
lspci.txt (2.03 kB)

2005-03-24 03:25:43

by Ben Greear

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Ben Greear wrote:
> I'm having a strange problem. I have an X6DVA motherboard
> with dual 2.8GHz EM64T processors, 1GB of RAM, SATA HD, etc.

> I tried kernel 2.6.11 which uses irq 26, and 2.6.10-1.770_FC2smp, which
> maps the irq to 209 or something like that. Distribution is FC2, x86.
> Kernel is compiled for x86-SMP as well.
>
> I suspect that this may be a hardware issue of some sort, but if anyone
> has any suggestions as to how to debug this further, please do let
> me know. I'm attaching the lspci and dmesg output in case that helps.

I am now less certain: I tried a separate but similar machine, and
eth3 still fails the interrupt test. I tried the 2.6.9 kernel: same problem.
I tried the 2.4.29 kernel (on the FC2 distribution), and the same problem exists.

I tried booting with pci=noacpi, and this just messes up everything (irqs are
disabled, etc).

Could this be a bug in the motherboard implementation?

Off to try some different combinations of NIC hardware...

Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2005-03-24 08:10:08

by Lennert Buytenhek

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

On Wed, Mar 23, 2005 at 06:03:30PM -0800, Ben Greear wrote:

> I have two 4-port e1000 NICs in the system, on a riser card.

How is the riser card wired? F.e. does it have a single edge
connector, and provides two PCI slots, or does it have a tiny
additional edge connector that routes REQ#/GNT#/INTx from a
nearby PCI slot, etc.?


--L

2005-03-24 08:18:40

by Ben Greear

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Lennert Buytenhek wrote:
> On Wed, Mar 23, 2005 at 06:03:30PM -0800, Ben Greear wrote:
>
>
>>I have two 4-port e1000 NICs in the system, on a riser card.
>
>
> How is the riser card wired? F.e. does it have a single edge
> connector, and provides two PCI slots, or does it have a tiny
> additional edge connector that routes REQ#/GNT#/INTx from a
> nearby PCI slot, etc.?

It has an edge connector, a full 64-bit ribbon connector, and
a 32-bit ribbon connector. As far as I can tell, there are no
shared signals.

It is made by Adex electronics, and is part number: P/NPCITX3S1 884-335-185

I tried two different systems, and the problem is identical, so I believe
it is not a hardware manufacturing glitch. It may be a hardware design
issue in either the riser or the motherboard. Or a BIOS PCI irq mapping
problem...

Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2005-03-24 10:14:11

by Daniel Egger

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

On 24.03.2005, at 03:03, Ben Greear wrote:

> When trying to send/receive traffic, I get TX watchdog timeouts. The
> other interfaces seem to work just fine.

No idea whether my problem is related but due to a broken motherboard
I had to switch from a SiS based Athlon board (ECS K7S5A) to a new
one which is VIA based:

0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge (rev 80)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
0000:00:09.0 VGA compatible controller: Cirrus Logic GD 5446
0000:00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
0000:00:0b.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller
0000:00:0d.0 FireWire (IEEE 1394): NEC Corporation uPD72874 IEEE1394 OHCI 1.1 3-port PHY-Link Ctrlr (rev 01)
0000:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
0000:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
0000:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
0000:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
0000:00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
0000:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)

Strangely enough, the previously very reliable Intel Pro/1000 MT
controller also started showing watchdog errors in an otherwise
unchanged environment (same kernel, cards, CPU, etc.). I tried
different kernels from 2.6.8 through 2.6.10, but no change; I tried
without ACPI information for IRQ routing -- nope. I tried swapping PCI
slots -- negative, sir.

As a temporary countermeasure (this box is not only an Ethernet
bridge between 100Mbit and 1000Mbit switched networks but also
the primary fileserver for my netboot TFTP/NFS environment, so
dropouts are especially nasty since it takes some time until the
NFS machines on the Gbit network resume operation) I popped
in the cheeeep RealTek card (which had caused some slight problems
like permanent hangs and bad performance before) and everything
works like a charm. Of course, after throwing in some extra
money to get at least somewhat professional equipment, I'd like
to use it, too.

This is the (strange?) ethtool output:
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: Unknown! (65535)
        Duplex: Unknown! (255)
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: umbg
        Wake-on: g
        Current message level: 0x00000007 (7)
        Link detected: no
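
For readers puzzling over the "Unknown!" values: 65535 and 255 are the all-ones values of a 16-bit and an 8-bit field (0xFFFF and 0xFF), which is consistent with a driver reporting "no valid speed/duplex" while the link is down, as the last line of the output says. The following is only a minimal sketch for checking that pattern in saved ethtool output; the function names are hypothetical helpers, not part of ethtool, and the parser assumes the simple "Key: value" layout shown above.

```python
import re


def parse_ethtool(text):
    """Parse 'Key: value' lines from `ethtool <dev>` output into a dict.

    Lines without a colon are treated as continuations of the previous key
    (e.g. the wrapped list of link modes).
    """
    settings = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Settings for"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            settings[last_key] = value.strip()
        elif last_key is not None:
            settings[last_key] = (settings[last_key] + " " + line).strip()
    return settings


def unknown_code(value):
    """Return the integer inside 'Unknown! (N)', or None if the value is known."""
    match = re.fullmatch(r"Unknown! \((\d+)\)", value)
    return int(match.group(1)) if match else None


def link_down_with_unknown_speed(settings):
    """True when the link is down and speed/duplex are both all-ones placeholders."""
    return (
        settings.get("Link detected") == "no"
        and unknown_code(settings.get("Speed", "")) == 0xFFFF
        and unknown_code(settings.get("Duplex", "")) == 0xFF
    )
```

If this returns True for a port that is cabled to a working switch, the interesting question is why the link is not detected, rather than the speed and duplex values themselves.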

The card is still in the system and running, so if someone wants
me to run more tests or diagnostics, please be my guest.

Servus,
Daniel



2005-03-24 18:07:18

by Ben Greear

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Daniel Egger wrote:

> The card is still in the system and running, so if someone wants
> me to run more tests or diagnostics, please be my guest.

What does: ethtool -t eth0
show?

Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2005-03-24 18:38:06

by Francois Romieu

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Daniel Egger <[email protected]> :
[...]
> NFS machines on the Gbit network will resume operation) I popped
> in the cheeeep RealTek card (which caused some slight problems
> like permanent hangs and bad performance before) and everything
> works like a charm. Of course, after throwing in some extra

Do you have a pointer to a description of your r8169 issues with a recent kernel?

--
Ueimor

2005-03-24 19:38:40

by Ben Greear

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Lennert Buytenhek wrote:
> On Wed, Mar 23, 2005 at 06:03:30PM -0800, Ben Greear wrote:
>
>
>>I have two 4-port e1000 NICs in the system, on a riser card.
>
>
> How is the riser card wired? F.e. does it have a single edge
> connector, and provides two PCI slots, or does it have a tiny
> additional edge connector that routes REQ#/GNT#/INTx from a
> nearby PCI slot, etc.?

I was able to reproduce the problem even when the 4-port e1000 NIC
is plugged directly into the motherboard, so it's not the
riser...

I also tried a 4-port VIA-Rhine NIC (router-board 44). Its third
interface also fails, with the same problem. So, it is not
the e1000 NIC nor the e1000 driver that is the problem.

I do notice that it is the same interrupt (26) that is always assigned
to the broken port. I have the lspci and dmesg output for the via-rhine
boot if anyone wants it...
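
One way to confirm this kind of observation is to compare two snapshots of /proc/interrupts taken while traffic is being pushed through every port: an IRQ whose count never moves is the suspect. The following is a minimal sketch, not something from the original message. The function names are hypothetical, the sample snapshots are made-up illustration data rather than output from this machine, and the parser assumes the usual layout of a CPU header row followed by "IRQ: counts... description" lines.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (sum(counts), " ".join(fields[len(counts):]))
    return table


def stalled_nic_irqs(before_text, after_text, prefix="eth"):
    """Return IRQs serving a device named like `prefix` whose count did not grow."""
    before = parse_interrupts(before_text)
    after = parse_interrupts(after_text)
    stalled = []
    for irq, (count, desc) in after.items():
        if prefix in desc and irq in before and count == before[irq][0]:
            stalled.append(irq)
    return sorted(stalled)


# Hypothetical snapshots for illustration only (not taken from the X6DVA).
BEFORE = """\
           CPU0       CPU1
 25:       1200        340   IO-APIC-level  eth2
 26:          0          0   IO-APIC-level  eth3
"""
AFTER = """\
           CPU0       CPU1
 25:       1850        512   IO-APIC-level  eth2
 26:          0          0   IO-APIC-level  eth3
"""

print(stalled_nic_irqs(BEFORE, AFTER))
```

On a live system the two strings would come from reading /proc/interrupts a few seconds apart. A flat count only means something if the interface was actually asked to send or receive in between.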

Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2005-03-29 20:37:35

by Ben Greear

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard (Solved)

For posterity's sake:

The problem is evidently a hardware problem, and I'll have to
return the board to the manufacturer so they can solder on another
part.

So, if you want to use 4-port NICs in slot-5 of the SuperMicro X6DVA-EG
board, then purchase the X6DVA-4G instead, as the X6DVA-EG will NOT work.

Actually, anything that tries to use the 3rd PCI function will probably fail
as well...
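
One possible reading of "3rd PCI function" (an interpretation, not something confirmed in this thread) is the third interrupt line at the slot, INTC#. The standard PCI-to-PCI bridge rule rotates a device's interrupt pin by its device number when it crosses a bridge, so a 4-port card with two dual-port chips behind an on-board bridge ends up using all four slot pins. The sketch below applies that rule to the device numbers in the first lspci listing (03:04.0, 03:04.1, 03:06.0, 03:06.1). It assumes function 0 of each chip drives INTA# and function 1 drives INTB#, which is typical for dual-port parts but is not stated here, and it says nothing about how the board then routes each slot pin to an IO-APIC input.

```python
PIN_NAMES = ["INTA#", "INTB#", "INTC#", "INTD#"]


def swizzle(pin_index, device):
    """Pin seen on the upstream side of a PCI-to-PCI bridge.

    pin_index is 0..3 for INTA#..INTD# as driven by the device itself;
    device is the device number on the bridge's secondary bus.
    """
    return (pin_index + device) % 4


def slot_pin(device, function):
    """Slot-side pin for one port, assuming function N drives pin N (an assumption)."""
    return PIN_NAMES[swizzle(function, device)]


# (device, function) pairs of the four ports on bus 03 in the lspci listing.
PORTS = [(4, 0), (4, 1), (6, 0), (6, 1)]

for device, function in PORTS:
    print(f"03:{device:02x}.{function} -> {slot_pin(device, function)}")
```

Under these assumptions the third port is the only one that depends on INTC# at the slot, which would fit a fault that follows the third interface regardless of which NIC or driver is used.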

Enjoy,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2005-04-14 18:01:06

by Ganesh Venkatesan

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Ben:

Have you checked whether the BIOS on the Super Micro machine is the latest
and greatest? I have had interrupt routing issues very similar to the
one you are describing due to a BIOS interrupt routing bug. Moving
to a newer BIOS fixed it.

ganesh.

On 3/24/05, Ben Greear <[email protected]> wrote:
> Lennert Buytenhek wrote:
> > On Wed, Mar 23, 2005 at 06:03:30PM -0800, Ben Greear wrote:
> >
> >
> >>I have two 4-port e1000 NICs in the system, on a riser card.
> >
> >
> > How is the riser card wired? F.e. does it have a single edge
> > connector, and provides two PCI slots, or does it have a tiny
> > additional edge connector that routes REQ#/GNT#/INTx from a
> > nearby PCI slot, etc.?
>
> I was able to reproduce the problem even when the 4-port e1000 NIC
> is plugged directly into the motherboard, so it's not the
> riser...
>
> I also tried a 4-port VIA-Rhine NIC (router-board 44). Its third
> interface also fails, with the same problem. So, it is not
> the e1000 NIC nor the e1000 driver that is the problem.
>
> I do notice that it is the same interrupt (26) that is always assigned
> to the broken port. I have the lspci and dmesg output for the via-rhine
> boot if anyone wants it...
>
> Ben
>
> --
> Ben Greear <[email protected]>
> Candela Technologies Inc http://www.candelatech.com
>
>

2005-04-14 20:34:42

by Ben Greear

Subject: Re: PCI interrupt problem: e1000 & Super-Micro X6DVA motherboard

Ganesh Venkatesan wrote:

>Ben:
>
>Have you checked whether the BIOS on the Super Micro machine is the latest
>and greatest? I have had interrupt routing issues very similar to the
>one you are describing due to a BIOS interrupt routing bug. Moving
>to a newer BIOS fixed it.
>
>

A new BIOS didn't help. Super-Micro eventually reproduced the problem
and told me the fix was to send the MB back to them so they could solder
another part onto it... I haven't received the MB back yet, so I don't
know if they really have a fix for it or not...

Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com