2004-09-12 22:36:21

by Hans-Frieder Vogt

Subject: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Hi,

2.6.9-rc1-bk11 introduced a patch to the Realtek 8169 network chip driver
that freezes my system during bootup.
My system:
Athlon 64, VIA K8T800 chipset, RTL8110S-32 (equiv. to RTL8169), running in
64bit mode

No unusual messages appear on the console prior to the freeze.

I traced the problem back to the patch which was introduced in 2.6.9-rc1-bk11
and is also part of 2.6.9-rc1-mm3 and -mm4:
--- a/drivers/net/r8169.c 2004-07-02 11:51:44 -07:00
+++ b/drivers/net/r8169.c 2004-08-31 00:15:35 -07:00
@@ -983,7 +983,7 @@

tp->cp_cmd = PCIMulRW | RxChkSum;

- if ((sizeof(dma_addr_t) > 32) &&
+ if ((sizeof(dma_addr_t) > 4) &&
!pci_set_dma_mask(pdev, DMA_64BIT_MASK))
tp->cp_cmd |= PCIDAC;
else {

which now, on my 64bit-system, enables DAC. For whatever reason this freezes
my system (I do not understand why, because the r8169 seems to understand DAC
according to the available documentation, perhaps a VIA K8T800 bug?).
Until somebody comes up with a proper solution for this problem, I suggest as
a work-around to introduce a parameter so that the DAC can be simply
unselected if necessary, as outlined in the patch below.
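
To make the effect of the one-character change concrete: sizeof yields a size
in bytes, so with an 8-byte dma_addr_t the old comparison against 32 is never
true, while the new comparison against 4 is. The sketch below is standalone C,
not kernel code; the typedefs and function names are stand-ins invented for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's dma_addr_t (assumption: the real typedef is
 * architecture-specific and comes from kernel headers). */
typedef uint64_t dma_addr64_t;	/* as on a 64-bit build */
typedef uint32_t dma_addr32_t;	/* as on a plain 32-bit build */

/* Old test: compares a byte count against a bit count, so it is never
 * true for a 4- or 8-byte address type. */
static int old_check(size_t dma_addr_size)
{
	return dma_addr_size > 32;
}

/* New test: true whenever the address type is wider than 4 bytes. */
static int new_check(size_t dma_addr_size)
{
	return dma_addr_size > 4;
}
```

With sizeof(dma_addr64_t) as input, old_check() returns 0 and new_check()
returns 1, which is consistent with DAC only becoming active after the patch.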

Thanks for any suggestions as to what the problem might be.
Hans-Frieder

--- linux-2.6.9-rc1-mm4.orig/drivers/net/r8169.c 2004-09-08 15:43:31.525119800 +0200
+++ linux-2.6.9-rc1-mm4/drivers/net/r8169.c 2004-09-11 12:54:53.910456828 +0200
@@ -167,6 +167,7 @@
MODULE_DEVICE_TABLE(pci, rtl8169_pci_tbl);

static int rx_copybreak = 200;
+static int use_dac = 1;

enum RTL8169_registers {
MAC0 = 0, /* Ethernet hardware address. */
@@ -398,6 +399,8 @@
MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");
MODULE_PARM(media, "1-" __MODULE_STRING(MAX_UNITS) "i");
MODULE_PARM(rx_copybreak, "i");
+MODULE_PARM(use_dac, "i");
+MODULE_PARM_DESC(use_dac, "Use DAC addressing for DMA transfers on 64bit machines");
MODULE_LICENSE("GPL");

static int rtl8169_open(struct net_device *dev);
@@ -1152,7 +1155,7 @@

dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;

- if ((sizeof(dma_addr_t) > 4) && !pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
+ if ((sizeof(dma_addr_t) > 4) && use_dac && !pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
tp->cp_cmd |= PCIDAC;
dev->features |= NETIF_F_HIGHDMA;
} else {

--
--
Hans-Frieder Vogt e-mail: hfvogt (at) arcor (dot) de


2004-09-12 23:26:56

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Hans-Frieder Vogt <[email protected]> :
[2.6.9-rc1-bk11 r8169 freeze on amd64]
> Until somebody comes up with a proper solution for this problem, I suggest as
> a work-around to introduce a parameter so that the DAC can be simply
> unselected if necessary, as outlined in the patch below.
>
> Thanks for any suggestions as to what the problem might be.

Remove the workaround, apply the attached patch and watch the oops.

If it happens in rtl8169_rx_interrupt(), you may notice that R12 is set to 0xbfc.
R12 is pkt_len in rtl8169_rx_interrupt. This value is twice too high. I have not
figured why so far and I'll go to bed.

Please Cc: netdev and [email protected] on followup.

--
Ueimor


Attachments:
(No filename) (691.00 B)
r8169-dbg-b.patch (1.50 kB)

2004-09-13 12:47:47

by Hans-Frieder Vogt

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

On Monday, 13 September 2004 01:26, you wrote:
> Hans-Frieder Vogt <[email protected]> :
> [2.6.9-rc1-bk11 r8169 freeze on amd64]
>
> > Until somebody comes up with a proper solution for this problem, I
> > suggest as a work-around to introduce a parameter so that the DAC can be
> > simply unselected if necessary, as outlined in the patch below.
> >
> > Thanks for any suggestions as to what the problem might be.
>
> Remove the workaround, apply the attached patch and watch the oops.
>
Francois,
thanks for your quick response.

I applied the patch to an otherwise clean 2.6.9-rc1-bk17, but no change to
previous behaviour:
no oops (BUG_ON not triggered)! System boots up as normal, but just after I
log in on the console the system freezes, i.e., keyboard does not react any
more and the system is not accessible via network.
The time from the moment I log in to the time when the system freezes varies,
but is in the order of 5s.
There is no difference whether NAPI is enabled or not.

My .config and the boot messages are attached to this e-mail.

Hans-Frieder

> If it happens in rtl8169_rx_interrupt(), you may notice that R12 is set to
> 0xbfc. R12 is pkt_len in rtl8169_rx_interrupt. This value is twice too
> high. I have not figured why so far and I'll go to bed.
>
> Please Cc: netdev and [email protected] on followup.
>
> --
> Ueimor

--
--
Hans-Frieder Vogt e-mail: hfvogt (at) arcor (dot) de


Attachments:
(No filename) (1.40 kB)
config-2.6.9-rc1-bk17 (30.24 kB)
boot.msg-napi (15.04 kB)

2004-09-13 22:03:23

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Hans-Frieder Vogt <[email protected]> :
[...]
> no oops (BUG_ON not triggered)! System boots up as normal, but just after I

...

> log in on the console the system freezes, i.e., keyboard does not react any
> more and the system is not accessible via network.

- do the keyboard leds or the magic sysrq answer (I assume you boot without X) ?
- does it make a difference if you boot with the network cable unplugged (i.e.
fine until plugged then dead when first packet comes in) ?

> The time from the moment I log in to the time when the system freezes varies,
> but is in the order of 5s.

First packet probably. Can you verify this point ?

> There is no difference whether NAPI is enabled or not.

I will welcome lspci -vx + gcc version + objdump -S of the r8169 module.

--
Ueimor

2004-09-13 23:34:51

by Hans-Frieder Vogt

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

On Tuesday, 14 September 2004 00:02, Francois Romieu wrote:
> Hans-Frieder Vogt <[email protected]> :
> [...]
>
> > no oops (BUG_ON not triggered)! System boots up as normal, but just after
> > I
>
> ...
>
> > log in on the console the system freezes, i.e., keyboard does not react
> > any more and the system is not accessible via network.
>
> - do the keyboard leds or the magic sysrq answer (I assume you boot without
> X) ?

(To be able to exclude any side effect of X, I have booted without X and I
have also removed all graphics driver related modules)

When the system freezes, the keyboard is completely dead, the LEDs do not
react any more and also the sysrq keys do not work.

> - does it make a difference if you boot with the network cable
> unplugged (i.e. fine until plugged then dead when first packet comes in) ?
>

YES!! With the network cable unplugged, the system does not freeze!
Every 10 seconds, I now get the message:
r8169: eth0: PHY reset until link up
but otherwise everything seems fine.

> > The time from the moment I log in to the time when the system freezes
> > varies, but is in the order of 5s.
>
> First packet probably. Can you verify this point ?
>

I think the test with the network cable unplugged supports this assumption.
With network cable unplugged, /proc/interrupts shows 0 interrupts for the
network card, so probably the first interrupt leads to the system freeze.

> > There is no difference whether NAPI is enabled or not.
>
> I will welcome lspci -vx + gcc version + objdump -S of the r8169 module.
>

lspci -vx and objdump -S output (gzipped) are attached, gcc version is 3.4.2
(Debian 3.4.2-2), but no visible difference with 3.4.1.

> --
> Ueimor

Thanks for your help, Francois.
I will put a few printks into the interrupt routine and hope to be able to
tell you more tomorrow,

Hans-Frieder

--
--
Hans-Frieder Vogt e-mail: hfvogt (at) arcor (dot) de


Attachments:
(No filename) (1.88 kB)
lspci-vx.out (9.38 kB)
r8169-dump.out.gz (24.97 kB)

2004-09-15 22:47:43

by Hans-Frieder Vogt

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Francois,

I did a few tests with the r8169 driver on my x86-64 system:

With DAC enabled (the default),
the first interrupt is a receive interrupt: the status read from
IntrStatus is 0x8001, i.e. SYSErr | RxOK.
So a PCI error did occur, which is strange.
Because of the RxOK bit, rtl8169_rx_interrupt is called. In this routine,
tp->cur_rx is 0
tp->dirty_rx is 0
tp->RxDescArray[entry].status is 0,
which then gives a pkt_size of -4!
The system freezes somewhere in rtl8169_rx_interrupt.

When I add a
if (pkt_size < 0) break;
in rtl8169_rx_interrupt to reject the erroneous pkt_size, the system still
freezes, but AFTER LEAVING the interrupt routine rtl8169_interrupt.
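
The -4 and the guard above can be reproduced on paper. A minimal sketch,
assuming the driver takes the packet length from the low 13 bits of the
descriptor status word and subtracts 4 bytes for the CRC; the mask value and
all names here are assumptions for illustration, not copied from r8169.c:

```c
#include <stdint.h>

/* Assumed layout: frame length in the low 13 bits of the status word. */
#define RX_LEN_MASK	0x00001FFFu
/* Assumed length of the trailing CRC that is not handed to the stack. */
#define CRC_LEN		4

/* Packet size as derived from a descriptor status word. */
static int rx_pkt_size(uint32_t desc_status)
{
	return (int)(desc_status & RX_LEN_MASK) - CRC_LEN;
}

/* Hypothetical helper matching the "if (pkt_size < 0) break;" guard. */
static int rx_pkt_size_valid(int pkt_size)
{
	return pkt_size >= 0;
}
```

A status word of 0, i.e. a descriptor the chip never wrote back, gives
rx_pkt_size(0) == -4 under these assumptions, and the guard rejects it.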

The same happens when I introduce a shortcut to get rid of the SYSErr.

For comparison:
When I switch off DAC, then the first interrupts are
1: IntrStatus: 0x0484 (? | TxDescUnavail?), -> tx interrupt
2: IntrStatus: 0x0484, -> tx interrupt
3: IntrStatus: 0x0485, -> rx interrupt
...
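
The status words above can be decoded bit by bit. The sketch below uses
interrupt status bit values believed to match the driver's enum; treat the
table as an assumption and check it against r8169.c. The struct and function
names are invented for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct intr_bit {
	uint16_t mask;
	const char *name;
};

/* Assumed IntrStatus bit assignments, highest bit first. */
static const struct intr_bit intr_bits[] = {
	{ 0x8000, "SYSErr" },
	{ 0x4000, "PCSTimeout" },
	{ 0x0100, "SWInt" },
	{ 0x0080, "TxDescUnavail" },
	{ 0x0040, "RxFIFOOver" },
	{ 0x0020, "RxUnderrun" },
	{ 0x0010, "RxOverflow" },
	{ 0x0008, "TxErr" },
	{ 0x0004, "TxOK" },
	{ 0x0002, "RxErr" },
	{ 0x0001, "RxOK" },
};

/* Writes the names of all known bits set in status into buf (len > 0),
 * separated by " | ", and returns the bits the table does not name. */
static uint16_t intr_describe(uint16_t status, char *buf, size_t len)
{
	uint16_t unknown = status;
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(intr_bits) / sizeof(intr_bits[0]); i++) {
		if (!(status & intr_bits[i].mask))
			continue;
		if (buf[0] != '\0')
			strncat(buf, " | ", len - strlen(buf) - 1);
		strncat(buf, intr_bits[i].name, len - strlen(buf) - 1);
		unknown &= (uint16_t)~intr_bits[i].mask;
	}
	return unknown;
}
```

Under this table, 0x8001 decodes to SYSErr | RxOK with nothing left over,
while 0x0484 decodes to TxDescUnavail | TxOK and leaves 0x0400 unnamed, which
would account for the "?" above.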

Just another thought:
Of course x86-64 has the address-space that enables >4GB RAM, and x86-64
always supports DAC (as stated in include/asm-x86_64/pci.h), but I have
currently only 1GB RAM, so, strictly speaking, DAC is not really necessary.
Strange enough, the latest Realtek driver 2.2 does not even support DAC (only
the lower 32 bit of the DMA-Addresses are written to the registers).
Could it be that the Realtek driver does not support DAC for a good reason?
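
A minimal sketch of the low/high split in plain C, with made-up helper names:
a driver without DAC support writes only the low word, which is sufficient as
long as the high word of every DMA address is zero. This assumes bus addresses
equal physical addresses, which may not hold behind an IOMMU.

```c
#include <stdint.h>

/* Local stand-in for a 32-bit DMA mask (name invented for this sketch). */
#define SKETCH_DMA_32BIT_MASK	0x00000000ffffffffULL

/* Low 32 bits: the part a non-DAC driver writes to the address register. */
static uint32_t dma_addr_low(uint64_t addr)
{
	return (uint32_t)(addr & SKETCH_DMA_32BIT_MASK);
}

/* High 32 bits: only meaningful to the chip when DAC is in use. */
static uint32_t dma_addr_high(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/* An address needs a dual address cycle only if its high word is non-zero. */
static int dma_addr_needs_dac(uint64_t addr)
{
	return dma_addr_high(addr) != 0;
}
```

With 1GB of RAM the highest physical address is below 0x40000000, so under
the stated assumption dma_addr_needs_dac() returns 0 for every buffer.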

Anyway, I will continue searching for the problem...

Hans-Frieder


--
--
Hans-Frieder Vogt e-mail: hfvogt (at) arcor (dot) de

2004-09-15 23:17:00

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Hans-Frieder Vogt <[email protected]> :
[...]
[...]
> Of course x86-64 has the address-space that enables >4GB RAM, and x86-64
> always supports DAC (as stated in include/asm-x86_64/pci.h), but I have
> currently only 1GB RAM, so, strictly speaking, DAC is not really necessary.

Worse than that: r8169 in 2.6.9-rc[1/2] does not advertise its ability to
DMA to high memory.

> Strange enough, the latest Realtek driver 2.2 does not even support DAC (only
> the lower 32 bit of the DMA-Addresses are written to the registers).
> Could it be that the Realtek driver does not support DAC for a good reason?
>
> Anyway, I will continue searching for the problem...

Can you simply try the attached patch with the network cable unplugged ?

It will not fix your issue but if the result & 0x08 != 0, you can probably
stop your testing for now as it will mean "known issue".

--
Ueimor


Attachments:
(No filename) (883.00 B)
r8169-xx0.patch (585.00 B)

2004-09-15 23:42:10

by Hans-Frieder Vogt

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

On Thursday, 16 September 2004 01:09, Francois Romieu wrote:
> Hans-Frieder Vogt <[email protected]> :
> [...]
> [...]
>
> > Of course x86-64 has the address-space that enables >4GB RAM, and x86-64
> > always supports DAC (as stated in include/asm-x86_64/pci.h), but I have
> > currently only 1GB RAM, so, strictly speaking, DAC is not really
> > necessary.
>
> Worse than that: r8169 in 2.6.9-rc[1/2] does not advertise its ability to
> DMA to high memory.
>
> > Strange enough, the latest Realtek driver 2.2 does not even support DAC
> > (only the lower 32 bit of the DMA-Addresses are written to the
> > registers). Could it be that the Realtek driver does not support DAC for
> > a good reason?
> >
> > Anyway, I will continue searching for the problem...
>
> Can you simply try the attached patch with the network cable unplugged ?
>
> It will not fix your issue but if the result & 0x08 != 0, you can probably
> stop your testing for now as it will mean "known issue".
>
> --
> Ueimor

r8169: eth0: Config2 = 0x10

... does not seem to be the already known issue? Anyhow, if you have more
ideas, I will be happy to test them :-)

Hans-Frieder

--
--
Hans-Frieder Vogt e-mail: hfvogt (at) arcor (dot) de

2004-09-16 07:03:41

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Hans-Frieder Vogt <[email protected]> :
[...]
> r8169: eth0: Config2 = 0x10
>
> ... does not seem to be the already known issue? Anyhow, if you have more

Typo of mine: it is. The BSD people have noticed some time ago that the DAC
of the chipset apparently does bad things when the card is inserted in a
32 bits wide slot.

Jon, if your ppc64 box offers 64 bits wide PCI slots, it would be nice if
you could try 2.6.9-rc2-bkX, apply
http://www.fr.zoreil.com/people/francois/misc/r8169-xx0.patch
and report the content of the "Config2" line in the logs of the kernel.

--
Ueimor

2004-09-16 18:56:52

by Jon Mason

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

> Jon, if your ppc64 box offers 64 bits wide PCI slots, it would be nice if
> you could try 2.6.9-rc2-bkX, apply
> http://www.fr.zoreil.com/people/francois/misc/r8169-xx0.patch
> and report the content of the "Config2" line in the logs of the kernel.

Here is the info you requested:

r8169 Gigabit Ethernet driver 1.6LK loaded
eth8: Identified chip type is 'RTL8169s/8110s'.
eth8: RTL8169 at 0xa0000000800b7000, 00:40:f4:96:fc:3f, IRQ 131
r8169: eth8: Config2 = 0x01
r8169: eth8: link up

My p630 has 64bit PCI-X slots, but my r8169 adapter is a 32bit adapter. See
lspci output below (with a 64bit PCI-X e1000 adapter as a comparison).
.....
0001:61:01.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet
Controller (rev 03)
Subsystem: IBM: Unknown device 0289
Flags: bus master, 66Mhz, medium devsel, latency 144, IRQ 122
Memory at cc100000 (64-bit, non-prefetchable) [size=cc000000]
Memory at cc040000 (64-bit, non-prefetchable) [size=256K]
I/O ports at 12ec00 [size=64]
Expansion ROM at 00040000 [disabled]
Capabilities: [dc] Power Management version 2
Capabilities: [e4] Capabilities: [f0] Message Signalled
Interrupts: 64bit+ Queue=0/0 Enable-
......
0002:01:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
Gigabit Ethernet (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet
Flags: bus master, 66Mhz, medium devsel, latency 74, IRQ 131
I/O ports at 20fc00 [size=e0000000]
Memory at e0020000 (32-bit, non-prefetchable) [size=256]
Expansion ROM at 00020000 [disabled]
Capabilities: [dc] Power Management version 2


I think my AMD64 system at home has a 64bit integrated adapter, as the
performance on it is 2.5x faster (either that or r8169 and ppc don't play
well together). I will verify this when I get home.

--
Jon Mason
[email protected]

2004-09-17 15:54:21

by Jon Mason

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

On Thursday 16 September 2004 01:20 pm, Jon Mason wrote:
> > Jon, if your ppc64 box offers 64 bits wide PCI slots, it would be nice if
> > you could try 2.6.9-rc2-bkX, apply
> > http://www.fr.zoreil.com/people/francois/misc/r8169-xx0.patch
> > and report the content of the "Config2" line in the logs of the kernel.
>
> Here is the info you requested:
>
> r8169 Gigabit Ethernet driver 1.6LK loaded
> eth8: Identified chip type is 'RTL8169s/8110s'.
> eth8: RTL8169 at 0xa0000000800b7000, 00:40:f4:96:fc:3f, IRQ 131
> r8169: eth8: Config2 = 0x01
> r8169: eth8: link up
>
> My p630 has 64bit PCI-X slots, but my r8169 adapter is a 32bit adapter.
> See lspci output below (with a 64bit PCI-X e1000 adapter as a comparison).
> .....
> 0001:61:01.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet
> Controller (rev 03)
> Subsystem: IBM: Unknown device 0289
> Flags: bus master, 66Mhz, medium devsel, latency 144, IRQ 122
> Memory at cc100000 (64-bit, non-prefetchable) [size=cc000000]
> Memory at cc040000 (64-bit, non-prefetchable) [size=256K]
> I/O ports at 12ec00 [size=64]
> Expansion ROM at 00040000 [disabled]
> Capabilities: [dc] Power Management version 2
> Capabilities: [e4] Capabilities: [f0] Message Signalled
> Interrupts: 64bit+ Queue=0/0 Enable-
> ......
> 0002:01:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
> Gigabit Ethernet (rev 10)
> Subsystem: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet
> Flags: bus master, 66Mhz, medium devsel, latency 74, IRQ 131
> I/O ports at 20fc00 [size=e0000000]
> Memory at e0020000 (32-bit, non-prefetchable) [size=256]
> Expansion ROM at 00020000 [disabled]
> Capabilities: [dc] Power Management version 2
>
>
> I think my AMD64 system at home has a 64bit integrated adapter, as the
> performance on it is 2.5x faster (either that or r8169 and ppc don't play
> well together). I will verify this when I get home.

Well, it appears that my home system only has a 32bit adapter as well (see
below).

0000:00:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169
Gigabit Ethernet (rev 10)
Subsystem: Micro-Star International Co., Ltd.: Unknown device 702c
Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 16
I/O ports at c800 [size=cff20000]
Memory at cfffb700 (32-bit, non-prefetchable) [size=256]
Expansion ROM at 00020000 [disabled]
Capabilities: [dc] Power Management version 2

Before I make any sweeping comments about the performance on ppc64, I should
probably do some more tests. I'll have to get back to you regarding that.

Would you like me to run the "Config2" patch on my amd64 system?

--
Jon Mason
[email protected]

2004-09-17 16:12:30

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Jon Mason <[email protected]> :
[...]
> Before I make any sweeping comments about the performance on ppc64, I should
> probably do some more tests. I'll have to get back to you regarding that.
>
> Would you like me to run the "Config2" patch on my amd64 system?

Please do. If I read you correctly, 2.6.9-rc2-bkX works (more or less)
on both your ppc64 and amd64 systems, right ?

--
Ueimor

2004-09-19 21:15:31

by Andy Lutomirski

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Francois Romieu wrote:
> Jon Mason <[email protected]> :
> [...]
>
>>Before I make any sweeping comments about the performance on ppc64, I should
>>probably do some more tests. I'll have to get back to you regarding that.
>>
>>Would you like me to run the "Config2" patch on my amd64 system?
>
>
> Please do. If I read you correctly, 2.6.9-rc2-bkX works (more or less)
> on both your ppc64 and amd64 systems, right ?

No, still broken. But I don't see any change at all in 2.6.9-rc2-bk5.
This is on amd64, with 32-bit slots (I think).

BTW, the crash is not immediate. It takes several seconds after trying
to send/receive.

I say _trying_ because I can't ping anything. I haven't had time before
the crash to figure out what's wrong, but the device at the other end
does flash that a packet came through the wire.

FWIW, it looks like init_board is setting PCIDAC in tp->cp_cmd but that
isn't updated to the card until after the rx ring is filled in
r8169_open. This seems suspicious, since DMA memory is being allocated
possibly in >32-bit addresses but the card hasn't been told to support
that. Fixing this doesn't seem to help, though...

Turning off high DMA fixes it. Maybe it just needs to be disabled until
someone figures out what's going on.

--Andy

2004-09-19 21:43:51

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Andy Lutomirski <[email protected]> :
[...]
> FWIW, it looks like init_board is setting PCIDAC in tp->cp_cmd but that
> isn't updated to the card until after the rx ring is filled in
> r8169_open. This seems suspicious, since DMA memory is being allocated
> possibly in >32-bit addresses but the card hasn't been told to support
> that. Fixing this doesn't seem to help, though...

rtl8169_hw_start() writes the CPlusCmd register before the ring descriptor
addresses are set. Can you elaborate why it would not be enough ?

Btw the r8169 driver in 2.6.9-rcX does not advertise NETIF_F_HIGHDMA: where
would a >32 bit address come from ?

> Turning off high DMA fixes it. Maybe it just needs to be disabled until
> someone figures out what's going on.

I am cooking a patch for it (+ check for PCI error).

As a side note, the r8169 chipset does not like DAC to be enabled on a
32bit system. I got the usual PCI error reported while trying it.

--
Ueimor

2004-09-19 23:49:32

by Andy Lutomirski

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Index: 2.6.9-rc2-mm1/drivers/net/r8169.c
===================================================================
--- 2.6.9-rc2-mm1.orig/drivers/net/r8169.c 2004-09-19 16:43:09.725537944 -0700
+++ 2.6.9-rc2-mm1/drivers/net/r8169.c 2004-09-19 16:50:33.900013160 -0700
@@ -1044,8 +1044,6 @@
if (tp->link_ok(ioaddr))
goto out_unlock;

- printk(KERN_WARNING PFX "%s: PHY reset until link up\n", dev->name);
-
tp->phy_reset_enable(ioaddr);

out_mod_timer:


Attachments:
r8169-quiet.txt (456.00 B)

2004-09-20 02:57:09

by Jeff Garzik

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Francois Romieu wrote:
> rtl8169_hw_start() writes the CPlusCmd register before the ring descriptor
> addresses are set. Can you elaborate why it would not be enough ?


That sounds like a bug right there... need all the addresses set up
before we turn on stuff.

Jeff


2004-09-20 07:19:48

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Jeff Garzik <[email protected]> :
[...]
> That sounds like a bug right there... need all the addresses set up
> before we turn on stuff.

The description of the CPlusCmd in the 8169 datasheet includes a small note
which suggests that this register should be set up early.

It does not cost much to try and see if it makes a difference for DAC though.

--
Ueimor

2004-09-20 16:06:35

by Jon Mason

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Andy,
Your setup sounds very similar to mine, and I am not hitting the error. I am
running Gentoo on AMD Athlon(tm) 64 Processor 3200+ with 512MB RAM. My r8169
adapter (8110 chipset) is integrated in my MoBo. How is your setup
different?

Thanks,

--
Jon Mason
[email protected]

2004-09-20 18:00:03

by Jeff Garzik

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Francois Romieu wrote:
> Jeff Garzik <[email protected]> :
> [...]
>
>>That sounds like a bug right there... need all the addresses set up
>>before we turn on stuff.
>
>
> The description of the CPlusCmd in the 8169 datasheet includes a small note
> which suggests that this register should be set up early.
>
> It does not cost much to try and see if it makes a difference for DAC though.

Let me know what happens :)

Jeff



2004-09-20 21:16:14

by Francois Romieu

Subject: Re: 2.6.9-rc1-bk11+ and 2.6.9-rc1-mm3,4 r8169: freeze during boot (FIX included)

Jeff Garzik <[email protected]> :
[...]
> Let me know what happens :)

Nothing conclusive so far.

I applied the patch below on my 32bit system on top of current 2.6.9-rc2-mm1 +
the patch issued yesterday + an extra hack to force PCI DAC on a 32 bit system.
It does not help if I force PCI DAC.
It does not change anything if PCI DAC is not enabled.
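
The reordering tried here can be modelled in a few lines of plain C. This is
a toy model only: the struct, its fields, and all function names are
hypothetical and exist purely to show the two orderings of ring address
setup, the CPlusCmd write, and the Tx/Rx enable. It says nothing about
whether the ordering matters to the real chip.

```c
#include <stdint.h>

/* Hypothetical device model; none of these names exist in the driver. */
struct mock_nic {
	uint64_t tx_desc_addr;	/* Tx descriptor ring bus address */
	uint64_t rx_desc_addr;	/* Rx descriptor ring bus address */
	int addrs_set;		/* ring addresses have been programmed */
	int cpcmd_set;		/* CPlusCmd has been written */
	int enabled;		/* Tx/Rx engines are running */
	int enabled_early;	/* engines were started before setup finished */
};

static void mock_set_ring_addrs(struct mock_nic *n, uint64_t tx, uint64_t rx)
{
	n->tx_desc_addr = tx;
	n->rx_desc_addr = rx;
	n->addrs_set = 1;
}

static void mock_write_cpcmd(struct mock_nic *n)
{
	n->cpcmd_set = 1;
}

static void mock_enable_txrx(struct mock_nic *n)
{
	/* Record whether the engines start before everything is in place. */
	if (!n->addrs_set || !n->cpcmd_set)
		n->enabled_early = 1;
	n->enabled = 1;
}

/* Order before the patch: enable, CPlusCmd, ring addresses.
 * Returns 1 if all setup preceded the enable, 0 otherwise. */
static int hw_start_old_order(struct mock_nic *n, uint64_t tx, uint64_t rx)
{
	mock_enable_txrx(n);
	mock_write_cpcmd(n);
	mock_set_ring_addrs(n, tx, rx);
	return !n->enabled_early;
}

/* Order after the patch: ring addresses, CPlusCmd, enable. */
static int hw_start_new_order(struct mock_nic *n, uint64_t tx, uint64_t rx)
{
	mock_set_ring_addrs(n, tx, rx);
	mock_write_cpcmd(n);
	mock_enable_txrx(n);
	return !n->enabled_early;
}
```

In this model only the new order starts the engines with the ring addresses
and CPlusCmd already in place; on the real hardware the change made no
visible difference, as reported above.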


diff -puN drivers/net/r8169.c~r8169-145b-4 drivers/net/r8169.c
--- linux-2.6.9-rc2/drivers/net/r8169.c~r8169-145b-4 2004-09-20 21:39:20.000000000 +0200
+++ linux-2.6.9-rc2-fr/drivers/net/r8169.c 2004-09-20 21:39:20.000000000 +0200
@@ -1483,6 +1483,11 @@ rtl8169_hw_start(struct net_device *dev)
void *ioaddr = tp->mmio_addr;
u32 i;

+ RTL_W32(TxDescStartAddrLow, ((u64) tp->TxPhyAddr & DMA_32BIT_MASK));
+ RTL_W32(TxDescStartAddrHigh, ((u64) tp->TxPhyAddr >> 32));
+ RTL_W32(RxDescAddrLow, ((u64) tp->RxPhyAddr & DMA_32BIT_MASK));
+ RTL_W32(RxDescAddrHigh, ((u64) tp->RxPhyAddr >> 32));
+
/* Soft reset the chip. */
RTL_W8(ChipCmd, CmdReset);

@@ -1494,7 +1499,6 @@ rtl8169_hw_start(struct net_device *dev)
}

RTL_W8(Cfg9346, Cfg9346_Unlock);
- RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
RTL_W8(EarlyTxThres, EarlyTxThld);

// For gigabit rtl8169
@@ -1509,24 +1513,9 @@ rtl8169_hw_start(struct net_device *dev)
RTL_W32(TxConfig,
(TX_DMA_BURST << TxDMAShift) | (InterFrameGap <<
TxInterFrameGapShift));
- tp->cp_cmd |= RTL_R16(CPlusCmd);
- RTL_W16(CPlusCmd, tp->cp_cmd);
-
- if (tp->mac_version == RTL_GIGA_MAC_VER_D) {
- dprintk(KERN_INFO PFX "Set MAC Reg C+CR Offset 0xE0. "
- "Bit-3 and bit-14 MUST be 1\n");
- tp->cp_cmd |= (1 << 14) | PCIMulRW;
- RTL_W16(CPlusCmd, tp->cp_cmd);
- }
-
tp->cur_rx = 0;

- RTL_W32(TxDescStartAddrLow, ((u64) tp->TxPhyAddr & DMA_32BIT_MASK));
- RTL_W32(TxDescStartAddrHigh, ((u64) tp->TxPhyAddr >> 32));
- RTL_W32(RxDescAddrLow, ((u64) tp->RxPhyAddr & DMA_32BIT_MASK));
- RTL_W32(RxDescAddrHigh, ((u64) tp->RxPhyAddr >> 32));
RTL_W8(Cfg9346, Cfg9346_Lock);
- udelay(10);

RTL_W32(RxMissed, 0);

@@ -1538,6 +1527,17 @@ rtl8169_hw_start(struct net_device *dev)
/* Enable all known interrupts by setting the interrupt mask. */
RTL_W16(IntrMask, rtl8169_intr_mask);

+ tp->cp_cmd |= RTL_R16(CPlusCmd);
+
+ if (tp->mac_version == RTL_GIGA_MAC_VER_D) {
+ dprintk(KERN_INFO PFX "Set MAC Reg C+CR Offset 0xE0. "
+ "Bit-3 and bit-14 MUST be 1\n");
+ tp->cp_cmd |= (1 << 14) | PCIMulRW;
+ }
+
+ RTL_W16(CPlusCmd, tp->cp_cmd);
+ RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb);
+
netif_start_queue(dev);
}


_