2011-02-16 22:43:44

by Alexander Wuerstlein

Subject: r8169 hangs machine on kernel boot (bisected)

Hello,

I've just tried to boot a new computer featuring a Realtek r8168 onboard
(lspci calls it Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI
Express Gigabit Ethernet controller [10ec:8168] (rev 04), see [1])
network chip with kernel 2.6.38-rc5 and current git. Both hang on boot
just after USB device initialization and just before the kernel usually
does DHCP. The previous 2.6.37 didn't hang on boot, but showed strange
behaviour (only 10MBit half duplex on autonegotiation, tons of errors on
the switch interface[2]) which is why I tried the newer kernel in hopes
that there would be fixes.

I bisected the hang-before-dhcp bug down to commit 'r8169: magic.'
(b646d90053f887c1bc243191e693a9b02d09f2c2, also see [1]). Since the
commit really does its description justice and looks like some weird
hardware magic, I'd like to ask the wizards on how to proceed with
fixing it.
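
For readers unfamiliar with the bisection mentioned above: it is a binary search over the commit history between a known-good and a known-bad kernel (presumably done here with `git bisect`, see the bisect log in [1]). The Python sketch below only illustrates the idea. The commit names and the `is_bad` callback are hypothetical stand-ins for the real history and for "build, boot, and see whether the machine hangs".

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is known
    good, the last one is known bad, and every commit after the first bad
    one is bad as well (the usual bisection assumption).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression was introduced after mid
    return commits[hi]


# Hypothetical history: the names are placeholders, not real commit ids.
history = ["v2.6.37", "c1", "c2", "c3", "c4", "v2.6.38-rc5"]
hangs = {"c3", "c4", "v2.6.38-rc5"}
print(first_bad_commit(history, lambda c: c in hangs))
```

With this made-up history the search needs only two test boots and reports `c3` as the first bad commit.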



Ciao,

Alexander Wuerstlein.

[1] Kernel .config, lspci, bisect log:
http://wwwcip.cs.fau.de/~snalwuer/kernel-r8169/
[2] the Realtek-supplied r8168 doesn't show those problems
[3] feel free to criticise my Cc:, I wasn't quite sure...


2011-02-18 05:44:26

by Hayes Wang

Subject: RE: r8169 hangs machine on kernel boot (bisected)

Hello,

I have tried the 8168DP and it doesn't hang. I think I need more information.
Could you give me information about the motherboard and the version of the BIOS?
Also, please use the Realtek driver and dump the MAC information using
"ethtool -d eth0". These would help to find out what happens. Thanks.

Best Regards,
Hayes


-----Original Message-----
From: Alexander Wuerstlein [mailto:[email protected]]
Sent: Thursday, February 17, 2011 6:35 AM
To: François Romieu
Cc: Hayeswang; David S. Miller; [email protected]
Subject: r8169 hangs machine on kernel boot (bisected)

Hello,

I've just tried to boot a new computer featuring a Realtek r8168 onboard (lspci
calls it Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit
Ethernet controller [10ec:8168] (rev 04), see [1]) network chip with kernel
2.6.38-rc5 and current git. Both hang on boot just after USB device
initialization and just before the kernel usually does DHCP. The previous 2.6.37
didn't hang on boot, but showed strange behaviour (only 10MBit half duplex on
autonegotiation, tons of errors on the switch interface[2]) which is why I tried
the newer kernel in hopes that there would be fixes.

I bisected the hang-before-dhcp bug down to commit 'r8169: magic.'
(b646d90053f887c1bc243191e693a9b02d09f2c2, also see [1]). Since the commit
really does its description justice and looks like some weird hardware magic,
I'd like to ask the wizards on how to proceed with fixing it.



Ciao,

Alexander Wuerstlein.

[1] Kernel .config, lspci, bisect log:
http://wwwcip.cs.fau.de/~snalwuer/kernel-r8169/
[2] the Realtek-supplied r8168 doesn't show those problems
[3] feel free to criticise my Cc:, I wasn't quite sure...

2011-02-18 17:30:21

by Alexander Wuerstlein

Subject: Re: r8169 hangs machine on kernel boot (bisected)

Hello,

On 11-02-18 15:01, hayeswang <[email protected]> wrote:
> Hello,
> > I've just tried to boot a new computer featuring a Realtek r8168 onboard (lspci
> > calls it Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit
> > Ethernet controller [10ec:8168] (rev 04), see [1]) network chip with kernel
> > 2.6.38-rc5 and current git. Both hang on boot just after USB device
> > initialization and just before the kernel usually does DHCP. The previous 2.6.37
> > didn't hang on boot, but showed strange behaviour (only 10MBit half duplex on
> > autonegotiation, tons of errors on the switch interface[2]) which is why I tried
> > the newer kernel in hopes that there would be fixes.
> >
> > I bisected the hang-before-dhcp bug down to commit 'r8169: magic.'
> > (b646d90053f887c1bc243191e693a9b02d09f2c2, also see [1]). Since the commit
> > really does its description justice and looks like some weird hardware magic,
> > I'd like to ask the wizards on how to proceed with fixing it.
>
> I have tried the 8168DP and it doesn't hang. I think I need more information.
> Could you give me information about the motherboard and the version of the BIOS?
> Also, please use the Realtek driver and dump the MAC information using
> "ethtool -d eth0". These would help to find out what happens. Thanks.

I'm sorry that it took me so long to answer; I had some problems
booting the machine today since the PXE client didn't get any DHCP
requests through. I'm not sure if it's related, but the problem cleared
up after I opened up the machine to get the motherboard data.

Long story short, new files 'bios', 'hardware' and 'ethtool-d' in
http://wwwcip.cs.fau.de/~snalwuer/kernel-r8169
BIOS revision is the latest available from Fujitsu for that hardware.
Hardware numbers and serials are accurate, but the inscriptions on the
chips I suspected to be PHYs were very hard to read, so expect errors.
The 'ethtool-d' is especially curious, since it consists of only
'FF'-bytes. The ethtool register dump was taken on a 2.6.37 with r8168
as a module.
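
As background on why an all-'FF' dump is suspicious: reads from a PCI device that does not respond (for example because it is powered down) commonly come back as all ones, so such a dump may mean the registers were never actually read. A small Python sketch for checking a raw dump for this pattern follows. The function name and the sample data are made up for illustration and are not part of ethtool or the driver.

```python
def looks_unresponsive(dump: bytes) -> bool:
    """Return True if the dump is non-empty and every byte is 0xFF."""
    return len(dump) > 0 and all(b == 0xFF for b in dump)


# Made-up sample data, not taken from the real 'ethtool-d' file.
sample_dead = bytes([0xFF] * 256)
sample_live = bytes([0x00, 0x1B, 0x21, 0xFF])
print(looks_unresponsive(sample_dead), looks_unresponsive(sample_live))
```

This prints `True False` for the two samples. A real check would first have to parse the hex text that `ethtool -d` prints into bytes.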

fujitsu.com also has some (limited, end-user-focused) hardware
documentation if you enter the serial number 'YL7E003277' on
http://ts.fujitsu.com/support/downloads.html

If you need any further information, don't hesitate to ask.



Ciao,

Alexander Wuerstlein.

2011-02-21 02:05:52

by Hayes Wang

Subject: RE: r8169 hangs machine on kernel boot (bisected)

Hello,

> From: Alexander Wuerstlein [mailto:[email protected]]
> Sent: Saturday, February 19, 2011 1:30 AM
> To: Hayeswang
> Cc: 'David S. Miller'; [email protected]; 'François Romieu'
> Subject: Re: r8169 hangs machine on kernel boot (bisected)
>
> Hello,
>
> On 11-02-18 15:01, hayeswang <[email protected]> wrote:
> > Hello,
> > > I've just tried to boot a new computer featuring a Realtek r8168
> > > onboard (lspci calls it Realtek Semiconductor Co., Ltd.
> > > RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168]
> > > (rev 04), see [1]) network chip with kernel 2.6.38-rc5 and current
> > > git. Both hang on boot just after USB device initialization and
> > > just before the kernel usually does DHCP. The previous 2.6.37
> > > didn't hang on boot, but showed strange behaviour (only 10MBit half
> > > duplex on autonegotiation, tons of errors on the switch
> > > interface[2]) which is why I tried the newer kernel in hopes that
> > > there would be fixes.
> > >
> > > I bisected the hang-before-dhcp bug down to commit 'r8169: magic.'
> > > (b646d90053f887c1bc243191e693a9b02d09f2c2, also see [1]). Since the
> > > commit really does its description justice and looks like some
> > > weird hardware magic, I'd like to ask the wizards on how to proceed
> > > with fixing it.
> >
> > I have tried the 8168DP and it doesn't hang. I think I need more
> > information. Could you give me information about the motherboard and
> > the version of the BIOS? Also, please use the Realtek driver and dump
> > the MAC information using "ethtool -d eth0". These would help to find
> > out what happens. Thanks.
>
> I'm sorry that it took me so long to answer; I had some
> problems booting the machine today since the PXE client
> didn't get any DHCP requests through. I'm not sure if it's
> related, but the problem cleared up after I opened up the
> machine to get the motherboard data.
>
> Long story short, new files 'bios', 'hardware' and 'ethtool-d' in
> http://wwwcip.cs.fau.de/~snalwuer/kernel-r8169
> BIOS revision is the latest available from Fujitsu for that hardware.
> Hardware numbers and serials are accurate, but the
> inscriptions on the chips I suspected to be PHYs were very
> hard to read, so expect errors.
> The 'ethtool-d' is especially curious, since it consists of
> only 'FF'-bytes. The ethtool register dump was taken on a
> 2.6.37 with r8168 as a module.

Thanks for the information you provided. I will forward it to the relevant
people. However, it is curious that all registers read FF. I will try to
reproduce the issue first.

>
> fujitsu.com also has some (limited, end-user-focused)
> hardware documentation if you enter the serial number
> 'YL7E003277' on http://ts.fujitsu.com/support/downloads.html
>
> If you need any further information, don't hesitate to ask.
>


Best Regards,
Hayes

2011-02-22 12:31:16

by Hayes Wang

Subject: RE: r8169 hangs machine on kernel boot (bisected)

Hello,

Please try the attached patch and check whether it fixes the issue. Thanks.

Best Regards,
Hayes



Attachments:
0002-net-r8169-fix-the-wrong-parameter-of-point-address.patch (1.13 kB)

2011-02-22 14:38:49

by Alexander Wuerstlein

Subject: Re: r8169 hangs machine on kernel boot (bisected)

On 11-02-22 13:31, hayeswang <[email protected]> wrote:
> Hello,
>
> Please try the attached patch and check whether it fixes the issue. Thanks.

Yes, the patch you sent, applied to a kernel from git (rev ee715087024b),
fixes the issue. I'll also test it on -rc6 later and report.

Thanks a lot for your help!


Ciao,

Alexander Wuerstlein.

2011-02-23 22:10:40

by Alexander Wuerstlein

Subject: Re: r8169 hangs machine on kernel boot (bisected)

On 11-02-22 13:31, hayeswang <[email protected]> wrote:
> Hello,
>
> Please try the attached patch and check whether it fixes the issue. Thanks.

I also just tested with 2.6.38-rc5 (in addition to the test with current
git which I mailed about yesterday) and can confirm that your patch
fixes the issue there as well.


Thanks again.




Ciao,

Alexander Wuerstlein.