2017-12-20 08:41:50

by Chris Chiu

Subject: r8169 take too long to complete driver initialization

Hi,
We've hit a suspend/resume issue on an Acer desktop caused by the r8169
driver. The dmesg
https://gist.github.com/mschiu77/b741849b5070281daaead8dfee312d1a
shows it's still in msleep() while holding a mutex.
After looking into the code, the delay comes from
rtl8168ep_stop_cmac(), which waits up to 100 seconds for
rtl_ocp_tx_cond. The following dmesg line shows that the r8169 driver is
loaded.

[ 20.270526] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded

But it takes more than 100 seconds before the following messages appear:

[ 140.400223] r8169 0000:02:00.0 (unnamed net_device) (uninitialized): rtl_ocp_tx_cond == 1 (loop: 2000, delay: 50).
[ 140.413294] r8169 0000:02:00.0 eth0: RTL8168ep/8111ep at 0xffffb16c80db1000, f8:0f:41:ea:74:0d, XID 10200800 IRQ 46
[ 140.413297] r8169 0000:02:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]

So any attempt to suspend the machine during this period always
fails with a device/resource busy message and aborts. Is this
rtl_ocp_tx_cond wait necessary? The Ethernet still works and I don't see
any problem. I don't know whether this should be considered normal.
Please let me know if any more information is required. Thanks

Chris
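
As a rough illustration of where the 100-second figure comes from: the
log line reports loop: 2000 and delay: 50, and if the delay is in
milliseconds (as the msleep() in the trace suggests), a poll that never
sees its condition satisfied waits 2000 x 50 ms = 100 s. The sketch
below is a user-space simulation under that assumption, not the driver
code; poll_until(), struct poll_result and stuck_high() are hypothetical
names, and no real sleeping is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical condition callback: reports the current state of a status bit. */
typedef bool (*cond_fn)(void *ctx);

/* Outcome of a bounded poll: whether it succeeded and how long it "waited". */
struct poll_result {
	bool ok;
	unsigned int waited_ms;
};

/*
 * Poll cond up to `loops` times, accounting `delay_ms` before each check,
 * and stop as soon as cond(ctx) equals `want`. The delay is only summed,
 * never actually slept.
 */
static struct poll_result poll_until(cond_fn cond, void *ctx, bool want,
				     unsigned int delay_ms, unsigned int loops)
{
	struct poll_result r = { false, 0 };
	unsigned int i;

	assert(cond != NULL);
	for (i = 0; i < loops; i++) {
		r.waited_ms += delay_ms;
		if (cond(ctx) == want) {
			r.ok = true;
			break;
		}
	}
	return r;
}

/* A condition that never goes low, like the one reported in the log. */
static bool stuck_high(void *ctx)
{
	(void)ctx;
	return true;
}
```

For example, poll_until(stuck_high, NULL, false, 50, 2000) reports
ok == false and waited_ms == 100000, i.e. 100 seconds.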


2018-01-05 02:17:38

by Chris Chiu

Subject: Re: r8169 take too long to complete driver initialization


gentle ping,

cheers.

2018-01-29 10:13:09

by Chris Chiu

Subject: Re: r8169 take too long to complete driver initialization


Hi,
I just found an r8168 driver that seems to be authorized by Realtek and used it
for cross comparison. I applied the patch to the latest 4.15 kernel and the
driver finished its initialization in a fairly short time. The patch file is here:
https://gist.github.com/mschiu77/fcf406e64a1a437f46cf2be643f1057d

In mainline r8169.c, the IBISR0 register is polled in rtl8168ep_stop_cmac().
The patch file has the same IBISR0 polling code in Dash2DisableTx(),
but it is bypassed when the chipset matches HW_DASH_SUPPORT_TYPE_2.
Per rtl_chip_info[] in r8168_n.c, CFG_METHOD_23/27/28 are
HW_DASH_SUPPORT_TYPE_2, and they happen to be the only three entries
named RTL8168EP/8111EP in rtl_chip_info[].

Looking for the matching chips in r8169.c, RTL_GIGA_MAC_VER_49/50/51
seem to share the same config. Can anyone clarify whether rtl_ocp_tx_cond()
is really necessary for 8168EP/8111EP?
Or can we just skip the condition check for RTL_GIGA_MAC_VER_49/50/51?

Chris

2018-01-29 15:25:05

by ChunHao Lin

Subject: RE: r8169 take too long to complete driver initialization

Hi Chris,

Could you test the following patch?

 DECLARE_RTL_COND(rtl_ocp_tx_cond)
 {
     void __iomem *ioaddr = tp->mmio_addr;

-    return RTL_R8(IBISR0) & 0x02;
+    return RTL_R8(IBISR0) & 0x20;
 }

 static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
 {
     void __iomem *ioaddr = tp->mmio_addr;

     RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
-    rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+    rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
     RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
     RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
 }

Thanks.

------Please consider the environment before printing this e-mail.


2018-01-30 12:08:55

by Chris Chiu

Subject: Re: r8169 take too long to complete driver initialization

On Mon, Jan 29, 2018 at 11:24 PM, Hau <[email protected]> wrote:
> Hi Chris,
>
> Could you test the following patch?
>
> DECLARE_RTL_COND(rtl_ocp_tx_cond)
> {
> void __iomem *ioaddr = tp->mmio_addr;
>
> - return RTL_R8(IBISR0) & 0x02;
> + return RTL_R8(IBISR0) & 0x20;
> }
>
> static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
> {
> void __iomem *ioaddr = tp->mmio_addr;
>
> RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
> - rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
> + rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
> RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
> RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
> }
>
> Thanks.
>

Yes. It completes the initialization in 70 ms. So does that mean rtl_ocp_tx_cond
was waiting on the wrong register bit? Can you help work out a patch for this?

Chris
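
One way to read the patch and the test result: the old code polled for
bit 0x02 of IBISR0 to go low, while the patched code polls for bit 0x20
to go high. The user-space sketch below simulates that difference. It
assumes, purely for illustration, that bit 0x02 stays set on this
hardware and that bit 0x20 is already set on the first read; the real
register semantics are not stated in this thread. struct fake_nic,
read_ibisr0() and polls_needed() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated device state; both fields are illustrative assumptions. */
struct fake_nic {
	unsigned int reads;       /* how many times the register was read */
	unsigned int ready_after; /* read count at which bit 0x20 becomes set */
};

/* Simulated IBISR0 read: bit 0x02 is always set, bit 0x20 appears later. */
static uint8_t read_ibisr0(struct fake_nic *nic)
{
	uint8_t v = 0x02;

	nic->reads++;
	if (nic->reads >= nic->ready_after)
		v |= 0x20;
	return v;
}

/*
 * Poll until ((reg & mask) != 0) equals `want`.
 * Returns the number of polls used, or 0 if max_loops polls were not enough.
 */
static unsigned int polls_needed(struct fake_nic *nic, uint8_t mask,
				 bool want, unsigned int max_loops)
{
	unsigned int i;

	assert(max_loops > 0);
	for (i = 1; i <= max_loops; i++) {
		bool set = (read_ibisr0(nic) & mask) != 0;

		if (set == want)
			return i;
	}
	return 0;
}
```

With ready_after = 1, polls_needed(&nic, 0x02, false, 2000) gives up
after all 2000 polls and returns 0, while polls_needed(&nic, 0x20, true,
2000) returns after the first poll.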



2018-02-02 02:04:37

by Chris Chiu

Subject: Re: r8169 take too long to complete driver initialization


Gentle ping,
cheers.

Chris


2018-02-02 11:50:48

by ChunHao Lin

Subject: RE: r8169 take too long to complete driver initialization



I have submitted the patch to the kernel team.
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=086ca23d03c0d2f4088f472386778d293e15c5f6



2018-02-05 04:18:01

by Chris Chiu

Subject: Re: r8169 take too long to complete driver initialization

On Fri, Feb 2, 2018 at 7:49 PM, Hau <[email protected]> wrote:
>
> I have submitted the patch to the kernel team.
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=086ca23d03c0d2f4088f472386778d293e15c5f6
>
>

Cool. Thanks for this.

Chris
