2015-02-03 10:15:50

by Tomas Szepe

[permalink] [raw]
Subject: 1e918876 breaks r8169 (linux-3.18+)

Hi,

Since linux-3.18.0, r8169 is having problems driving one of my add-on
PCIe NICs. The interface is losing link for several seconds at a time,
the frequency being about once a minute when the traffic is high.

The first loss of link is accompanied by the message "NETDEV WATCHDOG:
eth1 (r8169): transmit queue 0 timed out" and a call trace, while
subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
(w/o the complementary "link down" message).

I've traced the culprit down to commit 1e918876, "r8169: add support
for Byte Queue Limits" by Florian Westphal <[email protected]>. Reverting
the patch appears to fix the problem on linux-3.18.5.

The same issue might already have been reported by Marco Berizzi here:
http://lkml.org/lkml/2014/12/11/65

The NIC is:
01:00.0 0200: 10ec:8168 (Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller) (rev 01)
Subsystem: 10ec:8168 (Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller)
Flags: bus master, fast devsel, latency 0, IRQ 27
I/O ports at e000 [size=256]
Memory at f7d00000 (64-bit, non-prefetchable) [size=4K]
Expansion ROM at f7d20000 [disabled] [size=128K]
Capabilities: [40] Power Management version 2
Capabilities: [48] Vital Product Data
Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [84] Vendor Specific Information: Len=4c <?>
Capabilities: [100] Advanced Error Reporting
Capabilities: [12c] Virtual Channel
Capabilities: [148] Device Serial Number 98-2d-00-00-10-ec-81-68
Capabilities: [154] Power Budgeting <?>

Best regards,
--
Tomas Szepe <[email protected]>


2015-02-03 10:42:27

by Florian Westphal

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Tomas Szepe <[email protected]> wrote:
> Hi,
>
> Since linux-3.18.0, r8169 is having problems driving one of my add-on
> PCIe NICs. The interface is losing link for several seconds at a time,
> the frequency being about once a minute when the traffic is high.
>
> The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> (w/o the complementary "link down" message).
>
> I've traced the culprit down to commit 1e918876, "r8169: add support
> for Byte Queue Limits" by Florian Westphal <[email protected]>. Reverting
> the patch appears to fix the problem on linux-3.18.5.
> The same issue might already have been reported by Marco Berizzi here:
> http://lkml.org/lkml/2014/12/11/65

Thanks for reporting this! I'm no lkml subscriber and thus did not
see earlier report.

I'll try to reproduce this but unfortunately I am currently travelling
and won't have access to my r8169 nic until Feb 10th.

If you're willing to experiment (not even compile tested):

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5067,8 +5067,6 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
RTL_W8(ChipCmd, CmdReset);

rtl_udelay_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
-
- netdev_reset_queue(tp->dev);
}

static void rtl_request_uncached_firmware(struct rtl8169_private *tp)
@@ -6773,6 +6771,7 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
{
rtl8169_tx_clear_range(tp, tp->dirty_tx, NUM_TX_DESC);
tp->cur_tx = tp->dirty_tx = 0;
+ netdev_reset_queue(tp->dev);
}

static void rtl_reset_work(struct rtl8169_private *tp)
@@ -7231,9 +7230,9 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
tx_left--;
}

- if (tp->dirty_tx != dirty_tx) {
- netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);
+ netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);

+ if (tp->dirty_tx != dirty_tx) {
u64_stats_update_begin(&tp->tx_stats.syncp);
tp->tx_stats.packets += pkts_compl;
tp->tx_stats.bytes += bytes_compl;
@@ -7263,6 +7262,8 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)

RTL_W8(TxPoll, NPQ);
}
+ } else {
+ WARN_ON_ONCE(pkts_compl);
}
}

2015-02-05 16:56:33

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Ehlo,

> > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > PCIe NICs. The interface is losing link for several seconds at a time,
> > the frequency being about once a minute when the traffic is high.
> > ...
>
> Thanks for reporting this! I'm no lkml subscriber and thus did not
> see earlier report.

Np, I just like my eth interfaces being up when they're supposed to be up. :)

> I'll try to reproduce this but unfortunately I am currently travelling
> and won't have access to my r8169 nic until Feb 10th.
>
> If you're willing to experiment (not even compile tested):
>
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> ...

I'm happy to report this patch fixes the problem (tested on vanilla
3.18.5), and the throughput of the link seems fine in both directions,
too.

Best regards,
--
Tomáš Szépe <[email protected]>

2015-02-06 14:05:19

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

> > > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > > PCIe NICs. The interface is losing link for several seconds at a time,
> > > the frequency being about once a minute when the traffic is high.
> > > ...
> >
> > Thanks for reporting this! I'm no lkml subscriber and thus did not
> > see earlier report.
>
> Np, I just like my eth interfaces being up when they're supposed to be up. :)
>
> > I'll try to reproduce this but unfortunately I am currently travelling
> > and won't have access to my r8169 nic until Feb 10th.
> >
> > If you're willing to experiment (not even compile tested):
> >
> > diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> > ...
>
> I'm happy to report this patch fixes the problem (tested on vanilla
> 3.18.5), and the throughput of the link seems fine in both directions,
> too.

Unfortunately, I have to take this back. I made the conclusion too early.
The problem appears with this patch applied, too, only perhaps later and
with a different frequency pattern.

Sorry about the confusion,
--
Tomas Szepe <[email protected]>

2015-02-08 12:18:43

by Holger Hoffstätte

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

On Fri, 06 Feb 2015 15:04:50 +0100, Tomas Szepe wrote:

> Unfortunately, I have to take this back. I made the conclusion too early.
> The problem appears with this patch applied, too, only perhaps later and
> with a different frequency pattern.

+1 can confirm - I also see the stack trace in question [1] from time to
time, however even under low/moderate load and more or less randomly.
This also used to happen with 3.14.x, so I don't think it has anything
to do woth the BQL patch pe se.

I *think* - and have suspected for some time now - that this is caused by
a change in TSO/GSO or other offload settings, and that some r8169 models
consequently suffer occasional (or in your case frequent) hiccups.
BQL and/or a qdisc doing requeues might just interfer with whatever upsets
the driver or HW.

I used to turn on all possible settings via ethtool, but am now back to
defaults, which disables sg/tso/gso. Exactly which of these settings is the
culprit remains to be seen.

Holger

[1] https://lkml.org/lkml/2014/12/11/65

2015-02-10 15:45:49

by Florian Westphal

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Florian Westphal <[email protected]> wrote:
> Tomas Szepe <[email protected]> wrote:
> > Hi,
> >
> > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > PCIe NICs. The interface is losing link for several seconds at a time,
> > the frequency being about once a minute when the traffic is high.
> >
> > The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> > eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> > (w/o the complementary "link down" message).
> >
> > I've traced the culprit down to commit 1e918876, "r8169: add support
> > for Byte Queue Limits" by Florian Westphal <[email protected]>. Reverting
> > the patch appears to fix the problem on linux-3.18.5.
> > The same issue might already have been reported by Marco Berizzi here:
> > http://lkml.org/lkml/2014/12/11/65
>
> Thanks for reporting this! I'm no lkml subscriber and thus did not
> see earlier report.
>
> I'll try to reproduce this but unfortunately I am currently travelling
> and won't have access to my r8169 nic until Feb 10th.

I tried to reproduce this without success so far on my RTL8168d/8111d device.
I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
last 5 hours and so far I saw no watchdog tx timeouts.

I'll keep this running for a day or so to see if it just takes more time
to trigger.

Do you use any "special" settings (e.g. offload features off, mtu > 1500, etc)?

Thanks.

2015-02-11 09:47:12

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Hi,

> > > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > > PCIe NICs. The interface is losing link for several seconds at a time,
> > > the frequency being about once a minute when the traffic is high.
> > >
> > > The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> > > eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> > > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> > > (w/o the complementary "link down" message).
> > >
> > > I've traced the culprit down to commit 1e918876, "r8169: add support
> > > for Byte Queue Limits" by Florian Westphal <[email protected]>. Reverting
> > > the patch appears to fix the problem on linux-3.18.5.
> > > The same issue might already have been reported by Marco Berizzi here:
> > > http://lkml.org/lkml/2014/12/11/65
> >
> > Thanks for reporting this! I'm no lkml subscriber and thus did not
> > see earlier report.
> >
> > I'll try to reproduce this but unfortunately I am currently travelling
> > and won't have access to my r8169 nic until Feb 10th.
>
> I tried to reproduce this without success so far on my RTL8168d/8111d device.

I was afraid it might come to this. Of all the ~15 r8169 interfaces (mostly
onboard) I've got running in about 10 machines, only a single one is affected.

> I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> last 5 hours and so far I saw no watchdog tx timeouts.
>
> I'll keep this running for a day or so to see if it just takes more time
> to trigger.

Ok, but if the bug were to manifest itself in the same manner as it does
over here, it wouldn't require so much pressure.

> Do you use any "special" settings (e.g. offload features off, mtu > 1500, etc)?

Nothing special.

modprobe r8169
nameif eth1 xx:xx:xx:xx:xx:xx
ip link set eth1 up
ip addr add 192.168.x.y/24 brd + dev eth1
And that's it, really.

Seeya,
--
Tomáš Szépe <[email protected]>

2015-02-21 10:15:39

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

> > > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > > PCIe NICs. The interface is losing link for several seconds at a time,
> > > the frequency being about once a minute when the traffic is high.
> > >
> > > The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> > > eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> > > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> > > (w/o the complementary "link down" message).
> > >
> > > I've traced the culprit down to commit 1e918876, "r8169: add support
> > > for Byte Queue Limits" by Florian Westphal <[email protected]>. Reverting
> > > the patch appears to fix the problem on linux-3.18.5.
> > > The same issue might already have been reported by Marco Berizzi here:
> > > http://lkml.org/lkml/2014/12/11/65
> >
> > Thanks for reporting this! I'm no lkml subscriber and thus did not
> > see earlier report.
> >
> > I'll try to reproduce this but unfortunately I am currently travelling
> > and won't have access to my r8169 nic until Feb 10th.
>
> I tried to reproduce this without success so far on my RTL8168d/8111d device.
> I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> last 5 hours and so far I saw no watchdog tx timeouts.
>
> I'll keep this running for a day or so to see if it just takes more time
> to trigger.

So, how's this coming along? Don't you think the patch should be reverted
until the problem is diagnosed/understood/fixed?

--
Tomas Szepe <[email protected]>

2015-02-21 10:31:15

by Florian Westphal

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Tomas Szepe <[email protected]> wrote:
> > I tried to reproduce this without success so far on my RTL8168d/8111d device.
> > I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> > last 5 hours and so far I saw no watchdog tx timeouts.
> >
> > I'll keep this running for a day or so to see if it just takes more time
> > to trigger.
>
> So, how's this coming along? Don't you think the patch should be reverted
> until the problem is diagnosed/understood/fixed?

Sorry.

David, please consider reverting

1e918876853aa85435e0f17fd8b4a92dcfff53d6
(r8169: add support for Byte Queue Limits)

and

0bec3b700d106a8b0a34227b2976d1a582f1aab7
(r8169: add support for xmit_more)

I cannot reproduce any hangs (tried for 2days with 40 parallel
netperfs using both 100mbit and 1gbit receiver).

And I don't see anything wrong with the change either.
Seems like some revisions of the HW are just dodgy?

I hate giving up, but I have no means to diagnose this any further.
Even reporter says it doesn't affect all of his r8169 nics.

So I think the change is correct per se, but might be revealing some
HW/firmware bug.

2015-02-21 10:57:43

by Holger Hoffstätte

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

On Sat, 21 Feb 2015 11:31:04 +0100, Florian Westphal wrote:

> Tomas Szepe <[email protected]> wrote:
>> > I tried to reproduce this without success so far on my RTL8168d/8111d device.
>> > I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
>> > last 5 hours and so far I saw no watchdog tx timeouts.
>> >
>> > I'll keep this running for a day or so to see if it just takes more time
>> > to trigger.
>>
>> So, how's this coming along? Don't you think the patch should be reverted
>> until the problem is diagnosed/understood/fixed?
>
> Sorry.
>
> David, please consider reverting
>
> 1e918876853aa85435e0f17fd8b4a92dcfff53d6
> (r8169: add support for Byte Queue Limits)
>
> and
>
> 0bec3b700d106a8b0a34227b2976d1a582f1aab7
> (r8169: add support for xmit_more)
>
> I cannot reproduce any hangs (tried for 2days with 40 parallel
> netperfs using both 100mbit and 1gbit receiver).
>
> And I don't see anything wrong with the change either.
> Seems like some revisions of the HW are just dodgy?
>
> I hate giving up, but I have no means to diagnose this any further.
> Even reporter says it doesn't affect all of his r8169 nics.
>
> So I think the change is correct per se, but might be revealing some
> HW/firmware bug.

Florian,

have you experimented with offload settings? The only times r8169 seems to
hiccup is with sg/tso enabled. I've reverted my NIC settings back to mostly
defaults (which does not enable sg/tso) and had no hangs, spurious timeouts
or other problems ever since, despite BQL, xmit_more and client/server
use for 24/7.

Tomas never said whether his setup enabled any offload settings; it's not
inconceivable that a distribution might try to automatically "optimize"
things.

-h

2015-02-21 16:22:32

by Eric Dumazet

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

On Sat, 2015-02-21 at 11:31 +0100, Florian Westphal wrote:
> Tomas Szepe <[email protected]> wrote:
> > > I tried to reproduce this without success so far on my RTL8168d/8111d device.
> > > I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> > > last 5 hours and so far I saw no watchdog tx timeouts.
> > >
> > > I'll keep this running for a day or so to see if it just takes more time
> > > to trigger.
> >
> > So, how's this coming along? Don't you think the patch should be reverted
> > until the problem is diagnosed/understood/fixed?
>
> Sorry.
>
> David, please consider reverting
>
> 1e918876853aa85435e0f17fd8b4a92dcfff53d6
> (r8169: add support for Byte Queue Limits)
>
> and
>
> 0bec3b700d106a8b0a34227b2976d1a582f1aab7
> (r8169: add support for xmit_more)
>
> I cannot reproduce any hangs (tried for 2days with 40 parallel
> netperfs using both 100mbit and 1gbit receiver).
>
> And I don't see anything wrong with the change either.
> Seems like some revisions of the HW are just dodgy?
>
> I hate giving up, but I have no means to diagnose this any further.
> Even reporter says it doesn't affect all of his r8169 nics.
>
> So I think the change is correct per se, but might be revealing some
> HW/firmware bug.

Hold on.

I believe there is one race in the way you access skb->xmit_more _after_

txd->opts1 = cpu_to_le32(status);

After this point, TX might have completed and TX completion already have
freed skb

Could Tomas try following fix ?

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ad0020af2193..f2764366a36c 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7050,6 +7050,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
u32 opts[2];
int frags;
bool stop_queue;
+ bool xmit_more;

if (unlikely(!TX_FRAGS_READY_FOR(tp, skb_shinfo(skb)->nr_frags))) {
netif_err(tp, drv, dev, "BUG! Tx Ring full when queue awake!\n");
@@ -7091,7 +7092,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
txd->opts2 = cpu_to_le32(opts[1]);

netdev_sent_queue(dev, skb->len);
-
+ xmit_more = skb->xmit_more;
skb_tx_timestamp(skb);

/* Force memory writes to complete before releasing descriptor */
@@ -7108,7 +7109,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,

stop_queue = !TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS);

- if (!skb->xmit_more || stop_queue ||
+ if (!xmit_more || stop_queue ||
netif_xmit_stopped(netdev_get_tx_queue(dev, 0))) {
RTL_W8(TxPoll, NPQ);


2015-02-21 17:46:36

by Florian Westphal

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Eric Dumazet <[email protected]> wrote:
> Hold on.
>
> I believe there is one race in the way you access skb->xmit_more _after_
>
> txd->opts1 = cpu_to_le32(status);
>
> After this point, TX might have completed and TX completion already have
> freed skb

Hmm, I _thought_ HW would not start xmit of this descriptor/skb until after

RTL_W8(TxPoll, NPQ);

call a bit later... your patch surely looks safer though, thanks for
looking into this.

> Could Tomas try following fix ?

In addition -- Tomas, on your affected board -- did you try
turning off gso/tso?

Holger Hoffst?tte mentioned on lkml that his affected r8169 nic is
stable with xmit_more+bql after disabling gso/tso on the nic

(ethtool -k $dev to display settings, "-K $dev tso off gso off" to
disable offloads).

2015-02-21 18:09:07

by Eric Dumazet

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

On Sat, 2015-02-21 at 18:46 +0100, Florian Westphal wrote:
> Eric Dumazet <[email protected]> wrote:
> > Hold on.
> >
> > I believe there is one race in the way you access skb->xmit_more _after_
> >
> > txd->opts1 = cpu_to_le32(status);
> >
> > After this point, TX might have completed and TX completion already have
> > freed skb
>
> Hmm, I _thought_ HW would not start xmit of this descriptor/skb until after
>
> RTL_W8(TxPoll, NPQ);

Note this 'kick' does not provide tail ptr.

NIC basically looks at TX descriptors to find ones with the DescOwn bit
set. It stops when if find one TX descriptor _without_ DescOwn.

So what can happen here is the NIC was kicked earlier (prior transmit),
but found your TX descriptor, before you got a chance to read
skb->xmit_more

This is why we have these wmb() everywhere. We want to do

txd->opts1 = XXX

Only when we are really ready.

2015-02-21 18:32:18

by Florian Westphal

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Eric Dumazet <[email protected]> wrote:
> On Sat, 2015-02-21 at 18:46 +0100, Florian Westphal wrote:
> > Eric Dumazet <[email protected]> wrote:
> > > Hold on.
> > >
> > > I believe there is one race in the way you access skb->xmit_more _after_
> > >
> > > txd->opts1 = cpu_to_le32(status);
> > >
> > > After this point, TX might have completed and TX completion already have
> > > freed skb
> >
> > Hmm, I _thought_ HW would not start xmit of this descriptor/skb until after
> >
> > RTL_W8(TxPoll, NPQ);
>
> Note this 'kick' does not provide tail ptr.
>
> NIC basically looks at TX descriptors to find ones with the DescOwn bit
> set. It stops when if find one TX descriptor _without_ DescOwn.

Makes sense, thanks for explaining this.

2015-02-21 19:05:28

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

> > David, please consider reverting
> >
> > 1e918876853aa85435e0f17fd8b4a92dcfff53d6
> > (r8169: add support for Byte Queue Limits)
> >
> > and
> >
> > 0bec3b700d106a8b0a34227b2976d1a582f1aab7
> > (r8169: add support for xmit_more)
> >
> > I cannot reproduce any hangs (tried for 2days with 40 parallel
> > netperfs using both 100mbit and 1gbit receiver).
> >
> > And I don't see anything wrong with the change either.
> > Seems like some revisions of the HW are just dodgy?
> >
> > I hate giving up, but I have no means to diagnose this any further.
> > Even reporter says it doesn't affect all of his r8169 nics.
> >
> > So I think the change is correct per se, but might be revealing some
> > HW/firmware bug.
>
> Hold on.
>
> I believe there is one race in the way you access skb->xmit_more _after_
>
> txd->opts1 = cpu_to_le32(status);
>
> After this point, TX might have completed and TX completion already have
> freed skb
>
> Could Tomas try following fix ?
>
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index ad0020af2193..f2764366a36c 100644
> ...

Sure, just did. Unfortunately, 3.19.0 + 0bec3b70 + this patch results
in a driver that retains the problem.

Sorry,
--
Tomas Szepe <[email protected]>

2015-02-21 19:26:42

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

> > Could Tomas try following fix ?
>
> In addition -- Tomas, on your affected board -- did you try
> turning off gso/tso?
>
> Holger Hoffstätte mentioned on lkml that his affected r8169 nic is
> stable with xmit_more+bql after disabling gso/tso on the nic
>
> (ethtool -k $dev to display settings, "-K $dev tso off gso off" to
> disable offloads).

They're always off on this NIC (cannot be turned on).

# ethtool -k eth1| grep segmentation-offload
tcp-segmentation-offload: off
generic-segmentation-offload: off [requested on]
# ethtool -K eth1 gso on tso on
Could not change any device features
# ethtool -k eth1| grep segmentation-offload
tcp-segmentation-offload: off
generic-segmentation-offload: off [requested on]

--
Tomas Szepe <[email protected]>

2015-02-21 19:54:13

by Eric Dumazet

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

On Sat, 2015-02-21 at 20:05 +0100, Tomas Szepe wrote:

> Sure, just did. Unfortunately, 3.19.0 + 0bec3b70 + this patch results
> in a driver that retains the problem.

OK, could you test following patch instead ?

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ad0020af2193..18d0decb5093 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7050,6 +7050,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
u32 opts[2];
int frags;
bool stop_queue;
+ bool xmit_more;

if (unlikely(!TX_FRAGS_READY_FOR(tp, skb_shinfo(skb)->nr_frags))) {
netif_err(tp, drv, dev, "BUG! Tx Ring full when queue awake!\n");
@@ -7091,7 +7092,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
txd->opts2 = cpu_to_le32(opts[1]);

netdev_sent_queue(dev, skb->len);
-
+ xmit_more = skb->xmit_more;
skb_tx_timestamp(skb);

/* Force memory writes to complete before releasing descriptor */
@@ -7108,7 +7109,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,

stop_queue = !TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS);

- if (!skb->xmit_more || stop_queue ||
+ if (stop_queue) {
+ /* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
+ * not miss a ring update when it notices a stopped queue.
+ */
+ smp_wmb();
+ netif_stop_queue(dev);
+ }
+ if (!xmit_more ||
netif_xmit_stopped(netdev_get_tx_queue(dev, 0))) {
RTL_W8(TxPoll, NPQ);

@@ -7116,11 +7124,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
}

if (stop_queue) {
- /* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
- * not miss a ring update when it notices a stopped queue.
- */
- smp_wmb();
- netif_stop_queue(dev);
/* Sync with rtl_tx:
* - publish queue status and cur_tx ring index (write barrier)
* - refresh dirty_tx ring index (read barrier).

2015-02-22 00:42:17

by Tomas Szepe

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

> > Sure, just did. Unfortunately, 3.19.0 + 0bec3b70 + this patch results
> > in a driver that retains the problem.
>
> OK, could you test following patch instead ?

Yup, but tough luck: 3.19.0 + 0bec3b70 + this patch -> problem present.

--
Tomas Szepe <[email protected]>

2015-02-22 20:57:57

by David Miller

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

From: Tomas Szepe <[email protected]>
Date: Sun, 22 Feb 2015 01:41:51 +0100

>> > Sure, just did. Unfortunately, 3.19.0 + 0bec3b70 + this patch results
>> > in a driver that retains the problem.
>>
>> OK, could you test following patch instead ?
>
> Yup, but tough luck: 3.19.0 + 0bec3b70 + this patch -> problem present.

I'm reverting the two commits for now, as below.

We can put them back in if we can resolve the problems.

====================
[PATCH] r8169: Revert BQL and xmit_more support.

There are certain regressions which are pointing to
these two commits which we are having a hard time
resolving. So revert them for now.

Specifically this reverts:

commit 0bec3b700d106a8b0a34227b2976d1a582f1aab7
Author: Florian Westphal <[email protected]>
Date: Wed Jan 7 10:49:49 2015 +0100

r8169: add support for xmit_more

and

commit 1e918876853aa85435e0f17fd8b4a92dcfff53d6
Author: Florian Westphal <[email protected]>
Date: Wed Oct 1 13:38:03 2014 +0200

r8169: add support for Byte Queue Limits

There were some attempts by Eric Dumazet to address some obvious
problems in the TX flow, to see if they would fix the problems,
but none of them seem to help for the regression reporters.

Signed-off-by: David S. Miller <[email protected]>
---
drivers/net/ethernet/realtek/r8169.c | 30 +++++++-----------------------
1 file changed, 7 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ad0020a..b156092 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5067,8 +5067,6 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
RTL_W8(ChipCmd, CmdReset);

rtl_udelay_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
-
- netdev_reset_queue(tp->dev);
}

static void rtl_request_uncached_firmware(struct rtl8169_private *tp)
@@ -7049,7 +7047,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
u32 status, len;
u32 opts[2];
int frags;
- bool stop_queue;

if (unlikely(!TX_FRAGS_READY_FOR(tp, skb_shinfo(skb)->nr_frags))) {
netif_err(tp, drv, dev, "BUG! Tx Ring full when queue awake!\n");
@@ -7090,8 +7087,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,

txd->opts2 = cpu_to_le32(opts[1]);

- netdev_sent_queue(dev, skb->len);
-
skb_tx_timestamp(skb);

/* Force memory writes to complete before releasing descriptor */
@@ -7106,16 +7101,11 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,

tp->cur_tx += frags + 1;

- stop_queue = !TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS);
+ RTL_W8(TxPoll, NPQ);

- if (!skb->xmit_more || stop_queue ||
- netif_xmit_stopped(netdev_get_tx_queue(dev, 0))) {
- RTL_W8(TxPoll, NPQ);
-
- mmiowb();
- }
+ mmiowb();

- if (stop_queue) {
+ if (!TX_FRAGS_READY_FOR(tp, MAX_SKB_FRAGS)) {
/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
* not miss a ring update when it notices a stopped queue.
*/
@@ -7198,7 +7188,6 @@ static void rtl8169_pcierr_interrupt(struct net_device *dev)
static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
{
unsigned int dirty_tx, tx_left;
- unsigned int bytes_compl = 0, pkts_compl = 0;

dirty_tx = tp->dirty_tx;
smp_rmb();
@@ -7222,8 +7211,10 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
rtl8169_unmap_tx_skb(&tp->pci_dev->dev, tx_skb,
tp->TxDescArray + entry);
if (status & LastFrag) {
- pkts_compl++;
- bytes_compl += tx_skb->skb->len;
+ u64_stats_update_begin(&tp->tx_stats.syncp);
+ tp->tx_stats.packets++;
+ tp->tx_stats.bytes += tx_skb->skb->len;
+ u64_stats_update_end(&tp->tx_stats.syncp);
dev_kfree_skb_any(tx_skb->skb);
tx_skb->skb = NULL;
}
@@ -7232,13 +7223,6 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
}

if (tp->dirty_tx != dirty_tx) {
- netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);
-
- u64_stats_update_begin(&tp->tx_stats.syncp);
- tp->tx_stats.packets += pkts_compl;
- tp->tx_stats.bytes += bytes_compl;
- u64_stats_update_end(&tp->tx_stats.syncp);
-
tp->dirty_tx = dirty_tx;
/* Sync with rtl8169_start_xmit:
* - publish dirty_tx ring index (write barrier)
--
2.1.0

2015-05-07 19:23:11

by Daniel Exner

[permalink] [raw]
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Hi,

first of all, please keep me in CC as I am currently not subscribed to LKML.

> From: Tomas Szepe <[email protected]>
> Date: Sun, 22 Feb 2015 01:41:51 +0100
>
>>>> Sure, just did. Unfortunately, 3.19.0 + 0bec3b70 + this patch results
>>>> in a driver that retains the problem.
>>>
>>> OK, could you test following patch instead ?
>>
>> Yup, but tough luck: 3.19.0 + 0bec3b70 + this patch -> problem present.
>
> I'm reverting the two commits for now, as below.
>
> We can put them back in if we can resolve the problems.
>
> ====================
> [PATCH] r8169: Revert BQL and xmit_more support.

I observe exactly the same problems since Kernel 4.1 merge window,
_although_ this patch is in the tree.
So it must be something else.

Stacktrace 1 (still with rc1):
> WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:303
> dev_watchdog+0x23f/0x250()
> NETDEV WATCHDOG: em1 (r8169): transmit queue 0 timed out
> Modules linked in: af_packet rfcomm uvcvideo videobuf2_vmalloc
> videobuf2_memops videobuf2_core v4l2_common videodev media bnep raid456
> nf_conntrack_ipv4 async_raid6_recov nf_defrag_ipv4 async_memcpy
> snd_usb_audio snd_usbmidi_lib async_pq async_xor xt_tcpudp xt_limit xor
> async_tx xt_conntrack nf_conntrack xt_multiport btusb btbcm btintel
> bluetooth rfkill joydev iptable_filter evdev mac_hid ip_tables x_tables msr
> raid6_pq w83627ehf md_mod hwmon_vid amdkfd amd_iommu_v2 kvm_amd kvm
> snd_seq_dummy radeon snd_seq_oss microcode snd_emu10k1_synth snd_emux_synth
> psmouse snd_seq_midi_emul snd_seq_virmidi pcspkr serio_raw ttm snd_emu10k1
> edac_core drm_kms_helper edac_mce_amd k10temp snd_hda_codec_realtek
> snd_util_mem snd_hda_codec_generic snd_hda_codec_hdmi drm snd_ac97_codec
> snd_hda_intel snd_hda_controller agpgart emu10k1_gp ac97_bus gameport
> i2c_algo_bit snd_hda_codec nuvoton_cir rc_core 8250_fintek snd_hda_core
> acpi_cpufreq sp5100_tco snd_hwdep snd_pcm i2c_piix4 button shpchp processor
> thermal_sys hwmon ppdev snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi
> snd_seq_device snd_timer snd soundcore sch_fq_codel fuse binfmt_misc lp
> parport_pc parport ext4 crc16 jbd2 mbcache hid_generic sr_mod cdrom sd_mod
> hid_microsoft usbhid hid ahci libahci r8169 mii firewire_ohci firewire_core
> crc_itu_t xhci_pci ehci_pci ohci_pci xhci_hcd ehci_hcd ohci_hcd libata
> usbcore scsi_mod usb_common sunrpc dm_mirror dm_region_hash dm_log dm_mod
> CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.1.0-rc1-36580-g385d5ed-dirty #28
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./970 Extreme3
> R2.0, BIOS P1.60 06/05/2014
> 0000000000000000 ffffffff815f162c ffffffff814b23ce ffff88023fd43e08
> ffffffff810526b7 0000000000000000 ffff8802362fc3a0 ffff8802362fc000
> 0000000000000005 0000000000000001 ffffffff81052735 ffffffff815f5700
> Call Trace:
> <IRQ> [<ffffffff814b23ce>] ? dump_stack+0x47/0x67
> [<ffffffff810526b7>] ? warn_slowpath_common+0x77/0xb0
> [<ffffffff81052735>] ? warn_slowpath_fmt+0x45/0x50
> [<ffffffff813eaa8f>] ? dev_watchdog+0x23f/0x250
> [<ffffffff813ea850>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a8fe5>] ? call_timer_fn.isra.26+0x15/0x80
> [<ffffffff81067000>] ? execute_in_process_context+0x60/0x60
> [<ffffffff813ea850>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a9218>] ? run_timer_softirq+0x1c8/0x270
> [<ffffffff81055eac>] ? __do_softirq+0x10c/0x220
> [<ffffffff8105612e>] ? irq_exit+0x8e/0xb0
> [<ffffffff81036439>] ? smp_apic_timer_interrupt+0x39/0x50
> [<ffffffff814b829e>] ? apic_timer_interrupt+0x6e/0x80
> <EOI> [<ffffffff8103f0a2>] ? native_safe_halt+0x2/0x10
> [<ffffffff8100c865>] ? default_idle+0x5/0x10
> [<ffffffff8100c9a6>] ? amd_e400_idle+0x66/0xe0
> [<ffffffff8108b429>] ? cpu_startup_entry+0x319/0x390
> [<ffffffff81034520>] ? start_secondary+0x170/0x180
> ---[ end trace c308ca6d62e7a991 ]---

Puzzling as this is a Phenom II X6 and no E400..

Stacktrace 2 (rc2):
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 35 at net/sched/sch_generic.c:303
> dev_watchdog+0x23f/0x250()
> NETDEV WATCHDOG: em1 (r8169): transmit queue 0 timed out
> Modules linked in: af_packet rfcomm bnep raid456 async_raid6_recov
> async_memcpy async_pq uvcvideo videobuf2_vmalloc videobuf2_memops
> videobuf2_core v4l2_common videodev async_xor media xor async_tx
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_limit xt_conntrack
> nf_conntrack xt_multiport msr iptable_filter ip_tables snd_usb_audio btusb
> x_tables btbcm snd_usbmidi_lib btintel bluetooth raid6_pq rfkill joydev
> w83627ehf hwmon_vid md_mod mac_hid evdev kvm_amd amdkfd kvm amd_iommu_v2
> snd_seq_dummy microcode snd_seq_oss radeon snd_emu10k1_synth snd_emux_synth
> snd_seq_midi_emul snd_seq_virmidi snd_hda_codec_realtek ttm
> snd_hda_codec_generic edac_core psmouse serio_raw edac_mce_amd pcspkr
> snd_hda_codec_hdmi k10temp drm_kms_helper snd_emu10k1 drm snd_util_mem
> snd_hda_intel snd_ac97_codec agpgart emu10k1_gp snd_hda_controller ac97_bus
> gameport i2c_algo_bit snd_hda_codec snd_hda_core sp5100_tco snd_hwdep
> nuvoton_cir snd_pcm 8250_fintek rc_core i2c_piix4 shpchp button acpi_cpufreq
> processor thermal_sys ppdev snd_seq_midi snd_seq_midi_event snd_seq
> snd_rawmidi snd_seq_device snd_timer snd soundcore asus_atk0110 hwmon
> sch_fq_codel fuse lp binfmt_misc parport_pc parport ext4 crc16 jbd2 mbcache
> hid_generic sr_mod cdrom sd_mod hid_microsoft usbhid hid firewire_ohci ahci
> r8169 firewire_core libahci mii crc_itu_t xhci_pci ohci_pci ehci_pci
> ohci_hcd xhci_hcd ehci_hcd libata usbcore scsi_mod usb_common sunrpc
> dm_mirror dm_region_hash dm_log dm_mod
> CPU: 5 PID: 35 Comm: ksoftirqd/5 Not tainted 4.1.0-rc2-36683-g979f4b5-dirty
> #29
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./970 Extreme3
> # R2.0, BIOS P1.60 06/05/2014
> 0000000000000000 ffffffff815f162c ffffffff814b233e ffff88023620fce8
> ffffffff810526b7 0000000000000000 ffff880230d683a0 ffff880230d68000
> 0000000000000005 0000000000000001 ffffffff81052735 ffffffff815f5700
> Call Trace:
> [<ffffffff814b233e>] ? dump_stack+0x47/0x67
> [<ffffffff810526b7>] ? warn_slowpath_common+0x77/0xb0
> [<ffffffff81052735>] ? warn_slowpath_fmt+0x45/0x50
> [<ffffffff813eaa0f>] ? dev_watchdog+0x23f/0x250
> [<ffffffff813ea7d0>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a8fe5>] ? call_timer_fn.isra.26+0x15/0x80
> [<ffffffff813ea7d0>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a9218>] ? run_timer_softirq+0x1c8/0x270
> [<ffffffff81055eac>] ? __do_softirq+0x10c/0x220
> [<ffffffff81055fe9>] ? run_ksoftirqd+0x29/0x50
> [<ffffffff8106ff55>] ? smpboot_thread_fn+0x135/0x250
> [<ffffffff8106fe20>] ? sort_range+0x20/0x20
> [<ffffffff8106d0be>] ? kthread+0xce/0xf0
> [<ffffffff81070303>] ? smpboot_register_percpu_thread+0x63/0xf0
> [<ffffffff8106cff0>] ? kthread_create_on_node+0x180/0x180
> [<ffffffff814b77a2>] ? ret_from_fork+0x42/0x70
> [<ffffffff8106cff0>] ? kthread_create_on_node+0x180/0x180
> ---[ end trace 4f88915aa0200ae6 ]---


I would have bisected it but its hard to trigger.

One thing that sticks out in the log:
> r8169 0000:05:00.0: can't disable ASPM; OS doesn't have ASPM control

but:
> acpi PNP0A03:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
> acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM


Regards,
Daniel Exner
--
Daniel Exner
Public-Key: https://www.dragonslave.de/pub_key.asc


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature