Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933568AbbBCKm1 (ORCPT ); Tue, 3 Feb 2015 05:42:27 -0500 Received: from Chamillionaire.breakpoint.cc ([80.244.247.6]:48052 "EHLO Chamillionaire.breakpoint.cc" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932235AbbBCKmZ (ORCPT ); Tue, 3 Feb 2015 05:42:25 -0500 Date: Tue, 3 Feb 2015 11:42:14 +0100 From: Florian Westphal To: Tomas Szepe Cc: Florian Westphal , Francois Romieu , Hayes Wang , Eric Dumazet , Tom Herbert , "David S. Miller" , Marco Berizzi , linux-kernel@vger.kernel.org Subject: Re: 1e918876 breaks r8169 (linux-3.18+) Message-ID: <20150203104214.GG24751@breakpoint.cc> References: <20150203100816.GA5807@louise.pinerecords.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150203100816.GA5807@louise.pinerecords.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2680 Lines: 72 Tomas Szepe wrote: > Hi, > > Since linux-3.18.0, r8169 is having problems driving one of my add-on > PCIe NICs. The interface is losing link for several seconds at a time, > the frequency being about once a minute when the traffic is high. > > The first loss of link is accompanied by the message "NETDEV WATCHDOG: > eth1 (r8169): transmit queue 0 timed out" and a call trace, while > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up" > (w/o the complementary "link down" message). > > I've traced the culprit down to commit 1e918876, "r8169: add support > for Byte Queue Limits" by Florian Westphal . Reverting > the patch appears to fix the problem on linux-3.18.5. > The same issue might already have been reported by Marco Berizzi here: > http://lkml.org/lkml/2014/12/11/65 Thanks for reporting this! I'm no lkml subscriber and thus did not see earlier report. I'll try to reproduce this but unfortunately I am currently travelling and won't have access to my r8169 nic until Feb 10th. If you're willing to experiment (not even compile tested): diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c --- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -5067,8 +5067,6 @@ static void rtl_hw_reset(struct rtl8169_private *tp) RTL_W8(ChipCmd, CmdReset); rtl_udelay_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100); - - netdev_reset_queue(tp->dev); } static void rtl_request_uncached_firmware(struct rtl8169_private *tp) @@ -6773,6 +6771,7 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp) { rtl8169_tx_clear_range(tp, tp->dirty_tx, NUM_TX_DESC); tp->cur_tx = tp->dirty_tx = 0; + netdev_reset_queue(tp->dev); } static void rtl_reset_work(struct rtl8169_private *tp) @@ -7231,9 +7230,9 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp) tx_left--; } - if (tp->dirty_tx != dirty_tx) { - netdev_completed_queue(tp->dev, pkts_compl, bytes_compl); + netdev_completed_queue(tp->dev, pkts_compl, bytes_compl); + if (tp->dirty_tx != dirty_tx) { u64_stats_update_begin(&tp->tx_stats.syncp); tp->tx_stats.packets += pkts_compl; tp->tx_stats.bytes += bytes_compl; @@ -7263,6 +7262,8 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp) RTL_W8(TxPoll, NPQ); } + } else { + WARN_ON_ONCE(pkts_compl); } } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/