Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1750952AbbBUKPj (ORCPT ); Sat, 21 Feb 2015 05:15:39 -0500
Received: from louise.pinerecords.com ([213.168.185.253]:57188
        "EHLO louise.pinerecords.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1750740AbbBUKPg (ORCPT );
        Sat, 21 Feb 2015 05:15:36 -0500
Date: Sat, 21 Feb 2015 11:15:12 +0100
From: Tomas Szepe
To: Florian Westphal
Cc: Francois Romieu, Hayes Wang, Eric Dumazet, Tom Herbert,
        "David S. Miller", Marco Berizzi, linux-kernel@vger.kernel.org
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)
Message-ID: <20150221101512.GB17223@louise.pinerecords.com>
References: <20150203100816.GA5807@louise.pinerecords.com>
        <20150203104214.GG24751@breakpoint.cc>
        <20150210154536.GB16264@breakpoint.cc>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150210154536.GB16264@breakpoint.cc>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1800
Lines: 38

> > > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > > PCIe NICs. The interface is losing link for several seconds at a time,
> > > the frequency being about once a minute when the traffic is high.
> > >
> > > The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> > > eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> > > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> > > (w/o the complementary "link down" message).
> > >
> > > I've traced the culprit down to commit 1e918876, "r8169: add support
> > > for Byte Queue Limits" by Florian Westphal. Reverting
> > > the patch appears to fix the problem on linux-3.18.5.
> > > The same issue might already have been reported by Marco Berizzi here:
> > > http://lkml.org/lkml/2014/12/11/65
> >
> > Thanks for reporting this!
> > I'm no lkml subscriber and thus did not
> > see earlier report.
> >
> > I'll try to reproduce this but unfortunately I am currently travelling
> > and won't have access to my r8169 nic until Feb 10th.
>
> I tried to reproduce this without success so far on my RTL8168d/8111d device.
> I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> last 5 hours and so far I saw no watchdog tx timeouts.
>
> I'll keep this running for a day or so to see if it just takes more time
> to trigger.

So, how's this coming along?

Don't you think the patch should be reverted until the problem is
diagnosed/understood/fixed?

-- 
Tomas Szepe