To: linux-kernel@vger.kernel.org
From: Holger Hoffstätte
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)
Date: Sat, 21 Feb 2015 10:57:33 +0000 (UTC)

On Sat, 21 Feb 2015 11:31:04 +0100, Florian Westphal wrote:

> Tomas Szepe wrote:
>> > I tried to reproduce this without success so far on my RTL8168d/8111d device.
>> > I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
>> > last 5 hours and so far I saw no watchdog tx timeouts.
>> >
>> > I'll keep this running for a day or so to see if it just takes more time
>> > to trigger.
>>
>> So, how's this coming along? Don't you think the patch should be reverted
>> until the problem is diagnosed/understood/fixed?
>
> Sorry.
>
> David, please consider reverting
>
> 1e918876853aa85435e0f17fd8b4a92dcfff53d6
> (r8169: add support for Byte Queue Limits)
>
> and
>
> 0bec3b700d106a8b0a34227b2976d1a582f1aab7
> (r8169: add support for xmit_more)
>
> I cannot reproduce any hangs (tried for 2days with 40 parallel
> netperfs using both 100mbit and 1gbit receiver).
>
> And I don't see anything wrong with the change either.
> Seems like some revisions of the HW are just dodgy?
>
> I hate giving up, but I have no means to diagnose this any further.
> Even reporter says it doesn't affect all of his r8169 nics.
>
> So I think the change is correct per se, but might be revealing some
> HW/firmware bug.

Florian, have you experimented with offload settings? The only time
r8169 seems to hiccup is with sg/tso enabled. I've reverted my NIC
settings back to mostly defaults (which do not enable sg/tso) and have
had no hangs, spurious timeouts or other problems ever since, despite
BQL, xmit_more and 24/7 client/server use.

Tomas never said whether his setup enabled any offload settings; it's
not inconceivable that a distribution might try to automatically
"optimize" things.

-h
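
For anyone who wants to check whether sg/tso are on for an affected NIC,
here is a rough sketch that parses the text printed by `ethtool -k`. The
helper names and the "eth0" default are made up for illustration, and it
assumes the usual "feature: on|off [fixed]" output layout; it is not
taken from anyone's actual setup in this thread.

```python
import subprocess

# Offload features discussed in this thread, named as `ethtool -k` prints them.
WATCHED = ("scatter-gather", "tcp-segmentation-offload")


def parse_offloads(ethtool_output):
    """Map feature name -> True/False from `ethtool -k` text output."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep:
            continue
        words = value.split()
        # The value looks like "on", "off" or "off [fixed]";
        # the "Features for <iface>:" header has no value and is skipped.
        if words and words[0] in ("on", "off"):
            features[name] = words[0] == "on"
    return features


def enabled_offloads(ethtool_output, watched=WATCHED):
    """Return the watched features that are currently switched on."""
    features = parse_offloads(ethtool_output)
    return [name for name in watched if features.get(name)]


def read_offloads(iface="eth0"):
    """Run `ethtool -k <iface>` (hypothetical interface name) and report sg/tso."""
    result = subprocess.run(
        ["ethtool", "-k", iface],
        capture_output=True, text=True, check=True,
    )
    return enabled_offloads(result.stdout)
```

Calling read_offloads("eth0") on the box in question returns the list of
watched features that are enabled; an empty list means sg/tso are off.
If they turn out to be on, `ethtool -K eth0 sg off tso off` should switch
them off for a test run.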