Subject: Re: linux-next network throughput performance regression
From: Dave Airlie
To: David Miller
Cc: decui@microsoft.com, Eric Dumazet, dsa@cumulusnetworks.com, sixiao@microsoft.com, Network Development, haiyangz@microsoft.com, LKML, devel@linuxdriverproject.org
Date: Mon, 9 Nov 2015 13:32:56 +1000

On 9 November 2015 at 13:23, David Miller wrote:
> From: Dexuan Cui
> Date: Mon, 9 Nov 2015 03:11:35 +0000
>
>>> -----Original Message-----
>>> From: David Miller [mailto:davem@davemloft.net]
>>> Sent: Monday, November 9, 2015 10:53
>>> To: Dexuan Cui
>>> Cc: eric.dumazet@gmail.com; dsa@cumulusnetworks.com; Simon Xiao;
>>> netdev@vger.kernel.org; Haiyang Zhang; linux-kernel@vger.kernel.org;
>>> devel@linuxdriverproject.org
>>> Subject: Re: linux-next network throughput performance regression
>>>
>>> From: Dexuan Cui
>>> Date: Mon, 9 Nov 2015 02:39:24 +0000
>>>
>>> >> Throughput on a single TCP flow for a 40G NIC can be tricky to tune.
>>> > Why is a single TCP flow trickier than multiple TCP flows?
>>> > IMO it should be easier to analyze the issue of a single TCP flow?
>>>
>>> Because a single TCP flow can only use one of the many TX queues
>>> that such modern NICs have.
>>>
>>> The single TX queue becomes the bottleneck.
>>>
>>> Whereas if you have several TCP flows, all of them can use independent
>>> TX queues on the NIC in parallel to fill the link with traffic.
>>>
>>> That's why.
>>
>> Thanks, David!
>> I understand that 1 TX queue is the bottleneck (however, in Simon's
>> test, TX=1 => 36.7 Gb/s and TX=8 => 37.7 Gb/s, so the TX=1 bottleneck
>> is not so obvious).
>> I'm just wondering how the bottleneck became much narrower with
>> recent linux-next in Simon's result (36.7 Gb/s vs. 18.2 Gb/s). IMO there
>> must be some extra latency somewhere.
>
> I think the whole thing here is that you misinterpreted what Eric said.
>
> He is not arguing that some regression did, or did not, happen.
>
> He was instead making the basic point that, due to the lack of
> parallelism, the single-stream TCP case is harder to optimize for
> high-speed NICs.
>
> That is all.

We recently had a regression in a similar area that was tracked down to link order.

Dave.
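The queue-selection behaviour David describes can be illustrated with a toy flow hash. This is a deliberate simplification, not the kernel's actual `skb_tx_hash()`/XPS logic: the hash function, queue count, and function names below are illustrative assumptions. The point it demonstrates is that a flow's 4-tuple deterministically picks one TX queue, so a single flow is pinned to one queue while many flows spread across all of them:

```python
# Toy sketch (NOT kernel code): deterministic 4-tuple -> TX queue mapping.
import hashlib

NUM_TX_QUEUES = 8  # a modern multiqueue NIC exposes many TX queues

def select_tx_queue(src_ip, src_port, dst_ip, dst_port):
    """Hash a flow's 4-tuple to a TX queue index (stand-in for skb_tx_hash)."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_TX_QUEUES

# A single flow always maps to the same queue, so one queue is the ceiling:
q = select_tx_queue("10.0.0.1", 5000, "10.0.0.2", 80)
assert all(select_tx_queue("10.0.0.1", 5000, "10.0.0.2", 80) == q
           for _ in range(100))

# Several flows (here, varying source ports) land on multiple queues and
# can fill the link in parallel:
queues = {select_tx_queue("10.0.0.1", p, "10.0.0.2", 80)
          for p in range(5000, 5032)}
print(f"single flow -> queue {q}; 32 flows -> {len(queues)} distinct queues")
```

Because every packet of a flow must hash to the same queue (to avoid reordering), adding queues cannot speed up one flow; it only helps workloads with many concurrent flows, which matches the TX=1 vs. TX=8 numbers above being close for a single stream.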