Date: Sun, 21 Jan 2018 11:57:03 -0500 (EST)
Message-Id: <20180121.115703.1454514585438593490.davem@davemloft.net>
To: frederic@kernel.org
Cc: torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
    alexander.levin@verizon.com, peterz@infradead.org,
    mchehab@s-opensource.com, hannes@stressinduktion.org,
    paulmck@linux.vnet.ibm.com, wanpeng.li@hotmail.com, dima@arista.com,
    tglx@linutronix.de, akpm@linux-foundation.org, pabeni@redhat.com,
    rrendec@arista.com, mingo@kernel.org, sgruszka@redhat.com,
    riel@redhat.com, edumazet@google.com
Subject: Re: [RFC PATCH 1/4] softirq: Limit vector to a single iteration on IRQ tail
From: David Miller
In-Reply-To: <20180121163008.GB2879@lerouge>
References: <20180119.134727.512994648781037639.davem@davemloft.net>
    <20180121163008.GB2879@lerouge>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Frederic Weisbecker
Date: Sun, 21 Jan 2018 17:30:09 +0100

> On Fri, Jan 19, 2018 at 01:47:27PM -0500, David Miller wrote:
>> From: Linus Torvalds
>> Date: Fri, 19 Jan 2018 10:25:03 -0800
>>
>> > On Fri, Jan 19, 2018 at 8:16 AM, David Miller wrote:
>> >>
>> >> So this "get requeued" condition I think will trigger always for
>> >> networking tunnel decapsulation.
>> >
>> > Hmm. Interesting and a perhaps bit discouraging.
>> >
>> > Will it always be just a _single_ level of indirection, or will double
>> > tunnels (I assume some people do that, just because the universe is
>> > out to get us) then result in this perhaps repeating several times?
>>
>> Every level of tunnel encapsulation will trigger a new softirq.
>>
>> So if you have an IP tunnel inside of an IP tunnel that will trigger
>> twice.
>
> So we may likely need to come back to a call counter based limit :-s

I'm not so sure exactly to what extent we should try to handle that,
and if so exactly how.

The only reason we do this is to control stack usage.  The re-softirq
on tunnel decapsulation is functioning as a continuation of sorts.

If we are already running in the net_rx_action() softirq it would
be so much nicer and more efficient to just make sure net_rx_action()
runs the rest of the packet processing.

Right now that doesn't happen because net_rx_action() runs only one
round of NAPI polling.  It doesn't re-snapshot the list and try
again before returning.

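To make the "one round only" point concrete, here is a toy user-space
model, not kernel code.  Every name in it (decapsulate, single_round,
count_softirq_runs) is made up for illustration, and a packet is
reduced to its remaining tunnel depth.  It only counts how many
softirq runs are needed when each decapsulation requeues the inner
packet, with and without re-snapshotting the list before returning.

```python
def decapsulate(depth, poll_list):
    # Toy handler: strip one tunnel layer and requeue the inner
    # packet if it is still encapsulated.
    if depth > 0:
        poll_list.append(depth - 1)


def single_round(poll_list):
    # One pass over a snapshot of the list.  Anything queued while
    # the pass is running stays on poll_list for a later pass.
    snapshot = list(poll_list)
    poll_list.clear()
    for depth in snapshot:
        decapsulate(depth, poll_list)


def count_softirq_runs(depth, resnapshot):
    # Number of separate softirq runs needed to fully process one
    # packet wrapped in `depth` tunnel layers.
    poll_list = [depth]
    runs = 0
    while poll_list:
        runs += 1
        single_round(poll_list)
        # Hypothetical variant: keep re-snapshotting inside the same
        # run until nothing is left.
        while resnapshot and poll_list:
            single_round(poll_list)
    return runs
```

In this model a tunnel inside a tunnel (depth 2) takes the initial run
plus two retriggered ones, while the re-snapshotting variant finishes
in a single run.  A real version would presumably still have to obey
the existing budget and time limits, which this sketch ignores.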
net_rx_action() has its own logic, like do_softirq() does, for timing
out in the middle of its work, which may or may not have some further
influence upon fairness to other softirqs.

Basically, it runs a snapshot of the NAPI poll list for this CPU until
either usecs_to_jiffies(netdev_budget_usecs) jiffies have elapsed or
the list snapshot has been fully processed.

The default netdev_budget_usecs is 2000, which if my math isn't broken
is 2 jiffies when HZ=1000.  I know why we use 2000 instead of 1000:
it's to handle the case where we are invoked very close to the end of
a jiffy.  That situation does happen often enough in practice to cause
performance problems.

It would seem that all of these issues are why the tendency is to
deal with measuring cost using time rather than a simpler heuristic
such as whether softirqs were retriggered during a softirq run.
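The jiffy arithmetic above can be checked with a small user-space
sketch.  This is a simplified model rather than the kernel's code: it
assumes usecs_to_jiffies() rounds up to whole jiffies, that HZ divides
one second evenly, and that the loop stops as soon as the jiffy
counter reaches the limit.  usable_usecs() is a hypothetical helper
that exists only for this illustration.

```python
USEC_PER_SEC = 1_000_000


def usecs_to_jiffies(usecs, hz):
    # Simplified model: round a microsecond interval up to whole
    # jiffies.
    usecs_per_jiffy = USEC_PER_SEC // hz
    return (usecs + usecs_per_jiffy - 1) // usecs_per_jiffy


def usable_usecs(start_usec, budget_usecs, hz):
    # Wall-clock microseconds that pass between being invoked at
    # start_usec and the jiffy counter reaching the time limit.
    usecs_per_jiffy = USEC_PER_SEC // hz
    start_jiffy = start_usec // usecs_per_jiffy
    limit_jiffy = start_jiffy + usecs_to_jiffies(budget_usecs, hz)
    return limit_jiffy * usecs_per_jiffy - start_usec
```

Under those assumptions, with HZ=1000 a run that starts 950 usecs into
a jiffy gets only 50 usecs of real time from a 1000 usec budget, since
the very next tick already reaches the limit.  With 2000 usecs the
same run gets 1050 usecs, so the budget never falls below one full
jiffy.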