Subject: Re: [PATCH net-next] net: use in_softirq() to indicate the NAPI context in napi_consume_skb()
To: Eric Dumazet
CC: David Miller, Jakub Kicinski, linmiaohe, Florian Westphal, Davide Caratti, Steffen Klassert, Paolo Abeni, Saeed Mahameed, netdev, LKML
References: <1600653893-206277-1-git-send-email-linyunsheng@huawei.com> <2102eba1-eeea-bf95-2df5-7fcfa3141694@huawei.com>
From: Yunsheng Lin
Date: Mon, 21 Sep 2020 16:40:04 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/9/21 16:17, Eric
Dumazet wrote:
> On Mon, Sep 21, 2020 at 10:10 AM Yunsheng Lin wrote:
>>
>> On 2020/9/21 15:19, Eric Dumazet wrote:
>>> On Mon, Sep 21, 2020 at 4:08 AM Yunsheng Lin wrote:
>>>>
>>>> When napi_consume_skb() is called in the tx desc cleaning process,
>>>> it is usually in the softirq context(BH disabled, or are processing
>>>> softirqs), but it may also be in the task context, such as in the
>>>> netpoll or loopback selftest process.
>>>>
>>>> Currently napi_consume_skb() uses non-zero budget to indicate the
>>>> NAPI context, the driver writer may provide the wrong budget when
>>>> tx desc cleaning function is reused for both NAPI and non-NAPI
>>>> context, see [1].
>>>>
>>>> So this patch uses in_softirq() to indicate the NAPI context, which
>>>> doesn't necessarily mean in NAPI context, but it shouldn't care if
>>>> NAPI context or not as long as it runs in softirq context or with BH
>>>> disabled, then _kfree_skb_defer() will push the skb to the particular
>>>> cpu' napi_alloc_cache atomically.
>>>>
>>>> [1] https://lkml.org/lkml/2020/9/15/38
>>>>
>>>> Signed-off-by: Yunsheng Lin
>>>> ---
>>>> note that budget parameter is not removed in this patch because it
>>>> involves many driver changes, we can remove it in separate patch if
>>>> this patch is accepted.
>>>> ---
>>>>  net/core/skbuff.c | 6 ++++--
>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>> index e077447..03d0d28 100644
>>>> --- a/net/core/skbuff.c
>>>> +++ b/net/core/skbuff.c
>>>> @@ -895,8 +895,10 @@ void __kfree_skb_defer(struct sk_buff *skb)
>>>>
>>>>  void napi_consume_skb(struct sk_buff *skb, int budget)
>>>>  {
>>>> -	/* Zero budget indicate non-NAPI context called us, like netpoll */
>>>> -	if (unlikely(!budget)) {
>>>> +	/* called by non-softirq context, which usually means non-NAPI
>>>> +	 * context, like netpoll.
>>>> +	 */
>>>> +	if (unlikely(!in_softirq())) {
>>>>  		dev_consume_skb_any(skb);
>>>>  		return;
>>>>  	}
>>>> --
>>>
>>>
>>> I do not think we should add this kind of fuzzy logic, just because
>>> _one_ driver author made a mistake.
>>>
>>> Add a disable_bh() in the driver slow path, and accept the _existing_
>>> semantic, the one that was understood by dozens.
>>
>> As my understanding, this patch did not change _existing_ semantic,
>> it still only call _kfree_skb_defer() in softirq context. This patch
>> just remove the requirement that a softirq context hint need to be
>> provided to decide whether calling _kfree_skb_defer().
>
> I do not want to remove the requirement.
>
>>
>> Yes, we can add DEBUG_NET() clauses to catch this kind of error as
>> you suggested.
>>
>> But why we need such a debug clauses, when we can decide if delaying
>> skb freeing is possible in napi_consume_skb(), why not just use
>> in_softirq() to make this API more easy to use? Just as __dev_kfree_skb_any()
>> API use "in_irq() || irqs_disabled()" checking to handle the irq context
>> and non-irq context.
>
>
> I just do not like your patch.
>
> Copying another piece of fuzzy logic, inherited from legacy code is
> not an excuse.
>
> Add a local_bh_disable() in the driver slow path to meet _existing_
> requirement, so that we can keep the hot path fast.

Does the "!in_softirq()" check make napi_consume_skb() slower than the
"!budget" check? Am I missing something?

As a matter of fact, the hns3 driver has already fixed this problem by
passing a zero budget to napi_consume_skb() in non-NAPI context; this
patch is more about how to avoid or catch this kind of error.

So is your opinion still that we should catch this kind of error using
something like DEBUG_NET() clauses?

> .
>
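
To make the two positions in this thread concrete, here is a rough userspace model of the dispatch decision. It is only a sketch under stated assumptions: bh_disable_depth stands in for the softirq bits of preempt_count, and every name below (model_local_bh_disable(), consume_by_budget(), consume_by_context(), budget_hint_is_unsafe()) is hypothetical and not a kernel symbol. It does not touch real skbs and says nothing about hot-path cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Which free path the model would pick; stands in for
 * dev_consume_skb_any() vs. __kfree_skb_defer() in the patch above.
 */
enum free_path { FREE_ANY, FREE_DEFERRED };

/* Hypothetical stand-in for the softirq part of preempt_count. */
static int bh_disable_depth;

void model_local_bh_disable(void)
{
	bh_disable_depth++;
}

void model_local_bh_enable(void)
{
	/* An unbalanced enable is a bug in the caller. */
	assert(bh_disable_depth > 0);
	bh_disable_depth--;
}

bool model_in_softirq(void)
{
	return bh_disable_depth > 0;
}

/* Existing semantic: the caller's budget is the context hint. */
enum free_path consume_by_budget(int budget)
{
	if (!budget)
		return FREE_ANY;
	return FREE_DEFERRED;
}

/* Proposed semantic: ask the context directly, ignore the hint. */
enum free_path consume_by_context(void)
{
	if (!model_in_softirq())
		return FREE_ANY;
	return FREE_DEFERRED;
}

/* The driver mistake under discussion: a non-zero budget is passed
 * while BH is not disabled, so the deferred path would be taken from
 * task context. A debug check could flag exactly this condition.
 */
bool budget_hint_is_unsafe(int budget)
{
	return consume_by_budget(budget) == FREE_DEFERRED &&
	       !model_in_softirq();
}
```

In this model, a slow-path caller that wraps its cleanup in model_local_bh_disable()/model_local_bh_enable() makes a non-zero budget safe again, which corresponds to the local_bh_disable() suggestion quoted above; consume_by_context() instead gives the same answer regardless of what budget the caller passed.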