References: <20230928100418.521594-1-yajun.deng@linux.dev> <5d8e302c-a28d-d4f4-eb91-4b54eb89490b@linux.dev>
In-Reply-To: <5d8e302c-a28d-d4f4-eb91-4b54eb89490b@linux.dev>
From: Eric Dumazet
Date: Thu, 28 Sep 2023 18:23:11 +0200
Subject: Re: [PATCH v6] net/core: Introduce netdev_core_stats_inc()
To: Yajun Deng
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Alexander Lobakin

On Thu, Sep 28, 2023 at 6:16 PM Yajun Deng wrote:
>
>
> On 2023/9/28 23:44, Eric Dumazet wrote:
> > On Thu, Sep 28, 2023 at 5:40 PM Yajun Deng wrote:
> >>
> >> On 2023/9/28 22:18, Eric Dumazet wrote:
> >>> On Thu, Sep 28, 2023 at 12:04 PM Yajun Deng wrote:
> >>>> Although there is a kfree_skb_reason() helper function that can be used to
> >>>> find the reason why this skb is dropped, most callers didn't increase
> >>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
> >>>>
> >>>> For users, the more pressing concern is why the drop counters in ip
> >>>> keep increasing.
> >>>>
> >>>> Introduce netdev_core_stats_inc() to trace the caller of the dropped
> >>>> skb. Also, add __cold to netdev_core_stats_alloc(), as it is unlikely
> >>>> to be called.
> >>>>
> >>>> Signed-off-by: Yajun Deng
> >>>> Suggested-by: Alexander Lobakin
> >>>> ---
> >>>> v6: merge netdev_core_stats and netdev_core_stats_inc together
> >>>> v5: Access the per-cpu pointer before reaching the relevant offset.
> >>>> v4: Introduce netdev_core_stats_inc() instead of exporting dev_core_stats_*_inc()
> >>>> v3: __cold should be added to the netdev_core_stats_alloc().
> >>>> v2: use __cold instead of inline in dev_core_stats().
> >>>> v1: https://lore.kernel.org/netdev/20230911082016.3694700-1-yajun.deng@linux.dev/
> >>>> ---
> >>>>  include/linux/netdevice.h | 21 ++++-----------------
> >>>>  net/core/dev.c            | 17 +++++++++++++++--
> >>>>  2 files changed, 19 insertions(+), 19 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >>>> index 7e520c14eb8c..eb1fa04fbccc 100644
> >>>> --- a/include/linux/netdevice.h
> >>>> +++ b/include/linux/netdevice.h
> >>>> @@ -4002,32 +4002,19 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
> >>>>          return false;
> >>>>  }
> >>>>
> >>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev);
> >>>> -
> >>>> -static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> >>>> -{
> >>>> -        /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> >>>> -        struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> >>>> -
> >>>> -        if (likely(p))
> >>>> -                return p;
> >>>> -
> >>>> -        return netdev_core_stats_alloc(dev);
> >>>> -}
> >>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset);
> >>>>
> >>>>  #define DEV_CORE_STATS_INC(FIELD)                                          \
> >>>>  static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev)    \
> >>>>  {                                                                           \
> >>>> -        struct net_device_core_stats __percpu *p;                           \
> >>>> -                                                                            \
> >>>> -        p = dev_core_stats(dev);                                            \
> >>>> -        if (p)                                                              \
> >>>> -                this_cpu_inc(p->FIELD);                                     \
> >>> Note that we were using this_cpu_inc() which implied :
> >>> - IRQ safety, and
> >>> - a barrier paired with :
> >>>
> >>> net/core/dev.c:10548:        storage->rx_dropped += READ_ONCE(core_stats->rx_dropped);
> >>> net/core/dev.c:10549:        storage->tx_dropped += READ_ONCE(core_stats->tx_dropped);
> >>> net/core/dev.c:10550:        storage->rx_nohandler += READ_ONCE(core_stats->rx_nohandler);
> >>> net/core/dev.c:10551:        storage->rx_otherhost_dropped += READ_ONCE(core_stats->rx_otherhost_dropped);
> >>>
> >>>
> >>>> +        netdev_core_stats_inc(dev,                                          \
> >>>> +                              offsetof(struct net_device_core_stats, FIELD)); \
> >>>>  }
> >>>>  DEV_CORE_STATS_INC(rx_dropped)
> >>>>  DEV_CORE_STATS_INC(tx_dropped)
> >>>>  DEV_CORE_STATS_INC(rx_nohandler)
> >>>>  DEV_CORE_STATS_INC(rx_otherhost_dropped)
> >>>> +#undef DEV_CORE_STATS_INC
> >>>>
> >>>>  static __always_inline int ____dev_forward_skb(struct net_device *dev,
> >>>>                                                 struct sk_buff *skb,
> >>>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>>> index 606a366cc209..88a32c392c1d 100644
> >>>> --- a/net/core/dev.c
> >>>> +++ b/net/core/dev.c
> >>>> @@ -10497,7 +10497,8 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
> >>>>  }
> >>>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
> >>>>
> >>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> >>>> +static __cold struct net_device_core_stats __percpu *netdev_core_stats_alloc(
> >>>> +                struct net_device *dev)
> >>>>  {
> >>>>          struct net_device_core_stats __percpu *p;
> >>>>
> >>>> @@ -10510,7 +10511,19 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
> >>>>          /* This READ_ONCE() pairs with the cmpxchg() above */
> >>>>          return READ_ONCE(dev->core_stats);
> >>>>  }
> >>>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> >>>> +
> >>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset)
> >>>> +{
> >>>> +        /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> >>>> +        struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> >>>> +
> >>>> +        if (unlikely(!p))
> >>>> +                p = netdev_core_stats_alloc(dev);
> >>>> +
> >>>> +        if (p)
> >>>> +                (*(unsigned long *)((void *)this_cpu_ptr(p) + offset))++;
> >>> While here you are using a ++ operation that :
> >>>
> >>> - is not irq safe
> >>> - might cause store-tearing.
> >>>
> >>> I would suggest a preliminary patch converting the "unsigned long" fields in
> >>> struct net_device_core_stats to local_t
> >> Do you mean it needs to revert the commit 6510ea973d8d ("net: Use
> >> this_cpu_inc() to increment net->core_stats") first? But it would
> >> allocate memory which breaks on PREEMPT_RT.
> > I think I provided an (untested) alternative.
> >
> > unsigned long __percpu *field = (__force unsigned long __percpu *)
> >         ((__force u8 *)p + offset);
> > this_cpu_inc(field);
>
> unsigned long __percpu *field = (__force unsigned long __percpu *)
>         ((__force u8 *)p + offset);
> this_cpu_inc(*(int *)field);
>
> This would compile successfully, but I didn't test it.
> This could look complex.

Why exactly ? Not very different from the cast you already had.

> Should I base it on v3? Export dev_core_stats_*_inc() instead of introducing netdev_core_stats_inc().
> That would be easy.

Well, you tell me, but this does not look incremental to me.

I do not think we need 4 different (and maybe more to come if
struct net_device_core_stats grows in the future) functions for some
hardly used path.
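For the record, a completely untested sketch (not a tested patch) of the
shape I have in mind for the final helper, reusing the cast above but with
the dereference this_cpu_inc() expects:

void netdev_core_stats_inc(struct net_device *dev, u32 offset)
{
        /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
        struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
        unsigned long __percpu *field;

        if (unlikely(!p))
                p = netdev_core_stats_alloc(dev);
        if (!p)
                return;

        /* this_cpu_inc() keeps the increment IRQ-safe and free of
         * store-tearing, like the old dev_core_stats_*_inc() helpers.
         */
        field = (__force unsigned long __percpu *)((__force u8 *)p + offset);
        this_cpu_inc(*field);
}

One exported function, the dev_core_stats_*_inc() wrappers stay trivial
inlines around offsetof(), and the fast path is still a single this_cpu_inc().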