Received: by 2002:a05:7412:da14:b0:e2:908c:2ebd with SMTP id fe20csp1315762rdb; Sun, 8 Oct 2023 01:54:27 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHKb6M7hTl0dyjLYRtm7YcYkOG8+5gofw7++401RWNAlis5a91dJWFTol/DTIHdpMXDhUt+ X-Received: by 2002:a17:902:ab12:b0:1c3:868f:5958 with SMTP id ik18-20020a170902ab1200b001c3868f5958mr10410000plb.20.1696755266985; Sun, 08 Oct 2023 01:54:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696755266; cv=none; d=google.com; s=arc-20160816; b=rapzb1aCSFncL/9Mky8W1Ch1oeuG07qmCzrP8W8vfzfsJEWbBF0roJkzItAg6ux4tL SBYvtvdhCbqt4/s8aLOgRE2K9X/EjeIgPHbpK1+WbSZ+E6VlS8fFKbAoE7r1uPBvskHg SDOmBglVANfbi7yw9WgwilQS4B6ngWDtP3BakjFkcYY2HlT9XT2g+JKBDZt367Rdwb7i GNti+hRYIQyAhreEYMNCM+JyMZZ/CcHwPIr/id8nNSWkqFLSYAEVTJid44BtyQGTzUoa Ug/HvrUgKb4595VTwXl/26I6E39vvm9tH5D59wlgozBp/MvsIWK4DUhb+6b0BatwoSz3 Cfxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=M3HAurHN8+gOZu2k1gt6qYEl5KYRR6FLliDA1VuqOt4=; fh=tX1pVPzUB4VqSEEJuD493A3oLLMsXX2GAylXkqZLhQo=; b=QlBqxk5HPI44Who3iroQUp8QC6vgjMVzc+S5gk2LUj7LyHRrofAHDuEn4u5iMCBvtw 4DgjV+ODGqxrHZObHSeSg82uCz9yE9HKQUQCAXH3wDF9dBs8XsbTgOpj+1ibMwUKrnhm UL1udtAWowFwst9T4W5ztO9PxJf6NAqdXsaoqke8XLLiWuLn46hVAXWNz2YN/t4m9FSM r+MV47UrclKoyLXzqeVm78goGaQf1ZTSt3I2aDlM9Ouayx8N1h3wpwre8Snj1eYgWL5B t/Z8NHFbqasYkFO5+Ai8C4VoLBhrEBctrBhedASuTe44XtFSIoSIx8jaIAUnvPDAxj+m UCgg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=ol0AZbAR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from morse.vger.email (morse.vger.email. [2620:137:e000::3:1]) by mx.google.com with ESMTPS id p10-20020a170902e74a00b001c73f81b24asi2991462plf.521.2023.10.08.01.54.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 08 Oct 2023 01:54:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) client-ip=2620:137:e000::3:1; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=ol0AZbAR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 0DA01802DBB2; Sun, 8 Oct 2023 01:54:24 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234274AbjJHIx7 (ORCPT + 99 others); Sun, 8 Oct 2023 04:53:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344547AbjJHIx4 (ORCPT ); Sun, 8 Oct 2023 04:53:56 -0400 Received: from mail-ed1-x530.google.com (mail-ed1-x530.google.com [IPv6:2a00:1450:4864:20::530]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76CAE128 for ; Sun, 8 Oct 2023 01:53:21 -0700 (PDT) Received: by mail-ed1-x530.google.com with SMTP id 4fb4d7f45d1cf-536ef8a7dcdso8192a12.0 for ; Sun, 08 Oct 2023 01:53:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1696755199; x=1697359999; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=M3HAurHN8+gOZu2k1gt6qYEl5KYRR6FLliDA1VuqOt4=; b=ol0AZbARuL0jZqwp4K4OtB+6w5UCOjBEk6Im8OnH1JHlmtNP2gn7V8m7ks3Wvh1UuW s3GkKvA8+02c8TuACf7D/CRVF399zsq7jyZom0WbNjqZow3/+otJhdjbPTbWiQsKsl9P eosLMQg/Al7Y1EYXIdRCBwmqBbvZdYJ4yq2DX2YWyJ3FbQSTljuRpMUvf5M1Q/vRoXvC RCpwJnHJQvCBDCItrpesKbLLQuGULZP2eND1XT6YYBE9wMoDdmKpQpOejdcQyppFsxBR Ydq3v3xGZxrFFRj7iDwzuj0IOAPoDxAL+jhE9JSXpPSHR78mWDj3CpD3yJH5O1cmoIP9 mr8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696755199; x=1697359999; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M3HAurHN8+gOZu2k1gt6qYEl5KYRR6FLliDA1VuqOt4=; b=py6WUGVpIiNByy6W3qvryB8xwFAErUg3kW1rvqmgOzrhdPymRh6dWCkNf6Fo1v8SBZ rgwiBpY+2gl1jrLkID4pM+4bJYCSRI7HnpSQY1dv5b2hKd/g1smIUQYAAJIxUnbkCReI igwRR8gI15rDg6qKWIRXjmk46q7mXcHKVdfBdB9wTbOI34G3kz93SNqrB8xMdV4lqK7Z rqQrqqZYdtr6aysTmNgEEl/LtUKm6kTamQY/4aEo5hrfDMzBTDe6wS7HkrEaTN6JN2CG Wsu97aTbuvnXu6KCMsxLGl9tLMyL+WG9+hrFbZv7+5Ho7M2O3c+/1IFmoyEKDrEOAMud 07ww== X-Gm-Message-State: AOJu0YzXa87LHB9tK3USUXQzjMPenRG0o+tsSeiihrcIsETGssKuG2fv oUqWCkcOs4Sgeno/5SQxj9eVXb/EI9D0LcxXImq3Eg== X-Received: by 2002:a50:d0d7:0:b0:538:1d3b:172f with SMTP id g23-20020a50d0d7000000b005381d3b172fmr299192edf.3.1696755199254; Sun, 08 Oct 2023 01:53:19 -0700 (PDT) MIME-Version: 1.0 References: <20231007050621.1706331-1-yajun.deng@linux.dev> <917708b5-cb86-f233-e878-9233c4e6c707@linux.dev> <9f4fb613-d63f-9b86-fe92-11bf4dfb7275@linux.dev> In-Reply-To: <9f4fb613-d63f-9b86-fe92-11bf4dfb7275@linux.dev> From: Eric Dumazet Date: Sun, 8 Oct 2023 10:53:05 +0200 Message-ID: Subject: Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc() To: Yajun Deng Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Alexander Lobakin Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-4.8 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Sun, 08 Oct 2023 01:54:24 -0700 (PDT) On Sun, Oct 8, 2023 at 10:44=E2=80=AFAM Yajun Deng w= rote: > > > On 2023/10/8 15:18, Eric Dumazet wrote: > > On Sun, Oct 8, 2023 at 9:00=E2=80=AFAM Yajun Deng wrote: > >> > >> On 2023/10/8 14:45, Eric Dumazet wrote: > >>> On Sat, Oct 7, 2023 at 8:34=E2=80=AFAM Yajun Deng wrote: > >>>> On 2023/10/7 13:29, Eric Dumazet wrote: > >>>>> On Sat, Oct 7, 2023 at 7:06=E2=80=AFAM Yajun Deng wrote: > >>>>>> Although there is a kfree_skb_reason() helper function that can be= used to > >>>>>> find the reason why this skb is dropped, but most callers didn't i= ncrease > >>>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropp= ed. > >>>>>> > >>>>> ... > >>>>> > >>>>>> + > >>>>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset) > >>>>>> +{ > >>>>>> + /* This READ_ONCE() pairs with the write in netdev_core_st= ats_alloc() */ > >>>>>> + struct net_device_core_stats __percpu *p =3D READ_ONCE(dev= ->core_stats); > >>>>>> + unsigned long *field; > >>>>>> + > >>>>>> + if (unlikely(!p)) > >>>>>> + p =3D netdev_core_stats_alloc(dev); > >>>>>> + > >>>>>> + if (p) { > >>>>>> + field =3D (unsigned long *)((void *)this_cpu_ptr(p= ) + offset); > >>>>>> + WRITE_ONCE(*field, READ_ONCE(*field) + 1); > >>>>> This is broken... > >>>>> > >>>>> As I explained earlier, dev_core_stats_xxxx(dev) can be called from > >>>>> many different contexts: > >>>>> > >>>>> 1) process contexts, where preemption and migration are allowed. > >>>>> 2) interrupt contexts. > >>>>> > >>>>> Adding WRITE_ONCE()/READ_ONCE() is not solving potential races. > >>>>> > >>>>> I _think_ I already gave you how to deal with this ? > >>>> Yes, I replied in v6. > >>>> > >>>> https://lore.kernel.org/all/e25b5f3c-bd97-56f0-de86-b93a3172870d@lin= ux.dev/ > >>>> > >>>>> Please try instead: > >>>>> > >>>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset) > >>>>> +{ > >>>>> + /* This READ_ONCE() pairs with the write in netdev_core_sta= ts_alloc() */ > >>>>> + struct net_device_core_stats __percpu *p =3D READ_ONCE(dev-= >core_stats); > >>>>> + unsigned long __percpu *field; > >>>>> + > >>>>> + if (unlikely(!p)) { > >>>>> + p =3D netdev_core_stats_alloc(dev); > >>>>> + if (!p) > >>>>> + return; > >>>>> + } > >>>>> + field =3D (__force unsigned long __percpu *)((__force void = *)p + offset); > >>>>> + this_cpu_inc(*field); > >>>>> +} > >>>> This wouldn't trace anything even the rx_dropped is in increasing. I= t > >>>> needs to add an extra operation, such as: > >>> I honestly do not know what you are talking about. > >>> > >>> Have you even tried to change your patch to use > >>> > >>> field =3D (__force unsigned long __percpu *)((__force void *)p + offs= et); > >>> this_cpu_inc(*field); > >> > >> Yes, I tested this code. But the following couldn't show anything even > >> if the rx_dropped is increasing. > >> > >> 'sudo python3 /usr/share/bcc/tools/trace netdev_core_stats_inc' > > Well, I am not sure about this, "bpftrace" worked for me. > > > > Make sure your toolchain generates something that looks like what I got= : > > > > 000000000000ef20 : > > ef20: f3 0f 1e fa endbr64 > > ef24: e8 00 00 00 00 call ef29 > > ef25: R_X86_64_PLT32 __fentry__-0x4 > > ef29: 55 push %rbp > > ef2a: 48 89 e5 mov %rsp,%rbp > > ef2d: 53 push %rbx > > ef2e: 89 f3 mov %esi,%ebx > > ef30: 48 8b 87 f0 01 00 00 mov 0x1f0(%rdi),%rax > > ef37: 48 85 c0 test %rax,%rax > > ef3a: 74 0b je ef47 > > ef3c: 89 d9 mov %ebx,%ecx > > ef3e: 65 48 ff 04 08 incq %gs:(%rax,%rcx,1) > > ef43: 5b pop %rbx > > ef44: 5d pop %rbp > > ef45: c3 ret > > ef46: cc int3 > > ef47: e8 00 00 00 00 call ef4c > > ef48: R_X86_64_PLT32 .text.unlikely.+0x13c > > ef4c: 48 85 c0 test %rax,%rax > > ef4f: 75 eb jne ef3c > > ef51: eb f0 jmp ef43 > > ef53: 66 66 66 66 2e 0f 1f data16 data16 data16 cs nopw 0x0(%rax,%= rax,1) > > ef5a: 84 00 00 00 00 00 > > > I'll share some I can see it. > > 1. > > objdump -D vmlinux > > ffffffff81b2f170 : > ffffffff81b2f170: e8 8b ea 55 ff callq ffffffff8108dc00 > <__fentry__> > ffffffff81b2f175: 55 push %rbp > ffffffff81b2f176: 48 89 e5 mov %rsp,%rbp > ffffffff81b2f179: 48 83 ec 08 sub $0x8,%rsp > ffffffff81b2f17d: 48 8b 87 e8 01 00 00 mov 0x1e8(%rdi),%rax > ffffffff81b2f184: 48 85 c0 test %rax,%rax > ffffffff81b2f187: 74 0d je ffffffff81b2f196 > > ffffffff81b2f189: 89 f6 mov %esi,%esi > ffffffff81b2f18b: 65 48 ff 04 30 incq %gs:(%rax,%rsi,1) > ffffffff81b2f190: c9 leaveq > ffffffff81b2f191: e9 aa 31 6d 00 jmpq ffffffff82202340 > <__x86_return_thunk> > ffffffff81b2f196: 89 75 fc mov %esi,-0x4(%rbp) > ffffffff81b2f199: e8 82 ff ff ff callq ffffffff81b2f120 > > ffffffff81b2f19e: 8b 75 fc mov -0x4(%rbp),%esi > ffffffff81b2f1a1: 48 85 c0 test %rax,%rax > ffffffff81b2f1a4: 75 e3 jne ffffffff81b2f189 > > ffffffff81b2f1a6: c9 leaveq > ffffffff81b2f1a7: e9 94 31 6d 00 jmpq ffffffff82202340 > <__x86_return_thunk> > ffffffff81b2f1ac: 0f 1f 40 00 nopl 0x0(%rax) > > > 2. > > sudo cat /proc/kallsyms | grep netdev_core_stats_inc > > ffffffff9c72f120 T netdev_core_stats_inc > ffffffff9ca2676c t netdev_core_stats_inc.cold > ffffffff9d5235e0 r __ksymtab_netdev_core_stats_inc > > > 3. > > =E2=9E=9C ~ ifconfig enp34s0f0 > enp34s0f0: flags=3D4163 mtu 1500 > inet 10.10.30.88 netmask 255.255.255.0 broadcast 10.10.30.255 > inet6 fe80::6037:806c:14b6:f1ca prefixlen 64 scopeid 0x20 > ether 04:d4:c4:5c:81:42 txqueuelen 1000 (Ethernet) > RX packets 29024 bytes 3118278 (3.1 MB) > RX errors 0 dropped 794 overruns 0 frame 0 > TX packets 16961 bytes 2662290 (2.6 MB) > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 > device interrupt 29 memory 0x39fff4000000-39fff47fffff > > =E2=9E=9C ~ ifconfig enp34s0f0 > enp34s0f0: flags=3D4163 mtu 1500 > inet 10.10.30.88 netmask 255.255.255.0 broadcast 10.10.30.255 > inet6 fe80::6037:806c:14b6:f1ca prefixlen 64 scopeid 0x20 > ether 04:d4:c4:5c:81:42 txqueuelen 1000 (Ethernet) > RX packets 29272 bytes 3148997 (3.1 MB) > RX errors 0 dropped 798 overruns 0 frame 0 > TX packets 17098 bytes 2683547 (2.6 MB) > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 > device interrupt 29 memory 0x39fff4000000-39fff47fffff > > > The rx_dropped is increasing. > > > 4. > > sudo python3 /usr/share/bcc/tools/trace netdev_core_stats_inc > > TIME PID TID COMM FUNC > > (Empty, I didn't see anything.) > > > 5. > > sudo trace-cmd record -p function -l netdev_core_stats_inc > > sudo trace-cmd report > > (Empty, I didn't see anything.) > > > If I add a 'pr_info("\n");' like: > > + pr_info("\n"); > field =3D (__force unsigned long __percpu *)((__force void *)p + > offset); > this_cpu_inc(*field); > > > Everything is OK. The 'pr_info("\n");' can be changed to anything else, > but not > > without it. This seems to be a bug that has nothing to do with the patch. Try getting help from Steven maybe.