From: Stanislav Fomichev
Date: Thu, 16 Dec 2021 10:24:38 -0800
Subject: Re: [PATCH v3] cgroup/bpf: fast path skb BPF filtering
To: Martin KaFai Lau
Cc: Pavel Begunkov, netdev@vger.kernel.org, bpf@vger.kernel.org,
    Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Song Liu,
    linux-kernel@vger.kernel.org
In-Reply-To: <20211216181449.p2izqxgzmfpknbsw@kafai-mbp.dhcp.thefacebook.com>

On Thu, Dec 16, 2021 at 10:14 AM Martin KaFai Lau wrote:
>
> On Thu, Dec 16, 2021 at 01:21:26PM +0000, Pavel Begunkov wrote:
> > On 12/15/21 22:07, Stanislav Fomichev wrote:
> > > On Wed, Dec 15, 2021 at 11:55 AM Pavel Begunkov wrote:
> > > >
> > > > On 12/15/21 19:15, Stanislav Fomichev wrote:
> > > > > On Wed, Dec 15, 2021 at 10:54 AM Pavel Begunkov wrote:
> > > > > >
> > > > > > On 12/15/21 18:24, sdf@google.com wrote:
> > [...]
> > > > > > > I can probably do more experiments on my side once your patch is
> > > > > > > accepted. I'm mostly concerned with getsockopt(TCP_ZEROCOPY_RECEIVE).
> > > > > > > If you claim there is visible overhead for a direct call then there
> > > > > > > should be visible benefit to using CGROUP_BPF_TYPE_ENABLED there as
> > > > > > > well.
> > > > > >
> > > > > > Interesting, sounds getsockopt might be performance sensitive to
> > > > > > someone.
> > > > > >
> > > > > > FWIW, I forgot to mention that for testing tx I'm using io_uring
> > > > > > (for both zc and not) with good submission batching.
> > > > >
> > > > > Yeah, last time I saw 2-3% as well, but it was due to kmalloc, see
> > > > > more details in 9cacf81f8161, it was pretty visible under perf.
> > > > > That's why I'm a bit skeptical of your claims of direct calls being
> > > > > somehow visible in these 2-3% (even skb pulls/pushes are not 2-3%?).
> > > >
> > > > migrate_disable/enable together were taking somewhat in-between
> > > > 1% and 1.5% in profiling, don't remember the exact number. The rest
> > > > should be from rcu_read_lock/unlock() in BPF_PROG_RUN_ARRAY_CG_FLAGS()
> > > > and other extra bits on the way.
> > >
> > > You probably have a preemptible kernel and preemptible rcu which most
> > > likely explains why you see the overhead and I won't (non-preemptible
> > > kernel in our env, rcu_read_lock is essentially a nop, just a compiler
> > > barrier).
> >
> > Right. For reference tried out non-preemptible, perf shows the function
> > taking 0.8% with a NIC and 1.2% with a dummy netdev.
> >
> > > >
> > > > I'm skeptical I'll be able to measure inlining one function,
> > > > variability between boots/runs is usually greater and would hide it.
> > >
> > > Right, that's why I suggested to mirror what we do in set/getsockopt
> > > instead of the new extra CGROUP_BPF_TYPE_ENABLED. But I'll leave it up
> > > to you, Martin and the rest.
> I also suggested to try to stay with one way for fullsock context in v2
> but it is for code readability reason.
>
> How about calling CGROUP_BPF_TYPE_ENABLED() just next to cgroup_bpf_enabled()
> in BPF_CGROUP_RUN_PROG_*SOCKOPT_*() instead ?

SG!

> It is because both cgroup_bpf_enabled() and CGROUP_BPF_TYPE_ENABLED()
> want to check if there is bpf to run before proceeding everything else
> and then I don't need to jump to the non-inline function itself to see
> if there is other prog array empty check.
>
> Stan, do you have concern on an extra inlined sock_cgroup_ptr()
> when there is bpf prog to run for set/getsockopt()? I think
> it should be mostly noise from looking at
> __cgroup_bpf_run_filter_*sockopt()?

Yeah, my concern is also mostly about readability/consistency. Either
__cgroup_bpf_prog_array_is_empty everywhere or this new
CGROUP_BPF_TYPE_ENABLED everywhere. I'm slightly leaning towards
__cgroup_bpf_prog_array_is_empty because I don't believe direct
function calls add any visible overhead and macros are ugly :-)
But either way is fine as long as it looks consistent.
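
For illustration only, here is a rough userspace sketch of the pattern under discussion: a cheap global "anything attached?" check, followed by an inline per-cgroup "is the prog array empty?" check, both done before jumping to the out-of-line filter function. This is not kernel code. Every name below (hook_enabled, prog_array_is_empty, run_filter, run_hook, and the structs) is a hypothetical stand-in, the plain bool array only approximates what static keys do in the kernel, and the slow path just counts its invocations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hook types; NOT the kernel's attach-type enum. */
enum hook_type { HOOK_INET_EGRESS, HOOK_GETSOCKOPT, HOOK_MAX };

/* Hypothetical stand-in for an effective program array. */
struct prog_array {
    size_t nr_progs;
};

/* Hypothetical per-cgroup state: one effective array per hook type. */
struct cgroup_state {
    struct prog_array effective[HOOK_MAX];
};

/* Assumption: a plain flag models the global per-type "enabled" check. */
static bool hook_enabled[HOOK_MAX];

/* Counts how often the out-of-line slow path was actually entered. */
static int slow_path_calls;

/* Out-of-line slow path; a real one would run each attached program. */
static int run_filter(struct cgroup_state *cg, enum hook_type type)
{
    assert(type < HOOK_MAX);
    (void)cg;
    slow_path_calls++;
    return 0; /* pretend every program allowed the operation */
}

/* Inline emptiness check, placed next to the global check by the caller. */
static inline bool prog_array_is_empty(const struct cgroup_state *cg,
                                       enum hook_type type)
{
    return cg->effective[type].nr_progs == 0;
}

/* Fast path: both cheap checks happen before the function call. */
static inline int run_hook(struct cgroup_state *cg, enum hook_type type)
{
    if (!hook_enabled[type])
        return 0; /* nothing attached anywhere for this type */
    if (prog_array_is_empty(cg, type))
        return 0; /* nothing attached to this particular cgroup */
    return run_filter(cg, type);
}
```

With both checks sitting side by side in run_hook(), a reader does not have to open run_filter() to find out whether another emptiness check is hiding there, which is the readability point raised above. Whether the emptiness check is spelled as a macro or as an inline helper does not change this shape.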