Date: Fri, 14 Jul 2023 11:47:41 +0200
From: Florian Westphal
To: Daniel Xu
Cc: Alexei Starovoitov, Andrii Nakryiko, Alexei Starovoitov,
    Florian Westphal, "David S. Miller", Pablo Neira Ayuso, Paolo Abeni,
    Daniel Borkmann, Eric Dumazet, Jakub Kicinski, Jozsef Kadlecsik,
    Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh,
    Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf, LKML, netfilter-devel,
    coreteam@netfilter.org, Network Development, David Ahern
Subject: Re: [PATCH bpf-next v4 2/6] netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
Message-ID: <20230714094741.GA7912@breakpoint.cc>
X-Mailing-List: linux-kernel@vger.kernel.org

Daniel Xu wrote:
> On Thu, Jul 13, 2023 at 04:10:03PM -0700, Alexei Starovoitov wrote:
> > Why is rcu_assign_pointer() used?
> > If it's not RCU protected, what is the point of rcu_*() accessors
> > and rcu_read_lock() ?
> >
> > In general, the pattern:
> >   rcu_read_lock();
> >   ptr = rcu_dereference(...);
> >   rcu_read_unlock();
> >   ptr->..
> > is a bug. 100%.

FWIW, I agree with Alexei, it does look... dodgy.

> The reason I left it like this is b/c otherwise I think there is a race
> with module unload and taking a refcnt. For example:
>
> ptr = READ_ONCE(global_var)
>
> // ptr invalid
> try_module_get(ptr->owner)
>

Yes, I agree.

> I think the synchronize_rcu() call in
> kernel/module/main.c:free_module() protects against that race based on
> my reading.
>
> Maybe the ->enable() path can store a copy of the hook ptr in
> struct bpf_nf_link to get rid of the odd rcu_dereference()?
>
> Open to other ideas too -- would appreciate any hints.

I would suggest the following:

- Switch ordering of patches 2 and 3.
  What is currently patch 3 would add the .owner fields only.

Then, what is currently patch #2 would document the rcu/modref
interaction like this (omitting error checking for brevity):

	rcu_read_lock();
	v6_hook = rcu_dereference(nf_defrag_v6_hook);
	if (!v6_hook) {
		rcu_read_unlock();
		err = request_module("nf_defrag_ipv6");
		if (err)
			return err < 0 ? err : -EINVAL;
		rcu_read_lock();
		v6_hook = rcu_dereference(nf_defrag_v6_hook);
	}

	if (v6_hook && try_module_get(v6_hook->owner))
		v6_hook = rcu_pointer_handoff(v6_hook);
	else
		v6_hook = NULL;

	rcu_read_unlock();

	if (!v6_hook)
		err();
	v6_hook->enable();

I'd store the v4/6_hook pointer in the nf bpf link struct. It's probably
more self-explanatory for the disable side, in that we picked up a
module reference that we still own at delete time, without need for any
rcu involvement.

Because the handoff above is repetitive for ipv4 and ipv6, I suggest
adding an agnostic helper for this.

I know you added distinct structures for ipv4 and ipv6, but if they
used the same one you could add

	static const struct nf_defrag_hook *get_proto_frag_hook(const struct nf_defrag_hook __rcu *hook,
								const char *modulename);

And then use it like:

	v4_hook = get_proto_frag_hook(nf_defrag_v4_hook, "nf_defrag_ipv4");

without needing to copy the modprobe and handoff part.

What do you think?
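
For illustration only, here is a rough userspace sketch of what such a
shared helper might look like. It is not kernel code: rcu_read_lock(),
rcu_dereference(), rcu_pointer_handoff(), try_module_get(), module_put()
and request_module() are replaced by trivial stand-ins so the control
flow can be compiled and run on its own, and the struct layouts are
assumed. Two details differ from the prototype above and are assumptions
of this sketch: the helper takes the address of the global hook pointer
so that the second read after request_module() sees a newly published
hook, and the error is returned through a hypothetical errp
out-parameter where kernel code would more likely use ERR_PTR().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed, simplified layouts; the real kernel structs differ. */
struct module { int refcnt; bool live; };

struct nf_defrag_hook {
	struct module *owner;
	int (*enable)(void);
};

/* Userspace stand-ins for the kernel primitives (no real RCU here). */
static void rcu_read_lock(void) {}
static void rcu_read_unlock(void) {}
#define rcu_dereference(p)     (p)
#define rcu_pointer_handoff(p) (p)

static bool try_module_get(struct module *m)
{
	if (!m->live)
		return false;	/* module is going away, no reference taken */
	m->refcnt++;
	return true;
}

static void module_put(struct module *m) { m->refcnt--; }

/* Mock "module": its hook is published only once it has been requested. */
static struct module ipv4_mod = { .refcnt = 0, .live = true };
static int v4_enable(void) { return 0; }
static const struct nf_defrag_hook ipv4_hook = {
	.owner = &ipv4_mod, .enable = v4_enable,
};
static const struct nf_defrag_hook *nf_defrag_v4_hook;	/* NULL until loaded */

static int request_module(const char *name)
{
	if (strcmp(name, "nf_defrag_ipv4") == 0) {
		nf_defrag_v4_hook = &ipv4_hook;
		return 0;
	}
	return -ENOENT;
}

/*
 * Look up the hook, loading the module on demand, and take a module
 * reference while still inside the read-side section. On success the
 * caller owns that reference and may use the pointer outside of RCU.
 */
static const struct nf_defrag_hook *
get_proto_frag_hook(const struct nf_defrag_hook **global_hook,
		    const char *modulename, int *errp)
{
	const struct nf_defrag_hook *hook;
	int err;

	*errp = 0;
	rcu_read_lock();
	hook = rcu_dereference(*global_hook);
	if (!hook) {
		rcu_read_unlock();
		err = request_module(modulename);
		if (err) {
			*errp = err < 0 ? err : -EINVAL;
			return NULL;
		}
		rcu_read_lock();
		hook = rcu_dereference(*global_hook);
	}

	if (hook && try_module_get(hook->owner)) {
		hook = rcu_pointer_handoff(hook);
	} else {
		hook = NULL;
		*errp = -ENOENT;
	}
	rcu_read_unlock();

	return hook;
}
```

A caller would keep the returned pointer (for example in the link
struct), call hook->enable(), and later drop the reference with
module_put(hook->owner) on the disable side, with no rcu involved at
that point.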