From: Alexei Starovoitov
Date: Thu, 13 Jul 2023 16:10:03 -0700
Subject: Re: [PATCH bpf-next v4 2/6] netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
To: Daniel Xu
Cc: Andrii Nakryiko, Alexei Starovoitov, Florian Westphal,
 "David S. Miller", Pablo Neira Ayuso, Paolo Abeni, Daniel Borkmann,
 Eric Dumazet, Jakub Kicinski, Jozsef Kadlecsik, Martin KaFai Lau,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, bpf, LKML, netfilter-devel, coreteam@netfilter.org,
 Network Development, David Ahern
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 12, 2023 at 9:33 PM Daniel Xu wrote:
>
> On Wed, Jul 12, 2023 at 06:26:13PM -0700, Alexei Starovoitov wrote:
> > On Wed, Jul 12, 2023 at 6:22 PM Daniel Xu wrote:
> > >
> > > Hi Alexei,
> > >
> > > On Wed, Jul 12, 2023 at 05:43:49PM -0700, Alexei Starovoitov wrote:
> > > > On Wed, Jul 12, 2023 at 4:44 PM Daniel Xu wrote:
> > > > > +#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
> > > > > +	case NFPROTO_IPV6:
> > > > > +		rcu_read_lock();
> > > > > +		v6_hook = rcu_dereference(nf_defrag_v6_hook);
> > > > > +		if (!v6_hook) {
> > > > > +			rcu_read_unlock();
> > > > > +			err = request_module("nf_defrag_ipv6");
> > > > > +			if (err)
> > > > > +				return err < 0 ? err : -EINVAL;
> > > > > +
> > > > > +			rcu_read_lock();
> > > > > +			v6_hook = rcu_dereference(nf_defrag_v6_hook);
> > > > > +			if (!v6_hook) {
> > > > > +				WARN_ONCE(1, "nf_defrag_ipv6_hooks bad registration");
> > > > > +				err = -ENOENT;
> > > > > +				goto out_v6;
> > > > > +			}
> > > > > +		}
> > > > > +
> > > > > +		err = v6_hook->enable(link->net);
> > > >
> > > > I was about to apply, but luckily caught this issue in my local test:
> > > >
> > > > [   18.462448] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
> > > > [   18.463238] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 2042, name: test_progs
> > > > [   18.463927] preempt_count: 0, expected: 0
> > > > [   18.464249] RCU nest depth: 1, expected: 0
> > > > [   18.464631] CPU: 15 PID: 2042 Comm: test_progs Tainted: G O 6.4.0-04319-g6f6ec4fa00dc #4896
> > > > [   18.465480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > > > [   18.466531] Call Trace:
> > > > [   18.466767]
> > > > [   18.466975]  dump_stack_lvl+0x32/0x40
> > > > [   18.467325]  __might_resched+0x129/0x180
> > > > [   18.467691]  mutex_lock+0x1a/0x40
> > > > [   18.468057]  nf_defrag_ipv4_enable+0x16/0x70
> > > > [   18.468467]  bpf_nf_link_attach+0x141/0x300
> > > > [   18.468856]  __sys_bpf+0x133e/0x26d0
> > > >
> > > > You cannot call mutex under rcu_read_lock.
> > >
> > > Whoops, my bad. I think this patch should fix it:
> > >
> > > ```
> > > From 7e8927c44452db07ddd7cf0e30bb49215fc044ed Mon Sep 17 00:00:00 2001
> > > Message-ID: <7e8927c44452db07ddd7cf0e30bb49215fc044ed.1689211250.git.dxu@dxuuu.xyz>
> > > From: Daniel Xu
> > > Date: Wed, 12 Jul 2023 19:17:35 -0600
> > > Subject: [PATCH] netfilter: bpf: Don't hold rcu_read_lock during
> > >  enable/disable
> > >
> > > ->enable()/->disable() takes a mutex which can sleep. You can't sleep
> > > during RCU read side critical section.
> > >
> > > Our refcnt on the module will protect us from ->enable()/->disable()
> > > from going away while we call it.
> > >
> > > Signed-off-by: Daniel Xu
> > > ---
> > >  net/netfilter/nf_bpf_link.c | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c
> > > index 77ffbf26ba3d..79704cc596aa 100644
> > > --- a/net/netfilter/nf_bpf_link.c
> > > +++ b/net/netfilter/nf_bpf_link.c
> > > @@ -60,9 +60,12 @@ static int bpf_nf_enable_defrag(struct bpf_nf_link *link)
> > >  			goto out_v4;
> > >  		}
> > >
> > > +		rcu_read_unlock();
> > >  		err = v4_hook->enable(link->net);
> > >  		if (err)
> > >  			module_put(v4_hook->owner);
> > > +
> > > +		return err;
> > >  out_v4:
> > >  		rcu_read_unlock();
> > >  		return err;
> > > @@ -92,9 +95,12 @@ static int bpf_nf_enable_defrag(struct bpf_nf_link *link)
> > >  			goto out_v6;
> > >  		}
> > >
> > > +		rcu_read_unlock();
> > >  		err = v6_hook->enable(link->net);
> > >  		if (err)
> > >  			module_put(v6_hook->owner);
> > > +
> > > +		return err;
> > >  out_v6:
> > >  		rcu_read_unlock();
> > >  		return err;
> > > @@ -114,11 +120,11 @@ static void bpf_nf_disable_defrag(struct bpf_nf_link *link)
> > >  	case NFPROTO_IPV4:
> > >  		rcu_read_lock();
> > >  		v4_hook = rcu_dereference(nf_defrag_v4_hook);
> > > +		rcu_read_unlock();
> > >  		if (v4_hook) {
> > >  			v4_hook->disable(link->net);
> > >  			module_put(v4_hook->owner);
> > >  		}
> > > -		rcu_read_unlock();
> > >
> > >  		break;
> > >  #endif
> > > @@ -126,11 +132,11 @@ static void bpf_nf_disable_defrag(struct bpf_nf_link *link)
> > >  	case NFPROTO_IPV6:
> > >  		rcu_read_lock();
> > >  		v6_hook = rcu_dereference(nf_defrag_v6_hook);
> > > +		rcu_read_unlock();
> >
> > No. v6_hook is gone as soon as you unlock it.
>
> I think we're protected here by the try_module_get() on the enable path.
> And we only disable defrag if enabling succeeds. The module shouldn't
> be able to deregister its hooks until we call the module_put() later.
>
> I think READ_ONCE() would've been more appropriate but I wasn't sure if
> that was ok given nf_defrag_v(4|6)_hook is written to by
> rcu_assign_pointer() and I was assuming symmetry is necessary.

Why is rcu_assign_pointer() used?
If it's not RCU protected, what is the point of rcu_*() accessors
and rcu_read_lock()?

In general, the pattern:
rcu_read_lock();
ptr = rcu_dereference(...);
rcu_read_unlock();
ptr->..
is a bug. 100%.
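
For readers following the thread, here is a rough user-space sketch of the usual alternative to that pattern: pin the object while still inside the read-side section, then drop the read lock before calling anything that may sleep. Everything in it is a stand-in invented for illustration (mock_rcu_read_lock(), mock_rcu_read_unlock(), struct mock_hook, mock_enable(), enable_defrag_sketch()); none of it is the kernel API, and it is not meant to represent how this series was eventually resolved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* User-space stand-ins, NOT the kernel primitives (assumption). */
static int rcu_depth;	/* models RCU read-side nesting depth */

static void mock_rcu_read_lock(void)   { rcu_depth++; }
static void mock_rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical hook object; refcnt models the owning module's refcount. */
struct mock_hook {
	int refcnt;
	int enabled;
};

/* Models a published pointer such as nf_defrag_v6_hook. */
static struct mock_hook *published_hook;

/* Models ->enable(), which takes a mutex and therefore may sleep. */
static int mock_enable(struct mock_hook *h)
{
	/* Sleeping inside a read-side section is the bug from the splat. */
	assert(rcu_depth == 0);
	h->enabled++;
	return 0;
}

static int enable_defrag_sketch(void)
{
	struct mock_hook *h;
	int err;

	mock_rcu_read_lock();
	h = published_hook;	/* stands in for rcu_dereference() */
	if (!h) {
		mock_rcu_read_unlock();
		return -ENOENT;
	}
	h->refcnt++;		/* stands in for try_module_get(): pin before unlock */
	mock_rcu_read_unlock();

	/* The reference taken above, not RCU, keeps h valid from here on. */
	err = mock_enable(h);
	if (err)
		h->refcnt--;	/* stands in for module_put() on failure */
	return err;
}
```

The point of the sketch is only the ordering: the reference is taken between the dereference and the unlock, so the pointer used after the unlock is covered by something other than the read-side section.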