Received: by 2002:a05:7412:a986:b0:f9:90c9:de9f with SMTP id o6csp23851rdh; Wed, 20 Dec 2023 13:11:58 -0800 (PST) X-Google-Smtp-Source: AGHT+IGabUgJrGABatoiW+kFyqUR0b2pifQ1UzCTEHQIR06fej0qEpPJL6PTWC0H3zDQ+36KWE00 X-Received: by 2002:adf:f684:0:b0:336:7052:b4bb with SMTP id v4-20020adff684000000b003367052b4bbmr100540wrp.258.1703106717988; Wed, 20 Dec 2023 13:11:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1703106717; cv=none; d=google.com; s=arc-20160816; b=Hu+Xe6+aDUSKOGTbkhKlIGoC0FkrW77zbPa7Io3E1GB8buan0zdrrW/kgiyrLWgmPH fMCWCs9z9ucf6tbUyyGvoBQ7NSkxyB7LME2GLk1l+NiYTzC8wLWB5pKIIxohUOfhSLJM 0pMgO1+1b35778/t9AHOkf8aEcIO+nu6STTNaDIZb7oOmAqKGLb2r3X3hOxzQ23CYAr0 YSheyShgRqN/W49iQzb2c9VdyDnkX9VtNDkt7o+lmHUaTL96MyDlv0fIehEDLR4b34PO jygTz62FKW0f9qFvfGgMDorQdaVsdDOK8EpsaNg67yw5PWw5DUBVk5uSu6FNvvL70H25 0cng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:dkim-signature; bh=vcmfAZtKeHQ+xz4l7OTo1rSFylynVcbq+ZS/3NlOtSo=; fh=jz2MkKbeDckMvVuEN7RWtua4LnHOz2/NmjM+tCcyudQ=; b=mecRr16stuHJWeYpPArVnfF3h3DqVv8LEizdDfK09OeVgab5R8oALa5sAYZoq666R0 Bw1jLHMoFOjfEleUmowHiHU6trIDavBLoJ9oECbJeFFP8SQItudi41mmCDoJsWL/03m/ PaBqPtQpJINOJ+mXhDnYhjRW6VOEdgNGrdVUWmkRCascXjfbK+lSQdmgrXcwv5jKUqPo mBOrWO/J4SyxIyA1dTsf7N1lWxsT5dVYyyBZelqyJOqXFu0WAC/CW+mdcvL4SwWBYIIl SCDA8OpVNEzpS8Ss5XC0GtMXjjondTiLQF9oN/ZGEQ0bbhiO1RS05ALiJpSWaz5m+bC5 xlAA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=iFyhXVgn; spf=pass (google.com: domain of linux-kernel+bounces-7433-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-7433-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id f17-20020a50a6d1000000b0055354ab4e96si177811edc.512.2023.12.20.13.11.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Dec 2023 13:11:57 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-7433-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=iFyhXVgn; spf=pass (google.com: domain of linux-kernel+bounces-7433-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-7433-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 8EB361F23B29 for ; Wed, 20 Dec 2023 21:11:57 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id F10AE48CFD; Wed, 20 Dec 2023 21:11:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="iFyhXVgn" X-Original-To: linux-kernel@vger.kernel.org Received: from mail-wr1-f53.google.com (mail-wr1-f53.google.com [209.85.221.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8497B48796; Wed, 20 Dec 2023 21:11:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f53.google.com with SMTP id ffacd0b85a97d-33668163949so69799f8f.2; Wed, 20 Dec 2023 13:11:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1703106698; x=1703711498; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=vcmfAZtKeHQ+xz4l7OTo1rSFylynVcbq+ZS/3NlOtSo=; b=iFyhXVgnnS060kCc2qzkZT6ysybHruYBFnPRAG2ar7hLIqlrG8Qbzisn6tGYmfvNik 9VBIT/xh9CflEISptX6FR2gmjfDN8KEJHLre5A4b+QnQnryEQ31R818tr80FPx3/jEhP NSsyCPXRyxAJ/lg/6s9GKbyryB7+0ZcAypQj/b6osY1pS/EHb4aeASOGKCslpfE5GNz7 FBz8TPtyh+LCekCya+83RylqbyAoBQEOVQ2BV8WzyMHTtE6Fl/L8h8akII73JZZBdQ4b R5vrAASBeNKn0CXgp3FExlYGZo7MluqProU2YVQbGKj+yAEpsB5Q9EmLu+vIc3or4pKX UOTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703106698; x=1703711498; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vcmfAZtKeHQ+xz4l7OTo1rSFylynVcbq+ZS/3NlOtSo=; b=sqLYCwLIVCIz/hsWFqM+bTrYOm/pY6VzcGrIVq7OF7R0PzUFanIkZYQHwbp9zM1Z26 F9G5uNIwiSjn3u8XX3bCk8HDakp8D6hFTI53Pr/2FzY5FrB8YfcizzUaMDOP1mF6ATIj SA+RgBsKfJMqVX36z3mcoYC1OGTFRlGvRbdI/i9sCsg9tKScsSziFy3S548POr0Uy4su v4roxjBSv2NvLkruHTEB8lRswi7uhAhXUm/wQWekz1ZZm0q8SbrK9TGqDaw1GCgrHQHC qZ7EMIrEQvuhIXcQEciNJ0vpb8pmM61sQswV6CwDsIPfWsIiCZCmNm1Vq+KM4QcLmbEQ NAkw== X-Gm-Message-State: AOJu0YwsF0nRLAdcHKdmmABCHlWyFIooGHAeDr/HxARguldlILHpixmF 6ASVXF09jNL4/uUq9yNNBa+cqyWu1Gpq0NbPsbM= X-Received: by 2002:a5d:4f06:0:b0:336:6ebb:704f with SMTP id c6-20020a5d4f06000000b003366ebb704fmr63710wru.122.1703106697456; Wed, 20 Dec 2023 13:11:37 -0800 (PST) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <1703081351-85579-1-git-send-email-alibuda@linux.alibaba.com> <1703081351-85579-2-git-send-email-alibuda@linux.alibaba.com> In-Reply-To: <1703081351-85579-2-git-send-email-alibuda@linux.alibaba.com> From: Alexei Starovoitov Date: Wed, 20 Dec 2023 13:11:26 -0800 Message-ID: Subject: Re: [RFC nf-next v3 1/2] netfilter: bpf: support prog update To: "D. Wythe" Cc: Pablo Neira Ayuso , Jozsef Kadlecsik , Florian Westphal , bpf , LKML , Network Development , coreteam@netfilter.org, netfilter-devel , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Alexei Starovoitov Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Dec 20, 2023 at 6:09=E2=80=AFAM D. Wythe wrote: > > From: "D. Wythe" > > To support the prog update, we need to ensure that the prog seen > within the hook is always valid. Considering that hooks are always > protected by rcu_read_lock(), which provide us the ability to > access the prog under rcu. > > Signed-off-by: D. Wythe > --- > net/netfilter/nf_bpf_link.c | 63 ++++++++++++++++++++++++++++++++++-----= ------ > 1 file changed, 48 insertions(+), 15 deletions(-) > > diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c > index e502ec0..9bc91d1 100644 > --- a/net/netfilter/nf_bpf_link.c > +++ b/net/netfilter/nf_bpf_link.c > @@ -8,17 +8,8 @@ > #include > #include > > -static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb, > - const struct nf_hook_state *s) > -{ > - const struct bpf_prog *prog =3D bpf_prog; > - struct bpf_nf_ctx ctx =3D { > - .state =3D s, > - .skb =3D skb, > - }; > - > - return bpf_prog_run(prog, &ctx); > -} > +/* protect link update in parallel */ > +static DEFINE_MUTEX(bpf_nf_mutex); > > struct bpf_nf_link { > struct bpf_link link; > @@ -26,8 +17,20 @@ struct bpf_nf_link { > struct net *net; > u32 dead; > const struct nf_defrag_hook *defrag_hook; > + struct rcu_head head; I have to point out the same issues as before, but will ask them differently... Why do you think above rcu_head is necessary? > }; > > +static unsigned int nf_hook_run_bpf(void *bpf_link, struct sk_buff *skb, > + const struct nf_hook_state *s) > +{ > + const struct bpf_nf_link *nf_link =3D bpf_link; > + struct bpf_nf_ctx ctx =3D { > + .state =3D s, > + .skb =3D skb, > + }; > + return bpf_prog_run(rcu_dereference_raw(nf_link->link.prog), &ctx= ); > +} > + > #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4) || IS_ENABLED(CONFIG_NF_DEFRAG_IPV= 6) > static const struct nf_defrag_hook * > get_proto_defrag_hook(struct bpf_nf_link *link, > @@ -126,8 +129,7 @@ static void bpf_nf_link_release(struct bpf_link *link= ) > static void bpf_nf_link_dealloc(struct bpf_link *link) > { > struct bpf_nf_link *nf_link =3D container_of(link, struct bpf_nf_= link, link); > - > - kfree(nf_link); > + kfree_rcu(nf_link, head); Why is this needed ? Have you looked at tcx_link_lops ? > } > > static int bpf_nf_link_detach(struct bpf_link *link) > @@ -162,7 +164,34 @@ static int bpf_nf_link_fill_link_info(const struct b= pf_link *link, > static int bpf_nf_link_update(struct bpf_link *link, struct bpf_prog *ne= w_prog, > struct bpf_prog *old_prog) > { > - return -EOPNOTSUPP; > + struct bpf_nf_link *nf_link =3D container_of(link, struct bpf_nf_= link, link); > + int err =3D 0; > + > + mutex_lock(&bpf_nf_mutex); Why do you need this mutex? What race does it solve? > + > + if (nf_link->dead) { > + err =3D -EPERM; > + goto out; > + } > + > + /* target old_prog mismatch */ > + if (old_prog && link->prog !=3D old_prog) { > + err =3D -EPERM; > + goto out; > + } > + > + old_prog =3D link->prog; > + if (old_prog =3D=3D new_prog) { > + /* don't need update */ > + bpf_prog_put(new_prog); > + goto out; > + } > + > + old_prog =3D xchg(&link->prog, new_prog); > + bpf_prog_put(old_prog); > +out: > + mutex_unlock(&bpf_nf_mutex); > + return err; > } > > static const struct bpf_link_ops bpf_nf_link_lops =3D { > @@ -226,7 +255,11 @@ int bpf_nf_link_attach(const union bpf_attr *attr, s= truct bpf_prog *prog) > > link->hook_ops.hook =3D nf_hook_run_bpf; > link->hook_ops.hook_ops_type =3D NF_HOOK_OP_BPF; > - link->hook_ops.priv =3D prog; > + > + /* bpf_nf_link_release & bpf_nf_link_dealloc() can ensures that l= ink remains > + * valid at all times within nf_hook_run_bpf(). > + */ > + link->hook_ops.priv =3D link; > > link->hook_ops.pf =3D attr->link_create.netfilter.pf; > link->hook_ops.priority =3D attr->link_create.netfilter.priority; > -- > 1.8.3.1 >