Received: by 2002:a05:6500:1b8f:b0:1fa:5c73:8e2d with SMTP id df15csp755842lqb; Wed, 29 May 2024 09:33:42 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCWUB/S1mwg8i4ruprXOKBKmqTGXuRxXGrh+/2aj47ui94/QY2phdRUbpseZ8XjjeX2ktEioVYsKykj4MS2lJSrwZVOoRWCVorC0Nvsyww== X-Google-Smtp-Source: AGHT+IG91zlO4f5DwrYdLKvqOrRkCoaH4JYCKtXFUZ9UgkQTT++FNjh4GK2w3/Wf437KXYp0qTD1 X-Received: by 2002:a17:906:3b8e:b0:a65:d699:16f with SMTP id a640c23a62f3a-a65d6990343mr67170066b.69.1717000422426; Wed, 29 May 2024 09:33:42 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1717000422; cv=pass; d=google.com; s=arc-20160816; b=N0AzMBEOYAppDkxS80KGGo1qSkTS+kJm0zkjvlu6pP8gRNZgJy3gKPjzTDav4tU43v 3PAt8dH9mvlzsjanS5DO1HIy757qKtq6KIzYJb92qc1JKuzSLdwCvkZrz3JrGsBpUGUx z5wByoRAzqyX41ORo2b9ZMO6sOo5SIWZAJKo6YIdXuDgMRUGyaFo5xEoyC7osNrAOSlJ x7yxmJILZAxuUZxdCwdBZAKwalRUei1Pklce5mc+9vXA7FxntjZlCbYdOI5OwGUNvKcB AbGCRQZWKVL9I2x0GXpPL+E3cAk9N0grT0Jik/dTMMpslA277ZTR1wwOw8ccdhh8U0dz eJFA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:dkim-signature:dkim-signature:from; bh=NFjC1AYiTOqw3rC3z2gomjP0qeae2FRXS7jgtGN7DWk=; fh=U4McIH9iidcL9gKbZYQDg94tOKPoFrCz1YUp4PUzhJM=; b=jDmHtxqpiLXx7yEZJhSWliLzhK0ldU4zP0lFzBrneCYJwLQrPpHq0Lh9psriQf2unM zRJO6QK+0bsn4bgzYt/Dx/7ai9RksL40k5ewe87IglbKAn5l5S8oOYB2VUI9YNPD4l9q eOLQ2v/YNlrGUoCEMDgJXKHhuQpsQ0kRDJjbpISkNSv8OLxNGVOkKrj9skkpZGJU27+A RktTwDW5flza2/3Zoix8sPqvy3T4j7/8mml+jho/GeEDiCqpK6qaR7AySLkfhMCnCG5o tIWjY2/55pzKO8UN58xrolHnQivj8bfNII3vM1qFXWvJMKyz2vZmcBODJwl5NGJ1N9jR xH4w==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=Trwtg++c; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; arc=pass (i=1 spf=pass spfdomain=linutronix.de dkim=pass dkdomain=linutronix.de dmarc=pass fromdomain=linutronix.de); spf=pass (google.com: domain of linux-kernel+bounces-194458-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-194458-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id a640c23a62f3a-a635125b4c0si175170966b.627.2024.05.29.09.33.42 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 May 2024 09:33:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-194458-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=Trwtg++c; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; arc=pass (i=1 spf=pass spfdomain=linutronix.de dkim=pass dkdomain=linutronix.de dmarc=pass fromdomain=linutronix.de); spf=pass (google.com: domain of linux-kernel+bounces-194458-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-194458-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id B2FB11F2322E for ; Wed, 29 May 2024 16:32:54 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id A085E1C0DED; Wed, 29 May 2024 16:29:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="Trwtg++c"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="n0TUtsOy" Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 174D718734A; Wed, 29 May 2024 16:29:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717000182; cv=none; b=A3O/c5IguGzWIKFFJUAXQg9f87Qvwk5KjPq0V8B70vcXcBku+zuwLj6m/wCwdUCPWgARAUmXw5Ey/wyD7vb9fvRh4wTZcQu4+4YRc8vjauVHJHALf2jeBp9+VwxpLKSfnpmtqvp9j13p+eK0fkVK5hrXfHwrAeMYjp7BU2gj6vE= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717000182; c=relaxed/simple; bh=f49pIcC90uouPuwDyv/aXW+vyT3FgClyt0ZYDNCGKRw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=huXHBQkP8N7yW+FQiTnkSl72UPpANr1Ch+Ld8CZHjwjkRxBpIPRTle0tE/VpnGc26rQDjY/brYw9RH1PQa9bZSgFDFh6JrlXOO55K1uI/KneCTgNY0wHIbDatmtPhhgLcHWXk6MCWNLjNNf4wbYshjm5Hd8i38TF+ARdKM77/18= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=Trwtg++c; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=n0TUtsOy; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1717000178; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NFjC1AYiTOqw3rC3z2gomjP0qeae2FRXS7jgtGN7DWk=; b=Trwtg++cfNHUfE0NU2KKHRhu7GtNnhxoYwtbPpAKs7aga7celcdDm2hIr7DFTw4PH5iU3d CAx7v5ZYkHbiFSIVs80ulDN2vR0YezMuRaQc19/hU9IAEBD5WYtLIThn5dvknsvC++pm6h G0Kjug/UcnP+StLrQpAftMKOVYvz3IpgEvJFo16SKTGa3RNJcTPrVCBA2QnzqqZswNnDhn OtJgXJZ0kNvtA3h8boFgs2WhAH3MCUWb7jwFONX7kixNMbljzd1jShfEfk1nT5RuWUPqkU Fvq0DwJdqAeTYjOG5Nsu3HGjX1692045ZlrfT/EsiTJNeQ9uqytVUM2JovM9cw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1717000178; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NFjC1AYiTOqw3rC3z2gomjP0qeae2FRXS7jgtGN7DWk=; b=n0TUtsOyhxnec/bKE73DTV9qzdwIOG3gx4kIE62wYOmW7o4eEKMumgf5CS7nCrXnezqFeP Kb5Y+2DMwbtKEJDw== To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org Cc: "David S. Miller" , Boqun Feng , Daniel Borkmann , Eric Dumazet , Frederic Weisbecker , Ingo Molnar , Jakub Kicinski , Paolo Abeni , Peter Zijlstra , Thomas Gleixner , Waiman Long , Will Deacon , Sebastian Andrzej Siewior , Alexei Starovoitov , Andrii Nakryiko , David Ahern , Hao Luo , Jiri Olsa , John Fastabend , KP Singh , Martin KaFai Lau , Song Liu , Stanislav Fomichev , Yonghong Song , bpf@vger.kernel.org Subject: [PATCH v3 net-next 12/15] seg6: Use nested-BH locking for seg6_bpf_srh_states. Date: Wed, 29 May 2024 18:02:35 +0200 Message-ID: <20240529162927.403425-13-bigeasy@linutronix.de> In-Reply-To: <20240529162927.403425-1-bigeasy@linutronix.de> References: <20240529162927.403425-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable The access to seg6_bpf_srh_states is protected by disabling preemption. Based on the code, the entry point is input_action_end_bpf() and every other function (the bpf helper functions bpf_lwt_seg6_*()), that is accessing seg6_bpf_srh_states, should be called from within input_action_end_bpf(). input_action_end_bpf() accesses seg6_bpf_srh_states first at the top of the function and then disables preemption. This looks wrong because if preemption needs to be disabled as part of the locking mechanism then the variable shouldn't be accessed beforehand. Looking at how it is used via test_lwt_seg6local.sh then input_action_end_bpf() is always invoked from softirq context. If this is always the case then the preempt_disable() statement is superfluous. If this is not always invoked from softirq then disabling only preemption is not sufficient. Replace the preempt_disable() statement with nested-BH locking. This is not an equivalent replacement as it assumes that the invocation of input_action_end_bpf() always occurs in softirq context and thus the preempt_disable() is superfluous. Add a local_lock_t the data structure and use local_lock_nested_bh() in guard notation for locking. Add lockdep_assert_held() to ensure the lock is held while the per-CPU variable is referenced in the helper functions. Cc: Alexei Starovoitov Cc: Andrii Nakryiko Cc: David Ahern Cc: Hao Luo Cc: Jiri Olsa Cc: John Fastabend Cc: KP Singh Cc: Martin KaFai Lau Cc: Song Liu Cc: Stanislav Fomichev Cc: Yonghong Song Cc: bpf@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior --- include/net/seg6_local.h | 1 + net/core/filter.c | 3 +++ net/ipv6/seg6_local.c | 22 ++++++++++++++-------- 3 files changed, 18 insertions(+), 8 deletions(-) diff --git a/include/net/seg6_local.h b/include/net/seg6_local.h index 3fab9dec2ec45..888c1ce6f5272 100644 --- a/include/net/seg6_local.h +++ b/include/net/seg6_local.h @@ -19,6 +19,7 @@ extern int seg6_lookup_nexthop(struct sk_buff *skb, struc= t in6_addr *nhaddr, extern bool seg6_bpf_has_valid_srh(struct sk_buff *skb); =20 struct seg6_bpf_srh_state { + local_lock_t bh_lock; struct ipv6_sr_hdr *srh; u16 hdrlen; bool valid; diff --git a/net/core/filter.c b/net/core/filter.c index 7c46ecba3b01b..ba1a739a9bedc 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -6450,6 +6450,7 @@ BPF_CALL_4(bpf_lwt_seg6_store_bytes, struct sk_buff *= , skb, u32, offset, void *srh_tlvs, *srh_end, *ptr; int srhoff =3D 0; =20 + lockdep_assert_held(&srh_state->bh_lock); if (srh =3D=3D NULL) return -EINVAL; =20 @@ -6506,6 +6507,7 @@ BPF_CALL_4(bpf_lwt_seg6_action, struct sk_buff *, skb, int hdroff =3D 0; int err; =20 + lockdep_assert_held(&srh_state->bh_lock); switch (action) { case SEG6_LOCAL_ACTION_END_X: if (!seg6_bpf_has_valid_srh(skb)) @@ -6582,6 +6584,7 @@ BPF_CALL_3(bpf_lwt_seg6_adjust_srh, struct sk_buff *,= skb, u32, offset, int srhoff =3D 0; int ret; =20 + lockdep_assert_held(&srh_state->bh_lock); if (unlikely(srh =3D=3D NULL)) return -EINVAL; =20 diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index 24e2b4b494cb0..c4828c6620f07 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -1380,7 +1380,9 @@ static int input_action_end_b6_encap(struct sk_buff *= skb, return err; } =20 -DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states); +DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states) =3D { + .bh_lock =3D INIT_LOCAL_LOCK(bh_lock), +}; =20 bool seg6_bpf_has_valid_srh(struct sk_buff *skb) { @@ -1388,6 +1390,7 @@ bool seg6_bpf_has_valid_srh(struct sk_buff *skb) this_cpu_ptr(&seg6_bpf_srh_states); struct ipv6_sr_hdr *srh =3D srh_state->srh; =20 + lockdep_assert_held(&srh_state->bh_lock); if (unlikely(srh =3D=3D NULL)) return false; =20 @@ -1408,8 +1411,7 @@ bool seg6_bpf_has_valid_srh(struct sk_buff *skb) static int input_action_end_bpf(struct sk_buff *skb, struct seg6_local_lwt *slwt) { - struct seg6_bpf_srh_state *srh_state =3D - this_cpu_ptr(&seg6_bpf_srh_states); + struct seg6_bpf_srh_state *srh_state; struct ipv6_sr_hdr *srh; int ret; =20 @@ -1420,10 +1422,14 @@ static int input_action_end_bpf(struct sk_buff *skb, } advance_nextseg(srh, &ipv6_hdr(skb)->daddr); =20 - /* preempt_disable is needed to protect the per-CPU buffer srh_state, - * which is also accessed by the bpf_lwt_seg6_* helpers + /* The access to the per-CPU buffer srh_state is protected by running + * always in softirq context (with disabled BH). On PREEMPT_RT the + * required locking is provided by the following local_lock_nested_bh() + * statement. It is also accessed by the bpf_lwt_seg6_* helpers via + * bpf_prog_run_save_cb(). */ - preempt_disable(); + local_lock_nested_bh(&seg6_bpf_srh_states.bh_lock); + srh_state =3D this_cpu_ptr(&seg6_bpf_srh_states); srh_state->srh =3D srh; srh_state->hdrlen =3D srh->hdrlen << 3; srh_state->valid =3D true; @@ -1446,15 +1452,15 @@ static int input_action_end_bpf(struct sk_buff *skb, =20 if (srh_state->srh && !seg6_bpf_has_valid_srh(skb)) goto drop; + local_unlock_nested_bh(&seg6_bpf_srh_states.bh_lock); =20 - preempt_enable(); if (ret !=3D BPF_REDIRECT) seg6_lookup_nexthop(skb, NULL, 0); =20 return dst_input(skb); =20 drop: - preempt_enable(); + local_unlock_nested_bh(&seg6_bpf_srh_states.bh_lock); kfree_skb(skb); return -EINVAL; } --=20 2.45.1