From: Håkon Bugge
To: gregkh@linuxfoundation.org, sashal@kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org, stable@vger.kernel.org,
    dledford@redhat.com, jgg@nvidia.com, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, gnault@redhat.com
Subject: [PATCH 5.4] netns: protect netns ID lookups with RCU
Date: Sat, 11 Sep 2021 15:58:26 +0200
Message-Id: <1631368706-22561-1-git-send-email-haakon.bugge@oracle.com>
X-Mailer: git-send-email 1.8.3.1
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Guillaume Nault

__peernet2id() can be protected by RCU as it only calls idr_for_each(),
which is RCU-safe, and never modifies the nsid table.

rtnl_net_dumpid() can also do lockless lookups. It does two nested
idr_for_each() calls on nsid tables (one direct call and one indirect
call because of rtnl_net_dumpid_one() calling __peernet2id()). The
netnsid tables are never updated along these paths. Therefore it is
safe to not take the nsid_lock and run within an RCU-critical section
instead.

Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller

A nice side-effect of replacing spin_{lock,unlock}_bh() with
rcu_read_{lock,unlock}() in peernet2id() is that it avoids the
situation where SoftIRQs get enabled whilst IRQs are turned off.

From bugzilla.redhat.com/show_bug.cgi?id=1384179 (an ancient
4.9.0-0.rc0 kernel):

    dump_stack+0x86/0xc3
    __warn+0xcb/0xf0
    warn_slowpath_null+0x1d/0x20
    __local_bh_enable_ip+0x9d/0xc0
    _raw_spin_unlock_bh+0x35/0x40
    peernet2id+0x54/0x80
    netlink_broadcast_filtered+0x220/0x3c0
    netlink_broadcast+0x1d/0x20
    audit_log+0x6a/0x90
    security_set_bools+0xee/0x200

Note, security_set_bools() calls write_lock_irq(). peernet2id() calls
spin_unlock_bh().

From an internal (UEK) stack trace based on the v4.14.35 kernel (LTS
4.14.231):

    queued_spin_lock_slowpath+0xb/0xf
    _raw_spin_lock_irqsave+0x46/0x48
    send_mad+0x3d2/0x590 [ib_core]
    ib_sa_path_rec_get+0x223/0x4d0 [ib_core]
    path_rec_start+0xa3/0x140 [ib_ipoib]
    ipoib_start_xmit+0x2b0/0x6a0 [ib_ipoib]
    dev_hard_start_xmit+0xb2/0x237
    sch_direct_xmit+0x114/0x1bf
    __dev_queue_xmit+0x592/0x818
    dev_queue_xmit+0x10/0x12
    arp_xmit+0x38/0xa6
    arp_send_dst.part.16+0x61/0x84
    arp_process+0x825/0x889
    arp_rcv+0x140/0x1c9
    __netif_receive_skb_core+0x401/0xb39
    __netif_receive_skb+0x18/0x59
    netif_receive_skb_internal+0x45/0x119
    napi_gro_receive+0xd8/0xf6
    ipoib_ib_handle_rx_wc+0x1ca/0x520 [ib_ipoib]
    ipoib_poll+0xcd/0x150 [ib_ipoib]
    net_rx_action+0x289/0x3f4
    __do_softirq+0xe1/0x2b5
    do_softirq_own_stack+0x2a/0x35
    do_softirq+0x4d/0x6a
    __local_bh_enable_ip+0x57/0x59
    _raw_spin_unlock_bh+0x23/0x25
    peernet2id+0x51/0x73
    netlink_broadcast_filtered+0x223/0x41b
    netlink_broadcast+0x1d/0x1f
    rdma_nl_multicast+0x22/0x30 [ib_core]
    send_mad+0x3e5/0x590 [ib_core]
    ib_sa_path_rec_get+0x223/0x4d0 [ib_core]
    rdma_resolve_route+0x287/0x810 [rdma_cm]
    rds_rdma_cm_event_handler_cmn+0x311/0x7d0 [rds_rdma]
    rds_rdma_cm_event_handler_worker+0x22/0x30 [rds_rdma]
    process_one_work+0x169/0x3a6
    worker_thread+0x4d/0x3e5
    kthread+0x105/0x138
    ret_from_fork+0x24/0x49

Here, pay attention to ib_nl_make_request() which calls
spin_lock_irqsave() on a global lock just before calling
rdma_nl_multicast(). Thereafter, peernet2id() enables SoftIRQs, and
ipoib starts and calls the same path and ends up trying to acquire the
same global lock again.

(cherry picked from commit 2dce224f469f060b9998a5a869151ef83c08ce77)

Fixes: fba143c66abb ("netns: avoid disabling irq for netns id")
Signed-off-by: Håkon Bugge

Conflicts:
	net/core/net_namespace.c

	* Due to context differences because v5.4 lacks commit
	  4905294162bd ("netns: Remove __peernet2id_alloc()").
	  Only comments affected.
---
 net/core/net_namespace.c | 28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index c303873..9bf1551 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -211,9 +211,9 @@ static int net_eq_idr(int id, void *net, void *peer)
 	return 0;
 }
 
-/* Should be called with nsid_lock held. If a new id is assigned, the bool alloc
- * is set to true, thus the caller knows that the new id must be notified via
- * rtnl.
+/* Must be called from RCU-critical section or with nsid_lock held. If
+ * a new id is assigned, the bool alloc is set to true, thus the
+ * caller knows that the new id must be notified via rtnl.
  */
 static int __peernet2id_alloc(struct net *net, struct net *peer, bool *alloc)
 {
@@ -237,7 +237,7 @@ static int __peernet2id_alloc(struct net *net, struct net *peer, bool *alloc)
 	return NETNSA_NSID_NOT_ASSIGNED;
 }
 
-/* should be called with nsid_lock held */
+/* Must be called from RCU-critical section or with nsid_lock held */
 static int __peernet2id(struct net *net, struct net *peer)
 {
 	bool no = false;
@@ -281,9 +281,10 @@ int peernet2id(struct net *net, struct net *peer)
 {
 	int id;
 
-	spin_lock_bh(&net->nsid_lock);
+	rcu_read_lock();
 	id = __peernet2id(net, peer);
-	spin_unlock_bh(&net->nsid_lock);
+	rcu_read_unlock();
+
 	return id;
 }
 EXPORT_SYMBOL(peernet2id);
@@ -962,6 +963,7 @@ struct rtnl_net_dump_cb {
 	int s_idx;
 };
 
+/* Runs in RCU-critical section. */
 static int rtnl_net_dumpid_one(int id, void *peer, void *data)
 {
 	struct rtnl_net_dump_cb *net_cb = (struct rtnl_net_dump_cb *)data;
@@ -1046,19 +1048,9 @@ static int rtnl_net_dumpid(struct sk_buff *skb, struct netlink_callback *cb)
 			goto end;
 	}
 
-	spin_lock_bh(&net_cb.tgt_net->nsid_lock);
-	if (net_cb.fillargs.add_ref &&
-	    !net_eq(net_cb.ref_net, net_cb.tgt_net) &&
-	    !spin_trylock_bh(&net_cb.ref_net->nsid_lock)) {
-		spin_unlock_bh(&net_cb.tgt_net->nsid_lock);
-		err = -EAGAIN;
-		goto end;
-	}
+	rcu_read_lock();
 	idr_for_each(&net_cb.tgt_net->netns_ids, rtnl_net_dumpid_one, &net_cb);
-	if (net_cb.fillargs.add_ref &&
-	    !net_eq(net_cb.ref_net, net_cb.tgt_net))
-		spin_unlock_bh(&net_cb.ref_net->nsid_lock);
-	spin_unlock_bh(&net_cb.tgt_net->nsid_lock);
+	rcu_read_unlock();
 
 	cb->args[0] = net_cb.idx;
 end:
-- 
1.8.3.1
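
Illustration only, not part of the patch: the two stack traces in the
commit message describe one pattern, which the userspace sketch below
tries to model. A caller disables IRQs and takes a global lock, the
lookup re-enables bottom halves on unlock, and a pending SoftIRQ then
runs and wants the same lock. Every name in the sketch (cpu_model,
lookup_with_bh_lock, lookup_with_rcu, simulate_request) is hypothetical
and is not a kernel API. The model is single-CPU and only sets flags
where the real kernel would warn or spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-CPU model. Every name here is hypothetical, not a kernel API. */
struct cpu_model {
	bool irqs_off;         /* a modelled spin_lock_irqsave() is in effect */
	bool global_lock_held; /* the global lock taken with IRQs off */
	bool softirq_pending;  /* e.g. a received packet waiting to be processed */
	bool warned;           /* BH re-enabled while IRQs were off (first trace) */
	bool deadlocked;       /* SoftIRQ path re-took the held lock (second trace) */
};

typedef void (*lookup_fn)(struct cpu_model *cpu);

/* Modelled SoftIRQ work: the RX path ends up needing the same global lock. */
static void run_pending_softirq(struct cpu_model *cpu)
{
	if (!cpu->softirq_pending)
		return;
	cpu->softirq_pending = false;
	if (cpu->global_lock_held)
		cpu->deadlocked = true; /* the CPU would spin on its own lock */
}

/*
 * Pre-patch lookup: the modelled spin_unlock_bh() re-enables bottom
 * halves, which lets pending SoftIRQs run right away.
 */
static void lookup_with_bh_lock(struct cpu_model *cpu)
{
	/* modelled: spin_lock_bh(); table lookup; spin_unlock_bh(); */
	if (cpu->irqs_off)
		cpu->warned = true;
	run_pending_softirq(cpu);
}

/*
 * Post-patch lookup: the modelled rcu_read_lock()/rcu_read_unlock()
 * pair leaves the bottom-half state untouched.
 */
static void lookup_with_rcu(struct cpu_model *cpu)
{
	(void)cpu; /* nothing to model: no BH state change */
}

/* Modelled caller: takes the global lock with IRQs off, then looks up. */
static struct cpu_model simulate_request(lookup_fn lookup)
{
	struct cpu_model cpu = { .softirq_pending = true };

	assert(lookup != NULL);
	cpu.irqs_off = true;          /* modelled spin_lock_irqsave() */
	cpu.global_lock_held = true;
	lookup(&cpu);
	cpu.global_lock_held = false; /* modelled spin_unlock_irqrestore() */
	cpu.irqs_off = false;
	return cpu;
}
```

Calling simulate_request(lookup_with_bh_lock) sets both the warned and
deadlocked flags in this model, while simulate_request(lookup_with_rcu)
sets neither. That is the side-effect the commit message describes.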