Date: Mon, 14 Feb 2022 15:55:36 -0500
From: Rik van Riel
To: "Paul E. McKenney"
Cc: Chris Mason, Giuseppe Scrivano, viro@zeniv.linux.org.uk,
 linux-kernel@vger.kernel.org, linux-fsdevel, Kernel Team
Subject: Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
Message-ID: <20220214155536.1e0da8b6@imladris.surriel.com>
In-Reply-To: <20220214194440.GZ4285@paulmck-ThinkPad-P17-Gen-1>
References: <20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1>
 <20220214194440.GZ4285@paulmck-ThinkPad-P17-Gen-1>

On Mon, 14 Feb 2022 11:44:40 -0800
"Paul E. McKenney" wrote:

> On Mon, Feb 14, 2022 at 07:26:49PM +0000, Chris Mason wrote:
>
> Moving from synchronize_rcu() to synchronize_rcu_expedited() does buy
> you at least an order of magnitude.  But yes, it should be possible to
> get rid of all but one call per batch, which would be better.  Maybe
> a bit more complicated, but probably not that much.

It doesn't look too bad, except for the include of ../fs/mount.h.
I'm hoping somebody has a better idea on how to deal with that.

Do we need a kern_unmount() variant that doesn't do the RCU wait,
or should it get a parameter, or something else?

Is there an ordering requirement between the synchronize_rcu() call
and zeroing out n->mq_mnt->mnt_ns?

What other changes do we need to make everything right?

The change below also fixes the issue that to-be-freed items that are
queued up while the free_ipc work function runs do not result in the
work item being enqueued again.
This patch is still totally untested because the 4 year old is at home
today :)

diff --git a/ipc/namespace.c b/ipc/namespace.c
index 7bd0766ddc3b..321cbda17cfb 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -17,6 +17,7 @@
 #include <...>
 #include <...>
+#include "../fs/mount.h"
 #include "util.h"
 
 static struct ucounts *inc_ipc_namespaces(struct user_namespace *ns)
@@ -117,10 +118,7 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
 
 static void free_ipc_ns(struct ipc_namespace *ns)
 {
-	/* mq_put_mnt() waits for a grace period as kern_unmount()
-	 * uses synchronize_rcu().
-	 */
-	mq_put_mnt(ns);
+	mntput(ns->mq_mnt);
 	sem_exit_ns(ns);
 	msg_exit_ns(ns);
 	shm_exit_ns(ns);
@@ -134,11 +132,19 @@ static void free_ipc_ns(struct ipc_namespace *ns)
 static LLIST_HEAD(free_ipc_list);
 static void free_ipc(struct work_struct *unused)
 {
-	struct llist_node *node = llist_del_all(&free_ipc_list);
+	struct llist_node *node;
 	struct ipc_namespace *n, *t;
 
-	llist_for_each_entry_safe(n, t, node, mnt_llist)
-		free_ipc_ns(n);
+	while ((node = llist_del_all(&free_ipc_list))) {
+		llist_for_each_entry(n, node, mnt_llist)
+			real_mount(n->mq_mnt)->mnt_ns = NULL;
+
+		/* Wait for the last users to have gone away. */
+		synchronize_rcu();
+
+		llist_for_each_entry_safe(n, t, node, mnt_llist)
+			free_ipc_ns(n);
+	}
 }
 
 /*