From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Waiman Long, Al Viro, David Howells, Manfred Spraul, Davidlohr Bueso, Andrew Morton, Sasha Levin
Subject: [PATCH 5.18 615/879] ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()
Date: Tue, 7 Jun 2022 19:02:13 +0200
Message-Id: <20220607165020.700622910@linuxfoundation.org>
In-Reply-To: <20220607165002.659942637@linuxfoundation.org>
References: <20220607165002.659942637@linuxfoundation.org>

From: Waiman Long

[ Upstream commit d60c4d01a98bc1942dba6e3adc02031f5519f94b ]

When running the stress-ng clone benchmark with multiple testing
threads, it was found that there was significant spinlock contention in
sget_fc().  The contended spinlock was the sb_lock.
It is under heavy contention because of the following code in the
critical section of sget_fc():

	hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) {
		if (test(old, fc))
			goto share_extant_sb;
	}

After testing with added instrumentation code, it was found that the
benchmark could generate thousands of ipc namespaces with the
corresponding number of entries in the mqueue's fs_supers list where the
namespaces are the key for the search.  This leads to excessive time in
scanning the list for a match.

Looking back at the mqueue calling sequence leading to sget_fc():

	mq_init_ns()
	=> mq_create_mount()
	=> fc_mount()
	=> vfs_get_tree()
	=> mqueue_get_tree()
	=> get_tree_keyed()
	=> vfs_get_super()
	=> sget_fc()

Currently, mq_init_ns() is the only mqueue function that will indirectly
call mqueue_get_tree() with a newly allocated ipc namespace as the key
for searching.  As a result, there will never be a match with the
existing ipc namespaces stored in the mqueue's fs_supers list.

So using get_tree_keyed() to do an existing ipc namespace search is just
a waste of time.  Instead, we could use get_tree_nodev() to eliminate
the useless search.  By doing so, we can greatly reduce the sb_lock hold
time and avoid the spinlock contention problem in case a large number of
ipc namespaces are present.

Of course, if the code is modified in the future to allow
mqueue_get_tree() to be called with an existing ipc namespace instead of
a new one, we will have to use get_tree_keyed() in this case.

The following stress-ng clone benchmark command was run on a 2-socket
48-core Intel system:

	./stress-ng --clone 32 --verbose --oomable --metrics-brief -t 20

The "bogo ops/s" increased from 5948.45 before patch to 9137.06 after
patch.  This is an increase of 54% in performance.

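To make the cost argument above concrete, here is a rough user-space
model of the list scan.  It is only a sketch, not kernel code: the
model_* names, the singly linked list, and the comparison counter are
all assumptions made for illustration, and no locking is modelled.  It
counts how many key comparisons are done while creating n superblocks
with n distinct keys, once with a keyed lookup and once without any
search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a superblock on the fs_supers list. */
struct model_sb {
	const void *key;	/* plays the role of the ipc namespace key */
	struct model_sb *next;
};

/* Hypothetical stand-in for the per-filesystem-type superblock list. */
struct model_fs_type {
	struct model_sb *supers;
	unsigned long compares;	/* key comparisons performed so far */
};

/* Allocate a new entry and link it at the head of the list. */
static struct model_sb *model_add_sb(struct model_fs_type *type,
				     const void *key)
{
	struct model_sb *sb = malloc(sizeof(*sb));

	assert(sb != NULL);
	sb->key = key;
	sb->next = type->supers;
	type->supers = sb;
	return sb;
}

/* Keyed variant: scan every existing entry before creating a new one. */
static struct model_sb *model_get_keyed(struct model_fs_type *type,
					const void *key)
{
	struct model_sb *old;

	for (old = type->supers; old; old = old->next) {
		type->compares++;
		if (old->key == key)
			return old;	/* share the existing entry */
	}
	return model_add_sb(type, key);
}

/* No-search variant: always create a new entry. */
static struct model_sb *model_get_nodev(struct model_fs_type *type,
					const void *key)
{
	return model_add_sb(type, key);
}

/* Release every entry on the list. */
static void model_free(struct model_fs_type *type)
{
	while (type->supers) {
		struct model_sb *next = type->supers->next;

		free(type->supers);
		type->supers = next;
	}
}

/* Create n entries with n distinct keys; return the total comparisons. */
static unsigned long model_total_compares(size_t n, int keyed)
{
	struct model_fs_type type = { NULL, 0 };
	char *keys = malloc(n ? n : 1);	/* n distinct addresses as keys */
	size_t i;

	assert(keys != NULL);
	for (i = 0; i < n; i++) {
		if (keyed)
			model_get_keyed(&type, &keys[i]);
		else
			model_get_nodev(&type, &keys[i]);
	}
	model_free(&type);
	free(keys);
	return type.compares;
}
```

In this model, model_total_compares(1000, 1) performs 1000 * 999 / 2
comparisons, none of which ever matches, while
model_total_compares(1000, 0) performs none.  In the kernel the scan
additionally runs under sb_lock, which the model does not attempt to
reproduce.
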
Link: https://lkml.kernel.org/r/20220121172315.19652-1-longman@redhat.com
Fixes: 935c6912b198 ("ipc: Convert mqueue fs to fs_context")
Signed-off-by: Waiman Long
Cc: Al Viro
Cc: David Howells
Cc: Manfred Spraul
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Sasha Levin
---
 ipc/mqueue.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 7c08eb3c258d..54cb6264f8cf 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -45,6 +45,7 @@
 
 struct mqueue_fs_context {
 	struct ipc_namespace *ipc_ns;
+	bool newns; /* Set if newly created ipc namespace */
 };
 
 #define MQUEUE_MAGIC 0x19800202
@@ -427,6 +428,14 @@ static int mqueue_get_tree(struct fs_context *fc)
 {
 	struct mqueue_fs_context *ctx = fc->fs_private;
 
+	/*
+	 * With a newly created ipc namespace, we don't need to do a search
+	 * for an ipc namespace match, but we still need to set s_fs_info.
+	 */
+	if (ctx->newns) {
+		fc->s_fs_info = ctx->ipc_ns;
+		return get_tree_nodev(fc, mqueue_fill_super);
+	}
 	return get_tree_keyed(fc, mqueue_fill_super, ctx->ipc_ns);
 }
 
@@ -454,6 +463,10 @@ static int mqueue_init_fs_context(struct fs_context *fc)
 	return 0;
 }
 
+/*
+ * mq_init_ns() is currently the only caller of mq_create_mount().
+ * So the ns parameter is always a newly created ipc namespace.
+ */
 static struct vfsmount *mq_create_mount(struct ipc_namespace *ns)
 {
 	struct mqueue_fs_context *ctx;
@@ -465,6 +478,7 @@ static struct vfsmount *mq_create_mount(struct ipc_namespace *ns)
 		return ERR_CAST(fc);
 
 	ctx = fc->fs_private;
+	ctx->newns = true;
 	put_ipc_ns(ctx->ipc_ns);
 	ctx->ipc_ns = get_ipc_ns(ns);
 	put_user_ns(fc->user_ns);
-- 
2.35.1
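
The caveat in the changelog (the no-search path is only safe while the
namespace is known to be new) can be illustrated with a small
user-space model of the dispatch added to mqueue_get_tree().  This is a
sketch under stated assumptions: the model_* names, the integer
namespace ids, and the fixed-size table are hypothetical and stand in
for the kernel structures; only the newns flag mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_SB 8

/* Hypothetical stand-in for struct mqueue_fs_context. */
struct model_ctx {
	int ipc_ns_id;	/* stands in for the struct ipc_namespace pointer */
	bool newns;	/* mirrors the flag added by the patch */
};

/* Hypothetical stand-in for the list of mqueue superblocks. */
struct model_supers {
	int keys[MODEL_MAX_SB];
	size_t count;
};

/* Return the index of the superblock used, or -1 if the table is full. */
static int model_get_tree(struct model_supers *s, const struct model_ctx *ctx)
{
	size_t i;

	if (!ctx->newns) {
		/* Keyed path: reuse an entry whose key matches. */
		for (i = 0; i < s->count; i++)
			if (s->keys[i] == ctx->ipc_ns_id)
				return (int)i;
	}

	/* No-search path, or keyed path without a match: new entry. */
	if (s->count == MODEL_MAX_SB)
		return -1;
	s->keys[s->count] = ctx->ipc_ns_id;
	s->count++;
	return (int)(s->count - 1);
}

/* Count how many entries hold the given namespace id. */
static size_t model_count_for(const struct model_supers *s, int id)
{
	size_t i, n = 0;

	for (i = 0; i < s->count; i++)
		if (s->keys[i] == id)
			n++;
	return n;
}

/*
 * Mount namespace 1 as new, then mount it a second time with the given
 * flag.  Returns how many entries end up holding namespace 1.
 */
static size_t model_scenario(bool second_call_newns)
{
	struct model_supers s = { { 0 }, 0 };
	struct model_ctx first = { 1, true };
	struct model_ctx second = { 1, second_call_newns };

	model_get_tree(&s, &first);
	model_get_tree(&s, &second);
	return model_count_for(&s, 1);
}
```

In the model, model_scenario(false) leaves a single shared entry for
the namespace, while model_scenario(true) leaves two.  That is the
situation the changelog warns about: a future caller passing an
existing ipc namespace would have to go through the keyed path.
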