Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp329913pxj; Tue, 1 Jun 2021 23:43:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwH1mL3K8MS1wRNvpcItJOPR8pBqOgj6VIOJYaINwwjMbqFtrgW7svI8pxkCSQd5wHD/e8f X-Received: by 2002:a05:620a:102c:: with SMTP id a12mr25928169qkk.339.1622616182097; Tue, 01 Jun 2021 23:43:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1622616182; cv=none; d=google.com; s=arc-20160816; b=pqbndOUt6rQiIuGXXm/lqehzp/C0IlBkx9LLYd/SMoG/K1fTE6ekbaKiq4Dc2UxI8a 0O5dSXOSUf8WYY30q78HZ7CIqt7S9bLR7P2rG1sgDJG6eyXMJeDQ12V9yn3WIy736hsT mEv1/kSTpLSUxsYR3u+nuATV6xdxCDg1jvLFXakVLUhfxxnbiO+QTtrJrVhTE1XxycPU KYpyEyvDfP9coM9Iq+cYqhfpauS+mgxL8Av2V7Ps5KgizCa2Lp5UmixIe47k+u7NB0FS dyhqqo7gZoKvZAVH3BKNEQy+Dd7tdrjlIoSo6PUO9Njth/0CiGdrNsOkMyPxNmQ8cAYm NwPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:date:cc:to:from:subject :message-id:dkim-signature:dkim-signature; bh=QB/0xZpsDFsfe7kzn9RW18VuOpTUDa+ex8OTHsbhbEk=; b=Hy68UDtGX147t1rXGCmwpROtSu6FVE0qeLc7Gg1aPU4elR5U19hA1Tz61Lcr66s9cj 1g7JUjLFtBzbRdLy4AV+codx4IifhjDUZqOXVrUrbMrYDvuU6eUoBUZjitqS6PS4j55p W1EMCnesDfvPg4lapXv/zfGTBeONV6LfcnT5zuioIKne1+TfhCfnETCeGjREw27lbvJj cy5lBEyFFbrbTX9iwzPAQQWoQJc3VhEELVv+LzVeVy8P7SeiEb99DE37I2d7ttlki4Y9 UhOmJ9effMuUrr4+7f8tqkCL7vQgrUpvXu0nzOZswARktgzXQ14X4vY8s+UK4Ebk4ToX ErzQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@themaw.net header.s=fm3 header.b=vwog3kS7; dkim=pass header.i=@messagingengine.com header.s=fm2 header.b=QdQTkTSA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id r1si15711638iom.52.2021.06.01.23.42.49; Tue, 01 Jun 2021 23:43:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@themaw.net header.s=fm3 header.b=vwog3kS7; dkim=pass header.i=@messagingengine.com header.s=fm2 header.b=QdQTkTSA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230093AbhFBFnm (ORCPT + 99 others); Wed, 2 Jun 2021 01:43:42 -0400 Received: from new4-smtp.messagingengine.com ([66.111.4.230]:55983 "EHLO new4-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230055AbhFBFnl (ORCPT ); Wed, 2 Jun 2021 01:43:41 -0400 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailnew.nyi.internal (Postfix) with ESMTP id EDC0158095D; Wed, 2 Jun 2021 01:41:57 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute3.internal (MEProxy); Wed, 02 Jun 2021 01:41:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=themaw.net; h= message-id:subject:from:to:cc:date:in-reply-to:references :content-type:mime-version:content-transfer-encoding; s=fm3; bh= QB/0xZpsDFsfe7kzn9RW18VuOpTUDa+ex8OTHsbhbEk=; b=vwog3kS7l73qGDW/ K43KByzvpATQAElJsTcfdHaPhzbgGsKwvVDbKBioK6OCYVfNba1QfaKkc51iBC60 1CJJAITKHSWTeibTBXJB93VWlXdAEmeWSufQNYgeO6guJeRJUfJvks4lnxGDjHaj BkWpaNkiFlUPXFIszHN81/vv/UNzdcuCa8J4IPGfs7Wl8ryDYRck7ohgrDtPNe7a Vlbn9f0xOIniP/AVhmaUSU5pey9QzzIwJUdgJp8ONN23BgTwP/I3HjZ4DO6EsmA/ wb+HYAvUCFw4HqlWyTPGL6oiTOTNwGwaduKnZMYRH74sgJchqMG7L8bxNJVmVU9r O32S/A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm2; bh=QB/0xZpsDFsfe7kzn9RW18VuOpTUDa+ex8OTHsbhb Ek=; b=QdQTkTSATCay/eoWzoctc94OjUF/QI/8LXeeJRVIsNPixE3tzNs9c166F 7gPjWRe4e5i45hJVq4WL3/KmV9chDGSWFR1rz4Z2qUaWZpg3V868HleY9CO/+6g/ 2pmuSPjtgq5N6Glbk3HWzhfKj98Hpk+Dvpa/4mQdMthydCHaD4CIf9JaGwHW/8t4 i5eV9UUglRXYEp8ek0RLkx4nEZ6g8sAMsCifQXR9NR2ksuGFGmFErVL2v8GjFVTs ekai6hGC2tCXedJ/79crO/YyXL9qrfUjgzY8NLDQSyq2/JbXkPAkk+rszciEygeP 5iTMuWYlTSqYJdNJhKNd6DqF/odKA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdeliedgleelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepkffuhffvffgjfhgtfggggfesthekredttderjeenucfhrhhomhepkfgrnhcu mfgvnhhtuceorhgrvhgvnhesthhhvghmrgifrdhnvghtqeenucggtffrrghtthgvrhhnpe fgleelkeetheelgeehueejueduhfeufffgleehgfevtdehhffhhffhtddugfefheenucev lhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehrrghvvghnse hthhgvmhgrfidrnhgvth X-ME-Proxy: Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 2 Jun 2021 01:41:52 -0400 (EDT) Message-ID: Subject: Re: [REPOST PATCH v4 4/5] kernfs: use i_lock to protect concurrent inode updates From: Ian Kent To: Miklos Szeredi Cc: Greg Kroah-Hartman , Tejun Heo , Eric Sandeen , Fox Chen , Brice Goglin , Al Viro , Rick Lindsley , David Howells , Marcelo Tosatti , linux-fsdevel , Kernel Mailing List Date: Wed, 02 Jun 2021 13:41:38 +0800 In-Reply-To: References: <162218354775.34379.5629941272050849549.stgit@web.messagingengine.com> <162218366632.34379.11311748209082333016.stgit@web.messagingengine.com> Content-Type: text/plain; charset="UTF-8" User-Agent: Evolution 3.38.4 (3.38.4-1.fc33) MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2021-06-01 at 15:18 +0200, Miklos Szeredi wrote: > On Fri, 28 May 2021 at 08:34, Ian Kent wrote: > > > > The inode operations .permission() and .getattr() use the kernfs > > node > > write lock but all that's needed is to keep the rb tree stable > > while > > updating the inode attributes as well as protecting the update > > itself > > against concurrent changes. > > > > And .permission() is called frequently during path walks and can > > cause > > quite a bit of contention between kernfs node operations and path > > walks when the number of concurrent walks is high. > > > > To change kernfs_iop_getattr() and kernfs_iop_permission() to take > > the rw sem read lock instead of the write lock an additional lock > > is > > needed to protect against multiple processes concurrently updating > > the inode attributes and link count in kernfs_refresh_inode(). > > > > The inode i_lock seems like the sensible thing to use to protect > > these > > inode attribute updates so use it in kernfs_refresh_inode(). > > > > Signed-off-by: Ian Kent > > --- > >  fs/kernfs/inode.c |   10 ++++++---- > >  fs/kernfs/mount.c |    4 ++-- > >  2 files changed, 8 insertions(+), 6 deletions(-) > > > > diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c > > index 3b01e9e61f14e..6728ecd81eb37 100644 > > --- a/fs/kernfs/inode.c > > +++ b/fs/kernfs/inode.c > > @@ -172,6 +172,7 @@ static void kernfs_refresh_inode(struct > > kernfs_node *kn, struct inode *inode) > >  { > >         struct kernfs_iattrs *attrs = kn->iattr; > > > > +       spin_lock(&inode->i_lock); > >         inode->i_mode = kn->mode; > >         if (attrs) > >                 /* > > @@ -182,6 +183,7 @@ static void kernfs_refresh_inode(struct > > kernfs_node *kn, struct inode *inode) > > > >         if (kernfs_type(kn) == KERNFS_DIR) > >                 set_nlink(inode, kn->dir.subdirs + 2); > > +       spin_unlock(&inode->i_lock); > >  } > > > >  int kernfs_iop_getattr(struct user_namespace *mnt_userns, > > @@ -191,9 +193,9 @@ int kernfs_iop_getattr(struct user_namespace > > *mnt_userns, > >         struct inode *inode = d_inode(path->dentry); > >         struct kernfs_node *kn = inode->i_private; > > > > -       down_write(&kernfs_rwsem); > > +       down_read(&kernfs_rwsem); > >         kernfs_refresh_inode(kn, inode); > > -       up_write(&kernfs_rwsem); > > +       up_read(&kernfs_rwsem); > > > >         generic_fillattr(&init_user_ns, inode, stat); > >         return 0; > > @@ -284,9 +286,9 @@ int kernfs_iop_permission(struct user_namespace > > *mnt_userns, > > > >         kn = inode->i_private; > > > > -       down_write(&kernfs_rwsem); > > +       down_read(&kernfs_rwsem); > >         kernfs_refresh_inode(kn, inode); > > -       up_write(&kernfs_rwsem); > > +       up_read(&kernfs_rwsem); > > > >         return generic_permission(&init_user_ns, inode, mask); > >  } > > diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c > > index baa4155ba2edf..f2f909d09f522 100644 > > --- a/fs/kernfs/mount.c > > +++ b/fs/kernfs/mount.c > > @@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block > > *sb, struct kernfs_fs_context *k > >         sb->s_shrink.seeks = 0; > > > >         /* get root inode, initialize and unlock it */ > > -       down_write(&kernfs_rwsem); > > +       down_read(&kernfs_rwsem); > >         inode = kernfs_get_inode(sb, info->root->kn); > > -       up_write(&kernfs_rwsem); > > +       up_read(&kernfs_rwsem); > >         if (!inode) { > >                 pr_debug("kernfs: could not get root inode\n"); > >                 return -ENOMEM; > > > > This last hunk is not mentioned in the patch header.  Why is this > needed? Yes, that's right. The lock is needed to keep the node rb tree stable. kernfs_get_inode() calls kernfs_refresh_inode() indirectly so since the i_lock is probably not needed here this hunk could just as well have gone into the rwsem change but because of that kernfs_refresh_inode() call it also makes sense to put it here. I'd prefer to keep it here and clearly what's going on isn't as obvious as I thought so I can add this reasoning to the description if you still think it's worth while? > > Otherwise looks good. > > Thanks, > Miklos