Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp242566pxb; Wed, 22 Sep 2021 21:39:20 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxYgIELAY8V+crcI5VdlK2O4bSRnY/xHPFPMsnHPn5h6OMJBStskule2ImZ+AwIlTfE/h1I X-Received: by 2002:a50:e006:: with SMTP id e6mr3218450edl.302.1632371960445; Wed, 22 Sep 2021 21:39:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632371960; cv=none; d=google.com; s=arc-20160816; b=pBPywAP7Q+OHKBz1v6hc6eVMrBJs4iYzKBK9N3WLq3jONcn1txQX7I0PKNfTkw25Ci wNZeBL10vI8QXY7JvOYQ4oXX2hYhVf9o6gbzzb37lqlb86EVJhDgPutHaoc/vfUzF177 yjA3y6mMK4qc0nKy+emmYRDpY7MOhZ94wNJ8XczZTFNfI/tGkpYQSEmkum+cl0PkJ7I0 23re6IQUY4mWBB5yVUub941UMk9ku/a+hZbwGrUgsH3PMCy+GPx3E0UwXvPrrw4LiZfh ED/LM2P3FFJiOhhSOjX8fXUFffuFWnPc/BSfhnQa9wyG1Fn5qcQbFIJDFhsLeSdoP3Fi vtrg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=usagujyk9JZxbhEfO/6dy+qrVIqf8/aV7iHW6+Wrreg=; b=wGAWxrgRwOEFiAZqJpbnGmZ/uvxseFV68BVz9eizAwoV1RS4sC1G2D1cmbAJ4jCoj4 7mHEM0B7VOuSV4xoWpHjfCXlNMzQvU8RaxPiAv+MP6XS0vTNcAZxJJmmFYqe4cs5aeEk lOsdYf8U56n06jlvDHxDdbDgmO7AMlCgwlsZxYIPjdc3u9kCGfX4uLU3qfmHSSExko0o Q/o20Mf48DPtUFcWijCloIwSvrU5+MDRdS7YL9zVfnZQ0txi0SPJIBkDPonHM8L4CMo+ HGdblHYsBcwUlp9/nrTOW8C9rCwW3r4YSJpBq2yTC7JEl0esCVeUZu+UhG3E2DfDYn3v 0M3g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id x18si5371807edl.617.2021.09.22.21.38.29; Wed, 22 Sep 2021 21:39:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238259AbhIWEg0 (ORCPT + 99 others); Thu, 23 Sep 2021 00:36:26 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:16376 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235562AbhIWEgZ (ORCPT ); Thu, 23 Sep 2021 00:36:25 -0400 Received: from dggemv711-chm.china.huawei.com (unknown [172.30.72.56]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4HFMfQ51qXzRMtL; Thu, 23 Sep 2021 12:30:38 +0800 (CST) Received: from dggpeml500025.china.huawei.com (7.185.36.35) by dggemv711-chm.china.huawei.com (10.1.198.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.8; Thu, 23 Sep 2021 12:34:51 +0800 Received: from [10.174.176.117] (10.174.176.117) by dggpeml500025.china.huawei.com (7.185.36.35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.8; Thu, 23 Sep 2021 12:34:50 +0800 Subject: Re: [PATCH] kernfs: fix the race in the creation of negative dentry To: Ian Kent , Greg Kroah-Hartman , Tejun Heo , Miklos Szeredi CC: , , References: <20210911021342.3280687-1-houtao1@huawei.com> <7b92b158200567f0bba26a038191156890921f13.camel@themaw.net> <6c8088411523e52fc89b8dd07710c3825366ce64.camel@themaw.net> <747aee3255e7a07168557f29ad962e34e9cb964b.camel@themaw.net> <077362887b4ceeb01c27fbf36fa35adae02967c9.camel@themaw.net> From: Hou Tao Message-ID: <8b7d0f46-6975-993c-5d88-7a4e809317ab@huawei.com> Date: Thu, 23 Sep 2021 12:34:50 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: <077362887b4ceeb01c27fbf36fa35adae02967c9.camel@themaw.net> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Content-Language: en-US X-Originating-IP: [10.174.176.117] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpeml500025.china.huawei.com (7.185.36.35) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On 9/23/2021 10:50 AM, Ian Kent wrote: > On Thu, 2021-09-23 at 09:52 +0800, Hou Tao wrote: >> Hi, >> >> On 9/15/2021 10:09 AM, Ian Kent wrote: >>> On Wed, 2021-09-15 at 09:35 +0800, Ian Kent wrote: >>> >> Sorry for the late reply. >>> I think something like this is needed (not even compile tested): >>> >>> kernfs: dont create a negative dentry if node exists >>> >>> From: Ian Kent >>> >>> In kernfs_iop_lookup() a negative dentry is created if associated >>> kernfs >>> node is incative which makes it visible to lookups in the VFS path >>> walk. >>> >>> But inactive kernfs nodes are meant to be invisible to the VFS and >>> creating a negative for these can have unexpetced side effects. >>> >>> Signed-off-by: Ian Kent >>> --- >>>  fs/kernfs/dir.c |    9 ++++++++- >>>  1 file changed, 8 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c >>> index ba581429bf7b..a957c944cf3a 100644 >>> --- a/fs/kernfs/dir.c >>> +++ b/fs/kernfs/dir.c >>> @@ -1111,7 +1111,14 @@ static struct dentry >>> *kernfs_iop_lookup(struct inode *dir, >>>   >>>         kn = kernfs_find_ns(parent, dentry->d_name.name, ns); >>>         /* attach dentry and inode */ >>> -       if (kn && kernfs_active(kn)) { >>> +       if (kn) { >>> +               /* Inactive nodes are invisible to the VFS so don't >>> +                * create a negative. >>> +                */ >>> +               if (!kernfs_active(kn)) { >>> +                       up_read(&kernfs_rwsem); >>> +                       return NULL; >>> +               } >>>                 inode = kernfs_get_inode(dir->i_sb, kn); >>>                 if (!inode) >>>                         inode = ERR_PTR(-ENOMEM); >>> >>> >>> Essentially, the definition a kernfs negative dentry, for the >>> cases it is meant to cover, is one that has no kernfs node, so >>> one that does have a node should not be created as a negative. >>> >>> Once activated a subsequent ->lookup() will then create a >>> positive dentry for the node so that no invalidation is >>> necessary. >> I'm fine with the fix which is much simpler. > Great, although I was hoping you would check it worked as expected. > Did you check? > If not could you please do that check? Yes, I will test whether or not it fixes the race. >>> This distinction is important because we absolutely do not want >>> negative dentries created that aren't necessary. We don't want to >>> leave any opportunities for negative dentries to accumulate if >>> we don't have to. >>>     >>> I am still thinking about the race you have described. >>> >>> Given my above comments that race might have (maybe probably) >>> been present in the original code before the rwsem change but >>> didn't trigger because of the serial nature of the mutex. >> I don't think there is such race before the enabling of negative >> dentry, >> but maybe I misunderstanding something. > No, I think you're probably right, it's the introduction of using > negative dentries to prevent the expensive dentry alloc/free cycle > of frequent lookups of non-existent paths that's exposed the race. > >>> So it may be wise (perhaps necessary) to at least move the >>> activation under the rwsem (as you have done) which covers most >>> of the change your proposing and the remaining hunk shouldn't >>> do any harm I think but again I need a little more time on that. >> After above fix, doing sibling tree operation and activation >> atomically >> will reduce the unnecessary lookup, but I don't think it is necessary >> for the fix of race. > Sorry, I don't understand what your saying. > > Are you saying you did check my suggested patch alone and it > resolved the problem. And that you also think the small additional > dentry churn is ok too. I haven't tested it, but I think it is OK. And moving the activation under the rwsem is not necessary for the problem. Regards, Tao > > If so I agree, and I'll forward the patch to Greg, ;) > > Ian >> Regards, >> Tao >>> I'm now a little concerned about the invalidation that should >>> occur on deactivation so I want to have a look at that too but >>> it's separate to this proposal. >>> Greg, Tejun, Hou, any further thoughts on this would be most >>> welcome. >>> >>> Ian >>> . > > .