Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp656872imw; Wed, 13 Jul 2022 05:51:19 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sz4ztcjCFO30QlAJiwEuUFUlOYQU8Q/CYuL6lIplR1U3jedNPklG2WLN5JV8yMQeE+dTsH X-Received: by 2002:a63:24e:0:b0:412:5277:7a2a with SMTP id 75-20020a63024e000000b0041252777a2amr2797834pgc.165.1657716679179; Wed, 13 Jul 2022 05:51:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657716679; cv=none; d=google.com; s=arc-20160816; b=Hcn//fOZy1L5STS69pZatJDCPJ1092COIGJAiWfu15Rmc2jJIfWopPMvthB6A3MVF/ mGo/9xU1FCJkS+Y4ox4RMCsQpUEpATL0olXbKiJ27dKz2G1BeeYKAHLS9py2kgglgEvU SWQv3pryDU+F/4t/djJU1Xz4WS992QyBCm+J8ti5yrIYpQIxvhTTgk0WsBrbV9o4U4Zl xp9ji/9T3LrvsVpAJ5uX0YCCB2J/madQdLnpvtuJnQk7LMuWFNqGL1MHCdLFsZRxbMtd u1m7MXbc6rRaGTlHu5At6aHln2gG74T1roiaFjjHPuhgnofQnASch2tOZLs7txJ1bQdM U/2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=LS1cGduFHkLensQZ/wI39hm/DOVx8J/NtfvPKQcEI7Q=; b=DmAgdkA/77/4VfNS6s/lOOG7i+/zSI4bxICpRJGfA7Qt/3Z381A3A9amgKM10oKfs9 EuAZIChdkzg+PhSdTt683nxHJbYDoyvuRQ/x5QQKeCfJd6WMMfQuMdwGALQgMNECz49Z OcZTG7lnePodqzPZUBSciwYE034B20QrHNQIeZuOP4/18jffzePxVdIykHBmU41A9anA 4HxgU9kMCM7IYIne5dFhDY/VFT2vLDJo3imEPIC0itj1UVMQ3YQLa5s4K4s5txpXLWjG yS4kuzQ3LFugpwcF7QMYFp3w1TvXjpS1cWgCNVOtmBvOA1eoS8+tlTu1SP3jwELl5FFu ibdA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=fvH4Kdxc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 202-20020a6215d3000000b00528a31c652bsi15779672pfv.68.2022.07.13.05.51.06; Wed, 13 Jul 2022 05:51:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=fvH4Kdxc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231625AbiGMMlV (ORCPT + 99 others); Wed, 13 Jul 2022 08:41:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57650 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231801AbiGMMlT (ORCPT ); Wed, 13 Jul 2022 08:41:19 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 25C2CF991A for ; Wed, 13 Jul 2022 05:41:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657716077; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LS1cGduFHkLensQZ/wI39hm/DOVx8J/NtfvPKQcEI7Q=; b=fvH4KdxcMwrEVbv75bFX+6ETJKUDLrrvnsR5yESsgscCtXezATkkLjK0INAHQ8AVwpBw4+ xfH1LwMJUQEpZ3NcliWrJKW6giz0IWzx0n5lSGIGUCMtx1e7iX1iyDWMzLvpT3wP65DH4m 643bqvSPFSe+FLwTSWkVyy4mllN0zgc= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-274-EOBcwUFRNdqVDjMRjhNzNQ-1; Wed, 13 Jul 2022 08:41:16 -0400 X-MC-Unique: EOBcwUFRNdqVDjMRjhNzNQ-1 Received: by mail-qv1-f69.google.com with SMTP id ld6-20020a056214418600b00472ffe4640eso3680877qvb.13 for ; Wed, 13 Jul 2022 05:41:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=LS1cGduFHkLensQZ/wI39hm/DOVx8J/NtfvPKQcEI7Q=; b=EJmOa3Ln+BQNWEyuIcK3s63qBUIR6qKVHvpeXn4o46g77ReQaA6ITaBtOhns1hmibF Z+PjL7Gplye0VqqxzOSVRmeNJU0ea9aNjBCQxP3Qpwkk4aq2rFefgJXWFpNmDlNGFHCo RU2UOAbARcSSZfkAwTMoWk4Jg4jWrvvOVlTESWh0u5ZlNwZUMPtY31ttS6qFktp3EK9t 6nhsZCfFAopod7SMOZ5Mq4m81pFYgrDDVkVU/HrBoTc1XLG6LmLHRZbCwlZ0Dfcxf+fU Yf5TGz5AoXQS4wqtVhiYV8+2qiKz7yVD/r/LPqW/8cPlP3mRldMaCYStRpd3d4VFZQnk p3jQ== X-Gm-Message-State: AJIora9eoPVAyqGd9XZG2vbN0RRgyEFO5e3zSPt1YQN7WgW88F4zCCEx 57FkUDTgRqgGoXwKuT6OjUI4nHHxjrMSh5IQ0itmKKXUOOcELiiJV8rmed2XvIwt1J02cHNNSdm XpO7QCbH2GJuSnmbH4xgBiP09 X-Received: by 2002:a05:6214:c4d:b0:472:ff2d:eef7 with SMTP id r13-20020a0562140c4d00b00472ff2deef7mr2632217qvj.84.1657716075573; Wed, 13 Jul 2022 05:41:15 -0700 (PDT) X-Received: by 2002:a05:6214:c4d:b0:472:ff2d:eef7 with SMTP id r13-20020a0562140c4d00b00472ff2deef7mr2632204qvj.84.1657716075352; Wed, 13 Jul 2022 05:41:15 -0700 (PDT) Received: from bfoster (c-24-61-119-116.hsd1.ma.comcast.net. [24.61.119.116]) by smtp.gmail.com with ESMTPSA id w20-20020a05620a0e9400b006b5b1d632bbsm2013058qkm.99.2022.07.13.05.41.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Jul 2022 05:41:14 -0700 (PDT) Date: Wed, 13 Jul 2022 08:41:11 -0400 From: Brian Foster To: Zhihao Cheng Cc: ebiederm@xmission.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, yukuai3@huawei.com, yi.zhang@huawei.com Subject: Re: [PATCH v4] proc: Fix a dentry lock race between release_task and lookup Message-ID: References: <20220601062332.232439-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 13, 2022 at 03:24:50PM +0800, Zhihao Cheng wrote: > 在 2022/7/12 22:16, Brian Foster 写道: > > On Wed, Jun 01, 2022 at 02:23:32PM +0800, Zhihao Cheng wrote: > > > Commit 7bc3e6e55acf06 ("proc: Use a list of inodes to flush from proc") > > > moved proc_flush_task() behind __exit_signal(). Then, process systemd > > > can take long period high cpu usage during releasing task in following > > > concurrent processes: > > > > > > systemd ps > > > kernel_waitid stat(/proc/tgid) > > > do_wait filename_lookup > > > wait_consider_task lookup_fast > > > release_task > > > __exit_signal > > > __unhash_process > > > detach_pid > > > __change_pid // remove task->pid_links > > > d_revalidate -> pid_revalidate // 0 > > > d_invalidate(/proc/tgid) > > > shrink_dcache_parent(/proc/tgid) > > > d_walk(/proc/tgid) > > > spin_lock_nested(/proc/tgid/fd) > > > // iterating opened fd > > > proc_flush_pid | > > > d_invalidate (/proc/tgid/fd) | > > > shrink_dcache_parent(/proc/tgid/fd) | > > > shrink_dentry_list(subdirs) ↓ > > > shrink_lock_dentry(/proc/tgid/fd) --> race on dentry lock > > > > > > > Curious... can this same sort of thing happen with /proc//task if > > that dir similarly has a lot of dentries? > > > > Yes. It could happend too. There will be many dentries under > /proc//task when there are many tasks under same thread group. > > We must put /proc//task into pid->inodes, because we have to handle > single thread exiting situation: Any one of threads should invalidate its > /proc//task/ dentry before begin released. You may refer to the > function proc_flush_task_mnt() before commit 7bc3e6e55acf06 ("proc: Use a > list of inodes to flush from proc"). > Ah, I see. So historically when the (thread) task goes away, we look up the tgid and then the associated /proc//task/ dentry to zap it. Thanks for the pointer.. Brian > > ... > > > Fixes: 7bc3e6e55acf06 ("proc: Use a list of inodes to flush from proc") > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216054 > > > Signed-off-by: Zhihao Cheng > > > Signed-off-by: Zhang Yi > > > --- > > > v1->v2: Add new helper proc_pid_make_base_inode that performs the extra > > > work of adding to the pid->list. > > > v2->v3: Add performance regression in commit message. > > > v3->v4: Make proc_pid_make_base_inode() static > > > fs/proc/base.c | 34 ++++++++++++++++++++++++++-------- > > > 1 file changed, 26 insertions(+), 8 deletions(-) > > > > > > diff --git a/fs/proc/base.c b/fs/proc/base.c > > > index c1031843cc6a..d884933950fd 100644 > > > --- a/fs/proc/base.c > > > +++ b/fs/proc/base.c > > ... > > > @@ -1931,6 +1926,27 @@ struct inode *proc_pid_make_inode(struct super_block * sb, > > > return NULL; > > > } > > > +static struct inode *proc_pid_make_base_inode(struct super_block *sb, > > > + struct task_struct *task, umode_t mode) > > > +{ > > > + struct inode *inode; > > > + struct proc_inode *ei; > > > + struct pid *pid; > > > + > > > + inode = proc_pid_make_inode(sb, task, mode); > > > + if (!inode) > > > + return NULL; > > > + > > > + /* Let proc_flush_pid find this directory inode */ > > > + ei = PROC_I(inode); > > > + pid = ei->pid; > > > + spin_lock(&pid->lock); > > > + hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes); > > > + spin_unlock(&pid->lock); > > > + > > > + return inode; > > > +} > > > + > > > > Somewhat related to the question above.. it would be nice if this > > wrapper had a line or two comment above it that explained when it should > > or shouldn't be used over the underlying function (for example, why or > > why not include /proc//task?). Otherwise the patch overall seems > > reasonable to me.. > > > > Thanks for advice, I will add some notes in v5. > > Brian > > > > > int pid_getattr(struct user_namespace *mnt_userns, const struct path *path, > > > struct kstat *stat, u32 request_mask, unsigned int query_flags) > > > { > > > @@ -3350,7 +3366,8 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry, > > > { > > > struct inode *inode; > > > - inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO); > > > + inode = proc_pid_make_base_inode(dentry->d_sb, task, > > > + S_IFDIR | S_IRUGO | S_IXUGO); > > > if (!inode) > > > return ERR_PTR(-ENOENT); > > > @@ -3649,7 +3666,8 @@ static struct dentry *proc_task_instantiate(struct dentry *dentry, > > > struct task_struct *task, const void *ptr) > > > { > > > struct inode *inode; > > > - inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO); > > > + inode = proc_pid_make_base_inode(dentry->d_sb, task, > > > + S_IFDIR | S_IRUGO | S_IXUGO); > > > if (!inode) > > > return ERR_PTR(-ENOENT); > > > -- > > > 2.31.1 > > > > > > > . > > >