Date: Fri, 1 Apr 2022 14:53:21 +1100
From: Dave Chinner
To: Jeff Layton
Cc: viro@zeniv.linux.org.uk, ceph-devel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] fs: change test in inode_insert5 for adding to the sb list
Message-ID: <20220401035321.GR1609613@dread.disaster.area>
References: <20220331225632.247244-1-jlayton@kernel.org>
In-Reply-To: <20220331225632.247244-1-jlayton@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 31, 2022 at 06:56:32PM -0400, Jeff Layton wrote:
> The inode_insert5 currently looks at I_CREATING to decide whether to
> insert the inode into the sb list. This test is a bit ambiguous though
> as I_CREATING state is not directly related to that list.
>
> This test is also problematic for some upcoming ceph changes to add
> fscrypt support. We need to be able to allocate an inode using new_inode
> and insert it into the hash later if we end up using it, and doing that
> now means that we double add it and corrupt the list.
>
> What we really want to know in this test is whether the inode is already
> in its superblock list, and then add it if it isn't. Have it test for
> list_empty instead and ensure that we always initialize the list by
> doing it in inode_init_once. It's only ever removed from the list with
> list_del_init, so that should be sufficient.
>
> Suggested-by: Al Viro
> Signed-off-by: Jeff Layton
> ---
>  fs/inode.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> This is the alternate approach that Al suggested to me on IRC. I think
> this is likely to be more robust in the long run, and we can avoid
> exporting another symbol.

Looks good to me.

Reviewed-by: Dave Chinner

FWIW, I'm getting ready to resend patches originally written by Waiman
Long years ago to convert the inode sb list to a different structure
(per-cpu lists) for scalability reasons, but it still allows using
list_empty() to check if the inode is on the list or not, so I don't
see a problem with this change at all.

> Al, if you're ok with this, would you mind taking this in via your tree?
> I'd like to see this in sit in linux-next for a bit so we can see if any
> benchmarks get dinged.

I think that is unlikely - the sb inode list just doesn't show up in
profiles until you are pushing several hundred thousand inodes a second
through the inode cache, and there really aren't a lot of workloads out
there that do that. At that point, sb list lock contention becomes the
issue, not the requirement to add in-use inodes to the sb list...

e.g. concurrent 'find <...> -ctime' operations on XFS hit sb list lock
contention limits at about 600,000 inodes/s being instantiated, stat()d
and reclaimed from memory. With Waiman's dlist code I mentioned above,
it'll do 1.5 million inodes/s for the same CPU usage. And a concurrent
bulkstat workload goes from 600,000 inodes/s to over 6 million inodes/s
for the same CPU usage. That bulkstat workload is hitting memory reclaim
scalability limits as I'm turning over ~12GB/s of cached memory on a
machine with only 16GB RAM...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
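
For anyone following the thread without the patch in front of them, here is
a rough userspace sketch of the idiom the commit message describes: initialize
the list node once, only ever unlink it with list_del_init, and let
list_empty on the node answer "is this inode already on the sb list?". This
is not the actual fs/inode.c change. The list helpers are minimal
reimplementations of the kernel ones, and the toy_* types and functions are
hypothetical names made up for illustration. Locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node that points at itself is on no list. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink and reinitialize, so list_empty(entry) is true again. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical, stripped-down inode and superblock (assumptions). */
struct toy_inode { struct list_head i_sb_list; };
struct toy_sb { struct list_head s_inodes; };

/* Mirrors the idea of initializing the node in inode_init_once. */
static void toy_inode_init_once(struct toy_inode *inode)
{
	INIT_LIST_HEAD(&inode->i_sb_list);
}

/* Add to the sb list only if the node is not linked anywhere yet. */
static void toy_sb_list_add_once(struct toy_sb *sb, struct toy_inode *inode)
{
	if (list_empty(&inode->i_sb_list))
		list_add(&inode->i_sb_list, &sb->s_inodes);
}

static size_t toy_sb_count(const struct toy_sb *sb)
{
	size_t n = 0;
	const struct list_head *p;

	for (p = sb->s_inodes.next; p != &sb->s_inodes; p = p->next)
		n++;
	return n;
}

/* Walks through add, repeated add, removal and re-add. */
size_t toy_demo(void)
{
	struct toy_sb sb;
	struct toy_inode inode;

	INIT_LIST_HEAD(&sb.s_inodes);
	toy_inode_init_once(&inode);

	toy_sb_list_add_once(&sb, &inode);
	toy_sb_list_add_once(&sb, &inode);	/* second call is a no-op */
	assert(toy_sb_count(&sb) == 1);

	list_del_init(&inode.i_sb_list);
	assert(list_empty(&inode.i_sb_list));

	toy_sb_list_add_once(&sb, &inode);	/* safe to add again */
	return toy_sb_count(&sb);
}
```

The check only works because every path that unlinks the node uses
list_del_init rather than a plain delete, which is the invariant the commit
message relies on.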