Date: Mon, 30 May 2022 09:36:50 +0800
From: Wang Yugui
To: Chuck Lever III
Subject: Re: filecache LRU performance regression
Cc: Frank van der Linden, Linux NFS Mailing List, Matthew Wilcox, Liam Howlett
In-Reply-To: <69EAFC64-E5B1-450A-9DCE-695E478A213B@oracle.com>
References: <20220529103218.65DA.409509F4@e16-tech.com> <69EAFC64-E5B1-450A-9DCE-695E478A213B@oracle.com>
Message-Id: <20220530093649.43F8.409509F4@e16-tech.com>
X-Mailing-List: linux-nfs@vger.kernel.org

Hi,

>
> > On May 28, 2022, at 10:32 PM, Wang Yugui wrote:
> >
> > Hi,
> >
> >>> On May 27, 2022, at 4:37 PM, Frank van der Linden wrote:
> >>>
> >>> On Fri, May 27, 2022 at 06:59:47PM +0000, Chuck Lever III wrote:
> >>>>
> >>>> Hi Frank-
> >>>>
> >>>> Bruce recently reminded me about this issue. Is there a bugzilla somewhere?
> >>>> Do you have a reproducer I can try?
> >>>
> >>> Hi Chuck,
> >>>
> >>> The easiest way to reproduce the issue is to run generic/531 over an
> >>> NFSv4 mount, using a system with a larger number of CPUs on the client
> >>> side (or just scaling the test up manually - it has a calculation based
> >>> on the number of CPUs).
> >>>
> >>> The test will take a long time to finish. I initially described the
> >>> details here:
> >>>
> >>> https://lore.kernel.org/linux-nfs/20200608192122.GA19171@dev-dsk-fllinden-2c-c1893d73.us-west-2.amazon.com/
> >>>
> >>> Since then, it was also reported here:
> >>>
> >>> https://lore.kernel.org/all/20210531125948.2D37.409509F4@e16-tech.com/T/#m8c3e4173696e17a9d5903d2a619550f352314d20
> >>
> >> Thanks for the summary. So, there isn't a bugzilla tracking this
> >> issue?
> >> If not, please create one here:
> >>
> >> https://bugzilla.linux-nfs.org/
> >>
> >> Then we don't have to keep asking for a repeat summary ;-)
> >>
> >>
> >>> I posted an experimental patch, but it's actually not quite correct
> >>> (although I think the idea behind it is makes sense):
> >>>
> >>> https://lore.kernel.org/linux-nfs/20201020183718.14618-4-trondmy@kernel.org/T/#m869aa427f125afee2af9a89d569c6b98e12e516f
> >>
> >> A handful of random comments:
> >>
> >> - nfsd_file_put() is now significantly different than it used
> >> to be, so that part of the patch will have to be updated in
> >> order to apply to v5.18+
> >
> > When many open files(>NFSD_FILE_LRU_THRESHOLD),
> > nfsd_file_gc() will waste many CPU times.
>
> Thanks for the suggestion. I agree that CPU and
> memory bandwidth are not being used effectively
> by the filecache garbage collector.
>
>
> > Can we serialize nfsd_file_gc() for v5.18+ as a first step?
>
> If I understand Frank's problem statement correctly,
> garbage collection during an nfsd_file_put() under
> an NFSv4-only workload walks the length of the LRU
> list and finds nothing to evict 100% of the time.
> That seems like a bug, and fixing it might give us
> the most improvement in this area.
>
>
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index 3d944ca..6abefd9 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -63,6 +63,8 @@ static struct delayed_work nfsd_filecache_laundrette;
> >
> >  static void nfsd_file_gc(void);
> >
> > +static atomic_t nfsd_file_gc_queue_delayed = ATOMIC_INIT(0);
> > +
> >  static void
> >  nfsd_file_schedule_laundrette(void)
> >  {
> > @@ -71,8 +73,10 @@ nfsd_file_schedule_laundrette(void)
> >  	if (count == 0 || test_bit(NFSD_FILE_SHUTDOWN, &nfsd_file_lru_flags))
> >  		return;
> >
> > -	queue_delayed_work(system_wq, &nfsd_filecache_laundrette,
> > +	if(atomic_cmpxchg(&nfsd_file_gc_queue_delayed, 0, 1)==0){
> > +		queue_delayed_work(system_wq, &nfsd_filecache_laundrette,
> >  			NFSD_LAUNDRETTE_DELAY);
>
> I might misunderstand what this is doing exactly.
> I'm sure there's a preferred idiom in the kernel
> for not queuing a new work item when one is already
> running, so that an open-coded cmpxchg is not
> necessary.

Thanks. I dropped the queue_delayed_work-related change and reposted it.

> It might be better to allocate a specific workqueue
> for filecache garbage collection, and limit the
> maximum number of active work items allowed on that
> queue to 1. One benefit of that might be reducing
> the number of threads contending for the filecache
> data structures.

This way, the number of threads doing filecache garbage collection is
reduced, but the filecache is still accessed by all nfsd threads for
filecache fetch/save.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/05/30

> If GC is capped like this, maybe create one GC
> workqueue per nfsd_net so GC activity in one
> namespace does not starve filecache GC in other
> namespaces.
>
> Before I would take patches like this, though,
> performance data demonstrating a problem and some
> improvement should be presented separately or as
> part of the patch descriptions.
> If you repost, start a separate thread and cc:
>
> M: Tejun Heo
> R: Lai Jiangshan
>
> to get review by workqueue domain experts.
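
For reference, the "queue at most one pending GC pass" guard from the quoted diff can be sketched in userspace C11. This is only an illustration of the idiom, not kernel code: gc_pending, gc_queued_count, gc_try_schedule() and gc_finish() are hypothetical names, atomic_compare_exchange_strong() stands in for the kernel's atomic_cmpxchg(), and clearing the flag when the pass finishes is an assumption, since the quoted hunk does not show where nfsd_file_gc_queue_delayed is reset.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for nfsd_file_gc_queue_delayed. */
static atomic_int gc_pending = 0;

/* Counts how often GC was actually queued (for illustration only). */
static atomic_int gc_queued_count = 0;

/*
 * Try to schedule a GC pass. Returns true if this caller queued it,
 * false if a pass was already pending.
 */
static bool gc_try_schedule(void)
{
	int expected = 0;

	/* Only the caller that flips the flag 0 -> 1 may queue the work. */
	if (!atomic_compare_exchange_strong(&gc_pending, &expected, 1))
		return false;

	/* Stands in for queue_delayed_work() in the quoted diff. */
	atomic_fetch_add(&gc_queued_count, 1);
	return true;
}

/*
 * Called when a GC pass finishes. Assumed reset point: the quoted
 * hunk does not show where the real flag is cleared.
 */
static void gc_finish(void)
{
	/* A pass can only finish if one was pending. */
	assert(atomic_load(&gc_pending) == 1);
	atomic_store(&gc_pending, 0);
}
```

Callers that lose the compare-and-swap simply return, so at most one pass is queued until gc_finish() clears the flag. This only limits how GC is scheduled; the filecache itself is still accessed by all nfsd threads.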