Date: Wed, 30 Mar 2022 18:19:52 +0200
From: Jan Kara
To: Chuck Lever III
Cc: Trond Myklebust, "jack@suse.cz", Linux NFS Mailing List, "nfbrown@suse.com", Bruce Fields
Subject: Re: Performance regression with random IO pattern from the client
Message-ID: <20220330161952.haopqr342qlij5hg@quack3.lan>
References: <20220330103457.r4xrhy2d6nhtouzk@quack3.lan> <64a4832afd830d7c831ab687bc7a72cc791c2f0c.camel@hammerspace.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed 30-03-22 15:38:38, Chuck Lever III wrote:
>
> > On Mar 30, 2022, at 11:03 AM, Trond Myklebust wrote:
> >
> > On Wed, 2022-03-30 at 12:34 +0200, Jan Kara wrote:
> >> Hello,
> >>
> >> during our performance testing we have noticed that commit
> >> b6669305d35a ("nfsd: Reduce the number of calls to nfsd_file_gc()")
> >> has introduced a performance regression when a client does random
> >> buffered writes. The workload on NFS client is fio running 4
> >> processes doing random buffered writes to 4 different files and the
> >> files are large enough to hit dirty limits and force writeback from
> >> the client. In particular the invocation is like:
> >>
> >> fio --direct=0 --ioengine=sync --thread --directory=/mnt/mnt1
> >> --invalidate=1 --group_reporting=1 --runtime=300 --fallocate=posix
> >> --ramp_time=10 --name=RandomReads-128000-4k-4 --new_group
> >> --rw=randwrite --size=4000m --numjobs=4 --bs=4k
> >> --filename_format=FioWorkloads.\$jobnum --end_fsync=1
> >>
> >> The reason why commit b6669305d35a regresses performance is the
> >> filemap_flush() call it adds into nfsd_file_put(). Before this commit
> >> writeback on the server happened from nfsd_commit() code resulting in
> >> rather long semisequential streams of 4k writes. After commit
> >> b6669305d35a all the writeback happens from filemap_flush() calls
> >> resulting in much longer average seek distance (IO from different
> >> files is more interleaved) and about 16-20% regression in the
> >> achieved writeback throughput when the backing store is rotational
> >> storage.
> >>
> >> I think the filemap_flush() from nfsd_file_put() is indeed rather
> >> aggressive and I think we'd be better off to just leave writeback to
> >> either nfsd_commit() or standard dirty page cleaning happening on the
> >> system. I assume the rationale for the filemap_flush() call was to
> >> make it more likely the file can be evicted during the garbage
> >> collection run? Was there any particular problem leading to addition
> >> of this call or was it just "it seemed like a good idea" thing?
> >>
> >> Thanks in advance for ideas.
> >>
> >> Honza
> >
> > It was mainly introduced to reduce the amount of work that
> > nfsd_file_free() needs to do. In particular when re-exporting NFS, the
> > call to filp_close() can be expensive because it synchronously flushes
> > out dirty pages. That again means that some of the calls to
> > nfsd_file_dispose_list() can end up being very expensive (particularly
> > the ones run by the garbage collector itself).
>
> The "no regressions" rule suggests that some kind of action needs
> to be taken. I don't have a sense of whether Jan's workload or NFS
> re-export is the more common use case, however.
>
> I can see that filp_close() on a file on an NFS mount could be
> costly if that file has dirty pages, due to the NFS client's
> "flush on close" semantic.
>
> Trond, are there alternatives to flushing in the nfsd_file_put()
> path? I'm willing to entertain bug fix patches rather than a
> mechanical revert of b6669305d35a.

Yeah, I don't think we need to rush fixing this with a revert, also because
just removing the filemap_flush() call from nfsd_file_put() would, AFAIU,
keep the other benefits of that commit while fixing the regression. But I
think making the flushing less aggressive is desirable because, as I wrote
in my other reply, it currently prevents the pagecache from accumulating
enough dirty data for a good IO pattern.

								Honza
-- 
Jan Kara
SUSE Labs, CR