Received: by 2002:a05:6358:1087:b0:cb:c9d3:cd90 with SMTP id j7csp2536161rwi; Fri, 28 Oct 2022 08:11:08 -0700 (PDT) X-Google-Smtp-Source: AMsMyM55uR3ozvKUYkmwa/bQmhqaSivr2WU7brmR5QvVvFi8esTFo7Fey4GC+oQQHhjYzZSnFedp X-Received: by 2002:aa7:88c9:0:b0:56b:e851:5b65 with SMTP id k9-20020aa788c9000000b0056be8515b65mr24884093pff.74.1666969868456; Fri, 28 Oct 2022 08:11:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666969868; cv=none; d=google.com; s=arc-20160816; b=hmcnfI8gZ6rRFXSJgMHw49YU/OIOTYIxKGgtIq9CDNnSb4somAu/5b6fiZ+I4LWltQ /4QppVFjTrCfVM4wWh+ABZ1btIF0P9MeObW0rB0XaH1LagJ1oXBPvOUwRwHIE9q60GKm UAuEsID381UD/ZaTMi4Ul3ALTRU6wyBh8tm4o2SIlpMbOsZg32CSHecmTtOARE5r468b Rok6BN2XU+Ovp6+51l9SVdpnZiZ5yjnPDh1TgQXUzAJ6GWqb185dSOWv5W48rKCeYqh5 V+KrcnEnDu87/0kbd0QlKEo1/ink1gttZeXjckTmIHxC0JwEYqkAznj7RIzCzcL/2sJ1 vS2g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent :content-transfer-encoding:references:in-reply-to:date:cc:to:from :subject:message-id:dkim-signature; bh=9+UNLwmhbIwZCxDcTSO6PBcGNVBj5aRgzXXfbpYo8mo=; b=t8C+Xupy+sbzXNbtofo5Sq0bddJuyPsW1bkMV5Ojs2Of2hiDGmmTswvP6whChOJDO6 5JIfbZ/J5YPjhbSTNFyfphCvrjqSnP5f+Uo4W8UFS0iUAGjF258q1W3Mhau7S95VxvtV 5CHdDoZuo3vvI1kZ6+ji79s051dbiTOXjn1cDWSjWlB1xScId0+5FAdeD4Q/12E0kKTY QYveencRbzCqjLgHzp//1k1avd+ItILEoX3WK/fLBGssZjJt62oSxL3opz26qa8SL2k+ QBDj99q0qBQ4flC7pZsFBewi4hFsMiTeB3M9kSILtVKnnKcOWh3y+N2uSomntD92pT5X ZZ+g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=B2qNITlY; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 27-20020a63105b000000b00435135c1d96si6014166pgq.806.2022.10.28.08.10.53; Fri, 28 Oct 2022 08:11:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=B2qNITlY; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231214AbiJ1PFW (ORCPT + 99 others); Fri, 28 Oct 2022 11:05:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229817AbiJ1PFS (ORCPT ); Fri, 28 Oct 2022 11:05:18 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5461D5C95B for ; Fri, 28 Oct 2022 08:05:11 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id EA64262906 for ; Fri, 28 Oct 2022 15:05:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E9169C433D6; Fri, 28 Oct 2022 15:05:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666969510; bh=q6qbx9E39EZoCKWS4VIzviiamLW0NoH2CRzvecowd58=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=B2qNITlYHfz8uIwSFqlTUq2YWTS+n5lZHdq5in22uWDlWG2G4h26IQOxssyDxUHt4 1UHZVl+6C6qKHhZul1spse4qxXj2jHvnU3MZ1eFoby6nCIheodD5ryY9/hvP329Z30 QDEFUnrOyDyv9PM8lT4eWIXAosBNi5EtRGIZq1kxI6hJD5G7ie18g51XmMLEAATSbK n4D5xwiTcA07yeKustTN8l/BZBTvi0+mxJVqNR0w48ddnTNh5C+r6EvT4klnIHR91/ Y3OT47tbut6Ktgb+KjYU/lLGdf5kc3LcZ5vr7WaR6gi9R/qIA0rKD7t+MIGJIvNwGD Bi9nzkUydTbmg== Message-ID: Subject: Re: [PATCH v2 3/3] nfsd: start non-blocking writeback after adding nfsd_file to the LRU From: Jeff Layton To: Chuck Lever III Cc: Linux NFS Mailing List , Neil Brown Date: Fri, 28 Oct 2022 11:05:08 -0400 In-Reply-To: References: <20221027215213.138304-1-jlayton@kernel.org> <20221027215213.138304-4-jlayton@kernel.org> Content-Type: text/plain; charset="ISO-8859-15" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.44.4 (3.44.4-2.fc36) MIME-Version: 1.0 X-Spam-Status: No, score=-7.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org On Fri, 2022-10-28 at 13:16 +0000, Chuck Lever III wrote: >=20 > > On Oct 27, 2022, at 5:52 PM, Jeff Layton wrote: > >=20 > > When a GC entry gets added to the LRU, kick off SYNC_NONE writeback > > so that we can be ready to close it out when the time comes. >=20 > For a large file, a background flush still has to walk the file's > pages to see if they are dirty, and that consumes time, CPU, and > memory bandwidth. We're talking hundreds of microseconds for a > large file. >=20 > Then the final flush does all that again. >=20 > Basically, two (or more!) passes through the file for exactly the > same amount of work. Is there any measured improvement in latency > or throughput? >=20 > And then... for a GC file, no-one is waiting on data persistence > during nfsd_file_put() so I'm not sure what is gained by taking > control of the flushing process away from the underlying filesystem. >=20 >=20 > Remind me why the filecache is flushing? Shouldn't NFSD rely on > COMMIT operations for that? (It's not obvious reading the code, > maybe there should be a documenting comment somewhere that > explains this arrangement). >=20 Fair point. I was trying to replicate the behaviors introduced in these patches: b6669305d35a nfsd: Reduce the number of calls to nfsd_file_gc() 6b8a94332ee4 nfsd: Fix a write performance regression AFAICT, the fsync is there to catch writeback errors so that we can reset the write verifiers (AFAICT). The rationale for that is described here: 055b24a8f230 nfsd: Don't garbage collect files that might contain write err= ors The problem with not calling vfs_fsync is that we might miss writeback errors. The nfsd_file could get reaped before a v3 COMMIT ever comes in. nfsd would eventually reopen the file but it could miss seeing the error if it got opened locally in the interim. I'm not sure we need to worry about that so much for v4 though. Maybe we should just do this for GC files? >=20 > > Signed-off-by: Jeff Layton > > --- > > fs/nfsd/filecache.c | 37 +++++++++++++++++++++++++++++++------ > > 1 file changed, 31 insertions(+), 6 deletions(-) > >=20 > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c > > index d2bbded805d4..491d3d9a1870 100644 > > --- a/fs/nfsd/filecache.c > > +++ b/fs/nfsd/filecache.c > > @@ -316,7 +316,7 @@ nfsd_file_alloc(struct nfsd_file_lookup_key *key, u= nsigned int may) > > } > >=20 > > static void > > -nfsd_file_flush(struct nfsd_file *nf) > > +nfsd_file_fsync(struct nfsd_file *nf) > > { > > struct file *file =3D nf->nf_file; > >=20 > > @@ -327,6 +327,22 @@ nfsd_file_flush(struct nfsd_file *nf) > > nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id)); > > } > >=20 > > +static void > > +nfsd_file_flush(struct nfsd_file *nf) > > +{ > > + struct file *file =3D nf->nf_file; > > + unsigned long nrpages; > > + > > + if (!file || !(file->f_mode & FMODE_WRITE)) > > + return; > > + > > + nrpages =3D file->f_mapping->nrpages; > > + if (nrpages) { > > + this_cpu_add(nfsd_file_pages_flushed, nrpages); > > + filemap_flush(file->f_mapping); > > + } > > +} > > + > > static void > > nfsd_file_free(struct nfsd_file *nf) > > { > > @@ -337,7 +353,7 @@ nfsd_file_free(struct nfsd_file *nf) > > this_cpu_inc(nfsd_file_releases); > > this_cpu_add(nfsd_file_total_age, age); > >=20 > > - nfsd_file_flush(nf); > > + nfsd_file_fsync(nf); > >=20 > > if (nf->nf_mark) > > nfsd_file_mark_put(nf->nf_mark); > > @@ -500,12 +516,21 @@ nfsd_file_put(struct nfsd_file *nf) > >=20 > > if (test_bit(NFSD_FILE_GC, &nf->nf_flags)) { > > /* > > - * If this is the last reference (nf_ref =3D=3D 1), then transfer > > - * it to the LRU. If the add to the LRU fails, just put it as > > - * usual. > > + * If this is the last reference (nf_ref =3D=3D 1), then try > > + * to transfer it to the LRU. > > + */ > > + if (refcount_dec_not_one(&nf->nf_ref)) > > + return; > > + > > + /* > > + * If the add to the list succeeds, try to kick off SYNC_NONE > > + * writeback. If the add fails, then just fall through to > > + * decrement as usual. >=20 > These comments simply repeat what the code does, so they seem > redundant to me. Could they instead explain why? >=20 >=20 > > */ > > - if (refcount_dec_not_one(&nf->nf_ref) || nfsd_file_lru_add(nf)) > > + if (nfsd_file_lru_add(nf)) { > > + nfsd_file_flush(nf); > > return; > > + } > > } > > __nfsd_file_put(nf); > > } > > --=20 > > 2.37.3 > >=20 >=20 > -- > Chuck Lever >=20 >=20 >=20 --=20 Jeff Layton