Date: Tue, 17 Nov 2020 10:25:37 -0500
To: "J. Bruce Fields"
Cc: Jeff Layton, Daire Byrne, Trond Myklebust, linux-cachefs, linux-nfs
Subject: Re: [PATCH 2/4] nfsd: pre/post attr is using wrong change attribute
Message-ID: <20201117152537.GB4556@fieldses.org>
References: <20201117031601.GB10526@fieldses.org>
 <1605583086-19869-1-git-send-email-bfields@redhat.com>
 <1605583086-19869-2-git-send-email-bfields@redhat.com>
In-Reply-To: <1605583086-19869-2-git-send-email-bfields@redhat.com>
From: bfields@fieldses.org (J. Bruce Fields)
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, Nov 16, 2020 at 10:18:04PM -0500, J. Bruce Fields wrote:
> From: "J. Bruce Fields"
>
> fill_{pre/post}_attr are unconditionally using i_version even when the
> underlying filesystem doesn't have proper support for i_version.

Actually, I didn't have this quite right....

These values are queried, but they aren't used, thanks to the
"change_supported" field of nfsd4_change_info; in set_change_info():

	cinfo->change_supported = IS_I_VERSION(d_inode(fhp->fh_dentry));

and then later on encode_cinfo() chooses to use stored change attribute
or ctime values depending on how change_supported is set.

But as of the ctime changes, just querying the change attribute here has
side effects.

So, that explains why Daire's team was seeing a performance regression,
while no one was complaining about our returned change info being
garbage.

Anyway.

--b.

>
> Move the code that chooses which i_version to use to the common
> nfsd4_change_attribute().
>
> The NFSEXP_V4ROOT case probably doesn't matter (the pseudoroot
> filesystem is usually read-only and unlikely to see operations with pre
> and post change attributes), but let's put it in the same place anyway
> for consistency.
>
> Fixes: c654b8a9cba6 ("nfsd: support ext4 i_version")
> Signed-off-by: J. Bruce Fields
> ---
>  fs/nfsd/nfs4xdr.c | 11 +----------
>  fs/nfsd/nfsfh.c   | 11 +++++++----
>  fs/nfsd/nfsfh.h   | 23 -----------------------
>  fs/nfsd/vfs.c     | 32 ++++++++++++++++++++++++++++++++
>  fs/nfsd/vfs.h     |  3 +++
>  5 files changed, 43 insertions(+), 37 deletions(-)
>
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 833a2c64dfe8..6806207b6d18 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -2295,16 +2295,7 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
>  static __be32 *encode_change(__be32 *p, struct kstat *stat, struct inode *inode,
>  			     struct svc_export *exp)
>  {
> -	if (exp->ex_flags & NFSEXP_V4ROOT) {
> -		*p++ = cpu_to_be32(convert_to_wallclock(exp->cd->flush_time));
> -		*p++ = 0;
> -	} else if (IS_I_VERSION(inode)) {
> -		p = xdr_encode_hyper(p, nfsd4_change_attribute(stat, inode));
> -	} else {
> -		*p++ = cpu_to_be32(stat->ctime.tv_sec);
> -		*p++ = cpu_to_be32(stat->ctime.tv_nsec);
> -	}
> -	return p;
> +	return xdr_encode_hyper(p, nfsd4_change_attribute(stat, inode, exp));
>  }
>
>  /*
> diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
> index b3b4e8809aa9..4fbe1413e767 100644
> --- a/fs/nfsd/nfsfh.c
> +++ b/fs/nfsd/nfsfh.c
> @@ -719,6 +719,7 @@ void fill_pre_wcc(struct svc_fh *fhp)
>  {
>  	struct inode *inode;
>  	struct kstat stat;
> +	struct svc_export *exp = fhp->fh_export;
>  	__be32 err;
>
>  	if (fhp->fh_pre_saved)
> @@ -736,7 +737,7 @@ void fill_pre_wcc(struct svc_fh *fhp)
>  	fhp->fh_pre_mtime = stat.mtime;
>  	fhp->fh_pre_ctime = stat.ctime;
>  	fhp->fh_pre_size = stat.size;
> -	fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode);
> +	fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode, exp);
>  	fhp->fh_pre_saved = true;
>  }
>
> @@ -746,17 +747,19 @@ void fill_pre_wcc(struct svc_fh *fhp)
>  void fill_post_wcc(struct svc_fh *fhp)
>  {
>  	__be32 err;
> +	struct inode *inode = d_inode(fhp->fh_dentry);
> +	struct svc_export *exp = fhp->fh_export;
>
>  	if (fhp->fh_post_saved)
>  		printk("nfsd: inode locked twice during operation.\n");
>
>  	err = fh_getattr(fhp, &fhp->fh_post_attr);
> -	fhp->fh_post_change = nfsd4_change_attribute(&fhp->fh_post_attr,
> -						     d_inode(fhp->fh_dentry));
> +	fhp->fh_post_change =
> +		nfsd4_change_attribute(&fhp->fh_post_attr, inode, exp);
>  	if (err) {
>  		fhp->fh_post_saved = false;
>  		/* Grab the ctime anyway - set_change_info might use it */
> -		fhp->fh_post_attr.ctime = d_inode(fhp->fh_dentry)->i_ctime;
> +		fhp->fh_post_attr.ctime = inode->i_ctime;
>  	} else
>  		fhp->fh_post_saved = true;
>  }
> diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h
> index 56cfbc361561..547aef9b3265 100644
> --- a/fs/nfsd/nfsfh.h
> +++ b/fs/nfsd/nfsfh.h
> @@ -245,29 +245,6 @@ fh_clear_wcc(struct svc_fh *fhp)
>  	fhp->fh_pre_saved = false;
>  }
>
> -/*
> - * We could use i_version alone as the change attribute.  However,
> - * i_version can go backwards after a reboot.  On its own that doesn't
> - * necessarily cause a problem, but if i_version goes backwards and then
> - * is incremented again it could reuse a value that was previously used
> - * before boot, and a client who queried the two values might
> - * incorrectly assume nothing changed.
> - *
> - * By using both ctime and the i_version counter we guarantee that as
> - * long as time doesn't go backwards we never reuse an old value.
> - */
> -static inline u64 nfsd4_change_attribute(struct kstat *stat,
> -					 struct inode *inode)
> -{
> -	u64 chattr;
> -
> -	chattr =  stat->ctime.tv_sec;
> -	chattr <<= 30;
> -	chattr += stat->ctime.tv_nsec;
> -	chattr += inode_query_iversion(inode);
> -	return chattr;
> -}
> -
>  extern void fill_pre_wcc(struct svc_fh *fhp);
>  extern void fill_post_wcc(struct svc_fh *fhp);
>  #else
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 1ecaceebee13..2c71b02dd1fe 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -2390,3 +2390,35 @@ nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp,
>
>  	return err? nfserrno(err) : 0;
>  }
> +
> +/*
> + * We could use i_version alone as the change attribute.  However,
> + * i_version can go backwards after a reboot.  On its own that doesn't
> + * necessarily cause a problem, but if i_version goes backwards and then
> + * is incremented again it could reuse a value that was previously used
> + * before boot, and a client who queried the two values might
> + * incorrectly assume nothing changed.
> + *
> + * By using both ctime and the i_version counter we guarantee that as
> + * long as time doesn't go backwards we never reuse an old value.
> + */
> +u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode,
> +			   struct svc_export *exp)
> +{
> +	u64 chattr;
> +
> +	if (exp->ex_flags & NFSEXP_V4ROOT) {
> +		chattr = cpu_to_be32(convert_to_wallclock(exp->cd->flush_time));
> +		chattr <<= 32;
> +	} else if (IS_I_VERSION(inode)) {
> +		chattr = stat->ctime.tv_sec;
> +		chattr <<= 30;
> +		chattr += stat->ctime.tv_nsec;
> +		chattr += inode_query_iversion(inode);
> +	} else {
> +		chattr = stat->ctime.tv_sec;
> +		chattr <<= 32;
> +		chattr += stat->ctime.tv_nsec;
> +	}
> +	return chattr;
> +}
> diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
> index a2442ebe5acf..26ed15256340 100644
> --- a/fs/nfsd/vfs.h
> +++ b/fs/nfsd/vfs.h
> @@ -132,6 +132,9 @@ __be32 nfsd_statfs(struct svc_rqst *, struct svc_fh *,
>  __be32 nfsd_permission(struct svc_rqst *, struct svc_export *,
>  				struct dentry *, int);
>
> +u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode,
> +				struct svc_export *exp);
> +
>  static inline int fh_want_write(struct svc_fh *fh)
>  {
>  	int ret;
> --
> 2.28.0
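
For anyone following the change_supported point in the reply, here is a
rough userspace sketch of the selection it describes. It is not the
kernel code. Every sketch_* name is hypothetical, the struct layouts are
simplified stand-ins for kstat and nfsd4_change_info, and the real
encode_cinfo() writes XDR words rather than returning u64 values. Only
the arithmetic is taken from the quoted patch (ctime seconds shifted by
30 plus nanoseconds plus i_version, or seconds shifted by 32 plus
nanoseconds for the ctime fallback).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ctime part of struct kstat. */
struct sketch_ctime {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Hypothetical, simplified stand-in for nfsd4_change_info. */
struct sketch_cinfo {
	bool change_supported;            /* assumed to mirror IS_I_VERSION(inode) */
	uint64_t before_change;           /* stored pre-op change attribute */
	uint64_t after_change;            /* stored post-op change attribute */
	struct sketch_ctime before_ctime; /* pre-op ctime, used as fallback */
	struct sketch_ctime after_ctime;  /* post-op ctime, used as fallback */
};

/* Same arithmetic as the IS_I_VERSION branch in the quoted patch. */
static uint64_t sketch_change_attribute(struct sketch_ctime ct, uint64_t iversion)
{
	uint64_t chattr = (uint64_t)ct.tv_sec;

	chattr <<= 30;
	chattr += (uint64_t)ct.tv_nsec;
	chattr += iversion;
	return chattr;
}

/* Same arithmetic as the ctime-only branch in the quoted patch. */
static uint64_t sketch_ctime_attribute(struct sketch_ctime ct)
{
	uint64_t chattr = (uint64_t)ct.tv_sec;

	chattr <<= 32;
	chattr += (uint64_t)ct.tv_nsec;
	return chattr;
}

/*
 * Pick which stored values a reply would carry.  When change_supported
 * is false the stored change attributes are ignored entirely, which is
 * why bogus values there would go unnoticed by clients.
 */
static void sketch_encode_cinfo(const struct sketch_cinfo *c,
				uint64_t *before, uint64_t *after)
{
	assert(c != NULL && before != NULL && after != NULL);

	if (c->change_supported) {
		*before = c->before_change;
		*after = c->after_change;
	} else {
		*before = sketch_ctime_attribute(c->before_ctime);
		*after = sketch_ctime_attribute(c->after_ctime);
	}
}
```

With change_supported false, whatever sits in before_change and
after_change never reaches the output, so under these assumptions the
only cost of filling them in is the query itself.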