2014-11-12 17:08:07

by Weston Andros Adamson

[permalink] [raw]
Subject: [PATCH] NFS: fix subtle change in COMMIT behavior

Recent work in the pgio layer made it possible for there to be more than one
request per page. This caused a subtle change in commit behavior, because
write.c:nfs_commit_unstable_pages compares the number of *pages* waiting for
writeback against the number of requests on a commit list to choose when to
send a COMMIT in a non-blocking flush.

This is probably hard to hit in normal operation - you have to be using
rsize/wsize < PAGE_SIZE, or pnfs with lots of boundaries that are not page
aligned to have a noticeable change in behavior.

Signed-off-by: Weston Andros Adamson <[email protected]>
---

There is one portion of this patch that I'm not 100% on - the blocklayout
portion. It seems to be using the (old) nfsi->npages as a count of how many
contiguous *whole* pages are in use, but that's never been what nfsi->npages
was counting -- it would be incremented if just one byte of a page was in use,
but should be contiguous (as enforced by nfs_flush_incompatible).

Trond things that simply using the new nfsi->nrequests is ok here, because
the block layout should always be using requests > PAGE_SIZE. Any blocklayout
experts have an opinion? CCing Christoph...

-dros

fs/nfs/blocklayout/blocklayout.c | 2 +-
fs/nfs/callback_proc.c | 2 +-
fs/nfs/inode.c | 8 ++++----
fs/nfs/pagelist.c | 11 ++++++++---
fs/nfs/write.c | 17 ++++++++++++-----
include/linux/nfs_fs.h | 4 ++--
6 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index b439edfda3ff..5cd836b415a9 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -812,7 +812,7 @@ static u64 pnfs_num_cont_bytes(struct inode *inode, pgoff_t idx)

/* Optimize common case that writes from 0 to end of file */
end = DIV_ROUND_UP(i_size_read(inode), PAGE_CACHE_SIZE);
- if (end != NFS_I(inode)->npages) {
+ if (end != NFS_I(inode)->nrequests) {
rcu_read_lock();
end = page_cache_next_hole(mapping, idx + 1, ULONG_MAX);
rcu_read_unlock();
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 9c3a07b48655..f4be136f02de 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -49,7 +49,7 @@ __be32 nfs4_callback_getattr(struct cb_getattrargs *args,
goto out_iput;
res->size = i_size_read(inode);
res->change_attr = delegation->change_attr;
- if (nfsi->npages != 0)
+ if (nfsi->nrequests != 0)
res->change_attr++;
res->ctime = inode->i_ctime;
res->mtime = inode->i_mtime;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 6388a59f2add..90367d7e31a0 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1149,7 +1149,7 @@ static unsigned long nfs_wcc_update_inode(struct inode *inode, struct nfs_fattr
if ((fattr->valid & NFS_ATTR_FATTR_PRESIZE)
&& (fattr->valid & NFS_ATTR_FATTR_SIZE)
&& i_size_read(inode) == nfs_size_to_loff_t(fattr->pre_size)
- && nfsi->npages == 0) {
+ && nfsi->nrequests == 0) {
i_size_write(inode, nfs_size_to_loff_t(fattr->size));
ret |= NFS_INO_INVALID_ATTR;
}
@@ -1192,7 +1192,7 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat
if (fattr->valid & NFS_ATTR_FATTR_SIZE) {
cur_size = i_size_read(inode);
new_isize = nfs_size_to_loff_t(fattr->size);
- if (cur_size != new_isize && nfsi->npages == 0)
+ if (cur_size != new_isize && nfsi->nrequests == 0)
invalid |= NFS_INO_INVALID_ATTR|NFS_INO_REVAL_PAGECACHE;
}

@@ -1619,7 +1619,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
if (new_isize != cur_isize) {
/* Do we perhaps have any outstanding writes, or has
* the file grown beyond our last write? */
- if ((nfsi->npages == 0) || new_isize > cur_isize) {
+ if ((nfsi->nrequests == 0) || new_isize > cur_isize) {
i_size_write(inode, new_isize);
invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
invalid &= ~NFS_INO_REVAL_PAGECACHE;
@@ -1784,7 +1784,7 @@ static void init_once(void *foo)
INIT_LIST_HEAD(&nfsi->access_cache_entry_lru);
INIT_LIST_HEAD(&nfsi->access_cache_inode_lru);
INIT_LIST_HEAD(&nfsi->commit_info.list);
- nfsi->npages = 0;
+ nfsi->nrequests = 0;
nfsi->commit_info.ncommit = 0;
atomic_set(&nfsi->commit_info.rpcs_out, 0);
atomic_set(&nfsi->silly_count, 1);
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index fabb173f5f62..a2d8827f5975 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -263,6 +263,7 @@ bool nfs_page_group_sync_on_bit(struct nfs_page *req, unsigned int bit)
static inline void
nfs_page_group_init(struct nfs_page *req, struct nfs_page *prev)
{
+ struct inode *inode;
WARN_ON_ONCE(prev == req);

if (!prev) {
@@ -281,12 +282,16 @@ nfs_page_group_init(struct nfs_page *req, struct nfs_page *prev)
* nfs_page_group_destroy is called */
kref_get(&req->wb_head->wb_kref);

- /* grab extra ref if head request has extra ref from
- * the write/commit path to handle handoff between write
- * and commit lists */
+ /* grab extra ref and bump the request count if head request
+ * has extra ref from the write/commit path to handle handoff
+ * between write and commit lists. */
if (test_bit(PG_INODE_REF, &prev->wb_head->wb_flags)) {
+ inode = page_file_mapping(req->wb_page)->host;
set_bit(PG_INODE_REF, &req->wb_flags);
kref_get(&req->wb_kref);
+ spin_lock(&inode->i_lock);
+ NFS_I(inode)->nrequests++;
+ spin_unlock(&inode->i_lock);
}
}
}
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 0bd0a5d07326..c31d479da159 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -675,7 +675,8 @@ static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
nfs_lock_request(req);

spin_lock(&inode->i_lock);
- if (!nfsi->npages && NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
+ if (!nfsi->nrequests &&
+ NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
inode->i_version++;
/*
* Swap-space should not get truncated. Hence no need to plug the race
@@ -686,9 +687,11 @@ static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
SetPagePrivate(req->wb_page);
set_page_private(req->wb_page, (unsigned long)req);
}
- nfsi->npages++;
+ nfsi->nrequests++;
/* this a head request for a page group - mark it as having an
- * extra reference so sub groups can follow suit */
+ * extra reference so sub groups can follow suit.
+ * This flag also informs pgio layer when to bump nrequests when
+ * adding subrequests. */
WARN_ON(test_and_set_bit(PG_INODE_REF, &req->wb_flags));
kref_get(&req->wb_kref);
spin_unlock(&inode->i_lock);
@@ -714,7 +717,11 @@ static void nfs_inode_remove_request(struct nfs_page *req)
wake_up_page(head->wb_page, PG_private);
clear_bit(PG_MAPPED, &head->wb_flags);
}
- nfsi->npages--;
+ nfsi->nrequests--;
+ spin_unlock(&inode->i_lock);
+ } else {
+ spin_lock(&inode->i_lock);
+ nfsi->nrequests--;
spin_unlock(&inode->i_lock);
}

@@ -1750,7 +1757,7 @@ static int nfs_commit_unstable_pages(struct inode *inode, struct writeback_contr
/* Don't commit yet if this is a non-blocking flush and there
* are a lot of outstanding writes for this mapping.
*/
- if (nfsi->commit_info.ncommit <= (nfsi->npages >> 1))
+ if (nfsi->commit_info.ncommit <= (nfsi->nrequests >> 1))
goto out_mark_dirty;

/* don't wait for the COMMIT response */
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index c72d1ad41ad4..6d627b92df53 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -163,7 +163,7 @@ struct nfs_inode {
*/
__be32 cookieverf[2];

- unsigned long npages;
+ unsigned long nrequests;
struct nfs_mds_commit_info commit_info;

/* Open contexts for shared mmap writes */
@@ -520,7 +520,7 @@ extern void nfs_commit_free(struct nfs_commit_data *data);
static inline int
nfs_have_writebacks(struct inode *inode)
{
- return NFS_I(inode)->npages != 0;
+ return NFS_I(inode)->nrequests != 0;
}

/*
--
1.9.3 (Apple Git-50)



2014-11-13 13:38:18

by Weston Andros Adamson

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior


> On Nov 13, 2014, at 6:02 AM, Christoph Hellwig <[email protected]> wrote:
>
> On Thu, Nov 13, 2014 at 09:03:53AM +0800, Peng Tao wrote:
>>> but should be contiguous (as enforced by nfs_flush_incompatible).
>>>
>> The idea is to get an estimation on the length of layout that client
>> wants to use. So even if a single byte is used by client, client would
>> want to ask for a layout that covers the whole page.
>
> That code is very non-obvious and could use a few more comments.
>
> I think the NFS_I(inode)->npages should be replaced with
> inode->i_mapping->nrpages which counts the number of page cache pages.

That sounds reasonable. Can someone track this down? I don’t have a block
server to test against…

Thanks for the help Tao and Christoph!

-dros

2014-11-13 13:38:52

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

On Thu, Nov 13, 2014 at 08:38:15AM -0500, Weston Andros Adamson wrote:
> That sounds reasonable. Can someone track this down? I don’t have a block
> server to test against…
>
> Thanks for the help Tao and Christoph!

I will give that variant a spin on my test setup.

2014-11-14 15:02:45

by Weston Andros Adamson

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

>
> On Nov 14, 2014, at 2:04 AM, Christoph Hellwig <[email protected]> wrote:
>
> The patch below passes block layout testing for me:
> <0001-pnfs-blocklayout-fix-end-calculation-in-pnfs_num_con.patch>

Great! That seems like a cleaner solution to me.

Trond - There will be a minor conflict with my patch, but it should be very
easy to fix.

-dros

2014-11-24 22:04:36

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

On Fri, 2014-11-14 at 08:04 +0100, Christoph Hellwig wrote:
> The patch below passes block layout testing for me:

Hmm... Are you sure? It didn't compile for me, due to the misspelled
'nrpages'.

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]



2014-11-24 23:47:31

by Thomas Haynes

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

Sure it isn’t you?

You also had “issues” with Anna’s patch.

Just saying!

> On Nov 24, 2014, at 2:04 PM, Trond Myklebust <[email protected]> wrote:
>
> On Fri, 2014-11-14 at 08:04 +0100, Christoph Hellwig wrote:
>> The patch below passes block layout testing for me:
>
> Hmm... Are you sure? It didn't compile for me, due to the misspelled
> 'nrpages'.
>
> --
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> [email protected]
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2014-11-25 13:06:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

On Mon, Nov 24, 2014 at 05:04:32PM -0500, Trond Myklebust wrote:
> On Fri, 2014-11-14 at 08:04 +0100, Christoph Hellwig wrote:
> > The patch below passes block layout testing for me:
>
> Hmm... Are you sure? It didn't compile for me, due to the misspelled
> 'nrpages'.

Hmm, my locally commited version has nrpages, while the one on the list has
nr_pages. The one I tested obviously had nrpages.

2014-11-13 01:04:14

by Peng Tao

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

On Thu, Nov 13, 2014 at 1:08 AM, Weston Andros Adamson
<[email protected]> wrote:
> Recent work in the pgio layer made it possible for there to be more than one
> request per page. This caused a subtle change in commit behavior, because
> write.c:nfs_commit_unstable_pages compares the number of *pages* waiting for
> writeback against the number of requests on a commit list to choose when to
> send a COMMIT in a non-blocking flush.
>
> This is probably hard to hit in normal operation - you have to be using
> rsize/wsize < PAGE_SIZE, or pnfs with lots of boundaries that are not page
> aligned to have a noticeable change in behavior.
>
> Signed-off-by: Weston Andros Adamson <[email protected]>
> ---
>
> There is one portion of this patch that I'm not 100% on - the blocklayout
> portion. It seems to be using the (old) nfsi->npages as a count of how many
> contiguous *whole* pages are in use, but that's never been what nfsi->npages
> was counting -- it would be incremented if just one byte of a page was in use,
> but should be contiguous (as enforced by nfs_flush_incompatible).
>
The idea is to get an estimation on the length of layout that client
wants to use. So even if a single byte is used by client, client would
want to ask for a layout that covers the whole page.

> Trond things that simply using the new nfsi->nrequests is ok here, because
> the block layout should always be using requests > PAGE_SIZE. Any blocklayout
> experts have an opinion? CCing Christoph...
>
> -dros
>
> fs/nfs/blocklayout/blocklayout.c | 2 +-
> fs/nfs/callback_proc.c | 2 +-
> fs/nfs/inode.c | 8 ++++----
> fs/nfs/pagelist.c | 11 ++++++++---
> fs/nfs/write.c | 17 ++++++++++++-----
> include/linux/nfs_fs.h | 4 ++--
> 6 files changed, 28 insertions(+), 16 deletions(-)
>
> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
> index b439edfda3ff..5cd836b415a9 100644
> --- a/fs/nfs/blocklayout/blocklayout.c
> +++ b/fs/nfs/blocklayout/blocklayout.c
> @@ -812,7 +812,7 @@ static u64 pnfs_num_cont_bytes(struct inode *inode, pgoff_t idx)
>
> /* Optimize common case that writes from 0 to end of file */
> end = DIV_ROUND_UP(i_size_read(inode), PAGE_CACHE_SIZE);
> - if (end != NFS_I(inode)->npages) {
> + if (end != NFS_I(inode)->nrequests) {
> rcu_read_lock();
> end = page_cache_next_hole(mapping, idx + 1, ULONG_MAX);
> rcu_read_unlock();
> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
> index 9c3a07b48655..f4be136f02de 100644
> --- a/fs/nfs/callback_proc.c
> +++ b/fs/nfs/callback_proc.c
> @@ -49,7 +49,7 @@ __be32 nfs4_callback_getattr(struct cb_getattrargs *args,
> goto out_iput;
> res->size = i_size_read(inode);
> res->change_attr = delegation->change_attr;
> - if (nfsi->npages != 0)
> + if (nfsi->nrequests != 0)
> res->change_attr++;
> res->ctime = inode->i_ctime;
> res->mtime = inode->i_mtime;
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 6388a59f2add..90367d7e31a0 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1149,7 +1149,7 @@ static unsigned long nfs_wcc_update_inode(struct inode *inode, struct nfs_fattr
> if ((fattr->valid & NFS_ATTR_FATTR_PRESIZE)
> && (fattr->valid & NFS_ATTR_FATTR_SIZE)
> && i_size_read(inode) == nfs_size_to_loff_t(fattr->pre_size)
> - && nfsi->npages == 0) {
> + && nfsi->nrequests == 0) {
> i_size_write(inode, nfs_size_to_loff_t(fattr->size));
> ret |= NFS_INO_INVALID_ATTR;
> }
> @@ -1192,7 +1192,7 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat
> if (fattr->valid & NFS_ATTR_FATTR_SIZE) {
> cur_size = i_size_read(inode);
> new_isize = nfs_size_to_loff_t(fattr->size);
> - if (cur_size != new_isize && nfsi->npages == 0)
> + if (cur_size != new_isize && nfsi->nrequests == 0)
> invalid |= NFS_INO_INVALID_ATTR|NFS_INO_REVAL_PAGECACHE;
> }
>
> @@ -1619,7 +1619,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> if (new_isize != cur_isize) {
> /* Do we perhaps have any outstanding writes, or has
> * the file grown beyond our last write? */
> - if ((nfsi->npages == 0) || new_isize > cur_isize) {
> + if ((nfsi->nrequests == 0) || new_isize > cur_isize) {
> i_size_write(inode, new_isize);
> invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
> invalid &= ~NFS_INO_REVAL_PAGECACHE;
> @@ -1784,7 +1784,7 @@ static void init_once(void *foo)
> INIT_LIST_HEAD(&nfsi->access_cache_entry_lru);
> INIT_LIST_HEAD(&nfsi->access_cache_inode_lru);
> INIT_LIST_HEAD(&nfsi->commit_info.list);
> - nfsi->npages = 0;
> + nfsi->nrequests = 0;
> nfsi->commit_info.ncommit = 0;
> atomic_set(&nfsi->commit_info.rpcs_out, 0);
> atomic_set(&nfsi->silly_count, 1);
> diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
> index fabb173f5f62..a2d8827f5975 100644
> --- a/fs/nfs/pagelist.c
> +++ b/fs/nfs/pagelist.c
> @@ -263,6 +263,7 @@ bool nfs_page_group_sync_on_bit(struct nfs_page *req, unsigned int bit)
> static inline void
> nfs_page_group_init(struct nfs_page *req, struct nfs_page *prev)
> {
> + struct inode *inode;
> WARN_ON_ONCE(prev == req);
>
> if (!prev) {
> @@ -281,12 +282,16 @@ nfs_page_group_init(struct nfs_page *req, struct nfs_page *prev)
> * nfs_page_group_destroy is called */
> kref_get(&req->wb_head->wb_kref);
>
> - /* grab extra ref if head request has extra ref from
> - * the write/commit path to handle handoff between write
> - * and commit lists */
> + /* grab extra ref and bump the request count if head request
> + * has extra ref from the write/commit path to handle handoff
> + * between write and commit lists. */
> if (test_bit(PG_INODE_REF, &prev->wb_head->wb_flags)) {
> + inode = page_file_mapping(req->wb_page)->host;
> set_bit(PG_INODE_REF, &req->wb_flags);
> kref_get(&req->wb_kref);
> + spin_lock(&inode->i_lock);
> + NFS_I(inode)->nrequests++;
> + spin_unlock(&inode->i_lock);
> }
> }
> }
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 0bd0a5d07326..c31d479da159 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -675,7 +675,8 @@ static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
> nfs_lock_request(req);
>
> spin_lock(&inode->i_lock);
> - if (!nfsi->npages && NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
> + if (!nfsi->nrequests &&
> + NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
> inode->i_version++;
> /*
> * Swap-space should not get truncated. Hence no need to plug the race
> @@ -686,9 +687,11 @@ static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
> SetPagePrivate(req->wb_page);
> set_page_private(req->wb_page, (unsigned long)req);
> }
> - nfsi->npages++;
> + nfsi->nrequests++;
> /* this a head request for a page group - mark it as having an
> - * extra reference so sub groups can follow suit */
> + * extra reference so sub groups can follow suit.
> + * This flag also informs pgio layer when to bump nrequests when
> + * adding subrequests. */
> WARN_ON(test_and_set_bit(PG_INODE_REF, &req->wb_flags));
> kref_get(&req->wb_kref);
> spin_unlock(&inode->i_lock);
> @@ -714,7 +717,11 @@ static void nfs_inode_remove_request(struct nfs_page *req)
> wake_up_page(head->wb_page, PG_private);
> clear_bit(PG_MAPPED, &head->wb_flags);
> }
> - nfsi->npages--;
> + nfsi->nrequests--;
> + spin_unlock(&inode->i_lock);
> + } else {
> + spin_lock(&inode->i_lock);
> + nfsi->nrequests--;
> spin_unlock(&inode->i_lock);
> }
>
> @@ -1750,7 +1757,7 @@ static int nfs_commit_unstable_pages(struct inode *inode, struct writeback_contr
> /* Don't commit yet if this is a non-blocking flush and there
> * are a lot of outstanding writes for this mapping.
> */
> - if (nfsi->commit_info.ncommit <= (nfsi->npages >> 1))
> + if (nfsi->commit_info.ncommit <= (nfsi->nrequests >> 1))
> goto out_mark_dirty;
>
> /* don't wait for the COMMIT response */
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index c72d1ad41ad4..6d627b92df53 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -163,7 +163,7 @@ struct nfs_inode {
> */
> __be32 cookieverf[2];
>
> - unsigned long npages;
> + unsigned long nrequests;
> struct nfs_mds_commit_info commit_info;
>
> /* Open contexts for shared mmap writes */
> @@ -520,7 +520,7 @@ extern void nfs_commit_free(struct nfs_commit_data *data);
> static inline int
> nfs_have_writebacks(struct inode *inode)
> {
> - return NFS_I(inode)->npages != 0;
> + return NFS_I(inode)->nrequests != 0;
> }
>
> /*
> --
> 1.9.3 (Apple Git-50)
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2014-11-13 11:02:46

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] NFS: fix subtle change in COMMIT behavior

On Thu, Nov 13, 2014 at 09:03:53AM +0800, Peng Tao wrote:
> > but should be contiguous (as enforced by nfs_flush_incompatible).
> >
> The idea is to get an estimation on the length of layout that client
> wants to use. So even if a single byte is used by client, client would
> want to ask for a layout that covers the whole page.

That code is very non-obvious and could use a few more comments.

I think the NFS_I(inode)->npages should be replaced with
inode->i_mapping->nrpages which counts the number of page cache pages.