2020-11-04 16:28:20

by Trond Myklebust

Subject: [PATCH v3 00/17] Readdir enhancements

From: Trond Myklebust <[email protected]>

The following patch series performs a number of cleanups on the readdir
code.
It also adds support for 1MB readdir RPC calls on-the-wire, and modifies
the caching code to ensure that we cache the entire contents of that
1MB call (instead of discarding the data that doesn't fit into a single
page).

v2: Fix the handling of the NFSv3/v4 directory verifier
v3: Optimise searching when the readdir cookies are seen to be ordered

Trond Myklebust (17):
NFS: Ensure contents of struct nfs_open_dir_context are consistent
NFS: Clean up readdir struct nfs_cache_array
NFS: Clean up nfs_readdir_page_filler()
NFS: Clean up directory array handling
NFS: Don't discard readdir results
NFS: Remove unnecessary kmap in nfs_readdir_xdr_to_array()
NFS: Replace kmap() with kmap_atomic() in nfs_readdir_search_array()
NFS: Simplify struct nfs_cache_array_entry
NFS: Support larger readdir buffers
NFS: More readdir cleanups
NFS: nfs_do_filldir() does not return a value
NFS: Reduce readdir stack usage
NFS: Cleanup to remove nfs_readdir_descriptor_t typedef
NFS: Allow the NFS generic code to pass in a verifier to readdir
NFS: Handle NFS4ERR_NOT_SAME and NFSERR_BADCOOKIE from readdir calls
NFS: Improve handling of directory verifiers
NFS: Optimisations for monotonically increasing readdir cookies

fs/nfs/client.c | 4 +-
fs/nfs/dir.c | 629 +++++++++++++++++++++++++---------------
fs/nfs/inode.c | 7 -
fs/nfs/internal.h | 6 -
fs/nfs/nfs3proc.c | 35 ++-
fs/nfs/nfs4proc.c | 40 +--
fs/nfs/proc.c | 18 +-
include/linux/nfs_fs.h | 9 +-
include/linux/nfs_xdr.h | 17 +-
9 files changed, 459 insertions(+), 306 deletions(-)

--
2.28.0


2020-11-04 16:29:07

by Trond Myklebust

Subject: [PATCH v3 01/17] NFS: Ensure contents of struct nfs_open_dir_context are consistent

From: Trond Myklebust <[email protected]>

Ensure that the contents of struct nfs_open_dir_context are consistent
by setting them under the file->f_lock from a private copy (that is
known to be consistent).

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/dir.c | 72 +++++++++++++++++++++++++++++++---------------------
1 file changed, 43 insertions(+), 29 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 4e011adaf967..67d8595cd6e5 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -144,20 +144,23 @@ struct nfs_cache_array {
struct nfs_cache_array_entry array[];
};

-typedef struct {
+typedef struct nfs_readdir_descriptor {
struct file *file;
struct page *page;
struct dir_context *ctx;
unsigned long page_index;
- u64 *dir_cookie;
+ u64 dir_cookie;
u64 last_cookie;
+ u64 dup_cookie;
loff_t current_index;
loff_t prev_index;

unsigned long dir_verifier;
unsigned long timestamp;
unsigned long gencount;
+ unsigned long attr_gencount;
unsigned int cache_entry_index;
+ signed char duped;
bool plus;
bool eof;
} nfs_readdir_descriptor_t;
@@ -273,7 +276,7 @@ int nfs_readdir_search_for_pos(struct nfs_cache_array *array, nfs_readdir_descri
}

index = (unsigned int)diff;
- *desc->dir_cookie = array->array[index].cookie;
+ desc->dir_cookie = array->array[index].cookie;
desc->cache_entry_index = index;
return 0;
out_eof:
@@ -298,33 +301,32 @@ int nfs_readdir_search_for_cookie(struct nfs_cache_array *array, nfs_readdir_des
int status = -EAGAIN;

for (i = 0; i < array->size; i++) {
- if (array->array[i].cookie == *desc->dir_cookie) {
+ if (array->array[i].cookie == desc->dir_cookie) {
struct nfs_inode *nfsi = NFS_I(file_inode(desc->file));
- struct nfs_open_dir_context *ctx = desc->file->private_data;

new_pos = desc->current_index + i;
- if (ctx->attr_gencount != nfsi->attr_gencount ||
+ if (desc->attr_gencount != nfsi->attr_gencount ||
!nfs_readdir_inode_mapping_valid(nfsi)) {
- ctx->duped = 0;
- ctx->attr_gencount = nfsi->attr_gencount;
+ desc->duped = 0;
+ desc->attr_gencount = nfsi->attr_gencount;
} else if (new_pos < desc->prev_index) {
- if (ctx->duped > 0
- && ctx->dup_cookie == *desc->dir_cookie) {
+ if (desc->duped > 0
+ && desc->dup_cookie == desc->dir_cookie) {
if (printk_ratelimit()) {
pr_notice("NFS: directory %pD2 contains a readdir loop."
"Please contact your server vendor. "
"The file: %.*s has duplicate cookie %llu\n",
desc->file, array->array[i].string.len,
- array->array[i].string.name, *desc->dir_cookie);
+ array->array[i].string.name, desc->dir_cookie);
}
status = -ELOOP;
goto out;
}
- ctx->dup_cookie = *desc->dir_cookie;
- ctx->duped = -1;
+ desc->dup_cookie = desc->dir_cookie;
+ desc->duped = -1;
}
if (nfs_readdir_use_cookie(desc->file))
- desc->ctx->pos = *desc->dir_cookie;
+ desc->ctx->pos = desc->dir_cookie;
else
desc->ctx->pos = new_pos;
desc->prev_index = new_pos;
@@ -334,7 +336,7 @@ int nfs_readdir_search_for_cookie(struct nfs_cache_array *array, nfs_readdir_des
}
if (array->eof_index >= 0) {
status = -EBADCOOKIE;
- if (*desc->dir_cookie == array->last_cookie)
+ if (desc->dir_cookie == array->last_cookie)
desc->eof = true;
}
out:
@@ -349,7 +351,7 @@ int nfs_readdir_search_array(nfs_readdir_descriptor_t *desc)

array = kmap(desc->page);

- if (*desc->dir_cookie == 0)
+ if (desc->dir_cookie == 0)
status = nfs_readdir_search_for_pos(array, desc);
else
status = nfs_readdir_search_for_cookie(array, desc);
@@ -801,7 +803,6 @@ int nfs_do_filldir(nfs_readdir_descriptor_t *desc)
int i = 0;
int res = 0;
struct nfs_cache_array *array = NULL;
- struct nfs_open_dir_context *ctx = file->private_data;

array = kmap(desc->page);
for (i = desc->cache_entry_index; i < array->size; i++) {
@@ -814,22 +815,22 @@ int nfs_do_filldir(nfs_readdir_descriptor_t *desc)
break;
}
if (i < (array->size-1))
- *desc->dir_cookie = array->array[i+1].cookie;
+ desc->dir_cookie = array->array[i+1].cookie;
else
- *desc->dir_cookie = array->last_cookie;
+ desc->dir_cookie = array->last_cookie;
if (nfs_readdir_use_cookie(file))
- desc->ctx->pos = *desc->dir_cookie;
+ desc->ctx->pos = desc->dir_cookie;
else
desc->ctx->pos++;
- if (ctx->duped != 0)
- ctx->duped = 1;
+ if (desc->duped != 0)
+ desc->duped = 1;
}
if (array->eof_index >= 0)
desc->eof = true;

kunmap(desc->page);
dfprintk(DIRCACHE, "NFS: nfs_do_filldir() filling ended @ cookie %Lu; returning = %d\n",
- (unsigned long long)*desc->dir_cookie, res);
+ (unsigned long long)desc->dir_cookie, res);
return res;
}

@@ -851,10 +852,9 @@ int uncached_readdir(nfs_readdir_descriptor_t *desc)
struct page *page = NULL;
int status;
struct inode *inode = file_inode(desc->file);
- struct nfs_open_dir_context *ctx = desc->file->private_data;

dfprintk(DIRCACHE, "NFS: uncached_readdir() searching for cookie %Lu\n",
- (unsigned long long)*desc->dir_cookie);
+ (unsigned long long)desc->dir_cookie);

page = alloc_page(GFP_HIGHUSER);
if (!page) {
@@ -863,9 +863,9 @@ int uncached_readdir(nfs_readdir_descriptor_t *desc)
}

desc->page_index = 0;
- desc->last_cookie = *desc->dir_cookie;
+ desc->last_cookie = desc->dir_cookie;
desc->page = page;
- ctx->duped = 0;
+ desc->duped = 0;

status = nfs_readdir_xdr_to_array(desc, page, inode);
if (status < 0)
@@ -894,7 +894,6 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
nfs_readdir_descriptor_t my_desc = {
.file = file,
.ctx = ctx,
- .dir_cookie = &dir_ctx->dir_cookie,
.plus = nfs_use_readdirplus(inode, ctx),
},
*desc = &my_desc;
@@ -915,13 +914,20 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
if (res < 0)
goto out;

+ spin_lock(&file->f_lock);
+ desc->dir_cookie = dir_ctx->dir_cookie;
+ desc->dup_cookie = dir_ctx->dup_cookie;
+ desc->duped = dir_ctx->duped;
+ desc->attr_gencount = dir_ctx->attr_gencount;
+ spin_unlock(&file->f_lock);
+
do {
res = readdir_search_pagecache(desc);

if (res == -EBADCOOKIE) {
res = 0;
/* This means either end of directory */
- if (*desc->dir_cookie && !desc->eof) {
+ if (desc->dir_cookie && !desc->eof) {
/* Or that the server has 'lost' a cookie */
res = uncached_readdir(desc);
if (res == 0)
@@ -946,6 +952,14 @@ static int nfs_readdir(struct file *file, struct dir_context *ctx)
if (res < 0)
break;
} while (!desc->eof);
+
+ spin_lock(&file->f_lock);
+ dir_ctx->dir_cookie = desc->dir_cookie;
+ dir_ctx->dup_cookie = desc->dup_cookie;
+ dir_ctx->duped = desc->duped;
+ dir_ctx->attr_gencount = desc->attr_gencount;
+ spin_unlock(&file->f_lock);
+
out:
if (res > 0)
res = 0;
--
2.28.0
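
The pattern this patch describes (copy the shared per-file state into a private descriptor under a lock, work on the copy, then write everything back under the same lock) can be illustrated in userspace C. This is a minimal sketch, not the kernel code: `struct dir_state`, `struct dir_snapshot`, and the helper names are hypothetical, and a C11 `atomic_flag` spinlock stands in for `file->f_lock`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared per-open-file state
 * (compare struct nfs_open_dir_context in the patch). */
struct dir_state {
	atomic_flag lock;		/* stand-in for file->f_lock */
	uint64_t dir_cookie;
	uint64_t dup_cookie;
	unsigned long attr_gencount;
	signed char duped;
};

/* Hypothetical private copy (compare the fields added to the
 * readdir descriptor in the patch). */
struct dir_snapshot {
	uint64_t dir_cookie;
	uint64_t dup_cookie;
	unsigned long attr_gencount;
	signed char duped;
};

static void state_lock(struct dir_state *s)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

static void state_unlock(struct dir_state *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Read every field under a single lock hold, so the copy is consistent. */
static struct dir_snapshot state_load(struct dir_state *s)
{
	struct dir_snapshot snap;

	state_lock(s);
	snap.dir_cookie = s->dir_cookie;
	snap.dup_cookie = s->dup_cookie;
	snap.attr_gencount = s->attr_gencount;
	snap.duped = s->duped;
	state_unlock(s);
	return snap;
}

/* Publish every field under a single lock hold, so readers never
 * observe a mix of old and new values. */
static void state_store(struct dir_state *s, const struct dir_snapshot *snap)
{
	assert(snap != NULL);

	state_lock(s);
	s->dir_cookie = snap->dir_cookie;
	s->dup_cookie = snap->dup_cookie;
	s->attr_gencount = snap->attr_gencount;
	s->duped = snap->duped;
	state_unlock(s);
}
```

A caller would take one `state_load()` at the start of the operation, modify only the private `struct dir_snapshot` while iterating, and call `state_store()` once at the end. The lock is held only for the two short copies, never across the long-running work in between.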

2020-11-07 12:51:30

by Benjamin Coddington

Subject: Re: [PATCH v3 00/17] Readdir enhancements

On 4 Nov 2020, at 11:16, [email protected] wrote:

> From: Trond Myklebust <[email protected]>
>
> The following patch series performs a number of cleanups on the readdir
> code.
> It also adds support for 1MB readdir RPC calls on-the-wire, and modifies
> the caching code to ensure that we cache the entire contents of that
> 1MB call (instead of discarding the data that doesn't fit into a single
> page).
>
> v2: Fix the handling of the NFSv3/v4 directory verifier
> v3: Optimise searching when the readdir cookies are seen to be ordered

Hi Trond, thanks for these.

I did a bit of testing with these on a 4-core/4G client listing 1.5M files
with READDIR. I compared v5.10-rc2 without/with this set.

+------+      v5.10-rc2      +--+ this v3 patch set  +
| run  |  time   | rpc calls |  |  time  | rpc calls |

nfsv3 with dtsize 262144:
+------+---------+-----------+--+--------+-----------+
| 1    | 81.583  | 14710     |  | 53.568 | 215       |
| 2    | 81.147  | 14710     |  | 50.781 | 215       |
| 3    | 81.61   | 14710     |  | 50.514 | 215       |
| 4    | 82.405  | 14710     |  | 50.746 | 215       |
| 5    | 82.066  | 14710     |  | 50.397 | 215       |
| 6    | 82.395  | 14710     |  | 50.892 | 215       |
| 7    | 81.657  | 14710     |  | 50.882 | 215       |
| 8    | 81.555  | 14710     |  | 50.981 | 215       |
| 9    | 81.421  | 14710     |  | 50.558 | 215       |
| 10   | 81.472  | 14710     |  | 50.588 | 215       |

nfsv3 with dtsize 1048576:
+------+---------+-----------+--+--------+-----------+
| 1    | 81.563  | 14710     |  | 52.692 | 61        |
| 2    | 82.123  | 14710     |  | 49.934 | 61        |
| 3    | 81.714  | 14710     |  | 50.158 | 61        |
| 4    | 81.707  | 14710     |  | 50.083 | 61        |
| 5    | 81.44   | 14710     |  | 50.045 | 61        |
| 6    | 81.685  | 14710     |  | 50.021 | 61        |
| 7    | 81.17   | 14710     |  | 50.131 | 61        |
| 8    | 81.366  | 14710     |  | 49.928 | 61        |
| 9    | 81.067  | 14710     |  | 50.081 | 61        |
| 10   | 81.524  | 14710     |  | 50.442 | 61        |

nfsv4 with dtsize 32768:
+------+---------+-----------+--+--------+-----------+
| 1    | 99.534  | 14712     |  | 79.461 | 331       |
| 2    | 98.998  | 14712     |  | 79.338 | 331       |
| 3    | 99.462  | 14712     |  | 81.101 | 331       |
| 4    | 99.891  | 14712     |  | 78.888 | 331       |
| 5    | 99.516  | 14712     |  | 81.147 | 331       |
| 6    | 98.649  | 14712     |  | 83.084 | 331       |
| 7    | 101.159 | 14712     |  | 80.461 | 331       |
| 8    | 100.402 | 14712     |  | 79.003 | 331       |
| 9    | 98.548  | 14712     |  | 80.619 | 331       |
| 10   | 97.456  | 14712     |  | 81.317 | 331       |

nfsv4 with dtsize 1048576:
+------+---------+-----------+--+--------+-----------+
| 1    | 100.357 | 14712     |  | 78.976 | 91        |
| 2    | 99.61   | 14712     |  | 79.328 | 91        |
| 3    | 101.095 | 14712     |  | 80.649 | 91        |
| 4    | 107.904 | 14712     |  | 78.285 | 91        |
| 5    | 103.665 | 14712     |  | 79.258 | 91        |
| 6    | 98.877  | 14712     |  | 78.817 | 91        |
| 7    | 99.567  | 14712     |  | 81.11  | 91        |
| 8    | 99.096  | 14712     |  | 80.296 | 91        |
| 9    | 100.124 | 14712     |  | 78.865 | 91        |
| 10   | 100.603 | 14712     |  | 79.143 | 91        |

These look great. Feel free to add either/both of my:
Reviewed-by: Benjamin Coddington <[email protected]>
Tested-by: Benjamin Coddington <[email protected]>

Ben

2020-11-07 14:23:49

by Trond Myklebust

Subject: Re: [PATCH v3 00/17] Readdir enhancements

On Sat, 2020-11-07 at 07:49 -0500, Benjamin Coddington wrote:
> These look great.  Feel free to add either/both of my:
> Reviewed-by: Benjamin Coddington <[email protected]>
> Tested-by: Benjamin Coddington <[email protected]>

Thanks again for testing! I missed this email before sending out v4,
but since that only adds 2 new patches to the series to deal with
Dave's very large changing directory case, I assume I can apply the above
tags to the rest anyway as they have not changed?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2020-11-08 11:06:37

by Benjamin Coddington

Subject: Re: [PATCH v3 00/17] Readdir enhancements

On 7 Nov 2020, at 9:23, Trond Myklebust wrote:

> Thanks again for testing! I missed this email before sending out v4,
> but since that only adds 2 new patches to the series to deal with
> Dave's v. large changing directory case, I assume I can apply the above
> tags to the rest anyway as they have not changed?

Yes, I'll check those out too.

Ben

2020-11-08 18:16:09

by Mkrtchyan, Tigran

Subject: Re: [PATCH v3 00/17] Readdir enhancements



----- Original Message -----
> From: "Benjamin Coddington" <[email protected]>
> To: "Trond Myklebust" <[email protected]>
> Cc: "linux-nfs" <[email protected]>
> Sent: Saturday, 7 November, 2020 13:49:31
> Subject: Re: [PATCH v3 00/17] Readdir enhancements

> Hi Trond, thanks for these.
>
> I did a bit of testing with these on a 4-core/4G client listing 1.5M files
> with READDIR. I compared v5.10-rc2 without/with this set.
>
> +------+      v5.10-rc2      +--+ this v3 patch set  +
> | run  |  time   | rpc calls |  |  time  | rpc calls |
>
> nfsv3 with dtsize 262144:
> +------+---------+-----------+--+--------+-----------+
> | 1    | 81.583  | 14710     |  | 53.568 | 215       |
> | 2    | 81.147  | 14710     |  | 50.781 | 215       |
> | 3    | 81.61   | 14710     |  | 50.514 | 215       |
> | 4    | 82.405  | 14710     |  | 50.746 | 215       |
> | 5    | 82.066  | 14710     |  | 50.397 | 215       |
> | 6    | 82.395  | 14710     |  | 50.892 | 215       |
> | 7    | 81.657  | 14710     |  | 50.882 | 215       |
> | 8    | 81.555  | 14710     |  | 50.981 | 215       |
> | 9    | 81.421  | 14710     |  | 50.558 | 215       |
> | 10   | 81.472  | 14710     |  | 50.588 | 215       |
>
> nfsv3 with dtsize 1048576:
> +------+---------+-----------+--+--------+-----------+
> | 1    | 81.563  | 14710     |  | 52.692 | 61        |
> | 2    | 82.123  | 14710     |  | 49.934 | 61        |
> | 3    | 81.714  | 14710     |  | 50.158 | 61        |
> | 4    | 81.707  | 14710     |  | 50.083 | 61        |
> | 5    | 81.44   | 14710     |  | 50.045 | 61        |
> | 6    | 81.685  | 14710     |  | 50.021 | 61        |
> | 7    | 81.17   | 14710     |  | 50.131 | 61        |
> | 8    | 81.366  | 14710     |  | 49.928 | 61        |
> | 9    | 81.067  | 14710     |  | 50.081 | 61        |
> | 10   | 81.524  | 14710     |  | 50.442 | 61        |
>
> nfsv4 with dtsize 32768:
> +------+---------+-----------+--+--------+-----------+
> | 1    | 99.534  | 14712     |  | 79.461 | 331       |
> | 2    | 98.998  | 14712     |  | 79.338 | 331       |
> | 3    | 99.462  | 14712     |  | 81.101 | 331       |
> | 4    | 99.891  | 14712     |  | 78.888 | 331       |
> | 5    | 99.516  | 14712     |  | 81.147 | 331       |
> | 6    | 98.649  | 14712     |  | 83.084 | 331       |
> | 7    | 101.159 | 14712     |  | 80.461 | 331       |
> | 8    | 100.402 | 14712     |  | 79.003 | 331       |
> | 9    | 98.548  | 14712     |  | 80.619 | 331       |
> | 10   | 97.456  | 14712     |  | 81.317 | 331       |
>
> nfsv4 with dtsize 1048576:
> +------+---------+-----------+--+--------+-----------+
> | 1    | 100.357 | 14712     |  | 78.976 | 91        |
> | 2    | 99.61   | 14712     |  | 79.328 | 91        |
> | 3    | 101.095 | 14712     |  | 80.649 | 91        |
> | 4    | 107.904 | 14712     |  | 78.285 | 91        |
> | 5    | 103.665 | 14712     |  | 79.258 | 91        |
> | 6    | 98.877  | 14712     |  | 78.817 | 91        |
> | 7    | 99.567  | 14712     |  | 81.11  | 91        |
> | 8    | 99.096  | 14712     |  | 80.296 | 91        |
> | 9    | 100.124 | 14712     |  | 78.865 | 91        |
> | 10   | 100.603 | 14712     |  | 79.143 | 91        |


Hi Ben, hi Trond,

Though the number of RPC calls with dtsize 1048576 is 3x lower than
with dtsize 32768, the time taken is almost the same. According to
your results, at some point (<= 32K) a bigger dtsize makes
no difference. As the original dtsize is 32K
(#define NFS_MAX_READDIR_PAGES 8), it looks like the
performance enhancement mostly comes from a change
not related to the buffer size.

On the other hand, the number of RPC calls with the v3 patch set drops
by 40x. Whatever Trond has changed there has a big impact!

Thanks a lot for your efforts,
Tigran.
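
The ratios discussed above can be checked directly against the posted numbers. The following is a minimal sketch; `call_ratio()` and `mean()` are hypothetical helper names introduced only for this illustration, and the inputs are the figures from Ben's tables.

```c
#include <assert.h>
#include <stddef.h>

/* How many times fewer RPC calls "after" needs compared to "before". */
static double call_ratio(unsigned int before, unsigned int after)
{
	assert(after != 0);
	return (double)before / (double)after;
}

/* Arithmetic mean of n timing samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}
```

With the nfsv4 figures, `call_ratio(14712, 331)` is about 44.4 (unpatched versus patched at dtsize 32768) and `call_ratio(331, 91)` is about 3.6 (patched, 32768 versus 1048576). For nfsv3 the corresponding values are about 68.4 (`14710 / 215`) and about 3.5 (`215 / 61`). Feeding the ten patched nfsv4 timings into `mean()` gives roughly 80.4 at dtsize 32768 and roughly 79.5 at dtsize 1048576, which matches the observation that the larger buffer changes the call count far more than the elapsed time.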
