2020-11-24 03:46:22

by Dai Ngo

Subject: [PATCH] NFSD: Fix 5 seconds delay when doing inter server copy

Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter-server copy operation suffers a
5-second delay regardless of the size of the copy. The delay comes
from nfs_set_open_stateid_locked() when the check in
nfs_stateid_is_sequential() fails because the seqid in both nfs4_state
and nfs4_stateid is 0.

Fix this by modifying the source server to return the stateid for a
COPY_NOTIFY request with seqid 1 instead of 0. This also conforms to
Section 4.8 of RFC 7862.

Here is the relevant paragraph from section 4.8 of RFC 7862:

A copy offload stateid's seqid MUST NOT be zero. In the context of a
copy offload operation, it is inappropriate to indicate "the most
recent copy offload operation" using a stateid with a seqid of zero
(see Section 8.2.2 of [RFC5661]). It is inappropriate because the
stateid refers to internal state in the server and there may be
several asynchronous COPY operations being performed in parallel on
the same file by the server. Therefore, a copy offload stateid with
a seqid of zero MUST be considered invalid.

Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <[email protected]>
---
fs/nfsd/nfs4state.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d7f27ed6b794..33ee1a6961e3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
 	refcount_set(&cps->cp_stateid.sc_count, 1);
 	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
 		goto out_free;
+	cps->cp_stateid.stid.si_generation = 1;
 	spin_lock(&nn->s2s_cp_lock);
 	list_add(&cps->cp_list, &p_stid->sc_cp_list);
 	spin_unlock(&nn->s2s_cp_lock);
--
2.9.5
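
Editorial sketch: the check described in the commit message above can be
modeled in user space roughly as follows. This is a simplified illustration,
not the kernel code: the model_* types and functions are hypothetical
stand-ins, seqids are kept in host byte order, and the assumption is that a
stateid is accepted only if it is the next one in sequence for a known open,
or carries seqid 1 when no matching open state exists yet.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for nfs4_stateid (not the real layout). */
struct model_stateid {
	uint32_t seqid;          /* host byte order, for simplicity */
	unsigned char other[12]; /* opaque part identifying the state */
};

/* Hypothetical, simplified stand-in for the client's nfs4_state. */
struct model_state {
	bool open;                         /* models "open state is set" */
	struct model_stateid open_stateid; /* last stateid the client accepted */
};

/* True when both stateids refer to the same server-side state. */
static bool model_match_other(const struct model_stateid *a,
			      const struct model_stateid *b)
{
	return memcmp(a->other, b->other, sizeof(a->other)) == 0;
}

/* True when 'next' directly follows 'cur'; a wrap from 0xffffffff to 1 counts. */
static bool model_is_next(const struct model_stateid *cur,
			  const struct model_stateid *next)
{
	return next->seqid == cur->seqid + 1u ||
	       (next->seqid == 1u && cur->seqid == 0xffffffffu);
}

/* Models the sequencing check: a false result is where the wait would occur. */
static bool model_stateid_is_sequential(const struct model_state *state,
					const struct model_stateid *stateid)
{
	if (state->open && model_match_other(stateid, &state->open_stateid))
		return model_is_next(&state->open_stateid, stateid);
	/* No matching open state yet: only seqid 1 is accepted immediately. */
	return stateid->seqid == 1u;
}
```

In this model, a zero-initialized state combined with a stateid whose seqid
is 0 is rejected, which corresponds to the case the commit message blames for
the delay; the same stateid with seqid 1 is accepted at once.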


2020-11-24 23:56:50

by J. Bruce Fields

Subject: Re: [PATCH] NFSD: Fix 5 seconds delay when doing inter server copy

On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
> CLOSE/OPEN_DOWNGRADE"), every inter-server copy operation suffers a
> 5-second delay regardless of the size of the copy. The delay comes
> from nfs_set_open_stateid_locked() when the check in
> nfs_stateid_is_sequential() fails because the seqid in both nfs4_state
> and nfs4_stateid is 0.
>
> Fix this by modifying the source server to return the stateid for a
> COPY_NOTIFY request with seqid 1 instead of 0. This also conforms to
> Section 4.8 of RFC 7862.
>
> Here is the relevant paragraph from section 4.8 of RFC 7862:
>
> A copy offload stateid's seqid MUST NOT be zero. In the context of a
> copy offload operation, it is inappropriate to indicate "the most
> recent copy offload operation" using a stateid with a seqid of zero
> (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
> stateid refers to internal state in the server and there may be
> several asynchronous COPY operations being performed in parallel on
> the same file by the server. Therefore, a copy offload stateid with
> a seqid of zero MUST be considered invalid.
>
> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
> Signed-off-by: Dai Ngo <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d7f27ed6b794..33ee1a6961e3 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
> refcount_set(&cps->cp_stateid.sc_count, 1);
> if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
> goto out_free;
> + cps->cp_stateid.stid.si_generation = 1;

This affects the stateid returned by COPY_NOTIFY, but not the one
returned by COPY. I think we want to add this to nfs4_init_cp_state()
and cover both.

--b.

> spin_lock(&nn->s2s_cp_lock);
> list_add(&cps->cp_list, &p_stid->sc_cp_list);
> spin_unlock(&nn->s2s_cp_lock);
> --
> 2.9.5
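
Editorial sketch: the suggestion above amounts to setting the initial seqid
in the one initializer that both kinds of copy stateid pass through, rather
than in a single caller. The following is a minimal user-space illustration
of that idea only. All model_* names are hypothetical and the struct layout
is an assumption; it is not the body of nfs4_init_cp_state().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for the two kinds of copy stateid discussed here. */
enum model_cp_type {
	MODEL_COPY_STID,       /* stateid returned by COPY */
	MODEL_COPYNOTIFY_STID  /* stateid returned by COPY_NOTIFY */
};

/* Hypothetical, simplified copy stateid. */
struct model_cp_stateid {
	uint32_t si_generation;   /* the seqid seen by the client */
	uint32_t id;              /* assumed per-stateid identifier */
	enum model_cp_type type;
};

/*
 * Shared initializer: because both stateid kinds are set up here,
 * starting the generation at 1 in this one place covers COPY and
 * COPY_NOTIFY alike.
 */
static bool model_init_cp_state(struct model_cp_stateid *stid,
				enum model_cp_type type, uint32_t id)
{
	if (stid == NULL)
		return false;
	stid->si_generation = 1;
	stid->id = id;
	stid->type = type;
	return true;
}

/* The rule quoted from RFC 7862: a copy offload stateid's seqid is never 0. */
static bool model_cp_stateid_valid(const struct model_cp_stateid *stid)
{
	return stid->si_generation != 0;
}
```

Setting the value in the shared initializer means a caller cannot forget it,
which is the practical difference from assigning it only on the COPY_NOTIFY
path as in the patch quoted above.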

2020-11-30 18:00:08

by Chuck Lever

Subject: Re: [PATCH] NFSD: Fix 5 seconds delay when doing inter server copy

Hello Dai -

> On Nov 24, 2020, at 3:49 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>> CLOSE/OPEN_DOWNGRADE"), every inter-server copy operation suffers a
>> 5-second delay regardless of the size of the copy. The delay comes
>> from nfs_set_open_stateid_locked() when the check in
>> nfs_stateid_is_sequential() fails because the seqid in both nfs4_state
>> and nfs4_stateid is 0.
>>
>> Fix this by modifying the source server to return the stateid for a
>> COPY_NOTIFY request with seqid 1 instead of 0. This also conforms to
>> Section 4.8 of RFC 7862.
>>
>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>
>> A copy offload stateid's seqid MUST NOT be zero. In the context of a
>> copy offload operation, it is inappropriate to indicate "the most
>> recent copy offload operation" using a stateid with a seqid of zero
>> (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>> stateid refers to internal state in the server and there may be
>> several asynchronous COPY operations being performed in parallel on
>> the same file by the server. Therefore, a copy offload stateid with
>> a seqid of zero MUST be considered invalid.
>>
>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>> Signed-off-by: Dai Ngo <[email protected]>
>> ---
>> fs/nfsd/nfs4state.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d7f27ed6b794..33ee1a6961e3 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>> refcount_set(&cps->cp_stateid.sc_count, 1);
>> if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>> goto out_free;
>> + cps->cp_stateid.stid.si_generation = 1;
>
> This affects the stateid returned by COPY_NOTIFY, but not the one
> returned by COPY. I think we want to add this to nfs4_init_cp_state()
> and cover both.

Since time is creeping on towards the next merge window, I assume
this particular fix needs to go there, but I don't see the final
version of it (with Bruce's suggested fix) on the list. Did I miss
it?


>> spin_lock(&nn->s2s_cp_lock);
>> list_add(&cps->cp_list, &p_stid->sc_cp_list);
>> spin_unlock(&nn->s2s_cp_lock);
>> --
>> 2.9.5

--
Chuck Lever
[email protected]



2020-11-30 18:50:21

by Dai Ngo

Subject: Re: [PATCH] NFSD: Fix 5 seconds delay when doing inter server copy

Hi Chuck,

Sorry for the delay. I will update the patch, test it, and resubmit
it by the end of today.

Thanks,
-Dai

On 11/30/20 9:57 AM, Chuck Lever wrote:
> Hello Dai -
>
>> On Nov 24, 2020, at 3:49 PM, J. Bruce Fields <[email protected]> wrote:
>>
>> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>>> CLOSE/OPEN_DOWNGRADE"), every inter-server copy operation suffers a
>>> 5-second delay regardless of the size of the copy. The delay comes
>>> from nfs_set_open_stateid_locked() when the check in
>>> nfs_stateid_is_sequential() fails because the seqid in both nfs4_state
>>> and nfs4_stateid is 0.
>>>
>>> Fix this by modifying the source server to return the stateid for a
>>> COPY_NOTIFY request with seqid 1 instead of 0. This also conforms to
>>> Section 4.8 of RFC 7862.
>>>
>>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>>
>>> A copy offload stateid's seqid MUST NOT be zero. In the context of a
>>> copy offload operation, it is inappropriate to indicate "the most
>>> recent copy offload operation" using a stateid with a seqid of zero
>>> (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>>> stateid refers to internal state in the server and there may be
>>> several asynchronous COPY operations being performed in parallel on
>>> the same file by the server. Therefore, a copy offload stateid with
>>> a seqid of zero MUST be considered invalid.
>>>
>>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>>> Signed-off-by: Dai Ngo <[email protected]>
>>> ---
>>> fs/nfsd/nfs4state.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index d7f27ed6b794..33ee1a6961e3 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>>> refcount_set(&cps->cp_stateid.sc_count, 1);
>>> if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>>> goto out_free;
>>> + cps->cp_stateid.stid.si_generation = 1;
>> This affects the stateid returned by COPY_NOTIFY, but not the one
>> returned by COPY. I think we want to add this to nfs4_init_cp_state()
>> and cover both.
> Since time is creeping on towards the next merge window, I assume
> this particular fix needs to go there, but I don't see the final
> version of it (with Bruce's suggested fix) on the list. Did I miss
> it?
>
>
>>> spin_lock(&nn->s2s_cp_lock);
>>> list_add(&cps->cp_list, &p_stid->sc_cp_list);
>>> spin_unlock(&nn->s2s_cp_lock);
>>> --
>>> 2.9.5
> --
> Chuck Lever
> [email protected]
>
>
>

2020-11-30 21:32:47

by Dai Ngo

Subject: Re: [PATCH] NFSD: Fix 5 seconds delay when doing inter server copy

On 11/24/20 12:49 PM, J. Bruce Fields wrote:
> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>> CLOSE/OPEN_DOWNGRADE"), every inter-server copy operation suffers a
>> 5-second delay regardless of the size of the copy. The delay comes
>> from nfs_set_open_stateid_locked() when the check in
>> nfs_stateid_is_sequential() fails because the seqid in both nfs4_state
>> and nfs4_stateid is 0.
>>
>> Fix this by modifying the source server to return the stateid for a
>> COPY_NOTIFY request with seqid 1 instead of 0. This also conforms to
>> Section 4.8 of RFC 7862.
>>
>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>
>> A copy offload stateid's seqid MUST NOT be zero. In the context of a
>> copy offload operation, it is inappropriate to indicate "the most
>> recent copy offload operation" using a stateid with a seqid of zero
>> (see Section 8.2.2 of [RFC5661]). It is inappropriate because the
>> stateid refers to internal state in the server and there may be
>> several asynchronous COPY operations being performed in parallel on
>> the same file by the server. Therefore, a copy offload stateid with
>> a seqid of zero MUST be considered invalid.
>>
>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>> Signed-off-by: Dai Ngo <[email protected]>
>> ---
>> fs/nfsd/nfs4state.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d7f27ed6b794..33ee1a6961e3 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>> refcount_set(&cps->cp_stateid.sc_count, 1);
>> if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>> goto out_free;
>> + cps->cp_stateid.stid.si_generation = 1;
> This affects the stateid returned by COPY_NOTIFY, but not the one
> returned by COPY. I think we want to add this to nfs4_init_cp_state()
> and cover both.

Hi Bruce, thank you for your suggestion. Updated patch tested and submitted.

-Dai

P.S. Sorry for the delay; I was on leave for the last few days.

>
> --b.
>
>> spin_lock(&nn->s2s_cp_lock);
>> list_add(&cps->cp_list, &p_stid->sc_cp_list);
>> spin_unlock(&nn->s2s_cp_lock);
>> --
>> 2.9.5