There is a problem in the code that was easily reproducible prior to
commit b757144fd77cf5512f5b60179ba5ca8dcc5184b4 ("NFSv4: Be less
aggressive about returning delegations"). The same code is used during
delegation return, and the problem is still reproducible there, though
it now takes a carefully coordinated setup to trigger just the right
sequence of actions.
The problem lies in the call to nfs4_lock_delegation_recall(). In a
nutshell, when the LOCK op gets an error such as ERR_GRACE, the error
goes unhandled, which leaves the client in an incorrect state: it
assumes it has 'successfully' reclaimed a lock that was acquired
locally while holding a delegation.
I believe the problem stems from the fact that
nfs4_lock_delegation_recall(), unlike the normal LOCK path in
nfs4_proc_getlk(), lacks the do { handle_error(do_proc()); } while (err)
retry structure.
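For reference, the structure I mean is roughly the following
(paraphrasing nfs4_proc_getlk() from memory, with details such as
tracepoints trimmed):

static int nfs4_proc_getlk(struct nfs4_state *state, int cmd,
		struct file_lock *request)
{
	struct nfs_server *server = NFS_SERVER(state->inode);
	struct nfs4_exception exception = { };
	int err;

	do {
		err = _nfs4_proc_getlk(state, cmd, request);
		/* nfs4_handle_exception() turns errors such as
		 * NFS4ERR_GRACE and NFS4ERR_DELAY into a delayed retry. */
		err = nfs4_handle_exception(server, err, &exception);
	} while (exception.retry);
	return err;
}

nfs4_lock_delegation_recall() calls _nfs4_do_setlk() exactly once and
never loops.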
I believe the problem should be fixed by something like the following:
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6ca0c8e..78d088c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5936,11 +5936,12 @@ int nfs4_lock_delegation_recall(struct file_lock *fl, struct nfs4_state *state,
 	struct nfs_server *server = NFS_SERVER(state->inode);
 	int err;
 
-	err = nfs4_set_lock_state(state, fl);
-	if (err != 0)
-		return err;
-	err = _nfs4_do_setlk(state, F_SETLK, fl, NFS_LOCK_NEW);
-	return nfs4_handle_delegation_recall_error(server, state, stateid, err);
+	do {
+		err = nfs4_handle_delegation_recall_error(server, state,
+				stateid, _nfs4_do_setlk(state, F_SETLK, fl,
+				NFS_LOCK_NEW));
+	} while (err == -EAGAIN);
+	return err;
 }
 
 struct nfs_release_lockowner_data {
Is this an acceptable solution?
On Wed, Sep 24, 2014 at 6:45 PM, Trond Myklebust
<[email protected]> wrote:
> There is only 1 state manager thread allowed per nfs_client (i.e. per
> server) and so we want to avoid having it busy wait in any one state
> handler. Doing so would basically mean that all other state recovery
> on that nfs_client is on hold; i.e. we could not deal with exceptions
> like ADMIN_REVOKED, CB_PATH_DOWN, etc until the busy wait is over.
> This is why that code has been designed to fall all the way back to
> nfs4_state_manager() in the event of any error/exception.
Ok, thanks. That makes sense, and it makes things complicated. I'm sure
you'll beat me to figuring out why the error is not handled, but I'll
keep trying.
Hi Olga,
On Wed, Sep 24, 2014 at 1:47 PM, Olga Kornievskaia <[email protected]> wrote:
> Is this an acceptable solution?
>
In the current code, I expect the EAGAIN from
nfs4_handle_delegation_recall_error() to be propagated back to
nfs_end_delegation_return(). It should then result in the client
waiting for state recovery to complete followed by a loop back to
nfs_delegation_claim_opens().
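For reference, the relevant loop in nfs_end_delegation_return() looks
roughly like this (quoting from memory, abridged):

	do {
		err = nfs_delegation_claim_opens(inode, &delegation->stateid);
		if (!issync || err != -EAGAIN)
			break;
		/* Guard against state recovery: wait for it to complete,
		 * then retry the claim. */
		err = nfs4_wait_clnt_recover(clp);
	} while (err == 0);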
Could you explain why that isn't working for your case?
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
Hi Olga,
On Wed, Sep 24, 2014 at 2:20 PM, Olga Kornievskaia <[email protected]> wrote:
> Hi Trond,
>
> nfs_delegation_claim_opens() return EAGAIN to nfs_end_delegation_return().
> issync is always 0 (as its called by the
> nfs_client_return_marked_delegations) and it breaks out of the loop...
> as a result the error just doesn't get handled.
Ah. OK, so this is being called from
nfs_client_return_marked_delegations. That makes sense.
So for that case, I'd expect the call to return to the loop in
nfs4_state_manager(), and then to retry through that after doing
whatever is needed to recover.
Essentially, we should be setting NFS4CLNT_DELEGRETURN again, and then
bouncing back into nfs_client_return_marked_delegations (after all the
recovery work has been done).
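Something along these lines, purely as an illustration (not a tested
patch, and the exact placement in nfs_end_delegation_return() would
need more thought):

	if (err == -EAGAIN) {
		/* Re-arm the delegation return so that the state manager
		 * comes back to it once recovery has run. */
		set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
		nfs4_schedule_state_manager(clp);
	}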
> Would it be useful to provide you with a pcap trace?
>
> The easiest way to trigger it is to enable the return of inactive
> delegation and have simple get a delegation and local lock, sleep and
> reboot the server and see the lock recovery error with err_grace.
Thanks for the reproducer! I'll see if I can get round to testing. ;-p
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On Wed, Sep 24, 2014 at 6:31 PM, Olga Kornievskaia <[email protected]> wrote:
> Yes I don't fully understand what it should be. It never does anything
> about recovering from the lock error and simply returns the
> delegation. Ok I don't know if it means anything to you, but the 2nd
> time around (when it returns the delegation even though it hasn't
> recovered the lock), it never goes into the
> nfs4_open_delegation_recall() because stateid condition doesn't hold
> true.
>
> If it's not too much trouble, could you explain why lock error
> shouldn't be handled as I suggested instead of resending the open with
> claim_cur over again. As I understand in your case, it'll be a series
> of successful open with claim_cur paired with a failed lock with
> err_grace. In my case, it'll be one open with claim_cur and a number
> of lock with err_grace.
There is only 1 state manager thread allowed per nfs_client (i.e. per
server) and so we want to avoid having it busy wait in any one state
handler. Doing so would basically mean that all other state recovery
on that nfs_client is on hold; i.e. we could not deal with exceptions
like ADMIN_REVOKED, CB_PATH_DOWN, etc until the busy wait is over.
This is why that code has been designed to fall all the way back to
nfs4_state_manager() in the event of any error/exception.
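To illustrate, delegation return is just one of many recovery tasks in
the nfs4_state_manager() loop, roughly (abridged from memory):

	if (test_and_clear_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state)) {
		nfs_client_return_marked_delegations(clp);
		continue;
	}

so a handler that busy waits starves all of the other recovery work.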
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On Wed, Sep 24, 2014 at 3:57 PM, Trond Myklebust
<[email protected]> wrote:
> Ah. OK, so this is being called from
> nfs_client_return_marked_delegations. That makes sense.
>
> So for that case, I'd expect the call to return to the loop in
> nfs4_state_manager(), and then to retry through that after doing
> whatever is needed to recover.
> Essentially, we should be setting NFS4CLNT_DELEGRETURN again, and then
> bouncing back into nfs_client_return_marked_delegations (after all the
> recovery work has been done).
Yes, I don't fully understand what it should be doing. It never does
anything about recovering from the lock error and simply returns the
delegation. I don't know if it means anything to you, but the second
time around (when it returns the delegation even though it hasn't
recovered the lock), it never goes into nfs4_open_delegation_recall()
because the stateid condition doesn't hold true.

If it's not too much trouble, could you explain why the lock error
shouldn't be handled as I suggested, instead of resending the open with
claim_cur over again? As I understand it, in your case it would be a
series of successful opens with claim_cur, each paired with a failed
lock returning err_grace; in my case, it would be one open with
claim_cur and a number of locks returning err_grace.
Hi Trond,
nfs_delegation_claim_opens() returns EAGAIN to nfs_end_delegation_return().
issync is always 0 (as it's called from
nfs_client_return_marked_delegations), so it breaks out of the loop; as
a result, the error just doesn't get handled.

Would it be useful to provide you with a pcap trace?

The easiest way to trigger it is to enable the return of inactive
delegations, have a simple program get a delegation and a local lock
and then sleep, reboot the server while it sleeps, and watch the lock
recovery fail with err_grace.
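The userspace side of that is essentially the following (the path is
made up, and the client on my end is patched to return inactive
delegations):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	struct flock fl = { .l_type = F_RDLCK, .l_whence = SEEK_SET };
	/* Opening for read over NFSv4 should get us a read delegation,
	 * assuming the server grants one. */
	int fd = open("/mnt/nfs/testfile", O_RDONLY);

	if (fd < 0)
		return 1;
	/* With the delegation held, this lock is satisfied locally and
	 * never sent to the server. */
	if (fcntl(fd, F_SETLK, &fl) < 0)
		return 1;
	sleep(120);	/* reboot the NFS server during this window */
	close(fd);
	return 0;
}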
On Wed, Sep 24, 2014 at 7:20 PM, Olga Kornievskaia <[email protected]> wrote:
> Ok, thanks. That makes sense, and it makes things complicated. I'm sure
> you'll beat me to figuring out why the error is not handled, but I'll
> keep trying.
>
I believe the cause of the improper handling of LOCK errors during
delegation return is the matching-stateid check in
nfs_delegation_claim_opens(). After the first attempt to return the
delegation -- sending an open with delegate_cur and then the lock,
which errors out -- the code properly errors out, aborts the delegation
return, and will try again. However, by then the stateid has been
updated to the open_stateid received in that open. The check for a
match between the stateid and the delegation stateid fails, so it skips
this open, simply returns the delegation, and moves on.
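The check I'm referring to is this one (abridged from
nfs_delegation_claim_opens(), quoting from memory):

	list_for_each_entry(ctx, &nfsi->open_files, list) {
		state = ctx->state;
		if (state == NULL)
			continue;
		if (!test_bit(NFS_DELEGATED_STATE, &state->flags))
			continue;
		/* On the second pass, state->stateid holds the open_stateid
		 * from the first delegate_cur open rather than the delegation
		 * stateid, so this no longer matches and the open is
		 * skipped: */
		if (!nfs4_stateid_match(&state->stateid, stateid))
			continue;
		/* ... recall the open and then the locks ... */
	}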
I'd like an opinion on either
(1) removing the check for matching stateids as a solution, or
(2) whether the stateid should be updated back to the delegation
stateid (which I think would be wrong)?