2023-09-15 06:24:41

by Benjamin Coddington

Subject: Re: [PATCH v2] NFSv4: fairly test all delegations on a SEQ4_ revocation

On 24 Aug 2023, at 14:52, Benjamin Coddington wrote:

> When the client is required to use TEST_STATEID to discover which
> delegation(s) have been revoked, it may continually test delegations at the
> head of the list if the server continues to be unsatisfied and send
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED. For a large number of delegations
> this behavior is prone to live-lock because the client may never be able to
> test and free revoked state at the end of the list since the
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag delegations at
> the head of the list to be tested. This problem is further exacerbated by
> the state manager's willingness to be scheduled out on a busy system while
> testing the list of delegations.
>
> Keep a generation counter for each attempt to test all delegations, and
> skip delegations that have already been tested in the current pass.
>
> Signed-off-by: Benjamin Coddington <[email protected]>
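
For anyone following along, the generation-counter idea in the commit message
above can be sketched roughly as below. This is a simplified user-space model,
not the patch itself: struct deleg, struct client, start_test_pass() and
run_test_pass() are hypothetical names, the budget argument is only a stand-in
for the state manager being scheduled out mid-walk, and setting the freed flag
stands in for the TEST_STATEID/FREE_STATEID exchange.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a delegation; not the kernel's structure. */
struct deleg {
    bool revoked;            /* server has revoked this stateid */
    bool freed;              /* client has tested and freed it */
    unsigned long test_gen;  /* generation in which it was last tested */
};

/* Hypothetical stand-in for the per-client state. */
struct client {
    struct deleg *delegs;    /* delegation list, head first */
    size_t ndelegs;
    unsigned long test_gen;  /* generation of the current pass */
};

/* Begin a new attempt to test all delegations. */
static void start_test_pass(struct client *clp)
{
    assert(clp != NULL);
    clp->test_gen++;
}

/*
 * Walk the list from the head and test at most `budget` delegations,
 * skipping any already tested in the current generation. Returns the
 * number of delegations tested by this call.
 */
static size_t run_test_pass(struct client *clp, size_t budget)
{
    size_t tested = 0;

    assert(clp != NULL);
    for (size_t i = 0; i < clp->ndelegs && tested < budget; i++) {
        struct deleg *d = &clp->delegs[i];

        if (d->freed || d->test_gen == clp->test_gen)
            continue;               /* already handled in this pass */
        d->test_gen = clp->test_gen;
        if (d->revoked)
            d->freed = true;        /* stand-in for test + free */
        tested++;
    }
    return tested;
}
```

In this model, a walk that restarts from the head after an interruption skips
everything already stamped with the current generation, so repeated calls make
progress toward the tail instead of re-testing the head entries.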

This one went through the wringer in an environment that saw multiple clients
live-locking, and it resolves the problem for them. They asked me to add:

Tested-by: Torkil Svensgaard <[email protected]>
Tested-by: Ruben Vestergaard <[email protected]>

Ben


2023-09-19 02:57:12

by Benjamin Coddington

Subject: Re: [PATCH v2] NFSv4: fairly test all delegations on a SEQ4_ revocation

On 14 Sep 2023, at 9:18, Benjamin Coddington wrote:

> On 24 Aug 2023, at 14:52, Benjamin Coddington wrote:
>
>> When the client is required to use TEST_STATEID to discover which
>> delegation(s) have been revoked, it may continually test delegations at the
>> head of the list if the server continues to be unsatisfied and send
>> SEQ4_STATUS_RECALLABLE_STATE_REVOKED. For a large number of delegations
>> this behavior is prone to live-lock because the client may never be able to
>> test and free revoked state at the end of the list since the
>> SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag delegations at
>> the head of the list to be tested. This problem is further exacerbated by
>> the state manager's willingness to be scheduled out on a busy system while
>> testing the list of delegations.
>>
>> Keep a generation counter for each attempt to test all delegations, and
>> skip delegations that have already been tested in the current pass.
>>
>> Signed-off-by: Benjamin Coddington <[email protected]>
>
> This one went through the wringer in an environment that saw multiple clients
> live-locking, and it resolves the problem for them. They asked me to add:
>
> Tested-by: Torkil Svensgaard <[email protected]>
> Tested-by: Ruben Vestergaard <[email protected]>
>
> Ben

Did this one get rejected with a reason? This fix could also be implemented
with a flag (as I mentioned in a reply on v2).

Ben