2022-02-24 01:20:38

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH v1] NFSv4.1 provide mount option to toggle trunking discovery

I have forgotten to cc Kurt Garloff to the post.

On Wed, Feb 23, 2022 at 12:40 PM Olga Kornievskaia
<[email protected]> wrote:
>
> From: Olga Kornievskaia <[email protected]>
>
> Introduce a new mount option -- trunkdiscovery,notrunkdiscovery -- to
> toggle whether or not the client will engage in actively discovery
> of trunking locations.
>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> fs/nfs/client.c | 3 ++-
> fs/nfs/fs_context.c | 8 ++++++++
> include/linux/nfs_fs_sb.h | 1 +
> 3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/client.c b/fs/nfs/client.c
> index d1f34229e11a..84c080ddfd01 100644
> --- a/fs/nfs/client.c
> +++ b/fs/nfs/client.c
> @@ -857,7 +857,8 @@ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, str
> }
>
> if (clp->rpc_ops->discover_trunking != NULL &&
> - (server->caps & NFS_CAP_FS_LOCATIONS)) {
> + (server->caps & NFS_CAP_FS_LOCATIONS &&
> + !(server->flags & NFS_MOUNT_NOTRUNK_DISCOVERY))) {
> error = clp->rpc_ops->discover_trunking(server, mntfh);
> if (error < 0)
> return error;
> diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
> index ea17fa1f31ec..ad1448a63aa0 100644
> --- a/fs/nfs/fs_context.c
> +++ b/fs/nfs/fs_context.c
> @@ -80,6 +80,7 @@ enum nfs_param {
> Opt_source,
> Opt_tcp,
> Opt_timeo,
> + Opt_trunkdiscovery,
> Opt_udp,
> Opt_v,
> Opt_vers,
> @@ -180,6 +181,7 @@ static const struct fs_parameter_spec nfs_fs_parameters[] = {
> fsparam_string("source", Opt_source),
> fsparam_flag ("tcp", Opt_tcp),
> fsparam_u32 ("timeo", Opt_timeo),
> + fsparam_flag_no("trunkdiscovery", Opt_trunkdiscovery),
> fsparam_flag ("udp", Opt_udp),
> fsparam_flag ("v2", Opt_v),
> fsparam_flag ("v3", Opt_v),
> @@ -529,6 +531,12 @@ static int nfs_fs_context_parse_param(struct fs_context *fc,
> else
> ctx->flags &= ~NFS_MOUNT_NOCTO;
> break;
> + case Opt_trunkdiscovery:
> + if (result.negated)
> + ctx->flags |= NFS_MOUNT_NOTRUNK_DISCOVERY;
> + else
> + ctx->flags &= ~NFS_MOUNT_NOTRUNK_DISCOVERY;
> + break;
> case Opt_ac:
> if (result.negated)
> ctx->flags |= NFS_MOUNT_NOAC;
> diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> index ca0959e51e81..d0920d7f5f9e 100644
> --- a/include/linux/nfs_fs_sb.h
> +++ b/include/linux/nfs_fs_sb.h
> @@ -151,6 +151,7 @@ struct nfs_server {
> #define NFS_MOUNT_SOFTREVAL 0x800000
> #define NFS_MOUNT_WRITE_EAGER 0x01000000
> #define NFS_MOUNT_WRITE_WAIT 0x02000000
> +#define NFS_MOUNT_NOTRUNK_DISCOVERY 0x04000000
>
> unsigned int fattr_valid; /* Valid attributes */
> unsigned int caps; /* server capabilities */
> --
> 2.27.0
>


2022-02-24 15:35:54

by Kurt Garloff

[permalink] [raw]
Subject: Re: [PATCH v1] NFSv4.1 provide mount option to toggle trunking discovery

Hi Olga,

thanks!

I can confirm that after applying your patch (against 5.15.24),
passing the mount option notrunkdiscovery to NFS mount does
indeed make the NFS mounts from the Qnap server (with knfsd from
3.4.6 kernel) work again.

Sorry to say, but I don't think this is the final solution yet:
* The option notrunkdiscovery is not tolerated by older kernels,
  so there is no way to pass a setting that works with <= 5.15.23
  and the kernels with your latest patch. This is a problem for
  me who keeps automount maps in LDAP.
* From a user perspective, the 5.15.24+/5.6.10+/5.17-rc behavior
  is still a regression, as the Qnap NFS mounts all break. With
  your patch, there is a way to recover, but it needs to be well
  documented and then be found by an admin who then needs to do
  the right change. Not acceptable for -stable IMVHO and according
  to Greg's explanations not for mainline either.

I see a number of possibilities to resolve this:
(0) We pretend it's not a problem that's serious enough to take
    action (and ensure that we do document this new option well).
(1) We revert the patch that does FS_LOCATIONS discovery.
    Assuming that FS_LOCATIONS does provide a useful feature, this
    would not be our preferred solution, I guess ...
(2) We prevent NFS v4.1 servers to use FS_LOCATIONS (like my patch
    implements) and additionally allow for the opt-out with
    notrunkdiscovery mount option. This fixes the known regression
    with Qnap knfsd (without requiring user intervention) and still
    allows for FS_LOCATIONS to be useful with NFSv4.2 servers that
    support this. The disadvantage is that we won't use the feature
    on NFSv4.1 servers which might support this feature perfectly
    (and there's no opt-in possibility). And the risk is that there
    might be NFSv4.2 servers out there that also misreport
    FS_LOCATIONS support and still need manual intervention (which
    at least is possible with notrunkdiscovery).
(3) We make this feature an opt-in thing and require users to
    pass trunkdiscovery to take advantage of the feature.
    This has zero risk of breakage (unless we screw up the patch),
    but always requires user intervention to take advantage of
    the FS_LOCATIONS feature.
(4) Identify a way to recover from the mount with FS_LOCATIONS
    against the broken server, so instead of hanging we do just
    turn this off if we find it not to work. Disadavantage is that
    this adds complexity and might stall the mounting for a while
    until the recovery worked. The complexity bears the risk for
    us screwing up.

I personally find solutions 2 -- 4 acceptable.

If the experts see (4) as simple enough, it might be worth a try.
Otherwise (2) or (3) would seem the way to go from my perspective.

Thanks!

--
Kurt Garloff <[email protected]>
Cologne, Germany

2022-02-26 02:20:35

by Kurt Garloff

[permalink] [raw]
Subject: Re: [PATCH v1] NFSv4.1 provide mount option to toggle trunking discovery

Hi,

Am 24. Februar 2022 14:42:41 MEZ schrieb Kurt Garloff <[email protected]>:
>Hi Olga,
>[...]
>
>I see a number of possibilities to resolve this:
>(0) We pretend it's not a problem that's serious enough to take
>     action (and ensure that we do document this new option well).
>(1) We revert the patch that does FS_LOCATIONS discovery.
>     Assuming that FS_LOCATIONS does provide a useful feature, this
>     would not be our preferred solution, I guess ...
>(2) We prevent NFS v4.1 servers to use FS_LOCATIONS (like my patch
>     implements) and additionally allow for the opt-out with
>     notrunkdiscovery mount option. This fixes the known regression
>     with Qnap knfsd (without requiring user intervention) and still
>     allows for FS_LOCATIONS to be useful with NFSv4.2 servers that
>     support this. The disadvantage is that we won't use the feature
>     on NFSv4.1 servers which might support this feature perfectly
>     (and there's no opt-in possibility). And the risk is that there
>     might be NFSv4.2 servers out there that also misreport
>     FS_LOCATIONS support and still need manual intervention (which
>     at least is possible with notrunkdiscovery).
>(3) We make this feature an opt-in thing and require users to
>     pass trunkdiscovery to take advantage of the feature.
>     This has zero risk of breakage (unless we screw up the patch),
>     but always requires user intervention to take advantage of
>     the FS_LOCATIONS feature.
>(4) Identify a way to recover from the mount with FS_LOCATIONS
>     against the broken server, so instead of hanging we do just
>     turn this off if we find it not to work. Disadavantage is that
>     this adds complexity and might stall the mounting for a while
>     until the recovery worked. The complexity bears the risk for
>     us screwing up.
>
>I personally find solutions 2 -- 4 acceptable.
>
>If the experts see (4) as simple enough, it might be worth a try.
>Otherwise (2) or (3) would seem the way to go from my perspective.

Any thought ls?

Thanks,

--
Kurt Garloff <[email protected]>, Cologne, Germany
(Sent from Mobile with K9.)

2022-03-17 05:06:29

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH v1] NFSv4.1 provide mount option to toggle trunking discovery

On 25 Feb 2022, at 17:48, Kurt Garloff wrote:

> Hi,
>
> Am 24. Februar 2022 14:42:41 MEZ schrieb Kurt Garloff <[email protected]>:
>> Hi Olga,
>> [...]
>>
>> I see a number of possibilities to resolve this:
>> (0) We pretend it's not a problem that's serious enough to take
>>     action (and ensure that we do document this new option well).
>> (1) We revert the patch that does FS_LOCATIONS discovery.
>>     Assuming that FS_LOCATIONS does provide a useful feature, this
>>     would not be our preferred solution, I guess ...
>> (2) We prevent NFS v4.1 servers to use FS_LOCATIONS (like my patch
>>     implements) and additionally allow for the opt-out with
>>     notrunkdiscovery mount option. This fixes the known regression
>>     with Qnap knfsd (without requiring user intervention) and still
>>     allows for FS_LOCATIONS to be useful with NFSv4.2 servers that
>>     support this. The disadvantage is that we won't use the feature
>>     on NFSv4.1 servers which might support this feature perfectly
>>     (and there's no opt-in possibility). And the risk is that there
>>     might be NFSv4.2 servers out there that also misreport
>>     FS_LOCATIONS support and still need manual intervention (which
>>     at least is possible with notrunkdiscovery).
>> (3) We make this feature an opt-in thing and require users to
>>     pass trunkdiscovery to take advantage of the feature.
>>     This has zero risk of breakage (unless we screw up the patch),
>>     but always requires user intervention to take advantage of
>>     the FS_LOCATIONS feature.
>> (4) Identify a way to recover from the mount with FS_LOCATIONS
>>     against the broken server, so instead of hanging we do just
>>     turn this off if we find it not to work. Disadavantage is that
>>     this adds complexity and might stall the mounting for a while
>>     until the recovery worked. The complexity bears the risk for
>>     us screwing up.
>>
>> I personally find solutions 2 -- 4 acceptable.
>>
>> If the experts see (4) as simple enough, it might be worth a try.
>> Otherwise (2) or (3) would seem the way to go from my perspective.
>
> Any thought ls?

I think (3) is the best way, and perhaps using sysfs to toggle it would
be a solution to the problems presented by a mount option.

I'm worried that this issue is being ignored because that's usually what
happens when requests/patches are proposed that violate the policy of "we do
not fix the client for server bugs". In this case that policy conficts with
"no user visible regressions".

Can anyone declare which policy takes precedent?

Ben

2022-03-17 06:37:40

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v1] NFSv4.1 provide mount option to toggle trunking discovery



> On Mar 16, 2022, at 8:39 AM, Benjamin Coddington <[email protected]> wrote:
>
> On 25 Feb 2022, at 17:48, Kurt Garloff wrote:
>
>> Hi,
>>
>> Am 24. Februar 2022 14:42:41 MEZ schrieb Kurt Garloff <[email protected]>:
>>> Hi Olga,
>>> [...]
>>>
>>> I see a number of possibilities to resolve this:
>>> (0) We pretend it's not a problem that's serious enough to take
>>> action (and ensure that we do document this new option well).
>>> (1) We revert the patch that does FS_LOCATIONS discovery.
>>> Assuming that FS_LOCATIONS does provide a useful feature, this
>>> would not be our preferred solution, I guess ...
>>> (2) We prevent NFS v4.1 servers to use FS_LOCATIONS (like my patch
>>> implements) and additionally allow for the opt-out with
>>> notrunkdiscovery mount option. This fixes the known regression
>>> with Qnap knfsd (without requiring user intervention) and still
>>> allows for FS_LOCATIONS to be useful with NFSv4.2 servers that
>>> support this. The disadvantage is that we won't use the feature
>>> on NFSv4.1 servers which might support this feature perfectly
>>> (and there's no opt-in possibility). And the risk is that there
>>> might be NFSv4.2 servers out there that also misreport
>>> FS_LOCATIONS support and still need manual intervention (which
>>> at least is possible with notrunkdiscovery).
>>> (3) We make this feature an opt-in thing and require users to
>>> pass trunkdiscovery to take advantage of the feature.
>>> This has zero risk of breakage (unless we screw up the patch),
>>> but always requires user intervention to take advantage of
>>> the FS_LOCATIONS feature.
>>> (4) Identify a way to recover from the mount with FS_LOCATIONS
>>> against the broken server, so instead of hanging we do just
>>> turn this off if we find it not to work. Disadavantage is that
>>> this adds complexity and might stall the mounting for a while
>>> until the recovery worked. The complexity bears the risk for
>>> us screwing up.
>>>
>>> I personally find solutions 2 -- 4 acceptable.
>>>
>>> If the experts see (4) as simple enough, it might be worth a try.
>>> Otherwise (2) or (3) would seem the way to go from my perspective.
>>
>> Any thought ls?
>
> I think (3) is the best way, and perhaps using sysfs to toggle it would
> be a solution to the problems presented by a mount option.
>
> I'm worried that this issue is being ignored because that's usually what
> happens when requests/patches are proposed that violate the policy of "we do
> not fix the client for server bugs". In this case that policy conficts with
> "no user visible regressions".
>
> Can anyone declare which policy takes precedent?

"No regressions" probably takes precedent. IMO 1) should be done
immediately.

This is not a server bug, necessarily. The server is responding
within the realm of what is allowed by specification and I can
see cases where a server might have a good reason to DELAY an
fs_locations request for a while.

So I think once 1) has been done and backported to stable, the
client functionality should be restored via 4) to ensure that a
server can't delay a client mount indefinitely. (Although I think
I've stated that preference before).

I don't see any need to involve a human in making the decision
to try fs_locations. The client should try fs_locations and if
it doesn't work, just move on as quickly as it can. As always,
I don't relish adding more administrative controls if we can
avoid it. Such controls add to our test matrix and our long
term technical debt. Easy to add, but very difficult to change
or remove once merged. I can't see an upside here.


--
Chuck Lever