2023-03-07 22:05:55

by Jerry Zhang

[permalink] [raw]
Subject: [PATCH] sunrpc: Fix incorrect parsing of expiry time

The expiry time field is mean to be expressed in seconds since boot.
The get_expiry() function parses a relative time value in seconds.
In order to get the absolute time of seconds since boot that the given
message will expire, the right thing is to add seconds_since_boot()
to the given relative value.

Previously this logic was subtracting boot.tv_sec from the relative
value, which was causing some confusing behavior. The return type of
time64_t could possibly underflow if time since boot is greater than
the passed in relative argument. Also several checks in nfs code compare
the return value to 0 to indicate failure, and this could spuriously
be tripped if seconds since boot happened to match the argument.

Fixes: c5b29f885afe ("sunrpc: use seconds since boot in expiry cache")
Signed-off-by: Jerry Zhang <[email protected]>
---
include/linux/sunrpc/cache.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index ec5a555df96f..b96b1319c93d 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -301,16 +301,14 @@ static inline int get_time(char **bpp, time64_t *time)
}

static inline time64_t get_expiry(char **bpp)
{
time64_t rv;
- struct timespec64 boot;

if (get_time(bpp, &rv))
return 0;
if (rv < 0)
return 0;
- getboottime64(&boot);
- return rv - boot.tv_sec;
+ return rv + seconds_since_boot();
}

#endif /* _LINUX_SUNRPC_CACHE_H_ */
--
2.37.3



2023-03-07 22:31:59

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] sunrpc: Fix incorrect parsing of expiry time

On Wed, 08 Mar 2023, Jerry Zhang wrote:
> The expiry time field is mean to be expressed in seconds since boot.

Correct.

> The get_expiry() function parses a relative time value in seconds.

Incorrect. It parses and absoulte wall-clock time.

NeilBrown

> In order to get the absolute time of seconds since boot that the given
> message will expire, the right thing is to add seconds_since_boot()
> to the given relative value.
>
> Previously this logic was subtracting boot.tv_sec from the relative
> value, which was causing some confusing behavior. The return type of
> time64_t could possibly underflow if time since boot is greater than
> the passed in relative argument. Also several checks in nfs code compare
> the return value to 0 to indicate failure, and this could spuriously
> be tripped if seconds since boot happened to match the argument.
>
> Fixes: c5b29f885afe ("sunrpc: use seconds since boot in expiry cache")
> Signed-off-by: Jerry Zhang <[email protected]>
> ---
> include/linux/sunrpc/cache.h | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index ec5a555df96f..b96b1319c93d 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -301,16 +301,14 @@ static inline int get_time(char **bpp, time64_t *time)
> }
>
> static inline time64_t get_expiry(char **bpp)
> {
> time64_t rv;
> - struct timespec64 boot;
>
> if (get_time(bpp, &rv))
> return 0;
> if (rv < 0)
> return 0;
> - getboottime64(&boot);
> - return rv - boot.tv_sec;
> + return rv + seconds_since_boot();
> }
>
> #endif /* _LINUX_SUNRPC_CACHE_H_ */
> --
> 2.37.3
>
>


2023-03-07 22:57:33

by Jerry Zhang

[permalink] [raw]
Subject: Re: [PATCH] sunrpc: Fix incorrect parsing of expiry time

On Tue, Mar 7, 2023 at 2:31 PM NeilBrown <[email protected]> wrote:
>
> On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > The expiry time field is mean to be expressed in seconds since boot.
>
> Correct.
>
> > The get_expiry() function parses a relative time value in seconds.
>
> Incorrect. It parses and absoulte wall-clock time.
I'm not familiar with the source of truth for this info. Is there a
specification of some sort?

For reference, we were seeing writes to
/proc/net/rpc/nfsd.export/channel randomly fail with EINVAL despite
usually succeeding with the same invocation. Upon investigation this
was the string that exportfs was writing "-test-client- /path/to/mount
3 0 65534 65534 0". "3" was the value for expiry in this message,
which led me to conclude that this is a relative field. If it isn't,
perhaps this is a bug in userspace nfs tools?

The failure in this was if nfs-server starts exactly 3s after bootup,
boot.tv_sec would be 3 and thus get_expiry() returns 0, causing a
failure to be returned.
>
> NeilBrown
>
> > In order to get the absolute time of seconds since boot that the given
> > message will expire, the right thing is to add seconds_since_boot()
> > to the given relative value.
> >
> > Previously this logic was subtracting boot.tv_sec from the relative
> > value, which was causing some confusing behavior. The return type of
> > time64_t could possibly underflow if time since boot is greater than
> > the passed in relative argument. Also several checks in nfs code compare
> > the return value to 0 to indicate failure, and this could spuriously
> > be tripped if seconds since boot happened to match the argument.
> >
> > Fixes: c5b29f885afe ("sunrpc: use seconds since boot in expiry cache")
> > Signed-off-by: Jerry Zhang <[email protected]>
> > ---
> > include/linux/sunrpc/cache.h | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> > index ec5a555df96f..b96b1319c93d 100644
> > --- a/include/linux/sunrpc/cache.h
> > +++ b/include/linux/sunrpc/cache.h
> > @@ -301,16 +301,14 @@ static inline int get_time(char **bpp, time64_t *time)
> > }
> >
> > static inline time64_t get_expiry(char **bpp)
> > {
> > time64_t rv;
> > - struct timespec64 boot;
> >
> > if (get_time(bpp, &rv))
> > return 0;
> > if (rv < 0)
> > return 0;
> > - getboottime64(&boot);
> > - return rv - boot.tv_sec;
> > + return rv + seconds_since_boot();
> > }
> >
> > #endif /* _LINUX_SUNRPC_CACHE_H_ */
> > --
> > 2.37.3
> >
> >
>

2023-03-07 23:20:57

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] sunrpc: Fix incorrect parsing of expiry time

On Wed, 08 Mar 2023, Jerry Zhang wrote:
> On Tue, Mar 7, 2023 at 2:31 PM NeilBrown <[email protected]> wrote:
> >
> > On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > > The expiry time field is mean to be expressed in seconds since boot.
> >
> > Correct.
> >
> > > The get_expiry() function parses a relative time value in seconds.
> >
> > Incorrect. It parses and absoulte wall-clock time.
> I'm not familiar with the source of truth for this info. Is there a
> specification of some sort?
>
> For reference, we were seeing writes to
> /proc/net/rpc/nfsd.export/channel randomly fail with EINVAL despite
> usually succeeding with the same invocation. Upon investigation this
> was the string that exportfs was writing "-test-client- /path/to/mount
> 3 0 65534 65534 0". "3" was the value for expiry in this message,
> which led me to conclude that this is a relative field. If it isn't,
> perhaps this is a bug in userspace nfs tools?

The above information is very useful. This sort of detail should always
be included with a bug report, or a patch proposing to fix a bug.

The intent of that "3" is to be a time in the past. We don't want the
-test-client- entry to be added to the cache, but we want a failure
message if the path cannot be exported. So we set a time in the past as
the expiry time.
Using 0 is awkward as it often has special meaning, so I chose '3'.

>
> The failure in this was if nfs-server starts exactly 3s after bootup,
> boot.tv_sec would be 3 and thus get_expiry() returns 0, causing a
> failure to be returned.

I don't understand this. getboottime64() doesn't report time since boot.
It reports the time when the system booted. It only changes when the
system time is deliberately changed.
At boot, it presumably reports 0. As soon as some tool (e.g. systemd or
ntpdate) determines what the current time it and calls settimeofday() or
a similar function, the system time is changed, and the boot-time is
changed by the same amount. Typically this will make it well over 1
billion (for anything booted this century).
So for the boot time to report as '3', something would need to set the
current time to a moment early in January 1970. I'd be surprised if
anything is doing that.

How much tracing have you done? Have you printed out the value of
boot.tv_sec and confirmed that it is '3' or have you only deduced it
from other evidence.
Exactly what firm evidence do you have?

Thanks,
NeilBrown


2023-03-07 23:56:52

by Jerry Zhang

[permalink] [raw]
Subject: Re: [PATCH] sunrpc: Fix incorrect parsing of expiry time

On Tue, Mar 7, 2023 at 3:20 PM NeilBrown <[email protected]> wrote:
>
> On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > On Tue, Mar 7, 2023 at 2:31 PM NeilBrown <[email protected]> wrote:
> > >
> > > On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > > > The expiry time field is mean to be expressed in seconds since boot.
> > >
> > > Correct.
> > >
> > > > The get_expiry() function parses a relative time value in seconds.
> > >
> > > Incorrect. It parses and absoulte wall-clock time.
> > I'm not familiar with the source of truth for this info. Is there a
> > specification of some sort?
> >
> > For reference, we were seeing writes to
> > /proc/net/rpc/nfsd.export/channel randomly fail with EINVAL despite
> > usually succeeding with the same invocation. Upon investigation this
> > was the string that exportfs was writing "-test-client- /path/to/mount
> > 3 0 65534 65534 0". "3" was the value for expiry in this message,
> > which led me to conclude that this is a relative field. If it isn't,
> > perhaps this is a bug in userspace nfs tools?
>
> The above information is very useful. This sort of detail should always
> be included with a bug report, or a patch proposing to fix a bug.
>
> The intent of that "3" is to be a time in the past. We don't want the
> -test-client- entry to be added to the cache, but we want a failure
> message if the path cannot be exported. So we set a time in the past as
> the expiry time.
> Using 0 is awkward as it often has special meaning, so I chose '3'.
>
> >
> > The failure in this was if nfs-server starts exactly 3s after bootup,
> > boot.tv_sec would be 3 and thus get_expiry() returns 0, causing a
> > failure to be returned.
>
> I don't understand this. getboottime64() doesn't report time since boot.
> It reports the time when the system booted. It only changes when the
> system time is deliberately changed.
Ok I misinterpreted what this function does.
> At boot, it presumably reports 0. As soon as some tool (e.g. systemd or
> ntpdate) determines what the current time it and calls settimeofday() or
> a similar function, the system time is changed, and the boot-time is
> changed by the same amount. Typically this will make it well over 1
> billion (for anything booted this century).
> So for the boot time to report as '3', something would need to set the
> current time to a moment early in January 1970. I'd be surprised if
> anything is doing that.
I see the discrepency now -- our system is actually an embedded
platform without an RTC. So it thinks that it is "1970" every time it
boots up, at least until it connects to the internet or similar, which
it may or may not ever do. We use NFS to share mountpoints between 2
linux systems on our board connected via usb-ethernet. The fact that
it allows simultaneous access gives it an advantage over other
protocols like mass storage.

Its likely that the code is working as intended then, it just didn't
take our particular usecase into account.

>
> How much tracing have you done? Have you printed out the value of
> boot.tv_sec and confirmed that it is '3' or have you only deduced it
> from other evidence.
> Exactly what firm evidence do you have?
Sure I've added this simple debug print with the necessary info

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 15422c951fd1..5af49198b162 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -528,10 +528,12 @@ static int svc_export_parse(struct cache_detail
*cd, char *mesg, int mlen)
int len;
int err;
struct auth_domain *dom = NULL;
struct svc_export exp = {}, *expp;
int an_int;
+ struct timespec64 boot;
+ char* orig_mesg = mesg;

if (mesg[mlen-1] != '\n')
return -EINVAL;
mesg[mlen-1] = 0;

@@ -564,10 +566,12 @@ static int svc_export_parse(struct cache_detail
*cd, char *mesg, int mlen)
exp.ex_devid_map = NULL;

/* expiry */
err = -EINVAL;
exp.h.expiry_time = get_expiry(&mesg);
+ getboottime64(&boot);
+ printk("mesg is '%s' expiry is %lld and boot_s is %lld\n",
orig_mesg, exp.h.expiry_time, boot.tv_sec);
if (exp.h.expiry_time == 0)
goto out3;

/* flags */
err = get_int(&mesg, &an_int);

and the output is

[ 14.093506] mesg is '-test-client- /path/to/mount 3 8192 65534
65534 0' expiry is 0 and boot_s is 3

which largely confirms the info above.

Do you think we'd be able to handle this case cleanly?
>
> Thanks,
> NeilBrown
>

2023-03-08 00:28:10

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH] sunrpc: Fix incorrect parsing of expiry time


(I've removed some bouncing email addresses, and also removed
Trond and Anna who are responsible for the NFS client and so
not directly interested in the server. They will likely find
this on the list if they are interested).

On Wed, 08 Mar 2023, Jerry Zhang wrote:
> On Tue, Mar 7, 2023 at 3:20 PM NeilBrown <[email protected]> wrote:
> >
> > On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > > On Tue, Mar 7, 2023 at 2:31 PM NeilBrown <[email protected]> wrote:
> > > >
> > > > On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > > > > The expiry time field is mean to be expressed in seconds since boot.
> > > >
> > > > Correct.
> > > >
> > > > > The get_expiry() function parses a relative time value in seconds.
> > > >
> > > > Incorrect. It parses and absoulte wall-clock time.
> > > I'm not familiar with the source of truth for this info. Is there a
> > > specification of some sort?
> > >
> > > For reference, we were seeing writes to
> > > /proc/net/rpc/nfsd.export/channel randomly fail with EINVAL despite
> > > usually succeeding with the same invocation. Upon investigation this
> > > was the string that exportfs was writing "-test-client- /path/to/mount
> > > 3 0 65534 65534 0". "3" was the value for expiry in this message,
> > > which led me to conclude that this is a relative field. If it isn't,
> > > perhaps this is a bug in userspace nfs tools?
> >
> > The above information is very useful. This sort of detail should always
> > be included with a bug report, or a patch proposing to fix a bug.
> >
> > The intent of that "3" is to be a time in the past. We don't want the
> > -test-client- entry to be added to the cache, but we want a failure
> > message if the path cannot be exported. So we set a time in the past as
> > the expiry time.
> > Using 0 is awkward as it often has special meaning, so I chose '3'.
> >
> > >
> > > The failure in this was if nfs-server starts exactly 3s after bootup,
> > > boot.tv_sec would be 3 and thus get_expiry() returns 0, causing a
> > > failure to be returned.
> >
> > I don't understand this. getboottime64() doesn't report time since boot.
> > It reports the time when the system booted. It only changes when the
> > system time is deliberately changed.
> Ok I misinterpreted what this function does.
> > At boot, it presumably reports 0. As soon as some tool (e.g. systemd or
> > ntpdate) determines what the current time it and calls settimeofday() or
> > a similar function, the system time is changed, and the boot-time is
> > changed by the same amount. Typically this will make it well over 1
> > billion (for anything booted this century).
> > So for the boot time to report as '3', something would need to set the
> > current time to a moment early in January 1970. I'd be surprised if
> > anything is doing that.
> I see the discrepency now -- our system is actually an embedded
> platform without an RTC. So it thinks that it is "1970" every time it
> boots up, at least until it connects to the internet or similar, which
> it may or may not ever do. We use NFS to share mountpoints between 2
> linux systems on our board connected via usb-ethernet. The fact that
> it allows simultaneous access gives it an advantage over other
> protocols like mass storage.
>
> Its likely that the code is working as intended then, it just didn't
> take our particular usecase into account.
>
> >
> > How much tracing have you done? Have you printed out the value of
> > boot.tv_sec and confirmed that it is '3' or have you only deduced it
> > from other evidence.
> > Exactly what firm evidence do you have?
> Sure I've added this simple debug print with the necessary info
>
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index 15422c951fd1..5af49198b162 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -528,10 +528,12 @@ static int svc_export_parse(struct cache_detail
> *cd, char *mesg, int mlen)
> int len;
> int err;
> struct auth_domain *dom = NULL;
> struct svc_export exp = {}, *expp;
> int an_int;
> + struct timespec64 boot;
> + char* orig_mesg = mesg;
>
> if (mesg[mlen-1] != '\n')
> return -EINVAL;
> mesg[mlen-1] = 0;
>
> @@ -564,10 +566,12 @@ static int svc_export_parse(struct cache_detail
> *cd, char *mesg, int mlen)
> exp.ex_devid_map = NULL;
>
> /* expiry */
> err = -EINVAL;
> exp.h.expiry_time = get_expiry(&mesg);
> + getboottime64(&boot);
> + printk("mesg is '%s' expiry is %lld and boot_s is %lld\n",
> orig_mesg, exp.h.expiry_time, boot.tv_sec);
> if (exp.h.expiry_time == 0)
> goto out3;
>
> /* flags */
> err = get_int(&mesg, &an_int);
>
> and the output is
>
> [ 14.093506] mesg is '-test-client- /path/to/mount 3 8192 65534
> 65534 0' expiry is 0 and boot_s is 3
>
> which largely confirms the info above.
>
> Do you think we'd be able to handle this case cleanly?

Thanks for all the details. Yes I think we can and should handle this
case cleanly. I think the core problem is that get_expiry() mixes the
error code into the expiry time. If we separate those out the problem
should disappear.

Please try this patch.

From: NeilBrown <[email protected]>
Date: Wed, 8 Mar 2023 11:19:01 +1100
Subject: [PATCH] SUNRPC: return proper error from get_expiry()

The get_expiry() function currently returns a timestamp, and uses the
special return value of 0 to indicate an error.

Unfortunately this causes a problem when 0 is the correct return value.

On a system with no RTC it is possible that the boot time will be seen
to be "3". When exportfs probes to see if a particular filesystem
supports NFS export it tries to cache information with an expiry time of
"3". The intention is for this to be "long in the past". Even with not
RTC it will not be far in the future (at most a second or two) so this
is harmless.
But if the boot time happens to have been calculated to be "3", then
get_expiry will fail incorrectly as it converts the number to "seconds
since bootime".

To avoid this problem we change get_expiry() to report the error quite
separately from the expiry time. The error is now the return value.
The expiry time is reported through a by-reference parameter.

Reported-by: Jerry Zhang <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
---
fs/nfsd/export.c | 13 ++++++-------
fs/nfsd/nfs4idmap.c | 8 ++++----
include/linux/sunrpc/cache.h | 15 ++++++++-------
net/sunrpc/auth_gss/svcauth_gss.c | 12 ++++++------
net/sunrpc/svcauth_unix.c | 12 ++++++------
5 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
index 668c7527b17e..6da74aebe1fb 100644
--- a/fs/nfsd/export.c
+++ b/fs/nfsd/export.c
@@ -123,11 +123,11 @@ static int expkey_parse(struct cache_detail *cd, char *mesg, int mlen)

/* OK, we seem to have a valid key */
key.h.flags = 0;
- key.h.expiry_time = get_expiry(&mesg);
- if (key.h.expiry_time == 0)
+ err = get_expiry(&mesg, &key.h.expiry_time);
+ if (err)
goto out;

- key.ek_client = dom;
+ key.ek_client = dom;
key.ek_fsidtype = fsidtype;
memcpy(key.ek_fsid, buf, len);

@@ -610,9 +610,8 @@ static int svc_export_parse(struct cache_detail *cd, char *mesg, int mlen)
exp.ex_devid_map = NULL;

/* expiry */
- err = -EINVAL;
- exp.h.expiry_time = get_expiry(&mesg);
- if (exp.h.expiry_time == 0)
+ err = get_expiry(&mesg, &exp.h.expiry_time);
+ if (err)
goto out3;

/* flags */
@@ -624,7 +623,7 @@ static int svc_export_parse(struct cache_detail *cd, char *mesg, int mlen)
if (err || an_int < 0)
goto out3;
exp.ex_flags= an_int;
-
+
/* anon uid */
err = get_int(&mesg, &an_int);
if (err)
diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
index 5e9809aff37e..7a806ac13e31 100644
--- a/fs/nfsd/nfs4idmap.c
+++ b/fs/nfsd/nfs4idmap.c
@@ -240,8 +240,8 @@ idtoname_parse(struct cache_detail *cd, char *buf, int buflen)
goto out;

/* expiry */
- ent.h.expiry_time = get_expiry(&buf);
- if (ent.h.expiry_time == 0)
+ error = get_expiry(&buf, &ent.h.expiry_time);
+ if (error)
goto out;

error = -ENOMEM;
@@ -408,8 +408,8 @@ nametoid_parse(struct cache_detail *cd, char *buf, int buflen)
memcpy(ent.name, buf1, sizeof(ent.name));

/* expiry */
- ent.h.expiry_time = get_expiry(&buf);
- if (ent.h.expiry_time == 0)
+ error = get_expiry(&buf, &ent.h.expiry_time);
+ if (error)
goto out;

/* ID */
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index ec5a555df96f..518bd28f5ab8 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -300,17 +300,18 @@ static inline int get_time(char **bpp, time64_t *time)
return 0;
}

-static inline time64_t get_expiry(char **bpp)
+static inline int get_expiry(char **bpp, time64_t *rvp)
{
- time64_t rv;
+ int error;
struct timespec64 boot;

- if (get_time(bpp, &rv))
- return 0;
- if (rv < 0)
- return 0;
+ error = get_time(bpp, rvp);
+ if (error)
+ return error;
+
getboottime64(&boot);
- return rv - boot.tv_sec;
+ (*rvp) -= boot.tv_sec;
+ return 0;
}

#endif /* _LINUX_SUNRPC_CACHE_H_ */
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index acb822b23af1..bfaf584d296a 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -258,11 +258,11 @@ static int rsi_parse(struct cache_detail *cd,

rsii.h.flags = 0;
/* expiry */
- expiry = get_expiry(&mesg);
- status = -EINVAL;
- if (expiry == 0)
+ status = get_expiry(&mesg, &expiry);
+ if (status)
goto out;

+ status = -EINVAL;
/* major/minor */
len = qword_get(&mesg, buf, mlen);
if (len <= 0)
@@ -484,11 +484,11 @@ static int rsc_parse(struct cache_detail *cd,

rsci.h.flags = 0;
/* expiry */
- expiry = get_expiry(&mesg);
- status = -EINVAL;
- if (expiry == 0)
+ status = get_expiry(&mesg, &expiry);
+ if (status)
goto out;

+ status = -EINVAL;
rscp = rsc_lookup(cd, &rsci);
if (!rscp)
goto out;
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index b1efc34db6ed..9e7798a69051 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -225,9 +225,9 @@ static int ip_map_parse(struct cache_detail *cd,
return -EINVAL;
}

- expiry = get_expiry(&mesg);
- if (expiry ==0)
- return -EINVAL;
+ err = get_expiry(&mesg, &expiry);
+ if (err)
+ return err;

/* domainname, or empty for NEGATIVE */
len = qword_get(&mesg, buf, mlen);
@@ -497,9 +497,9 @@ static int unix_gid_parse(struct cache_detail *cd,
uid = make_kuid(current_user_ns(), id);
ug.uid = uid;

- expiry = get_expiry(&mesg);
- if (expiry == 0)
- return -EINVAL;
+ err = get_expiry(&mesg, &expiry);
+ if (err)
+ return err;

rv = get_int(&mesg, &gids);
if (rv || gids < 0 || gids > 8192)
--
2.39.2


2023-03-08 01:53:29

by Jerry Zhang

[permalink] [raw]
Subject: Re: [PATCH] sunrpc: Fix incorrect parsing of expiry time

On Tue, Mar 7, 2023 at 4:27 PM NeilBrown <[email protected]> wrote:
>
>
> (I've removed some bouncing email addresses, and also removed
> Trond and Anna who are responsible for the NFS client and so
> not directly interested in the server. They will likely find
> this on the list if they are interested).
>
> On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > On Tue, Mar 7, 2023 at 3:20 PM NeilBrown <[email protected]> wrote:
> > >
> > > On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > > > On Tue, Mar 7, 2023 at 2:31 PM NeilBrown <[email protected]> wrote:
> > > > >
> > > > > On Wed, 08 Mar 2023, Jerry Zhang wrote:
> > > > > > The expiry time field is mean to be expressed in seconds since boot.
> > > > >
> > > > > Correct.
> > > > >
> > > > > > The get_expiry() function parses a relative time value in seconds.
> > > > >
> > > > > Incorrect. It parses and absoulte wall-clock time.
> > > > I'm not familiar with the source of truth for this info. Is there a
> > > > specification of some sort?
> > > >
> > > > For reference, we were seeing writes to
> > > > /proc/net/rpc/nfsd.export/channel randomly fail with EINVAL despite
> > > > usually succeeding with the same invocation. Upon investigation this
> > > > was the string that exportfs was writing "-test-client- /path/to/mount
> > > > 3 0 65534 65534 0". "3" was the value for expiry in this message,
> > > > which led me to conclude that this is a relative field. If it isn't,
> > > > perhaps this is a bug in userspace nfs tools?
> > >
> > > The above information is very useful. This sort of detail should always
> > > be included with a bug report, or a patch proposing to fix a bug.
> > >
> > > The intent of that "3" is to be a time in the past. We don't want the
> > > -test-client- entry to be added to the cache, but we want a failure
> > > message if the path cannot be exported. So we set a time in the past as
> > > the expiry time.
> > > Using 0 is awkward as it often has special meaning, so I chose '3'.
> > >
> > > >
> > > > The failure in this was if nfs-server starts exactly 3s after bootup,
> > > > boot.tv_sec would be 3 and thus get_expiry() returns 0, causing a
> > > > failure to be returned.
> > >
> > > I don't understand this. getboottime64() doesn't report time since boot.
> > > It reports the time when the system booted. It only changes when the
> > > system time is deliberately changed.
> > Ok I misinterpreted what this function does.
> > > At boot, it presumably reports 0. As soon as some tool (e.g. systemd or
> > > ntpdate) determines what the current time it and calls settimeofday() or
> > > a similar function, the system time is changed, and the boot-time is
> > > changed by the same amount. Typically this will make it well over 1
> > > billion (for anything booted this century).
> > > So for the boot time to report as '3', something would need to set the
> > > current time to a moment early in January 1970. I'd be surprised if
> > > anything is doing that.
> > I see the discrepency now -- our system is actually an embedded
> > platform without an RTC. So it thinks that it is "1970" every time it
> > boots up, at least until it connects to the internet or similar, which
> > it may or may not ever do. We use NFS to share mountpoints between 2
> > linux systems on our board connected via usb-ethernet. The fact that
> > it allows simultaneous access gives it an advantage over other
> > protocols like mass storage.
> >
> > Its likely that the code is working as intended then, it just didn't
> > take our particular usecase into account.
> >
> > >
> > > How much tracing have you done? Have you printed out the value of
> > > boot.tv_sec and confirmed that it is '3' or have you only deduced it
> > > from other evidence.
> > > Exactly what firm evidence do you have?
> > Sure I've added this simple debug print with the necessary info
> >
> > diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> > index 15422c951fd1..5af49198b162 100644
> > --- a/fs/nfsd/export.c
> > +++ b/fs/nfsd/export.c
> > @@ -528,10 +528,12 @@ static int svc_export_parse(struct cache_detail
> > *cd, char *mesg, int mlen)
> > int len;
> > int err;
> > struct auth_domain *dom = NULL;
> > struct svc_export exp = {}, *expp;
> > int an_int;
> > + struct timespec64 boot;
> > + char* orig_mesg = mesg;
> >
> > if (mesg[mlen-1] != '\n')
> > return -EINVAL;
> > mesg[mlen-1] = 0;
> >
> > @@ -564,10 +566,12 @@ static int svc_export_parse(struct cache_detail
> > *cd, char *mesg, int mlen)
> > exp.ex_devid_map = NULL;
> >
> > /* expiry */
> > err = -EINVAL;
> > exp.h.expiry_time = get_expiry(&mesg);
> > + getboottime64(&boot);
> > + printk("mesg is '%s' expiry is %lld and boot_s is %lld\n",
> > orig_mesg, exp.h.expiry_time, boot.tv_sec);
> > if (exp.h.expiry_time == 0)
> > goto out3;
> >
> > /* flags */
> > err = get_int(&mesg, &an_int);
> >
> > and the output is
> >
> > [ 14.093506] mesg is '-test-client- /path/to/mount 3 8192 65534
> > 65534 0' expiry is 0 and boot_s is 3
> >
> > which largely confirms the info above.
> >
> > Do you think we'd be able to handle this case cleanly?
>
> Thanks for all the details. Yes I think we can and should handle this
> case cleanly. I think the core problem is that get_expiry() mixes the
> error code into the expiry time. If we separate those out the problem
> should disappear.
>
> Please try this patch.
That worked, thanks for the quick fix!
>
> From: NeilBrown <[email protected]>
> Date: Wed, 8 Mar 2023 11:19:01 +1100
> Subject: [PATCH] SUNRPC: return proper error from get_expiry()
>
> The get_expiry() function currently returns a timestamp, and uses the
> special return value of 0 to indicate an error.
>
> Unfortunately this causes a problem when 0 is the correct return value.
>
> On a system with no RTC it is possible that the boot time will be seen
> to be "3". When exportfs probes to see if a particular filesystem
> supports NFS export it tries to cache information with an expiry time of
> "3". The intention is for this to be "long in the past". Even with not
> RTC it will not be far in the future (at most a second or two) so this
> is harmless.
> But if the boot time happens to have been calculated to be "3", then
> get_expiry will fail incorrectly as it converts the number to "seconds
> since bootime".
>
> To avoid this problem we change get_expiry() to report the error quite
> separately from the expiry time. The error is now the return value.
> The expiry time is reported through a by-reference parameter.
>
> Reported-by: Jerry Zhang <[email protected]>
Tested-by: Jerry Zhang <[email protected]>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/nfsd/export.c | 13 ++++++-------
> fs/nfsd/nfs4idmap.c | 8 ++++----
> include/linux/sunrpc/cache.h | 15 ++++++++-------
> net/sunrpc/auth_gss/svcauth_gss.c | 12 ++++++------
> net/sunrpc/svcauth_unix.c | 12 ++++++------
> 5 files changed, 30 insertions(+), 30 deletions(-)
>
> diff --git a/fs/nfsd/export.c b/fs/nfsd/export.c
> index 668c7527b17e..6da74aebe1fb 100644
> --- a/fs/nfsd/export.c
> +++ b/fs/nfsd/export.c
> @@ -123,11 +123,11 @@ static int expkey_parse(struct cache_detail *cd, char *mesg, int mlen)
>
> /* OK, we seem to have a valid key */
> key.h.flags = 0;
> - key.h.expiry_time = get_expiry(&mesg);
> - if (key.h.expiry_time == 0)
> + err = get_expiry(&mesg, &key.h.expiry_time);
> + if (err)
> goto out;
>
> - key.ek_client = dom;
> + key.ek_client = dom;
> key.ek_fsidtype = fsidtype;
> memcpy(key.ek_fsid, buf, len);
>
> @@ -610,9 +610,8 @@ static int svc_export_parse(struct cache_detail *cd, char *mesg, int mlen)
> exp.ex_devid_map = NULL;
>
> /* expiry */
> - err = -EINVAL;
> - exp.h.expiry_time = get_expiry(&mesg);
> - if (exp.h.expiry_time == 0)
> + err = get_expiry(&mesg, &exp.h.expiry_time);
> + if (err)
> goto out3;
>
> /* flags */
> @@ -624,7 +623,7 @@ static int svc_export_parse(struct cache_detail *cd, char *mesg, int mlen)
> if (err || an_int < 0)
> goto out3;
> exp.ex_flags= an_int;
> -
> +
> /* anon uid */
> err = get_int(&mesg, &an_int);
> if (err)
> diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c
> index 5e9809aff37e..7a806ac13e31 100644
> --- a/fs/nfsd/nfs4idmap.c
> +++ b/fs/nfsd/nfs4idmap.c
> @@ -240,8 +240,8 @@ idtoname_parse(struct cache_detail *cd, char *buf, int buflen)
> goto out;
>
> /* expiry */
> - ent.h.expiry_time = get_expiry(&buf);
> - if (ent.h.expiry_time == 0)
> + error = get_expiry(&buf, &ent.h.expiry_time);
> + if (error)
> goto out;
>
> error = -ENOMEM;
> @@ -408,8 +408,8 @@ nametoid_parse(struct cache_detail *cd, char *buf, int buflen)
> memcpy(ent.name, buf1, sizeof(ent.name));
>
> /* expiry */
> - ent.h.expiry_time = get_expiry(&buf);
> - if (ent.h.expiry_time == 0)
> + error = get_expiry(&buf, &ent.h.expiry_time);
> + if (error)
> goto out;
>
> /* ID */
> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
> index ec5a555df96f..518bd28f5ab8 100644
> --- a/include/linux/sunrpc/cache.h
> +++ b/include/linux/sunrpc/cache.h
> @@ -300,17 +300,18 @@ static inline int get_time(char **bpp, time64_t *time)
> return 0;
> }
>
> -static inline time64_t get_expiry(char **bpp)
> +static inline int get_expiry(char **bpp, time64_t *rvp)
> {
> - time64_t rv;
> + int error;
> struct timespec64 boot;
>
> - if (get_time(bpp, &rv))
> - return 0;
> - if (rv < 0)
> - return 0;
> + error = get_time(bpp, rvp);
> + if (error)
> + return error;
> +
> getboottime64(&boot);
> - return rv - boot.tv_sec;
> + (*rvp) -= boot.tv_sec;
> + return 0;
> }
>
> #endif /* _LINUX_SUNRPC_CACHE_H_ */
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index acb822b23af1..bfaf584d296a 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -258,11 +258,11 @@ static int rsi_parse(struct cache_detail *cd,
>
> rsii.h.flags = 0;
> /* expiry */
> - expiry = get_expiry(&mesg);
> - status = -EINVAL;
> - if (expiry == 0)
> + status = get_expiry(&mesg, &expiry);
> + if (status)
> goto out;
>
> + status = -EINVAL;
> /* major/minor */
> len = qword_get(&mesg, buf, mlen);
> if (len <= 0)
> @@ -484,11 +484,11 @@ static int rsc_parse(struct cache_detail *cd,
>
> rsci.h.flags = 0;
> /* expiry */
> - expiry = get_expiry(&mesg);
> - status = -EINVAL;
> - if (expiry == 0)
> + status = get_expiry(&mesg, &expiry);
> + if (status)
> goto out;
>
> + status = -EINVAL;
> rscp = rsc_lookup(cd, &rsci);
> if (!rscp)
> goto out;
> diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
> index b1efc34db6ed..9e7798a69051 100644
> --- a/net/sunrpc/svcauth_unix.c
> +++ b/net/sunrpc/svcauth_unix.c
> @@ -225,9 +225,9 @@ static int ip_map_parse(struct cache_detail *cd,
> return -EINVAL;
> }
>
> - expiry = get_expiry(&mesg);
> - if (expiry ==0)
> - return -EINVAL;
> + err = get_expiry(&mesg, &expiry);
> + if (err)
> + return err;
>
> /* domainname, or empty for NEGATIVE */
> len = qword_get(&mesg, buf, mlen);
> @@ -497,9 +497,9 @@ static int unix_gid_parse(struct cache_detail *cd,
> uid = make_kuid(current_user_ns(), id);
> ug.uid = uid;
>
> - expiry = get_expiry(&mesg);
> - if (expiry == 0)
> - return -EINVAL;
> + err = get_expiry(&mesg, &expiry);
> + if (err)
> + return err;
>
> rv = get_int(&mesg, &gids);
> if (rv || gids < 0 || gids > 8192)
> --
> 2.39.2
>