2021-01-04 11:34:47

by Hackintosh Five

Subject: Boot time improvement with systemd and nfs-utils

rpc-statd-notify causes a 10-second hang during boot on my system
because of an unwanted dependency on network-online.target. The
dependency isn't needed anyway: rpc-statd-notify (sm-notify) already
waits for the network to come online if it isn't up yet (for up to 15
minutes, so there is no risk of a timeout that systemd would avoid).
=============================================
From c90bd7e701c2558606907f08bf27ae9be3f8e0bf Mon Sep 17 00:00:00 2001
From: Hackintosh 5 <[email protected]>
Date: Sat, 2 Jan 2021 14:28:30 +0000
Subject: [PATCH] systemd: network-online.target is not needed for
 rpc-statd-notify.service

Commit 09e5c6c2 changed the After= line of rpc-statd-notify.service
from network.target to network-online.target. That change is
incorrect: sm-notify has a default timeout of 15 minutes, which is
longer than the timeout for network-online.target. In other words, the
dependency on network-online.target is useless and delays system boot
by ~10 seconds.
---
 systemd/rpc-statd-notify.service | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/systemd/rpc-statd-notify.service b/systemd/rpc-statd-notify.service
index aad4c0d2..8a40e862 100644
--- a/systemd/rpc-statd-notify.service
+++ b/systemd/rpc-statd-notify.service
@@ -1,8 +1,8 @@
 [Unit]
 Description=Notify NFS peers of a restart
 DefaultDependencies=no
-Wants=network-online.target
-After=local-fs.target network-online.target nss-lookup.target
+Wants=network.target
+After=local-fs.target network.target nss-lookup.target
 
 # if we run an nfs server, it needs to be running before we
 # tell clients that it has restarted.
--
2.29.2


2021-01-04 12:28:50

by Chuck Lever

Subject: Re: Boot time improvement with systemd and nfs-utils

Hello, thanks for your report.

The dependency you are removing addresses a bug -- if the network is not configured when rpc.statd is started, the rpc.statd process continues to use incorrect local address information even after the network is up.



--
Chuck Lever



2021-01-04 12:52:44

by Hackintosh Five

Subject: Re: Boot time improvement with systemd and nfs-utils

Hi, thanks for the fast reply.

I have never used NFS and I'm not a systemd expert, so I'm not at all
sure this interpretation is correct, but here goes. I only removed the
dependency from rpc-statd-notify, not rpc.statd. I didn't remove the
`After=nfs-server` line, and nfs-server cannot come up until
network-online is reached (there's an After requirement in the
nfs-server unit). So if nfs-server is enabled, rpc-statd-notify orders
itself after the server, which in turn depends on the network. That
means that, if there is a server, it must be up before notifications
are sent, so the right hostname will be available. The patch therefore
only improves boot speed on NFS clients, where nfs-client.target pulls
in rpc-statd-notify.service.
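
The ordering argument above can be checked mechanically. The following is
a minimal Python sketch, not part of nfs-utils: parse_unit_deps and the
sample text are hypothetical illustrations, and the exact spelling
After=nfs-server.service is assumed from the description above. It lists
what the patched unit would be ordered after.

```python
from collections import defaultdict

# Directives whose values are space-separated unit names.
DEP_KEYS = {"After", "Before", "Wants", "Requires"}


def parse_unit_deps(unit_text):
    """Collect the values of repeated dependency directives such as After=."""
    deps = defaultdict(list)
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and section headers like [Unit].
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() in DEP_KEYS:
            deps[key.strip()].extend(value.split())
    return dict(deps)


# [Unit] section as it would look with the first patch applied. The
# After=nfs-server.service line is an assumption based on the email text.
PATCHED_UNIT = """\
[Unit]
Description=Notify NFS peers of a restart
DefaultDependencies=no
Wants=network.target
After=local-fs.target network.target nss-lookup.target

# if we run an nfs server, it needs to be running before we
# tell clients that it has restarted.
After=nfs-server.service
"""

deps = parse_unit_deps(PATCHED_UNIT)
print("After:", deps["After"])
print("Wants:", deps["Wants"])
```

With the patch applied, the unit itself has no direct ordering on
network-online.target; only nfs-server.service, when enabled, would bring
that ordering back in indirectly.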



2021-01-04 12:55:37

by Chuck Lever

Subject: Re: Boot time improvement with systemd and nfs-utils

Hi-

> On Jan 4, 2021, at 7:51 AM, Hackintosh Five <[email protected]> wrote:
>
> Hi, thanks for the fast reply
>
> I have never even used nfs and I'm not a systemd expert, so I'm not at
> all sure this interpretation is correct, but here goes. I only removed
> the dependency from rpc-statd-notify, not rpc.statd.

Same problem exists for sm-notify.


> I didn't remove
> the `After=nfs-server` line, and for nfs-server to be up,
> network-online must be up first (there's an After requirement in the
> nfs-server unit). So if the nfs-server is enabled, the
> rpc-statd-notify will order itself after the server is up, which
> depends on the network.

IIRC sm-notify runs on clients too. That's why the dependency is
on the network and not on nfs-server.


> That means that, if there is a server, the
> server must be up before it sends notifications, so it will have the
> right hostname. This only improves boot speed on nfs clients, where
> nfs-client.target pulls in rpc-statd-notify.service.

--
Chuck Lever



2021-01-04 16:07:02

by Hackintosh Five

Subject: Re: Boot time improvement with systemd and nfs-utils

I see. Does rpc-statd-notify HAVE to start before nfs-client? If not,
perhaps a one-off timer unit with no delay could be made so that the
startup of rpc-statd-notify doesn't block the boot process, while
still running after network-online?

On Mon, Jan 4, 2021 at 1:26 PM Chuck Lever <[email protected]> wrote:
>
> The problem is not in sm-notify itself, it's in the C library functions. The system's DNS resolver configuration is set during network startup. When a process first attempts a DNS query, it retrieves the system DNS configuration as it is at that moment, and keeps that configuration until the process exits. If sm-notify starts before the system's DNS resolver is configured, then it simply doesn't work because it can't perform DNS queries correctly.
>
>
> On Jan 4, 2021, at 8:13 AM, Hackintosh Five <[email protected]> wrote:
>
> Yep, sm-notify is Wanted by nfs-utils.target, and hence runs on clients. But I can't see anywhere in the source of sm-notify where being offline would make a difference (beyond triggering a retry). If such a function does exist and I missed it, perhaps it could be moved to the end of the program (so the hostname is calculated only when the network is available, or whatever).
>

2021-01-04 18:02:47

by Chuck Lever

Subject: Re: Boot time improvement with systemd and nfs-utils

I'm not aware of a similar reason why the notify step would need to
block the boot process. Maybe Type=forking is wrong, but I thought
sm-notify was a forking/daemonizing type of utility.


> On Jan 4, 2021, at 11:03 AM, Hackintosh Five <[email protected]> wrote:
>
> I see. Does rpc-statd-notify HAVE to start before nfs-client? If not,
> perhaps a one-off timer unit with no delay could be made so that the
> startup of rpc-statd-notify doesn't block the boot process, while
> still running after network-online?
>
> On Mon, Jan 4, 2021 at 1:26 PM Chuck Lever <[email protected]> wrote:
>>
>> The problem is not in sm-notify itself, it's in the C library functions. The system's DNS resolver configuration is set during network startup. When a process first attempts a DNS query, it retrieves the system DNS configuration as it is at that moment, and keeps that configuration until the process exits. If sm-notify starts before the system's DNS resolver is configured, then it simply doesn't work because it can't perform DNS queries correctly.
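
The resolver behavior quoted above can be pictured with a toy model. This
is a hedged Python sketch, not the C library's actual code: the class
SnapshotResolver, the system_config list, and the documentation address
192.0.2.53 are all hypothetical. It only illustrates the stated pattern of
a process copying the resolver configuration at its first query and
keeping that copy until it exits.

```python
class SnapshotResolver:
    """Toy stand-in for a resolver that loads its settings once, on first use."""

    def __init__(self, read_config):
        self._read_config = read_config  # callable returning the current nameservers
        self._nameservers = None         # nothing loaded until the first query

    def nameservers_for_query(self):
        if self._nameservers is None:
            # First query: copy the system configuration as it is right now.
            self._nameservers = list(self._read_config())
        return self._nameservers


# Hypothetical system state: empty until "network startup" fills it in.
system_config = []

# A process that makes its first query before the network is configured.
early = SnapshotResolver(lambda: system_config)
early.nameservers_for_query()

# Network startup writes the resolver configuration.
system_config.append("192.0.2.53")

# A process that makes its first query after the network is configured.
late = SnapshotResolver(lambda: system_config)

print("early process sees:", early.nameservers_for_query())
print("late process sees:", late.nameservers_for_query())
```

In this model the early process keeps its empty snapshot even though the
system configuration has since changed, which is the failure mode the
ordering after network-online.target is meant to avoid.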


--
Chuck Lever



2021-01-04 18:56:53

by Hackintosh Five

Subject: [PATCH v2] systemd: rpc-statd-notify.service can run in the background

This allows rpc-statd-notify to run in the background when it is
only in use by a client. This is done with a timer unit that fires
after one second and is Wanted by nfs-client.target. As a result,
multi-user.target no longer depends on network-online.target through
nfs-client.target, so boot is faster for everyone.
---
 systemd/nfs-client.target      | 2 +-
 systemd/rpc-statd-notify.timer | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 systemd/rpc-statd-notify.timer

diff --git a/systemd/nfs-client.target b/systemd/nfs-client.target
index 8a8300a1..b7cce746 100644
--- a/systemd/nfs-client.target
+++ b/systemd/nfs-client.target
@@ -5,7 +5,7 @@ Wants=remote-fs-pre.target
 
 # Note: we don't "Wants=rpc-statd.service" as "mount.nfs" will arrange to
 # start that on demand if needed.
-Wants=rpc-statd-notify.service
+Wants=rpc-statd-notify.timer
 
 # GSS services dependencies and ordering
 Wants=auth-rpcgss-module.service
diff --git a/systemd/rpc-statd-notify.timer b/systemd/rpc-statd-notify.timer
new file mode 100644
index 00000000..bac68817
--- /dev/null
+++ b/systemd/rpc-statd-notify.timer
@@ -0,0 +1,9 @@
+[Unit]
+Description=Notify NFS peers of a restart
+RefuseManualStart=true
+RefuseManualStop=true
+
+[Timer]
+OnActiveSec=1
+Unit=rpc-statd-notify.service
+RemainAfterElapse=false
--
2.29.2

2021-01-04 18:57:19

by Hackintosh Five

Subject: Re: Boot time improvement with systemd and nfs-utils

The issue isn't the forking type; that's certainly correct (since
sm-notify does indeed fork). The problem is that systemd puts a
dependency between nfs-client (required by multi-user.target) and
rpc-statd-notify (which requires network-online), so gdm ends up
waiting for network-online. The only workaround I could get to work
was a new timer unit that simply launches sm-notify after 1 second.
nfs-client starts the timer unit, which then *asynchronously* starts
sm-notify, so sm-notify keeps its dependency on network-online. A
patch using that method will be sent in a moment (if git send-email
decides to work).
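
The effect of putting a timer between the target and the service can be
sketched as a critical-path calculation. This is a toy Python model under
stated assumptions, not how systemd computes anything: finish_time, the
ordering edges, and the start costs are hypothetical, with the 10 seconds
taken from the delay reported at the start of this thread and the
1-second timer delay ignored.

```python
def finish_time(unit, units, cache=None):
    """Earliest time `unit` is up: its latest ordering dependency plus its own cost."""
    if cache is None:
        cache = {}
    if unit not in cache:
        after, cost = units[unit]
        ready = max((finish_time(dep, units, cache) for dep in after), default=0.0)
        cache[unit] = ready + cost
    return cache[unit]


# name: (units it is ordered after, seconds it takes to start); all hypothetical.
BEFORE_PATCH = {
    "network-online.target": ([], 10.0),
    "rpc-statd-notify.service": (["network-online.target"], 0.0),
    "nfs-client.target": (["rpc-statd-notify.service"], 0.0),
    "multi-user.target": (["nfs-client.target"], 0.0),
}

# With the timer, the target only waits for the timer unit to become active;
# the service is started later and is no longer on the target's path.
AFTER_PATCH = dict(BEFORE_PATCH)
AFTER_PATCH["rpc-statd-notify.timer"] = ([], 0.0)
AFTER_PATCH["nfs-client.target"] = (["rpc-statd-notify.timer"], 0.0)

print("multi-user.target before:", finish_time("multi-user.target", BEFORE_PATCH))
print("multi-user.target after:", finish_time("multi-user.target", AFTER_PATCH))
print("sm-notify after:", finish_time("rpc-statd-notify.service", AFTER_PATCH))
```

In this model multi-user.target stops waiting for network-online.target,
while rpc-statd-notify.service itself is still ordered after it.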




2021-01-05 15:00:03

by Tom Talpey

Subject: Re: [PATCH v2] systemd: rpc-statd-notify.service can run in the background

On 1/4/2021 1:55 PM, Hackintosh 5 wrote:
> This allows rpc-statd-notify to run in the background when it is
> only in use by a client. This is done by a timer unit with a one
> second timeout, which is Wanted by nfs-client.target. The result
> is that there is no longer a dependency on network-online.target
> by multi-user.target, so everyone gets faster boot times yay.

I'm concerned that this change may allow the NFS client to start
before sm-notify has had a chance to send its "I'm back" message
to the server, and before the server has processed it. This will
lead to lock failures.

Also, I'm unclear how an apparently arbitrary 1-second delay
fixes this. Is this really a systemd thing? If so, changing the
NFS behavior is the wrong approach.

Tom.
