Modern kernels no longer return all timeout errors; instead, the
process spins endlessly in the kernel. This behavior causes the
foreground mount to hang, never allowing the mount to go into the
background.

So this patch eliminates the foreground mount attempt, causing
background mounts to go directly into the background.

Signed-off-by: Steve Dickson <[email protected]>
---
utils/mount/stropts.c | 31 ++++++++-----------------------
1 files changed, 8 insertions(+), 23 deletions(-)
diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
index a642394..92a7245 100644
--- a/utils/mount/stropts.c
+++ b/utils/mount/stropts.c
@@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
}
/*
- * Handle "background" NFS mount [first try]
- *
- * Returns a valid mount command exit code.
- *
- * EX_BG should cause the caller to fork and invoke nfsmount_child.
- */
-static int nfsmount_parent(struct nfsmount_info *mi)
-{
- if (nfs_try_mount(mi))
- return EX_SUCCESS;
-
- /* retry background mounts when the server is not up */
- if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
- mount_error(mi->spec, mi->node, errno);
- return EX_FAIL;
- }
-
- sys_mount_errors(mi->hostname, errno, 1, 1);
- return EX_BG;
-}
-
-/*
* Handle "background" NFS mount [retry daemon]
*
* Returns a valid mount command exit code: EX_SUCCESS if successful,
@@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
static int nfsmount_bg(struct nfsmount_info *mi)
{
if (!mi->child)
- return nfsmount_parent(mi);
+ /*
+ * Modern kernels no longer return timeout
+ * errors in all cases; instead, the process
+ * spins in the kernel, which will hang a
+ * foreground mount. So background mounts
+ * have to go directly into the background.
+ */
+ return EX_BG;
else
return nfsmount_child(mi);
}
--
1.7.1
On 03/11/2014 05:13 PM, Trond Myklebust wrote:
>
> On Mar 8, 2014, at 8:22, Steve Dickson <[email protected]> wrote:
>
>> Modern day kernel will no longer return all timeout
>> errors instead the process spins endlessly in the kernel.
>> This behavior will cause the foreground mount to hang, never
>> allowing the mount to go into background.
>>
>> So this patch eliminates the foreground mount cause
>> background mounts to go directly into background
>>
>> Signed-off-by: Steve Dickson <[email protected]>
>> ---
>> utils/mount/stropts.c | 31 ++++++++-----------------------
>> 1 files changed, 8 insertions(+), 23 deletions(-)
>>
>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>> index a642394..92a7245 100644
>> --- a/utils/mount/stropts.c
>> +++ b/utils/mount/stropts.c
>> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
>> }
>>
>> /*
>> - * Handle "background" NFS mount [first try]
>> - *
>> - * Returns a valid mount command exit code.
>> - *
>> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
>> - */
>> -static int nfsmount_parent(struct nfsmount_info *mi)
>> -{
>> - if (nfs_try_mount(mi))
>> - return EX_SUCCESS;
>> -
>> - /* retry background mounts when the server is not up */
>> - if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
>> - mount_error(mi->spec, mi->node, errno);
>> - return EX_FAIL;
>> - }
>> -
>> - sys_mount_errors(mi->hostname, errno, 1, 1);
>> - return EX_BG;
>> -}
>> -
>> -/*
>> * Handle "background" NFS mount [retry daemon]
>> *
>> * Returns a valid mount command exit code: EX_SUCCESS if successful,
>> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
>> static int nfsmount_bg(struct nfsmount_info *mi)
>> {
>> if (!mi->child)
>> - return nfsmount_parent(mi);
>> + /*
>> + * Modern day kernels no longer return all
>> + * timeouts errors in all cases, instead
>> + * the process spins in the kernel, which
>> + * will hang a foreground mount. So background
>> + * mounts have to go directly into background
>> + */
>> + return EX_BG;
>> else
>> return nfsmount_child(mi);
>> }
>
> Hi Steve,
>
> Doesn't this mean that 'mount.nfs' will no longer attempt to wait
> for the mount to complete?
Yes. The foreground will no longer be tried....
> That's why I suggested having the parent set a timer, and then
> waiting for whichever comes first out of SIGCHLD or SIGALRM (indicating
> either that the child mount process is done mounting
> or that the timeout occurred).
Why wait? Kernels today no longer return on timeouts or ECONNREFUSED,
so instead of having the foreground mounts hang forever, why not
just let the background mounts hang forever?
steved.
On Mon, 10 Mar 2014 10:18:16 +1100 NeilBrown <[email protected]> wrote:
> On Sat, 8 Mar 2014 08:22:44 -0500 Steve Dickson <[email protected]> wrote:
>
> > Modern day kernel will no longer return all timeout
> > errors instead the process spins endlessly in the kernel.
> > This behavior will cause the foreground mount to hang, never
> > allowing the mount to go into background.
> >
> > So this patch eliminates the foreground mount cause
> > background mounts to go directly into background
> >
> > Signed-off-by: Steve Dickson <[email protected]>
>
> Did NFS mounts *ever* time out (when 'soft' isn't given)?
>
> If so, there is probably a regression which maybe should be fixed.
>
> If not, then this has always been a bug since string options were introduced
> and the kernel started doing the mountd filehandle lookup...
>
> So I'm probably OK with the patch but I wonder if there is more of a story
> behind this that we should be sure we understand...
>
> Thanks,
> NeilBrown
Sorry, I really should read email in chronological order, shouldn't I :-)
The foregoing discussion seemed to focus on NFSv4. Does NFSv3 behave the
same way? A quick test suggests that it doesn't.
mount -o bg,vers=3 server:/path /mnt
backgrounds as it should after a few seconds.
So I suspect this patch should be version dependent?
However falling-back from v4 (which we leave entirely to the kernel) to v3
could be a bit awkward.
I think that an NFSv4 mount really does need to time out - whether that
happens in the kernel or through the use of "alarm()" doesn't really bother
me. But if it doesn't time out, then it is violating the documentation and
ignoring the retry= setting.
NeilBrown
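As an illustration of the alarm() approach mentioned above, here is a minimal sketch in C. It is not code from nfs-utils: read_with_alarm() and on_alarm() are hypothetical names, and a blocking read() on a pipe stands in for the mount attempt. Whether a mount(2) call that is spinning in the kernel would actually return EINTR when the signal arrives is an assumption this sketch does not verify.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Empty handler: its only job is to make the blocked call fail with EINTR. */
static void on_alarm(int signo)
{
	(void)signo;
}

/*
 * Hypothetical helper: run a blocking read() on fd, but give up after
 * timeout_secs. Returns the byte count on success, or -1 with errno set
 * (ETIMEDOUT if the alarm interrupted the call).
 * read() is only a stand-in for the mount attempt.
 */
static ssize_t read_with_alarm(int fd, void *buf, size_t len,
			       unsigned int timeout_secs)
{
	struct sigaction sa, old;
	ssize_t n;
	int saved;

	assert(timeout_secs > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;	/* no SA_RESTART, so the call is not restarted */
	if (sigaction(SIGALRM, &sa, &old) < 0)
		return -1;

	alarm(timeout_secs);
	n = read(fd, buf, len);
	saved = errno;
	alarm(0);		/* cancel the alarm if it has not fired yet */
	sigaction(SIGALRM, &old, NULL);

	if (n < 0 && saved == EINTR)
		saved = ETIMEDOUT;
	errno = saved;
	return n;
}
```

A caller would treat ETIMEDOUT the way the old code treated a kernel timeout, that is, as the cue to go into the background. The concern raised elsewhere in this thread still applies: the sketch relies on the kernel honoring the signal at the point where the process is waiting.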
On Sat, 8 Mar 2014 08:22:44 -0500 Steve Dickson <[email protected]> wrote:
> Modern day kernel will no longer return all timeout
> errors instead the process spins endlessly in the kernel.
> This behavior will cause the foreground mount to hang, never
> allowing the mount to go into background.
>
> So this patch eliminates the foreground mount cause
> background mounts to go directly into background
>
> Signed-off-by: Steve Dickson <[email protected]>
Did NFS mounts *ever* time out (when 'soft' isn't given)?
If so, there is probably a regression which maybe should be fixed.
If not, then this has always been a bug since string options were introduced
and the kernel started doing the mountd filehandle lookup...
So I'm probably OK with the patch but I wonder if there is more of a story
behind this that we should be sure we understand...
Thanks,
NeilBrown
> ---
> utils/mount/stropts.c | 31 ++++++++-----------------------
> 1 files changed, 8 insertions(+), 23 deletions(-)
>
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index a642394..92a7245 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
> }
>
> /*
> - * Handle "background" NFS mount [first try]
> - *
> - * Returns a valid mount command exit code.
> - *
> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
> - */
> -static int nfsmount_parent(struct nfsmount_info *mi)
> -{
> - if (nfs_try_mount(mi))
> - return EX_SUCCESS;
> -
> - /* retry background mounts when the server is not up */
> - if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> - mount_error(mi->spec, mi->node, errno);
> - return EX_FAIL;
> - }
> -
> - sys_mount_errors(mi->hostname, errno, 1, 1);
> - return EX_BG;
> -}
> -
> -/*
> * Handle "background" NFS mount [retry daemon]
> *
> * Returns a valid mount command exit code: EX_SUCCESS if successful,
> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
> static int nfsmount_bg(struct nfsmount_info *mi)
> {
> if (!mi->child)
> - return nfsmount_parent(mi);
> + /*
> + * Modern day kernels no longer return all
> + * timeouts errors in all cases, instead
> + * the process spins in the kernel, which
> + * will hang a foreground mount. So background
> + * mounts have to go directly into background
> + */
> + return EX_BG;
> else
> return nfsmount_child(mi);
> }
Hey Neil,
On 03/09/2014 07:43 PM, NeilBrown wrote:
> On Mon, 10 Mar 2014 10:18:16 +1100 NeilBrown <[email protected]> wrote:
>
>> On Sat, 8 Mar 2014 08:22:44 -0500 Steve Dickson <[email protected]> wrote:
>>
>>> Modern day kernel will no longer return all timeout
>>> errors instead the process spins endlessly in the kernel.
>>> This behavior will cause the foreground mount to hang, never
>>> allowing the mount to go into background.
>>>
>>> So this patch eliminates the foreground mount cause
>>> background mounts to go directly into background
>>>
>>> Signed-off-by: Steve Dickson <[email protected]>
>>
>> Did NFS mounts *ever* time out (when 'soft' isn't given)?
>>
>> If so, there is probably a regression which maybe should be fixed.
>>
>> If not, then this has always been a bug since string options were introduced
>> and the kernel started doing the mountd filehandle lookup...
>>
>> So I'm probably OK with the patch but I wonder if there is more of a story
>> behind this that we should be sure we understand...
>>
>> Thanks,
>> NeilBrown
>
> Sorry, I really should read email in chronological order, shouldn't I :-)
>
> The foregoing discussion seemed to focus on NFSv4. Does NFSv3 behave the
> same way? A quick test suggests that it doesn't.
> mount -o bg,vers=3 server:/path /mnt
> backgrounds as it should after a few seconds.
Right... After the first RPC timeout the kernel returns ETIMEDOUT and
the mount goes into the background... This is what I was trying
to simulate with my first set of patches....
>
> So I suspect this patch should be version dependent?
It could be, but I'm not sure it's necessary.
>
> However falling-back from v4 (which we leave entirely to the kernel) to v3
> could be a bit awkward.
Why? Being in the foreground or background should not affect any kind of mount behavior.
>
> I think that an NFSv4 mount really does need to timeout - whether that
> happens in the kernel or through the use of "alarm()" doesn't really bother
> me. But if it doesn't timeout, then it is violating the documentation and
> ignoring the retry= setting.
Yes... retry= does talk about different timeouts on fg and bg mounts....
I guess if the mount goes directly into bg, the first timeout
is not tried...
At the end of the day, I too would prefer that the kernel return timeouts
like it once did... but unfortunately that is not going to happen.
The new way is to spin in the kernel and hang the process... forever...
which is unfortunate... IMHO...
The only thing I have against using alarm() is that in the past NFS and signals
have not always played nicely together... Interrupting the kernel
in critical parts of the code is always dicey at best... but
maybe this is no longer the case.
One question I do have: does anybody know why the first foreground mount
was even tried before going into the background? Since bg mounts are usually
used in fstabs, going directly into the background would definitely
speed up mounts during booting...
steved.
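For readers following the retry= point above: the option bounds how long mount keeps retrying before giving up. The following is a minimal sketch in C of a deadline-bounded retry loop with capped exponential backoff. It is not the nfs-utils implementation; retry_until_deadline(), struct fake_server and fake_mount() are hypothetical names, the backoff values are assumptions, and the deadline is taken in seconds only to keep the example quick to run (retry= itself is expressed in minutes).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a server that comes up after a few attempts. */
struct fake_server {
	int attempts;	/* attempts made so far */
	int up_after;	/* attempt number that first succeeds */
};

/* Hypothetical stand-in for one mount attempt. */
static bool fake_mount(void *arg)
{
	struct fake_server *s = arg;

	return ++s->attempts >= s->up_after;
}

/*
 * Call try_once() until it succeeds or retry_secs have elapsed.
 * Sleeps between attempts, doubling the delay up to max_backoff_secs.
 * Returns true on success, false once the deadline has passed.
 */
static bool retry_until_deadline(bool (*try_once)(void *), void *arg,
				 unsigned int retry_secs,
				 unsigned int max_backoff_secs)
{
	time_t deadline = time(NULL) + (time_t)retry_secs;
	unsigned int backoff = 1;

	assert(max_backoff_secs > 0);

	for (;;) {
		if (try_once(arg))
			return true;
		if (time(NULL) >= deadline)
			return false;
		sleep(backoff);
		backoff *= 2;
		if (backoff > max_backoff_secs)
			backoff = max_backoff_secs;
	}
}
```

Note that a loop like this only enforces the deadline if each attempt returns. If an attempt blocks in the kernel indefinitely, as described in this thread, the deadline check is never reached, which is why retry= ends up being ignored in that case.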
On Mar 11, 2014, at 18:56, Steve Dickson <[email protected]> wrote:
>
>
> On 03/11/2014 05:13 PM, Trond Myklebust wrote:
>>
>> On Mar 8, 2014, at 8:22, Steve Dickson <[email protected]> wrote:
>>
>>> Modern day kernel will no longer return all timeout
>>> errors instead the process spins endlessly in the kernel.
>>> This behavior will cause the foreground mount to hang, never
>>> allowing the mount to go into background.
>>>
>>> So this patch eliminates the foreground mount cause
>>> background mounts to go directly into background
>>>
>>> Signed-off-by: Steve Dickson <[email protected]>
>>> ---
>>> utils/mount/stropts.c | 31 ++++++++-----------------------
>>> 1 files changed, 8 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
>>> index a642394..92a7245 100644
>>> --- a/utils/mount/stropts.c
>>> +++ b/utils/mount/stropts.c
>>> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
>>> }
>>>
>>> /*
>>> - * Handle "background" NFS mount [first try]
>>> - *
>>> - * Returns a valid mount command exit code.
>>> - *
>>> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
>>> - */
>>> -static int nfsmount_parent(struct nfsmount_info *mi)
>>> -{
>>> - if (nfs_try_mount(mi))
>>> - return EX_SUCCESS;
>>> -
>>> - /* retry background mounts when the server is not up */
>>> - if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
>>> - mount_error(mi->spec, mi->node, errno);
>>> - return EX_FAIL;
>>> - }
>>> -
>>> - sys_mount_errors(mi->hostname, errno, 1, 1);
>>> - return EX_BG;
>>> -}
>>> -
>>> -/*
>>> * Handle "background" NFS mount [retry daemon]
>>> *
>>> * Returns a valid mount command exit code: EX_SUCCESS if successful,
>>> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
>>> static int nfsmount_bg(struct nfsmount_info *mi)
>>> {
>>> if (!mi->child)
>>> - return nfsmount_parent(mi);
>>> + /*
>>> + * Modern day kernels no longer return all
>>> + * timeouts errors in all cases, instead
>>> + * the process spins in the kernel, which
>>> + * will hang a foreground mount. So background
>>> + * mounts have to go directly into background
>>> + */
>>> + return EX_BG;
>>> else
>>> return nfsmount_child(mi);
>>> }
>>
>> Hi Steve,
>>
>> Doesn't this mean that 'mount.nfs' will no longer attempt to wait
>> for the mount to complete?
> Yes. The foreground will no longer be tried....
>
>> That's why I suggested having the parent set a timer, and then
>> waiting for whichever comes first out of SIGCHLD or SIGALRM (indicating
>> either that the child mount process is done mounting
>> or that the timeout occurred).
> Why wait? Kernels today no longer return on timeouts or ECONNREFUSED
> so instead of have the foreground mounts hang forever why not
> just let the background mounts hang, forever?
That's the point; we would let the backgrounded mount.nfs hang for as long as it takes. However the foreground 'mount.nfs' process would exit after the user-specified timeout.
The reason for wanting to wait is so that the 'bg' entries in /etc/fstab are either known to be fully mounted, or timed out before we allow the 'mount remote filesystems' init process to complete. If not, the bootup will be unpredictable, with nobody being able to rely on the mounts being present even in situations where an 'fg' mount would succeed instantly.
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
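The suggestion in the message above (the parent sets a timer, then waits for whichever of SIGCHLD or SIGALRM arrives first) could be sketched roughly as follows. This is a minimal illustration in C, not the nfs-utils code: run_with_timeout(), enum wait_result, mount_succeeds() and mount_hangs() are hypothetical names, and the child's work is a stand-in for the real mount attempt. It assumes Linux behavior, where a blocked SIGCHLD stays pending even with the default disposition.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum wait_result { CHILD_DONE, CHILD_STILL_RUNNING, WAIT_ERROR };

/* Hypothetical stand-ins for the child's mount attempt. */
static int mount_succeeds(void) { return 0; }
static int mount_hangs(void) { sleep(30); return 1; }

/*
 * Fork a child that runs work(), then wait in the parent for whichever
 * comes first: SIGCHLD (child finished) or SIGALRM (timeout expired).
 * On timeout the child is left running in the background.
 */
static enum wait_result run_with_timeout(int (*work)(void),
					 unsigned int timeout_secs,
					 int *exit_code, pid_t *child)
{
	sigset_t set, alrm, old, pending;
	enum wait_result res = WAIT_ERROR;
	int sig, status;
	pid_t pid;

	assert(timeout_secs > 0);

	/* Block both signals so they stay pending until sigwait() takes them. */
	sigemptyset(&set);
	sigaddset(&set, SIGCHLD);
	sigaddset(&set, SIGALRM);
	if (sigprocmask(SIG_BLOCK, &set, &old) < 0)
		return WAIT_ERROR;

	pid = fork();
	if (pid == 0) {
		/* Child: restore the signal mask and do the work. */
		sigprocmask(SIG_SETMASK, &old, NULL);
		_exit(work());
	}
	if (pid > 0) {
		*child = pid;
		alarm(timeout_secs);
		if (sigwait(&set, &sig) == 0) {
			if (sig == SIGALRM) {
				res = CHILD_STILL_RUNNING;
			} else if (waitpid(pid, &status, 0) == pid &&
				   WIFEXITED(status)) {
				*exit_code = WEXITSTATUS(status);
				res = CHILD_DONE;
			}
		}
		alarm(0);
		/* Discard a SIGALRM that fired just before alarm(0). */
		sigpending(&pending);
		if (sigismember(&pending, SIGALRM)) {
			sigemptyset(&alrm);
			sigaddset(&alrm, SIGALRM);
			sigwait(&alrm, &sig);
		}
	}
	sigprocmask(SIG_SETMASK, &old, NULL);
	return res;
}
```

With this shape the foreground process returns after at most timeout_secs, while a child that is still stuck keeps trying in the background. The parent learns whether the mount completed in time, which is the property wanted for 'bg' entries at boot.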
On Mar 8, 2014, at 8:22, Steve Dickson <[email protected]> wrote:
> Modern day kernel will no longer return all timeout
> errors instead the process spins endlessly in the kernel.
> This behavior will cause the foreground mount to hang, never
> allowing the mount to go into background.
>
> So this patch eliminates the foreground mount cause
> background mounts to go directly into background
>
> Signed-off-by: Steve Dickson <[email protected]>
> ---
> utils/mount/stropts.c | 31 ++++++++-----------------------
> 1 files changed, 8 insertions(+), 23 deletions(-)
>
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index a642394..92a7245 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -913,28 +913,6 @@ static int nfsmount_fg(struct nfsmount_info *mi)
> }
>
> /*
> - * Handle "background" NFS mount [first try]
> - *
> - * Returns a valid mount command exit code.
> - *
> - * EX_BG should cause the caller to fork and invoke nfsmount_child.
> - */
> -static int nfsmount_parent(struct nfsmount_info *mi)
> -{
> - if (nfs_try_mount(mi))
> - return EX_SUCCESS;
> -
> - /* retry background mounts when the server is not up */
> - if (nfs_is_permanent_error(errno) && errno != EOPNOTSUPP) {
> - mount_error(mi->spec, mi->node, errno);
> - return EX_FAIL;
> - }
> -
> - sys_mount_errors(mi->hostname, errno, 1, 1);
> - return EX_BG;
> -}
> -
> -/*
> * Handle "background" NFS mount [retry daemon]
> *
> * Returns a valid mount command exit code: EX_SUCCESS if successful,
> @@ -982,7 +960,14 @@ static int nfsmount_child(struct nfsmount_info *mi)
> static int nfsmount_bg(struct nfsmount_info *mi)
> {
> if (!mi->child)
> - return nfsmount_parent(mi);
> + /*
> + * Modern day kernels no longer return all
> + * timeouts errors in all cases, instead
> + * the process spins in the kernel, which
> + * will hang a foreground mount. So background
> + * mounts have to go directly into background
> + */
> + return EX_BG;
> else
> return nfsmount_child(mi);
> }
Hi Steve,
Doesn't this mean that 'mount.nfs' will no longer attempt to wait for the mount to complete? That's why I suggested having the parent set a timer, and then waiting for whichever comes first out of SIGCHLD or SIGALRM (indicating either that the child mount process is done mounting or that the timeout occurred).
Cheers
Trond
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]