2012-01-26 14:50:16

by Srinivas KANDAGATLA

Subject: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

From: Srinivas Kandagatla <[email protected]>

This patch reduces the delay in mounting the NFS root, which is a side
effect of the nfs-root mount retry logic, by changing the
NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX values.

The current strategy is: if do_mount_root fails, sleep for 5 seconds
before the second attempt, then double the delay (timeout <<= 1) on each
further loop, up to a maximum of 30 seconds.
For 5 retries the sleeps add up to:

5 + 10 + 20 + 30 + 30 = 95 seconds

plus the do_mount_root timeouts themselves:
each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.

This means the kernel spends about 110 seconds on successive root-mount
attempts before it panics.

With the new min and max timeouts the sleeps become:
0 + 1 + 3 + 7 + 15 = 26 seconds
plus the same
do_mount_root timeouts @ 3-4 seconds x 5 = 15 seconds.
This means the kernel spends about 41 seconds on successive root-mount
attempts before it panics.

Since do_mount_root itself times out in 3-4 seconds, that should be a
sufficient delay before the second NFS mount attempt; increasing the
delay only after that makes more sense.

I clearly see an advantage in changing these values: without this patch
my board mounts NFS in 9-10 seconds, whereas with this patch it mounts
NFS in 4-5 seconds.

Signed-off-by: Srinivas Kandagatla <[email protected]>
---
Hello All,
With the latest kernel my nfs-root mount takes about 5 seconds longer
than it did with 2.6.32: 9-10 seconds now, whereas 2.6.32 took 4-5 seconds.

However, with the modified NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX, the NFS root mounts as fast as it did in 2.6.32,
because the first NFS mount timeout by itself introduces sufficient delay before the second attempt.
I think changing the min and max values will help people NFS-boot their boards faster than they can with the 3.3 kernel.

Comments ?

Thanks.
srini



init/do_mounts.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index ef6478f..b8214ce 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -361,8 +361,8 @@ out:

#ifdef CONFIG_ROOT_NFS

-#define NFSROOT_TIMEOUT_MIN 5
-#define NFSROOT_TIMEOUT_MAX 30
+#define NFSROOT_TIMEOUT_MIN 1
+#define NFSROOT_TIMEOUT_MAX 32
#define NFSROOT_RETRY_MAX 5

static int __init mount_nfs_root(void)
@@ -390,7 +390,7 @@ static int __init mount_nfs_root(void)
break;

/* Wait, in case the server refused us immediately */
- ssleep(timeout);
+ ssleep(timeout - 1);
timeout <<= 1;
if (timeout > NFSROOT_TIMEOUT_MAX)
timeout = NFSROOT_TIMEOUT_MAX;
--
1.6.3.3



2012-01-26 15:14:32

by Srinivas KANDAGATLA

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

Jim Rees wrote:
> You may want to read the mailing list thread from the earlier patch. I
> don't remember the details but lowering the panic timeout from 110 to 41
> seconds may re-introduce the problem that patch was trying to solve. You
> could increase the retry count to compensate.
>
From the log of commit 43717c7d, "NFS: Retry mounting NFSROOT":
The original problem looks exactly the same as the one I encountered.
The first operation, an rpcbind request to determine which port the NFS
server was listening on, timed out in 2-3 seconds, and by that time the
switch was ready. The mount then fell back to the default NFS port 2049,
and at that point the NFS mount succeeded.

In my case the PHY became ready just before the second attempt, the same
as the LAN switch in the original issue.
I think most cases fall into this category.
So before that patch there was no added delay; the rpcbind timeout alone
was enough delay to get the switch (or the PHY, in my case) into a
working state.

I think introducing a sleep before the second attempt was not actually
necessary.

A user who wants more than 41 seconds of total delay will be able to
increase the retry count with my mount-retry kernel parameter patch.

Thanks,
srini

2012-01-27 08:18:54

by Srinivas KANDAGATLA

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

Chuck Lever wrote:
> On Jan 26, 2012, at 9:42 AM, Srinivas KANDAGATLA wrote:
>
>
>> From: Srinivas Kandagatla <[email protected]>
>>
>> This patch attempts to minimize the delay in nfs root mount, which
>> happens as side effect of nfs-root mount retry by changing the
>> NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX values.
>>
>> Current strategy is, if do_mount_root fails, sleep for 5 seconds for the
>> second attempt followed by a 5<<1 seconds delay for each loop with a
>> maximum of 30 seconds delay.
>> For 5 retries it would take.
>>
>> 5 + 10 + 20 + 30 + 30 = 95 Seconds
>> with
>> each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.
>>
>> Which means Kernel can only attempt the succession root-mounts or panic
>> after 110 seconds.
>>
>> So changing min and max timeouts will have the below delays.
>> 0 + 1 + 3 + 7 + 15 = 26 Seconds.
>> and with
>> each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.
>> Which means Kernel can only attempt the succession root-mounts or panic
>> after 41 seconds
>>
>> As, do_mount_root timesout in 3-4 seconds which should be sufficient
>> delay to start of the second nfs mount attempt and increasing delay
>> after that makes more sense.
>>
>> I clearly see an advantange in changing these values because, Without
>> this patch my board mounts nfs in 9-10 seconds, however with this patch
>> can mount nfs in 4-5 seconds.
>>
>
> This feels like tuning the default settings for a very specific set up. In the original thread for this work, 41 seconds would probably not be long enough for the network switch to enable the port.
>
You are probably right.
We could address that by increasing NFSROOT_RETRY_MAX from 5 to 7.

However, I want to highlight another major issue with having the minimum
timeout start at 5.
Most boards (or SoCs) have a built-in Ethernet MAC and an external
Ethernet PHY. The Linux PHY framework (and its PHY state machine) takes
on average 2-3 seconds from phy_start to get the Ethernet PHY into a
link-up state.
All such boards, which used to mount the NFS root in 3-4 seconds before
the original "NFS: Retry mounting NFSROOT" patch and a few other patches,
now mount it in 9-10 seconds.
This is the case for all the ST boards, and most embedded boards with
external PHYs fall into the same category.

Having the timeout start from 0 will address the use case mentioned
above, and increasing the retries from 5 to 7 will bring the total NFS
root timeout to 109 seconds:

0 + 1 + 3 + 7 + 15 + 31 + 31 = 88 seconds of sleep
+
do_mount_root timeouts @ 3-4 seconds x 7 = 21 seconds
= 109 seconds in total


Finally, the intention of the patch is to address most use cases, not
just the one I hit: increasing the retries to 7 should cover the
original network switch case, and reducing the minimum timeout should
let other boards mount the NFS root as quickly as they used to.

Let me know if this sounds OK to you, so that I can generate a new patch.



Thanks,
srini
> I don't have a better solution at this time, but I think the current defaults will work (possibly with added delay) on most systems, whereas the proposed settings will probably result in more panics. I prefer to keep the current settings until we have a solution that doesn't break other systems.
>
>
>> Signed-off-by: Srinivas Kandagatla <[email protected]>
>> ---
>> Hello All,
>> With latest kernel I can see that my nfs-root mounts with big delay
>> of 5 seconds when compared to 2.6.32. It took 9-10 seconds, where as in 2.6.32 it took 4-5 seconds.
>>
>> However with modifications to NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX, nfs root mounts as it used to do it in 2.6.32.
>> As first nfs mount timeout itself introduces sufficient delay to start the second retry.
>> I think changing the min-max values will help people to nfs boot there boards faster than it is in 3.3 kernel.
>>
>> Comments ?
>>
>> Thanks.
>> srini
>>
>>
>>
>> init/do_mounts.c | 6 +++---
>> 1 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/init/do_mounts.c b/init/do_mounts.c
>> index ef6478f..b8214ce 100644
>> --- a/init/do_mounts.c
>> +++ b/init/do_mounts.c
>> @@ -361,8 +361,8 @@ out:
>>
>> #ifdef CONFIG_ROOT_NFS
>>
>> -#define NFSROOT_TIMEOUT_MIN 5
>> -#define NFSROOT_TIMEOUT_MAX 30
>> +#define NFSROOT_TIMEOUT_MIN 1
>> +#define NFSROOT_TIMEOUT_MAX 32
>> #define NFSROOT_RETRY_MAX 5
>>
>> static int __init mount_nfs_root(void)
>> @@ -390,7 +390,7 @@ static int __init mount_nfs_root(void)
>> break;
>>
>> /* Wait, in case the server refused us immediately */
>> - ssleep(timeout);
>> + ssleep(timeout - 1);
>> timeout <<= 1;
>> if (timeout > NFSROOT_TIMEOUT_MAX)
>> timeout = NFSROOT_TIMEOUT_MAX;
>> --
>> 1.6.3.3
>>
>>
>
>


2012-01-27 12:34:33

by Jim Rees

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

I hate timeouts. Isn't there some way to wait until the interface is up
before attempting the mount?

2012-01-30 07:54:46

by Srinivas KANDAGATLA

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

Jim Rees wrote:
> I hate timeouts. Isn't there some way to wait until the interface is up
> before attempting the mount?
>
I think the mount waits until ipconfig has finished; however, the actual
link may not be up at that time (in the case of static IPs).
Waiting for the link to come up can have the adverse side effect of
waiting forever when the cable isn't connected or the network is not
reachable.

Retrying is the best option at the moment.






2012-01-26 15:35:48

by Chuck Lever III

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.


On Jan 26, 2012, at 9:42 AM, Srinivas KANDAGATLA wrote:

> From: Srinivas Kandagatla <[email protected]>
>
> This patch attempts to minimize the delay in nfs root mount, which
> happens as side effect of nfs-root mount retry by changing the
> NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX values.
>
> Current strategy is, if do_mount_root fails, sleep for 5 seconds for the
> second attempt followed by a 5<<1 seconds delay for each loop with a
> maximum of 30 seconds delay.
> For 5 retries it would take.
>
> 5 + 10 + 20 + 30 + 30 = 95 Seconds
> with
> each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.
>
> Which means Kernel can only attempt the succession root-mounts or panic
> after 110 seconds.
>
> So changing min and max timeouts will have the below delays.
> 0 + 1 + 3 + 7 + 15 = 26 Seconds.
> and with
> each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.
> Which means Kernel can only attempt the succession root-mounts or panic
> after 41 seconds
>
> As, do_mount_root timesout in 3-4 seconds which should be sufficient
> delay to start of the second nfs mount attempt and increasing delay
> after that makes more sense.
>
> I clearly see an advantange in changing these values because, Without
> this patch my board mounts nfs in 9-10 seconds, however with this patch
> can mount nfs in 4-5 seconds.

This feels like tuning the default settings for a very specific set up. In the original thread for this work, 41 seconds would probably not be long enough for the network switch to enable the port.

I don't have a better solution at this time, but I think the current defaults will work (possibly with added delay) on most systems, whereas the proposed settings will probably result in more panics. I prefer to keep the current settings until we have a solution that doesn't break other systems.

> Signed-off-by: Srinivas Kandagatla <[email protected]>
> ---
> Hello All,
> With latest kernel I can see that my nfs-root mounts with big delay
> of 5 seconds when compared to 2.6.32. It took 9-10 seconds, where as in 2.6.32 it took 4-5 seconds.
>
> However with modifications to NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX, nfs root mounts as it used to do it in 2.6.32.
> As first nfs mount timeout itself introduces sufficient delay to start the second retry.
> I think changing the min-max values will help people to nfs boot there boards faster than it is in 3.3 kernel.
>
> Comments ?
>
> Thanks.
> srini
>
>
>
> init/do_mounts.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/init/do_mounts.c b/init/do_mounts.c
> index ef6478f..b8214ce 100644
> --- a/init/do_mounts.c
> +++ b/init/do_mounts.c
> @@ -361,8 +361,8 @@ out:
>
> #ifdef CONFIG_ROOT_NFS
>
> -#define NFSROOT_TIMEOUT_MIN 5
> -#define NFSROOT_TIMEOUT_MAX 30
> +#define NFSROOT_TIMEOUT_MIN 1
> +#define NFSROOT_TIMEOUT_MAX 32
> #define NFSROOT_RETRY_MAX 5
>
> static int __init mount_nfs_root(void)
> @@ -390,7 +390,7 @@ static int __init mount_nfs_root(void)
> break;
>
> /* Wait, in case the server refused us immediately */
> - ssleep(timeout);
> + ssleep(timeout - 1);
> timeout <<= 1;
> if (timeout > NFSROOT_TIMEOUT_MAX)
> timeout = NFSROOT_TIMEOUT_MAX;
> --
> 1.6.3.3
>

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2012-01-26 14:55:24

by Jim Rees

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

You may want to read the mailing list thread from the earlier patch. I
don't remember the details but lowering the panic timeout from 110 to 41
seconds may re-introduce the problem that patch was trying to solve. You
could increase the retry count to compensate.

2012-02-03 10:19:19

by Srinivas KANDAGATLA

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

Hi Chuck,

I am resending the patch with the retry count modified so that the total
delay stays about the same as it is today.

As I said in my previous email, the intention of the patch is to address most use cases.


Thanks
srini

Srinivas Kandagatla wrote:
> Chuck Lever wrote:
>> On Jan 26, 2012, at 9:42 AM, Srinivas KANDAGATLA wrote:
>>
>>
>>> From: Srinivas Kandagatla <[email protected]>
>>>
>>> This patch attempts to minimize the delay in nfs root mount, which
>>> happens as side effect of nfs-root mount retry by changing the
>>> NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX values.
>>>
>>> Current strategy is, if do_mount_root fails, sleep for 5 seconds for the
>>> second attempt followed by a 5<<1 seconds delay for each loop with a
>>> maximum of 30 seconds delay.
>>> For 5 retries it would take.
>>>
>>> 5 + 10 + 20 + 30 + 30 = 95 Seconds
>>> with
>>> each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.
>>>
>>> Which means Kernel can only attempt the succession root-mounts or panic
>>> after 110 seconds.
>>>
>>> So changing min and max timeouts will have the below delays.
>>> 0 + 1 + 3 + 7 + 15 = 26 Seconds.
>>> and with
>>> each do_mount_root timeout @ 3-4 seconds x 5 = 15 seconds.
>>> Which means Kernel can only attempt the succession root-mounts or panic
>>> after 41 seconds
>>>
>>> As, do_mount_root timesout in 3-4 seconds which should be sufficient
>>> delay to start of the second nfs mount attempt and increasing delay
>>> after that makes more sense.
>>>
>>> I clearly see an advantange in changing these values because, Without
>>> this patch my board mounts nfs in 9-10 seconds, however with this patch
>>> can mount nfs in 4-5 seconds.
>>>
>> This feels like tuning the default settings for a very specific set up. In the original thread for this work, 41 seconds would probably not be long enough for the network switch to enable the port.
>>
> Probably you might be right.
> We could address that by increasing NFSROOT_RETRY_MAX to 7 from 5.
>
> However, I want to highlight an another major issue of having an min
> timeout start at 5.
> Most of the boards (or SOC's) have in-build ethernet MAC's and external
> Ethernet PHY's, Linux PHY (& phy state-machine) framework takes on an
> average of 2-3 seconds from phy_start to get the ethernet PHY into a
> Link-up state.
> All those above boards which used to mount NFS in 3-4 seconds prior to
> the original "NFS: Retry mounting NFSROOT" patch and few other patches,
> NOW mounts NFS root in 9-10 seconds.
> Which is the case with all the ST boards and most of the embedded boards
> with external PHY's fall in.
>
> Having the timeout start from 0 will address the uses case mentioned
> above and increasing the loops from 5 to 8 will increase the total NFS
> root timeout to 109 seconds.
>
> 0 + 1 + 3 + 7 + 15 + 31 + 31 = 88
> +
> each do_mount_root timeout @ 3-4 seconds x 7 = 21 = 109 seconds
>
>
> Finally the intention of the patch was to address most of the use-cases
> and not just the one I hit, increasing the retries to 7 should address
> the orignal network switch case and reducing the timeout-min should help
> other boards do nfs mounts as quickly as it used to happen before.
>
> Let me know if this sounds OK to you so that I can generate a new patch?
>
>
>
> Thanks,
> srini
>> I don't have a better solution at this time, but I think the current defaults will work (possibly with added delay) on most systems, whereas the proposed settings will probably result in more panics. I prefer to keep the current settings until we have a solution that doesn't break other systems.
>>
>>
>>> Signed-off-by: Srinivas Kandagatla <[email protected]>
>>> ---
>>> Hello All,
>>> With latest kernel I can see that my nfs-root mounts with big delay
>>> of 5 seconds when compared to 2.6.32. It took 9-10 seconds, where as in 2.6.32 it took 4-5 seconds.
>>>
>>> However with modifications to NFSROOT_TIMEOUT_MIN and NFSROOT_TIMEOUT_MAX, nfs root mounts as it used to do it in 2.6.32.
>>> As first nfs mount timeout itself introduces sufficient delay to start the second retry.
>>> I think changing the min-max values will help people to nfs boot there boards faster than it is in 3.3 kernel.
>>>
>>> Comments ?
>>>
>>> Thanks.
>>> srini
>>>
>>>
>>>
>>> init/do_mounts.c | 6 +++---
>>> 1 files changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/init/do_mounts.c b/init/do_mounts.c
>>> index ef6478f..b8214ce 100644
>>> --- a/init/do_mounts.c
>>> +++ b/init/do_mounts.c
>>> @@ -361,8 +361,8 @@ out:
>>>
>>> #ifdef CONFIG_ROOT_NFS
>>>
>>> -#define NFSROOT_TIMEOUT_MIN 5
>>> -#define NFSROOT_TIMEOUT_MAX 30
>>> +#define NFSROOT_TIMEOUT_MIN 1
>>> +#define NFSROOT_TIMEOUT_MAX 32
>>> #define NFSROOT_RETRY_MAX 5
>>>
>>> static int __init mount_nfs_root(void)
>>> @@ -390,7 +390,7 @@ static int __init mount_nfs_root(void)
>>> break;
>>>
>>> /* Wait, in case the server refused us immediately */
>>> - ssleep(timeout);
>>> + ssleep(timeout - 1);
>>> timeout <<= 1;
>>> if (timeout > NFSROOT_TIMEOUT_MAX)
>>> timeout = NFSROOT_TIMEOUT_MAX;
>>> --
>>> 1.6.3.3
>>>
>>>
>>
>
>


Attachments:
0001-do_mounts-Change-the-nfs-mount-retry-min-max-delays.patch (2.96 kB)

2012-02-03 13:16:44

by Jim Rees

Subject: Re: [PATCH 3.3.0-rc1 1/2] do_mounts: Change the nfs-mount retry min max delays.

It seems confusing to change the meaning of NFSROOT_TIMEOUT_MIN. It used to
be the timeout minimum, as the name implies, but now it's the timeout
minimum plus one. I understand that the actual timeout includes the time
spent in mount, but I would expect NFSROOT_TIMEOUT_MIN to just be the
initial loop sleep time. Maybe this just needs a comment.