2021-10-27 21:28:23

by Maxime Chevallier

Subject: [RFC PATCH net] net: ipconfig: Release the rtnl_lock while waiting for carrier

While waiting for a carrier to come up on one of the netdevices, some
devices need to take the rtnl lock at some point to fully initialize
all parts of the link.

That's the case for SFP, where the rtnl is taken when a module gets
detected. This prevents mounting an NFS rootfs over an SFP link.

This means that while ipconfig waits for carriers to be detected, no SFP
module can be detected in the meantime; modules are only detected after
ipconfig times out.

This commit releases the rtnl_lock while waiting for the carrier to come
up, and re-takes it to check for the init device and carrier status.

At that point, the rtnl_lock seems to be only protecting
ic_is_init_dev().

Fixes: 73970055450e ("sfp: add SFP module support")
Signed-off-by: Maxime Chevallier <[email protected]>
---
I've sent this patch as an RFC (it doesn't look very clean indeed), since I'm
not fully familiar with the implications of modifying the locking scheme at
that point in the boot process. Please feel free to comment or suggest other
approaches.

net/ipv4/ipconfig.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 816d8aad5a68..069ae05bd0a5 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -278,7 +278,12 @@ static int __init ic_open_devs(void)
 			if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
 				goto have_carrier;

+		/* Give a chance to do complex initialization that
+		 * would require to take the rtnl lock.
+		 */
+		rtnl_unlock();
 		msleep(1);
+		rtnl_lock();

 		if (time_before(jiffies, next_msg))
 			continue;
--
2.25.4


2021-10-27 21:28:47

by Antoine Tenart

Subject: Re: [RFC PATCH net] net: ipconfig: Release the rtnl_lock while waiting for carrier

Hi Maxime,

Quoting Maxime Chevallier (2021-10-27 15:19:53)
> While waiting for a carrier to come on one of the netdevices, some
> devices will require to take the rtnl lock at some point to fully
> initialize all parts of the link.
>
> That's the case for SFP, where the rtnl is taken when a module gets
> detected. This prevents mounting an NFS rootfs over an SFP link.
>
> This means that while ipconfig waits for carriers to be detected, no SFP
> modules can be detected in the meantime, it's only detected after
> ipconfig times out.
>
> This commit releases the rtnl_lock while waiting for the carrier to come
> up, and re-takes it to check the for the init device and carrier status.
>
> At that point, the rtnl_lock seems to be only protecting
> ic_is_init_dev().
>
> Fixes: 73970055450e ("sfp: add SFP module support")

Was this working with SFP modules before?

> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
> index 816d8aad5a68..069ae05bd0a5 100644
> --- a/net/ipv4/ipconfig.c
> +++ b/net/ipv4/ipconfig.c
> @@ -278,7 +278,12 @@ static int __init ic_open_devs(void)
> if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
> goto have_carrier;
>
> + /* Give a chance to do complex initialization that
> + * would require to take the rtnl lock.
> + */
> + rtnl_unlock();
> msleep(1);
> + rtnl_lock();
>
> if (time_before(jiffies, next_msg))
> continue;

The rtnl lock is protecting 'for_each_netdev' and 'dev_change_flags' in
this function. What could happen in theory is a device gets removed from
the list or has its flags changed. I don't think that's an issue here.

Instead of releasing the lock while sleeping, you could drop the lock
before the carrier waiting loop (with a similar comment) and only
protect the above 'for_each_netdev' loop.

Antoine
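
The difference between holding the lock across the whole wait and dropping
it before the carrier-wait loop can be modelled in userspace. The sketch
below is only an analogy, not kernel code: fake_rtnl, carrier_ok,
module_detect() and wait_for_carrier() are hypothetical names, a pthread
mutex stands in for the rtnl lock, and an atomic flag stands in for
netif_carrier_ok(). It assumes the only thing blocking carrier detection
is a second thread that needs the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins: a mutex for the rtnl lock, a flag for carrier state. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool carrier_ok;

static void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/* Models module detection: it must take the lock before the link can come up. */
static void *module_detect(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&fake_rtnl);
	atomic_store(&carrier_ok, true);
	pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

/*
 * Models the carrier wait. With hold_lock set, the lock is kept for the
 * whole polling loop; otherwise it is dropped before the loop starts.
 * Returns true if the carrier was seen before timeout_ms elapsed.
 */
static bool wait_for_carrier(bool hold_lock, int timeout_ms)
{
	pthread_t detector;
	bool found = false;
	int rc;

	atomic_store(&carrier_ok, false);

	pthread_mutex_lock(&fake_rtnl);
	/* The device-opening loop would run here, under the lock. */
	rc = pthread_create(&detector, NULL, module_detect, NULL);
	assert(rc == 0);
	if (!hold_lock)
		pthread_mutex_unlock(&fake_rtnl);

	for (int i = 0; i < timeout_ms; i++) {
		if (atomic_load(&carrier_ok)) {
			found = true;
			break;
		}
		sleep_ms(1);
	}

	if (hold_lock)
		pthread_mutex_unlock(&fake_rtnl);
	pthread_join(detector, NULL);
	return found;
}
```

Calling wait_for_carrier(true, 50) times out, because the detector thread
cannot take the lock until the wait is over. Calling
wait_for_carrier(false, 200) should see the carrier shortly after the loop
starts, since the detector is free to run.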

2021-10-28 06:46:40

by Maxime Chevallier

Subject: Re: [RFC PATCH net] net: ipconfig: Release the rtnl_lock while waiting for carrier

Hello Antoine,

On Wed, 27 Oct 2021 18:05:09 +0200
Antoine Tenart <[email protected]> wrote:

>Hi Maxime,
>
>Quoting Maxime Chevallier (2021-10-27 15:19:53)
>> While waiting for a carrier to come on one of the netdevices, some
>> devices will require to take the rtnl lock at some point to fully
>> initialize all parts of the link.
>>
>> That's the case for SFP, where the rtnl is taken when a module gets
>> detected. This prevents mounting an NFS rootfs over an SFP link.
>>
>> This means that while ipconfig waits for carriers to be detected, no SFP
>> modules can be detected in the meantime, it's only detected after
>> ipconfig times out.
>>
>> This commit releases the rtnl_lock while waiting for the carrier to come
>> up, and re-takes it to check the for the init device and carrier status.
>>
>> At that point, the rtnl_lock seems to be only protecting
>> ic_is_init_dev().
>>
>> Fixes: 73970055450e ("sfp: add SFP module support")
>
>Was this working with SFP modules before?

From what I can tell, no. In that case, does it need a Fixes tag?
It seems the problem has always been there, and booting an nfsroot
never worked over SFP links.

>
>> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
>> index 816d8aad5a68..069ae05bd0a5 100644
>> --- a/net/ipv4/ipconfig.c
>> +++ b/net/ipv4/ipconfig.c
>> @@ -278,7 +278,12 @@ static int __init ic_open_devs(void)
>> if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
>> goto have_carrier;
>>
>> + /* Give a chance to do complex initialization that
>> + * would require to take the rtnl lock.
>> + */
>> + rtnl_unlock();
>> msleep(1);
>> + rtnl_lock();
>>
>> if (time_before(jiffies, next_msg))
>> continue;
>
>The rtnl lock is protecting 'for_each_netdev' and 'dev_change_flags' in
>this function. What could happen in theory is a device gets removed from
>the list or has its flags changed. I don't think that's an issue here.
>
>Instead of releasing the lock while sleeping, you could drop the lock
>before the carrier waiting loop (with a similar comment) and only
>protect the above 'for_each_netdev' loop.

Nice catch, the effect should be the same but with a much cleaner idea
of what is being protected.

I'll give it a try and respin, thanks for the review!

Maxime

>Antoine



--
Maxime Chevallier, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

2021-10-28 08:43:46

by Antoine Tenart

Subject: Re: [RFC PATCH net] net: ipconfig: Release the rtnl_lock while waiting for carrier

Quoting Maxime Chevallier (2021-10-28 08:45:20)
> On Wed, 27 Oct 2021 18:05:09 +0200
> Antoine Tenart <[email protected]> wrote:
> >Quoting Maxime Chevallier (2021-10-27 15:19:53)
> >>
> >> Fixes: 73970055450e ("sfp: add SFP module support")
> >
> >Was this working with SFP modules before?
>
> From what I can tell, no. In that case, does it need a fixes tag ?
> It seems the problem has always been there, and booting an nfsroot
> never worked over SFP links.

In that case I'd say targeting net-next is fine.

Antoine