2019-06-11 10:04:20

by Jonas Bonn

[permalink] [raw]
Subject: [PATCH 1/1] Address regression in inet6_validate_link_af

Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
with systemd 241. In that revision, systemd-networkd fails to pass the
required flags early enough. This appears to be addressed in later
versions of systemd, but for users of version 241 where systemd-networkd
nonetheless worked with earlier kernels, the strict check introduced by
the patch causes a regression in behaviour.

This patch converts the failure to supply the required flags from an
error into a warning. With this, systemd-networkd version 241 once
again is able to bring up the link, albeit not quite as intended and
thereby with a warning in the kernel log.

CC: Maxim Mikityanskiy <[email protected]>
CC: David S. Miller <[email protected]>
CC: Alexey Kuznetsov <[email protected]>
CC: Hideaki YOSHIFUJI <[email protected]>
Signed-off-by: Jonas Bonn <[email protected]>
---
net/ipv6/addrconf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 081bb517e40d..e2477bf92e12 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct net_device *dev,
return err;

if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
- return -EINVAL;
+ net_warn_ratelimited(
+ "required link flag omitted: TOKEN/ADDR_GEN_MODE\n");

if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
--
2.20.1


2019-06-12 14:48:43

by Maxim Mikityanskiy

[permalink] [raw]
Subject: Re: [PATCH 1/1] Address regression in inet6_validate_link_af

On 2019-06-11 13:03, Jonas Bonn wrote:
> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
> with systemd 241. In that revision, systemd-networkd fails to pass the
> required flags early enough. This appears to be addressed in later
> versions of systemd, but for users of version 241 where systemd-networkd
> nonetheless worked with earlier kernels, the strict check introduced by
> the patch causes a regression in behaviour.
>
> This patch converts the failure to supply the required flags from an
> error into a warning.
The purpose of my patch was to prevent a partial configuration update on
invalid input. -EINVAL was returned both before and after my patch, the
difference is that before my patch there was a partial update and a warning.

Your patch basically makes mine pointless, because you revert the fix,
and now we'll have the same partial update and two warnings.

One more thing is that after applying your patch on top of mine, the
kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL
is what happened before my patch, and also after my patch.

Regarding the systemd issue, I don't think we should change the kernel
to adapt to bugs in systemd. systemd didn't have this bug from day one,
it was a regression introduced in [1]. The kernel has always returned
-EINVAL here, but the behavior before my patch was basically a UB, and
after the patch it's well-defined. If systemd saw EINVAL and relied on
the UB that came with it, it can't be a reason enough to break the kernel.

Moreover, the bug looks fixed in systemd's master, so what you suggest
is to insert a kernel-side workaround for an old version of software
when there is a fixed one.

Please correct me if anything I say is wrong.

Thanks,
Max

[1]:
https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf

> With this, systemd-networkd version 241 once
> again is able to bring up the link, albeit not quite as intended and
> thereby with a warning in the kernel log.
>
> CC: Maxim Mikityanskiy <[email protected]>
> CC: David S. Miller <[email protected]>
> CC: Alexey Kuznetsov <[email protected]>
> CC: Hideaki YOSHIFUJI <[email protected]>
> Signed-off-by: Jonas Bonn <[email protected]>
> ---
> net/ipv6/addrconf.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 081bb517e40d..e2477bf92e12 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct net_device *dev,
> return err;
>
> if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
> - return -EINVAL;
> + net_warn_ratelimited(
> + "required link flag omitted: TOKEN/ADDR_GEN_MODE\n");
>
> if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
> u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
>

2019-06-13 15:07:46

by Maxim Mikityanskiy

[permalink] [raw]
Subject: Re: [PATCH 1/1] Address regression in inet6_validate_link_af

On 2019-06-13 09:45, Jonas Bonn wrote:
> Hi Max,
>
> On 12/06/2019 12:42, Maxim Mikityanskiy wrote:
>> On 2019-06-11 13:03, Jonas Bonn wrote:
>>> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
>>> with systemd 241.  In that revision, systemd-networkd fails to pass the
>>> required flags early enough.  This appears to be addressed in later
>>> versions of systemd, but for users of version 241 where systemd-networkd
>>> nonetheless worked with earlier kernels, the strict check introduced by
>>> the patch causes a regression in behaviour.
>>>
>>> This patch converts the failure to supply the required flags from an
>>> error into a warning.
>> The purpose of my patch was to prevent a partial configuration update on
>> invalid input. -EINVAL was returned both before and after my patch, the
>> difference is that before my patch there was a partial update and a
>> warning.
>>
>> Your patch basically makes mine pointless, because you revert the fix,
>> and now we'll have the same partial update and two warnings.
>
> Unfortunately, yes...

So what you propose is a degradation.

>>
>> One more thing is that after applying your patch on top of mine, the
>> kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL
>> is what happened before my patch, and also after my patch.
>
> Yes, you're right, it would probably be better revert the entire patch
> because the checks in set_link_af have been dropped on the assumption
> that validate_link_af catches the badness.

We shouldn't introduce workarounds in the kernel for some temporary bugs
in old userspace. A regression was introduced in systemd: it started
sending invalid messages, didn't check the return code properly and
relied on an undefined behavior. It was then fixed in systemd. If the
kernel had all kinds of workarounds for all buggy software ever existed,
it would be a complete mess. The software used the API in a wrong way,
the expected behavior is failure, so it shouldn't expect anything else.
For me, the trade-off between fixing the kernel behavior and supporting
some old buggy software while keeping a UB in the kernel forever has an
obvious resolution.

>>
>> Regarding the systemd issue, I don't think we should change the kernel
>> to adapt to bugs in systemd. systemd didn't have this bug from day one,
>> it was a regression introduced in [1]. The kernel has always returned
>> -EINVAL here, but the behavior before my patch was basically a UB, and
>> after the patch it's well-defined. If systemd saw EINVAL and relied on
>> the UB that came with it, it can't be a reason enough to break the
>> kernel.
>>
>> Moreover, the bug looks fixed in systemd's master, so what you suggest
>> is to insert a kernel-side workaround for an old version of software
>> when there is a fixed one.
>
> I agree, systemd is buggy here.  Probably what happens is:
>
> i)  systemd tries to set the link up
> ii)  it ends up doing a "partial" modification of the link state;
> critically, though, enough to actually effect the link state changing to UP
> iii)  systemd sees the -EINVAL error and decides it probably failed to
> bring up the link
> iv)  systemd then gets a notification that the link is up and runs a
> DHCP client on it
>
> I haven't noticed any "partial modification" warnings in the kernel log
> but I wasn't looking for them, either...
>
> With the new behaviour in 5.2, step ii) above results in no "partial
> modification" so the link remains down and systemd is forever unable to
> bring it up.
>
> Anyway, for the record, the error is:
>
> systemd:  Could not bring up interface... (invalid parameter)
>
> And the solution is:  Linux >= 5.2 requires systemd != v241.
>
> If nobody else notices, that's good enough for me.
>
>>
>> Please correct me if anything I say is wrong.
>
> Nothing wrong, but it's still a regression.
>
> /Jonas
>
>>
>> Thanks,
>> Max
>>
>> [1]:
>> https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf
>>
>>
>>> With this, systemd-networkd version 241 once
>>> again is able to bring up the link, albeit not quite as intended and
>>> thereby with a warning in the kernel log.
>>>
>>> CC: Maxim Mikityanskiy <[email protected]>
>>> CC: David S. Miller <[email protected]>
>>> CC: Alexey Kuznetsov <[email protected]>
>>> CC: Hideaki YOSHIFUJI <[email protected]>
>>> Signed-off-by: Jonas Bonn <[email protected]>
>>> ---
>>>    net/ipv6/addrconf.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>>> index 081bb517e40d..e2477bf92e12 100644
>>> --- a/net/ipv6/addrconf.c
>>> +++ b/net/ipv6/addrconf.c
>>> @@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct
>>> net_device *dev,
>>>            return err;
>>>        if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
>>> -        return -EINVAL;
>>> +        net_warn_ratelimited(
>>> +            "required link flag omitted: TOKEN/ADDR_GEN_MODE\n");
>>>        if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
>>>            u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
>>>
>>

2019-06-13 16:44:49

by Jonas Bonn

[permalink] [raw]
Subject: Re: [PATCH 1/1] Address regression in inet6_validate_link_af

Hi Max,

On 12/06/2019 12:42, Maxim Mikityanskiy wrote:
> On 2019-06-11 13:03, Jonas Bonn wrote:
>> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
>> with systemd 241. In that revision, systemd-networkd fails to pass the
>> required flags early enough. This appears to be addressed in later
>> versions of systemd, but for users of version 241 where systemd-networkd
>> nonetheless worked with earlier kernels, the strict check introduced by
>> the patch causes a regression in behaviour.
>>
>> This patch converts the failure to supply the required flags from an
>> error into a warning.
> The purpose of my patch was to prevent a partial configuration update on
> invalid input. -EINVAL was returned both before and after my patch, the
> difference is that before my patch there was a partial update and a warning.
>
> Your patch basically makes mine pointless, because you revert the fix,
> and now we'll have the same partial update and two warnings.

Unfortunately, yes...

>
> One more thing is that after applying your patch on top of mine, the
> kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL
> is what happened before my patch, and also after my patch.

Yes, you're right, it would probably be better revert the entire patch
because the checks in set_link_af have been dropped on the assumption
that validate_link_af catches the badness.

>
> Regarding the systemd issue, I don't think we should change the kernel
> to adapt to bugs in systemd. systemd didn't have this bug from day one,
> it was a regression introduced in [1]. The kernel has always returned
> -EINVAL here, but the behavior before my patch was basically a UB, and
> after the patch it's well-defined. If systemd saw EINVAL and relied on
> the UB that came with it, it can't be a reason enough to break the kernel.
>
> Moreover, the bug looks fixed in systemd's master, so what you suggest
> is to insert a kernel-side workaround for an old version of software
> when there is a fixed one.

I agree, systemd is buggy here. Probably what happens is:

i) systemd tries to set the link up
ii) it ends up doing a "partial" modification of the link state;
critically, though, enough to actually effect the link state changing to UP
iii) systemd sees the -EINVAL error and decides it probably failed to
bring up the link
iv) systemd then gets a notification that the link is up and runs a
DHCP client on it

I haven't noticed any "partial modification" warnings in the kernel log
but I wasn't looking for them, either...

With the new behaviour in 5.2, step ii) above results in no "partial
modification" so the link remains down and systemd is forever unable to
bring it up.

Anyway, for the record, the error is:

systemd: Could not bring up interface... (invalid parameter)

And the solution is: Linux >= 5.2 requires systemd != v241.

If nobody else notices, that's good enough for me.

>
> Please correct me if anything I say is wrong.

Nothing wrong, but it's still a regression.

/Jonas

>
> Thanks,
> Max
>
> [1]:
> https://github.com/systemd/systemd/commit/0e2fdb83bb5e22047e0c7cc058b415d0e93f02cf
>
>> With this, systemd-networkd version 241 once
>> again is able to bring up the link, albeit not quite as intended and
>> thereby with a warning in the kernel log.
>>
>> CC: Maxim Mikityanskiy <[email protected]>
>> CC: David S. Miller <[email protected]>
>> CC: Alexey Kuznetsov <[email protected]>
>> CC: Hideaki YOSHIFUJI <[email protected]>
>> Signed-off-by: Jonas Bonn <[email protected]>
>> ---
>> net/ipv6/addrconf.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index 081bb517e40d..e2477bf92e12 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -5696,7 +5696,8 @@ static int inet6_validate_link_af(const struct net_device *dev,
>> return err;
>>
>> if (!tb[IFLA_INET6_TOKEN] && !tb[IFLA_INET6_ADDR_GEN_MODE])
>> - return -EINVAL;
>> + net_warn_ratelimited(
>> + "required link flag omitted: TOKEN/ADDR_GEN_MODE\n");
>>
>> if (tb[IFLA_INET6_ADDR_GEN_MODE]) {
>> u8 mode = nla_get_u8(tb[IFLA_INET6_ADDR_GEN_MODE]);
>>
>

2019-06-14 08:20:58

by Jonas Bonn

[permalink] [raw]
Subject: Re: [PATCH 1/1] Address regression in inet6_validate_link_af



On 13/06/2019 16:13, Maxim Mikityanskiy wrote:
> On 2019-06-13 09:45, Jonas Bonn wrote:
>> Hi Max,
>>
>> On 12/06/2019 12:42, Maxim Mikityanskiy wrote:
>>> On 2019-06-11 13:03, Jonas Bonn wrote:
>>>> Patch 7dc2bccab0ee37ac28096b8fcdc390a679a15841 introduces a regression
>>>> with systemd 241.  In that revision, systemd-networkd fails to pass the
>>>> required flags early enough.  This appears to be addressed in later
>>>> versions of systemd, but for users of version 241 where systemd-networkd
>>>> nonetheless worked with earlier kernels, the strict check introduced by
>>>> the patch causes a regression in behaviour.
>>>>
>>>> This patch converts the failure to supply the required flags from an
>>>> error into a warning.
>>> The purpose of my patch was to prevent a partial configuration update on
>>> invalid input. -EINVAL was returned both before and after my patch, the
>>> difference is that before my patch there was a partial update and a
>>> warning.
>>>
>>> Your patch basically makes mine pointless, because you revert the fix,
>>> and now we'll have the same partial update and two warnings.
>>
>> Unfortunately, yes...
>
> So what you propose is a degradation.

Yes. You're not going to get an argument out of me here... :)

>
>>>
>>> One more thing is that after applying your patch on top of mine, the
>>> kernel won't return -EINVAL anymore on invalid input. Returning -EINVAL
>>> is what happened before my patch, and also after my patch.
>>
>> Yes, you're right, it would probably be better revert the entire patch
>> because the checks in set_link_af have been dropped on the assumption
>> that validate_link_af catches the badness.
>
> We shouldn't introduce workarounds in the kernel for some temporary bugs
> in old userspace. A regression was introduced in systemd: it started
> sending invalid messages, didn't check the return code properly and
> relied on an undefined behavior. It was then fixed in systemd. If the
> kernel had all kinds of workarounds for all buggy software ever existed,
> it would be a complete mess. The software used the API in a wrong way,
> the expected behavior is failure, so it shouldn't expect anything else.
> For me, the trade-off between fixing the kernel behavior and supporting
> some old buggy software while keeping a UB in the kernel forever has an
> obvious resolution.

You're missing the point: this is a regression in kernel behaviour.
systemd v241 is a tagged release, included in downloadable releases of
at least Yocto 2.7 and Fedora 30. Irregardless of whether systemd's
implementation is buggy or not, it's functionality depends on the
(undefined but deterministic) behaviour of a released kernel and
therefore you can't just change that behaviour.

Like I said, I can get past this, so I'm happy to pretend that I didn't
discover this and just patch my own system. Unfortunately, I suspect
this isn't the last you'll hear about this.

/Jonas

>>
>> Anyway, for the record, the error is:
>>
>> systemd:  Could not bring up interface... (invalid parameter)
>>
>> And the solution is:  Linux >= 5.2 requires systemd != v241.
>>
>> If nobody else notices, that's good enough for me.
>>
>>>
>>> Please correct me if anything I say is wrong.
>>
>> Nothing wrong, but it's still a regression.
>>
>> /Jonas
>>
>>>
>>> Thanks,
>>> Max
>>>