2022-03-08 23:45:07

by Ilya Maximets

[permalink] [raw]
Subject: Re: [ovs-dev] [PATCH net-next v8] net: openvswitch: IPv6: Add IPv6 extension header support

On 3/7/22 23:46, Jakub Kicinski wrote:
> On Mon, 7 Mar 2022 23:14:13 +0100 Ilya Maximets wrote:
>> The main problem is that userspace uses the modified copy of the uapi header
>> which looks like this:
>> https://github.com/openvswitch/ovs/blob/f77dbc1eb2da2523625cd36922c6fccfcb3f3eb7/datapath/linux/compat/include/linux/openvswitch.h#L357
>>
>> In short, the userspace view:
>>
>> enum ovs_key_attr {
>> <common attrs>
>>
>> #ifdef __KERNEL__
>> /* Only used within kernel data path. */
>> #endif
>>
>> #ifndef __KERNEL__
>> /* Only used within userspace data path. */
>> #endif
>> __OVS_KEY_ATTR_MAX
>> };
>>
>> And the kernel view:
>>
>> enum ovs_key_attr {
>> <common attrs>
>>
>> #ifdef __KERNEL__
>> /* Only used within kernel data path. */
>> #endif
>>
>> __OVS_KEY_ATTR_MAX
>> };
>>
>> This happened before my time, but the commit where userspace made a wrong
>> turn appears to be this one:
>> https://github.com/openvswitch/ovs/commit/beb75a40fdc295bfd6521b0068b4cd12f6de507c
>> The attribute for userspace only was added to the common enum after the
>> OVS_KEY_ATTR_TUNNEL_INFO. I'm not sure how things didn't fall apart when
>> OVS_KEY_ATTR_NSH was added later (no-one cared that NSH doesn't work, because
>> OVS didn't support it yet?).
>>
>> In general, any addition of a new attribute into that enumeration leads to
>> inevitable clash between userpsace-only attributes and new kernel attributes.
>>
>> After the kernel update, kernel provides new attributes to the userspace and
>> userspace tries to parse them as one of the userspace-only attributes and
>> fails. In our current case userspace is trying to parse OVS_KEY_ATTR_IPV6_EXTHDR
>> as userspace-only OVS_KEY_ATTR_PACKET_TYPE, because they have the same value in the
>> enum, fails and discards the netlink message as malformed. So, IPv6 is fully
>> broken, because OVS_KEY_ATTR_IPV6_EXTHDR is supplied now with every IPv6 packet
>> that goes to userspace.
>>
>> We need to unify the view of 'enum ovs_key_attr' between userspace and kernel
>> before we can add any new values to it.
>>
>> One way to do that should be addition of both userspace-only attributes to the
>> kernel header (and maybe exposing OVS_KEY_ATTR_TUNNEL_INFO too, just to keep
>> it flat and avoid any possible problems in the future). Any other suggestions
>> are welcome. But in any case this will require careful testing with existing
>> OVS userspace to avoid any unexpected issues.
>>
>> Moving forward, I think, userspace OVS should find a way to not have userpsace-only
>> attributes, or have them as a separate enumeration. But I'm not sure how to do
>> that right now. Or we'll have to add userspace-only attributes to the kernel
>> uapi before using them.
>
> Thanks for the explanation, we can apply a revert if that'd help your
> CI / ongoing development but sounds like the fix really is in user
> space. Expecting netlink attribute lists not to grow is not fair.

I don't think it was intentional, just a careless mistake. Unfortunately,
all OVS binaries built during the last 5 years rely on that unwanted
expectation (re-build will also not help as they are using a copy of the
uAPI header and the clash will be there anyway). If we want to keep them
working, kernel uAPI has to be carefully updated with current userspace-only
attributes before we add any new ones. That is not great, but I don't see
any other option right now that doesn't require code changes in userspace.

I'd say that we need to revert the current patch and re-introduce it
later when the uAPI problem is sorted out. This way we will avoid blocking
the net-next testing and will also avoid problems in case the uAPI changes
are not ready at the moment of the new kernel release.

What do you think?

>
> Since ovs uses genetlink you should be able to dump the policy from
> the kernel and at least validate that it doesn't overlap.

That is interesting. Indeed, this functionality can be used to detect
problems or to define userspace-only attributes in runtime based on the
kernel reply. Thanks for the pointer!

Best regards, Ilya Maximets.


2022-03-09 01:22:53

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [ovs-dev] [PATCH net-next v8] net: openvswitch: IPv6: Add IPv6 extension header support

On Tue, 8 Mar 2022 01:04:00 +0100 Ilya Maximets wrote:
> > Thanks for the explanation, we can apply a revert if that'd help your
> > CI / ongoing development but sounds like the fix really is in user
> > space. Expecting netlink attribute lists not to grow is not fair.
>
> I don't think it was intentional, just a careless mistake. Unfortunately,
> all OVS binaries built during the last 5 years rely on that unwanted
> expectation (re-build will also not help as they are using a copy of the
> uAPI header and the clash will be there anyway). If we want to keep them
> working, kernel uAPI has to be carefully updated with current userspace-only
> attributes before we add any new ones. That is not great, but I don't see
> any other option right now that doesn't require code changes in userspace.
>
> I'd say that we need to revert the current patch and re-introduce it
> later when the uAPI problem is sorted out. This way we will avoid blocking
> the net-next testing and will also avoid problems in case the uAPI changes
> are not ready at the moment of the new kernel release.
>
> What do you think?

Let me add some people I associate with genetlink work in my head
(fairly or not) to keep me fair here.

It's highly unacceptable for user space to straight up rewrite kernel
uAPI types but if it already happened the only fix is something like:

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 9d1710f20505..ab6755621e02 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -351,11 +351,16 @@ enum ovs_key_attr {
OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4, /* struct ovs_key_ct_tuple_ipv4 */
OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6, /* struct ovs_key_ct_tuple_ipv6 */
OVS_KEY_ATTR_NSH, /* Nested set of ovs_nsh_key_* */
- OVS_KEY_ATTR_IPV6_EXTHDRS, /* struct ovs_key_ipv6_exthdr */

#ifdef __KERNEL__
OVS_KEY_ATTR_TUNNEL_INFO, /* struct ip_tunnel_info */
#endif
+ /* User space decided to squat on types 30 and 31 */
+ OVS_KEY_ATTR_IPV6_EXTHDRS = 32, /* struct ovs_key_ipv6_exthdr */
+ /* WARNING: <scary warning to avoid the problem coming back> */
+
__OVS_KEY_ATTR_MAX
};


right?

> > Since ovs uses genetlink you should be able to dump the policy from
> > the kernel and at least validate that it doesn't overlap.
>
> That is interesting. Indeed, this functionality can be used to detect
> problems or to define userspace-only attributes in runtime based on the
> kernel reply. Thanks for the pointer!