2021-01-08 13:06:17

by Dongseok Yi

Subject: [RFC PATCH net] udp: check sk for UDP GRO fraglist

This is a workaround patch.

The UDP/IP headers of UDP GROed frag_skbs are not updated after NAT
forwarding. Only the header of the head_skb coming out of
ip_finish_output_gso -> skb_gso_segment is updated; the frag_skbs that
follow it are not.

The call path skb_mac_gso_segment -> inet_gso_segment ->
udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
does not update the UDP/IP header of any skb in the segment list.

That might make sense, because each skb in frag_skbs is simply
converted into a regular packet, so a header update with checksum
recalculation may not be needed for UDP GROed frag_skbs.

But the UDP GRO frag_list is built in udp_gro_receive, where we do not
yet know whether the skb will be NAT forwarded. As a workaround, always
look up the socket when calling udp4_gro_receive -> udp_gro_receive to
check whether the skb is destined for a local socket.

I'm still not sure whether UDP GRO frag_list is really designed for
local sessions only. Can the kernel support NAT forwarding for UDP GRO
frag_list? What am I missing?

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Dongseok Yi <[email protected]>
---
net/ipv4/udp_offload.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index ff39e94..d476216 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -457,7 +457,7 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	int flush = 1;

 	NAPI_GRO_CB(skb)->is_flist = 0;
-	if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
+	if (sk && (skb->dev->features & NETIF_F_GRO_FRAGLIST))
 		NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;

 	if ((sk && udp_sk(sk)->gro_enabled) || NAPI_GRO_CB(skb)->is_flist) {
@@ -537,8 +537,7 @@ struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb)
 	NAPI_GRO_CB(skb)->is_ipv6 = 0;
 	rcu_read_lock();

-	if (static_branch_unlikely(&udp_encap_needed_key))
-		sk = udp4_gro_lookup_skb(skb, uh->source, uh->dest);
+	sk = udp4_gro_lookup_skb(skb, uh->source, uh->dest);

 	pp = udp_gro_receive(head, skb, uh, sk);
 	rcu_read_unlock();
--
2.7.4
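
To make the effect of the two hunks easier to see, here is a minimal
user-space sketch of the is_flist decision before and after the patch.
It is not kernel code: struct gro_inputs, is_flist_before(),
is_flist_after() and gro_aggregates() are hypothetical names that only
model the conditions visible in the diff above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for what udp_gro_receive() inspects. */
struct gro_inputs {
	bool dev_fraglist;   /* skb->dev->features & NETIF_F_GRO_FRAGLIST */
	bool have_sk;        /* the socket lookup returned a local socket */
	bool sk_gro_enabled; /* udp_sk(sk)->gro_enabled, ignored without sk */
};

/* is_flist as computed by the code removed in the first hunk. */
static bool is_flist_before(struct gro_inputs in)
{
	if (!in.dev_fraglist)
		return false;
	return in.have_sk ? !in.sk_gro_enabled : true;
}

/* is_flist as computed with the patch: a socket is now required. */
static bool is_flist_after(struct gro_inputs in)
{
	if (!(in.have_sk && in.dev_fraglist))
		return false;
	return !in.sk_gro_enabled;
}

/* Mirrors the unchanged condition that guards the aggregation path. */
static bool gro_aggregates(struct gro_inputs in, bool is_flist)
{
	return (in.have_sk && in.sk_gro_enabled) || is_flist;
}
```

In this model, a packet with no local socket on a device with
NETIF_F_GRO_FRAGLIST set (the forwarding case) is fraglist-aggregated
before the patch and not aggregated after it. Note also that the second
hunk makes the socket lookup unconditional, whereas it previously ran
only when udp_encap_needed_key was enabled.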


2021-01-08 13:40:24

by Steffen Klassert

Subject: Re: [RFC PATCH net] udp: check sk for UDP GRO fraglist

On Fri, Jan 08, 2021 at 09:52:28PM +0900, Dongseok Yi wrote:
> It is a workaround patch.
>
> UDP/IP header of UDP GROed frag_skbs are not updated even after NAT
> forwarding. Only the header of head_skb from ip_finish_output_gso ->
> skb_gso_segment is updated but following frag_skbs are not updated.
>
> A call path skb_mac_gso_segment -> inet_gso_segment ->
> udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
> does not try to update any UDP/IP header of the segment list.
>
> It might make sense because each skb of frag_skbs is converted to a
> list of regular packets. Header update with checksum calculation may
> be not needed for UDP GROed frag_skbs.
>
> But UDP GRO frag_list is started from udp_gro_receive, we don't know
> whether the skb will be NAT forwarded at that time. For workaround,
> try to get sock always when call udp4_gro_receive -> udp_gro_receive
> to check if the skb is for local.
>
> I'm still not sure if UDP GRO frag_list is really designed for local
> session only. Can kernel support NAT forward for UDP GRO frag_list?
> What am I missing?

The initial idea when I implemented this was to have a fast
forwarding path for UDP. So forwarding is a use case, but NAT
is indeed a problem. A quick fix could be to segment the
skb before it gets NAT forwarded. Alternatively, we could
check for a header change in __udp_gso_segment_list and
update the headers of the frag_skbs accordingly in that case.
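
The second suggestion above (detect a header change and propagate it
to the frag_skbs) can be illustrated with a small user-space sketch.
It is only a model under stated assumptions, not the kernel
implementation: struct pkt_hdr and every function below are
hypothetical, fields are in host byte order, and the toy checksums
cover only the addresses and ports. The incremental update follows
RFC 1624, eqn. 3.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the IPv4/UDP fields NAT may rewrite. */
struct pkt_hdr {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint16_t ip_check;
	uint16_t udp_check;	/* 0 means "no checksum" for UDP over IPv4 */
};

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t csum_fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m') for one 16-bit field. */
static uint16_t csum_replace16(uint16_t check, uint16_t from, uint16_t to)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~from;
	sum += to;
	return (uint16_t)~csum_fold32(sum);
}

static uint16_t csum_replace32(uint16_t check, uint32_t from, uint32_t to)
{
	check = csum_replace16(check, (uint16_t)(from >> 16),
			       (uint16_t)(to >> 16));
	return csum_replace16(check, (uint16_t)from, (uint16_t)to);
}

/* Reference checksums recomputed from scratch, used for cross-checking. */
static uint16_t full_ip_check(const struct pkt_hdr *h)
{
	uint32_t sum = (h->saddr >> 16) + (h->saddr & 0xffff) +
		       (h->daddr >> 16) + (h->daddr & 0xffff);

	return (uint16_t)~csum_fold32(sum);
}

static uint16_t full_udp_check(const struct pkt_hdr *h)
{
	uint32_t sum = (h->saddr >> 16) + (h->saddr & 0xffff) +
		       (h->daddr >> 16) + (h->daddr & 0xffff) +
		       h->sport + h->dport;
	uint16_t check = (uint16_t)~csum_fold32(sum);

	return check ? check : 0xffff;	/* 0 is reserved for "no checksum" */
}

/* Bring one fragment header in line with the head. Returns 1 if changed. */
static int sync_frag_hdr(const struct pkt_hdr *head, struct pkt_hdr *frag)
{
	uint16_t ip = frag->ip_check, udp = frag->udp_check;

	if (frag->saddr == head->saddr && frag->daddr == head->daddr &&
	    frag->sport == head->sport && frag->dport == head->dport)
		return 0;

	ip = csum_replace32(ip, frag->saddr, head->saddr);
	ip = csum_replace32(ip, frag->daddr, head->daddr);
	if (udp) {	/* leave a disabled UDP checksum disabled */
		udp = csum_replace32(udp, frag->saddr, head->saddr);
		udp = csum_replace32(udp, frag->daddr, head->daddr);
		udp = csum_replace16(udp, frag->sport, head->sport);
		udp = csum_replace16(udp, frag->dport, head->dport);
		if (!udp)
			udp = 0xffff;
	}

	frag->saddr = head->saddr;
	frag->daddr = head->daddr;
	frag->sport = head->sport;
	frag->dport = head->dport;
	frag->ip_check = ip;
	frag->udp_check = udp;
	return 1;
}

/* Walk the fragment array; returns how many headers had to be fixed. */
static size_t sync_frag_list(const struct pkt_hdr *head,
			     struct pkt_hdr *frags, size_t n)
{
	size_t i, fixed = 0;

	for (i = 0; i < n; i++)
		fixed += (size_t)sync_frag_hdr(head, &frags[i]);
	return fixed;
}
```

The comparison in sync_frag_hdr() keeps the common, non-NAT case cheap:
fragments whose headers already match the head are left untouched, so
plain forwarding would pay only for four field compares per fragment in
this model.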

2021-01-11 02:05:34

by Dongseok Yi

Subject: RE: [RFC PATCH net] udp: check sk for UDP GRO fraglist

On 2021-01-08 22:35, Steffen Klassert wrote:
> On Fri, Jan 08, 2021 at 09:52:28PM +0900, Dongseok Yi wrote:
> > It is a workaround patch.
> >
> > UDP/IP header of UDP GROed frag_skbs are not updated even after NAT
> > forwarding. Only the header of head_skb from ip_finish_output_gso ->
> > skb_gso_segment is updated but following frag_skbs are not updated.
> >
> > A call path skb_mac_gso_segment -> inet_gso_segment ->
> > udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
> > does not try to update any UDP/IP header of the segment list.
> >
> > It might make sense because each skb of frag_skbs is converted to a
> > list of regular packets. Header update with checksum calculation may
> > be not needed for UDP GROed frag_skbs.
> >
> > But UDP GRO frag_list is started from udp_gro_receive, we don't know
> > whether the skb will be NAT forwarded at that time. For workaround,
> > try to get sock always when call udp4_gro_receive -> udp_gro_receive
> > to check if the skb is for local.
> >
> > I'm still not sure if UDP GRO frag_list is really designed for local
> > session only. Can kernel support NAT forward for UDP GRO frag_list?
> > What am I missing?
>
> The initial idea when I implemented this was to have a fast
> forwarding path for UDP. So forwarding is a usecase, but NAT
> is a problem, indeed. A quick fix could be to segment the
> skb before it gets NAT forwarded. Alternatively we could
> check for a header change in __udp_gso_segment_list and
> update the header of the frag_skbs accordingly in that case.

Thank you for the explanation.
Can I treat this as a known issue? I think we need a fix,
because NAT can be triggered by the user. What is the current status?
Is a patch already planned, or should a new one be written?

2021-01-11 08:46:34

by Steffen Klassert

Subject: Re: [RFC PATCH net] udp: check sk for UDP GRO fraglist

On Mon, Jan 11, 2021 at 11:02:42AM +0900, Dongseok Yi wrote:
> On 2021-01-08 22:35, Steffen Klassert wrote:
> > On Fri, Jan 08, 2021 at 09:52:28PM +0900, Dongseok Yi wrote:
> > > It is a workaround patch.
> > >
> > > UDP/IP header of UDP GROed frag_skbs are not updated even after NAT
> > > forwarding. Only the header of head_skb from ip_finish_output_gso ->
> > > skb_gso_segment is updated but following frag_skbs are not updated.
> > >
> > > A call path skb_mac_gso_segment -> inet_gso_segment ->
> > > udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
> > > does not try to update any UDP/IP header of the segment list.
> > >
> > > It might make sense because each skb of frag_skbs is converted to a
> > > list of regular packets. Header update with checksum calculation may
> > > be not needed for UDP GROed frag_skbs.
> > >
> > > But UDP GRO frag_list is started from udp_gro_receive, we don't know
> > > whether the skb will be NAT forwarded at that time. For workaround,
> > > try to get sock always when call udp4_gro_receive -> udp_gro_receive
> > > to check if the skb is for local.
> > >
> > > I'm still not sure if UDP GRO frag_list is really designed for local
> > > session only. Can kernel support NAT forward for UDP GRO frag_list?
> > > What am I missing?
> >
> > The initial idea when I implemented this was to have a fast
> > forwarding path for UDP. So forwarding is a usecase, but NAT
> > is a problem, indeed. A quick fix could be to segment the
> > skb before it gets NAT forwarded. Alternatively we could
> > check for a header change in __udp_gso_segment_list and
> > update the header of the frag_skbs accordingly in that case.
>
> Thank you for explaining.
> Can I think of it as a known issue?

No, it was not known before you reported it.

> I think we should have a fix
> because NAT can be triggered by user. Can I check the current status?
> Already planning a patch or a new patch should be written?

We need a new patch to fix that issue. If you want to
do so, go ahead.

2021-01-11 13:11:59

by Alexander Lobakin

Subject: Re: [RFC PATCH net] udp: check sk for UDP GRO fraglist

From: Steffen Klassert <[email protected]>
Date: Mon, 11 Jan 2021 09:43:22 +0100

> On Mon, Jan 11, 2021 at 11:02:42AM +0900, Dongseok Yi wrote:
>> On 2021-01-08 22:35, Steffen Klassert wrote:
>>> On Fri, Jan 08, 2021 at 09:52:28PM +0900, Dongseok Yi wrote:
>>>> It is a workaround patch.
>>>>
>>>> UDP/IP header of UDP GROed frag_skbs are not updated even after NAT
>>>> forwarding. Only the header of head_skb from ip_finish_output_gso ->
>>>> skb_gso_segment is updated but following frag_skbs are not updated.
>>>>
>>>> A call path skb_mac_gso_segment -> inet_gso_segment ->
>>>> udp4_ufo_fragment -> __udp_gso_segment -> __udp_gso_segment_list
>>>> does not try to update any UDP/IP header of the segment list.
>>>>
>>>> It might make sense because each skb of frag_skbs is converted to a
>>>> list of regular packets. Header update with checksum calculation may
>>>> be not needed for UDP GROed frag_skbs.
>>>>
>>>> But UDP GRO frag_list is started from udp_gro_receive, we don't know
>>>> whether the skb will be NAT forwarded at that time. For workaround,
>>>> try to get sock always when call udp4_gro_receive -> udp_gro_receive
>>>> to check if the skb is for local.
>>>>
>>>> I'm still not sure if UDP GRO frag_list is really designed for local
>>>> session only. Can kernel support NAT forward for UDP GRO frag_list?
>>>> What am I missing?
>>>
>>> The initial idea when I implemented this was to have a fast
>>> forwarding path for UDP. So forwarding is a usecase, but NAT
>>> is a problem, indeed. A quick fix could be to segment the
>>> skb before it gets NAT forwarded. Alternatively we could
>>> check for a header change in __udp_gso_segment_list and
>>> update the header of the frag_skbs accordingly in that case.
>>
>> Thank you for explaining.
>> Can I think of it as a known issue?
>
> No, it was not known before you reported it.
>
>> I think we should have a fix
>> because NAT can be triggered by user. Can I check the current status?
>> Already planning a patch or a new patch should be written?
>
> We have to do a new patch to fix that issue. If you want to
> do so, go ahead.

This patch is incorrect. I have been NATing UDP GRO fraglists via
nftables (both with and without flow offload) with no issues since
March 2020. Packet loss rates are always around 0, so I can say it
works properly. I can share details or dump runtime data if needed.

Thanks,
Al