2020-11-09 13:08:56

by Menglong Dong

[permalink] [raw]
Subject: [PATCH] net: tcp: ratelimit warnings in tcp_recvmsg

From: Menglong Dong <[email protected]>

'before(*seq, TCP_SKB_CB(skb)->seq) == true' means that one or more
skbs are lost somehow. Once this happen, it seems that it will
never recover automatically. As a result, a warning will be printed
and a '-EAGAIN' will be returned in non-block mode.

As a general suituation, users call 'poll' on a socket and then receive
skbs with 'recv' in non-block mode. This mode will make every
arriving skb of the socket trigger a warning. Plenty of skbs will cause
high rate of kernel log.

Besides, WARN is for indicating kernel bugs only and should not be
user-triggable. Replace it with 'net_warn_ratelimited' here.

Signed-off-by: Menglong Dong <[email protected]>
---
net/ipv4/tcp.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b2bc3d7fe9e8..5e38dfd03036 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2093,11 +2093,12 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
/* Now that we have two receive queues this
* shouldn't happen.
*/
- if (WARN(before(*seq, TCP_SKB_CB(skb)->seq),
- "TCP recvmsg seq # bug: copied %X, seq %X, rcvnxt %X, fl %X\n",
- *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
- flags))
+ if (unlikely(before(*seq, TCP_SKB_CB(skb)->seq))) {
+ net_warn_ratelimited("TCP recvmsg seq # bug: copied %X, seq %X, rcvnxt %X, fl %X\n",
+ *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
+ flags);
break;
+ }

offset = *seq - TCP_SKB_CB(skb)->seq;
if (unlikely(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) {
@@ -2108,9 +2109,11 @@ int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock,
goto found_ok_skb;
if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
goto found_fin_ok;
- WARN(!(flags & MSG_PEEK),
- "TCP recvmsg seq # bug 2: copied %X, seq %X, rcvnxt %X, fl %X\n",
- *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt, flags);
+
+ if (!(flags & MSG_PEEK))
+ net_warn_ratelimited("TCP recvmsg seq # bug 2: copied %X, seq %X, rcvnxt %X, fl %X\n",
+ *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
+ flags);
}

/* Well, if we have backlog, try to process it now yet. */
--
2.25.1



2020-11-09 13:38:33

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] net: tcp: ratelimit warnings in tcp_recvmsg

On Mon, Nov 9, 2020 at 2:07 PM <[email protected]> wrote:
>
> From: Menglong Dong <[email protected]>
>
> 'before(*seq, TCP_SKB_CB(skb)->seq) == true' means that one or more
> skbs are lost somehow. Once this happen, it seems that it will
> never recover automatically. As a result, a warning will be printed
> and a '-EAGAIN' will be returned in non-block mode.
>
> As a general suituation, users call 'poll' on a socket and then receive
> skbs with 'recv' in non-block mode. This mode will make every
> arriving skb of the socket trigger a warning. Plenty of skbs will cause
> high rate of kernel log.
>
> Besides, WARN is for indicating kernel bugs only and should not be
> user-triggable. Replace it with 'net_warn_ratelimited' here.
>
> Signed-off-by: Menglong Dong <[email protected]>

I do not think this patch is useful. That is simply code churn.

Can you trigger the WARN() in the latest upstream version ?
If yes this is a serious bug that needs urgent attention.

Make sure you have backported all needed fixes into your kernel, if
you get this warning on a non pristine kernel.

2020-11-09 14:52:31

by Menglong Dong

[permalink] [raw]
Subject: Re: [PATCH] net: tcp: ratelimit warnings in tcp_recvmsg

On Mon, Nov 9, 2020 at 9:36 PM Eric Dumazet <[email protected]> wrote:
>
> I do not think this patch is useful. That is simply code churn.
>
> Can you trigger the WARN() in the latest upstream version ?
> If yes this is a serious bug that needs urgent attention.
>
> Make sure you have backported all needed fixes into your kernel, if
> you get this warning on a non pristine kernel.

Theoretically, this WARN() shouldn't be triggered in any branches.
Somehow, it just happened in kernel v3.10. This really confused me. I
wasn't able to keep tracing it, as it is a product environment.

I notice that the codes for tcp skb receiving didn't change much
between v3.10 and the latest upstream version, and guess the latest
version can be triggered too.

If something is fixed and this WARN() won't be triggered, just ignore me.

Cheers,
Menglong Dong

2020-11-09 15:41:08

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH] net: tcp: ratelimit warnings in tcp_recvmsg



On 11/9/20 3:48 PM, Menglong Dong wrote:
> On Mon, Nov 9, 2020 at 9:36 PM Eric Dumazet <[email protected]> wrote:
>>
>> I do not think this patch is useful. That is simply code churn.
>>
>> Can you trigger the WARN() in the latest upstream version ?
>> If yes this is a serious bug that needs urgent attention.
>>
>> Make sure you have backported all needed fixes into your kernel, if
>> you get this warning on a non pristine kernel.
>
> Theoretically, this WARN() shouldn't be triggered in any branches.
> Somehow, it just happened in kernel v3.10. This really confused me. I
> wasn't able to keep tracing it, as it is a product environment.
>
> I notice that the codes for tcp skb receiving didn't change much
> between v3.10 and the latest upstream version, and guess the latest
> version can be triggered too.
>
> If something is fixed and this WARN() won't be triggered, just ignore me.
>

Yes, I confirm this WARN() should not trigger.

The bug is not in tcp recvmsg(), that is why you do not see obvious
fix for this issue in 3.10