2005-09-12 10:28:24

by Bernhard Rosenkraenzer

[permalink] [raw]
Subject: 2.6.13-mm2 hard lockup

2.6.13-mm2 locks up hard occasionally (happened for me twice, each time about
20 minutes after booting).
It left the following in the syslog:

Sep 11 23:19:55 dhcppc0 kernel: KERNEL: assertion ((int)tp->lost_out >= 0)
failed at net/ipv4/tcp_input.c (2148)
Sep 11 23:19:59 dhcppc0 kernel: KERNEL: assertion ((int)tp->lost_out >= 0)
failed at net/ipv4/tcp_input.c (2148)
Sep 11 23:25:23 dhcppc0 kernel: KERNEL: assertion ((int)tp->sacked_out >= 0)
failed at net/ipv4/tcp_input.c (2147)
Sep 11 23:26:04 dhcppc0 last message repeated 2 times
Sep 11 23:27:55 dhcppc0 kernel: KERNEL: assertion ((int)tp->sacked_out >= 0)
failed at net/ipv4/tcp_input.c (2147)
Sep 11 23:27:57 dhcppc0 kernel: KERNEL: assertion ((int)tp->lost_out >= 0)
failed at net/ipv4/tcp_input.c (2148)
Sep 11 23:28:00 dhcppc0 kernel: KERNEL: assertion ((int)tp->sacked_out >= 0)
failed at net/ipv4/tcp_input.c (2147)

(This was compiled without sysrq support, therefore no detailed trace)

LLaP
bero


2005-09-12 11:53:55

by Michal Piotrowski

[permalink] [raw]
Subject: Re: 2.6.13-mm2 hard lockup

Hi,

On 12/09/05, Bernhard Rosenkraenzer <[email protected]> wrote:
> 2.6.13-mm2 locks up hard occasionally (happened for me twice, each time about
> 20 minutes after booting).
> It left the following in the syslog:
>
> Sep 11 23:19:55 dhcppc0 kernel: KERNEL: assertion ((int)tp->lost_out >= 0)
> failed at net/ipv4/tcp_input.c (2148)
> Sep 11 23:19:59 dhcppc0 kernel: KERNEL: assertion ((int)tp->lost_out >= 0)
> failed at net/ipv4/tcp_input.c (2148)
> Sep 11 23:25:23 dhcppc0 kernel: KERNEL: assertion ((int)tp->sacked_out >= 0)
> failed at net/ipv4/tcp_input.c (2147)
> Sep 11 23:26:04 dhcppc0 last message repeated 2 times
> Sep 11 23:27:55 dhcppc0 kernel: KERNEL: assertion ((int)tp->sacked_out >= 0)
> failed at net/ipv4/tcp_input.c (2147)
> Sep 11 23:27:57 dhcppc0 kernel: KERNEL: assertion ((int)tp->lost_out >= 0)
> failed at net/ipv4/tcp_input.c (2148)
> Sep 11 23:28:00 dhcppc0 kernel: KERNEL: assertion ((int)tp->sacked_out >= 0)
> failed at net/ipv4/tcp_input.c (2147)
>
> (This was compiled without sysrq support, therefore no detailed trace)
>
> LLaP
> bero
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

Someone from netdev has sent this path. I didn't test it.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -485,11 +485,6 @@ int tcp_fragment(struct sock *sk, struct
TCP_SKB_CB(buff)->when = TCP_SKB_CB(skb)->when;
buff->tstamp = skb->tstamp;

- if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
- tp->lost_out -= tcp_skb_pcount(skb);
- tp->left_out -= tcp_skb_pcount(skb);
- }
-
old_factor = tcp_skb_pcount(skb);

/* Fix up tso_factor for both original and new SKB. */

Here is patch from Herbert Xu


diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -499,7 +499,7 @@ int tcp_fragment(struct sock *sk, struct
/* If this packet has been sent out already, we must
* adjust the various packet counters.
*/
- if (after(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq)) {
+ if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq)) {
int diff = old_factor - tcp_skb_pcount(skb) -
tcp_skb_pcount(buff);


Regards,
Michal Piotrowski