2004-09-09 11:15:25

by Alex Riesen

Subject: 2.6.9-rc1+bk: assertion tcp_get_pcount failed at net/ipv4/tcp_input.c

The box froze after being left for some time (some 10 hours) unattended.
The only thing I could find in the logs was:

Sep 8 22:30:18 steel kernel: KERNEL: assertion ((int)tcp_get_pcount(&tp->lost_out) >= 0) failed at net/ipv4/tcp_input.c (2422)
Sep 8 22:30:18 steel kernel: Leak l=4294967295 4
Sep 8 22:32:49 steel kernel: KERNEL: assertion ((int)tcp_get_pcount(&tp->lost_out) >= 0) failed at net/ipv4/tcp_input.c (2422)
Sep 8 22:32:50 steel last message repeated 2 times
Sep 8 22:32:50 steel kernel: Leak l=4294967295 3

This may well not be the cause, though: I do not know the actual time of the freeze.
There should have been some traffic (two bittorrents were running).

The .config is attached.
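
A note on the number in the `Leak l=4294967295` lines: 4294967295 is 2^32 - 1, which is what a 32-bit unsigned counter holds after being decremented once past zero, and it reads as -1 when cast to a signed int. That would be consistent with the `(int)... >= 0` assertion firing. The sketch below only illustrates that arithmetic; `decrement_count` and `as_signed` are hypothetical names, not kernel functions, and the assumption that `l` is a packet counter of this kind is not confirmed by the log itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for an unsigned packet counter update.
 * Unsigned subtraction wraps modulo 2^32, so 0 - 1 gives 4294967295. */
static uint32_t decrement_count(uint32_t count, uint32_t by)
{
    return count - by;
}

/* Reinterpret the counter as signed, the way an (int) cast in an
 * assertion would. On the usual two's-complement platforms
 * 4294967295 becomes -1 (implementation-defined before C23). */
static int32_t as_signed(uint32_t count)
{
    return (int32_t)count;
}
```

With these helpers, `decrement_count(0, 1)` yields 4294967295 and `as_signed` of that value is -1, so a check of the form `as_signed(x) >= 0` fails.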


Attachments:
(No filename) (687.00 B)
.config (31.94 kB)

2004-09-10 03:34:42

by Herbert Xu

Subject: Re: 2.6.9-rc1+bk: assertion tcp_get_pcount failed at net/ipv4/tcp_input.c

On Thu, Sep 09, 2004 at 11:12:33AM +0000, Alex Riesen wrote:
> The box froze after being left for some time (some 10 hours) unattended.
> The only thing I could find in the logs was:
>
> Sep 8 22:30:18 steel kernel: KERNEL: assertion ((int)tcp_get_pcount(&tp->lost_out) >= 0) failed at net/ipv4/tcp_input.c (2422)
> Sep 8 22:30:18 steel kernel: Leak l=4294967295 4

Looks like the factor isn't set early enough. Can you please check
that you had the changeset titled

[TCP]: Make sure SKB tso factor is setup early enough.

from davem?

If you did, then please apply the following patch and tell us what
the resulting messages are.
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Attachments:
(No filename) (841.00 B)
p (523.00 B)

2004-09-10 05:30:40

by David Miller

Subject: Re: 2.6.9-rc1+bk: assertion tcp_get_pcount failed at net/ipv4/tcp_input.c

On Fri, 10 Sep 2004 13:30:55 +1000
Herbert Xu <[email protected]> wrote:

> On Thu, Sep 09, 2004 at 11:12:33AM +0000, Alex Riesen wrote:
> > The box froze after being left for some time (some 10 hours) unattended.
> > The only thing I could find in the logs was:
> >
> > Sep 8 22:30:18 steel kernel: KERNEL: assertion ((int)tcp_get_pcount(&tp->lost_out) >= 0) failed at net/ipv4/tcp_input.c (2422)
> > Sep 8 22:30:18 steel kernel: Leak l=4294967295 4
>
> Looks like the factor isn't set early enough. Can you please check
> that you had the changeset titled
>
> [TCP]: Make sure SKB tso factor is setup early enough.
>
> from davem?
>
> If you did, then please apply the following patch and tell us what
> the resulting messages are.

Herbert, did you see the division fix I made today for the
tso_factor calculation? I was dividing by the TSO mss
instead of the normal one :-)

I think that is the cause of these problems. It was
definitely the cause of a BUG() trap hit in tcp_transmit_skb()
that someone else reported in the past day.
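
To make the divisor mix-up concrete, here is a minimal sketch of a round-up segment count. It is not the kernel's code: `segs_for_len` is a hypothetical helper, and the byte values in the note below are made up for illustration. It also assumes the TSO mss is a larger value than the normal mss, which the thread does not spell out.

```c
#include <assert.h>

/* Hypothetical helper: how many mss-sized segments are needed to
 * carry len bytes, rounding up. A buffer that fits in one mss
 * counts as a single segment. */
static unsigned int segs_for_len(unsigned int len, unsigned int mss)
{
    assert(mss > 0); /* guard against division by zero */
    if (len <= mss)
        return 1;
    return (len + mss - 1) / mss;
}
```

Under those assumptions, a 14480-byte buffer with a normal mss of 1448 counts as 10 segments, while dividing by a larger value such as 14480 gives 1. An undercount of that kind could plausibly let per-connection packet counters drift out of step.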

2004-09-10 05:56:52

by Herbert Xu

Subject: Re: 2.6.9-rc1+bk: assertion tcp_get_pcount failed at net/ipv4/tcp_input.c

On Thu, Sep 09, 2004 at 10:25:42PM -0700, David S. Miller wrote:
>
> Herbert, did you see the division fix I made today for the
> tso_factor calculation? I was dividing by the TSO mss
> instead of the normal one :-)

That would do it :)