2001-02-20 20:15:21

by Alexey Kuznetsov

Subject: Re: 2.4.1 under heavy network load - more info

Hello!

> of errors a bit but I'm not sure I fully understand the implications of
> doing so.

As long as these numbers do not exceed the total amount of RAM, this is
exactly the action required in this case.
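
For reference, the total amount of RAM in pages can be read from
/proc/meminfo; a minimal sketch, assuming the usual 4KB page size
(MemTotal is reported in kB, so divide by 4):

	awk '/MemTotal/ { print int($2 / 4) }' /proc/meminfo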

The dumps you sent me show nothing pathological. Actually, they were
made during a period of full peace: only 47 orphans and only about
10MB of accounted memory.


> echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout

This is not essential with 2.4. In 2.4 this state does not consume any
significant resources.
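
If you want to see how many sockets actually sit in FIN-WAIT-2, a
one-liner along these lines is enough:

	netstat -tn | grep -c FIN_WAIT2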

> echo 0 > /proc/sys/net/ipv4/tcp_timestamps
> echo 0 > /proc/sys/net/ipv4/tcp_sack

Why?


> echo "3072 3584 6144" > /proc/sys/net/ipv4/tcp_mem

If you still have problems with orphans, you should raise these
numbers. The most extreme settings would be something like:

Z=<total amount of ram in pages>
Y=<something < Z>
X=<something < Y>
echo "$X $Y $Z" > /proc/sys/net/ipv4/tcp_mem

Set them to the maximum, and once the messages have disappeared
completely, you can decrease them to tighter limits.
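
For example, on a 128MB box (32768 pages, again assuming a 4KB page
size) the most permissive setting would look something like this; the
X and Y values below are only illustrative, chosen to be smaller than Z:

	X=16384; Y=24576; Z=32768
	echo "$X $Y $Z" > /proc/sys/net/ipv4/tcp_mem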


> Feb 18 15:05:44 mcquack kernel: sending pkt_too_big to self

Normal. Debugging.


> Feb 18 15:24:07 mcquack kernel: TCP: peer xx.xx.xx.xx:1084/7000 shrinks window 2106777705:1072:2106779313. Bad, what else can I say?

Debugging too.

> Feb 18 15:42:06 mcquack kernel: TCP: dbg sk->wmem_queued 5664 tcp_orphan_count 99 tcp_memory_allocated 6145

Number Z was exceeded: newly _closed_ sockets will be aborted, and the
stack entered a state of moderating its appetite.
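
This accounting can also be watched without a debug printk via
/proc/net/sockstat; the "mem" field of the TCP line is
tcp_memory_allocated in pages. Sample output (the numbers here are
made up for illustration):

	# cat /proc/net/sockstat
	sockets: used 5210
	TCP: inuse 5032 orphan 99 tw 120 alloc 5100 mem 6145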

The dump you sent me, and the further messages in the logs, show that
it succeeded and converged to a normal state.


> Please let me know if I can provide more debug info or test something!

Actually, the only dubious place in your original report was something
about the behaviour of ssh. ssh surely cannot be affected by this
effect. Could you elaborate on this? What kind of problem exactly?
Maybe send some tcpdump, if the problem is reproducible.

Alexey


2001-02-21 11:15:32

by Magnus Walldal

Subject: RE: 2.4.1 under heavy network load - more info


> Hello!
>
> > of errors a bit but I'm not sure I fully understand the implications of
> > doing so.
>
> As long as these numbers do not exceed the total amount of RAM, this
> is exactly the action required in this case.

OK! I actually expected 2.4 to be somewhat self-tuning. But if I exceed
the amount of physical RAM because I use a lot of sockets, I can
understand that Linux refuses.

> The dumps you sent me show nothing pathological. Actually, they were
> made during a period of full peace: only 47 orphans and only about
> 10MB of accounted memory.

Interesting that you say that: I looked at the logs and I see over 5000
sockets in use, which doesn't look peaceful to me. But you are
absolutely right about the orphans. The error about "too many orphans"
must be wrong and is triggered by some other condition. Look at the
output from the debug printk I've added:

Feb 18 15:43:50 mcquack kernel: TCP: too many of orphaned sockets
Feb 18 15:43:50 mcquack kernel: TCP: debug sk->wmem_queued 5364 tcp_orphan_count 124 tcp_memory_allocated 6146

Not many orphans, but still an error. OK!

> > echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
>
> This is not essential with 2.4. In 2.4 this state does not consume
> any significant resources.
>
> > echo 0 > /proc/sys/net/ipv4/tcp_timestamps
> > echo 0 > /proc/sys/net/ipv4/tcp_sack
>
> Why?

Removed; it makes no visible difference. OK!

> > echo "3072 3584 6144" > /proc/sys/net/ipv4/tcp_mem
>
> If you still have problems with orphans, you should raise these
> numbers. The most extreme settings would be something like:
> [...]
> The dump you sent me, and the further messages in the logs, show
> that it succeeded and converged to a normal state.

I raised the numbers a little bit more. Now, with 128MB RAM in the box,
we can handle a maximum of 7000 connections; no more, because we start
to swap too much.

Feb 21 10:43:41 mcquack kernel: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks

One about every 10 minutes.

I want to add a few conclusions and observations to all this.
Previously the owner of this machine handled about 10k connections with
FreeBSD, same amount of RAM, same app and no tuning.

To summarize:
1) To run with very many sockets open I have to tune Linux heavily
2) The error about "too many orphans" is bogus?
3) I will get a lot of debug crap in syslog
4) FreeBSD with the same hardware and software handles more connections

Found this article, "The Linux 20k socket challenge". Pretty interesting!
http://old.jabber.org/article/34.html


> Actually, the only dubious place in your original report was something
> about behaviour of ssh. ssh surely cannot be affected by this effect.
> Could you elaborate this? What kind of problem exactly? Maybe, some
> tcpdump is the problem is reproducable.
>
> Alexey

This happened once under very heavy load (8000+ connections) and I have
been unable to reproduce it.

We have decided to upgrade the system and try to handle 20-30k connections,
will let you know how it goes!

Regards,
Magnus Walldal

2001-02-21 18:08:06

by Alexey Kuznetsov

Subject: Re: 2.4.1 under heavy network load - more info

Hello!

> OK! I actually expected 2.4 to be somewhat self-tuning.

The defaults for these numbers (X, Y, Z) are very conservative.


> Interesting that you say that: I looked at the logs and I see over 5000
> sockets in use, which doesn't look peaceful to me. But you are
> absolutely right about the orphans. The error about "too many orphans"
> must be wrong and is triggered by some other condition. Look at the
> output from the debug printk I've added:
>
> Feb 18 15:43:50 mcquack kernel: TCP: too many of orphaned sockets

Well, the message is not accurate. The kernel refuses to hold this
particular orphan because it feels that too much memory is being
consumed. Raise number Z and the message will disappear.

Poor orphans are the first victims, because they have nobody to take
care of them but the kernel. And the kernel is a harsh parent. 8)


> I raised the numbers a little bit more. Now, with 128MB RAM in the box,
> we can handle a maximum of 7000 connections; no more, because we start
> to swap too much.

Really? Well, it is unlikely to have anything to do with the net. Your
dumps show that at 6000 connections networking ate less than 10MB of
memory. Probably the swapping is mistuned.


> Feb 21 10:43:41 mcquack kernel: KERNEL: assertion (tp->lost_out == 0) failed at tcp_input.c(1202):tcp_remove_reno_sacks

This is also debugging. Harmless.


> 2) The error about "too many orphans" is bogus?

Yes. It is a sort of disinformation. What it really means is that the
accounting detected that the configured limits were exceeded.


> 3) I will get a lot of debug crap in syslog

It will disappear as soon as debugging is disabled, i.e. when the
kernel enters distributions, I guess.

If I were responsible for this, I would not kill the messages.
The more messages, the better. Otherwise you would have nothing to
report, and would not even notice that something is wrong. 8)


> This happened once under very heavy load (8000+ connections) and I
> have been unable to reproduce it.

Probably this has nothing to do with TCP, but is explained by some VM
failure, such as the OOM killer.

Alexey

2001-02-23 02:16:47

by Rik van Riel

Subject: Re: 2.4.1 under heavy network load - more info

On Wed, 21 Feb 2001 [email protected] wrote:

> > I raised the numbers a little bit more. Now, with 128MB RAM in the box,
> > we can handle a maximum of 7000 connections; no more, because we start
> > to swap too much.
>
> Really? Well, it is unlikely to have anything to do with the net. Your
> dumps show that at 6000 connections networking ate less than 10MB of
> memory. Probably the swapping is mistuned.

In that case, could I see some vmstat (and/or top) output of
when the kernel is no longer able to keep up, or maybe even
a way I could reproduce these things at the office?

I'm really interested in things which make Linux 2.4 break
performance-wise since I'd like to have them fixed before the
distributions start shipping 2.4 as default.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com.br/

2001-02-23 12:52:25

by Chris Evans

Subject: Re: 2.4.1 under heavy network load - more info


On Wed, 21 Feb 2001, Rik van Riel wrote:

> I'm really interested in things which make Linux 2.4 break
> performance-wise since I'd like to have them fixed before the
> distributions start shipping 2.4 as default.

Hi Rik,

With kernel 2.4.1, I found that caching is way too aggressive. I was
running konqueror in 32Mb (the quest for a lightweight browser!).
Unfortunately, the system seemed to insist on keeping 16Mb used for
caches, with 15Mb given to the application and X. This led to a lot of
swapping and paging by konqueror. I think the browser would be fully
usable in 32Mb, were the caching not out of balance.

Cheers
Chris

2001-02-23 13:37:35

by Rik van Riel

Subject: Re: 2.4.1 under heavy network load - more info

On Fri, 23 Feb 2001, Chris Evans wrote:
> On Wed, 21 Feb 2001, Rik van Riel wrote:
>
> > I'm really interested in things which make Linux 2.4 break
> > performance-wise since I'd like to have them fixed before the
> > distributions start shipping 2.4 as default.
>
> With kernel 2.4.1, I found that caching is way too aggressive. I
> was running konqueror in 32Mb (the quest for a lightweight
> browser!). Unfortunately, the system seemed to insist on keeping
> 16Mb used for caches, with 15Mb given to the application and X.

Wrong.

Cache and processes are INCLUSIVE. Konqueror and your other
applications will share a lot of memory with the cache. More
precisely, everything which is backed by a file or has been
swapped out once (and swapped back in later) will SHARE memory
with both cache and processes.
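
A quick way to see this is the "-/+ buffers/cache" line of free, which
subtracts the (largely shared and reclaimable) cache from "used". The
output below is only illustrative, sketched for a 32Mb box:

	$ free -m
	             total   used   free  shared  buffers  cached
	Mem:            31     30      1       0        0       16
	-/+ buffers/cache:     14     17
	Swap:          488     20    468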

In 2.4.1-pre<something> the kernel swaps out cache 32 times more
aggressively than it scans pages in processes. Until we find a way
to auto-balance these things, expect them to be wrong for at least
some workloads ;(

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com.br/

2001-02-23 17:25:49

by Magnus Walldal

Subject: RE: 2.4.1 under heavy network load - more info



> In that case, could I see some vmstat (and/or top) output of
> when the kernel is no longer able to keep up, or maybe even
> a way I could reproduce these things at the office?

Interactive response is actually pretty OK. The only thing I'm seeing
is short (about 1 sec) pauses; they could be due to network problems or
VM stuff... hard to say, because I work with the machine over the net
and not from the console. What I do see during these short pauses is
that the sendq is building up on the remote end; nothing happens for a
short while, and then things continue as if nothing bad happened ;)

It feels like a subtle problem, nothing terribly wrong, but the system
does not feel 100% OK either, be it a networking or a VM problem.

Some data from vmstat:
root@mcquack:/root# vmstat 3
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0  76572   2400    324  48160   4   2     1     1    22    14  47  39  14
 1  0  0  76572   2400    324  48160   0   0     0     1  3889     5  49  51   0
 1  0  0  76572   2400    324  48160   0   0     0     1  3698     7  48  52   0
 1  0  0  76572   2400    324  48160   0   0     0     0  3759     6  46  54   0
 1  0  0  76572   2400    324  48160   0   0     0     0  2987     6  48  52   1
 1  0  0  76572   2400    324  48160   0   0     0     0  3015     5  46  54   0
 1  0  0  76572   2400    324  48160   0   0     0     0  4024     4  45  55   0
 2  0  0  76572   2396    324  48160   0   0     0     0  4066    21  44  42  14
 1  0  0  76572   2400    324  48160   0   0     0     0  3995    75  29  26  44
 1  0  0  76572   2400    324  48160   0   0     0     0  3747    30  43  40  16
 1  0  0  76572   2400    324  48160   0   0     0     0  3568     5  44  56   0
 1  0  0  76572   2400    324  48160   0   0     0     0  3942     4  43  57   0
 1  0  0  76572   2400    324  48160   0   0     0     0  3702     5  44  56   0
 1  0  0  76572   2376    320  48176   0   0     0     0  3994    50  33  32  34
 1  0  0  76572   2376    320  48176   0   0     0     0  3637    31  25  24  51
 1  0  0  76572   2376    320  48176   0   0     0     0  3445     5  48  52   0
 1  0  0  76572   2376    320  48176   0   0     0     0  3709     5  52  48   0

This goes on and on: long periods of zero idle time and then a short
period with some idle time and some more cs. The "short pauses" (when
they happen to occur) are just before or slightly after the period with
more context switches.

Top says:
4:53pm up 5 days, 13:42, 1 user, load average: 0.62, 0.69, 0.64
22 processes: 19 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 47.1% user, 52.8% system, 0.0% nice, 0.0% idle
Mem: 127264K av, 125052K used, 2212K free, 0K shrd, 452K buff
Swap: 499960K av, 76680K used, 423280K free  48480K cached

PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
1152 adm 20 0 110M 78M 42792 R 0 99.3 63.1 6772m ircd


Btw, the box is a PII-450, so it's not terribly slow ;)


> I'm really interested in things which make Linux 2.4 break
> performance-wise since I'd like to have them fixed before the
> distributions start shipping 2.4 as default.


As always, I'm happy to provide you with more information if I can!

Regards,
Magnus