LinuxLists.cc - [patch] CFS scheduler, -v13

2007-05-17 17:48:01

Subject: [patch] CFS scheduler, -v13

i'm pleased to announce release -v13 of the CFS scheduler patchset.

The CFS patch against v2.6.22-rc1, v2.6.21.1 or v2.6.20.10 can be
downloaded from the usual place:

http://people.redhat.com/mingo/cfs-scheduler/

-v13 is a fixes-only release. It fixes a smaller accounting bug, so if
you saw small lags during desktop use under certain workloads then
please re-check that workload under -v13 too. It also tweaks SMP
load-balancing a bit. (Note: the load-balancing artifact reported by
Peter Williams is not a CFS-specific problem and he reproduced it in
v2.6.21 too. Nevertheless -v13 should be less prone to such artifacts.)

I know about no open CFS regression at the moment, so please re-test
-v13 and if you still see any problem please re-report it. Thanks!

Changes since -v12:

- small tweak: made the "fork flow" of reniced tasks zero-sum

- debugging update: /proc/<PID>/sched is now seqfile based and echoing
0 to it clears the maximum-tracking counters.

- more debugging counters

- small rounding fix to make the statistical average of rounding errors
zero

- scale both the runtime limit and the granularity on SMP too, and make
it dependent on HZ

- misc cleanups

As usual, any sort of feedback, bugreport, fix and suggestion is more
than welcome,

Ingo

2007-05-17 21:49:12

Subject: Re: [patch] CFS scheduler, -v13

On Thursday 17 May 2007 23:15:33 Ingo Molnar wrote:
> i'm pleased to announce release -v13 of the CFS scheduler patchset.
>
> The CFS patch against v2.6.22-rc1, v2.6.21.1 or v2.6.20.10 can be
> downloaded from the usual place:
>
> http://people.redhat.com/mingo/cfs-scheduler/
>
> -v13 is a fixes-only release. It fixes a smaller accounting bug, so if
> you saw small lags during desktop use under certain workloads then
> please re-check that workload under -v13 too. It also tweaks SMP
> load-balancing a bit. (Note: the load-balancing artifact reported by
> Peter Williams is not a CFS-specific problem and he reproduced it in
> v2.6.21 too. Nevertheless -v13 should be less prone to such artifacts.)
>
> I know about no open CFS regression at the moment, so please re-test
> -v13 and if you still see any problem please re-report it. Thanks!
>
> Changes since -v12:
>
> - small tweak: made the "fork flow" of reniced tasks zero-sum
>
> - debugging update: /proc/<PID>/sched is now seqfile based and echoing
> 0 to it clears the maximum-tracking counters.
>
> - more debugging counters
>
> - small rounding fix to make the statistical average of rounding errors
> zero
>
> - scale both the runtime limit and the granularity on SMP too, and make
> it dependent on HZ
>
> - misc cleanups
>
> As usual, any sort of feedback, bugreport, fix and suggestion is more
> than welcome,
>
> Ingo
> -
Hi,
I have been testing this version of CFS for the last hour or so and am still
facing the same lag problems while browsing sites with heavy JS and/or Flash
usage. Mouse movement is pathetic and audio starts to skip. I did not see this
behavior with CFS up to v11.

Regards
Ananitya

2007-05-18 10:27:29

Subject: Re: [patch] CFS scheduler, -v13

* Anant Nitya <[email protected]> wrote:

> Hi
>
> Been testing this version of CFS from last an hour or so and still
> facing same lag problems while browsing sites with heavy JS and or
> flash usage. Mouse movement is pathetic and audio starts to skip. I
> haven't face this behavior with CFS till v11.

i have just tried 5 different versions of the Flash plugin and i cannot
reproduce this (flash games are still smooth and acceptable even with
the system significantly overloaded with 5 infinite loops or with a kernel
build), so it would be nice if you could help me debug this problem.

The last version that worked for you was v11, correct? The biggest v11
-> v12 change was the yield workaround, and while testing your workload
i also noticed that all Flash versions except the latest one (9.0 r31)
use sys_sched_yield() quite frequently. So it would be nice to know
which plugin version you are using (and which Firefox version): you can
check that by typing about:plugins into firefox. Furthermore, could you
also try the following tune:

echo 0 > /proc/sys/kernel/sched_yield_bug_workaround

and this:

echo 2 > /proc/sys/kernel/sched_yield_bug_workaround

if none of this changes behavior then please send me the output of the
following:

strace -ttt -TTT -o strace.txt -f -p `pidof firefox-bin`
< reproduce the lag in firefox >
< Ctrl-C the strace >

and send me the strace.txt file (off-line, it's going to be large).
Thanks,

Ingo

2007-05-18 15:20:37

by Mike Lothian

Subject: Re: [patch] CFS scheduler, -v13

Just thought I'd let you know that CFS is working on the PS3

neutrino boot # dmesg
Using PS3 machine description
Page orders: linear mapping = 24, virtual = 12, io = 12
Starting Linux PPC64 #1 SMP Fri May 18 09:26:38 UTC 2007
-----------------------------------------------------
ppc64_pft_size = 0x14
physicalMemorySize = 0x8000000
ppc64_caches.dcache_line_size = 0x80
ppc64_caches.icache_line_size = 0x80
htab_address = 0x0000000000000000
htab_hash_mask = 0x1fff
-----------------------------------------------------
Linux version 2.6.22-rc1-cfs-v13 (root@localhost) (gcc version 4.1.1
(Gentoo 4.1.1-r3)) #1 SMP Fri May 18 09:26:38 UTC 2007

It feels more responsive but I shall do more testing and see if there
are any real benefits

On 17/05/07, Ingo Molnar <[email protected]> wrote:
>
> i'm pleased to announce release -v13 of the CFS scheduler patchset.
>
> The CFS patch against v2.6.22-rc1, v2.6.21.1 or v2.6.20.10 can be
> downloaded from the usual place:
>
> http://people.redhat.com/mingo/cfs-scheduler/
>
> -v13 is a fixes-only release. It fixes a smaller accounting bug, so if
> you saw small lags during desktop use under certain workloads then
> please re-check that workload under -v13 too. It also tweaks SMP
> load-balancing a bit. (Note: the load-balancing artifact reported by
> Peter Williams is not a CFS-specific problem and he reproduced it in
> v2.6.21 too. Nevertheless -v13 should be less prone to such artifacts.)
>
> I know about no open CFS regression at the moment, so please re-test
> -v13 and if you still see any problem please re-report it. Thanks!
>
> Changes since -v12:
>
> - small tweak: made the "fork flow" of reniced tasks zero-sum
>
> - debugging update: /proc/<PID>/sched is now seqfile based and echoing
> 0 to it clears the maximum-tracking counters.
>
> - more debugging counters
>
> - small rounding fix to make the statistical average of rounding errors
> zero
>
> - scale both the runtime limit and the granularity on SMP too, and make
> it dependent on HZ
>
> - misc cleanups
>
> As usual, any sort of feedback, bugreport, fix and suggestion is more
> than welcome,
>
> Ingo

2007-05-18 15:58:19

Subject: Re: [patch] CFS scheduler, -v13

* Michael Lothian <[email protected]> wrote:

> Just thought I'd let you know that CFS is working on the PS3

heh, an important milestone i think =B-)

Ingo

2007-05-18 18:23:17

Subject: Re: [patch] CFS scheduler, -v13

On Friday 18 May 2007 15:56:07 Ingo Molnar wrote:
> * Anant Nitya <[email protected]> wrote:
> > Hi
> >
> > Been testing this version of CFS from last an hour or so and still
> > facing same lag problems while browsing sites with heavy JS and or
> > flash usage. Mouse movement is pathetic and audio starts to skip. I
> > haven't face this behavior with CFS till v11.
>
> i have just tried 5 different versions of the Flash plugin and i cannot
> reproduce this (flash games are still smooth and acceptable even with
> the system significantly overloaded with 5 infite loops or with a kernel
> build), so it would be nice if you could help me debug this problem.
>
> The last version that worked for you was v11, correct? The biggest v11
> -> v12 change was the yield workaround, and while testing your workload
> i also noticed that all Flash versions except the latest one (9.0 r31)
> use sys_sched_yield() quite frequently. So it would be nice to know
> which plugin version you are using (and which Firefox version): you can
> check that by typing about:plugins into firefox. Furthermore, could you
> also try the following tune:
Hi
I am using konqueror and about:plugins gives back this information regarding
flashplayer.
Shockwave Flash Shockwave Flash 9.0 r31 libflashplayer.so
application/x-shockwave-flash - Shockwave Flash (swf)
application/futuresplash - FutureSplash Player (spl)
>
> echo 0 > /proc/sys/kernel/sched_yield_bug_workaround
>
> and this:
>
> echo 2 > /proc/sys/kernel/sched_yield_bug_workaround
>
These values do make browsing visibly smoother, but it still lags, though the
lag time is less than with the original values.

> if none of this changes behavior then please send me the output of the
> following:
>
> strace -ttt -TTT -o strace.txt -f -p `pidof firefox-bin`
> < reproduce the lag in firefox >
> < Ctrl-C the strace >
>
> and send me the strace.txt file (off-line, it's going to be large).
> Thanks,
I am sending you all this information off-list.

Regards
Ananitya

2007-05-19 21:16:56

Subject: Re: [patch] CFS scheduler, -v13

On Friday 18 May 2007 15:56:07 Ingo Molnar wrote:
> * Anant Nitya <[email protected]> wrote:
> > Hi
> >
> > Been testing this version of CFS from last an hour or so and still
> > facing same lag problems while browsing sites with heavy JS and or
> > flash usage. Mouse movement is pathetic and audio starts to skip. I
> > haven't face this behavior with CFS till v11.
>
> i have just tried 5 different versions of the Flash plugin and i cannot
> reproduce this (flash games are still smooth and acceptable even with
> the system significantly overloaded with 5 infite loops or with a kernel
> build), so it would be nice if you could help me debug this problem.
>
> The last version that worked for you was v11, correct? The biggest v11
> -> v12 change was the yield workaround, and while testing your workload
> i also noticed that all Flash versions except the latest one (9.0 r31)
> use sys_sched_yield() quite frequently. So it would be nice to know
> which plugin version you are using (and which Firefox version): you can
> check that by typing about:plugins into firefox. Furthermore, could you
> also try the following tune:
>
> echo 0 > /proc/sys/kernel/sched_yield_bug_workaround
>
> and this:
>
> echo 2 > /proc/sys/kernel/sched_yield_bug_workaround
>
> if none of this changes behavior then please send me the output of the
> following:
>
> strace -ttt -TTT -o strace.txt -f -p `pidof firefox-bin`
> < reproduce the lag in firefox >
> < Ctrl-C the strace >
>
> and send me the strace.txt file (off-line, it's going to be large).
> Thanks,
Hi Ingo,

Please ignore my last report about the lag problem with CFS-v13. It is
working perfectly fine with 2.6.21.1, and the lag I used to see in v12 is no
longer there with v13. After digging in a bit I found that the problem only
occurs in 2.6.22-rc1 and is triggered by network usage while transmitting
data upstream. I don't have any evidence that CFS is involved in the lag,
since 2.6.22-rc1 with the stock scheduler has the same problem. The lag seems
directly proportional to upstream speed, while downstream traffic doesn't
show any misbehavior (at lower upstream speeds the lag is less, but at higher
upstream speeds the system starts crawling and the system load hits 70/75).
Let's see how 2.6.22-rc2 does.

Regards
Ananitya

>
> Ingo

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

2007-05-20 06:39:12

Subject: Re: [patch] CFS scheduler, -v13

* Anant Nitya <[email protected]> wrote:

> Hi Ingo,
>
> Please ignore my last report about lag problem while using CFS-v13, it
> is working perfectly fine with 2.6.21.1 and the lag I used to see in
> v12 is not there with v13 anymore. [...]

ah, great - i was looking over your debug data and couldn't find the
problem! This moves CFS into the "no open regressions" state again ;-)

> [...] After digging in a bit I found that problem is only occurring in
> 2.6.22-rc1 and it get fired by network usage while transmitting data
> upstream. I don't have any evidence that CFS is involved in lag
> problem since 2.6.22-rc1 with stock scheduler is also having same lag
> problem and it seems directly proportional with upstream speed while
> downstream doesn't shows any misbehavior { at lower upstream speed lag
> is less but with higher upstream speed system starts crawling and
> system load hitting to 70/75}. Lets see how 2.6.22-rc2 is doing.

if that lag still occurs with rc2 then please repeat the following
debugging steps under CFS [which has more instrumentation than the stock
scheduler]:

cat /proc/`pidof firefox-bin`/sched > sched1.txt
echo 0 > /proc/`pidof firefox-bin`/sched

< reproduce the lag in firefox >

cat /proc/`pidof firefox-bin`/sched > sched2.txt

this way we'll be able to tell what nature this delay has. Also, could
you send me your kernel's .config (off-list)?

Ingo

2007-05-21 07:59:26

Subject: Re: bad networking related lag in v2.6.22-rc2

* Anant Nitya <[email protected]> wrote:

> Please ignore my last report about lag problem while using CFS-v13, it
> is working perfectly fine with 2.6.21.1 and the lag I used to see in
> v12 is not there with v13 anymore. After digging in a bit I found that
> problem is only occurring in 2.6.22-rc1 and it get fired by network
> usage while transmitting data upstream. I don't have any evidence that
> CFS is involved in lag problem since 2.6.22-rc1 with stock scheduler
> is also having same lag problem and it seems directly proportional
> with upstream speed while downstream doesn't shows any misbehavior {
> at lower upstream speed lag is less but with higher upstream speed
> system starts crawling and system load hitting to 70/75}. Lets see how
> 2.6.22-rc2 is doing.

ok, i got your -rc2 debug numbers (off-list), and it doesn't look pretty:

before-lag:

sleep_max : 259502076
block_max : 27690921
wait_max : 16381558

after-lag:

sleep_max : 584186160
block_max : 261780071
wait_max : 881255577

ouch! a nearly 1 second delay got observed by the scheduler - something
is really killing your system!

what does 'top' show during an upload? Is any system related task out of
whack? Could you try to get a readprofile or an oprofile output from the
kernel, so that we can see what is slowing it down so much? It could be
something networking related in v2.6.22-rc2.

Ingo
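
The before/after comparison above can be sketched in a few lines of Python.
This is only an illustration, not part of the CFS patchset: it assumes the
counters appear as "name : value" lines exactly as quoted above, and that the
values are in nanoseconds (which is what makes wait_max read as "nearly 1
second"). The function names are made up for this example.

```python
def parse_counters(text):
    """Parse 'name : value' lines into a dict of ints, skipping other lines."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


# Values copied from the two snapshots quoted in the message above.
BEFORE = """\
sleep_max : 259502076
block_max : 27690921
wait_max : 16381558
"""
AFTER = """\
sleep_max : 584186160
block_max : 261780071
wait_max : 881255577
"""


def growth_ms(before, after):
    """Growth of each counter in milliseconds, assuming nanosecond units."""
    return {k: (after[k] - before[k]) / 1e6 for k in before if k in after}


before = parse_counters(BEFORE)
after = parse_counters(AFTER)
for name, delta in growth_ms(before, after).items():
    print(f"{name}: {before[name] / 1e6:.1f} ms -> "
          f"{after[name] / 1e6:.1f} ms (+{delta:.1f} ms)")
```

Under the nanosecond assumption, the wait_max line is the one that stands
out: the longest time the task spent runnable but not running grew from
about 16 ms to about 881 ms.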

2007-05-21 08:04:47

Subject: Re: bad networking related lag in v2.6.22-rc2

* Ingo Molnar <[email protected]> wrote:

> ok, i got your -rc2 debug numbers (off-list), and it doesnt look pretty:
>
> before-lag:
>
> sleep_max : 259502076
> block_max : 27690921
> wait_max : 16381558
>
> after-lag:
>
> sleep_max : 584186160
> block_max : 261780071
> wait_max : 881255577
>
> ouch! a nearly 1 second delay got observed by the scheduler - something
> is really killing your system!
>
> what does 'top' show during an upload? Is any system related task out
> of whack? Could you try to get a readprofile or an oprofile output
> from the kernel, so that we can see what is slowing it down so much?
> It could be something networking related in v2.6.22-rc2.

ah, you got the latency tracer from Thomas, as part of the -hrt patchset
- that makes it quite a bit easier to debug. Could you run the attached
trace-it-10sec utility:

trace-it-10sec > trace-to-ingo.txt

and send me the (compressed) trace output (off-list, or post an URL to
the trace)? Try to reproduce the 'lag' event in the 10 seconds while the
tracer is running.

Ingo

2007-05-21 08:06:25

Subject: Re: bad networking related lag in v2.6.22-rc2

* Ingo Molnar <[email protected]> wrote:

> ah, you got the latency tracer from Thomas, as part of the -hrt patchset
> - that makes it quite a bit easier to debug. Could you run the attached
> trace-it-10sec utility:
>
> trace-it-10sec > trace-to-ingo.txt

attached ...

Ingo

Attachments:

(No filename) (278.00 B)
trace-it-10sec.c (2.31 kB)

2007-05-21 08:12:51

Subject: Re: bad networking related lag in v2.6.22-rc2

* Ingo Molnar <[email protected]> wrote:

> > ouch! a nearly 1 second delay got observed by the scheduler - something
> > is really killing your system!
>
> ah, you got the latency tracer from Thomas, as part of the -hrt patchset
> - that makes it quite a bit easier to debug. [...]

and ... you already did a trace for Thomas, for the softirq problem:

http://cybertek.info/taitai/trace.txt.bz2

this trace shows really bad networking related kernel activities!

gkrellm-5977 does this at timestamp 0:

gkrellm-5977 0..s. 0us : cond_resched_softirq (established_get_next)

2 milliseconds later it's still in established_get_next() (!):

gkrellm-5977 0..s. 2001us : cond_resched_softirq (established_get_next)

and the whole thing takes ... 455 msecs:

gkrellm-5977 0..s. 455443us+: cond_resched_softirq (established_get_next)

i think this suggests that you have tons of open sockets. What does
"netstat -ts" say on your box?

Ingo
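
The elapsed time read off the trace above can also be computed mechanically.
The following is a rough Python sketch, assuming trace lines have the shape
quoted in this message (task-pid, flags, a microsecond timestamp optionally
followed by a marker such as "+", then "function (caller)"). The regular
expression and the helper name span_us are inventions for this example and
have only been checked against the three lines shown here.

```python
import re

# Matches e.g. "gkrellm-5977 0..s. 455443us+: cond_resched_softirq (established_get_next)"
TRACE_LINE = re.compile(
    r"^(?P<task>\S+)-(?P<pid>\d+)\s+(?P<flags>\S+)\s+(?P<us>\d+)us\S*\s*:\s*"
    r"(?P<func>\w+)\s+\((?P<caller>\w+)\)"
)

SAMPLE = """\
gkrellm-5977 0..s. 0us : cond_resched_softirq (established_get_next)
gkrellm-5977 0..s. 2001us : cond_resched_softirq (established_get_next)
gkrellm-5977 0..s. 455443us+: cond_resched_softirq (established_get_next)
"""


def span_us(trace_text, caller):
    """Microseconds between the first and last entry made from `caller`."""
    stamps = []
    for line in trace_text.splitlines():
        m = TRACE_LINE.match(line.strip())
        if m and m.group("caller") == caller:
            stamps.append(int(m.group("us")))
    return max(stamps) - min(stamps) if stamps else 0


elapsed = span_us(SAMPLE, "established_get_next")
print(f"established_get_next: {elapsed} us ({elapsed / 1000:.0f} ms)")
```

On the three quoted lines this gives the same 455 msecs figure mentioned
above; on a full trace the same helper could be pointed at any caller of
interest.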

2007-05-21 08:25:47

by David Miller

Subject: Re: bad networking related lag in v2.6.22-rc2

From: Ingo Molnar <[email protected]>
Date: Mon, 21 May 2007 09:58:24 +0200

> what does 'top' show during an upload? Is any system related task
> out of whack? Could you try to get a readprofile or an oprofile
> output from the kernel, so that we can see what is slowing it down
> so much? It could be something networking related in v2.6.22-rc2.

There is a driver specific problem that's been around for a while,
but it only affects 3c59x chips, is that what you have?

2007-05-21 08:28:54

Subject: Re: bad networking related lag in v2.6.22-rc2

* David Miller <[email protected]> wrote:

> From: Ingo Molnar <[email protected]>
> Date: Mon, 21 May 2007 09:58:24 +0200
>
> > what does 'top' show during an upload? Is any system related task
> > out of whack? Could you try to get a readprofile or an oprofile
> > output from the kernel, so that we can see what is slowing it down
> > so much? It could be something networking related in v2.6.22-rc2.
>
> There is a driver specific problem that's been around for a while, but
> it only effects 3c59x chips, is that what you have?

the problem first showed up in v2.6.22-rc1 and he didn't have it in
v2.6.21 - does that still qualify his box for the 3c59x problem?

Ingo

2007-05-21 08:29:19

by David Miller

Subject: Re: bad networking related lag in v2.6.22-rc2

From: Ingo Molnar <[email protected]>
Date: Mon, 21 May 2007 10:12:01 +0200

> and ... you already did a trace for Thomas, for the softirq problem:
>
> http://cybertek.info/taitai/trace.txt.bz2
>
> this trace shows really bad networking related kernel activities!
>
> gkrellm-5977 does this at timestamp 0:
>
> gkrellm-5977 0..s. 0us : cond_resched_softirq (established_get_next)

So it's not the 3c59x bug :-)

If you have a lot of sockets, there is no way to make
the performance of dumping /proc/net/tcp not suck; use
the netlink socket dumping, which is:

1) more efficient even for full dumps
2) allows filtering for the best possible performance

2007-05-21 08:30:45

by David Miller

Subject: Re: bad networking related lag in v2.6.22-rc2

From: Ingo Molnar <[email protected]>
Date: Mon, 21 May 2007 10:28:05 +0200

> the problem first showed up in v2.6.22-rc1 and he didnt have it in
> v2.6.21 - does that still qualify his box for the 3c59x problem?

If the latency is showing up in /proc/net/tcp dumping, it's not the
3c59x problem.

Please just discard any latency trace that shows symbols from that
code, really.

2007-05-21 10:19:58

Subject: Re: bad networking related lag in v2.6.22-rc2

* David Miller <[email protected]> wrote:

> > gkrellm-5977 0..s. 0us : cond_resched_softirq
> > (established_get_next)
>
> So it's not the 3c59x bug :-)
>
> If you have a lot of sockets, there is not way to make the performance
> of dumping /proc/net/tcp not suck, use the netlink socket dumping
> which is:
>
> 1) more efficient even for full dumps
> 2) allows filtering for the best possible performance

hm, there is a cond_resched_softirq() for every line output so the
actual latency from this alone shouldn't be that bad. While /proc/net/tcp
has a quadratic algorithm, the per-line latency is O(N), which shouldn't
show up on the radar.

but note that Ananitya is running a fast system as a stock desktop
system browsing the web, so there shouldn't be tons of sockets. So the
latency isn't caused by /proc/net/tcp itself, but there does seem to be
some networking related anomaly.

we'll hopefully be able to tell this more specifically from the re-done
trace.

Ingo
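
The "quadratic overall, O(N) per line" remark can be illustrated with a toy
cost model. This is not the kernel's /proc/net/tcp code; it simply assumes
that producing output line i requires walking i entries from the start of the
table, which is one way to get the cost shape described above. The function
name dump_cost is hypothetical.

```python
def dump_cost(n_sockets):
    """Return (worst per-line cost, total cost) in entries visited.

    Toy assumption: emitting line i re-walks the table from the start,
    so line i costs i visits and the whole dump costs 1 + 2 + ... + N.
    """
    per_line = [i + 1 for i in range(n_sockets)]
    return max(per_line, default=0), sum(per_line)


for n in (5, 1000, 10000):
    worst, total = dump_cost(n)
    print(f"{n:>6} sockets: worst line visits {worst}, whole dump visits {total}")
```

In this model the work between two reschedule points never exceeds N visits,
even though the whole dump grows as N squared, which is why a handful of
sockets should not be able to produce a multi-hundred-millisecond stall on
its own.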

2007-05-21 10:21:24

Subject: Re: bad networking related lag in v2.6.22-rc2

On Monday 21 May 2007 13:42:01 Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
> > > ouch! a nearly 1 second delay got observed by the scheduler - something
> > > is really killing your system!
> >
> > ah, you got the latency tracer from Thomas, as part of the -hrt patchset
> > - that makes it quite a bit easier to debug. [...]
>
> and ... you already did a trace for Thomas, for the softirq problem:
>
> http://cybertek.info/taitai/trace.txt.bz2
>
> this trace shows really bad networking related kernel activities!
>
> gkrellm-5977 does this at timestamp 0:
>
> gkrellm-5977 0..s. 0us : cond_resched_softirq (established_get_next)
>
> 2 milliseconds later it's still in established_get_next() (!):
>
> gkrellm-5977 0..s. 2001us : cond_resched_softirq (established_get_next)
>
> and the whole thing takes ... 455 msecs:
>
> gkrellm-5977 0..s. 455443us+: cond_resched_softirq (established_get_next)
>
> i think this suggests that you have tons of open sockets. What does
> "netstat -ts" say on your box?
On 2.6.21.1, doing normal work while seeding a few torrents, "netstat -ts"
produces the attached output. I will send you the same information for
2.6.22-rc2 after a reboot.

Regards
Ananitya

>
> Ingo

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

Attachments:

(No filename) (1.35 kB)
netstat-ts-normal-workload.txt (1.33 kB)

2007-05-21 10:21:44

Subject: Re: bad networking related lag in v2.6.22-rc2

* Anant Nitya <[email protected]> wrote:

> Tcp:
> 5 connections established

hm, this does not explain the /proc/net/tcp overhead i think - although
it could be a red herring. Will have a closer look at your new trace.

if possible please try to generate the automatic softirq trace for
Thomas, and then a separate trace for the firefox/net-lag thing, using
trace-it-10sec.c. Btw., for the second trace, could you boot with
maxcpus=1? That would make the second trace quite a bit more
straightforward to analyze. You probably need both cpus to trigger the
softirq problem.

Ingo
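
One way to cross-check a connection count like the one quoted above is to
tally the state column of /proc/net/tcp directly. The sketch below is a
hedged example: it assumes the usual layout in which the fourth
whitespace-separated field ("st") is a hex state code, and the sample text is
made up for illustration, not taken from the reporter's machine. The helper
name count_states is hypothetical.

```python
from collections import Counter

# Hex state codes as they appear in the "st" column.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Made-up sample in /proc/net/tcp layout (one listener, one loopback pair).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1234
   1: 0100007F:1F90 0100007F:C350 01 00000000:00000000 00:00000000 00000000  1000        0 5678
   2: 0100007F:C350 0100007F:1F90 01 00000000:00000000 00:00000000 00000000  1000        0 5679
"""


def count_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 3:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


print(dict(count_states(SAMPLE)))
```

On a live system the same function could be fed the contents of
/proc/net/tcp; a large number of entries in any one state would be a hint
that the socket tables are fuller than the netstat summary suggests.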

2007-05-21 16:04:57

by Linus Torvalds

Subject: Re: [patch] CFS scheduler, -v13

On Sun, 20 May 2007, Anant Nitya wrote:
>
> After digging in a bit I found that problem is only occurring in
> 2.6.22-rc1 and it get fired by network usage while transmitting data
> upstream.

Can you bisect it? Just do

git bisect start
git bisect good v2.6.21
git bisect bad v2.6.22-rc1

and start testing the end result. The bisection thing is pretty efficient,
so while there's almost 5000 commits in there, you really shouldn't need to
test more than ten kernels to get it narrowed down to just five commits or
so, and since it seems to be very repeatable and noticeable for you,
bisecting should be the trivial thing to figure out what broke.

David: all the blather about network drivers and/or /proc/net/tcp being
slow anyway misses the *big* point: it didn't use to do this. So there's a
new bug there. Maybe something keeps sockets around in a dead state on the
hash lists or whatever. Maybe something else breaks his bittorrent
client. Whatever. We don't know. But it's a regression.

Linus
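
The "ten kernels, five commits" estimate above follows from halving the
candidate range at every test. A small Python sketch of that arithmetic,
assuming an idealized linear history of 5000 commits (real bisection over a
history with merges may take a slightly different number of steps; the
function names are made up for this example):

```python
import math


def remaining_after(commits, steps):
    """Candidate commits left after `steps` tests, halving (rounded up) each time."""
    for _ in range(steps):
        commits = math.ceil(commits / 2)
    return commits


def steps_to_single(commits):
    """Tests needed to narrow an idealized linear range down to one commit."""
    return math.ceil(math.log2(commits))


print(remaining_after(5000, 10))   # candidates left after ten test kernels
print(steps_to_single(5000))       # tests needed to reach a single commit
```

Ten tests leave about five candidates out of 5000, and roughly three more
would isolate a single commit under the same idealized assumption.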

2007-05-21 19:58:31

Subject: Re: bad networking related lag in v2.6.22-rc2

On Monday 21 May 2007 13:42:01 Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
> > > ouch! a nearly 1 second delay got observed by the scheduler - something
> > > is really killing your system!
> >
> > ah, you got the latency tracer from Thomas, as part of the -hrt patchset
> > - that makes it quite a bit easier to debug. [...]
>
> and ... you already did a trace for Thomas, for the softirq problem:
>
> http://cybertek.info/taitai/trace.txt.bz2
>
> this trace shows really bad networking related kernel activities!
>
> gkrellm-5977 does this at timestamp 0:
>
> gkrellm-5977 0..s. 0us : cond_resched_softirq (established_get_next)
>
> 2 milliseconds later it's still in established_get_next() (!):
>
> gkrellm-5977 0..s. 2001us : cond_resched_softirq (established_get_next)
>
> and the whole thing takes ... 455 msecs:
>
> gkrellm-5977 0..s. 455443us+: cond_resched_softirq (established_get_next)
>
> i think this suggests that you have tons of open sockets. What does
> "netstat -ts" say on your box?
I am posting links to the information you asked for. One more thing: after
digging a bit more I found it is the QoS shaping that is making the box crawl.
Once I disable the traffic shaping, everything goes back to smooth and normal.
Shaping is being done on a very low-speed residential ADSL 256/64 Kbps
connection. If you want me to post the shaping rules, please feel free to ask.
BTW, it is a simple set of HTB/SFQ rules.

http://cybertek.info/taitai/netstat-ts-before-crawl-normal-workload.txt
http://cybertek.info/taitai/netstat-ts-while-crawl-normal-workload.txt
http://cybertek.info/taitai/trace-to-ingo.txt.bz2

Regards
Ananitya
>
> Ingo

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

2007-05-21 20:46:52

Subject: Re: bad networking related lag in v2.6.22-rc2

* Anant Nitya <[email protected]> wrote:

> I am posting links to the information you asked for. One more thing,
> after digging a bit more I found its QoS shaping that is making the
> box crawl. Once I disabled the traffic shaping everything comes back
> to smooth and normal. Shaping being done on very low speed residential
> ADSL 256/64 Kbps connection. If you want me to post shaping rules,
> please free to ask. BTW its a simple HTB/SFQ rules.
[...]
> http://cybertek.info/taitai/trace-to-ingo.txt.bz2

thanks! This trace indeed includes the smoking gun, htb_dequeue() and
__qdisc_run():

privoxy-12926 1.Ns1 1597us : rb_first (htb_dequeue)

this goes on, non-preemptible, for 160 milliseconds (!):

privoxy-12926 1.Ns1 161568us : rb_first (htb_dequeue)
privoxy-12926 1.Ns1 161568us : qdisc_watchdog_schedule (htb_dequeue)

and finally manages to escape the loop:

privoxy-12926 1.Ns1 161597us : rb_first (htb_dequeue)
privoxy-12926 1.Ns1 161597us : rb_first (htb_dequeue)
privoxy-12926 1.Ns1 161599us : htb_safe_rb_erase (htb_dequeue)
privoxy-12926 1.Ns1 161599us : rb_erase (htb_safe_rb_erase)
privoxy-12926 1.Ns1 161600us : htb_change_class_mode (htb_dequeue)
privoxy-12926 1.Ns1 161601us : htb_activate_prios (htb_change_class_mode)

and the system recovers.

David, any ideas about what's wrong with htb_dequeue(), based on this
trace?

Ingo

2007-05-21 21:02:26

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

Ingo Molnar wrote:
> * Anant Nitya <[email protected]> wrote:
>
>
>>I am posting links to the information you asked for. One more thing,
>>after digging a bit more I found its QoS shaping that is making the
>>box crawl. Once I disabled the traffic shaping everything comes back
>>to smooth and normal. Shaping being done on very low speed residential
>>ADSL 256/64 Kbps connection. If you want me to post shaping rules,
>>please free to ask. BTW its a simple HTB/SFQ rules.
>
> [...]
>
>>http://cybertek.info/taitai/trace-to-ingo.txt.bz2
>
>
> thanks! This trace indeed includes the smoking gun, htb_dequeue() and
> __qdisc_run():
>
> privoxy-12926 1.Ns1 1597us : rb_first (htb_dequeue)
>
> this goes on, non-preemptible, for 160 milliseconds (!):
>
> privoxy-12926 1.Ns1 161568us : rb_first (htb_dequeue)
> privoxy-12926 1.Ns1 161568us : qdisc_watchdog_schedule (htb_dequeue)
>
> and finally manages to escape the loop:
>
> privoxy-12926 1.Ns1 161597us : rb_first (htb_dequeue)
> privoxy-12926 1.Ns1 161597us : rb_first (htb_dequeue)
> privoxy-12926 1.Ns1 161599us : htb_safe_rb_erase (htb_dequeue)
> privoxy-12926 1.Ns1 161599us : rb_erase (htb_safe_rb_erase)
> privoxy-12926 1.Ns1 161600us : htb_change_class_mode (htb_dequeue)
> privoxy-12926 1.Ns1 161601us : htb_activate_prios (htb_change_class_mode)
>
> and the system recovers.
>
> David, any ideas about what's wrong with htb_dequeue(), based on this
> trace?

This looks like fallout from the switch to hrtimers. Anant, please
send me your HTB script, I'll try to reproduce it.

2007-05-21 21:30:45

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

[NET_SCHED]: sch_htb: fix event cache time calculation

The event cache time must be an absolute value, when no event exists it is
incorrectly set to 1s instead of 1s in the future.

Should fix excessive load reported by Anant Nitya <[email protected]>.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit 49d1023ea0ea8377e740123d5954e88a00f78b7c
tree 031c210f1b5e37ade5a4fa519f5808cd49225b89
parent 637fc540b0ad22bf7971929e906e704236af06cd
author Patrick McHardy <[email protected]> Mon, 21 May 2007 23:24:16 +0200
committer Patrick McHardy <[email protected]> Mon, 21 May 2007 23:25:51 +0200

net/sched/sch_htb.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 99bcec8..035788c 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -976,8 +976,9 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch)

 		if (q->now >= q->near_ev_cache[level]) {
 			event = htb_do_events(q, level);
-			q->near_ev_cache[level] = event ? event :
-							  PSCHED_TICKS_PER_SEC;
+			if (!event)
+				event = q->now + PSCHED_TICKS_PER_SEC;
+			q->near_ev_cache[level] = event;
 		} else
 			event = q->near_ev_cache[level];

Attachments:

x (1.18 kB)
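
The effect described in the patch text above (an absolute "next event" time
being set to a relative value) can be modeled outside the kernel. The Python
sketch below is a simplified model, not sch_htb itself: TICKS_PER_SEC stands
in for PSCHED_TICKS_PER_SEC with an assumed value, count_scans is a
hypothetical helper, and the model assumes no event is ever pending. It only
counts how often the "scan for events" branch would be taken. Note that later
in this thread the reporter says the patch alone did not cure the lag, so this
models the patch description and not necessarily the root cause.

```python
TICKS_PER_SEC = 1_000_000  # assumed stand-in for PSCHED_TICKS_PER_SEC


def count_scans(fixed, start, step, calls):
    """Count how many dequeue calls take the 'scan events' branch.

    `start` is the clock at the first call, `step` the ticks between calls.
    """
    cache = 0      # cached time of the nearest event (an absolute time)
    scans = 0
    now = start
    for _ in range(calls):
        if now >= cache:
            scans += 1
            event = 0  # model: no event found
            if fixed:
                # patched behavior: retry one second from now
                cache = event if event else now + TICKS_PER_SEC
            else:
                # old behavior: absolute cache set to a relative constant
                cache = event if event else TICKS_PER_SEC
        now += step
    return scans


uptime = 5 * TICKS_PER_SEC  # clock already well past one second
print("old:    ", count_scans(False, uptime, 1000, 100))
print("patched:", count_scans(True, uptime, 1000, 100))
```

In this model the old code rescans on every call once the clock has passed
one second's worth of ticks, because the cached value is always in the past,
while the patched code scans once and then waits for the cached time.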

2007-05-22 06:18:30

Subject: Re: bad networking related lag in v2.6.22-rc2

On Tuesday 22 May 2007 03:00:31 Patrick McHardy wrote:
> Patrick McHardy wrote:
> > Ingo Molnar wrote:
> >>* Anant Nitya <[email protected]> wrote:
> >>>I am posting links to the information you asked for. One more thing,
> >>>after digging a bit more I found its QoS shaping that is making the
> >>>box crawl. Once I disabled the traffic shaping everything comes back
> >>>to smooth and normal. Shaping being done on very low speed residential
> >>>ADSL 256/64 Kbps connection. If you want me to post shaping rules,
> >>>please free to ask. BTW its a simple HTB/SFQ rules.
> >>
> >>[...]
> >>
> >>>http://cybertek.info/taitai/trace-to-ingo.txt.bz2
> >>
> >>thanks! This trace indeed includes the smoking gun, htb_dequeue() and
> >>__qdisc_run():
> >>
> >>[..]
> >
> > This looks like fallout from the switch to hrtimers. Anant, please
> > send me your HTB script, I'll try to reproduce it.
>
> I think I already found the bug, please try if this patch helps.

Sorry, but this patch is not helping here. I recompiled the kernel with this
patch, but the same load pattern still makes the system crawl.

Here is the link to the script I use to shape traffic.

http://cybertek.info/taitai/adslbwopt.sh

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

2007-05-22 06:21:18

Subject: Re: bad networking related lag in v2.6.22-rc2

On Monday 21 May 2007 15:50:09 Ingo Molnar wrote:
> * Anant Nitya <[email protected]> wrote:
> > Tcp:
> > 5 connections established
>
> hm, this does not explain the /proc/net/tcp overhead i think - although
> it could be a red herring. Will have a closer look at your new trace.
>
> if possible please try to generate the automatic softirq trace for
> Thomas, and then a separate trace for the firefox/net-lag thing, using
> trace-it-10sec.c. Btw., for the second trace, could you boot with
> maxcpus=1? That would make the second trace quite a bit more
> straightforward to analyze. You probably need both cpus to trigger the
> softirq problem.
>
> Ingo

Here is the link to the new trace with maxcpus=1.
http://cybertek.info/taitai/trace-it-10sec-to-ingo-with-maxcpus=1.bz2

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

2007-05-22 06:23:21

Subject: Re: bad networking related lag in v2.6.22-rc2

* Anant Nitya <[email protected]> wrote:

> > I think I already found the bug, please try if this patch helps.
>
> Sorry, but this patch is not helping here. I recompiled the kernel
> with this patch but same load pattern still make system to crawl.
>
> Here is the link for script I use to shape traffic.
>
> http://cybertek.info/taitai/adslbwopt.sh

could you also apply the fix for the softirq problem below, to make sure
it does not interact?

Ingo

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4212,9 +4212,7 @@ int __sched cond_resched_softirq(void)
BUG_ON(!in_softirq());

if (need_resched() && system_state == SYSTEM_RUNNING) {
- raw_local_irq_disable();
- _local_bh_enable();
- raw_local_irq_enable();
+ local_bh_enable();
__cond_resched();
local_bh_disable();
return 1;

2007-05-22 06:24:18

Subject: Re: bad networking related lag in v2.6.22-rc2

* Anant Nitya <[email protected]> wrote:

> > I think I already found the bug, please try if this patch helps.
>
> Sorry, but this patch is not helping here. [...]

btw., could you please send this patch on-list too please?

Ingo

2007-05-22 06:24:54

Subject: Re: bad networking related lag in v2.6.22-rc2

* Ingo Molnar <[email protected]> wrote:

> > Sorry, but this patch is not helping here. [...]
>
> btw., could you please send this patch on-list too please?

disregard this - just found Patrick's patch.

Ingo

2007-05-22 09:20:33

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

Anant Nitya wrote:
>>Patrick McHardy wrote:
>>
>>I think I already found the bug, please try if this patch helps.
>
>
> Sorry, but this patch is not helping here. I recompiled the kernel with this
> patch but same load pattern still make system to crawl.
>
> Here is the link for script I use to shape traffic.
>
> http://cybertek.info/taitai/adslbwopt.sh

Thanks. Please also send the output of "tc -s -d qdisc show dev
ppp0" and "tc -d -s class show dev ppp0" at the time the problem
occurs.

2007-05-22 12:48:10

Subject: Re: bad networking related lag in v2.6.22-rc2

qdisc htb 1: r2q 1 default 50 direct_packets_stat 0 ver 3.17
Sent 837184 bytes 3603 pkt (dropped 0, overlimits 60528154 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 4210: parent 1:10 limit 50p quantum 1492b flows 50/1024 perturb 10sec
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 4220: parent 1:20 limit 50p quantum 1492b flows 50/1024 perturb 10sec
Sent 102922 bytes 2364 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 4230: parent 1:30 limit 64p quantum 1492b flows 64/1024 perturb 10sec
Sent 12690 bytes 167 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 4240: parent 1:40 limit 128p quantum 1492b flows 128/1024 perturb 10sec
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 4250: parent 1:50 limit 64p quantum 1492b flows 64/1024 perturb 10sec
Sent 714095 bytes 944 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 666: parent 1:666 limit 128p quantum 1492b flows 128/1024 perturb 10sec
Sent 7477 bytes 128 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0

Attachments:

(No filename) (778.00 B)
tc_qdisc_stats_while_crawl (1.26 kB)
tc_class_stats_while_crawl (2.27 kB)

2007-05-22 22:07:58

by Bill Davidsen

Subject: Re: [patch] CFS scheduler, -v13

Anant Nitya wrote:
> On Thursday 17 May 2007 23:15:33 Ingo Molnar wrote:
>> i'm pleased to announce release -v13 of the CFS scheduler patchset.
>>
>> The CFS patch against v2.6.22-rc1, v2.6.21.1 or v2.6.20.10 can be
>> downloaded from the usual place:
>>
>> http://people.redhat.com/mingo/cfs-scheduler/
>>
>> -v13 is a fixes-only release. It fixes a smaller accounting bug, so if
>> you saw small lags during desktop use under certain workloads then
>> please re-check that workload under -v13 too. It also tweaks SMP
>> load-balancing a bit. (Note: the load-balancing artifact reported by
>> Peter Williams is not a CFS-specific problem and he reproduced it in
>> v2.6.21 too. Nevertheless -v13 should be less prone to such artifacts.)
>>
>> I know about no open CFS regression at the moment, so please re-test
>> -v13 and if you still see any problem please re-report it. Thanks!
>>
>> Changes since -v12:
>>
>> - small tweak: made the "fork flow" of reniced tasks zero-sum
>>
>> - debugging update: /proc/<PID>/sched is now seqfile based and echoing
>> 0 to it clears the maximum-tracking counters.
>>
>> - more debugging counters
>>
>> - small rounding fix to make the statistical average of rounding errors
>> zero
>>
>> - scale both the runtime limit and the granularity on SMP too, and make
>> it dependent on HZ
>>
>> - misc cleanups
>>
>> As usual, any sort of feedback, bugreport, fix and suggestion is more
>> than welcome,
>>
>> Ingo
>> -
> Hi
> Been testing this version of CFS from last an hour or so and still facing same
> lag problems while browsing sites with heavy JS and or flash usage. Mouse
> movement is pathetic and audio starts to skip. I haven't face this behavior
> with CFS till v11.
>
I'm not seeing this, do you have a site or two as examples?

--
Bill Davidsen <[email protected]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

2007-05-23 05:41:36

Subject: Re: bad networking related lag in v2.6.22-rc2

On Tuesday 22 May 2007 11:52:33 Ingo Molnar wrote:
> * Anant Nitya <[email protected]> wrote:
> > > I think I already found the bug, please try if this patch helps.
> >
> > Sorry, but this patch is not helping here. I recompiled the kernel
> > with this patch but same load pattern still make system to crawl.
> >
> > Here is the link for script I use to shape traffic.
> >
> > http://cybertek.info/taitai/adslbwopt.sh
>
> could you also apply the fix for the softirq problem below, to make sure
> it does not interact?
>
> Ingo
>
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -4212,9 +4212,7 @@ int __sched cond_resched_softirq(void)
> BUG_ON(!in_softirq());
>
> if (need_resched() && system_state == SYSTEM_RUNNING) {
> - raw_local_irq_disable();
> - _local_bh_enable();
> - raw_local_irq_enable();
> + local_bh_enable();
> __cond_resched();
> local_bh_disable();
> return 1;

Hi Ingo
The above patch does solve the __ soft_irq_pending __ problem. I have been running
this patch with kernel 2.6.21.1 since yesterday, doing all kinds of things, and
haven't encountered any __ NOHZ: local_softirq_pending __ messages. But the
network lag that I have been seeing since 2.6.22-rc1 is still there, even with
this patch applied. If you need any more information please do ask. Meanwhile
I will do a git bisect, as suggested by Linus, to find the specific commit that
introduced this problem, and will report back once I find it. It's good to see
the system running without any __ local_softirq_problem __ :)

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

2007-05-23 05:45:58

Subject: Re: [patch] CFS scheduler, -v13

On Wednesday 23 May 2007 03:36:27 Bill Davidsen wrote:
> Anant Nitya wrote:
> > On Thursday 17 May 2007 23:15:33 Ingo Molnar wrote:
> >> i'm pleased to announce release -v13 of the CFS scheduler patchset.
> >>
> >> The CFS patch against v2.6.22-rc1, v2.6.21.1 or v2.6.20.10 can be
> >> downloaded from the usual place:
> >>
> >> http://people.redhat.com/mingo/cfs-scheduler/
> >>
> >> -v13 is a fixes-only release. It fixes a smaller accounting bug, so if
> >> you saw small lags during desktop use under certain workloads then
> >> please re-check that workload under -v13 too. It also tweaks SMP
> >> load-balancing a bit. (Note: the load-balancing artifact reported by
> >> Peter Williams is not a CFS-specific problem and he reproduced it in
> >> v2.6.21 too. Nevertheless -v13 should be less prone to such artifacts.)
> >>
> >> I know about no open CFS regression at the moment, so please re-test
> >> -v13 and if you still see any problem please re-report it. Thanks!
> >>
> >> Changes since -v12:
> >>
> >> - small tweak: made the "fork flow" of reniced tasks zero-sum
> >>
> >> - debugging update: /proc/<PID>/sched is now seqfile based and echoing
> >> 0 to it clears the maximum-tracking counters.
> >>
> >> - more debugging counters
> >>
> >> - small rounding fix to make the statistical average of rounding errors
> >> zero
> >>
> >> - scale both the runtime limit and the granularity on SMP too, and make
> >> it dependent on HZ
> >>
> >> - misc cleanups
> >>
> >> As usual, any sort of feedback, bugreport, fix and suggestion is more
> >> than welcome,
> >>
> >> Ingo
> >> -
> >
> > Hi
> > Been testing this version of CFS from last an hour or so and still facing
> > same lag problems while browsing sites with heavy JS and or flash usage.
> > Mouse movement is pathetic and audio starts to skip. I haven't face this
> > behavior with CFS till v11.
>
> I'm not seeing this, do you have a site or two as examples?

Please disregard the above post; the lag problem I am experiencing was
introduced in 2.6.22-rcX, is specific to network QoS, and is not related to CFS.

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism

2007-05-23 06:31:17

Subject: Re: bad networking related lag in v2.6.22-rc2

* Anant Nitya <[email protected]> wrote:

> > could you also apply the fix for the softirq problem below, to make
> > sure it does not interact?

> Above patch does solve __ soft_irq_pending __ problem. I am running
> this patch with kernel 2.6.21.1 since last day doing all kinda things
> but haven't encountered any __ NOHZ: local_softirq_pending __. But
> network lag that I am seeing since 2.6.22-rc1 is still there even with
> this patch applied. If you need any more information please do ask.
> Meanwhile I will do gitbisect as suggested by linus to find out the
> specific commit that introduced this problem and will inform once I
> find it. Its good to see system running without any __
> local_softirq_problem __ :)

thanks.

if you feel inclined to try the git-bisection then by all means please
do it (it will certainly be helpful and educative), but it's optional: i
don't think you should 'need' to go through extra debugging chores, my
analysis based on the excellent trace you provided still holds and
whoever modified htb_dequeue()'s logic recently ought to be able to
figure that out (or send you a debug patch to further narrow the problem
down).

The trace shows a _clearly_ anomalous loop: for example there's 56396
(!) calls to rb_first() in htb_dequeue() [without the kernel ever
exiting that function]:

earth4:~/s> grep rb_first trace-to-ingo.txt | wc -l
56396

and the set of rules you are using is a lot simpler and the networking
load you are using is not large by any means. Here's the trace analysis
again, below.

Ingo

----------------------->

> http://cybertek.info/taitai/trace-to-ingo.txt.bz2

This trace indeed includes the smoking gun, htb_dequeue() and
__qdisc_run():

privoxy-12926 1.Ns1 1597us : rb_first (htb_dequeue)

this goes on, non-preemptible, for 160 milliseconds (!):

privoxy-12926 1.Ns1 161568us : rb_first (htb_dequeue)
privoxy-12926 1.Ns1 161568us : qdisc_watchdog_schedule (htb_dequeue)

and finally manages to escape the loop:

privoxy-12926 1.Ns1 161597us : rb_first (htb_dequeue)
privoxy-12926 1.Ns1 161597us : rb_first (htb_dequeue)
privoxy-12926 1.Ns1 161599us : htb_safe_rb_erase (htb_dequeue)
privoxy-12926 1.Ns1 161599us : rb_erase (htb_safe_rb_erase)
privoxy-12926 1.Ns1 161600us : htb_change_class_mode (htb_dequeue)
privoxy-12926 1.Ns1 161601us : htb_activate_prios (htb_change_class_mode)

and the system recovers.

2007-05-23 07:15:01

by Volker Armin Hemmann

Subject: Re: [patch] CFS scheduler, -v13

Hi,

I wanted to mail earlier, but something always got in my way.

I have used cfs v13 since you announced it. Since patching the kernel (2.6.21.1)
with cfs v13
I did the following things:

- big backup of home onto tape and restoring it after changing to reiser4
(yes, I know the threads about its problems, but I always wanted to try it -
and when fragmentation became unbearable, I switched. The
really important stuff is still on a different fs, and I figured that using
an experimental fs might be the kick I need to do frequent updates
again..).

- lots and lots and lots of compiling. And then some more compiling.

- burning dvds (I also switched to libata - if you go experimental, why not
do it completely... only the dvd drive and a 'data dump' harddisk are
affected.. but hey, since then no surprise-pio-mode anymore...)

- some ut2004

- assorted desktop stuff. typing, surfing, video, music, - most of
it in parallel. some 'minor' games like freeciv, wesnoth, lgeneral.

- some vegastrike-svn

So far I am pretty satisfied. I can't see any regressions, music is
skip-free, videos play nice, burning dvds is fast, ut2004 and vegastrike play
well. The other games can't be better and did not get worse. Everything else
behaves like always.

Glück Auf,
Volker

2007-05-23 07:22:31

Subject: Re: [patch] CFS scheduler, -v13

* Hemmann, Volker Armin <[email protected]> wrote:

> Hi,
>
> I wanted to mail earlier, but I had always something get in my way.
>
> I used cfs v13 since you announced it. Since patching the kernel
> (2.6.21.1) with cfs v13 I did the following things;
[...]

> So far I am pretty satisfied. I can't see any regressions, music is
> skip-free, videos play nice, burning dvds is fast, ut2004 and
> vegastrike play well. The other games can't be better and did not get
> worse. Everything else behaves like always.

thanks for the feedback!

Ingo

2007-05-23 10:59:20

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f28bb2d..f536060 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -174,7 +174,7 @@ requeue:

out:
BUG_ON((int) q->q.qlen < 0);
- return q->q.qlen;
+ return skb ? q->q.qlen : 0;
}

void __qdisc_run(struct net_device *dev)

Attachments:

x (318.00 B)

2007-05-23 11:06:43

Subject: Re: bad networking related lag in v2.6.22-rc2

* Patrick McHardy <[email protected]> wrote:

> How is this trace to be understood? Is it simply a call trace in
> execution-order? [...]

yeah. There's a help section at the top of the trace which explains the
other fields too:

_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
privoxy-12926 1.Ns1 0us : ktime_get_ts (ktime_get)

the function name in parentheses is the parent function. So in this case the
trace entry means we called ktime_get_ts() from ktime_get().

Ingo
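
The field layout described above can be illustrated with a small C helper that splits one trace line into its parts. This is only a sketch: `struct trace_entry` and `parse_trace_line` are hypothetical names that do not come from the tracer, and the format string assumes the command name itself contains no '-'.

```c
#include <stdio.h>

/* Hypothetical record for one latency-trace line; the field names are
 * illustrative and not taken from the tracer's source. */
struct trace_entry {
    char cmd[64];           /* command name, e.g. "privoxy" */
    int pid;
    char flags[16];         /* CPU#, irqs-off, need-resched, hardirq/softirq, preempt-depth */
    unsigned long time_us;  /* timestamp in microseconds */
    char func[64];          /* function that was called */
    char parent[64];        /* caller, shown in parentheses in the trace */
};

/* Returns 1 if the line matched the expected layout, 0 otherwise.
 * Assumption: the command name contains no '-' of its own. */
int parse_trace_line(const char *line, struct trace_entry *e)
{
    int n = sscanf(line, " %63[^-]-%d %15s %luus : %63s (%63[^)])",
                   e->cmd, &e->pid, e->flags, &e->time_us,
                   e->func, e->parent);
    return n == 6;
}
```

Feeding it the line `privoxy-12926 1.Ns1 161568us : rb_first (htb_dequeue)` would fill in pid 12926, the flag column `1.Ns1`, the timestamp, the called function and its parent.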

2007-05-23 11:27:30

Subject: Re: bad networking related lag in v2.6.22-rc2

On Wed, May 23, 2007 at 12:56:04PM +0200, Patrick McHardy wrote:
>
> Looking at the recent changes to __qdisc_run, this indeed seems
> to be the case, when the qdisc is throttled and has packets queued
> we return a value != 0, causing __qdisc_run to loop until all
> packets have been sent, which may be a long time.

Good catch! I was obviously half awake at the time :)

We could also fix it this way:

[NET_SCHED]: Fix qdisc_restart return value when dequeue is empty

My previous patch that changed the return value of qdisc_restart
incorrectly made the case where dequeue returns empty continue
processing packets.

This patch is based on diagnosis and fix by Patrick McHardy.

Signed-off-by: Herbert Xu <[email protected]>

Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f28bb2d..cbefe22 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -169,8 +169,8 @@ requeue:
else
q->ops->requeue(skb, q);
netif_schedule(dev);
- return 0;
}
+ return 0;

out:
BUG_ON((int) q->q.qlen < 0);
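
The failure mode discussed in this message can be sketched with a toy model in plain C. It is not kernel code: `toy_qdisc`, `toy_restart_buggy`, `toy_restart_fixed` and `toy_run` are made-up names, and the caller loop simply follows the thread's description (keep going while the restart function returns a nonzero value). An iteration cap is added so the demonstration terminates.

```c
#include <stddef.h>

/* Toy model, not kernel code: a queue that holds packets but may be
 * throttled, in which case dequeue yields nothing. */
struct toy_qdisc {
    int qlen;       /* packets still queued */
    int throttled;  /* nonzero: the shaper releases no packets for now */
};

/* Returns 1 if a packet was dequeued, 0 otherwise. */
static int toy_dequeue(struct toy_qdisc *q)
{
    if (q->throttled || q->qlen == 0)
        return 0;
    q->qlen--;
    return 1;
}

/* Buggy variant: reports the backlog even when nothing could be dequeued. */
int toy_restart_buggy(struct toy_qdisc *q)
{
    toy_dequeue(q);
    return q->qlen;
}

/* Fixed variant: an empty dequeue returns 0, so the caller stops. */
int toy_restart_fixed(struct toy_qdisc *q)
{
    if (!toy_dequeue(q))
        return 0;
    return q->qlen;
}

/* Caller loop as described in the thread: continue while the restart
 * function returns nonzero. 'cap' bounds the loop for the demo.
 * Returns the number of iterations performed. */
int toy_run(struct toy_qdisc *q, int (*restart)(struct toy_qdisc *), int cap)
{
    int iterations = 0;

    while (iterations < cap) {
        iterations++;
        if (restart(q) == 0)
            break;
    }
    return iterations;
}
```

With a throttled queue holding three packets, the buggy variant spins until the cap is reached, while the fixed variant returns after a single call and leaves the retry to whatever timer the shaper has armed.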

2007-05-23 11:36:59

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

Herbert Xu wrote:
> On Wed, May 23, 2007 at 12:56:04PM +0200, Patrick McHardy wrote:
>
>>Looking at the recent changes to __qdisc_run, this indeed seems
>>to be the case, when the qdisc is throttled and has packets queued
>>we return a value != 0, causing __qdisc_run to loop until all
>>packets have been sent, which may be a long time.
>
>
> Good catch! I was obviously half awake at the time :)
>
> We could also fix it this way:

Yes, that looks better, thanks.

2007-05-23 11:42:16

Subject: Re: bad networking related lag in v2.6.22-rc2

* Herbert Xu <[email protected]> wrote:

> [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty
>
> My previous patch that changed the return value of qdisc_restart
> incorrectly made the case where dequeue returns empty continue
> processing packets.
>
> This patch is based on diagnosis and fix by Patrick McHardy.
>
> Signed-off-by: Herbert Xu <[email protected]>

also:

Reported-and-debugged-by: Anant Nitya <[email protected]>

...

i gave your patch a quick test-boot and networking still works fine.

Ingo

2007-05-23 15:02:39

by Linus Torvalds

Subject: Re: bad networking related lag in v2.6.22-rc2

On Wed, 23 May 2007, Patrick McHardy wrote:
>
> Yes, that looks better, thanks.

There appear to be other obvious problems in the recent "cleanups" in this
area..

Look at

psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound)
{
return min(tv1 - tv2, bound);
}

and compare it to the previous code:

#define PSCHED_TDIFF_SAFE(tv1, tv2, bound) \
min_t(long long, (tv1) - (tv2), bound)

and ponder how that "trivial cleanup" totally broke the thing.

Hint: "psched_time_t" is an "u64". What does that mean for

min(tv1 - tv2, bound);

again, when "tv2" is larger than tv1. It _used_ to return a negative
value. Now it returns a positive "bound" upper bound, because "tv1-tv2"
will be used as a huge unsigned (and thus _positive_) integer. And was
that accidental, or done on purpose?

Sounds accidental to me, since you then want to return a "psched_tdiff_t",
which is typedeffed to be "long".

Doesn't sound very safe to me, especially since the commit message for
this is "[NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function", and
there's no indication that anybody realized that it changed semantics in
the process.

Hmm? What _should_ that thing do?

Linus
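
The semantic change pointed out above can be reproduced in a few lines of userspace C. This is a sketch only: `tdiff_signed` and `tdiff_unsigned` are illustrative stand-ins for the old macro and the new inline function, not the kernel's definitions, and the signed variant relies on the usual two's-complement conversion that GCC and Clang provide.

```c
#include <stdint.h>

typedef uint64_t psched_time_t;   /* "u64", as stated in the mail */

/* Shape of the old macro: the difference is compared as a signed
 * 64-bit value, so tv2 > tv1 yields a negative result.
 * Assumes two's-complement conversion of the wrapped difference. */
long long tdiff_signed(psched_time_t tv1, psched_time_t tv2, psched_time_t bound)
{
    long long diff = (long long)(tv1 - tv2);
    long long b = (long long)bound;

    return diff < b ? diff : b;
}

/* Shape of the new inline function: the difference stays unsigned,
 * so tv2 > tv1 wraps to a huge value and the bound wins. */
psched_time_t tdiff_unsigned(psched_time_t tv1, psched_time_t tv2, psched_time_t bound)
{
    psched_time_t diff = tv1 - tv2;   /* wraps modulo 2^64 when tv2 > tv1 */

    return diff < bound ? diff : bound;
}
```

For tv1 = 100, tv2 = 150 and bound = 1000 the signed form gives -50 and the unsigned form gives 1000; when tv1 >= tv2 the two agree.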

2007-05-23 17:19:51

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

Linus Torvalds wrote:
> There appear to be other obvious problems in the recent "cleanups" in this
> area..
>
> Look at
>
> psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound)
> {
> return min(tv1 - tv2, bound);
> }
>
> and compare it to the previous code:
>
> #define PSCHED_TDIFF_SAFE(tv1, tv2, bound) \
> min_t(long long, (tv1) - (tv2), bound)
>
> and ponder how that "trivial cleanup" totally broke the thing.
>
> Hint: "psched_time_t" is an "u64". What does that mean for
>
> min(tv1 - tv2, bound);
>
> again, when "tv2" is larger than tv1. It _used_ to return a negative
> value. Now it returns a positive "bound" upper bound, because "tv1-tv2"
> will be used as a huge unsigned (and thus _positive_) integer. And was
> that accidental, or done on purpose?
>
> Sounds accidental to me, since you then want to return a "psched_tdiff_t",
> which is typedeffed to be "long".
>
> Doesn't sound very safe to me, especially since the commit message for
> this is "[NET_SCHED]: turn PSCHED_TDIFF_SAFE into inline function", and
> there's no indication that anybody realized that it changed semantics in
> the process.

I did realize it, but tv2 > tv1 can't happen and makes no sense for
the users of this function. I probably should have provided a more
detailed changelog entry.

> Hmm? What _should_ that thing do?

It is used to calculate the amount of tokens a token bucket has
accumulated since the last refill, thus we always have tv1 >= tv2
(modulo ktime wraps). In fact tv2 > tv1 was never properly
supported. This macro would have returned the negative long long
value, but all users assign it to a psched_tdiff_t (long), so
depending on the exact values, it might still be interpreted as a
large positive value. Additionally there was a second implementation
for the gettimeofday clocksource that didn't return the negative
difference but the bound value.
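
The remark that a negative long long "might still be interpreted as a large positive value" once stored in a long can be sketched as follows, assuming a machine where long is 32 bits wide. `store_in_long32` is a hypothetical helper and `int32_t` stands in for psched_tdiff_t on such a machine; neither name comes from the kernel.

```c
#include <stdint.h>

/* Models storing a 64-bit difference in a 32-bit "long": only the low
 * 32 bits survive. Hypothetical helper for illustration. */
int32_t store_in_long32(long long v)
{
    /* Unsigned conversion is well-defined: reduction modulo 2^32. */
    uint32_t low = (uint32_t)(unsigned long long)v;

    /* Implementation-defined when low > INT32_MAX; common compilers
     * reinterpret the bits as two's complement. */
    return (int32_t)low;
}
```

A difference of -(2^32) + 5 is negative as a long long, but its low 32 bits are 5, so the truncated value comes out as a small positive number rather than a negative one.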

2007-05-23 21:31:08

by David Miller

Subject: Re: bad networking related lag in v2.6.22-rc2

From: Ingo Molnar <[email protected]>
Date: Wed, 23 May 2007 13:40:21 +0200

>
> * Herbert Xu <[email protected]> wrote:
>
> > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty
> >
> > My previous patch that changed the return value of qdisc_restart
> > incorrectly made the case where dequeue returns empty continue
> > processing packets.
> >
> > This patch is based on diagnosis and fix by Patrick McHardy.
> >
> > Signed-off-by: Herbert Xu <[email protected]>
>
> also:
>
> Reported-and-debugged-by: Anant Nitya <[email protected]>

Applied, thanks everyone.

2007-05-24 05:44:30

by Patrick McHardy

Subject: Re: bad networking related lag in v2.6.22-rc2

[NET_SCHED]: sch_htb: fix event cache time calculation

The event cache time must be an absolute value, when no event exists it is
incorrectly set to 1s instead of 1s in the future.

Signed-off-by: Patrick McHardy <[email protected]>

---
commit 49d1023ea0ea8377e740123d5954e88a00f78b7c
tree 031c210f1b5e37ade5a4fa519f5808cd49225b89
parent 637fc540b0ad22bf7971929e906e704236af06cd
author Patrick McHardy <[email protected]> Mon, 21 May 2007 23:24:16 +0200
committer Patrick McHardy <[email protected]> Mon, 21 May 2007 23:25:51 +0200

net/sched/sch_htb.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 99bcec8..035788c 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -976,8 +976,9 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch)

if (q->now >= q->near_ev_cache[level]) {
event = htb_do_events(q, level);
- q->near_ev_cache[level] = event ? event :
- PSCHED_TICKS_PER_SEC;
+ if (!event)
+ event = q->now + PSCHED_TICKS_PER_SEC;
+ q->near_ev_cache[level] = event;
} else
event = q->near_ev_cache[level];
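
Why the cached value has to be absolute can be shown with a minimal userspace sketch. The names `must_rescan`, `cache_relative` and `cache_absolute` are illustrative, and `TICKS_PER_SEC` is an assumed tick rate chosen for the demo, not the kernel's PSCHED_TICKS_PER_SEC.

```c
#include <stdint.h>

#define TICKS_PER_SEC 1000000ULL   /* assumed tick rate for the demo */

/* Mirrors the test `q->now >= q->near_ev_cache[level]`: nonzero means
 * the cached deadline has passed and the event queue is scanned again. */
int must_rescan(uint64_t now, uint64_t cached_deadline)
{
    return now >= cached_deadline;
}

/* Old behaviour: with no pending event the cache receives a bare
 * duration, which is compared against an absolute clock later on. */
uint64_t cache_relative(uint64_t now, uint64_t event)
{
    (void)now;
    return event ? event : TICKS_PER_SEC;
}

/* Patched behaviour: with no pending event the cache is set to one
 * second ahead of the current time. */
uint64_t cache_absolute(uint64_t now, uint64_t event)
{
    return event ? event : now + TICKS_PER_SEC;
}
```

Once the clock has advanced past one second's worth of ticks, the relative value is always in the past, so every dequeue rescans the event queue; the absolute value suppresses the rescan until a second has actually elapsed.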

Attachments:

x (1.11 kB)

2007-05-24 06:40:27

by David Miller

Subject: Re: bad networking related lag in v2.6.22-rc2

From: Patrick McHardy <[email protected]>
Date: Thu, 24 May 2007 07:41:00 +0200

> David Miller wrote:
> >>* Herbert Xu <[email protected]> wrote:
> >>
> >>>[NET_SCHED]: Fix qdisc_restart return value when dequeue is empty
> >
> > Applied, thanks everyone.
>
>
> Even though it didn't fix this problem, this patch I sent earlier is
> also needed.

Thanks a lot for reminding me about this patch Patrick, applied.

2007-05-24 07:14:55

Subject: Re: bad networking related lag in v2.6.22-rc2

On Thursday 24 May 2007 03:00:56 David Miller wrote:
> From: Ingo Molnar <[email protected]>
> Date: Wed, 23 May 2007 13:40:21 +0200
>
> > * Herbert Xu <[email protected]> wrote:
> > > [NET_SCHED]: Fix qdisc_restart return value when dequeue is empty
> > >
> > > My previous patch that changed the return value of qdisc_restart
> > > incorrectly made the case where dequeue returns empty continue
> > > processing packets.
> > >
> > > This patch is based on diagnosis and fix by Patrick McHardy.
> > >
> > > Signed-off-by: Herbert Xu <[email protected]>
> >
> > also:
> >
> > Reported-and-debugged-by: Anant Nitya <[email protected]>
>
> Applied, thanks everyone.

The networking lag I have been seeing since 2.6.22-rc1 disappeared after applying
this patch. Thanks to everyone who helped me get my system running sanely again. :)

Regards
Ananitya

--
Out of many thousands, one may endeavor for perfection, and of
those who have achieved perfection, hardly one knows Me in truth.
-- Gita Sutra Of Mysticism