2002-01-05 22:41:24

by Ingo Molnar

Subject: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.


this is the next, bugfix release of the O(1) scheduler:

http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.5.2-B0.patch
http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.4.17-B0.patch

This release could fix the lockups and crashes reported by some people.

Changes:

- remove the likely/unlikely define from sched.h and include compiler.h.
(Adrian Bunk)

- export sys_sched_yield, reported by Pawel Kot.

- turn off 'child runs first' temporarily, to see the effect.

- export nr_context_switches() as well, needed by ReiserFS.

- define resched_task() in the correct order to avoid compiler warnings
on UP.

- limit the frequency of timer-tick driven load-balancing to at most 100
per sec.

- clear ->need_resched in the RT scheduler path as well.

- simplify yield() support, remove TASK_YIELDED and __schedule_tail().
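
For readers unfamiliar with the likely/unlikely annotations mentioned in the first item, here is a minimal sketch of how such branch hints are commonly defined on top of GCC's `__builtin_expect`. The exact definitions in compiler.h may differ; the `!!(x)` form and the `checked_div()` helper are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Branch-prediction hints in the style of the kernel's compiler.h.
 * __builtin_expect is a GCC/Clang builtin; the !!(x) normalisation is
 * an assumption of this sketch, so that any non-zero value is "true". */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: the division-by-zero path is marked as rare,
 * so the compiler can lay out the common path as the fall-through. */
static int checked_div(int a, int b, int *out)
{
    assert(out != NULL);
    if (unlikely(b == 0))
        return -1;
    *out = a / b;
    return 0;
}
```

The hints only influence code layout; the value of the condition is unchanged.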

Comments, bug reports, suggestions are welcome,

Ingo


2002-01-05 23:07:08

by Ingo Molnar

Subject: [patch] O(1) scheduler, 2.5.2-pre9-B1.


i've uploaded a new patch, against 2.5.2-pre9:

http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.5.2-B1.patch

there are no changes other than the merge against Linus' kernel.

Ingo

2002-01-05 23:35:32

by Pawel Kot

Subject: Re: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.

On Sun, 6 Jan 2002, Ingo Molnar wrote:

Hi Ingo,

> this is the next, bugfix release of the O(1) scheduler:
>
> http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.4.17-B0.patch
>
> This release could fix the lockups and crashes reported by some people.
[...]
> Comments, bug reports, suggestions are welcome,

The same machine as before. The same scenario:
1. login
2. startx
3. gimp &
4. netscape &
5. mozilla &
6. freeze & oops
7. SAK
8. pasting oops
9. freeze

Result of ksymoops below. Hope it helps.

ksymoops 2.4.1 on i686 2.4.17sched. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.17sched (specified)
-m /usr/src/linux-2.4.17-scheduler/System.map (specified)

Warning (compare_maps): mismatch on symbol sb_be_quiet , sb_lib says d2845804, /lib/modules/2.4.17sched/kernel/drivers/sound/sb_lib.o says d2843ea4. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sb_lib.o entry
Warning (compare_maps): mismatch on symbol smw_free , sb_lib says d2845810, /lib/modules/2.4.17sched/kernel/drivers/sound/sb_lib.o says d2843eb0. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sb_lib.o entry
Warning (compare_maps): mismatch on symbol audio_devs , sound says d2828e00, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d28287a0. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol midi_devs , sound says d2828e70, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d2828810. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol mixer_devs , sound says d2828e18, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d28287b8. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol num_audiodevs , sound says d2828e14, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d28287b4. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol num_midis , sound says d2828e88, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d2828828. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol num_mixers , sound says d2828e2c, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d28287cc. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol num_synths , sound says d2828e6c, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d282880c. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Warning (compare_maps): mismatch on symbol synth_devs , sound says d2828e40, /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o says d28287e0. Ignoring /lib/modules/2.4.17sched/kernel/drivers/sound/sound.o entry
Jan 6 00:25:55 blurp kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Jan 6 00:25:55 blurp kernel: c011311c
Jan 6 00:25:55 blurp kernel: *pde = 00000000
Jan 6 00:25:55 blurp kernel: Oops: 0002
Jan 6 00:25:55 blurp kernel: CPU: 0
Jan 6 00:25:55 blurp kernel: EIP: 0010:[<c011311c>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Jan 6 00:25:55 blurp kernel: EFLAGS: 00010006
Jan 6 00:25:55 blurp kernel: eax: c02b6854 ebx: c9c62000 ecx: c9c6202c edx: 00000000
Jan 6 00:25:55 blurp kernel: esi: c9e08000 edi: c9c62000 ebp: c9e09e74 esp: c9e09e2c
Jan 6 00:25:55 blurp kernel: ds: 0018 es: 0018 ss: 0018
Jan 6 00:25:55 blurp kernel: Process mozilla-bin (pid: 145, stackpage=c9e09000)
Jan 6 00:25:55 blurp kernel: Stack: c9c62000 c9e08000 00000246 c9c6202c c1517788 00000100 00000520 c9c64aa0
Jan 6 00:25:55 blurp kernel: c1405e80 c02b6520 00000001 c9c64000 c9c62000 c9c62000 cc959a64 c9c64564
Jan 6 00:25:55 blurp kernel: c02b6500 00000246 c9e09e84 c0114ebf c9c62000 00000000 c1517788 c0115efc
Jan 6 00:25:55 blurp kernel: Call Trace: [<c0114ebf>] [<c0115efc>] [<c01057c6>] [<c0106ae3>] [<c0121008>]
Jan 6 00:25:55 blurp kernel: [<c0105423>] [<c0121121>] [<c0121008>] [<c01d84bd>] [<c01d8545>] [<c01d91ec>]
Jan 6 00:25:55 blurp kernel: [<c0106bd4>] [<c0106ae3>]
Jan 6 00:25:55 blurp kernel: Code: 89 0a 8b 47 24 8b 55 dc 0f b3 42 0c ff 02 89 57 34 8b 4d f8

>>EIP; c011311c <try_to_wake_up+41c/460> <=====
Trace; c0114ebf <wake_up_process+b/1c>
Trace; c0115efc <do_fork+664/720>
Trace; c01057c6 <sys_clone+1e/28>
Trace; c0106ae3 <system_call+33/38>
Trace; c0121008 <exec_modprobe+0/74>
Trace; c0105423 <kernel_thread+1f/38>
Trace; c0121121 <request_module+a5/1a0>
Trace; c0121008 <exec_modprobe+0/74>
Trace; c01d84bd <sock_create+95/100>
Trace; c01d8545 <sys_socket+1d/50>
Trace; c01d91ec <sys_socketcall+64/200>
Trace; c0106bd4 <error_code+34/3c>
Trace; c0106ae3 <system_call+33/38>
Code; c011311c <try_to_wake_up+41c/460>
0000000000000000 <_EIP>:
Code; c011311c <try_to_wake_up+41c/460> <=====
0: 89 0a mov %ecx,(%edx) <=====
Code; c011311e <try_to_wake_up+41e/460>
2: 8b 47 24 mov 0x24(%edi),%eax
Code; c0113121 <try_to_wake_up+421/460>
5: 8b 55 dc mov 0xffffffdc(%ebp),%edx
Code; c0113124 <try_to_wake_up+424/460>
8: 0f b3 42 0c btr %eax,0xc(%edx)
Code; c0113128 <try_to_wake_up+428/460>
c: ff 02 incl (%edx)
Code; c011312a <try_to_wake_up+42a/460>
e: 89 57 34 mov %edx,0x34(%edi)
Code; c011312d <try_to_wake_up+42d/460>
11: 8b 4d f8 mov 0xfffffff8(%ebp),%ecx


10 warnings issued. Results may not be reliable.
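
One possible reading of the disassembly above, offered as a sketch and not as a confirmed diagnosis: the faulting instruction is a store through %edx, which the register dump shows as 00000000, and it is followed by a bit operation, a counter increment and a pointer store into the task. That sequence is what an enqueue onto a per-priority run queue could compile to. The structure names, field layout, `MAX_PRIO` value and the set-bit-means-occupied convention below are all assumptions (the trace uses btr, which clears a bit, so the patch may use the opposite convention).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRIO      128   /* assumed value for this sketch */
#define BITS_PER_WORD (8 * sizeof(unsigned long))
#define BITMAP_WORDS  ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct list_head { struct list_head *next, *prev; };

/* Hypothetical priority array: one list per priority plus a bitmap. */
struct prio_array {
    int nr_active;
    unsigned long bitmap[BITMAP_WORDS];
    struct list_head queue[MAX_PRIO];
};

struct task {
    struct list_head run_list;
    int prio;
    struct prio_array *array;
};

static void prio_array_init(struct prio_array *a)
{
    a->nr_active = 0;
    for (size_t w = 0; w < BITMAP_WORDS; w++)
        a->bitmap[w] = 0;
    for (int i = 0; i < MAX_PRIO; i++)
        a->queue[i].next = a->queue[i].prev = &a->queue[i];
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
    struct list_head *prev = head->prev;
    item->next = head;
    item->prev = prev;
    prev->next = item;  /* a store like this faults if prev is NULL */
    head->prev = item;
}

static void enqueue_task(struct task *p, struct prio_array *a)
{
    assert(p->prio >= 0 && p->prio < MAX_PRIO);
    list_add_tail(&p->run_list, &a->queue[p->prio]);
    a->bitmap[p->prio / BITS_PER_WORD] |= 1UL << (p->prio % BITS_PER_WORD);
    a->nr_active++;
    p->array = a;
}

/* Lowest-numbered non-empty priority, or -1 if the array is empty.
 * The scan covers a fixed number of words, independent of task count. */
static int first_prio(const struct prio_array *a)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++)
        if (a->bitmap[w])
            return (int)(w * BITS_PER_WORD) + __builtin_ctzl(a->bitmap[w]);
    return -1;
}
```

In this sketch a list head that was never initialised (both pointers NULL) would make the `prev->next` store write to address 0, which matches the shape of the reported fault but does not prove its cause.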

pkot
--
mailto:[email protected] :: mailto:[email protected]
http://kt.linuxnews.pl/ :: Kernel Traffic po polsku

2002-01-06 03:36:32

by listmail

Subject: Re: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.

How close are you and Robert Love to getting this patch and his preempt
patches to co-operate... seems like that might bring huge wins. I know, I
know I could diff, and fix the rejects myself, but this seems too deep in
the kernel for a relative newbie like myself (plus I am more a file system
guy).

Bill

On Sun, 6 Jan 2002, Ingo Molnar wrote:

>
> this is the next, bugfix release of the O(1) scheduler:
>
> http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.5.2-B0.patch
> http://redhat.com/~mingo/O(1)-scheduler/sched-O1-2.4.17-B0.patch
>
> This release could fix the lockups and crashes reported by some people.
>
> Changes:
>
> - remove the likely/unlikely define from sched.h and include compiler.h.
> (Adrian Bunk)
>
> - export sys_sched_yield, reported by Pawel Kot.
>
> - turn off 'child runs first' temporarily, to see the effect.
>
> - export nr_context_switches() as well, needed by ReiserFS.
>
> - define resched_task() in the correct order to avoid compiler warnings
> on UP.
>
> - limit the frequency of timer-tick driven load-balancing to at most 100
> per sec.
>
> - clear ->need_resched in the RT scheduler path as well.
>
> - simplify yield() support, remove TASK_YIELDED and __schedule_tail().
>
> Comments, bug reports, suggestions are welcome,
>
> Ingo
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2002-01-06 06:09:53

by Robert Love

Subject: Re: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.

On Sat, 2002-01-05 at 22:34, [email protected] wrote:
> How close are you and Robert Love to getting this patch and his preempt
> patches to co-operate... seems like that might bring huge wins. I know, I
> know I could diff, and fix the rejects myself, but this seems too deep in
> the kernel for a relative newbie like myself (plus I am more a file system
> guy).

Unfortunately it looks like it is going to take a bit more than fixing
trivial rejects. I started working on it today. I suspect I am going
to need a lot better understanding of Ingo's scheduler, so I am learning
it. I am traveling tomorrow but should be able to dive into it on
Monday.

Ingo and I both agree that the patches together are a Good Thing.

I have a fully ported patch at this point but it hard locks on boot. I
believe the problem to be a few bits in sched.c, but there may be some
underlying changes that break assumptions elsewhere.

We are working on it. Help is always appreciated, though ;)

Robert Love

2002-01-06 12:51:03

by Anton Blanchard

Subject: O(1) scheduler, 2.5.2-pre9-B1 results


Hi Ingo,

I got your scheduler rewrite going on ppc64. Here are some initial
LMbench results with sched-O1-2.4.17-B4.patch. Bear in mind the two
machines are different chips (one is POWER3 and the other is RS64), so
some differences will result:

2 way (POWER3) summary:
signal handling down a bit (GOOD)
fork down a lot (very GOOD)
exec, sh down (GOOD)
context switches all down (GOOD)
communication latencies: Pipe, AF, TCP slightly up (BAD)
pipe bandwidth up (GOOD)

4 way (RS64) summary:
stat up a bit (BAD)
fork down a lot (very GOOD)
exec, sh down (GOOD)
context switches same or down (GOOD)
communication latencies: Pipe, AF, TCP slightly up (BAD)
pipe bandwidth up (GOOD)

So far things look good. Next up I'll look at how it scales on the 12
way.

Anton

2002-01-06 16:17:15

by Ingo Molnar

Subject: Re: O(1) scheduler, 2.5.2-pre9-B1 results


On Sun, 6 Jan 2002, Anton Blanchard wrote:

> communication latencies: Pipe, AF, TCP slightly up (BAD)

this is mainly because i have not made the O(1) scheduler fully aware of
synchronous wakeups yet. I'm working on this part now that the bugs are
fixed. If you remove synchronous wakeups from the stock kernel then you'll
see processes distributed to different CPUs but bad lmbench latencies.
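
To illustrate the trade-off, here is a minimal sketch of a wakeup CPU choice that honours a synchronous hint. It assumes a "sync" wakeup means the waker is about to block, so the woken task can take over the waker's CPU instead of being spread to an idle one. The function name and parameters are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU choice for a newly woken task.
 * idle_cpu is -1 when no idle CPU is known. */
static int pick_wakeup_cpu(int waker_cpu, int idle_cpu, bool sync)
{
    assert(waker_cpu >= 0);
    if (sync)               /* waker is about to sleep: reuse its CPU */
        return waker_cpu;
    if (idle_cpu >= 0)      /* otherwise spread the load to an idle CPU */
        return idle_cpu;
    return waker_cpu;
}
```

Without the hint, a pipe ping-pong pair ends up on different CPUs and every message pays for a cross-CPU wakeup, which would show up as higher latency in lmbench.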

> So far things look good. Next up I'll look at how it scales on the 12
> way.

thanks!

Ingo


2002-01-06 22:49:32

by Nathaniel

Subject: Re: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.

Out of sheer curiosity (and this might be a stupid question), is there
any effort to make the following lines of development all work together:
RML's preempt-kernel and lock-break (and netdev, but that doesn't touch
the other stuff), Rik's rmap VM, and the O(1) scheduler? If so, is it
being applied to 2.4 or 2.5? (Definitely seems 2.5-ish, but given that
all the patches are available for 2.4, I thought I'd ask.)

This system just got 2.4.18-pre1 with RML's preempt and Rik's rmap10c
patches. Seems stable though dbench 10 can take all responsiveness out
of KDE (though XMMS never skips). The O(1) scheduler did not apply, nor
did lock-break, otherwise I would be running with all of the above.

Are any of these actually mutually exclusive? (that is, am I just
wasting time and decreasing the s:n ratio on LKML?)

Thanks in advance.

--Nathan

Robert Love wrote:

>On Sat, 2002-01-05 at 22:34, [email protected] wrote:
>
>>How close are you and Robert Love to getting this patch and his preempt
>>patches to co-operate... seems like that might bring huge wins. I know, I
>>know I could diff, and fix the rejects myself, but this seems too deep in
>>the kernel for a relative newbie like myself (plus I am more a file system
>>guy).
>>
>
>Unfortunately it looks like it is going to take a bit more than fixing
>trivial rejects. I started working on it today. I suspect I am going
>to need a lot better understanding of Ingo's scheduler, so I am learning
>it. I am traveling tomorrow but should be able to dive into it on
>Monday.
>
>Ingo and I both agree that the patches together are a Good Thing.
>
>I have a fully ported patch at this point but it hard locks on boot. I
>believe the problem to be a few bits in sched.c, but there may be some
>underlying changes that break assumptions elsewhere.
>
>We are working on it. Help is always appreciated, though ;)
>
> Robert Love
>




2002-01-08 16:54:55

by George Anzinger

Subject: Re: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.

Nathan wrote:
>
> Out of sheer curiosity (and this might be a stupid question), is there
> any effort to make the following lines of development all work together:
> RML's preempt-kernel and lock-break (and netdev, but that doesn't touch
> the other stuff), Rik's rmap VM, and the O(1) scheduler? If so, is it
> being applied to 2.4 or 2.5? (Definitely seems 2.5-ish, but given that
> all the patches are available for 2.4, I thought I'd ask.)
>
> This system just got 2.4.18-pre1 with RML's preempt and Rik's rmap10c
> patches. Seems stable though dbench 10 can take all responsiveness out
> of KDE (though XMMS never skips). The O(1) scheduler did not apply, nor
> did lock-break, otherwise I would be running with all of the above.
>
> Are any of these actually mutually exclusive? (that is, am I just
> wasting time and decreasing the s:n ratio on LKML?)

No, not in concept. Just that they collide in a couple of places and
need a bit of sorting out. Give us a moment.

George
>
> Thanks in advance.
>
> --Nathan
>
> Robert Love wrote:
>
> >On Sat, 2002-01-05 at 22:34, [email protected] wrote:
> >
> >>How close are you and Robert Love to getting this patch and his preempt
> >>patches to co-operate... seems like that might bring huge wins. I know, I
> >>know I could diff, and fix the rejects myself, but this seems too deep in
> >>the kernel for a relative newbie like myself (plus I am more a file system
> >>guy).
> >>
> >
> >Unfortunately it looks like it is going to take a bit more than fixing
> >trivial rejects. I started working on it today. I suspect I am going
> >to need a lot better understanding of Ingo's scheduler, so I am learning
> >it. I am traveling tomorrow but should be able to dive into it on
> >Monday.
> >
> >Ingo and I both agree that the patches together are a Good Thing.
> >
> >I have a fully ported patch at this point but it hard locks on boot. I
> >believe the problem to be a few bits in sched.c, but there may be some
> >underlying changes that break assumptions elsewhere.
> >
> >We are working on it. Help is always appreciated, though ;)
> >
> > Robert Love
> >

--
George [email protected]
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

2002-01-08 18:06:02

by Rik van Riel

Subject: Re: [patch] O(1) scheduler, 2.4.17-B0, 2.5.2-pre8-B0.

On Tue, 8 Jan 2002, george anzinger wrote:
> Nathan wrote:
> >
> > Out of sheer curiosity (and this might be a stupid question), is there
> > any effort to make the following lines of development all work together:
> > RML's preempt-kernel and lock-break (and netdev, but that doesn't touch
> > the other stuff), Rik's rmap VM, and the O(1) scheduler? If so, is it

> No, not in concept. Just that they collide in a couple of places and
> need a bit of sorting out. Give us a moment.

I'm adding low latency reschedule points to page_launder_zone()
and refill_inactive_zone() right now ;)
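
As a rough userspace illustration of what a reschedule point in a long scan looks like: the loop checks a "need resched" flag after each unit of work and gives up the CPU when it is set. The flag, the `yield_cpu()` stand-in and `scan_items()` are hypothetical names for this sketch and do not reflect the actual page_launder_zone() or refill_inactive_zone() code.

```c
#include <assert.h>
#include <stdbool.h>

static volatile bool need_resched_flag;  /* stand-in for the kernel's flag */
static int yields;                       /* counts how often we gave up the CPU */

/* Stand-in for calling the scheduler: clears the request and returns. */
static void yield_cpu(void)
{
    need_resched_flag = false;
    yields++;
}

/* Hypothetical long scan with a reschedule point after every item.
 * request_every simulates a timer tick asking for the CPU every N items.
 * Returns the number of items processed. */
static int scan_items(int nr_items, int request_every)
{
    int done = 0;

    assert(nr_items >= 0);
    for (int i = 0; i < nr_items; i++) {
        done++;                          /* stand-in for per-page work */
        if (request_every > 0 && (i + 1) % request_every == 0)
            need_resched_flag = true;
        if (need_resched_flag)           /* the reschedule point */
            yield_cpu();
    }
    return done;
}
```

The point of the pattern is that worst-case latency is bounded by the cost of one item, not by the length of the whole scan.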

regards,

Rik
--
Shortwave goes a long way: irc.starchat.net #swl

http://www.surriel.com/ http://distro.conectiva.com/