2007-08-03 12:07:51

by T. J. Brumfield

Subject: Scheduler Situation

First off, I am an avid reader of the LKML but I'm not a developer.
Admittedly I am a piss-poor C developer who likes to poke around the
code, play with patches and attempt to learn, but in reality at best I
pretend I understand it, and I don't really. I fully defer to the
technical knowledge of far greater minds on this list. Even having a
basic understanding of C and looking at the code, I still don't
understand basic kernel operations like memory management or CPU
scheduling well enough to see in code what works best.

What I can say is that, as someone who has had years of project
management experience, it is painfully obvious that there are
personal issues at play here which should be easy to spot and
rectify.

I, like many people, had been using Con's patches for years and was
greatly pleased by them. I've experimented with a variety of kernel
flavors and patches, often woefully trying to amass my own collection
of custom patches and often breaking things while trying to integrate
too many patches at once that don't patch nicely with one another.
And when I've had questions, I often read through Con's mailing list
archives from his site.

It would appear he spent 4 years working on his patch-set, primarily
based around his version of a scheduler. And in reading the LKML it
seems that he pushed aspects of his patch-set for mainline inclusion.
He was shot down and told that his ideas were flat-out wrong, and still
he worked for years to improve his work. He answered questions, fixed
bugs, and made himself very available.

It may very well be that CFS is a better scheduler than SD. Ingo is a
very respected coder, and from even Con's mouth it seems that CFS is
pretty simple in its execution. Ingo seems to suggest that since he
posted his code so quickly after he wrote it, that he didn't do
anything wrong.

Linus, a man I greatly admire and respect, especially for his
blunt/terse statements, also seems to suggest that no one has wronged
Con here. However in insisting that the decision was based on Con's
inability to support his code, I can fully understand why Con would
leave kernel development permanently.

The simple truth is that Con poured years into a project despite being
rebuffed and told he was wrong. The moment that people change their
minds and realize that his concepts have merit, no one apologizes for
all the past criticism. Ingo did very much credit Con for
inspiration, but I can't see how this decision was anything but
political. Linus said it himself, that he trusts Ingo to stand behind
the code, and he didn't trust Con to do the same.

I am reluctant to accuse anyone of dishonesty, but that statement just
doesn't seem to add up. And even if that is the way Linus truly felt,
it doesn't seem fair given how well Con had supported his code and his
users. Regardless, for a man who claims to not make political
decisions on code, that is exactly how he operated here. He chose the
person over the code. From his own mouth, it seemed he remains very
put off by earlier comments from the "Con camp" that perhaps there
should be separation between the server and desktop kernels.

And while Linus is no-doubt correct that such a separation should not
occur, I never really saw Con push for such a thing. I know I don't
fully understand the code, but it does seem to make sense to an idiot
like me however that with all the other kernel options to customize
your build for your needs, it isn't beyond reason to go with a
plugsched type solution. The kernel is already immensely modular, no
doubt the most modular piece of code on the planet.

It works amazingly well in small embedded devices to large
multi-processor servers across multiple architectures and processor
types. The reason I'm posting on an issue I'm sure that many people
are already sick of, is that I'm sure many people would like to see
two things happen.

1 - Can someone please explain why the kernel can be modular in every
other aspect, including offering a choice of IO schedulers, but not
kernel schedulers?

2 - Can't we all agree that this was poorly handled? Politics should
not affect code, and we are all adults. People should accept
rejection of patches, and what not. At the same time, if I were Con
I'd be extremely hurt, and many people have echoed this very
sentiment. I think a project of this size that depends so much on a
large community for testing, contributions, etc. can't afford to
alienate people in this fashion. Con may never come back to kernel
land. That's unfortunate, but this needs to be addressed. I'm sure a
whole lot of people would feel better if they knew this kind of
treatment is not likely to happen again, and ideally I think Con
should get an apology.

If I'm a developer who sits outside the usual circle of contributors,
and I have an idea for a big change in the kernel, how likely am I to
devote a bunch of time if I have the impression that my work will be
shot down, rewritten by another person, and lastly I will be
personally attacked rather than thanked for my contributions?

I'm really hoping that we can take a moment to display some
consideration for the feelings of others here, not to babysit or
placate, but rather to set a precedent for such an important project
on how contributors will be treated.

-- T. J. Brumfield
"In the beginning the Universe was created. This has made a lot of
people very angry and been widely regarded as a bad move."
--Douglas Adams
"Nihilism makes me smile."
--Christopher Quic


2007-08-03 13:05:47

by Andev

Subject: Re: Scheduler Situation

On 8/3/07, T. J. Brumfield <[email protected]> wrote:
> 1 - Can someone please explain why the kernel can be modular in every
> other aspect, including offering a choice of IO schedulers, but not
> kernel schedulers?

Good question. It has been answered in other threads. Linus doesn't like
having separate kernel schedulers.

2007-08-03 13:06:05

by Oleksandr Natalenko

Subject: Re: Scheduler Situation

T. J. Brumfield <enderandrew <at> gmail.com> writes:

> 1 - Can someone please explain why the kernel can be modular in every
> other aspect, including offering a choice of IO schedulers, but not
> kernel schedulers?

IMHO, Linus has a grudge against Con, though I can't understand why. Con has
written nice code; I use it even now, and I'm glad to have a nice, highly
interactive system on my desktop. But "a grudge" is not an argument. It would
be better for Linus not to make the decision by himself, but to let the
community decide which is better: CFS or SD.

The question is not only CFS vs SD, but the -ck patchset as a whole. Linus must
remember that he has lost nice code and a nice maintainer, and that was not the
right decision. Linux is a public OS, so let the people decide for themselves,
rather than Linus alone.

2007-08-03 13:19:20

by Ingo Molnar

Subject: Re: about modularization


* T. J. Brumfield <[email protected]> wrote:

> 1 - Can someone please explain why the kernel can be modular in every
> other aspect, including offering a choice of IO schedulers, but not
> kernel schedulers?

that's a fundamental misconception. If you boot into a distro kernel on
a typical PC, about half of the kernel code that the box runs in any
moment will be in modules, half of it is in the "kernel core". For
example, on a random laptop:

$ echo `lsmod | cut -c1-30 | cut -d' ' -f2-` | sed 's/Size //' |
sed 's/ /+/g' | bc
2513784

i.e. 2.5 MB of modules. The core kernel's size:

$ dmesg | grep 'kernel code'
Memory: 2053212k/2087808k available (2185k kernel code, 33240k reserved, 1174k data, 244k init, 1170304k highmem)

2.1 MB of kernel core code. (of course the total body of "possible
drivers" is 10 times larger than that of the core kernel - but the
fundamental 'variety' is not.)
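
For anyone who wants to repeat the measurement above, here is a minimal sketch of the same sum done with awk. The function name sum_module_sizes is made up for illustration; the only assumption is that the input looks like lsmod output, with one header line and the module size in the second column.

```shell
# Hypothetical helper: sum the Size column of lsmod-style output on stdin.
# The first line is the "Module Size Used by" header, so it is skipped.
sum_module_sizes() {
    awk 'NR > 1 { total += $2 } END { printf "%d\n", total }'
}

# On a live system (assumes lsmod is installed):
#   lsmod | sum_module_sizes
```

The result is a byte count, comparable to the "kernel code" figure in dmesg once that figure is converted from kilobytes.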

most of the modules are for stuff where there is a significant physical
difference between the components they support. Drivers for different
pieces of hardware. Filesystem drivers for different on-disk physical
layouts. Network protocol drivers for different on-wire formats. The
sanest technological decision there is clearly to modularize.

And note that often it's not even about choice there: the user's system
has a particular piece of hardware, to which there is usually one
primary driver. The user does not have any real 'choice' over the
modularization here, it's largely a technological act to make the
kernel's footprint smaller.

But the kernel core, which does not depend as much on the physical
properties of the stuff it supports (it depends on the physics of the
machine of course, but those rules are mostly shared between all
machines of that architecture), and is fundamentally influenced by the
syscall API (which is not modular either) and by our OS design
decisions, has much less reason to be modularized.

The core kernel was always non-modular, and it depends on the technical
details whether we want to or _have to_ modularize something so that it
becomes modular to the user too. For example we don't have 'competing',
modular versions of the IPv4 stack. Nor of the VFS. Nor of timers,
futexes, nor of locking code or of the CPU scheduler. But we can switch
out any of those implementations from the core kernel, and did so
numerous times in the past and will do so in the future.

CPU schedulers are as core kernel code as it gets - you cannot even boot
without having a CPU scheduler. IO schedulers, although similar in name,
are quite different beasts from CPU schedulers, and they are somewhere
between the core kernel and drivers. They are not 'physical drivers' (an
IO scheduler can drive any disk), nor are they fully 'core kernel code'
in the sense of a kernel not even being able to boot without them. Also,
disks are physically different from CPUs, in a way which works _against_
the user-modularization of CPU schedulers. (there are also many other
differences which have been pointed out in the past)

In any case, the IO subsystem maintainers decided to modularize IO
schedulers, and that's their decision. One of the authors of the IO
scheduler code said it on lkml recently that while modularization of IO
scheduler had advantages too, in retrospect he wishes they would not
have made IO schedulers modular and now that decision cannot be undone.
So even that much different situation was far from a clear decision, and
some negative effects can be felt today too, in form of having two
primary IO schedulers but not having one IO scheduler that works well in
all cases. For CPU schedulers the circumstances point away from
user-selectable modularization even more strongly.
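
To make the user-visible side of IO scheduler selection concrete: kernels that expose the choice list the available IO schedulers per block device in sysfs, with the active one in square brackets. The sketch below only parses such a line. The function name active_io_scheduler and the device name sda are assumptions for illustration, not something taken from this thread.

```shell
# Hypothetical helper: print the active scheduler from a sysfs-style line
# such as "noop anticipatory deadline [cfq]" read on stdin.
# The active entry is the one enclosed in square brackets.
active_io_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# On a live system (the device name sda is an assumption):
#   active_io_scheduler < /sys/block/sda/queue/scheduler
```

Writing one of the listed names back into the same sysfs file switches the scheduler for that device, which is the kind of exposed knob that is hard to withdraw once applications rely on it.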

Ingo

2007-08-03 13:22:18

by Alistair John Strachan

Subject: Re: Scheduler Situation

The real question is WHY do people keep writing essays about topics that have
_already_ been exhaustively explored in other threads? If you want a better
understanding of the situation, read the archives, DON'T post another
duplicate message about the same scheduler parade.

Unless you've got some constructive input that goes further than asking the
same questions that have already been asked and answered multiple times
before, please just do not pollute this list.

It's becoming so repetitive this almost sounds like a 419.

--
Cheers,
Alistair.

137/1 Warrender Park Road, Edinburgh, UK.

2007-08-03 15:29:13

by Ingo Molnar

Subject: Re: about modularization


* debian developer <[email protected]> wrote:

> > 1 - Can someone please explain why the kernel can be modular in
> > every other aspect, including offering a choice of IO schedulers,
> > but not kernel schedulers?
>
> Good question. It has been answered in other threads. Linus doesn't like
> having separate kernel schedulers.

not just Linus, but neither me nor Nick Piggin, nor a ton of other
kernel hackers agree with that idea, for numerous technical reasons,
as it has been discussed to death already ;-)

and the last but not least point, although they might sound pretty
similar, there is quite a bit of difference between "IO schedulers" and
"CPU schedulers", just like there is quite a bit of difference between
"Paris Hilton" and "The Hilton, Paris" =B-)

Ingo

2007-08-03 17:55:00

by Rene Herman

Subject: Re: about modularization

On 08/03/2007 03:19 PM, Ingo Molnar wrote:

> One of the authors of the IO scheduler code said it on lkml recently that
> while modularization of IO scheduler had advantages too, in retrospect he
> wishes they would not have made IO schedulers modular and now that
> decision cannot be undone.

Just as a matter of interest -- why can't it? (a pointer to a list archive
if you have one, or a name so I can look for it myself if you don't, will do
as answer).

Rene.

2007-08-03 19:00:13

by Ingo Molnar

Subject: Re: about modularization


* Rene Herman <[email protected]> wrote:

> On 08/03/2007 03:19 PM, Ingo Molnar wrote:
>
> > One of the authors of the IO scheduler code said it on lkml recently
> > that while modularization of IO scheduler had advantages too, in
> > retrospect he wishes they would not have made IO schedulers modular
> > and now that decision cannot be undone.
>
> Just as a matter of interest -- why can't it? (a pointer to a list
> archive if you have one, or a name so I can look for it myself if you
> don't, will do as answer).

some apps depend on AS, some on CFQ, and once you expose something to
users it's _very_ hard to remove it, even if the technical arguments are
strong.

http://lists.openwall.net/linux-kernel/2007/04/16/23

Ingo

2007-09-01 21:47:33

by Oleg Verych

Subject: Re: about modularization

* Date: Fri, 3 Aug 2007 15:19:00 +0200
* Received-SPF: softfail (mx3: transitioning domain of elte.hu does not designate 157.181.1.14 as permitted sender) client-ip=157.181.1.14; [email protected]; helo=elvis.elte.hu;

> If you boot into a distro kernel on
> a typical PC, about half of the kernel code that the box runs in any
> moment will be in modules, half of it is in the "kernel core". For
> example, on a random laptop:

That was your laptop and distro.

> $ echo `lsmod | cut -c1-30 | cut -d' ' -f2-` | sed 's/Size //' |
> sed 's/ /+/g' | bc
> 2513784
>
> i.e. 2.5 MB of modules. The core kernel's size:
>
> $ dmesg | grep 'kernel code'
> Memory: 2053212k/2087808k available (2185k kernel code, 33240k reserved, 1174k data, 244k init, 1170304k highmem)
>
> 2.1 MB of kernel core code. (of course the total body of "possible
> drivers" is 10 times larger than that of the core kernel - but the
> fundamental 'variety' is not.)

Just for reference, here's my 2+ year old Asus A4K; the kernel is from
Debian Etch:

deen:/tmp# uname -a
Linux deen 2.6.18-4-amd64 #1 SMP Mon Mar 26 11:36:53 CEST 2007 x86_64 x86_64
deen:/tmp# lsmod | (read a; while read a b c; do S=$((b+${S=0})); done; echo $S)
1583684
deen:/tmp# lsmod | grep xfs
xfs 485192 3
deen:/tmp# dmesg | grep kernel\ code
Memory: 506676k/523520k available (1930k kernel code, 16456k reserved, 868k data, 176k init)

Apart from diff in hardware and implied designing/coding skills, decision
was made

* after one "wrong" response plus illness from Con,
* brave core-duo by Ingo and Tomas, who made some bunch of students to test
scheduler and reported success to Linus.

I don't know why, after all that variety of things (mostly drivers, but
recent *fd also) there's such big resistance to anything that's useful
and used by ordinary people. A star sickness, pride? If yes, that's just
ridiculous, but who cares.