2012-06-15 22:56:40

by Thomas Gleixner

Subject: [ATTEND or not ATTEND] That's the question!

Dear KS committee,

like it or not, I really can't convince myself to adjust to the concept
of self-advertising.

So I stick to the traditional way of proposing topics for KS and let
you decide whether the topic is interesting and my attendance is
required.

As you might know I'm wasting^spending a lot of time fighting the
steadily growing insanity in the kernel code base. My current target is
cpu hotplug, but that's just a place holder for a more general
problem.

The steadily increasing interest in Linux and the outcome of our quests
to convince involved parties to contribute leads to a few interesting
questions.

Thinking more about it, it all boils down to a single question:

Are we (the kernel community and the current maintainer setup) able
to cope with the influx of patches?

I myself (admittedly I'm responsible for too much already, and
I'm quite sure that other top level maintainers suffer in the same
way) have a hard time keeping track of all the "interesting" bits
which hit my inbox.

Sure one might argue that I should delegate responsibility to others
to lower my workload.

I'd be happy to do that, really. I'm not a control freak and I
really don't care about my patch count statistics (I never did, and
I wish that this particular idiocy would have never been invented).

Also I have delegated stuff to a large degree already.

Though I have a hard time finding people whom I can trust enough to
take care of crucial core infrastructure bits.

Aside from that I see a (steadily increasing) repeated pattern of
potential contributors proposing totally clueless patches to "solve" a
particular problem.

The time I spend on talking clue into those folks is at least an
order of magnitude larger than coding it myself.

I'm pretty sure that this is not caused by my inability to explain
stuff to those folks, but by the insanity of managers who believe
that adding a random number of randomly chosen so called "human
resources" (I abhor that phrase) will solve the problems at hand.

I know that the world and this industry in particular is driven by
such insanities, but I can't commit myself to adhering to that.

So the main questions I want to raise on Kernel Summit are:

- How do we cope with the need to review the increasing amount of
(insane) patches and their potential integration?

- How do we prevent further insanity to known problem spaces (like
cpu hotplug) without stopping progress?

A side question, but definitely related is:

- How do we handle "established maintainers" who are mainly
interested in their own personal agenda and ignoring justified
criticism just because they can?

Thanks,

tglx


2012-06-15 23:34:16

by Greg KH

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, Jun 16, 2012 at 12:56:36AM +0200, Thomas Gleixner wrote:
> So the main questions I want to raise on Kernel Summit are:
>
> - How do we cope with the need to review the increasing amount of
> (insane) patches and their potential integration?

That's a very good question, and I've been wondering if someone is
trying to flood us with crap submissions just to try to DoS us and slow
us down. If not, it's an interesting "attack" vector onto our
development process that we need to be able to handle better.

> - How do we prevent further insanity to known problem spaces (like
> cpu hotplug) without stopping progress?

Progress can slow, if we want it to, in some areas, just to let people
get the time to fix up the issues we currently have. That saves time in
the long run, but requires that someone make it very clear as to what is
going on and how it will change in the future.

But, both of these are great things to talk about, I like it.

> A side question, but definitely related is:
>
> - How do we handle "established maintainers" who are mainly
> interested in their own personal agenda and ignoring justified
> criticism just because they can?

The wonderful, "how do we remove a maintainer who isn't working out"
problem. It's a tough one, I don't think we really have any standard
way. Luckily in the past, the insane ones went away on their own :)

greg k-h

2012-06-16 10:50:13

by Thomas Gleixner

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Fri, 15 Jun 2012, Greg KH wrote:
> On Sat, Jun 16, 2012 at 12:56:36AM +0200, Thomas Gleixner wrote:
> > So the main questions I want to raise on Kernel Summit are:
> >
> > - How do we cope with the need to review the increasing amount of
> > (insane) patches and their potential integration?
>
> That's a very good question, and I've been wondering if someone is
> trying to flood us with crap submissions just to try to DoS us and slow
> us down. If not, it's an interesting "attack" vector onto our
> development process that we need to be able to handle better.

I don't think it's a DoS attack.

Interestingly enough the embedded folks have pretty much got their
gear together and are actively working on consolidation. They learned
the hard way that just hacking more crap into the code is going to end
in a disaster so they are actively watching out for other people
working in the same area.

The folks who concern me more are in the enterprise space. There are
moments where I start to believe that big corp managers have
established a secret project to implement RFC2795.

While the embedded horror is and was mostly confined in SoC specific
places, the enterprise flood is massively targeted at the guts of the
core kernel. That makes me increasingly nervous.

> > - How do we prevent further insanity to known problem spaces (like
> > cpu hotplug) without stopping progress?
>
> Progress can slow, if we want it to, in some areas, just to let people
> get the time to fix up the issues we currently have. That saves time in
> the long run, but requires that someone make it very clear as to what is
> going on and how it will change in the future.

Indeed, but sadly there are not enough maintainers who enforce that
and trying to enforce it is a major challenge.

Even if people realize that there is a problem, the "we need to reach
our milestones" mindset doesn't allow them to sit down and help with
fixing it. Though that's not a Linux specific issue, but I wish that
we as the kernel community could find a way to confine this global
braindamage.

I've been doing continuous cleanup work in the last decade and enjoyed
it, though I'm starting to get increasingly frustrated and grumpy. It
might be an age thing :) Though talking to Al Viro makes me certain
that it's not.

A good start would be if you could convert your kernel statistics into
accounting the consolidation effects of contributions instead of
fostering the idiocy that corporates have started to measure themselves
and the performance of their employees (I'm not kidding, it's the sad
reality) with line and commit count statistics.

Maybe that would give more people an incentive to care about the big
and long term picture instead of basking in their short sighted "hack
it into submission" achievements. I know that it's the wrong reason
when they don't realize the real thing themselves, but the end justifies
the means :)

Thanks,

tglx

2012-06-16 11:27:12

by Alan

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Fri, 15 Jun 2012 16:34:13 -0700
Greg KH <[email protected]> wrote:

> On Sat, Jun 16, 2012 at 12:56:36AM +0200, Thomas Gleixner wrote:
> > So the main questions I want to raise on Kernel Summit are:
> >
> > - How do we cope with the need to review the increasing amount of
> > (insane) patches and their potential integration?

One possibility perhaps is to throw insane stuff at submaintainers or
folks learning that path. Once they've achieved some kind of sanity with
the submaintainer then it can bother the real maintainer.

And for a lot of the insane stuff we should just say no (*). People can
maintain it out of tree and there is a point at which some work is best
done out of tree, and some specialist stuff is best kept out of the main
tree if it harms the mainstream too much.

(*) politely. So not by taking lessons from Linus.

> That's a very good question, and I've been wondering if someone is
> trying to flood us with crap submissions just to try to DoS us and slow
> us down. If not, it's an interesting "attack" vector onto our
> development process that we need to be able to handle better.

There are a small number of very large businesses that generate a lot of
the money in the server space. They have individual needs and the money
tends to drive chunks of kernel development. That's both bad and good.
The bad is they are trying to get it to do what they need, now and
without planning for the long term properly. The good is that they are
paying developers who otherwise might be working on other stuff.

It's a parallel IMHO of the "distribution people sucked everyone away and
then realised they'd dug a huge hole" problem but with new actors.

We also have a large consumer electronics and now android ecosystem much
of which is made up of companies and people whose business history and
business model for all products has always been

- make it boot
- make it usable
- ship it
- run

they have little incentive to share, they have no interest in the longer
term, and it's often not rational economics for them to clean up their
code and upstream it because it just helps their rivals, and besides the
part is now 6 months old so obsolete in their world.

For a lot of hardware the only way that is going to get fixed is if

a) it is easier to do the job right than hack it up
b) when the hardware vendors are more involved and have longer term plans
c) their customers start to demand it in order to be up and running very
fast (ie there is heavy consolidation in the platform)

It's not in these little "hack it/ship it" houses' interest to care about
making a part work well long term, because they have no particular loyalty
to any component or supplier, and can charge twice if they do the work
twice anyway.

> > - How do we prevent further insanity to known problem spaces (like
> > cpu hotplug) without stopping progress?
>
> Progress can slow, if we want it to, in some areas, just to let people
> get the time to fix up the issues we currently have. That saves time in
> the long run, but requires that someone make it very clear as to what is
> going on and how it will change in the future.

There are also a couple of areas (chunks of VM perhaps is one) where it
might actually be quicker to shoot the patient and breed a replacement.
Possibly however that's one of the areas where getting someone
mathematical involved to do more rigorous modelling of the behaviour
might be better. I know from looking at what comes out of the VM to the
block layer... it ain't pretty at times.

> But, both of these are great things to talk about, I like it.
>
> > A side question, but definitely related is:
> >
> > - How do we handle "established maintainers" who are mainly
> > interested in their own personal agenda and ignoring justified
> > criticism just because they can?

More often than not it's because they believe their agenda is right.
E.g. I'm firmly of the opinion that 99% of users would be better off if we
took all the current scheduler nonsense and replaced it with a lightly
tweaked version of Ingo's original O(1) scheduler.

I don't think Ingo and Peter have "personal" agendas in this area; they are
doing what they think is right and dealing with the needs of big
enterprise and where most of the money is.

I have a suspicion that two things are going to correct chunks of this
over time anyway
- handheld/phone/tablet
- the need for very thin virtualisation

> The wonderful, "how do we remove a maintainer who isn't working out"
> problem. It's a tough one, I don't think we really have any standard
> way. Luckily in the past, the insane ones went away on their own :)

We have current problems and they are often caused by the maintainer in
question having other commitments that they consider more important (and
probably are in some cases).

One thing that seems to be working well are all the areas that have two
or more maintainers. As a simple statistical fault tolerance they don't
generally both have newborns, get the flu, change job or get yanked into
a critical customer problem by their employer on the same week.

Right now we are doing it for real in some areas, and via the "screw
this, mail Andrew Morton" process for others, plus Linus fields some of
the really dumb ones. We could formalise some of that a bit more and
encourage more maintainers to actually team up with one of the other
contributors they trust.

(And we probably need to clone DaveM or get him to delegate more 8))

And we need about ten extra GregKH's if anyone has spares

Alan
--
'Go go go Gregzilla'

2012-06-16 13:29:03

by Jonathan Corbet

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, 16 Jun 2012 12:50:05 +0200 (CEST)
Thomas Gleixner <[email protected]> wrote:

> A good start would be if you could convert your kernel statistics into
> accounting the consolidation effects of contributions instead of
> fostering the idiocy that corporates have started to measure themselves
> and the performance of their employees (I'm not kidding, it's the sad
> reality) with line and commit count statistics.

I would dearly love to come up with a way to measure "real work" in
some fashion; I've just not, yet, figured out how to do that. I do
fear that the simple numbers we're able to generate end up creating the
wrong kinds of incentives.

Any thoughts on how to measure "consolidation effects"? I toss out
numbers on code removal sometimes, but that turns out to not be a whole
lot more useful than anything else on its own.

Thanks,

jon

2012-06-16 13:32:32

by Frederic Weisbecker

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, Jun 16, 2012 at 07:29:06AM -0600, Jonathan Corbet wrote:
> On Sat, 16 Jun 2012 12:50:05 +0200 (CEST)
> Thomas Gleixner <[email protected]> wrote:
>
> > A good start would be if you could convert your kernel statistics into
> > accounting the consolidation effects of contributions instead of
> > fostering the idiocy that corporates have started to measure themselves
> > and the performance of their employees (I'm not kidding, it's the sad
> > reality) with line and commit count statistics.
>
> I would dearly love to come up with a way to measure "real work" in
> some fashion; I've just not, yet, figured out how to do that. I do
> fear that the simple numbers we're able to generate end up creating the
> wrong kinds of incentives.
>
> Any thoughts on how to measure "consolidation effects"? I toss out
> numbers on code removal sometimes, but that turns out to not be a whole
> lot more useful than anything else on its own.

I fear there is no reliable automated way to measure that :)

2012-06-16 13:50:45

by Rafael J. Wysocki

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Saturday, June 16, 2012, Jonathan Corbet wrote:
> On Sat, 16 Jun 2012 12:50:05 +0200 (CEST)
> Thomas Gleixner <[email protected]> wrote:
>
> > A good start would be if you could convert your kernel statistics into
> > accounting the consolidation effects of contributions instead of
> > fostering the idiocy that corporates have started to measure themselves
> > and the performance of their employees (I'm not kidding, it's the sad
> > reality) with line and commit count statistics.
>
> I would dearly love to come up with a way to measure "real work" in
> some fashion; I've just not, yet, figured out how to do that. I do
> fear that the simple numbers we're able to generate end up creating the
> wrong kinds of incentives.

I have exactly the same feeling about that.

> Any thoughts on how to measure "consolidation effects"? I toss out
> numbers on code removal sometimes, but that turns out to not be a whole
> lot more useful than anything else on its own.

Well, that is very difficult to measure. I'd look for cases in which certain
function calls and data types become more widespread.
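
A rough sketch of that idea, assuming two snapshots of a source tree are
available as filename-to-text mappings: count how many files call a given
helper before and after. The names files_calling, adoption_change and
common_init are hypothetical, and the regex is deliberately crude (it also
matches definitions and comments), so the result is only an indicator.

```python
import re
from typing import Dict


def files_calling(sources: Dict[str, str], function: str) -> int:
    """Count the files whose text contains something that looks like a call."""
    # Crude heuristic: the identifier followed by an opening parenthesis.
    pattern = re.compile(r"\b" + re.escape(function) + r"\s*\(")
    return sum(1 for text in sources.values() if pattern.search(text))


def adoption_change(before: Dict[str, str], after: Dict[str, str],
                    function: str) -> int:
    """Positive result: the helper is used in more files than before."""
    return files_calling(after, function) - files_calling(before, function)


# Hypothetical snapshots: one driver drops its private setup code and
# switches to the shared helper.
before = {"a.c": "legacy_setup(dev);", "b.c": "common_init(dev);"}
after = {"a.c": "common_init(dev);", "b.c": "common_init(dev);"}
print(adoption_change(before, after, "common_init"))
```

A growing count for a shared helper, together with shrinking driver-private
code, would hint at consolidation, though it says nothing about quality.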

2012-06-16 15:19:29

by Phil Turmel

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On 06/16/2012 07:30 AM, Alan Cox wrote:
> 'Go go go Gregzilla'

QOTW!

2012-06-16 16:43:10

by Myklebust, Trond

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, 2012-06-16 at 12:30 +0100, Alan Cox wrote:
> On Fri, 15 Jun 2012 16:34:13 -0700
> Greg KH <[email protected]> wrote:
>
> > On Sat, Jun 16, 2012 at 12:56:36AM +0200, Thomas Gleixner wrote:
> > > So the main questions I want to raise on Kernel Summit are:
> > >
> > > - How do we cope with the need to review the increasing amount of
> > > (insane) patches and their potential integration?

> There are a small number of very large businesses that generate a lot of
> the money in the server space. They have individual needs and the money
> tends to drive chunks of kernel development. That's both bad and good.
> The bad is they are trying to get it to do what they need, now and
> without planning for the long term properly. The good is that they are
> paying developers who otherwise might be working on other stuff.

I don't think that is necessarily true. Most of the large businesses are
interested in long term solutions: that's why they hire people rather
than farming the work out to consultants.

The problem is that the internal cultures of those businesses aren't
necessarily adapted to an open development model. Their focus is on a
limited set of features that they can sell to customers, and so they get
confused when you tell them "no, I don't want to have to implement 10
different variants of read/write to satisfy 10 different workloads, even
if each one can be shown to produce a 2% performance increase".

The people who are aware of the Linux culture tend to be the
developers, since they are immersed in that culture and since we have
programs for educating them (via workshops, conference tutorials,
mailing lists, Linus's shouting...). It should be the responsibility of
those developers to educate their program managers etc, and usually that
will happen once the program managers realise that their projects are
going nowhere with the maintainers...

> We also have a large consumer electronics and now android ecosystem much
> of which is made up of companies and people whose business history and
> business model for all products has always been
>
> - make it boot
> - make it usable
> - ship it
> - run
>
> they have little incentive to share, they have no interest in the longer
> term, and it's often not rational economics for them to clean up their
> code and upstream it because it just helps their rivals, and besides the
> part is now 6 months old so obsolete in their world.
>
> For a lot of hardware the only way that is going to get fixed is if
>
> a) it is easier to do the job right than hack it up
> b) when the hardware vendors are more involved and have longer term plans
> c) their customers start to demand it in order to be up and running very
> fast (ie there is heavy consolidation in the platform)
>
> It's not in these little "hack it/ship it" houses' interest to care about
> making a part work well long term, because they have no particular loyalty
> to any component or supplier, and can charge twice if they do the work
> twice anyway.

Right, but that is a different problem: that's about getting people to
contribute in the first place, rather than the review issue that Thomas
raised.

> > The wonderful, "how do we remove a maintainer who isn't working out"
> > problem. It's a tough one, I don't think we really have any standard
> > way. Luckily in the past, the insane ones went away on their own :)
>
> We have current problems and they are often caused by the maintainer in
> question having other commitments that they consider more important (and
> probably are in some cases).
>
> One thing that seems to be working well are all the areas that have two
> or more maintainers. As a simple statistical fault tolerance they don't
> generally both have newborns, get the flu, change job or get yanked into
> a critical customer problem by their employer on the same week.
>
> Right now we are doing it for real in some areas, and via the "screw
> this, mail Andrew Morton" process for others, plus Linus fields some of
> the really dumb ones. We could formalise some of that a bit more and
> encourage more maintainers to actually team up with one of the other
> contributors they trust.

If by "maintainer" you mean "patch reviewer", then I agree. Teaming up
for the review process is the right thing to do, and is (as far as I
know) what we were trying to resolve with the "Reviewed-by" tag.
Formalising the review process and raising the status of developers that
commit to reviewing patches is entirely the right thing to do. Actually
maintaining trees of reviewed patches is the trivial part of the
operation.

Perhaps the right thing to do is to start demanding that all patches
that are submitted to the maintainer contain at least one "Reviewed-by:"
tag?
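
A minimal sketch of how such a requirement could be checked mechanically,
assuming the check runs over the plain text of a patch or commit message.
The helper names and the example message are hypothetical; only the
"Reviewed-by:" tag itself comes from the discussion.

```python
import re

# Matches a trailer line of the form "Reviewed-by: Name <address>".
REVIEWED_BY = re.compile(r"^Reviewed-by:[ \t]*\S.*<[^<>\s]+>[ \t]*$",
                         re.MULTILINE)


def reviewed_by_count(message: str) -> int:
    """Return the number of Reviewed-by trailer lines in the message."""
    return len(REVIEWED_BY.findall(message))


def has_review(message: str, minimum: int = 1) -> bool:
    """True if the message carries at least `minimum` Reviewed-by tags."""
    return reviewed_by_count(message) >= minimum


# Hypothetical commit message used only for illustration.
example = """\
foo: fix frob handling

Explain the change here.

Signed-off-by: A Developer <dev@example.org>
Reviewed-by: A Reviewer <reviewer@example.org>
"""
print(has_review(example))
```

A check like this only confirms that the tag is present; whether the review
behind it was a real one remains a question of trust.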

> (And we probably need to clone DaveM or get him to delegate more 8))
>
> And we need about ten extra GregKH's if anyone has spares
>
> Alan
> --
> 'Go go go Gregzilla'

ACK! :-)

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2012-06-17 10:41:04

by Thomas Gleixner

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, 16 Jun 2012, Jonathan Corbet wrote:
> On Sat, 16 Jun 2012 12:50:05 +0200 (CEST)
> Thomas Gleixner <[email protected]> wrote:
>
> > A good start would be if you could convert your kernel statistics into
> > accounting the consolidation effects of contributions instead of
> > fostering the idiocy that corporates have started to measure themselves
> > and the performance of their employees (I'm not kidding, it's the sad
> > reality) with line and commit count statistics.
>
> I would dearly love to come up with a way to measure "real work" in
> some fashion; I've just not, yet, figured out how to do that. I do
> fear that the simple numbers we're able to generate end up creating the
> wrong kinds of incentives.
>
> Any thoughts on how to measure "consolidation effects"? I toss out
> numbers on code removal sometimes, but that turns out to not be a whole
> lot more useful than anything else on its own.

I don't think there is an automated way.

How about not publishing the stats at all and just mention anything
related to them when something exceptional happens? E.g. out of the
blue 90% of the patches were submitted by hobbyists.

If you look at the stats of the last years, there is nothing really
interesting happening. We already know who employs the most kernel
developers and who of them is doing most of the work.

If companies really want to measure their "importance" or the
"performance" of their employees they can create their own stats and
abuse them for whatever they want.

Thanks,

tglx

2012-06-17 17:04:52

by Mark Brown

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, Jun 16, 2012 at 12:30:32PM +0100, Alan Cox wrote:
> Greg KH <[email protected]> wrote:

> We also have a large consumer electronics and now android ecosystem much
> of which is made up of companies and people whose business history and
> business model for all products has always been

> - make it boot
> - make it usable
> - ship it
> - run

> they have little incentive to share, they have no interest in the longer
> term, and it's often not rational economics for them to clean up their
> code and upstream it because it just helps their rivals, and besides the
> part is now 6 months old so obsolete in their world.

This is actually getting a lot better these days.

> For a lot of hardware the only way that is going to get fixed is if

> a) it is easier to do the job right than hack it up
> b) when the hardware vendors are more involved and have longer term plans
> c) their customers start to demand it in order to be up and running very
> fast (ie there is heavy consolidation in the platform)

The latter two are happening right now, mostly thanks to consumer demand
for software updates for things like phones though the desire to keep
hardware platforms rolling independently of OS releases is also a factor.
One of the big blockers to that has been the need to move all the out of
tree stuff up to a newer kernel; reducing the diff to mainline is a good
way to minimise the effort involved. I was very pleased when I started
to find handset vendors wanting to confirm that patches given to them
were also going upstream.

This doesn't apply to all hardware but more and more things are getting
in field updates.

> Right now we are doing it for real in some areas, and via the "screw
> this, mail Andrew Morton" process for others, plus Linus fields some of
> the really dumb ones. We could formalise some of that a bit more and
> encourage more maintainers to actually team up with one of the other
> contributors they trust.

Yes, this would really help as would better backup plans when things
aren't working. Finding people to work with is not just a question of
trust, it's also a question of finding people with similar work patterns
as a mismatch can be painful.

2012-06-17 18:52:00

by Greg KH

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sun, Jun 17, 2012 at 12:40:55PM +0200, Thomas Gleixner wrote:
> If you look at the stats of the last years, there is nothing really
> interesting happening. We already know who employs the most kernel
> developers and who of them is doing most of the work.

What is interesting, and is why I started collecting this information
years ago, is keeping track of our rate-of-change, the number of new
developers we have contributing, and the number of different companies
and how many of them are contributing. All of those numbers are good to
watch to see how well as a community we are doing.

So far, all of those numbers are going up, which is good. If they ever
start dropping, I will get worried.

These numbers, and stats, are also good for getting other companies to
get involved in kernel development. I've used them for many years to
point out that they need to get involved, and in one noticeable case
(Intel), it has made a huge difference. Other cases (Amazon and
Motorola), it hasn't helped out at all.

They also show what areas of the kernel are under major change and
churn, which is interesting to see for some people who don't pay that
much attention to our community (2 years ago the x86 rework was obvious,
and this year the ARM and SoC work is obvious).

> If companies really want to measure their "importance" or the
> "performance" of their employees they can create their own stats and
> abuse them for whatever they want.

Companies do do that. You also see companies "hiding" their
contributions from the stats as they don't want to show up on the radar
for odd reasons (Qualcomm is one example of this, they spread their
contributions around 3 different companies for "misguided" legal
reasons.)

Microsoft was an interesting example of a company that ended up doing a
lot of work for just one set of drivers, and ended up showing high in
the stats because of that. That provided a great example of a company
that no one had ever thought would contribute actually doing so (the local
Seattle paper's headline read, "Is Cancer Cured?", which was so funny to
me and pissed so many locals off).

And yes, some companies try to "game" the numbers, but it's really hard
to do this given how much real work is being done by people, and how
obvious it is when it happens. So far I haven't seen anyone succeed in
doing this, but they might have been so good that I didn't notice.

And as always, of course statistics lie, we all know this, but sometimes
they can be helpful for your cause :)

greg k-h

2012-06-17 18:58:27

by Mark Brown

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, Jun 16, 2012 at 07:29:06AM -0600, Jonathan Corbet wrote:

> Any thoughts on how to measure "consolidation effects"? I toss out
> numbers on code removal sometimes, but that turns out to not be a whole
> lot more useful than anything else on its own.

It'd probably help if we could split out the framework and driver code,
but at the end of the day it's all just stats and therefore has to be
taken with a pinch (or large helping) of salt.

2012-06-19 15:46:16

by Bjorn Helgaas

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Fri, Jun 15, 2012 at 4:56 PM, Thomas Gleixner <[email protected]> wrote:
> - How do we cope with the need to review the increasing amount of
> (insane) patches and their potential integration?

There's certainly a problem here. Sometimes I see patches that are
technically correct, but there's a philosophical discussion about
whether a design or user interface change is desirable. Those are fun
and interesting to deal with.

The bigger problem for me is that many patches do something useful and
desirable but have some obscure technical problem, and it's hard to
catch them. Usually these patches come from competent submitters who
merely don't know where all the landmines in the current design are
buried.

As a trivial example, a recent patch added a PCI "final" fixup. Seems
perfectly reasonable, except that final fixups aren't applied to
hot-added devices. That's a bug in PCI, and we shouldn't expect
everybody who writes a quirk to know about it.

We can try to address this by "educating developers" or "documenting
the design better" or "delegating to submaintainers" or whatever.
Those are valuable, but in some way they're cop-outs. A more
effective fix would be to remove the landmines, reduce complexity, and
improve the design. If we have coherent code that follows the
hardware architecture and matches people's intuition about how things
"should work," I think we'll get patches with fewer issues.

Sorry, I think I just reiterated what you, Greg KH, Alan, et al have
already said :)

2012-06-19 19:18:56

by Roland Dreier

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Tue, Jun 19, 2012 at 8:45 AM, Bjorn Helgaas <[email protected]> wrote:
> We can try to address this by "educating developers" or "documenting
> the design better" or "delegating to submaintainers" or whatever.
> Those are valuable, but in some way they're cop-outs. A more
> effective fix would be to remove the landmines, reduce complexity, and
> improve the design. If we have coherent code that follows the
> hardware architecture and matches people's intuition about how things
> "should work," I think we'll get patches with fewer issues.
>
> Sorry, I think I just reiterated what you, Greg KH, Alan, et al have
> already said :)

No, I don't think that's just reiteration, I think it's an important point :)

There is a certain strain of thinking in our community that is resistant
to working on design improvements as you describe. And I think without
improving in that direction, we're going to drown in subtly broken patches.

- R.

2012-06-20 00:41:12

by Dave Chinner

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, Jun 16, 2012 at 04:43:05PM +0000, Myklebust, Trond wrote:
> On Sat, 2012-06-16 at 12:30 +0100, Alan Cox wrote:
> > On Fri, 15 Jun 2012 16:34:13 -0700
> > Greg KH <[email protected]> wrote:
> > > On Sat, Jun 16, 2012 at 12:56:36AM +0200, Thomas Gleixner wrote:
> > > > So the main questions I want to raise on Kernel Summit are:
> > > >
> > > > - How do we cope with the need to review the increasing amount of
> > > > (insane) patches and their potential integration?
> > One thing that seems to be working well are all the areas that have two
> > or more maintainers. As a simple statistical fault tolerance they don't
> > generally both have newborns, get the flu, change job or get yanked into
> > a critical customer problem by their employer on the same week.
> >
> > Right now we are doing it for real in some areas, and via the "screw
> > this, mail Andrew Morton" process for others, plus Linus fields some of
> > the really dumb ones. We could formalise some of that a bit more and
> > encourage more maintainers to actually team up with one of the other
> > contributors they trust.
>
> If by "maintainer" you mean "patch reviewer", then I agree. Teaming up
> for the review process is the right thing to do, and is (as far as I
> know) what we were trying to resolve with the "Reviewed-by" tag.
> Formalising the review process and raising the status of developers that
> commit to reviewing patches is entirely the right thing to do. Actually
> maintaining trees of reviewed patches is the trivial part of the
> operation.
>
> Perhaps the right thing to do is to start demanding that all patches
> that are submitted to the maintainer contain at least one "Reviewed-by:"
> tag?

The upside of this is that people who regularly review patches
(note: review, not ack) are more likely to have their patches
reviewed promptly by other people. i.e. this often devolves to an "I
scratch your back, you scratch mine" kind of arrangement.

In turn, this encourages prompt code review: people remember who
reviewed their last patch-bomb quickly, and are more likely to review
that person's code quickly in return.

And the maintainer quickly learns whose reviews can be trusted, too,
which then leads to less load on the maintainer as reviewed-by tags
grow in trust value....
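
The "at least one Reviewed-by: tag" idea quoted above could be checked mechanically before a patch reaches the maintainer. A minimal sketch in Python follows; the helper names `reviewers` and `has_review` and the sample commit message are hypothetical, and only the `Reviewed-by:` tag itself comes from the discussion.

```python
import re

# Match "Reviewed-by: <something>" at the start of a line.
# Acked-by and Signed-off-by are deliberately not counted as reviews.
_REVIEWED_BY = re.compile(r"^Reviewed-by:[ \t]*(\S.*)$", re.MULTILINE)


def reviewers(message):
    """Return the value of every Reviewed-by tag in a commit message."""
    return [value.strip() for value in _REVIEWED_BY.findall(message)]


def has_review(message):
    """True if the message carries at least one Reviewed-by tag."""
    return bool(reviewers(message))


# Hypothetical commit message, for illustration only.
example = """\
fs: fix off-by-one in example code

Signed-off-by: A Developer <dev@example.org>
Reviewed-by: A Reviewer <reviewer@example.org>
Acked-by: Someone Else <ack@example.org>
"""

print(reviewers(example))
print(has_review("subject\n\nAcked-by: Someone Else <ack@example.org>\n"))
```

A check like this only proves that a tag is present. Whether the review behind it can be trusted is still something the maintainer learns over time.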

Cheers,

Dave.
--
Dave Chinner
[email protected]

2012-06-20 19:51:28

by J. Bruce Fields

Subject: Re: [Ksummit-2012-discuss] [ATTEND or not ATTEND] That's the question!

On Sat, Jun 16, 2012 at 07:29:06AM -0600, Jonathan Corbet wrote:
> On Sat, 16 Jun 2012 12:50:05 +0200 (CEST)
> Thomas Gleixner <[email protected]> wrote:
>
> > A good start would be if you could convert your kernel statistics into
> > accounting the consolidation effects of contributions instead of
> > fostering the idiocy that corporates have started to measure themselves
> > and the performance of their employees (I'm not kidding, it's the sad
> > reality) with line and commit count statistics.
>
> I would dearly love to come up with a way to measure "real work" in
> some fashion; I've just not, yet, figured out how to do that. I do
> fear that the simple numbers we're able to generate end up creating the
> wrong kinds of incentives.

I can't see any alternative to explaining what somebody did and why it
was important.

To that end, the best resource for understanding the value of somebody's
work is the lwn.net kernel page--if their work has been discussed there.

So, all you need to do is to hire a dozen more of you, and we're
covered!

--b.

>
> Any thoughts on how to measure "consolidation effects"? I toss out
> numbers on code removal sometimes, but that turns out to not be a whole
> lot more useful than anything else on its own.
>
> Thanks,
>
> jon