2007-12-20 13:11:17

by Jan Beulich

[permalink] [raw]
Subject: [PATCH 0/4] add task handling notifier

With more and more sub-systems/sub-components leaving their footprint
in task handling functions, it seems reasonable to add notifiers that
these components can use instead of having them all patch themselves
directly into core files.

Patch 1 introduces the base definitions and hooks for task creation
and deletion.
Patch 2 switches delayacct to make use of the notifier.
Patch 3 makes the procevents/connector use the infrastructure and adds
additional notifiers needed there.
Patch 4 makes the security keys handling use this, too.

Signed-off-by: Jan Beulich <[email protected]>


2007-12-20 22:22:39

by Ingo Oeser

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

Hi Jan,

I like and support your idea!

On Thursday 20 December 2007, Jan Beulich wrote:
> With more and more sub-systems/sub-components leaving their footprint
> in task handling functions, it seems reasonable to add notifiers that
> these components can use instead of having them all patch themselves
> directly into core files.

Yes, but why export variables? Wouldn't it be better to export
an API?

That simplifies the callers (they all pass "current" as task
and "task_notifier_list" as arguments).

It also prevents exposing internal variables (notifier lists
ARE internal variables) to modules.

What do you think?


Best Regards

Ingo Oeser

2007-12-21 07:36:30

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

>Yes, but why export variables? Wouldn't it be better to export
>an API?
>
>That simplifies the callers (they all pass "current" as task
>and "task_notifier_list" as arguments).
>
>It also prevents exposing internal variables (notifier lists
>ARE internal variables) to modules.
>
>What do you think?

Would be a simple change if the concept itself is generally welcome. Will
first see whether I get other comments requiring re-work.

Jan

2007-12-23 12:26:34

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> With more and more sub-systems/sub-components leaving their footprint
> in task handling functions, it seems reasonable to add notifiers that
> these components can use instead of having them all patch themselves
> directly into core files.

I agree that we probably want something like this. As do some others,
so we already had a few a few attempts at similar things. The first one
is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
includes allocating per-task data for it's users. Then also from SGI
there has been a simplified version called pnotify that's also available
from the website above.

Later Matt Helsley had something called "Task Watchers" which lwn has
an article on: http://lwn.net/Articles/208117/.

For some reason neither ever made a lot of progess (performance
problems?).

As said by other we really should have a register/unregister API instead
of exposing the implementation. Also please don't export anything until
we actuall grow modular users in the tree.

2007-12-25 22:07:40

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Sun, 23 Dec 2007 12:26:21 +0000 Christoph Hellwig <[email protected]> wrote:

> On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> > With more and more sub-systems/sub-components leaving their footprint
> > in task handling functions, it seems reasonable to add notifiers that
> > these components can use instead of having them all patch themselves
> > directly into core files.
>
> I agree that we probably want something like this. As do some others,
> so we already had a few a few attempts at similar things. The first one
> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> includes allocating per-task data for it's users. Then also from SGI
> there has been a simplified version called pnotify that's also available
> from the website above.
>
> Later Matt Helsley had something called "Task Watchers" which lwn has
> an article on: http://lwn.net/Articles/208117/.
>
> For some reason neither ever made a lot of progess (performance
> problems?).
>

I had it in -mm, sorted out all the problems but ended up not pulling the
trigger.

Problem is, it adds runtime overhead purely for the convenience of kernel
programmers, and I don't think that's a good tradeoff.

Sprinkling direct calls into a few well-known sites won't kill us, and
we've survived this long. Why not keep doing that, and save everyone a few
cycles?

2008-01-08 13:37:54

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

>>> Andrew Morton <[email protected]> 25.12.07 23:05 >>>
>On Sun, 23 Dec 2007 12:26:21 +0000 Christoph Hellwig <[email protected]> wrote:
>
>> On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
>> > With more and more sub-systems/sub-components leaving their footprint
>> > in task handling functions, it seems reasonable to add notifiers that
>> > these components can use instead of having them all patch themselves
>> > directly into core files.
>>
>> I agree that we probably want something like this. As do some others,
>> so we already had a few a few attempts at similar things. The first one
>> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
>> includes allocating per-task data for it's users. Then also from SGI
>> there has been a simplified version called pnotify that's also available
>> from the website above.
>>
>> Later Matt Helsley had something called "Task Watchers" which lwn has
>> an article on: http://lwn.net/Articles/208117/.
>>
>> For some reason neither ever made a lot of progess (performance
>> problems?).
>>
>
>I had it in -mm, sorted out all the problems but ended up not pulling the
>trigger.
>
>Problem is, it adds runtime overhead purely for the convenience of kernel
>programmers, and I don't think that's a good tradeoff.
>
>Sprinkling direct calls into a few well-known sites won't kill us, and
>we've survived this long. Why not keep doing that, and save everyone a few
>cycles?

Am I to conclude then that there's no point in addressing the issues other
people pointed out? While I (obviously, since I submitted the patch disagree),
I'm not certain how others feel. My main point for disagreement here is (I'm
sorry to repeat this) that as long as certain code isn't allowed into the kernel
I think it is not unreasonable to at least expect the kernel to provide some
fundamental infrastructure that can be used for those (supposedly
unacceptable) bits. All I did here was utilizing the base infrastructure I want
added to clean up code that appeared pretty ad-hoc.

Jan

2008-01-08 22:21:30

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Tue, 08 Jan 2008 13:38:03 +0000
"Jan Beulich" <[email protected]> wrote:

> >>> Andrew Morton <[email protected]> 25.12.07 23:05 >>>
> >On Sun, 23 Dec 2007 12:26:21 +0000 Christoph Hellwig <[email protected]> wrote:
> >
> >> On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> >> > With more and more sub-systems/sub-components leaving their footprint
> >> > in task handling functions, it seems reasonable to add notifiers that
> >> > these components can use instead of having them all patch themselves
> >> > directly into core files.
> >>
> >> I agree that we probably want something like this. As do some others,
> >> so we already had a few a few attempts at similar things. The first one
> >> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> >> includes allocating per-task data for it's users. Then also from SGI
> >> there has been a simplified version called pnotify that's also available
> >> from the website above.
> >>
> >> Later Matt Helsley had something called "Task Watchers" which lwn has
> >> an article on: http://lwn.net/Articles/208117/.
> >>
> >> For some reason neither ever made a lot of progess (performance
> >> problems?).
> >>
> >
> >I had it in -mm, sorted out all the problems but ended up not pulling the
> >trigger.
> >
> >Problem is, it adds runtime overhead purely for the convenience of kernel
> >programmers, and I don't think that's a good tradeoff.
> >
> >Sprinkling direct calls into a few well-known sites won't kill us, and
> >we've survived this long. Why not keep doing that, and save everyone a few
> >cycles?
>
> Am I to conclude then that there's no point in addressing the issues other
> people pointed out? While I (obviously, since I submitted the patch disagree),
> I'm not certain how others feel. My main point for disagreement here is (I'm
> sorry to repeat this) that as long as certain code isn't allowed into the kernel
> I think it is not unreasonable to at least expect the kernel to provide some
> fundamental infrastructure that can be used for those (supposedly
> unacceptable) bits. All I did here was utilizing the base infrastructure I want
> added to clean up code that appeared pretty ad-hoc.
>

Ah. That's a brand new requirement.

The requirement which I thought we were addressing was "clean the code up
by adding a notifier chain so multiple subsystems don't need to patch in
hard-coded calls".

My contention is that the code clarity which this gains isn't worth the
runtime cost.



Now we have a new requirement: "allow out-of-tree code to hook into these
spots without needing to patch those few callsites".

I think we'd need a pretty detailed description of the pain which this
would relieve before we would take such an extraordinary step. What are
those (unidentified) add-on features doing at present? Patching calls into
fork.c/exec.c/exit.c?

2008-01-09 00:03:20

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

Andrew wrote:
> What are those (unidentified) add-on features doing at present?
> Patching calls into fork.c/exec.c/exit.c?

Most likely. I suspect we have general agreement and awareness
that such patching is not something that sells well in Linux-land.
And for good reason in my personal view ... such patching by loadable
modules could open the door to compromising the integrity of Linux in
ways that could be dangerous.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.940.382.4214

2008-01-09 00:37:19

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Tue, 8 Jan 2008 18:03:09 -0600
Paul Jackson <[email protected]> wrote:

> Andrew wrote:
> > What are those (unidentified) add-on features doing at present?
> > Patching calls into fork.c/exec.c/exit.c?
>
> Most likely. I suspect we have general agreement and awareness
> that such patching is not something that sells well in Linux-land.
> And for good reason in my personal view ... such patching by loadable
> modules could open the door to compromising the integrity of Linux in
> ways that could be dangerous.
>

err, no.

What I meant was that the providers of these mystery features are
presumably also patching into fork.c/exec.c/exit.c at the source code level
so as to enable the mystery features within their overall kernel package.

If so, this doesn't sounds terribly onerous.

2008-01-09 02:24:25

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Sun, 2007-12-23 at 12:26 +0000, Christoph Hellwig wrote:
> On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> > With more and more sub-systems/sub-components leaving their footprint
> > in task handling functions, it seems reasonable to add notifiers that
> > these components can use instead of having them all patch themselves
> > directly into core files.
>
> I agree that we probably want something like this. As do some others,
> so we already had a few a few attempts at similar things. The first one
> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> includes allocating per-task data for it's users. Then also from SGI
> there has been a simplified version called pnotify that's also available
> from the website above.
>
> Later Matt Helsley had something called "Task Watchers" which lwn has
> an article on: http://lwn.net/Articles/208117/.

Apologies for the late reply -- I haven't had internet access for the
last few weeks.

> For some reason neither ever made a lot of progess (performance
> problems?).

Yeah. Some discussion on measuring the performance of Task Watchers:
http://thread.gmane.org/gmane.linux.lse/4698

The requirements for Task Watchers were:

Allow sleeping in most/all notifier functions in these paths:
fork
exec
exit
change [re][ug]id
No performance overhead
One "chain" per path ("I only care about exec().")
Easy to use
Scales to large numbers of CPUs
Useful to make most in-tree code more readable. Task Watchers took
direct calls to these pieces of code out of the fork/exec/exit paths:
audit
semundo
cpusets
mempolicy
trace irqflags
lockdep
keys (for processes -- not for thread groups)
process events connector
Useful for loadable modules

Performance overhead in microbenchmarks was measurable at around 1% (see
the URL above). Overhead on benchmarks like kernbench on the other hand
were in the noise margins (which were around 1.6%) and hence I couldn't
determine the overhead there.

I never got the loadable module part completely working due to races
between notifier functions and the module unload path. The solution to
the races seemed to require adding more overhead to the notifier
function paths (SRCU-like grace periods).

I stopped pushing the patch set because I hadn't found any new
optimizations to offset the overheads while still meeting all the
requirements and Andrew still felt that the "make it more readable"
argument was not sufficient to justify its inclusion.

Jan, instead of adding notifiers could utrace be used or made to work
for modules? Also, please add me to the Cc list for any reposts of the
entire series. Thanks!

Cheers,
-Matt Helsley

2008-01-09 02:47:22

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Tue, 2008-01-08 at 14:14 -0800, Andrew Morton wrote:
> On Tue, 08 Jan 2008 13:38:03 +0000
> "Jan Beulich" <[email protected]> wrote:
>
> > >>> Andrew Morton <[email protected]> 25.12.07 23:05 >>>
> > >On Sun, 23 Dec 2007 12:26:21 +0000 Christoph Hellwig <[email protected]> wrote:
> > >
> > >> On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> > >> > With more and more sub-systems/sub-components leaving their footprint
> > >> > in task handling functions, it seems reasonable to add notifiers that
> > >> > these components can use instead of having them all patch themselves
> > >> > directly into core files.
> > >>
> > >> I agree that we probably want something like this. As do some others,
> > >> so we already had a few a few attempts at similar things. The first one
> > >> is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> > >> includes allocating per-task data for it's users. Then also from SGI
> > >> there has been a simplified version called pnotify that's also available
> > >> from the website above.
> > >>
> > >> Later Matt Helsley had something called "Task Watchers" which lwn has
> > >> an article on: http://lwn.net/Articles/208117/.
> > >>
> > >> For some reason neither ever made a lot of progess (performance
> > >> problems?).
> > >>
> > >
> > >I had it in -mm, sorted out all the problems but ended up not pulling the
> > >trigger.
> > >
> > >Problem is, it adds runtime overhead purely for the convenience of kernel
> > >programmers, and I don't think that's a good tradeoff.
> > >
> > >Sprinkling direct calls into a few well-known sites won't kill us, and
> > >we've survived this long. Why not keep doing that, and save everyone a few
> > >cycles?
> >
> > Am I to conclude then that there's no point in addressing the issues other
> > people pointed out? While I (obviously, since I submitted the patch disagree),
> > I'm not certain how others feel. My main point for disagreement here is (I'm
> > sorry to repeat this) that as long as certain code isn't allowed into the kernel
> > I think it is not unreasonable to at least expect the kernel to provide some
> > fundamental infrastructure that can be used for those (supposedly
> > unacceptable) bits. All I did here was utilizing the base infrastructure I want
> > added to clean up code that appeared pretty ad-hoc.
> >
>
> Ah. That's a brand new requirement.

In all fairness it's not really a brand new requirement -- just one that
wasn't strongly emphasized during prior attempts to get something like
this in.

I had a mostly-working patch for this on top of the Task Watchers v2
patch set. I never posted that specific patch because it had a race with
module unloading and the fix only increased the overhead you were
unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
description for Task Watchers v2 (http://lwn.net/Articles/207873/):

"TODO:
...
I'm working on three more patches that add support for creating a task
watcher from within a module using an ELF section. They haven't recieved
as much attention since I've been focusing on measuring the performance
impact of these patches."

<snip>

Would tainting the kernel upon registration of out-of-tree "notifiers"
be more acceptable?

Cheers,
-Matt Helsley

2008-01-09 03:22:21

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Tue, 08 Jan 2008 18:47:00 -0800 Matt Helsley <[email protected]> wrote:

> > >
> ...
> > > Am I to conclude then that there's no point in addressing the issues other
> > > people pointed out? While I (obviously, since I submitted the patch disagree),
> > > I'm not certain how others feel. My main point for disagreement here is (I'm
> > > sorry to repeat this) that as long as certain code isn't allowed into the kernel
> > > I think it is not unreasonable to at least expect the kernel to provide some
> > > fundamental infrastructure that can be used for those (supposedly
> > > unacceptable) bits. All I did here was utilizing the base infrastructure I want
> > > added to clean up code that appeared pretty ad-hoc.
> > >
> >
> > Ah. That's a brand new requirement.
>
> In all fairness it's not really a brand new requirement -- just one that
> wasn't strongly emphasized during prior attempts to get something like
> this in.
>
> I had a mostly-working patch for this on top of the Task Watchers v2
> patch set. I never posted that specific patch because it had a race with
> module unloading and the fix only increased the overhead you were
> unhappy with. I mentioned it briefly in my lengthy [PATCH 0/X]
> description for Task Watchers v2 (http://lwn.net/Articles/207873/):
>
> "TODO:
> ...
> I'm working on three more patches that add support for creating a task
> watcher from within a module using an ELF section. They haven't recieved
> as much attention since I've been focusing on measuring the performance
> impact of these patches."
>
> <snip>
>
> Would tainting the kernel upon registration of out-of-tree "notifiers"
> be more acceptable?

How does that work? module.c does the register/deregister on behalf of the
module?

I certainly encourage people to disagreee with me here, but my current
thinking is:

- the cleanup aspect isn't worth the runtime overhead and

- the support-modular-users aspect is largely new and would need a lot
more description and justification (with examples) before we can even
begin to evaluate it.

2008-01-09 03:27:47

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Tue, 2008-01-08 at 18:24 -0800, Matt Helsley wrote:
> On Sun, 2007-12-23 at 12:26 +0000, Christoph Hellwig wrote:
> > On Thu, Dec 20, 2007 at 01:11:24PM +0000, Jan Beulich wrote:
> > > With more and more sub-systems/sub-components leaving their footprint
> > > in task handling functions, it seems reasonable to add notifiers that
> > > these components can use instead of having them all patch themselves
> > > directly into core files.
> >
> > I agree that we probably want something like this. As do some others,
> > so we already had a few a few attempts at similar things. The first one
> > is from SGI and called PAGG (http://oss.sgi.com/projects/pagg/) and also
> > includes allocating per-task data for it's users. Then also from SGI
> > there has been a simplified version called pnotify that's also available
> > from the website above.
> >
> > Later Matt Helsley had something called "Task Watchers" which lwn has
> > an article on: http://lwn.net/Articles/208117/.
>
> Apologies for the late reply -- I haven't had internet access for the
> last few weeks.
>
> > For some reason neither ever made a lot of progess (performance
> > problems?).
>
> Yeah. Some discussion on measuring the performance of Task Watchers:
> http://thread.gmane.org/gmane.linux.lse/4698
>
> The requirements for Task Watchers were:
>
> Allow sleeping in most/all notifier functions in these paths:
> fork
> exec
> exit
> change [re][ug]id
> No performance overhead
> One "chain" per path ("I only care about exec().")
> Easy to use
> Scales to large numbers of CPUs
> Useful to make most in-tree code more readable. Task Watchers took
> direct calls to these pieces of code out of the fork/exec/exit paths:
> audit
> semundo
> cpusets
> mempolicy
> trace irqflags
> lockdep
> keys (for processes -- not for thread groups)
> process events connector
> Useful for loadable modules
>
> Performance overhead in microbenchmarks was measurable at around 1% (see
> the URL above). Overhead on benchmarks like kernbench on the other hand
> were in the noise margins (which were around 1.6%) and hence I couldn't
> determine the overhead there.
>
> I never got the loadable module part completely working due to races
> between notifier functions and the module unload path. The solution to
> the races seemed to require adding more overhead to the notifier
> function paths (SRCU-like grace periods).
>
> I stopped pushing the patch set because I hadn't found any new
> optimizations to offset the overheads while still meeting all the
> requirements and Andrew still felt that the "make it more readable"
> argument was not sufficient to justify its inclusion.

Oops. It's been nearly two years so I've forgotten exactly where Task
Watchers v2 was when I stopped pushing it. After a bit more searching I
found a more recent posting:
http://lkml.org/lkml/2006/12/14/384

And here's why I think the microbenchmark results improved to the point
there was a small performance improvement over mainline:
http://lkml.org/lkml/2006/12/19/124

I seem to recall kernbench was still too noisy to tell.

The patch allowing modules to register Task Watchers still isn't posted
there for the reasons I've already described.

Cheers,
-Matt Helsley

2008-01-09 09:51:56

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

>> Am I to conclude then that there's no point in addressing the issues other
>> people pointed out? While I (obviously, since I submitted the patch disagree),
>> I'm not certain how others feel. My main point for disagreement here is (I'm
>> sorry to repeat this) that as long as certain code isn't allowed into the kernel
>> I think it is not unreasonable to at least expect the kernel to provide some
>> fundamental infrastructure that can be used for those (supposedly
>> unacceptable) bits. All I did here was utilizing the base infrastructure I want
>> added to clean up code that appeared pretty ad-hoc.
>>
>
>Ah. That's a brand new requirement.

I'm sorry, but I didn't feel this was important, as I didn't expect the cleanup
effect to cause much debate...

>I think we'd need a pretty detailed description of the pain which this
>would relieve before we would take such an extraordinary step. What are
>those (unidentified) add-on features doing at present? Patching calls into
>fork.c/exec.c/exit.c?

Yes. And the unidentified feature is NLKD. But as with other notifiers (most
notably the module unload one), all reasonable kernel debuggers should
need them (or do explicit patching of the mentioned source files). As I
explained before, I think that if kernel debuggers aren't allowed into the
tree, they should at least be allowed to co-exist (since the argument of
requiring in-tree users and submitting code for mainline inclusion is void
if political/personal reasons exclude certain code from even being
considered for inclusion).

Jan

2008-01-09 10:04:18

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/4] add task handling notifier

On Wed, Jan 09, 2008 at 09:52:01AM +0000, Jan Beulich wrote:
> Yes. And the unidentified feature is NLKD. But as with other notifiers (most
> notably the module unload one), all reasonable kernel debuggers should
> need them (or do explicit patching of the mentioned source files). As I
> explained before, I think that if kernel debuggers aren't allowed into the
> tree, they should at least be allowed to co-exist (since the argument of
> requiring in-tree users and submitting code for mainline inclusion is void
> if political/personal reasons exclude certain code from even being
> considered for inclusion).

We already have a kernel debugger in tree, xmon. There's anotherone
hopefully getting in soon (kgdb), so they can do the right thing as the
other subsystems.

And as usual, we're not going to provide hooks for out of tree mess.