2008-07-29 22:41:27

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: module-placed markers/tracepoints

* Frank Ch. Eigler ([email protected]) wrote:
> Hi -
>
> Some locals are wondering -- is there code for (or need for new code
> for) incrementing module reference counts while markers and/or
> tracepoints resident in modules have active clients?
>
> - FChE

Hi Frank,

Probe module unloading is supposed to be done automatically assuming the
following module unload behavior :

1 - Module unload code takes the stop machine lock.
2 - It is assumed that the module exit function is called at that time,
while no preempt-disabled section is executed. This function
unregisters the callbacks.
3 - Given that preemption is disabled around tracepoint/marker calls, no
calls are left to a freed module.

Actually, I notice that it might not be the case. I took for granted
what (I can't remember who ? Rusty ?) told me about module unload code.
If we look at sys_delete_module() :


/* Stop the machine so refcounts can't move and disable module. */
ret = try_stop_module(mod, flags, &forced);
if (ret != 0)
goto out;

/* Never wait if forced. */
if (!forced && module_refcount(mod) != 0)
wait_for_zero_refcount(mod);

mutex_unlock(&module_mutex);
/* Final destruction now noone is using it. */
if (mod->exit != NULL)
mod->exit();

So we see that mod->exit(), which I believe is responsible for calling
the module exit() functions, is called after the stop_machine, not
during the stop machine.

Therefore, what would be needed here is to add a synchronize_sched()
after mod->exit() in module.c or after each tracepoint/marker unregister
in marker.c/tracepoint.c. However, I'd really prefer to add this to
module.c since adding this for _every_ unregister will slow down
processing or multiple probe unregistration too much.

Rusty, am I understanding that correctly ?

Mathieu


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68


2008-07-29 23:03:17

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: module-placed markers/tracepoints

Hi -

On Tue, Jul 29, 2008 at 06:41:16PM -0400, Mathieu Desnoyers wrote:
> > Some locals are wondering -- is there code for (or need for new code
> > for) incrementing module reference counts while markers and/or
> > tracepoints resident in modules have active clients?
>
> Probe module unloading is supposed to be done automatically assuming the
> following module unload behavior [...]

The question was more that if module-placed markers/tracepoints are
armed, is there any mechanism to prevent the modules' unloading. For
kprobes, there is.

- FChE

2008-07-29 23:19:46

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: module-placed markers/tracepoints

* Frank Ch. Eigler ([email protected]) wrote:
> Hi -
>
> On Tue, Jul 29, 2008 at 06:41:16PM -0400, Mathieu Desnoyers wrote:
> > > Some locals are wondering -- is there code for (or need for new code
> > > for) incrementing module reference counts while markers and/or
> > > tracepoints resident in modules have active clients?
> >
> > Probe module unloading is supposed to be done automatically assuming the
> > following module unload behavior [...]
>
> The question was more that if module-placed markers/tracepoints are
> armed, is there any mechanism to prevent the modules' unloading. For
> kprobes, there is.
>
> - FChE

I see, it's the other way around : declaring a marker/tracepoint in a
module.

When you register to a marker/tracepoint, you actually register the
probe in a hash table. It will be connected to every placed
marker/tracepoint with that given name. Upon module load, markers/tp
that match the name are connected, and upon module unload all markers/tp
are simply freed by module.c : unlike kprobes, there is no need for the
marker/tp infrastructures to keep track of every marker/tp. So we don't
need any refcount on the module which has the marker/tp.

However, the probe question is important; I think a synchronize_sched()
is missing to handle unload of marker/tp probes modules.

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-07-30 01:41:17

by Rusty Russell

[permalink] [raw]
Subject: Re: module-placed markers/tracepoints

On Wednesday 30 July 2008 08:41:16 Mathieu Desnoyers wrote:
> Therefore, what would be needed here is to add a synchronize_sched()
> after mod->exit() in module.c or after each tracepoint/marker unregister
> in marker.c/tracepoint.c. However, I'd really prefer to add this to
> module.c since adding this for _every_ unregister will slow down
> processing or multiple probe unregistration too much.
>
> Rusty, am I understanding that correctly ?
>
> Mathieu

Hi Mathieu,

Yes: stop_machine is merely used to atomically check the module refcount
for zero and set the state so it can't be incremented again (ie.
try_module_get will fail).

So placing a tracepoint or marker in a module does not bump the module
refcount? If that's true, then there needs to be some kind of
remove_markers_from_module() call after module->exit(), which should do the
synchronize_sched() or whatever, right?

Rusty.

2008-07-30 02:28:02

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH] Module : call synchronize_sched() between module exit() and free.

> Hi Mathieu,
>
> Yes: stop_machine is merely used to atomically check the module refcount
> for zero and set the state so it can't be incremented again (ie.
> try_module_get will fail).
>
> So placing a tracepoint or marker in a module does not bump the module
> refcount? If that's true, then there needs to be some kind of
> remove_markers_from_module() call after module->exit(), which should do the
> synchronize_sched() or whatever, right?
>
> Rusty.

Actually, it's not placing a marker/tracepoint in a module which causes
a problem, this is a simple function call after all, and correctly dealt
with by current module.c code.

The problem comes from a probe function (the callback) that would be
registered to be called from a marker and would sit in an unloadable
kernel module. I would not want to tie the refcount of the probe modules
to the fact that they are connected to a marker because it would then
become impossible to unload them due to the fact that unregistration is
done in module exit().

This is one of the reasons why I disable preemption around the marker
site (the function call) : to make sure I can can unregister the
callback, wait for a quiescent state (with synchronize_sched()) and then
free the module memory.

This would give the following supplementary guarantee about module
teardown : every function called with preemption off and unregistered in
the module exit() would reach a quiescent state before the module is
freed. Given this does apply to rarely used code (module unload), I
think it might be ok to simply add a call to synchronize_sched() before
the module memory is freed. Not tying this to markers/tracepoints would
keep the behavior consistant across various build options, which is IMHO
a good thing.

I could also just document that a mandatory "synchronize_sched()" should
be called at the end of the probe module exit() function which makes
sure the probes has reached a quiescent state.

I don't want to add a synchronize_sched() into the marker/tracepoint
probe unregistration code because I want to keep batch probe
unregistration fast enough so it does no take ~5 seconds to unload ~100
probes. (may take longer on a loaded SMP system)

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-07-30 03:05:03

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] Module : call synchronize_sched() between module exit() and free.

On Wednesday 30 July 2008 12:27:51 Mathieu Desnoyers wrote:
> > Hi Mathieu,
> >
> > Yes: stop_machine is merely used to atomically check the module
> > refcount for zero and set the state so it can't be incremented again (ie.
> > try_module_get will fail).
> >
> > So placing a tracepoint or marker in a module does not bump the module
> > refcount? If that's true, then there needs to be some kind of
> > remove_markers_from_module() call after module->exit(), which should do
> > the synchronize_sched() or whatever, right?
> >
> > Rusty.
>
> Actually, it's not placing a marker/tracepoint in a module which causes
> a problem, this is a simple function call after all, and correctly dealt
> with by current module.c code.
>
> The problem comes from a probe function (the callback) that would be
> registered to be called from a marker and would sit in an unloadable
> kernel module. I would not want to tie the refcount of the probe modules
> to the fact that they are connected to a marker because it would then
> become impossible to unload them due to the fact that unregistration is
> done in module exit().

Hi Mathieu,

Still confused, sorry. Why don't you don't do a synchronize_sched() at
the end of your module's exit routine? "You must be completely finished by
the time ->exit() returns" is the rule so far...

Rusty.

2008-07-30 04:05:31

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH] Module : call synchronize_sched() between module exit() and free.

* Rusty Russell ([email protected]) wrote:
> On Wednesday 30 July 2008 12:27:51 Mathieu Desnoyers wrote:
> > > Hi Mathieu,
> > >
> > > Yes: stop_machine is merely used to atomically check the module
> > > refcount for zero and set the state so it can't be incremented again (ie.
> > > try_module_get will fail).
> > >
> > > So placing a tracepoint or marker in a module does not bump the module
> > > refcount? If that's true, then there needs to be some kind of
> > > remove_markers_from_module() call after module->exit(), which should do
> > > the synchronize_sched() or whatever, right?
> > >
> > > Rusty.
> >
> > Actually, it's not placing a marker/tracepoint in a module which causes
> > a problem, this is a simple function call after all, and correctly dealt
> > with by current module.c code.
> >
> > The problem comes from a probe function (the callback) that would be
> > registered to be called from a marker and would sit in an unloadable
> > kernel module. I would not want to tie the refcount of the probe modules
> > to the fact that they are connected to a marker because it would then
> > become impossible to unload them due to the fact that unregistration is
> > done in module exit().
>
> Hi Mathieu,
>
> Still confused, sorry. Why don't you don't do a synchronize_sched() at
> the end of your module's exit routine? "You must be completely finished by
> the time ->exit() returns" is the rule so far...
>

Hi Rusty,

Yes, this is I think the right solution. Sorry for the title of the
previous email : I started writing a patch and changed my mind half-way.
Doing a synchronize_sched() at the end of the exit() function seems like
the right thing to do. I'll have to update the documentation and
examples accordingly.

Thanks,

Mathieu

> Rusty.

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-07-30 11:42:17

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: [PATCH] Module : call synchronize_sched() between module exit() and free.

Hi -

On Tue, Jul 29, 2008 at 10:27:51PM -0400, Mathieu Desnoyers wrote:
> [...]
> Actually, it's not placing a marker/tracepoint in a module which causes
> a problem, this is a simple function call after all, and correctly dealt
> with by current module.c code.
> [...]

Just to spell it out, it is this scenario I'd like to see documented:

module-foo.c:
foo() { ... trace_mark (foo, "..."); ... }

module-bar.c:
setup() { ... marker_probe_register ("foo" , ..., &foo_handler ); }
teardown() { ... marker_probe_unregister ("foo" , ..., &foo_handler ); }
foo_handler() { }

1) module-foo loads
2) module-bar loads
3) module-bar.c:setup()
4) module-foo unloads

What happens here? Certainly no more calls to foo_handler, but is
that all? (Would it not be desirable for an active marker to cause
module-foo's refcount to increase, so as to prevent unloading at this
time?)

5) module-bar.c:teardown()

Can this teardown code succeed fully even if module-foo is already
dead and gone?


- FChE

2008-07-30 14:10:26

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH] Module : call synchronize_sched() between module exit() and free.

* Frank Ch. Eigler ([email protected]) wrote:
> Hi -
>
> On Tue, Jul 29, 2008 at 10:27:51PM -0400, Mathieu Desnoyers wrote:
> > [...]
> > Actually, it's not placing a marker/tracepoint in a module which causes
> > a problem, this is a simple function call after all, and correctly dealt
> > with by current module.c code.
> > [...]
>
> Just to spell it out, it is this scenario I'd like to see documented:
>
> module-foo.c:
> foo() { ... trace_mark (foo, "..."); ... }
>
> module-bar.c:
> setup() { ... marker_probe_register ("foo" , ..., &foo_handler ); }
> teardown() { ... marker_probe_unregister ("foo" , ..., &foo_handler ); }
> foo_handler() { }
>
> 1) module-foo loads
> 2) module-bar loads
> 3) module-bar.c:setup()
> 4) module-foo unloads
>
> What happens here? Certainly no more calls to foo_handler, but is
> that all?

Yep, that's it. However a hash table still keeps track of the "foo"
handler so that it's reconnected whenever module-foo is reloaded.

>(Would it not be desirable for an active marker to cause
> module-foo's refcount to increase, so as to prevent unloading at this
> time?)

No, because I want to be able to unload the marked module and I don't
want the fact that a probe is connected to it to change that.

>
> 5) module-bar.c:teardown()
>
> Can this teardown code succeed fully even if module-foo is already
> dead and gone?
>

Yes.

Actually there is a detail missing here. Your teardown should be :
teardown() { ... marker_probe_unregister ("foo" , ..., &foo_handler );
synchronize_sched(); /* Before returning from exit */ }

This makes sure that every live marker call are finished and that it is
safe to unload module-bar (the probe).

Mathieu
>
> - FChE

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

2008-07-31 00:54:49

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] Module : call synchronize_sched() between module exit() and free.

On Thursday 31 July 2008 00:09:38 Mathieu Desnoyers wrote:
> * Frank Ch. Eigler ([email protected]) wrote:
> >(Would it not be desirable for an active marker to cause
> > module-foo's refcount to increase, so as to prevent unloading at this
> > time?)
>
> No, because I want to be able to unload the marked module and I don't
> want the fact that a probe is connected to it to change that.

Also you might want to put a marker in the module's exit code.

> Actually there is a detail missing here. Your teardown should be :
> teardown() { ... marker_probe_unregister ("foo" , ..., &foo_handler );
> synchronize_sched(); /* Before returning from exit */ }
>
> This makes sure that every live marker call are finished and that it is
> safe to unload module-bar (the probe).

Perhaps create a synchronize_marker_unregister() wrapper for such uses; it's
better documentation and the caller doesn't have to know exactly how it
works...

Cheers,
Rusty.