Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
comments on points raised.

The suggestion was made that kernel module removal be deprecated or
removed. I'd like to note that there are two common uses for this
capability, and the problems addressed by module removal should be kept in
mind. These are in addition to the PCMCIA issue raised.
1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
better (or usefully at all) with one or the other driver used to read CDs.
I've noted that several programs for reading the image from an ISO CD
(readcd and sdd) have end of data problems with the SCSI interface.
2 - restarting NICs when total reinitialization is needed. In server
applications it's sometimes necessary to move or clear a NIC connection,
force renegotiation because the blade on the switch was set wrong, etc.
It's preferable to take down one NIC for a moment than suffer a full
outage via reboot.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
On Mon, 1 Jul 2002, Bill Davidsen wrote:
> Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
> comments on points raised.
>
> The suggestion was made that kernel module removal be deprecated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
>
One of the best features of Linux is the ability to install and remove
modules. With this capability, designing drivers in Linux is much easier
than on, for instance, Sun or NT. With Linux, one can make and test
individual portions of drivers much like writing and testing individual
procedures for user-mode code. As long as the programmer doesn't do
something that destroys the kernel, one can remove, code more, install,
then continue until done.
If the ability to remove modules is eliminated, we are degrading
the kernel. I would have to write code that assumes that everything
works, i.e., a complete module, then throw it off the cliff to see if
it flies. I know this is the way many of the recent grads do it, but
they get promoted to management long before their code is tested and
I end up having to fix their stuff. So please don't eliminate module-
removal capability.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Windows-2000/Professional isn't.
On Monday, 01 July 2002, at 13:48:55 -0400,
Bill Davidsen wrote:
> The suggestion was made that kernel module removal be deprecated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
>
From my non-kernel, non-programmer point of view, module removal can be
useful under more circumstances than the ones you mentioned. For example,
trying some combination of parameters for a module to get your damned
piece of hardware working, and having to reboot each time you want to
pass it a different set of parameters, doesn't seem reasonable to me.
Examples such as network cards (as Bill said), your nice TV capture
card, or just setting a "debug=1" for a module that seems not to be
working OK.
Unless there were a way to pass parameters to modules once loaded, and
have them "reconfigure" themselves for the new parameters.
Just the opinion of a Linux user :)
--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Woody (Linux 2.4.19-pre6aa1)
On 07/01, Jose Luis Domingo Lopez said something like:
> On Monday, 01 July 2002, at 13:48:55 -0400,
> Bill Davidsen wrote:
>
> > The suggestion was made that kernel module removal be deprecated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be kept in
> > mind. These are in addition to the PCMCIA issue raised.
> >
> From my non-kernel, non-programmer point of view, module removal can be
> useful under more circumstances than the ones you mentioned. For example,
> trying some combination of parameters for a module to get your damned
How about simply upgrading a driver w/out rebooting?
--
Shawn Leas
[email protected]
I planted some bird seed. A bird came up. Now I don't know
what to feed it.
-- Stephen Wright
On Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
Bill Davidsen <[email protected]> wrote:
> Having read some notes on the Ottawa Kernel Summit, I'd like to offer
> some comments on points raised.
>
> The suggestion was made that kernel module removal be deprecated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be
> kept in mind. These are in addition to the PCMCIA issue raised.
And why do people want to remove this nice feature? Only because they
don't use it, or is there a more profund reason?
On Mon, 1 Jul 2002 21:57:18 +0200
Diego Calleja <[email protected]> wrote:
> On Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
> Bill Davidsen <[email protected]> wrote:
>
> > Having read some notes on the Ottawa Kernel Summit, I'd like to
> > offer some comments on points raised.
> >
> > The suggestion was made that kernel module removal be deprecated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be
> > kept in mind. These are in addition to the PCMCIA issue raised.
>
> And why do people want to remove this nice feature? Only because they
> don't use it, or is there a more profund reason?
profund -> deep (sorry, not everyone in the world speaks English ;) )
On Monday, 01 July 2002, at 13:45:28 -0500,
Diego Calleja wrote:
> And why do people want to remove this nice feature? Only because they
> don't use it, or is there a more profund reason?
>
This was discussed in the recent Linux Kernel Summit. Read:
http://lwn.net/Articles/3327/
for more information on the subject.
--
José Luis Domingo López
Linux Registered User #189436 Debian Linux Woody (Linux 2.4.19-pre6aa1)
On Mon, 1 Jul 2002, Diego Calleja wrote:
> On Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
> Bill Davidsen <[email protected]> wrote:
>
> > Having read some notes on the Ottawa Kernel Summit, I'd like to offer
> > some comments on points raised.
> >
> > The suggestion was made that kernel module removal be deprecated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be
> > kept in mind. These are in addition to the PCMCIA issue raised.
>
> And why do people want to remove this nice feature? Only because they
> don't use it, or is there a more profund reason?
Summary from listening to the mp3: It has issues with races that were
only made worse by preempt.
The alternative solution proposed (I forget who mentioned this, though)
was to allow loading a second module on top of the first.
Those who spoke at the summit will probably want to have the "300 email flame war"
here before this spirals too far. :)
--
Ryan Anderson
sometimes Pug Majere
Hi,
> The suggestion was made that kernel module removal be deprecated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
> 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> better (or usefully at all) with one or the other driver used to read CDs.
The proposal was to deprecate module removal, *not* to deprecate
dynamic registration of devices. If you want to maintain the above
functionality, then there's no reason why a device can't be
deregistered from one driver and reregistered on another while both
drivers are present. Note that the scsi stack already allows you to
dynamically register and deregister specific targets on the fly.
> 2 - restarting NICs when total reinitialization is needed. In server
> applications it's sometimes necessary to move or clear a NIC connection,
> force renegotiation because the blade on the switch was set wrong, etc.
> It's preferable to take down one NIC for a moment than suffer a full
> outage via reboot.
Again, you might want to do this even with a non-modular driver, or if
you had one module driving two separate NICs --- the shutdown of one
card shouldn't necessarily require the removal of the module code from
the kernel, which is all Rusty was talking about doing.
Cheers,
Stephen
On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:
> Hi,
>
> > The suggestion was made that kernel module removal be deprecated or
> > removed. I'd like to note that there are two common uses for this
> > capability, and the problems addressed by module removal should be kept in
> > mind. These are in addition to the PCMCIA issue raised.
>
> > 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> > better (or usefully at all) with one or the other driver used to read CDs.
>
> The proposal was to deprecate module removal, *not* to deprecate
> dynamic registration of devices. If you want to maintain the above
> functionality, then there's no reason why a device can't be
> deregistered from one driver and reregistered on another while both
> drivers are present. Note that the scsi stack already allows you to
> dynamically register and deregister specific targets on the fly.
As I am led to understand from reading this thread, there is some
known bug caused by module removal. Therefore the "solution" is to
remove module removal capability.
This is absurd. Next, somebody will remove network capability because
there's some bug in it. Hello there........? Are there any carbon-
based life-forms out there?
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Windows-2000/Professional isn't.
On Mon, 1 Jul 2002, Bill Davidsen wrote:
> The suggestion was made that kernel module removal be deprecated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
I would add another one: switching hardware in a laptop.
I have a laptop in which the cdrom and the floppy drives both share the
same bay, ie, I can put either the floppy or the cd in the bay but not
both simultaneously.
To be able to use the cdrom after I have booted with the floppy drive
installed (and used it), I have to remove the floppy module first.
Otherwise the cdrom modules cannot be loaded...
Not having the rmmod possibility would be a big step backwards.
Cheers,
Ariel Garcia
On Tue, Jul 02, 2002 at 08:04:14AM -0400, Richard B. Johnson wrote:
> As I am led to understand from reading this thread, there is some
> known bug caused by module removal. Therefore the "solution" is to
> remove module removal capability.
>
> This is absurd. Next, somebody will remove network capability because
> there's some bug in it. Hello there........? Are there any carbon-
> based life-forms out there?
I may be misunderstanding things, but I don't think anyone is proposing
removing the module unloading capability. I think what people are
saying is that the bugs are hard to trigger and fixing them introduces
performance problems, so it's better to tell people not to unload modules.
I don't think anyone is proposing preventing you from ignoring this advice.
I don't know this for sure, so I would love someone to make a definitive
comment.
Thanks,
Jim
On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:
> > 2 - restarting NICs when total reinitialization is needed. In server
> > applications it's sometimes necessary to move or clear a NIC connection,
> > force renegotiation because the blade on the switch was set wrong, etc.
> > It's preferable to take down one NIC for a moment than suffer a full
> > outage via reboot.
>
> Again, you might want to do this even with a non-modular driver, or if
> you had one module driving two separate NICs --- the shutdown of one
> card shouldn't necessarily require the removal of the module code from
> the kernel, which is all Rusty was talking about doing.
Then you need a new ioctl to get the driver to go through the
initialization all over again, because the only time the cards are fully
probed is when the module is loaded. You can ifconfig up and down all day
and nothing improves. Oh, and a reload resets the counters in the driver,
which is sometimes very desirable as well.
Not that it can't be done, just that it works now, and reinventing modules
without removal seems like a lot of work when we could instead just live
with a few broken modules which don't remove cleanly.
Also, as someone mentioned, it means a reboot every time you need to try
something new while doing module development. That doesn't sound like a
great idea...
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
> Also, as someone mentioned, it means a reboot every time you need to try
> something new while doing module development. That doesn't sound like a
> great idea...
This is a misconception I've seen go by a few times now; maybe I wrote up
that session badly. Anyway...
...the module remove option, under the proposal, would not disappear
entirely. It would just no longer be represented as safe. Removable
modules would be a "kernel hacking" option, still available to people
developing drivers and such. Aunt Tillie would no longer be able to remove
modules from her kernel, but that's not likely to bother her too much...
jon
Jonathan Corbet
Executive editor, LWN.net
[email protected]
Bill Davidsen wrote:
> On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:
> > Again, you might want to do this even with a non-modular driver, or if
> > you had one module driving two separate NICs --- the shutdown of one
> > card shouldn't necessarily require the removal of the module code from
> > the kernel, which is all Rusty was talking about doing.
[...]
> Also, as someone mentioned, it means a reboot every time you need to try
> something new while doing module development. That doesn't sound like a
> great idea...
The key phrase is "removal of the module code".
====
The proposal was to leave the code in the kernel, but to drop all
references such that it would not interfere with new versions of
the same module. The issue is strictly making sure *something* is
there, if an after-removal reference happens.
As I wrote in the other thread on this topic, it seems that only
the "return" case is truly module-specific. Since that one could
probably be fixed by other means, I don't quite see what not
freeing the memory area occupied by a module would really buy.
On the other hand, if there are cases where other after-removal
references can happen, this would also break other areas of the
kernel, and should be fixed no matter what happens with modules.
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://icapeople.epfl.ch/almesber/_____________________________________/
> developing drivers and such. Aunt Tillie would no longer be able to
> remove modules from her kernel, but that's not likely to bother her too
> much...
It would very much bother uncle John, who is in high availability.
Regards
Oliver
On Tue, Jul 02, 2002 at 06:07:06PM +0200, Oliver Neukum wrote:
> > developing drivers and such. Aunt Tillie would no longer be able to
> > remove modules from her kernel, but that's not likely to bother her too
> > much...
>
> It would very much bother uncle John, who is in high availability.
Then the HA kernel turns on the ability to still remove modules, along
with all of the other things needed in an HA environment but not
elsewhere. Provided removing a module doesn't become a horribly racy,
barely usable bit of functionality, which I hope it won't.
--
Tom Rini (TR1265)
http://gate.crashing.org/~trini/
On Tuesday, 2 July 2002 19:48, Tom Rini wrote:
> On Tue, Jul 02, 2002 at 06:07:06PM +0200, Oliver Neukum wrote:
> > > developing drivers and such. Aunt Tillie would no longer be able to
> > > remove modules from her kernel, but that's not likely to bother her
> > > too much...
> >
> > It would very much bother uncle John, who is in high availability.
>
> Then the HA kernel turns on the ability to still remove modules, along
> with all of the other things needed in an HA environment but not
> elsewhere. Provided removing a module doesn't become a horribly racy,
> barely usable bit of functionality, which I hope it won't.
Either there is a race or there isn't. Such a thing is unusable on a
production system.
Regards
Oliver
On Tue, 2 Jul 2002, Oliver Neukum wrote:
> On Tuesday, 2 July 2002 19:48, Tom Rini wrote:
> > On Tue, Jul 02, 2002 at 06:07:06PM +0200, Oliver Neukum wrote:
> > > > developing drivers and such. Aunt Tillie would no longer be able to
> > > > remove modules from her kernel, but that's not likely to bother her
> > > > too much...
> > >
> > > It would very much bother uncle John, who is in high availability.
> >
> > Then the HA kernel turns on the ability to still remove modules, along
> > with all of the other things needed in an HA environment but not
> > elsewhere. Provided removing a module doesn't become a horribly racy,
> > barely usable bit of functionality, which I hope it won't.
>
> Either there is a race or there isn't. Such a thing is unusable on a
> production system.
In a single processor, no preempt kernel, there is no race.
Turn on SMP or preempt and there is one.
Anyway, on an HA machine, how the heck are you going to be removing
modules? If one of your two identical network cards fails, you
lose your HA status if you need to shut down both to remove the module
and thus be able to remove the second card, right?
I would think the HA guys would be pushing for a status of "modules are
never removed, so design around that" to make their lives easier. That
would also mean you would have a way to say, "hey driver, rescan for
devices - I think you'll find a new one that you should manage".
--
Ryan Anderson
sometimes Pug Majere
On Mon, Jul 01, 2002 at 01:48:55PM -0400, Bill Davidsen wrote:
> Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
> comments on points raised.
>
> The suggestion was made that kernel module removal be deprecated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
>
> 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> better (or usefully at all) with one or the other driver used to read CDs.
> I've noted that several programs for reading the image from an ISO CD
> (readcd and sdd) have end of data problems with the SCSI interface.
This will go away once the IDE rewrite is finished. There will be only
one way to access an IDE CD. And it'll be fixed so that all programs
work with it. Or the programs get fixed.
> 2 - restarting NICs when total reinitialization is needed. In server
> applications it's sometimes necessary to move or clear a NIC connection,
> force renegotiation because the blade on the switch was set wrong, etc.
> It's preferable to take down one NIC for a moment than suffer a full
> outage via reboot.
This can be solved without module unload as well. I think something
that'll fulfill the reset functionality will grow out of the hotplug/pm
system once it's fully implemented.
--
Vojtech Pavlik
SuSE Labs
Hi!
Okay. So we want modules and we want to unload them. And we want it bug-free.

So... then it's okay if module unload is *slow*, right?

I believe you can just freeze_processes(), unload the module [now it's
safe, you *know* no one is using that module, because all processes are
in your refrigerator], thaw_processes().

That's going to take a *lot* of time, but it should be very simple and very
effective.
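
Very roughly, something like the sketch below: it assumes the swsusp-style
freeze_processes()/thaw_processes() helpers and the 2.4 uc.usecount field,
and the return convention of freeze_processes() and the free_module()
helper are assumptions, not existing interfaces. Illustration only, not
tested.

#include <linux/errno.h>
#include <linux/module.h>
#include <linux/sched.h>

extern int freeze_processes(void);      /* assumed: nonzero if someone refused */
extern void thaw_processes(void);
extern void free_module(struct module *mod);    /* assumed helper */

/* Sketch only: wrap the final unload in a global process freeze so that
 * no other process can be executing module code while it is torn down. */
int slow_but_safe_unload(struct module *mod)
{
        int ret = 0;

        if (freeze_processes())         /* everyone else is in the refrigerator */
                return -EAGAIN;

        if (atomic_read(&mod->uc.usecount) != 0) {
                ret = -EBUSY;           /* still in use, give up */
        } else {
                if (mod->cleanup)
                        mod->cleanup(); /* the module's cleanup_module() */
                free_module(mod);       /* free the text and data */
        }

        thaw_processes();
        return ret;
}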
Pavel
--
Worst form of spam? Adding advertisement signatures à la sourceforge.net.
What goes next? Inserting advertisements *into* email?
On Wed, 3 Jul 2002, Pavel Machek wrote:
> Hi!
>
> Okay. So we want modules and we want to unload them. And we want it bug-free.
>
> So... then it's okay if module unload is *slow*, right?
>
> I believe you can just freeze_processes(), unload the module [now it's
> safe, you *know* no one is using that module, because all processes are
> in your refrigerator], thaw_processes().
>
> That's going to take a *lot* of time, but it should be very simple and very
> effective.
>
Absolutely. Nobody cares if it takes many milliseconds to unload a
module. We just don't want to take the 30 or more seconds necessary to
reboot the machine (3 minutes on some embedded '486es with network boots).
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Windows-2000/Professional isn't.
On Tue, 2 Jul 2002 17:50:19 -0400 (EDT)
Ryan Anderson <[email protected]> wrote:
> In a single processor, no preempt kernel, there is no race.
> Turn on SMP or preempt and there is one.
So we _can't_ talk about removing module removal, only about _disabling_
module removal in the worst case.

Because if the above is correct, single-processor kernels without preempt
work fine and can use module removal safely...
On Wednesday 03 July 2002 05:48, Pavel Machek wrote:
> Hi!
>
> Okay. So we want modules and we want to unload them. And we want it bug-free.
>
> So... then it's okay if module unload is *slow*, right?
>
> I believe you can just freeze_processes(), unload the module [now it's
> safe, you *know* no one is using that module, because all processes are
> in your refrigerator], thaw_processes().
>
> That's going to take a *lot* of time, but it should be very simple and very
> effective.
Hi Pavel,
Is it just the mod_dec_use_count; return/unload race we're worried about?
I'm not clear on why this is hard. I'd think it would be sufficient just to
walk all runnable processes to ensure none has an execution address inside the
module. For smp, an ipi would pick up the current process on each cpu.
At this point the use count must be zero and the module deregistered, so all
we're interested in is that every process that dec'ed the module's use count
has succeeded in executing its way out of the module. If not, we try again
later, or if we're impatient, also bump any processes still inside the module
to the front of the run queue.
I'm sure I must have missed something.
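
Concretely, the check I have in mind is something like the sketch below.
task_eip() is an invented accessor for wherever the saved instruction
pointer of a not-currently-running task lives; finding that reliably is,
of course, part of the problem. Sketch only, not tested.

#include <linux/module.h>
#include <linux/sched.h>

extern unsigned long task_eip(struct task_struct *p);   /* invented */

/* Is any task's last known instruction pointer inside the module image?
 * For the tasks currently running on other cpus, the value would have to
 * be captured by the IPI mentioned above. */
int anyone_inside(struct module *mod)
{
        struct task_struct *p;
        unsigned long start = (unsigned long) mod;
        unsigned long end = start + mod->size;
        int busy = 0;

        read_lock(&tasklist_lock);
        for_each_task(p) {
                unsigned long eip = task_eip(p);
                if (eip >= start && eip < end) {
                        busy = 1;
                        break;
                }
        }
        read_unlock(&tasklist_lock);
        return busy;
}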
--
Daniel
On Thu, 4 Jul 2002 00:26:36 +0200,
Diego Calleja <[email protected]> wrote:
>On Tue, 2 Jul 2002 17:50:19 -0400 (EDT)
>Ryan Anderson <[email protected]> wrote:
>
>> In a single processor, no preempt kernel, there is no race.
>> Turn on SMP or preempt and there is one.
>
>So we _can't_ talk about removing module removal, only about _disabling_
>module removal in the worst case.
>
>Because if the above is correct, single-processor kernels without preempt
>work fine and can use module removal safely...
Module removal is not safe even on UP without preempt. UP with no
preempt only removes this race:

    Read usecount
    Enter module
    Increment usecount
    Check usecount == 0
    Clean up module

There are other race conditions on module unload, which is why the
problem is so "interesting".
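
To make that window concrete, here is a stripped-down sketch of the
traditional 2.4 pattern; the xxx_* types are invented for illustration,
and the comments mark where rmmod (on another cpu, or via preempt) can
slip in.

#include <linux/errno.h>
#include <linux/module.h>

/* Placeholder types, for illustration only */
struct xxx_device;
struct xxx_ops {
        int (*open)(struct xxx_device *dev);
        struct module *owner;
};
struct xxx_device {
        struct xxx_ops *ops;
};

/* Caller side, compiled into the core kernel or another module: */
int call_into_module(struct xxx_device *dev)
{
        if (dev->ops && dev->ops->open)
                /* <-- rmmod can run here: the module's count is still 0 */
                return dev->ops->open(dev);
        return -ENODEV;
}

/* Module side: the traditional 2.4 pattern */
static int xxx_open(struct xxx_device *dev)
{
        MOD_INC_USE_COUNT;      /* too late: the text being executed may
                                 * already be gone if rmmod won the race */
        /* ... do the real work ... */
        MOD_DEC_USE_COUNT;
        return 0;               /* <-- and the symmetric race on the way out:
                                 * the count hits 0 before this ret executes */
}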
In article <[email protected]>,
you write:
>
>Is it just the mod_dec_use_count; return/unload race we're worried about?
>
>I'm not clear on why this is hard. I'd think it would be sufficient just to
>walk all runnable processes to ensure none has an execution address inside
>the module. For smp, an ipi would pick up the current process on each cpu.
>
>At this point the use count must be zero and the module deregistered, so all
>we're interested in is that every process that dec'ed the module's use count
>has succeeded in executing its way out of the module. If not, we try again
>later, or if we're impatient, also bump any processes still inside the module
>to the front of the run queue.
>
>I'm sure I must have missed something.
>
Identifying the valid execution address in a stack traceback isn't
possible on many architectures (in particular, i386 without debugging
support enabled) so this wouldn't work if one of the functions had
called into a function outside the module.
On the other hand, if it was made a rule that any code in a module that
might theoretically be running without reference counts wasn't allowed
to call out of the module (or maybe was allowed to call out to a very
specific small set of library functions that were known to the module
system), then something like this might work.
Paul
On Thursday 04 July 2002 02:29, [email protected] wrote:
> In article <[email protected]>,
> you write:
> >
> >Is it just the mod_dec_use_count; return/unload race we're worried about?
> >I'm not clear on why this is hard. I'd think it would be sufficient just to
> >walk all runnable processes to ensure none has an execution address inside
> >the module. For smp, an ipi would pick up the current process on each cpu.
> >
> >At this point the use count must be zero and the module deregistered, so
> >all we're interested in is that every process that dec'ed the module's use
> >count has succeeded in executing its way out of the module. If not, we try
> >again later, or if we're impatient, also bump any processes still inside the
> >module to the front of the run queue.
> >
> >I'm sure I must have missed something.
>
> Identifying the valid execution address in a stack traceback isn't
> possible on many architectures (in particular, i386 without debugging
> support enabled) so this wouldn't work if one of the functions had
> called into a function outside the module.
I'm only thinking about the current execution address, and am only
concerned with the handful of instructions between the dec and the
ret. But I did miss something all right: I'd assumed we somehow
magically wrap all exported functions in inc/dec, which on closer
inspection, isn't the case.
> On the other hand, if it was made a rule that any code in a module that
> might theoretically be running without reference counts wasn't allowed
> to call out of the module (or maybe was allowed to call out to a very
> specific small set of library functions that were known to the module
> system), then something like this might work.
That seems like a reasonable rule to make. This, along with the
execution address inspection, would give us a way of declaring
certain modules 'remove clean'. Module removal is just too nice a
feature to give up on so easily.
What else did I miss?
--
Daniel
On Wed, 3 Jul 2002 05:48:09 +0200,
Pavel Machek <[email protected]> wrote:
>Okay. So we want modules and we want to unload them. And we want it bug-free.
>
>So... then it's okay if module unload is *slow*, right?
>
>I believe you can just freeze_processes(), unload the module [now it's
>safe, you *know* no one is using that module, because all processes are
>in your refrigerator], thaw_processes().
The devil is in the details.
You must not freeze the process doing rmmod, that way lies deadlock.
Modules can run their own kernel threads. When the module shuts down
it terminates the threads but we must wait until the process entries
for the threads have been reaped. If you are not careful, the zombie
clean up code can refer to the module that no longer exists. You must
not freeze any threads that belong to the module.
You must not freeze any process that has entered the module but not yet
incremented the use count, nor any process that has decremented the use
count but not yet left the module. Simply looking at the EIP after
freeze is not enough. Module code with a use count of 0 is allowed to
call any function as long as that function does not sleep. That rule
used to be safe, but adding preempt has turned that safe rule into a
race, freezing processes has the same effect as preempt.
Using freeze or any other quiesce style operation requires that the
module clean up be split into two parts. The logic must be :-
Check usecount == 0

Call module unregister routine. Unlike the existing clean up code,
this only removes externally visible interfaces, it does not delete
module structures.

<handwaving>
  Outside the module, do whatever it takes to ensure that nothing is
  executing any module code, including threads, command callbacks etc.
</handwaving>

Check the usecount again.

If usecount is non-zero then some other code entered the module after
checking the usecount the first time and before unregister completed.
Either mark the module for delayed delete or reactivate the module by
calling the module's register routine.

If usecount is still 0 after the handwaving, then it is safe to call
the final module clean up routine to destroy its structures, release
hardware etc. Then (and only then) is it safe to free the module.

Rusty and I agree that if (and it's a big if) we want to support module
unloading safely then this is the only sane way to do it. It requires
some moderately complex handwaving code, changes to every module (split
init and cleanup in two) and a new version of modutils in order to do
this method. Because of the high cost, Rusty is exploring other
options before diving into a kernel wide change.
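
As a sketch only, the two-phase sys_delete_module would look roughly like
the code below. None of these hooks exist today: unregister()/reregister()
and the second-stage cleanup() are the proposed split of the current
cleanup_module(), and quiesce_module_users() and free_module() stand in
for the handwaving and the final free.

#include <linux/errno.h>
#include <linux/module.h>

extern void quiesce_module_users(struct module *mod);   /* the <handwaving> */
extern void free_module(struct module *mod);            /* assumed helper */

int two_phase_delete_module(struct module *mod,
                            void (*unregister)(void),
                            void (*reregister)(void),
                            void (*cleanup)(void))
{
        if (atomic_read(&mod->uc.usecount) != 0)
                return -EBUSY;

        unregister();                   /* phase 1: hide all externally
                                         * visible interfaces, keep the
                                         * module structures intact */

        quiesce_module_users(mod);      /* make sure nothing is still
                                         * executing module code (threads,
                                         * callbacks, ...) */

        if (atomic_read(&mod->uc.usecount) != 0) {
                /* someone slipped in between the first check and the
                 * unregister: back out (or mark for delayed delete) */
                reregister();
                return -EAGAIN;
        }

        cleanup();                      /* phase 2: destroy structures,
                                         * release hardware */
        free_module(mod);               /* only now is freeing the text safe */
        return 0;
}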
Keith Owens wrote:
>
> ...
> Rusty and I agree that if (and it's a big if) we want to support module
> unloading safely then this is the only sane way to do it.
Dumb question: what's wrong with the current code as-is? I don't think
I've ever seen a module removal bug report which wasn't attributable to
some straightforward driver-failed-to-clean-something-up bug.
What problem are you trying to solve here?
Keith Owens wrote:
> On Wed, 3 Jul 2002 05:48:09 +0200,
> Pavel Machek <[email protected]> wrote:
>
>>Okay. So we want modules and we want to unload them. And we want it bug-free.
>>
>>So... then it's okay if module unload is *slow*, right?
>>
>>I believe you can just freeze_processes(), unload the module [now it's
>>safe, you *know* no one is using that module, because all processes are
>>in your refrigerator], thaw_processes().
>
>
> The devil is in the details.
>
> You must not freeze the process doing rmmod, that way lies deadlock.
>
> Modules can run their own kernel threads. When the module shuts down
> it terminates the threads but we must wait until the process entries
> for the threads have been reaped. If you are not careful, the zombie
> clean up code can refer to the module that no longer exists. You must
> not freeze any threads that belong to the module.
>
> You must not freeze any process that has entered the module but not yet
> incremented the use count, nor any process that has decremented the use
> count but not yet left the module. Simply looking at the EIP after
> freeze is not enough. Module code with a use count of 0 is allowed to
> call any function as long as that function does not sleep. That rule
> used to be safe, but adding preempt has turned that safe rule into a
> race, freezing processes has the same effect as preempt.
>
> Using freeze or any other quiesce style operation requires that the
> module clean up be split into two parts. The logic must be :-
>
> Check usecount == 0
>
> Call module unregister routine. Unlike the existing clean up code,
> this only removes externally visible interfaces, it does not delete
> module structures.
>
> <handwaving>
> Outside the module, do whatever it takes to ensure that nothing is
> executing any module code, including threads, command callbacks etc.
> </handwaving>
>
> Check the usecount again.
>
> If usecount is non-zero then some other code entered the module after
> checking the usecount the first time and before unregister completed.
> Either mark the module for delayed delete or reactivate the module by
> calling the module's register routine.
>
> If usecount is still 0 after the handwaving, then it is safe to call
> the final module clean up routine to destroy its structures, release
> hardware etc. Then (and only then) is it safe to free the module.
>
>
> Rusty and I agree that if (and it's a big if) we want to support module
> unloading safely then this is the only sane way to do it. It requires
> some moderately complex handwaving code, changes to every module (split
> init and cleanup in two) and a new version of modutils in order to do
> this method. Because of the high cost, Rusty is exploring other
> options before diving into a kernel wide change.
Why not treat a module just like any other structure? Obtain a
reference to it _before_ using it. I propose this change:
- When a module is inserted, it has a reference count of 1. This keeps
it loaded.
- get_module() obtains a reference to a module. It cannot fail.
- put_module() releases a reference to a module. If it goes to zero,
the exit callback is called and then the module is safely freeable.
- sys_delete_module calls the deregister callback. When this function
returns, it is guaranteed that the reference count will not increase
again. It then does a put_module().
pseudo code here:
void get_module(struct module *mod)
{
        atomic_inc(&mod->uc.usecount);
}

void put_module(struct module *mod)
{
        if (atomic_dec_and_lock(&mod->uc.usecount, &unload_lock)) {
                if (mod->exit)
                        mod->exit();
                free_module(mod);
                spin_unlock(&unload_lock);
        }
}

sys_delete_module()
{
        mod->deregister();
        put_module(mod);
}
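
On the caller side that would look something like this (placeholder names
again; the open question is how the caller finds ops->owner safely in the
first place):

#include <linux/module.h>

/* Placeholder types; the point is only the get/put bracket around the call. */
struct xxx_ops {
        struct module *owner;
        int (*do_something)(void *dev);
};

int xxx_call(struct xxx_ops *ops, void *dev)
{
        int ret;

        get_module(ops->owner);         /* pin the module before calling in */
        ret = ops->do_something(dev);
        put_module(ops->owner);         /* may free the module if rmmod has
                                         * already run the deregister callback */
        return ret;
}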
On Wed, Jul 03, 2002 at 10:25:34PM -0400, Brian Gerst wrote:
> Keith Owens wrote:
> >On Wed, 3 Jul 2002 05:48:09 +0200,
> >Pavel Machek <[email protected]> wrote:
> >
> >>Okay. So we want modules and we want to unload them. And we want it bug-free.
> >>
> >>So... then it's okay if module unload is *slow*, right?
> >>
> >>I believe you can just freeze_processes(), unload the module [now it's
> >>safe, you *know* no one is using that module, because all processes are
> >>in your refrigerator], thaw_processes().
> >
> >
> >The devil is in the details.
> >
> >You must not freeze the process doing rmmod, that way lies deadlock.
> >
> >Modules can run their own kernel threads. When the module shuts down
> >it terminates the threads but we must wait until the process entries
> >for the threads have been reaped. If you are not careful, the zombie
> >clean up code can refer to the module that no longer exists. You must
> >not freeze any threads that belong to the module.
> >
> >You must not freeze any process that has entered the module but not yet
> >incremented the use count, nor any process that has decremented the use
> >count but not yet left the module. Simply looking at the EIP after
> >freeze is not enough. Module code with a use count of 0 is allowed to
> >call any function as long as that function does not sleep. That rule
> >used to be safe, but adding preempt has turned that safe rule into a
> >race, freezing processes has the same effect as preempt.
> >
> >Using freeze or any other quiesce style operation requires that the
> >module clean up be split into two parts. The logic must be :-
> >
> >Check usecount == 0
> >
> >Call module unregister routine. Unlike the existing clean up code,
> >this only removes externally visible interfaces, it does not delete
> >module structures.
> >
> ><handwaving>
> > Outside the module, do whatever it takes to ensure that nothing is
> > executing any module code, including threads, command callbacks etc.
> ></handwaving>
> >
> >Check the usecount again.
> >
> >If usecount is non-zero then some other code entered the module after
> >checking the usecount the first time and before unregister completed.
> >Either mark the module for delayed delete or reactivate the module by
> >calling the module's register routine.
> >
> >If usecount is still 0 after the handwaving, then it is safe to call
> >the final module clean up routine to destroy its structures, release
> >hardware etc. Then (and only then) is it safe to free the module.
> >
> >
> >Rusty and I agree that if (and it's a big if) we want to support module
> >unloading safely then this is the only sane way to do it. It requires
> >some moderately complex handwaving code, changes to every module (split
> >init and cleanup in two) and a new version of modutils in order to do
> >this method. Because of the high cost, Rusty is exploring other
> >options before diving into a kernel wide change.
>
> Why not treat a module just like any other structure? Obtain a
> reference to it _before_ using it. I propose this change:
Because in general you don't know you're going to use a module before
you use it. Using a module is (necessarily) not a narrow well-defined
interface.
--
David Gibson | For every complex problem there is a
[email protected] | solution which is simple, neat and
| wrong. -- H.L. Mencken
http://www.ozlabs.org/people/dgibson
On Wed, 03 Jul 2002 18:53:55 -0700,
Andrew Morton <[email protected]> wrote:
>Keith Owens wrote:
>> Rusty and I agree that if (and it's a big if) we want to support module
>> unloading safely then this is the only sane way to do it.
>
>Dumb question: what's wrong with the current code as-is? I don't think
>I've ever seen a module removal bug report which wasn't attributable to
>some straightforward driver-failed-to-clean-something-up bug.
>
>What problem are you trying to solve here?
Uncounted references to module code and/or data.
To be race safe, the reference count must be bumped before calling
any function that might be in a module. With the current unload
method, MOD_INC_USE_COUNT inside the module is too late.
Al Viro closed some of these races by adding the owner field and
using try_inc_mod_count. That assumes that every bit of code that
dereferences a function pointer will handle the module locking.
Code to externally bump the use count must itself be race safe.
drivers/char/busmouse.c::busmouse_open() has
    if (mse->ops->owner && !try_inc_mod_count(mse->ops->owner))
This is safe because it first does
    down(&mouse_sem)
which locks mse and its operations. At least I assume it is safe, I
hope that mse operations cannot be deregistered without mouse_sem.
Although this appears to be safe, it is not obvious that it is safe,
you have to trace the interaction between mouse_sem, the module
unload_lock, the MOD_DELETED flag for the module and the individual
module clean up routines to be sure that there are no races.
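
In skeleton form, the "safe" shape is roughly the following. The xxx_*
names are placeholders, not the real busmouse code; the point is that the
same semaphore the unregister path takes is held across both the owner
lookup and the call.

#include <linux/errno.h>
#include <linux/module.h>
#include <asm/semaphore.h>

/* Placeholder subsystem lock; the unregister path must take it too. */
static DECLARE_MUTEX(xxx_sem);

struct xxx_dev;
struct xxx_ops {
        struct module *owner;
        int (*open)(struct xxx_dev *dev);
};
struct xxx_dev {
        struct xxx_ops *ops;
};

int xxx_open(struct xxx_dev *dev)
{
        int ret = -ENODEV;

        down(&xxx_sem);                 /* dev->ops cannot change under us */
        if (!dev->ops)
                goto out;
        if (dev->ops->owner && !try_inc_mod_count(dev->ops->owner))
                goto out;               /* module is on its way out */

        ret = dev->ops->open(dev);      /* safe: module is pinned; the
                                         * matching decrement happens in
                                         * xxx_release() */
        if (ret && dev->ops->owner)
                __MOD_DEC_USE_COUNT(dev->ops->owner);   /* drop on failure */
out:
        up(&xxx_sem);
        return ret;
}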
OTOH, HiSax_mod_inc_use_count appears to be unsafe. It has no
locking of its own before calling try_inc_mod_count(). AFAICT this
race exists :-
HiSax_command() -> HiSax_mod_inc_use_count()
    mod = cs->hw.hisax_d_if->owner;
    // race here
    if (mod)
        try_inc_mod_count(mod);
Without any (obvious) locking to prevent hisax unregistration on
another cpu, mod can be deleted in the middle of that code. To add
insult to injury, HiSax_mod_inc_use_count does not check if it locked
the module or not, it blindly assumes it worked and continues to use
the module structures!
I cannot be sure if get_mtd_device() (calls try_inc_mod_count) is
safe or not. There is no locking around calls to get_mtd_device()
which makes it look unsafe. OTOH it is invoked from mtdblock_open()
via mtd_fops.open, which may mean that a higher routine is locking
the module down. But it is not obvious that the code is safe.
If get_mtd_device() is being protected by a higher routine such as
get_fops() then the module use count is already non-zero and
try_inc_mod_count will always succeed. In that case,
try_inc_mod_count is overkill, an unconditional __MOD_INC_USE_COUNT
will do the job just as well. In either case, there is a question
mark over this code.
Similar comments for net/core/dev.c::dev_open(). No obvious locks to
protect the parameters passed to try_inc_mod_count, they may or may
not be protected by a lock in a higher routine. try_inc_mod_count
may or may not be overkill here.
The main objections to the existing reference counting model are :-
* Complex and fragile. The interactions between try_inc_mod_count,
some other lock that protects the parameters to try_inc_mod_count and
the module clean up routines are not obvious. Every module writer
has to get the interaction exactly right.
* Difficult to audit. Is get_mtd_device() safe or not? If it is safe,
why is it safe and is try_inc_mod_count overkill?
* Easy to omit. The comments above only apply to the code that is
using try_inc_mod_count. How much code exists that should be doing
module locking but is not?
* All the external locking is scattered around the kernel, anywhere
that might call a function in a module.
* The external locking is extra cpu overhead (and storage, but to a
lesser extent) all the time. Just to cope with a relatively rare
unload event.
Given those objections, Rusty is looking at alternative methods,
ranging from no unload at all to an unload method that moves all the
complexity into module.c instead of spreading it around the kernel.
On Wed, 03 Jul 2002 22:25:34 -0400,
Brian Gerst <[email protected]> wrote:
>Why not treat a module just like any other structure? Obtain a
>reference to it _before_ using it.
That is what try_inc_mod_count() does. But the interface is messy and
difficult to audit. It relies on the caller taking some other lock
first to ensure that the module address will not change while you are
trying to call try_inc_mod_count.
Ryan Anderson wrote:
>
> On Tue, 2 Jul 2002, Oliver Neukum wrote:
[...]
> > Either there is a race or there isn't. Such a thing is unusable on a
> > production system.
>
> In a single processor, no preempt kernel, there is no race.
> Turn on SMP or preempt and there is one.
Seems to me that module _replacement_ (as opposed to
unloading in order to free memory) is an easier case.
Just load the new module and initialize it. If some
other preempted processor manages to race and activate
the old module - no problem, because the code isn't gone.
The old module may be unloaded once we know the new one
will get all future requests and the old one has 0 references.
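
A sketch of that sequence, with a made-up ops pointer standing in for
whatever registry the subsystem actually uses; the polling loop at the
end is just the simplest way to wait for the old references to drain:

#include <linux/module.h>
#include <linux/sched.h>
#include <linux/spinlock.h>

/* Placeholders: xxx_ops is whatever the subsystem keys its calls off,
 * xxx_lock protects it. */
struct xxx_ops {
        int (*open)(void);
};
static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED;
static struct xxx_ops *xxx_ops;

void switch_to_new_module(struct xxx_ops *new_ops, struct module *old_mod)
{
        spin_lock(&xxx_lock);
        xxx_ops = new_ops;              /* all future lookups hit the new module */
        spin_unlock(&xxx_lock);

        /* A caller that raced and already grabbed the old pointer still
         * executes valid code, because the old module's text has not been
         * freed.  Only when its use count drains to zero is it safe to
         * actually rmmod the old module. */
        while (atomic_read(&old_mod->uc.usecount) != 0) {
                set_current_state(TASK_UNINTERRUPTIBLE);
                schedule_timeout(HZ / 10);
        }
}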
Helge Hafting
At 08:04 AM 7/2/2002 -0400, Richard B. Johnson wrote:
>On Tue, 2 Jul 2002, Stephen C. Tweedie wrote:
>
> > Hi,
> >
> > > The suggestion was made that kernel module removal be deprecated or
> > > removed. I'd like to note that there are two common uses for this
> > > capability, and the problems addressed by module removal should be kept in
> > > mind. These are in addition to the PCMCIA issue raised.
> >
> > > 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> > > better (or usefully at all) with one or the other driver used to read CDs.
> >
> > The proposal was to deprecate module removal, *not* to deprecate
> > dynamic registration of devices. If you want to maintain the above
> > functionality, then there's no reason why a device can't be
> > deregistered from one driver and reregistered on another while both
> > drivers are present. Note that the scsi stack already allows you to
> > dynamically register and deregister specific targets on the fly.
>
>As I am led to understand from reading this thread, there is some
>known bug caused by module removal. Therefore the "solution" is to
>remove module removal capability.
Read the thread more carefully, and you'll understand more than "some
known bug". Heck, you may even be able to contribute to a more
palatable race solution than deprecating module unload entirely.
>This is absurd. Next, somebody will remove network capability because
>there's some bug in it. Hello there........? Are there any carbon-
>based life-forms out there?
I'm not absolutely certain that those life-forms discussing options are
carbon-based ;-) but they have demonstrated considerable knowledge
of the subject IMO.
-Mike
Keith Owens wrote:
> On Wed, 03 Jul 2002 22:25:34 -0400,
> Brian Gerst <[email protected]> wrote:
>
>>Why not treat a module just like any other structure? Obtain a
>>reference to it _before_ using it.
>
>
> That is what try_inc_mod_count() does. But the interface is messy and
> difficult to audit. It relies on the caller taking some other lock
> first to ensure that the module address will not change while you are
> trying to call try_inc_mod_count.
And that is almost always the case anyway, since most cases traverse a
linked list that must already be protected. You are trying to
overengineer a solution to a simple, but subtle, problem.
PS. If you really want to make the broken cases show themselves, poison
the module memory when it is unloaded. The same can be done for dumping
init data and text.
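
A sketch of the poisoning idea, assuming a hook near the end of
free_module(); the poison byte is arbitrary, and for debugging one might
even skip the actual free so the poisoned pages stay mapped:

#include <linux/module.h>
#include <linux/string.h>

#define MODULE_POISON 0x6b              /* arbitrary, slab-style poison byte */

/* Scribble over the whole module image (struct module sits at the front
 * of the allocation) so that any stale call or data reference blows up
 * recognisably instead of silently using old or reused memory. */
void poison_module(struct module *mod)
{
        memset(mod, MODULE_POISON, mod->size);
}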
--
Brian Gerst
Brian Gerst wrote:
> PS. If you really want to make the broken cases show themselves, poison
> the module memory when it is unloaded. The same can be done for dumping
> init data and text.
Unfortunately, many of the code paths leading to the races are
sufficiently different that it's currently unlikely to actually
hit them. (E.g. it's kind of hard for some 1000+ instructions
code path to compete successfully with a cached, pre-fetched,
and pre-decoded "ret" instruction. Entry-after-removal races
(and the related data races) may be a little bit easier to catch,
but I suppose we're still talking about a likely difference of
orders of magnitude.)
I don't think preemption changes that fundamentally.
Microthreading may, and NUMA will make execution times more
variable, increasing the risk of hitting a race.
The best way to trigger such races is probably to simulate
microthreading, e.g. with an UML-SMP where only one "CPU" can
run at a time, and where execution switching is under script
control.
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://icapeople.epfl.ch/almesber/_____________________________________/
Hi,
On Thu, Jul 04, 2002 at 01:48:59AM +0200, Daniel Phillips
<[email protected]> wrote:
> Is it just the mod_dec_use_count; return/unload race we're worried about?
> I'm not clear on why this is hard. I'd think it would be sufficient just to
> walk all runnable processes to ensure none has an execution address inside the
> module.
That fails if:

  the module function has called somewhere else in the kernel (and
  with -fomit-frame-pointer, you can't reliably walk back up the stack
  to find out if there is a stack frame higher up the stack which is in
  the module);

  the module has taken an interrupt into an unrelated driver;

  we have computed a call into the module but haven't actually executed
  the call yet;

  etc.

> For smp, an ipi would pick up the current process on each cpu.
Without freezing the other CPUs, that still leaves the race wide open.
--Stephen
Hi!
> >Okay. So we want modules and we want to unload them. And we want it bug-free.
> >
> >So... then it's okay if module unload is *slow*, right?
> >
> >I believe you can just freeze_processes(), unload the module [now it's
> >safe, you *know* no one is using that module, because all processes are
> >in your refrigerator], thaw_processes().
>
> The devil is in the details.
I don't think so.
> You must not freeze the process doing rmmod, that way lies deadlock.
That's automagic. The process doing the freeze will not freeze itself.
> Modules can run their own kernel threads. When the module shuts
> down
> it terminates the threads but we must wait until the process entries
> for the threads have been reaped. If you are not careful, the zombie
> clean up code can refer to the module that no longer exists. You must
> not freeze any threads that belong to the module.
Look at the code. The freezer will try for 5 seconds, then give up. So, in
the rare case where a module has some threads, rmmod will simply fail. I
believe we can fix the few remaining modules one by one.
> You must not freeze any process that has entered the module but not yet
> incremented the use count, nor any process that has decremented the use
> count but not yet left the module. Simply looking at the EIP after
Look how freezer works. refrigerator() is blocking, by definition. So
if all processes reach refrigerator(), and the use count == 0, it is
indeed safe to unload.
> freeze is not enough. Module code with a use count of 0 is allowed to
> call any function as long as that function does not sleep. That
> rule
And that's okay. If it only calls non-sleeping functions, it is not
going to call refrigerator(), and it will exit module code before
calling refrigerator().
> Using freeze or any other quiesce style operation requires that the
> module clean up be split into two parts. The logic must be :-
I still think it does not.
Pavel
--
Worst form of spam? Adding advertisement signatures à la sourceforge.net.
What goes next? Inserting advertisements *into* email?
Hi!
> I'm assuming for the sake of argument that we're requiring the use count to
> be incremented for any call outside the module, a rule we might want anyway,
> since it is less fragile than the no-sleeping rule.
>
> > the module has taken an interrupt into an unrelated driver;
>
> With Ben's new separate interrupt stacks the current IP would be available at
> a known place at the base of the interrupt stack.
>
> > we have computed a call into the module but haven't actually executed
> > the call yet;
>
> This one is problematic, and yes, I now agree the problem is hard. This is
> where Keith's handwaving comes in: we are supposed to have deregistered the
> module's services and ensured all processes are out of the module at this
> point. I don't know how that helps, really. I just want to note that this
> seems to be the only really hard problem. It's not insoluble though: going
> to extremes we could record each region of code from which module calls
> originate and check for execution addresses in that region, along with
> execution addresses in the module. Picking up the call address would have to
> be an atomic read. You don't have to tell me this is ugly and slow, but it
> would work.
freeze_processes(), and now you know that rmmod is the only runnable job
on the system [approximately; you'd have to audit all threads marked
as non-stoppable]. So... if rmmod does not do any computed calls into the
module being removed, no one is doing that and all is safe.
Pavel
--
Casualties in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.
On Friday 05 July 2002 11:40, Stephen Tweedie wrote:
> Hi,
>
> On Thu, Jul 04, 2002 at 01:48:59AM +0200, Daniel Phillips
> <[email protected]> wrote:
>
> > Is it just the mod_dec_use_count; return/unload race we're worried about?
> > I'm not clear on why this is hard. I'd think it would be sufficient just
> > to walk all runnable processes to ensure none has an execution address
> > inside the module.
>
> That fails if:
>
> the module function has called somewhere else in the kernel (and
> with -fomit-frame-pointer, you can't reliably walk back up the stack
> to find out if there is a stack frame higher up the stack which is in
> the module);
Hi Stephen,
I'm assuming for the sake of argument that we're requiring the use count to
be incremented for any call outside the module, a rule we might want anyway,
since it is less fragile than the no-sleeping rule.
> the module has taken an interrupt into an unrelated driver;
With Ben's new separate interrupt stacks the current IP would be available at
a known place at the base of the interrupt stack.
> we have computed a call into the module but haven't actually executed
> the call yet;
This one is problematic, and yes, I now agree the problem is hard. This is
where Keith's handwaving comes in: we are supposed to have deregistered the
module's services and ensured all processes are out of the module at this
point. I don't know how that helps, really. I just want to note that this
seems to be the only really hard problem. It's not insoluble though: going
to extremes we could record each region of code from which module calls
originate and check for execution addresses in that region, along with
execution addresses in the module. Picking up the call address would have to
be an atomic read. You don't have to tell me this is ugly and slow, but it
would work.
> etc.
You enumerated all the areas of concern that I'd identified, so I'm curious
what the etc stands for.
> > For smp, an ipi would pick up the current process on each cpu.
>
> Without freezing the other CPUs, that still leaves the race wide open.
With per-cpu runqueues, we take an ipi onto each processor and take the
task's runqueue lock. We can now check each runnable task, and the task the
ipi interrupted. Repeating for each processor, the only door we have to
close is inter-processor task migration, which is not a fast path. So this
part, at least, seems doable.
Anyway, I'm beginning to see what all the fuss is about.
--
Daniel
On Fri, 5 Jul 2002 15:48:17 +0200,
Pavel Machek <[email protected]> wrote:
>Keith Owens wrote
>> Modules can run their own kernel threads. When the module shuts down
>> it terminates the threads but we must wait until the process entries
>> for the threads have been reaped. If you are not careful, the zombie
>> clean up code can refer to the module that no longer exists. You must
>> not freeze any threads that belong to the module.
>
>Look at the code. The freezer will try for 5 seconds, then give up. So, in
>the rare case where a module has some threads, rmmod will simply fail. I
>believe we can fix the few remaining modules one by one.
There is no failure path for rmmod. Once rmmod sees a use count of 0,
it must succeed. Which is why both Rusty and I agree that rmmod must
be split in two, one piece that is allowed to fail and a second piece
that is not.
BTW, freeze_processes() silently ignores zombie processes. The test is
not obvious, it is hidden in the INTERESTING macro which has an
embedded 'continue' to break out of the loop. Easy to miss.
>> You must not freeze any process that has entered the module but not yet
>> incremented the use count, nor any process that has decremented the use
>> count but not yet left the module. Simply looking at the EIP after
>
>Look how freezer works. refrigerator() is blocking, by definition. So
>if all processes reach refrigerator(), and the use count == 0, it is
>indeed safe to unload.
freeze_processes()
  signal_wake_up()                 - sets TIF_SIGPENDING for other task
    kick_if_running()
      resched_task()               - calls preempt_disable() for this cpu
        smp_send_reschedule()
          smp_reschedule_interrupt()  - now on another cpu
            ret_from_intr
              resume_kernel        - on other cpu
With CONFIG_PREEMPT, a process running on another cpu without a lock
when freeze_processes() is called should immediately end up in
schedule. I don't see anything in that code path that disables
preemption on other cpus. If I am right, then a second cpu could be in
this window when freeze_processes is called
        if (xxx->func)
                xxx->func()
where func is a module function. There is still a window from loading
the function address, through calling the function and up to the point
where the function does MOD_INC_USE_COUNT. Any reschedule in that
window opens a race with rmmod. Without preemption, freeze might be
safe, with preemption the race is back again.
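
Spelled out, with the windows marked (xxx is the same placeholder as
above, and the MOD_*_USE_COUNT macros are the usual 2.4 ones; this is
illustration, not code from any driver):

#include <linux/module.h>

/* Placeholder matching the snippet above */
struct xxx {
        void (*func)(void);
};

/* Caller, in the core kernel or another module: */
void call_func(struct xxx *xxx)
{
        if (xxx->func) {
                /* window A: func's address is already loaded; a preemption
                 * here, followed by freeze + rmmod, leaves the call below
                 * jumping into freed (or reused) memory */
                xxx->func();
        }
}

/* Callee, inside the module: */
static void module_func(void)
{
        /* window B: we are already executing module text, but the use
         * count is still 0, so rmmod's "usecount == 0" check passes */
        MOD_INC_USE_COUNT;
        /* ... */
        MOD_DEC_USE_COUNT;
        /* window C: the count is back to 0 before the final ret runs */
}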
Hi,
On Mon, 8 Jul 2002, Keith Owens wrote:
> With CONFIG_PREEMPT, a process running on another cpu without a lock
> when freeze_processes() is called should immediately end up in
> schedule. I don't see anything in that code path that disables
> preemption on other cpus. If I am right, then a second cpu could be in
> this window when freeze_processes is called
>
> if (xxx->func)
> xxx->func()
>
> where func is a module function. There is still a window from loading
> the function address, through calling the function and up to the point
> where the function does MOD_INC_USE_COUNT. Any reschedule in that
> window opens a race with rmmod. Without preemption, freeze might be
> safe, with preemption the race is back again.
While I agree that the current module interface is broken, the current
suggestions are IMHO not any better. They rely on specific knowledge about
the scheduler. Next time someone wants to change the scheduler or the
scheduling behaviour, he runs the risk of breaking modules (again).
All we have to do is to make sure that there is no reference to the module
left. In other words, we have to make sure there is no data structure with
a reference to the module left. Managing data structures is something we
already have to do anyway, so why don't we just use the existing interfaces?
As an example I have an alternative "fix" (that means it compiles) for the
bdev layer. First it makes (un)register_blkdev smp/preempt safe, but the
important change is in unregister_blkdev, which basically does:
for_all_bdev(bdev) {
if (bdev->bd_op == op)
return -EBUSY;
}
After that we know that no one can call into the module anymore; now we
only have to make use of the -EBUSY information somehow.
That means that the module count is not strictly necessary anymore and
structures don't have to be polluted with module owner fields.
IMO that's the far cleaner solution and doesn't require messing with the
scheduler. The module interface has to be fixed anyway and I'd prefer it
wouldn't require some scheduling magic.
bye, Roman
Index: fs/block_dev.c
===================================================================
RCS file: /usr/src/cvsroot/linux-2.5/fs/block_dev.c,v
retrieving revision 1.1.1.17
diff -u -p -r1.1.1.17 block_dev.c
--- fs/block_dev.c 6 Jul 2002 00:33:38 -0000 1.1.1.17
+++ fs/block_dev.c 7 Jul 2002 22:12:50 -0000
@@ -417,55 +417,98 @@ int get_blkdev_list(char * p)
Return the function table of a device.
Load the driver if needed.
*/
-const struct block_device_operations * get_blkfops(unsigned int major)
+const struct block_device_operations *__get_blkfops(unsigned int major)
{
const struct block_device_operations *ret = NULL;
/* major 0 is used for non-device mounts */
if (major && major < MAX_BLKDEV) {
#ifdef CONFIG_KMOD
+ spin_unlock(&bdev_lock);
if (!blkdevs[major].bdops) {
char name[20];
sprintf(name, "block-major-%d", major);
request_module(name);
}
+ spin_lock(&bdev_lock);
#endif
ret = blkdevs[major].bdops;
}
return ret;
}
+const struct block_device_operations *get_blkfops(unsigned int major)
+{
+ const struct block_device_operations *bd;
+ spin_lock(&bdev_lock);
+ bd = __get_blkfops(major);
+ spin_unlock(&bdev_lock);
+ return bd;
+}
+
int register_blkdev(unsigned int major, const char * name, struct block_device_operations *bdops)
{
+ int res = 0;
+
+ spin_lock(&bdev_lock);
if (major == 0) {
for (major = MAX_BLKDEV-1; major > 0; major--) {
if (blkdevs[major].bdops == NULL) {
blkdevs[major].name = name;
blkdevs[major].bdops = bdops;
- return major;
+ res = major;
+ goto out;
}
}
- return -EBUSY;
+ res = -EBUSY;
+ goto out;
+ }
+ if (major >= MAX_BLKDEV) {
+ res = -EINVAL;
+ goto out;
+ }
+ if (blkdevs[major].bdops && blkdevs[major].bdops != bdops) {
+ res = -EBUSY;
+ goto out;
}
- if (major >= MAX_BLKDEV)
- return -EINVAL;
- if (blkdevs[major].bdops && blkdevs[major].bdops != bdops)
- return -EBUSY;
blkdevs[major].name = name;
blkdevs[major].bdops = bdops;
- return 0;
+out:
+ spin_unlock(&bdev_lock);
+ return res;
}
int unregister_blkdev(unsigned int major, const char * name)
{
+ struct block_device *bdev;
+ struct list_head *p;
+ int i, res = 0;
+
if (major >= MAX_BLKDEV)
return -EINVAL;
- if (!blkdevs[major].bdops)
- return -EINVAL;
- if (strcmp(blkdevs[major].name, name))
- return -EINVAL;
+ spin_lock(&bdev_lock);
+ if (!blkdevs[major].bdops) {
+ res = -EINVAL;
+ goto out;
+ }
+ if (strcmp(blkdevs[major].name, name)) {
+ res = -EINVAL;
+ goto out;
+ }
+ for (i = 0; i < HASH_SIZE; i++) {
+ list_for_each(p, &bdev_hashtable[i]) {
+ bdev = list_entry(p, struct block_device, bd_hash);
+ if (bdev->bd_op == blkdevs[major].bdops) {
+ res = -EBUSY;
+ goto out;
+ }
+ }
+ }
+
blkdevs[major].name = NULL;
blkdevs[major].bdops = NULL;
+out:
+ spin_unlock(&bdev_lock);
return 0;
}
@@ -481,11 +524,15 @@ int unregister_blkdev(unsigned int major
int check_disk_change(kdev_t dev)
{
int i;
+ struct block_device *bdev = bdget(kdev_t_to_nr(dev));
const struct block_device_operations * bdops = NULL;
i = major(dev);
- if (i < MAX_BLKDEV)
- bdops = blkdevs[i].bdops;
+ spin_lock(&bdev_lock);
+ if (!bdev->bd_op && i < MAX_BLKDEV)
+ bdev->bd_op = blkdevs[i].bdops;
+ bdops = bdev->bd_op;
+ spin_unlock(&bdev_lock);
if (bdops == NULL) {
devfs_handle_t de;
@@ -496,12 +543,12 @@ int check_disk_change(kdev_t dev)
devfs_put_ops (de); /* We're running in owner module */
}
}
- if (bdops == NULL)
- return 0;
- if (bdops->check_media_change == NULL)
- return 0;
- if (!bdops->check_media_change(dev))
+ if (bdops == NULL ||
+ bdops->check_media_change == NULL ||
+ !bdops->check_media_change(dev)) {
+ bdput(bdev);
return 0;
+ }
printk(KERN_DEBUG "VFS: Disk change detected on device %s\n",
__bdevname(dev));
@@ -511,6 +558,7 @@ int check_disk_change(kdev_t dev)
if (bdops->revalidate)
bdops->revalidate(dev);
+ bdput(bdev);
return 1;
}
@@ -536,7 +584,9 @@ static int do_open(struct block_device *
down(&bdev->bd_sem);
lock_kernel();
if (!bdev->bd_op) {
- bdev->bd_op = get_blkfops(major(dev));
+ spin_lock(&bdev_lock);
+ bdev->bd_op = __get_blkfops(major(dev));
+ spin_unlock(&bdev_lock);
if (!bdev->bd_op)
goto out;
owner = bdev->bd_op->owner;
Keith Owens wrote:
> On Fri, 5 Jul 2002 15:48:17 +0200,
> Pavel Machek <[email protected]> wrote:
> >Keith Owens wrote
> >> Modules can run their own kernel threads. When the module shuts down
> >> it terminates the threads but we must wait until the process entries
> >> for the threads have been reaped. If you are not careful, the zombie
> >> clean up code can refer to the module that no longer exists. You must
> >> not freeze any threads that belong to the module.
I am just out to share a possible angle, although I'm not really a programmer.
Can one module replace another, running, twin module without having to
handshake with the kernel, and without freezing (i.e. transparently)?
If the answer is obviously no, there is no need to read further.
Otherwise, there might be a possibility to create a shut-down module out of
the existing module itself.
In this case it could probably be a smooth solution to ask the module's creator
(if known, and available) whether he/she would be able to make this modification
on the fly, as he/she probably knows the most about how the signals operate.
That is, to strip down the existing loadable module into a sort of dummy.
Of course this won't work if hazardous removal issues are shared among all or
most of the existing kernel modules.
An apology for taking up time and bandwidth, but rmmod is such a neat command
to have around.
/Daniel Mose
On Thu, 4 Jul 2002, Daniel Phillips wrote:
> On Wednesday 03 July 2002 05:48, Pavel Machek wrote:
> > Hi!
> >
> > Okay. So we want modules and want them unload. And we want it bugfree.
> >
> > So... then its okay if module unload is *slow*, right?
> >
> > I believe you can just freeze_processes(), unload module [now its
> > safe, you *know* noone is using that module, because all processes are
> > in your refrigerator], thaw_processes().
> >
> > That's going to take *lot* of time, but should be very simple and very
> > effective.
>
> Hi Pavel,
>
> Is it just the mod_dec_use_count; return/unload race we're worried about?
> I'm not clear on why this is hard. I'd think it would be sufficient just to
> walk all runnable processes to ensure none has an execution address inside the
> module. For smp, an ipi would pick up the current process on each cpu.
>
> At this point the use count must be zero and the module deregistered, so all
> we're interested in is that every process that dec'ed the module's use count
> has succeeded in executing its way out of the module. If not, we try again
> later, or if we're impatient, also bump any processes still inside the module
> to the front of the run queue.
>
> I'm sure I must have missed something.
>
> --
> Daniel
>
Assuming that there are no users of a module, i.e., the current
MOD_DEC_USE_COUNT mechanism, checked under a lock, I think all
you have to do to assure race-free module removal is to:
(1) Make sure that `current` isn't preempted. There are several
ways to do this. This may require another global variable.
(2) Execute cleanup_module(). It is assumed that the user-code
will deallocate interrupts, and free resources, etc.
(3) Set a global flag "module_remove", it doesn't have to be atomic.
It needs only to be volatile. It is used in schedule() to trap
all CPUs.
schedule()
{
while(module_remove)
;
}
(4) Wait number-of-CPUs timer-ticks with interrupts alive and well.
This will make certain that everything queued on the timer gets
a chance to be executed and everything that would be preempted
gets trapped in schedule().
(5) Remove the module, reset the flag.
The cost is the single variable that must be checked every time schedule()
is executed plus the 'don't preempt me' variable (if necessary).
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Windows-2000/Professional isn't.
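Taken at face value, the recipe above boils down to something like the
following skeleton, with invented names; the real schedule() and preemption
machinery are not shown, and this models the proposal as stated rather than
endorsing it. Whether a plain volatile flag and a busy-wait in schedule() are
actually sufficient on SMP is exactly what the rest of the thread disputes.

/* Skeleton of the recipe above; names are invented. */
#include <stdio.h>

static volatile int module_remove;      /* step (3): global, merely volatile */

/* The check proposed for schedule(): any CPU entering the scheduler while
 * a removal is in progress spins here until the flag is cleared. */
static void schedule_hook(void)
{
        while (module_remove)
                ;                       /* trap this CPU */
}

/* Removal side, steps (1)-(5); cleanup_fn stands in for cleanup_module(). */
static void remove_module_model(void (*cleanup_fn)(void),
                                void (*wait_ticks)(int), int ncpus)
{
        /* (1) make sure we are not preempted -- not modelled here */
        cleanup_fn();                   /* (2) release irqs, resources, ... */
        module_remove = 1;              /* (3) schedule() now traps every CPU */
        wait_ticks(ncpus);              /* (4) one timer tick per CPU */
        module_remove = 0;              /* (5) free the module, release CPUs */
}

static void fake_cleanup(void) { puts("cleanup_module() ran"); }
static void fake_wait_ticks(int ncpus) { (void)ncpus; }

int main(void)
{
        remove_module_model(fake_cleanup, fake_wait_ticks, 2);
        schedule_hook();                /* returns at once: no removal pending */
        return 0;
}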
Hi,
On Mon, 8 Jul 2002, Richard B. Johnson wrote:
> (3) Set a global flag "module_remove", it doesn't have to be atomic.
> It needs only to be volatile. It is used in schedule() to trap
> all CPUs.
> schedule()
> {
> while(module_remove)
> ;
> }
That doesn't sound too clean to me...
Maybe we should lock that module explicitly, instead of halting anything
that is schedule()d.
We should possibly add something to lock in struct module (or
module_info), be it some kind of integer, a semaphore (which is clearly a
bit too much, I think), a spinlock, or whatever. This shouldn't protect the
module from being used in parallel, but from being used during removal. So
on removal we do something like module->remove |= 1, or even
up(module->m_sem), and when we're done we do the corresponding operation to
undo the up, the remove flag, or whatever...
BTW, looking at struct module, we have this union
union {
atomic_t usecount;
long pad;
}
Fair enough, but if 'long pad' is there to pad (as its name tells us), shouldn't
it be atomic_t then? (I mean, what if we change the type of atomic_t?)
Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------
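One way to make that worry about the pad checkable, as a purely illustrative
sketch with stand-in definitions (this is not the real struct module or
<linux/module.h>), is a compile-time assertion that the pad still covers the
use count:

/* Stand-in definitions only. */
#include <stdio.h>

typedef struct { volatile int counter; } atomic_t;     /* i386-flavoured guess */

struct module_model {
        union {
                atomic_t usecount;
                long pad;               /* meant to fix the union's size */
        } uc;
};

/* Refuses to compile if atomic_t ever outgrows the pad -- the case asked
 * about above. */
typedef char pad_covers_usecount[sizeof(atomic_t) <= sizeof(long) ? 1 : -1];

int main(void)
{
        printf("usecount %zu, pad %zu, union %zu bytes\n",
               sizeof(atomic_t), sizeof(long), sizeof(struct module_model));
        return 0;
}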
On Mon, 8 Jul 2002, Thunder from the hill wrote:
> Hi,
>
> On Mon, 8 Jul 2002, Richard B. Johnson wrote:
> > (3) Set a global flag "module_remove", it doesn't have to be atomic.
> > It needs only to be volatile. It is used in schedule() to trap
> > all CPUs.
> > schedule()
> > {
> > while(module_remove)
> > ;
> > }
>
> That doesn't sound too clean to me...
>
> Maybe we should lock that module explicitly, instead of halting anything
> that is schedule()d.
Locking the module does nothing except increase module overhead. The
premise is that we don't care how long it takes to remove a module. It
just must be done safely. So, what we need to do is make certain that...
(1) All calls from the module have returned.
(2) All calls to the module code have returned.
(3) All user-access has completed.
That's what the trap in schedule() does, in conjunction with the
wait-for-timer-ticks. We don't want to have lots of locks and semaphores
that have to be accessed during normal execution paths.
>
> We should possibly add something to lock in struct module (or
> module_info), be it some kind of integer or be it a semaphore (which is
> clearly a bit too much, I think) or a spinlock, or whatever.
But this doesn't solve the module-removal problem.
> shouldn't protect the module from being used in parallel, but from being
> used in removal. So on removal, we do something like module->remove |= 1
> or even up(module->m_sem), and when we're done, we do something related to
> undo the up, remove or whatever...
>
Again, it's not the problem I'm addressing.
> BTW, looking at struct module, we have this union
>
> union {
> atomic_t usecount;
> long pad;
> }
>
> Fair enough, but if long pad is to pad (as it name tells us), shouldn't it
> be atomic_t then (I mean, what if we change the type for atomic_t)?
>
Good point. Member usecount could be anything. A 'long' isn't the correct
pad for all types, but it will probably handle everything that was
intended.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Windows-2000/Professional isn't.
On Mon, 8 Jul 2002 08:21:59 -0400 (EDT),
"Richard B. Johnson" <[email protected]> wrote:
>On Thu, 4 Jul 2002, Daniel Phillips wrote:
>> Is it just the mod_dec_use_count; return/unload race we're worried about?
>> I'm not clear on why this is hard. I'd think it would be sufficient just to
>> walk all runnable processes to ensure none has an execution address inside the
>> module. For smp, an ipi would pick up the current process on each cpu.
>>
>> At this point the use count must be zero and the module deregistered, so all
>> we're interested in is that every process that dec'ed the module's use count
>> has succeeded in executing its way out of the module. If not, we try again
>> later, or if we're impatient, also bump any processes still inside the module
>> to the front of the run queue.
>
>Assuming that there are no users of a module, i.e., the current
>MOD_DEC_USE_COUNT mechanism, checked under a lock, I think all
>you have to do to assure race-free module removal is to:
>
>(1) Make sure that `current` isn't preempted. There are several
> ways to do this. This may require another global variable.
>
>(2) Execute cleanup_module(). It is assumed that the user-code
> will deallocate interrupts, and free resources, etc.
You must not deallocate any resources until you have flushed out any
callers who have loaded the address of a module function but not yet
done MOD_INC_USE_COUNT. Otherwise there is a race between cleanup and a new
consumer entering the module.
>(3) Set a global flag "module_remove", it doesn't have to be atomic.
> It needs only to be volatile. It is used in schedule() to trap
> all CPUs.
> schedule()
> {
> while(module_remove)
> ;
> }
>
>(4) Wait number-of-CPUs timer-ticks with interrupts alive and well.
> This will make certain that everything queued on the timer gets
> a chance to be executed and everything that would be preempted
> gets trapped in schedule().
>
>(5) Remove the module, reset the flag.
>
>The cost is the single variable that must be checked every time schedule()
>is executed plus the 'don't preempt me' variable (if necessary).
(3) and (4) are the handwaving[1] bit that says 'wait until all
consumers have either left the module or incremented the use count'. I
don't like the idea of completely stopping schedule. Apart from the
long latency it introduces on module removal, I suspect that threads
belonging to the module being removed will be a problem; they must proceed
to completion and full thread cleanup.
Frankly the quiesce code is not the real problem. The real race is
this one :-
Check use count, 0 so proceed with unload.
Another task enters the module on another cpu.
Call cleanup_module(), remove resources.
Increment use count.
Use resources, oops.
There are only two viable solutions to this race. Either _always_
increment the use count outside the module (via the owner field in
structures) or split module exit code in two, as described in [1].
The current code tries to use 'increment use count outside module' but
that has its own race in getting the address of the module. Closing
that race relies on the interaction between three (yes, three)
unrelated locks which have to be obtained and released in the right
order. Not only is this complex and fragile, but a quick scan of the
kernel found one outright bug and several dubious sections of code.
Rusty and I think that the current method is difficult to use and to
audit, as well as scattering the complexity around the kernel. Not to
mention that preempt breaks the existing code. We both think that we
should give up on the current method and put all the complexity of
flushing stale references in one place[2]. The downside is the need to
split module init and exit code into two phases, i.e. change every
module to separate unregister from deallocate.
[1] http://marc.theaimsgroup.com/?l=linux-kernel&m=102574568025442&w=2
[2] http://marc.theaimsgroup.com/?l=linux-kernel&m=102575536930661&w=2
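As a rough illustration of the split being proposed, with invented names,
without the real quiescence machinery, and not the interface that was
actually adopted, a two-phase module exit might be shaped like this:

#include <errno.h>
#include <stdio.h>

/* Invented stand-ins only. */
struct fake_module {
        int refs_outstanding;
};

/* Phase 1: pull the module out of every public registry.  This part is
 * allowed to fail; rmmod would stop here and report the error. */
static int module_exit_phase1(struct fake_module *m)
{
        if (m->refs_outstanding)
                return -EBUSY;          /* someone still holds a reference */
        /* e.g. unregister_blkdev(), unregister_netdev(), del_timer_sync() ... */
        return 0;
}

/* Phase 2: runs only after the kernel has flushed every stale reference
 * (however that quiescence is implemented).  Nothing can call in any more,
 * so freeing resources and the module text cannot race with a new user.
 * This part is not allowed to fail. */
static void module_exit_phase2(struct fake_module *m)
{
        (void)m;
        /* free private data, release regions, then the text itself can go */
}

int main(void)
{
        struct fake_module m = { 0 };

        if (module_exit_phase1(&m) == 0)
                module_exit_phase2(&m);
        else
                puts("rmmod would fail here, before anything was torn down");
        return 0;
}

rmmod would call phase 1, be allowed to give up on -EBUSY, and only ever reach
phase 2 once the kernel has guaranteed that no stale reference remains.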
On Mon, 08 Jul 2002 23:06:05 +1000,
Keith Owens <[email protected]> wrote:
>The current code tries to use 'increment use count outside module' but
>that has its own race in getting the address of the module. Closing
>that race relies on the interaction between three (yes, three)
>unrelated locks which have to be obtained and released in the right
>order. Not only is this complex and fragile, a quick scan of the
>kernel found one outright bug and several dubious sections of code.
Correction: make that two outright bugs. I have just been told that
one of the dubious bits of code is broken.
Tracking the interaction of multiple locks to ensure that they
correctly prevent a race on incrementing the use count is too messy and
fragile. The offending bits of code were not written by beginners but
by experienced kernel programmers. If experienced programmers get it
wrong, then the method is wrong.
Hi,
On Mon, 8 Jul 2002, Richard B. Johnson wrote:
> > That doesn't sound too clean to me...
> >
> > Maybe we should lock that module explicitly, instead of halting anything
> > that is schedule()d.
Not to mention that we even stop our own cleanup thread, looking at this.
> Locking the module does nothing except increase module overhead. The
> premise is that we don't care how long it takes to remove a module. It
> just must be done safely. So, what we need to do it make certain that...
>
> (1) All calls from the module have returned.
> (2) All calls to the module code have returned.
> (3) All user-access has completed.
>
> That's what the trap in schedule() does, in conjunction with the
> wait-for-timer-ticks. We don't want to have lots of locks and semiphores
> that have to be accessed during normal execution paths.
Still, we shouldn't lock everything. I could do an awful lot of interesting
things while the only thing being done is removing a module. It doesn't
make sense IMO to lock things that are completely unrelated to modules.
And BTW, how much overhead is it really if we tell everyone who tries to
mess around with a certain module that they'd better wait until we have
unloaded it? It could be done like your schedule() hack, but cleaner in the
respect that tasks that have nothing to do with the module can keep on
running.
> Good point. Member usecount could be anything. A 'long' isn't the
> correct pad for all types, but it will probably handle everything that
> was intended.
But as I mentioned: if atomic_t is changed (e.g. to long long), module->pad
blows up, because sizeof(struct module) ends up different depending on which
part of the union we're relying on.
Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------
Okay, maybe this is a bit naive, but isn't this problem already solved?
Couldn't we just put a read/write lock on the module, where using is
reading, and removing is writing? As I understand it, this should
impose little overhead on the use (read) case, and ensure that, when a
context has the remove (write) lock, there are no users (readers) and
cannot be any?
Daniel
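A minimal userspace model of that idea, using a pthreads rwlock as a stand-in
for whatever primitive the kernel would actually use (all names here are
invented):

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t mod_lock;
static void (*mod_func)(void);          /* NULL once the "module" is gone */

static void module_code(void)
{
        puts("inside module code");
}

/* Using the module = holding the lock for read; many users can run at once. */
static void call_module(void)
{
        pthread_rwlock_rdlock(&mod_lock);
        if (mod_func)
                mod_func();
        pthread_rwlock_unlock(&mod_lock);
}

/* Removing the module = holding the lock for write; once we have it there
 * are no users and none can enter until we drop it. */
static void rmmod_model(void)
{
        pthread_rwlock_wrlock(&mod_lock);
        mod_func = NULL;                /* nothing can reach the module now */
        pthread_rwlock_unlock(&mod_lock);
        /* the module text could be freed here */
}

int main(void)
{
        pthread_rwlock_init(&mod_lock, NULL);
        mod_func = module_code;
        call_module();
        rmmod_model();
        call_module();                  /* safely finds nothing to call */
        pthread_rwlock_destroy(&mod_lock);
        return 0;
}

The overhead question raised elsewhere in the thread is precisely the cost of
that read-side acquisition on every call into the module.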
On Mon, 2002-07-08 at 09:58, Thunder from the hill wrote:
> Hi,
>
> Still, we shouldn't lock everything. I could do awful lots of interesting
> things while the only thing that is being done is to remove a module. It
> doesn't make sense IMO to lock things that are completely unrelated to
> modules.
>
> And BTW, what's so much of an overhead if we tell everyone who tries to
> mess around with a certain module that he'd better wait until we unloaded
> it? It could be done like your schedule hack, but cleaner in that respect
> that those who got nothing to do with the module can keep on running.
>
> > Good point. Member usecount could be anything. A 'long' isn't the
> > correct pad for all types, but it will probably handle everything that
> > was intended.
>
> But as I mentioned - atomic_t is changed (e.g. to long long) ->
> module->pad blows up, because the sizeof(struct module) is different,
> depending on which part of the union we're using.
>
> Regards,
> Thunder
> --
> (Use http://www.ebb.org/ungeek if you can't decode)
> ------BEGIN GEEK CODE BLOCK------
> Version: 3.12
> GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
> N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
> e++++ h* r--- y-
> ------END GEEK CODE BLOCK------
--
Recursion n.:
See Recursion.
-- Random Shack Data Processing Dictionary
Hi,
On 8 Jul 2002, Daniel Gryniewicz wrote:
> Okay, maybe this is a bit naive, but isn't this problem already solved?
> Couldn't we just put a read/write lock on the module, where using is
> reading, and removing is writing? As I understand it, this should
> impose little overhead on the use (read) case, and ensure that, when a
> context has the remove (write) lock there are no no users (readers) and
> cannot be any?
My suggestion could cope with just one lock, but there seems to be
something speaking against that.
Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------
Hi!
> >> Modules can run their own kernel threads. When the module shuts down
> >> it terminates the threads but we must wait until the process entries
> >> for the threads have been reaped. If you are not careful, the zombie
> >> clean up code can refer to the module that no longer exists. You must
> >> not freeze any threads that belong to the module.
> >
> >Look at the code. freezer will try for 5 seconds, then give up. So, in
> >rare case module has some threads, rmmod will simply fail. I believe
> >we can fix rare remaining modules one by one.
>
> There is no failure path for rmmod. Once rmmod sees a use count of 0,
> it must succeed. Which is why both Rusty and I agree that rmmod must
> be split in two, one piece that is allowed to fail and a second piece
> that is not.
>
> BTW, freeze_processes() silently ignores zombie processes. The test is
> not obvious, it is hidden in the INTERESTING macro which has an
> embedded 'continue' to break out of the loop. Easy to miss.
Are there problems? Zombie processes are basically dead, not
interesting, processes.
> >> You must not freeze any process that has entered the module but not yet
> >> incremented the use count, nor any process that has decremented the use
> >> count but not yet left the module. Simply looking at the EIP after
> >
> >Look how freezer works. refrigerator() is blocking, by definition. So
> >if all processes reach refrigerator(), and the use count == 0, it is
> >indeed safe to unload.
>
> freeze_processes()
> signal_wake_up() - sets TIF_SIGPENDING for other task
> kick_if_running()
> resched_task() - calls preempt_disable() for this cpu
> smp_send_reschedule()
> smp_reschedule_interrupt() - now on another cpu
> ret_from_intr
> resume_kernel - on other cpu
>
> With CONFIG_PREEMPT, a process running on another cpu without a lock
> when freeze_processes() is called should immediately end up in
> schedule. I don't see anything in that code path that disables
> preemption on other cpus. If I am right, then a second cpu could be in
> this window when freeze_processes is called
>
> if (xxx->func)
> xxx->func()
okay, so we have
if (xxx->func)
interrupt
schedule()
but schedule at this point is certainly not going to enter signal
handling code, so the refrigerator is deferred until return to userspace,
and the task continues with
interrupt return
xxx->func()
...
attempt to return to userspace
signal code
refrigerator().
So I believe that is safe.
> where func is a module function. There is still a window from loading
> the function address, through calling the function and up to the point
> where the function does MOD_INC_USE_COUNT. Any reschedule in that
> window opens a race with rmmod. Without preemption, freeze might be
> safe, with preemption the race is back again.
Pavel
--
Casualities in World Trade Center: ~3k dead inside the building,
cryptography in U.S.A. and free speech in Czech Republic.
On Mon, 8 Jul 2002 20:13:31 +0200,
Pavel Machek <[email protected]> wrote:
>Keith Owens wrote
>> BTW, freeze_processes() silently ignores zombie processes. The test is
>> not obvious, it is hidden in the INTERESTING macro which has an
>> embedded 'continue' to break out of the loop. Easy to miss.
>
>Is there problems? Zombie processes are basically dead, not
>interesting, processes.
I have had reports of module threads being shut down by the cleanup
routine; the module was removed, then the zombie threads were reaped and
got an oops because the module code that spawned them no longer existed.
It is not clear why this is a problem; I will follow up with the reporter.
>> freeze_processes()
>> signal_wake_up() - sets TIF_SIGPENDING for other task
>> kick_if_running()
>> resched_task() - calls preempt_disable() for this cpu
>> smp_send_reschedule()
>> smp_reschedule_interrupt() - now on another cpu
>> ret_from_intr
>> resume_kernel - on other cpu
>>
>> With CONFIG_PREEMPT, a process running on another cpu without a lock
>> when freeze_processes() is called should immediately end up in
>> schedule. I don't see anything in that code path that disables
>> preemption on other cpus. If I am right, then a second cpu could be in
>> this window when freeze_processes is called
>>
>> if (xxx->func)
>> xxx->func()
>
>okay, so we have
>
> if (xxx->func)
> interrupt
> schedule()
>
>but schedule at this point is certainly not going to enter signal
>handling code
With preempt it will. The interrupt drops back through ret_from_intr
-> resume_kernel. The preempt count is 0 and preemption has only been
disabled on the sending cpu, not the receiving cpu, so the need_resched
path is entered immediately.
Hi!
> >> freeze_processes()
> >> signal_wake_up() - sets TIF_SIGPENDING for other task
> >> kick_if_running()
> >> resched_task() - calls preempt_disable() for this cpu
> >> smp_send_reschedule()
> >> smp_reschedule_interrupt() - now on another cpu
> >> ret_from_intr
> >> resume_kernel - on other cpu
> >>
> >> With CONFIG_PREEMPT, a process running on another cpu without a lock
> >> when freeze_processes() is called should immediately end up in
> >> schedule. I don't see anything in that code path that disables
> >> preemption on other cpus. If I am right, then a second cpu could be in
> >> this window when freeze_processes is called
> >>
> >> if (xxx->func)
> >> xxx->func()
> >
> >okay, so we have
> >
> > if (xxx->func)
> > interrupt
> > schedule()
> >
> >but schedule at this point is certainly not going to enter signal
> >handling code
>
> With preempt it will. The interrupt drops back through ret_from_intr
> -> resume_kernel. The preempt count is 0, preemption has only been
> disabled on the sending cpu, not the receiving cpu. need_resched is
> entered immediately.
Yep, but signal handling code will not be entered, because signal
handling only happens when returning to userspace.
Pavel
--
Worst form of spam? Adding advertisment signatures ala sourceforge.net.
What goes next? Inserting advertisment *into* email?
Daniel Mose wrote:
> Keith Owens wrote:
> > On Fri, 5 Jul 2002 15:48:17 +0200,
> > Pavel Machek <[email protected]> wrote:
> > >Keith Owens wrote
> > >> Modules can run their own kernel threads. When the module shuts down
> > >> it terminates the threads but we must wait until the process entries
> > >> for the threads have been reaped. If you are not careful, the zombie
> > >> clean up code can refer to the module that no longer exists. You must
> > >> not freeze any threads that belong to the module.
>
> I am just out to share a possible angle. -all though not really a programmer.
>
> Can one module replace another -running- twin-module without having to hand-
> shake with the kernel? -and without freezing ? (transparently)
>
> If obvious no - no need to read further.
. . .
As some conversations recently in the thread have implied that it would be
possible to disconnect and re-connect the original module, this might also be
possible. But I will not pretend that I know anything; I'm just trying to use a
bit of wits. It is only an angle, and I'm not exactly bored with ANSI-drawing.
[ASCII diagram, eight panels along an HWclock timeline: a small (100kB)
"derivate.o" loads next to the 1MB "realmodule.o", takes over its users and
its /dev entry while the use count drains to zero, and the real module is
dismantled piece by piece until only the dummy is left answering requests.]
No, this is not really what happens, is it? Actually, what can happen is of
course anything, but hopefully the "real" module is removed safely and replaced
by a dummy, either permanently or until another module (maybe the original
"real" module) is loaded back into the kernel.
Perhaps the kernel doesn't get to free all of the memory pages that were
allocated to the "real" module, but the original module is gone, which is what
was desired, no? The dummy should probably be designed to answer kernel
processes in a sort of lame way (rest easy), as well as to listen for user
processes and deny them access, if it is not able to unload completely.
I think of this as both a workaround and a design issue. Dummy modules are
not something new; they have been (and are) used to trick kernel processes
before.
With dummy modules you probably don't have to risk either the functionality of
the running kernel itself through severe re-programming, opening up possible
new bug holes, or the brilliant functionality of the real, original module,
which we might want to unload only momentarily in order NOT to have it for a
while.
There is just one BIG issue here as far as I'm concerned: I don't know if
these are relevant thoughts or not, since I severely lack experience of Linux
process internals and communication features, so I'm fumbling more than just a
bit here.
(But at least I'm not taking high-school summer classes in fundamental C
programming rules on the linux-kernel mailing list. ;-)
No one named or forgotten.)
I use rmmod as a security feature at home. What the kernel doesn't know about,
it can't reveal to possible intruders, right? So I use a ping script and
mailto to unload eth0 from one of my LAN boxes while dialing on line, so
I really DO load and unload modules all the time. (2.2.12)
---
Image:
- makes sure u make sense.
One oddball use of module removal.
On my TiVo I have a module loaded that disables scrambling of the shows it
records (allowing me to download them to standard MPEG files). I need to
not have this module loaded to watch shows that were recorded without the
module loaded.
If module unloading is removed, then I will have to reboot my TiVo to
unload this module.
Yes, this module could probably be rewritten to have an interface that
allowed the encryption to be enabled or disabled, but that makes the
module significantly more complicated than merely implementing null
functions to replace the existing encryption functions.
David Lang
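Purely as an illustration of the trade-off David describes, and with entirely
invented names (this is not the TiVo module), the difference between a null
replacement and a runtime toggle looks roughly like this:

#include <stdbool.h>
#include <string.h>

/* What the unloadable module amounts to: a null replacement is trivial. */
static void scramble_noop(void *buf, size_t len)
{
        (void)buf;
        (void)len;                      /* leave the recording untouched */
}

/* What a never-unloaded module would need instead: a runtime switch plus
 * the original behaviour kept around and a way to flip between them. */
static bool scramble_enabled = true;

static void scramble_real(void *buf, size_t len)
{
        memset(buf, 0xaa, len);         /* stand-in for the real scrambling */
}

static void scramble_toggle(void *buf, size_t len)
{
        if (scramble_enabled)
                scramble_real(buf, len);
        /* else: behave exactly like scramble_noop() above */
}

int main(void)
{
        char frame[16] = { 0 };

        scramble_noop(frame, sizeof(frame));    /* no-op module loaded */
        scramble_toggle(frame, sizeof(frame));  /* toggling module, enabled */
        scramble_enabled = false;
        scramble_toggle(frame, sizeof(frame));  /* toggling module, disabled */
        return 0;
}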
On Mon, 1 Jul 2002, Bill Davidsen wrote:
> Date: Mon, 1 Jul 2002 13:48:55 -0400 (EDT)
> From: Bill Davidsen <[email protected]>
> To: Linux-Kernel Mailing List <[email protected]>
> Subject: [OKS] Module removal
>
> Having read some notes on the Ottawa Kernel Summit, I'd like to offer some
> comments on points raied.
>
> The suggestion was made that kernel module removal be depreciated or
> removed. I'd like to note that there are two common uses for this
> capability, and the problems addressed by module removal should be kept in
> mind. These are in addition to the PCMCIA issue raised.
>
> 1 - conversion between IDE-CD and IDE-SCSI. Some applications just work
> better (or usefully at all) with one or the other driver used to read CDs.
> I've noted that several programs for reading the image from an ISO CD
> (readcd and sdd) have end of data problems with the SCSI interface.
>
> 2 - restarting NICs when total reinitialization is needed. In server
> applications it's sometimes necessary to move or clear a NIC connection,
> force renegotiation because the blade on the switch was set wrong, etc.
> It's preferable to take down one NIC for a moment than suffer a full
> outage via reboot.
>
> --
> bill davidsen <[email protected]>
> CTO, TMR Associates, Inc
> Doing interesting things with little computers since 1979.
>