I've been playing with workqueues, and I've found that once I unload the
module, if I don't call destroy_workqueue(); then the workqueue I've created
stays in the process list, [my_wq], I don't know if that's meant to be, or is
it a bug, cause I believe there can be two options in here:
1) It's meant to be so you can unload your module and let the works run some
time after you're already gone, that allows you to probe other modules or do
whatever necesary without the need to wait for the workqueue to be emtpy.
2) It's a bug, cause the module allows to be unloaded, destroying the structs
but not removing the workqueue from the process context.
Which one is it?I hope I'm being clear with my question.
I was about to try to find a solution to remove the queue but maybe it's meant
to be, although not likely.
Vicente.
Vicente> I've been playing with workqueues, and I've found that
Vicente> once I unload the module, if I don't call
Vicente> destroy_workqueue(); then the workqueue I've created
Vicente> stays in the process list, [my_wq], I don't know if
Vicente> that's meant to be, or is it a bug, cause I believe there
Vicente> can be two options in here:
Vicente> 1) It's meant to be so you can unload your module and let
Vicente> the works run some time after you're already gone, that
Vicente> allows you to probe other modules or do whatever necesary
Vicente> without the need to wait for the workqueue to be emtpy.
Vicente> 2) It's a bug, cause the module allows to be unloaded,
Vicente> destroying the structs but not removing the workqueue
Vicente> from the process context.
Not destroying its workqueue is a bug in the module just like any
other resource leak. It's analogous to a module allocating some
memory with kmalloc() and not calling kfree() when it's unloaded. If
a module creates a workqueue, then it should call destroy_workqueue()
when it's unloaded.
By the way, the module (or any code calling destroy_workqueue()) must
make sure that it has race conditions that might result in work being
submitted to the queue while it is being destroyed.
-R .
On Saturday 19 February 2005 04:57 am, you wrote:
> Vicente> I've been playing with workqueues, and I've found that
> Vicente> once I unload the module, if I don't call
> Vicente> destroy_workqueue(); then the workqueue I've created
> Vicente> stays in the process list, [my_wq], I don't know if
> Vicente> that's meant to be, or is it a bug, cause I believe there
> Vicente> can be two options in here:
>
> Vicente> 1) It's meant to be so you can unload your module and let
> Vicente> the works run some time after you're already gone, that
> Vicente> allows you to probe other modules or do whatever necesary
> Vicente> without the need to wait for the workqueue to be emtpy.
>
> Vicente> 2) It's a bug, cause the module allows to be unloaded,
> Vicente> destroying the structs but not removing the workqueue
> Vicente> from the process context.
>
> Not destroying its workqueue is a bug in the module just like any
> other resource leak. It's analogous to a module allocating some
> memory with kmalloc() and not calling kfree() when it's unloaded. If
> a module creates a workqueue, then it should call destroy_workqueue()
> when it's unloaded.
What if I need the module to be unloaded cause It's mutually exclusive with
another module to be loaded, and I still need to run the works in a workqueue
time before that happens? That's completely out of the picture?cause that
might be useful.
>
> By the way, the module (or any code calling destroy_workqueue()) must
> make sure that it has race conditions that might result in work being
> submitted to the queue while it is being destroyed.
yes, I think flushing is enough, is it?
>
> -R .
Vicente.
Vicente> What if I need the module to be unloaded cause It's
Vicente> mutually exclusive with another module to be loaded, and
Vicente> I still need to run the works in a workqueue time before
Vicente> that happens? That's completely out of the picture?cause
Vicente> that might be useful.
That doesn't sound like a good idea. For one thing, how does the
second module get the workqueue pointer from the first module? The
simplest solution would be for each module to create and destroy its
own workqueue. However, if you really need a shared workqueue for
some reason, then have a simple third module that can remain loaded
and have it manage the workqueue for both of your other modules.
Roland> By the way, the module (or any code calling
Roland> destroy_workqueue()) must make sure that it has race
Roland> conditions that might result in work being submitted to
Roland> the queue while it is being destroyed.
Vicente> yes, I think flushing is enough, is it?
Usually, but you have to be extra-careful if you have any work
functions that might resubmit themselves.
- Roland
On Saturday 19 February 2005 04:57 am, you wrote:
> Vicente> I've been playing with workqueues, and I've found that
> Vicente> once I unload the module, if I don't call
> Vicente> destroy_workqueue(); then the workqueue I've created
> Vicente> stays in the process list, [my_wq], I don't know if
> Vicente> that's meant to be, or is it a bug, cause I believe there
> Vicente> can be two options in here:
>
> Vicente> 1) It's meant to be so you can unload your module and let
> Vicente> the works run some time after you're already gone, that
> Vicente> allows you to probe other modules or do whatever necesary
> Vicente> without the need to wait for the workqueue to be emtpy.
>
> Vicente> 2) It's a bug, cause the module allows to be unloaded,
> Vicente> destroying the structs but not removing the workqueue
> Vicente> from the process context.
>
> Not destroying its workqueue is a bug in the module just like any
> other resource leak. It's analogous to a module allocating some
> memory with kmalloc() and not calling kfree() when it's unloaded. If
> a module creates a workqueue, then it should call destroy_workqueue()
> when it's unloaded.
What if I need the module to be unloaded cause It's mutually exclusive with
another module to be loaded, and I still need to run the works in a workqueue
time before that happens? That's completely out of the picture?cause that
might be useful.
>
> By the way, the module (or any code calling destroy_workqueue()) must
> make sure that it has race conditions that might result in work being
> submitted to the queue while it is being destroyed.
yes, I think flushing is enough, is it?
>
> -R .
Vicente.
Roland Dreier wrote:
> Not destroying its workqueue is a bug in the module just like any
> other resource leak. It's analogous to a module allocating some
> memory with kmalloc() and not calling kfree() when it's unloaded.
Except though that with kmalloc() it's indeed just a leak while in this
case things might blow up violently if run_workqueue() later accesses a
workqueue_struct (or work_struct) which is already gone as part of the
modules' datasection, for example. That's to say, if I'm reading this
right...
I have no idea about the module refcounting stuff. Is there a chance
that create_workqueue() could increase a reference somewhere so that the
module wouldn't be allowed to unload untill after a destroy_workqueue()?
Rene.
Rene> I have no idea about the module refcounting stuff. Is there
Rene> a chance that create_workqueue() could increase a reference
Rene> somewhere so that the module wouldn't be allowed to unload
Rene> untill after a destroy_workqueue()?
There's no point to doing this, since it's adding complexity to try
and avoid a very obvious and easy to find bug. Other types of
resource leaks are harder to find, but a module not destroying a
workqueue is going to be trivial to spot and fix.
(There are technical issues as well -- if create_workqueue()
increments the module reference count, then you would never be able to
unload the module if the destroy_workqueue() was in the module_exit
function, because you can never even start to unload a module with a
non-zero ref count).
- R.
Roland Dreier wrote:
> Rene> I have no idea about the module refcounting stuff. Is there
> Rene> a chance that create_workqueue() could increase a reference
> Rene> somewhere so that the module wouldn't be allowed to unload
> Rene> untill after a destroy_workqueue()?
>
> There's no point to doing this, since it's adding complexity to try
> and avoid a very obvious and easy to find bug. Other types of
> resource leaks are harder to find, but a module not destroying a
> workqueue is going to be trivial to spot and fix.
Yes, fair enough. Thank you.
Rene.