On Mon, 5 Nov 2007, Jens Axboe wrote:
> Hi Peter,
>
> You don't seem to have a bugzilla account, so could not reassign to you.
> See http://bugzilla.kernel.org/show_bug.cgi?id=9294
Problem is repeatable on my computer. It dies in __module_get() on this
line:
BUG_ON(module_refcount(module) == 0);
I think this is because commit 7b595756ec1f49e0049a9e01a1298d53a7faaa15,
which states: "Note that with this change, userland holding a sysfs node
does not prevent the backing module from being unloaded."
Unfortunately, I don't know how this sysfs stuff is supposed to work, and
therefore don't know how to fix the problem.
--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index a8130a4..a5ee213 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -358,10 +358,19 @@ static ssize_t class_pktcdvd_store_add(struct class *c, const char *buf,
 					size_t count)
 {
 	unsigned int major, minor;
+
 	if (sscanf(buf, "%u:%u", &major, &minor) == 2) {
+		/* pkt_setup_dev() expects caller to hold reference to self */
+		if (!try_module_get(THIS_MODULE))
+			return -ENODEV;
+
 		pkt_setup_dev(MKDEV(major, minor), NULL);
+
+		module_put(THIS_MODULE);
+
 		return count;
 	}
+
 	return -EINVAL;
 }
Hello,
I have not tested it yet, but I guess the code mentioned by Peter
is in pkt_new_dev(), which is called by pkt_setup_dev():
/* This is safe, since we have a reference from open(). */
__module_get(THIS_MODULE);
So, now, there must be checks in every sysfs operation in the module code,
to ensure that the module is still loaded?
BTW: the bug report says:
Steps to reproduce:
modprobe pktcdvd
echo 22:0 >/sys/class/pktcdvd/add
Is any module unload involved? Why would the module be unavailable right after the modprobe while the sysfs entries it created are still there? Confused ;)
-Thomas
On 06.11.2007 at 10:06, Tejun Heo <[email protected]> wrote:
> [Greg cc'd]
>
> Peter Osterlund wrote:
>> On Mon, 5 Nov 2007, Jens Axboe wrote:
>>
>>> Hi Peter,
>>>
>>> You don't seem to have a bugzilla account, so could not reassign to you.
>>> See http://bugzilla.kernel.org/show_bug.cgi?id=9294
>>
>> Problem is repeatable on my computer. It dies in __module_get() on this
>> line:
>>
>> BUG_ON(module_refcount(module) == 0);
>>
>> I think this is because commit 7b595756ec1f49e0049a9e01a1298d53a7faaa15,
>> which states: "Note that with this change, userland holding a sysfs node
>> does not prevent the backing module from being unloaded."
>>
>> Unfortunately, I don't know how this sysfs stuff is supposed to work,
>> and therefore don't know how to fix the problem.
>
> Does this fix the problem?
>
On Tue, 6 Nov 2007, Thomas Maier wrote:
> Hello,
>
> I have not tested it yet, but I guess the code mentioned by Peter
> is in pkt_new_dev(), which is called by pkt_setup_dev():
>
> /* This is safe, since we have a reference from open(). */
> __module_get(THIS_MODULE);
>
>
> So, now, there must be checks in every sysfs operation in the module code,
> to ensure that the module is still loaded?
I haven't tested it either yet. What I don't understand is this: If the
__module_get() is not safe because the module code could have already been
unloaded, how can it possibly be made safe by adding more code to the
pktcdvd module? If the module is unloaded, trying to execute its code
can't be a good thing no matter what the code does.
> BTW: the bug report says:
>
> Steps to reproduce:
>
> modprobe pktcdvd
> echo 22:0 >/sys/class/pktcdvd/add
>
> Is any module unload involved? Why would the module be unavailable right
> after the modprobe while the sysfs entries it created are still there? Confused ;)
I think the purpose of the BUG_ON in __module_get() is to catch cases that
are unsafe, even if the call would have happened to work in this
particular case.
--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
Peter Osterlund wrote:
> On Tue, 6 Nov 2007, Thomas Maier wrote:
>
>> Hello,
>>
>> I have not tested it yet, but I guess the code mentioned by Peter
>> is in pkt_new_dev(), which is called by pkt_setup_dev():
>>
>> /* This is safe, since we have a reference from open(). */
>> __module_get(THIS_MODULE);
>>
>>
>> So, now, there must be checks in every sysfs operation in the module
>> code,
>> to ensure that the module is still loaded?
>
> I haven't tested it either yet. What I don't understand is this: If the
> __module_get() is not safe because the module code could have already
> been unloaded, how can it possibly be made safe by adding more code to
> the pktcdvd module? If the module is unloaded, trying to execute its
> code can't be a good thing no matter what the code does.
>
sysfs itself is no longer subject to module lifetime rules. sysfs callbacks
are guaranteed to stay in memory while they run, because sysfs node removal
waits for in-flight operations to complete before returning. In pktcdvd's
case, the class_destroy() call in pkt_sysfs_cleanup() will wait for all
in-flight sysfs r/w ops to complete.
So, even while sysfs callbacks are executing, the module beneath can die
but it will stay in memory till all the callbacks return. You need to
test module liveness using try_module_get() (and it can fail) if you
want to grab module reference from sysfs callbacks.
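[Illustration] The rule described above can be sketched as a small userspace model. This is not kernel code: struct toy_module and every toy_* function are hypothetical names invented for this sketch, and the unload logic is heavily simplified (no forced or waiting unload). It only shows why a sysfs callback must use a get that can fail, and what the callback returns when it does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a module; not the kernel's struct module. */
struct toy_module {
	int refcount;	/* references currently held */
	bool going;	/* set once unload (rmmod) has started */
};

/* Models try_module_get(): refuses once the module is going away. */
static bool toy_try_module_get(struct toy_module *m)
{
	if (m->going)
		return false;
	m->refcount++;
	return true;
}

/* Models module_put(): drops one previously taken reference. */
static void toy_module_put(struct toy_module *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

/* Models the start of rmmod: in this sketch it only succeeds when
 * nobody holds a reference. */
static bool toy_start_unload(struct toy_module *m)
{
	if (m->refcount != 0)
		return false;
	m->going = true;
	return true;
}

/* Models a sysfs store callback that follows the pattern of the patch:
 * test and grab a reference, do the work, drop the reference. */
static long toy_store_add(struct toy_module *m, long count)
{
	if (!toy_try_module_get(m))
		return -ENODEV;	/* module is already being unloaded */

	/* ... pkt_setup_dev() would run here with a reference held ... */

	toy_module_put(m);
	return count;
}
```

In this model, a write that arrives before unload starts consumes its input normally, while a write that arrives after toy_start_unload() has succeeded gets -ENODEV instead of touching a dying module.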
>> BTW: the bug report says:
>>
>> Steps to reproduce:
>>
>> modprobe pktcdvd
>> echo 22:0 >/sys/class/pktcdvd/add
>>
>> Is any module unload involved? Why would the module be unavailable right
>> after the modprobe while the sysfs entries it created are still there? Confused ;)
>
> I think the purpose of the BUG_ON in __module_get() is to catch cases
> that are unsafe, even if the call would have happened to work in this
> particular case.
The BUG_ON is detecting a valid condition here. If you rmmod pktcdvd
after the sysfs write has begun but before __module_get() has run, the device
node will be created after the module is killed and scheduled to be unloaded.
Thanks.
--
tejun
On Wed, 7 Nov 2007, Tejun Heo wrote:
> Peter Osterlund wrote:
>> If the
>> __module_get() is not safe because the module code could have already
>> been unloaded, how can it possibly be made safe by adding more code to
>> the pktcdvd module? If the module is unloaded, trying to execute its
>> code can't be a good thing no matter what the code does.
>
> sysfs itself is no longer subject to module lifetime rules. sysfs callbacks
> are guaranteed to stay in memory while they run, because sysfs node removal
> waits for in-flight operations to complete before returning. In pktcdvd's
> case, the class_destroy() call in pkt_sysfs_cleanup() will wait for all
> in-flight sysfs r/w ops to complete.
>
> So, even while sysfs callbacks are executing, the module beneath can die
> but it will stay in memory till all the callbacks return. You need to
> test module liveness using try_module_get() (and it can fail) if you
> want to grab module reference from sysfs callbacks.
Thanks for the explanation.
Given that explanation, I think the patch is correct and it does fix the
BUG on my computer. Can you please push it upstream?
In any case:
Acked-by: Peter Osterlund <[email protected]>
--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340
pkt_setup_dev() expects module reference to be held on invocation.
This used to be true for sysfs callbacks but not anymore. Test and
grab module reference around pkt_setup_dev() in
class_pktcdvd_store_add().
Signed-off-by: Tejun Heo <[email protected]>
Acked-by: Peter Osterlund <[email protected]>
---
Greg, can you please push this patch through your tree?
Thanks a lot.
drivers/block/pktcdvd.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index a8130a4..a5ee213 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -358,10 +358,19 @@ static ssize_t class_pktcdvd_store_add(struct class *c, const char *buf,
 					size_t count)
 {
 	unsigned int major, minor;
+
 	if (sscanf(buf, "%u:%u", &major, &minor) == 2) {
+		/* pkt_setup_dev() expects caller to hold reference to self */
+		if (!try_module_get(THIS_MODULE))
+			return -ENODEV;
+
 		pkt_setup_dev(MKDEV(major, minor), NULL);
+
+		module_put(THIS_MODULE);
+
 		return count;
 	}
+
 	return -EINVAL;
 }
On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
> pkt_setup_dev() expects module reference to be held on invocation.
> This used to be true for sysfs callbacks but not anymore. Test and
> grab module reference around pkt_setup_dev() in
> class_pktcdvd_store_add().
>
> Signed-off-by: Tejun Heo <[email protected]>
> Acked-by: Peter Osterlund <[email protected]>
> ---
> Greg, can you please push this patch through your tree?
> Thanks a lot.
>
> drivers/block/pktcdvd.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
Why through my tree? I don't do block devices :)
Shouldn't Jens or at least Andrew take it?
thanks,
greg k-h
Greg KH wrote:
> On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
>> pkt_setup_dev() expects module reference to be held on invocation.
>> This used to be true for sysfs callbacks but not anymore. Test and
>> grab module reference around pkt_setup_dev() in
>> class_pktcdvd_store_add().
>>
>> Signed-off-by: Tejun Heo <[email protected]>
>> Acked-by: Peter Osterlund <[email protected]>
>> ---
>> Greg, can you please push this patch through your tree?
>> Thanks a lot.
>>
>> drivers/block/pktcdvd.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>
> Why through my tree? I don't do block devices :)
Because it's a regression introduced by changes in sysfs?
> Shouldn't Jens or at least Andrew take it?
That's fine too. Jens?
--
tejun
On Thu, Nov 08 2007, Tejun Heo wrote:
> Greg KH wrote:
> > On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
> >> pkt_setup_dev() expects module reference to be held on invocation.
> >> This used to be true for sysfs callbacks but not anymore. Test and
> >> grab module reference around pkt_setup_dev() in
> >> class_pktcdvd_store_add().
> >>
> >> Signed-off-by: Tejun Heo <[email protected]>
> >> Acked-by: Peter Osterlund <[email protected]>
> >> ---
> >> Greg, can you please push this patch through your tree?
> >> Thanks a lot.
> >>
> >> drivers/block/pktcdvd.c | 9 +++++++++
> >> 1 file changed, 9 insertions(+)
> >
> > Why through my tree? I don't do block devices :)
>
> Because it's a regression introduced by changes in sysfs?
>
> > Shouldn't Jens or at least Andrew take it?
>
> That's fine too. Jens?
Sure, I'm pushing some stuff off today anyway.
--
Jens Axboe
Hello,
I tested it too, running Linux 2.6.23 in a qemu instance, and the patch worked.
But I would prefer to move the try_module_get() handling into pkt_setup_dev(), because
it is also used by the older procfs interface. Can we run into the same problem there, i.e.
does procfs hold no module reference either, like sysfs now?
Maybe the "/sys/class/pktcdvd/remove" command should also be wrapped in a
try_module_get()?
-Thomas
----- Original message -----
Subject: Re: pktcdvd oops
Sent: Wed 07 Nov 2007 23:07:10 CET
From: "Peter Osterlund"<[email protected]>
> On Wed, 7 Nov 2007, Tejun Heo wrote:
>
> > Peter Osterlund wrote:
> >> If the
> >> __module_get() is not safe because the module code could have already
> >> been unloaded, how can it possibly be made safe by adding more code to
> >> the pktcdvd module? If the module is unloaded, trying to execute its
> >> code can't be a good thing no matter what the code does.
> >
> > sysfs itself is no longer subject to module lifetime rules. sysfs callbacks
> > are guaranteed to stay in memory while they run, because sysfs node removal
> > waits for in-flight operations to complete before returning. In pktcdvd's
> > case, the class_destroy() call in pkt_sysfs_cleanup() will wait for all
> > in-flight sysfs r/w ops to complete.
> >
> > So, even while sysfs callbacks are executing, the module beneath can die
> > but it will stay in memory till all the callbacks return. You need to
> > test module liveness using try_module_get() (and it can fail) if you
> > want to grab module reference from sysfs callbacks.
>
> Thanks for the explanation.
>
> Given that explanation, I think the patch is correct and it does fix the
> BUG on my computer. Can you please push it upstream?
>
> In any case:
>
> Acked-by: Peter Osterlund <[email protected]>
>
> --
> Peter Osterlund - [email protected]
> http://web.telia.com/~u89404340
--- End of original message ---
[email protected] wrote:
> Hello,
>
> I tested it too, running Linux 2.6.23 in a qemu instance, and the patch worked.
> But I would prefer to move the try_module_get() handling into pkt_setup_dev(), because
> it is also used by the older procfs interface. Can we run into the same problem there, i.e.
> does procfs hold no module reference either, like sysfs now?
procfs should be okay. sysfs was too intertwined with driver model and
module reference counting never worked well. We had to pull module
reference counting out of there.
> Maybe the "/sys/class/pktcdvd/remove" command should also be wrapped in a
> try_module_get()?
No, I don't think so. The code won't go away beneath it. After
module_put() the module can die (ie. calling __module_get() on it will
trigger BUG) but it won't go away till the function finishes.
Thanks.
--
tejun
On Thu, 8 Nov 2007, [email protected] wrote:
> I tested it too, running Linux 2.6.23 in a qemu instance, and the patch
> worked. But I would prefer to move the try_module_get() handling into
> pkt_setup_dev(), because it is also used by the older procfs interface.
> Can we run into the same problem there, i.e. does procfs hold no module
> reference either, like sysfs now?
The procfs interface can only be used to get some debug data out from the
driver, not to bind the driver to a CD/DVD device, so it shouldn't be a
problem.
The other way to bind a device is to use the pktsetup program, which is
doing ioctl calls to the driver. In that case, user space has to open the
device before being able to do the ioctls, and the open call will increase
the reference count.
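[Illustration] The difference between the ioctl path and the unpatched sysfs path can be sketched with a toy reference counter. This is a userspace model only: struct mod_model and all model_* functions are hypothetical names made up for this sketch, and the unchecked get returns false at the point where, per the report earlier in the thread, __module_get() hits BUG_ON(module_refcount(module) == 0).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a module's reference count; not the kernel's struct module. */
struct mod_model {
	int refcount;
};

/* Models open() on the device: the open call takes a module reference. */
static void model_open(struct mod_model *m)
{
	m->refcount++;
}

/* Models close(): drops the reference taken by open(). */
static void model_release(struct mod_model *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

/* Models the unchecked __module_get() discussed in the thread. It is only
 * legal when a reference is already held; returns false where the kernel
 * would trigger the BUG_ON. */
static bool model_module_get_unchecked(struct mod_model *m)
{
	if (m->refcount == 0)
		return false;
	m->refcount++;
	return true;
}

/* ioctl path: user space must open the device before issuing the ioctl,
 * so the unchecked get always finds a reference already in place. */
static bool model_setup_via_ioctl(struct mod_model *m)
{
	bool ok;

	model_open(m);
	ok = model_module_get_unchecked(m);
	model_release(m);
	return ok;
}

/* sysfs path before the patch: the callback runs with no reference held. */
static bool model_setup_via_sysfs_unpatched(struct mod_model *m)
{
	return model_module_get_unchecked(m);
}
```

In this model the ioctl path leaves one reference behind (the one the new device keeps), while the unpatched sysfs path fails on a freshly loaded module with no rmmod involved, which matches the reproduction steps in the bug report.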
--
Peter Osterlund - [email protected]
http://web.telia.com/~u89404340