2007-05-09 23:52:42

by Jeremy Fitzhardinge

Subject: 2.6.21-git11: BUG in loop.ko

Seems to be getting a 0 refcount. I don't see anything in the recent
changes which might cause this, but this is relatively new behaviour.
It was working for me in the 2.6.21-pre time period, but I haven't tried
this since 2.6.21 was released.

The BUG is actually triggered by the __module_get(THIS_MODULE) in
loop_set_fd.
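
For readers unfamiliar with the check that fires here: __module_get() takes an additional reference on a module that the caller is assumed to hold already, and it asserts that the current count is non-zero. Below is a minimal userspace model of that check, not the kernel source. The names struct fake_module, fake_module_get() and demo_zero_refcount() are hypothetical, and the single integer counter is an assumption made for illustration (the kernel keeps per-CPU counters and uses BUG_ON rather than a return value).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct module; the real counter is per-CPU. */
struct fake_module {
	int refcount;
};

/*
 * Models __module_get(): the caller is expected to hold a reference already.
 * Returns false at the point where the kernel would hit its BUG_ON().
 */
static bool fake_module_get(struct fake_module *mod)
{
	if (mod->refcount == 0)
		return false;
	mod->refcount++;
	return true;
}

/* The situation in the report: an ioctl arrives while the count is zero. */
static void demo_zero_refcount(void)
{
	struct fake_module loop = { .refcount = 0 };

	assert(!fake_module_get(&loop));
	assert(loop.refcount == 0);
}
```

In this model the failure is only a return value; in the kernel the same condition produces the "kernel BUG at .../module.h" report.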

J

loop: module loaded
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [email protected]
------------[ cut here ]------------
kernel BUG at /home/jeremy/hg/xen/paravirt/linux/include/linux/module.h:396!
invalid opcode: 0000 [#1]
PREEMPT SMP
Modules linked in: dm_snapshot dm_mod loop
CPU: 1
EIP: 0061:[<d085a911>] Not tainted VLI
EFLAGS: 00010246 (2.6.21-paravirt #1339)
EIP is at lo_ioctl+0x65/0xa52 [loop]
eax: 00000000 ebx: cfb92c98 ecx: d085e480 edx: 00000200
esi: 00004c00 edi: cf8ad428 ebp: cf37fdc0 esp: cf37fbf8
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0069
Process losetup (pid: 440, ti=cf37e000 task=cf30b4d0 task.ti=cf37e000)
Stack: c1390080 c1390080 cf37fc10 00000008 cf8ad428 cfbb2258 00000000 cf37fc34
c01458c5 cfb92c98 c1392a40 cf30ba70 cf30b4d0 cf30ba54 00000002 cf30ba70
cf30b4d0 cf30ba54 00000002 00000003 c134c088 c134c088 cf37fc90 c01215b8
Call Trace:
[<c0109173>] show_trace_log_lvl+0x1a/0x30
[<c0109226>] show_stack_log_lvl+0x9d/0xa5
[<c0109425>] show_registers+0x1f7/0x336
[<c010967d>] die+0x119/0x21b
[<c0382c78>] do_trap+0x8a/0xa4
[<c0109ad1>] do_invalid_op+0x88/0x92
[<c0382a42>] error_code+0x72/0x78
[<c0211c79>] blkdev_driver_ioctl+0x4c/0x5d
[<c02123de>] blkdev_ioctl+0x754/0x7a2
[<c0198840>] block_ioctl+0x1b/0x1f
[<c0182e2e>] do_ioctl+0x22/0x68
[<c01830a6>] vfs_ioctl+0x232/0x245
[<c0183102>] sys_ioctl+0x49/0x63
[<c0108080>] syscall_call+0x7/0xb
=======================
Code: ff 83 f8 06 0f 87 5a 09 00 00 ff 24 85 5c bc 85 d0 8b 9b cc 01 00 00 b8 80 e4 85 d0 89 9d 5c fe ff ff e8 37 0a 8f ef 85 c0 75 04 <0f> 0b eb fe b8 01 00 00 00 e8 79 9f 8c ef e8 ec 4b 9c ef c1 e0
EIP: [<d085a911>] lo_ioctl+0x65/0xa52 [loop] SS:ESP 0069:cf37fbf8



2007-05-10 00:06:47

by Andrew Morton

Subject: Re: 2.6.21-git11: BUG in loop.ko

On Wed, 09 May 2007 16:52:41 -0700
Jeremy Fitzhardinge <[email protected]> wrote:

> Seems to be getting a 0 refcount. I don't see anything in the recent
> changes which might cause this, but this is relatively new behaviour.
> It was working for me in the 2.6.21-pre time period, but I haven't tried
> this since 2.6.21 was released.
>
> The BUG is actually triggered by the __module_get(THIS_MODULE) in
> loop_set_fd.
>
> J
>
> [...]

A few people have been playing with module refcounting lately. Did you
work out a reproduce-it recipe?

2007-05-10 00:21:04

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git11: BUG in loop.ko

Andrew Morton wrote:
> On Wed, 09 May 2007 16:52:41 -0700
> Jeremy Fitzhardinge <[email protected]> wrote:
>
>
>> Seems to be getting a 0 refcount. I don't see anything in the recent
>> changes which might cause this, but this is relatively new behaviour.
>> It was working for me in the 2.6.21-pre time period, but I haven't tried
>> this since 2.6.21 was released.
>>
>> The BUG is actually triggered by the __module_get(THIS_MODULE) in
>> loop_set_fd.
>>
>> J
>>
>> [...]
>
> A few people have been playing with module refcounting lately. Did you
> work out a reproduce-it recipe?
>


100% reliable, but a bit obscure. I'm booting an FC6 livecd with a
paravirt_ops kernel under Xen. The relevant part of the iso's initrd
script is:

+ mknod /dev/loop118 b 7 118
+ mknod /dev/loop119 b 7 119
+ mknod /dev/loop120 b 7 120
+ mknod /dev/loop121 b 7 121
+ mkdir -p /dev/mapper
+ mknod /dev/mapper/control c 10 63
+ modprobe loop max_loop=128
loop: the max_loop option is obsolete and will be removed in March 2008
loop: module loaded
+ modprobe dm_snapshot
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [email protected]
+ '[' 0 == 1 ']'
+ losetup /dev/loop120 /sysroot/squashfs.img
------------[ cut here ]------------
kernel BUG at /home/jeremy/hg/xen/paravirt/linux/include/linux/module.h:396!
invalid opcode: 0000 [#1]
PREEMPT SMP
Modules linked in: dm_snapshot dm_mod loop
CPU: 0
EIP: 0061:[<d085a911>] Not tainted VLI
EFLAGS: 00010246 (2.6.21-paravirt #1339)
[...]


J

2007-05-10 07:32:36

by Alexey Dobriyan

Subject: Re: 2.6.21-git11: BUG in loop.ko

On Wed, May 09, 2007 at 05:20:59PM -0700, Jeremy Fitzhardinge wrote:
> Andrew Morton wrote:
> > On Wed, 09 May 2007 16:52:41 -0700
> > Jeremy Fitzhardinge <[email protected]> wrote:
> >
> >
> >> Seems to be getting a 0 refcount. I don't see anything in the recent
> >> changes which might cause this, but this is relatively new behaviour.
> >> It was working for me in the 2.6.21-pre time period, but I haven't tried
> >> this since 2.6.21 was released.
> >>
> >> The BUG is actually triggered by the __module_get(THIS_MODULE) in
> >> loop_set_fd.
> >>
> >> J
> >>
> >> [...]
> >
> > A few people have been playing with module refcounting lately. Did you
> > work out a reproduce-it recipe?
> >
>
>
> 100% reliable, but a bit obscure. I'm booting an FC6 livecd with a
> paravirt_ops kernel under Xen. The relevant part of the iso's initrd
> script is:
>
> + mknod /dev/loop118 b 7 118
> + mknod /dev/loop119 b 7 119
> + mknod /dev/loop120 b 7 120
> + mknod /dev/loop121 b 7 121
> + mkdir -p /dev/mapper
> + mknod /dev/mapper/control c 10 63
> + modprobe loop max_loop=128
> loop: the max_loop option is obsolete and will be removed in March 2008
> loop: module loaded
> + modprobe dm_snapshot
> device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [email protected]
> + '[' 0 == 1 ']'
> + losetup /dev/loop120 /sysroot/squashfs.img
> ------------[ cut here ]------------
> kernel BUG at /home/jeremy/hg/xen/paravirt/linux/include/linux/module.h:396!
> invalid opcode: 0000 [#1]
> PREEMPT SMP
> Modules linked in: dm_snapshot dm_mod loop
> CPU: 0
> EIP: 0061:[<d085a911>] Not tainted VLI
> EFLAGS: 00010246 (2.6.21-paravirt #1339)
> [...]

This must be caused by the dynamic loop device creation patch.

Steps to reproduce:

mknod foo b 7 1
losetup foo 1.img

where "7 1" is a major/minor pair that isn't created by udev et al.
after module loading.

2007-05-10 10:27:08

by Alexey Dobriyan

Subject: Re: 2.6.21-git11: BUG in loop.ko

On Thu, May 10, 2007 at 11:39:49AM +0400, Alexey Dobriyan wrote:
> On Wed, May 09, 2007 at 05:20:59PM -0700, Jeremy Fitzhardinge wrote:
> > 100% reliable, but a bit obscure. I'm booting an FC6 livecd with a
> > paravirt_ops kernel under Xen. The relevant part of the iso's initrd
> > script is:
> >
> > + mknod /dev/loop118 b 7 118
> > + mknod /dev/loop119 b 7 119
> > + mknod /dev/loop120 b 7 120
> > + mknod /dev/loop121 b 7 121
> > + mkdir -p /dev/mapper
> > + mknod /dev/mapper/control c 10 63
> > + modprobe loop max_loop=128
> > loop: the max_loop option is obsolete and will be removed in March 2008
> > loop: module loaded
> > + modprobe dm_snapshot
> > device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [email protected]
> > + '[' 0 == 1 ']'
> > + losetup /dev/loop120 /sysroot/squashfs.img
> > ------------[ cut here ]------------
> > kernel BUG at /home/jeremy/hg/xen/paravirt/linux/include/linux/module.h:396!
> > invalid opcode: 0000 [#1]
> > PREEMPT SMP
> > Modules linked in: dm_snapshot dm_mod loop
> > CPU: 0
> > EIP: 0061:[<d085a911>] Not tainted VLI
> > EFLAGS: 00010246 (2.6.21-paravirt #1339)
> > [...]
>
> This must be caused by dynamic loop devices creation patch
>
> Steps to reproduce:
>
> mknod foo b 7 1
> losetup foo 1.img
>
> where "7 1" is major/minor pair which doesn't created by udev et al
> after module loading.

I don't understand what "+ 1" is doing in lo_open(). Off by one?
Its removal would certainly fix the creation of a random number of loop
devices after just "losetup -a".

Also, the refcount of the loop module can be made negative via

mknod foo b 7 42
losetup -d foo # sic

2007-05-11 06:11:08

by Ken Chen

Subject: Re: 2.6.21-git11: BUG in loop.ko

On 5/9/07, Andrew Morton <[email protected]> wrote:
> On Wed, 09 May 2007 16:52:41 -0700
> Jeremy Fitzhardinge <[email protected]> wrote:
>
> > Seems to be getting a 0 refcount. I don't see anything in the recent
> > changes which might cause this, but this is relatively new behaviour.
> > It was working for me in the 2.6.21-pre time period, but I haven't tried
> > this since 2.6.21 was released.
> >
> > The BUG is actually triggered by the __module_get(THIS_MODULE) in
> > loop_set_fd.
>
> A few people have been playing with module refcounting lately. Did you
> work out a reproduce-it recipe?

Ah, it's a misunderstanding of what a kobj_probe_t function is supposed
to return on success. When we open a loop device that has not been
initialized, we probe it via:

do_open
  get_gendisk
    kobj_lookup
      loop_probe

Notice that in kobj_lookup(), when p->probe() returns a non-zero value
(I presume it is an -ERRNO), it breaks out of the loop and propagates
the return value; otherwise, it loops back to the beginning of the for
loop and retries, and in there get_disk() will be called via p->lock()
to get a ref against the module.

kobj_lookup(...) {
retry:
	mutex_lock(domain->lock);
	for (p = domain->probes[MAJOR(dev) % 255]; p; p = p->next) {
		...
		if (kobj)
			return kobj;
		goto retry;
	}
	...
}

So loop_probe() mistakenly returned the wrong status, which leads to a
later oops on an inconsistent module ref count. The following patch
fixes the issue.

Signed-off-by: Ken Chen <[email protected]>


diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 18cdd8c..40f7bc2 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1460,6 +1460,7 @@ static void loop_del_one(struct loop_device *lo)
 	kfree(lo);
 }

+/* return NULL for success, or return non-zero value if there are error */
 static struct kobject *loop_probe(dev_t dev, int *part, void *data)
 {
 	unsigned int number = dev & MINORMASK;
@@ -1474,8 +1475,8 @@ static struct kobject *loop_probe
 	*part = 0;
 	if (IS_ERR(lo))
 		return (void *)lo;
-	else
-		return &lo->lo_disk->kobj;
+
+	return NULL;
 }

 static int __init loop_init(void)
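
The probe contract described in this message can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: struct model, exact_lookup(), range_probe(), model_lookup() and refs_after_open() are hypothetical names, the lookup is reduced to two entries (an exact per-disk entry whose lock step takes the module reference, and the wide range entry that plays loop_probe()), and locking and error handling are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
	int module_refs;	/* stands in for the loop module refcount */
	int disk_exists;	/* set once the probe has created the disk */
	int disk;		/* dummy object; its address plays the kobject */
};

/* Plays the exact-match entry registered when a disk is added: its
 * lock step pins the module before the object is handed out. */
static void *exact_lookup(struct model *m)
{
	if (!m->disk_exists)
		return NULL;
	m->module_refs++;	/* models get_disk() called via p->lock() */
	return &m->disk;
}

/* Plays loop_probe(): creates the device on demand. With buggy != 0 it
 * returns the object directly, as the code did before the patch. */
static void *range_probe(struct model *m, int buggy)
{
	m->disk_exists = 1;
	return buggy ? &m->disk : NULL;
}

/* Simplified lookup: exact entry first, then the range probe. A NULL
 * probe result means "retry"; a non-NULL result is returned as-is. */
static void *model_lookup(struct model *m, int buggy)
{
	for (;;) {
		void *obj = exact_lookup(m);

		if (obj)
			return obj;
		obj = range_probe(m, buggy);
		if (obj)
			return obj;	/* no reference taken on this path */
	}
}

/* Returns the module refcount seen after one open-style lookup. */
static int refs_after_open(int buggy)
{
	struct model m = { 0, 0, 0 };
	void *obj = model_lookup(&m, buggy);

	assert(obj != NULL);
	return m.module_refs;
}
```

In this model refs_after_open(1) leaves the count at zero, which is the state in which __module_get() in loop_set_fd trips its check, while refs_after_open(0) retries through the exact entry and ends up holding one reference. A later release that drops a reference never taken would presumably account for the negative refcount mentioned earlier in the thread.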

2007-05-11 07:40:29

by Roland

Subject: Re: 2.6.21-git11: BUG in loop.ko

>+ modprobe loop max_loop=128
>loop: the max_loop option is obsolete and will be removed in March 2008

Looks like FC6 already contains that "remove artificial software max_loop limit" patch?

What about mainline: will it be in 2.6.22, as Andrew "estimated" when the patch showed up?

I'm asking because I'm just curious. It seems that this was put "on hold", so can someone confirm whether we need to wait for 2.6.23 or later?

sorry for the noise
roland


2007-05-11 15:09:35

by Jeremy Fitzhardinge

Subject: Re: 2.6.21-git11: BUG in loop.ko

Ken Chen wrote:
> So loop_probe() mistakenly returned wrong status and leads to future
> oops on inconsistent module ref count. The following patch fixes the
> issue.
>
> Signed-off-by: Ken Chen <[email protected]>

Yep, works for me.

Acked-by: Jeremy Fitzhardinge <[email protected]>

J