Hello Herbert, Dave-
I have a little "lockup" in libcrc32c on boot and I don't know how to
solve it correctly but there are a few ideas.
First, note this bug is happening with bnx2x. But, since I didn't have
the hardware I added usage of crc32c to bnx2 to reproduce.
Here are backtraces of the two processes and the relevant dmesg
output:
[ 61.816131] ehci_hcd 0000:00:1d.7: suspend root hub
[ 91.512113] bnx2: gave up waiting for init of module libcrc32c.
[ 91.524105] bnx2: Unknown symbol crc32c
[ 92.215003] bp: lib/libcrc32c.c:66 libcrc32c_mod_init crypto_alloc_shash SUCCESS
[ 92.401131] Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.4 (Mar 03, 2010)
[ 92.401262] alloc irq_desc for 16 on node 0
[ 92.401266] alloc kstat_irqs on node 0
[ 92.401282] bnx2 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
The issue is that bnx2 is waiting for crc32c on use_module while
libcrc32c is trying to insert crc32-generic to resolve the symbol. You
can see it here via the process stacks for modprobe:
[0]kdb> btp 1661 # modprobe for bnx2
Stack traceback for pid 1661
0xffff88051209ddc0 1661 1657 0 13 S 0xffff88051209e300 modprobe
[ 75.797063] ffff8805131e7c78 0000000000000046 0000000000000000 0000000000000000
[ 75.797063] 000000000000e480 ffff8805131e7fd8 ffff88051209ddc0 ffff88051209e1c0
[ 75.797063] 00000000ffff1edb ffffffff8109517d ffff8800283ae480 ffff8800283b3e18
[ 75.797063] Call Trace:
[ 75.797063] [<ffffffff81437785>] schedule_timeout+0x1e5/0x2f0
[ 75.797063] [<ffffffff810a5e7b>] use_module+0x2eb/0x330
[ 75.797063] [<ffffffff810a7605>] load_module+0x1545/0x1e10
[ 75.797063] [<ffffffff810a7f51>] sys_init_module+0x81/0x2b0
[ 75.797063] [<ffffffff810031ab>] system_call_fastpath+0x16/0x1b
[ 75.797063] [<00007f4bf994dfba>] 0x7f4bf994dfba
[0]kdb>
[0]kdb>
[0]kdb>
[0]kdb> btp 1662 #modprobe for libcrc32c
Stack traceback for pid 1662 # modprobe for libcrc32c blocking on
0xffff880513278980 1662 1658 0 13 D 0xffff880513278ec0 modprobe
[ 75.797063] ffff88051254bb28 0000000000000046 ffff88051254baf8 ffff88051254baf4
[ 75.797063] 000000000000e480 ffff88051254bfd8 ffff880513278980 ffff880513278d80
[ 75.797063] 00000000ffff1e51 ffff8800282d3e00 ffff8800283ae480 ffff8800283b3e18
[ 75.797063] Call Trace:
[ 75.797063] [<ffffffff814377fd>] schedule_timeout+0x25d/0x2f0
[ 75.797063] [<ffffffff8143652b>] wait_for_common+0xdb/0x190
[ 75.797063] [<ffffffff8143669d>] wait_for_completion+0x1d/0x20
[ 75.797063] [<ffffffff810787d2>] call_usermodehelper_exec+0xd2/0xe0
[ 75.797063] [<ffffffff81078a7c>] __request_module+0x19c/0x1f0
[ 75.797063] [<ffffffff8122184f>] crypto_larval_lookup+0x10f/0x210
[ 75.797063] [<ffffffff812219b3>] crypto_alg_mod_lookup+0x63/0xf0
[ 75.797063] [<ffffffff81220cae>] crypto_find_alg+0xae/0xd0
[ 75.797063] [<ffffffff812215a4>] crypto_alloc_tfm+0x84/0x100
[ 75.797063] [<ffffffff81227fb9>] crypto_alloc_shash+0x19/0x20
[ 75.797063] [<ffffffffa0131035>] libcrc32c_mod_init+0x35/0x72 [libcrc32c]
[ 75.797063] [<ffffffff810001ed>] do_one_initcall+0x3d/0x1c0
[ 75.797063] [<ffffffff810a7fbd>] sys_init_module+0xed/0x2b0
[ 75.797063] [<ffffffff810031ab>] system_call_fastpath+0x16/0x1b
[ 75.797063] [<00007f4aeefaffba>] 0x7f4aeefaffba
Possible solutions:
1) Figure out the right place to drop the module_lock while waiting on
other modules to load. This boots and seems to work but who knows what
it will break. :) This might be the right fix after investigating
possible issues.
Index: linux-2.6.32/kernel/module.c
===================================================================
--- linux-2.6.32.orig/kernel/module.c
+++ linux-2.6.32/kernel/module.c
@@ -678,6 +678,8 @@ int use_module(struct module *a, struct
if (b == NULL || already_uses(a, b)) return 1;
+ mutex_unlock(&module_mutex);
+
/* If we're interrupted or time out, we fail. */
if (wait_event_interruptible_timeout(
module_wq, (err = strong_try_module_get(b)) != -EBUSY,
@@ -687,6 +689,8 @@ int use_module(struct module *a, struct
return 0;
}
+ mutex_lock(&module_mutex);
+
/* If strong_try_module_get() returned a different error, we fail. */
if (err)
return 0;
2) Force libcrc32c to be built in?
3) Tell everyone to load it first thing in initrd? This seems hacky.
What are you thoughts on the right way to fix this?
Cheers,
Brandon
On Fri, Mar 19, 2010 at 04:40:45PM -0700, Brandon Philips wrote:
>
> Possible solutions:
>
> 1) Figure out the right place to drop the module_lock while waiting on
> other modules to load. This boots and seems to work but who knows what
> it will break. :) This might be the right fix after investigating
> possible issues.
I'm not sure that this is the problem.
IIRC modprobe(8) in user-space is supposed to load all dependencies
prior to loading the main module. So something must've gone wrong
before bnx2/bnx2x was loaded. Please strace it to see why libcrc32c
wasn't loaded successfully before the kernel ever saw bnx2/bnx2x.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sat, Mar 20, 2010 at 09:01:15AM +0800, Herbert Xu wrote:
>
> IIRC modprobe(8) in user-space is supposed to load all dependencies
> prior to loading the main module. So something must've gone wrong
> before bnx2/bnx2x was loaded. Please strace it to see why libcrc32c
> wasn't loaded successfully before the kernel ever saw bnx2/bnx2x.
In fact I just tried it on my machine and modprobe bnx2x correctly
loaded libcrc32c + crc32c before loading bnx2x (you don't need the
hardware, it'll stay loaded). My kernel is 2.6.33-rc8.
I suspect it's a user-space problem. I'm using module-init-tools
3.4-1 from Debian.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
From: Herbert Xu <[email protected]>
Date: Sat, 20 Mar 2010 09:08:49 +0800
> On Sat, Mar 20, 2010 at 09:01:15AM +0800, Herbert Xu wrote:
>>
>> IIRC modprobe(8) in user-space is supposed to load all dependencies
>> prior to loading the main module. So something must've gone wrong
>> before bnx2/bnx2x was loaded. Please strace it to see why libcrc32c
>> wasn't loaded successfully before the kernel ever saw bnx2/bnx2x.
>
> In fact I just tried it on my machine and modprobe bnx2x correctly
> loaded libcrc32c + crc32c before loading bnx2x (you don't need the
> hardware, it'll stay loaded). My kernel is 2.6.33-rc8.
>
> I suspect it's a user-space problem. I'm using module-init-tools
> 3.4-1 from Debian.
Even if userland "does" this, nothing stops another thread
from rmmod'ing the dependency in between the modloads.
We have to handle this case properly in the kernel, no matter
what.
On Fri, Mar 19, 2010 at 07:46:12PM -0700, David Miller wrote:
>
> Even if userland "does" this, nothing stops another thread
> from rmmod'ing the dependency in between the modloads.
>
> We have to handle this case properly in the kernel, no matter
> what.
Right, if that had happened I think the reasonable response would
be to fail the subsequent load for lack of dependency rather than
looping around forever.
However, I don't think that's what Brandon is seeing here, unless
there is something performing the rmmod?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On 09:08 Sat 20 Mar 2010, Herbert Xu wrote:
> On Sat, Mar 20, 2010 at 09:01:15AM +0800, Herbert Xu wrote:
> >
> > IIRC modprobe(8) in user-space is supposed to load all dependencies
> > prior to loading the main module. So something must've gone wrong
> > before bnx2/bnx2x was loaded. Please strace it to see why libcrc32c
> > wasn't loaded successfully before the kernel ever saw bnx2/bnx2x.
>
> In fact I just tried it on my machine and modprobe bnx2x correctly
> loaded libcrc32c + crc32c before loading bnx2x (you don't need the
> hardware, it'll stay loaded). My kernel is 2.6.33-rc8.
>
> I suspect it's a user-space problem. I'm using module-init-tools
> 3.4-1 from Debian.
This is module-init-tools 3.11.1 and SUSE Linux Enterprise SP1 RC1. I
can only reproduce on boot (thus why I changed bnx2 which I have) so
perhaps SUSE is doing something odd in how it loads modules at boot.
I will trace the boot process and get more information on exactly what
happens before it gets stuck.
Do you see how it is getting stuck though with modprobe bnx2 and
modprobe libcrc32c going at the same time though?
Cheers,
Brandon
On Fri, Mar 19, 2010 at 09:21:03PM -0700, Brandon Philips wrote:
>
> Do you see how it is getting stuck though with modprobe bnx2 and
> modprobe libcrc32c going at the same time though?
Yes I agree that's a bug. Something in those paths should simply
fail instead of waiting around.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
From: Herbert Xu <[email protected]>
Date: Sat, 20 Mar 2010 12:24:34 +0800
> On Fri, Mar 19, 2010 at 09:21:03PM -0700, Brandon Philips wrote:
>>
>> Do you see how it is getting stuck though with modprobe bnx2 and
>> modprobe libcrc32c going at the same time though?
>
> Yes I agree that's a bug. Something in those paths should simply
> fail instead of waiting around.
I hear what you're saying Herbert, but thinking about this a bit I
really think we should make this situation work instead of fail.
On Fri, Mar 19, 2010 at 10:23:25PM -0700, David Miller wrote:
>
> I hear what you're saying Herbert, but thinking about this a bit I
> really think we should make this situation work instead of fail.
I think the initial report perhaps painted this in a slight
different fashion than what it really is. The code that was
looping in module.c is not trying to load libcrc32c, but rather
it is trying to get a reference on the already-loaded libcrc32c
module.
AFAICS the only way to make it "work" would be to reload the
module in question when we can't get a reference on it. But
that would entail recursively loading a module during the process
of loading another module.
Rusty can chime in on whether this is doable.
I think I have a good guess as to why this problem is occuring
for Brandon. It is probably the result of two near-simultaneous
modprobes, one issued against libcrc32c and one against bnx2x.
The libcrc32c module is partially loaded to the point of invoking
its init function, which then tries to modprobe crc32c.
However, before this starts the modprobe on bnx2x is already in
progression. When bnx2x's loading tries to acquire a reference
on libcrc32c which it depends on, we hit the dead-lock.
So if Suse were doing some kind of parallel booting where multiple
modules may be loaded together then this could occur.
The easiest solution again would be for modprobe(8) to block the
loading of bnx2x because the module that it depends on libcrc32c
hasn't yet finished loading.
I'm open to a kernel solution too if anyone has suggestions.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sat, Mar 20, 2010 at 08:29:59PM +0800, Herbert Xu wrote:
> On Fri, Mar 19, 2010 at 10:23:25PM -0700, David Miller wrote:
> >
> > I hear what you're saying Herbert, but thinking about this a bit I
> > really think we should make this situation work instead of fail.
>
> I think the initial report perhaps painted this in a slight
> different fashion than what it really is. The code that was
> looping in module.c is not trying to load libcrc32c, but rather
> it is trying to get a reference on the already-loaded libcrc32c
> module.
>
> AFAICS the only way to make it "work" would be to reload the
> module in question when we can't get a reference on it. But
> that would entail recursively loading a module during the process
> of loading another module.
>
> Rusty can chime in on whether this is doable.
>
> I think I have a good guess as to why this problem is occuring
> for Brandon. It is probably the result of two near-simultaneous
> modprobes, one issued against libcrc32c and one against bnx2x.
>
> The libcrc32c module is partially loaded to the point of invoking
> its init function, which then tries to modprobe crc32c.
>
> However, before this starts the modprobe on bnx2x is already in
> progression. When bnx2x's loading tries to acquire a reference
> on libcrc32c which it depends on, we hit the dead-lock.
>
> So if Suse were doing some kind of parallel booting where multiple
> modules may be loaded together then this could occur.
>
> The easiest solution again would be for modprobe(8) to block the
> loading of bnx2x because the module that it depends on libcrc32c
> hasn't yet finished loading.
>
> I'm open to a kernel solution too if anyone has suggestions.
>
FWIW, this sounds like a regression in modprobe to me. A few years ago I fixed
a deadlock condition in the netfilter conntrack code that was tickled by
parallel rmmod's and modprobes. modprobe would take file locks on modules, and
if the same module was getting rmmodded and modprobed in parallel we'd wind up
with a deadlock. I fixed it by making the default modprobe -r behavior to be
non-blocking (which is the same as rmmod). That commit is here:
http://git.kernel.org/?p=utils/kernel/module-init-tools/module-init-tools.git;a=commit;h=b45a24e9c89a14baf63bffe0a9ff04c1c1bffb29
Later, in late 2009, That behavior was reverted:
http://git.kernel.org/?p=utils/kernel/module-init-tools/module-init-tools.git;a=commit;h=b45a24e9c89a14baf63bffe0a9ff04c1c1bffb29
withuot consideration of the consequences, of which this sounds like one.
JCM I think is working on fixing the problem in a sane way. I'd suggested that
he reapply the patch, but IIRC he told me that hes planning on trying to fix it
by removing the file locking on the modules in userspace entirely, which I think
is also reasonable.
As a test, you might try massaging my old patch above into the latest
module-init-tools to see if it makes the problem go away. Note, the result of
this will be that either the modprobe or rmmod will fail and will need to be
retried, but its non-fatal, and a retry is usually successful, as it moves the
rmmod and modprobe further apart in time.
Regards
Neil
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <[email protected]>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
On Sat, 20 Mar 2010 10:59:59 pm Herbert Xu wrote:
> On Fri, Mar 19, 2010 at 10:23:25PM -0700, David Miller wrote:
> >
> > I hear what you're saying Herbert, but thinking about this a bit I
> > really think we should make this situation work instead of fail.
>
> I think the initial report perhaps painted this in a slight
> different fashion than what it really is. The code that was
> looping in module.c is not trying to load libcrc32c, but rather
> it is trying to get a reference on the already-loaded libcrc32c
> module.
>
> AFAICS the only way to make it "work" would be to reload the
> module in question when we can't get a reference on it. But
> that would entail recursively loading a module during the process
> of loading another module.
>
> Rusty can chime in on whether this is doable.
Not really. The kernel code is pretty simple: if we can't find a symbol
we need, fail the load. To avoid nasty spurious races, we *do* wait if the
symbol we need is in a module which is still initializing. But we time out
after 30 seconds to avoid a module trainwreck.
The real fix here is to drop the lock, like Brandon suggested, but we need
to do it more carefully: when we re-acquire the lock we need to re-lookup
the symbol in case the module has vanished or changed.
Brandon, I can't see how libcrc32c's module_init calls crypto_alloc_shash,
but the problem is reproducible with simple example modules. Does it fix
your problem?
Tim: note that use_module() return value changes.
Thanks,
Rusty.
module: drop the lock while waiting for module to complete initialization.
This fixes "gave up waiting for init of module libcrc32c." which
happened at boot time due to multiple parallel module loads.
The problem was a deadlock: we wait for a module to finish
initializing, but we keep the module_lock mutex so it can't complete.
In particular, this could reasonably happen if libcrc32c does a
request_module() in its initialization routine.
So we change use_module() to return an errno rather than a bool, and if
it's -EBUSY we drop the lock and wait in the caller, then reaquire the
lock.
Reported-by: Brandon Philips <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -510,33 +510,25 @@ int use_module(struct module *a, struct
struct module_use *use;
int no_warn, err;
- if (b == NULL || already_uses(a, b)) return 1;
+ if (b == NULL || already_uses(a, b)) return 0;
/* If we're interrupted or time out, we fail. */
- if (wait_event_interruptible_timeout(
- module_wq, (err = strong_try_module_get(b)) != -EBUSY,
- 30 * HZ) <= 0) {
- printk("%s: gave up waiting for init of module %s.\n",
- a->name, b->name);
- return 0;
- }
-
- /* If strong_try_module_get() returned a different error, we fail. */
+ err = strong_try_module_get(b);
if (err)
- return 0;
+ return err;
DEBUGP("Allocating new usage for %s.\n", a->name);
use = kmalloc(sizeof(*use), GFP_ATOMIC);
if (!use) {
printk("%s: out of memory loading\n", a->name);
module_put(b);
- return 0;
+ return -ENOMEM;
}
use->module_which_uses = a;
list_add(&use->list, &b->modules_which_use_me);
no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
- return 1;
+ return 0;
}
EXPORT_SYMBOL_GPL(use_module);
@@ -823,7 +815,7 @@ static inline void module_unload_free(st
int use_module(struct module *a, struct module *b)
{
- return strong_try_module_get(b) == 0;
+ return strong_try_module_get(b);
}
EXPORT_SYMBOL_GPL(use_module);
@@ -994,17 +986,39 @@ static const struct kernel_symbol *resol
struct module *owner;
const struct kernel_symbol *sym;
const unsigned long *crc;
+ DEFINE_WAIT(wait);
+ int err;
+ long timeleft = 30 * HZ;
+again:
sym = find_symbol(name, &owner, &crc,
!(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
- /* use_module can fail due to OOM,
- or module initialization or unloading */
- if (sym) {
- if (!check_version(sechdrs, versindex, name, mod, crc, owner)
- || !use_module(mod, owner))
- sym = NULL;
+ if (!sym)
+ return NULL;
+
+ if (!check_version(sechdrs, versindex, name, mod, crc, owner))
+ return NULL;
+
+ prepare_to_wait(&module_wq, &wait, TASK_INTERRUPTIBLE);
+ err = use_module(mod, owner);
+ if (likely(!err) || err != -EBUSY || signal_pending(current)) {
+ finish_wait(&module_wq, &wait);
+ return err ? NULL : sym;
}
- return sym;
+
+ /* Module is still loading. Drop lock and wait. */
+ mutex_unlock(&module_mutex);
+ timeleft = schedule_timeout(timeleft);
+ mutex_lock(&module_mutex);
+ finish_wait(&module_wq, &wait);
+
+ /* Module might be gone entirely, or replaced. Re-lookup. */
+ if (timeleft)
+ goto again;
+
+ printk("%s: gave up waiting for init of module %s.\n",
+ mod->name, owner->name);
+ return NULL;
}
/*
On 20:29 Sat 20 Mar 2010, Herbert Xu wrote:
> On Fri, Mar 19, 2010 at 10:23:25PM -0700, David Miller wrote:
> >
> > I hear what you're saying Herbert, but thinking about this a bit I
> > really think we should make this situation work instead of fail.
>
> I think the initial report perhaps painted this in a slight
> different fashion than what it really is. The code that was
> looping in module.c is not trying to load libcrc32c, but rather
> it is trying to get a reference on the already-loaded libcrc32c
> module.
You are correct. Sorry for the confusion.
> AFAICS the only way to make it "work" would be to reload the
> module in question when we can't get a reference on it. But
> that would entail recursively loading a module during the process
> of loading another module.
>
> Rusty can chime in on whether this is doable.
>
> I think I have a good guess as to why this problem is occuring
> for Brandon. It is probably the result of two near-simultaneous
> modprobes, one issued against libcrc32c and one against bnx2x.
>
> So if Suse were doing some kind of parallel booting where multiple
> modules may be loaded together then this could occur.
It is actually due to two nearly simultaneous modprobes of bnx2x from
udev since there are two devices:
02:00.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II
BCM5709 Gigabit Ethernet [14e4:1639] (rev 01)
02:00.1 Ethernet controller [0200]: Broadcom Corporation NetXtreme II
BCM5709 Gigabit Ethernet [14e4:1639] (rev 01)
Output from my modprobe strace wrappers:
Process 4589 (first to get to libcrc32c)
------------
4589 execve("/sbin/modprobe.bin", ["/sbin/modprobe.bin", "pci:v000014E4d00001639sv00001014"...], [/* 14 vars */]) = 0
...
4589 open("/lib/modules/2.6.32.9-debug/kernel/lib/libcrc32c.ko", O_RDONLY) = 3
4589 stat("/sys/module", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
4589 stat("/sys/module/libcrc32c", 0x7fff3d03bef0) = -1 ENOENT (No such file or directory)
4589 read(3, "\177E", 2) = 2
4589 lseek(3, 0, SEEK_SET) = 0
4589 fstat(3, {st_mode=S_IFREG|0644, st_size=90482, ...}) = 0
4589 mmap(NULL, 90482, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f7963d49000
4589 init_module(0x7f7963d49000, 90482, "") = 0
4589 munmap(0x7f7963d49000, 90482) = 0
4589 close(3) = 0
4589 open("/lib/modules/2.6.32.9-debug/kernel/drivers/net/bnx2.ko", O_RDONLY) = 3
4589 stat("/sys/module", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
4589 stat("/sys/module/bnx2", 0x7fff3d03bf90) = -1 ENOENT (No such file or directory)
4589 read(3, "\177E", 2) = 2
4589 lseek(3, 0, SEEK_SET) = 0
4589 fstat(3, {st_mode=S_IFREG|0644, st_size=950325, ...}) = 0
4589 mmap(NULL, 950325, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f7963c4c000
4589 init_module(0x7f7963c4c000, 950325, "") = 0
Process 4590 (loses out on libcrc32c and gets stuck)
------------
4590 execve("/sbin/modprobe.bin", ["/sbin/modprobe.bin", "pci:v000014E4d00001639sv00001014"...], [/* 14 vars */]) = 0
....
4590 open("/lib/modules/2.6.32.9-debug/kernel/lib/libcrc32c.ko",O_RDONLY) = 3
4590 stat("/sys/module", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
4590 stat("/sys/module/libcrc32c", 0x7ffff52a0ef0) = -1 ENOENT (No such file or directory)
4590 read(3, "\177E", 2) = 2
4590 lseek(3, 0, SEEK_SET) = 0
4590 fstat(3, {st_mode=S_IFREG|0644, st_size=90482, ...}) = 0
4590 mmap(NULL, 90482, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f1397a1b000
4590 init_module(0x7f1397a1b000, 90482, "") = -1 EEXIST (File exists)
4590 close(3) = 0
4590 open("/lib/modules/2.6.32.9-debug/kernel/drivers/net/bnx2.ko", O_RDONLY) = 3
4590 stat("/sys/module", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
4590 stat("/sys/module/bnx2", 0x7ffff52a0f90) = -1 ENOENT (No such file or directory)
4590 read(3, "\177E", 2) = 2
4590 lseek(3, 0, SEEK_SET) = 0
4590 fstat(3, {st_mode=S_IFREG|0644, st_size=950325, ...}) = 0
4590 mmap(NULL, 950325, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f139791e000
4590 init_module(0x7f139791e000, 950325, "") = -1 ENOENT (No such file or directory)
4590 write(2, "FATAL: Error inserting bnx2 (/li"..., 145) = 145
As you can see 4589 inserts libcrc32c and the other starts inserting
bnx2 after seeing libcrc32c was loaded. Creating our situation. Phew,
took me a bit to see what was really going on :)
Side note: Intially I thought that modprobe was forking for each
module dependency but that was incorrect after looking at the
code. udev is forking off two modprobes almost at the same time for
each bnx2x device alias when udevadm trigger is called during the boot
sequence.
FWIW, I don't think this is a SUSE thing as our udev is pretty lightly
patched.
Cheers,
Brandon
On Tue, Mar 30, 2010 at 18:50, Brandon Philips <[email protected]> wrote:
> Side note: Intially I thought that modprobe was forking for each
> module dependency but that was incorrect after looking at the
> code. udev is forking off two modprobes almost at the same time for
> each bnx2x device alias when udevadm trigger is called during the boot
> sequence.
Right, that's the standard behavior, and applies to all usual systems.
> FWIW, I don't think this is a SUSE thing as our udev is pretty lightly
> patched.
Yes, there is no SUSE patch at all. All usual systems ship the same stuff here.
Kay
On 09:36 Tue 30 Mar 2010, Rusty Russell wrote:
> The real fix here is to drop the lock, like Brandon suggested, but
> we need to do it more carefully: when we re-acquire the lock we need
> to re-lookup the symbol in case the module has vanished or changed.
>
> Brandon, I can't see how libcrc32c's module_init calls
> crypto_alloc_shash, but the problem is reproducible with simple
> example modules. Does it fix your problem?
Did you see my email yesterday explaining how this came about? Does
that answer your question?
http://www.mail-archive.com/[email protected]/msg04363.html
> module: drop the lock while waiting for module to complete initialization.
>
> This fixes "gave up waiting for init of module libcrc32c." which
> happened at boot time due to multiple parallel module loads.
Since the root cause is known a bit more detail can be added:
This fixes "gave up waiting for init of module libcrc32c." which is
triggered by parallel modprobes by udev for the same driver. For
example on a machine with two network interfaces:
02:00.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II
BCM5709 Gigabit Ethernet [14e4:1639] (rev 01)
02:00.1 Ethernet controller [0200]: Broadcom Corporation NetXtreme II
BCM5709 Gigabit Ethernet [14e4:1639] (rev 01)
udev will fork off two modprobe processes:
/sbin/modprobe pci:v000014E4d00001639sv00001014sd0000037Cbc02sc00i00
/sbin/modprobe pci:v000014E4d00001639sv00001014sd0000037Cbc02sc00i00
> The problem was a deadlock: we wait for a module to finish
> initializing, but we keep the module_lock mutex so it can't
> complete. In particular, this could reasonably happen if libcrc32c
> does a request_module() in its initialization routine.
>
> So we change use_module() to return an errno rather than a bool, and if
> it's -EBUSY we drop the lock and wait in the caller, then reaquire the
> lock.
>
> Reported-by: Brandon Philips <[email protected]>
> Signed-off-by: Rusty Russell <[email protected]>
Reviewed the patch and it looks good and I tested it on the machine
and it works. A couple of trivial things inline below if you care.
Signed-off-by: Brandon Philips <[email protected]>
Cc: [email protected]
Thanks Rusty.
Cheers,
Brandon
> - if (b == NULL || already_uses(a, b)) return 1;
> + if (b == NULL || already_uses(a, b)) return 0;
if (b == NULL || already_uses(a, b))
return 0;
> /* If we're interrupted or time out, we fail. */
> - if (wait_event_interruptible_timeout(
> - module_wq, (err = strong_try_module_get(b)) != -EBUSY,
> - 30 * HZ) <= 0) {
> - printk("%s: gave up waiting for init of module %s.\n",
> - a->name, b->name);
> - return 0;
> - }
> -
> - /* If strong_try_module_get() returned a different error, we fail. */
> + err = strong_try_module_get(b);
> if (err)
> - return 0;
> + return err;
>
> DEBUGP("Allocating new usage for %s.\n", a->name);
> use = kmalloc(sizeof(*use), GFP_ATOMIC);
> if (!use) {
> printk("%s: out of memory loading\n", a->name);
printk(KERN_ERR "%s: out of memory loading\n", a->name);
> module_put(b);
> - return 0;
> + return -ENOMEM;
> }
>
> use->module_which_uses = a;
> list_add(&use->list, &b->modules_which_use_me);
> no_warn = sysfs_create_link(b->holders_dir, &a->mkobj.kobj, a->name);
> - return 1;
> + return 0;
> }
> EXPORT_SYMBOL_GPL(use_module);
>
> @@ -823,7 +815,7 @@ static inline void module_unload_free(st
>
> int use_module(struct module *a, struct module *b)
> {
> - return strong_try_module_get(b) == 0;
> + return strong_try_module_get(b);
> }
> EXPORT_SYMBOL_GPL(use_module);
>
> @@ -994,17 +986,39 @@ static const struct kernel_symbol *resol
> struct module *owner;
> const struct kernel_symbol *sym;
> const unsigned long *crc;
> + DEFINE_WAIT(wait);
> + int err;
> + long timeleft = 30 * HZ;
>
> +again:
> sym = find_symbol(name, &owner, &crc,
> !(mod->taints & (1 << TAINT_PROPRIETARY_MODULE)), true);
> - /* use_module can fail due to OOM,
> - or module initialization or unloading */
> - if (sym) {
> - if (!check_version(sechdrs, versindex, name, mod, crc, owner)
> - || !use_module(mod, owner))
> - sym = NULL;
> + if (!sym)
> + return NULL;
> +
> + if (!check_version(sechdrs, versindex, name, mod, crc, owner))
> + return NULL;
> +
> + prepare_to_wait(&module_wq, &wait, TASK_INTERRUPTIBLE);
> + err = use_module(mod, owner);
> + if (likely(!err) || err != -EBUSY || signal_pending(current)) {
> + finish_wait(&module_wq, &wait);
> + return err ? NULL : sym;
> }
> - return sym;
> +
> + /* Module is still loading. Drop lock and wait. */
> + mutex_unlock(&module_mutex);
> + timeleft = schedule_timeout(timeleft);
> + mutex_lock(&module_mutex);
> + finish_wait(&module_wq, &wait);
> +
> + /* Module might be gone entirely, or replaced. Re-lookup. */
> + if (timeleft)
> + goto again;
> +
> + printk("%s: gave up waiting for init of module %s.\n",
> + mod->name, owner->name);
printk(KERN_ERR "%s: gave up waiting for init of module %s.\n",
mod->name, owner->name);
> + return NULL;
> }
>
> /*
>
On Thu, 1 Apr 2010 05:33:51 am Brandon Philips wrote:
> On 09:36 Tue 30 Mar 2010, Rusty Russell wrote:
> > The real fix here is to drop the lock, like Brandon suggested, but
> > we need to do it more carefully: when we re-acquire the lock we need
> > to re-lookup the symbol in case the module has vanished or changed.
> >
> > Brandon, I can't see how libcrc32c's module_init calls
> > crypto_alloc_shash, but the problem is reproducible with simple
> > example modules. Does it fix your problem?
>
> Did you see my email yesterday explaining how this came about? Does
> that answer your question?
Yep, thanks.
> Reviewed the patch and it looks good and I tested it on the machine
> and it works. A couple of trivial things inline below if you care.
>
> Signed-off-by: Brandon Philips <[email protected]>
> Cc: [email protected]
>
> Thanks Rusty.
Thanks; already had the two caught by checkpatch; changing the other
printk is not something I like to do in the same patch.
Thanks!
Rusty.