On recent linux-next, iptables and iwl3945 would fail to load
Bisected down to this commit:
87e10115fb652a966965da1ac305cb57e6db5a45 is first bad commit
commit 87e10115fb652a966965da1ac305cb57e6db5a45
Author: Arjan van de Ven <[email protected]>
Date: Sun Feb 8 10:42:01 2009 -0800
module: create a request_module_nowait()
There seems to be a common pattern in the kernel where drivers want to
call request_module() from inside a module_init() function. Currently
this would deadlock.
As a result, several drivers go through hoops like scheduling things via
kevent, or creating custom work queues (because kevent can deadlock on them).
This patch changes this to use a request_module_nowait() function macro instead,
which just fires the modprobe off but doesn't wait for it, and thus avoids the
original deadlock entirely.
On my laptop this already results in one less kernel thread running..
Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
:040000 040000 02a1e199052893007e41f161997e552cbc5f3c1b 89c67f6d538089bdffd8a15dfccbf4e11cd06b0b M include
:040000 040000 dc67697b7b18db74a3cf13631877dc81dd8da83b 4edd476bccbb2ba62d34625290ec878d29158d6b M kernel
Reverting this commit and 1ae06b4e8430b44872422cff235faa5610d3e79b (the following
cleanup fix) makes iptables and iwl3945 start working again.
[email protected] wrote:
> On recent linux-next, iptables and iwl3945 would fail to load
>
> Bisected down to this commit:
>
this is ... curious. I just looked at the patch again and it really
looks like it really does not impact the existing codepaths...
do you have any more diagnostics other than "they do not work" ?
On Monday 16 March 2009 12:28:04 [email protected] wrote:
> On recent linux-next, iptables and iwl3945 would fail to load
Please send .config; I can't reproduce this here.
Thanks,
Rusty.
On Tue, 17 Mar 2009 16:27:58 +1030, Rusty Russell said:
> On Monday 16 March 2009 12:28:04 [email protected] wrote:
> > On recent linux-next, iptables and iwl3945 would fail to load
> Please send .config; I can't reproduce this here.
Attached are 2 .config's - config.gz is my usual laptop config, and
config-minimal.gz is the stripped-down version I was using while bisecting.
The failure mode was *really* odd - I could start up
iptables once, and get *one* error message about being unable to initialize
table 'filter' - subsequent attempts would all fail further along:
[root@turing-police ~]# /etc/init.d/iptables start
iptables: Applying firewall rules:
iptables-restore v1.4.2: iptables-restore: unable to initialize table 'filter'
Error occurred at line: 3
Try `iptables-restore -h' or 'iptables-restore --help' for more information.
[root@turing-police ~]# /etc/init.d/iptables start
iptables: Applying firewall rules: iptables-restore: line 110 failed
Yell if you want any instrumentation added, strace, etc - I still have all
the pieces needed to dig into it further...
I didn't capture the output of iwl3945 loading, because I was expecting that
to be some *other* issue - reverting the problematic commit fixed 3945 before
I had started digging further into it.
On Wednesday 18 March 2009 08:57:15 [email protected] wrote:
> On Tue, 17 Mar 2009 16:27:58 +1030, Rusty Russell said:
> > On Monday 16 March 2009 12:28:04 [email protected] wrote:
> > > On recent linux-next, iptables and iwl3945 would fail to load
> > Please send .config; I can't reproduce this here.
>
> [root@turing-police ~]# /etc/init.d/iptables start
> iptables: Applying firewall rules:
> iptables-restore v1.4.2: iptables-restore: unable to initialize table 'filter'
OK, I think this might fix it. It's already queued, but was labelled a mere
"cleanup".
Subject: kmod: use explicit names for waiting
Date: Mon, 16 Mar 2009 10:36:14 +0100
From: Jiri Slaby <[email protected]>
Impact: cleanup
As call_usermodehelper accepts enum umh_wait as a wait parameter,
use constants from this enum instead of bool in __request_module.
Signed-off-by: Jiri Slaby <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
---
kernel/kmod.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/kmod.c b/kernel/kmod.c
index b2a53d0..b750675 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -109,7 +109,8 @@ int __request_module(bool wait, const char *fmt, ...)
return -ENOMEM;
}
- ret = call_usermodehelper(modprobe_path, argv, envp, wait);
+ ret = call_usermodehelper(modprobe_path, argv, envp,
+ wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
atomic_dec(&kmod_concurrent);
return ret;
}
Rusty Russell wrote:
> On Wednesday 18 March 2009 08:57:15 [email protected] wrote:
>> On Tue, 17 Mar 2009 16:27:58 +1030, Rusty Russell said:
>>> On Monday 16 March 2009 12:28:04 [email protected] wrote:
>>>> On recent linux-next, iptables and iwl3945 would fail to load
>>> Please send .config; I can't reproduce this here.
>> [root@turing-police ~]# /etc/init.d/iptables start
>> iptables: Applying firewall rules:
>> iptables-restore v1.4.2: iptables-restore: unable to initialize table 'filter'
>
> OK, I think this might fix it. It's already queued, but was labelled a mere
> "cleanup".
I wouldn't understand how it would fix it though;
the code before my patch passes in a 1, and the code after my patch passes in a variable
which has the value 1 as well.... using a symbolic name for it instead isn't going
to impact the generated code afaics...
On Wednesday 18 March 2009 15:29:02 Arjan van de Ven wrote:
> Rusty Russell wrote:
> > OK, I think this might fix it. It's already queued, but was labelled a mere
> > "cleanup".
>
> I wouldn't understand how it would fix it though;
> the code before my patch passes in a 1, and the code after my patch passes in a variable
> which has the value 1 as well.... using a symbolic name for it instead isn't going
> to impact the generated code afaics...
There's a patch in between which changed it to a bool, but yes, it should do
nothing since we hand "true" (ie. 1): I don't think sign-extending that to -1
is legal. Yet that would explain his issues.
Anyway, he bisected it down to that original commit. Hmm...
Valdis, does this give anything in your boot logs? And what compiler version
and platform are you using?
diff --git a/kernel/kmod.c b/kernel/kmod.c
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -109,6 +109,8 @@ int __request_module(int wait, const cha
atomic_dec(&kmod_concurrent);
return -ENOMEM;
}
+
+ WARN_ON(!wait);
ret = call_usermodehelper(modprobe_path, argv, envp, wait);
atomic_dec(&kmod_concurrent);