2008-12-11 22:38:24

by Andrew Morton

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()


(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 11 Dec 2008 14:15:21 -0800 (PST)
[email protected] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12201
>
> Summary: long wait in call_usermodehelper() / queue_work() /
> wait_for_completion()
> Product: Process Management
> Version: 2.5
> KernelVersion: 2.6.26.8
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: [email protected]
> ReportedBy: [email protected]
>
>
> Latest working kernel version: None
> Earliest failing kernel version: 2.6.26 (can't test any older)

It'd be great if you could test something more recent please.

> Distribution: CentOS 5
> Hardware Environment: Sun x4450, 16-cores, 128GB of RAM
> Software Environment: CentOS 5 + Apache webserver
> Problem Description:
> My problem started with SSH using the audit library, and my kernel not having
> AUDIT support. During strace, any call to socket(PF_NETLINK, SOCK_RAW,
> NETLINK_AUDIT) took 1-2 seconds to return. During this time, sys% was high.

Well this is bad. We don't want the kernel calling out to userspace
each time you run socket(PF_NETLINK, ...). The performance could be
awful.

I don't know if this is a net problem, an audit problem or whatever.
Probably the offending kernel code simply shouldn't exist if
CONFIG_AUDIT=n.

Please attach a copy of the config to
http://bugzilla.kernel.org/show_bug.cgi?id=12201

> As I continued to dig deeper (using lots of printks), I found that these delays
> were caused by the netlink_create() code calling request_module() to find/load
> a module for AUDIT support which doesn't exist.
>
> Continuing to dig, I found that request_module() uses call_usermodehelper() to
> run /sbin/modprobe to find/load the module.
>
> The farthest I got is that after the process is created, we call
> wait_for_completion() to get the result of that process. This waiting process
> takes 1-2 seconds.
>
> The big problem in troubleshooting here is that this only starts to happen
> after the server has been online for a while (10 days maybe) and serving lots
> of traffic. The delay gradually builds up and maxes out at around 2 seconds.
>
> If I manually call /sbin/modprobe on the commandline and provide it the same
> arguments that call_usermodehelper() uses, the command returns instantly 100%
> of the time (assuming server has been on for a while).
>
> If I write a small pilot program that calls socket(PF_NETLINK, SOCK_RAW,
> NETLINK_AUDIT), it will delay by 1-2 seconds 100% of the time (assuming server
> has been online for a while). Certain protocol types given to socket() have
> zero delay (because no module needs to be loaded).
>
> Steps to reproduce:
> Once server has been online for a while, a simple call to socket(PF_NETLINK,
> SOCK_RAW, NETLINK_AUDIT) shows the problem.

OK, weird.

Please get sysrq working then get us a task trace, so we can see who is
sleeping where. Do this:


- run your "small pilot program"

- wait one second (so we catch it while it is delaying)

- echo t > /proc/sysrq-trigger

- dmesg -s 1000000 > foo

- send us a copy of foo.

(foo will be large, so it would be best to attach it to
http://bugzilla.kernel.org/show_bug.cgi?id=12201 then email us the URL)

(Using `echo w > /proc/sysrq-trigger' would work too, if we know that
the offending processes are stuck in D state. It will produce less
output)


2008-12-11 22:52:23

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()

Hi.

On Thu, Dec 11, 2008 at 02:37:58PM -0800, Andrew Morton ([email protected]) wrote:
> > If I write a small pilot program that calls socket(PF_NETLINK, SOCK_RAW,
> > NETLINK_AUDIT), it will delay by 1-2 seconds 100% of the time (assuming server
> > has been online for a while). Certain protocol types given to socket() have
> > zero delay (because no module needs to be loaded).
> >
> > Steps to reproduce:
> > Once server has been online for a while, a simple call to socket(PF_NETLINK,
> > SOCK_RAW, NETLINK_AUDIT) shows the problem.
>
> OK, weird.
>
> Please get sysrq working then get us a task trace, so we can see who is
> sleeping where. Do this:
>
>
> - run your "small pilot program"

Or send it to us and we will quickly fix the bug :)

Thank you!

--
Evgeniy Polyakov

2008-12-11 22:57:04

by Kay Sievers

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()

On Thu, Dec 11, 2008 at 23:37, Andrew Morton <[email protected]> wrote:
>> As I continued to dig deeper (using lots of printks), I found that these delays
>> were caused by the netlink_create() code calling request_module() to find/load
>> a module for AUDIT support which doesn't exist.
>>
>> Continuing to dig, I found that request_module() uses call_usermodehelper() to
>> run /sbin/modprobe to find/load the module.
>>
>> The farthest I got is that after the process is created, we call
>> wait_for_completion() to get the result of that process. This waiting process
>> takes 1-2 seconds.
>>
>> The big problem in troubleshooting here is that this only starts to happen
>> after the server has been online for a while (10 days maybe) and serving lots
>> of traffic. The delay gradually builds up and maxes out at around 2 seconds.
>>
>> If I manually call /sbin/modprobe on the commandline and provide it the same
>> arguments that call_usermodehelper() uses, the command returns instantly 100%
>> of the time (assuming server has been on for a while).
>>
>> If I write a small pilot program that calls socket(PF_NETLINK, SOCK_RAW,
>> NETLINK_AUDIT), it will delay by 1-2 seconds 100% of the time (assuming server
>> has been online for a while). Certain protocol types given to socket() have
>> zero delay (because no module needs to be loaded).
>>
>> Steps to reproduce:
>> Once server has been online for a while, a simple call to socket(PF_NETLINK,
>> SOCK_RAW, NETLINK_AUDIT) shows the problem.

If you replace /sbin/modprobe in the kernel module loader, does the
delay go away:
echo /bin/true > /proc/sys/kernel/modprobe
?

Kay

2008-12-11 23:05:39

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()

On Fri, Dec 12, 2008 at 01:51:58AM +0300, Evgeniy Polyakov ([email protected]) wrote:
> > > If I write a small pilot program that calls socket(PF_NETLINK, SOCK_RAW,
> > > NETLINK_AUDIT), it will delay by 1-2 seconds 100% of the time (assuming server
> > > has been online for a while). Certain protocol types given to socket() have
> > > zero delay (because no module needs to be loaded).
> > >
> > > Steps to reproduce:
> > > Once server has been online for a while, a simple call to socket(PF_NETLINK,
> > > SOCK_RAW, NETLINK_AUDIT) shows the problem.
> >
> > OK, weird.
> >
> > Please get sysrq working then get us a task trace, so we can see who is
> > sleeping where. Do this:
> >
> >
> > - run your "small pilot program"
>
> Or send it to us and we will quickly fix the bug :)
>
> Thank you!

Btw, does this ugly hack fix the problem?
According to your description it may be the case.

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 480184a..2b64eb5 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -434,6 +434,10 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol)
if (protocol < 0 || protocol >= MAX_LINKS)
return -EPROTONOSUPPORT;

+#ifndef CONFIG_AUDIT
+ if (protocol == NETLINK_AUDIT)
+ return -EPROTONOSUPPORT;
+#endif
netlink_lock_table();
#ifdef CONFIG_MODULES
if (!nl_table[protocol].registered) {


--
Evgeniy Polyakov

2008-12-12 19:49:52

by Michael Spiegle

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()

> It'd be great if you could test something more recent please.

Will do. It takes a little bit of time to reproduce, so I'll respond
now with information from my current kernel and update once the new
kernel is running.

> Please attach a copy of the config to
> http://bugzilla.kernel.org/show_bug.cgi?id=12201

Done: http://bugzilla.kernel.org/attachment.cgi?id=19263

> Please get sysrq working then get us a task trace, so we can see who is
> sleeping where. Do this:

Done: http://bugzilla.kernel.org/attachment.cgi?id=19264

I wanted to get the call to /sbin/modprobe in there as well, so I had to
do it about 10 times before I could catch it. I also stopped
unimportant processes on that box to eliminate additional variables.

There is one other variable I didn't mention. The machine is diskless
and running on an nfs root. I have tried to eliminate this variable via
simple testing (just running /sbin/modprobe at the commandline returns
instantly), but wanted to mention it.

The box was 100% idle before starting the pilot program and running the
task trace. I added gettimeofday() around my socket() call and got
about 1290000us - 1490000us on average. Using a socket() call with
protocol NETLINK_ROUTE returns in about 5us - 8us. Certain other
protocol types like NETLINK_FIREWALL also have the same long delay.


Michael Spiegle
[email protected]

2008-12-12 19:52:18

by Michael Spiegle

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()

> Btw, does this ugly hack fix the problem?
> According to your description it may be the case.
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 480184a..2b64eb5 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -434,6 +434,10 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol)
> if (protocol < 0 || protocol >= MAX_LINKS)
> return -EPROTONOSUPPORT;
>
> +#ifndef CONFIG_AUDIT
> + if (protocol == NETLINK_AUDIT)
> + return -EPROTONOSUPPORT;
> +#endif
> netlink_lock_table();
> #ifdef CONFIG_MODULES
> if (!nl_table[protocol].registered) {
>
>

Yes, that would take care of it. My original workaround was to disable
CONFIG_KMOD. I only have a few modules compiled (for servers that
require those modules), but most of my servers don't need any modules.
This would allow me to manually load modules myself and never have to
worry about the kernel forking stuff in userspace. Unfortunately, it
appears that CONFIG_KMOD is being deprecated soon and I didn't want to
rely on it.

What is the core reason for removing CONFIG_KMOD? I personally would
rather have the choice as to whether my kernel tries to find its own
modules or not.

Michael Spiegle
[email protected]

2008-12-12 19:54:17

by Michael Spiegle

[permalink] [raw]
Subject: Re: [Bugme-new] [Bug 12201] New: long wait in call_usermodehelper() / queue_work() / wait_for_completion()

> If you replace /sbin/modprobe in the kernel module loader, does the
> delay go away:
> echo /bin/true > /proc/sys/kernel/modprobe

Yes, that does reduce the delay significantly. The timing of the
socket() call went from about 1500000us to 430000us. This points the
finger at modprobe it seems?

Michael Spiegle
[email protected]