2019-04-26 13:10:03

by Heiko Carstens

[permalink] [raw]
Subject: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"

Hello Prarit,

it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
for modules that have finished loading") _sometimes_ causes hangs on
s390. This is unfortunately not 100% reproducible, however the
mentioned commit seems to be the only relevant one in modules.c.

What I see is a hanging system with messages like this on the console:

[ 65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
[ 65.876049] rcu: 7-....: (5999 ticks this GP) idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
[ 65.876078] (t=6000 jiffies g=-471 q=17196)
[ 65.876084] Task dump for CPU 7:
[ 65.876088] systemd-udevd R running task 0 731 721 0x06000004
[ 65.876097] Call Trace:
[ 65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
[ 65.876122] [<00000000001ee486>] finished_loading+0x4e/0xb0
[ 65.876128] [<00000000001f1ed6>] load_module+0xcce/0x27a0
[ 65.876134] [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
[ 65.876142] [<0000000000ac0766>] system_call+0x2aa/0x2c8

crash's backtrace of this task looks like this:

PID: 731 TASK: 1e79ba000 CPU: 7 COMMAND: "systemd-udevd"
LOWCORE INFO:
-psw : 0x0704c00180000000 0x0000000000ab666a
-function : memcmp at ab666a
-prefix : 0x7fe32000
-cpu timer: 0x7fffff5784d0f048
-clock cmp: 0xd60757081cd70d00
-general registers:
0x0000000000000009 0x0000000000000009
0x000003ff80347321 000000000000000000
0x000003ff8034f321 000000000000000000
0x000000000000001e 0x000003ff8c592708
0x000003e0047da900 0x000003ff8034f318
0x0000000000000001 0x0000000000000009
0x000003ff80347300 0x0000000000ad81b8
0x00000000001ee062 0x000003e004357cb0
[...]
#0 [3e004357cf0] load_module at 1f1eb0
#1 [3e004357dd0] __se_sys_init_module at 1f3af0
#2 [3e004357ea8] system_call at ac0766
PSW: 0705000180000000 000003ff8c27ad3e (user space)
GPRS: 000002aa280d2fb0 000003ff8c27ad3c 0000000000000080 00000000000035fd
000003ff8c592708 000002aa280c8770 0000000000000000 000002aa280d7130
0000000000020000 000002aa28915760 000003ff8c592708 000002aa280c00b0
000003ff8c8fff80 000002aa280c00b0 000003ff8c58a456 000003ffe9cfde90

I did not look any further into the dump, however since the commit
touches exactly the code path which seems to be looping... ;)


2019-04-26 13:24:00

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/26/19 9:07 AM, Heiko Carstens wrote:
> Hello Prarit,
>
> it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
> for modules that have finished loading") _sometimes_ causes hangs on
> s390. This is unfortunately not 100% reproducible, however the
> mentioned commit seems to be the only relevant one in modules.c.
>
> What I see is a hanging system with messages like this on the console:
>
> [ 65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
> [ 65.876049] rcu: 7-....: (5999 ticks this GP) idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
> [ 65.876078] (t=6000 jiffies g=-471 q=17196)
> [ 65.876084] Task dump for CPU 7:
> [ 65.876088] systemd-udevd R running task 0 731 721 0x06000004
> [ 65.876097] Call Trace:
> [ 65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
> [ 65.876122] [<00000000001ee486>] finished_loading+0x4e/0xb0
> [ 65.876128] [<00000000001f1ed6>] load_module+0xcce/0x27a0
> [ 65.876134] [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
> [ 65.876142] [<0000000000ac0766>] system_call+0x2aa/0x2c8
>
> crash's backtrace of this task looks like this:
>
> PID: 731 TASK: 1e79ba000 CPU: 7 COMMAND: "systemd-udevd"
> LOWCORE INFO:
> -psw : 0x0704c00180000000 0x0000000000ab666a
> -function : memcmp at ab666a
> -prefix : 0x7fe32000
> -cpu timer: 0x7fffff5784d0f048
> -clock cmp: 0xd60757081cd70d00
> -general registers:
> 0x0000000000000009 0x0000000000000009
> 0x000003ff80347321 000000000000000000
> 0x000003ff8034f321 000000000000000000
> 0x000000000000001e 0x000003ff8c592708
> 0x000003e0047da900 0x000003ff8034f318
> 0x0000000000000001 0x0000000000000009
> 0x000003ff80347300 0x0000000000ad81b8
> 0x00000000001ee062 0x000003e004357cb0
> [...]
> #0 [3e004357cf0] load_module at 1f1eb0
> #1 [3e004357dd0] __se_sys_init_module at 1f3af0
> #2 [3e004357ea8] system_call at ac0766
> PSW: 0705000180000000 000003ff8c27ad3e (user space)
> GPRS: 000002aa280d2fb0 000003ff8c27ad3c 0000000000000080 00000000000035fd
> 000003ff8c592708 000002aa280c8770 0000000000000000 000002aa280d7130
> 0000000000020000 000002aa28915760 000003ff8c592708 000002aa280c00b0
> 000003ff8c8fff80 000002aa280c00b0 000003ff8c58a456 000003ffe9cfde90
>
> I did not look any further into the dump, however since the commit
> touches exactly the code path which seems to be looping... ;)
>

Ouch :( I wonder if I exposed a further race or another bug. Heiko, can you
determine which module is stuck? Warning: I have not compiled this code.

diff --git a/kernel/module.c b/kernel/module.c
index e8c1de2ab4e1..bd5eebc95049 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3567,6 +3567,8 @@ static int add_unformed_module(struct module *mod)
old = find_module_all(mod->name, strlen(mod->name), true);
if (old != NULL) {
if (old->state != MODULE_STATE_LIVE) {
+ printk_ratelimited("PRARIT: waiting for module %s to load.\n",
+ mod->name);
/* Wait in case it fails to load. */
mutex_unlock(&module_mutex);
err = wait_event_interruptible(module_wq,
@@ -3575,6 +3577,8 @@ static int add_unformed_module(struct module *mod)
goto out_unlocked;
goto again;
}
+ printk("PRARIT: module %s already exists.\n",
+ mod->name);
err = -EEXIST;
goto out;
}

P.

2019-04-26 15:09:24

by Heiko Carstens

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"

On Fri, Apr 26, 2019 at 09:22:34AM -0400, Prarit Bhargava wrote:
> On 4/26/19 9:07 AM, Heiko Carstens wrote:
> > Hello Prarit,
> >
> > it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
> > for modules that have finished loading") _sometimes_ causes hangs on
> > s390. This is unfortunately not 100% reproducible, however the
> > mentioned commit seems to be the only relevant one in modules.c.
> >
> > What I see is a hanging system with messages like this on the console:
> >
> > [ 65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
> > [ 65.876049] rcu: 7-....: (5999 ticks this GP) idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
> > [ 65.876078] (t=6000 jiffies g=-471 q=17196)
> > [ 65.876084] Task dump for CPU 7:
> > [ 65.876088] systemd-udevd R running task 0 731 721 0x06000004
> > [ 65.876097] Call Trace:
> > [ 65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
> > [ 65.876122] [<00000000001ee486>] finished_loading+0x4e/0xb0
> > [ 65.876128] [<00000000001f1ed6>] load_module+0xcce/0x27a0
> > [ 65.876134] [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
> > [ 65.876142] [<0000000000ac0766>] system_call+0x2aa/0x2c8
> > I did not look any further into the dump, however since the commit
> > touches exactly the code path which seems to be looping... ;)
> >
>
> Ouch :( I wonder if I exposed a further race or another bug. Heiko, can you
> determine which module is stuck? Warning: I have not compiled this code.

Here we go:

[ 11.716866] PRARIT: waiting for module s390_trng to load.
[ 11.716867] PRARIT: waiting for module s390_trng to load.
[ 11.716868] PRARIT: waiting for module s390_trng to load.
[ 11.716870] PRARIT: waiting for module s390_trng to load.
[ 11.716871] PRARIT: waiting for module s390_trng to load.
[ 11.716872] PRARIT: waiting for module s390_trng to load.
[ 11.716874] PRARIT: waiting for module s390_trng to load.
[ 11.716875] PRARIT: waiting for module s390_trng to load.
[ 11.716876] PRARIT: waiting for module s390_trng to load.
[ 16.726850] add_unformed_module: 31403529 callbacks suppressed
[ 16.726853] PRARIT: waiting for module s390_trng to load.
[ 16.726862] PRARIT: waiting for module s390_trng to load.
[ 16.726865] PRARIT: waiting for module s390_trng to load.
[ 16.726867] PRARIT: waiting for module s390_trng to load.
[ 16.726869] PRARIT: waiting for module s390_trng to load.

If I'm not mistaken then there was _no_ corresponding message on the
console stating that the module already exists.

2019-04-26 16:19:49

by Jessica Yu

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"

+++ Heiko Carstens [26/04/19 17:07 +0200]:
>On Fri, Apr 26, 2019 at 09:22:34AM -0400, Prarit Bhargava wrote:
>> On 4/26/19 9:07 AM, Heiko Carstens wrote:
>> > Hello Prarit,
>> >
>> > it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
>> > for modules that have finished loading") _sometimes_ causes hangs on
>> > s390. This is unfortunately not 100% reproducible, however the
>> > mentioned commit seems to be the only relevant one in modules.c.
>> >
>> > What I see is a hanging system with messages like this on the console:
>> >
>> > [ 65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
>> > [ 65.876049] rcu: 7-....: (5999 ticks this GP) idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
>> > [ 65.876078] (t=6000 jiffies g=-471 q=17196)
>> > [ 65.876084] Task dump for CPU 7:
>> > [ 65.876088] systemd-udevd R running task 0 731 721 0x06000004
>> > [ 65.876097] Call Trace:
>> > [ 65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
>> > [ 65.876122] [<00000000001ee486>] finished_loading+0x4e/0xb0
>> > [ 65.876128] [<00000000001f1ed6>] load_module+0xcce/0x27a0
>> > [ 65.876134] [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
>> > [ 65.876142] [<0000000000ac0766>] system_call+0x2aa/0x2c8
>> > I did not look any further into the dump, however since the commit
>> > touches exactly the code path which seems to be looping... ;)
>> >
>>
>> Ouch :( I wonder if I exposed a further race or another bug. Heiko, can you
>> determine which module is stuck? Warning: I have not compiled this code.
>
>Here we go:
>
>[ 11.716866] PRARIT: waiting for module s390_trng to load.
>[ 11.716867] PRARIT: waiting for module s390_trng to load.
>[ 11.716868] PRARIT: waiting for module s390_trng to load.
>[ 11.716870] PRARIT: waiting for module s390_trng to load.
>[ 11.716871] PRARIT: waiting for module s390_trng to load.
>[ 11.716872] PRARIT: waiting for module s390_trng to load.
>[ 11.716874] PRARIT: waiting for module s390_trng to load.
>[ 11.716875] PRARIT: waiting for module s390_trng to load.
>[ 11.716876] PRARIT: waiting for module s390_trng to load.
>[ 16.726850] add_unformed_module: 31403529 callbacks suppressed
>[ 16.726853] PRARIT: waiting for module s390_trng to load.
>[ 16.726862] PRARIT: waiting for module s390_trng to load.
>[ 16.726865] PRARIT: waiting for module s390_trng to load.
>[ 16.726867] PRARIT: waiting for module s390_trng to load.
>[ 16.726869] PRARIT: waiting for module s390_trng to load.
>
>If I'm not mistaken then there was _no_ corresponding message on the
>console stating that the module already exists.

Hm, my current theory is that we have a module whose exit() function
is taking a while to run to completion. While it is doing so, the
module's state is already set to MODULE_STATE_GOING.

With Prarit's patch, since this module is probably still in GOING,
add_unformed_module() will wait until the module is finally gone. If
this takes too long, we will keep trying to add ourselves to the
module list and hence stay in the loop in add_unformed_module().
According to Documentation/RCU/stallwarn.txt, this looping in the
kernel may trigger an rcu stall warning (see bullet point stating "a
CPU looping anywhere in the kernel without invoking schedule()".

Heiko, could you modify the patch to print the module's state to
confirm?

Thanks,

Jessica

2019-04-26 17:18:08

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/26/19 12:09 PM, Jessica Yu wrote:
> +++ Heiko Carstens [26/04/19 17:07 +0200]:
>> On Fri, Apr 26, 2019 at 09:22:34AM -0400, Prarit Bhargava wrote:
>>> On 4/26/19 9:07 AM, Heiko Carstens wrote:
>>> > Hello Prarit,
>>> >
>>> > it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
>>> > for modules that have finished loading") _sometimes_ causes hangs on
>>> > s390. This is unfortunately not 100% reproducible, however the
>>> > mentioned commit seems to be the only relevant one in modules.c.
>>> >
>>> > What I see is a hanging system with messages like this on the console:
>>> >
>>> > [   65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
>>> > [   65.876049] rcu:     7-....: (5999 ticks this GP)
>>> idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
>>> > [   65.876078]  (t=6000 jiffies g=-471 q=17196)
>>> > [   65.876084] Task dump for CPU 7:
>>> > [   65.876088] systemd-udevd   R  running task        0   731    721
>>> 0x06000004
>>> > [   65.876097] Call Trace:
>>> > [   65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
>>> > [   65.876122]  [<00000000001ee486>] finished_loading+0x4e/0xb0
>>> > [   65.876128]  [<00000000001f1ed6>] load_module+0xcce/0x27a0
>>> > [   65.876134]  [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
>>> > [   65.876142]  [<0000000000ac0766>] system_call+0x2aa/0x2c8
>>> > I did not look any further into the dump, however since the commit
>>> > touches exactly the code path which seems to be looping... ;)
>>> >
>>>
>>> Ouch :(  I wonder if I exposed a further race or another bug.  Heiko, can you
>>> determine which module is stuck?  Warning: I have not compiled this code.
>>
>> Here we go:
>>
>> [   11.716866] PRARIT: waiting for module s390_trng to load.
>> [   11.716867] PRARIT: waiting for module s390_trng to load.
>> [   11.716868] PRARIT: waiting for module s390_trng to load.
>> [   11.716870] PRARIT: waiting for module s390_trng to load.
>> [   11.716871] PRARIT: waiting for module s390_trng to load.
>> [   11.716872] PRARIT: waiting for module s390_trng to load.
>> [   11.716874] PRARIT: waiting for module s390_trng to load.
>> [   11.716875] PRARIT: waiting for module s390_trng to load.
>> [   11.716876] PRARIT: waiting for module s390_trng to load.
>> [   16.726850] add_unformed_module: 31403529 callbacks suppressed
>> [   16.726853] PRARIT: waiting for module s390_trng to load.
>> [   16.726862] PRARIT: waiting for module s390_trng to load.
>> [   16.726865] PRARIT: waiting for module s390_trng to load.
>> [   16.726867] PRARIT: waiting for module s390_trng to load.
>> [   16.726869] PRARIT: waiting for module s390_trng to load.
>>
>> If I'm not mistaken then there was _no_ corresponding message on the
>> console stating that the module already exists.
>
> Hm, my current theory is that we have a module whose exit() function
> is taking a while to run to completion. While it is doing so, the
> module's state is already set to MODULE_STATE_GOING.
>
> With Prarit's patch, since this module is probably still in GOING,
> add_unformed_module() will wait until the module is finally gone. If
> this takes too long, we will keep trying to add ourselves to the
> module list and hence stay in the loop in add_unformed_module().
> According to Documentation/RCU/stallwarn.txt, this looping in the
> kernel may trigger an rcu stall warning (see bullet point stating "a
> CPU looping anywhere in the kernel without invoking schedule()".
>

Yeah, that's what I'm thinking too. The question is, however, why that module
is taking so long that it stalls the system. If the module state is going then
the probe of the module has already failed. Could it be some weird bug in the
notifier chains? And :) why aren't I seeing this all the time?

P.

> Heiko, could you modify the patch to print the module's state to
> confirm?
>
> Thanks,
>
> Jessica

2019-04-26 18:12:42

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/26/19 12:09 PM, Jessica Yu wrote:
> +++ Heiko Carstens [26/04/19 17:07 +0200]:
>> On Fri, Apr 26, 2019 at 09:22:34AM -0400, Prarit Bhargava wrote:
>>> On 4/26/19 9:07 AM, Heiko Carstens wrote:
>>> > Hello Prarit,
>>> >
>>> > it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
>>> > for modules that have finished loading") _sometimes_ causes hangs on
>>> > s390. This is unfortunately not 100% reproducible, however the
>>> > mentioned commit seems to be the only relevant one in modules.c.
>>> >
>>> > What I see is a hanging system with messages like this on the console:
>>> >
>>> > [   65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
>>> > [   65.876049] rcu:     7-....: (5999 ticks this GP)
>>> idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
>>> > [   65.876078]  (t=6000 jiffies g=-471 q=17196)
>>> > [   65.876084] Task dump for CPU 7:
>>> > [   65.876088] systemd-udevd   R  running task        0   731    721
>>> 0x06000004
>>> > [   65.876097] Call Trace:
>>> > [   65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
>>> > [   65.876122]  [<00000000001ee486>] finished_loading+0x4e/0xb0
>>> > [   65.876128]  [<00000000001f1ed6>] load_module+0xcce/0x27a0
>>> > [   65.876134]  [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
>>> > [   65.876142]  [<0000000000ac0766>] system_call+0x2aa/0x2c8
>>> > I did not look any further into the dump, however since the commit
>>> > touches exactly the code path which seems to be looping... ;)
>>> >
>>>
>>> Ouch :(  I wonder if I exposed a further race or another bug.  Heiko, can you
>>> determine which module is stuck?  Warning: I have not compiled this code.
>>
>> Here we go:
>>
>> [   11.716866] PRARIT: waiting for module s390_trng to load.
>> [   11.716867] PRARIT: waiting for module s390_trng to load.
>> [   11.716868] PRARIT: waiting for module s390_trng to load.
>> [   11.716870] PRARIT: waiting for module s390_trng to load.
>> [   11.716871] PRARIT: waiting for module s390_trng to load.
>> [   11.716872] PRARIT: waiting for module s390_trng to load.
>> [   11.716874] PRARIT: waiting for module s390_trng to load.
>> [   11.716875] PRARIT: waiting for module s390_trng to load.
>> [   11.716876] PRARIT: waiting for module s390_trng to load.
>> [   16.726850] add_unformed_module: 31403529 callbacks suppressed
>> [   16.726853] PRARIT: waiting for module s390_trng to load.
>> [   16.726862] PRARIT: waiting for module s390_trng to load.
>> [   16.726865] PRARIT: waiting for module s390_trng to load.
>> [   16.726867] PRARIT: waiting for module s390_trng to load.
>> [   16.726869] PRARIT: waiting for module s390_trng to load.
>>
>> If I'm not mistaken then there was _no_ corresponding message on the
>> console stating that the module already exists.

Heiko,

Where is the s390_trng module? I can't seem to find it in the tree.

[02:06 PM root@ibm-z-108 linux-next]# find ./ -name *s390_trng*
[02:06 PM root@ibm-z-108 linux-next]# git grep s390_trng
[02:07 PM root@ibm-z-108 linux-next]#

P.

>
> Hm, my current theory is that we have a module whose exit() function
> is taking a while to run to completion. While it is doing so, the
> module's state is already set to MODULE_STATE_GOING.
>
> With Prarit's patch, since this module is probably still in GOING,
> add_unformed_module() will wait until the module is finally gone. If
> this takes too long, we will keep trying to add ourselves to the
> module list and hence stay in the loop in add_unformed_module().
> According to Documentation/RCU/stallwarn.txt, this looping in the
> kernel may trigger an rcu stall warning (see bullet point stating "a
> CPU looping anywhere in the kernel without invoking schedule()".
>
> Heiko, could you modify the patch to print the module's state to
> confirm?
>
> Thanks,
>
> Jessica

2019-04-26 19:47:11

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/26/19 2:10 PM, Prarit Bhargava wrote:
>
>
> On 4/26/19 12:09 PM, Jessica Yu wrote:
>> +++ Heiko Carstens [26/04/19 17:07 +0200]:
>>> On Fri, Apr 26, 2019 at 09:22:34AM -0400, Prarit Bhargava wrote:
>>>> On 4/26/19 9:07 AM, Heiko Carstens wrote:
>>>>> Hello Prarit,
>>>>>
>>>>> it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
>>>>> for modules that have finished loading") _sometimes_ causes hangs on
>>>>> s390. This is unfortunately not 100% reproducible, however the
>>>>> mentioned commit seems to be the only relevant one in modules.c.
>>>>>
>>>>> What I see is a hanging system with messages like this on the console:
>>>>>
>>>>> [ 65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
>>>>> [ 65.876049] rcu: 7-....: (5999 ticks this GP)
>>>> idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
>>>>> [ 65.876078] (t=6000 jiffies g=-471 q=17196)
>>>>> [ 65.876084] Task dump for CPU 7:
>>>>> [ 65.876088] systemd-udevd R running task 0 731 721
>>>> 0x06000004
>>>>> [ 65.876097] Call Trace:
>>>>> [ 65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
>>>>> [ 65.876122] [<00000000001ee486>] finished_loading+0x4e/0xb0
>>>>> [ 65.876128] [<00000000001f1ed6>] load_module+0xcce/0x27a0
>>>>> [ 65.876134] [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
>>>>> [ 65.876142] [<0000000000ac0766>] system_call+0x2aa/0x2c8
>>>>> I did not look any further into the dump, however since the commit
>>>>> touches exactly the code path which seems to be looping... ;)
>>>>>
>>>>
>>>> Ouch :( I wonder if I exposed a further race or another bug. Heiko, can you
>>>> determine which module is stuck? Warning: I have not compiled this code.
>>>
>>> Here we go:
>>>
>>> [ 11.716866] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716867] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716868] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716870] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716871] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716872] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716874] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716875] PRARIT: waiting for module s390_trng to load.
>>> [ 11.716876] PRARIT: waiting for module s390_trng to load.
>>> [ 16.726850] add_unformed_module: 31403529 callbacks suppressed
>>> [ 16.726853] PRARIT: waiting for module s390_trng to load.
>>> [ 16.726862] PRARIT: waiting for module s390_trng to load.
>>> [ 16.726865] PRARIT: waiting for module s390_trng to load.
>>> [ 16.726867] PRARIT: waiting for module s390_trng to load.
>>> [ 16.726869] PRARIT: waiting for module s390_trng to load.
>>>
>>> If I'm not mistaken then there was _no_ corresponding message on the
>>> console stating that the module already exists.
>
> Heiko,
>
> Where is the s390_trng module? I can't seem to find it in the tree.
>
> [02:06 PM root@ibm-z-108 linux-next]# find ./ -name *s390_trng*
> [02:06 PM root@ibm-z-108 linux-next]# git grep s390_trng
> [02:07 PM root@ibm-z-108 linux-next]#
>

Never mind, I found it: s390-trng.c

P.

> P.
>
>>
>> Hm, my current theory is that we have a module whose exit() function
>> is taking a while to run to completion. While it is doing so, the
>> module's state is already set to MODULE_STATE_GOING.
>>
>> With Prarit's patch, since this module is probably still in GOING,
>> add_unformed_module() will wait until the module is finally gone. If
>> this takes too long, we will keep trying to add ourselves to the
>> module list and hence stay in the loop in add_unformed_module().
>> According to Documentation/RCU/stallwarn.txt, this looping in the
>> kernel may trigger an rcu stall warning (see bullet point stating "a
>> CPU looping anywhere in the kernel without invoking schedule()".
>>
>> Heiko, could you modify the patch to print the module's state to
>> confirm?
>>
>> Thanks,
>>
>> Jessica

2019-04-27 00:23:36

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/26/19 3:45 PM, Prarit Bhargava wrote:
>
>
> On 4/26/19 2:10 PM, Prarit Bhargava wrote:
>>
>>
>> On 4/26/19 12:09 PM, Jessica Yu wrote:
>>> +++ Heiko Carstens [26/04/19 17:07 +0200]:
>>>> On Fri, Apr 26, 2019 at 09:22:34AM -0400, Prarit Bhargava wrote:
>>>>> On 4/26/19 9:07 AM, Heiko Carstens wrote:
>>>>>> Hello Prarit,
>>>>>>
>>>>>> it looks like your commit f9a75c1d717f ("modules: Only return -EEXIST
>>>>>> for modules that have finished loading") _sometimes_ causes hangs on
>>>>>> s390. This is unfortunately not 100% reproducible, however the
>>>>>> mentioned commit seems to be the only relevant one in modules.c.
>>>>>>
>>>>>> What I see is a hanging system with messages like this on the console:
>>>>>>
>>>>>> [ 65.876040] rcu: INFO: rcu_sched self-detected stall on CPU
>>>>>> [ 65.876049] rcu: 7-....: (5999 ticks this GP)
>>>>> idle=eae/1/0x4000000000000002 softirq=1181/1181 fqs=2729
>>>>>> [ 65.876078] (t=6000 jiffies g=-471 q=17196)
>>>>>> [ 65.876084] Task dump for CPU 7:
>>>>>> [ 65.876088] systemd-udevd R running task 0 731 721
>>>>> 0x06000004
>>>>>> [ 65.876097] Call Trace:
>>>>>> [ 65.876113] ([<0000000000abb264>] __schedule+0x2e4/0x6e0)
>>>>>> [ 65.876122] [<00000000001ee486>] finished_loading+0x4e/0xb0
>>>>>> [ 65.876128] [<00000000001f1ed6>] load_module+0xcce/0x27a0
>>>>>> [ 65.876134] [<00000000001f3af0>] __s390x_sys_init_module+0x148/0x178
>>>>>> [ 65.876142] [<0000000000ac0766>] system_call+0x2aa/0x2c8
>>>>>> I did not look any further into the dump, however since the commit
>>>>>> touches exactly the code path which seems to be looping... ;)
>>>>>>
>>>>>
>>>>> Ouch :( I wonder if I exposed a further race or another bug. Heiko, can you
>>>>> determine which module is stuck? Warning: I have not compiled this code.
>>>>
>>>> Here we go:
>>>>
>>>> [ 11.716866] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716867] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716868] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716870] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716871] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716872] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716874] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716875] PRARIT: waiting for module s390_trng to load.
>>>> [ 11.716876] PRARIT: waiting for module s390_trng to load.
>>>> [ 16.726850] add_unformed_module: 31403529 callbacks suppressed
>>>> [ 16.726853] PRARIT: waiting for module s390_trng to load.
>>>> [ 16.726862] PRARIT: waiting for module s390_trng to load.
>>>> [ 16.726865] PRARIT: waiting for module s390_trng to load.
>>>> [ 16.726867] PRARIT: waiting for module s390_trng to load.
>>>> [ 16.726869] PRARIT: waiting for module s390_trng to load.
>>>>
>>>> If I'm not mistaken then there was _no_ corresponding message on the
>>>> console stating that the module already exists.
>>
>> Heiko,
>>
>> Where is the s390_trng module? I can't seem to find it in the tree.
>>
>> [02:06 PM root@ibm-z-108 linux-next]# find ./ -name *s390_trng*
>> [02:06 PM root@ibm-z-108 linux-next]# git grep s390_trng
>> [02:07 PM root@ibm-z-108 linux-next]#
>>
>
> Never mind, I found it: s390-trng.c
>
> P.
>
>> P.
>>


Heiko and Jessica,

The issue doesn't appear to be with my patch AFAICT. The s390_trng fails to
load and then the kernel occasionally hangs (as Heiko mentioned) calling
synchronize_rcu().

The call sequence is

module_load()
do_init_module()
do_one_initcall(mod->init)

which fails.

The failure path in do_one_initcall() is entered and we start executing code at
kernel/module.c:3541

fail_free_freeinit:
kfree(freeinit);
fail:
/* Try to protect us from buggy refcounters. */
mod->state = MODULE_STATE_GOING;
synchronize_rcu();

^^^ the kernel hangs here. Sometimes it's very short and other times it seems
to hang. I've left systems that appear to be hung and come back after 10
minutes to find that they've somehow made it through this call.

Is there a known issue with RCU on s390 that is making this occur?

P.

2019-04-27 10:25:58

by Heiko Carstens

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"

On Fri, Apr 26, 2019 at 08:20:52PM -0400, Prarit Bhargava wrote:
> Heiko and Jessica,
>
> The issue doesn't appear to be with my patch AFAICT. The s390_trng fails to
> load and then the kernel occasionally hangs (as Heiko mentioned) calling
> synchronize_rcu().
>
> The call sequence is
>
> module_load()
> do_init_module()
> do_one_initcall(mod->init)
>
> which fails.
>
> The failure path in do_one_initcall() is entered and we start executing code at
> kernel/module.c:3541
>
> fail_free_freeinit:
> kfree(freeinit);
> fail:
> /* Try to protect us from buggy refcounters. */
> mod->state = MODULE_STATE_GOING;
> synchronize_rcu();
>
> ^^^ the kernel hangs here. Sometimes it's very short and other times it seems
> to hang. I've left systems that appear to be hung and come back after 10
> minutes to find that they've somehow made it through this call.
>
> Is there a known issue with RCU on s390 that is making this occur?

No there is no known issue with RCU on s390. The reason that
synchronize_rcu() doesn't finish is because a different cpu is within
an endless loop in add_unformed_module() just like Jessica suspected.

Note: the kernel is compiled with CONFIG_PREEMPT off - there is no
kernel preemption that will make the looping cpu ever go over schedule
and subsequently let synchronize_rcu() finish.

To confirm Jessicas theory - looking into the dump we have:

crash> bt 742
PID: 742 TASK: 1efa6c000 CPU: 7 COMMAND: "systemd-udevd"
#0 [3e0043aba30] __schedule at abb25e
#1 [3e0043abaa0] schedule at abb6a2
#2 [3e0043abac8] schedule_timeout at abf49a
#3 [3e0043abb60] wait_for_common at abc396
#4 [3e0043abbf0] __wait_rcu_gp at 1c0136
#5 [3e0043abc48] synchronize_rcu at 1c72ea
#6 [3e0043abc98] do_init_module at 1f10be
#7 [3e0043abcf0] load_module at 1f3594
#8 [3e0043abdd0] __se_sys_init_module at 1f3af0
#9 [3e0043abea8] system_call at ac0766

Which is the process waiting for synchronize_rcu to finish. Wading
through the stack frames gives me this struct module:

struct module {
state = MODULE_STATE_GOING,
list = {
next = 0x3ff80394508,
prev = 0xe25090 <modules>
},
name = "s390_trng\000...
...

Then we have the looping task/cpu:

PID: 731 TASK: 1e79ba000 CPU: 7 COMMAND: "systemd-udevd"
LOWCORE INFO:
-psw : 0x0704c00180000000 0x0000000000ab666a
-function : memcmp at ab666a
...
-general registers:
0x0000000000000009 0x0000000000000009
0x000003ff80347321 000000000000000000
0x000003ff8034f321 000000000000000000
0x000000000000001e 0x000003ff8c592708
0x000003e0047da900 0x000003ff8034f318
0x0000000000000001 0x0000000000000009
0x000003ff80347300 0x0000000000ad81b8
0x00000000001ee062 0x000003e004357cb0
#0 [3e004357cf0] load_module at 1f1eb0
#1 [3e004357dd0] __se_sys_init_module at 1f3af0
#2 [3e004357ea8] system_call at ac0766

which is find_module_all() calling memcmp with this string:

3ff80347318: 733339305f74726e 6700000000000000 s390_trng.......

So it all seems to fit. A simple cond_resched() call, which enforces
an RCU quiescent state for the calliung cpu, fixes the problem for me
(patch on top of linux-next 20190424 -- c392798a85ab):

diff --git a/kernel/module.c b/kernel/module.c
index 410eeb7e4f1d..48748cfec991 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3585,6 +3585,7 @@ again:
finished_loading(mod->name));
if (err)
goto out_unlocked;
+ cond_resched();
goto again;
}
err = -EEXIST;

2019-04-27 10:37:28

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/27/19 6:24 AM, Heiko Carstens wrote:
> On Fri, Apr 26, 2019 at 08:20:52PM -0400, Prarit Bhargava wrote:
>> Heiko and Jessica,
>>
>> The issue doesn't appear to be with my patch AFAICT. The s390_trng fails to
>> load and then the kernel occasionally hangs (as Heiko mentioned) calling
>> synchronize_rcu().
>>
>> The call sequence is
>>
>> module_load()
>> do_init_module()
>> do_one_initcall(mod->init)
>>
>> which fails.
>>
>> The failure path in do_one_initcall() is entered and we start executing code at
>> kernel/module.c:3541
>>
>> fail_free_freeinit:
>> kfree(freeinit);
>> fail:
>> /* Try to protect us from buggy refcounters. */
>> mod->state = MODULE_STATE_GOING;
>> synchronize_rcu();
>>
>> ^^^ the kernel hangs here. Sometimes it's very short and other times it seems
>> to hang. I've left systems that appear to be hung and come back after 10
>> minutes to find that they've somehow made it through this call.
>>
>> Is there a known issue with RCU on s390 that is making this occur?
>
> No there is no known issue with RCU on s390. The reason that
> synchronize_rcu() doesn't finish is because a different cpu is within
> an endless loop in add_unformed_module() just like Jessica suspected.
>
> Note: the kernel is compiled with CONFIG_PREEMPT off - there is no
> kernel preemption that will make the looping cpu ever go over schedule
> and subsequently let synchronize_rcu() finish.
>
> To confirm Jessicas theory - looking into the dump we have:
>
> crash> bt 742
> PID: 742 TASK: 1efa6c000 CPU: 7 COMMAND: "systemd-udevd"
> #0 [3e0043aba30] __schedule at abb25e
> #1 [3e0043abaa0] schedule at abb6a2
> #2 [3e0043abac8] schedule_timeout at abf49a
> #3 [3e0043abb60] wait_for_common at abc396
> #4 [3e0043abbf0] __wait_rcu_gp at 1c0136
> #5 [3e0043abc48] synchronize_rcu at 1c72ea
> #6 [3e0043abc98] do_init_module at 1f10be
> #7 [3e0043abcf0] load_module at 1f3594
> #8 [3e0043abdd0] __se_sys_init_module at 1f3af0
> #9 [3e0043abea8] system_call at ac0766
>
> Which is the process waiting for synchronize_rcu to finish. Wading
> through the stack frames gives me this struct module:
>
> struct module {
> state = MODULE_STATE_GOING,
> list = {
> next = 0x3ff80394508,
> prev = 0xe25090 <modules>
> },
> name = "s390_trng\000...
> ...
>
> Then we have the looping task/cpu:
>
> PID: 731 TASK: 1e79ba000 CPU: 7 COMMAND: "systemd-udevd"
> LOWCORE INFO:
> -psw : 0x0704c00180000000 0x0000000000ab666a
> -function : memcmp at ab666a
> ...
> -general registers:
> 0x0000000000000009 0x0000000000000009
> 0x000003ff80347321 000000000000000000
> 0x000003ff8034f321 000000000000000000
> 0x000000000000001e 0x000003ff8c592708
> 0x000003e0047da900 0x000003ff8034f318
> 0x0000000000000001 0x0000000000000009
> 0x000003ff80347300 0x0000000000ad81b8
> 0x00000000001ee062 0x000003e004357cb0
> #0 [3e004357cf0] load_module at 1f1eb0
> #1 [3e004357dd0] __se_sys_init_module at 1f3af0
> #2 [3e004357ea8] system_call at ac0766
>
> which is find_module_all() calling memcmp with this string:
>
> 3ff80347318: 733339305f74726e 6700000000000000 s390_trng.......
>
> So it all seems to fit. A simple cond_resched() call, which enforces
> an RCU quiescent state for the calliung cpu, fixes the problem for me
> (patch on top of linux-next 20190424 -- c392798a85ab):
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 410eeb7e4f1d..48748cfec991 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3585,6 +3585,7 @@ again:
> finished_loading(mod->name));
> if (err)
> goto out_unlocked;
> + cond_resched();

You just beat me to posting this :). I am testing out this same patch in a loop
on a few systems here in the lab. Will let you know results on Monday.

P.

> goto again;
> }
> err = -EEXIST;
>

2019-04-27 10:44:40

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"



On 4/27/19 6:24 AM, Heiko Carstens wrote:

>
> diff --git a/kernel/module.c b/kernel/module.c
> index 410eeb7e4f1d..48748cfec991 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3585,6 +3585,7 @@ again:
> finished_loading(mod->name));
> if (err)
> goto out_unlocked;
> + cond_resched();
> goto again;
> }
> err = -EEXIST;
>

Heiko, I'm testing on 2-cpu systems which appear to show the problem ~10% of the
time. On another system I backed out my original patch to set a baseline, and
noticed that occasionally the time to boot the system doubles from ~4 seconds to
9 seconds. Is this something you're also concerned with?

P.

2019-04-29 05:56:34

by Heiko Carstens

[permalink] [raw]
Subject: Re: [-next] system hangs likely due to "modules: Only return -EEXIST for modules that have finished loading"

On Sat, Apr 27, 2019 at 06:42:51AM -0400, Prarit Bhargava wrote:
> On 4/27/19 6:24 AM, Heiko Carstens wrote:
>
> >
> > diff --git a/kernel/module.c b/kernel/module.c
> > index 410eeb7e4f1d..48748cfec991 100644
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -3585,6 +3585,7 @@ again:
> > finished_loading(mod->name));
> > if (err)
> > goto out_unlocked;
> > + cond_resched();
> Heiko, I'm testing on 2-cpu systems which appear to show the problem ~10% of the
> time. On another system I backed out my original patch to set a baseline, and
> noticed that occasionally the time to boot the system doubles from ~4 seconds to
> 9 seconds. Is this something you're also concerned with?

This _could_ be an issue, since I see the problem much more likely to
happen on systems with many devices (where many means only something
like 10 block devices). As far as I can tell it looks like
systemd/udevd tries to modprobe at the s390-trng module for each(!)
device.
I have no idea why it is doing that... however given that (failed)
module handling now sometimes takes more time, this might become a
real issue on system with several 1000s of block devices, which is a
realistic scenario at least on s390.