2013-08-12 07:04:24

by Jonathan Nieder

Subject: [3.8-rc3 -> 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

Hi,

Tejun Heo wrote:

> This avoids the described deadlock because iosched module doesn't use
> async and thus wouldn't invoke async_synchronize_full(). This is
> hacky and incomplete. It will deadlock if async module loading nests;
> however, this works around the known problem case and seems to be the
> best of bad options.
>
> For more details, please refer to the following thread.
>
> http://thread.gmane.org/gmane.linux.kernel/1420814

My laptop fails to boot[1] with the message 'Volume group "data" not
found'. Bisects to v3.8-rc4~17 (the above commit). Reverting that
commit on top of current "master" (d92581fcad18, 2013-08-10) produces
a working kernel. dmesg output from that working kernel attached.
More details, including .config, at [2].

Any ideas for tracking this down?

Thanks,
Jonathan

[1] Screenshot: http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=bad_3.10.3-1.jpg;att=1;bug=719464
    Screenshot in recovery mode: http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=bad_3.10.3-1_recovery.jpg;att=2;bug=719464
[2] http://bugs.debian.org/719464


Attachments:
(No filename) (1.07 kB)
dmesg (45.41 kB)

2013-08-12 15:09:08

by Tejun Heo

Subject: Re: [3.8-rc3 -> 3.8-rc4 regression] Re: [PATCH] module, async: async_synchronize_full() on module init iff async is used

Hello, Jonathan.

On Mon, Aug 12, 2013 at 12:04:11AM -0700, Jonathan Nieder wrote:
> My laptop fails to boot[1] with the message 'Volume group "data" not
> found'. Bisects to v3.8-rc4~17 (the above commit). Reverting that
> commit on top of current "master" (d92581fcad18, 2013-08-10) produces
> a working kernel. dmesg output from that working kernel attached.
> More details, including .config, at [2].
>
> Any ideas for tracking this down?

Which initrd / boot script are you using?  It looks like the lvm
assemble scripts are running before the sdX devices are detected,
leading to volume assembly failure.  Before the patch, any module load
would end up synchronizing async probes, but after the patch, modprobe
invocations which don't schedule async work themselves no longer do.
Does your boot script happen to run multiple modprobes in parallel and
proceed to configure lvm without waiting for the modprobes of the
libata drivers to finish?
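
The behaviour change described above can be illustrated with a toy
model.  This is only a sketch under stated assumptions, not kernel
code: the `Task` class, the `pending` list, the `always_sync` switch
and the "sda probe" label are all hypothetical names chosen for the
illustration.  It assumes one modprobe has scheduled an async disk
probe and is still in flight while a second, unrelated modprobe
finishes.

```python
# Toy model only: every name here is hypothetical and chosen for
# illustration; none of it mirrors actual kernel code.

pending = []  # async probes scheduled but not yet completed


class Task:
    """Stands in for one modprobe process."""

    def __init__(self):
        self.used_async = False  # set once this task schedules async work


def async_schedule(task, name):
    """Record that `task` queued an async probe called `name`."""
    task.used_async = True
    pending.append(name)


def async_synchronize_full():
    """Wait for (here: simply drain) every pending async probe."""
    done = list(pending)
    pending.clear()
    return done


def load_module(task, init, always_sync):
    """Run a module's init, then synchronize according to the policy.

    always_sync=True  models the behaviour before the patch.
    always_sync=False models the behaviour after the patch: synchronize
    only if this task itself scheduled async work.
    """
    init(task)
    if always_sync or task.used_async:
        return async_synchronize_full()
    return []


def demo(always_sync):
    """A disk driver's probe is in flight while another modprobe finishes."""
    pending.clear()
    driver_task, other_task = Task(), Task()
    async_schedule(driver_task, "sda probe")  # driver modprobe not done yet
    flushed = load_module(other_task, lambda t: None, always_sync)
    return flushed, list(pending)
```

In this model, `demo(True)` drains the disk probe as a side effect of
the unrelated module load, while `demo(False)` returns with the probe
still pending.  A boot script that relied on the side effect would
then need to wait explicitly for the driver modprobe before
configuring lvm.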

Thanks.

--
tejun