Hello,
I encountered a regression in current (post-5.0) mainline kernel which I
bisected to commit 1aec4211204d ("parport: daisy: use new parport device
model"). Running "modprobe parport_pc" hangs:
tweed:~ # ps ax | grep modprobe
1206 pts/0 D+ 0:00 modprobe parport_pc
1209 ? S 0:00 /sbin/modprobe -q -- parport_lowlevel
1211 pts/1 S+ 0:00 grep modprobe
tweed:~ # cat /proc/1206/stack
[<0>] call_usermodehelper_exec+0xc7/0x140
[<0>] __request_module+0x1a1/0x430
[<0>] __parport_register_driver+0x142/0x150 [parport]
[<0>] parport_bus_init+0x1d/0x30 [parport]
[<0>] parport_default_proc_register+0x28/0x1000 [parport]
[<0>] do_one_initcall+0x46/0x1cd
[<0>] do_init_module+0x5b/0x20d
[<0>] load_module+0x1b3d/0x20f0
[<0>] __do_sys_finit_module+0xbd/0xe0
[<0>] do_syscall_64+0x60/0x120
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<0>] 0xffffffffffffffff
tweed:~ # cat /proc/1209/stack
[<0>] load_module+0xe6a/0x20f0
[<0>] __do_sys_finit_module+0xbd/0xe0
[<0>] do_syscall_64+0x60/0x120
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<0>] 0xffffffffffffffff
call_usermodehelper_exec+0xc7/0x140 is (in a build from commit 1aec4211204d)
line 583 in kernel/umh.c:
retval = wait_for_completion_killable(&done);
and load_module+0xe6a/0x20f0 is in add_unformed_module(), line 3577 in
kernel/module.c:
err = wait_event_interruptible(module_wq,
finished_loading(mod->name));
Unfortunately I don't have a version of crash able to deal with kernels as
new as these, so I wasn't able to find out more for now.
I have seen this both on real hardware and in a VM.
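Read together, the two stacks look like a circular wait: PID 1206 sits in request_module() until the helper exits, while the helper (PID 1209) sits in load_module() until a module finishes loading. The following is a minimal user-space sketch of that wait-for relation, not kernel code. It assumes the module PID 1209 is waiting for is the one PID 1206 is still initializing, which the traces do not state explicitly. The enum names and the in_wait_cycle() helper are made up for illustration.

```c
#include <stdbool.h>

#define NO_TASK (-1)

/* Illustrative task ids only; these are not kernel identifiers. */
enum {
	MODPROBE_PARPORT_PC,	/* PID 1206 in the report */
	HELPER_LOWLEVEL,	/* PID 1209 in the report */
	NR_TASKS
};

/*
 * waits_for[t] is the task that must make progress before task t can
 * continue, or NO_TASK if t is not waiting on anyone.
 * Returns true if following the edges from 'start' leads back to 'start'.
 */
static bool in_wait_cycle(const int waits_for[], int nr_tasks, int start)
{
	int cur = start;

	/* A cycle through 'start' needs at most nr_tasks edges. */
	for (int i = 0; i < nr_tasks; i++) {
		cur = waits_for[cur];
		if (cur == NO_TASK)
			return false;
		if (cur == start)
			return true;
	}
	return false;
}

/* Wait-for edges as the two stack traces suggest them (assumption). */
static const int hang[NR_TASKS] = {
	/* request_module() waits for the helper process to exit */
	[MODPROBE_PARPORT_PC] = HELPER_LOWLEVEL,
	/* load_module() waits for the in-progress module load to finish */
	[HELPER_LOWLEVEL] = MODPROBE_PARPORT_PC,
};
```

Calling in_wait_cycle(hang, NR_TASKS, MODPROBE_PARPORT_PC) reports a cycle, which would be consistent with neither process making progress on its own.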
Michal Kubecek
Hi Michal,
On Wed, Mar 13, 2019 at 6:45 AM Michal Kubecek <[email protected]> wrote:
>
> Hello,
>
> I encountered a regression in current (post-5.0) mainline kernel which I
> bisected to commit 1aec4211204d ("parport: daisy: use new parport device
> model"). Running "modprobe parport_pc" hangs up:
Can you please send me your .config so that I can test it from my side?
>
> tweed:~ # ps ax | grep modprobe
> 1206 pts/0 D+ 0:00 modprobe parport_pc
> 1209 ? S 0:00 /sbin/modprobe -q -- parport_lowlevel
> 1211 pts/1 S+ 0:00 grep modprobe
> tweed:~ # cat /proc/1206/stack
> [<0>] call_usermodehelper_exec+0xc7/0x140
> [<0>] __request_module+0x1a1/0x430
> [<0>] __parport_register_driver+0x142/0x150 [parport]
Also, modprobe is trying to load a dependent module, so it would be
great if you could also run "lsmod" before doing the modprobe so that I
can check what is going wrong.
--
Regards
Sudip
On Sun, Mar 17, 2019 at 05:01:37PM +0000, Sudip Mukherjee wrote:
> On Wed, Mar 13, 2019 at 6:45 AM Michal Kubecek <[email protected]> wrote:
> > I encountered a regression in current (post-5.0) mainline kernel which I
> > bisected to commit 1aec4211204d ("parport: daisy: use new parport device
> > model"). Running "modprobe parport_pc" hangs up:
>
> Can you please send me your .config so that I can test it from my side.
Attaching two versions: config-full.gz is the real life config from the
machine where I found the issue and config-mini.gz is a minimized config
I was using while bisecting the issue. (I mistakenly thought that I had
seen the issue with a snapshot from before both parport commits, so I ran
a full bisect instead of simply checking the two parport commits which
came in the merge window.)
> > tweed:~ # ps ax | grep modprobe
> > 1206 pts/0 D+ 0:00 modprobe parport_pc
> > 1209 ? S 0:00 /sbin/modprobe -q -- parport_lowlevel
> > 1211 pts/1 S+ 0:00 grep modprobe
> > tweed:~ # cat /proc/1206/stack
> > [<0>] call_usermodehelper_exec+0xc7/0x140
> > [<0>] __request_module+0x1a1/0x430
> > [<0>] __parport_register_driver+0x142/0x150 [parport]
>
> And also, modprobe is trying to load a dependent module, so it will be
> great if you can also do a "lsmod" before doing the modprobe and I can
> check what is going wrong.
Attached are three lists:
- lsmod-before ... before running "modprobe parport_pc"
- lsmod-test ... while "modprobe parport_pc" is stuck
- lsmod-after ... after killing second modprobe (1209 above)
Killing the second modprobe (PID 1209 above) lets the first one finish;
as a result, parport, parport_pc and ppdev are loaded (the only
difference from lsmod-before). While modprobe is stuck, lsmod shows
parport with a refcount of 1 (again the only difference from lsmod-before).
Michal
Hi Michal,
On Sun, Mar 17, 2019 at 6:05 PM Michal Kubecek <[email protected]> wrote:
>
> On Sun, Mar 17, 2019 at 05:01:37PM +0000, Sudip Mukherjee wrote:
> > On Wed, Mar 13, 2019 at 6:45 AM Michal Kubecek <[email protected]> wrote:
> > > I encountered a regression in current (post-5.0) mainline kernel which I
> > > bisected to commit 1aec4211204d ("parport: daisy: use new parport device
> > > model"). Running "modprobe parport_pc" hangs up:
> >
> > Can you please send me your .config so that I can test it from my side.
>
> Attaching two versions: config-full.gz is the real life config from the
> machine where I found the issue and config-mini.gz is a minimized config
> I was using while bisecting the issue. (I made a mistake and thought
> that I have seen the issue with snapshot before both parport commits so
> that I ran a full bisect instead of simply checking the two parport
> commits which came in the merge window.)
Sorry, I haven't had the chance to look at it yet and have kept it
pending for this weekend. But I just had a quick look and was
wondering whether the machine on which you are trying the modprobe has
an actual parallel port or not.
Also, would you be able to send me a dmesg, please?
--
Regards
Sudip
On Wed, Mar 20, 2019 at 09:30:59AM +0000, Sudip Mukherjee wrote:
> Sorry, I didn't get the chance to look at it yet and have kept it
> pending for this weekend. But just had a quick look and I was
> wondering if the machine on which you are trying the modprobe has an
> actual parallel port or the machine is not having any parallel port.
> And also will you be able to send me a dmesg please.
Attaching dmesg output from a virtual machine which doesn't seem to have
a (virtual) parallel port. This part:
[ 63.962283] parport_pc 00:05: reported by Plug and Play ACPI
[ 63.962469] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
[ 64.061723] ppdev: user-space parallel port driver
was after I manually killed "/sbin/modprobe -q -- parport_lowlevel" which
was started during boot.
Tomorrow (when I'm in the office) I'll check what happens when I add
a parallel port to the VM and also send you dmesg output from the
physical machine where I first noticed the issue. I'm quite sure it has
a parallel port on its motherboard but it might be disabled in the BIOS;
I'll have to check.
Michal Kubecek
Hi Michal,
On Wed, Mar 20, 2019 at 9:18 PM Michal Kubecek <[email protected]> wrote:
>
> On Wed, Mar 20, 2019 at 09:30:59AM +0000, Sudip Mukherjee wrote:
> > Sorry, I didn't get the chance to look at it yet and have kept it
> > pending for this weekend. But just had a quick look and I was
> > wondering if the machine on which you are trying the modprobe has an
> > actual parallel port or the machine is not having any parallel port.
> > And also will you be able to send me a dmesg please.
>
> Attaching dmesg output from a virtual machine which doesn't seem to have
> a (virtual) parallel port. This part:
>
> [ 63.962283] parport_pc 00:05: reported by Plug and Play ACPI
> [ 63.962469] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
> [ 64.061723] ppdev: user-space parallel port driver
>
> was after I manually killed "/sbin/modprobe -q -- parport_lowlevel" which
> was started during boot.
Thanks for testing. I am unable to reproduce the problem in a VM or on a
physical machine, with or without a parallel port. But from your logs it
looks like you have an alias set for "parport_lowlevel". When the parport
module is being loaded and does not find any port in its list, it tries
to load "parport_lowlevel", and that is where you are getting the
deadlock. "parport_lowlevel" is not a real module; it should be an alias
pointing to some real module. I tried setting an alias of
"parport_lowlevel" to parport_pc but still could not reproduce the
problem.
Can you please check on your VM or machine what the alias is set to? It
should be either in "/etc/modprobe.conf" or in some conf file in the
"/etc/modprobe.d" directory. Also, would you be able to test a debug
patch on your VM?
--
Regards
Sudip
On Thursday, 21 March 2019 23:43 Sudip Mukherjee wrote:
> HI Michal,
>
> On Wed, Mar 20, 2019 at 9:18 PM Michal Kubecek <[email protected]> wrote:
> > On Wed, Mar 20, 2019 at 09:30:59AM +0000, Sudip Mukherjee wrote:
> > > Sorry, I didn't get the chance to look at it yet and have kept it
> > > pending for this weekend. But just had a quick look and I was
> > > wondering if the machine on which you are trying the modprobe has
> > > an
> > > actual parallel port or the machine is not having any parallel
> > > port.
> > > And also will you be able to send me a dmesg please.
> >
> > Attaching dmesg output from a virtual machine which doesn't seem to
> > have a (virtual) parallel port. This part:
> >
> > [ 63.962283] parport_pc 00:05: reported by Plug and Play ACPI
> > [ 63.962469] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
> > [ 64.061723] ppdev: user-space parallel port driver
> >
> > was after I manually killed "/sbin/modprobe -q -- parport_lowlevel"
> > which was started during boot.
>
> Thanks for testing. I am unable to reproduce the problem in VM or in
> machine, with or without parallel port. But from your logs it looks
> like you have an alias set for "parport_lowlevel". When parport module
> is being loaded if it does not find any port in its list, it will try
> to load "parport_lowlevel" and that is where you are getting the
> deadlock. "parport_lowlevel" is not a real module, but instead should
> be an alias pointing to some real module. I tried by setting an alias
> of parport_lowlevel" as parport_pc but still could not get the
> problem.
> Can you please check in your VM or machine what do you have the alias
> as? It should be either in "/etc/modprobe.conf" or some conf file in
> "/etc/modprobe.d" folder.
You are right, this is in /etc/modprobe.d/00-system which is part of
suse-module-tools package:
-----------------------------------------------------------------------
alias parport_lowlevel parport_pc
# disable DMA for parallel port (bnc#180390)
# Please note, the dma= and irq= options require that the io= option also be
# specified.
options parport_pc dma=none
# options parport_pc io=0x378 irq=none
# If you have multiple parallel ports, specify them this way:
# options parport_pc io=0x378,0x278 irq=none,none
-----------------------------------------------------------------------
"bnc#180390" means https://bugzilla.suse.com/show_bug.cgi?id=180390
There is a git repository for the suse-module-tools package on GitHub, but
unfortunately its history starts in 2017. As comment 98 in the bug above
shows that the alias line was already in place in 2008, the reason
won't be found in the OBS history either.
Anyway, when I comment out the alias line, "modprobe parport_pc"
succeeds immediately and loads parport, parport_pc and ppdev.
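To make the role of that one line concrete, here is a minimal user-space sketch in C of how an "alias <name> <module>" line maps a requested name to a real module, while comment and "options" lines do not. This is not how modprobe itself parses its configuration (the real tool also handles wildcards and other directives). parse_alias_line() and FIELD_LEN are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define FIELD_LEN 64

/*
 * If 'line' has the form "alias <name> <module>", copy the two fields
 * into 'name' and 'module' and return true. Comments, "options" lines
 * and anything else return false; the output buffers are only
 * meaningful when the function returns true.
 */
static bool parse_alias_line(const char *line, char name[FIELD_LEN],
			     char module[FIELD_LEN])
{
	char keyword[8];

	/* Field widths leave room for the terminating NUL in each buffer. */
	if (sscanf(line, " %7s %63s %63s", keyword, name, module) != 3)
		return false;

	return strcmp(keyword, "alias") == 0;
}
```

Fed the lines of the file above, only "alias parport_lowlevel parport_pc" would be accepted, yielding the name "parport_lowlevel" and the module "parport_pc". That matches the observation that commenting out this single line makes the hang go away.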
> And also, will you be able to test a debug patch on your VM?
Yes, definitely.
Michal
Hi Michal,
On Fri, Mar 22, 2019 at 07:13:23AM +0100, Michal Kubecek wrote:
> On Thursday, 21 March 2019 23:43 Sudip Mukherjee wrote:
> > HI Michal,
> >
> > On Wed, Mar 20, 2019 at 9:18 PM Michal Kubecek <[email protected]> wrote:
> > > On Wed, Mar 20, 2019 at 09:30:59AM +0000, Sudip Mukherjee wrote:
<snip>
> > Can you please check in your VM or machine what do you have the alias
> > as? It should be either in "/etc/modprobe.conf" or some conf file in
> > "/etc/modprobe.d" folder.
>
> You are right, this is in /etc/modprobe.d/00-system which is part of
> suse-module-tools package:
And I was able to reproduce the problem using a VM and openSUSE Tumbleweed with
next-20190322. Can you please try the attached patch, test it on your VM and
machine, and check whether it fixes the problem?
--
Regards
Sudip
On Sun, Mar 24, 2019 at 07:38:38PM +0000, Sudip Mukherjee wrote:
> And I was able to reproduce the problem using a vm and Suse Tumblewood with
> next-20190322. Can you please try the attached patch and test on your vm and
> machine and check if it fixes the problem.
>
> --
> Regards
> Sudip
> diff --git a/drivers/parport/share.c b/drivers/parport/share.c
> index 0171b8dbcdcd..f87948fbfc34 100644
> --- a/drivers/parport/share.c
> +++ b/drivers/parport/share.c
> @@ -274,7 +274,7 @@ static int port_check(struct device *dev, void *dev_drv)
>  int __parport_register_driver(struct parport_driver *drv, struct module *owner,
>  			      const char *mod_name)
>  {
> -	if (list_empty(&portlist))
> +	if (list_empty(&portlist) && strcmp(drv->name, "daisy_drv"))
>  		get_lowlevel_driver();
>
>  	if (drv->devmodel) {
Yes, with this patch (on top of v5.1-rc2), both physical machine and VM
let the module(s) load cleanly even with the alias line restored.
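For reference, the effect of the one-line change can be modelled in user space as two predicates. This is only a sketch of the condition in the diff, not the kernel function: nr_ports stands in for the state of the port list (0 meaning list_empty()), and both function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Condition before the patch: any driver registration while the port
 * list is empty triggers the request for "parport_lowlevel".
 */
static bool wants_lowlevel_before(size_t nr_ports, const char *drv_name)
{
	(void)drv_name;		/* the driver name played no role */
	return nr_ports == 0;
}

/*
 * Condition after the patch: the same, except that a driver named
 * "daisy_drv" no longer triggers the request.
 */
static bool wants_lowlevel_after(size_t nr_ports, const char *drv_name)
{
	return nr_ports == 0 && strcmp(drv_name, "daisy_drv") != 0;
}
```

With an empty port list, "daisy_drv" triggers the request under the old condition but not under the new one. Any other driver name (for example "lp", used here only as an example) behaves the same as before.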
Tested-by: Michal Kubecek <[email protected]>
Thank you,
Michal
On Mon, Mar 25, 2019 at 7:30 AM Michal Kubecek <[email protected]> wrote:
>
> On Sun, Mar 24, 2019 at 07:38:38PM +0000, Sudip Mukherjee wrote:
> > And I was able to reproduce the problem using a vm and Suse Tumblewood with
> > next-20190322. Can you please try the attached patch and test on your vm and
> > machine and check if it fixes the problem.
> >
> > --
> > Regards
> > Sudip
>
> > diff --git a/drivers/parport/share.c b/drivers/parport/share.c
> > index 0171b8dbcdcd..f87948fbfc34 100644
> > --- a/drivers/parport/share.c
> > +++ b/drivers/parport/share.c
> > @@ -274,7 +274,7 @@ static int port_check(struct device *dev, void *dev_drv)
> > int __parport_register_driver(struct parport_driver *drv, struct module *owner,
> > const char *mod_name)
> > {
> > - if (list_empty(&portlist))
> > + if (list_empty(&portlist) && strcmp(drv->name, "daisy_drv"))
> > get_lowlevel_driver();
> >
> > if (drv->devmodel) {
>
> Yes, with this patch (on top of v5.1-rc2), both physical machine and VM
> let the module(s) load cleanly even with the alias line restored.
>
> Tested-by: Michal Kubecek <[email protected]>
Thanks Michal. I will add it to my queue with your Tested-by.
btw, I think I liked using Suse. :)
--
Regards
Sudip