2017-09-09 17:41:29

by Jakub Kicinski

Subject: Module removal-related regression?

Hi!

I'm having trouble with modules on linux/master. rmmod succeeds but the
module is still loaded and the refcount goes to 1:

#rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
/opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno ; \
lsmod | grep nfp; \
rmmod nfp; \
lsmod | grep nfp
nfp 249856 0
nfp 200704 1

If I rmmod again the module will be actually unloaded. The user space
is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to bisect
now...


2017-09-09 19:27:38

by Jakub Kicinski

Subject: [bisected] Re: Module removal-related regression?

On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> Hi!
>
> I'm having trouble with modules on linux/master. rmmod succeeds but the
> module is still loaded and the refcount goes to 1:
>
> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> lsmod | grep nfp; \
> rmmod nfp; \
> lsmod | grep nfp
> nfp 249856 0
> nfp 200704 1
>
> If I rmmod again the module will be actually unloaded. The user space
> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to bisect
> now...

Got 'em!

commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD, refs/bisect/bad)
Author: Dmitry Torokhov <[email protected]>
Date: Wed Jul 19 17:24:30 2017 -0700

driver core: emit uevents when device is bound to a driver

There are certain touch controllers that may come up in either normal
(application) or boot mode, depending on whether firmware/configuration is
corrupted when they are powered on. In boot mode the kernel does not create
input device instance (because it does not necessarily know the
characteristics of the input device in question).

Another number of controllers does not store firmware in a non-volatile
memory, and they similarly need to have firmware loaded before input device
instance is created. There are also other types of devices with similar
behavior.

There is a desire to be able to trigger firmware loading via udev, but it
has to happen only when driver is bound to a physical device (i2c or spi).
These udev actions can not use ADD events, as those happen too early, so we
are introducing BIND and UNBIND events that are emitted at the right
moment.

Also, many drivers create additional driver-specific device attributes
when binding to the device, to provide userspace with additional controls.
The new events allow userspace to adjust these driver-specific attributes
without worrying that they are not there yet.

Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


Heeello... :)

2017-09-09 19:55:54

by Dmitry Torokhov

Subject: Re: [bisected] Re: Module removal-related regression?

On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]> wrote:
> On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
>> Hi!
>>
>> I'm having trouble with modules on linux/master. rmmod succeeds but the
>> module is still loaded and the refcount goes to 1:
>>
>> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
>> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
>> lsmod | grep nfp; \
>> rmmod nfp; \
>> lsmod | grep nfp
>> nfp 249856 0
>> nfp 200704 1
>>
>> If I rmmod again the module will be actually unloaded. The user space
>> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to bisect
>> now...
>
> Got 'em!
>
> commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD, refs/bisect/bad)
> Author: Dmitry Torokhov <[email protected]>
> Date: Wed Jul 19 17:24:30 2017 -0700
>
> driver core: emit uevents when device is bound to a driver

Does it happen with all modules or only the nfp one?

It seems to work here:

dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
psmouse 135168 0
dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
dtor@dtor-glaptop3:~ $ sudo modprobe psmouse

Thanks.

--
Dmitry

2017-09-09 20:10:08

by Jakub Kicinski

Subject: Re: [bisected] Re: Module removal-related regression?

On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]> wrote:
> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> >> Hi!
> >>
> >> I'm having trouble with modules on linux/master. rmmod succeeds but the
> >> module is still loaded and the refcount goes to 1:
> >>
> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> >> lsmod | grep nfp; \
> >> rmmod nfp; \
> >> lsmod | grep nfp
> >> nfp 249856 0
> >> nfp 200704 1
> >>
> >> If I rmmod again the module will be actually unloaded. The user space
> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to bisect
> >> now...
> >
> > Got 'em!
> >
> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD, refs/bisect/bad)
> > Author: Dmitry Torokhov <[email protected]>
> > Date: Wed Jul 19 17:24:30 2017 -0700
> >
> > driver core: emit uevents when device is bound to a driver
>
> Does it happen with all modules or only nfp one?
>
> It seems to work here:
>
> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> psmouse 135168 0
> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse

Yes, and only if FW is loaded/requested successfully. Hmm...

I'm building a kernel with more debug, any particular suggestions?

2017-09-09 20:17:31

by Jakub Kicinski

Subject: Re: [bisected] Re: Module removal-related regression?

On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]> wrote:
> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> >> Hi!
> >>
> >> I'm having trouble with modules on linux/master. rmmod succeeds but the
> >> module is still loaded and the refcount goes to 1:
> >>
> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> >> lsmod | grep nfp; \
> >> rmmod nfp; \
> >> lsmod | grep nfp
> >> nfp 249856 0
> >> nfp 200704 1
> >>
> >> If I rmmod again the module will be actually unloaded. The user space
> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to bisect
> >> now...
> >
> > Got 'em!
> >
> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD, refs/bisect/bad)
> > Author: Dmitry Torokhov <[email protected]>
> > Date: Wed Jul 19 17:24:30 2017 -0700
> >
> > driver core: emit uevents when device is bound to a driver
>
> Does it happen with all modules or only nfp one?
>
> It seems to work here:
>
> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> psmouse 135168 0
> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse

It looks like the driver is actually reloaded. The driver used to
return EPROBE_DEFER, but I think it doesn't any more (rebuilding the
kernel to test that right now).

Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
else cause the driver to be loaded again?
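
One hint consistent with a reload is that the module is still listed
after rmmod but with a different size. A minimal Python sketch of that
check follows; parse_lsmod and looks_reloaded are hypothetical helper
names, and treating a size change as a sign that a different nfp.ko was
loaded is an assumption, not something confirmed in this thread.

```python
def parse_lsmod(text):
    """Map module name -> (size, use_count) from lsmod-style lines."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        modules[fields[0]] = (int(fields[1]), int(fields[2]))
    return modules


def looks_reloaded(before, after, name):
    """True if `name` is listed in both snapshots but with a different size."""
    return name in before and name in after and before[name][0] != after[name][0]


# The two lsmod lines from the original report.
before = parse_lsmod("nfp 249856 0")
after = parse_lsmod("nfp 200704 1")
print(looks_reloaded(before, after, "nfp"))
```

The size comparison is only a heuristic: it says nothing about who
triggered the load.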

2017-09-09 20:59:31

by Dmitry Torokhov

Subject: Re: [bisected] Re: Module removal-related regression?

On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski <[email protected]> wrote:
>On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
>> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]>
>wrote:
>> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
>> >> Hi!
>> >>
>> >> I'm having trouble with modules on linux/master. rmmod succeeds
>but the
>> >> module is still loaded and the refcount goes to 1:
>> >>
>> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
>> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
>> >> lsmod | grep nfp; \
>> >> rmmod nfp; \
>> >> lsmod | grep nfp
>> >> nfp 249856 0
>> >> nfp 200704 1
>> >>
>> >> If I rmmod again the module will be actually unloaded. The user
>space
>> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to
>bisect
>> >> now...
>> >
>> > Got 'em!
>> >
>> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
>refs/bisect/bad)
>> > Author: Dmitry Torokhov <[email protected]>
>> > Date: Wed Jul 19 17:24:30 2017 -0700
>> >
>> > driver core: emit uevents when device is bound to a driver
>>
>> Does it happen with all modules or only nfp one?
>>
>> It seems to work here:
>>
>> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
>> psmouse 135168 0
>> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
>> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
>> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
>
>It looks like the driver is actually reloaded. The driver used to
>return EPROBE_DEFER, but I think it doesn't any more (rebuilding the
>kernel to test that right now).
>
>Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
>else cause the driver to be loaded again?

It depends on how silly the udev rules are, but yes, this can definitely happen.


Thanks.

--
Dmitry

2017-09-09 22:03:50

by Jakub Kicinski

Subject: Re: [bisected] Re: Module removal-related regression?

On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski <[email protected]> wrote:
> >On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
> >> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]>
> >wrote:
> >> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> >> >> Hi!
> >> >>
> >> >> I'm having trouble with modules on linux/master. rmmod succeeds
> >but the
> >> >> module is still loaded and the refcount goes to 1:
> >> >>
> >> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> >> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> >> >> lsmod | grep nfp; \
> >> >> rmmod nfp; \
> >> >> lsmod | grep nfp
> >> >> nfp 249856 0
> >> >> nfp 200704 1
> >> >>
> >> >> If I rmmod again the module will be actually unloaded. The user
> >space
> >> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to
> >bisect
> >> >> now...
> >> >
> >> > Got 'em!
> >> >
> >> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
> >refs/bisect/bad)
> >> > Author: Dmitry Torokhov <[email protected]>
> >> > Date: Wed Jul 19 17:24:30 2017 -0700
> >> >
> >> > driver core: emit uevents when device is bound to a driver
> >>
> >> Does it happen with all modules or only nfp one?
> >>
> >> It seems to work here:
> >>
> >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> >> psmouse 135168 0
> >> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
> >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> >> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
> >
> >It looks like the driver is actually reloaded. The driver used to
> >return EPROBE_DEFER, but I think it doesn't any more (rebuilding the
> >kernel to test that right now).
> >
> >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> >else cause the driver to be loaded again?
>
> It depends on how silly the udev rules are, but yes, this can definitely happen.

I confirmed the driver doesn't use EPROBE_DEFER any more:

$ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
$

I tested without any udev rules in /etc/udev/, just the standard distro
ones. Same thing.

Please find attached the logs from rmmod with kobject debug enabled.


Attachments:
(No filename) (2.27 kB)
log (86.01 kB)

2017-09-10 16:21:16

by Dmitry Torokhov

Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski <[email protected]> wrote:
> > >On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
> > >> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]>
> > >wrote:
> > >> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> > >> >> Hi!
> > >> >>
> > >> >> I'm having trouble with modules on linux/master. rmmod succeeds
> > >but the
> > >> >> module is still loaded and the refcount goes to 1:
> > >> >>
> > >> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> > >> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> > >> >> lsmod | grep nfp; \
> > >> >> rmmod nfp; \
> > >> >> lsmod | grep nfp
> > >> >> nfp 249856 0
> > >> >> nfp 200704 1
> > >> >>
> > >> >> If I rmmod again the module will be actually unloaded. The user
> > >space
> > >> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to
> > >bisect
> > >> >> now...
> > >> >
> > >> > Got 'em!
> > >> >
> > >> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
> > >refs/bisect/bad)
> > >> > Author: Dmitry Torokhov <[email protected]>
> > >> > Date: Wed Jul 19 17:24:30 2017 -0700
> > >> >
> > >> > driver core: emit uevents when device is bound to a driver
> > >>
> > >> Does it happen with all modules or only nfp one?
> > >>
> > >> It seems to work here:
> > >>
> > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> > >> psmouse 135168 0
> > >> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
> > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> > >> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
> > >
> > >It looks like the driver is actually reloaded. The driver used to
> > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding the
> > >kernel to test that right now).
> > >
> > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> > >else cause the driver to be loaded again?
> >
> > It depends on how silly the udev rules are, but yes, this can definitely happen.
>
> I confirmed the driver doesn't use EPROBE_DEFER any more:
>
> $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> $

Not sure why you bring the deferrals here, they have nothing to do with
module removal. Also, deferrals are rarely issued by the leaf driver, and
more often by providers of resources (GPIO, regulator, interrupt, etc).

>
> I tested without any udev rules in /etc/udev/, just the standard distro
> ones. Same thing.

Right, so this is the default udev rule:

/lib/udev/rules.d/80-drivers.rules:

# do not edit this file, it will be overwritten on update

ACTION=="remove", GOTO="drivers_end"

ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="SD", RUN{builtin}="kmod load tifm_sd"
SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="MS", RUN{builtin}="kmod load tifm_ms"
SUBSYSTEM=="memstick", RUN{builtin}="kmod load ms_block mspro_block"
SUBSYSTEM=="i2o", RUN{builtin}="kmod load i2o_block"
SUBSYSTEM=="module", KERNEL=="parport_pc", RUN{builtin}="kmod load ppdev"
SUBSYSTEM=="serio", ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
SUBSYSTEM=="graphics", RUN{builtin}="kmod load fbcon"
KERNEL=="mtd*ro", ENV{MTD_FTL}=="smartmedia", RUN{builtin}="kmod load sm_ftl"

LABEL="drivers_end"

So udev (and systemd) want to load a kernel module on any action besides
device removal. Shortsighted decision I'd say. I'll send a patch to
systemd; in the meantime you can simply adjust your local rule to read

ACTION!="add", GOTO="drivers_end"
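
The effect of that one-line change can be sketched in Python. This
models only the guard line and the generic MODALIAS rule, not udev's
real rule engine; would_load_module, the guard labels and the MODALIAS
string are made up for illustration, and "bind"/"unbind" stand for the
new events added by the bisected commit.

```python
def would_load_module(action, modalias, guard):
    """Return True if the generic 'kmod load $env{MODALIAS}' rule would run."""
    if guard == "original":
        # ACTION=="remove", GOTO="drivers_end"
        skipped = action == "remove"
    elif guard == "add_only":
        # ACTION!="add", GOTO="drivers_end"
        skipped = action != "add"
    else:
        raise ValueError("unknown guard: " + guard)
    # ENV{MODALIAS}=="?*" matches any non-empty MODALIAS value.
    return (not skipped) and bool(modalias)


# Hypothetical MODALIAS value; a real one encodes bus and device IDs.
MODALIAS = "pci:example"

for guard in ("original", "add_only"):
    for action in ("add", "bind", "unbind", "remove"):
        print(guard, action, would_load_module(action, MODALIAS, guard))
```

With the original guard an unbind event still reaches the load rule,
which matches the reload seen right after rmmod; with the add-only
guard only add events do.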

Thanks.

--
Dmitry

2017-09-10 18:00:17

by Jakub Kicinski

Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski <[email protected]> wrote:
> > > >On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
> > > >> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski <[email protected]>
> > > >wrote:
> > > >> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> > > >> >> Hi!
> > > >> >>
> > > >> >> I'm having trouble with modules on linux/master. rmmod succeeds
> > > >but the
> > > >> >> module is still loaded and the refcount goes to 1:
> > > >> >>
> > > >> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> > > >> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> > > >> >> lsmod | grep nfp; \
> > > >> >> rmmod nfp; \
> > > >> >> lsmod | grep nfp
> > > >> >> nfp 249856 0
> > > >> >> nfp 200704 1
> > > >> >>
> > > >> >> If I rmmod again the module will be actually unloaded. The user
> > > >space
> > > >> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying to
> > > >bisect
> > > >> >> now...
> > > >> >
> > > >> > Got 'em!
> > > >> >
> > > >> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
> > > >refs/bisect/bad)
> > > >> > Author: Dmitry Torokhov <[email protected]>
> > > >> > Date: Wed Jul 19 17:24:30 2017 -0700
> > > >> >
> > > >> > driver core: emit uevents when device is bound to a driver
> > > >>
> > > >> Does it happen with all modules or only nfp one?
> > > >>
> > > >> It seems to work here:
> > > >>
> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> > > >> psmouse 135168 0
> > > >> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> > > >> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
> > > >
> > > >It looks like the driver is actually reloaded. The driver used to
> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding the
> > > >kernel to test that right now).
> > > >
> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> > > >else cause the driver to be loaded again?
> > >
> > > It depends on how silly the udev rules are, but yes, this can definitely happen.
> >
> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> >
> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> > $
>
> Not sure why you bring the deferrals here, they have nothing to do with
> module removal. Also, deferrals are rarely issued by the leaf driver, and
> more often by providers of resources (GPIO, regulator, interrupt, etc).

Yes, it's unusual, but this driver used to do it. Which is exactly why
I brought it up. Turns out it was irrelevant :)

> > I tested without any udev rules in /etc/udev/, just the standard distro
> > ones. Same thing.
>
> Right, so this is the default udev rule:
>
> /lib/udev/rules.d/80-drivers.rules:
>
> # do not edit this file, it will be overwritten on update
>
> ACTION=="remove", GOTO="drivers_end"
>
> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="SD", RUN{builtin}="kmod load tifm_sd"
> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="MS", RUN{builtin}="kmod load tifm_ms"
> SUBSYSTEM=="memstick", RUN{builtin}="kmod load ms_block mspro_block"
> SUBSYSTEM=="i2o", RUN{builtin}="kmod load i2o_block"
> SUBSYSTEM=="module", KERNEL=="parport_pc", RUN{builtin}="kmod load ppdev"
> SUBSYSTEM=="serio", ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
> SUBSYSTEM=="graphics", RUN{builtin}="kmod load fbcon"
> KERNEL=="mtd*ro", ENV{MTD_FTL}=="smartmedia", RUN{builtin}="kmod load sm_ftl"
>
> LABEL="drivers_end"
>
> So udev (and systemd) want to load kernel module on any action besides
> device removal. Shortsighted decision I'd say. I'll send a patch to
> systemd, in the mean time you can simply adjust your local rule to read
>
> ACTION!="add", GOTO="drivers_end"

Mm. That is a silly thing. You will break a lot of setups, though.

2017-09-10 18:12:24

by Dmitry Torokhov

Subject: Re: [bisected] Re: Module removal-related regression?

On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
>On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
>> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
>> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
>> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
><[email protected]> wrote:
>> > > >On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
>> > > >> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski
><[email protected]>
>> > > >wrote:
>> > > >> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
>
>> > > >> >> Hi!
>> > > >> >>
>> > > >> >> I'm having trouble with modules on linux/master. rmmod
>succeeds
>> > > >but the
>> > > >> >> module is still loaded and the refcount goes to 1:
>> > > >> >>
>> > > >> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
>> > > >> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
>> > > >> >> lsmod | grep nfp; \
>> > > >> >> rmmod nfp; \
>> > > >> >> lsmod | grep nfp
>> > > >> >> nfp 249856 0
>> > > >> >> nfp 200704 1
>> > > >> >>
>> > > >> >> If I rmmod again the module will be actually unloaded. The
>user
>> > > >space
>> > > >> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying
>to
>> > > >bisect
>> > > >> >> now...
>> > > >> >
>> > > >> > Got 'em!
>> > > >> >
>> > > >> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
>> > > >refs/bisect/bad)
>> > > >> > Author: Dmitry Torokhov <[email protected]>
>> > > >> > Date: Wed Jul 19 17:24:30 2017 -0700
>> > > >> >
>> > > >> > driver core: emit uevents when device is bound to a
>driver
>> > > >>
>> > > >> Does it happen with all modules or only nfp one?
>> > > >>
>> > > >> It seems to work here:
>> > > >>
>> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
>> > > >> psmouse 135168 0
>> > > >> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
>> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
>> > > >> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
>> > > >
>> > > >It looks like the driver is actually reloaded. The driver used
>to
>> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
>the
>> > > >kernel to test that right now).
>> > > >
>> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
>> > > >else cause the driver to be loaded again?
>> > >
>> > > It depends on how silly the udev rules are, but yes, this can
>definitely happen.
>> >
>> > I confirmed the driver doesn't use EPROBE_DEFER any more:
>> >
>> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
>> > $
>>
>> Not sure why you bring the deferrals here, they have nothing to do
>with
>> module removal. Also, deferrals are rarely issued by the leaf driver,
>and
>> more often by providers of resources (GPIO, regulator, interrupt,
>etc).
>
>Yes, it's unusual, but this driver used to do it. Which is exactly why
>I brought it up. Turns out it was irrelevant :)
>
>> > I tested without any udev rules in /etc/udev/, just the standard
>distro
>> > ones. Same thing.
>>
>> Right, so this is the default udev rule:
>>
>> /lib/udev/rules.d/80-drivers.rules:
>>
>> # do not edit this file, it will be overwritten on update
>>
>> ACTION=="remove", GOTO="drivers_end"
>>
>> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
>> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="SD", RUN{builtin}="kmod load
>tifm_sd"
>> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="MS", RUN{builtin}="kmod load
>tifm_ms"
>> SUBSYSTEM=="memstick", RUN{builtin}="kmod load ms_block mspro_block"
>> SUBSYSTEM=="i2o", RUN{builtin}="kmod load i2o_block"
>> SUBSYSTEM=="module", KERNEL=="parport_pc", RUN{builtin}="kmod load
>ppdev"
>> SUBSYSTEM=="serio", ENV{MODALIAS}=="?*", RUN{builtin}="kmod load
>$env{MODALIAS}"
>> SUBSYSTEM=="graphics", RUN{builtin}="kmod load fbcon"
>> KERNEL=="mtd*ro", ENV{MTD_FTL}=="smartmedia", RUN{builtin}="kmod load
>sm_ftl"
>>
>> LABEL="drivers_end"
>>
>> So udev (and systemd) want to load kernel module on any action
>besides
>> device removal. Shortsighted decision I'd say. I'll send a patch to
>> systemd, in the mean time you can simply adjust your local rule to
>read
>>
>> ACTION!="add", GOTO="drivers_end"
>
>Mm. That is a silly thing. You will break a lot of setups, though.

I think the priority is to have module loading working properly, and for most users once a module is loaded it stays loaded. Unloading is mostly for developers.

Luckily newer systemd versions drop events they do not recognize, so exposure is even smaller.


Thanks.

--
Dmitry

2017-09-10 18:39:42

by Jakub Kicinski

Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, 10 Sep 2017 11:12:17 -0700, Dmitry Torokhov wrote:
> On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
> >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
> ><[email protected]> wrote:
> >> > > >On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
> >> > > >> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski
> ><[email protected]>
> >> > > >wrote:
> >> > > >> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
> >
> >> > > >> >> Hi!
> >> > > >> >>
> >> > > >> >> I'm having trouble with modules on linux/master. rmmod
> >succeeds
> >> > > >but the
> >> > > >> >> module is still loaded and the refcount goes to 1:
> >> > > >> >>
> >> > > >> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
> >> > > >> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
> >> > > >> >> lsmod | grep nfp; \
> >> > > >> >> rmmod nfp; \
> >> > > >> >> lsmod | grep nfp
> >> > > >> >> nfp 249856 0
> >> > > >> >> nfp 200704 1
> >> > > >> >>
> >> > > >> >> If I rmmod again the module will be actually unloaded. The
> >user
> >> > > >space
> >> > > >> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying
> >to
> >> > > >bisect
> >> > > >> >> now...
> >> > > >> >
> >> > > >> > Got 'em!
> >> > > >> >
> >> > > >> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
> >> > > >refs/bisect/bad)
> >> > > >> > Author: Dmitry Torokhov <[email protected]>
> >> > > >> > Date: Wed Jul 19 17:24:30 2017 -0700
> >> > > >> >
> >> > > >> > driver core: emit uevents when device is bound to a
> >driver
> >> > > >>
> >> > > >> Does it happen with all modules or only nfp one?
> >> > > >>
> >> > > >> It seems to work here:
> >> > > >>
> >> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> >> > > >> psmouse 135168 0
> >> > > >> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
> >> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
> >> > > >> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
> >> > > >
> >> > > >It looks like the driver is actually reloaded. The driver used
> >to
> >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
> >the
> >> > > >kernel to test that right now).
> >> > > >
> >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> >> > > >else cause the driver to be loaded again?
> >> > >
> >> > > It depends on how silly the udev rules are, but yes, this can
> >definitely happen.
> >> >
> >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> >> >
> >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> >> > $
> >>
> >> Not sure why you bring the deferrals here, they have nothing to do
> >with
> >> module removal. Also, deferrals are rarely issued by the leaf driver,
> >and
> >> more often by providers of resources (GPIO, regulator, interrupt,
> >etc).
> >
> >Yes, it's unusual, but this driver used to do it. Which is exactly why
> >I brought it up. Turns out it was irrelevant :)
> >
> >> > I tested without any udev rules in /etc/udev/, just the standard
> >distro
> >> > ones. Same thing.
> >>
> >> Right, so this is the default udev rule:
> >>
> >> /lib/udev/rules.d/80-drivers.rules:
> >>
> >> # do not edit this file, it will be overwritten on update
> >>
> >> ACTION=="remove", GOTO="drivers_end"
> >>
> >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
> >> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="SD", RUN{builtin}="kmod load
> >tifm_sd"
> >> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="MS", RUN{builtin}="kmod load
> >tifm_ms"
> >> SUBSYSTEM=="memstick", RUN{builtin}="kmod load ms_block mspro_block"
> >> SUBSYSTEM=="i2o", RUN{builtin}="kmod load i2o_block"
> >> SUBSYSTEM=="module", KERNEL=="parport_pc", RUN{builtin}="kmod load
> >ppdev"
> >> SUBSYSTEM=="serio", ENV{MODALIAS}=="?*", RUN{builtin}="kmod load
> >$env{MODALIAS}"
> >> SUBSYSTEM=="graphics", RUN{builtin}="kmod load fbcon"
> >> KERNEL=="mtd*ro", ENV{MTD_FTL}=="smartmedia", RUN{builtin}="kmod load
> >sm_ftl"
> >>
> >> LABEL="drivers_end"
> >>
> >> So udev (and systemd) want to load kernel module on any action
> >besides
> >> device removal. Shortsighted decision I'd say. I'll send a patch to
> >> systemd, in the mean time you can simply adjust your local rule to
> >read
> >>
> >> ACTION!="add", GOTO="drivers_end"
> >
> >Mm. That is a silly thing. You will break a lot of setups, though.
>
> I think the priority it to have module loading working properly, and
> for most users once module is loaded it stays loaded. Unloading is
> mostly for developers.
>
> Luckily newer systemd versions drop events they do not recognize, so
> exposure is even smaller.

Could you point me to where that's done?

2017-09-10 18:50:45

by Dmitry Torokhov

Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, Sep 10, 2017 at 11:39 AM, Jakub Kicinski <[email protected]> wrote:
> On Sun, 10 Sep 2017 11:12:17 -0700, Dmitry Torokhov wrote:
>> On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
>> >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
>> >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
>> >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
>> >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
>> ><[email protected]> wrote:
>> >> > > >On Sat, 9 Sep 2017 12:55:51 -0700, Dmitry Torokhov wrote:
>> >> > > >> On Sat, Sep 9, 2017 at 12:27 PM, Jakub Kicinski
>> ><[email protected]>
>> >> > > >wrote:
>> >> > > >> > On Sat, 9 Sep 2017 19:41:21 +0200, Jakub Kicinski wrote:
>> >
>> >> > > >> >> Hi!
>> >> > > >> >>
>> >> > > >> >> I'm having trouble with modules on linux/master. rmmod
>> >succeeds
>> >> > > >but the
>> >> > > >> >> module is still loaded and the refcount goes to 1:
>> >> > > >> >>
>> >> > > >> >> #rmmod nfp; insmod ./src/nfp.ko nfp_pf_netdev=0 ; \
>> >> > > >> >> /opt/netronome/bin/nfp-hwinfo -n 2 assembly.partno \
>> >> > > >> >> lsmod | grep nfp; \
>> >> > > >> >> rmmod nfp; \
>> >> > > >> >> lsmod | grep nfp
>> >> > > >> >> nfp 249856 0
>> >> > > >> >> nfp 200704 1
>> >> > > >> >>
>> >> > > >> >> If I rmmod again the module will be actually unloaded. The
>> >user
>> >> > > >space
>> >> > > >> >> is mostly Ubuntu 14.04. Has anyone seen this? I'm trying
>> >to
>> >> > > >bisect
>> >> > > >> >> now...
>> >> > > >> >
>> >> > > >> > Got 'em!
>> >> > > >> >
>> >> > > >> > commit 1455cf8dbfd06aa7651dcfccbadb7a093944ca65 (HEAD,
>> >> > > >refs/bisect/bad)
>> >> > > >> > Author: Dmitry Torokhov <[email protected]>
>> >> > > >> > Date: Wed Jul 19 17:24:30 2017 -0700
>> >> > > >> >
>> >> > > >> > driver core: emit uevents when device is bound to a
>> >driver
>> >> > > >>
>> >> > > >> Does it happen with all modules or only nfp one?
>> >> > > >>
>> >> > > >> It seems to work here:
>> >> > > >>
>> >> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
>> >> > > >> psmouse 135168 0
>> >> > > >> dtor@dtor-glaptop3:~ $ sudo rmmod psmouse
>> >> > > >> dtor@dtor-glaptop3:~ $ lsmod | grep psmouse
>> >> > > >> dtor@dtor-glaptop3:~ $ sudo modprobe psmouse
>> >> > > >
>> >> > > >It looks like the driver is actually reloaded. The driver used
>> >to
>> >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
>> >the
>> >> > > >kernel to test that right now).
>> >> > > >
>> >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
>> >> > > >else cause the driver to be loaded again?
>> >> > >
>> >> > > It depends on how silly the udev rules are, but yes, this can
>> >definitely happen.
>> >> >
>> >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
>> >> >
>> >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
>> >> > $
>> >>
>> >> Not sure why you bring the deferrals here, they have nothing to do
>> >with
>> >> module removal. Also, deferrals are rarely issued by the leaf driver,
>> >and
>> >> more often by providers of resources (GPIO, regulator, interrupt,
>> >etc).
>> >
>> >Yes, it's unusual, but this driver used to do it. Which is exactly why
>> >I brought it up. Turns out it was irrelevant :)
>> >
>> >> > I tested without any udev rules in /etc/udev/, just the standard
>> >distro
>> >> > ones. Same thing.
>> >>
>> >> Right, so this is the default udev rule:
>> >>
>> >> /lib/udev/rules.d/80-drivers.rules:
>> >>
>> >> # do not edit this file, it will be overwritten on update
>> >>
>> >> ACTION=="remove", GOTO="drivers_end"
>> >>
>> >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
>> >> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="SD", RUN{builtin}="kmod load
>> >tifm_sd"
>> >> SUBSYSTEM=="tifm", ENV{TIFM_CARD_TYPE}=="MS", RUN{builtin}="kmod load
>> >tifm_ms"
>> >> SUBSYSTEM=="memstick", RUN{builtin}="kmod load ms_block mspro_block"
>> >> SUBSYSTEM=="i2o", RUN{builtin}="kmod load i2o_block"
>> >> SUBSYSTEM=="module", KERNEL=="parport_pc", RUN{builtin}="kmod load
>> >ppdev"
>> >> SUBSYSTEM=="serio", ENV{MODALIAS}=="?*", RUN{builtin}="kmod load
>> >$env{MODALIAS}"
>> >> SUBSYSTEM=="graphics", RUN{builtin}="kmod load fbcon"
>> >> KERNEL=="mtd*ro", ENV{MTD_FTL}=="smartmedia", RUN{builtin}="kmod load
>> >sm_ftl"
>> >>
>> >> LABEL="drivers_end"
>> >>
>> >> So udev (and systemd) want to load kernel module on any action
>> >besides
>> >> device removal. Shortsighted decision I'd say. I'll send a patch to
>> >> systemd, in the mean time you can simply adjust your local rule to
>> >read
>> >>
>> >> ACTION!="add", GOTO="drivers_end"
>> >
>> >Mm. That is a silly thing. You will break a lot of setups, though.
>>
>> I think the priority is to have module loading working properly, and
>> for most users once module is loaded it stays loaded. Unloading is
>> mostly for developers.
>>
>> Luckily newer systemd versions drop events they do not recognize, so
>> exposure is even smaller.
>
> Could you point me to where that's done?

https://github.com/systemd/systemd/blob/master/src/libsystemd/sd-device/device-private.c#L506-508

--
Dmitry

2017-09-10 19:09:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, Sep 10, 2017 at 11:12:17AM -0700, Dmitry Torokhov wrote:
> On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
> >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
> ><[email protected]> wrote:
> >> > > >It looks like the driver is actually reloaded. The driver used
> >to
> >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
> >the
> >> > > >kernel to test that right now).
> >> > > >
> >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> >> > > >else cause the driver to be loaded again?
> >> > >
> >> > > It depends on how silly the udev rules are, but yes, this can
> >definitely happen.
> >> >
> >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> >> >
> >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> >> > $
> >>
> >> Not sure why you bring the deferrals here, they have nothing to do
> >with
> >> module removal. Also, deferrals are rarely issued by the leaf driver,
> >and
> >> more often by providers of resources (GPIO, regulator, interrupt,
> >etc).
> >
> >Yes, it's unusual, but this driver used to do it. Which is exactly why
> >I brought it up. Turns out it was irrelevant :)
> >
> >> > I tested without any udev rules in /etc/udev/, just the standard
> >distro
> >> > ones. Same thing.
> >>
> >> Right, so this is the default udev rule:
> >>
> >> /lib/udev/rules.d/80-drivers.rules:
> >>
> >> # do not edit this file, it will be overwritten on update
> >>
> >> ACTION=="remove", GOTO="drivers_end"
> >>
> >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"

So if the new uevents do not have the MODALIAS line in them, then they
will not trigger this? Dmitry, can you see if that would fix this
problem without having to fix everyone's old versions of udev/systemd?

thanks,

greg k-h

2017-09-10 19:13:51

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, 10 Sep 2017 21:09:08 +0200, Greg Kroah-Hartman wrote:
> On Sun, Sep 10, 2017 at 11:12:17AM -0700, Dmitry Torokhov wrote:
> > On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
> > >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> > >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> > >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> > >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
> > ><[email protected]> wrote:
> > >> > > >It looks like the driver is actually reloaded. The driver used
> > >to
> > >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
> > >the
> > >> > > >kernel to test that right now).
> > >> > > >
> > >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> > >> > > >else cause the driver to be loaded again?
> > >> > >
> > >> > > It depends on how silly the udev rules are, but yes, this can
> > >definitely happen.
> > >> >
> > >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> > >> >
> > >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> > >> > $
> > >>
> > >> Not sure why you bring the deferrals here, they have nothing to do
> > >with
> > >> module removal. Also, deferrals are rarely issued by the leaf driver,
> > >and
> > >> more often by providers of resources (GPIO, regulator, interrupt,
> > >etc).
> > >
> > >Yes, it's unusual, but this driver used to do it. Which is exactly why
> > >I brought it up. Turns out it was irrelevant :)
> > >
> > >> > I tested without any udev rules in /etc/udev/, just the standard
> > >distro
> > >> > ones. Same thing.
> > >>
> > >> Right, so this is the default udev rule:
> > >>
> > >> /lib/udev/rules.d/80-drivers.rules:
> > >>
> > >> # do not edit this file, it will be overwritten on update
> > >>
> > >> ACTION=="remove", GOTO="drivers_end"
> > >>
> > >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
>
> So if the new uevents do not have the MODALIAS line in them, then they
> will not trigger this? Dmitry, can you see if that would fix this
> problem without having to fix everyone's old versions of udev/systemd?

Perhaps another option is dropping the unbind event? From the commit
message it seems like only bind is really needed ATM. Do events have
to be symmetrical?

2017-09-10 21:22:26

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, Sep 10, 2017 at 12:13 PM, Jakub Kicinski <[email protected]> wrote:
> On Sun, 10 Sep 2017 21:09:08 +0200, Greg Kroah-Hartman wrote:
>> On Sun, Sep 10, 2017 at 11:12:17AM -0700, Dmitry Torokhov wrote:
>> > On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
>> > >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
>> > >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
>> > >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
>> > >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
>> > ><[email protected]> wrote:
>> > >> > > >It looks like the driver is actually reloaded. The driver used
>> > >to
>> > >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
>> > >the
>> > >> > > >kernel to test that right now).
>> > >> > > >
>> > >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
>> > >> > > >else cause the driver to be loaded again?
>> > >> > >
>> > >> > > It depends on how silly the udev rules are, but yes, this can
>> > >definitely happen.
>> > >> >
>> > >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
>> > >> >
>> > >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
>> > >> > $
>> > >>
>> > >> Not sure why you bring the deferrals here, they have nothing to do
>> > >with
>> > >> module removal. Also, deferrals are rarely issued by the leaf driver,
>> > >and
>> > >> more often by providers of resources (GPIO, regulator, interrupt,
>> > >etc).
>> > >
>> > >Yes, it's unusual, but this driver used to do it. Which is exactly why
>> > >I brought it up. Turns out it was irrelevant :)
>> > >
>> > >> > I tested without any udev rules in /etc/udev/, just the standard
>> > >distro
>> > >> > ones. Same thing.
>> > >>
>> > >> Right, so this is the default udev rule:
>> > >>
>> > >> /lib/udev/rules.d/80-drivers.rules:
>> > >>
>> > >> # do not edit this file, it will be overwritten on update
>> > >>
>> > >> ACTION=="remove", GOTO="drivers_end"
>> > >>
>> > >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
>>
>> So if the new uevents do not have the MODALIAS line in them, then they
>> will not trigger this? Dmitry, can you see if that would fix this
>> problem without having to fix everyone's old versions of udev/systemd?

Unfortunately MODALIAS= is being added by individual subsystems having
their subsystem specific format. Unless you'd be OK with
kobject_uevent_env() poking into the generated environment and zapping
MODALIAS= environment variables for KOBJ_BIND/KOBJ_UNBIND actions.

Let me know and I can try to come up with a patch.

I'm still going to submit correction for the rule to systemd folks.

>
> Perhaps another option is dropping the unbind event? From the commit
> message it seems like only bind is really needed ATM. Do events have
> to be symmetrical?

While you are absolutely right that bind is the most important one,
I'd be hesitant removing unbind even though we do not have concrete
use case for it yet. The bind operation complements unbind, so having
bind uevent but not unbind "feels weird".

Thanks.

--
Dmitry

2017-09-11 15:23:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [bisected] Re: Module removal-related regression?

On Sun, Sep 10, 2017 at 02:22:22PM -0700, Dmitry Torokhov wrote:
> On Sun, Sep 10, 2017 at 12:13 PM, Jakub Kicinski <[email protected]> wrote:
> > On Sun, 10 Sep 2017 21:09:08 +0200, Greg Kroah-Hartman wrote:
> >> On Sun, Sep 10, 2017 at 11:12:17AM -0700, Dmitry Torokhov wrote:
> >> > On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
> >> > >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> >> > >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> >> > >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> >> > >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
> >> > ><[email protected]> wrote:
> >> > >> > > >It looks like the driver is actually reloaded. The driver used
> >> > >to
> >> > >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
> >> > >the
> >> > >> > > >kernel to test that right now).
> >> > >> > > >
> >> > >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> >> > >> > > >else cause the driver to be loaded again?
> >> > >> > >
> >> > >> > > It depends on how silly the udev rules are, but yes, this can
> >> > >definitely happen.
> >> > >> >
> >> > >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> >> > >> >
> >> > >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> >> > >> > $
> >> > >>
> >> > >> Not sure why you bring the deferrals here, they have nothing to do
> >> > >with
> >> > >> module removal. Also, deferrals are rarely issued by the leaf driver,
> >> > >and
> >> > >> more often by providers of resources (GPIO, regulator, interrupt,
> >> > >etc).
> >> > >
> >> > >Yes, it's unusual, but this driver used to do it. Which is exactly why
> >> > >I brought it up. Turns out it was irrelevant :)
> >> > >
> >> > >> > I tested without any udev rules in /etc/udev/, just the standard
> >> > >distro
> >> > >> > ones. Same thing.
> >> > >>
> >> > >> Right, so this is the default udev rule:
> >> > >>
> >> > >> /lib/udev/rules.d/80-drivers.rules:
> >> > >>
> >> > >> # do not edit this file, it will be overwritten on update
> >> > >>
> >> > >> ACTION=="remove", GOTO="drivers_end"
> >> > >>
> >> > >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
> >>
> >> So if the new uevents do not have the MODALIAS line in them, then they
> >> will not trigger this? Dmitry, can you see if that would fix this
> >> problem without having to fix everyone's old versions of udev/systemd?
>
> Unfortunately MODALIAS= is being added by individual subsystems having
> their subsystem specific format. Unless you'd be OK with
> kobject_uevent_env() poking into the generated environment and zapping
> MODALIAS= environment variables for KOBJ_BIND/KOBJ_UNBIND actions.

Hm, any reason why it should be sending these values for those uevents?
I guess it's not worth hacking around in the lower levels just for this,
to work around crazy userspace stuff.

> I'm still going to submit correction for the rule to systemd folks.

Yes please.

> > Perhaps another option is dropping the unbind event? From the commit
> > message it seems like only bind is really needed ATM. Do events have
> > to be symmetrical?
>
> While you are absolutely right that bind is the most important one,
> I'd be hesitant removing unbind even though we do not have concrete
> use case for it yet. The bind operation complements unbind, so having
> bind uevent but not unbind "feels weird".

We might want to disable it for a year or so for people to catch up with
a newer version of udev/systemd, and then turn it back on?

thanks,

greg k-h

2017-09-11 18:29:32

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [bisected] Re: Module removal-related regression?

On Mon, Sep 11, 2017 at 08:23:32AM -0700, Greg Kroah-Hartman wrote:
> On Sun, Sep 10, 2017 at 02:22:22PM -0700, Dmitry Torokhov wrote:
> > On Sun, Sep 10, 2017 at 12:13 PM, Jakub Kicinski <[email protected]> wrote:
> > > On Sun, 10 Sep 2017 21:09:08 +0200, Greg Kroah-Hartman wrote:
> > >> On Sun, Sep 10, 2017 at 11:12:17AM -0700, Dmitry Torokhov wrote:
> > >> > On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
> > >> > >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> > >> > >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> > >> > >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> > >> > >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
> > >> > ><[email protected]> wrote:
> > >> > >> > > >It looks like the driver is actually reloaded. The driver used
> > >> > >to
> > >> > >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
> > >> > >the
> > >> > >> > > >kernel to test that right now).
> > >> > >> > > >
> > >> > >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> > >> > >> > > >else cause the driver to be loaded again?
> > >> > >> > >
> > >> > >> > > It depends on how silly the udev rules are, but yes, this can
> > >> > >definitely happen.
> > >> > >> >
> > >> > >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> > >> > >> >
> > >> > >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> > >> > >> > $
> > >> > >>
> > >> > >> Not sure why you bring the deferrals here, they have nothing to do
> > >> > >with
> > >> > >> module removal. Also, deferrals are rarely issued by the leaf driver,
> > >> > >and
> > >> > >> more often by providers of resources (GPIO, regulator, interrupt,
> > >> > >etc).
> > >> > >
> > >> > >Yes, it's unusual, but this driver used to do it. Which is exactly why
> > >> > >I brought it up. Turns out it was irrelevant :)
> > >> > >
> > >> > >> > I tested without any udev rules in /etc/udev/, just the standard
> > >> > >distro
> > >> > >> > ones. Same thing.
> > >> > >>
> > >> > >> Right, so this is the default udev rule:
> > >> > >>
> > >> > >> /lib/udev/rules.d/80-drivers.rules:
> > >> > >>
> > >> > >> # do not edit this file, it will be overwritten on update
> > >> > >>
> > >> > >> ACTION=="remove", GOTO="drivers_end"
> > >> > >>
> > >> > >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
> > >>
> > >> So if the new uevents do not have the MODALIAS line in them, then they
> > >> will not trigger this? Dmitry, can you see if that would fix this
> > >> problem without having to fix everyone's old versions of udev/systemd?
> >
> > Unfortunately MODALIAS= is being added by individual subsystems having
> > their subsystem specific format. Unless you'd be OK with
> > kobject_uevent_env() poking into the generated environment and zapping
> > MODALIAS= environment variables for KOBJ_BIND/KOBJ_UNBIND actions.
>
> Hm, any reason why it should be sending these values for those uevents?
> I guess it's not worth hacking around in the lower levels just for this,
> to work around crazy userspace stuff.
>
> > I'm still going to submit correction for the rule to systemd folks.
>
> Yes please.
>
> > > Perhaps another option is dropping the unbind event? From the commit
> > > message it seems like only bind is really needed ATM. Do events have
> > > to be symmetrical?
> >
> > While you are absolutely right that bind is the most important one,
> > I'd be hesitant removing unbind even though we do not have concrete
> > use case for it yet. The bind operation complements unbind, so having
> > bind uevent but not unbind "feels weird".
>
> We might want to disable it for a year or so for people to catch up with
> a newer version of udev/systemd, and then turn it back on?

That is an option, but maybe we could have the patch below for a year or
2 instead?

Jakub, can you try and see if that works for you?

--
Dmitry

driver core: suppress sending MODALIAS in UNBIND uevents

From: Dmitry Torokhov <[email protected]>

The current udev rules cause modules to be loaded on all device events save
for "remove". With the introduction of KOBJ_BIND/KOBJ_UNBIND this causes
issues, as driver modules that have devices bound to their drivers get
immediately reloaded, and it appears to the user that module unloading does
not work.

The standard udev matching rule is the following:

ENV{MODALIAS}=="?*", RUN{builtin}+="kmod load $env{MODALIAS}"

Given that MODALIAS data is not terribly useful for UNBIND event, let's zap
it from the generated uevent environment until we get userspace updated
with the correct udev rule that only loads modules on "add" event.

Reported-by: Jakub Kicinski <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
---
lib/kobject_uevent.c | 47 +++++++++++++++++++++++++++++++++++++++--------
1 file changed, 39 insertions(+), 8 deletions(-)

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index e590523ea476..e5ccec526def 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -294,6 +294,24 @@ static void cleanup_uevent_env(struct subprocess_info *info)
}
#endif

+static void zap_modalias_env(struct kobj_uevent_env *env)
+{
+ int i;
+
+ for (i = 0; i < env->envp_idx;) {
+ if (strcmp(env->envp[i], "MODALIAS=")) {
+ i++;
+ continue;
+ }
+
+ if (i != env->envp_idx - 1)
+ memmove(&env->envp[i], &env->envp[i + 1],
+ sizeof(env->envp[i]) * env->envp_idx - 1);
+
+ env->envp_idx--;
+ }
+}
+
/**
* kobject_uevent_env - send an uevent with environmental data
*
@@ -409,16 +427,29 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
}
}

- /*
- * Mark "add" and "remove" events in the object to ensure proper
- * events to userspace during automatic cleanup. If the object did
- * send an "add" event, "remove" will automatically generated by
- * the core, if not already done by the caller.
- */
- if (action == KOBJ_ADD)
+ switch (action) {
+ case KOBJ_ADD:
+ /*
+ * Mark "add" event so we can make sure we deliver "remove"
+ * event to userspace during automatic cleanup. If
+ * the object did send an "add" event, "remove" will
+ * automatically generated by the core, if not already done
+ * by the caller.
+ */
kobj->state_add_uevent_sent = 1;
- else if (action == KOBJ_REMOVE)
+ break;
+
+ case KOBJ_REMOVE:
kobj->state_remove_uevent_sent = 1;
+ break;
+
+ case KOBJ_UNBIND:
+ zap_modalias_env(env);
+ break;
+
+ default:
+ break;
+ }

mutex_lock(&uevent_sock_mutex);
/* we will send an event, so request a new sequence number */
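
For readers following the zap_modalias_env() idea outside the kernel tree, here
is a minimal userspace sketch of removing entries with a given prefix from an
array of environment strings and compacting the array in place. It is an
illustration under stated assumptions, not the patch itself: struct env_sketch
and zap_prefix() are hypothetical names, the struct only models the envp[] /
envp_idx pair and ignores the backing text buffer, and the comparison uses
strncmp() so that "MODALIAS=<anything>" matches (strcmp() would only match the
exact string "MODALIAS=").

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the envp[]/envp_idx part of kobj_uevent_env. */
struct env_sketch {
	char *envp[8];
	int envp_idx;	/* number of valid entries in envp[] */
};

/* Drop every entry that starts with 'prefix', keeping the rest in order. */
static void zap_prefix(struct env_sketch *env, const char *prefix)
{
	size_t len;
	int i = 0;

	assert(env != NULL && prefix != NULL);
	len = strlen(prefix);

	while (i < env->envp_idx) {
		/* Prefix comparison: only the first 'len' bytes must match. */
		if (strncmp(env->envp[i], prefix, len) != 0) {
			i++;
			continue;
		}

		/* Shift only the entries after slot i down by one slot. */
		memmove(&env->envp[i], &env->envp[i + 1],
			sizeof(env->envp[0]) * (size_t)(env->envp_idx - 1 - i));
		env->envp_idx--;
		/* Do not advance i: the slot now holds the next entry. */
	}
}
```

Calling zap_prefix(&env, "MODALIAS=") on an array holding "ACTION=unbind",
"MODALIAS=pci:..." and "SEQNUM=42" should leave the two non-MODALIAS entries
in their original order with envp_idx reduced to 2.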

2017-09-12 12:00:48

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [bisected] Re: Module removal-related regression?

On Mon, 11 Sep 2017 11:29:26 -0700, Dmitry Torokhov wrote:
> On Mon, Sep 11, 2017 at 08:23:32AM -0700, Greg Kroah-Hartman wrote:
> > On Sun, Sep 10, 2017 at 02:22:22PM -0700, Dmitry Torokhov wrote:
> > > On Sun, Sep 10, 2017 at 12:13 PM, Jakub Kicinski <[email protected]> wrote:
> > > > On Sun, 10 Sep 2017 21:09:08 +0200, Greg Kroah-Hartman wrote:
> > > >> On Sun, Sep 10, 2017 at 11:12:17AM -0700, Dmitry Torokhov wrote:
> > > >> > On September 10, 2017 11:00:10 AM PDT, Jakub Kicinski <[email protected]> wrote:
> > > >> > >On Sun, 10 Sep 2017 09:21:11 -0700, Dmitry Torokhov wrote:
> > > >> > >> On Sun, Sep 10, 2017 at 12:03:38AM +0200, Jakub Kicinski wrote:
> > > >> > >> > On Sat, 09 Sep 2017 13:59:25 -0700, Dmitry Torokhov wrote:
> > > >> > >> > > On September 9, 2017 1:17:26 PM PDT, Jakub Kicinski
> > > >> > ><[email protected]> wrote:
> > > >> > >> > > >It looks like the driver is actually reloaded. The driver used
> > > >> > >to
> > > >> > >> > > >return EPROBE_DEFER, but I think it doesn't any more (rebuilding
> > > >> > >the
> > > >> > >> > > >kernel to test that right now).
> > > >> > >> > > >
> > > >> > >> > > >Could the uevent on unbind tickle Ubuntu 14.04's udev or somehow
> > > >> > >> > > >else cause the driver to be loaded again?
> > > >> > >> > >
> > > >> > >> > > It depends on how silly the udev rules are, but yes, this can
> > > >> > >definitely happen.
> > > >> > >> >
> > > >> > >> > I confirmed the driver doesn't use EPROBE_DEFER any more:
> > > >> > >> >
> > > >> > >> > $ grep -nrI EPROBE_DEFER drivers/net/ethernet/netronome/
> > > >> > >> > $
> > > >> > >>
> > > >> > >> Not sure why you bring the deferrals here, they have nothing to do
> > > >> > >with
> > > >> > >> module removal. Also, deferrals are rarely issued by the leaf driver,
> > > >> > >and
> > > >> > >> more often by providers of resources (GPIO, regulator, interrupt,
> > > >> > >etc).
> > > >> > >
> > > >> > >Yes, it's unusual, but this driver used to do it. Which is exactly why
> > > >> > >I brought it up. Turns out it was irrelevant :)
> > > >> > >
> > > >> > >> > I tested without any udev rules in /etc/udev/, just the standard
> > > >> > >distro
> > > >> > >> > ones. Same thing.
> > > >> > >>
> > > >> > >> Right, so this is the default udev rule:
> > > >> > >>
> > > >> > >> /lib/udev/rules.d/80-drivers.rules:
> > > >> > >>
> > > >> > >> # do not edit this file, it will be overwritten on update
> > > >> > >>
> > > >> > >> ACTION=="remove", GOTO="drivers_end"
> > > >> > >>
> > > >> > >> ENV{MODALIAS}=="?*", RUN{builtin}="kmod load $env{MODALIAS}"
> > > >>
> > > >> So if the new uevents do not have the MODALIAS line in them, then they
> > > >> will not trigger this? Dmitry, can you see if that would fix this
> > > >> problem without having to fix everyone's old versions of udev/systemd?
> > >
> > > Unfortunately MODALIAS= is being added by individual subsystems having
> > > their subsystem specific format. Unless you'd be OK with
> > > kobject_uevent_env() poking into the generated environment and zapping
> > > MODALIAS= environment variables for KOBJ_BIND/KOBJ_UNBIND actions.
> >
> > Hm, any reason why it should be sending these values for those uevents?
> > I guess it's not worth hacking around in the lower levels just for this,
> > to work around crazy userspace stuff.
> >
> > > I'm still going to submit correction for the rule to systemd folks.
> >
> > Yes please.
> >
> > > > Perhaps another option is dropping the unbind event? From the commit
> > > > message it seems like only bind is really needed ATM. Do events have
> > > > to be symmetrical?
> > >
> > > While you are absolutely right that bind is the most important one,
> > > I'd be hesitant removing unbind even though we do not have concrete
> > > use case for it yet. The bind operation complements unbind, so having
> > > bind uevent but not unbind "feels weird".
> >
> > We might want to disable it for a year or so for people to catch up with
> > a newer version of udev/systemd, and then turn it back on?
>
> That is an option, but maybe we could have the patch below for a year or
> 2 instead?
>
> Jakub, can you try and see if that works for you?

Unfortunately this doesn't seem to solve it :(

# modprobe nfp; lsmod | grep nfp; modprobe -r nfp; lsmod | grep nfp
nfp 1101824 0
nfp 1101824 1

2017-09-12 18:52:16

by Dmitry Torokhov

Subject: Re: [bisected] Re: Module removal-related regression?

On Tue, Sep 12, 2017 at 02:00:41PM +0200, Jakub Kicinski wrote:
> On Mon, 11 Sep 2017 11:29:26 -0700, Dmitry Torokhov wrote:
> >
> > That is an option, but maybe we could have the patch below for a year or
> > 2 instead?
> >
> > Jakub, can you try and see if that works for you?
>
> Unfortunately this doesn't seem to solve it :(
>
> # modprobe nfp; lsmod | grep nfp; modprobe -r nfp; lsmod | grep nfp
> nfp 1101824 0
> nfp 1101824 1

Well, I should have tested it before sending out. How about this one?

--
Dmitry


driver core: suppress sending MODALIAS in UNBIND uevents

From: Dmitry Torokhov <[email protected]>

The current udev rules cause modules to be loaded on all device events save
for "remove". With the introduction of KOBJ_BIND/KOBJ_UNBIND this causes
issues, as driver modules that have devices bound to their drivers get
immediately reloaded, and it appears to the user that module unloading does
not work.

The standard udev matching rule is as follows:

ENV{MODALIAS}=="?*", RUN{builtin}+="kmod load $env{MODALIAS}"

Given that MODALIAS data is not terribly useful for UNBIND event, let's zap
it from the generated uevent environment until we get userspace updated
with the correct udev rule that only loads modules on "add" event.

Reported-by: Jakub Kicinski <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
---
lib/kobject_uevent.c | 49 +++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index e590523ea476..f237a09a5862 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -294,6 +294,26 @@ static void cleanup_uevent_env(struct subprocess_info *info)
 }
 #endif

+static void zap_modalias_env(struct kobj_uevent_env *env)
+{
+	static const char modalias_prefix[] = "MODALIAS=";
+	int i;
+
+	for (i = 0; i < env->envp_idx;) {
+		if (strncmp(env->envp[i], modalias_prefix,
+			    sizeof(modalias_prefix) - 1)) {
+			i++;
+			continue;
+		}
+
+		if (i != env->envp_idx - 1)
+			memmove(&env->envp[i], &env->envp[i + 1],
+				sizeof(env->envp[i]) * env->envp_idx - 1);
+
+		env->envp_idx--;
+	}
+}
+
 /**
  * kobject_uevent_env - send an uevent with environmental data
  *
@@ -409,16 +429,29 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 		}
 	}

-	/*
-	 * Mark "add" and "remove" events in the object to ensure proper
-	 * events to userspace during automatic cleanup. If the object did
-	 * send an "add" event, "remove" will automatically generated by
-	 * the core, if not already done by the caller.
-	 */
-	if (action == KOBJ_ADD)
+	switch (action) {
+	case KOBJ_ADD:
+		/*
+		 * Mark "add" event so we can make sure we deliver "remove"
+		 * event to userspace during automatic cleanup. If
+		 * the object did send an "add" event, "remove" will
+		 * automatically generated by the core, if not already done
+		 * by the caller.
+		 */
 		kobj->state_add_uevent_sent = 1;
-	else if (action == KOBJ_REMOVE)
+		break;
+
+	case KOBJ_REMOVE:
 		kobj->state_remove_uevent_sent = 1;
+		break;
+
+	case KOBJ_UNBIND:
+		zap_modalias_env(env);
+		break;
+
+	default:
+		break;
+	}

 	mutex_lock(&uevent_sock_mutex);
 	/* we will send an event, so request a new sequence number */
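
Note on the shifting step in zap_modalias_env(): the helper removes each
"MODALIAS=" entry by sliding the following envp[] pointers down one slot. As
written, the memmove() length parses as
(sizeof(env->envp[i]) * env->envp_idx) - 1 bytes, because * binds tighter
than -, and that is not the size of the entries that follow slot i. Below is
a minimal user-space sketch of the same idea that sizes the move by the number
of trailing entries instead. It is an illustration only, not the patch's code:
struct demo_uevent_env, DEMO_ENVP_MAX and demo_zap_modalias() are hypothetical
stand-ins, and the real struct kobj_uevent_env has more fields than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ENVP_MAX 32	/* hypothetical capacity, for the demo only */

/* Hypothetical stand-in for the kernel's struct kobj_uevent_env. */
struct demo_uevent_env {
	char *envp[DEMO_ENVP_MAX];
	int envp_idx;		/* number of used slots in envp[] */
};

/* Drop every "MODALIAS=..." entry, keeping the order of the others. */
static void demo_zap_modalias(struct demo_uevent_env *env)
{
	static const char prefix[] = "MODALIAS=";
	int i = 0;

	assert(env != NULL && env->envp_idx <= DEMO_ENVP_MAX);

	while (i < env->envp_idx) {
		if (strncmp(env->envp[i], prefix, sizeof(prefix) - 1) != 0) {
			i++;
			continue;
		}

		/*
		 * envp_idx - 1 - i pointers follow slot i; shift exactly
		 * those down by one. The count is zero for the last slot,
		 * so no separate "is this the last entry" check is needed.
		 */
		memmove(&env->envp[i], &env->envp[i + 1],
			sizeof(env->envp[0]) * (size_t)(env->envp_idx - 1 - i));
		env->envp_idx--;
		/* Do not advance i: a new entry now sits in slot i. */
	}
}
```

Calling demo_zap_modalias() on an environment holding ACTION=, MODALIAS=,
SEQNUM= and a second MODALIAS= entry should leave only ACTION= and SEQNUM=,
in that order.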

2017-09-13 11:36:09

by Jakub Kicinski

Subject: Re: [bisected] Re: Module removal-related regression?

On Tue, 12 Sep 2017 11:52:11 -0700, Dmitry Torokhov wrote:
> On Tue, Sep 12, 2017 at 02:00:41PM +0200, Jakub Kicinski wrote:
> > On Mon, 11 Sep 2017 11:29:26 -0700, Dmitry Torokhov wrote:
> > >
> > > That is an option, but maybe we could have the patch below for a year or
> > > 2 instead?
> > >
> > > Jakub, can you try and see if that works for you?
> >
> > Unfortunately this doesn't seem to solve it :(
> >
> > # modprobe nfp; lsmod | grep nfp; modprobe -r nfp; lsmod | grep nfp
> > nfp 1101824 0
> > nfp 1101824 1
>
> Well, I should have tested it before sending out. How about this one?

This one is better! Consider it:

Tested-by: Jakub Kicinski <[email protected]>

Thanks!