2022-04-29 21:14:48

by Lucas De Marchi

Subject: Re: [Intel-gfx] [PATCH 1/2] module: add a function to add module references

On Fri, Apr 29, 2022 at 11:23:51AM +0100, Mauro Carvalho Chehab wrote:
>On Fri, 29 Apr 2022 12:10:07 +0200
>Greg KH <[email protected]> wrote:
>
>> On Fri, Apr 29, 2022 at 10:15:03AM +0100, Mauro Carvalho Chehab wrote:
>> > Hi Greg,
>> >
>> > On Fri, 29 Apr 2022 10:30:33 +0200
>> > Greg KH <[email protected]> wrote:
>> >
>> > > On Fri, Apr 29, 2022 at 09:07:57AM +0100, Mauro Carvalho Chehab wrote:
>> > > > Hi Daniel,
>> > > >
>> > > > On Fri, 29 Apr 2022 09:54:10 +0200
>> > > > Daniel Vetter <[email protected]> wrote:
>> > > >
>> > > > > On Fri, Apr 29, 2022 at 07:31:15AM +0100, Mauro Carvalho Chehab wrote:
>> > > > > > Sometimes, device drivers are bound using indirect references,
>> > > > > > which are not visible when looking at /proc/modules or lsmod.
>> > > > > >
>> > > > > > Add a function to allow setting up module references for such
>> > > > > > cases.
>> > > > > >
>> > > > > > Reviewed-by: Dan Williams <[email protected]>
>> > > > > > Signed-off-by: Mauro Carvalho Chehab <[email protected]>
>> > > > >
>> > > > > This sounds like duct tape at the wrong level. We should have a
>> > > > > device_link connecting these devices, and maybe device_link internally
>> > > > > needs to make sure the respective driver modules stay around for long
>> > > > > enough too. But open-coding this all over the place into every driver that
>> > > > > has some kind of cross-driver dependency sounds terrible.
>> > > > >
>> > > > > Or maybe the bug is that the snd driver keeps accessing the hw/component
>> > > > > side when that is just plain gone. Iirc there's still fundamental issues
>> > > > > there on the sound side of things, which have been attempted to paper over
>> > > > > by timeouts and stuff like that in the past instead of enforcing a hard
>> > > > > link between the snd and i915 side.
>> > > >
>> > > > I agree with you that the device link between snd-hda and the DRM driver
>> > > > should properly handle unbinding in both directions. This is something
>> > > > that requires further discussion with ALSA and DRM people, and we should
>> > > > keep working on it.
>> > > >
>> > > > Yet the binding between those drivers does exist and, unlike other
>> > > > similar inter-driver bindings that are properly reported by lsmod, this
>> > > > one is invisible to userspace.
>> > > >
>> > > > What this series does is to make such binding visible. As simple as that.
>> > >
>> > > It also increases the reference count, and creates a user/kernel api
>> > > with the symlinks, right? Will the reference count increase prevent the
>> > > modules from now being unloadable?
>> > >
>> > > This feels like a very "weak" link between modules that should not be
>> > > needed if reference counting is implemented properly (so that things are
>> > > cleaned up in the correct order.)
>> >
>> > The refcount increment exists even without this patch, as
>> > hda_component_master_bind() at sound/hda/hdac_component.c uses
>> > try_module_get() when it creates the device link.
>>
>> Ok, then why shouldn't try_module_get() be creating this link instead of
>> having to manually do it this way again? You don't want to have to go
>> around and add this call to all users of that function, right?
>
>Works for me, but this is not a trivial change, as the new
>try_module_get() function will require two parameters instead of one:
>
> - the module to be referenced;
> - the module which will reference it.
>
>In trivial cases, the second one will be THIS_MODULE but, in the specific
>case of snd_hda, the binding is done via an ancillary routine under
>snd_hda_core, while the actual binding happens at snd_hda_intel.
>
>Ok, we could add a __try_module_get() (or whatever other name that
>would properly express what it does) with two parameters, and then
>define try_module_get() as:
>
> #define try_module_get(mod) __try_module_get(mod, THIS_MODULE)

Agreed that this should be done at this level rather than open-coding it
in every driver. The main improvement here, regardless of the
snd-hda-intel issue, is to properly annotate what is holding a module.

Right now we have 1) symbol module dependencies; 2) kernel references;
3) userspace references, with (2) and (3) being invisible to the user
from the lsmod point of view. Recording the holder any time
try_module_get() is called would make (2) visible to lsmod.
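
To make the idea concrete, here is a rough userspace mock of the
two-argument variant discussed above. It is only a sketch: struct module
and its fields are stand-ins rather than the kernel's definitions, and
try_module_get_holder() is a hypothetical name playing the role of the
__try_module_get() suggested by Mauro. The point is just that the
refcount bump and the holder bookkeeping happen in one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOLDERS 8

/* Mock stand-in for the kernel's struct module; fields are illustrative. */
struct module {
    const char *name;
    bool live;                                  /* false once unloading */
    int refcnt;
    const struct module *holders[MAX_HOLDERS];  /* who pinned this module */
    int nholders;
};

static struct module i915 = { .name = "i915", .live = true };
static struct module snd_hda_intel = { .name = "snd_hda_intel", .live = true };
static struct module snd_hda_core = { .name = "snd_hda_core", .live = true };

/* Assumption for the mock: this file is built as part of snd_hda_core. */
#define THIS_MODULE (&snd_hda_core)

/*
 * Hypothetical two-argument variant: take a reference on @mod and record
 * @holder as the module responsible for it.
 */
static bool try_module_get_holder(struct module *mod, const struct module *holder)
{
    if (mod == NULL)
        return true;            /* built-in code: nothing to pin */
    if (!mod->live)
        return false;           /* module is going away */

    mod->refcnt++;

    if (holder == NULL)
        return true;
    for (int i = 0; i < mod->nholders; i++)
        if (mod->holders[i] == holder)
            return true;        /* holder already recorded */
    if (mod->nholders < MAX_HOLDERS)
        mod->holders[mod->nholders++] = holder;
    assert(mod->nholders <= MAX_HOLDERS);
    return true;
}

/* The one-argument form defaults the holder to the calling module. */
#define try_module_get(mod) try_module_get_holder((mod), THIS_MODULE)
```

In this mock, a helper living in snd_hda_core that calls
try_module_get(&i915) would record snd_hda_core as the holder, which is
why the snd_hda case needs the explicit form,
try_module_get_holder(&i915, &snd_hda_intel).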

Paired with fixes to the (unreleased) kmod 30[1], this allows `modprobe
-r --remove-holders <module>` to also try removing the holders before
removing the module itself.
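
For reference, the holders that userspace can see are the entries under
/sys/module/<module>/holders/. A minimal sketch of walking that directory
follows. It is not kmod's actual code; the function name and the
sysfs_root parameter are made up for illustration (the latter only so
the function can be pointed at a test tree).

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print each holder listed under <sysfs_root>/module/<module>/holders.
 * Returns the number of holders found, or -1 if the directory cannot be
 * opened (e.g. the module is not loaded).
 */
static int print_module_holders(const char *sysfs_root, const char *module)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/module/%s/holders",
                     sysfs_root, module);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path would be truncated */

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        /* skip the "." and ".." entries */
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        printf("%s is held by %s\n", module, ent->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling print_module_holders("/sys", "i915") on a running system would
list whatever holders the kernel currently exposes for i915.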

thanks
Lucas De Marchi

[1] https://lore.kernel.org/linux-modules/[email protected]/T/#t


>
>Would that work for you?
>
>Regards,
>Mauro