2003-01-17 19:15:05

by Russell King

Subject: Initcall / device model meltdown?

The initcall/device model seems to be quite fragile at the moment with
respect to not oopsing the kernel.

On many StrongARM-based systems, a multifunction chip addressed through
a serial bus is used to provide touchscreen, audio and additional digital
IO. Currently, there are many sub-drivers, some of which are modular
themselves, and some which depend on each other. They currently live
under drivers/misc, for want of a better location. They are all
initialised using module_init(), via the device model driver_register()
function.

The input drivers are also modular, and provide a device class
(input_devclass), which is registered using module_init().

One of the multifunction device drivers is a touchscreen driver, which
should obviously be part of the input device class.

Both the input core and the multifunction chip drivers are using
module_init(). The order in which these are initialised is link-order
specific, and it happens that drivers/input is initialised really late
during boot, after drivers/misc.

Since the device model requires any object to be initialised before it
is used, this causes an oops from devclass_add_driver().

We appear to have two conflicting requirements here:

1. the device model requires a certain initialisation order.
2. modules need to use module_init() which means the initialisation order
is link-order dependent, despite our multi-level initialisation system.

Obviously one solution would be to spread the drivers for this
multifunction chip throughout the kernel tree (ie, by function not
by device) so the touchscreen driver would live under drivers/input.

However, then we need to make sure that the multifunction chip's
bus type is initialised before any of the other subsystems, and of
course, the bus type is initialised using module_init() since it
lives in a module...

I think we need to re-think what we're doing with the initialisation
handling and the device model before these sorts of problems get out
of hand.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html


2003-01-17 19:24:07

by Jeff Garzik

Subject: Re: Initcall / device model meltdown?

On Fri, Jan 17, 2003 at 07:23:56PM +0000, Russell King wrote:
> 1. the device model requires a certain initialisation order.
> 2. modules need to use module_init() which means the initialisation order
> is link-order dependent, despite our multi-level initialisation system.
>
> Obviously one solution would be to spread the drivers for this
> multifunction chip throughout the kernel tree (ie, by function not
> by device) so the touchscreen driver would live under drivers/input.
>
> However, then we need to make sure that the multifunction chip's
> bus type is initialised before any of the other subsystems, and of
> course, the bus type is initialised using module_init() since it
> lives in a module...
>
> I think we need to re-think what we're doing with the initialisation
> handling and the device model before these sorts of problems get out
> of hand.

IMO this link order business is a problem that's existed for ages,
it's unrelated to the device model, and adding seven levels of
initcalls merely hid this problem a little bit.

Back when I was doing fbdev stuff, I just gave up and did things "the
old way", a la

#ifdef MODULE
module_init(my_driver);
#endif

and then call my_driver from other code, when it is built into the
kernel, overriding link order.

Not a great solution, I know. My preferred solution has always been to
explicitly list the dependencies, so a build-time tool can figure out
the link order automagically.

Jeff
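
[Editorial sketch, not part of Jeff's mail. It illustrates the idea of
listing dependencies explicitly so that a tool can derive the
initialisation order. This is a userspace toy under stated assumptions:
struct init_entry, sort_inits() and MAX_INITS are hypothetical names, and
no such build-time tool is claimed to exist in the kernel tree.]

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INITS 16   /* hypothetical limit for this toy example */

/* Hypothetical description of one built-in initcall and what it needs. */
struct init_entry {
    const char *name;
    int ndeps;
    int deps[4];       /* indices of entries that must run first */
};

/*
 * Fill order[] so that every entry appears after all of its
 * dependencies. Entries without constraints keep their original
 * (link) order. Returns 0 on success, -1 on a dependency cycle.
 */
int sort_inits(const struct init_entry *e, int n, int *order)
{
    int done[MAX_INITS] = {0};
    int placed = 0;

    assert(n <= MAX_INITS);
    while (placed < n) {
        int progress = 0;

        for (int i = 0; i < n; i++) {
            int ready = !done[i];

            /* An entry is ready once all its dependencies are placed. */
            for (int d = 0; ready && d < e[i].ndeps; d++)
                ready = done[e[i].deps[d]];
            if (ready) {
                done[i] = 1;
                order[placed++] = i;
                progress = 1;
            }
        }
        if (!progress)
            return -1;  /* nothing could be placed: cycle */
    }
    return 0;
}
```

With a touchscreen driver listed first but depending on the input core
and on its bus, the sort moves it behind both, whatever the link order.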



2003-01-17 19:47:12

by Kai Germaschewski

Subject: Re: Initcall / device model meltdown?

On Fri, 17 Jan 2003, Russell King wrote:

> Both the input core and the multifunction chip drivers are using
> module_init(). The order in which these are initialised is link-order
> specific, and it happens that drivers/input is initialised really late
> during boot, after drivers/misc.
>
> Since the device model requires any object to be initialised before it
> is used, this causes an oops from devclass_add_driver().
>
> We appear to have two conflicting requirements here:
>
> 1. the device model requires a certain initialisation order.
> 2. modules need to use module_init() which means the initialisation order
> is link-order dependent, despite our multi-level initialisation system.

I think there are basically two ways to overcome the current fragility:

o Get the init order right. Well, doing it by hand is obviously fragile,
so it needs to be done automatically. I think Rusty had patches floating
about which would ensure proper ordering depending on the exported
interfaces. Example: The pci code exports pci_register_driver(), so we
make sure that every (built-in) module which uses pci_register_driver()
runs only after the module which defines pci_register_driver() has
finished its initcall.

Note that this is how things are done today for actual modules, you
cannot load a module which depends on symbols defined in another module
which has not yet been loaded, thus not yet initialized.

However, it relies on sufficient modularization, which is true for much
of the driver business, but e.g. pci isn't modularized and thus
a bad example (USB, ISDN, etc are better ones). This method however
does not help with early arch init etc., where I think explicit ordering
is a better idea, anyway.

o Make the init order not matter. That is, make sure that the registration
routines ("pci_register_driver()") can be run safely even before
the corresponding __initcall() has executed. E.g. have
pci_register_driver() only add the driver to a (statically initialized)
list of drivers. Then, when pci_init() gets executed, walk the list of
registered drivers, call ->probe() etc.

--Kai
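
[Editorial sketch, not part of Kai's mail. It illustrates the second
option: registration only queues the driver on a statically initialised
list, and the bus init routine probes whatever was queued. This is a
userspace toy; struct toy_driver, toy_register_driver(), toy_bus_init()
and the sample touchscreen driver are hypothetical names, not the real
PCI or device model API.]

```c
#include <assert.h>
#include <stddef.h>

struct toy_driver {
    const char *name;
    int (*probe)(void);
    struct toy_driver *next;
};

/* Statically initialised, so usable before any init code has run. */
static struct toy_driver *pending_head;
static struct toy_driver **pending_tail = &pending_head;
static int bus_ready;

/* Safe to call at any time: before bus init it only queues the driver. */
void toy_register_driver(struct toy_driver *drv)
{
    assert(drv && drv->probe);
    drv->next = NULL;
    if (bus_ready) {
        drv->probe();           /* bus is up: probe immediately */
        return;
    }
    *pending_tail = drv;        /* bus not up yet: remember for later */
    pending_tail = &drv->next;
}

/* Stand-in for the bus initcall: probe everything queued so far. */
int toy_bus_init(void)
{
    int probed = 0;

    bus_ready = 1;
    for (struct toy_driver *d = pending_head; d; d = d->next) {
        d->probe();
        probed++;
    }
    pending_head = NULL;
    pending_tail = &pending_head;
    return probed;
}

/* Sample driver used to exercise the queue. */
static int probes_run;
static int ts_probe(void) { probes_run++; return 0; }
struct toy_driver ts_driver = { "touchscreen", ts_probe, NULL };
```

Registering ts_driver before toy_bus_init() runs no probe; the probe
happens once, when the bus comes up. A driver registered afterwards is
probed straight away.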


2003-01-17 20:05:37

by Russell King

Subject: Re: Initcall / device model meltdown?

On Fri, Jan 17, 2003 at 01:56:07PM -0600, Kai Germaschewski wrote:
> o Make the init order not matter. That is, make sure that the registration
> routines ("pci_register_driver()") can be run safely even before
> the corresponding __initcall() has executed. E.g. have
> pci_register_driver() only add the driver to a (statically initialized)
> list of drivers. Then, when pci_init() gets executed, walk the list of
> registered drivers, call ->probe() etc.

For each driver, you have up to two objects that have to be pre-initialised
and registered with the device model:

- the bus_type structure
- the device_class structure

The bus type is registered by the bus subsystem (eg, PCI), and the
device_class is registered by the driver subsystem (eg, input).

Until both of those have been initialised, you can't register the
driver (without oopsing.) It isn't sufficient to wait for the bus
subsystem to be initialised, you need to wait for both the bus
and driver subsystems.

I suppose a solution would be for the device model to accept the
registration of a driver or device, but if the referenced objects
are not initialised, set a count of "objects requiring initialisation".

As each object is initialised, look for unregistered drivers and
decrement their initialisation count. When it hits zero, finish the
driver registration.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
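
[Editorial sketch, not part of Russell's mail. It illustrates the
"objects requiring initialisation" count described above, as a userspace
toy. struct dep_object stands in for a bus_type or device_class; all
names here (dep_object, dep_driver, dep_driver_register(),
dep_object_init()) are hypothetical and are not device model functions.]

```c
#include <assert.h>
#include <stddef.h>

struct dep_object {             /* stands in for bus_type / device_class */
    const char *name;
    int initialised;
};

struct dep_driver {
    const char *name;
    struct dep_object *bus;
    struct dep_object *devclass;
    int pending;                /* objects still requiring initialisation */
    int registered;
    struct dep_driver *next;
};

static struct dep_driver *waiting;  /* drivers not yet fully registered */

static void finish_registration(struct dep_driver *drv)
{
    drv->registered = 1;        /* real code would bind devices here */
}

/* Accept the driver now; finish later if its bus or class is missing. */
void dep_driver_register(struct dep_driver *drv)
{
    assert(drv && drv->bus && drv->devclass);
    drv->registered = 0;
    drv->pending = !drv->bus->initialised + !drv->devclass->initialised;
    if (drv->pending == 0) {
        finish_registration(drv);
        return;
    }
    drv->next = waiting;
    waiting = drv;
}

/* Called when a bus or class comes up: release drivers waiting on it. */
void dep_object_init(struct dep_object *obj)
{
    struct dep_driver **pp = &waiting;

    if (obj->initialised)
        return;
    obj->initialised = 1;
    while (*pp) {
        struct dep_driver *drv = *pp;

        if (drv->bus == obj)
            drv->pending--;
        if (drv->devclass == obj)
            drv->pending--;
        if (drv->pending == 0) {
            *pp = drv->next;    /* unlink, then complete registration */
            finish_registration(drv);
        } else {
            pp = &drv->next;
        }
    }
}
```

A touchscreen driver registered before both its bus and the input class
starts with a count of two and is only completed once both have been
initialised, in either order.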

2003-01-17 20:39:07

by Dominik Brodowski

Subject: Re: Initcall / device model meltdown?

On Fri, Jan 17, 2003 at 09:37:22PM +0100, Jeff Garzik wrote:
> On Fri, Jan 17, 2003 at 07:23:56PM +0000, Russell King wrote:
> > 1. the device model requires a certain initialisation order.
> > 2. modules need to use module_init() which means the initialisation order
> > is link-order dependent, despite our multi-level initialisation system.

Modules don't really need module_init() -- you can use the others, too.
From include/linux/init.h:

/* Don't use these in modules, but some people do... */
#define core_initcall(fn) module_init(fn)
#define postcore_initcall(fn) module_init(fn)
#define arch_initcall(fn) module_init(fn)
#define subsys_initcall(fn) module_init(fn)
#define fs_initcall(fn) module_init(fn)
#define device_initcall(fn) module_init(fn)
#define late_initcall(fn) module_init(fn)


So it makes sense to use the appropriate initcall level even in files that
can be compiled as modules; these #defines do the work for you. We should
update that comment, though.

Dominik

--- linux-original/include/linux/init.h 2003-01-17 16:51:23.000000000 +0100
+++ linux/include/linux/init.h 2003-01-17 21:46:34.000000000 +0100
@@ -129,7 +129,10 @@

#else /* MODULE */

-/* Don't use these in modules, but some people do... */
+/* Alternatively, you can still use these initcall levels to
+ * ensure proper initialization order when modularized stuff
+ * is compiled into the kernel.
+ */
#define core_initcall(fn) module_init(fn)
#define postcore_initcall(fn) module_init(fn)
#define arch_initcall(fn) module_init(fn)