2013-08-16 22:24:15

by Thierry Reding

[permalink] [raw]
Subject: [RFC 0/4] Early device registration

This patch series is the result of some earlier discussion[0]. Some
drivers need to probe their devices very early in the boot process (such
as IRQ chip drivers). However there is no struct device available at
that time because they are added a lot later.

The most notable case where this happens is for systems that are booted
using a device tree. A lot of those early drivers work around the issue
by "binding" themselves to the struct device_node. One disadvantage of
that is that they cannot use APIs that require a struct device (such as
regmap). Another downside is that it introduces inconsistencies in how
drivers are written.

In order to remedy that situation, this patch series introduces a way to
register devices earlier. This is more difficult than it sounds. One of
the problems is that kmalloc() is required during device registration.
Another prerequisite is sysfs since every device is represented in the
sysfs tree. To allow a device to be created before sysfs is available,
only the very minimal set of operations are performed at the early stage
and the device is added to a list of early devices that will be fully
registered once all the necessary infrastructure is available. This is
somewhat inspired by and therefore similar to deferred driver probing.

By no means is this patch series complete. I was able to boot a Tegra
device to a boot prompt without any serious regressions. One thing I did
notice is that somehow this series causes udev not to rename ethernet
devices (f.e. eth0 to enp1s0) which it does without that patch set.

One thing that this set of patches doesn't address yet is a way to probe
devices early, which might be required for some use-cases. However I'm
looking for some feedback before going further down this road.

Thierry

[0]: https://lists.ozlabs.org/pipermail/devicetree-discuss/2013-June/036542.html

Thierry Reding (4):
driver core: Register SoC bus after platform bus
driver core: Allow early registration of devices
ARM: tegra: Call of_platform_populate() early
OF: Add device pointer to struct device_node

arch/arm/mach-tegra/common.c | 33 +++++++++++++
arch/arm/mach-tegra/tegra.c | 32 -------------
drivers/base/base.h | 8 ++++
drivers/base/core.c | 107 +++++++++++++++++++++++++++++++++++++++++++
drivers/base/init.c | 3 ++
drivers/base/soc.c | 11 +----
drivers/of/device.c | 2 +
drivers/of/platform.c | 10 +---
include/linux/of.h | 2 +
9 files changed, 157 insertions(+), 51 deletions(-)

--
1.8.3.4


2013-08-16 22:16:48

by Thierry Reding

[permalink] [raw]
Subject: [RFC 4/4] OF: Add device pointer to struct device_node

Signed-off-by: Thierry Reding <[email protected]>
---
drivers/of/device.c | 2 ++
drivers/of/platform.c | 10 +---------
include/linux/of.h | 2 ++
3 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index f685e55..8f8e715 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -63,6 +63,8 @@ int of_device_add(struct platform_device *ofdev)
if (!ofdev->dev.parent)
set_dev_node(&ofdev->dev, of_node_to_nid(ofdev->dev.of_node));

+ ofdev->dev.of_node->dev = &ofdev->dev;
+
return device_add(&ofdev->dev);
}

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index b0d1ff8..0ae68bd 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -31,11 +31,6 @@ const struct of_device_id of_default_bus_match_table[] = {
{} /* Empty terminated list */
};

-static int of_dev_node_match(struct device *dev, void *data)
-{
- return dev->of_node == data;
-}
-
/**
* of_find_device_by_node - Find the platform_device associated with a node
* @np: Pointer to device tree node
@@ -44,10 +39,7 @@ static int of_dev_node_match(struct device *dev, void *data)
*/
struct platform_device *of_find_device_by_node(struct device_node *np)
{
- struct device *dev;
-
- dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match);
- return dev ? to_platform_device(dev) : NULL;
+ return np->dev ? to_platform_device(np->dev) : NULL;
}
EXPORT_SYMBOL(of_find_device_by_node);

diff --git a/include/linux/of.h b/include/linux/of.h
index 17ce8a6..e583a6d 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -65,6 +65,8 @@ struct device_node {
unsigned int unique_id;
struct of_irq_controller *irq_trans;
#endif
+
+ struct device *dev;
};

#define MAX_PHANDLE_ARGS 8
--
1.8.3.4

2013-08-16 22:21:20

by Grant Likely

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 10:55 PM, Thierry Reding
<[email protected]> wrote:
> On Fri, Aug 16, 2013 at 02:06:37PM -0700, Greg Kroah-Hartman wrote:
>> On Fri, Aug 16, 2013 at 10:39:21PM +0200, Thierry Reding wrote:
>> > +static DEFINE_MUTEX(device_early_mutex);
>> > +static LIST_HEAD(device_early_list);
>> > +static bool device_is_early = true;
>> > +
>> > +/*
>> > + * Keep a list of early registered devices so that they can be fully
>> > + * registered at a later point in time.
>> > + */
>> > +static void device_early_add(struct device *dev)
>>
>> __init?
>
> Yes.
>
>> > +{
>> > + mutex_lock(&device_early_mutex);
>> > + list_add_tail(&dev->p->early, &device_early_list);
>> > + mutex_unlock(&device_early_mutex);
>> > +}
>> > +
>> > +/*
>> > + * Mark the early device registration phase as completed.
>> > + */
>> > +int __init device_early_init(void)
>> > +{
>> > + device_is_early = false;
>> > +
>> > + return 0;
>> > +}
>> > +
>> > +/*
>> > + * Fixup platform devices instantiated from device tree. The problem is
>> > + * that since early registration happens before interrupt controllers
>> > + * have been setup, the OF core code won't know how to map interrupts.
>> > + */
>> > +int __init platform_device_early_fixup(struct platform_device *pdev)
>>
>> This shouldn't be in this file, because:
>>
>> > +/*
>> > + * Fully register early devices.
>> > + */
>> > +int __init device_early_done(void)
>> > +{
>> > + struct device_private *private;
>> > +
>> > + list_for_each_entry(private, &device_early_list, early) {
>> > + struct device *dev = private->device;
>> > + int err;
>> > +
>> > + if (dev->bus == &platform_bus_type) {
>>
>> Why special case the platform bus? We are trying to move things off of
>> the platform bus, don't make it harder to do that :)
>
> I heard about that, but I must have missed the thread where this was
> discussed. Can you point me to it?
>
>> > + struct platform_device *pdev = to_platform_device(dev);
>> > +
>> > + err = platform_device_early_fixup(pdev);
>> > + if (err < 0)
>> > + dev_err(&pdev->dev,
>> > + "failed to fixup device %s: %d\n",
>> > + dev_name(&pdev->dev), err);
>> > + }
>>
>> You should just have a bus callback that can be made here that, if
>> present, can be called. That way any bus can handle this type of thing,
>> not just the platform one.
>
> You mean something like an .early_fixup() in struct bus_type? That would
> indeed be much cleaner. As I mentioned this is a very early prototype
> and this particular hunk exists specifically to fixup the platform
> devices created by the device tree helpers so that the kernel actually
> boots to the login prompt.
>
>> Not that I really like the whole idea anyway, but I doubt there's much I
>> can do about it...
>
> Well, getting feedback from you and others is precisely the reason why I
> wanted to post this early. There must be a reason why you don't like it,
> so perhaps you can share your thoughts and we can mould this into
> something that you'd be more comfortable with.
>
> To be honest I don't particularly like it either. It's very hackish for
> core code. But on the other hand there are a few device/driver ordering
> problems that this (or something similar) would help solve. I'm
> certainly open to discuss alternatives and perhaps there's a much
> cleaner way to solve the problem.

And on that note, I think we're all in agreement that it's ugly!
Looking at it now, I don't think it is the right approach.

In the big scheme of things, there really aren't a lot of devices that
will need this functionality. Something I don't have a good handle on
is the set of devices needed to be created early. Yes, some of the
clocks and power rails need to be set up, but do all of them? Yes, the
interrupts need to be set up, but was of_irq_init() a mistake?
Cascaded interrupt controllers really could be normal devices if there
isn't anything critical for early boot connected to them (ie. the
timer). If irq references were changed to be parsed at .probe() time
instead of device registration time, the deferred probe would have
everything needed to make sure the interrupt controllers get probed
before the users.

Also, for the devices that are created early, is it really appropriate
to use APIs that require a struct device? I could easily argue that
anything done at early boot should only be the bare minimum to allow
the kernel to get to initcalls, and so mucking about with devices is
inappropriate because it really messes with the assumptions made in
the design of the driver model.

g.

2013-08-16 22:22:01

by Thierry Reding

[permalink] [raw]
Subject: [RFC 2/4] driver core: Allow early registration of devices

Systems that boot from the devicetree currently have to rely on hacks or
workarounds to setup devices such as interrupt controllers early because
no (platform) device is available yet. This introduces inconsistencies
in the way that these drivers have to be written and precludes them from
using APIs (such as regmap or device logging functions) that rely on a
struct device being available.

This patch is an attempt at rectifying this by allowing devices to be
registered early. Early in this case means sometime after kmalloc() can
be used but before sysfs has been setup. The latter is important because
a large part of the device registration is hooking the device up with
sysfs. Since sysfs becomes available rather late in the boot process,
devices added early will be temporarily added to a list and not hooked
up to sysfs.

Once sysfs is up (or more precisely within driver_init()), the list of
early devices is processed to fully register the devices

Signed-off-by: Thierry Reding <[email protected]>
---
drivers/base/base.h | 3 ++
drivers/base/core.c | 107 ++++++++++++++++++++++++++++++++++++++++++++++++++++
drivers/base/init.c | 2 +
3 files changed, 112 insertions(+)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 77e0722..bf33d2e 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -76,6 +76,7 @@ struct device_private {
struct klist_node knode_driver;
struct klist_node knode_bus;
struct list_head deferred_probe;
+ struct list_head early;
void *driver_data;
struct device *device;
};
@@ -98,6 +99,8 @@ extern int hypervisor_init(void);
#else
static inline int hypervisor_init(void) { return 0; }
#endif
+extern int device_early_init(void);
+extern int device_early_done(void);
extern int platform_bus_init(void);
#ifdef CONFIG_SOC_BUS
extern int soc_bus_init(void);
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 09a99d6..cb240a7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -20,6 +20,7 @@
#include <linux/notifier.h>
#include <linux/of.h>
#include <linux/of_device.h>
+#include <linux/of_irq.h>
#include <linux/genhd.h>
#include <linux/kallsyms.h>
#include <linux/mutex.h>
@@ -1020,6 +1021,102 @@ int device_private_init(struct device *dev)
klist_init(&dev->p->klist_children, klist_children_get,
klist_children_put);
INIT_LIST_HEAD(&dev->p->deferred_probe);
+ INIT_LIST_HEAD(&dev->p->early);
+ return 0;
+}
+
+/*
+ * Early device registration support
+ *
+ * Some devices such as interrupt controllers need to be available very
+ * early in the boot process. One of the bigger issues is that a lot of
+ * the device registration code relies on sysfs having been initialized
+ * in order to construct proper entries. However that has not completed
+ * yet this early, so we initialize only the very minimal fields of the
+ * device early and defer full initialization (including creating sysfs
+ * directories and links) until later.
+ */
+static DEFINE_MUTEX(device_early_mutex);
+static LIST_HEAD(device_early_list);
+static bool device_is_early = true;
+
+/*
+ * Keep a list of early registered devices so that they can be fully
+ * registered at a later point in time.
+ */
+static void device_early_add(struct device *dev)
+{
+ mutex_lock(&device_early_mutex);
+ list_add_tail(&dev->p->early, &device_early_list);
+ mutex_unlock(&device_early_mutex);
+}
+
+/*
+ * Mark the early device registration phase as completed.
+ */
+int __init device_early_init(void)
+{
+ device_is_early = false;
+
+ return 0;
+}
+
+/*
+ * Fixup platform devices instantiated from device tree. The problem is
+ * that since early registration happens before interrupt controllers
+ * have been setup, the OF core code won't know how to map interrupts.
+ */
+int __init platform_device_early_fixup(struct platform_device *pdev)
+{
+ struct device_node *np = pdev->dev.of_node;
+ unsigned int num_reg, num_irq, num;
+ struct resource *res;
+ int err;
+
+ num_reg = pdev->num_resources;
+ num_irq = of_irq_count(np);
+ num = num_reg + num_irq;
+
+ res = krealloc(pdev->resource, sizeof(*res) * num, GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ err = of_irq_to_resource_table(np, res + num_reg, num_irq);
+ if (err != num_irq)
+ return -EINVAL;
+
+ pdev->num_resources = num_reg + num_irq;
+ pdev->resource = res;
+
+ return 0;
+}
+
+/*
+ * Fully register early devices.
+ */
+int __init device_early_done(void)
+{
+ struct device_private *private;
+
+ list_for_each_entry(private, &device_early_list, early) {
+ struct device *dev = private->device;
+ int err;
+
+ if (dev->bus == &platform_bus_type) {
+ struct platform_device *pdev = to_platform_device(dev);
+
+ err = platform_device_early_fixup(pdev);
+ if (err < 0)
+ dev_err(&pdev->dev,
+ "failed to fixup device %s: %d\n",
+ dev_name(&pdev->dev), err);
+ }
+
+ err = device_add(dev);
+ if (err < 0)
+ pr_err("failed to add device %s\n", dev_name(dev));
+ }
+
return 0;
}

@@ -1063,6 +1160,16 @@ int device_add(struct device *dev)
}

/*
+ * If the device is registered early, defer everything below to a
+ * later point in time where sysfs has been properly initialized.
+ */
+ if (device_is_early) {
+ device_early_add(dev);
+ put_device(dev);
+ return 0;
+ }
+
+ /*
* for statically allocated devices, which should all be converted
* some day, we need to initialize the name. We prevent reading back
* the name, and force the use of dev_name()
diff --git a/drivers/base/init.c b/drivers/base/init.c
index a4c4090..caa1c43 100644
--- a/drivers/base/init.c
+++ b/drivers/base/init.c
@@ -30,8 +30,10 @@ void __init driver_init(void)
/* These are also core pieces, but must come after the
* core core pieces.
*/
+ device_early_init();
platform_bus_init();
soc_bus_init();
+ device_early_done();
cpu_dev_init();
memory_dev_init();
}
--
1.8.3.4

2013-08-16 22:22:15

by Thierry Reding

[permalink] [raw]
Subject: [RFC 3/4] ARM: tegra: Call of_platform_populate() early

Using the new early device registration functionality it is now possible
to register devices much earlier in the boot process. Move the call to
of_platform_populate() from tegra_dt_init() to tegra_dt_init_irq() and
allow drivers that require to be run early to access the struct device
associated with a device node.

Signed-off-by: Thierry Reding <[email protected]>
---
arch/arm/mach-tegra/common.c | 33 +++++++++++++++++++++++++++++++++
arch/arm/mach-tegra/tegra.c | 32 --------------------------------
2 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/arch/arm/mach-tegra/common.c b/arch/arm/mach-tegra/common.c
index 94a119a..24dc09b 100644
--- a/arch/arm/mach-tegra/common.c
+++ b/arch/arm/mach-tegra/common.c
@@ -25,6 +25,9 @@
#include <linux/reboot.h>
#include <linux/irqchip.h>
#include <linux/clk-provider.h>
+#include <linux/sys_soc.h>
+#include <linux/slab.h>
+#include <linux/of_platform.h>

#include <asm/hardware/cache-l2x0.h>

@@ -61,6 +64,36 @@ u32 tegra_uart_config[4] = {
#ifdef CONFIG_OF
void __init tegra_dt_init_irq(void)
{
+ struct soc_device_attribute *soc_dev_attr;
+ struct soc_device *soc_dev;
+ struct device *parent = NULL;
+
+ soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
+ if (!soc_dev_attr)
+ goto out;
+
+ soc_dev_attr->family = kasprintf(GFP_KERNEL, "Tegra");
+ soc_dev_attr->revision = kasprintf(GFP_KERNEL, "%d", tegra_revision);
+ soc_dev_attr->soc_id = kasprintf(GFP_KERNEL, "%d", tegra_chip_id);
+
+ soc_dev = soc_device_register(soc_dev_attr);
+ if (IS_ERR(soc_dev)) {
+ kfree(soc_dev_attr->family);
+ kfree(soc_dev_attr->revision);
+ kfree(soc_dev_attr->soc_id);
+ kfree(soc_dev_attr);
+ goto out;
+ }
+
+ parent = soc_device_to_device(soc_dev);
+
+ /*
+ * Finished with the static registrations now; fill in the missing
+ * devices
+ */
+out:
+ of_platform_populate(NULL, of_default_bus_match_table, NULL, parent);
+
of_clk_init(NULL);
tegra_pmc_init();
tegra_init_irq();
diff --git a/arch/arm/mach-tegra/tegra.c b/arch/arm/mach-tegra/tegra.c
index 5b86055..9abb71e 100644
--- a/arch/arm/mach-tegra/tegra.c
+++ b/arch/arm/mach-tegra/tegra.c
@@ -27,11 +27,9 @@
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_fdt.h>
-#include <linux/of_platform.h>
#include <linux/pda_power.h>
#include <linux/io.h>
#include <linux/slab.h>
-#include <linux/sys_soc.h>
#include <linux/usb/tegra_usb_phy.h>
#include <linux/clk/tegra.h>

@@ -47,37 +45,7 @@

static void __init tegra_dt_init(void)
{
- struct soc_device_attribute *soc_dev_attr;
- struct soc_device *soc_dev;
- struct device *parent = NULL;
-
tegra_clocks_apply_init_table();
-
- soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL);
- if (!soc_dev_attr)
- goto out;
-
- soc_dev_attr->family = kasprintf(GFP_KERNEL, "Tegra");
- soc_dev_attr->revision = kasprintf(GFP_KERNEL, "%d", tegra_revision);
- soc_dev_attr->soc_id = kasprintf(GFP_KERNEL, "%d", tegra_chip_id);
-
- soc_dev = soc_device_register(soc_dev_attr);
- if (IS_ERR(soc_dev)) {
- kfree(soc_dev_attr->family);
- kfree(soc_dev_attr->revision);
- kfree(soc_dev_attr->soc_id);
- kfree(soc_dev_attr);
- goto out;
- }
-
- parent = soc_device_to_device(soc_dev);
-
- /*
- * Finished with the static registrations now; fill in the missing
- * devices
- */
-out:
- of_platform_populate(NULL, of_default_bus_match_table, NULL, parent);
}

static void __init paz00_init(void)
--
1.8.3.4

2013-08-16 22:22:43

by Thierry Reding

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 02:06:37PM -0700, Greg Kroah-Hartman wrote:
> On Fri, Aug 16, 2013 at 10:39:21PM +0200, Thierry Reding wrote:
> > +static DEFINE_MUTEX(device_early_mutex);
> > +static LIST_HEAD(device_early_list);
> > +static bool device_is_early = true;
> > +
> > +/*
> > + * Keep a list of early registered devices so that they can be fully
> > + * registered at a later point in time.
> > + */
> > +static void device_early_add(struct device *dev)
>
> __init?

Yes.

> > +{
> > + mutex_lock(&device_early_mutex);
> > + list_add_tail(&dev->p->early, &device_early_list);
> > + mutex_unlock(&device_early_mutex);
> > +}
> > +
> > +/*
> > + * Mark the early device registration phase as completed.
> > + */
> > +int __init device_early_init(void)
> > +{
> > + device_is_early = false;
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * Fixup platform devices instantiated from device tree. The problem is
> > + * that since early registration happens before interrupt controllers
> > + * have been setup, the OF core code won't know how to map interrupts.
> > + */
> > +int __init platform_device_early_fixup(struct platform_device *pdev)
>
> This shouldn't be in this file, because:
>
> > +/*
> > + * Fully register early devices.
> > + */
> > +int __init device_early_done(void)
> > +{
> > + struct device_private *private;
> > +
> > + list_for_each_entry(private, &device_early_list, early) {
> > + struct device *dev = private->device;
> > + int err;
> > +
> > + if (dev->bus == &platform_bus_type) {
>
> Why special case the platform bus? We are trying to move things off of
> the platform bus, don't make it harder to do that :)

I heard about that, but I must have missed the thread where this was
discussed. Can you point me to it?

> > + struct platform_device *pdev = to_platform_device(dev);
> > +
> > + err = platform_device_early_fixup(pdev);
> > + if (err < 0)
> > + dev_err(&pdev->dev,
> > + "failed to fixup device %s: %d\n",
> > + dev_name(&pdev->dev), err);
> > + }
>
> You should just have a bus callback that can be made here that, if
> present, can be called. That way any bus can handle this type of thing,
> not just the platform one.

You mean something like an .early_fixup() in struct bus_type? That would
indeed be much cleaner. As I mentioned this is a very early prototype
and this particular hunk exists specifically to fixup the platform
devices created by the device tree helpers so that the kernel actually
boots to the login prompt.

> Not that I really like the whole idea anyway, but I doubt there's much I
> can do about it...

Well, getting feedback from you and others is precisely the reason why I
wanted to post this early. There must be a reason why you don't like it,
so perhaps you can share your thoughts and we can mould this into
something that you'd be more comfortable with.

To be honest I don't particularly like it either. It's very hackish for
core code. But on the other hand there are a few device/driver ordering
problems that this (or something similar) would help solve. I'm
certainly open to discuss alternatives and perhaps there's a much
cleaner way to solve the problem.

Thierry


Attachments:
(No filename) (3.10 kB)
(No filename) (836.00 B)
Download all attachments

2013-08-16 22:24:39

by Thierry Reding

[permalink] [raw]
Subject: [RFC 1/4] driver core: Register SoC bus after platform bus

Instead of registering the SoC bus as a core initcall, call it directly
from driver_init() to make it explicit when it is called. This is done
in preparation for early device registration support.

Signed-off-by: Thierry Reding <[email protected]>
---
drivers/base/base.h | 5 +++++
drivers/base/init.c | 1 +
drivers/base/soc.c | 11 +----------
3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 2cbc677..77e0722 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -99,6 +99,11 @@ extern int hypervisor_init(void);
static inline int hypervisor_init(void) { return 0; }
#endif
extern int platform_bus_init(void);
+#ifdef CONFIG_SOC_BUS
+extern int soc_bus_init(void);
+#else
+static inline int soc_bus_init(void) { return 0; }
+#endif
extern void cpu_dev_init(void);

struct kobject *virtual_device_parent(struct device *dev);
diff --git a/drivers/base/init.c b/drivers/base/init.c
index c16f0b8..a4c4090 100644
--- a/drivers/base/init.c
+++ b/drivers/base/init.c
@@ -31,6 +31,7 @@ void __init driver_init(void)
* core core pieces.
*/
platform_bus_init();
+ soc_bus_init();
cpu_dev_init();
memory_dev_init();
}
diff --git a/drivers/base/soc.c b/drivers/base/soc.c
index 72b5e72..ea116f4 100644
--- a/drivers/base/soc.c
+++ b/drivers/base/soc.c
@@ -166,16 +166,7 @@ void soc_device_unregister(struct soc_device *soc_dev)
device_unregister(&soc_dev->dev);
}

-static int __init soc_bus_register(void)
+int __init soc_bus_init(void)
{
return bus_register(&soc_bus_type);
}
-core_initcall(soc_bus_register);
-
-static void __exit soc_bus_unregister(void)
-{
- ida_destroy(&soc_ida);
-
- bus_unregister(&soc_bus_type);
-}
-module_exit(soc_bus_unregister);
--
1.8.3.4

2013-08-17 00:11:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 10:39:21PM +0200, Thierry Reding wrote:
> +static DEFINE_MUTEX(device_early_mutex);
> +static LIST_HEAD(device_early_list);
> +static bool device_is_early = true;
> +
> +/*
> + * Keep a list of early registered devices so that they can be fully
> + * registered at a later point in time.
> + */
> +static void device_early_add(struct device *dev)

__init?

> +{
> + mutex_lock(&device_early_mutex);
> + list_add_tail(&dev->p->early, &device_early_list);
> + mutex_unlock(&device_early_mutex);
> +}
> +
> +/*
> + * Mark the early device registration phase as completed.
> + */
> +int __init device_early_init(void)
> +{
> + device_is_early = false;
> +
> + return 0;
> +}
> +
> +/*
> + * Fixup platform devices instantiated from device tree. The problem is
> + * that since early registration happens before interrupt controllers
> + * have been setup, the OF core code won't know how to map interrupts.
> + */
> +int __init platform_device_early_fixup(struct platform_device *pdev)

This shouldn't be in this file, because:

> +/*
> + * Fully register early devices.
> + */
> +int __init device_early_done(void)
> +{
> + struct device_private *private;
> +
> + list_for_each_entry(private, &device_early_list, early) {
> + struct device *dev = private->device;
> + int err;
> +
> + if (dev->bus == &platform_bus_type) {

Why special case the platform bus? We are trying to move things off of
the platform bus, don't make it harder to do that :)

> + struct platform_device *pdev = to_platform_device(dev);
> +
> + err = platform_device_early_fixup(pdev);
> + if (err < 0)
> + dev_err(&pdev->dev,
> + "failed to fixup device %s: %d\n",
> + dev_name(&pdev->dev), err);
> + }

You should just have a bus callback that can be made here that, if
present, can be called. That way any bus can handle this type of thing,
not just the platform one.


Not that I really like the whole idea anyway, but I doubt there's much I
can do about it...

thanks,

greg k-h

2013-08-17 00:40:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 11:55:31PM +0200, Thierry Reding wrote:
> > > + list_for_each_entry(private, &device_early_list, early) {
> > > + struct device *dev = private->device;
> > > + int err;
> > > +
> > > + if (dev->bus == &platform_bus_type) {
> >
> > Why special case the platform bus? We are trying to move things off of
> > the platform bus, don't make it harder to do that :)
>
> I heard about that, but I must have missed the thread where this was
> discussed. Can you point me to it?

I thought it was on lkml, but it might have been limited to linux-arm
and the ksummit-discuss mailing list on one of the threads there.
Sorry, I can't recall it at the moment.

> > > + struct platform_device *pdev = to_platform_device(dev);
> > > +
> > > + err = platform_device_early_fixup(pdev);
> > > + if (err < 0)
> > > + dev_err(&pdev->dev,
> > > + "failed to fixup device %s: %d\n",
> > > + dev_name(&pdev->dev), err);
> > > + }
> >
> > You should just have a bus callback that can be made here that, if
> > present, can be called. That way any bus can handle this type of thing,
> > not just the platform one.
>
> You mean something like an .early_fixup() in struct bus_type?

Yes.

> That would indeed be much cleaner. As I mentioned this is a very early
> prototype and this particular hunk exists specifically to fixup the
> platform devices created by the device tree helpers so that the kernel
> actually boots to the login prompt.

That's fine, I was giving you feedback on why I wouldn't take this as-is :)

> > Not that I really like the whole idea anyway, but I doubt there's much I
> > can do about it...
>
> Well, getting feedback from you and others is precisely the reason why I
> wanted to post this early. There must be a reason why you don't like it,
> so perhaps you can share your thoughts and we can mould this into
> something that you'd be more comfortable with.

Ideally we would have the rest of the system up and running (i.e. sysfs)
before having to deal with devices and drivers. As it is, you are
"special casing" a number of special devices on special platforms to
work around architectural issues on those platforms. Stuff that ideally
would never need to be done, except for crazy hardware designers :)

So I understand that it is needed, in some special cases, but that
doesn't mean I have to like it...

> To be honest I don't particularly like it either. It's very hackish for
> core code. But on the other hand there are a few device/driver ordering
> problems that this (or something similar) would help solve. I'm
> certainly open to discuss alternatives and perhaps there's a much
> cleaner way to solve the problem.

As this usually is needed on a case-by-case basis, I'm hoping that your
case really does need this, right? If so, there's not much the kernel
can do except add hooks like this to make it work properly.

thanks,

greg k-h

2013-08-17 00:40:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 11:20:55PM +0100, Grant Likely wrote:
> Also, for the devices that are created early, is it really appropriate
> to use APIs that require a struct device? I could easily argue that
> anything done at early boot should only be the bare minimum to allow
> the kernel to get to initcalls, and so mucking about with devices is
> inappropriate because it really messes with the assumptions made in
> the design of the driver model.

I agree, this could get really messy, really fast, and be a place that
people abuse quite easily.

greg k-h

2013-08-17 10:26:49

by Tomasz Figa

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Friday 16 of August 2013 23:20:55 Grant Likely wrote:
> On Fri, Aug 16, 2013 at 10:55 PM, Thierry Reding
>
> <[email protected]> wrote:
> > On Fri, Aug 16, 2013 at 02:06:37PM -0700, Greg Kroah-Hartman wrote:
> >> On Fri, Aug 16, 2013 at 10:39:21PM +0200, Thierry Reding wrote:
> >> > +static DEFINE_MUTEX(device_early_mutex);
> >> > +static LIST_HEAD(device_early_list);
> >> > +static bool device_is_early = true;
> >> > +
> >> > +/*
> >> > + * Keep a list of early registered devices so that they can be
> >> > fully
> >> > + * registered at a later point in time.
> >> > + */
> >> > +static void device_early_add(struct device *dev)
> >>
> >> __init?
> >
> > Yes.
> >
> >> > +{
> >> > + mutex_lock(&device_early_mutex);
> >> > + list_add_tail(&dev->p->early, &device_early_list);
> >> > + mutex_unlock(&device_early_mutex);
> >> > +}
> >> > +
> >> > +/*
> >> > + * Mark the early device registration phase as completed.
> >> > + */
> >> > +int __init device_early_init(void)
> >> > +{
> >> > + device_is_early = false;
> >> > +
> >> > + return 0;
> >> > +}
> >> > +
> >> > +/*
> >> > + * Fixup platform devices instantiated from device tree. The
> >> > problem is + * that since early registration happens before
> >> > interrupt controllers + * have been setup, the OF core code won't
> >> > know how to map interrupts. + */
> >> > +int __init platform_device_early_fixup(struct platform_device
> >> > *pdev)
> >>
> >> This shouldn't be in this file, because:
> >> > +/*
> >> > + * Fully register early devices.
> >> > + */
> >> > +int __init device_early_done(void)
> >> > +{
> >> > + struct device_private *private;
> >> > +
> >> > + list_for_each_entry(private, &device_early_list, early) {
> >> > + struct device *dev = private->device;
> >> > + int err;
> >> > +
> >> > + if (dev->bus == &platform_bus_type) {
> >>
> >> Why special case the platform bus? We are trying to move things off
> >> of
> >> the platform bus, don't make it harder to do that :)
> >
> > I heard about that, but I must have missed the thread where this was
> > discussed. Can you point me to it?
> >
> >> > + struct platform_device *pdev =
> >> > to_platform_device(dev); +
> >> > + err = platform_device_early_fixup(pdev);
> >> > + if (err < 0)
> >> > + dev_err(&pdev->dev,
> >> > + "failed to fixup device %s:
> >> > %d\n",
> >> > + dev_name(&pdev->dev), err);
> >> > + }
> >>
> >> You should just have a bus callback that can be made here that, if
> >> present, can be called. That way any bus can handle this type of
> >> thing, not just the platform one.
> >
> > You mean something like an .early_fixup() in struct bus_type? That
> > would indeed be much cleaner. As I mentioned this is a very early
> > prototype and this particular hunk exists specifically to fixup the
> > platform devices created by the device tree helpers so that the
> > kernel actually boots to the login prompt.
> >
> >> Not that I really like the whole idea anyway, but I doubt there's
> >> much I can do about it...
> >
> > Well, getting feedback from you and others is precisely the reason why
> > I wanted to post this early. There must be a reason why you don't
> > like it, so perhaps you can share your thoughts and we can mould this
> > into something that you'd be more comfortable with.
> >
> > To be honest I don't particularly like it either. It's very hackish
> > for
> > core code. But on the other hand there are a few device/driver
> > ordering
> > problems that this (or something similar) would help solve. I'm
> > certainly open to discuss alternatives and perhaps there's a much
> > cleaner way to solve the problem.
>
> And on that note, I think we're all in agreement that it's ugly!
> Looking at it now, I don't think it is the right approach.
>
> In the big scheme of things, there really aren't a lot of devices that
> will need this functionality. Something I don't have a good handle on
> is the set of devices needed to be created early. Yes, some of the
> clocks and power rails need to be set up, but do all of them? Yes, the
> interrupts need to be set up, but was of_irq_init() a mistake?
> Cascaded interrupt controllers really could be normal devices if there
> isn't anything critical for early boot connected to them (ie. the
> timer). If irq references were changed to be parsed at .probe() time
> instead of device registration time, the deferred probe would have
> everything needed to make sure the interrupt controllers get probed
> before the users.
>
> Also, for the devices that are created early, is it really appropriate
> to use APIs that require a struct device? I could easily argue that
> anything done at early boot should only be the bare minimum to allow
> the kernel to get to initcalls, and so mucking about with devices is
> inappropriate because it really messes with the assumptions made in
> the design of the driver model.

Please correct me if I'm wrong, but to get to initcalls you need a timer.
However a timer driver might need interrupts (to tick) and clocks (to get
the frequency).

At least this is what happens on the platforms I work with. They don't
need any special early registration method, though. We just special case
those drivers, as Thierry pointed, so they don't use device based APIs.

However if we consider a more complex setup, let's say that the timer
driver needs to access some special registers, which are shared with other
drivers as well (again not a completely exotic case, as I already met this
when working with timer and PWM drivers for older Samsung platforms and
had to hack things a bit). One would suggest using regmap here, but it is
a device based API.

This also connects to a different (and probably less relevant) issue of
of_clk_init() and all the whole family of_whatever_init() helpers, which
(all or at least some of them) must be explicitly called from platform
code of every platform. If we could turn those drivers into normal drivers
dealing with devices this could go away.

To clarify, I'm not really for or against doing this, I'm just adding some
points that came to my mind while reading this thread (hopefully something
useful).

Best regards,
Tomasz

2013-08-17 11:07:40

by Thierry Reding

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 11:20:55PM +0100, Grant Likely wrote:
> On Fri, Aug 16, 2013 at 10:55 PM, Thierry Reding
> <[email protected]> wrote:
> > On Fri, Aug 16, 2013 at 02:06:37PM -0700, Greg Kroah-Hartman wrote:
> >> On Fri, Aug 16, 2013 at 10:39:21PM +0200, Thierry Reding wrote:
> >> > +static DEFINE_MUTEX(device_early_mutex);
> >> > +static LIST_HEAD(device_early_list);
> >> > +static bool device_is_early = true;
> >> > +
> >> > +/*
> >> > + * Keep a list of early registered devices so that they can be fully
> >> > + * registered at a later point in time.
> >> > + */
> >> > +static void device_early_add(struct device *dev)
> >>
> >> __init?
> >
> > Yes.
> >
> >> > +{
> >> > + mutex_lock(&device_early_mutex);
> >> > + list_add_tail(&dev->p->early, &device_early_list);
> >> > + mutex_unlock(&device_early_mutex);
> >> > +}
> >> > +
> >> > +/*
> >> > + * Mark the early device registration phase as completed.
> >> > + */
> >> > +int __init device_early_init(void)
> >> > +{
> >> > + device_is_early = false;
> >> > +
> >> > + return 0;
> >> > +}
> >> > +
> >> > +/*
> >> > + * Fixup platform devices instantiated from device tree. The problem is
> >> > + * that since early registration happens before interrupt controllers
> >> > + * have been setup, the OF core code won't know how to map interrupts.
> >> > + */
> >> > +int __init platform_device_early_fixup(struct platform_device *pdev)
> >>
> >> This shouldn't be in this file, because:
> >>
> >> > +/*
> >> > + * Fully register early devices.
> >> > + */
> >> > +int __init device_early_done(void)
> >> > +{
> >> > + struct device_private *private;
> >> > +
> >> > + list_for_each_entry(private, &device_early_list, early) {
> >> > + struct device *dev = private->device;
> >> > + int err;
> >> > +
> >> > + if (dev->bus == &platform_bus_type) {
> >>
> >> Why special case the platform bus? We are trying to move things off of
> >> the platform bus, don't make it harder to do that :)
> >
> > I heard about that, but I must have missed the thread where this was
> > discussed. Can you point me to it?
> >
> >> > + struct platform_device *pdev = to_platform_device(dev);
> >> > +
> >> > + err = platform_device_early_fixup(pdev);
> >> > + if (err < 0)
> >> > + dev_err(&pdev->dev,
> >> > + "failed to fixup device %s: %d\n",
> >> > + dev_name(&pdev->dev), err);
> >> > + }
> >>
> >> You should just have a bus callback that can be made here that, if
> >> present, can be called. That way any bus can handle this type of thing,
> >> not just the platform one.
> >
> > You mean something like an .early_fixup() in struct bus_type? That would
> > indeed be much cleaner. As I mentioned this is a very early prototype
> > and this particular hunk exists specifically to fixup the platform
> > devices created by the device tree helpers so that the kernel actually
> > boots to the login prompt.
> >
> >> Not that I really like the whole idea anyway, but I doubt there's much I
> >> can do about it...
> >
> > Well, getting feedback from you and others is precisely the reason why I
> > wanted to post this early. There must be a reason why you don't like it,
> > so perhaps you can share your thoughts and we can mould this into
> > something that you'd be more comfortable with.
> >
> > To be honest I don't particularly like it either. It's very hackish for
> > core code. But on the other hand there are a few device/driver ordering
> > problems that this (or something similar) would help solve. I'm
> > certainly open to discuss alternatives and perhaps there's a much
> > cleaner way to solve the problem.
>
> And on that note, I think we're all in agreement that it's ugly!
> Looking at it now, I don't think it is the right approach.
>
> In the big scheme of things, there really aren't a lot of devices that
> will need this functionality. Something I don't have a good handle on
> is the set of devices needed to be created early. Yes, some of the
> clocks and power rails need to be set up, but do all of them? Yes, the
> interrupts need to be set up, but was of_irq_init() a mistake?
> Cascaded interrupt controllers really could be normal devices if there
> isn't anything critical for early boot connected to them (ie. the
> timer). If irq references were changed to be parsed at .probe() time
> instead of device registration time, the deferred probe would have
> everything needed to make sure the interrupt controllers get probed
> before the users.

I remember having this discussion with you before. Deferred probing was
one solution to the problem at the time. However as you point out the
resolution of IRQ references isn't handled by deferred probing, and
supposedly is one of the reasons why interrupt chips are now registered
early using of_irq_init(), which, as I understand it, is the modern-day
equivalent of doing this in drivers/ rather than by platform- or board-
setup code in arch/.

While I'm totally in favour of postponing the resolution of interrupt
references to .probe() time, I also think it's unrealistic to require
individual drivers to do it explicitly. I think it's something that the
core should (and can) solve.

If you nor anyone else has any objections I'll take a stab at it, see if
I can come up with something that solves this nicely. Even if it doesn't
solve everything, it would at least eliminate one of the problems that
currently require the of_irq_init() workarounds.

The common clock framework provides of_clk_init() for similar purposes,
though I'm not sure if the same arguments can be applied to it.

> Also, for the devices that are created early, is it really appropriate
> to use APIs that require a struct device? I could easily argue that
> anything done at early boot should only be the bare minimum to allow
> the kernel to get to initcalls, and so mucking about with devices is
> inappropriate because it really messes with the assumptions made in
> the design of the driver model.

I see your point. But perhaps this could be solved by allowing drivers
to register something like an .early_probe() (or perhaps .of_probe()),
which would be called early to set up the bare minimum to continue to
boot. That's probably not easy to do because drivers are usually only
registered much later too. One solution to that particular problem may
be to move the early probe function to a separate early driver struct.

Thinking about it, that's probably not a whole lot better than what we
currently have. But it would at least somewhat restore consistency in
that the same source file could provide the early driver and a regular
driver so that at least for the full driver the devices would behave
normally.

There could even be some sort of hand-off between the early and the full
driver, though I doubt that it would be necessary. If the early setup is
really only the bare minimum then that code should restrict itself to
writing hardware registers and not set up any long-lived data structures
and whatnot.

Thierry


Attachments:
(No filename) (6.99 kB)
(No filename) (836.00 B)
Download all attachments

2013-08-17 11:17:53

by Thierry Reding

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Fri, Aug 16, 2013 at 03:08:07PM -0700, Greg Kroah-Hartman wrote:
> On Fri, Aug 16, 2013 at 11:55:31PM +0200, Thierry Reding wrote:
> > > > + list_for_each_entry(private, &device_early_list, early) {
> > > > + struct device *dev = private->device;
> > > > + int err;
> > > > +
> > > > + if (dev->bus == &platform_bus_type) {
> > >
> > > Why special case the platform bus? We are trying to move things off of
> > > the platform bus, don't make it harder to do that :)
> >
> > I heard about that, but I must have missed the thread where this was
> > discussed. Can you point me to it?
>
> I thought it was on lkml, but it might have been limited to linux-arm
> and the ksummit-discuss mailing list on one of the threads there.
> Sorry, I can't recall it at the moment.

I think I found it. It was limited to linux-arm and ksummit-discuss.
Interesting read, thanks.

> > > Not that I really like the whole idea anyway, but I doubt there's much I
> > > can do about it...
> >
> > Well, getting feedback from you and others is precisely the reason why I
> > wanted to post this early. There must be a reason why you don't like it,
> > so perhaps you can share your thoughts and we can mould this into
> > something that you'd be more comfortable with.
>
> Ideally we would have the rest of the system up and running (i.e. sysfs)
> before having to deal with devices and drivers. As it is, you are
> "special casing" a number of special devices on special platforms to
> work around architectural issues on those platforms. Stuff that ideally
> would never need to be done, except for crazy hardware designers :)
>
> So I understand that it is needed, in some special cases, but that
> doesn't mean I have to like it...

So the rest of this thread seems to be going in a slightly different
direction. If indeed we can come up with a way to limit the early code
to setup only the bare minimum and not register all kinds of data
structures for later on, then this "workaround" shouldn't really be
needed at all.

I think it would be nice to ultimately have a single driver that would
know how to do early setup that will be enough to get us to initcalls
and do the full initialization once the regular .probe() is called.

> > To be honest I don't particularly like it either. It's very hackish for
> > core code. But on the other hand there are a few device/driver ordering
> > problems that this (or something similar) would help solve. I'm
> > certainly open to discuss alternatives and perhaps there's a much
> > cleaner way to solve the problem.
>
> As this usually is needed on a case-by-case basis, I'm hoping that your
> case really does need this, right? If so, there's not much the kernel
> can do except add hooks like this to make it work properly.

Well, the most obvious cases where early initialization is needed are
interrupt controllers and clocks. Those used to be setup by platform
code and was isolated to arch/arm (I guess the same is true for other
architectures as well). That way everybody could do what they wanted.
We've moved a lot of that code out into drivers/ but a side-effect of
that was that some of it wasn't turned into proper drivers.

Perhaps the current solutions are all fine and I should just ignore that
they irritate me.

Thierry


Attachments:
(No filename) (3.20 kB)
(No filename) (836.00 B)
Download all attachments

2013-08-19 19:44:06

by Stephen Warren

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On 08/17/2013 05:17 AM, Thierry Reding wrote:
...
> Well, the most obvious cases where early initialization is needed
> are interrupt controllers and clocks.

... and IOMMUs, which apparently need to initialize before any devices
whose transactions are routed through the IOMMU, in order to set
themselves up as the IOMMU for the relevant devices.

It's possible that the CPU-visible bus structure isn't a strict
inverse/reverse of the device-visible bus-structure. A device may have
CPU-visible registers on one bus segment, but inject master
transactions onto an unrelated bus segment. So it may not be as simple
as making a bus driver for the bus segment affected by the IOMMU, and
having that driver trigger instantiation of all its children.

2013-08-19 19:49:24

by Stephen Warren

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On 08/17/2013 04:26 AM, Tomasz Figa wrote:
...
> Please correct me if I'm wrong, but to get to initcalls you need a timer.
> However a timer driver might need interrupts (to tick) and clocks (to get
> the frequency).
>
> At least this is what happens on the platforms I work with. They don't
> need any special early registration method, though. We just special case
> those drivers, as Thierry pointed, so they don't use device based APIs.
>
> However if we consider a more complex setup, let's say that the timer
> driver needs to access some special registers, which are shared with other
> drivers as well (again not a completely exotic case, as I already met this
> when working with timer and PWM drivers for older Samsung platforms and
> had to hack things a bit). One would suggest using regmap here, but it is
> a device based API.

To take this even further, I'm not sure there's a particular reason why
the timer has to have an internal clock source driven from the same chip
clock input as the CPU itself. What if there's a separate clock input,
whose source chip has an enable input, which is connected to a GPIO,
which is driven by an I2C-based GPIO expander? Admittedly that's a
pretty crazy HW design, but I doubt it's much other than an accident
that it doesn't exist in practice, given the probable lack of feedback
cycle from Linux SW internals to all board designers.

It seems like the best solution here is to make the generic case fully
capable. Perhaps initcalls shouldn't depend on a timer at all, or should
be split into sets that require certain services, and those services can
appear dynamically, and when they do, the relevant initcalls that were
held off by lack of a certain feature all get triggered.

2013-08-19 20:04:42

by Thierry Reding

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Mon, Aug 19, 2013 at 01:49:18PM -0600, Stephen Warren wrote:
> On 08/17/2013 04:26 AM, Tomasz Figa wrote:
> ...
> > Please correct me if I'm wrong, but to get to initcalls you need a timer.
> > However a timer driver might need interrupts (to tick) and clocks (to get
> > the frequency).
> >
> > At least this is what happens on the platforms I work with. They don't
> > need any special early registration method, though. We just special case
> > those drivers, as Thierry pointed, so they don't use device based APIs.
> >
> > However if we consider a more complex setup, let's say that the timer
> > driver needs to access some special registers, which are shared with other
> > drivers as well (again not a completely exotic case, as I already met this
> > when working with timer and PWM drivers for older Samsung platforms and
> > had to hack things a bit). One would suggest using regmap here, but it is
> > a device based API.
>
> To take this even further, I'm not sure there's a particular reason why
> the timer has to have an internal clock source driven from the same chip
> clock input as the CPU itself. What if there's a separate clock input,
> whose source chip has an enable input, which is connected to a GPIO,
> which is driven by an I2C-based GPIO expander? Admittedly that's a
> pretty crazy HW design, but I doubt it's much other than an accident
> that it doesn't exist in practice, given the probable lack of feedback
> cycle from Linux SW internals to all board designers.
>
> It seems like the best solution here is to make the generic case fully
> capable. Perhaps initcalls shouldn't depend on a timer at all, or should
> be split into sets that require certain services, and those services can
> appear dynamically, and when they do, the relevant initcalls that were
> held off by lack of a certain feature all get triggered.

But that's pretty much a special case of deferred probing, isn't it?

If we make it any more specific than deferred probing we'll end up with
a set of statically defined features that drivers need to check instead
of directly requesting the resources they need. Maintaining that list
will probably be manageable, but how will that handle cases where for
instance two drivers announce the same services and one driver depends
on that service? It'll be difficult to find out if the provider of the
service is the right one.

Thierry


Attachments:
(No filename) (2.35 kB)
(No filename) (836.00 B)
Download all attachments

2013-08-19 20:10:48

by Thierry Reding

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Mon, Aug 19, 2013 at 01:43:59PM -0600, Stephen Warren wrote:
> On 08/17/2013 05:17 AM, Thierry Reding wrote:
> ...
> > Well, the most obvious cases where early initialization is needed
> > are interrupt controllers and clocks.
>
> ... and IOMMUs, which apparently need to initialize before any devices
> whose transactions are routed through the IOMMU, in order to set
> themselves up as the IOMMU for the relevant devices.
>
> It's possible that the CPU-visible bus structure isn't a strict
> inverse/reverse of the device-visible bus-structure. A device may have
> CPU-visible registers on one bus segment, but inject master
> transactions onto an unrelated bus segment. So it may not be as simple
> as making a bus driver for the bus segment affected by the IOMMU, and
> having that driver trigger instantiation of all its children.

Well, perhaps that can be handled via deferred probing? The only thing
preventing that right now is that drivers aren't actively aware of the
IOMMUs existence. If we can make it a requirement that each driver
needing the services of an IOMMU actively requests that service, then
deferred probing should be able to deal with it.

That doesn't necessarily mean that drivers need to be doing it manually.
It strikes me as the kind of thing that could easily be done by the
core. In fact I've been thinking of doing something similar to resolve
devicetree IRQ references at probe time, rather than at device creation
time to remove the need for interrupt chips to register early.

Thierry


Attachments:
(No filename) (1.49 kB)
(No filename) (836.00 B)
Download all attachments

2013-08-19 20:53:43

by Stephen Warren

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On 08/19/2013 02:10 PM, Thierry Reding wrote:
> On Mon, Aug 19, 2013 at 01:43:59PM -0600, Stephen Warren wrote:
>> On 08/17/2013 05:17 AM, Thierry Reding wrote: ...
>>> Well, the most obvious cases where early initialization is
>>> needed are interrupt controllers and clocks.
>>
>> ... and IOMMUs, which apparently need to initialize before any
>> devices whose transactions are routed through the IOMMU, in order
>> to set themselves up as the IOMMU for the relevant devices.
>>
>> It's possible that the CPU-visible bus structure isn't a strict
>> inverse/reverse of the device-visible bus-structure. A device may
>> have CPU-visible registers on one bus segment, but inject master
>> transactions onto an unrelated bus segment. So it may not be as
>> simple as making a bus driver for the bus segment affected by the
>> IOMMU, and having that driver trigger instantiation of all its
>> children.
>
> Well, perhaps that can be handled via deferred probing? The only
> thing preventing that right now is that drivers aren't actively
> aware of the IOMMUs existence. If we can make it a requirement that
> each driver needing the services of an IOMMU actively requests that
> service, then deferred probing should be able to deal with it.

I believe the hope was to specifically avoid drivers knowing about the
presence of an IOMMU. Drivers would simply use standard dma
alloc/map/unmap APIs, and those APIs would internally set up any IOMMU
HW is present and necessary.

Having e.g. a DT property that says "this is your IOMMU" which drivers
had to explicitly handle parsing, and/or requiring drivers to make
some call during probe to say "bind to an IOMMU" (unless that API can
simply be called in all cases including when there actually is no
IOMMU) would be a bit annoying.

> That doesn't necessarily mean that drivers need to be doing it
> manually. It strikes me as the kind of thing that could easily be
> done by the core. In fact I've been thinking of doing something
> similar to resolve devicetree IRQ references at probe time, rather
> than at device creation time to remove the need for interrupt chips
> to register early.

If we can make this automatic and hidden yet still use deferred
probing, then that'd likely be just fine.

2013-08-19 21:42:01

by Thierry Reding

[permalink] [raw]
Subject: Re: [RFC 2/4] driver core: Allow early registration of devices

On Mon, Aug 19, 2013 at 02:53:37PM -0600, Stephen Warren wrote:
> On 08/19/2013 02:10 PM, Thierry Reding wrote:
> > On Mon, Aug 19, 2013 at 01:43:59PM -0600, Stephen Warren wrote:
> >> On 08/17/2013 05:17 AM, Thierry Reding wrote: ...
> >>> Well, the most obvious cases where early initialization is
> >>> needed are interrupt controllers and clocks.
> >>
> >> ... and IOMMUs, which apparently need to initialize before any
> >> devices whose transactions are routed through the IOMMU, in order
> >> to set themselves up as the IOMMU for the relevant devices.
> >>
> >> It's possible that the CPU-visible bus structure isn't a strict
> >> inverse/reverse of the device-visible bus-structure. A device may
> >> have CPU-visible registers on one bus segment, but inject master
> >> transactions onto an unrelated bus segment. So it may not be as
> >> simple as making a bus driver for the bus segment affected by the
> >> IOMMU, and having that driver trigger instantiation of all its
> >> children.
> >
> > Well, perhaps that can be handled via deferred probing? The only
> > thing preventing that right now is that drivers aren't actively
> > aware of the IOMMUs existence. If we can make it a requirement that
> > each driver needing the services of an IOMMU actively requests that
> > service, then deferred probing should be able to deal with it.
>
> I believe the hope was to specifically avoid drivers knowing about the
> presence of an IOMMU. Drivers would simply use standard dma
> alloc/map/unmap APIs, and those APIs would internally set up any IOMMU
> HW is present and necessary.
>
> Having e.g. a DT property that says "this is your IOMMU" which drivers
> had to explicitly handle parsing, and/or requiring drivers to make
> some call during probe to say "bind to an IOMMU" (unless that API can
> simply be called in all cases including when there actually is no
> IOMMU) would be a bit annoying.

Yes, I don't think drivers should have to do that manually either. In my
opinion it falls in the same category as setting up the default pinmux
state, which is now handled by the core as well.

> > That doesn't necessarily mean that drivers need to be doing it
> > manually. It strikes me as the kind of thing that could easily be
> > done by the core. In fact I've been thinking of doing something
> > similar to resolve devicetree IRQ references at probe time, rather
> > than at device creation time to remove the need for interrupt chips
> > to register early.
>
> If we can make this automatic and hidden yet still use deferred
> probing, then that'd likely be just fine.

I plan to investigate how this could be done for the IRQ reference
resolution and if that can be done it should be possible to do it for
IOMMUs as well.

Thierry


Attachments:
(No filename) (2.69 kB)
(No filename) (836.00 B)
Download all attachments