2022-07-28 10:02:57

by Johan Hovold

[permalink] [raw]
Subject: Second-source devices and interrupt-mapping race

Hi Marc, Rob and Krzysztof,

When adding support for the new Lenovo Thinkpad X13s laptop, we realised
that it comes with two different touchpad controllers.

To enable some early adopters to use the alternate touchpad, I tried to
enable both nodes in the devicetree and have the i2c-hid driver bind to
the one that is actually present.

This turned out to be racy due to the hid driver in question enabling
async probing so that the populated and non-populated nodes can be
probed in parallel, which in turn lead to some interesting findings.

Specifically, it seems like the interrupt-domain mapping code is racy in
that it can return two different mappings for the same hwirq, and when
the hid driver enables one of them, this may end up looking like
spurious interrupts for the other mapping:

[ +0.014042] i2c_hid_of 0-002c: i2c_device_probe
[ +0.000001] i2c_hid_of 0-0015: i2c_device_probe
[ +0.000025] i2c_hid_of 0-002c: i2c_device_probe - irq mapped (166)
[ +0.000013] i2c_hid_of 0-0015: i2c_device_probe - irq mapped (167)
[ +0.000051] i2c_hid_of 0-002c: supply vddl not found, using dummy regulator
[ +0.000056] i2c_hid_of 0-0015: supply vddl not found, using dummy regulator
[ +0.000016] i2c_hid_of 0-002c: HID probe called for i2c 0x2c
[ +0.000374] i2c_hid_of 0-0015: HID probe called for i2c 0x15
...
[ +0.000180] i2c_hid_of 0-002c: Requesting IRQ: 166
[ +0.000045] irq 167, desc: (____ptrval____), depth: 1, count: 0, unhandled: 0
[ +0.000014] ->handle_irq(): (____ptrval____), handle_bad_irq+0x0/0x220
[ +0.000015] ->irq_data.chip(): (____ptrval____), msm_gpio_irq_chip+0x0/0x108
[ +0.000011] ->action(): 0000000000000000
[ +0.000006] IRQ_NOPROBE set

The interrupt is eventually disabled and the populated device fails to
probe. Note that this only happens intermittently.

This second-source example could obviously be dismissed as something
which is simply not supported (the boot firmware should have made sure
only the populated node was enabled), but what if there were actually
two separate devices sharing an interrupt and that now end up with two
different virq?

Async probing has been around for a while now and needs to be supported,
even if the platform bus doesn't use it (yet).

TL;DR:

1. Marc, does the irq mapping code need to be serialised to handle the
valid case of two devices sharing an interrupt being probed in parallel?
It may not be a common setup, but correctness first?

I've just posted a patch that should address this here:

https://lore.kernel.org/r/[email protected]


2. Rob, Krzysztof, I assume that handling second-source devices by
enabling multiple variants in the devicetree can not be considered
correct?

What about the related case of simply non-populated devices (e.g. laptop
variants without a touchscreen)?

Note that we have at least two cases of "second-source" nodes in mainline
("rtc" and "trackpad", respectively):

85a9efcd4e29 ("ARM: mvebu: add DT support for Seagate NAS 2 and 4-Bay")
689b937bedde ("arm64: dts: mediatek: add mt8173 elm and hana board")

and that, for example, the i2c-hid driver explicitly supports
non-populated devices:

b3a81b6c4fc6 ("HID: i2c-hid: check if device is there before really probing")

and the commit message indicates that this is something that Chromebooks
rely on.

For the X13s, I'm not sure how we would go about to tell the variants
apart (the ACPI tables that Windows use include both touchpads and an
optional touchscreen). In the end, the boot firmware might need to
resort to a similar kind of probing if we don't allow the kernel to do
it.

Finally, note that while disabling async probing for "second-source"
nodes (e.g. if we could mark them as requiring that) would take care of
the irq-mapping race, we'd still currently also need to move any
pinconfig handles to the parent bus node (as is also done in one of the
in-tree examples above) to suppress the corresponding pinctrl errors in
case the populated device is probed and bound first:

[ +0.010217] sc8280xp-tlmm f100000.pinctrl: pin GPIO_182 already requested by 0-0015; cannot claim for 0-002c

Johan


2022-08-04 16:03:56

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: Second-source devices and interrupt-mapping race

On Thu, Jul 28, 2022 at 3:30 AM Johan Hovold <[email protected]> wrote:
>
> Hi Marc, Rob and Krzysztof,
>
> When adding support for the new Lenovo Thinkpad X13s laptop, we realised
> that it comes with two different touchpad controllers.
>
> To enable some early adopters to use the alternate touchpad, I tried to
> enable both nodes in the devicetree and have the i2c-hid driver bind to
> the one that is actually present.
>
> This turned out to be racy due to the hid driver in question enabling
> async probing so that the populated and non-populated nodes can be
> probed in parallel, which in turn lead to some interesting findings.
>
> Specifically, it seems like the interrupt-domain mapping code is racy in
> that it can return two different mappings for the same hwirq, and when
> the hid driver enables one of them, this may end up looking like
> spurious interrupts for the other mapping:
>
> [ +0.014042] i2c_hid_of 0-002c: i2c_device_probe
> [ +0.000001] i2c_hid_of 0-0015: i2c_device_probe
> [ +0.000025] i2c_hid_of 0-002c: i2c_device_probe - irq mapped (166)
> [ +0.000013] i2c_hid_of 0-0015: i2c_device_probe - irq mapped (167)
> [ +0.000051] i2c_hid_of 0-002c: supply vddl not found, using dummy regulator
> [ +0.000056] i2c_hid_of 0-0015: supply vddl not found, using dummy regulator
> [ +0.000016] i2c_hid_of 0-002c: HID probe called for i2c 0x2c
> [ +0.000374] i2c_hid_of 0-0015: HID probe called for i2c 0x15
> ...
> [ +0.000180] i2c_hid_of 0-002c: Requesting IRQ: 166
> [ +0.000045] irq 167, desc: (____ptrval____), depth: 1, count: 0, unhandled: 0
> [ +0.000014] ->handle_irq(): (____ptrval____), handle_bad_irq+0x0/0x220
> [ +0.000015] ->irq_data.chip(): (____ptrval____), msm_gpio_irq_chip+0x0/0x108
> [ +0.000011] ->action(): 0000000000000000
> [ +0.000006] IRQ_NOPROBE set
>
> The interrupt is eventually disabled and the populated device fails to
> probe. Note that this only happens intermittently.
>
> This second-source example could obviously be dismissed as something
> which is simply not supported (the boot firmware should have made sure
> only the populated node was enabled), but what if there were actually
> two separate devices sharing an interrupt and that now end up with two
> different virq?
>
> Async probing has been around for a while now and needs to be supported,
> even if the platform bus doesn't use it (yet).
>
> TL;DR:
>
> 1. Marc, does the irq mapping code need to be serialised to handle the
> valid case of two devices sharing an interrupt being probed in parallel?
> It may not be a common setup, but correctness first?
>
> I've just posted a patch that should address this here:
>
> https://lore.kernel.org/r/[email protected]
>
>
> 2. Rob, Krzysztof, I assume that handling second-source devices by
> enabling multiple variants in the devicetree can not be considered
> correct?

Probably not, but there's not really any defined policy there. What
that looks like in DT depends on the component. Displays are a common
one and don't lend well to populating multiple in the DT. For those,
the only solution so far is we require the 2nd source to be compatible
with the default/1st. I think that was QCom chromebooks...

The easy answer is firmware should deal with figuring out what's
actually there and update the DT accordingly.

> What about the related case of simply non-populated devices (e.g. laptop
> variants without a touchscreen)?

Shouldn't that just be a case of the driver not screaming if there's no device?

> Note that we have at least two cases of "second-source" nodes in mainline
> ("rtc" and "trackpad", respectively):
>
> 85a9efcd4e29 ("ARM: mvebu: add DT support for Seagate NAS 2 and 4-Bay")
> 689b937bedde ("arm64: dts: mediatek: add mt8173 elm and hana board")
>
> and that, for example, the i2c-hid driver explicitly supports
> non-populated devices:
>
> b3a81b6c4fc6 ("HID: i2c-hid: check if device is there before really probing")
>
> and the commit message indicates that this is something that Chromebooks
> rely on.
>
> For the X13s, I'm not sure how we would go about to tell the variants
> apart (the ACPI tables that Windows use include both touchpads and an
> optional touchscreen). In the end, the boot firmware might need to
> resort to a similar kind of probing if we don't allow the kernel to do
> it.
>
> Finally, note that while disabling async probing for "second-source"
> nodes (e.g. if we could mark them as requiring that) would take care of
> the irq-mapping race, we'd still currently also need to move any
> pinconfig handles to the parent bus node (as is also done in one of the
> in-tree examples above) to suppress the corresponding pinctrl errors in
> case the populated device is probed and bound first:
>
> [ +0.010217] sc8280xp-tlmm f100000.pinctrl: pin GPIO_182 already requested by 0-0015; cannot claim for 0-002c

If the config is the same for both we could suppress that warning. If
not the same, seems a bit dangerous to configure the wrong config...

Rob

2022-11-21 12:20:45

by Johan Hovold

[permalink] [raw]
Subject: Re: Second-source devices and interrupt-mapping race

Hi Rob,

and sorry about the late follow up. I received your reply just before my
holiday and then ended up in a bit of a rabbit hole with the Qualcomm
PHY drivers so this got put on the back burner.

On Thu, Aug 04, 2022 at 09:42:01AM -0600, Rob Herring wrote:
> On Thu, Jul 28, 2022 at 3:30 AM Johan Hovold <[email protected]> wrote:

> > 2. Rob, Krzysztof, I assume that handling second-source devices by
> > enabling multiple variants in the devicetree can not be considered
> > correct?
>
> Probably not, but there's not really any defined policy there. What
> that looks like in DT depends on the component. Displays are a common
> one and don't lend well to populating multiple in the DT. For those,
> the only solution so far is we require the 2nd source to be compatible
> with the default/1st. I think that was QCom chromebooks...
>
> The easy answer is firmware should deal with figuring out what's
> actually there and update the DT accordingly.

Right.

> > What about the related case of simply non-populated devices (e.g. laptop
> > variants without a touchscreen)?
>
> Shouldn't that just be a case of the driver not screaming if there's
> no device?

Right, that's simple, but what I'm getting at is that this is also a
case of the devicetree not describing the actual hardware. If we allow
that, does that imply that we should also allow having second-source
devices in the devicetree even if we know that at least one of them will
be non-populated?

The big difference is dealing with any "shared" resources, such has the
pinconfig and interrupt in my HID example, which would now appear to be
claimed by more than one device.

It seems we'd need some way to describe the devices as mutually
exclusive, and perhaps that's reason enough not to try to generalise the
single-non-populated-device case.

> > Note that we have at least two cases of "second-source" nodes in mainline
> > ("rtc" and "trackpad", respectively):
> >
> > 85a9efcd4e29 ("ARM: mvebu: add DT support for Seagate NAS 2 and 4-Bay")
> > 689b937bedde ("arm64: dts: mediatek: add mt8173 elm and hana board")
> >
> > and that, for example, the i2c-hid driver explicitly supports
> > non-populated devices:
> >
> > b3a81b6c4fc6 ("HID: i2c-hid: check if device is there before really probing")
> >
> > and the commit message indicates that this is something that Chromebooks
> > rely on.
> >
> > For the X13s, I'm not sure how we would go about to tell the variants
> > apart (the ACPI tables that Windows use include both touchpads and an
> > optional touchscreen). In the end, the boot firmware might need to
> > resort to a similar kind of probing if we don't allow the kernel to do
> > it.
> >
> > Finally, note that while disabling async probing for "second-source"
> > nodes (e.g. if we could mark them as requiring that) would take care of
> > the irq-mapping race, we'd still currently also need to move any
> > pinconfig handles to the parent bus node (as is also done in one of the
> > in-tree examples above) to suppress the corresponding pinctrl errors in
> > case the populated device is probed and bound first:
> >
> > [ +0.010217] sc8280xp-tlmm f100000.pinctrl: pin GPIO_182 already requested by 0-0015; cannot claim for 0-002c
>
> If the config is the same for both we could suppress that warning. If
> not the same, seems a bit dangerous to configure the wrong config...

Indeed, the pin configs would need to be at least compatible.

I'm sure this has the potential to mess things up with respect to other
resources too. And even if it may be possible to get this to work in
many cases it seems firmware needs to be involved for it to work
generally.

Johan

Subject: [irqchip: irq/irqchip-next] irqdomain: Fix mapping-creation race

The following commit has been merged into the irq/irqchip-next branch of irqchip:

Commit-ID: 601363cc08da25747feb87c55573dd54de91d66a
Gitweb: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/601363cc08da25747feb87c55573dd54de91d66a
Author: Johan Hovold <[email protected]>
AuthorDate: Mon, 13 Feb 2023 11:42:48 +01:00
Committer: Marc Zyngier <[email protected]>
CommitterDate: Mon, 13 Feb 2023 19:31:24

irqdomain: Fix mapping-creation race

Parallel probing of devices that share interrupts (e.g. when a driver
uses asynchronous probing) can currently result in two mappings for the
same hardware interrupt to be created due to missing serialisation.

Make sure to hold the irq_domain_mutex when creating mappings so that
looking for an existing mapping before creating a new one is done
atomically.

Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers")
Fixes: b62b2cf5759b ("irqdomain: Fix handling of type settings for existing mappings")
Link: https://lore.kernel.org/r/[email protected]
Cc: [email protected] # 4.8
Cc: Dmitry Torokhov <[email protected]>
Cc: Jon Hunter <[email protected]>
Tested-by: Hsin-Yi Wang <[email protected]>
Tested-by: Mark-PK Tsai <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/irq/irqdomain.c | 64 +++++++++++++++++++++++++++++------------
1 file changed, 46 insertions(+), 18 deletions(-)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 78fb480..df0cbad 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -25,6 +25,9 @@ static DEFINE_MUTEX(irq_domain_mutex);

static struct irq_domain *irq_default_domain;

+static int irq_domain_alloc_irqs_locked(struct irq_domain *domain, int irq_base,
+ unsigned int nr_irqs, int node, void *arg,
+ bool realloc, const struct irq_affinity_desc *affinity);
static void irq_domain_check_hierarchy(struct irq_domain *domain);

struct irqchip_fwid {
@@ -682,9 +685,9 @@ unsigned int irq_create_direct_mapping(struct irq_domain *domain)
EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
#endif

-static unsigned int __irq_create_mapping_affinity(struct irq_domain *domain,
- irq_hw_number_t hwirq,
- const struct irq_affinity_desc *affinity)
+static unsigned int irq_create_mapping_affinity_locked(struct irq_domain *domain,
+ irq_hw_number_t hwirq,
+ const struct irq_affinity_desc *affinity)
{
struct device_node *of_node = irq_domain_get_of_node(domain);
int virq;
@@ -699,7 +702,7 @@ static unsigned int __irq_create_mapping_affinity(struct irq_domain *domain,
return 0;
}

- if (irq_domain_associate(domain, virq, hwirq)) {
+ if (irq_domain_associate_locked(domain, virq, hwirq)) {
irq_free_desc(virq);
return 0;
}
@@ -735,14 +738,20 @@ unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
return 0;
}

+ mutex_lock(&irq_domain_mutex);
+
/* Check if mapping already exists */
virq = irq_find_mapping(domain, hwirq);
if (virq) {
pr_debug("existing mapping on virq %d\n", virq);
- return virq;
+ goto out;
}

- return __irq_create_mapping_affinity(domain, hwirq, affinity);
+ virq = irq_create_mapping_affinity_locked(domain, hwirq, affinity);
+out:
+ mutex_unlock(&irq_domain_mutex);
+
+ return virq;
}
EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);

@@ -809,6 +818,8 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
if (WARN_ON(type & ~IRQ_TYPE_SENSE_MASK))
type &= IRQ_TYPE_SENSE_MASK;

+ mutex_lock(&irq_domain_mutex);
+
/*
* If we've already configured this interrupt,
* don't do it again, or hell will break loose.
@@ -821,7 +832,7 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
* interrupt number.
*/
if (type == IRQ_TYPE_NONE || type == irq_get_trigger_type(virq))
- return virq;
+ goto out;

/*
* If the trigger type has not been set yet, then set
@@ -829,35 +840,45 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
*/
if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
irq_data = irq_get_irq_data(virq);
- if (!irq_data)
- return 0;
+ if (!irq_data) {
+ virq = 0;
+ goto out;
+ }

irqd_set_trigger_type(irq_data, type);
- return virq;
+ goto out;
}

pr_warn("type mismatch, failed to map hwirq-%lu for %s!\n",
hwirq, of_node_full_name(to_of_node(fwspec->fwnode)));
- return 0;
+ virq = 0;
+ goto out;
}

if (irq_domain_is_hierarchy(domain)) {
- virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, fwspec);
- if (virq <= 0)
- return 0;
+ virq = irq_domain_alloc_irqs_locked(domain, -1, 1, NUMA_NO_NODE,
+ fwspec, false, NULL);
+ if (virq <= 0) {
+ virq = 0;
+ goto out;
+ }
} else {
/* Create mapping */
- virq = __irq_create_mapping_affinity(domain, hwirq, NULL);
+ virq = irq_create_mapping_affinity_locked(domain, hwirq, NULL);
if (!virq)
- return virq;
+ goto out;
}

irq_data = irq_get_irq_data(virq);
- if (WARN_ON(!irq_data))
- return 0;
+ if (WARN_ON(!irq_data)) {
+ virq = 0;
+ goto out;
+ }

/* Store trigger type */
irqd_set_trigger_type(irq_data, type);
+out:
+ mutex_unlock(&irq_domain_mutex);

return virq;
}
@@ -1888,6 +1909,13 @@ void irq_domain_set_info(struct irq_domain *domain, unsigned int virq,
irq_set_handler_data(virq, handler_data);
}

+static int irq_domain_alloc_irqs_locked(struct irq_domain *domain, int irq_base,
+ unsigned int nr_irqs, int node, void *arg,
+ bool realloc, const struct irq_affinity_desc *affinity)
+{
+ return -EINVAL;
+}
+
static void irq_domain_check_hierarchy(struct irq_domain *domain)
{
}