2019-08-08 12:08:31

by Ben Luo

[permalink] [raw]
Subject: [PATCH 1/2] genirq: introduce update_irq_devid()

Sometimes, only the dev_id field of irqaction need to be changed.
E.g. KVM VM with device passthru via VFIO may switch irq injection
path between KVM irqfd and userspace eventfd. These two paths
share the same irq num and handler for the same vector of a device,
only with different dev_id referencing to different fds' contexts.

So, instead of free/request irq, only update dev_id of irqaction.
This narrows the gap for setting up new irq (and irte, if has)
and also gains some performance benefit.

Signed-off-by: Ben Luo <[email protected]>
Reviewed-by: Liu Jiang <[email protected]>
---
include/linux/interrupt.h | 3 ++
kernel/irq/manage.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 77 insertions(+)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5b8328a..2838113 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -172,6 +172,9 @@ struct irqaction {
request_percpu_nmi(unsigned int irq, irq_handler_t handler,
const char *devname, void __percpu *dev);

+extern int update_irq_devid(unsigned int irq, void *dev_id,
+ void *new_dev_id);
+
extern const void *free_irq(unsigned int, void *);
extern void free_percpu_irq(unsigned int, void __percpu *);

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e8f7f17..18dc10d 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -2059,6 +2059,80 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
EXPORT_SYMBOL(request_threaded_irq);

/**
+ * update_irq_devid - update irq dev_id to a new one
+ * @irq: Interrupt line to update
+ * @dev_id: A cookie to find the irqaction to update
+ * @new_dev_id: New cookie passed to the handler function
+ *
+ * Sometimes, only the cookie data need to be changed.
+ * Instead of free/request irq, only update dev_id here
+ * Not only to gain some performance benefit, but also
+ * reduce the risk of losing interrupt.
+ *
+ * E.g. irq affinity setting in a VM with VFIO passthru
+ * device is carried out in a free-then-request-irq way
+ * since lack of this very function. The free_irq()
+ * eventually triggers deactivation of IR domain, which
+ * will cleanup IRTE. There is a gap before request_irq()
+ * finally setup the IRTE again. In this gap, a in-flight
+ * irq buffering in hardware layer may trigger DMAR fault
+ * and be lost.
+ *
+ * On failure, it returns a negative value. On success,
+ * it returns 0
+ */
+int update_irq_devid(unsigned int irq, void *dev_id, void *new_dev_id)
+{
+ struct irq_desc *desc = irq_to_desc(irq);
+ struct irqaction *action, **action_ptr;
+ unsigned long flags;
+
+ WARN(in_interrupt(),
+ "Trying to update IRQ %d from IRQ context!\n", irq);
+
+ if (!desc)
+ return -EINVAL;
+
+ chip_bus_lock(desc);
+ raw_spin_lock_irqsave(&desc->lock, flags);
+
+ /*
+ * There can be multiple actions per IRQ descriptor, find the right
+ * one based on the dev_id:
+ */
+ action_ptr = &desc->action;
+ for (;;) {
+ action = *action_ptr;
+
+ if (!action) {
+ WARN(1, "Trying to update already-free IRQ %d\n", irq);
+ raw_spin_unlock_irqrestore(&desc->lock, flags);
+ chip_bus_sync_unlock(desc);
+ return -ENXIO;
+ }
+
+ if (action->dev_id == dev_id) {
+ action->dev_id = new_dev_id;
+ break;
+ }
+ action_ptr = &action->next;
+ }
+
+ raw_spin_unlock_irqrestore(&desc->lock, flags);
+ chip_bus_sync_unlock(desc);
+
+ /*
+ * Make sure it's not being used on another CPU:
+ * There is a risk of UAF for old *dev_id, if it is
+ * freed in a short time after this func returns
+ */
+ synchronize_irq(irq);
+
+ return 0;
+}
+EXPORT_SYMBOL(update_irq_devid);
+
+/**
* request_any_context_irq - allocate an interrupt line
* @irq: Interrupt line to allocate
* @handler: Function to be called when the IRQ occurs.
--
1.8.3.1


2019-08-08 19:57:10

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 1/2] genirq: introduce update_irq_devid()

On Thu, 8 Aug 2019, Ben Luo wrote:
> +int update_irq_devid(unsigned int irq, void *dev_id, void *new_dev_id)
> +{
> + struct irq_desc *desc = irq_to_desc(irq);
> + struct irqaction *action, **action_ptr;
> + unsigned long flags;
> +
> + WARN(in_interrupt(),
> + "Trying to update IRQ %d from IRQ context!\n", irq);

This is broken. The function needs to return on that condition. Actually it
cannot even be called from non-preemptible code.

What's worse is that if the interrupt in question is handled concurrently,
then it will either see the old or the new dev_id and because the interrupt
handler loop runs with desc->lock dropped even more crap can happen because
dev_id can be subject to load and store tearing.

Staring at that, I see that there is the same issue in setup_irq() and
free_irq(). It's actually worse there. I'll have a look.

> + /*
> + * There can be multiple actions per IRQ descriptor, find the right
> + * one based on the dev_id:
> + */
> + action_ptr = &desc->action;
> + for (;;) {
> + action = *action_ptr;
> +
> + if (!action) {
> + WARN(1, "Trying to update already-free IRQ %d\n", irq);

That's wrong in two aspects:

1) The warn should be outside of the locked region.

2) Just having the irq number is not useful for debugging either
when the interrupt is shared.

> + raw_spin_unlock_irqrestore(&desc->lock, flags);
> + chip_bus_sync_unlock(desc);
> + return -ENXIO;
> + }
> +
> + if (action->dev_id == dev_id) {
> + action->dev_id = new_dev_id;
> + break;
> + }
> + action_ptr = &action->next;
> + }
> +
> + raw_spin_unlock_irqrestore(&desc->lock, flags);
> + chip_bus_sync_unlock(desc);
> +
> + /*
> + * Make sure it's not being used on another CPU:
> + * There is a risk of UAF for old *dev_id, if it is
> + * freed in a short time after this func returns
> + */
> + synchronize_irq(irq);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(update_irq_devid);

EXPORT_SYMBOL_GPL() please.

Thanks,

tglx

2019-08-12 03:00:38

by Ben Luo

[permalink] [raw]
Subject: Re: [PATCH 1/2] genirq: introduce update_irq_devid()


在 2019/8/9 上午3:56, Thomas Gleixner 写道:
> On Thu, 8 Aug 2019, Ben Luo wrote:
>> +int update_irq_devid(unsigned int irq, void *dev_id, void *new_dev_id)
>> +{
>> + struct irq_desc *desc = irq_to_desc(irq);
>> + struct irqaction *action, **action_ptr;
>> + unsigned long flags;
>> +
>> + WARN(in_interrupt(),
>> + "Trying to update IRQ %d from IRQ context!\n", irq);
> This is broken. The function needs to return on that condition. Actually it
> cannot even be called from non-preemptible code.
>
> What's worse is that if the interrupt in question is handled concurrently,
> then it will either see the old or the new dev_id and because the interrupt
> handler loop runs with desc->lock dropped even more crap can happen because
> dev_id can be subject to load and store tearing.
>
> Staring at that, I see that there is the same issue in setup_irq() and
> free_irq(). It's actually worse there. I'll have a look.

ok,  will return with an error code in v2 in this case

>
>> + /*
>> + * There can be multiple actions per IRQ descriptor, find the right
>> + * one based on the dev_id:
>> + */
>> + action_ptr = &desc->action;
>> + for (;;) {
>> + action = *action_ptr;
>> +
>> + if (!action) {
>> + WARN(1, "Trying to update already-free IRQ %d\n", irq);
> That's wrong in two aspects:
>
> 1) The warn should be outside of the locked region.
>
> 2) Just having the irq number is not useful for debugging either
> when the interrupt is shared.
will take care in v2
>> + raw_spin_unlock_irqrestore(&desc->lock, flags);
>> + chip_bus_sync_unlock(desc);
>> + return -ENXIO;
>> + }
>> +
>> + if (action->dev_id == dev_id) {
>> + action->dev_id = new_dev_id;
>> + break;
>> + }
>> + action_ptr = &action->next;
>> + }
>> +
>> + raw_spin_unlock_irqrestore(&desc->lock, flags);
>> + chip_bus_sync_unlock(desc);
>> +
>> + /*
>> + * Make sure it's not being used on another CPU:
>> + * There is a risk of UAF for old *dev_id, if it is
>> + * freed in a short time after this func returns
>> + */
>> + synchronize_irq(irq);
>> +
>> + return 0;
>> +}
>> +EXPORT_SYMBOL(update_irq_devid);
> EXPORT_SYMBOL_GPL() please.
thanks, will use EXPORT_SYMBOL_GPL in v2
>
> Thanks,
>
> tglx

Thanks,

     Ben