This addresses the review comments of the previous round:
- renamed irq_data::status to drv_status
- moved drv_status around to unbreak GENERIC_HARDIRQS_NO_DEPRECATED
- fixed signature of get_irq_status (irq is now unsigned int)
- converted register_lock into a global one
- fixed critical white space breakage (that I just left in to check if
anyone is actually reading the code, of course...)
Note: The KVM patch still depends on
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/64515
Thanks for all comments!
Final but critical question: Who will pick up which bits?
Jan Kiszka (4):
genirq: Introduce driver-readable IRQ status word
genirq: Inform handler about line sharing state
genirq: Add support for IRQF_COND_ONESHOT
KVM: Allow host IRQ sharing for passed-through PCI 2.3 devices
Documentation/kvm/api.txt | 27 ++++
arch/x86/kvm/x86.c | 1 +
include/linux/interrupt.h | 15 ++
include/linux/irq.h | 2 +
include/linux/kvm.h | 6 +
include/linux/kvm_host.h | 10 ++-
kernel/irq/manage.c | 77 ++++++++++-
virt/kvm/assigned-dev.c | 336 ++++++++++++++++++++++++++++++++++++++++-----
8 files changed, 436 insertions(+), 38 deletions(-)
From: Jan Kiszka <[email protected]>
This associates a status word with every IRQ descriptor. Drivers can obtain
its content via get_irq_status(irq). The first use case will be propagating
the interrupt sharing state.
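As an illustration only (not part of this patch: "my_dev" and its helpers
are made up, and IRQS_SHARED is only introduced by a later patch of this
series), a driver's threaded handler could consume the snapshot like this:

static irqreturn_t my_dev_thread_fn(int irq, void *dev_id)
{
	struct my_dev *dev = dev_id;

	/* the status word is an unserialized snapshot, treat it as a hint */
	if (get_irq_status(irq) & IRQS_SHARED)
		my_dev_mask_at_device_level(dev);	/* made-up helper */

	/* ... service the device ... */

	return IRQ_HANDLED;
}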
Signed-off-by: Jan Kiszka <[email protected]>
---
include/linux/interrupt.h | 2 ++
include/linux/irq.h | 2 ++
kernel/irq/manage.c | 15 +++++++++++++++
3 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 79d0c4f..4c1aa72 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -126,6 +126,8 @@ struct irqaction {
extern irqreturn_t no_action(int cpl, void *dev_id);
+extern unsigned long get_irq_status(unsigned int irq);
+
#ifdef CONFIG_GENERIC_HARDIRQS
extern int __must_check
request_threaded_irq(unsigned int irq, irq_handler_t handler,
diff --git a/include/linux/irq.h b/include/linux/irq.h
index abde252..8bdb421 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -96,6 +96,7 @@ struct msi_desc;
* methods, to allow shared chip implementations
* @msi_desc: MSI descriptor
* @affinity: IRQ affinity on SMP
+ * @drv_status: driver-readable status flags (IRQS_*)
*
* The fields here need to overlay the ones in irq_desc until we
* cleaned up the direct references and switched everything over to
@@ -111,6 +112,7 @@ struct irq_data {
#ifdef CONFIG_SMP
cpumask_var_t affinity;
#endif
+ unsigned long drv_status;
};
/**
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 5f92acc..2ea0d30 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1157,3 +1157,18 @@ int request_any_context_irq(unsigned int irq, irq_handler_t handler,
return !ret ? IRQC_IS_HARDIRQ : ret;
}
EXPORT_SYMBOL_GPL(request_any_context_irq);
+
+/**
+ * get_irq_status - read interrupt line status word
+ * @irq: Interrupt line of the status word
+ *
+ * This returns the current content of the status word associated with
+ * the given interrupt line. See IRQS_* flags for details.
+ */
+unsigned long get_irq_status(unsigned int irq)
+{
+ struct irq_desc *desc = irq_to_desc(irq);
+
+ return desc ? desc->irq_data.drv_status : 0;
+}
+EXPORT_SYMBOL_GPL(get_irq_status);
--
1.7.1
From: Jan Kiszka <[email protected]>
Provide an adaptive version of IRQF_ONESHOT: If the line is exclusively used,
IRQF_COND_ONESHOT provides the same semantics as IRQF_ONESHOT. If it is
shared, the line will be unmasked directly after the hardirq handler, just as
if IRQF_COND_ONESHOT was not provided.
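A usage sketch (handler names are made up, error handling omitted):

	/*
	 * Runs in oneshot mode while we are the only handler on the line,
	 * but degrades to immediate unmasking after the hardirq once a
	 * second handler gets registered.
	 */
	int err = request_threaded_irq(irq, my_quick_check_handler,
				       my_thread_fn,
				       IRQF_SHARED | IRQF_COND_ONESHOT,
				       "my-dev", dev);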
Signed-off-by: Jan Kiszka <[email protected]>
---
include/linux/interrupt.h | 3 +++
kernel/irq/manage.c | 19 ++++++++++++++++---
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 12e5fc0..bbb16f4 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -56,6 +56,8 @@
* irq line disabled until the threaded handler has been run.
* IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
* IRQF_ADAPTIVE - Request notification about upcoming interrupt line sharing
+ * IRQF_COND_ONESHOT - If the line is not shared, keep the interrupt disabled
+ *                 after the hardirq handler has finished.
*
*/
#define IRQF_DISABLED 0x00000020
@@ -69,6 +71,7 @@
#define IRQF_ONESHOT 0x00002000
#define IRQF_NO_SUSPEND 0x00004000
#define IRQF_ADAPTIVE 0x00008000
+#define IRQF_COND_ONESHOT 0x00010000
#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 2dd4eef..9a73633 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -583,7 +583,7 @@ static int irq_thread(void *data)
struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO/2, };
struct irqaction *action = data;
struct irq_desc *desc = irq_to_desc(action->irq);
- int wake, oneshot = desc->status & IRQ_ONESHOT;
+ int wake, oneshot;
sched_setscheduler(current, SCHED_FIFO, &param);
current->irqaction = action;
@@ -606,6 +606,7 @@ static int irq_thread(void *data)
desc->status |= IRQ_PENDING;
raw_spin_unlock_irq(&desc->lock);
} else {
+ oneshot = desc->status & IRQ_ONESHOT;
raw_spin_unlock_irq(&desc->lock);
action->thread_fn(action->irq, action->dev_id);
@@ -759,6 +760,15 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
shared = 1;
desc->irq_data.drv_status |= IRQS_SHARED;
+ desc->status &= ~IRQ_ONESHOT;
+
+ /* Unmask if the interrupt was masked due to oneshot mode. */
+ if ((desc->status &
+ (IRQ_INPROGRESS | IRQ_DISABLED | IRQ_MASKED)) ==
+ IRQ_MASKED) {
+ desc->irq_data.chip->irq_unmask(&desc->irq_data);
+ desc->status &= ~IRQ_MASKED;
+ }
}
if (!shared) {
@@ -783,7 +793,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
desc->status &= ~(IRQ_AUTODETECT | IRQ_WAITING | IRQ_ONESHOT |
IRQ_INPROGRESS | IRQ_SPURIOUS_DISABLED);
- if (new->flags & IRQF_ONESHOT)
+ if (new->flags & (IRQF_ONESHOT | IRQF_COND_ONESHOT))
desc->status |= IRQ_ONESHOT;
if (!(desc->status & IRQ_NOAUTOEN)) {
@@ -934,8 +944,11 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
desc->irq_data.chip->irq_shutdown(&desc->irq_data);
else
desc->irq_data.chip->irq_disable(&desc->irq_data);
- } else if (!desc->action->next)
+ } else if (!desc->action->next) {
single_handler = true;
+ if (desc->action->flags & IRQF_COND_ONESHOT)
+ desc->status |= IRQ_ONESHOT;
+ }
#ifdef CONFIG_SMP
/* make sure affinity_hint is cleaned up */
--
1.7.1
From: Jan Kiszka <[email protected]>
PCI 2.3 allows IRQ sources to be disabled generically at the device level.
This enables us to share the IRQs of such devices on the host side when
passing them to a guest.
However, IRQ disabling via the PCI config space is more costly than masking
the line via disable_irq. Therefore we register the IRQ in adaptive mode and
switch between line-level and device-level disabling on demand.
This feature is optional; user space has to request it explicitly as it also
has to inform us about its view of PCI_COMMAND_INTX_DISABLE. That way, we can
avoid unmasking the interrupt and signaling it if the guest masked it via the
PCI config space.
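For illustration, user space could drive this as follows (sketch only, based
on the interfaces added below; error handling omitted):

	struct kvm_assigned_pci_dev adev = {
		.assigned_dev_id = dev_id,
		.flags = KVM_DEV_ASSIGN_ENABLE_IOMMU | KVM_DEV_ASSIGN_PCI_2_3,
		/* segnr/busnr/devfn filled in as usual */
	};

	ioctl(vm_fd, KVM_ASSIGN_PCI_DEVICE, &adev);

	/* guest sets PCI_COMMAND_INTX_DISABLE: inform the kernel first... */
	adev.flags = KVM_DEV_ASSIGN_MASK_INTX;
	ioctl(vm_fd, KVM_ASSIGN_SET_INTX_MASK, &adev);
	/* ...then apply the bit to the device's real config space */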
Signed-off-by: Jan Kiszka <[email protected]>
---
Documentation/kvm/api.txt | 27 ++++
arch/x86/kvm/x86.c | 1 +
include/linux/kvm.h | 6 +
include/linux/kvm_host.h | 10 ++-
virt/kvm/assigned-dev.c | 336 ++++++++++++++++++++++++++++++++++++++++-----
5 files changed, 346 insertions(+), 34 deletions(-)
diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index e1a9297..1c34e25 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -1112,6 +1112,14 @@ following flags are specified:
/* Depends on KVM_CAP_IOMMU */
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
+/* The following two depend on KVM_CAP_PCI_2_3 */
+#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
+#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
+
+If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
+via the PCI-2.3-compliant device-level mask, but only if IRQ sharing with other
+assigned or host devices requires it. KVM_DEV_ASSIGN_MASK_INTX specifies the
+guest's view of the INTx mask; see KVM_ASSIGN_SET_INTX_MASK for details.
4.48 KVM_DEASSIGN_PCI_DEVICE
@@ -1263,6 +1271,25 @@ struct kvm_assigned_msix_entry {
__u16 padding[3];
};
+4.54 KVM_ASSIGN_SET_INTX_MASK
+
+Capability: KVM_CAP_PCI_2_3
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_assigned_pci_dev (in)
+Returns: 0 on success, -1 on error
+
+Informs the kernel about the guest's view of the INTx mask. As long as the
+guest masks the legacy INTx, the kernel will refrain from unmasking it at
+hardware level and will not assert the guest's IRQ line. User space is still
+responsible for applying this state to the assigned device's real config space.
+To prevent the kernel from overwriting the state user space wants to set,
+KVM_ASSIGN_SET_INTX_MASK has to be called prior to updating the config space.
+
+See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
+by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
+evaluated.
+
5. The kvm_run structure
Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ed373ba..8775a54 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1965,6 +1965,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_X86_ROBUST_SINGLESTEP:
case KVM_CAP_XSAVE:
case KVM_CAP_ASYNC_PF:
+ case KVM_CAP_PCI_2_3:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index ea2dc1a..3cadb42 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -541,6 +541,7 @@ struct kvm_ppc_pvinfo {
#define KVM_CAP_PPC_GET_PVINFO 57
#define KVM_CAP_PPC_IRQ_LEVEL 58
#define KVM_CAP_ASYNC_PF 59
+#define KVM_CAP_PCI_2_3 60
#ifdef KVM_CAP_IRQ_ROUTING
@@ -677,6 +678,9 @@ struct kvm_clock_data {
#define KVM_SET_PIT2 _IOW(KVMIO, 0xa0, struct kvm_pit_state2)
/* Available with KVM_CAP_PPC_GET_PVINFO */
#define KVM_PPC_GET_PVINFO _IOW(KVMIO, 0xa1, struct kvm_ppc_pvinfo)
+/* Available with KVM_CAP_PCI_2_3 */
+#define KVM_ASSIGN_SET_INTX_MASK _IOW(KVMIO, 0xa2, \
+ struct kvm_assigned_pci_dev)
/*
* ioctls for vcpu fds
@@ -742,6 +746,8 @@ struct kvm_clock_data {
#define KVM_SET_XCRS _IOW(KVMIO, 0xa7, struct kvm_xcrs)
#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
+#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
+#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2)
struct kvm_assigned_pci_dev {
__u32 assigned_dev_id;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ac4e83a..4f95070 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -477,6 +477,12 @@ struct kvm_irq_ack_notifier {
void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
};
+enum kvm_intx_state {
+ KVM_INTX_ENABLED,
+ KVM_INTX_LINE_DISABLED,
+ KVM_INTX_DEVICE_DISABLED,
+};
+
struct kvm_assigned_dev_kernel {
struct kvm_irq_ack_notifier ack_notifier;
struct list_head list;
@@ -486,7 +492,7 @@ struct kvm_assigned_dev_kernel {
int host_devfn;
unsigned int entries_nr;
int host_irq;
- bool host_irq_disabled;
+ unsigned long last_irq_status;
struct msix_entry *host_msix_entries;
int guest_irq;
struct msix_entry *guest_msix_entries;
@@ -496,6 +502,8 @@ struct kvm_assigned_dev_kernel {
struct pci_dev *dev;
struct kvm *kvm;
spinlock_t intx_lock;
+ spinlock_t intx_mask_lock;
+ enum kvm_intx_state intx_state;
char irq_name[32];
};
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index c6114d3..b91a2db 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -55,22 +55,141 @@ static int find_index_from_host_irq(struct kvm_assigned_dev_kernel
return index;
}
-static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id)
+static bool
+pci_2_3_set_irq_mask(struct pci_dev *dev, bool mask, bool check_status)
+{
+ u32 cmd_status_dword;
+ u16 origcmd, newcmd;
+ bool mask_updated = true;
+
+ /*
+ * We do a single dword read to retrieve both command and status.
+ * Document assumptions that make this possible.
+ */
+ BUILD_BUG_ON(PCI_COMMAND % 4);
+ BUILD_BUG_ON(PCI_COMMAND + 2 != PCI_STATUS);
+
+ pci_block_user_cfg_access(dev);
+
+ /*
+ * Read both command and status registers in a single 32-bit operation.
+ * Note: we could cache the value for command and move the status read
+ * out of the lock if there was a way to get notified of user changes
+ * to command register through sysfs. Should be good for shared irqs.
+ */
+ pci_read_config_dword(dev, PCI_COMMAND, &cmd_status_dword);
+
+ if (check_status) {
+ bool irq_pending =
+ (cmd_status_dword >> 16) & PCI_STATUS_INTERRUPT;
+
+ /*
+ * Check interrupt status register to see whether our device
+ * triggered the interrupt (when masking) or the next IRQ is
+ * already pending (when unmasking).
+ */
+ if (mask != irq_pending) {
+ mask_updated = false;
+ goto done;
+ }
+ }
+
+ origcmd = cmd_status_dword;
+ newcmd = origcmd & ~PCI_COMMAND_INTX_DISABLE;
+ if (mask)
+ newcmd |= PCI_COMMAND_INTX_DISABLE;
+ if (newcmd != origcmd)
+ pci_write_config_word(dev, PCI_COMMAND, newcmd);
+
+done:
+ pci_unblock_user_cfg_access(dev);
+ return mask_updated;
+}
+
+static void pci_2_3_irq_mask(struct pci_dev *dev)
+{
+ pci_2_3_set_irq_mask(dev, true, false);
+}
+
+static bool pci_2_3_irq_check_and_mask(struct pci_dev *dev)
+{
+ return pci_2_3_set_irq_mask(dev, true, true);
+}
+
+static void pci_2_3_irq_unmask(struct pci_dev *dev)
+{
+ pci_2_3_set_irq_mask(dev, false, false);
+}
+
+static bool pci_2_3_irq_check_and_unmask(struct pci_dev *dev)
+{
+ return pci_2_3_set_irq_mask(dev, false, true);
+}
+
+static irqreturn_t kvm_assigned_dev_intr_intx(int irq, void *dev_id)
+{
+ struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
+ unsigned long irq_status = get_irq_status(irq);
+ int ret;
+
+ assigned_dev->last_irq_status = irq_status;
+
+ if (!(irq_status & (IRQS_SHARED | IRQS_MAKE_SHAREABLE)))
+ return IRQ_WAKE_THREAD;
+
+ spin_lock(&assigned_dev->intx_lock);
+
+ if (irq_status & IRQS_MAKE_SHAREABLE) {
+ if (assigned_dev->intx_state == KVM_INTX_LINE_DISABLED) {
+ pci_2_3_irq_mask(assigned_dev->dev);
+ enable_irq(irq);
+ assigned_dev->intx_state = KVM_INTX_DEVICE_DISABLED;
+ }
+ ret = IRQ_HANDLED;
+ } else if (pci_2_3_irq_check_and_mask(assigned_dev->dev)) {
+ assigned_dev->intx_state = KVM_INTX_DEVICE_DISABLED;
+ ret = IRQ_WAKE_THREAD;
+ } else
+ ret = IRQ_NONE;
+
+ spin_unlock(&assigned_dev->intx_lock);
+
+ return ret;
+}
+
+static irqreturn_t kvm_assigned_dev_thread_intx(int irq, void *dev_id)
{
struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
- if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_INTX) {
- spin_lock(&assigned_dev->intx_lock);
- disable_irq_nosync(irq);
- assigned_dev->host_irq_disabled = true;
- spin_unlock(&assigned_dev->intx_lock);
+ if (!(assigned_dev->last_irq_status & IRQS_SHARED)) {
+ spin_lock_irq(&assigned_dev->intx_lock);
+ if (assigned_dev->intx_state == KVM_INTX_ENABLED) {
+ disable_irq_nosync(irq);
+ assigned_dev->intx_state = KVM_INTX_LINE_DISABLED;
+ }
+ spin_unlock_irq(&assigned_dev->intx_lock);
}
+ spin_lock(&assigned_dev->intx_mask_lock);
+ if (!(assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX))
+ kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
+ assigned_dev->guest_irq, 1);
+ spin_unlock(&assigned_dev->intx_mask_lock);
+
+ return IRQ_HANDLED;
+}
+
+#ifdef __KVM_HAVE_MSI
+static irqreturn_t kvm_assigned_dev_thread_msi(int irq, void *dev_id)
+{
+ struct kvm_assigned_dev_kernel *assigned_dev = dev_id;
+
kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
assigned_dev->guest_irq, 1);
return IRQ_HANDLED;
}
+#endif
#ifdef __KVM_HAVE_MSIX
static irqreturn_t kvm_assigned_dev_thread_msix(int irq, void *dev_id)
@@ -102,15 +221,36 @@ static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian)
kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 0);
- /* The guest irq may be shared so this ack may be
- * from another device.
- */
- spin_lock(&dev->intx_lock);
- if (dev->host_irq_disabled) {
- enable_irq(dev->host_irq);
- dev->host_irq_disabled = false;
+ if (likely(!(dev->irq_requested_type & KVM_DEV_IRQ_HOST_INTX)))
+ return;
+
+ spin_lock(&dev->intx_mask_lock);
+
+ if (!(dev->flags & KVM_DEV_ASSIGN_MASK_INTX)) {
+ bool reassert = false;
+
+ spin_lock_irq(&dev->intx_lock);
+ /*
+ * The guest IRQ may be shared so this ack can come from an
+ * IRQ for another guest device.
+ */
+ if (dev->intx_state == KVM_INTX_LINE_DISABLED) {
+ enable_irq(dev->host_irq);
+ dev->intx_state = KVM_INTX_ENABLED;
+ } else if (dev->intx_state == KVM_INTX_DEVICE_DISABLED) {
+ if (pci_2_3_irq_check_and_unmask(dev->dev))
+ dev->intx_state = KVM_INTX_ENABLED;
+ else
+ reassert = true;
+ }
+ spin_unlock_irq(&dev->intx_lock);
+
+ if (reassert)
+ kvm_set_irq(dev->kvm, dev->irq_source_id,
+ dev->guest_irq, 1);
}
- spin_unlock(&dev->intx_lock);
+
+ spin_unlock(&dev->intx_mask_lock);
}
static void deassign_guest_irq(struct kvm *kvm,
@@ -155,14 +295,21 @@ static void deassign_host_irq(struct kvm *kvm,
kfree(assigned_dev->host_msix_entries);
kfree(assigned_dev->guest_msix_entries);
pci_disable_msix(assigned_dev->dev);
+ } else if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSI) {
+ free_irq(assigned_dev->host_irq, assigned_dev);
+ pci_disable_msi(assigned_dev->dev);
} else {
- /* Deal with MSI and INTx */
- disable_irq(assigned_dev->host_irq);
+ if (assigned_dev->flags & KVM_DEV_ASSIGN_PCI_2_3) {
+ spin_lock_irq(&assigned_dev->intx_lock);
+ pci_2_3_irq_mask(assigned_dev->dev);
+ /* prevent re-enabling by kvm_assigned_dev_ack_irq */
+ assigned_dev->intx_state = KVM_INTX_ENABLED;
+ spin_unlock_irq(&assigned_dev->intx_lock);
+ synchronize_irq(assigned_dev->host_irq);
+ } else
+ disable_irq(assigned_dev->host_irq);
free_irq(assigned_dev->host_irq, assigned_dev);
-
- if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSI)
- pci_disable_msi(assigned_dev->dev);
}
assigned_dev->irq_requested_type &= ~(KVM_DEV_IRQ_HOST_MASK);
@@ -231,16 +378,41 @@ void kvm_free_all_assigned_devices(struct kvm *kvm)
static int assigned_device_enable_host_intx(struct kvm *kvm,
struct kvm_assigned_dev_kernel *dev)
{
+ irq_handler_t handler;
+ unsigned long flags;
+ int err;
+
dev->host_irq = dev->dev->irq;
- /* Even though this is PCI, we don't want to use shared
- * interrupts. Sharing host devices with guest-assigned devices
- * on the same interrupt line is not a happy situation: there
- * are going to be long delays in accepting, acking, etc.
+ dev->intx_state = KVM_INTX_ENABLED;
+ dev->last_irq_status = 0;
+
+ /*
+ * We can only share the IRQ line with other host devices if we are
+ * able to disable the IRQ source at device-level - independently of
+ * the guest driver. Otherwise host devices may suffer from unbounded
+ * IRQ latencies when the guest keeps the line asserted.
+ * If PCI 2.3 support is available, we can install a sharing notifier
+ * and apply the required disabling pattern on demand.
*/
- if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread,
- IRQF_ONESHOT, dev->irq_name, dev))
- return -EIO;
- return 0;
+ if (dev->flags & KVM_DEV_ASSIGN_PCI_2_3) {
+ handler = kvm_assigned_dev_intr_intx;
+ flags = IRQF_SHARED | IRQF_ADAPTIVE | IRQF_COND_ONESHOT;
+ } else {
+ handler = NULL;
+ flags = IRQF_ONESHOT;
+ }
+
+ err = request_threaded_irq(dev->host_irq, handler,
+ kvm_assigned_dev_thread_intx, flags,
+ dev->irq_name, dev);
+
+ if (!err && dev->flags & KVM_DEV_ASSIGN_PCI_2_3) {
+ spin_lock_irq(&dev->intx_lock);
+ pci_2_3_irq_unmask(dev->dev);
+ spin_unlock_irq(&dev->intx_lock);
+ }
+
+ return err;
}
#ifdef __KVM_HAVE_MSI
@@ -256,8 +428,9 @@ static int assigned_device_enable_host_msi(struct kvm *kvm,
}
dev->host_irq = dev->dev->irq;
- if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread,
- 0, dev->irq_name, dev)) {
+ if (request_threaded_irq(dev->host_irq, NULL,
+ kvm_assigned_dev_thread_msi, 0,
+ dev->irq_name, dev)) {
pci_disable_msi(dev->dev);
return -EIO;
}
@@ -296,7 +469,6 @@ err:
pci_disable_msix(dev->dev);
return r;
}
-
#endif
static int assigned_device_enable_guest_intx(struct kvm *kvm,
@@ -315,7 +487,6 @@ static int assigned_device_enable_guest_msi(struct kvm *kvm,
{
dev->guest_irq = irq->guest_irq;
dev->ack_notifier.gsi = -1;
- dev->host_irq_disabled = false;
return 0;
}
#endif
@@ -327,7 +498,6 @@ static int assigned_device_enable_guest_msix(struct kvm *kvm,
{
dev->guest_irq = irq->guest_irq;
dev->ack_notifier.gsi = -1;
- dev->host_irq_disabled = false;
return 0;
}
#endif
@@ -461,6 +631,7 @@ static int kvm_vm_ioctl_deassign_dev_irq(struct kvm *kvm,
{
int r = -ENODEV;
struct kvm_assigned_dev_kernel *match;
+ unsigned long irq_type;
mutex_lock(&kvm->lock);
@@ -469,12 +640,51 @@ static int kvm_vm_ioctl_deassign_dev_irq(struct kvm *kvm,
if (!match)
goto out;
- r = kvm_deassign_irq(kvm, match, assigned_irq->flags);
+ irq_type = assigned_irq->flags & (KVM_DEV_IRQ_HOST_MASK |
+ KVM_DEV_IRQ_GUEST_MASK);
+ r = kvm_deassign_irq(kvm, match, irq_type);
out:
mutex_unlock(&kvm->lock);
return r;
}
+/*
+ * Verify that the device supports the Interrupt Disable bit in the command
+ * register, per PCI 2.3, by flipping this bit and reading it back: this bit
+ * was read-only in PCI 2.2.
+ */
+static bool pci_2_3_supported(struct pci_dev *pdev)
+{
+ u16 orig, new;
+
+ pci_block_user_cfg_access(pdev);
+
+ pci_read_config_word(pdev, PCI_COMMAND, &orig);
+ pci_write_config_word(pdev, PCI_COMMAND,
+ orig ^ PCI_COMMAND_INTX_DISABLE);
+ pci_read_config_word(pdev, PCI_COMMAND, &new);
+ pci_write_config_word(pdev, PCI_COMMAND, orig);
+
+ pci_unblock_user_cfg_access(pdev);
+
+ /*
+ * There's no way to protect against hardware bugs or detect them
+ * reliably, but as long as we know what the value should be, let's
+ * go ahead and check it.
+ */
+ if ((new ^ orig) & ~PCI_COMMAND_INTX_DISABLE) {
+ dev_err(&pdev->dev, "Command changed from 0x%x to 0x%x: "
+ "driver or HW bug?\n", orig, new);
+ return false;
+ }
+ if (!((new ^ orig) & PCI_COMMAND_INTX_DISABLE)) {
+ dev_warn(&pdev->dev, "Device does not support disabling "
+ "interrupts, IRQ sharing impossible.\n");
+ return false;
+ }
+ return true;
+}
+
static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
struct kvm_assigned_pci_dev *assigned_dev)
{
@@ -523,6 +733,9 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
pci_reset_function(dev);
pci_save_state(dev);
+ if (!pci_2_3_supported(dev))
+ assigned_dev->flags &= ~KVM_DEV_ASSIGN_PCI_2_3;
+
match->assigned_dev_id = assigned_dev->assigned_dev_id;
match->host_segnr = assigned_dev->segnr;
match->host_busnr = assigned_dev->busnr;
@@ -530,6 +743,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
match->flags = assigned_dev->flags;
match->dev = dev;
spin_lock_init(&match->intx_lock);
+ spin_lock_init(&match->intx_mask_lock);
match->irq_source_id = -1;
match->kvm = kvm;
match->ack_notifier.irq_acked = kvm_assigned_dev_ack_irq;
@@ -676,6 +890,53 @@ msix_entry_out:
}
#endif
+static int kvm_vm_ioctl_set_pci_irq_mask(struct kvm *kvm,
+ struct kvm_assigned_pci_dev *assigned_dev)
+{
+ int r = 0;
+ struct kvm_assigned_dev_kernel *match;
+
+ mutex_lock(&kvm->lock);
+
+ match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
+ assigned_dev->assigned_dev_id);
+ if (!match) {
+ r = -ENODEV;
+ goto out;
+ }
+
+ spin_lock(&match->intx_mask_lock);
+
+ match->flags &= ~KVM_DEV_ASSIGN_MASK_INTX;
+ match->flags |= assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX;
+
+ if (assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX) {
+ kvm_set_irq(match->kvm, match->irq_source_id,
+ match->guest_irq, 0);
+ /*
+ * Masking at hardware-level is performed on demand, i.e. when
+ * an IRQ actually arrives at the host.
+ */
+ } else {
+ /*
+ * Unmask the IRQ line. It may have been masked meanwhile if
+ * we aren't using PCI 2.3 INTx masking on the host side.
+ */
+ spin_lock_irq(&match->intx_lock);
+ if (match->intx_state == KVM_INTX_LINE_DISABLED) {
+ enable_irq(match->host_irq);
+ match->intx_state = KVM_INTX_ENABLED;
+ }
+ spin_unlock_irq(&match->intx_lock);
+ }
+
+ spin_unlock(&match->intx_mask_lock);
+
+out:
+ mutex_unlock(&kvm->lock);
+ return r;
+}
+
long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl,
unsigned long arg)
{
@@ -783,6 +1044,15 @@ long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl,
break;
}
#endif
+ case KVM_ASSIGN_SET_INTX_MASK: {
+ struct kvm_assigned_pci_dev assigned_dev;
+
+ r = -EFAULT;
+ if (copy_from_user(&assigned_dev, argp, sizeof assigned_dev))
+ goto out;
+ r = kvm_vm_ioctl_set_pci_irq_mask(kvm, &assigned_dev);
+ break;
+ }
default:
r = -ENOTTY;
break;
--
1.7.1
From: Jan Kiszka <[email protected]>
This enables interrupt handlers to retrieve the current line sharing state via
the new interrupt status word so that they can adapt to it.
The switch from shared to exclusive is generally uncritical and can thus be
performed on demand. However, preparing a line for shared mode may require
preparatory steps by the currently registered handler. It can therefore
request an ahead-of-time notification via IRQF_ADAPTIVE. The notification
consists of an exceptional handler invocation with IRQS_MAKE_SHAREABLE set in
the status word.
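For illustration, the primary handler of an IRQF_ADAPTIVE user could be
structured like this (sketch only; the device helpers are made up, the
pattern follows the KVM patch of this series):

static irqreturn_t my_adaptive_handler(int irq, void *dev_id)
{
	unsigned long status = get_irq_status(irq);

	if (status & IRQS_MAKE_SHAREABLE) {
		/* prepare for sharing: mask at device level from now on */
		my_dev_mask(dev_id);			/* made-up helper */
		return IRQ_HANDLED;
	}
	if (!(status & IRQS_SHARED))
		return IRQ_WAKE_THREAD;
	if (!my_dev_irq_pending(dev_id))		/* made-up helper */
		return IRQ_NONE;	/* not our device */
	my_dev_mask(dev_id);
	return IRQ_WAKE_THREAD;
}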
Signed-off-by: Jan Kiszka <[email protected]>
---
include/linux/interrupt.h | 10 +++++++++
kernel/irq/manage.c | 47 ++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 54 insertions(+), 3 deletions(-)
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 4c1aa72..12e5fc0 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -55,6 +55,7 @@
* Used by threaded interrupts which need to keep the
* irq line disabled until the threaded handler has been run.
* IRQF_NO_SUSPEND - Do not disable this IRQ during suspend
+ * IRQF_ADAPTIVE - Request notification about upcoming interrupt line sharing
*
*/
#define IRQF_DISABLED 0x00000020
@@ -67,6 +68,7 @@
#define IRQF_IRQPOLL 0x00001000
#define IRQF_ONESHOT 0x00002000
#define IRQF_NO_SUSPEND 0x00004000
+#define IRQF_ADAPTIVE 0x00008000
#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND)
@@ -126,6 +128,14 @@ struct irqaction {
extern irqreturn_t no_action(int cpl, void *dev_id);
+/*
+ * Driver-readable IRQ line status flags:
+ * IRQS_SHARED - line is shared between multiple handlers
+ * IRQS_MAKE_SHAREABLE - in the process of making an exclusive line shareable
+ */
+#define IRQS_SHARED 0x00000001
+#define IRQS_MAKE_SHAREABLE 0x00000002
+
extern unsigned long get_irq_status(unsigned int irq);
#ifdef CONFIG_GENERIC_HARDIRQS
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 2ea0d30..2dd4eef 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -14,9 +14,12 @@
#include <linux/interrupt.h>
#include <linux/slab.h>
#include <linux/sched.h>
+#include <linux/mutex.h>
#include "internals.h"
+static DEFINE_MUTEX(register_lock);
+
/**
* synchronize_irq - wait for pending IRQ handlers (on other CPUs)
* @irq: interrupt number to wait for
@@ -754,6 +757,8 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
old = *old_ptr;
} while (old);
shared = 1;
+
+ desc->irq_data.drv_status |= IRQS_SHARED;
}
if (!shared) {
@@ -883,6 +888,7 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
{
struct irq_desc *desc = irq_to_desc(irq);
struct irqaction *action, **action_ptr;
+ bool single_handler = false;
unsigned long flags;
WARN(in_interrupt(), "Trying to free IRQ %d from IRQ context!\n", irq);
@@ -928,7 +934,8 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
desc->irq_data.chip->irq_shutdown(&desc->irq_data);
else
desc->irq_data.chip->irq_disable(&desc->irq_data);
- }
+ } else if (!desc->action->next)
+ single_handler = true;
#ifdef CONFIG_SMP
/* make sure affinity_hint is cleaned up */
@@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
/* Make sure it's not being used on another CPU: */
synchronize_irq(irq);
+ if (single_handler)
+ desc->irq_data.drv_status &= ~IRQS_SHARED;
+
#ifdef CONFIG_DEBUG_SHIRQ
/*
* It's a shared IRQ -- the driver ought to be prepared for an IRQ
@@ -1002,9 +1012,13 @@ void free_irq(unsigned int irq, void *dev_id)
if (!desc)
return;
+ mutex_lock(&register_lock);
+
chip_bus_lock(desc);
kfree(__free_irq(irq, dev_id));
chip_bus_sync_unlock(desc);
+
+ mutex_unlock(&register_lock);
}
EXPORT_SYMBOL(free_irq);
@@ -1055,7 +1069,7 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
irq_handler_t thread_fn, unsigned long irqflags,
const char *devname, void *dev_id)
{
- struct irqaction *action;
+ struct irqaction *action, *old_action;
struct irq_desc *desc;
int retval;
@@ -1091,12 +1105,39 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
action->name = devname;
action->dev_id = dev_id;
+ mutex_lock(&register_lock);
+
+ old_action = desc->action;
+ if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
+ !(desc->irq_data.drv_status & IRQS_SHARED)) {
+ /*
+ * Signal the old handler that it has to switch to shareable
+ * handling mode. Disable the line to avoid any conflict with
+ * a real IRQ.
+ */
+ disable_irq(irq);
+ local_irq_disable();
+
+ desc->irq_data.drv_status |= IRQS_SHARED | IRQS_MAKE_SHAREABLE;
+ old_action->handler(irq, old_action->dev_id);
+ desc->irq_data.drv_status &= ~IRQS_MAKE_SHAREABLE;
+
+ local_irq_enable();
+ enable_irq(irq);
+
+ }
+
chip_bus_lock(desc);
retval = __setup_irq(irq, desc, action);
chip_bus_sync_unlock(desc);
- if (retval)
+ if (retval) {
+ if (desc->action && !desc->action->next)
+ desc->irq_data.drv_status &= ~IRQS_SHARED;
kfree(action);
+ }
+
+ mutex_unlock(&register_lock);
#ifdef CONFIG_DEBUG_SHIRQ
if (!retval && (irqflags & IRQF_SHARED)) {
--
1.7.1
On 12/14/2010 12:59 AM, Jan Kiszka wrote:
> Final but critical question: Who will pick up which bits?
>
The procedure which has served us well in the past is that tip picks up
the irq stuff and sticks it in a fast-forward-only branch; kvm merges
the branch and applies the kvm bits on top.
--
error compiling committee.c: too many arguments to function
On Mon, 13 Dec 2010, Jan Kiszka wrote:
> +/**
> + * get_irq_status - read interrupt line status word
> + * @irq: Interrupt line of the status word
> + *
> + * This returns the current content of the status word associated with
> + * the given interrupt line. See IRQS_* flags for details.
> + */
> +unsigned long get_irq_status(unsigned int irq)
> +{
> + struct irq_desc *desc = irq_to_desc(irq);
> +
> + return desc ? desc->irq_data.drv_status : 0;
> +}
> +EXPORT_SYMBOL_GPL(get_irq_status);
We should document that this is a snapshot and in no way serialized
against modifications of drv_status. I'll fix up the kernel doc.
Thanks,
tglx
On Mon, 13 Dec 2010, Jan Kiszka wrote:
> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
> /* Make sure it's not being used on another CPU: */
> synchronize_irq(irq);
>
> + if (single_handler)
> + desc->irq_data.drv_status &= ~IRQS_SHARED;
> +
What's the reason to clear this flag outside of the desc->lock held
region. I need this status for other purposes as well, where I
definitely need serialization.
> + mutex_lock(&register_lock);
> +
> + old_action = desc->action;
> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
> + /*
> + * Signal the old handler that it has to switch to shareable
> + * handling mode. Disable the line to avoid any conflict with
> + * a real IRQ.
> + */
> + disable_irq(irq);
> + local_irq_disable();
> +
> + desc->irq_data.drv_status |= IRQS_SHARED | IRQS_MAKE_SHAREABLE;
Unserialized access as well. Will think about it.
> + old_action->handler(irq, old_action->dev_id);
> + desc->irq_data.drv_status &= ~IRQS_MAKE_SHAREABLE;
Thanks,
tglx
On Mon, 13 Dec 2010, Jan Kiszka wrote:
> From: Jan Kiszka <[email protected]>
> chip_bus_lock(desc);
> retval = __setup_irq(irq, desc, action);
> chip_bus_sync_unlock(desc);
>
> - if (retval)
> + if (retval) {
> + if (desc->action && !desc->action->next)
> + desc->irq_data.drv_status &= ~IRQS_SHARED;
This is redundant. IRQS_SHARED gets set in a code path where all
checks are done already.
To make that more obvious we can set it right before
raw_spin_unlock_irqrestore(&desc->lock, flags);
conditionally on (shared).
That way we can also move the kfree out of the mutex locked section.
Thanks,
tglx
On Mon, 13 Dec 2010, Jan Kiszka wrote:
> This addresses the review comments of the previous round:
> - renamed irq_data::status to drv_status
> - moved drv_status around to unbreak GENERIC_HARDIRQS_NO_DEPRECATED
> - fixed signature of get_irq_status (irq is now unsigned int)
> - converted register_lock into a global one
> - fixed critical white space breakage (that I just left in to check if
> anyone is actually reading the code, of course...)
Just for the record, you either missed or introduced some new white
space noise :)
On 14.12.2010 21:54, Thomas Gleixner wrote:
> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
>> /* Make sure it's not being used on another CPU: */
>> synchronize_irq(irq);
>>
>> + if (single_handler)
>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
>> +
>
> What's the reason to clear this flag outside of the desc->lock held
> region.
We need to synchronize the irq first before clearing the flag.
The problematic scenario behind this: An IRQ started in shared mode,
thus the line was unmasked after the hardirq. Now we clear IRQS_SHARED
before calling into the threaded handler. And that handler may now think
that the line is still masked as IRQS_SHARED is set.
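Drawn out as a timeline (illustration only, assuming the flag were cleared
before synchronize_irq()):

CPU 0 (free_irq)			CPU 1
					hardirq runs, sees SHARED,
					line is unmasked after the hardirq
clear IRQS_SHARED
					threaded handler runs, sees !SHARED,
					wrongly assumes the line is masked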
> I need this status for other purposes as well, where I
> definitely need serialization.
Well, two options: wrap all bit manipulations with desc->lock
acquisition/release or turn drv_status into an atomic. I don't know what
your plans with drv_status are, so...
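The atomic variant could use the bit operations directly, e.g. (sketch;
IRQS_SHARED_BIT would be a hypothetical bit-number counterpart of the
current mask):

	set_bit(IRQS_SHARED_BIT, &desc->irq_data.drv_status);
	...
	clear_bit(IRQS_SHARED_BIT, &desc->irq_data.drv_status);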
>
>> + mutex_lock(&register_lock);
>> +
>> + old_action = desc->action;
>> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
>> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
>> + /*
>> + * Signal the old handler that it has to switch to shareable
>> + * handling mode. Disable the line to avoid any conflict with
>> + * a real IRQ.
>> + */
>> + disable_irq(irq);
>> + local_irq_disable();
>> +
>> + desc->irq_data.drv_status |= IRQS_SHARED | IRQS_MAKE_SHAREABLE;
>
> Unserialized access as well. Will think about it.
>
>> + old_action->handler(irq, old_action->dev_id);
>> + desc->irq_data.drv_status &= ~IRQS_MAKE_SHAREABLE;
>
> Thanks,
>
> tglx
Jan
On 14.12.2010 22:46, Thomas Gleixner wrote:
> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>> From: Jan Kiszka <[email protected]>
>> chip_bus_lock(desc);
>> retval = __setup_irq(irq, desc, action);
>> chip_bus_sync_unlock(desc);
>>
>> - if (retval)
>> + if (retval) {
>> + if (desc->action && !desc->action->next)
>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
>
> This is redundant. IRQS_SHARED gets set in a code path where all
> checks are done already.
Nope, it's also set before entry of __setup_irq in case we call an
IRQF_ADAPTIVE handler.
We need to set it that early as we may race with IRQ events for the
already registered handler happening between the sharing notification
and the actual registration of the second handler.
Jan
On 14.12.2010 21:47, Thomas Gleixner wrote:
> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>> +/**
>> + * get_irq_status - read interrupt line status word
>> + * @irq: Interrupt line of the status word
>> + *
>> + * This returns the current content of the status word associated with
>> + * the given interrupt line. See IRQS_* flags for details.
>> + */
>> +unsigned long get_irq_status(unsigned int irq)
>> +{
>> + struct irq_desc *desc = irq_to_desc(irq);
>> +
>> + return desc ? desc->irq_data.drv_status : 0;
>> +}
>> +EXPORT_SYMBOL_GPL(get_irq_status);
>
> We should document that this is a snapshot and in no way serialized
> against modifications of drv_status. I'll fix up the kernel doc.
Yeah, I think I had some hint on this in the previous version but
apparently dropped it for this round.
Thanks,
Jan
On Wed, 15 Dec 2010, Jan Kiszka wrote:
> On 14.12.2010 22:46, Thomas Gleixner wrote:
> > On Mon, 13 Dec 2010, Jan Kiszka wrote:
> >> From: Jan Kiszka <[email protected]>
> >> chip_bus_lock(desc);
> >> retval = __setup_irq(irq, desc, action);
> >> chip_bus_sync_unlock(desc);
> >>
> >> - if (retval)
> >> + if (retval) {
> >> + if (desc->action && !desc->action->next)
> >> + desc->irq_data.drv_status &= ~IRQS_SHARED;
> >
> > This is redundant. IRQS_SHARED gets set in a code path where all
> > checks are done already.
>
> Nope, it's also set before entry of __setup_irq in case we call an
> IRQF_ADAPTIVE handler.
>
> We need to set it that early as we may race with IRQ events for the
> already registered handler happening between the sharing notification
> and the actual registration of the second handler.
Hmm, ok. Though the MAKE_SHAREABLE flag should be sufficient to do the
notification.
Thanks,
tglx
On 15.12.2010 09:05, Thomas Gleixner wrote:
> On Wed, 15 Dec 2010, Jan Kiszka wrote:
>
>> On 14.12.2010 22:46, Thomas Gleixner wrote:
>>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>>>> From: Jan Kiszka <[email protected]>
>>>> chip_bus_lock(desc);
>>>> retval = __setup_irq(irq, desc, action);
>>>> chip_bus_sync_unlock(desc);
>>>>
>>>> - if (retval)
>>>> + if (retval) {
>>>> + if (desc->action && !desc->action->next)
>>>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
>>>
>>> This is redundant. IRQS_SHARED gets set in a code path where all
>>> checks are done already.
>>
>> Nope, it's also set before entry of __setup_irq in case we call an
>> IRQF_ADAPTIVE handler.
>>
>> We need to set it that early as we may race with IRQ events for the
>> already registered handler happening between the sharing notification
>> and the actual registration of the second handler.
>
> Hmm, ok. Though the MAKE_SHAREABLE flag should be sufficient to do the
> notification.
For notification, yes. But we need SHARED once we reenable the line
after the notification.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
On Wed, 15 Dec 2010, Jan Kiszka wrote:
> On 15.12.2010 09:05, Thomas Gleixner wrote:
> > On Wed, 15 Dec 2010, Jan Kiszka wrote:
> >
> >> On 14.12.2010 22:46, Thomas Gleixner wrote:
> >>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
> >>>> From: Jan Kiszka <[email protected]>
> >>>> chip_bus_lock(desc);
> >>>> retval = __setup_irq(irq, desc, action);
> >>>> chip_bus_sync_unlock(desc);
> >>>>
> >>>> - if (retval)
> >>>> + if (retval) {
> >>>> + if (desc->action && !desc->action->next)
> >>>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
> >>>
> >>> This is redundant. IRQS_SHARED gets set in a code path where all
> >>> checks are done already.
> >>
> >> Nope, it's also set before entry of __setup_irq in case we call an
> >> IRQF_ADAPTIVE handler.
> >>
> >> We need to set it that early as we may race with IRQ events for the
> >> already registered handler happening between the sharing notification
> >> and the actual registration of the second handler.
> >
> > Hmm, ok. Though the MAKE_SHAREABLE flag should be sufficient to do the
> > notification.
>
> For notification, yes. But we need SHARED once we reenable the line
> after the notification.
Darn. Will think more about it.
On Wed, 15 Dec 2010, Jan Kiszka wrote:
> On 14.12.2010 21:54, Thomas Gleixner wrote:
> > On Mon, 13 Dec 2010, Jan Kiszka wrote:
> >> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
> >> /* Make sure it's not being used on another CPU: */
> >> synchronize_irq(irq);
> >>
> >> + if (single_handler)
> >> + desc->irq_data.drv_status &= ~IRQS_SHARED;
> >> +
> >
> > What's the reason to clear this flag outside of the desc->lock held
> > region.
>
> We need to synchronize the irq first before clearing the flag.
>
> The problematic scenario behind this: An IRQ started in shared mode,
> thus the line was unmasked after the hardirq. Now we clear IRQS_SHARED
> before calling into the threaded handler. And that handler may now think
> that the line is still masked as IRQS_SHARED is set.
That should read "not set" I guess. Hmm, needs more thoughts :(
> > I need this status for other purposes as well, where I
> > definitely need serialization.
>
> Well, two options: wrap all bit manipulations with desc->lock
> acquisition/release or turn drv_status into an atomic. I don't know what
> your plans with drv_status are, so...
Some bits for irq migration and other stuff, which allows us to avoid
fiddling with irqdesc in the drivers.
Thanks,
tglx
On 15.12.2010 14:04, Thomas Gleixner wrote:
> On Wed, 15 Dec 2010, Jan Kiszka wrote:
>> On 14.12.2010 21:54, Thomas Gleixner wrote:
>>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>>>> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
>>>> /* Make sure it's not being used on another CPU: */
>>>> synchronize_irq(irq);
>>>>
>>>> + if (single_handler)
>>>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
>>>> +
>>>
>>> What's the reason to clear this flag outside of the desc->lock held
>>> region.
>>
>> We need to synchronize the irq first before clearing the flag.
>>
>> The problematic scenario behind this: An IRQ started in shared mode,
>> thus the line was unmasked after the hardirq. Now we clear IRQS_SHARED
>> before calling into the threaded handler. And that handler may now think
>> that the line is still masked as IRQS_SHARED is set.
>
> That should read "not set" I guess.
Can't remember who wrote this, but that guy might have been too tired
for clear sentences: Yes, of course, we could run into troubles, if
IRQS_SHARED was _not_ set while the IRQ line is unmasked between hard
and threaded handler.
> Hmm, needs more thoughts :(
Be warned, might be painful.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
On Wed, 15 Dec 2010, Jan Kiszka wrote:
> On 15.12.2010 14:04, Thomas Gleixner wrote:
> > On Wed, 15 Dec 2010, Jan Kiszka wrote:
> >> On 14.12.2010 21:54, Thomas Gleixner wrote:
> >>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
> >>>> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
> >>>> /* Make sure it's not being used on another CPU: */
> >>>> synchronize_irq(irq);
> >>>>
> >>>> + if (single_handler)
> >>>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
> >>>> +
> >>>
> >>> What's the reason to clear this flag outside of the desc->lock held
> >>> region.
> >>
> >> We need to synchronize the irq first before clearing the flag.
> >>
> >> The problematic scenario behind this: An IRQ started in shared mode,
> >> thus the line was unmasked after the hardirq. Now we clear IRQS_SHARED
> >> before calling into the threaded handler. And that handler may now think
> >> that the line is still masked as IRQS_SHARED is set.
> >
> > That should read "not set" I guess.
>
> Can't remember who wrote this, but that guy might have been too tired
> for clear sentences: Yes, of course, we could run into troubles, if
> IRQS_SHARED was _not_ set while the IRQ line is unmasked between hard
> and threaded handler.
Right.
As a side note, the current implementation requires that you look up
irq_data.drv_status for every invocation of the handler or keep a
reference to irq_data.drv_status somewhere locally, which I don't like
either.
I have an evil and nasty idea how to avoid that; I need to see how ugly
it gets. Worst case we need to go back to that notification thing
which I really wanted to avoid in the first place.
Though I like the register_mutex idea which came out of this a lot, as
it allows us to reduce the desc->lock held and interrupt disabled
regions quite nicely.
/me goes back to stare at the code
> > Hmm, needs more thoughts :(
>
> Be warned, might be painful.
Bah, my brain became pain resistant when I started hacking that code.
Thanks,
tglx
On Wed, 15 Dec 2010, Jan Kiszka wrote:
> On 15.12.2010 14:04, Thomas Gleixner wrote:
> > On Wed, 15 Dec 2010, Jan Kiszka wrote:
> >> On 14.12.2010 21:54, Thomas Gleixner wrote:
> >>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
> >>>> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
> >>>> /* Make sure it's not being used on another CPU: */
> >>>> synchronize_irq(irq);
> >>>>
> >>>> + if (single_handler)
> >>>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
> >>>> +
> >>>
> >>> What's the reason to clear this flag outside of the desc->lock held
> >>> region.
> >>
> >> We need to synchronize the irq first before clearing the flag.
> >>
> >> The problematic scenario behind this: An IRQ started in shared mode,
> >> thus the line was unmasked after the hardirq. Now we clear IRQS_SHARED
> >> before calling into the threaded handler. And that handler may now think
> >> that the line is still masked as IRQS_SHARED is set.
> >
> > That should read "not set" I guess.
>
> Can't remember who wrote this, but that guy might have been too tired
> for clear sentences: Yes, of course, we could run into troubles, if
> IRQS_SHARED was _not_ set while the IRQ line is unmasked between hard
> and threaded handler.
>
> > Hmm, needs more thoughts :(
>
> Be warned, might be painful.
Talking about headache. Your solution above does not prevent that
scenario.
CPU 0					CPU 1
synchronize_irq();
					hard irq comes in sees shared and unmasks
clear IRQS_SHARED
					thread handler runs and sees !SHARED
Same scenario, just moved by a few lines :)
Thanks,
tglx
On 15.12.2010 16:41, Thomas Gleixner wrote:
> On Wed, 15 Dec 2010, Jan Kiszka wrote:
>
>> On 15.12.2010 14:04, Thomas Gleixner wrote:
>>> On Wed, 15 Dec 2010, Jan Kiszka wrote:
>>>> On 14.12.2010 21:54, Thomas Gleixner wrote:
>>>>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>>>>>> @@ -943,6 +950,9 @@ static struct irqaction *__free_irq(unsigned int irq, void *dev_id)
>>>>>> /* Make sure it's not being used on another CPU: */
>>>>>> synchronize_irq(irq);
>>>>>>
>>>>>> + if (single_handler)
>>>>>> + desc->irq_data.drv_status &= ~IRQS_SHARED;
>>>>>> +
>>>>>
>>>>> What's the reason to clear this flag outside of the desc->lock held
>>>>> region.
>>>>
>>>> We need to synchronize the irq first before clearing the flag.
>>>>
>>>> The problematic scenario behind this: An IRQ started in shared mode,
>>>> thus the line was unmasked after the hardirq. Now we clear IRQS_SHARED
>>>> before calling into the threaded handler. And that handler may now think
>>>> that the line is still masked as IRQS_SHARED is set.
>>>
>>> That should read "not set" I guess.
>>
>> Can't remember who wrote this, but that guy might have been too tired
>> for clear sentences: Yes, of course, we could run into troubles, if
>> IRQS_SHARED was _not_ set while the IRQ line is unmasked between hard
>> and threaded handler.
>>
>>> Hmm, needs more thoughts :(
>>
>> Be warned, might be painful.
>
> Talking about headache. Your solution above does not prevent that
> scenario.
>
> CPU 0				CPU 1
>
> synchronize_irq();
>					hard irq comes in sees shared and unmasks
Nope, IRQ_ONESHOT is already cleared at that point.
> clear IRQS_SHARED
>					thread handler runs and sees !SHARED
>
> Same scenario, just moved by a few lines :)
The same, just the other way around - and mostly harmless, I hope. :)
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
On Wed, 15 Dec 2010, Jan Kiszka wrote:
> On 15.12.2010 16:41, Thomas Gleixner wrote:
> > Talking about headache. Your solution above does not prevent that
> > scenario.
> >
> > CPU 0				CPU 1
> >
> > synchronize_irq();
> >					hard irq comes in sees shared and unmasks
>
> Nope, IRQ_ONESHOT is already cleared at that point.
Errm ? It's set. Could you please stop increasing my confusion ? :)
On Mon, 13 Dec 2010, Jan Kiszka wrote:
> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
> + /*
> + * Signal the old handler that it has to switch to shareable
> + * handling mode. Disable the line to avoid any conflict with
> + * a real IRQ.
> + */
> + disable_irq(irq);
This is weird, really. I thought you wanted to avoid waiting for the
threaded handler to finish if it's on the fly. So this should be
disable_irq_nosync() or did you change your mind ?
Thanks,
tglx
On 16.12.2010 14:13, Thomas Gleixner wrote:
> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
>> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
>> + /*
>> + * Signal the old handler that it has to switch to shareable
>> + * handling mode. Disable the line to avoid any conflict with
>> + * a real IRQ.
>> + */
>> + disable_irq(irq);
>
> This is weird, really. I thought you wanted to avoid waiting for the
> threaded handler to finish if it's on the fly. So this should be
> disable_irq_nosync() or did you change your mind ?
No, I did not. I wanted to avoid setting MAKE_SHAREABLE while there
might be another IRQ in flight. The handler that is called due to a real
IRQ might misinterpret MAKE_SHAREABLE as "there is no real event" and
perform the wrong steps (at least the current implementation for KVM would).
However, I will rebase my patch over your series now and try to re-think
this. The question is what could go wrong if we do not guarantee that
MAKE_SHAREABLE and ordinary IRQ will always be distinguishable. If there
is really nothing, specifically for the KVM scenario, we could even drop
the disable/enable_irq. That would also be nicer when thinking about
potential delays of the already registered handler during this
transitional phase.
Jan
On 16.12.2010 21:26, Jan Kiszka wrote:
> On 16.12.2010 14:13, Thomas Gleixner wrote:
>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>>> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
>>> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
>>> + /*
>>> + * Signal the old handler that it has to switch to shareable
>>> + * handling mode. Disable the line to avoid any conflict with
>>> + * a real IRQ.
>>> + */
>>> + disable_irq(irq);
>>
>> This is weird, really. I thought you wanted to avoid waiting for the
>> threaded handler to finish if it's on the fly. So this should be
>> disable_irq_nosync() or did you change your mind ?
>
> No, I did not. I wanted to avoid setting MAKE_SHAREABLE while there
> might be another IRQ in flight. The handler that is called due to a real
> IRQ might misinterpret MAKE_SHAREABLE as "there is no real event" and
> perform the wrong steps (at least the current implementation for KVM would).
Actually, the requirement we have to fulfill here is to prevent the hardirq
handler from seeing !SHARED while the threaded one reads "SHARED". To
achieve this without disabling the line, I'm still searching for a way
to couple the sharing state of associated hard and threaded handler runs
- but I think there is no reliable association, is there?
Jan
On Fri, 17 Dec 2010, Jan Kiszka wrote:
> On 16.12.2010 21:26, Jan Kiszka wrote:
> > On 16.12.2010 14:13, Thomas Gleixner wrote:
> >> On Mon, 13 Dec 2010, Jan Kiszka wrote:
> >>> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
> >>> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
> >>> + /*
> >>> + * Signal the old handler that it has to switch to shareable
> >>> + * handling mode. Disable the line to avoid any conflict with
> >>> + * a real IRQ.
> >>> + */
> >>> + disable_irq(irq);
> >>
> >> This is weird, really. I thought you wanted to avoid waiting for the
> >> threaded handler to finish if it's on the fly. So this should be
> >> disable_irq_nosync() or did you change your mind ?
> >
> > No, I did not. I wanted to avoid setting MAKE_SHAREABLE while there
> > might be another IRQ in flight. The handler that is called due to a real
> > IRQ might misinterpret MAKE_SHAREABLE as "there is no real event" and
> > perform the wrong steps (at least the current implementation for KVM would).
>
> Actually, the requirement we have to fulfill here is to prevent the hardirq
> handler from seeing !SHARED while the threaded one reads "SHARED". To
> achieve this without disabling the line, I'm still searching for a way
> to couple the sharing state of associated hard and threaded handler runs
> - but I think there is no reliable association, is there?
Unfortunately not. So the only way to solve that is disabling the
interrupt which makes sure that all handlers have completed.
OTOH, if we have to disable anyway, then we could simply keep it
disabled across the installation of a new handler. That would make the
notification business go away, wouldn't it ?
Thanks,
tglx
On 17.12.2010 11:23, Thomas Gleixner wrote:
> On Fri, 17 Dec 2010, Jan Kiszka wrote:
>> On 16.12.2010 21:26, Jan Kiszka wrote:
>>> On 16.12.2010 14:13, Thomas Gleixner wrote:
>>>> On Mon, 13 Dec 2010, Jan Kiszka wrote:
>>>>> + if (old_action && (old_action->flags & IRQF_ADAPTIVE) &&
>>>>> + !(desc->irq_data.drv_status & IRQS_SHARED)) {
>>>>> + /*
>>>>> + * Signal the old handler that it has to switch to shareable
>>>>> + * handling mode. Disable the line to avoid any conflict with
>>>>> + * a real IRQ.
>>>>> + */
>>>>> + disable_irq(irq);
>>>>
>>>> This is weird, really. I thought you wanted to avoid waiting for the
>>>> threaded handler to finish if it's on the fly. So this should be
>>>> disable_irq_nosync() or did you change your mind ?
>>>
>>> No, I did not. I wanted to avoid setting MAKE_SHAREABLE while there
>>> might be another IRQ in flight. The handler that is called due to a real
>>> IRQ might misinterpret MAKE_SHAREABLE as "there is no real event" and
>>> perform the wrong steps (at least the current implementation for KVM would).
>>
>> Actually, the requirement we have to fulfill here is to prevent the hardirq
>> handler from seeing !SHARED while the threaded one reads "SHARED". To
>> achieve this without disabling the line, I'm still searching for a way
>> to couple the sharing state of associated hard and threaded handler runs
>> - but I think there is no reliable association, is there?
>
> Unfortunately not. So the only way to solve that is disabling the
> interrupt which makes sure that all handlers have completed.
Hmm, what a pity.
>
> OTOH, if we have to disable anyway, then we could simply keep it
> disabled across the installation of a new handler. That would make the
> notification business go away, wouldn't it ?
No, the notification is still necessary in case the registered handler
keeps the line off after returning from both hard and threaded handler.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
On Fri, 17 Dec 2010, Jan Kiszka wrote:
> On 17.12.2010 11:23, Thomas Gleixner wrote:
> > OTOH, if we have to disable anyway, then we could simply keep it
> > disabled across the installation of a new handler. That would make the
> > notification business go away, wouldn't it ?
>
> No, the notification is still necessary in case the registered handler
> keeps the line off after returning from both hard and threaded handler.
And how should that happen? If it is in oneshot mode then the line is
reenabled when the thread handler returns.
Thanks,
tglx
On 17.12.2010 11:41, Thomas Gleixner wrote:
> On Fri, 17 Dec 2010, Jan Kiszka wrote:
>> On 17.12.2010 11:23, Thomas Gleixner wrote:
>>> OTOH, if we have to disable anyway, then we could simply keep it
>>> disabled across the installation of a new handler. That would make the
>>> notification business go away, wouldn't it ?
>>
>> No, the notification is still necessary in case the registered handler
>> keeps the line off after returning from both hard and threaded handler.
>
> And how should that happen? If it is in oneshot mode then the line is
> reenabled when the thread handler returns.
disable_irq_nosync is called by the handler before returning. And it's
the handler's job to revert this, properly synchronizing it internally.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
On Fri, 17 Dec 2010, Jan Kiszka wrote:
> On 17.12.2010 11:41, Thomas Gleixner wrote:
> > On Fri, 17 Dec 2010, Jan Kiszka wrote:
> >> On 17.12.2010 11:23, Thomas Gleixner wrote:
> >>> OTOH, if we have to disable anyway, then we could simply keep it
> >>> disabled across the installation of a new handler. That would make the
> >>> notification business go away, wouldn't it ?
> >>
> >> No, the notification is still necessary in case the registered handler
> >> keeps the line off after returning from both hard and threaded handler.
> >
> > And how should that happen? If it is in oneshot mode then the line is
> > reenabled when the thread handler returns.
>
> disable_irq_nosync is called by the handler before returning. And it's
> the handler's job to revert this, properly synchronizing it internally.
disable_irq_nosync() is really the worst thing to do. That's simply
not going to work without a lot of fugliness.
What about the following:
primary_handler(....)
{
	if (!shared)
		return IRQ_WAKE_THREAD;

	spin_lock(dev->irq_lock);

	if (from_my_device() || dev->irq_thread_waiting) {
		mask_dev();
		dev->masked = true;
		ret = IRQ_WAKE_THREAD;
	} else
		ret = IRQ_NONE;

	spin_unlock(dev->irq_lock);
	return ret;
}

check_timeout()
{
	if (dev->irq_active && wait_longer())
		return IRQ_WAKE_THREAD;
	return IRQ_HANDLED;
}

unmask_dev_if_necessary()
{
	if (dev->masked && dev->irq_active)
		unmask_dev();
}

threaded_handler(....)
{
	if (!dev->irq_thread_waiting) {
		spin_lock_irq(dev->irq_lock);
		wake_user = do_magic_stuff_with_the_dev();
		dev->irq_thread_waiting = wake_user;
		spin_unlock_irq(dev->irq_lock);
		if (wake_user)
			wake_up(user);
	}

	if (!dev->irq_thread_waiting) {
		spin_lock_irq(dev->irq_lock);
		unmask_dev_if_necessary();
		spin_unlock_irq(dev->irq_lock);
		return IRQ_HANDLED;
	}

	/*
	 * Wait for user space to complete. The timeout avoids
	 * starvation of the irq line when something goes wrong.
	 */
	timedout = !wait_for_completion_timeout(dev->irq_compl,
						SENSIBLE_TIMEOUT);

	spin_lock_irq(dev->irq_lock);
	if (timedout) {
		mask_dev();
		dev->masked = true;
		/*
		 * Leave dev->irq_thread_waiting untouched and let
		 * the core code reschedule us when check_timeout
		 * decides it's worth waiting. In any case we leave
		 * the device masked at the device level, so we don't
		 * cause an interrupt storm.
		 */
		ret = check_timeout();
	} else {
		unmask_dev_if_necessary();
		dev->irq_thread_waiting = false;
		ret = IRQ_HANDLED;
	}
	spin_unlock_irq(dev->irq_lock);
	return ret;
}

userspace_complete()
{
	complete(dev->irq_compl);
}
Your approach with disable_irq_nosync() is completely flawed, simply
because you try to pretend that your interrupt handler is done while it
is not done at all. The threaded interrupt handler is done when user
space completes. Everything else is just hacking around the problem and
creates all those nasty transitional problems.
The above code does not have them at all. The threaded handler does not
care at all about the dev_id shared state encoding and the state
transitions. It merely cares about the device-internal status
dev->masked. Everything else is handled by the genirq code and the
little status check in the primary handler.
Neither does the user space completion care about it. It just completes
and is completely oblivious to the irq line state/mode. And really, the
user space part should not care at all. It can set some status before
calling complete(), but that's driver-specific stuff and has nothing to
do with the irq handling magic.
It requires a few not too intrusive modifications to the genirq code:
- Rescheduling the thread handler on IRQ_WAKE_THREAD (see the sketch
  after this list)
- Changing the oneshot finalizing a bit
- Adding the status manipulations for request/free_irq
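To illustrate the first point, here is a rough sketch against the
irq_thread() loop in kernel/irq/manage.c - purely illustrative:
irq_wait_for_interrupt() and IRQTF_RUNTHREAD exist there today, while
the IRQ_WAKE_THREAD handling is the assumed new part:

	/* inside irq_thread(), illustrative only */
	while (!irq_wait_for_interrupt(action)) {
		irqreturn_t ret;

		ret = action->thread_fn(action->irq, action->dev_id);

		/*
		 * Assumed extension: IRQ_WAKE_THREAD from the thread
		 * handler means "run me again" - mark the thread
		 * runnable again instead of going back to sleep.
		 */
		if (ret == IRQ_WAKE_THREAD)
			set_bit(IRQTF_RUNTHREAD, &action->thread_flags);
	}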
Now you might argue that the timeout is ugly, but I don't think it's
ugly at all. You need it anyway in case user space fails completely.
And coming back after 100ms to let the genirq code handle a
disable_irq() or synchronize_irq() is a reasonable request; it's the
error/corner case after all. If there is code which installs/removes an
interrupt handler on the same line every 5ms, then this code becomes
rightfully blocked out for 100ms or so.
Thanks,
tglx
On 17.12.2010 16:25, Thomas Gleixner wrote:
> On Fri, 17 Dec 2010, Jan Kiszka wrote:
>
>> On 17.12.2010 11:41, Thomas Gleixner wrote:
>>> On Fri, 17 Dec 2010, Jan Kiszka wrote:
>>>> On 17.12.2010 11:23, Thomas Gleixner wrote:
>>>>> OTOH, if we have to disable anyway, then we could simply keep it
>>>>> disabled across the installation of a new handler. That would make the
>>>>> notification business go away, wouldn't it ?
>>>>
>>>> No, the notification is still necessary in case the registered handler
>>>> keeps the line off after returning from both hard and threaded handler.
>>>
>>> And how should that happen? If it is in oneshot mode then the line is
>>> reenabled when the thread handler returns.
>>
>> disable_irq_nosync is called by the handler before returning. And it's
>> the handler's job to revert this, properly synchronizing it internally.
>
> disable_irq_nosync() is really the worst thing to do. That's simply
> not going to work without a lot of fugliness.
>
> What about the following:
>
> [...]
>
> Your approach with disable_irq_nosync() is completely flawed, simply
> because you try to pretend that your interrupt handler is done while it
> is not done at all. The threaded interrupt handler is done when user
> space completes. Everything else is just hacking around the problem and
> creates all those nasty transitional problems.
disable_irq_nosync is the pattern currently used in KVM; it's nothing
new, in fact.

The approach looks interesting but requires separate code for
non-PCI-2.3 devices, i.e. when we have no means to mask at the device
level.
Further drawbacks - unless I missed something at first glance:

- prevents any future optimizations that would work without IRQ thread
  ping-pong (i.e. once we allow guest IRQ injection from hardirq context
  for selected but typical setups)
- two additional, though light-weight, context switches on each
  interrupt completion
- continuous polling if user space decides to leave the interrupt
  unhandled (e.g. because the virtual IRQ line is masked)

Maybe the latter can be solved in a nicer way, but I don't think we can
avoid the first two. I'm not saying yet that they are killing this
approach; we just need to assess their relevance.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
On Fri, 17 Dec 2010, Jan Kiszka wrote:
> On 17.12.2010 16:25, Thomas Gleixner wrote:
> > Your approach with disable_irq_nosync() is completely flawed, simply
> > because you try to pretend that your interrupt handler is done while
> > it is not done at all. The threaded interrupt handler is done when
> > user space completes. Everything else is just hacking around the
> > problem and creates all those nasty transitional problems.
>
> disable_irq_nosync is the pattern currently used in KVM; it's nothing
> new, in fact.
That does not make it less flawed :)
> The approach looks interesting but requires separate code for
> non-PCI-2.3 devices, i.e. when we have no means to mask at the device
> level.
Why? You can have the same code, you just can't request COND_ONESHOT
handlers for it, so it needs unshared ONESHOT or it won't work at all,
no matter which approach you choose. No device-level mask, no sharing;
it's that simple.
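In code, the registration choice might look like this - a hedged
sketch, where IRQF_COND_ONESHOT is the flag proposed in this series
and dev_can_mask() is a made-up helper for PCI 2.3 masking support:

	unsigned long flags;
	int ret;

	if (dev_can_mask(dev))
		/* maskable at device level: the line may be shared */
		flags = IRQF_SHARED | IRQF_COND_ONESHOT;
	else
		/* no device-level mask: exclusive oneshot only */
		flags = IRQF_ONESHOT;

	ret = request_threaded_irq(irq, primary_handler, threaded_handler,
				   flags, "assigned-dev", dev);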
> Further drawbacks - unless I missed something at first glance:
>
> - prevents any future optimizations that would work without IRQ thread
>   ping-pong (i.e. once we allow guest IRQ injection from hardirq context
>   for selected but typical setups)
> - two additional, though light-weight, context switches on each
>   interrupt completion
The drawback of these two points is way less than the horror which you
need to introduce to do the whole async handler disable, userspace
enable dance. Robust and simple solutions are really preferred over
complex and fragile horror which has a questionable runtime benefit.
> - continuous polling if user space decides to leave the interrupt
> unhandled (e.g. because the virtual IRQ line is masked)
That should be a solvable problem.
Thanks,
tglx
On Fri, Dec 17, 2010 at 05:32:46PM +0100, Thomas Gleixner wrote:
> On Fri, 17 Dec 2010, Jan Kiszka wrote:
> > On 17.12.2010 16:25, Thomas Gleixner wrote:
> > > Your approach with disable_irq_nosync() is completely flawed, simply
> > > because you try to pretend that your interrupt handler is done while
> > > it is not done at all. The threaded interrupt handler is done when
> > > user space completes. Everything else is just hacking around the
> > > problem and creates all those nasty transitional problems.
> >
> > disable_irq_nosync is the pattern currently used in KVM; it's nothing
> > new, in fact.
>
> That does not make it less flawed :)
>
> > The approach looks interesting but requires separate code for
> > non-PCI-2.3 devices, i.e. when we have no means to mask at the device
> > level.
>
> Why? You can have the same code, you just can't request COND_ONESHOT
> handlers for it, so it needs unshared ONESHOT or it won't work at all,
> no matter which approach you choose. No device-level mask, no sharing;
> it's that simple.
>
> > Further drawbacks - unless I missed something at first glance:
> >
> > - prevents any future optimizations that would work without IRQ thread
> >   ping-pong (i.e. once we allow guest IRQ injection from hardirq context
> >   for selected but typical setups)
> > - two additional, though light-weight, context switches on each
> >   interrupt completion
>
> The drawback of these two points is way less than the horror which you
> need to introduce to do the whole async handler disable, userspace
> enable dance. Robust and simple solutions are really preferred over
> complex and fragile horror which has a questionable runtime benefit.
I'd like to note that the overhead of involving the scheduler in
interrupt injection for an assigned device should be easily measurable:
just make the MSI handlers threaded and see what the result is.
In the case of emulated devices, when we had an extra thread involved
in MSI handling, the vcpu thread and the interrupt injection thread
were competing for the cpu, with strange fluctuations in performance as
a result (i.e. sometimes we would get good speed because threading
would introduce a kind of interrupt coalescing, sometimes we would see
huge latencies).
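Such a measurement could be as simple as the following sketch - the
handler names are placeholders and kvm_inject_msi() stands in for the
actual injection path:

static irqreturn_t kvm_msi_hardirq(int irq, void *dev_id)
{
	/* defer all work to the thread to expose the scheduler cost */
	return IRQ_WAKE_THREAD;
}

static irqreturn_t kvm_msi_thread(int irq, void *dev_id)
{
	kvm_inject_msi(dev_id);		/* placeholder injection call */
	return IRQ_HANDLED;
}

	/* instead of the current hardirq-only request_irq() */
	ret = request_threaded_irq(irq, kvm_msi_hardirq, kvm_msi_thread,
				   0, "kvm-msi-threaded", dev);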
> > - continuous polling if user space decides to leave the interrupt
> > unhandled (e.g. because the virtual IRQ line is masked)
>
> That should be a solvable problem.
>
> Thanks,
>
> tglx