2008-06-09 15:42:49

by Stefan Roscher

Subject: [PATCH 0/2] Prevent loss of interrupts in IB/ehca

This patchset contains two changes for IB/ehca and ibmebus.

The first patch enables ibmebus_request_irq() to optionally return the
IRQ number, which is used by the second patch to trigger EOI in case of
lost interrupts.

They should apply cleanly against the 2.6.26 git tree.

Thanks
Stefan


2008-06-09 22:51:44

by Paul Mackerras

Subject: Re: [PATCH 0/2] Prevent loss of interrupts in IB/ehca

Stefan Roscher writes:

> This patchset contains two changes for IB/ehca and ibmebus.
>
> The first patch enables ibmebus_request_irq() to optionally return the
> IRQ number, which is used by the second patch to trigger EOI in case of
> lost interrupts.

At first sight it seems like a very bad idea for a driver to be poking
into the internals of the interrupt subsystem like this. Under what
circumstances do interrupts get lost, and why does doing an extra EOI
like this fix the problem?

Paul.

2008-06-10 08:22:12

by Stefan Roscher

Subject: Re: [PATCH 0/2] Prevent loss of interrupts in IB/ehca

On Tuesday 10 June 2008 00:28:16 Paul Mackerras wrote:
> Stefan Roscher writes:
>
> > This patchset contains two changes for IB/ehca and ibmebus.
> >
> > The first patch enables ibmebus_request_irq() to optionally return the
> > IRQ number, which is used by the second patch to trigger EOI in case of
> > lost interrupts.
>
> At first sight it seems like a very bad idea for a driver to be poking
> into the internals of the interrupt subsystem like this. Under what
> circumstances do interrupts get lost, and why does doing an extra EOI
> like this fix the problem?
>
> Paul.
>

Processing events with timer-controlled polling is not the "typical"
way to handle adapter events.
During corner case testing, we noticed that some versions of ehca
do not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI,
if eqes are pending.
Hope this clarifies the background of the patch.

Is there a better way to initiate this type of EOI in a non-irq case?

regards Stefan R. and Christoph R.
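
For illustration only, the workaround described in this message can be
modeled in user space as below. This is a rough sketch, not code from the
patch: struct model_eq, model_h_eoi(), model_process_eqes() and
model_poll_eq() are all hypothetical names, and the real H_EOI hypervisor
call and the real ehca event queue are replaced by simple counters. It only
shows the control flow: the timer-driven poll issues an extra EOI when eqes
are pending, and does nothing otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an event queue; not a real ehca structure. */
struct model_eq {
    unsigned int irq;          /* IRQ number obtained when the IRQ was requested */
    unsigned int pending_eqes; /* event queue entries not yet processed */
    bool irq_stuck;            /* adapter did not transition to "interrupt done" */
    unsigned int eoi_count;    /* extra EOIs issued from the poll path */
};

/* Stand-in for the H_EOI hcall (assumption: an EOI clears the stuck state). */
static void model_h_eoi(struct model_eq *eq)
{
    eq->irq_stuck = false;
    eq->eoi_count++;
}

/* Stand-in for draining the queue; returns the number of entries handled. */
static unsigned int model_process_eqes(struct model_eq *eq)
{
    unsigned int handled = eq->pending_eqes;

    eq->pending_eqes = 0;
    return handled;
}

/*
 * Body of the periodic poll timer: only when eqes are pending, trigger
 * an EOI for the queue's IRQ and then process the pending entries.
 */
static unsigned int model_poll_eq(struct model_eq *eq)
{
    assert(eq != NULL);

    if (eq->pending_eqes == 0)
        return 0; /* nothing pending, so no extra EOI is issued */

    model_h_eoi(eq);
    return model_process_eqes(eq);
}
```

In this model, an idle queue never receives an extra EOI, so the workaround
only takes effect in the situation described above.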

2008-06-10 09:21:57

by Paul Mackerras

Subject: Re: [PATCH 0/2] Prevent loss of interrupts in IB/ehca

Stefan Roscher writes:

> Processing events with timer-controlled polling is not the "typical"
> way to handle adapter events.

Do you mean it's not typical in Linux (I would have said it was), or
it's not the way that the firmware architects and implementers thought
the eHEA hcalls would be used?

> During corner case testing, we noticed that some versions of ehca
> do not properly transition to interrupt done in special load situations.
> This can be resolved by periodically triggering EOI through H_EOI,
> if eqes are pending.

So in other words, doing the extra EOIs is a workaround for a firmware
and/or hardware bug, then? If that's the case then you needed to say
that in the commit message.

Given that the ehea driver does hcalls itself directly, and this is a
workaround for a firmware/hardware bug, it may actually be cleaner
just to have the ehea driver do the necessary H_EOI calls directly.

Paul.