2003-09-09 18:30:27

by long

[permalink] [raw]
Subject: MSI fix for buggy PCI/PCI-X hardware

The latest patches that enable MSI in the Linux kernel were
sent out a few weeks ago on lkml. These patches will enable
MSI support if users rebuild the kernel with CONFIG_PCI_MSI
set to 'Y'. Option one (default option) enables MSI on
specific individual PCI/PCI-X MSI-capable devices listed in
the boot parameter "device_msi=". Users may choose option
two, which enables MSI on all PCI/PCI-X MSI-capable devices,
by setting CONFIG_PCI_MSI_ON_SPECIFIC_DEVICES to 'N'.
Depending on if either option one or option two is selected,
the kernel detects and enables MSI on requested PCI/PCI-X
MSI-capable devices.

During testing we have found many currently shipping
PCI/PCI-X MSI-capable devices have silicon bugs specific to
MSI support. Most of these will cause system failures when
MSI is enabled. To filter out MSI buggy hardware requires the
kernel to detect and disable the MSI support on specific hardware.

The proposed solution is to provide a new API, named "int
disable_msi(struct pci_dev *dev)", to allow IHV's who have
shipped PCI/PCI-X hardware that does not work in MSI mode to update
their software drivers to request the kernel to switch the
interrupt mode from MSI mode back to IRQ pin-assertion mode.
There are some reasons to justify this as below:

1) According to the PCI spec 3.0 or latest, the software
driver is prohibited from directly interfacing with its
device to switch MSI mode to IRQ pin-assertion mode.
Consequently, a kernel interface is required since the kernel
is the only entity permitted to change modes.

2) All PCI-Express endpoints require MSI support. These
endpoints should have MSI enabled automatically by default.
For PCI/PCI-X hardware that can't use MSI due to hardware bugs,
it is the IHVs' responsibility to update their software drivers to
request the kernel to disable MSI so the HW can function with MSI
enabled for all other devices.

3) Some PCI/PCI-X MSI-capable hardware devices out there work with
this patch without requiring any changes to their existing software
drivers.

I'd appreciate any feedback you have on this proposed solution.

Thanks,
Long


2003-09-09 18:41:26

by Jeff Garzik

[permalink] [raw]
Subject: Re: MSI fix for buggy PCI/PCI-X hardware

On Tue, Sep 09, 2003 at 08:39:37AM -0700, long wrote:
> The proposed solution is to provide a new API, named "int
> disable_msi(struct pci_dev *dev)", to allow IHV's who have
> shipped PCI/PCI-X hardware that does not work in MSI mode to update
> their software drivers to request the kernel to switch the
> interrupt mode from MSI mode back to IRQ pin-assertion mode.

No need for a new API. We have drivers/pci/quirk.c where we add PCI
devices with known bugs. If there is commonality of the bugs across
hardware, we can easily add a common "quirk" which fixes the situation.

Individual drivers shouldn't need to bother with this, really.

As an aside, if you need to blacklist certain _systems_ rather than
certain PCI devices, you should either modify drivers/pci/quirks.c or
arch/i386/kernel/dmi_scan.c.

Jeff



2003-09-09 22:18:20

by Nakajima, Jun

[permalink] [raw]
Subject: RE: MSI fix for buggy PCI/PCI-X hardware

Jeff, Hi.

How about the default behavior? I'm not a fan of disable_msi(), because
we need to update the driver as we find problems, and we cannot predict
which PCI/PCI-X devices in the world have such a problem, although we
know some will. The workaround in drivers/pci/quirk.c is much better,
compared to modifying the driver, but we still need to update the file
(and rebuild the kernel) as we find problems.

In my opinion, we might want to use drivers/pci/quirk.c to blacklist PCI
Express devices if any (hope not). For PCI/PCI-X devices, we might want
to enable MSI once verified for it. To that end we can also use
drivers/pci/quirk.c to whitelist them (or it's abuse?). That way we can
avoid situations like "it hangs, it does not get interrupts", "disable
ACPI, oh no, MSI".

Thanks,
Jun

> -----Original Message-----
> From: Jeff Garzik [mailto:[email protected]]
> Sent: Tuesday, September 09, 2003 11:41 AM
> To: long
> Cc: [email protected]; [email protected]; Nakajima, Jun;
Nguyen,
> Tom L; [email protected]
> Subject: Re: MSI fix for buggy PCI/PCI-X hardware
>
> On Tue, Sep 09, 2003 at 08:39:37AM -0700, long wrote:
> > The proposed solution is to provide a new API, named "int
> > disable_msi(struct pci_dev *dev)", to allow IHV's who have
> > shipped PCI/PCI-X hardware that does not work in MSI mode to update
> > their software drivers to request the kernel to switch the
> > interrupt mode from MSI mode back to IRQ pin-assertion mode.
>
> No need for a new API. We have drivers/pci/quirk.c where we add PCI
> devices with known bugs. If there is commonality of the bugs across
> hardware, we can easily add a common "quirk" which fixes the
situation.
>
> Individual drivers shouldn't need to bother with this, really.
>
> As an aside, if you need to blacklist certain _systems_ rather than
> certain PCI devices, you should either modify drivers/pci/quirks.c or
> arch/i386/kernel/dmi_scan.c.
>
> Jeff
>
>

2003-09-09 22:53:34

by Jeff Garzik

[permalink] [raw]
Subject: Re: MSI fix for buggy PCI/PCI-X hardware

Nakajima, Jun wrote:
> How about the default behavior? I'm not a fan of disable_msi(), because
> we need to update the driver as we find problems, and we cannot predict
> which PCI/PCI-X devices in the world have such a problem, although we
> know some will. The workaround in drivers/pci/quirk.c is much better,
> compared to modifying the driver, but we still need to update the file
> (and rebuild the kernel) as we find problems.

Agreed.

That's the pain of buggy hardware. The solution is to not produce buggy
hardware ;-) Failing that, it is unavoidable that the kernel would need
to be updated to notice or work around buggy hardware. That's precisely
the reason for quirks/dmi_scan existence: the special cases. Special
cases are never easy or enjoyable to maintain ;-)


> In my opinion, we might want to use drivers/pci/quirk.c to blacklist PCI
> Express devices if any (hope not). For PCI/PCI-X devices, we might want
> to enable MSI once verified for it. To that end we can also use
> drivers/pci/quirk.c to whitelist them (or it's abuse?). That way we can
> avoid situations like "it hangs, it does not get interrupts", "disable
> ACPI, oh no, MSI".


Five points here:

1) If we did that with ACPI, you guys would have only recieved a
_fraction_ of the feedback you received. IMO we want to turn on MSI
(where supported), and see what breaks. It _should_ work, otherwise the
hardware guys wouldn't have put MSI on their PCI device :)

You'll never get feedback and testing if it's turned off by default.

2) MSI is more optimal than standard (should I start calling them
legacy?) x86 interrupts. And I think they're just plain cool. So of
course I will push to default MSI to on! ;-)

3) I think this view is colored by "right now". The current MSI errata
may be worrying you, but... MSI is the future. If you choose to
whitelist, then you're creating a maintenance nightmare for the future.
You would have to qualify _every_ MSI device! Think how much it would
suck if we have to do that with PCI devices today.

Furthermore, a whitelist unfairly punishes working MSI hardware and
perhaps unfairly highlights a few key vendors at the start ;-) This is
why I like blacklists.

Broken hardware is a special case, and not something we should invest a
whole lot of time worrying about. _Assume_ the hardware is working,
then deal with the cases where it isn't. _That_ is the Linus Torvalds
model of an optimal system (IMO :))

4) I have a real-life example: tg3. The BroadCom 57xx chips are
MSI-brain-damaged. So we unconditionally program the hardware in
non-MSI mode. No special APIs needed at all.

5) Another option is to enable MSI only for devices which call
request_msi(). This idea follows the current model of
pci_enable_device(): PCI resources and interrupts are guaranteed to be
assigned and set up only after a successful call to pci_enable_device().
Then, later on, the driver will call request_irq(), which will unmask
the irq (if it's not already shared). Continuing this model, a driver's
call to request_msi() would signal that MSI is to be enabled for that
device.... and ensure that the PCI core does not unconditionally enable
MSI for any device outside of request_msi() call.

Jeff



2003-09-10 01:26:34

by Nakajima, Jun

[permalink] [raw]
Subject: RE: MSI fix for buggy PCI/PCI-X hardware

Thanks for the good points.

> 5) Another option is to enable MSI only for devices which call
> request_msi(). This idea follows the current model of
> pci_enable_device(): PCI resources and interrupts are guaranteed to
be
> assigned and set up only after a successful call to
pci_enable_device().
> Then, later on, the driver will call request_irq(), which will
unmask
> the irq (if it's not already shared). Continuing this model, a
driver's
> call to request_msi() would signal that MSI is to be enabled for that
> device.... and ensure that the PCI core does not unconditionally
enable
> MSI for any device outside of request_msi() call.

In fact, this was something I had in my mind (mine was enable_msi()).
The drawback of this is, of course, (simple) modifications to the driver
are required to get MSI enabled. Since we definitely need more APIs for
MSI-X, I think it's a consistent and clean extension required for MSI.
The benefit of this is that each device driver can have more detailed
control with enabling/disabling MSI for particular devices when
supporting a family of devices in the same binary (compared to
drivers/pci/quirk.c).

Thanks,
Jun

> -----Original Message-----
> From: Jeff Garzik [mailto:[email protected]]
> Sent: Tuesday, September 09, 2003 3:52 PM
> To: Nakajima, Jun
> Cc: long; [email protected]; [email protected]; Nguyen, Tom L;
> [email protected]
> Subject: Re: MSI fix for buggy PCI/PCI-X hardware
>
> Nakajima, Jun wrote:
> > How about the default behavior? I'm not a fan of disable_msi(),
because
> > we need to update the driver as we find problems, and we cannot
predict
> > which PCI/PCI-X devices in the world have such a problem, although
we
> > know some will. The workaround in drivers/pci/quirk.c is much
better,
> > compared to modifying the driver, but we still need to update the
file
> > (and rebuild the kernel) as we find problems.
>
> Agreed.
>
> That's the pain of buggy hardware. The solution is to not produce
buggy
> hardware ;-) Failing that, it is unavoidable that the kernel would
need
> to be updated to notice or work around buggy hardware. That's
precisely
> the reason for quirks/dmi_scan existence: the special cases. Special
> cases are never easy or enjoyable to maintain ;-)
>
>
> > In my opinion, we might want to use drivers/pci/quirk.c to blacklist
PCI
> > Express devices if any (hope not). For PCI/PCI-X devices, we might
want
> > to enable MSI once verified for it. To that end we can also use
> > drivers/pci/quirk.c to whitelist them (or it's abuse?). That way we
can
> > avoid situations like "it hangs, it does not get interrupts",
"disable
> > ACPI, oh no, MSI".
>
>
> Five points here:
>
> 1) If we did that with ACPI, you guys would have only recieved a
> _fraction_ of the feedback you received. IMO we want to turn on MSI
> (where supported), and see what breaks. It _should_ work, otherwise
the
> hardware guys wouldn't have put MSI on their PCI device :)
>
> You'll never get feedback and testing if it's turned off by default.
>
> 2) MSI is more optimal than standard (should I start calling them
> legacy?) x86 interrupts. And I think they're just plain cool. So of
> course I will push to default MSI to on! ;-)
>
> 3) I think this view is colored by "right now". The current MSI
errata
> may be worrying you, but... MSI is the future. If you choose to
> whitelist, then you're creating a maintenance nightmare for the
future.
> You would have to qualify _every_ MSI device! Think how much it
would
> suck if we have to do that with PCI devices today.
>
> Furthermore, a whitelist unfairly punishes working MSI hardware and
> perhaps unfairly highlights a few key vendors at the start ;-) This
is
> why I like blacklists.
>
> Broken hardware is a special case, and not something we should invest
a
> whole lot of time worrying about. _Assume_ the hardware is working,
> then deal with the cases where it isn't. _That_ is the Linus Torvalds
> model of an optimal system (IMO :))
>
> 4) I have a real-life example: tg3. The BroadCom 57xx chips are
> MSI-brain-damaged. So we unconditionally program the hardware in
> non-MSI mode. No special APIs needed at all.
>
> 5) Another option is to enable MSI only for devices which call
> request_msi(). This idea follows the current model of
> pci_enable_device(): PCI resources and interrupts are guaranteed to
be
> assigned and set up only after a successful call to
pci_enable_device().
> Then, later on, the driver will call request_irq(), which will
unmask
> the irq (if it's not already shared). Continuing this model, a
driver's
> call to request_msi() would signal that MSI is to be enabled for that
> device.... and ensure that the PCI core does not unconditionally
enable
> MSI for any device outside of request_msi() call.
>
> Jeff
>
>

2003-09-10 21:32:40

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: MSI fix for buggy PCI/PCI-X hardware

On Tue, 9 Sep 2003, Jeff Garzik wrote:

> 5) Another option is to enable MSI only for devices which call
> request_msi(). This idea follows the current model of
> pci_enable_device(): PCI resources and interrupts are guaranteed to be
> assigned and set up only after a successful call to pci_enable_device().
> Then, later on, the driver will call request_irq(), which will unmask
> the irq (if it's not already shared). Continuing this model, a driver's
> call to request_msi() would signal that MSI is to be enabled for that
> device.... and ensure that the PCI core does not unconditionally enable
> MSI for any device outside of request_msi() call.

Jun, i'm of the opinion that we can't be doing this in the kernel. The
bugs are device specific so putting blacklists in the core kernel
code sounds a bit wrong. What i would prefer is if this is done on a driver
level via a flag (e.g. say it's a buggy batch upto a given pci revision).
I don't believe we'd be "violating" the PCI 3.0 mandate if we only check
revisions when making a decision on driving a device via the irq line or
msi. So i'm actually agreeing with the request_msi/request_irq pair and
using that as the interface a driver writer uses to enable the respective
modes. I know it's kind of cheating and normally the communication path
when doing this query should be the kernel informing the driver of
capability and the driver working with the specified mode. But driver
informing the kernel of specific preferred mode for a device should save
some headaches. Workable?

Btw have there been some updates? I want to have another look.

Thanks,
Zwane