2009-01-23 23:25:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 0/2] PCI/PCIe port driver: Fix allocation of interrupts

Hi,

The following two patches fix the allocation of interrupts in the PCIe port
driver, which is currently broken with respect to MSI-X.

The first patch introduces a new function returning the size of a device's
MSI-X table, which is used in the second patch.

The second one is the actual fix.

Please consider for applying.

Thanks,
Rafael


2009-01-23 23:24:51

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 1/2] PCI/MSI: Introduce pci_msix_table_size()

From: Rafael J. Wysocki <[email protected]>

Introduce new function pci_msix_table_size() returning the size of
the MSI-X table of given PCI device or 0 if the device doesn't
support MSI-X.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Hidetoshi Seto <[email protected]>
---
drivers/pci/msi.c | 24 +++++++++++++++++++-----
include/linux/pci.h | 5 +++++
2 files changed, 24 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/pci/msi.c
===================================================================
--- linux-2.6.orig/drivers/pci/msi.c
+++ linux-2.6/drivers/pci/msi.c
@@ -670,6 +670,23 @@ static int msi_free_irqs(struct pci_dev*
}

/**
+ * pci_msix_table_size - return the number of device's MSI-X table entries
+ * @dev: pointer to the pci_dev data structure of MSI-X device function
+ */
+int pci_msix_table_size(struct pci_dev *dev)
+{
+ int pos;
+ u16 control;
+
+ pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
+ if (!pos)
+ return 0;
+
+ pci_read_config_word(dev, msi_control_reg(pos), &control);
+ return multi_msix_capable(control);
+}
+
+/**
* pci_enable_msix - configure device's MSI-X capability structure
* @dev: pointer to the pci_dev data structure of MSI-X device function
* @entries: pointer to an array of MSI-X entries
@@ -686,9 +703,8 @@ static int msi_free_irqs(struct pci_dev*
**/
int pci_enable_msix(struct pci_dev* dev, struct msix_entry *entries, int nvec)
{
- int status, pos, nr_entries;
+ int status, nr_entries;
int i, j;
- u16 control;

if (!entries)
return -EINVAL;
@@ -697,9 +713,7 @@ int pci_enable_msix(struct pci_dev* dev,
if (status)
return status;

- pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
- pci_read_config_word(dev, msi_control_reg(pos), &control);
- nr_entries = multi_msix_capable(control);
+ nr_entries = pci_msix_table_size(dev);
if (nvec > nr_entries)
return -EINVAL;

Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -799,6 +799,10 @@ static inline void pci_msi_shutdown(stru
static inline void pci_disable_msi(struct pci_dev *dev)
{ }

+static inline int pci_msix_table_size(struct pci_dev *dev)
+{
+ return 0;
+}
static inline int pci_enable_msix(struct pci_dev *dev,
struct msix_entry *entries, int nvec)
{
@@ -823,6 +827,7 @@ static inline int pci_msi_enabled(void)
extern int pci_enable_msi(struct pci_dev *dev);
extern void pci_msi_shutdown(struct pci_dev *dev);
extern void pci_disable_msi(struct pci_dev *dev);
+extern int pci_msix_table_size(struct pci_dev *dev);
extern int pci_enable_msix(struct pci_dev *dev,
struct msix_entry *entries, int nvec);
extern void pci_msix_shutdown(struct pci_dev *dev);

2009-01-23 23:25:11

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 2/2] PCI/PCIe portdrv: Fix allocation of interrupts (rev. 6)

From: Rafael J. Wysocki <[email protected]>

If MSI-X interrupt mode is used by the PCI Express port driver, too
many vectors are allocated and it is not ensured that the right
vectors will be used for the right services. Namely, the PCI Express
specification states that both PCI Express native PME and PCI Express
hotplug will always use the same MSI or MSI-X message for signalling
interrupts, which implies that the same vector will be used by both
of them. Also, the VC service does not use interrupts at all.
Moreover, is not clear which of the vectors allocated by
pci_enable_msix() in the current code will be used for PME and
hotplug and which of them will be used for AER if all of these
services are configured.

For these reasons, rework the allocation of interrupts for PCI
Express ports so that if MSI-X are enabled, the right vectors will be
used for the right purposes.

Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Hidetoshi Seto <[email protected]>
---
drivers/pci/pcie/portdrv.h | 6 +
drivers/pci/pcie/portdrv_core.c | 204 ++++++++++++++++++++++++++++++++--------
include/linux/pcieport_if.h | 12 +-
3 files changed, 180 insertions(+), 42 deletions(-)

Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -31,6 +31,152 @@ static void release_pcie_device(struct d
}

/**
+ * pcie_port_msix_add_entry - add entry to given array of MSI-X entries
+ * @entries: Array of MSI-X entries
+ * @new_entry: Index of the entry to add to the array
+ * @nr_entries: Number of entries aleady in the array
+ *
+ * Return value: Position of the added entry in the array
+ */
+static int pcie_port_msix_add_entry(
+ struct msix_entry *entries, int new_entry, int nr_entries)
+{
+ int j;
+
+ for (j = 0; j < nr_entries; j++)
+ if (entries[j].entry == new_entry)
+ return j;
+
+ entries[j].entry = new_entry;
+ return j;
+}
+
+/**
+ * pcie_port_enable_msix - try to set up MSI-X as interrupt mode for given port
+ * @dev: PCI Express port to handle
+ * @vectors: Array of interrupt vectors to populate
+ * @mask: Bitmask of port capabilities returned by get_port_device_capability()
+ *
+ * Return value: 0 on success, error code on failure
+ */
+static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
+{
+ struct msix_entry *msix_entries;
+ int idx[PCIE_PORT_DEVICE_MAXSERVICES];
+ int nr_entries, status, pos, i, nvec;
+ u16 reg16;
+ u32 reg32;
+
+ nr_entries = pci_msix_table_size(dev);
+ if (!nr_entries)
+ return -EINVAL;
+ if (nr_entries > PCIE_PORT_MAX_MSIX_ENTRIES)
+ nr_entries = PCIE_PORT_MAX_MSIX_ENTRIES;
+
+ msix_entries = kzalloc(sizeof(*msix_entries) * nr_entries, GFP_KERNEL);
+ if (!msix_entries)
+ return -ENOMEM;
+
+ /*
+ * Allocate as many entries as the port wants, so that we can check
+ * which of them will be useful. Moreover, if nr_entries is correctly
+ * equal to the number of entries this port actually uses, we'll happily
+ * go through without any tricks.
+ */
+ for (i = 0; i < nr_entries; i++)
+ msix_entries[i].entry = i;
+
+ status = pci_enable_msix(dev, msix_entries, nr_entries);
+ if (status)
+ goto Exit;
+
+ for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+ idx[i] = -1;
+ status = -EIO;
+ nvec = 0;
+
+ if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP)) {
+ int entry;
+
+ /*
+ * The code below follows the PCI Express Base Specification 2.0
+ * stating in Section 6.1.6 that "PME and Hot-Plug Event
+ * interrupts (when both are implemented) always share the same
+ * MSI or MSI-X vector, as indicated by the Interrupt Message
+ * Number field in the PCI Express Capabilities register", where
+ * according to Section 7.8.2 of the specification "For MSI-X,
+ * the value in this field indicates which MSI-X Table entry is
+ * used to generate the interrupt message."
+ */
+ pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+ pci_read_config_word(dev, pos + PCIE_CAPABILITIES_REG, &reg16);
+ entry = (reg16 >> 9) & PCIE_PORT_MSI_VECTOR_MASK;
+ if (entry >= nr_entries)
+ goto Error;
+
+ i = pcie_port_msix_add_entry(msix_entries, entry, nvec);
+ if (i == nvec)
+ nvec++;
+
+ idx[PCIE_PORT_SERVICE_PME_SHIFT] = i;
+ idx[PCIE_PORT_SERVICE_HP_SHIFT] = i;
+ }
+
+ if (mask & PCIE_PORT_SERVICE_AER) {
+ int entry;
+
+ /*
+ * The code below follows Section 7.10.10 of the PCI Express
+ * Base Specification 2.0 stating that bits 31-27 of the Root
+ * Error Status Register contain a value indicating which of the
+ * MSI/MSI-X vectors assigned to the port is going to be used
+ * for AER, where "For MSI-X, the value in this register
+ * indicates which MSI-X Table entry is used to generate the
+ * interrupt message."
+ */
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
+ pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+ entry = reg32 >> 27;
+ if (entry >= nr_entries)
+ goto Error;
+
+ i = pcie_port_msix_add_entry(msix_entries, entry, nvec);
+ if (i == nvec)
+ nvec++;
+
+ idx[PCIE_PORT_SERVICE_AER_SHIFT] = i;
+ }
+
+ /*
+ * If nvec is equal to the allocated number of entries, we can just use
+ * what we have. Otherwise, the port has some extra entries not for the
+ * services we know and we need to work around that.
+ */
+ if (nvec == nr_entries) {
+ status = 0;
+ } else {
+ /* Drop the temporary MSI-X setup */
+ pci_disable_msix(dev);
+
+ /* Now allocate the MSI-X vectors for real */
+ status = pci_enable_msix(dev, msix_entries, nvec);
+ if (status)
+ goto Exit;
+ }
+
+ for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+ vectors[i] = idx[i] >= 0 ? msix_entries[idx[i]].vector : -1;
+
+ Exit:
+ kfree(msix_entries);
+ return status;
+
+ Error:
+ pci_disable_msix(dev);
+ goto Exit;
+}
+
+/**
* assign_interrupt_mode - choose interrupt mode for PCI Express port services
* (INTx, MSI-X, MSI) and set up vectors
* @dev: PCI Express port to handle
@@ -42,49 +188,31 @@ static void release_pcie_device(struct d
static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int mask)
{
struct pcie_port_data *port_data = pci_get_drvdata(dev);
- int i, pos, nvec, status = -EINVAL;
- int interrupt_mode = PCIE_PORT_NO_IRQ;
-
- /* Set INTx as default */
- for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
- if (mask & (1 << i))
- nvec++;
- vectors[i] = dev->irq;
- }
- if (dev->pin)
- interrupt_mode = PCIE_PORT_INTx_MODE;
+ int irq, interrupt_mode = PCIE_PORT_NO_IRQ;
+ int i;

/* Check MSI quirk */
if (port_data->port_type == PCIE_RC_PORT && pcie_mch_quirk)
- return interrupt_mode;
+ goto Fallback;

- /* Select MSI-X over MSI if supported */
- pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
- if (pos) {
- struct msix_entry msix_entries[PCIE_PORT_DEVICE_MAXSERVICES] =
- {{0, 0}, {0, 1}, {0, 2}, {0, 3}};
- status = pci_enable_msix(dev, msix_entries, nvec);
- if (!status) {
- int j = 0;
+ /* Try to use MSI-X if supported */
+ if (!pcie_port_enable_msix(dev, vectors, mask))
+ return PCIE_PORT_MSIX_MODE;
+
+ /* We're not going to use MSI-X, so try MSI and fall back to INTx */
+ if (!pci_enable_msi(dev))
+ interrupt_mode = PCIE_PORT_MSI_MODE;
+
+ Fallback:
+ if (interrupt_mode == PCIE_PORT_NO_IRQ && dev->pin)
+ interrupt_mode = PCIE_PORT_INTx_MODE;
+
+ irq = interrupt_mode != PCIE_PORT_NO_IRQ ? dev->irq : -1;
+ for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+ vectors[i] = irq;
+
+ vectors[PCIE_PORT_SERVICE_VC_SHIFT] = -1;

- interrupt_mode = PCIE_PORT_MSIX_MODE;
- for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
- if (mask & (1 << i))
- vectors[i] = msix_entries[j++].vector;
- }
- }
- }
- if (status) {
- pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
- if (pos) {
- status = pci_enable_msi(dev);
- if (!status) {
- interrupt_mode = PCIE_PORT_MSI_MODE;
- for (i = 0;i < PCIE_PORT_DEVICE_MAXSERVICES;i++)
- vectors[i] = dev->irq;
- }
- }
- }
return interrupt_mode;
}

Index: linux-2.6/include/linux/pcieport_if.h
===================================================================
--- linux-2.6.orig/include/linux/pcieport_if.h
+++ linux-2.6/include/linux/pcieport_if.h
@@ -16,10 +16,14 @@
#define PCIE_ANY_PORT 7

/* Service Type */
-#define PCIE_PORT_SERVICE_PME 1 /* Power Management Event */
-#define PCIE_PORT_SERVICE_AER 2 /* Advanced Error Reporting */
-#define PCIE_PORT_SERVICE_HP 4 /* Native Hotplug */
-#define PCIE_PORT_SERVICE_VC 8 /* Virtual Channel */
+#define PCIE_PORT_SERVICE_PME_SHIFT 0 /* Power Management Event */
+#define PCIE_PORT_SERVICE_PME (1 << PCIE_PORT_SERVICE_PME_SHIFT)
+#define PCIE_PORT_SERVICE_AER_SHIFT 1 /* Advanced Error Reporting */
+#define PCIE_PORT_SERVICE_AER (1 << PCIE_PORT_SERVICE_AER_SHIFT)
+#define PCIE_PORT_SERVICE_HP_SHIFT 2 /* Native Hotplug */
+#define PCIE_PORT_SERVICE_HP (1 << PCIE_PORT_SERVICE_HP_SHIFT)
+#define PCIE_PORT_SERVICE_VC_SHIFT 3 /* Virtual Channel */
+#define PCIE_PORT_SERVICE_VC (1 << PCIE_PORT_SERVICE_VC_SHIFT)

/* Root/Upstream/Downstream Port's Interrupt Mode */
#define PCIE_PORT_NO_IRQ (-1)
Index: linux-2.6/drivers/pci/pcie/portdrv.h
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv.h
+++ linux-2.6/drivers/pci/pcie/portdrv.h
@@ -25,6 +25,12 @@
#define PCIE_CAPABILITIES_REG 0x2
#define PCIE_SLOT_CAPABILITIES_REG 0x14
#define PCIE_PORT_DEVICE_MAXSERVICES 4
+#define PCIE_PORT_MSI_VECTOR_MASK 0x1f
+/*
+ * According to the PCI Express Base Specification 2.0, the indices of the MSI-X
+ * table entires used by port services must not exceed 31
+ */
+#define PCIE_PORT_MAX_MSIX_ENTRIES 32

#define get_descriptor_id(type, service) (((type - 4) << 4) | service)

2009-01-25 05:30:15

by Grant Grundler

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI/MSI: Introduce pci_msix_table_size()

On Sat, Jan 24, 2009 at 12:21:14AM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Introduce new function pci_msix_table_size() returning the size of
> the MSI-X table of given PCI device or 0 if the device doesn't
> support MSI-X.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> Reviewed-by: Hidetoshi Seto <[email protected]>
> ---
> drivers/pci/msi.c | 24 +++++++++++++++++++-----
> include/linux/pci.h | 5 +++++
> 2 files changed, 24 insertions(+), 5 deletions(-)
>
> Index: linux-2.6/drivers/pci/msi.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/msi.c
> +++ linux-2.6/drivers/pci/msi.c
> @@ -670,6 +670,23 @@ static int msi_free_irqs(struct pci_dev*
> }
>
> /**
> + * pci_msix_table_size - return the number of device's MSI-X table entries
> + * @dev: pointer to the pci_dev data structure of MSI-X device function
> + */
> +int pci_msix_table_size(struct pci_dev *dev)

Is this intended to be used by drivers?

If so, can you please add (a) corresponding documentation in
either the MSI or pci.txt files and (b) EXPORT_SYMBOL_GPL?

If not, can this declaration be static?
(I haven't yet looked at the second patch...maybe this is obvious once
I do.)

thanks,
grant

> +{
> + int pos;
> + u16 control;
> +
> + pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> + if (!pos)
> + return 0;
> +
> + pci_read_config_word(dev, msi_control_reg(pos), &control);
> + return multi_msix_capable(control);
> +}
> +
> +/**
> * pci_enable_msix - configure device's MSI-X capability structure
> * @dev: pointer to the pci_dev data structure of MSI-X device function
> * @entries: pointer to an array of MSI-X entries
> @@ -686,9 +703,8 @@ static int msi_free_irqs(struct pci_dev*
> **/
> int pci_enable_msix(struct pci_dev* dev, struct msix_entry *entries, int nvec)
> {
> - int status, pos, nr_entries;
> + int status, nr_entries;
> int i, j;
> - u16 control;
>
> if (!entries)
> return -EINVAL;
> @@ -697,9 +713,7 @@ int pci_enable_msix(struct pci_dev* dev,
> if (status)
> return status;
>
> - pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> - pci_read_config_word(dev, msi_control_reg(pos), &control);
> - nr_entries = multi_msix_capable(control);
> + nr_entries = pci_msix_table_size(dev);
> if (nvec > nr_entries)
> return -EINVAL;
>
> Index: linux-2.6/include/linux/pci.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pci.h
> +++ linux-2.6/include/linux/pci.h
> @@ -799,6 +799,10 @@ static inline void pci_msi_shutdown(stru
> static inline void pci_disable_msi(struct pci_dev *dev)
> { }
>
> +static inline int pci_msix_table_size(struct pci_dev *dev)
> +{
> + return 0;
> +}
> static inline int pci_enable_msix(struct pci_dev *dev,
> struct msix_entry *entries, int nvec)
> {
> @@ -823,6 +827,7 @@ static inline int pci_msi_enabled(void)
> extern int pci_enable_msi(struct pci_dev *dev);
> extern void pci_msi_shutdown(struct pci_dev *dev);
> extern void pci_disable_msi(struct pci_dev *dev);
> +extern int pci_msix_table_size(struct pci_dev *dev);
> extern int pci_enable_msix(struct pci_dev *dev,
> struct msix_entry *entries, int nvec);
> extern void pci_msix_shutdown(struct pci_dev *dev);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2009-01-25 05:36:16

by Grant Grundler

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI/MSI: Introduce pci_msix_table_size()

On Sat, Jan 24, 2009 at 10:29:47PM -0700, Grant Grundler wrote:
> Is this intended to be used by drivers?

The answer depends on if "PCI Express port driver" can be loaded
as a module.

> If so, can you please add (a) corresponding documentation in
> either the MSI or pci.txt files and (b) EXPORT_SYMBOL_GPL?
>
> If not, can this declaration be static?
> (I haven't yet looked at the second patch...maybe this is obvious once
> I do.)

The answer seems to be no, it can't be static.

thanks,
grant

2009-01-25 19:02:19

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI/MSI: Introduce pci_msix_table_size()

On Sunday 25 January 2009, Grant Grundler wrote:
> On Sat, Jan 24, 2009 at 10:29:47PM -0700, Grant Grundler wrote:
> > Is this intended to be used by drivers?
>
> The answer depends on if "PCI Express port driver" can be loaded
> as a module.

No, it's not, AFAICS.

Thanks,
Rafael

2009-01-27 18:57:40

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI/MSI: Introduce pci_msix_table_size()

On Friday, January 23, 2009 3:21 pm Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Introduce new function pci_msix_table_size() returning the size of
> the MSI-X table of given PCI device or 0 if the device doesn't
> support MSI-X.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> Reviewed-by: Hidetoshi Seto <[email protected]>
> ---
> drivers/pci/msi.c | 24 +++++++++++++++++++-----
> include/linux/pci.h | 5 +++++
> 2 files changed, 24 insertions(+), 5 deletions(-)

Ok, applied these two to my linux-next branch. Thanks guys.

--
Jesse Barnes, Intel Open Source Technology Center