Hi,
This series of patches contains some fixes and cleanups of the PCI Express
port driver. Apart from bug fixes the patches remove unnecessary things from
the driver.
The patches are on top of the other series I sent a couple of days ago
(http://marc.info/?l=linux-pci&m=123083608411848&w=4,
http://marc.info/?l=linux-pci&m=123083608311844&w=4,
http://marc.info/?l=linux-pci&m=123083608711856&w=4,
http://marc.info/?l=linux-pci&m=123083608311840&w=4).
Please tell me what you think.
Thanks,
Rafael
From: Rafael J. Wysocki <[email protected]>
There is struct pcie_port_device_ext defined in portdrv.h, but its
member is only initialized and never used. Remove it.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv.h | 4 ----
drivers/pci/pcie/portdrv_core.c | 8 --------
2 files changed, 12 deletions(-)
Index: linux-2.6/drivers/pci/pcie/portdrv.h
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv.h
+++ linux-2.6/drivers/pci/pcie/portdrv.h
@@ -28,10 +28,6 @@
#define get_descriptor_id(type, service) (((type - 4) << 4) | service)
-struct pcie_port_device_ext {
- int interrupt_mode; /* [0:INTx | 1:MSI | 2:MSI-X] */
-};
-
extern struct bus_type pcie_port_bus_type;
extern int pcie_port_device_probe(struct pci_dev *dev);
extern int pcie_port_device_register(struct pci_dev *dev);
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -204,17 +204,10 @@ int pcie_port_device_probe(struct pci_de
*/
int pcie_port_device_register(struct pci_dev *dev)
{
- struct pcie_port_device_ext *p_ext;
int status, type, capabilities, irq_mode, i;
int vectors[PCIE_PORT_DEVICE_MAXSERVICES];
u16 reg16;
- /* Allocate port device extension */
- if (!(p_ext = kmalloc(sizeof(struct pcie_port_device_ext), GFP_KERNEL)))
- return -ENOMEM;
-
- pci_set_drvdata(dev, p_ext);
-
/* Get port type */
pci_read_config_word(dev,
pci_find_capability(dev, PCI_CAP_ID_EXP) +
@@ -224,7 +217,6 @@ int pcie_port_device_register(struct pci
/* Now get port services */
capabilities = get_port_device_capability(dev);
irq_mode = assign_interrupt_mode(dev, vectors, capabilities);
- p_ext->interrupt_mode = irq_mode;
/* Allocate child services if any */
for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
From: Rafael J. Wysocki <[email protected]>
The PCI Express port driver should not attempt to register service
devices that require the ability to generate interrupts if generating
interrupts is not possible. Namely, if the port has no interrupt pin
configured and we cannot set up MSI or MSI-X for it, there is no way
it can generate interrupts and in such a case the port services that
rely on interrupts (PME, PCIe HP, AER) should not be enabled for it.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv_core.c | 40 +++++++++++++++++++++++++++-------------
include/linux/pcieport_if.h | 1 +
2 files changed, 28 insertions(+), 13 deletions(-)
Index: linux-2.6/include/linux/pcieport_if.h
===================================================================
--- linux-2.6.orig/include/linux/pcieport_if.h
+++ linux-2.6/include/linux/pcieport_if.h
@@ -22,6 +22,7 @@
#define PCIE_PORT_SERVICE_VC 8 /* Virtual Channel */
/* Root/Upstream/Downstream Port's Interrupt Mode */
+#define PCIE_PORT_NO_IRQ (-1)
#define PCIE_PORT_INTx_MODE 0
#define PCIE_PORT_MSI_MODE 1
#define PCIE_PORT_MSIX_MODE 2
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -41,13 +41,15 @@ static void release_pcie_device(struct d
static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int mask)
{
int i, pos, nvec, status = -EINVAL;
- int interrupt_mode = PCIE_PORT_INTx_MODE;
+ int interrupt_mode = PCIE_PORT_NO_IRQ;
/* Set INTx as default */
for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
if (mask & (1 << i))
nvec++;
vectors[i] = dev->irq;
+ if (dev->pin)
+ interrupt_mode = PCIE_PORT_INTx_MODE;
}
/* Select MSI-X over MSI if supported */
@@ -137,7 +139,7 @@ static void pcie_device_init(struct pci_
dev->id.vendor = parent->vendor;
dev->id.device = parent->device;
dev->id.port_type = port_type;
- dev->id.service_type = (1 << service_type);
+ dev->id.service_type = service_type;
/* Initialize generic device interface */
device = &dev->device;
@@ -227,23 +229,35 @@ int pcie_port_device_register(struct pci
/* Allocate child services if any */
for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
struct pcie_device *child;
+ int service = 1 << i;
+
+ if (!(capabilities & service))
+ continue;
- if (capabilities & (1 << i)) {
- child = alloc_pcie_device(
+ /*
+ * Don't use service devices that require interrupts if there is
+ * no way to generate them.
+ */
+ if (irq_mode == PCIE_PORT_NO_IRQ
+ && service != PCIE_PORT_SERVICE_VC)
+ continue;
+
+ child = alloc_pcie_device(
dev, /* parent */
type, /* port type */
- i, /* service type */
+ service, /* service type */
vectors[i], /* irq */
irq_mode /* interrupt mode */);
- if (child) {
- status = device_register(&child->device);
- if (status) {
- kfree(child);
- continue;
- }
- get_device(&child->device);
- }
+ if (!child)
+ continue;
+
+ status = device_register(&child->device);
+ if (status) {
+ kfree(child);
+ continue;
}
+
+ get_device(&child->device);
}
return 0;
}
From: Rafael J. Wysocki <[email protected]>
The PCI Express port driver calls pci_enable_device() before setting
up interrupts, which is wrong, because if there is an interrupt pin
configured for the port, pci_enable_device() will likely set up an
interrupt link for it. However, this shouldn't be done if either
MSI or MSI-X interrupt mode is chosen for the port.
The solution is to call pci_enable_device() after setting up
interrupts, because in that case the interrupt link won't be set up
if MSI or MSI-X are enabled.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv_core.c | 18 ++++++++++++++++--
drivers/pci/pcie/portdrv_pci.c | 11 +++--------
2 files changed, 19 insertions(+), 10 deletions(-)
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -204,7 +204,7 @@ int pcie_port_device_probe(struct pci_de
*/
int pcie_port_device_register(struct pci_dev *dev)
{
- int status, type, capabilities, irq_mode, i;
+ int status, type, capabilities, irq_mode, i, nr_serv;
int vectors[PCIE_PORT_DEVICE_MAXSERVICES];
u16 reg16;
@@ -217,9 +217,17 @@ int pcie_port_device_register(struct pci
/* Now get port services */
capabilities = get_port_device_capability(dev);
irq_mode = assign_interrupt_mode(dev, vectors, capabilities);
+ if (irq_mode == PCIE_PORT_NO_IRQ
+ && !(capabilities & PCIE_PORT_SERVICE_VC))
+ return -ENODEV;
+
+ status = pci_enable_device(dev);
+ if (status)
+ return status;
+ pci_set_master(dev);
/* Allocate child services if any */
- for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
+ for (i = 0, nr_serv = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
struct pcie_device *child;
int service = 1 << i;
@@ -250,7 +258,13 @@ int pcie_port_device_register(struct pci
}
get_device(&child->device);
+ nr_serv++;
}
+ if (!nr_serv) {
+ pci_disable_device(dev);
+ return -ENODEV;
+ }
+
return 0;
}
Index: linux-2.6/drivers/pci/pcie/portdrv_pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_pci.c
+++ linux-2.6/drivers/pci/pcie/portdrv_pci.c
@@ -94,18 +94,13 @@ static int __devinit pcie_portdrv_probe
if (status)
return status;
- if (pci_enable_device(dev) < 0)
- return -ENODEV;
-
- pci_set_master(dev);
if (!dev->irq && dev->pin) {
dev_warn(&dev->dev, "device [%04x:%04x] has invalid IRQ; "
"check vendor BIOS\n", dev->vendor, dev->device);
}
- if (pcie_port_device_register(dev)) {
- pci_disable_device(dev);
- return -ENOMEM;
- }
+ status = pcie_port_device_register(dev);
+ if (status)
+ return status;
pcie_portdrv_save_config(dev);
From: Rafael J. Wysocki <[email protected]>
The PCI Express port driver contains a quirk that prevents root
ports from using MSI, but there is no explanation for it and the
PCI Express specification clearly allows root ports to use MSI.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv_core.c | 28 ----------------------------
1 file changed, 28 deletions(-)
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -17,8 +17,6 @@
#include "portdrv.h"
-extern int pcie_mch_quirk; /* MSI-quirk Indicator */
-
/**
* release_pcie_device - free PCI Express port service device structure
* @dev: Port service device to release
@@ -31,28 +29,6 @@ static void release_pcie_device(struct d
kfree(to_pcie_device(dev));
}
-static int is_msi_quirked(struct pci_dev *dev)
-{
- int port_type, quirk = 0;
- u16 reg16;
-
- pci_read_config_word(dev,
- pci_find_capability(dev, PCI_CAP_ID_EXP) +
- PCIE_CAPABILITIES_REG, ®16);
- port_type = (reg16 >> 4) & PORT_TYPE_MASK;
- switch(port_type) {
- case PCIE_RC_PORT:
- if (pcie_mch_quirk == 1)
- quirk = 1;
- break;
- case PCIE_SW_UPSTREAM_PORT:
- case PCIE_SW_DOWNSTREAM_PORT:
- default:
- break;
- }
- return quirk;
-}
-
/**
* assign_interrupt_mode - choose interrupt mode for PCI Express port services
* (INTx, MSI-X, MSI) and set up vectors
@@ -73,10 +49,6 @@ static int assign_interrupt_mode(struct
nvec++;
vectors[i] = dev->irq;
}
-
- /* Check MSI quirk */
- if (is_msi_quirked(dev))
- return interrupt_mode;
/* Select MSI-X over MSI if supported */
pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
From: Rafael J. Wysocki <[email protected]>
The function pcie_portdrv_save_config() in portdrv_pci.c is not
necessary. Remove it.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv_pci.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
Index: linux-2.6/drivers/pci/pcie/portdrv_pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_pci.c
+++ linux-2.6/drivers/pci/pcie/portdrv_pci.c
@@ -32,11 +32,6 @@ MODULE_LICENSE("GPL");
/* global data */
static const char device_name[] = "pcieport-driver";
-static int pcie_portdrv_save_config(struct pci_dev *dev)
-{
- return pci_save_state(dev);
-}
-
static int pcie_portdrv_restore_config(struct pci_dev *dev)
{
int retval;
@@ -102,7 +97,7 @@ static int __devinit pcie_portdrv_probe
if (status)
return status;
- pcie_portdrv_save_config(dev);
+ pci_save_state(dev);
pci_enable_pcie_error_reporting(dev);
From: Rafael J. Wysocki <[email protected]>
The second argument of the ->probe() callback in
struct pcie_port_service_driver is unnecessary and never used.
Remove it.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/pciehp_core.c | 2 +-
drivers/pci/pcie/aer/aerdrv.c | 6 ++----
drivers/pci/pcie/portdrv_core.c | 2 +-
include/linux/pcieport_if.h | 3 +--
4 files changed, 5 insertions(+), 8 deletions(-)
Index: linux-2.6/include/linux/pcieport_if.h
===================================================================
--- linux-2.6.orig/include/linux/pcieport_if.h
+++ linux-2.6/include/linux/pcieport_if.h
@@ -57,8 +57,7 @@ static inline void* get_service_data(str
struct pcie_port_service_driver {
const char *name;
- int (*probe) (struct pcie_device *dev,
- const struct pcie_port_service_id *id);
+ int (*probe) (struct pcie_device *dev);
void (*remove) (struct pcie_device *dev);
int (*suspend) (struct pcie_device *dev, pm_message_t state);
int (*resume) (struct pcie_device *dev);
Index: linux-2.6/drivers/pci/pcie/aer/aerdrv.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/aer/aerdrv.c
+++ linux-2.6/drivers/pci/pcie/aer/aerdrv.c
@@ -38,8 +38,7 @@ MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
MODULE_LICENSE("GPL");
-static int __devinit aer_probe (struct pcie_device *dev,
- const struct pcie_port_service_id *id );
+static int __devinit aer_probe (struct pcie_device *dev);
static void aer_remove(struct pcie_device *dev);
static int aer_suspend(struct pcie_device *dev, pm_message_t state)
{return 0;}
@@ -207,8 +206,7 @@ static void aer_remove(struct pcie_devic
*
* Invoked when PCI Express bus loads AER service driver.
**/
-static int __devinit aer_probe (struct pcie_device *dev,
- const struct pcie_port_service_id *id )
+static int __devinit aer_probe (struct pcie_device *dev)
{
int status;
struct aer_rpc *rpc;
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -461,7 +461,7 @@ static int pcie_port_probe_service(struc
return -ENODEV;
pciedev = to_pcie_device(dev);
- status = driver->probe(pciedev, driver->id_table);
+ status = driver->probe(pciedev);
if (!status) {
dev_printk(KERN_DEBUG, dev, "service driver %s loaded\n",
driver->name);
Index: linux-2.6/drivers/pci/hotplug/pciehp_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_core.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_core.c
@@ -399,7 +399,7 @@ static int get_cur_bus_speed(struct hotp
return 0;
}
-static int pciehp_probe(struct pcie_device *dev, const struct pcie_port_service_id *id)
+static int pciehp_probe(struct pcie_device *dev)
{
int rc;
struct controller *ctrl;
From: Rafael J. Wysocki <[email protected]>
The PCI Express port driver uses 'struct pcie_port_service_id' for
matching port service devices and drivers, but this structure
contains fields that duplicate information from the port device
itself (vendor, device, subvendor, subdevice) and fields that are not
used by any existing port service driver (class, class_mask,
drvier_data). Also, both existing port service drivers (AER and
PCIe HP) don't even use the vendor and device fields for device
matching. Therefore 'struct pcie_port_service_id' can be removed
altogether and the only useful members of it (port_type, service) can
be introduced directly into the port service device and port service
driver structures. That simplifies the code quite a bit and reduces
its size.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/pciehp_core.c | 12 ++----------
drivers/pci/pcie/aer/aerdrv.c | 16 ++--------------
drivers/pci/pcie/aer/aerdrv_core.c | 5 ++---
drivers/pci/pcie/portdrv_bus.c | 12 ++++--------
drivers/pci/pcie/portdrv_core.c | 10 ++++------
include/linux/pcieport_if.h | 19 +++++++------------
6 files changed, 21 insertions(+), 53 deletions(-)
Index: linux-2.6/include/linux/pcieport_if.h
===================================================================
--- linux-2.6.orig/include/linux/pcieport_if.h
+++ linux-2.6/include/linux/pcieport_if.h
@@ -27,19 +27,12 @@
#define PCIE_PORT_MSI_MODE 1
#define PCIE_PORT_MSIX_MODE 2
-struct pcie_port_service_id {
- __u32 vendor, device; /* Vendor and device ID or PCI_ANY_ID*/
- __u32 subvendor, subdevice; /* Subsystem ID's or PCI_ANY_ID */
- __u32 class, class_mask; /* (class,subclass,prog-if) triplet */
- __u32 port_type, service_type; /* Port Entity */
- kernel_ulong_t driver_data;
-};
-
struct pcie_device {
int irq; /* Service IRQ/MSI/MSI-X Vector */
- int interrupt_mode; /* [0:INTx | 1:MSI | 2:MSI-X] */
- struct pcie_port_service_id id; /* Service ID */
- struct pci_dev *port; /* Root/Upstream/Downstream Port */
+ int irq_mode; /* [0:INTx | 1:MSI | 2:MSI-X] */
+ struct pci_dev *port; /* Root/Upstream/Downstream Port */
+ u32 port_type; /* Type of the port */
+ u32 service; /* Port service this device represents */
void *priv_data; /* Service Private Data */
struct device device; /* Generic Device Interface */
};
@@ -68,7 +61,9 @@ struct pcie_port_service_driver {
/* Link Reset Capability - AER service driver specific */
pci_ers_result_t (*reset_link) (struct pci_dev *dev);
- const struct pcie_port_service_id *id_table;
+ u32 port_type; /* Type of the port */
+ u32 service; /* Port service this device represents */
+
struct device_driver driver;
};
#define to_service_driver(d) \
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -199,12 +199,10 @@ static void pcie_device_init(struct pci_
struct device *device;
dev->port = parent;
- dev->interrupt_mode = irq_mode;
dev->irq = irq;
- dev->id.vendor = parent->vendor;
- dev->id.device = parent->device;
- dev->id.port_type = port_type;
- dev->id.service_type = service_type;
+ dev->irq_mode = irq_mode;
+ dev->port_type = port_type;
+ dev->service = service_type;
/* Initialize generic device interface */
device = &dev->device;
@@ -427,7 +425,7 @@ void pcie_port_device_remove(struct pci_
status = device_for_each_child(&dev->dev, &device_addr, remove_iter);
if (status) {
device = (struct device*)device_addr;
- interrupt_mode = (to_pcie_device(device))->interrupt_mode;
+ interrupt_mode = to_pcie_device(device)->irq_mode;
put_device(device);
device_unregister(device);
}
Index: linux-2.6/drivers/pci/pcie/portdrv_bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_bus.c
+++ linux-2.6/drivers/pci/pcie/portdrv_bus.c
@@ -30,16 +30,12 @@ static int pcie_port_bus_match(struct de
if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
return 0;
-
+
pciedev = to_pcie_device(dev);
driver = to_service_driver(drv);
- if ( (driver->id_table->vendor != PCI_ANY_ID &&
- driver->id_table->vendor != pciedev->id.vendor) ||
- (driver->id_table->device != PCI_ANY_ID &&
- driver->id_table->device != pciedev->id.device) ||
- (driver->id_table->port_type != PCIE_ANY_PORT &&
- driver->id_table->port_type != pciedev->id.port_type) ||
- driver->id_table->service_type != pciedev->id.service_type )
+ if ((driver->port_type != PCIE_ANY_PORT
+ && driver->port_type != pciedev->port_type)
+ || driver->service != pciedev->service)
return 0;
return 1;
Index: linux-2.6/drivers/pci/pcie/aer/aerdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/aer/aerdrv_core.c
+++ linux-2.6/drivers/pci/pcie/aer/aerdrv_core.c
@@ -327,14 +327,13 @@ static int find_aer_service_iter(struct
if (device->bus == &pcie_port_bus_type) {
pcie_dev = to_pcie_device(device);
- if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+ if (pcie_dev->port_type == PCIE_SW_DOWNSTREAM_PORT)
result->is_downstream = 1;
driver = device->driver;
if (driver) {
service_driver = to_service_driver(driver);
- if (service_driver->id_table->service_type ==
- PCIE_PORT_SERVICE_AER) {
+ if (service_driver->service == PCIE_PORT_SERVICE_AER) {
result->aer_driver = service_driver;
return 1;
}
Index: linux-2.6/drivers/pci/hotplug/pciehp_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_core.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_core.c
@@ -496,18 +496,10 @@ static int pciehp_resume (struct pcie_de
}
#endif
-static struct pcie_port_service_id port_pci_ids[] = { {
- .vendor = PCI_ANY_ID,
- .device = PCI_ANY_ID,
- .port_type = PCIE_ANY_PORT,
- .service_type = PCIE_PORT_SERVICE_HP,
- .driver_data = 0,
- }, { /* end: all zeroes */ }
-};
-
static struct pcie_port_service_driver hpdriver_portdrv = {
.name = PCIE_MODULE_NAME,
- .id_table = &port_pci_ids[0],
+ .port_type = PCIE_ANY_PORT,
+ .service = PCIE_PORT_SERVICE_HP,
.probe = pciehp_probe,
.remove = pciehp_remove,
Index: linux-2.6/drivers/pci/pcie/aer/aerdrv.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/aer/aerdrv.c
+++ linux-2.6/drivers/pci/pcie/aer/aerdrv.c
@@ -48,19 +48,6 @@ static pci_ers_result_t aer_error_detect
static void aer_error_resume(struct pci_dev *dev);
static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
-/*
- * PCI Express bus's AER Root service driver data structure
- */
-static struct pcie_port_service_id aer_id[] = {
- {
- .vendor = PCI_ANY_ID,
- .device = PCI_ANY_ID,
- .port_type = PCIE_RC_PORT,
- .service_type = PCIE_PORT_SERVICE_AER,
- },
- { /* end: all zeroes */ }
-};
-
static struct pci_error_handlers aer_error_handlers = {
.error_detected = aer_error_detected,
.resume = aer_error_resume,
@@ -68,7 +55,8 @@ static struct pci_error_handlers aer_err
static struct pcie_port_service_driver aerdriver = {
.name = "aer",
- .id_table = &aer_id[0],
+ .port_type = PCIE_ANY_PORT,
+ .service = PCIE_PORT_SERVICE_AER,
.probe = aer_probe,
.remove = aer_remove,
From: Rafael J. Wysocki <[email protected]>
If MSI-X interrupt mode is used by the PCI Express port driver, too
many vectors are allocated and it is not ensured that the right
vectors will be used for various services. Namely, the PCI Express
specification states that both PCI Express native PME and PCI Express
hotplug will always use the same MSI or MSI-X message for signalling
interrupts, which implies that the same vector will be used by both
of them. Also, the VC service does not use interrupts at all.
Moreover, is not clear which of the vectors allocated by
pci_enable_msix() will be used for PME and hotplug and which of them
will be used for AER if all of these services are configured.
For these reasons, rework the allocation of interrupts for PCI
Express ports so that at most two vectors are allocated and ensure
that the right vector will be used for the right purpose.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv.h | 1
drivers/pci/pcie/portdrv_core.c | 155 +++++++++++++++++++++++++++++-----------
2 files changed, 117 insertions(+), 39 deletions(-)
Index: linux-2.6/drivers/pci/pcie/portdrv.h
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv.h
+++ linux-2.6/drivers/pci/pcie/portdrv.h
@@ -25,6 +25,7 @@
#define PCIE_CAPABILITIES_REG 0x2
#define PCIE_SLOT_CAPABILITIES_REG 0x14
#define PCIE_PORT_DEVICE_MAXSERVICES 4
+#define PORT_MSI_VECTOR_MASK 0x1f
#define get_descriptor_id(type, service) (((type - 4) << 4) | service)
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -30,55 +30,120 @@ static void release_pcie_device(struct d
}
/**
+ * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
+ * @dev: PCI Express port that is going to use the vectors
+ * @vectors: Array of interrupt vectors to check (2 entries)
+ *
+ * Return value:
+ * 0 on success, error code if the values read from config registers are not as
+ * expected
+ *
+ * If this function is called, we are going to use two interrupt vectors which
+ * may be different, but we have to make sure they are in the right order such
+ * that the vector to be used for PME and hotplug signalling is the first one.
+ *
+ * NOTE: The assumption here is that MSI message offset (with respect to the
+ * base Message Data) equal to N corresponds to index N in the array of vectors
+ * returned by pci_enable_msix().
+ */
+static int fix_up_vectors(struct pci_dev *dev, int *vectors)
+{
+ int pos, offset;
+ int vec0, vec1;
+ u16 reg16;
+ u32 reg32;
+
+ if (vectors[0] == vectors[1])
+ return 0;
+
+ pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+ pci_read_config_word(dev, pos + PCIE_CAPABILITIES_REG, ®16);
+ offset = (reg16 >> 9) & PORT_MSI_VECTOR_MASK;
+ if (offset > 1)
+ return -EIO;
+ else
+ vec0 = offset ? vectors[1] : vectors[0];
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
+ pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, ®32);
+ offset = (reg32 >> 27) & PORT_MSI_VECTOR_MASK;
+ if (offset > 1)
+ return -EIO;
+ else
+ vec1 = offset ? vectors[1] : vectors[0];
+
+ vectors[0] = vec0;
+ vectors[1] = vec1;
+
+ return 0;
+}
+
+/**
+ * pcie_port_enable_msix - try to set up MSI-X as interrupt mode for given port
+ * @dev: PCI Express port to handle
+ * @vectors: Array of interrupt vectors to populate (2 entries)
+ * @mask: Bitmask of port capabilities returned by get_port_device_capability()
+ *
+ * Return value: 0 on success, error code on failure
+ */
+static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
+{
+ struct msix_entry msix_entries[] = {{0, 0}, {0, 1}};
+ int nvec, status;
+
+ /*
+ * According to the PCI Express specification both PME and PCI Express
+ * hotplug always use the same interrupt vector, while AER can use
+ * a different one.
+ */
+ nvec = ((mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP))
+ && (mask & PCIE_PORT_SERVICE_AER)) ? 2 : 1;
+
+ status = pci_enable_msix(dev, msix_entries, nvec);
+ if (status)
+ return status;
+
+ vectors[0] = msix_entries[0].vector;
+ vectors[1] = nvec > 1 ? msix_entries[1].vector : vectors[0];
+
+ status = fix_up_vectors(dev, vectors);
+ if (status)
+ pci_disable_msix(dev);
+
+ return status;
+}
+
+/**
* assign_interrupt_mode - choose interrupt mode for PCI Express port services
* (INTx, MSI-X, MSI) and set up vectors
* @dev: PCI Express port to handle
- * @vectors: Array of interrupt vectors to populate
+ * @vectors: Array of interrupt vectors to populate (2 entries)
* @mask: Bitmask of port capabilities returned by get_port_device_capability()
*
* Return value: Interrupt mode associated with the port
*/
static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int mask)
{
- int i, pos, nvec, status = -EINVAL;
int interrupt_mode = PCIE_PORT_NO_IRQ;
- /* Set INTx as default */
- for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
- if (mask & (1 << i))
- nvec++;
- vectors[i] = dev->irq;
- if (dev->pin)
- interrupt_mode = PCIE_PORT_INTx_MODE;
+ /* Use MSI-X if supported */
+ if (!pcie_port_enable_msix(dev, vectors, mask)) {
+ interrupt_mode = PCIE_PORT_MSIX_MODE;
+ goto Exit;
}
- /* Select MSI-X over MSI if supported */
- pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
- if (pos) {
- struct msix_entry msix_entries[PCIE_PORT_DEVICE_MAXSERVICES] =
- {{0, 0}, {0, 1}, {0, 2}, {0, 3}};
- status = pci_enable_msix(dev, msix_entries, nvec);
- if (!status) {
- int j = 0;
-
- interrupt_mode = PCIE_PORT_MSIX_MODE;
- for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
- if (mask & (1 << i))
- vectors[i] = msix_entries[j++].vector;
- }
- }
- }
- if (status) {
- pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
- if (pos) {
- status = pci_enable_msi(dev);
- if (!status) {
- interrupt_mode = PCIE_PORT_MSI_MODE;
- for (i = 0;i < PCIE_PORT_DEVICE_MAXSERVICES;i++)
- vectors[i] = dev->irq;
- }
- }
- }
+ /* We can't use MSI-X, so try MSI and fall back to INTx */
+ if (!pci_enable_msi(dev))
+ interrupt_mode = PCIE_PORT_MSI_MODE;
+ else if (dev->pin)
+ interrupt_mode = PCIE_PORT_INTx_MODE;
+ else
+ goto Exit;
+
+ vectors[0] = dev->irq;
+ vectors[1] = vectors[0];
+
+ Exit:
return interrupt_mode;
}
@@ -205,7 +270,7 @@ int pcie_port_device_probe(struct pci_de
int pcie_port_device_register(struct pci_dev *dev)
{
int status, type, capabilities, irq_mode, i, nr_serv;
- int vectors[PCIE_PORT_DEVICE_MAXSERVICES];
+ int vectors[] = { -1, -1 };
u16 reg16;
/* Get port type */
@@ -229,7 +294,7 @@ int pcie_port_device_register(struct pci
/* Allocate child services if any */
for (i = 0, nr_serv = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
struct pcie_device *child;
- int service = 1 << i;
+ int irq, service = 1 << i;
if (!(capabilities & service))
continue;
@@ -242,11 +307,23 @@ int pcie_port_device_register(struct pci
&& service != PCIE_PORT_SERVICE_VC)
continue;
+ switch (service) {
+ case PCIE_PORT_SERVICE_PME:
+ case PCIE_PORT_SERVICE_HP:
+ irq = vectors[0];
+ break;
+ case PCIE_PORT_SERVICE_AER:
+ irq = vectors[1];
+ break;
+ default:
+ irq = -1;
+ }
+
child = alloc_pcie_device(
dev, /* parent */
type, /* port type */
service, /* service type */
- vectors[i], /* irq */
+ irq, /* interrupt vector */
irq_mode /* interrupt mode */);
if (!child)
continue;
On Sunday 04 January 2009, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The second argument of the ->probe() callback in
> struct pcie_port_service_driver is unnecessary and never used.
> Remove it.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
This patch had to be rebased on top of the current linux-next branch of the
pci-2.6 tree.
Updated patch is appended.
Thanks,
Rafael
---
Subject: PCI PCIe portdrv: Simplily probe callback of service drivers (rev. 2)
From: Rafael J. Wysocki <[email protected]>
The second argument of the ->probe() callback in
struct pcie_port_service_driver is unnecessary and never used.
Remove it.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/pciehp_acpi.c | 3 +--
drivers/pci/hotplug/pciehp_core.c | 2 +-
drivers/pci/pcie/aer/aerdrv.c | 6 ++----
drivers/pci/pcie/portdrv_core.c | 2 +-
include/linux/pcieport_if.h | 3 +--
5 files changed, 6 insertions(+), 10 deletions(-)
Index: linux-2.6/include/linux/pcieport_if.h
===================================================================
--- linux-2.6.orig/include/linux/pcieport_if.h
+++ linux-2.6/include/linux/pcieport_if.h
@@ -57,8 +57,7 @@ static inline void* get_service_data(str
struct pcie_port_service_driver {
const char *name;
- int (*probe) (struct pcie_device *dev,
- const struct pcie_port_service_id *id);
+ int (*probe) (struct pcie_device *dev);
void (*remove) (struct pcie_device *dev);
int (*suspend) (struct pcie_device *dev, pm_message_t state);
int (*resume) (struct pcie_device *dev);
Index: linux-2.6/drivers/pci/pcie/aer/aerdrv.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/aer/aerdrv.c
+++ linux-2.6/drivers/pci/pcie/aer/aerdrv.c
@@ -38,8 +38,7 @@ MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
MODULE_LICENSE("GPL");
-static int __devinit aer_probe (struct pcie_device *dev,
- const struct pcie_port_service_id *id );
+static int __devinit aer_probe (struct pcie_device *dev);
static void aer_remove(struct pcie_device *dev);
static int aer_suspend(struct pcie_device *dev, pm_message_t state)
{return 0;}
@@ -207,8 +206,7 @@ static void aer_remove(struct pcie_devic
*
* Invoked when PCI Express bus loads AER service driver.
**/
-static int __devinit aer_probe (struct pcie_device *dev,
- const struct pcie_port_service_id *id )
+static int __devinit aer_probe (struct pcie_device *dev)
{
int status;
struct aer_rpc *rpc;
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -461,7 +461,7 @@ static int pcie_port_probe_service(struc
return -ENODEV;
pciedev = to_pcie_device(dev);
- status = driver->probe(pciedev, driver->id_table);
+ status = driver->probe(pciedev);
if (!status) {
dev_printk(KERN_DEBUG, dev, "service driver %s loaded\n",
driver->name);
Index: linux-2.6/drivers/pci/hotplug/pciehp_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_core.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_core.c
@@ -399,7 +399,7 @@ static int get_cur_bus_speed(struct hotp
return 0;
}
-static int pciehp_probe(struct pcie_device *dev, const struct pcie_port_service_id *id)
+static int pciehp_probe(struct pcie_device *dev)
{
int rc;
struct controller *ctrl;
Index: linux-2.6/drivers/pci/hotplug/pciehp_acpi.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_acpi.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_acpi.c
@@ -82,8 +82,7 @@ static int __initdata acpi_slot_detected
static struct list_head __initdata dummy_slots = LIST_HEAD_INIT(dummy_slots);
/* Dummy driver for dumplicate name detection */
-static int __init dummy_probe(struct pcie_device *dev,
- const struct pcie_port_service_id *id)
+static int __init dummy_probe(struct pcie_device *dev)
{
int pos;
u32 slot_cap;
On Monday 05 January 2009, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The PCI Express port driver uses 'struct pcie_port_service_id' for
> matching port service devices and drivers, but this structure
> contains fields that duplicate information from the port device
> itself (vendor, device, subvendor, subdevice) and fields that are not
> used by any existing port service driver (class, class_mask,
> drvier_data). Also, both existing port service drivers (AER and
> PCIe HP) don't even use the vendor and device fields for device
> matching. Therefore 'struct pcie_port_service_id' can be removed
> altogether and the only useful members of it (port_type, service) can
> be introduced directly into the port service device and port service
> driver structures. That simplifies the code quite a bit and reduces
> its size.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
This patch had to be rebased on top of the current linux-next branch of the
pci-2.6 tree.
Updated patch is appended.
Thanks,
Rafael
---
Subject: PCI PCIe portdrv: Remove struct pcie_port_service_id (rev. 2)
From: Rafael J. Wysocki <[email protected]>
The PCI Express port driver uses 'struct pcie_port_service_id' for
matching port service devices and drivers, but this structure
contains fields that duplicate information from the port device
itself (vendor, device, subvendor, subdevice) and fields that are not
used by any existing port service driver (class, class_mask,
drvier_data). Also, both existing port service drivers (AER and
PCIe HP) don't even use the vendor and device fields for device
matching. Therefore 'struct pcie_port_service_id' can be removed
altogether and the only useful members of it (port_type, service) can
be introduced directly into the port service device and port service
driver structures. That simplifies the code quite a bit and reduces
its size.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/hotplug/pciehp_acpi.c | 13 ++-----------
drivers/pci/hotplug/pciehp_core.c | 12 ++----------
drivers/pci/pcie/aer/aerdrv.c | 16 ++--------------
drivers/pci/pcie/aer/aerdrv_core.c | 5 ++---
drivers/pci/pcie/portdrv_bus.c | 12 ++++--------
drivers/pci/pcie/portdrv_core.c | 10 ++++------
include/linux/pcieport_if.h | 19 +++++++------------
7 files changed, 23 insertions(+), 64 deletions(-)
Index: linux-2.6/include/linux/pcieport_if.h
===================================================================
--- linux-2.6.orig/include/linux/pcieport_if.h
+++ linux-2.6/include/linux/pcieport_if.h
@@ -27,19 +27,12 @@
#define PCIE_PORT_MSI_MODE 1
#define PCIE_PORT_MSIX_MODE 2
-struct pcie_port_service_id {
- __u32 vendor, device; /* Vendor and device ID or PCI_ANY_ID*/
- __u32 subvendor, subdevice; /* Subsystem ID's or PCI_ANY_ID */
- __u32 class, class_mask; /* (class,subclass,prog-if) triplet */
- __u32 port_type, service_type; /* Port Entity */
- kernel_ulong_t driver_data;
-};
-
struct pcie_device {
int irq; /* Service IRQ/MSI/MSI-X Vector */
- int interrupt_mode; /* [0:INTx | 1:MSI | 2:MSI-X] */
- struct pcie_port_service_id id; /* Service ID */
- struct pci_dev *port; /* Root/Upstream/Downstream Port */
+ int irq_mode; /* [0:INTx | 1:MSI | 2:MSI-X] */
+ struct pci_dev *port; /* Root/Upstream/Downstream Port */
+ u32 port_type; /* Type of the port */
+ u32 service; /* Port service this device represents */
void *priv_data; /* Service Private Data */
struct device device; /* Generic Device Interface */
};
@@ -68,7 +61,9 @@ struct pcie_port_service_driver {
/* Link Reset Capability - AER service driver specific */
pci_ers_result_t (*reset_link) (struct pci_dev *dev);
- const struct pcie_port_service_id *id_table;
+ u32 port_type; /* Type of the port */
+ u32 service; /* Port service this device represents */
+
struct device_driver driver;
};
#define to_service_driver(d) \
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -199,12 +199,10 @@ static void pcie_device_init(struct pci_
struct device *device;
dev->port = parent;
- dev->interrupt_mode = irq_mode;
dev->irq = irq;
- dev->id.vendor = parent->vendor;
- dev->id.device = parent->device;
- dev->id.port_type = port_type;
- dev->id.service_type = service_type;
+ dev->irq_mode = irq_mode;
+ dev->port_type = port_type;
+ dev->service = service_type;
/* Initialize generic device interface */
device = &dev->device;
@@ -427,7 +425,7 @@ void pcie_port_device_remove(struct pci_
status = device_for_each_child(&dev->dev, &device_addr, remove_iter);
if (status) {
device = (struct device*)device_addr;
- interrupt_mode = (to_pcie_device(device))->interrupt_mode;
+ interrupt_mode = to_pcie_device(device)->irq_mode;
put_device(device);
device_unregister(device);
}
Index: linux-2.6/drivers/pci/pcie/portdrv_bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_bus.c
+++ linux-2.6/drivers/pci/pcie/portdrv_bus.c
@@ -30,16 +30,12 @@ static int pcie_port_bus_match(struct de
if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
return 0;
-
+
pciedev = to_pcie_device(dev);
driver = to_service_driver(drv);
- if ( (driver->id_table->vendor != PCI_ANY_ID &&
- driver->id_table->vendor != pciedev->id.vendor) ||
- (driver->id_table->device != PCI_ANY_ID &&
- driver->id_table->device != pciedev->id.device) ||
- (driver->id_table->port_type != PCIE_ANY_PORT &&
- driver->id_table->port_type != pciedev->id.port_type) ||
- driver->id_table->service_type != pciedev->id.service_type )
+ if ((driver->port_type != PCIE_ANY_PORT
+ && driver->port_type != pciedev->port_type)
+ || driver->service != pciedev->service)
return 0;
return 1;
Index: linux-2.6/drivers/pci/pcie/aer/aerdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/aer/aerdrv_core.c
+++ linux-2.6/drivers/pci/pcie/aer/aerdrv_core.c
@@ -327,14 +327,13 @@ static int find_aer_service_iter(struct
if (device->bus == &pcie_port_bus_type) {
pcie_dev = to_pcie_device(device);
- if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+ if (pcie_dev->port_type == PCIE_SW_DOWNSTREAM_PORT)
result->is_downstream = 1;
driver = device->driver;
if (driver) {
service_driver = to_service_driver(driver);
- if (service_driver->id_table->service_type ==
- PCIE_PORT_SERVICE_AER) {
+ if (service_driver->service == PCIE_PORT_SERVICE_AER) {
result->aer_driver = service_driver;
return 1;
}
Index: linux-2.6/drivers/pci/hotplug/pciehp_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_core.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_core.c
@@ -496,18 +496,10 @@ static int pciehp_resume (struct pcie_de
}
#endif
-static struct pcie_port_service_id port_pci_ids[] = { {
- .vendor = PCI_ANY_ID,
- .device = PCI_ANY_ID,
- .port_type = PCIE_ANY_PORT,
- .service_type = PCIE_PORT_SERVICE_HP,
- .driver_data = 0,
- }, { /* end: all zeroes */ }
-};
-
static struct pcie_port_service_driver hpdriver_portdrv = {
.name = PCIE_MODULE_NAME,
- .id_table = &port_pci_ids[0],
+ .port_type = PCIE_ANY_PORT,
+ .service = PCIE_PORT_SERVICE_HP,
.probe = pciehp_probe,
.remove = pciehp_remove,
Index: linux-2.6/drivers/pci/pcie/aer/aerdrv.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/aer/aerdrv.c
+++ linux-2.6/drivers/pci/pcie/aer/aerdrv.c
@@ -48,19 +48,6 @@ static pci_ers_result_t aer_error_detect
static void aer_error_resume(struct pci_dev *dev);
static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
-/*
- * PCI Express bus's AER Root service driver data structure
- */
-static struct pcie_port_service_id aer_id[] = {
- {
- .vendor = PCI_ANY_ID,
- .device = PCI_ANY_ID,
- .port_type = PCIE_RC_PORT,
- .service_type = PCIE_PORT_SERVICE_AER,
- },
- { /* end: all zeroes */ }
-};
-
static struct pci_error_handlers aer_error_handlers = {
.error_detected = aer_error_detected,
.resume = aer_error_resume,
@@ -68,7 +55,8 @@ static struct pci_error_handlers aer_err
static struct pcie_port_service_driver aerdriver = {
.name = "aer",
- .id_table = &aer_id[0],
+ .port_type = PCIE_ANY_PORT,
+ .service = PCIE_PORT_SERVICE_AER,
.probe = aer_probe,
.remove = aer_remove,
Index: linux-2.6/drivers/pci/hotplug/pciehp_acpi.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_acpi.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_acpi.c
@@ -67,16 +67,6 @@ static int __init parse_detect_mode(void
return PCIEHP_DETECT_DEFAULT;
}
-static struct pcie_port_service_id __initdata port_pci_ids[] = {
- {
- .vendor = PCI_ANY_ID,
- .device = PCI_ANY_ID,
- .port_type = PCIE_ANY_PORT,
- .service_type = PCIE_PORT_SERVICE_HP,
- .driver_data = 0,
- }, { /* end: all zeroes */ }
-};
-
static int __initdata dup_slot_id;
static int __initdata acpi_slot_detected;
static struct list_head __initdata dummy_slots = LIST_HEAD_INIT(dummy_slots);
@@ -110,7 +100,8 @@ static int __init dummy_probe(struct pci
static struct pcie_port_service_driver __initdata dummy_driver = {
.name = "pciehp_dummy",
- .id_table = port_pci_ids,
+ .port_type = PCIE_ANY_PORT,
+ .service = PCIE_PORT_SERVICE_HP,
.probe = dummy_probe,
};
Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> If MSI-X interrupt mode is used by the PCI Express port driver, too
> many vectors are allocated and it is not ensured that the right
> vectors will be used for various services. Namely, the PCI Express
> specification states that both PCI Express native PME and PCI Express
> hotplug will always use the same MSI or MSI-X message for signalling
> interrupts, which implies that the same vector will be used by both
> of them. Also, the VC service does not use interrupts at all.
> Moreover, is not clear which of the vectors allocated by
> pci_enable_msix() will be used for PME and hotplug and which of them
> will be used for AER if all of these services are configured.
>
> For these reasons, rework the allocation of interrupts for PCI
> Express ports so that at most two vectors are allocated and ensure
> that the right vector will be used for the right purpose.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/pci/pcie/portdrv.h | 1
> drivers/pci/pcie/portdrv_core.c | 155 +++++++++++++++++++++++++++++-----------
> 2 files changed, 117 insertions(+), 39 deletions(-)
>
> Index: linux-2.6/drivers/pci/pcie/portdrv.h
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
> +++ linux-2.6/drivers/pci/pcie/portdrv.h
> @@ -25,6 +25,7 @@
> #define PCIE_CAPABILITIES_REG 0x2
> #define PCIE_SLOT_CAPABILITIES_REG 0x14
> #define PCIE_PORT_DEVICE_MAXSERVICES 4
> +#define PORT_MSI_VECTOR_MASK 0x1f
>
> #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
>
> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
> @@ -30,55 +30,120 @@ static void release_pcie_device(struct d
> }
>
> /**
> + * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
> + * @dev: PCI Express port that is going to use the vectors
> + * @vectors: Array of interrupt vectors to check (2 entries)
> + *
> + * Return value:
> + * 0 on success, error code if the values read from config registers are not as
> + * expected
> + *
> + * If this function is called, we are going to use two interrupt vectors which
> + * may be different, but we have to make sure they are in the right order such
> + * that the vector to be used for PME and hotplug signalling is the first one.
> + *
> + * NOTE: The assumption here is that MSI message offset (with respect to the
> + * base Message Data) equal to N corresponds to index N in the array of vectors
> + * returned by pci_enable_msix().
> + */
I've posted the similar patch recently, which doesn't have this
assumption. Please take a look.
http://marc.info/?l=linux-pci&m=122992792507715&w=2
NOTE: I've confirmed MSI and INTx still work with my patch, but it
is not tested on the machine with MSI-X capable port, since I don't
have such environment.
Thanks,
Kenji Kaneshige
On Thursday 08 January 2009, Kenji Kaneshige wrote:
> Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > If MSI-X interrupt mode is used by the PCI Express port driver, too
> > many vectors are allocated and it is not ensured that the right
> > vectors will be used for various services. Namely, the PCI Express
> > specification states that both PCI Express native PME and PCI Express
> > hotplug will always use the same MSI or MSI-X message for signalling
> > interrupts, which implies that the same vector will be used by both
> > of them. Also, the VC service does not use interrupts at all.
> > Moreover, is not clear which of the vectors allocated by
> > pci_enable_msix() will be used for PME and hotplug and which of them
> > will be used for AER if all of these services are configured.
> >
> > For these reasons, rework the allocation of interrupts for PCI
> > Express ports so that at most two vectors are allocated and ensure
> > that the right vector will be used for the right purpose.
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > ---
> > drivers/pci/pcie/portdrv.h | 1
> > drivers/pci/pcie/portdrv_core.c | 155 +++++++++++++++++++++++++++++-----------
> > 2 files changed, 117 insertions(+), 39 deletions(-)
> >
> > Index: linux-2.6/drivers/pci/pcie/portdrv.h
> > ===================================================================
> > --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
> > +++ linux-2.6/drivers/pci/pcie/portdrv.h
> > @@ -25,6 +25,7 @@
> > #define PCIE_CAPABILITIES_REG 0x2
> > #define PCIE_SLOT_CAPABILITIES_REG 0x14
> > #define PCIE_PORT_DEVICE_MAXSERVICES 4
> > +#define PORT_MSI_VECTOR_MASK 0x1f
> >
> > #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
> >
> > Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
> > +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
> > @@ -30,55 +30,120 @@ static void release_pcie_device(struct d
> > }
> >
> > /**
> > + * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
> > + * @dev: PCI Express port that is going to use the vectors
> > + * @vectors: Array of interrupt vectors to check (2 entries)
> > + *
> > + * Return value:
> > + * 0 on success, error code if the values read from config registers are not as
> > + * expected
> > + *
> > + * If this function is called, we are going to use two interrupt vectors which
> > + * may be different, but we have to make sure they are in the right order such
> > + * that the vector to be used for PME and hotplug signalling is the first one.
> > + *
> > + * NOTE: The assumption here is that MSI message offset (with respect to the
> > + * base Message Data) equal to N corresponds to index N in the array of vectors
> > + * returned by pci_enable_msix().
> > + */
>
> I've posted the similar patch recently, which doesn't have this
> assumption.
Actually, this assumption is in agreement with PCI Express Base Specification
2.0, which very clearly states in Section 7.8.2 that:
"[...] Interrupt Message Number – This field indicates which
MSI/MSI-X vector is used for the interrupt message generated in
association with any of the status bits of this Capability structure.
[...]
For MSI-X, the value in this field indicates which MSI-X Table
entry is used to generate the interrupt message. The entry must
be one of the first 32 entries even if the Function implements
more than 32 entries. For a given MSI-X implementation, the
entry must remain constant."
Well, I have to update the comment. :-)
> Please take a look.
>
> http://marc.info/?l=linux-pci&m=122992792507715&w=2
OK, thanks. I can easily change my patch to follow this behavior, although
I don't really think it's necessary given the above.
> NOTE: I've confirmed MSI and INTx still work with my patch, but it
> is not tested on the machine with MSI-X capable port, since I don't
> have such environment.
I also have tested my patch with both MSI and INTx, but I also don't have
MSI-X capable root ports in any of my test boxes.
Thanks,
Rafael
Rafael J. Wysocki wrote:
> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>> Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <[email protected]>
>>>
>>> If MSI-X interrupt mode is used by the PCI Express port driver, too
>>> many vectors are allocated and it is not ensured that the right
>>> vectors will be used for various services. Namely, the PCI Express
>>> specification states that both PCI Express native PME and PCI Express
>>> hotplug will always use the same MSI or MSI-X message for signalling
>>> interrupts, which implies that the same vector will be used by both
>>> of them. Also, the VC service does not use interrupts at all.
>>> Moreover, is not clear which of the vectors allocated by
>>> pci_enable_msix() will be used for PME and hotplug and which of them
>>> will be used for AER if all of these services are configured.
>>>
>>> For these reasons, rework the allocation of interrupts for PCI
>>> Express ports so that at most two vectors are allocated and ensure
>>> that the right vector will be used for the right purpose.
>>>
>>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>>> ---
>>> drivers/pci/pcie/portdrv.h | 1
>>> drivers/pci/pcie/portdrv_core.c | 155 +++++++++++++++++++++++++++++-----------
>>> 2 files changed, 117 insertions(+), 39 deletions(-)
>>>
>>> Index: linux-2.6/drivers/pci/pcie/portdrv.h
>>> ===================================================================
>>> --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
>>> +++ linux-2.6/drivers/pci/pcie/portdrv.h
>>> @@ -25,6 +25,7 @@
>>> #define PCIE_CAPABILITIES_REG 0x2
>>> #define PCIE_SLOT_CAPABILITIES_REG 0x14
>>> #define PCIE_PORT_DEVICE_MAXSERVICES 4
>>> +#define PORT_MSI_VECTOR_MASK 0x1f
>>>
>>> #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
>>>
>>> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
>>> ===================================================================
>>> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
>>> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
>>> @@ -30,55 +30,120 @@ static void release_pcie_device(struct d
>>> }
>>>
>>> /**
>>> + * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
>>> + * @dev: PCI Express port that is going to use the vectors
>>> + * @vectors: Array of interrupt vectors to check (2 entries)
>>> + *
>>> + * Return value:
>>> + * 0 on success, error code if the values read from config registers are not as
>>> + * expected
>>> + *
>>> + * If this function is called, we are going to use two interrupt vectors which
>>> + * may be different, but we have to make sure they are in the right order such
>>> + * that the vector to be used for PME and hotplug signalling is the first one.
>>> + *
>>> + * NOTE: The assumption here is that MSI message offset (with respect to the
>>> + * base Message Data) equal to N corresponds to index N in the array of vectors
>>> + * returned by pci_enable_msix().
>>> + */
>> I've posted the similar patch recently, which doesn't have this
>> assumption.
>
> Actually, this assumption is in agreement with PCI Express Base Specification
> 2.0, which very clearly states in Section 7.8.2 that:
>
> "[...] Interrupt Message Number ? This field indicates which
> MSI/MSI-X vector is used for the interrupt message generated in
> association with any of the status bits of this Capability structure.
>
> [...]
>
> For MSI-X, the value in this field indicates which MSI-X Table
> entry is used to generate the interrupt message. The entry must
> be one of the first 32 entries even if the Function implements
> more than 32 entries. For a given MSI-X implementation, the
> entry must remain constant."
>
Though I might be misunderstanding something, my understanding is
that this implies that there can be more MSI-X table entries than
number of interrupts, and PME/HotPlug or AER interrupts can be
mapped to other than first two entries. If my understanding is
correct, your patch would not work if PME/HotPlug or AER interrupts
were mapped to other than first two.
Regardless of my understanding is correct or not, I have another
concern. I think there are two interrupts for PCIe port service
currently as your patch indicates. But new interrupts can be added
in the future. With the assumption, PME/HotPlug and/or AER interrupts
would not work if the first several entries are mapped to other new
interrupts. Therefore, with the assumption, we need to fix MSI-X
initialization code whenever new interrupts are defined.
> Well, I have to update the comment. :-)
>
>> Please take a look.
>>
>> http://marc.info/?l=linux-pci&m=122992792507715&w=2
>
> OK, thanks. I can easily change my patch to follow this behavior, although
> I don't really think it's necessary given the above.
>
Please see above mentioned my concerns.
>> NOTE: I've confirmed MSI and INTx still work with my patch, but it
>> is not tested on the machine with MSI-X capable port, since I don't
>> have such environment.
>
> I also have tested my patch with both MSI and INTx, but I also don't have
> MSI-X capable root ports in any of my test boxes.
>
Though the patch is not tested with MSI-X capable ports, I think it
is much better than the current code.
Thanks,
Kenji Kaneshige
On Thursday 08 January 2009, Kenji Kaneshige wrote:
> Rafael J. Wysocki wrote:
> > On Thursday 08 January 2009, Kenji Kaneshige wrote:
> >> Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <[email protected]>
> >>>
> >>> If MSI-X interrupt mode is used by the PCI Express port driver, too
> >>> many vectors are allocated and it is not ensured that the right
> >>> vectors will be used for various services. Namely, the PCI Express
> >>> specification states that both PCI Express native PME and PCI Express
> >>> hotplug will always use the same MSI or MSI-X message for signalling
> >>> interrupts, which implies that the same vector will be used by both
> >>> of them. Also, the VC service does not use interrupts at all.
> >>> Moreover, is not clear which of the vectors allocated by
> >>> pci_enable_msix() will be used for PME and hotplug and which of them
> >>> will be used for AER if all of these services are configured.
> >>>
> >>> For these reasons, rework the allocation of interrupts for PCI
> >>> Express ports so that at most two vectors are allocated and ensure
> >>> that the right vector will be used for the right purpose.
> >>>
> >>> Signed-off-by: Rafael J. Wysocki <[email protected]>
> >>> ---
> >>> drivers/pci/pcie/portdrv.h | 1
> >>> drivers/pci/pcie/portdrv_core.c | 155 +++++++++++++++++++++++++++++-----------
> >>> 2 files changed, 117 insertions(+), 39 deletions(-)
> >>>
> >>> Index: linux-2.6/drivers/pci/pcie/portdrv.h
> >>> ===================================================================
> >>> --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
> >>> +++ linux-2.6/drivers/pci/pcie/portdrv.h
> >>> @@ -25,6 +25,7 @@
> >>> #define PCIE_CAPABILITIES_REG 0x2
> >>> #define PCIE_SLOT_CAPABILITIES_REG 0x14
> >>> #define PCIE_PORT_DEVICE_MAXSERVICES 4
> >>> +#define PORT_MSI_VECTOR_MASK 0x1f
> >>>
> >>> #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
> >>>
> >>> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
> >>> ===================================================================
> >>> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
> >>> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
> >>> @@ -30,55 +30,120 @@ static void release_pcie_device(struct d
> >>> }
> >>>
> >>> /**
> >>> + * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
> >>> + * @dev: PCI Express port that is going to use the vectors
> >>> + * @vectors: Array of interrupt vectors to check (2 entries)
> >>> + *
> >>> + * Return value:
> >>> + * 0 on success, error code if the values read from config registers are not as
> >>> + * expected
> >>> + *
> >>> + * If this function is called, we are going to use two interrupt vectors which
> >>> + * may be different, but we have to make sure they are in the right order such
> >>> + * that the vector to be used for PME and hotplug signalling is the first one.
> >>> + *
> >>> + * NOTE: The assumption here is that MSI message offset (with respect to the
> >>> + * base Message Data) equal to N corresponds to index N in the array of vectors
> >>> + * returned by pci_enable_msix().
> >>> + */
> >> I've posted the similar patch recently, which doesn't have this
> >> assumption.
> >
> > Actually, this assumption is in agreement with PCI Express Base Specification
> > 2.0, which very clearly states in Section 7.8.2 that:
> >
> > "[...] Interrupt Message Number ? This field indicates which
> > MSI/MSI-X vector is used for the interrupt message generated in
> > association with any of the status bits of this Capability structure.
> >
> > [...]
> >
> > For MSI-X, the value in this field indicates which MSI-X Table
> > entry is used to generate the interrupt message. The entry must
> > be one of the first 32 entries even if the Function implements
> > more than 32 entries. For a given MSI-X implementation, the
> > entry must remain constant."
> >
>
> Though I might be misunderstanding something, my understanding is
> that this implies that there can be more MSI-X table entries than
> number of interrupts, and PME/HotPlug or AER interrupts can be
> mapped to other than first two entries.
We are requesting two vectors with my patch and they will be assigned to the
first two entries in the MSI-X table. If my understanding of the MSI-X code is
corrtect, there won't be more MSI-X table entries set up for the port.
> If my understanding is correct, your patch would not work if PME/HotPlug or
> AER interrupts were mapped to other than first two.
In this case it's going to fall back to MSI w/ one vector, because we're going
to find that offset > 1 in one of the tests in fix_up_vectors().
> Regardless of my understanding is correct or not, I have another
> concern. I think there are two interrupts for PCIe port service
> currently as your patch indicates. But new interrupts can be added
> in the future. With the assumption, PME/HotPlug and/or AER interrupts
> would not work if the first several entries are mapped to other new
> interrupts. Therefore, with the assumption, we need to fix MSI-X
> initialization code whenever new interrupts are defined.
In that case we'll need to allocate more vectors, so we'll need to change the
code anyway. Also, we'll have to add the tests for the new interrupts to
the code, because there will have to be new registers defined to read the
MSI offsets or MSI-X table indices from.
I very much prefer to keep the code as simple as possible as long as we can.
> > Well, I have to update the comment. :-)
> >
> >> Please take a look.
> >>
> >> http://marc.info/?l=linux-pci&m=122992792507715&w=2
> >
> > OK, thanks. I can easily change my patch to follow this behavior, although
> > I don't really think it's necessary given the above.
> >
>
> Please see above mentioned my concerns.
>
> >> NOTE: I've confirmed MSI and INTx still work with my patch, but it
> >> is not tested on the machine with MSI-X capable port, since I don't
> >> have such environment.
> >
> > I also have tested my patch with both MSI and INTx, but I also don't have
> > MSI-X capable root ports in any of my test boxes.
> >
>
> Though the patch is not tested with MSI-X capable ports, I think it
> is much better than the current code.
Well, I can only agree here. :-)
Thanks a lot for your comments.
Best,
Rafael
On Thursday 08 January 2009, Rafael J. Wysocki wrote:
> On Thursday 08 January 2009, Kenji Kaneshige wrote:
> > Rafael J. Wysocki wrote:
> > > On Thursday 08 January 2009, Kenji Kaneshige wrote:
> > >> Rafael J. Wysocki wrote:
> > >>> From: Rafael J. Wysocki <[email protected]>
> > >>>
> > >>> If MSI-X interrupt mode is used by the PCI Express port driver, too
> > >>> many vectors are allocated and it is not ensured that the right
> > >>> vectors will be used for various services. Namely, the PCI Express
> > >>> specification states that both PCI Express native PME and PCI Express
> > >>> hotplug will always use the same MSI or MSI-X message for signalling
> > >>> interrupts, which implies that the same vector will be used by both
> > >>> of them. Also, the VC service does not use interrupts at all.
> > >>> Moreover, is not clear which of the vectors allocated by
> > >>> pci_enable_msix() will be used for PME and hotplug and which of them
> > >>> will be used for AER if all of these services are configured.
> > >>>
> > >>> For these reasons, rework the allocation of interrupts for PCI
> > >>> Express ports so that at most two vectors are allocated and ensure
> > >>> that the right vector will be used for the right purpose.
Appended is a cleaned-up version of the patch that also contains comments
with references to the appropriate sections of the PCI Express spec.
Thanks,
Rafael
---
Subject: PCI PCIe portdrv: Fix allocation of interrupts (rev. 2)
From: Rafael J. Wysocki <[email protected]>
If MSI-X interrupt mode is used by the PCI Express port driver, too
many vectors are allocated and it is not ensured that the right
vectors will be used for various services. Namely, the PCI Express
specification states that both PCI Express native PME and PCI Express
hotplug will always use the same MSI or MSI-X message for signalling
interrupts, which implies that the same vector will be used by both
of them. Also, the VC service does not use interrupts at all.
Moreover, is not clear which of the vectors allocated by
pci_enable_msix() will be used for PME and hotplug and which of them
will be used for AER if all of these services are configured.
For these reasons, rework the allocation of interrupts for PCI
Express ports so that at most two vectors are allocated and ensure
that the right vector will be used for the right purpose.
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/pci/pcie/portdrv.h | 1
drivers/pci/pcie/portdrv_core.c | 168 ++++++++++++++++++++++++++++++----------
2 files changed, 130 insertions(+), 39 deletions(-)
Index: linux-2.6/drivers/pci/pcie/portdrv.h
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv.h
+++ linux-2.6/drivers/pci/pcie/portdrv.h
@@ -25,6 +25,7 @@
#define PCIE_CAPABILITIES_REG 0x2
#define PCIE_SLOT_CAPABILITIES_REG 0x14
#define PCIE_PORT_DEVICE_MAXSERVICES 4
+#define PORT_MSI_VECTOR_MASK 0x1f
#define get_descriptor_id(type, service) (((type - 4) << 4) | service)
Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
===================================================================
--- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
+++ linux-2.6/drivers/pci/pcie/portdrv_core.c
@@ -30,55 +30,133 @@ static void release_pcie_device(struct d
}
/**
+ * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
+ * @dev: PCI Express port that is going to use the vectors
+ * @vectors: Array of interrupt vectors to check (2 entries)
+ *
+ * Return value:
+ * 0 on success, error code if the values read from config registers are not as
+ * expected
+ *
+ * If this function is called, we are going to use two interrupt vectors which
+ * may be different, but we have to make sure they are in the right order such
+ * that the vector to be used for the PME and hotplug signalling is the first
+ * one.
+ */
+static int fix_up_vectors(struct pci_dev *dev, int *vectors)
+{
+ int pos, vec0, vec1;
+ unsigned int index;
+ u16 reg16;
+ u32 reg32;
+
+ if (vectors[0] == vectors[1])
+ return 0;
+
+ /*
+ * The code below follows the PCI Express Base Specification 2.0 clearly
+ * stating in Section 6.1.6 that "PME and Hot-Plug Event interrupts
+ * (when both are implemented) always share the same MSI or MSI-X
+ * vector, as indicated by the Interrupt Message Number field in the PCI
+ * Express Capabilities register", where according to Section 7.8.2 of
+ * the Specification "For MSI-X, the value in this field indicates which
+ * MSI-X Table entry is used to generate the interrupt message."
+ */
+ pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+ pci_read_config_word(dev, pos + PCIE_CAPABILITIES_REG, ®16);
+ index = (reg16 >> 9) & PORT_MSI_VECTOR_MASK;
+ if (index > 1)
+ return -EIO;
+ else
+ vec0 = vectors[index];
+ /*
+ * The code below follows Section 7.10.10 of the PCI Express Base
+ * Specification 2.0 stating that bits 31-27 of the Root Error Status
+ * Register contain a value indicating which of the MSI/MSI-X vectors
+ * assigned to the port is going to be used for AER, where "For MSI-X,
+ * the value in this register indicates which MSI-X Table entry is used
+ * to generate the interrupt message."
+ */
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
+ pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, ®32);
+ index = (reg32 >> 27) & PORT_MSI_VECTOR_MASK;
+ if (index > 1)
+ return -EIO;
+ else
+ vec1 = vectors[index];
+
+ vectors[0] = vec0;
+ vectors[1] = vec1;
+
+ return 0;
+}
+
+/**
+ * pcie_port_enable_msix - try to set up MSI-X as interrupt mode for given port
+ * @dev: PCI Express port to handle
+ * @vectors: Array of interrupt vectors to populate (2 entries)
+ * @mask: Bitmask of port capabilities returned by get_port_device_capability()
+ *
+ * Return value: 0 on success, error code on failure
+ */
+static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
+{
+ struct msix_entry msix_entries[] = {{0, 0}, {0, 1}};
+ int nvec, status;
+
+ /*
+ * According to the PCI Express specification both PME and PCI Express
+ * hotplug always use the same interrupt vector, while AER can use
+ * a different one.
+ */
+ nvec = ((mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP))
+ && (mask & PCIE_PORT_SERVICE_AER)) ? 2 : 1;
+
+ status = pci_enable_msix(dev, msix_entries, nvec);
+ if (status)
+ return status;
+
+ vectors[0] = msix_entries[0].vector;
+ vectors[1] = nvec > 1 ? msix_entries[1].vector : vectors[0];
+
+ status = fix_up_vectors(dev, vectors);
+ if (status)
+ pci_disable_msix(dev);
+
+ return status;
+}
+
+/**
* assign_interrupt_mode - choose interrupt mode for PCI Express port services
* (INTx, MSI-X, MSI) and set up vectors
* @dev: PCI Express port to handle
- * @vectors: Array of interrupt vectors to populate
+ * @vectors: Array of interrupt vectors to populate (2 entries)
* @mask: Bitmask of port capabilities returned by get_port_device_capability()
*
* Return value: Interrupt mode associated with the port
*/
static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int mask)
{
- int i, pos, nvec, status = -EINVAL;
int interrupt_mode = PCIE_PORT_NO_IRQ;
- /* Set INTx as default */
- for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
- if (mask & (1 << i))
- nvec++;
- vectors[i] = dev->irq;
- if (dev->pin)
- interrupt_mode = PCIE_PORT_INTx_MODE;
+ /* Use MSI-X if supported */
+ if (!pcie_port_enable_msix(dev, vectors, mask)) {
+ interrupt_mode = PCIE_PORT_MSIX_MODE;
+ goto Exit;
}
- /* Select MSI-X over MSI if supported */
- pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
- if (pos) {
- struct msix_entry msix_entries[PCIE_PORT_DEVICE_MAXSERVICES] =
- {{0, 0}, {0, 1}, {0, 2}, {0, 3}};
- status = pci_enable_msix(dev, msix_entries, nvec);
- if (!status) {
- int j = 0;
-
- interrupt_mode = PCIE_PORT_MSIX_MODE;
- for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
- if (mask & (1 << i))
- vectors[i] = msix_entries[j++].vector;
- }
- }
- }
- if (status) {
- pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
- if (pos) {
- status = pci_enable_msi(dev);
- if (!status) {
- interrupt_mode = PCIE_PORT_MSI_MODE;
- for (i = 0;i < PCIE_PORT_DEVICE_MAXSERVICES;i++)
- vectors[i] = dev->irq;
- }
- }
- }
+ /* We can't use MSI-X, so try MSI and fall back to INTx */
+ if (!pci_enable_msi(dev))
+ interrupt_mode = PCIE_PORT_MSI_MODE;
+ else if (dev->pin)
+ interrupt_mode = PCIE_PORT_INTx_MODE;
+ else
+ goto Exit;
+
+ vectors[0] = dev->irq;
+ vectors[1] = vectors[0];
+
+ Exit:
return interrupt_mode;
}
@@ -205,7 +283,7 @@ int pcie_port_device_probe(struct pci_de
int pcie_port_device_register(struct pci_dev *dev)
{
int status, type, capabilities, irq_mode, i, nr_serv;
- int vectors[PCIE_PORT_DEVICE_MAXSERVICES];
+ int vectors[] = { -1, -1 };
u16 reg16;
/* Get port type */
@@ -229,7 +307,7 @@ int pcie_port_device_register(struct pci
/* Allocate child services if any */
for (i = 0, nr_serv = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
struct pcie_device *child;
- int service = 1 << i;
+ int irq, service = 1 << i;
if (!(capabilities & service))
continue;
@@ -242,11 +320,23 @@ int pcie_port_device_register(struct pci
&& service != PCIE_PORT_SERVICE_VC)
continue;
+ switch (service) {
+ case PCIE_PORT_SERVICE_PME:
+ case PCIE_PORT_SERVICE_HP:
+ irq = vectors[0];
+ break;
+ case PCIE_PORT_SERVICE_AER:
+ irq = vectors[1];
+ break;
+ default:
+ irq = -1;
+ }
+
child = alloc_pcie_device(
dev, /* parent */
type, /* port type */
service, /* service type */
- vectors[i], /* irq */
+ irq, /* interrupt vector */
irq_mode /* interrupt mode */);
if (!child)
continue;
On Sunday, January 4, 2009 2:48 pm Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The PCI Express port driver contains a quirk that prevents root
> ports from using MSI, but there is no explanation for it and the
> PCI Express specification clearly allows root ports to use MSI.
Weird. This *looks* fine, but presumably it was added for some reason?
Unfortunately the git logs don't go back that far. I'll ping some of our
chipset guys to see if there were early issues with this or something; at the
very least we could probably make the quirk a bit narrower in scope.
--
Jesse Barnes, Intel Open Source Technology Center
On Sunday, January 4, 2009 2:48 pm Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The PCI Express port driver contains a quirk that prevents root
> ports from using MSI, but there is no explanation for it and the
> PCI Express specification clearly allows root ports to use MSI.
Ah ignore my last message, it really does look like some chipsets are just
buggy here, and the quirk only affects a few, so it should be pretty harmless
for the most part (iow I don't think this patch is a good idea on those
chipsets).
--
Jesse Barnes, Intel Open Source Technology Center
On Sunday, January 4, 2009 2:49 pm Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The PCI Express port driver should not attempt to register service
> devices that require the ability to generate interrupts if generating
> interrupts is not possible. Namely, if the port has no interrupt pin
> configured and we cannot set up MSI or MSI-X for it, there is no way
> it can generate interrupts and in such a case the port services that
> rely on interrupts (PME, PCIe HP, AER) should not be enabled for it.
>
> /* Root/Upstream/Downstream Port's Interrupt Mode */
> +#define PCIE_PORT_NO_IRQ (-1)
> #define PCIE_PORT_INTx_MODE 0
> #define PCIE_PORT_MSI_MODE 1
> #define PCIE_PORT_MSIX_MODE 2
> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
> @@ -41,13 +41,15 @@ static void release_pcie_device(struct d
> static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int
> mask) {
> int i, pos, nvec, status = -EINVAL;
> - int interrupt_mode = PCIE_PORT_INTx_MODE;
> + int interrupt_mode = PCIE_PORT_NO_IRQ;
>
> /* Set INTx as default */
> for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
> if (mask & (1 << i))
> nvec++;
> vectors[i] = dev->irq;
> + if (dev->pin)
> + interrupt_mode = PCIE_PORT_INTx_MODE;
> }
Looks like a good cleanup. I didn't audit all the places PCIE_PORT_* stuff
was used to see if != would be a problem though (probably isn't).
--
Jesse Barnes, Intel Open Source Technology Center
On Sunday, January 4, 2009 2:51 pm Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> There is struct pcie_port_device_ext defined in portdrv.h, but its
> member is only initialized and never used. Remove it.
Yay for dead code elimination.
--
Jesse Barnes, Intel Open Source Technology Center
On Sunday, January 4, 2009 2:54 pm Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The PCI Express port driver calls pci_enable_device() before setting
> up interrupts, which is wrong, because if there is an interrupt pin
> configured for the port, pci_enable_device() will likely set up an
> interrupt link for it. However, this shouldn't be done if either
> MSI or MSI-X interrupt mode is chosen for the port.
>
> The solution is to call pci_enable_device() after setting up
> interrupts, because in that case the interrupt link won't be set up
> if MSI or MSI-X are enabled.
Hm, wonder how this bug stuck around for so long... Good catch.
--
Jesse Barnes, Intel Open Source Technology Center
On Thursday, January 8, 2009 12:45 pm Rafael J. Wysocki wrote:
> On Thursday 08 January 2009, Rafael J. Wysocki wrote:
> > On Thursday 08 January 2009, Kenji Kaneshige wrote:
> > > Rafael J. Wysocki wrote:
> > > > On Thursday 08 January 2009, Kenji Kaneshige wrote:
> > > >> Rafael J. Wysocki wrote:
> > > >>> From: Rafael J. Wysocki <[email protected]>
> > > >>>
> > > >>> If MSI-X interrupt mode is used by the PCI Express port driver, too
> > > >>> many vectors are allocated and it is not ensured that the right
> > > >>> vectors will be used for various services. Namely, the PCI Express
> > > >>> specification states that both PCI Express native PME and PCI
> > > >>> Express hotplug will always use the same MSI or MSI-X message for
> > > >>> signalling interrupts, which implies that the same vector will be
> > > >>> used by both of them. Also, the VC service does not use interrupts
> > > >>> at all. Moreover, is not clear which of the vectors allocated by
> > > >>> pci_enable_msix() will be used for PME and hotplug and which of
> > > >>> them will be used for AER if all of these services are configured.
> > > >>>
> > > >>> For these reasons, rework the allocation of interrupts for PCI
> > > >>> Express ports so that at most two vectors are allocated and ensure
> > > >>> that the right vector will be used for the right purpose.
>
> Appended is a cleaned-up version of the patch that also contains comments
> with references to the appropriate sections of the PCI Express spec.
We'll need testing here in any case; I'll have to re-read the specs carefully
to see if we can really rely on the vector numbers you have here (at first
glance Kenji-san's approach seems more robust).
--
Jesse Barnes, Intel Open Source Technology Center
On Saturday 10 January 2009, Jesse Barnes wrote:
> On Sunday, January 4, 2009 2:48 pm Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > The PCI Express port driver contains a quirk that prevents root
> > ports from using MSI, but there is no explanation for it and the
> > PCI Express specification clearly allows root ports to use MSI.
>
> Ah ignore my last message, it really does look like some chipsets are just
> buggy here, and the quirk only affects a few, so it should be pretty harmless
> for the most part (iow I don't think this patch is a good idea on those
> chipsets).
But this means we're not going to use MSI for PCIe PME at all, for example.
I'd be very much interested in learning which chipsets are affected and whether
we can make the quirk more specific.
Thanks,
Rafael
On Saturday 10 January 2009, Jesse Barnes wrote:
> On Thursday, January 8, 2009 12:45 pm Rafael J. Wysocki wrote:
> > On Thursday 08 January 2009, Rafael J. Wysocki wrote:
> > > On Thursday 08 January 2009, Kenji Kaneshige wrote:
> > > > Rafael J. Wysocki wrote:
> > > > > On Thursday 08 January 2009, Kenji Kaneshige wrote:
> > > > >> Rafael J. Wysocki wrote:
> > > > >>> From: Rafael J. Wysocki <[email protected]>
> > > > >>>
> > > > >>> If MSI-X interrupt mode is used by the PCI Express port driver, too
> > > > >>> many vectors are allocated and it is not ensured that the right
> > > > >>> vectors will be used for various services. Namely, the PCI Express
> > > > >>> specification states that both PCI Express native PME and PCI
> > > > >>> Express hotplug will always use the same MSI or MSI-X message for
> > > > >>> signalling interrupts, which implies that the same vector will be
> > > > >>> used by both of them. Also, the VC service does not use interrupts
> > > > >>> at all. Moreover, is not clear which of the vectors allocated by
> > > > >>> pci_enable_msix() will be used for PME and hotplug and which of
> > > > >>> them will be used for AER if all of these services are configured.
> > > > >>>
> > > > >>> For these reasons, rework the allocation of interrupts for PCI
> > > > >>> Express ports so that at most two vectors are allocated and ensure
> > > > >>> that the right vector will be used for the right purpose.
> >
> > Appended is a cleaned-up version of the patch that also contains comments
> > with references to the appropriate sections of the PCI Express spec.
>
> We'll need testing here in any case; I'll have to re-read the specs carefully
> to see if we can really rely on the vector numbers you have here (at first
> glance Kenji-san's approach seems more robust).
It has a catch too. Namely, is there any warranty that we'll get the same
vectors when the second pci_enable_msix() is called and that they will be in
the same order?
Rafael
Rafael J. Wysocki wrote:
> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>> Rafael J. Wysocki wrote:
>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>>>> Rafael J. Wysocki wrote:
>>>>> From: Rafael J. Wysocki <[email protected]>
>>>>>
>>>>> If MSI-X interrupt mode is used by the PCI Express port driver, too
>>>>> many vectors are allocated and it is not ensured that the right
>>>>> vectors will be used for various services. Namely, the PCI Express
>>>>> specification states that both PCI Express native PME and PCI Express
>>>>> hotplug will always use the same MSI or MSI-X message for signalling
>>>>> interrupts, which implies that the same vector will be used by both
>>>>> of them. Also, the VC service does not use interrupts at all.
>>>>> Moreover, is not clear which of the vectors allocated by
>>>>> pci_enable_msix() will be used for PME and hotplug and which of them
>>>>> will be used for AER if all of these services are configured.
>>>>>
>>>>> For these reasons, rework the allocation of interrupts for PCI
>>>>> Express ports so that at most two vectors are allocated and ensure
>>>>> that the right vector will be used for the right purpose.
>>>>>
>>>>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>>>>> ---
>>>>> drivers/pci/pcie/portdrv.h | 1
>>>>> drivers/pci/pcie/portdrv_core.c | 155 +++++++++++++++++++++++++++++-----------
>>>>> 2 files changed, 117 insertions(+), 39 deletions(-)
>>>>>
>>>>> Index: linux-2.6/drivers/pci/pcie/portdrv.h
>>>>> ===================================================================
>>>>> --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
>>>>> +++ linux-2.6/drivers/pci/pcie/portdrv.h
>>>>> @@ -25,6 +25,7 @@
>>>>> #define PCIE_CAPABILITIES_REG 0x2
>>>>> #define PCIE_SLOT_CAPABILITIES_REG 0x14
>>>>> #define PCIE_PORT_DEVICE_MAXSERVICES 4
>>>>> +#define PORT_MSI_VECTOR_MASK 0x1f
>>>>>
>>>>> #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
>>>>>
>>>>> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
>>>>> ===================================================================
>>>>> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
>>>>> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
>>>>> @@ -30,55 +30,120 @@ static void release_pcie_device(struct d
>>>>> }
>>>>>
>>>>> /**
>>>>> + * fix_up_vectors - ensure the right ordering of MSI-X interrupt vectors
>>>>> + * @dev: PCI Express port that is going to use the vectors
>>>>> + * @vectors: Array of interrupt vectors to check (2 entries)
>>>>> + *
>>>>> + * Return value:
>>>>> + * 0 on success, error code if the values read from config registers are not as
>>>>> + * expected
>>>>> + *
>>>>> + * If this function is called, we are going to use two interrupt vectors which
>>>>> + * may be different, but we have to make sure they are in the right order such
>>>>> + * that the vector to be used for PME and hotplug signalling is the first one.
>>>>> + *
>>>>> + * NOTE: The assumption here is that MSI message offset (with respect to the
>>>>> + * base Message Data) equal to N corresponds to index N in the array of vectors
>>>>> + * returned by pci_enable_msix().
>>>>> + */
>>>> I've posted the similar patch recently, which doesn't have this
>>>> assumption.
>>> Actually, this assumption is in agreement with PCI Express Base Specification
>>> 2.0, which very clearly states in Section 7.8.2 that:
>>>
>>> "[...] Interrupt Message Number ? This field indicates which
>>> MSI/MSI-X vector is used for the interrupt message generated in
>>> association with any of the status bits of this Capability structure.
>>>
>>> [...]
>>>
>>> For MSI-X, the value in this field indicates which MSI-X Table
>>> entry is used to generate the interrupt message. The entry must
>>> be one of the first 32 entries even if the Function implements
>>> more than 32 entries. For a given MSI-X implementation, the
>>> entry must remain constant."
>>>
>> Though I might be misunderstanding something, my understanding is
>> that this implies that there can be more MSI-X table entries than
>> number of interrupts, and PME/HotPlug or AER interrupts can be
>> mapped to other than first two entries.
>
> We are requesting two vectors with my patch and they will be assigned to the
> first two entries in the MSI-X table. If my understanding of the MSI-X code is
> corrtect, there won't be more MSI-X table entries set up for the port.
>
>> If my understanding is correct, your patch would not work if PME/HotPlug or
>> AER interrupts were mapped to other than first two.
>
> In this case it's going to fall back to MSI w/ one vector, because we're going
> to find that offset > 1 in one of the tests in fix_up_vectors().
>
>> Regardless of my understanding is correct or not, I have another
>> concern. I think there are two interrupts for PCIe port service
>> currently as your patch indicates. But new interrupts can be added
>> in the future. With the assumption, PME/HotPlug and/or AER interrupts
>> would not work if the first several entries are mapped to other new
>> interrupts. Therefore, with the assumption, we need to fix MSI-X
>> initialization code whenever new interrupts are defined.
>
> In that case we'll need to allocate more vectors, so we'll need to change the
> code anyway. Also, we'll have to add the tests for the new interrupts to
> the code, because there will have to be new registers defined to read the
> MSI offsets or MSI-X table indices from.
>
I'm sorry for delay. I've just resumed.
Yes, we need to change the code if we use new port service that
uses new interrupt. But there would be a delay between hardware
update and kernel update. I think currently supported MSI-X
interrupts should keep working with the new hardware even before
the kernel supports new port service.
> I very much prefer to keep the code as simple as possible as long as we can.
I agree with this.
Thanks,
Kenji Kaneshige
Rafael J. Wysocki wrote:
> On Saturday 10 January 2009, Jesse Barnes wrote:
>> On Thursday, January 8, 2009 12:45 pm Rafael J. Wysocki wrote:
>>> On Thursday 08 January 2009, Rafael J. Wysocki wrote:
>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>>>>> Rafael J. Wysocki wrote:
>>>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>>>>>>> Rafael J. Wysocki wrote:
>>>>>>>> From: Rafael J. Wysocki <[email protected]>
>>>>>>>>
>>>>>>>> If MSI-X interrupt mode is used by the PCI Express port driver, too
>>>>>>>> many vectors are allocated and it is not ensured that the right
>>>>>>>> vectors will be used for various services. Namely, the PCI Express
>>>>>>>> specification states that both PCI Express native PME and PCI
>>>>>>>> Express hotplug will always use the same MSI or MSI-X message for
>>>>>>>> signalling interrupts, which implies that the same vector will be
>>>>>>>> used by both of them. Also, the VC service does not use interrupts
>>>>>>>> at all. Moreover, is not clear which of the vectors allocated by
>>>>>>>> pci_enable_msix() will be used for PME and hotplug and which of
>>>>>>>> them will be used for AER if all of these services are configured.
>>>>>>>>
>>>>>>>> For these reasons, rework the allocation of interrupts for PCI
>>>>>>>> Express ports so that at most two vectors are allocated and ensure
>>>>>>>> that the right vector will be used for the right purpose.
>>> Appended is a cleaned-up version of the patch that also contains comments
>>> with references to the appropriate sections of the PCI Express spec.
>> We'll need testing here in any case; I'll have to re-read the specs carefully
>> to see if we can really rely on the vector numbers you have here (at first
>> glance Kenji-san's approach seems more robust).
>
> It has a catch too. Namely, is there any warranty that we'll get the same
> vectors when the second pci_enable_msix() is called and that they will be in
> the same order?
>
In my understanding, the following description of Interrupt Message
Number in the PCI Express spec explains this.
"[...] For a given MSI-X implementation, the entry must remain constant. [...]"
Thanks,
Kenji Kaneshige
On Tuesday 13 January 2009, Kenji Kaneshige wrote:
> Rafael J. Wysocki wrote:
> > On Saturday 10 January 2009, Jesse Barnes wrote:
> >> On Thursday, January 8, 2009 12:45 pm Rafael J. Wysocki wrote:
> >>> On Thursday 08 January 2009, Rafael J. Wysocki wrote:
> >>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
> >>>>> Rafael J. Wysocki wrote:
> >>>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
> >>>>>>> Rafael J. Wysocki wrote:
> >>>>>>>> From: Rafael J. Wysocki <[email protected]>
> >>>>>>>>
> >>>>>>>> If MSI-X interrupt mode is used by the PCI Express port driver, too
> >>>>>>>> many vectors are allocated and it is not ensured that the right
> >>>>>>>> vectors will be used for various services. Namely, the PCI Express
> >>>>>>>> specification states that both PCI Express native PME and PCI
> >>>>>>>> Express hotplug will always use the same MSI or MSI-X message for
> >>>>>>>> signalling interrupts, which implies that the same vector will be
> >>>>>>>> used by both of them. Also, the VC service does not use interrupts
> >>>>>>>> at all. Moreover, is not clear which of the vectors allocated by
> >>>>>>>> pci_enable_msix() will be used for PME and hotplug and which of
> >>>>>>>> them will be used for AER if all of these services are configured.
> >>>>>>>>
> >>>>>>>> For these reasons, rework the allocation of interrupts for PCI
> >>>>>>>> Express ports so that at most two vectors are allocated and ensure
> >>>>>>>> that the right vector will be used for the right purpose.
> >>> Appended is a cleaned-up version of the patch that also contains comments
> >>> with references to the appropriate sections of the PCI Express spec.
> >> We'll need testing here in any case; I'll have to re-read the specs carefully
> >> to see if we can really rely on the vector numbers you have here (at first
> >> glance Kenji-san's approach seems more robust).
> >
> > It has a catch too. Namely, is there any warranty that we'll get the same
> > vectors when the second pci_enable_msix() is called and that they will be in
> > the same order?
> >
>
> In my understanding, the following description of Interrupt Message
> Number in the PCI Express spec explains this.
>
> "[...] For a given MSI-X implementation, the entry must remain constant. [...]"
The entry - yes, but the associated interrupt vector as sent to the CPU? That
seems to be platform-dependent.
Anyway, I'm going to send an updated patch shortly.
Thanks,
Rafael
Rafael J. Wysocki wrote:
> On Tuesday 13 January 2009, Kenji Kaneshige wrote:
>> Rafael J. Wysocki wrote:
>>> On Saturday 10 January 2009, Jesse Barnes wrote:
>>>> On Thursday, January 8, 2009 12:45 pm Rafael J. Wysocki wrote:
>>>>> On Thursday 08 January 2009, Rafael J. Wysocki wrote:
>>>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>>>>>>> Rafael J. Wysocki wrote:
>>>>>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
>>>>>>>>> Rafael J. Wysocki wrote:
>>>>>>>>>> From: Rafael J. Wysocki <[email protected]>
>>>>>>>>>>
>>>>>>>>>> If MSI-X interrupt mode is used by the PCI Express port driver, too
>>>>>>>>>> many vectors are allocated and it is not ensured that the right
>>>>>>>>>> vectors will be used for various services. Namely, the PCI Express
>>>>>>>>>> specification states that both PCI Express native PME and PCI
>>>>>>>>>> Express hotplug will always use the same MSI or MSI-X message for
>>>>>>>>>> signalling interrupts, which implies that the same vector will be
>>>>>>>>>> used by both of them. Also, the VC service does not use interrupts
>>>>>>>>>> at all. Moreover, is not clear which of the vectors allocated by
>>>>>>>>>> pci_enable_msix() will be used for PME and hotplug and which of
>>>>>>>>>> them will be used for AER if all of these services are configured.
>>>>>>>>>>
>>>>>>>>>> For these reasons, rework the allocation of interrupts for PCI
>>>>>>>>>> Express ports so that at most two vectors are allocated and ensure
>>>>>>>>>> that the right vector will be used for the right purpose.
>>>>> Appended is a cleaned-up version of the patch that also contains comments
>>>>> with references to the appropriate sections of the PCI Express spec.
>>>> We'll need testing here in any case; I'll have to re-read the specs carefully
>>>> to see if we can really rely on the vector numbers you have here (at first
>>>> glance Kenji-san's approach seems more robust).
>>> It has a catch too. Namely, is there any warranty that we'll get the same
>>> vectors when the second pci_enable_msix() is called and that they will be in
>>> the same order?
>>>
>> In my understanding, the following description of Interrupt Message
>> Number in the PCI Express spec explains this.
>>
>> "[...] For a given MSI-X implementation, the entry must remain constant. [...]"
>
> The entry - yes, but the associated interrupt vector as sent to the CPU? That
> seems to be platform-dependent.
>
I'm sorry but I don't understand what the problem is.
Do you mean pci_disable_msix() doesn't work on some platforms?
Thanks,
Kenji Kaneshige
On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> Rafael J. Wysocki wrote:
> > On Tuesday 13 January 2009, Kenji Kaneshige wrote:
> >> Rafael J. Wysocki wrote:
> >>> On Saturday 10 January 2009, Jesse Barnes wrote:
> >>>> On Thursday, January 8, 2009 12:45 pm Rafael J. Wysocki wrote:
> >>>>> On Thursday 08 January 2009, Rafael J. Wysocki wrote:
> >>>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
> >>>>>>> Rafael J. Wysocki wrote:
> >>>>>>>> On Thursday 08 January 2009, Kenji Kaneshige wrote:
> >>>>>>>>> Rafael J. Wysocki wrote:
> >>>>>>>>>> From: Rafael J. Wysocki <[email protected]>
> >>>>>>>>>>
> >>>>>>>>>> If MSI-X interrupt mode is used by the PCI Express port driver, too
> >>>>>>>>>> many vectors are allocated and it is not ensured that the right
> >>>>>>>>>> vectors will be used for various services. Namely, the PCI Express
> >>>>>>>>>> specification states that both PCI Express native PME and PCI
> >>>>>>>>>> Express hotplug will always use the same MSI or MSI-X message for
> >>>>>>>>>> signalling interrupts, which implies that the same vector will be
> >>>>>>>>>> used by both of them. Also, the VC service does not use interrupts
> >>>>>>>>>> at all. Moreover, is not clear which of the vectors allocated by
> >>>>>>>>>> pci_enable_msix() will be used for PME and hotplug and which of
> >>>>>>>>>> them will be used for AER if all of these services are configured.
> >>>>>>>>>>
> >>>>>>>>>> For these reasons, rework the allocation of interrupts for PCI
> >>>>>>>>>> Express ports so that at most two vectors are allocated and ensure
> >>>>>>>>>> that the right vector will be used for the right purpose.
> >>>>> Appended is a cleaned-up version of the patch that also contains comments
> >>>>> with references to the appropriate sections of the PCI Express spec.
> >>>> We'll need testing here in any case; I'll have to re-read the specs carefully
> >>>> to see if we can really rely on the vector numbers you have here (at first
> >>>> glance Kenji-san's approach seems more robust).
> >>> It has a catch too. Namely, is there any warranty that we'll get the same
> >>> vectors when the second pci_enable_msix() is called and that they will be in
> >>> the same order?
> >>>
> >> In my understanding, the following description of Interrupt Message
> >> Number in the PCI Express spec explains this.
> >>
> >> "[...] For a given MSI-X implementation, the entry must remain constant. [...]"
> >
> > The entry - yes, but the associated interrupt vector as sent to the CPU? That
> > seems to be platform-dependent.
> >
>
> I'm sorry but I don't understand what the problem is.
> Do you mean pci_disable_msix() doesn't work on some platforms?
No, I don't. It was just confusion on my side, sorry.
Please have a look at the new version of the patch I sent yesterday
(http://marc.info/?l=linux-pci&m=123185510828181&w=4).
Thanks,
Rafael
On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
[...]
> > I'm sorry but I don't understand what the problem is.
> > Do you mean pci_disable_msix() doesn't work on some platforms?
>
> No, I don't. It was just confusion on my side, sorry.
>
> Please have a look at the new version of the patch I sent yesterday
> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
BTW, in your patch the first dummy pci_enable_msix() allocates just one
vector, which means that the contents of both
msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
if my reading of the spec (PCI 3.0 in this case) is correct.
However, if the second pci_enable_msix() allocates two vectors, the contents of the
message number fields in the PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS
registers may change as a result.
Thanks,
Rafael
Rafael J. Wysocki wrote:
> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> [...]
>>> I'm sorry but I don't understand what the problem is.
>>> Do you mean pci_disable_msix() doesn't work on some platforms?
>> No, I don't. It was just confusion on my side, sorry.
>>
>> Please have a look at the new version of the patch I sent yesterday
>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
>
> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> vector, which means that the contents of both
> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> if my reading of the spec (PCI 3.0 in this case) is correct.
>
> However, if the second pci_enable_msix() allocates two vectors, the contents of the
> message number fields in the PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS
> registers may change as a result.
>
For MSI, the interrupt message number field may change if the Multiple
Message Enable Field was changed. On the other hand, for MSI-X, I think
the interrupt message number field is constant. This is what the "[...]
For a given MSI-X implementation, the entry must remain constant. [...]"
explains in PCI Express spec, I think.
Thanks,
Kenji Kaneshige
Rafael J. Wysocki wrote:
> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> [...]
>>> I'm sorry but I don't understand what the problem is.
>>> Do you mean pci_disable_msix() doesn't work on some platforms?
>> No, I don't. It was just confusion on my side, sorry.
>>
>> Please have a look at the new version of the patch I sent yesterday
>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
>
> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> vector, which means that the contents of both
> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> if my reading of the spec (PCI 3.0 in this case) is correct.
According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
it seems your reading is not correct.
Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
Spec says that the port could be designed for software to configure entries
assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
So if there is just one vector, it could be AAAA.
Thanks,
H.Seto
Hidetoshi Seto wrote:
> Rafael J. Wysocki wrote:
>> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
>>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
>> [...]
>>>> I'm sorry but I don't understand what the problem is.
>>>> Do you mean pci_disable_msix() doesn't work on some platforms?
>>> No, I don't. It was just confusion on my side, sorry.
>>>
>>> Please have a look at the new version of the patch I sent yesterday
>>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
>> BTW, in your patch the first dummy pci_enable_msix() allocates just one
>> vector, which means that the contents of both
>> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
>> if my reading of the spec (PCI 3.0 in this case) is correct.
>
> According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
> it seems your reading is not correct.
>
> Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
> for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
> Spec says that the port could be designed for software to configure entries
> assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
>
> So if there is just one vector, it could be AAAA.
>
BTW, I don't think pci_enable_msix() allows this kind of configuration.
With the dummy pci_enable_msix() in my patch, it would be A---, I think.
Thanks,
Kenji Kaneshige
On Thursday 15 January 2009, Kenji Kaneshige wrote:
> Hidetoshi Seto wrote:
> > Rafael J. Wysocki wrote:
> >> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
> >>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> >> [...]
> >>>> I'm sorry but I don't understand what the problem is.
> >>>> Do you mean pci_disable_msix() doesn't work on some platforms?
> >>> No, I don't. It was just confusion on my side, sorry.
> >>>
> >>> Please have a look at the new version of the patch I sent yesterday
> >>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
> >> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> >> vector, which means that the contents of both
> >> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> >> if my reading of the spec (PCI 3.0 in this case) is correct.
> >
> > According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
> > it seems your reading is not correct.
> >
> > Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
> > for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
> > Spec says that the port could be designed for software to configure entries
> > assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
> >
> > So if there is just one vector, it could be AAAA.
Our pci_enable_msix() doesn't do that. It will always do A---.
> BTW, I don't think pci_enable_msix() allows this kind of configuration.
> With the dummy pci_enable_msix() in my patch, it would be A---, I think.
And that exactly is why I'm not sure it's correct.
Namely, if only the first entry is configured, the device is only able to use
one vector, represented by this entry, for any purpose. Now, for instance, for
PCIE_CAPABILITIES_REG, there are two possibilities:
(1) the value in the register always points to the _valid_ entry in the MSI-X
table and that would be the first one,
(2) the value in the register may point to an _invalid_ entry (1 - 3).
You seem to assume that (2) is the case, but I'm not sure (that should follow
from the PCI Express spec, but it clearly doesn't, at least I couldn't find
any pointer in the spec). IMO it wouldn't make sense, because the port
wouldn't have been able to generate interrupts for this service if only one
vector had been configured.
Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
be possible, the second pci_enable_msix() in your patch will fail.
In any case, I think we should
(a) get the number of the port's MSI-X table entries _first_, without enabling
MSI-X,
(b) allocate as many MSI-X vectors as indicated by this number, even though
some of them may not be used,
(c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
which vector has been allocated to which service.
Thanks,
Rafael
On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> On Thursday 15 January 2009, Kenji Kaneshige wrote:
> > Hidetoshi Seto wrote:
> > > Rafael J. Wysocki wrote:
> > >> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
> > >>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> > >> [...]
> > >>>> I'm sorry but I don't understand what the problem is.
> > >>>> Do you mean pci_disable_msix() doesn't work on some platforms?
> > >>> No, I don't. It was just confusion on my side, sorry.
> > >>>
> > >>> Please have a look at the new version of the patch I sent yesterday
> > >>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
> > >> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> > >> vector, which means that the contents of both
> > >> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> > >> if my reading of the spec (PCI 3.0 in this case) is correct.
> > >
> > > According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
> > > it seems your reading is not correct.
> > >
> > > Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
> > > for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
> > > Spec says that the port could be designed for software to configure entries
> > > assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
> > >
> > > So if there is just one vector, it could be AAAA.
>
> Our pci_enable_msix() doesn't do that. It will always do A---.
>
> > BTW, I don't think pci_enable_msix() allows this kind of configuration.
> > With the dummy pci_enable_msix() in my patch, it would be A---, I think.
>
> And that exactly is why I'm not sure it's correct.
>
> Namely, if only the first entry is configured, the device is only able to use
> one vector, represented by this entry, for any purpose. Now, for instance, for
> PCIE_CAPABILITIES_REG, there are two possibilities:
> (1) the value in the register always points to the _valid_ entry in the MSI-X
> table and that would be the first one,
> (2) the value in the register may point to an _invalid_ entry (1 - 3).
>
> You seem to assume that (2) is the case, but I'm not sure (that should follow
> from the PCI Express spec, but it clearly doesn't, at least I couldn't find
> any pointer in the spec). IMO it wouldn't make sense, because the port
> wouldn't have been able to generate interrupts for this service if only one
> vector had been configured.
>
> Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
> PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
> be possible, the second pci_enable_msix() in your patch will fail.
>
> In any case, I think we should
> (a) get the number of the port's MSI-X table entries _first_, without enabling
> MSI-X,
> (b) allocate as many MSI-X vectors as indicated by this number, even though
> some of them may not be used,
> (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
> which vector has been allocated to which service.
(d) mask the unused vectors.
Thanks,
Rafael
On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> > On Thursday 15 January 2009, Kenji Kaneshige wrote:
> > > Hidetoshi Seto wrote:
> > > > Rafael J. Wysocki wrote:
> > > >> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
> > > >>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> > > >> [...]
> > > >>>> I'm sorry but I don't understand what the problem is.
> > > >>>> Do you mean pci_disable_msix() doesn't work on some platforms?
> > > >>> No, I don't. It was just confusion on my side, sorry.
> > > >>>
> > > >>> Please have a look at the new version of the patch I sent yesterday
> > > >>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
> > > >> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> > > >> vector, which means that the contents of both
> > > >> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> > > >> if my reading of the spec (PCI 3.0 in this case) is correct.
> > > >
> > > > According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
> > > > it seems your reading is not correct.
> > > >
> > > > Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
> > > > for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
> > > > Spec says that the port could be designed for software to configure entries
> > > > assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
> > > >
> > > > So if there is just one vector, it could be AAAA.
> >
> > Our pci_enable_msix() doesn't do that. It will always do A---.
> >
> > > BTW, I don't think pci_enable_msix() allows this kind of configuration.
> > > With the dummy pci_enable_msix() in my patch, it would be A---, I think.
> >
> > And that exactly is why I'm not sure it's correct.
> >
> > Namely, if only the first entry is configured, the device is only able to use
> > one vector, represented by this entry, for any purpose. Now, for instance, for
> > PCIE_CAPABILITIES_REG, there are two possibilities:
> > (1) the value in the register always points to the _valid_ entry in the MSI-X
> > table and that would be the first one,
> > (2) the value in the register may point to an _invalid_ entry (1 - 3).
> >
> > You seem to assume that (2) is the case, but I'm not sure (that should follow
> > from the PCI Express spec, but it clearly doesn't, at least I couldn't find
> > any pointer in the spec). IMO it wouldn't make sense, because the port
> > wouldn't have been able to generate interrupts for this service if only one
> > vector had been configured.
> >
> > Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
> > PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
> > be possible, the second pci_enable_msix() in your patch will fail.
> >
> > In any case, I think we should
> > (a) get the number of the port's MSI-X table entries _first_, without enabling
> > MSI-X,
> > (b) allocate as many MSI-X vectors as indicated by this number, even though
> > some of them may not be used,
> > (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
> > which vector has been allocated to which service.
>
> (d) mask the unused vectors.
However, it's probably simpler to do something like in your patch, although
I don't like the dummy enabling of MSI-X at all.
Thanks,
Rafael
Rafael J. Wysocki wrote:
> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
>> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
>>> On Thursday 15 January 2009, Kenji Kaneshige wrote:
>>>> Hidetoshi Seto wrote:
>>>>> Rafael J. Wysocki wrote:
>>>>>> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
>>>>>>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
>>>>>> [...]
>>>>>>>> I'm sorry but I don't understand what the problem is.
>>>>>>>> Do you mean pci_disable_msix() doesn't work on some platforms?
>>>>>>> No, I don't. It was just confusion on my side, sorry.
>>>>>>>
>>>>>>> Please have a look at the new version of the patch I sent yesterday
>>>>>>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
>>>>>> BTW, in your patch the first dummy pci_enable_msix() allocates just one
>>>>>> vector, which means that the contents of both
>>>>>> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
>>>>>> if my reading of the spec (PCI 3.0 in this case) is correct.
>>>>> According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
>>>>> it seems your reading is not correct.
>>>>>
>>>>> Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
>>>>> for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
>>>>> Spec says that the port could be designed for software to configure entries
>>>>> assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
>>>>>
>>>>> So if there is just one vector, it could be AAAA.
>>> Our pci_enable_msix() doesn't do that. It will always do A---.
Just above the implementation note, the spec says:
"Software is permitted to configure multiple MSI-X Table entries
with the same vector, and this may indeed be necessary when fewer
vectors are allocated than requested."
while "software" refers to either system software or device driver software.
So, yes, the our current implementation of system software (=Linux kernel)
doesn't do that.
However I'd like to note that doing that by "software" is not prohibited
in PCI 3.0.
>>>> BTW, I don't think pci_enable_msix() allows this kind of configuration.
>>>> With the dummy pci_enable_msix() in my patch, it would be A---, I think.
>>> And that exactly is why I'm not sure it's correct.
>>>
>>> Namely, if only the first entry is configured, the device is only able to use
>>> one vector, represented by this entry, for any purpose. Now, for instance, for
>>> PCIE_CAPABILITIES_REG, there are two possibilities:
>>> (1) the value in the register always points to the _valid_ entry in the MSI-X
>>> table and that would be the first one,
>>> (2) the value in the register may point to an _invalid_ entry (1 - 3).
The "invalid entry" is not defined.
>>> You seem to assume that (2) is the case, but I'm not sure (that should follow
>>> from the PCI Express spec, but it clearly doesn't, at least I couldn't find
>>> any pointer in the spec). IMO it wouldn't make sense, because the port
>>> wouldn't have been able to generate interrupts for this service if only one
>>> vector had been configured.
>>>
>>> Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
>>> PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
>>> be possible, the second pci_enable_msix() in your patch will fail.
>>>
>>> In any case, I think we should
>>> (a) get the number of the port's MSI-X table entries _first_, without enabling
>>> MSI-X,
We cannot do this because both of PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS
will indicate the number for MSI, not for MSI-X without enabling MSI-X.
>>> (b) allocate as many MSI-X vectors as indicated by this number, even though
>>> some of them may not be used,
>>> (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
>>> which vector has been allocated to which service.
>> (d) mask the unused vectors.
>
> However, it's probably simpler to do something like in your patch, although
> I don't like the dummy enabling of MSI-X at all.
How about this?
#define PCIE_MSIX_ENTRY_HPPME MAGIC_NUMBER_1
#define PCIE_MSIX_ENTRY_AER MAGIC_NUMBER_2
struct msix_entry msix_entries[] =
{{0, PCIE_MSIX_ENTRY_HPPME}, {0, PCIE_MSIX_ENTRY_AER}};
status = pci_enable_msix(dev, msix_entries, nvec);
And modify pci_enable_msix() to handle these magic numbers.
Thanks,
H.Seto
On Friday 16 January 2009, Hidetoshi Seto wrote:
> Rafael J. Wysocki wrote:
> > On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> >> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> >>> On Thursday 15 January 2009, Kenji Kaneshige wrote:
> >>>> Hidetoshi Seto wrote:
> >>>>> Rafael J. Wysocki wrote:
> >>>>>> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
> >>>>>>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> >>>>>> [...]
> >>>>>>>> I'm sorry but I don't understand what the problem is.
> >>>>>>>> Do you mean pci_disable_msix() doesn't work on some platforms?
> >>>>>>> No, I don't. It was just confusion on my side, sorry.
> >>>>>>>
> >>>>>>> Please have a look at the new version of the patch I sent yesterday
> >>>>>>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
> >>>>>> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> >>>>>> vector, which means that the contents of both
> >>>>>> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> >>>>>> if my reading of the spec (PCI 3.0 in this case) is correct.
> >>>>> According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
> >>>>> it seems your reading is not correct.
> >>>>>
> >>>>> Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
> >>>>> for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
> >>>>> Spec says that the port could be designed for software to configure entries
> >>>>> assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
> >>>>>
> >>>>> So if there is just one vector, it could be AAAA.
> >>> Our pci_enable_msix() doesn't do that. It will always do A---.
>
> Just above the implementation note, the spec says:
> "Software is permitted to configure multiple MSI-X Table entries
> with the same vector, and this may indeed be necessary when fewer
> vectors are allocated than requested."
> while "software" refers to either system software or device driver software.
>
> So, yes, the our current implementation of system software (=Linux kernel)
> doesn't do that.
> However I'd like to note that doing that by "software" is not prohibited
> in PCI 3.0.
>
> >>>> BTW, I don't think pci_enable_msix() allows this kind of configuration.
> >>>> With the dummy pci_enable_msix() in my patch, it would be A---, I think.
> >>> And that exactly is why I'm not sure it's correct.
> >>>
> >>> Namely, if only the first entry is configured, the device is only able to use
> >>> one vector, represented by this entry, for any purpose. Now, for instance, for
> >>> PCIE_CAPABILITIES_REG, there are two possibilities:
> >>> (1) the value in the register always points to the _valid_ entry in the MSI-X
> >>> table and that would be the first one,
> >>> (2) the value in the register may point to an _invalid_ entry (1 - 3).
>
> The "invalid entry" is not defined.
s/invalid/unused/ (or masked permanently)
> >>> You seem to assume that (2) is the case, but I'm not sure (that should follow
> >>> from the PCI Express spec, but it clearly doesn't, at least I couldn't find
> >>> any pointer in the spec). IMO it wouldn't make sense, because the port
> >>> wouldn't have been able to generate interrupts for this service if only one
> >>> vector had been configured.
> >>>
> >>> Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
> >>> PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
> >>> be possible, the second pci_enable_msix() in your patch will fail.
> >>>
> >>> In any case, I think we should
> >>> (a) get the number of the port's MSI-X table entries _first_, without enabling
> >>> MSI-X,
>
> We cannot do this because both of PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS
> will indicate the number for MSI, not for MSI-X without enabling MSI-X.
Yes, we can. We don't read PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS at
this point yet and the number of entries in the MSI-X table is constant
(read-only), so we can read it even before enabling MSI-X. Actually, our MSI-X
code does that already anyway.
> >>> (b) allocate as many MSI-X vectors as indicated by this number, even though
> >>> some of them may not be used,
(b) should be: call pci_enable_msix() with the last argument equal to the
number of entries in the MSI-X table or 32, whichever is smaller.
> >>> (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
> >>> which vector has been allocated to which service.
> >> (d) mask the unused vectors.
> >
> > However, it's probably simpler to do something like in your patch, although
> > I don't like the dummy enabling of MSI-X at all.
>
> How about this?
>
> #define PCIE_MSIX_ENTRY_HPPME MAGIC_NUMBER_1
> #define PCIE_MSIX_ENTRY_AER MAGIC_NUMBER_2
>
> struct msix_entry msix_entries[] =
> {{0, PCIE_MSIX_ENTRY_HPPME}, {0, PCIE_MSIX_ENTRY_AER}};
> status = pci_enable_msix(dev, msix_entries, nvec);
>
> And modify pci_enable_msix() to handle these magic numbers.
Quite frankly, I prefer the procedure described above in (a) - (d). I'll try
to implement it and we'll see how it looks like.
Thanks,
Rafael
On Friday 16 January 2009, Rafael J. Wysocki wrote:
> On Friday 16 January 2009, Hidetoshi Seto wrote:
> > Rafael J. Wysocki wrote:
> > > On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> > >> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
> > >>> On Thursday 15 January 2009, Kenji Kaneshige wrote:
> > >>>> Hidetoshi Seto wrote:
> > >>>>> Rafael J. Wysocki wrote:
> > >>>>>> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
> > >>>>>>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
> > >>>>>> [...]
> > >>>>>>>> I'm sorry but I don't understand what the problem is.
> > >>>>>>>> Do you mean pci_disable_msix() doesn't work on some platforms?
> > >>>>>>> No, I don't. It was just confusion on my side, sorry.
> > >>>>>>>
> > >>>>>>> Please have a look at the new version of the patch I sent yesterday
> > >>>>>>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
> > >>>>>> BTW, in your patch the first dummy pci_enable_msix() allocates just one
> > >>>>>> vector, which means that the contents of both
> > >>>>>> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
> > >>>>>> if my reading of the spec (PCI 3.0 in this case) is correct.
> > >>>>> According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
> > >>>>> it seems your reading is not correct.
> > >>>>>
> > >>>>> Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
> > >>>>> for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
> > >>>>> Spec says that the port could be designed for software to configure entries
> > >>>>> assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
> > >>>>>
> > >>>>> So if there is just one vector, it could be AAAA.
> > >>> Our pci_enable_msix() doesn't do that. It will always do A---.
> >
> > Just above the implementation note, the spec says:
> > "Software is permitted to configure multiple MSI-X Table entries
> > with the same vector, and this may indeed be necessary when fewer
> > vectors are allocated than requested."
> > while "software" refers to either system software or device driver software.
> >
> > So, yes, the our current implementation of system software (=Linux kernel)
> > doesn't do that.
> > However I'd like to note that doing that by "software" is not prohibited
> > in PCI 3.0.
> >
> > >>>> BTW, I don't think pci_enable_msix() allows this kind of configuration.
> > >>>> With the dummy pci_enable_msix() in my patch, it would be A---, I think.
> > >>> And that exactly is why I'm not sure it's correct.
> > >>>
> > >>> Namely, if only the first entry is configured, the device is only able to use
> > >>> one vector, represented by this entry, for any purpose. Now, for instance, for
> > >>> PCIE_CAPABILITIES_REG, there are two possibilities:
> > >>> (1) the value in the register always points to the _valid_ entry in the MSI-X
> > >>> table and that would be the first one,
> > >>> (2) the value in the register may point to an _invalid_ entry (1 - 3).
> >
> > The "invalid entry" is not defined.
>
> s/invalid/unused/ (or masked permanently)
>
> > >>> You seem to assume that (2) is the case, but I'm not sure (that should follow
> > >>> from the PCI Express spec, but it clearly doesn't, at least I couldn't find
> > >>> any pointer in the spec). IMO it wouldn't make sense, because the port
> > >>> wouldn't have been able to generate interrupts for this service if only one
> > >>> vector had been configured.
> > >>>
> > >>> Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
> > >>> PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
> > >>> be possible, the second pci_enable_msix() in your patch will fail.
> > >>>
> > >>> In any case, I think we should
> > >>> (a) get the number of the port's MSI-X table entries _first_, without enabling
> > >>> MSI-X,
> >
> > We cannot do this because both of PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS
> > will indicate the number for MSI, not for MSI-X without enabling MSI-X.
>
> Yes, we can. We don't read PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS at
> this point yet and the number of entries in the MSI-X table is constant
> (read-only), so we can read it even before enabling MSI-X. Actually, our MSI-X
> code does that already anyway.
>
> > >>> (b) allocate as many MSI-X vectors as indicated by this number, even though
> > >>> some of them may not be used,
>
> (b) should be: call pci_enable_msix() with the last argument equal to the
> number of entries in the MSI-X table or 32, whichever is smaller.
>
> > >>> (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
> > >>> which vector has been allocated to which service.
> > >> (d) mask the unused vectors.
> > >
> > > However, it's probably simpler to do something like in your patch, although
> > > I don't like the dummy enabling of MSI-X at all.
> >
> > How about this?
> >
> > #define PCIE_MSIX_ENTRY_HPPME MAGIC_NUMBER_1
> > #define PCIE_MSIX_ENTRY_AER MAGIC_NUMBER_2
> >
> > struct msix_entry msix_entries[] =
> > {{0, PCIE_MSIX_ENTRY_HPPME}, {0, PCIE_MSIX_ENTRY_AER}};
> > status = pci_enable_msix(dev, msix_entries, nvec);
> >
> > And modify pci_enable_msix() to handle these magic numbers.
>
> Quite frankly, I prefer the procedure described above in (a) - (d). I'll try
> to implement it and we'll see how it looks like.
I've just sent the patch in the other thread.
Thanks,
Rafael
Rafael J. Wysocki wrote:
> On Friday 16 January 2009, Rafael J. Wysocki wrote:
>> On Friday 16 January 2009, Hidetoshi Seto wrote:
>>> Rafael J. Wysocki wrote:
>>>> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
>>>>> On Thursday 15 January 2009, Rafael J. Wysocki wrote:
(snip)
>>>>>> In any case, I think we should
>>>>>> (a) get the number of the port's MSI-X table entries _first_, without enabling
>>>>>> MSI-X,
>>> We cannot do this because both of PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS
>>> will indicate the number for MSI, not for MSI-X without enabling MSI-X.
>> Yes, we can. We don't read PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS at
>> this point yet and the number of entries in the MSI-X table is constant
>> (read-only), so we can read it even before enabling MSI-X. Actually, our MSI-X
>> code does that already anyway.
Now I caught you've mean "get the size of MSI-X table."
>>>>>> (b) allocate as many MSI-X vectors as indicated by this number, even though
>>>>>> some of them may not be used,
>> (b) should be: call pci_enable_msix() with the last argument equal to the
>> number of entries in the MSI-X table or 32, whichever is smaller.
Good.
>>>>>> (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
>>>>>> which vector has been allocated to which service.
>>>>> (d) mask the unused vectors.
>>>> However, it's probably simpler to do something like in your patch, although
>>>> I don't like the dummy enabling of MSI-X at all.
>>> How about this?
>>>
>>> #define PCIE_MSIX_ENTRY_HPPME MAGIC_NUMBER_1
>>> #define PCIE_MSIX_ENTRY_AER MAGIC_NUMBER_2
>>>
>>> struct msix_entry msix_entries[] =
>>> {{0, PCIE_MSIX_ENTRY_HPPME}, {0, PCIE_MSIX_ENTRY_AER}};
>>> status = pci_enable_msix(dev, msix_entries, nvec);
>>>
>>> And modify pci_enable_msix() to handle these magic numbers.
I found a down side of my idea:
it requires modification of arch_setup_msi_irqs etc. too.
>> Quite frankly, I prefer the procedure described above in (a) - (d). I'll try
>> to implement it and we'll see how it looks like.
>
> I've just sent the patch in the other thread.
OK. I'll review it.
Thanks,
H.Seto
Rafael J. Wysocki wrote:
> On Thursday 15 January 2009, Kenji Kaneshige wrote:
>> Hidetoshi Seto wrote:
>>> Rafael J. Wysocki wrote:
>>>> On Wednesday 14 January 2009, Rafael J. Wysocki wrote:
>>>>> On Wednesday 14 January 2009, Kenji Kaneshige wrote:
>>>> [...]
>>>>>> I'm sorry but I don't understand what the problem is.
>>>>>> Do you mean pci_disable_msix() doesn't work on some platforms?
>>>>> No, I don't. It was just confusion on my side, sorry.
>>>>>
>>>>> Please have a look at the new version of the patch I sent yesterday
>>>>> (http://marc.info/?l=linux-pci&m=123185510828181&w=4).
>>>> BTW, in your patch the first dummy pci_enable_msix() allocates just one
>>>> vector, which means that the contents of both
>>>> msix_entries[idx_hppme].entry and msix_entries[idx_aer].entry will be the same,
>>>> if my reading of the spec (PCI 3.0 in this case) is correct.
>>> According to PCI 3.0 implementation note "Handling MSI-X Vector Shortage,"
>>> it seems your reading is not correct.
>>>
>>> Assume that the port have 4 entries([0-3]) in MSI-X table, and that entry[2]
>>> for hotplug/PME and entry[3] for AER, and that kernel only allocates 2 vector.
>>> Spec says that the port could be designed for software to configure entries
>>> assigning vectors{A,B} to multiple entries as ABAB, AABB, ABBB etc.
>>>
>>> So if there is just one vector, it could be AAAA.
>
> Our pci_enable_msix() doesn't do that. It will always do A---.
>
>> BTW, I don't think pci_enable_msix() allows this kind of configuration.
>> With the dummy pci_enable_msix() in my patch, it would be A---, I think.
>
> And that exactly is why I'm not sure it's correct.
>
> Namely, if only the first entry is configured, the device is only able to use
> one vector, represented by this entry, for any purpose. Now, for instance, for
> PCIE_CAPABILITIES_REG, there are two possibilities:
> (1) the value in the register always points to the _valid_ entry in the MSI-X
> table and that would be the first one,
> (2) the value in the register may point to an _invalid_ entry (1 - 3).
>
> You seem to assume that (2) is the case, but I'm not sure (that should follow
> from the PCI Express spec, but it clearly doesn't, at least I couldn't find
> any pointer in the spec). IMO it wouldn't make sense, because the port
> wouldn't have been able to generate interrupts for this service if only one
> vector had been configured.
I don't understand what "valid" and "invalid" means. And I don't
imagine how the PCI Express chipset judges "valid" or "invalid".
If those mean "unmasked", "masked" respectively, we need to unmask
MSI-X before configuring MSI-X table entries. It doesn't look
correct behavior, I think.
After all, I think what we need to do at least for reading
Interrupt Message Number is that setting "MSI Enable"/"MSI-X Enable"
fields to 0b/1b. But it might exist some other things for enabling
MSI-X, quirk handling for example. And if we enable MSI-X by hand in
in port service driver, it would add the possibility of the race
condition between port service driver and existing pci_enable_msix().
That's why I used dummy pci_enable_msix() for reading Interrupt
Message Number. As you pointed out before, it's a little ugly, but
I think it's the simplest and most reliable way so far.
>
> Still, even though (2) is the case, but both PCIE_CAPABILITIES_REG and
> PCI_ERR_ROOT_STATUS just happen to point to the same entry, which very well may
> be possible, the second pci_enable_msix() in your patch will fail.
>
Oh, I understood. As you pointed out, my patch doesn't take into
account the case that HP/PME and AER share the same MSI-X entry.
This need to be fixed. Thanks!
> In any case, I think we should
> (a) get the number of the port's MSI-X table entries _first_, without enabling
> MSI-X,
> (b) allocate as many MSI-X vectors as indicated by this number, even though
> some of them may not be used,
> (c) use PCIE_CAPABILITIES_REG and PCI_ERR_ROOT_STATUS to check
> which vector has been allocated to which service.
>
This will cause unused vectors. We need to note that there can be
multiple PCIe ports in the system. So the number of unused vectors
can become large.
Thanks,
Kenji Kaneshige