Changes since v6:
* Fixed order of operations for device specific disable function as
noticed by Alex
* Rebased onto v4.18-rc5 (no conflicts)
Changes since v5:
* Add a quirk to handle the Intel SPT PCH case (as pointed out by Alex)
* Warn in the case that we try to disable ACS redirect on a device
that doesn't have the ACS capability (also suggested by Alex)
* Collect reviewed-by tag from Alex
* Rebased onto v4.18-rc4 (no conflicts)
Changes since v4:
* Fixed a couple documentation mistakes spotted by Randy
Changes since v3:
* Removed some of the cruft that was copied from the resource_alignment
parameter (per Alex)
* A number of documentation fixes as noticed by Alex and Willy
Changes since v2:
* Rebased onto v4.18-rc1 (no conflicts)
* Minor tweaks to the documentation per Andy
* Removed the "path:" prefix and use the path parsing code
for simple devices (as it works the same). Per a suggestion from Alex
Changes since v1:
* Reworked pci_dev_str_match_path using strrchr as suggested by Alex
* Collected Christian's Acks
--
Hi,
As discussed in our PCI P2PDMA series, we'd like to add a kernel
parameter for disabling ACS redirection on selected bridges. Since
this turned out to be a small series in itself, we've decided to send
it separately from the P2P work.
This series generalizes the code that already exists for the
resource_alignment option. The first patch creates a helper function
to match PCI devices against strings, based on the code in
pci_specified_resource_alignment().
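As a rough illustration of the ID-based matching the helper performs, here is a minimal userspace sketch. The names fake_pci_ids and ids_str_match are hypothetical stand-ins, not kernel symbols, and only the "pci:" format is handled; a field of 0 acts as a wildcard, as in the existing resource_alignment code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the ID fields of struct pci_dev. */
struct fake_pci_ids {
	unsigned short vendor, device;
	unsigned short subsystem_vendor, subsystem_device;
};

/*
 * Returns 1 on match, 0 on no match, -1 if the string cannot be parsed.
 * Only the "pci:<vendor>:<device>[:<subvendor>:<subdevice>]" format is
 * handled in this sketch.
 */
static int ids_str_match(const struct fake_pci_ids *dev, const char *p)
{
	unsigned short v, d, sv = 0, sd = 0;

	if (strncmp(p, "pci:", 4) != 0)
		return -1;
	p += 4;

	/* Try the four-field form first, then fall back to two fields. */
	if (sscanf(p, "%hx:%hx:%hx:%hx", &v, &d, &sv, &sd) != 4) {
		if (sscanf(p, "%hx:%hx", &v, &d) != 2)
			return -1;
		sv = 0;
		sd = 0;
	}

	/* A value of 0 in any field matches every device. */
	return (!v || v == dev->vendor) &&
	       (!d || d == dev->device) &&
	       (!sv || sv == dev->subsystem_vendor) &&
	       (!sd || sd == dev->subsystem_device);
}
```

With a device whose vendor is 8086, both "pci:8086:0" and the exact vendor/device pair would match, while a string without the "pci:" prefix is reported as a parse error by this simplified version.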
The second patch expands the new helper to optionally take a path of
PCI devfns. This is to address Alex's renumbering concern when using
simple bus-devfns. The implementation is essentially how he described it and
similar to the Intel VT-d spec (Section 8.3.1).
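To make the path idea concrete, here is a minimal userspace sketch of the matching, assuming a hypothetical path_dev structure with an upstream pointer in place of struct pci_dev and pci_upstream_bridge(). It consumes "/slot.func" components from the right-hand end, walking upstream each time, and then compares what is left against the base address.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a PCI device: just enough to walk upstream. */
struct path_dev {
	unsigned int domain, bus, slot, func;
	const struct path_dev *upstream;	/* NULL on the root bus */
};

/* Returns 1 on match, 0 on mismatch, -1 on parse or allocation error. */
static int path_match(const struct path_dev *dev, const char *path)
{
	unsigned int seg, bus, slot, func;
	size_t len = strlen(path);
	char *wpath = malloc(len + 1);
	char end, *p;
	int ret;

	if (!wpath)
		return -1;
	memcpy(wpath, path, len + 1);

	/* Consume "/slot.func" components starting from the last one. */
	while ((p = strrchr(wpath, '/')) != NULL) {
		if (sscanf(p, "/%x.%x%c", &slot, &func, &end) != 2) {
			ret = -1;
			goto out;
		}
		if (dev->slot != slot || dev->func != func) {
			ret = 0;
			goto out;
		}
		dev = dev->upstream;
		if (!dev) {
			ret = 0;
			goto out;
		}
		*p = '\0';	/* truncate the component just checked */
	}

	/* What remains is the base address, with an optional domain. */
	if (sscanf(wpath, "%x:%x:%x.%x%c", &seg, &bus, &slot, &func, &end) != 4) {
		seg = 0;
		if (sscanf(wpath, "%x:%x.%x%c", &bus, &slot, &func, &end) != 3) {
			ret = -1;
			goto out;
		}
	}

	ret = seg == dev->domain && bus == dev->bus &&
	      slot == dev->slot && func == dev->func;
out:
	free(wpath);
	return ret;
}
```

In this model, a device at 6d:00.0 sitting behind a bridge at 00:1d.0 would be matched by "00:1d.0/00.0" regardless of which bus number the bridge's secondary bus ends up with.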
The final patch adds the disable_acs_redir kernel parameter which takes
a list of PCI devices and will disable the ACS P2P Request Redirect,
ACS P2P Completion Redirect and ACS P2P Egress Control bits for the
selected devices. This allows P2P traffic between selected bridges and,
since it's done at boot before the IOMMU groups are created, the
groups will match the security provided by ACS.
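For reference, the bit manipulation amounts to the following sketch. The constants repeat the values from include/uapi/linux/pci_regs.h; acs_clear_redir is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* ACS control bits, with the values used in include/uapi/linux/pci_regs.h. */
#define PCI_ACS_RR 0x0004	/* P2P Request Redirect */
#define PCI_ACS_CR 0x0008	/* P2P Completion Redirect */
#define PCI_ACS_EC 0x0020	/* P2P Egress Control */

/* Return the ACS control word with the three redirect-related bits cleared. */
static uint16_t acs_clear_redir(uint16_t ctrl)
{
	return ctrl & (uint16_t)~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC);
}
```

For example, a control word of 0x001d (Source Validation, Request Redirect, Completion Redirect and Upstream Forwarding set) becomes 0x0011, leaving only Source Validation and Upstream Forwarding enabled.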
Thanks,
Logan
--
Logan Gunthorpe (4):
PCI: Make specifying PCI devices in kernel parameters reusable
PCI: Allow specifying devices using a base bus and path of devfns
PCI: Introduce disable_acs_redir quirk
PCI: Introduce the disable_acs_redir parameter
Documentation/admin-guide/kernel-parameters.txt | 41 +++-
drivers/pci/pci.c | 310 +++++++++++++++++++-----
drivers/pci/quirks.c | 78 +++++-
include/linux/pci.h | 5 +
4 files changed, 361 insertions(+), 73 deletions(-)
--
2.11.0
Intel SPT PCH hardware has an implementation of the ACS bits that
does not comply with the PCI Express standard. To deal with this
the existing code has an enable_acs() quirk for the hardware.
In order to be able to correctly disable the ACS redirect bits for
all hardware we need an analogous quirk to disable those bits.
This adds the function pci_dev_specific_disable_acs_redir() which
behaves similarly to pci_dev_specific_enable_acs() but uses a new
function pointer for quirks which disables the ACS redirect bits.
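The dispatch pattern can be sketched in userspace as follows. Everything here (quirk_dev, acs_ops, the demo callbacks and the 0xa110 device ID) is hypothetical and only models the idea of a table whose entries may or may not provide a disable callback; entries without one are skipped, and a callback may decline by returning a negative value.

```c
#include <errno.h>
#include <stddef.h>

#define ANY_ID 0xffff	/* stand-in for (u16)PCI_ANY_ID */

/* Hypothetical miniature device for this sketch. */
struct quirk_dev {
	unsigned short vendor, device;
	int redir_disabled;
};

struct acs_ops {
	unsigned short vendor, device;
	int (*enable_acs)(struct quirk_dev *dev);
	int (*disable_acs_redir)(struct quirk_dev *dev);	/* optional */
};

static int demo_enable(struct quirk_dev *dev)
{
	(void)dev;
	return 0;
}

static int demo_disable_redir(struct quirk_dev *dev)
{
	if (dev->device != 0xa110)	/* made-up ID check: decline others */
		return -ENOTTY;
	dev->redir_disabled = 1;
	return 0;
}

static const struct acs_ops ops_table[] = {
	{ .vendor = 0x8086, .device = ANY_ID, .enable_acs = demo_enable },
	{ .vendor = 0x8086, .device = ANY_ID, .enable_acs = demo_enable,
	  .disable_acs_redir = demo_disable_redir },
};

/* Returns 0 if a quirk handled the device, -ENOTTY otherwise. */
static int dev_specific_disable_acs_redir(struct quirk_dev *dev)
{
	size_t i;

	for (i = 0; i < sizeof(ops_table) / sizeof(ops_table[0]); i++) {
		const struct acs_ops *p = &ops_table[i];

		if ((p->vendor == dev->vendor || p->vendor == ANY_ID) &&
		    (p->device == dev->device || p->device == ANY_ID) &&
		    p->disable_acs_redir) {
			int ret = p->disable_acs_redir(dev);

			if (ret >= 0)
				return ret;
		}
	}

	return -ENOTTY;
}
```

Iterating by array size rather than stopping at the first entry with a NULL enable_acs pointer is what lets a table entry carry only one of the two callbacks.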
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/pci/quirks.c | 78 ++++++++++++++++++++++++++++++++++++++++++++--------
include/linux/pci.h | 5 ++++
2 files changed, 71 insertions(+), 12 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index f439de848658..414b22dc06b8 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4553,27 +4553,81 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
return 0;
}
-static const struct pci_dev_enable_acs {
+static int pci_quirk_disable_intel_spt_pch_acs_redir(struct pci_dev *dev)
+{
+ int pos;
+ u32 cap, ctrl;
+
+ if (!pci_quirk_intel_spt_pch_acs_match(dev))
+ return -ENOTTY;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+ if (!pos)
+ return -ENOTTY;
+
+ pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap);
+ pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl);
+
+ ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC);
+
+ pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
+
+ pci_info(dev, "Intel SPT PCH root port workaround: disabled ACS redirect\n");
+
+ return 0;
+}
+
+static const struct pci_dev_acs_ops {
u16 vendor;
u16 device;
int (*enable_acs)(struct pci_dev *dev);
-} pci_dev_enable_acs[] = {
- { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_enable_intel_pch_acs },
- { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, pci_quirk_enable_intel_spt_pch_acs },
- { 0 }
+ int (*disable_acs_redir)(struct pci_dev *dev);
+} pci_dev_acs_ops[] = {
+ { PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
+ .enable_acs = pci_quirk_enable_intel_pch_acs,
+ },
+ { PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
+ .enable_acs = pci_quirk_enable_intel_spt_pch_acs,
+ .disable_acs_redir = pci_quirk_disable_intel_spt_pch_acs_redir
+ },
};
int pci_dev_specific_enable_acs(struct pci_dev *dev)
{
- const struct pci_dev_enable_acs *i;
+ const struct pci_dev_acs_ops *p;
+ int i;
int ret;
- for (i = pci_dev_enable_acs; i->enable_acs; i++) {
- if ((i->vendor == dev->vendor ||
- i->vendor == (u16)PCI_ANY_ID) &&
- (i->device == dev->device ||
- i->device == (u16)PCI_ANY_ID)) {
- ret = i->enable_acs(dev);
+ for (i = 0; i < ARRAY_SIZE(pci_dev_acs_ops); i++) {
+ p = &pci_dev_acs_ops[i];
+ if ((p->vendor == dev->vendor ||
+ p->vendor == (u16)PCI_ANY_ID) &&
+ (p->device == dev->device ||
+ p->device == (u16)PCI_ANY_ID) &&
+ p->enable_acs) {
+ ret = p->enable_acs(dev);
+ if (ret >= 0)
+ return ret;
+ }
+ }
+
+ return -ENOTTY;
+}
+
+int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
+{
+ const struct pci_dev_acs_ops *p;
+ int i;
+ int ret;
+
+ for (i = 0; i < ARRAY_SIZE(pci_dev_acs_ops); i++) {
+ p = &pci_dev_acs_ops[i];
+ if ((p->vendor == dev->vendor ||
+ p->vendor == (u16)PCI_ANY_ID) &&
+ (p->device == dev->device ||
+ p->device == (u16)PCI_ANY_ID) &&
+ p->disable_acs_redir) {
+ ret = p->disable_acs_redir(dev);
if (ret >= 0)
return ret;
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 340029b2fb38..3b61068dc7d1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1878,6 +1878,7 @@ enum pci_fixup_pass {
void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev);
int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
int pci_dev_specific_enable_acs(struct pci_dev *dev);
+int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
#else
static inline void pci_fixup_device(enum pci_fixup_pass pass,
struct pci_dev *dev) { }
@@ -1890,6 +1891,10 @@ static inline int pci_dev_specific_enable_acs(struct pci_dev *dev)
{
return -ENOTTY;
}
+static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
+{
+ return -ENOTTY;
+}
#endif
void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen);
--
2.11.0
Separate out the code to match a PCI device with a string (typically
originating from a kernel parameter) from the
pci_specified_resource_alignment() function into its own helper
function.
While we are at it, this change fixes the kernel style of the function
(fixing a number of long lines and extra parentheses).
Additionally, make the analogous change to the kernel parameter
documentation: Separating the description of how to specify a PCI device
into its own section at the head of the pci= parameter.
This patch should have no functional alterations.
Signed-off-by: Logan Gunthorpe <[email protected]>
Reviewed-by: Stephen Bates <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Alex Williamson <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 28 ++++-
drivers/pci/pci.c | 157 ++++++++++++++++--------
2 files changed, 126 insertions(+), 59 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 533ff5c68970..5cc215870ee1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2994,7 +2994,26 @@
See header of drivers/block/paride/pcd.c.
See also Documentation/blockdev/paride.txt.
- pci=option[,option...] [PCI] various PCI subsystem options:
+ pci=option[,option...] [PCI] various PCI subsystem options.
+
+ Some options herein operate on a specific device
+ or a set of devices (<pci_dev>). These are
+ specified in one of the following formats:
+
+ [<domain>:]<bus>:<slot>.<func>
+ pci:<vendor>:<device>[:<subvendor>:<subdevice>]
+
+ Note: the first format specifies a PCI
+ bus/slot/function address which may change
+ if new hardware is inserted, if motherboard
+ firmware changes, or due to changes caused
+ by other kernel parameters. If the
+ domain is left unspecified, it is
+ taken to be zero. The second format
+ selects devices using IDs from the
+ configuration space which may match multiple
+ devices in the system.
+
earlydump [X86] dump PCI config space before the kernel
changes anything
off [X86] don't probe for the PCI bus
@@ -3123,11 +3142,10 @@
window. The default value is 64 megabytes.
resource_alignment=
Format:
- [<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
- [<order of align>@]pci:<vendor>:<device>\
- [:<subvendor>:<subdevice>][; ...]
+ [<order of align>@]<pci_dev>[; ...]
Specifies alignment and device to reassign
- aligned memory resources.
+ aligned memory resources. How to
+ specify the device is described above.
If <order of align> is not specified,
PAGE_SIZE is used as alignment.
PCI-PCI bridge can be specified, if resource
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 97acba712e4e..6127155d4170 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -191,6 +191,92 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar)
EXPORT_SYMBOL_GPL(pci_ioremap_wc_bar);
#endif
+/**
+ * pci_dev_str_match - test if a string matches a device
+ * @dev: the PCI device to test
+ * @p: string to match the device against
+ * @endptr: pointer to the string after the match
+ *
+ * Test if a string (typically from a kernel parameter) matches a specified
+ * PCI device. The string may be of one of the following formats:
+ *
+ * [<domain>:]<bus>:<slot>.<func>
+ * pci:<vendor>:<device>[:<subvendor>:<subdevice>]
+ *
+ * The first format specifies a PCI bus/slot/function address which
+ * may change if new hardware is inserted, if motherboard firmware changes,
+ * or due to changes caused in kernel parameters. If the domain is
+ * left unspecified, it is taken to be 0.
+ *
+ * The second format matches devices using IDs in the configuration
+ * space which may match multiple devices in the system. A value of 0
+ * for any field will match all devices. (Note: this differs from
+ * in-kernel code that uses PCI_ANY_ID which is ~0; this is for
+ * legacy reasons and convenience so users don't have to specify
+ * FFFFFFFFs on the command line.)
+ *
+ * Returns 1 if the string matches the device, 0 if it does not and
+ * a negative error code if the string cannot be parsed.
+ */
+static int pci_dev_str_match(struct pci_dev *dev, const char *p,
+ const char **endptr)
+{
+ int ret;
+ int seg, bus, slot, func, count;
+ unsigned short vendor, device, subsystem_vendor, subsystem_device;
+
+ if (strncmp(p, "pci:", 4) == 0) {
+ /* PCI vendor/device (subvendor/subdevice) ids are specified */
+ p += 4;
+ ret = sscanf(p, "%hx:%hx:%hx:%hx%n", &vendor, &device,
+ &subsystem_vendor, &subsystem_device, &count);
+ if (ret != 4) {
+ ret = sscanf(p, "%hx:%hx%n", &vendor, &device, &count);
+ if (ret != 2)
+ return -EINVAL;
+
+ subsystem_vendor = 0;
+ subsystem_device = 0;
+ }
+
+ p += count;
+
+ if ((!vendor || vendor == dev->vendor) &&
+ (!device || device == dev->device) &&
+ (!subsystem_vendor ||
+ subsystem_vendor == dev->subsystem_vendor) &&
+ (!subsystem_device ||
+ subsystem_device == dev->subsystem_device))
+ goto found;
+
+ } else {
+ /* PCI Bus,Slot,Function ids are specified */
+ ret = sscanf(p, "%x:%x:%x.%x%n", &seg, &bus, &slot,
+ &func, &count);
+ if (ret != 4) {
+ seg = 0;
+ ret = sscanf(p, "%x:%x.%x%n", &bus, &slot,
+ &func, &count);
+ if (ret != 3)
+ return -EINVAL;
+ }
+
+ p += count;
+
+ if (seg == pci_domain_nr(dev->bus) &&
+ bus == dev->bus->number &&
+ slot == PCI_SLOT(dev->devfn) &&
+ func == PCI_FUNC(dev->devfn))
+ goto found;
+ }
+
+ *endptr = p;
+ return 0;
+
+found:
+ *endptr = p;
+ return 1;
+}
static int __pci_find_next_cap_ttl(struct pci_bus *bus, unsigned int devfn,
u8 pos, int cap, int *ttl)
@@ -5454,10 +5540,10 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
bool *resize)
{
- int seg, bus, slot, func, align_order, count;
- unsigned short vendor, device, subsystem_vendor, subsystem_device;
+ int align_order, count;
resource_size_t align = pcibios_default_alignment();
- char *p;
+ const char *p;
+ int ret;
spin_lock(&resource_alignment_lock);
p = resource_alignment_param;
@@ -5477,58 +5563,21 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
} else {
align_order = -1;
}
- if (strncmp(p, "pci:", 4) == 0) {
- /* PCI vendor/device (subvendor/subdevice) ids are specified */
- p += 4;
- if (sscanf(p, "%hx:%hx:%hx:%hx%n",
- &vendor, &device, &subsystem_vendor, &subsystem_device, &count) != 4) {
- if (sscanf(p, "%hx:%hx%n", &vendor, &device, &count) != 2) {
- printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: pci:%s\n",
- p);
- break;
- }
- subsystem_vendor = subsystem_device = 0;
- }
- p += count;
- if ((!vendor || (vendor == dev->vendor)) &&
- (!device || (device == dev->device)) &&
- (!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) &&
- (!subsystem_device || (subsystem_device == dev->subsystem_device))) {
- *resize = true;
- if (align_order == -1)
- align = PAGE_SIZE;
- else
- align = 1 << align_order;
- /* Found */
- break;
- }
- }
- else {
- if (sscanf(p, "%x:%x:%x.%x%n",
- &seg, &bus, &slot, &func, &count) != 4) {
- seg = 0;
- if (sscanf(p, "%x:%x.%x%n",
- &bus, &slot, &func, &count) != 3) {
- /* Invalid format */
- printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: %s\n",
- p);
- break;
- }
- }
- p += count;
- if (seg == pci_domain_nr(dev->bus) &&
- bus == dev->bus->number &&
- slot == PCI_SLOT(dev->devfn) &&
- func == PCI_FUNC(dev->devfn)) {
- *resize = true;
- if (align_order == -1)
- align = PAGE_SIZE;
- else
- align = 1 << align_order;
- /* Found */
- break;
- }
+
+ ret = pci_dev_str_match(dev, p, &p);
+ if (ret == 1) {
+ *resize = true;
+ if (align_order == -1)
+ align = PAGE_SIZE;
+ else
+ align = 1 << align_order;
+ break;
+ } else if (ret < 0) {
+ pr_err("PCI: Can't parse resource_alignment parameter: %s\n",
+ p);
+ break;
}
+
if (*p != ';' && *p != ',') {
/* End of param or invalid format */
break;
--
2.11.0
When specifying PCI devices on the kernel command line using a
BDF, the bus numbers can change when adding or replacing a device,
changing motherboard firmware, or applying kernel parameters like
pci=assign-buses. When this happens, it is undesirable for a command
line tweak to be applied to the wrong device.
Therefore, it is useful to be able to specify devices with a base
bus number and the path of devfns needed to get to it. (Similar to
the "device scope" structure in the Intel VT-d spec, Section 8.3.1.)
Thus, we add an option to specify devices in the following format:
[<domain>:]<bus>:<slot>.<func>[/<slot>.<func>]*
The path can be any segment within the PCI hierarchy of any length and
can be determined using 'lspci -t'. When specified this way, a renumbered
bus is less likely to still yield a valid device specification, so the
tweak won't be applied to the wrong device.
Signed-off-by: Logan Gunthorpe <[email protected]>
Reviewed-by: Stephen Bates <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Alex Williamson <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 8 +-
drivers/pci/pci.c | 117 ++++++++++++++++++++----
2 files changed, 103 insertions(+), 22 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5cc215870ee1..1fdd1ef03984 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3000,7 +3000,7 @@
or a set of devices (<pci_dev>). These are
specified in one of the following formats:
- [<domain>:]<bus>:<slot>.<func>
+ [<domain>:]<bus>:<slot>.<func>[/<slot>.<func>]*
pci:<vendor>:<device>[:<subvendor>:<subdevice>]
Note: the first format specifies a PCI
@@ -3009,7 +3009,11 @@
firmware changes, or due to changes caused
by other kernel parameters. If the
domain is left unspecified, it is
- taken to be zero. The second format
+ taken to be zero. Optionally, a path
+ to a device through multiple slot/function
+ addresses can be specified after the base
+ address (this is more robust against
+ renumbering issues). The second format
selects devices using IDs from the
configuration space which may match multiple
devices in the system.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 6127155d4170..59638075b4df 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -192,6 +192,89 @@ EXPORT_SYMBOL_GPL(pci_ioremap_wc_bar);
#endif
/**
+ * pci_dev_str_match_path - test if a path string matches a device
+ * @dev: the PCI device to test
+ * @p: string to match the device against
+ * @endptr: pointer to the string after the match
+ *
+ * Test if a string (typically from a kernel parameter) formatted as a
+ * path of slot/function addresses matches a PCI device. The string must
+ * be of the form:
+ *
+ * [<domain>:]<bus>:<slot>.<func>[/<slot>.<func>]*
+ *
+ * A path for a device can be obtained using 'lspci -t'. Using a path
+ * is more robust against bus renumbering than using only a single bus,
+ * slot and function address.
+ *
+ * Returns 1 if the string matches the device, 0 if it does not and
+ * a negative error code if it fails to parse the string.
+ */
+static int pci_dev_str_match_path(struct pci_dev *dev, const char *path,
+ const char **endptr)
+{
+ int ret;
+ int seg, bus, slot, func;
+ char *wpath, *p;
+ char end;
+
+ *endptr = strchrnul(path, ';');
+
+ wpath = kmemdup_nul(path, *endptr - path, GFP_KERNEL);
+ if (!wpath)
+ return -ENOMEM;
+
+ while (1) {
+ p = strrchr(wpath, '/');
+ if (!p)
+ break;
+ ret = sscanf(p, "/%x.%x%c", &slot, &func, &end);
+ if (ret != 2) {
+ ret = -EINVAL;
+ goto free_and_exit;
+ }
+
+ if (dev->devfn != PCI_DEVFN(slot, func)) {
+ ret = 0;
+ goto free_and_exit;
+ }
+
+ /*
+ * Note: we don't need to get a reference to the upstream
+ * bridge because we hold a reference to the top level
+ * device which should hold a reference to the bridge,
+ * and so on.
+ */
+ dev = pci_upstream_bridge(dev);
+ if (!dev) {
+ ret = 0;
+ goto free_and_exit;
+ }
+
+ *p = 0;
+ }
+
+ ret = sscanf(wpath, "%x:%x:%x.%x%c", &seg, &bus, &slot,
+ &func, &end);
+ if (ret != 4) {
+ seg = 0;
+ ret = sscanf(wpath, "%x:%x.%x%c", &bus, &slot, &func, &end);
+ if (ret != 3) {
+ ret = -EINVAL;
+ goto free_and_exit;
+ }
+ }
+
+ ret = (seg == pci_domain_nr(dev->bus) &&
+ bus == dev->bus->number &&
+ dev->devfn == PCI_DEVFN(slot, func));
+
+free_and_exit:
+ kfree(wpath);
+ return ret;
+}
+
+/**
* pci_dev_str_match - test if a string matches a device
* @dev: the PCI device to test
* @p: string to match the device against
@@ -200,13 +283,16 @@ EXPORT_SYMBOL_GPL(pci_ioremap_wc_bar);
* Test if a string (typically from a kernel parameter) matches a specified
* PCI device. The string may be of one of the following formats:
*
- * [<domain>:]<bus>:<slot>.<func>
+ * [<domain>:]<bus>:<slot>.<func>[/<slot>.<func>]*
* pci:<vendor>:<device>[:<subvendor>:<subdevice>]
*
* The first format specifies a PCI bus/slot/function address which
* may change if new hardware is inserted, if motherboard firmware changes,
* or due to changes caused in kernel parameters. If the domain is
- * left unspecified, it is taken to be 0.
+ * left unspecified, it is taken to be 0. In order to be robust against
+ * bus renumbering issues, a path of PCI slot/function numbers may be used
+ * to address the specific device. The path for a device can be determined
+ * through the use of 'lspci -t'.
*
* The second format matches devices using IDs in the configuration
* space which may match multiple devices in the system. A value of 0
@@ -222,7 +308,7 @@ static int pci_dev_str_match(struct pci_dev *dev, const char *p,
const char **endptr)
{
int ret;
- int seg, bus, slot, func, count;
+ int count;
unsigned short vendor, device, subsystem_vendor, subsystem_device;
if (strncmp(p, "pci:", 4) == 0) {
@@ -248,25 +334,16 @@ static int pci_dev_str_match(struct pci_dev *dev, const char *p,
(!subsystem_device ||
subsystem_device == dev->subsystem_device))
goto found;
-
} else {
- /* PCI Bus,Slot,Function ids are specified */
- ret = sscanf(p, "%x:%x:%x.%x%n", &seg, &bus, &slot,
- &func, &count);
- if (ret != 4) {
- seg = 0;
- ret = sscanf(p, "%x:%x.%x%n", &bus, &slot,
- &func, &count);
- if (ret != 3)
- return -EINVAL;
- }
-
- p += count;
+ /*
+ * PCI Bus,Slot,Function ids are specified
+ * (optionally, may include a path of devfns following it)
+ */
- if (seg == pci_domain_nr(dev->bus) &&
- bus == dev->bus->number &&
- slot == PCI_SLOT(dev->devfn) &&
- func == PCI_FUNC(dev->devfn))
+ ret = pci_dev_str_match_path(dev, p, &p);
+ if (ret < 0)
+ return ret;
+ else if (ret)
goto found;
}
--
2.11.0
In order to support P2P traffic on a segment of the PCI hierarchy,
we must be able to disable the ACS redirect bits for select
PCI bridges. The bridges must be selected before the devices are
discovered by the kernel and the IOMMU groups created. Therefore,
a kernel command line parameter is created to specify devices
which must have their ACS bits disabled.
The new parameter takes a list of devices separated by semicolons.
Each device specified will have its ACS redirect bits disabled.
This is similar to the existing 'resource_alignment' parameter.
The ACS P2P Request Redirect, P2P Completion Redirect and P2P
Egress Control bits are disabled which is sufficient to always allow
passing P2P traffic uninterrupted. The bits are cleared after the kernel
(optionally) enables the ACS bits itself. This is also done regardless of
whether the kernel sets the bits or not, since some BIOS firmware is known
to set the bits on boot.
If the user tries to disable the ACS redirect for a device without the
ACS capability, a warning is printed to dmesg.
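The list handling can be sketched in userspace like this. It is a simplified model: bdf_match is a hypothetical stand-in for pci_dev_str_match() that only understands a plain <bus>:<slot>.<func> entry, and only the semicolon is treated as a separator.

```c
#include <stdio.h>

/* Hypothetical miniature device address for this sketch. */
struct bdf {
	unsigned int bus, slot, func;
};

/*
 * Returns 1 on match, 0 on no match, -1 on parse error. On success,
 * *endptr points just past the entry that was parsed.
 */
static int bdf_match(const struct bdf *dev, const char *p,
		     const char **endptr)
{
	unsigned int bus, slot, func;
	int count;

	if (sscanf(p, "%x:%x.%x%n", &bus, &slot, &func, &count) != 3)
		return -1;

	*endptr = p + count;
	return bus == dev->bus && slot == dev->slot && func == dev->func;
}

/* Returns 1 if the list selects dev, 0 if not, -1 on a parse error. */
static int list_selects(const struct bdf *dev, const char *param)
{
	const char *p = param;
	int ret = 0;

	while (*p) {
		ret = bdf_match(dev, p, &p);
		if (ret != 0)
			break;		/* found a match or hit a parse error */

		if (*p != ';')
			break;		/* end of param or invalid format */
		p++;
	}

	return ret;
}
```

The walk stops at the first matching entry, so a device listed anywhere in the parameter is selected, and a malformed entry ends the walk for that device.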
Signed-off-by: Logan Gunthorpe <[email protected]>
Reviewed-by: Stephen Bates <[email protected]>
Acked-by: Christian König <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 9 +++
drivers/pci/pci.c | 76 ++++++++++++++++++++++++-
2 files changed, 83 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1fdd1ef03984..ab19ed83f072 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3192,6 +3192,15 @@
Adding the window is slightly risky (it may
conflict with unreported devices), so this
taints the kernel.
+ disable_acs_redir=<pci_dev>[; ...]
+ Specify one or more PCI devices (in the format
+ specified above) separated by semicolons.
+ Each device specified will have the PCI ACS
+ redirect capabilities forced off which will
+ allow P2P traffic between devices through
+ bridges without forcing it upstream. Note:
+ this removes isolation between devices and
+ will make the IOMMU groups less granular.
pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power
Management.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59638075b4df..0aee076efe8b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2983,6 +2983,66 @@ void pci_request_acs(void)
pci_acs_enable = 1;
}
+static const char *disable_acs_redir_param;
+
+/**
+ * pci_disable_acs_redir - disable ACS redirect capabilities
+ * @dev: the PCI device
+ *
+ * Only acts on devices specified in the disable_acs_redir parameter.
+ */
+static void pci_disable_acs_redir(struct pci_dev *dev)
+{
+ int ret = 0;
+ const char *p;
+ int pos;
+ u16 ctrl;
+
+ if (!disable_acs_redir_param)
+ return;
+
+ p = disable_acs_redir_param;
+ while (*p) {
+ ret = pci_dev_str_match(dev, p, &p);
+ if (ret < 0) {
+ pr_info_once("PCI: Can't parse disable_acs_redir parameter: %s\n",
+ disable_acs_redir_param);
+
+ break;
+ } else if (ret == 1) {
+ /* Found a match */
+ break;
+ }
+
+ if (*p != ';' && *p != ',') {
+ /* End of param or invalid format */
+ break;
+ }
+ p++;
+ }
+
+ if (ret != 1)
+ return;
+
+ if (!pci_dev_specific_disable_acs_redir(dev))
+ return;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ACS);
+ if (!pos) {
+ pci_warn(dev, "cannot disable ACS redirect for this hardware as it does not have ACS capabilities\n");
+ return;
+ }
+
+ pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
+
+ /* P2P Request Redirect, Completion Redirect & Egress Control */
+ ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC);
+
+ pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
+
+ pci_info(dev, "disabled ACS redirect\n");
+}
+
/**
* pci_std_enable_acs - enable ACS on devices using standard ACS capabilites
* @dev: the PCI device
@@ -3022,12 +3082,22 @@ static void pci_std_enable_acs(struct pci_dev *dev)
void pci_enable_acs(struct pci_dev *dev)
{
if (!pci_acs_enable)
- return;
+ goto disable_acs_redir;
if (!pci_dev_specific_enable_acs(dev))
- return;
+ goto disable_acs_redir;
pci_std_enable_acs(dev);
+
+disable_acs_redir:
+ /*
+ * Note: pci_disable_acs_redir() must be called even if
+ * ACS is not enabled by the kernel because the firmware
+ * may have unexpectedly set the flags. So if we are told
+ * to disable it, we should always disable it after setting
+ * the kernel's default preferences.
+ */
+ pci_disable_acs_redir(dev);
}
static bool pci_acs_flags_enabled(struct pci_dev *pdev, u16 acs_flags)
@@ -5967,6 +6037,8 @@ static int __init pci_setup(char *str)
pcie_bus_config = PCIE_BUS_PEER2PEER;
} else if (!strncmp(str, "pcie_scan_all", 13)) {
pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
+ } else if (!strncmp(str, "disable_acs_redir=", 18)) {
+ disable_acs_redir_param = str + 18;
} else {
printk(KERN_ERR "PCI: Unknown option `%s'\n",
str);
--
2.11.0
On Tue, 17 Jul 2018 11:02:03 -0600
Logan Gunthorpe <[email protected]> wrote:
> Intel SPT PCH hardware has an implementation of the ACS bits that
> does not comply with the PCI express standard. To deal with this
> the existing code has an enable_acs() quirk for the hardware.
>
> In order to be able to correctly disable the ACS redirect bits for
> all hardware we need an analagous quirk to disable those bits.
>
> This adds the function pci_dev_specific_disable_acs_redir() which
> behaves similarly to pci_dev_specific_enable_acs() but uses a new
> function pointer for quirks which disables the ACS redirect bits.
>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> drivers/pci/quirks.c | 78 ++++++++++++++++++++++++++++++++++++++++++++--------
> include/linux/pci.h | 5 ++++
> 2 files changed, 71 insertions(+), 12 deletions(-)
Reviewed-by: Alex Williamson <[email protected]>
On Tue, 17 Jul 2018 11:02:04 -0600
Logan Gunthorpe <[email protected]> wrote:
> In order to support P2P traffic on a segment of the PCI hierarchy,
> we must be able to disable the ACS redirect bits for select
> PCI bridges. The bridges must be selected before the devices are
> discovered by the kernel and the IOMMU groups created. Therefore,
> a kernel command line parameter is created to specify devices
> which must have their ACS bits disabled.
>
> The new parameter takes a list of devices separated by a semicolon.
> Each device specified will have it's ACS redirect bits disabled.
> This is similar to the existing 'resource_alignment' parameter.
>
> The ACS Request P2P Request Redirect, P2P Completion Redirect and P2P
> Egress Control bits are disabled which is sufficient to always allow
> passing P2P traffic uninterrupted. The bits are set after the kernel
> (optionally) enables the ACS bits itself. It is also done regardless of
> whether the kernel sets the bits or not seeing some BIOS firmware is known
> to set the bits on boot.
>
> If the user tries to disable the ACS redirct for a device without the
> ACS capability, a warning is printed to dmesg.
>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> Reviewed-by: Stephen Bates <[email protected]>
> Acked-by: Christian König <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 9 +++
> drivers/pci/pci.c | 76 ++++++++++++++++++++++++-
> 2 files changed, 83 insertions(+), 2 deletions(-)
Thanks for the re-spins!
Reviewed-by: Alex Williamson <[email protected]>
On 17/07/18 11:48 AM, Alex Williamson wrote:
> On Tue, 17 Jul 2018 11:02:04 -0600
> Logan Gunthorpe <[email protected]> wrote:
>
>> In order to support P2P traffic on a segment of the PCI hierarchy,
>> we must be able to disable the ACS redirect bits for select
>> PCI bridges. The bridges must be selected before the devices are
>> discovered by the kernel and the IOMMU groups created. Therefore,
>> a kernel command line parameter is created to specify devices
>> which must have their ACS bits disabled.
>>
>> The new parameter takes a list of devices separated by a semicolon.
>> Each device specified will have it's ACS redirect bits disabled.
>> This is similar to the existing 'resource_alignment' parameter.
>>
>> The ACS Request P2P Request Redirect, P2P Completion Redirect and P2P
>> Egress Control bits are disabled which is sufficient to always allow
>> passing P2P traffic uninterrupted. The bits are set after the kernel
>> (optionally) enables the ACS bits itself. It is also done regardless of
>> whether the kernel sets the bits or not seeing some BIOS firmware is known
>> to set the bits on boot.
>>
>> If the user tries to disable the ACS redirct for a device without the
>> ACS capability, a warning is printed to dmesg.
>>
>> Signed-off-by: Logan Gunthorpe <[email protected]>
>> Reviewed-by: Stephen Bates <[email protected]>
>> Acked-by: Christian König <[email protected]>
>> ---
>> Documentation/admin-guide/kernel-parameters.txt | 9 +++
>> drivers/pci/pci.c | 76 ++++++++++++++++++++++++-
>> 2 files changed, 83 insertions(+), 2 deletions(-)
>
> Thanks for the re-spins!
>
> Reviewed-by: Alex Williamson <[email protected]>
Thanks for all the thorough review!
Logan
>> Reviewed-by: Alex Williamson <[email protected]>
>
>
> Thanks for all the thorough review!
+1! Yes, thanks Alex for all the feedback on this series.
Bjorn, is Alex's review enough for you to take this series?
Stephen
On Tue, Jul 17, 2018 at 11:02:00AM -0600, Logan Gunthorpe wrote:
> The second patch expands the new helper to optionally take a path of
> PCI devfns. This is to address Alex's renumbering concern when using
> simple bus-devfns. The implementation is essentially how he described it and
> similar to the Intel VT-d spec (Section 8.3.1).
I don't like telling the user to grovel around lspci -t by hand. It's
not many lines of code to add a new -P option to lspci to show the path
to each device instead of bus:dev.fn
Here are three examples, first without, then with -P.
(my laptop):
6d:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
00:1d.0/00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
(tests/PCI-X-bridges-and-domains):
0002:42:00.0 Ethernet controller: Trident Microsystems 4DWave DX (rev 26)
0002:00:02.4/01.0/00.0 Ethernet controller: Trident Microsystems 4DWave DX (rev 26)
(my Nehalem system):
04:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
00:03.0/00.0/00.0/00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
The Nehalem system makes an interesting testcase because it exposes some
registers in fake PCIe devices that aren't behind the root ports. eg:
ff:06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 04)
Martin, what do you think to this patch? Also, I'm happy to send you
the lspci -xxxx from the Nehalem system to add to tests/
diff --git a/lspci.c b/lspci.c
index 3bf1925..ae0fdd2 100644
--- a/lspci.c
+++ b/lspci.c
@@ -19,6 +19,7 @@ int verbose; /* Show detailed information */
static int opt_hex; /* Show contents of config space as hexadecimal numbers */
struct pci_filter filter; /* Device filter */
static int opt_tree; /* Show bus tree */
+static int opt_path; /* Show bridge path */
static int opt_machine; /* Generate machine-readable output */
static int opt_map_mode; /* Bus mapping mode enabled */
static int opt_domains; /* Show domain numbers (0=disabled, 1=auto-detected, 2=requested) */
@@ -29,7 +30,7 @@ char *opt_pcimap; /* Override path to Linux modules.pcimap */
const char program_name[] = "lspci";
-static char options[] = "nvbxs:d:ti:mgp:qkMDQ" GENERIC_OPTIONS ;
+static char options[] = "nvbxs:d:tPi:mgp:qkMDQ" GENERIC_OPTIONS ;
static char help_msg[] =
"Usage: lspci [<switches>]\n"
@@ -247,6 +248,34 @@ sort_them(void)
/*** Normal output ***/
+static void
+show_slot_path(struct pci_dev *p)
+{
+ struct pci_dev *d = NULL;
+
+ if (opt_path && p->bus)
+ {
+ for (d = p->access->devices; d; d = d->next) {
+ if (d->hdrtype == -1)
+ d->hdrtype = pci_read_byte(d, PCI_HEADER_TYPE) & 0x7f;
+ if (d->hdrtype != PCI_HEADER_TYPE_BRIDGE &&
+ d->hdrtype != PCI_HEADER_TYPE_CARDBUS)
+ continue;
+ if (pci_read_byte(d, PCI_SECONDARY_BUS) > p->bus)
+ continue;
+ if (pci_read_byte(d, PCI_SUBORDINATE_BUS) < p->bus)
+ continue;
+ show_slot_path(d);
+ break;
+ }
+ }
+
+ if (d)
+ printf("/%02x.%d", p->dev, p->func);
+ else
+ printf("%02x:%02x.%d", p->bus, p->dev, p->func);
+}
+
static void
show_slot_name(struct device *d)
{
@@ -254,7 +283,7 @@ show_slot_name(struct device *d)
if (!opt_machine ? opt_domains : (p->domain || opt_domains >= 2))
printf("%04x:", p->domain);
- printf("%02x:%02x.%d", p->bus, p->dev, p->func);
+ show_slot_path(p);
}
void
@@ -989,6 +1018,9 @@ main(int argc, char **argv)
case 'x':
opt_hex++;
break;
+ case 'P':
+ opt_path++;
+ break;
case 't':
opt_tree++;
break;
diff --git a/lspci.man b/lspci.man
index 35b3620..565dd5b 100644
--- a/lspci.man
+++ b/lspci.man
@@ -95,6 +95,9 @@ PCI bus instead of as seen by the kernel.
.B -D
Always show PCI domain numbers. By default, lspci suppresses them on machines which
have only domain 0.
+.TP
+.B -P
+Name PCI devices by path through each bridge, instead of by bus number.
.SS Options to control resolving ID's to names
.TP
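The path naming proposed above can be sketched outside of lspci. The following is a minimal illustration, not the patch itself: struct node, find_parent() and format_path() are hypothetical names, the device table is an in-memory stand-in for what libpci would provide, and the parent lookup matches on the secondary bus only (a simplification of the secondary/subordinate range test in the patch). It assumes a well-formed tree and gives up after a caller-supplied depth.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory model of one PCI function (not a libpci type). */
struct node {
    unsigned bus, dev, func;
    int is_bridge;        /* non-zero for PCI-PCI or CardBus bridges */
    unsigned secondary;   /* bus number directly behind the bridge */
};

/* Return the bridge whose secondary bus is the bus p sits on, or NULL. */
static const struct node *
find_parent(const struct node *devs, size_t n, const struct node *p)
{
    for (size_t i = 0; i < n; i++)
        if (&devs[i] != p && devs[i].is_bridge && devs[i].secondary == p->bus)
            return &devs[i];
    return NULL;
}

/*
 * Write the path of p into buf, root first. Returns the string length,
 * or -1 if buf is too small or the tree is deeper than `depth` bridges.
 */
static int
format_path(const struct node *devs, size_t n, const struct node *p,
            char *buf, size_t len, int depth)
{
    const struct node *parent;
    int used, more;

    assert(buf != NULL && len > 0);

    /* Devices on bus 0, or with no bridge above them, keep bus:dev.fn. */
    parent = p->bus ? find_parent(devs, n, p) : NULL;
    if (!parent) {
        used = snprintf(buf, len, "%02x:%02x.%u", p->bus, p->dev, p->func);
        return (used < 0 || (size_t)used >= len) ? -1 : used;
    }

    if (depth <= 0)
        return -1;
    used = format_path(devs, n, parent, buf, len, depth - 1);
    if (used < 0)
        return -1;

    /* Below a bridge only dev.fn is appended, so bus renumbering is harmless. */
    more = snprintf(buf + used, len - used, "/%02x.%u", p->dev, p->func);
    if (more < 0 || (size_t)more >= len - used)
        return -1;
    return used + more;
}
```

With a table holding the laptop's bridge 00:1d.0 (secondary bus 6d) and the NVMe device 6d:00.0, format_path() yields the same string as the -P example above. A device such as ff:06.3, which has no bridge above it, falls through to the plain bus:dev.fn form.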
On 17/07/18 02:39 PM, Matthew Wilcox wrote:
> On Tue, Jul 17, 2018 at 11:02:00AM -0600, Logan Gunthorpe wrote:
>> The second patch expands the new helper to optionally take a path of
>> PCI devfns. This is to address Alex's renumbering concern when using
>> simple bus-devfns. The implementation is essentially how he described it and
>> similar to the Intel VT-d spec (Section 8.3.1).
>
> I don't like telling the user to grovel around lspci -t by hand. It's
> not many lines of code to add a new -P option to lspci to show the path
> to each device instead of bus:dev.fn
Thanks, this looks great! I also found parsing the lspci -t output
cumbersome.
I've also got patches pending for switchtec-user[1] that help users find
the path of downstream ports for Microsemi switches. (An example is
shown below). As the ACS feature is primarily for PCI switch users, this
should help a good segment of people. The lspci patches should cover a
lot more people though.
Logan
sudo switchtec status /dev/switchtec0 -v
Partition 0: (LOCAL)
Logical Port ID 0 (USP):
Phys Port ID: 32 (Stack 4, Port 0)
Bus-Dev-Func: 0000:02:00.0
Bus-Dev-Func Path: 0000:00:02:0/00.0
Status: UP
LTSSM: L0
Max-Width: x16
Neg Width: x16
Rate: Gen3 - 8 GT/s 15.76 GB/s
Out Bytes: 70.3 GB
In Bytes: 70.8 GB
Logical Port ID 1 (DSP):
Phys Port ID: 8 (Stack 1, Port 0)
Bus-Dev-Func: 0000:03:00.0
Bus-Dev-Func Path: 0000:00:02:0/00.0/00.0
Status: UP
LTSSM: L0
Max-Width: x8
Neg Width: x8
Rate: Gen3 - 8 GT/s 7.88 GB/s
Out Bytes: 12.2 MB
In Bytes: 441 MB
ACS: SrcValid- TransBlk- ReqRedir- CmpltRedir-
UpstreamFwd- EgressCtrl- DirectTrans-
Device: 10b5:8724 (0000:04:00.0)
0000:05
Logical Port ID 2 (DSP):
Phys Port ID: 12 (Stack 1, Port 4)
Bus-Dev-Func: 0000:03:01.0
Bus-Dev-Func Path: 0000:00:02:0/00.0/01.0
Status: UP
LTSSM: L0
Max-Width: x8
Neg Width: x8
Rate: Gen3 - 8 GT/s 7.88 GB/s
Out Bytes: 1.65 MB
In Bytes: 107 MB
ACS: SrcValid- TransBlk- ReqRedir- CmpltRedir-
UpstreamFwd- EgressCtrl- DirectTrans-
Device: 11f8:f117 (0000:0b:00.0)
nvme4
[1] https://github.com/Microsemi/switchtec-user/pull/25
On Tue, Jul 17, 2018 at 01:39:00PM -0700, Matthew Wilcox wrote:
> I don't like telling the user to grovel around lspci -t by hand. It's
> not many lines of code to add a new -P option to lspci to show the path
> to each device instead of bus:dev.fn
>
> Here's three examples, first without, then with -P.
> ...
> The Nehalem system makes an interesting testcase because it exposes some
> registers in fake PCIe devices that aren't behind the root ports. eg:
>
> ff:06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 04)
I think these appear as conventional PCI devices; at least the ones
I've seen, e.g., [1], don't have a PCIe capability, so I think it
makes sense that they're not behind a root port.
[1] https://bugzilla5.redhat.com/attachment.cgi?id=433169
On Tue, Jul 17, 2018 at 04:00:53PM -0500, Bjorn Helgaas wrote:
> On Tue, Jul 17, 2018 at 01:39:00PM -0700, Matthew Wilcox wrote:
> > The Nehalem system makes an interesting testcase because it exposes some
> > registers in fake PCIe devices that aren't behind the root ports. eg:
> >
> > ff:06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 04)
>
> I think these appear as conventional PCI devices; at least the ones
> I've seen, e.g., [1], don't have a PCIe capability, so I think it
> makes sense that they're not behind a root port.
>
> [1] https://bugzilla5.redhat.com/attachment.cgi?id=433169
Oh, I don't think we're doing anything wrong with how we're displaying
them or what we're doing with what the system presents to us. My only
point was that this is a good test-case for code which assumes that all
PCI devices lie under a PCIe root port. At one point during development,
my code reported that device up there as
/06.3 Host bridge: Intel Corporation Xeon 5500/Core i7 Integrated Memory Controller Channel 2 Thermal Control Registers (rev 04)
but since I had that system available to test with, I spotted that problem
and made it present that device as ff:06.3 (both with and without -P).
Martin? Bjorn's looking to merge this soon and it'd be nice to have
the support in lspci too.
Hello!
> I don't like telling the user to grovel around lspci -t by hand. It's
> not many lines of code to add a new -P option to lspci to show the path
> to each device instead of bus:dev.fn
I like the feature, but I have a couple of minor objections to the
implementation:
> -static char options[] = "nvbxs:d:ti:mgp:qkMDQ" GENERIC_OPTIONS ;
> +static char options[] = "nvbxs:d:tPi:mgp:qkMDQ" GENERIC_OPTIONS ;
Please update the help_msg, too.
> +static void
> +show_slot_path(struct pci_dev *p)
> +{
> + struct pci_dev *d = NULL;
> +
> + if (opt_path && p->bus)
> + {
> + for (d = p->access->devices; d; d = d->next) {
> + if (d->hdrtype == -1)
> + d->hdrtype = pci_read_byte(d, PCI_HEADER_TYPE) & 0x7f;
Please do not modify the hdrtype field, it is private to libpci.
Instead, always read the PCI_HEADER_TYPE register. It will be fast since
lspci caches configuration space accesses.
> + if (d->hdrtype != PCI_HEADER_TYPE_BRIDGE &&
> + d->hdrtype != PCI_HEADER_TYPE_CARDBUS)
> + continue;
> + if (pci_read_byte(d, PCI_SECONDARY_BUS) > p->bus)
> + continue;
> + if (pci_read_byte(d, PCI_SUBORDINATE_BUS) < p->bus)
> + continue;
Beware, cardbus bridges use different registers for secondary and subordinate bus.
... the more I think about it, the more I am convinced that we want to
re-use the bus topology builder from ls-tree.c. I will give it a try,
stay tuned.
Martin
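The two review points about header types can be illustrated on a raw configuration-space header. This is only a sketch under stated assumptions: bridge_bus_range() and the macro names are hypothetical, and the offsets are my recollection of the standard type 1 and CardBus (type 2) layouts, so they should be checked against pciutils' header.h before being relied on. It reads the header type on every call instead of caching it, masks off the multi-function bit, and keeps the two register sets separate by name.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets assumed from the standard config header layout; verify before use. */
#define HDR_TYPE_OFFSET      0x0e  /* header type register */
#define HDR_TYPE_MASK        0x7f  /* bit 7 is the multi-function flag */
#define HDR_TYPE_BRIDGE      1     /* PCI-PCI bridge */
#define HDR_TYPE_CARDBUS     2     /* CardBus bridge */
#define BRIDGE_SECONDARY     0x19  /* type 1: secondary bus number */
#define BRIDGE_SUBORDINATE   0x1a  /* type 1: subordinate bus number */
#define CARDBUS_CARD_BUS     0x19  /* type 2: CardBus bus number */
#define CARDBUS_SUBORDINATE  0x1a  /* type 2: subordinate bus number */

/*
 * Given at least the first 64 bytes of config space, report the range
 * of buses behind a bridge. Returns 1 and fills the outputs for a
 * bridge, 0 for any other header type.
 */
static int
bridge_bus_range(const uint8_t *cfg, uint8_t *secondary, uint8_t *subordinate)
{
    assert(cfg && secondary && subordinate);

    switch (cfg[HDR_TYPE_OFFSET] & HDR_TYPE_MASK) {
    case HDR_TYPE_BRIDGE:
        *secondary = cfg[BRIDGE_SECONDARY];
        *subordinate = cfg[BRIDGE_SUBORDINATE];
        return 1;
    case HDR_TYPE_CARDBUS:
        *secondary = cfg[CARDBUS_CARD_BUS];
        *subordinate = cfg[CARDBUS_SUBORDINATE];
        return 1;
    default:
        return 0;
    }
}
```

In lspci itself the bytes would come from pci_read_byte() rather than a local array. The array form is used here only so the sketch runs without hardware.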
Hello!
> ... the more I think about it, the more I am convinced that we want to
> re-use the bus topology builder from ls-tree.c. I will give it a try,
> stay tuned.
Please see the "topology" branch in pciutils.git.
Martin
On Fri, Aug 10, 2018 at 12:30:35PM +0200, Martin Mares wrote:
> Hello!
>
> > ... the more I think about it, the more I am convinced that we want to
> > re-use the bus topology builder from ls-tree.c. I will give it a try,
> > stay tuned.
>
> Please see the "topology" branch in pciutils.git.
Thanks. I found two problems so far.
One is that using -P and -s together doesn't work because we haven't
scanned the entire topology.
$ ./lspci-mw -PF tests/fujitsu-p8010.lspci -s 1d:00.0
00:1e.0/03.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
$ ./lspci-mm -PF tests/fujitsu-p8010.lspci -s 1d:00.0
1d:00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
The other is that even when not using -s, the topology isn't fully represented:
$ ./lspci-mm -PF tests/fujitsu-p8010.lspci |grep 3com
00:1e.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
I've attached a compressed form of the fujitsu-p8010.lspci dump for your
testing.
Hello!
> One is that using -P and -s together doesn't work because we haven't
> scanned the entire topology.
>
> $ ./lspci-mw -PF tests/fujitsu-p8010.lspci -s 1d:00.0
> 00:1e.0/03.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
> $ ./lspci-mm -PF tests/fujitsu-p8010.lspci -s 1d:00.0
> 1d:00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
Fixed. When topology is required, we now scan all devices and apply the
filters later.
> The other is that even when not using -s, the topology isn't fully represented:
>
> $ ./lspci-mm -PF tests/fujitsu-p8010.lspci |grep 3com
> 00:1e.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
Ah well, it seems that the tree mode never worked with CardBus bridges. Fixed.
After some pondering, I changed the format of the paths to include bus numbers
in all steps. I think it is more intuitive.
Please give it a try. If it works, I will merge the branch to master.
Martin
On Sun, Aug 12, 2018 at 11:28:37AM +0200, Martin Mares wrote:
> Hello!
>
> > One is that using -P and -s together doesn't work because we haven't
> > scanned the entire topology.
>
> Fixed. When topology is required, we now scan all devices and apply the
> filters later.
Thanks!
> > The other is that even when not using -s, the topology isn't fully represented:
> >
> > $ ./lspci-mm -PF tests/fujitsu-p8010.lspci |grep 3com
> > 00:1e.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
>
> Ah well, it seems that the tree mode never worked with CardBus bridges. Fixed.
Haha! I can't believe we never noticed that in the last twenty years!
And we're fixing it even though PCI CardBus bridges are now completely
obsolete (my current laptop has no slots of that form factor; my previous
laptop has an ExpressCard slot; I had to go back to my previous-previous
laptop from 2008 to find sample hardware to test CardBus).
> After some pondering, I changed the format of the paths to include bus numbers
> in all steps. I think it is more intuitive.
I agree it's more intuitive, but it's not the format that Logan's code
is expecting, so it's not as useful for my purposes. How about this?
$ ./lspci -PF tests/fujitsu-p8010.lspci -s 1d:00.0
00:1e.0/03.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
$ ./lspci -PPF tests/fujitsu-p8010.lspci -s 1d:00.0
00:1e.0/1c:03.0/1d:00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
I pondered asking Logan to change his parser to include the bus number
as a solution, but then I remembered the entire point of this is to make
specifying a device robust against bus number assignment changes. I suppose
we could have the parser accept and ignore the bus number ...
diff --git a/lspci.c b/lspci.c
index 75cb5b9..3dabbde 100644
--- a/lspci.c
+++ b/lspci.c
@@ -50,7 +50,8 @@ static char help_msg[] =
"-xxxx\t\tShow hex-dump of the 4096-byte extended config space (root only)\n"
"-b\t\tBus-centric view (addresses and IRQ's as seen by the bus)\n"
"-D\t\tAlways show domain numbers\n"
-"-P\t\tDisplay bus path in addition to bus and device number\n"
+"-P\t\tDisplay bridge path in addition to bus and device number\n"
+"-PP\t\tDisplay bus path in addition to bus and device number\n"
"\n"
"Resolving of device ID's to names:\n"
"-n\t\tShow numeric ID's\n"
@@ -264,7 +265,10 @@ show_slot_path(struct device *d)
if (br && br->br_dev)
{
show_slot_path(br->br_dev);
- printf("/%02x:%02x.%d", p->bus, p->dev, p->func);
+ if (opt_path > 1)
+ printf("/%02x:%02x.%d", p->bus, p->dev, p->func);
+ else
+ printf("/%02x.%d", p->dev, p->func);
return;
}
}
diff --git a/lspci.man b/lspci.man
index 78b5c96..55fadb1 100644
--- a/lspci.man
+++ b/lspci.man
@@ -98,6 +98,10 @@ have only domain 0.
.TP
.B -P
Identify PCI devices by path through each bridge, instead of by bus number.
+.TP
+.B -PP
+Identify PCI devices by path through each bridge, showing the bus number as
+well as the device number.
.SS Options to control resolving ID's to names
.TP
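The difference between the -P and -PP output formats can be shown in isolation. This is a minimal sketch, not the lspci code: struct bdf and format_chain() are hypothetical names, and the chain of bridges is passed in as an array (root first) instead of being discovered from the topology.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bus/device/function triple for one hop of the path. */
struct bdf {
    unsigned bus, dev, func;
};

/*
 * Format a root-first chain of devices.
 *   level 1 (like -P):  bus number on the first hop only
 *   level 2 (like -PP): bus number on every hop
 * Returns the string length, or -1 if buf is too small.
 */
static int
format_chain(const struct bdf *chain, size_t n, int level,
             char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w;

        if (i == 0 || level > 1)
            w = snprintf(buf + used, len - used, "%s%02x:%02x.%u",
                         i ? "/" : "", chain[i].bus, chain[i].dev,
                         chain[i].func);
        else
            w = snprintf(buf + used, len - used, "/%02x.%u",
                         chain[i].dev, chain[i].func);

        if (w < 0 || (size_t)w >= len - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

Fed the three hops from the fujitsu-p8010 example (00:1e.0, 1c:03.0, 1d:00.0), level 1 produces the bridge-path form and level 2 the form with bus numbers in every step.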
Hello!
> I agree it's more intuitive, but it's not the format that Logan's code
> is expecting, so it's not as useful for my purposes. How about this?
>
> $ ./lspci -PF tests/fujitsu-p8010.lspci -s 1d:00.0
> 00:1e.0/03.0/00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
> $ ./lspci -PPF tests/fujitsu-p8010.lspci -s 1d:00.0
> 00:1e.0/1c:03.0/1d:00.0 Network controller: 3Com Corporation 3com 3CRWE154G72 [Office Connect Wireless LAN Adapter] (rev 01)
Yes, this looks fine.
It is merged to master now and I will push a new release out of the door soon.
Thanks!
Martin
On 12/08/18 04:31 AM, Matthew Wilcox wrote:
> I pondered asking Logan to change his parser to include the bus number
> as a solution, but then I remembered the entire point of this is to make
> specifying a device robust against bus number assignment changes. I suppose
> we could have the parser accept and ignore the bus number ...
Yes, exactly. Bjorn's already accepted the series but we could add
support for ignored bus numbers in another patch if we want. I just
think that might be confusing when you end up in a situation where the
path includes bus numbers that no longer match the actual addresses
after the bus numbers change.
Thanks,
Logan
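The "accept and ignore the bus number" idea floated in this exchange could look roughly like the following. This is a sketch of the idea only, not code from the accepted kernel series: parse_path_step() is a hypothetical helper that handles a single '/'-separated step after the root device, taking either "dev.fn" or "bus:dev.fn" and discarding the bus.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one path step of the form "dev.fn" or "bus:dev.fn" (hex).
 * A bus number, if present, is accepted and then ignored.
 * Returns 0 on success, -1 on malformed input or out-of-range values.
 */
static int
parse_path_step(const char *s, unsigned *dev, unsigned *func)
{
    unsigned bus, d, f;
    int end = 0;

    assert(s && dev && func);

    if (sscanf(s, "%x:%x.%x%n", &bus, &d, &f, &end) == 3 && s[end] == '\0') {
        (void)bus;  /* deliberately unused: may be stale after renumbering */
    } else {
        end = 0;
        if (sscanf(s, "%x.%x%n", &d, &f, &end) != 2 || s[end] != '\0')
            return -1;
    }

    /* Device numbers are 5 bits wide, function numbers 3 bits. */
    if (d > 0x1f || f > 7)
        return -1;

    *dev = d;
    *func = f;
    return 0;
}
```

Both "03.0" and "1c:03.0" parse to device 3, function 0. That is exactly the situation described above: the 1c in the second form is not checked against anything, so a path can carry a bus number that is no longer accurate.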