This is mostly Tal's work to reduce code duplication in drivers and unify
the approach for reporting PCIe link speed/width and whether the device is
being limited by a slower upstream link.

This v5 series is based on Tal's v4 [1].

Changes since v4:
- Added patches to replace uses of pcie_get_minimum_link() in bnx2x,
bnxt_en, cxgb4, fm10k, and ixgbe. Note that this is a user-visible
change to the log messages, and in some cases changes dev_warn() to
dev_info(). I hope we can converge on something that works for
everybody, and it's OK if we need to tweak the text and/or level used
in pcie_print_link_status() to get there.
- Rebased on top of Jay Fang's patch that adds 16 GT/s decoding support.
- Changed pcie_get_speed_cap() and pcie_get_width_cap() to return the
values directly instead of returning both an error code and the value
via a reference parameter. I don't think the callers can really use
both the error and the value.
- Moved some declarations from linux/pci.h to drivers/pci/pci.h so
they're not visible outside the PCI subsystem. Also removed
corresponding EXPORT_SYMBOL()s. If we need these outside the PCI core,
we can export them again, but that's not needed yet.
- Reworked pcie_bandwidth_available() so it finds the uppermost limiting
device and returns width/speed info for that device (previously it
could return width from one device and speed from a different one).
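
To illustrate the reworked walk in the last item, here is a minimal
userspace sketch. It is not the kernel code: struct link_node and
bandwidth_available() are hypothetical stand-ins for struct pci_dev and
pcie_bandwidth_available(), and the per-lane rate is passed in directly
instead of being decoded from Link Status. The sketch checks each optional
out-pointer before writing to it, since a caller may pass NULL for all
three.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev: one link plus its upstream. */
struct link_node {
	const char *name;
	uint32_t mbs_per_lane;		/* Mb/s per lane after encoding */
	uint32_t width;			/* negotiated lane count */
	struct link_node *upstream;	/* NULL at the root */
};

/*
 * Walk from dev toward the root and return the smallest link bandwidth
 * seen.  On a tie the node closer to the root wins, so the uppermost
 * limiting device is reported.  The per-lane rate and width always come
 * from that same node.  limiting, mbs and width may each be NULL.
 */
static uint32_t bandwidth_available(struct link_node *dev,
				    struct link_node **limiting,
				    uint32_t *mbs, uint32_t *width)
{
	uint32_t bw = 0, next_bw;

	if (limiting)
		*limiting = NULL;
	if (mbs)
		*mbs = 0;
	if (width)
		*width = 0;

	while (dev) {
		next_bw = dev->width * dev->mbs_per_lane;

		/* Does this link limit the total bandwidth? */
		if (!bw || next_bw <= bw) {
			bw = next_bw;

			if (limiting)
				*limiting = dev;
			if (mbs)
				*mbs = dev->mbs_per_lane;
			if (width)
				*width = dev->width;
		}

		dev = dev->upstream;
	}

	return bw;
}
```

For an x8 endpoint behind an x4 root port, both at 8 GT/s, this returns
4 * 7877 = 31508 and reports the root port's speed and width together.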

The incremental diff between the v4 series (based on v4.17-rc1) and this v5
series (based on v4.17-rc1 + Jay Fang's patch) is attached. This diff
doesn't include the new patches to bnx2x, bnxt_en, cxgb4, fm10k, and ixgbe.

I don't have any of this hardware, so this is only compile-tested.

Bjorn

[1] https://lkml.kernel.org/r/[email protected]
---
Bjorn Helgaas (6):
bnx2x: Report PCIe link properties with pcie_print_link_status()
bnxt_en: Report PCIe link properties with pcie_print_link_status()
cxgb4: Report PCIe link properties with pcie_print_link_status()
fm10k: Report PCIe link properties with pcie_print_link_status()
ixgbe: Report PCIe link properties with pcie_print_link_status()
PCI: Remove unused pcie_get_minimum_link()
Tal Gilboa (8):
PCI: Add pcie_get_speed_cap() to find max supported link speed
PCI: Add pcie_get_width_cap() to find max supported link width
PCI: Add pcie_bandwidth_capable() to compute max supported link bandwidth
PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
PCI: Add pcie_print_link_status() to log link speed and whether it's limited
net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
net/mlx5: Report PCIe link properties with pcie_print_link_status()
net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 23 +--
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 19 --
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 75 ---------
drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 87 -----------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 47 ------
drivers/net/ethernet/mellanox/mlx4/main.c | 81 ----------
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 32 ----
drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 +
drivers/pci/pci-sysfs.c | 38 +----
drivers/pci/pci.c | 167 ++++++++++++++++++---
drivers/pci/pci.h | 20 +++
include/linux/pci.h | 6 +
12 files changed, 189 insertions(+), 410 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 1bbd6cd20213..93291ec4a3d1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3864,25 +3864,6 @@ void mlx5e_build_default_indir_rqt(u32 *indirection_rqt, int len,
indirection_rqt[i] = i % num_channels;
}
-static int mlx5e_get_pci_bw(struct mlx5_core_dev *mdev, u32 *pci_bw)
-{
- enum pcie_link_width width;
- enum pci_bus_speed speed;
- int err = 0;
- int bw;
-
- err = pcie_bandwidth_available(mdev->pdev, &speed, &width, &bw, NULL);
- if (err)
- return err;
-
- if (speed == PCI_SPEED_UNKNOWN || width == PCIE_LNK_WIDTH_UNKNOWN)
- return -EINVAL;
-
- *pci_bw = bw;
-
- return 0;
-}
-
static bool cqe_compress_heuristic(u32 link_speed, u32 pci_bw)
{
return (link_speed && pci_bw &&
@@ -3968,7 +3949,7 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
params->num_tc = 1;
mlx5e_get_max_linkspeed(mdev, &link_speed);
- mlx5e_get_pci_bw(mdev, &pci_bw);
+ pci_bw = pcie_bandwidth_available(mdev->pdev, NULL, NULL, NULL);
mlx5_core_dbg(mdev, "Max link speed = %d, PCI BW = %d\n",
link_speed, pci_bw);
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index f4b88674f029..63d0952684fb 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -158,33 +158,18 @@ static DEVICE_ATTR_RO(resource);
static ssize_t max_link_speed_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct pci_dev *pci_dev = to_pci_dev(dev);
- enum pci_bus_speed speed;
- const char *speed_str;
- int err;
-
- err = pcie_get_speed_cap(pci_dev, &speed);
- if (err)
- return -EINVAL;
-
- speed_str = PCIE_SPEED2STR(speed);
+ struct pci_dev *pdev = to_pci_dev(dev);
- return sprintf(buf, "%s\n", speed_str);
+ return sprintf(buf, "%s\n", PCIE_SPEED2STR(pcie_get_speed_cap(pdev)));
}
static DEVICE_ATTR_RO(max_link_speed);
static ssize_t max_link_width_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct pci_dev *pci_dev = to_pci_dev(dev);
- enum pcie_link_width width;
- int err;
-
- err = pcie_get_width_cap(pci_dev, &width);
- if (err)
- return -EINVAL;
+ struct pci_dev *pdev = to_pci_dev(dev);
- return sprintf(buf, "%u\n", width);
+ return sprintf(buf, "%u\n", pcie_get_width_cap(pdev));
}
static DEVICE_ATTR_RO(max_link_width);
@@ -201,6 +186,9 @@ static ssize_t current_link_speed_show(struct device *dev,
return -EINVAL;
switch (linkstat & PCI_EXP_LNKSTA_CLS) {
+ case PCI_EXP_LNKSTA_CLS_16_0GB:
+ speed = "16 GT/s";
+ break;
case PCI_EXP_LNKSTA_CLS_8_0GB:
speed = "8 GT/s";
break;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index bd8aa64d083a..b6951c44ae6c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5103,193 +5103,169 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
}
EXPORT_SYMBOL(pcie_set_mps);
-/**
- * pcie_get_minimum_link - determine minimum link settings of a PCI device
- * @dev: PCI device to query
- * @speed: storage for minimum speed
- * @width: storage for minimum width
- *
- * This function use pcie_bandwidth_available() for determining the minimum
- * link width and speed of the device. Legacy code is kept for compatibility.
- */
-int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width)
-{
- int bw;
-
- return pcie_bandwidth_available(dev, speed, width, &bw, NULL);
-}
-EXPORT_SYMBOL(pcie_get_minimum_link);
-
/**
* pcie_bandwidth_available - determine minimum link settings of a PCIe
- device and its bandwidth limitation
+ * device and its bandwidth limitation
* @dev: PCI device to query
- * @speed: storage for minimum speed
- * @width: storage for minimum width
- * @bw: storage for link bandwidth
* @limiting_dev: storage for device causing the bandwidth limitation
+ * @speed: storage for speed of limiting device
+ * @width: storage for width of limiting device
*
- * This function walks up the PCI device chain and determines the minimum width,
- * minimum speed and available bandwidth of the device.
+ * Walk up the PCI device chain and find the point where the minimum
+ * bandwidth is available. Return the bandwidth available there and (if
+ * limiting_dev, speed, and width pointers are supplied) information about
+ * that point.
*/
-int pcie_bandwidth_available(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width, int *bw,
- struct pci_dev **limiting_dev)
+u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
+ enum pci_bus_speed *speed,
+ enum pcie_link_width *width)
{
- int err;
+ u16 lnksta;
+ enum pci_bus_speed next_speed;
+ enum pcie_link_width next_width;
+ u32 bw, next_bw;
*speed = PCI_SPEED_UNKNOWN;
*width = PCIE_LNK_WIDTH_UNKNOWN;
- *bw = 0;
+ bw = 0;
while (dev) {
- u16 lnksta;
- enum pci_bus_speed next_speed;
- enum pcie_link_width next_width;
-
- err = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
- if (err)
- return err;
+ pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >>
PCI_EXP_LNKSTA_NLW_SHIFT;
- if (next_speed < *speed)
- *speed = next_speed;
-
- if (next_width < *width)
- *width = next_width;
+ next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
/* Check if current device limits the total bandwidth */
- if (!(*bw) ||
- (*bw > next_width * PCIE_SPEED2MBS_ENC(next_speed))) {
+ if (!bw || next_bw <= bw) {
+ bw = next_bw;
+
if (limiting_dev)
*limiting_dev = dev;
- *bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
+ if (speed)
+ *speed = next_speed;
+ if (width)
+ *width = next_width;
}
- dev = dev->bus->self;
+ dev = pci_upstream_bridge(dev);
}
- return 0;
+ return bw;
}
EXPORT_SYMBOL(pcie_bandwidth_available);
/**
- * pcie_get_speed_cap - queries for the PCI device's link speed capability
+ * pcie_get_speed_cap - query for the PCI device's link speed capability
* @dev: PCI device to query
- * @speed: storage for link speed
*
- * This function queries the PCI device speed capability.
+ * Query the PCI device speed capability. Return the maximum link speed
+ * supported by the device.
*/
-int pcie_get_speed_cap(struct pci_dev *dev, enum pci_bus_speed *speed)
+enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev)
{
- u32 lnkcap;
- int err1, err2;
+ u32 lnkcap2, lnkcap;
- *speed = PCI_SPEED_UNKNOWN;
+ /*
+ * PCIe r4.0 sec 7.5.3.18 recommends using the Supported Link
+ * Speeds Vector in Link Capabilities 2 when supported, falling
+ * back to Max Link Speed in Link Capabilities otherwise.
+ */
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
+ if (lnkcap2) { /* PCIe r3.0-compliant */
+ if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_16_0GB)
+ return PCIE_SPEED_16_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
+ return PCIE_SPEED_8_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
+ return PCIE_SPEED_5_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
+ return PCIE_SPEED_2_5GT;
+ return PCI_SPEED_UNKNOWN;
+ }
- err1 = pcie_capability_read_dword(dev, PCI_EXP_LNKCAP,
- &lnkcap);
- if (!err1 && lnkcap) {
- if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
- *speed = PCIE_SPEED_8_0GT;
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if (lnkcap) {
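+ if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_16_0GB)
+ return PCIE_SPEED_16_0GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_8_0GB)
+ return PCIE_SPEED_8_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP_SLS_5_0GB)
- *speed = PCIE_SPEED_5_0GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
+ return PCIE_SPEED_5_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP_SLS_2_5GB)
- *speed = PCIE_SPEED_2_5GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_2_5GB)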
- return 0;
- }
-
- err2 = pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2,
- &lnkcap);
- if (!err2 && lnkcap) { /* PCIe r3.0-compliant */
- if (lnkcap & PCI_EXP_LNKCAP2_SLS_8_0GB)
- *speed = PCIE_SPEED_8_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP2_SLS_5_0GB)
- *speed = PCIE_SPEED_5_0GT;
- else if (lnkcap & PCI_EXP_LNKCAP2_SLS_2_5GB)
- *speed = PCIE_SPEED_2_5GT;
- return 0;
+ return PCIE_SPEED_2_5GT;
}
- return err1 ? err1 : err2;
+ return PCI_SPEED_UNKNOWN;
}
-EXPORT_SYMBOL(pcie_get_speed_cap);
/**
- * pcie_get_width_cap - queries for the PCI device's link width capability
+ * pcie_get_width_cap - query for the PCI device's link width capability
* @dev: PCI device to query
- * @width: storage for link width
*
- * This function queries the PCI device width capability.
+ * Query the PCI device width capability. Return the maximum link width
+ * supported by the device.
*/
-int pcie_get_width_cap(struct pci_dev *dev, enum pcie_link_width *width)
+enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
{
u32 lnkcap;
- int err;
-
- *width = PCIE_LNK_WIDTH_UNKNOWN;
- err = pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
- if (!err && lnkcap)
- /* Shift start of width mask by 4 to get actual speed cap */
- *width = (lnkcap & PCI_EXP_LNKCAP_MLW) >> 4;
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if (lnkcap)
+ return (lnkcap & PCI_EXP_LNKCAP_MLW) >> 4;
- return err;
+ return PCIE_LNK_WIDTH_UNKNOWN;
}
-EXPORT_SYMBOL(pcie_get_width_cap);
/**
- * pcie_bandwidth_capable - Calculates a PCI device's link bandwidth capability
+ * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
* @dev: PCI device
* @speed: storage for link speed
* @width: storage for link width
*
- * This function caculates a PCI device's link bandwidth by querying for its
- * link speed and width, multiplying them, and applying encoding overhead.
+ * Calculate a PCI device's link bandwidth by querying for its link speed
+ * and width, multiplying them, and applying encoding overhead.
*/
-int pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
+u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
enum pcie_link_width *width)
{
- pcie_get_speed_cap(dev, speed);
- pcie_get_width_cap(dev, width);
+ *speed = pcie_get_speed_cap(dev);
+ *width = pcie_get_width_cap(dev);
if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
return 0;
- return (*width) * PCIE_SPEED2MBS_ENC(*speed);
+ return *width * PCIE_SPEED2MBS_ENC(*speed);
}
-EXPORT_SYMBOL(pcie_bandwidth_capable);
/**
- * pcie_print_link_status - Reports the PCI device's link speed and width.
+ * pcie_print_link_status - Report the PCI device's link speed and width
* @dev: PCI device to query
*
- * This function checks whether the PCI device current speed and width are equal
- * to the maximum PCI device capabilities.
+ * Report the available bandwidth at the device. If this is less than the
+ * device is capable of, report the device's maximum possible bandwidth and
+ * the upstream link that limits its performance to less than that.
*/
void pcie_print_link_status(struct pci_dev *dev)
{
enum pcie_link_width width, width_cap;
- struct pci_dev *limiting_dev = NULL;
enum pci_bus_speed speed, speed_cap;
- int bw, bw_cap;
+ struct pci_dev *limiting_dev = NULL;
+ u32 bw_avail, bw_cap;
bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
- pcie_bandwidth_available(dev, &speed, &width, &bw, &limiting_dev);
+ bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed, &width);
- if (bw >= bw_cap)
+ if (bw_avail >= bw_cap)
pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
- bw, PCIE_SPEED2STR(speed), width);
+ bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
else
- pci_info(dev, "%d Mb/s available bandwidth (capable of %d Mb/s, %s x%d link)\n",
- bw, bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
- if (limiting_dev && strcmp(pci_name(limiting_dev), pci_name(dev)))
- pci_info(dev, "Bandwidth limited by device at %s\n",
- pci_name(limiting_dev));
+ pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d link at %s (capable of %d Mb/s with %s x%d link)\n",
+ bw_avail, PCIE_SPEED2STR(speed), width,
+ limiting_dev ? pci_name(limiting_dev) : "<unknown>",
+ bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
}
EXPORT_SYMBOL(pcie_print_link_status);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fcd81911b127..2a50172b9803 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -253,6 +253,26 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
void pci_reassigndev_resource_alignment(struct pci_dev *dev);
void pci_disable_bridge_window(struct pci_dev *dev);
+/* PCIe link information */
+#define PCIE_SPEED2STR(speed) \
+ ((speed) == PCIE_SPEED_16_0GT ? "16 GT/s" : \
+ (speed) == PCIE_SPEED_8_0GT ? "8 GT/s" : \
+ (speed) == PCIE_SPEED_5_0GT ? "5 GT/s" : \
+ (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
+ "Unknown speed")
+
+/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
+#define PCIE_SPEED2MBS_ENC(speed) \
+ ((speed) == PCIE_SPEED_8_0GT ? 7877 : \
+ (speed) == PCIE_SPEED_5_0GT ? 4000 : \
+ (speed) == PCIE_SPEED_2_5GT ? 2000 : \
+ 0)
+
+enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
+enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
+u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
+ enum pcie_link_width *width);
+
/* Single Root I/O Virtualization */
struct pci_sriov {
int pos; /* Capability position */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef5377438a1e..86bf045f3d59 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -592,7 +592,7 @@ const unsigned char pcie_link_speed[] = {
PCIE_SPEED_2_5GT, /* 1 */
PCIE_SPEED_5_0GT, /* 2 */
PCIE_SPEED_8_0GT, /* 3 */
- PCI_SPEED_UNKNOWN, /* 4 */
+ PCIE_SPEED_16_0GT, /* 4 */
PCI_SPEED_UNKNOWN, /* 5 */
PCI_SPEED_UNKNOWN, /* 6 */
PCI_SPEED_UNKNOWN, /* 7 */
diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
index d10f556dc03e..191893e19d5c 100644
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -76,6 +76,7 @@ static const char *pci_bus_speed_strings[] = {
"2.5 GT/s PCIe", /* 0x14 */
"5.0 GT/s PCIe", /* 0x15 */
"8.0 GT/s PCIe", /* 0x16 */
+ "16.0 GT/s PCIe", /* 0x17 */
};
static ssize_t bus_speed_read(enum pci_bus_speed speed, char *buf)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1a672c960c8f..5ccee29fe1b1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -256,25 +256,10 @@ enum pci_bus_speed {
PCIE_SPEED_2_5GT = 0x14,
PCIE_SPEED_5_0GT = 0x15,
PCIE_SPEED_8_0GT = 0x16,
+ PCIE_SPEED_16_0GT = 0x17,
PCI_SPEED_UNKNOWN = 0xff,
};
-#define PCIE_SPEED2STR(speed) \
- ((speed) == PCIE_SPEED_8_0GT ? "8 GT/s" : \
- (speed) == PCIE_SPEED_5_0GT ? "5 GT/s" : \
- (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
- "Unknown speed")
-
-/**
- * PCIe speed to Mb/s with encoding overhead:
- * 20% for gen2, ~1.5% for gen3
- */
-#define PCIE_SPEED2MBS_ENC(speed) \
- ((speed) == PCIE_SPEED_8_0GT ? 7877 : \
- (speed) == PCIE_SPEED_5_0GT ? 4000 : \
- (speed) == PCIE_SPEED_2_5GT ? 2000 : \
- 0)
-
struct pci_cap_saved_data {
u16 cap_nr;
bool cap_extended;
@@ -1096,15 +1081,9 @@ int pcie_get_readrq(struct pci_dev *dev);
int pcie_set_readrq(struct pci_dev *dev, int rq);
int pcie_get_mps(struct pci_dev *dev);
int pcie_set_mps(struct pci_dev *dev, int mps);
-int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width);
-int pcie_bandwidth_available(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width, int *bw,
- struct pci_dev **limiting_dev);
-int pcie_get_speed_cap(struct pci_dev *dev, enum pci_bus_speed *speed);
-int pcie_get_width_cap(struct pci_dev *dev, enum pcie_link_width *width);
-int pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width);
+u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
+ enum pci_bus_speed *speed,
+ enum pcie_link_width *width);
void pcie_print_link_status(struct pci_dev *dev);
void pcie_flr(struct pci_dev *dev);
int __pci_reset_function_locked(struct pci_dev *dev);
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 0c79eac5e9b8..103ba797a8f3 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -520,6 +520,7 @@
#define PCI_EXP_LNKCAP_SLS_2_5GB 0x00000001 /* LNKCAP2 SLS Vector bit 0 */
#define PCI_EXP_LNKCAP_SLS_5_0GB 0x00000002 /* LNKCAP2 SLS Vector bit 1 */
#define PCI_EXP_LNKCAP_SLS_8_0GB 0x00000003 /* LNKCAP2 SLS Vector bit 2 */
+#define PCI_EXP_LNKCAP_SLS_16_0GB 0x00000004 /* LNKCAP2 SLS Vector bit 3 */
#define PCI_EXP_LNKCAP_MLW 0x000003f0 /* Maximum Link Width */
#define PCI_EXP_LNKCAP_ASPMS 0x00000c00 /* ASPM Support */
#define PCI_EXP_LNKCAP_L0SEL 0x00007000 /* L0s Exit Latency */
@@ -547,6 +548,7 @@
#define PCI_EXP_LNKSTA_CLS_2_5GB 0x0001 /* Current Link Speed 2.5GT/s */
#define PCI_EXP_LNKSTA_CLS_5_0GB 0x0002 /* Current Link Speed 5.0GT/s */
#define PCI_EXP_LNKSTA_CLS_8_0GB 0x0003 /* Current Link Speed 8.0GT/s */
+#define PCI_EXP_LNKSTA_CLS_16_0GB 0x0004 /* Current Link Speed 16.0GT/s */
#define PCI_EXP_LNKSTA_NLW 0x03f0 /* Negotiated Link Width */
#define PCI_EXP_LNKSTA_NLW_X1 0x0010 /* Current Link Width x1 */
#define PCI_EXP_LNKSTA_NLW_X2 0x0020 /* Current Link Width x2 */
@@ -648,8 +650,9 @@
#define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V2 44 /* v2 endpoints without link end here */
#define PCI_EXP_LNKCAP2 44 /* Link Capabilities 2 */
#define PCI_EXP_LNKCAP2_SLS_2_5GB 0x00000002 /* Supported Speed 2.5GT/s */
-#define PCI_EXP_LNKCAP2_SLS_5_0GB 0x00000004 /* Supported Speed 5.0GT/s */
-#define PCI_EXP_LNKCAP2_SLS_8_0GB 0x00000008 /* Supported Speed 8.0GT/s */
+#define PCI_EXP_LNKCAP2_SLS_5_0GB 0x00000004 /* Supported Speed 5GT/s */
+#define PCI_EXP_LNKCAP2_SLS_8_0GB 0x00000008 /* Supported Speed 8GT/s */
+#define PCI_EXP_LNKCAP2_SLS_16_0GB 0x00000010 /* Supported Speed 16GT/s */
#define PCI_EXP_LNKCAP2_CROSSLINK 0x00000100 /* Crosslink supported */
#define PCI_EXP_LNKCTL2 48 /* Link Control 2 */
#define PCI_EXP_LNKSTA2 50 /* Link Status 2 */
From: Tal Gilboa <[email protected]>
Add pcie_get_speed_cap() to find the max link speed supported by a device.
Change max_link_speed_show() to use pcie_get_speed_cap().
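
As a rough illustration of the lookup order, here is a userspace sketch.
It is not the kernel implementation: speed_cap(), enum link_speed and the
shortened macro names are hypothetical, and the raw register values are
passed in as arguments instead of being read from config space. The
constants mirror the pci_regs.h values. The sketch treats the Supported
Link Speeds Vector in Link Capabilities 2 as a bit vector and the Max Link
Speed field in Link Capabilities as an encoded value (1..4) that is masked
and compared.

```c
#include <stdint.h>

enum link_speed {
	SPEED_UNKNOWN = 0,
	SPEED_2_5GT,
	SPEED_5_0GT,
	SPEED_8_0GT,
	SPEED_16_0GT,
};

/* Link Capabilities: Max Link Speed is an encoded 4-bit field */
#define LNKCAP_SLS		0x0000000f
#define LNKCAP_SLS_2_5GB	0x00000001
#define LNKCAP_SLS_5_0GB	0x00000002
#define LNKCAP_SLS_8_0GB	0x00000003
#define LNKCAP_SLS_16_0GB	0x00000004

/* Link Capabilities 2: Supported Link Speeds Vector, one bit per speed */
#define LNKCAP2_SLS_2_5GB	0x00000002
#define LNKCAP2_SLS_5_0GB	0x00000004
#define LNKCAP2_SLS_8_0GB	0x00000008
#define LNKCAP2_SLS_16_0GB	0x00000010

/* Prefer the LNKCAP2 vector when it is non-zero, else fall back to LNKCAP. */
static enum link_speed speed_cap(uint32_t lnkcap2, uint32_t lnkcap)
{
	if (lnkcap2) {
		if (lnkcap2 & LNKCAP2_SLS_16_0GB)
			return SPEED_16_0GT;
		if (lnkcap2 & LNKCAP2_SLS_8_0GB)
			return SPEED_8_0GT;
		if (lnkcap2 & LNKCAP2_SLS_5_0GB)
			return SPEED_5_0GT;
		if (lnkcap2 & LNKCAP2_SLS_2_5GB)
			return SPEED_2_5GT;
		return SPEED_UNKNOWN;
	}

	switch (lnkcap & LNKCAP_SLS) {
	case LNKCAP_SLS_16_0GB:
		return SPEED_16_0GT;
	case LNKCAP_SLS_8_0GB:
		return SPEED_8_0GT;
	case LNKCAP_SLS_5_0GB:
		return SPEED_5_0GT;
	case LNKCAP_SLS_2_5GB:
		return SPEED_2_5GT;
	default:
		return SPEED_UNKNOWN;
	}
}
```
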
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: return speed directly instead of error and *speed, don't export
outside drivers/pci]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
---
drivers/pci/pci-sysfs.c | 28 ++--------------------------
drivers/pci/pci.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.h | 10 ++++++++++
3 files changed, 56 insertions(+), 26 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7dc5be545d18..c2ea05fbbf1d 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -158,33 +158,9 @@ static DEVICE_ATTR_RO(resource);
static ssize_t max_link_speed_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct pci_dev *pci_dev = to_pci_dev(dev);
- u32 linkcap;
- int err;
- const char *speed;
-
- err = pcie_capability_read_dword(pci_dev, PCI_EXP_LNKCAP, &linkcap);
- if (err)
- return -EINVAL;
-
- switch (linkcap & PCI_EXP_LNKCAP_SLS) {
- case PCI_EXP_LNKCAP_SLS_16_0GB:
- speed = "16 GT/s";
- break;
- case PCI_EXP_LNKCAP_SLS_8_0GB:
- speed = "8 GT/s";
- break;
- case PCI_EXP_LNKCAP_SLS_5_0GB:
- speed = "5 GT/s";
- break;
- case PCI_EXP_LNKCAP_SLS_2_5GB:
- speed = "2.5 GT/s";
- break;
- default:
- speed = "Unknown speed";
- }
+ struct pci_dev *pdev = to_pci_dev(dev);
- return sprintf(buf, "%s\n", speed);
+ return sprintf(buf, "%s\n", PCIE_SPEED2STR(pcie_get_speed_cap(pdev)));
}
static DEVICE_ATTR_RO(max_link_speed);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f6a4dd10d9b0..b29d3436ee9f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5146,6 +5146,50 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
}
EXPORT_SYMBOL(pcie_get_minimum_link);
+/**
+ * pcie_get_speed_cap - query for the PCI device's link speed capability
+ * @dev: PCI device to query
+ *
+ * Query the PCI device speed capability. Return the maximum link speed
+ * supported by the device.
+ */
+enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev)
+{
+ u32 lnkcap2, lnkcap;
+
+ /*
+ * PCIe r4.0 sec 7.5.3.18 recommends using the Supported Link
+ * Speeds Vector in Link Capabilities 2 when supported, falling
+ * back to Max Link Speed in Link Capabilities otherwise.
+ */
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP2, &lnkcap2);
+ if (lnkcap2) { /* PCIe r3.0-compliant */
+ if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_16_0GB)
+ return PCIE_SPEED_16_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
+ return PCIE_SPEED_8_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
+ return PCIE_SPEED_5_0GT;
+ else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
+ return PCIE_SPEED_2_5GT;
+ return PCI_SPEED_UNKNOWN;
+ }
+
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if (lnkcap) {
+ if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_16_0GB)
+ return PCIE_SPEED_16_0GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_8_0GB)
+ return PCIE_SPEED_8_0GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_5_0GB)
+ return PCIE_SPEED_5_0GT;
+ else if ((lnkcap & PCI_EXP_LNKCAP_SLS) == PCI_EXP_LNKCAP_SLS_2_5GB)
+ return PCIE_SPEED_2_5GT;
+ }
+
+ return PCI_SPEED_UNKNOWN;
+}
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fcd81911b127..1186d8be6055 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -253,6 +253,16 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
void pci_reassigndev_resource_alignment(struct pci_dev *dev);
void pci_disable_bridge_window(struct pci_dev *dev);
+/* PCIe link information */
+#define PCIE_SPEED2STR(speed) \
+ ((speed) == PCIE_SPEED_16_0GT ? "16 GT/s" : \
+ (speed) == PCIE_SPEED_8_0GT ? "8 GT/s" : \
+ (speed) == PCIE_SPEED_5_0GT ? "5 GT/s" : \
+ (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
+ "Unknown speed")
+
+enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
+
/* Single Root I/O Virtualization */
struct pci_sriov {
int pos; /* Capability position */
From: Tal Gilboa <[email protected]>
Add pcie_get_width_cap() to find the max link width supported by a device.
Change max_link_width_show() to use pcie_get_width_cap().
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: return width directly instead of error and *width, don't export
outside drivers/pci]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
---
drivers/pci/pci-sysfs.c | 10 ++--------
drivers/pci/pci.c | 18 ++++++++++++++++++
drivers/pci/pci.h | 1 +
3 files changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c2ea05fbbf1d..63d0952684fb 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -167,15 +167,9 @@ static DEVICE_ATTR_RO(max_link_speed);
static ssize_t max_link_width_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct pci_dev *pci_dev = to_pci_dev(dev);
- u32 linkcap;
- int err;
-
- err = pcie_capability_read_dword(pci_dev, PCI_EXP_LNKCAP, &linkcap);
- if (err)
- return -EINVAL;
+ struct pci_dev *pdev = to_pci_dev(dev);
- return sprintf(buf, "%u\n", (linkcap & PCI_EXP_LNKCAP_MLW) >> 4);
+ return sprintf(buf, "%u\n", pcie_get_width_cap(pdev));
}
static DEVICE_ATTR_RO(max_link_width);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b29d3436ee9f..43075be79388 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5190,6 +5190,24 @@ enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev)
return PCI_SPEED_UNKNOWN;
}
+/**
+ * pcie_get_width_cap - query for the PCI device's link width capability
+ * @dev: PCI device to query
+ *
+ * Query the PCI device width capability. Return the maximum link width
+ * supported by the device.
+ */
+enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
+{
+ u32 lnkcap;
+
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if (lnkcap)
+ return (lnkcap & PCI_EXP_LNKCAP_MLW) >> 4;
+
+ return PCIE_LNK_WIDTH_UNKNOWN;
+}
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1186d8be6055..66738f1050c0 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -262,6 +262,7 @@ void pci_disable_bridge_window(struct pci_dev *dev);
"Unknown speed")
enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
+enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
/* Single Root I/O Virtualization */
struct pci_sriov {
From: Tal Gilboa <[email protected]>
Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
a device, based on the max link speed and width, adjusted by the encoding
overhead.
The maximum bandwidth of the link is computed as:
max_link_speed * max_link_width * (1 - encoding_overhead)
The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
encoding.
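
The formula above can be checked with a small userspace sketch. The helper
names lane_mbs() and link_mbs() are hypothetical and not part of this
patch. They take the raw rate in MT/s and the encoding ratio as arguments
and round to the nearest Mb/s, which yields the 2000, 4000 and 7877 Mb/s
per-lane figures.

```c
#include <stdint.h>

/*
 * Usable Mb/s on one lane.  raw_mts is the raw rate in MT/s (2500 for
 * 2.5 GT/s); payload_bits/total_bits is the encoding ratio (8/10 for
 * 8b/10b, 128/130 for 128b/130b).  Rounds to the nearest Mb/s.
 */
static uint32_t lane_mbs(uint32_t raw_mts, uint32_t payload_bits,
			 uint32_t total_bits)
{
	return (raw_mts * payload_bits + total_bits / 2) / total_bits;
}

/* Max link bandwidth in Mb/s: per-lane rate times lane count. */
static uint32_t link_mbs(uint32_t raw_mts, uint32_t width,
			 uint32_t payload_bits, uint32_t total_bits)
{
	return width * lane_mbs(raw_mts, payload_bits, total_bits);
}
```

With these helpers an 8 GT/s x8 link comes out at 8 * 7877 = 63016 Mb/s
and a 5 GT/s x4 link at 4 * 4000 = 16000 Mb/s.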
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
signatures, don't export outside drivers/pci]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
---
drivers/pci/pci.c | 21 +++++++++++++++++++++
drivers/pci/pci.h | 9 +++++++++
2 files changed, 30 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 43075be79388..9ce89e254197 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
return PCIE_LNK_WIDTH_UNKNOWN;
}
+/**
+ * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
+ * @dev: PCI device
+ * @speed: storage for link speed
+ * @width: storage for link width
+ *
+ * Calculate a PCI device's link bandwidth by querying for its link speed
+ * and width, multiplying them, and applying encoding overhead.
+ */
+u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
+ enum pcie_link_width *width)
+{
+ *speed = pcie_get_speed_cap(dev);
+ *width = pcie_get_width_cap(dev);
+
+ if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
+ return 0;
+
+ return *width * PCIE_SPEED2MBS_ENC(*speed);
+}
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 66738f1050c0..2a50172b9803 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
(speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
"Unknown speed")
+/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
+#define PCIE_SPEED2MBS_ENC(speed) \
+ ((speed) == PCIE_SPEED_8_0GT ? 7877 : \
+ (speed) == PCIE_SPEED_5_0GT ? 4000 : \
+ (speed) == PCIE_SPEED_2_5GT ? 2000 : \
+ 0)
+
enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
+u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
+ enum pcie_link_width *width);
/* Single Root I/O Virtualization */
struct pci_sriov {
From: Tal Gilboa <[email protected]>
Add pcie_print_link_status(). This logs the current settings of the link
(speed, width, and total available bandwidth).
If the device is capable of more bandwidth but is limited by a slower
upstream link, we include information about the link that limits the
device's performance.
The user may be able to move the device to a different slot for better
performance.
This provides a unified method for all PCI devices to report status and
issues, instead of each device reporting in a different way, using
different code.
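
To show how the two message variants are selected, here is a userspace
sketch that builds the same text with snprintf(). It is only an
illustration: struct link_report and format_link_status() are hypothetical
names, the speed strings are passed in ready-made, and the device name
used in the note below is made up. The integers are printed with %u here
because the sketch stores them as unsigned int.

```c
#include <stdio.h>

/* Hypothetical bundle of the values the report is built from. */
struct link_report {
	unsigned int bw_avail;		/* Mb/s available at the device */
	unsigned int bw_cap;		/* Mb/s the device is capable of */
	const char *speed;		/* speed of the limiting link */
	const char *speed_cap;		/* max speed of the device */
	unsigned int width;		/* width of the limiting link */
	unsigned int width_cap;		/* max width of the device */
	const char *limiting;		/* limiting device name, may be NULL */
};

/*
 * Format the one-line status: the short form when the device gets its
 * full bandwidth, the long form naming the limiting link otherwise.
 */
static int format_link_status(char *buf, size_t len,
			      const struct link_report *r)
{
	if (r->bw_avail >= r->bw_cap)
		return snprintf(buf, len,
				"%u Mb/s available bandwidth (%s x%u link)",
				r->bw_cap, r->speed_cap, r->width_cap);

	return snprintf(buf, len,
			"%u Mb/s available bandwidth, limited by %s x%u link at %s (capable of %u Mb/s with %s x%u link)",
			r->bw_avail, r->speed, r->width,
			r->limiting ? r->limiting : "<unknown>",
			r->bw_cap, r->speed_cap, r->width_cap);
}
```

For an 8 GT/s x8 device behind a hypothetical 8 GT/s x4 port named
0000:00:01.0, the long form names that port and reports both the 31508
Mb/s available and the 63016 Mb/s the device is capable of.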
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: changelog, reword log messages, print device capabilities when
not limited]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pci.c | 29 +++++++++++++++++++++++++++++
include/linux/pci.h | 1 +
2 files changed, 30 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e00d56b12747..cec7aed09f6b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5283,6 +5283,35 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
return *width * PCIE_SPEED2MBS_ENC(*speed);
}
+/**
+ * pcie_print_link_status - Report the PCI device's link speed and width
+ * @dev: PCI device to query
+ *
+ * Report the available bandwidth at the device. If this is less than the
+ * device is capable of, report the device's maximum possible bandwidth and
+ * the upstream link that limits its performance to less than that.
+ */
+void pcie_print_link_status(struct pci_dev *dev)
+{
+ enum pcie_link_width width, width_cap;
+ enum pci_bus_speed speed, speed_cap;
+ struct pci_dev *limiting_dev = NULL;
+ u32 bw_avail, bw_cap;
+
+ bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
+ bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed, &width);
+
+ if (bw_avail >= bw_cap)
+ pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
+ bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
+ else
+ pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d link at %s (capable of %d Mb/s with %s x%d link)\n",
+ bw_avail, PCIE_SPEED2STR(speed), width,
+ limiting_dev ? pci_name(limiting_dev) : "<unknown>",
+ bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
+}
+EXPORT_SYMBOL(pcie_print_link_status);
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f2bf2b7a66c7..38f7957121ef 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1086,6 +1086,7 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
enum pci_bus_speed *speed,
enum pcie_link_width *width);
+void pcie_print_link_status(struct pci_dev *dev);
void pcie_flr(struct pci_dev *dev);
int __pci_reset_function_locked(struct pci_dev *dev);
int pci_reset_function(struct pci_dev *dev);
From: Tal Gilboa <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations.
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 2ef641c91c26..622f02d34aae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1043,6 +1043,10 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
dev_info(&pdev->dev, "firmware version: %d.%d.%d\n", fw_rev_maj(dev),
fw_rev_min(dev), fw_rev_sub(dev));
+ /* Only PFs hold the relevant PCIe information for this query */
+ if (mlx5_core_is_pf(dev))
+ pcie_print_link_status(dev->pdev);
+
/* on load removing any previous indication of internal error, device is
* up
*/
From: Tal Gilboa <[email protected]>
Use the new pcie_bandwidth_available() function to calculate maximum
available bandwidth through the PCI chain instead of computing it ourselves
with mlx5e_get_pci_bw().
This is used to detect when the device is capable of more bandwidth than is
available in the current slot. The driver may adjust compression settings
accordingly.
Note that pcie_bandwidth_available() accounts for PCIe encoding overhead, so
it is more accurate than mlx5e_get_pci_bw() was.
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: remove mlx5e_get_pci_bw() wrapper altogether]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 32 +--------------------
1 file changed, 1 insertion(+), 31 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 47bab842c5ee..93291ec4a3d1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3864,36 +3864,6 @@ void mlx5e_build_default_indir_rqt(u32 *indirection_rqt, int len,
indirection_rqt[i] = i % num_channels;
}
-static int mlx5e_get_pci_bw(struct mlx5_core_dev *mdev, u32 *pci_bw)
-{
- enum pcie_link_width width;
- enum pci_bus_speed speed;
- int err = 0;
-
- err = pcie_get_minimum_link(mdev->pdev, &speed, &width);
- if (err)
- return err;
-
- if (speed == PCI_SPEED_UNKNOWN || width == PCIE_LNK_WIDTH_UNKNOWN)
- return -EINVAL;
-
- switch (speed) {
- case PCIE_SPEED_2_5GT:
- *pci_bw = 2500 * width;
- break;
- case PCIE_SPEED_5_0GT:
- *pci_bw = 5000 * width;
- break;
- case PCIE_SPEED_8_0GT:
- *pci_bw = 8000 * width;
- break;
- default:
- return -EINVAL;
- }
-
- return 0;
-}
-
static bool cqe_compress_heuristic(u32 link_speed, u32 pci_bw)
{
return (link_speed && pci_bw &&
@@ -3979,7 +3949,7 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
params->num_tc = 1;
mlx5e_get_max_linkspeed(mdev, &link_speed);
- mlx5e_get_pci_bw(mdev, &pci_bw);
+ pci_bw = pcie_bandwidth_available(mdev->pdev, NULL, NULL, NULL);
mlx5_core_dbg(mdev, "Max link speed = %d, PCI BW = %d\n",
link_speed, pci_bw);
From: Bjorn Helgaas <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Note that pcie_get_minimum_link() can return misleading information because
it finds the slowest link and the narrowest link without considering the
total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
about 2000 MB/s for a 16 GT/s x1 link.
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 75 -----------------------
1 file changed, 1 insertion(+), 74 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 56bc626ef006..2d6864c8199e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4762,79 +4762,6 @@ static int init_rss(struct adapter *adap)
return 0;
}
-static int cxgb4_get_pcie_dev_link_caps(struct adapter *adap,
- enum pci_bus_speed *speed,
- enum pcie_link_width *width)
-{
- u32 lnkcap1, lnkcap2;
- int err1, err2;
-
-#define PCIE_MLW_CAP_SHIFT 4 /* start of MLW mask in link capabilities */
-
- *speed = PCI_SPEED_UNKNOWN;
- *width = PCIE_LNK_WIDTH_UNKNOWN;
-
- err1 = pcie_capability_read_dword(adap->pdev, PCI_EXP_LNKCAP,
- &lnkcap1);
- err2 = pcie_capability_read_dword(adap->pdev, PCI_EXP_LNKCAP2,
- &lnkcap2);
- if (!err2 && lnkcap2) { /* PCIe r3.0-compliant */
- if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
- *speed = PCIE_SPEED_8_0GT;
- else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
- *speed = PCIE_SPEED_5_0GT;
- else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
- *speed = PCIE_SPEED_2_5GT;
- }
- if (!err1) {
- *width = (lnkcap1 & PCI_EXP_LNKCAP_MLW) >> PCIE_MLW_CAP_SHIFT;
- if (!lnkcap2) { /* pre-r3.0 */
- if (lnkcap1 & PCI_EXP_LNKCAP_SLS_5_0GB)
- *speed = PCIE_SPEED_5_0GT;
- else if (lnkcap1 & PCI_EXP_LNKCAP_SLS_2_5GB)
- *speed = PCIE_SPEED_2_5GT;
- }
- }
-
- if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
- return err1 ? err1 : err2 ? err2 : -EINVAL;
- return 0;
-}
-
-static void cxgb4_check_pcie_caps(struct adapter *adap)
-{
- enum pcie_link_width width, width_cap;
- enum pci_bus_speed speed, speed_cap;
-
-#define PCIE_SPEED_STR(speed) \
- (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" : \
- speed == PCIE_SPEED_5_0GT ? "5.0GT/s" : \
- speed == PCIE_SPEED_2_5GT ? "2.5GT/s" : \
- "Unknown")
-
- if (cxgb4_get_pcie_dev_link_caps(adap, &speed_cap, &width_cap)) {
- dev_warn(adap->pdev_dev,
- "Unable to determine PCIe device BW capabilities\n");
- return;
- }
-
- if (pcie_get_minimum_link(adap->pdev, &speed, &width) ||
- speed == PCI_SPEED_UNKNOWN || width == PCIE_LNK_WIDTH_UNKNOWN) {
- dev_warn(adap->pdev_dev,
- "Unable to determine PCI Express bandwidth.\n");
- return;
- }
-
- dev_info(adap->pdev_dev, "PCIe link speed is %s, device supports %s\n",
- PCIE_SPEED_STR(speed), PCIE_SPEED_STR(speed_cap));
- dev_info(adap->pdev_dev, "PCIe link width is x%d, device supports x%d\n",
- width, width_cap);
- if (speed < speed_cap || width < width_cap)
- dev_info(adap->pdev_dev,
- "A slot with more lanes and/or higher speed is "
- "suggested for optimal performance.\n");
-}
-
/* Dump basic information about the adapter */
static void print_adapter_info(struct adapter *adapter)
{
@@ -5466,7 +5393,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
}
/* check for PCI Express bandwidth capabiltites */
- cxgb4_check_pcie_caps(adapter);
+ pcie_print_link_status(pdev);
err = init_rss(adapter);
if (err)
From: Bjorn Helgaas <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Note that pcie_get_minimum_link() can return misleading information because
it finds the slowest link and the narrowest link without considering the
total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
about 2000 MB/s for a 16 GT/s x1 link.
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 19 +------------------
1 file changed, 1 insertion(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1500243b9886..3be42431e029 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8469,22 +8469,6 @@ static int bnxt_init_mac_addr(struct bnxt *bp)
return rc;
}
-static void bnxt_parse_log_pcie_link(struct bnxt *bp)
-{
- enum pcie_link_width width = PCIE_LNK_WIDTH_UNKNOWN;
- enum pci_bus_speed speed = PCI_SPEED_UNKNOWN;
-
- if (pcie_get_minimum_link(pci_physfn(bp->pdev), &speed, &width) ||
- speed == PCI_SPEED_UNKNOWN || width == PCIE_LNK_WIDTH_UNKNOWN)
- netdev_info(bp->dev, "Failed to determine PCIe Link Info\n");
- else
- netdev_info(bp->dev, "PCIe: Speed %s Width x%d\n",
- speed == PCIE_SPEED_2_5GT ? "2.5GT/s" :
- speed == PCIE_SPEED_5_0GT ? "5.0GT/s" :
- speed == PCIE_SPEED_8_0GT ? "8.0GT/s" :
- "Unknown", width);
-}
-
static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
{
static int version_printed;
@@ -8694,8 +8678,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
netdev_info(dev, "%s found at mem %lx, node addr %pM\n",
board_info[ent->driver_data].name,
(long)pci_resource_start(pdev, 0), dev->dev_addr);
-
- bnxt_parse_log_pcie_link(bp);
+ pcie_print_link_status(pdev);
return 0;
From: Tal Gilboa <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Signed-off-by: Tal Gilboa <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/main.c | 81 -----------------------------
1 file changed, 1 insertion(+), 80 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 4d84cab77105..30cacac54e69 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -623,85 +623,6 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
return 0;
}
-static int mlx4_get_pcie_dev_link_caps(struct mlx4_dev *dev,
- enum pci_bus_speed *speed,
- enum pcie_link_width *width)
-{
- u32 lnkcap1, lnkcap2;
- int err1, err2;
-
-#define PCIE_MLW_CAP_SHIFT 4 /* start of MLW mask in link capabilities */
-
- *speed = PCI_SPEED_UNKNOWN;
- *width = PCIE_LNK_WIDTH_UNKNOWN;
-
- err1 = pcie_capability_read_dword(dev->persist->pdev, PCI_EXP_LNKCAP,
- &lnkcap1);
- err2 = pcie_capability_read_dword(dev->persist->pdev, PCI_EXP_LNKCAP2,
- &lnkcap2);
- if (!err2 && lnkcap2) { /* PCIe r3.0-compliant */
- if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
- *speed = PCIE_SPEED_8_0GT;
- else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
- *speed = PCIE_SPEED_5_0GT;
- else if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
- *speed = PCIE_SPEED_2_5GT;
- }
- if (!err1) {
- *width = (lnkcap1 & PCI_EXP_LNKCAP_MLW) >> PCIE_MLW_CAP_SHIFT;
- if (!lnkcap2) { /* pre-r3.0 */
- if (lnkcap1 & PCI_EXP_LNKCAP_SLS_5_0GB)
- *speed = PCIE_SPEED_5_0GT;
- else if (lnkcap1 & PCI_EXP_LNKCAP_SLS_2_5GB)
- *speed = PCIE_SPEED_2_5GT;
- }
- }
-
- if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN) {
- return err1 ? err1 :
- err2 ? err2 : -EINVAL;
- }
- return 0;
-}
-
-static void mlx4_check_pcie_caps(struct mlx4_dev *dev)
-{
- enum pcie_link_width width, width_cap;
- enum pci_bus_speed speed, speed_cap;
- int err;
-
-#define PCIE_SPEED_STR(speed) \
- (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" : \
- speed == PCIE_SPEED_5_0GT ? "5.0GT/s" : \
- speed == PCIE_SPEED_2_5GT ? "2.5GT/s" : \
- "Unknown")
-
- err = mlx4_get_pcie_dev_link_caps(dev, &speed_cap, &width_cap);
- if (err) {
- mlx4_warn(dev,
- "Unable to determine PCIe device BW capabilities\n");
- return;
- }
-
- err = pcie_get_minimum_link(dev->persist->pdev, &speed, &width);
- if (err || speed == PCI_SPEED_UNKNOWN ||
- width == PCIE_LNK_WIDTH_UNKNOWN) {
- mlx4_warn(dev,
- "Unable to determine PCI device chain minimum BW\n");
- return;
- }
-
- if (width != width_cap || speed != speed_cap)
- mlx4_warn(dev,
- "PCIe BW is different than device's capability\n");
-
- mlx4_info(dev, "PCIe link speed is %s, device supports %s\n",
- PCIE_SPEED_STR(speed), PCIE_SPEED_STR(speed_cap));
- mlx4_info(dev, "PCIe link width is x%d, device supports x%d\n",
- width, width_cap);
- return;
-}
-
/*The function checks if there are live vf, return the num of them*/
static int mlx4_how_many_lives_vf(struct mlx4_dev *dev)
{
@@ -3475,7 +3396,7 @@ static int mlx4_load_one(struct pci_dev *pdev, int pci_dev_data,
* express device capabilities are under-satisfied by the bus.
*/
if (!mlx4_is_slave(dev))
- mlx4_check_pcie_caps(dev);
+ pcie_print_link_status(dev->persist->pdev);
/* In master functions, the communication channel must be initialized
* after obtaining its address from fw */
From: Bjorn Helgaas <[email protected]>
In some cases pcie_get_minimum_link() returned misleading information
because it found the slowest link and the narrowest link without
considering the total bandwidth of the link. For example, if the path
contained a 16 GT/s x1 link and a 2.5 GT/s x16 link,
pcie_get_minimum_link() returned 2.5 GT/s x1, which corresponds to 250 MB/s
of bandwidth, not the actual available bandwidth of about 2000 MB/s for a
16 GT/s x1 link.
Callers should use pcie_print_link_status() instead, or
pcie_bandwidth_available() if they need more detailed information.
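
The example above can be checked with a small userspace model. This is
only a sketch under assumptions: struct demo_link and both functions are
hypothetical, and the old helper compared enum speed values rather than
Mb/s figures (the ordering is the same). Per-lane figures are in Mb/s
after encoding overhead; 15754 for 16 GT/s is my own arithmetic
(16000 * 128 / 130, rounded), not a value from this series.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical description of one link on the path; not a kernel type. */
struct demo_link {
	unsigned int lane_mbs;	/* usable Mb/s per lane after encoding */
	unsigned int width;	/* negotiated lane count */
};

/* Old approach: slowest speed and narrowest width, picked independently. */
static unsigned int demo_min_speed_min_width(const struct demo_link *path,
					     size_t n)
{
	unsigned int min_lane = UINT_MAX, min_width = UINT_MAX;
	size_t i;

	assert(path || n == 0);
	if (n == 0)
		return 0;

	for (i = 0; i < n; i++) {
		if (path[i].lane_mbs < min_lane)
			min_lane = path[i].lane_mbs;
		if (path[i].width < min_width)
			min_width = path[i].width;
	}
	return min_lane * min_width;
}

/* New approach: the smallest speed * width product of any single link. */
static unsigned int demo_min_link_bandwidth(const struct demo_link *path,
					    size_t n)
{
	unsigned int bw = UINT_MAX;
	size_t i;

	assert(path || n == 0);
	if (n == 0)
		return 0;

	for (i = 0; i < n; i++) {
		unsigned int link_bw = path[i].lane_mbs * path[i].width;

		if (link_bw < bw)
			bw = link_bw;
	}
	return bw;
}
```

With a path of { 15754, 1 } and { 2000, 16 }, the first function yields
2000 Mb/s (250 MB/s) and the second 15754 Mb/s (about 1969 MB/s), which
matches the numbers in the changelog.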
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pci.c | 43 -------------------------------------------
include/linux/pci.h | 2 --
2 files changed, 45 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index cec7aed09f6b..b6951c44ae6c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5103,49 +5103,6 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
}
EXPORT_SYMBOL(pcie_set_mps);
-/**
- * pcie_get_minimum_link - determine minimum link settings of a PCI device
- * @dev: PCI device to query
- * @speed: storage for minimum speed
- * @width: storage for minimum width
- *
- * This function will walk up the PCI device chain and determine the minimum
- * link width and speed of the device.
- */
-int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width)
-{
- int ret;
-
- *speed = PCI_SPEED_UNKNOWN;
- *width = PCIE_LNK_WIDTH_UNKNOWN;
-
- while (dev) {
- u16 lnksta;
- enum pci_bus_speed next_speed;
- enum pcie_link_width next_width;
-
- ret = pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
- if (ret)
- return ret;
-
- next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
- next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >>
- PCI_EXP_LNKSTA_NLW_SHIFT;
-
- if (next_speed < *speed)
- *speed = next_speed;
-
- if (next_width < *width)
- *width = next_width;
-
- dev = dev->bus->self;
- }
-
- return 0;
-}
-EXPORT_SYMBOL(pcie_get_minimum_link);
-
/**
* pcie_bandwidth_available - determine minimum link settings of a PCIe
* device and its bandwidth limitation
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 38f7957121ef..5ccee29fe1b1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1081,8 +1081,6 @@ int pcie_get_readrq(struct pci_dev *dev);
int pcie_set_readrq(struct pci_dev *dev, int rq);
int pcie_get_mps(struct pci_dev *dev);
int pcie_set_mps(struct pci_dev *dev, int mps);
-int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
- enum pcie_link_width *width);
u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
enum pci_bus_speed *speed,
enum pcie_link_width *width);
From: Bjorn Helgaas <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Note that pcie_get_minimum_link() can return misleading information because
it finds the slowest link and the narrowest link without considering the
total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
about 2000 MB/s for a 16 GT/s x1 link.
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 23 ++++++----------------
1 file changed, 6 insertions(+), 17 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 74fc9af4aadb..c92601f1b0f3 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -13922,8 +13922,6 @@ static int bnx2x_init_one(struct pci_dev *pdev,
{
struct net_device *dev = NULL;
struct bnx2x *bp;
- enum pcie_link_width pcie_width;
- enum pci_bus_speed pcie_speed;
int rc, max_non_def_sbs;
int rx_count, tx_count, rss_count, doorbell_size;
int max_cos_est;
@@ -14091,21 +14089,12 @@ static int bnx2x_init_one(struct pci_dev *pdev,
dev_addr_add(bp->dev, bp->fip_mac, NETDEV_HW_ADDR_T_SAN);
rtnl_unlock();
}
- if (pcie_get_minimum_link(bp->pdev, &pcie_speed, &pcie_width) ||
- pcie_speed == PCI_SPEED_UNKNOWN ||
- pcie_width == PCIE_LNK_WIDTH_UNKNOWN)
- BNX2X_DEV_INFO("Failed to determine PCI Express Bandwidth\n");
- else
- BNX2X_DEV_INFO(
- "%s (%c%d) PCI-E x%d %s found at mem %lx, IRQ %d, node addr %pM\n",
- board_info[ent->driver_data].name,
- (CHIP_REV(bp) >> 12) + 'A', (CHIP_METAL(bp) >> 4),
- pcie_width,
- pcie_speed == PCIE_SPEED_2_5GT ? "2.5GHz" :
- pcie_speed == PCIE_SPEED_5_0GT ? "5.0GHz" :
- pcie_speed == PCIE_SPEED_8_0GT ? "8.0GHz" :
- "Unknown",
- dev->base_addr, bp->pdev->irq, dev->dev_addr);
+ BNX2X_DEV_INFO(
+ "%s (%c%d) PCI-E found at mem %lx, IRQ %d, node addr %pM\n",
+ board_info[ent->driver_data].name,
+ (CHIP_REV(bp) >> 12) + 'A', (CHIP_METAL(bp) >> 4),
+ dev->base_addr, bp->pdev->irq, dev->dev_addr);
+ pcie_print_link_status(bp->pdev);
bnx2x_register_phc(bp);
From: Bjorn Helgaas <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Note that pcie_get_minimum_link() can return misleading information because
it finds the slowest link and the narrowest link without considering the
total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
about 2000 MB/s for a 16 GT/s x1 link.
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 87 --------------------------
1 file changed, 1 insertion(+), 86 deletions(-)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index a434fecfdfeb..aa05fb534942 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2120,91 +2120,6 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
return 0;
}
-static void fm10k_slot_warn(struct fm10k_intfc *interface)
-{
- enum pcie_link_width width = PCIE_LNK_WIDTH_UNKNOWN;
- enum pci_bus_speed speed = PCI_SPEED_UNKNOWN;
- struct fm10k_hw *hw = &interface->hw;
- int max_gts = 0, expected_gts = 0;
-
- if (pcie_get_minimum_link(interface->pdev, &speed, &width) ||
- speed == PCI_SPEED_UNKNOWN || width == PCIE_LNK_WIDTH_UNKNOWN) {
- dev_warn(&interface->pdev->dev,
- "Unable to determine PCI Express bandwidth.\n");
- return;
- }
-
- switch (speed) {
- case PCIE_SPEED_2_5GT:
- /* 8b/10b encoding reduces max throughput by 20% */
- max_gts = 2 * width;
- break;
- case PCIE_SPEED_5_0GT:
- /* 8b/10b encoding reduces max throughput by 20% */
- max_gts = 4 * width;
- break;
- case PCIE_SPEED_8_0GT:
- /* 128b/130b encoding has less than 2% impact on throughput */
- max_gts = 8 * width;
- break;
- default:
- dev_warn(&interface->pdev->dev,
- "Unable to determine PCI Express bandwidth.\n");
- return;
- }
-
- dev_info(&interface->pdev->dev,
- "PCI Express bandwidth of %dGT/s available\n",
- max_gts);
- dev_info(&interface->pdev->dev,
- "(Speed:%s, Width: x%d, Encoding Loss:%s, Payload:%s)\n",
- (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" :
- speed == PCIE_SPEED_5_0GT ? "5.0GT/s" :
- speed == PCIE_SPEED_2_5GT ? "2.5GT/s" :
- "Unknown"),
- hw->bus.width,
- (speed == PCIE_SPEED_2_5GT ? "20%" :
- speed == PCIE_SPEED_5_0GT ? "20%" :
- speed == PCIE_SPEED_8_0GT ? "<2%" :
- "Unknown"),
- (hw->bus.payload == fm10k_bus_payload_128 ? "128B" :
- hw->bus.payload == fm10k_bus_payload_256 ? "256B" :
- hw->bus.payload == fm10k_bus_payload_512 ? "512B" :
- "Unknown"));
-
- switch (hw->bus_caps.speed) {
- case fm10k_bus_speed_2500:
- /* 8b/10b encoding reduces max throughput by 20% */
- expected_gts = 2 * hw->bus_caps.width;
- break;
- case fm10k_bus_speed_5000:
- /* 8b/10b encoding reduces max throughput by 20% */
- expected_gts = 4 * hw->bus_caps.width;
- break;
- case fm10k_bus_speed_8000:
- /* 128b/130b encoding has less than 2% impact on throughput */
- expected_gts = 8 * hw->bus_caps.width;
- break;
- default:
- dev_warn(&interface->pdev->dev,
- "Unable to determine expected PCI Express bandwidth.\n");
- return;
- }
-
- if (max_gts >= expected_gts)
- return;
-
- dev_warn(&interface->pdev->dev,
- "This device requires %dGT/s of bandwidth for optimal performance.\n",
- expected_gts);
- dev_warn(&interface->pdev->dev,
- "A %sslot with x%d lanes is suggested.\n",
- (hw->bus_caps.speed == fm10k_bus_speed_2500 ? "2.5GT/s " :
- hw->bus_caps.speed == fm10k_bus_speed_5000 ? "5.0GT/s " :
- hw->bus_caps.speed == fm10k_bus_speed_8000 ? "8.0GT/s " : ""),
- hw->bus_caps.width);
-}
-
/**
* fm10k_probe - Device Initialization Routine
* @pdev: PCI device information struct
@@ -2326,7 +2241,7 @@ static int fm10k_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
mod_timer(&interface->service_timer, (HZ * 2) + jiffies);
/* print warning for non-optimal configurations */
- fm10k_slot_warn(interface);
+ pcie_print_link_status(interface->pdev);
/* report MAC address for logging */
dev_info(&pdev->dev, "%pM\n", netdev->dev_addr);
From: Bjorn Helgaas <[email protected]>
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself.
Note that pcie_get_minimum_link() can return misleading information because
it finds the slowest link and the narrowest link without considering the
total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
about 2000 MB/s for a 16 GT/s x1 link.
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 47 +------------------------
1 file changed, 1 insertion(+), 46 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0da5aa2c8aba..38bb9c17d333 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -270,9 +270,6 @@ static void ixgbe_check_minimum_link(struct ixgbe_adapter *adapter,
int expected_gts)
{
struct ixgbe_hw *hw = &adapter->hw;
- int max_gts = 0;
- enum pci_bus_speed speed = PCI_SPEED_UNKNOWN;
- enum pcie_link_width width = PCIE_LNK_WIDTH_UNKNOWN;
struct pci_dev *pdev;
/* Some devices are not connected over PCIe and thus do not negotiate
@@ -288,49 +285,7 @@ static void ixgbe_check_minimum_link(struct ixgbe_adapter *adapter,
else
pdev = adapter->pdev;
- if (pcie_get_minimum_link(pdev, &speed, &width) ||
- speed == PCI_SPEED_UNKNOWN || width == PCIE_LNK_WIDTH_UNKNOWN) {
- e_dev_warn("Unable to determine PCI Express bandwidth.\n");
- return;
- }
-
- switch (speed) {
- case PCIE_SPEED_2_5GT:
- /* 8b/10b encoding reduces max throughput by 20% */
- max_gts = 2 * width;
- break;
- case PCIE_SPEED_5_0GT:
- /* 8b/10b encoding reduces max throughput by 20% */
- max_gts = 4 * width;
- break;
- case PCIE_SPEED_8_0GT:
- /* 128b/130b encoding reduces throughput by less than 2% */
- max_gts = 8 * width;
- break;
- default:
- e_dev_warn("Unable to determine PCI Express bandwidth.\n");
- return;
- }
-
- e_dev_info("PCI Express bandwidth of %dGT/s available\n",
- max_gts);
- e_dev_info("(Speed:%s, Width: x%d, Encoding Loss:%s)\n",
- (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" :
- speed == PCIE_SPEED_5_0GT ? "5.0GT/s" :
- speed == PCIE_SPEED_2_5GT ? "2.5GT/s" :
- "Unknown"),
- width,
- (speed == PCIE_SPEED_2_5GT ? "20%" :
- speed == PCIE_SPEED_5_0GT ? "20%" :
- speed == PCIE_SPEED_8_0GT ? "<2%" :
- "Unknown"));
-
- if (max_gts < expected_gts) {
- e_dev_warn("This is not sufficient for optimal performance of this card.\n");
- e_dev_warn("For optimal performance, at least %dGT/s of bandwidth is required.\n",
- expected_gts);
- e_dev_warn("A slot with more lanes and/or higher speed is suggested.\n");
- }
+ pcie_print_link_status(pdev);
}
static void ixgbe_service_event_schedule(struct ixgbe_adapter *adapter)
From: Tal Gilboa <[email protected]>
Add pcie_bandwidth_available() to compute the bandwidth available to a
device. This may be limited by the device itself or by a slower upstream
link leading to the device.
The available bandwidth at each link along the path is computed as:
link_speed * link_width * (1 - encoding_overhead)
The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
encoding.
Also return the device with the slowest link and the speed and width of
that link.
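
To make the formula concrete, here is a userspace sketch of the per-lane
arithmetic. The demo_* names are hypothetical stand-ins for the kernel
enums, and results are rounded to the nearest Mb/s. The 16 GT/s entry is
an assumption derived from the same 128b/130b formula; this series does
not define a value for it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's enum pci_bus_speed values. */
enum demo_speed {
	DEMO_SPEED_2_5GT,
	DEMO_SPEED_5_0GT,
	DEMO_SPEED_8_0GT,
	DEMO_SPEED_16_0GT,
	DEMO_SPEED_UNKNOWN,
};

/* Usable Mb/s per lane after encoding overhead, rounded to nearest. */
static uint32_t demo_lane_mbs(enum demo_speed speed)
{
	switch (speed) {
	case DEMO_SPEED_2_5GT:
		return 2500 * 8 / 10;		 /* 8b/10b: 2000 */
	case DEMO_SPEED_5_0GT:
		return 5000 * 8 / 10;		 /* 8b/10b: 4000 */
	case DEMO_SPEED_8_0GT:
		return (8000 * 128 + 65) / 130;	 /* 128b/130b: 7877 */
	case DEMO_SPEED_16_0GT:
		return (16000 * 128 + 65) / 130; /* 128b/130b: 15754 */
	default:
		return 0;			 /* unknown speed */
	}
}

/* Bandwidth of one link: lane count times usable per-lane rate. */
static uint32_t demo_link_mbs(enum demo_speed speed, unsigned int width)
{
	assert(width <= 32);	/* PCIe links are at most x32 */
	return width * demo_lane_mbs(speed);
}
```

For instance, demo_link_mbs(DEMO_SPEED_8_0GT, 8) gives 63016 Mb/s, while a
5.0 GT/s x8 link gives 32000 Mb/s.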
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: changelog, leave pcie_get_minimum_link() alone for now, return
bw directly, use pci_upstream_bridge(), check "next_bw <= bw" to find
uppermost limiting device, return speed/width of the limiting device]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pci.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 3 +++
2 files changed, 57 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9ce89e254197..e00d56b12747 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5146,6 +5146,60 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
}
EXPORT_SYMBOL(pcie_get_minimum_link);
+/**
+ * pcie_bandwidth_available - determine minimum link settings of a PCIe
+ * device and its bandwidth limitation
+ * @dev: PCI device to query
+ * @limiting_dev: storage for device causing the bandwidth limitation
+ * @speed: storage for speed of limiting device
+ * @width: storage for width of limiting device
+ *
+ * Walk up the PCI device chain and find the point where the minimum
+ * bandwidth is available. Return the bandwidth available there and (if
+ * limiting_dev, speed, and width pointers are supplied) information about
+ * that point.
+ */
+u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
+ enum pci_bus_speed *speed,
+ enum pcie_link_width *width)
+{
+ u16 lnksta;
+ enum pci_bus_speed next_speed;
+ enum pcie_link_width next_width;
+ u32 bw, next_bw;
+
+ *speed = PCI_SPEED_UNKNOWN;
+ *width = PCIE_LNK_WIDTH_UNKNOWN;
+ bw = 0;
+
+ while (dev) {
+ pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+
+ next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
+ next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >>
+ PCI_EXP_LNKSTA_NLW_SHIFT;
+
+ next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
+
+ /* Check if current device limits the total bandwidth */
+ if (!bw || next_bw <= bw) {
+ bw = next_bw;
+
+ if (limiting_dev)
+ *limiting_dev = dev;
+ if (speed)
+ *speed = next_speed;
+ if (width)
+ *width = next_width;
+ }
+
+ dev = pci_upstream_bridge(dev);
+ }
+
+ return bw;
+}
+EXPORT_SYMBOL(pcie_bandwidth_available);
+
/**
* pcie_get_speed_cap - query for the PCI device's link speed capability
* @dev: PCI device to query
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8043a5937ad0..f2bf2b7a66c7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1083,6 +1083,9 @@ int pcie_get_mps(struct pci_dev *dev);
int pcie_set_mps(struct pci_dev *dev, int mps);
int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
enum pcie_link_width *width);
+u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
+ enum pci_bus_speed *speed,
+ enum pcie_link_width *width);
void pcie_flr(struct pci_dev *dev);
int __pci_reset_function_locked(struct pci_dev *dev);
int pci_reset_function(struct pci_dev *dev);
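
The walk in pcie_bandwidth_available() can be modeled in userspace to
show why the "next_bw <= bw" comparison reports the uppermost limiting
device when two links tie. This is a simplified sketch, not the kernel
code: struct demo_dev, its fields, and DEMO_MAX_DEPTH are hypothetical,
each device carries a precomputed link bandwidth in Mb/s, and the sketch
tolerates a NULL out pointer.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DEPTH 256	/* guard against a malformed, cyclic chain */

/* Hypothetical model of a device and the link above it. */
struct demo_dev {
	const char *name;
	unsigned int link_mbs;			/* bandwidth of its link */
	const struct demo_dev *upstream;	/* NULL at the top */
};

static unsigned int demo_bandwidth_available(const struct demo_dev *dev,
					     const struct demo_dev **limiting)
{
	unsigned int bw = 0, depth = 0;

	if (limiting)
		*limiting = NULL;

	while (dev) {
		depth++;
		assert(depth <= DEMO_MAX_DEPTH);

		/*
		 * "<=" rather than "<": on a tie, the device found later
		 * in the walk (closer to the root) replaces the earlier one.
		 */
		if (!bw || dev->link_mbs <= bw) {
			bw = dev->link_mbs;
			if (limiting)
				*limiting = dev;
		}
		dev = dev->upstream;
	}
	return bw;
}
```

With an endpoint at 63016 Mb/s below a switch port and a root port that
both run at 16000 Mb/s, the function returns 16000 and reports the root
port as the limiting device.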
On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> From: Tal Gilboa <[email protected]>
>
> Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
> a device, based on the max link speed and width, adjusted by the encoding
> overhead.
>
> The maximum bandwidth of the link is computed as:
>
> max_link_speed * max_link_width * (1 - encoding_overhead)
>
> The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
> encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> encoding.
>
> Signed-off-by: Tal Gilboa <[email protected]>
> [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
> signatures, don't export outside drivers/pci]
> Signed-off-by: Bjorn Helgaas <[email protected]>
> Reviewed-by: Tariq Toukan <[email protected]>
> ---
> drivers/pci/pci.c | 21 +++++++++++++++++++++
> drivers/pci/pci.h | 9 +++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 43075be79388..9ce89e254197 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
> return PCIE_LNK_WIDTH_UNKNOWN;
> }
>
> +/**
> + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
> + * @dev: PCI device
> + * @speed: storage for link speed
> + * @width: storage for link width
> + *
> + * Calculate a PCI device's link bandwidth by querying for its link speed
> + * and width, multiplying them, and applying encoding overhead.
> + */
> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> + enum pcie_link_width *width)
> +{
> + *speed = pcie_get_speed_cap(dev);
> + *width = pcie_get_width_cap(dev);
> +
> + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
> + return 0;
> +
> + return *width * PCIE_SPEED2MBS_ENC(*speed);
> +}
> +
> /**
> * pci_select_bars - Make BAR mask from the type of resource
> * @dev: the PCI device for which BAR mask is made
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 66738f1050c0..2a50172b9803 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> "Unknown speed")
>
> +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
> +#define PCIE_SPEED2MBS_ENC(speed) \
Missing gen4.
> + ((speed) == PCIE_SPEED_8_0GT ? 7877 : \
> + (speed) == PCIE_SPEED_5_0GT ? 4000 : \
> + (speed) == PCIE_SPEED_2_5GT ? 2000 : \
> + 0)
> +
> enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
> enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> + enum pcie_link_width *width);
>
> /* Single Root I/O Virtualization */
> struct pci_sriov {
>
On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> From: Tal Gilboa <[email protected]>
>
> Add pcie_bandwidth_available() to compute the bandwidth available to a
> device. This may be limited by the device itself or by a slower upstream
> link leading to the device.
>
> The available bandwidth at each link along the path is computed as:
>
> link_speed * link_width * (1 - encoding_overhead)
>
> The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
> encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> encoding.
>
> Also return the device with the slowest link and the speed and width of
> that link.
>
> Signed-off-by: Tal Gilboa <[email protected]>
> [bhelgaas: changelog, leave pcie_get_minimum_link() alone for now, return
> bw directly, use pci_upstream_bridge(), check "next_bw <= bw" to find
> uppermost limiting device, return speed/width of the limiting device]
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/pci/pci.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/pci.h | 3 +++
> 2 files changed, 57 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ce89e254197..e00d56b12747 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5146,6 +5146,60 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
> }
> EXPORT_SYMBOL(pcie_get_minimum_link);
>
> +/**
> + * pcie_bandwidth_available - determine minimum link settings of a PCIe
> + * device and its bandwidth limitation
> + * @dev: PCI device to query
> + * @limiting_dev: storage for device causing the bandwidth limitation
> + * @speed: storage for speed of limiting device
> + * @width: storage for width of limiting device
> + *
> + * Walk up the PCI device chain and find the point where the minimum
> + * bandwidth is available. Return the bandwidth available there and (if
> + * limiting_dev, speed, and width pointers are supplied) information about
> + * that point.
> + */
> +u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> + enum pci_bus_speed *speed,
> + enum pcie_link_width *width)
> +{
> + u16 lnksta;
> + enum pci_bus_speed next_speed;
> + enum pcie_link_width next_width;
> + u32 bw, next_bw;
> +
> + *speed = PCI_SPEED_UNKNOWN;
> + *width = PCIE_LNK_WIDTH_UNKNOWN;
This is not safe anymore, now that we allow speed/width=NULL.
> + bw = 0;
> +
> + while (dev) {
> + pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> +
> + next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS];
> + next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >>
> + PCI_EXP_LNKSTA_NLW_SHIFT;
> +
> + next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
> +
> + /* Check if current device limits the total bandwidth */
> + if (!bw || next_bw <= bw) {
> + bw = next_bw;
> +
> + if (limiting_dev)
> + *limiting_dev = dev;
> + if (speed)
> + *speed = next_speed;
> + if (width)
> + *width = next_width;
> + }
> +
> + dev = pci_upstream_bridge(dev);
> + }
> +
> + return bw;
> +}
> +EXPORT_SYMBOL(pcie_bandwidth_available);
> +
> /**
> * pcie_get_speed_cap - query for the PCI device's link speed capability
> * @dev: PCI device to query
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 8043a5937ad0..f2bf2b7a66c7 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1083,6 +1083,9 @@ int pcie_get_mps(struct pci_dev *dev);
> int pcie_set_mps(struct pci_dev *dev, int mps);
> int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
> enum pcie_link_width *width);
> +u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> + enum pci_bus_speed *speed,
> + enum pcie_link_width *width);
> void pcie_flr(struct pci_dev *dev);
> int __pci_reset_function_locked(struct pci_dev *dev);
> int pci_reset_function(struct pci_dev *dev);
>
On Sun, Apr 01, 2018 at 11:38:53PM +0300, Tal Gilboa wrote:
> On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> > From: Tal Gilboa <[email protected]>
> >
> > Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
> > a device, based on the max link speed and width, adjusted by the encoding
> > overhead.
> >
> > The maximum bandwidth of the link is computed as:
> >
> > max_link_speed * max_link_width * (1 - encoding_overhead)
> >
> > The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
> > encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> > encoding.
> >
> > Signed-off-by: Tal Gilboa <[email protected]>
> > [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
> > signatures, don't export outside drivers/pci]
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > Reviewed-by: Tariq Toukan <[email protected]>
> > ---
> > drivers/pci/pci.c | 21 +++++++++++++++++++++
> > drivers/pci/pci.h | 9 +++++++++
> > 2 files changed, 30 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 43075be79388..9ce89e254197 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
> > return PCIE_LNK_WIDTH_UNKNOWN;
> > }
> > +/**
> > + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
> > + * @dev: PCI device
> > + * @speed: storage for link speed
> > + * @width: storage for link width
> > + *
> > + * Calculate a PCI device's link bandwidth by querying for its link speed
> > + * and width, multiplying them, and applying encoding overhead.
> > + */
> > +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> > + enum pcie_link_width *width)
> > +{
> > + *speed = pcie_get_speed_cap(dev);
> > + *width = pcie_get_width_cap(dev);
> > +
> > + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
> > + return 0;
> > +
> > + return *width * PCIE_SPEED2MBS_ENC(*speed);
> > +}
> > +
> > /**
> > * pci_select_bars - Make BAR mask from the type of resource
> > * @dev: the PCI device for which BAR mask is made
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 66738f1050c0..2a50172b9803 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> > (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> > "Unknown speed")
> > +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
> > +#define PCIE_SPEED2MBS_ENC(speed) \
>
> Missing gen4.
I made it "gen3+". I think that's accurate, isn't it? The spec
doesn't seem to actually use "gen3" as a specific term, but sec 4.2.2
says rates of 8 GT/s or higher (which I think includes gen3 and gen4)
use 128b/130b encoding.
On Sun, Apr 01, 2018 at 11:41:42PM +0300, Tal Gilboa wrote:
> On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> > From: Tal Gilboa <[email protected]>
> >
> > Add pcie_bandwidth_available() to compute the bandwidth available to a
> > device. This may be limited by the device itself or by a slower upstream
> > link leading to the device.
> >
> > The available bandwidth at each link along the path is computed as:
> >
> > link_speed * link_width * (1 - encoding_overhead)
> >
> > The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
> > encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> > encoding.
> >
> > Also return the device with the slowest link and the speed and width of
> > that link.
> >
> > Signed-off-by: Tal Gilboa <[email protected]>
> > [bhelgaas: changelog, leave pcie_get_minimum_link() alone for now, return
> > bw directly, use pci_upstream_bridge(), check "next_bw <= bw" to find
> > uppermost limiting device, return speed/width of the limiting device]
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > drivers/pci/pci.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/pci.h | 3 +++
> > 2 files changed, 57 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 9ce89e254197..e00d56b12747 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5146,6 +5146,60 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
> > }
> > EXPORT_SYMBOL(pcie_get_minimum_link);
> > +/**
> > + * pcie_bandwidth_available - determine minimum link settings of a PCIe
> > + * device and its bandwidth limitation
> > + * @dev: PCI device to query
> > + * @limiting_dev: storage for device causing the bandwidth limitation
> > + * @speed: storage for speed of limiting device
> > + * @width: storage for width of limiting device
> > + *
> > + * Walk up the PCI device chain and find the point where the minimum
> > + * bandwidth is available. Return the bandwidth available there and (if
> > + * limiting_dev, speed, and width pointers are supplied) information about
> > + * that point.
> > + */
> > +u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> > + enum pci_bus_speed *speed,
> > + enum pcie_link_width *width)
> > +{
> > + u16 lnksta;
> > + enum pci_bus_speed next_speed;
> > + enum pcie_link_width next_width;
> > + u32 bw, next_bw;
> > +
> > + *speed = PCI_SPEED_UNKNOWN;
> > + *width = PCIE_LNK_WIDTH_UNKNOWN;
>
> This is not safe anymore, now that we allow speed/width=NULL.
Good catch, thanks!
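For illustration, the guarded version could look something like the sketch below. This is a user-space mock, not the kernel code: struct mock_dev, its fields, and mock_bandwidth_available() are made-up stand-ins for struct pci_dev and the LNKSTA reads. The only point is that every optional output pointer is checked before it is written, including in the initialization at the top.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev plus its decoded LNKSTA. */
struct mock_dev {
	unsigned int mbs_per_lane;	/* per-lane Mb/s after encoding overhead */
	unsigned int width;		/* negotiated lane count */
	struct mock_dev *upstream;	/* NULL at the top of the chain */
};

unsigned int mock_bandwidth_available(struct mock_dev *dev,
				      struct mock_dev **limiting_dev,
				      unsigned int *mbs_per_lane,
				      unsigned int *width)
{
	unsigned int bw = 0, next_bw;

	/* Every output pointer is optional, so check before each store. */
	if (limiting_dev)
		*limiting_dev = NULL;
	if (mbs_per_lane)
		*mbs_per_lane = 0;
	if (width)
		*width = 0;

	while (dev) {
		next_bw = dev->width * dev->mbs_per_lane;

		/* "<=" keeps the uppermost device when two links tie. */
		if (!bw || next_bw <= bw) {
			bw = next_bw;

			if (limiting_dev)
				*limiting_dev = dev;
			if (mbs_per_lane)
				*mbs_per_lane = dev->mbs_per_lane;
			if (width)
				*width = dev->width;
		}

		dev = dev->upstream;
	}

	return bw;
}

/* A caller that only wants the number passes NULL for the rest. */
void mock_example(void)
{
	struct mock_dev bridge = { 2000, 16, NULL };	/* 2.5 GT/s x16 */
	struct mock_dev nic = { 7876, 8, &bridge };	/* 8 GT/s x8 */

	assert(mock_bandwidth_available(&nic, NULL, NULL, NULL) == 32000);
}
```

In this mock the x16 bridge at 2.5 GT/s (32000 Mb/s) is the limit, not the x8 device at 8 GT/s (63008 Mb/s).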
On 4/2/2018 3:40 AM, Bjorn Helgaas wrote:
> On Sun, Apr 01, 2018 at 11:38:53PM +0300, Tal Gilboa wrote:
>> On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
>>> From: Tal Gilboa <[email protected]>
>>>
>>> Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
>>> a device, based on the max link speed and width, adjusted by the encoding
>>> overhead.
>>>
>>> The maximum bandwidth of the link is computed as:
>>>
>>> max_link_speed * max_link_width * (1 - encoding_overhead)
>>>
>>> The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
>>> encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
>>> encoding.
>>>
>>> Signed-off-by: Tal Gilboa <[email protected]>
>>> [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
>>> signatures, don't export outside drivers/pci]
>>> Signed-off-by: Bjorn Helgaas <[email protected]>
>>> Reviewed-by: Tariq Toukan <[email protected]>
>>> ---
>>> drivers/pci/pci.c | 21 +++++++++++++++++++++
>>> drivers/pci/pci.h | 9 +++++++++
>>> 2 files changed, 30 insertions(+)
>>>
>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>> index 43075be79388..9ce89e254197 100644
>>> --- a/drivers/pci/pci.c
>>> +++ b/drivers/pci/pci.c
>>> @@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
>>> return PCIE_LNK_WIDTH_UNKNOWN;
>>> }
>>> +/**
>>> + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
>>> + * @dev: PCI device
>>> + * @speed: storage for link speed
>>> + * @width: storage for link width
>>> + *
>>> + * Calculate a PCI device's link bandwidth by querying for its link speed
>>> + * and width, multiplying them, and applying encoding overhead.
>>> + */
>>> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
>>> + enum pcie_link_width *width)
>>> +{
>>> + *speed = pcie_get_speed_cap(dev);
>>> + *width = pcie_get_width_cap(dev);
>>> +
>>> + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
>>> + return 0;
>>> +
>>> + return *width * PCIE_SPEED2MBS_ENC(*speed);
>>> +}
>>> +
>>> /**
>>> * pci_select_bars - Make BAR mask from the type of resource
>>> * @dev: the PCI device for which BAR mask is made
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index 66738f1050c0..2a50172b9803 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
>>> (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
>>> "Unknown speed")
>>> +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
>>> +#define PCIE_SPEED2MBS_ENC(speed) \
>>
>> Missing gen4.
>
> I made it "gen3+". I think that's accurate, isn't it? The spec
> doesn't seem to actually use "gen3" as a specific term, but sec 4.2.2
> says rates of 8 GT/s or higher (which I think includes gen3 and gen4)
> use 128b/130b encoding.
>
I meant that PCIE_SPEED_16_0GT will return 0 from this macro since it
wasn't added. Need to return 15754.
On Mon, Apr 02, 2018 at 10:34:58AM +0300, Tal Gilboa wrote:
> On 4/2/2018 3:40 AM, Bjorn Helgaas wrote:
> > On Sun, Apr 01, 2018 at 11:38:53PM +0300, Tal Gilboa wrote:
> > > On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> > > > From: Tal Gilboa <[email protected]>
> > > >
> > > > Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
> > > > a device, based on the max link speed and width, adjusted by the encoding
> > > > overhead.
> > > >
> > > > The maximum bandwidth of the link is computed as:
> > > >
> > > > max_link_speed * max_link_width * (1 - encoding_overhead)
> > > >
> > > > The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
> > > > encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> > > > encoding.
> > > >
> > > > Signed-off-by: Tal Gilboa <[email protected]>
> > > > [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
> > > > signatures, don't export outside drivers/pci]
> > > > Signed-off-by: Bjorn Helgaas <[email protected]>
> > > > Reviewed-by: Tariq Toukan <[email protected]>
> > > > ---
> > > > drivers/pci/pci.c | 21 +++++++++++++++++++++
> > > > drivers/pci/pci.h | 9 +++++++++
> > > > 2 files changed, 30 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 43075be79388..9ce89e254197 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
> > > > return PCIE_LNK_WIDTH_UNKNOWN;
> > > > }
> > > > +/**
> > > > + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
> > > > + * @dev: PCI device
> > > > + * @speed: storage for link speed
> > > > + * @width: storage for link width
> > > > + *
> > > > + * Calculate a PCI device's link bandwidth by querying for its link speed
> > > > + * and width, multiplying them, and applying encoding overhead.
> > > > + */
> > > > +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> > > > + enum pcie_link_width *width)
> > > > +{
> > > > + *speed = pcie_get_speed_cap(dev);
> > > > + *width = pcie_get_width_cap(dev);
> > > > +
> > > > + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
> > > > + return 0;
> > > > +
> > > > + return *width * PCIE_SPEED2MBS_ENC(*speed);
> > > > +}
> > > > +
> > > > /**
> > > > * pci_select_bars - Make BAR mask from the type of resource
> > > > * @dev: the PCI device for which BAR mask is made
> > > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > > index 66738f1050c0..2a50172b9803 100644
> > > > --- a/drivers/pci/pci.h
> > > > +++ b/drivers/pci/pci.h
> > > > @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> > > > (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> > > > "Unknown speed")
> > > > +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
> > > > +#define PCIE_SPEED2MBS_ENC(speed) \
> > >
> > > Missing gen4.
> >
> > I made it "gen3+". I think that's accurate, isn't it? The spec
> > doesn't seem to actually use "gen3" as a specific term, but sec 4.2.2
> > says rates of 8 GT/s or higher (which I think includes gen3 and gen4)
> > use 128b/130b encoding.
> >
>
> I meant that PCIE_SPEED_16_0GT will return 0 from this macro since it wasn't
> added. Need to return 15754.
Oh, duh, of course! Sorry for being dense. What about the following?
I included the calculation as opposed to just the magic numbers to try
to make it clear how they're derived. This has the disadvantage of
truncating the result instead of rounding, but I doubt that's
significant in this context. If it is, we could use the magic numbers
and put the computation in a comment.
Another question: we currently deal in Mb/s, not MB/s. Mb/s has the
advantage of sort of corresponding to the GT/s numbers, but using MB/s
would have the advantage of smaller numbers that match the table here:
https://en.wikipedia.org/wiki/PCI_Express#History_and_revisions,
but I don't know what's most typical in user-facing situations.
What's better?
commit 946435491b35b7782157e9a4d1bd73071fba7709
Author: Tal Gilboa <[email protected]>
Date: Fri Mar 30 08:32:03 2018 -0500
PCI: Add pcie_bandwidth_capable() to compute max supported link bandwidth
Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
a device, based on the max link speed and width, adjusted by the encoding
overhead.
The maximum bandwidth of the link is computed as:
max_link_width * max_link_speed * (1 - encoding_overhead)
2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw bandwidth
available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
reduces it by about 1.5%.
The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: add 16 GT/s, adjust for pcie_get_speed_cap() and
pcie_get_width_cap() signatures, don't export outside drivers/pci]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 43075be79388..ff1e72060952 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5208,6 +5208,28 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
return PCIE_LNK_WIDTH_UNKNOWN;
}
+/**
+ * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
+ * @dev: PCI device
+ * @speed: storage for link speed
+ * @width: storage for link width
+ *
+ * Calculate a PCI device's link bandwidth by querying for its link speed
+ * and width, multiplying them, and applying encoding overhead. The result
+ * is in Mb/s, i.e., megabits/second of raw bandwidth.
+ */
+u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
+ enum pcie_link_width *width)
+{
+ *speed = pcie_get_speed_cap(dev);
+ *width = pcie_get_width_cap(dev);
+
+ if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
+ return 0;
+
+ return *width * PCIE_SPEED2MBS_ENC(*speed);
+}
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 66738f1050c0..37f9299ed623 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -261,8 +261,18 @@ void pci_disable_bridge_window(struct pci_dev *dev);
(speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
"Unknown speed")
+/* PCIe speed to Mb/s reduced by encoding overhead */
+#define PCIE_SPEED2MBS_ENC(speed) \
+ ((speed) == PCIE_SPEED_16_0GT ? (16000*128/130) : \
+ (speed) == PCIE_SPEED_8_0GT ? (8000*128/130) : \
+ (speed) == PCIE_SPEED_5_0GT ? (5000*8/10) : \
+ (speed) == PCIE_SPEED_2_5GT ? (2500*8/10) : \
+ 0)
+
enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
+u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
+ enum pcie_link_width *width);
/* Single Root I/O Virtualization */
struct pci_sriov {
On 4/2/2018 5:05 PM, Bjorn Helgaas wrote:
> On Mon, Apr 02, 2018 at 10:34:58AM +0300, Tal Gilboa wrote:
>> On 4/2/2018 3:40 AM, Bjorn Helgaas wrote:
>>> On Sun, Apr 01, 2018 at 11:38:53PM +0300, Tal Gilboa wrote:
>>>> On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
>>>>> From: Tal Gilboa <[email protected]>
>>>>>
>>>>> Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
>>>>> a device, based on the max link speed and width, adjusted by the encoding
>>>>> overhead.
>>>>>
>>>>> The maximum bandwidth of the link is computed as:
>>>>>
>>>>> max_link_speed * max_link_width * (1 - encoding_overhead)
>>>>>
>>>>> The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
>>>>> encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
>>>>> encoding.
>>>>>
>>>>> Signed-off-by: Tal Gilboa <[email protected]>
>>>>> [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
>>>>> signatures, don't export outside drivers/pci]
>>>>> Signed-off-by: Bjorn Helgaas <[email protected]>
>>>>> Reviewed-by: Tariq Toukan <[email protected]>
>>>>> ---
>>>>> drivers/pci/pci.c | 21 +++++++++++++++++++++
>>>>> drivers/pci/pci.h | 9 +++++++++
>>>>> 2 files changed, 30 insertions(+)
>>>>>
>>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>>> index 43075be79388..9ce89e254197 100644
>>>>> --- a/drivers/pci/pci.c
>>>>> +++ b/drivers/pci/pci.c
>>>>> @@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
>>>>> return PCIE_LNK_WIDTH_UNKNOWN;
>>>>> }
>>>>> +/**
>>>>> + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth capability
>>>>> + * @dev: PCI device
>>>>> + * @speed: storage for link speed
>>>>> + * @width: storage for link width
>>>>> + *
>>>>> + * Calculate a PCI device's link bandwidth by querying for its link speed
>>>>> + * and width, multiplying them, and applying encoding overhead.
>>>>> + */
>>>>> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
>>>>> + enum pcie_link_width *width)
>>>>> +{
>>>>> + *speed = pcie_get_speed_cap(dev);
>>>>> + *width = pcie_get_width_cap(dev);
>>>>> +
>>>>> + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
>>>>> + return 0;
>>>>> +
>>>>> + return *width * PCIE_SPEED2MBS_ENC(*speed);
>>>>> +}
>>>>> +
>>>>> /**
>>>>> * pci_select_bars - Make BAR mask from the type of resource
>>>>> * @dev: the PCI device for which BAR mask is made
>>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>>> index 66738f1050c0..2a50172b9803 100644
>>>>> --- a/drivers/pci/pci.h
>>>>> +++ b/drivers/pci/pci.h
>>>>> @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
>>>>> (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
>>>>> "Unknown speed")
>>>>> +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
>>>>> +#define PCIE_SPEED2MBS_ENC(speed) \
>>>>
>>>> Missing gen4.
>>>
>>> I made it "gen3+". I think that's accurate, isn't it? The spec
>>> doesn't seem to actually use "gen3" as a specific term, but sec 4.2.2
>>> says rates of 8 GT/s or higher (which I think includes gen3 and gen4)
>>> use 128b/130b encoding.
>>>
>>
>> I meant that PCIE_SPEED_16_0GT will return 0 from this macro since it wasn't
>> added. Need to return 15754.
>
> Oh, duh, of course! Sorry for being dense. What about the following?
> I included the calculation as opposed to just the magic numbers to try
> to make it clear how they're derived. This has the disadvantage of
> truncating the result instead of rounding, but I doubt that's
> significant in this context. If it is, we could use the magic numbers
> and put the computation in a comment.
We can always use DIV_ROUND_UP((speed * enc_nominator),
enc_denominator). I think this is confusing, and since this introduces a
bandwidth limit I would prefer to give a wider limit than a wrong one,
even if it is by less than 1 Mb/s. My vote is for leaving it as you wrote below.
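To make the difference concrete, here is a small user-space sketch of the two options. It is not proposed kernel code: enc_truncate() and enc_round_up() are names made up for this mail, and DIV_ROUND_UP() is redefined locally with the same shape as the kernel macro so the snippet builds on its own.

```c
#include <assert.h>

/* Same shape as the kernel's DIV_ROUND_UP(), redefined for user space. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Multiply first, then divide. In C integer arithmetic the ratio must
 * not be evaluated on its own: 128/130 by itself is 0.
 */
unsigned int enc_truncate(unsigned int raw_mbs, unsigned int num,
			  unsigned int den)
{
	assert(den != 0);
	return raw_mbs * num / den;	/* rounds toward zero */
}

unsigned int enc_round_up(unsigned int raw_mbs, unsigned int num,
			  unsigned int den)
{
	assert(den != 0);
	return DIV_ROUND_UP(raw_mbs * num, den);
}

/*
 * 16 GT/s, 128b/130b: enc_truncate(16000, 128, 130) is 15753,
 *                     enc_round_up(16000, 128, 130) is 15754.
 *  8 GT/s, 128b/130b: 7876 truncated, 7877 rounded up.
 *  5 GT/s and 2.5 GT/s with 8b/10b divide evenly: 4000 and 2000.
 */
```

So the two approaches differ by at most 1 Mb/s per lane, and only for the 128b/130b speeds.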
>
> Another question: we currently deal in Mb/s, not MB/s. Mb/s has the
> advantage of sort of corresponding to the GT/s numbers, but using MB/s
> would have the advantage of smaller numbers that match the table here:
> https://en.wikipedia.org/wiki/PCI_Express#History_and_revisions,
> but I don't know what's most typical in user-facing situations.
> What's better?
I don't know what's better, but for network devices we measure bandwidth
in Gb/s, so presenting bandwidth in MB/s would mean additional
calculations. The truth is I would have preferred to use Gb/s instead of
Mb/s, but again, I don't want to lose up to 1 Gb/s.
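A rough sketch of what I mean, in user-space C. The helper names are invented for this mail and are not part of the patch. One possible middle ground, shown in format_gbs(), is to keep the value in Mb/s internally and only print it as Gb/s with three decimals, so nothing is lost.

```c
#include <assert.h>
#include <stdio.h>

/* Whole Gb/s only: drops up to 999 Mb/s of the value. */
unsigned int mbs_to_gbs_whole(unsigned int mbs)
{
	return mbs / 1000;
}

/* Megabits/second to megabytes/second, truncating. */
unsigned int mbs_to_mbytes(unsigned int mbs)
{
	return mbs / 8;
}

/* Print a Mb/s value as Gb/s with three decimals, keeping full precision. */
int format_gbs(char *buf, size_t len, unsigned int mbs)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%u.%03u Gb/s", mbs / 1000, mbs % 1000);
}

/*
 * For an 8 GT/s x1 link at 7876 Mb/s:
 *   mbs_to_gbs_whole(7876) is 7, so 876 Mb/s disappear;
 *   mbs_to_mbytes(7876) is 984;
 *   format_gbs() produces "7.876 Gb/s".
 */
```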
>
>
> commit 946435491b35b7782157e9a4d1bd73071fba7709
> Author: Tal Gilboa <[email protected]>
> Date: Fri Mar 30 08:32:03 2018 -0500
>
> PCI: Add pcie_bandwidth_capable() to compute max supported link bandwidth
>
> Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
> a device, based on the max link speed and width, adjusted by the encoding
> overhead.
>
> The maximum bandwidth of the link is computed as:
>
> max_link_width * max_link_speed * (1 - encoding_overhead)
>
> 2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw bandwidth
> available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
> reduces it by about 1.5%.
>
> The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
>
> Signed-off-by: Tal Gilboa <[email protected]>
> [bhelgaas: add 16 GT/s, adjust for pcie_get_speed_cap() and
> pcie_get_width_cap() signatures, don't export outside drivers/pci]
> Signed-off-by: Bjorn Helgaas <[email protected]>
> Reviewed-by: Tariq Toukan <[email protected]>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 43075be79388..ff1e72060952 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5208,6 +5208,28 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
> return PCIE_LNK_WIDTH_UNKNOWN;
> }
>
> +/**
> + * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
> + * @dev: PCI device
> + * @speed: storage for link speed
> + * @width: storage for link width
> + *
> + * Calculate a PCI device's link bandwidth by querying for its link speed
> + * and width, multiplying them, and applying encoding overhead. The result
> + * is in Mb/s, i.e., megabits/second of raw bandwidth.
> + */
> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> + enum pcie_link_width *width)
> +{
> + *speed = pcie_get_speed_cap(dev);
> + *width = pcie_get_width_cap(dev);
> +
> + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
> + return 0;
> +
> + return *width * PCIE_SPEED2MBS_ENC(*speed);
> +}
> +
> /**
> * pci_select_bars - Make BAR mask from the type of resource
> * @dev: the PCI device for which BAR mask is made
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 66738f1050c0..37f9299ed623 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -261,8 +261,18 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> "Unknown speed")
>
> +/* PCIe speed to Mb/s reduced by encoding overhead */
> +#define PCIE_SPEED2MBS_ENC(speed) \
> + ((speed) == PCIE_SPEED_16_0GT ? (16000*128/130) : \
> + (speed) == PCIE_SPEED_8_0GT ? (8000*128/130) : \
> + (speed) == PCIE_SPEED_5_0GT ? (5000*8/10) : \
> + (speed) == PCIE_SPEED_2_5GT ? (2500*8/10) : \
> + 0)
> +
> enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
> enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> + enum pcie_link_width *width);
>
> /* Single Root I/O Virtualization */
> struct pci_sriov {
>
> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Friday, March 30, 2018 2:06 PM
> To: Tal Gilboa <[email protected]>
> Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> Goudar <[email protected]>; Kirsher, Jeffrey T
> <[email protected]>; [email protected]; intel-wired-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: [PATCH v5 12/14] fm10k: Report PCIe link properties with
> pcie_print_link_status()
>
> From: Bjorn Helgaas <[email protected]>
>
> Use pcie_print_link_status() to report PCIe link speed and possible
> limitations instead of implementing this in the driver itself.
>
> Note that pcie_get_minimum_link() can return misleading information because
> it finds the slowest link and the narrowest link without considering the
> total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
> 2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
> corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
> about 2000 MB/s for a 16 GT/s x1 link.
This comment is about what's being fixed, so it would have been easier to parse if it indicated more clearly that we're removing (and not adding) this behavior.
Aside from the commit message (which I don't feel strongly enough about to ask for a re-send of the patch), this looks good to me.
Acked-by: Jacob Keller <[email protected]>
Thanks Bjorn and Tal for fixing this!
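For anyone reading along, the difference between the two approaches can be sketched in a few lines of user-space C. This is only an illustration: struct mock_link and both function names are invented here, and the per-lane figures are the post-encoding Mb/s values discussed elsewhere in this thread (2000 for 2.5 GT/s, 15753 for 16 GT/s).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one link on the path to the device. */
struct mock_link {
	unsigned int mbs_per_lane;	/* Mb/s per lane after encoding */
	unsigned int width;		/* lane count */
};

/* Old style: slowest speed and narrowest width, picked independently. */
unsigned int min_link_mbs(const struct mock_link *path, size_t n)
{
	unsigned int speed = 0, width = 0;
	size_t i;

	assert(path != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (i == 0 || path[i].mbs_per_lane < speed)
			speed = path[i].mbs_per_lane;
		if (i == 0 || path[i].width < width)
			width = path[i].width;
	}
	return speed * width;
}

/* New style: bandwidth of each link as a whole, then the minimum. */
unsigned int limiting_link_mbs(const struct mock_link *path, size_t n)
{
	unsigned int bw = 0, next;
	size_t i;

	assert(path != NULL || n == 0);
	for (i = 0; i < n; i++) {
		next = path[i].mbs_per_lane * path[i].width;
		if (i == 0 || next < bw)
			bw = next;
	}
	return bw;
}

/*
 * Path with a 16 GT/s x1 link and a 2.5 GT/s x16 link:
 *   min_link_mbs()      gives 2000 * 1 = 2000 Mb/s (250 MB/s);
 *   limiting_link_mbs() gives min(15753, 32000) = 15753 Mb/s,
 *   roughly 1969 MB/s, which is the real bottleneck.
 */
```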
>
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 87 --------------------------
> 1 file changed, 1 insertion(+), 86 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> index a434fecfdfeb..aa05fb534942 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> @@ -2120,91 +2120,6 @@ static int fm10k_sw_init(struct fm10k_intfc *interface,
> return 0;
> }
>
> -static void fm10k_slot_warn(struct fm10k_intfc *interface)
> -{
> - enum pcie_link_width width = PCIE_LNK_WIDTH_UNKNOWN;
> - enum pci_bus_speed speed = PCI_SPEED_UNKNOWN;
> - struct fm10k_hw *hw = &interface->hw;
> - int max_gts = 0, expected_gts = 0;
> -
> - if (pcie_get_minimum_link(interface->pdev, &speed, &width) ||
> - speed == PCI_SPEED_UNKNOWN || width ==
> PCIE_LNK_WIDTH_UNKNOWN) {
> - dev_warn(&interface->pdev->dev,
> - "Unable to determine PCI Express bandwidth.\n");
> - return;
> - }
> -
> - switch (speed) {
> - case PCIE_SPEED_2_5GT:
> - /* 8b/10b encoding reduces max throughput by 20% */
> - max_gts = 2 * width;
> - break;
> - case PCIE_SPEED_5_0GT:
> - /* 8b/10b encoding reduces max throughput by 20% */
> - max_gts = 4 * width;
> - break;
> - case PCIE_SPEED_8_0GT:
> - /* 128b/130b encoding has less than 2% impact on throughput */
> - max_gts = 8 * width;
> - break;
> - default:
> - dev_warn(&interface->pdev->dev,
> - "Unable to determine PCI Express bandwidth.\n");
> - return;
> - }
> -
> - dev_info(&interface->pdev->dev,
> - "PCI Express bandwidth of %dGT/s available\n",
> - max_gts);
> - dev_info(&interface->pdev->dev,
> - "(Speed:%s, Width: x%d, Encoding Loss:%s, Payload:%s)\n",
> - (speed == PCIE_SPEED_8_0GT ? "8.0GT/s" :
> - speed == PCIE_SPEED_5_0GT ? "5.0GT/s" :
> - speed == PCIE_SPEED_2_5GT ? "2.5GT/s" :
> - "Unknown"),
> - hw->bus.width,
> - (speed == PCIE_SPEED_2_5GT ? "20%" :
> - speed == PCIE_SPEED_5_0GT ? "20%" :
> - speed == PCIE_SPEED_8_0GT ? "<2%" :
> - "Unknown"),
> - (hw->bus.payload == fm10k_bus_payload_128 ? "128B" :
> - hw->bus.payload == fm10k_bus_payload_256 ? "256B" :
> - hw->bus.payload == fm10k_bus_payload_512 ? "512B" :
> - "Unknown"));
> -
> - switch (hw->bus_caps.speed) {
> - case fm10k_bus_speed_2500:
> - /* 8b/10b encoding reduces max throughput by 20% */
> - expected_gts = 2 * hw->bus_caps.width;
> - break;
> - case fm10k_bus_speed_5000:
> - /* 8b/10b encoding reduces max throughput by 20% */
> - expected_gts = 4 * hw->bus_caps.width;
> - break;
> - case fm10k_bus_speed_8000:
> - /* 128b/130b encoding has less than 2% impact on throughput */
> - expected_gts = 8 * hw->bus_caps.width;
> - break;
> - default:
> - dev_warn(&interface->pdev->dev,
> - "Unable to determine expected PCI Express
> bandwidth.\n");
> - return;
> - }
> -
> - if (max_gts >= expected_gts)
> - return;
> -
> - dev_warn(&interface->pdev->dev,
> - "This device requires %dGT/s of bandwidth for optimal performance.\n",
> - expected_gts);
> - dev_warn(&interface->pdev->dev,
> - "A %sslot with x%d lanes is suggested.\n",
> - (hw->bus_caps.speed == fm10k_bus_speed_2500 ? "2.5GT/s " :
> - hw->bus_caps.speed == fm10k_bus_speed_5000 ? "5.0GT/s " :
> - hw->bus_caps.speed == fm10k_bus_speed_8000 ? "8.0GT/s " : ""),
> - hw->bus_caps.width);
> -}
> -
> /**
> * fm10k_probe - Device Initialization Routine
> * @pdev: PCI device information struct
> @@ -2326,7 +2241,7 @@ static int fm10k_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> mod_timer(&interface->service_timer, (HZ * 2) + jiffies);
>
> /* print warning for non-optimal configurations */
> - fm10k_slot_warn(interface);
> + pcie_print_link_status(interface->pdev);
>
> /* report MAC address for logging */
> dev_info(&pdev->dev, "%pM\n", netdev->dev_addr);
> -----Original Message-----
> From: Tal Gilboa [mailto:[email protected]]
> Sent: Monday, April 02, 2018 7:34 AM
> To: Bjorn Helgaas <[email protected]>
> Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> Goudar <[email protected]>; Kirsher, Jeffrey T
> <[email protected]>; [email protected]; intel-wired-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH v5 03/14] PCI: Add pcie_bandwidth_capable() to compute
> max supported link bandwidth
>
> On 4/2/2018 5:05 PM, Bjorn Helgaas wrote:
> > On Mon, Apr 02, 2018 at 10:34:58AM +0300, Tal Gilboa wrote:
> >> On 4/2/2018 3:40 AM, Bjorn Helgaas wrote:
> >>> On Sun, Apr 01, 2018 at 11:38:53PM +0300, Tal Gilboa wrote:
> >>>> On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> >>>>> From: Tal Gilboa <[email protected]>
> >>>>>
> >>>>> Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
> >>>>> a device, based on the max link speed and width, adjusted by the encoding
> >>>>> overhead.
> >>>>>
> >>>>> The maximum bandwidth of the link is computed as:
> >>>>>
> >>>>> max_link_speed * max_link_width * (1 - encoding_overhead)
> >>>>>
> >>>>> The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using 8b/10b
> >>>>> encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> >>>>> encoding.
> >>>>>
> >>>>> Signed-off-by: Tal Gilboa <[email protected]>
> >>>>> [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
> >>>>> signatures, don't export outside drivers/pci]
> >>>>> Signed-off-by: Bjorn Helgaas <[email protected]>
> >>>>> Reviewed-by: Tariq Toukan <[email protected]>
> >>>>> ---
> >>>>> drivers/pci/pci.c | 21 +++++++++++++++++++++
> >>>>> drivers/pci/pci.h | 9 +++++++++
> >>>>> 2 files changed, 30 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> >>>>> index 43075be79388..9ce89e254197 100644
> >>>>> --- a/drivers/pci/pci.c
> >>>>> +++ b/drivers/pci/pci.c
> >>>>> @@ -5208,6 +5208,27 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
> >>>>> return PCIE_LNK_WIDTH_UNKNOWN;
> >>>>> }
> >>>>> +/**
> >>>>> + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth
> capability
> >>>>> + * @dev: PCI device
> >>>>> + * @speed: storage for link speed
> >>>>> + * @width: storage for link width
> >>>>> + *
> >>>>> + * Calculate a PCI device's link bandwidth by querying for its link speed
> >>>>> + * and width, multiplying them, and applying encoding overhead.
> >>>>> + */
> >>>>> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> >>>>> + enum pcie_link_width *width)
> >>>>> +{
> >>>>> + *speed = pcie_get_speed_cap(dev);
> >>>>> + *width = pcie_get_width_cap(dev);
> >>>>> +
> >>>>> + if (*speed == PCI_SPEED_UNKNOWN || *width == PCIE_LNK_WIDTH_UNKNOWN)
> >>>>> + return 0;
> >>>>> +
> >>>>> + return *width * PCIE_SPEED2MBS_ENC(*speed);
> >>>>> +}
> >>>>> +
> >>>>> /**
> >>>>> * pci_select_bars - Make BAR mask from the type of resource
> >>>>> * @dev: the PCI device for which BAR mask is made
> >>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> >>>>> index 66738f1050c0..2a50172b9803 100644
> >>>>> --- a/drivers/pci/pci.h
> >>>>> +++ b/drivers/pci/pci.h
> >>>>> @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> >>>>> (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> >>>>> "Unknown speed")
> >>>>> +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for gen3 */
> >>>>> +#define PCIE_SPEED2MBS_ENC(speed) \
> >>>>
> >>>> Missing gen4.
> >>>
> >>> I made it "gen3+". I think that's accurate, isn't it? The spec
> >>> doesn't seem to actually use "gen3" as a specific term, but sec 4.2.2
> >>> says rates of 8 GT/s or higher (which I think includes gen3 and gen4)
> >>> use 128b/130b encoding.
> >>>
> >>
> >> I meant that PCIE_SPEED_16_0GT will return 0 from this macro since it wasn't
> >> added. Need to return 15754.
> >
> > Oh, duh, of course! Sorry for being dense. What about the following?
> > I included the calculation as opposed to just the magic numbers to try
> > to make it clear how they're derived. This has the disadvantage of
> > truncating the result instead of rounding, but I doubt that's
> > significant in this context. If it is, we could use the magic numbers
> > and put the computation in a comment.
>
> We can always use DIV_ROUND_UP((speed * enc_nominator),
> enc_denominator). I think this is confusing, and since this introduces a
> bandwidth limit I would prefer to give a wider limit than a wrong one,
> even if it is by less than 1 Mb/s. My vote is for leaving it as you wrote below.
>
> >
> > Another question: we currently deal in Mb/s, not MB/s. Mb/s has the
> > advantage of sort of corresponding to the GT/s numbers, but using MB/s
> > would have the advantage of smaller numbers that match the table here:
> > https://en.wikipedia.org/wiki/PCI_Express#History_and_revisions,
> > but I don't know what's most typical in user-facing situations.
> > What's better?
>
> I don't know what's better, but for network devices we measure bandwidth
> in Gb/s, so presenting bandwidth in MB/s would mean additional
> calculations. The truth is I would have preferred to use Gb/s instead of
> Mb/s, but again, I don't want to lose up to 1 Gb/s.
>
I prefer this version with the calculation in line since it makes the derivation clear. Keeping them in Mb/s makes it easier to convert to Gb/s, which is what most people would expect.
Thanks,
Jake
> >
> >
> > commit 946435491b35b7782157e9a4d1bd73071fba7709
> > Author: Tal Gilboa <[email protected]>
> > Date: Fri Mar 30 08:32:03 2018 -0500
> >
> > PCI: Add pcie_bandwidth_capable() to compute max supported link
> bandwidth
> >
> > Add pcie_bandwidth_capable() to compute the max link bandwidth
> supported by
> > a device, based on the max link speed and width, adjusted by the encoding
> > overhead.
> >
> > The maximum bandwidth of the link is computed as:
> >
> > max_link_width * max_link_speed * (1 - encoding_overhead)
> >
> > 2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw
> bandwidth
> > available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
> > reduces it by about 1.5%.
> >
> > The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
> >
> > Signed-off-by: Tal Gilboa <[email protected]>
> > [bhelgaas: add 16 GT/s, adjust for pcie_get_speed_cap() and
> > pcie_get_width_cap() signatures, don't export outside drivers/pci]
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > Reviewed-by: Tariq Toukan <[email protected]>
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 43075be79388..ff1e72060952 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5208,6 +5208,28 @@ enum pcie_link_width pcie_get_width_cap(struct
> pci_dev *dev)
> > return PCIE_LNK_WIDTH_UNKNOWN;
> > }
> >
> > +/**
> > + * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
> > + * @dev: PCI device
> > + * @speed: storage for link speed
> > + * @width: storage for link width
> > + *
> > + * Calculate a PCI device's link bandwidth by querying for its link speed
> > + * and width, multiplying them, and applying encoding overhead. The result
> > + * is in Mb/s, i.e., megabits/second of raw bandwidth.
> > + */
> > +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed
> *speed,
> > + enum pcie_link_width *width)
> > +{
> > + *speed = pcie_get_speed_cap(dev);
> > + *width = pcie_get_width_cap(dev);
> > +
> > + if (*speed == PCI_SPEED_UNKNOWN || *width ==
> PCIE_LNK_WIDTH_UNKNOWN)
> > + return 0;
> > +
> > + return *width * PCIE_SPEED2MBS_ENC(*speed);
> > +}
> > +
> > /**
> > * pci_select_bars - Make BAR mask from the type of resource
> > * @dev: the PCI device for which BAR mask is made
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 66738f1050c0..37f9299ed623 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -261,8 +261,18 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> > (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> > "Unknown speed")
> >
> > +/* PCIe speed to Mb/s reduced by encoding overhead */
> > +#define PCIE_SPEED2MBS_ENC(speed) \
> > + ((speed) == PCIE_SPEED_16_0GT ? (16000*128/130) : \
> > + (speed) == PCIE_SPEED_8_0GT ? (8000*128/130) : \
> > + (speed) == PCIE_SPEED_5_0GT ? (5000*8/10) : \
> > + (speed) == PCIE_SPEED_2_5GT ? (2500*8/10) : \
> > + 0)
> > +
> > enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
> > enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
> > +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed
> *speed,
> > + enum pcie_link_width *width);
> >
> > /* Single Root I/O Virtualization */
> > struct pci_sriov {
> >
> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Friday, March 30, 2018 2:05 PM
> To: Tal Gilboa <[email protected]>
> Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> Goudar <[email protected]>; Kirsher, Jeffrey T
> <[email protected]>; [email protected]; intel-wired-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed and
> whether it's limited
>
> From: Tal Gilboa <[email protected]>
>
> Add pcie_print_link_status(). This logs the current settings of the link
> (speed, width, and total available bandwidth).
>
> If the device is capable of more bandwidth but is limited by a slower
> upstream link, we include information about the link that limits the
> device's performance.
>
> The user may be able to move the device to a different slot for better
> performance.
>
> This provides a unified method for all PCI devices to report status and
> issues, instead of each device reporting in a different way, using
> different code.
>
> Signed-off-by: Tal Gilboa <[email protected]>
> [bhelgaas: changelog, reword log messages, print device capabilities when
> not limited]
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/pci/pci.c | 29 +++++++++++++++++++++++++++++
> include/linux/pci.h | 1 +
> 2 files changed, 30 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e00d56b12747..cec7aed09f6b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5283,6 +5283,35 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
> return *width * PCIE_SPEED2MBS_ENC(*speed);
> }
>
> +/**
> + * pcie_print_link_status - Report the PCI device's link speed and width
> + * @dev: PCI device to query
> + *
> + * Report the available bandwidth at the device. If this is less than the
> + * device is capable of, report the device's maximum possible bandwidth and
> + * the upstream link that limits its performance to less than that.
> + */
> +void pcie_print_link_status(struct pci_dev *dev)
> +{
> + enum pcie_link_width width, width_cap;
> + enum pci_bus_speed speed, speed_cap;
> + struct pci_dev *limiting_dev = NULL;
> + u32 bw_avail, bw_cap;
> +
> + bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
> + bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed, &width);
> +
> + if (bw_avail >= bw_cap)
> + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
> + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> + else
> + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d link at %s (capable of %d Mb/s with %s x%d link)\n",
> + bw_avail, PCIE_SPEED2STR(speed), width,
> + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
> + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> +}
Personally, I would make this last one a pci_warn() to indicate it at a higher log level, but I'm ok with the wording, and if consensus is that this should be at info, I'm ok with that.
Thanks,
Jake
> +EXPORT_SYMBOL(pcie_print_link_status);
> +
> /**
> * pci_select_bars - Make BAR mask from the type of resource
> * @dev: the PCI device for which BAR mask is made
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index f2bf2b7a66c7..38f7957121ef 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1086,6 +1086,7 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum
> pci_bus_speed *speed,
> u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev
> **limiting_dev,
> enum pci_bus_speed *speed,
> enum pcie_link_width *width);
> +void pcie_print_link_status(struct pci_dev *dev);
> void pcie_flr(struct pci_dev *dev);
> int __pci_reset_function_locked(struct pci_dev *dev);
> int pci_reset_function(struct pci_dev *dev);
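To illustrate the "limited by a slower upstream link" case described in the changelog, here is a rough userspace model of the comparison. It assumes the path from the device to the root is available as an array of per-link bandwidths; struct link_model and bandwidth_available() are hypothetical names, and the real pcie_bandwidth_available() walks the pci_dev hierarchy instead. Per the cover letter, the uppermost limiting link is the one reported, which the "<=" comparison below is meant to model:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one link on the path from a device to the root. */
struct link_model {
	const char *name;	/* illustrative label only */
	uint32_t mbps;		/* current link bandwidth in Mb/s */
};

/*
 * path[0] is the device's own link; higher indices are further upstream.
 * Returns the minimum bandwidth along the path and stores the index of
 * the limiting link in *limiting.  Because the comparison is "<=", a tie
 * is resolved in favor of the more upstream link.
 */
static uint32_t bandwidth_available(const struct link_model *path, size_t n,
				    size_t *limiting)
{
	uint32_t bw = 0;
	size_t i;

	*limiting = 0;
	for (i = 0; i < n; i++) {
		if (i == 0 || path[i].mbps <= bw) {
			bw = path[i].mbps;
			*limiting = i;
		}
	}
	return bw;
}
```

A caller would compare the returned value with the device's own capability and only mention the limiting link when the available bandwidth is smaller.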
On Mon, Apr 02, 2018 at 04:00:16PM +0000, Keller, Jacob E wrote:
> > -----Original Message-----
> > From: Tal Gilboa [mailto:[email protected]]
> > Sent: Monday, April 02, 2018 7:34 AM
> > To: Bjorn Helgaas <[email protected]>
> > Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> > <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> > Goudar <[email protected]>; Kirsher, Jeffrey T
> > <[email protected]>; [email protected]; intel-wired-
> > [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: Re: [PATCH v5 03/14] PCI: Add pcie_bandwidth_capable() to compute
> > max supported link bandwidth
> >
> > On 4/2/2018 5:05 PM, Bjorn Helgaas wrote:
> > > On Mon, Apr 02, 2018 at 10:34:58AM +0300, Tal Gilboa wrote:
> > >> On 4/2/2018 3:40 AM, Bjorn Helgaas wrote:
> > >>> On Sun, Apr 01, 2018 at 11:38:53PM +0300, Tal Gilboa wrote:
> > >>>> On 3/31/2018 12:05 AM, Bjorn Helgaas wrote:
> > >>>>> From: Tal Gilboa <[email protected]>
> > >>>>>
> > >>>>> Add pcie_bandwidth_capable() to compute the max link bandwidth
> > supported by
> > >>>>> a device, based on the max link speed and width, adjusted by the
> > encoding
> > >>>>> overhead.
> > >>>>>
> > >>>>> The maximum bandwidth of the link is computed as:
> > >>>>>
> > >>>>> max_link_speed * max_link_width * (1 - encoding_overhead)
> > >>>>>
> > >>>>> The encoding overhead is about 20% for 2.5 and 5.0 GT/s links using
> > 8b/10b
> > >>>>> encoding, and about 1.5% for 8 GT/s or higher speed links using 128b/130b
> > >>>>> encoding.
> > >>>>>
> > >>>>> Signed-off-by: Tal Gilboa <[email protected]>
> > >>>>> [bhelgaas: adjust for pcie_get_speed_cap() and pcie_get_width_cap()
> > >>>>> signatures, don't export outside drivers/pci]
> > >>>>> Signed-off-by: Bjorn Helgaas <[email protected]>
> > >>>>> Reviewed-by: Tariq Toukan <[email protected]>
> > >>>>> ---
> > >>>>> drivers/pci/pci.c | 21 +++++++++++++++++++++
> > >>>>> drivers/pci/pci.h | 9 +++++++++
> > >>>>> 2 files changed, 30 insertions(+)
> > >>>>>
> > >>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > >>>>> index 43075be79388..9ce89e254197 100644
> > >>>>> --- a/drivers/pci/pci.c
> > >>>>> +++ b/drivers/pci/pci.c
> > >>>>> @@ -5208,6 +5208,27 @@ enum pcie_link_width
> > pcie_get_width_cap(struct pci_dev *dev)
> > >>>>> return PCIE_LNK_WIDTH_UNKNOWN;
> > >>>>> }
> > >>>>> +/**
> > >>>>> + * pcie_bandwidth_capable - calculates a PCI device's link bandwidth
> > capability
> > >>>>> + * @dev: PCI device
> > >>>>> + * @speed: storage for link speed
> > >>>>> + * @width: storage for link width
> > >>>>> + *
> > >>>>> + * Calculate a PCI device's link bandwidth by querying for its link speed
> > >>>>> + * and width, multiplying them, and applying encoding overhead.
> > >>>>> + */
> > >>>>> +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed
> > *speed,
> > >>>>> + enum pcie_link_width *width)
> > >>>>> +{
> > >>>>> + *speed = pcie_get_speed_cap(dev);
> > >>>>> + *width = pcie_get_width_cap(dev);
> > >>>>> +
> > >>>>> + if (*speed == PCI_SPEED_UNKNOWN || *width ==
> > PCIE_LNK_WIDTH_UNKNOWN)
> > >>>>> + return 0;
> > >>>>> +
> > >>>>> + return *width * PCIE_SPEED2MBS_ENC(*speed);
> > >>>>> +}
> > >>>>> +
> > >>>>> /**
> > >>>>> * pci_select_bars - Make BAR mask from the type of resource
> > >>>>> * @dev: the PCI device for which BAR mask is made
> > >>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > >>>>> index 66738f1050c0..2a50172b9803 100644
> > >>>>> --- a/drivers/pci/pci.h
> > >>>>> +++ b/drivers/pci/pci.h
> > >>>>> @@ -261,8 +261,17 @@ void pci_disable_bridge_window(struct pci_dev
> > *dev);
> > >>>>> (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> > >>>>> "Unknown speed")
> > >>>>> +/* PCIe speed to Mb/s with encoding overhead: 20% for gen2, ~1.5% for
> > gen3 */
> > >>>>> +#define PCIE_SPEED2MBS_ENC(speed) \
> > >>>>
> > >>>> Missing gen4.
> > >>>
> > >>> I made it "gen3+". I think that's accurate, isn't it? The spec
> > >>> doesn't seem to actually use "gen3" as a specific term, but sec 4.2.2
> > >>> says rates of 8 GT/s or higher (which I think includes gen3 and gen4)
> > >>> use 128b/130b encoding.
> > >>>
> > >>
> > >> I meant that PCIE_SPEED_16_0GT will return 0 from this macro since it wasn't
> > >> added. Need to return 15754.
> > >
> > > Oh, duh, of course! Sorry for being dense. What about the following?
> > > I included the calculation as opposed to just the magic numbers to try
> > > to make it clear how they're derived. This has the disadvantage of
> > > truncating the result instead of rounding, but I doubt that's
> > > significant in this context. If it is, we could use the magic numbers
> > > and put the computation in a comment.
> >
> > We can always use DIV_ROUND_UP((speed * enc_nominator),
> > enc_denominator). I think this is confusing, and since this introduces a
> > bandwidth limit I would prefer to give a wider limit than a wrong one,
> > even if it is by less than 1 Mb/s. My vote is for leaving it as you wrote below.
> >
> > > Another question: we currently deal in Mb/s, not MB/s. Mb/s has the
> > > advantage of sort of corresponding to the GT/s numbers, but using MB/s
> > > would have the advantage of smaller numbers that match the table here:
> > > https://en.wikipedia.org/wiki/PCI_Express#History_and_revisions,
> > > but I don't know what's most typical in user-facing situations.
> > > What's better?
> >
> > I don't know what's better, but for network devices we measure bandwidth
> > in Gb/s, so presenting bandwidth in MB/s would mean additional
> > calculations. The truth is I would have preferred to use Gb/s instead of
> > Mb/s, but again, I don't want to lose up to 1 Gb/s.
>
> I prefer this version with the calculation in line since it makes
> the derivation clear. Keeping them in Mb/s makes it easier to
> convert to Gb/s, which is what most people would expect.
OK, let's keep this patch as-is since returning Mb/s means we
don't have to worry about floating point, and it sounds like we
agree the truncation isn't a big deal.
I'll post a proposal to convert to Gb/s when printing.
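For reference, a minimal userspace sketch of the per-lane arithmetic, assuming plain C integer semantics. The names here (enum link_speed, speed_encoding(), lane_mbps_trunc(), lane_mbps_round_up(), link_mbps()) are hypothetical and not part of the PCI core. The multiplication has to come before the division, because 128/130 evaluated on its own is integer division and yields 0:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's speed enum values. */
enum link_speed { SPEED_2_5GT, SPEED_5_0GT, SPEED_8_0GT, SPEED_16_0GT };

/* Raw rate in Mb/s per lane and the encoding ratio num/den. */
struct encoding { uint32_t raw_mbps, num, den; };

static struct encoding speed_encoding(enum link_speed s)
{
	switch (s) {
	case SPEED_16_0GT: return (struct encoding){ 16000, 128, 130 };
	case SPEED_8_0GT:  return (struct encoding){  8000, 128, 130 };
	case SPEED_5_0GT:  return (struct encoding){  5000,   8,  10 };
	case SPEED_2_5GT:  return (struct encoding){  2500,   8,  10 };
	}
	return (struct encoding){ 0, 1, 1 };
}

/* Multiply first, then divide; the result is truncated. */
static uint32_t lane_mbps_trunc(enum link_speed s)
{
	struct encoding e = speed_encoding(s);

	return e.raw_mbps * e.num / e.den;
}

/* Same computation rounded up, in the spirit of DIV_ROUND_UP(). */
static uint32_t lane_mbps_round_up(enum link_speed s)
{
	struct encoding e = speed_encoding(s);

	return (e.raw_mbps * e.num + e.den - 1) / e.den;
}

/* Link bandwidth in Mb/s: lane count times the truncated per-lane rate. */
static uint32_t link_mbps(unsigned int width, enum link_speed s)
{
	return width * lane_mbps_trunc(s);
}
```

With truncation, an 8 GT/s x8 link comes out as 8 * 7876 = 63008 Mb/s. Rounding up gives 7877 per lane instead, so the two approaches differ by less than 1 Mb/s per lane.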
> > > commit 946435491b35b7782157e9a4d1bd73071fba7709
> > > Author: Tal Gilboa <[email protected]>
> > > Date: Fri Mar 30 08:32:03 2018 -0500
> > >
> > > PCI: Add pcie_bandwidth_capable() to compute max supported link
> > bandwidth
> > >
> > > Add pcie_bandwidth_capable() to compute the max link bandwidth
> > supported by
> > > a device, based on the max link speed and width, adjusted by the encoding
> > > overhead.
> > >
> > > The maximum bandwidth of the link is computed as:
> > >
> > > max_link_width * max_link_speed * (1 - encoding_overhead)
> > >
> > > 2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw
> > bandwidth
> > > available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
> > > reduces it by about 1.5%.
> > >
> > > The result is in Mb/s, i.e., megabits/second, of raw bandwidth.
> > >
> > > Signed-off-by: Tal Gilboa <[email protected]>
> > > [bhelgaas: add 16 GT/s, adjust for pcie_get_speed_cap() and
> > > pcie_get_width_cap() signatures, don't export outside drivers/pci]
> > > Signed-off-by: Bjorn Helgaas <[email protected]>
> > > Reviewed-by: Tariq Toukan <[email protected]>
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 43075be79388..ff1e72060952 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -5208,6 +5208,28 @@ enum pcie_link_width pcie_get_width_cap(struct
> > pci_dev *dev)
> > > return PCIE_LNK_WIDTH_UNKNOWN;
> > > }
> > >
> > > +/**
> > > + * pcie_bandwidth_capable - calculate a PCI device's link bandwidth capability
> > > + * @dev: PCI device
> > > + * @speed: storage for link speed
> > > + * @width: storage for link width
> > > + *
> > > + * Calculate a PCI device's link bandwidth by querying for its link speed
> > > + * and width, multiplying them, and applying encoding overhead. The result
> > > + * is in Mb/s, i.e., megabits/second of raw bandwidth.
> > > + */
> > > +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed
> > *speed,
> > > + enum pcie_link_width *width)
> > > +{
> > > + *speed = pcie_get_speed_cap(dev);
> > > + *width = pcie_get_width_cap(dev);
> > > +
> > > + if (*speed == PCI_SPEED_UNKNOWN || *width ==
> > PCIE_LNK_WIDTH_UNKNOWN)
> > > + return 0;
> > > +
> > > + return *width * PCIE_SPEED2MBS_ENC(*speed);
> > > +}
> > > +
> > > /**
> > > * pci_select_bars - Make BAR mask from the type of resource
> > > * @dev: the PCI device for which BAR mask is made
> > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > index 66738f1050c0..37f9299ed623 100644
> > > --- a/drivers/pci/pci.h
> > > +++ b/drivers/pci/pci.h
> > > @@ -261,8 +261,18 @@ void pci_disable_bridge_window(struct pci_dev *dev);
> > > (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> > > "Unknown speed")
> > >
> > > +/* PCIe speed to Mb/s reduced by encoding overhead */
> > > +#define PCIE_SPEED2MBS_ENC(speed) \
> > > + ((speed) == PCIE_SPEED_16_0GT ? (16000*128/130) : \
> > > + (speed) == PCIE_SPEED_8_0GT ? (8000*128/130) : \
> > > + (speed) == PCIE_SPEED_5_0GT ? (5000*8/10) : \
> > > + (speed) == PCIE_SPEED_2_5GT ? (2500*8/10) : \
> > > + 0)
> > > +
> > > enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
> > > enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
> > > +u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed
> > *speed,
> > > + enum pcie_link_width *width);
> > >
> > > /* Single Root I/O Virtualization */
> > > struct pci_sriov {
> > >
On Mon, Apr 02, 2018 at 04:25:17PM +0000, Keller, Jacob E wrote:
> > -----Original Message-----
> > From: Bjorn Helgaas [mailto:[email protected]]
> > Sent: Friday, March 30, 2018 2:05 PM
> > To: Tal Gilboa <[email protected]>
> > Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> > <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> > Goudar <[email protected]>; Kirsher, Jeffrey T
> > <[email protected]>; [email protected]; intel-wired-
> > [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed and
> > whether it's limited
> >
> > From: Tal Gilboa <[email protected]>
> >
> > Add pcie_print_link_status(). This logs the current settings of the link
> > (speed, width, and total available bandwidth).
> >
> > If the device is capable of more bandwidth but is limited by a slower
> > upstream link, we include information about the link that limits the
> > device's performance.
> >
> > The user may be able to move the device to a different slot for better
> > performance.
> >
> > This provides a unified method for all PCI devices to report status and
> > issues, instead of each device reporting in a different way, using
> > different code.
> >
> > Signed-off-by: Tal Gilboa <[email protected]>
> > [bhelgaas: changelog, reword log messages, print device capabilities when
> > not limited]
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > drivers/pci/pci.c | 29 +++++++++++++++++++++++++++++
> > include/linux/pci.h | 1 +
> > 2 files changed, 30 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index e00d56b12747..cec7aed09f6b 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5283,6 +5283,35 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev,
> > enum pci_bus_speed *speed,
> > return *width * PCIE_SPEED2MBS_ENC(*speed);
> > }
> >
> > +/**
> > + * pcie_print_link_status - Report the PCI device's link speed and width
> > + * @dev: PCI device to query
> > + *
> > + * Report the available bandwidth at the device. If this is less than the
> > + * device is capable of, report the device's maximum possible bandwidth and
> > + * the upstream link that limits its performance to less than that.
> > + */
> > +void pcie_print_link_status(struct pci_dev *dev)
> > +{
> > + enum pcie_link_width width, width_cap;
> > + enum pci_bus_speed speed, speed_cap;
> > + struct pci_dev *limiting_dev = NULL;
> > + u32 bw_avail, bw_cap;
> > +
> > + bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
> > + bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed,
> > &width);
> > +
> > + if (bw_avail >= bw_cap)
> > + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
> > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> > + else
> > + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d
> > link at %s (capable of %d Mb/s with %s x%d link)\n",
> > + bw_avail, PCIE_SPEED2STR(speed), width,
> > + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
> > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> > +}
>
> Personally, I would make this last one a pci_warn() to indicate it at a
> higher log level, but I'm ok with the wording, and if consensus is that
> this should be at info, I'm ok with that.
Tal's original patch did have a pci_warn() here, and we went back and
forth a bit. They get bug reports when a device doesn't perform as
expected, which argues for pci_warn(). But they also got feedback
saying warnings are a bit too much, which argues for pci_info() [1].
I don't have a really strong opinion either way. I have a slight
preference for info because the user may not be able to do anything
about it (there may not be a faster slot available), and I think
distros are usually configured so a warning interrupts the smooth
graphical boot.
It looks like mlx4, fm10k, and ixgbe currently use warnings, while
bnx2x, bnxt_en, and cxgb4 use info. It's a tie so far :)
[1] https://lkml.kernel.org/r/[email protected]
Here's a proposal for printing the bandwidth as "x.xxx Gb/s":
commit ad370f38c1b5e9b8bb941eaed84ebb676c4bdaa4
Author: Tal Gilboa <[email protected]>
Date: Fri Mar 30 08:56:47 2018 -0500
PCI: Add pcie_print_link_status() to log link speed and whether it's limited
Add pcie_print_link_status(). This logs the current settings of the link
(speed, width, and total available bandwidth).
If the device is capable of more bandwidth but is limited by a slower
upstream link, we include information about the link that limits the
device's performance.
The user may be able to move the device to a different slot for better
performance.
This provides a unified method for all PCI devices to report status and
issues, instead of each device reporting in a different way, using
different code.
Signed-off-by: Tal Gilboa <[email protected]>
[bhelgaas: changelog, reword log messages, print device capabilities when
not limited, print bandwidth in Gb/s]
Signed-off-by: Bjorn Helgaas <[email protected]>
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c6e3c0524699..ab2346041fa4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5287,6 +5287,38 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
return *width * PCIE_SPEED2MBS_ENC(*speed);
}
+/**
+ * pcie_print_link_status - Report the PCI device's link speed and width
+ * @dev: PCI device to query
+ *
+ * Report the available bandwidth at the device. If this is less than the
+ * device is capable of, report the device's maximum possible bandwidth and
+ * the upstream link that limits its performance to less than that.
+ */
+void pcie_print_link_status(struct pci_dev *dev)
+{
+ enum pcie_link_width width, width_cap;
+ enum pci_bus_speed speed, speed_cap;
+ struct pci_dev *limiting_dev = NULL;
+ u32 bw_avail, bw_cap;
+
+ bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
+ bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed, &width);
+
+ if (bw_avail >= bw_cap)
+ pci_info(dev, "%u.%03u Gb/s available bandwidth (%s x%d link)\n",
+ bw_cap / 1000, bw_cap % 1000,
+ PCIE_SPEED2STR(speed_cap), width_cap);
+ else
+ pci_info(dev, "%u.%03u Gb/s available bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link)\n",
+ bw_avail / 1000, bw_avail % 1000,
+ PCIE_SPEED2STR(speed), width,
+ limiting_dev ? pci_name(limiting_dev) : "<unknown>",
+ bw_cap / 1000, bw_cap % 1000,
+ PCIE_SPEED2STR(speed_cap), width_cap);
+}
+EXPORT_SYMBOL(pcie_print_link_status);
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f2bf2b7a66c7..38f7957121ef 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1086,6 +1086,7 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
enum pci_bus_speed *speed,
enum pcie_link_width *width);
+void pcie_print_link_status(struct pci_dev *dev);
void pcie_flr(struct pci_dev *dev);
int __pci_reset_function_locked(struct pci_dev *dev);
int pci_reset_function(struct pci_dev *dev);
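The "%u.%03u" format in the proposal above avoids floating point by printing the whole Gb/s and the remaining Mb/s as two integers. A small userspace sketch of the same idea, assuming a hypothetical helper format_gbps() that is not part of the patch:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: render a bandwidth given in Mb/s as "x.xxx Gb/s"
 * using integer division and remainder only.  The %03u zero-pads the
 * fractional part so that e.g. 8005 Mb/s prints as 8.005, not 8.5.
 */
static int format_gbps(char *buf, size_t len, uint32_t mbps)
{
	return snprintf(buf, len, "%u.%03u Gb/s",
			(unsigned int)(mbps / 1000),
			(unsigned int)(mbps % 1000));
}
```

For example, 63008 Mb/s (an 8 GT/s x8 link after encoding overhead) would be rendered as "63.008 Gb/s".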
> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Monday, April 02, 2018 12:58 PM
> To: Keller, Jacob E <[email protected]>
> Cc: Tal Gilboa <[email protected]>; Tariq Toukan <[email protected]>; Ariel
> Elior <[email protected]>; Ganesh Goudar <[email protected]>;
> Kirsher, Jeffrey T <[email protected]>; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]
> Subject: Re: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed
> and whether it's limited
>
> On Mon, Apr 02, 2018 at 04:25:17PM +0000, Keller, Jacob E wrote:
> > > -----Original Message-----
> > > From: Bjorn Helgaas [mailto:[email protected]]
> > > Sent: Friday, March 30, 2018 2:05 PM
> > > To: Tal Gilboa <[email protected]>
> > > Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> > > <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> > > Goudar <[email protected]>; Kirsher, Jeffrey T
> > > <[email protected]>; [email protected]; intel-wired-
> > > [email protected]; [email protected]; [email protected];
> > > [email protected]
> > > Subject: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed
> and
> > > whether it's limited
> > >
> > > From: Tal Gilboa <[email protected]>
> > >
> > > Add pcie_print_link_status(). This logs the current settings of the link
> > > (speed, width, and total available bandwidth).
> > >
> > > If the device is capable of more bandwidth but is limited by a slower
> > > upstream link, we include information about the link that limits the
> > > device's performance.
> > >
> > > The user may be able to move the device to a different slot for better
> > > performance.
> > >
> > > This provides a unified method for all PCI devices to report status and
> > > issues, instead of each device reporting in a different way, using
> > > different code.
> > >
> > > Signed-off-by: Tal Gilboa <[email protected]>
> > > [bhelgaas: changelog, reword log messages, print device capabilities when
> > > not limited]
> > > Signed-off-by: Bjorn Helgaas <[email protected]>
> > > ---
> > > drivers/pci/pci.c | 29 +++++++++++++++++++++++++++++
> > > include/linux/pci.h | 1 +
> > > 2 files changed, 30 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index e00d56b12747..cec7aed09f6b 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -5283,6 +5283,35 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev,
> > > enum pci_bus_speed *speed,
> > > return *width * PCIE_SPEED2MBS_ENC(*speed);
> > > }
> > >
> > > +/**
> > > + * pcie_print_link_status - Report the PCI device's link speed and width
> > > + * @dev: PCI device to query
> > > + *
> > > + * Report the available bandwidth at the device. If this is less than the
> > > + * device is capable of, report the device's maximum possible bandwidth and
> > > + * the upstream link that limits its performance to less than that.
> > > + */
> > > +void pcie_print_link_status(struct pci_dev *dev)
> > > +{
> > > + enum pcie_link_width width, width_cap;
> > > + enum pci_bus_speed speed, speed_cap;
> > > + struct pci_dev *limiting_dev = NULL;
> > > + u32 bw_avail, bw_cap;
> > > +
> > > + bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
> > > + bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed,
> > > &width);
> > > +
> > > + if (bw_avail >= bw_cap)
> > > + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
> > > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> > > + else
> > > + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d
> > > link at %s (capable of %d Mb/s with %s x%d link)\n",
> > > + bw_avail, PCIE_SPEED2STR(speed), width,
> > > + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
> > > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> > > +}
> >
> > Personally, I would make this last one a pci_warn() to indicate it at a
> > higher log level, but I'm ok with the wording, and if consensus is that
> > this should be at info, I'm ok with that.
>
> Tal's original patch did have a pci_warn() here, and we went back and
> forth a bit. They get bug reports when a device doesn't perform as
> expected, which argues for pci_warn(). But they also got feedback
> saying warnings are a bit too much, which argues for pci_info() [1]
>
> I don't have a really strong opinion either way. I have a slight
> preference for info because the user may not be able to do anything
> about it (there may not be a faster slot available), and I think
> distros are usually configured so a warning interrupts the smooth
> graphical boot.
>
> It looks like mlx4, fm10k, and ixgbe currently use warnings, while
> bnx2x, bnxt_en, and cxgb4 use info. It's a tie so far :)
>
> [1] https://lkml.kernel.org/r/e47f3628-b56c-4d0a-f18b-[email protected]
>
With that information, I'm fine with the proposal to print this at info level only. The message is still printed and can be used for debugging, and I think that's really enough.
> Here's a proposal for printing the bandwidth as "x.xxx Gb/s":
Nice, I like that also.
Regards,
Jake
On Mon, Apr 02, 2018 at 03:56:06PM +0000, Keller, Jacob E wrote:
> > -----Original Message-----
> > From: Bjorn Helgaas [mailto:[email protected]]
> > Sent: Friday, March 30, 2018 2:06 PM
> > To: Tal Gilboa <[email protected]>
> > Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> > <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> > Goudar <[email protected]>; Kirsher, Jeffrey T
> > <[email protected]>; [email protected]; intel-wired-
> > [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: [PATCH v5 12/14] fm10k: Report PCIe link properties with
> > pcie_print_link_status()
> >
> > From: Bjorn Helgaas <[email protected]>
> >
> > Use pcie_print_link_status() to report PCIe link speed and possible
> > limitations instead of implementing this in the driver itself.
> >
> > Note that pcie_get_minimum_link() can return misleading information because
> > it finds the slowest link and the narrowest link without considering the
> > total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
> > 2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
> > corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
> > about 2000 MB/s for a 16 GT/s x1 link.
>
> This comment is about what's being fixed, so it would have been easier to
> parse if it were written to more clearly indicate that we're removing
> (and not adding) this behavior.
Good point. Is this any better?
fm10k: Report PCIe link properties with pcie_print_link_status()
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.
pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself. This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.
Note that the driver previously used dev_warn() to suggest using a
different slot, but pcie_print_link_status() uses dev_info() because if the
platform has no faster slot available, the user can't do anything about the
warning and may not want to be bothered with it.
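To illustrate the difference described in the changelog above, here is a minimal user-space sketch (not the kernel implementation). The struct and both helper names are hypothetical and exist only for this example; per-lane rates are assumed to be in Mb/s after encoding overhead.

```c
#include <stddef.h>

/* Hypothetical model of one PCIe link in the path; not a kernel structure. */
struct link {
	unsigned int mbps_per_lane;	/* per-lane rate in Mb/s after encoding overhead */
	unsigned int width;		/* number of lanes */
};

/*
 * Old approach: pick the slowest speed and the narrowest width
 * independently, even if they come from different links.
 */
static unsigned int min_speed_times_min_width(const struct link *path, size_t n)
{
	unsigned int min_rate = ~0u, min_width = ~0u;
	size_t i;

	for (i = 0; i < n; i++) {
		if (path[i].mbps_per_lane < min_rate)
			min_rate = path[i].mbps_per_lane;
		if (path[i].width < min_width)
			min_width = path[i].width;
	}
	return n ? min_rate * min_width : 0;
}

/* New approach: compute each link's total bandwidth, then take the minimum. */
static unsigned int min_link_bandwidth(const struct link *path, size_t n)
{
	unsigned int min_bw = ~0u;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int bw = path[i].mbps_per_lane * path[i].width;

		if (bw < min_bw)
			min_bw = bw;
	}
	return n ? min_bw : 0;
}
```

For a path with a 16 GT/s x1 link (15753 Mb/s per lane) and a 2.5 GT/s x16 link (2000 Mb/s per lane), the first helper yields 2000 Mb/s (250 MB/s), while the second yields 15753 Mb/s (about 1969 MB/s).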
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Bjorn Helgaas
> Sent: Monday, April 02, 2018 1:32 PM
> To: Keller, Jacob E <[email protected]>
> Cc: Tal Gilboa <[email protected]>; Tariq Toukan <[email protected]>; Ariel
> Elior <[email protected]>; Ganesh Goudar <[email protected]>;
> Kirsher, Jeffrey T <[email protected]>; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]
> Subject: Re: [PATCH v5 12/14] fm10k: Report PCIe link properties with
> pcie_print_link_status()
>
> On Mon, Apr 02, 2018 at 03:56:06PM +0000, Keller, Jacob E wrote:
> > > -----Original Message-----
> > > From: Bjorn Helgaas [mailto:[email protected]]
> > > Sent: Friday, March 30, 2018 2:06 PM
> > > To: Tal Gilboa <[email protected]>
> > > Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
> > > <[email protected]>; Ariel Elior <[email protected]>; Ganesh
> > > Goudar <[email protected]>; Kirsher, Jeffrey T
> > > <[email protected]>; [email protected]; intel-wired-
> > > [email protected]; [email protected]; [email protected];
> > > [email protected]
> > > Subject: [PATCH v5 12/14] fm10k: Report PCIe link properties with
> > > pcie_print_link_status()
> > >
> > > From: Bjorn Helgaas <[email protected]>
> > >
> > > Use pcie_print_link_status() to report PCIe link speed and possible
> > > limitations instead of implementing this in the driver itself.
> > >
> > > Note that pcie_get_minimum_link() can return misleading information
> because
> > > it finds the slowest link and the narrowest link without considering the
> > > total bandwidth of the link. If the path contains a 16 GT/s x1 link and a
> > > 2.5 GT/s x16 link, pcie_get_minimum_link() returns 2.5 GT/s x1, which
> > > corresponds to 250 MB/s of bandwidth, not the actual available bandwidth of
> > > about 2000 MB/s for a 16 GT/s x1 link.
> >
> > This comment is about what's being fixed, so it would have been easier to
> > parse if it were written to more clearly indicate that we're removing
> > (and not adding) this behavior.
>
> Good point. Is this any better?
>
> fm10k: Report PCIe link properties with pcie_print_link_status()
>
> Previously the driver used pcie_get_minimum_link() to warn when the NIC
> is in a slot that can't supply as much bandwidth as the NIC could use.
>
> pcie_get_minimum_link() can be misleading because it finds the slowest link
> and the narrowest link (which may be different links) without considering
> the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
> 2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
> bandwidth, not the true available bandwidth of about 1969 MB/s for a
> 16 GT/s x1 link.
>
> Use pcie_print_link_status() to report PCIe link speed and possible
> limitations instead of implementing this in the driver itself. This finds
> the slowest link in the path to the device by computing the total bandwidth
> of each link and compares that with the capabilities of the device.
>
> Note that the driver previously used dev_warn() to suggest using a
> different slot, but pcie_print_link_status() uses dev_info() because if the
> platform has no faster slot available, the user can't do anything about the
> warning and may not want to be bothered with it.
Perfect! Thanks!
-Jake
On 4/2/2018 11:25 PM, Keller, Jacob E wrote:
>
>
>> -----Original Message-----
>> From: Bjorn Helgaas [mailto:[email protected]]
>> Sent: Monday, April 02, 2018 12:58 PM
>> To: Keller, Jacob E <[email protected]>
>> Cc: Tal Gilboa <[email protected]>; Tariq Toukan <[email protected]>; Ariel
>> Elior <[email protected]>; Ganesh Goudar <[email protected]>;
>> Kirsher, Jeffrey T <[email protected]>; [email protected];
>> [email protected]; [email protected]; linux-
>> [email protected]; [email protected]
>> Subject: Re: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed
>> and whether it's limited
>>
>> On Mon, Apr 02, 2018 at 04:25:17PM +0000, Keller, Jacob E wrote:
>>>> -----Original Message-----
>>>> From: Bjorn Helgaas [mailto:[email protected]]
>>>> Sent: Friday, March 30, 2018 2:05 PM
>>>> To: Tal Gilboa <[email protected]>
>>>> Cc: Tariq Toukan <[email protected]>; Keller, Jacob E
>>>> <[email protected]>; Ariel Elior <[email protected]>; Ganesh
>>>> Goudar <[email protected]>; Kirsher, Jeffrey T
>>>> <[email protected]>; [email protected]; intel-wired-
>>>> [email protected]; [email protected]; [email protected];
>>>> [email protected]
>>>> Subject: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed
>> and
>>>> whether it's limited
>>>>
>>>> From: Tal Gilboa <[email protected]>
>>>>
>>>> Add pcie_print_link_status(). This logs the current settings of the link
>>>> (speed, width, and total available bandwidth).
>>>>
>>>> If the device is capable of more bandwidth but is limited by a slower
>>>> upstream link, we include information about the link that limits the
>>>> device's performance.
>>>>
>>>> The user may be able to move the device to a different slot for better
>>>> performance.
>>>>
>>>> This provides a unified method for all PCI devices to report status and
>>>> issues, instead of each device reporting in a different way, using
>>>> different code.
>>>>
>>>> Signed-off-by: Tal Gilboa <[email protected]>
>>>> [bhelgaas: changelog, reword log messages, print device capabilities when
>>>> not limited]
>>>> Signed-off-by: Bjorn Helgaas <[email protected]>
>>>> ---
>>>> drivers/pci/pci.c | 29 +++++++++++++++++++++++++++++
>>>> include/linux/pci.h | 1 +
>>>> 2 files changed, 30 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>> index e00d56b12747..cec7aed09f6b 100644
>>>> --- a/drivers/pci/pci.c
>>>> +++ b/drivers/pci/pci.c
>>>> @@ -5283,6 +5283,35 @@ u32 pcie_bandwidth_capable(struct pci_dev *dev,
>>>> enum pci_bus_speed *speed,
>>>> return *width * PCIE_SPEED2MBS_ENC(*speed);
>>>> }
>>>>
>>>> +/**
>>>> + * pcie_print_link_status - Report the PCI device's link speed and width
>>>> + * @dev: PCI device to query
>>>> + *
>>>> + * Report the available bandwidth at the device. If this is less than the
>>>> + * device is capable of, report the device's maximum possible bandwidth and
>>>> + * the upstream link that limits its performance to less than that.
>>>> + */
>>>> +void pcie_print_link_status(struct pci_dev *dev)
>>>> +{
>>>> + enum pcie_link_width width, width_cap;
>>>> + enum pci_bus_speed speed, speed_cap;
>>>> + struct pci_dev *limiting_dev = NULL;
>>>> + u32 bw_avail, bw_cap;
>>>> +
>>>> + bw_cap = pcie_bandwidth_capable(dev, &speed_cap, &width_cap);
>>>> + bw_avail = pcie_bandwidth_available(dev, &limiting_dev, &speed,
>>>> &width);
>>>> +
>>>> + if (bw_avail >= bw_cap)
>>>> + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
>>>> + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
>>>> + else
>>>> + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d
>>>> link at %s (capable of %d Mb/s with %s x%d link)\n",
>>>> + bw_avail, PCIE_SPEED2STR(speed), width,
>>>> + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
>>>> + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
>>>> +}
>>>
>>> Personally, I would make this last one a pci_warn() to indicate it at a
>>> higher log level, but I'm ok with the wording, and if consensus is that
>>> this should be at info, I'm ok with that.
>>
>> Tal's original patch did have a pci_warn() here, and we went back and
>> forth a bit. They get bug reports when a device doesn't perform as
>> expected, which argues for pci_warn(). But they also got feedback
>> saying warnings are a bit too much, which argues for pci_info() [1]
>>
>> I don't have a really strong opinion either way. I have a slight
>> preference for info because the user may not be able to do anything
>> about it (there may not be a faster slot available), and I think
>> distros are usually configured so a warning interrupts the smooth
>> graphical boot.
>>
>> It looks like mlx4, fm10k, and ixgbe currently use warnings, while
>> bnx2x, bnxt_en, and cxgb4 use info. It's a tie so far :)
>>
>> [1] https://lkml.kernel.org/r/e47f3628-b56c-4d0a-f18b-[email protected]
>>
>
> With that information, I'm fine with the proposal to print this at info level only. The message is still printed and can be used for debugging, and I think that's really enough.
>
>> Here's a proposal for printing the bandwidth as "x.xxx Gb/s":
>
> Nice, I like that also.
>
> Regards,
> Jake
>
Same here for both.
On Mon, Apr 2, 2018 at 7:05 AM, Bjorn Helgaas <[email protected]> wrote:
> +/* PCIe speed to Mb/s reduced by encoding overhead */
> +#define PCIE_SPEED2MBS_ENC(speed) \
> + ((speed) == PCIE_SPEED_16_0GT ? (16000*(128/130)) : \
> + (speed) == PCIE_SPEED_8_0GT ? (8000*(128/130)) : \
> + (speed) == PCIE_SPEED_5_0GT ? (5000*(8/10)) : \
> + (speed) == PCIE_SPEED_2_5GT ? (2500*(8/10)) : \
> + 0)
> +
Should this be "(speed * x) / y" instead? Wouldn't the compiler calculate
128/130 first and truncate it to zero before multiplying by the speed? Or
are compilers smart enough to reorder this and avoid the
loss?
Thanks,
Jake
On Mon, Apr 02, 2018 at 05:30:54PM -0700, Jacob Keller wrote:
> On Mon, Apr 2, 2018 at 7:05 AM, Bjorn Helgaas <[email protected]> wrote:
> > +/* PCIe speed to Mb/s reduced by encoding overhead */
> > +#define PCIE_SPEED2MBS_ENC(speed) \
> > + ((speed) == PCIE_SPEED_16_0GT ? (16000*(128/130)) : \
> > + (speed) == PCIE_SPEED_8_0GT ? (8000*(128/130)) : \
> > + (speed) == PCIE_SPEED_5_0GT ? (5000*(8/10)) : \
> > + (speed) == PCIE_SPEED_2_5GT ? (2500*(8/10)) : \
> > + 0)
> > +
>
> Should this be "(speed * x) / y" instead? Wouldn't the compiler calculate
> 128/130 first and truncate it to zero before multiplying by the speed? Or
> are compilers smart enough to reorder this and avoid the
> loss?
Yep, thanks for saving me yet more embarrassment.
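For reference, a small stand-alone sketch of the truncation problem. The two helper names are made up for this illustration and are not part of the patch; C evaluates the parenthesized integer division first, so the compiler may not reorder it.

```c
/*
 * Mirrors the macro as posted: the parenthesized integer division
 * num / den is evaluated first, so 128 / 130 and 8 / 10 are both 0.
 */
static unsigned int enc_truncating(unsigned int raw_mbps,
				   unsigned int num, unsigned int den)
{
	return raw_mbps * (num / den);
}

/*
 * Multiply before dividing so the intermediate value keeps its
 * precision; only the final result is truncated.
 */
static unsigned int enc_mul_first(unsigned int raw_mbps,
				  unsigned int num, unsigned int den)
{
	return raw_mbps * num / den;
}
```

With raw_mbps = 16000 and 128b/130b encoding, the first form gives 0 and the second gives 15753 Mb/s. With raw_mbps = 2500 and 8b/10b encoding, the second form gives 2000 Mb/s.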
> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Tuesday, April 03, 2018 7:06 AM
> To: Jacob Keller <[email protected]>
> Cc: Tal Gilboa <[email protected]>; Tariq Toukan <[email protected]>;
> Keller, Jacob E <[email protected]>; Ariel Elior <[email protected]>;
> Ganesh Goudar <[email protected]>; Kirsher, Jeffrey T
> <[email protected]>; [email protected]; intel-wired-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH v5 03/14] PCI: Add pcie_bandwidth_capable() to compute
> max supported link bandwidth
>
> On Mon, Apr 02, 2018 at 05:30:54PM -0700, Jacob Keller wrote:
> > On Mon, Apr 2, 2018 at 7:05 AM, Bjorn Helgaas <[email protected]> wrote:
> > > +/* PCIe speed to Mb/s reduced by encoding overhead */
> > > +#define PCIE_SPEED2MBS_ENC(speed) \
> > > + ((speed) == PCIE_SPEED_16_0GT ? (16000*(128/130)) : \
> > > + (speed) == PCIE_SPEED_8_0GT ? (8000*(128/130)) : \
> > > + (speed) == PCIE_SPEED_5_0GT ? (5000*(8/10)) : \
> > > + (speed) == PCIE_SPEED_2_5GT ? (2500*(8/10)) : \
> > > + 0)
> > > +
> >
> > Should this be "(speed * x) / y" instead? Wouldn't the compiler calculate
> > 128/130 first and truncate it to zero before multiplying by the speed? Or
> > are compilers smart enough to reorder this and avoid the
> > loss?
>
> Yep, thanks for saving me yet more embarrassment.
That's what patch review is for :D
Thanks,
Jake
On Fri, 30 Mar 2018 16:05:18 -0500, Bjorn Helgaas wrote:
> + if (bw_avail >= bw_cap)
> + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
> + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> + else
> + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d link at %s (capable of %d Mb/s with %s x%d link)\n",
> + bw_avail, PCIE_SPEED2STR(speed), width,
> + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
> + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
I was just looking at using this new function to print PCIe BW for a
NIC, but I'm slightly worried that there is nothing in the message that
says PCIe... For a NIC some people may interpret the bandwidth as NIC
bandwidth:
[ 39.839989] nfp 0000:04:00.0: Netronome Flow Processor NFP4000/NFP6000 PCIe Card Probe
[ 39.848943] nfp 0000:04:00.0: 63.008 Gb/s available bandwidth (8 GT/s x8 link)
[ 39.857146] nfp 0000:04:00.0: RESERVED BARs: 0.0: General/MSI-X SRAM, 0.1: PCIe XPB/MSI-X PBA, 0.4: Explicit0, 0.5: Explicit1, fre4
It's not a 63Gbps NIC... I'm sorry if this was discussed before and I
didn't find it. Would it make sense to add the "PCIe: " prefix to the
message like bnx2x used to do? Like:
nfp 0000:04:00.0: PCIe: 63.008 Gb/s available bandwidth (8 GT/s x8 link)
Sorry for a very late comment.
On Thu, Apr 12, 2018 at 09:32:49PM -0700, Jakub Kicinski wrote:
> On Fri, 30 Mar 2018 16:05:18 -0500, Bjorn Helgaas wrote:
> > + if (bw_avail >= bw_cap)
> > + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
> > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> > + else
> > + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d link at %s (capable of %d Mb/s with %s x%d link)\n",
> > + bw_avail, PCIE_SPEED2STR(speed), width,
> > + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
> > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
>
> I was just looking at using this new function to print PCIe BW for a
> NIC, but I'm slightly worried that there is nothing in the message that
> says PCIe... For a NIC some people may interpret the bandwidth as NIC
> bandwidth:
>
> [ 39.839989] nfp 0000:04:00.0: Netronome Flow Processor NFP4000/NFP6000 PCIe Card Probe
> [ 39.848943] nfp 0000:04:00.0: 63.008 Gb/s available bandwidth (8 GT/s x8 link)
> [ 39.857146] nfp 0000:04:00.0: RESERVED BARs: 0.0: General/MSI-X SRAM, 0.1: PCIe XPB/MSI-X PBA, 0.4: Explicit0, 0.5: Explicit1, fre4
>
> It's not a 63Gbps NIC... I'm sorry if this was discussed before and I
> didn't find it. Would it make sense to add the "PCIe: " prefix to the
> message like bnx2x used to do? Like:
>
> nfp 0000:04:00.0: PCIe: 63.008 Gb/s available bandwidth (8 GT/s x8 link)
I agree, that does look potentially confusing. How about this:
nfp 0000:04:00.0: 63.008 Gb/s available PCIe bandwidth (8 GT/s x8 link)
I did have to look twice at this before I remembered that we're
printing Gb/s (not GB/s). Most of the references I found on the web
use GB/s when talking about total PCIe bandwidth.
But either way I think it's definitely worth mentioning PCIe
explicitly.
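As a rough sketch of how a bandwidth kept in Mb/s could be rendered as "x.xxx Gb/s" without floating point (the helper name is hypothetical, and this is user-space code rather than what the patch does verbatim):

```c
#include <stdio.h>

/*
 * Format a bandwidth given in Mb/s as "x.xxx Gb/s" using only integer
 * arithmetic. Returns the snprintf() result.
 */
static int format_gbps(char *buf, size_t len, unsigned int mbps)
{
	return snprintf(buf, len, "%u.%03u Gb/s", mbps / 1000, mbps % 1000);
}
```

For the 8 GT/s x8 link in the log above, the per-lane rate is 8000 * 128 / 130 = 7876 Mb/s, so eight lanes give 63008 Mb/s, which this formats as "63.008 Gb/s". Note this is gigabits, not gigabytes; dividing by 8 gives roughly 7.9 GB/s.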
> -----Original Message-----
> From: Bjorn Helgaas [mailto:[email protected]]
> Sent: Friday, April 13, 2018 7:07 AM
> To: Jakub Kicinski <[email protected]>
> Cc: Tal Gilboa <[email protected]>; Tariq Toukan <[email protected]>;
> Keller, Jacob E <[email protected]>; Ariel Elior <[email protected]>;
> Ganesh Goudar <[email protected]>; Kirsher, Jeffrey T
> <[email protected]>; [email protected]; intel-wired-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed
> and whether it's limited
>
> On Thu, Apr 12, 2018 at 09:32:49PM -0700, Jakub Kicinski wrote:
> > On Fri, 30 Mar 2018 16:05:18 -0500, Bjorn Helgaas wrote:
> > > + if (bw_avail >= bw_cap)
> > > + pci_info(dev, "%d Mb/s available bandwidth (%s x%d link)\n",
> > > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> > > + else
> > > + pci_info(dev, "%d Mb/s available bandwidth, limited by %s x%d
> link at %s (capable of %d Mb/s with %s x%d link)\n",
> > > + bw_avail, PCIE_SPEED2STR(speed), width,
> > > + limiting_dev ? pci_name(limiting_dev) : "<unknown>",
> > > + bw_cap, PCIE_SPEED2STR(speed_cap), width_cap);
> >
> > I was just looking at using this new function to print PCIe BW for a
> > NIC, but I'm slightly worried that there is nothing in the message that
> > says PCIe... For a NIC some people may interpret the bandwidth as NIC
> > bandwidth:
> >
> > [ 39.839989] nfp 0000:04:00.0: Netronome Flow Processor NFP4000/NFP6000
> PCIe Card Probe
> > [ 39.848943] nfp 0000:04:00.0: 63.008 Gb/s available bandwidth (8 GT/s x8 link)
> > [ 39.857146] nfp 0000:04:00.0: RESERVED BARs: 0.0: General/MSI-X SRAM, 0.1:
> PCIe XPB/MSI-X PBA, 0.4: Explicit0, 0.5: Explicit1, fre4
> >
> > It's not a 63Gbps NIC... I'm sorry if this was discussed before and I
> > didn't find it. Would it make sense to add the "PCIe: " prefix to the
> > message like bnx2x used to do? Like:
> >
> > nfp 0000:04:00.0: PCIe: 63.008 Gb/s available bandwidth (8 GT/s x8 link)
>
> I agree, that does look potentially confusing. How about this:
>
> nfp 0000:04:00.0: 63.008 Gb/s available PCIe bandwidth (8 GT/s x8 link)
>
> I did have to look twice at this before I remembered that we're
> printing Gb/s (not GB/s). Most of the references I found on the web
> use GB/s when talking about total PCIe bandwidth.
>
> But either way I think it's definitely worth mentioning PCIe
> explicitly.
I also agree printing PCIe explicitly is good.
Thanks,
Jake