[RESEND with Shanker's patches as those depend on this series]
PCI and PCIe devices may support a number of possible reset mechanisms,
for example Function Level Reset (FLR) provided via Advanced Feature or
PCIe capabilities, Power Management reset, bus reset, or device-specific reset.
Currently the PCI subsystem creates a policy prioritizing these reset methods,
which provides neither visibility nor control to userspace.
Expose the reset methods available per device to userspace via sysfs,
and allow an administrative user or device owner to
manage per-device reset method priorities or exclusions.
This feature aims to allow greater control of a device for use cases
such as device assignment, where specific device or platform issues may
interact poorly with a given reset method, and for which device-specific
quirks have not been developed.
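As a rough illustration of the userspace side, the sketch below writes a
comma-separated method list to an attribute file and reads it back. It is
an assumption-laden example, not part of the series: write_reset_method()
and read_reset_method() are hypothetical helper names, and the caller
supplies the attribute path (the reset_method attribute is the one added
by the sysfs patch in this series).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a comma-separated method list to the attribute. */
static int write_reset_method(const char *attr_path, const char *methods)
{
	FILE *f;
	int ok;

	assert(attr_path != NULL && methods != NULL);
	f = fopen(attr_path, "w");
	if (!f)
		return -1;
	ok = fputs(methods, f) >= 0;
	if (fclose(f) != 0)	/* write errors may only surface on close */
		ok = 0;
	return ok ? 0 : -1;
}

/* Hypothetical helper: read the current ordering into buf, newline stripped. */
static int read_reset_method(const char *attr_path, char *buf, size_t len)
{
	FILE *f;

	assert(attr_path != NULL && buf != NULL && len > 0);
	f = fopen(attr_path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f))
		buf[0] = '\0';	/* empty attribute: no methods enabled */
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

On a real system the path would be the reset_method file under the
device's sysfs directory, and writing it would normally require root.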
Changes in v2:
- Use byte array instead of bitmap to keep track of
ordering of reset methods
- Fix incorrect use of reset_fn field in octeon driver
- Allow writing comma separated list of names of supported reset
methods to reset_method sysfs attribute
- Writing an empty string instead of "none" to the reset_method attribute
disables the ability to reset the device
Sending Raphael's patch again as this series depends on it.
Amey Narkhede (4):
PCI: Add pcie_reset_flr to follow calling convention of other reset
methods
PCI: Add new array for keeping track of ordering of reset methods
PCI: Remove reset_fn field from pci_dev
PCI/sysfs: Allow userspace to query and set device reset mechanism
Raphael Norwitz (1):
PCI: merge slot and bus reset implementations
Shanker Donthineni (2):
PCI: Add support for a function level reset based on _RST method
PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
Documentation/ABI/testing/sysfs-bus-pci | 16 ++
drivers/crypto/cavium/nitrox/nitrox_main.c | 4 +-
.../ethernet/cavium/liquidio/lio_vf_main.c | 2 +-
drivers/pci/pci-sysfs.c | 93 +++++++-
drivers/pci/pci.c | 206 +++++++++++-------
drivers/pci/pci.h | 10 +-
drivers/pci/pcie/aer.c | 12 +-
drivers/pci/probe.c | 4 +-
drivers/pci/quirks.c | 23 +-
include/linux/pci.h | 11 +-
10 files changed, 278 insertions(+), 103 deletions(-)
--
2.31.1
Currently there is a separate function, pcie_has_flr(), to probe
whether PCIe FLR is supported by the device. It does not match
the calling convention followed by the reset methods, which use the second
function argument to decide whether to probe or to reset.
Add a new function, pcie_reset_flr(), that follows the calling
convention of the reset methods.
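The calling convention can be sketched with a small userspace mock. This
is only an illustration under stated assumptions: struct fake_dev,
FAKE_NO_FLR, FAKE_DEVCAP_FLR, fake_flr() and fake_reset_flr() are
hypothetical stand-ins for struct pci_dev, PCI_DEV_FLAGS_NO_FLR_RESET,
PCI_EXP_DEVCAP_FLR, pcie_flr() and pcie_reset_flr(); no hardware is
touched and the flag values are arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag values; they only mirror the roles of the kernel flags. */
#define FAKE_NO_FLR      0x1u	/* role of PCI_DEV_FLAGS_NO_FLR_RESET */
#define FAKE_DEVCAP_FLR  0x2u	/* role of PCI_EXP_DEVCAP_FLR */

struct fake_dev {
	unsigned int dev_flags;	/* quirk flags */
	unsigned int devcap;	/* pretend Device Capabilities register */
	int resets_done;	/* counts resets actually performed */
};

/* Unconditional reset, like pcie_flr(): no capability checks. */
static int fake_flr(struct fake_dev *dev)
{
	dev->resets_done++;
	return 0;
}

/*
 * Same shape as the other reset methods: -ENOTTY when the method is
 * unsupported, 0 when a probe succeeds, otherwise the reset's result.
 */
static int fake_reset_flr(struct fake_dev *dev, int probe)
{
	assert(dev != NULL);

	if (dev->dev_flags & FAKE_NO_FLR)
		return -ENOTTY;
	if (!(dev->devcap & FAKE_DEVCAP_FLR))
		return -ENOTTY;
	if (probe)
		return 0;	/* supported; do not touch the device */
	return fake_flr(dev);
}
```

With a single entry point, callers no longer need a separate "has FLR"
check before resetting; they can probe and reset through the same function.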
Reviewed-by: Alex Williamson <[email protected]>
Reviewed-by: Raphael Norwitz <[email protected]>
Co-developed-by: Alex Williamson <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Amey Narkhede <[email protected]>
---
drivers/crypto/cavium/nitrox/nitrox_main.c | 4 +-
drivers/pci/pci.c | 62 ++++++++++++----------
drivers/pci/pcie/aer.c | 12 ++---
drivers/pci/quirks.c | 9 ++--
include/linux/pci.h | 2 +-
5 files changed, 43 insertions(+), 46 deletions(-)
diff --git a/drivers/crypto/cavium/nitrox/nitrox_main.c b/drivers/crypto/cavium/nitrox/nitrox_main.c
index facc8e6bc..15d6c8452 100644
--- a/drivers/crypto/cavium/nitrox/nitrox_main.c
+++ b/drivers/crypto/cavium/nitrox/nitrox_main.c
@@ -306,9 +306,7 @@ static int nitrox_device_flr(struct pci_dev *pdev)
return -ENOMEM;
}
- /* check flr support */
- if (pcie_has_flr(pdev))
- pcie_flr(pdev);
+ pcie_reset_flr(pdev, 0);
pci_restore_state(pdev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a8f8dd588..b998d6ad3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4573,32 +4573,12 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
}
EXPORT_SYMBOL(pci_wait_for_pending_transaction);
-/**
- * pcie_has_flr - check if a device supports function level resets
- * @dev: device to check
- *
- * Returns true if the device advertises support for PCIe function level
- * resets.
- */
-bool pcie_has_flr(struct pci_dev *dev)
-{
- u32 cap;
-
- if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
- return false;
-
- pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
- return cap & PCI_EXP_DEVCAP_FLR;
-}
-EXPORT_SYMBOL_GPL(pcie_has_flr);
-
/**
* pcie_flr - initiate a PCIe function level reset
* @dev: device to reset
*
- * Initiate a function level reset on @dev. The caller should ensure the
- * device supports FLR before calling this function, e.g. by using the
- * pcie_has_flr() helper.
+ * Initiate a function level reset unconditionally on @dev without
+ * checking any flags and DEVCAP
*/
int pcie_flr(struct pci_dev *dev)
{
@@ -4621,6 +4601,31 @@ int pcie_flr(struct pci_dev *dev)
}
EXPORT_SYMBOL_GPL(pcie_flr);
+/**
+ * pcie_reset_flr - initiate a PCIe function level reset
+ * @dev: device to reset
+ * @probe: If set, only check if the device can be reset this way.
+ *
+ * Initiate a function level reset on @dev.
+ */
+int pcie_reset_flr(struct pci_dev *dev, int probe)
+{
+ u32 cap;
+
+ if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
+ return -ENOTTY;
+
+ pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
+ if (!(cap & PCI_EXP_DEVCAP_FLR))
+ return -ENOTTY;
+
+ if (probe)
+ return 0;
+
+ return pcie_flr(dev);
+}
+EXPORT_SYMBOL_GPL(pcie_reset_flr);
+
static int pci_af_flr(struct pci_dev *dev, int probe)
{
int pos;
@@ -5100,11 +5105,9 @@ int __pci_reset_function_locked(struct pci_dev *dev)
rc = pci_dev_specific_reset(dev, 0);
if (rc != -ENOTTY)
return rc;
- if (pcie_has_flr(dev)) {
- rc = pcie_flr(dev);
- if (rc != -ENOTTY)
- return rc;
- }
+ rc = pcie_reset_flr(dev, 0);
+ if (rc != -ENOTTY)
+ return rc;
rc = pci_af_flr(dev, 0);
if (rc != -ENOTTY)
return rc;
@@ -5135,8 +5138,9 @@ int pci_probe_reset_function(struct pci_dev *dev)
rc = pci_dev_specific_reset(dev, 1);
if (rc != -ENOTTY)
return rc;
- if (pcie_has_flr(dev))
- return 0;
+ rc = pcie_reset_flr(dev, 1);
+ if (rc != -ENOTTY)
+ return rc;
rc = pci_af_flr(dev, 1);
if (rc != -ENOTTY)
return rc;
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index ba2238834..f4e891bd5 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1405,13 +1405,11 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
}
if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
- if (pcie_has_flr(dev)) {
- rc = pcie_flr(dev);
- pci_info(dev, "has been reset (%d)\n", rc);
- } else {
- pci_info(dev, "not reset (no FLR support)\n");
- rc = -ENOTTY;
- }
+ rc = pcie_reset_flr(dev, 0);
+ if (!rc)
+ pci_info(dev, "has been reset\n");
+ else
+ pci_info(dev, "not reset (no FLR support: %d)\n", rc);
} else {
rc = pci_bus_error_reset(dev);
pci_info(dev, "%s Port link has been reset (%d)\n",
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e3b..5318833f3 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3831,7 +3831,7 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
u32 cfg;
if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
- !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
+ pcie_reset_flr(dev, 1) || !pci_resource_start(dev, 0))
return -ENOTTY;
if (probe)
@@ -3900,13 +3900,10 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
*/
static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
{
- if (!pcie_has_flr(dev))
- return -ENOTTY;
+ int ret = pcie_reset_flr(dev, probe);
if (probe)
- return 0;
-
- pcie_flr(dev);
+ return ret;
msleep(250);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 979d54335..8d20e51ab 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1217,7 +1217,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
enum pci_bus_speed *speed,
enum pcie_link_width *width);
void pcie_print_link_status(struct pci_dev *dev);
-bool pcie_has_flr(struct pci_dev *dev);
+int pcie_reset_flr(struct pci_dev *dev, int probe);
int pcie_flr(struct pci_dev *dev);
int __pci_reset_function_locked(struct pci_dev *dev);
int pci_reset_function(struct pci_dev *dev);
--
2.31.1
From: Raphael Norwitz <[email protected]>
Slot resets are bus resets with additional logic to prevent a device
from being removed during the reset. Currently slot and bus resets have
separate implementations in pci.c, complicating higher level logic. As
discussed on the mailing list, they should be combined into a generic
function which performs an SBR. This change adds a function,
pci_reset_bus_function(), which first attempts a slot reset and then
attempts a bus reset if -ENOTTY is returned, such that there is now a
single device agnostic function to perform an SBR.
This new function is also needed to add SBR reset quirks and therefore
is exposed in pci.h.
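The fallback described above can be sketched in userspace as follows.
This is a mock under stated assumptions, not kernel code: struct fake_dev,
its fields, the enum, and the fake_* functions are hypothetical names that
only model "try the slot reset, then the bus reset if -ENOTTY is returned".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum fake_path { PATH_NONE, PATH_SLOT, PATH_BUS };

/* Hypothetical device model: which reset paths exist for it. */
struct fake_dev {
	int has_hotplug_slot;		/* slot reset available */
	int has_parent_bridge;		/* secondary bus reset available */
	enum fake_path last_reset;	/* path that performed the last reset */
};

static int fake_slot_reset(struct fake_dev *dev, int probe)
{
	if (!dev->has_hotplug_slot)
		return -ENOTTY;
	if (!probe)
		dev->last_reset = PATH_SLOT;
	return 0;
}

static int fake_parent_bus_reset(struct fake_dev *dev, int probe)
{
	if (!dev->has_parent_bridge)
		return -ENOTTY;
	if (!probe)
		dev->last_reset = PATH_BUS;
	return 0;
}

/* Try the slot reset first; fall back to the bus reset only on -ENOTTY. */
static int fake_reset_bus_function(struct fake_dev *dev, int probe)
{
	int rc;

	assert(dev != NULL);
	rc = fake_slot_reset(dev, probe);
	if (rc != -ENOTTY)
		return rc;
	return fake_parent_bus_reset(dev, probe);
}
```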
Link: https://lkml.org/lkml/2021/3/23/911
Suggested-by: Alex Williamson <[email protected]>
Signed-off-by: Amey Narkhede <[email protected]>
Signed-off-by: Raphael Norwitz <[email protected]>
---
drivers/pci/pci.c | 19 +++++++++++--------
include/linux/pci.h | 1 +
2 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 16a17215f..a8f8dd588 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4982,6 +4982,15 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, int probe)
return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
}
+int pci_reset_bus_function(struct pci_dev *dev, int probe)
+{
+ int rc = pci_dev_reset_slot_function(dev, probe);
+
+ if (rc != -ENOTTY)
+ return rc;
+ return pci_parent_bus_reset(dev, probe);
+}
+
static void pci_dev_lock(struct pci_dev *dev)
{
pci_cfg_access_lock(dev);
@@ -5102,10 +5111,7 @@ int __pci_reset_function_locked(struct pci_dev *dev)
rc = pci_pm_reset(dev, 0);
if (rc != -ENOTTY)
return rc;
- rc = pci_dev_reset_slot_function(dev, 0);
- if (rc != -ENOTTY)
- return rc;
- return pci_parent_bus_reset(dev, 0);
+ return pci_reset_bus_function(dev, 0);
}
EXPORT_SYMBOL_GPL(__pci_reset_function_locked);
@@ -5135,13 +5141,10 @@ int pci_probe_reset_function(struct pci_dev *dev)
if (rc != -ENOTTY)
return rc;
rc = pci_pm_reset(dev, 1);
- if (rc != -ENOTTY)
- return rc;
- rc = pci_dev_reset_slot_function(dev, 1);
if (rc != -ENOTTY)
return rc;
- return pci_parent_bus_reset(dev, 1);
+ return pci_reset_bus_function(dev, 1);
}
/**
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 86c799c97..979d54335 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1228,6 +1228,7 @@ int pci_probe_reset_bus(struct pci_bus *bus);
int pci_reset_bus(struct pci_dev *dev);
void pci_reset_secondary_bus(struct pci_dev *dev);
void pcibios_reset_secondary_bus(struct pci_dev *dev);
+int pci_reset_bus_function(struct pci_dev *dev, int probe);
void pci_update_resource(struct pci_dev *dev, int resno);
int __must_check pci_assign_resource(struct pci_dev *dev, int i);
int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
--
2.31.1
The reset_fn field is used to indicate whether the
device supports any reset mechanism.
Remove reset_fn in favor of the new
reset_methods array, which can be used to keep
track of all supported reset mechanisms of a device
and their ordering.
The octeon driver incorrectly uses the reset_fn field
to detect whether the device supports FLR. Use
pcie_reset_flr() to probe whether it supports
FLR instead.
Reviewed-by: Alex Williamson <[email protected]>
Reviewed-by: Raphael Norwitz <[email protected]>
Co-developed-by: Alex Williamson <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Amey Narkhede <[email protected]>
---
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 2 +-
drivers/pci/pci-sysfs.c | 6 ++----
drivers/pci/pci.c | 6 +++---
drivers/pci/probe.c | 1 -
drivers/pci/quirks.c | 2 +-
include/linux/pci.h | 1 -
6 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index 516f166ce..336d149ee 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -526,7 +526,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
oct->irq_name_storage = NULL;
}
/* Soft reset the octeon device before exiting */
- if (oct->pci_dev->reset_fn)
+ if (!pcie_reset_flr(oct->pci_dev, 1))
octeon_pci_flr(oct);
else
cn23xx_vf_ask_pf_to_do_flr(oct);
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index f8afd54ca..388895099 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1334,7 +1334,7 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
pcie_vpd_create_sysfs_dev_files(dev);
- if (dev->reset_fn) {
+ if (pci_reset_supported(dev)) {
retval = device_create_file(&dev->dev, &dev_attr_reset);
if (retval)
goto error;
@@ -1417,10 +1417,8 @@ int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev)
static void pci_remove_capabilities_sysfs(struct pci_dev *dev)
{
pcie_vpd_remove_sysfs_dev_files(dev);
- if (dev->reset_fn) {
+ if (pci_reset_supported(dev))
device_remove_file(&dev->dev, &dev_attr_reset);
- dev->reset_fn = 0;
- }
}
/**
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ca46a55c7..664cf2d35 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5192,7 +5192,7 @@ int pci_reset_function(struct pci_dev *dev)
{
int rc;
- if (!dev->reset_fn)
+ if (!pci_reset_supported(dev))
return -ENOTTY;
pci_dev_lock(dev);
@@ -5228,7 +5228,7 @@ int pci_reset_function_locked(struct pci_dev *dev)
{
int rc;
- if (!dev->reset_fn)
+ if (!pci_reset_supported(dev))
return -ENOTTY;
pci_dev_save_and_disable(dev);
@@ -5251,7 +5251,7 @@ int pci_try_reset_function(struct pci_dev *dev)
{
int rc;
- if (!dev->reset_fn)
+ if (!pci_reset_supported(dev))
return -ENOTTY;
if (!pci_dev_trylock(dev))
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c5cfdd239..4764e031a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2404,7 +2404,6 @@ static void pci_init_capabilities(struct pci_dev *dev)
pcie_report_downtraining(dev);
pci_init_reset_methods(dev);
- dev->reset_fn = pci_reset_supported(dev);
}
/*
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 5318833f3..8f47d139c 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5535,7 +5535,7 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
if (pdev->subsystem_vendor != PCI_VENDOR_ID_LENOVO ||
pdev->subsystem_device != 0x222e ||
- !pdev->reset_fn)
+ !pci_reset_supported(pdev))
return;
if (pci_enable_device_mem(pdev))
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5c5925ecf..9f8347799 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -429,7 +429,6 @@ struct pci_dev {
unsigned int state_saved:1;
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
- unsigned int reset_fn:1;
unsigned int is_hotplug_bridge:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
--
2.31.1
Add a reset_method sysfs attribute to enable the user to
query and set preferred device reset methods and
their ordering.
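How a comma-separated list maps onto per-method priorities can be
sketched in userspace as below. This is a simplified model, not the
patch's reset_method_store(): parse_reset_methods(), N_METHODS and
DISABLED are hypothetical names, the "default" keyword and trailing
newlines are not handled, and the input is tokenized on a private copy.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define N_METHODS 5	/* role of PCI_RESET_FN_METHODS */
#define DISABLED  0xff	/* supported but not selected, as in the patch */

/* Index order follows pci_reset_fn_methods[] in this series. */
static const char *const method_names[N_METHODS] = {
	"device_specific", "flr", "af_flr", "pm", "bus",
};

/*
 * prios[] holds the current state (0 = unsupported). On success it is
 * rewritten: listed methods get descending priorities, unlisted but
 * supported ones become DISABLED. Returns 0 or -EINVAL.
 */
static int parse_reset_methods(const char *list, unsigned char prios[N_METHODS])
{
	unsigned char next[N_METHODS];
	unsigned char prio = N_METHODS;
	char copy[128];
	char *p;
	int i;

	assert(list != NULL && prios != NULL);
	if (strlen(list) >= sizeof(copy))
		return -EINVAL;
	strcpy(copy, list);

	for (i = 0; i < N_METHODS; i++)
		next[i] = prios[i] ? DISABLED : 0;

	/* An empty string disables every method. */
	p = copy[0] ? copy : NULL;
	while (p) {
		char *comma = strchr(p, ',');
		char *name = p;

		if (comma)
			*comma = '\0';
		p = comma ? comma + 1 : NULL;

		for (i = 0; i < N_METHODS; i++) {
			if (next[i] && strcmp(name, method_names[i]) == 0) {
				next[i] = prio--;
				break;
			}
		}
		if (i == N_METHODS)
			return -EINVAL;	/* unknown or unsupported name */
	}

	memcpy(prios, next, sizeof(next));
	return 0;
}
```

Building the result in a scratch array and copying it only on success
keeps the current ordering intact when the input is rejected.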
Reviewed-by: Alex Williamson <[email protected]>
Reviewed-by: Raphael Norwitz <[email protected]>
Co-developed-by: Alex Williamson <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Amey Narkhede <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 16 +++++
drivers/pci/pci-sysfs.c | 91 ++++++++++++++++++++++++-
2 files changed, 104 insertions(+), 3 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 25c9c3977..36fba7ebf 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -121,6 +121,22 @@ Description:
child buses, and re-discover devices removed earlier
from this part of the device tree.
+What: /sys/bus/pci/devices/.../reset_method
+Date: March 2021
+Contact: Amey Narkhede <[email protected]>
+Description:
+ Some devices allow an individual function to be reset
+ without affecting other functions in the same slot.
+ For devices that have this support, a file named reset_method
+ will be present in sysfs. Reading this file will give names
+ of the device supported reset methods and their ordering.
+ Writing the name or comma separated list of names of any of
+ the device supported reset methods to this file will set the
+ reset methods and their ordering to be used when resetting
+ the device. Writing empty string to this file will disable
+ ability to reset the device and writing "default" will return
+ to the original value.
+
What: /sys/bus/pci/devices/.../reset
Date: July 2009
Contact: Michael S. Tsirkin <[email protected]>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 388895099..cf2f66270 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1304,6 +1304,84 @@ static const struct bin_attribute pcie_config_attr = {
.write = pci_write_config,
};
+static ssize_t reset_method_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ ssize_t len = 0;
+ int i, prio;
+
+ for (prio = PCI_RESET_FN_METHODS; prio; prio--) {
+ for (i = 0; i < PCI_RESET_FN_METHODS; i++) {
+ if (prio == pdev->reset_methods[i]) {
+ len += sysfs_emit_at(buf, len, "%s%s",
+ len ? "," : "",
+ pci_reset_fn_methods[i].name);
+ break;
+ }
+ }
+
+ if (i == PCI_RESET_FN_METHODS)
+ break;
+ }
+
+ return len;
+}
+
+static ssize_t reset_method_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ u8 reset_methods[PCI_RESET_FN_METHODS];
+ struct pci_dev *pdev = to_pci_dev(dev);
+ u8 prio = PCI_RESET_FN_METHODS;
+ char *name;
+ int i;
+
+ /*
+ * Initialize reset_method such that 0xff indicates
+ * supported but not currently enabled reset methods
+ * as we only use priority values which are within
+ * the range of PCI_RESET_FN_METHODS array size
+ */
+ for (i = 0; i < PCI_RESET_FN_METHODS; i++)
+ reset_methods[i] = pdev->reset_methods[i] ? 0xff : 0;
+
+ if (sysfs_streq(buf, "")) {
+ pci_warn(pdev, "All device reset methods disabled by user");
+ goto set_reset_methods;
+ }
+
+ if (sysfs_streq(buf, "default")) {
+ for (i = 0; i < PCI_RESET_FN_METHODS; i++)
+ reset_methods[i] = reset_methods[i] ? prio-- : 0;
+ goto set_reset_methods;
+ }
+
+ while ((name = strsep((char **)&buf, ",")) != NULL) {
+ for (i = 0; i < PCI_RESET_FN_METHODS; i++) {
+ if (reset_methods[i] &&
+ sysfs_streq(name, pci_reset_fn_methods[i].name)) {
+ reset_methods[i] = prio--;
+ break;
+ }
+ }
+ if (i == PCI_RESET_FN_METHODS)
+ return -EINVAL;
+ }
+
+ if (reset_methods[0] &&
+ reset_methods[0] != PCI_RESET_FN_METHODS)
+ pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
+
+set_reset_methods:
+ memcpy(pdev->reset_methods, reset_methods, sizeof(reset_methods));
+ return count;
+}
+
+static DEVICE_ATTR_RW(reset_method);
+
static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
@@ -1337,11 +1415,16 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
if (pci_reset_supported(dev)) {
retval = device_create_file(&dev->dev, &dev_attr_reset);
if (retval)
- goto error;
+ goto err_reset;
+ retval = device_create_file(&dev->dev, &dev_attr_reset_method);
+ if (retval)
+ goto err_method;
}
return 0;
-error:
+err_method:
+ device_remove_file(&dev->dev, &dev_attr_reset);
+err_reset:
pcie_vpd_remove_sysfs_dev_files(dev);
return retval;
}
@@ -1417,8 +1500,10 @@ int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev)
static void pci_remove_capabilities_sysfs(struct pci_dev *dev)
{
pcie_vpd_remove_sysfs_dev_files(dev);
- if (pci_reset_supported(dev))
+ if (pci_reset_supported(dev)) {
device_remove_file(&dev->dev, &dev_attr_reset);
+ device_remove_file(&dev->dev, &dev_attr_reset_method);
+ }
}
/**
--
2.31.1
Introduce a new array, reset_methods, in struct pci_dev
to keep track of the reset mechanisms supported by the
device and their ordering. Also refactor the probing and reset
functions to take advantage of the calling convention of the reset
functions.
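The byte-array encoding can be sketched in userspace as follows. The
sketch assumes the scheme used in this patch: one byte per method,
indexed in table order, holding 0 for "unsupported" or a priority where
a larger value is tried earlier. init_priorities() and nth_method() are
hypothetical names, and the probe results are passed in as plain integers.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define N_METHODS 5	/* role of PCI_RESET_FN_METHODS in this patch */

/* Index order follows pci_reset_fn_methods[] in the patch. */
static const char *const method_names[N_METHODS] = {
	"device_specific", "flr", "af_flr", "pm", "bus",
};

/*
 * Build priority bytes from per-method probe results: supported methods
 * get descending priorities starting at N_METHODS, unsupported ones stay
 * 0, and an error other than -ENOTTY stops further probing.
 */
static void init_priorities(const int probe_rc[N_METHODS],
			    unsigned char prios[N_METHODS])
{
	unsigned char prio = N_METHODS;
	int i;

	assert(probe_rc != NULL && prios != NULL);
	memset(prios, 0, N_METHODS);
	for (i = 0; i < N_METHODS; i++) {
		if (probe_rc[i] == 0)
			prios[i] = prio--;
		else if (probe_rc[i] != -ENOTTY)
			break;
	}
}

/* Name of the k-th method to try (k = 0 is tried first), or NULL. */
static const char *nth_method(const unsigned char prios[N_METHODS], int k)
{
	int i;

	assert(prios != NULL && k >= 0);
	if (k >= N_METHODS)
		return NULL;
	for (i = 0; i < N_METHODS; i++)
		if (prios[i] == N_METHODS - k)
			return method_names[i];
	return NULL;
}
```

For a device that supports only FLR, PM reset and bus reset, the array
would hold { 0, 5, 0, 4, 3 }, and the methods would be tried in the
order flr, pm, bus.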
Reviewed-by: Alex Williamson <[email protected]>
Reviewed-by: Raphael Norwitz <[email protected]>
Co-developed-by: Alex Williamson <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Amey Narkhede <[email protected]>
---
drivers/pci/pci.c | 107 ++++++++++++++++++++++++++------------------
drivers/pci/pci.h | 10 ++++-
drivers/pci/probe.c | 5 +--
include/linux/pci.h | 7 +++
4 files changed, 82 insertions(+), 47 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b998d6ad3..ca46a55c7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -72,6 +72,14 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
msleep(delay);
}
+bool pci_reset_supported(struct pci_dev *dev)
+{
+ u8 null_reset_methods[PCI_RESET_FN_METHODS] = { 0 };
+
+ return memcmp(null_reset_methods,
+ dev->reset_methods, PCI_RESET_FN_METHODS);
+}
+
#ifdef CONFIG_PCI_DOMAINS
int pci_domains_supported = 1;
#endif
@@ -5068,6 +5076,19 @@ static void pci_dev_restore(struct pci_dev *dev)
err_handler->reset_done(dev);
}
+/*
+ * The ordering for functions in pci_reset_fn_methods
+ * is required for reset_methods byte array defined
+ * in struct pci_dev
+ */
+const struct pci_reset_fn_method pci_reset_fn_methods[] = {
+ { .reset_fn = &pci_dev_specific_reset, .name = "device_specific" },
+ { .reset_fn = &pcie_reset_flr, .name = "flr" },
+ { .reset_fn = &pci_af_flr, .name = "af_flr" },
+ { .reset_fn = &pci_pm_reset, .name = "pm" },
+ { .reset_fn = &pci_reset_bus_function, .name = "bus" },
+};
+
/**
* __pci_reset_function_locked - reset a PCI device function while holding
* the @dev mutex lock.
@@ -5090,65 +5111,65 @@ static void pci_dev_restore(struct pci_dev *dev)
*/
int __pci_reset_function_locked(struct pci_dev *dev)
{
- int rc;
+ int i, rc = -ENOTTY;
+ u8 prio;
might_sleep();
- /*
- * A reset method returns -ENOTTY if it doesn't support this device
- * and we should try the next method.
- *
- * If it returns 0 (success), we're finished. If it returns any
- * other error, we're also finished: this indicates that further
- * reset mechanisms might be broken on the device.
- */
- rc = pci_dev_specific_reset(dev, 0);
- if (rc != -ENOTTY)
- return rc;
- rc = pcie_reset_flr(dev, 0);
- if (rc != -ENOTTY)
- return rc;
- rc = pci_af_flr(dev, 0);
- if (rc != -ENOTTY)
- return rc;
- rc = pci_pm_reset(dev, 0);
- if (rc != -ENOTTY)
- return rc;
- return pci_reset_bus_function(dev, 0);
+ for (prio = PCI_RESET_FN_METHODS; prio; prio--) {
+ for (i = 0; i < PCI_RESET_FN_METHODS; i++) {
+ if (dev->reset_methods[i] == prio) {
+ /*
+ * A reset method returns -ENOTTY if it doesn't support this device
+ * and we should try the next method.
+ *
+ * If it returns 0 (success), we're finished. If it returns any
+ * other error, we're also finished: this indicates that further
+ * reset mechanisms might be broken on the device.
+ */
+ rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
+ if (rc != -ENOTTY)
+ return rc;
+ break;
+ }
+ }
+ if (i == PCI_RESET_FN_METHODS)
+ break;
+ }
+ return rc;
}
EXPORT_SYMBOL_GPL(__pci_reset_function_locked);
/**
- * pci_probe_reset_function - check whether the device can be safely reset
- * @dev: PCI device to reset
+ * pci_init_reset_methods - check whether device can be safely reset
+ * and store supported reset mechanisms.
+ * @dev: PCI device to check for reset mechanisms
*
* Some devices allow an individual function to be reset without affecting
* other functions in the same device. The PCI device must be responsive
- * to PCI config space in order to use this function.
+ * to reads and writes to its PCI config space in order to use this function.
*
- * Returns 0 if the device function can be reset or negative if the
- * device doesn't support resetting a single function.
+ * Stores reset mechanisms supported by device in reset_methods byte array
+ * which is a member of struct pci_dev
*/
-int pci_probe_reset_function(struct pci_dev *dev)
+void pci_init_reset_methods(struct pci_dev *dev)
{
- int rc;
+ int i, rc;
+ u8 prio = PCI_RESET_FN_METHODS;
+ u8 reset_methods[PCI_RESET_FN_METHODS] = { 0 };
- might_sleep();
+ BUILD_BUG_ON(ARRAY_SIZE(pci_reset_fn_methods) != PCI_RESET_FN_METHODS);
- rc = pci_dev_specific_reset(dev, 1);
- if (rc != -ENOTTY)
- return rc;
- rc = pcie_reset_flr(dev, 1);
- if (rc != -ENOTTY)
- return rc;
- rc = pci_af_flr(dev, 1);
- if (rc != -ENOTTY)
- return rc;
- rc = pci_pm_reset(dev, 1);
- if (rc != -ENOTTY)
- return rc;
+ might_sleep();
- return pci_reset_bus_function(dev, 1);
+ for (i = 0; i < PCI_RESET_FN_METHODS; i++) {
+ rc = pci_reset_fn_methods[i].reset_fn(dev, 1);
+ if (!rc)
+ reset_methods[i] = prio--;
+ else if (rc != -ENOTTY)
+ break;
+ }
+ memcpy(dev->reset_methods, reset_methods, sizeof(reset_methods));
}
/**
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index ef7c46613..61d09e4dd 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -39,7 +39,7 @@ enum pci_mmap_api {
int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
enum pci_mmap_api mmap_api);
-int pci_probe_reset_function(struct pci_dev *dev);
+void pci_init_reset_methods(struct pci_dev *dev);
int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
@@ -612,6 +612,14 @@ struct pci_dev_reset_methods {
int (*reset)(struct pci_dev *dev, int probe);
};
+typedef int (*pci_reset_fn_t)(struct pci_dev *, int);
+
+struct pci_reset_fn_method {
+ pci_reset_fn_t reset_fn;
+ char *name;
+};
+
+extern const struct pci_reset_fn_method pci_reset_fn_methods[];
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_reset(struct pci_dev *dev, int probe);
#else
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 953f15abc..c5cfdd239 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2403,9 +2403,8 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_rcec_init(dev); /* Root Complex Event Collector */
pcie_report_downtraining(dev);
-
- if (pci_probe_reset_function(dev) == 0)
- dev->reset_fn = 1;
+ pci_init_reset_methods(dev);
+ dev->reset_fn = pci_reset_supported(dev);
}
/*
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8d20e51ab..5c5925ecf 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -49,6 +49,8 @@
PCI_STATUS_SIG_TARGET_ABORT | \
PCI_STATUS_PARITY)
+#define PCI_RESET_FN_METHODS 5
+
/*
* The PCI interface treats multi-function devices as independent
* devices. The slot/function address of each device is encoded
@@ -506,6 +508,10 @@ struct pci_dev {
char *driver_override; /* Driver name to force a match */
unsigned long priv_flags; /* Private flags for the PCI driver */
+ /*
+ * See pci_reset_fn_methods array in pci.c for ordering
+ */
+ u8 reset_methods[PCI_RESET_FN_METHODS]; /* Array for storing ordering of reset methods */
};
static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
@@ -1219,6 +1225,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
void pcie_print_link_status(struct pci_dev *dev);
int pcie_reset_flr(struct pci_dev *dev, int probe);
int pcie_flr(struct pci_dev *dev);
+bool pci_reset_supported(struct pci_dev *dev);
int __pci_reset_function_locked(struct pci_dev *dev);
int pci_reset_function(struct pci_dev *dev);
int pci_reset_function_locked(struct pci_dev *dev);
--
2.31.1
From: Shanker Donthineni <sdonthineni () nvidia ! com>
_RST is a standard method specified in the ACPI specification. It
provides a function level reset when it is described in the acpi_device
context associated with the PCI device.
Implement a new reset function, pci_dev_acpi_reset(), that probes for the
_RST method and executes it if it is defined in the firmware. The ACPI binding
information is available only after calling device_add(), so move
pci_init_reset_methods() to the end of pci_device_add().
The default priority of the ACPI reset is set below device-specific
and above hardware resets.
Signed-off-by: Shanker Donthineni <[email protected]>
---
drivers/pci/pci.c | 30 ++++++++++++++++++++++++++++++
drivers/pci/probe.c | 2 +-
include/linux/pci.h | 2 +-
3 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 664cf2d35..d39dba590 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5076,6 +5076,35 @@ static void pci_dev_restore(struct pci_dev *dev)
err_handler->reset_done(dev);
}
+/**
+ * pci_dev_acpi_reset - do a function level reset using _RST method
+ * @dev: device to reset
+ * @probe: check if _RST method is included in the acpi_device context.
+ */
+static int pci_dev_acpi_reset(struct pci_dev *dev, int probe)
+{
+#ifdef CONFIG_ACPI
+ acpi_handle handle = ACPI_HANDLE(&dev->dev);
+
+ /* Return -ENOTTY if _RST method is not included in the dev context */
+ if (!handle || !acpi_has_method(handle, "_RST"))
+ return -ENOTTY;
+
+ /* Return 0 for probe phase indicating that we can reset this device */
+ if (probe)
+ return 0;
+
+ /* Invoke _RST() method to perform a function level reset */
+ if (ACPI_FAILURE(acpi_evaluate_object(handle, "_RST", NULL, NULL))) {
+ pci_warn(dev, "Failed to reset the device\n");
+ return -EINVAL;
+ }
+ return 0;
+#else
+ return -ENOTTY;
+#endif
+}
+
/*
* The ordering for functions in pci_reset_fn_methods
* is required for reset_methods byte array defined
@@ -5083,6 +5112,7 @@ static void pci_dev_restore(struct pci_dev *dev)
*/
const struct pci_reset_fn_method pci_reset_fn_methods[] = {
{ .reset_fn = &pci_dev_specific_reset, .name = "device_specific" },
+ { .reset_fn = &pci_dev_acpi_reset, .name = "acpi" },
{ .reset_fn = &pcie_reset_flr, .name = "flr" },
{ .reset_fn = &pci_af_flr, .name = "af_flr" },
{ .reset_fn = &pci_pm_reset, .name = "pm" },
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4764e031a..d4becd6ff 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2403,7 +2403,6 @@ static void pci_init_capabilities(struct pci_dev *dev)
pci_rcec_init(dev); /* Root Complex Event Collector */
pcie_report_downtraining(dev);
- pci_init_reset_methods(dev);
}
/*
@@ -2494,6 +2493,7 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
dev->match_driver = false;
ret = device_add(&dev->dev);
WARN_ON(ret < 0);
+ pci_init_reset_methods(dev);
}
struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9f8347799..b4a5d2146 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -49,7 +49,7 @@
PCI_STATUS_SIG_TARGET_ABORT | \
PCI_STATUS_PARITY)
-#define PCI_RESET_FN_METHODS 5
+#define PCI_RESET_FN_METHODS 6
/*
* The PCI interface treats multi-function devices as independent
--
2.31.1
From: Shanker Donthineni <sdonthineni () nvidia ! com>
On select platforms, some Nvidia GPU devices do not work with SBR.
Triggering SBR would leave the device inoperable for the current
system boot. It requires a system hard-reboot to get the GPU device
back to normal operating condition post-SBR. For the affected
devices, enable NO_BUS_RESET quirk to fix the issue.
This issue will be fixed in the next generation of hardware.
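The device ID match used by the quirk in this patch masks off the low
six bits of the device ID, so it covers a contiguous block of 64 IDs.
The sketch below only illustrates that arithmetic; QUIRK_MASK,
QUIRK_BASE, nvidia_sbr_quirk_applies() and quirk_id_range() are
hypothetical names, while the constants 0xffc0 and 0x2340 are taken
from the quirk itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUIRK_MASK 0xffc0u	/* constant from the quirk in this patch */
#define QUIRK_BASE 0x2340u	/* constant from the quirk in this patch */

/* Same comparison as the quirk: ignore the low six bits of the ID. */
static int nvidia_sbr_quirk_applies(uint16_t device)
{
	return (device & QUIRK_MASK) == QUIRK_BASE;
}

/* Compute the inclusive range of device IDs that the mask covers. */
static void quirk_id_range(uint16_t *first, uint16_t *last)
{
	assert(first != NULL && last != NULL);
	*first = (uint16_t)QUIRK_BASE;
	*last = (uint16_t)(QUIRK_BASE | (uint16_t)~QUIRK_MASK);
}
```

The covered range works out to 0x2340 through 0x237f inclusive.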
Signed-off-by: Shanker Donthineni <[email protected]>
---
drivers/pci/quirks.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 8f47d139c..ceec67342 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3558,6 +3558,18 @@ static void quirk_no_bus_reset(struct pci_dev *dev)
dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
}
+/*
+ * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
+ * prevented for those affected devices.
+ */
+static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
+{
+ if ((dev->device & 0xffc0) == 0x2340)
+ quirk_no_bus_reset(dev);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+ quirk_nvidia_no_bus_reset);
+
/*
* Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset.
* The device will throw a Link Down error on AER-capable systems and
--
2.31.1
Hi Amey,
[...]
> +/*
> + * The ordering for functions in pci_reset_fn_methods
> + * is required for reset_methods byte array defined
> + * in struct pci_dev
> + */
A small nitpick: the comment above is missing a period at the end of the
sentence, and so are other comments. It might be worth adding one for
completeness and consistency.
[...]
> +typedef int (*pci_reset_fn_t)(struct pci_dev *, int);
> +
> +struct pci_reset_fn_method {
> + pci_reset_fn_t reset_fn;
> + char *name;
> +};
Question about the custom type definition above: is it really
needed? If there is only a limited use for it, then perhaps
it would not be worth having one?
Linus also has some preference on usage of custom types, as per:
https://yarchive.net/comp/linux/typedefs.html
But, in the end, this really boils down to a matter of style and/or
preference.
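For comparison, a rough sketch of the same table entry without the
typedef, with the function pointer spelled out inline. struct pci_dev is
only forward declared here, the name member is made const for the sake of
the example, and the stub function and one-entry table are made up for
illustration; none of this is taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct pci_dev;	/* opaque here; the real one lives in include/linux/pci.h */

/* Same two members as in the patch, minus the pci_reset_fn_t typedef. */
struct pci_reset_fn_method {
	int (*reset_fn)(struct pci_dev *dev, int probe);
	const char *name;
};

/* Hypothetical stub standing in for a real reset method. */
static int example_reset(struct pci_dev *dev, int probe)
{
	(void)dev;
	return probe ? 0 : 1;
}

/* Hypothetical table, only to show that the initializer reads the same. */
static const struct pci_reset_fn_method example_methods[] = {
	{ example_reset, "example" },
};
```

The typedef mostly pays off if the pointer type is repeated in several
prototypes; with a single user, the inline form is arguably just as
readable.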
[...]
> +#define PCI_RESET_FN_METHODS 5
Not sure if worth changing name of this constant, but what about the
following:
#define PCI_RESET_FN_METHODS_NUM 5
Or even perhaps:
#define PCI_RESET_METHODS_NUM 5
So it's a little bit more self-explanatory. This would follow the same
naming pattern as:
https://elixir.bootlin.com/linux/v5.13-rc2/source/include/linux/pci.h#L115
[...]
> + u8 reset_methods[PCI_RESET_FN_METHODS]; /* Array for storing ordering of reset methods */
This comment reads somewhat awkwardly - we know that an array would
most likely be used for storing things, so what about the following:
/* Reset methods ordered by priority */
Just a suggestion, though.
Krzysztof
Hi Amey,
Thank you for working on this! A few comments and suggestions below.
[...]
> Link: https://lkml.org/lkml/2021/3/23/911
Linking to lkml.org is fine, however linking to lore has become the
convention now, so this would be:
https://lore.kernel.org/lkml/[email protected]/
I personally find it a bit easier to read on lore compared to lkml.org
when it comes to large and long-running threads.
[...]
> +int pci_reset_bus_function(struct pci_dev *dev, int probe)
> +{
> + int rc = pci_dev_reset_slot_function(dev, probe);
> +
> + if (rc != -ENOTTY)
> + return rc;
> + return pci_parent_bus_reset(dev, probe);
> +}
Depends on the style, but I would suggest using a boolean type for the
probe argument here and in the other functions that enable or disable
something. It makes the intent clear, and this is also a popular pattern
you can see throughout the PCI tree.
Also, I would suggest adding a blank line to separate the final return,
so that it's easier to read the code, and to keep things consistent.
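Something along these lines, perhaps. This is only a userspace sketch of
the shape I have in mind: the fake_pci_dev structure and the two fake_*
helpers are stand-ins invented for the example, not the real
pci_dev_reset_slot_function() and pci_parent_bus_reset():

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev, just enough for the stubs. */
struct fake_pci_dev {
	bool has_slot;		/* is a slot reset available? */
	int bus_resets;		/* count of real (non-probe) bus resets */
};

/* Stub: succeeds only when the device sits in a resettable slot. */
static int fake_slot_reset(struct fake_pci_dev *dev, bool probe)
{
	(void)probe;
	return dev->has_slot ? 0 : -ENOTTY;
}

/* Stub: always supported; only counts a reset when not probing. */
static int fake_parent_bus_reset(struct fake_pci_dev *dev, bool probe)
{
	if (!probe)
		dev->bus_resets++;
	return 0;
}

/* Try the slot reset first and fall back to the parent bus reset. */
static int example_reset_bus_function(struct fake_pci_dev *dev, bool probe)
{
	int rc = fake_slot_reset(dev, probe);

	if (rc != -ENOTTY)
		return rc;

	return fake_parent_bus_reset(dev, probe);
}
```

Callers would then pass true or false rather than 1 or 0, which makes it
obvious at the call site whether a reset is actually being performed.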
[...]
> - rc = pci_dev_reset_slot_function(dev, 0);
> - if (rc != -ENOTTY)
> - return rc;
> - return pci_parent_bus_reset(dev, 0);
> + return pci_reset_bus_function(dev, 0);
See above about using boolean type here.
[...]
> - rc = pci_dev_reset_slot_function(dev, 1);
> if (rc != -ENOTTY)
> return rc;
>
> - return pci_parent_bus_reset(dev, 1);
> + return pci_reset_bus_function(dev, 1);
Same as above.
Krzysztof
Hi Amey,
[...]
> +int pcie_reset_flr(struct pci_dev *dev, int probe)
> +{
> + u32 cap;
> +
> + if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> + return -ENOTTY;
> +
> + pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> + if (!(cap & PCI_EXP_DEVCAP_FLR))
> + return -ENOTTY;
> +
> + if (probe)
> + return 0;
> +
> + return pcie_flr(dev);
> +}
Similarly to my suggestion in the first patch in the series, perhaps
using a boolean here would be an option.
Having said that, the following existing functions aren't doing it, so
for the sake of keeping things consistent it might not be the best
option, as per:
static int pci_af_flr(struct pci_dev *dev, int probe)
int nvme_disable_and_flr(struct pci_dev *dev, int probe)
Krzysztof
Hi Amey,
[...]
> + if (sysfs_streq(buf, "")) {
> + pci_warn(pdev, "All device reset methods disabled by user");
> + goto set_reset_methods;
> + }
The sysfs_streq() is nice, indeed.
> + while ((name = strsep((char **)&buf, ",")) != NULL) {
I believe we could make this parsing a little bit more resilient,
especially since we are handling user input and this cannot ever be
really fully trusted, so for example:
	while ((name = strsep((char **)&buf, ","))) {
		if (!strlen(name))	<--- sysfs_streq() could be used here too.
			continue;
		name = strim(name);	<--- remove leading and trailing whitespace, if any.
		(...)
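To make the idea concrete, here is a self-contained userspace sketch of
such a tolerant parser. It is not kernel code: next_field() and trim()
are hypothetical stand-ins for strsep() and strim(), parse_names() is a
made-up name, and the method names used when calling it would be
whatever the attribute ends up accepting. It trims before testing for an
empty name, so whitespace-only entries are skipped as well:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for strsep(&s, ","): cuts the next field off *stringp. */
static char *next_field(char **stringp)
{
	char *start = *stringp;
	char *comma;

	if (start == NULL)
		return NULL;
	comma = strchr(start, ',');
	if (comma != NULL) {
		*comma = '\0';
		*stringp = comma + 1;
	} else {
		*stringp = NULL;
	}
	return start;
}

/* Stand-in for the kernel's strim(): trims in place, returns the start. */
static char *trim(char *s)
{
	char *end;

	while (isspace((unsigned char)*s))
		s++;
	if (*s == '\0')
		return s;
	end = s + strlen(s) - 1;
	while (end > s && isspace((unsigned char)*end))
		*end-- = '\0';
	return s;
}

/*
 * Split a writable, comma separated list into at most max names,
 * skipping empty and whitespace-only entries. Returns the count.
 */
static size_t parse_names(char *buf, char **names, size_t max)
{
	char *name;
	size_t n = 0;

	while (n < max && (name = next_field(&buf)) != NULL) {
		name = trim(name);
		if (*name == '\0')
			continue;
		names[n++] = name;
	}
	return n;
}
```

With this, an input such as " flr,,bus , ,pm" yields three names and an
empty string yields none, which leaves it to the caller to decide what
an empty list means.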
[...]
> + if (reset_methods[0] &&
> + reset_methods[0] != PCI_RESET_FN_METHODS)
> + pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
What would be the difference between disabling and de-prioritizing? Is
there a way for us to distinguish between the two? I was wondering if we
could notify the user separately when the device specific reset is
disabled and when it has been de-prioritized?
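A rough sketch of what I mean follows. It assumes that the byte stored
for the device specific reset is a priority, where zero means the method
is not used at all and the highest value means it is tried first; that
is my reading of the check above and may not match the patch exactly.
The enum, the function name and the EXAMPLE_RESET_METHODS constant are
all made up for the example:

```c
#include <assert.h>

#define EXAMPLE_RESET_METHODS 5	/* assumed: highest possible priority value */

enum dev_specific_state {
	DEV_SPECIFIC_TOP_PRIORITY,
	DEV_SPECIFIC_DEPRIORITIZED,
	DEV_SPECIFIC_DISABLED,
};

/*
 * Classify the priority byte of the device specific reset:
 * 0 is treated as disabled, the maximum as top priority, and
 * anything in between as de-prioritized.
 */
static enum dev_specific_state classify_dev_specific(unsigned char prio)
{
	if (prio == 0)
		return DEV_SPECIFIC_DISABLED;
	if (prio != EXAMPLE_RESET_METHODS)
		return DEV_SPECIFIC_DEPRIORITIZED;
	return DEV_SPECIFIC_TOP_PRIORITY;
}
```

Each of the two non-default states could then get its own pci_warn()
message instead of the combined "disabled/de-prioritized" one.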
Krzysztof
On 21/05/20 05:05PM, Krzysztof Wilczyński wrote:
> Hi Amey,
>
> [...]
> > +int pcie_reset_flr(struct pci_dev *dev, int probe)
> > +{
> > + u32 cap;
> > +
> > + if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > + return -ENOTTY;
> > +
> > + pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > + if (!(cap & PCI_EXP_DEVCAP_FLR))
> > + return -ENOTTY;
> > +
> > + if (probe)
> > + return 0;
> > +
> > + return pcie_flr(dev);
> > +}
>
> Similarly to my suggestion in the first patch in the series, perhaps
> using a boolean here would be an option.
>
> Having said that, the following existing functions aren't doing it, so
> for the sake of keeping things consistent it might not be the best
> option, as per:
>
> static int pci_af_flr(struct pci_dev *dev, int probe)
> int nvme_disable_and_flr(struct pci_dev *dev, int probe)
>
> Krzysztof
All the functions which implement different types of resets, including
quirks, have the ...reset(struct pci_dev *dev, int probe) signature.
Should I modify all of them?
Thanks,
Amey
On 21/05/25 05:17PM, Krzysztof Wilczyński wrote:
> Hi Amey,
>
> Sorry for the late reply!
>
> [...]
> > > Similarly to my suggestion in the first patch in the series, perhaps
> > > using a boolean here would be an option.
> > >
> > > Having said that, the following existing functions aren't doing it, so
> > > for the sake of keeping things consistent it might not be the best
> > > option, as per:
> > >
> > > static int pci_af_flr(struct pci_dev *dev, int probe)
> > > int nvme_disable_and_flr(struct pci_dev *dev, int probe)
> > >
> > > Krzysztof
> >
> > All the functions which implement different types of resets including
> > quirks have ...reset(struct pci_dev *dev, int probe) signature.
> > Should I modify all of them?
>
> Might not be worth it to change anything then, especially if the other
> functions there already use an integer argument to enable or disable the
> probe or something else. At least not in this series.
>
> Krzysztof
Actually I made a new separate patch at the end to implement this change.
I'll send v3 soon.
Thanks,
Amey
Hi Amey,
Sorry for the late reply!
[...]
> > Similarly to my suggestion in the first patch in the series, perhaps
> > using a boolean here would be an option.
> >
> > Having said that, the following existing functions aren't doing it, so
> > for the sake of keeping things consistent it might not be the best
> > option, as per:
> >
> > static int pci_af_flr(struct pci_dev *dev, int probe)
> > int nvme_disable_and_flr(struct pci_dev *dev, int probe)
> >
> > Krzysztof
>
> All the functions which implement different types of resets including
> quirks have ...reset(struct pci_dev *dev, int probe) signature.
> Should I modify all of them?
Might not be worth it to change anything then, especially if the other
functions there already use an integer argument to enable or disable the
probe or something else. At least not in this series.
Krzysztof