2017-08-18 21:32:00

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v11 0/4] PCI: Use CRS Software Visibility to wait for device to become ready

This is another rev of Sinan's series to fix an FLR issue seen with an
Intel 750 NVMe drive. It seems that the 750 requires more time than we
currently allow after the FLR to become ready for use.

The biggest remaining issue is that I *think* this does not fix the issue
on systems where CRS Software Visibility is not supported or disabled.

Differences from v10:
- Move 0x0001 test inside pci_bus_wait_crs()
- Fix pre-existing "valid response before timeout" bug
- Add message if a device becomes ready after we printed "waiting"
- Reorder messages so we print "giving up after X" instead of
"not ready after X; giving up after X"

---

Bjorn Helgaas (1):
PCI: Don't ignore valid response before CRS timeout

Sinan Kaya (3):
PCI: Factor out pci_bus_wait_crs()
PCI: Handle CRS ("device not ready") returned by device after FLR
PCI: Warn periodically while waiting for device to become ready


drivers/pci/pci.c | 19 ++++++++++++++--
drivers/pci/pci.h | 1 +
drivers/pci/probe.c | 61 +++++++++++++++++++++++++++++++++------------------
3 files changed, 57 insertions(+), 24 deletions(-)


Incremental diff from Sinan's v10 posting:

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c853551bc8d1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3824,22 +3824,18 @@ static void pci_flr_wait(struct pci_dev *dev)
bool ret;

/*
- * Don't touch the HW before waiting 100ms. HW has to finish
- * within 100ms according to PCI Express Base Specification
- * Revision 3.1 Section 6.6.2: Function-Level Reset (FLR).
+ * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+ * 100ms, but even after that, it may respond to config requests
+ * with CRS status if it requires more time.
*/
msleep(100);

- if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID,
- &id))
+ if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
return;

- /* See if we can find a device via CRS first. */
- if ((id & 0xffff) == 0x0001) {
- ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
- if (ret)
- return;
- }
+ ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+ if (ret)
+ return;

do {
msleep(100);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1bbe85190183..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,8 +235,7 @@ enum pci_bar_type {
pci_bar_mem64, /* A 64-bit memory BAR */
};

-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
- int crs_timeout);
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int crs_timeout);
int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f4cf7d0e25e..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,42 +1824,50 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_alloc_dev);

-bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
{
int delay = 1;

+ if ((*l & 0xffff) != 0x0001)
+ return true; /* not a CRS completion */
+
+ if (!timeout)
+ return false; /* CRS, but caller doesn't want to wait */
+
/*
- * Configuration Request Retry Status. Some root ports return the
- * actual device ID instead of the synthetic ID (0xFFFF) required
- * by the PCIe spec. Ignore the device ID and only check for
- * (vendor id == 1).
+ * We got the reserved Vendor ID that indicates a completion with
+ * Configuration Request Retry Status (CRS). Retry until we get a
+ * valid Vendor ID or we time out.
*/
- do {
+ while ((*l & 0xffff) == 0x0001) {
msleep(delay);
delay *= 2;
- if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+
+ if (delay > timeout) {
+ pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+ pci_domain_nr(bus), bus->number,
+ PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
return false;
+ }

if (delay >= 1000)
- pr_info("pci %04x:%02x:%02x.%d: not responding since %dms still polling\n",
+ pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
pci_domain_nr(bus), bus->number,
- PCI_SLOT(devfn), PCI_FUNC(devfn), delay);
+ PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);

- /* Card hasn't responded in 60 seconds? Must be stuck. */
- if (delay > crs_timeout) {
- pr_warn("pci %04x:%02x:%02x.%d: not responding %dms timeout reached\n",
- pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
- PCI_FUNC(devfn), crs_timeout);
+ if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
return false;
- }
- } while ((*l & 0xffff) == 0x0001);
+ }

+ if (delay >= 1000)
+ pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+ pci_domain_nr(bus), bus->number,
+ PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
return true;
}
-EXPORT_SYMBOL(pci_bus_wait_crs);

bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
- int crs_timeout)
+ int timeout)
{
if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
return false;
@@ -1869,20 +1877,7 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
*l == 0x0000ffff || *l == 0xffff0000)
return false;

- /*
- * Configuration Request Retry Status. Some root ports return the
- * actual device ID instead of the synthetic ID (0xFFFF) required
- * by the PCIe spec. Ignore the device ID and only check for
- * (vendor id == 1).
- */
- if ((*l & 0xffff) == 0x0001) {
- if (!crs_timeout)
- return false;
-
- return pci_bus_wait_crs(bus, devfn, l, crs_timeout);
- }
-
- return true;
+ return pci_bus_wait_crs(bus, devfn, l, timeout);
}
EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);



2017-08-18 21:32:08

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout

While waiting for a device to become ready (i.e., to return a non-CRS
completion to a read of its Vendor ID), if we got a valid response to the
very last read before timing out, we printed a warning and gave up on the
device even though it was actually ready.

For a typical 60s timeout, we wait about 65s (it's not exact because of the
exponential backoff), but we treated devices that became ready between 33s
and 65s as though they failed.

Move the Device ID read later so we check whether the device is ready
immediately, before checking for a timeout.

Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/probe.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c31310db0404..08ea844ac4ba 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,

msleep(delay);
delay *= 2;
- if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
- return false;
- /* Card hasn't responded in 60 seconds? Must be stuck. */
+
if (delay > crs_timeout) {
printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn));
return false;
}
+
+ if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+ return false;
}

return true;

2017-08-18 21:32:15

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

From: Sinan Kaya <[email protected]>

Configuration Request Retry Status (CRS) was previously hidden inside
pci_bus_read_dev_vendor_id(). We want to add support for CRS in other
situations, such as waiting for a device to become ready after a Function
Level Reset.

Move CRS handling into pci_bus_wait_crs() so it can be called from other
places.

Signed-off-by: Sinan Kaya <[email protected]>
[bhelgaas: move CRS reserved Vendor ID test into pci_bus_wait_crs(), remove
EXPORT_SYMBOL]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pci.h | 1 +
drivers/pci/probe.c | 39 +++++++++++++++++++++++----------------
2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 22e061738c6f..b0857052c04a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -235,6 +235,7 @@ enum pci_bar_type {
pci_bar_mem64, /* A 64-bit memory BAR */
};

+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout);
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int crs_timeout);
int pci_setup_device(struct pci_dev *dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 08ea844ac4ba..342a86640c6b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1824,33 +1824,26 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_alloc_dev);

-bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
- int crs_timeout)
+bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
{
int delay = 1;

- if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
- return false;
+ if ((*l & 0xffff) != 0x0001)
+ return true; /* not a CRS completion */

- /* some broken boards return 0 or ~0 if a slot is empty: */
- if (*l == 0xffffffff || *l == 0x00000000 ||
- *l == 0x0000ffff || *l == 0xffff0000)
- return false;
+ if (!timeout)
+ return false; /* CRS, but caller doesn't want to wait */

/*
- * Configuration Request Retry Status. Some root ports return the
- * actual device ID instead of the synthetic ID (0xFFFF) required
- * by the PCIe spec. Ignore the device ID and only check for
- * (vendor id == 1).
+ * We got the reserved Vendor ID that indicates a completion with
+ * Configuration Request Retry Status (CRS). Retry until we get a
+ * valid Vendor ID or we time out.
*/
while ((*l & 0xffff) == 0x0001) {
- if (!crs_timeout)
- return false;
-
msleep(delay);
delay *= 2;

- if (delay > crs_timeout) {
+ if (delay > timeout) {
printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn));
@@ -1863,6 +1856,20 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,

return true;
}
+
+bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
+ int timeout)
+{
+ if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
+ return false;
+
+ /* some broken boards return 0 or ~0 if a slot is empty: */
+ if (*l == 0xffffffff || *l == 0x00000000 ||
+ *l == 0x0000ffff || *l == 0xffff0000)
+ return false;
+
+ return pci_bus_wait_crs(bus, devfn, l, timeout);
+}
EXPORT_SYMBOL(pci_bus_read_dev_vendor_id);

/*

2017-08-18 21:32:21

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v11 3/4] PCI: Handle CRS ("device not ready") returned by device after FLR

From: Sinan Kaya <[email protected]>

XXX I think this needs to somehow use the same timeout for the PCI_COMMAND
loop as for the CRS part, so this works on machines without CRS Software
Visibility. -- bhelgaas

Sporadic reset issues have been observed with Intel 750 NVMe drive while
assigning the physical function to the guest machine. The sequence of
events observed is as follows:

- perform a Function Level Reset (FLR)
- sleep up to 1000ms total
- read ~0 from PCI_COMMAND
- warn that the device didn't return from FLR
- touch the device before it's ready
- device drops config writes when we restore register settings
- incomplete register restore leaves device in inconsistent state
- device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests. See PCIe r3.1, sec 2.3.1 and 6.6.2.

After an FLR, read the Vendor ID and use pci_bus_wait_crs() to wait for a
value that indicates the device is ready.

If pci_bus_wait_crs() fails, i.e., the device has responded with CRS status
for at least the timeout interval, fall back to the old behavior of reading
the Command register until it is not ~0.

Signed-off-by: Sinan Kaya <[email protected]>
[bhelgaas: changelog, adjust for Vendor ID test being inside
pci_bus_wait_crs(), drop PCI_COMMAND tweaks]
Not-Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/pci.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index af0cc3456dc1..34c0aa1f37aa 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3821,17 +3821,32 @@ static void pci_flr_wait(struct pci_dev *dev)
{
int i = 0;
u32 id;
+ bool ret;
+
+ /*
+ * Per PCIe r3.1, sec 6.6.2, the device should finish FLR within
+ * 100ms, but even after that, it may respond to config requests
+ * with CRS status if it requires more time.
+ */
+ msleep(100);
+
+ if (pci_bus_read_config_dword(dev->bus, dev->devfn, PCI_VENDOR_ID, &id))
+ return;
+
+ ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+ if (ret)
+ return;

do {
msleep(100);
pci_read_config_dword(dev, PCI_COMMAND, &id);
- } while (i++ < 10 && id == ~0);
+ } while (i++ < 9 && id == ~0);

if (id == ~0)
dev_warn(&dev->dev, "Failed to return from FLR\n");
else if (i > 1)
dev_info(&dev->dev, "Required additional %dms to return from FLR\n",
- (i - 1) * 100);
+ i * 100);
}

/**

2017-08-18 21:32:29

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH v11 4/4] PCI: Warn periodically while waiting for device to become ready

From: Sinan Kaya <[email protected]>

Add a print statement in pci_bus_wait_crs() so that user observes the
progress of device polling instead of silently waiting for timeout to be
reached.

Signed-off-by: Sinan Kaya <[email protected]>
[bhelgaas: check for timeout first so we don't print "waiting, giving up",
always print time we've slept (not the actual timeout, print a "ready"
message if we've printed a "waiting" message]
Signed-off-by: Bjorn Helgaas <[email protected]>
---
drivers/pci/probe.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 342a86640c6b..99799cc1de22 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1844,16 +1844,25 @@ bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l, int timeout)
delay *= 2;

if (delay > timeout) {
- printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
- pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
- PCI_FUNC(devfn));
+ pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
+ pci_domain_nr(bus), bus->number,
+ PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
return false;
}

+ if (delay >= 1000)
+ pr_info("pci %04x:%02x:%02x.%d: not ready after %dms; waiting\n",
+ pci_domain_nr(bus), bus->number,
+ PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
+
if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
return false;
}

+ if (delay >= 1000)
+ pr_info("pci %04x:%02x:%02x.%d: ready after %dms\n",
+ pci_domain_nr(bus), bus->number,
+ PCI_SLOT(devfn), PCI_FUNC(devfn), delay - 1);
return true;
}


2017-08-21 13:54:01

by Sinan Kaya

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> + if ((*l & 0xffff) != 0x0001)
> + return true; /* not a CRS completion */
>

This version certainly looks cleaner. However, it breaks pci_flr_wait().

If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
function returns true. pci_flr_wait() prematurely bails out from here.


pci_flr_wait()
{

+ ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
+ if (ret)
+ return;

}

We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
function.

That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
function that would check for the magic 0x0001 value and return true. Otherwise, false.

pci_bus_read_dev_vendor_id() would do this

pci_bus_read_dev_vendor_id()
{
...
if (pci_bus_crs_visibility_supported())
return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

return true
}

Similar pattern for pci_flr_wait().


--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

2017-08-21 14:03:00

by Sinan Kaya

[permalink] [raw]
Subject: Re: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout

On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> While waiting for a device to become ready (i.e., to return a non-CRS
> completion to a read of its Vendor ID), if we got a valid response to the
> very last read before timing out, we printed a warning and gave up on the
> device even though it was actually ready.
>
> For a typical 60s timeout, we wait about 65s (it's not exact because of the
> exponential backoff), but we treated devices that became ready between 33s
> and 65s as though they failed.
>
> Move the Device ID read later so we check whether the device is ready
> immediately, before checking for a timeout.
>
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/pci/probe.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c31310db0404..08ea844ac4ba 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
>
> msleep(delay);
> delay *= 2;
> - if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> - return false;
> - /* Card hasn't responded in 60 seconds? Must be stuck. */
> +

There is still a problem here. We'll wait some time above and return without checking if
we actually found the card or not.

> if (delay > crs_timeout) {
> printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> PCI_FUNC(devfn));
> return false;
> }
> +
> + if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> + return false;
> }
>
> return true;
>
>

Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
by copy paste.

while ((*l & 0xffff) == 0x0001) {
if (!crs_timeout)
return false;

/* Card hasn't responded in 60 seconds? Must be stuck. */
if (delay > crs_timeout) {
printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
PCI_FUNC(devfn));
return false;
}
msleep(delay);
delay *= 2;
if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
return false;
}

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

2017-08-21 17:44:51

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v11 1/4] PCI: Don't ignore valid response before CRS timeout

On Mon, Aug 21, 2017 at 10:02:55AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > While waiting for a device to become ready (i.e., to return a non-CRS
> > completion to a read of its Vendor ID), if we got a valid response to the
> > very last read before timing out, we printed a warning and gave up on the
> > device even though it was actually ready.
> >
> > For a typical 60s timeout, we wait about 65s (it's not exact because of the
> > exponential backoff), but we treated devices that became ready between 33s
> > and 65s as though they failed.
> >
> > Move the Device ID read later so we check whether the device is ready
> > immediately, before checking for a timeout.
> >
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > drivers/pci/probe.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index c31310db0404..08ea844ac4ba 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1849,15 +1849,16 @@ bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
> >
> > msleep(delay);
> > delay *= 2;
> > - if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > - return false;
> > - /* Card hasn't responded in 60 seconds? Must be stuck. */
> > +
>
> There is still a problem here. We'll wait some time above and return without checking if
> we actually found the card or not.
>
> > if (delay > crs_timeout) {
> > printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> > pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> > PCI_FUNC(devfn));
> > return false;
> > }
> > +
> > + if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> > + return false;
> > }
> >
> > return true;
> >
> >
>
> Here is another shot at it. Sorry, I don't have a diff syntax. I made up the function
> by copy paste.

Yep, this is much better, thanks!

> while ((*l & 0xffff) == 0x0001) {
> if (!crs_timeout)
> return false;
>
> /* Card hasn't responded in 60 seconds? Must be stuck. */
> if (delay > crs_timeout) {
> printk(KERN_WARNING "pci %04x:%02x:%02x.%d: not responding\n",
> pci_domain_nr(bus), bus->number, PCI_SLOT(devfn),
> PCI_FUNC(devfn));
> return false;
> }
> msleep(delay);
> delay *= 2;
> if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
> return false;
> }
>
> --
> Sinan Kaya
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2017-08-21 19:18:10

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On Mon, Aug 21, 2017 at 09:53:56AM -0400, Sinan Kaya wrote:
> On 8/18/2017 5:32 PM, Bjorn Helgaas wrote:
> > + if ((*l & 0xffff) != 0x0001)
> > + return true; /* not a CRS completion */
> >
>
> This version certainly looks cleaner. However, it breaks pci_flr_wait().
>
> If some root port doesn't support CRS and returns 0xFFFFFFFF, pci_bus_wait_crs()
> function returns true. pci_flr_wait() prematurely bails out from here.
>
>
> pci_flr_wait()
> {
>
> + ret = pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> + if (ret)
> + return;
>
> }
>
> We can change the return code to false above but then we break pci_bus_read_dev_vendor_id()
> function.
>
> That's why, I was interested in creating a pci_bus_crs_visibility_supported() helper
> function that would check for the magic 0x0001 value and return true. Otherwise, false.
>
> pci_bus_read_dev_vendor_id() would do this
>
> pci_bus_read_dev_vendor_id()
> {
> ...
> if (pci_bus_crs_visibility_supported())
> return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>
> return true
> }
>
> Similar pattern for pci_flr_wait().

I think that makes sense. We'd want to check for CRS SV being
enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
pci_enable_crs() and cache it somewhere. Maybe a crs_sv_enabled bit
in the root port's pci_dev, and check it with something like what
pcie_root_rcb_set() does?

2017-08-21 19:37:11

by Sinan Kaya

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> pci_bus_read_dev_vendor_id()
>> {
>> ...
>> if (pci_bus_crs_visibility_supported())
>> return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>>
>> return true
>> }
>>
>> Similar pattern for pci_flr_wait().

Sorry for the poor choice of function name.

I was thinking of something like this.

bool pci_bus_crs_pending(u32 l)
{
return (l & 0xFFFF) == 0x0001
}

if (pci_bus_crs_pending(id))
return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);

> I think that makes sense. We'd want to check for CRS SV being
> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> pci_enable_crs() and cache it somewhere. Maybe a crs_sv_enabled bit
> in the root port's pci_dev, and check it with something like what
> pcie_root_rcb_set() does?
>

You can observe CRS under the following conditions

1. root port <-> endpoint
2. bridge <-> endpoint
3. root port<->bridge

I was relying on the fact that we are reading 0x001 as an indication that
this device detected CRS. Maybe, this is too indirect.

If we also want to capture the capability, I think the right thing is to
check the parent capability.

bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
{
if (device type(bridge) == root port)
return read(root_crs_register_reg);

if (device type(bridge) == switch)
return read(switch_crs_register);

return false;
}

bool pci_bus_crs_pending(struct pci_dev *dev, u32 l)
{
if !pci_bus_crs_vis_supported(dev->parent)
return false;

return (l & 0xFFFF) == 0x0001;
}

I'll prototype this. Let me know if you have concerns.

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

2017-08-21 20:23:44

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> ...
> if (pci_bus_crs_pending(id))
> return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>
> > I think that makes sense. We'd want to check for CRS SV being
> > enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> > pci_enable_crs() and cache it somewhere. Maybe a crs_sv_enabled bit
> > in the root port's pci_dev, and check it with something like what
> > pcie_root_rcb_set() does?
> >
>
> You can observe CRS under the following conditions
>
> 1. root port <-> endpoint
> 2. bridge <-> endpoint
> 3. root port<->bridge
>
> I was relying on the fact that we are reading 0x001 as an indication that
> this device detected CRS. Maybe, this is too indirect.
>
> If we also want to capture the capability, I think the right thing is to
> check the parent capability.
>
> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> {
> if (device type(bridge) == root port)
> return read(root_crs_register_reg);
>
> if (device type(bridge) == switch)
> return read(switch_crs_register);

I don't understand this part. AFAIK, CRS SV is only a feature of root
ports. The capability and enable bits are in the Root Capabilities
and Root Control registers.

It's certainly true that a device below a switch can respond with a
CRS completion, but the switch is not the requester, and my
understanding is that it would not take any action on the completion
other than passing it upstream.

2017-08-21 20:32:31

by Sinan Kaya

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
>> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
>> ...
>> if (pci_bus_crs_pending(id))
>> return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
>>
>>> I think that makes sense. We'd want to check for CRS SV being
>>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
>>> pci_enable_crs() and cache it somewhere. Maybe a crs_sv_enabled bit
>>> in the root port's pci_dev, and check it with something like what
>>> pcie_root_rcb_set() does?
>>>
>>
>> You can observe CRS under the following conditions
>>
>> 1. root port <-> endpoint
>> 2. bridge <-> endpoint
>> 3. root port<->bridge
>>
>> I was relying on the fact that we are reading 0x001 as an indication that
>> this device detected CRS. Maybe, this is too indirect.
>>
>> If we also want to capture the capability, I think the right thing is to
>> check the parent capability.
>>
>> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
>> {
>> if (device type(bridge) == root port)
>> return read(root_crs_register_reg);
>>
>> if (device type(bridge) == switch)
>> return read(switch_crs_register);
>
> I don't understand this part. AFAIK, CRS SV is only a feature of root
> ports. The capability and enable bits are in the Root Capabilities
> and Root Control registers.
>

No question about it.

> It's certainly true that a device below a switch can respond with a
> CRS completion, but the switch is not the requester, and my
> understanding is that it would not take any action on the completion
> other than passing it upstream.
>

I saw some bridge references in the spec for CRS. I was going to do
some research for it. You answered my question. I was curious how this
would impact the behavior.

"Bridge Configuration Retry Enable – When Set, this bit enables PCI Express
to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
response to Configuration Requests that target devices below the bridge.
Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
further details."

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

2017-08-21 21:09:36

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On Mon, Aug 21, 2017 at 04:32:26PM -0400, Sinan Kaya wrote:
> On 8/21/2017 4:23 PM, Bjorn Helgaas wrote:
> > On Mon, Aug 21, 2017 at 03:37:06PM -0400, Sinan Kaya wrote:
> >> On 8/21/2017 3:18 PM, Bjorn Helgaas wrote:
> >> ...
> >> if (pci_bus_crs_pending(id))
> >> return pci_bus_wait_crs(dev->bus, dev->devfn, &id, 60000);
> >>
> >>> I think that makes sense. We'd want to check for CRS SV being
> >>> enabled, e.g., maybe read PCI_EXP_RTCTL_CRSSVE back in
> >>> pci_enable_crs() and cache it somewhere. Maybe a crs_sv_enabled bit
> >>> in the root port's pci_dev, and check it with something like what
> >>> pcie_root_rcb_set() does?
> >>>
> >>
> >> You can observe CRS under the following conditions
> >>
> >> 1. root port <-> endpoint
> >> 2. bridge <-> endpoint
> >> 3. root port<->bridge
> >>
> >> I was relying on the fact that we are reading 0x001 as an indication that
> >> this device detected CRS. Maybe, this is too indirect.
> >>
> >> If we also want to capture the capability, I think the right thing is to
> >> check the parent capability.
> >>
> >> bool pci_bus_crs_vis_supported(struct pci_dev *bridge)
> >> {
> >> if (device type(bridge) == root port)
> >> return read(root_crs_register_reg);
> >>
> >> if (device type(bridge) == switch)
> >> return read(switch_crs_register);
> >
> > I don't understand this part. AFAIK, CRS SV is only a feature of root
> > ports. The capability and enable bits are in the Root Capabilities
> > and Root Control registers.
> >
>
> No question about it.
>
> > It's certainly true that a device below a switch can respond with a
> > CRS completion, but the switch is not the requester, and my
> > understanding is that it would not take any action on the completion
> > other than passing it upstream.
> >
>
> I saw some bridge references in the spec for CRS. I was going to do
> some research for it. You answered my question. I was curious how this
> would impact the behavior.
>
> "Bridge Configuration Retry Enable – When Set, this bit enables PCI Express
> to PCI/PCI-X bridges to return Configuration Request Retry Status (CRS) in
> response to Configuration Requests that target devices below the bridge.
> Refer to the PCI Express to PCI/PCI-X Bridge Specification, Revision 1.0 for
> further details."

(The above is from PCIe r3.1, sec 7.8.4, Device Control Register, and
also discussed in sec 4.3 of the PCIe-to-PCI/PCI-X bridge spec.)

In any event, the Bridge Configuration Retry Enable only determines
whether the bridge ever returns a CRS completion. The bridge itself
never converts a CRS completion into the 0x0001 vendor ID.

2017-08-23 04:40:50

by Sinan Kaya

[permalink] [raw]
Subject: Re: [PATCH v11 2/4] PCI: Factor out pci_bus_wait_crs()

On 8/21/2017 3:37 PM, Sinan Kaya wrote:
> bool pci_bus_crs_pending(struct pci_dev *dev, u32 l)
> {
> if !pci_bus_crs_vis_supported(dev->parent)
> return false;

Apparently, I can't do this. By the time, we come to here from vendor id read
function, the topology has not been set up yet.

I'm getting an exception:

[ 6.067392] [<ffff00000846c878>] pci_bus_crs_visibility_pending+0x4/0x7c
[ 6.074085] [<ffff00000846cccc>] pci_scan_single_device+0x40/0xb4
[ 6.080170] [<ffff00000846cd90>] pci_scan_slot+0x50/0xe8
[ 6.085474] [<ffff00000846dc14>] pci_scan_child_bus+0x30/0x108
[ 6.091300] [<ffff0000084bab94>] acpi_pci_root_create+0x184/0x1f0
[ 6.097388] [<ffff000008091dc8>] pci_acpi_scan_root+0x188/0x1d4
[ 6.103298] [<ffff0000084ba7a8>] acpi_pci_root_add+0x38c/0x44c
[ 6.109125] [<ffff0000084b4d94>] acpi_bus_attach+0xe0/0x1ac
[ 6.114689] [<ffff0000084b4e08>] acpi_bus_attach+0x154/0x1ac
[ 6.120340] [<ffff0000084b4e08>] acpi_bus_attach+0x154/0x1ac
[ 6.125991] [<ffff0000084b6608>] acpi_bus_scan+0x60/0x70
[ 6.131297] [<ffff0000091c87f8>] acpi_scan_init+0xd8/0x228
[ 6.136774] [<ffff0000091c84e0>] acpi_init+0x2d4/0x328
[ 6.141905] [<ffff0000091a0c88>] do_one_initcall+0x80/0x108
[ 6.147469] [<ffff0000091a0e98>] kernel_init_freeable+0x188/0x228
[ 6.153556] [<ffff000008c9bcbc>] kernel_init+0x10/0xfc
[ 6.158687] [<ffff000008082ec0>] ret_from_fork+0x10/0x50



>
> return (l & 0xFFFF) == 0x0001;
> }


--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.