2008-12-22 22:12:41

by Andrew Patterson

Subject: [PATCH] ASPM: Use msleep instead of cpu_relax during link retraining

ASPM: Use msleep instead of cpu_relax during link retraining

The cpu_relax() function can be a noop on certain architectures
like IA-64 when CPU threads are disabled, so use msleep instead
during link retraining busy/wait loop.

Introduce define LINK_RETRAIN_TIMEOUT instead of hard-coding
timeout in pcie_aspm_configure_common_clock.

Signed-off-by: Andrew Patterson <[email protected]>
---

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index e361c7d..c00fd65 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -17,6 +17,7 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/jiffies.h>
+#include <linux/delay.h>
 #include <linux/pci-aspm.h>
 #include "../pci.h"

@@ -70,6 +71,8 @@ static const char *policy_str[] = {
 	[POLICY_POWERSAVE] = "powersave"
 };

+#define LINK_RETRAIN_TIMEOUT HZ
+
 static int policy_to_aspm_state(struct pci_dev *pdev)
 {
 	struct pcie_link_state *link_state = pdev->link_state;
@@ -217,16 +220,18 @@ static void pcie_aspm_configure_common_clock(struct pci_dev *pdev)
 	pci_write_config_word(pdev, pos + PCI_EXP_LNKCTL, reg16);

 	/* Wait for link training end */
-	/* break out after waiting for 1 second */
+	/* break out after waiting for timeout */
 	start_jiffies = jiffies;
-	while ((jiffies - start_jiffies) < HZ) {
+	for (;;) {
 		pci_read_config_word(pdev, pos + PCI_EXP_LNKSTA, &reg16);
 		if (!(reg16 & PCI_EXP_LNKSTA_LT))
 			break;
-		cpu_relax();
+		if ((jiffies - start_jiffies) >= LINK_RETRAIN_TIMEOUT)
+			break;
+		msleep(1);
 	}
 	/* training failed -> recover */
-	if ((jiffies - start_jiffies) >= HZ) {
+	if ((jiffies - start_jiffies) >= LINK_RETRAIN_TIMEOUT) {
 		dev_printk (KERN_ERR, &pdev->dev, "ASPM: Could not configure"
 			    " common clock\n");
 		i = 0;


2008-12-25 19:01:49

by Pavel Machek

Subject: Re: [PATCH] ASPM: Use msleep instead of cpu_relax during link retraining

On Mon 2008-12-22 15:11:57, Andrew Patterson wrote:
> ASPM: Use msleep instead of cpu_relax during link retraining
>
> The cpu_relax() function can be a noop on certain architectures
> like IA-64 when CPU threads are disabled, so use msleep instead
> during link retraining busy/wait loop.

Author clearly wanted to do a busy loop... why do you think 10msec
delay here is acceptable?


> Introduce define LINK_RETRAIN_TIMEOUT instead of hard-coding
> timeout in pcie_aspm_configure_common_clock.
>
> Signed-off-by: Andrew Patterson <[email protected]>
> @@ -70,6 +71,8 @@ static const char *policy_str[] = {
>  	[POLICY_POWERSAVE] = "powersave"
>  };
>
> +#define LINK_RETRAIN_TIMEOUT HZ
> +
>  static int policy_to_aspm_state(struct pci_dev *pdev)
>  {
>  	struct pcie_link_state *link_state = pdev->link_state;
> @@ -217,16 +220,18 @@ static void pcie_aspm_configure_common_clock(struct pci_dev *pdev)
>  	pci_write_config_word(pdev, pos + PCI_EXP_LNKCTL, reg16);
>
>  	/* Wait for link training end */
> -	/* break out after waiting for 1 second */
> +	/* break out after waiting for timeout */
>  	start_jiffies = jiffies;
> -	while ((jiffies - start_jiffies) < HZ) {
> +	for (;;) {
>  		pci_read_config_word(pdev, pos + PCI_EXP_LNKSTA, &reg16);
>  		if (!(reg16 & PCI_EXP_LNKSTA_LT))
>  			break;
> -		cpu_relax();
> +		if ((jiffies - start_jiffies) >= LINK_RETRAIN_TIMEOUT)
> +			break;
> +		msleep(1);

Is this safe w.r.t. jiffie wraparounds?

>  	}
>  	/* training failed -> recover */
> -	if ((jiffies - start_jiffies) >= HZ) {
> +	if ((jiffies - start_jiffies) >= LINK_RETRAIN_TIMEOUT) {
>  		dev_printk (KERN_ERR, &pdev->dev, "ASPM: Could not configure"
>  			    " common clock\n");
>  		i = 0;

AFAICT this can trigger false positives. !reg16 test succeeds and then
jiffies tick.

...it could happen before but you make it way more probable...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-12-25 23:24:53

by Matthew Wilcox

Subject: Re: [PATCH] ASPM: Use msleep instead of cpu_relax during link retraining

On Thu, Dec 25, 2008 at 08:01:29PM +0100, Pavel Machek wrote:
> On Mon 2008-12-22 15:11:57, Andrew Patterson wrote:
> > ASPM: Use msleep instead of cpu_relax during link retraining
> >
> > The cpu_relax() function can be a noop on certain architectures
> > like IA-64 when CPU threads are disabled, so use msleep instead
> > during link retraining busy/wait loop.
>
> Author clearly wanted to do a busy loop... why do you think 10msec
> delay here is acceptable?

10ms? I see a 1ms sleep.

> > @@ -217,16 +220,18 @@ static void pcie_aspm_configure_common_clock(struct pci_dev *pdev)
> >  	pci_write_config_word(pdev, pos + PCI_EXP_LNKCTL, reg16);
> >
> >  	/* Wait for link training end */
> > -	/* break out after waiting for 1 second */
> > +	/* break out after waiting for timeout */
> >  	start_jiffies = jiffies;
> > -	while ((jiffies - start_jiffies) < HZ) {
> > +	for (;;) {
> >  		pci_read_config_word(pdev, pos + PCI_EXP_LNKSTA, &reg16);
> >  		if (!(reg16 & PCI_EXP_LNKSTA_LT))
> >  			break;
> > -		cpu_relax();
> > +		if ((jiffies - start_jiffies) >= LINK_RETRAIN_TIMEOUT)
> > +			break;
> > +		msleep(1);
>
> Is this safe w.r.t. jiffie wraparounds?

Definitely needs to be time_before/time_after.

> >  	}
> >  	/* training failed -> recover */
> > -	if ((jiffies - start_jiffies) >= HZ) {
> > +	if ((jiffies - start_jiffies) >= LINK_RETRAIN_TIMEOUT) {
> >  		dev_printk (KERN_ERR, &pdev->dev, "ASPM: Could not configure"
> >  			    " common clock\n");
> >  		i = 0;
>
> AFAICT this can trigger false positives. !reg16 test succeeds and then
> jiffies tick.
>
> ...it could happen before but you make it way more probable...

No, because the test moved.

I came up with this loop (off the top of my head):

<willy> for (;;) {
<willy> 	pci_read_config_word(pdev, pos + PCI_EXP_LNKSTA, &reg16);
<willy> 	if (!(reg16 & PCI_EXP_LNKSTA_LT))
<willy> 		break;
<willy> 	if ((jiffies - start_jiffies) >= HZ)
<willy> 		break;
<willy> 	msleep(1);
<willy> }

Andrew has mostly followed that ... improving it to LINK_RETRAIN_TIMEOUT
instead of HZ.

Yes, the subsequent test should be of reg16 instead of jiffies.

And we should be using time_before/after instead of the explicit
comparison.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
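
[Editorial sketch, not part of the original message.] The two changes
requested in this review (bound the wait with time_before/time_after, and
decide failure from reg16 rather than from jiffies) can be tried outside
the kernel with a small userspace model. This is only an illustration
under stated assumptions, not the follow-up patch: struct fake_link,
read_lnksta(), fake_msleep() and wait_for_retrain() are hypothetical
stand-ins, LNKSTA_LT is a local stand-in for PCI_EXP_LNKSTA_LT, and
time_after() is a local copy of the kernel idiom so that the code compiles
on its own.

```c
#include <assert.h>
#include <limits.h>

/* Local copy of the kernel's wrap-safe "a is later than b" idiom. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Stand-in for PCI_EXP_LNKSTA_LT (link training in progress). */
#define LNKSTA_LT 0x0800u

/* Hypothetical simulated link: a tick counter plus a training countdown. */
struct fake_link {
	unsigned long jiffies;    /* simulated tick counter, may wrap */
	unsigned long busy_reads; /* status reads that still report LT set */
};

/* Simulated read of the Link Status register. */
static unsigned short read_lnksta(struct fake_link *l)
{
	if (l->busy_reads == 0)
		return 0;
	l->busy_reads--;
	return LNKSTA_LT;
}

/* Simulated msleep(1): assume each sleep costs exactly one tick. */
static void fake_msleep(struct fake_link *l)
{
	l->jiffies++;
}

/* Returns 0 if training finished, -1 if it was still running at timeout. */
static int wait_for_retrain(struct fake_link *l, unsigned long timeout)
{
	unsigned long deadline = l->jiffies + timeout; /* may wrap; that is fine */
	unsigned short reg16;

	/* time_after() is only meaningful for spans under half the range. */
	assert(timeout <= (unsigned long)LONG_MAX);

	for (;;) {
		reg16 = read_lnksta(l);
		if (!(reg16 & LNKSTA_LT))
			break;
		if (time_after(l->jiffies, deadline))
			break;
		fake_msleep(l);
	}

	/* Judge the outcome by the register, not by the clock. */
	return (reg16 & LNKSTA_LT) ? -1 : 0;
}
```

Because the result is taken from reg16, a link that finishes training on
the last poll is reported as a success even when the tick counter has
already moved past the deadline, which is the false positive raised
earlier in the thread. Unsigned subtraction of jiffies values is also
well defined across a wrap; the macro form is shown because it is what the
review asks for.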

2008-12-26 18:27:22

by Pavel Machek

Subject: Re: [PATCH] ASPM: Use msleep instead of cpu_relax during link retraining

On Thu 2008-12-25 16:24:27, Matthew Wilcox wrote:
> On Thu, Dec 25, 2008 at 08:01:29PM +0100, Pavel Machek wrote:
> > On Mon 2008-12-22 15:11:57, Andrew Patterson wrote:
> > > ASPM: Use msleep instead of cpu_relax during link retraining
> > >
> > > The cpu_relax() function can be a noop on certain architectures
> > > like IA-64 when CPU threads are disabled, so use msleep instead
> > > during link retraining busy/wait loop.
> >
> > Author clearly wanted to do a busy loop... why do you think 10msec
> > delay here is acceptable?
>
> 10ms? I see a 1ms sleep.

Yes... IIRC msleep will sleep for up-to 1/HZ on non-highres systems.

> Yes, the subsequent test should be of reg16 instead of jiffies.

Thanks.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
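
[Editorial sketch, not part of the original message.] The granularity
point above can be made concrete with a little arithmetic. The model below
assumes the jiffies-based msleep() of kernels from this period, which, as
far as I recall, rounds the request up to whole ticks and then adds one
more tick; treat that as an assumption to check against the tree you are
using. ms_to_ticks(), msleep_ticks() and ticks_to_ms() are hypothetical
helper names for this illustration, not kernel functions.

```c
#include <assert.h>

/* Round a millisecond count up to whole ticks (simplified model). */
static unsigned long ms_to_ticks(unsigned int ms, unsigned int hz)
{
	return ((unsigned long)ms * hz + 999) / 1000;
}

/* Ticks requested for a sleep of ms: rounded value plus one extra tick. */
static unsigned long msleep_ticks(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);
	return ms_to_ticks(ms, hz) + 1;
}

/* Upper bound, in milliseconds, on a sleep of the given tick count. */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned int hz)
{
	assert(hz > 0);
	return ticks * 1000 / hz;
}
```

Under this model, a 1 ms request at HZ=100 becomes 2 ticks, so each pass
through the retrain loop could sleep somewhere between 10 and 20 ms; at
HZ=1000 the same request is bounded by about 2 ms. Either way the overall
wait is still capped by LINK_RETRAIN_TIMEOUT.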

2009-01-05 19:17:19

by Jesse Barnes

Subject: Re: [PATCH] ASPM: Use msleep instead of cpu_relax during link retraining

On Friday, December 26, 2008 10:27 am Pavel Machek wrote:
> On Thu 2008-12-25 16:24:27, Matthew Wilcox wrote:
> > On Thu, Dec 25, 2008 at 08:01:29PM +0100, Pavel Machek wrote:
> > > On Mon 2008-12-22 15:11:57, Andrew Patterson wrote:
> > > > ASPM: Use msleep instead of cpu_relax during link retraining
> > > >
> > > > The cpu_relax() function can be a noop on certain architectures
> > > > like IA-64 when CPU threads are disabled, so use msleep instead
> > > > during link retraining busy/wait loop.
> > >
> > > Author clearly wanted to do a busy loop... why do you think 10msec
> > > delay here is acceptable?
> >
> > 10ms? I see a 1ms sleep.
>
> Yes... IIRC msleep will sleep for up-to 1/HZ on non-highres systems.
>
> > Yes, the subsequent test should be of reg16 instead of jiffies.

Andrew, care to send an updated patch which includes fixes for the issues
caught by Pavel & Matthew?

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2009-01-05 19:37:36

by Andrew Patterson

Subject: Re: [PATCH] ASPM: Use msleep instead of cpu_relax during link retraining

On Mon, 2009-01-05 at 11:17 -0800, Jesse Barnes wrote:
> On Friday, December 26, 2008 10:27 am Pavel Machek wrote:
> > On Thu 2008-12-25 16:24:27, Matthew Wilcox wrote:
> > > On Thu, Dec 25, 2008 at 08:01:29PM +0100, Pavel Machek wrote:
> > > > On Mon 2008-12-22 15:11:57, Andrew Patterson wrote:
> > > > > ASPM: Use msleep instead of cpu_relax during link retraining
> > > > >
> > > > > The cpu_relax() function can be a noop on certain architectures
> > > > > like IA-64 when CPU threads are disabled, so use msleep instead
> > > > > during link retraining busy/wait loop.
> > > >
> > > > Author clearly wanted to do a busy loop... why do you think 10msec
> > > > delay here is acceptable?
> > >
> > > 10ms? I see a 1ms sleep.
> >
> > Yes... IIRC msleep will sleep for up-to 1/HZ on non-highres systems.
> >
> > > Yes, the subsequent test should be of reg16 instead of jiffies.
>
> Andrew, care to send an updated patch which includes fixes for the issues
> caught by Pavel & Matthew?
>

Will do.

Andrew

> Thanks,