2009-11-13 17:05:36

by Breno Leitao

[permalink] [raw]
Subject: PCI: pci_restore_state() is returning 0 when it fails

Actually pci_restore_state() is returning 0 if the restore process
fails, instead of a error value.

If it fails, I believe that it should return -EPERM, once that
it is an invalid operation and probably pci_save_state() wasn't
called.

Signed-off-by: Breno Leitao <[email protected]>
---
drivers/pci/pci.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4e4c295..b677ca3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -857,7 +857,7 @@ pci_restore_state(struct pci_dev *dev)
u32 val;

if (!dev->state_saved)
- return 0;
+ return -EPERM;

/* PCI Express register must be restored first */
pci_restore_pcie_state(dev);
--
1.6.0.4


2009-11-13 20:07:31

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: PCI: pci_restore_state() is returning 0 when it fails

On Friday 13 November 2009, Breno Leitao wrote:
> Actually pci_restore_state() is returning 0 if the restore process
> fails, instead of a error value.
>
> If it fails, I believe that it should return -EPERM, once that
> it is an invalid operation and probably pci_save_state() wasn't
> called.

I believe this patch will break a number of things.

Does it actually fix any problem you have observed?

Rafael


> Signed-off-by: Breno Leitao <[email protected]>
> ---
> drivers/pci/pci.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 4e4c295..b677ca3 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -857,7 +857,7 @@ pci_restore_state(struct pci_dev *dev)
> u32 val;
>
> if (!dev->state_saved)
> - return 0;
> + return -EPERM;
>
> /* PCI Express register must be restored first */
> pci_restore_pcie_state(dev);
>

2009-11-16 14:13:40

by Breno Leitao

[permalink] [raw]
Subject: Re: PCI: pci_restore_state() is returning 0 when it fails

Hi Rafael,

Rafael J. Wysocki wrote:
> On Friday 13 November 2009, Breno Leitao wrote:
>> Actually pci_restore_state() is returning 0 if the restore process
>> fails, instead of a error value.
>>
>> If it fails, I believe that it should return -EPERM, once that
>> it is an invalid operation and probably pci_save_state() wasn't
>> called.
>
> I believe this patch will break a number of things.
Well, I checked it, and found that there are around 10 places that
really verify the return value for this function, and almost all of them
do the correct thing, and the patch doesn't seem to break any of them
except a specific case in the drivers/net/sfc/falcon.c file, that contains:

if (FALCON_IS_DUAL_FUNC(efx)) {
rc = pci_restore_state(nic_data->pci_dev2);
if (rc) {
EFX_ERR(efx, "failed to restore PCI config for "
"the secondary function\n");
goto fail3;
}
}
rc = pci_restore_state(efx->pci_dev);
if (rc) {
EFX_ERR(efx, "failed to restore PCI config for the "
"primary function\n");
goto fail4;


In this case, if FALCON_IS_DUAL_FUNC(efx) returns true, then the next
pci_restore_state(efx->pci_dev) will return -1, and cause the "failed to
restore PCI config for the primary function" error.
That's because the code is calling pci_restore_state() twice without calling
pci_save_state() in the middle.
Since this seems to be the only place that will be broken, and the fix is
trivial, I believe that the patch can be applied smoothly.

> Does it actually fix any problem you have observed?
No, but we use this function to resume drivers after PPC machines detects
errors (EEH). And before your patch, we basically save the state once
(in the init function), and then call pci_restore_state() every time the
machine detects an error. After your patch, it's not possible anymore,
because pci_restore_state() will not restore the state after the
first restore. So, I'd like to instrument some drivers to check if it's
possible to recover or not.

I also believe that returning 0 for a failed function isn't a good plan.

Thanks for the review,
Breno

2009-11-16 14:49:33

by Ben Hutchings

[permalink] [raw]
Subject: Re: PCI: pci_restore_state() is returning 0 when it fails

On Mon, 2009-11-16 at 12:13 -0200, Breno Leitao wrote:
> Hi Rafael,
>
> Rafael J. Wysocki wrote:
> > On Friday 13 November 2009, Breno Leitao wrote:
> >> Actually pci_restore_state() is returning 0 if the restore process
> >> fails, instead of a error value.
> >>
> >> If it fails, I believe that it should return -EPERM, once that
> >> it is an invalid operation and probably pci_save_state() wasn't
> >> called.
> >
> > I believe this patch will break a number of things.
> Well, I checked it, and found that there are around 10 places that
> really verify the return value for this function, and almost all of them
> do the correct thing, and the patch doesn't seem to break any of them
> except a specific case in the drivers/net/sfc/falcon.c file, that contains:
[...]
> That's because the code is calling pci_restore_state() twice without calling
> pci_save_state() in the middle.
> Since this seems to be the only place that will be broken, and the fix is
> trivial, I believe that the patch can be applied smoothly.
[...]

This code supports two similar PCI devices, one of which has a second
function that is not truly independent. For that chip it saves and
restores both functions' config space. So far as I know, there are no
cases where it fails to match save and restore.

Ben.

--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

2009-11-23 12:38:28

by Breno Leitao

[permalink] [raw]
Subject: Re: PCI: pci_restore_state() is returning 0 when it fails

Hi Rafael,

I didn't hear back after the analysis that there is no regression
after this patch. Did you have a chance to think about this patch ?

Thanks
Breno

Ben Hutchings wrote:
> On Mon, 2009-11-16 at 12:13 -0200, Breno Leitao wrote:
>> Hi Rafael,
>>
>> Rafael J. Wysocki wrote:
>>> On Friday 13 November 2009, Breno Leitao wrote:
>>>> Actually pci_restore_state() is returning 0 if the restore process
>>>> fails, instead of a error value.
>>>>
>>>> If it fails, I believe that it should return -EPERM, once that
>>>> it is an invalid operation and probably pci_save_state() wasn't
>>>> called.
>>> I believe this patch will break a number of things.
>> Well, I checked it, and found that there are around 10 places that
>> really verify the return value for this function, and almost all of them
>> do the correct thing, and the patch doesn't seem to break any of them
>> except a specific case in the drivers/net/sfc/falcon.c file, that contains:
> [...]
>> That's because the code is calling pci_restore_state() twice without calling
>> pci_save_state() in the middle.
>> Since this seems to be the only place that will be broken, and the fix is
>> trivial, I believe that the patch can be applied smoothly.
> [...]
>
> This code supports two similar PCI devices, one of which has a second
> function that is not truly independent. For that chip it saves and
> restores both functions' config space. So far as I know, there are no
> cases where it fails to match save and restore.
>
> Ben.
>

2009-11-23 19:29:38

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: PCI: pci_restore_state() is returning 0 when it fails

On Monday 23 November 2009, Breno Leitao wrote:
> Hi Rafael,
>
> I didn't hear back after the analysis that there is no regression
> after this patch. Did you have a chance to think about this patch ?

I'm not really sure why you insist that we take it. Does it solve any problem
that you're observing in practice? If so, what's your test case.

Rafael