2006-05-26 03:00:07

by Shaohua Li

[permalink] [raw]
Subject: [RFC]disable msi mode in pci_disable_device

Brice said the pci_save_msi_state breaks his driver in his special usage
(not in suspend/resume), as pci_save_msi_state will disable msi mode. In
his usage, pci_save_state will be called at runtime, and later (after
the device operates for some time and has an error) pci_restore_state
will be called.
In another hand, suspend/resume needs disable msi mode, as device should
stop working completely. This patch try to workaround this issue.
Drivers are expected call pci_disable_device in suspend time after
pci_save_state.

Signed-off-by: Shaohua Li <[email protected]>
---

linux-2.6.17-rc4-root/drivers/pci/msi.c | 6 ++++--
linux-2.6.17-rc4-root/drivers/pci/pci.c | 9 ++++++++-
linux-2.6.17-rc4-root/include/linux/pci.h | 2 ++
3 files changed, 14 insertions(+), 3 deletions(-)

diff -puN drivers/pci/msi.c~disable-msi-in-pci_disable_device drivers/pci/msi.c
--- linux-2.6.17-rc4/drivers/pci/msi.c~disable-msi-in-pci_disable_device 2006-05-25 08:18:23.000000000 +0800
+++ linux-2.6.17-rc4-root/drivers/pci/msi.c 2006-05-25 08:25:19.000000000 +0800
@@ -442,9 +442,11 @@ static void enable_msi_mode(struct pci_d
/* Set enabled bits to single MSI & enable MSI_enable bit */
msi_enable(control, 1);
pci_write_config_word(dev, msi_control_reg(pos), control);
+ dev->msi_enabled = 1;
} else {
msix_enable(control);
pci_write_config_word(dev, msi_control_reg(pos), control);
+ dev->msix_enabled = 1;
}
if (pci_find_capability(dev, PCI_CAP_ID_EXP)) {
/* PCI Express Endpoint device detected */
@@ -461,9 +463,11 @@ void disable_msi_mode(struct pci_dev *de
/* Set enabled bits to single MSI & enable MSI_enable bit */
msi_disable(control);
pci_write_config_word(dev, msi_control_reg(pos), control);
+ dev->msi_enabled = 0;
} else {
msix_disable(control);
pci_write_config_word(dev, msi_control_reg(pos), control);
+ dev->msix_enabled = 0;
}
if (pci_find_capability(dev, PCI_CAP_ID_EXP)) {
/* PCI Express Endpoint device detected */
@@ -538,7 +542,6 @@ int pci_save_msi_state(struct pci_dev *d
pci_read_config_dword(dev, pos + PCI_MSI_DATA_32, &cap[i++]);
if (control & PCI_MSI_FLAGS_MASKBIT)
pci_read_config_dword(dev, pos + PCI_MSI_MASK_BIT, &cap[i++]);
- disable_msi_mode(dev, pos, PCI_CAP_ID_MSI);
save_state->cap_nr = PCI_CAP_ID_MSI;
pci_add_saved_cap(dev, save_state);
return 0;
@@ -593,7 +596,6 @@ int pci_save_msix_state(struct pci_dev *
}
*((u16 *)&save_state->data[0]) = control;

- disable_msi_mode(dev, pos, PCI_CAP_ID_MSIX);
save_state->cap_nr = PCI_CAP_ID_MSIX;
pci_add_saved_cap(dev, save_state);
return 0;
diff -puN include/linux/pci.h~disable-msi-in-pci_disable_device include/linux/pci.h
--- linux-2.6.17-rc4/include/linux/pci.h~disable-msi-in-pci_disable_device 2006-05-25 08:18:36.000000000 +0800
+++ linux-2.6.17-rc4-root/include/linux/pci.h 2006-05-25 08:19:34.000000000 +0800
@@ -163,6 +163,8 @@ struct pci_dev {
unsigned int is_busmaster:1; /* device is busmaster */
unsigned int no_msi:1; /* device may not use msi */
unsigned int block_ucfg_access:1; /* userspace config space access is blocked */
+ unsigned int msi_enabled:1;
+ unsigned int msix_enabled:1;

u32 saved_config_space[16]; /* config space saved at suspend time */
struct hlist_head saved_cap_space;
diff -puN drivers/pci/pci.c~disable-msi-in-pci_disable_device drivers/pci/pci.c
--- linux-2.6.17-rc4/drivers/pci/pci.c~disable-msi-in-pci_disable_device 2006-05-25 08:20:42.000000000 +0800
+++ linux-2.6.17-rc4-root/drivers/pci/pci.c 2006-05-25 08:32:33.000000000 +0800
@@ -533,7 +533,14 @@ void
pci_disable_device(struct pci_dev *dev)
{
u16 pci_command;
-
+
+ if (dev->msi_enabled)
+ disable_msi_mode(dev, pci_find_capability(dev, PCI_CAP_ID_MSI),
+ PCI_CAP_ID_MSI);
+ if (dev->msix_enabled)
+ disable_msi_mode(dev, pci_find_capability(dev, PCI_CAP_ID_MSI),
+ PCI_CAP_ID_MSIX);
+
pci_read_config_word(dev, PCI_COMMAND, &pci_command);
if (pci_command & PCI_COMMAND_MASTER) {
pci_command &= ~PCI_COMMAND_MASTER;
_


2006-05-26 19:51:58

by Andrew Morton

[permalink] [raw]
Subject: Re: [RFC]disable msi mode in pci_disable_device

Shaohua Li <[email protected]> wrote:
>
> Brice said the pci_save_msi_state breaks his driver in his special usage
> (not in suspend/resume), as pci_save_msi_state will disable msi mode.

That sounds wrong of pci_save_msi_state(). It's supposed to save the
state, not to go fiddling with it.

> In
> his usage, pci_save_state will be called at runtime, and later (after
> the device operates for some time and has an error) pci_restore_state
> will be called.

Is that a sane thing for a driver to be doing? (Not relevant to this issue
though).

> In another hand, suspend/resume needs disable msi mode, as device should
> stop working completely. This patch try to workaround this issue.
> Drivers are expected call pci_disable_device in suspend time after
> pci_save_state.

Surely the drivers should be calling pci_disable_msix() or something after
saving the state rather than relying upon magical side-effects of
pci_save_msi_state(). Or we do disable_msi_mode() or whatever in
pci_disable_device().

2006-05-26 20:27:25

by Brice Goglin

[permalink] [raw]
Subject: Re: [RFC]disable msi mode in pci_disable_device

Andrew Morton wrote:
>> In
>> his usage, pci_save_state will be called at runtime, and later (after
>> the device operates for some time and has an error) pci_restore_state
>> will be called.
>
> Is that a sane thing for a driver to be doing? (Not relevant to this issue
> though).

The aim is to be able to recover from a memory parity error in the NIC.
Such errors happen sometimes, especially when a cosmic ray comes by. To
recover, we restore the state that we saved at the end of the
initialization. As saving currently disables MSI, we currently have to
restore the state right after saving it at the end of the initialization
(see the end of
myri10ge_probe in http://lkml.org/lkml/2006/5/23/24).

I just tried, the patch fixes our problem (no need to restore right
after saving to reenable MSI).

Brice

2006-05-26 23:13:11

by Rajesh Shah

[permalink] [raw]
Subject: Re: [RFC]disable msi mode in pci_disable_device

On Fri, May 26, 2006 at 10:26:57PM +0200, Brice Goglin wrote:
>
> I just tried, the patch fixes our problem (no need to restore right
> after saving to reenable MSI).
>
Yeah, I agree this latest patch from Shaohua is the right thing,
and that pci save/restore msi state functions should not have
the side effect of disabling/enabling MSI. Shaohua, do drivers
already call pci_disable_device() or will you have to patch
them all to get the disable effect?

Rajesh

2006-05-27 08:07:21

by Brice Goglin

[permalink] [raw]
Subject: Re: [RFC]disable msi mode in pci_disable_device

Rajesh Shah wrote:
> On Fri, May 26, 2006 at 10:26:57PM +0200, Brice Goglin wrote:
>
>> I just tried, the patch fixes our problem (no need to restore right
>> after saving to reenable MSI).
>>
>>
> Yeah, I agree this latest patch from Shaohua is the right thing,
> and that pci save/restore msi state functions should not have
> the side effect of disabling/enabling MSI. Shaohua, do drivers
> already call pci_disable_device() or will you have to patch
> them all to get the disable effect?
>

We would have to patch if we knew that disabling was required. But these
drivers that do not call pci_disable_device (for instance tg3 and bnx2)
were already working before pci_save_msi_state was added by Shaohua in
2.6.17-rc:
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=41017f0cac925e4a6bcf3359b75e5538112d4216
So they were working without any PCI core function disabling MSI for
them during suspend.

For these drivers, it might be a regression against 2.6.17-rc, but not
against 2.6.16. I'd say it's fine.

Brice

2006-05-29 02:14:18

by Shaohua Li

[permalink] [raw]
Subject: Re: [RFC]disable msi mode in pci_disable_device

On Fri, 2006-05-26 at 16:10 -0700, Rajesh Shah wrote:
> On Fri, May 26, 2006 at 10:26:57PM +0200, Brice Goglin wrote:
> >
> > I just tried, the patch fixes our problem (no need to restore right
> > after saving to reenable MSI).
> >
> Yeah, I agree this latest patch from Shaohua is the right thing,
> and that pci save/restore msi state functions should not have
> the side effect of disabling/enabling MSI. Shaohua, do drivers
> already call pci_disable_device() or will you have to patch
> them all to get the disable effect?
I guess most drivers already do this. It's recommended way (call
pci_disable_device in suspend) for a long time for suspend/resume. If
they really care, the driver authors should fix them.

Thanks,
Shaohua

2006-05-31 21:06:29

by linas

[permalink] [raw]
Subject: Re: [RFC]disable msi mode in pci_disable_device

On Fri, May 26, 2006 at 10:26:57PM +0200, Brice Goglin wrote:
> Andrew Morton wrote:
> >> In
> >> his usage, pci_save_state will be called at runtime, and later (after
> >> the device operates for some time and has an error) pci_restore_state
> >> will be called.
> >
> > Is that a sane thing for a driver to be doing? (Not relevant to this issue
> > though).
>
> The aim is to be able to recover from a memory parity error in the NIC.
> Such errors happen sometimes, especially when a cosmic ray comes by. To
> recover, we restore the state that we saved at the end of the
> initialization. As saving currently disables MSI, we currently have to
> restore the state right after saving it at the end of the initialization
> (see the end of
> myri10ge_probe in http://lkml.org/lkml/2006/5/23/24).

My experience dealing with a similar thing suggests that its usually
easier to restore the state to where it was after a cold boot, but
before the device driver touched the h/w.

--linas