Since some devices may not implement the MWI bit, we should check that
the write did set it and return an error if it didn't.
Signed-off-by: Matthew Wilcox <[email protected]>
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a544997..3d041f4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -900,13 +900,17 @@ #endif
return rc;
pci_read_config_word(dev, PCI_COMMAND, &cmd);
- if (! (cmd & PCI_COMMAND_INVALIDATE)) {
- pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
- cmd |= PCI_COMMAND_INVALIDATE;
- pci_write_config_word(dev, PCI_COMMAND, cmd);
- }
-
- return 0;
+ if (cmd & PCI_COMMAND_INVALIDATE)
+ return 0;
+
+ pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
+ cmd |= PCI_COMMAND_INVALIDATE;
+ pci_write_config_word(dev, PCI_COMMAND, cmd);
+
+ /* read result from hardware (in case bit refused to enable) */
+ pci_read_config_word(dev, PCI_COMMAND, &cmd);
+
+ return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
}
/**
We used to check whether pci_set_mwi() had succeeded by testing the
hardware MWI bit. Now we need only check the return value (and failing
to do so is a warning). Also, pci_set_mwi() will fail if the cache line
size is 0, so we don't need to check that ourselves any more.
Signed-off-by: Matthew Wilcox <[email protected]>
diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
index d11d28c..64d999b 100644
--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -1135,7 +1135,6 @@ static void __devinit tulip_mwi_config (
{
struct tulip_private *tp = netdev_priv(dev);
u8 cache;
- u16 pci_command;
u32 csr0;
if (tulip_debug > 3)
@@ -1153,21 +1152,15 @@ static void __devinit tulip_mwi_config (
/* set or disable MWI in the standard PCI command bit.
* Check for the case where mwi is desired but not available
*/
- if (csr0 & MWI) pci_set_mwi(pdev);
- else pci_clear_mwi(pdev);
-
- /* read result from hardware (in case bit refused to enable) */
- pci_read_config_word(pdev, PCI_COMMAND, &pci_command);
- if ((csr0 & MWI) && (!(pci_command & PCI_COMMAND_INVALIDATE)))
- csr0 &= ~MWI;
-
- /* if cache line size hardwired to zero, no MWI */
- pci_read_config_byte(pdev, PCI_CACHE_LINE_SIZE, &cache);
- if ((csr0 & MWI) && (cache == 0)) {
- csr0 &= ~MWI;
+ if (csr0 & MWI) {
+ if (pci_set_mwi(pdev))
+ csr0 &= ~MWI;
+ } else {
pci_clear_mwi(pdev);
}
+ pci_read_config_byte(pdev, PCI_CACHE_LINE_SIZE, &cache);
+
/* assign per-cacheline-size cache alignment and
* burst length values
*/
Matthew Wilcox wrote:
> Also, pci_set_mwi() will fail if the cache line
> size is 0, so we don't need to check that ourselves any more.
NAK, not true on all arches. sparc64 at least presumes that the
firmware DTRT with cacheline size, which hurts us now given this tulip patch
Jeff
Matthew Wilcox wrote:
> Since some devices may not implement the MWI bit, we should check that
> the write did set it and return an error if it didn't.
>
> Signed-off-by: Matthew Wilcox <[email protected]>
ACK, though (as it seems you are doing) you should audit pci_set_mwi()
users to make sure behavior matches up with this implementation
Jeff
On Fri, Oct 06, 2006 at 03:15:15PM -0400, Jeff Garzik wrote:
> Matthew Wilcox wrote:
> >Also, pci_set_mwi() will fail if the cache line
> >size is 0, so we don't need to check that ourselves any more.
>
> NAK, not true on all arches. sparc64 at least presumes that the
> firmware DTRT with cacheline size, which hurts us now given this tulip patch
How does it hurt us?
int pcibios_prep_mwi(struct pci_dev *dev)
{
/* We set correct PCI_CACHE_LINE_SIZE register values for every
* device probed on this platform. So there is nothing to check
* and this always succeeds.
*/
return 0;
}
If Dave's wrong about that, it hurts him, not us ;-)
It's still not necessary for the Tulip driver to check.
Matthew Wilcox wrote:
> On Fri, Oct 06, 2006 at 03:15:15PM -0400, Jeff Garzik wrote:
>> Matthew Wilcox wrote:
>>> Also, pci_set_mwi() will fail if the cache line
>>> size is 0, so we don't need to check that ourselves any more.
>> NAK, not true on all arches. sparc64 at least presumes that the
>> firmware DTRT with cacheline size, which hurts us now given this tulip patch
>
> How does it hurt us?
>
> int pcibios_prep_mwi(struct pci_dev *dev)
> {
> /* We set correct PCI_CACHE_LINE_SIZE register values for every
> * device probed on this platform. So there is nothing to check
> * and this always succeeds.
> */
> return 0;
> }
>
> If Dave's wrong about that, it hurts him, not us ;-)
>
> It's still not necessary for the Tulip driver to check.
The unmodified tulip driver checks both MWI and cacheline-size because
one of the clones (PNIC or PNIC2) will let you set the MWI bit, but
hardwires cacheline size to zero.
If the arches do not behave consistently, we need to keep the check in
the tulip driver, to avoid incorrectly programming the csr0 MWI bit.
Jeff
Jeff Garzik wrote:
> The unmodified tulip driver checks both MWI and cacheline-size because
> one of the clones (PNIC or PNIC2) will let you set the MWI bit, but
> hardwires cacheline size to zero.
>
> If the arches do not behave consistently, we need to keep the check in
> the tulip driver, to avoid incorrectly programming the csr0 MWI bit.
>
> Jeff
I should think that pci_set_mwi should fail if either the cache line
size showed 0 (after either setting the correct size or assuming it
should have been set already) or the MWI bit ended up clear after we
tried to turn it on.
That pcibios_prep_mwi code for sparc64 looks wrong, if you plug in a
device that doesn't implement MWI at all it will probably not let
anything get written to PCI_CACHE_LINE_SIZE other than 0, but it is
blindly succeeding in all cases. Even if the arch assumes the firmware
already set the size properly it should still make sure it is non-zero
before allowing MWI..
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/
On Fri, Oct 06, 2006 at 03:59:57PM -0400, Jeff Garzik wrote:
> The unmodified tulip driver checks both MWI and cacheline-size because
> one of the clones (PNIC or PNIC2) will let you set the MWI bit, but
> hardwires cacheline size to zero.
Maybe the generic pci_set_mwi() can verify cacheline size is non-zero?
I don't think each driver should need to enforce this.
> If the arches do not behave consistently, we need to keep the check in
> the tulip driver, to avoid incorrectly programming the csr0 MWI bit.
Why not fix the arches to be consistent?
There's alot more drivers than arches...and we have control
of the arch specific PCI support.
thanks,
grant
Grant Grundler wrote:
> On Fri, Oct 06, 2006 at 03:59:57PM -0400, Jeff Garzik wrote:
>> The unmodified tulip driver checks both MWI and cacheline-size because
>> one of the clones (PNIC or PNIC2) will let you set the MWI bit, but
>> hardwires cacheline size to zero.
>
> Maybe the generic pci_set_mwi() can verify cacheline size is non-zero?
> I don't think each driver should need to enforce this.
Agreed.
>> If the arches do not behave consistently, we need to keep the check in
>> the tulip driver, to avoid incorrectly programming the csr0 MWI bit.
>
> Why not fix the arches to be consistent?
> There's alot more drivers than arches...and we have control
> of the arch specific PCI support.
Agreed.
Jeff
On Fri, 06 Oct 2006 13:05:18 -0600
Matthew Wilcox <[email protected]> wrote:
> Since some devices may not implement the MWI bit, we should check that
> the write did set it and return an error if it didn't.
>
> Signed-off-by: Matthew Wilcox <[email protected]>
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a544997..3d041f4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -900,13 +900,17 @@ #endif
> return rc;
>
> pci_read_config_word(dev, PCI_COMMAND, &cmd);
> - if (! (cmd & PCI_COMMAND_INVALIDATE)) {
> - pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> - cmd |= PCI_COMMAND_INVALIDATE;
> - pci_write_config_word(dev, PCI_COMMAND, cmd);
> - }
> -
> - return 0;
> + if (cmd & PCI_COMMAND_INVALIDATE)
> + return 0;
> +
> + pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> + cmd |= PCI_COMMAND_INVALIDATE;
> + pci_write_config_word(dev, PCI_COMMAND, cmd);
> +
> + /* read result from hardware (in case bit refused to enable) */
> + pci_read_config_word(dev, PCI_COMMAND, &cmd);
> +
> + return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
> }
>
> /**
Bisection shows that this patch
(pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
suspend-to-disk on my Vaio. It writes the suspend image and gets to the
point where it's supposed to power down, but doesn't.
After a manual power-cycle it successfully resumes from disk, but
networking (at least) is dead.
On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> On Fri, 06 Oct 2006 13:05:18 -0600
> Matthew Wilcox <[email protected]> wrote:
>
> > Since some devices may not implement the MWI bit, we should check that
> > the write did set it and return an error if it didn't.
> >
> > Signed-off-by: Matthew Wilcox <[email protected]>
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index a544997..3d041f4 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -900,13 +900,17 @@ #endif
> > return rc;
> >
> > pci_read_config_word(dev, PCI_COMMAND, &cmd);
> > - if (! (cmd & PCI_COMMAND_INVALIDATE)) {
> > - pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> > - cmd |= PCI_COMMAND_INVALIDATE;
> > - pci_write_config_word(dev, PCI_COMMAND, cmd);
> > - }
> > -
> > - return 0;
> > + if (cmd & PCI_COMMAND_INVALIDATE)
> > + return 0;
> > +
> > + pr_debug("PCI: Enabling Mem-Wr-Inval for device %s\n", pci_name(dev));
> > + cmd |= PCI_COMMAND_INVALIDATE;
> > + pci_write_config_word(dev, PCI_COMMAND, cmd);
> > +
> > + /* read result from hardware (in case bit refused to enable) */
> > + pci_read_config_word(dev, PCI_COMMAND, &cmd);
> > +
> > + return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
> > }
> >
> > /**
>
> Bisection shows that this patch
> (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> suspend-to-disk on my Vaio. It writes the suspend image and gets to the
> point where it's supposed to power down, but doesn't.
>
> After a manual power-cycle it successfully resumes from disk, but
> networking (at least) is dead.
Ok, I'll drop this from my tree too.
Matthew, let me know whn you have a revised patch you wish to have me
include.
thanks,
greg k-h
On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> Bisection shows that this patch
> (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> suspend-to-disk on my Vaio. It writes the suspend image and gets to the
> point where it's supposed to power down, but doesn't.
How odd. What driver is calling pci_set_mwi() on the suspend path?
What drivers do you have loaded on the Vaio?
On Sat, 14 Oct 2006 08:02:49 -0600
Matthew Wilcox <[email protected]> wrote:
> On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> > Bisection shows that this patch
> > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> > suspend-to-disk on my Vaio. It writes the suspend image and gets to the
> > point where it's supposed to power down, but doesn't.
>
> How odd. What driver is calling pci_set_mwi() on the suspend path?
ehci_pci_reinit(). I stuck a dump_stack() in there. See
http://userweb.kernel.org/~akpm/s5000342.jpg
> What drivers do you have loaded on the Vaio?
sony:/home/akpm> lsmod
Module Size Used by
ipw2200 163184 0
sonypi 21484 0
autofs4 19908 1
hidp 16192 2
l2cap 21764 5 hidp
bluetooth 49060 2 hidp,l2cap
sunrpc 154172 1
ip_conntrack_netbios_ns 3328 0
ipt_REJECT 4736 1
xt_state 2496 2
ip_conntrack 51020 2 ip_conntrack_netbios_ns,xt_state
nfnetlink 7128 1 ip_conntrack
xt_tcpudp 3392 4
iptable_filter 3264 1
ip_tables 12616 1 iptable_filter
x_tables 15428 4 ipt_REJECT,xt_state,xt_tcpudp,ip_tables
video 16836 0
sony_acpi 7312 0
sbs 15968 0
i2c_ec 5504 1 sbs
button 7184 0
battery 10692 0
asus_acpi 17564 0
backlight 6464 2 sony_acpi,asus_acpi
ac 5636 0
nvram 8072 0
ohci1394 33264 0
ehci_hcd 30088 0
ieee1394 291896 1 ohci1394
uhci_hcd 22092 0
sg 33628 0
joydev 9920 0
evbug 3392 0
snd_hda_intel 18968 0
snd_hda_codec 161536 1 snd_hda_intel
snd_seq_dummy 4228 0
snd_seq_oss 31744 0
snd_seq_midi_event 7360 1 snd_seq_oss
snd_seq 48208 5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device 8524 3 snd_seq_dummy,snd_seq_oss,snd_seq
ieee80211 30920 1 ipw2200
snd_pcm_oss 41504 0
snd_mixer_oss 16640 1 snd_pcm_oss
ieee80211_crypt 6016 1 ieee80211
snd_pcm 74632 3 snd_hda_intel,snd_hda_codec,snd_pcm_oss
snd_timer 21316 2 snd_seq,snd_pcm
snd 50980 9 snd_hda_intel,snd_hda_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore 7968 1 snd
snd_page_alloc 10376 2 snd_hda_intel,snd_pcm
piix 9604 0 [permanent]
i2c_i801 7820 0
pcspkr 3136 0
i2c_core 21840 2 i2c_ec,i2c_i801
generic 5252 0 [permanent]
ext3 127688 1
jbd 52712 1 ext3
ide_disk 16000 0
ide_core 114780 3 piix,generic,ide_disk
On Sat, Oct 14, 2006 at 01:48:55PM -0700, Andrew Morton wrote:
> On Sat, 14 Oct 2006 08:02:49 -0600
> Matthew Wilcox <[email protected]> wrote:
>
> > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> > > Bisection shows that this patch
> > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> > > suspend-to-disk on my Vaio. It writes the suspend image and gets to the
> > > point where it's supposed to power down, but doesn't.
> >
> > How odd. What driver is calling pci_set_mwi() on the suspend path?
>
> ehci_pci_reinit(). I stuck a dump_stack() in there. See
> http://userweb.kernel.org/~akpm/s5000342.jpg
Thanks for the picture; that's really helpful.
I see. We hibernate all the devices then wake them all back up again.
No doubt there's a good reason for this.
Still doesn't make much sense, though. As far as I can see, the only
consequence of this particular patch is that 1) we do an additional read
from PCI_COMMAND and 2) we can return -EINVAL in one additional case.
But the only effect of returning EINVAL is a printk (for this particular
driver):
/* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
retval = pci_set_mwi(pdev);
if (!retval)
ehci_dbg(ehci, "MWI active\n");
ehci_port_power(ehci, 0);
return 0;
So even if we return EINVAL ... big deal.
Is it possible reading PCI_COMMAND too quickly after writing it causes
a foul-up? That would be weird ...
so I suppose there's a few things I can ask you to try:
1. Stop reading the register back altogether. This should revert the behaviour to the prepatch state:
- pci_read_config_word(dev, PCI_COMMAND, &cmd);
+// pci_read_config_word(dev, PCI_COMMAND, &cmd);
2. Put an mdelay(1); before that line
3. Change the last line to just return 0.
- return (cmd & PCI_COMMAND_INVALIDATE) ? 0 : -EINVAL;
+ return 0;
> > What drivers do you have loaded on the Vaio?
>
> sony:/home/akpm> lsmod
I don't see any of the other drivers calling pci_set_mwi, so i guess we're
looking at the right suspect.
I feel rather guilty about the amount of time you're spending on this;
any bugs you want me to look at as penance?
On Sat, 14 Oct 2006 21:20:01 -0600
Matthew Wilcox <[email protected]> wrote:
> On Sat, Oct 14, 2006 at 01:48:55PM -0700, Andrew Morton wrote:
> > On Sat, 14 Oct 2006 08:02:49 -0600
> > Matthew Wilcox <[email protected]> wrote:
> >
> > > On Fri, Oct 13, 2006 at 09:41:35PM -0700, Andrew Morton wrote:
> > > > Bisection shows that this patch
> > > > (pci-check-that-mwi-bit-really-did-get-set.patch in Greg's PCI tree) breaks
> > > > suspend-to-disk on my Vaio. It writes the suspend image and gets to the
> > > > point where it's supposed to power down, but doesn't.
> > >
> > > How odd. What driver is calling pci_set_mwi() on the suspend path?
> >
> > ehci_pci_reinit(). I stuck a dump_stack() in there. See
> > http://userweb.kernel.org/~akpm/s5000342.jpg
>
> Thanks for the picture; that's really helpful.
>
> I see. We hibernate all the devices then wake them all back up again.
> No doubt there's a good reason for this.
>
> Still doesn't make much sense, though. As far as I can see, the only
> consequence of this particular patch is that 1) we do an additional read
> from PCI_COMMAND and 2) we can return -EINVAL in one additional case.
> But the only effect of returning EINVAL is a printk (for this particular
> driver):
>
> /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
> retval = pci_set_mwi(pdev);
> if (!retval)
> ehci_dbg(ehci, "MWI active\n");
>
> ehci_port_power(ehci, 0);
>
> return 0;
>
> So even if we return EINVAL ... big deal.
>
> Is it possible reading PCI_COMMAND too quickly after writing it causes
> a foul-up? That would be weird ...
>
> so I suppose there's a few things I can ask you to try:
It seems to have stopped happening. I don't know why.
gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch
still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg
> But the only effect of returning EINVAL is a printk (for this particular
> driver):
>
> /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
> retval = pci_set_mwi(pdev);
> if (!retval)
> ehci_dbg(ehci, "MWI active\n");
Erm, I've lost context here but it's completely legit for hardware
to NOT support MWI, so it is in no way an error if it's not set.
(Memory-Write-Invalidate is just a more efficient way to write data
that may be cached; if the device can't issue those cycles, there's
no loss of correctness.)
Since it's not an error, there should be no such printk ... which
is exactly how it's coded above.
Who is issuing the printk on a non-error code path??
- Dave
On Sun, Oct 15, 2006 at 12:08:09AM -0700, David Brownell wrote:
> > But the only effect of returning EINVAL is a printk (for this particular
> > driver):
> >
> > /* PCI Memory-Write-Invalidate cycle support is optional (uncommon) */
> > retval = pci_set_mwi(pdev);
> > if (!retval)
> > ehci_dbg(ehci, "MWI active\n");
>
> Erm, I've lost context here but it's completely legit for hardware
> to NOT support MWI, so it is in no way an error if it's not set.
> (Memory-Write-Invalidate is just a more efficient way to write data
> that may be cached; if the device can't issue those cycles, there's
> no loss of correctness.)
>
> Since it's not an error, there should be no such printk ... which
> is exactly how it's coded above.
>
> Who is issuing the printk on a non-error code path??
Er, that would be the EHCI driver, which you wrote ...
On Sat, Oct 14, 2006 at 11:53:59PM -0700, Andrew Morton wrote:
> It seems to have stopped happening. I don't know why.
Argh. Possibly a mistake during the bisect procedure?
> gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch
> still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg
I believe that; I was going to generate a new patch for that yesterday,
but got distracted trying to debug your other problem. I'll put out a
new version of that patch later today.
Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell:
> Since it's not an error, there should be no such printk ... which
> is exactly how it's coded above.
The underlying bug is that someone marked pci_set_mwi must-check, that's
wrong for most of the drivers that use it. If you remove the must check
annotation from it then the problem and a thousand other spurious
warnings go away.
On Sun, Oct 15, 2006 at 03:21:22PM +0100, Alan Cox wrote:
> Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell:
> > Since it's not an error, there should be no such printk ... which
> > is exactly how it's coded above.
>
> The underlying bug is that someone marked pci_set_mwi must-check, that's
> wrong for most of the drivers that use it. If you remove the must check
> annotation from it then the problem and a thousand other spurious
> warnings go away.
There's only about 20 users of pci_set_mwi ... about 12 of them seem to
check it, one of them uses a variable called
compiler_warning_pointless_fix which leaves about 7 warnings to be
removed by removing the __must_check.
However, I do believe the __must_check should be removed. For example,
the LSI 53c1030 has *nothing* to be done if setting MWI fails. It just
doesn't work, and the device copes. It's not like Tulip or sym53c8xx
where there are additional bits to be set or cleared in control registers.
On Sun, 15 Oct 2006 07:57:56 -0600
Matthew Wilcox <[email protected]> wrote:
> On Sun, Oct 15, 2006 at 03:21:22PM +0100, Alan Cox wrote:
> > Ar Sul, 2006-10-15 am 00:08 -0700, ysgrifennodd David Brownell:
> > > Since it's not an error, there should be no such printk ... which
> > > is exactly how it's coded above.
> >
> > The underlying bug is that someone marked pci_set_mwi must-check, that's
> > wrong for most of the drivers that use it. If you remove the must check
> > annotation from it then the problem and a thousand other spurious
> > warnings go away.
>
> There's only about 20 users of pci_set_mwi ... about 12 of them seem to
> check it, one of them uses a variable called
> compiler_warning_pointless_fix which leaves about 7 warnings to be
> removed by removing the __must_check.
>
> However, I do believe the __must_check should be removed. For example,
> the LSI 53c1030 has *nothing* to be done if setting MWI fails. It just
> doesn't work, and the device copes.
If the drivers doesn't care and if it makes no difference to performance
then just delete the call to pci_set_mwi().
But if MWI _does_ make a difference to performance then we should tell
someone that it isn't working rather than silently misbehaving?
On Sun, 15 Oct 2006 07:54:41 -0600
Matthew Wilcox <[email protected]> wrote:
> On Sat, Oct 14, 2006 at 11:53:59PM -0700, Andrew Morton wrote:
> > It seems to have stopped happening. I don't know why.
>
> Argh. Possibly a mistake during the bisect procedure?
I don't think so - I spent a lot of time fiddling with that because it was
so bizarre to have two patches each of which caused the same failure.
Something has changed, perhaps config: the failure is a bit different now
(happens earlier).
> > gregkh-pci-pci-prevent-user-config-space-access-during-power-state-transitions.patch
> > still breaks suspend though: http://userweb.kernel.org/~akpm/s5000349.jpg
>
> I believe that; I was going to generate a new patch for that yesterday,
> but got distracted trying to debug your other problem. I'll put out a
> new version of that patch later today.
ok.. Add plenty of debug printks to it.
(From Alan Cox:)
> The underlying bug is that someone marked pci_set_mwi must-check, that's
> wrong for most of the drivers that use it. If you remove the must check
> annotation from it then the problem and a thousand other spurious
> warnings go away.
Yes, there seems to be abuse of this new "must_check" feature.
(From Andrew Morton:)
> But if MWI _does_ make a difference to performance then we should tell
> someone that it isn't working rather than silently misbehaving?
Thing is, a "difference to performance (alone)" != "misbehavior".
If it affected correctness, then a warning would be appropriate.
Most drivers should be able to say "enable MWI if possible, but
don't worry if it's not possible". Only a few controllers need
additional setup to make MWI actually work ... if they couldn't
do that setup, that'd be worth a warning before they backed off
to run in a non-MWI mode.
- Dave
On Sun, 15 Oct 2006 12:16:31 -0700
David Brownell <[email protected]> wrote:
> (From Alan Cox:)
> > The underlying bug is that someone marked pci_set_mwi must-check, that's
> > wrong for most of the drivers that use it. If you remove the must check
> > annotation from it then the problem and a thousand other spurious
> > warnings go away.
>
> Yes, there seems to be abuse of this new "must_check" feature.
>
>
> (From Andrew Morton:)
> > But if MWI _does_ make a difference to performance then we should tell
> > someone that it isn't working rather than silently misbehaving?
>
> Thing is, a "difference to performance (alone)" != "misbehavior".
>
> If it affected correctness, then a warning would be appropriate.
>
> Most drivers should be able to say "enable MWI if possible, but
> don't worry if it's not possible". Only a few controllers need
> additional setup to make MWI actually work ... if they couldn't
> do that setup, that'd be worth a warning before they backed off
> to run in a non-MWI mode.
>
So the semantics of pci_set_mwi() are "try to set MWI if this
platform/device supports it".
In that case its interface is misdesigned, because it doesn't discriminate
between "yes-it-does/no-it-doesn't" (which we don't want to report, because
either is expected and legitimate) and "something screwed up", which we do
want to report, because it is always unexpected.
So an appropriate return value protocol would be
0: No error, unable to set MWI
1: No error, able to set MWI
-EFOO: Error
Ar Sul, 2006-10-15 am 10:45 -0700, ysgrifennodd Andrew Morton:
> If the drivers doesn't care and if it makes no difference to performance
> then just delete the call to pci_set_mwi().
>
> But if MWI _does_ make a difference to performance then we should tell
> someone that it isn't working rather than silently misbehaving?
It isn't misbehaving it just isn't available. MWI is rather different to
say pci_set_master() in that it makes a lot of sense for many drivers to
ask for it but not care about the results. Something like pci_set_master
failing is a big problem and has to be handled (although as we often
don't use BIOS PCI services we see fake success in some cases).
MWI is an "extra cheese" option not a "no pizza" case
Alan
> > Most drivers should be able to say "enable MWI if possible, but
> > don't worry if it's not possible". Only a few controllers need
> > additional setup to make MWI actually work ... if they couldn't
> > do that setup, that'd be worth a warning before they backed off
> > to run in a non-MWI mode.
> >
>
> So the semantics of pci_set_mwi() are "try to set MWI if this
> platform/device supports it".
Not what I said ... that's what the _driver_ usually wants to do,
which is different from the step implemented by set_mwi().
What Alan Cox said is a better paraphrase:
> MWI is an "extra cheese" option not a "no pizza" case
Or "sorry, that car is not available in olive, just burgundy."
Not:
> In that case its interface is misdesigned, because it doesn't discriminate
> between "yes-it-does/no-it-doesn't" (which we don't want to report, because
> either is expected and legitimate) and "something screwed up", which we do
> want to report, because it is always unexpected.
You mis-understand. It's completely legit for the driver not to care.
I agree that set_mwo() should set MWI if possible, and fail cleanly
if it couldn't (for whatever reason). Thing is, choosing to treat
that as an error must be the _driver's_ choice ... it'd be wrong to force
that policy into the _interface_ by forcing must_check etc.
- Dave
On Sun, 15 Oct 2006 15:45:58 -0700
David Brownell <[email protected]> wrote:
> > In that case its interface is misdesigned, because it doesn't discriminate
> > between "yes-it-does/no-it-doesn't" (which we don't want to report, because
> > either is expected and legitimate) and "something screwed up", which we do
> > want to report, because it is always unexpected.
>
> You mis-understand. It's completely legit for the driver not to care.
>
> I agree that set_mwo() should set MWI if possible, and fail cleanly
> if it couldn't (for whatever reason). Thing is, choosing to treat
> that as an error must be the _driver's_ choice ... it'd be wrong to force
> that policy into the _interface_ by forcing must_check etc.
No. If pci_set_mwi() detects an unexpected error then the driver should
take some action: report it, recover from it, fail to load, etc. If the
driver fails to do any of this then it's a buggy driver.
You, the driver author _do not know_ what pci_set_mwi() does at present, on
all platforms, nor do you know what it does in the future. For you the
driver author to make assumptions about what's happening inside
pci_set_mwi() is a layering violation. Maybe the bridge got hot-unplugged.
Maybe the attempt to set MWI caused some synchronous PCI error. For
example, take a look at the various implementations of pci_ops.read()
around the place - various of them can fail for various reasons.
Now it could be that an appropriate solution is to make pci_set_mwi()
return only 0 or 1, and to generate a warning from within pci_set_mwi()
if some unexpected error happens. In which case it is legitimate for
callers to not check for errors.
This is not a terribly important issue, and it is far from the worst case
of missed error-checking which we have in there.
Ar Sul, 2006-10-15 am 16:18 -0700, ysgrifennodd Andrew Morton:
> No. If pci_set_mwi() detects an unexpected error then the driver should
> take some action: report it, recover from it, fail to load, etc. If the
> driver fails to do any of this then it's a buggy driver.
Wrong and there are several drivers in the kernel that are proof of
this.
> You, the driver author _do not know_ what pci_set_mwi() does at present, on
> all platforms, nor do you know what it does in the future. For you the
You don't care. It isn't an error for set_mwi to fail. In fact the only
reason set_mwi even needs to bother with a return code is that some
chips want you to set other config private to the device if it is
available and active.
Alan
On Mon, 16 Oct 2006 01:02:40 +0100
Alan Cox <[email protected]> wrote:
> Ar Sul, 2006-10-15 am 16:18 -0700, ysgrifennodd Andrew Morton:
> > No. If pci_set_mwi() detects an unexpected error then the driver should
> > take some action: report it, recover from it, fail to load, etc. If the
> > driver fails to do any of this then it's a buggy driver.
>
> Wrong and there are several drivers in the kernel that are proof of
> this.
Let me restore the words from my earlier email which you removed so that
you could say that:
For you the driver author to make assumptions about what's happening
inside pci_set_mwi() is a layering violation. Maybe the bridge got
hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI
error. For example, take a look at the various implementations of
pci_ops.read() around the place - various of them can fail for various
reasons.
> > You, the driver author _do not know_ what pci_set_mwi() does at present, on
> > all platforms, nor do you know what it does in the future. For you the
>
> You don't care. It isn't an error for set_mwi to fail. In fact the only
> reason set_mwi even needs to bother with a return code is that some
> chips want you to set other config private to the device if it is
> available and active.
>
Let me restore the words from my earlier email which you removed which
address that:
Now it could be that an appropriate solution is to make pci_set_mwi()
return only 0 or 1, and to generate a warning from within pci_set_mwi()
if some unexpected error happens. In which case it is legitimate for
callers to not check for errors.
Andrew Morton writes:
> If the drivers doesn't care and if it makes no difference to performance
> then just delete the call to pci_set_mwi().
>
> But if MWI _does_ make a difference to performance then we should tell
> someone that it isn't working rather than silently misbehaving?
That sounds like we need a printk inside pci_set_mwi then, rather than
adding a printk to every single caller.
Paul.
On Mon, 16 Oct 2006 10:00:25 +1000 Paul Mackerras <[email protected]> wrote:
> Andrew Morton writes:
>
> > If the drivers doesn't care and if it makes no difference to performance
> > then just delete the call to pci_set_mwi().
> >
> > But if MWI _does_ make a difference to performance then we should tell
> > someone that it isn't working rather than silently misbehaving?
>
> That sounds like we need a printk inside pci_set_mwi then, rather than
> adding a printk to every single caller.
>
I think so, yes. That's a good solution to a lot of these things.
> > I agree that set_mwo() should set MWI if possible, and fail cleanly
> > if it couldn't (for whatever reason). Thing is, choosing to treat
> > that as an error must be the _driver's_ choice ... it'd be wrong to force
> > that policy into the _interface_ by forcing must_check etc.
>
> No. If pci_set_mwi() detects an unexpected error then the driver should
> take some action: report it, recover from it, fail to load, etc. If the
> driver fails to do any of this then it's a buggy driver.
But what is an "unexpected" "error"? Not every fault that's unexpected
is an error; consider a page fault.
Consider also kfree(NULL). The same motivation for drivers not needing
to validate that parameter is behind arguing that device drivers should
not need to poke around in PCI config space to verify that the device
supports MWI; and should have the flexibility to ignore the return code,
just like kfree() returns no value.
> You, the driver author _do not know_ what pci_set_mwi() does at present, on
> all platforms, nor do you know what it does in the future.
I know that it enables MWI accesses ... or fails. Beyond that, there
should be no reason to care. If the hardware can use a lower-overhead
type of PCI bus cycle, I want it to do so. If not, no sweat.
> This is not a terribly important issue, and it is far from the worst case
> of missed error-checking which we have in there.
The reason I think it's important enough to continue this discussion is
that as it currently stands, it's a good example of a **BAD** interface
design ... since it's pointlessly marked as must_check. (See appended
patch to fix that issue.)
If you wouldn't want all callers of kfree() to say "if (ptr) kfree(ptr);";
why then would anyone want to demand
... read the pci config space byte (and cleanly handle errors) ...
... check for the MWI bit ...
... if it's set (so we "expect" this next call to succeed) then:
... call pci_set_mwi() ...
... test set_mwi() value ...
... ignore that value ...
where the first three lines duplicate code _inside_ pci_set_mwi(), and the
last two lines are obviously a pure waste of code and effort. I'd want to
know the criteria by which "if(ptr)" is judged a waste of effort in all
callers, but that more extensive PCI configspace logic was not...
- Dave
-------------------- CUT HERE
Remove bogus annotation of pci_set_mwi() as __must_check. It's completely
reasonable for drivers to not care about the return code, when all they're
doing is enabling an optional performance-improving mechanism that's often
not even available.
Signed-off-by: David Brownell <[email protected]>
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -499,7 +499,7 @@ int __must_check pci_enable_device_bars(
void pci_disable_device(struct pci_dev *dev);
void pci_set_master(struct pci_dev *dev);
#define HAVE_PCI_SET_MWI
-int __must_check pci_set_mwi(struct pci_dev *dev);
+int pci_set_mwi(struct pci_dev *dev);
void pci_clear_mwi(struct pci_dev *dev);
void pci_intx(struct pci_dev *dev, int enable);
int pci_set_dma_mask(struct pci_dev *dev, u64 mask);
> > If the drivers doesn't care and if it makes no difference to performance
> > then just delete the call to pci_set_mwi().
> >
> > But if MWI _does_ make a difference to performance then we should tell
> > someone that it isn't working rather than silently misbehaving?
To repeat: it's not "misbehaving" since support for MWI cycles is
completely optional. It'd be more interesting to get messages in
the cases that it can be enabled, since typically it can't be.
> That sounds like we need a printk inside pci_set_mwi then, rather than
> adding a printk to every single caller.
Maybe wrapped in #ifdef CONFIG_SPAM_MY_KERNEL_LOG_MESSAGES ... :)
On Sun, 15 Oct 2006 17:16:35 -0700
David Brownell <[email protected]> wrote:
>
> > You, the driver author _do not know_ what pci_set_mwi() does at present, on
> > all platforms, nor do you know what it does in the future.
>
> I know that it enables MWI accesses ... or fails. Beyond that, there
> should be no reason to care. If the hardware can use a lower-overhead
> type of PCI bus cycle, I want it to do so. If not, no sweat.
>
There are two reasons why it can fail:
1: The bus doesn't support MWI. Here, the caller doesn't care.
2: The bus _does_ support MWI, but the attempt to enable it failed.
Here we very much do care, because we're losing performance.
>
> > This is not a terribly important issue, and it is far from the worst case
> > of missed error-checking which we have in there.
>
> The reason I think it's important enough to continue this discussion is
> that as it currently stands, it's a good example of a **BAD** interface
> design ... since it's pointlessly marked as must_check. (See appended
> patch to fix that issue.)
It's important to continue this discussion so that certain principles can
be set and agreed to. Because we have a *lot* of unchecked errors in
there. We would benefit from setting guidelines establishing
- Which sorts of errors should be handled in callers
- Which sorts of errors should be handled (ie: just reported) in callees
- Which sorts of errors should be handled in neither callers nor callees
(are there any of these?)
- Whether is it ever legitimate for a caller to not check the return code
from a callee which can return -EFOO. (I suspect not - it probably
indicates a misdesign in the callee, as in this case).
Andrew Morton writes:
> Let me restore the words from my earlier email which you removed so that
> you could say that:
>
> For you the driver author to make assumptions about what's happening
> inside pci_set_mwi() is a layering violation. Maybe the bridge got
> hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI
> error. For example, take a look at the various implementations of
> pci_ops.read() around the place - various of them can fail for various
> reasons.
Maybe aliens are firing a ray-gun at the card. I think it's
fundamentally wrong for the driver to deny service completely because
of a maybe.
Either there was a transient error that only affected the attempt to
set MWI, in which case a printk (inside pci_set_mwi!) is appropriate,
and we carry on. Or there is a persistent error condition, in which
case the driver will see something else fail soon enough - something
that the driver actually needs to have working in order to operate -
and fail at that point.
For the driver to stop and refuse to go any further because of an
error in pci_set_mwi has far more disadvantages than advantages.
Paul.
On Mon, 16 Oct 2006 10:44:30 +1000
Paul Mackerras <[email protected]> wrote:
> Andrew Morton writes:
>
> > Let me restore the words from my earlier email which you removed so that
> > you could say that:
> >
> > For you the driver author to make assumptions about what's happening
> > inside pci_set_mwi() is a layering violation. Maybe the bridge got
> > hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI
> > error. For example, take a look at the various implementations of
> > pci_ops.read() around the place - various of them can fail for various
> > reasons.
>
> Maybe aliens are firing a ray-gun at the card. I think it's
> fundamentally wrong for the driver to deny service completely because
> of a maybe.
>
> Either there was a transient error that only affected the attempt to
> set MWI, in which case a printk (inside pci_set_mwi!) is appropriate,
> and we carry on. Or there is a persistent error condition, in which
> case the driver will see something else fail soon enough - something
> that the driver actually needs to have working in order to operate -
> and fail at that point.
>
> For the driver to stop and refuse to go any further because of an
> error in pci_set_mwi has far more disadvantages than advantages.
>
Sure.
So I think what we're needing in this case is:
- A modified version of Willy's patch which returns 0 if MWI was enabled,
1 if MWI isn't available.
- A printk if something went bad
It appears that the various functions which try to match the line sizes
already have printks if something went wrong, but they're using
KERN_DEBUG facility level and that warning would dupliate the new one in
pci_set_mwi().
And although the various implementations of pci_read_config_foo() and
pci_write_config_foo() can return error codes, we have so many instances
where we're not checking it, I don't think it's practical to try to start
doing that everywhere.
- drop the __must_check.
Question is, should pci_set_mwi() ever return -EFOO? I guess it should, in
the case where setting the line size didn't work out.
On Sunday 15 October 2006 6:10 pm, Andrew Morton wrote:
> - A printk if something went bad
Where "bad" would I hope exclude cases where the device
doesn't support MWI ... that is, the "go faster if you can"
advice from the driver, where it isn't an error to run into
the common case of _this_ implementation not happening to
be able to issue MWI cycles.
The other cases, where something actually went wrong, would
be reasonable for emitting KERN_ERR or KERN_WARNING messages.
- Dave
Ar Sul, 2006-10-15 am 18:10 -0700, ysgrifennodd Andrew Morton:
> Question is, should pci_set_mwi() ever return -EFOO? I guess it should, in
> the case where setting the line size didn't work out.
It does no harm, no driver will ever check anything but 0 v !0 because
the handling is no different in either case.
Ar Sul, 2006-10-15 am 17:16 -0700, ysgrifennodd David Brownell:
> Signed-off-by: David Brownell <[email protected]>
Acked-by: Alan Cox <[email protected]>
>
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -499,7 +499,7 @@ int __must_check pci_enable_device_bars(
> void pci_disable_device(struct pci_dev *dev);
> void pci_set_master(struct pci_dev *dev);
> #define HAVE_PCI_SET_MWI
> -int __must_check pci_set_mwi(struct pci_dev *dev);
> +int pci_set_mwi(struct pci_dev *dev);
> void pci_clear_mwi(struct pci_dev *dev);
> void pci_intx(struct pci_dev *dev, int enable);
> int pci_set_dma_mask(struct pci_dev *dev, u64 mask);
Ar Sul, 2006-10-15 am 16:44 -0700, ysgrifennodd Andrew Morton:
> Let me restore the words from my earlier email which you removed so that
> you could say that:
>
> For you the driver author to make assumptions about what's happening
> inside pci_set_mwi() is a layering violation. Maybe the bridge got
> hot-unplugged. Maybe the attempt to set MWI caused some synchronous PCI
> error. For example, take a look at the various implementations of
> pci_ops.read() around the place - various of them can fail for various
> reasons.
Let me repeat what I said before. As a driver author I do not care. It
doesn't matter if it failed because it is not supported or because a
pink elephant went for a dance on the PCI bus.
> Now it could be that an appropriate solution is to make pci_set_mwi()
> return only 0 or 1, and to generate a warning from within pci_set_mwi()
> if some unexpected error happens. In which case it is legitimate for
> callers to not check for errors.
That would be my belief, and ditto for a lot of these other functions -
even the correctly __must_check ones like pci_set_master should do the
error reporting in the set_master() function etc not in every driver.
That gives us a single consistent printk and avoids missing them out or
bloat.
Alan