2006-10-12 01:47:21

by Jeff Garzik

[permalink] [raw]
Subject: [PATCH] serial: handle pci_enable_device() failure upon resume



Signed-off-by: Jeff Garzik <[email protected]>

---

drivers/serial/8250_pci.c | 8 ++++++--

diff --git a/drivers/serial/8250_pci.c b/drivers/serial/8250_pci.c
index 4d0ff8f..aa96e94 100644
--- a/drivers/serial/8250_pci.c
+++ b/drivers/serial/8250_pci.c
@@ -1810,16 +1810,20 @@ static int pciserial_resume_one(struct p
pci_restore_state(dev);

if (priv) {
+ int rc;
+
/*
* The device may have been disabled. Re-enable it.
*/
- pci_enable_device(dev);
+ rc = pci_enable_device(dev);
+ if (rc)
+ return rc;

pciserial_resume_ports(priv);
}
return 0;
}
-#endif
+#endif /* CONFIG_PM */

static struct pci_device_id serial_pci_tbl[] = {
{ PCI_VENDOR_ID_V3, PCI_DEVICE_ID_V3_V960,


2006-10-13 08:00:06

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] serial: handle pci_enable_device() failure upon resume

On Wed, Oct 11, 2006 at 09:47:20PM -0400, Jeff Garzik wrote:
> Signed-off-by: Jeff Garzik <[email protected]>

NAK. What happens to the ports if pci_enable_device() fails and someone
has them open?

It's far better to leave the must_check warning behind until it can be
_correctly_ solved, rather than papering over the problem with bogus
patches to return errors without taking an appropriate additional action.

IOW, the warnings serve as a reminder that *proper* error handling needs
to be implemented.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2006-10-13 20:15:58

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] serial: handle pci_enable_device() failure upon resume

On Fri, 13 Oct 2006 08:59:53 +0100
Russell King <[email protected]> wrote:

> On Wed, Oct 11, 2006 at 09:47:20PM -0400, Jeff Garzik wrote:
> > Signed-off-by: Jeff Garzik <[email protected]>
>
> NAK. What happens to the ports if pci_enable_device() fails and someone
> has them open?

They're screwed either way.

> It's far better to leave the must_check warning behind until it can be
> _correctly_ solved, rather than papering over the problem with bogus
> patches to return errors without taking an appropriate additional action.
>
> IOW, the warnings serve as a reminder that *proper* error handling needs
> to be implemented.

What would that error handling do?

Until that has been implemented, it would be good if we could at least spit
a printk telling people what failed, so when the machine later goes kaput
we know why it happened.

An appropriate place in which to perform that reporting is up in the
high-level resume code, so Jeff's patch is appropriate.

Right now, we get silent failure *and* a compile-time warning. It's hard
to see how that situation could be made worse.

2006-10-13 22:50:20

by Alan

[permalink] [raw]
Subject: Re: [PATCH] serial: handle pci_enable_device() failure upon resume

Ar Gwe, 2006-10-13 am 13:15 -0700, ysgrifennodd Andrew Morton:
> Right now, we get silent failure *and* a compile-time warning. It's hard
> to see how that situation could be made worse.

I support Russells NAK but for another reason. You can't use printk in
the paths of a serial driver that may be the console. There is already a
whole ordering issue here and I'm not convinced it is handled correctly
reviewing the diff. Printks will make it worse in any driver (and waste
memory on handling impossible untestable situations that don't appen)

The printk case and serial first ordering want fixing correctly because
serial console is about the only sane way to debug resume in the first
place.

Alan

2006-10-14 08:38:37

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] serial: handle pci_enable_device() failure upon resume

On Fri, Oct 13, 2006 at 01:15:16PM -0700, Andrew Morton wrote:
> On Fri, 13 Oct 2006 08:59:53 +0100
> Russell King <[email protected]> wrote:
> > On Wed, Oct 11, 2006 at 09:47:20PM -0400, Jeff Garzik wrote:
> > > Signed-off-by: Jeff Garzik <[email protected]>
> >
> > NAK. What happens to the ports if pci_enable_device() fails and someone
> > has them open?
>
> They're screwed either way.
>
> > It's far better to leave the must_check warning behind until it can be
> > _correctly_ solved, rather than papering over the problem with bogus
> > patches to return errors without taking an appropriate additional action.
> >
> > IOW, the warnings serve as a reminder that *proper* error handling needs
> > to be implemented.
>
> What would that error handling do?

Whatever would be correct for the driver.

In the general case of serial drivers, we can probably get away with
removing the port, but I think it'll need a couple of changes to
serial_core to prevent any spurious accesses to the device.

In the case of 8250_pci, slightly changing the way the remove method
works will probably be sufficient, provided the above is also fixed.

So, there _are_ things which can be done, and it requires someone
knowledgeable about the driver to work them out. Handling this case
is NOT a janitorial task by any means.

> Until that has been implemented, it would be good if we could at least spit
> a printk telling people what failed, so when the machine later goes kaput
> we know why it happened.
>
> An appropriate place in which to perform that reporting is up in the
> high-level resume code, so Jeff's patch is appropriate.

No it isn't. What could the high level resume code do? Call the driver's
remove method? Some driver remove methods hit the hardware as well, so
that wouldn't be _any_ different to the current situation of not checking
the pci_enable_device() return code.

> Right now, we get silent failure *and* a compile-time warning. It's hard
> to see how that situation could be made worse.

As I said, an oops. The nature of this oops could result in:
- user data being destroyed.
- prevention any driver being subsequently loaded or unloaded.

The approach that people are taking at the moment seems to be:
- let's make some random functions __must_check
- oh dear we get lots of warnings
- we must do something to shut these warnings up
- lets think of something and apply it without thought to all drivers
introducing a lot of potential oops-creating scenarios

We are far better off with the warning - at least there is *something*
to tell driver authors that they need to fix their code.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core