Hi,
As mentioned in Documentaion/pci.txt, pci device driver should call
pci_disable_device() to deallocate any IRQ resources, disable PCI
bus-mastering and etc. when it decides to stop using the device.
But there seems to be many drivers that don't use pci_disable_device()
properly so far.
The following patch changes pci_device_remove() to call
pci_disable_device() instead of the driver if the device has not been
disabled by the driver.
Signed-off-by: Kenji Kaneshige <[email protected]>
---
linux-2.6.9-rc1-kanesige/drivers/pci/pci-driver.c | 7 +++++++
1 files changed, 7 insertions(+)
diff -puN drivers/pci/pci-driver.c~force_pci_disable_device drivers/pci/pci-driver.c
--- linux-2.6.9-rc1/drivers/pci/pci-driver.c~force_pci_disable_device 2004-09-01 17:42:40.000000000 +0900
+++ linux-2.6.9-rc1-kanesige/drivers/pci/pci-driver.c 2004-09-02 14:54:39.824783993 +0900
@@ -291,6 +291,13 @@ static int pci_device_remove(struct devi
drv->remove(pci_dev);
pci_dev->driver = NULL;
}
+ /*
+ * If the device has not been disabled, we call
+ * pci_disable_device() instead of the driver.
+ */
+ if (pci_dev->is_enabled)
+ pci_disable_device(pci_dev);
+
pci_dev_put(pci_dev);
return 0;
}
On Maw, 2004-09-07 at 02:26, Kenji Kaneshige wrote:
> Hi,
>
> As mentioned in Documentaion/pci.txt, pci device driver should call
> pci_disable_device() to deallocate any IRQ resources, disable PCI
> bus-mastering and etc. when it decides to stop using the device.
> But there seems to be many drivers that don't use pci_disable_device()
> properly so far.
Think about unloading frame buffers or PCI devices with multiple
functions and multiple drivers. I agree the drivers definitely want
fixing where appropriate. I'm not sure your approach is safe (although a
debug printk would work wonders perhaps ?)
Another question: When I suggested doing exactly this for kexec I got
flamed by people claiming that disabling bus mastering isnt a defined
operation only enabling it. I'm still unconvinced by their protestations
but wonder what the PCI gurus can answer here.
Alan Cox wrote:
> On Maw, 2004-09-07 at 02:26, Kenji Kaneshige wrote:
>> Hi,
>>
>> As mentioned in Documentaion/pci.txt, pci device driver should call
>> pci_disable_device() to deallocate any IRQ resources, disable PCI
>> bus-mastering and etc. when it decides to stop using the device.
>> But there seems to be many drivers that don't use pci_disable_device()
>> properly so far.
>
> Think about unloading frame buffers or PCI devices with multiple
> functions and multiple drivers. I agree the drivers definitely want
> fixing where appropriate. I'm not sure your approach is safe (although a
I don't understand what you are worried about. Could you tell me
what would be a problem with frame buffers or PCI devices with
multiple functions?
> debug printk would work wonders perhaps ?)
Exactly. I think debug printk is more important because what we
really need is fixing all broken drivers which don't call
pci_disable_device().
Thanks,
Kenji Kaneshige
On Mer, 2004-09-08 at 04:14, Kenji Kaneshige wrote:
> > Think about unloading frame buffers or PCI devices with multiple
> > functions and multiple drivers. I agree the drivers definitely want
> > fixing where appropriate. I'm not sure your approach is safe (although a
>
> I don't understand what you are worried about. Could you tell me
> what would be a problem with frame buffers or PCI devices with
> multiple functions?
If I have a framebuffer driver loaded for my video card in bitmap mode
all is well. If I unload it you then disable the video hardware even
though it would still be otherwise usable in text mode.
The same occurs when one PCI device has multiple functions (not PCI
functions but linux drivers using it). There are some examples of this
such as the CS5520 where one BAR is the IDE controller.
Alan Cox wrote:
> On Mer, 2004-09-08 at 04:14, Kenji Kaneshige wrote:
>> > Think about unloading frame buffers or PCI devices with multiple
>> > functions and multiple drivers. I agree the drivers definitely want
>> > fixing where appropriate. I'm not sure your approach is safe (although a
>>
>> I don't understand what you are worried about. Could you tell me
>> what would be a problem with frame buffers or PCI devices with
>> multiple functions?
>
> If I have a framebuffer driver loaded for my video card in bitmap mode
> all is well. If I unload it you then disable the video hardware even
> though it would still be otherwise usable in text mode.
>
> The same occurs when one PCI device has multiple functions (not PCI
> functions but linux drivers using it). There are some examples of this
> such as the CS5520 where one BAR is the IDE controller.
>
Thank you for answering.
I understand that there are some devices that need to be enabled
even after their drivers are unloaded, and my approach might not
be safe in this case. I think the best way to solve the problem
(missing pci_disable_device) is to fix broken drivers one by one.
I think debug printk will helpful to fix those drivers, but I
don't know what kind of message is appropriate...
Thanks,
Kenji Kaneshige
On Thu, Sep 09, 2004 at 02:55:39PM +0900, Kenji Kaneshige wrote:
> Alan Cox wrote:
> > On Mer, 2004-09-08 at 04:14, Kenji Kaneshige wrote:
> >> > Think about unloading frame buffers or PCI devices with multiple
> >> > functions and multiple drivers. I agree the drivers definitely want
> >> > fixing where appropriate. I'm not sure your approach is safe (although a
> >>
> >> I don't understand what you are worried about. Could you tell me
> >> what would be a problem with frame buffers or PCI devices with
> >> multiple functions?
> >
> > If I have a framebuffer driver loaded for my video card in bitmap mode
> > all is well. If I unload it you then disable the video hardware even
> > though it would still be otherwise usable in text mode.
> >
> > The same occurs when one PCI device has multiple functions (not PCI
> > functions but linux drivers using it). There are some examples of this
> > such as the CS5520 where one BAR is the IDE controller.
> >
>
> Thank you for answering.
>
> I understand that there are some devices that need to be enabled
> even after their drivers are unloaded, and my approach might not
> be safe in this case. I think the best way to solve the problem
> (missing pci_disable_device) is to fix broken drivers one by one.
> I think debug printk will helpful to fix those drivers, but I
> don't know what kind of message is appropriate...
Yes, this should be pointed out with a warning message, which will be
safer. How about something like:
dev_warn(&pci_dev->dev, "Device was removed without properly "
"calling pci_disable_device(), please fix.\n");
WARN_ON(1);
Care to redo your patch with that?
thanks,
greg k-h
Greg KH wrote:
>> I understand that there are some devices that need to be enabled
>> even after their drivers are unloaded, and my approach might not
>> be safe in this case. I think the best way to solve the problem
>> (missing pci_disable_device) is to fix broken drivers one by one.
>> I think debug printk will helpful to fix those drivers, but I
>> don't know what kind of message is appropriate...
>
> Yes, this should be pointed out with a warning message, which will be
> safer. How about something like:
>
> dev_warn(&pci_dev->dev, "Device was removed without properly "
> "calling pci_disable_device(), please fix.\n");
> WARN_ON(1);
>
> Care to redo your patch with that?
Thank you for your advice.
I changed my patch based on your feedback. But I have one
concern about putting "WARN_ON(1);". I'm worrying that people
might be surprised if stack dump is shown on their console,
especially if the broken driver handles many devices.
For example, following console messages were displayed when I
tested my patch by loading/unloading 'uhci_hcd' which handles
two devices on my machine. How do you think?
Thanks,
Kenji Kaneshige
uhci_hcd 0000:00:1d.0: USB bus 1 deregistered
uhci_hcd 0000:00:1d.0: Device was removed without properly calling pci_disable_device(), please fix.
Badness in pci_device_remove at drivers/pci/pci-driver.c:301
Call Trace:
[<a000000100019e40>] show_stack+0x80/0xa0
sp=e0000002e0177c10 bsp=e0000002e01710c8
[<a000000100398340>] pci_device_remove+0x140/0x1a0
sp=e0000002e0177de0 bsp=e0000002e01710a0
[<a000000100448be0>] device_release_driver+0x120/0x140
sp=e0000002e0177de0 bsp=e0000002e0171078
[<a000000100448c50>] driver_detach+0x50/0xa0
sp=e0000002e0177de0 bsp=e0000002e0171058
[<a000000100449780>] bus_remove_driver+0xc0/0x180
sp=e0000002e0177de0 bsp=e0000002e0171038
[<a00000010044a250>] driver_unregister+0x30/0xc0
sp=e0000002e0177de0 bsp=e0000002e0171020
[<a000000100398920>] pci_unregister_driver+0x20/0x60
sp=e0000002e0177de0 bsp=e0000002e0171008
[<a0000002000e18b0>] uhci_hcd_cleanup+0x30/0x130 [uhci_hcd]
sp=e0000002e0177de0 bsp=e0000002e0170fe8
[<a0000001000d0a40>] sys_delete_module+0x440/0x4e0
sp=e0000002e0177de0 bsp=e0000002e0170f78
[<a0000001000120a0>] ia64_ret_from_syscall+0x0/0x20
sp=e0000002e0177e30 bsp=e0000002e0170f78
uhci_hcd 0000:00:1d.1: remove, state 1
usb usb2: USB disconnect, address 1
uhci_hcd 0000:00:1d.1: USB bus 2 deregistered
uhci_hcd 0000:00:1d.1: Device was removed without properly calling pci_disable_device(), please fix.
Badness in pci_device_remove at drivers/pci/pci-driver.c:301
Call Trace:
[<a000000100019e40>] show_stack+0x80/0xa0
sp=e0000002e0177c10 bsp=e0000002e01710c8
[<a000000100398340>] pci_device_remove+0x140/0x1a0
sp=e0000002e0177de0 bsp=e0000002e01710a0
[<a000000100448be0>] device_release_driver+0x120/0x140
sp=e0000002e0177de0 bsp=e0000002e0171078
[<a000000100448c50>] driver_detach+0x50/0xa0
sp=e0000002e0177de0 bsp=e0000002e0171058
[<a000000100449780>] bus_remove_driver+0xc0/0x180
sp=e0000002e0177de0 bsp=e0000002e0171038
[<a00000010044a250>] driver_unregister+0x30/0xc0
sp=e0000002e0177de0 bsp=e0000002e0171020
[<a000000100398920>] pci_unregister_driver+0x20/0x60
sp=e0000002e0177de0 bsp=e0000002e0171008
[<a0000002000e18b0>] uhci_hcd_cleanup+0x30/0x130 [uhci_hcd]
sp=e0000002e0177de0 bsp=e0000002e0170fe8
[<a0000001000d0a40>] sys_delete_module+0x440/0x4e0
sp=e0000002e0177de0 bsp=e0000002e0170f78
[<a0000001000120a0>] ia64_ret_from_syscall+0x0/0x20
sp=e0000002e0177e30 bsp=e0000002e0170f78
On Iau, 2004-09-09 at 11:29, Kenji Kaneshige wrote:
> > dev_warn(&pci_dev->dev, "Device was removed without properly "
> > "calling pci_disable_device(), please fix.\n");
> > WARN_ON(1);
> >
"This may need fixing" would be better than "please fix" as it may be
a wrong warning
> I changed my patch based on your feedback. But I have one
> concern about putting "WARN_ON(1);". I'm worrying that people
> might be surprised if stack dump is shown on their console,
> especially if the broken driver handles many devices.
You could put
#ifdef CONFIG_DEBUG_KERNEL
#endif
around that section, then only users selecting kernel debugging would
be bothered by it.
On Thu, Sep 09, 2004 at 07:29:09PM +0900, Kenji Kaneshige wrote:
> Greg KH wrote:
> >>I understand that there are some devices that need to be enabled
> >>even after their drivers are unloaded, and my approach might not
> >>be safe in this case. I think the best way to solve the problem
> >>(missing pci_disable_device) is to fix broken drivers one by one.
> >>I think debug printk will helpful to fix those drivers, but I
> >>don't know what kind of message is appropriate...
> >
> >Yes, this should be pointed out with a warning message, which will be
> >safer. How about something like:
> >
> > dev_warn(&pci_dev->dev, "Device was removed without properly "
> > "calling pci_disable_device(), please
> > fix.\n");
> > WARN_ON(1);
> >
> >Care to redo your patch with that?
>
> Thank you for your advice.
>
> I changed my patch based on your feedback. But I have one
> concern about putting "WARN_ON(1);". I'm worrying that people
> might be surprised if stack dump is shown on their console,
> especially if the broken driver handles many devices.
>
> For example, following console messages were displayed when I
> tested my patch by loading/unloading 'uhci_hcd' which handles
> two devices on my machine. How do you think?
I like Alan's advice. Also, a patch for the uhci-hcd driver would be
nice to have :)
thanks,
greg k-h
Well, the kexec folks have a bit of a point (but not IMO a strong one).
Overall to be friendly with firmware and multi-driver situations we
should try and _restore_ the state of the PCI device prior to
pci_enable_device(), when pci_disable_device() is called.
Some situations -- namely BIOS/ACPI/SMM and kexec -- really do care
about the state of the hardware at shutdown time. Other situations are
purely Linux problems, like what Jens ran into a month or more ago:
* IDE driver loads, including on legacy addresses
* modprobe ata_piix
* pci_enable_device() on a running device
* notice, according to pci_request_regions, that regions are busy
* pci_disable_device()
* IDE suddenly stops working because we disabled IO/MEM
However, with regards to bus-mastering bit specifically, it is
considered a Real Good Idea on a lot of the hardware I mess with to
disable busmastering when you shut down the hardware.
Jeff
Alan Cox wrote:
> On Iau, 2004-09-09 at 11:29, Kenji Kaneshige wrote:
>> > dev_warn(&pci_dev->dev, "Device was removed without properly "
>> > "calling pci_disable_device(), please fix.\n");
>> > WARN_ON(1);
>> >
>
> "This may need fixing" would be better than "please fix" as it may be
> a wrong warning
>
Yes.
I should have considered drivers that intentionally don't disable
devices. I'll change the message.
>> I changed my patch based on your feedback. But I have one
>> concern about putting "WARN_ON(1);". I'm worrying that people
>> might be surprised if stack dump is shown on their console,
>> especially if the broken driver handles many devices.
>
> You could put
>
> #ifdef CONFIG_DEBUG_KERNEL
>
> #endif
>
> around that section, then only users selecting kernel debugging would
> be bothered by it.
>
Thank you for advice.
But I don't know if we should take this approach, because
'CONFIG_DEBUG_KERNEL' is set by default on RedHat and some
other distros.
How do you think?
Thanks,
Kenji Kaneshige
Greg KH wrote:
>
> I like Alan's advice. Also, a patch for the uhci-hcd driver would be
> nice to have :)
>
Sure, I'll make a patch for the uhci-hcd driver (and ehci-hcd,
e1000, etc. also).
Thanks,
Kenji Kaneshige
Here is an updated patch for missing pci_disable_device().
Greg, please apply.
Thanks,
Kenji Kaneshige
As mentioned in Documentaion/pci.txt, pci device driver should call
pci_disable_device() when it decides to stop using the device. But
there are some drivers that don't use pci_disable_device() so far.
This patch adds warning messages that are displayed if the device is
removed without properly calling pci_disable_device().
'WARN_ON(1)' is commented out for now because I guess many people
(including some distros) enables 'CONFIG_DEBUG_KERNEL'. People might
be surprised if many stack dumps are displayed on their console.
Signed-off-by: Kenji Kaneshige <[email protected]>
---
linux-2.6.9-rc1-kanesige/drivers/pci/pci-driver.c | 13 +++++++++++++
1 files changed, 13 insertions(+)
diff -puN drivers/pci/pci-driver.c~force_pci_disable_device drivers/pci/pci-driver.c
--- linux-2.6.9-rc1/drivers/pci/pci-driver.c~force_pci_disable_device 2004-09-13 12:41:23.588330045 +0900
+++ linux-2.6.9-rc1-kanesige/drivers/pci/pci-driver.c 2004-09-13 12:41:23.591259749 +0900
@@ -291,6 +291,19 @@ static int pci_device_remove(struct devi
drv->remove(pci_dev);
pci_dev->driver = NULL;
}
+
+#ifdef CONFIG_DEBUG_KERNEL
+ /*
+ * If the driver decides to stop using the device, it should
+ * call pci_disable_device().
+ */
+ if (pci_dev->is_enabled) {
+ dev_warn(&pci_dev->dev, "Device was removed without properly "
+ "calling pci_disable_device(). This may need fixing.\n");
+ /* WARN_ON(1); */
+ }
+#endif /* CONFIG_DEBUG_KERNEL */
+
pci_dev_put(pci_dev);
return 0;
}
_
On Mon, Sep 13, 2004 at 12:55:18PM +0900, Kenji Kaneshige wrote:
> Here is an updated patch for missing pci_disable_device().
> Greg, please apply.
Applied, thanks.
greg k-h