2014-01-02 22:47:07

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH][tentative] PCI / ACPI: Rework PCI host bridge removal to avoid sysfs warnings

On Tue, Dec 31, 2013 at 1:03 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Tuesday, December 31, 2013 10:45:46 AM Yinghai Lu wrote:
>> On Mon, Dec 30, 2013 at 5:15 AM, Rafael J. Wysocki <[email protected]> wrote:
>> > On Monday, December 30, 2013 01:51:28 PM Rafael J. Wysocki wrote:
>> >> > We have patches that need to stop ioapic and iommu between
>> >> > pci_stop_root_bus and pci_remove_root_bus.
>> >
>> > BTW, what *exactly* do they need to be stopped between? After these two patches:
>>
>> need to stop regular pci drivers before stop "driver" for ioapic/dmar.
>>
>> >
>> >> > Please check if the problem still happen after
>> >> >
>> >> > http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/patch/?id=c4a0a5d964e90b93eb4101c3927b788e083e530f
>> >> >
>> >> > http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/patch/?id=e3b439e1d315aff59c1b4f0fc43c5cd2d82b4138
>> >
>> > pci_stop_root_bus() is just a walk over devices on the root bus stopping
>> > them and pci_remove_root_bus() starts with the removal of those devices.
>> >
>> > Surely, those two list walks can be combined into one?
>>
>> maybe ok, but we have to problem to make sure stop pci drivers before
>> "driver" for ioapic/dmar.
>
> That's fine, but ioapic/dmar stopping need not happen between the stopping of
> drivers and removing of devices on the root bus I suppose?
>
> Actually, I think that the ioapic/dmar stopping should be carried out after
> removing all of the root bus devices, or it can be racy with respect to a
> driver reload. Isn't that the case?

No. It should be before removing all root bus devices.
as they need to access the pci devices during stop ioapic and dmar.

Also ioapic itself could one one pci device.

Yinghai


2014-01-03 00:31:49

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH][tentative] PCI / ACPI: Rework PCI host bridge removal to avoid sysfs warnings

On Thursday, January 02, 2014 02:47:04 PM Yinghai Lu wrote:
> On Tue, Dec 31, 2013 at 1:03 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Tuesday, December 31, 2013 10:45:46 AM Yinghai Lu wrote:
> >> On Mon, Dec 30, 2013 at 5:15 AM, Rafael J. Wysocki <[email protected]> wrote:
> >> > On Monday, December 30, 2013 01:51:28 PM Rafael J. Wysocki wrote:
> >> >> > We have patches that need to stop ioapic and iommu between
> >> >> > pci_stop_root_bus and pci_remove_root_bus.
> >> >
> >> > BTW, what *exactly* do they need to be stopped between? After these two patches:
> >>
> >> need to stop regular pci drivers before stop "driver" for ioapic/dmar.
> >>
> >> >
> >> >> > Please check if the problem still happen after
> >> >> >
> >> >> > http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/patch/?id=c4a0a5d964e90b93eb4101c3927b788e083e530f
> >> >> >
> >> >> > http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/patch/?id=e3b439e1d315aff59c1b4f0fc43c5cd2d82b4138
> >> >
> >> > pci_stop_root_bus() is just a walk over devices on the root bus stopping
> >> > them and pci_remove_root_bus() starts with the removal of those devices.
> >> >
> >> > Surely, those two list walks can be combined into one?
> >>
> >> maybe ok, but we have to problem to make sure stop pci drivers before
> >> "driver" for ioapic/dmar.
> >
> > That's fine, but ioapic/dmar stopping need not happen between the stopping of
> > drivers and removing of devices on the root bus I suppose?
> >
> > Actually, I think that the ioapic/dmar stopping should be carried out after
> > removing all of the root bus devices, or it can be racy with respect to a
> > driver reload. Isn't that the case?
>
> No. It should be before removing all root bus devices.
> as they need to access the pci devices during stop ioapic and dmar.
>
> Also ioapic itself could one one pci device.

Well, if we stop drivers first, then stop ioapic/dmar and *then* remove
devices, it is possible to rebind a driver to a device after ioapic/dmar has
been stopped, which I guess will not lead to anything nice?

Rafael

2014-01-06 19:28:46

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH][tentative] PCI / ACPI: Rework PCI host bridge removal to avoid sysfs warnings

On Thu, Jan 2, 2014 at 4:45 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Thursday, January 02, 2014 02:47:04 PM Yinghai Lu wrote:
>>
>> No. It should be before removing all root bus devices.
>> as they need to access the pci devices during stop ioapic and dmar.
>>
>> Also ioapic itself could one one pci device.
>
> Well, if we stop drivers first, then stop ioapic/dmar and *then* remove
> devices, it is possible to rebind a driver to a device after ioapic/dmar has
> been stopped, which I guess will not lead to anything nice?

Not sure how that could happen.

If it would really happen, we could set dev->match_driver to 0 in pci_stop_dev.

Yinghai

2014-01-06 20:28:04

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH][tentative] PCI / ACPI: Rework PCI host bridge removal to avoid sysfs warnings

On Monday, January 06, 2014 11:28:43 AM Yinghai Lu wrote:
> On Thu, Jan 2, 2014 at 4:45 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Thursday, January 02, 2014 02:47:04 PM Yinghai Lu wrote:
> >>
> >> No. It should be before removing all root bus devices.
> >> as they need to access the pci devices during stop ioapic and dmar.
> >>
> >> Also ioapic itself could one one pci device.
> >
> > Well, if we stop drivers first, then stop ioapic/dmar and *then* remove
> > devices, it is possible to rebind a driver to a device after ioapic/dmar has
> > been stopped, which I guess will not lead to anything nice?
>
> Not sure how that could happen.
>
> If it would really happen, we could set dev->match_driver to 0 in pci_stop_dev.

Simply, run "modprobe -r driver && modprobe driver" in a loop and
remove the PCI host bridge the given device is on in parallel to that. Chances
are, you'll see some nice breakage.

Also what happens if somebody uses the "remove" sysfs attribute on a device
needed by ioapic/dmar?

Rafael

2014-01-08 23:41:55

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH][tentative] PCI / ACPI: Rework PCI host bridge removal to avoid sysfs warnings

On Mon, Jan 6, 2014 at 12:41 PM, Rafael J. Wysocki <[email protected]> wrote:

>> Not sure how that could happen.
>>
>> If it would really happen, we could set dev->match_driver to 0 in pci_stop_dev.
>
> Simply, run "modprobe -r driver && modprobe driver" in a loop and
> remove the PCI host bridge the given device is on in parallel to that. Chances
> are, you'll see some nice breakage.

I would suggest using match_driver prevent driver from attaching again.

---
drivers/pci/remove.c | 1 +
1 file changed, 1 insertion(+)

Index: linux-2.6/drivers/pci/remove.c
===================================================================
--- linux-2.6.orig/drivers/pci/remove.c
+++ linux-2.6/drivers/pci/remove.c
@@ -11,6 +11,7 @@ static void pci_stop_dev(struct pci_dev
pci_proc_detach_device(dev);
pci_remove_sysfs_dev_files(dev);
device_release_driver(&dev->dev);
+ dev->match_driver = false;
dev->is_added = 0;
}


>
> Also what happens if somebody uses the "remove" sysfs attribute on a device
> needed by ioapic/dmar?

Good question, we will have problem in that case.
To make it simple, we may hide the "remove" in sysfs for ioapic pci device ?

Thanks

Yinghai

2014-01-08 23:56:38

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH][tentative] PCI / ACPI: Rework PCI host bridge removal to avoid sysfs warnings

On Wednesday, January 08, 2014 03:41:52 PM Yinghai Lu wrote:
> On Mon, Jan 6, 2014 at 12:41 PM, Rafael J. Wysocki <[email protected]> wrote:
>
> >> Not sure how that could happen.
> >>
> >> If it would really happen, we could set dev->match_driver to 0 in pci_stop_dev.
> >
> > Simply, run "modprobe -r driver && modprobe driver" in a loop and
> > remove the PCI host bridge the given device is on in parallel to that. Chances
> > are, you'll see some nice breakage.
>
> I would suggest using match_driver prevent driver from attaching again.

Yes, we can do that. Some locking is needed for it to be non-racy, however.

Anyway, we still have the problem with race conditions between different PCI
removal/rescan code paths. And I'm still going to prepare a patch to use
the remove-rescan mutex to address those race conditions and that patch should
help here too.

>
> ---
> drivers/pci/remove.c | 1 +
> 1 file changed, 1 insertion(+)
>
> Index: linux-2.6/drivers/pci/remove.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/remove.c
> +++ linux-2.6/drivers/pci/remove.c
> @@ -11,6 +11,7 @@ static void pci_stop_dev(struct pci_dev
> pci_proc_detach_device(dev);
> pci_remove_sysfs_dev_files(dev);
> device_release_driver(&dev->dev);
> + dev->match_driver = false;
> dev->is_added = 0;
> }
>
>
> >
> > Also what happens if somebody uses the "remove" sysfs attribute on a device
> > needed by ioapic/dmar?
>
> Good question, we will have problem in that case.
> To make it simple, we may hide the "remove" in sysfs for ioapic pci device ?

Yeah, we need to do that if using that attribute may lead to problems.

Thanks,
Rafael